Crucial MX500 SSD drive dead after 8650 hours



Recently a Western Digital Green SSD died without any pre-fail indications (see article Leaving a party without saying bye: Western Digital Green SSD dead without pre-fail indications). Does the same apply to other SSDs, too? The answer is: Not exactly.

Monitoring physical drives with check_smart

All server drives are constantly monitored by the check_smart monitoring plugin. This usually helps to detect imminent failures (pre-failures) of a drive. It has worked very well in the past with magnetic disks (hard drives): almost all HDD failures could be detected in advance by check_smart. But does the same also apply to SSDs? Only time will tell.

The first SSD failure (mentioned above) did not show any pre-fail indications. The drive just went out of service in an instant. But on a Crucial SSD, check_smart was able to detect something.

Crucial MX500: Reallocated sectors detected

The first alert was received on July 2nd by check_smart. 2 reallocated sectors were found.
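The kind of check behind such an alert can be sketched in a few lines of Python. This is a simplified illustration, not the actual logic of check_smart: it assumes the classic attribute table layout printed by `smartctl -A`, and the sample text, the function names and the threshold are made up for this example. Attributes are looked up by ID (5 for reallocated sectors, 9 for power-on hours) because the attribute names shown can differ between drive models.

```python
import re

# Illustrative sample in the layout printed by `smartctl -A`.
# The rows are made up for this sketch, not copied from the failed drive.
SAMPLE = """\
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  5 Reallocated_Sector_Ct   0x0032   100   100   010    Old_age   Always       -       2
  9 Power_On_Hours          0x0032   100   100   000    Old_age   Always       -       8650
"""

REALLOCATED_ID = 5  # SMART attribute ID for reallocated sectors


def parse_raw_values(table):
    """Return {attribute_id: raw_value} for every attribute row in the table."""
    values = {}
    for line in table.splitlines():
        fields = line.split()
        # Attribute rows start with a numeric ID and have at least 10 columns.
        if len(fields) >= 10 and fields[0].isdigit():
            # Some raw values carry extra text; keep only the leading number.
            match = re.match(r"\d+", fields[9])
            if match:
                values[int(fields[0])] = int(match.group())
    return values


def reallocated_status(table, threshold=0):
    """Hypothetical helper: WARNING as soon as the raw count exceeds threshold."""
    count = parse_raw_values(table).get(REALLOCATED_ID)
    if count is None:
        return "UNKNOWN"
    return "WARNING" if count > threshold else "OK"


print(reallocated_status(SAMPLE))
```

With the sample above, the raw value of attribute 5 is 2, which is already above the assumed threshold of 0 and therefore triggers a warning.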

That's "good" news because it indicates that something is wrong with the drive and that it will most likely die. The big question is: When?

One week later, on July 9th, the reallocated sector count increased to 6 and then even to 8 on the same day. The value remained steady for a while, until the counter increased to 12 sectors on July 25th.

On August 6th, it was game over for the drive: It disappeared from the operating system and check_smart was unable to find the drive anymore. This was also the moment when two additional monitoring checks (Disk Raid Status and Server Hardware) switched to CRITICAL and reported the failed drive.

For this MX500 SSD it took a bit more than one month from the first alert to the drive's end of life, which gave us enough time to get a replacement drive in advance.
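The timeline can be put into numbers with a short Python sketch. The year is an assumption, since the article only gives days and months; the day differences between July and August do not depend on it.

```python
from datetime import date

# Assumption: the year is not stated in the article. Any year gives the
# same day differences for dates between July and August.
YEAR = 2021

events = [
    (date(YEAR, 7, 2), "2 reallocated sectors (first alert)"),
    (date(YEAR, 7, 9), "6, then 8 reallocated sectors"),
    (date(YEAR, 7, 25), "12 reallocated sectors"),
    (date(YEAR, 8, 6), "drive no longer detected"),
]

first_alert = events[0][0]
for day, what in events:
    # Days elapsed since the first check_smart alert.
    print(f"day {(day - first_alert).days:2d}: {what}")

days_of_warning = (events[-1][0] - first_alert).days
print(f"warning time: {days_of_warning} days")
```

From the first alert to the failure, the drive gave 35 days of warning.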

What about the drive age?

Readers of this article (and potential or existing MX500 owners) are probably interested in one particular fact: How long did the drive run until its end of life (EOL)? The answer is: 8650 hours (according to the Power_On_Hours SMART value).

But be careful about treating this value as a fixed indicator. Another Crucial MX500 is still running without any reallocated sectors so far, and its Power_On_Hours value is 8695 hours as of this writing.
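To put both Power_On_Hours values into perspective, they can be converted to days of powered-on time. This is a minimal sketch; the helper name is made up for this example, and the result is time with power applied, not calendar age.

```python
def hours_to_days(hours):
    """Convert a Power_On_Hours raw value to days of powered-on time."""
    return hours / 24


failed_drive_hours = 8650   # MX500 that died
healthy_drive_hours = 8695  # second MX500, still running

for label, hours in [("failed", failed_drive_hours), ("healthy", healthy_drive_hours)]:
    print(f"{label}: {hours} h = {hours_to_days(hours):.1f} days")
```

Both drives have a little less than one year (365 days, or 8760 hours) of powered-on time: about 360.4 days for the failed drive and 362.3 days for the healthy one.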
