# Samsung SSD dead: "He's dead, Jim."



## Alain De Vos (Jan 3, 2023)

When I say dead, I mean completely dead, not even read-only. What is this? Does anyone have an explanation?
Compiling 4000 FreeBSD ports from source shouldn't kill an SSD, should it?
Time to make a full backup of the drive that's still alive.


----------



## richardtoohey2 (Jan 3, 2023)

I think that's one of the cons of SSDs - when they die, they die hard. I've got lots of Samsung SSDs and it hasn't happened to me ... yet.


----------



## Alain De Vos (Jan 3, 2023)

First thing I'm doing now is making a full backup of my other SSD.


----------



## richardtoohey2 (Jan 3, 2023)

Not sure of the accuracy of any of this, but this is the sort of thing I've read:

> Why do SSDs tend to fail much more suddenly than HDDs?
>
> Modern PCs tend to use one of two types of internal-storage devices: Hard-disk drives (HDDs) store data on spinning disks coated in a magnetic recording medium, a technology dating back to the ...

(superuser.com)


----------



## eternal_noob (Jan 3, 2023)

An SSD only survives n write cycles. If you write lots of temporary files when compiling, it'll die soon.

Better use the -pipe flag, mate. Or even better, don't compile on SSDs at all.


----------



## yuripv79 (Jan 3, 2023)

eternal_noob said:


> don't compile on SSDs at all


Would not agree here. I bought an NVMe Samsung 970 Evo SSD back in 2019 and put it in my ESXi box for the VMs I do development on, so daily building of OS images. I was afraid at first that I was going to ruin it fast this way, but after 3 years of "compiling on SSDs" it has only ~10% "wear", which puts the point where I'll pass its TBW far beyond its warranty. And it's much faster for this usage scenario than spinning rust.
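For what it's worth, the arithmetic behind that estimate is trivial; a sketch with made-up placeholder numbers (read your own wear figure from SMART, these are not from a real drive):

```shell
#!/bin/sh
# Back-of-the-envelope endurance projection. Both numbers below are
# illustrative placeholders, not values read from an actual SSD.
wear_pct=10       # percent of rated endurance consumed so far
years_in_use=3    # time it took to consume that much

# Linear extrapolation: 10% in 3 years -> 30 years of projected life.
echo "projected lifetime: $(( years_in_use * 100 / wear_pct )) years"
```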


----------



## chessguy64 (Jan 3, 2023)

Alain De Vos said:


> When I say dead, I mean completely dead, not even read-only. What is this? Does anyone have an explanation?
> Compiling 4000 FreeBSD ports from source shouldn't kill an SSD, should it?
> Time to make a full backup of the drive that's still alive.



That's a lot of I/O. And yeah, SSDs have a certain number of reads/writes before they die.


----------



## Phishfry (Jan 3, 2023)

Alain De Vos said:


> Does anyone have an explanation?


Details? Did it hold a swap partition? Was TRIM enabled? Was it a ZIL/SLOG device?
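On FreeBSD these questions can be answered with a few commands; a sketch, where the pool name zroot is a placeholder for your own pool:

```shell
swapinfo -h               # any swap devices configured on the SSD?
zpool get autotrim zroot  # pool-level TRIM setting (OpenZFS)
zpool status zroot        # a "logs" section means a SLOG vdev exists
```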


----------



## Alain De Vos (Jan 3, 2023)

While removing my video card, the connector broke on the motherboard.
Bad design by HP (low-quality connector).
Had to order a new PC.


----------



## cracauer@ (Jan 3, 2023)

What happened is that the controller on the disk died. That is a different phenomenon from the write limit.

Early SATA SSDs died on me like flies because of this. I still have trust issues.


----------



## 6502 (Jan 3, 2023)

cracauer@ said:


> What happened is that the controller on the disk died. That is a different phenomenon from the write limit.


This reminds me of LED bulbs. LEDs have a long life, they can work for 10 years... and similar praises. But the bulb has some electronics (a PCB) which usually cannot work for a long time because of overheating. So the "excellent" LED bulb from a good brand stops working after a few months.


----------



## covacat (Jan 3, 2023)

On HDDs it sometimes worked to exchange the PCB with one from the same model of disk, but not always.


----------



## gotnull (Jan 3, 2023)

Alain De Vos said:


> When I say dead, I mean completely dead, not even read-only. What is this? Does anyone have an explanation?
> Compiling 4000 FreeBSD ports from source shouldn't kill an SSD, should it?
> Time to make a full backup of the drive that's still alive.





Alain De Vos said:


> While removing my video card, the connector broke on the motherboard.
> Bad design by HP (low-quality connector).
> Had to order a new PC.


Damn! It's a really bad day: an SSD, then the MB. 2023 starts with heavy costs.
Good luck with the rest of the day, mate.


----------



## Phishfry (Jan 3, 2023)

Alain De Vos said:


> Bad design by HP (low-quality connector).


If the whole socket came off the board, it was probably a bad solder job.


----------



## Argentum (Jan 3, 2023)

Alain De Vos said:


> When I say dead, I mean completely dead, not even read-only. What is this? Does anyone have an explanation?
> Compiling 4000 FreeBSD ports from source shouldn't kill an SSD, should it?
> Time to make a full backup of the drive that's still alive.


What model?

Can you run a SMART tool on the drive now that it's dead? If the media is bad, the controller should still be working. I am using sysutils/gsmartcontrol for checking drive status.

I also have an SSD (not Samsung) in one system, with over 1500 ports compiled from source. In addition, I have a bhyve VM running on that drive.


----------



## cy@ (Jan 3, 2023)

eternal_noob said:


> An SSD only survives n write cycles. If you write lots of temporary files when compiling, it'll die soon.
> 
> Better use the -pipe flag, mate. Or even better, don't compile on SSDs at all.


About once a month I run this:

smartctl -a /dev/ada0 > smartctl.ada0-`dts`

where `dts` is:

slippy# alias dts
alias dts='/bin/date +%y%m%d_%H%M%S'
slippy# 

I then diff the previous file with the current one.

Using this I know my monthly write rate has been significantly reduced by the measures below.

To reduce wear I initially set zfs sync=disabled and sysctl vfs.zfs.txg.timeout=30. However, I later set sync back to standard and instead added a cheap SD card as a log device to the pool. The rate of writes to my laptop's SSD has been significantly reduced.

I chose a TXG timeout of 30 because it is similar to the partial trickle sync timeout that softupdates uses -- no dirty buffer older than 30 seconds.

I continue to  monitor and diff smartctl -a outputs monthly in case I  need to take any proactive measures. So far so good.

IMO ZFS is not nearly as SSD-friendly as UFS and is probably not the best choice for SSDs. But it can be used when one has carefully tuned ZFS to reduce log writes and log-write frequency.
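The routine above can be wrapped in a small script; a minimal sketch, assuming sysutils/smartmontools is installed (device name and log directory are illustrative):

```shell
#!/bin/sh
# Minimal sketch of the monthly snapshot-and-diff routine described
# above. Requires smartctl from sysutils/smartmontools.

snapshot_smart() {    # $1 = device name (e.g. ada0), $2 = log directory
    dev=$1; dir=$2
    mkdir -p "$dir"
    # Save a timestamped copy of the full SMART report.
    new="$dir/smartctl.$dev-$(date +%y%m%d_%H%M%S)"
    smartctl -a "/dev/$dev" > "$new"
    # Diff against the next-most-recent snapshot, if one exists.
    prev=$(ls -1t "$dir/smartctl.$dev"-* 2>/dev/null | sed -n 2p)
    [ -n "$prev" ] && diff -u "$prev" "$new"
    return 0
}

# Example (run monthly from cron or periodic):
# snapshot_smart ada0 "$HOME/smart-logs"
```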


----------



## Alain De Vos (Jan 3, 2023)

The device is no longer recognized by the BIOS. To the BIOS it is as if the device is not even connected.
So I'm unable to revive the corpse.


----------



## richardtoohey2 (Jan 3, 2023)

Yes, I think that's a con of SSDs - with spinning drives you might be able to recover something, but that's less likely with SSDs.

Which Samsung model, and what age? I've got quite a few Samsung SSDs and no issues thus far, but it might be time for an audit.


----------



## Argentum (Jan 3, 2023)

Alain De Vos said:


> The device is no longer recognized by the BIOS. To the BIOS it is as if the device is not even connected.
> So I'm unable to revive the corpse.


That does not seem like wear-out...


----------



## free-and-bsd (Jan 3, 2023)

Alain De Vos said:


> While removing my video card, the connector broke on the motherboard.
> Bad design by HP (low-quality connector).
> Had to order a new PC.


That's why I generally don't recommend HP to anyone asking me which brand (notebook, printer, etc.) to buy. Maybe their server equipment is better quality, I don't know... but I don't see why it would be.


----------



## free-and-bsd (Jan 3, 2023)

Alain De Vos said:


> The device is no longer recognized by the BIOS. To the BIOS it is as if the device is not even connected.
> So I'm unable to revive the corpse.


Since this doesn't happen often "enough", I had never seen it mentioned in SSD usage reviews by the time one of my SSDs did the same thing. Well, it was a cheap Chinese no-name thing... very silly of me to buy it, I'm sure. And whatever I was able to find in superficial and "serious" articles alike was nothing but praise for how this advanced technology is undeniably winning over the market.


----------



## dgmm (Jan 3, 2023)

free-and-bsd said:


> Since this doesn't happen often "enough", I had never seen it mentioned in SSD usage reviews by the time one of my SSDs did the same thing. Well, it was a cheap Chinese no-name thing... very silly of me to buy it, I'm sure. And whatever I was able to find in superficial and "serious" articles alike was nothing but praise for how this advanced technology is undeniably winning over the market.


Yeah, when you deal with a large "estate" of laptops, you tend to see more faults. There are generally two failure modes: total failure, where the system doesn't even "see" the attached SSD, and SMART failures, where either the wear level reaches 100% or some other failed-"sector" count reaches a limit, causing the SSD to switch to read-only mode.

Someone further up mentioned read/write limits. AFAIK, there are no read limits on NAND memory, only write limits, hence most failure modes result in read-only mode, as further writes would only cause more data loss.


----------



## gotnull (Jan 3, 2023)

cy@ said:


> IMO ZFS is not nearly as SSD friendly as UFS is and is probably not the best choice for SSDs. But it can be used when one has carefully tuned ZFS to reduce log writes and log write frequency.


Interesting, I didn't know that ZFS needed to be tuned for SSDs.
I'll try to remember this, because I haven't modified anything so far.
Thank you.


----------



## Alain De Vos (Jan 3, 2023)

If you reduce ZFS's writing to disk, you have more danger of data loss on a sudden power outage.


----------



## cy@ (Jan 4, 2023)

gotnull said:


> Interesting, I didn't know that ZFS needed to be tuned for SSDs.
> I'll try to remember this, because I haven't modified anything so far.
> Thank you.


Neither did I, but one month after installing the new SSD the wear leveling count jumped from 0 to 1, and two months after that it jumped from 1 to 2, with LBAs written in the billions. Sure, the SSD might still have lasted a few years, but I've had hard drives last 8-12 years on average, with some lasting 16; it didn't feel right that I was "consuming" this SSD at a rate possibly higher than the spinning rust I had here. So I started fiddling around with a sysctl, adjusted the TXG timeout, and put the log on cheaper, more disposable media. It's been almost a year now and the wear leveling count remains at 2, with about 2.8 TB written to this 25%-full 1 TB drive: about 1/300 of its design lifetime.

I think it takes a bit of fiddling around, monitoring the numbers, and patiently adjusting things. I'm not sure it's worth documenting, because a lot of it is judging whether a change has made a difference or not. It's not a matter of do A, then B, then C. My workload is different from yours, which is different from someone else's. A person doing simple email, browsing, and desktop use would see much less wear. It depends on how many writes one does.

One thing I forgot: I also set the minimum ashift to 13. Even though this affects performance, avoiding writing a large block simply to change one byte in that block also reduces wear. Just keep in mind: "how can I reduce unnecessary writes?"
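Collected in one place, the knobs mentioned in this thread look roughly like this; a hedged sketch (pool name zroot, log device da1, and the values are illustrative, and `zpool add ... log` permanently changes pool layout, so verify device names first):

```shell
# Flush transaction groups less often (default is 5 seconds):
sysctl vfs.zfs.txg.timeout=30

# Move ZIL writes off the SSD onto cheap, disposable media:
zpool add zroot log /dev/da1

# Prefer 8 KiB minimum blocks for newly created vdevs:
sysctl vfs.zfs.min_auto_ashift=13
```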


----------



## Alain De Vos (Jan 4, 2023)

I had put the zpool log device on the disk that died.


----------



## cy@ (Jan 4, 2023)

Making an SSD the log or swap device will certainly cause it to wear.

Having said that, writes to a spinning disk also weaken the disk's heads, seeks wear out the actuator, and simply spinning wears out the roller bearings over time. I think it's debatable which will last longer. Some say SSD, others have said hard disk, while others have said it's a wash.

We should also note that power cycling SSDs and hard disks (and all other electronic components) also contributes to wear. At what point does leaving components running vs. powering them off reduce or contribute to wear? I don't know. We've had these discussions at $JOB. I don't think anyone knows.

BTW, my laptop SSD is Samsung SSD 870 EVO 1TB. No problems so far.


----------



## msplsh (Jan 4, 2023)

Argentum said:


> That does not seem like wear-out...


Lots of them do this. They run out of flash lifespan and just give up and die.


----------



## dgmm (Jan 4, 2023)

msplsh said:


> Lots of them do this. They run out of flash lifespan and just give up and die.


Mostly, they don't, at least not "die" as such; they switch to read-only mode. If it's an OS drive, some OSes can't even complete the boot process, making it a slightly more difficult problem. Keeping an eye on the SMART parameters will also clue you in when the wear leveling is reaching EOL. When an SSD "dies" unexpectedly it's usually a failure of the controller, not flash lifespan; it could also be a failure/defect in the flash NAND, just not "wear"/lifespan as such.


----------



## chessguy64 (Jan 5, 2023)

free-and-bsd said:


> That's why I generally don't recommend HP to anyone asking me which brand (notebook, printer, etc.) to buy. Maybe their server equipment is better quality, I don't know... but I don't see why it would be.



I've had HP branded motherboards lasting 10-15+ years. Some still work to this day, the rest were donated / recycled while still functioning. HP printers are good too. They've even won awards for reliability / performance. HP business laptops are quality. HP servers are quality. Stop posting disinformation.


----------



## facedebouc (Jan 5, 2023)

chessguy64 said:


> I've had HP branded motherboards lasting 10-15+ years. Some still work to this day, the rest were donated / recycled while still functioning. HP printers are good too. They've even won awards for reliability / performance. HP business laptops are quality. HP servers are quality. Stop posting disinformation.


After 40 years, my HP-41C is still working.


----------

