[GH-ISSUE #430] SSD Not detected anymore #149

Closed
opened 2026-03-07 19:16:21 +03:00 by kerem · 4 comments

Originally created by @thibaudbrg on GitHub (Feb 20, 2025).
Original GitHub issue: https://github.com/007revad/Synology_HDD_db/issues/430

Hi, not sure if this is the right place to post such an issue, but I'm pretty concerned about what happened yesterday to my NAS.

A month ago I successfully set up a PNY CS2230 500GB SSD (M.2 slot) as a storage pool (SHR) thanks to syno_create_m2_volume.sh.

I then set up a scheduled task running syno_hdd_db.sh -n at each boot of my DS920+.
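
For reference, the boot-up task in DSM Task Scheduler is essentially just the one line below; the path matches the script output further down, while the rest of the task settings (triggered on boot-up, run as root) are simply my own setup:

```bash
# DSM Task Scheduler > Triggered Task > Boot-up, user root
bash /volume1/homes/thibaudbrg/scripts/Synology_HDD_db-main/syno_hdd_db.sh -n
```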
Everything worked fine for more than a month. I'm running DSM 7.2.1-69057 Update 5; no updates have been applied since.
Then suddenly, at 1:39 am, I received this mail:

Storage Pool 2 on NAS-DS920 has crashed. Information of the drives in abnormal status is shown below:
M.2 Drive 1, Model: PNY CS2230 500GB SSD, Serial number: PNY2447241120010025C
Several reasons may result in storage pool crash.

Since then the SSD is missing: not detected by Synology, and not detected by running ls /dev/nvme*, lspci | grep -i nvme, or even sudo nvme list.
Running syno_hdd_db.sh -n manually results in:

Synology_HDD_db v3.6.110
DS920+ x86_64 DSM 7.2.1-69057-5 
StorageManager 1.0.0-0017

ds920+_host_v7 version 8058

Using options: -n -r
Running from: /volume1/homes/thibaudbrg/scripts/Synology_HDD_db-main/syno_hdd_db.sh

HDD/SSD models found: 3
ST18000NM000J-2TV103,SN02,18000 GB
ST18000NM000J-2TV103,SN04,18000 GB
ST18000NM003D-3DL103,SN05,18000 GB

No M.2 drives found

No M.2 PCIe cards found

No Expansion Units found

Updated ST18000NM000J-2TV103 in ds920+_host_v7.db
Updated ST18000NM000J-2TV103 in ds920+_host_v7.db
ST18000NM003D-3DL103 already exists in ds920+_host_v7.db

Support disk compatibility already enabled.

Disabled support memory compatibility.

Set max memory to 20 GB.

Drive db auto updates already disabled.

DSM successfully checked disk compatibility.

You may need to reboot the Synology to see the changes.

I tried removing the SSD and putting it into the other M.2 slot of the NAS; it is still not detected.
What would you try to do? I have all my Docker container configs on it... so everything is currently broken right now.

kerem closed this issue 2026-03-07 19:16:22 +03:00

@nagyrobi commented on GitHub (Feb 20, 2025):

Try it in a laptop; is it detected there?
Otherwise, replace it under warranty...


@thibaudbrg commented on GitHub (Feb 20, 2025):

Thanks for your quick answer. Sadly, I don't have a USB adapter to test it on a PC right now, and I'm not physically next to the NAS, which is a bummer. But I ordered an M.2-to-USB adapter on Amazon that will hopefully be delivered quickly.

In the meantime, I managed to pull more logs:

btrfs_dev_stat_print_on_error: 88 callbacks suppressed
BTRFS error (device dm-3): bdev /dev/mapper/cachedev_0 errs: wr 10931, rd 93482880, flush 0, corrupt 0, gen 0
BTRFS error (device dm-3): bdev /dev/mapper/cachedev_0 errs: wr 10931, rd 93482881, flush 0, corrupt 0, gen 0
BTRFS error (device dm-3): bdev /dev/mapper/cachedev_0 errs: wr 10931, rd 93482882, flush 0, corrupt 0, gen 0
BTRFS error (device dm-3): bdev /dev/mapper/cachedev_0 errs: wr 10931, rd 93482883, flush 0, corrupt 0, gen 0
BTRFS error (device dm-3): bdev /dev/mapper/cachedev_0 errs: wr 10931, rd 93482884, flush 0, corrupt 0, gen 0
BTRFS error (device dm-3): bdev /dev/mapper/cachedev_0 errs: wr 10931, rd 93482885, flush 0, corrupt 0, gen 0
BTRFS error (device dm-3): bdev /dev/mapper/cachedev_0 errs: wr 10931, rd 93482886, flush 0, corrupt 0, gen 0
BTRFS error (device dm-3): bdev /dev/mapper/cachedev_0 errs: wr 10931, rd 93482887, flush 0, corrupt 0, gen 0
BTRFS error (device dm-3): bdev /dev/mapper/cachedev_0 errs: wr 10931, rd 93482888, flush 0, corrupt 0, gen 0
BTRFS error (device dm-3): bdev /dev/mapper/cachedev_0 errs: wr 10931, rd 93482889, flush 0, corrupt 0, gen 0
BTRFS error (device dm-3): BTRFS: dm-3 failed to repair parent transid verify failure on 536051712, mirror = 1
BTRFS error (device dm-3): BTRFS: dm-3 failed to repair parent transid verify failure on 536051712, mirror = 2

This likely comes from the SSD, as my HDDs do not show any sign of failure.
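
(For the record, the per-device btrfs error counters are an easy way to confirm the HDD volume is clean; btrfs ships with DSM, and /volume1 is my HDD volume.)

```bash
# Per-device read/write/flush/corruption error counters for the HDD volume
sudo btrfs device stats /volume1
```

Furthermore: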

thibaudbrg@NAS-DS920:~$ lspci -vvv | grep -i nvme
thibaudbrg@NAS-DS920:~$ echo 1 | sudo tee /sys/bus/pci/rescan
1
thibaudbrg@NAS-DS920:~$ dmesg | grep -i nvme
[18776.503995] nvme nvme0: pci function 0000:05:00.0
[18776.509270] nvme 0000:05:00.0: enabling device (0000 -> 0002)
[18776.515807] nvme nvme0: Removing after probe failure status: -19

and

thibaudbrg@NAS-DS920:~$ echo 1 | sudo tee /sys/bus/pci/devices/0000:05:00.0/reset
1
thibaudbrg@NAS-DS920:~$ dmesg | grep -i nvme
[18776.503995] nvme nvme0: pci function 0000:05:00.0
[18776.509270] nvme 0000:05:00.0: enabling device (0000 -> 0002)
[18776.515807] nvme nvme0: Removing after probe failure status: -19

This surely points to a hardware failure, as the SSD is detected at the PCI level but the NVMe driver probe fails...
I will contact PNY support immediately. But why?? How unlucky am I?? After 1 month??
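
One more thing worth trying before giving up (a generic Linux PCI trick, not DSM-specific, and no guarantee it helps with a dying controller): remove the device from the PCI bus entirely and rescan, instead of only resetting it:

```bash
# 0000:05:00.0 is the address reported by dmesg above
echo 1 | sudo tee /sys/bus/pci/devices/0000:05:00.0/remove
echo 1 | sudo tee /sys/bus/pci/rescan
# See whether the controller reappears and the NVMe probe succeeds this time
dmesg | grep -i nvme
```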

I'm wondering whether some automatic Synology task, like nightly data scrubbing or another automated test, might have been running at the same time, with the failure occurring during those tests. Could that be a possibility?
Sadly, even after going through all the logs, I can't go back to that exact time; the logs have already been overwritten.


@thibaudbrg commented on GitHub (Feb 20, 2025):

UPDATE:

Becoming desperate, I decided to take advantage of this moment to update the NAS, which is always complicated due to potential failures with my Docker configs.

Then suddenly,

Volume 2 on NAS-DS920 was in read-only mode, but it has been automatically repaired and is now healthy.

?????????????

And the logs show:

thibaudbrg@NAS-DS920:~$ ls /dev/nvme*
/dev/nvme0  /dev/nvme0n1  /dev/nvme0n1p1  /dev/nvme0n1p2  /dev/nvme0n1p3  /dev/nvme0n1p5
thibaudbrg@NAS-DS920:~$ dmesg | grep -i nvme
[    5.706177] nvme nvme0: pci function 0000:05:00.0
[    5.832928]  nvme0n1: p1 p2 p3 < p5 >
[   58.940054] md: bind<nvme0n1p5>

I am speechless... I'm closing this issue with this comment, and I hope nobody else encounters similar stress... I might consider buying Synology-compatible SSD drives next time, even though Synology_HDD_db supposedly does the trick.
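
If it helps anyone else, these are the checks I'd use to confirm the pool really recovered (standard tools present on DSM; /volume2 is the SSD volume in my case):

```bash
# The NVMe partition should be back in its md array
cat /proc/mdstat
# Btrfs per-device error counters for the repaired volume
sudo btrfs device stats /volume2
# Optionally, scrub the volume to verify data integrity after the crash
sudo btrfs scrub start /volume2
sudo btrfs scrub status /volume2
```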


@nagyrobi commented on GitHub (Feb 21, 2025):

And never rely on a single drive for critical data... Always put at least two in a RAID.

That said, the same thing happened to me this week: I bought 4 brand-new 1TB WD RED SATA SSDs, and one of them died within 4 hours. Apparently DOA, brand new.
