Checking SSD/HDD Health on Linux

Why S.M.A.R.T. tools

If you are a system administrator responsible for managing Linux systems, it is recommended that you check the health of your SSDs and HDDs frequently. This helps identify failing drives so that they can be replaced before the data on them is lost. The S.M.A.R.T. tools monitor the health of SSDs and HDDs and let you run drive self-tests at any time.

What you need:

A server or desktop running a Linux operating system.
The root password (or sudo access) on that system.

Installing Smartctl

Smartctl ships in the smartmontools package, which is available in the default repositories of all major Linux distributions. On Debian and Ubuntu, install Smartctl with the following command:

apt-get install smartmontools -y

On RHEL, CentOS, and Fedora, install Smartctl with the following command:

dnf install smartmontools
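
The choice between the two install commands can be sketched as a small shell function. This is only an illustration: the function name install_cmd_for is hypothetical, and it assumes the distribution is identified by the ID value found in /etc/os-release.

```shell
#!/bin/sh
# Hypothetical helper (not part of smartmontools): map an os-release ID
# to the install command given in this guide.
install_cmd_for() {
    case "$1" in
        debian|ubuntu)      echo "apt-get install smartmontools -y" ;;
        rhel|centos|fedora) echo "dnf install smartmontools" ;;
        *) echo "unsupported distribution: $1" >&2; return 1 ;;
    esac
}

# Example call with a fixed ID. On a real system the ID could be read
# by sourcing /etc/os-release first (assumption: that file exists).
install_cmd_for ubuntu
```

The function only prints the command instead of running it, so the result can be reviewed before anything is installed.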

After installing Smartctl, check that the smartd service is running with the following command:

systemctl status smartd

You should see output similar to the following:

[sudo] password for samyil: 
● smartmontools.service - Self Monitoring and Reporting Technology (SMART) Daemon
     Loaded: loaded (/lib/systemd/system/smartmontools.service; enabled; vendor preset: enabled)
     Active: active (running) since Sat 2021-10-09 10:06:11 EEST; 5min ago
       Docs: man:smartd(8)
             man:smartd.conf(5)
   Main PID: 2857 (smartd)
     Status: "Next check of 1 device will start at 10:36:11"
      Tasks: 1 (limit: 4513)
     Memory: 1.5M
     CGroup: /system.slice/smartmontools.service
             └─2857 /usr/sbin/smartd -n

Oct 09 10:06:11 storage smartd[2857]: Drive: DEVICESCAN, implied '-a' Directive on line 21 of file /etc/smartd.conf
Oct 09 10:06:11 storage smartd[2857]: Configuration file /etc/smartd.conf was parsed, found DEVICESCAN, scanning devices
Oct 09 10:06:11 storage smartd[2857]: Device: /dev/sda, type changed from 'scsi' to 'sat'
Oct 09 10:06:11 storage smartd[2857]: Device: /dev/sda [SAT], opened
Oct 09 10:06:11 storage smartd[2857]: Device: /dev/sda [SAT], WDC WD100EFAX-68LHPN0, S/N:1EHTGV5Z, WWN:5-000cca-27ed93822, FW:83.H0A83, 10.0 TB
Oct 09 10:06:11 storage smartd[2857]: Device: /dev/sda [SAT], found in smartd database: Western Digital Red
Oct 09 10:06:11 storage smartd[2857]: Device: /dev/sda [SAT], is SMART capable. Adding to "monitor" list.
Oct 09 10:06:11 storage smartd[2857]: Monitoring 1 ATA/SATA, 0 SCSI/SAS and 0 NVMe devices
Oct 09 10:06:11 storage smartd[2857]: Device: /dev/sda [SAT], state written to /var/lib/smartmontools/smartd.WDC_WD100EFAX_68LHPN0-1EHTGV5Z.ata.state
Oct 09 10:06:11 storage systemd[1]: Started Self Monitoring and Reporting Technology (SMART) Daemon.

Checking SSD/HDD health

After installing Smartctl, you need to enable the SMART features on your drive. Run the following command:

smartctl -s on /dev/sda

The first thing to do is to retrieve the identity information of the SSD or HDD. This is done as follows:

smartctl -i /dev/sda

This prints detailed information about your drive:

samyil@storage:~$ sudo smartctl -i /dev/sda
smartctl 7.1 2019-12-30 r5022 [x86_64-linux-5.4.0-88-generic] (local build)
Copyright (C) 2002-19, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Model Family:     Western Digital Red
Device Model:     WDC WD100EFAX-68LHPN0
Serial Number:    1EHTGV5Z
LU WWN Device Id: 5 000cca 27ed93822
Firmware Version: 83.H0A83
User Capacity:    10,000,831,348,736 bytes [10.0 TB]
Sector Sizes:     512 bytes logical, 4096 bytes physical
Rotation Rate:    5400 rpm
Form Factor:      3.5 inches
Device is:        In smartctl database [for details use: -P show]
ATA Version is:   ACS-2, ATA8-ACS T13/1699-D revision 4
SATA Version is:  SATA 3.2, 6.0 Gb/s (current: 6.0 Gb/s)
Local Time is:    Sat Oct  9 10:44:14 2021 EEST
SMART support is: Available - device has SMART capability.
SMART support is: Enabled
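
The last two lines of this output are the ones that matter before running any test. A minimal sketch of checking them in a script follows. The function name smart_enabled is hypothetical, and the match assumes the line format shown in the sample output.

```shell
#!/bin/sh
# Hypothetical helper: read `smartctl -i` output on stdin and report
# whether a "SMART support is: Enabled" line is present.
smart_enabled() {
    if grep -q '^SMART support is: *Enabled'; then
        echo "enabled"
    else
        echo "disabled-or-unsupported"
    fi
}

# Demonstration on the two lines copied from the sample output.
printf '%s\n' \
    'SMART support is: Available - device has SMART capability.' \
    'SMART support is: Enabled' | smart_enabled
```

On a real system the input would come from the command itself, for example by piping smartctl -i /dev/sda into smart_enabled.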

Testing SSDs and HDDs

To run a short self-test on the drive, run the following command:

smartctl -t short -a /dev/sda

You should see the following output:

samyil@storage:~$ sudo smartctl -t short -a /dev/sda
smartctl 7.1 2019-12-30 r5022 [x86_64-linux-5.4.0-88-generic] (local build)
Copyright (C) 2002-19, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Model Family:     Western Digital Red
Device Model:     WDC WD100EFAX-68LHPN0
Serial Number:    1EHTGV5Z
LU WWN Device Id: 5 000cca 27ed93822
Firmware Version: 83.H0A83
User Capacity:    10,000,831,348,736 bytes [10.0 TB]
Sector Sizes:     512 bytes logical, 4096 bytes physical
Rotation Rate:    5400 rpm
Form Factor:      3.5 inches
Device is:        In smartctl database [for details use: -P show]
ATA Version is:   ACS-2, ATA8-ACS T13/1699-D revision 4
SATA Version is:  SATA 3.2, 6.0 Gb/s (current: 6.0 Gb/s)
Local Time is:    Sat Oct  9 10:51:54 2021 EEST
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

General SMART Values:
Offline data collection status:  (0x80)	Offline data collection activity
					was never started.
					Auto Offline Data Collection: Enabled.
Self-test execution status:      (   0)	The previous self-test routine completed
					without error or no self-test has ever 
					been run.
Total time to complete Offline 
data collection: 		(   93) seconds.
Offline data collection
capabilities: 			 (0x5b) SMART execute Offline immediate.
					Auto Offline data collection on/off support.
					Suspend Offline collection upon new
					command.
					Offline surface scan supported.
					Self-test supported.
					No Conveyance Self-test supported.
					Selective Self-test supported.
SMART capabilities:            (0x0003)	Saves SMART data before entering
					power-saving mode.
					Supports SMART auto save timer.
Error logging capability:        (0x01)	Error logging supported.
					General Purpose Logging supported.
Short self-test routine 
recommended polling time: 	 (   2) minutes.
Extended self-test routine
recommended polling time: 	 (1099) minutes.
SCT capabilities: 	       (0x003d)	SCT Status supported.
					SCT Error Recovery Control supported.
					SCT Feature Control supported.
					SCT Data Table supported.

SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x000b   100   100   016    Pre-fail  Always       -       0
  2 Throughput_Performance  0x0004   131   131   054    Old_age   Offline      -       104
  3 Spin_Up_Time            0x0007   162   162   024    Pre-fail  Always       -       437 (Average 372)
  4 Start_Stop_Count        0x0012   100   100   000    Old_age   Always       -       56
  5 Reallocated_Sector_Ct   0x0033   100   100   005    Pre-fail  Always       -       0
  7 Seek_Error_Rate         0x000a   100   100   067    Old_age   Always       -       0
  8 Seek_Time_Performance   0x0004   128   128   020    Old_age   Offline      -       18
  9 Power_On_Hours          0x0012   100   100   000    Old_age   Always       -       3088
 10 Spin_Retry_Count        0x0012   100   100   060    Old_age   Always       -       0
 12 Power_Cycle_Count       0x0032   100   100   000    Old_age   Always       -       56
 22 Helium_Level            0x0023   100   100   025    Pre-fail  Always       -       100
192 Power-Off_Retract_Count 0x0032   097   097   000    Old_age   Always       -       4624
193 Load_Cycle_Count        0x0012   097   097   000    Old_age   Always       -       4624
194 Temperature_Celsius     0x0002   216   216   000    Old_age   Always       -       30 (Min/Max 21/38)
196 Reallocated_Event_Count 0x0032   100   100   000    Old_age   Always       -       0
197 Current_Pending_Sector  0x0022   100   100   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0008   100   100   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x000a   200   200   000    Old_age   Always       -       0

SMART Error Log Version: 1
No Errors Logged

SMART Self-test log structure revision number 1
No self-tests have been logged.  [To run self-tests, use: smartctl -t]

SMART Selective self-test log data structure revision number 1
 SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS
    1        0        0  Not_testing
    2        0        0  Not_testing
    3        0        0  Not_testing
    4        0        0  Not_testing
    5        0        0  Not_testing
Selective self-test flags (0x0):
  After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.

=== START OF OFFLINE IMMEDIATE AND SELF-TEST SECTION ===
Sending command: "Execute SMART Short self-test routine immediately in off-line mode".
Drive command "Execute SMART Short self-test routine immediately in off-line mode" successful.
Testing has begun.
Please wait 2 minutes for test to complete.
Test will complete after Sat Oct  9 10:53:54 2021 EEST
Use smartctl -X to abort test.
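
In the attribute table of this output, an attribute counts as failed when its normalized VALUE is at or below its THRESH. The following is a rough sketch of filtering for such rows. The function name failing_attributes is hypothetical, and the column positions are assumed to match the table layout shown in the sample output.

```shell
#!/bin/sh
# Hypothetical helper: read smartctl attribute rows on stdin and print
# the names of attributes whose normalized VALUE (column 4) is at or
# below THRESH (column 6). Rows are recognized by a numeric ID in
# column 1 and at least ten columns.
failing_attributes() {
    awk '$1 ~ /^[0-9]+$/ && NF >= 10 && ($4 + 0) <= ($6 + 0) { print $2 }'
}

# Demonstration on two rows copied from the sample output.
bad=$(printf '%s\n' \
    '  5 Reallocated_Sector_Ct   0x0033   100   100   005    Pre-fail  Always       -       0' \
    ' 22 Helium_Level            0x0023   100   100   025    Pre-fail  Always       -       100' \
    | failing_attributes)
echo "failing attributes: ${bad:-none}"
```

For the healthy drive in this guide the sketch reports none. On a real system the full output of smartctl -a /dev/sda could be piped into the function.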

This short test checks the drive's electrical and mechanical properties, along with read/verify. To find and print the self-test result, run the following command after about 5 minutes:

smartctl -l selftest /dev/sda

You should see a result similar to this:

samyil@storage:~$ sudo smartctl -l selftest /dev/sda
smartctl 7.1 2019-12-30 r5022 [x86_64-linux-5.4.0-88-generic] (local build)
Copyright (C) 2002-19, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF READ SMART DATA SECTION ===
SMART Self-test log structure revision number 1
Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Short offline       Completed without error       00%      3089         -
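
For scripted checks, the most recent entry of this log (the row numbered 1) is usually the interesting one. A minimal sketch of extracting it follows. The function name latest_selftest is hypothetical, and the parsing assumes columns are separated by two or more spaces, as in the sample output.

```shell
#!/bin/sh
# Hypothetical helper: read `smartctl -l selftest` output on stdin and
# print the test type and status of the most recent entry (row "# 1").
# Columns are split on runs of two or more spaces so that multi-word
# values such as "Completed without error" stay in one field.
latest_selftest() {
    awk -F'  +' '/^# *1 / { print $2 ": " $3; exit }'
}

# Demonstration on the log row copied from the sample output.
printf '%s\n' \
    'Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error' \
    '# 1  Short offline       Completed without error       00%      3089         -' \
    | latest_selftest
# prints: Short offline: Completed without error
```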

To run an extended (long) test, use the following command:

smartctl -t long -a /dev/sda

You should see the following output:

samyil@storage:~$ sudo smartctl -t long -a /dev/sda
smartctl 7.1 2019-12-30 r5022 [x86_64-linux-5.4.0-88-generic] (local build)
Copyright (C) 2002-19, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Model Family:     Western Digital Red
Device Model:     WDC WD100EFAX-68LHPN0
Serial Number:    1EHTGV5Z
LU WWN Device Id: 5 000cca 27ed93822
Firmware Version: 83.H0A83
User Capacity:    10,000,831,348,736 bytes [10.0 TB]
Sector Sizes:     512 bytes logical, 4096 bytes physical
Rotation Rate:    5400 rpm
Form Factor:      3.5 inches
Device is:        In smartctl database [for details use: -P show]
ATA Version is:   ACS-2, ATA8-ACS T13/1699-D revision 4
SATA Version is:  SATA 3.2, 6.0 Gb/s (current: 6.0 Gb/s)
Local Time is:    Sat Oct  9 11:01:36 2021 EEST
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

General SMART Values:
Offline data collection status:  (0x80)	Offline data collection activity
					was never started.
					Auto Offline Data Collection: Enabled.
Self-test execution status:      (   0)	The previous self-test routine completed
					without error or no self-test has ever 
					been run.
Total time to complete Offline 
data collection: 		(   93) seconds.
Offline data collection
capabilities: 			 (0x5b) SMART execute Offline immediate.
					Auto Offline data collection on/off support.
					Suspend Offline collection upon new
					command.
					Offline surface scan supported.
					Self-test supported.
					No Conveyance Self-test supported.
					Selective Self-test supported.
SMART capabilities:            (0x0003)	Saves SMART data before entering
					power-saving mode.
					Supports SMART auto save timer.
Error logging capability:        (0x01)	Error logging supported.
					General Purpose Logging supported.
Short self-test routine 
recommended polling time: 	 (   2) minutes.
Extended self-test routine
recommended polling time: 	 (1099) minutes.
SCT capabilities: 	       (0x003d)	SCT Status supported.
					SCT Error Recovery Control supported.
					SCT Feature Control supported.
					SCT Data Table supported.

SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x000b   100   100   016    Pre-fail  Always       -       0
  2 Throughput_Performance  0x0004   131   131   054    Old_age   Offline      -       104
  3 Spin_Up_Time            0x0007   162   162   024    Pre-fail  Always       -       437 (Average 372)
  4 Start_Stop_Count        0x0012   100   100   000    Old_age   Always       -       56
  5 Reallocated_Sector_Ct   0x0033   100   100   005    Pre-fail  Always       -       0
  7 Seek_Error_Rate         0x000a   100   100   067    Old_age   Always       -       0
  8 Seek_Time_Performance   0x0004   128   128   020    Old_age   Offline      -       18
  9 Power_On_Hours          0x0012   100   100   000    Old_age   Always       -       3089
 10 Spin_Retry_Count        0x0012   100   100   060    Old_age   Always       -       0
 12 Power_Cycle_Count       0x0032   100   100   000    Old_age   Always       -       56
 22 Helium_Level            0x0023   100   100   025    Pre-fail  Always       -       100
192 Power-Off_Retract_Count 0x0032   097   097   000    Old_age   Always       -       4624
193 Load_Cycle_Count        0x0012   097   097   000    Old_age   Always       -       4624
194 Temperature_Celsius     0x0002   216   216   000    Old_age   Always       -       30 (Min/Max 21/38)
196 Reallocated_Event_Count 0x0032   100   100   000    Old_age   Always       -       0
197 Current_Pending_Sector  0x0022   100   100   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0008   100   100   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x000a   200   200   000    Old_age   Always       -       0

SMART Error Log Version: 1
No Errors Logged

SMART Self-test log structure revision number 1
Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Short offline       Completed without error       00%      3089         -

SMART Selective self-test log data structure revision number 1
 SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS
    1        0        0  Not_testing
    2        0        0  Not_testing
    3        0        0  Not_testing
    4        0        0  Not_testing
    5        0        0  Not_testing
Selective self-test flags (0x0):
  After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.

=== START OF OFFLINE IMMEDIATE AND SELF-TEST SECTION ===
Sending command: "Execute SMART Extended self-test routine immediately in off-line mode".
Drive command "Execute SMART Extended self-test routine immediately in off-line mode" successful.
Testing has begun.
Please wait 1099 minutes for test to complete.
Test will complete after Sun Oct 10 05:20:36 2021 EEST
Use smartctl -X to abort test.
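
The 1099 minutes reported for the extended test are easier to plan around once converted to hours. A small sketch of that arithmetic is shown here, with minutes_to_hm as a hypothetical helper name.

```shell
#!/bin/sh
# Hypothetical helper: convert a duration in minutes to hours and minutes.
minutes_to_hm() {
    echo "$(( $1 / 60 ))h $(( $1 % 60 ))m"
}

# The polling time reported for the extended test in the output.
minutes_to_hm 1099   # prints: 18h 19m
```

This agrees with the sample output: the test started at 11:01:36 and is reported to complete after 05:20:36 the next day, which is 18 hours and 19 minutes later.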

To abort a running test, use the following command:

smartctl -X /dev/sda

To print only the drive's overall health assessment (for the fullest report, use -a or -x instead of -H), use the command:

smartctl -d ata -H /dev/sda
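
Besides the printed result, smartctl reports its findings through its exit status, which the smartctl man page documents as a bit mask. The following is a rough sketch of decoding that mask in a script. The function name decode_smartctl_status is hypothetical, and the messages are shortened paraphrases of the man page wording.

```shell
#!/bin/sh
# Hypothetical helper: decode the smartctl exit status bit mask
# (bits 0 to 7, as listed in the EXIT STATUS section of the man page)
# into one short message per set bit.
decode_smartctl_status() {
    status=$1
    if [ "$status" -eq 0 ]; then
        echo "no problems reported"
        return 0
    fi
    bit=0
    for msg in \
        "command line did not parse" \
        "device open failed" \
        "a SMART or ATA command failed" \
        "SMART status: DISK FAILING" \
        "prefail attributes at or below threshold" \
        "attributes were at or below threshold in the past" \
        "error log contains errors" \
        "self-test log contains errors"
    do
        # Test whether the current bit is set in the status value.
        if [ $(( (status >> bit) & 1 )) -eq 1 ]; then
            echo "$msg"
        fi
        bit=$(( bit + 1 ))
    done
}

# Example with a fixed status value in which only bit 3 is set.
decode_smartctl_status 8   # prints: SMART status: DISK FAILING
```

In a real script the status would be captured right after the health check, for example by passing $? to the function immediately after running smartctl -H on the device.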

To check the estimated time the self-tests will take, use the command:

smartctl -c /dev/sda

To print only the error log, run the command:

smartctl -l error /dev/sda

To get help information, run:

smartctl --help

Conclusion

In this guide, you learned how to install and use the S.M.A.R.T. monitoring tools to check the health of your SSDs and HDDs. For more information, see the smartctl man page.
