ata3.00: status: DRDY ERR

The Problem

Over the course of my career I have seen one specific error message fairly often on Linux production servers. The error is associated with hard drive failures. I have seen the problem on systems with conventional HDDs as well as on servers that use only NVMe drives. Disks in RAID arrays are affected as well as single-drive systems.

There is relatively little information on the internet compared to how often the problem appears. Sometimes the errors are printed directly to the console in the shell, and sometimes they are only visible when “dmesg” is executed.

Here is the error in question:

ata3.00: status: { DRDY ERR }
ata3.00: error: { UNC }
ata3.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x0
ata3.00: BMDMA stat 0x25
ata3.00: failed command: READ DMA

Of course, details such as the ATA number differ depending on the configuration of the system.
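
To see at a glance which ATA ports report such errors, the kernel log can be filtered. The following is only a minimal sketch and not part of my original procedure: the function name ata_error_ports is made up for illustration, and the pattern covers only the message types shown above.

```shell
# Hypothetical helper: read kernel log text on stdin and print each ATA port
# that reported an error, together with the number of matching lines.
ata_error_ports() {
    # Keep only lines that look like the ATA error reports shown above,
    # reduce each line to its port name (e.g. ata3), then count per port.
    grep -E 'ata[0-9]+[.:][0-9]+: (exception|failed command|status|error)' \
        | grep -oE 'ata[0-9]+' \
        | sort | uniq -c | sort -rn \
        | awk '{ print $2, $1 }'
}

# Usage (may require root, depending on the system):
#   dmesg | ata_error_ports
```

A port that shows up with a high count is the one to look at first in the analysis below.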


The Analysis

The error basically says that something is wrong with a specific disk drive. We need to find out which physical disk is causing the error.

This command works for me on Ubuntu 20.04:

ls /dev/disk/by-path -al

It shows the ATAx to /dev/sdX association on my system. Now that I know the /dev/sdX device, I need to get the serial number of the hard drive:

smartctl -a /dev/sdX

Knowing the serial number of the hard disk is important because, at least on HDDs, it is printed on a sticker on the disk.
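
The two lookups above can also be scripted. This is a sketch under assumptions, not a tested tool: it assumes the usual sysfs layout in which the resolved path of /sys/block/sdX contains the ataN port as a directory component, and that smartctl prints a line starting with "Serial Number:". The function names ata_to_dev and smart_serial are hypothetical.

```shell
# Hypothetical helper: print the /dev/sdX device(s) behind an ATA port.
# $1 = port name as it appears in the kernel log, e.g. ata3
# $2 = directory with the block device links (defaults to /sys/block)
ata_to_dev() {
    port=$1
    sysroot=${2:-/sys/block}
    for link in "$sysroot"/sd*; do
        [ -e "$link" ] || continue
        # Resolve the symlink; the real path names the port as a directory.
        target=$(readlink -f "$link")
        case "$target" in
            */"$port"/*) echo "/dev/$(basename "$link")" ;;
        esac
    done
}

# Hypothetical helper: extract the serial number from smartctl output on stdin.
smart_serial() {
    awk '/^Serial Number:/ {
        sub(/^Serial Number:[[:space:]]*/, "")
        sub(/[[:space:]]+$/, "")
        print
        exit
    }'
}

# Usage:
#   dev=$(ata_to_dev ata3)
#   sudo smartctl -i "$dev" | smart_serial
```

The second argument of ata_to_dev exists only so the function can be tried against a copy of the directory structure; on a real system it can be omitted.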


The Fix

In at least 40 percent of the cases in which this error appeared, I had luck by simply shutting down the computer, opening the case, identifying the right physical disk and replacing its SATA cable.

Once, the error appeared on multiple disks at the same time. The fix was to order an identical mainboard and install it. I connected the disks to the new mainboard and the error never appeared again.

Another method is simply to shut down the system, then unplug and reconnect the SATA cables on the same socket (loose contact). This solved the problem for me once.

In the case of NVMe drives, it can help to shut down the computer, unplug the NVMe drive and reconnect it to rule out a loose contact.

But sometimes the hard disk itself is simply defective and needs to be replaced.
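
The SMART attribute table can give a hint as to whether the cable or the disk itself is the more likely culprit. The sketch below goes beyond what I described above and is only a rough aid: it assumes the classic ATA attribute table printed by "smartctl -A", with the raw value in the tenth column, and the function name smart_hints is made up. A rising UDMA_CRC_Error_Count usually points to the cable or connector, while reallocated or pending sectors point to the disk.

```shell
# Hypothetical helper: read "smartctl -A" output on stdin and print the raw
# values of a few attributes that help tell a cable problem from a dying disk.
#   5   Reallocated_Sector_Ct   (disk surface)
#   197 Current_Pending_Sector  (disk surface)
#   198 Offline_Uncorrectable   (disk surface)
#   199 UDMA_CRC_Error_Count    (cable / connector)
smart_hints() {
    awk '$1 ~ /^(5|197|198|199)$/ { print $2, $10 }'
}

# Usage:
#   sudo smartctl -A /dev/sdX | smart_hints
```

Attribute names and availability differ between vendors, so a missing line does not mean the disk is healthy.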

