Hard Drive Problems

Over the years I've sold and repaired lots of computers and found that the most common failures are optical drives, motherboards, and hard drives, in that order.
DVD and CD rewriters seem to fail very frequently, and I wonder whether the circuitry in some models neglects to turn off the laser, or leaves it turned on whenever there's a disc present.

Motherboards generally fail because the decoupling capacitors fitted to smooth the low voltage rail get quite warm, leak, and eventually develop a high resistance, allowing spikes to interfere with the working of the microprocessor.

The worst failures are, of course, hard drives. Apart from motor bearings simply wearing out (getting noisy, which is a warning that all is not well, then failing to spin up properly), I think the most common cause is bad design.

The first instance I came across that is worthy of note is a particular Fujitsu model of yesteryear. This used a complex chip supplied to the manufacturer by a sub-contractor. In turn, the sub-contractor used a specific type of glue in the manufacture of the chip that decomposed when hot into a highly corrosive chemical and ate away the internal bits of the chip. All the drives manufactured over a period of a year effectively had within them a time bomb that would cause the loss of all the user's data.

The latest hard drive problem I've met concerns Seagate drives. I'm not altogether sure of the whole story, but it goes something like this:

Built into a hard drive is a program, called "firmware", that determines how the drive stores and retrieves data and a few frills, such as monitoring and reporting errors.

As I understand it, the Seagate designers decided to change the design of their Barracuda drive and employ some of the platter space (the platters are where your data gets stored) to carry some, but not all, of the firmware.

Most of the firmware is carried in a chip on the circuit board attached to the drive, but because the programmers ran out of space, a little is put onto the platters.
Ordinarily this is satisfactory, but if the hard drive misbehaves for some reason then a circuit board change cannot fix the problem, because the firmware on the replacement board and that on the platters do not match up.

The problem with the Seagate Barracuda drive was associated with a slip-up by the programmers who wrote the firmware, in particular version "ST15".
In ST15 a rare combination of hard drive manipulations resulted in a stray bit overflowing from one area to another. Unfortunately the area to which it overflowed included the "Busy" flag. When a hard drive does not want to be disturbed, for example, when it's writing data to the platters, it puts up a "Busy" flag and the computer will wait until this is reset before continuing its dialogue with the hard drive.

In the fault condition, because the hard drive wasn't actually busy, and didn't actually know the busy flag was set, it was oblivious to the fact that the outside world couldn't access it. In fact it was a stalemate. The computer could not access the drive and arrange somehow to reset the busy bit, and the hard drive was just sitting there waiting to be used, but completely inaccessible.
This inaccessibility extended even to the computer BIOS. At start-up the computer will always interrogate any hard drives present and either confirm the settings are correct, or revise the settings if a new drive has been fitted.
Because the busy flag is set, the computer has no option but to declare that there is no hard drive present. This being so, the computer will fail to boot and will display a message of some kind saying that it cannot find an operating system.
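To make the stalemate a little more concrete, here is a minimal sketch in Python of what the computer is doing at start-up. It is only a simulation under my own assumptions: the class SimulatedDrive and the function detect_drive are invented for illustration and bear no relation to Seagate's firmware or to any real BIOS code. The one real-world detail is that the ATA status register reports "busy" in its top bit (0x80).

```python
BSY = 0x80   # bit 7 of the ATA status register: drive is busy
DRDY = 0x40  # bit 6 of the ATA status register: drive is ready


class SimulatedDrive:
    """Hypothetical stand-in for a hard drive's status register."""

    def __init__(self, stuck_busy=False):
        self.stuck_busy = stuck_busy
        self._busy_polls_left = 3  # a healthy drive is busy for a few polls

    def read_status(self):
        if self.stuck_busy:
            # The fault condition: the flag is set and nothing ever clears it.
            return BSY
        if self._busy_polls_left > 0:
            self._busy_polls_left -= 1
            return BSY
        return DRDY


def detect_drive(drive, max_polls=10):
    """Poll the status register the way a BIOS might, giving up after a timeout."""
    for _ in range(max_polls):
        if not drive.read_status() & BSY:
            return True   # busy flag cleared, so the drive can be interrogated
    return False          # still busy after the timeout: treat as "no drive"


healthy_found = detect_drive(SimulatedDrive())
faulty_found = detect_drive(SimulatedDrive(stuck_busy=True))
```

The healthy drive clears its flag after a few polls and is found. The faulty one never does, so the only thing the computer can do is give up and report that no drive is fitted.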

When faced with this difficulty there is very little that can be done by an ordinary computer user.

A glance at Seagate's website indicated they were being very coy about the affair, but a search of the Internet revealed I wasn't the first to be aware of the Barracuda drive problem.

Mostly, any positive feedback pointed in the direction of a particular data recovery company that was offering some free software and a schematic for a special piece of hardware that could be connected to the factory setup pins of the hard drive and so reset the busy flag.

I downloaded the free software and purchased the parts for the special box of tricks.

Next I investigated the exact details of how to go about sorting out my particular Barracuda drive. Drat… the free software was designed to fix the previous version of the drive… it seems the bad firmware was used in successive versions of the Barracuda before being spotted. A call to the firm that supplied the free software quickly revealed that the version I needed wasn't free, it was $500. This included the special interface cable so I didn't have to make my own.

I didn't feel like parting with $500.

I returned to the Seagate website and penned a message to them. The data is very, very important to my customer, I explained. It certainly was, as he hadn't backed up his prolific email exchanges.
After I'd acknowledged a return email asking if I was serious about opening some dialogue with them, things happened very swiftly…

I received a phone call from TNT (the carriers) requesting me to print out an advice note (to be attached to an email, to follow) for their driver, who was speeding towards me that very minute… I did so and hastily packaged the hard drive, using an address supplied in the email from Seagate.

A short time later the TNT driver knocked on the door and accepted the package, remarking that the address I'd put on the box was wrong… not Seagate in the UK, but a firm in Holland.

The next day I got another email stating that my hard drive was fixed, and a day or two later the repaired drive was delivered by TNT; in fact it arrived so early I was still in bed and had to collect it from a neighbour.

Full of confidence I fitted it to my customer's computer and, lo and behold, it fired up to the XP Desktop exactly as it was before the busy flag had been set. A quick check revealed that the firmware was now the latest version so, hopefully, the drive will be good for several years.

What about a RAID system, you might ask? This would have a second hard drive carrying a mirror of the boot drive so that failure of one drive would not result in loss of data.
Well, funnily enough, the computer I supplied did have a RAID-1 configuration… I actually fitted two Seagate Barracuda drives, but strangely the RAID setup had somehow been disconnected after a month or so without the owner noticing. The other drive did in fact take over when I'd changed the BIOS settings, but it stepped back 6 months or so in time, to the day it had been disconnected. I checked its firmware status and found it to be ST15. I therefore ran a small firmware update program supplied by Seagate and updated it before it too met the same fate as its partner.
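For anyone wondering why the second drive "stepped back in time", here is a rough sketch in Python of how a mirror behaves. It is a toy model of my own: the Mirror class and the drive names "A" and "B" are invented for illustration, and it is not how any real RAID controller or driver is written.

```python
class Mirror:
    """Toy model of a RAID-1 pair: every write goes to each drive still attached."""

    def __init__(self):
        self.drives = {"A": {}, "B": {}}
        self.attached = {"A", "B"}

    def write(self, name, contents):
        # Only the drives still in the array receive the new data.
        for drive in self.attached:
            self.drives[drive][name] = contents

    def detach(self, drive):
        # The drive keeps whatever it held, but receives nothing further.
        self.attached.discard(drive)

    def read(self, drive, name):
        return self.drives[drive].get(name)


mirror = Mirror()
mirror.write("inbox", "emails up to month 1")
mirror.detach("B")                      # the mirror silently drops out
mirror.write("inbox", "emails up to month 7")

current = mirror.read("A", "inbox")     # the drive that stayed in use
stale = mirror.read("B", "inbox")       # the drive that dropped out
```

The drive that dropped out still holds a perfectly good copy, but only of the day it was disconnected. A mirror protects you only while both halves are actually being written to, which is why it pays to check now and again that the array is still intact.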

The final phase was to make a backup of the repaired drive, just to be safe, then reformat the second drive and wait till the RAID software finished its mirroring.
I suppose the moral is to never alter BIOS settings unless you know exactly what you're doing, and NEVER carry out a motherboard BIOS update, because this invariably resets the settings to default… which isn't compatible with a RAID setup.

It's a pity that Seagate didn't advise resellers, via suppliers, to update the firmware on any Barracuda drives they'd supplied.

