RAID Trouble: Running slowly?

My computer is used every day and I was pretty sure it was taking longer and longer to boot up. Other ominous signs were also cropping up. When I bothered to check the Windows logs I usually found red marks. These were always uninformative, although the same type of error was often repeated. A typical one was an iastor error. Searching the Internet for the error number, the phrase that kept appearing, or a file name never turned up anything definitive or even clear.
As with lots of advice on the Internet nowadays, most of it was rubbish and ill-informed. When a genuine hardware fault warning appears, why do lots of people believe it's a software bug? No. If a warning appears you should treat it as a genuine warning. Switching off the diagnostic software will eventually lead you into trouble.

What else happened? Just once or twice a fortnight the computer jammed up. The desktop was present and the mouse moved around, but opening a program often resulted in "not responding" being appended to the information on the screen.
About two weeks ago I noticed a message popped up in the bottom corner of the screen. It looked like an Intel Rapid Storage Technology message, so I clicked it and was informed there was a problem.
It told me that a hard disk had failed to respond and was now lost to the system, or words to that effect.
Now, I'll just explain the design of my computer to put this into context. I've had a RAID setup for over ten years. Not the same setup of course, but each computer I build for my own use has two hard disks carrying the operating system. These are arranged in RAID 1, which is a mirrored system. Drive C, which is a Seagate 1TB, is shadowed on a hidden disk, and whatever is written to drive C is also written to its mirror, while reads can be served by either disk. The latest computer I have uses a pair of data disks as well. These are also mirrored, but differently to drive C. The data disk is a 2TB Seagate named Drive E and is mirrored by a hidden second identical disk, but whereas Drive C uses Intel RAID software, Drive E uses the mirror feature built into Windows 7.

Back to the latest problem. A drive had been lost and Intel provided me with its identity. It was the drive connected to Port 1 and its serial number was S1D8HA5T. What should I do? Well, ideally I'd have got a spare drive, either running inside the computer or tucked away in a handy box. I didn't have either so I perused the catalogues and promptly ordered a new drive. At this point I considered a switch from mechanical drives to solid state, but there are good reasons for not taking that option. Primarily, the computer is at risk now that it's lost its mirror, so venturing into a totally different setup was risky. Maybe later when things are back to normal?
The new drive was now ordered and I'll fit it when it arrives.
When the failure occurred I was aware of a rhythmic clunking noise, a sure sign of something nasty going on. However, the clunking eventually stopped, and a short time later I got a message stating that a rebuild was taking place. It seems whatever had been the problem was now an ex-problem.
Eventually the rebuild was complete and the computer was repaired.
Maybe not entirely repaired however, because it was definitely taking longer to boot up than it should. After a few days I looked at the log entries. There were regular red marks under "System". I checked once again and the advice seemed to be "reload the Intel drivers". It seems that the consensus was that my problem was a software problem, not hardware?
Maybe not though. What if the Intel software was really warning me of an impending crash? The answer came a week later. After having to force a hardware reset when nothing responded to mouse clicks or useful keyboard commands, all was well for 30 minutes, then I got a BSOD. The code was STOP:0x00009086 followed by lots of zeroes. Alas, this code meant nothing concrete. What was concrete, however, was the dull clunking sound from the PC. A visit to the Intel program showed that the dodgy hard drive previously marked as faulty had failed again.
This time I had a brand new replacement, so off came the sides of the PC after closing it down and turning off the power. A torch revealed the drive with serial number S1D8HA5T and, as it had quick release guides, out it came screwdriverless. The plastic guides popped off (no screws needed) and were placed on the new Seagate. After sliding it back into place and re-plugging the power and data leads I refitted the sides and turned on the power.
Booting up resulted in a message about having to configure the new drive, but I ignored this and instead opened the Intel software application. The new drive was shown as available and I selected it as a new mirror. Rebuilding commenced immediately and after not too long I had a fully functioning RAID setup.
I checked Drive E and found a "re-sync" was in progress. This unfortunately always happens if Windows is shut down forcibly. An hour or two later and the re-sync had completed.
Now the computer is running nice and fast, just like it used to. No doubt the old hard drive had been having difficulty keeping up with its reading and writing.

A note about selecting a new hard drive in a RAID setup. A replacement hard drive has to have a capacity equal to or greater than that of the remaining good drive. There is a way round this. For example, if I wished to fit a pair of solid state drives with a capacity of, say, 240GB, I would first need to reduce the active Drive C partition to this figure, or in practice slightly less. The requirement is to determine exactly what the 240GB capacity really is, then to use the Windows shrink feature to turn my 1TB hard drive into a 240GB hard drive. The mirror can then be removed and a new 240GB SSD added. This is then used to mirror the shrunk 1TB Drive C. When the rebuild has finished the shrunk 1TB drive is unplugged and a second 240GB drive is fitted and set to mirror the first SSD. To do all this you need a clear head and to not make any mistakes!
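The sizing arithmetic behind that shrink step can be roughed out in a few lines of Python. This is only a sketch under stated assumptions: it supposes the drives are sold in decimal gigabytes (1GB = 1,000,000,000 bytes), that the shrink tool works in units of 1,048,576 bytes, and that a 1GB safety margin is enough to stay "slightly less" than the SSD. The function names and the margin are made up for illustration, and the real capacity should always be read from the drive itself rather than from the label.

```python
MIB = 1024 * 1024  # bytes per MiB (assumed unit of the shrink tool)


def usable_mib(advertised_gb: int, safety_margin_mib: int = 1024) -> int:
    """Conservative partition size, in MiB, for a drive sold as `advertised_gb`.

    Assumes decimal gigabytes on the label; the margin is an arbitrary
    illustrative allowance so the partition ends up slightly smaller
    than the new drive.
    """
    total_bytes = advertised_gb * 1000**3
    return total_bytes // MIB - safety_margin_mib


def shrink_amount_mib(current_partition_mib: int, target_mib: int) -> int:
    """How many MiB to shrink by; zero if the partition already fits."""
    return max(0, current_partition_mib - target_mib)


if __name__ == "__main__":
    target = usable_mib(240)                 # size the 240GB SSD can safely hold
    current = 1000 * 1000**3 // MIB          # a nominal 1TB drive, whole-disk partition assumed
    print("Target partition size (MiB):", target)
    print("Shrink Drive C by (MiB):", shrink_amount_mib(current, target))
```

In practice the existing Drive C partition will be a little smaller than the whole 1TB disk, so the current size should be taken from Disk Management rather than calculated.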

 

Hard drives went U/S

 A friend brought me a pair of external hard drives. They are 500GByte and 1.5TByte and carry zillions of backup pictures and files from now defunct computers so are pretty important to him.

He said he'd mislaid the power supplies for the pair of external hard drives so used an alternative power supply instead and failed to get either working. Now, if the drives have an internally stabilized supply he'll be OK, but if they need a 12 volt stabilized supply he might be in trouble. I opened the first case and removed the SATA drive then plugged it into my computer. Oops.. I shouldn't have done that as the computer just coughed and died. I unplugged the offending drive and found the computer wouldn't boot up. After switching the mains lead for another it suddenly came back on. I turned it off and tried the original lead again. This time it worked OK. Very odd.

I checked the hard drive and found the 12 volt pins were shorted to the ground pins. A protection diode had failed short-circuit. I removed the diode and the drive worked OK. This one is made by Seagate and the diode was quite visible, but there's no fuse so the design is not so good. I removed the second drive from its housing and measured the resistance across the 12 volt input to ground. No short so I plugged it into my test computer. It went unrecognised but it was spinning. As this Samsung drive has a featureless circuit board the components are hidden underneath so I removed six torx screws and looked underneath.

A layer of padding was stuck to the components. It was stuck because a protection diode had got very hot and melted the plastic padding which had stuck to it.

It looked a sorry state. Lots of soot and vaporised copper. However, not altogether bad news as the protection diode had failed short circuit and blown some tracks which I guess were designed to fuse open. After cleaning up the soot I removed the diode and added a short length of wire to replace the missing tracks. I screwed back the circuit board and tried out the drive. Success, we now have two working drives.

The diodes are coded with "LG" and "KVP", both of which mean they are TVS protection diodes rated at 13 volts. These can tolerate lots of power and clamp too high a voltage on the 12 volt input, but not indefinitely. The power supplies are rated at around 2 Amps, which for an unstabilized supply generally means the output voltage will be 12 volts only at the rated output current. Because the output circuit will have some inherent resistance, let's say 2 ohms, the output at 1 Amp might be as high as 14 volts, hence the failure of the diodes.
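The voltage rise at light load can be checked with a rough model. The sketch below treats an unstabilized supply as an ideal voltage source in series with a fixed resistance, which is a simplification, and the 2 ohm figure is an assumed value chosen for illustration rather than a measurement. The function name is hypothetical.

```python
def output_voltage(rated_volts: float, rated_amps: float,
                   series_ohms: float, load_amps: float) -> float:
    """Estimate the output of an unstabilized supply at a given load.

    Simple model (an assumption): the supply delivers `rated_volts`
    at `rated_amps`, and every amp of load below that lets the output
    rise by `series_ohms` volts.
    """
    return rated_volts + series_ohms * (rated_amps - load_amps)


if __name__ == "__main__":
    TVS_RATING_VOLTS = 13.0  # diode rating quoted in the text
    for load in (2.0, 1.0, 0.0):
        volts = output_voltage(12.0, 2.0, 2.0, load)
        status = "above" if volts > TVS_RATING_VOLTS else "within"
        print(f"{load:.1f} A load: {volts:.1f} V ({status} the 13 V diode rating)")
```

With these assumed figures the supply only sits at 12 volts when fully loaded. A drive drawing 1 Amp or less would see a voltage above the diode rating, so the diode would be conducting continuously rather than just catching brief spikes.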
