Every network administrator has received calls from users complaining that their "computer won't start," a nebulous, uninformative phrase that can cause quite a bit of frustration. Typically, users report that something untoward happened during the startup process, either during the computer's Power On Self Test (POST) or during the Windows startup procedures. To diagnose and cure such maddening problems, you need to understand how the boot process works.
The phrase "boot failure" describes both machine and OS problems. But in the days of MS-DOS computers, the POST took longer than OS startup, and hardware was the source of most boot-failure problems. Computer hardware has become more reliable over the years, and, thanks to advanced BIOS features, the computer's ability to track, diagnose, and control that hardware is more robust. Therefore, you're more likely to encounter an OS problem when a system fails to boot. Let's walk through the startup process to see what happens at each step and to understand the meaning of any error messages you encounter. (For the purpose of this discussion, I assume you're using Windows 2000 or later.)
Power On
Is the user complaining that nothing's happening when he or she presses the power button? If so, first check the plug.
Here's an old administrator's trick for dealing with unplugged computers when you're working with a user over the phone. Users often don't check whether their unresponsive computer is unplugged, and when you mention this possibility, they're embarrassed if it turns out to be the cause. The user might say, "Of course, it's plugged in," but you need to know whether that's the truth. Ask the user to pull the plug and reinsert it, citing a need to "check the polarity issues." (Try not to giggle.) You'll be amazed how often users will report, "Hey, that worked."
If it's not the plug, it's probably the power supply, the most vulnerable hardware component in your system. Power supplies aren't expensive, but replacing them is a boring, labor-intensive exercise.
Hardware and BIOS Checks
If the user sees an error message during the POST, or if the computer simply hangs before the OS starts, the problem is in the hardware or the BIOS. The system reports hardware and BIOS errors to the screen, along with beeps to get your attention. Some BIOS errors appear as numbers, and at one time all BIOS manufacturers used the same numbers (the numbers that IBM used), but that changed. Today, if you see an error number, you need the documentation that came with your computer to interpret it. (You can also look it up by checking the BIOS manufacturer's Web site.) However, you're far more likely to see text rather than numbers, as in "Hard drive controller failure" or the always amusing "Keyboard error, press F1 to continue."
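The triage described so far can be summarized as a short Python sketch. This is only an illustration of the decision order, not a diagnostic tool: the names `Stage` and `triage` and the three yes/no inputs are hypothetical, chosen to mirror the questions you would ask a user over the phone.

```python
from enum import Enum
from typing import Tuple


class Stage(Enum):
    """Stages of startup where a failure can be reported (hypothetical names)."""
    POWER = "power"
    POST = "post"
    OS = "os"


def triage(powers_on: bool, post_error: bool, hangs_before_os: bool) -> Tuple[Stage, str]:
    """Map a user's report to the stage of the boot process to investigate."""
    if not powers_on:
        # Nothing happens at the power button: plug first, then power supply.
        return Stage.POWER, "Check the plug first, then suspect the power supply."
    if post_error or hangs_before_os:
        # An error during the POST, or a hang before the OS starts.
        return Stage.POST, "Suspect hardware or the BIOS; note any text, number, or beeps."
    # The POST completed, so the failure lies in the OS startup.
    return Stage.OS, "The POST completed; investigate the OS startup."


stage, advice = triage(powers_on=True, post_error=False, hangs_before_os=True)
print(stage.name, "-", advice)
```

The order of the checks matters: power comes before the POST, and the POST comes before the OS, just as in the startup sequence itself.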
You might also see an error that references memory problems. In the old days, memory components had an extra chip called a "parity chip," and part of the BIOS test was a parity test. Memory components no longer include parity checking because it's not really necessary anymore: Memory manufacturing has advanced to the point at which failure is highly unusual. However, after you add memory to a machine, you might see a memory error message at the next boot. The message displays text such as "Mismatched memory information." This error message is actually a confirmation that the system sees the memory you installed but finds that it doesn't match the total recorded in CMOS.
To solve this problem, try restarting the computer and entering the keystrokes required to get into the BIOS setup program. In my experience, doing so jumpstarts the solution because the correct memory count automatically appears as soon as you enter the BIOS setup screen, and all that's left to do is exit the BIOS setup program. Accessing the BIOS setup program causes the system to check the memory count and adjust it so that it matches the physical memory total.
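Conceptually, the mismatch message comes from a simple comparison between two totals. The following Python sketch models that comparison. The function name, the parameters, and the message wording are my own assumptions for illustration; a real BIOS performs this check in firmware, and the exact text varies by manufacturer.

```python
from typing import Optional


def check_memory_count(cmos_recorded_mb: int, detected_mb: int) -> Optional[str]:
    """Return None when the totals agree, otherwise a mismatch message.

    cmos_recorded_mb: total stored in CMOS from the last saved setup (assumed input).
    detected_mb: total the system counts at this boot (assumed input).
    """
    if cmos_recorded_mb == detected_mb:
        return None
    return (
        f"Mismatched memory information: CMOS records {cmos_recorded_mb} MB, "
        f"system sees {detected_mb} MB. Enter and exit BIOS setup to update the total."
    )


# A machine with 128 MB recorded in CMOS that now holds 256 MB.
print(check_memory_count(128, 256))
```

Note that a mismatch means the new memory was detected, which is why this error is good news compared with the case in which the system doesn't see the memory at all.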
If you add memory to a computer and encounter an error message that doesn't mention a mismatched memory count, you have a more serious problem. The system doesn't recognize the new memory. This situation is almost always caused by an error in the physical insertion of the memory, such as using the wrong slot or not inserting the teeth properly. However, I've also seen the problem when the wrong memory type is inserted (e.g., inserting DRAM in an older computer that uses Extended Data Out, or EDO, memory), when the motherboard doesn't like mixing SIMMs and DIMMs, or when the motherboard doesn't like mixed memory speeds. Some motherboards require a change in dipswitches or jumper configuration when you add memory, although those requirements are becoming less common. To avoid these problems, always check the motherboard documentation before adding memory.
If you see a hard disk error during POST, you have a serious problem. Actually, I've found that half the time the problem is the controller, not the disk, and replacing the controller lets the disk boot normally, with all data intact (whew!). If an embedded controller dies, you don't have to buy a new motherboard; instead, you can buy a controller card. Check the motherboard documentation for the tasks required to make the BIOS see the card instead of looking for the embedded chip.
If the problem is indeed the disk, you have more work to do than merely replacing a controller. In addition to replacing the disk, you have to reinstall the OS and applications, as well as restore from the most recent backup, which is, of course, dated yesterday, right?
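The compatibility pitfalls listed above amount to a pre-installation checklist, which can be sketched in Python as follows. Everything here is hypothetical: the `Module` fields, the `compatibility_warnings` function, and the string labels are illustrative, and whether a given board actually tolerates a mix depends on its documentation.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Module:
    """One memory module, described by labels you would read off the part."""
    form_factor: str  # e.g. "SIMM" or "DIMM"
    mem_type: str     # e.g. "EDO"
    speed: str        # whatever speed rating is printed on the module


def compatibility_warnings(modules: List[Module], board_mem_type: str) -> List[str]:
    """Return warnings for the pitfalls named in the text; an empty list means none found."""
    warnings = []
    if any(m.mem_type != board_mem_type for m in modules):
        warnings.append(f"Wrong memory type: the board expects {board_mem_type}.")
    if len({m.form_factor for m in modules}) > 1:
        warnings.append("Mixed form factors (e.g., SIMMs and DIMMs).")
    if len({m.speed for m in modules}) > 1:
        warnings.append("Mixed memory speeds.")
    return warnings


installed = [Module("SIMM", "EDO", "60ns"), Module("DIMM", "EDO", "70ns")]
for warning in compatibility_warnings(installed, board_mem_type="EDO"):
    print(warning)
```

A clean result from a checklist like this doesn't replace the motherboard documentation, and it can't catch a module that is simply seated badly.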