Guide: BSOD 101

Everlong

To xor or not to xor
Vista Guru
It’s late at night, you’re sitting at your computer playing a game or working on a project when, suddenly, Windows freezes completely. All your work is gone, and you find a blue screen full of gibberish staring back at you. Windows is dead, Jim, at least until you reboot it. You have no choice but to sigh loudly, shake your fist at Bill Gates and angrily push the reset button. You’ve just been visited by the ghost of Windows crashed: the Blue Screen of Death.

A website covering BSODs and stop errors, with some good explanations of them all:

Blue Screen of Death Survival Guide: Every Error Explained | Maximum PC

It makes me feel bad to see people misled by "BSoD guides" which are of limited use because they're written by sysadmins as opposed to driver development folks. The two listed so far are perhaps better than many, but I still feel that they do more to confuse than to help:

1) Blue Screen of Death Survival Guide: Every Error Explained | Maximum PC

"There are many parts to a BSOD, but the most important is right at the top." - This is a fundamental misunderstanding. Unless it happens to be one of the special case stop codes which definitively point at hardware, and that is comparatively rare, the stop code itself is almost entirely useless as a means of uniquely identifying the problem, let alone fixing it. Not only can the same underlying fault manifest itself in a variety of different stop codes, depending on runtime factors (0xD1 or 0xA or 0x50 or...), but also all of the common stop codes are shared between so many unrelated driver bugs and hardware problems that it almost becomes a hindrance when searching online for a solution because of the many false leads.

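To illustrate the point about one fault showing up under several stop codes, here is a deliberately simplified toy model in Python. It is only a sketch: the names `BadAccess` and `stop_code_for` are made up for illustration, and the real kernel logic has many more branches and outcomes than the three shown. The assumed scenario is a single bug (a driver leaves behind a pointer to invalid memory), where the only things that vary are the IRQL at the moment the pointer is touched and whose code happens to touch it.

```python
from dataclasses import dataclass

# IRQL values used by Windows kernel-mode code.
PASSIVE_LEVEL = 0
APC_LEVEL = 1
DISPATCH_LEVEL = 2

STOP_NAMES = {
    0x0A: "IRQL_NOT_LESS_OR_EQUAL",
    0x50: "PAGE_FAULT_IN_NONPAGED_AREA",
    0xD1: "DRIVER_IRQL_NOT_LESS_OR_EQUAL",
}


@dataclass
class BadAccess:
    """Hypothetical record of one dereference of the same bad pointer."""
    irql: int             # IRQL at the time of the access
    in_driver_code: bool  # True if the faulting instruction is in a driver


def stop_code_for(access: BadAccess) -> int:
    """Toy rule of thumb, NOT the kernel's actual decision logic."""
    if access.irql >= DISPATCH_LEVEL:
        # Page faults cannot be serviced at this level, so the access is fatal
        # and is reported according to whose code performed it.
        return 0xD1 if access.in_driver_code else 0x0A
    # At lower IRQL an access to an invalid address is reported as a
    # page fault that could not be resolved.
    return 0x50


if __name__ == "__main__":
    # One underlying bug, three different runtime circumstances.
    scenarios = [
        BadAccess(irql=DISPATCH_LEVEL, in_driver_code=True),
        BadAccess(irql=DISPATCH_LEVEL, in_driver_code=False),
        BadAccess(irql=PASSIVE_LEVEL, in_driver_code=True),
    ]
    for s in scenarios:
        code = stop_code_for(s)
        print(f"{s} -> 0x{code:08X} {STOP_NAMES[code]}")
```

The same bad pointer produces three different stop codes here, which is why searching on the stop code alone throws up so many false leads.
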
"0xA: The most common cause of this conflict is improperly installed drivers for a piece of hardware you recently installed." - (emphasis mine) No, the most common cause for stop 0xAs are bugs in a driver properly installed. A subtle but huge difference. There is virtually no way to "improperly install" a driver so as to cause stop errors, but there are a multitude of ways to include bugs during the coding phase.

"You're more like to experience this IRQL error when switching form one videocard brand to another, as the drivers will conflict with each other." - This is an attempt to ascribe personality to drivers for lack of any actual code-level understanding.

"UNEXPECTED_KERNEL_MODE_TRAP (0x0000007F) If you see this blue screen, you're probably overclocking your CPU, but this is not always the case." - In fact, this is nominally a software error, although system abuse in the form of overclocking can theoretically yield almost any bugcheck code at all.

"PAGE_FAULT_IN_NONPAGED_AREA Faulty hardware, including RAM (system, video, or L2 cache)." - Again, the immediate conclusion that it must be hardware is counterproductive because there are a huge number of software errors which manifest themselves in this bugcheck type. The advice may lead to people prematurely and perhaps incorrectly dismissing the possibility of a software cause for their headaches.

"BAD_POOL_CALLER Caused by a faulty or incompatible hardware driver" - A tautology because 99% of (software) bugchecks are caused by "faulty or incompatible" drivers. The only ones that don't fit that description are the 0xC000021A varieties - critical process termination up in user-mode.

"PFN_LIST_CORRUPT Caused by faulty RAM." - A faulty generalisation. This bugcheck can be caused by many different hardware faults not in the RAM itself, as well as an errant driver.

"Reading blue screens of death is fun and all, but there's another, easier way to discover what your PC's problem is: the Event Viewer. When an error occurs in Windows, the OS adds a note to the system's log files. These logs are accessible through Windows's Event Viewer, and they contain all the information we need to know what ails our poor computer." - Perhaps the most telling mistake in the write-up which makes it obvious the author(s) have no driver development experience. In fact, the event viewer is generally useless when troubleshooting bugchecks because low-level breakdowns sufficiently serious to lead to KeBugCheckEx (the BSOD function) being called will not pause to first write a bit of text into the event log. You'd almost be better off to assume that whatever made it into the event log is not the cause of a subsequent bugcheck.


2) Troubleshooting Windows STOP Messages

"0x0000000A: IRQL_NOT_LESS_OR_EQUAL Technically, this error condition means that a kernel-mode process or driver tried to access a memory location to which it did not have permission, or at a kernel Interrupt ReQuest Level (IRQL) that was too high. (A kernel-mode process can access only other processes that have an IRQL lower than, or equal to, its own.)" - A tragicomical description seemingly woven together from the authors own attempts to understand other people's "guides".

"0x00000012: TRAP_CAUSE_UNKNOWN By its very nature, this error means that the cause of the identified problem is unknown." - Profound. And by the very nature of this author's writeup, it's obvious they're attempting to provide detail at a code level - more so than the previous "guide" - without any actual experience with kernel-mode code.

==============================================

My advice is to be extremely selective when choosing whom to believe in this context. Troubleshooting bugchecks (a.k.a. "stop errors", a.k.a. "BSoDs") is sufficiently complex without retracing other (inexperienced) people's misunderstandings and over-generalisations. The authors mean well, but their advice is misleading.

Both of these "guides" gloss over the main bit of info that can actually help you understand the cause of a particular stop error - the minidump generated at the time of the crash.
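
As a starting point, here is a small Python sketch for finding the most recent dumps and checking that they really are kernel dumps before loading them into a debugger such as WinDbg (where `!analyze -v` does the real work). Assumptions: the default minidump folder `C:\Windows\Minidump` (it is configurable, so it may differ on a given machine), and the 8-byte signatures `PAGEDUMP` / `PAGEDU64` at the start of 32-bit and 64-bit kernel dump files. The function names are made up for illustration.

```python
from pathlib import Path

# Default location on most systems; treat it as an assumption.
DEFAULT_MINIDUMP_DIR = r"C:\Windows\Minidump"

# First 8 bytes of a kernel dump file identify its flavour.
KERNEL_DUMP_SIGNATURES = {
    b"PAGEDUMP": "32-bit kernel dump",
    b"PAGEDU64": "64-bit kernel dump",
}


def classify_dump(path):
    """Return a short description based on the file's 8-byte signature."""
    with open(path, "rb") as f:
        magic = f.read(8)
    return KERNEL_DUMP_SIGNATURES.get(magic, "not a kernel dump")


def newest_dumps(folder=DEFAULT_MINIDUMP_DIR):
    """List *.dmp files in the folder, most recently modified first."""
    dumps = Path(folder).glob("*.dmp")
    return sorted(dumps, key=lambda p: p.stat().st_mtime, reverse=True)


if __name__ == "__main__":
    folder = Path(DEFAULT_MINIDUMP_DIR)
    if folder.is_dir():
        for dump in newest_dumps(folder):
            print(dump.name, "-", classify_dump(dump))
    else:
        print("No minidump folder found at", folder)
```

This only tells you which files are worth opening. The actual analysis (faulting module, stack, bugcheck parameters) still has to come from a debugger.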