Re: Blue Screen Error
READ THIS SEE IF IT HELPS
Bugchecks Explained: PFN_LIST_CORRUPT
OSR Staff | Published: 24-Aug-04| Modified: 24-Aug-04
Windows tracks physical pages of memory using a table called the Page Frame Database. This database (which actually is just a big one-dimensional array) is indexed by physical page number. As a result, the page frame database is typically referred to as the Page Frame Number list or PFN.
Every page of physical memory has an associated PFN entry. Each PFN entry contains information about the state of its corresponding physical page in the system. This state includes information about whether the corresponding physical page is in use, how it’s being used, a count of active users of the page, and a count of pending I/O operations on the page.
Depending on the pages state, a PFN entry may be on one of several lists that the Memory Manager maintains. The listheads for these lists are simple global variables that are used for quick access to PFN entries of certain types. For example, one such list would be the list that contains all the modified pages that need to be written to disk.
Because all the PFN lists and entries are present in the high half of kernel virtual address space, they are subject to corruption through stray pointer accesses (such as by errant drivers or other similar kernel-mode modules). Also, the count in the PFN that tracks the number of I/O related accesses to a given physical page can be corrupted by improper MDL handling.
Whenever Windows detects that any of the PFN lists or any of the PFN entries themselves have become invalid, the system halts with a PFN_LIST_CORRUPT bugcheck.
Who Did It?
This bugcheck usually occurs for one of two reasons, the first reason being memory corruption. If there is a buggy driver in the system that is writing on memory that it does not own, it could easily corrupt one of the PFN lists or entries. In order to rule this out, you should run Driver Verifier with Special Pool enabled for suspect drivers in the system. This will hopefully allow you to catch the misbehaving driver in the act of scribbling memory, instead of receiving a crash sometime later when the O/S discovers the damage.
The second cause for this bugcheck is incorrect MDL handling. For example, one use of MDLs is to allow you to "lock" the physical memory that backs a virtual address range so that the memory stays resident while your driver is accessing it. This is achieved by using the MmProbeAndLockPages DDI. One of the things that this DDI does is take out a reference on the PFN entries of the underlying physical pages, ensuring that the Memory Manager does not page them out. The corresponding DDI to undo this operation, MmUnlockPages, is responsible for decrementing the reference counts taken out in the previous call. If a driver happens to call MmUnlockPages too many times on an MDL, the reference count on the underlying PFN entries could drop to below zero (to 0xFFFFFFFF). The system considers this to be a critical error, as one or more of the PFN entries is obviously invalid. Therefore, this bugcheck will occur.
If your driver or a driver in your stack is being blamed for a PFN_LIST_CORRUPT bugcheck, go over your code and make sure that you are properly handling your MDLs . Remember that even if you do not create or destroy any MDLs directly, you play a part in the creation and destruction of them if you handle IRPs whose buffers are described with DIRECT_IO. Driver Verifier and the checked build of Windows can help pinpoint IRP and MDL handling errors.
How Should I Fix It?
How this is fixed varies depending on the reason of the bugcheck. Using Driver Verifier and the checked build of the O/S should allow you to pinpoint the driver that is either corrupting memory or mishandling MDLs. If the offending driver is not a driver that you have any control over, the only available option is disabling the driver until a fixed version is available.
Related WinDBG Commands
Related O/S Structures
Related O/S Variables
Rate this article and give us feedback. Do you find anything missing? Share your opinion with the community!
Post Your Comment
"Missing a real-life debugging sample"
I would like to suggest to include each bugcheck explanation with at least one sample debugging session showing how to approach the particular bugcheck. Although I understand that a multitude of possible causes could exist, even one example can be very helpful.
Compaq Presario CQ5305K-m Intel® Pentium® Dual Core E5300 (2.6 GHz), Windows® 7 Home Premium 64 bit, 2048 MB , Hard drive: 320 Gb, with 18.5 Widescreen
SPURS TILL I DIE (DIAMONDS ARE FOREVER SO ARE SPURS)
TO DARE IS TO DO