Masking bad RAM with Grub2

I recently ran into a situation where, while installing some packages in Debian, the display started showing graphical errors and the root file system was reported as read-only (it was configured to switch to read-only on errors through its mount options). The first complete memtest86+ pass showed no errors at all, but later runs indicated errors at at least three addresses:

  • 00010c1d370
  • 00010c1dab0
  • 00004c1da90

Interestingly, grub2 supports masking sections of RAM out of the box, a feature I recently spotted in /etc/default/grub by chance. The example and documentation of the GRUB_BADRAM parameter in that file looked like it was just a list of addresses to ignore, so I started with "0x10c1d370,0x10c1dab0,0x04c1da90" for it... only to find a frozen Grub after reboot. After a bit of investigation I learned that every second entry is a mask on its predecessor, and I found a good howto on how to construct these. The bad RAM mask gave me a few hours without noticeable errors... and then they came back, from another unmasked section I suppose. That made me order new RAM of a different brand.
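
For illustration, here is a minimal Python sketch of how such address/mask pairs could be built from the addresses memtest86+ reported. It is not taken from the howto; it assumes 4 KiB pages, addresses that fit in 32 bits, and that masking the whole page around each bad address is what you want. The function name `badram_pairs` and its parameters are made up for this example.

```python
PAGE_SIZE = 4096  # assumption: 4 KiB pages


def badram_pairs(addresses, page_size=PAGE_SIZE, addr_bits=32):
    """Build an "address,mask,address,mask,..." string for GRUB_BADRAM.

    Each bad address is rounded down to the start of its page. The mask
    keeps every address bit above the page offset, so a pair is meant to
    match exactly one page. addr_bits=32 is an assumption that fits the
    addresses in this post.
    """
    page_mask = ((1 << addr_bits) - 1) & ~(page_size - 1)
    # A set removes duplicates when several bad addresses share a page.
    pages = sorted({addr & page_mask for addr in addresses})
    return ",".join(f"0x{page:08x},0x{page_mask:08x}" for page in pages)


bad_addresses = [0x10C1D370, 0x10C1DAB0, 0x04C1DA90]
print(f'GRUB_BADRAM="{badram_pairs(bad_addresses)}"')
# prints: GRUB_BADRAM="0x04c1d000,0xfffff000,0x10c1d000,0xfffff000"
```

Two of the three addresses fall into the same page, so only two pairs come out. On Debian the resulting line would go into /etc/default/grub, followed by running update-grub; treat the values as a starting point to check against your own memtest86+ results rather than a verified configuration.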