On Linux, the primary way to obtain debugging information after a serious error or fault is the kdump mechanism. Kdump captures a wealth of kernel and machine state and writes it to a file for post-mortem debugging. But if kdump writes to a file on a remote server and networking is down, kdump cannot work.
In this context, networking includes the guest’s network driver and stack, the host’s network driver(s), and the network hardware both on the host and in the surrounding data center.
What is Linux Pstore?
Linux provides a persistent storage file system, pstore, that can store error records when the kernel dies (or reboots or powers off). These records can, in turn, be referenced to debug kernel problems (currently, the kernel stuffs the tail of the dmesg, which also contains a stack backtrace, into pstore).
The pstore is backed by local non-volatile memory and presented to the running system via traditional filesystem interfaces. Since it uses local non-volatile memory, pstore works even when kdump cannot.
Overview
Pstore was introduced into Linux to record information (e.g. the dmesg tail) upon panics and shutdowns. Pstore is independent of kdump and can run before it. In specific scenarios (i.e. hosts/guests with root filesystems on NFS/iSCSI where networking software and/or hardware has failed), pstore may contain post-mortem debugging information that is not otherwise captured.
Pstore is a persistent storage driver. It saves data in a reserved part of memory, which can then be read from a working kernel. Its primary purpose is to save kernel crash logs to memory.
Ramoops
Ramoops is an oops/panic logger that writes its logs to RAM before the system crashes. It works by logging oopses and panics in a circular buffer. Ramoops needs a system with persistent RAM so that the content of that area can survive after a restart.
Pstore & Ramoops Setup
Kernel Configuration
To enable pstore and use the ramoops backend, make sure the following kernel options are set:
CONFIG_PSTORE=y
CONFIG_PSTORE_CONSOLE=y
CONFIG_PSTORE_RAM=y
They can be found under File systems > Miscellaneous filesystems > Persistent store support.
On older kernels without CONFIG_PSTORE_CONSOLE and CONFIG_PSTORE_RAM, you will need to enable the following instead:
CONFIG_RAMOOPS=y
Configuring Ramoops
Make sure that the memory address you reserve and its size are the same in your downstream and mainline configuration, or else you won’t be able to read the information.
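One way to compare the two configurations is to print the region the running kernel was actually given. The sketch below assumes ramoops was configured through module parameters and that they are exposed under /sys/module/ramoops/parameters (usually readable by root only). On device-tree platforms the region may come from a reserved-memory node instead, in which case these files may not reflect it. The function name show_ramoops_params is made up for illustration.

```shell
# Print "name=value" for each ramoops parameter found in a directory.
# The default path is an assumption; pass another directory to override it.
show_ramoops_params() {
    dir="${1:-/sys/module/ramoops/parameters}"
    for name in mem_address mem_size record_size console_size; do
        # Skip parameters that are missing or not readable by this user.
        if [ -r "$dir/$name" ]; then
            printf '%s=%s\n' "$name" "$(cat "$dir/$name")"
        fi
    done
}
```

Running show_ramoops_params on both the downstream and the mainline kernel and comparing mem_address and mem_size is a quick consistency check.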
Now you can mount the pstore filesystem:
mount -t pstore -o kmsg_bytes=16000 - /sys/fs/pstore
$ ls -l /sys/fs/pstore/
total 0
-r--r--r-- 1 root root 7896 Nov 30 15:38 dmesg-erst-1
Different users of this interface will result in different filename prefixes. Currently, two are defined:
- “dmesg” – saved console log
- “mce” – architecture-dependent data from fatal h/w error
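Because the prefix is the part of the file name before the first dash, it can be extracted with plain shell parameter expansion. The following is a minimal sketch; the function names pstore_record_type and list_pstore_records are hypothetical helpers, not part of any pstore tooling.

```shell
# Print the record type (the part before the first "-") of a pstore file name.
pstore_record_type() {
    base=${1##*/}                 # strip any leading directory
    printf '%s\n' "${base%%-*}"   # drop everything from the first "-" on
}

# List the files of one record type in a pstore directory.
# The default directory is the usual mount point; pass another to override it.
list_pstore_records() {
    want="$1"
    dir="${2:-/sys/fs/pstore}"
    for f in "$dir"/*; do
        [ -e "$f" ] || continue   # directory is empty: glob did not expand
        if [ "$(pstore_record_type "$f")" = "$want" ]; then
            printf '%s\n' "$f"
        fi
    done
}
```

For example, list_pstore_records dmesg prints only the saved console logs and skips any mce records.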
Once the information in a file has been read, removing the file will signal to the underlying persistent storage device that it can reclaim the space for later re-use:
$ rm /sys/fs/pstore/dmesg-erst-1
The expectation is that all files in /sys/fs/pstore/ will be saved elsewhere and erased from the persistent store soon after boot, to free up space ready for the next catastrophe.
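The save-then-erase step can be sketched as a small shell function. This is an illustration only, not how any particular distribution does it: the function name archive_pstore and the choice of destination directory are assumptions, and a record is removed only after its copy succeeded.

```shell
# Copy every record from a pstore directory to an archive directory,
# then remove the original so the backend can reclaim the space.
archive_pstore() {
    src="${1:-}"
    dest="${2:-}"
    if [ -z "$src" ] || [ -z "$dest" ]; then
        echo "usage: archive_pstore SRC_DIR DEST_DIR" >&2
        return 2
    fi
    mkdir -p "$dest" || return 1
    rc=0
    for f in "$src"/*; do
        [ -f "$f" ] || continue   # nothing to archive
        if cp "$f" "$dest/"; then
            rm -f "$f"            # erase only after a successful copy
        else
            rc=1                  # keep the record and report failure
        fi
    done
    return "$rc"
}
```

A call such as archive_pstore /sys/fs/pstore /var/tmp/pstore-archive (the destination path is just an example) would typically need root privileges.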
The ‘kmsg_bytes’ mount option changes the target amount of data saved on each oops/panic. Pstore saves (possibly multiple) files based on the record size of the underlying persistent storage until at least this amount is reached. Default is 10 Kbytes.
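As a rough worked example of that rule, the number of files written per oops/panic is about kmsg_bytes divided by the backend record size, rounded up. The sketch below ignores record headers and compression, so treat the result as an estimate; the function name records_needed and the 4096-byte record size are assumptions for illustration.

```shell
# Estimate how many records are needed to hold at least kmsg_bytes,
# given the record size of the backend (integer ceiling division).
records_needed() {
    kmsg_bytes="$1"
    record_size="$2"
    if [ "$record_size" -le 0 ]; then
        echo "record size must be positive" >&2
        return 1
    fi
    echo $(( (kmsg_bytes + record_size - 1) / record_size ))
}
```

With kmsg_bytes=16000 and a hypothetical 4096-byte record size, records_needed 16000 4096 yields 4, since three records (12288 bytes) would fall short of the target.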
Pstore only supports one backend at a time. If multiple backends are available, the preferred backend may be set by passing the pstore.backend= argument to the kernel at boot time.
Configuration
The behavior of systemd-pstore is configured through the configuration file /etc/systemd/pstore.conf and corresponding snippets in /etc/systemd/pstore.conf.d/*.conf; see pstore.conf(5).
Disabling pstore processing
To disable pstore processing by systemd-pstore, set
Storage=none
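The setting belongs in the [PStore] section of the configuration. A drop-in snippet is one way to apply it without editing the main file. The sketch below writes such a snippet; the function name disable_pstore_processing and the file name 10-disable.conf are arbitrary choices for illustration.

```shell
# Write a drop-in that tells systemd-pstore not to archive pstore records.
# The default directory is the standard drop-in location; pass another
# directory to try the function without touching /etc.
disable_pstore_processing() {
    conf_dir="${1:-/etc/systemd/pstore.conf.d}"
    mkdir -p "$conf_dir" || return 1
    printf '[PStore]\nStorage=none\n' > "$conf_dir/10-disable.conf"
}
```

Writing to the default location requires root, and the setting is read the next time systemd-pstore runs.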
For example, if the Linux kernel dies, the dmesg tail is written to pstore.
If the pstore backend were UEFI, it may look more like the following:
The dmesg tail is fragmented (based on the underlying storage exchange buffer size) into several error records, which are presented as files and can be re-assembled. Of course, the most important thing is that the dmesg tail, and thus the kernel panic call trace, has been captured to determine where things went badly wrong.
The size of the dmesg tail is tuneable via CONFIG_PSTORE_DEFAULT_KMSG_BYTES and is 10KiB by default.
As the local non-volatile storage tends to be small, typically tens of kilobytes, Oracle provided the systemd-pstore service to help manage the pstore space. In short, upon boot (or when systemd-pstore is started or restarted), it archives the contents of the pstore to other storage (e.g. the regular filesystem), thus preserving the existing information and clearing pstore for future error events. Oh, and systemd-pstore will re-assemble the dmesg too!
Systemd-pstore first appeared in systemd v243 and is present and enabled by default in OL7.9 and in OL8.2 and newer. It can be re-run by issuing:
systemctl restart systemd-pstore
You can find the archive of past pstore contents under /var/lib/systemd/pstore, for example:
In that archive, dmesg.txt is re-assembled from the dmesg tail fragments of the dmesg-efi-155741337* files.
Pstore Enablement
To enable Linux pstore, which UEK does by default, ensure the following kernel configuration options are set.
With the above, pstore is enabled in the kernel and the ACPI ERST and UEFI storage backends, if present on the machine, are available.
The selection of the pstore backend is done at kernel boot time. By default, ACPI ERST is selected as the storage backend, and is preferred as it was designed for this function.
The UEFI backend is disabled by default, and to use the UEFI backend it must be explicitly selected at kernel boot time.
For UEK5 era kernels, the following kernel parameter is selected to utilize the UEFI pstore backend:
However, for UEK6 (5.4) era kernels, an additional kernel parameter is needed:
efi_pstore.pstore_disable=0
Without this kernel parameter, the EFI backend is never attempted.
To see which backend is active, you can inquire with:
# cat /sys/module/pstore/parameters/backend
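For scripts, the same check can be wrapped in a function that fails cleanly when pstore is not available. This is a minimal sketch; the function name pstore_backend is hypothetical, and the optional path argument exists only so the function can be tried against a test file.

```shell
# Print the name of the active pstore backend, or fail if it cannot be read.
pstore_backend() {
    file="${1:-/sys/module/pstore/parameters/backend}"
    if [ ! -r "$file" ]; then
        echo "pstore backend file not readable: $file" >&2
        return 1
    fi
    cat "$file"
}
```

On a machine using ACPI ERST the output would be expected to be erst, and efi when the UEFI backend was selected.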
Pstore Kernel Parameters
Two kernel parameters impact the writing of data into pstore.
Parameter printk.always_kmsg_dump writes to pstore at kernel shutdown or reboot.
Parameter crash_kexec_post_notifiers enables writing to pstore before kdump is attempted. Do be aware of the kernel documentation warning for this parameter.
These parameters can be passed at kernel boot time or set via the sysfs interface.
echo Y > /sys/module/printk/parameters/always_kmsg_dump
echo Y > /sys/module/kernel/parameters/crash_kexec_post_notifiers
To persist a change to these settings, un-comment the appropriate line(s) in /usr/lib/tmpfiles.d/systemd-pstore.conf:
#w /sys/module/printk/parameters/always_kmsg_dump - - - - Y
#w /sys/module/kernel/parameters/crash_kexec_post_notifiers - - - - Y
At next reboot, systemd will process this file and apply the changes.
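After the reboot, the two settings can be read back from sysfs to confirm they took effect. The sketch below does that; the function name show_pstore_dump_params is made up, and the optional argument replaces /sys/module only so the function can be tried against a test directory.

```shell
# Print the current value of the two parameters that control pstore writes.
show_pstore_dump_params() {
    root="${1:-/sys/module}"
    for p in printk/parameters/always_kmsg_dump \
             kernel/parameters/crash_kexec_post_notifiers; do
        name=${p##*/}             # keep only the parameter name
        if [ -r "$root/$p" ]; then
            printf '%s=%s\n' "$name" "$(cat "$root/$p")"
        else
            printf '%s=unavailable\n' "$name"
        fi
    done
}
```

A value of Y for a parameter indicates that the corresponding behavior is enabled.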
Summary
By enabling pstore to run prior to kdump, you are assured of capturing the kernel call backtrace, even under challenging scenarios. With that dmesg tail and kernel call trace in hand, you are one step closer to finding the cause of the panic!