OpenSolaris

Kernel Crash Dump Configuration

In Solaris, we place a strong emphasis on post-mortem debugging, and in particular on kernel crash dump analysis. One oft-overlooked facet of RAS (reliability, availability, and serviceability) is the time it takes to root cause a failure. When a customer system panics, we need to be able to root cause the problem from a single crash dump.

While Solaris has supported kernel crash dumps for a long while, there was little control over how these dumps were generated or what they contained. The dumpadm(1M) command was written by Mike Shapiro to control the content and administration of these files. This was done at approximately the same time that savecore(1M) was turned on by default and MDB was introduced, all of which served to dramatically increase our ability to diagnose fatal failures of our systems.

The default configuration looks something like this:

lazarus# dumpadm
      Dump content: kernel pages
       Dump device: /dev/dsk/c1d0s1 (swap)
Savecore directory: /var/crash/lazarus
  Savecore enabled: yes

Finding the source code

The dumpadm utility is pretty simple, since its primary job is to manage the configuration file where this information is stored, so that savecore(1M) knows where to find the crash dump image and where to put it. The source code is found in usr/src/cmd/dumpadm.

Understanding the source code

There's not much to say about dumpadm. The majority of the brains are found in dconf.c, which reads and writes the configuration file, /etc/dumpadm.conf. This file is not a public interface, and there have been discussions about moving this information into the SMF repository, which would eliminate yet another custom configuration file and bring all the benefits that the repository provides.
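
To make the job of dconf.c a little more concrete, here is a minimal Python sketch of reading and writing a file of this kind. It is not the dconf.c code (which is written in C). It assumes the file holds simple KEY=VALUE lines with # comments. The DUMPADM_* key names and the sample values are assumptions chosen to mirror the dumpadm output shown earlier, not a documented format.

```python
import io

# Illustrative sample only. The key names are assumptions; the real
# /etc/dumpadm.conf is a private format and may differ.
SAMPLE = """\
#
# dumpadm.conf (hypothetical sample)
#
DUMPADM_DEVICE=/dev/dsk/c1d0s1
DUMPADM_SAVDIR=/var/crash/lazarus
DUMPADM_CONTENT=kernel
DUMPADM_ENABLE=yes
"""


def parse_conf(stream):
    """Parse KEY=VALUE lines, skipping blank lines and '#' comments."""
    settings = {}
    for lineno, raw in enumerate(stream, start=1):
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition("=")
        if not sep or not key.strip():
            raise ValueError(f"line {lineno}: expected KEY=VALUE, got {line!r}")
        settings[key.strip()] = value.strip()
    return settings


def format_conf(settings):
    """Serialize settings back to KEY=VALUE lines, one per setting."""
    return "".join(f"{key}={value}\n" for key, value in settings.items())


conf = parse_conf(io.StringIO(SAMPLE))
print(conf["DUMPADM_SAVDIR"])
```

Because the file is not a public interface, a sketch like this is only useful for understanding what dconf.c has to do. Scripts that need these settings should go through dumpadm(1M) rather than parse the file themselves.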