|
|
Kernel Crash Dump Configuration

In Solaris, we place a strong emphasis on post-mortem debugging, and in particular on kernel crash dump analysis. One oft-overlooked facet of RAS (reliability, availability, and serviceability) is the time it takes to root-cause a failure. When a customer system panics, we need to be able to root-cause the problem from a single crash dump. While Solaris has supported kernel crash dumps for a long while, there was originally little control over how these dumps were generated or what they contained. The dumpadm(1M) command was written by Mike Shapiro to control the content and administration of these files. This was done at approximately the same time that savecore(1M) was turned on by default and MDB was introduced, all of which served to dramatically increase our ability to diagnose fatal failures of our systems. The default configuration looks something like this:
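
As an illustration, running dumpadm with no arguments prints the current settings in roughly the form below. The layout follows the dumpadm(1M) manual page; the device path and the savecore directory are placeholders rather than values from a real system, and the exact set of fields varies by release.

```
# dumpadm
      Dump content: kernel pages
       Dump device: /dev/dsk/c0t0d0s1 (swap)
Savecore directory: /var/crash/hostname
  Savecore enabled: yes
```

The four fields correspond to what is dumped, where the dump is written at panic time, where savecore(1M) places the saved image on reboot, and whether savecore runs automatically.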
Finding the source code

The dumpadm utility is pretty simple, since its primary job is to manage the configuration file where this information is stored, so that savecore(1M) knows where to find the crash dump image and where to put it. The source code is found in usr/src/cmd/dumpadm.

Understanding the source code

There's not much to say about dumpadm. The majority of the brains are found in dconf.c, which reads and writes the configuration file, |