09 November 2014

SETTING UP KGDB FOR THE KGB

Every time I have to set up a new environment, I completely forget the mess you have to go through to get your kernel debug world set up properly. So this is here to make sure I never forget it again.

I do not run Linux as a desktop, and I suggest that nobody else does. My development setup consists of OS X and VirtualBox with two (almost) identical installations of Linux (whatever you want, but I use Ubuntu in my examples). The secondary VM is the target we will be debugging; the primary VM attaches to it via kgdb.

Step 1

Clone your VM - no need to do a full history copy, just a snapshot clone, call it something that won't confuse you later, like "fucktard69" if your main VM is "shitcocks666". It doesn't matter, shut the fuck up.

Step 2

Shut your main VM down, as in, completely, don't just pause it or save the state. Shut it down! I don't care how you do this, cleanly or uncleanly, it doesn't fucking matter. Just shut it down.

Step 3

Open up the settings for the VM you just shut down:

  • Go to ports
  • Check off 'enable serial port'
  • Set the 'Port Mode' to 'Host Pipe'
  • UNCHECK 'Create Pipe'. Don't be an idiot, uncheck that shit if it is checked.
  • Set the path to '/tmp/99problems'
  • Click that big stupid button labeled 'OK'. If you don't do this you suck. I'm looking at you, Mark.

Step 4

Open up the settings for your secondary VM, the one called fucktard or whatever.

  • Go to ports
  • Check off 'enable serial port'
  • Set the 'Port Mode' to 'Host Pipe'
  • CHECK 'Create Pipe'. Yes, enable it, idiot.
  • Set the path to '/tmp/99problems'
  • Click that big fucking button that says 'OK'.
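
If you hate clicking, Steps 3 and 4 can probably be done from the host's command line too. This is a rough sketch, not gospel: the VM names are hypothetical stand-ins for whatever you called yours, and I'm assuming the first serial port on the usual COM1 settings (0x3F8, IRQ 4), which Linux sees as ttyS0. 'Create Pipe' checked corresponds to "server" mode, unchecked to "client". The function only prints the commands so you can eyeball them before running anything.

```shell
# Print the VBoxManage command that wires one VM's first serial port to a host pipe.
# $1 = VM name, $2 = "server" (VirtualBox creates the pipe) or "client"
# (attach to a pipe that already exists), $3 = pipe path on the host.
uart_cmd() {
    vm=$1 role=$2 pipe=$3
    # 0x3F8 / IRQ 4 is the conventional COM1, i.e. ttyS0 inside the guest (assumption).
    echo "VBoxManage modifyvm \"$vm\" --uart1 0x3F8 4 --uartmode1 $role $pipe"
}

# Hypothetical VM names; both VMs must be powered off when you run the output.
uart_cmd kgdb-secondary server /tmp/99problems
uart_cmd kgdb-primary client /tmp/99problems
```

Copy the two printed lines into a terminal on the host once you are happy with them.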

Step 5

Boot up your secondary VM first, because it needs to create the fucking /tmp/99problems pipe.

Then do the following shit:

  • edit /etc/default/grub
  • Add or append this: GRUB_CMDLINE_LINUX_DEFAULT="kgdboc=ttyS0,115200"
  • run update-grub
  • reboot

All of the above needs root. If you ran any of it as a non-root user, it did nothing, so go back and do it again with sudo.
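
After the reboot it's worth checking that the option actually made it onto the kernel command line before you go any further. A minimal sketch, assuming nothing beyond /proc/cmdline; the helper name is made up for this post.

```shell
# Print the value of kgdboc= from a kernel command line string.
# Returns non-zero if the option is not there.
kgdboc_arg() {
    printf '%s\n' "$1" | tr ' ' '\n' | sed -n 's/^kgdboc=//p' | grep .
}

# Check the running kernel (on the secondary VM).
if kgdboc_arg "$(cat /proc/cmdline 2>/dev/null)"; then
    echo "kgdboc is on the command line"
else
    echo "kgdboc missing: recheck /etc/default/grub and rerun update-grub"
fi
```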

Step 6

Boot up your primary VM and do the following shit in order to get all the proper debug symbols for your kernel (remember, since you cloned that shit, it should be the EXACT FUCKING SAME KERNEL AS THE ONE YOU ARE RUNNING ON THE SECONDARY VM, if it's not, you should kill yourself).

codename=$(lsb_release -c | awk  '{print $2}')
sudo tee /etc/apt/sources.list.d/ddebs.list << EOF
deb http://ddebs.ubuntu.com/ ${codename}      main restricted universe multiverse
deb http://ddebs.ubuntu.com/ ${codename}-security main restricted universe multiverse
deb http://ddebs.ubuntu.com/ ${codename}-updates  main restricted universe multiverse
deb http://ddebs.ubuntu.com/ ${codename}-proposed main restricted universe multiverse
EOF
sudo apt-key adv --keyserver keyserver.ubuntu.com --recv-keys ECDCAD72428D7C01
sudo apt-get update
sudo apt-get install linux-image-$(uname -r)-dbgsym
sudo mkdir -p /build/buildd
cd /build/buildd
sudo apt-get source linux-image-$(uname -r)

Step 7

Send the trap signal on your secondary VM (as root):

echo g >/proc/sysrq-trigger

If it doesn't hang, you've failed at something, like setting up the right boot options or something.

Step 8

Attach your primary VM to the secondary VM via your virtual serial port using the dbg version of the kernel

gdb /usr/lib/debug/boot/vmlinux-$(uname -r)

If you're looking to debug a specific kernel module, make sure you compiled it with symbols first you idiot, then do the following:

for i in .text .data .bss; do cat /sys/module/<module>/sections/$i; done

And in GDB do

(gdb) add-symbol-file /path/to/ur/mod.ko <textaddr> -s .data <dataaddr> -s .bss <bssaddr>
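
Copying three hex addresses by hand over and over gets old fast. Here is a sketch of a helper that reads the section addresses and spits out the add-symbol-file line for you. The function name is invented, and the optional third argument exists only so you can point it at a fake directory for testing. Run it as root on the secondary VM before you trap into the debugger, then paste the output into gdb on the primary.

```shell
# Build a gdb add-symbol-file line for a loaded module.
# $1 = module name, $2 = path to the .ko as seen by gdb,
# $3 = sysfs module root (optional, defaults to /sys/module).
symfile_cmd() {
    mod=$1 ko=$2 root=${3:-/sys/module}
    dir="$root/$mod/sections"
    # .text is mandatory: its address is the positional argument.
    text=$(cat "$dir/.text") || return 1
    cmd="add-symbol-file $ko $text"
    # .data and .bss can be missing in small modules, so add only what exists.
    for s in .data .bss; do
        if [ -r "$dir/$s" ]; then
            cmd="$cmd -s $s $(cat "$dir/$s")"
        fi
    done
    echo "$cmd"
}

# Usage (as root): symfile_cmd mymodule /path/to/ur/mod.ko
```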

Oh yeah, also make sure gdb can find the kernel module source (gdb's "directory" command adds a source search path), otherwise you will just be looking at symbols and nothing else. Not too useful, is it? IT'S LIKE I'M TAKING CRAZY PILLS!

And finally, attach to the fucking serial:

(gdb) target remote /dev/ttyS0
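
If you get tired of typing that every session, you can stash it in a gdb command file and load it with -x. Just a sketch: the file name /tmp/kgdb.gdb and the helper name are made up, and the "bt" at the end is only my guess at the first thing you will want to see.

```shell
# Write a small gdb command file that attaches over the serial port.
# $1 = output file, $2 = serial device (defaults to /dev/ttyS0).
write_gdbinit() {
    out=$1 tty=${2:-/dev/ttyS0}
    cat > "$out" <<EOF
# Attach to the halted secondary VM over the virtual serial port.
target remote $tty
# Show where the kernel stopped.
bt
EOF
}

write_gdbinit /tmp/kgdb.gdb
# Print the command to start the session on the primary VM.
echo "gdb -x /tmp/kgdb.gdb /usr/lib/debug/boot/vmlinux-$(uname -r)"
```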

Step 9

Come to the realization you could have probably found the bug using Linux's extensive kernel tracing abilities: https://www.kernel.org/doc/Documentation/trace/
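
For the record, the tracing route can be as small as this. A sketch only, assuming debugfs is mounted at the usual /sys/kernel/debug and your kernel has the function tracer built in; the helper name is invented and the second argument is just there so it can be pointed at a scratch directory.

```shell
# Trace a single kernel function with the ftrace function tracer.
# $1 = function name, $2 = tracing directory (optional).
trace_function() {
    fn=$1 t=${2:-/sys/kernel/debug/tracing}
    echo 0 > "$t/tracing_on"             # stop tracing while we reconfigure
    echo "$fn" > "$t/set_ftrace_filter"  # only trace this one function
    echo function > "$t/current_tracer"  # select the function tracer
    echo 1 > "$t/tracing_on"             # start tracing
}

# Usage (as root): trace_function vfs_read; sleep 1; head /sys/kernel/debug/tracing/trace
```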

Some final notes

If you are still getting odd source paths, like /build/buildd or whatever, you either didn't follow the instructions correctly, or something else is terribly wrong. It doesn't matter, just download the kernel source for your distribution and do something like:

(gdb) set substitute-path /build/buildd/linux-lts-raring-3.8.0 /usr/src/linux-source-3.2.0/linux-source-3.2.0

If you are sure the code you are debugging is not an SMP fuckup, set both VMs to only use a single CPU. This saves heaps of time.

If you are sure an LSM is not the cause of your bug, turn LSMs off. It avoids 34243243232 extra code steps.
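
One way to do both of those without touching the VM settings is from the kernel command line on the secondary VM. This swaps in a different technique from the one above, so treat it as a sketch: maxcpus=1 keeps the kernel on one CPU, and apparmor=0 turns off AppArmor, which I'm assuming is the LSM in play on Ubuntu (selinux=0 is the equivalent if yours is SELinux). The helper just builds the line for /etc/default/grub.

```shell
# Assemble a GRUB_CMDLINE_LINUX_DEFAULT line: kgdboc plus any extra options.
grub_line() {
    args="kgdboc=ttyS0,115200"
    for extra in "$@"; do
        args="$args $extra"
    done
    echo "GRUB_CMDLINE_LINUX_DEFAULT=\"$args\""
}

grub_line maxcpus=1 apparmor=0
```

Paste the output into /etc/default/grub, run update-grub, and reboot, same as in Step 5.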

I can't stress this more: BREAKPOINTS AND WATCHPOINTS OR THE BABY DIES! Working over serial is not for those smoking PCP.