Fireproxy (Firewire proxy for gdb)

fireproxy combines a gdb stub, like the one that runs in the kernel for
remote debugging, with firewire routines which access the memory of the
remote machine directly over firewire. These accesses are in effect
executed as DMA (direct memory access) as provided by the OHCI1394
specification, and thus do not need the cooperation of the CPU of the
remote system to read and write the remote memory.

The basic principle of operation is similar to remote debugging using
gdbserver (for user applications) or kgdb's gdbstub (kernel debugging over
serial lines or Ethernet in polling mode), except that the remote data
transfer is done by the OHCI1394 controllers over the firewire bus.

Traditional remote debugging:

gdb+frontend                          Gdbserver (Program)
                                      gdbstub/kgdb (Kernel)
Debugger                              Program/System being debugged
+-----------+                         +-----------+
|           |                         |           |
| Machine A |<- GDB Remote Protocol ->| Machine B |
|    gdb    |   (serial, Ethernet)    |  gdbstub  |
+-----------+                         +-----------+

Fireproxy provides the same functionality, but the machine being debugged
is not Machine B, but Machine C:

gdb+frontend                          fireproxy          Final target
+-----------+                         +-----------+      +-----------+
|           |                         |           |      |           |
| Machine A |<- GDB Remote Protocol ->| Machine B |<---->| Machine C |
| Connects  |  using TCP/IP network   | TCP Port  | IEEE | (gdbstub) |
+-----------+      or tunnel to       +-----------+ 1394 +-----------+

Naturally, one can achieve similar results by using remote login from
Machine A to Machine B and running the debugger (and its frontend) directly
on Machine B. Reasons for running the debugger on Machine A may
nevertheless be:

* Network latency: Machine A and Machine B may be located quite far from
  each other, and the gdb remote protocol may work better in situations
  where Machine A and Machine B are on opposite sides of the net, with long
  latencies for single keystrokes being sent back and forth.
* Developer flexibility: Machine B does not need to have a full debug
  environment (including kernel debug infos, kernel source files, gdb GUI).

* Administrative considerations: Machine B may be administered or
  controlled by a different entity, and it is less intrusive to install
  fireproxy on Machine B: it is not necessary to create a user account for
  setting up the proxy. Fireproxy is a single, small program which can also
  be linked specially to have no ieee1394 library dependencies.

In an environment created for testing, Machine A and Machine B will often
be the same machine.

To read and write memory from Machine C, only debug information about the
kernel running on Machine C must be available on Machine A. Running
Machine C with a kernel which has gdbstub is not required for this to work.
Controlling a gdbstub which is running on Machine C is the field of current
research.

The current version supports remote machines of the same architecture.
AMD64 and IA32 are currently supported/tested; PPC32 should also work.

Further status information:

* fireproxy currently only supports a minimal subset of the GDB remote
  protocol and not everything may work perfectly yet; it is still in an
  early alpha stage. Only the protocol functions which are implemented so
  far (reading and writing of memory) actually work exactly as they should.
  Other protocol functions may have dummy handlers which do not yet
  implement the perfect solution.
  (implement remote commands as needed)

* The current version listens on TCP port number 4 (which is unassigned)
  for connections from gdb from anywhere. The port number will become a
  command line option, along with the host from which connections are
  accepted.
  (currently low priority, easy to implement)

* fireproxy exits when gdb disconnects from it, so it has to be restarted
  before attaching again. make test and the buildloop script do this.
  (currently low priority, easy to implement; restricting which hosts can
  connect should be done in connection with this)

* IEEE1394 bus resets are not handled nicely yet. At the moment, when a bus
  reset is triggered (every cable connection/disconnection on the bus and
  every device connect/disconnect causes a bus reset), the IEEE1394
  connection is broken and fireproxy has to be restarted; gdb needs to
  disconnect (gdb command) and attach again using the target command.
  When a bus reset happens, fireproxy could look up the uuid (an id like
  00080100 fa360220) of Machine C and reconnect to it.
  (currently low priority and should be easy to implement if uuids indeed
  stay identical over bus resets)

* fireproxy currently assumes that there is only one debug target on the
  IEEE1394 bus and automatically selects this other system on startup. It
  checks whether the System.map file matches this system by checking the
  system_utsname.sysname field, which is "Linux" for all Linux systems, and
  prints the other values from the uname data.
  This could be changed to select Machine C given a specific uuid, or by a
  menu presented at the start of fireproxy (code for this is available in
  the firescope codebase).

* Running fireproxy on a machine with a different word size/endianness
  than the debug target is not tested/supported yet. It should not be a
  problem to allow this.

* Possible performance improvements:

  * Repeated single-byte reads of consecutive addresses are currently
    extremely inefficient since there is no caching in place. This could be
    sped up greatly by caching either in gdb or in firescope, since
    firescope always reads 4 bytes.

  * libraw1394 currently does not support reads with a size other than
    4 bytes. It may be investigated whether the kernel raw1394 module would
    support other sizes as well. A modified libraw1394 (LGPL) could be
    linked into fireproxy.
  * For consecutive reads of larger memory regions, a readahead mechanism
    could be implemented: libraw1394 allows requesting the next block of
    data without waiting for it. Thus, such a transfer could run while
    fireproxy is sending the answer for the last read back to gdb and
    waiting to receive the next command from gdb.

* The dmesg macro from
  /usr/src/linux-2.6.15/Documentation/kdump/gdbmacros.txt
  does not work (or only very slowly), and fireproxy grows very big in
  virtual memory size; the system starts swapping quickly unless you have
  lots of RAM free. The command

      dump binary memory dmesg.out log_buf (log_buf+log_end)

  works fine (it does not support wrapped dmesg buffers) and is fast.
  It writes the current dmesg buffer to the file dmesg.out.

* The included .gdbinit has some further example macros.

===========================================================================

To test the current code, do the following (inside SuSE):

Preparations:
-------------

* On Machine C, you have to run a kernel for which you have the -debuginfo
  package available on Machine A.

* Machine B and C must be connected using any normal Firewire cable; any
  cable which fits into the firewire plugs of the two machines will work.
  No other firewire devices may be connected (for now).

* Machine C needs to have the module ohci1394 loaded.
  (Machines which do not have an OHCI1394 controller but a PCILynx
  controller use a different driver; I know of no machine for testing this.)

* Machine B needs to have the modules ohci1394 and raw1394 loaded. raw1394
  still needs to be loaded manually (as of CODE 10 Beta6). This is an issue
  which needs to be fixed in CODE 10 by either:

  * loading raw1394 whenever a firewire controller driver is loaded

  * providing /dev/raw1394 in the libraw1394 package; this will trigger an
    autoload of raw1394 when the device is accessed.
    (I think I would prefer this solution; it may be easier to implement.)

Installation:
-------------

* Unpack the fireproxy tarball, delete the fireproxy binary in it, and
  type "make".

* Only for the current version: copy the System.map from Machine C to
  Machine B (although a System.map from a similarly configured kernel also
  works).

* Run the compiled fireproxy binary; argument 1 is the System.map:

      ./fireproxy ../System.map
      Port 0 (ohci1394) opened, 2 nodes detected
      Loaded system.map <../System.map> <879897> bytes
      2 nodes available, local node is: 1
      0: ffc0, uuid: 00080100 fa360220
      1: ffc1, uuid: 00080100 cc8b0120 [LOCAL]
      pick a target node: not a ppc
      utsname addr: ffffffff80323ca0
      Attached to node 'f229'
      System : X86_64
      Version: 2.6.16-rc5-git2-3-default (#1 Tue Feb 28 09:16:17 UTC 2006)
      Target : ffc0 Gen : 3
      Ready to accept on port 4

* On Machine A (in this example, Machine B is "localhost"), install the
  kernel debuginfo package matching the kernel on Machine C. E.g. if
  Machine C runs the latest default kernel and you have SUSE rpms in an
  archive, you can do the following:

      rpm -Uhv --nodeps \
          /work/CDs/all/full-$CPU/suse/$CPU/kernel-default-debuginfo.rpm

  Example .gdbinit (there is one included in the package):

      file ../vmlinux
      target remote localhost:4
      p system_utsname.release

  The message "0x0000000000000000 in ?? ()" when starting gdb is to be
  expected, since fireproxy cannot stop the CPU yet.
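
What travels over the TCP connection when gdb evaluates something like
"p system_utsname.release" can be sketched as follows. This is only an
illustration of the GDB remote protocol framing for the two requests that
fireproxy implements (memory read "m" and memory write "M"); it is not code
from fireproxy. All function names are hypothetical, and the sketch ignores
acknowledgements ("+"/"-"), escaping, run-length encoding and error replies.

```python
def rsp_checksum(payload: bytes) -> int:
    """Modulo-256 sum of the payload bytes, as the remote protocol defines it."""
    return sum(payload) % 256


def rsp_frame(payload: str) -> bytes:
    """Wrap a payload as $<payload>#<checksum as two hex digits>."""
    data = payload.encode("ascii")
    return b"$" + data + b"#" + f"{rsp_checksum(data):02x}".encode("ascii")


def read_memory_packet(addr: int, length: int) -> bytes:
    """Build an 'm addr,length' request; both numbers are hex without 0x."""
    return rsp_frame(f"m{addr:x},{length:x}")


def write_memory_packet(addr: int, data: bytes) -> bytes:
    """Build an 'M addr,length:XX...' request; data is sent as hex digit pairs."""
    return rsp_frame(f"M{addr:x},{len(data):x}:{data.hex()}")


def decode_read_reply(frame: bytes) -> bytes:
    """Verify the checksum of a '$<hex bytes>#cs' reply and return the raw bytes."""
    if not frame.startswith(b"$") or frame[-3:-2] != b"#":
        raise ValueError("not a framed packet")
    body, checksum = frame[1:-3], int(frame[-2:], 16)
    if rsp_checksum(body) != checksum:
        raise ValueError("checksum mismatch")
    return bytes.fromhex(body.decode("ascii"))


if __name__ == "__main__":
    # Address taken from the sample session in this README (nodename field).
    print(read_memory_packet(0xFFFFFFFF80323CE1, 4))
    print(write_memory_packet(0xFFFFFFFF80323CE1, b"f"))
```

A proxy such as fireproxy would receive the "m" request, perform the reads
over IEEE1394, and answer with the bytes encoded as hex digit pairs in the
same $...#cs framing.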
The included .gdbinit has a macro called "print_welcome" which prints the
following example output from the remote system:

========================================================================
= Welcome to fireproxy, if vmlinux matches, this is the remote system: =
========================================================================
Linux f229 2.6.16-rc5-git2-3-default #1 Tue Feb 28 09:16:17 UTC 2006 x86_64

=== Writing to remote memory ===

Example demonstrating various ways to write to remote memory:

    # finding the address of a certain remote variable:
    (gdb) p &system_utsname.nodename
    $2 = (char (*)[65]) 0xffffffff80323ce1

    # setting a remote address to any supported value ({char}, {int}, ...):
    (gdb) set {char}0xffffffff80323ce1 = 'f'

    # the same, with a symbol name:
    (gdb) set {char}system_utsname.nodename = 'g'

    # and directly using the normal data type:
    (gdb) set *system_utsname.nodename = 'h'
    (gdb) set system_utsname.nodename = "f229"

=== Sample session with the included .gdbinit ===

gdb
GNU gdb 6.4
Copyright 2005 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB. Type "show warranty" for details.
This GDB was configured as "x86_64-suse-linux".
Using host libthread_db library "/lib64/libthread_db.so.1".
0x0000000000000000 in ?? ()
warning: shared library handler failed to enable breakpoint
Messages like:
warning: shared library handler failed to enable breakpoint
and '0x00000000 in ?? ()' are currently normal.
========================================================================
= Welcome to fireproxy. If vmlinux matches, this is the remote system: =
========================================================================
Linux f229 2.6.16-rc5-git2-3-default #1 Tue Feb 28 09:16:17 UTC 2006 x86_64
User-defined commands.
The commands in this class are those defined by the user.
Use the "define" command to define a command.

List of commands:

btpid -- Print a symbolic backtrace of the task with the passed pid
btt -- Print symbolic stack backtraces of *all* tasks
dmesg -- Print the kernel ring buffer
dump_dmesg -- Dump the kernel ring buffer into dmesg
dump_kernel_3m -- Dump the i386 kernel memory from 0xc0000000 0xc046f788 to kernel
trapinfo -- Run info threads and lookup pid of thread #1

Type "help" followed by command name for full documentation.
Command name abbreviations are allowed if unambiguous.
(kgdb) btpid 1
looking for task_struct of pid 1...
This will take at least few seconds or even many minutes (with many tasks)
pid 1 - init:
-------------------
803203a0 init_task in section .data
8012fba5 __mod_timer + 169 in section .text
802bc8e2 schedule_timeout + 150 in section .text
803e09a8 per_cpu__tvec_bases + 4968 in section .bss
803e09a8 per_cpu__tvec_bases + 4968 in section .bss
8012f6c5 process_timeout in section .text
803df640 per_cpu__tvec_bases in section .bss
8017a6d6 do_select + 963 in section .text
8017a257 __pollwait in section .text
8010a476 system_call + 126 in section .text
8017a96c sys_select + 551 in section .text
801715eb sys_newstat + 40 in section .text
8010a476 system_call + 126 in section .text
(kgdb)

-- Bernhard Kaindl