8784d56326
... instead of relying on libthread_db.

I wrote a test that attaches to a program that constantly spawns
short-lived threads, which exposed several issues. This is one of
them.

On Linux, we need to attach to all threads of a process (thread
group) individually. We currently rely on libthread_db to list the
threads, but that is problematic, because libthread_db relies on
reading data structures out of the inferior (which may well be
corrupted). If threads are being created or exiting just while we
try to attach, we may trip on inconsistencies in the inferior's
thread list. To work around that, when we see a seemingly corrupt
list, we currently retry a few times:

  static void
  thread_db_find_new_threads_2 (ptid_t ptid, int until_no_new)
  {
  ...
    if (until_no_new)
      {
        /* Require 4 successive iterations which do not find any new threads.
           The 4 is a heuristic: there is an inherent race here, and I have
           seen that 2 iterations in a row are not always sufficient to
           "capture" all threads. */
  ...

That heuristic may well fail, and when it does, we end up with
threads in the program that aren't under GDB's control. That's
obviously bad and results in quite mystifying failures, e.g., the
process dying for seemingly no reason when a thread that wasn't
attached trips on a breakpoint.

There's really no reason to rely on libthread_db for this nowadays
when we have /proc mounted. In that case, which is the usual case,
we can list the LWPs from /proc/PID/task/. In fact, GDBserver is
already doing this. The patch factors the code that knows how to
walk the task/ directory out of GDBserver, and makes GDB use it too.

Like GDBserver, the patch makes GDB attach to LWPs and _not_ wait for
them to stop immediately. Instead, we just tag the LWP as having an
expected stop. Because we can only set the ptrace options when the
thread stops, we need a new flag in the lwp structure to keep track
of whether we've already set the ptrace options, just like in
GDBserver.
Note that nothing issues any ptrace command to the threads between
the PTRACE_ATTACH and the stop, so this is safe (unlike one scenario
described in gdbserver's linux-low.c).

When we attach to a program that has threads exiting while we attach,
it's easy to race with a thread just exiting as we try to attach to
it, like:

  #1 - get current list of threads
  #2 - attach to each listed thread
  #3 - ooops, attach failed, thread is already gone

As this is pretty normal, we shouldn't be issuing a scary warning in
step #3.

When #3 happens, PTRACE_ATTACH usually fails with ESRCH, but
sometimes we'll see EPERM as well. That happens when the kernel
still has the thread in its task list, but the thread is marked as
dead. Unfortunately, EPERM is ambiguous and we'll also get it in
other scenarios where the thread isn't dead, and in those cases, it's
useful to get a warning. To distinguish the cases, when we get an
EPERM failure, we open /proc/PID/status, and check the thread's
state -- if the /proc file no longer exists, or the state is
"Z (Zombie)" or "X (Dead)", we ignore the EPERM error silently;
otherwise, we'll warn.

Unfortunately, there seems to be a kernel race here. Sometimes I get
EPERM, and then the /proc state still indicates "R (Running)"... If
we wait a bit and retry, we do end up seeing X or Z state, or get an
ESRCH. I thought of making GDB retry the attach a few times, but
even with a 500ms wait and 4 retries, I still see the warning
sometimes. I haven't been able to identify the kernel path that
causes this yet, but in any case, it looks like a kernel bug to me.
As this just results in a failure to suppress a warning that we've
been printing since about forever anyway, I'm just making the test
cope with it, and issue an XFAIL.

gdb/gdbserver/
2015-01-09  Pedro Alves  <palves@redhat.com>

        * linux-low.c (linux_attach_fail_reason_string): Move to
        nat/linux-ptrace.c, and rename.
        (linux_attach_lwp): Update comment.
        (attach_proc_task_lwp_callback): New function.
        (linux_attach): Adjust to rename and use
        linux_proc_attach_tgid_threads.
        (linux_attach_fail_reason_string): Delete declaration.

gdb/
2015-01-09  Pedro Alves  <palves@redhat.com>

        * linux-nat.c (attach_proc_task_lwp_callback): New function.
        (linux_nat_attach): Use linux_proc_attach_tgid_threads.
        (wait_lwp, linux_nat_filter_event): If not set yet, set the lwp's
        ptrace option flags.
        * linux-nat.h (struct lwp_info) <must_set_ptrace_flags>: New
        field.
        * nat/linux-procfs.c: Include <dirent.h>.
        (linux_proc_get_int): New parameter "warn". Handle it.
        (linux_proc_get_tgid): Adjust.
        (linux_proc_get_tracerpid): Rename to ...
        (linux_proc_get_tracerpid_nowarn): ... this.
        (linux_proc_pid_get_state): New function, factored out from
        (linux_proc_pid_has_state): ... this. Add new parameter "warn"
        and handle it.
        (linux_proc_pid_is_gone): New function.
        (linux_proc_pid_is_stopped): Adjust.
        (linux_proc_pid_is_zombie_maybe_warn)
        (linux_proc_pid_is_zombie_nowarn): New functions.
        (linux_proc_pid_is_zombie): Use
        linux_proc_pid_is_zombie_maybe_warn.
        (linux_proc_attach_tgid_threads): New function.
        * nat/linux-procfs.h (linux_proc_get_tgid): Update comment.
        (linux_proc_get_tracerpid): Rename to ...
        (linux_proc_get_tracerpid_nowarn): ... this, and update comment.
        (linux_proc_pid_is_gone): New declaration.
        (linux_proc_pid_is_zombie): Update comment.
        (linux_proc_pid_is_zombie_nowarn): New declaration.
        (linux_proc_attach_lwp_func): New typedef.
        (linux_proc_attach_tgid_threads): New declaration.
        * nat/linux-ptrace.c (linux_ptrace_attach_fail_reason): Adjust to
        use nowarn functions.
        (linux_ptrace_attach_fail_reason_string): Move here from
        gdbserver/linux-low.c and rename.
        (ptrace_supports_feature): If the current ptrace options are not
        known yet, check them now, instead of asserting.
        * nat/linux-ptrace.h (linux_ptrace_attach_fail_reason_string):
        Declare.
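
The /proc-based approach described in the commit message above can be illustrated with a small standalone sketch. This is not the code from nat/linux-procfs.c: the `sketch_*` names are hypothetical, and error handling is reduced to the minimum. It only shows the two ideas the message relies on, namely listing LWPs by walking /proc/PID/task/, and reading the one-letter state from /proc/PID/status to decide whether an EPERM from PTRACE_ATTACH can be ignored silently.

```c
#include <assert.h>
#include <ctype.h>
#include <dirent.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: call FUNC (if non-NULL) once for each LWP id
   found in /proc/PID/task/.  Returns the number of LWPs seen, or -1 if
   the directory cannot be opened (/proc not mounted, or PID gone).  */
static int
sketch_for_each_task (pid_t pid, void (*func) (pid_t lwp, void *data),
                      void *data)
{
  char path[64];
  DIR *dir;
  struct dirent *dp;
  int count = 0;

  assert (pid > 0);
  snprintf (path, sizeof path, "/proc/%ld/task", (long) pid);
  dir = opendir (path);
  if (dir == NULL)
    return -1;

  while ((dp = readdir (dir)) != NULL)
    {
      char *end;
      long lwp = strtol (dp->d_name, &end, 10);

      /* Skip "." and ".." and anything that is not a plain number.  */
      if (end == dp->d_name || *end != '\0' || lwp <= 0)
        continue;
      if (func != NULL)
        func ((pid_t) lwp, data);
      count++;
    }
  closedir (dir);
  return count;
}

/* Hypothetical helper: return the one-letter state from the "State:"
   line of /proc/LWP/status ('R', 'S', 'Z', 'X', ...), or 0 if the file
   cannot be opened or has no such line.  */
static char
sketch_task_state (pid_t lwp)
{
  char path[64], line[256];
  char state = 0;
  FILE *f;

  assert (lwp > 0);
  snprintf (path, sizeof path, "/proc/%ld/status", (long) lwp);
  f = fopen (path, "r");
  if (f == NULL)
    return 0;
  while (fgets (line, sizeof line, f) != NULL)
    if (strncmp (line, "State:", 6) == 0)
      {
        const char *p = line + 6;

        while (isspace ((unsigned char) *p))
          p++;
        state = *p;
        break;
      }
  fclose (f);
  return state;
}

/* Nonzero if an EPERM from PTRACE_ATTACH on LWP could be ignored
   silently under the rule in the commit message: the /proc file is
   gone, or the state is Z (zombie) or X (dead).  */
static int
sketch_eperm_is_benign (pid_t lwp)
{
  char state = sketch_task_state (lwp);

  return state == 0 || state == 'Z' || state == 'X';
}
```

A caller would typically pass a callback that attaches to each listed LWP, and consult `sketch_eperm_is_benign` only after an attach fails with EPERM. Calling `sketch_for_each_task (getpid (), NULL, NULL)` simply counts the threads of the current process. Because threads can appear and disappear while the directory is being read, a single pass like this one is still racy; the sketch does not attempt to address that.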
.gitignore
acinclude.m4
aclocal.m4
ax.c
ax.h
ChangeLog
config.in
configure
configure.ac
configure.srv
debug.c
debug.h
dll.c
dll.h
event-loop.c
event-loop.h
gdb_proc_service.h
gdbreplay.c
gdbthread.h
hostio-errno.c
hostio.c
hostio.h
i387-fp.c
i387-fp.h
inferiors.c
inferiors.h
linux-aarch64-low.c
linux-amd64-ipa.c
linux-arm-low.c
linux-bfin-low.c
linux-cris-low.c
linux-crisv32-low.c
linux-i386-ipa.c
linux-ia64-low.c
linux-low.c
linux-low.h
linux-m32r-low.c
linux-m68k-low.c
linux-mips-low.c
linux-nios2-low.c
linux-ppc-low.c
linux-s390-low.c
linux-sh-low.c
linux-sparc-low.c
linux-tic6x-low.c
linux-tile-low.c
linux-x86-low.c
linux-xtensa-low.c
lynx-i386-low.c
lynx-low.c
lynx-low.h
lynx-ppc-low.c
Makefile.in
mem-break.c
mem-break.h
notif.c
notif.h
nto-low.c
nto-low.h
nto-x86-low.c
proc-service.c
proc-service.list
README
regcache.c
regcache.h
remote-utils.c
remote-utils.h
server.c
server.h
spu-low.c
symbol.c
target.c
target.h
tdesc.c
tdesc.h
terminal.h
thread-db.c
tracepoint.c
tracepoint.h
utils.c
utils.h
win32-arm-low.c
win32-i386-low.c
win32-low.c
win32-low.h
wincecompat.c
wincecompat.h
x86-low.c
x86-low.h
xtensa-xtregs.c
README for GDBserver & GDBreplay
by Stu Grossman and Fred Fish

Introduction:

This is GDBserver, a remote server for Un*x-like systems. It can be
used to control the execution of a program on a target system from a
GDB on a different host. GDB and GDBserver communicate using the
standard remote serial protocol implemented in remote.c, and various
*-stub.c files. They communicate via either a serial line or a TCP
connection.

For more information about GDBserver, see the GDB manual.

Usage (server (target) side):

First, you need to have a copy of the program you want to debug put
onto the target system. The program can be stripped to save space if
needed, as GDBserver doesn't care about symbols. All symbol handling
is taken care of by the GDB running on the host system.

To use the server, you log on to the target system, and run the
`gdbserver' program. You must tell it (a) how to communicate with
GDB, (b) the name of your program, and (c) its arguments. The
general syntax is:

        target> gdbserver COMM PROGRAM [ARGS ...]

For example, using a serial port, you might say:

        target> gdbserver /dev/com1 emacs foo.txt

This tells GDBserver to debug emacs with an argument of foo.txt, and
to communicate with GDB via /dev/com1. GDBserver now waits patiently
for the host GDB to communicate with it.

To use a TCP connection, you could say:

        target> gdbserver host:2345 emacs foo.txt

This says pretty much the same thing as the last example, except that
we are going to communicate with the host GDB via TCP. The
`host:2345' argument means that we are expecting to see a TCP
connection from `host' to local TCP port 2345. (Currently, the
`host' part is ignored.) You can choose any number you want for the
port number as long as it does not conflict with any existing TCP
ports on the target system. This same port number must be used in
the host GDB's `target remote' command, which will be described
shortly.
Note that if you choose a port number that conflicts with another
service, GDBserver will print an error message and exit.

On some targets, GDBserver can also attach to running programs. This
is accomplished via the --attach argument. The syntax is:

        target> gdbserver --attach COMM PID

PID is the process ID of a currently running process. It isn't
necessary to point GDBserver at a binary for the running process.

Usage (host side):

You need an unstripped copy of the target program on your host
system, since GDB needs to examine its symbol tables and such. Start
up GDB as you normally would, with the target program as the first
argument. (You may need to use the --baud option if the serial line
is running at anything except 9600 baud.) I.e.: `gdb TARGET-PROG',
or `gdb --baud BAUD TARGET-PROG'. After that, the only new command
you need to know about is `target remote'. Its argument is either a
device name (usually a serial device, like `/dev/ttyb'), or a
HOST:PORT descriptor. For example:

        (gdb) target remote /dev/ttyb

communicates with the server via serial line /dev/ttyb, and:

        (gdb) target remote the-target:2345

communicates via a TCP connection to port 2345 on host `the-target',
where you previously started up GDBserver with the same port number.
Note that for TCP connections, you must start up GDBserver prior to
using the `target remote' command, otherwise you may get an error
that looks something like `Connection refused'.

Building GDBserver:

The supported targets as of November 2006 are:

        arm-*-linux*
        bfin-*-uclinux
        bfin-*-linux-uclibc
        crisv32-*-linux*
        cris-*-linux*
        i[34567]86-*-cygwin*
        i[34567]86-*-linux*
        i[34567]86-*-mingw*
        ia64-*-linux*
        m32r*-*-linux*
        m68*-*-linux*
        m68*-*-uclinux*
        mips*64*-*-linux*
        mips*-*-linux*
        powerpc[64]-*-linux*
        s390[x]-*-linux*
        sh-*-linux*
        spu*-*-*
        x86_64-*-linux*

When configuring GDBserver, you should specify the same machine for
host and target (which are the machine that GDBserver is going to run on.
This is not the same as the machine that GDB is going to run on;
building GDBserver automatically as part of building a whole tree of
tools does not currently work if cross-compilation is involved (we
don't get the right CC in the Makefile, to start with)).

Building GDBserver for your target is very straightforward. If you
build GDB natively on a target which GDBserver supports, it will be
built automatically when you build GDB. You can also build just
GDBserver:

        % mkdir obj
        % cd obj
        % path-to-gdbserver-sources/configure
        % make

If you prefer to cross-compile to your target, then you can also
build GDBserver that way. In a Bourne shell, for example:

        % export CC=your-cross-compiler
        % path-to-gdbserver-sources/configure your-target-name
        % make

Using GDBreplay:

A special hacked-down version of GDBserver can be used to replay
remote debug log files created by GDB. Before using the GDB "target"
command to initiate a remote debug session, use "set remotelogfile
<filename>" to tell GDB that you want to make a recording of the
serial or TCP session. Note that when replaying the session, GDB
communicates with GDBreplay via TCP, regardless of whether the
original session was via a serial link or TCP.

Once you are done with the remote debug session, start GDBreplay and
tell it the name of the log file and the host and port number that
GDB should connect to (typically the same as the host running GDB):

        $ gdbreplay logfile host:port

Then start GDB (preferably in a different screen or window) and use
the "target" command to connect to GDBreplay:

        (gdb) target remote host:port

Repeat the same sequence of user commands to GDB that you gave in the
original debug session. GDB should not be able to tell that it is
talking to GDBreplay rather than a real target, all other things
being equal. Note that GDBreplay echoes the command lines to stderr,
as well as the contents of the packets it sends and receives.
The last command echoed by GDBreplay is the next command that needs to be typed to GDB to continue the session in sync with the original session.
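
The COMM argument described in the usage sections above is either a device name or a HOST:PORT descriptor. As a rough illustration only, the following sketch classifies such a string in C. The names `sketch_comm_kind` and `sketch_parse_comm` are hypothetical, and the rule that a port must lie between 1 and 65535 is an assumption of this example; it is not claimed to match how GDBserver itself parses its arguments.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical classification of a COMM argument.  */
enum sketch_comm_kind
{
  SKETCH_COMM_DEVICE,   /* No ':' present, e.g. "/dev/com1".  */
  SKETCH_COMM_TCP,      /* HOST:PORT with a usable port number.  */
  SKETCH_COMM_INVALID   /* A ':' is present but the port is not usable.  */
};

/* Classify COMM.  A ':' is taken to mark a HOST:PORT descriptor, in
   which case *PORT receives the port number; anything else is treated
   as a device name.  The HOST part is not examined, mirroring the
   README's remark that it is currently ignored on the server side.  */
static enum sketch_comm_kind
sketch_parse_comm (const char *comm, int *port)
{
  const char *colon;
  char *end;
  long value;

  assert (comm != NULL && port != NULL);

  colon = strchr (comm, ':');
  if (colon == NULL)
    return SKETCH_COMM_DEVICE;

  value = strtol (colon + 1, &end, 10);
  /* Assumption of this sketch: require a non-empty, fully numeric
     port in the range 1..65535.  */
  if (end == colon + 1 || *end != '\0' || value < 1 || value > 65535)
    return SKETCH_COMM_INVALID;

  *port = (int) value;
  return SKETCH_COMM_TCP;
}
```

With this sketch, `/dev/com1` is classified as a device, `host:2345` as a TCP descriptor with port 2345, and `host:` as invalid because the port is missing.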