This removes save_inferior_ptid, a cleanup function, in favor of
scoped_restore.
This also fixes a possible (it seems unlikely that it could happen in
practice) memory leak -- save_inferior_ptid should have used
make_cleanup_dtor, because it allocated memory.
I tested this on the buildbot. However, there are two caveats to
this. First, sometimes it seems I misread the results. Second, I
think this patch touches some platforms that can't be tested by the
buildbot. So, extra care seems warranted.
ChangeLog
2017-08-18 Tom Tromey <tom@tromey.com>
Pedro Alves <palves@redhat.com>
* spu-multiarch.c (parse_spufs_run): Use scoped_restore.
* sol-thread.c (sol_thread_resume, sol_thread_wait)
(sol_thread_xfer_partial, rw_common): Use scoped_restore.
* procfs.c (procfs_do_thread_registers): Use scoped_restore.
* proc-service.c (ps_xfer_memory): Use scoped_restore.
* linux-tdep.c (linux_corefile_thread): Remove a cleanup.
(linux_get_siginfo_data): Add "thread" argument. Use
scoped_restore.
* linux-nat.c (linux_child_follow_fork)
(check_stopped_by_watchpoint): Use scoped_restore.
* infrun.c (displaced_step_prepare_throw, write_memory_ptid)
(THREAD_STOPPED_BY, handle_signal_stop): Use scoped_restore.
(restore_inferior_ptid, save_inferior_ptid): Remove.
* btrace.c (btrace_fetch): Use scoped_restore.
* bsd-uthread.c (bsd_uthread_fetch_registers)
(bsd_uthread_store_registers): Use scoped_restore.
* breakpoint.c (reattach_breakpoints, detach_breakpoints): Use
scoped_restore.
* aix-thread.c (aix_thread_resume, aix_thread_wait)
(aix_thread_xfer_partial): Use scoped_restore.
* inferior.h (save_inferior_ptid): Remove.
The parameter "first" of linux_nat_post_attach_wait is unused, remove
it.
gdb/ChangeLog:
* linux-nat.c (linux_nat_post_attach_wait): Remove FIRST
parameter.
(linux_nat_attach): Adjust call to linux_nat_post_attach_wait.
As a preparation for the next patch, which will move fork_inferior
from GDB to common/ (and therefore share it with gdbserver), it is
interesting to convert a few functions to C++.
This patch touches functions related to parsing command-line arguments
to the inferior (see gdb/fork-child.c:breakup_args), the way the
arguments are stored on fork_inferior (using std::vector instead of
char **), and the code responsible for dealing with argv also on
gdbserver.
I've taken this opportunity and decided to constify a few arguments to
fork_inferior/create_inferior as well, in order to make the code
cleaner. And now, on gdbserver, we're using xstrdup everywhere and
aren't checking for memory allocation failures anymore, as requested
by Pedro:
<https://sourceware.org/ml/gdb-patches/2017-03/msg00191.html>
Message-Id: <025ebdb9-90d9-d54a-c055-57ed2406b812@redhat.com>
Pedro Alves wrote:
> On the "== NULL" check: IIUC, the old NULL check was there to
> handle strdup returning NULL due to out-of-memory.
> See NULL checks and comments further above in this function.
> Now that you're using a std::vector, that doesn't work or make
> sense any longer, since if push_back fails to allocate space for
> its internal buffer (with operator new), our operator new replacement
> (common/new-op.c) calls malloc_failure, which aborts gdbserver.
>
> Not sure it makes sense to handle out-of-memory specially in
> the gdb/rsp-facing functions nowadays (maybe git blame/log/patch
> submission for that code shows some guidelines). Maybe (or, probably)
> it's OK to stop caring about it, but then we should consistently remove
> left over code, by using xstrdup instead and remove the NULL checks.
IMO this refactoring was very good to increase the readability of the
code as well, because some parts of the argument handling were
unnecessarily confusing before.
gdb/ChangeLog:
2017-04-12 Sergio Durigan Junior <sergiodj@redhat.com>
* common/common-utils.c (free_vector_argv): New function.
* common/common-utils.h: Include <vector>.
(free_vector_argv): New prototype.
* darwin-nat.c (darwin_create_inferior): Rewrite function
prototype in order to constify "exec_file" and accept a
"std::string" for "allargs".
* fork-child.c: Include <vector>.
(breakup_args): Rewrite function, using C++.
(fork_inferior): Rewrite function header, constify "exec_file_arg"
and accept "std::string" for "allargs". Update the code to
calculate "argv" based on "allargs". Update calls to "exec_fun"
and "execvp".
* gnu-nat.c (gnu_create_inferior): Rewrite function prototype in
order to constify "exec_file" and accept a "std::string" for
"allargs".
* go32-nat.c (go32_create_inferior): Likewise.
* inf-ptrace.c (inf_ptrace_create_inferior): Likewise.
* infcmd.c (run_command_1): Constify "exec_file". Use
"std::string" for inferior arguments.
* inferior.h (fork_inferior): Update prototype.
* linux-nat.c (linux_nat_create_inferior): Rewrite function
prototype in order to constify "exec_file" and accept a
"std::string" for "allargs".
* nto-procfs.c (procfs_create_inferior): Likewise.
* procfs.c (procfs_create_inferior): Likewise.
* remote-sim.c (gdbsim_create_inferior): Likewise.
* remote.c (extended_remote_run): Update code to accept
"std::string" as argument.
(extended_remote_create_inferior): Rewrite function prototype in
order to constify "exec_file" and accept a "std::string" for
"allargs".
* rs6000-nat.c (super_create_inferior): Likewise.
(rs6000_create_inferior): Likewise.
* target.h (struct target_ops) <to_create_inferior>: Likewise.
* windows-nat.c (windows_create_inferior): Likewise.
gdb/gdbserver/ChangeLog:
2017-04-12 Sergio Durigan Junior <sergiodj@redhat.com>
* server.c: Include <vector>.
<program_argv, wrapper_argv>: Convert to std::vector.
(start_inferior): Rewrite function to use C++.
(handle_v_run): Likewise. Update code that calculates the argv
based on the vRun packet; use C++.
(captured_main): Likewise.
At the end of linux_nat_detach the main_lwp is deleted (delete_lwp).
This is problematic as during detach (detach_one_lwp and
linux_fork_detach) main_lwp already gets freed. Thus calling
delete_lwp causes a read after free. Fix it by removing the
unnecessary delete_lwp.
gdb/ChangeLog:
2017-04-11 Philipp Rudo <prudo@linux.vnet.ibm.com>
* linux-nat.c (linux_nat_detach): Remove delete_lwp call.
The linux_nat_xfer_partial does a conversion of inferior_ptid: if it's
an LWP (ptid::lwp != 0), it builds a new ptid with the lwp as
the pid and assigns that temporarily to inferior_ptid. For example, if
inferior_ptid is:
{ .pid = 1234, .lwp = 1235 }
it will assign this to inferior_ptid for the duration of the call:
{ .pid = 1235, .lwp = 0 }
Instead of doing this, this patch teaches the inf-ptrace implementation
of xfer_partial to deal with ptids representing lwps by using
get_ptrace_pid.
Also, in linux_proc_xfer_spu and linux_proc_xfer_partial, we use ptid_get_lwp
instead of ptid_get_pid. While not strictly necessary, since the content of
/proc/<pid> and /proc/<lwp> should be the same, it's a bit safer, because:
- some files under /proc/<pid>/ may not work if the <pid> thread is
running, just like ptrace requires a stopped thread. The current
thread's lwp id is more likely to be in the necessary state (stopped).
- if the leader (<pid>) had exited and is thus now zombie, then several
files under "/proc/<pid>" won't work, while they will if you use
"/proc/<lwp>".
The testsuite found no regression on native amd64 linux.
gdb/ChangeLog:
* inf-ptrace.c (inf_ptrace_xfer_partial): Get pid from ptid
using get_ptrace_pid.
* linux-nat.c (linux_nat_xfer_partial): Don't set/restore
inferior_ptid.
(linux_proc_xfer_partial, linux_proc_xfer_spu): Use lwp of
inferior_ptid instead of pid.
So far linux_proc_xfer_partial refused to handle write requests. This is
still based on the assumption that the Linux kernel does not support
writes to /proc/<pid>/mem. That used to be true, but has changed with
Linux 2.6.39 released in May 2011.
This patch lifts this restriction and now exploits /proc/<pid>/mem for
writing to inferior memory as well, if possible.
gdb/ChangeLog:
* linux-nat.c (linux_proc_xfer_partial): Handle write operations
as well.
I think this comment is outdated. Nowadays, linux-nat is always async,
unless the user has explictly turned it off with
"maint set target-async off".
gdb/ChangeLog:
* linux-nat.c (linux_nat_can_async_p): Update comment.
This applies the second part of GDB's End of Year Procedure, which
updates the copyright year range in all of GDB's files.
gdb/ChangeLog:
Update copyright year range in all GDB files.
This patch consolidates the API of target_mourn_inferior between GDB
and gdbserver, in my continuing efforts to make sharing the
fork_inferior function possible between both.
GDB's version of the function did not care about the inferior's ptid
being mourned, but gdbserver's needed to know this information. Since
it actually makes sense to pass the ptid as an argument, instead of
depending on a global value directly (which GDB's version did), I
decided to make the generic API to accept it. I then went on and
extended all calls being made on GDB to include a ptid argument (which
ended up being inferior_ptid most of the times, anyway), and now we
have a more sane interface.
On GDB's side, after talking to Pedro a bit about it, we decided that
just an assertion to make sure that the ptid being passed is equal to
inferior_ptid would be enough for now, on the GDB side. We can remove
the assertion and perform more operations later if we ever pass
anything different than inferior_ptid.
Regression tested on our BuildBot, everything OK.
I'd appreciate a special look at gdb/windows-nat.c's modification
because I wasn't really sure what to do there. It seemed to me that
maybe I should build a ptid out of the process information there, but
then I am almost sure the assertion on GDB's side would trigger.
gdb/ChangeLog:
2016-09-19 Sergio Durigan Junior <sergiodj@redhat.com>
* darwin-nat.c (darwin_kill_inferior): Adjusting call to
target_mourn_inferior to include ptid_t argument.
* fork-child.c (startup_inferior): Likewise.
* gnu-nat.c (gnu_kill_inferior): Likewise.
* inf-ptrace.c (inf_ptrace_kill): Likewise.
* infrun.c (handle_inferior_event_1): Likewise.
* linux-nat.c (linux_nat_attach): Likewise.
(linux_nat_kill): Likewise.
* nto-procfs.c (interrupt_query): Likewise.
(procfs_interrupt): Likewise.
(procfs_kill_inferior): Likewise.
* procfs.c (procfs_kill_inferior): Likewise.
* record.c (record_mourn_inferior): Likewise.
* remote-sim.c (gdbsim_kill): Likewise.
* remote.c (remote_detach_1): Likewise.
(remote_kill): Likewise.
* target.c (target_mourn_inferior): Change declaration to accept
new ptid_t argument; use gdb_assert on it.
* target.h (target_mourn_inferior): Move function prototype from
here...
* target/target.h (target_mourn_inferior): ... to here. Adjust it
to accept new ptid_t argument.
* windows-nat.c (get_windows_debug_event): Adjusting call to
target_mourn_inferior to include ptid_t argument.
gdb/gdbserver/ChangeLog:
2016-09-19 Sergio Durigan Junior <sergiodj@redhat.com>
* server.c (start_inferior): Call target_mourn_inferior instead of
mourn_inferior; pass ptid_t argument to it.
(resume): Likewise.
(handle_target_event): Likewise.
* target.c (target_mourn_inferior): New function.
* target.h (mourn_inferior): Delete macro.
Add the function lwp_is_stepping which indicates whether the given LWP
is currently single-stepping. This is a common interface, usable from
native GDB as well as from gdbserver.
gdb/gdbserver/ChangeLog:
* linux-low.c (lwp_is_stepping): New function.
gdb/ChangeLog:
* nat/linux-nat.h (lwp_is_stepping): New declaration.
* linux-nat.c (lwp_is_stepping): New function.
This commit implements a new function, target_continue, on top of the
target_resume function. Then, it replaces all calls to target_resume
by calls to target_continue or to the already existing
target_continue_no_signal.
This is one of the (many) necessary steps needed to consolidate the
target interface between GDB and gdbserver. In particular, I am
interested in the impact this change will have on the unification of
the fork_inferior function (which I have been working on).
Tested on the BuildBot, no regressions introduced.
gdb/gdbserver/ChangeLog:
2016-09-31 Sergio Durigan Junior <sergiodj@redhat.com>
* server.c (start_inferior): New variable 'ptid'. Replace calls
to the_target->resume by target_continue{,_no_signal}, depending
on the case.
* target.c (target_stop_and_wait): Call target_continue_no_signal
instead of the_target->resume.
(target_continue): New function.
gdb/ChangeLog:
2016-09-31 Sergio Durigan Junior <sergiodj@redhat.com>
* fork-child.c (startup_inferior): Replace calls to target_resume
by target_continue{,_no_signal}, depending on the case.
* linux-nat.c (cleanup_target_stop): Call
target_continue_no_signal instead of target_resume.
* procfs.c (procfs_wait): Likewise.
* target.c (target_continue): New function.
* target/target.h (target_continue): New prototype.
This commit fixes detaching on Linux when some thread exits the whole
thread group (process) just while we're detaching.
On Linux, a ptracer must detach from each LWP individually, with
PTRACE_DETACH. Since PTRACE_DETACH sets the thread running free, if
one of the already-detached threads causes the whole thread group to
exit (e.g., simply calls exit), the kernel force-kills the other
threads in the group, making them zombie, just as we're still
detaching them. Since PTRACE_DETACH against a zombie thread fails
with ESRCH, and gdb/gdbserver are not expecting this, the detach fails
with an error like: "Can't detach process: No such process.".
This patch detects this detach failure as normal, and instead of
erroring out, reaps the now-dead thread.
New test included, that exercises several different scenarios that
cause GDB/GDBserver to error out when it should not.
Tested on x86-64 GNU/Linux with {unix, native-gdbserver,
native-extended-gdbserver}
Note: without the previous fix, the "single-process + continue"
variant of the new test would fail with:
(gdb) PASS: gdb.threads/process-dies-while-detaching.exp: single-process: continue: watchpoint: switch to parent
continue
Continuing.
Warning:
Could not insert hardware watchpoint 3.
Could not insert hardware breakpoints:
You may have requested too many hardware breakpoints/watchpoints.
Command aborted.
(gdb) FAIL: gdb.threads/process-dies-while-detaching.exp: single-process: continue: watchpoint: continue
gdb/gdbserver/ChangeLog:
2016-07-01 Pedro Alves <palves@redhat.com>
Antoine Tremblay <antoine.tremblay@ericsson.com>
* linux-low.c: Change interface to take the target lwp_info
pointer directly and return void. Handle detaching from a zombie
thread.
(linux_detach_lwp_callback): New function.
(linux_detach): Detach from the leader thread after detaching from
the clone threads.
gdb/ChangeLog:
2016-07-01 Pedro Alves <palves@redhat.com>
Antoine Tremblay <antoine.tremblay@ericsson.com>
* inf-ptrace.c (inf_ptrace_detach_success): New function, factored
out from ...
(inf_ptrace_detach): ... here.
* inf-ptrace.h (inf_ptrace_detach_success): New declaration.
* linux-nat.c (get_pending_status): Rename to ...
(get_detach_signal): ... this, and return a host signal instead of
filling in a wait status.
(detach_one_lwp): New function, factored out from detach_callback
and adjusted to handle detaching from a zombie thread.
(detach_callback): Skip the leader thread.
(linux_nat_detach): No longer defer to inf_ptrace_detach to detach
the leader thread, nor build a signal string to pass down.
Instead, use target_announce_detach, detach_one_lwp and
inf_ptrace_detach_success.
gdb/testsuite/ChangeLog:
2016-07-01 Pedro Alves <palves@redhat.com>
Antoine Tremblay <antoine.tremblay@ericsson.com>
* gdb.threads/process-dies-while-detaching.c: New file.
* gdb.threads/process-dies-while-detaching.exp: New file.
And with that, we can switch the current UI to the UI whose input
descriptor woke up the event loop. IOW, if the user types in UI 2,
the event loop wakes up, switches to UI 2, and processes the input.
Next the user types in UI 3, the event loop wakes up and switches to
UI 3, etc.
gdb/ChangeLog:
2016-06-21 Pedro Alves <palves@redhat.com>
* event-top.c (input_fd): Delete.
(stdin_event_handler): Switch to the UI whose input descriptor got
the event. Adjust to per-UI input_fd.
(gdb_setup_readline): Don't set the input_fd global. Adjust to
per-UI input_fd.
(gdb_disable_readline): Adjust to per-UI input_fd.
* event-top.h (input_fd): Delete declaration.
* linux-nat.c (linux_nat_terminal_inferior): Don't remove input_fd
from the event-loop here.
(linux_nat_terminal_ours): Don't register input_fd in the
event-loop here.
* main.c (captured_main): Adjust to per-UI input_fd.
* remote.c (remote_terminal_inferior): Don't remove input_fd from
the event-loop here.
(remote_terminal_ours): Don't register input_fd in the event-loop
here.
* target.c: Include top.h and event-top.h.
(target_terminal_inferior): Remove input_fd from the event-loop
here.
(target_terminal_ours): Register input_fd in the event-loop.
* top.h (struct ui) <input_fd>: New field.
When GDB attaches to a process, it looks at the /proc/PID/task/ dir
for all clone threads of that process, and attaches to each of them.
Usually, if there is more than one clone thread, it means the program
is multi threaded and linked with pthreads. Thus when GDB soon after
attaching finds and loads a libthread_db matching the process, it'll
add a thread to the thread list for each of the initially found
lower-level LWPs.
If, however, GDB fails to find/load a matching libthread_db, nothing
is adding the LWPs to the thread list. And because of that, "detach"
hits an internal error:
(gdb) PASS: gdb.threads/clone-attach-detach.exp: fg attach 1: attach
info threads
Id Target Id Frame
* 1 LWP 6891 "clone-attach-de" 0x00007f87e5fd0790 in __nanosleep_nocancel () at ../sysdeps/unix/syscall-template.S:84
(gdb) FAIL: gdb.threads/clone-attach-detach.exp: fg attach 1: info threads shows two LWPs
detach
.../src/gdb/thread.c:1010: internal-error: is_executing: Assertion `tp' failed.
A problem internal to GDB has been detected,
further debugging may prove unreliable.
Quit this debugging session? (y or n)
FAIL: gdb.threads/clone-attach-detach.exp: fg attach 1: detach (GDB internal error)
From here:
...
#8 0x00000000007ba7cc in internal_error (file=0x98ea68 ".../src/gdb/thread.c", line=1010, fmt=0x98ea30 "%s: Assertion `%s' failed.")
at .../src/gdb/common/errors.c:55
#9 0x000000000064bb83 in is_executing (ptid=...) at .../src/gdb/thread.c:1010
#10 0x00000000004c23bb in get_pending_status (lp=0x12c5cc0, status=0x7fffffffdc0c) at .../src/gdb/linux-nat.c:1235
#11 0x00000000004c2738 in detach_callback (lp=0x12c5cc0, data=0x0) at .../src/gdb/linux-nat.c:1317
#12 0x00000000004c1a2a in iterate_over_lwps (filter=..., callback=0x4c2599 <detach_callback>, data=0x0) at .../src/gdb/linux-nat.c:899
#13 0x00000000004c295c in linux_nat_detach (ops=0xe7bd30, args=0x0, from_tty=1) at .../src/gdb/linux-nat.c:1358
#14 0x000000000068284d in delegate_detach (self=0xe7bd30, arg1=0x0, arg2=1) at .../src/gdb/target-delegates.c:34
#15 0x0000000000694141 in target_detach (args=0x0, from_tty=1) at .../src/gdb/target.c:2241
#16 0x0000000000630582 in detach_command (args=0x0, from_tty=1) at .../src/gdb/infcmd.c:2975
...
Tested on x86-64 Fedora 23. Also confirmed the test passes against
gdbserver with "maint set target-non-stop".
gdb/ChangeLog:
2016-05-24 Pedro Alves <palves@redhat.com>
PR gdb/19828
* linux-nat.c (attach_proc_task_lwp_callback): Mark the lwp
resumed, and add the thread to GDB's thread list.
testsuite/ChangeLog:
2016-05-24 Pedro Alves <palves@redhat.com>
PR gdb/19828
* gdb.threads/clone-attach-detach.c: New file.
* gdb.threads/clone-attach-detach.exp: New file.
Working on the fix for gdb/19828, I saw
gdb.threads/attach-many-short-lived-threads.exp fail once in an
unusual way. Unfortunately I didn't keep debug logs, but it's an
issue similar to what's been fixed in remote.c a while ago --
linux-nat.c was not fetching the pending status from the right place.
gdb/ChangeLog:
2016-05-24 Pedro Alves <palves@redhat.com>
PR gdb/19828
* linux-nat.c (get_pending_status): If the thread reported the
event to the core and it's pending, use the pending status signal
number.
Hacking the gdb.threads/attach-many-short-lived-threads.exp test to
spawn thousands of threads instead of dozens, and running gdb under
perf, I saw that GDB was spending most of the time in find_lwp_pid:
- captured_main
- 93.61% catch_command_errors
- 87.41% attach_command
- 87.40% linux_nat_attach
- 87.40% linux_proc_attach_tgid_threads
- 82.38% attach_proc_task_lwp_callback
- 81.01% find_lwp_pid
5.30% ptid_get_lwp
+ 0.10% ptid_lwp_p
+ 0.64% add_thread
+ 0.26% set_running
+ 0.24% set_executing
0.12% ptid_get_lwp
+ 0.01% ptrace
+ 0.01% add_lwp
attach_proc_task_lwp_callback is called once for each LWP that we
attach to, found by listing the /proc/PID/task/ directory. In turn,
attach_proc_task_lwp_callback calls find_lwp_pid to check whether the
LWP we're about to try to attach to is already known. Since
find_lwp_pid does a linear walk over the whole LWP list, this becomes
quadratic. We do the /proc/PID/task/ listing until we get two
iterations in a row where we found no new threads. So the second and
following times we walk the /proc/PID/task/ dir, we're going to take
an even worse find_lwp_pid hit.
Fix this by adding a hash table keyed by LWP PID, for fast lookup.
The linked list embedded in the LWP structure itself is kept, and made
a double-linked list, so that removals from that list are O(1). An
earlier version of this patch got rid of this list altogether, but
that revealed hidden dependencies / assumptions on how the list is
sorted. For example, killing a process and then waiting for all the
LWPs status using iterate_over_lwps only works as is because the
leader LWP is always last in the list. So I thought it better to take
an incremental approach and make this patch concern itself _only_ with
the PID lookup optimization.
gdb/ChangeLog:
2016-05-24 Pedro Alves <palves@redhat.com>
PR gdb/19828
* linux-nat.c (lwp_lwpid_htab): New htab.
(lwp_info_hash, lwp_lwpid_htab_eq, lwp_lwpid_htab_create)
(lwp_lwpid_htab_add_lwp): New functions.
(lwp_list): Tweak comment.
(lwp_list_add, lwp_list_remove, lwp_lwpid_htab_remove_pid): New
functions.
(purge_lwp_list): Rewrite, using htab_traverse_noresize.
(add_initial_lwp): Add lwp to htab too. Use lwp_list_add.
(delete_lwp): Use lwp_list_remove. Remove htab too.
(find_lwp_pid): Search in htab.
(_initialize_linux_nat): Call lwp_lwpid_htab_create.
* linux-nat.h (struct lwp_info) <prev>: New field.
Hacking the gdb.threads/attach-many-short-lived-threads.exp test to
spawn thousands of threads instead of dozens, I saw GDB having trouble
keeping up with threads being spawned too fast, when it tried to stop
them all. This was because while gdb is doing that, it updates the
thread list to make sure no new thread has sneaked in that might need
to be paused. It does this a few times until it sees no-new-threads
twice in a row. The thread listing update itself is not that
expensive, however, in the Linux backend, updating the threads list
calls linux_common_core_of_thread for each LWP to record on which core
each LWP was last seen running, which opens/reads/closes a /proc file
for each LWP which becomes expensive when you need to do it for
thousands of LWPs.
perf shows gdb in linux_common_core_of_thread 44% of the time, in the
stop_all_threads -> update_thread_list path in this use case.
This patch simply makes linux_common_core_of_thread avoid updating the
core the thread is bound to if the thread hasn't run since the last
time we updated that info. This makes linux_common_core_of_thread
disappear into the noise in the perf report.
gdb/ChangeLog:
2016-05-24 Pedro Alves <palves@redhat.com>
PR gdb/19828
* linux-nat.c (linux_resume_one_lwp_throw): Clear the LWP's core
field.
(linux_nat_update_thread_list): Don't fetch the core if already
known.
A following patch (fix for gdb/19828) makes linux-nat.c add threads to
GDB's thread list earlier in the "attach" sequence, and that causes a
surprising regression on
gdb.threads/attach-many-short-lived-threads.exp on my machine. The
extra "thread x exited" handling and traffic slows down that test
enough that GDB core has trouble keeping up with new threads that are
spawned while trying to stop existing ones.
I saw the exact same issue with remote/gdbserver a while ago and fixed
it in 65706a29ba (Remote thread create/exit events) so part of the
fix here is the exact same -- add support for thread created events to
gdb/linux-nat.c. infrun.c:stop_all_threads enables those events when
it tries to stop threads, which ensures that new threads never get a
chance to themselves start new threads, thus fixing the race.
gdb/
2016-05-24 Pedro Alves <palves@redhat.com>
PR gdb/19828
* linux-nat.c (report_thread_events): New global.
(linux_handle_extended_wait): Report
TARGET_WAITKIND_THREAD_CREATED if thread event reporting is
enabled.
(wait_lwp, linux_nat_filter_event): Report all thread exits if
thread event reporting is enabled. Remove comment.
(filter_exit_event): New function.
(linux_nat_wait_1): Use it.
(linux_nat_thread_events): New function.
(linux_nat_add_target): Install it as target_thread_events method.
In non-stop mode, "interrupt" results in a "stop with no signal",
while in all-stop mode, it results in a remote interrupt request /
stop with SIGINT. This is currently implemented in both the Linux and
remote target backends. Move it to the core code instead, making
target_interrupt specifically always about "Interrupting as if with
Ctrl-C", just like it is documented.
gdb/ChangeLog:
2016-04-12 Pedro Alves <palves@redhat.com>
* infcmd.c (interrupt_target_1): Call target_stop is in non-stop
mode.
* linux-nat.c (linux_nat_interrupt): Delete.
(linux_nat_add_target): Don't install linux_nat_interrupt.
* remote.c (remote_interrupt_ns): Change return type to void.
Throw error if interrupting the target is not supported.
(remote_interrupt): Don't call the remote_stop_ns/remote_stop_as.
This unbreaks pending/delayed breakpoints handling, as well as
hardware watchpoints, on MIPS.
Ref: https://sourceware.org/ml/gdb-patches/2016-02/msg00681.html
The MIPS kernel reports SI_KERNEL for all kernel generated traps,
instead of TRAP_BRKPT / TRAP_HWBKPT, but GDB isn't aware of this.
Basically, this commit:
- Folds watchpoints logic into check_stopped_by_breakpoint, and
renames it to save_stop_reason.
- Adds GDB_ARCH_IS_TRAP_HWBKPT.
- Makes MIPS set both GDB_ARCH_IS_TRAP_BRPT and
GDB_ARCH_IS_TRAP_HWBKPT to SI_KERNEL. In save_stop_reason, we
handle the case of the same si_code returning true for both
TRAP_BRPT and TRAP_HWBKPT by looking at what the debug registers
say.
Tested on x86-64 Fedora 20, native and gdbserver.
gdb/ChangeLog:
2016-02-24 Pedro Alves <palves@redhat.com>
* linux-nat.c (save_sigtrap) Delete.
(stop_wait_callback): Call save_stop_reason instead of
save_sigtrap.
(check_stopped_by_breakpoint): Rename to ...
(save_stop_reason): ... this. Bits of save_sigtrap folded here.
Use GDB_ARCH_IS_TRAP_HWBKPT and handle ambiguous
GDB_ARCH_IS_TRAP_BRKPT / GDB_ARCH_IS_TRAP_HWBKPT. Factor out
common code between the USE_SIGTRAP_SIGINFO and
!USE_SIGTRAP_SIGINFO blocks.
(linux_nat_filter_event): Call save_stop_reason instead of
save_sigtrap.
* nat/linux-ptrace.h: Check for both SI_KERNEL and TRAP_BRKPT
si_code for MIPS.
* nat/linux-ptrace.h: Fix "TRAP_HWBPT" typo in x86 table. Add
comments on MIPS behavior.
(GDB_ARCH_IS_TRAP_HWBKPT): Define for all archs.
gdb/gdbserver/ChangeLog:
2016-02-24 Pedro Alves <palves@redhat.com>
* linux-low.c (check_stopped_by_breakpoint): Rename to ...
(save_stop_reason): ... this. Use GDB_ARCH_IS_TRAP_HWBKPT and
handle ambiguous GDB_ARCH_IS_TRAP_BRKPT / GDB_ARCH_IS_TRAP_HWBKPT.
Factor out common code between the USE_SIGTRAP_SIGINFO and
!USE_SIGTRAP_SIGINFO blocks.
(linux_low_filter_event): Call save_stop_reason instead of
check_stopped_by_breakpoint and check_stopped_by_watchpoint.
Update comments.
(linux_wait_1): Update comments.
linux_nat_kill relies on get_last_target_status to determine whether
the current inferior is stopped at a unfollowed fork/vfork event.
This is bad because many things can happen ever since we caught the
fork/vfork event... This commit rewrites that code to instead walk
the thread list looking for unfollowed fork events, similarly to what
was done for remote.c.
New test included. The main idea of the test is make sure that when
the program stops for a fork catchpoint, and the user kills the
parent, gdb also kills the unfollowed fork child. Since the child
hasn't been added as an inferior at that point, we need some other
portable way to detect that the child is gone. The test uses a pipe
for that. The program forks twice, so you have grandparent, child and
grandchild. The grandchild inherits the write side of the pipe. The
grandparent hangs reading from the pipe, since nothing ever writes to
it. If, when GDB kills the child, it also kills the grandchild, then
the grandparent's pipe read returns 0/EOF and the test passes.
Otherwise, if GDB doesn't kill the grandchild, then the pipe read
never returns and the test times out, like:
FAIL: gdb.base/catch-fork-kill.exp: fork-kind=fork: exit-kind=kill: fork: kill parent (timeout)
FAIL: gdb.base/catch-fork-kill.exp: fork-kind=vfork: exit-kind=kill: vfork: kill parent (timeout)
No regressions on x86_64 Fedora 20. New test passes with gdbserver as
well.
gdb/ChangeLog:
2016-01-25 Pedro Alves <palves@redhat.com>
PR gdb/19494
* linux-nat.c (kill_one_lwp): New, factored out from ...
(kill_callback): ... this.
(kill_wait_callback): New, factored out from ...
(kill_wait_one_lwp): ... this.
(kill_unfollowed_fork_children): New function.
(linux_nat_kill): Use it.
gdb/testsuite/ChangeLog:
2016-01-25 Pedro Alves <palves@redhat.com>
PR gdb/19494
* gdb.base/catch-fork-kill.c: New file.
* gdb.base/catch-fork-kill.exp: New file.
Note: this applies on top of:
[PATCH] Remove support for LinuxThreads and vendor 2.4 kernels w/ backported NPTL
https://sourceware.org/ml/gdb-patches/2015-12/msg00214.html
We try to avoid using libthread_db.so to list threads in the inferior
when debugging live processes, but the code that decides whether to
use it decides incorrectly if you have more than one inferior, and the
current inferior doesn't have execution yet. The result is visible
as:
(gdb) add-inferior
Added inferior 2
(gdb) inferior 2
[Switching to inferior 2 [<null>] (<noexec>)]
(gdb) info inferiors
Num Description Executable
1 process 15397 /home/pedro/gdb/tests/threads
* 2 <null>
(gdb) info threads
Cannot find new threads: generic error
(gdb)
Fix this by checking whether each inferior has execution rather than
just the current inferior.
By moving the core updating to linux-nat.c's update_thread_list
implementation, this also ends up fixing the
lwp-last-seen-running-on-core updating in the case we're debugging a
program that uses raw clone rather than pthreads, as linux-thread-db.c
isn't pushed in the target stack in that scenario.
Tested on x86_64 Fedora 20.
gdb/ChangeLog:
2015-12-17 Pedro Alves <palves@redhat.com>
PR threads/19354
* linux-nat.c (linux_nat_update_thread_list): Update process cores
each lwp was last seen running on here.
* linux-thread-db.c (update_thread_core): Delete.
(thread_db_update_thread_list_td_ta_thr_iter): Rename to ...
(thread_db_update_thread_list): ... this. Skip inferiors with
execution. Also call the target beneath.
(thread_db_update_thread_list): Delete.
gdb/testsuite/ChangeLog:
2015-12-17 Pedro Alves <palves@redhat.com>
PR threads/19354
* gdb.multi/info-threads.exp: New file.
Since we now rely on PTRACE_EVENT_CLONE being available (added in
Linux 2.5.46), we're relying on NPTL.
This commit removes the support for older LinuxThreads, as well as the
workarounds for vendor 2.4 kernels with NPTL backported.
- Rely on tkill being available.
- Assume gdb doesn't get cancel signals.
- Remove code that checks the LinuxThreads restart and cancel signals
in the inferior.
- Assume that __WALL is available.
- Assume that non-leader threads report WIFEXITED.
- Thus, no longer need to send signal 0 to check whether threads are
still alive.
- Update comments throughout.
Tested on x86_64 Fedora 20, native and gdbserver.
gdb/ChangeLog:
* configure.ac: Remove tkill checks.
* configure, config.in: Regenerate.
* linux-nat.c: Remove HAVE_TKILL_SYSCALL check. Update top level
comments.
(linux_nat_post_attach_wait): Remove 'cloned' parameter. Use
__WALL.
(attach_proc_task_lwp_callback): Don't set the cloned flag.
(linux_nat_attach): Adjust.
(kill_lwp): Remove HAVE_TKILL_SYSCALL check. No longer fall back
to 'kill'.
(linux_handle_extended_wait): Use __WALL. Don't set the cloned
flag.
(wait_lwp): Use __WALL. Update comments.
(running_callback, stop_and_resume_callback): Delete.
(linux_nat_filter_event): Don't stop and resume all lwps. Don't
check if the event LWP has previously exited.
(check_zombie_leaders): Update comments.
(linux_nat_wait_1): Use __WALL.
(kill_wait_callback): Don't handle clone processes separately.
Use __WALL instead.
(linux_thread_alive): Delete.
(linux_nat_thread_alive): Return true as long as the LWP is in the
LWP list.
(linux_nat_update_thread_list): Assume the kernel supports
PTRACE_EVENT_CLONE.
(get_signo): Delete.
(lin_thread_get_thread_signals): Remove LinuxThreads references.
No longer check __pthread_sig_restart / __pthread_sig_cancel in
the inferior.
* linux-nat.h (struct lwp_info) <cloned>: Delete field.
* linux-thread-db.c: Update comments.
(_initialize_thread_db): Remove LinuxThreads references.
* nat/linux-waitpid.c (my_waitpid): No longer emulate __WALL.
Pass down flags unmodified.
* linux-waitpid.h (my_waitpid): Update documentation.
gdb/gdbserver/ChangeLog:
* linux-low.c (linux_kill_one_lwp): Remove references to
LinuxThreads.
(kill_lwp): Remove HAVE_TKILL_SYSCALL check. No longer fall back
to 'kill'.
(linux_init_signals): Delete.
(initialize_low): Adjust.
* thread-db.c (thread_db_init): Remove LinuxThreads reference.
Before, on systems that did not support PTRACE_EVENT_CLONE, both GDB and
GDBServer coordinated with libthread_db.so to insert breakpoints at magic
locations in libpthread.so, in order to break at thread creation and
thread death.
Support for thread events was removed from GDBServer as patch:
https://sourceware.org/ml/gdb-patches/2015-11/msg00466.html
This patch removes support for thread events in GDB.
No regressions found on Ubuntu 14.04 x86_64.
gdb/ChangeLog:
* breakpoint.c (remove_thread_event_breakpoints): Remove.
* breakpoint.h (remove_thread_event_breakpoints): Remove
declaration.
* linux-nat.c (in_pid_list_p): Remove.
(lin_lwp_attach_lwp): Remove.
* linux-nat.h (lin_lwp_attach_lwp): Remove declaration.
* linux-thread-db.c (thread_db_use_events): Remove.
(struct thread_db_info) <td_create_bp_addr>: Remove.
<td_death_bp_addr>: Likewise.
<td_ta_event_addr_p>: Likewise.
<td_ta_set_event_p>: Likewise.
<td_ta_clear_event_p>: Likewise.
<td_ta_event_getmsg_p>: Likewise.
<td_thr_event_enable_p>: Likewise.
(attach_thread): Likewise.
(detach_thread): Likewise.
(have_threads_callback): Likewise.
(have_threads): Likewise.
(enable_thread_event): Likewise.
(enable_thread_event_reporting): Likewise.
(try_thread_db_load_1): Remove td_ta_event_addr, td_ta_set_event,
td_ta_clear_event, td_ta_event_getmsg, td_thr_event_enable
initializations.
(try_thread_db_load_1): Remove enable_thread_event_reporting call.
(disable_thread_event_reporting): Remove.
(record_thread): Adapt to thread_db_use_event removal.
(detach_thread): Remove.
(thread_db_detach): Adapt to thread_db_use_event removal.
(check_event): Remove.
(thread_db_wait): Adapt to thread events support removal.
(thread_db_mourn_inferior): Likewise.
(find_new_threads_callback): Likewise.
(find_new_threads_once): Likewise.
(thread_db_update_thread_list): Likewise.
This patch adds support for thread names in the remote protocol, and
updates gdb/gdbserver to use it. The information is added to the XML
description sent in response to the qXfer:threads:read packet.
gdb/ChangeLog:
* linux-nat.c (linux_nat_thread_name): Replace implementation by call
to linux_proc_tid_get_name.
* nat/linux-procfs.c (linux_proc_tid_get_name): New function,
implementation inspired by linux_nat_thread_name.
* nat/linux-procfs.h (linux_proc_tid_get_name): New declaration.
* remote.c (struct private_thread_info) <name>: New field.
(free_private_thread_info): Free name field.
(remote_thread_name): New function.
(thread_item_t) <name>: New field.
(clear_threads_listing_context): Free name field.
(start_thread): Get name xml attribute.
(thread_attributes): Add "name" attribute.
(remote_update_thread_list): Copy name field.
(init_remote_ops): Assign remote_thread_name callback.
* target.h (target_thread_name): Update comment.
* NEWS: Mention remote thread name support.
gdb/gdbserver/ChangeLog:
* linux-low.c (linux_target_ops): Use linux_proc_tid_get_name.
* server.c (handle_qxfer_threads_worker): Refactor to include thread
name in reply.
* target.h (struct target_ops) <thread_name>: New field.
(target_thread_name): New macro.
gdb/doc/ChangeLog:
* gdb.texinfo (Thread List Format): Mention thread names.
Since this code path returns a string owned by the target (we don't know how
it's allocated, could be a static read-only string), it's safer if we return
a constant string. If, for some reasons, the caller wishes to modify the
string, it should make itself a copy.
gdb/ChangeLog:
* linux-nat.c (linux_nat_thread_name): Constify return value.
* target.h (struct target_ops) <to_thread_name>: Likewise.
(target_thread_name): Likewise.
* target.c (target_thread_name): Likewise.
* target-delegates.c (debug_thread_name): Regenerate.
* python/py-infthread.c (thpy_get_name): Constify local variables.
* thread.c (print_thread_info): Likewise.
(thread_find_command): Likewise.
The existing logic was simply to flip syscall entry/return state when a
syscall trap was seen, and even then only with active 'catch syscall'.
That can get out of sync if 'catch syscall' is toggled at odd times.
This patch updates the entry/return state for all syscall traps,
regardless of catching state, and also updates known syscall state for
other kinds of traps. Almost all PTRACE_EVENT stops are delivered from
the middle of a syscall, so this can act like an entry. Every other
kind of ptrace stop is only delivered outside of syscall event pairs, so
marking them ignored ensures the next syscall trap looks like an entry.
Three new test scenarios are added to catch-syscall.exp:
- Disable 'catch syscall' from an entry to deliberately miss the return
event, then re-enable to make sure a new entry is recognized.
- Enable 'catch syscall' for the first time from a vfork event, which is
a PTRACE_EVENT_VFORK in the middle of the syscall. Make sure the next
syscall event is recognized as the return.
- Make sure entry and return are recognized for an ENOSYS syscall. This
is to defeat a common x86 hack that uses the pre-filled ENOSYS return
value as a sign of being on the entry side.
gdb/ChangeLog:
2015-10-19 Josh Stone <jistone@redhat.com>
* linux-nat.c (linux_handle_syscall_trap): Always update entry/
return state, even when not actively catching syscalls at all.
(linux_handle_extended_wait): Mark syscall_state like an entry.
(wait_lwp): Set syscall_state ignored for other traps.
(linux_nat_filter_event): Likewise.
gdb/testsuite/ChangeLog:
2015-10-19 Josh Stone <jistone@redhat.com>
* gdb.base/catch-syscall.c: Include <sched.h>.
(unknown_syscall): New variable.
(main): Trigger a vfork and an unknown syscall.
* gdb.base/catch-syscall.exp (vfork_syscalls): New variable.
(unknown_syscall_number): Likewise.
(check_call_to_syscall): Accept an optional syscall pattern.
(check_return_from_syscall): Likewise.
(check_continue): Likewise.
(test_catch_syscall_without_args): Check for vfork and ENOSYS.
(test_catch_syscall_skipping_return): New test toggling off 'catch
syscall' to step over the syscall return, then toggling back on.
(test_catch_syscall_mid_vfork): New test turning on 'catch syscall'
during a PTRACE_EVENT_VFORK stop, in the middle of a vfork syscall.
(do_syscall_tests): Call test_catch_syscall_without_args and
test_catch_syscall_mid_vfork.
(test_catch_syscall_without_args_noxml): Check for vfork and ENOSYS.
(fill_all_syscalls_numbers): Initialize unknown_syscall_number.
Since the record-btrace target now supports non-stop mode, we no
longer need to force-disable as-ns on x86.
gdb/ChangeLog:
2015-09-30 Pedro Alves <palves@redhat.com>
* linux-nat.c (linux_nat_always_non_stop_p): Always return 1.
* x86-linux-nat.c (x86_linux_always_non_stop_p): Delete.
(x86_linux_create_target): Don't install
x86_linux_always_non_stop_p.
This patch makes the execution control code use largely the same
mechanisms in both sync- and async-capable targets. This means using
continuations and use the event loop to react to target events on sync
targets as well. The trick is to immediately mark infrun's event loop
source after resume instead of calling wait_for_inferior. Then
fetch_inferior_event is adjusted to do a blocking wait on sync
targets.
Tested on x86_64 Fedora 20, native and gdbserver, with and without
"maint set target-async off".
gdb/ChangeLog:
2015-09-09 Pedro Alves <palves@redhat.com>
* breakpoint.c (bpstat_do_actions_1, until_break_command): Don't
check whether the target can async.
* inf-loop.c (inferior_event_handler): Only call target_async if
the target can async.
* infcall.c: Include top.h and interps.h.
(run_inferior_call): For the interpreter to sync mode while
running the infcall. Call wait_sync_command_done instead of
wait_for_inferior plus normal_stop.
* infcmd.c (prepare_execution_command): Don't check whether the
target can async when running in the foreground.
(step_1): Delete synchronous case handling.
(step_once): Always install a continuation, even in sync mode.
(until_next_command, finish_forward): Don't check whether the
target can async.
(attach_command_post_wait, notice_new_inferior): Always install a
continuation, even in sync mode.
* infrun.c (mark_infrun_async_event_handler): New function.
(proceed): In sync mode, mark infrun's event source instead of
waiting for events here.
(fetch_inferior_event): If the target can't async, do a blocking
wait.
(prepare_to_wait): In sync mode, mark infrun's event source.
(infrun_async_inferior_event_handler): No longer bail out if the
target can't async.
* infrun.h (mark_infrun_async_event_handler): New declaration.
* linux-nat.c (linux_nat_wait_1): Remove calls to
set_sigint_trap/clear_sigint_trap.
(linux_nat_terminal_inferior): No longer check whether the target
can async.
* mi/mi-interp.c (mi_on_sync_execution_done): Update and simplify
comment.
(mi_execute_command_input_handler): No longer check whether the
target is async. Update and simplify comment.
* target.c (default_target_wait): New function.
* target.h (struct target_ops) <to_wait>: Now defaults to
default_target_wait.
(default_target_wait): Declare.
* top.c (wait_sync_command_done): New function, factored out from
...
(maybe_wait_sync_command_done): ... this.
* top.h (wait_sync_command_done): Declare.
* target-delegates.c: Regenerate.
The Linux target and gdbserver now check the siginfo si_code
reported on a SIGTRAP to detect whether the trap indicates
a software breakpoint was hit.
Unfortunately, on Cell/B.E., the kernel uses an si_code value
of TRAP_BRKPT when a SW breakpoint was hit in PowerPC code,
but a si_code value of SI_KERNEL when a SW breakpoint was
hit in SPU code.
This patch updates Linux target and gdbserver to accept both
si_code values to indicate SW breakpoint on PowerPC.
ChangeLog:
* nat/linux-ptrace.h (GDB_ARCH_TRAP_BRKPT): Replace by ...
(GDB_ARCH_IS_TRAP_BRKPT): ... this. Add __powerpc__ case.
* linux-nat.c (check_stopped_by_breakpoint): Use
GDB_ARCH_IS_TRAP_BRKPT instead of GDB_ARCH_TRAP_BRKPT.
gdbserver/ChangeLog:
* linux-low.c (check_stopped_by_breakpoint): Use
GDB_ARCH_IS_TRAP_BRKPT instead of GDB_ARCH_TRAP_BRKPT.
GDB provides no indicator of progress during file operations, and can
appear to have locked up during slow remote transfers. This commit
updates GDB to print a warning each time a file is accessed over RSP.
An additional message detailing how to avoid remote transfers is
printed for the first transfer only.
gdb/ChangeLog:
* target.h (struct target_ops) <to_fileio_open>: New argument
warn_if_slow. Update comment. All implementations updated.
(target_fileio_open_warn_if_slow): New declaration.
* target.c (target_fileio_open): Renamed as...
(target_fileio_open_1): ...this. New argument warn_if_slow.
Pass warn_if_slow to implementation. Update debug printing.
(target_fileio_open): New function.
(target_fileio_open_warn_if_slow): Likewise.
* gdb_bfd.c (gdb_bfd_iovec_fileio_open): Use new function
target_fileio_open_warn_if_slow.
gdb/testsuite/ChangeLog:
* gdb.trace/pending.exp: Cope with remote transfer warnings.
Markus reported that ASNS breaks target record-btrace. In particular,
the gdb.btrace/multi-thread-step.exp test fails (both with BTS and PT
tracing) with a crash in py-inferior.c:
Program received signal SIGSEGV, Segmentation fault.
0x00000000006aa40d in add_thread_object (tp=0x27d32d0)
at /users/mmetzger/team/gdb/git/gdb/python/py-inferior.c:337
337 entry->next = inf_obj->threads;
My machine doesn't support BTS nor PT, so I missed this...
Disabling ASNS temporarily on x86 until this is addressed.
Tested on x86_64 Fedora 20.
gdb/ChangeLog:
2015-08-18 Pedro Alves <palves@redhat.com>
* linux-nat.c (linux_nat_always_non_stop_p): If the linux_ops
target implements to_always_non_stop_p, call it.
* x86-linux-nat.c (x86_linux_always_non_stop_p): New function.
(x86_linux_create_target): Install it as to_always_non_stop_p
method.
The testsuite shows no regressions with this forced on, on:
- Native x86_64 Fedora 20, with and output "set displaced off".
- Native x86_64 Fedora 20, on top of x86 software single-step series.
- PPC64 Fedora 18.
- S/390 RHEL 7.1.
Let's try making it the default.
gdb/ChangeLog:
2015-08-07 Pedro Alves <palves@redhat.com>
* linux-nat.c (linux_nat_always_non_stop_p): Return 1.
With "maint set target-non-stop on" we get:
@@ -66,13 +66,16 @@ Continuing.
interrupt
(gdb) PASS: gdb.base/interrupt-noterm.exp: interrupt
-Program received signal SIGINT, Interrupt.
-PASS: gdb.base/interrupt-noterm.exp: inferior received SIGINT
-testcase src/gdb/testsuite/gdb.base/interrupt-noterm.exp completed in 0 seconds
+[process 12119] #1 stopped.
+0x0000003615ebc6d0 in __nanosleep_nocancel () at ../sysdeps/unix/syscall-template.S:81
+81 T_PSEUDO (SYSCALL_SYMBOL, SYSCALL_NAME, SYSCALL_NARGS)
+FAIL: gdb.base/interrupt-noterm.exp: inferior received SIGINT (timeout)
+testcase src/gdb/testsuite/gdb.base/interrupt-noterm.exp completed in 10 seconds
That is, we get "[$thread] #1 stopped" instead of SIGINT.
The issue is that we don't currently distinguish send
"interrupt/ctrl-c" to target terminal vs "stop/pause" thread well;
both cases go through "target_stop".
And then, the native Linux backend (linux-nat.c) implements
target_stop with SIGSTOP in non-stop mode, and SIGINT in all-stop
mode. Since "maint set target-non-stop on" forces the backend to be
always running in non-stop mode, even though the user-visible behavior
is "set non-stop" is "off", "interrupt" causes a SIGSTOP instead of
the SIGINT the test expects.
Fix this by introducing a target_interrupt method to use in the
"interrupt/ctrl-c" case, so "set non-stop off" can always work the
same irrespective of "maint set target-non-stop on/off". I'm
explictly considering changing the "set non-stop on" behavior as out
of scope here.
Most of the patch is an across-the-board rename of to_stop hook
implementations to to_interrupt. The only targets where something
more than a rename is being done are linux-nat.c and remote.c, which
are the only targets that support async, and thus are the only ones
the core side calls target_stop on.
gdb/ChangeLog:
2015-08-07 Pedro Alves <palves@redhat.com>
* darwin-nat.c (darwin_stop): Rename to ...
(darwin_interrupt): ... this.
(_initialize_darwin_inferior): Adjust.
* gnu-nat.c (gnu_stop): Delete.
(gnu_target): Don't install gnu_stop.
* inf-ptrace.c (inf_ptrace_stop): Rename to ...
(inf_ptrace_interrupt): ... this.
(inf_ptrace_target): Adjust.
* infcmd.c (interrupt_target_1): Use target_interrupt instead of
target_stop.
* linux-nat (linux_nat_stop): Rename to ...
(linux_nat_interrupt): ... this.
(linux_nat_stop): Reimplement.
(linux_nat_add_target): Install linux_nat_interrupt.
* nto-procfs.c (nto_interrupt_twice): Rename to ...
(nto_handle_sigint_twice): ... this.
(nto_interrupt): Rename to ...
(nto_handle_sigint): ... this. Call target_interrupt instead of
target_stop.
(procfs_wait): Adjust.
(procfs_stop): Rename to ...
(procfs_interrupt): ... this.
(init_procfs_targets): Adjust.
* procfs.c (procfs_stop): Rename to ...
(procfs_interrupt): ... this.
(procfs_target): Adjust.
* remote-m32r-sdi.c (m32r_stop): Rename to ...
(m32r_interrupt): ... this.
(init_m32r_ops): Adjust.
* remote-sim.c (gdbsim_stop_inferior): Rename to ...
(gdbsim_interrupt_inferior): ... this.
(gdbsim_stop): Rename to ...
(gdbsim_interrupt): ... this.
(gdbsim_cntrl_c): Adjust.
(init_gdbsim_ops): Adjust.
* remote.c (sync_remote_interrupt): Adjust comments.
(remote_stop_as): Rename to ...
(remote_interrupt_as): ... this.
(remote_stop): Adjust comment.
(remote_interrupt): New function.
(init_remote_ops): Install remote_interrupt.
* target.c (target_interrupt): New function.
* target.h (struct target_ops) <to_interrupt>: New field.
(target_interrupt): New declaration.
* windows-nat.c (windows_stop): Rename to ...
(windows_interrupt): ... this.
* target-delegates.c: Regenerate.
This finally implements user-visible all-stop mode running with the
target_ops backend always in non-stop mode. This is a stepping stone
towards finer-grained control of threads, being able to do interesting
things like thread groups, associating groups with breakpoints, etc.
From the user's perspective, all-stop mode is really just a special
case of being able to stop and resume specific sets of threads, so it
makes sense to do this step first.
With this, even in all-stop, the target is no longer in charge of
stopping all threads before reporting an event to the core -- the core
takes care of it when it sees fit. For example, when "next"- or
"step"-ing, we can avoid stopping and resuming all threads at each
internal single-step, and instead only stop all threads when we're
about to present the stop to the user.
The implementation is almost straight forward, as the heavy lifting
has been done already in previous patches. Basically, we replace
checks for "set non-stop on/off" (the non_stop global), with calls to
a new target_is_non_stop_p function. In a few places, if "set
non-stop off", we stop all threads explicitly, and in a few other
places we resume all threads explicitly, making use of existing
methods that were added for teaching non-stop to step over breakpoints
without displaced stepping.
This adds a new "maint set target-non-stop on/off/auto" knob that
allows both disabling the feature if we find problems, and
force-enable it for development (useful when teaching a target about
this. The default is "auto", which means the feature is enabled if a
new target method says it should be enabled. The patch implements the
method in linux-nat.c, just for illustration, because it still returns
false. We'll need a few follow up fixes before turning it on by
default. This is a separate target method from indicating regular
non-stop support, because e.g., while e.g., native linux-nat.c is
close to regression free with all-stop-non-stop (with following
patches will fixing the remaining regressions), remote.c+gdbserver
will still need more fixing, even though it supports "set non-stop
on".
Tested on x86_64 Fedora 20, native, with and without "set displaced
off", and with and without "maint set target-non-stop on"; and also
against gdbserver.
gdb/ChangeLog:
2015-08-07 Pedro Alves <palves@redhat.com>
* NEWS: Mention "maint set/show target-non-stop".
* breakpoint.c (update_global_location_list): Check
target_is_non_stop_p instead of non_stop.
* infcmd.c (attach_command_post_wait, attach_command): Likewise.
* infrun.c (show_can_use_displaced_stepping)
(can_use_displaced_stepping_p, start_step_over_inferior):
Likewise.
(internal_resume_ptid): New function.
(resume): Use it.
(proceed): Check target_is_non_stop_p instead of non_stop. If in
all-stop mode but the target is always in non-stop mode, start all
the other threads that are implicitly resumed too.
(for_each_just_stopped_thread, fetch_inferior_event)
(adjust_pc_after_break, stop_all_threads): Check
target_is_non_stop_p instead of non_stop.
(handle_inferior_event): Likewise. Handle detach-fork in all-stop
with the target always in non-stop mode.
(handle_signal_stop) <random signal>: Check target_is_non_stop_p
instead of non_stop.
(switch_back_to_stepped_thread): Check target_is_non_stop_p
instead of non_stop.
(keep_going_stepped_thread): Use internal_resume_ptid.
(stop_waiting): If in all-stop mode, and the target is in non-stop
mode, stop all threads.
(keep_going_pass): Likewise, when starting a new in-line step-over
sequence.
* linux-nat.c (get_pending_status, select_event_lwp)
(linux_nat_filter_event, linux_nat_wait_1, linux_nat_wait): Check
target_is_non_stop_p instead of non_stop.
(linux_nat_always_non_stop_p): New function.
(linux_nat_stop): Check target_is_non_stop_p instead of non_stop.
(linux_nat_add_target): Install linux_nat_always_non_stop_p.
* target-delegates.c: Regenerate.
* target.c (target_is_non_stop_p): New function.
(target_non_stop_enabled, target_non_stop_enabled_1): New globals.
(maint_set_target_non_stop_command)
(maint_show_target_non_stop_command): New functions.
(_initilize_target): Install "maint set/show target-non-stop"
commands.
* target.h (struct target_ops) <to_always_non_stop_p>: New field.
(target_non_stop_enabled): New declaration.
(target_is_non_stop_p): New declaration.
gdb/doc/ChangeLog:
2015-08-07 Pedro Alves <palves@redhat.com>
* gdb.texinfo (Maintenance Commands): Document "maint set/show
target-non-stop".
The new gdb.threads/fork-plus-threads.exp test exposes one more
problem. When one types "info inferiors" after running the program,
one see's a couple inferior left still, while there should only be
inferior #1 left. E.g.:
(gdb) info inferiors
Num Description Executable
4 process 8393 /home/pedro/bugs/src/test
2 process 8388 /home/pedro/bugs/src/test
* 1 <null> /home/pedro/bugs/src/test
(gdb) info threads
Calling prune_inferiors() manually at this point (from a top gdb) does
not remove them, because they still have inf->pid != 0 (while they
shouldn't). This suggests that we never mourned those inferiors.
Enabling logs (master + previous patch) we see:
...
WL: waitpid Thread 0x7ffff7fc2740 (LWP 9513) received Trace/breakpoint trap (stopped)
WL: Handling extended status 0x03057f
LHEW: Got clone event from LWP 9513, new child is LWP 9579
[New Thread 0x7ffff37b8700 (LWP 9579)]
WL: waitpid Thread 0x7ffff7fc2740 (LWP 9508) received 0 (exited)
WL: Thread 0x7ffff7fc2740 (LWP 9508) exited.
^^^^^^^^
[Thread 0x7ffff7fc2740 (LWP 9508) exited]
WL: waitpid Thread 0x7ffff7fc2740 (LWP 9499) received 0 (exited)
WL: Thread 0x7ffff7fc2740 (LWP 9499) exited.
[Thread 0x7ffff7fc2740 (LWP 9499) exited]
RSRL: resuming stopped-resumed LWP Thread 0x7ffff37b8700 (LWP 9579) at 0x3615ef4ce1: step=0
...
(gdb) info inferiors
Num Description Executable
5 process 9508 /home/pedro/bugs/src/test
^^^^
4 process 9503 /home/pedro/bugs/src/test
3 process 9500 /home/pedro/bugs/src/test
2 process 9499 /home/pedro/bugs/src/test
* 1 <null> /home/pedro/bugs/src/test
(gdb)
...
Note the "Thread 0x7ffff7fc2740 (LWP 9508) exited." line.
That's this in wait_lwp:
/* Check if the thread has exited. */
if (WIFEXITED (status) || WIFSIGNALED (status))
{
thread_dead = 1;
if (debug_linux_nat)
fprintf_unfiltered (gdb_stdlog, "WL: %s exited.\n",
target_pid_to_str (lp->ptid));
}
}
That was the leader thread reporting an exit, meaning the whole
process is gone. So the problem is that this code doesn't understand
that an WIFEXITED status of the leader LWP should be reported to
infrun as process exit.
gdb/ChangeLog:
2015-07-30 Pedro Alves <palves@redhat.com>
PR threads/18600
* linux-nat.c (wait_lwp): Report to the core when thread group
leader exits.
gdb/testsuite/ChangeLog:
2015-07-30 Pedro Alves <palves@redhat.com>
PR threads/18600
* gdb.threads/fork-plus-threads.exp: Test that "info inferiors"
only shows inferior 1.
When a program forks and another process start threads while gdb is
handling the fork event, newly created threads are left stuck stopped
by gdb, even though gdb presents them as "running", to the user.
This can be seen with the test added by this patch. The test has the
inferior fork a certain number of times and waits for all children to
exit. Each fork child spawns a number of threads that do nothing and
joins them immediately. Normally, the program should run unimpeded
(from the point of view of the user) and exit very quickly. Without
this fix, it doesn't because of some threads left stopped by gdb, so
inferior 1 never exits.
The program triggers when a new clone thread is found while inside the
linux_stop_and_wait_all_lwps call in linux-thread-db.c:
linux_stop_and_wait_all_lwps ();
ALL_LWPS (lp)
if (ptid_get_pid (lp->ptid) == pid)
thread_from_lwp (lp->ptid);
linux_unstop_all_lwps ();
Within linux_stop_and_wait_all_lwps, we reach
linux_handle_extended_wait with the "stopping" parameter set to 1, and
because of that we don't mark the new lwp as resumed. As consequence,
the subsequent resume_stopped_resumed_lwps, called from
linux_unstop_all_lwps, never resumes the new LWP.
There's lots of cruft in linux_handle_extended_wait that no longer
makes sense. On systems with CLONE events support, we don't rely on
libthread_db for thread listing anymore, so the code that preserves
stop_requested and the handling of last_resume_kind is all dead.
So the fix is to remove all that, and simply always mark the new LWP
as resumed, so that resume_stopped_resumed_lwps re-resumes it.
gdb/ChangeLog:
2015-07-30 Pedro Alves <palves@redhat.com>
Simon Marchi <simon.marchi@ericsson.com>
PR threads/18600
* linux-nat.c (linux_handle_extended_wait): On CLONE event, always
mark the new thread as resumed. Remove STOPPING parameter.
(wait_lwp): Adjust call to linux_handle_extended_wait.
(linux_nat_filter_event): Adjust call to
linux_handle_extended_wait.
(resume_stopped_resumed_lwps): Add debug output.
gdb/testsuite/ChangeLog:
2015-07-30 Simon Marchi <simon.marchi@ericsson.com>
Pedro Alves <palves@redhat.com>
PR threads/18600
* gdb.threads/fork-plus-threads.c: New file.
* gdb.threads/fork-plus-threads.exp: New file.
If a non-leader thread exits the process while all other threads are
ptrace-stopped, native gdb fails an assertion. The test added by this
commit catches it:
/home/pedro/gdb/mygit/build/../src/gdb/linux-nat.c:3198: internal-error: linux_nat_filter_event: Assertion `lp->resumed' failed.
A problem internal to GDB has been detected,
further debugging may prove unreliable.
Quit this debugging session? (y or n)
FAIL: gdb.threads/non-leader-exit-process.exp: program exits normally (GDB internal error)
The fix is just to remove the assertion.
With that out of the way, neither GDB not GDBserver handle this
perfectly though, so I'm adding a KFAIL:
(gdb) continue
Continuing.
[Thread 0x7ffff7fc0700 (LWP 15350) exited]
No unwaited-for children left.
Couldn't get registers: No such process.
(gdb) KFAIL: gdb.threads/non-ldr-exit.exp: program exits normally (PRMS: gdb/18717)
gdb/ChangeLog:
2015-07-24 Pedro Alves <palves@redhat.com>
PR gdb/18717
* linux-nat.c (linux_nat_filter_event): Don't assert that the lwp
is resumed, and extend the debug log.
gdb/testsuite/ChangeLog:
2015-07-24 Pedro Alves <palves@redhat.com>
PR gdb/18717
* gdb.threads/non-ldr-exit.c: New file.
* gdb.threads/non-ldr-exit.exp: New file.
have_ptrace_getregset is a tri-state variable (-1, 0, 1), and we have
some conditions like "if (have_ptrace_getregset)", which is not correct.
I'll explain why it is not correct in the following example. This fix
to this problem to replace the test (have_ptrace_getregset) to test
(have_ptrace_getregset == 1) or (have_ptrace_getregset == -1) etc.
However Doug thinks it hinders readability
https://sourceware.org/ml/gdb-patches/2015-05/msg00692.html so I decide
to add a new enum tribool and change have_ptrace_getregset to it, in
order to make these tests more readable.
have_ptrace_getregset is initialised to -1, and is adjusted to 0 or 1 in
$ARCH_linux_read_description according to the capability of the kernel.
However, it is possible that have_ptrace_getregset is used before it is
set to 0 or 1, which means it is still -1. This is shown below.
(gdb) run
Starting program: gdb/testsuite/gdb.base/break
Breakpoint 2, amd64_linux_fetch_inferior_registers (ops=0xceaa80, regcache=0xe72000, regnum=16) at git/gdb/amd64-linux-nat.c:128
128 {
top?p have_ptrace_getregset
$1 = TRIBOOL_UNKNOWN
top?c
Continuing.
Breakpoint 2, amd64_linux_fetch_inferior_registers (ops=0xceaa80, regcache=0xe72000, regnum=16) at git/gdb/amd64-linux-nat.c:128
128 {
top?c
Continuing.
Breakpoint 1, x86_linux_read_description (ops=0xceaa80) at git/gdb/x86-linux-nat.c:117
117 {
PTRACE_GETREGSET command is used even GDB doesn't know whether
PTRACE_GETREGSET is supported or not. It is wrong, but works on x86.
However it doesn't work on arm-linux if the kernel doesn't support
PTRACE_GETREGSET at all. We'll get:
(gdb) run
Starting program: gdb/testsuite/gdb.base/break
warning: Unable to fetch general register.
PC register is not available
gdb:
2015-06-23 Yao Qi <yao.qi@linaro.org>
* amd64-linux-nat.c (amd64_linux_fetch_inferior_registers):
Check whether have_ptrace_getregset is TRIBOOL_TRUE explicitly.
(amd64_linux_store_inferior_registers): Likewise.
* arm-linux-nat.c (fetch_fpregister): Likewise.
(fetch_fpregs, store_fpregister): Likewise.
(store_fpregister, store_fpregs): Likewise.
(fetch_register, fetch_regs): Likewise.
(store_register, store_regs): Likewise.
(fetch_vfp_regs, store_vfp_regs): Likewise.
(arm_linux_read_description): Check have_ptrace_getregset is
TRIBOOL_UNKNOWN. Set have_ptrace_getregset to TRIBOOL_TRUE
or TRIBOOL_FALSE.
* i386-linux-nat.c (fetch_xstateregs): Check
have_ptrace_getregset is not TRIBOOL_TRUE.
(store_xstateregs): Likewise.
* linux-nat.c (have_ptrace_getregset): Change its type to
enum tribool.
* linux-nat.h (tribool): New enum.
* x86-linux-nat.c (x86_linux_read_description): Use enum tribool.
Check whether have_ptrace_getregset is TRIBOOL_TRUE.
This commit allows GDB to access executables and shared libraries
on native Linux targets where GDB and the inferior have different
mount namespaces.
gdb/ChangeLog:
* linux-nat.c (nat/linux-namespaces.h): New include.
(fileio.h): Likewise.
(linux_nat_filesystem_is_local): New function.
(linux_nat_fileio_pid_of): Likewise.
(linux_nat_fileio_open): Likewise.
(linux_nat_fileio_readlink): Likewise.
(linux_nat_fileio_unlink): Likewise.
(linux_nat_add_target): Initialize to_filesystem_is_local,
to_fileio_open, to_fileio_readlink and to_fileio_unlink.
(_initialize_linux_nat): New "set/show debug linux-namespaces"
commands.
* NEWS: Mention new "set/show debug linux-namespaces" commands.
gdb/doc/ChangeLog:
* gdb.texinfo (Debugging Output): Document the "set/show debug
linux-namespaces" command.