linux-nat.c: better starvation avoidance, handle non-stop mode too

Running the testsuite with a series that reimplements user-visible
all-stop behavior on top of a target running in non-stop mode revealed
problems related to event starvation avoidance.

For example, I see
gdb.threads/signal-while-stepping-over-bp-other-thread.exp failing.
What happens is that GDB core never gets to see the signal event.  It
ends up processing the events for the same threads over and over,
because Linux's waitpid(-1, ...) returns the first task in the task
list that has an event, starving threads on the tail of the task list.

So I wrote a non-stop mode test originally inspired by
signal-while-stepping-over-bp-other-thread.exp, to stress this
independently of all-stop on top of non-stop.  Fixing it required the
changes described below.  The test will be added in a following
commit.

1) linux-nat.c has code in place that picks an event LWP at random
out of all that have had events.  This is because on the kernel side,
"waitpid(-1, ...)" just walks the task list linearly looking for the
first that had an event.  But, this code is currently only used in
all-stop mode.  So with a multi-threaded program that has multiple
threads triggering debug events in parallel, GDB ends up starving some
threads.

To make the event randomization work in non-stop mode too, the patch
makes us pull out all the already pending events on the kernel side,
with waitpid, before deciding which LWP to report to the core.

There's some code in linux_wait that takes care of leaving events
pending if they were for LWPs the caller is not interested in.  The
patch moves that to linux_nat_filter_event, so that we only have one
place that leaves events pending.

With that in place, conceptually, the flow is simpler and more
normalized:

 #1 - walk the LWP list looking for an LWP with a pending event to
      report.

 #2 - if no pending event, pull events out of the kernel, and store
      them in the LWP structures as pending.

 #3 - goto #1.
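The flow in #1 to #3 can be sketched in isolation.  What follows is a
minimal illustration, not the linux-nat.c code: struct toy_lwp, struct
toy_event, toy_pull_events and toy_select_event_lwp are hypothetical
names, and the kernel's event queue is modelled as a plain array so
that the sketch runs without ptrace.  The random pick reuses the
scaling expression that select_event_lwp already has.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical, simplified stand-in for GDB's struct lwp_info.  */
struct toy_lwp
{
  int lwpid;
  int resumed;	/* Nonzero if the core wants events from this LWP.  */
  int status;	/* Pending wait status; 0 means "nothing pending".
		   (The real code also needs lp->waitstatus, because
		   W_EXITCODE(0,0) == 0.)  */
};

/* Hypothetical model of one event sitting in the kernel's queue.  */
struct toy_event
{
  int lwpid;
  int status;
};

/* Step #2: pull every queued event and store it as pending in the
   matching LWP.  In a real target this would be a loop around
   waitpid (-1, &status, WNOHANG); here the queue is an array.  */
static void
toy_pull_events (struct toy_lwp *lwps, size_t n_lwps,
		 const struct toy_event *events, size_t n_events)
{
  for (size_t e = 0; e < n_events; e++)
    for (size_t i = 0; i < n_lwps; i++)
      if (lwps[i].lwpid == events[e].lwpid)
	lwps[i].status = events[e].status;
}

/* Step #1: pick uniformly at random among resumed LWPs that have any
   kind of event pending.  Returns NULL if there is none.  */
static struct toy_lwp *
toy_select_event_lwp (struct toy_lwp *lwps, size_t n_lwps)
{
  int num_events = 0;
  int selector;

  for (size_t i = 0; i < n_lwps; i++)
    if (lwps[i].resumed && lwps[i].status != 0)
      num_events++;

  if (num_events == 0)
    return NULL;

  /* Scale rand () into [0, num_events).  */
  selector = (int) ((num_events * (double) rand ()) / (RAND_MAX + 1.0));
  assert (selector >= 0 && selector < num_events);

  for (size_t i = 0; i < n_lwps; i++)
    if (lwps[i].resumed && lwps[i].status != 0 && selector-- == 0)
      return &lwps[i];

  return NULL;
}
```

A caller would alternate the two steps: select; if that returns NULL,
pull events and select again.  Because every pending event counts, an
LWP at the tail of the list has the same chance of being reported as
one at the head.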
2) Then, currently the event randomization code only considers SIGTRAP
(or trap-like) events.  That means that if, e.g., we have multiple
threads stepping in parallel that hit a breakpoint that needs stepping
over, and one gets a signal, the signal may end up never getting
processed, because GDB will always be giving priority to the SIGTRAPs.
The patch fixes this by making the randomization code consider all
kinds of pending events.

3) If multiple threads hit a breakpoint, we report one of those, and
"cancel" the others.  Cancelling means decrementing the PC, and
discarding the event.  If the next time the LWP is resumed the
breakpoint is still installed, the LWP should hit it again, and we'll
report the hit then.  The problem I found is that this delays threads
from advancing too much, with the kernel potentially ending up
scheduling the same threads over and over, and others not advancing.

So the patch switches away from cancelling the breakpoints, and
instead remembers that the LWP had stopped for a breakpoint.  If on
resume the breakpoint is still installed, we report it.  If it's no
longer installed, we discard the pending event then.  This is actually
how GDBserver used to handle this before d50171e4 (Teach linux
gdbserver to step-over-breakpoints), but with the difference that back
then we'd delay adjusting the PC until resuming, which made it so that
"info threads" could wrongly see threads with unadjusted PCs.

gdb/
2015-01-09  Pedro Alves  <palves@redhat.com>

	* breakpoint.c (hardware_breakpoint_inserted_here_p): New
	function.
	* breakpoint.h (hardware_breakpoint_inserted_here_p): New
	declaration.
	* linux-nat.c (linux_nat_status_is_event): Move higher up in file.
	(linux_resume_one_lwp): Store the thread's PC.  Adjust to clear
	stop_reason.
	(check_stopped_by_watchpoint): New function.
	(save_sigtrap): Reimplement.
	(linux_nat_stopped_by_watchpoint): Adjust.
	(linux_nat_lp_status_is_event): Delete.
	(stop_wait_callback): Only call save_sigtrap after storing the
	pending status.
	(status_callback): If the thread had been stopped for a breakpoint
	that has since been removed, discard the event and resume the LWP.
	(count_events_callback, select_event_lwp_callback): Use
	lwp_status_pending_p instead of linux_nat_lp_status_is_event.
	(cancel_breakpoint): Rename to ...
	(check_stopped_by_breakpoint): ... this.  Record whether the LWP
	stopped for a software breakpoint or hardware breakpoint.
	(select_event_lwp): Only give preference to the stepping LWP in
	all-stop mode.  Adjust comments.
	(stop_and_resume_callback): Remove references to new_pending_p.
	(linux_nat_filter_event): Likewise.  Leave exit events of the
	leader thread pending here.  Handle signal short circuiting here.
	Only call save_sigtrap after storing the pending waitstatus.
	(linux_nat_wait_1): Remove 'retry' label.  Remove references to
	new_pending.  Don't handle leaving events the caller is not
	interested in pending here, nor handle signal short-circuiting
	here.  Also give equal priority to all LWPs that have had events
	in non-stop mode.  If reporting a software breakpoint event,
	unadjust the LWP's PC.
	* linux-nat.h (enum lwp_stop_reason): New.
	(struct lwp_info) <stop_pc>: New field.
	(struct lwp_info) <stopped_by_watchpoint>: Delete field.
	(struct lwp_info) <stop_reason>: New field.
	* x86-linux-nat.c (x86_linux_prepare_to_resume): Adjust.
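
Item 3 of the message (remember why the LWP stopped instead of
cancelling the breakpoint hit) can also be sketched in isolation.
This is a simplified model under stated assumptions, not the patch
itself: struct toy_thread, enum toy_stop_reason and the toy_*
functions are hypothetical names, breakpoints are a plain array of
addresses, and decr_pc stands in for whatever the architecture
subtracts from the PC after a software breakpoint trap.  The real
status_callback also resumes the LWP when it discards an event; the
sketch only clears the pending status.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical analogue of the stop reasons the patch records.  */
enum toy_stop_reason
{
  TOY_STOPPED_BY_NO_REASON,
  TOY_STOPPED_BY_SW_BREAKPOINT
};

/* Hypothetical, simplified thread state.  */
struct toy_thread
{
  unsigned long pc;		/* Current program counter.  */
  unsigned long stop_pc;	/* PC recorded when the stop was saved.  */
  enum toy_stop_reason stop_reason;
  int status;			/* Pending wait status; 0 means none.  */
};

/* Return nonzero if a breakpoint is currently inserted at PC.  */
static int
toy_breakpoint_inserted_at (const unsigned long *bps, size_t n_bps,
			    unsigned long pc)
{
  for (size_t i = 0; i < n_bps; i++)
    if (bps[i] == pc)
      return 1;
  return 0;
}

/* When the trap is first seen: if a breakpoint explains it, back up
   the PC right away and remember the reason.  The event itself stays
   pending.  Returns nonzero if a breakpoint explained the stop.  */
static int
toy_record_breakpoint_stop (struct toy_thread *t, const unsigned long *bps,
			    size_t n_bps, unsigned long decr_pc)
{
  unsigned long sw_bp_pc = t->pc - decr_pc;

  if (!toy_breakpoint_inserted_at (bps, n_bps, sw_bp_pc))
    return 0;

  t->pc = sw_bp_pc;
  t->stop_pc = sw_bp_pc;
  t->stop_reason = TOY_STOPPED_BY_SW_BREAKPOINT;
  return 1;
}

/* When about to report: discard the pending event if the PC moved or
   the breakpoint has been removed since.  Returns nonzero if the
   event should still be reported.  */
static int
toy_status_still_interesting (struct toy_thread *t, const unsigned long *bps,
			      size_t n_bps)
{
  if (t->status == 0)
    return 0;

  if (t->stop_reason == TOY_STOPPED_BY_SW_BREAKPOINT
      && (t->pc != t->stop_pc
	  || !toy_breakpoint_inserted_at (bps, n_bps, t->pc)))
    {
      t->status = 0;
      t->stop_reason = TOY_STOPPED_BY_NO_REASON;
      return 0;
    }

  return 1;
}
```

Adjusting the PC at record time rather than at resume time is what
keeps a thread listing from seeing an unadjusted PC while the event
is still pending.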
2015-01-07 12:48:32 +00:00
commit 9c02b52532
parent 8af756ef81
6 changed files with 392 additions and 303 deletions
--- a/gdb/ChangeLog
+++ b/gdb/ChangeLog
@@ -1,3 +1,43 @@
+2015-01-09  Pedro Alves  <palves@redhat.com>
+
+	* breakpoint.c (hardware_breakpoint_inserted_here_p): New
+	function.
+	* breakpoint.h (hardware_breakpoint_inserted_here_p): New
+	declaration.
+	* linux-nat.c (linux_nat_status_is_event): Move higher up in file.
+	(linux_resume_one_lwp): Store the thread's PC.  Adjust to clear
+	stop_reason.
+	(check_stopped_by_watchpoint): New function.
+	(save_sigtrap): Reimplement.
+	(linux_nat_stopped_by_watchpoint): Adjust.
+	(linux_nat_lp_status_is_event): Delete.
+	(stop_wait_callback): Only call save_sigtrap after storing the
+	pending status.
+	(status_callback): If the thread had been stopped for a breakpoint
+	that has since been removed, discard the event and resume the LWP.
+	(count_events_callback, select_event_lwp_callback): Use
+	lwp_status_pending_p instead of linux_nat_lp_status_is_event.
+	(cancel_breakpoint): Rename to ...
+	(check_stopped_by_breakpoint): ... this.  Record whether the LWP
+	stopped for a software breakpoint or hardware breakpoint.
+	(select_event_lwp): Only give preference to the stepping LWP in
+	all-stop mode.  Adjust comments.
+	(stop_and_resume_callback): Remove references to new_pending_p.
+	(linux_nat_filter_event): Likewise.  Leave exit events of the
+	leader thread pending here.  Handle signal short circuiting here.
+	Only call save_sigtrap after storing the pending waitstatus.
+	(linux_nat_wait_1): Remove 'retry' label.  Remove references to
+	new_pending.  Don't handle leaving events the caller is not
+	interested in pending here, nor handle signal short-circuiting
+	here.  Also give equal priority to all LWPs that have had events
+	in non-stop mode.  If reporting a software breakpoint event,
+	unadjust the LWP's PC.
+	* linux-nat.h (enum lwp_stop_reason): New.
+	(struct lwp_info) <stop_pc>: New field.
+	(struct lwp_info) <stopped_by_watchpoint>: Delete field.
+	(struct lwp_info) <stop_reason>: New field.
+	* x86-linux-nat.c (x86_linux_prepare_to_resume): Adjust.
+
 2015-01-09  Pedro Alves  <palves@redhat.com>

 	* linux-nat.c (linux_handle_extended_wait) <PTRACE_EVENT_EXEC>:
--- a/gdb/breakpoint.c
+++ b/gdb/breakpoint.c
@@ -4293,6 +4293,29 @@ software_breakpoint_inserted_here_p (struct address_space *aspace,
  return 0;
 }

+/* See breakpoint.h.  */
+
+int
+hardware_breakpoint_inserted_here_p (struct address_space *aspace,
+				     CORE_ADDR pc)
+{
+  struct bp_location **blp, **blp_tmp = NULL;
+  struct bp_location *bl;
+
+  ALL_BP_LOCATIONS_AT_ADDR (blp, blp_tmp, pc)
+    {
+      struct bp_location *bl = *blp;
+
+      if (bl->loc_type != bp_loc_hardware_breakpoint)
+	continue;
+
+      if (bp_location_inserted_here_p (bl, aspace, pc))
+	return 1;
+    }
+
+  return 0;
+}
+
 int
 hardware_watchpoint_inserted_in_range (struct address_space *aspace,
 				       CORE_ADDR addr, ULONGEST len)
--- a/gdb/breakpoint.h
+++ b/gdb/breakpoint.h
@@ -1130,6 +1130,11 @@ extern int regular_breakpoint_inserted_here_p (struct address_space *,
 extern int software_breakpoint_inserted_here_p (struct address_space *, 
 						CORE_ADDR);

+/* Return non-zero iff there is a hardware breakpoint inserted at
+   PC.  */
+extern int hardware_breakpoint_inserted_here_p (struct address_space *,
+						CORE_ADDR);
+
 /* Check whether any location of BP is inserted at PC.  */

 extern int breakpoint_has_location_inserted_here (struct breakpoint *bp,
--- a/gdb/linux-nat.c
+++ b/gdb/linux-nat.c
@@ -279,6 +279,10 @@ static struct lwp_info *find_lwp_pid (ptid_t ptid);

 static int lwp_status_pending_p (struct lwp_info *lp);

+static int check_stopped_by_breakpoint (struct lwp_info *lp);
+static int sigtrap_is_event (int status);
+static int (*linux_nat_status_is_event) (int status) = sigtrap_is_event;
+

 /* Trivial list manipulation functions to keep track of a list of
   new stopped processes.  */
@@ -1519,12 +1523,25 @@ linux_resume_one_lwp (struct lwp_info *lp, int step, enum gdb_signal signo)
  ptid_t ptid;

  lp->step = step;
+
+  /* stop_pc doubles as the PC the LWP had when it was last resumed.
+     We only presently need that if the LWP is stepped though (to
+     handle the case of stepping a breakpoint instruction).  */
+  if (step)
+    {
+      struct regcache *regcache = get_thread_regcache (lp->ptid);
+
+      lp->stop_pc = regcache_read_pc (regcache);
+    }
+  else
+    lp->stop_pc = 0;
+
  if (linux_nat_prepare_to_resume != NULL)
    linux_nat_prepare_to_resume (lp);
  /* Convert to something the lower layer understands.  */
  ptid = pid_to_ptid (ptid_get_lwp (lp->ptid));
  linux_ops->to_resume (linux_ops, ptid, step, signo);
-  lp->stopped_by_watchpoint = 0;
+  lp->stop_reason = LWP_STOPPED_BY_NO_REASON;
  lp->stopped = 0;
  registers_changed_ptid (lp->ptid);
 }
@@ -2374,24 +2391,21 @@ maybe_clear_ignore_sigint (struct lwp_info *lp)
   soon as we see LP stop with a SIGTRAP.  If GDB changes the debug
   registers meanwhile, we have the cached data we can rely on.  */

-static void
-save_sigtrap (struct lwp_info *lp)
+static int
+check_stopped_by_watchpoint (struct lwp_info *lp)
 {
  struct cleanup *old_chain;

  if (linux_ops->to_stopped_by_watchpoint == NULL)
-    {
-      lp->stopped_by_watchpoint = 0;
-      return;
-    }
+    return 0;

  old_chain = save_inferior_ptid ();
  inferior_ptid = lp->ptid;

-  lp->stopped_by_watchpoint = linux_ops->to_stopped_by_watchpoint (linux_ops);
-
-  if (lp->stopped_by_watchpoint)
+  if (linux_ops->to_stopped_by_watchpoint (linux_ops))
    {
+      lp->stop_reason = LWP_STOPPED_BY_WATCHPOINT;
+
      if (linux_ops->to_stopped_data_address != NULL)
 	lp->stopped_data_address_p =
 	  linux_ops->to_stopped_data_address (&current_target,
@@ -2401,9 +2415,27 @@ save_sigtrap (struct lwp_info *lp)
    }

  do_cleanups (old_chain);
+
+  return lp->stop_reason == LWP_STOPPED_BY_WATCHPOINT;
 }

-/* See save_sigtrap.  */
+/* Called when the LWP stopped for a trap that could be explained by a
+   watchpoint or a breakpoint.  */
+
+static void
+save_sigtrap (struct lwp_info *lp)
+{
+  gdb_assert (lp->stop_reason == LWP_STOPPED_BY_NO_REASON);
+  gdb_assert (lp->status != 0);
+
+  if (check_stopped_by_watchpoint (lp))
+    return;
+
+  if (linux_nat_status_is_event (lp->status))
+    check_stopped_by_breakpoint (lp);
+}
+
+/* Returns true if the LWP had stopped for a watchpoint.  */

 static int
 linux_nat_stopped_by_watchpoint (struct target_ops *ops)
@@ -2412,7 +2444,7 @@ linux_nat_stopped_by_watchpoint (struct target_ops *ops)

  gdb_assert (lp != NULL);

-  return lp->stopped_by_watchpoint;
+  return lp->stop_reason == LWP_STOPPED_BY_WATCHPOINT;
 }

 static int
@@ -2435,24 +2467,6 @@ sigtrap_is_event (int status)
  return WIFSTOPPED (status) && WSTOPSIG (status) == SIGTRAP;
 }

-/* SIGTRAP-like events recognizer.  */
-
-static int (*linux_nat_status_is_event) (int status) = sigtrap_is_event;
-
-/* Check for SIGTRAP-like events in LP.  */
-
-static int
-linux_nat_lp_status_is_event (struct lwp_info *lp)
-{
-  /* We check for lp->waitstatus in addition to lp->status, because we can
-     have pending process exits recorded in lp->status
-     and W_EXITCODE(0,0) == 0.  We should probably have an additional
-     lp->status_p flag.  */
-
-  return (lp->waitstatus.kind == TARGET_WAITKIND_IGNORE
-	  && linux_nat_status_is_event (lp->status));
-}
-
 /* Set alternative SIGTRAP-like events recognizer.  If
   breakpoint_inserted_here_p there then gdbarch_decr_pc_after_break will be
   applied.  */
@@ -2508,8 +2522,6 @@ stop_wait_callback (struct lwp_info *lp, void *data)
 	{
 	  /* The thread was stopped with a signal other than SIGSTOP.  */

-	  save_sigtrap (lp);
-
 	  if (debug_linux_nat)
 	    fprintf_unfiltered (gdb_stdlog,
 				"SWC: Pending event %s in %s\n",
@@ -2519,6 +2531,7 @@ stop_wait_callback (struct lwp_info *lp, void *data)
 	  /* Save the sigtrap event.  */
 	  lp->status = status;
 	  gdb_assert (lp->signalled);
+	  save_sigtrap (lp);
 	}
      else
 	{
@@ -2539,7 +2552,9 @@ stop_wait_callback (struct lwp_info *lp, void *data)
  return 0;
 }

-/* Return non-zero if LP has a wait status pending.  */
+/* Return non-zero if LP has a wait status pending.  Discard the
+   pending event and resume the LWP if the event that originally
+   caused the stop became uninteresting.  */

 static int
 status_callback (struct lwp_info *lp, void *data)
@@ -2549,6 +2564,53 @@ status_callback (struct lwp_info *lp, void *data)
  if (!lp->resumed)
    return 0;

+  if (lp->stop_reason == LWP_STOPPED_BY_SW_BREAKPOINT
+      || lp->stop_reason == LWP_STOPPED_BY_HW_BREAKPOINT)
+    {
+      struct regcache *regcache = get_thread_regcache (lp->ptid);
+      struct gdbarch *gdbarch = get_regcache_arch (regcache);
+      CORE_ADDR pc;
+      int discard = 0;
+
+      gdb_assert (lp->status != 0);
+
+      pc = regcache_read_pc (regcache);
+
+      if (pc != lp->stop_pc)
+	{
+	  if (debug_linux_nat)
+	    fprintf_unfiltered (gdb_stdlog,
+				"SC: PC of %s changed.  was=%s, now=%s\n",
+				target_pid_to_str (lp->ptid),
+				paddress (target_gdbarch (), lp->stop_pc),
+				paddress (target_gdbarch (), pc));
+	  discard = 1;
+	}
+      else if (!breakpoint_inserted_here_p (get_regcache_aspace (regcache), pc))
+	{
+	  if (debug_linux_nat)
+	    fprintf_unfiltered (gdb_stdlog,
+				"SC: previous breakpoint of %s, at %s gone\n",
+				target_pid_to_str (lp->ptid),
+				paddress (target_gdbarch (), lp->stop_pc));
+
+	  discard = 1;
+	}
+
+      if (discard)
+	{
+	  if (debug_linux_nat)
+	    fprintf_unfiltered (gdb_stdlog,
+				"SC: pending event of %s cancelled.\n",
+				target_pid_to_str (lp->ptid));
+
+	  lp->status = 0;
+	  linux_resume_one_lwp (lp, lp->step, GDB_SIGNAL_0);
+	  return 0;
+	}
+      return 1;
+    }
+
  return lwp_status_pending_p (lp);
 }

@@ -2570,8 +2632,8 @@ count_events_callback (struct lwp_info *lp, void *data)

  gdb_assert (count != NULL);

-  /* Count only resumed LWPs that have a SIGTRAP event pending.  */
-  if (lp->resumed && linux_nat_lp_status_is_event (lp))
+  /* Select only resumed LWPs that have an event pending.  */
+  if (lp->resumed && lwp_status_pending_p (lp))
    (*count)++;

  return 0;
@@ -2609,16 +2671,19 @@ select_event_lwp_callback (struct lwp_info *lp, void *data)

  gdb_assert (selector != NULL);

-  /* Select only resumed LWPs that have a SIGTRAP event pending.  */
-  if (lp->resumed && linux_nat_lp_status_is_event (lp))
+  /* Select only resumed LWPs that have an event pending.  */
+  if (lp->resumed && lwp_status_pending_p (lp))
    if ((*selector)-- == 0)
      return 1;

  return 0;
 }

+/* Called when the LWP got a signal/trap that could be explained by a
+   software or hardware breakpoint.  */
+
 static int
-cancel_breakpoint (struct lwp_info *lp)
+check_stopped_by_breakpoint (struct lwp_info *lp)
 {
  /* Arrange for a breakpoint to be hit again later.  We don't keep
     the SIGTRAP status and don't forward the SIGTRAP signal to the
@@ -2632,48 +2697,42 @@ cancel_breakpoint (struct lwp_info *lp)
  struct regcache *regcache = get_thread_regcache (lp->ptid);
  struct gdbarch *gdbarch = get_regcache_arch (regcache);
  CORE_ADDR pc;
+  CORE_ADDR sw_bp_pc;

-  pc = regcache_read_pc (regcache) - target_decr_pc_after_break (gdbarch);
-  if (breakpoint_inserted_here_p (get_regcache_aspace (regcache), pc))
+  pc = regcache_read_pc (regcache);
+  sw_bp_pc = pc - target_decr_pc_after_break (gdbarch);
+
+  if ((!lp->step || lp->stop_pc == sw_bp_pc)
+      && software_breakpoint_inserted_here_p (get_regcache_aspace (regcache),
+					      sw_bp_pc))
    {
+      /* The LWP was either continued, or stepped a software
+	 breakpoint instruction.  */
      if (debug_linux_nat)
 	fprintf_unfiltered (gdb_stdlog,
-			    "CB: Push back breakpoint for %s\n",
+			    "CB: Push back software breakpoint for %s\n",
 			    target_pid_to_str (lp->ptid));

      /* Back up the PC if necessary.  */
-      if (target_decr_pc_after_break (gdbarch))
-	regcache_write_pc (regcache, pc);
+      if (pc != sw_bp_pc)
+	regcache_write_pc (regcache, sw_bp_pc);

+      lp->stop_pc = sw_bp_pc;
+      lp->stop_reason = LWP_STOPPED_BY_SW_BREAKPOINT;
      return 1;
    }
-  return 0;
-}

-static int
-cancel_breakpoints_callback (struct lwp_info *lp, void *data)
-{
-  struct lwp_info *event_lp = data;
+  if (hardware_breakpoint_inserted_here_p (get_regcache_aspace (regcache), pc))
+    {
+      if (debug_linux_nat)
+	fprintf_unfiltered (gdb_stdlog,
+			    "CB: Push back hardware breakpoint for %s\n",
+			    target_pid_to_str (lp->ptid));

-  /* Leave the LWP that has been elected to receive a SIGTRAP alone.  */
-  if (lp == event_lp)
-    return 0;
-
-  /* If a LWP other than the LWP that we're reporting an event for has
-     hit a GDB breakpoint (as opposed to some random trap signal),
-     then just arrange for it to hit it again later.  We don't keep
-     the SIGTRAP status and don't forward the SIGTRAP signal to the
-     LWP.  We will handle the current event, eventually we will resume
-     all LWPs, and this one will get its breakpoint trap again.
-
-     If we do not do this, then we run the risk that the user will
-     delete or disable the breakpoint, but the LWP will have already
-     tripped on it.  */
-
-  if (linux_nat_lp_status_is_event (lp)
-      && cancel_breakpoint (lp))
-    /* Throw away the SIGTRAP.  */
-    lp->status = 0;
+      lp->stop_pc = pc;
+      lp->stop_reason = LWP_STOPPED_BY_HW_BREAKPOINT;
+      return 1;
+    }

  return 0;
 }
@@ -2685,36 +2744,48 @@ select_event_lwp (ptid_t filter, struct lwp_info **orig_lp, int *status)
 {
  int num_events = 0;
  int random_selector;
-  struct lwp_info *event_lp;
+  struct lwp_info *event_lp = NULL;

  /* Record the wait status for the original LWP.  */
  (*orig_lp)->status = *status;

-  /* Give preference to any LWP that is being single-stepped.  */
-  event_lp = iterate_over_lwps (filter,
-				select_singlestep_lwp_callback, NULL);
-  if (event_lp != NULL)
+  /* In all-stop, give preference to the LWP that is being
+     single-stepped.  There will be at most one, and it will be the
+     LWP that the core is most interested in.  If we didn't do this,
+     then we'd have to handle pending step SIGTRAPs somehow in case
+     the core later continues the previously-stepped thread, as
+     otherwise we'd report the pending SIGTRAP then, and the core, not
+     having stepped the thread, wouldn't understand what the trap was
+     for, and therefore would report it to the user as a random
+     signal.  */
+  if (!non_stop)
    {
-      if (debug_linux_nat)
-	fprintf_unfiltered (gdb_stdlog,
-			    "SEL: Select single-step %s\n",
-			    target_pid_to_str (event_lp->ptid));
+      event_lp = iterate_over_lwps (filter,
+				    select_singlestep_lwp_callback, NULL);
+      if (event_lp != NULL)
+	{
+	  if (debug_linux_nat)
+	    fprintf_unfiltered (gdb_stdlog,
+				"SEL: Select single-step %s\n",
+				target_pid_to_str (event_lp->ptid));
+	}
    }
-  else
-    {
-      /* No single-stepping LWP.  Select one at random, out of those
-         which have had SIGTRAP events.  */

-      /* First see how many SIGTRAP events we have.  */
+  if (event_lp == NULL)
+    {
+      /* Pick one at random, out of those which have had events.  */
+
+      /* First see how many events we have.  */
      iterate_over_lwps (filter, count_events_callback, &num_events);

-      /* Now randomly pick a LWP out of those that have had a SIGTRAP.  */
+      /* Now randomly pick a LWP out of those that have had
+	 events.  */
      random_selector = (int)
 	((num_events * (double) rand ()) / (RAND_MAX + 1.0));

      if (debug_linux_nat && num_events > 1)
 	fprintf_unfiltered (gdb_stdlog,
-			    "SEL: Found %d SIGTRAP events, selecting #%d\n",
+			    "SEL: Found %d events, selecting #%d\n",
 			    num_events, random_selector);

      event_lp = iterate_over_lwps (filter,
@@ -2748,8 +2819,6 @@ resumed_callback (struct lwp_info *lp, void *data)
 static int
 stop_and_resume_callback (struct lwp_info *lp, void *data)
 {
-  int *new_pending_p = data;
-
  if (!lp->stopped)
    {
      ptid_t ptid = lp->ptid;
@@ -2790,8 +2859,6 @@ stop_and_resume_callback (struct lwp_info *lp, void *data)
 				    "SARC: not re-resuming LWP %ld "
 				    "(has pending)\n",
 				    ptid_get_lwp (lp->ptid));
-	      if (new_pending_p)
-		*new_pending_p = 1;
 	    }
 	}
    }
@@ -2799,18 +2866,14 @@ stop_and_resume_callback (struct lwp_info *lp, void *data)
 }

 /* Check if we should go on and pass this event to common code.
-   Return the affected lwp if we are, or NULL otherwise.  If we stop
-   all lwps temporarily, we may end up with new pending events in some
-   other lwp.  In that case set *NEW_PENDING_P to true.  */
+   Return the affected lwp if we are, or NULL otherwise.  */

 static struct lwp_info *
-linux_nat_filter_event (int lwpid, int status, int *new_pending_p)
+linux_nat_filter_event (int lwpid, int status)
 {
  struct lwp_info *lp;
  int event = linux_ptrace_get_extended_event (status);

-  *new_pending_p = 0;
-
  lp = find_lwp_pid (pid_to_ptid (lwpid));

  /* Check for stop events reported by a process we didn't already
@@ -2890,42 +2953,62 @@ linux_nat_filter_event (int lwpid, int status, int *new_pending_p)
 	return NULL;
    }

-  if (linux_nat_status_is_event (status))
-    save_sigtrap (lp);
-
  /* Check if the thread has exited.  */
-  if ((WIFEXITED (status) || WIFSIGNALED (status))
-      && num_lwps (ptid_get_pid (lp->ptid)) > 1)
+  if (WIFEXITED (status) || WIFSIGNALED (status))
    {
-      /* If this is the main thread, we must stop all threads and verify
-	 if they are still alive.  This is because in the nptl thread model
-	 on Linux 2.4, there is no signal issued for exiting LWPs
-	 other than the main thread.  We only get the main thread exit
-	 signal once all child threads have already exited.  If we
-	 stop all the threads and use the stop_wait_callback to check
-	 if they have exited we can determine whether this signal
-	 should be ignored or whether it means the end of the debugged
-	 application, regardless of which threading model is being
-	 used.  */
-      if (ptid_get_pid (lp->ptid) == ptid_get_lwp (lp->ptid))
+      if (num_lwps (ptid_get_pid (lp->ptid)) > 1)
 	{
-	  iterate_over_lwps (pid_to_ptid (ptid_get_pid (lp->ptid)),
-			     stop_and_resume_callback, new_pending_p);
+	  /* If this is the main thread, we must stop all threads and
+	     verify if they are still alive.  This is because in the
+	     nptl thread model on Linux 2.4, there is no signal issued
+	     for exiting LWPs other than the main thread.  We only get
+	     the main thread exit signal once all child threads have
+	     already exited.  If we stop all the threads and use the
+	     stop_wait_callback to check if they have exited we can
+	     determine whether this signal should be ignored or
+	     whether it means the end of the debugged application,
+	     regardless of which threading model is being used.  */
+	  if (ptid_get_pid (lp->ptid) == ptid_get_lwp (lp->ptid))
+	    {
+	      iterate_over_lwps (pid_to_ptid (ptid_get_pid (lp->ptid)),
+				 stop_and_resume_callback, NULL);
+	    }
+
+	  if (debug_linux_nat)
+	    fprintf_unfiltered (gdb_stdlog,
+				"LLW: %s exited.\n",
+				target_pid_to_str (lp->ptid));
+
+	  if (num_lwps (ptid_get_pid (lp->ptid)) > 1)
+	    {
+	      /* If there is at least one more LWP, then the exit signal
+		 was not the end of the debugged application and should be
+		 ignored.  */
+	      exit_lwp (lp);
+	      return NULL;
+	    }
 	}

+      gdb_assert (lp->resumed);
+
      if (debug_linux_nat)
 	fprintf_unfiltered (gdb_stdlog,
-			    "LLW: %s exited.\n",
-			    target_pid_to_str (lp->ptid));
+			    "Process %ld exited\n",
+			    ptid_get_lwp (lp->ptid));

-      if (num_lwps (ptid_get_pid (lp->ptid)) > 1)
-       {
-	 /* If there is at least one more LWP, then the exit signal
-	    was not the end of the debugged application and should be
-	    ignored.  */
-	 exit_lwp (lp);
-	 return NULL;
-       }
+      /* This was the last lwp in the process.  Since events are
+	 serialized to GDB core, we may not be able report this one
+	 right now, but GDB core and the other target layers will want
+	 to be notified about the exit code/signal, leave the status
+	 pending for the next time we're able to report it.  */
+
+      /* Dead LWP's aren't expected to reported a pending sigstop.  */
+      lp->signalled = 0;
+
+      /* Store the pending event in the waitstatus, because
+	 W_EXITCODE(0,0) == 0.  */
+      store_waitstatus (&lp->waitstatus, status);
+      return lp;
    }

  /* Check if the current LWP has previously exited.  In the nptl
@@ -3007,9 +3090,59 @@ linux_nat_filter_event (int lwpid, int status, int *new_pending_p)
      return NULL;
    }

+  /* Don't report signals that GDB isn't interested in, such as
+     signals that are neither printed nor stopped upon.  Stopping all
+     threads can be a bit time-consuming so if we want decent
+     performance with heavily multi-threaded programs, especially when
+     they're using a high frequency timer, we'd better avoid it if we
+     can.  */
+  if (WIFSTOPPED (status))
+    {
+      enum gdb_signal signo = gdb_signal_from_host (WSTOPSIG (status));
+
+      if (!non_stop)
+	{
+	  /* Only do the below in all-stop, as we currently use SIGSTOP
+	     to implement target_stop (see linux_nat_stop) in
+	     non-stop.  */
+	  if (signo == GDB_SIGNAL_INT && signal_pass_state (signo) == 0)
+	    {
+	      /* If ^C/BREAK is typed at the tty/console, SIGINT gets
+		 forwarded to the entire process group, that is, all LWPs
+		 will receive it - unless they're using CLONE_THREAD to
+		 share signals.  Since we only want to report it once, we
+		 mark it as ignored for all LWPs except this one.  */
+	      iterate_over_lwps (pid_to_ptid (ptid_get_pid (lp->ptid)),
+					      set_ignore_sigint, NULL);
+	      lp->ignore_sigint = 0;
+	    }
+	  else
+	    maybe_clear_ignore_sigint (lp);
+	}
+
+      /* When using hardware single-step, we need to report every signal.
+	 Otherwise, signals in pass_mask may be short-circuited.  */
+      if (!lp->step
+	  && WSTOPSIG (status) && sigismember (&pass_mask, WSTOPSIG (status)))
+	{
+	  linux_resume_one_lwp (lp, lp->step, signo);
+	  if (debug_linux_nat)
+	    fprintf_unfiltered (gdb_stdlog,
+				"LLW: %s %s, %s (preempt 'handle')\n",
+				lp->step ?
+				"PTRACE_SINGLESTEP" : "PTRACE_CONT",
+				target_pid_to_str (lp->ptid),
+				(signo != GDB_SIGNAL_0
+				 ? strsignal (gdb_signal_to_host (signo))
+				 : "0"));
+	  return NULL;
+	}
+    }
+
  /* An interesting event.  */
  gdb_assert (lp);
  lp->status = status;
+  save_sigtrap (lp);
  return lp;
 }

@@ -3108,9 +3241,6 @@ linux_nat_wait_1 (struct target_ops *ops,
  /* Make sure SIGCHLD is blocked until the sigsuspend below.  */
  block_child_signals (&prev_mask);

-retry:
-  status = 0;
-
  /* First check if there is a LWP with a wait status pending.  */
  lp = iterate_over_lwps (ptid, status_callback, NULL);
  if (lp != NULL)
@@ -3128,7 +3258,9 @@ retry:
      set_sigint_trap ();
    }

-  /* But if we don't find a pending event, we'll have to wait.  */
+  /* But if we don't find a pending event, we'll have to wait.  Always
+     pull all events out of the kernel.  We'll randomly select an
+     event LWP out of all that have events, to prevent starvation.  */

  while (lp == NULL)
    {
@@ -3159,10 +3291,6 @@ retry:

      if (lwpid > 0)
 	{
-	  /* If this is true, then we paused LWPs momentarily, and may
-	     now have pending events to handle.  */
-	  int new_pending;
-
 	  if (debug_linux_nat)
 	    {
 	      fprintf_unfiltered (gdb_stdlog,
@@ -3170,101 +3298,18 @@ retry:
 				  (long) lwpid, status_to_str (status));
 	    }

-	  lp = linux_nat_filter_event (lwpid, status, &new_pending);
-
-	  /* STATUS is now no longer valid, use LP->STATUS instead.  */
-	  status = 0;
-
-	  if (lp && !ptid_match (lp->ptid, ptid))
-	    {
-	      gdb_assert (lp->resumed);
-
-	      if (debug_linux_nat)
-		fprintf_unfiltered (gdb_stdlog,
-				    "LWP %ld got an event %06x, "
-				    "leaving pending.\n",
-				    ptid_get_lwp (lp->ptid), lp->status);
-
-	      if (WIFSTOPPED (lp->status))
-		{
-		  if (WSTOPSIG (lp->status) != SIGSTOP)
-		    {
-		      /* Cancel breakpoint hits.  The breakpoint may
-			 be removed before we fetch events from this
-			 process to report to the core.  It is best
-			 not to assume the moribund breakpoints
-			 heuristic always handles these cases --- it
-			 could be too many events go through to the
-			 core before this one is handled.  All-stop
-			 always cancels breakpoint hits in all
-			 threads.  */
-		      if (non_stop
-			  && linux_nat_lp_status_is_event (lp)
-			  && cancel_breakpoint (lp))
-			{
-			  /* Throw away the SIGTRAP.  */
-			  lp->status = 0;
-
-			  if (debug_linux_nat)
-			    fprintf_unfiltered (gdb_stdlog,
-						"LLW: LWP %ld hit a "
-						"breakpoint while "
-						"waiting for another "
-						"process; "
-						"cancelled it\n",
-						ptid_get_lwp (lp->ptid));
-			}
-		    }
-		  else
-		    lp->signalled = 0;
-		}
-	      else if (WIFEXITED (lp->status) || WIFSIGNALED (lp->status))
-		{
-		  if (debug_linux_nat)
-		    fprintf_unfiltered (gdb_stdlog,
-					"Process %ld exited while stopping "
-					"LWPs\n",
-					ptid_get_lwp (lp->ptid));
-
-		  /* This was the last lwp in the process.  Since
-		     events are serialized to GDB core, and we can't
-		     report this one right now, but GDB core and the
-		     other target layers will want to be notified
-		     about the exit code/signal, leave the status
-		     pending for the next time we're able to report
-		     it.  */
-
-		  /* Dead LWP's aren't expected to reported a pending
-		     sigstop.  */
-		  lp->signalled = 0;
-
-		  /* Store the pending event in the waitstatus as
-		     well, because W_EXITCODE(0,0) == 0.  */
-		  store_waitstatus (&lp->waitstatus, lp->status);
-		}
-
-	      /* Keep looking.  */
-	      lp = NULL;
-	    }
-
-	  if (new_pending)
-	    {
-	      /* Some LWP now has a pending event.  Go all the way
-		 back to check it.  */
-	      goto retry;
-	    }
-
-	  if (lp)
-	    {
-	      /* We got an event to report to the core.  */
-	      break;
-	    }
-
+	  linux_nat_filter_event (lwpid, status);
 	  /* Retry until nothing comes out of waitpid.  A single
 	     SIGCHLD can indicate more than one child stopped.  */
 	  continue;
 	}

+      /* Now that we've pulled all events out of the kernel, check if
+	 there's any LWP with a status to report to the core.  */
+      lp = iterate_over_lwps (ptid, status_callback, NULL);
+      if (lp != NULL)
+	break;
+
      /* Check for zombie thread group leaders.  Those can't be reaped
 	 until all other threads in the thread group are.  */
      check_zombie_leaders ();
@@ -3314,68 +3359,6 @@ retry:
  status = lp->status;
  lp->status = 0;

-  /* Don't report signals that GDB isn't interested in, such as
-     signals that are neither printed nor stopped upon.  Stopping all
-     threads can be a bit time-consuming so if we want decent
-     performance with heavily multi-threaded programs, especially when
-     they're using a high frequency timer, we'd better avoid it if we
-     can.  */
-
-  if (WIFSTOPPED (status))
-    {
-      enum gdb_signal signo = gdb_signal_from_host (WSTOPSIG (status));
-
-      /* When using hardware single-step, we need to report every signal.
-	 Otherwise, signals in pass_mask may be short-circuited.  */
-      if (!lp->step
-	  && WSTOPSIG (status) && sigismember (&pass_mask, WSTOPSIG (status)))
-	{
-	  /* FIMXE: kettenis/2001-06-06: Should we resume all threads
-	     here?  It is not clear we should.  GDB may not expect
-	     other threads to run.  On the other hand, not resuming
-	     newly attached threads may cause an unwanted delay in
-	     getting them running.  */
-	  linux_resume_one_lwp (lp, lp->step, signo);
-	  if (debug_linux_nat)
-	    fprintf_unfiltered (gdb_stdlog,
-				"LLW: %s %s, %s (preempt 'handle')\n",
-				lp->step ?
-				"PTRACE_SINGLESTEP" : "PTRACE_CONT",
-				target_pid_to_str (lp->ptid),
-				(signo != GDB_SIGNAL_0
-				 ? strsignal (gdb_signal_to_host (signo))
-				 : "0"));
-	  goto retry;
-	}
-
-      if (!non_stop)
-	{
-	  /* Only do the below in all-stop, as we currently use SIGINT
-	     to implement target_stop (see linux_nat_stop) in
-	     non-stop.  */
-	  if (signo == GDB_SIGNAL_INT && signal_pass_state (signo) == 0)
-	    {
-	      /* If ^C/BREAK is typed at the tty/console, SIGINT gets
-		 forwarded to the entire process group, that is, all LWPs
-		 will receive it - unless they're using CLONE_THREAD to
-		 share signals.  Since we only want to report it once, we
-		 mark it as ignored for all LWPs except this one.  */
-	      iterate_over_lwps (pid_to_ptid (ptid_get_pid (ptid)),
-					      set_ignore_sigint, NULL);
-	      lp->ignore_sigint = 0;
-	    }
-	  else
-	    maybe_clear_ignore_sigint (lp);
-	}
-    }
-
-  /* This LWP is stopped now.  */
-  lp->stopped = 1;
-
-  if (debug_linux_nat)
-    fprintf_unfiltered (gdb_stdlog, "LLW: Candidate event %s in %s.\n",
-			status_to_str (status), target_pid_to_str (lp->ptid));
-
  if (!non_stop)
    {
      /* Now stop all other LWP's ...  */
@@ -3384,33 +3367,46 @@ retry:
      /* ... and wait until all of them have reported back that
 	 they're no longer running.  */
      iterate_over_lwps (minus_one_ptid, stop_wait_callback, NULL);
+    }

-      /* If we're not waiting for a specific LWP, choose an event LWP
-	 from among those that have had events.  Giving equal priority
-	 to all LWPs that have had events helps prevent
-	 starvation.  */
-      if (ptid_equal (ptid, minus_one_ptid) || ptid_is_pid (ptid))
-	select_event_lwp (ptid, &lp, &status);
+  /* If we're not waiting for a specific LWP, choose an event LWP from
+     among those that have had events.  Giving equal priority to all
+     LWPs that have had events helps prevent starvation.  */
+  if (ptid_equal (ptid, minus_one_ptid) || ptid_is_pid (ptid))
+    select_event_lwp (ptid, &lp, &status);

-      /* Now that we've selected our final event LWP, cancel any
-	 breakpoints in other LWPs that have hit a GDB breakpoint.
-	 See the comment in cancel_breakpoints_callback to find out
-	 why.  */
-      iterate_over_lwps (minus_one_ptid, cancel_breakpoints_callback, lp);
+  gdb_assert (lp != NULL);

-      /* We'll need this to determine whether to report a SIGSTOP as
-	 TARGET_WAITKIND_0.  Need to take a copy because
-	 resume_clear_callback clears it.  */
-      last_resume_kind = lp->last_resume_kind;
+  /* Now that we've selected our final event LWP, un-adjust its PC if
+     it was a software breakpoint.  */
+  if (lp->stop_reason == LWP_STOPPED_BY_SW_BREAKPOINT)
+    {
+      struct regcache *regcache = get_thread_regcache (lp->ptid);
+      struct gdbarch *gdbarch = get_regcache_arch (regcache);
+      int decr_pc = target_decr_pc_after_break (gdbarch);

+      if (decr_pc != 0)
+	{
+	  CORE_ADDR pc;
+
+	  pc = regcache_read_pc (regcache);
+	  regcache_write_pc (regcache, pc + decr_pc);
+	}
+    }
+
+  /* We'll need this to determine whether to report a SIGSTOP as
+     GDB_SIGNAL_0.  Need to take a copy because resume_clear_callback
+     clears it.  */
+  last_resume_kind = lp->last_resume_kind;
+
+  if (!non_stop)
+    {
      /* In all-stop, from the core's perspective, all LWPs are now
 	 stopped until a new resume action is sent over.  */
      iterate_over_lwps (minus_one_ptid, resume_clear_callback, NULL);
    }
  else
    {
-      /* See above.  */
-      last_resume_kind = lp->last_resume_kind;
      resume_clear_callback (lp, NULL);
    }

--- a/gdb/linux-nat.h
+++ b/gdb/linux-nat.h
@@ -23,6 +23,24 @@

 struct arch_lwp_info;

+/* Reasons an LWP last stopped.  */
+
+enum lwp_stop_reason
+{
+  /* Either not stopped, or stopped for a reason that doesn't require
+     special tracking.  */
+  LWP_STOPPED_BY_NO_REASON,
+
+  /* Stopped by a software breakpoint.  */
+  LWP_STOPPED_BY_SW_BREAKPOINT,
+
+  /* Stopped by a hardware breakpoint.  */
+  LWP_STOPPED_BY_HW_BREAKPOINT,
+
+  /* Stopped by a watchpoint.  */
+  LWP_STOPPED_BY_WATCHPOINT
+};
+
 /* Structure describing an LWP.  This is public only for the purposes
   of ALL_LWPS; target-specific code should generally not access it
   directly.  */
@@ -63,12 +81,19 @@ struct lwp_info
  /* If non-zero, a pending wait status.  */
  int status;

+  /* When 'stopped' is set, this is where the lwp last stopped, with
+     decr_pc_after_break already accounted for.  If the LWP is
+     running, and stepping, this is the address at which the lwp was
+     resumed (that is, it's the previous stop PC).  If the LWP is
+     running and not stepping, this is 0.  */
+  CORE_ADDR stop_pc;
+
  /* Non-zero if we were stepping this LWP.  */
  int step;

-  /* STOPPED_BY_WATCHPOINT is non-zero if this LWP stopped with a data
-     watchpoint trap.  */
-  int stopped_by_watchpoint;
+  /* The reason the LWP last stopped, if we need to track it
+     (breakpoint, watchpoint, etc.)  */
+  enum lwp_stop_reason stop_reason;

  /* On architectures where it is possible to know the data address of
     a triggered watchpoint, STOPPED_DATA_ADDRESS_P is non-zero, and
--- a/gdb/x86-linux-nat.c
+++ b/gdb/x86-linux-nat.c
@@ -214,7 +214,7 @@ x86_linux_prepare_to_resume (struct lwp_info *lwp)
      lwp->arch_private->debug_registers_changed = 0;
    }

-  if (clear_status || lwp->stopped_by_watchpoint)
+  if (clear_status || lwp->stop_reason == LWP_STOPPED_BY_WATCHPOINT)
    x86_linux_dr_set (lwp->ptid, DR_STATUS, 0);
 }