Commit Graph

1217 Commits

Author SHA1 Message Date
Paul E. McKenney 3fc3d1709f rcu: Move RCU CPU stall-warning code out of tree_plugin.h
The RCU CPU stall-warning code for normal grace periods is currently
scattered across two files, due to earlier Tiny RCU support for RCU
CPU stall warnings and for old Kconfig options that have long since
been retired.  Given that it is hard for the lead RCU maintainer to
find relevant stall-warning code, it would be good to consolidate it.
This commit continues this process by moving stall-warning code from
kernel/rcu/tree_plugin.c to a new kernel/rcu/tree_stall.h file.

Signed-off-by: Paul E. McKenney <paulmck@linux.ibm.com>
2019-03-26 14:40:13 -07:00
Paul E. McKenney 10462d6f58 rcu: Move RCU CPU stall-warning code out of update.c
The RCU CPU stall-warning code for normal grace periods is currently
scattered across three files, due to earlier Tiny RCU support for RCU
CPU stall warnings and for old Kconfig options that have long since
been retired.  Given that it is hard for the lead RCU maintainer to
find relevant stall-warning code, it would be good to consolidate it.
This commit starts this process by moving stall-warning code from
kernel/rcu/update.c to a new kernel/rcu/tree_stall.h file.

Note that the definitions of rcu_cpu_stall_suppress and
rcu_cpu_stall_timeout must remain in kernel/rcu/update.h to provide
compatibility for kernel boot parameter lists.

Signed-off-by: Paul E. McKenney <paulmck@linux.ibm.com>
2019-03-26 14:40:13 -07:00
Paul E. McKenney f5ad399149 srcu: Remove cleanup_srcu_struct_quiesced()
The cleanup_srcu_struct_quiesced() function was added because NVME
used WQ_MEM_RECLAIM workqueues and SRCU did not, which meant that
NVME workqueues waiting on SRCU workqueues could result in deadlocks
during low-memory conditions.  However, SRCU now also has WQ_MEM_RECLAIM
workqueues, so there is no longer a potential for deadlock.  Furthermore,
it turns out to be extremely hard to use cleanup_srcu_struct_quiesced()
correctly due to the fact that SRCU callback invocation accesses the
srcu_struct structure's per-CPU data area just after callbacks are
invoked.  Therefore, the usual practice of using srcu_barrier() to wait
for callbacks to be invoked before invoking cleanup_srcu_struct_quiesced()
fails because SRCU's callback-invocation workqueue handler might be
delayed, which can result in cleanup_srcu_struct_quiesced() being invoked
(and thus freeing the per-CPU data) before the SRCU's callback-invocation
workqueue handler is finished using that per-CPU data.  Nor is this a
theoretical problem: KASAN emitted use-after-free warnings because of
this problem on actual runs.

In short, NVME can now safely invoke cleanup_srcu_struct(), which
avoids the use-after-free scenario.  And cleanup_srcu_struct_quiesced()
is quite difficult to use safely.  This commit therefore removes
cleanup_srcu_struct_quiesced(), switching its sole user back to
cleanup_srcu_struct().  This effectively reverts the following pair
of commits:

f7194ac32c ("srcu: Add cleanup_srcu_struct_quiesced()")
4317228ad9 ("nvme: Avoid flush dependency in delete controller flow")

Reported-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Paul E. McKenney <paulmck@linux.ibm.com>
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Tested-by: Bart Van Assche <bvanassche@acm.org>
2019-03-26 14:39:24 -07:00
Paul E. McKenney 5cdfd174ea srcu: Check for in-flight callbacks in _cleanup_srcu_struct()
If someone fails to drain the corresponding SRCU callbacks (for
example, by failing to invoke srcu_barrier()) before invoking either
cleanup_srcu_struct() or cleanup_srcu_struct_quiesced(), the resulting
diagnostic is an ambiguous use-after-free diagnostic, and even then
only if you are running something like KASAN.  This commit therefore
improves SRCU diagnostics by adding checks for in-flight callbacks at
_cleanup_srcu_struct() time.

Note that these diagnostics can still be defeated, for example, by
invoking call_srcu() concurrently with cleanup_srcu_struct().  Which is
a really bad idea, but sometimes all too easy to do.  But even then,
these diagnostics have at least some probability of catching the problem.

Reported-by: Sagi Grimberg <sagi@grimberg.me>
Reported-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Paul E. McKenney <paulmck@linux.ibm.com>
Tested-by: Bart Van Assche <bvanassche@acm.org>
2019-03-26 14:39:24 -07:00
Paul E. McKenney add0d37b4f rcu: Correct READ_ONCE()/WRITE_ONCE() for ->rcu_read_unlock_special
The task_struct structure's ->rcu_read_unlock_special field is only ever
read or written by the owning task, but it is accessed both at process
and interrupt levels.  It may therefore be accessed using plain reads
and writes while interrupts are disabled, but must be accessed using
READ_ONCE() and WRITE_ONCE() or better otherwise.  This commit makes a
few adjustments to align with this discipline.

Signed-off-by: Paul E. McKenney <paulmck@linux.ibm.com>
2019-03-26 14:38:38 -07:00
Paul E. McKenney f1a98045ab rcu: Fix typo in tree_exp.h comment
This commit changes a rcu_exp_handler() comment from rcu_preempt_defer_qs()
to rcu_preempt_deferred_qs() in order to better match reality.

Signed-off-by: Paul E. McKenney <paulmck@linux.ibm.com>
2019-03-26 14:38:38 -07:00
Paul E. McKenney a2badefa85 rcu: Eliminate redundant NULL-pointer check
Because rcu_wake_cond() checks for a null task_struct pointer, there is
no need for its callers to do so.  This commit eliminates the redundant
check.

Signed-off-by: Paul E. McKenney <paulmck@linux.ibm.com>
2019-03-26 14:38:38 -07:00
Zhouyi Zhou 5d8a752e31 rcu: Fix force_qs_rnp() header comment
Previously, threads blocked on offlining CPUS were migrated to the
root rcu_node structure, thus requiring RCU priority boosting on this
structure.  However, since commit d19fb8d1f3 ("rcu: Don't migrate
blocked tasks even if all corresponding CPUs offline"), RCU does not
migrate blocked tasks.  Consequently, RCU no longer does RCU priority
boosting on the root rcu_node structure as of commit 1be0085b51 ("rcu:
Don't initiate RCU priority boosting on root rcu_node").

This commit therefore brings comments for the force_qs_rnp() function's
header comment in line with this new no-root-boosting reality.

Signed-off-by: Zhouyi Zhou <zhouzhouyi@gmail.com>
[ paulmck: Also remove obsolete comment on suppressing new grace periods. ]
Signed-off-by: Paul E. McKenney <paulmck@linux.ibm.com>
2019-03-26 14:38:38 -07:00
Paul E. McKenney 85f2b60c43 rcu: Update jiffies_to_sched_qs and adjust_jiffies_till_sched_qs() comments
This commit better documents the jiffies_to_sched_qs default-value
strategy used by adjust_jiffies_till_sched_qs()

Reported-by: Joel Fernandes <joel@joelfernandes.org>
Signed-off-by: Paul E. McKenney <paulmck@linux.ibm.com>
2019-03-26 14:38:38 -07:00
Neeraj Upadhyay 6973032a60 rcu: Default jiffies_to_sched_qs to jiffies_till_sched_qs
The current code only calls adjust_jiffies_till_sched_qs() if
jiffies_till_sched_qs is left at its default value, so when the
jiffies_till_sched_qs kernel-boot parameter actually is specified,
jiffies_to_sched_qs will be left with the value zero, which
will result in useless slowdowns of cond_resched().  This commit
therefore changes rcu_init_geometry() to unconditionally invoke
adjust_jiffies_till_sched_qs(), which ensures that jiffies_to_sched_qs
will be initialized in all cases, thus maintaining good cond_resched()
performance.

Signed-off-by: Neeraj Upadhyay <neeraju@codeaurora.org>
Signed-off-by: Paul E. McKenney <paulmck@linux.ibm.com>
2019-03-26 14:38:38 -07:00
Neeraj Upadhyay 0f58d2ac2c rcu: Fix self-wakeups for grace-period kthread
The current rcu_gp_kthread_wake() function uses in_interrupt()
and thus does a self-wakeup from all interrupt contexts, including
the pointless case where the GP kthread happens to be running with
bottom halves disabled, along with the impossible case where the GP
kthread is running within an NMI handler (you are not supposed to invoke
rcu_gp_kthread_wake() from within an NMI handler.  This commit therefore
replaces the in_interrupt() with in_irq(), so that the self-wakeups
happen only from handlers for hardware interrupts and softirqs.
This also makes the code match the comment.

Signed-off-by: Neeraj Upadhyay <neeraju@codeaurora.org>
Signed-off-by: Paul E. McKenney <paulmck@linux.ibm.com>
Acked-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2019-03-26 14:37:49 -07:00
Paul E. McKenney 497e42600b rcu: Report error for bad rcu_nocbs= parameter values
This commit prints a console message when cpulist_parse() reports a
bad list of CPUs, and sets all CPUs' bits in that case.  The reason for
setting all CPUs' bits is that this is the safe(r) choice for real-time
workloads, which would normally be the ones using the rcu_nocbs= kernel
boot parameter.  Either way, later RCU console log messages list the
actual set of CPUs whose RCU callbacks will be offloaded.

Signed-off-by: Paul E. McKenney <paulmck@linux.ibm.com>
2019-03-26 14:37:49 -07:00
Paul E. McKenney da8739f23f rcu: Allow rcu_nocbs= to specify all CPUs
Currently, the rcu_nocbs= kernel boot parameter requires that a specific
list of CPUs be specified, and has no way to say "all of them".
As noted by user RavFX in a comment to Phoronix topic 1002538, this
is an inconvenient side effect of the removal of the RCU_NOCB_CPU_ALL
Kconfig option.  This commit therefore enables the rcu_nocbs= kernel boot
parameter to be given the string "all", as in "rcu_nocbs=all" to specify
that all CPUs on the system are to have their RCU callbacks offloaded.

Another approach would be to make cpulist_parse() check for "all", but
there are uses of cpulist_parse() that do other checking, which could
conflict with an "all".  This commit therefore focuses on the specific
use of cpulist_parse() in rcu_nocb_setup().

Just a note to other people who would like changes to Linux-kernel RCU:
If you send your requests to me directly, they might get fixed somewhat
faster.  RavFX's comment was posted on January 22, 2018 and I first saw
it on March 5, 2019.  And the only reason that I found it -at- -all- was
that I was looking for projects using RCU, and my search engine showed
me that Phoronix comment quite by accident.  Your choice, though!  ;-)

Signed-off-by: Paul E. McKenney <paulmck@linux.ibm.com>
2019-03-26 14:37:49 -07:00
Akira Yokosawa b2eb85b49a rcu: Move common code out of if-else block
As the result of recent addition of "rdp->core_needs_qs = false;" in
the "if" block, now both branches of the if-else have the same
assignment.

Factor it out and reduce line count.

Signed-off-by: Akira Yokosawa <akiyks@gmail.com>
Cc: Joel Fernandes <joel@joelfernandes.org>
Signed-off-by: Paul E. McKenney <paulmck@linux.ibm.com>
Acked-by: Joel Fernandes (Google) <joel@joelfernandes.org>
2019-03-26 14:37:49 -07:00
Liu Song 3ffe3d1adc rcu: Set rcutree.kthread_prio sysfs access to read-only
The rcutree.kthread_prio kernel-boot parameter is used to set the
priority for boost (rcub), per-CPU (rcuc), and grace-period (rcu_preempt
or rcu_sched) kthreads.  It is also used by rcutorture to check whether
it is possible to meaningfully test RCU priority boosting.  However,
all of these cases will either ignore or be confused by any post-boot
changes to rcutree.kthread_prio.

Note that the user really can change the priorities of all of these
kthreads using chrt, given sufficient privileges.  Therefore, the
read-write nature of sysfs access to rcutree.kthread_prio is thus at
best an attractive nuisance.

This commit therefore changes sysfs access to rcutree.kthread_prio to
be read-only.

Signed-off-by: Liu Song <liu.song11@zte.com.cn>
Signed-off-by: Paul E. McKenney <paulmck@linux.ibm.com>
2019-03-26 14:37:49 -07:00
Paul E. McKenney 884157cef0 rcu: Make exit_rcu() handle non-preempted RCU readers
The purpose of exit_rcu() is to handle cases where buggy code causes a
task to exit within an RCU read-side critical section.  It currently
does that in the case where said RCU read-side critical section was
preempted at least once, but fails to handle cases where preemption did
not occur.  This case needs to be handled because otherwise the final
context switch away from the exiting task will incorrectly behave as if
task exit were instead a preemption of an RCU read-side critical section,
and will therefore queue the exiting task.  The exiting task will have
exited, and thus won't ever execute rcu_read_unlock(), which means that
it will remain queued forever, blocking all subsequent grace periods,
and eventually resulting in OOM.

Although this is arguably better than letting grace periods proceed
and having a later rcu_read_unlock() access the now-freed task
structure that once belonged to the exiting tasks, it would obviously
be better to correctly handle this case.  This commit therefore sets
->rcu_read_lock_nesting to 1 in that case, so that the subsequence call
to __rcu_read_unlock() causes the exiting task to exit its dangling RCU
read-side critical section.

Note that deferred quiescent states need not be considered.  The reason
is that removing the task from the ->blkd_tasks[] list in the call to
rcu_preempt_deferred_qs() handles the per-task component of any deferred
quiescent state, and all other components of any deferred quiescent state
are associated with the CPU, which isn't going anywhere until some later
CPU-hotplug operation, which will report any remaining deferred quiescent
states from within the rcu_report_dead() function.

Note also that negative values of ->rcu_read_lock_nesting need not be
considered.  First, these won't show up in exit_rcu() unless there is
a serious bug in RCU, and second, setting ->rcu_read_lock_nesting sets
the state so that the RCU read-side critical section will be exited
normally.

Again, this code has no effect unless there has been some prior bug
that prevents a task from leaving an RCU read-side critical section
before exiting.  Furthermore, there have been no reports of the bug
fixed by this commit appearing in production.  This commit is therefore
absolutely -not- recommended for backporting to -stable.

Reported-by: ABHISHEK DUBEY <dabhishek@iisc.ac.in>
Reported-by: BHARATH Y MOURYA <bharathm@iisc.ac.in>
Reported-by: Aravinda Prasad <aravinda@iisc.ac.in>
Signed-off-by: Paul E. McKenney <paulmck@linux.ibm.com>
Tested-by: ABHISHEK DUBEY <dabhishek@iisc.ac.in>
2019-03-26 14:37:49 -07:00
Cyrill Gorcunov 18d7e40679 rcu: rcu_qs -- Use raise_softirq_irqoff to not save irqs twice
The rcu_qs is disabling IRQs by self so no need to do the same in raise_softirq
but instead we can save some cycles using raise_softirq_irqoff directly.

CC: Paul E. McKenney <paulmck@linux.ibm.com>
Signed-off-by: Cyrill Gorcunov <gorcunov@gmail.com>
Signed-off-by: Paul E. McKenney <paulmck@linux.ibm.com>
2019-03-26 14:37:49 -07:00
Joel Fernandes (Google) 671a63517c rcu: Avoid unnecessary softirq when system is idle
When there are no callbacks pending on an idle system, I noticed that
RCU softirq is continuously firing. During this the cpu_no_qs is set to
false, and core_needs_qs is set to true indefinitely. This causes
rcu_process_callbacks to be repeatedly called, even though the node
corresponding to the CPU has that CPU's mask bit cleared and the system
is idle. I believe the race is when such mask clearing is done during
idle CPU scan of the quiescent state forcing stage in the kthread
instead of the softirq. Since the rnp mask is cleared, but the flags on
the CPU's rdp are not cleared, the CPU thinks it still needs to report
to core RCU.

Cure this by clearing the core_needs_qs flag when the CPU detects that
its node is already updated which will avoid the unwanted softirq raises
to the benefit of real-time systems.

Test: Ran rcutorture for various tree RCU configs.

Signed-off-by: Joel Fernandes (Google) <joel@joelfernandes.org>
Signed-off-by: Paul E. McKenney <paulmck@linux.ibm.com>
2019-03-26 14:37:49 -07:00
Paul E. McKenney e85e6a21b2 rcu: Unconditionally expedite during suspend/hibernate
The rcu_pm_notify() function refuses to switch to/from expedited grace
periods on systems with more than 256 CPUs due to the serialized
initialization of expedited grace periods.  However, expedited grace
periods are now initialized in parallel, removing this concern.
This commit therefore removes the checks from rcu_pm_notify(), so that
expedited grace periods are used unconditionally during suspend/resume
and hibernate/wake operations.

As always, real-time workloads wishing to completely avoid expedited
grace periods can use the rcupdate.rcu_normal= kernel parameter.

Signed-off-by: Paul E. McKenney <paulmck@linux.ibm.com>
2019-03-26 14:37:49 -07:00
Linus Torvalds 203b6609e0 Merge branch 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull perf updates from Ingo Molnar:
 "Lots of tooling updates - too many to list, here's a few highlights:

   - Various subcommand updates to 'perf trace', 'perf report', 'perf
     record', 'perf annotate', 'perf script', 'perf test', etc.

   - CPU and NUMA topology and affinity handling improvements,

   - HW tracing and HW support updates:
      - Intel PT updates
      - ARM CoreSight updates
      - vendor HW event updates

   - BPF updates

   - Tons of infrastructure updates, both on the build system and the
     library support side

   - Documentation updates.

   - ... and lots of other changes, see the changelog for details.

  Kernel side updates:

   - Tighten up kprobes blacklist handling, reduce the number of places
     where developers can install a kprobe and hang/crash the system.

   - Fix/enhance vma address filter handling.

   - Various PMU driver updates, small fixes and additions.

   - refcount_t conversions

   - BPF updates

   - error code propagation enhancements

   - misc other changes"

* 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (238 commits)
  perf script python: Add Python3 support to syscall-counts-by-pid.py
  perf script python: Add Python3 support to syscall-counts.py
  perf script python: Add Python3 support to stat-cpi.py
  perf script python: Add Python3 support to stackcollapse.py
  perf script python: Add Python3 support to sctop.py
  perf script python: Add Python3 support to powerpc-hcalls.py
  perf script python: Add Python3 support to net_dropmonitor.py
  perf script python: Add Python3 support to mem-phys-addr.py
  perf script python: Add Python3 support to failed-syscalls-by-pid.py
  perf script python: Add Python3 support to netdev-times.py
  perf tools: Add perf_exe() helper to find perf binary
  perf script: Handle missing fields with -F +..
  perf data: Add perf_data__open_dir_data function
  perf data: Add perf_data__(create_dir|close_dir) functions
  perf data: Fail check_backup in case of error
  perf data: Make check_backup work over directories
  perf tools: Add rm_rf_perf_data function
  perf tools: Add pattern name checking to rm_rf
  perf tools: Add depth checking to rm_rf
  perf data: Add global path holder
  ...
2019-03-06 07:59:36 -08:00
Linus Torvalds 3717f613f4 Merge branch 'core-rcu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull RCU updates from Ingo Molnar:
 "The main RCU related changes in this cycle were:

   - Additional cleanups after RCU flavor consolidation

   - Grace-period forward-progress cleanups and improvements

   - Documentation updates

   - Miscellaneous fixes

   - spin_is_locked() conversions to lockdep

   - SPDX changes to RCU source and header files

   - SRCU updates

   - Torture-test updates, including nolibc updates and moving nolibc to
     tools/include"

* 'core-rcu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (71 commits)
  locking/locktorture: Convert to SPDX license identifier
  linux/torture: Convert to SPDX license identifier
  torture: Convert to SPDX license identifier
  linux/srcu: Convert to SPDX license identifier
  linux/rcutree: Convert to SPDX license identifier
  linux/rcutiny: Convert to SPDX license identifier
  linux/rcu_sync: Convert to SPDX license identifier
  linux/rcu_segcblist: Convert to SPDX license identifier
  linux/rcupdate: Convert to SPDX license identifier
  linux/rcu_node_tree: Convert to SPDX license identifier
  rcu/update: Convert to SPDX license identifier
  rcu/tree: Convert to SPDX license identifier
  rcu/tiny: Convert to SPDX license identifier
  rcu/sync: Convert to SPDX license identifier
  rcu/srcu: Convert to SPDX license identifier
  rcu/rcutorture: Convert to SPDX license identifier
  rcu/rcu_segcblist: Convert to SPDX license identifier
  rcu/rcuperf: Convert to SPDX license identifier
  rcu/rcu.h: Convert to SPDX license identifier
  RCU/torture.txt: Remove section MODULE PARAMETERS
  ...
2019-03-05 14:49:11 -08:00
Masami Hiramatsu a39f15b964 kprobes: Prohibit probing on RCU debug routine
Since kprobe itself depends on RCU, probing on RCU debug
routine can cause recursive breakpoint bugs.

Prohibit probing on RCU debug routines.

int3
 ->do_int3()
   ->ist_enter()
     ->RCU_LOCKDEP_WARN()
       ->debug_lockdep_rcu_enabled() -> int3

Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andrea Righi <righi.andrea@gmail.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/154998807741.31052.11229157537816341591.stgit@devbox
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2019-02-13 08:16:40 +01:00
Masami Hiramatsu c13324a505 x86/kprobes: Prohibit probing on functions before kprobe_int3_handler()
Prohibit probing on the functions called before kprobe_int3_handler()
in do_int3(). More specifically, ftrace_int3_handler(),
poke_int3_handler(), and ist_enter(). And since rcu_nmi_enter() is
called by ist_enter(), it also should be marked as NOKPROBE_SYMBOL.

Since those are handled before kprobe_int3_handler(), probing those
functions can cause a breakpoint recursion and crash the kernel.

Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andrea Righi <righi.andrea@gmail.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/154998793571.31052.11301258949601150994.stgit@devbox
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2019-02-13 08:16:39 +01:00
Paul E. McKenney e7ffb4eb9a Merge branches 'doc.2019.01.26a', 'fixes.2019.01.26a', 'sil.2019.01.26a', 'spdx.2019.02.09a', 'srcu.2019.01.26a' and 'torture.2019.01.26a' into HEAD
doc.2019.01.26a:  Documentation updates.
fixes.2019.01.26a:  Miscellaneous fixes.
sil.2019.01.26a:  Removal of a few more spin_is_locked() instances.
spdx.2019.02.09a:  Add SPDX identifiers to RCU files
srcu.2019.01.26a:  SRCU updates.
torture.2019.01.26a: Torture-test updates.
2019-02-09 08:47:52 -08:00
Paul E. McKenney 38b4df649e rcu/update: Convert to SPDX license identifier
Replace the license boiler plate with a SPDX license identifier.
While in the area, update an email address.

Signed-off-by: Paul E. McKenney <paulmck@linux.ibm.com>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
2019-02-09 08:44:27 -08:00
Paul E. McKenney 22e4092531 rcu/tree: Convert to SPDX license identifier
Replace the license boiler plate with a SPDX license identifier.
While in the area, update an email address.

Signed-off-by: Paul E. McKenney <paulmck@linux.ibm.com>
[ paulmck: Update .h file SPDX comment format per Joe Perches. ]
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
2019-02-09 08:44:10 -08:00
Paul E. McKenney 00de9d7415 rcu/tiny: Convert to SPDX license identifier
Replace the license boiler plate with a SPDX license identifier.
While in the area, update an email address.

Signed-off-by: Paul E. McKenney <paulmck@linux.ibm.com>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
2019-02-09 08:44:05 -08:00
Paul E. McKenney 96b903f5da rcu/sync: Convert to SPDX license identifier
Replace the license boiler plate with a SPDX license identifier.

Signed-off-by: Paul E. McKenney <paulmck@linux.ibm.com>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
2019-02-09 08:43:59 -08:00
Paul E. McKenney e7ee1501cd rcu/srcu: Convert to SPDX license identifier
Replace the license boiler plate with a SPDX license identifier.
While in the area, update an email address.

Signed-off-by: Paul E. McKenney <paulmck@linux.ibm.com>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
2019-02-09 08:43:54 -08:00
Paul E. McKenney 2e24ce8852 rcu/rcutorture: Convert to SPDX license identifier
Replace the license boiler plate with a SPDX license identifier.
While in the area, update an email address.

Signed-off-by: Paul E. McKenney <paulmck@linux.ibm.com>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
2019-02-09 08:43:46 -08:00
Paul E. McKenney eb7935e479 rcu/rcu_segcblist: Convert to SPDX license identifier
Replace the license boiler plate with a SPDX license identifier.
While in the area, update an email address.

Signed-off-by: Paul E. McKenney <paulmck@linux.ibm.com>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
2019-02-09 08:43:40 -08:00
Paul E. McKenney 8bf05ed3ad rcu/rcuperf: Convert to SPDX license identifier
Replace the license boiler plate with a SPDX license identifier.
While in the area, update an email address.

Signed-off-by: Paul E. McKenney <paulmck@linux.ibm.com>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
2019-02-09 08:43:35 -08:00
Paul E. McKenney b5b11890de rcu/rcu.h: Convert to SPDX license identifier
Replace the license boiler plate with a SPDX license identifier.
While in the area, update an email address.

Signed-off-by: Paul E. McKenney <paulmck@linux.ibm.com>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
2019-02-09 08:43:05 -08:00
Paul E. McKenney e838a7d66e rcuperf: Stop abusing IS_ENABLED()
The ever-evolving IS_ENABLED() macro is intended for CONFIG_* Kconfig
options, but rcuperf currently uses it for the decidedly non-CONFIG_*
MODULE macro.  In the spirit of not inviting trouble, this commit
substitutes tried-and-true #ifdef.

Reported-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Paul E. McKenney <paulmck@linux.ibm.com>
Acked-by: Ingo Molnar <mingo@kernel.org>
2019-01-25 15:37:11 -08:00
Paul E. McKenney 3a6cb58f15 rcutorture: Add grace period after CPU offline
Beyond a certain point in the CPU-hotplug offline process, timers get
stranded on the outgoing CPU, and won't fire until that CPU comes back
online, which might well be never.  This commit therefore adds a hook
in torture_onoff_init() that is invoked from torture_offline(), which
rcutorture uses to occasionally wait for a grace period.  This should
result in failures for RCU implementations that rely on stranded timers
eventually firing in the absence of the CPU coming back online.

Reported-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Paul E. McKenney <paulmck@linux.ibm.com>
2019-01-25 15:37:10 -08:00
Paul E. McKenney cd618d102b rcutorture: Record grace periods in forward-progress histogram
This commit records grace periods in rcutorture's n_launders_hist[]
histogram, thus allowing rcu_torture_fwd_cb_hist() to print out the
elapsed number of grace periods between buckets.  This information
helps to determine whether a lack of forward progress is due to stalled
grace periods on the one hand or due to sluggish callback invocation on
the other.

Signed-off-by: Paul E. McKenney <paulmck@linux.ibm.com>
2019-01-25 15:37:09 -08:00
Sebastian Andrzej Siewior e81baf4cb1 srcu: Remove srcu_queue_delayed_work_on()
srcu_queue_delayed_work_on() disables preemption (and therefore CPU
hotplug in RCU's case) and then checks based on its own accounting if a
CPU is online. If the CPU is online it uses queue_delayed_work_on()
otherwise it fallbacks to queue_delayed_work().
The problem here is that queue_work() on -RT does not work with disabled
preemption.

queue_work_on() works also on an offlined CPU. queue_delayed_work_on()
has the problem that it is possible to program a timer on an offlined
CPU. This timer will fire once the CPU is online again. But until then,
the timer remains programmed and nothing will happen.

Add a local timer which will fire (as requested per delay) on the local
CPU and then enqueue the work on the specific CPU.

RCUtorture testing with SRCU-P for 24h showed no problems.

Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Paul E. McKenney <paulmck@linux.ibm.com>
2019-01-25 15:36:42 -08:00
Paul E. McKenney c2d8089de7 rcu: Fix obsolete DYNTICK_IRQ_NONIDLE comment
This commit updates the DYNTICK_IRQ_NONIDLE header comment to remove
the obsolete commentary about unmatched rcu_irq_{enter,exit}().

Signed-off-by: Paul E. McKenney <paulmck@linux.ibm.com>
2019-01-25 15:35:24 -08:00
Paul E. McKenney 39abefe743 rcu: Repair rcu_nmi_exit() docbook header
This commit removes the "@irq" argument from the rcu_nmi_exit() docbook
header, given that this function now has no arguments.

Reported-by: kbuild test robot <lkp@intel.com>
Signed-off-by: Paul E. McKenney <paulmck@linux.ibm.com>
2019-01-25 15:35:23 -08:00
Paul E. McKenney 5a0874c1d1 rcu: Remove preemption disabling from expedited CPU selection
It turns out that it is queue_delayed_work_on() rather than
queue_work_on() that has difficulties when used concurrently with
CPU-hotplug removal operations.  It is therefore unnecessary to protect
CPU identification and queue_work_on() with preempt_disable().

This commit therefore removes the preempt_disable() and preempt_enable()
from sync_rcu_exp_select_cpus(), which has the further benefit of reducing
the number of changes that must be maintained in the -rt patchset.

Reported-by: Thomas Gleixner <tglx@linutronix.de>
Reported-by: Sebastian Siewior <bigeasy@linutronix.de>
Suggested-by: Boqun Feng <boqun.feng@gmail.com>
Signed-off-by: Paul E. McKenney <paulmck@linux.ibm.com>
2019-01-25 15:35:23 -08:00
Paul E. McKenney fb60e533be rcu: Rename rcu_process_callbacks() to rcu_core() for Tree RCU
Although the name rcu_process_callbacks() still makes sense for Tiny
RCU, where most of what it does is invoke callbacks, it no longer makes
much sense for Tree RCU, especially given that the actually callback
invocation is relegated to rcu_do_batch(), or, for no-CBs CPUs, to the
rcuo kthreads.  Especially in the latter case, rcu_process_callbacks()
has very little to do with actual callbacks.  A better description of
this function is that it performs RCU's core processing.

This commit therefore changes the name of Tree RCU's rcu_process_callbacks()
function to rcu_core(), which also has the virtue of being consistent with
the existing invoke_rcu_core() function.

While in the area, the header comment is reworked.

Signed-off-by: Paul E. McKenney <paulmck@linux.ibm.com>
2019-01-25 15:35:22 -08:00
Paul E. McKenney c98cac603f rcu: Rename rcu_check_callbacks() to rcu_sched_clock_irq()
The name rcu_check_callbacks() arguably made sense back in the early
2000s when RCU was quite a bit simpler than it is today, but it has
become quite misleading, especially with the advent of dyntick-idle
and NO_HZ_FULL.  The rcu_check_callbacks() function is RCU's hook into
the scheduling-clock interrupt, and is now but one of many ways that
callbacks get promoted to invocable state.

This commit therefore changes the name to rcu_sched_clock_irq(),
which is the same number of characters and clearly indicates this
function's relation to the rest of the Linux kernel.  In addition, for
the sake of consistency, rcu_flavor_check_callbacks() is also renamed
to rcu_flavor_sched_clock_irq().

While in the area, the header comments for both functions are reworked.

Signed-off-by: Paul E. McKenney <paulmck@linux.ibm.com>
2019-01-25 15:35:21 -08:00
Paul E. McKenney 7a968bb26a Merge branches 'consolidate.2019.01.26a' and 'fwd.2019.01.26a' into HEAD
consolidate.2019.01.26a: RCU flavor consolidation cleanups.
fwd.2019.01.26a: RCU grace-period forward-progress fixes.
2019-01-25 15:32:01 -08:00
Zhang, Jun 13dc7d0c7a rcu: Prevent needless ->gp_seq_needed update in __note_gp_changes()
Currently, __note_gp_changes() checks to see if the rcu_node structure's
->gp_seq_needed is greater than or equal to that of the rcu_data
structure, and if so, updates the rcu_data structure's ->gp_seq_needed
field.  This results in a useless store in the case where the two fields
are equal.

This commit therefore carries out this store only in the case where the
rcu_node structure's ->gp_seq_needed is strictly greater than that of
the rcu_data structure.

Signed-off-by: "Zhang, Jun" <jun.zhang@intel.com>
Signed-off-by: Paul E. McKenney <paulmck@linux.ibm.com>
Link: https://lkml.kernel.org/r/88DC34334CA3444C85D647DBFA962C2735AD5F77@SHSMSX104.ccr.corp.intel.com
2019-01-25 15:30:00 -08:00
Zhang, Jun 1d1f898df6 rcu: Do RCU GP kthread self-wakeup from softirq and interrupt
The rcu_gp_kthread_wake() function is invoked when it might be necessary
to wake the RCU grace-period kthread.  Because self-wakeups are normally
a useless waste of CPU cycles, if rcu_gp_kthread_wake() is invoked from
this kthread, it naturally refuses to do the wakeup.

Unfortunately, natural though it might be, this heuristic fails when
rcu_gp_kthread_wake() is invoked from an interrupt or softirq handler
that interrupted the grace-period kthread just after the final check of
the wait-event condition but just before the schedule() call.  In this
case, a wakeup is required, even though the call to rcu_gp_kthread_wake()
is within the RCU grace-period kthread's context.  Failing to provide
this wakeup can result in grace periods failing to start, which in turn
results in out-of-memory conditions.

This race window is quite narrow, but it actually did happen during real
testing.  It would of course need to be fixed even if it was strictly
theoretical in nature.

This patch does not Cc stable because it does not apply cleanly to
earlier kernel versions.

Fixes: 48a7639ce8 ("rcu: Make callers awaken grace-period kthread")
Reported-by: "He, Bo" <bo.he@intel.com>
Co-developed-by: "Zhang, Jun" <jun.zhang@intel.com>
Co-developed-by: "He, Bo" <bo.he@intel.com>
Co-developed-by: "xiao, jin" <jin.xiao@intel.com>
Co-developed-by: Bai, Jie A <jie.a.bai@intel.com>
Signed-off: "Zhang, Jun" <jun.zhang@intel.com>
Signed-off: "He, Bo" <bo.he@intel.com>
Signed-off: "xiao, jin" <jin.xiao@intel.com>
Signed-off: Bai, Jie A <jie.a.bai@intel.com>
Signed-off-by: "Zhang, Jun" <jun.zhang@intel.com>
[ paulmck: Switch from !in_softirq() to "!in_interrupt() &&
  !in_serving_softirq() to avoid redundant wakeups and to also handle the
  interrupt-handler scenario as well as the softirq-handler scenario that
  actually occurred in testing. ]
Signed-off-by: Paul E. McKenney <paulmck@linux.ibm.com>
Link: https://lkml.kernel.org/r/CD6925E8781EFD4D8E11882D20FC406D52A11F61@SHSMSX104.ccr.corp.intel.com
2019-01-25 15:29:59 -08:00
Paul E. McKenney 2ccaff10f7 rcu: Add sysrq rcu_node-dump capability
Life is hard if RCU manages to get stuck without triggering RCU CPU
stall warnings or triggering the rcu_check_gp_start_stall() checks
for failing to start a grace period.  This commit therefore adds a
boot-time-selectable sysrq key (commandeering "y") that allows manually
dumping Tree RCU state.  The new rcutree.sysrq_rcu kernel boot parameter
must be set for this sysrq to be available.

Signed-off-by: Paul E. McKenney <paulmck@linux.ibm.com>
2019-01-25 15:29:59 -08:00
Paul E. McKenney 3b6505fd8e rcu: Protect rcu_check_gp_kthread_starvation() access to ->gp_flags
The rcu_check_gp_kthread_starvation() function can be invoked without
holding locks, so the access to the rcu_state structure's ->gp_flags
field must be protected with READ_ONCE().  This commit therefore adds
this protection.

Signed-off-by: Paul E. McKenney <paulmck@linux.ibm.com>
2019-01-25 15:29:58 -08:00
Paul E. McKenney fd897573fa rcu: Improve diagnostics for failed RCU grace-period start
If a grace period fails to start (for example, because you commented
out the last two lines of rcu_accelerate_cbs_unlocked()), rcu_core()
will invoke rcu_check_gp_start_stall(), which will notice and complain.
However, this complaint is lacking crucial debugging information such
as when the last wakeup executed and what the value of ->gp_seq was at
that time.  This commit therefore removes the current pr_alert() from
rcu_check_gp_start_stall(), instead invoking show_rcu_gp_kthreads(),
which has been updated to print the needed information, which is collected
by rcu_gp_kthread_wake().

Signed-off-by: Paul E. McKenney <paulmck@linux.ibm.com>
2019-01-25 15:29:57 -08:00
Paul E. McKenney a9fefdb257 rcu: Update NOCB comments
This commit updates a few obsolete comments in the RCU callback-offload
code.

Signed-off-by: Paul E. McKenney <paulmck@linux.ibm.com>
2019-01-25 15:29:57 -08:00
Paul E. McKenney b2c1955b88 rcu: Remove unused rcu_cpu_kthread_cpu per-CPU variable
The rcu_cpu_kthread_cpu used to provide debugfs information, but is no
longer used.  This commit therefore removes it.

Signed-off-by: Paul E. McKenney <paulmck@linux.ibm.com>
2019-01-25 15:29:56 -08:00
Paul E. McKenney f7e972ee12 rcu: Move rcu_cpu_has_work to rcu_data structure
Given that RCU has a perfectly good per-CPU rcu_data structure, most
per-CPU quantities should be stored there.

This commit therefore moves the rcu_cpu_has_work per-CPU variable to
the rcu_data structure.  This also makes this variable unconditionally
present, which should be acceptable given the memory reduction due to the
RCU flavor consolidation and also due to simplifications this will enable.

Signed-off-by: Paul E. McKenney <paulmck@linux.ibm.com>
2019-01-25 15:29:56 -08:00
Paul E. McKenney 8b4d0f4858 rcu: Remove unused rcu_cpu_kthread_loops per-CPU variable
The rcu_cpu_kthread_loops variable used to provide debugfs information,
but is no longer used.  This commit therefore removes it.

Signed-off-by: Paul E. McKenney <paulmck@linux.ibm.com>
2019-01-25 15:29:55 -08:00
Paul E. McKenney 6ffdde28b7 rcu: Move rcu_cpu_kthread_status to rcu_data structure
Given that RCU has a perfectly good per-CPU rcu_data structure, most
per-CPU quantities should be stored there.

This commit therefore moves the rcu_cpu_kthread_status per-CPU variable
to the rcu_data structure.  This also makes this variable unconditionally
present, which should be acceptable given the memory reduction due to the
RCU flavor consolidation and also due to simplifications this will enable.

Signed-off-by: Paul E. McKenney <paulmck@linux.ibm.com>
2019-01-25 15:29:54 -08:00
Paul E. McKenney 37f62d7cf0 rcu: Move rcu_cpu_kthread_task to rcu_data structure
Given that RCU has a perfectly good per-CPU rcu_data structure, most
per-CPU quantities should be stored there.

This commit therefore moves the rcu_cpu_kthread_task per-CPU variable to
the rcu_data structure.  This also makes this variable unconditionally
present, which should be acceptable given the memory reduction due to the
RCU flavor consolidation and also due to simplifications this will enable.

Signed-off-by: Paul E. McKenney <paulmck@linux.ibm.com>
2019-01-25 15:29:53 -08:00
Paul E. McKenney 9cf422a8e7 rcu: Accommodate zero jiffies_till_first_fqs and kthread kicking
It is perfectly fine to set the rcutree.jiffies_till_first_fqs boot
parameter to zero, in fact, this can be useful on specialty systems that
usually have at least one idle CPU and that need fast grace periods.
This is because this setting causes the RCU grace-period kthread to
scan for idle threads immediately after grace-period initialization,
as opposed to waiting several jiffies to do so.

It is also perfectly fine to set the rcutree.rcu_kick_kthreads kernel
parameter, which gives the RCU grace-period kthread an extra wakeup
if it doesn't make progress for a period of three times the setting of
the rcutree.jiffies_till_first_fqs boot parameter.  This is of course
problematic when the value of this parameter is zero, as it can result
in unnecessary wakeup IPIs along with unnecessary WARN_ONCE() invocations.

This commit therefore defers kthread kicking for at least two jiffies,
regardless of the setting of rcutree.jiffies_till_first_fqs.

Signed-off-by: Paul E. McKenney <paulmck@linux.ibm.com>
2019-01-25 15:29:53 -08:00
Paul E. McKenney 260e1e4fd8 rcu: Discard separate per-CPU callback counts
Back when there were multiple flavors of RCU, it was necessary to
separately count lazy and non-lazy callbacks for each CPU.  These counts
were used in CONFIG_RCU_FAST_NO_HZ kernels to determine how long a newly
idle CPU should be allowed to sleep before handling its RCU callbacks.
But now that there is only one flavor, the callback counts for a given
CPU's sole rcu_data structure are the counts for that CPU.

This commit therefore removes the rcu_data structure's ->nonlazy_posted
and ->nonlazy_posted_snap fields, the rcu_idle_count_callbacks_posted()
and rcu_cpu_has_callbacks() functions, repurposes the rcu_data structure's
->all_lazy field to record the laziness state at the beginning of the
latest idle sojourn, and modifies CONFIG_RCU_FAST_NO_HZ RCU CPU stall
warnings accordingly.

Signed-off-by: Paul E. McKenney <paulmck@linux.ibm.com>
2019-01-25 15:28:30 -08:00
Paul E. McKenney 8923072664 rcu: Inline _synchronize_rcu_expedited() into synchronize_rcu_expedited()
Now that _synchronize_rcu_expedited() has only one caller, and given that
this is a tail call, this commit inlines _synchronize_rcu_expedited()
into synchronize_rcu_expedited().

Signed-off-by: Paul E. McKenney <paulmck@linux.ibm.com>
2019-01-25 15:28:29 -08:00
Paul E. McKenney e5bc3af773 rcu: Consolidate PREEMPT and !PREEMPT synchronize_rcu()
Now that rcu_blocking_is_gp() makes the correct immediate-return
decision for both PREEMPT and !PREEMPT, a single implementation of
synchronize_rcu() will work correctly under both configurations.
This commit therefore eliminates a few lines of code by consolidating
the two implementations of synchronize_rcu().

Signed-off-by: Paul E. McKenney <paulmck@linux.ibm.com>
2019-01-25 15:28:28 -08:00
Paul E. McKenney 3cd4ca47aa rcu: Consolidate PREEMPT and !PREEMPT synchronize_rcu_expedited()
The CONFIG_PREEMPT=n and CONFIG_PREEMPT=y implementations of
synchronize_rcu_expedited() are quite similar, and with small
modifications to rcu_blocking_is_gp() can be made identical.  This commit
therefore makes this change in order to save a few lines of code and to
reduce the amount of duplicate code.

Signed-off-by: Paul E. McKenney <paulmck@linux.ibm.com>
2019-01-25 15:28:27 -08:00
Paul E. McKenney 142d106d5e rcu: Determine expedited-GP IPI handler at build time
Back when there could be multiple RCU flavors running in the same kernel
at the same time, it was necessary to specify the expedited grace-period
IPI handler at runtime.  Now that there is only one RCU flavor, the
IPI handler can be determined at build time.  There is therefore no
longer any reason for the RCU-preempt and RCU-sched IPI handlers to
have different names, nor is there any reason to pass these handlers in
function arguments and in the data structures enclosing workqueues.

This commit therefore makes all these changes, pushing the specification
of the expedited grace-period IPI handler down to the point of use.

Signed-off-by: Paul E. McKenney <paulmck@linux.ibm.com>
2019-01-25 15:28:27 -08:00
Paul E. McKenney c46f497a61 rcu: Inline rcu_kthread_do_work() into its sole remaining caller
The rcu_kthread_do_work() function has a single-line body and only one
remaining caller.  This commit therefore saves a few lines of code by
inlining rcu_kthread_do_work() into its sole remaining caller.

Signed-off-by: Paul E. McKenney <paulmck@linux.ibm.com>
2019-01-25 15:28:25 -08:00
Paul E. McKenney c97058d033 rcu: Eliminate RCU_BH_FLAVOR and RCU_SCHED_FLAVOR
Now that the RCU flavors have been consolidated, RCU_BH_FLAVOR and
RCU_SCHED_FLAVOR are no longer used.  This commit therefore saves a
few lines by removing them.

Signed-off-by: Paul E. McKenney <paulmck@linux.ibm.com>
2019-01-25 15:28:25 -08:00
Paul E. McKenney cd920e5a34 rcu: Inline force_quiescent_state() into rcu_force_quiescent_state()
Given that rcu_force_quiescent_state() is a simple wrapper around
force_quiescent_state(), this commit saves a few lines of code by
inlining force_quiescent_state() into rcu_force_quiescent_state(),
and changing all references to force_quiescent_state() to instead
invoke rcu_force_quiescent_state().

Signed-off-by: Paul E. McKenney <paulmck@linux.ibm.com>
2019-01-25 15:28:24 -08:00
Paul E. McKenney 1de462ed85 rcu: Make expedited IPI handler return after handling critical section
During expedited RCU grace-period initialization, IPIs are sent to
all non-idle online CPUs.  The IPI handler checks to see if the CPU is
in quiescent state, reporting one if so.  This handler looks at three
different cases: (1) The CPU is not in an rcu_read_lock()-based critical
section, (2) The CPU is in the process of exiting an rcu_read_lock()-based
critical section, and (3) The CPU is in an rcu_read_lock()-based critical
section.  In case (2), execution falls through into case (3).

This is harmless from a functionality viewpoint, but can result in
needless overhead during an improbable corner case.  This commit therefore
adds the "return" statement needed to prevent fall-through.

Signed-off-by: Paul E. McKenney <paulmck@linux.ibm.com>
2019-01-25 15:28:24 -08:00
Paul E. McKenney ad368d15b0 rcu: Rename and comment changes due to only one rcuo kthread per CPU
Given RCU flavor consolidation, the name rcu_spawn_all_nocb_kthreads()
is quite misleading.  It no longer ever creates more than one kthread,
and it does so only for the specified CPU.  This commit therefore changes
this name to the more descriptive rcu_spawn_cpu_nocb_kthread(), and also
fixes up a similar issue in its header comment while in the area.

Signed-off-by: Paul E. McKenney <paulmck@linux.ibm.com>
2019-01-25 15:28:23 -08:00
Paul E. McKenney a4cffdad73 time: Move CONTEXT_TRACKING to kernel/time/Kconfig
Both CONTEXT_TRACKING and CONTEXT_TRACKING_FORCE are currently defined
in kernel/rcu/kconfig, which might have made sense at some point, but
no longer does given that RCU refers to neither of these Kconfig options.

Therefore move them to kernel/time/Kconfig, where the rest of the
NO_HZ_FULL Kconfig options live.

Signed-off-by: Paul E. McKenney <paulmck@linux.ibm.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Link: https://lkml.kernel.org/r/20181220170525.GA12579@linux.ibm.com
2019-01-15 11:16:41 +01:00
Paul E. McKenney 5ac7cdc298 rcutorture: Don't do busted forward-progress testing
The "busted" rcutorture type is an intentionally broken implementation
of RCU.  Doing forward-progress testing on this implementation is not
particularly meaningful on the one hand and can result in fatal abuse
of the memory allocator on the other.  This commit therefore disables
forward-progress testing of the "busted" rcutorture type.

Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Paul E. McKenney <paulmck@linux.ibm.com>
2018-12-01 12:45:42 -08:00
Paul E. McKenney 2e57bf97a6 rcutorture: Use 100ms buckets for forward-progress callback histograms
This commit narrows the scope of each bucket of the forward-progress
callback-invocation histograms from one second to 100 milliseconds, which
aids debugging of forward-progress problems by making shorter-duration
callback-invocation stalls visible.

Signed-off-by: Paul E. McKenney <paulmck@linux.ibm.com>
2018-12-01 12:45:41 -08:00
Paul E. McKenney 2667ccce93 rcutorture: Recover from OOM during forward-progress tests
This commit causes the OOM handler to do rcu_barrier() calls and to
free up forward-progress callbacks in order to recover from OOM events.
The current test is terminated, but subsequent forward-progress tests can
proceed.  This allows a long test to result in multiple forward-progress
failures, greatly reducing the required testing time.

Signed-off-by: Paul E. McKenney <paulmck@linux.ibm.com>
2018-12-01 12:45:41 -08:00
Paul E. McKenney 73d665b141 rcutorture: Print forward-progress test age upon failure
This commit prints the age of the forward-progress test in jiffies,
in order to allow better interpretation of the callback-invocation
histograms.

Signed-off-by: Paul E. McKenney <paulmck@linux.ibm.com>
2018-12-01 12:45:40 -08:00
Paul E. McKenney c51d7b5e6c rcutorture: Print time since GP end upon forward-progress failure
If rcutorture's forward-progress tests fail while a grace period is not
in progress, it is useful to print the time since the last grace period
ended as a way to detect failure to launch a new grace period.  This
commit therefore makes this change.

Signed-off-by: Paul E. McKenney <paulmck@linux.ibm.com>
2018-12-01 12:45:40 -08:00
Paul E. McKenney 1a682754c7 rcutorture: Print histogram of CB invocation at OOM time
One reason why a forward-progress test might fail would be if something
prevented or delayed callback invocation.  This commit therefore adds a
callback-invocation histogram printout when OOM is reported to rcutorture.

Signed-off-by: Paul E. McKenney <paulmck@linux.ibm.com>
2018-12-01 12:45:39 -08:00
Paul E. McKenney 8dd3b54689 rcutorture: Print GP age upon forward-progress failure
Signed-off-by: Paul E. McKenney <paulmck@linux.ibm.com>
2018-12-01 12:45:38 -08:00
Paul E. McKenney bfcfcffc5f rcu: Print per-CPU callback counts for forward-progress failures
This commit prints out the non-zero per-CPU callback counts when a
forware-progress error (OOM event) occurs.

Signed-off-by: Paul E. McKenney <paulmck@linux.ibm.com>
[ paulmck: Fix a pair of uninitialized locals spotted by kbuild test robot. ]
2018-12-01 12:45:38 -08:00
Paul E. McKenney 903ee83d91 rcu: Account for nocb-CPU callback counts in RCU CPU stall warnings
The RCU CPU stall warnings print an estimate of the total number of
RCU callbacks queued in the system, but this estimate leaves out
the callbacks queued for nocbs CPUs.  This commit therefore introduces
rcu_get_n_cbs_cpu(), which gives an accurate callback estimate for
both nocbs and normal CPUs, and uses this new function as needed.

This commit also introduces a rcu_get_n_cbs_nocb_cpu() helper function
that returns the number of callbacks for nocbs CPUs or zero otherwise,
and also uses this function in place of direct access to ->nocb_q_count
while in the area (fewer characters, you see).

Signed-off-by: Paul E. McKenney <paulmck@linux.ibm.com>
2018-12-01 12:45:37 -08:00
Paul E. McKenney e0aff97355 rcutorture: Dump grace-period diagnostics upon forward-progress OOM
This commit adds an OOM notifier during rcutorture forward-progress
testing.  If this notifier is invoked, it dumps out some grace-period
state to help debug the forward-progress problem.

Signed-off-by: Paul E. McKenney <paulmck@linux.ibm.com>
2018-12-01 12:45:36 -08:00
Paul E. McKenney 61670adcb4 rcutorture: Prepare for asynchronous access to rcu_fwd_startat
Because rcutorture's forward-progress checking will trigger from an
OOM notifier, this notifier will introduce asynchronous concurrent
access to the rcu_fwd_startat variable.  This commit therefore prepares
for this by converting updates to WRITE_ONCE().

Signed-off-by: Paul E. McKenney <paulmck@linux.ibm.com>
2018-12-01 12:45:35 -08:00
Paul E. McKenney 5ab7ab8362 rcutorture: Affinity forward-progress test to avoid housekeeping CPUs
This commit affinities the forward-progress tests to avoid hogging a
housekeeping CPU on the theory that the offloaded callbacks will be
running on those housekeeping CPUs.

Signed-off-by: Paul E. McKenney <paulmck@linux.ibm.com>
[ paulmck: Fix NULL-pointer issue located by kbuild test robot. ]
Tested-by: Rong Chen <rong.a.chen@intel.com>
2018-12-01 12:45:34 -08:00
Paul E. McKenney 6b3de7a172 rcutorture: Break up too-long rcu_torture_fwd_prog() function
This commit splits rcu_torture_fwd_prog_nr() and rcu_torture_fwd_prog_cr()
functions out of rcu_torture_fwd_prog() in order to reduce indentation
pain and because rcu_torture_fwd_prog() was getting a bit too long.
In addition, this will enable easier conditional execution of the
rcu_torture_fwd_prog_cr() function, which can give false-positive
failures in some NO_HZ_FULL configurations due to overloading the
housekeeping CPUs.

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2018-12-01 12:45:34 -08:00
Paul E. McKenney fc6f9c5778 rcutorture: Remove cbflood facility
Now that the forward-progress code does a full-bore continuous callback
flood lasting multiple seconds, there is little point in also posting a
mere 60,000 callbacks every second or so.  This commit therefore removes
the old cbflood testing.  Over time, it may be desirable to concurrently
do full-bore continuous callback floods on all CPUs simultaneously, but
one dragon at a time.

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2018-12-01 12:45:33 -08:00
Paul E. McKenney 4871848531 rcutorture: Add call_rcu() flooding forward-progress tests
This commit adds a call_rcu() flooding loop to the forward-progress test.
This emulates tight userspace loops that force call_rcu() invocations,
for example, the infamous loop containing close(open()) that instigated
the addition of blimit.  If RCU does not make sufficient forward progress
in invoking the resulting flood of callbacks, rcutorture emits a warning.

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2018-12-01 12:45:32 -08:00
Paul E. McKenney eaaf055f27 Merge branches 'bug.2018.11.12a', 'consolidate.2018.12.01a', 'doc.2018.11.12a', 'fixes.2018.11.12a', 'initrd.2018.11.08b', 'sil.2018.11.12a' and 'srcu.2018.11.27a' into HEAD
bug.2018.11.12a:  Get rid of BUG_ON() and friends
consolidate.2018.12.01a:  Continued RCU flavor-consolidation cleanup
doc.2018.11.12a:  Documentation updates
fixes.2018.11.12a:  Miscellaneous fixes
initrd.2018.11.08b:  Automate creation of rcutorture initrd
sil.2018.11.12a:  Remove more spin_unlock_wait() calls
2018-12-01 12:43:16 -08:00
Paul E. McKenney aacb5d91ab srcu: Use "ssp" instead of "sp" for srcu_struct pointer
In RCU, the distinction between "rsp", "rnp", and "rdp" has served well
for a great many years, but in SRCU, "sp" vs. "sdp" has proven confusing.
This commit therefore renames SRCU's "sp" pointers to "ssp", so that there
is "ssp" for srcu_struct pointer, "snp" for srcu_node pointer, and "sdp"
for srcu_data pointer.

Signed-off-by: Paul E. McKenney <paulmck@linux.ibm.com>
2018-11-27 09:24:17 -08:00
Dennis Krein eb4c238227 srcu: Lock srcu_data structure in srcu_gp_start()
The srcu_gp_start() function is called with the srcu_struct structure's
->lock held, but not with the srcu_data structure's ->lock.  This is
problematic because this function accesses and updates the srcu_data
structure's ->srcu_cblist, which is protected by that lock.  Failing to
hold this lock can result in corruption of the SRCU callback lists,
which in turn can result in arbitrarily bad results.

This commit therefore makes srcu_gp_start() acquire the srcu_data
structure's ->lock across the calls to rcu_segcblist_advance() and
rcu_segcblist_accelerate(), thus preventing this corruption.

Reported-by: Bart Van Assche <bvanassche@acm.org>
Reported-by: Christoph Hellwig <hch@infradead.org>
Reported-by: Sebastian Kuzminsky <seb.kuzminsky@gmail.com>
Signed-off-by: Dennis Krein <Dennis.Krein@netapp.com>
Signed-off-by: Paul E. McKenney <paulmck@linux.ibm.com>
Tested-by: Dennis Krein <Dennis.Krein@netapp.com>
Cc: <stable@vger.kernel.org> # 4.16.x
2018-11-27 09:23:57 -08:00
Paul E. McKenney 5f1a6ef374 rcu: Avoid signed integer overflow in rcu_preempt_deferred_qs()
Subtracting INT_MIN can be interpreted as unconditional signed integer
overflow, which according to the C standard is undefined behavior.
Therefore, kernel build arguments notwithstanding, it would be good to
future-proof the code.  This commit therefore substitutes INT_MAX for
INT_MIN in order to avoid undefined behavior.

While in the neighborhood, this commit also creates some meaningful names
for INT_MAX and friends in order to improve readability, as suggested
by Joel Fernandes.

Reported-by: Ran Rozenstein <ranro@mellanox.com>
Signed-off-by: Paul E. McKenney <paulmck@linux.ibm.com>
2018-11-12 09:03:59 -08:00
Paul E. McKenney 117f683c6e rcu: Replace this_cpu_ptr() with __this_cpu_read()
Because __this_cpu_read() can be lighter weight than equivalent uses of
this_cpu_ptr(), this commit replaces the latter with the former.

Signed-off-by: Paul E. McKenney <paulmck@linux.ibm.com>
2018-11-12 09:03:59 -08:00
Paul E. McKenney 05f415715c rcu: Speed up expedited GPs when interrupting RCU reader
In PREEMPT kernels, an expedited grace period might send an IPI to a
CPU that is executing an RCU read-side critical section.  In that case,
it would be nice if the rcu_read_unlock() directly interacted with the
RCU core code to immediately report the quiescent state.  And this does
happen in the case where the reader has been preempted.  But it would
also be a nice performance optimization if immediate reporting also
happened in the preemption-free case.

This commit therefore adds an ->exp_hint field to the task_struct structure's
->rcu_read_unlock_special field.  The IPI handler sets this hint when
it has interrupted an RCU read-side critical section, and this causes
the outermost rcu_read_unlock() call to invoke rcu_read_unlock_special(),
which, if preemption is enabled, reports the quiescent state immediately.
If preemption is disabled, then the report is required to be deferred
until preemption (or bottom halves or interrupts or whatever) is re-enabled.

Because this is a hint, it does nothing for more complicated cases.  For
example, if the IPI interrupts an RCU reader, but interrupts are disabled
across the rcu_read_unlock(), but another rcu_read_lock() is executed
before interrupts are re-enabled, the hint will already have been cleared.
If you do crazy things like this, reporting will be deferred until some
later RCU_SOFTIRQ handler, context switch, cond_resched(), or similar.

Reported-by: Joel Fernandes <joel@joelfernandes.org>
Signed-off-by: Paul E. McKenney <paulmck@linux.ibm.com>
Acked-by: Joel Fernandes (Google) <joel@joelfernandes.org>
2018-11-12 09:03:59 -08:00
Paul E. McKenney 0a89e5a402 rcu: Trace end of grace period before end of grace period
Currently, rcu_gp_cleanup() traces the end of the old grace period after
the old grace period has officially ended.  This might make intuitive
sense, but it also makes for confusing event-trace output because the
"end" trace displays not the old but instead the new grace-period number.
This commit therefore traces the end of an old grace period just before
that grace period officially ends.

Reported-by: Aravinda Prasad <aravinda@linux.vnet.ibm.com>
Signed-off-by: Paul E. McKenney <paulmck@linux.ibm.com>
2018-11-12 09:03:59 -08:00
Zhouyi Zhou 2320bda26d rcu: Adjust the comment of function rcu_is_watching
Because RCU avoids interrupting idle CPUs, rcu_is_watching() is used to
test whether or not it is currently legal to run RCU read-side critical
sections on this CPU.  However, the first sentence and last sentences
of current comment for rcu_is_watching have opposite meaning of what
is expected.  This commit therefore fixes this header comment.

Signed-off-by: Zhouyi Zhou <zhouzhouyi@gmail.com>
Signed-off-by: Paul E. McKenney <paulmck@linux.ibm.com>
2018-11-12 09:03:59 -08:00
Paul E. McKenney c669c014d1 rcu: Add jiffies-since-GP-activity to show_rcu_gp_kthreads()
This commit adds a printout of the number of jiffies since the last time
that the RCU grace-period kthread did any processing.  This can be useful
when tracking down forward-progress issues.

Signed-off-by: Paul E. McKenney <paulmck@linux.ibm.com>
2018-11-12 09:03:59 -08:00
Paul E. McKenney 691960197e rcu: Add state name to show_rcu_gp_kthreads() output
This commit adds the name of the RCU grace-period state to
the show_rcu_gp_kthreads() output in order to ease debugging.
This commit also moves gp_state_getname() up in the code so that
show_rcu_gp_kthreads() can use it.

Signed-off-by: Paul E. McKenney <paulmck@linux.ibm.com>
2018-11-12 09:03:59 -08:00
Paul E. McKenney 791416c471 rcu: Parameterize rcu_check_gp_start_stall()
In order to debug forward-progress stalls, it is necessary to check
for excessively delayed grace-period starts.  This is currently done
for RCU CPU stall warnings by rcu_check_gp_start_stall(), which checks
to see if the start of a requested grace period has been delayed by an
RCU CPU stall warning period.  Because rcutorture will need to check
for the time consumed by an RCU forward-progress delay, this commit
promotes gpssdelay from a local variable to a formal parameter.  It is
not necessary to export rcu_check_gp_start_stall() because rcutorture
will access it via a wrapper function.

Signed-off-by: Paul E. McKenney <paulmck@linux.ibm.com>
2018-11-12 09:03:59 -08:00
Paul E. McKenney b3c1d9ec7c rcu: Avoid double multiply by HZ
The rcu_check_gp_start_stall() function multiplies the return value
from rcu_jiffies_till_stall_check() by HZ, but the units are already
in jiffies.  This commit therefore avoids the need for introduction of
a jiffies-squared unit by removing the extraneous multiplication.

Signed-off-by: Paul E. McKenney <paulmck@linux.ibm.com>
2018-11-12 09:03:59 -08:00
Paul E. McKenney f0ad56e876 rcu: Eliminate BUG_ON() for kernel/rcu/update.c
The update.c file has a number of calls to BUG_ON(), which panics the
kernel, which is not a good strategy for devices (like embedded) that
don't have a way to capture console output.  This commit therefore
converts these BUG_ON() calls to WARN_ON_ONCE() and WARN_ONCE().

Reported-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Paul E. McKenney <paulmck@linux.ibm.com>
2018-11-12 08:15:59 -08:00
Paul E. McKenney 9213784b48 rcu: Eliminate BUG_ON() for kernel/rcu/tree_plugin.h
The tree_plugin.h file has a number of calls to BUG_ON(), which panics
the kernel, which is not a good strategy for devices (like embedded)
that don't have a way to capture console output.  This commit therefore
converts these BUG_ON() calls to WARN_ON_ONCE() and WARN_ONCE().

Reported-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Paul E. McKenney <paulmck@linux.ibm.com>
[ paulmck: Fix typo: s/rcuo/rcub/. ]
2018-11-12 08:15:16 -08:00
Paul E. McKenney 9cac83a57e rcu: Stop expedited grace periods from relying on stop-machine
The CPU-selection code in sync_rcu_exp_select_cpus() disables preemption
to prevent the cpu_online_mask from changing.  However, this relies on
the stop-machine mechanism in the CPU-hotplug offline code, which is not
desirable (it would be good to someday remove the stop-machine mechanism).

This commit therefore instead uses the relevant leaf rcu_node structure's
->ffmask, which has a bit set for all CPUs that are fully functional.
A given CPU's bit is cleared very early during offline processing by
rcutree_offline_cpu() and set very late during online processing by
rcutree_online_cpu().  Therefore, if a CPU's bit is set in this mask, and
preemption is disabled, we have to be before the synchronize_sched() in
the CPU-hotplug offline code, which means that the CPU is guaranteed to be
workqueue-ready throughout the duration of the enclosing preempt_disable()
region of code.

This also has the side-effect of using WORK_CPU_UNBOUND if all the CPUs for
this leaf rcu_node structure are offline, which is an acceptable difference
in behavior.

Reported-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2018-11-11 11:23:01 -08:00
Paul E. McKenney 0607ba8403 srcu: Prevent __call_srcu() counter wrap with read-side critical section
Ever since cdf7abc461 ("srcu: Allow use of Tiny/Tree SRCU from
both process and interrupt context"), it has been permissible
to use SRCU read-side critical sections in interrupt context.
This allows __call_srcu() to use SRCU read-side critical sections to
prevent a new SRCU grace period from ending before the call to either
srcu_funnel_gp_start() or srcu_funnel_exp_start completes, thus preventing
SRCU grace-period counter overflow during that time.

Note that this does not permit removal of the counter-wrap checks in
srcu_gp_end().  These check are necessary to handle the case where
a given CPU does not interact at all with SRCU for an extended time
period.

This commit therefore adds an SRCU read-side critical section to
__call_srcu() in order to prevent grace period counter wrap during
the funnel-locking process.

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2018-11-08 21:54:14 -08:00
Paul E. McKenney d3ff3891b2 rcu: Consolidate the RCU update functions invoked by sync.c
This commit retains all the various gp_ops[] entries, but makes their
update functions all be synchronize_rcu(), call_rcu() and rcu_barrier().
The read-side checks remain consistent with the various RCU flavors,
which still exist on the read side.

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
2018-11-08 21:43:20 -08:00
Paul E. McKenney 309ba859b9 rcu: Eliminate synchronize_rcu_mult()
Now that synchronize_rcu() waits for both RCU read-side critical
sections and preempt-disabled regions of code, the sole caller of
synchronize_rcu_mult() can be replaced by synchronize_rcu().
This patch makes this change and removes synchronize_rcu_mult().
Note that _wait_rcu_gp() still supports synchronize_rcu_mult(),
and thus might be simplified in the future to take only take
a single call_rcu() function rather than the current list of them.

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2018-11-08 21:43:20 -08:00
Joel Fernandes (Google) adbccddb4a rcu: Fix rcu_{node,data} comments about gp_seq_needed
Recent changes have removed the old ->gp_seq_needed field from the
rcu_state structure, which in turn obsoleted a couple of comments in
the rcu_node and rcu_data structures.  This commit therefore updates
these comments accordingly.

Signed-off-by: Joel Fernandes (Google) <joel@joelfernandes.org>
Cc: <kernel-team@android.com>
Signed-off-by: Paul E. McKenney <paulmck@linux.ibm.com>
2018-11-08 21:43:20 -08:00