Commit Graph

607 Commits

Author SHA1 Message Date
Linus Torvalds bcd550745f Merge branch 'timers-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull timer core updates from Thomas Gleixner.

* 'timers-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  ia64: vsyscall: Add missing paranthesis
  alarmtimer: Don't call rtc_timer_init() when CONFIG_RTC_CLASS=n
  x86: vdso: Put declaration before code
  x86-64: Inline vdso clock_gettime helpers
  x86-64: Simplify and optimize vdso clock_gettime monotonic variants
  kernel-time: fix s/then/than/ spelling errors
  time: remove no_sync_cmos_clock
  time: Avoid scary backtraces when warning of > 11% adj
  alarmtimer: Make sure we initialize the rtctimer
  ntp: Fix leap-second hrtimer livelock
  x86, tsc: Skip refined tsc calibration on systems with reliable TSC
  rtc: Provide flag for rtc devices that don't support UIE
  ia64: vsyscall: Use seqcount instead of seqlock
  x86: vdso: Use seqcount instead of seqlock
  x86: vdso: Remove bogus locking in update_vsyscall_tz()
  time: Remove bogus comments
  time: Fix change_clocksource locking
  time: x86: Fix race switching from vsyscall to non-vsyscall clock
2012-03-29 14:16:48 -07:00
Thomas Gleixner c5e14e7630 alarmtimer: Don't call rtc_timer_init() when CONFIG_RTC_CLASS=n
rtc_timer_init() is not available when CONFIG_RTC_CLASS=n. Provide a
proper wrapper in the RTC section of alarmtimer.c

Reported-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: John Stultz <john.stultz@linaro.org>
2012-03-24 12:46:23 +01:00
Jim Cromie 88b28adf6f kernel-time: fix s/then/than/ spelling errors
Use than for comparisons, like more than.

CC: John Stultz <john.stultz@linaro.org>
Signed-off-by: Jim Cromie <jim.cromie@gmail.com>
Signed-off-by: John Stultz <john.stultz@linaro.org>
2012-03-23 16:49:21 -07:00
Cesar Eduardo Barros 335dd85895 time: remove no_sync_cmos_clock
Commit 9863c90f68 (x86, vmware: Remove
deprecated VMI kernel support) removed the only place which set
no_sync_cmos_clock. Since that commit, this variable is never set.

Signed-off-by: Cesar Eduardo Barros <cesarb@cesarb.net>
Signed-off-by: John Stultz <john.stultz@linaro.org>
2012-03-23 16:25:20 -07:00
John Stultz e919cfd42d time: Avoid scary backtraces when warning of > 11% adj
Folks have been getting a number of warnings about time
adjustments > 11%. The WARN_ON leaves a big useless backtrace
so this patch removes it for a printk_once().

I'm still working to narrow down the cause of the > 11% adjustment.

Signed-off-by: John Stultz <john.stultz@linaro.org>
2012-03-23 16:25:16 -07:00
John Stultz ad30dfa94c alarmtimer: Make sure we initialize the rtctimer
jonghwan Choi reported seeing warnings with the alarmtimer
code at suspend/resume time, and pointed out that the
rtctimer isn't being properly initialized.

This patch corrects this issue.

Reported-by: jonghwan Choi <jhbird.choi@gmail.com>
Signed-off-by: John Stultz <john.stultz@linaro.org>
2012-03-23 16:23:12 -07:00
Linus Torvalds 7bfe0e66d5 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input
Pull input subsystem updates from Dmitry Torokhov:
 "- we finally merged driver for USB version of Synaptics touchpads
    (I guess most commonly found in IBM/Lenovo keyboard/touchpad combo);

   - a bunch of new drivers for embedded platforms (Cypress
     touchscreens, DA9052 OnKey, MAX8997-haptic, Ilitek ILI210x
     touchscreens, TI touchscreen);

   - input core allows clients to specify desired clock source for
     timestamps on input events (EVIOCSCLOCKID ioctl);

   - input core allows querying state of all MT slots for given event
     code via EVIOCGMTSLOTS ioctl;

   - various driver fixes and improvements."

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input: (45 commits)
  Input: ili210x - add support for Ilitek ILI210x based touchscreens
  Input: altera_ps2 - use of_match_ptr()
  Input: synaptics_usb - switch to module_usb_driver()
  Input: convert I2C drivers to use module_i2c_driver()
  Input: convert SPI drivers to use module_spi_driver()
  Input: omap4-keypad - move platform_data to <linux/platform_data>
  Input: kxtj9 - who_am_i check value and initial data rate fixes
  Input: add driver support for MAX8997-haptic
  Input: tegra-kbc - revise device tree support
  Input: of_keymap - add device tree bindings for simple key matrices
  Input: wacom - fix physical size calculation for 3rd-gen Bamboo
  Input: twl4030-vibra - really switch from #if to #ifdef
  Input: hp680_ts_input - ensure arguments to request_irq and free_irq are compatible
  Input: max8925_onkey - avoid accessing input device too early
  Input: max8925_onkey - allow to be used as a wakeup source
  Input: atmel-wm97xx - convert to dev_pm_ops
  Input: atmel-wm97xx - set driver owner
  Input: add cyttsp touchscreen maintainer entry
  Input: cyttsp - remove useless checks in cyttsp_probe()
  Input: usbtouchscreen - add support for Data Modul EasyTouch TP 72037
  ...
2012-03-22 20:20:18 -07:00
John Stultz 6b43ae8a61 ntp: Fix leap-second hrtimer livelock
Since commit 7dffa3c673 the ntp
subsystem has used an hrtimer for triggering the leapsecond
adjustment. However, this can cause a potential livelock.

Thomas diagnosed this as the following pattern:
CPU 0                                                    CPU 1
do_adjtimex()
  spin_lock_irq(&ntp_lock);
    process_adjtimex_modes();				 timer_interrupt()
      process_adj_status();                                do_timer()
        ntp_start_leap_timer();                             write_lock(&xtime_lock);
          hrtimer_start();                                  update_wall_time();
             hrtimer_reprogram();                            ntp_tick_length()
               tick_program_event()                            spin_lock(&ntp_lock);
                 clockevents_program_event()
		   ktime_get()
                     seq = req_seqbegin(xtime_lock);

This patch tries to avoid the problem by reverting back to not using
an hrtimer to inject leapseconds, and instead we handle the leapsecond
processing in the second_overflow() function.

The downside to this change is that on systems that support highres
timers, the leap second processing will occur on a HZ tick boundary,
(ie: ~1-10ms, depending on HZ)  after the leap second instead of
possibly sooner (~34us in my tests w/ x86_64 lapic).

This patch applies on top of tip/timers/core.

CC: Sasha Levin <levinsasha928@gmail.com>
CC: Thomas Gleixner <tglx@linutronix.de>
Reported-by: Sasha Levin <levinsasha928@gmail.com>
Diagnoised-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Sasha Levin <levinsasha928@gmail.com>
Signed-off-by: John Stultz <john.stultz@linaro.org>
2012-03-22 19:43:43 -07:00
John Stultz f695cf9483 time: Fix change_clocksource locking
change_clocksource() fails to grab locks or call timekeeping_update(),
which leaves a race window for time inconsistencies.

This adds proper locking and a call to timekeeping_update() to fix this.

CC: Andy Lutomirski <luto@amacapital.net>
CC: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: John Stultz <john.stultz@linaro.org>
2012-03-15 18:17:54 -07:00
Sasha Levin a078c6d0e6 ntp: Fix integer overflow when setting time
'long secs' is passed as divisor to div_s64, which accepts a 32bit
divisor. On 64bit machines that value is trimmed back from 8 bytes
back to 4, causing a divide by zero when the number is bigger than
(1 << 32) - 1 and all 32 lower bits are 0.

Use div64_long() instead.

Signed-off-by: Sasha Levin <levinsasha928@gmail.com>
Cc: johnstul@us.ibm.com
Link: http://lkml.kernel.org/r/1331829374-31543-2-git-send-email-levinsasha928@gmail.com
Cc: stable@vger.kernel.org
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2012-03-15 21:41:34 +01:00
Dmitry Torokhov b675b3667f Merge commit 'v3.3-rc6' into next 2012-03-09 10:55:17 -08:00
Thomas Gleixner 97ac984d2f Merge branch 'fortglx/3.4/time' of git://git.linaro.org/people/jstultz/linux into timers/core 2012-02-15 20:29:18 +01:00
Frederic Weisbecker 15f827be93 nohz: Remove ts->Einidle checks before restarting the tick
ts->inidle is set by tick_nohz_idle_enter() and unset by
tick_nohz_idle_exit(). However these two calls are assumed
to be always paired. This means that by the time we call
tick_nohz_idle_exit(), ts->inidle is supposed to be always
set to 1.

Remove the checks for ts->inidle in tick_nohz_idle_exit().
This simplifies a bit the code and improves its debuggability
(ie: ensure the call is paired with a tick_nohz_idle_enter()
call).

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Reviewed-by: Yong Zhang <yong.zhang0@gmail.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: John Stultz <john.stultz@linaro.org>
Cc: Ingo Molnar <mingo@elte.hu>
Link: http://lkml.kernel.org/r/1327427984-23282-2-git-send-email-fweisbec@gmail.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2012-02-15 15:23:09 +01:00
Michal Hocko 430ee88195 nohz: Remove update_ts_time_stat from tick_nohz_start_idle
There is no reason to call update_ts_time_stat from tick_nohz_start_idle
anymore (after e0e37c20 sched: Eliminate the ts->idle_lastupdate field)
when we updated idle_lastupdate unconditionally.

We haven't set idle_active yet and do not provide last_update_time so
the whole call end up being just 2 wasted branches.

Signed-off-by: Michal Hocko <mhocko@suse.cz>
Cc: Arjan van de Ven <arjan@linux.intel.com>
Link: http://lkml.kernel.org/r/1322755222-6951-1-git-send-email-mhocko@suse.cz
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2012-02-15 15:23:09 +01:00
Suresh Siddha 77b0d60c5a clockevents: Leave the broadcast device in shutdown mode when not needed
Platforms with Always Running APIC Timer doesn't use the broadcast timer
but the kernel is leaving the broadcast timer (HPET in this case)
in oneshot mode.

On these platforms, before the switch to oneshot mode, broadcast device is
actually in shutdown mode. Code checks for empty tick_broadcast_mask and
avoids going into the periodic mode.

During switch to oneshot mode, add the same tick_broadcast_mask checks in the
tick_broadcast_switch_to_oneshot() and avoid the broadcast device going into
the oneshot mode.

Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Cc: john stultz <johnstul@us.ibm.com>
Cc: venki@google.com
Link: http://lkml.kernel.org/r/1320452301.15071.16.camel@sbsiddha-desk.sc.intel.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2012-02-15 15:23:09 +01:00
John Stultz a80b83b7b8 Input: add infrastructure for selecting clockid for event time stamps
As noted by Arve and others, since wall time can jump backwards, it is
difficult to use for input because one cannot determine if one event
occurred before another or for how long a key was pressed.

However, the timestamp field is part of the kernel ABI, and cannot be
changed without possibly breaking existing users.

This patch adds a new IOCTL that allows a clockid to be set in the
evdev_client struct that will specify which time base to use for event
timestamps (ie: CLOCK_MONOTONIC instead of CLOCK_REALTIME).

For now we only support CLOCK_MONOTONIC and CLOCK_REALTIME, but
in the future we could support other clockids if appropriate.

The default remains CLOCK_REALTIME, so we don't change the ABI.

Signed-off-by: John Stultz <john.stultz@linaro.org>
Reviewed-by: Daniel Kurtz <djkurtz@google.com>
Signed-off-by: Dmitry Torokhov <dtor@mail.ru>
2012-02-03 00:24:58 -08:00
Thomas Gleixner cc06268c6a time: Move common updates to a function
CC: Thomas Gleixner <tglx@linutronix.de>
CC: Eric Dumazet <eric.dumazet@gmail.com>
CC: Richard Cochran <richardcochran@gmail.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: John Stultz <john.stultz@linaro.org>
2012-01-26 19:44:31 -08:00
Thomas Gleixner 058892e632 time: Reorder so the hot data is together
Keep all the interesting data in a single cache line.

CC: Thomas Gleixner <tglx@linutronix.de>
CC: Eric Dumazet <eric.dumazet@gmail.com>
CC: Richard Cochran <richardcochran@gmail.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: John Stultz <john.stultz@linaro.org>
2012-01-26 19:44:29 -08:00
John Stultz 92c1d3ed4d time: Remove most of xtime_lock usage in timekeeping.c
Now that ntp.c's locking is reworked, we can remove most
of the xtime_lock usage in timekeeping.c

The remaining xtime_lock presence is really for jiffies access
and the global load calculation.

CC: Thomas Gleixner <tglx@linutronix.de>
CC: Eric Dumazet <eric.dumazet@gmail.com>
CC: Richard Cochran <richardcochran@gmail.com>
Signed-off-by: John Stultz <john.stultz@linaro.org>
2012-01-26 19:44:27 -08:00
John Stultz bd3312681f ntp: Add ntp_lock to replace xtime_locking
Use a ntp_lock spin lock to replace xtime_lock locking in ntp.c

CC: Thomas Gleixner <tglx@linutronix.de>
CC: Eric Dumazet <eric.dumazet@gmail.com>
CC: Richard Cochran <richardcochran@gmail.com>
Signed-off-by: John Stultz <john.stultz@linaro.org>
2012-01-26 19:44:25 -08:00
John Stultz ea7cf49a76 ntp: Access tick_length variable via ntp_tick_length()
Currently the NTP managed tick_length value is accessed globally,
in preparations for locking cleanups, make sure it is accessed via
a function and mark it as static.

CC: Thomas Gleixner <tglx@linutronix.de>
CC: Eric Dumazet <eric.dumazet@gmail.com>
CC: Richard Cochran <richardcochran@gmail.com>
Signed-off-by: John Stultz <john.stultz@linaro.org>
2012-01-26 19:44:23 -08:00
John Stultz 8357929e6a ntp: Cleanup timex.h
Move ntp_sycned to ntp.c and mark time_status as static.
Also yank function declaration for non-existant function.

CC: Thomas Gleixner <tglx@linutronix.de>
CC: Eric Dumazet <eric.dumazet@gmail.com>
CC: Richard Cochran <richardcochran@gmail.com>
Signed-off-by: John Stultz <john.stultz@linaro.org>
2012-01-26 19:44:20 -08:00
John Stultz 70471f2f06 time: Add timekeeper lock
Now that all the timekeeping variables are stored in
the timekeeper structure, add a new lock to protect the
structure.

For now, this lock nests under the xtime_lock for writes.

For readers, we don't need to take xtime_lock anymore.

CC: Thomas Gleixner <tglx@linutronix.de>
CC: Eric Dumazet <eric.dumazet@gmail.com>
CC: Richard Cochran <richardcochran@gmail.com>
Signed-off-by: John Stultz <john.stultz@linaro.org>
2012-01-26 19:44:18 -08:00
John Stultz 8fcce546be time: Cleanup global variables and move them to the top
Move global xtime_lock and timekeeping_suspended values up
to the top of timekeeping.c

CC: Thomas Gleixner <tglx@linutronix.de>
CC: Eric Dumazet <eric.dumazet@gmail.com>
CC: Richard Cochran <richardcochran@gmail.com>
Signed-off-by: John Stultz <john.stultz@linaro.org>
2012-01-26 19:44:16 -08:00
John Stultz 01f71b47e0 time: Move raw_time into timekeeper structure
In preparation for locking cleanups, move raw_time into
timekeeper structure.

CC: Thomas Gleixner <tglx@linutronix.de>
CC: Eric Dumazet <eric.dumazet@gmail.com>
CC: Richard Cochran <richardcochran@gmail.com>
Signed-off-by: John Stultz <john.stultz@linaro.org>
2012-01-26 19:44:14 -08:00
John Stultz 8ff2cb92dd time: Move xtime into timekeeeper structure
In preparation for locking cleanups, move xtime into
timekeeper structure.

CC: Thomas Gleixner <tglx@linutronix.de>
CC: Eric Dumazet <eric.dumazet@gmail.com>
CC: Richard Cochran <richardcochran@gmail.com>
Signed-off-by: John Stultz <john.stultz@linaro.org>
2012-01-26 19:44:12 -08:00
John Stultz d9f7217aac time: Move wall_to_monotonic into the timekeeper structure
In preparation for locking cleanups, move wall_to_monotonic
into the timekeeper structure.

CC: Thomas Gleixner <tglx@linutronix.de>
CC: Eric Dumazet <eric.dumazet@gmail.com>
CC: Richard Cochran <richardcochran@gmail.com>
Signed-off-by: John Stultz <john.stultz@linaro.org>
2012-01-26 19:44:09 -08:00
John Stultz 00c5fb774e time: Move total_sleep_time into the timekeeper structure
Move total_sleep_time into the timekeeper structure in preparation
for locking cleanups

CC: Thomas Gleixner <tglx@linutronix.de>
CC: Eric Dumazet <eric.dumazet@gmail.com>
CC: Richard Cochran <richardcochran@gmail.com>
Signed-off-by: John Stultz <john.stultz@linaro.org>
2012-01-26 19:44:07 -08:00
Linus Torvalds 98793265b4 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial: (53 commits)
  Kconfig: acpi: Fix typo in comment.
  misc latin1 to utf8 conversions
  devres: Fix a typo in devm_kfree comment
  btrfs: free-space-cache.c: remove extra semicolon.
  fat: Spelling s/obsolate/obsolete/g
  SCSI, pmcraid: Fix spelling error in a pmcraid_err() call
  tools/power turbostat: update fields in manpage
  mac80211: drop spelling fix
  types.h: fix comment spelling for 'architectures'
  typo fixes: aera -> area, exntension -> extension
  devices.txt: Fix typo of 'VMware'.
  sis900: Fix enum typo 'sis900_rx_bufer_status'
  decompress_bunzip2: remove invalid vi modeline
  treewide: Fix comment and string typo 'bufer'
  hyper-v: Update MAINTAINERS
  treewide: Fix typos in various parts of the kernel, and fix some comments.
  clockevents: drop unknown Kconfig symbol GENERIC_CLOCKEVENTS_MIGR
  gpio: Kconfig: drop unknown symbol 'CS5535_GPIO'
  leds: Kconfig: Fix typo 'D2NET_V2'
  sound: Kconfig: drop unknown symbol ARCH_CLPS7500
  ...

Fix up trivial conflicts in arch/powerpc/platforms/40x/Kconfig (some new
kconfig additions, close to removed commented-out old ones)
2012-01-08 13:21:22 -08:00
Linus Torvalds 7affca3537 Merge branch 'driver-core-next' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core
* 'driver-core-next' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core: (73 commits)
  arm: fix up some samsung merge sysdev conversion problems
  firmware: Fix an oops on reading fw_priv->fw in sysfs loading file
  Drivers:hv: Fix a bug in vmbus_driver_unregister()
  driver core: remove __must_check from device_create_file
  debugfs: add missing #ifdef HAS_IOMEM
  arm: time.h: remove device.h #include
  driver-core: remove sysdev.h usage.
  clockevents: remove sysdev.h
  arm: convert sysdev_class to a regular subsystem
  arm: leds: convert sysdev_class to a regular subsystem
  kobject: remove kset_find_obj_hinted()
  m86k: gpio - convert sysdev_class to a regular subsystem
  mips: txx9_sram - convert sysdev_class to a regular subsystem
  mips: 7segled - convert sysdev_class to a regular subsystem
  sh: dma - convert sysdev_class to a regular subsystem
  sh: intc - convert sysdev_class to a regular subsystem
  power: suspend - convert sysdev_class to a regular subsystem
  power: qe_ic - convert sysdev_class to a regular subsystem
  power: cmm - convert sysdev_class to a regular subsystem
  s390: time - convert sysdev_class to a regular subsystem
  ...

Fix up conflicts with 'struct sysdev' removal from various platform
drivers that got changed:
 - arch/arm/mach-exynos/cpu.c
 - arch/arm/mach-exynos/irq-eint.c
 - arch/arm/mach-s3c64xx/common.c
 - arch/arm/mach-s3c64xx/cpu.c
 - arch/arm/mach-s5p64x0/cpu.c
 - arch/arm/mach-s5pv210/common.c
 - arch/arm/plat-samsung/include/plat/cpu.h
 - arch/powerpc/kernel/sysfs.c
and fix up cpu_is_hotpluggable() as per Greg in include/linux/cpu.h
2012-01-07 12:03:30 -08:00
Linus Torvalds 376613e81d Merge branch 'timers-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
* 'timers-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86, tsc: Skip TSC synchronization checks for tsc=reliable
  clocksource: Convert tcb_clksrc to use clocksource_register_hz/khz
  clocksource: cris: Convert to clocksource_register_khz
  clocksource: xtensa: Convert to clocksource_register_hz/khz
  clocksource: um: Convert to clocksource_register_hz/khz
  clocksource: parisc: Convert to clocksource_register_hz/khz
  clocksource: m86k: Convert to clocksource_register_hz/khz
  time: x86: Replace LATCH with PIT_LATCH in i8253 clocksource driver
  time: x86: Remove CLOCK_TICK_RATE from acpi_pm clocksource driver
  time: x86: Remove CLOCK_TICK_RATE from mach_timer.h
  time: x86: Remove CLOCK_TICK_RATE from tsc code
  time: Fix spelling mistakes in new comments
  time: fix bogus comment in timekeeping_get_ns_raw
2012-01-06 13:57:44 -08:00
Greg Kroah-Hartman ff4b8a57f0 Merge branch 'driver-core-next' into Linux 3.2
This resolves the conflict in the arch/arm/mach-s3c64xx/s3c6400.c file,
and it fixes the build error in the arch/x86/kernel/microcode_core.c
file, that the merge did not catch.

The microcode_core.c patch was provided by Stephen Rothwell
<sfr@canb.auug.org.au> who was invaluable in the merge issues involved
with the large sysdev removal process in the driver-core tree.

Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2012-01-06 11:42:52 -08:00
Linus Torvalds 0db49b72bc Merge branch 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
* 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (40 commits)
  sched/tracing: Add a new tracepoint for sleeptime
  sched: Disable scheduler warnings during oopses
  sched: Fix cgroup movement of waking process
  sched: Fix cgroup movement of newly created process
  sched: Fix cgroup movement of forking process
  sched: Remove cfs bandwidth period check in tg_set_cfs_period()
  sched: Fix load-balance lock-breaking
  sched: Replace all_pinned with a generic flags field
  sched: Only queue remote wakeups when crossing cache boundaries
  sched: Add missing rcu_dereference() around ->real_parent usage
  [S390] fix cputime overflow in uptime_proc_show
  [S390] cputime: add sparse checking and cleanup
  sched: Mark parent and real_parent as __rcu
  sched, nohz: Fix missing RCU read lock
  sched, nohz: Set the NOHZ_BALANCE_KICK flag for idle load balancer
  sched, nohz: Fix the idle cpu check in nohz_idle_balance
  sched: Use jump_labels for sched_feat
  sched/accounting: Fix parameter passing in task_group_account_field
  sched/accounting: Fix user/system tick double accounting
  sched/accounting: Re-use scheduler statistics for the root cgroup
  ...

Fix up conflicts in
 - arch/ia64/include/asm/cputime.h, include/asm-generic/cputime.h
	usecs_to_cputime64() vs the sparse cleanups
 - kernel/sched/fair.c, kernel/time/tick-sched.c
	scheduler changes in multiple branches
2012-01-06 08:44:54 -08:00
Linus Torvalds 423d091dfe Merge branch 'core-rcu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
* 'core-rcu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (64 commits)
  cpu: Export cpu_up()
  rcu: Apply ACCESS_ONCE() to rcu_boost() return value
  Revert "rcu: Permit rt_mutex_unlock() with irqs disabled"
  docs: Additional LWN links to RCU API
  rcu: Augment rcu_batch_end tracing for idle and callback state
  rcu: Add rcutorture tests for srcu_read_lock_raw()
  rcu: Make rcutorture test for hotpluggability before offlining CPUs
  driver-core/cpu: Expose hotpluggability to the rest of the kernel
  rcu: Remove redundant rcu_cpu_stall_suppress declaration
  rcu: Adaptive dyntick-idle preparation
  rcu: Keep invoking callbacks if CPU otherwise idle
  rcu: Irq nesting is always 0 on rcu_enter_idle_common
  rcu: Don't check irq nesting from rcu idle entry/exit
  rcu: Permit dyntick-idle with callbacks pending
  rcu: Document same-context read-side constraints
  rcu: Identify dyntick-idle CPUs on first force_quiescent_state() pass
  rcu: Remove dynticks false positives and RCU failures
  rcu: Reduce latency of rcu_prepare_for_idle()
  rcu: Eliminate RCU_FAST_NO_HZ grace-period hang
  rcu: Avoid needlessly IPIing CPUs at GP end
  ...
2012-01-06 08:02:40 -08:00
Linus Torvalds 3b87487ac5 Revert "clockevents: Set noop handler in clockevents_exchange_device()"
This reverts commit de28f25e82.

It results in resume problems for various people. See for example

  http://thread.gmane.org/gmane.linux.kernel/1233033
  http://thread.gmane.org/gmane.linux.kernel/1233389
  http://thread.gmane.org/gmane.linux.kernel/1233159
  http://thread.gmane.org/gmane.linux.kernel/1227868/focus=1230877

and the fedora and ubuntu bug reports

  https://bugzilla.redhat.com/show_bug.cgi?id=767248
  https://bugs.launchpad.net/ubuntu/+source/linux/+bug/904569

which got bisected down to the stable version of this commit.

Reported-by: Jonathan Nieder <jrnieder@gmail.com>
Reported-by: Phil Miller <mille121@illinois.edu>
Reported-by: Philip Langdale <philipl@overt.org>
Reported-by: Tim Gardner <tim.gardner@canonical.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Greg KH <gregkh@suse.de>
Cc: stable@kernel.org    # for stable kernels that applied the original
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-12-30 13:24:40 -08:00
Kay Sievers 7239f65cf3 clockevents: remove sysdev.h
This isn't needed in the clockevents.c file, and the header file is
going away soon, so just remove the #include

Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Kay Sievers <kay.sievers@vrfy.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2011-12-21 16:12:37 -08:00
Kusanagi Kouichi b1b73d0950 time/clocksource: Fix kernel-doc warnings
Fix various KernelDoc build warnings.

Signed-off-by: Kusanagi Kouichi <slash@ac.auone-net.jp>
Cc: John Stultz <johnstul@us.ibm.com>
Link: http://lkml.kernel.org/r/20111219091320.0D5AF6FC03D@msa105.auone-net.jp
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-12-19 11:41:40 +01:00
Ingo Molnar 6a54aebf69 Merge commit 'v3.2-rc5' into sched/core
Merge reason: Pick up the latest fixes.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-12-15 08:21:30 +01:00
Kay Sievers d369a5d8fc clocksource: convert sysdev_class to a regular subsystem
After all sysdev classes are ported to regular driver core entities, the
sysdev implementation will be entirely removed from the kernel.

Cc: John Stultz <johnstul@us.ibm.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Kay Sievers <kay.sievers@vrfy.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2011-12-14 15:28:51 -08:00
Frederic Weisbecker 1268fbc746 nohz: Remove tick_nohz_idle_enter_norcu() / tick_nohz_idle_exit_norcu()
Those two APIs were provided to optimize the calls of
tick_nohz_idle_enter() and rcu_idle_enter() into a single
irq disabled section. This way no interrupt happening in-between would
needlessly process any RCU job.

Now we are talking about an optimization for which benefits
have yet to be measured. Let's start simple and completely decouple
idle rcu and dyntick idle logics to simplify.

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Reviewed-by: Josh Triplett <josh@joshtriplett.org>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2011-12-11 10:31:57 -08:00
Frederic Weisbecker 2bbb6817c0 nohz: Allow rcu extended quiescent state handling seperately from tick stop
It is assumed that rcu won't be used once we switch to tickless
mode and until we restart the tick. However this is not always
true, as in x86-64 where we dereference the idle notifiers after
the tick is stopped.

To prepare for fixing this, add two new APIs:
tick_nohz_idle_enter_norcu() and tick_nohz_idle_exit_norcu().

If no use of RCU is made in the idle loop between
tick_nohz_enter_idle() and tick_nohz_exit_idle() calls, the arch
must instead call the new *_norcu() version such that the arch doesn't
need to call rcu_idle_enter() and rcu_idle_exit().

Otherwise the arch must call tick_nohz_enter_idle() and
tick_nohz_exit_idle() and also call explicitly:

- rcu_idle_enter() after its last use of RCU before the CPU is put
to sleep.
- rcu_idle_exit() before the first use of RCU after the CPU is woken
up.

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Mike Frysinger <vapier@gentoo.org>
Cc: Guan Xuetao <gxt@mprc.pku.edu.cn>
Cc: David Miller <davem@davemloft.net>
Cc: Chris Metcalf <cmetcalf@tilera.com>
Cc: Hans-Christian Egtvedt <hans-christian.egtvedt@atmel.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Russell King <linux@arm.linux.org.uk>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Paul Mundt <lethal@linux-sh.org>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2011-12-11 10:31:36 -08:00
Frederic Weisbecker 280f06774a nohz: Separate out irq exit and idle loop dyntick logic
The tick_nohz_stop_sched_tick() function, which tries to delay
the next timer tick as long as possible, can be called from two
places:

- From the idle loop to start the dytick idle mode
- From interrupt exit if we have interrupted the dyntick
idle mode, so that we reprogram the next tick event in
case the irq changed some internal state that requires this
action.

There are only few minor differences between both that
are handled by that function, driven by the ts->inidle
cpu variable and the inidle parameter. The whole guarantees
that we only update the dyntick mode on irq exit if we actually
interrupted the dyntick idle mode, and that we enter in RCU extended
quiescent state from idle loop entry only.

Split this function into:

- tick_nohz_idle_enter(), which sets ts->inidle to 1, enters
dynticks idle mode unconditionally if it can, and enters into RCU
extended quiescent state.

- tick_nohz_irq_exit() which only updates the dynticks idle mode
when ts->inidle is set (ie: if tick_nohz_idle_enter() has been called).

To maintain symmetry, tick_nohz_restart_sched_tick() has been renamed
into tick_nohz_idle_exit().

This simplifies the code and micro-optimize the irq exit path (no need
for local_irq_save there). This also prepares for the split between
dynticks and rcu extended quiescent state logics. We'll need this split to
further fix illegal uses of RCU in extended quiescent states in the idle
loop.

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Mike Frysinger <vapier@gentoo.org>
Cc: Guan Xuetao <gxt@mprc.pku.edu.cn>
Cc: David Miller <davem@davemloft.net>
Cc: Chris Metcalf <cmetcalf@tilera.com>
Cc: Hans-Christian Egtvedt <hans-christian.egtvedt@atmel.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Russell King <linux@arm.linux.org.uk>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Paul Mundt <lethal@linux-sh.org>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Reviewed-by: Josh Triplett <josh@joshtriplett.org>
2011-12-11 10:31:35 -08:00
Paul E. McKenney 9b2e4f1880 rcu: Track idleness independent of idle tasks
Earlier versions of RCU used the scheduling-clock tick to detect idleness
by checking for the idle task, but handled idleness differently for
CONFIG_NO_HZ=y.  But there are now a number of uses of RCU read-side
critical sections in the idle task, for example, for tracing.  A more
fine-grained detection of idleness is therefore required.

This commit presses the old dyntick-idle code into full-time service,
so that rcu_idle_enter(), previously known as rcu_enter_nohz(), is
always invoked at the beginning of an idle loop iteration.  Similarly,
rcu_idle_exit(), previously known as rcu_exit_nohz(), is always invoked
at the end of an idle-loop iteration.  This allows the idle task to
use RCU everywhere except between consecutive rcu_idle_enter() and
rcu_idle_exit() calls, in turn allowing architecture maintainers to
specify exactly where in the idle loop that RCU may be used.

Because some of the userspace upcall uses can result in what looks
to RCU like half of an interrupt, it is not possible to expect that
the irq_enter() and irq_exit() hooks will give exact counts.  This
patch therefore expands the ->dynticks_nesting counter to 64 bits
and uses two separate bitfields to count process/idle transitions
and interrupt entry/exit transitions.  It is presumed that userspace
upcalls do not happen in the idle loop or from usermode execution
(though usermode might do a system call that results in an upcall).
The counter is hard-reset on each process/idle transition, which
avoids the interrupt entry/exit error from accumulating.  Overflow
is avoided by the 64-bitness of the ->dyntick_nesting counter.

This commit also adds warnings if a non-idle task asks RCU to enter
idle state (and these checks will need some adjustment before applying
Frederic's OS-jitter patches (http://lkml.org/lkml/2011/10/7/246).
In addition, validation of ->dynticks and ->dynticks_nesting is added.

Signed-off-by: Paul E. McKenney <paul.mckenney@linaro.org>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Reviewed-by: Josh Triplett <josh@joshtriplett.org>
2011-12-11 10:31:24 -08:00
Thomas Gleixner c9c024b3f3 alarmtimers: Fix time comparison
The expiry function compares the timer against current time and does
not expire the timer when the expiry time is >= now. That's wrong. If
the timer is set for now, then it must expire.

Make the condition expiry > now for breaking out the loop.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: John Stultz <john.stultz@linaro.org>
Cc: stable@kernel.org
2011-12-06 11:38:32 +01:00
Suresh Siddha 69e1e811dc sched, nohz: Track nr_busy_cpus in the sched_group_power
Introduce nr_busy_cpus in the struct sched_group_power [Not in sched_group
because sched groups are duplicated for the SD_OVERLAP scheduler domain]
and for each cpu that enters and exits idle, this parameter will
be updated in each scheduler group of the scheduler domain that this cpu
belongs to.

To avoid the frequent update of this state as the cpu enters
and exits idle, the update of the stat during idle exit is
delayed to the first timer tick that happens after the cpu becomes busy.
This is done using NOHZ_IDLE flag in the struct rq's nohz_flags.

Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/20111202010832.555984323@sbsiddha-desk.sc.intel.com
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-12-06 09:06:32 +01:00
Linus Torvalds 40c043b077 Merge branch 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
* 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  clockevents: Set noop handler in clockevents_exchange_device()
  tick-broadcast: Stop active broadcast device when replacing it
  clocksource: Fix bug with max_deferment margin calculation
  rtc: Fix some bugs that allowed accumulating time drift in suspend/resume
  rtc: Disable the alarm in the hardware
2011-12-05 16:53:43 -08:00
Thomas Gleixner 0518469d0a Merge branch 'fortglx/3.3/tip/timers/core' of git://git.linaro.org/people/jstultz/linux into timers/core 2011-12-05 22:13:49 +01:00
Thomas Gleixner de28f25e82 clockevents: Set noop handler in clockevents_exchange_device()
If a device is shutdown, then there might be a pending interrupt,
which will be processed after we reenable interrupts, which causes the
original handler to be run. If the old handler is the (broadcast)
periodic handler the shutdown state might hang the kernel completely.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: stable@vger.kernel.org
2011-12-02 16:07:23 +01:00
Thomas Gleixner c1be84309c tick-broadcast: Stop active broadcast device when replacing it
When a better rated broadcast device is installed, then the current
active device is not disabled, which results in two running broadcast
devices.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: stable@vger.kernel.org
2011-12-02 16:06:54 +01:00
Yang Honggang (Joseph) b1f919664d clocksource: Fix bug with max_deferment margin calculation
In order to leave a margin of 12.5% we should >> 3 not >> 5.

CC: stable@kernel.org
Signed-off-by: Yang Honggang (Joseph) <eagle.rtlinux@gmail.com>
[jstultz: Modified commit subject]
Signed-off-by: John Stultz <john.stultz@linaro.org>
2011-12-01 15:50:00 -08:00
Paul Bolle a13b032776 clockevents: drop unknown Kconfig symbol GENERIC_CLOCKEVENTS_MIGR
There's no Kconfig symbol GENERIC_CLOCKEVENTS_MIGR, so the check for it
will always fail.

Signed-off-by: Paul Bolle <pebolle@tiscali.nl>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
2011-11-29 11:56:49 +01:00
Linus Torvalds c28800a9c3 Merge branch 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
* 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  hrtimer: Fix extra wakeups from __remove_hrtimer()
  timekeeping: add arch_offset hook to ktime_get functions
  clocksource: Avoid selecting mult values that might overflow when adjusted
  time: Improve documentation of timekeeeping_adjust()
2011-11-28 08:43:52 -08:00
John Stultz 3f86f28ffc time: Fix spelling mistakes in new comments
Fixup spelling issues caught by Richard

CC: Richard Cochran <richardcochran@gmail.com>
CC: Chen Jie <chenj@lemote.com>
CC: Steven Rostedt <rostedt@goodmis.org>
CC: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: John Stultz <john.stultz@linaro.org>
2011-11-21 19:00:55 -08:00
Dan McGee c9fad429d4 time: fix bogus comment in timekeeping_get_ns_raw
The whole point of this function is to return a value not touched by
NTP; unfortunately the comment got copied wholesale without adjustment
from the timekeeping_get_ns function above.

Signed-off-by: Dan McGee <dpmcgee@gmail.com>
Signed-off-by: John Stultz <john.stultz@linaro.org>
2011-11-21 19:00:46 -08:00
Hector Palacios d004e02405 timekeeping: add arch_offset hook to ktime_get functions
ktime_get and ktime_get_ts were calling timekeeping_get_ns()
but later they were not calling arch_gettimeoffset() so architectures
using this mechanism returned 0 ns when calling these functions.

This happened for example when running Busybox's ping which calls
syscall(__NR_clock_gettime, CLOCK_MONOTONIC, ts) which eventually
calls ktime_get. As a result the returned ping travel time was zero.

CC: stable@kernel.org
Signed-off-by: Hector Palacios <hector.palacios@digi.com>
Signed-off-by: John Stultz <john.stultz@linaro.org>
2011-11-17 14:57:19 -08:00
Ingo Molnar 367177e501 Merge branch 'formingo/3.2/tip/timers/core' of git://git.linaro.org/people/jstultz/linux into timers/core
Conflicts:
	kernel/time/timekeeping.c
2011-11-11 08:10:42 +01:00
John Stultz d65670a78c clocksource: Avoid selecting mult values that might overflow when adjusted
For some frequencies, the clocks_calc_mult_shift() function will
unfortunately select mult values very close to 0xffffffff.  This
has the potential to overflow when NTP adjusts the clock, adding
to the mult value.

This patch adds a clocksource.maxadj value, which provides
an approximation of an 11% adjustment(NTP limits adjustments to
500ppm and the tick adjustment is limited to 10%), which could
be made to the clocksource.mult value. This is then used to both
check that the current mult value won't overflow/underflow, as
well as warning us if the timekeeping_adjust() code pushes over
that 11% boundary.

v2: Fix max_adjustment calculation, and improve WARN_ONCE
messages.

v3: Don't warn before maxadj has actually been set

CC: Yong Zhang <yong.zhang0@gmail.com>
CC: David Daney <ddaney.cavm@gmail.com>
CC: Thomas Gleixner <tglx@linutronix.de>
CC: Chen Jie <chenj@lemote.com>
CC: zhangfx <zhangfx@lemote.com>
CC: stable@kernel.org
Reported-by: Chen Jie <chenj@lemote.com>
Reported-by: zhangfx <zhangfx@lemote.com>
Tested-by: Yong Zhang <yong.zhang0@gmail.com>
Signed-off-by: John Stultz <john.stultz@linaro.org>
2011-11-10 11:27:08 -08:00
Paul Gortmaker 6e5fdeedca kernel: Fix files explicitly needing EXPORT_SYMBOL infrastructure
These files were getting <linux/module.h> via an implicit non-obvious
path, but we want to crush those out of existence since they cost
time during compiles of processing thousands of lines of headers
for no reason.  Give them the lightweight header that just contains
the EXPORT_SYMBOL infrastructure.

Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2011-10-31 19:30:05 -04:00
John Stultz c2bc11113c time: Improve documentation of timekeeeping_adjust()
After getting a number of questions in private emails about the
math around admittedly very complex timekeeping_adjust() and
timekeeping_big_adjust(), I figure the code needs some better
comments.

Hopefully the explanations are clear enough and don't muddy the
water any worse.

Still needs documentation for ntp_error, but I couldn't recall
exactly the full explanation behind the code that's there
(although I do recall once working it out when Roman first
proposed it). Given a bit more time I can probably work it out,
but I don't want to hold back this documentation until then.

Signed-off-by: John Stultz <john.stultz@linaro.org>
Cc: Chen Jie <chenj@lemote.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Link: http://lkml.kernel.org/r/1319764362-32367-1-git-send-email-john.stultz@linaro.org
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-10-28 08:57:38 +02:00
Linus Torvalds 39adff5f69 Merge branch 'timers-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
* 'timers-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (23 commits)
  time, s390: Get rid of compile warning
  dw_apb_timer: constify clocksource name
  time: Cleanup old CONFIG_GENERIC_TIME references that snuck in
  time: Change jiffies_to_clock_t() argument type to unsigned long
  alarmtimers: Fix error handling
  clocksource: Make watchdog reset lockless
  posix-cpu-timers: Cure SMP accounting oddities
  s390: Use direct ktime path for s390 clockevent device
  clockevents: Add direct ktime programming function
  clockevents: Make minimum delay adjustments configurable
  nohz: Remove "Switched to NOHz mode" debugging messages
  proc: Consider NO_HZ when printing idle and iowait times
  nohz: Make idle/iowait counter update conditional
  nohz: Fix update_ts_time_stat idle accounting
  cputime: Clean up cputime_to_usecs and usecs_to_cputime macros
  alarmtimers: Rework RTC device selection using class interface
  alarmtimers: Add try_to_cancel functionality
  alarmtimers: Add more refined alarm state tracking
  alarmtimers: Remove period from alarm structure
  alarmtimers: Remove interval cap limit hack
  ...
2011-10-26 17:15:03 +02:00
Linus Torvalds 19b4a8d520 Merge branch 'core-rcu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
* 'core-rcu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (45 commits)
  rcu: Move propagation of ->completed from rcu_start_gp() to rcu_report_qs_rsp()
  rcu: Remove rcu_needs_cpu_flush() to avoid false quiescent states
  rcu: Wire up RCU_BOOST_PRIO for rcutree
  rcu: Make rcu_torture_boost() exit loops at end of test
  rcu: Make rcu_torture_fqs() exit loops at end of test
  rcu: Permit rt_mutex_unlock() with irqs disabled
  rcu: Avoid having just-onlined CPU resched itself when RCU is idle
  rcu: Suppress NMI backtraces when stall ends before dump
  rcu: Prohibit grace periods during early boot
  rcu: Simplify unboosting checks
  rcu: Prevent early boot set_need_resched() from __rcu_pending()
  rcu: Dump local stack if cannot dump all CPUs' stacks
  rcu: Move __rcu_read_unlock()'s barrier() within if-statement
  rcu: Improve rcu_assign_pointer() and RCU_INIT_POINTER() documentation
  rcu: Make rcu_assign_pointer() unconditionally insert a memory barrier
  rcu: Make rcu_implicit_dynticks_qs() locals be correct size
  rcu: Eliminate in_irq() checks in rcu_enter_nohz()
  nohz: Remove nohz_cpu_mask
  rcu: Document interpretation of RCU-lockdep splats
  rcu: Allow rcutorture's stat_interval parameter to be changed at runtime
  ...
2011-10-26 16:26:53 +02:00
Shi, Alex fc0763f53e nohz: Remove nohz_cpu_mask
RCU no longer uses this global variable, nor does anyone else.  This
commit therefore removes this variable.  This reduces memory footprint
and also removes some atomic instructions and memory barriers from
the dyntick-idle path.

Signed-off-by: Alex Shi <alex.shi@intel.com>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2011-09-28 21:38:29 -07:00
Thomas Gleixner 4523f6ada8 alarmtimers: Fix error handling
commit 8bc0daf (alarmtimers: Rework RTC device selection using class
interface) did not implement required error checks. Add them.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2011-09-14 10:54:29 +02:00
Thomas Gleixner 2737c49f29 locking, timer_stats: Annotate table_lock as raw
The table_lock lock can be taken in atomic context and therefore
cannot be preempted on -rt - annotate it.

In mainline this change documents the low level nature of
the lock - otherwise there's no functional difference. Lockdep
and Sparse checking will work as usual.

Reported-by: Andreas Sundebo <kernel@sundebo.dk>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Andreas Sundebo <kernel@sundebo.dk>

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-09-13 11:12:00 +02:00
Thomas Gleixner 9fb6033625 clocksource: Make watchdog reset lockless
KGDB needs to trylock watchdog_lock when trying to reset the
clocksource watchdog after the system has been stopped to avoid a
potential deadlock. When the trylock fails TSC usually becomes
unstable.

We can be more clever by using an atomic counter and checking it in
the clocksource_watchdog callback. We restart the watchdog whenever
the counter is > 0 and only decrement the counter when we ran through
a full update cycle.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: John Stultz <johnstul@us.ibm.com>
Acked-by: Jason Wessel <jason.wessel@windriver.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/alpine.LFD.2.02.1109121326280.2723@ionos
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2011-09-13 09:58:29 +02:00
Martin Schwidefsky 65516f8a7c clockevents: Add direct ktime programming function
There is at least one architecture (s390) with a sane clockevent device
that can be programmed with the equivalent of a ktime. No need to create
a delta against the current time, the ktime can be used directly.

A new clock device function 'set_next_ktime' is introduced that is called
with the unmodified ktime for the timer if the clock event device has the 
CLOCK_EVT_FEAT_KTIME bit set.

Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: john stultz <johnstul@us.ibm.com>
Link: http://lkml.kernel.org/r/20110823133142.815350967@de.ibm.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2011-09-08 11:10:56 +02:00
Martin Schwidefsky d1748302f7 clockevents: Make minimum delay adjustments configurable
The automatic increase of the min_delta_ns of a clockevents device
should be done in the clockevents code as the minimum delay is an
attribute of the clockevents device.

In addition not all architectures want the automatic adjustment, on a
massively virtualized system it can happen that the programming of a
clock event fails several times in a row because the virtual cpu has
been rescheduled quickly enough. In that case the minimum delay will
erroneously be increased with no way back. The new config symbol
GENERIC_CLOCKEVENTS_MIN_ADJUST is used to enable the automatic
adjustment. The config option is selected only for x86.

Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: john stultz <johnstul@us.ibm.com>
Link: http://lkml.kernel.org/r/20110823133142.494157493@de.ibm.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2011-09-08 11:10:56 +02:00
Heiko Carstens 29c158e81c nohz: Remove "Switched to NOHz mode" debugging messages
When performing cpu hotplug tests the kernel printk log buffer gets flooded
with pointless "Switched to NOHz mode..." messages. Especially when afterwards
analyzing a dump this might have removed more interesting stuff out of the
buffer.
Assuming that switching to NOHz mode simply works just remove the printk.

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Link: http://lkml.kernel.org/r/20110823112046.GB2540@osiris.boeblingen.de.ibm.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2011-09-08 11:10:55 +02:00
Michal Hocko 09a1d34f85 nohz: Make idle/iowait counter update conditional
get_cpu_{idle,iowait}_time_us update idle/iowait counters
unconditionally if the given CPU is in the idle loop.

This doesn't work well outside of CPU governors which are singletons
so nobody (except for IRQ) can race with them.

We will need to use both functions from /proc/stat handler to properly
handle nohz idle/iowait times.

Make the update depend on a non NULL last_update_time argument.

Signed-off-by: Michal Hocko <mhocko@suse.cz>
Cc: Dave Jones <davej@redhat.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Alexey Dobriyan <adobriyan@gmail.com>
Link: http://lkml.kernel.org/r/11f23179472635ce52e78921d47a20216b872f23.1314172057.git.mhocko@suse.cz
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2011-09-08 11:10:55 +02:00
Michal Hocko 6beea0cda8 nohz: Fix update_ts_time_stat idle accounting
update_ts_time_stat currently updates idle time even if we are in
iowait loop at the moment. The only real users of the idle counter
(via get_cpu_idle_time_us) are CPU governors and they expect to get
cumulative time for both idle and iowait times.
The value (idle_sleeptime) is also printed to userspace by print_cpu
but it prints both idle and iowait times so the idle part is misleading.

Let's clean this up and fix update_ts_time_stat to account both counters
properly and update consumers of idle to consider iowait time as well.
If we do this we might use get_cpu_{idle,iowait}_time_us from other
contexts as well and we will get expected values.

Signed-off-by: Michal Hocko <mhocko@suse.cz>
Cc: Dave Jones <davej@redhat.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Alexey Dobriyan <adobriyan@gmail.com>
Link: http://lkml.kernel.org/r/e9c909c221a8da402c4da07e4cd968c3218f8eb1.1314172057.git.mhocko@suse.cz
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2011-09-08 11:10:55 +02:00
John Stultz 8bc0dafb5c alarmtimers: Rework RTC device selection using class interface
This allows cleaner detection of the RTC device being registered, rather
then probing any time someone calls alarmtimer_get_rtcdev.

CC: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: John Stultz <john.stultz@linaro.org>
2011-08-10 14:55:30 -07:00
John Stultz 9082c465a5 alarmtimers: Add try_to_cancel functionality
There's a number of edge cases when cancelling a alarm, so
to be sure we accurately do so, introduce try_to_cancel, which
returns proper failure errors if it cannot. Also modify cancel
to spin until the alarm is properly disabled.

CC: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: John Stultz <john.stultz@linaro.org>
2011-08-10 14:55:29 -07:00
John Stultz a28cde81ab alarmtimers: Add more refined alarm state tracking
In order to allow for functionality like try_to_cancel, add
more refined  state tracking (similar to hrtimers).

CC: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: John Stultz <john.stultz@linaro.org>
2011-08-10 14:55:27 -07:00
John Stultz 9e26476243 alarmtimers: Remove period from alarm structure
Now that periodic alarmtimers are managed by the handler function,
remove the period value from the alarm structure and let the handlers
manage the interval on their own.

CC: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: John Stultz <john.stultz@linaro.org>
2011-08-10 14:55:26 -07:00
John Stultz d77e23acce alarmtimers: Remove interval cap limit hack
Now that the alarmtimers code has been refactored, the interval
cap limit can be removed.

CC: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: John Stultz <john.stultz@linaro.org>
2011-08-10 14:55:24 -07:00
John Stultz dce75a8c71 alarmtimers: Add alarm_forward functionality
In order to avoid wasting time expiring and re-adding very high freq
periodic alarmtimers, introduce alarm_forward() which is similar to
hrtimer_forward and moves the timer to the next future expiration time
and returns the number of overruns.

CC: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: John Stultz <john.stultz@linaro.org>
2011-08-10 14:55:23 -07:00
John Stultz 54da23b720 alarmtimers: Push rearming peroidic timers down into alamrtimer handler
This patch pushes the periodic alarmtimer re-arming down into the alarmtimer
handler, mimicking how hrtimers handle this.

CC: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: John Stultz <john.stultz@linaro.org>
2011-08-10 14:55:22 -07:00
John Stultz 4b41308d2d alarmtimers: Change alarmtimer functions to return alarmtimer_restart values
In order to properly fix the denial of service issue with high freq
periodic alarm timers, we need to push the re-arming logic into the
alarm timer handler, much as the hrtimer code does.

This patch introduces alarmtimer_restart enum and changes the
alarmtimer handler declarations to use it as a return value. Further,
to ease following changes, it extends the alarmtimer handler functions
to also take the time at expiration. No logic is yet modified.

CC: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: John Stultz <john.stultz@linaro.org>
2011-08-10 14:55:20 -07:00
John Stultz 6af7e471e5 alarmtimers: Avoid possible denial of service with high freq periodic timers
Its possible to jam up the alarm timers by setting very small interval
timers, which will cause the alarmtimer subsystem to spend all of its time
firing and restarting timers. This can effectivly lock up a box.

A deeper fix is needed, closely mimicking the hrtimer code, but for now
just cap the interval to 100us to avoid userland hanging the system.

CC: Thomas Gleixner <tglx@linutronix.de>
CC: stable@kernel.org
Signed-off-by: John Stultz <john.stultz@linaro.org>
2011-08-10 10:26:09 -07:00
John Stultz ea7802f630 alarmtimers: Memset itimerspec passed into alarm_timer_get
Following common_timer_get, zero out the itimerspec passed in.

CC: Thomas Gleixner <tglx@linutronix.de>
CC: stable@kernel.org
Signed-off-by: John Stultz <john.stultz@linaro.org>
2011-08-10 07:10:09 -07:00
John Stultz 971c90bfa2 alarmtimers: Avoid possible null pointer traversal
We don't check if old_setting is non null before assigning it, so
correct this.

CC: Thomas Gleixner <tglx@linutronix.de>
CC: stable@kernel.org
Signed-off-by: John Stultz <john.stultz@linaro.org>
2011-08-10 07:09:53 -07:00
Linus Torvalds 112ec46966 Merge branch 'timers-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'timers-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  time: Fix stupid KERN_WARN compile issue
  rtc: Avoid accumulating time drift in suspend/resume
  time: Avoid accumulating time drift in suspend/resume
  time: Catch invalid timespec sleep values in __timekeeping_inject_sleeptime
2011-07-22 16:52:18 -07:00
John Stultz cbaa51524b time: Fix stupid KERN_WARN compile issue
Terribly embarassing. Don't know how I committed this, but its
KERN_WARNING not KERN_WARN.

This fixes the following compile error:
kernel/time/timekeeping.c: In function ‘__timekeeping_inject_sleeptime’:
kernel/time/timekeeping.c:608: error: ‘KERN_WARN’ undeclared (first use in this function)
kernel/time/timekeeping.c:608: error: (Each undeclared identifier is reported only once
kernel/time/timekeeping.c:608: error: for each function it appears in.)
kernel/time/timekeeping.c:608: error: expected ‘)’ before string constant
make[2]: *** [kernel/time/timekeeping.o] Error 1

Reported-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: John Stultz <john.stultz@linaro.org>
2011-07-20 15:42:55 -07:00
John Stultz cb33217b1b time: Avoid accumulating time drift in suspend/resume
Because the read_persistent_clock interface is usually backed by
only a second granular interface, each time we read from the persistent
clock for suspend/resume, we introduce a half second (on average) of error.

In order to avoid this error accumulating as the system is suspended
over and over, this patch measures the time delta between the persistent
clock and the system CLOCK_REALTIME.

If the delta is less then 2 seconds from the last suspend, we compensate
by using the previous time delta (keeping it close). If it is larger
then 2 seconds, we assume the clock was set or has been changed, so we
do no correction and update the delta.

Note: If NTP is running, ths could seem to "fight" with the NTP corrected
time, where as if the system time was off by 1 second, and NTP slewed the
value in, a suspend/resume cycle could undo this correction, by trying to
restore the previous offset from the persistent clock.  However, without
this patch, since each read could cause almost a full second worth of
error, its possible to get almost 2 seconds of error just from the
suspend/resume cycle alone, so this about equal to any offset added by
the compensation.

Further on systems that suspend/resume frequently, this should keep time
closer then NTP could compensate for if the errors were allowed to
accumulate.

Credits to Arve Hjønnevåg for suggesting this solution.

CC: Arve Hjønnevåg <arve@android.com>
CC: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: John Stultz <john.stultz@linaro.org>
2011-06-21 16:55:37 -07:00
John Stultz cb5de2f8d0 time: Catch invalid timespec sleep values in __timekeeping_inject_sleeptime
Arve suggested making sure we catch possible negative sleep time
intervals that could be passed into timekeeping_inject_sleeptime.

CC: Arve Hjønnevåg <arve@android.com>
CC: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: John Stultz <john.stultz@linaro.org>
2011-06-21 16:55:36 -07:00
John Stultz 1c6b39ad3f alarmtimers: Return -ENOTSUPP if no RTC device is present
Toralf Förster and Richard Weinberger noted that if there is
no RTC device, the alarm timers core prints out an annoying
"ALARM timers will not wake from suspend" message.

This warning has been removed in a previous patch, however
the issue still remains:  The original idea was to support
alarm timers even if there was no rtc device, as long as the
system didn't go into suspend.

However, after further consideration, communicating to the application
that alarmtimers are not fully functional seems like the better
solution.

So this patch makes it so we return -ENOTSUPP to any posix _ALARM
clockid calls if there is no backing RTC device on the system.

Further this changes the behavior where when there is no rtc device
we will check for one on clock_getres, clock_gettime, timer_create,
and timer_nsleep instead of on suspend.

CC: Toralf Förster <toralf.foerster@gmx.de>
CC: Richard Weinberger <richard@nod.at
CC: Peter Zijlstra <peterz@infradead.org>
CC: Thomas Gleixner <tglx@linutronix.de>
Reported-by: Toralf Förster <toralf.foerster@gmx.de>
Reported by: Richard Weinberger <richard@nod.at>
Signed-off-by: John Stultz <john.stultz@linaro.org>
2011-06-21 16:32:28 -07:00
John Stultz c008ba58af alarmtimers: Handle late rtc module loading
The alarmtimers code currently picks a rtc device to use at
late init time. However, if your rtc driver is loaded as a module,
it may be registered after the alarmtimers late init code, leaving
the alarmtimers nonfunctional.

This patch moves the the rtcdevice selection to when we actually try
to use it, allowing us to make use of rtc modules that may have been
loaded at any point since bootup.

CC: Thomas Gleixner <tglx@linutronix.de>
CC: Meelis Roos <mroos@ut.ee>
Reported-by: Meelis Roos <mroos@ut.ee>
Signed-off-by: John Stultz <john.stultz@linaro.org>
2011-06-21 15:38:33 -07:00
Thomas Gleixner b5199515c2 clocksource: Make watchdog robust vs. interruption
The clocksource watchdog code is interruptible and it has been
observed that this can trigger false positives which disable the TSC.

The reason is that an interrupt storm or a long running interrupt
handler between the read of the watchdog source and the read of the
TSC brings the two far enough apart that the delta is larger than the
unstable treshold. Move both reads into a short interrupt disabled
region to avoid that.

Reported-and-tested-by: Vernon Mauery <vernux@us.ibm.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: stable@kernel.org
2011-06-16 19:30:53 +02:00
Thomas Gleixner 1b054b67d3 clockevents: Handle empty cpumask gracefully
For UP it's stupid to request an initialized cpumask for the clock
event devices. Though we need the mask set even on UP to avoid a
horrible ifdeffery especially in the broadcast code.

For SMP we can at least try to survive with a warning and set the
cpumask of the cpu we're running on. That gives a decent chance to
bring the machine up and retrieve the debug info.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Linus Walleij <linus.walleij@linaro.org
Cc: Lee Jones <lee.jones@linaro.org>
Cc: Russell King - ARM Linux <linux@arm.linux.org.uk>
Cc: Stephen Boyd <sboyd@codeaurora.org>
2011-06-03 11:13:33 +02:00
Thomas Gleixner ab8177bc53 hrtimers: Avoid touching inactive timer bases
Instead of iterating over all possible timer bases avoid it by marking
the active bases in the cpu base.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Peter Zijlstra <peterz@infradead.org>
2011-05-23 13:59:54 +02:00
Thomas Gleixner 250f972d85 Merge branch 'timers/urgent' into timers/core
Reason: Get upstream fixes and kfree_rcu which is necessary for a
follow up patch.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2011-05-20 20:08:05 +02:00
Thomas Gleixner c0e299b1a9 clockevents/source: Use u64 to make 32bit happy
unsigned long is not 64bit on 32bit machine.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2011-05-20 10:50:52 +02:00
Linus Torvalds 78c4def67e Merge branch 'timers-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'timers-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  hrtimer: Make lookup table const
  RTC: Disable CONFIG_RTC_CLASS from being built as a module
  timers: Fix alarmtimer build issues when CONFIG_RTC_CLASS=n
  timers: Remove delayed irqwork from alarmtimers implementation
  timers: Improve alarmtimer comments and minor fixes
  timers: Posix interface for alarm-timers
  timers: Introduce in-kernel alarm-timer interface
  timers: Add rb_init_node() to allow for stack allocated rb nodes
  time: Add timekeeping_inject_sleeptime
2011-05-19 17:45:08 -07:00
Linus Torvalds 7e6628e4bc Merge branch 'timers-clockevents-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'timers-clockevents-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  x86: hpet: Cleanup the clockevents init and register code
  x86: Convert PIT to clockevents_config_and_register()
  clockevents: Provide interface to reconfigure an active clock event device
  clockevents: Provide combined configure and register function
  clockevents: Restructure clock_event_device members
  clocksource: Get rid of the hardcoded 5 seconds sleep time limit
  clocksource: Restructure clocksource struct members
2011-05-19 17:44:40 -07:00
Thomas Gleixner 80b816b736 clockevents: Provide interface to reconfigure an active clock event device
Some ARM SoCs have clock event devices which have their frequency
modified due to frequency scaling. Provide an interface which allows
to reconfigure an active device. After reconfiguration reprogram the
current pending event.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: LAK <linux-arm-kernel@lists.infradead.org>
Cc: John Stultz <john.stultz@linaro.org>
Acked-by: Linus Walleij <linus.walleij@linaro.org>
Reviewed-by: Ingo Molnar <mingo@elte.hu>
Link: http://lkml.kernel.org/r/%3C20110518210136.437459958%40linutronix.de%3E
2011-05-19 14:24:16 +02:00
Thomas Gleixner 57f0fcbe1d clockevents: Provide combined configure and register function
All clockevent devices have the same open coded initialization
functions. Provide an interface which does all necessary
initialization in the core code.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: John Stultz <john.stultz@linaro.org>
Reviewed-by: Ingo Molnar <mingo@elte.hu>
Link: http://lkml.kernel.org/r/%3C20110518210136.331975870%40linutronix.de%3E
2011-05-19 14:24:15 +02:00
Thomas Gleixner 724ed53e8a clocksource: Get rid of the hardcoded 5 seconds sleep time limit
Slow clocksources can have a way longer sleep time than 5 seconds and
even fast ones can easily cope with 600 seconds and still maintain
proper accuracy.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: John Stultz <john.stultz@linaro.org>
Reviewed-by: Ingo Molnar <mingo@elte.hu>
Link: http://lkml.kernel.org/r/%3C20110518210136.109811585%40linutronix.de%3E
2011-05-19 14:24:15 +02:00
Thomas Gleixner 07f4beb0b5 tick: Clear broadcast active bit when switching to oneshot
The first cpu which switches from periodic to oneshot mode switches
also the broadcast device into oneshot mode. The broadcast device
serves as a backup for per cpu timers which stop in deeper
C-states. To avoid starvation of the cpus which might be in idle and
depend on broadcast mode it marks the other cpus as broadcast active
and sets the brodcast expiry value of those cpus to the next tick.

The oneshot mode broadcast bit for the other cpus is sticky and gets
only cleared when those cpus exit idle. If a cpu was not idle while
the bit got set in consequence the bit prevents that the broadcast
device is armed on behalf of that cpu when it enters idle for the
first time after it switched to oneshot mode.

In most cases that goes unnoticed as one of the other cpus has usually
a timer pending which keeps the broadcast device armed with a short
timeout. Now if the only cpu which has a short timer active has the
bit set then the broadcast device will not be armed on behalf of that
cpu and will fire way after the expected timer expiry. In the case of
Christians bug report it took ~145 seconds which is about half of the
wrap around time of HPET (the limit for that device) due to the fact
that all other cpus had no timers armed which expired before the 145
seconds timeframe.

The solution is simply to clear the broadcast active bit
unconditionally when a cpu switches to oneshot mode after the first
cpu switched the broadcast device over. It's not idle at that point
otherwise it would not be executing that code.

[ I fundamentally hate that broadcast crap. Why the heck thought some
  folks that when going into deep idle it's a brilliant concept to
  switch off the last device which brings the cpu back from that
  state? ]

Thanks to Christian for providing all the valuable debug information!

Reported-and-tested-by: Christian Hoffmann <email@christianhoffmann.info>
Cc: John Stultz <johnstul@us.ibm.com>
Link: http://lkml.kernel.org/r/%3Calpine.LFD.2.02.1105161105170.3078%40ionos%3E
Cc: stable@kernel.org
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2011-05-16 23:35:41 +02:00
Andi Kleen 7372b0b122 clockevents: Move C3 stop test outside lock
Avoid taking broadcast_lock in the idle path for systems where the
timer doesn't stop in C3.

[ tglx: Removed the stale label and added comment ]

Signed-off-by: Andi Kleen <ak@linux.intel.com>
Cc: Dave Kleikamp <dkleikamp@gmail.com>
Cc: Chris Mason <chris.mason@oracle.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Tim Chen <tim.c.chen@linux.intel.com>
Cc: lenb@kernel.org
Cc: paulmck@us.ibm.com
Link: http://lkml.kernel.org/r/%3C20110504234806.GF2925%40one.firstfloor.org%3E
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2011-05-05 17:32:13 +02:00
john stultz e05b2efb82 clocksource: Install completely before selecting
Christian Hoffmann reported that the command line clocksource override
with acpi_pm timer fails:

 Kernel command line: <SNIP> clocksource=acpi_pm
 hpet clockevent registered
 Switching to clocksource hpet
 Override clocksource acpi_pm is not HRT compatible.
 Cannot switch while in HRT/NOHZ mode.

The watchdog code is what enables CLOCK_SOURCE_VALID_FOR_HRES, but we
actually end up selecting the clocksource before we enqueue it into
the watchdog list, so that's why we see the warning and fail to switch
to acpi_pm timer as requested. That's particularly bad when we want to
debug timekeeping related problems in early boot.

Put the selection call last.

Reported-by: Christian Hoffmann <email@christianhoffmann.info>
Signed-off-by: John Stultz <johnstul@us.ibm.com>
Cc: stable@kernel.org # 32...
Link: http://lkml.kernel.org/r/%3C1304558210.2943.24.camel%40work-vm%3E
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2011-05-05 15:23:26 +02:00