Commit Graph

2073 Commits

Author SHA1 Message Date
Ingo Molnar 0c98d344fe sched/core: Remove the tsk_cpus_allowed() wrapper
So the original intention of tsk_cpus_allowed() was to 'future-proof'
the field - but it's pretty ineffectual at that, because half of
the code uses ->cpus_allowed directly ...

Also, the wrapper makes the code longer than the original expression!

So just get rid of it. This also shrinks <linux/sched.h> a bit.

Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-03-02 08:42:24 +01:00
Linus Torvalds af8999f672 ARM: SoC non-urgent fixes for merge window
We sometimes collect non-critical fixes that come in during the later part
 of the merge window in a branch for the next release instead, and this is
 that contents for v4.11.
 
 Most of these are OMAP fixes, dealing with OMAP36/37 detection, quirks
 and setup. There's also some fixes for Davinci and a Kconfig fix for SCPI
 to only enable on ARM{,64}.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJYrMlHAAoJEIwa5zzehBx3oZ4P/3nRgb4dtwEwXwFmJf8Xd4nu
 yetQbcwRreHvh8utsU2Pe+8tffV8jLgsW8TxZ43d6deYFii046HhZAXtvTTVgFpE
 OA0fJpNJ00KYqP1Nx5q/kwZoH3uBz442uMUQ9lyziB3RpimhRsiKyHwnTyuWljyx
 hPmO1XKcF6pQBXk1uwOzO1lSDUeOn4eAmeLonlG1gQ5qtrkU0WbrTPxpmn/CB546
 LH5Nj0qVRzEa7xr8O+2nzeKPVwcXGwsKVKCDbSJmsey2KOEDnEjjxpToAh3WnA4W
 Tm1av5QdyqsLVqAMkNYezrS8EzBjRKa1ma4xUqsNoIhO1XI7xa/LkonU8a0+ZdSX
 p48DCvv7IHX5IqdIHHB0s1eICvTsW8Cp/4YUJzuZDFbS9B2t5b3412+n43tVa8l3
 HYPeTzL5S3VOrMtpQKkGAFrw5OGm+URy4CYQxpX5DxSRSqvXTj12ajBHRbfdbzCO
 r2i2rhKL07PF3DAf8L1coHcBQDS7Vc/k+fhKCQy+W1RDxmjYwYKSI9agOyZi1HQ7
 X+0HuUyKTthCE2kUrj4rye/87MffWwdjNgnOZiHR1X7YtWgnjp1g9K+mLZHh/y5m
 Tq/M55cK9h6dOghx121jYFkkvDclEQDemJuDbKY0sEMDrDXtppcI/T+znZ1LTq7i
 1eaK4lTyAX7dbQJUQCwe
 =NhZq
 -----END PGP SIGNATURE-----

Merge tag 'armsoc-fixes-nc' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc

Pull ARM SoC non-urgent fixes from Arnd Bergmann:
 "We sometimes collect non-critical fixes that come in during the later
  part of the merge window in a branch for the next release instead, and
  this is that contents for v4.11.

  Most of these are OMAP fixes, dealing with OMAP36/37 detection, quirks
  and setup. There's also some fixes for Davinci and a Kconfig fix for
  SCPI to only enable on ARM{,64}"

* tag 'armsoc-fixes-nc' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc:
  firmware: arm_scpi: Add hardware dependencies
  ARM: OMAP3: Fix SoC detection of OMAP36/37 Family
  ARM: OMAP5: Add HWMOD_SWSUP_SIDLE_ACT flag for UART
  ARM: dts: Fix compatible for ti81xx uarts for 8250
  ARM: dts: Fix am335x and dm814x scm syscon to probe children
  ARM: OMAP2+: Fix init for multiple quirks for the same SoC
  ARM: dts: Fix omap3 off mode pull defines
  bus: da850-mstpri: fix my e-mail address
  ARM: davinci: da850: fix da850_set_pll0rate()
  ARM: davinci: da850: coding style fix
2017-02-23 15:28:04 -08:00
Linus Torvalds 02c3de1105 Power management updates for v4.11-rc1
- Operating Performance Points (OPP) framework fixes, cleanups and
    switch over from RCU-based synchronization to reference counting
    using krefs (Viresh Kumar, Wei Yongjun, Dave Gerlach).
 
  - cpufreq core cleanups and documentation updates (Viresh Kumar,
    Rafael Wysocki).
 
  - New cpufreq driver for Broadcom BMIPS SoCs (Markus Mayer).
 
  - New cpufreq-dt sub-driver for TI SoCs requiring special handling,
    like in the AM335x, AM437x, DRA7x, and AM57x families, along with
    new DT bindings for it (Dave Gerlach, Paul Gortmaker).
 
  - ARM64 SoCs support for the qoriq cpufreq driver (Tang Yuantian).
 
  - intel_pstate driver updates including a new sysfs knob to control
    the driver's operation mode and fixes related to the no_turbo
    sysfs knob and the hardware-managed P-states feature support
    (Rafael Wysocki, Srinivas Pandruvada).
 
  - New interface to export ultra-turbo frequencies for the powernv
    cpufreq driver (Shilpasri Bhat).
 
  - Assorted fixes for cpufreq drivers (Arnd Bergmann, Dan Carpenter,
    Wei Yongjun).
 
  - devfreq core fixes, mostly related to the sysfs interface exported
    by it (Chanwoo Choi, Chris Diamand).
 
  - Updates of the exynos-bus and exynos-ppmu devfreq drivers (Chanwoo
    Choi).
 
  - Device PM QoS extension to support CPUs and support for per-CPU
    wakeup (device resume) latency constraints in the cpuidle menu
    governor (Alex Shi).
 
  - Wakeup IRQs framework fixes (Grygorii Strashko).
 
  - Generic power domains framework update including a fix to make
    it handle asynchronous invocations of *noirq suspend/resume
    callbacks correctly (Ulf Hansson, Geert Uytterhoeven).
 
  - Assorted fixes and cleanups in the core suspend/hibernate code,
    PM QoS framework and x86 ACPI idle support code (Corentin Labbe,
    Geert Uytterhoeven, Geliang Tang, John Keeping, Nick Desaulniers).
 
  - Update of the analyze_suspend.py script is updated to version 4.5
    offering multiple improvements (Todd Brandt).
 
  - New tool for intel_pstate diagnostics using the pstate_sample
    tracepoint (Doug Smythies).
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2
 
 iQIcBAABCAAGBQJYq3IjAAoJEILEb/54YlRx/lYP+gNXhfETSzjd4kWSHy3FVEDb
 gc5rMiE2j0OYgVSXwBI7p4EqMPy56lSWBASvbF2o6v9CIxb880KLFEsMDCVHwn46
 6xfEnIRxf1oeRqn7EG9ZPIcTgNsUyvK+gah7zgLXu/0KU7ceXxygvNk47qpeOZ8f
 dKYgIk/TOSGPC8H2nsg8VBKlK/ZOj5hID4F3MmFw6yDuWVCYuh2EokYXS4Nx0JwY
 UQGpWtz+FWWs71vhgVl33GbPXWvPqA7OMe0btZ3RCnhnz4tA/mH+jDWiaspCdS3J
 vKGeZyZptjIMJcufm3X7s7ghYjELheqQusMODDXk4AaWQ5nz8V5/h7NThYfa9J1b
 M93Tb0rMb2MqUhBpv/M6D3qQroZmhq55QKfQrul3QWSOiQUzTWJcbbpyeBQ7nkrI
 F1qNqQfuCnBL/r9y7HpW8P2iFg9kCHkwTtXMdp/lzGXdKzSGtAUSkYg5ohnUzQTp
 2WCPTEk+5DxLVPjW5rDoZOotr5p1kdcdWBk6r3MEWRokZK6PJo7rJBcnTtXSo2mO
 lLRba006q+fTlI5wZtjAI0rOiS3JgtT6cRx7uPjZlze9TGjklJhdsCPJbM5gcOT+
 YiOxvqD+9if5QRSxiEZNj3bQ43wYhXmpctfIanyxziq09BPIPxvgfRR/BkUzc34R
 ps4CIvImim5v5xc8Zsbk
 =57xJ
 -----END PGP SIGNATURE-----

Merge tag 'pm-4.11-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm

Pull power management updates from Rafael Wysocki:
 "The majority of changes go into the Operating Performance Points (OPP)
  framework and cpufreq this time, followed by devfreq and some
  scattered updates all over.

  The OPP changes are mostly related to switching over from RCU-based
  synchronization, that turned out to be overly complicated and
  problematic, to reference counting using krefs.

  In the cpufreq land there are core cleanups, documentation updates, a
  new driver for Broadcom BMIPS SoCs, a new cpufreq-dt sub-driver for TI
  SoCs that require special handling, ARM64 SoCs support for the qoriq
  driver, intel_pstate updates, powernv driver update and assorted
  fixes.

  The devfreq changes are mostly fixes related to the sysfs interface
  and some Exynos drivers updates.

  Apart from that, the cpuidle menu governor will support per-CPU PM QoS
  constraints for the wakeup latency now, some bugs in the wakeup IRQs
  framework are fixed, the generic power domains framework should handle
  asynchronous invocations of *noirq suspend/resume callbacks from now
  on, the analyze_suspend.py script is updated and there is a new tool
  for intel_pstate diagnostics.

  Specifics:

   - Operating Performance Points (OPP) framework fixes, cleanups and
     switch over from RCU-based synchronization to reference counting
     using krefs (Viresh Kumar, Wei Yongjun, Dave Gerlach)

   - cpufreq core cleanups and documentation updates (Viresh Kumar,
     Rafael Wysocki)

   - New cpufreq driver for Broadcom BMIPS SoCs (Markus Mayer)

   - New cpufreq-dt sub-driver for TI SoCs requiring special handling,
     like in the AM335x, AM437x, DRA7x, and AM57x families, along with
     new DT bindings for it (Dave Gerlach, Paul Gortmaker)

   - ARM64 SoCs support for the qoriq cpufreq driver (Tang Yuantian)

   - intel_pstate driver updates including a new sysfs knob to control
     the driver's operation mode and fixes related to the no_turbo sysfs
     knob and the hardware-managed P-states feature support (Rafael
     Wysocki, Srinivas Pandruvada)

   - New interface to export ultra-turbo frequencies for the powernv
     cpufreq driver (Shilpasri Bhat)

   - Assorted fixes for cpufreq drivers (Arnd Bergmann, Dan Carpenter,
     Wei Yongjun)

   - devfreq core fixes, mostly related to the sysfs interface exported
     by it (Chanwoo Choi, Chris Diamand)

   - Updates of the exynos-bus and exynos-ppmu devfreq drivers (Chanwoo
     Choi)

   - Device PM QoS extension to support CPUs and support for per-CPU
     wakeup (device resume) latency constraints in the cpuidle menu
     governor (Alex Shi)

   - Wakeup IRQs framework fixes (Grygorii Strashko)

   - Generic power domains framework update including a fix to make it
     handle asynchronous invocations of *noirq suspend/resume callbacks
     correctly (Ulf Hansson, Geert Uytterhoeven)

   - Assorted fixes and cleanups in the core suspend/hibernate code, PM
     QoS framework and x86 ACPI idle support code (Corentin Labbe, Geert
     Uytterhoeven, Geliang Tang, John Keeping, Nick Desaulniers)

   - Update of the analyze_suspend.py script is updated to version 4.5
     offering multiple improvements (Todd Brandt)

   - New tool for intel_pstate diagnostics using the pstate_sample
     tracepoint (Doug Smythies)"

* tag 'pm-4.11-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (85 commits)
  MAINTAINERS: cpufreq: add bmips-cpufreq.c
  PM / QoS: Fix memory leak on resume_latency.notifiers
  PM / Documentation: Spelling s/wrtie/write/
  PM / sleep: Fix test_suspend after sleep state rework
  cpufreq: CPPC: add ACPI_PROCESSOR dependency
  cpufreq: make ti-cpufreq explicitly non-modular
  cpufreq: Do not clear real_cpus mask on policy init
  tools/power/x86: Debug utility for intel_pstate driver
  AnalyzeSuspend: fix drag and zoom bug in javascript
  PM / wakeirq: report a wakeup_event on dedicated wekup irq
  PM / wakeirq: Fix spurious wake-up events for dedicated wakeirqs
  PM / wakeirq: Enable dedicated wakeirq for suspend
  cpufreq: dt: Don't use generic platdev driver for ti-cpufreq platforms
  cpufreq: ti: Add cpufreq driver to determine available OPPs at runtime
  Documentation: dt: add bindings for ti-cpufreq
  PM / OPP: Expose _of_get_opp_desc_node as dev_pm_opp API
  cpufreq: qoriq: Don't look at clock implementation details
  cpufreq: qoriq: add ARM64 SoCs support
  PM / Domains: Provide dummy governors if CONFIG_PM_GENERIC_DOMAINS=n
  cpufreq: brcmstb-avs-cpufreq: remove unnecessary platform_set_drvdata()
  ...
2017-02-20 17:41:31 -08:00
Linus Torvalds 828cad8ea0 Merge branch 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull scheduler updates from Ingo Molnar:
 "The main changes in this (fairly busy) cycle were:

   - There was a class of scheduler bugs related to forgetting to update
     the rq-clock timestamp which can cause weird and hard to debug
     problems, so there's a new debug facility for this: which uncovered
     a whole lot of bugs which convinced us that we want to keep the
     debug facility.

     (Peter Zijlstra, Matt Fleming)

   - Various cputime related updates: eliminate cputime and use u64
     nanoseconds directly, simplify and improve the arch interfaces,
     implement delayed accounting more widely, etc. - (Frederic
     Weisbecker)

   - Move code around for better structure plus cleanups (Ingo Molnar)

   - Move IO schedule accounting deeper into the scheduler plus related
     changes to improve the situation (Tejun Heo)

   - ... plus a round of sched/rt and sched/deadline fixes, plus other
     fixes, updats and cleanups"

* 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (85 commits)
  sched/core: Remove unlikely() annotation from sched_move_task()
  sched/autogroup: Rename auto_group.[ch] to autogroup.[ch]
  sched/topology: Split out scheduler topology code from core.c into topology.c
  sched/core: Remove unnecessary #include headers
  sched/rq_clock: Consolidate the ordering of the rq_clock methods
  delayacct: Include <uapi/linux/taskstats.h>
  sched/core: Clean up comments
  sched/rt: Show the 'sched_rr_timeslice' SCHED_RR timeslice tuning knob in milliseconds
  sched/clock: Add dummy clear_sched_clock_stable() stub function
  sched/cputime: Remove generic asm headers
  sched/cputime: Remove unused nsec_to_cputime()
  s390, sched/cputime: Remove unused cputime definitions
  powerpc, sched/cputime: Remove unused cputime definitions
  s390, sched/cputime: Make arch_cpu_idle_time() to return nsecs
  ia64, sched/cputime: Remove unused cputime definitions
  ia64: Convert vtime to use nsec units directly
  ia64, sched/cputime: Move the nsecs based cputime headers to the last arch using it
  sched/cputime: Remove jiffies based cputime
  sched/cputime, vtime: Return nsecs instead of cputime_t to account
  sched/cputime: Complete nsec conversion of tick based accounting
  ...
2017-02-20 12:52:55 -08:00
Arnd Bergmann a578884fa0 cpufreq: CPPC: add ACPI_PROCESSOR dependency
Without the Kconfig dependency, we can get this warning:

warning: ACPI_CPPC_CPUFREQ selects ACPI_CPPC_LIB which has unmet direct dependencies (ACPI && ACPI_PROCESSOR)

Fixes: 5477fb3bd1 (ACPI / CPPC: Add a CPUFreq driver for use with CPPC)
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2017-02-16 01:00:03 +01:00
Paul Gortmaker 149ab86496 cpufreq: make ti-cpufreq explicitly non-modular
The Kconfig currently controlling compilation of this code is:

drivers/cpufreq/Kconfig.arm:config ARM_TI_CPUFREQ
drivers/cpufreq/Kconfig.arm:    bool "Texas Instruments CPUFreq support"

...meaning that it currently is not being built as a module by anyone.

Lets remove the couple traces of modular infrastructure use, so that
when reading the driver there is no doubt it is builtin-only.

Since module_init translates to device_initcall in the non-modular
case, the init ordering remains unchanged with this commit.

We also delete the MODULE_LICENSE tag etc. since all that information
is already contained at the top of the file in the comments.

Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2017-02-16 00:58:52 +01:00
Rafael J. Wysocki f451014692 cpufreq: Do not clear real_cpus mask on policy init
If new_policy is set in cpufreq_online(), the policy object has just
been created and its real_cpus mask has been zeroed on allocation,
and the driver's ->init() callback should not touch it.

It doesn't need to be cleared again, so don't do that.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
2017-02-16 00:57:42 +01:00
Dave Gerlach 051bd84bb4 cpufreq: dt: Don't use generic platdev driver for ti-cpufreq platforms
Some TI platforms, specifically those in the am33xx, am43xx, dra7xx, and
am57xx families of SoCs can make use of the ti-cpufreq driver to
selectively enable OPPs based on the exact configuration in use. The
ti-cpufreq is given the responsibility of creating the cpufreq-dt
platform device when the driver is in use so drop am33xx and dra7xx
from the cpufreq-dt-platdev driver so it is not created twice.

Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Dave Gerlach <d-gerlach@ti.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2017-02-09 22:59:00 +01:00
Dave Gerlach e13cf046cd cpufreq: ti: Add cpufreq driver to determine available OPPs at runtime
Some TI SoCs, like those in the AM335x, AM437x, DRA7x, and AM57x families,
have different OPPs available for the MPU depending on which specific
variant of the SoC is in use. This can be determined through use of the
revision and an eFuse register present in the silicon. Introduce a
ti-cpufreq driver that can read the aformentioned values and provide
them as version matching data to the opp framework. Through this the
opp-supported-hw dt binding that is part of the operating-points-v2
table can be used to indicate availability of OPPs for each device.

This driver also creates the "cpufreq-dt" platform_device after passing
the version matching data to the OPP framework so that the cpufreq-dt
handles the actual cpufreq implementation. Even without the necessary
data to pass the version matching data the driver will still create this
device to maintain backwards compatibility with operating-points v1
tables.

Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Dave Gerlach <d-gerlach@ti.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2017-02-09 22:57:48 +01:00
Rafael J. Wysocki 40e993aa04 Merge OPP material for v4.11 to satisfy dependencies. 2017-02-09 22:52:35 +01:00
Tang Yuantian b1e9a64972 cpufreq: qoriq: Don't look at clock implementation details
Get the CPU clock's potential parent clocks from the clock interface
itself, rather than manually parsing the clocks property to find a
phandle, looking at the clock-names property of that, and assuming that
those are valid parent clocks for the cpu clock.

This is necessary now that the clocks are generated based on the clock
driver's knowledge of the chip rather than a fragile device-tree
description of the mux options.

We can now rely on the clock driver to ensure that the mux only exposes
options that are valid.  The cpufreq driver was currently being overly
conservative in some cases -- for example, the "min_cpufreq =
get_bus_freq()" restriction only applies to chips with erratum
A-004510, and whether the freq_mask used on p5020 is needed depends on
the actual frequencies of the PLLs (FWIW, p5040 has a similar
limitation but its .freq_mask was zero) -- and the frequency mask
mechanism made assumptions about particular parent clock indices that
are no longer valid.

Signed-off-by: Scott Wood <scottwood@nxp.com>
Signed-off-by: Tang Yuantian <yuantian.tang@nxp.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2017-02-09 14:33:02 +01:00
Tang Yuantian 5026ac2314 cpufreq: qoriq: add ARM64 SoCs support
Add ARM64 config to Kconfig to enable CPU frequency feature on
NXP ARM64 SoCs.

Signed-off-by: Tang Yuantian <yuantian.tang@nxp.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2017-02-09 14:33:01 +01:00
Wei Yongjun 113f9017e5 cpufreq: brcmstb-avs-cpufreq: remove unnecessary platform_set_drvdata()
The driver core clears the driver data to NULL after device_release
or on probe failure. Thus, it is not needed to manually clear the
device driver data to NULL.

Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2017-02-09 01:22:46 +01:00
Dan Carpenter a69261e447 cpufreq: s3c2416: double free on driver init error path
The "goto err_armclk;" error path already does a clk_put(s3c_freq->hclk);
so this is a double free.

Fixes: 34ee550752 ([CPUFREQ] Add S3C2416/S3C2450 cpufreq driver)
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Reviewed-by: Krzysztof Kozlowski <krzk@kernel.org>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2017-02-09 01:22:45 +01:00
Markus Mayer cdb56cbfd7 cpufreq: bmips-cpufreq: CPUfreq driver for Broadcom's BMIPS SoCs
Add the MIPS CPUfreq driver. This driver currently supports CPUfreq on
BMIPS5xxx-based SoCs.

Signed-off-by: Markus Mayer <mmayer@broadcom.com>
Acked-by: Florian Fainelli <f.fainelli@gmail.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2017-02-09 01:22:44 +01:00
Rafael J. Wysocki 56c7303e62 Merge back earlier cpufreq changes for v4.11. 2017-02-09 01:18:14 +01:00
Rafael J. Wysocki cbf304e420 Merge branches 'pm-core-fixes' and 'pm-cpufreq-fixes'
* pm-core-fixes:
  PM / runtime: Avoid false-positive warnings from might_sleep_if()

* pm-cpufreq-fixes:
  cpufreq: intel_pstate: Disable energy efficiency optimization
  cpufreq: brcmstb-avs-cpufreq: properly retrieve P-state upon suspend
  cpufreq: brcmstb-avs-cpufreq: extend sysfs entry brcm_avs_pmap
2017-02-06 14:52:10 +01:00
Srinivas Pandruvada 6e978b22ef cpufreq: intel_pstate: Disable energy efficiency optimization
Some Kabylake desktop processors may not reach max turbo when running in
HWP mode, even if running under sustained 100% utilization.

This occurs when the HWP.EPP (Energy Performance Preference) is set to
"balance_power" (0x80) -- the default on most systems.

It occurs because the platform BIOS may erroneously enable an
energy-efficiency setting -- MSR_IA32_POWER_CTL BIT-EE, which is not
recommended to be enabled on this SKU.

On the failing systems, this BIOS issue was not discovered when the
desktop motherboard was tested with Windows, because the BIOS also
neglects to provide the ACPI/CPPC table, that Windows requires to enable
HWP, and so Windows runs in legacy P-state mode, where this setting has
no effect.

Linux' intel_pstate driver does not require ACPI/CPPC to enable HWP, and
so it runs in HWP mode, exposing this incorrect BIOS configuration.

There are several ways to address this problem.

First, Linux can also run in legacy P-state mode on this system.
As intel_pstate is how Linux enables HWP, booting with
"intel_pstate=disable"
will run in acpi-cpufreq/ondemand legacy p-state mode.

Or second, the "performance" governor can be used with intel_pstate,
which will modify HWP.EPP to 0.

Or third, starting in 4.10, the
/sys/devices/system/cpu/cpufreq/policy*/energy_performance_preference
attribute in can be updated from "balance_power" to "performance".

Or fourth, apply this patch, which fixes the erroneous setting of
MSR_IA32_POWER_CTL BIT_EE on this model, allowing the default
configuration to function as designed.

Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Reviewed-by: Len Brown <len.brown@intel.com>
Cc: 4.6+ <stable@vger.kernel.org> # 4.6+
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2017-02-04 00:11:08 +01:00
Srinivas Pandruvada 8fc7554ae5 cpufreq: intel_pstate: Calculate guaranteed performance for HWP
When HWP is active, turbo activation ratio is not used to calculate max
non turbo ratio. But on these systems the max non turbo ratio is decided
by config TDP settings.

This change removes usage of MSR_TURBO_ACTIVATION_RATIO for HWP systems,
instead directly use TDP ratios, when more than one TDPs are available.

Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2017-02-04 00:05:33 +01:00
Srinivas Pandruvada 4e5d3f713b cpufreq: intel_pstate: Make HWP limits compatible with legacy
Under HWP the performance limits are calculated using max_perf_pct
and min_perf_pct using possible performance, not available performance.
The available performance can be reduced by no_turbo setting. To make
compatible with legacy mode, use max/min performance percentage with
respect to available performance.

Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2017-02-04 00:05:32 +01:00
Srinivas Pandruvada 7d9a8a9f4e cpufreq: intel_pstate: Lower frequency than expected under no_turbo
When turbo is not disabled by BIOS, but user disabled from intel P-State
sysfs and changes max/min using cpufreq sysfs, the resultant frequency
is lower than what user requested.

The reason for this, when the perf limits are calculated in set_policy()
callback, they are with reference to max cpu frequency (turbo frequency
), but when enforced in the intel_pstate_get_min_max() they are with
reference to max available performance as documented in the intel_pstate
documentation (in this case max non turbo P-State).

This needs similar change as done in intel_cpufreq_verify_policy() for
passive mode. Set policy->cpuinfo.max_freq based on the turbo status.

Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2017-02-04 00:05:32 +01:00
Rafael J. Wysocki fb1fe1041c cpufreq: intel_pstate: Operation mode control from sysfs
Make it possible to change the operation mode of intel_pstate with
the help of a new sysfs attribute called "status".

There are three possible configurations that can be selected using
this attribute:

 "off"     - The driver is not in use at this time.
 "active"  - The driver works as a P-state governor (default).
 "passive" - The driver works as a regular cpufreq one and collaborates
             with the generic cpufreq governors (it sets P-states as
             requested by those governors).  [This is the same mode
             the driver can be started in by passing intel_pstate=passive
             in the kernel command line.]

The current setting is returned by reads from this attribute.  Writing
one of the above strings to it changes the operation mode as indicated
by that string, if possible.

If HW-managed P-states (HWP) feature is enabled, it is not possible
to change the driver's operation mode and attempts to write to this
attribute will fail.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2017-02-04 00:05:31 +01:00
Rafael J. Wysocki 0c30b65b3c cpufreq: intel_pstate: Expose global sysfs attributes upfront
Expose the intel_pstate's global sysfs attributes before registering
the driver to prepare for the addition of an attribute that also will
have to work if the driver is not registered.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2017-02-04 00:05:30 +01:00
Viresh Kumar 052f573f5c cpufreq: Remove CPUFREQ_START notifier event
Its not used anymore, remove it.

Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2017-02-04 00:05:30 +01:00
Shilpasri G Bhat b12f7a2b01 cpufreq: powernv: Add boost files to export ultra-turbo frequencies
In P8+, Workload Optimized Frequency(WOF) provides the capability to
boost the cpu frequency based on the utilization of the other cpus
running in the chip. The On-Chip-Controller(OCC) firmware will control
the achievability of these frequencies depending on the power headroom
available in the chip. Currently the ultra-turbo frequencies provided
by this feature are exported along with the turbo and sub-turbo
frequencies as scaling_available_frequencies. This patch will export
the ultra-turbo frequencies separately as scaling_boost_frequencies in
WOF enabled systems. This patch will add the boost sysfs file which
can be used to disable/enable ultra-turbo frequencies.

Signed-off-by: Shilpasri G Bhat <shilpa.bhat@linux.vnet.ibm.com>
Reviewed-by: Gautham R. Shenoy <ego@linux.vnet.ibm.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2017-02-03 23:59:41 +01:00
Viresh Kumar 801e0f378f cpufreq: Remove CONFIG_CPU_FREQ_STAT_DETAILS config option
This doesn't have any benefit apart from saving a small amount of memory
when it is disabled. The ifdef hackery in the code makes it dirty
unnecessarily.

Clean it up by removing the Kconfig option completely. Few defconfigs
are also updated and CONFIG_CPU_FREQ_STAT_DETAILS is replaced with
CONFIG_CPU_FREQ_STAT now in them, as users wanted stats to be enabled.

Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Acked-by: Gregory CLEMENT <gregory.clement@free-electrons.com>
Reviewed-by: Chanwoo Choi <cw00.choi@samsung.com>
Acked-by: Krzysztof Kozlowski <krzk@kernel.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2017-02-03 23:59:39 +01:00
Viresh Kumar f9f41e3ef9 cpufreq: Remove policy create/remove notifiers
Those were added by:

commit fcd7af917a ("cpufreq: stats: handle cpufreq_unregister_driver()
and suspend/resume properly")

but aren't used anymore since:

commit 1aefc75b24 ("cpufreq: stats: Make the stats code non-modular").

Remove them. Also remove the redundant parameter to the respective
routines.

Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2017-02-03 23:59:38 +01:00
Frederic Weisbecker 7fb1327ee9 sched/cputime: Convert kcpustat to nsecs
Kernel CPU stats are stored in cputime_t which is an architecture
defined type, and hence a bit opaque and requiring accessors and mutators
for any operation.

Converting them to nsecs simplifies the code and is one step toward
the removal of cputime_t in the core code.

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rik van Riel <riel@redhat.com>
Cc: Stanislaw Gruszka <sgruszka@redhat.com>
Cc: Wanpeng Li <wanpeng.li@hotmail.com>
Link: http://lkml.kernel.org/r/1485832191-26889-4-git-send-email-fweisbec@gmail.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-02-01 09:13:47 +01:00
Viresh Kumar 8a31d9d942 PM / OPP: Update OPP users to put reference
This patch updates dev_pm_opp_find_freq_*() routines to get a reference
to the OPPs returned by them.

Also updates the users of dev_pm_opp_find_freq_*() routines to call
dev_pm_opp_put() after they are done using the OPPs.

As it is guaranteed the that OPPs wouldn't get freed while being used,
the RCU read side locking present with the users isn't required anymore.
Drop it as well.

This patch also updates all users of devfreq_recommended_opp() which was
returning an OPP received from the OPP core.

Note that some of the OPP core routines have gained
rcu_read_{lock|unlock}() calls, as those still use RCU specific APIs
within them.

Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Reviewed-by: Chanwoo Choi <cw00.choi@samsung.com> [Devfreq]
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2017-01-30 09:22:21 +01:00
Viresh Kumar fa30184d19 PM / OPP: Return opp_table from dev_pm_opp_set_*() routines
Now that we have proper kernel reference infrastructure in place for OPP
tables, use it to guarantee that the OPP table isn't freed while being
used by the callers of dev_pm_opp_set_*() APIs.

Make them all return the pointer to the OPP table after taking its
reference and put the reference back with dev_pm_opp_put_*() APIs.

Now that the OPP table wouldn't get freed while these routines are
executing after dev_pm_opp_get_opp_table() is called, there is no need
to take opp_table_lock. Drop them as well.

Remove the rcu specific comments from these routines as they aren't
relevant anymore.

Note that prototypes of dev_pm_opp_{set|put}_regulators() were already
updated by another patch.

Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Reviewed-by: Stephen Boyd <sboyd@codeaurora.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2017-01-30 09:22:21 +01:00
Viresh Kumar 3aa26a3b2e PM / OPP: Rename dev_pm_opp_get_suspend_opp() and return OPP rate
There is only one user of dev_pm_opp_get_suspend_opp() and that uses it
to get the OPP rate for the suspend_opp.

Rename dev_pm_opp_get_suspend_opp() as dev_pm_opp_get_suspend_opp_freq()
and return the rate directly from it.

Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Reviewed-by: Stephen Boyd <sboyd@codeaurora.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2017-01-27 11:49:09 +01:00
Markus Mayer 3c223c19ae cpufreq: brcmstb-avs-cpufreq: properly retrieve P-state upon suspend
The AVS GET_PMAP command does return a P-state along with the P-map
information. However, that P-state is the initial P-state when the
P-map was first downloaded to AVS. It is *not* the current P-state.

Therefore, we explicitly retrieve the P-state using the GET_PSTATE
command.

Signed-off-by: Markus Mayer <mmayer@broadcom.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2017-01-27 11:43:49 +01:00
Markus Mayer 9b02c54bc9 cpufreq: brcmstb-avs-cpufreq: extend sysfs entry brcm_avs_pmap
We extend the brcm_avs_pmap sysfs entry (which issues the GET_PMAP
command to AVS) to include all fields from struct pmap. This means
adding mode (AVS, DVS, DVFS) and state (the P-state) to the output.

Signed-off-by: Markus Mayer <mmayer@broadcom.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2017-01-27 11:43:48 +01:00
Rafael J. Wysocki ff7e593c9c Merge branches 'pm-sleep' and 'pm-cpufreq'
* pm-sleep:
  Revert "PM / sleep / ACPI: Use the ACPI_FADT_LOW_POWER_S0 flag"

* pm-cpufreq:
  cpufreq: intel_pstate: Fix sysfs limits enforcement for performance policy
2017-01-27 00:08:59 +01:00
Srinivas Pandruvada 1443ebbacf cpufreq: intel_pstate: Fix sysfs limits enforcement for performance policy
A side effect of keeping intel_pstate sysfs limits in sync with cpufreq
is that the now sysfs limits can't enforced under performance policy.

For example, if the max_perf_pct is changed from 100 to 80, this will call
intel_pstate_set_policy(), which will change the max_perf to 100 again for
performance policy. Same issue happens, when no_turbo is set.

This change calculates max and min frequency using sysfs performance
limits in intel_pstate_verify_policy() and adjusts policy limits by
calling cpufreq_verify_within_limits().

Also, it causes the setting of performance limits to be skipped if
no_turbo is set.

Fixes: 111b8b3fe4 (cpufreq: intel_pstate: Always keep all limits settings in sync)
Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2017-01-20 03:35:27 +01:00
Rafael J. Wysocki 3baad65546 Merge branch 'pm-cpufreq'
* pm-cpufreq:
  cpufreq: dt: Add support for APM X-Gene 2
  cpufreq: intel_pstate: Always keep all limits settings in sync
  cpufreq: intel_pstate: Use locking in intel_cpufreq_verify_policy()
  cpufreq: intel_pstate: Use locking in intel_pstate_resume()
  cpufreq: intel_pstate: Do not expose PID parameters in passive mode
2017-01-06 14:34:52 +01:00
Hoan Tran e11b6293a8 cpufreq: dt: Add support for APM X-Gene 2
Add the compatible string for supporting the generic device tree cpufreq-dt
driver on APM's X-Gene 2 SoC.

Signed-off-by: Hoan Tran <hotran@apm.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2017-01-05 00:27:51 +01:00
Bartosz Golaszewski b40881738f ARM: davinci: da850: fix da850_set_pll0rate()
This function is confusing - its second argument is an index to the
freq table, not the requested clock rate in Hz, but it's used as the
set_rate callback for the pll0 clock. It leads to an oops when the
caller doesn't know the internals and passes the rate in Hz as
argument instead of the cpufreq index since this argument isn't bounds
checked either.

Fix it by iterating over the array of supported frequencies and
selecting a one that matches or returning -EINVAL for unsupported
rates.

Also: update the davinci cpufreq driver. It's the only user of this
clock and currently it passes the cpufreq table index to
clk_set_rate(), which is confusing. Make it pass the requested clock
rate in Hz.

Signed-off-by: Bartosz Golaszewski <bgolaszewski@baylibre.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
[nsekhar@ti.com: commit headline update]
Signed-off-by: Sekhar Nori <nsekhar@ti.com>
2017-01-02 15:02:51 +05:30
Rafael J. Wysocki 111b8b3fe4 cpufreq: intel_pstate: Always keep all limits settings in sync
Make intel_pstate update per-logical-CPU limits when the global
settings are changed to ensure that they are always in sync and
users will not see confusing values in per-logical-CPU sysfs
attributes.

This also fixes the problem that setting the "no_turbo" global
attribute to 1 in the "passive" mode (ie. when intel_pstate acts
as a regular cpufreq driver) when scaling_governor is set to
"performance" has no effect.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Acked-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
2016-12-31 21:48:44 +01:00
Rafael J. Wysocki cad3046796 cpufreq: intel_pstate: Use locking in intel_cpufreq_verify_policy()
Race conditions are possible if intel_cpufreq_verify_policy()
is executed in parallel with global limits updates from sysfs,
so the invocation of intel_pstate_update_perf_limits() in it
should be carried out under intel_pstate_limits_lock.

Make that happen.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Acked-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
2016-12-31 21:48:43 +01:00
Rafael J. Wysocki aa439248ab cpufreq: intel_pstate: Use locking in intel_pstate_resume()
Theoretically, intel_pstate_resume() may be executed in parallel
with intel_pstate_set_policy(), if the latter is invoked via
cpufreq_update_policy() as a result of a notification, so use
intel_pstate_limits_lock in there too to avoid race conditions.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Acked-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
2016-12-31 21:48:42 +01:00
Rafael J. Wysocki 366430b5c2 cpufreq: intel_pstate: Do not expose PID parameters in passive mode
If intel_pstate works in the passive mode in which it acts as
a regular cpufreq driver and collaborates with generic cpufreq
governors, the PID parameters are not used, so do not expose
them via debugfs in that case.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2016-12-27 03:30:11 +01:00
Linus Torvalds 7c0f6ba682 Replace <asm/uaccess.h> with <linux/uaccess.h> globally
This was entirely automated, using the script by Al:

  PATT='^[[:blank:]]*#[[:blank:]]*include[[:blank:]]*<asm/uaccess.h>'
  sed -i -e "s!$PATT!#include <linux/uaccess.h>!" \
        $(git grep -l "$PATT"|grep -v ^include/linux/uaccess.h)

to do the replacement at the end of the merge window.

Requested-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-12-24 11:46:01 -08:00
Rafael J. Wysocki 7b99f1aeed Merge branch 'pm-cpufreq'
* pm-cpufreq:
  cpufreq: s3c64xx: remove incorrect __init annotation
  cpufreq: Remove CPU hotplug callbacks only if they were initialized
  CPU/hotplug: Clarify description of __cpuhp_setup_state() return value
2016-12-22 14:34:55 +01:00
Arnd Bergmann adec57c61c cpufreq: s3c64xx: remove incorrect __init annotation
s3c64xx_cpufreq_config_regulator is incorrectly annotated
as __init, since the caller is also not init:

WARNING: vmlinux.o(.text+0x92fe1c): Section mismatch in reference from the function s3c64xx_cpufreq_driver_init() to the function .init.text:s3c64xx_cpufreq_config_regulator()

With modern gcc versions, the function gets inline, so we don't
see the warning, this only happens with gcc-4.6 and older.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Reviewed-by: Krzysztof Kozlowski <krzk@kernel.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2016-12-21 02:54:18 +01:00
Boris Ostrovsky 2a8fa123d9 cpufreq: Remove CPU hotplug callbacks only if they were initialized
Since CPU hotplug callbacks are requested for CPUHP_AP_ONLINE_DYN state,
successful callback initialization will result in cpuhp_setup_state()
returning a positive value. Therefore acpi_cpufreq_online being zero
indicates that callbacks have not been installed.

This means that acpi_cpufreq_boost_exit() should only remove them if
acpi_cpufreq_online is positive. Trying to call
cpuhp_remove_state_nocalls(0) will cause a BUG().

Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2016-12-21 02:52:52 +01:00
Linus Torvalds 7b9dc3f75f Power management material for v4.10-rc1
- New cpufreq driver for Broadcom STB SoCs and a Device Tree binding
    for it (Markus Mayer).
 
  - Support for ARM Integrator/AP and Integrator/CP in the generic
    DT cpufreq driver and elimination of the old Integrator cpufreq
    driver (Linus Walleij).
 
  - Support for the zx296718, r8a7743 and r8a7745, Socionext UniPhier,
    and PXA SoCs in the the generic DT cpufreq driver (Baoyou Xie,
    Geert Uytterhoeven, Masahiro Yamada, Robert Jarzmik).
 
  - cpufreq core fix to eliminate races that may lead to using
    inactive policy objects and related cleanups (Rafael Wysocki).
 
  - cpufreq schedutil governor update to make it use SCHED_FIFO
    kernel threads (instead of regular workqueues) for doing delayed
    work (to reduce the response latency in some cases) and related
    cleanups (Viresh Kumar).
 
  - New cpufreq sysfs attribute for resetting statistics (Markus
    Mayer).
 
  - cpufreq governors fixes and cleanups (Chen Yu, Stratos Karafotis,
    Viresh Kumar).
 
  - Support for using generic cpufreq governors in the intel_pstate
    driver (Rafael Wysocki).
 
  - Support for per-logical-CPU P-state limits and the EPP/EPB
    (Energy Performance Preference/Energy Performance Bias) knobs
    in the intel_pstate driver (Srinivas Pandruvada).
 
  - New CPU ID for Knights Mill in intel_pstate (Piotr Luc).
 
  - intel_pstate driver modification to use the P-state selection
    algorithm based on CPU load on platforms with the system profile
    in the ACPI tables set to "mobile" (Srinivas Pandruvada).
 
  - intel_pstate driver cleanups (Arnd Bergmann, Rafael Wysocki,
    Srinivas Pandruvada).
 
  - cpufreq powernv driver updates including fast switching support
    (for the schedutil governor), fixes and cleanus (Akshay Adiga,
    Andrew Donnellan, Denis Kirjanov).
 
  - acpi-cpufreq driver rework to switch it over to the new CPU
    offline/online state machine (Sebastian Andrzej Siewior).
 
  - Assorted cleanups in cpufreq drivers (Wei Yongjun, Prashanth
    Prakash).
 
  - Idle injection rework (to make it use the regular idle path
    instead of a home-grown custom one) and related powerclamp
    thermal driver updates (Peter Zijlstra, Jacob Pan, Petr Mladek,
    Sebastian Andrzej Siewior).
 
  - New CPU IDs for Atom Z34xx and Knights Mill in intel_idle (Andy
    Shevchenko, Piotr Luc).
 
  - intel_idle driver cleanups and switch over to using the new CPU
    offline/online state machine (Anna-Maria Gleixner, Sebastian
    Andrzej Siewior).
 
  - cpuidle DT driver update to support suspend-to-idle properly
    (Sudeep Holla).
 
  - cpuidle core cleanups and misc updates (Daniel Lezcano, Pan Bian,
    Rafael Wysocki).
 
  - Preliminary support for power domains including CPUs in the
    generic power domains (genpd) framework and related DT bindings
    (Lina Iyer).
 
  - Assorted fixes and cleanups in the generic power domains (genpd)
    framework (Colin Ian King, Dan Carpenter, Geert Uytterhoeven).
 
  - Preliminary support for devices with multiple voltage regulators
    and related fixes and cleanups in the Operating Performance Points
    (OPP) library (Viresh Kumar, Masahiro Yamada, Stephen Boyd).
 
  - System sleep state selection interface rework to make it easier
    to support suspend-to-idle as the default system suspend method
    (Rafael Wysocki).
 
  - PM core fixes and cleanups, mostly related to the interactions
    between the system suspend and runtime PM frameworks (Ulf Hansson,
    Sahitya Tummala, Tony Lindgren).
 
  - Latency tolerance PM QoS framework imorovements (Andrew
    Lutomirski).
 
  - New Knights Mill CPU ID for the Intel RAPL power capping driver
    (Piotr Luc).
 
  - Intel RAPL power capping driver fixes, cleanups and switch over
    to using the new CPU offline/online state machine (Jacob Pan,
    Thomas Gleixner, Sebastian Andrzej Siewior).
 
  - Fixes and cleanups in the exynos-ppmu, exynos-nocp, rk3399_dmc,
    rockchip-dfi devfreq drivers and the devfreq core (Axel Lin,
    Chanwoo Choi, Javier Martinez Canillas, MyungJoo Ham, Viresh
    Kumar).
 
  - Fix for false-positive KASAN warnings during resume from ACPI S3
    (suspend-to-RAM) on x86 (Josh Poimboeuf).
 
  - Memory map verification during resume from hibernation on x86 to
    ensure a consistent address space layout (Chen Yu).
 
  - Wakeup sources debugging enhancement (Xing Wei).
 
  - rockchip-io AVS driver cleanup (Shawn Lin).
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2
 
 iQIcBAABCAAGBQJYTx4+AAoJEILEb/54YlRx9f8P/2SlNHUENW5qh6FtCw00oC2u
 UqJerQJ2L38UgbgxbE/0VYblma9rFABDWC1eO2xN2XdcdW5UPBKPVvNcOgNe1Clh
 gjy3RxZXVpmjfzt2kGfsTLEuGnHqwvx51hTUkeA2LwvkOal45xb8ZESmy8opCtiv
 iG4LwmPHoxdX5Za5nA9ItFKzxyO1EoyNSnBYAVwALDHxmNOfxEcRevfurASt/0M9
 brCCZJA0/sZxeL0lBdy8fNQPIBTUfCoTJG/MtmzGrObJ9wMFvEDfXrVEyZiWs/zA
 AAZ4kQL77enrIKgrLN8e0G6LzTLHoVcvn38Xjf24dKUqhd7ACBhYcnW+jK3+7EAd
 gjZ8efObQsiuyK/EDLUNw35tt96CHOqfrQCj2tIwRVvk9EekLqAGXdIndTCr2kYW
 RpefmP5kMljnm/nQFOVLwMEUQMuVkvUE7EgxADy7DoDmepBFC4ICRDWPye70R2kC
 0O1Tn2PAQq4Fd1tyI9TYYz0YQQkRoaRb5rfYUSzbRbeCdsphUopp4Vhsiyn6IcnF
 XnLbg6pRAat82MoS9n4pfO/VCo8vkErKA8tut9G7TDakkrJoEE7l31PdKW0hP3f6
 sBo6xXy6WTeivU/o/i8TbM6K4mA37pBaj78ooIkWLgg5fzRaS2+0xSPVy2H9x1m5
 LymHcobCK9rSZ1l208Fe
 =vhxI
 -----END PGP SIGNATURE-----

Merge tag 'pm-4.10-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm

Pull power management updates from Rafael Wysocki:
 "Again, cpufreq gets more changes than the other parts this time (one
  new driver, one old driver less, a bunch of enhancements of the
  existing code, new CPU IDs, fixes, cleanups)

  There also are some changes in cpuidle (idle injection rework, a
  couple of new CPU IDs, online/offline rework in intel_idle, fixes and
  cleanups), in the generic power domains framework (mostly related to
  supporting power domains containing CPUs), and in the Operating
  Performance Points (OPP) library (mostly related to supporting devices
  with multiple voltage regulators)

  In addition to that, the system sleep state selection interface is
  modified to make it easier for distributions with unchanged user space
  to support suspend-to-idle as the default system suspend method, some
  issues are fixed in the PM core, the latency tolerance PM QoS
  framework is improved a bit, the Intel RAPL power capping driver is
  cleaned up and there are some fixes and cleanups in the devfreq
  subsystem

  Specifics:

   - New cpufreq driver for Broadcom STB SoCs and a Device Tree binding
     for it (Markus Mayer)

   - Support for ARM Integrator/AP and Integrator/CP in the generic DT
     cpufreq driver and elimination of the old Integrator cpufreq driver
     (Linus Walleij)

   - Support for the zx296718, r8a7743 and r8a7745, Socionext UniPhier,
     and PXA SoCs in the the generic DT cpufreq driver (Baoyou Xie,
     Geert Uytterhoeven, Masahiro Yamada, Robert Jarzmik)

   - cpufreq core fix to eliminate races that may lead to using inactive
     policy objects and related cleanups (Rafael Wysocki)

   - cpufreq schedutil governor update to make it use SCHED_FIFO kernel
     threads (instead of regular workqueues) for doing delayed work (to
     reduce the response latency in some cases) and related cleanups
     (Viresh Kumar)

   - New cpufreq sysfs attribute for resetting statistics (Markus Mayer)

   - cpufreq governors fixes and cleanups (Chen Yu, Stratos Karafotis,
     Viresh Kumar)

   - Support for using generic cpufreq governors in the intel_pstate
     driver (Rafael Wysocki)

   - Support for per-logical-CPU P-state limits and the EPP/EPB (Energy
     Performance Preference/Energy Performance Bias) knobs in the
     intel_pstate driver (Srinivas Pandruvada)

   - New CPU ID for Knights Mill in intel_pstate (Piotr Luc)

   - intel_pstate driver modification to use the P-state selection
     algorithm based on CPU load on platforms with the system profile in
     the ACPI tables set to "mobile" (Srinivas Pandruvada)

   - intel_pstate driver cleanups (Arnd Bergmann, Rafael Wysocki,
     Srinivas Pandruvada)

   - cpufreq powernv driver updates including fast switching support
     (for the schedutil governor), fixes and cleanus (Akshay Adiga,
     Andrew Donnellan, Denis Kirjanov)

   - acpi-cpufreq driver rework to switch it over to the new CPU
     offline/online state machine (Sebastian Andrzej Siewior)

   - Assorted cleanups in cpufreq drivers (Wei Yongjun, Prashanth
     Prakash)

   - Idle injection rework (to make it use the regular idle path instead
     of a home-grown custom one) and related powerclamp thermal driver
     updates (Peter Zijlstra, Jacob Pan, Petr Mladek, Sebastian Andrzej
     Siewior)

   - New CPU IDs for Atom Z34xx and Knights Mill in intel_idle (Andy
     Shevchenko, Piotr Luc)

   - intel_idle driver cleanups and switch over to using the new CPU
     offline/online state machine (Anna-Maria Gleixner, Sebastian
     Andrzej Siewior)

   - cpuidle DT driver update to support suspend-to-idle properly
     (Sudeep Holla)

   - cpuidle core cleanups and misc updates (Daniel Lezcano, Pan Bian,
     Rafael Wysocki)

   - Preliminary support for power domains including CPUs in the generic
     power domains (genpd) framework and related DT bindings (Lina Iyer)

   - Assorted fixes and cleanups in the generic power domains (genpd)
     framework (Colin Ian King, Dan Carpenter, Geert Uytterhoeven)

   - Preliminary support for devices with multiple voltage regulators
     and related fixes and cleanups in the Operating Performance Points
     (OPP) library (Viresh Kumar, Masahiro Yamada, Stephen Boyd)

   - System sleep state selection interface rework to make it easier to
     support suspend-to-idle as the default system suspend method
     (Rafael Wysocki)

   - PM core fixes and cleanups, mostly related to the interactions
     between the system suspend and runtime PM frameworks (Ulf Hansson,
     Sahitya Tummala, Tony Lindgren)

   - Latency tolerance PM QoS framework imorovements (Andrew Lutomirski)

   - New Knights Mill CPU ID for the Intel RAPL power capping driver
     (Piotr Luc)

   - Intel RAPL power capping driver fixes, cleanups and switch over to
     using the new CPU offline/online state machine (Jacob Pan, Thomas
     Gleixner, Sebastian Andrzej Siewior)

   - Fixes and cleanups in the exynos-ppmu, exynos-nocp, rk3399_dmc,
     rockchip-dfi devfreq drivers and the devfreq core (Axel Lin,
     Chanwoo Choi, Javier Martinez Canillas, MyungJoo Ham, Viresh Kumar)

   - Fix for false-positive KASAN warnings during resume from ACPI S3
     (suspend-to-RAM) on x86 (Josh Poimboeuf)

   - Memory map verification during resume from hibernation on x86 to
     ensure a consistent address space layout (Chen Yu)

   - Wakeup sources debugging enhancement (Xing Wei)

   - rockchip-io AVS driver cleanup (Shawn Lin)"

* tag 'pm-4.10-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (127 commits)
  devfreq: rk3399_dmc: Don't use OPP structures outside of RCU locks
  devfreq: rk3399_dmc: Remove dangling rcu_read_unlock()
  devfreq: exynos: Don't use OPP structures outside of RCU locks
  Documentation: intel_pstate: Document HWP energy/performance hints
  cpufreq: intel_pstate: Support for energy performance hints with HWP
  cpufreq: intel_pstate: Add locking around HWP requests
  PM / sleep: Print active wakeup sources when blocking on wakeup_count reads
  PM / core: Fix bug in the error handling of async suspend
  PM / wakeirq: Fix dedicated wakeirq for drivers not using autosuspend
  PM / Domains: Fix compatible for domain idle state
  PM / OPP: Don't WARN on multiple calls to dev_pm_opp_set_regulators()
  PM / OPP: Allow platform specific custom set_opp() callbacks
  PM / OPP: Separate out _generic_set_opp()
  PM / OPP: Add infrastructure to manage multiple regulators
  PM / OPP: Pass struct dev_pm_opp_supply to _set_opp_voltage()
  PM / OPP: Manage supply's voltage/current in a separate structure
  PM / OPP: Don't use OPP structure outside of rcu protected section
  PM / OPP: Reword binding supporting multiple regulators per device
  PM / OPP: Fix incorrect cpu-supply property in binding
  cpuidle: Add a kerneldoc comment to cpuidle_use_deepest_state()
  ..
2016-12-13 10:41:53 -08:00
Rafael J. Wysocki fecc8c0ebd Merge branch 'pm-cpufreq'
* pm-cpufreq: (51 commits)
  Documentation: intel_pstate: Document HWP energy/performance hints
  cpufreq: intel_pstate: Support for energy performance hints with HWP
  cpufreq: intel_pstate: Add locking around HWP requests
  cpufreq: ondemand: Set MIN_FREQUENCY_UP_THRESHOLD to 1
  cpufreq: intel_pstate: Add Knights Mill CPUID
  MAINTAINERS: Add bug tracking system location entry for cpufreq
  cpufreq: dt: Add support for zx296718
  cpufreq: acpi-cpufreq: drop rdmsr_on_cpus() usage
  cpufreq: acpi-cpufreq: Convert to hotplug state machine
  cpufreq: intel_pstate: fix intel_pstate_exit_perf_limits() prototype
  cpufreq: intel_pstate: Set EPP/EPB to 0 in performance mode
  cpufreq: schedutil: Rectify comment in sugov_irq_work() function
  cpufreq: intel_pstate: increase precision of performance limits
  cpufreq: intel_pstate: round up min_perf limits
  cpufreq: Make cpufreq_update_policy() void
  ACPI / processor: Make acpi_processor_ppc_has_changed() void
  cpufreq: Avoid using inactive policies
  cpufreq: intel_pstate: Generic governors support
  cpufreq: intel_pstate: Request P-states control from SMM if needed
  cpufreq: dt: Add support for r8a7743 and r8a7745
  ...
2016-12-12 20:45:01 +01:00
Rafael J. Wysocki 57def856f3 Merge branch 'pm-opp'
* pm-opp:
  PM / OPP: Don't WARN on multiple calls to dev_pm_opp_set_regulators()
  PM / OPP: Allow platform specific custom set_opp() callbacks
  PM / OPP: Separate out _generic_set_opp()
  PM / OPP: Add infrastructure to manage multiple regulators
  PM / OPP: Pass struct dev_pm_opp_supply to _set_opp_voltage()
  PM / OPP: Manage supply's voltage/current in a separate structure
  PM / OPP: Don't use OPP structure outside of rcu protected section
  PM / OPP: Reword binding supporting multiple regulators per device
  PM / OPP: Fix incorrect cpu-supply property in binding
  PM / OPP: Pass opp_table to dev_pm_opp_put_regulator()
  PM / OPP: fix debug/error messages in dev_pm_opp_of_get_sharing_cpus()
  PM / OPP: make _of_get_opp_desc_node() a static function
2016-12-12 20:44:01 +01:00
Srinivas Pandruvada 984edbdccc cpufreq: intel_pstate: Support for energy performance hints with HWP
It is possible to provide hints to the HWP algorithms in the processor
to be more performance centric to more energy centric. These hints are
provided by using HWP energy performance preference (EPP) or energy
performance bias (EPB) settings.

The scope of these settings is per logical processor, which means that
each of the logical processors in the package can be programmed with a
different value.

This change provides cpufreq sysfs interface to provide hint. For each
policy, two additional attributes will be available to check and provide
hint. These attributes will only be present when the intel_pstate driver
is using HWP mode.

These attributes are:
 - energy_performance_available_preferences
 - energy_performance_preference

To get list of supported hints:
$ cat energy_performance_available_preferences
default performance balance_performance balance_power power

The current preference can be read or changed via cpufreq sysfs
attribute "energy_performance_preference". Reading from this attribute
will display current effective setting changed via any method. User can
write any of the valid preference string to this attribute. User can
always restore to power-on default by writing "default".

Implementation
Since these hints can be provided by direct MSR write or using some tools
like x86_energy_perf_policy, the driver internally doesn't maintain any
state. The user operation will result in direct read/write of MSR: 0x774
(HWP_REQUEST_MSR). Also driver use read modify write to update other
fields in this MSR.

Summary of changes:
 - struct cpudata field epp_saved is renamed to epp_powersave, as this
   stores the value to restore once policy is switched from performance
   to powersave to restore original powersave EPP value.
 - A new struct cpudata field epp_saved is used to store the raw MSR
   EPP/EPB value when a CPU goes offline or on suspend and restore on
   online/resume. This ensures that EPP value is restored to correct
   value irrespective of the means used to set.
 - EPP/EPB value ranges are fixed for each preference, which can be
   set for the cpufreq sysfs, so user request is mapped to/from this
   range.
 - New attributes are only added when HWP is present.
 - Since EPP value of 0 is valid the fields are initialized to
   -EINVAL when not valid. The field epp_default is read only once
   after powerup to avoid reading on subsequent CPU online operation
 - New suspend callback to store epp on suspend operation
 - Don't invalidate old epp_saved field on resume and online as now
   we can restore last epp value on suspend and this field can still
   have old EPP value sampled during switch to performance from
   powersave.
 - While here optimized setting of cpu_data->epp_powersave = epp in
   intel_pstate_hwp_set() as this was done in both true and false
   paths.
 - epp/epb set function returns error to caller on failure to pass
   on to user space for display.

Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2016-12-08 01:43:05 +01:00