summaryrefslogtreecommitdiff
path: root/drivers/cpufreq
AgeCommit message (Collapse)Author
2021-07-20cpufreq: CPPC: Fix potential memleak in cppc_cpufreq_cpu_initViresh Kumar
[ Upstream commit fe2535a44904a77615a3af8e8fd7dafb98fb0e1b ] It's a classic example of memleak, we allocate something, we fail and never free the resources. Make sure we free all resources on policy ->init() failures. Fixes: a28b2bfc099c ("cppc_cpufreq: replace per-cpu data array with a list") Tested-by: Vincent Guittot <vincent.guittot@linaro.org> Reviewed-by: Ionela Voinescu <ionela.voinescu@arm.com> Tested-by: Qian Cai <quic_qiancai@quicinc.com> Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-07-20cpufreq: scmi: Fix an error messageChristophe JAILLET
[ Upstream commit b791c7f94680ba9b60b0c0786b1d0eb4393053d6 ] 'ret' is known to be 0 here. The last error code is stored in 'nr_opp', so use it in the error message. Fixes: 71a37cd6a59d ("scmi-cpufreq: Remove deferred probe") Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Reviewed-by: Sudeep Holla <sudeep.holla@arm.com> Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-07-14cpufreq: Make cpufreq_online() call driver->offline() on errorsRafael J. Wysocki
[ Upstream commit 3b7180573c250eb6e2a7eec54ae91f27472332ea ] In the CPU removal path the ->offline() callback provided by the driver is always invoked before ->exit(), but in the cpufreq_online() error path it is not, so ->exit() is expected to somehow know the context in which it has been called and act accordingly. That is less than straightforward, so make cpufreq_online() invoke the driver's ->offline() callback, if present, on errors before ->exit() too. This only potentially affects intel_pstate. Fixes: 91a12e91dc39 ("cpufreq: Allow light-weight tear down and bring up of CPUs") Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-06-14Revert "cpufreq: CPPC: Add support for frequency invariance"Viresh Kumar
This reverts commit 4c38f2df71c8e33c0b64865992d693f5022eeaad. There are few races in the frequency invariance support for CPPC driver, namely the driver doesn't stop the kthread_work and irq_work on policy exit during suspend/resume or CPU hotplug. A proper fix won't be possible for the 5.13-rc, as it requires a lot of changes. Lets revert the patch instead for now. Fixes: 4c38f2df71c8 ("cpufreq: CPPC: Add support for frequency invariance") Reported-by: Qian Cai <quic_qiancai@quicinc.com> Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2021-05-15Merge tag 'sched-urgent-2021-05-15' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull scheduler fixes from Ingo Molnar: "Fix an idle CPU selection bug, and an AMD Ryzen maximum frequency enumeration bug" * tag 'sched-urgent-2021-05-15' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86, sched: Fix the AMD CPPC maximum performance value on certain AMD Ryzen generations sched/fair: Fix clearing of has_idle_cores flag in select_idle_cpu()
2021-05-13x86, sched: Fix the AMD CPPC maximum performance value on certain AMD Ryzen ↵Huang Rui
generations Some AMD Ryzen generations has different calculation method on maximum performance. 255 is not for all ASICs, some specific generations should use 166 as the maximum performance. Otherwise, it will report incorrect frequency value like below: ~ → lscpu | grep MHz CPU MHz: 3400.000 CPU max MHz: 7228.3198 CPU min MHz: 2200.0000 [ mingo: Tidied up whitespace use. ] [ Alexander Monakov <amonakov@ispras.ru>: fix 225 -> 255 typo. ] Fixes: 41ea667227ba ("x86, sched: Calculate frequency invariance for AMD systems") Fixes: 3c55e94c0ade ("cpufreq: ACPI: Extend frequency tables to cover boost frequencies") Reported-by: Jason Bagavatsingham <jason.bagavatsingham@gmail.com> Fixed-by: Alexander Monakov <amonakov@ispras.ru> Reviewed-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: Huang Rui <ray.huang@amd.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Tested-by: Jason Bagavatsingham <jason.bagavatsingham@gmail.com> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20210425073451.2557394-1-ray.huang@amd.com Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=211791 Signed-off-by: Ingo Molnar <mingo@kernel.org>
2021-05-10cpufreq: intel_pstate: Use HWP if enabled by platform firmwareRafael J. Wysocki
It turns out that there are systems where HWP is enabled during initialization by the platform firmware (BIOS), but HWP EPP support is not advertised. After commit 7aa1031223bc ("cpufreq: intel_pstate: Avoid enabling HWP if EPP is not supported") intel_pstate refuses to use HWP on those systems, but the fallback PERF_CTL interface does not work on them either because of enabled HWP, and once enabled, HWP cannot be disabled. Consequently, the users of those systems cannot control CPU performance scaling. Address this issue by making intel_pstate use HWP unconditionally if it is enabled already when the driver starts. Fixes: 7aa1031223bc ("cpufreq: intel_pstate: Avoid enabling HWP if EPP is not supported") Reported-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com> Tested-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Cc: 5.9+ <stable@vger.kernel.org> # 5.9+
2021-04-26Merge tag 'pm-5.13-rc1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm Pull power management updates from Rafael Wysocki: "These add some new hardware support (for example, IceLake-D idle states in intel_idle), fix some issues (for example, the handling of negative "sleep length" values in cpuidle governors), add new functionality to the existing drivers (for example, scale-invariance support in the ACPI CPPC cpufreq driver) and clean up code all over. Specifics: - Add idle states table for IceLake-D to the intel_idle driver and update IceLake-X C6 data in it (Artem Bityutskiy). - Fix the C7 idle state on Tegra114 in the tegra cpuidle driver and drop the unused do_idle() firmware call from it (Dmitry Osipenko). - Fix cpuidle-qcom-spm Kconfig entry (He Ying). - Fix handling of possible negative tick_nohz_get_next_hrtimer() return values of in cpuidle governors (Rafael Wysocki). - Add support for frequency-invariance to the ACPI CPPC cpufreq driver and update the frequency-invariance engine (FIE) to use it as needed (Viresh Kumar). - Simplify the default delay_us setting in the ACPI CPPC cpufreq driver (Tom Saeger). - Clean up frequency-related computations in the intel_pstate cpufreq driver (Rafael Wysocki). - Fix TBG parent setting for load levels in the armada-37xx cpufreq driver and drop the CPU PM clock .set_parent method for armada-37xx (Marek Behún). - Fix multiple issues in the armada-37xx cpufreq driver (Pali Rohár). - Fix handling of dev_pm_opp_of_cpumask_add_table() return values in cpufreq-dt to take the -EPROBE_DEFER one into acconut as appropriate (Quanyang Wang). - Fix format string in ia64-acpi-cpufreq (Sergei Trofimovich). - Drop the unused for_each_policy() macro from cpufreq (Shaokun Zhang). - Simplify computations in the schedutil cpufreq governor to avoid unnecessary overhead (Yue Hu). - Fix typos in the s5pv210 cpufreq driver (Bhaskar Chowdhury). - Fix cpufreq documentation links in Kconfig (Alexander Monakov). - Fix PCI device power state handling in pci_enable_device_flags() to avoid issuse in some cases when the device depends on an ACPI power resource (Rafael Wysocki). - Add missing documentation of pm_runtime_resume_and_get() (Alan Stern). - Add missing static inline stub for pm_runtime_has_no_callbacks() to pm_runtime.h and drop the unused try_to_freeze_nowarn() definition (YueHaibing). - Drop duplicate struct device declaration from pm.h and fix a structure type declaration in intel_rapl.h (Wan Jiabing). - Use dev_set_name() instead of an open-coded equivalent of it in the wakeup sources code and drop a redundant local variable initialization from it (Andy Shevchenko, Colin Ian King). - Use crc32 instead of md5 for e820 memory map integrity check during resume from hibernation on x86 (Chris von Recklinghausen). - Fix typos in comments in the system-wide and hibernation support code (Lu Jialin). - Modify the generic power domains (genpd) code to avoid resuming devices in the "prepare" phase of system-wide suspend and hibernation (Ulf Hansson). - Add Hygon Fam18h RAPL support to the intel_rapl power capping driver (Pu Wen). - Add MAINTAINERS entry for the dynamic thermal power management (DTPM) code (Daniel Lezcano). - Add devm variants of operating performance points (OPP) API functions and switch over some users of the OPP framework to the new resource-managed API (Yangtao Li and Dmitry Osipenko). - Update devfreq core: * Register devfreq devices as cooling devices on demand (Daniel Lezcano). * Add missing unlock opeation in devfreq_add_device() (Lukasz Luba). * Use the next frequency as resume_freq instead of the previous frequency when using the opp-suspend property (Dong Aisheng). * Check get_dev_status in devfreq_update_stats() (Dong Aisheng). * Fix set_freq path for the userspace governor in Kconfig (Dong Aisheng). * Remove invalid description of get_target_freq() (Dong Aisheng). - Update devfreq drivers: * imx8m-ddrc: Remove imx8m_ddrc_get_dev_status() and unneeded of_match_ptr() (Dong Aisheng, Fabio Estevam). * rk3399_dmc: dt-bindings: Add rockchip,pmu phandle and drop references to undefined symbols (Enric Balletbo i Serra, Gaël PORTAY). * rk3399_dmc: Use dev_err_probe() to simplify the code (Krzysztof Kozlowski). * imx-bus: Remove unneeded of_match_ptr() (Fabio Estevam). - Fix kernel-doc warnings in three places (Pierre-Louis Bossart). - Fix typo in the pm-graph utility code (Ricardo Ribalda)" * tag 'pm-5.13-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (74 commits) PM: wakeup: remove redundant assignment to variable retval PM: hibernate: x86: Use crc32 instead of md5 for hibernation e820 integrity check cpufreq: Kconfig: fix documentation links PM: wakeup: use dev_set_name() directly PM: runtime: Add documentation for pm_runtime_resume_and_get() cpufreq: intel_pstate: Simplify intel_pstate_update_perf_limits() cpufreq: armada-37xx: Fix module unloading cpufreq: armada-37xx: Remove cur_frequency variable cpufreq: armada-37xx: Fix determining base CPU frequency cpufreq: armada-37xx: Fix driver cleanup when registration failed clk: mvebu: armada-37xx-periph: Fix workaround for switching from L1 to L0 clk: mvebu: armada-37xx-periph: Fix switching CPU freq from 250 Mhz to 1 GHz cpufreq: armada-37xx: Fix the AVS value for load L1 clk: mvebu: armada-37xx-periph: remove .set_parent method for CPU PM clock cpufreq: armada-37xx: Fix setting TBG parent for load levels cpuidle: Fix ARM_QCOM_SPM_CPUIDLE configuration cpuidle: tegra: Remove do_idle firmware call cpuidle: tegra: Fix C7 idling state on Tegra114 PM: sleep: fix typos in comments cpufreq: Remove unused for_each_policy macro ...
2021-04-26Merge tag 'arm-drivers-5.13' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc Pull ARM SoC driver updates from Arnd Bergmann: "Updates for SoC specific drivers include a few subsystems that have their own maintainers but send them through the soc tree: TEE/OP-TEE: - Add tracepoints around calls to secure world Memory controller drivers: - Minor fixes for Renesas, Exynos, Mediatek and Tegra platforms - Add debug statistics to Tegra20 memory controller - Update Tegra bindings and convert to dtschema ARM SCMI Firmware: - Support for modular SCMI protocols and vendor specific extensions - New SCMI IIO driver - Per-cpu DVFS The other driver changes are all from the platform maintainers directly and reflect the drivers that don't fit into any other subsystem as well as treewide changes for a particular platform. SoCFPGA: - Various cleanups contributed by Krzysztof Kozlowski Mediatek: - add MT8183 support to mutex driver - MMSYS: use per SoC array to describe the possible routing - add MMSYS support for MT8183 and MT8167 - add support for PMIC wrapper with integrated arbiter - add support for MT8192/MT6873 Tegra: - Bug fixes to PMC and clock drivers NXP/i.MX: - Update SCU power domain driver to keep console domain power on. - Add missing ADC1 power domain to SCU power domain driver. - Update comments for single global power domain in SCU power domain driver. - Add i.MX51/i.MX53 unique id support to i.MX SoC driver. NXP/FSL SoC driver updates for v5.13 - Add ACPI support for RCPM driver - Use generic io{read,write} for QE drivers after performance optimized for PowerPC - Fix QBMAN probe to cleanup HW states correctly for kexec - Various cleanup and style fix for QBMAN/QE/GUTS drivers OMAP: - Preparation to use devicetree for genpd - ti-sysc needs iorange check improved when the interconnect target module has no control registers listed - ti-sysc needs to probe l4_wkup and l4_cfg interconnects first to avoid issues with missing resources and unnecessary deferred probe - ti-sysc debug option can now detect more devices - ti-sysc now warns if an old incomplete devicetree data is found as we now rely on it being complete for am3 and 4 - soc init code needs to check for prcm and prm nodes for omap4/5 and dra7 - omap-prm driver needs to enable autoidle retention support for omap4 - omap5 clocks are missing gpmc and ocmc clock registers - pci-dra7xx now needs to use builtin_platform_driver instead of using builtin_platform_driver_probe for deferred probe to work Raspberry Pi: - Fix-up all RPi firmware drivers so as for unbind to happen in an orderly fashion - Support for RPi's PoE hat PWM bus Qualcomm - Improved detection for SCM calling conventions - Support for OEM specific wifi firmware path - Added drivers for SC7280/SM8350: RPMH, LLCC< AOSS QMP" * tag 'arm-drivers-5.13' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc: (165 commits) soc: aspeed: fix a ternary sign expansion bug memory: mtk-smi: Add device-link between smi-larb and smi-common memory: samsung: exynos5422-dmc: handle clk_set_parent() failure memory: renesas-rpc-if: fix possible NULL pointer dereference of resource clk: socfpga: fix iomem pointer cast on 64-bit soc: aspeed: Adapt to new LPC device tree layout pinctrl: aspeed-g5: Adapt to new LPC device tree layout ipmi: kcs: aspeed: Adapt to new LPC DTS layout ARM: dts: Remove LPC BMC and Host partitions dt-bindings: aspeed-lpc: Remove LPC partitioning soc: fsl: enable acpi support in RCPM driver soc: qcom: mdt_loader: Detect truncated read of segments soc: qcom: mdt_loader: Validate that p_filesz < p_memsz soc: qcom: pdr: Fix error return code in pdr_register_listener firmware: qcom_scm: Fix kernel-doc function names to match firmware: qcom_scm: Suppress sysfs bind attributes firmware: qcom_scm: Workaround lack of "is available" call on SC7180 firmware: qcom_scm: Reduce locking section for __get_convention() firmware: qcom_scm: Make __qcom_scm_is_call_available() return bool Revert "soc: fsl: qe: introduce qe_io{read,write}* wrappers" ...
2021-04-26Merge branch 'pm-cpufreq'Rafael J. Wysocki
* pm-cpufreq: (22 commits) cpufreq: Kconfig: fix documentation links cpufreq: intel_pstate: Simplify intel_pstate_update_perf_limits() cpufreq: armada-37xx: Fix module unloading cpufreq: armada-37xx: Remove cur_frequency variable cpufreq: armada-37xx: Fix determining base CPU frequency cpufreq: armada-37xx: Fix driver cleanup when registration failed clk: mvebu: armada-37xx-periph: Fix workaround for switching from L1 to L0 clk: mvebu: armada-37xx-periph: Fix switching CPU freq from 250 Mhz to 1 GHz cpufreq: armada-37xx: Fix the AVS value for load L1 clk: mvebu: armada-37xx-periph: remove .set_parent method for CPU PM clock cpufreq: armada-37xx: Fix setting TBG parent for load levels cpufreq: Remove unused for_each_policy macro cpufreq: dt: dev_pm_opp_of_cpumask_add_table() may return -EPROBE_DEFER cpufreq: intel_pstate: Clean up frequency computations cpufreq: cppc: simplify default delay_us setting cpufreq: Rudimentary typos fix in the file s5pv210-cpufreq.c cpufreq: CPPC: Add support for frequency invariance ia64: fix format string for ia64-acpi-cpu-freq cpufreq: schedutil: Call sugov_update_next_freq() before check to fast_switch_enabled arch_topology: Export arch_freq_scale and helpers ...
2021-04-21cpufreq: Kconfig: fix documentation linksAlexander Monakov
User documentation for cpufreq governors and drivers has been moved to admin-guide; adjust references from Kconfig entries accordingly. Remove references from undocumented cpufreq drivers, as well as the 'userspace' cpufreq governor, for which no additional details are provided in the admin-guide text. Fixes: 2a0e49279850 ("cpufreq: User/admin documentation update and consolidation") Signed-off-by: Alexander Monakov <amonakov@ispras.ru> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2021-04-12Merge branch 'cpufreq/arm/linux-next' of ↵Rafael J. Wysocki
git://git.kernel.org/pub/scm/linux/kernel/git/vireshk/pm Pull ARM cpufreq updates for v5.13 from Viresh Kumar: "- Fix typos in s5pv210 cpufreq driver (Bhaskar Chowdhury). - Armada 37xx: Fix cpufreq changing base CPU speed to 800 MHz from 1000 MHz (Pali Rohár and Marek Behún). - cpufreq-dt: Return -EPROBE_DEFER on failure to add table (Quanyang Wang). - Minor cleanup in cppc driver (Tom Saeger). - Add frequency invariance support for CPPC driver and generalize freq invariance support arch-topology driver (Viresh Kumar)." * 'cpufreq/arm/linux-next' of git://git.kernel.org/pub/scm/linux/kernel/git/vireshk/pm: cpufreq: armada-37xx: Fix module unloading cpufreq: armada-37xx: Remove cur_frequency variable cpufreq: armada-37xx: Fix determining base CPU frequency cpufreq: armada-37xx: Fix driver cleanup when registration failed clk: mvebu: armada-37xx-periph: Fix workaround for switching from L1 to L0 clk: mvebu: armada-37xx-periph: Fix switching CPU freq from 250 Mhz to 1 GHz cpufreq: armada-37xx: Fix the AVS value for load L1 clk: mvebu: armada-37xx-periph: remove .set_parent method for CPU PM clock cpufreq: armada-37xx: Fix setting TBG parent for load levels cpufreq: dt: dev_pm_opp_of_cpumask_add_table() may return -EPROBE_DEFER cpufreq: cppc: simplify default delay_us setting cpufreq: Rudimentary typos fix in the file s5pv210-cpufreq.c cpufreq: CPPC: Add support for frequency invariance arch_topology: Export arch_freq_scale and helpers arch_topology: Allow multiple entities to provide sched_freq_tick() callback arch_topology: Rename freq_scale as arch_freq_scale
2021-04-09cpufreq: intel_pstate: Simplify intel_pstate_update_perf_limits()Rafael J. Wysocki
Because pstate.max_freq is always equal to the product of pstate.max_pstate and pstate.scaling and, analogously, pstate.turbo_freq is always equal to the product of pstate.turbo_pstate and pstate.scaling, the result of the max_policy_perf computation in intel_pstate_update_perf_limits() is always equal to the quotient of policy_max and pstate.scaling, regardless of whether or not turbo is disabled. Analogously, the result of min_policy_perf in intel_pstate_update_perf_limits() is always equal to the quotient of policy_min and pstate.scaling. Accordingly, intel_pstate_update_perf_limits() need not check whether or not turbo is enabled at all and in order to compute max_policy_perf and min_policy_perf it can always divide policy_max and policy_min, respectively, by pstate.scaling. Make it do so. While at it, move the definition and initialization of the turbo_max local variable to the code branch using it. No intentional functional impact. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Tested-by: Chen Yu <yu.c.chen@intel.com>
2021-04-09cpufreq: armada-37xx: Fix module unloadingPali Rohár
This driver is missing module_exit hook. Add proper driver exit function which unregisters the platform device and cleans up the data. Signed-off-by: Pali Rohár <pali@kernel.org> Tested-by: Tomasz Maciej Nowak <tmn505@gmail.com> Tested-by: Anders Trier Olesen <anders.trier.olesen@gmail.com> Tested-by: Philip Soares <philips@netisense.com> Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
2021-04-09cpufreq: armada-37xx: Remove cur_frequency variablePali Rohár
Variable cur_frequency in armada37xx_cpufreq_driver_init() is unused. Signed-off-by: Pali Rohár <pali@kernel.org> Acked-by: Gregory CLEMENT <gregory.clement@bootlin.com> Tested-by: Tomasz Maciej Nowak <tmn505@gmail.com> Tested-by: Anders Trier Olesen <anders.trier.olesen@gmail.com> Tested-by: Philip Soares <philips@netisense.com> Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
2021-04-09cpufreq: armada-37xx: Fix determining base CPU frequencyPali Rohár
When current CPU load is not L0 then loading armada-37xx-cpufreq.ko driver fails with following error: # modprobe armada-37xx-cpufreq [ 502.702097] Unsupported CPU frequency 250 MHz This issue was partially fixed by commit 8db82563451f ("cpufreq: armada-37xx: fix frequency calculation for opp"), but only for calculating CPU frequency for opp. Fix this also for determination of base CPU frequency. Signed-off-by: Pali Rohár <pali@kernel.org> Acked-by: Gregory CLEMENT <gregory.clement@bootlin.com> Tested-by: Tomasz Maciej Nowak <tmn505@gmail.com> Tested-by: Anders Trier Olesen <anders.trier.olesen@gmail.com> Tested-by: Philip Soares <philips@netisense.com> Fixes: 92ce45fb875d ("cpufreq: Add DVFS support for Armada 37xx") Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
2021-04-09cpufreq: armada-37xx: Fix driver cleanup when registration failedPali Rohár
Commit 8db82563451f ("cpufreq: armada-37xx: fix frequency calculation for opp") changed calculation of frequency passed to the dev_pm_opp_add() function call. But the code for dev_pm_opp_remove() function call was not updated, so the driver cleanup phase does not work when registration fails. This fixes the issue by using the same frequency in both calls. Signed-off-by: Pali Rohár <pali@kernel.org> Acked-by: Gregory CLEMENT <gregory.clement@bootlin.com> Tested-by: Tomasz Maciej Nowak <tmn505@gmail.com> Tested-by: Anders Trier Olesen <anders.trier.olesen@gmail.com> Tested-by: Philip Soares <philips@netisense.com> Fixes: 8db82563451f ("cpufreq: armada-37xx: fix frequency calculation for opp") Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
2021-04-09cpufreq: armada-37xx: Fix the AVS value for load L1Pali Rohár
The original CPU voltage value for load L1 is too low for Armada 37xx SoC when base CPU frequency is 1000 or 1200 MHz. It leads to instabilities where CPU gets stuck soon after dynamic voltage scaling from load L1 to L0. Update the CPU voltage value for load L1 accordingly when base frequency is 1000 or 1200 MHz. The minimal L1 value for base CPU frequency 1000 MHz is updated from the original 1.05V to 1.108V and for 1200 MHz is updated to 1.155V. This minimal L1 value is used only in the case when it is lower than value for L0. This change fixes CPU instability issues on 1 GHz and 1.2 GHz variants of Espressobin and 1 GHz Turris Mox. Marvell previously for 1 GHz variant of Espressobin provided a patch [1] suitable only for their Marvell Linux kernel 4.4 fork which workarounded this issue. Patch forced CPU voltage value to 1.108V in all loads. But such change does not fix CPU instability issues on 1.2 GHz variants of Armada 3720 SoC. During testing we come to the conclusion that using 1.108V as minimal value for L1 load makes 1 GHz variants of Espressobin and Turris Mox boards stable. And similarly 1.155V for 1.2 GHz variant of Espressobin. These two values 1.108V and 1.155V are documented in Armada 3700 Hardware Specifications as typical initial CPU voltage values. Discussion about this issue is also at the Armbian forum [2]. [1] - https://github.com/MarvellEmbeddedProcessors/linux-marvell/commit/dc33b62c90696afb6adc7dbcc4ebbd48bedec269 [2] - https://forum.armbian.com/topic/10429-how-to-make-espressobin-v7-stable/ Signed-off-by: Pali Rohár <pali@kernel.org> Acked-by: Gregory CLEMENT <gregory.clement@bootlin.com> Tested-by: Tomasz Maciej Nowak <tmn505@gmail.com> Tested-by: Anders Trier Olesen <anders.trier.olesen@gmail.com> Tested-by: Philip Soares <philips@netisense.com> Fixes: 1c3528232f4b ("cpufreq: armada-37xx: Add AVS support") Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
2021-04-09cpufreq: armada-37xx: Fix setting TBG parent for load levelsMarek Behún
With CPU frequency determining software [1] we have discovered that after this driver does one CPU frequency change, the base frequency of the CPU is set to the frequency of TBG-A-P clock, instead of the TBG that is parent to the CPU. This can be reproduced on EspressoBIN and Turris MOX: cd /sys/devices/system/cpu/cpufreq/policy0 echo powersave >scaling_governor echo performance >scaling_governor Running the mhz tool before this driver is loaded reports 1000 MHz, and after loading the driver and executing commands above the tool reports 800 MHz. The change of TBG clock selector is supposed to happen in function armada37xx_cpufreq_dvfs_setup. Before the function returns, it does this: parent = clk_get_parent(clk); clk_set_parent(clk, parent); The armada-37xx-periph clock driver has the .set_parent method implemented correctly for this, so if the method was actually called, this would work. But since the introduction of the common clock framework in commit b2476490ef11 ("clk: introduce the common clock..."), the clk_set_parent function checks whether the parent is actually changing, and if the requested new parent is same as the old parent (which is obviously the case for the code above), the .set_parent method is not called at all. This patch fixes this issue by filling the correct TBG clock selector directly in the armada37xx_cpufreq_dvfs_setup during the filling of other registers at the same address. But the determination of CPU TBG index cannot be done via the common clock framework, therefore we need to access the North Bridge Peripheral Clock registers directly in this driver. [1] https://github.com/wtarreau/mhz Signed-off-by: Marek Behún <kabel@kernel.org> Acked-by: Gregory CLEMENT <gregory.clement@bootlin.com> Tested-by: Pali Rohár <pali@kernel.org> Tested-by: Tomasz Maciej Nowak <tmn505@gmail.com> Tested-by: Anders Trier Olesen <anders.trier.olesen@gmail.com> Tested-by: Philip Soares <philips@netisense.com> Fixes: 92ce45fb875d ("cpufreq: Add DVFS support for Armada 37xx") Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
2021-04-08cpufreq: Remove unused for_each_policy macroShaokun Zhang
Macro 'for_each_policy' has become unused since commit f963735a3ca3 ("cpufreq: Create for_each_{in}active_policy()"), so remove it. Signed-off-by: Shaokun Zhang <zhangshaokun@hisilicon.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2021-03-29cpufreq: scmi: Port driver to the new scmi_perf_proto_ops interfaceCristian Marussi
Port driver to the new SCMI perf interface based on protocol handles and common devm_get_ops(). Link: https://lore.kernel.org/r/20210316124903.35011-13-cristian.marussi@arm.com Cc: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Cc: Viresh Kumar <viresh.kumar@linaro.org> Tested-by: Florian Fainelli <f.fainelli@gmail.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Cristian Marussi <cristian.marussi@arm.com> Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>
2021-03-26cpufreq: Fix scaling_{available,boost}_frequencies_show() commentsGeert Uytterhoeven
The function names in the comment blocks for the functions scaling_available_frequencies_show() and scaling_boost_frequencies_show() do not match the actual names. Fixes: 6f19efc0a1ca08bc ("cpufreq: Add boost frequency support in core") Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2021-03-25cpufreq: dt: dev_pm_opp_of_cpumask_add_table() may return -EPROBE_DEFERQuanyang Wang
The function dev_pm_opp_of_cpumask_add_table() may return -EPROBE_DEFER, which needs to be propagated to the caller to try probing the driver later on. Signed-off-by: Quanyang Wang <quanyang.wang@windriver.com> [ Viresh: Massage changelog/subject, improve code. ] Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
2021-03-23cpufreq: intel_pstate: Clean up frequency computationsRafael J. Wysocki
Notice that some computations related to frequency in intel_pstate can be simplified if (a) intel_pstate_get_hwp_max() updates the relevant members of struct cpudata by itself and (b) the "turbo disabled" check is moved from it to its callers, so modify the code accordingly and while at it rename intel_pstate_get_hwp_max() to intel_pstate_get_hwp_cap() which better reflects its purpose and provide a simplified variat of it, __intel_pstate_get_hwp_cap(), suitable for the initialization path. No intentional functional impact. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Tested-by: Chen Yu <yu.c.chen@intel.com>
2021-03-22cpufreq: cppc: simplify default delay_us settingTom Saeger
Simplify case when setting default in cppc_cpufreq_get_transition_delay_us. Signed-off-by: Tom Saeger <tom.saeger@oracle.com> Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
2021-03-22cpufreq: Rudimentary typos fix in the file s5pv210-cpufreq.cBhaskar Chowdhury
Trivial spelling fixes throughout the file. Signed-off-by: Bhaskar Chowdhury <unixbhaskar@gmail.com> Reviewed-by: Tom Saeger <tom.saeger@oracle.com> Reviewed-by: Randy Dunlap <rdunlap@infradead.org> [ Viresh: Capitalize two words. ] Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
2021-03-22cpufreq: CPPC: Add support for frequency invarianceViresh Kumar
The Frequency Invariance Engine (FIE) is providing a frequency scaling correction factor that helps achieve more accurate load-tracking. Normally, this scaling factor can be obtained directly with the help of the cpufreq drivers as they know the exact frequency the hardware is running at. But that isn't the case for CPPC cpufreq driver. Another way of obtaining that is using the arch specific counter support, which is already present in kernel, but that hardware is optional for platforms. This patch updates the CPPC driver to register itself with the topology core to provide its own implementation (cppc_scale_freq_tick()) of topology_scale_freq_tick() which gets called by the scheduler on every tick. Note that the arch specific counters have higher priority than CPPC counters, if available, though the CPPC driver doesn't need to have any special handling for that. On an invocation of cppc_scale_freq_tick(), we schedule an irq work (since we reach here from hard-irq context), which then schedules a normal work item and cppc_scale_freq_workfn() updates the per_cpu arch_freq_scale variable based on the counter updates since the last tick. To allow platforms to disable this CPPC counter-based frequency invariance support, this is all done under CONFIG_ACPI_CPPC_CPUFREQ_FIE, which is enabled by default. This also exports sched_setattr_nocheck() as the CPPC driver can be built as a module. Cc: linux-acpi@vger.kernel.org Reviewed-by: Ionela Voinescu <ionela.voinescu@arm.com> Tested-by: Ionela Voinescu <ionela.voinescu@arm.com> Tested-by: Vincent Guittot <vincent.guittot@linaro.org> Acked-by: Rafael J. Wysocki <rafael@kernel.org> Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
2021-03-19ia64: fix format string for ia64-acpi-cpu-freqSergei Trofimovich
Fix warning with %lx / s64 mismatch: CC [M] drivers/cpufreq/ia64-acpi-cpufreq.o drivers/cpufreq/ia64-acpi-cpufreq.c: In function 'processor_get_pstate': warning: format '%lx' expects argument of type 'long unsigned int', but argument 3 has type 's64' {aka 'long long int'} [-Wformat=] Signed-off-by: Sergei Trofimovich <slyfox@gentoo.org> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2021-03-15scmi-cpufreq: Get opp_shared_cpus from opp-v2 for EMNicola Mazzucato
By design, SCMI performance domains define the granularity of performance controls, they do not describe any underlying hardware dependencies (although they may match in many cases). It is therefore possible to have some platforms where hardware may have the ability to control CPU performance at different granularity and choose to describe fine-grained performance control through SCMI. In such situations, the energy model would be provided with inaccurate information based on controls, while it still needs to know the performance boundaries. To restore correct functionality, retrieve information of CPUs under the same performance domain from operating-points-v2 in DT, and pass it on to EM. Link: https://lore.kernel.org/r/20210218222326.15788-3-nicola.mazzucato@arm.com Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Nicola Mazzucato <nicola.mazzucato@arm.com> Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>
2021-03-15scmi-cpufreq: Remove deferred probeNicola Mazzucato
The current implementation of the scmi_cpufreq_init() function returns -EPROBE_DEFER when the OPP table is not populated. In practice the cpufreq core cannot handle this error code. Therefore, fix the return value and clarify the error message. Link: https://lore.kernel.org/r/20210218222326.15788-2-nicola.mazzucato@arm.com Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Nicola Mazzucato <nicola.mazzucato@arm.com> Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>
2021-03-08cpufreq: blacklist Arm Vexpress platforms in cpufreq-dt-platdevSudeep Holla
Add "arm,vexpress" to cpufreq-dt-platdev blacklist since the actual scaling is handled by the firmware cpufreq drivers(scpi, scmi and vexpress-spc). Signed-off-by: Sudeep Holla <sudeep.holla@arm.com> Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
2021-03-08cpufreq: qcom-hw: Fix return value check in qcom_cpufreq_hw_cpu_init()Wei Yongjun
In case of error, the function ioremap() returns NULL pointer not ERR_PTR(). The IS_ERR() test in the return value check should be replaced with NULL test. Fixes: 67fc209b527d ("cpufreq: qcom-hw: drop devm_xxx() calls from init/exit hooks") Reported-by: Hulk Robot <hulkci@huawei.com> Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com> Acked-by: Shawn Guo <shawn.guo@linaro.org> Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
2021-03-08cpufreq: qcom-hw: fix dereferencing freed memory 'data'Shawn Guo
Commit 67fc209b527d ("cpufreq: qcom-hw: drop devm_xxx() calls from init/exit hooks") introduces an issue of dereferencing freed memory 'data'. Fix it. Fixes: 67fc209b527d ("cpufreq: qcom-hw: drop devm_xxx() calls from init/exit hooks") Reported-by: kernel test robot <lkp@intel.com> Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Shawn Guo <shawn.guo@linaro.org> Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
2021-02-24Merge tag 'sfi-removal-5.12-rc1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm Pull Simple Firmware Interface (SFI) support removal from Rafael Wysocki: "Drop support for depercated platforms using SFI, drop the entire support for SFI that has been long deprecated too and make some janitorial changes on top of that (Andy Shevchenko)" * tag 'sfi-removal-5.12-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: x86/platform/intel-mid: Update Copyright year and drop file names x86/platform/intel-mid: Remove unused header inclusion in intel-mid.h x86/platform/intel-mid: Drop unused __intel_mid_cpu_chip and Co. x86/platform/intel-mid: Get rid of intel_scu_ipc_legacy.h x86/PCI: Describe @reg for type1_access_ok() x86/PCI: Get rid of custom x86 model comparison sfi: Remove framework for deprecated firmware cpufreq: sfi-cpufreq: Remove driver for deprecated firmware media: atomisp: Remove unused header mfd: intel_msic: Remove driver for deprecated platform x86/apb_timer: Remove driver for deprecated platform x86/platform/intel-mid: Remove unused leftovers (vRTC) x86/platform/intel-mid: Remove unused leftovers (msic) x86/platform/intel-mid: Remove unused leftovers (msic_thermal) x86/platform/intel-mid: Remove unused leftovers (msic_power_btn) x86/platform/intel-mid: Remove unused leftovers (msic_gpio) x86/platform/intel-mid: Remove unused leftovers (msic_battery) x86/platform/intel-mid: Remove unused leftovers (msic_ocd) x86/platform/intel-mid: Remove unused leftovers (msic_audio) platform/x86: intel_scu_wdt: Drop mistakenly added const
2021-02-23Merge branches 'pm-cpufreq' and 'pm-opp'Rafael J. Wysocki
* pm-cpufreq: cpufreq: Fix typo in kerneldoc comment cpufreq: schedutil: Remove update_lock comment from struct sugov_policy definition cpufreq: schedutil: Remove needless sg_policy parameter from ignore_dl_rate_limit() cpufreq: ACPI: Set cpuinfo.max_freq directly if max boost is known cpufreq: qcom-hw: drop devm_xxx() calls from init/exit hooks * pm-opp: opp: Don't skip freq update for different frequency
2021-02-19cpufreq: Fix typo in kerneldoc commentYue Hu
Change 'Terget' to 'Target'. Should be Target. Signed-off-by: Yue Hu <huyue2@yulong.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> [ rjw: Subject edits ] Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2021-02-18Merge branch 'cpufreq/arm/linux-next' of ↵Rafael J. Wysocki
git://git.kernel.org/pub/scm/linux/kernel/git/vireshk/pm Pull ARM cpufreq fix for 5.12 from Viresh Kumar: "Single patch to fix issue with cpu hotplug and policy recreation for qcom-cpufreq-hw driver." * 'cpufreq/arm/linux-next' of git://git.kernel.org/pub/scm/linux/kernel/git/vireshk/pm: cpufreq: qcom-hw: drop devm_xxx() calls from init/exit hooks
2021-02-18cpufreq: ACPI: Set cpuinfo.max_freq directly if max boost is knownRafael J. Wysocki
Commit 3c55e94c0ade ("cpufreq: ACPI: Extend frequency tables to cover boost frequencies") attempted to address a performance issue involving acpi-cpufreq, the schedutil governor and scale-invariance on x86 by extending the frequency tables created by acpi-cpufreq to cover the entire range of "turbo" (or "boost") frequencies, but that caused frequencies reported via /proc/cpuinfo and the scaling_cur_freq attribute in sysfs to change which may confuse users and monitoring tools. For this reason, revert the part of commit 3c55e94c0ade adding the extra entry to the frequency table and use the observation that in principle cpuinfo.max_freq need not be equal to the maximum frequency listed in the frequency table for the given policy. Namely, modify cpufreq_frequency_table_cpuinfo() to allow cpufreq drivers to set their own cpuinfo.max_freq above that frequency and change acpi-cpufreq to set cpuinfo.max_freq to the maximum boost frequency found via CPPC. This should be sufficient to let all of the cpufreq subsystem know the real maximum frequency of the CPU without changing frequency reporting. Link: https://bugzilla.kernel.org/show_bug.cgi?id=211305 Fixes: 3c55e94c0ade ("cpufreq: ACPI: Extend frequency tables to cover boost frequencies") Reported-by: Matt McDonald <gardotd426@gmail.com> Tested-by: Matt McDonald <gardotd426@gmail.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Tested-by: Giovanni Gherdovich <ggherdovich@suse.cz> Tested-by: Michael Larabel <Michael@phoronix.com> Cc: 5.11+ <stable@vger.kernel.org> # 5.11+
2021-02-18cpufreq: qcom-hw: drop devm_xxx() calls from init/exit hooksShawn Guo
Commit f17b3e44320b ("cpufreq: qcom-hw: Use devm_platform_ioremap_resource() to simplify code") introduces a regression on platforms using the driver, by failing to initialise a policy, when one is created post hotplug. When all the CPUs of a policy are hoptplugged out, the call to .exit() and later to devm_iounmap() does not release the memory region that was requested during devm_platform_ioremap_resource(). Therefore, a subsequent call to .init() will result in the following error, which will prevent a new policy to be initialised: [ 3395.915416] CPU4: shutdown [ 3395.938185] psci: CPU4 killed (polled 0 ms) [ 3399.071424] CPU5: shutdown [ 3399.094316] psci: CPU5 killed (polled 0 ms) [ 3402.139358] CPU6: shutdown [ 3402.161705] psci: CPU6 killed (polled 0 ms) [ 3404.742939] CPU7: shutdown [ 3404.765592] psci: CPU7 killed (polled 0 ms) [ 3411.492274] Detected VIPT I-cache on CPU4 [ 3411.492337] GICv3: CPU4: found redistributor 400 region 0:0x0000000017ae0000 [ 3411.492448] CPU4: Booted secondary processor 0x0000000400 [0x516f802d] [ 3411.503654] qcom-cpufreq-hw 17d43000.cpufreq: can't request region for resource [mem 0x17d45800-0x17d46bff] With that being said, the original code was tricky and skipping memory region request intentionally to hide this issue. The true cause is that those devm_xxx() device managed functions shouldn't be used for cpufreq init/exit hooks, because &pdev->dev is alive across the hooks and will not trigger auto resource free-up. Let's drop the use of device managed functions and manually allocate/free resources, so that the issue can be fixed properly. Cc: v5.10+ <stable@vger.kernel.org> # v5.10+ Fixes: f17b3e44320b ("cpufreq: qcom-hw: Use devm_platform_ioremap_resource() to simplify code") Suggested-by: Bjorn Andersson <bjorn.andersson@linaro.org> Signed-off-by: Shawn Guo <shawn.guo@linaro.org> Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
2021-02-15cpufreq: sfi-cpufreq: Remove driver for deprecated firmwareAndy Shevchenko
SFI-based platforms are gone. So does this driver. Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Acked-by: Linus Walleij <linus.walleij@linaro.org> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2021-02-15Merge branch 'pm-opp' into pmRafael J. Wysocki
* pm-opp: (37 commits) PM / devfreq: Add required OPPs support to passive governor PM / devfreq: Cache OPP table reference in devfreq OPP: Add function to look up required OPP's for a given OPP opp: Replace ENOTSUPP with EOPNOTSUPP opp: Fix "foo * bar" should be "foo *bar" opp: Don't ignore clk_get() errors other than -ENOENT opp: Update bandwidth requirements based on scaling up/down opp: Allow lazy-linking of required-opps opp: Remove dev_pm_opp_set_bw() devfreq: tegra30: Migrate to dev_pm_opp_set_opp() drm: msm: Migrate to dev_pm_opp_set_opp() cpufreq: qcom: Migrate to dev_pm_opp_set_opp() opp: Implement dev_pm_opp_set_opp() opp: Update parameters of _set_opp_custom() opp: Allow _generic_set_opp_clk_only() to work for non-freq devices opp: Allow _generic_set_opp_regulator() to work for non-freq devices opp: Allow _set_opp() to work for non-freq devices opp: Split _set_opp() out of dev_pm_opp_set_rate() opp: Keep track of currently programmed OPP opp: No need to check clk for errors ...
2021-02-10Merge back cpufreq updates for v5.12.Rafael J. Wysocki
2021-02-08Merge branch 'cpufreq/arm/linux-next' of ↵Rafael J. Wysocki
git://git.kernel.org/pub/scm/linux/kernel/git/vireshk/pm Pull ARM cpufreq changes for v5.12 from Viresh Kumar: "- Removal of Tango driver as the platform got removed (Arnd Bergmann). - Use resource managed APIs for tegra20 (Dmitry Osipenko). - Generic cleanups for brcmstb (Christophe JAILLET). - Enable boost support for qcom-hw (Shawn Guo)." * 'cpufreq/arm/linux-next' of git://git.kernel.org/pub/scm/linux/kernel/git/vireshk/pm: cpufreq: remove tango driver cpufreq: brcmstb-avs-cpufreq: Fix resource leaks in ->remove() cpufreq: brcmstb-avs-cpufreq: Free resources in error path cpufreq: qcom-hw: enable boost support cpufreq: tegra20: Use resource-managed API
2021-02-08cpufreq: ACPI: Update arch scale-invariance max perf ratio if CPPC is not thereRafael J. Wysocki
If the maximum performance level taken for computing the arch_max_freq_ratio value used in the x86 scale-invariance code is higher than the one corresponding to the cpuinfo.max_freq value coming from the acpi_cpufreq driver, the scale-invariant utilization falls below 100% even if the CPU runs at cpuinfo.max_freq or slightly faster, which causes the schedutil governor to select a frequency below cpuinfo.max_freq. That frequency corresponds to a frequency table entry below the maximum performance level necessary to get to the "boost" range of CPU frequencies which prevents "boost" frequencies from being used in some workloads. While this issue is related to scale-invariance, it may be amplified by commit db865272d9c4 ("cpufreq: Avoid configuring old governors as default with intel_pstate") from the 5.10 development cycle which made it extremely easy to default to schedutil even if the preferred driver is acpi_cpufreq as long as intel_pstate is built too, because the mere presence of the latter effectively removes the ondemand governor from the defaults. Distro kernels are likely to include both intel_pstate and acpi_cpufreq on x86, so their users who cannot use intel_pstate or choose to use acpi_cpufreq may easily be affectecd by this issue. If CPPC is available, it can be used to address this issue by extending the frequency tables created by acpi_cpufreq to cover the entire available frequency range (including "boost" frequencies) for each CPU, but if CPPC is not there, acpi_cpufreq has no idea what the maximum "boost" frequency is and the frequency tables created by it cannot be extended in a meaningful way, so in that case make it ask the arch scale-invariance code to to use the "nominal" performance level for CPU utilization scaling in order to avoid the issue at hand. Fixes: db865272d9c4 ("cpufreq: Avoid configuring old governors as default with intel_pstate") Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Reviewed-by: Giovanni Gherdovich <ggherdovich@suse.cz> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
2021-02-08cpufreq: ACPI: Extend frequency tables to cover boost frequenciesRafael J. Wysocki
A severe performance regression on AMD EPYC processors when using the schedutil scaling governor was discovered by Phoronix.com and attributed to the following commits: 41ea667227ba ("x86, sched: Calculate frequency invariance for AMD systems") 976df7e5730e ("x86, sched: Use midpoint of max_boost and max_P for frequency invariance on AMD EPYC") The source of the problem is that the maximum performance level taken for computing the arch_max_freq_ratio value used in the x86 scale- invariance code is higher than the one corresponding to the cpuinfo.max_freq value coming from the acpi_cpufreq driver. This effectively causes the scale-invariant utilization to fall below 100% even if the CPU runs at cpuinfo.max_freq or slightly faster, so the schedutil governor selects a frequency below cpuinfo.max_freq then. That frequency corresponds to a frequency table entry below the maximum performance level necessary to get to the "boost" range of CPU frequencies. However, if the cpuinfo.max_freq value coming from acpi_cpufreq was higher, the schedutil governor would select higher frequencies which in turn would allow acpi_cpufreq to set more adequate performance levels and to get to the "boost" range of CPU frequencies more often. This issue affects any systems where acpi_cpufreq is used and the "boost" (or "turbo") frequencies are enabled, not just AMD EPYC. Moreover, commit db865272d9c4 ("cpufreq: Avoid configuring old governors as default with intel_pstate") from the 5.10 development cycle made it extremely easy to default to schedutil even if the preferred driver is acpi_cpufreq as long as intel_pstate is built too, because the mere presence of the latter effectively removes the ondemand governor from the defaults. Distro kernels are likely to include both intel_pstate and acpi_cpufreq on x86, so their users who cannot use intel_pstate or choose to use acpi_cpufreq may easily be affectecd by this issue. To address this issue, extend the frequency table constructed by acpi_cpufreq for each CPU to cover the entire range of available frequencies (including the "boost" ones) if CPPC is available and indicates that "boost" (or "turbo") frequencies are enabled. That causes cpuinfo.max_freq to become the maximum "boost" frequency of the given CPU (instead of the maximum frequency returned by the ACPI _PSS object that corresponds to the "nominal" performance level). Fixes: 41ea667227ba ("x86, sched: Calculate frequency invariance for AMD systems") Fixes: 976df7e5730e ("x86, sched: Use midpoint of max_boost and max_P for frequency invariance on AMD EPYC") Fixes: db865272d9c4 ("cpufreq: Avoid configuring old governors as default with intel_pstate") Link: https://www.phoronix.com/scan.php?page=article&item=linux511-amd-schedutil&num=1 Link: https://lore.kernel.org/linux-pm/20210203135321.12253-2-ggherdovich@suse.cz/ Reported-by: Michael Larabel <Michael@phoronix.com> Diagnosed-by: Giovanni Gherdovich <ggherdovich@suse.cz> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Tested-by: Giovanni Gherdovich <ggherdovich@suse.cz> Reviewed-by: Giovanni Gherdovich <ggherdovich@suse.cz> Tested-by: Michael Larabel <Michael@phoronix.com>
2021-02-04cpufreq: Remove unused flag CPUFREQ_PM_NO_WARNViresh Kumar
This flag is set by one of the drivers but it isn't used in the code otherwise. Remove the unused flag and update the driver. Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2021-02-04cpufreq: Remove CPUFREQ_STICKY flagViresh Kumar
During cpufreq driver's registration, if the ->init() callback for all the CPUs fail then there is not much point in keeping the driver around as it will only account for more of unnecessary noise, for example cpufreq core will try to suspend/resume the driver which never got registered properly. The removal of such a driver is avoided if the driver carries the CPUFREQ_STICKY flag. This was added way back [1] in 2004 and perhaps no one should ever need it now. A lot of drivers do set this flag, probably because they just copied it from other drivers. This was added earlier for some platforms [2] because their cpufreq drivers were getting registered before the CPUs were registered with subsys framework. And hence they used to fail. The same isn't true anymore though. The current code flow in the kernel is: start_kernel() -> kernel_init() -> kernel_init_freeable() -> do_basic_setup() -> driver_init() -> cpu_dev_init() -> subsys_system_register() //For CPUs -> do_initcalls() -> cpufreq_register_driver() Clearly, the CPUs will always get registered with subsys framework before any cpufreq driver can get probed. Remove the flag and update the relevant drivers. Link: https://git.kernel.org/pub/scm/linux/kernel/git/tglx/history.git/commit/include/linux/cpufreq.h?id=7cc9f0d9a1ab04cedc60d64fd8dcf7df224a3b4d # [1] Link: https://git.kernel.org/pub/scm/linux/kernel/git/tglx/history.git/commit/arch/arm/mach-sa1100/cpu-sa1100.c?id=f59d3bbe35f6268d729f51be82af8325d62f20f5 # [2] Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2021-02-02cpufreq: qcom: Migrate to dev_pm_opp_set_opp()Viresh Kumar
dev_pm_opp_set_bw() is getting removed and dev_pm_opp_set_opp() should be used instead. Migrate to the new API. Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Tested-by: Dmitry Osipenko <digetx@gmail.com>
2021-01-22cpufreq: intel_pstate: Remove repeated wordNigel Christian
In the comment for trace in passive mode there is an unnecessary "the". Eradicate it. Signed-off-by: Nigel Christian <nigel.l.christian@gmail.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2021-01-21cpufreq: remove tango driverArnd Bergmann
The tango platform is getting removed, so the driver is no longer needed. Cc: Marc Gonzalez <marc.w.gonzalez@free.fr> Cc: Mans Rullgard <mans@mansr.com> Signed-off-by: Arnd Bergmann <arnd@arndb.de> [ Viresh: Update cpufreq-dt-platdev.c as well ] Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>