summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2020-07-27thermal: qcom: tsens-v0_1: Add support for MSM8939Shawn Guo
The TSENS integrated on MSM8939 is a v0_1 device with 10 sensors. Different from its predecessor MSM8916, where 'calib_sel' bits sit in separate qfprom word, MSM8939 has 'cailb' and 'calib_sel' bits mixed and spread on discrete offsets. That's why all qfprom bits are read as one go and later mapped to calibration data for MSM8939. Signed-off-by: Shawn Guo <shawn.guo@linaro.org> Acked-by: Amit Kucheria <amit.kucheria@linaro.org> Tested-by: Konrad Dybcio <konradybcio@gmail.com> /* on Asus Z00T smartphone */ Acked-by: Konrad Dybcio <konradybcio@gmail.com> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Link: https://lore.kernel.org/r/20200629144926.665-3-shawn.guo@linaro.org
2020-07-27dt-bindings: tsens: qcom: Document MSM8939 compatibleKonrad Dybcio
It adds compatible for MSM8939 TSENS device. Signed-off-by: Konrad Dybcio <konradybcio@gmail.com> Reviewed-by: Amit Kucheria <amit.kucheria@linaro.org> Acked-by: Rob Herring <robh@kernel.org> Signed-off-by: Shawn Guo <shawn.guo@linaro.org> Tested-by: Konrad Dybcio <konradybcio@gmail.com> /* on Asus Z00T smartphone */ Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Link: https://lore.kernel.org/r/20200629144926.665-2-shawn.guo@linaro.org
2020-07-24thermal: core: Fix thermal zone lookup by IDThierry Reding
When a thermal zone is looked up by an ID and no zone is found matching that ID, the thermal_zone_get_by_id() function will return a pointer to the thermal zone list head which isn't actually a valid thermal zone. This can lead to a subsequent crash because a valid pointer is returned to the called, but dereferencing that pointer as struct thermal_zone is not safe. Fixes: 329b064fbd13 ("thermal: core: Get thermal zone by id") Signed-off-by: Thierry Reding <treding@nvidia.com> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Link: https://lore.kernel.org/r/20200724170105.2705467-1-thierry.reding@gmail.com
2020-07-24thermal: int340x: processor_thermal: fix: update Jasper Lake PCI idSumeet Pawnikar
Update PCI device id for Jasper Lake processor thermal device. With this proc_thermal driver is getting loaded and processor thermal functionality works on Jasper Lake system. Fixes: f64a6583d3f5 ("thermal: int340x: processor_thermal: Add Jasper Lake support") Signed-off-by: Sumeet Pawnikar <sumeet.r.pawnikar@intel.com> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Link: https://lore.kernel.org/r/1595577146-1221-1-git-send-email-sumeet.r.pawnikar@intel.com
2020-07-21thermal: imx8mm: Support module autoloadingAnson Huang
Add a missing MODULE_DEVICE_TABLE entry to support module autoloading. Signed-off-by: Anson Huang <Anson.Huang@nxp.com> Reviewed-by: Dong Aisheng <aisheng.dong@nxp.com> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Link: https://lore.kernel.org/r/1592380074-19222-1-git-send-email-Anson.Huang@nxp.com
2020-07-21thermal: ti-soc-thermal: Fix reversed condition in ti_thermal_expose_sensor()Dan Carpenter
This condition is reversed and will cause breakage. Fixes: 7440f518dad9 ("thermal/drivers/ti-soc-thermal: Avoid dereferencing ERR_PTR") Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Link: https://lore.kernel.org/r/20200616091949.GA11940@mwanda
2020-07-21MAINTAINERS: Add maintenance information for IPALukasz Luba
Add entry for ARM Intelligent Power Allocation - thermal governor. Signed-off-by: Lukasz Luba <lukasz.luba@arm.com> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Link: https://lore.kernel.org/r/20200603141420.15274-1-lukasz.luba@arm.com
2020-07-21thermal: rcar_gen3_thermal: Do not shadow thcode variableNiklas Söderlund
The function rcar_gen3_thermal_calc_coefs() takes an argument called 'thcode' which shadows the static global 'thcode' variable. This is not harmful but bad for readability and is harmful for planned changes to the driver. The THCODE values should be read from hardware fuses if they are available and only fallback to the global 'thcode' variable if they are not fused. Rename the global 'thcode' variable to 'thcodes' to avoid shadowing the symbol in functions that take it as an argument. Signed-off-by: Niklas Söderlund <niklas.soderlund+renesas@ragnatech.se> Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Link: https://lore.kernel.org/r/20200610003300.884258-1-niklas.soderlund+renesas@ragnatech.se
2020-07-21dt-bindings: thermal: Get rid of thermal.txt and replace referencesAmit Kucheria
Now that we have yaml bindings for the thermal subsystem, get rid of the old bindings (thermal.txt). Replace all references to thermal.txt in the Documentation with a link to the appropriate YAML bindings using the following search and replace pattern: - If the reference is specific to the thermal-sensor-cells property, replace with a pointer to thermal-sensor.yaml - If the reference is to the cooling-cells property, replace with a pointer to thermal-cooling-devices.yaml - If the reference is generic thermal bindings, replace with a reference to thermal*.yaml. Signed-off-by: Amit Kucheria <amit.kucheria@linaro.org> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Link: https://lore.kernel.org/r/e9aacd33071a00568b67e110fa3bcc4d86d3e1e4.1595245166.git.amit.kucheria@linaro.org
2020-07-21thermal: core: Move initialization after core initcallDaniel Lezcano
The generic netlink is initialized at subsys_initcall, so far after the thermal init routine and the thermal generic netlink family initialization. On ŝome platforms, that leads to a memory corruption. The fix was sent to netdev@ to move the genetlink framework initialization at core_initcall. Move the thermal core initialization to postcore level which is very close to core level. Reported-by: Marek Szyprowski <m.szyprowski@samsung.com> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Tested-by: Marek Szyprowski <m.szyprowski@samsung.com> Reviewed-by: Amit Kucheria <amit.kucheria@linaro.org> Link: https://lore.kernel.org/r/20200717164217.18819-2-daniel.lezcano@linaro.org
2020-07-21thermal: netlink: Improve the initcall orderingDaniel Lezcano
The initcalls like to play joke. In our case, the thermal-netlink initcall is called after the thermal-core initcall but this one sends a notification before the former is initialized. No issue was spotted, but it could lead to a memory corruption, so instead of relying on the core_initcall for the thermal-netlink, let's initialize directly from the thermal-core init routine, so we have full control of the init ordering. Reported-by: Marek Szyprowski <m.szyprowski@samsung.com> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Tested-by: Marek Szyprowski <m.szyprowski@samsung.com> Reviewed-by: Amit Kucheria <amit.kucheria@linaro.org> Link: https://lore.kernel.org/r/20200717164217.18819-1-daniel.lezcano@linaro.org
2020-07-21net: genetlink: Move initialization to core_initcallDaniel Lezcano
The generic netlink is initialized far after the netlink protocol itself at subsys_initcall. The devlink is initialized at the same level, but after, as shown by a disassembly of the vmlinux: [ ... ] 374 ffff8000115f22c0 <__initcall_devlink_init4>: 375 ffff8000115f22c4 <__initcall_genl_init4>: [ ... ] The function devlink_init() calls genl_register_family() before the generic netlink subsystem is initialized. As the generic netlink initcall level is set since 2005, it seems that was not a problem, but now we have the thermal framework initialized at the core_initcall level which creates the generic netlink family and sends a notification which leads to a subtle memory corruption only detectable when the CONFIG_INIT_ON_ALLOC_DEFAULT_ON option is set with the earlycon at init time. The thermal framework needs to be initialized early in order to begin the mitigation as soon as possible. Moving it to postcore_initcall is acceptable. This patch changes the initialization level for the generic netlink family to the core_initcall and comes after the netlink protocol initialization. Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Acked-by: David S. Miller <davem@davemloft.net> Tested-by: Marek Szyprowski <m.szyprowski@samsung.com> Reviewed-by: Amit Kucheria <amit.kucheria@linaro.org> Link: https://lore.kernel.org/r/20200715074120.8768-1-daniel.lezcano@linaro.org
2020-07-21thermal: rcar_gen3_thermal: Add r8a774e1 supportMarian-Cristian Rotariu
Add r8a774e1 specific compatible string. Signed-off-by: Marian-Cristian Rotariu <marian-cristian.rotariu.rb@bp.renesas.com> Signed-off-by: Lad Prabhakar <prabhakar.mahadev-lad.rj@bp.renesas.com> Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be> Reviewed-by: Niklas Söderlund <niklas.soderlund+renesas@ragnatech.se> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Link: https://lore.kernel.org/r/1594811350-14066-4-git-send-email-prabhakar.mahadev-lad.rj@bp.renesas.com
2020-07-21thermal/drivers/clock_cooling: Remove clock_cooling codeAmit Kucheria
clock_cooling has no in-kernel users. It has never found any use in drivers as far as I can tell. Remove the code. Signed-off-by: Amit Kucheria <amit.kucheria@linaro.org> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Link: https://lore.kernel.org/r/aa5d5ac2589cf7b14ece882130731b4a916849a6.1593619943.git.amit.kucheria@linaro.org
2020-07-21thermal: core: remove redundant initialization of variable retColin Ian King
The variable ret is being initialized with a value that is never read and it is being updated later with a new value. The initialization is redundant and can be removed. Addresses-Coverity: ("Unused value") Signed-off-by: Colin Ian King <colin.king@canonical.com> Signed-off-by: Zhang Rui <rui.zhang@intel.com> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Link: https://lore.kernel.org/r/20200706140747.489075-1-colin.king@canonical.com
2020-07-21thermal: netlink: Fix compilation error when CONFIG_NET=nDaniel Lezcano
When the network is not configured, the netlink is disabled on all the system. The thermal framework assumed the netlink is always opt-in. Fix this by adding a Kconfig option for the netlink notification, defaulting to yes and depending on CONFIG_NET. As the change implies multiple stubs and in order to not pollute the internal thermal header, the thermal_nelink.h has been added and included in the thermal_core.h, so this one regain some kind of clarity. Reported-by: Randy Dunlap <rdunlap@infradead.org> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Reviewed-by: Amit Kucheria <amit.kucheria@linaro.org> Link: https://lore.kernel.org/r/20200707090159.1018-1-daniel.lezcano@linaro.org
2020-07-07thermal: core: Add notifications call in the frameworkDaniel Lezcano
The generic netlink protocol is implemented but the different notification functions are not yet connected to the core code. These changes add the notification calls in the different corresponding places. Reviewed-by: Amit Kucheria <amit.kucheria@linaro.org> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Acked-by: Zhang Rui <rui.zhang@intel.com> Link: https://lore.kernel.org/r/20200706105538.2159-4-daniel.lezcano@linaro.org
2020-07-07thermal: core: genetlink support for events/cmd/samplingDaniel Lezcano
Initially the thermal framework had a very simple notification mechanism to send generic netlink messages to the userspace. The notification function was never called from anywhere and the corresponding dead code was removed. It was probably a first attempt to introduce the netlink notification. At LPC2018, the presentation "Linux thermal: User kernel interface", proposed to create the notifications to the userspace via a kfifo. The advantage of the kfifo is the performance. It is usually used from a 1:1 communication channel where a driver captures data and sends it as fast as possible to a userspace process. The drawback is that only one process uses the notification channel exclusively, thus no other process is allowed to use the channel to get temperature or notifications. This patch defines a generic netlink API to discover the current thermal setup and adds event notifications as well as temperature sampling. As any genetlink protocol, it can evolve and the versioning allows to keep the backward compatibility. In order to prevent the user from getting flooded with data on a single channel, there are two multicast channels, one for the temperature sampling when the thermal zone is updated and another one for the events, so the user can get the events only without the thermal zone temperature sampling. Also, a list of commands to discover the thermal setup is added and can be extended when needed. Reviewed-by: Amit Kucheria <amit.kucheria@linaro.org> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Acked-by: Zhang Rui <rui.zhang@intel.com> Link: https://lore.kernel.org/r/20200706105538.2159-3-daniel.lezcano@linaro.org
2020-07-07thermal: core: Get thermal zone by idDaniel Lezcano
The next patch will introduce the generic netlink protocol to handle events, sampling and command from the thermal framework. In order to deal with the thermal zone, it uses its unique identifier to characterize it in the message. Passing an integer is more efficient than passing an entire string. This change provides a function returning back a thermal zone pointer corresponding to the identifier passed as parameter. Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Reviewed-by: Amit Kucheria <amit.kucheria@linaro.org> Acked-by: Zhang Rui <rui.zhang@intel.com> Link: https://lore.kernel.org/r/20200706105538.2159-2-daniel.lezcano@linaro.org
2020-07-07thermal: core: Add helpers to browse the cdev, tz and governor listDaniel Lezcano
The cdev, tz and governor list, as well as their respective locks are statically defined in the thermal_core.c file. In order to give a sane access to these list, like browsing all the thermal zones or all the cooling devices, let's define a set of helpers where we pass a callback as a parameter to be called for each thermal entity. We keep the self-encapsulation and ensure the locks are correctly taken when looking at the list. Acked-by: Zhang Rui <rui.zhang@intel.com> Reviewed-by: Amit Kucheria <amit.kucheria@linaro.org> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Link: https://lore.kernel.org/r/20200706105538.2159-1-daniel.lezcano@linaro.org
2020-07-07thermal: Make thermal_zone_device_is_enabled() available to core onlyAndrzej Pietrasiewicz
This function is not needed by drivers. Signed-off-by: Andrzej Pietrasiewicz <andrzej.p@collabora.com> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Link: https://lore.kernel.org/r/20200703104354.19657-4-andrzej.p@collabora.com
2020-07-07thermal: imx: Use driver's local data to decide whether to run a measurementAndrzej Pietrasiewicz
Use driver's local data to evaluate the need to run or not to run a measurement. Signed-off-by: Andrzej Pietrasiewicz <andrzej.p@collabora.com> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Link: https://lore.kernel.org/r/20200703104354.19657-3-andrzej.p@collabora.com
2020-07-07acpi: thermal: Don't call thermal_zone_device_is_enabled()Andrzej Pietrasiewicz
thermal_zone_device_update() can now handle disabled thermal zones, so the check here is not needed. Signed-off-by: Andrzej Pietrasiewicz <andrzej.p@collabora.com> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Link: https://lore.kernel.org/r/20200703104354.19657-2-andrzej.p@collabora.com
2020-06-29thermal: Rename set_mode() to change_mode()Andrzej Pietrasiewicz
set_mode() is only called when tzd's mode is about to change. Actual setting is performed in thermal_core, in thermal_zone_device_set_mode(). The meaning of set_mode() callback is actually to notify the driver about the mode being changed and giving the driver a chance to oppose such change. To better reflect the purpose of the method rename it to change_mode() Signed-off-by: Andrzej Pietrasiewicz <andrzej.p@collabora.com> [for acerhdf] Acked-by: Peter Kaestle <peter@piie.net> Reviewed-by: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com> Reviewed-by: Amit Kucheria <amit.kucheria@linaro.org> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Link: https://lore.kernel.org/r/20200629122925.21729-12-andrzej.p@collabora.com
2020-06-29thermal: Simplify or eliminate unnecessary set_mode() methodsAndrzej Pietrasiewicz
Setting polling_delay is now done at thermal_core level (by not polling DISABLED devices), so no need to repeat this code. int340x: Checking for an impossible enum value is unnecessary. acpi/thermal: It only prints debug messages. Signed-off-by: Andrzej Pietrasiewicz <andrzej.p@collabora.com> [for acerhdf] Acked-by: Peter Kaestle <peter@piie.net> Reviewed-by: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com> Reviewed-by: Amit Kucheria <amit.kucheria@linaro.org> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Link: https://lore.kernel.org/r/20200629122925.21729-11-andrzej.p@collabora.com
2020-06-29thermal: core: Stop polling DISABLED thermal devicesAndrzej Pietrasiewicz
Polling DISABLED devices is not desired, as all such "disabled" devices are meant to be handled by userspace. This patch introduces and uses should_stop_polling() to decide whether the device should be polled or not. Signed-off-by: Andrzej Pietrasiewicz <andrzej.p@collabora.com> Reviewed-by: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com> Reviewed-by: Amit Kucheria <amit.kucheria@linaro.org> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Link: https://lore.kernel.org/r/20200629122925.21729-10-andrzej.p@collabora.com
2020-06-29thermal: Explicitly enable non-changing thermal zone devicesAndrzej Pietrasiewicz
Some thermal zone devices never change their state, so they should be always enabled. Signed-off-by: Andrzej Pietrasiewicz <andrzej.p@collabora.com> Reviewed-by: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Link: https://lore.kernel.org/r/20200629122925.21729-9-andrzej.p@collabora.com
2020-06-29thermal: Use mode helpers in driversAndrzej Pietrasiewicz
Use thermal_zone_device_{en|dis}able() and thermal_zone_device_is_enabled(). Consequently, all set_mode() implementations in drivers: - can stop modifying tzd's "mode" member, - shall stop taking tzd's lock, as it is taken in the helpers - shall stop calling thermal_zone_device_update() as it is called in the helpers - can assume they are called when the mode truly changes, so checks to verify that can be dropped Not providing set_mode() by a driver no longer prevents the core from being able to set tzd's mode, so the relevant check in mode_store() is removed. Other comments: - acpi/thermal.c: tz->thermal_zone->mode will be updated only after we return from set_mode(), so use function parameter in thermal_set_mode() instead, no need to call acpi_thermal_check() in set_mode() - thermal/imx_thermal.c: regmap writes and mode assignment are done in thermal_zone_device_{en|dis}able() and set_mode() callback - thermal/intel/intel_quark_dts_thermal.c: soc_dts_{en|dis}able() are a part of set_mode() callback, so they don't need to modify tzd->mode, and don't need to fall back to the opposite mode if unsuccessful, as the return value will be propagated to thermal_zone_device_{en|dis}able() and ultimately tzd's member will not be changed in thermal_zone_device_set_mode(). - thermal/of-thermal.c: no need to set zone->mode to DISABLED in of_parse_thermal_zones() as a tzd is kzalloc'ed so mode is DISABLED anyway Signed-off-by: Andrzej Pietrasiewicz <andrzej.p@collabora.com> [for acerhdf] Acked-by: Peter Kaestle <peter@piie.net> Reviewed-by: Amit Kucheria <amit.kucheria@linaro.org> Reviewed-by: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Link: https://lore.kernel.org/r/20200629122925.21729-8-andrzej.p@collabora.com
2020-06-29thermal: Add mode helpersAndrzej Pietrasiewicz
Prepare for making the drivers not access tzd's private members. Signed-off-by: Andrzej Pietrasiewicz <andrzej.p@collabora.com> Reviewed-by: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com> [staticize thermal_zone_device_set_mode()] Signed-off-by: kernel test robot <lkp@intel.com> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Link: https://lore.kernel.org/r/20200629122925.21729-7-andrzej.p@collabora.com
2020-06-29thermal: remove get_mode() operation of driversAndrzej Pietrasiewicz
get_mode() is now redundant, as the state is stored in struct thermal_zone_device. Consequently the "mode" attribute in sysfs can always be visible, because it is always possible to get the mode from struct tzd. Signed-off-by: Andrzej Pietrasiewicz <andrzej.p@collabora.com> [for acerhdf] Acked-by: Peter Kaestle <peter@piie.net> Reviewed-by: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com> Reviewed-by: Amit Kucheria <amit.kucheria@linaro.org> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Link: https://lore.kernel.org/r/20200629122925.21729-6-andrzej.p@collabora.com
2020-06-29thermal: Store device mode in struct thermal_zone_deviceAndrzej Pietrasiewicz
Prepare for eliminating get_mode(). Signed-off-by: Andrzej Pietrasiewicz <andrzej.p@collabora.com> [for acerhdf] Acked-by: Peter Kaestle <peter@piie.net> Reviewed-by: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com> Reviewed-by: Amit Kucheria <amit.kucheria@linaro.org> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Link: https://lore.kernel.org/r/20200629122925.21729-5-andrzej.p@collabora.com
2020-06-29thermal: Add current mode to thermal zone deviceAndrzej Pietrasiewicz
Prepare for changing the place where the mode is stored: now it is in drivers, which might or might not implement get_mode()/set_mode() methods. A lot of cleanup can be done thanks to storing it in struct tzd. The get_mode() methods will become redundant. Signed-off-by: Andrzej Pietrasiewicz <andrzej.p@collabora.com> Reviewed-by: Guenter Roeck <linux@roeck-us.net> Reviewed-by: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com> Reviewed-by: Amit Kucheria <amit.kucheria@linaro.org> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Link: https://lore.kernel.org/r/20200629122925.21729-4-andrzej.p@collabora.com
2020-06-29thermal: Store thermal mode in a dedicated enumAndrzej Pietrasiewicz
Prepare for storing mode in struct thermal_zone_device. Signed-off-by: Andrzej Pietrasiewicz <andrzej.p@collabora.com> Reviewed-by: Guenter Roeck <linux@roeck-us.net> [for acerhdf] Acked-by: Peter Kaestle <peter@piie.net> Reviewed-by: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com> Reviewed-by: Amit Kucheria <amit.kucheria@linaro.org> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Link: https://lore.kernel.org/r/20200629122925.21729-3-andrzej.p@collabora.com
2020-06-29acpi: thermal: Fix error handling in the register functionAndrzej Pietrasiewicz
The acpi_thermal_register_thermal_zone() is missing any error handling. This needs to be fixed. Signed-off-by: Andrzej Pietrasiewicz <andrzej.p@collabora.com> Reviewed-by: Guenter Roeck <linux@roeck-us.net> Reviewed-by: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com> Reviewed-by: Amit Kucheria <amit.kucheria@linaro.org> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Link: https://lore.kernel.org/r/20200629122925.21729-2-andrzej.p@collabora.com
2020-06-28Linux 5.8-rc3v5.8-rc3Linus Torvalds
2020-06-28Merge tag 'arm-omap-fixes-5.8-1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc Pull ARM OMAP fixes from Arnd Bergmann: "The OMAP developers are particularly active at hunting down regressions, so this is a separate branch with OMAP specific fixes for v5.8: As Tony explains "The recent display subsystem (DSS) related platform data changes caused display related regressions for suspend and resume. Looks like I only tested suspend and resume before dropping the legacy platform data, and forgot to test it after dropping it. Turns out the main issue was that we no longer have platform code calling pm_runtime_suspend for DSS like we did for the legacy platform data case, and that fix is still being discussed on the dri-devel list and will get merged separately. The DSS related testing exposed a pile other other display related issues that also need fixing though": - Fix ti-sysc optional clock handling and reset status checks for devices that reset automatically in idle like DSS - Ignore ti-sysc clockactivity bit unless separately requested to avoid unexpected performance issues - Init ti-sysc framedonetv_irq to true and disable for am4 - Avoid duplicate DSS reset for legacy mode with dts data - Remove LCD timings for am4 as they cause warnings now that we're using generic panels Other OMAP changes from Tony include: - Fix omap_prm reset deassert as we still have drivers setting the pm_runtime_irq_safe() flag - Flush posted write for ti-sysc enable and disable - Fix droid4 spi related errors with spi flags - Fix am335x USB range and a typo for softreset - Fix dra7 timer nodes for clocks for IPU and DSP - Drop duplicate mailboxes after mismerge for dra7 - Prevent pocketgeagle header line signal from accidentally setting micro-SD write protection signal by removing the default mux - Fix NFSroot flakeyness after resume for duover by switching the smsc911x gpio interrupt to back to level sensitive - Fix regression for omap4 clockevent source after recent system timer changes - Yet another ethernet regression fix for the "rgmii" vs "rgmii-rxid" phy-mode - One patch to convert am3/am4 DT files to use the regular sdhci-omap driver instead of the old hsmmc driver, this was meant for the merge window but got lost in the process" * tag 'arm-omap-fixes-5.8-1' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc: (21 commits) ARM: dts: am5729: beaglebone-ai: fix rgmii phy-mode ARM: dts: Fix omap4 system timer source clocks ARM: dts: Fix duovero smsc interrupt for suspend ARM: dts: am335x-pocketbeagle: Fix mmc0 Write Protect Revert "bus: ti-sysc: Increase max softreset wait" ARM: dts: am437x-epos-evm: remove lcd timings ARM: dts: am437x-gp-evm: remove lcd timings ARM: dts: am437x-sk-evm: remove lcd timings ARM: dts: dra7-evm-common: Fix duplicate mailbox nodes ARM: dts: dra7: Fix timer nodes properly for timer_sys_ck clocks ARM: dts: Fix am33xx.dtsi ti,sysc-mask wrong softreset flag ARM: dts: Fix am33xx.dtsi USB ranges length bus: ti-sysc: Increase max softreset wait ARM: OMAP2+: Fix legacy mode dss_reset bus: ti-sysc: Fix uninitialized framedonetv_irq bus: ti-sysc: Ignore clockactivity unless specified as a quirk bus: ti-sysc: Use optional clocks on for enable and wait for softreset bit ARM: dts: omap4-droid4: Fix spi configuration and increase rate bus: ti-sysc: Flush posted write on enable and disable soc: ti: omap-prm: use atomic iopoll instead of sleeping one ...
2020-06-28Merge tag 'arm-fixes-5.8-1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc Pull ARM SoC fixes from Arnd Bergmann: "Here are a couple of bug fixes, mostly for devicetree files NXP i.MX: - Use correct voltage on some i.MX8M board device trees to avoid hardware damage - Code fixes for a compiler warning and incorrect reference counting, both harmless. - Fix the i.MX8M SoC driver to correctly identify imx8mp - Fix watchdog configuration in imx6ul-kontron device tree. Broadcom: - A small regression fix for the Raspberry-Pi firmware driver - A Kconfig change to use the correct timer driver on Northstar - A DT fix for the Luxul XWC-2000 machine - Two more DT fixes for NSP SoCs STmicroelectronics STI - Revert one broken patch for L2 cache configuration ARM Versatile Express: - Fix a regression by reverting a broken DT cleanup TEE drivers: - MAINTAINERS: change tee mailing list" * tag 'arm-fixes-5.8-1' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc: Revert "ARM: sti: Implement dummy L2 cache's write_sec" soc: imx8m: fix build warning ARM: imx6: add missing put_device() call in imx6q_suspend_init() ARM: imx5: add missing put_device() call in imx_suspend_alloc_ocram() soc: imx8m: Correct i.MX8MP UID fuse offset ARM: dts: imx6ul-kontron: Change WDOG_ANY signal from push-pull to open-drain ARM: dts: imx6ul-kontron: Move watchdog from Kontron i.MX6UL/ULL board to SoM arm64: dts: imx8mm-beacon: Fix voltages on LDO1 and LDO2 arm64: dts: imx8mn-ddr4-evk: correct ldo1/ldo2 voltage range arm64: dts: imx8mm-evk: correct ldo1/ldo2 voltage range ARM: dts: NSP: Correct FA2 mailbox node ARM: bcm2835: Fix integer overflow in rpi_firmware_print_firmware_revision() MAINTAINERS: change tee mailing list ARM: dts: NSP: Disable PL330 by default, add dma-coherent property ARM: bcm: Select ARM_TIMER_SP804 for ARCH_BCM_NSP ARM: dts: BCM5301X: Add missing memory "device_type" for Luxul XWC-2000 arm: dts: vexpress: Move mcc node back into motherboard node
2020-06-28Merge tag 'timers-urgent-2020-06-28' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull timer fix from Ingo Molnar: "A single DocBook fix" * tag 'timers-urgent-2020-06-28' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: timekeeping: Fix kerneldoc system_device_crosststamp & al
2020-06-28Merge tag 'perf-urgent-2020-06-28' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull perf fix from Ingo Molnar: "A single Kbuild dependency fix" * tag 'perf-urgent-2020-06-28' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: perf/x86/rapl: Fix RAPL config variable bug
2020-06-28Merge tag 'efi-urgent-2020-06-28' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull EFI fixes from Ingo Molnar: - Fix build regression on v4.8 and older - Robustness fix for TPM log parsing code - kobject refcount fix for the ESRT parsing code - Two efivarfs fixes to make it behave more like an ordinary file system - Style fixup for zero length arrays - Fix a regression in path separator handling in the initrd loader - Fix a missing prototype warning - Add some kerneldoc headers for newly introduced stub routines - Allow support for SSDT overrides via EFI variables to be disabled - Report CPU mode and MMU state upon entry for 32-bit ARM - Use the correct stack pointer alignment when entering from mixed mode * tag 'efi-urgent-2020-06-28' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: efi/libstub: arm: Print CPU boot mode and MMU state at boot efi/libstub: arm: Omit arch specific config table matching array on arm64 efi/x86: Setup stack correctly for efi_pe_entry efi: Make it possible to disable efivar_ssdt entirely efi/libstub: Descriptions for stub helper functions efi/libstub: Fix path separator regression efi/libstub: Fix missing-prototype warning for skip_spaces() efi: Replace zero-length array and use struct_size() helper efivarfs: Don't return -EINTR when rate-limiting reads efivarfs: Update inode modification time for successful writes efi/esrt: Fix reference count leak in esre_create_sysfs_entry. efi/tpm: Verify event log header before parsing efi/x86: Fix build with gcc 4
2020-06-28Merge tag 'sched_urgent_for_5.8_rc3' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull scheduler fixes from Borislav Petkov: "The most anticipated fix in this pull request is probably the horrible build fix for the RANDSTRUCT fail that didn't make -rc2. Also included is the cleanup that removes those BUILD_BUG_ON()s and replaces it with ugly unions. Also included is the try_to_wake_up() race fix that was first triggered by Paul's RCU-torture runs, but was independently hit by Dave Chinner's fstest runs as well" * tag 'sched_urgent_for_5.8_rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: sched/cfs: change initial value of runnable_avg smp, irq_work: Continue smp_call_function*() and irq_work*() integration sched/core: s/WF_ON_RQ/WQ_ON_CPU/ sched/core: Fix ttwu() race sched/core: Fix PI boosting between RT and DEADLINE tasks sched/deadline: Initialize ->dl_boosted sched/core: Check cpus_mask, not cpus_ptr in __set_cpus_allowed_ptr(), to fix mask corruption sched/core: Fix CONFIG_GCC_PLUGIN_RANDSTRUCT build fail
2020-06-28Merge tag 'x86_urgent_for_5.8_rc3' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 fixes from Borislav Petkov: - AMD Memory bandwidth counter width fix, by Babu Moger. - Use the proper length type in the 32-bit truncate() syscall variant, by Jiri Slaby. - Reinit IA32_FEAT_CTL during wakeup to fix the case where after resume, VMXON would #GP due to VMX not being properly enabled, by Sean Christopherson. - Fix a static checker warning in the resctrl code, by Dan Carpenter. - Add a CR4 pinning mask for bits which cannot change after boot, by Kees Cook. - Align the start of the loop of __clear_user() to 16 bytes, to improve performance on AMD zen1 and zen2 microarchitectures, by Matt Fleming. * tag 'x86_urgent_for_5.8_rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/asm/64: Align start of __clear_user() loop to 16-bytes x86/cpu: Use pinning mask for CR4 bits needing to be 0 x86/resctrl: Fix a NULL vs IS_ERR() static checker warning in rdt_cdp_peer_get() x86/cpu: Reinitialize IA32_FEAT_CTL MSR on BSP during wakeup syscalls: Fix offset type of ksys_ftruncate() x86/resctrl: Fix memory bandwidth counter width for AMD
2020-06-28Merge tag 'rcu_urgent_for_5.8_rc3' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull RCU-vs-KCSAN fixes from Borislav Petkov: "A single commit that uses "arch_" atomic operations to avoid the instrumentation that comes with the non-"arch_" versions. In preparation for that commit, it also has another commit that makes these "arch_" atomic operations available to generic code. Without these commits, KCSAN uses can see pointless errors" * tag 'rcu_urgent_for_5.8_rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: rcu: Fixup noinstr warnings locking/atomics: Provide the arch_atomic_ interface to generic code
2020-06-28Merge tag 'objtool_urgent_for_5.8_rc3' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull objtool fixes from Borislav Petkov: "Three fixes from Peter Zijlstra suppressing KCOV instrumentation in noinstr sections. Peter Zijlstra says: "Address KCOV vs noinstr. There is no function attribute to selectively suppress KCOV instrumentation, instead teach objtool to NOP out the calls in noinstr functions" This cures a bunch of KCOV crashes (as used by syzcaller)" * tag 'objtool_urgent_for_5.8_rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: objtool: Fix noinstr vs KCOV objtool: Provide elf_write_{insn,reloc}() objtool: Clean up elf_write() condition
2020-06-28Merge tag 'x86_entry_for_5.8' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 entry fixes from Borislav Petkov: "This is the x86/entry urgent pile which has accumulated since the merge window. It is not the smallest but considering the almost complete entry core rewrite, the amount of fixes to follow is somewhat higher than usual, which is to be expected. Peter Zijlstra says: 'These patches address a number of instrumentation issues that were found after the x86/entry overhaul. When combined with rcu/urgent and objtool/urgent, these patches make UBSAN/KASAN/KCSAN happy again. Part of making this all work is bumping the minimum GCC version for KASAN builds to gcc-8.3, the reason for this is that the __no_sanitize_address function attribute is broken in GCC releases before that. No known GCC version has a working __no_sanitize_undefined, however because the only noinstr violation that results from this happens when an UB is found, we treat it like WARN. That is, we allow it to violate the noinstr rules in order to get the warning out'" * tag 'x86_entry_for_5.8' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/entry: Fix #UD vs WARN more x86/entry: Increase entry_stack size to a full page x86/entry: Fixup bad_iret vs noinstr objtool: Don't consider vmlinux a C-file kasan: Fix required compiler version compiler_attributes.h: Support no_sanitize_undefined check with GCC 4 x86/entry, bug: Comment the instrumentation_begin() usage for WARN() x86/entry, ubsan, objtool: Whitelist __ubsan_handle_*() x86/entry, cpumask: Provide non-instrumented variant of cpu_is_offline() compiler_types.h: Add __no_sanitize_{address,undefined} to noinstr kasan: Bump required compiler version x86, kcsan: Add __no_kcsan to noinstr kcsan: Remove __no_kcsan_or_inline x86, kcsan: Remove __no_kcsan_or_inline usage
2020-06-28sched/cfs: change initial value of runnable_avgsched_urgent_for_5.8_rc3Vincent Guittot
Some performance regression on reaim benchmark have been raised with commit 070f5e860ee2 ("sched/fair: Take into account runnable_avg to classify group") The problem comes from the init value of runnable_avg which is initialized with max value. This can be a problem if the newly forked task is finally a short task because the group of CPUs is wrongly set to overloaded and tasks are pulled less agressively. Set initial value of runnable_avg equals to util_avg to reflect that there is no waiting time so far. Fixes: 070f5e860ee2 ("sched/fair: Take into account runnable_avg to classify group") Reported-by: kernel test robot <rong.a.chen@intel.com> Signed-off-by: Vincent Guittot <vincent.guittot@linaro.org> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/20200624154422.29166-1-vincent.guittot@linaro.org
2020-06-28smp, irq_work: Continue smp_call_function*() and irq_work*() integrationPeter Zijlstra
Instead of relying on BUG_ON() to ensure the various data structures line up, use a bunch of horrible unions to make it all automatic. Much of the union magic is to ensure irq_work and smp_call_function do not (yet) see the members of their respective data structures change name. Suggested-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Ingo Molnar <mingo@kernel.org> Reviewed-by: Frederic Weisbecker <frederic@kernel.org> Link: https://lkml.kernel.org/r/20200622100825.844455025@infradead.org
2020-06-28sched/core: s/WF_ON_RQ/WQ_ON_CPU/Peter Zijlstra
Use a better name for this poorly named flag, to avoid confusion... Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Ingo Molnar <mingo@kernel.org> Acked-by: Mel Gorman <mgorman@suse.de> Link: https://lkml.kernel.org/r/20200622100825.785115830@infradead.org
2020-06-28sched/core: Fix ttwu() racePeter Zijlstra
Paul reported rcutorture occasionally hitting a NULL deref: sched_ttwu_pending() ttwu_do_wakeup() check_preempt_curr() := check_preempt_wakeup() find_matching_se() is_same_group() if (se->cfs_rq == pse->cfs_rq) <-- *BOOM* Debugging showed that this only appears to happen when we take the new code-path from commit: 2ebb17717550 ("sched/core: Offload wakee task activation if it the wakee is descheduling") and only when @cpu == smp_processor_id(). Something which should not be possible, because p->on_cpu can only be true for remote tasks. Similarly, without the new code-path from commit: c6e7bd7afaeb ("sched/core: Optimize ttwu() spinning on p->on_cpu") this would've unconditionally hit: smp_cond_load_acquire(&p->on_cpu, !VAL); and if: 'cpu == smp_processor_id() && p->on_cpu' is possible, this would result in an instant live-lock (with IRQs disabled), something that hasn't been reported. The NULL deref can be explained however if the task_cpu(p) load at the beginning of try_to_wake_up() returns an old value, and this old value happens to be smp_processor_id(). Further assume that the p->on_cpu load accurately returns 1, it really is still running, just not here. Then, when we enqueue the task locally, we can crash in exactly the observed manner because p->se.cfs_rq != rq->cfs_rq, because p's cfs_rq is from the wrong CPU, therefore we'll iterate into the non-existant parents and NULL deref. The closest semi-plausible scenario I've managed to contrive is somewhat elaborate (then again, actual reproduction takes many CPU hours of rcutorture, so it can't be anything obvious): X->cpu = 1 rq(1)->curr = X CPU0 CPU1 CPU2 // switch away from X LOCK rq(1)->lock smp_mb__after_spinlock dequeue_task(X) X->on_rq = 9 switch_to(Z) X->on_cpu = 0 UNLOCK rq(1)->lock // migrate X to cpu 0 LOCK rq(1)->lock dequeue_task(X) set_task_cpu(X, 0) X->cpu = 0 UNLOCK rq(1)->lock LOCK rq(0)->lock enqueue_task(X) X->on_rq = 1 UNLOCK rq(0)->lock // switch to X LOCK rq(0)->lock smp_mb__after_spinlock switch_to(X) X->on_cpu = 1 UNLOCK rq(0)->lock // X goes sleep X->state = TASK_UNINTERRUPTIBLE smp_mb(); // wake X ttwu() LOCK X->pi_lock smp_mb__after_spinlock if (p->state) cpu = X->cpu; // =? 1 smp_rmb() // X calls schedule() LOCK rq(0)->lock smp_mb__after_spinlock dequeue_task(X) X->on_rq = 0 if (p->on_rq) smp_rmb(); if (p->on_cpu && ttwu_queue_wakelist(..)) [*] smp_cond_load_acquire(&p->on_cpu, !VAL) cpu = select_task_rq(X, X->wake_cpu, ...) if (X->cpu != cpu) switch_to(Y) X->on_cpu = 0 UNLOCK rq(0)->lock However I'm having trouble convincing myself that's actually possible on x86_64 -- after all, every LOCK implies an smp_mb() there, so if ttwu observes ->state != RUNNING, it must also observe ->cpu != 1. (Most of the previous ttwu() races were found on very large PowerPC) Nevertheless, this fully explains the observed failure case. Fix it by ordering the task_cpu(p) load after the p->on_cpu load, which is easy since nothing actually uses @cpu before this. Fixes: c6e7bd7afaeb ("sched/core: Optimize ttwu() spinning on p->on_cpu") Reported-by: Paul E. McKenney <paulmck@kernel.org> Tested-by: Paul E. McKenney <paulmck@kernel.org> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Ingo Molnar <mingo@kernel.org> Link: https://lkml.kernel.org/r/20200622125649.GC576871@hirez.programming.kicks-ass.net
2020-06-28sched/core: Fix PI boosting between RT and DEADLINE tasksJuri Lelli
syzbot reported the following warning: WARNING: CPU: 1 PID: 6351 at kernel/sched/deadline.c:628 enqueue_task_dl+0x22da/0x38a0 kernel/sched/deadline.c:1504 At deadline.c:628 we have: 623 static inline void setup_new_dl_entity(struct sched_dl_entity *dl_se) 624 { 625 struct dl_rq *dl_rq = dl_rq_of_se(dl_se); 626 struct rq *rq = rq_of_dl_rq(dl_rq); 627 628 WARN_ON(dl_se->dl_boosted); 629 WARN_ON(dl_time_before(rq_clock(rq), dl_se->deadline)); [...] } Which means that setup_new_dl_entity() has been called on a task currently boosted. This shouldn't happen though, as setup_new_dl_entity() is only called when the 'dynamic' deadline of the new entity is in the past w.r.t. rq_clock and boosted tasks shouldn't verify this condition. Digging through the PI code I noticed that what above might in fact happen if an RT tasks blocks on an rt_mutex hold by a DEADLINE task. In the first branch of boosting conditions we check only if a pi_task 'dynamic' deadline is earlier than mutex holder's and in this case we set mutex holder to be dl_boosted. However, since RT 'dynamic' deadlines are only initialized if such tasks get boosted at some point (or if they become DEADLINE of course), in general RT 'dynamic' deadlines are usually equal to 0 and this verifies the aforementioned condition. Fix it by checking that the potential donor task is actually (even if temporary because in turn boosted) running at DEADLINE priority before using its 'dynamic' deadline value. Fixes: 2d3d891d3344 ("sched/deadline: Add SCHED_DEADLINE inheritance logic") Reported-by: syzbot+119ba87189432ead09b4@syzkaller.appspotmail.com Signed-off-by: Juri Lelli <juri.lelli@redhat.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Ingo Molnar <mingo@kernel.org> Reviewed-by: Daniel Bristot de Oliveira <bristot@redhat.com> Tested-by: Daniel Wagner <dwagner@suse.de> Link: https://lkml.kernel.org/r/20181119153201.GB2119@localhost.localdomain