summaryrefslogtreecommitdiff
path: root/drivers/base
AgeCommit message (Collapse)Author
2021-06-30software node: Handle software node injection to an existing device properlyHeikki Krogerus
[ Upstream commit 5dca69e26fe97f17d4a6cbd6872103c868577b14 ] The function software_node_notify() - the function that creates and removes the symlinks between the node and the device - was called unconditionally in device_add_software_node() and device_remove_software_node(), but it needs to be called in those functions only in the special case where the node is added to a device that has already been registered. This fixes NULL pointer dereference that happens if device_remove_software_node() is used with device that was never registered. Fixes: b622b24519f5 ("software node: Allow node addition to already existing device") Reported-and-tested-by: Dominik Brodowski <linux@dominikbrodowski.net> Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Signed-off-by: Heikki Krogerus <heikki.krogerus@linux.intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-06-03drivers: base: Fix device link removalRafael J. Wysocki
commit 80dd33cf72d1ab4f0af303f1fa242c6d6c8d328f upstream. When device_link_free() drops references to the supplier and consumer devices of the device link going away and the reference being dropped turns out to be the last one for any of those device objects, its ->release callback will be invoked and it may sleep which goes against the SRCU callback execution requirements. To address this issue, make the device link removal code carry out the device_link_free() actions preceded by SRCU synchronization from a separate work item (the "long" workqueue is used for that, because it does not matter when the device link memory is released and it may take time to get to that point) instead of using SRCU callbacks. While at it, make the code work analogously when SRCU is not enabled to reduce the differences between the SRCU and non-SRCU cases. Fixes: 843e600b8a2b ("driver core: Fix sleeping in invalid context during device link deletion") Cc: stable <stable@vger.kernel.org> Reported-by: chenxiang (M) <chenxiang66@hisilicon.com> Tested-by: chenxiang (M) <chenxiang66@hisilicon.com> Reviewed-by: Saravana Kannan <saravanak@google.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Link: https://lore.kernel.org/r/5722787.lOV4Wx5bFT@kreacher Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-05-19PM: runtime: Fix unpaired parent child_count for force_resumeTony Lindgren
commit c745253e2a691a40c66790defe85c104a887e14a upstream. As pm_runtime_need_not_resume() relies also on usage_count, it can return a different value in pm_runtime_force_suspend() compared to when called in pm_runtime_force_resume(). Different return values can happen if anything calls PM runtime functions in between, and causes the parent child_count to increase on every resume. So far I've seen the issue only for omapdrm that does complicated things with PM runtime calls during system suspend for legacy reasons: omap_atomic_commit_tail() for omapdrm.0 dispc_runtime_get() wakes up 58000000.dss as it's the dispc parent dispc_runtime_resume() rpm_resume() increases parent child_count dispc_runtime_put() won't idle, PM runtime suspend blocked pm_runtime_force_suspend() for 58000000.dss, !pm_runtime_need_not_resume() __update_runtime_status() system suspended pm_runtime_force_resume() for 58000000.dss, pm_runtime_need_not_resume() pm_runtime_enable() only called because of pm_runtime_need_not_resume() omap_atomic_commit_tail() for omapdrm.0 dispc_runtime_get() wakes up 58000000.dss as it's the dispc parent dispc_runtime_resume() rpm_resume() increases parent child_count dispc_runtime_put() won't idle, PM runtime suspend blocked ... rpm_suspend for 58000000.dss but parent child_count is now unbalanced Let's fix the issue by adding a flag for needs_force_resume and use it in pm_runtime_force_resume() instead of pm_runtime_need_not_resume(). Additionally omapdrm system suspend could be simplified later on to avoid lots of unnecessary PM runtime calls and the complexity it adds. The driver can just use internal functions that are shared between the PM runtime and system suspend related functions. Fixes: 4918e1f87c5f ("PM / runtime: Rework pm_runtime_force_suspend/resume()") Signed-off-by: Tony Lindgren <tony@atomide.com> Reviewed-by: Ulf Hansson <ulf.hansson@linaro.org> Tested-by: Tomi Valkeinen <tomi.valkeinen@ideasonboard.com> Cc: 4.16+ <stable@vger.kernel.org> # 4.16+ Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-05-14node: fix device cleanups in error handling codeDan Carpenter
[ Upstream commit 4ce535ec0084f0d712317cb99d383cad3288e713 ] We can't use kfree() to free device managed resources so the kfree(dev) is against the rules. It's easier to write this code if we open code the device_register() as a device_initialize() and device_add(). That way if dev_set_name() set name fails we can call put_device() and it will clean up correctly. Fixes: acc02a109b04 ("node: Add memory-side caching attributes") Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Link: https://lore.kernel.org/r/YHA0JUra+F64+NpB@mwanda Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-05-14devtmpfs: fix placement of complete() callRasmus Villemoes
[ Upstream commit 38f087de8947700d3b06d3d1594490e0f611c5d1 ] Calling complete() from within the __init function is wrong - theoretically, the init process could proceed all the way to freeing the init mem before the devtmpfsd thread gets to execute the return instruction in devtmpfs_setup(). In practice, it seems to be harmless as gcc inlines devtmpfs_setup() into devtmpfsd(). So the calls of the __init functions init_chdir() etc. actually happen from devtmpfs_setup(), but the __ref on that one silences modpost (it's all right, because those calls happen before the complete()). But it does make the __init annotation of the setup function moot, which we'll fix in a subsequent patch. Fixes: bcbacc4909f1 ("devtmpfs: refactor devtmpfsd()") Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk> Link: https://lore.kernel.org/r/20210312103027.2701413-1-linux@rasmusvillemoes.dk Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-05-14regmap: set debugfs_name to NULL after it is freedMeng Li
[ Upstream commit e41a962f82e7afb5b1ee644f48ad0b3aee656268 ] There is a upstream commit cffa4b2122f5("regmap:debugfs: Fix a memory leak when calling regmap_attach_dev") that adds a if condition when create name for debugfs_name. With below function invoking logical, debugfs_name is freed in regmap_debugfs_exit(), but it is not created again because of the if condition introduced by above commit. regmap_reinit_cache() regmap_debugfs_exit() ... regmap_debugfs_init() So, set debugfs_name to NULL after it is freed. Fixes: cffa4b2122f5 ("regmap: debugfs: Fix a memory leak when calling regmap_attach_dev") Signed-off-by: Meng Li <Meng.Li@windriver.com> Link: https://lore.kernel.org/r/20210226021737.7690-1-Meng.Li@windriver.com Signed-off-by: Mark Brown <broonie@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-05-14software node: Allow node addition to already existing deviceHeikki Krogerus
commit b622b24519f5b008f6d4e20e5675eaffa8fbd87b upstream. If the node is added to an already exiting device, the node needs to be also linked to the device separately. This will make sure the reference count is kept in balance also when the node is injected to a device afterwards. Fixes: e68d0119e328 ("software node: Introduce device_add_software_node()") Reported-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com> Signed-off-by: Heikki Krogerus <heikki.krogerus@linux.intel.com> Cc: stable <stable@vger.kernel.org> Link: https://lore.kernel.org/r/20210414075438.64547-1-heikki.krogerus@linux.intel.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-04-05driver core: Fix locking bug in deferred_probe_timeout_work_func()Saravana Kannan
list_for_each_entry_safe() is only useful if we are deleting nodes in a linked list within the loop. It doesn't protect against other threads adding/deleting nodes to the list in parallel. We need to grab deferred_probe_mutex when traversing the deferred_probe_pending_list. Cc: stable@vger.kernel.org Fixes: 25b4e70dcce9 ("driver core: allow stopping deferred probe after init") Signed-off-by: Saravana Kannan <saravanak@google.com> Link: https://lore.kernel.org/r/20210402040342.2944858-2-saravanak@google.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-04-03Merge tag 'driver-core-5.12-rc6' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core Pull driver core fix from Greg KH: "Here is a single driver core fix for a reported problem with differed probing. It has been in linux-next for a while with no reported problems" * tag 'driver-core-5.12-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core: driver core: clear deferred probe reason on probe retry
2021-03-29PM: runtime: Fix race getting/putting suppliers at probeAdrian Hunter
pm_runtime_put_suppliers() must not decrement rpm_active unless the consumer is suspended. That is because, otherwise, it could suspend suppliers for an active consumer. That can happen as follows: static int driver_probe_device(struct device_driver *drv, struct device *dev) { int ret = 0; if (!device_is_registered(dev)) return -ENODEV; dev->can_match = true; pr_debug("bus: '%s': %s: matched device %s with driver %s\n", drv->bus->name, __func__, dev_name(dev), drv->name); pm_runtime_get_suppliers(dev); if (dev->parent) pm_runtime_get_sync(dev->parent); At this point, dev can runtime suspend so rpm_put_suppliers() can run, rpm_active becomes 1 (the lowest value). pm_runtime_barrier(dev); if (initcall_debug) ret = really_probe_debug(dev, drv); else ret = really_probe(dev, drv); Probe callback can have runtime resumed dev, and then runtime put so dev is awaiting autosuspend, but rpm_active is 2. pm_request_idle(dev); if (dev->parent) pm_runtime_put(dev->parent); pm_runtime_put_suppliers(dev); Now pm_runtime_put_suppliers() will put the supplier i.e. rpm_active 2 -> 1, but consumer can still be active. return ret; } Fix by checking the runtime status. For any status other than RPM_SUSPENDED, rpm_active can be considered to be "owned" by rpm_[get/put]_suppliers() and pm_runtime_put_suppliers() need do nothing. Reported-by: Asutosh Das <asutoshd@codeaurora.org> Fixes: 4c06c4e6cf63 ("driver core: Fix possible supplier PM-usage counter imbalance") Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: 5.1+ <stable@vger.kernel.org> # 5.1+ Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2021-03-29PM: runtime: Fix ordering in pm_runtime_get_suppliers()Adrian Hunter
rpm_active indicates how many times the supplier usage_count has been incremented. Consequently it must be updated after pm_runtime_get_sync() of the supplier, not before. Fixes: 4c06c4e6cf63 ("driver core: Fix possible supplier PM-usage counter imbalance") Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: 5.1+ <stable@vger.kernel.org> # 5.1+ Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2021-03-23driver core: clear deferred probe reason on probe retryAhmad Fatoum
When retrying a deferred probe, any old defer reason string should be discarded. Otherwise, if the probe is deferred again at a different spot, but without setting a message, the now incorrect probe reason will remain. This was observed with the i.MX I2C driver, which ultimately failed to probe due to lack of the GPIO driver. The probe defer for GPIO doesn't record a message, but a previous probe defer to clock_get did. This had the effect that /sys/kernel/debug/devices_deferred listed a misleading probe deferral reason. Cc: stable <stable@vger.kernel.org> Fixes: d090b70ede02 ("driver core: add deferring probe reason to devices_deferred property") Reviewed-by: Andy Shevchenko <andy.shevchenko@gmail.com> Reviewed-by: Andrzej Hajda <a.hajda@samsung.com> Signed-off-by: Ahmad Fatoum <a.fatoum@pengutronix.de> Link: https://lore.kernel.org/r/20210319110459.19966-1-a.fatoum@pengutronix.de Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-03-22PM: runtime: Defer suspending suppliersRafael J. Wysocki
Because the PM-runtime status of the device is not updated in __rpm_callback(), attempts to suspend the suppliers of the given device triggered by the rpm_put_suppliers() call in there may cause a supplier to be suspended completely before the status of the consumer is updated to RPM_SUSPENDED, which is confusing. To avoid that (1) modify __rpm_callback() to only decrease the PM-runtime usage counter of each supplier and (2) make rpm_suspend() try to suspend the suppliers after changing the consumer's status to RPM_SUSPENDED, in analogy with the device's parent. Link: https://lore.kernel.org/linux-pm/CAPDyKFqm06KDw_p8WXsM4dijDbho4bb6T4k50UqqvR1_COsp8g@mail.gmail.com/ Fixes: 21d5c57b3726 ("PM / runtime: Use device links") Reported-by: elaine.zhang <zhangqing@rock-chips.com> Diagnosed-by: Ulf Hansson <ulf.hansson@linaro.org> Reviewed-by: Ulf Hansson <ulf.hansson@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2021-03-19Revert "PM: runtime: Update device status before letting suppliers suspend"Rafael J. Wysocki
Revert commit 44cc89f76464 ("PM: runtime: Update device status before letting suppliers suspend") that introduced a race condition into __rpm_callback() which allowed a concurrent rpm_resume() to run and resume the device prematurely after its status had been changed to RPM_SUSPENDED by __rpm_callback(). Fixes: 44cc89f76464 ("PM: runtime: Update device status before letting suppliers suspend") Link: https://lore.kernel.org/linux-pm/24dfb6fc-5d54-6ee2-9195-26428b7ecf8a@intel.com/ Reported-by: Adrian Hunter <adrian.hunter@intel.com> Cc: 4.10+ <stable@vger.kernel.org> # 4.10+ Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Reviewed-by: Ulf Hansson <ulf.hansson@linaro.org>
2021-03-10software node: Fix device_add_software_node()Heikki Krogerus
The function device_add_software_node() was meant to register the node supplied to it, but only if that node wasn't already registered. Right now the function attempts to always register the node. That will cause a failure with nodes that are already registered. Fixing that by incrementing the reference count of the nodes that have already been registered, and only registering the new nodes. Also, clarifying the behaviour in the function documentation. Fixes: e68d0119e328 ("software node: Introduce device_add_software_node()") Signed-off-by: Heikki Krogerus <heikki.krogerus@linux.intel.com> Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Tested-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2021-03-10software node: Fix node registrationHeikki Krogerus
Software node can not be registered before its parent. Fixes: 80488a6b1d3c ("software node: Add support for static node descriptors") Cc: 5.10+ <stable@vger.kernel.org> # 5.10+ Signed-off-by: Heikki Krogerus <heikki.krogerus@linux.intel.com> Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Tested-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2021-03-01PM: runtime: Update device status before letting suppliers suspendRafael J. Wysocki
Because the PM-runtime status of the device is not updated in __rpm_callback(), attempts to suspend the suppliers of the given device triggered by rpm_put_suppliers() called by it may fail. Fix this by making __rpm_callback() update the device's status to RPM_SUSPENDED before calling rpm_put_suppliers() if the current status of the device is RPM_SUSPENDING and the callback just invoked by it has returned 0 (success). While at it, modify the code in __rpm_callback() to always check the device's PM-runtime status under its PM lock. Link: https://lore.kernel.org/linux-pm/CAPDyKFqm06KDw_p8WXsM4dijDbho4bb6T4k50UqqvR1_COsp8g@mail.gmail.com/ Fixes: 21d5c57b3726 ("PM / runtime: Use device links") Reported-by: Elaine Zhang <zhangqing@rock-chips.com> Diagnosed-by: Ulf Hansson <ulf.hansson@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Tested-by: Elaine Zhang <zhangiqng@rock-chips.com> Reviewed-by: Ulf Hansson <ulf.hansson@linaro.org> Cc: 4.10+ <stable@vger.kernel.org> # 4.10+ Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2021-02-26Merge tag 'riscv-for-linus-5.12-mw0' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux Pull RISC-V updates from Palmer Dabbelt: "A handful of new RISC-V related patches for this merge window: - A check to ensure drivers are properly using uaccess. This isn't manifesting with any of the drivers I'm currently using, but may catch errors in new drivers. - Some preliminary support for the FU740, along with the HiFive Unleashed it will appear on. - NUMA support for RISC-V, which involves making the arm64 code generic. - Support for kasan on the vmalloc region. - A handful of new drivers for the Kendryte K210, along with the DT plumbing required to boot on a handful of K210-based boards. - Support for allocating ASIDs. - Preliminary support for kernels larger than 128MiB. - Various other improvements to our KASAN support, including the utilization of huge pages when allocating the KASAN regions. We may have already found a bug with the KASAN_VMALLOC code, but it's passing my tests. There's a fix in the works, but that will probably miss the merge window. * tag 'riscv-for-linus-5.12-mw0' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux: (75 commits) riscv: Improve kasan population by using hugepages when possible riscv: Improve kasan population function riscv: Use KASAN_SHADOW_INIT define for kasan memory initialization riscv: Improve kasan definitions riscv: Get rid of MAX_EARLY_MAPPING_SIZE soc: canaan: Sort the Makefile alphabetically riscv: Disable KSAN_SANITIZE for vDSO riscv: Remove unnecessary declaration riscv: Add Canaan Kendryte K210 SD card defconfig riscv: Update Canaan Kendryte K210 defconfig riscv: Add Kendryte KD233 board device tree riscv: Add SiPeed MAIXDUINO board device tree riscv: Add SiPeed MAIX GO board device tree riscv: Add SiPeed MAIX DOCK board device tree riscv: Add SiPeed MAIX BiT board device tree riscv: Update Canaan Kendryte K210 device tree dt-bindings: add resets property to dw-apb-timer dt-bindings: fix sifive gpio properties dt-bindings: update sifive uart compatible string dt-bindings: update sifive clint compatible string ...
2021-02-26drivers/base/memory: don't store phys_device in memory blocksDavid Hildenbrand
No need to store the value for each and every memory block, as we can easily query the value at runtime. Reshuffle the members to optimize the memory layout. Also, let's clarify what the interface once was used for and why it's legacy nowadays. "phys_device" was used on s390x in older versions of lsmem[2]/chmem[3], back when they were still part of s390x-tools. They were later replaced by the variants in linux-utils. For example, RHEL6 and RHEL7 contain lsmem/chmem from s390-utils. RHEL8 switched to versions from util-linux on s390x [4]. "phys_device" was added with sysfs support for memory hotplug in commit 3947be1969a9 ("[PATCH] memory hotplug: sysfs and add/remove functions") in 2005. It always returned 0. s390x started returning something != 0 on some setups (if sclp.rzm is set by HW) in 2010 via commit 57b552ba0b2f ("memory hotplug/s390: set phys_device"). For s390x, it allowed for identifying which memory block devices belong to the same storage increment (RZM). Only if all memory block devices comprising a single storage increment were offline, the memory could actually be removed in the hypervisor. Since commit e5d709bb5fb7 ("s390/memory hotplug: provide memory_block_size_bytes() function") in 2013 a memory block device spans at least one storage increment - which is why the interface isn't really helpful/used anymore (except by old lsmem/chmem tools). There were once RFC patches to make use of "phys_device" in ACPI context; however, the underlying problem could be solved using different interfaces [1]. [1] https://patchwork.kernel.org/patch/2163871/ [2] https://github.com/ibm-s390-tools/s390-tools/blob/v2.1.0/zconf/lsmem [3] https://github.com/ibm-s390-tools/s390-tools/blob/v2.1.0/zconf/chmem [4] https://bugzilla.redhat.com/show_bug.cgi?id=1504134 Link: https://lkml.kernel.org/r/20210201181347.13262-2-david@redhat.com Signed-off-by: David Hildenbrand <david@redhat.com> Acked-by: Michal Hocko <mhocko@suse.com> Reviewed-by: Oscar Salvador <osalvador@suse.de> Cc: Dave Hansen <dave.hansen@intel.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Gerald Schaefer <gerald.schaefer@linux.ibm.com> Cc: Jonathan Corbet <corbet@lwn.net> Cc: "Rafael J. Wysocki" <rafael@kernel.org> Cc: Mauro Carvalho Chehab <mchehab+huawei@kernel.org> Cc: Ilya Dryomov <idryomov@gmail.com> Cc: Vaibhav Jain <vaibhav@linux.ibm.com> Cc: Tom Rix <trix@redhat.com> Cc: Geert Uytterhoeven <geert+renesas@glider.be> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-02-26mm/memory_hotplug: rename all existing 'memhp' into 'mhp'Anshuman Khandual
This renames all 'memhp' instances to 'mhp' except for memhp_default_state for being a kernel command line option. This is just a clean up and should not cause a functional change. Let's make it consistent rater than mixing the two prefixes. In preparation for more users of the 'mhp' terminology. Link: https://lkml.kernel.org/r/1611554093-27316-1-git-send-email-anshuman.khandual@arm.com Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com> Suggested-by: David Hildenbrand <david@redhat.com> Reviewed-by: David Hildenbrand <david@redhat.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: "Rafael J. Wysocki" <rafael@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-02-24Merge branch 'akpm' (patches from Andrew)Linus Torvalds
Merge misc updates from Andrew Morton: "A few small subsystems and some of MM. 172 patches. Subsystems affected by this patch series: hexagon, scripts, ntfs, ocfs2, vfs, and mm (slab-generic, slab, slub, debug, pagecache, swap, memcg, pagemap, mprotect, mremap, page-reporting, vmalloc, kasan, pagealloc, memory-failure, hugetlb, vmscan, z3fold, compaction, mempolicy, oom-kill, hugetlbfs, and migration)" * emailed patches from Andrew Morton <akpm@linux-foundation.org>: (172 commits) mm/migrate: remove unneeded semicolons hugetlbfs: remove unneeded return value of hugetlb_vmtruncate() hugetlbfs: fix some comment typos hugetlbfs: correct some obsolete comments about inode i_mutex hugetlbfs: make hugepage size conversion more readable hugetlbfs: remove meaningless variable avoid_reserve hugetlbfs: correct obsolete function name in hugetlbfs_read_iter() hugetlbfs: use helper macro default_hstate in init_hugetlbfs_fs hugetlbfs: remove useless BUG_ON(!inode) in hugetlbfs_setattr() hugetlbfs: remove special hugetlbfs_set_page_dirty() mm/hugetlb: change hugetlb_reserve_pages() to type bool mm, oom: fix a comment in dump_task() mm/mempolicy: use helper range_in_vma() in queue_pages_test_walk() numa balancing: migrate on fault among multiple bound nodes mm, compaction: make fast_isolate_freepages() stay within zone mm/compaction: fix misbehaviors of fast_find_migrateblock() mm/compaction: correct deferral logic for proactive compaction mm/compaction: remove duplicated VM_BUG_ON_PAGE !PageLocked mm/compaction: remove rcu_read_lock during page compaction z3fold: simplify the zhdr initialization code in init_z3fold_page() ...
2021-02-24mm: memcg: add swapcache stat for memcg v2Shakeel Butt
This patch adds swapcache stat for the cgroup v2. The swapcache represents the memory that is accounted against both the memory and the swap limit of the cgroup. The main motivation behind exposing the swapcache stat is for enabling users to gracefully migrate from cgroup v1's memsw counter to cgroup v2's memory and swap counters. Cgroup v1's memsw limit allows users to limit the memory+swap usage of a workload but without control on the exact proportion of memory and swap. Cgroup v2 provides separate limits for memory and swap which enables more control on the exact usage of memory and swap individually for the workload. With some little subtleties, the v1's memsw limit can be switched with the sum of the v2's memory and swap limits. However the alternative for memsw usage is not yet available in cgroup v2. Exposing per-cgroup swapcache stat enables that alternative. Adding the memory usage and swap usage and subtracting the swapcache will approximate the memsw usage. This will help in the transparent migration of the workloads depending on memsw usage and limit to v2' memory and swap counters. The reasons these applications are still interested in this approximate memsw usage are: (1) these applications are not really interested in two separate memory and swap usage metrics. A single usage metric is more simple to use and reason about for them. (2) The memsw usage metric hides the underlying system's swap setup from the applications. Applications with multiple instances running in a datacenter with heterogeneous systems (some have swap and some don't) will keep seeing a consistent view of their usage. [akpm@linux-foundation.org: fix CONFIG_SWAP=n build] Link: https://lkml.kernel.org/r/20210108155813.2914586-3-shakeelb@google.com Signed-off-by: Shakeel Butt <shakeelb@google.com> Acked-by: Michal Hocko <mhocko@suse.com> Reviewed-by: Roman Gushchin <guro@fb.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Muchun Song <songmuchun@bytedance.com> Cc: Yang Shi <shy828301@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-02-24mm: memcontrol: convert NR_FILE_PMDMAPPED account to pagesMuchun Song
Currently we use struct per_cpu_nodestat to cache the vmstat counters, which leads to inaccurate statistics especially THP vmstat counters. In the systems with hundreds of processors it can be GBs of memory. For example, for a 96 CPUs system, the threshold is the maximum number of 125. And the per cpu counters can cache 23.4375 GB in total. The THP page is already a form of batched addition (it will add 512 worth of memory in one go) so skipping the batching seems like sensible. Although every THP stats update overflows the per-cpu counter, resorting to atomic global updates. But it can make the statistics more accuracy for the THP vmstat counters. So we convert the NR_FILE_PMDMAPPED account to pages. This patch is consistent with 8f182270dfec ("mm/swap.c: flush lru pvecs on compound page arrival"). Doing this also can make the unit of vmstat counters more unified. Finally, the unit of the vmstat counters are pages, kB and bytes. The B/KB suffix can tell us that the unit is bytes or kB. The rest which is without suffix are pages. Link: https://lkml.kernel.org/r/20201228164110.2838-7-songmuchun@bytedance.com Signed-off-by: Muchun Song <songmuchun@bytedance.com> Cc: Alexey Dobriyan <adobriyan@gmail.com> Cc: Feng Tang <feng.tang@intel.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Hugh Dickins <hughd@google.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com> Cc: Michal Hocko <mhocko@suse.com> Cc: NeilBrown <neilb@suse.de> Cc: Pankaj Gupta <pankaj.gupta@cloud.ionos.com> Cc: Rafael. J. Wysocki <rafael@kernel.org> Cc: Randy Dunlap <rdunlap@infradead.org> Cc: Roman Gushchin <guro@fb.com> Cc: Sami Tolvanen <samitolvanen@google.com> Cc: Shakeel Butt <shakeelb@google.com> Cc: Vladimir Davydov <vdavydov.dev@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-02-24mm: memcontrol: convert NR_SHMEM_PMDMAPPED account to pagesMuchun Song
Currently we use struct per_cpu_nodestat to cache the vmstat counters, which leads to inaccurate statistics especially THP vmstat counters. In the systems with hundreds of processors it can be GBs of memory. For example, for a 96 CPUs system, the threshold is the maximum number of 125. And the per cpu counters can cache 23.4375 GB in total. The THP page is already a form of batched addition (it will add 512 worth of memory in one go) so skipping the batching seems like sensible. Although every THP stats update overflows the per-cpu counter, resorting to atomic global updates. But it can make the statistics more accuracy for the THP vmstat counters. So we convert the NR_SHMEM_PMDMAPPED account to pages. This patch is consistent with 8f182270dfec ("mm/swap.c: flush lru pvecs on compound page arrival"). Doing this also can make the unit of vmstat counters more unified. Finally, the unit of the vmstat counters are pages, kB and bytes. The B/KB suffix can tell us that the unit is bytes or kB. The rest which is without suffix are pages. Link: https://lkml.kernel.org/r/20201228164110.2838-6-songmuchun@bytedance.com Signed-off-by: Muchun Song <songmuchun@bytedance.com> Cc: Alexey Dobriyan <adobriyan@gmail.com> Cc: Feng Tang <feng.tang@intel.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Hugh Dickins <hughd@google.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com> Cc: Michal Hocko <mhocko@suse.com> Cc: NeilBrown <neilb@suse.de> Cc: Pankaj Gupta <pankaj.gupta@cloud.ionos.com> Cc: Rafael. J. Wysocki <rafael@kernel.org> Cc: Randy Dunlap <rdunlap@infradead.org> Cc: Roman Gushchin <guro@fb.com> Cc: Sami Tolvanen <samitolvanen@google.com> Cc: Shakeel Butt <shakeelb@google.com> Cc: Vladimir Davydov <vdavydov.dev@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-02-24mm: memcontrol: convert NR_SHMEM_THPS account to pagesMuchun Song
Currently we use struct per_cpu_nodestat to cache the vmstat counters, which leads to inaccurate statistics especially THP vmstat counters. In the systems with hundreds of processors it can be GBs of memory. For example, for a 96 CPUs system, the threshold is the maximum number of 125. And the per cpu counters can cache 23.4375 GB in total. The THP page is already a form of batched addition (it will add 512 worth of memory in one go) so skipping the batching seems like sensible. Although every THP stats update overflows the per-cpu counter, resorting to atomic global updates. But it can make the statistics more accuracy for the THP vmstat counters. So we convert the NR_SHMEM_THPS account to pages. This patch is consistent with 8f182270dfec ("mm/swap.c: flush lru pvecs on compound page arrival"). Doing this also can make the unit of vmstat counters more unified. Finally, the unit of the vmstat counters are pages, kB and bytes. The B/KB suffix can tell us that the unit is bytes or kB. The rest which is without suffix are pages. Link: https://lkml.kernel.org/r/20201228164110.2838-5-songmuchun@bytedance.com Signed-off-by: Muchun Song <songmuchun@bytedance.com> Cc: Alexey Dobriyan <adobriyan@gmail.com> Cc: Feng Tang <feng.tang@intel.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Hugh Dickins <hughd@google.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com> Cc: Michal Hocko <mhocko@suse.com> Cc: NeilBrown <neilb@suse.de> Cc: Pankaj Gupta <pankaj.gupta@cloud.ionos.com> Cc: Rafael. J. Wysocki <rafael@kernel.org> Cc: Randy Dunlap <rdunlap@infradead.org> Cc: Roman Gushchin <guro@fb.com> Cc: Sami Tolvanen <samitolvanen@google.com> Cc: Shakeel Butt <shakeelb@google.com> Cc: Vladimir Davydov <vdavydov.dev@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-02-24mm: memcontrol: convert NR_FILE_THPS account to pagesMuchun Song
Currently we use struct per_cpu_nodestat to cache the vmstat counters, which leads to inaccurate statistics especially THP vmstat counters. In the systems with if hundreds of processors it can be GBs of memory. For example, for a 96 CPUs system, the threshold is the maximum number of 125. And the per cpu counters can cache 23.4375 GB in total. The THP page is already a form of batched addition (it will add 512 worth of memory in one go) so skipping the batching seems like sensible. Although every THP stats update overflows the per-cpu counter, resorting to atomic global updates. But it can make the statistics more accuracy for the THP vmstat counters. So we convert the NR_FILE_THPS account to pages. This patch is consistent with 8f182270dfec ("mm/swap.c: flush lru pvecs on compound page arrival"). Doing this also can make the unit of vmstat counters more unified. Finally, the unit of the vmstat counters are pages, kB and bytes. The B/KB suffix can tell us that the unit is bytes or kB. The rest which is without suffix are pages. Link: https://lkml.kernel.org/r/20201228164110.2838-4-songmuchun@bytedance.com Signed-off-by: Muchun Song <songmuchun@bytedance.com> Cc: Alexey Dobriyan <adobriyan@gmail.com> Cc: Feng Tang <feng.tang@intel.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Hugh Dickins <hughd@google.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com> Cc: Michal Hocko <mhocko@suse.com> Cc: NeilBrown <neilb@suse.de> Cc: Pankaj Gupta <pankaj.gupta@cloud.ionos.com> Cc: Rafael. J. Wysocki <rafael@kernel.org> Cc: Randy Dunlap <rdunlap@infradead.org> Cc: Roman Gushchin <guro@fb.com> Cc: Sami Tolvanen <samitolvanen@google.com> Cc: Shakeel Butt <shakeelb@google.com> Cc: Vladimir Davydov <vdavydov.dev@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-02-24mm: memcontrol: convert NR_ANON_THPS account to pagesMuchun Song
Currently we use struct per_cpu_nodestat to cache the vmstat counters, which leads to inaccurate statistics especially THP vmstat counters. In the systems with hundreds of processors it can be GBs of memory. For example, for a 96 CPUs system, the threshold is the maximum number of 125. And the per cpu counters can cache 23.4375 GB in total. The THP page is already a form of batched addition (it will add 512 worth of memory in one go) so skipping the batching seems like sensible. Although every THP stats update overflows the per-cpu counter, resorting to atomic global updates. But it can make the statistics more accuracy for the THP vmstat counters. So we convert the NR_ANON_THPS account to pages. This patch is consistent with 8f182270dfec ("mm/swap.c: flush lru pvecs on compound page arrival"). Doing this also can make the unit of vmstat counters more unified. Finally, the unit of the vmstat counters are pages, kB and bytes. The B/KB suffix can tell us that the unit is bytes or kB. The rest which is without suffix are pages. Link: https://lkml.kernel.org/r/20201228164110.2838-3-songmuchun@bytedance.com Signed-off-by: Muchun Song <songmuchun@bytedance.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Rafael. J. Wysocki <rafael@kernel.org> Cc: Alexey Dobriyan <adobriyan@gmail.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Vladimir Davydov <vdavydov.dev@gmail.com> Cc: Hugh Dickins <hughd@google.com> Cc: Shakeel Butt <shakeelb@google.com> Cc: Roman Gushchin <guro@fb.com> Cc: Sami Tolvanen <samitolvanen@google.com> Cc: Feng Tang <feng.tang@intel.com> Cc: NeilBrown <neilb@suse.de> Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com> Cc: Randy Dunlap <rdunlap@infradead.org> Cc: Michal Hocko <mhocko@suse.com> Cc: Pankaj Gupta <pankaj.gupta@cloud.ionos.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-02-24Merge tag 'char-misc-5.12-rc1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc Pull char/misc driver updates from Greg KH: "Here is the large set of char/misc/whatever driver subsystem updates for 5.12-rc1. Over time it seems like this tree is collecting more and more tiny driver subsystems in one place, making it easier for those maintainers, which is why this is getting larger. Included in here are: - coresight driver updates - habannalabs driver updates - virtual acrn driver addition (proper acks from the x86 maintainers) - broadcom misc driver addition - speakup driver updates - soundwire driver updates - fpga driver updates - amba driver updates - mei driver updates - vfio driver updates - greybus driver updates - nvmeem driver updates - phy driver updates - mhi driver updates - interconnect driver udpates - fsl-mc bus driver updates - random driver fix - some small misc driver updates (rtsx, pvpanic, etc.) All of these have been in linux-next for a while, with the only reported issue being a merge conflict due to the dfl_device_id addition from the fpga subsystem in here" * tag 'char-misc-5.12-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc: (311 commits) spmi: spmi-pmic-arb: Fix hw_irq overflow Documentation: coresight: Add PID tracing description coresight: etm-perf: Support PID tracing for kernel at EL2 coresight: etm-perf: Clarify comment on perf options ACRN: update MAINTAINERS: mailing list is subscribers-only regmap: sdw-mbq: use MODULE_LICENSE("GPL") regmap: sdw: use no_pm routines for SoundWire 1.2 MBQ regmap: sdw: use _no_pm functions in regmap_read/write soundwire: intel: fix possible crash when no device is detected MAINTAINERS: replace my with email with replacements mhi: Fix double dma free uapi: map_to_7segment: Update example in documentation uio: uio_pci_generic: don't fail probe if pdev->irq equals to IRQ_NOTCONNECTED drivers/misc/vmw_vmci: restrict too big queue size in qp_host_alloc_queue firewire: replace tricky statement by two simple ones vme: make remove callback return void firmware: google: make coreboot driver's remove callback return void firmware: xilinx: Use explicit values for all enum values sample/acrn: Introduce a sample of HSM ioctl interface usage virt: acrn: Introduce an interface for Service VM to control vCPU ...
2021-02-24Merge tag 'driver-core-5.12-rc1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core Pull driver core / debugfs update from Greg KH: "Here is the "big" driver core and debugfs update for 5.12-rc1 This set of driver core patches caused a bunch of problems in linux-next for the past few weeks, when Saravana tried to set fw_devlink=on as the default functionality. This caused a number of systems to stop booting, and lots of bugs were fixed in this area for almost all of the reported systems, but this option is not ready to be turned on just yet for the default operation based on this testing, so I've reverted that change at the very end so we don't have to worry about regressions in 5.12 We will try to turn this on for 5.13 if testing goes better over the next few months. Other than the fixes caused by the fw_devlink testing in here, there's not much more: - debugfs fixes for invalid input into debugfs_lookup() - kerneldoc cleanups - warn message if platform drivers return an error on their remove callback (a futile effort, but good to catch). All of these have been in linux-next for a while now, and the regressions have gone away with the revert of the fw_devlink change" * tag 'driver-core-5.12-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core: (35 commits) Revert "driver core: Set fw_devlink=on by default" of: property: fw_devlink: Ignore interrupts property for some configs debugfs: do not attempt to create a new file before the filesystem is initalized debugfs: be more robust at handling improper input in debugfs_lookup() driver core: auxiliary bus: Fix calling stage for auxiliary bus init of: irq: Fix the return value for of_irq_parse_one() stub of: irq: make a stub for of_irq_parse_one() clk: Mark fwnodes when their clock provider is added/removed PM: domains: Mark fwnodes when their powerdomain is added/removed irqdomain: Mark fwnodes when their irqdomain is added/removed driver core: fw_devlink: Handle suppliers that don't use driver core of: property: Add fw_devlink support for optional properties driver core: Add fw_devlink.strict kernel param of: property: Don't add links to absent suppliers driver core: fw_devlink: Detect supplier devices that will never be added driver core: platform: Emit a warning if a remove callback returned non-zero of: property: Fix fw_devlink handling of interrupts/interrupts-extended gpiolib: Don't probe gpio_device if it's not the primary device device.h: Remove bogus "the" in kerneldoc gpiolib: Bind gpio_device to a driver to enable fw_devlink=on by default ...
2021-02-23Merge tag 'idmapped-mounts-v5.12' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/brauner/linux Pull idmapped mounts from Christian Brauner: "This introduces idmapped mounts which has been in the making for some time. Simply put, different mounts can expose the same file or directory with different ownership. This initial implementation comes with ports for fat, ext4 and with Christoph's port for xfs with more filesystems being actively worked on by independent people and maintainers. Idmapping mounts handle a wide range of long standing use-cases. Here are just a few: - Idmapped mounts make it possible to easily share files between multiple users or multiple machines especially in complex scenarios. For example, idmapped mounts will be used in the implementation of portable home directories in systemd-homed.service(8) where they allow users to move their home directory to an external storage device and use it on multiple computers where they are assigned different uids and gids. This effectively makes it possible to assign random uids and gids at login time. - It is possible to share files from the host with unprivileged containers without having to change ownership permanently through chown(2). - It is possible to idmap a container's rootfs and without having to mangle every file. For example, Chromebooks use it to share the user's Download folder with their unprivileged containers in their Linux subsystem. - It is possible to share files between containers with non-overlapping idmappings. - Filesystem that lack a proper concept of ownership such as fat can use idmapped mounts to implement discretionary access (DAC) permission checking. - They allow users to efficiently changing ownership on a per-mount basis without having to (recursively) chown(2) all files. In contrast to chown (2) changing ownership of large sets of files is instantenous with idmapped mounts. This is especially useful when ownership of a whole root filesystem of a virtual machine or container is changed. With idmapped mounts a single syscall mount_setattr syscall will be sufficient to change the ownership of all files. - Idmapped mounts always take the current ownership into account as idmappings specify what a given uid or gid is supposed to be mapped to. This contrasts with the chown(2) syscall which cannot by itself take the current ownership of the files it changes into account. It simply changes the ownership to the specified uid and gid. This is especially problematic when recursively chown(2)ing a large set of files which is commong with the aforementioned portable home directory and container and vm scenario. - Idmapped mounts allow to change ownership locally, restricting it to specific mounts, and temporarily as the ownership changes only apply as long as the mount exists. Several userspace projects have either already put up patches and pull-requests for this feature or will do so should you decide to pull this: - systemd: In a wide variety of scenarios but especially right away in their implementation of portable home directories. https://systemd.io/HOME_DIRECTORY/ - container runtimes: containerd, runC, LXD:To share data between host and unprivileged containers, unprivileged and privileged containers, etc. The pull request for idmapped mounts support in containerd, the default Kubernetes runtime is already up for quite a while now: https://github.com/containerd/containerd/pull/4734 - The virtio-fs developers and several users have expressed interest in using this feature with virtual machines once virtio-fs is ported. - ChromeOS: Sharing host-directories with unprivileged containers. I've tightly synced with all those projects and all of those listed here have also expressed their need/desire for this feature on the mailing list. For more info on how people use this there's a bunch of talks about this too. Here's just two recent ones: https://www.cncf.io/wp-content/uploads/2020/12/Rootless-Containers-in-Gitpod.pdf https://fosdem.org/2021/schedule/event/containers_idmap/ This comes with an extensive xfstests suite covering both ext4 and xfs: https://git.kernel.org/brauner/xfstests-dev/h/idmapped_mounts It covers truncation, creation, opening, xattrs, vfscaps, setid execution, setgid inheritance and more both with idmapped and non-idmapped mounts. It already helped to discover an unrelated xfs setgid inheritance bug which has since been fixed in mainline. It will be sent for inclusion with the xfstests project should you decide to merge this. In order to support per-mount idmappings vfsmounts are marked with user namespaces. The idmapping of the user namespace will be used to map the ids of vfs objects when they are accessed through that mount. By default all vfsmounts are marked with the initial user namespace. The initial user namespace is used to indicate that a mount is not idmapped. All operations behave as before and this is verified in the testsuite. Based on prior discussions we want to attach the whole user namespace and not just a dedicated idmapping struct. This allows us to reuse all the helpers that already exist for dealing with idmappings instead of introducing a whole new range of helpers. In addition, if we decide in the future that we are confident enough to enable unprivileged users to setup idmapped mounts the permission checking can take into account whether the caller is privileged in the user namespace the mount is currently marked with. The user namespace the mount will be marked with can be specified by passing a file descriptor refering to the user namespace as an argument to the new mount_setattr() syscall together with the new MOUNT_ATTR_IDMAP flag. The system call follows the openat2() pattern of extensibility. The following conditions must be met in order to create an idmapped mount: - The caller must currently have the CAP_SYS_ADMIN capability in the user namespace the underlying filesystem has been mounted in. - The underlying filesystem must support idmapped mounts. - The mount must not already be idmapped. This also implies that the idmapping of a mount cannot be altered once it has been idmapped. - The mount must be a detached/anonymous mount, i.e. it must have been created by calling open_tree() with the OPEN_TREE_CLONE flag and it must not already have been visible in the filesystem. The last two points guarantee easier semantics for userspace and the kernel and make the implementation significantly simpler. By default vfsmounts are marked with the initial user namespace and no behavioral or performance changes are observed. The manpage with a detailed description can be found here: https://git.kernel.org/brauner/man-pages/c/1d7b902e2875a1ff342e036a9f866a995640aea8 In order to support idmapped mounts, filesystems need to be changed and mark themselves with the FS_ALLOW_IDMAP flag in fs_flags. The patches to convert individual filesystem are not very large or complicated overall as can be seen from the included fat, ext4, and xfs ports. Patches for other filesystems are actively worked on and will be sent out separately. The xfstestsuite can be used to verify that port has been done correctly. The mount_setattr() syscall is motivated independent of the idmapped mounts patches and it's been around since July 2019. One of the most valuable features of the new mount api is the ability to perform mounts based on file descriptors only. Together with the lookup restrictions available in the openat2() RESOLVE_* flag namespace which we added in v5.6 this is the first time we are close to hardened and race-free (e.g. symlinks) mounting and path resolution. While userspace has started porting to the new mount api to mount proper filesystems and create new bind-mounts it is currently not possible to change mount options of an already existing bind mount in the new mount api since the mount_setattr() syscall is missing. With the addition of the mount_setattr() syscall we remove this last restriction and userspace can now fully port to the new mount api, covering every use-case the old mount api could. We also add the crucial ability to recursively change mount options for a whole mount tree, both removing and adding mount options at the same time. This syscall has been requested multiple times by various people and projects. There is a simple tool available at https://github.com/brauner/mount-idmapped that allows to create idmapped mounts so people can play with this patch series. I'll add support for the regular mount binary should you decide to pull this in the following weeks: Here's an example to a simple idmapped mount of another user's home directory: u1001@f2-vm:/$ sudo ./mount --idmap both:1000:1001:1 /home/ubuntu/ /mnt u1001@f2-vm:/$ ls -al /home/ubuntu/ total 28 drwxr-xr-x 2 ubuntu ubuntu 4096 Oct 28 22:07 . drwxr-xr-x 4 root root 4096 Oct 28 04:00 .. -rw------- 1 ubuntu ubuntu 3154 Oct 28 22:12 .bash_history -rw-r--r-- 1 ubuntu ubuntu 220 Feb 25 2020 .bash_logout -rw-r--r-- 1 ubuntu ubuntu 3771 Feb 25 2020 .bashrc -rw-r--r-- 1 ubuntu ubuntu 807 Feb 25 2020 .profile -rw-r--r-- 1 ubuntu ubuntu 0 Oct 16 16:11 .sudo_as_admin_successful -rw------- 1 ubuntu ubuntu 1144 Oct 28 00:43 .viminfo u1001@f2-vm:/$ ls -al /mnt/ total 28 drwxr-xr-x 2 u1001 u1001 4096 Oct 28 22:07 . drwxr-xr-x 29 root root 4096 Oct 28 22:01 .. -rw------- 1 u1001 u1001 3154 Oct 28 22:12 .bash_history -rw-r--r-- 1 u1001 u1001 220 Feb 25 2020 .bash_logout -rw-r--r-- 1 u1001 u1001 3771 Feb 25 2020 .bashrc -rw-r--r-- 1 u1001 u1001 807 Feb 25 2020 .profile -rw-r--r-- 1 u1001 u1001 0 Oct 16 16:11 .sudo_as_admin_successful -rw------- 1 u1001 u1001 1144 Oct 28 00:43 .viminfo u1001@f2-vm:/$ touch /mnt/my-file u1001@f2-vm:/$ setfacl -m u:1001:rwx /mnt/my-file u1001@f2-vm:/$ sudo setcap -n 1001 cap_net_raw+ep /mnt/my-file u1001@f2-vm:/$ ls -al /mnt/my-file -rw-rwxr--+ 1 u1001 u1001 0 Oct 28 22:14 /mnt/my-file u1001@f2-vm:/$ ls -al /home/ubuntu/my-file -rw-rwxr--+ 1 ubuntu ubuntu 0 Oct 28 22:14 /home/ubuntu/my-file u1001@f2-vm:/$ getfacl /mnt/my-file getfacl: Removing leading '/' from absolute path names # file: mnt/my-file # owner: u1001 # group: u1001 user::rw- user:u1001:rwx group::rw- mask::rwx other::r-- u1001@f2-vm:/$ getfacl /home/ubuntu/my-file getfacl: Removing leading '/' from absolute path names # file: home/ubuntu/my-file # owner: ubuntu # group: ubuntu user::rw- user:ubuntu:rwx group::rw- mask::rwx other::r--" * tag 'idmapped-mounts-v5.12' of git://git.kernel.org/pub/scm/linux/kernel/git/brauner/linux: (41 commits) xfs: remove the possibly unused mp variable in xfs_file_compat_ioctl xfs: support idmapped mounts ext4: support idmapped mounts fat: handle idmapped mounts tests: add mount_setattr() selftests fs: introduce MOUNT_ATTR_IDMAP fs: add mount_setattr() fs: add attr_flags_to_mnt_flags helper fs: split out functions to hold writers namespace: only take read lock in do_reconfigure_mnt() mount: make {lock,unlock}_mount_hash() static namespace: take lock_mount_hash() directly when changing flags nfs: do not export idmapped mounts overlayfs: do not mount on top of idmapped mounts ecryptfs: do not mount on top of idmapped mounts ima: handle idmapped mounts apparmor: handle idmapped mounts fs: make helpers idmap mount aware exec: handle idmapped mounts would_dump: handle idmapped mounts ...
2021-02-22Merge tag 'devicetree-for-5.12' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux Pull devicetree updates from Rob Herring: - Sync dtc to upstream version v1.6.0-51-g183df9e9c2b9 and build host fdtoverlay - Add kbuild support to build DT overlays (%.dtbo) - Drop NULLifying match table in of_match_device(). In preparation for this, there are several driver cleanups to use (of_)?device_get_match_data(). - Drop pointless wrappers from DT struct device API - Convert USB binding schemas to use graph schema and remove old plain text graph binding doc - Convert spi-nor and v3d GPU bindings to DT schema - Tree wide schema fixes for if/then schemas, array size constraints, and undocumented compatible strings in examples - Handle 'no-map' correctly for already reserved memblock regions * tag 'devicetree-for-5.12' of git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux: (35 commits) driver core: platform: Drop of_device_node_put() wrapper of: Remove of_dev_{get,put}() dt-bindings: usb: Change descibe to describe in usbmisc-imx.txt dt-bindings: can: rcar_canfd: Group tuples in pin control properties dt-bindings: power: renesas,apmu: Group tuples in cpus properties dt-bindings: mtd: spi-nor: Convert to DT schema format dt-bindings: Use portable sort for version cmp dt-bindings: ethernet-controller: fix fixed-link specification dt-bindings: irqchip: Add node name to PRUSS INTC dt-bindings: interconnect: Fix the expected number of cells dt-bindings: Fix errors in 'if' schemas dt-bindings: iommu: renesas,ipmmu-vmsa: Make 'power-domains' conditionally required dt-bindings: Fix undocumented compatible strings in examples kbuild: Add support to build overlays (%.dtbo) scripts: dtc: Remove the unused fdtdump.c file scripts: dtc: Build fdtoverlay tool scripts/dtc: Update to upstream version v1.6.0-51-g183df9e9c2b9 scripts: dtc: Fetch fdtoverlay.c from external DTC project dt-bindings: thermal: sun8i: Fix misplaced schema keyword in compatible strings dt-bindings: iio: dac: Fix AD5686 references ...
2021-02-22Merge tag 'regmap-v5.12' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regmap Pull regmap update from Mark Brown: "Just one simple code style improvement this time, no features. There is an addition to add a new SoundWire regmap type but that should be coming via the SoundWire tree as the support for the underlying bus features was only added this cycle" * tag 'regmap-v5.12' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regmap: regmap: Assign boolean values to a bool variable
2021-02-21Merge tag 'sound-5.12-rc1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound Pull sound updates from Takashi Iwai: "A relatively calm release at this time, and no massive code changes are found in the stats, while a wide range of code refactoring and cleanup have been done. Note that this update includes the tree-wide trivial changes for dropping the return value from ISA remove callbacks, too. Below lists up some highlight: ALSA Core: - Support for the software jack injection via debugfs - Fixes for sync_stop PCM operations HD-audio and USB-audio: - A few usual HD-audio device quirks - Updates for Tegra HD-audio - More quirks for Pioneer and other USB-audio devices - Stricter state checks at USB-audio disconnection ASoC: - Continued code refactoring, cleanup and fixes in ASoC core API - A KUnit testsuite for the topology code - Lots of ASoC Intel driver Realtek codec updates, quirk additions and fixes - Support for Ingenic JZ4760(B), Intel AlderLake-P, DT configured nVidia cards, Qualcomm lpass-rx-macro and lpass-tx-macro - Removal of obsolete SIRF prima/atlas, Txx9 and ZTE zx drivers Others: - Drop return value from ISA driver remove callback - Cleanup with DIV_ROUND_UP() macro - FireWire updates, HDSP output loopback support" * tag 'sound-5.12-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound: (322 commits) ALSA: hda: intel-dsp-config: add Alder Lake support ASoC: soc-pcm: fix hw param limits calculation for multi-DAI ASoC: Intel: bytcr_rt5640: Add quirk for the Acer One S1002 tablet ASoC: Intel: bytcr_rt5651: Add quirk for the Jumper EZpad 7 tablet ASoC: Intel: bytcr_rt5640: Add quirk for the Voyo Winpad A15 tablet ASoC: Intel: bytcr_rt5640: Add quirk for the Estar Beauty HD MID 7316R tablet ASoC: soc-pcm: fix hwparams min/max init for dpcm ALSA: hda/realtek: Quirk for HP Spectre x360 14 amp setup ALSA: usb-audio: Add implicit fb quirk for BOSS GP-10 ALSA: hda: Add another CometLake-H PCI ID ASoC: soc-pcm: add soc_pcm_hw_update_format() ASoC: soc-pcm: add soc_pcm_hw_update_chan() ASoC: soc-pcm: add soc_pcm_hw_update_rate() ASoC: wm_adsp: Remove unused control callback structure ASoC: SOF: relax ABI checks and avoid unnecessary warnings ASoC: codecs: lpass-tx-macro: add dapm widgets and route ASoC: codecs: lpass-tx-macro: add support for lpass tx macro ASoC: qcom: dt-bindings: add bindings for lpass tx macro codec ASoC: codecs: lpass-rx-macro: add iir widgets ASoC: codecs: lpass-rx-macro: add dapm widgets and route ...
2021-02-21Merge tag 'media/v5.12-1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media Pull media updates from Mauro Carvalho Chehab: - some core fixes in VB2 mem2mem support - some improvements and cleanups in V4L2 async kAPI - newer controls in V4L2 API for H-264 and HEVC codecs - allegro-dvt driver was promoted from staging - new i2c sendor drivers: imx334, ov5648, ov8865 - new automobile camera module: rdacm21 - ipu3 cio2 driver started gained support for some ACPI BIOSes - new ATSC frontend: MaxLinear mxl692 VSB tuner/demod - the SMIA/CCS driver gained more support for CSS standard - several driver fixes, updates and improvements * tag 'media/v5.12-1' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media: (362 commits) media: v4l: async: Fix kerneldoc documentation for async functions media: i2c: max9271: Add MODULE_* macros media: i2c: Kconfig: Make MAX9271 a module media: imx334: 'ret' is uninitialized, should have been PTR_ERR() media: i2c: Add imx334 camera sensor driver media: dt-bindings: media: Add bindings for imx334 media: ov8856: Configure sensor for GRBG Bayer for all modes media: i2c: imx219: Implement V4L2_CID_LINK_FREQ control media: ov5675: fix vflip/hflip control media: ipu3-cio2: Build bridge only if ACPI is enabled media: Remove the legacy v4l2-clk API media: ov6650: Use the generic clock framework media: mt9m111: Use the generic clock framework media: ov9640: Use the generic clock framework media: pxa_camera: Drop the v4l2-clk clock register media: mach-pxa: Register the camera sensor fixed-rate clock media: i2c: imx258: get clock from device properties and enable it via runtime PM media: i2c: imx258: simplify getting state container media: i2c: imx258: add support for binding via device tree media: dt-bindings: media: imx258: add bindings for IMX258 sensor ...
2021-02-21Merge tag 'mips_5.12' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/mips/linux Pull MIPS updates from Thomas Bogendoerfer: - added support for Nintendo N64 - added support for Realtek RTL83XX SoCs - kaslr support for Loongson64 - first steps to get rid of set_fs() - DMA runtime coherent/non-coherent selection cleanup - cleanups and fixes * tag 'mips_5.12' of git://git.kernel.org/pub/scm/linux/kernel/git/mips/linux: (98 commits) Revert "MIPS: Add basic support for ptrace single step" vmlinux.lds.h: catch more UBSAN symbols into .data MIPS: kernel: Drop kgdb_call_nmi_hook MAINTAINERS: Add git tree for KVM/mips MIPS: Use common way to parse elfcorehdr MIPS: Simplify EVA cache handling Revert "MIPS: kernel: {ftrace,kgdb}: Set correct address limit for cache flushes" MIPS: remove CONFIG_DMA_PERDEV_COHERENT MIPS: remove CONFIG_DMA_MAYBE_COHERENT driver core: lift dma_default_coherent into common code MIPS: refactor the runtime coherent vs noncoherent DMA indicators MIPS/alchemy: factor out the DMA coherent setup MIPS/malta: simplify plat_setup_iocoherency MIPS: Add basic support for ptrace single step MAINTAINERS: replace non-matching patterns for loongson{2,3} MIPS: Make check condition for SDBBP consistent with EJTAG spec mips: Replace lkml.org links with lore Revert "MIPS: microMIPS: Fix the judgment of mm_jr16_op and mm_jalr_op" MIPS: crash_dump.c: Simplify copy_oldmem_page() Revert "mips: Manually call fdt_init_reserved_mem() method" ...
2021-02-20Merge tag 'pm-5.12-rc1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm Pull power management updates from Rafael Wysocki: "These add a new power capping facility allowing aggregate power constraints to be applied to sets of devices in a distributed manner, add a new CPU ID to the RAPL power capping driver and improve it, drop a cpufreq driver belonging to a platform that is not supported any more, drop two redundant cpufreq driver flags, update cpufreq drivers (intel_pstate, brcmstb-avs, qcom-hw), update the operating performance points (OPP) framework (code cleanups, new helpers, devfreq-related modifications), clean up devfreq, extend the PM clock layer, update the cpupower utility and make assorted janitorial changes. Specifics: - Add new power capping facility called DTPM (Dynamic Thermal Power Management), based on the existing power capping framework, to allow aggregate power constraints to be applied to sets of devices in a distributed manner, along with a CPU backend driver based on the Energy Model (Daniel Lezcano, Dan Carpenter, Colin Ian King). - Add AlderLake Mobile support to the Intel RAPL power capping driver and make it use the topology interface when laying out the system topology (Zhang Rui, Yunfeng Ye). - Drop the cpufreq tango driver belonging to a platform that is not supported any more (Arnd Bergmann). - Drop the redundant CPUFREQ_STICKY and CPUFREQ_PM_NO_WARN cpufreq driver flags (Viresh Kumar). - Update cpufreq drivers: * Fix max CPU frequency discovery in the intel_pstate driver and make janitorial changes in it (Chen Yu, Rafael Wysocki, Nigel Christian). * Fix resource leaks in the brcmstb-avs-cpufreq driver (Christophe JAILLET). * Make the tegra20 driver use the resource-managed API (Dmitry Osipenko). * Enable boost support in the qcom-hw driver (Shawn Guo). - Update the operating performance points (OPP) framework: * Clean up the OPP core (Dmitry Osipenko, Viresh Kumar). * Extend the OPP API by adding new helpers to it (Dmitry Osipenko, Viresh Kumar). * Allow required OPPs to be used for devfreq devices and update the devfreq governor code accordingly (Saravana Kannan). * Prepare the framework for introducing new dev_pm_opp_set_opp() helper (Viresh Kumar). * Drop dev_pm_opp_set_bw() and update related drivers (Viresh Kumar). * Allow lazy linking of required-OPPs (Viresh Kumar). - Simplify and clean up devfreq somewhat (Lukasz Luba, Yang Li, Pierre Kuo). - Update the generic power domains (genpd) framework: * Use device's next wakeup to determine domain idle state (Lina Iyer). * Improve initialization and debug (Dmitry Osipenko). * Simplify computations (Abaci Team). - Make janitorial changes in the core code handling system sleep and PM-runtime (Bhaskar Chowdhury, Bjorn Helgaas, Rikard Falkeborn, Zqiang). - Update the MAINTAINERS entry for the exynos cpuidle driver and drop DEBUG definition from intel_idle (Krzysztof Kozlowski, Tom Rix). - Extend the PM clock layer to cover clocks that must sleep (Nicolas Pitre). - Update the cpupower utility: * Update cpupower command, add support for AMD family 0x19 and clean up the code to remove many of the family checks to make future family updates easier (Nathan Fontenot, Robert Richter). * Add Makefile dependencies for install targets to allow building cpupower in parallel rather than serially (Ivan Babrou). - Make janitorial changes in power management Kconfig (Lukasz Luba)" * tag 'pm-5.12-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (89 commits) MAINTAINERS: cpuidle: exynos: include header in file pattern powercap: intel_rapl: Use topology interface in rapl_init_domains() powercap: intel_rapl: Use topology interface in rapl_add_package() PM: sleep: Constify static struct attribute_group PM: Kconfig: remove unneeded "default n" options PM: EM: update Kconfig description and drop "default n" option cpufreq: Remove unused flag CPUFREQ_PM_NO_WARN cpufreq: Remove CPUFREQ_STICKY flag PM / devfreq: Add required OPPs support to passive governor PM / devfreq: Cache OPP table reference in devfreq OPP: Add function to look up required OPP's for a given OPP PM / devfreq: rk3399_dmc: Remove unneeded semicolon opp: Replace ENOTSUPP with EOPNOTSUPP opp: Fix "foo * bar" should be "foo *bar" opp: Don't ignore clk_get() errors other than -ENOENT opp: Update bandwidth requirements based on scaling up/down opp: Allow lazy-linking of required-opps opp: Remove dev_pm_opp_set_bw() devfreq: tegra30: Migrate to dev_pm_opp_set_opp() drm: msm: Migrate to dev_pm_opp_set_opp() ...
2021-02-19Revert "driver core: Set fw_devlink=on by default"Greg Kroah-Hartman
This reverts commit e590474768f1cc04852190b61dec692411b22e2a. While things are _almost_ there and working for almost all systems, there are still reported regressions happening, so let's revert this default for 5.12. We can bring it back in linux-next after 5.12-rc1 is out to get more testing and hopefully solve the remaining different subsystem and driver issues that people are running into. Cc: Saravana Kannan <saravanak@google.com> Link: https://lore.kernel.org/r/20210219074549.1506936-1-gregkh@linuxfoundation.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-02-18arch_numa: fix common code printing of phys_addr_tRandy Dunlap
Fix build warnings in the arch_numa common code: ../include/linux/kern_levels.h:5:18: warning: format '%Lx' expects argument of type 'long long unsigned int', but argument 3 has type 'phys_addr_t' {aka 'unsigned int'} [-Wformat=] ../drivers/base/arch_numa.c:360:56: note: format string is defined here 360 | pr_warn("Warning: invalid memblk node %d [mem %#010Lx-%#010Lx]\n", ../drivers/base/arch_numa.c:435:39: note: format string is defined here 435 | pr_info("Faking a node at [mem %#018Lx-%#018Lx]\n", start, end - 1); Fixes: ae3c107cd8be ("numa: Move numa implementation to common code") Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Reported-by: kernel test robot <lkp@intel.com> Reviewed-by: Atish Patra <atish.patra@wdc.com> Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
2021-02-17Merge tag 'asoc-v5.12' of ↵Takashi Iwai
https://git.kernel.org/pub/scm/linux/kernel/git/broonie/sound into for-linus ASoC: Updates for v5.12 Another quiet release in terms of features, though several of the drivers got quite a bit of work and there were a lot of general changes resulting from Morimoto-san's ongoing cleanup work. - As ever, lots of hard work by Morimoto-san cleaning up the code and making it more consistent. - Many improvements in the Intel drivers including a wide range of quirks and bug fixes. - A KUnit testsuite for the topology code. - Support for Ingenic JZ4760(B), Intel AlderLake-P, DT configured nVidia cards, Qualcomm lpass-rx-macro and lpass-tx-macro - Removal of obsolete SIRF prima/atlas, Txx9 and ZTE zx drivers.
2021-02-15Merge branches 'pm-sleep', 'pm-core', 'pm-domains' and 'pm-clk'Rafael J. Wysocki
* pm-sleep: PM: sleep: Constify static struct attribute_group PM: sleep: Use dev_printk() when possible PM: sleep: No need to check PF_WQ_WORKER in thaw_kernel_threads() * pm-core: PM: runtime: Fix typos and grammar PM: runtime: Fix resposible -> responsible in runtime.c * pm-domains: PM: domains: Simplify the calculation of variables PM: domains: Add "performance" column to debug summary PM: domains: Make of_genpd_add_subdomain() return -EPROBE_DEFER PM: domains: Make set_performance_state() callback optional PM: domains: use device's next wakeup to determine domain idle state PM: domains: inform PM domain of a device's next wakeup * pm-clk: PM: clk: make PM clock layer compatible with clocks that must sleep
2021-02-13driver core: lift dma_default_coherent into common codeChristoph Hellwig
Lift the dma_default_coherent variable from the mips architecture code to the driver core. This allows an architecture to sdefault all device to be DMA coherent at run time, even if the kernel is build with support for DMA noncoherent device. By allowing device_initialize to set the ->dma_coherent field to this default the amount of arch hooks required for this behavior can be greatly reduced. Signed-off-by: Christoph Hellwig <hch@lst.de> Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
2021-02-12driver core: platform: Drop of_device_node_put() wrapperRob Herring
of_device_node_put() is just a wrapper for of_node_put(). The platform driver core is already polluted with of_node pointers and the only 'get' already uses of_node_get() (though typically the get would happen in of_device_alloc()). Cc: "Rafael J. Wysocki" <rafael@kernel.org> Cc: Frank Rowand <frowand.list@gmail.com> Signed-off-by: Rob Herring <robh@kernel.org> Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Link: https://lore.kernel.org/r/20210211232745.1498137-3-robh@kernel.org
2021-02-12Merge tag 'soundwire-2_5.12-rc1' of ↵Greg Kroah-Hartman
git://git.kernel.org/pub/scm/linux/kernel/git/vkoul/soundwire into char-misc-next Vinod writes: soundwire second update for 5.12-rc1 Some late changes for sdw: - fix for crash on intel driver - support for _no_pm IO calls in sdw regmap * tag 'soundwire-2_5.12-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/vkoul/soundwire: regmap: sdw-mbq: use MODULE_LICENSE("GPL") regmap: sdw: use no_pm routines for SoundWire 1.2 MBQ regmap: sdw: use _no_pm functions in regmap_read/write soundwire: intel: fix possible crash when no device is detected
2021-02-11driver core: auxiliary bus: Fix calling stage for auxiliary bus initDave Jiang
When the auxiliary device code is built into the kernel, it can be executed before the auxiliary bus is registered. This causes bus->p to be not allocated and triggers a NULL pointer dereference when the auxiliary bus device gets added with bus_add_device(). Call the auxiliary_bus_init() under driver_init() so the bus is initialized before devices. Below is the kernel splat for the bug: [ 1.948215] BUG: kernel NULL pointer dereference, address: 0000000000000060 [ 1.950670] #PF: supervisor read access in kernel mode [ 1.950670] #PF: error_code(0x0000) - not-present page [ 1.950670] PGD 0 [ 1.950670] Oops: 0000 1 SMP NOPTI [ 1.950670] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 5.10.0-intel-nextsvmtest+ #2205 [ 1.950670] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.14.0-0-g155821a1990b-prebuilt.qemu.org 04/01/2014 [ 1.950670] RIP: 0010:bus_add_device+0x64/0x140 [ 1.950670] Code: 00 49 8b 75 20 48 89 df e8 59 a1 ff ff 41 89 c4 85 c0 75 7b 48 8b 53 50 48 85 d2 75 03 48 8b 13 49 8b 85 a0 00 00 00 48 89 de <48> 8 78 60 48 83 c7 18 e8 ef d9 a9 ff 41 89 c4 85 c0 75 45 48 8b [ 1.950670] RSP: 0000:ff46032ac001baf8 EFLAGS: 00010246 [ 1.950670] RAX: 0000000000000000 RBX: ff4597f7414aa680 RCX: 0000000000000000 [ 1.950670] RDX: ff4597f74142bbc0 RSI: ff4597f7414aa680 RDI: ff4597f7414aa680 [ 1.950670] RBP: ff46032ac001bb10 R08: 0000000000000044 R09: 0000000000000228 [ 1.950670] R10: ff4597f741141b30 R11: ff4597f740182a90 R12: 0000000000000000 [ 1.950670] R13: ffffffffa5e936c0 R14: 0000000000000000 R15: 0000000000000000 [ 1.950670] FS: 0000000000000000(0000) GS:ff4597f7bba00000(0000) knlGS:0000000000000000 [ 1.950670] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 1.950670] CR2: 0000000000000060 CR3: 000000002140c001 CR4: 0000000000f71ef0 [ 1.950670] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 1.950670] DR3: 0000000000000000 DR6: 00000000fffe07f0 DR7: 0000000000000400 [ 1.950670] PKRU: 55555554 [ 1.950670] Call Trace: [ 1.950670] device_add+0x3ee/0x850 [ 1.950670] __auxiliary_device_add+0x47/0x60 [ 1.950670] idxd_pci_probe+0xf77/0x1180 [ 1.950670] local_pci_probe+0x4a/0x90 [ 1.950670] pci_device_probe+0xff/0x1b0 [ 1.950670] really_probe+0x1cf/0x440 [ 1.950670] ? rdinit_setup+0x31/0x31 [ 1.950670] driver_probe_device+0xe8/0x150 [ 1.950670] device_driver_attach+0x58/0x60 [ 1.950670] __driver_attach+0x8f/0x150 [ 1.950670] ? device_driver_attach+0x60/0x60 [ 1.950670] ? device_driver_attach+0x60/0x60 [ 1.950670] bus_for_each_dev+0x79/0xc0 [ 1.950670] ? kmem_cache_alloc_trace+0x323/0x430 [ 1.950670] driver_attach+0x1e/0x20 [ 1.950670] bus_add_driver+0x154/0x1f0 [ 1.950670] driver_register+0x70/0xc0 [ 1.950670] __pci_register_driver+0x54/0x60 [ 1.950670] idxd_init_module+0xe2/0xfc [ 1.950670] ? idma64_platform_driver_init+0x19/0x19 [ 1.950670] do_one_initcall+0x4a/0x1e0 [ 1.950670] kernel_init_freeable+0x1fc/0x25c [ 1.950670] ? rest_init+0xba/0xba [ 1.950670] kernel_init+0xe/0x116 [ 1.950670] ret_from_fork+0x1f/0x30 [ 1.950670] Modules linked in: [ 1.950670] CR2: 0000000000000060 [ 1.950670] --[ end trace cd7d1b226d3ca901 ]-- Fixes: 7de3697e9cbd ("Add auxiliary bus support") Reported-by: Jacob Pan <jacob.jun.pan@intel.com> Reviewed-by: Dan Williams <dan.j.williams@intel.com> Acked-by: Dave Ertman <david.m.ertman@intel.com> Signed-off-by: Dave Jiang <dave.jiang@intel.com> Link: https://lore.kernel.org/r/20210210201611.1611074-1-dave.jiang@intel.com Cc: stable <stable@vger.kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-02-11regmap: sdw-mbq: use MODULE_LICENSE("GPL")Bard Liao
"GPL v2" is the same as "GPL". It exists for historic reasons. See Documentation/process/license-rules.rst Signed-off-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com> Signed-off-by: Bard Liao <yung-chuan.liao@linux.intel.com> Acked-by: Mark Brown <broonie@kernel.org> Link: https://lore.kernel.org/r/20210122070634.12825-8-yung-chuan.liao@linux.intel.com Signed-off-by: Vinod Koul <vkoul@kernel.org>
2021-02-11regmap: sdw: use no_pm routines for SoundWire 1.2 MBQBard Liao
Use no_pm versions for write and read. Signed-off-by: Bard Liao <yung-chuan.liao@linux.intel.com> Signed-off-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com> Reviewed-by: Rander Wang <rander.wang@linux.intel.com> Acked-by: Mark Brown <broonie@kernel.org> Link: https://lore.kernel.org/r/20210122070634.12825-7-yung-chuan.liao@linux.intel.com Signed-off-by: Vinod Koul <vkoul@kernel.org>
2021-02-11regmap: sdw: use _no_pm functions in regmap_read/writeBard Liao
sdw_update_slave_status will be invoked when a codec is attached, and the codec driver will initialize the codec with regmap functions while the codec device is pm_runtime suspended. regmap routines currently rely on regular SoundWire IO functions, which will call pm_runtime_get_sync()/put_autosuspend. This causes a deadlock where the resume routine waits for an initialization complete signal that while the initialization complete can only be reached when the resume completes. The only solution if we allow regmap functions to be used in resume operations as well as during codec initialization is to use _no_pm routines. The duty of making sure the bus is operational needs to be handled above the regmap level. Fixes: 7c22ce6e21840 ('regmap: Add SoundWire bus support') Signed-off-by: Bard Liao <yung-chuan.liao@linux.intel.com> Acked-by: Mark Brown <broonie@kernel.org> Link: https://lore.kernel.org/r/20210122070634.12825-6-yung-chuan.liao@linux.intel.com Signed-off-by: Vinod Koul <vkoul@kernel.org>
2021-02-09PM: domains: Mark fwnodes when their powerdomain is added/removedSaravana Kannan
This allows fw_devlink to recognize power domain drivers that don't use the device-driver model to initialize the device. fw_devlink will use this information to make sure consumers of such power domain aren't indefinitely blocked from probing, waiting for the power domain device to appear and bind to a driver. Tested-by: Marek Szyprowski <m.szyprowski@samsung.com> Signed-off-by: Saravana Kannan <saravanak@google.com> Link: https://lore.kernel.org/r/20210205222644.2357303-8-saravanak@google.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-02-09driver core: fw_devlink: Handle suppliers that don't use driver coreSaravana Kannan
Device links only work between devices that use the driver core to match and bind a driver to a device. So, add an API for frameworks to let the driver core know that a fwnode has been initialized by a driver without using the driver core. Then use this information to make sure that fw_devlink doesn't make the consumers wait indefinitely on suppliers that'll never bind to a driver. Tested-by: Marek Szyprowski <m.szyprowski@samsung.com> Signed-off-by: Saravana Kannan <saravanak@google.com> Link: https://lore.kernel.org/r/20210205222644.2357303-6-saravanak@google.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-02-09driver core: Add fw_devlink.strict kernel paramSaravana Kannan
This param allows forcing all dependencies to be treated as mandatory. This will be useful for boards in which all optional dependencies like IOMMUs and DMAs need to be treated as mandatory dependencies. Tested-by: Marek Szyprowski <m.szyprowski@samsung.com> Signed-off-by: Saravana Kannan <saravanak@google.com> Link: https://lore.kernel.org/r/20210205222644.2357303-4-saravanak@google.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>