bcachefs.git - Unnamed repository; edit this file 'description' to name the repository.

Age	Commit message (Collapse)	Author
2021-11-18	mtd: rawnand: gpio: Keep the driver compatible with on-die ECC engines	Miquel Raynal
	commit b5b5b4dc6fcd8194b9dd38c8acdc5ab71adf44f8 upstream. Following the introduction of the generic ECC engine infrastructure, it was necessary to reorganize the code and move the ECC configuration in the ->attach_chip() hook. Failing to do that properly lead to a first series of fixes supposed to stabilize the situation. Unfortunately, this only fixed the use of software ECC engines, preventing any other kind of engine to be used, including on-die ones. It is now time to (finally) fix the situation by ensuring that we still provide a default (eg. software ECC) but will still support different ECC engines such as on-die ECC engines if properly described in the device tree. There are no changes needed on the core side in order to do this, but we just need to leverage the logic there which allows: 1- a subsystem default (set to Host engines in the raw NAND world) 2- a driver specific default (here set to software ECC engines) 3- any type of engine requested by the user (ie. described in the DT) As the raw NAND subsystem has not yet been fully converted to the ECC engine infrastructure, in order to provide a default ECC engine for this driver we need to set chip->ecc.engine_type before calling nand_scan(). During the initialization step, the core will consider this entry as the default engine for this driver. This value may of course be overloaded by the user if the usual DT properties are provided. Fixes: f6341f6448e0 ("mtd: rawnand: gpio: Move the ECC initialization to ->attach_chip()") Cc: stable@vger.kernel.org Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com> Link: https://lore.kernel.org/linux-mtd/20210928222258.199726-4-miquel.raynal@bootlin.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-11-18	mtd: rawnand: mpc5121: Keep the driver compatible with on-die ECC engines	Miquel Raynal
	commit f9d8570b7fd6f4f08528ce2f5e39787a8a260cd6 upstream. Following the introduction of the generic ECC engine infrastructure, it was necessary to reorganize the code and move the ECC configuration in the ->attach_chip() hook. Failing to do that properly lead to a first series of fixes supposed to stabilize the situation. Unfortunately, this only fixed the use of software ECC engines, preventing any other kind of engine to be used, including on-die ones. It is now time to (finally) fix the situation by ensuring that we still provide a default (eg. software ECC) but will still support different ECC engines such as on-die ECC engines if properly described in the device tree. There are no changes needed on the core side in order to do this, but we just need to leverage the logic there which allows: 1- a subsystem default (set to Host engines in the raw NAND world) 2- a driver specific default (here set to software ECC engines) 3- any type of engine requested by the user (ie. described in the DT) As the raw NAND subsystem has not yet been fully converted to the ECC engine infrastructure, in order to provide a default ECC engine for this driver we need to set chip->ecc.engine_type before calling nand_scan(). During the initialization step, the core will consider this entry as the default engine for this driver. This value may of course be overloaded by the user if the usual DT properties are provided. Fixes: 6dd09f775b72 ("mtd: rawnand: mpc5121: Move the ECC initialization to ->attach_chip()") Cc: stable@vger.kernel.org Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com> Link: https://lore.kernel.org/linux-mtd/20210928222258.199726-5-miquel.raynal@bootlin.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-11-18	mtd: rawnand: xway: Keep the driver compatible with on-die ECC engines	Miquel Raynal
	commit 6bcd2960af1b7bacb2f1e710ab0c0b802d900501 upstream. Following the introduction of the generic ECC engine infrastructure, it was necessary to reorganize the code and move the ECC configuration in the ->attach_chip() hook. Failing to do that properly lead to a first series of fixes supposed to stabilize the situation. Unfortunately, this only fixed the use of software ECC engines, preventing any other kind of engine to be used, including on-die ones. It is now time to (finally) fix the situation by ensuring that we still provide a default (eg. software ECC) but will still support different ECC engines such as on-die ECC engines if properly described in the device tree. There are no changes needed on the core side in order to do this, but we just need to leverage the logic there which allows: 1- a subsystem default (set to Host engines in the raw NAND world) 2- a driver specific default (here set to software ECC engines) 3- any type of engine requested by the user (ie. described in the DT) As the raw NAND subsystem has not yet been fully converted to the ECC engine infrastructure, in order to provide a default ECC engine for this driver we need to set chip->ecc.engine_type before calling nand_scan(). During the initialization step, the core will consider this entry as the default engine for this driver. This value may of course be overloaded by the user if the usual DT properties are provided. Fixes: d525914b5bd8 ("mtd: rawnand: xway: Move the ECC initialization to ->attach_chip()") Cc: stable@vger.kernel.org Cc: Jan Hoffmann <jan@3e8.eu> Cc: Kestrel seventyfour <kestrelseventyfour@gmail.com> Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com> Tested-by: Jan Hoffmann <jan@3e8.eu> Link: https://lore.kernel.org/linux-mtd/20210928222258.199726-10-miquel.raynal@bootlin.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-11-18	mtd: rawnand: ams-delta: Keep the driver compatible with on-die ECC engines	Miquel Raynal
	commit d707bb74daae07879e0fc1b4b960f8f2d0a5fe5d upstream. Following the introduction of the generic ECC engine infrastructure, it was necessary to reorganize the code and move the ECC configuration in the ->attach_chip() hook. Failing to do that properly lead to a first series of fixes supposed to stabilize the situation. Unfortunately, this only fixed the use of software ECC engines, preventing any other kind of engine to be used, including on-die ones. It is now time to (finally) fix the situation by ensuring that we still provide a default (eg. software ECC) but will still support different ECC engines such as on-die ECC engines if properly described in the device tree. There are no changes needed on the core side in order to do this, but we just need to leverage the logic there which allows: 1- a subsystem default (set to Host engines in the raw NAND world) 2- a driver specific default (here set to software ECC engines) 3- any type of engine requested by the user (ie. described in the DT) As the raw NAND subsystem has not yet been fully converted to the ECC engine infrastructure, in order to provide a default ECC engine for this driver we need to set chip->ecc.engine_type before calling nand_scan(). During the initialization step, the core will consider this entry as the default engine for this driver. This value may of course be overloaded by the user if the usual DT properties are provided. Fixes: 59d93473323a ("mtd: rawnand: ams-delta: Move the ECC initialization to ->attach_chip()") Cc: stable@vger.kernel.org Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com> Link: https://lore.kernel.org/linux-mtd/20210928222258.199726-2-miquel.raynal@bootlin.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-11-18	mtd: rawnand: fsmc: Fix use of SM ORDER	Miquel Raynal
	commit 9be1446ece291a1f08164bd056bed3d698681f8b upstream. The introduction of the generic ECC engine API lead to a number of changes in various drivers which broke some of them. Here is a typical example: I expected the SM_ORDER option to be handled by the Hamming ECC engine internals. Problem: the fsmc driver does not instantiate (yet) a real ECC engine object so we had to use a 'bare' ECC helper instead of the shiny rawnand functions. However, when not intializing this engine properly and using the bare helpers, we do not get the SM ORDER feature handled automatically. It looks like this was lost in the process so let's ensure we use the right SM ORDER now. Fixes: ad9ffdce4539 ("mtd: rawnand: fsmc: Fix external use of SW Hamming ECC helper") Cc: stable@vger.kernel.org Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com> Link: https://lore.kernel.org/linux-mtd/20210928221507.199198-2-miquel.raynal@bootlin.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-11-18	remoteproc: imx_rproc: Fix rsc-table name	Dong Aisheng
	commit e90547d59d4e29e269e22aa6ce590ed0b41207d2 upstream. Usually the dash '-' is preferred in node name. So far, not dts in upstream kernel, so we just update node name in driver. Cc: Bjorn Andersson <bjorn.andersson@linaro.org> Cc: Mathieu Poirier <mathieu.poirier@linaro.org> Fixes: 5e4c1243071d ("remoteproc: imx_rproc: support remote cores booted before Linux Kernel") Reviewed-and-tested-by: Peng Fan <peng.fan@nxp.com> Signed-off-by: Dong Aisheng <aisheng.dong@nxp.com> Signed-off-by: Peng Fan <peng.fan@nxp.com> Cc: stable <stable@vger.kernel.org> Link: https://lore.kernel.org/r/20210910090621.3073540-6-peng.fan@oss.nxp.com Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org> Signed-off-by: Bjorn Andersson <bjorn.andersson@linaro.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-11-18	remoteproc: imx_rproc: Fix ignoring mapping vdev regions	Dong Aisheng
	commit afe670e23af91d8a74a8d7049f6e0984bbf6ea11 upstream. vdev regions are typically named vdev0buffer, vdev0ring0, vdev0ring1 and etc. Change to strncmp to cover them all. Fixes: 8f2d8961640f ("remoteproc: imx_rproc: ignore mapping vdev regions") Reviewed-and-tested-by: Peng Fan <peng.fan@nxp.com> Signed-off-by: Dong Aisheng <aisheng.dong@nxp.com> Signed-off-by: Peng Fan <peng.fan@nxp.com> Cc: stable <stable@vger.kernel.org> Link: https://lore.kernel.org/r/20210910090621.3073540-5-peng.fan@oss.nxp.com Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org> Signed-off-by: Bjorn Andersson <bjorn.andersson@linaro.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-11-18	remoteproc: Fix the wrong default value of is_iomem	Dong Aisheng
	commit 970675f61bf5761d7e5326f6e4df995ecdba5e11 upstream. Currently the is_iomem is a random value in the stack which may be default to true even on those platforms that not use iomem to store firmware. Cc: Bjorn Andersson <bjorn.andersson@linaro.org> Cc: Mathieu Poirier <mathieu.poirier@linaro.org> Fixes: 40df0a91b2a5 ("remoteproc: add is_iomem to da_to_va") Reviewed-and-tested-by: Peng Fan <peng.fan@nxp.com> Signed-off-by: Dong Aisheng <aisheng.dong@nxp.com> Signed-off-by: Peng Fan <peng.fan@nxp.com> Cc: stable <stable@vger.kernel.org> Link: https://lore.kernel.org/r/20210910090621.3073540-3-peng.fan@oss.nxp.com Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org> Signed-off-by: Bjorn Andersson <bjorn.andersson@linaro.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-11-18	remoteproc: elf_loader: Fix loading segment when is_iomem true	Peng Fan
	commit 24acbd9dc934f5d9418a736c532d3970a272063e upstream. It seems luckliy work on i.MX platform, but it is wrong. Need use memcpy_toio, not memcpy_fromio. Fixes: 40df0a91b2a5 ("remoteproc: add is_iomem to da_to_va") Tested-by: Dong Aisheng <aisheng.dong@nxp.com> (i.MX8MQ) Reported-by: kernel test robot <lkp@intel.com> Reported-by: Dong Aisheng <aisheng.dong@nxp.com> Signed-off-by: Peng Fan <peng.fan@nxp.com> Cc: stable <stable@vger.kernel.org> Link: https://lore.kernel.org/r/20210910090621.3073540-2-peng.fan@oss.nxp.com Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org> Signed-off-by: Bjorn Andersson <bjorn.andersson@linaro.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-11-18	s390/cio: make ccw_device_dma_* more robust	Halil Pasic
	commit ad9a14517263a16af040598c7920c09ca9670a31 upstream. Since commit 48720ba56891 ("virtio/s390: use DMA memory for ccw I/O and classic notifiers") we were supposed to make sure that virtio_ccw_release_dev() completes before the ccw device and the attached dma pool are torn down, but unfortunately we did not. Before that commit it used to be OK to delay cleaning up the memory allocated by virtio-ccw indefinitely (which isn't really intuitive for guys used to destruction happens in reverse construction order), but now we trigger a BUG_ON if the genpool is destroyed before all memory allocated from it is deallocated. Which brings down the guest. We can observe this problem, when unregister_virtio_device() does not give up the last reference to the virtio_device (e.g. because a virtio-scsi attached scsi disk got removed without previously unmounting its previously mounted partition). To make sure that the genpool is only destroyed after all the necessary freeing is done let us take a reference on the ccw device on each ccw_device_dma_zalloc() and give it up on each ccw_device_dma_free(). Actually there are multiple approaches to fixing the problem at hand that can work. The upside of this one is that it is the safest one while remaining simple. We don't crash the guest even if the driver does not pair allocations and frees. The downside is the reference counting overhead, that the reference counting for ccw devices becomes more complex, in a sense that we need to pair the calls to the aforementioned functions for it to be correct, and that if we happen to leak, we leak more than necessary (the whole ccw device instead of just the genpool). Some alternatives to this approach are taking a reference in virtio_ccw_online() and giving it up in virtio_ccw_release_dev() or making sure virtio_ccw_release_dev() completes its work before virtio_ccw_remove() returns. The downside of these approaches is that these are less safe against programming errors. Cc: <stable@vger.kernel.org> # v5.3 Signed-off-by: Halil Pasic <pasic@linux.ibm.com> Fixes: 48720ba56891 ("virtio/s390: use DMA memory for ccw I/O and classic notifiers") Reported-by: bfu@redhat.com Reviewed-by: Vineeth Vijayan <vneethv@linux.ibm.com> Acked-by: Cornelia Huck <cohuck@redhat.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-11-18	s390/ap: Fix hanging ioctl caused by orphaned replies	Harald Freudenberger
	commit 3826350e6dd435e244eb6e47abad5a47c169ebc2 upstream. When a queue is switched to soft offline during heavy load and later switched to soft online again and now used, it may be that the caller is blocked forever in the ioctl call. The failure occurs because there is a pending reply after the queue(s) have been switched to offline. This orphaned reply is received when the queue is switched to online and is accidentally counted for the outstanding replies. So when there was a valid outstanding reply and this orphaned reply is received it counts as the outstanding one thus dropping the outstanding counter to 0. Voila, with this counter the receive function is not called any more and the real outstanding reply is never received (until another request comes in...) and the ioctl blocks. The fix is simple. However, instead of readjusting the counter when an orphaned reply is detected, I check the queue status for not empty and compare this to the outstanding counter. So if the queue is not empty then the counter must not drop to 0 but at least have a value of 1. Signed-off-by: Harald Freudenberger <freude@linux.ibm.com> Cc: stable@vger.kernel.org Signed-off-by: Vasily Gorbik <gor@linux.ibm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-11-18	s390/tape: fix timer initialization in tape_std_assign()	Sven Schnelle
	commit 213fca9e23b59581c573d558aa477556f00b8198 upstream. commit 9c6c273aa424 ("timer: Remove init_timer_on_stack() in favor of timer_setup_on_stack()") changed the timer setup from init_timer_on_stack(() to timer_setup(), but missed to change the mod_timer() call. And while at it, use msecs_to_jiffies() instead of the open coded timeout calculation. Cc: stable@vger.kernel.org Fixes: 9c6c273aa424 ("timer: Remove init_timer_on_stack() in favor of timer_setup_on_stack()") Signed-off-by: Sven Schnelle <svens@linux.ibm.com> Reviewed-by: Vasily Gorbik <gor@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-11-18	s390/cio: check the subchannel validity for dev_busid	Vineeth Vijayan
	commit a4751f157c194431fae9e9c493f456df8272b871 upstream. Check the validity of subchanel before reading other fields in the schib. Fixes: d3683c055212 ("s390/cio: add dev_busid sysfs entry for each subchannel") CC: <stable@vger.kernel.org> Reported-by: Cornelia Huck <cohuck@redhat.com> Signed-off-by: Vineeth Vijayan <vneethv@linux.ibm.com> Reviewed-by: Cornelia Huck <cohuck@redhat.com> Link: https://lore.kernel.org/r/20211105154451.847288-1-vneethv@linux.ibm.com Signed-off-by: Vasily Gorbik <gor@linux.ibm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-11-18	s390/cpumf: cpum_cf PMU displays invalid value after hotplug remove	Thomas Richter
	commit 9d48c7afedf91a02d03295837ec76b2fb5e7d3fe upstream. When a CPU is hotplugged while the perf stat -e cycles command is running, a wrong (very large) value is displayed immediately after the CPU removal: Check the values, shouldn't be too high as in time counts unit events 1.001101919 29261846 cycles 2.002454499 17523405 cycles 3.003659292 24361161 cycles 4.004816983 18446744073638406144 cycles 5.005671647 <not counted> cycles ... The CPU hotplug off took place after 3 seconds. The issue is the read of the event count value after 4 seconds when the CPU is not available and the read of the counter returns an error. This is treated as a counter value of zero. This results in a very large value (0 - previous_value). Fix this by detecting the hotplugged off CPU and report 0 instead of a very large number. Cc: stable@vger.kernel.org Fixes: a029a4eab39e ("s390/cpumf: Allow concurrent access for CPU Measurement Counter Facility") Reported-by: Sumanth Korikkar <sumanthk@linux.ibm.com> Signed-off-by: Thomas Richter <tmricht@linux.ibm.com> Reviewed-by: Sumanth Korikkar <sumanthk@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-11-18	PM: sleep: Avoid calling put_device() under dpm_list_mtx	Rafael J. Wysocki
	commit 2aa36604e8243698ff22bd5fef0dd0c6bb07ba92 upstream. It is generally unsafe to call put_device() with dpm_list_mtx held, because the given device's release routine may carry out an action depending on that lock which then may deadlock, so modify the system-wide suspend and resume of devices to always drop dpm_list_mtx before calling put_device() (and adjust white space somewhat while at it). For instance, this prevents the following splat from showing up in the kernel log after a system resume in certain configurations: [ 3290.969514] ====================================================== [ 3290.969517] WARNING: possible circular locking dependency detected [ 3290.969519] 5.15.0+ #2420 Tainted: G S [ 3290.969523] ------------------------------------------------------ [ 3290.969525] systemd-sleep/4553 is trying to acquire lock: [ 3290.969529] ffff888117ab1138 ((wq_completion)hci0#2){+.+.}-{0:0}, at: flush_workqueue+0x87/0x4a0 [ 3290.969554] but task is already holding lock: [ 3290.969556] ffffffff8280fca8 (dpm_list_mtx){+.+.}-{3:3}, at: dpm_resume+0x12e/0x3e0 [ 3290.969571] which lock already depends on the new lock. [ 3290.969573] the existing dependency chain (in reverse order) is: [ 3290.969575] -> #3 (dpm_list_mtx){+.+.}-{3:3}: [ 3290.969583] __mutex_lock+0x9d/0xa30 [ 3290.969591] device_pm_add+0x2e/0xe0 [ 3290.969597] device_add+0x4d5/0x8f0 [ 3290.969605] hci_conn_add_sysfs+0x43/0xb0 [bluetooth] [ 3290.969689] hci_conn_complete_evt.isra.71+0x124/0x750 [bluetooth] [ 3290.969747] hci_event_packet+0xd6c/0x28a0 [bluetooth] [ 3290.969798] hci_rx_work+0x213/0x640 [bluetooth] [ 3290.969842] process_one_work+0x2aa/0x650 [ 3290.969851] worker_thread+0x39/0x400 [ 3290.969859] kthread+0x142/0x170 [ 3290.969865] ret_from_fork+0x22/0x30 [ 3290.969872] -> #2 (&hdev->lock){+.+.}-{3:3}: [ 3290.969881] __mutex_lock+0x9d/0xa30 [ 3290.969887] hci_event_packet+0xba/0x28a0 [bluetooth] [ 3290.969935] hci_rx_work+0x213/0x640 [bluetooth] [ 3290.969978] process_one_work+0x2aa/0x650 [ 3290.969985] worker_thread+0x39/0x400 [ 3290.969993] kthread+0x142/0x170 [ 3290.969999] ret_from_fork+0x22/0x30 [ 3290.970004] -> #1 ((work_completion)(&hdev->rx_work)){+.+.}-{0:0}: [ 3290.970013] process_one_work+0x27d/0x650 [ 3290.970020] worker_thread+0x39/0x400 [ 3290.970028] kthread+0x142/0x170 [ 3290.970033] ret_from_fork+0x22/0x30 [ 3290.970038] -> #0 ((wq_completion)hci0#2){+.+.}-{0:0}: [ 3290.970047] __lock_acquire+0x15cb/0x1b50 [ 3290.970054] lock_acquire+0x26c/0x300 [ 3290.970059] flush_workqueue+0xae/0x4a0 [ 3290.970066] drain_workqueue+0xa1/0x130 [ 3290.970073] destroy_workqueue+0x34/0x1f0 [ 3290.970081] hci_release_dev+0x49/0x180 [bluetooth] [ 3290.970130] bt_host_release+0x1d/0x30 [bluetooth] [ 3290.970195] device_release+0x33/0x90 [ 3290.970201] kobject_release+0x63/0x160 [ 3290.970211] dpm_resume+0x164/0x3e0 [ 3290.970215] dpm_resume_end+0xd/0x20 [ 3290.970220] suspend_devices_and_enter+0x1a4/0xba0 [ 3290.970229] pm_suspend+0x26b/0x310 [ 3290.970236] state_store+0x42/0x90 [ 3290.970243] kernfs_fop_write_iter+0x135/0x1b0 [ 3290.970251] new_sync_write+0x125/0x1c0 [ 3290.970257] vfs_write+0x360/0x3c0 [ 3290.970263] ksys_write+0xa7/0xe0 [ 3290.970269] do_syscall_64+0x3a/0x80 [ 3290.970276] entry_SYSCALL_64_after_hwframe+0x44/0xae [ 3290.970284] other info that might help us debug this: [ 3290.970285] Chain exists of: (wq_completion)hci0#2 --> &hdev->lock --> dpm_list_mtx [ 3290.970297] Possible unsafe locking scenario: [ 3290.970299] CPU0 CPU1 [ 3290.970300] ---- ---- [ 3290.970302] lock(dpm_list_mtx); [ 3290.970306] lock(&hdev->lock); [ 3290.970310] lock(dpm_list_mtx); [ 3290.970314] lock((wq_completion)hci0#2); [ 3290.970319] * DEADLOCK * [ 3290.970321] 7 locks held by systemd-sleep/4553: [ 3290.970325] #0: ffff888103bcd448 (sb_writers#4){.+.+}-{0:0}, at: ksys_write+0xa7/0xe0 [ 3290.970341] #1: ffff888115a14488 (&of->mutex){+.+.}-{3:3}, at: kernfs_fop_write_iter+0x103/0x1b0 [ 3290.970355] #2: ffff888100f719e0 (kn->active#233){.+.+}-{0:0}, at: kernfs_fop_write_iter+0x10c/0x1b0 [ 3290.970369] #3: ffffffff82661048 (autosleep_lock){+.+.}-{3:3}, at: state_store+0x12/0x90 [ 3290.970384] #4: ffffffff82658ac8 (system_transition_mutex){+.+.}-{3:3}, at: pm_suspend+0x9f/0x310 [ 3290.970399] #5: ffffffff827f2a48 (acpi_scan_lock){+.+.}-{3:3}, at: acpi_suspend_begin+0x4c/0x80 [ 3290.970416] #6: ffffffff8280fca8 (dpm_list_mtx){+.+.}-{3:3}, at: dpm_resume+0x12e/0x3e0 [ 3290.970428] stack backtrace: [ 3290.970431] CPU: 3 PID: 4553 Comm: systemd-sleep Tainted: G S 5.15.0+ #2420 [ 3290.970438] Hardware name: Dell Inc. XPS 13 9380/0RYJWW, BIOS 1.5.0 06/03/2019 [ 3290.970441] Call Trace: [ 3290.970446] dump_stack_lvl+0x44/0x57 [ 3290.970454] check_noncircular+0x105/0x120 [ 3290.970468] ? __lock_acquire+0x15cb/0x1b50 [ 3290.970474] __lock_acquire+0x15cb/0x1b50 [ 3290.970487] lock_acquire+0x26c/0x300 [ 3290.970493] ? flush_workqueue+0x87/0x4a0 [ 3290.970503] ? __raw_spin_lock_init+0x3b/0x60 [ 3290.970510] ? lockdep_init_map_type+0x58/0x240 [ 3290.970519] flush_workqueue+0xae/0x4a0 [ 3290.970526] ? flush_workqueue+0x87/0x4a0 [ 3290.970544] ? drain_workqueue+0xa1/0x130 [ 3290.970552] drain_workqueue+0xa1/0x130 [ 3290.970561] destroy_workqueue+0x34/0x1f0 [ 3290.970572] hci_release_dev+0x49/0x180 [bluetooth] [ 3290.970624] bt_host_release+0x1d/0x30 [bluetooth] [ 3290.970687] device_release+0x33/0x90 [ 3290.970695] kobject_release+0x63/0x160 [ 3290.970705] dpm_resume+0x164/0x3e0 [ 3290.970710] ? dpm_resume_early+0x251/0x3b0 [ 3290.970718] dpm_resume_end+0xd/0x20 [ 3290.970723] suspend_devices_and_enter+0x1a4/0xba0 [ 3290.970737] pm_suspend+0x26b/0x310 [ 3290.970746] state_store+0x42/0x90 [ 3290.970755] kernfs_fop_write_iter+0x135/0x1b0 [ 3290.970764] new_sync_write+0x125/0x1c0 [ 3290.970777] vfs_write+0x360/0x3c0 [ 3290.970785] ksys_write+0xa7/0xe0 [ 3290.970794] do_syscall_64+0x3a/0x80 [ 3290.970803] entry_SYSCALL_64_after_hwframe+0x44/0xae [ 3290.970811] RIP: 0033:0x7f41b1328164 [ 3290.970819] Code: 00 f7 d8 64 89 02 48 c7 c0 ff ff ff ff eb b7 0f 1f 80 00 00 00 00 8b 05 4a d2 2c 00 48 63 ff 85 c0 75 13 b8 01 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 54 f3 c3 66 90 55 53 48 89 d5 48 89 f3 48 83 [ 3290.970824] RSP: 002b:00007ffe6ae21b28 EFLAGS: 00000246 ORIG_RAX: 0000000000000001 [ 3290.970831] RAX: ffffffffffffffda RBX: 0000000000000004 RCX: 00007f41b1328164 [ 3290.970836] RDX: 0000000000000004 RSI: 000055965e651070 RDI: 0000000000000004 [ 3290.970839] RBP: 000055965e651070 R08: 000055965e64f390 R09: 00007f41b1e3d1c0 [ 3290.970843] R10: 000000000000000a R11: 0000000000000246 R12: 0000000000000004 [ 3290.970846] R13: 0000000000000001 R14: 000055965e64f2b0 R15: 0000000000000004 Cc: All applicable <stable@vger.kernel.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-11-18	bcache: Revert "bcache: use bvec_virt"	Coly Li
	commit 2878feaed543c35f9dbbe6d8ce36fb67ac803eef upstream. This reverts commit 2fd3e5efe791946be0957c8e1eed9560b541fe46. The above commit replaces page_address(bv->bv_page) by bvec_virt(bv) to avoid directly access to bv->bv_page, but in situation bv->bv_offset is not zero and page_address(bv->bv_page) is not equal to bvec_virt(bv). In such case a memory corruption may happen because memory in next page is tainted by following line in do_btree_node_write(), memcpy(bvec_virt(bv), addr, PAGE_SIZE); This patch reverts the mentioned commit to avoid the memory corruption. Fixes: 2fd3e5efe791 ("bcache: use bvec_virt") Signed-off-by: Coly Li <colyli@suse.de> Cc: Christoph Hellwig <hch@lst.de> Cc: stable@vger.kernel.org # 5.15 Signed-off-by: Christoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/r/20211103151041.70516-1-colyli@suse.de Signed-off-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-11-18	bcache: fix use-after-free problem in bcache_device_free()	Coly Li
	commit 8468f45091d2866affed6f6a7aecc20779139173 upstream. In bcache_device_free(), pointer disk is referenced still in ida_simple_remove() after blk_cleanup_disk() gets called on this pointer. This may cause a potential panic by use-after-free on the disk pointer. This patch fixes the problem by calling blk_cleanup_disk() after ida_simple_remove(). Fixes: bc70852fd104 ("bcache: convert to blk_alloc_disk/blk_cleanup_disk") Signed-off-by: Coly Li <colyli@suse.de> Cc: Christoph Hellwig <hch@lst.de> Cc: Hannes Reinecke <hare@suse.de> Cc: Ulf Hansson <ulf.hansson@linaro.org> Cc: stable@vger.kernel.org # v5.14+ Reviewed-by: Christoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/r/20211103064917.67383-1-colyli@suse.de Signed-off-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-11-18	video: backlight: Drop maximum brightness override for brightness zero	Marek Vasut
	commit 33a5471f8da976bf271a1ebbd6b9d163cb0cb6aa upstream. The note in c2adda27d202f ("video: backlight: Add of_find_backlight helper in backlight.c") says that gpio-backlight uses brightness as power state. This has been fixed since in ec665b756e6f7 ("backlight: gpio-backlight: Correct initial power state handling") and other backlight drivers do not require this workaround. Drop the workaround. This fixes the case where e.g. pwm-backlight can perfectly well be set to brightness 0 on boot in DT, which without this patch leads to the display brightness to be max instead of off. Fixes: c2adda27d202f ("video: backlight: Add of_find_backlight helper in backlight.c") Cc: <stable@vger.kernel.org> # 5.4+ Cc: <stable@vger.kernel.org> # 4.19.x: ec665b756e6f7: backlight: gpio-backlight: Correct initial power state handling Signed-off-by: Marek Vasut <marex@denx.de> Acked-by: Noralf Trønnes <noralf@tronnes.org> Reviewed-by: Daniel Thompson <daniel.thompson@linaro.org> Signed-off-by: Lee Jones <lee.jones@linaro.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-11-18	mfd: dln2: Add cell for initializing DLN2 ADC	Jack Andersen
	commit 313c84b5ae4104e48c661d5d706f9f4c425fd50f upstream. This patch extends the DLN2 driver; adding cell for adc_dln2 module. The original patch[1] fell through the cracks when the driver was added so ADC has never actually been usable. That patch did not have ACPI support which was added in v5.9, so the oldest supported version this current patch can be backported to is 5.10. [1] https://www.spinics.net/lists/linux-iio/msg33975.html Cc: <stable@vger.kernel.org> # 5.10+ Signed-off-by: Jack Andersen <jackoalan@gmail.com> Signed-off-by: Noralf Trønnes <noralf@tronnes.org> Signed-off-by: Lee Jones <lee.jones@linaro.org> Link: https://lore.kernel.org/r/20211018112541.25466-1-noralf@tronnes.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-11-18	mm, thp: fix incorrect unmap behavior for private pages	Rongwei Wang
	commit 8468e937df1f31411d1e127fa38db064af051fe5 upstream. When truncating pagecache on file THP, the private pages of a process should not be unmapped mapping. This incorrect behavior on a dynamic shared libraries which will cause related processes to happen core dump. A simple test for a DSO (Prerequisite is the DSO mapped in file THP): int main(int argc, char *argv[]) { int fd; fd = open(argv[1], O_WRONLY); if (fd < 0) { perror("open"); } close(fd); return 0; } The test only to open a target DSO, and do nothing. But this operation will lead one or more process to happen core dump. This patch mainly to fix this bug. Link: https://lkml.kernel.org/r/20211025092134.18562-3-rongwei.wang@linux.alibaba.com Fixes: eb6ecbed0aa2 ("mm, thp: relax the VM_DENYWRITE constraint on file-backed THPs") Signed-off-by: Rongwei Wang <rongwei.wang@linux.alibaba.com> Tested-by: Xu Yu <xuyu@linux.alibaba.com> Cc: Matthew Wilcox (Oracle) <willy@infradead.org> Cc: Song Liu <song@kernel.org> Cc: William Kucharski <william.kucharski@oracle.com> Cc: Hugh Dickins <hughd@google.com> Cc: Yang Shi <shy828301@gmail.com> Cc: Mike Kravetz <mike.kravetz@oracle.com> Cc: Collin Fijalkovich <cfijalkovich@google.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-11-18	mm, thp: lock filemap when truncating page cache	Rongwei Wang
	commit 55fc0d91746759c71bc165bba62a2db64ac98e35 upstream. Patch series "fix two bugs for file THP". This patch (of 2): Transparent huge page has supported read-only non-shmem files. The file- backed THP is collapsed by khugepaged and truncated when written (for shared libraries). However, there is a race when multiple writers truncate the same page cache concurrently. In that case, subpage(s) of file THP can be revealed by find_get_entry in truncate_inode_pages_range, which will trigger PageTail BUG_ON in truncate_inode_page, as follows: page:000000009e420ff2 refcount:1 mapcount:0 mapping:0000000000000000 index:0x7ff pfn:0x50c3ff head:0000000075ff816d order:9 compound_mapcount:0 compound_pincount:0 flags: 0x37fffe0000010815(locked\|uptodate\|lru\|arch_1\|head) raw: 37fffe0000000000 fffffe0013108001 dead000000000122 dead000000000400 raw: 0000000000000001 0000000000000000 00000000ffffffff 0000000000000000 head: 37fffe0000010815 fffffe001066bd48 ffff000404183c20 0000000000000000 head: 0000000000000600 0000000000000000 00000001ffffffff ffff000c0345a000 page dumped because: VM_BUG_ON_PAGE(PageTail(page)) ------------[ cut here ]------------ kernel BUG at mm/truncate.c:213! Internal error: Oops - BUG: 0 [#1] SMP Modules linked in: xfs(E) libcrc32c(E) rfkill(E) ... CPU: 14 PID: 11394 Comm: check_madvise_d Kdump: ... Hardware name: ECS, BIOS 0.0.0 02/06/2015 pstate: 60400005 (nZCv daif +PAN -UAO -TCO BTYPE=--) Call trace: truncate_inode_page+0x64/0x70 truncate_inode_pages_range+0x550/0x7e4 truncate_pagecache+0x58/0x80 do_dentry_open+0x1e4/0x3c0 vfs_open+0x38/0x44 do_open+0x1f0/0x310 path_openat+0x114/0x1dc do_filp_open+0x84/0x134 do_sys_openat2+0xbc/0x164 __arm64_sys_openat+0x74/0xc0 el0_svc_common.constprop.0+0x88/0x220 do_el0_svc+0x30/0xa0 el0_svc+0x20/0x30 el0_sync_handler+0x1a4/0x1b0 el0_sync+0x180/0x1c0 Code: aa0103e0 900061e1 910ec021 9400d300 (d4210000) This patch mainly to lock filemap when one enter truncate_pagecache(), avoiding truncating the same page cache concurrently. Link: https://lkml.kernel.org/r/20211025092134.18562-1-rongwei.wang@linux.alibaba.com Link: https://lkml.kernel.org/r/20211025092134.18562-2-rongwei.wang@linux.alibaba.com Fixes: eb6ecbed0aa2 ("mm, thp: relax the VM_DENYWRITE constraint on file-backed THPs") Signed-off-by: Xu Yu <xuyu@linux.alibaba.com> Signed-off-by: Rongwei Wang <rongwei.wang@linux.alibaba.com> Suggested-by: Matthew Wilcox (Oracle) <willy@infradead.org> Tested-by: Song Liu <song@kernel.org> Cc: Collin Fijalkovich <cfijalkovich@google.com> Cc: Hugh Dickins <hughd@google.com> Cc: Mike Kravetz <mike.kravetz@oracle.com> Cc: William Kucharski <william.kucharski@oracle.com> Cc: Yang Shi <shy828301@gmail.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-11-18	mm, oom: do not trigger out_of_memory from the #PF	Michal Hocko
	commit 60e2793d440a3ec95abb5d6d4fc034a4b480472d upstream. Any allocation failure during the #PF path will return with VM_FAULT_OOM which in turn results in pagefault_out_of_memory. This can happen for 2 different reasons. a) Memcg is out of memory and we rely on mem_cgroup_oom_synchronize to perform the memcg OOM handling or b) normal allocation fails. The latter is quite problematic because allocation paths already trigger out_of_memory and the page allocator tries really hard to not fail allocations. Anyway, if the OOM killer has been already invoked there is no reason to invoke it again from the #PF path. Especially when the OOM condition might be gone by that time and we have no way to find out other than allocate. Moreover if the allocation failed and the OOM killer hasn't been invoked then we are unlikely to do the right thing from the #PF context because we have already lost the allocation context and restictions and therefore might oom kill a task from a different NUMA domain. This all suggests that there is no legitimate reason to trigger out_of_memory from pagefault_out_of_memory so drop it. Just to be sure that no #PF path returns with VM_FAULT_OOM without allocation print a warning that this is happening before we restart the #PF. [VvS: #PF allocation can hit into limit of cgroup v1 kmem controller. This is a local problem related to memcg, however, it causes unnecessary global OOM kills that are repeated over and over again and escalate into a real disaster. This has been broken since kmem accounting has been introduced for cgroup v1 (3.8). There was no kmem specific reclaim for the separate limit so the only way to handle kmem hard limit was to return with ENOMEM. In upstream the problem will be fixed by removing the outdated kmem limit, however stable and LTS kernels cannot do it and are still affected. This patch fixes the problem and should be backported into stable/LTS.] Link: https://lkml.kernel.org/r/f5fd8dd8-0ad4-c524-5f65-920b01972a42@virtuozzo.com Signed-off-by: Michal Hocko <mhocko@suse.com> Signed-off-by: Vasily Averin <vvs@virtuozzo.com> Acked-by: Michal Hocko <mhocko@suse.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Mel Gorman <mgorman@techsingularity.net> Cc: Roman Gushchin <guro@fb.com> Cc: Shakeel Butt <shakeelb@google.com> Cc: Tetsuo Handa <penguin-kernel@i-love.sakura.ne.jp> Cc: Uladzislau Rezki <urezki@gmail.com> Cc: Vladimir Davydov <vdavydov.dev@gmail.com> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-11-18	mm, oom: pagefault_out_of_memory: don't force global OOM for dying tasks	Vasily Averin
	commit 0b28179a6138a5edd9d82ad2687c05b3773c387b upstream. Patch series "memcg: prohibit unconditional exceeding the limit of dying tasks", v3. Memory cgroup charging allows killed or exiting tasks to exceed the hard limit. It can be misused and allowed to trigger global OOM from inside a memcg-limited container. On the other hand if memcg fails allocation, called from inside #PF handler it triggers global OOM from inside pagefault_out_of_memory(). To prevent these problems this patchset: (a) removes execution of out_of_memory() from pagefault_out_of_memory(), becasue nobody can explain why it is necessary. (b) allow memcg to fail allocation of dying/killed tasks. This patch (of 3): Any allocation failure during the #PF path will return with VM_FAULT_OOM which in turn results in pagefault_out_of_memory which in turn executes out_out_memory() and can kill a random task. An allocation might fail when the current task is the oom victim and there are no memory reserves left. The OOM killer is already handled at the page allocator level for the global OOM and at the charging level for the memcg one. Both have much more information about the scope of allocation/charge request. This means that either the OOM killer has been invoked properly and didn't lead to the allocation success or it has been skipped because it couldn't have been invoked. In both cases triggering it from here is pointless and even harmful. It makes much more sense to let the killed task die rather than to wake up an eternally hungry oom-killer and send him to choose a fatter victim for breakfast. Link: https://lkml.kernel.org/r/0828a149-786e-7c06-b70a-52d086818ea3@virtuozzo.com Signed-off-by: Vasily Averin <vvs@virtuozzo.com> Suggested-by: Michal Hocko <mhocko@suse.com> Acked-by: Michal Hocko <mhocko@suse.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Mel Gorman <mgorman@techsingularity.net> Cc: Roman Gushchin <guro@fb.com> Cc: Shakeel Butt <shakeelb@google.com> Cc: Tetsuo Handa <penguin-kernel@i-love.sakura.ne.jp> Cc: Uladzislau Rezki <urezki@gmail.com> Cc: Vladimir Davydov <vdavydov.dev@gmail.com> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-11-18	memcg: prohibit unconditional exceeding the limit of dying tasks	Vasily Averin
	commit a4ebf1b6ca1e011289677239a2a361fde4a88076 upstream. Memory cgroup charging allows killed or exiting tasks to exceed the hard limit. It is assumed that the amount of the memory charged by those tasks is bound and most of the memory will get released while the task is exiting. This is resembling a heuristic for the global OOM situation when tasks get access to memory reserves. There is no global memory shortage at the memcg level so the memcg heuristic is more relieved. The above assumption is overly optimistic though. E.g. vmalloc can scale to really large requests and the heuristic would allow that. We used to have an early break in the vmalloc allocator for killed tasks but this has been reverted by commit b8c8a338f75e ("Revert "vmalloc: back off when the current task is killed""). There are likely other similar code paths which do not check for fatal signals in an allocation&charge loop. Also there are some kernel objects charged to a memcg which are not bound to a process life time. It has been observed that it is not really hard to trigger these bypasses and cause global OOM situation. One potential way to address these runaways would be to limit the amount of excess (similar to the global OOM with limited oom reserves). This is certainly possible but it is not really clear how much of an excess is desirable and still protects from global OOMs as that would have to consider the overall memcg configuration. This patch is addressing the problem by removing the heuristic altogether. Bypass is only allowed for requests which either cannot fail or where the failure is not desirable while excess should be still limited (e.g. atomic requests). Implementation wise a killed or dying task fails to charge if it has passed the OOM killer stage. That should give all forms of reclaim chance to restore the limit before the failure (ENOMEM) and tell the caller to back off. In addition, this patch renames should_force_charge() helper to task_is_dying() because now its use is not associated witch forced charging. This patch depends on pagefault_out_of_memory() to not trigger out_of_memory(), because then a memcg failure can unwind to VM_FAULT_OOM and cause a global OOM killer. Link: https://lkml.kernel.org/r/8f5cebbb-06da-4902-91f0-6566fc4b4203@virtuozzo.com Signed-off-by: Vasily Averin <vvs@virtuozzo.com> Suggested-by: Michal Hocko <mhocko@suse.com> Acked-by: Michal Hocko <mhocko@suse.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Vladimir Davydov <vdavydov.dev@gmail.com> Cc: Roman Gushchin <guro@fb.com> Cc: Uladzislau Rezki <urezki@gmail.com> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Shakeel Butt <shakeelb@google.com> Cc: Mel Gorman <mgorman@techsingularity.net> Cc: Tetsuo Handa <penguin-kernel@i-love.sakura.ne.jp> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-11-18	mm/filemap.c: remove bogus VM_BUG_ON	Matthew Wilcox (Oracle)
	commit d417b49fff3e2f21043c834841e8623a6098741d upstream. It is not safe to check page->index without holding the page lock. It can be changed if the page is moved between the swap cache and the page cache for a shmem file, for example. There is a VM_BUG_ON below which checks page->index is correct after taking the page lock. Link: https://lkml.kernel.org/r/20210818144932.940640-1-willy@infradead.org Fixes: 5c211ba29deb ("mm: add and use find_lock_entries") Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Reported-by: <syzbot+c87be4f669d920c76330@syzkaller.appspotmail.com> Cc: Hugh Dickins <hughd@google.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-11-18	9p/net: fix missing error check in p9_check_errors	Dominique Martinet
	commit 27eb4c3144f7a5ebef3c9a261d80cb3e1fa784dc upstream. Link: https://lkml.kernel.org/r/99338965-d36c-886e-cd0e-1d8fff2b4746@gmail.com Reported-by: syzbot+06472778c97ed94af66d@syzkaller.appspotmail.com Cc: stable@vger.kernel.org Signed-off-by: Dominique Martinet <asmadeus@codewreck.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-11-18	net, neigh: Enable state migration between NUD_PERMANENT and NTF_USE	Daniel Borkmann
	[ Upstream commit 3dc20f4762c62d3b3f0940644881ed818aa7b2f5 ] Currently, it is not possible to migrate a neighbor entry between NUD_PERMANENT state and NTF_USE flag with a dynamic NUD state from a user space control plane. Similarly, it is not possible to add/remove NTF_EXT_LEARNED flag from an existing neighbor entry in combination with NTF_USE flag. This is due to the latter directly calling into neigh_event_send() without any meta data updates as happening in __neigh_update(). Thus, to enable this use case, extend the latter with a NEIGH_UPDATE_F_USE flag where we break the NUD_PERMANENT state in particular so that a latter neigh_event_send() is able to re-resolve a neighbor entry. Before fix, NUD_PERMANENT -> NUD_* & NTF_USE: # ./ip/ip n replace 192.168.178.30 dev enp5s0 lladdr f4:8c:50:5e:71:9a # ./ip/ip n 192.168.178.30 dev enp5s0 lladdr f4:8c:50:5e:71:9a PERMANENT [...] # ./ip/ip n replace 192.168.178.30 dev enp5s0 use extern_learn # ./ip/ip n 192.168.178.30 dev enp5s0 lladdr f4:8c:50:5e:71:9a PERMANENT [...] As can be seen, despite the admin-triggered replace, the entry remains in the NUD_PERMANENT state. After fix, NUD_PERMANENT -> NUD_* & NTF_USE: # ./ip/ip n replace 192.168.178.30 dev enp5s0 lladdr f4:8c:50:5e:71:9a # ./ip/ip n 192.168.178.30 dev enp5s0 lladdr f4:8c:50:5e:71:9a PERMANENT [...] # ./ip/ip n replace 192.168.178.30 dev enp5s0 use extern_learn # ./ip/ip n 192.168.178.30 dev enp5s0 lladdr f4:8c:50:5e:71:9a extern_learn REACHABLE [...] # ./ip/ip n 192.168.178.30 dev enp5s0 lladdr f4:8c:50:5e:71:9a extern_learn STALE [...] # ./ip/ip n replace 192.168.178.30 dev enp5s0 lladdr f4:8c:50:5e:71:9a # ./ip/ip n 192.168.178.30 dev enp5s0 lladdr f4:8c:50:5e:71:9a PERMANENT [...] After the fix, the admin-triggered replace switches to a dynamic state from the NTF_USE flag which triggered a new neighbor resolution. Likewise, we can transition back from there, if needed, into NUD_PERMANENT. Similar before/after behavior can be observed for below transitions: Before fix, NTF_USE -> NTF_USE \| NTF_EXT_LEARNED -> NTF_USE: # ./ip/ip n replace 192.168.178.30 dev enp5s0 use # ./ip/ip n 192.168.178.30 dev enp5s0 lladdr f4:8c:50:5e:71:9a REACHABLE [...] # ./ip/ip n replace 192.168.178.30 dev enp5s0 use extern_learn # ./ip/ip n 192.168.178.30 dev enp5s0 lladdr f4:8c:50:5e:71:9a REACHABLE [...] After fix, NTF_USE -> NTF_USE \| NTF_EXT_LEARNED -> NTF_USE: # ./ip/ip n replace 192.168.178.30 dev enp5s0 use # ./ip/ip n 192.168.178.30 dev enp5s0 lladdr f4:8c:50:5e:71:9a REACHABLE [...] # ./ip/ip n replace 192.168.178.30 dev enp5s0 use extern_learn # ./ip/ip n 192.168.178.30 dev enp5s0 lladdr f4:8c:50:5e:71:9a extern_learn REACHABLE [...] # ./ip/ip n replace 192.168.178.30 dev enp5s0 use # ./ip/ip n 192.168.178.30 dev enp5s0 lladdr f4:8c:50:5e:71:9a REACHABLE [..] Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Roopa Prabhu <roopa@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-11-18	dmaengine: bestcomm: fix system boot lockups	Anatolij Gustschin
	commit adec566b05288f2787a1f88dbaf77ed8b0c644fa upstream. memset() and memcpy() on an MMIO region like here results in a lockup at startup on mpc5200 platform (since this first happens during probing of the ATA and Ethernet drivers). Use memset_io() and memcpy_toio() instead. Fixes: 2f9ea1bde0d1 ("bestcomm: core bestcomm support for Freescale MPC5200") Cc: stable@vger.kernel.org # v5.14+ Signed-off-by: Anatolij Gustschin <agust@denx.de> Link: https://lore.kernel.org/r/20211014094012.21286-1-agust@denx.de Signed-off-by: Vinod Koul <vkoul@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-11-18	dmaengine: ti: k3-udma: Set r/tchan or rflow to NULL if request fail	Kishon Vijay Abraham I
	commit eb91224e47ec33a0a32c9be0ec0fcb3433e555fd upstream. udma_get_() checks if rchan/tchan/rflow is already allocated by checking if it has a NON NULL value. For the error cases, rchan/tchan/rflow will have error value and udma_get_() considers this as already allocated (PASS) since the error values are NON NULL. This results in NULL pointer dereference error while de-referencing rchan/tchan/rflow. Reset the value of rchan/tchan/rflow to NULL if a channel request fails. CC: stable@vger.kernel.org Acked-by: Peter Ujfalusi <peter.ujfalusi@gmail.com> Signed-off-by: Kishon Vijay Abraham I <kishon@ti.com> Link: https://lore.kernel.org/r/20211031032411.27235-3-kishon@ti.com Signed-off-by: Vinod Koul <vkoul@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-11-18	dmaengine: ti: k3-udma: Set bchan to NULL if a channel request fail	Kishon Vijay Abraham I
	commit 5c6c6d60e4b489308ae4da8424c869f7cc53cd12 upstream. bcdma_get_() checks if bchan is already allocated by checking if it has a NON NULL value. For the error cases, bchan will have error value and bcdma_get_() considers this as already allocated (PASS) since the error values are NON NULL. This results in NULL pointer dereference error while de-referencing bchan. Reset the value of bchan to NULL if a channel request fails. CC: stable@vger.kernel.org Acked-by: Peter Ujfalusi <peter.ujfalusi@gmail.com> Signed-off-by: Kishon Vijay Abraham I <kishon@ti.com> Link: https://lore.kernel.org/r/20211031032411.27235-2-kishon@ti.com Signed-off-by: Vinod Koul <vkoul@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-11-18	ksmbd: don't need 8byte alignment for request length in ksmbd_check_message	Namjae Jeon
	commit b53ad8107ee873795ecb5039d46b5d5502d404f2 upstream. When validating request length in ksmbd_check_message, 8byte alignment is not needed for compound request. It can cause wrong validation of request length. Fixes: e2f34481b24d ("cifsd: add server-side procedures for SMB3") Cc: stable@vger.kernel.org # v5.15 Acked-by: Hyunchul Lee <hyc.lee@gmail.com> Signed-off-by: Namjae Jeon <linkinjeon@kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-11-18	ksmbd: Fix buffer length check in fsctl_validate_negotiate_info()	Marios Makassikis
	commit 78f1688a64cca77758ceb9b183088cf0054bfc82 upstream. The validate_negotiate_info_req struct definition includes an extra field to access the data coming after the header. This causes the check in fsctl_validate_negotiate_info() to count the first element of the array twice. This in turn makes some valid requests fail, depending on whether they include padding or not. Fixes: f7db8fd03a4b ("ksmbd: add validation in smb2_ioctl") Cc: stable@vger.kernel.org # v5.15 Acked-by: Namjae Jeon <linkinjeon@kernel.org> Acked-by: Hyunchul Lee <hyc.lee@gmail.com> Signed-off-by: Marios Makassikis <mmakassikis@freebox.fr> Signed-off-by: Steve French <stfrench@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-11-18	block: Hold invalidate_lock in BLKRESETZONE ioctl	Shin'ichiro Kawasaki
	commit 86399ea071099ec8ee0a83ac9ad67f7df96a50ad upstream. When BLKRESETZONE ioctl and data read race, the data read leaves stale page cache. The commit e5113505904e ("block: Discard page cache of zone reset target range") added page cache truncation to avoid stale page cache after the ioctl. However, the stale page cache still can be read during the reset zone operation for the ioctl. To avoid the stale page cache completely, hold invalidate_lock of the block device file mapping. Fixes: e5113505904e ("block: Discard page cache of zone reset target range") Signed-off-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com> Cc: stable@vger.kernel.org # v5.15 Reviewed-by: Jan Kara <jack@suse.cz> Reviewed-by: Ming Lei <ming.lei@redhat.com> Link: https://lore.kernel.org/r/20211111085238.942492-1-shinichiro.kawasaki@wdc.com Signed-off-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-11-18	block: Hold invalidate_lock in BLKZEROOUT ioctl	Shin'ichiro Kawasaki
	commit 35e4c6c1a2fc2eb11b9306e95cda1fa06a511948 upstream. When BLKZEROOUT ioctl and data read race, the data read leaves stale page cache. To avoid the stale page cache, hold invalidate_lock of the block device file mapping. The stale page cache is observed when blktests test case block/009 is modified to call "blkdiscard -z" command and repeated hundreds of times. This patch can be applied back to the stable kernel version v5.15.y. Rework is required for older stable kernels. Fixes: 22dd6d356628 ("block: invalidate the page cache when issuing BLKZEROOUT") Signed-off-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com> Cc: stable@vger.kernel.org # v5.15 Reviewed-by: Jan Kara <jack@suse.cz> Link: https://lore.kernel.org/r/20211109104723.835533-3-shinichiro.kawasaki@wdc.com Signed-off-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-11-18	block: Hold invalidate_lock in BLKDISCARD ioctl	Shin'ichiro Kawasaki
	commit 7607c44c157d343223510c8ffdf7206fdd2a6213 upstream. When BLKDISCARD ioctl and data read race, the data read leaves stale page cache. To avoid the stale page cache, hold invalidate_lock of the block device file mapping. The stale page cache is observed when blktests test case block/009 is repeated hundreds of times. This patch can be applied back to the stable kernel version v5.15.y with slight patch edit. Rework is required for older stable kernels. Fixes: 351499a172c0 ("block: Invalidate cache on discard v2") Signed-off-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com> Cc: stable@vger.kernel.org # v5.15 Reviewed-by: Jan Kara <jack@suse.cz> Link: https://lore.kernel.org/r/20211109104723.835533-2-shinichiro.kawasaki@wdc.com Signed-off-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-11-18	drm/i915/guc: Fix blocked context accounting	Matthew Brost
	commit fc30a6764a54dea42291aeb7009bef7aa2fc1cd4 upstream. Prior to this patch the blocked context counter was cleared on init_sched_state (used during registering a context & resets) which is incorrect. This state needs to be persistent or the counter can read the incorrect value resulting in scheduling never getting enabled again. Fixes: 62eaf0ae217d ("drm/i915/guc: Support request cancellation") Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch> Cc: <stable@vger.kernel.org> Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210909164744.31249-2-matthew.brost@intel.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-11-18	erofs: fix unsafe pagevec reuse of hooked pclusters	Gao Xiang
	commit 86432a6dca9bed79111990851df5756d3eb5f57c upstream. There are pclusters in runtime marked with Z_EROFS_PCLUSTER_TAIL before actual I/O submission. Thus, the decompression chain can be extended if the following pcluster chain hooks such tail pcluster. As the related comment mentioned, if some page is made of a hooked pcluster and another followed pcluster, it can be reused for in-place I/O (since I/O should be submitted anyway): _______________________________________________________________ \| tail (partial) page \| head (partial) page \| \|_____PRIMARY_HOOKED___\|____________PRIMARY_FOLLOWED____________\| However, it's by no means safe to reuse as pagevec since if such PRIMARY_HOOKED pclusters finally move into bypass chain without I/O submission. It's somewhat hard to reproduce with LZ4 and I just found it (general protection fault) by ro_fsstressing a LZMA image for long time. I'm going to actively clean up related code together with multi-page folio adaption in the next few months. Let's address it directly for easier backporting for now. Call trace for reference: z_erofs_decompress_pcluster+0x10a/0x8a0 [erofs] z_erofs_decompress_queue.isra.36+0x3c/0x60 [erofs] z_erofs_runqueue+0x5f3/0x840 [erofs] z_erofs_readahead+0x1e8/0x320 [erofs] read_pages+0x91/0x270 page_cache_ra_unbounded+0x18b/0x240 filemap_get_pages+0x10a/0x5f0 filemap_read+0xa9/0x330 new_sync_read+0x11b/0x1a0 vfs_read+0xf1/0x190 Link: https://lore.kernel.org/r/20211103182006.4040-1-xiang@kernel.org Fixes: 3883a79abd02 ("staging: erofs: introduce VLE decompression support") Cc: <stable@vger.kernel.org> # 4.19+ Reviewed-by: Chao Yu <chao@kernel.org> Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-11-18	ceph: fix mdsmap decode when there are MDS's beyond max_mds	Xiubo Li
	commit 0e24421ac431e7af62d4acef6c638b85aae51728 upstream. If the max_mds is decreased in a cephfs cluster, there is a window of time before the MDSs are removed. If a map goes out during this period, the mdsmap may show the decreased max_mds but still shows those MDSes as in or in the export target list. Ensure that we don't fail the map decode in that case. Cc: stable@vger.kernel.org URL: https://tracker.ceph.com/issues/52436 Fixes: d517b3983dd3 ("ceph: reconnect to the export targets on new mdsmaps") Signed-off-by: Xiubo Li <xiubli@redhat.com> Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Ilya Dryomov <idryomov@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-11-18	f2fs: fix UAF in f2fs_available_free_memory	Dongliang Mu
	commit 5429c9dbc9025f9a166f64e22e3a69c94fd5b29b upstream. if2fs_fill_super -> f2fs_build_segment_manager -> create_discard_cmd_control -> f2fs_start_discard_thread It invokes kthread_run to create a thread and run issue_discard_thread. However, if f2fs_build_node_manager fails, the control flow goes to free_nm and calls f2fs_destroy_node_manager. This function will free sbi->nm_info. However, if issue_discard_thread accesses sbi->nm_info after the deallocation, but before the f2fs_stop_discard_thread, it will cause UAF(Use-after-free). -> f2fs_destroy_segment_manager -> destroy_discard_cmd_control -> f2fs_stop_discard_thread Fix this by stopping discard thread before f2fs_destroy_node_manager. Note that, the commit d6d2b491a82e1 introduces the call of f2fs_available_free_memory into issue_discard_thread. Cc: stable@vger.kernel.org Fixes: d6d2b491a82e ("f2fs: allow to change discard policy based on cached discard cmds") Signed-off-by: Dongliang Mu <mudongliangabcd@gmail.com> Reviewed-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-11-18	f2fs: include non-compressed blocks in compr_written_block	Daeho Jeong
	commit 09631cf3234d32156e7cae32275f5a4144c683c5 upstream. Need to include non-compressed blocks in compr_written_block to estimate average compression ratio more accurately. Fixes: 5ac443e26a09 ("f2fs: add sysfs nodes to get runtime compression stat") Cc: stable@vger.kernel.org Signed-off-by: Daeho Jeong <daehojeong@google.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-11-18	f2fs: should use GFP_NOFS for directory inodes	Jaegeuk Kim
	commit 92d602bc7177325e7453189a22e0c8764ed3453e upstream. We use inline_dentry which requires to allocate dentry page when adding a link. If we allow to reclaim memory from filesystem, we do down_read(&sbi->cp_rwsem) twice by f2fs_lock_op(). I think this should be okay, but how about stopping the lockdep complaint [1]? f2fs_create() - f2fs_lock_op() - f2fs_do_add_link() - __f2fs_find_entry - f2fs_get_read_data_page() -> kswapd - shrink_node - f2fs_evict_inode - f2fs_lock_op() [1] fs_reclaim ){+.+.}-{0:0} : kswapd0: lock_acquire+0x114/0x394 kswapd0: __fs_reclaim_acquire+0x40/0x50 kswapd0: prepare_alloc_pages+0x94/0x1ec kswapd0: __alloc_pages_nodemask+0x78/0x1b0 kswapd0: pagecache_get_page+0x2e0/0x57c kswapd0: f2fs_get_read_data_page+0xc0/0x394 kswapd0: f2fs_find_data_page+0xa4/0x23c kswapd0: find_in_level+0x1a8/0x36c kswapd0: __f2fs_find_entry+0x70/0x100 kswapd0: f2fs_do_add_link+0x84/0x1ec kswapd0: f2fs_mkdir+0xe4/0x1e4 kswapd0: vfs_mkdir+0x110/0x1c0 kswapd0: do_mkdirat+0xa4/0x160 kswapd0: __arm64_sys_mkdirat+0x24/0x34 kswapd0: el0_svc_common.llvm.17258447499513131576+0xc4/0x1e8 kswapd0: do_el0_svc+0x28/0xa0 kswapd0: el0_svc+0x24/0x38 kswapd0: el0_sync_handler+0x88/0xec kswapd0: el0_sync+0x1c0/0x200 kswapd0: -> #1 ( &sbi->cp_rwsem ){++++}-{3:3} : kswapd0: lock_acquire+0x114/0x394 kswapd0: down_read+0x7c/0x98 kswapd0: f2fs_do_truncate_blocks+0x78/0x3dc kswapd0: f2fs_truncate+0xc8/0x128 kswapd0: f2fs_evict_inode+0x2b8/0x8b8 kswapd0: evict+0xd4/0x2f8 kswapd0: iput+0x1c0/0x258 kswapd0: do_unlinkat+0x170/0x2a0 kswapd0: __arm64_sys_unlinkat+0x4c/0x68 kswapd0: el0_svc_common.llvm.17258447499513131576+0xc4/0x1e8 kswapd0: do_el0_svc+0x28/0xa0 kswapd0: el0_svc+0x24/0x38 kswapd0: el0_sync_handler+0x88/0xec kswapd0: el0_sync+0x1c0/0x200 Cc: stable@vger.kernel.org Fixes: bdbc90fa55af ("f2fs: don't put dentry page in pagecache into highmem") Reviewed-by: Chao Yu <chao@kernel.org> Reviewed-by: Stanley Chu <stanley.chu@mediatek.com> Reviewed-by: Light Hsieh <light.hsieh@mediatek.com> Tested-by: Light Hsieh <light.hsieh@mediatek.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-11-18	irqchip/sifive-plic: Fixup EOI failed when masked	Guo Ren
	commit 69ea463021be0d159ab30f96195fb0dd18ee2272 upstream. When using "devm_request_threaded_irq(,,,,IRQF_ONESHOT,,)" in a driver, only the first interrupt is handled, and following interrupts are never delivered (initially reported in [1]). That's because the RISC-V PLIC cannot EOI masked interrupts, as explained in the description of Interrupt Completion in the PLIC spec [2]: <quote> The PLIC signals it has completed executing an interrupt handler by writing the interrupt ID it received from the claim to the claim/complete register. The PLIC does not check whether the completion ID is the same as the last claim ID for that target. If the completion ID does not match an interrupt source that is currently enabled for the target, the completion is silently ignored. </quote> Re-enable the interrupt before completion if it has been masked during the handling, and remask it afterwards. [1] http://lists.infradead.org/pipermail/linux-riscv/2021-July/007441.html [2] https://github.com/riscv/riscv-plic-spec/blob/8bc15a35d07c9edf7b5d23fec9728302595ffc4d/riscv-plic.adoc Fixes: bb0fed1c60cc ("irqchip/sifive-plic: Switch to fasteoi flow") Reported-by: Vincent Pelletier <plr.vincent@gmail.com> Tested-by: Nikita Shubin <nikita.shubin@maquefel.me> Signed-off-by: Guo Ren <guoren@linux.alibaba.com> Cc: stable@vger.kernel.org Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Palmer Dabbelt <palmer@dabbelt.com> Cc: Atish Patra <atish.patra@wdc.com> Reviewed-by: Anup Patel <anup@brainfault.org> [maz: amended commit message] Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20211105094748.3894453-1-guoren@kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-11-18	posix-cpu-timers: Clear task::posix_cputimers_work in copy_process()	Michael Pratt
	commit ca7752caeaa70bd31d1714af566c9809688544af upstream. copy_process currently copies task_struct.posix_cputimers_work as-is. If a timer interrupt arrives while handling clone and before dup_task_struct completes then the child task will have: 1. posix_cputimers_work.scheduled = true 2. posix_cputimers_work.work queued. copy_process clears task_struct.task_works, so (2) will have no effect and posix_cpu_timers_work will never run (not to mention it doesn't make sense for two tasks to share a common linked list). Since posix_cpu_timers_work never runs, posix_cputimers_work.scheduled is never cleared. Since scheduled is set, future timer interrupts will skip scheduling work, with the ultimate result that the task will never receive timer expirations. Together, the complete flow is: 1. Task 1 calls clone(), enters kernel. 2. Timer interrupt fires, schedules task work on Task 1. 2a. task_struct.posix_cputimers_work.scheduled = true 2b. task_struct.posix_cputimers_work.work added to task_struct.task_works. 3. dup_task_struct() copies Task 1 to Task 2. 4. copy_process() clears task_struct.task_works for Task 2. 5. Future timer interrupts on Task 2 see task_struct.posix_cputimers_work.scheduled = true and skip scheduling work. Fix this by explicitly clearing contents of task_struct.posix_cputimers_work in copy_process(). This was never meant to be shared or inherited across tasks in the first place. Fixes: 1fb497dd0030 ("posix-cpu-timers: Provide mechanisms to defer timer handling to task_work") Reported-by: Rhys Hiltner <rhys@justin.tv> Signed-off-by: Michael Pratt <mpratt@google.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: <stable@vger.kernel.org> Link: https://lore.kernel.org/r/20211101210615.716522-1-mpratt@google.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-11-18	KVM: x86: move guest_pv_has out of user_access section	Paolo Bonzini
	commit 3e067fd8503d6205aa0c1c8f48f6b209c592d19c upstream. When UBSAN is enabled, the code emitted for the call to guest_pv_has includes a call to __ubsan_handle_load_invalid_value. objtool complains that this call happens with UACCESS enabled; to avoid the warning, pull the calls to user_access_begin into both arms of the "if" statement, after the check for guest_pv_has. Reported-by: Stephen Rothwell <sfr@canb.auug.org.au> Cc: David Woodhouse <dwmw2@infradead.org> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-11-18	PCI/MSI: Destroy sysfs before freeing entries	Thomas Gleixner
	commit 3735459037114d31e5acd9894fad9aed104231a0 upstream. free_msi_irqs() frees the MSI entries before destroying the sysfs entries which are exposing them. Nothing prevents a concurrent free while a sysfs file is read and accesses the possibly freed entry. Move the sysfs release ahead of freeing the entries. Fixes: 1c51b50c2995 ("PCI/MSI: Export MSI mode using attributes, not kobjects") Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Bjorn Helgaas <helgaas@kernel.org> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/87sfw5305m.ffs@tglx Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-11-18	PCI/MSI: Move non-mask check back into low level accessors	Thomas Gleixner
	commit 9c8e9c9681a0f3f1ae90a90230d059c7a1dece5a upstream. The recent rework of PCI/MSI[X] masking moved the non-mask checks from the low level accessors into the higher level mask/unmask functions. This missed the fact that these accessors can be invoked from other places as well. The missing checks break XEN-PV which sets pci_msi_ignore_mask and also violates the virtual MSIX and the msi_attrib.maskbit protections. Instead of sprinkling checks all over the place, lift them back into the low level accessor functions. To avoid checking three different conditions combine them into one property of msi_desc::msi_attrib. [ josef: Fixed the missed conversion in the core code ] Fixes: fcacdfbef5a1 ("PCI/MSI: Provide a new set of mask and unmask functions") Reported-by: Josef Johansson <josef@oderland.se> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Tested-by: Josef Johansson <josef@oderland.se> Cc: Bjorn Helgaas <helgaas@kernel.org> Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-11-18	x86/mce: Add errata workaround for Skylake SKX37	Dave Jones
	commit e629fc1407a63dbb748f828f9814463ffc2a0af0 upstream. Errata SKX37 is word-for-word identical to the other errata listed in this workaround. I happened to notice this after investigating a CMCI storm on a Skylake host. While I can't confirm this was the root cause, spurious corrected errors does sound like a likely suspect. Fixes: 2976908e4198 ("x86/mce: Do not log spurious corrected mce errors") Signed-off-by: Dave Jones <davej@codemonkey.org.uk> Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com> Reviewed-by: Tony Luck <tony.luck@intel.com> Cc: <stable@vger.kernel.org> Link: https://lkml.kernel.org/r/20211029205759.GA7385@codemonkey.org.uk Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-11-18	MIPS: Fix assembly error from MIPSr2 code used within MIPS_ISA_ARCH_LEVEL	Maciej W. Rozycki
	commit a923a2676e60683aee46aa4b93c30aff240ac20d upstream. Fix assembly errors like: {standard input}: Assembler messages: {standard input}:287: Error: opcode not supported on this processor: mips3 (mips3) `dins $10,$7,32,32' {standard input}:680: Error: opcode not supported on this processor: mips3 (mips3) `dins $10,$7,32,32' {standard input}:1274: Error: opcode not supported on this processor: mips3 (mips3) `dins $12,$9,32,32' {standard input}:2175: Error: opcode not supported on this processor: mips3 (mips3) `dins $10,$7,32,32' make[1]: *** [scripts/Makefile.build:277: mm/highmem.o] Error 1 with code produced from `__cmpxchg64' for MIPS64r2 CPU configurations using CONFIG_32BIT and CONFIG_PHYS_ADDR_T_64BIT. This is due to MIPS_ISA_ARCH_LEVEL downgrading the assembly architecture to `r4000' i.e. MIPS III for MIPS64r2 configurations, while there is a block of code containing a DINS MIPS64r2 instruction conditionalized on MIPS_ISA_REV >= 2 within the scope of the downgrade. The assembly architecture override code pattern has been put there for LL/SC instructions, so that code compiles for configurations that select a processor to build for that does not support these instructions while still providing run-time support for processors that do, dynamically switched by non-constant `cpu_has_llsc'. It went in with linux-mips.org commit aac8aa7717a2 ("Enable a suitable ISA for the assembler around ll/sc so that code builds even for processors that don't support the instructions. Plus minor formatting fixes.") back in 2005. Fix the problem by wrapping these instructions along with the adjacent SYNC instructions only, following the practice established with commit cfd54de3b0e4 ("MIPS: Avoid move psuedo-instruction whilst using MIPS_ISA_LEVEL") and commit 378ed6f0e3c5 ("MIPS: Avoid using .set mips0 to restore ISA"). Strictly speaking the SYNC instructions do not have to be wrapped as they are only used as a Loongson3 erratum workaround, so they will be enabled in the assembler by default, but do this so as to keep code consistent with other places. Reported-by: kernel test robot <lkp@intel.com> Signed-off-by: Maciej W. Rozycki <macro@orcam.me.uk> Fixes: c7e2d71dda7a ("MIPS: Fix set_pte() for Netlogic XLR using cmpxchg64()") Cc: stable@vger.kernel.org # v5.1+ Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-11-18	MIPS: fix *-pkg builds for loongson2ef platform	Masahiro Yamada
	commit 0706f74f719e6e72c3a862ab2990796578fa73cc upstream. Since commit 805b2e1d427a ("kbuild: include Makefile.compiler only when compiler is needed"), package builds for the loongson2f platform fail. $ make ARCH=mips CROSS_COMPILE=mips64-linux- lemote2f_defconfig bindeb-pkg [ snip ] sh ./scripts/package/builddeb arch/mips/loongson2ef//Platform:36: * only binutils >= 2.20.2 have needed option -mfix-loongson2f-nop. Stop. cp: cannot stat '': No such file or directory make[5]: * [scripts/Makefile.package:87: intdeb-pkg] Error 1 make[4]: * [Makefile:1558: intdeb-pkg] Error 2 make[3]: * [debian/rules:13: binary-arch] Error 2 dpkg-buildpackage: error: debian/rules binary subprocess returned exit status 2 make[2]: * [scripts/Makefile.package:83: bindeb-pkg] Error 2 make[1]: * [Makefile:1558: bindeb-pkg] Error 2 make: * [Makefile:350: __build_one_by_one] Error 2 The reason is because "make image_name" fails. $ make ARCH=mips CROSS_COMPILE=mips64-linux- image_name arch/mips/loongson2ef//Platform:36: * only binutils >= 2.20.2 have needed option -mfix-loongson2f-nop. Stop. In general, adding $(error ...) in the parse stage is troublesome, and it is pointless to check toolchains even if we are not building anything. Do not include Kbuild.platform in such cases. Fixes: 805b2e1d427a ("kbuild: include Makefile.compiler only when compiler is needed") Reported-by: Jason Self <jason@bluehome.net> Signed-off-by: Masahiro Yamada <masahiroy@kernel.org> Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-11-18	MIPS: fix duplicated slashes for Platform file path	Masahiro Yamada
	commit cca2aac8acf470b01066f559acd7146fc4c32ae8 upstream. platform-y accumulates platform names with a slash appended. The current $(patsubst ...) ends up with doubling slashes. GNU Make still include Platform files, but in case of an error, a clumsy file path is displayed: arch/mips/loongson2ef//Platform:36: *** only binutils >= 2.20.2 have needed option -mfix-loongson2f-nop. Stop. Signed-off-by: Masahiro Yamada <masahiroy@kernel.org> Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de> Cc: Jason Self <jason@bluehome.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>