bcachefs.git - Unnamed repository; edit this file 'description' to name the repository.

Age	Commit message (Collapse)	Author
2022-06-09	Linux 5.10.121v5.10.121	Greg Kroah-Hartman
	Link: https://lore.kernel.org/r/20220607164908.521895282@linuxfoundation.org Tested-by: Pavel Machek (CIP) <pavel@denx.de> Tested-by: Shuah Khan <skhan@linuxfoundation.org> Tested-by: Sudip Mukherjee <sudip.mukherjee@codethink.co.uk> Tested-by: Fox Chen <foxhlchen@gmail.com> Tested-by: Linux Kernel Functional Testing <lkft@linaro.org> Tested-by: Salvatore Bonaccorso <carnil@debian.org> Tested-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-06-09	md: bcache: check the return value of kzalloc() in detached_dev_do_request()	Jia-Ju Bai
	commit 40f567bbb3b0639d2ec7d1c6ad4b1b018f80cf19 upstream. The function kzalloc() in detached_dev_do_request() can fail, so its return value should be checked. Fixes: bc082a55d25c ("bcache: fix inaccurate io state for detached bcache devices") Reported-by: TOTE Robot <oslab@tsinghua.edu.cn> Signed-off-by: Jia-Ju Bai <baijiaju1990@gmail.com> Signed-off-by: Coly Li <colyli@suse.de> Link: https://lore.kernel.org/r/20220527152818.27545-4-colyli@suse.de Signed-off-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-06-09	ext4: only allow test_dummy_encryption when supported	Eric Biggers
	commit 5f41fdaea63ddf96d921ab36b2af4a90ccdb5744 upstream. Make the test_dummy_encryption mount option require that the encrypt feature flag be already enabled on the filesystem, rather than automatically enabling it. Practically, this means that "-O encrypt" will need to be included in MKFS_OPTIONS when running xfstests with the test_dummy_encryption mount option. (ext4/053 also needs an update.) Moreover, as long as the preconditions for test_dummy_encryption are being tightened anyway, take the opportunity to start rejecting it when !CONFIG_FS_ENCRYPTION rather than ignoring it. The motivation for requiring the encrypt feature flag is that: - Having the filesystem auto-enable feature flags is problematic, as it bypasses the usual sanity checks. The specific issue which came up recently is that in kernel versions where ext4 supports casefold but not encrypt+casefold (v5.1 through v5.10), the kernel will happily add the encrypt flag to a filesystem that has the casefold flag, making it unmountable -- but only for subsequent mounts, not the initial one. This confused the casefold support detection in xfstests, causing generic/556 to fail rather than be skipped. - The xfstests-bld test runners (kvm-xfstests et al.) already use the required mkfs flag, so they will not be affected by this change. Only users of test_dummy_encryption alone will be affected. But, this option has always been for testing only, so it should be fine to require that the few users of this option update their test scripts. - f2fs already requires it (for its equivalent feature flag). Signed-off-by: Eric Biggers <ebiggers@google.com> Reviewed-by: Gabriel Krisman Bertazi <krisman@collabora.com> Link: https://lore.kernel.org/r/20220519204437.61645-1-ebiggers@kernel.org Signed-off-by: Theodore Ts'o <tytso@mit.edu> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-06-09	MIPS: IP30: Remove incorrect `cpu_has_fpu' override	Maciej W. Rozycki
	commit f44b3e74c33fe04defeff24ebcae98c3bcc5b285 upstream. Remove unsupported forcing of `cpu_has_fpu' to 1, which makes the `nofpu' kernel parameter non-functional, and also causes a link error: ld: arch/mips/kernel/traps.o: in function `trap_init': ./arch/mips/include/asm/msa.h:(.init.text+0x348): undefined reference to `handle_fpe' ld: ./arch/mips/include/asm/msa.h:(.init.text+0x354): undefined reference to `handle_fpe' ld: ./arch/mips/include/asm/msa.h:(.init.text+0x360): undefined reference to `handle_fpe' where the CONFIG_MIPS_FP_SUPPORT configuration option has been disabled. Signed-off-by: Maciej W. Rozycki <macro@orcam.me.uk> Reported-by: Stephen Zhang <starzhangzsd@gmail.com> Fixes: 7505576d1c1a ("MIPS: add support for SGI Octane (IP30)") Cc: stable@vger.kernel.org # v5.5+ Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-06-09	MIPS: IP27: Remove incorrect `cpu_has_fpu' override	Maciej W. Rozycki
	commit 424c3781dd1cb401857585331eaaa425a13f2429 upstream. Remove unsupported forcing of `cpu_has_fpu' to 1, which makes the `nofpu' kernel parameter non-functional, and also causes a link error: ld: arch/mips/kernel/traps.o: in function `trap_init': ./arch/mips/include/asm/msa.h:(.init.text+0x348): undefined reference to `handle_fpe' ld: ./arch/mips/include/asm/msa.h:(.init.text+0x354): undefined reference to `handle_fpe' ld: ./arch/mips/include/asm/msa.h:(.init.text+0x360): undefined reference to `handle_fpe' where the CONFIG_MIPS_FP_SUPPORT configuration option has been disabled. Signed-off-by: Maciej W. Rozycki <macro@orcam.me.uk> Reported-by: Stephen Zhang <starzhangzsd@gmail.com> Fixes: 0ebb2f4159af ("MIPS: IP27: Update/restructure CPU overrides") Cc: stable@vger.kernel.org # v4.2+ Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-06-09	RDMA/rxe: Generate a completion for unsupported/invalid opcode	Xiao Yang
	commit 2f917af777011c88e977b9b9a5d00b280d3a59ce upstream. Current rxe_requester() doesn't generate a completion when processing an unsupported/invalid opcode. If rxe driver doesn't support a new opcode (e.g. RDMA Atomic Write) and RDMA library supports it, an application using the new opcode can reproduce this issue. Fix the issue by calling "goto err;". Fixes: 8700e3e7c485 ("Soft RoCE driver") Link: https://lore.kernel.org/r/20220410113513.27537-1-yangx.jy@fujitsu.com Signed-off-by: Xiao Yang <yangx.jy@fujitsu.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-06-09	Revert "random: use static branch for crng_ready()"	Jason A. Donenfeld
	This reverts upstream commit f5bda35fba615ace70a656d4700423fa6c9bebee from stable. It's not essential and will take some time during 5.19 to work out properly. Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2022-06-09	block: fix bio_clone_blkg_association() to associate with proper blkcg_gq	Jan Kara
	commit 22b106e5355d6e7a9c3b5cb5ed4ef22ae585ea94 upstream. Commit d92c370a16cb ("block: really clone the block cgroup in bio_clone_blkg_association") changed bio_clone_blkg_association() to just clone bio->bi_blkg reference from source to destination bio. This is however wrong if the source and destination bios are against different block devices because struct blkcg_gq is different for each bdev-blkcg pair. This will result in IOs being accounted (and throttled as a result) multiple times against the same device (src bdev) while throttling of the other device (dst bdev) is ignored. In case of BFQ the inconsistency can even result in crashes in bfq_bic_update_cgroup(). Fix the problem by looking up correct blkcg_gq for the cloned bio. Reported-by: Logan Gunthorpe <logang@deltatee.com> Reported-and-tested-by: Donald Buczek <buczek@molgen.mpg.de> Fixes: d92c370a16cb ("block: really clone the block cgroup in bio_clone_blkg_association") CC: stable@vger.kernel.org Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jan Kara <jack@suse.cz> Link: https://lore.kernel.org/r/20220602081242.7731-1-jack@suse.cz Signed-off-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-06-09	bfq: Make sure bfqg for which we are queueing requests is online	Jan Kara
	commit 075a53b78b815301f8d3dd1ee2cd99554e34f0dd upstream. Bios queued into BFQ IO scheduler can be associated with a cgroup that was already offlined. This may then cause insertion of this bfq_group into a service tree. But this bfq_group will get freed as soon as last bio associated with it is completed leading to use after free issues for service tree users. Fix the problem by making sure we always operate on online bfq_group. If the bfq_group associated with the bio is not online, we pick the first online parent. CC: stable@vger.kernel.org Fixes: e21b7a0b9887 ("block, bfq: add full hierarchical scheduling and cgroups support") Tested-by: "yukuai (C)" <yukuai3@huawei.com> Signed-off-by: Jan Kara <jack@suse.cz> Reviewed-by: Christoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/r/20220401102752.8599-9-jack@suse.cz Signed-off-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-06-09	bfq: Get rid of __bio_blkcg() usage	Jan Kara
	commit 4e54a2493e582361adc3bfbf06c7d50d19d18837 upstream. BFQ usage of __bio_blkcg() is a relict from the past. Furthermore if bio would not be associated with any blkcg, the usage of __bio_blkcg() in BFQ is prone to races with the task being migrated between cgroups as __bio_blkcg() calls at different places could return different blkcgs. Convert BFQ to the new situation where bio->bi_blkg is initialized in bio_set_dev() and thus practically always valid. This allows us to save blkcg_gq lookup and noticeably simplify the code. CC: stable@vger.kernel.org Fixes: 0fe061b9f03c ("blkcg: fix ref count issue with bio_blkcg() using task_css") Tested-by: "yukuai (C)" <yukuai3@huawei.com> Signed-off-by: Jan Kara <jack@suse.cz> Reviewed-by: Christoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/r/20220401102752.8599-8-jack@suse.cz Signed-off-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-06-09	bfq: Remove pointless bfq_init_rq() calls	Jan Kara
	commit 5f550ede5edf846ecc0067be1ba80514e6fe7f8e upstream. We call bfq_init_rq() from request merging functions where requests we get should have already gone through bfq_init_rq() during insert and anyway we want to do anything only if the request is already tracked by BFQ. So replace calls to bfq_init_rq() with RQ_BFQQ() instead to simply skip requests untracked by BFQ. We move bfq_init_rq() call in bfq_insert_request() a bit earlier to cover request merging and thus can transfer FIFO position in case of a merge. CC: stable@vger.kernel.org Tested-by: "yukuai (C)" <yukuai3@huawei.com> Signed-off-by: Jan Kara <jack@suse.cz> Reviewed-by: Christoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/r/20220401102752.8599-6-jack@suse.cz Signed-off-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-06-09	bfq: Drop pointless unlock-lock pair	Jan Kara
	commit fc84e1f941b91221092da5b3102ec82da24c5673 upstream. In bfq_insert_request() we unlock bfqd->lock only to call trace_block_rq_insert() and then lock bfqd->lock again. This is really pointless since tracing is disabled if we really care about performance and even if the tracepoint is enabled, it is a quick call. CC: stable@vger.kernel.org Tested-by: "yukuai (C)" <yukuai3@huawei.com> Signed-off-by: Jan Kara <jack@suse.cz> Reviewed-by: Christoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/r/20220401102752.8599-5-jack@suse.cz Signed-off-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-06-09	bfq: Avoid merging queues with different parents	Jan Kara
	commit c1cee4ab36acef271be9101590756ed0c0c374d9 upstream. It can happen that the parent of a bfqq changes between the moment we decide two queues are worth to merge (and set bic->stable_merge_bfqq) and the moment bfq_setup_merge() is called. This can happen e.g. because the process submitted IO for a different cgroup and thus bfqq got reparented. It can even happen that the bfqq we are merging with has parent cgroup that is already offline and going to be destroyed in which case the merge can lead to use-after-free issues such as: BUG: KASAN: use-after-free in __bfq_deactivate_entity+0x9cb/0xa50 Read of size 8 at addr ffff88800693c0c0 by task runc:[2:INIT]/10544 CPU: 0 PID: 10544 Comm: runc:[2:INIT] Tainted: G E 5.15.2-0.g5fb85fd-default #1 openSUSE Tumbleweed (unreleased) f1f3b891c72369aebecd2e43e4641a6358867c70 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.14.0-0-g155821a-rebuilt.opensuse.org 04/01/2014 Call Trace: <IRQ> dump_stack_lvl+0x46/0x5a print_address_description.constprop.0+0x1f/0x140 ? __bfq_deactivate_entity+0x9cb/0xa50 kasan_report.cold+0x7f/0x11b ? __bfq_deactivate_entity+0x9cb/0xa50 __bfq_deactivate_entity+0x9cb/0xa50 ? update_curr+0x32f/0x5d0 bfq_deactivate_entity+0xa0/0x1d0 bfq_del_bfqq_busy+0x28a/0x420 ? resched_curr+0x116/0x1d0 ? bfq_requeue_bfqq+0x70/0x70 ? check_preempt_wakeup+0x52b/0xbc0 __bfq_bfqq_expire+0x1a2/0x270 bfq_bfqq_expire+0xd16/0x2160 ? try_to_wake_up+0x4ee/0x1260 ? bfq_end_wr_async_queues+0xe0/0xe0 ? _raw_write_unlock_bh+0x60/0x60 ? _raw_spin_lock_irq+0x81/0xe0 bfq_idle_slice_timer+0x109/0x280 ? bfq_dispatch_request+0x4870/0x4870 __hrtimer_run_queues+0x37d/0x700 ? enqueue_hrtimer+0x1b0/0x1b0 ? kvm_clock_get_cycles+0xd/0x10 ? ktime_get_update_offsets_now+0x6f/0x280 hrtimer_interrupt+0x2c8/0x740 Fix the problem by checking that the parent of the two bfqqs we are merging in bfq_setup_merge() is the same. Link: https://lore.kernel.org/linux-block/20211125172809.GC19572@quack2.suse.cz/ CC: stable@vger.kernel.org Fixes: 430a67f9d616 ("block, bfq: merge bursts of newly-created queues") Tested-by: "yukuai (C)" <yukuai3@huawei.com> Signed-off-by: Jan Kara <jack@suse.cz> Reviewed-by: Christoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/r/20220401102752.8599-2-jack@suse.cz Signed-off-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-06-09	thermal/core: Fix memory leak in the error path	Daniel Lezcano
	commit d44616c6cc3e35eea03ecfe9040edfa2b486a059 upstream. Fix the following error: smatch warnings: drivers/thermal/thermal_core.c:1020 __thermal_cooling_device_register() warn: possible memory leak of 'cdev' by freeing the cdev when exiting the function in the error path. Fixes: 584837618100 ("thermal/drivers/core: Use a char pointer for the cooling device name") Reported-by: kernel test robot <lkp@intel.com> Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Link: https://lore.kernel.org/r/20210319202257.890848-1-daniel.lezcano@linaro.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-06-09	thermal/core: fix a UAF bug in __thermal_cooling_device_register()	Ziyang Xuan
	commit 0a5c26712f963f0500161a23e0ffff8d29f742ab upstream. When device_register() return failed, program will goto out_kfree_type to release 'cdev->device' by put_device(). That will call thermal_release() to free 'cdev'. But the follow-up processes access 'cdev' continually. That trggers the UAF bug. ==================================================================== BUG: KASAN: use-after-free in __thermal_cooling_device_register+0x75b/0xa90 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.13.0-1ubuntu1.1 04/01/2014 Call Trace: dump_stack_lvl+0xe2/0x152 print_address_description.constprop.0+0x21/0x140 ? __thermal_cooling_device_register+0x75b/0xa90 kasan_report.cold+0x7f/0x11b ? __thermal_cooling_device_register+0x75b/0xa90 __thermal_cooling_device_register+0x75b/0xa90 ? memset+0x20/0x40 ? __sanitizer_cov_trace_pc+0x1d/0x50 ? __devres_alloc_node+0x130/0x180 devm_thermal_of_cooling_device_register+0x67/0xf0 max6650_probe.cold+0x557/0x6aa ...... Freed by task 258: kasan_save_stack+0x1b/0x40 kasan_set_track+0x1c/0x30 kasan_set_free_info+0x20/0x30 __kasan_slab_free+0x109/0x140 kfree+0x117/0x4c0 thermal_release+0xa0/0x110 device_release+0xa7/0x240 kobject_put+0x1ce/0x540 put_device+0x20/0x30 __thermal_cooling_device_register+0x731/0xa90 devm_thermal_of_cooling_device_register+0x67/0xf0 max6650_probe.cold+0x557/0x6aa [max6650] Do not use 'cdev' again after put_device() to fix the problem like doing in thermal_zone_device_register(). [dlezcano]: as requested by Rafael, change the affectation into two statements. Fixes: 584837618100 ("thermal/drivers/core: Use a char pointer for the cooling device name") Signed-off-by: Ziyang Xuan <william.xuanziyang@huawei.com> Reported-by: kernel test robot <lkp@intel.com> Link: https://lore.kernel.org/r/20211015024504.947520-1-william.xuanziyang@huawei.com Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-06-09	kseltest/cgroup: Make test_stress.sh work if run interactively	Waiman Long
	commit 213adc63dfbcdff9a0c19ec1f2681fda9c05cf6d upstream. Commit 54de76c01239 ("kselftest/cgroup: fix test_stress.sh to use OUTPUT dir") changes the test_core command path from . to $OUTPUT. However, variable OUTPUT may not be defined if the command is run interactively. Fix that by using ${OUTPUT:-.} to cover both cases. Signed-off-by: Waiman Long <longman@redhat.com> Signed-off-by: Tejun Heo <tj@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-06-09	xfs: assert in xfs_btree_del_cursor should take into account error	Dave Chinner
	commit 56486f307100e8fc66efa2ebd8a71941fa10bf6f upstream. xfs/538 on a 1kB block filesystem failed with this assert: XFS: Assertion failed: cur->bc_btnum != XFS_BTNUM_BMAP \|\| cur->bc_ino.allocated == 0 \|\| xfs_is_shutdown(cur->bc_mp), file: fs/xfs/libxfs/xfs_btree.c, line: 448 The problem was that an allocation failed unexpectedly in xfs_bmbt_alloc_block() after roughly 150,000 minlen allocation error injections, resulting in an EFSCORRUPTED error being returned to xfs_bmapi_write(). The error occurred on extent-to-btree format conversion allocating the new root block: RIP: 0010:xfs_bmbt_alloc_block+0x177/0x210 Call Trace: <TASK> xfs_btree_new_iroot+0xdf/0x520 xfs_btree_make_block_unfull+0x10d/0x1c0 xfs_btree_insrec+0x364/0x790 xfs_btree_insert+0xaa/0x210 xfs_bmap_add_extent_hole_real+0x1fe/0x9a0 xfs_bmapi_allocate+0x34c/0x420 xfs_bmapi_write+0x53c/0x9c0 xfs_alloc_file_space+0xee/0x320 xfs_file_fallocate+0x36b/0x450 vfs_fallocate+0x148/0x340 __x64_sys_fallocate+0x3c/0x70 do_syscall_64+0x35/0x80 entry_SYSCALL_64_after_hwframe+0x44/0xa Why the allocation failed at this point is unknown, but is likely that we ran the transaction out of reserved space and filesystem out of space with bmbt blocks because of all the minlen allocations being done causing worst case fragmentation of a large allocation. Regardless of the cause, we've then called xfs_bmapi_finish() which calls xfs_btree_del_cursor(cur, error) to tear down the cursor. So we have a failed operation, error != 0, cur->bc_ino.allocated > 0 and the filesystem is still up. The assert fails to take into account that allocation can fail with an error and the transaction teardown will shut the filesystem down if necessary. i.e. the assert needs to check "\|\| error != 0" as well, because at this point shutdown is pending because the current transaction is dirty.... Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Dave Chinner <david@fromorbit.com> Signed-off-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-06-09	xfs: consider shutdown in bmapbt cursor delete assert	Brian Foster
	commit 1cd738b13ae9b29e03d6149f0246c61f76e81fcf upstream. The assert in xfs_btree_del_cursor() checks that the bmapbt block allocation field has been handled correctly before the cursor is freed. This field is used for accurate calculation of indirect block reservation requirements (for delayed allocations), for example. generic/019 reproduces a scenario where this assert fails because the filesystem has shutdown while in the middle of a bmbt record insertion. This occurs after a bmbt block has been allocated via the cursor but before the higher level bmap function (i.e. xfs_bmap_add_extent_hole_real()) completes and resets the field. Update the assert to accommodate the transient state if the filesystem has shutdown. While here, clean up the indentation and comments in the function. Signed-off-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-06-09	xfs: force log and push AIL to clear pinned inodes when aborting mount	Darrick J. Wong
	commit d336f7ebc65007f5831e2297e6f3383ae8dbf8ed upstream. If we allocate quota inodes in the process of mounting a filesystem but then decide to abort the mount, it's possible that the quota inodes are sitting around pinned by the log. Now that inode reclaim relies on the AIL to flush inodes, we have to force the log and push the AIL in between releasing the quota inodes and kicking off reclaim to tear down all the incore inodes. Do this by extracting the bits we need from the unmount path and reusing them. As an added bonus, failed writes during a failed mount will not retry forever now. This was originally found during a fuzz test of metadata directories (xfs/1546), but the actual symptom was that reclaim hung up on the quota inodes. Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-06-09	xfs: restore shutdown check in mapped write fault path	Brian Foster
	commit e4826691cc7e5458bcb659935d0092bcf3f08c20 upstream. XFS triggers an iomap warning in the write fault path due to a !PageUptodate() page if a write fault happens to occur on a page that recently failed writeback. The iomap writeback error handling code can clear the Uptodate flag if no portion of the page is submitted for I/O. This is reproduced by fstest generic/019, which combines various forms of I/O with simulated disk failures that inevitably lead to filesystem shutdown (which then unconditionally fails page writeback). This is a regression introduced by commit f150b4234397 ("xfs: split the iomap ops for buffered vs direct writes") due to the removal of a shutdown check and explicit error return in the ->iomap_begin() path used by the write fault path. The explicit error return historically translated to a SIGBUS, but now carries on with iomap processing where it complains about the unexpected state. Restore the shutdown check to xfs_buffered_write_iomap_begin() to restore historical behavior. Fixes: f150b4234397 ("xfs: split the iomap ops for buffered vs direct writes") Signed-off-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Eric Sandeen <sandeen@redhat.com> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-06-09	xfs: fix incorrect root dquot corruption error when switching group/project ↵	Darrick J. Wong
	quota types commit 45068063efb7dd0a8d115c106aa05d9ab0946257 upstream. While writing up a regression test for broken behavior when a chprojid request fails, I noticed that we were logging corruption notices about the root dquot of the group/project quota file at mount time when testing V4 filesystems. In commit afeda6000b0c, I was trying to improve ondisk dquot validation by making sure that when we load an ondisk dquot into memory on behalf of an incore dquot, the dquot id and type matches. Unfortunately, I forgot that V4 filesystems only have two quota files, and can switch that file between group and project quota types at mount time. When we perform that switch, we'll try to load the default quota limits from the root dquot prior to running quotacheck and log a corruption error when the types don't match. This is inconsequential because quotacheck will reset the second quota file as part of doing the switch, but we shouldn't leave scary messages in the kernel log. Fixes: afeda6000b0c ("xfs: validate ondisk/incore dquot flags") Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Chandan Babu R <chandanrlinux@gmail.com> Signed-off-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-06-09	xfs: fix chown leaking delalloc quota blocks when fssetxattr fails	Darrick J. Wong
	commit 1aecf3734a95f3c167d1495550ca57556d33f7ec upstream. While refactoring the quota code to create a function to allocate inode change transactions, I noticed that xfs_qm_vop_chown_reserve does more than just make reservations: it also modifies the incore counts directly to handle the owner id change for the delalloc blocks. I then observed that the fssetxattr code continues validating input arguments after making the quota reservation but before dirtying the transaction. If the routine decides to error out, it fails to undo the accounting switch! This leads to incorrect quota reservation and failure down the line. We can fix this by making the reservation function do only that -- for the new dquot, it reserves ondisk and delalloc blocks to the transaction, and the old dquot hangs on to its incore reservation for now. Once we actually switch the dquots, we can then update the incore reservations because we've dirtied the transaction and it's too late to turn back now. No fixes tag because this has been broken since the start of git. Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Brian Foster <bfoster@redhat.com> Signed-off-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-06-09	xfs: sync lazy sb accounting on quiesce of read-only mounts	Brian Foster
	commit 50d25484bebe94320c49dd1347d3330c7063bbdb upstream. xfs_log_sbcount() syncs the superblock specifically to accumulate the in-core percpu superblock counters and commit them to disk. This is required to maintain filesystem consistency across quiesce (freeze, read-only mount/remount) or unmount when lazy superblock accounting is enabled because individual transactions do not update the superblock directly. This mechanism works as expected for writable mounts, but xfs_log_sbcount() skips the update for read-only mounts. Read-only mounts otherwise still allow log recovery and write out an unmount record during log quiesce. If a read-only mount performs log recovery, it can modify the in-core superblock counters and write an unmount record when the filesystem unmounts without ever syncing the in-core counters. This leaves the filesystem with a clean log but in an inconsistent state with regard to lazy sb counters. Update xfs_log_sbcount() to use the same logic xfs_log_unmount_write() uses to determine when to write an unmount record. This ensures that lazy accounting is always synced before the log is cleaned. Refactor this logic into a new helper to distinguish between a writable filesystem and a writable log. Specifically, the log is writable unless the filesystem is mounted with the norecovery mount option, the underlying log device is read-only, or the filesystem is shutdown. Drop the freeze state check because the update is already allowed during the freezing process and no context calls this function on an already frozen fs. Also, retain the shutdown check in xfs_log_unmount_write() to catch the case where the preceding log force might have triggered a shutdown. Signed-off-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Gao Xiang <hsiangkao@redhat.com> Reviewed-by: Allison Henderson <allison.henderson@oracle.com> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Bill O'Donnell <billodo@redhat.com> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-06-09	xfs: set inode size after creating symlink	Jeffrey Mitchell
	commit 8aa921a95335d0a8c8e2be35a44467e7c91ec3e4 upstream. When XFS creates a new symlink, it writes its size to disk but not to the VFS inode. This causes i_size_read() to return 0 for that symlink until it is re-read from disk, for example when the system is rebooted. I found this inconsistency while protecting directories with eCryptFS. The command "stat path/to/symlink/in/ecryptfs" will report "Size: 0" if the symlink was created after the last reboot on an XFS root. Call i_size_write() in xfs_symlink() Signed-off-by: Jeffrey Mitchell <jeffrey.mitchell@starlab.io> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Brian Foster <bfoster@redhat.com> Signed-off-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-06-09	net: ipa: fix page free in ipa_endpoint_replenish_one()	Alex Elder
	commit 70132763d5d2e94cd185e3aa92ac6a3ba89068fa upstream. Currently the (possibly compound) pages used for receive buffers are freed using __free_pages(). But according to this comment above the definition of that function, that's wrong: If you want to use the page's reference count to decide when to free the allocation, you should allocate a compound page, and use put_page() instead of __free_pages(). Convert the call to __free_pages() in ipa_endpoint_replenish_one() to use put_page() instead. Fixes: 6a606b90153b8 ("net: ipa: allocate transaction in replenish loop") Signed-off-by: Alex Elder <elder@linaro.org> Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-06-09	net: ipa: fix page free in ipa_endpoint_trans_release()	Alex Elder
	commit 155c0c90bca918de6e4327275dfc1d97fd604115 upstream. Currently the (possibly compound) page used for receive buffers are freed using __free_pages(). But according to this comment above the definition of that function, that's wrong: If you want to use the page's reference count to decide when to free the allocation, you should allocate a compound page, and use put_page() instead of __free_pages(). Convert the call to __free_pages() in ipa_endpoint_trans_release() to use put_page() instead. Fixes: ed23f02680caa ("net: ipa: define per-endpoint receive buffer size") Signed-off-by: Alex Elder <elder@linaro.org> Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-06-09	phy: qcom-qmp: fix reset-controller leak on probe errors	Johan Hovold
	commit 4d2900f20edfe541f75756a00deeb2ffe7c66bc1 upstream. Make sure to release the lane reset controller in case of a late probe error (e.g. probe deferral). Note that due to the reset controller being defined in devicetree in "lane" child nodes, devm_reset_control_get_exclusive() cannot be used directly. Fixes: e78f3d15e115 ("phy: qcom-qmp: new qmp phy driver for qcom-chipsets") Cc: stable@vger.kernel.org # 4.12 Cc: Vivek Gautam <vivek.gautam@codeaurora.org> Reviewed-by: Philipp Zabel <p.zabel@pengutronix.de> Signed-off-by: Johan Hovold <johan+linaro@kernel.org> Reviewed-by: Bjorn Andersson <bjorn.andersson@linaro.org> Link: https://lore.kernel.org/r/20220427063243.32576-3-johan+linaro@kernel.org Signed-off-by: Vinod Koul <vkoul@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-06-09	coresight: core: Fix coresight device probe failure issue	Mao Jinlong
	commit 8c1d3f79d9ca48e406b78e90e94cf09a8c076bf2 upstream. It is possibe that probe failure issue happens when the device and its child_device's probe happens at the same time. In coresight_make_links, has_conns_grp is true for parent, but has_conns_grp is false for child device as has_conns_grp is set to true in coresight_create_conns_sysfs_group. The probe of parent device will fail at this condition. Add has_conns_grp check for child device before make the links and make the process from device_register to connection_create be atomic to avoid this probe failure issue. Cc: stable@vger.kernel.org Suggested-by: Suzuki K Poulose <suzuki.poulose@arm.com> Suggested-by: Mike Leach <mike.leach@linaro.org> Signed-off-by: Mao Jinlong <quic_jinlmao@quicinc.com> Link: https://lore.kernel.org/r/20220309142206.15632-1-quic_jinlmao@quicinc.com [ Added Cc stable ] Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-06-09	blk-iolatency: Fix inflight count imbalances and IO hangs on offline	Tejun Heo
	commit 8a177a36da6c54c98b8685d4f914cb3637d53c0d upstream. iolatency needs to track the number of inflight IOs per cgroup. As this tracking can be expensive, it is disabled when no cgroup has iolatency configured for the device. To ensure that the inflight counters stay balanced, iolatency_set_limit() freezes the request_queue while manipulating the enabled counter, which ensures that no IO is in flight and thus all counters are zero. Unfortunately, iolatency_set_limit() isn't the only place where the enabled counter is manipulated. iolatency_pd_offline() can also dec the counter and trigger disabling. As this disabling happens without freezing the q, this can easily happen while some IOs are in flight and thus leak the counts. This can be easily demonstrated by turning on iolatency on an one empty cgroup while IOs are in flight in other cgroups and then removing the cgroup. Note that iolatency shouldn't have been enabled elsewhere in the system to ensure that removing the cgroup disables iolatency for the whole device. The following keeps flipping on and off iolatency on sda: echo +io > /sys/fs/cgroup/cgroup.subtree_control while true; do mkdir -p /sys/fs/cgroup/test echo '8:0 target=100000' > /sys/fs/cgroup/test/io.latency sleep 1 rmdir /sys/fs/cgroup/test sleep 1 done and there's concurrent fio generating direct rand reads: fio --name test --filename=/dev/sda --direct=1 --rw=randread \ --runtime=600 --time_based --iodepth=256 --numjobs=4 --bs=4k while monitoring with the following drgn script: while True: for css in css_for_each_descendant_pre(prog['blkcg_root'].css.address_of_()): for pos in hlist_for_each(container_of(css, 'struct blkcg', 'css').blkg_list): blkg = container_of(pos, 'struct blkcg_gq', 'blkcg_node') pd = blkg.pd[prog['blkcg_policy_iolatency'].plid] if pd.value_() == 0: continue iolat = container_of(pd, 'struct iolatency_grp', 'pd') inflight = iolat.rq_wait.inflight.counter.value_() if inflight: print(f'inflight={inflight} {disk_name(blkg.q.disk).decode("utf-8")} ' f'{cgroup_path(css.cgroup).decode("utf-8")}') time.sleep(1) The monitoring output looks like the following: inflight=1 sda /user.slice inflight=1 sda /user.slice ... inflight=14 sda /user.slice inflight=13 sda /user.slice inflight=17 sda /user.slice inflight=15 sda /user.slice inflight=18 sda /user.slice inflight=17 sda /user.slice inflight=20 sda /user.slice inflight=19 sda /user.slice <- fio stopped, inflight stuck at 19 inflight=19 sda /user.slice inflight=19 sda /user.slice If a cgroup with stuck inflight ends up getting throttled, the throttled IOs will never get issued as there's no completion event to wake it up leading to an indefinite hang. This patch fixes the bug by unifying enable handling into a work item which is automatically kicked off from iolatency_set_min_lat_nsec() which is called from both iolatency_set_limit() and iolatency_pd_offline() paths. Punting to a work item is necessary as iolatency_pd_offline() is called under spinlocks while freezing a request_queue requires a sleepable context. This also simplifies the code reducing LOC sans the comments and avoids the unnecessary freezes which were happening whenever a cgroup's latency target is newly set or cleared. Signed-off-by: Tejun Heo <tj@kernel.org> Cc: Josef Bacik <josef@toxicpanda.com> Cc: Liu Bo <bo.liu@linux.alibaba.com> Fixes: 8c772a9bfc7c ("blk-iolatency: fix IO hang due to negative inflight counter") Cc: stable@vger.kernel.org # v5.0+ Link: https://lore.kernel.org/r/Yn9ScX6Nx2qIiQQi@slm.duckdns.org Signed-off-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-06-09	vdpasim: allow to enable a vq repeatedly	Eugenio Pérez
	commit 242436973831aa97e8ce19533c6c912ea8def31b upstream. Code must be resilient to enable a queue many times. At the moment the queue is resetting so it's definitely not the expected behavior. v2: set vq->ready = 0 at disable. Fixes: 2c53d0f64c06 ("vdpasim: vDPA device simulator") Cc: stable@vger.kernel.org Signed-off-by: Eugenio Pérez <eperezma@redhat.com> Message-Id: <20220519145919.772896-1-eperezma@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Stefano Garzarella <sgarzare@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-06-09	dt-bindings: gpio: altera: correct interrupt-cells	Dinh Nguyen
	commit 3a21c3ac93aff7b4522b152399df8f6a041df56d upstream. update documentation to correctly state the interrupt-cells to be 2. Cc: stable@vger.kernel.org Fixes: 4fd9bbc6e071 ("drivers/gpio: Altera soft IP GPIO driver devicetree binding") Signed-off-by: Dinh Nguyen <dinguyen@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-06-09	docs/conf.py: Cope with removal of language=None in Sphinx 5.0.0	Akira Yokosawa
	commit 627f01eab93d8671d4e4afee9b148f9998d20e7c upstream. One of the changes in Sphinx 5.0.0 [1] says [sic]: 5.0.0 final - #10474: language does not accept None as it value. The default value of language becomes to 'en' now. [1]: https://www.sphinx-doc.org/en/master/changes.html#release-5-0-0-released-may-30-2022 It results in a new warning from Sphinx 5.0.0 [sic]: WARNING: Invalid configuration value found: 'language = None'. Update your configuration to a valid langauge code. Falling back to 'en' (English). Silence the warning by using 'en'. It works with all the Sphinx versions required for building kernel documentation (1.7.9 or later). Signed-off-by: Akira Yokosawa <akiyks@gmail.com> Link: https://lore.kernel.org/r/bd0c2ddc-2401-03cb-4526-79ca664e1cbe@gmail.com Cc: stable@vger.kernel.org Signed-off-by: Jonathan Corbet <corbet@lwn.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-06-09	SMB3: EBADF/EIO errors in rename/open caused by race condition in ↵	Steve French
	smb2_compound_op commit 0a55cf74ffb5d004b93647e4389096880ce37d6b upstream. There is a race condition in smb2_compound_op: after_close: num_rqst++; if (cfile) { cifsFileInfo_put(cfile); // sends SMB2_CLOSE to the server cfile = NULL; This is triggered by smb2_query_path_info operation that happens during revalidate_dentry. In smb2_query_path_info, get_readable_path is called to load the cfile, increasing the reference counter. If in the meantime, this reference becomes the very last, this call to cifsFileInfo_put(cfile) will trigger a SMB2_CLOSE request sent to the server just before sending this compound request – and so then the compound request fails either with EBADF/EIO depending on the timing at the server, because the handle is already closed. In the first scenario, the race seems to be happening between smb2_query_path_info triggered by the rename operation, and between “cleanup” of asynchronous writes – while fsync(fd) likely waits for the asynchronous writes to complete, releasing the writeback structures can happen after the close(fd) call. So the EBADF/EIO errors will pop up if the timing is such that: 1) There are still outstanding references after close(fd) in the writeback structures 2) smb2_query_path_info successfully fetches the cfile, increasing the refcounter by 1 3) All writeback structures release the same cfile, reducing refcounter to 1 4) smb2_compound_op is called with that cfile In the second scenario, the race seems to be similar – here open triggers the smb2_query_path_info operation, and if all other threads in the meantime decrease the refcounter to 1 similarly to the first scenario, again SMB2_CLOSE will be sent to the server just before issuing the compound request. This case is harder to reproduce. See https://bugzilla.samba.org/show_bug.cgi?id=15051 Cc: stable@vger.kernel.org Fixes: 8de9e86c67ba ("cifs: create a helper to find a writeable handle by path name") Signed-off-by: Ondrej Hubsch <ohubsch@purestorage.com> Reviewed-by: Ronnie Sahlberg <lsahlber@redhat.com> Reviewed-by: Paulo Alcantara (SUSE) <pc@cjr.nz> Signed-off-by: Steve French <stfrench@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-06-09	ARM: pxa: maybe fix gpio lookup tables	Arnd Bergmann
	commit 2672a4bff6c03a20d5ae460a091f67ee782c3eff upstream. From inspection I found a couple of GPIO lookups that are listed with device "gpio-pxa", but actually have a number from a different gpio controller. Try to rectify that here, with a guess of what the actual device name is. Acked-by: Robert Jarzmik <robert.jarzmik@free.fr> Reviewed-by: Linus Walleij <linus.walleij@linaro.org> Cc: stable@vger.kernel.org Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-06-09	ARM: dts: s5pv210: Remove spi-cs-high on panel in Aries	Jonathan Bakker
	commit 096f58507374e1293a9e9cff8a1ccd5f37780a20 upstream. Since commit 766c6b63aa04 ("spi: fix client driver breakages when using GPIO descriptors"), the panel has been blank due to an inverted CS GPIO. In order to correct this, drop the spi-cs-high from the panel SPI device. Fixes: 766c6b63aa04 ("spi: fix client driver breakages when using GPIO descriptors") Cc: <stable@vger.kernel.org> Signed-off-by: Jonathan Bakker <xc-racer2@live.ca> Link: https://lore.kernel.org/r/CY4PR04MB05670C771062570E911AF3B4CB1C9@CY4PR04MB0567.namprd04.prod.outlook.com Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-06-09	phy: qcom-qmp: fix struct clk leak on probe errors	Johan Hovold
	commit f0a4bc38a12f5a0cc5ad68670d9480e91e6a94df upstream. Make sure to release the pipe clock reference in case of a late probe error (e.g. probe deferral). Fixes: e78f3d15e115 ("phy: qcom-qmp: new qmp phy driver for qcom-chipsets") Cc: stable@vger.kernel.org # 4.12 Cc: Vivek Gautam <vivek.gautam@codeaurora.org> Reviewed-by: Bjorn Andersson <bjorn.andersson@linaro.org> Signed-off-by: Johan Hovold <johan+linaro@kernel.org> Link: https://lore.kernel.org/r/20220427063243.32576-2-johan+linaro@kernel.org Signed-off-by: Vinod Koul <vkoul@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-06-09	arm64: dts: qcom: ipq8074: fix the sleep clock frequency	Kathiravan T
	commit f607dd767f5d6800ffbdce5b99ba81763b023781 upstream. Sleep clock frequency should be 32768Hz. Lets fix it. Cc: stable@vger.kernel.org Fixes: 41dac73e243d ("arm64: dts: Add ipq8074 SoC and HK01 board support") Link: https://lore.kernel.org/all/e2a447f8-6024-0369-f698-2027b6edcf9e@codeaurora.org/ Signed-off-by: Kathiravan T <quic_kathirav@quicinc.com> Signed-off-by: Bjorn Andersson <bjorn.andersson@linaro.org> Link: https://lore.kernel.org/r/1644581655-11568-1-git-send-email-quic_kathirav@quicinc.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-06-09	gma500: fix an incorrect NULL check on list iterator	Xiaomeng Tong
	commit bdef417d84536715145f6dc9cc3275c46f26295a upstream. The bug is here: return crtc; The list iterator value 'crtc' will always be set and non-NULL by list_for_each_entry(), so it is incorrect to assume that the iterator value will be NULL if the list is empty or no element is found. To fix the bug, return 'crtc' when found, otherwise return NULL. Cc: stable@vger.kernel.org fixes: 89c78134cc54d ("gma500: Add Poulsbo support") Signed-off-by: Xiaomeng Tong <xiam0nd.tong@gmail.com> Signed-off-by: Patrik Jakobsson <patrik.r.jakobsson@gmail.com> Link: https://patchwork.freedesktop.org/patch/msgid/20220327052028.2013-1-xiam0nd.tong@gmail.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-06-09	tilcdc: tilcdc_external: fix an incorrect NULL check on list iterator	Xiaomeng Tong
	commit 8b917cbe38e9b0d002492477a9fc2bfee2412ce4 upstream. The bug is here: if (!encoder) { The list iterator value 'encoder' will always be set and non-NULL by list_for_each_entry(), so it is incorrect to assume that the iterator value will be NULL if the list is empty or no element is found. To fix the bug, use a new variable 'iter' as the list iterator, while use the original variable 'encoder' as a dedicated pointer to point to the found element. Cc: stable@vger.kernel.org Fixes: ec9eab097a500 ("drm/tilcdc: Add drm bridge support for attaching drm bridge drivers") Signed-off-by: Xiaomeng Tong <xiam0nd.tong@gmail.com> Reviewed-by: Jyri Sarha <jyri.sarha@iki.fi> Tested-by: Jyri Sarha <jyri.sarha@iki.fi> Signed-off-by: Jyri Sarha <jyri.sarha@iki.fi> Link: https://patchwork.freedesktop.org/patch/msgid/20220327061516.5076-1-xiam0nd.tong@gmail.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-06-09	serial: pch: don't overwrite xmit->buf[0] by x_char	Jiri Slaby
	commit d9f3af4fbb1d955bbaf872d9e76502f6e3e803cb upstream. When x_char is to be sent, the TX path overwrites whatever is in the circular buffer at offset 0 with x_char and sends it using pch_uart_hal_write(). I don't understand how this was supposed to work if xmit->buf[0] already contained some character. It must have been lost. Remove this whole pop_tx_x() concept and do the work directly in the callers. (Without printing anything using dev_dbg().) Cc: <stable@vger.kernel.org> Fixes: 3c6a483275f4 (Serial: EG20T: add PCH_UART driver) Signed-off-by: Jiri Slaby <jslaby@suse.cz> Link: https://lore.kernel.org/r/20220503080808.28332-1-jslaby@suse.cz Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-06-09	bcache: avoid journal no-space deadlock by reserving 1 journal bucket	Coly Li
	commit 32feee36c30ea06e38ccb8ae6e5c44c6eec790a6 upstream. The journal no-space deadlock was reported time to time. Such deadlock can happen in the following situation. When all journal buckets are fully filled by active jset with heavy write I/O load, the cache set registration (after a reboot) will load all active jsets and inserting them into the btree again (which is called journal replay). If a journaled bkey is inserted into a btree node and results btree node split, new journal request might be triggered. For example, the btree grows one more level after the node split, then the root node record in cache device super block will be upgrade by bch_journal_meta() from bch_btree_set_root(). But there is no space in journal buckets, the journal replay has to wait for new journal bucket to be reclaimed after at least one journal bucket replayed. This is one example that how the journal no-space deadlock happens. The solution to avoid the deadlock is to reserve 1 journal bucket in run time, and only permit the reserved journal bucket to be used during cache set registration procedure for things like journal replay. Then the journal space will never be fully filled, there is no chance for journal no-space deadlock to happen anymore. This patch adds a new member "bool do_reserve" in struct journal, it is inititalized to 0 (false) when struct journal is allocated, and set to 1 (true) by bch_journal_space_reserve() when all initialization done in run_cache_set(). In the run time when journal_reclaim() tries to allocate a new journal bucket, free_journal_buckets() is called to check whether there are enough free journal buckets to use. If there is only 1 free journal bucket and journal->do_reserve is 1 (true), the last bucket is reserved and free_journal_buckets() will return 0 to indicate no free journal bucket. Then journal_reclaim() will give up, and try next time to see whetheer there is free journal bucket to allocate. By this method, there is always 1 jouranl bucket reserved in run time. During the cache set registration, journal->do_reserve is 0 (false), so the reserved journal bucket can be used to avoid the no-space deadlock. Reported-by: Nikhil Kshirsagar <nkshirsagar@gmail.com> Signed-off-by: Coly Li <colyli@suse.de> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20220524102336.10684-5-colyli@suse.de Signed-off-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-06-09	bcache: remove incremental dirty sector counting for bch_sectors_dirty_init()	Coly Li
	commit 80db4e4707e78cb22287da7d058d7274bd4cb370 upstream. After making bch_sectors_dirty_init() being multithreaded, the existing incremental dirty sector counting in bch_root_node_dirty_init() doesn't release btree occupation after iterating 500000 (INIT_KEYS_EACH_TIME) bkeys. Because a read lock is added on btree root node to prevent the btree to be split during the dirty sectors counting, other I/O requester has no chance to gain the write lock even restart bcache_btree(). That is to say, the incremental dirty sectors counting is incompatible to the multhreaded bch_sectors_dirty_init(). We have to choose one and drop another one. In my testing, with 512 bytes random writes, I generate 1.2T dirty data and a btree with 400K nodes. With single thread and incremental dirty sectors counting, it takes 30+ minites to register the backing device. And with multithreaded dirty sectors counting, the backing device registration can be accomplished within 2 minutes. The 30+ minutes V.S. 2- minutes difference makes me decide to keep multithreaded bch_sectors_dirty_init() and drop the incremental dirty sectors counting. This is what this patch does. But INIT_KEYS_EACH_TIME is kept, in sectors_dirty_init_fn() the CPU will be released by cond_resched() after every INIT_KEYS_EACH_TIME keys iterated. This is to avoid the watchdog reports a bogus soft lockup warning. Fixes: b144e45fc576 ("bcache: make bch_sectors_dirty_init() to be multithreaded") Signed-off-by: Coly Li <colyli@suse.de> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20220524102336.10684-4-colyli@suse.de Signed-off-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-06-09	bcache: improve multithreaded bch_sectors_dirty_init()	Coly Li
	commit 4dc34ae1b45fe26e772a44379f936c72623dd407 upstream. Commit b144e45fc576 ("bcache: make bch_sectors_dirty_init() to be multithreaded") makes bch_sectors_dirty_init() to be much faster when counting dirty sectors by iterating all dirty keys in the btree. But it isn't in ideal shape yet, still can be improved. This patch does the following changes to improve current parallel dirty keys iteration on the btree, - Add read lock to root node when multiple threads iterating the btree, to prevent the root node gets split by I/Os from other registered bcache devices. - Remove local variable "char name[32]" and generate kernel thread name string directly when calling kthread_run(). - Allocate "struct bch_dirty_init_state state" directly on stack and avoid the unnecessary dynamic memory allocation for it. - Decrease BCH_DIRTY_INIT_THRD_MAX from 64 to 12 which is enough indeed. - Increase &state->started to count created kernel thread after it succeeds to create. - When wait for all dirty key counting threads to finish, use wait_event() to replace wait_event_interruptible(). With the above changes, the code is more clear, and some potential error conditions are avoided. Fixes: b144e45fc576 ("bcache: make bch_sectors_dirty_init() to be multithreaded") Signed-off-by: Coly Li <colyli@suse.de> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20220524102336.10684-3-colyli@suse.de Signed-off-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-06-09	bcache: improve multithreaded bch_btree_check()	Coly Li
	commit 622536443b6731ec82c563aae7807165adbe9178 upstream. Commit 8e7102273f59 ("bcache: make bch_btree_check() to be multithreaded") makes bch_btree_check() to be much faster when checking all btree nodes during cache device registration. But it isn't in ideal shap yet, still can be improved. This patch does the following thing to improve current parallel btree nodes check by multiple threads in bch_btree_check(), - Add read lock to root node while checking all the btree nodes with multiple threads. Although currently it is not mandatory but it is good to have a read lock in code logic. - Remove local variable 'char name[32]', and generate kernel thread name string directly when calling kthread_run(). - Allocate local variable "struct btree_check_state check_state" on the stack and avoid unnecessary dynamic memory allocation for it. - Reduce BCH_BTR_CHKTHREAD_MAX from 64 to 12 which is enough indeed. - Increase check_state->started to count created kernel thread after it succeeds to create. - When wait for all checking kernel threads to finish, use wait_event() to replace wait_event_interruptible(). With this change, the code is more clear, and some potential error conditions are avoided. Fixes: 8e7102273f59 ("bcache: make bch_btree_check() to be multithreaded") Signed-off-by: Coly Li <colyli@suse.de> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20220524102336.10684-2-colyli@suse.de Signed-off-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-06-09	stm: ltdc: fix two incorrect NULL checks on list iterator	Xiaomeng Tong
	commit 2e6c86be0e57079d1fb6c7c7e5423db096d0548a upstream. The two bugs are here: if (encoder) { if (bridge && bridge->timings) The list iterator value 'encoder/bridge' will always be set and non-NULL by drm_for_each_encoder()/list_for_each_entry(), so it is incorrect to assume that the iterator value will be NULL if the list is empty or no element is found. To fix the bug, use a new variable '*_iter' as the list iterator, while use the old variable 'encoder/bridge' as a dedicated pointer to point to the found element. Cc: stable@vger.kernel.org Fixes: 99e360442f223 ("drm/stm: Fix bus_flags handling") Signed-off-by: Xiaomeng Tong <xiam0nd.tong@gmail.com> Acked-by: Raphael Gallais-Pou <raphael.gallais-pou@foss.st.com> Signed-off-by: Philippe Cornu <philippe.cornu@foss.st.com> Link: https://patchwork.freedesktop.org/patch/msgid/20220327055355.3808-1-xiam0nd.tong@gmail.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-06-09	carl9170: tx: fix an incorrect use of list iterator	Xiaomeng Tong
	commit 54a6f29522da3c914da30e50721dedf51046449a upstream. If the previous list_for_each_entry_continue_rcu() don't exit early (no goto hit inside the loop), the iterator 'cvif' after the loop will be a bogus pointer to an invalid structure object containing the HEAD (&ar->vif_list). As a result, the use of 'cvif' after that will lead to a invalid memory access (i.e., 'cvif->id': the invalid pointer dereference when return back to/after the callsite in the carl9170_update_beacon()). The original intention should have been to return the valid 'cvif' when found in list, NULL otherwise. So just return NULL when no entry found, to fix this bug. Cc: stable@vger.kernel.org Fixes: 1f1d9654e183c ("carl9170: refactor carl9170_update_beacon") Signed-off-by: Xiaomeng Tong <xiam0nd.tong@gmail.com> Acked-by: Christian Lamparter <chunkeey@gmail.com> Signed-off-by: Kalle Valo <quic_kvalo@quicinc.com> Link: https://lore.kernel.org/r/20220328122820.1004-1-xiam0nd.tong@gmail.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-06-09	ASoC: rt5514: Fix event generation for "DSP Voice Wake Up" control	Mark Brown
	commit 4213ff556740bb45e2d9ff0f50d056c4e7dd0921 upstream. The driver has a custom put function for "DSP Voice Wake Up" which does not generate event notifications on change, instead returning 0. Since we already exit early in the case that there is no change this can be fixed by unconditionally returning 1 at the end of the function. Signed-off-by: Mark Brown <broonie@kernel.org> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20220428162444.3883147-1-broonie@kernel.org Signed-off-by: Mark Brown <broonie@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-06-09	rtl818x: Prevent using not initialized queues	Alexander Wetzel
	commit 746285cf81dc19502ab238249d75f5990bd2d231 upstream. Using not existing queues can panic the kernel with rtl8180/rtl8185 cards. Ignore the skb priority for those cards, they only have one tx queue. Pierre Asselin (pa@panix.com) reported the kernel crash in the Gentoo forum: https://forums.gentoo.org/viewtopic-t-1147832-postdays-0-postorder-asc-start-25.html He also confirmed that this patch fixes the issue. In summary this happened: After updating wpa_supplicant from 2.9 to 2.10 the kernel crashed with a "divide error: 0000" when connecting to an AP. Control port tx now tries to use IEEE80211_AC_VO for the priority, which wpa_supplicants starts to use in 2.10. Since only the rtl8187se part of the driver supports QoS, the priority of the skb is set to IEEE80211_AC_BE (2) by mac80211 for rtl8180/rtl8185 cards. rtl8180 is then unconditionally reading out the priority and finally crashes on drivers/net/wireless/realtek/rtl818x/rtl8180/dev.c line 544 without this patch: idx = (ring->idx + skb_queue_len(&ring->queue)) % ring->entries "ring->entries" is zero for rtl8180/rtl8185 cards, tx_ring[2] never got initialized. Cc: stable@vger.kernel.org Reported-by: pa@panix.com Tested-by: pa@panix.com Signed-off-by: Alexander Wetzel <alexander@wetzel-home.de> Signed-off-by: Kalle Valo <kvalo@kernel.org> Link: https://lore.kernel.org/r/20220422145228.7567-1-alexander@wetzel-home.de Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-06-09	xtensa/simdisk: fix proc_read_simdisk()	Yi Yang
	commit b011946d039d66bbc7102137e98cc67e1356aa87 upstream. The commit a69755b18774 ("xtensa simdisk: switch to proc_create_data()") split read operation into two parts, first retrieving the path when it's non-null and second retrieving the trailing '\n'. However when the path is non-null the first simple_read_from_buffer updates ppos, and the second simple_read_from_buffer returns 0 if ppos is greater than 1 (i.e. almost always). As a result reading from that proc file is almost always empty. Fix it by making a temporary copy of the path with the trailing '\n' and using simple_read_from_buffer on that copy. Cc: stable@vger.kernel.org Fixes: a69755b18774 ("xtensa simdisk: switch to proc_create_data()") Signed-off-by: Yi Yang <yiyang13@huawei.com> Signed-off-by: Max Filippov <jcmvbkbc@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-06-09	hugetlb: fix huge_pmd_unshare address update	Mike Kravetz
	commit 48381273f8734d28ef56a5bdf1966dd8530111bc upstream. The routine huge_pmd_unshare() is passed a pointer to an address associated with an area which may be unshared. If unshare is successful this address is updated to 'optimize' callers iterating over huge page addresses. For the optimization to work correctly, address should be updated to the last huge page in the unmapped/unshared area. However, in the common case where the passed address is PUD_SIZE aligned, the address is incorrectly updated to the address of the preceding huge page. That wastes CPU cycles as the unmapped/unshared range is scanned twice. Link: https://lkml.kernel.org/r/20220524205003.126184-1-mike.kravetz@oracle.com Fixes: 39dde65c9940 ("shared page table for hugetlb page") Signed-off-by: Mike Kravetz <mike.kravetz@oracle.com> Acked-by: Muchun Song <songmuchun@bytedance.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>