summaryrefslogtreecommitdiff
path: root/fs
AgeCommit message (Collapse)Author
2023-12-23Merge tag 'char-misc-6.7-rc7' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc Pull char / misc driver fixes from Greg KH: "Here are a small number of various driver fixes for 6.7-rc7 that normally come through the char-misc tree, and one debugfs fix as well. Included in here are: - iio and hid sensor driver fixes for a number of small things - interconnect driver fixes - brcm_nvmem driver fixes - debugfs fix for previous fix - guard() definition in device.h so that many subsystems can start using it for 6.8-rc1 (requested by Dan Williams to make future merges easier) All of these have been in linux-next for a while with no reported issues" * tag 'char-misc-6.7-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc: (21 commits) debugfs: initialize cancellations earlier Revert "iio: hid-sensor-als: Add light color temperature support" Revert "iio: hid-sensor-als: Add light chromaticity support" nvmem: brcm_nvram: store a copy of NVRAM content dt-bindings: nvmem: mxs-ocotp: Document fsl,ocotp driver core: Add a guard() definition for the device_lock() interconnect: qcom: icc-rpm: Fix peak rate calculation iio: adc: MCP3564: fix hardware identification logic iio: adc: MCP3564: fix calib_bias and calib_scale range checks iio: adc: meson: add separate config for axg SoC family iio: adc: imx93: add four channels for imx93 adc iio: adc: ti_am335x_adc: Fix return value check of tiadc_request_dma() interconnect: qcom: sm8250: Enable sync_state iio: triggered-buffer: prevent possible freeing of wrong buffer iio: imu: inv_mpu6050: fix an error code problem in inv_mpu6050_read_raw iio: imu: adis16475: use bit numbers in assign_bit() iio: imu: adis16475: add spi_device_id table iio: tmag5273: fix temperature offset interconnect: Treat xlate() returning NULL node as an error iio: common: ms_sensors: ms_sensors_i2c: fix humidity conversion time table ...
2023-12-22debugfs: initialize cancellations earlierJohannes Berg
Tetsuo Handa pointed out that in the (now reverted) lockdep commit I initialized the data too late. The same is true for the cancellation data, it must be initialized before the cmpxchg(), otherwise it may be done twice and possibly even overwriting data in there already when there's a race. Fix that, which also requires destroying the mutex in case we lost the race. Fixes: 8c88a474357e ("debugfs: add API to allow debugfs operations cancellation") Reported-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Signed-off-by: Johannes Berg <johannes.berg@intel.com> Link: https://lore.kernel.org/r/20231221150444.1e47a0377f80.If7e8ba721ba2956f12c6e8405e7d61e154aa7ae7@changeid Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-12-21afs: Fix use-after-free due to get/remove race in volume treeDavid Howells
When an afs_volume struct is put, its refcount is reduced to 0 before the cell->volume_lock is taken and the volume removed from the cell->volumes tree. Unfortunately, this means that the lookup code can race and see a volume with a zero ref in the tree, resulting in a use-after-free: refcount_t: addition on 0; use-after-free. WARNING: CPU: 3 PID: 130782 at lib/refcount.c:25 refcount_warn_saturate+0x7a/0xda ... RIP: 0010:refcount_warn_saturate+0x7a/0xda ... Call Trace: afs_get_volume+0x3d/0x55 afs_create_volume+0x126/0x1de afs_validate_fc+0xfe/0x130 afs_get_tree+0x20/0x2e5 vfs_get_tree+0x1d/0xc9 do_new_mount+0x13b/0x22e do_mount+0x5d/0x8a __do_sys_mount+0x100/0x12a do_syscall_64+0x3a/0x94 entry_SYSCALL_64_after_hwframe+0x62/0x6a Fix this by: (1) When putting, use a flag to indicate if the volume has been removed from the tree and skip the rb_erase if it has. (2) When looking up, use a conditional ref increment and if it fails because the refcount is 0, replace the node in the tree and set the removal flag. Fixes: 20325960f875 ("afs: Reorganise volume and server trees to be rooted on the cell") Signed-off-by: David Howells <dhowells@redhat.com> Reviewed-by: Jeffrey Altman <jaltman@auristor.com> cc: Marc Dionne <marc.dionne@auristor.com> cc: linux-afs@lists.infradead.org Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2023-12-21afs: Fix overwriting of result of DNS queryDavid Howells
In afs_update_cell(), ret is the result of the DNS lookup and the errors are to be handled by a switch - however, the value gets clobbered in between by setting it to -ENOMEM in case afs_alloc_vlserver_list() fails. Fix this by moving the setting of -ENOMEM into the error handling for OOM failure. Further, only do it if we don't have an alternative error to return. Found by Linux Verification Center (linuxtesting.org) with SVACE. Based on a patch from Anastasia Belova [1]. Fixes: d5c32c89b208 ("afs: Fix cell DNS lookup") Signed-off-by: David Howells <dhowells@redhat.com> Reviewed-by: Jeffrey Altman <jaltman@auristor.com> cc: Anastasia Belova <abelova@astralinux.ru> cc: Marc Dionne <marc.dionne@auristor.com> cc: linux-afs@lists.infradead.org cc: lvc-project@linuxtesting.org Link: https://lore.kernel.org/r/20231221085849.1463-1-abelova@astralinux.ru/ [1] Link: https://lore.kernel.org/r/1700862.1703168632@warthog.procyon.org.uk/ # v1 Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2023-12-21Merge tag 'afs-fixes-20231221' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs Pull AFS fixes from David Howells: "Improve the interaction of arbitrary lookups in the AFS dynamic root that hit DNS lookup failures [1] where kafs behaves differently from openafs and causes some applications to fail that aren't expecting that. Further, negative DNS results aren't getting removed and are causing failures to persist. - Always delete unused (particularly negative) dentries as soon as possible so that they don't prevent future lookups from retrying. - Fix the handling of new-style negative DNS lookups in ->lookup() to make them return ENOENT so that userspace doesn't get confused when stat succeeds but the following open on the looked up file then fails. - Fix key handling so that DNS lookup results are reclaimed almost as soon as they expire rather than sitting round either forever or for an additional 5 mins beyond a set expiry time returning EKEYEXPIRED. They persist for 1s as /bin/ls will do a second stat call if the first fails" Link: https://bugzilla.kernel.org/show_bug.cgi?id=216637 [1] Reviewed-by: Jeffrey Altman <jaltman@auristor.com> * tag 'afs-fixes-20231221' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs: keys, dns: Allow key types (eg. DNS) to be reclaimed immediately on expiry afs: Fix dynamic root lookup DNS check afs: Fix the dynamic root's d_delete to always delete unused dentries
2023-12-21Merge tag 'trace-v6.7-rc6-2' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace Pull tracing fixes from Steven Rostedt: - Fix another kerneldoc warning - Fix eventfs files to inherit the ownership of its parent directory. The dynamic creation of dentries in eventfs did not take into account if the tracefs file system was mounted with a gid/uid, and would still default to the gid/uid of root. This is a regression. - Fix warning when synthetic event testing is enabled along with startup event tracing testing is enabled * tag 'trace-v6.7-rc6-2' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace: tracing / synthetic: Disable events after testing in synth_event_gen_test_init() eventfs: Have event files and directories default to parent uid and gid tracing/synthetic: fix kernel-doc warnings
2023-12-21eventfs: Have event files and directories default to parent uid and gidSteven Rostedt (Google)
Dongliang reported: I found that in the latest version, the nodes of tracefs have been changed to dynamically created. This has caused me to encounter a problem where the gid I specified in the mounting parameters cannot apply to all files, as in the following situation: /data/tmp/events # mount | grep tracefs tracefs on /data/tmp type tracefs (rw,seclabel,relatime,gid=3012) gid 3012 = readtracefs /data/tmp # ls -lh total 0 -r--r----- 1 root readtracefs 0 1970-01-01 08:00 README -r--r----- 1 root readtracefs 0 1970-01-01 08:00 available_events ums9621_1h10:/data/tmp/events # ls -lh total 0 drwxr-xr-x 2 root root 0 2023-12-19 00:56 alarmtimer drwxr-xr-x 2 root root 0 2023-12-19 00:56 asoc It will prevent certain applications from accessing tracefs properly, I try to avoid this issue by making the following modifications. To fix this, have the files created default to taking the ownership of the parent dentry unless the ownership was previously set by the user. Link: https://lore.kernel.org/linux-trace-kernel/1703063706-30539-1-git-send-email-dongliang.cui@unisoc.com/ Link: https://lore.kernel.org/linux-trace-kernel/20231220105017.1489d790@gandalf.local.home Cc: stable@vger.kernel.org Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Cc: Hongyu Jin <hongyu.jin@unisoc.com> Fixes: 28e12c09f5aa0 ("eventfs: Save ownership and mode") Acked-by: Masami Hiramatsu (Google) <mhiramat@kernel.org> Reported-by: Dongliang Cui <cuidongliang390@gmail.com> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2023-12-20Merge tag '6.7-rc6-smb3-client-fixes' of git://git.samba.org/sfrench/cifs-2.6Linus Torvalds
Pull smb client fixes from Steve French: - two multichannel reconnect fixes, one fixing an important refcounting problem that can lead to umount problems - atime fix - five fixes for various potential OOB accesses, including a CVE fix, and two additional fixes for problems pointed out by Robert Morris's fuzzing investigation * tag '6.7-rc6-smb3-client-fixes' of git://git.samba.org/sfrench/cifs-2.6: cifs: do not let cifs_chan_update_iface deallocate channels cifs: fix a pending undercount of srv_count fs: cifs: Fix atime update check smb: client: fix potential OOB in smb2_dump_detail() smb: client: fix potential OOB in cifs_dump_detail() smb: client: fix OOB in smbCalcSize() smb: client: fix OOB in SMB2_query_info_init() smb: client: fix OOB in cifsd when receiving compounded resps
2023-12-20Merge tag 'ovl-fixes-6.7-rc7' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/overlayfs/vfs Pull overlayfs fix from Amir Goldstein: "Fix a regression from this merge window" * tag 'ovl-fixes-6.7-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/overlayfs/vfs: ovl: fix dentry reference leak after changes to underlying layers
2023-12-20Merge tag 'bcachefs-2023-12-19' of https://evilpiepirate.org/git/bcachefsLinus Torvalds
Pull more bcachefs fixes from Kent Overstreet: - Fix a deadlock in the data move path with nocow locks (vs. update in place writes); when trylock failed we were incorrectly waiting for in flight ios to flush. - Fix reporting of NFS file handle length - Fix early error path in bch2_fs_alloc() - list head wasn't being initialized early enough - Make sure correct (hardware accelerated) crc modules get loaded - Fix a rare overflow in the btree split path, when the packed bkey format grows and all the keys have no value (LRU btree). - Fix error handling in the sector allocator This was causing writes to spuriously fail in multidevice setups, and another bug meant that the errors weren't being logged, only reported via fsync. * tag 'bcachefs-2023-12-19' of https://evilpiepirate.org/git/bcachefs: bcachefs: Fix bch2_alloc_sectors_start_trans() error handling bcachefs; guard against overflow in btree node split bcachefs: btree_node_u64s_with_format() takes nr keys bcachefs: print explicit recovery pass message only once bcachefs: improve modprobe support by providing softdeps bcachefs: fix invalid memory access in bch2_fs_alloc() error path bcachefs: Fix determining required file handle length bcachefs: Fix nocow locks deadlock
2023-12-20Merge tag 'nfsd-6.7-2' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux Pull nfsd fixes from Chuck Lever: - Address a few recently-introduced issues * tag 'nfsd-6.7-2' of git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux: SUNRPC: Revert 5f7fc5d69f6e92ec0b38774c387f5cf7812c5806 NFSD: Revert 738401a9bd1ac34ccd5723d69640a4adbb1a4bc0 NFSD: Revert 6c41d9a9bd0298002805758216a9c44e38a8500d nfsd: hold nfsd_mutex across entire netlink operation nfsd: call nfsd_last_thread() before final nfsd_put()
2023-12-20afs: Fix dynamic root lookup DNS checkDavid Howells
In the afs dynamic root directory, the ->lookup() function does a DNS check on the cell being asked for and if the DNS upcall reports an error it will report an error back to userspace (typically ENOENT). However, if a failed DNS upcall returns a new-style result, it will return a valid result, with the status field set appropriately to indicate the type of failure - and in that case, dns_query() doesn't return an error and we let stat() complete with no error - which can cause confusion in userspace as subsequent calls that trigger d_automount then fail with ENOENT. Fix this by checking the status result from a valid dns_query() and returning an error if it indicates a failure. Fixes: bbb4c4323a4d ("dns: Allow the dns resolver to retrieve a server set") Reported-by: Markus Suvanto <markus.suvanto@gmail.com> Closes: https://bugzilla.kernel.org/show_bug.cgi?id=216637 Signed-off-by: David Howells <dhowells@redhat.com> Tested-by: Markus Suvanto <markus.suvanto@gmail.com> cc: Marc Dionne <marc.dionne@auristor.com> cc: linux-afs@lists.infradead.org
2023-12-20afs: Fix the dynamic root's d_delete to always delete unused dentriesDavid Howells
Fix the afs dynamic root's d_delete function to always delete unused dentries rather than only deleting them if they're positive. With things as they stand upstream, negative dentries stemming from failed DNS lookups stick around preventing retries. Fixes: 66c7e1d319a5 ("afs: Split the dynroot stuff out and give it its own ops tables") Signed-off-by: David Howells <dhowells@redhat.com> Tested-by: Markus Suvanto <markus.suvanto@gmail.com> cc: Marc Dionne <marc.dionne@auristor.com> cc: linux-afs@lists.infradead.org
2023-12-19bcachefs: Fix bch2_alloc_sectors_start_trans() error handlingbcachefs-2023-12-19Kent Overstreet
When we fail to allocate because of insufficient open buckets, we don't want to retry from the full set of devices - we just want to retry in blocking mode. But if the retry in blocking mode fails with a different error code, we end up squashing the -BCH_ERR_open_buckets_empty error with an error that makes us thing we won't be able to allocate (insufficient_devices) - which is incorrect when we didn't try to allocate from the full set of devices, and causes the write to fail. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-12-19bcachefs; guard against overflow in btree node splitKent Overstreet
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-12-19bcachefs: btree_node_u64s_with_format() takes nr keysKent Overstreet
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-12-19cifs: do not let cifs_chan_update_iface deallocate channelsShyam Prasad N
cifs_chan_update_iface is meant to check and update the server interface used for a channel when the existing server interface is no longer available. So far, this handler had the code to remove an interface entry even if a new candidate interface is not available. Allowing this leads to several corner cases to handle. This change makes the logic much simpler by not deallocating the current channel interface entry if a new interface is not found to replace it with. Signed-off-by: Shyam Prasad N <sprasad@microsoft.com> Signed-off-by: Steve French <stfrench@microsoft.com>
2023-12-19cifs: fix a pending undercount of srv_countShyam Prasad N
The following commit reverted the changes to ref count the server struct while scheduling a reconnect work: 823342524868 Revert "cifs: reconnect work should have reference on server struct" However, a following change also introduced scheduling of reconnect work, and assumed ref counting. This change fixes that as well. Fixes umount problems like: [73496.157838] CPU: 5 PID: 1321389 Comm: umount Tainted: G W OE 6.7.0-060700rc6-generic #202312172332 [73496.157841] Hardware name: LENOVO 20MAS08500/20MAS08500, BIOS N2CET67W (1.50 ) 12/15/2022 [73496.157843] RIP: 0010:cifs_put_tcp_session+0x17d/0x190 [cifs] [73496.157906] Code: 5d 31 c0 31 d2 31 f6 31 ff c3 cc cc cc cc e8 4a 6e 14 e6 e9 f6 fe ff ff be 03 00 00 00 48 89 d7 e8 78 26 b3 e5 e9 e4 fe ff ff <0f> 0b e9 b1 fe ff ff 66 66 2e 0f 1f 84 00 00 00 00 00 90 90 90 90 [73496.157908] RSP: 0018:ffffc90003bcbcb8 EFLAGS: 00010286 [73496.157911] RAX: 00000000ffffffff RBX: ffff8885830fa800 RCX: 0000000000000000 [73496.157913] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000 [73496.157915] RBP: ffffc90003bcbcc8 R08: 0000000000000000 R09: 0000000000000000 [73496.157917] R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000 [73496.157918] R13: ffff8887d56ba800 R14: 00000000ffffffff R15: ffff8885830fa800 [73496.157920] FS: 00007f1ff0e33800(0000) GS:ffff88887ba80000(0000) knlGS:0000000000000000 [73496.157922] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [73496.157924] CR2: 0000115f002e2010 CR3: 00000003d1e24005 CR4: 00000000003706f0 [73496.157926] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [73496.157928] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [73496.157929] Call Trace: [73496.157931] <TASK> [73496.157933] ? show_regs+0x6d/0x80 [73496.157936] ? __warn+0x89/0x160 [73496.157939] ? cifs_put_tcp_session+0x17d/0x190 [cifs] [73496.157976] ? report_bug+0x17e/0x1b0 [73496.157980] ? handle_bug+0x51/0xa0 [73496.157983] ? exc_invalid_op+0x18/0x80 [73496.157985] ? asm_exc_invalid_op+0x1b/0x20 [73496.157989] ? cifs_put_tcp_session+0x17d/0x190 [cifs] [73496.158023] ? cifs_put_tcp_session+0x1e/0x190 [cifs] [73496.158057] __cifs_put_smb_ses+0x2b5/0x540 [cifs] [73496.158090] ? tconInfoFree+0xc2/0x120 [cifs] [73496.158130] cifs_put_tcon.part.0+0x108/0x2b0 [cifs] [73496.158173] cifs_put_tlink+0x49/0x90 [cifs] [73496.158220] cifs_umount+0x56/0xb0 [cifs] [73496.158258] cifs_kill_sb+0x52/0x60 [cifs] [73496.158306] deactivate_locked_super+0x32/0xc0 [73496.158309] deactivate_super+0x46/0x60 [73496.158311] cleanup_mnt+0xc3/0x170 [73496.158314] __cleanup_mnt+0x12/0x20 [73496.158330] task_work_run+0x5e/0xa0 [73496.158333] exit_to_user_mode_loop+0x105/0x130 [73496.158336] exit_to_user_mode_prepare+0xa5/0xb0 [73496.158338] syscall_exit_to_user_mode+0x29/0x60 [73496.158341] do_syscall_64+0x6c/0xf0 [73496.158344] ? syscall_exit_to_user_mode+0x37/0x60 [73496.158346] ? do_syscall_64+0x6c/0xf0 [73496.158349] ? exit_to_user_mode_prepare+0x30/0xb0 [73496.158353] ? syscall_exit_to_user_mode+0x37/0x60 [73496.158355] ? do_syscall_64+0x6c/0xf0 Reported-by: Robert Morris <rtm@csail.mit.edu> Fixes: 705fc522fe9d ("cifs: handle when server starts supporting multichannel") Signed-off-by: Shyam Prasad N <sprasad@microsoft.com> Signed-off-by: Steve French <stfrench@microsoft.com>
2023-12-19fs: cifs: Fix atime update checkZizhi Wo
Commit 9b9c5bea0b96 ("cifs: do not return atime less than mtime") indicates that in cifs, if atime is less than mtime, some apps will break. Therefore, it introduce a function to compare this two variables in two places where atime is updated. If atime is less than mtime, update it to mtime. However, the patch was handled incorrectly, resulting in atime and mtime being exactly equal. A previous commit 69738cfdfa70 ("fs: cifs: Fix atime update check vs mtime") fixed one place and forgot to fix another. Fix it. Fixes: 9b9c5bea0b96 ("cifs: do not return atime less than mtime") Cc: stable@vger.kernel.org Signed-off-by: Zizhi Wo <wozizhi@huawei.com> Signed-off-by: Steve French <stfrench@microsoft.com>
2023-12-19smb: client: fix potential OOB in smb2_dump_detail()Paulo Alcantara
Validate SMB message with ->check_message() before calling ->calc_smb_size(). This fixes CVE-2023-6610. Reported-by: j51569436@gmail.com Closes: https://bugzilla.kernel.org/show_bug.cgi?id=218219 Cc; stable@vger.kernel.org Signed-off-by: Paulo Alcantara <pc@manguebit.com> Signed-off-by: Steve French <stfrench@microsoft.com>
2023-12-18NFSD: Revert 738401a9bd1ac34ccd5723d69640a4adbb1a4bc0Chuck Lever
There's nothing wrong with this commit, but this is dead code now that nothing triggers a CB_GETATTR callback. It can be re-introduced once the issues with handling conflicting GETATTRs are resolved. Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2023-12-18NFSD: Revert 6c41d9a9bd0298002805758216a9c44e38a8500dChuck Lever
For some reason, the wait_on_bit() in nfsd4_deleg_getattr_conflict() is waiting forever, preventing a clean server shutdown. The requesting client might also hang waiting for a reply to the conflicting GETATTR. Invoking wait_on_bit() in an nfsd thread context is a hazard. The correct fix is to replace this wait_on_bit() call site with a mechanism that defers the conflicting GETATTR until the CB_GETATTR completes or is known to have failed. That will require some surgery and extended testing and it's late in the v6.7-rc cycle, so I'm reverting now in favor of trying again in a subsequent kernel release. This is my fault: I should have recognized the ramifications of calling wait_on_bit() in here before accepting this patch. Thanks to Dai Ngo <dai.ngo@oracle.com> for diagnosing the issue. Reported-by: Wolfgang Walter <linux-nfs@stwm.de> Closes: https://lore.kernel.org/linux-nfs/e3d43ecdad554fbdcaa7181833834f78@stwm.de/ Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2023-12-17bcachefs: print explicit recovery pass message only onceKent Overstreet
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-12-17smb: client: fix potential OOB in cifs_dump_detail()Paulo Alcantara
Validate SMB message with ->check_message() before calling ->calc_smb_size(). Signed-off-by: Paulo Alcantara (SUSE) <pc@manguebit.com> Cc: stable@vger.kernel.org Signed-off-by: Steve French <stfrench@microsoft.com>
2023-12-17smb: client: fix OOB in smbCalcSize()Paulo Alcantara
Validate @smb->WordCount to avoid reading off the end of @smb and thus causing the following KASAN splat: BUG: KASAN: slab-out-of-bounds in smbCalcSize+0x32/0x40 [cifs] Read of size 2 at addr ffff88801c024ec5 by task cifsd/1328 CPU: 1 PID: 1328 Comm: cifsd Not tainted 6.7.0-rc5 #9 Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.16.2-3-gd478f380-rebuilt.opensuse.org 04/01/2014 Call Trace: <TASK> dump_stack_lvl+0x4a/0x80 print_report+0xcf/0x650 ? srso_alias_return_thunk+0x5/0xfbef5 ? srso_alias_return_thunk+0x5/0xfbef5 ? __phys_addr+0x46/0x90 kasan_report+0xd8/0x110 ? smbCalcSize+0x32/0x40 [cifs] ? smbCalcSize+0x32/0x40 [cifs] kasan_check_range+0x105/0x1b0 smbCalcSize+0x32/0x40 [cifs] checkSMB+0x162/0x370 [cifs] ? __pfx_checkSMB+0x10/0x10 [cifs] cifs_handle_standard+0xbc/0x2f0 [cifs] ? srso_alias_return_thunk+0x5/0xfbef5 cifs_demultiplex_thread+0xed1/0x1360 [cifs] ? __pfx_cifs_demultiplex_thread+0x10/0x10 [cifs] ? srso_alias_return_thunk+0x5/0xfbef5 ? lockdep_hardirqs_on_prepare+0x136/0x210 ? __pfx_lock_release+0x10/0x10 ? srso_alias_return_thunk+0x5/0xfbef5 ? mark_held_locks+0x1a/0x90 ? lockdep_hardirqs_on_prepare+0x136/0x210 ? srso_alias_return_thunk+0x5/0xfbef5 ? srso_alias_return_thunk+0x5/0xfbef5 ? __kthread_parkme+0xce/0xf0 ? __pfx_cifs_demultiplex_thread+0x10/0x10 [cifs] kthread+0x18d/0x1d0 ? kthread+0xdb/0x1d0 ? __pfx_kthread+0x10/0x10 ret_from_fork+0x34/0x60 ? __pfx_kthread+0x10/0x10 ret_from_fork_asm+0x1b/0x30 </TASK> This fixes CVE-2023-6606. Reported-by: j51569436@gmail.com Closes: https://bugzilla.kernel.org/show_bug.cgi?id=218218 Cc: stable@vger.kernel.org Signed-off-by: Paulo Alcantara (SUSE) <pc@manguebit.com> Signed-off-by: Steve French <stfrench@microsoft.com>
2023-12-17smb: client: fix OOB in SMB2_query_info_init()Paulo Alcantara
A small CIFS buffer (448 bytes) isn't big enough to hold SMB2_QUERY_INFO request along with user's input data from CIFS_QUERY_INFO ioctl. That is, if the user passed an input buffer > 344 bytes, the client will memcpy() off the end of @req->Buffer in SMB2_query_info_init() thus causing the following KASAN splat: BUG: KASAN: slab-out-of-bounds in SMB2_query_info_init+0x242/0x250 [cifs] Write of size 1023 at addr ffff88801308c5a8 by task a.out/1240 CPU: 1 PID: 1240 Comm: a.out Not tainted 6.7.0-rc4 #5 Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.16.2-3-gd478f380-rebuilt.opensuse.org 04/01/2014 Call Trace: <TASK> dump_stack_lvl+0x4a/0x80 print_report+0xcf/0x650 ? srso_alias_return_thunk+0x5/0xfbef5 ? srso_alias_return_thunk+0x5/0xfbef5 ? srso_alias_return_thunk+0x5/0xfbef5 ? __phys_addr+0x46/0x90 kasan_report+0xd8/0x110 ? SMB2_query_info_init+0x242/0x250 [cifs] ? SMB2_query_info_init+0x242/0x250 [cifs] kasan_check_range+0x105/0x1b0 __asan_memcpy+0x3c/0x60 SMB2_query_info_init+0x242/0x250 [cifs] ? __pfx_SMB2_query_info_init+0x10/0x10 [cifs] ? srso_alias_return_thunk+0x5/0xfbef5 ? smb_rqst_len+0xa6/0xc0 [cifs] smb2_ioctl_query_info+0x4f4/0x9a0 [cifs] ? __pfx_smb2_ioctl_query_info+0x10/0x10 [cifs] ? __pfx_cifsConvertToUTF16+0x10/0x10 [cifs] ? kasan_set_track+0x25/0x30 ? srso_alias_return_thunk+0x5/0xfbef5 ? __kasan_kmalloc+0x8f/0xa0 ? srso_alias_return_thunk+0x5/0xfbef5 ? cifs_strndup_to_utf16+0x12d/0x1a0 [cifs] ? __build_path_from_dentry_optional_prefix+0x19d/0x2d0 [cifs] ? __pfx_smb2_ioctl_query_info+0x10/0x10 [cifs] cifs_ioctl+0x11c7/0x1de0 [cifs] ? __pfx_cifs_ioctl+0x10/0x10 [cifs] ? srso_alias_return_thunk+0x5/0xfbef5 ? rcu_is_watching+0x23/0x50 ? srso_alias_return_thunk+0x5/0xfbef5 ? __rseq_handle_notify_resume+0x6cd/0x850 ? __pfx___schedule+0x10/0x10 ? blkcg_iostat_update+0x250/0x290 ? srso_alias_return_thunk+0x5/0xfbef5 ? ksys_write+0xe9/0x170 __x64_sys_ioctl+0xc9/0x100 do_syscall_64+0x47/0xf0 entry_SYSCALL_64_after_hwframe+0x6f/0x77 RIP: 0033:0x7f893dde49cf Code: 00 48 89 44 24 18 31 c0 48 8d 44 24 60 c7 04 24 10 00 00 00 48 89 44 24 08 48 8d 44 24 20 48 89 44 24 10 b8 10 00 00 00 0f 05 <89> c2 3d 00 f0 ff ff 77 18 48 8b 44 24 18 64 48 2b 04 25 28 00 00 RSP: 002b:00007ffc03ff4160 EFLAGS: 00000246 ORIG_RAX: 0000000000000010 RAX: ffffffffffffffda RBX: 00007ffc03ff4378 RCX: 00007f893dde49cf RDX: 00007ffc03ff41d0 RSI: 00000000c018cf07 RDI: 0000000000000003 RBP: 00007ffc03ff4260 R08: 0000000000000410 R09: 0000000000000001 R10: 00007f893dce7300 R11: 0000000000000246 R12: 0000000000000000 R13: 00007ffc03ff4388 R14: 00007f893df15000 R15: 0000000000406de0 </TASK> Fix this by increasing size of SMB2_QUERY_INFO request buffers and validating input length to prevent other callers from overflowing @req in SMB2_query_info_init() as well. Fixes: f5b05d622a3e ("cifs: add IOCTL for QUERY_INFO passthrough to userspace") Cc: stable@vger.kernel.org Reported-by: Robert Morris <rtm@csail.mit.edu> Signed-off-by: Paulo Alcantara <pc@manguebit.com> Signed-off-by: Steve French <stfrench@microsoft.com>
2023-12-17smb: client: fix OOB in cifsd when receiving compounded respsPaulo Alcantara
Validate next header's offset in ->next_header() so that it isn't smaller than MID_HEADER_SIZE(server) and then standard_receive3() or ->receive() ends up writing off the end of the buffer because 'pdu_length - MID_HEADER_SIZE(server)' wraps up to a huge length: BUG: KASAN: slab-out-of-bounds in _copy_to_iter+0x4fc/0x840 Write of size 701 at addr ffff88800caf407f by task cifsd/1090 CPU: 0 PID: 1090 Comm: cifsd Not tainted 6.7.0-rc4 #5 Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.16.2-3-gd478f380-rebuilt.opensuse.org 04/01/2014 Call Trace: <TASK> dump_stack_lvl+0x4a/0x80 print_report+0xcf/0x650 ? srso_alias_return_thunk+0x5/0xfbef5 ? srso_alias_return_thunk+0x5/0xfbef5 ? __phys_addr+0x46/0x90 kasan_report+0xd8/0x110 ? _copy_to_iter+0x4fc/0x840 ? _copy_to_iter+0x4fc/0x840 kasan_check_range+0x105/0x1b0 __asan_memcpy+0x3c/0x60 _copy_to_iter+0x4fc/0x840 ? srso_alias_return_thunk+0x5/0xfbef5 ? hlock_class+0x32/0xc0 ? srso_alias_return_thunk+0x5/0xfbef5 ? __pfx__copy_to_iter+0x10/0x10 ? srso_alias_return_thunk+0x5/0xfbef5 ? lock_is_held_type+0x90/0x100 ? srso_alias_return_thunk+0x5/0xfbef5 ? __might_resched+0x278/0x360 ? __pfx___might_resched+0x10/0x10 ? srso_alias_return_thunk+0x5/0xfbef5 __skb_datagram_iter+0x2c2/0x460 ? __pfx_simple_copy_to_iter+0x10/0x10 skb_copy_datagram_iter+0x6c/0x110 tcp_recvmsg_locked+0x9be/0xf40 ? __pfx_tcp_recvmsg_locked+0x10/0x10 ? mark_held_locks+0x5d/0x90 ? srso_alias_return_thunk+0x5/0xfbef5 tcp_recvmsg+0xe2/0x310 ? __pfx_tcp_recvmsg+0x10/0x10 ? srso_alias_return_thunk+0x5/0xfbef5 ? srso_alias_return_thunk+0x5/0xfbef5 ? lock_acquire+0x14a/0x3a0 ? srso_alias_return_thunk+0x5/0xfbef5 inet_recvmsg+0xd0/0x370 ? __pfx_inet_recvmsg+0x10/0x10 ? __pfx_lock_release+0x10/0x10 ? do_raw_spin_trylock+0xd1/0x120 sock_recvmsg+0x10d/0x150 cifs_readv_from_socket+0x25a/0x490 [cifs] ? __pfx_cifs_readv_from_socket+0x10/0x10 [cifs] ? srso_alias_return_thunk+0x5/0xfbef5 cifs_read_from_socket+0xb5/0x100 [cifs] ? __pfx_cifs_read_from_socket+0x10/0x10 [cifs] ? __pfx_lock_release+0x10/0x10 ? do_raw_spin_trylock+0xd1/0x120 ? _raw_spin_unlock+0x23/0x40 ? srso_alias_return_thunk+0x5/0xfbef5 ? __smb2_find_mid+0x126/0x230 [cifs] cifs_demultiplex_thread+0xd39/0x1270 [cifs] ? __pfx_cifs_demultiplex_thread+0x10/0x10 [cifs] ? __pfx_lock_release+0x10/0x10 ? srso_alias_return_thunk+0x5/0xfbef5 ? mark_held_locks+0x1a/0x90 ? lockdep_hardirqs_on_prepare+0x136/0x210 ? srso_alias_return_thunk+0x5/0xfbef5 ? srso_alias_return_thunk+0x5/0xfbef5 ? __kthread_parkme+0xce/0xf0 ? __pfx_cifs_demultiplex_thread+0x10/0x10 [cifs] kthread+0x18d/0x1d0 ? kthread+0xdb/0x1d0 ? __pfx_kthread+0x10/0x10 ret_from_fork+0x34/0x60 ? __pfx_kthread+0x10/0x10 ret_from_fork_asm+0x1b/0x30 </TASK> Fixes: 8ce79ec359ad ("cifs: update multiplex loop to handle compounded responses") Cc: stable@vger.kernel.org Reported-by: Robert Morris <rtm@csail.mit.edu> Signed-off-by: Paulo Alcantara (SUSE) <pc@manguebit.com> Signed-off-by: Steve French <stfrench@microsoft.com>
2023-12-17Merge tag 'for-6.7-rc5-tag' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux Pull btrfs fix from David Sterba: "One more fix that verifies that the snapshot source is a root, same check is also done in user space but should be done by the ioctl as well" * tag 'for-6.7-rc5-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux: btrfs: do not allow non subvolume root targets for snapshot
2023-12-17ovl: fix dentry reference leak after changes to underlying layersAmir Goldstein
syzbot excercised the forbidden practice of moving the workdir under lowerdir while overlayfs is mounted and tripped a dentry reference leak. Fixes: c63e56a4a652 ("ovl: do not open/llseek lower file with upper sb_writers held") Reported-and-tested-by: syzbot+8608bb4553edb8c78f41@syzkaller.appspotmail.com Signed-off-by: Amir Goldstein <amir73il@gmail.com>
2023-12-16Merge tag 'trace-v6.7-rc5' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace Pull tracing fixes from Steven Rostedt: - Fix eventfs to check creating new files for events with names greater than NAME_MAX. The eventfs lookup needs to check the return result of simple_lookup(). - Fix the ring buffer to check the proper max data size. Events must be able to fit on the ring buffer sub-buffer, if it cannot, then it fails to be written and the logic to add the event is avoided. The code to check if an event can fit failed to add the possible absolute timestamp which may make the event not be able to fit. This causes the ring buffer to go into an infinite loop trying to find a sub-buffer that would fit the event. Luckily, there's a check that will bail out if it looped over a 1000 times and it also warns. The real fix is not to add the absolute timestamp to an event that is starting at the beginning of a sub-buffer because it uses the sub-buffer timestamp. By avoiding the timestamp at the start of the sub-buffer allows events that pass the first check to always find a sub-buffer that it can fit on. - Have large events that do not fit on a trace_seq to print "LINE TOO BIG" like it does for the trace_pipe instead of what it does now which is to silently drop the output. - Fix a memory leak of forgetting to free the spare page that is saved by a trace instance. - Update the size of the snapshot buffer when the main buffer is updated if the snapshot buffer is allocated. - Fix ring buffer timestamp logic by removing all the places that tried to put the before_stamp back to the write stamp so that the next event doesn't add an absolute timestamp. But each of these updates added a race where by making the two timestamp equal, it was validating the write_stamp so that it can be incorrectly used for calculating the delta of an event. - There's a temp buffer used for printing the event that was using the event data size for allocation when it needed to use the size of the entire event (meta-data and payload data) - For hardening, use "%.*s" for printing the trace_marker output, to limit the amount that is printed by the size of the event. This was discovered by development that added a bug that truncated the '\0' and caused a crash. - Fix a use-after-free bug in the use of the histogram files when an instance is being removed. - Remove a useless update in the rb_try_to_discard of the write_stamp. The before_stamp was already changed to force the next event to add an absolute timestamp that the write_stamp is not used. But the write_stamp is modified again using an unneeded 64-bit cmpxchg. - Fix several races in the 32-bit implementation of the rb_time_cmpxchg() that does a 64-bit cmpxchg. - While looking at fixing the 64-bit cmpxchg, I noticed that because the ring buffer uses normal cmpxchg, and this can be done in NMI context, there's some architectures that do not have a working cmpxchg in NMI context. For these architectures, fail recording events that happen in NMI context. * tag 'trace-v6.7-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace: ring-buffer: Do not record in NMI if the arch does not support cmpxchg in NMI ring-buffer: Have rb_time_cmpxchg() set the msb counter too ring-buffer: Fix 32-bit rb_time_read() race with rb_time_cmpxchg() ring-buffer: Fix a race in rb_time_cmpxchg() for 32 bit archs ring-buffer: Remove useless update to write_stamp in rb_try_to_discard() ring-buffer: Do not try to put back write_stamp tracing: Fix uaf issue when open the hist or hist_debug file tracing: Add size check when printing trace_marker output ring-buffer: Have saved event hold the entire event ring-buffer: Do not update before stamp when switching sub-buffers tracing: Update snapshot buffer on resize if it is allocated ring-buffer: Fix memory leak of free page eventfs: Fix events beyond NAME_MAX blocking tasks tracing: Have large events show up as '[LINE TOO BIG]' instead of nothing ring-buffer: Fix writing to the buffer with max_data_size
2023-12-15btrfs: do not allow non subvolume root targets for snapshotJosef Bacik
Our btrfs subvolume snapshot <source> <destination> utility enforces that <source> is the root of the subvolume, however this isn't enforced in the kernel. Update the kernel to also enforce this limitation to avoid problems with other users of this ioctl that don't have the appropriate checks in place. Reported-by: Martin Michaelis <code@mgjm.de> CC: stable@vger.kernel.org # 4.14+ Reviewed-by: Neal Gompa <neal@gompa.dev> Signed-off-by: Josef Bacik <josef@toxicpanda.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2023-12-15cred: get rid of CONFIG_DEBUG_CREDENTIALSJens Axboe
This code is rarely (never?) enabled by distros, and it hasn't caught anything in decades. Let's kill off this legacy debug code. Suggested-by: Linus Torvalds <torvalds@linuxfoundation.org> Signed-off-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2023-12-15nfsd: hold nfsd_mutex across entire netlink operationNeilBrown
Rather than using svc_get() and svc_put() to hold a stable reference to the nfsd_svc for netlink lookups, simply hold the mutex for the entire time. The "entire" time isn't very long, and the mutex is not often contented. This makes way for us to remove the refcounts of svc, which is more confusing than useful. Reported-by: Jeff Layton <jlayton@kernel.org> Closes: https://lore.kernel.org/linux-nfs/5d9bbb599569ce29f16e4e0eef6b291eda0f375b.camel@kernel.org/T/#u Fixes: bd9d6a3efa97 ("NFSD: add rpc_status netlink support") Signed-off-by: NeilBrown <neilb@suse.de> Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2023-12-15nfsd: call nfsd_last_thread() before final nfsd_put()NeilBrown
If write_ports_addfd or write_ports_addxprt fail, they call nfsd_put() without calling nfsd_last_thread(). This leaves nn->nfsd_serv pointing to a structure that has been freed. So remove 'static' from nfsd_last_thread() and call it when the nfsd_serv is about to be destroyed. Fixes: ec52361df99b ("SUNRPC: stop using ->sv_nrthreads as a refcount") Signed-off-by: NeilBrown <neilb@suse.de> Reviewed-by: Jeff Layton <jlayton@kernel.org> Cc: <stable@vger.kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2023-12-14Merge tag '6.7-rc5-smb3-client-fixes' of git://git.samba.org/sfrench/cifs-2.6Linus Torvalds
Pull smb client fixes from Steve French: "Address OOBs and NULL dereference found by Dr. Morris's recent analysis and fuzzing. All marked for stable as well" * tag '6.7-rc5-smb3-client-fixes' of git://git.samba.org/sfrench/cifs-2.6: smb: client: fix OOB in smb2_query_reparse_point() smb: client: fix NULL deref in asn1_ber_decoder() smb: client: fix potential OOBs in smb2_parse_contexts() smb: client: fix OOB in receive_encrypted_standard()
2023-12-14bcachefs: improve modprobe support by providing softdepsDaniel Hill
We need to help modprobe load architecture specific modules so we don't fall back to generic software implementations, this should help performance when building as a module. Signed-off-by: Daniel Hill <daniel@gluo.nz> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-12-14bcachefs: fix invalid memory access in bch2_fs_alloc() error pathThomas Bertschinger
When bch2_fs_alloc() gets an error before calling bch2_fs_btree_iter_init(), bch2_fs_btree_iter_exit() makes an invalid memory access because btree_trans_list is uninitialized. Signed-off-by: Thomas Bertschinger <tahbertschinger@gmail.com> Fixes: 6bd68ec266ad ("bcachefs: Heap allocate btree_trans") Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-12-14Merge tag 'for-6.7-rc5-tag' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux Pull btrfs fixes from David Sterba: "Some fixes to quota accounting code, mostly around error handling and correctness: - free reserves on various error paths, after IO errors or transaction abort - don't clear reserved range at the folio release time, it'll be properly cleared after final write - fix integer overflow due to int used when passing around size of freed reservations - fix a regression in squota accounting that missed some cases with delayed refs" * tag 'for-6.7-rc5-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux: btrfs: ensure releasing squota reserve on head refs btrfs: don't clear qgroup reserved bit in release_folio btrfs: free qgroup pertrans reserve on transaction abort btrfs: fix qgroup_free_reserved_data int overflow btrfs: free qgroup reserve when ORDERED_IOERR is set
2023-12-13Merge tag 'pull-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfsLinus Torvalds
Pull ufs fix from Al Viro: "ufs got broken this merge window on folio conversion - calling conventions for filemap_lock_folio() are not the same as for find_lock_page()" * tag 'pull-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: fix ufs_get_locked_folio() breakage
2023-12-13bcachefs: Fix determining required file handle lengthJan Kara
The ->encode_fh method is responsible for setting amount of space required for storing the file handle if not enough space was provided. bch2_encode_fh() was not setting required length in that case which breaks e.g. fanotify. Fix it. Reported-by: Petr Vorel <pvorel@suse.cz> Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-12-13fix ufs_get_locked_folio() breakageAl Viro
filemap_lock_folio() returns ERR_PTR(-ENOENT) if the thing is not in cache - not NULL like find_lock_page() used to. Fixes: 5fb7bd50b351 "ufs: add ufs_get_locked_folio and ufs_put_locked_folio" Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2023-12-12eventfs: Fix events beyond NAME_MAX blocking tasksBeau Belgrave
Eventfs uses simple_lookup(), however, it will fail if the name of the entry is beyond NAME_MAX length. When this error is encountered, eventfs still tries to create dentries instead of skipping the dentry creation. When the dentry is attempted to be created in this state d_wait_lookup() will loop forever, waiting for the lookup to be removed. Fix eventfs to return the error in simple_lookup() back to the caller instead of continuing to try to create the dentry. Link: https://lore.kernel.org/linux-trace-kernel/20231210213534.497-1-beaub@linux.microsoft.com Fixes: 63940449555e ("eventfs: Implement eventfs lookup, read, open functions") Link: https://lore.kernel.org/linux-trace-kernel/20231208183601.GA46-beaub@linux.microsoft.com/ Signed-off-by: Beau Belgrave <beaub@linux.microsoft.com> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2023-12-12Merge tag 'ext4_for_linus-6.7-rc6' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4 Pull ext4 fixes from Ted Ts'o: "Fix various bugs / regressions for ext4, including a soft lockup, a WARN_ON, and a BUG" * tag 'ext4_for_linus-6.7-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4: jbd2: fix soft lockup in journal_finish_inode_data_buffers() ext4: fix warning in ext4_dio_write_end_io() jbd2: increase the journal IO's priority jbd2: correct the printing of write_flags in jbd2_write_superblock() ext4: prevent the normalized size from exceeding EXT_MAX_BLOCKS
2023-12-12Merge tag 'fuse-fixes-6.7-rc6' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse Pull fuse fixes from Miklos Szeredi: - Fix a couple of potential crashes, one introduced in 6.6 and one in 5.10 - Fix misbehavior of virtiofs submounts on memory pressure - Clarify naming in the uAPI for a recent feature * tag 'fuse-fixes-6.7-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse: fuse: disable FOPEN_PARALLEL_DIRECT_WRITES with FUSE_DIRECT_IO_ALLOW_MMAP fuse: dax: set fc->dax to NULL in fuse_dax_conn_free() fuse: share lookup state between submount and its parent docs/fuse-io: Document the usage of DIRECT_IO_ALLOW_MMAP fuse: Rename DIRECT_IO_RELAX to DIRECT_IO_ALLOW_MMAP
2023-12-12Merge tag '6.7-rc5-ksmbd-server-fixes' of git://git.samba.org/ksmbdLinus Torvalds
Pull smb server fixes from Steve French: - Memory leak fix (in lock error path) - Two fixes for create with allocation size - FIx for potential UAF in lease break error path - Five directory lease (caching) fixes found during additional recent testing * tag '6.7-rc5-ksmbd-server-fixes' of git://git.samba.org/ksmbd: ksmbd: fix wrong name of SMB2_CREATE_ALLOCATION_SIZE ksmbd: fix wrong allocation size update in smb2_open() ksmbd: avoid duplicate opinfo_put() call on error of smb21_lease_break_ack() ksmbd: lazy v2 lease break on smb2_write() ksmbd: send v2 lease break notification for directory ksmbd: downgrade RWH lease caching state to RH for directory ksmbd: set v2 lease capability ksmbd: set epoch in create context v2 lease ksmbd: fix memory leak in smb2_lock()
2023-12-12jbd2: fix soft lockup in journal_finish_inode_data_buffers()Ye Bin
There's issue when do io test: WARN: soft lockup - CPU#45 stuck for 11s! [jbd2/dm-2-8:4170] CPU: 45 PID: 4170 Comm: jbd2/dm-2-8 Kdump: loaded Tainted: G OE Call trace: dump_backtrace+0x0/0x1a0 show_stack+0x24/0x30 dump_stack+0xb0/0x100 watchdog_timer_fn+0x254/0x3f8 __hrtimer_run_queues+0x11c/0x380 hrtimer_interrupt+0xfc/0x2f8 arch_timer_handler_phys+0x38/0x58 handle_percpu_devid_irq+0x90/0x248 generic_handle_irq+0x3c/0x58 __handle_domain_irq+0x68/0xc0 gic_handle_irq+0x90/0x320 el1_irq+0xcc/0x180 queued_spin_lock_slowpath+0x1d8/0x320 jbd2_journal_commit_transaction+0x10f4/0x1c78 [jbd2] kjournald2+0xec/0x2f0 [jbd2] kthread+0x134/0x138 ret_from_fork+0x10/0x18 Analyzed informations from vmcore as follows: (1) There are about 5k+ jbd2_inode in 'commit_transaction->t_inode_list'; (2) Now is processing the 855th jbd2_inode; (3) JBD2 task has TIF_NEED_RESCHED flag; (4) There's no pags in address_space around the 855th jbd2_inode; (5) There are some process is doing drop caches; (6) Mounted with 'nodioread_nolock' option; (7) 128 CPUs; According to informations from vmcore we know 'journal->j_list_lock' spin lock competition is fierce. So journal_finish_inode_data_buffers() maybe process slowly. Theoretically, there is scheduling point in the filemap_fdatawait_range_keep_errors(). However, if inode's address_space has no pages which taged with PAGECACHE_TAG_WRITEBACK, will not call cond_resched(). So may lead to soft lockup. journal_finish_inode_data_buffers filemap_fdatawait_range_keep_errors __filemap_fdatawait_range while (index <= end) nr_pages = pagevec_lookup_range_tag(&pvec, mapping, &index, end, PAGECACHE_TAG_WRITEBACK); if (!nr_pages) break; --> If 'nr_pages' is equal zero will break, then will not call cond_resched() for (i = 0; i < nr_pages; i++) wait_on_page_writeback(page); cond_resched(); To solve above issue, add scheduling point in the journal_finish_inode_data_buffers(); Signed-off-by: Ye Bin <yebin10@huawei.com> Reviewed-by: Jan Kara <jack@suse.cz> Link: https://lore.kernel.org/r/20231211112544.3879780-1-yebin10@huawei.com Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2023-12-11bcachefs: Fix nocow locks deadlockKent Overstreet
On trylock failure we were waiting for outstanding reads to complete - but nocow locks need to be held until the whole move is finished. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-12-11Merge tag 'bcachefs-2023-12-10' of https://evilpiepirate.org/git/bcachefsLinus Torvalds
Pull more bcachefs bugfixes from Kent Overstreet: - Fix a rare emergency shutdown path bug: dropping journal pins after the filesystem has mostly been torn down is not what we want. - Fix some concurrency issues with the btree write buffer and journal replay by not using the btree write buffer until journal replay is finished - A fixup from the prior patch to kill journal pre-reservations: at the start of the btree update path, where previously we took a pre-reservation, we do at least want to check the journal watermark. - Fix a race between dropping device metadata and btree node writes, which would re-add a pointer to a device that had just been dropped - Fix one of the SCRU lock warnings, in bch2_compression_stats_to_text(). - Partial fix for a rare transaction paths overflow, when indirect extents had been split by background tasks, by not running certain triggers when they're not needed. - Fix for creating a snapshot with implicit source in a subdirectory of the containing subvolume - Don't unfreeze when we're emergency read-only - Fix for rebalance spinning trying to compress unwritten extentns - Another deleted_inodes fix, for directories - Fix a rare deadlock (usually just an unecessary wait) when flushing the journal with an open journal entry. * tag 'bcachefs-2023-12-10' of https://evilpiepirate.org/git/bcachefs: bcachefs: Close journal entry if necessary when flushing all pins bcachefs: Fix uninitialized var in bch2_journal_replay() bcachefs: Fix deleted inode check for dirs bcachefs: rebalance shouldn't attempt to compress unwritten extents bcachefs: don't attempt rw on unfreeze when shutdown bcachefs: Fix creating snapshot with implict source bcachefs: Don't run indirect extent trigger unless inserting/deleting bcachefs: Convert compression_stats to for_each_btree_key2 bcachefs: Fix bch2_extent_drop_ptrs() call bcachefs: Fix a journal deadlock in replay bcachefs; Don't use btree write buffer until journal replay is finished bcachefs: Don't drop journal pins in exit path
2023-12-11afs: Fix refcount underflow from error handling raceDavid Howells
If an AFS cell that has an unreachable (eg. ENETUNREACH) server listed (VL server or fileserver), an asynchronous probe to one of its addresses may fail immediately because sendmsg() returns an error. When this happens, a refcount underflow can happen if certain events hit a very small window. The way this occurs is: (1) There are two levels of "call" object, the afs_call and the rxrpc_call. Each of them can be transitioned to a "completed" state in the event of success or failure. (2) Asynchronous afs_calls are self-referential whilst they are active to prevent them from evaporating when they're not being processed. This reference is disposed of when the afs_call is completed. Note that an afs_call may only be completed once; once completed completing it again will do nothing. (3) When a call transmission is made, the app-side rxrpc code queues a Tx buffer for the rxrpc I/O thread to transmit. The I/O thread invokes sendmsg() to transmit it - and in the case of failure, it transitions the rxrpc_call to the completed state. (4) When an rxrpc_call is completed, the app layer is notified. In this case, the app is kafs and it schedules a work item to process events pertaining to an afs_call. (5) When the afs_call event processor is run, it goes down through the RPC-specific handler to afs_extract_data() to retrieve data from rxrpc - and, in this case, it picks up the error from the rxrpc_call and returns it. The error is then propagated to the afs_call and that is completed too. At this point the self-reference is released. (6) If the rxrpc I/O thread manages to complete the rxrpc_call within the window between rxrpc_send_data() queuing the request packet and checking for call completion on the way out, then rxrpc_kernel_send_data() will return the error from sendmsg() to the app. (7) Then afs_make_call() will see an error and will jump to the error handling path which will attempt to clean up the afs_call. (8) The problem comes when the error handling path in afs_make_call() tries to unconditionally drop an async afs_call's self-reference. This self-reference, however, may already have been dropped by afs_extract_data() completing the afs_call (9) The refcount underflows when we return to afs_do_probe_vlserver() and that tries to drop its reference on the afs_call. Fix this by making afs_make_call() attempt to complete the afs_call rather than unconditionally putting it. That way, if afs_extract_data() manages to complete the call first, afs_make_call() won't do anything. The bug can be forced by making do_udp_sendmsg() return -ENETUNREACH and sticking an msleep() in rxrpc_send_data() after the 'success:' label to widen the race window. The error message looks something like: refcount_t: underflow; use-after-free. WARNING: CPU: 3 PID: 720 at lib/refcount.c:28 refcount_warn_saturate+0xba/0x110 ... RIP: 0010:refcount_warn_saturate+0xba/0x110 ... afs_put_call+0x1dc/0x1f0 [kafs] afs_fs_get_capabilities+0x8b/0xe0 [kafs] afs_fs_probe_fileserver+0x188/0x1e0 [kafs] afs_lookup_server+0x3bf/0x3f0 [kafs] afs_alloc_server_list+0x130/0x2e0 [kafs] afs_create_volume+0x162/0x400 [kafs] afs_get_tree+0x266/0x410 [kafs] vfs_get_tree+0x25/0xc0 fc_mount+0xe/0x40 afs_d_automount+0x1b3/0x390 [kafs] __traverse_mounts+0x8f/0x210 step_into+0x340/0x760 path_openat+0x13a/0x1260 do_filp_open+0xaf/0x160 do_sys_openat2+0xaf/0x170 or something like: refcount_t: underflow; use-after-free. ... RIP: 0010:refcount_warn_saturate+0x99/0xda ... afs_put_call+0x4a/0x175 afs_send_vl_probes+0x108/0x172 afs_select_vlserver+0xd6/0x311 afs_do_cell_detect_alias+0x5e/0x1e9 afs_cell_detect_alias+0x44/0x92 afs_validate_fc+0x9d/0x134 afs_get_tree+0x20/0x2e6 vfs_get_tree+0x1d/0xc9 fc_mount+0xe/0x33 afs_d_automount+0x48/0x9d __traverse_mounts+0xe0/0x166 step_into+0x140/0x274 open_last_lookups+0x1c1/0x1df path_openat+0x138/0x1c3 do_filp_open+0x55/0xb4 do_sys_openat2+0x6c/0xb6 Fixes: 34fa47612bfe ("afs: Fix race in async call refcounting") Reported-by: Bill MacAllister <bill@ca-zephyr.org> Closes: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1052304 Suggested-by: Jeffrey E Altman <jaltman@auristor.com> Signed-off-by: David Howells <dhowells@redhat.com> Reviewed-by: Jeffrey Altman <jaltman@auristor.com> cc: Marc Dionne <marc.dionne@auristor.com> cc: linux-afs@lists.infradead.org Link: https://lore.kernel.org/r/2633992.1702073229@warthog.procyon.org.uk/ # v1 Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2023-12-11smb: client: fix OOB in smb2_query_reparse_point()Paulo Alcantara
Validate @ioctl_rsp->OutputOffset and @ioctl_rsp->OutputCount so that their sum does not wrap to a number that is smaller than @reparse_buf and we end up with a wild pointer as follows: BUG: unable to handle page fault for address: ffff88809c5cd45f #PF: supervisor read access in kernel mode #PF: error_code(0x0000) - not-present page PGD 4a01067 P4D 4a01067 PUD 0 Oops: 0000 [#1] PREEMPT SMP NOPTI CPU: 2 PID: 1260 Comm: mount.cifs Not tainted 6.7.0-rc4 #2 Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.16.2-3-gd478f380-rebuilt.opensuse.org 04/01/2014 RIP: 0010:smb2_query_reparse_point+0x3e0/0x4c0 [cifs] Code: ff ff e8 f3 51 fe ff 41 89 c6 58 5a 45 85 f6 0f 85 14 fe ff ff 49 8b 57 48 8b 42 60 44 8b 42 64 42 8d 0c 00 49 39 4f 50 72 40 <8b> 04 02 48 8b 9d f0 fe ff ff 49 8b 57 50 89 03 48 8b 9d e8 fe ff RSP: 0018:ffffc90000347a90 EFLAGS: 00010212 RAX: 000000008000001f RBX: ffff88800ae11000 RCX: 00000000000000ec RDX: ffff88801c5cd440 RSI: 0000000000000000 RDI: ffffffff82004aa4 RBP: ffffc90000347bb0 R08: 00000000800000cd R09: 0000000000000001 R10: 0000000000000000 R11: 0000000000000024 R12: ffff8880114d4100 R13: ffff8880114d4198 R14: 0000000000000000 R15: ffff8880114d4000 FS: 00007f02c07babc0(0000) GS:ffff88806ba00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: ffff88809c5cd45f CR3: 0000000011750000 CR4: 0000000000750ef0 PKRU: 55555554 Call Trace: <TASK> ? __die+0x23/0x70 ? page_fault_oops+0x181/0x480 ? search_module_extables+0x19/0x60 ? srso_alias_return_thunk+0x5/0xfbef5 ? exc_page_fault+0x1b6/0x1c0 ? asm_exc_page_fault+0x26/0x30 ? _raw_spin_unlock_irqrestore+0x44/0x60 ? smb2_query_reparse_point+0x3e0/0x4c0 [cifs] cifs_get_fattr+0x16e/0xa50 [cifs] ? srso_alias_return_thunk+0x5/0xfbef5 ? lock_acquire+0xbf/0x2b0 cifs_root_iget+0x163/0x5f0 [cifs] cifs_smb3_do_mount+0x5bd/0x780 [cifs] smb3_get_tree+0xd9/0x290 [cifs] vfs_get_tree+0x2c/0x100 ? capable+0x37/0x70 path_mount+0x2d7/0xb80 ? srso_alias_return_thunk+0x5/0xfbef5 ? _raw_spin_unlock_irqrestore+0x44/0x60 __x64_sys_mount+0x11a/0x150 do_syscall_64+0x47/0xf0 entry_SYSCALL_64_after_hwframe+0x6f/0x77 RIP: 0033:0x7f02c08d5b1e Fixes: 2e4564b31b64 ("smb3: add support for stat of WSL reparse points for special file types") Cc: stable@vger.kernel.org Reported-by: Robert Morris <rtm@csail.mit.edu> Signed-off-by: Paulo Alcantara (SUSE) <pc@manguebit.com> Signed-off-by: Steve French <stfrench@microsoft.com>