summaryrefslogtreecommitdiff
path: root/fs
AgeCommit message (Collapse)Author
2020-10-13f2fs: don't issue flush in f2fs_flush_device_cache() for nobarrier caseChao Yu
This patch changes f2fs_flush_device_cache() to skip issuing flush for nobarrier case. Signed-off-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2020-10-13f2fs: handle errors of f2fs_get_meta_page_nofailJaegeuk Kim
First problem is we hit BUG_ON() in f2fs_get_sum_page given EIO on f2fs_get_meta_page_nofail(). Quick fix was not to give any error with infinite loop, but syzbot caught a case where it goes to that loop from fuzzed image. In turned out we abused f2fs_get_meta_page_nofail() like in the below call stack. - f2fs_fill_super - f2fs_build_segment_manager - build_sit_entries - get_current_sit_page INFO: task syz-executor178:6870 can't die for more than 143 seconds. task:syz-executor178 state:R stack:26960 pid: 6870 ppid: 6869 flags:0x00004006 Call Trace: Showing all locks held in the system: 1 lock held by khungtaskd/1179: #0: ffffffff8a554da0 (rcu_read_lock){....}-{1:2}, at: debug_show_all_locks+0x53/0x260 kernel/locking/lockdep.c:6242 1 lock held by systemd-journal/3920: 1 lock held by in:imklog/6769: #0: ffff88809eebc130 (&f->f_pos_lock){+.+.}-{3:3}, at: __fdget_pos+0xe9/0x100 fs/file.c:930 1 lock held by syz-executor178/6870: #0: ffff8880925120e0 (&type->s_umount_key#47/1){+.+.}-{3:3}, at: alloc_super+0x201/0xaf0 fs/super.c:229 Actually, we didn't have to use _nofail in this case, since we could return error to mount(2) already with the error handler. As a result, this patch tries to 1) remove _nofail callers as much as possible, 2) deal with error case in last remaining caller, f2fs_get_sum_page(). Reported-by: syzbot+ee250ac8137be41d7b13@syzkaller.appspotmail.com Reviewed-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2020-10-13mm, oom_adj: don't loop through tasks in __set_oom_adj when not necessarySuren Baghdasaryan
Currently __set_oom_adj loops through all processes in the system to keep oom_score_adj and oom_score_adj_min in sync between processes sharing their mm. This is done for any task with more that one mm_users, which includes processes with multiple threads (sharing mm and signals). However for such processes the loop is unnecessary because their signal structure is shared as well. Android updates oom_score_adj whenever a tasks changes its role (background/foreground/...) or binds to/unbinds from a service, making it more/less important. Such operation can happen frequently. We noticed that updates to oom_score_adj became more expensive and after further investigation found out that the patch mentioned in "Fixes" introduced a regression. Using Pixel 4 with a typical Android workload, write time to oom_score_adj increased from ~3.57us to ~362us. Moreover this regression linearly depends on the number of multi-threaded processes running on the system. Mark the mm with a new MMF_MULTIPROCESS flag bit when task is created with (CLONE_VM && !CLONE_THREAD && !CLONE_VFORK). Change __set_oom_adj to use MMF_MULTIPROCESS instead of mm_users to decide whether oom_score_adj update should be synchronized between multiple processes. To prevent races between clone() and __set_oom_adj(), when oom_score_adj of the process being cloned might be modified from userspace, we use oom_adj_mutex. Its scope is changed to global. The combination of (CLONE_VM && !CLONE_THREAD) is rarely used except for the case of vfork(). To prevent performance regressions of vfork(), we skip taking oom_adj_mutex and setting MMF_MULTIPROCESS when CLONE_VFORK is specified. Clearing the MMF_MULTIPROCESS flag (when the last process sharing the mm exits) is left out of this patch to keep it simple and because it is believed that this threading model is rare. Should there ever be a need for optimizing that case as well, it can be done by hooking into the exit path, likely following the mm_update_next_owner pattern. With the combination of (CLONE_VM && !CLONE_THREAD && !CLONE_VFORK) being quite rare, the regression is gone after the change is applied. [surenb@google.com: v3] Link: https://lkml.kernel.org/r/20200902012558.2335613-1-surenb@google.com Fixes: 44a70adec910 ("mm, oom_adj: make sure processes sharing mm have same view of oom_score_adj") Reported-by: Tim Murray <timmurray@google.com> Suggested-by: Michal Hocko <mhocko@kernel.org> Signed-off-by: Suren Baghdasaryan <surenb@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Acked-by: Christian Brauner <christian.brauner@ubuntu.com> Acked-by: Michal Hocko <mhocko@suse.com> Acked-by: Oleg Nesterov <oleg@redhat.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Eugene Syromiatnikov <esyr@redhat.com> Cc: Christian Kellner <christian@kellner.me> Cc: Adrian Reber <areber@redhat.com> Cc: Shakeel Butt <shakeelb@google.com> Cc: Aleksa Sarai <cyphar@cyphar.com> Cc: Alexey Dobriyan <adobriyan@gmail.com> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Cc: Alexey Gladkov <gladkov.alexey@gmail.com> Cc: Michel Lespinasse <walken@google.com> Cc: Daniel Jordan <daniel.m.jordan@oracle.com> Cc: Andrei Vagin <avagin@gmail.com> Cc: Bernd Edlinger <bernd.edlinger@hotmail.de> Cc: John Johansen <john.johansen@canonical.com> Cc: Yafang Shao <laoar.shao@gmail.com> Link: https://lkml.kernel.org/r/20200824153036.3201505-1-surenb@google.com Debugged-by: Minchan Kim <minchan@kernel.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-10-13mm: proc: smaps_rollup: do not stall write attempts on mmap_lockChinwen Chang
smaps_rollup will try to grab mmap_lock and go through the whole vma list until it finishes the iterating. When encountering large processes, the mmap_lock will be held for a longer time, which may block other write requests like mmap and munmap from progressing smoothly. There are upcoming mmap_lock optimizations like range-based locks, but the lock applied to smaps_rollup would be the coarse type, which doesn't avoid the occurrence of unpleasant contention. To solve aforementioned issue, we add a check which detects whether anyone wants to grab mmap_lock for write attempts. Signed-off-by: Chinwen Chang <chinwen.chang@mediatek.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Cc: Steven Price <steven.price@arm.com> Cc: Michel Lespinasse <walken@google.com> Cc: Matthias Brugger <matthias.bgg@gmail.com> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Daniel Jordan <daniel.m.jordan@oracle.com> Cc: Davidlohr Bueso <dbueso@suse.de> Cc: Chinwen Chang <chinwen.chang@mediatek.com> Cc: Alexey Dobriyan <adobriyan@gmail.com> Cc: "Matthew Wilcox (Oracle)" <willy@infradead.org> Cc: Jason Gunthorpe <jgg@ziepe.ca> Cc: Song Liu <songliubraving@fb.com> Cc: Jimmy Assarsson <jimmyassarsson@gmail.com> Cc: Huang Ying <ying.huang@intel.com> Cc: Daniel Kiss <daniel.kiss@arm.com> Cc: Laurent Dufour <ldufour@linux.ibm.com> Link: http://lkml.kernel.org/r/1597715898-3854-4-git-send-email-chinwen.chang@mediatek.com Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-10-13mm: smaps*: extend smap_gather_stats to support specified beginningChinwen Chang
Extend smap_gather_stats to support indicated beginning address at which it should start gathering. To achieve the goal, we add a new parameter @start assigned by the caller and try to refactor it for simplicity. If @start is 0, it will use the range of @vma for gathering. Signed-off-by: Chinwen Chang <chinwen.chang@mediatek.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Reviewed-by: Steven Price <steven.price@arm.com> Cc: Michel Lespinasse <walken@google.com> Cc: Alexey Dobriyan <adobriyan@gmail.com> Cc: Daniel Jordan <daniel.m.jordan@oracle.com> Cc: Daniel Kiss <daniel.kiss@arm.com> Cc: Davidlohr Bueso <dbueso@suse.de> Cc: Huang Ying <ying.huang@intel.com> Cc: Jason Gunthorpe <jgg@ziepe.ca> Cc: Jimmy Assarsson <jimmyassarsson@gmail.com> Cc: Laurent Dufour <ldufour@linux.ibm.com> Cc: "Matthew Wilcox (Oracle)" <willy@infradead.org> Cc: Matthias Brugger <matthias.bgg@gmail.com> Cc: Song Liu <songliubraving@fb.com> Cc: Vlastimil Babka <vbabka@suse.cz> Link: http://lkml.kernel.org/r/1597715898-3854-3-git-send-email-chinwen.chang@mediatek.com Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-10-13proc: optimise smaps for shmem entriesMatthew Wilcox (Oracle)
Avoid bumping the refcount on pages when we're only interested in the swap entries. Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Acked-by: Johannes Weiner <hannes@cmpxchg.org> Cc: Alexey Dobriyan <adobriyan@gmail.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Huang Ying <ying.huang@intel.com> Cc: Hugh Dickins <hughd@google.com> Cc: Jani Nikula <jani.nikula@linux.intel.com> Cc: Matthew Auld <matthew.auld@intel.com> Cc: William Kucharski <william.kucharski@oracle.com> Link: https://lkml.kernel.org/r/20200910183318.20139-5-willy@infradead.org Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-10-13fs_parse: mark fs_param_bad_value() as staticLuo Jiaxing
We found the following warning when build kernel with W=1: fs/fs_parser.c:192:5: warning: no previous prototype for `fs_param_bad_value' [-Wmissing-prototypes] int fs_param_bad_value(struct p_log *log, struct fs_parameter *param) ^ CC drivers/usb/gadget/udc/snps_udc_core.o And no header file define a prototype for this function, so we should mark it as static. Signed-off-by: Luo Jiaxing <luojiaxing@huawei.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Link: https://lkml.kernel.org/r/1601293463-25763-1-git-send-email-luojiaxing@huawei.com Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-10-13fs/xattr.c: fix kernel-doc warnings for setxattr & removexattrRandy Dunlap
Fix kernel-doc warnings in fs/xattr.c: ../fs/xattr.c:251: warning: Function parameter or member 'dentry' not described in '__vfs_setxattr_locked' ../fs/xattr.c:251: warning: Function parameter or member 'name' not described in '__vfs_setxattr_locked' ../fs/xattr.c:251: warning: Function parameter or member 'value' not described in '__vfs_setxattr_locked' ../fs/xattr.c:251: warning: Function parameter or member 'size' not described in '__vfs_setxattr_locked' ../fs/xattr.c:251: warning: Function parameter or member 'flags' not described in '__vfs_setxattr_locked' ../fs/xattr.c:251: warning: Function parameter or member 'delegated_inode' not described in '__vfs_setxattr_locked' ../fs/xattr.c:458: warning: Function parameter or member 'dentry' not described in '__vfs_removexattr_locked' ../fs/xattr.c:458: warning: Function parameter or member 'name' not described in '__vfs_removexattr_locked' ../fs/xattr.c:458: warning: Function parameter or member 'delegated_inode' not described in '__vfs_removexattr_locked' Fixes: 08b5d5014a27 ("xattr: break delegations in {set,remove}xattr") Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Frank van der Linden <fllinden@amazon.com> Cc: Chuck Lever <chuck.lever@oracle.com> Link: http://lkml.kernel.org/r/7a3dd5a2-5787-adf3-d525-c203f9910ec4@infradead.org Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-10-13ocfs2: fix potential soft lockup during fstrimGang He
When we discard unused blocks on a mounted ocfs2 filesystem, fstrim handles each block goup with locking/unlocking global bitmap meta-file repeatedly. we should let fstrim thread take a break(if need) between unlock and lock, this will avoid the potential soft lockup problem, and also gives the upper applications more IO opportunities, these applications are not blocked for too long at writing files. Signed-off-by: Gang He <ghe@suse.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Reviewed-by: Joseph Qi <joseph.qi@linux.alibaba.com> Cc: Mark Fasheh <mark@fasheh.com> Cc: Joel Becker <jlbec@evilplan.org> Cc: Junxiao Bi <junxiao.bi@oracle.com> Cc: Changwei Ge <gechangwei@live.cn> Cc: Jun Piao <piaojun@huawei.com> Link: https://lkml.kernel.org/r/20200927015815.14904-1-ghe@suse.com Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-10-13ocfs2: delete repeated words in commentsRandy Dunlap
Drop duplicated words {the, and} in comments. Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Acked-by: Joseph Qi <joseph.qi@linux.alibaba.com> Cc: Mark Fasheh <mark@fasheh.com> Cc: Joel Becker <jlbec@evilplan.org> Link: https://lkml.kernel.org/r/20200811021845.25134-1-rdunlap@infradead.org Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-10-13ntfs: add check for mft record size in superblockRustam Kovhaev
Number of bytes allocated for mft record should be equal to the mft record size stored in ntfs superblock as reported by syzbot, userspace might trigger out-of-bounds read by dereferencing ctx->attr in ntfs_attr_find() Reported-by: syzbot+aed06913f36eff9b544e@syzkaller.appspotmail.com Signed-off-by: Rustam Kovhaev <rkovhaev@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Tested-by: syzbot+aed06913f36eff9b544e@syzkaller.appspotmail.com Acked-by: Anton Altaparmakov <anton@tuxera.com> Link: https://syzkaller.appspot.com/bug?extid=aed06913f36eff9b544e Link: https://lkml.kernel.org/r/20200824022804.226242-1-rkovhaev@gmail.com Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-10-13Merge tag 'io_uring-5.10-2020-10-12' of git://git.kernel.dk/linux-blockLinus Torvalds
Pull io_uring updates from Jens Axboe: - Add blkcg accounting for io-wq offload (Dennis) - A use-after-free fix for io-wq (Hillf) - Cancelation fixes and improvements - Use proper files_struct references for offload - Cleanup of io_uring_get_socket() since that can now go into our own header - SQPOLL fixes and cleanups, and support for sharing the thread - Improvement to how page accounting is done for registered buffers and huge pages, accounting the real pinned state - Series cleaning up the xarray code (Willy) - Various cleanups, refactoring, and improvements (Pavel) - Use raw spinlock for io-wq (Sebastian) - Add support for ring restrictions (Stefano) * tag 'io_uring-5.10-2020-10-12' of git://git.kernel.dk/linux-block: (62 commits) io_uring: keep a pointer ref_node in file_data io_uring: refactor *files_register()'s error paths io_uring: clean file_data access in files_register io_uring: don't delay io_init_req() error check io_uring: clean leftovers after splitting issue io_uring: remove timeout.list after hrtimer cancel io_uring: use a separate struct for timeout_remove io_uring: improve submit_state.ios_left accounting io_uring: simplify io_file_get() io_uring: kill extra check in fixed io_file_get() io_uring: clean up ->files grabbing io_uring: don't io_prep_async_work() linked reqs io_uring: Convert advanced XArray uses to the normal API io_uring: Fix XArray usage in io_uring_add_task_file io_uring: Fix use of XArray in __io_uring_files_cancel io_uring: fix break condition for __io_uring_register() waiting io_uring: no need to call xa_destroy() on empty xarray io_uring: batch account ->req_issue and task struct references io_uring: kill callback_head argument for io_req_task_work_add() io_uring: move req preps out of io_issue_sqe() ...
2020-10-13Merge tag 'block-5.10-2020-10-12' of git://git.kernel.dk/linux-blockLinus Torvalds
Pull block updates from Jens Axboe: - Series of merge handling cleanups (Baolin, Christoph) - Series of blk-throttle fixes and cleanups (Baolin) - Series cleaning up BDI, seperating the block device from the backing_dev_info (Christoph) - Removal of bdget() as a generic API (Christoph) - Removal of blkdev_get() as a generic API (Christoph) - Cleanup of is-partition checks (Christoph) - Series reworking disk revalidation (Christoph) - Series cleaning up bio flags (Christoph) - bio crypt fixes (Eric) - IO stats inflight tweak (Gabriel) - blk-mq tags fixes (Hannes) - Buffer invalidation fixes (Jan) - Allow soft limits for zone append (Johannes) - Shared tag set improvements (John, Kashyap) - Allow IOPRIO_CLASS_RT for CAP_SYS_NICE (Khazhismel) - DM no-wait support (Mike, Konstantin) - Request allocation improvements (Ming) - Allow md/dm/bcache to use IO stat helpers (Song) - Series improving blk-iocost (Tejun) - Various cleanups (Geert, Damien, Danny, Julia, Tetsuo, Tian, Wang, Xianting, Yang, Yufen, yangerkun) * tag 'block-5.10-2020-10-12' of git://git.kernel.dk/linux-block: (191 commits) block: fix uapi blkzoned.h comments blk-mq: move cancel of hctx->run_work to the front of blk_exit_queue blk-mq: get rid of the dead flush handle code path block: get rid of unnecessary local variable block: fix comment and add lockdep assert blk-mq: use helper function to test hw stopped block: use helper function to test queue register block: remove redundant mq check block: invoke blk_mq_exit_sched no matter whether have .exit_sched percpu_ref: don't refer to ref->data if it isn't allocated block: ratelimit handle_bad_sector() message blk-throttle: Re-use the throtl_set_slice_end() blk-throttle: Open code __throtl_de/enqueue_tg() blk-throttle: Move service tree validation out of the throtl_rb_first() blk-throttle: Move the list operation after list validation blk-throttle: Fix IO hang for a corner case blk-throttle: Avoid tracking latency if low limit is invalid blk-throttle: Avoid getting the current time if tg->last_finish_time is 0 blk-throttle: Remove a meaningless parameter for throtl_downgrade_state() block: Remove redundant 'return' statement ...
2020-10-13Merge tag 'erofs-for-5.10-rc1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/xiang/erofs Pull erofs updates from Gao Xiang: "This cycle addresses a reported permission issue with overlay due to a duplicated permission check for "trusted." xattrs. Also, a REQ_RAHEAD flag is added now to all readahead requests in order to trace readahead I/Os. The others are random cleanups. All commits have been tested and have been in linux-next as well. Summary: - fix an issue which can cause overlay permission problem due to duplicated permission check for "trusted." xattrs; - add REQ_RAHEAD flag to readahead requests for blktrace; - several random cleanup" * tag 'erofs-for-5.10-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/xiang/erofs: erofs: remove unnecessary enum entries erofs: add REQ_RAHEAD flag to readahead requests erofs: fold in should_decompress_synchronously() erofs: avoid unnecessary variable `err' erofs: remove unneeded parameter erofs: avoid duplicated permission check for "trusted." xattrs
2020-10-13Merge tag 'for-5.10-tag' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux Pull btrfs updates from David Sterba: "Mostly core updates with a few user visible bits and fixes. Hilights: - fsync performance improvements - less contention of log mutex (throughput +4%, latency -14%, dbench with 32 clients) - skip unnecessary commits for link and rename (throughput +6%, latency -30%, rename latency -75%, dbench with 16 clients) - make fast fsync wait only for writeback (throughput +10..40%, runtime -1..-20%, dbench with 1 to 64 clients on various file/block sizes) - direct io is now implemented using the iomap infrastructure, that's the main part, we still have a workaround that requires an iomap API update, coming in 5.10 - new sysfs exports: - information about the exclusive filesystem operation status (balance, device add/remove/replace, ...) - supported send stream version Core: - use ticket space reservations for data, fair policy using the same infrastructure as metadata - preparatory work to switch locking from our custom tree locks to standard rwsem, now the locking context is propagated to all callers, actual switch is expected to happen in the next dev cycle - seed device structures are now using list API - extent tracepoints print proper tree id - unified range checks for extent buffer helpers - send: avoid using temporary buffer for copying data - remove unnecessary RCU protection from space infos - remove unused readpage callback for metadata, enabling several cleanups - replace indirect function calls for end io hooks and remove extent_io_ops completely Fixes: - more lockdep warning fixes - fix qgroup reservation for delayed inode and an occasional reservation leak for preallocated files - fix device replace of a seed device - fix metadata reservation for fallocate that leads to transaction aborts - reschedule if necessary when logging directory items or when cloning lots of extents - tree-checker: fix false alert caused by legacy btrfs root item - send: fix rename/link conflicts for orphanized inodes - properly initialize device stats for seed devices - skip devices without magic signature when mounting Other: - error handling improvements, BUG_ONs replaced by proper handling, fuzz fixes - various function parameter cleanups - various W=1 cleanups - error/info messages improved Mishaps: - commit 62cf5391209a ("btrfs: move btrfs_rm_dev_replace_free_srcdev outside of all locks") is a rebase leftover after the patch got merged to 5.9-rc8 as a466c85edc6f ("btrfs: move btrfs_rm_dev_replace_free_srcdev outside of all locks"), the remaining part is trivial and the patch is in the middle of the series so I'm keeping it there instead of rebasing" * tag 'for-5.10-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux: (161 commits) btrfs: rename BTRFS_INODE_ORDERED_DATA_CLOSE flag btrfs: annotate device name rcu_string with __rcu btrfs: skip devices without magic signature when mounting btrfs: cleanup cow block on error btrfs: remove BTRFS_INODE_READDIO_NEED_LOCK fs: remove no longer used dio_end_io() btrfs: return error if we're unable to read device stats btrfs: init device stats for seed devices btrfs: remove struct extent_io_ops btrfs: call submit_bio_hook directly for metadata pages btrfs: stop calling submit_bio_hook for data inodes btrfs: don't opencode is_data_inode in end_bio_extent_readpage btrfs: call submit_bio_hook directly in submit_one_bio btrfs: remove extent_io_ops::readpage_end_io_hook btrfs: replace readpage_end_io_hook with direct calls btrfs: send, recompute reference path after orphanization of a directory btrfs: send, orphanize first all conflicting inodes when processing references btrfs: tree-checker: fix false alert caused by legacy btrfs root item btrfs: use unaligned helpers for stack and header set/get helpers btrfs: free-space-cache: use unaligned helpers to access data ...
2020-10-13Merge tag 'dlm-5.10' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/teigland/linux-dlm Pull dlm updates from David Teigland: "This set continues the ongoing rework of the low level communication layer in the dlm. The focus here is on improvements to connection handling, and reworking the receiving of messages" * tag 'dlm-5.10' of git://git.kernel.org/pub/scm/linux/kernel/git/teigland/linux-dlm: fs: dlm: fix race in nodeid2con fs: dlm: rework receive handling fs: dlm: disallow buffer size below default fs: dlm: handle range check as callback fs: dlm: fix mark per nodeid setting fs: dlm: remove lock dependency warning fs: dlm: use free_con to free connection fs: dlm: handle possible othercon writequeues fs: dlm: move free writequeue into con free fs: dlm: fix configfs memory leak fs: dlm: fix dlm_local_addr memory leak fs: dlm: make connection hash lockless fs: dlm: synchronize dlm before shutdown
2020-10-13Merge tag 'fscrypt-for-linus' of git://git.kernel.org/pub/scm/fs/fscrypt/fscryptLinus Torvalds
Pull fscrypt updates from Eric Biggers: "This release, we rework the implementation of creating new encrypted files in order to fix some deadlocks and prepare for adding fscrypt support to CephFS, which Jeff Layton is working on. We also export a symbol in preparation for the above-mentioned CephFS support and also for ext4/f2fs encrypt+casefold support. Finally, there are a few other small cleanups. As usual, all these patches have been in linux-next with no reported issues, and I've tested them with xfstests" * tag 'fscrypt-for-linus' of git://git.kernel.org/pub/scm/fs/fscrypt/fscrypt: fscrypt: export fscrypt_d_revalidate() fscrypt: rename DCACHE_ENCRYPTED_NAME to DCACHE_NOKEY_NAME fscrypt: don't call no-key names "ciphertext names" fscrypt: use sha256() instead of open coding fscrypt: make fscrypt_set_test_dummy_encryption() take a 'const char *' fscrypt: handle test_dummy_encryption in more logical way fscrypt: move fscrypt_prepare_symlink() out-of-line fscrypt: make "#define fscrypt_policy" user-only fscrypt: stop pretending that key setup is nofs-safe fscrypt: require that fscrypt_encrypt_symlink() already has key fscrypt: remove fscrypt_inherit_context() fscrypt: adjust logging for in-creation inodes ubifs: use fscrypt_prepare_new_inode() and fscrypt_set_context() f2fs: use fscrypt_prepare_new_inode() and fscrypt_set_context() ext4: use fscrypt_prepare_new_inode() and fscrypt_set_context() ext4: factor out ext4_xattr_credits_for_new_inode() fscrypt: add fscrypt_prepare_new_inode() and fscrypt_set_context() fscrypt: restrict IV_INO_LBLK_32 to ino_bits <= 32 fscrypt: drop unused inode argument from fscrypt_fname_alloc_buffer
2020-10-13xfs: annotate grabbing the realtime bitmap/summary locks in growfsDarrick J. Wong
Use XFS_ILOCK_RT{BITMAP,SUM} to annotate grabbing the rt bitmap and summary locks when we grow the realtime volume, just like we do most everywhere else. This shuts up lockdep warnings about grabbing the ILOCK class of locks recursively: ============================================ WARNING: possible recursive locking detected 5.9.0-rc4-djw #rc4 Tainted: G O -------------------------------------------- xfs_growfs/4841 is trying to acquire lock: ffff888035acc230 (&xfs_nondir_ilock_class){++++}-{3:3}, at: xfs_ilock+0xac/0x1a0 [xfs] but task is already holding lock: ffff888035acedb0 (&xfs_nondir_ilock_class){++++}-{3:3}, at: xfs_ilock+0xac/0x1a0 [xfs] other info that might help us debug this: Possible unsafe locking scenario: CPU0 ---- lock(&xfs_nondir_ilock_class); lock(&xfs_nondir_ilock_class); *** DEADLOCK *** May be due to missing lock nesting notation Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Chandan Babu R <chandanrlinux@gmail.com>
2020-10-13xfs: make xfs_growfs_rt update secondary superblocksDarrick J. Wong
When we call growfs on the data device, we update the secondary superblocks to reflect the updated filesystem geometry. We need to do this for growfs on the realtime volume too, because a future xfs_repair run could try to fix the filesystem using a backup superblock. This was observed by the online superblock scrubbers while running xfs/233. One can also trigger this by growing an rt volume, cycling the mount, and creating new rt files. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Chandan Babu R <chandanrlinux@gmail.com>
2020-10-13xfs: fix realtime bitmap/summary file truncation when growing rt volumeDarrick J. Wong
The realtime bitmap and summary files are regular files that are hidden away from the directory tree. Since they're regular files, inode inactivation will try to purge what it thinks are speculative preallocations beyond the incore size of the file. Unfortunately, xfs_growfs_rt forgets to update the incore size when it resizes the inodes, with the result that inactivating the rt inodes at unmount time will cause their contents to be truncated. Fix this by updating the incore size when we change the ondisk size as part of updating the superblock. Note that we don't do this when we're allocating blocks to the rt inodes because we actually want those blocks to get purged if the growfs fails. This fixes corruption complaints from the online rtsummary checker when running xfs/233. Since that test requires rmap, one can also trigger this by growing an rt volume, cycling the mount, and creating rt files. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Chandan Babu R <chandanrlinux@gmail.com>
2020-10-12Merge branch 'compat.mount' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs Pull compat mount cleanups from Al Viro: "The last remnants of mount(2) compat buried by Christoph. Buried into NFS, that is. Generally I'm less enthusiastic about "let's use in_compat_syscall() deep in call chain" kind of approach than Christoph seems to be, but in this case it's warranted - that had been an NFS-specific wart, hopefully not to be repeated in any other filesystems (read: any new filesystem introducing non-text mount options will get NAKed even if it doesn't mess the layout up). IOW, not worth trying to grow an infrastructure that would avoid that use of in_compat_syscall()..." * 'compat.mount' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: fs: remove compat_sys_mount fs,nfs: lift compat nfs4 mount data handling into the nfs code nfs: simplify nfs4_parse_monolithic
2020-10-12Merge branch 'work.quota-compat' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs Pull compat quotactl cleanups from Al Viro: "More Christoph's compat cleanups: quotactl(2)" * 'work.quota-compat' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: quota: simplify the quotactl compat handling compat: add a compat_need_64bit_alignment_fixup() helper compat: lift compat_s64 and compat_u64 to <asm-generic/compat.h>
2020-10-12Merge branch 'work.iov_iter' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs Pull compat iovec cleanups from Al Viro: "Christoph's series around import_iovec() and compat variant thereof" * 'work.iov_iter' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: security/keys: remove compat_keyctl_instantiate_key_iov mm: remove compat_process_vm_{readv,writev} fs: remove compat_sys_vmsplice fs: remove the compat readv/writev syscalls fs: remove various compat readv/writev helpers iov_iter: transparently handle compat iovecs in import_iovec iov_iter: refactor rw_copy_check_uvector and import_iovec iov_iter: move rw_copy_check_uvector() into lib/iov_iter.c compat.h: fix a spelling error in <linux/compat.h>
2020-10-12Merge tag 'efi-core-2020-10-12' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull EFI changes from Ingo Molnar: - Preliminary RISC-V enablement - the bulk of it will arrive via the RISCV tree. - Relax decompressed image placement rules for 32-bit ARM - Add support for passing MOK certificate table contents via a config table rather than a EFI variable. - Add support for 18 bit DIMM row IDs in the CPER records. - Work around broken Dell firmware that passes the entire Boot#### variable contents as the command line - Add definition of the EFI_MEMORY_CPU_CRYPTO memory attribute so we can identify it in the memory map listings. - Don't abort the boot on arm64 if the EFI RNG protocol is available but returns with an error - Replace slashes with exclamation marks in efivarfs file names - Split efi-pstore from the deprecated efivars sysfs code, so we can disable the latter on !x86. - Misc fixes, cleanups and updates. * tag 'efi-core-2020-10-12' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (26 commits) efi: mokvar: add missing include of asm/early_ioremap.h efi: efivars: limit availability to X86 builds efi: remove some false dependencies on CONFIG_EFI_VARS efi: gsmi: fix false dependency on CONFIG_EFI_VARS efi: efivars: un-export efivars_sysfs_init() efi: pstore: move workqueue handling out of efivars efi: pstore: disentangle from deprecated efivars module efi: mokvar-table: fix some issues in new code efi/arm64: libstub: Deal gracefully with EFI_RNG_PROTOCOL failure efivarfs: Replace invalid slashes with exclamation marks in dentries. efi: Delete deprecated parameter comments efi/libstub: Fix missing-prototypes in string.c efi: Add definition of EFI_MEMORY_CPU_CRYPTO and ability to report it cper,edac,efi: Memory Error Record: bank group/address and chip id edac,ghes,cper: Add Row Extension to Memory Error Record efi/x86: Add a quirk to support command line arguments on Dell EFI firmware efi/libstub: Add efi_warn and *_once logging helpers integrity: Load certs from the EFI MOK config table integrity: Move import of MokListRT certs to a separate routine efi: Support for MOK variable config table ...
2020-10-12Merge tag 'arm64-upstream' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux Pull arm64 updates from Will Deacon: "There's quite a lot of code here, but much of it is due to the addition of a new PMU driver as well as some arm64-specific selftests which is an area where we've traditionally been lagging a bit. In terms of exciting features, this includes support for the Memory Tagging Extension which narrowly missed 5.9, hopefully allowing userspace to run with use-after-free detection in production on CPUs that support it. Work is ongoing to integrate the feature with KASAN for 5.11. Another change that I'm excited about (assuming they get the hardware right) is preparing the ASID allocator for sharing the CPU page-table with the SMMU. Those changes will also come in via Joerg with the IOMMU pull. We do stray outside of our usual directories in a few places, mostly due to core changes required by MTE. Although much of this has been Acked, there were a couple of places where we unfortunately didn't get any review feedback. Other than that, we ran into a handful of minor conflicts in -next, but nothing that should post any issues. Summary: - Userspace support for the Memory Tagging Extension introduced by Armv8.5. Kernel support (via KASAN) is likely to follow in 5.11. - Selftests for MTE, Pointer Authentication and FPSIMD/SVE context switching. - Fix and subsequent rewrite of our Spectre mitigations, including the addition of support for PR_SPEC_DISABLE_NOEXEC. - Support for the Armv8.3 Pointer Authentication enhancements. - Support for ASID pinning, which is required when sharing page-tables with the SMMU. - MM updates, including treating flush_tlb_fix_spurious_fault() as a no-op. - Perf/PMU driver updates, including addition of the ARM CMN PMU driver and also support to handle CPU PMU IRQs as NMIs. - Allow prefetchable PCI BARs to be exposed to userspace using normal non-cacheable mappings. - Implementation of ARCH_STACKWALK for unwinding. - Improve reporting of unexpected kernel traps due to BPF JIT failure. - Improve robustness of user-visible HWCAP strings and their corresponding numerical constants. - Removal of TEXT_OFFSET. - Removal of some unused functions, parameters and prototypes. - Removal of MPIDR-based topology detection in favour of firmware description. - Cleanups to handling of SVE and FPSIMD register state in preparation for potential future optimisation of handling across syscalls. - Cleanups to the SDEI driver in preparation for support in KVM. - Miscellaneous cleanups and refactoring work" * tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: (148 commits) Revert "arm64: initialize per-cpu offsets earlier" arm64: random: Remove no longer needed prototypes arm64: initialize per-cpu offsets earlier kselftest/arm64: Check mte tagged user address in kernel kselftest/arm64: Verify KSM page merge for MTE pages kselftest/arm64: Verify all different mmap MTE options kselftest/arm64: Check forked child mte memory accessibility kselftest/arm64: Verify mte tag inclusion via prctl kselftest/arm64: Add utilities and a test to validate mte memory perf: arm-cmn: Fix conversion specifiers for node type perf: arm-cmn: Fix unsigned comparison to less than zero arm64: dbm: Invalidate local TLB when setting TCR_EL1.HD arm64: mm: Make flush_tlb_fix_spurious_fault() a no-op arm64: Add support for PR_SPEC_DISABLE_NOEXEC prctl() option arm64: Pull in task_stack_page() to Spectre-v4 mitigation code KVM: arm64: Allow patching EL2 vectors even with KASLR is not enabled arm64: Get rid of arm64_ssbd_state KVM: arm64: Convert ARCH_WORKAROUND_2 to arm64_get_spectre_v4_state() KVM: arm64: Get rid of kvm_arm_have_ssbd() KVM: arm64: Simplify handling of ARCH_WORKAROUND_2 ...
2020-10-12fuse: connection remove fixMiklos Szeredi
Re-add lost removal of fc from fuse_conn_list and the control filesystem. Reported-by: kernel test robot <rong.a.chen@intel.com> Fixes: fcee216beb9c ("fuse: split fuse_mount off of fuse_conn") Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2020-10-11ubifs: mount_ubifs: Release authentication resource in error handling pathZhihao Cheng
Release the authentication related resource in some error handling branches in mount_ubifs(). Signed-off-by: Zhihao Cheng <chengzhihao1@huawei.com> Cc: <stable@vger.kernel.org> # 4.20+ Fixes: d8a22773a12c6d7 ("ubifs: Enable authentication support") Reviewed-by: Sascha Hauer <s.hauer@pengutronix.de> Signed-off-by: Richard Weinberger <richard@nod.at>
2020-10-11ubifs: Don't parse authentication mount options in remount processZhihao Cheng
There is no need to dump authentication options while remounting, because authentication initialization can only be doing once in the first mount process. Dumping authentication mount options in remount process may cause memory leak if UBIFS has already been mounted with old authentication mount options. Signed-off-by: Zhihao Cheng <chengzhihao1@huawei.com> Cc: <stable@vger.kernel.org> # 4.20+ Fixes: d8a22773a12c6d7 ("ubifs: Enable authentication support") Reviewed-by: Sascha Hauer <s.hauer@pengutronix.de> Signed-off-by: Richard Weinberger <richard@nod.at>
2020-10-11ubifs: Fix a memleak after dumping authentication mount optionsZhihao Cheng
Fix a memory leak after dumping authentication mount options in error handling branch. Signed-off-by: Zhihao Cheng <chengzhihao1@huawei.com> Cc: <stable@vger.kernel.org> # 4.20+ Fixes: d8a22773a12c6d7 ("ubifs: Enable authentication support") Reviewed-by: Sascha Hauer <s.hauer@pengutronix.de> Signed-off-by: Richard Weinberger <richard@nod.at>
2020-10-11Merge branch 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfsLinus Torvalds
Pull vfs fix from Al Viro: "Fixes an obvious bug (memory leak introduced in 5.8)" * 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: pipe: Fix memory leaks in create_pipe_files()
2020-10-10cifs: Fix incomplete memory allocation on setxattr pathVladimir Zapolskiy
On setxattr() syscall path due to an apprent typo the size of a dynamically allocated memory chunk for storing struct smb2_file_full_ea_info object is computed incorrectly, to be more precise the first addend is the size of a pointer instead of the wanted object size. Coincidentally it makes no difference on 64-bit platforms, however on 32-bit targets the following memcpy() writes 4 bytes of data outside of the dynamically allocated memory. ============================================================================= BUG kmalloc-16 (Not tainted): Redzone overwritten ----------------------------------------------------------------------------- Disabling lock debugging due to kernel taint INFO: 0x79e69a6f-0x9e5cdecf @offset=368. First byte 0x73 instead of 0xcc INFO: Slab 0xd36d2454 objects=85 used=51 fp=0xf7d0fc7a flags=0x35000201 INFO: Object 0x6f171df3 @offset=352 fp=0x00000000 Redzone 5d4ff02d: cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc ................ Object 6f171df3: 00 00 00 00 00 05 06 00 73 6e 72 75 62 00 66 69 ........snrub.fi Redzone 79e69a6f: 73 68 32 0a sh2. Padding 56254d82: 5a 5a 5a 5a 5a 5a 5a 5a ZZZZZZZZ CPU: 0 PID: 8196 Comm: attr Tainted: G B 5.9.0-rc8+ #3 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.13.0-1 04/01/2014 Call Trace: dump_stack+0x54/0x6e print_trailer+0x12c/0x134 check_bytes_and_report.cold+0x3e/0x69 check_object+0x18c/0x250 free_debug_processing+0xfe/0x230 __slab_free+0x1c0/0x300 kfree+0x1d3/0x220 smb2_set_ea+0x27d/0x540 cifs_xattr_set+0x57f/0x620 __vfs_setxattr+0x4e/0x60 __vfs_setxattr_noperm+0x4e/0x100 __vfs_setxattr_locked+0xae/0xd0 vfs_setxattr+0x4e/0xe0 setxattr+0x12c/0x1a0 path_setxattr+0xa4/0xc0 __ia32_sys_lsetxattr+0x1d/0x20 __do_fast_syscall_32+0x40/0x70 do_fast_syscall_32+0x29/0x60 do_SYSENTER_32+0x15/0x20 entry_SYSENTER_32+0x9f/0xf2 Fixes: 5517554e4313 ("cifs: Add support for writing attributes on SMB2+") Signed-off-by: Vladimir Zapolskiy <vladimir@tuxera.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-10-10io_uring: keep a pointer ref_node in file_dataio_uring-5.10-2020-10-12Pavel Begunkov
->cur_refs of struct fixed_file_data always points to percpu_ref embedded into struct fixed_file_ref_node. Don't overuse container_of() and offsetting, and point directly to fixed_file_ref_node. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-10-10io_uring: refactor *files_register()'s error pathsPavel Begunkov
Don't keep repeating cleaning sequences in error paths, write it once in the and use labels. It's less error prone and looks cleaner. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-10-10io_uring: clean file_data access in files_registerPavel Begunkov
Keep file_data in a local var and replace with it complex references such as ctx->file_data. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-10-10io_uring: don't delay io_init_req() error checkPavel Begunkov
Don't postpone io_init_req() error checks and do that right after calling it. There is no control-flow statements or dependencies with sqe/submitted accounting, so do those earlier, that makes the code flow a bit more natural. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-10-10io_uring: clean leftovers after splitting issuePavel Begunkov
Kill extra if in io_issue_sqe() and place send/recv[msg] calls appropriately under switch's cases. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-10-10io_uring: remove timeout.list after hrtimer cancelPavel Begunkov
Remove timeouts from ctx->timeout_list after hrtimer_try_to_cancel() successfully cancels it. With this we don't need to care whether there was a race and it was removed in io_timeout_fn(), and that will be handy for following patches. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-10-10io_uring: use a separate struct for timeout_removePavel Begunkov
Don't use struct io_timeout for both IORING_OP_TIMEOUT and IORING_OP_TIMEOUT_REMOVE, they're quite different. Split them in two, that allows to remove an unused field in struct io_timeout, and btw kill ->flags not used by either. This also easier to follow, especially for timeout remove. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-10-10io_uring: improve submit_state.ios_left accountingPavel Begunkov
state->ios_left isn't decremented for requests that don't need a file, so it might be larger than number of SQEs left. That in some circumstances makes us to grab more files that is needed so imposing extra put. Deaccount one ios_left for each request. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-10-10io_uring: simplify io_file_get()Pavel Begunkov
Keep ->needs_file_no_error check out of io_file_get(), and let callers handle it. It makes it more straightforward. Also, as the only error it can hand back -EBADF, make it return a file or NULL. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-10-10io_uring: kill extra check in fixed io_file_get()Pavel Begunkov
ctx->nr_user_files == 0 IFF ctx->file_data == NULL and there fixed files are not used. Hence, verifying fds only against ctx->nr_user_files is enough. Remove the other check from hot path. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-10-10io_uring: clean up ->files grabbingPavel Begunkov
Move work.files grabbing into io_prep_async_work() to all other work resources initialisation. We don't need to keep it separately now, as ->ring_fd/file are gone. It also allows to not grab it when a request is not going to io-wq. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-10-10io_uring: don't io_prep_async_work() linked reqsPavel Begunkov
There is no real reason left for preparing io-wq work context for linked requests in advance, remove it as this might become a bottleneck in some cases. Reported-by: Roman Gershman <romger@amazon.com> Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-10-09f2fs: fix to set SBI_NEED_FSCK flag for inconsistent inodeChao Yu
If compressed inode has inconsistent fields on i_compress_algorithm, i_compr_blocks and i_log_cluster_size, we missed to set SBI_NEED_FSCK to notice fsck to repair the inode, fix it. Signed-off-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2020-10-09io_uring: Convert advanced XArray uses to the normal APIMatthew Wilcox (Oracle)
There are no bugs here that I've spotted, it's just easier to use the normal API and there are no performance advantages to using the more verbose advanced API. Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-10-09io_uring: Fix XArray usage in io_uring_add_task_fileMatthew Wilcox (Oracle)
The xas_store() wasn't paired with an xas_nomem() loop, so if it couldn't allocate memory using GFP_NOWAIT, it would leak the reference to the file descriptor. Also the node pointed to by the xas could be freed between the call to xas_load() under the rcu_read_lock() and the acquisition of the xa_lock. It's easier to just use the normal xa_load/xa_store interface here. Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> [axboe: fix missing assign after alloc, cur_uring -> tctx rename] Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-10-09io_uring: Fix use of XArray in __io_uring_files_cancelMatthew Wilcox (Oracle)
We have to drop the lock during each iteration, so there's no advantage to using the advanced API. Convert this to a standard xa_for_each() loop. Reported-by: syzbot+27c12725d8ff0bfe1a13@syzkaller.appspotmail.com Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-10-09fuse: implement crossmountsMax Reitz
FUSE servers can indicate crossmount points by setting FUSE_ATTR_SUBMOUNT in fuse_attr.flags. The inode will then be marked as S_AUTOMOUNT, and the .d_automount implementation creates a new submount at that location, so that the submount gets a distinct st_dev value. Note that all submounts get a distinct superblock and a distinct st_dev value, so for virtio-fs, even if the same filesystem is mounted more than once on the host, none of its mount points will have the same st_dev. We need distinct superblocks because the superblock points to the root node, but the different host mounts may show different trees (e.g. due to submounts in some of them, but not in others). Right now, this behavior is only enabled when fuse_conn.auto_submounts is set, which is the case only for virtio-fs. Signed-off-by: Max Reitz <mreitz@redhat.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2020-10-08f2fs: reject CASEFOLD inode flag without casefold featureEric Biggers
syzbot reported: general protection fault, probably for non-canonical address 0xdffffc0000000001: 0000 [#1] PREEMPT SMP KASAN KASAN: null-ptr-deref in range [0x0000000000000008-0x000000000000000f] CPU: 0 PID: 6860 Comm: syz-executor835 Not tainted 5.9.0-rc8-syzkaller #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 RIP: 0010:utf8_casefold+0x43/0x1b0 fs/unicode/utf8-core.c:107 [...] Call Trace: f2fs_init_casefolded_name fs/f2fs/dir.c:85 [inline] __f2fs_setup_filename fs/f2fs/dir.c:118 [inline] f2fs_prepare_lookup+0x3bf/0x640 fs/f2fs/dir.c:163 f2fs_lookup+0x10d/0x920 fs/f2fs/namei.c:494 __lookup_hash+0x115/0x240 fs/namei.c:1445 filename_create+0x14b/0x630 fs/namei.c:3467 user_path_create fs/namei.c:3524 [inline] do_mkdirat+0x56/0x310 fs/namei.c:3664 do_syscall_64+0x31/0x70 arch/x86/entry/common.c:46 entry_SYSCALL_64_after_hwframe+0x44/0xa9 [...] The problem is that an inode has F2FS_CASEFOLD_FL set, but the filesystem doesn't have the casefold feature flag set, and therefore super_block::s_encoding is NULL. Fix this by making sanity_check_inode() reject inodes that have F2FS_CASEFOLD_FL when the filesystem doesn't have the casefold feature. Reported-by: syzbot+05139c4039d0679e19ff@syzkaller.appspotmail.com Fixes: 2c2eb7a300cd ("f2fs: Support case-insensitive file name lookups") Signed-off-by: Eric Biggers <ebiggers@google.com> Reviewed-by: Gabriel Krisman Bertazi <krisman@collabora.com> Reviewed-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2020-10-08f2fs: fix memory alignment to support 32bitJaegeuk Kim
In 32bit system, 64-bits key breaks memory alignment. This fixes the commit "f2fs: support 64-bits key in f2fs rb-tree node entry". Reported-by: Nicolas Chauvet <kwizart@gmail.com> Signed-off-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>