summaryrefslogtreecommitdiff
path: root/fs/ext4
AgeCommit message (Collapse)Author
2018-11-21ext4: fix buffer leak in __ext4_read_dirblock() on error pathVasily Averin
commit de59fae0043f07de5d25e02ca360f7d57bfa5866 upstream. Fixes: dc6982ff4db1 ("ext4: refactor code to read directory blocks ...") Signed-off-by: Vasily Averin <vvs@virtuozzo.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu> Cc: stable@kernel.org # 3.9 Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-11-21ext4: fix buffer leak in ext4_expand_extra_isize_ea() on error pathVasily Averin
commit 53692ec074d00589c2cf1d6d17ca76ad0adce6ec upstream. Fixes: de05ca852679 ("ext4: move call to ext4_error() into ...") Signed-off-by: Vasily Averin <vvs@virtuozzo.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu> Cc: stable@kernel.org # 4.17 Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-11-21ext4: fix buffer leak in ext4_xattr_move_to_block() on error pathVasily Averin
commit 6bdc9977fcdedf47118d2caf7270a19f4b6d8a8f upstream. Fixes: 3f2571c1f91f ("ext4: factor out xattr moving") Fixes: 6dd4ee7cab7e ("ext4: Expand extra_inodes space per ...") Reviewed-by: Jan Kara <jack@suse.cz> Signed-off-by: Vasily Averin <vvs@virtuozzo.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu> Cc: stable@kernel.org # 2.6.23 Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-11-21ext4: release bs.bh before re-using in ext4_xattr_block_find()Vasily Averin
commit 45ae932d246f721e6584430017176cbcadfde610 upstream. bs.bh was taken in previous ext4_xattr_block_find() call, it should be released before re-using Fixes: 7e01c8e5420b ("ext3/4: fix uninitialized bs in ...") Signed-off-by: Vasily Averin <vvs@virtuozzo.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu> Cc: stable@kernel.org # 2.6.26 Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-11-21ext4: fix buffer leak in ext4_xattr_get_block() on error pathVasily Averin
commit ecaaf408478b6fb4d9986f9b6652f3824e374f4c upstream. Fixes: dec214d00e0d ("ext4: xattr inode deduplication") Signed-off-by: Vasily Averin <vvs@virtuozzo.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu> Cc: stable@kernel.org # 4.13 Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-11-21ext4: fix possible leak of s_journal_flag_rwsem in error pathVasily Averin
commit af18e35bfd01e6d65a5e3ef84ffe8b252d1628c5 upstream. Fixes: c8585c6fcaf2 ("ext4: fix races between changing inode journal ...") Signed-off-by: Vasily Averin <vvs@virtuozzo.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu> Cc: stable@kernel.org # 4.7 Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-11-21ext4: fix possible leak of sbi->s_group_desc_leak in error pathTheodore Ts'o
commit 9e463084cdb22e0b56b2dfbc50461020409a5fd3 upstream. Fixes: bfe0a5f47ada ("ext4: add more mount time checks of the superblock") Reported-by: Vasily Averin <vvs@virtuozzo.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu> Cc: stable@kernel.org # 4.18 Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-11-21ext4: avoid possible double brelse() in add_new_gdb() on error pathTheodore Ts'o
commit 4f32c38b4662312dd3c5f113d8bdd459887fb773 upstream. Fixes: b40971426a83 ("ext4: add error checking to calls to ...") Reported-by: Vasily Averin <vvs@virtuozzo.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu> Cc: stable@kernel.org # 2.6.38 Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-11-21ext4: fix missing cleanup if ext4_alloc_flex_bg_array() fails while resizingVasily Averin
commit f348e2241fb73515d65b5d77dd9c174128a7fbf2 upstream. Fixes: 117fff10d7f1 ("ext4: grow the s_flex_groups array as needed ...") Signed-off-by: Vasily Averin <vvs@virtuozzo.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu> Cc: stable@kernel.org # 3.7 Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-11-21ext4: avoid buffer leak in ext4_orphan_add() after prior errorsVasily Averin
commit feaf264ce7f8d54582e2f66eb82dd9dd124c94f3 upstream. Fixes: d745a8c20c1f ("ext4: reduce contention on s_orphan_lock") Fixes: 6e3617e579e0 ("ext4: Handle non empty on-disk orphan link") Cc: Dmitry Monakhov <dmonakhov@gmail.com> Signed-off-by: Vasily Averin <vvs@virtuozzo.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu> Cc: stable@kernel.org # 2.6.34 Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-11-21ext4: avoid buffer leak on shutdown in ext4_mark_iloc_dirty()Vasily Averin
commit a6758309a005060b8297a538a457c88699cb2520 upstream. ext4_mark_iloc_dirty() callers expect that it releases iloc->bh even if it returns an error. Fixes: 0db1ff222d40 ("ext4: add shutdown bit and check for it") Signed-off-by: Vasily Averin <vvs@virtuozzo.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu> Cc: stable@kernel.org # 4.11 Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-11-21ext4: fix possible inode leak in the retry loop of ext4_resize_fs()Vasily Averin
commit db6aee62406d9fbb53315fcddd81f1dc271d49fa upstream. Fixes: 1c6bd7173d66 ("ext4: convert file system to meta_bg if needed ...") Signed-off-by: Vasily Averin <vvs@virtuozzo.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu> Cc: stable@kernel.org # 3.7 Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-11-21ext4: missing !bh check in ext4_xattr_inode_write()Vasily Averin
commit eb6984fa4ce2837dcb1f66720a600f31b0bb3739 upstream. According to Ted Ts'o ext4_getblk() called in ext4_xattr_inode_write() should not return bh = NULL The only time that bh could be NULL, then, would be in the case of something really going wrong; a programming error elsewhere (perhaps a wild pointer dereference) or I/O error causing on-disk file system corruption (although that would be highly unlikely given that we had *just* allocated the blocks and so the metadata blocks in question probably would still be in the cache). Fixes: e50e5129f384 ("ext4: xattr-in-inode support") Signed-off-by: Vasily Averin <vvs@virtuozzo.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu> Cc: stable@kernel.org # 4.13 Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-11-21ext4: avoid potential extra brelse in setup_new_flex_group_blocks()Vasily Averin
commit 9e4028935cca3f9ef9b6a90df9da6f1f94853536 upstream. Currently bh is set to NULL only during first iteration of for cycle, then this pointer is not cleared after end of using. Therefore rollback after errors can lead to extra brelse(bh) call, decrements bh counter and later trigger an unexpected warning in __brelse() Patch moves brelse() calls in body of cycle to exclude requirement of brelse() call in rollback. Fixes: 33afdcc5402d ("ext4: add a function which sets up group blocks ...") Signed-off-by: Vasily Averin <vvs@virtuozzo.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu> Cc: stable@kernel.org # 3.3+ Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-11-21ext4: add missing brelse() add_new_gdb_meta_bg()'s error pathVasily Averin
commit 61a9c11e5e7a0dab5381afa5d9d4dd5ebf18f7a0 upstream. Fixes: 01f795f9e0d6 ("ext4: add online resizing support for meta_bg ...") Signed-off-by: Vasily Averin <vvs@virtuozzo.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu> Cc: stable@kernel.org # 3.7 Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-11-21ext4: add missing brelse() in set_flexbg_block_bitmap()'s error pathVasily Averin
commit cea5794122125bf67559906a0762186cf417099c upstream. Fixes: 33afdcc5402d ("ext4: add a function which sets up group blocks ...") Cc: stable@kernel.org # 3.3 Signed-off-by: Vasily Averin <vvs@virtuozzo.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-11-21ext4: add missing brelse() update_backups()'s error pathVasily Averin
commit ea0abbb648452cdb6e1734b702b6330a7448fcf8 upstream. Fixes: ac27a0ec112a ("ext4: initial copy of files from ext3") Signed-off-by: Vasily Averin <vvs@virtuozzo.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu> Cc: stable@kernel.org # 2.6.19 Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-11-13ext4: fix use-after-free race in ext4_remount()'s error pathTheodore Ts'o
commit 33458eaba4dfe778a426df6a19b7aad2ff9f7eec upstream. It's possible for ext4_show_quota_options() to try reading s_qf_names[i] while it is being modified by ext4_remount() --- most notably, in ext4_remount's error path when the original values of the quota file name gets restored. Reported-by: syzbot+a2872d6feea6918008a9@syzkaller.appspotmail.com Signed-off-by: Theodore Ts'o <tytso@mit.edu> Cc: stable@kernel.org # 3.2+ Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-11-13ext4: propagate error from dquot_initialize() in EXT4_IOC_FSSETXATTRWang Shilong
commit 182a79e0c17147d2c2d3990a9a7b6b58a1561c7a upstream. We return most failure of dquota_initialize() except inode evict, this could make a bit sense, for example we allow file removal even quota files are broken? But it dosen't make sense to allow setting project if quota files etc are broken. Signed-off-by: Wang Shilong <wshilong@ddn.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu> Cc: stable@kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-11-13ext4: fix setattr project check in fssetxattr ioctlWang Shilong
commit dc7ac6c4cae3b58724c2f1e21a7c05ce19ecd5a8 upstream. Currently, project quota could be changed by fssetxattr ioctl, and existed permission check inode_owner_or_capable() is obviously not enough, just think that common users could change project id of file, that could make users to break project quota easily. This patch try to follow same regular of xfs project quota: "Project Quota ID state is only allowed to change from within the init namespace. Enforce that restriction only if we are trying to change the quota ID state. Everything else is allowed in user namespaces." Besides that, check and set project id'state should be an atomic operation, protect whole operation with inode lock, ext4_ioctl_setproject() is only used for ioctl EXT4_IOC_FSSETXATTR, we have held mnt_want_write_file() before ext4_ioctl_setflags(), and ext4_ioctl_setproject() is called after ext4_ioctl_setflags(), we could share codes, so remove it inside ext4_ioctl_setproject(). Signed-off-by: Wang Shilong <wshilong@ddn.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu> Reviewed-by: Andreas Dilger <adilger@dilger.ca> Cc: stable@kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-11-13ext4: initialize retries variable in ext4_da_write_inline_data_begin()Lukas Czerner
commit 625ef8a3acd111d5f496d190baf99d1a815bd03e upstream. Variable retries is not initialized in ext4_da_write_inline_data_begin() which can lead to nondeterministic number of retries in case we hit ENOSPC. Initialize retries to zero as we do everywhere else. Signed-off-by: Lukas Czerner <lczerner@redhat.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu> Fixes: bc0ca9df3b2a ("ext4: retry allocation when inline->extent conversion failed") Cc: stable@kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-11-13ext4: fix EXT4_IOC_SWAP_BOOTTheodore Ts'o
commit 18aded17492088962ef43f00825179598b3e8c58 upstream. The code EXT4_IOC_SWAP_BOOT ioctl hasn't been updated in a while, and it's a bit broken with respect to more modern ext4 kernels, especially metadata checksums. Other problems fixed with this commit: * Don't allow installing a DAX, swap file, or an encrypted file as a boot loader. * Respect the immutable and append-only flags. * Wait until any DIO operations are finished *before* calling truncate_inode_pages(). * Don't swap inode->i_flags, since these flags have nothing to do with the inode blocks --- and it will give the IMA/audit code heartburn when the inode is evicted. Signed-off-by: Theodore Ts'o <tytso@mit.edu> Cc: stable@kernel.org Reported-by: syzbot+e81ccd4744c6c4f71354@syzkaller.appspotmail.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-11-13ext4: fix argument checking in EXT4_IOC_MOVE_EXTTheodore Ts'o
[ Upstream commit f18b2b83a727a3db208308057d2c7945f368e625 ] If the starting block number of either the source or destination file exceeds the EOF, EXT4_IOC_MOVE_EXT should return EINVAL. Also fixed the helper function mext_check_coverage() so that if the logical block is beyond EOF, make it return immediately, instead of looping until the block number wraps all the away around. This takes long enough that if there are multiple threads trying to do pound on an the same inode doing non-sensical things, it can end up triggering the kernel's soft lockup detector. Reported-by: syzbot+c61979f6f2cba5cb3c06@syzkaller.appspotmail.com Signed-off-by: Theodore Ts'o <tytso@mit.edu> Cc: stable@kernel.org Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-09-17Merge tag 'ext4_for_linus_stable' of ↵Greg Kroah-Hartman
git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4 Ted writes: Various ext4 bug fixes; primarily making ext4 more robust against maliciously crafted file systems, and some DAX fixes. * tag 'ext4_for_linus_stable' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4: ext4, dax: set ext4_dax_aops for dax files ext4, dax: add ext4_bmap to ext4_dax_aops ext4: don't mark mmp buffer head dirty ext4: show test_dummy_encryption mount option in /proc/mounts ext4: close race between direct IO and ext4_break_layouts() ext4: fix online resizing for bigalloc file systems with a 1k block size ext4: fix online resize's handling of a too-small final block group ext4: recalucate superblock checksum after updating free blocks/inodes ext4: avoid arithemetic overflow that can trigger a BUG ext4: avoid divide by zero fault when deleting corrupted inline directories ext4: check to make sure the rename(2)'s destination is not freed ext4: add nonstring annotations to ext4.h
2018-09-15ext4, dax: set ext4_dax_aops for dax filesToshi Kani
Sync syscall to DAX file needs to flush processor cache, but it currently does not flush to existing DAX files. This is because 'ext4_da_aops' is set to address_space_operations of existing DAX files, instead of 'ext4_dax_aops', since S_DAX flag is set after ext4_set_aops() in the open path. New file -------- lookup_open ext4_create __ext4_new_inode ext4_set_inode_flags // Set S_DAX flag ext4_set_aops // Set aops to ext4_dax_aops Existing file ------------- lookup_open ext4_lookup ext4_iget ext4_set_aops // Set aops to ext4_da_aops ext4_set_inode_flags // Set S_DAX flag Change ext4_iget() to initialize i_flags before ext4_set_aops(). Fixes: 5f0663bb4a64 ("ext4, dax: introduce ext4_dax_aops") Signed-off-by: Toshi Kani <toshi.kani@hpe.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu> Suggested-by: Jan Kara <jack@suse.cz> Cc: stable@vger.kernel.org
2018-09-15ext4, dax: add ext4_bmap to ext4_dax_aopsToshi Kani
Ext4 mount path calls .bmap to the journal inode. This currently works for the DAX mount case because ext4_iget() always set 'ext4_da_aops' to any regular files. In preparation to fix ext4_iget() to set 'ext4_dax_aops' for ext4 DAX files, add ext4_bmap() to 'ext4_dax_aops', since bmap works for DAX inodes. Fixes: 5f0663bb4a64 ("ext4, dax: introduce ext4_dax_aops") Signed-off-by: Toshi Kani <toshi.kani@hpe.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu> Suggested-by: Jan Kara <jack@suse.cz> Cc: stable@vger.kernel.org
2018-09-15ext4: don't mark mmp buffer head dirtyLi Dongyang
Marking mmp bh dirty before writing it will make writeback pick up mmp block later and submit a write, we don't want the duplicate write as kmmpd thread should have full control of reading and writing the mmp block. Another reason is we will also have random I/O error on the writeback request when blk integrity is enabled, because kmmpd could modify the content of the mmp block(e.g. setting new seq and time) while the mmp block is under I/O requested by writeback. Signed-off-by: Li Dongyang <dongyangli@ddn.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu> Reviewed-by: Andreas Dilger <adilger@dilger.ca> Cc: stable@vger.kernel.org
2018-09-15ext4: show test_dummy_encryption mount option in /proc/mountsEric Biggers
When in effect, add "test_dummy_encryption" to _ext4_show_options() so that it is shown in /proc/mounts and other relevant procfs files. Signed-off-by: Eric Biggers <ebiggers@google.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu> Cc: stable@vger.kernel.org
2018-09-11ext4: close race between direct IO and ext4_break_layouts()Ross Zwisler
If the refcount of a page is lowered between the time that it is returned by dax_busy_page() and when the refcount is again checked in ext4_break_layouts() => ___wait_var_event(), the waiting function ext4_wait_dax_page() will never be called. This means that ext4_break_layouts() will still have 'retry' set to false, so we'll stop looping and never check the refcount of other pages in this inode. Instead, always continue looping as long as dax_layout_busy_page() gives us a page which it found with an elevated refcount. Signed-off-by: Ross Zwisler <ross.zwisler@linux.intel.com> Reviewed-by: Jan Kara <jack@suse.cz> Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Theodore Ts'o <tytso@mit.edu> Cc: stable@vger.kernel.org
2018-09-03ext4: fix online resizing for bigalloc file systems with a 1k block sizeTheodore Ts'o
An online resize of a file system with the bigalloc feature enabled and a 1k block size would be refused since ext4_resize_begin() did not understand s_first_data_block is 0 for all bigalloc file systems, even when the block size is 1k. Signed-off-by: Theodore Ts'o <tytso@mit.edu> Cc: stable@vger.kernel.org
2018-09-03ext4: fix online resize's handling of a too-small final block groupTheodore Ts'o
Avoid growing the file system to an extent so that the last block group is too small to hold all of the metadata that must be stored in the block group. This problem can be triggered with the following reproducer: umount /mnt mke2fs -F -m0 -b 4096 -t ext4 -O resize_inode,^has_journal \ -E resize=1073741824 /tmp/foo.img 128M mount /tmp/foo.img /mnt truncate --size 1708M /tmp/foo.img resize2fs /dev/loop0 295400 umount /mnt e2fsck -fy /tmp/foo.img Reported-by: Torsten Hilbrich <torsten.hilbrich@secunet.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu> Cc: stable@vger.kernel.org
2018-09-01ext4: recalucate superblock checksum after updating free blocks/inodesTheodore Ts'o
When mounting the superblock, ext4_fill_super() calculates the free blocks and free inodes and stores them in the superblock. It's not strictly necessary, since we don't use them any more, but it's nice to keep them roughly aligned to reality. Since it's not critical for file system correctness, the code doesn't call ext4_commit_super(). The problem is that it's in ext4_commit_super() that we recalculate the superblock checksum. So if we're not going to call ext4_commit_super(), we need to call ext4_superblock_csum_set() to make sure the superblock checksum is consistent. Most of the time, this doesn't matter, since we end up calling ext4_commit_super() very soon thereafter, and definitely by the time the file system is unmounted. However, it doesn't work in this sequence: mke2fs -Fq -t ext4 /dev/vdc 128M mount /dev/vdc /vdc cp xfstests/git-versions /vdc godown /vdc umount /vdc mount /dev/vdc tune2fs -l /dev/vdc With this commit, the "tune2fs -l" no longer fails. Reported-by: Chengguang Xu <cgxu519@gmx.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu> Cc: stable@vger.kernel.org
2018-09-01ext4: avoid arithemetic overflow that can trigger a BUGTheodore Ts'o
A maliciously crafted file system can cause an overflow when the results of a 64-bit calculation is stored into a 32-bit length parameter. https://bugzilla.kernel.org/show_bug.cgi?id=200623 Signed-off-by: Theodore Ts'o <tytso@mit.edu> Reported-by: Wen Xu <wen.xu@gatech.edu> Cc: stable@vger.kernel.org
2018-08-27ext4: avoid divide by zero fault when deleting corrupted inline directoriesTheodore Ts'o
A specially crafted file system can trick empty_inline_dir() into reading past the last valid entry in a inline directory, and then run into the end of xattr marker. This will trigger a divide by zero fault. Fix this by using the size of the inline directory instead of dir->i_size. Also clean up error reporting in __ext4_check_dir_entry so that the message is clearer and more understandable --- and avoids the division by zero trap if the size passed in is zero. (I'm not sure why we coded it that way in the first place; printing offset % size is actually more confusing and less useful.) https://bugzilla.kernel.org/show_bug.cgi?id=200933 Signed-off-by: Theodore Ts'o <tytso@mit.edu> Reported-by: Wen Xu <wen.xu@gatech.edu> Cc: stable@vger.kernel.org
2018-08-27ext4: check to make sure the rename(2)'s destination is not freedTheodore Ts'o
If the destination of the rename(2) system call exists, the inode's link count (i_nlinks) must be non-zero. If it is, the inode can end up on the orphan list prematurely, leading to all sorts of hilarity, including a use-after-free. https://bugzilla.kernel.org/show_bug.cgi?id=200931 Signed-off-by: Theodore Ts'o <tytso@mit.edu> Reported-by: Wen Xu <wen.xu@gatech.edu> Cc: stable@vger.kernel.org
2018-08-27ext4: add nonstring annotations to ext4.hTheodore Ts'o
This suppresses some false positives in gcc 8's -Wstringop-truncation Suggested by Miguel Ojeda (hopefully the __nonstring definition will eventually get accepted in the compiler-gcc.h header file). Signed-off-by: Theodore Ts'o <tytso@mit.edu> Cc: Miguel Ojeda <miguel.ojeda.sandonis@gmail.com>
2018-08-17ext4: readpages() should submit IO as read-aheadJens Axboe
a_ops->readpages() is only ever used for read-ahead. Ensure that we pass this information down to the block layer. Link: http://lkml.kernel.org/r/20180621010725.17813-5-axboe@kernel.dk Signed-off-by: Jens Axboe <axboe@kernel.dk> Reviewed-by: Andrew Morton <akpm@linux-foundation.org> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Chris Mason <clm@fb.com> Cc: Christoph Hellwig <hch@lst.de> Cc: Theodore Ts'o <tytso@mit.edu> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-08-17dax: remove VM_MIXEDMAP for fsdax and device daxDave Jiang
This patch is reworked from an earlier patch that Dan has posted: https://patchwork.kernel.org/patch/10131727/ VM_MIXEDMAP is used by dax to direct mm paths like vm_normal_page() that the memory page it is dealing with is not typical memory from the linear map. The get_user_pages_fast() path, since it does not resolve the vma, is already using {pte,pmd}_devmap() as a stand-in for VM_MIXEDMAP, so we use that as a VM_MIXEDMAP replacement in some locations. In the cases where there is no pte to consult we fallback to using vma_is_dax() to detect the VM_MIXEDMAP special case. Now that we have explicit driver pfn_t-flag opt-in/opt-out for get_user_pages() support for DAX we can stop setting VM_MIXEDMAP. This also means we no longer need to worry about safely manipulating vm_flags in a future where we support dynamically changing the dax mode of a file. DAX should also now be supported with madvise_behavior(), vma_merge(), and copy_page_range(). This patch has been tested against ndctl unit test. It has also been tested against xfstests commit: 625515d using fake pmem created by memmap and no additional issues have been observed. Link: http://lkml.kernel.org/r/152847720311.55924.16999195879201817653.stgit@djiang5-desk3.ch.intel.com Signed-off-by: Dave Jiang <dave.jiang@intel.com> Acked-by: Dan Williams <dan.j.williams@intel.com> Cc: Jan Kara <jack@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-08-14Merge tag 'for-4.19/block-20180812' of git://git.kernel.dk/linux-blockLinus Torvalds
Pull block updates from Jens Axboe: "First pull request for this merge window, there will also be a followup request with some stragglers. This pull request contains: - Fix for a thundering heard issue in the wbt block code (Anchal Agarwal) - A few NVMe pull requests: * Improved tracepoints (Keith) * Larger inline data support for RDMA (Steve Wise) * RDMA setup/teardown fixes (Sagi) * Effects log suppor for NVMe target (Chaitanya Kulkarni) * Buffered IO suppor for NVMe target (Chaitanya Kulkarni) * TP4004 (ANA) support (Christoph) * Various NVMe fixes - Block io-latency controller support. Much needed support for properly containing block devices. (Josef) - Series improving how we handle sense information on the stack (Kees) - Lightnvm fixes and updates/improvements (Mathias/Javier et al) - Zoned device support for null_blk (Matias) - AIX partition fixes (Mauricio Faria de Oliveira) - DIF checksum code made generic (Max Gurtovoy) - Add support for discard in iostats (Michael Callahan / Tejun) - Set of updates for BFQ (Paolo) - Removal of async write support for bsg (Christoph) - Bio page dirtying and clone fixups (Christoph) - Set of bcache fix/changes (via Coly) - Series improving blk-mq queue setup/teardown speed (Ming) - Series improving merging performance on blk-mq (Ming) - Lots of other fixes and cleanups from a slew of folks" * tag 'for-4.19/block-20180812' of git://git.kernel.dk/linux-block: (190 commits) blkcg: Make blkg_root_lookup() work for queues in bypass mode bcache: fix error setting writeback_rate through sysfs interface null_blk: add lock drop/acquire annotation Blk-throttle: reduce tail io latency when iops limit is enforced block: paride: pd: mark expected switch fall-throughs block: Ensure that a request queue is dissociated from the cgroup controller block: Introduce blk_exit_queue() blkcg: Introduce blkg_root_lookup() block: Remove two superfluous #include directives blk-mq: count the hctx as active before allocating tag block: bvec_nr_vecs() returns value for wrong slab bcache: trivial - remove tailing backslash in macro BTREE_FLAG bcache: make the pr_err statement used for ENOENT only in sysfs_attatch section bcache: set max writeback rate when I/O request is idle bcache: add code comments for bset.c bcache: fix mistaken comments in request.c bcache: fix mistaken code comments in bcache.h bcache: add a comment in super.c bcache: avoid unncessary cache prefetch bch_btree_node_get() bcache: display rate debug parameters to 0 when writeback is not running ...
2018-08-04ext4: remove unneeded variable "err" in ext4_mb_release_inode_pa()zhong jiang
The err is not used after initalization. So just remove the variable. Signed-off-by: zhong jiang <zhongjiang@huawei.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2018-08-02ext4: improve code readability in ext4_iget()Liu Song
Merge the duplicated complex conditions to improve code readability. Signed-off-by: Liu Song <liu.song11@zte.com.cn> Signed-off-by: Theodore Ts'o <tytso@mit.edu> Reviewed-by: Jiang Biao <jiang.biao2@zte.com.cn>
2018-08-02ext4: fix spectre gadget in ext4_mb_regular_allocator()Jeremy Cline
'ac->ac_g_ex.fe_len' is a user-controlled value which is used in the derivation of 'ac->ac_2order'. 'ac->ac_2order', in turn, is used to index arrays which makes it a potential spectre gadget. Fix this by sanitizing the value assigned to 'ac->ac2_order'. This covers the following accesses found with the help of smatch: * fs/ext4/mballoc.c:1896 ext4_mb_simple_scan_group() warn: potential spectre issue 'grp->bb_counters' [w] (local cap) * fs/ext4/mballoc.c:445 mb_find_buddy() warn: potential spectre issue 'EXT4_SB(e4b->bd_sb)->s_mb_offsets' [r] (local cap) * fs/ext4/mballoc.c:446 mb_find_buddy() warn: potential spectre issue 'EXT4_SB(e4b->bd_sb)->s_mb_maxs' [r] (local cap) Suggested-by: Josh Poimboeuf <jpoimboe@redhat.com> Signed-off-by: Jeremy Cline <jcline@redhat.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu> Cc: stable@vger.kernel.org
2018-08-01ext4: check for NUL characters in extended attribute's nameTheodore Ts'o
Extended attribute names are defined to be NUL-terminated, so the name must not contain a NUL character. This is important because there are places when remove extended attribute, the code uses strlen to determine the length of the entry. That should probably be fixed at some point, but code is currently really messy, so the simplest fix for now is to simply validate that the extended attributes are sane. https://bugzilla.kernel.org/show_bug.cgi?id=200401 Reported-by: Wen Xu <wen.xu@gatech.edu> Signed-off-by: Theodore Ts'o <tytso@mit.edu> Cc: stable@vger.kernel.org
2018-08-01ext4: use ext4_warning() for sb_getblk failureWang Shilong
Out of memory should not be considered as critical errors; so replace ext4_error() with ext4_warnig(). Signed-off-by: Wang Shilong <wshilong@ddn.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu> Cc: stable@vger.kernel.org
2018-07-29ext4: fix race when setting the bitmap corrupted flagWang Shilong
Whenever we hit block or inode bitmap corruptions we set bit and then reduce this block group free inode/clusters counter to expose right available space. However some of ext4_mark_group_bitmap_corrupted() is called inside group spinlock, some are not, this could make it happen that we double reduce one block group free counters from system. Always hold group spinlock for it could fix it, but it looks a little heavy, we could use test_and_set_bit() to fix race problems here. Signed-off-by: Wang Shilong <wshilong@ddn.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu> Cc: stable@vger.kernel.org
2018-07-29ext4: reset error code in ext4_find_entry in fallbackEric Sandeen
When ext4_find_entry() falls back to "searching the old fashioned way" due to a corrupt dx dir, it needs to reset the error code to NULL so that the nonstandard ERR_BAD_DX_DIR code isn't returned to userspace. https://bugzilla.kernel.org/show_bug.cgi?id=199947 Reported-by: Anatoly Trosinenko <anatoly.trosinenko@yandex.com> Reviewed-by: Andreas Dilger <adilger@dilger.ca> Signed-off-by: Eric Sandeen <sandeen@redhat.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu> Cc: stable@vger.kernel.org
2018-07-29ext4: handle layout changes to pinned DAX mappingsRoss Zwisler
Follow the lead of xfs_break_dax_layouts() and add synchronization between operations in ext4 which remove blocks from an inode (hole punch, truncate down, etc.) and pages which are pinned due to DAX DMA operations. Signed-off-by: Ross Zwisler <ross.zwisler@linux.intel.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu> Reviewed-by: Jan Kara <jack@suse.cz> Reviewed-by: Lukas Czerner <lczerner@redhat.com>
2018-07-29ext4: use swap macro in mext_page_double_lockGustavo A. R. Silva
Make use of the swap macro and remove unnecessary variable *tmp*. This makes the code easier to read and maintain. This code was detected with the help of Coccinelle. Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2018-07-29ext4: check allocation failure when duplicating "data" in ext4_remount()Chengguang Xu
There is no check for allocation failure when duplicating "data" in ext4_remount(). Check for failure and return error -ENOMEM in this case. Signed-off-by: Chengguang Xu <cgxu519@gmx.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu> Reviewed-by: Andreas Dilger <adilger@dilger.ca>
2018-07-29ext4: fix warning message in ext4_enable_quotas()Junichi Uekawa
Output the warning message before we clobber type and be -1 all the time. The error message would now be [ 1.519791] EXT4-fs warning (device vdb): ext4_enable_quotas:5402: Failed to enable quota tracking (type=0, err=-3). Please run e2fsck to fix. Signed-off-by: Junichi Uekawa <uekawa@google.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu> Reviewed-by: Andreas Dilger <adilger@dilger.ca>