summaryrefslogtreecommitdiff
path: root/fs
AgeCommit message (Collapse)Author
2021-01-31xfs: remove return value from xchk_ag_btcur_initscrub-fixes-5.12_2021-01-31Darrick J. Wong
Functions called by this function cannot fail, so get rid of the return and error checking. Signed-off-by: Darrick J. Wong <djwong@kernel.org>
2021-01-31xfs: set the scrub AG number in xchk_ag_read_headersDarrick J. Wong
Since xchk_ag_read_headers initializes fields in struct xchk_ag, we might as well set the AG number and save the callers the trouble. Signed-off-by: Darrick J. Wong <djwong@kernel.org>
2021-01-31xfs: validate ag btree levels using the precomputed valuesDarrick J. Wong
Use the AG btree height limits that we precomputed into the xfs_mount to validate the AG headers instead of using XFS_BTREE_MAXLEVELS. Signed-off-by: Darrick J. Wong <djwong@kernel.org>
2021-01-31xfs: mark a data structure sick if there are cross-referencing errorsDarrick J. Wong
If scrub observes cross-referencing errors while scanning a data structure, mark the data structure sick. There's /something/ inconsistent, even if we can't really tell what it is. Fixes: 4860a05d2475 ("xfs: scrub/repair should update filesystem metadata health") Signed-off-by: Darrick J. Wong <djwong@kernel.org>
2021-01-31xfs: bail out of scrub immediately if scan incompleteDarrick J. Wong
If a scrubber cannot complete its check and signals an incomplete check, we must bail out immediately without updating health status, trying a repair, etc. because our scan is incomplete and we therefore do not know much more. Signed-off-by: Darrick J. Wong <djwong@kernel.org>
2021-01-31xfs: fix uninitialized variables in xrep_calc_ag_resblksDarrick J. Wong
If we can't read the AGF header, we never actually set a value for freelen and usedlen. These two variables are used to make the worst case estimate of btree size, so it's safe to set them to the AG size as a fallback. Signed-off-by: Darrick J. Wong <djwong@kernel.org>
2021-01-31xfs: fix rmap key comparison functionsDarrick J. Wong
Keys for extent interval records in the reverse mapping btree are supposed to be computed as follows: (physical block, owner, fork, is_btree, offset) This provides users the ability to look up a reverse mapping from a file block mapping record -- start with the physical block; then if there are multiple records for the same block, move on to the owner; then the inode fork type; and so on to the file offset. However, the key comparison functions incorrectly remove the fork/bmbt information that's encoded in the on-disk offset. This means that lookup comparisons are only done with: (physical block, owner, offset) This means that queries can return incorrect results. On consistent filesystems this isn't an issue because bmbt blocks and blocks mapped to an attr fork cannot be shared, but this prevents us from detecting incorrect fork and bmbt flag bits in the rmap btree. A previous version of this patch forgot to keep the (un)written state flag masked during the comparison and caused a major regression in 5.9.x since unwritten extent conversion can update an rmap record without requiring key updates. Note that blocks cannot go directly from data fork to attr fork without being deallocated and reallocated, nor can they be added to or removed from a bmbt without a free/alloc cycle, so this should not cause any regressions. Found by fuzzing keys[1].attrfork = ones on xfs/371. Fixes: 4b8ed67794fe ("xfs: add rmap btree operations") Signed-off-by: Darrick J. Wong <djwong@kernel.org>
2021-01-31xfs: don't bounce the iolock between free_{eof,cow}blockseofblocks-consolidation-5.12_2021-01-31Darrick J. Wong
Since xfs_inode_free_eofblocks and xfs_inode_free_cowblocks are now internal static functions, we can save ourselves a cycling of the iolock by passing the lock state out to xfs_blockgc_scan_inode and letting it do all the unlocking. Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de>
2021-01-31xfs: expose the blockgc workqueue knobs publiclyDarrick J. Wong
Expose the workqueue sysfs knobs for the speculative preallocation gc workers on all kernels, and update the sysadmin information. Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de>
2021-01-31xfs: parallelize block preallocation garbage collectionDarrick J. Wong
Split the block preallocation garbage collection work into per-AG work items so that we can take advantage of parallelization. Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de>
2021-01-31xfs: rename block gc start and stop functionsDarrick J. Wong
Shorten the names of the two functions that start and stop block preallocation garbage collection and move them up to the other blockgc functions. Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de>
2021-01-31xfs: only walk the incore inode tree once per blockgc scanDarrick J. Wong
Perform background block preallocation gc scans more efficiently by walking the incore inode tree once. Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de>
2021-01-31xfs: consolidate the eofblocks and cowblocks workersDarrick J. Wong
Remove the separate cowblocks work items and knob so that we can control and run everything from a single blockgc work queue. Note that the speculative_prealloc_lifetime sysfs knob retains its historical name even though the functions move to prefix xfs_blockgc_*. Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de>
2021-01-31xfs: consolidate incore inode radix tree posteof/cowblocks tagsDarrick J. Wong
The clearing of posteof blocks and cowblocks serve the same purpose: removing speculative block preallocations from inactive files. We don't need to burn two radix tree tags on this, so combine them into one. Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de>
2021-01-31xfs: remove trivial eof/cowblocks functionsDarrick J. Wong
Get rid of these trivial helpers. Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de>
2021-01-31xfs: hide xfs_icache_free_cowblocksDarrick J. Wong
Change the one remaining caller of xfs_icache_free_cowblocks to use our new combined blockgc scan function instead, since we will soon be combining the two scans. This introduces a slight behavior change, since a readonly remount now clears out post-EOF preallocations and not just CoW staging extents. Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de>
2021-01-31xfs: hide xfs_icache_free_eofblocksDarrick J. Wong
Change the one remaining caller of xfs_icache_free_eofblocks to use our new combined blockgc scan function instead, since we will soon be combining the two scans. This introduces a slight behavior change, since the XFS_IOC_FREE_EOFBLOCKS now clears out speculative CoW reservations in addition to post-eof blocks. Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de>
2021-01-31xfs: relocate the eofb/cowb workqueue functionsDarrick J. Wong
Move the xfs_{eof,cow}blocks_worker and xfs_queue_{eof,cow}blocks functions further down in the file so that the cleanups in the next patches won't have to pre-declare static functions. No functional changes. Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de>
2021-01-31xfs: set WQ_SYSFS on all workqueues in debug modeworkqueue-speedups-5.12_2021-01-31Darrick J. Wong
When CONFIG_XFS_DEBUG=y, set WQ_SYSFS on all workqueues that we create so that we (developers) have a means to monitor cpu affinity and whatnot for background workers. In the next patchset we'll expose knobs for more of the workqueues publicly and document it, but not now. Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Brian Foster <bfoster@redhat.com>
2021-01-31xfs: increase the default parallelism levels of pwork clientsDarrick J. Wong
Increase the parallelism level for pwork clients to the workqueue defaults so that we can take advantage of computers with a lot of CPUs and a lot of hardware. On fast systems this will speed up quotacheck by a large factor, and the following posteof/cowblocks cleanup series will use the functionality presented in this patch to run garbage collection as quickly as possible. We do this by switching the pwork workqueue to unbounded, since the current user (quotacheck) runs lengthy scans for each work item and we don't care about dispatching the work on a warm cpu cache or anything like that. Also set WQ_SYSFS so that we can monitor where the wq is running. Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Brian Foster <bfoster@redhat.com>
2021-01-31xfs: flush speculative space allocations when we run out of spacereclaim-space-harder-5.12_2021-01-31Darrick J. Wong
If a fs modification (creation, file write, reflink, etc.) is unable to reserve enough space to handle the modification, try clearing whatever space the filesystem might have been hanging onto in the hopes of speeding up the filesystem. The flushing behavior will become particularly important when we add deferred inode inactivation because that will increase the amount of space that isn't actively tied to user data. Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de>
2021-01-31xfs: refactor xfs_icache_free_{eof,cow}blocks call sitesDarrick J. Wong
In anticipation of more restructuring of the eof/cowblocks gc code, refactor calling of those two functions into a single internal helper function, then present a new standard interface to purge speculative block preallocations and start shifting higher level code to use that. Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Brian Foster <bfoster@redhat.com>
2021-01-31xfs: add a tracepoint for blockgc scansDarrick J. Wong
Add some tracepoints so that we can observe when the speculative preallocation garbage collector runs. Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Brian Foster <bfoster@redhat.com>
2021-01-31xfs: flush eof/cowblocks if we can't reserve quota for chownDarrick J. Wong
If a file user, group, or project change is unable to reserve enough quota to handle the modification, try clearing whatever space the filesystem might have been hanging onto in the hopes of speeding up the filesystem. The flushing behavior will become particularly important when we add deferred inode inactivation because that will increase the amount of space that isn't actively tied to user data. Signed-off-by: Darrick J. Wong <djwong@kernel.org>
2021-01-31xfs: flush eof/cowblocks if we can't reserve quota for inode creationDarrick J. Wong
If an inode creation is unable to reserve enough quota to handle the modification, try clearing whatever space the filesystem might have been hanging onto in the hopes of speeding up the filesystem. The flushing behavior will become particularly important when we add deferred inode inactivation because that will increase the amount of space that isn't actively tied to user data. Signed-off-by: Darrick J. Wong <djwong@kernel.org>
2021-01-31xfs: flush eof/cowblocks if we can't reserve quota for file blocksDarrick J. Wong
If a fs modification (data write, reflink, xattr set, fallocate, etc.) is unable to reserve enough quota to handle the modification, try clearing whatever space the filesystem might have been hanging onto in the hopes of speeding up the filesystem. The flushing behavior will become particularly important when we add deferred inode inactivation because that will increase the amount of space that isn't actively tied to user data. Signed-off-by: Darrick J. Wong <djwong@kernel.org>
2021-01-31xfs: try worst case space reservation upfront in xfs_reflink_remap_extentDarrick J. Wong
Now that we've converted xfs_reflink_remap_extent to use the new xfs_trans_alloc_inode API, we can focus on its slightly unusual behavior with regard to quota reservations. Since it's valid to remap written blocks into a hole, we must be able to increase the quota count by the number of blocks in the mapping. However, the incore space reservation process requires us to supply an asymptotic guess before we can gain exclusive access to resources. We'd like to reserve all the quota we need up front, but we also don't want to fail a written -> allocated remap operation unnecessarily. The solution is to make the remap_extents function call the transaction allocation function twice. The first time we ask to reserve enough space and quota to handle the absolute worst case situation, but if that fails, we can fall back to the old strategy: ask for the bare minimum space reservation upfront and increase the quota reservation later if we need to. Later in this patchset we change the transaction and quota code to try to reclaim space if we cannot reserve free space or quota. Restructuring the remap_extent function in this manner means that if the fallback increase fails, we can pass that back to the caller knowing that the transaction allocation already tried freeing space. Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Brian Foster <bfoster@redhat.com>
2021-01-31xfs: pass flags and return gc errors from xfs_blockgc_free_quotaDarrick J. Wong
Change the signature of xfs_blockgc_free_quota in preparation for the next few patches. Callers can now pass EOF_FLAGS into the function to control scan parameters; and the function will now pass back any corruption errors seen while scanning, though for our retry loops we'll just try again unconditionally. Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Brian Foster <bfoster@redhat.com>
2021-01-31xfs: move and rename xfs_inode_free_quota_blocks to avoid conflictsDarrick J. Wong
Move this function further down in the file so that later cleanups won't have to declare static functions. Change the name because we're about to rework all the code that performs garbage collection of speculatively allocated file blocks. No functional changes. Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Brian Foster <bfoster@redhat.com>
2021-01-31xfs: xfs_inode_free_quota_blocks should scan project quotaDarrick J. Wong
Buffered writers who have run out of quota reservation call xfs_inode_free_quota_blocks to try to free any space reservations that might reduce the quota usage. Unfortunately, the buffered write path treats "out of project quota" the same as "out of overall space" so this function has never supported scanning for space that might ease an "out of project quota" condition. We're about to start using this function for cases where we actually /can/ tell if we're out of project quota, so add in this functionality. Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Brian Foster <bfoster@redhat.com>
2021-01-31xfs: don't stall cowblocks scan if we can't take locksDarrick J. Wong
Don't stall the cowblocks scan on a locked inode if we possibly can. We'd much rather the background scanner keep moving. Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Brian Foster <bfoster@redhat.com>
2021-01-31xfs: trigger all block gc scans when low on quota spaceDarrick J. Wong
The functions to run an eof/cowblocks scan to try to reduce quota usage are kind of a mess -- the logic repeatedly initializes an eofb structure and there are logic bugs in the code that result in the cowblocks scan never actually happening. Replace all three functions with a single function that fills out an eofb and runs both eof and cowblocks scans. Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Brian Foster <bfoster@redhat.com>
2021-01-31xfs: shut down the filesystem if we screw up quota errorsquota-function-cleanups-5.12_2021-01-31Darrick J. Wong
If we ever screw up the quota reservations enough to trip the assertions, something's wrong with the quota code. Shut down the filesystem when this happens, because this is corruption. Signed-off-by: Darrick J. Wong <djwong@kernel.org>
2021-01-31xfs: refactor inode creation transaction/inode/quota allocation idiomDarrick J. Wong
For file ownership changes, create a new helper xfs_trans_alloc_ichange that allocates a transaction and reserves the appropriate amount of quota against that transction in preparation for a change of user, group, or project id. Replace all the open-coded idioms with a single call to this helper so that we can contain the retry loops in the next patchset. This changes the locking behavior for ichange transactions slightly. Since tr_ichange does not have a permanent reservation and cannot roll, we pass XFS_ILOCK_EXCL to ijoin so that the inode will be unlocked automatically at commit time. Signed-off-by: Darrick J. Wong <djwong@kernel.org>
2021-01-31xfs: rename code to error in xfs_ioctl_setattrDarrick J. Wong
Rename the 'code' variable to 'error' to follow the naming convention of most other functions in xfs. Signed-off-by: Darrick J. Wong <djwong@kernel.org>
2021-01-31xfs: clean up xfs_trans_reserve_quota_chown a bitDarrick J. Wong
Clean up the calling conventions of xfs_trans_reserve_quota_chown a little bit -- we can pass in a boolean force parameter since that's the only qmopt that caller care about, and make the obvious deficiencies more obvious for anyone who someday wants to clean up rt quota. Signed-off-by: Darrick J. Wong <djwong@kernel.org>
2021-01-31xfs: move xfs_qm_vop_chown_reserve to xfs_trans_dquot.cDarrick J. Wong
Move xfs_qm_vop_chown_reserve to xfs_trans_dquot.c and rename it xfs_trans_reserve_quota_chown. This will enable us to share code with the other quota reservation helpers, which will be very useful in the next patchset when we implement retry loops. No functional changes here, we're just moving code. Signed-off-by: Darrick J. Wong <djwong@kernel.org>
2021-01-31xfs: refactor inode creation transaction/inode/quota allocation idiomDarrick J. Wong
For file creation, create a new helper xfs_trans_alloc_icreate that allocates a transaction and reserves the appropriate amount of quota against that transction. Replace all the open-coded idioms with a single call to this helper so that we can contain the retry loops in the next patchset. This changes the locking behavior for non-tempfile creation slightly, in that we now make the quota reservation without holding the directory ILOCK. While the dquots chosen for inode creation are based on the directory state at a given point in time, the directory ILOCK was released as soon as the dquot references are picked up. Hence it was never necessary to hold the directory ILOCK for the quota reservation. Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de>
2021-01-31xfs: refactor reflink functions to use xfs_trans_alloc_inodeDarrick J. Wong
The two remaining callers of xfs_trans_reserve_quota_nblks are in the reflink code. These conversions aren't as uniform as the previous conversions, so call that out in a separate patch. Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de>
2021-01-31xfs: allow reservation of rtblocks with xfs_trans_alloc_inodeDarrick J. Wong
Make it so that we can reserve rt blocks with the xfs_trans_alloc_inode wrapper function, then convert a few more callsites. Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Brian Foster <bfoster@redhat.com>
2021-01-31xfs: refactor common transaction/inode/quota allocation idiomDarrick J. Wong
Create a new helper xfs_trans_alloc_inode that allocates a transaction, locks and joins an inode to it, and then reserves the appropriate amount of quota against that transction. Then replace all the open-coded idioms with a single call to this helper. Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Brian Foster <bfoster@redhat.com>
2021-01-31xfs: reserve data and rt quota at the same timeDarrick J. Wong
Modify xfs_trans_reserve_quota_nblks so that we can reserve data and realtime blocks from the dquot at the same time. This change has the theoretical side effect that for allocations to realtime files we will reserve from the dquot both the number of rtblocks being allocated and the number of bmbt blocks that might be needed to add the mapping. However, since the mount code disables quota if it finds a realtime device, this should not result in any behavior changes. Now that we've moved the inode creation callers away from using the _nblks function, we can repurpose the (now unused) ninos argument for realtime blocks, so make that change. This also replaces the flags argument with a boolean parameter to force the reservation since we don't need to distinguish between data and rt quota reservations any more, and the only flag being passed in was FORCE_RES. Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Brian Foster <bfoster@redhat.com>
2021-01-31xfs: reduce quota reservation when doing a dax unwritten extent conversionDarrick J. Wong
In commit 3b0fe47805802, we reduced the free space requirement to perform a pre-write unwritten extent conversion on an S_DAX file. Since we're not actually allocating any space, the logic goes, we only need enough reservation to handle shape changes in the bmbt. The same logic should have been applied to quota -- we're not allocating any space, so we only need to reserve enough quota to handle the bmbt shape changes. Fixes: 3b0fe4780580 ("xfs: Don't use reserved blocks for data blocks with DAX") Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Brian Foster <bfoster@redhat.com>
2021-01-31xfs: fix up build warnings when quotas are disabledDarrick J. Wong
Fix some build warnings on gcc 10.2 when quotas are disabled. Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Brian Foster <bfoster@redhat.com>
2021-01-31xfs: clean up icreate quota reservation callsDarrick J. Wong
Create a proper helper so that inode creation calls can reserve quota with a dedicated function. Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Brian Foster <bfoster@redhat.com>
2021-01-31xfs: remove xfs_trans_unreserve_quota_nblks completelyDarrick J. Wong
xfs_trans_cancel will release all the quota resources that were reserved on behalf of the transaction, so get rid of the explicit unreserve step. Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Brian Foster <bfoster@redhat.com>
2021-01-31xfs: create convenience wrappers for incore quota block reservationsDarrick J. Wong
Create a couple of convenience wrappers for creating and deleting quota block reservations against future changes. Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Brian Foster <bfoster@redhat.com>
2021-01-31xfs: clean up quota reservation callsitesDarrick J. Wong
Convert a few xfs_trans_*reserve* callsites that are open-coding other convenience functions. Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Brian Foster <bfoster@redhat.com>
2021-01-31xfs: fix xquota chown transactionality wrt delalloc blocksDarrick J. Wong
While refactoring the quota code to create a function to allocate inode change transactions, I noticed that xfs_qm_vop_chown_reserve does more than just make reservations: it also *modifies* the incore counts directly to handle the owner id change for the delalloc blocks. I then observed that the fssetxattr code continues validating input arguments after making the quota reservation but before dirtying the transaction. If the routine decides to error out, it fails to undo the accounting switch! This leads to incorrect quota reservation and failure down the line. We can fix this by making the reservation function do only that -- for the new dquot, it reserves ondisk and delalloc blocks to the transaction, and the old dquot hangs on to its incore reservation for now. Once we actually switch the dquots, we can then update the incore reservations because we've dirtied the transaction and it's too late to turn back now. No fixes tag because this has been broken since the start of git. Signed-off-by: Darrick J. Wong <djwong@kernel.org>
2021-01-27xfs: Fix 'set but not used' warning in xfs_bmap_compute_alignments()xfs-5.12-merge_2021-01-31Chandan Babu R
With both CONFIG_XFS_DEBUG and CONFIG_XFS_WARN disabled, the only reference to local variable "error" in xfs_bmap_compute_alignments() gets eliminated during pre-processing stage of the compilation process. This causes the compiler to generate a "set but not used" warning. Reported-by: kernel test robot <lkp@intel.com> Signed-off-by: Chandan Babu R <chandanrlinux@gmail.com> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Brian Foster <bfoster@redhat.com>