Age | Commit message | Author
2021-09-17 | xfs: widen btree maxlevels computation to handle 64-bit record counts [btree-cleanups_2021-09-17] | Darrick J. Wong
Rework xfs_btree_compute_maxlevels to handle larger record counts, since we're about to add support for very large indices for the realtime rmap btree. Signed-off-by: Darrick J. Wong <djwong@kernel.org>
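The idea of a worst-case height computation over a 64-bit record count can be sketched as follows. This is a minimal illustration, not the kernel's xfs_btree_compute_maxlevels: the function name `btree_max_height` and its parameters are hypothetical, and it assumes every block holds only its minimum number of entries.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: worst-case height of a btree holding `records`
 * leaf records when every leaf holds only `leaf_min` records and every
 * node holds only `node_min` keys/pointers.
 */
static unsigned int
btree_max_height(unsigned int leaf_min, unsigned int node_min,
		 uint64_t records)
{
	uint64_t blocks;
	unsigned int height = 1;

	/* node_min must be at least 2 or the loop below never converges. */
	assert(leaf_min >= 1 && node_min >= 2);

	/* Leaf blocks needed, rounded up without risking 64-bit overflow. */
	blocks = records / leaf_min + (records % leaf_min != 0);

	/* Add node levels until a single block can act as the root. */
	while (blocks > 1) {
		blocks = blocks / node_min + (blocks % node_min != 0);
		height++;
	}
	return height;
}
```

Keeping the block count in a `uint64_t` and rounding up with a division plus remainder test (rather than adding `leaf_min - 1` first) lets the sketch accept any record count up to `UINT64_MAX`.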
2021-09-17 | xfs: consolidate btree block allocation tracepoints | Darrick J. Wong
Don't waste tracepoint segment memory on per-btree block allocation tracepoints when we can do it from the generic btree code. Signed-off-by: Darrick J. Wong <djwong@kernel.org>
2021-09-17 | xfs: consolidate btree block freeing tracepoints | Darrick J. Wong
Don't waste tracepoint segment memory on per-btree block freeing tracepoints when we can do it from the generic btree code. Signed-off-by: Darrick J. Wong <djwong@kernel.org>
2021-09-17 | xfs: fix confusing xfs_extent_item variable names [realtime-extfree-intents_2021-09-17] | Darrick J. Wong
Change the name of all pointers to xfs_extent_item structures to "xefi" to make the name consistent and because the current selections ("new" and "free") mean other things in C. Signed-off-by: Darrick J. Wong <djwong@kernel.org>
2021-09-17 | xfs: support error injection when freeing rt extents | Darrick J. Wong
A handful of fstests expect to be able to test what happens when extent free intents fail to actually free the extent. Now that we're supporting EFIs for realtime extents, add to xfs_rtfree_extent the same injection point that exists in the regular extent freeing code. Signed-off-by: Darrick J. Wong <djwong@kernel.org>
2021-09-17 | xfs: support recovering extent-free intent items targetting realtime extents | Darrick J. Wong
Now that we have reflink on the realtime device, extent-free intent items have to support remapping extents on the realtime volume. Signed-off-by: Darrick J. Wong <djwong@kernel.org>
2021-09-17 | xfs: support logging EFIs for realtime extents | Darrick J. Wong
Teach the EFI mechanism how to free realtime extents. We do this very sneakily, by using the upper bit of the length field in the log format (and a boolean flag incore) to convey the realtime status. We're going to need this to enforce proper ordering of operations when we enable realtime rmap. Signed-off-by: Darrick J. Wong <djwong@kernel.org>
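The trick of carrying a status bit in the top bit of a length field can be sketched like this. The structure, the 32-bit field width, and all names (`logged_extent`, `EXTENT_LEN_REALTIME`, and the helpers) are assumptions made for illustration; they are not the actual EFI log format definitions.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical flag: top bit of the logged length marks a realtime extent. */
#define EXTENT_LEN_REALTIME	((uint32_t)1 << 31)

/* Hypothetical stand-in for one logged extent. */
struct logged_extent {
	uint64_t	start;
	uint32_t	len;	/* low 31 bits: length; top bit: realtime */
};

/* Pack the incore boolean into the top bit of the logged length. */
static struct logged_extent
extent_encode(uint64_t start, uint32_t len, bool realtime)
{
	struct logged_extent ext = {
		.start	= start,
		.len	= len & ~EXTENT_LEN_REALTIME,
	};

	if (realtime)
		ext.len |= EXTENT_LEN_REALTIME;
	return ext;
}

/* Recover the realtime status from a logged extent. */
static bool
extent_is_realtime(const struct logged_extent *ext)
{
	return (ext->len & EXTENT_LEN_REALTIME) != 0;
}

/* Recover the real length, with the flag bit masked off. */
static uint32_t
extent_length(const struct logged_extent *ext)
{
	return ext->len & ~EXTENT_LEN_REALTIME;
}
```

The cost of this encoding is that the usable length range shrinks by one bit, so any reader of the field has to mask the flag before treating the value as a length.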
2021-09-17 | xfs: convert "skip_discard" to a proper flags bitset [extfree-intent-cleanups_2021-09-17] | Darrick J. Wong
Convert the boolean to skip discard on free into a proper flags field so that we can add more flags in the next patch. Signed-off-by: Darrick J. Wong <djwong@kernel.org>
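A boolean-to-bitset conversion of this kind might look roughly like the sketch below. The flag names and helper functions are hypothetical and chosen only for illustration; the second flag stands in for whatever a later patch might add.

```c
#include <stdbool.h>

/* Hypothetical flag names; each behaviour gets its own bit. */
#define FREE_EXTENT_SKIP_DISCARD	(1U << 0)	/* was: bool skip_discard */
#define FREE_EXTENT_REALTIME		(1U << 1)	/* example of a later addition */

/* Mask of every flag this sketch knows about. */
#define FREE_EXTENT_ALL_FLAGS \
	(FREE_EXTENT_SKIP_DISCARD | FREE_EXTENT_REALTIME)

/* Reject callers that pass bits we do not understand. */
static bool
free_extent_flags_valid(unsigned int flags)
{
	return (flags & ~FREE_EXTENT_ALL_FLAGS) == 0;
}

/* Equivalent of testing the old boolean. */
static bool
free_extent_skips_discard(unsigned int flags)
{
	return (flags & FREE_EXTENT_SKIP_DISCARD) != 0;
}
```

With a flags word, adding a new behaviour means defining one more bit instead of widening every function signature with another boolean parameter.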
2021-09-17 | xfs: clean up extent free log intent item tracepoint callsites | Darrick J. Wong
Pass the incore EFI structure to the tracepoints instead of open-coding the argument passing, and augment the tracepoints to tell us which operation we're selecting to match the other intent item tracepoints. Signed-off-by: Darrick J. Wong <djwong@kernel.org>
2021-09-17 | xfs: pass xfs_extent_free_item directly through the log intent code | Darrick J. Wong
Pass the incore xfs_extent_free_item through the EFI logging code instead of repeatedly boxing and unboxing parameters. We'll clean up the tracepoints shortly. Signed-off-by: Darrick J. Wong <djwong@kernel.org>
2021-09-17 | xfs: rename xfs_bmap_add_free to xfs_free_extent_later | Darrick J. Wong
xfs_bmap_add_free isn't a block mapping function; it schedules deferred freeing operations for a later point in a compound transaction chain. While it's primarily used by bunmapi, its use has expanded beyond that. Move it to xfs_alloc.c and rename the function since it's now general freeing functionality. Signed-off-by: Darrick J. Wong <djwong@kernel.org>
2021-09-17 | xfs: allow inode-based btrees to reserve space in the data device [reserve-rt-metadata-space_2021-09-17] | Darrick J. Wong
Create a new space reservation scheme so that btree metadata for the realtime volume can reserve space in the data device to avoid space underruns. Signed-off-by: Darrick J. Wong <djwong@kernel.org>
2021-09-17 | xfs: simplify xfs_ag_resv_free signature | Darrick J. Wong
It's not possible to fail at increasing fdblocks, so get rid of all the error returns here. Signed-off-by: Darrick J. Wong <djwong@kernel.org>
2021-09-17 | xfs: remove XFS_ILOCK_RT* | Darrick J. Wong
Now that we've centralized the realtime metadata locking routines, get rid of the ILOCK subclasses since we now use explicit lockdep classes. Signed-off-by: Darrick J. Wong <djwong@kernel.org>
2021-09-17 | xfs: refactor realtime inode locking [refactor-rt-locking_2021-09-17] | Darrick J. Wong
Refactor realtime metadata inode locking so that we can get some sense here. Signed-off-by: Darrick J. Wong <djwong@kernel.org>
2021-09-17 | xfs: use separate lock classes for realtime metadata inode ILOCKs | Darrick J. Wong
Realtime metadata files are not quite regular files because userspace can't access the realtime bitmap directly, and because we take the ILOCK of the rt bitmap file while holding the ILOCK of a realtime file. The double nature of inodes confuses lockdep, so up until now we've created lockdep subclasses to help lockdep keep things straight. We've gotten away with using lockdep subclasses because there's only two rt metadata files, but with the coming addition of realtime rmap and refcounting, we'd need two more subclasses, which is a lot of class bits to burn on a side feature. Therefore, switch to manually setting the lockdep class of the rt metadata ILOCKs. In the next patch we'll remove the rt-related ILOCK subclasses. Signed-off-by: Darrick J. Wong <djwong@kernel.org>
2021-09-17 | xfs: refactor realtime scrubbing context management | Darrick J. Wong
Create a pair of helpers to deal with setting up the necessary incore context to check metadata records against the realtime metadata. Right now this is limited to locking the realtime bitmap and summary inodes, but as we add rmap and reflink to the realtime device this will grow to include btree cursors. Signed-off-by: Darrick J. Wong <djwong@kernel.org>
2021-09-17 | xfs: kill XFS_BTREE_MAXLEVELS [btree-dynamic-depth_2021-09-17] | Darrick J. Wong
Nobody uses this symbol anymore, so kill it. Signed-off-by: Darrick J. Wong <djwong@kernel.org>
2021-09-17 | xfs: compute the maximum height of the rmap btree when reflink enabled | Darrick J. Wong
Instead of assuming that the hardcoded XFS_BTREE_MAXLEVELS value is big enough to handle the maximally tall rmap btree when all blocks are in use and maximally shared, let's compute the maximum height assuming the rmapbt consumes as many blocks as possible. Signed-off-by: Darrick J. Wong <djwong@kernel.org>
2021-09-17 | xfs: compute actual maximum btree height for critical reservation calculation | Darrick J. Wong
Compute the actual maximum btree height when deciding if per-AG block reservation is critically low. This only affects the sanity check condition, since we /generally/ will trigger on the 10% threshold. This is a long-winded way of saying that we're removing one more usage of XFS_BTREE_MAXLEVELS. Signed-off-by: Darrick J. Wong <djwong@kernel.org>
2021-09-17 | xfs: dynamically allocate cursors based on maxlevels | Darrick J. Wong
Replace the statically-sized btree cursor zone with dynamically sized allocations so that we can reduce the memory overhead for per-AG bt cursors while handling very tall btrees for rt metadata. Signed-off-by: Darrick J. Wong <djwong@kernel.org>
2021-09-17 | xfs: encode the max btree height in the cursor | Darrick J. Wong
Encode the maximum btree height in the cursor, since we're soon going to allow smaller cursors for AG btrees and larger cursors for file btrees. Signed-off-by: Darrick J. Wong <djwong@kernel.org>
2021-09-17 | xfs: fix maxlevels comparisons in the btree staging code | Darrick J. Wong
The btree geometry computation function has an off-by-one error in that it does not allow maximally tall btrees (nlevels == XFS_BTREE_MAXLEVELS). This can result in repairs failing unnecessarily on very fragmented filesystems. Subsequent patches to remove MAXLEVELS usage in favor of the per-btree type computations will make this a much more likely occurrence. Signed-off-by: Darrick J. Wong <djwong@kernel.org>
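The kind of off-by-one described here can be illustrated with a pair of height checks. This is only a sketch of the general pattern; the function names are hypothetical and the exact form of the comparison in the staging code is not shown in this log.

```c
#include <stdbool.h>

/*
 * Buggy pattern: treats a tree whose height equals the maximum as too
 * tall, so a maximally tall (but legal) btree is rejected.
 */
static bool
height_allowed_buggy(unsigned int nlevels, unsigned int maxlevels)
{
	return nlevels < maxlevels;
}

/* Fixed pattern: a tree of exactly maxlevels levels is still valid. */
static bool
height_allowed(unsigned int nlevels, unsigned int maxlevels)
{
	return nlevels <= maxlevels;
}
```

The two functions differ only at `nlevels == maxlevels`, which is why the problem becomes more visible once the limit is a tight per-btree value rather than a generous global constant.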
2021-09-17 | xfs: refactor btree cursor allocation function | Darrick J. Wong
Refactor btree allocation to a common helper. Signed-off-by: Darrick J. Wong <djwong@kernel.org>
2021-09-17 | xfs: support dynamic btree cursor heights | Darrick J. Wong
Split out the btree level information into a separate struct and put it at the end of the cursor structure as a VLA. The realtime rmap btree (which is rooted in an inode) will require the ability to support many more levels than a per-AG btree cursor, which means that we're going to create two btree cursor caches to conserve memory for the more common case. Signed-off-by: Darrick J. Wong <djwong@kernel.org>
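A cursor with per-level state stored in a trailing variable-length array can be sketched in userspace C with a flexible array member. The structure layout, field names, and helpers below are hypothetical simplifications and use `calloc` instead of a kernel allocator.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical per-level state, split out of the cursor proper. */
struct cursor_level {
	void		*bp;	/* buffer for the block at this level */
	uint16_t	ptr;	/* current position within that block */
};

/* Hypothetical cursor: fixed header plus one slot per possible level. */
struct cursor {
	uint8_t			nlevels;	/* current tree height */
	uint8_t			maxlevels;	/* slots allocated below */
	struct cursor_level	levels[];	/* flexible array member */
};

/* Bytes needed for a cursor that can walk a tree of maxlevels levels. */
static size_t
cursor_sizeof(uint8_t maxlevels)
{
	return offsetof(struct cursor, levels) +
	       (size_t)maxlevels * sizeof(struct cursor_level);
}

/* Allocate a zeroed cursor sized for this btree type's maximum height. */
static struct cursor *
cursor_alloc(uint8_t maxlevels)
{
	struct cursor *cur = calloc(1, cursor_sizeof(maxlevels));

	if (cur)
		cur->maxlevels = maxlevels;
	return cur;
}
```

Because the size depends only on `maxlevels`, short trees pay for a handful of level slots while a very tall tree type can ask for many more, which is the memory trade-off the commit message describes.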
2021-09-17 | xfs: check that bc_nlevels never overflows | Darrick J. Wong
Warn if we ever bump nlevels higher than the allowed maximum cursor height. Signed-off-by: Darrick J. Wong <djwong@kernel.org>
2021-09-17 | xfs: stricter btree height checking when scanning for btree roots | Darrick J. Wong
When we're scanning for btree roots to rebuild the AG headers, make sure that the proposed tree does not exceed the maximum height for that btree type (and not just XFS_BTREE_MAXLEVELS). Signed-off-by: Darrick J. Wong <djwong@kernel.org>
2021-09-17 | xfs: stricter btree height checking when looking for errors | Darrick J. Wong
Since each btree type has its own precomputed maxlevels variable now, use them instead of the generic XFS_BTREE_MAXLEVELS to check the level of each per-AG btree. Signed-off-by: Darrick J. Wong <djwong@kernel.org>
2021-09-17 | xfs: dynamically allocate btree scrub context structure | Darrick J. Wong
Reorganize struct xchk_btree so that we can dynamically size the context structure to fit the type of btree cursor that we have. This will enable us to use memory more efficiently once we start adding very tall btree types. Signed-off-by: Darrick J. Wong <djwong@kernel.org>
2021-09-17 | xfs: don't allocate scrub contexts on the stack | Darrick J. Wong
Convert the on-stack scrub context, btree scrub context, and da btree scrub context into a heap allocation so that we reduce stack usage and gain the ability to handle tall btrees without issue. Specifically, this saves us ~208 bytes for the dabtree scrub, ~464 bytes for the btree scrub, and ~200 bytes for the main scrub context. Signed-off-by: Darrick J. Wong <djwong@kernel.org>
2021-09-17 | xfs: remove xfs_btree_cur_t typedef | Darrick J. Wong
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
2021-09-17 | xfs: update btree keys correctly when _insrec splits an inode root block [btree-ifork-records_2021-09-17] | Darrick J. Wong
In commit 2c813ad66a72, I partially fixed a bug wherein xfs_btree_insrec would erroneously try to update the parent's key for a block that had been split if we decided to insert the new record into the new block. The solution was to detect this situation and update the in-core key value that we pass up to the caller so that the caller will (eventually) add the new block to the parent level of the tree with the correct key. However, I missed a subtlety about the way inode-rooted btrees work. If the full block was a maximally sized inode root block, we'll solve that fullness by moving the root block's records to a new block, resizing the root block, and updating the root to point to the new block. We don't pass a pointer to the new block to the caller because that work has already been done. The new record will /always/ land in the new block, so in this case we need to use xfs_btree_update_keys to update the keys. This bug can theoretically manifest itself in the very rare case that we split a bmbt root block and the new record lands in the very first slot of the new block, though I've never managed to trigger it in practice. However, it is very easy to reproduce by running generic/522 with the realtime rmapbt patchset if rtinherit=1. Fixes: 2c813ad66a72 ("xfs: support btrees with overlapping intervals for keys") Signed-off-by: Darrick J. Wong <djwong@kernel.org>
2021-09-17 | xfs: support storing records in the inode core root | Darrick J. Wong
Add the necessary flags and code so that we can support storing leaf records in the inode root block of a btree. This hasn't been necessary before, but the realtime rmapbt will need to be able to do this. Signed-off-by: Darrick J. Wong <djwong@kernel.org>
2021-09-17 | xfs: hoist the node iroot update code out of xfs_btree_kill_iroot | Darrick J. Wong
In preparation for allowing records in an inode btree root, hoist the code that copies keyptrs from an existing node child into the root block to a separate function. Remove some unnecessary conditionals and clean up a few function calls in the new function. Note that this change reorders the ->free_block call with respect to the change in bc_nlevels to make it easier to support inode root leaf blocks in the next patch. Signed-off-by: Darrick J. Wong <djwong@kernel.org>
2021-09-17 | xfs: hoist the node iroot update code out of xfs_btree_new_iroot | Darrick J. Wong
In preparation for allowing records in an inode btree root, hoist the code that copies keyptrs from an existing node root into a child block to a separate function. Note that the new function explicitly computes the keys of the new child block and stores that in the root block; while the bmap btree could rely on leaving the key alone, realtime rmap needs to set the new high key. Signed-off-by: Darrick J. Wong <djwong@kernel.org>
2021-09-17 | xfs: support leaves in the incore btree root block in xfs_iroot_realloc | Darrick J. Wong
Add some logic to xfs_iroot_realloc so that we can handle leaf records in the btree root block correctly. Signed-off-by: Darrick J. Wong <djwong@kernel.org>
2021-09-17 | xfs: generalize the btree root reallocation function | Darrick J. Wong
In preparation for storing realtime rmap btree roots in an inode fork, make xfs_iroot_realloc take an ops structure that takes care of all the btree-specific geometry pieces. Signed-off-by: Darrick J. Wong <djwong@kernel.org>
2021-09-17 | xfs: standardize the btree maxrecs function parameters | Darrick J. Wong
Standardize the parameters in xfs_{alloc,bm,ino,rmap,refcount}bt_maxrecs so that we have consistent calling conventions. This doesn't affect the kernel that much, but enables us to clean up userspace a bit. Signed-off-by: Darrick J. Wong <djwong@kernel.org>
2021-09-17 | xfs: rearrange xfs_iroot_realloc a bit | Darrick J. Wong
Rearrange the innards of xfs_iroot_realloc so that we can reduce duplicated code prior to genericizing the function. No functional changes. Signed-off-by: Darrick J. Wong <djwong@kernel.org>
2021-09-17 | xfs: move the zero records logic into xfs_bmap_broot_space_calc | Darrick J. Wong
The bmap btree cannot ever have zero records in an incore btree block. If the number of records drops to zero, that means we're converting the fork to extents format and are trying to remove the tree. This logic won't hold for the future realtime rmap btree, so move the logic into the bmbt code. Signed-off-by: Darrick J. Wong <djwong@kernel.org>
2021-09-17 | xfs: hoist the code that moves the incore inode fork broot memory | Darrick J. Wong
Whenever we change the size of the memory buffer holding an inode fork btree root block, we have to copy the contents over. Refactor all this into a single function that handles both, in preparation for making xfs_iroot_realloc more generic. Signed-off-by: Darrick J. Wong <djwong@kernel.org>
2021-09-17 | xfs: fix a sloppy memory handling bug in xfs_iroot_realloc | Darrick J. Wong
While refactoring code, I noticed that when xfs_iroot_realloc tries to shrink a bmbt root block, it allocates a smaller new block and then copies "records" and pointers to the new block. However, bmbt root blocks cannot ever be leaves, which means that it's not technically correct to copy records. We /should/ be copying keys. Note that this has never resulted in actual memory corruption because sizeof(bmbt_rec) == (sizeof(bmbt_key) + sizeof(bmbt_ptr)). However, this will no longer be true when we start adding realtime rmap stuff, so fix this now. Signed-off-by: Darrick J. Wong <djwong@kernel.org>
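The size coincidence that hid this bug can be checked with a small sketch. The types below are simplified stand-ins for the bmbt record, key, and pointer (plain `uint64_t` fields instead of the kernel's big-endian types), and the two payload helpers are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Simplified stand-ins for the on-disk bmbt record, key, and pointer. */
struct bmbt_rec { uint64_t l0; uint64_t l1; };	/* packed extent record */
struct bmbt_key { uint64_t br_startoff; };	/* starting file offset */
typedef uint64_t bmbt_ptr_t;			/* child block pointer */

/* The coincidence: one record is exactly as large as one key plus one pointer. */
_Static_assert(sizeof(struct bmbt_rec) ==
	       sizeof(struct bmbt_key) + sizeof(bmbt_ptr_t),
	       "bmbt record size equals key size plus pointer size");

/* Payload bytes of a leaf block holding nrecs records. */
static size_t
leaf_payload_bytes(size_t nrecs, size_t rec_size)
{
	return nrecs * rec_size;
}

/* Payload bytes of a node block holding nrecs key/pointer pairs. */
static size_t
node_payload_bytes(size_t nrecs, size_t key_size, size_t ptr_size)
{
	return nrecs * (key_size + ptr_size);
}
```

For these bmbt sizes both helpers return the same byte count, so sizing a node block as if it held records happens to work; for any btree type whose record size differs from its key size plus pointer size, the two results diverge and the copy would be wrong.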
2021-09-17 | xfs: refactor creation of bmap btree roots | Darrick J. Wong
Now that we've created inode fork helpers to allocate and free btree roots, create a new bmap btree helper to create a new bmbt root, and refactor the extents <-> btree conversion functions to use our new helpers. Signed-off-by: Darrick J. Wong <djwong@kernel.org>
2021-09-17 | xfs: refactor the allocation and freeing of incore inode fork btree roots | Darrick J. Wong
Refactor the code that allocates and frees the incore inode fork btree roots. This will help us disentangle some of the weird logic when we're creating and tearing down inode-based btrees. Signed-off-by: Darrick J. Wong <djwong@kernel.org>
2021-09-17 | xfs: replace shouty XFS_BM{BT,DR} macros | Darrick J. Wong
Replace all the shouty bmap btree and bmap disk root macros with actual functions, and fix a type handling error in the xattr code that the macros previously didn't care about. Signed-off-by: Darrick J. Wong <djwong@kernel.org>
2021-09-17 | xfs: enable metadata directory feature [metadir_2021-09-17] | Darrick J. Wong
Enable the metadata directory feature. Signed-off-by: Darrick J. Wong <djwong@kernel.org>
2021-09-17 | xfs: scrub metadata directories | Darrick J. Wong
Teach online scrub about the metadata directory tree. Signed-off-by: Darrick J. Wong <djwong@kernel.org>
2021-09-17 | xfs: allow bulkstat to return metadata directories | Darrick J. Wong
Allow the V5 bulkstat ioctl to return information about metadata directory files so that xfs_scrub can find and scrub them, since they are otherwise ordinary directories. (Metadata files of course require per-file scrub code and hence do not need exposure.) Signed-off-by: Darrick J. Wong <djwong@kernel.org>
2021-09-17 | xfs: advertise metadata directory feature | Darrick J. Wong
Advertise the existence of the metadata directory feature; this will be used by scrub to decide if it needs to scan the metadir too. Signed-off-by: Darrick J. Wong <djwong@kernel.org>
2021-09-17 | xfs: hide metadata inodes from everyone because they are special | Darrick J. Wong
Metadata inodes are private files and therefore cannot be exposed to userspace. This means no bulkstat, no open-by-handle, no linking them into the directory tree, and no feeding them to LSMs. As such, we mark them S_PRIVATE, which stops all that. While we're at it, put them in a separate lockdep class so that it won't get confused by "recursive" i_rwsem locking such as what happens when we write to a rt file and need to allocate from the rt bitmap file. Signed-off-by: Darrick J. Wong <djwong@kernel.org>