path: root/fs/xfs/libxfs
Age | Commit message | Author
2021-10-22 | xfs: allow queued AG intents to drain before scrubbing [scrub-drain-intents_2021-10-22] | Darrick J. Wong
Currently, online scrub isn't sufficiently careful about quiescing allocation groups before checking them. While scrub does take the AG header locks, it doesn't serialize against chains of AG update intents that are being processed concurrently. If there's a collision, cross-referencing between data structures (e.g. rmapbt and refcountbt) can yield false corruption events; if repair is running, this results in incorrect repairs.

Fix this by adding to the perag structure the count of active intents and make scrub wait until there aren't any to continue. This is a little stupid since transactions can queue intents without taking buffer locks, but we'll also wait for those transactions.

XXX: should have instead a per-ag rwsem that gets taken as soon as the AG[IF] are locked and stays held until the transaction commits or moves on to the next AG? would we rather have a six lock so that intents can take an ix lock, and not have to upgrade to x until we actually want to make changes to that ag? is that how those even work??

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
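The counting scheme described in this message can be sketched in userspace C. This is only an illustration under stated assumptions: `struct perag_sketch` and the three helpers are hypothetical names, not the kernel's perag structure or API, and a real implementation would presumably sleep on a wait queue rather than poll the counter.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for the per-AG structure; not the kernel's perag. */
struct perag_sketch {
	atomic_int	intents;	/* intents queued or in flight for this AG */
};

/* A transaction bumps the count when it queues an intent against the AG. */
static void ag_intent_hold(struct perag_sketch *pag)
{
	atomic_fetch_add(&pag->intents, 1);
}

/* The count drops once the intent has been finished or cancelled. */
static void ag_intent_rele(struct perag_sketch *pag)
{
	atomic_fetch_sub(&pag->intents, 1);
}

/* Scrub may start cross-referencing only when no intents remain. */
static bool ag_intents_drained(struct perag_sketch *pag)
{
	return atomic_load(&pag->intents) == 0;
}
```

In this model scrub would take the AG header locks, check `ag_intents_drained()`, and if intents are still pending, drop the locks and wait before trying again.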
2021-10-22 | xfs: teach scrub to check file nlinks | Darrick J. Wong
Copy-pasta the online quotacheck code to check inode link counts too. Signed-off-by: Darrick J. Wong <djwong@kernel.org>
2021-10-22 | xfs: report health of inode link counts | Darrick J. Wong
Report on the health of the inode link counts. Signed-off-by: Darrick J. Wong <djwong@kernel.org>
2021-10-22 | xfs: experiment with dontcache when scanning inodes [vectorized-scrub_2021-10-22] | Darrick J. Wong
Add some experimental flags to drop inodes from the cache after a scan. Signed-off-by: Darrick J. Wong <djwong@kernel.org>
2021-10-22 | xfs: introduce vectored scrub mode | Darrick J. Wong
Introduce a variant on XFS_SCRUB_METADATA that allows for vectored mode. Signed-off-by: Darrick J. Wong <djwong@kernel.org>
2021-10-22 | xfs: track deferred ops statistics | Darrick J. Wong
Track some basic statistics on how hard we're pushing the defer ops. Signed-off-by: Darrick J. Wong <djwong@kernel.org>
2021-10-22 | xfs: enable extent size hints for CoW when rtextsize > 1 | Darrick J. Wong
CoW extent size hints are not allowed on filesystems that have large realtime extents because we only want to perform the minimum required amount of write-around (aka write amplification) for shared extents. On filesystems where rtextsize > 1, allocations can only be done in units of full rt extents, which means that we can only map an entire rt extent's worth of blocks into the data fork. Hole punch requests become conversions to unwritten if the request isn't aligned properly. Because a copy-write fundamentally requires remapping, this means that we also can only do copy-writes of a full rt extent. This is too expensive for large hint sizes, since it's all or nothing. Signed-off-by: Darrick J. Wong <djwong@kernel.org>
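To make the "all or nothing" cost concrete, here is a small sketch of how a write range grows once it must cover whole rt extents. The names `blk_range` and `rtext_round_out` are hypothetical, and all units are filesystem blocks; this is arithmetic for illustration, not the kernel's allocation code.

```c
#include <stdint.h>

/* A range of file blocks: first block and length. Hypothetical type. */
struct blk_range {
	uint64_t	start;
	uint64_t	len;
};

/*
 * Round [off, off + len) outward so that it begins and ends on
 * rt extent boundaries. rextsize is the rt extent size in blocks
 * and must be nonzero.
 */
static struct blk_range rtext_round_out(uint64_t off, uint64_t len,
					uint32_t rextsize)
{
	uint64_t start = off - (off % rextsize);
	uint64_t end = off + len;
	uint64_t rem = end % rextsize;

	if (rem)
		end += rextsize - rem;
	return (struct blk_range){ start, end - start };
}
```

With `rextsize` of 4, a one-block write at block 5 becomes a four-block copy-write covering blocks 4 through 7, and a two-block write that straddles an rt extent boundary doubles that again.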
2021-10-22 | xfs: scrub the realtime refcount btree | Darrick J. Wong
Add code to scrub realtime refcount btrees. Signed-off-by: Darrick J. Wong <djwong@kernel.org>
2021-10-22 | xfs: report realtime refcount btree corruption errors to the health system | Darrick J. Wong
Whenever we encounter corrupt realtime refcount btree blocks, we should report that to the health monitoring system for later reporting. Signed-off-by: Darrick J. Wong <djwong@kernel.org>
2021-10-22 | xfs: enable extent size hints for CoW operations | Darrick J. Wong
Wire up the copy-on-write extent size hint for realtime files, and connect it to the rt allocator so that we avoid fragmentation on rt filesystems. Signed-off-by: Darrick J. Wong <djwong@kernel.org>
2021-10-22 | xfs: apply rt extent alignment constraints to CoW extsize hint | Darrick J. Wong
The copy-on-write extent size hint is subject to the same alignment constraints as the regular extent size hint. Since we're in the process of adding reflink (and therefore CoW) to the realtime device, we must apply the same scattered rextsize alignment validation strategies to both hints to deal with the possibility of rextsize changing. Therefore, fix the inode validator to perform rextsize alignment checks on regular realtime files, and to remove misaligned directory hints. Signed-off-by: Darrick J. Wong <djwong@kernel.org>
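The alignment rule can be sketched as a check that the hint is a whole multiple of the rt extent size. Everything here is a hypothetical illustration: the enum, the function name, and in particular the choice to reject a misaligned hint on a regular file while merely clearing it on a directory are assumptions drawn from the wording of this message, not from the validator itself.

```c
#include <stdbool.h>
#include <stdint.h>

/* Possible outcomes of checking a hint. Hypothetical names. */
enum hint_verdict {
	HINT_OK,	/* hint is unset or properly aligned */
	HINT_CLEAR,	/* misaligned directory hint: drop the hint */
	HINT_REJECT,	/* misaligned hint on a regular rt file */
};

/* hint and rextsize are both in filesystem blocks. */
static enum hint_verdict check_cow_hint(uint32_t hint, uint32_t rextsize,
					bool is_dir)
{
	/* With rextsize of 0 or 1 every hint is trivially aligned. */
	if (hint == 0 || rextsize <= 1)
		return HINT_OK;
	if (hint % rextsize == 0)
		return HINT_OK;
	return is_dir ? HINT_CLEAR : HINT_REJECT;
}
```

A hint that was valid when it was set can become misaligned later if rextsize changes, which is why the check has to be repeated in the validator rather than done once at set time.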
2021-10-22 | xfs: fix xfs_get_extsz_hint behavior with realtime alwayscow files | Darrick J. Wong
Currently, we (ab)use xfs_get_extsz_hint so that it always returns a nonzero value for realtime files. This apparently was done to disable delayed allocation for realtime files. However, once we enable realtime reflink, we can also turn on the alwayscow flag to force CoW writes to realtime files. In this case, the logic will incorrectly send the write through the delalloc write path. Fix this by adjusting the logic slightly. Signed-off-by: Darrick J. Wong <djwong@kernel.org>
2021-10-22 | xfs: recover CoW leftovers in the realtime volume | Darrick J. Wong
Scan the realtime refcount tree at mount time to get rid of leftover CoW staging extents. Signed-off-by: Darrick J. Wong <djwong@kernel.org>
2021-10-22 | xfs: allow inodes to have the realtime and reflink flags | Darrick J. Wong
Now that we can share blocks between realtime files, allow this combination. Signed-off-by: Darrick J. Wong <djwong@kernel.org>
2021-10-22 | xfs: enable CoW for realtime data | Darrick J. Wong
Update our write paths to support copy on write on the rt volume. This works in more or less the same way as it does on the data device, with the major exception that we never do delalloc on the rt volume. Because we consider unwritten CoW fork staging extents to be incore quota reservation, we update xfs_quota_reserve_blkres to support this case. Though xfs doesn't allow rt and quota together, the change is trivial and we shouldn't leave a logic bomb here. Signed-off-by: Darrick J. Wong <djwong@kernel.org>
2021-10-22 | xfs: compute rtrmap btree max levels when reflink enabled | Darrick J. Wong
Compute the maximum possible height of the realtime rmap btree when reflink is enabled. Signed-off-by: Darrick J. Wong <djwong@kernel.org>
2021-10-22 | xfs: update rmap to allow cow staging extents in the rt rmap | Darrick J. Wong
Don't error out on CoW staging extent records when realtime reflink is enabled. Signed-off-by: Darrick J. Wong <djwong@kernel.org>
2021-10-22 | xfs: create routine to allocate and initialize a realtime refcount btree inode | Darrick J. Wong
Create a library routine to allocate and initialize an empty realtime refcountbt inode. We'll use this for growfs, mkfs, and repair. Signed-off-by: Darrick J. Wong <djwong@kernel.org>
2021-10-22 | xfs: wire up realtime refcount btree cursors | Darrick J. Wong
Wire up realtime refcount btree cursors wherever they're needed throughout the code base. Signed-off-by: Darrick J. Wong <djwong@kernel.org>
2021-10-22 | xfs: wire up a new inode fork type for the realtime refcount | Darrick J. Wong
Plumb in the pieces we need to embed the root of the realtime refcount btree in an inode's data fork, complete with new fork type and on-disk interpretation functions. Signed-off-by: Darrick J. Wong <djwong@kernel.org>
2021-10-22 | xfs: add metadata reservations for realtime refcount btree | Darrick J. Wong
Reserve some free blocks so that we will always have enough free blocks in the data volume to handle expansion of the realtime refcount btree. Signed-off-by: Darrick J. Wong <djwong@kernel.org>
2021-10-22 | xfs: add realtime refcount btree inode to metadata directory | Darrick J. Wong
Add a metadir path to select the realtime refcount btree inode and load it at mount time. The rtrefcountbt inode will have a unique extent format code, which means that we also have to update the inode validation and flush routines to look for it. Signed-off-by: Darrick J. Wong <djwong@kernel.org>
2021-10-22 | xfs: add a realtime flag to the refcount update log redo items | Darrick J. Wong
Extend the refcount update (CUI) log items with a new realtime flag that indicates that the updates apply against the realtime refcountbt. We'll wire up the actual refcount code later. Signed-off-by: Darrick J. Wong <djwong@kernel.org>
2021-10-22 | xfs: prepare refcount functions to deal with rtrefcountbt | Darrick J. Wong
Prepare the high-level refcount functions to deal with the new realtime refcountbt and its slightly different conventions. Provide the ability to talk to either refcountbt or rtrefcountbt formats from the same high level code. Note that we leave the _recover_cow_leftovers functions for a separate patch so that we can convert it all at once. Signed-off-by: Darrick J. Wong <djwong@kernel.org>
2021-10-22 | xfs: add realtime refcount btree operations | Darrick J. Wong
Implement the generic btree operations needed to manipulate rtrefcount btree blocks. This is different from the regular refcountbt in that we allocate space from the filesystem at large, and are neither constrained to the free space nor any particular AG. Signed-off-by: Darrick J. Wong <djwong@kernel.org>
2021-10-22 | xfs: realtime refcount btree transaction reservations | Darrick J. Wong
Make sure that there's enough log reservation to handle mapping and unmapping realtime extents. We have to reserve enough space to handle a split in the rtrefcountbt to add the record and a second split in the regular refcountbt to record the rtrefcountbt split. Signed-off-by: Darrick J. Wong <djwong@kernel.org>
2021-10-22 | xfs: define the on-disk realtime refcount btree format | Darrick J. Wong
Start filling out the rtrefcount btree implementation. Start with the on-disk btree format; add everything needed to read, write and manipulate refcount btree blocks. This prepares the way for connecting the btree operations implementation. Signed-off-by: Darrick J. Wong <djwong@kernel.org>
2021-10-22 | xfs: introduce realtime refcount btree definitions | Darrick J. Wong
Add new realtime refcount btree definitions. The realtime refcount btree will be rooted from a hidden inode, but has its own shape and therefore needs to have most of its own separate types. Signed-off-by: Darrick J. Wong <djwong@kernel.org>
2021-10-22 | xfs: widen xfs_refcount_irec fields to handle realtime refcountbt | Darrick J. Wong
Change the startblock and blockcount fields of xfs_refcount_irec to be 64 bits wide. This enables us to use the same high level refcount code for either tree. We'll also collect all the resulting breakage fixes here. Signed-off-by: Darrick J. Wong <djwong@kernel.org>
2021-10-22 | xfs: clean up refcount log intent item tracepoint callsites | Darrick J. Wong
Pass the incore refcount intent structure to the tracepoints instead of open-coding the argument passing. Signed-off-by: Darrick J. Wong <djwong@kernel.org>
2021-10-22 | xfs: pass refcount intent directly through the log intent code | Darrick J. Wong
Pass the incore refcount intent through the CUI logging code instead of repeatedly boxing and unboxing parameters. We'll clean up the tracepoints shortly. Signed-off-by: Darrick J. Wong <djwong@kernel.org>
2021-10-22 | xfs: prepare refcount btree tracepoints for widening | Darrick J. Wong
Prepare the rest of refcount btree tracepoints for use with realtime reflink by making them take the btree cursor object as a parameter. This will save us a lot of trouble later on. Remove the xfs_refcount_recover_extent tracepoint since it's already covered by other refcount tracepoints. Signed-off-by: Darrick J. Wong <djwong@kernel.org>
2021-10-22 | xfs: create specialized classes for refcount tracepoints | Darrick J. Wong
The only user of the "ag" tracepoint event classes is the refcount btree, so rename them to make that obvious and make them take the btree cursor to simplify the arguments. This will save us a lot of trouble later on. Signed-off-by: Darrick J. Wong <djwong@kernel.org>
2021-10-22 | xfs: give refcount btree cursor error tracepoints their own class | Darrick J. Wong
Convert all the refcount tracepoints to use the btree error tracepoint class. Signed-off-by: Darrick J. Wong <djwong@kernel.org>
2021-10-22 | xfs: remove useless oinfo arg from xfs_refcount_adjust | Darrick J. Wong
All callers pass NULL here, so eliminate the unnecessary argument. Signed-off-by: Darrick J. Wong <djwong@kernel.org>
2021-10-22 | xfs: apply noalloc mode to inode allocations too [noalloc-ags_2021-10-22] | Darrick J. Wong
Don't allow inode allocations from this group if it's marked noalloc. Signed-off-by: Darrick J. Wong <djwong@kernel.org>
2021-10-22 | xfs: enable userspace to hide an AG from allocation | Darrick J. Wong
Add an administrative interface so that userspace can hide an allocation group from block allocation. Signed-off-by: Darrick J. Wong <djwong@kernel.org>
2021-10-22 | xfs: create a noalloc mode for allocation groups | Darrick J. Wong
Create a new noalloc state for the per-AG structure that will disable block allocation in this AG. We accomplish this by subtracting from fdblocks all the free blocks in this AG, hiding those blocks from the allocator, and preventing freed blocks from updating fdblocks until we're ready to lift noalloc mode. Note that we reduce the free block count of the filesystem so that we can prevent transactions from entering the allocator looking for "free" space that we've turned off incore. Signed-off-by: Darrick J. Wong <djwong@kernel.org>
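The accounting described here can be modelled in a few lines. The structures and function names below are hypothetical simplifications (the kernel's fdblocks is a per-cpu counter and the per-AG state lives elsewhere); the sketch only shows how the global free count is adjusted when noalloc is set, while it is in force, and when it is lifted.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified filesystem-wide and per-AG state. */
struct fs_sketch {
	int64_t	fdblocks;	/* free blocks visible to the allocator */
};

struct ag_sketch {
	int64_t	freeblks;	/* free blocks physically in this AG */
	bool	noalloc;
};

/* Hide the AG: its free blocks stop counting as allocatable. */
static void ag_set_noalloc(struct fs_sketch *fs, struct ag_sketch *ag)
{
	if (ag->noalloc)
		return;
	fs->fdblocks -= ag->freeblks;
	ag->noalloc = true;
}

/* Blocks freed into a noalloc AG do not update fdblocks yet. */
static void ag_free_blocks(struct fs_sketch *fs, struct ag_sketch *ag,
			   int64_t nr)
{
	ag->freeblks += nr;
	if (!ag->noalloc)
		fs->fdblocks += nr;
}

/* Lift noalloc: everything free in the AG becomes visible again. */
static void ag_clear_noalloc(struct fs_sketch *fs, struct ag_sketch *ag)
{
	if (!ag->noalloc)
		return;
	fs->fdblocks += ag->freeblks;
	ag->noalloc = false;
}
```

Because fdblocks is reduced up front, a transaction that asks for more space than the remaining AGs can supply fails its reservation early instead of entering the allocator and finding nothing.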
2021-10-22 | xfs: compact flag bits in the perag structure | Darrick J. Wong
Compact the flags in the per-ag structure so that we use less space. Signed-off-by: Darrick J. Wong <djwong@kernel.org>
2021-10-22 | xfs: scrub the realtime rmapbt | Darrick J. Wong
Check the realtime reverse mapping btree against the rtbitmap, and modify the rtbitmap scrub to check against the rtrmapbt. Signed-off-by: Darrick J. Wong <djwong@kernel.org>
2021-10-22 | xfs: report realtime rmap btree corruption errors to the health system | Darrick J. Wong
Whenever we encounter corrupt realtime rmap btree blocks, we should report that to the health monitoring system for later reporting. Signed-off-by: Darrick J. Wong <djwong@kernel.org>
2021-10-22 | xfs: create routine to allocate and initialize a realtime rmap btree inode | Darrick J. Wong
Create a library routine to allocate and initialize an empty realtime rmapbt inode. We'll use this for mkfs and repair. Signed-off-by: Darrick J. Wong <djwong@kernel.org>
2021-10-22 | xfs: wire up rmap map and unmap to the realtime rmapbt | Darrick J. Wong
Connect the map and unmap reverse-mapping operations to the realtime rmapbt via the deferred operation callbacks. This enables us to perform rmap operations against the correct btree. Signed-off-by: Darrick J. Wong <djwong@kernel.org>
2021-10-22 | xfs: use realtime EFI to free extents when realtime rmap is enabled | Darrick J. Wong
When rmap is enabled, XFS expects a certain order of operations, which is: 1) remove the file mapping, 2) remove the reverse mapping, and then 3) free the blocks. xfs_bmap_del_extent_real tries to do 1 and 3 in the same transaction, which means that when rtrmap is enabled, we have to use realtime EFIs to maintain the expected order. Signed-off-by: Darrick J. Wong <djwong@kernel.org>
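A toy model shows why deferring the free restores the expected order. Here the file mapping is assumed to have been removed inline already, and the remaining two steps are queued as deferred work that is finished in a fixed type order regardless of the order in which it was queued. The types and functions are hypothetical and far simpler than the kernel's deferred-operations code.

```c
#include <stddef.h>

/* Deferred work types, listed in the order they must be finished. */
enum defer_type {
	DEFER_RMAP_REMOVE,	/* step 2: remove the reverse mapping */
	DEFER_FREE_EXTENT,	/* step 3: free the blocks */
	DEFER_NR,
};

/* Count of pending items per type. Hypothetical structure. */
struct defer_queue {
	unsigned int	pending[DEFER_NR];
};

static void defer_add(struct defer_queue *q, enum defer_type t)
{
	q->pending[t]++;
}

/*
 * Finish pending work in type order and record the sequence in log[].
 * Returns the number of items finished (at most cap).
 */
static size_t defer_finish(struct defer_queue *q, enum defer_type *log,
			   size_t cap)
{
	size_t n = 0;

	for (int t = 0; t < DEFER_NR; t++) {
		while (q->pending[t] > 0 && n < cap) {
			q->pending[t]--;
			log[n++] = (enum defer_type)t;
		}
	}
	return n;
}
```

Even if the free is queued before the rmap removal, `defer_finish()` runs the rmap removal first. Freeing the blocks inline in the same transaction as the unmap would skip this ordering, which is the problem the message describes.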
2021-10-22 | xfs: wire up a new inode fork type for the realtime rmap | Darrick J. Wong
Plumb in the pieces we need to embed the root of the realtime rmap btree in an inode's data fork, complete with new fork type and on-disk interpretation functions. Signed-off-by: Darrick J. Wong <djwong@kernel.org>
2021-10-22 | xfs: add metadata reservations for realtime rmap btrees | Darrick J. Wong
Reserve some free blocks so that we will always have enough free blocks in the data volume to handle expansion of the realtime rmap btree. Signed-off-by: Darrick J. Wong <djwong@kernel.org>
2021-10-22 | xfs: add realtime reverse map inode to metadata directory | Darrick J. Wong
Add a metadir path to select the realtime rmap btree inode and load it at mount time. The rtrmapbt inode will have a unique extent format code, which means that we also have to update the inode validation and flush routines to look for it. Signed-off-by: Darrick J. Wong <djwong@kernel.org>
2021-10-22 | xfs: add a realtime flag to the rmap update log redo items | Darrick J. Wong
Extend the rmap update (RUI) log items with a new realtime flag that indicates that the updates apply against the realtime rmapbt. We'll wire up the actual rmap code later. Signed-off-by: Darrick J. Wong <djwong@kernel.org>
2021-10-22 | xfs: prepare rmap functions to deal with rtrmapbt | Darrick J. Wong
Prepare the high-level rmap functions to deal with the new realtime rmapbt and its slightly different conventions. Provide the ability to talk to either rmapbt or rtrmapbt formats from the same high level code. Signed-off-by: Darrick J. Wong <djwong@kernel.org>
2021-10-22 | xfs: add realtime rmap btree operations | Darrick J. Wong
Implement the generic btree operations needed to manipulate rtrmap btree blocks. This is different from the regular rmapbt in that we allocate space from the filesystem at large, and are neither constrained to the free space nor any particular AG. Signed-off-by: Darrick J. Wong <djwong@kernel.org>