path: root/fs
Age  Commit message  Author
2021-12-15xfs: enable extent size hints for CoW when rtextsize > 1Darrick J. Wong
CoW extent size hints are not allowed on filesystems that have large realtime extents because we only want to perform the minimum required amount of write-around (aka write amplification) for shared extents. On filesystems where rtextsize > 1, allocations can only be done in units of full rt extents, which means that we can only map an entire rt extent's worth of blocks into the data fork. Hole punch requests become conversions to unwritten if the request isn't aligned properly. Because a copy-on-write fundamentally requires remapping, this means that we can also only do copy-on-write of a full rt extent. This is too expensive for large hint sizes, since it's all or nothing. Signed-off-by: Darrick J. Wong <djwong@kernel.org>
2021-12-15xfs: extend writeback requests to handle rt cow correctlyDarrick J. Wong
If we have shared realtime files and the rt extent size is larger than a single fs block, we need to extend writeback requests to be aligned to rt extent size granularity because we cannot share partial rt extents. The front end should have set us up for this by dirtying the relevant ranges. Signed-off-by: Darrick J. Wong <djwong@kernel.org>
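The range extension described above can be sketched in userspace C. This is a minimal illustration, not the kernel code: struct byte_range and align_to_rtext() are hypothetical names, and we assume the allocation unit in bytes is the block size multiplied by the rt extent size in blocks. The sketch uses modulo arithmetic so that it does not assume the unit is a power of two.

```c
#include <stdint.h>

/* Hypothetical byte range; not a kernel structure. */
struct byte_range {
	uint64_t start;	/* inclusive */
	uint64_t end;	/* exclusive */
};

/*
 * Expand [pos, pos + len) outward to rt extent boundaries.
 * blocksize is in bytes, rextsize is in fs blocks; both must be nonzero.
 */
static struct byte_range
align_to_rtext(uint64_t pos, uint64_t len, uint32_t blocksize,
	       uint32_t rextsize)
{
	uint64_t unit = (uint64_t)blocksize * rextsize;
	uint64_t end = pos + len;
	uint64_t rem = end % unit;
	struct byte_range r;

	/* Round the start down and the end up to the allocation unit. */
	r.start = pos - (pos % unit);
	r.end = rem ? end + (unit - rem) : end;
	return r;
}
```

With 4096-byte blocks and rextsize = 4, a 100-byte request at offset 5000 would be widened to cover the whole first 16384-byte rt extent.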
2021-12-15xfs: forcibly convert unwritten blocks within an rt extent before sharingDarrick J. Wong
As noted in the previous patch, XFS can only unmap and map full rt extents. First, this means that we cannot stop mid-extent for any reason, including stepping around unwritten/written extents. Second, the reflink and CoW mechanisms were not designed to handle shared unwritten extents, so we have to do something to get rid of them. If the user asks us to remap two files, we must scan both ranges beforehand to convert any unwritten extents that are not aligned to rt extent boundaries into zeroed written extents before sharing. Signed-off-by: Darrick J. Wong <djwong@kernel.org>
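The pre-sharing scan can be illustrated with a small userspace model. Everything here is an assumption made for illustration: struct mapping, enum ext_state and both helpers are hypothetical, offsets and lengths are in fs blocks, and "conversion" is modelled as a state flip rather than actually zeroing storage.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

enum ext_state { EXT_WRITTEN, EXT_UNWRITTEN };

/* Hypothetical file mapping: offset and length in fs blocks. */
struct mapping {
	uint64_t	startoff;
	uint64_t	blockcount;
	enum ext_state	state;
};

/* An unwritten mapping needs conversion if either edge is misaligned. */
static bool
needs_conversion(const struct mapping *m, uint32_t rextsize)
{
	if (m->state != EXT_UNWRITTEN)
		return false;
	return (m->startoff % rextsize) != 0 ||
	       (m->blockcount % rextsize) != 0;
}

/* Mark misaligned unwritten mappings as written; return how many changed. */
static size_t
convert_misaligned_unwritten(struct mapping *maps, size_t nr,
			     uint32_t rextsize)
{
	size_t converted = 0;

	for (size_t i = 0; i < nr; i++) {
		if (!needs_conversion(&maps[i], rextsize))
			continue;
		maps[i].state = EXT_WRITTEN;	/* stands in for zeroing */
		converted++;
	}
	return converted;
}
```

In this model a remap request would run the scan over both the source and destination ranges before any blocks are shared.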
2021-12-15xfs: enable CoW when rt extent size is larger than 1 blockDarrick J. Wong
Copy on write encounters a major plot twist when the file being CoW'd lives on the realtime volume and the realtime extent size is larger than a single filesystem block. XFS can only unmap and remap full rt extents, which means that allocations are always done in units of full rt extents, and a request to unmap less than one extent is treated as a request to convert an extent to unwritten status. This behavioral quirk is not compatible with the existing CoW mechanism, so we have to intercept every path through which files can be modified to ensure that we dirty an entire rt extent at once so that we can remap a full rt extent. Use the existing VFS unshare functions to dirty the page cache to set that up. Signed-off-by: Darrick J. Wong <djwong@kernel.org>
2021-12-15iomap: set up for COWing around pagesDarrick J. Wong
In anticipation of enabling reflink on the realtime volume where the allocation unit is larger than a page, create an iomap function to dirty arbitrary parts of a file's page cache so that when we dirty part of a file that could undergo a COW extent, we can dirty an entire allocation unit's worth of pages. Signed-off-by: Darrick J. Wong <djwong@kernel.org>
2021-12-15vfs: explicitly pass the block size to the remap prep functionDarrick J. Wong
Make it so that filesystems can pass an explicit blocksize to the remap prep function. This enables filesystems whose fundamental allocation units are /not/ the same as the blocksize to ensure that the remapping checks are aligned properly. Signed-off-by: Darrick J. Wong <djwong@kernel.org>
2021-12-15xfs: enable realtime reflink [realtime-reflink_2021-12-15]Darrick J. Wong
Enable reflink for realtime devices, sort of. Signed-off-by: Darrick J. Wong <djwong@kernel.org>
2021-12-15xfs: repair inodes that have a refcount btree in the data forkDarrick J. Wong
Plumb knowledge of refcount btrees into the inode core repair code. Signed-off-by: Darrick J. Wong <djwong@kernel.org>
2021-12-15xfs: online repair of the realtime refcount btreeDarrick J. Wong
Port the data device's refcount btree repair code to the realtime refcount btree. Signed-off-by: Darrick J. Wong <djwong@kernel.org>
2021-12-15xfs: capture realtime CoW staging extents when rebuilding rt rmapbtDarrick J. Wong
Walk the realtime refcount btree to find the CoW staging extents when we're rebuilding the realtime rmap btree. Signed-off-by: Darrick J. Wong <djwong@kernel.org>
2021-12-15xfs: walk the rt reference count tree when rebuilding rmapDarrick J. Wong
When we're rebuilding the data device rmap, if we encounter a "refcount" format fork, we have to walk the (realtime) refcount btree inode to build the appropriate mappings. Signed-off-by: Darrick J. Wong <djwong@kernel.org>
2021-12-15xfs: check new rtbitmap records against rt refcount btreeDarrick J. Wong
When we're rebuilding the realtime bitmap, check the proposed free extents against the rt refcount btree to make sure we don't commit any grievous errors. Signed-off-by: Darrick J. Wong <djwong@kernel.org>
2021-12-15xfs: detect and repair misaligned rtinherit directory cowextsize hintsDarrick J. Wong
If we encounter a directory that has been configured to pass on a CoW extent size hint to a new realtime file and the hint isn't an integer multiple of the rt extent size, we should flag the hint for administrative review and/or turn it off because that is a misconfiguration. Signed-off-by: Darrick J. Wong <djwong@kernel.org>
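The misconfiguration check described above amounts to a divisibility test. A minimal sketch, assuming both values are expressed in fs blocks and that a hint of zero means "no hint"; the function name is hypothetical and is not the scrubber's actual interface:

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Return true if a CoW extent size hint is usable on the rt volume,
 * i.e. it is an integer multiple of the rt extent size.  A hint of
 * zero (no hint set) is trivially acceptable.
 */
static bool
cowextsize_hint_is_aligned(uint32_t hint_blocks, uint32_t rextsize_blocks)
{
	if (rextsize_blocks == 0)
		return false;	/* nonsensical geometry */
	return hint_blocks % rextsize_blocks == 0;
}
```

A hint of 12 blocks passes with rextsize = 4 but would be flagged with rextsize = 8.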
2021-12-15xfs: allow dquot rt block count to exceed rt blocks on reflink fsDarrick J. Wong
Update the quota scrubber to allow dquots where the realtime block count exceeds the block count of the rt volume if reflink is enabled. Signed-off-by: Darrick J. Wong <djwong@kernel.org>
2021-12-15xfs: check reference counts of gaps between rt refcount recordsDarrick J. Wong
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
2021-12-15xfs: cross-reference checks with the rt refcount btreeDarrick J. Wong
Use the realtime refcount btree to cross-reference other data structures. Signed-off-by: Darrick J. Wong <djwong@kernel.org>
2021-12-15xfs: scrub the realtime refcount btreeDarrick J. Wong
Add code to scrub realtime refcount btrees. Signed-off-by: Darrick J. Wong <djwong@kernel.org>
2021-12-15xfs: report realtime refcount btree corruption errors to the health systemDarrick J. Wong
Whenever we encounter corrupt realtime refcount btree blocks, we should report that to the health monitoring system for later reporting. Signed-off-by: Darrick J. Wong <djwong@kernel.org>
2021-12-15xfs: add realtime refcount btree when adding rt volumeDarrick J. Wong
If we're adding a realtime section to the filesystem, create the rt refcount btree inode before we start adding rt space. Signed-off-by: Darrick J. Wong <djwong@kernel.org>
2021-12-15xfs: check that the rtrefcount maxlevels doesn't increase when growing fsDarrick J. Wong
The size of filesystem transaction reservations depends on the maximum height (maxlevels) of the realtime btrees. Since we don't want a grow operation to increase the reservation size enough that we'll fail the minimum log size checks on the next mount, constrain growfs operations if they would cause an increase in the rt refcount btree maxlevels. Signed-off-by: Darrick J. Wong <djwong@kernel.org>
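To see why growing the rt volume can raise maxlevels, consider a worst-case height computation for a btree whose blocks are all minimally full. The sketch below is a generic illustration under that assumption; the function names and the record counts are hypothetical and it is not the kernel's implementation.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Worst-case btree height for nrecs records when every leaf holds only
 * leaf_minrecs records and every node only node_minrecs pointers.
 * Returns 0 if the limits make no sense.
 */
static unsigned int
btree_max_levels(uint64_t nrecs, unsigned int leaf_minrecs,
		 unsigned int node_minrecs)
{
	unsigned int levels = 1;
	uint64_t blocks;

	if (leaf_minrecs == 0 || node_minrecs < 2)
		return 0;

	/* Number of leaf blocks, rounded up. */
	blocks = (nrecs + leaf_minrecs - 1) / leaf_minrecs;

	/* Add one level per layer of nodes needed to reach a single root. */
	while (blocks > 1) {
		blocks = (blocks + node_minrecs - 1) / node_minrecs;
		levels++;
	}
	return levels;
}

/* A grow is acceptable here only if the worst-case height does not rise. */
static bool
grow_keeps_maxlevels(uint64_t old_nrecs, uint64_t new_nrecs,
		     unsigned int leaf_minrecs, unsigned int node_minrecs)
{
	return btree_max_levels(new_nrecs, leaf_minrecs, node_minrecs) <=
	       btree_max_levels(old_nrecs, leaf_minrecs, node_minrecs);
}
```

With ten records per block at every level, 1000 records need at most three levels, while 1001 records could need four, so a grow that crosses such a threshold would be rejected by this check.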
2021-12-15xfs: enable extent size hints for CoW operationsDarrick J. Wong
Wire up the copy-on-write extent size hint for realtime files, and connect it to the rt allocator so that we avoid fragmentation on rt filesystems. Signed-off-by: Darrick J. Wong <djwong@kernel.org>
2021-12-15xfs: apply rt extent alignment constraints to CoW extsize hintDarrick J. Wong
The copy-on-write extent size hint is subject to the same alignment constraints as the regular extent size hint. Since we're in the process of adding reflink (and therefore CoW) to the realtime device, we must apply the same scattered rextsize alignment validation strategies to both hints to deal with the possibility of rextsize changing. Therefore, fix the inode validator to perform rextsize alignment checks on regular realtime files, and to remove misaligned directory hints. Signed-off-by: Darrick J. Wong <djwong@kernel.org>
2021-12-15xfs: fix xfs_get_extsz_hint behavior with realtime alwayscow filesDarrick J. Wong
Currently, we (ab)use xfs_get_extsz_hint so that it always returns a nonzero value for realtime files. This apparently was done to disable delayed allocation for realtime files. However, once we enable realtime reflink, we can also turn on the alwayscow flag to force CoW writes to realtime files. In this case, the logic will incorrectly send the write through the delalloc write path. Fix this by adjusting the logic slightly. Signed-off-by: Darrick J. Wong <djwong@kernel.org>
2021-12-15xfs: recover CoW leftovers in the realtime volumeDarrick J. Wong
Scan the realtime refcount tree at mount time to get rid of leftover CoW staging extents. Signed-off-by: Darrick J. Wong <djwong@kernel.org>
2021-12-15xfs: allow inodes to have the realtime and reflink flagsDarrick J. Wong
Now that we can share blocks between realtime files, allow this combination. Signed-off-by: Darrick J. Wong <djwong@kernel.org>
2021-12-15xfs: enable sharing of realtime file blocksDarrick J. Wong
Update the remapping routines to be able to handle realtime files. Signed-off-by: Darrick J. Wong <djwong@kernel.org>
2021-12-15xfs: enable CoW for realtime dataDarrick J. Wong
Update our write paths to support copy on write on the rt volume. This works in more or less the same way as it does on the data device, with the major exception that we never do delalloc on the rt volume. Because we consider unwritten CoW fork staging extents to be incore quota reservation, we update xfs_quota_reserve_blkres to support this case. Though xfs doesn't allow rt and quota together, the change is trivial and we shouldn't leave a logic bomb here. Signed-off-by: Darrick J. Wong <djwong@kernel.org>
2021-12-15xfs: refactor reflink quota updatesDarrick J. Wong
Hoist all quota updates for reflink into a helper function, since things are about to become more complicated. Signed-off-by: Darrick J. Wong <djwong@kernel.org>
2021-12-15xfs: compute rtrmap btree max levels when reflink enabledDarrick J. Wong
Compute the maximum possible height of the realtime rmap btree when reflink is enabled. Signed-off-by: Darrick J. Wong <djwong@kernel.org>
2021-12-15xfs: update rmap to allow cow staging extents in the rt rmapDarrick J. Wong
Don't error out on CoW staging extent records when realtime reflink is enabled. Signed-off-by: Darrick J. Wong <djwong@kernel.org>
2021-12-15xfs: fix confusing variable names in xfs_refcount_item.cDarrick J. Wong
Variable names in this code module are inconsistent and confusing. xfs_phys_extent structures describe physical mappings, so rename them "pmap". xfs_refcount_intent structures describe refcount intents, so rename them "ri". Signed-off-by: Darrick J. Wong <djwong@kernel.org>
2021-12-15xfs: create routine to allocate and initialize a realtime refcount btree inodeDarrick J. Wong
Create a library routine to allocate and initialize an empty realtime refcountbt inode. We'll use this for growfs, mkfs, and repair. Signed-off-by: Darrick J. Wong <djwong@kernel.org>
2021-12-15xfs: wire up realtime refcount btree cursorsDarrick J. Wong
Wire up realtime refcount btree cursors wherever they're needed throughout the code base. Signed-off-by: Darrick J. Wong <djwong@kernel.org>
2021-12-15xfs: wire up a new inode fork type for the realtime refcountDarrick J. Wong
Plumb in the pieces we need to embed the root of the realtime refcount btree in an inode's data fork, complete with new fork type and on-disk interpretation functions. Signed-off-by: Darrick J. Wong <djwong@kernel.org>
2021-12-15xfs: add metadata reservations for realtime refcount btreeDarrick J. Wong
Reserve some free blocks so that we will always have enough free blocks in the data volume to handle expansion of the realtime refcount btree. Signed-off-by: Darrick J. Wong <djwong@kernel.org>
2021-12-15xfs: add realtime refcount btree inode to metadata directoryDarrick J. Wong
Add a metadir path to select the realtime refcount btree inode and load it at mount time. The rtrefcountbt inode will have a unique extent format code, which means that we also have to update the inode validation and flush routines to look for it. Signed-off-by: Darrick J. Wong <djwong@kernel.org>
2021-12-15xfs: add realtime refcount btree block detection to log recoveryDarrick J. Wong
Identify rt refcount btree blocks in the log correctly so that we can validate them during log recovery. Signed-off-by: Darrick J. Wong <djwong@kernel.org>
2021-12-15xfs: support recovering refcount intent items targeting realtime extentsDarrick J. Wong
Now that we have reflink on the realtime device, refcount intent items have to support remapping extents on the realtime volume. Signed-off-by: Darrick J. Wong <djwong@kernel.org>
2021-12-15xfs: add a realtime flag to the refcount update log redo itemsDarrick J. Wong
Extend the refcount update (CUI) log items with a new realtime flag that indicates that the updates apply against the realtime refcountbt. We'll wire up the actual refcount code later. Signed-off-by: Darrick J. Wong <djwong@kernel.org>
2021-12-15xfs: prepare refcount functions to deal with rtrefcountbtDarrick J. Wong
Prepare the high-level refcount functions to deal with the new realtime refcountbt and its slightly different conventions. Provide the ability to talk to either refcountbt or rtrefcountbt formats from the same high level code. Note that we leave the _recover_cow_leftovers functions for a separate patch so that we can convert it all at once. Signed-off-by: Darrick J. Wong <djwong@kernel.org>
2021-12-15xfs: add realtime refcount btree operationsDarrick J. Wong
Implement the generic btree operations needed to manipulate rtrefcount btree blocks. This is different from the regular refcountbt in that we allocate space from the filesystem at large, and are neither constrained to the free space nor any particular AG. Signed-off-by: Darrick J. Wong <djwong@kernel.org>
2021-12-15xfs: realtime refcount btree transaction reservationsDarrick J. Wong
Make sure that there's enough log reservation to handle mapping and unmapping realtime extents. We have to reserve enough space to handle a split in the rtrefcountbt to add the record and a second split in the regular refcountbt to record the rtrefcountbt split. Signed-off-by: Darrick J. Wong <djwong@kernel.org>
2021-12-15xfs: define the on-disk realtime refcount btree formatDarrick J. Wong
Start filling out the rtrefcount btree implementation. Start with the on-disk btree format; add everything needed to read, write and manipulate refcount btree blocks. This prepares the way for connecting the btree operations implementation. Signed-off-by: Darrick J. Wong <djwong@kernel.org>
2021-12-15xfs: introduce realtime refcount btree definitionsDarrick J. Wong
Add new realtime refcount btree definitions. The realtime refcount btree will be rooted from a hidden inode, but has its own shape and therefore needs to have most of its own separate types. Signed-off-by: Darrick J. Wong <djwong@kernel.org>
2021-12-15xfs: widen xfs_refcount_irec fields to handle realtime refcountbtDarrick J. Wong
Change the startblock and blockcount fields of xfs_refcount_irec to be 64 bits wide. This enables us to use the same high level refcount code for either tree. We'll also collect all the resulting breakage fixes here. Signed-off-by: Darrick J. Wong <djwong@kernel.org>
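The effect of widening can be illustrated with two hypothetical record layouts. The sketch assumes that the narrow form stores 32-bit start and length fields (enough for addresses relative to one allocation group) while realtime block numbers may need 64 bits; the struct and function names are made up for illustration and are not the kernel definitions.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical narrow record: 32-bit start and length. */
struct refcount_rec_narrow {
	uint32_t	startblock;
	uint32_t	blockcount;
	uint32_t	refcount;
};

/* Hypothetical wide record: 64-bit start and length. */
struct refcount_rec_wide {
	uint64_t	startblock;
	uint64_t	blockcount;
	uint32_t	refcount;
};

/*
 * Copy a wide record into a narrow one.  Returns false, leaving *n
 * untouched, if either field would be truncated.
 */
static bool
narrow_from_wide(const struct refcount_rec_wide *w,
		 struct refcount_rec_narrow *n)
{
	if (w->startblock > UINT32_MAX || w->blockcount > UINT32_MAX)
		return false;
	n->startblock = (uint32_t)w->startblock;
	n->blockcount = (uint32_t)w->blockcount;
	n->refcount = w->refcount;
	return true;
}
```

Keeping the incore record wide lets shared high-level code carry either kind of address, and any narrowing can be confined to the point where a record is written in a 32-bit format.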
2021-12-15xfs: restructure parameters to xfs_reflink_find_sharedDarrick J. Wong
In preparation for widening the refcount code to accept 64-bit extents, clean up the method signature for xfs_reflink_find_shared by passing in the bmbt irec that both callers are checking, and pass back out the same types that are found in the irec. Make the function static since there are fewer callers than there used to be. Signed-off-by: Darrick J. Wong <djwong@kernel.org>
2021-12-15xfs: remove xfs_trans_set_refcount_flags [refcount-intent-cleanups_2021-12-15]Darrick J. Wong
Remove this single-use helper. Signed-off-by: Darrick J. Wong <djwong@kernel.org>
2021-12-15xfs: clean up refcount log intent item tracepoint callsitesDarrick J. Wong
Pass the incore refcount intent structure to the tracepoints instead of open-coding the argument passing. Signed-off-by: Darrick J. Wong <djwong@kernel.org>
2021-12-15xfs: pass refcount intent directly through the log intent codeDarrick J. Wong
Pass the incore refcount intent through the CUI logging code instead of repeatedly boxing and unboxing parameters. We'll clean up the tracepoints shortly. Signed-off-by: Darrick J. Wong <djwong@kernel.org>
2021-12-15xfs: prepare refcount btree tracepoints for wideningDarrick J. Wong
Prepare the rest of refcount btree tracepoints for use with realtime reflink by making them take the btree cursor object as a parameter. This will save us a lot of trouble later on. Remove the xfs_refcount_recover_extent tracepoint since it's already covered by other refcount tracepoints. Signed-off-by: Darrick J. Wong <djwong@kernel.org>