Age | Commit message | Author
2020-06-01 | xfs: fix an incore inode UAF in xfs_bui_recover [fix-log-recovery_2020-06-01] | Darrick J. Wong
In xfs_bui_item_recover, there exists a use-after-free bug with regard to the inode that is involved in the bmap replay operation. If the mapping operation does not complete, we call xfs_bmap_unmap_extent to create a deferred op to finish the unmapping work, and we retain a pointer to the incore inode. Unfortunately, the very next thing we do is commit the transaction and drop the inode. If reclaim tears down the inode before we try to finish the defer ops, we dereference garbage and blow up. Therefore, create a way to join inodes to the defer ops freezer so that we can maintain the xfs_inode reference until we're done with the inode. Note: This imposes the requirement that there be enough memory to keep every incore inode in memory throughout recovery. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
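The fix described above amounts to holding an extra reference on the inode for as long as deferred work still points at it. The following is a minimal userspace sketch of that idea, not the kernel code: `toy_inode`, `toy_capture`, and every `toy_*` function are hypothetical stand-ins, and a plain integer refcount replaces the real inode cache and reclaim machinery.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for an in-core inode with a reference count. */
struct toy_inode {
	int refcount;
	int ino;
};

/* Hypothetical capture structure: deferred work plus the inode it needs. */
struct toy_capture {
	struct toy_inode *ip;	/* reference held until the work is finished */
};

/* Look up (here: allocate) an inode and return it with one reference. */
static struct toy_inode *toy_iget(int ino)
{
	struct toy_inode *ip = malloc(sizeof(*ip));

	if (!ip)
		return NULL;
	ip->refcount = 1;
	ip->ino = ino;
	return ip;
}

/* Drop one reference; the inode is torn down when the last one goes. */
static void toy_irele(struct toy_inode *ip)
{
	if (--ip->refcount == 0)
		free(ip);
}

/*
 * Join the inode to the capture. The capture takes its own reference, so
 * the caller dropping its reference can no longer free the inode.
 */
static void toy_capture_join(struct toy_capture *cap, struct toy_inode *ip)
{
	ip->refcount++;
	cap->ip = ip;
}

/* Finish the deferred work, then drop the capture's reference. */
static int toy_capture_finish(struct toy_capture *cap)
{
	int ino = cap->ip->ino;	/* safe: the capture still holds a reference */

	toy_irele(cap->ip);
	cap->ip = NULL;
	return ino;
}
```

In this sketch the caller can call `toy_irele` right after `toy_capture_join`, mirroring the "commit the transaction and drop the inode" step, and the inode stays valid until `toy_capture_finish` runs.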
2020-06-01 | xfs: clean up xfs_bui_item_recover iget/trans_alloc/ilock ordering | Darrick J. Wong
In most places in XFS, we have a specific order in which we gather resources: grab the inode, allocate a transaction, then lock the inode. xfs_bui_item_recover doesn't do it in that order, so fix it to be more consistent. This also makes the error bailout code a bit less weird. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
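The ordering described above (grab the inode, allocate a transaction, lock the inode) pairs naturally with error labels that unwind in reverse order. The sketch below illustrates only that shape. All `toy_*` names are hypothetical, each step merely appends a letter to a trace string, and `fail_at` is an invented knob for forcing a failure at a given step.

```c
#include <assert.h>
#include <string.h>

/* Records the order in which the toy resources are taken and released. */
static char toy_trace[16];

static void toy_note(char step)
{
	size_t n = strlen(toy_trace);

	if (n + 1 < sizeof(toy_trace)) {
		toy_trace[n] = step;
		toy_trace[n + 1] = '\0';
	}
}

/* A step that can fail: returns -1 without side effects if id == fail_at. */
static int toy_step(char step, int id, int fail_at)
{
	if (id == fail_at)
		return -1;
	toy_note(step);
	return 0;
}

/* Acquire resources in a fixed order and unwind in reverse on error. */
static int toy_recover(int fail_at)
{
	int error;

	toy_trace[0] = '\0';

	error = toy_step('G', 1, fail_at);	/* 1. grab the inode */
	if (error)
		return error;
	error = toy_step('T', 2, fail_at);	/* 2. allocate the transaction */
	if (error)
		goto out_rele;
	toy_note('L');				/* 3. lock the inode */

	error = toy_step('W', 3, fail_at);	/* the replay work itself */
	if (error)
		goto out_cancel;

	toy_note('c');				/* commit the transaction */
	toy_note('u');				/* unlock the inode */
	toy_note('r');				/* release the inode */
	return 0;

out_cancel:
	toy_note('x');				/* cancel the transaction */
	toy_note('u');				/* unlock the inode */
out_rele:
	toy_note('r');				/* release the inode */
	return error;
}
```

Because every resource is taken in the same order, each failure point needs exactly one label, which is what keeps the bailout path simple.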
2020-06-01 | xfs: clean up bmap intent item recovery checking | Darrick J. Wong
The bmap intent item checking code in xfs_bui_item_recover is spread all over the function. We should check the recovered log item at the top before we allocate any resources or do anything else, so do that. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2020-06-01 | xfs: xfs_defer_capture should absorb remaining block reservation | Darrick J. Wong
When xfs_defer_capture extracts the deferred ops and transaction state from a transaction, it should absorb the remaining block reservation so that when we continue the dfops chain, we still have those blocks to use. This adds the requirement that every log intent item recovery function must be careful to reserve enough blocks to handle both itself and all defer ops that it can queue. On the other hand, this enables us to do away with the handwaving block estimation nonsense that was going on in xlog_finish_defer_ops. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
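As a rough illustration of "absorbing" the remaining block reservation, the sketch below moves the unused part of a transaction's reservation into a capture structure and later hands it to the transaction that continues the chain. The structures and field names (`toy_trans`, `toy_dfc`, `blk_res`, `blk_res_used`) are assumptions made for this example and are not claimed to match the kernel's definitions.

```c
#include <assert.h>

/* Hypothetical transaction: total blocks reserved and blocks consumed. */
struct toy_trans {
	unsigned int blk_res;
	unsigned int blk_res_used;
};

/* Hypothetical capture of deferred ops plus the reservation they need. */
struct toy_dfc {
	unsigned int blk_res;
};

/* Move the unused part of the reservation from the transaction to the capture. */
static void toy_capture_absorb(struct toy_dfc *dfc, struct toy_trans *tp)
{
	dfc->blk_res = tp->blk_res - tp->blk_res_used;
	tp->blk_res = tp->blk_res_used;	/* nothing is left to give back at commit */
}

/* Hand the captured reservation to the transaction that continues the chain. */
static void toy_capture_continue(struct toy_dfc *dfc, struct toy_trans *tp)
{
	tp->blk_res += dfc->blk_res;
	dfc->blk_res = 0;
}
```

The consequence noted in the commit message shows up directly: the continuing transaction has only what the recovery function originally reserved and did not use, so that reservation has to cover the whole chain.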
2020-06-01 | xfs: proper replay of deferred ops queued during log recovery | Darrick J. Wong
When we replay unfinished intent items that have been recovered from the log, it's possible that the replay will cause the creation of more deferred work items. As outlined in commit 509955823cc9c ("xfs: log recovery should replay deferred ops in order"), later work items have an implicit ordering dependency on earlier work items. Therefore, recovery must replay the items (both recovered and created) in the same order that they would have been during normal operation. For log recovery, we enforce this ordering by using an empty transaction to collect deferred ops that get created in the process of recovering a log intent item to prevent them from being committed before the rest of the recovered intent items. After we finish committing all the recovered log items, we allocate a transaction with an enormous block reservation, splice our huge list of created deferred ops into that transaction, and commit it, thereby finishing all those ops. This is /really/ hokey -- it's the one place in XFS where we allow nested transactions; the splicing of the defer ops list is inelegant and has to be done twice per recovery function; and the broken way we handle inode pointers and block reservations causes subtle use-after-free and allocator problems that will be fixed by this patch and the two patches after it. Therefore, replace the hokey empty transaction with a structure designed to capture each chain of deferred ops that are created as part of recovering a single unfinished log intent. Finally, refactor the loop that replays those chains to do so using one transaction per chain. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
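The replacement scheme described above can be pictured as a list of captured chains: recovering one intent appends one chain, and a later loop finishes the chains in capture order with one transaction each. The sketch below models that with fixed-size arrays. Every `toy_*` name is hypothetical, deferred ops are plain integers, and "allocating a transaction" is reduced to incrementing a counter.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_MAX_OPS	4
#define TOY_MAX_CHAINS	4

/* Hypothetical capture of the deferred ops queued while recovering one intent. */
struct toy_chain {
	int ops[TOY_MAX_OPS];
	size_t nr_ops;
};

/* Captured chains, kept in the order in which the intents were recovered. */
struct toy_capture_list {
	struct toy_chain chains[TOY_MAX_CHAINS];
	size_t nr_chains;
};

/*
 * Recover one intent. Instead of finishing the newly queued deferred ops
 * right away, store them as one chain at the tail of the list.
 */
static int toy_recover_intent(struct toy_capture_list *list,
			      const int *new_ops, size_t nr)
{
	struct toy_chain *c;

	if (list->nr_chains == TOY_MAX_CHAINS || nr > TOY_MAX_OPS)
		return -1;
	c = &list->chains[list->nr_chains++];
	c->nr_ops = nr;
	for (size_t i = 0; i < nr; i++)
		c->ops[i] = new_ops[i];
	return 0;
}

/*
 * Replay the chains in capture order, one transaction per chain. Returns
 * the number of transactions used and records the op order in out[].
 */
static size_t toy_finish_chains(struct toy_capture_list *list, int *out,
				size_t out_len, size_t *nr_out)
{
	size_t ntrans = 0;
	size_t n = 0;

	for (size_t i = 0; i < list->nr_chains; i++) {
		ntrans++;	/* stands in for allocating and committing a transaction */
		for (size_t j = 0; j < list->chains[i].nr_ops && n < out_len; j++)
			out[n++] = list->chains[i].ops[j];
	}
	list->nr_chains = 0;
	*nr_out = n;
	return ntrans;
}
```

Keeping each chain separate is what removes the need for a single transaction with an enormous block reservation: every chain is finished under its own transaction.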
2020-06-01 | xfs: report realtime rmap btree corruption errors to the health system [realtime-rmap_2020-06-01] | Darrick J. Wong
Whenever we encounter corrupt realtime rmap btree blocks, we should report that to the health monitoring system for later reporting. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2020-06-01 | xfs: online repair of the realtime rmap btree | Darrick J. Wong
Repair the realtime rmap btree while mounted. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2020-06-01 | xfs: online repair of realtime bitmaps | Darrick J. Wong
Rebuild the realtime bitmap from the realtime rmap btree. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2020-06-01 | xfs: online repair of realtime file bmaps | Darrick J. Wong
Repair the block mappings of realtime files. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2020-06-01 | xfs: cross-reference the realtime rmapbt | Darrick J. Wong
Teach the data fork and realtime bitmap scrubbers to cross-reference information with the realtime rmap btree. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2020-06-01 | xfs: cross-reference realtime bitmap to realtime rmapbt scrubber | Darrick J. Wong
When we're checking the realtime rmap btree entries, cross-reference those entries with the realtime bitmap too. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2020-06-01 | xfs: scrub the realtime rmapbt | Darrick J. Wong
Check the realtime reverse mapping btree against the rtbitmap, and modify the rtbitmap scrub to check against the rtrmapbt. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2020-06-01 | xfs: refactor realtime scrubbing context management | Darrick J. Wong
Create a pair of helpers to deal with setting up the necessary incore context to check metadata records against the realtime metadata. This was already (sort of) open-coded in the data fork checker. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2020-06-01 | xfs: wire up getfsmap to the realtime reverse mapping btree | Darrick J. Wong
Connect the getfsmap ioctl to the realtime rmapbt. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2020-06-01 | xfs: enable realtime rmap btree | Darrick J. Wong
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2020-06-01 | xfs: dynamically create the realtime rmapbt inode when attaching rtdev | Darrick J. Wong
If the administrator asks us to add a realtime volume to an existing rmap filesystem, we must allocate and attach the rtrmapbt inode to the system prior to enabling the rt volume. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2020-06-01 | xfs: create routine to allocate and initialize a realtime rmap btree inode | Darrick J. Wong
Create a library routine to allocate and initialize an empty realtime rmapbt inode. We'll use this for growfs, mkfs, and repair. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2020-06-01 | xfs: wire up rmap map and unmap to the realtime rmapbt | Darrick J. Wong
Connect the map and unmap reverse-mapping operations to the realtime rmapbt via the deferred operation callbacks. This enables us to perform rmap operations against the correct btree. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2020-06-01 | xfs: wire up a new inode fork type for the realtime rmap | Darrick J. Wong
Plumb in the pieces we need to embed the root of the realtime rmap btree in an inode's data fork, complete with new fork type and on-disk interpretation functions. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2020-06-01 | xfs: add realtime reverse map inode to superblock | Darrick J. Wong
Add a metadir path to select the realtime rmap btree inode and load it at mount time. The rtrmapbt inode will have a unique extent format code, which means that we also have to update the inode validation and flush routines to look for it. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2020-06-01 | xfs: add realtime rmap btree block detection to log recovery | Darrick J. Wong
Identify rtrmapbt blocks in the log correctly so that we can validate them during log recovery. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2020-06-01 | xfs: add a realtime flag to the rmap update log redo items | Darrick J. Wong
Extend the rmap update (RUI) log items with a new realtime flag that indicates that the updates apply against the realtime rmapbt. We'll wire up the actual rmap code later. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2020-06-01 | xfs: prepare rmap functions to deal with rtrmapbt | Darrick J. Wong
Prepare the high-level rmap functions to deal with the new realtime rmapbt and its slightly different conventions. Provide the ability to talk to either rmapbt or rtrmapbt formats from the same high level code. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2020-06-01 | xfs: add realtime rmap btree operations | Darrick J. Wong
Implement the generic btree operations needed to manipulate rtrmap btree blocks. This is different from the regular rmapbt in that we allocate space from the filesystem at large, and are neither constrained to the free space nor any particular AG. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2020-06-01 | xfs: realtime rmap btree transaction reservations | Darrick J. Wong
Make sure that there's enough log reservation to handle mapping and unmapping realtime extents. We have to reserve enough space to handle a split in the rtrmapbt to add the record and a second split in the regular rmapbt to record the rtrmapbt split. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2020-06-01 | xfs: define the on-disk realtime rmap btree format | Darrick J. Wong
Start filling out the rtrmap btree implementation. Start with the on-disk btree format; add everything needed to read, write and manipulate rmap btree blocks. This prepares the way for connecting the btree operations implementation. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2020-06-01 | xfs: introduce realtime rmap btree definitions | Darrick J. Wong
Add new realtime rmap btree definitions. The realtime rmap btree will be rooted from a hidden inode, but has its own shape and therefore needs to have most of its own separate types. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2020-06-01 | xfs: widen xfs_rmap_irec fields to handle realtime rmapbt | Darrick J. Wong
Change the startblock and blockcount fields of xfs_rmap_irec to be 64 bits wide. This enables us to use the same high level rmap code for either tree. We'll also collect all the resulting breakage fixes here. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
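To see why widening matters, consider a record whose start block does not fit in 32 bits. The sketch below is only an illustration: `toy_rmap_narrow` and `toy_rmap_wide` are hypothetical structures, and the assumption that the narrower fields were 32 bits wide is made for the example rather than taken from the commit message.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical narrow record: enough for blocks counted within one AG. */
struct toy_rmap_narrow {
	uint32_t startblock;
	uint32_t blockcount;
};

/* Hypothetical wide record: can also describe blocks on a large volume. */
struct toy_rmap_wide {
	uint64_t startblock;
	uint64_t blockcount;
};

/* Return 1 if the extent can be stored in the narrow record without loss. */
static int toy_fits_narrow(uint64_t startblock, uint64_t blockcount)
{
	return startblock <= UINT32_MAX && blockcount <= UINT32_MAX;
}

/* Widening a narrow record never loses information. */
static struct toy_rmap_wide toy_widen(struct toy_rmap_narrow n)
{
	struct toy_rmap_wide w = { n.startblock, n.blockcount };

	return w;
}
```

With a single wide in-memory record, the same high-level code can carry values from either tree, and only the conversion to each on-disk format has to care about the narrower range.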
2020-06-01 | xfs: support storing records in the inode core root | Darrick J. Wong
Make it so that we can actually store btree records in the inode core (i.e. enable bb_level == 0) so that the rtrmapbt can do this. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2020-06-01 | xfs: make iroot_realloc a btree function | Darrick J. Wong
For btrees that are rooted in the inode core, we have to have a function to resize the root. This is fairly specific to each btree type, so make xfs_iroot_realloc a per-btree function. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2020-06-01 | xfs: enable metadata inode directory feature [metadir_2020-06-01] | Darrick J. Wong
Enable the metadata inode directory feature. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2020-06-01 | xfs: disable the agi rotor for metadata inodes | Darrick J. Wong
Ideally, we'd put all the metadata inodes in one place if we could, so that the metadata all stay reasonably close together instead of spreading out over the disk. Furthermore, if the log is internal we'd probably prefer to keep the metadata near the log. Therefore, disable AGI rotoring for metadata inode allocations. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2020-06-01 | xfs: ensure metadata directory paths exist before creating files | Darrick J. Wong
Since xfs_imeta_create can create new metadata files arbitrarily deep in the metadata directory tree, we must supply a function that can ensure that all directories in a path exist, and call it before the quota functions create the quota inodes. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
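Ensuring that every directory in a path exists is essentially a `mkdir -p` walk over the parent components. The sketch below shows that walk against a tiny in-memory table. It is not the kernel helper: all `toy_*` names are hypothetical, paths are assumed to be relative and '/'-separated, and the example path names used with it are invented.

```c
#include <assert.h>
#include <string.h>

#define TOY_MAX_DIRS	16
#define TOY_PATH_LEN	64

/* Hypothetical in-memory set of directory paths that already exist. */
static char toy_dirs[TOY_MAX_DIRS][TOY_PATH_LEN];
static int toy_nr_dirs;

static int toy_dir_exists(const char *path)
{
	for (int i = 0; i < toy_nr_dirs; i++)
		if (strcmp(toy_dirs[i], path) == 0)
			return 1;
	return 0;
}

static int toy_mkdir(const char *path)
{
	if (toy_nr_dirs == TOY_MAX_DIRS || strlen(path) >= TOY_PATH_LEN)
		return -1;
	strcpy(toy_dirs[toy_nr_dirs++], path);
	return 0;
}

/*
 * Create every missing parent directory of a relative, '/'-separated path.
 * The final component is the file itself and is not created here.
 * Returns the number of directories created, or -1 on error.
 */
static int toy_ensure_dirpath(const char *path)
{
	char prefix[TOY_PATH_LEN];
	size_t len = strlen(path);
	int created = 0;

	if (len >= TOY_PATH_LEN)
		return -1;
	for (size_t i = 0; i < len; i++) {
		if (path[i] != '/')
			continue;
		memcpy(prefix, path, i);	/* everything before this '/' */
		prefix[i] = '\0';
		if (toy_dir_exists(prefix))
			continue;
		if (toy_mkdir(prefix))
			return -1;
		created++;
	}
	return created;
}
```

A caller would run `toy_ensure_dirpath` on the full path first and only then create the file, which matches the "call it before the quota functions create the quota inodes" ordering in the commit message.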
2020-06-01 | xfs: read and write metadata inode directory | Darrick J. Wong
Plumb in the bits we need to look up metadata inode numbers from the metadata inode directory and save them back. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2020-06-01 | xfs: enforce metadata inode flag | Darrick J. Wong
Add checks for the metadata inode flag so that we don't ever leak metadata inodes out to userspace, and we don't ever try to read a regular inode as metadata. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2020-06-01 | xfs: convert metadata inode lookup keys to use paths | Darrick J. Wong
Convert the magic metadata inode lookup keys to use actual strings for paths. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2020-06-01 | xfs: load metadata inode directory at mount time | Darrick J. Wong
Load the metadata directory inode into memory at mount time and release it at unmount time. We also make sure that the obsolete inode pointers in the superblock are not logged or read from the superblock. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2020-06-01 | xfs: define the on-disk format for the metadir feature | Darrick J. Wong
Define the on-disk layout and feature flags for the metadata inode directory feature. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2020-06-01 | xfs: iget for metadata inodes | Darrick J. Wong
Create an xfs_iget_meta function for metadata inodes to ensure that we always check that the inobt thinks a metadata inode is in use. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2020-06-01 | xfs: convert all users to xfs_imeta_log | Darrick J. Wong
Convert all open-coded sb metadata inode pointer logging to use xfs_imeta_log. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2020-06-01 | xfs: refactor the v4 group/project inode pointer switch | Darrick J. Wong
Refactor the group and project quota inode pointer switcheroo that happens only on v4 filesystems into a separate function prior to enhancing the xfs_qm_qino_alloc function. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2020-06-01 | xfs: create transaction reservations for metadata inode operations | Darrick J. Wong
Create transaction reservation types and block reservation helpers to help us calculate transaction requirements. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2020-06-01 | xfs: create imeta abstractions to get and set metadata inodes | Darrick J. Wong
Create some helper routines to get and set metadata inode numbers instead of open-coding them throughout xfs. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2020-06-01 | xfs: get rid of cross_rename [inode-refactor_2020-06-01] | Darrick J. Wong
Get rid of the largely pointless xfs_cross_rename now that we've refactored its parent. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2020-06-01 | xfs: create libxfs helper to rename two directory entries | Darrick J. Wong
Create a new libxfs function to rename two directory entries. The upcoming metadata directory feature will need this to replace a metadata inode directory entry. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2020-06-01 | xfs: create libxfs helper to exchange two directory entries | Darrick J. Wong
Create a new libxfs function to exchange two directory entries. The upcoming metadata directory feature will need this to replace a metadata inode directory entry. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2020-06-01 | xfs: create libxfs helper to remove an existing inode/name from a directory | Darrick J. Wong
Create a new libxfs function to remove a (name, inode) entry from a directory. The upcoming metadata directory feature will need this to create a metadata directory tree. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2020-06-01 | xfs: hoist inode free function to libxfs | Darrick J. Wong
Create a libxfs helper function that marks an inode free on disk. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2020-06-01 | xfs: create libxfs helper to link an existing inode into a directory | Darrick J. Wong
Create a new libxfs function to link an existing inode into a directory. The upcoming metadata directory feature will need this to create a metadata directory tree. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2020-06-01 | xfs: create libxfs helper to link a new inode into a directory | Darrick J. Wong
Create a new libxfs function to link a newly created inode into a directory. The upcoming metadata directory feature will need this to create a metadata directory tree. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>