Age | Commit message | Author |
|
Refactor realtime metadata inode locking so that we can get some sense
here.
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
|
|
Realtime metadata files are not quite regular files because userspace
can't access the realtime bitmap directly, and because we take the ILOCK
of the rt bitmap file while holding the ILOCK of a realtime file. The
double nature of inodes confuses lockdep, so up until now we've created
lockdep subclasses to help lockdep keep things straight.
We've gotten away with using lockdep subclasses because there are only two
rt metadata files, but with the coming addition of realtime rmap and
refcounting, we'd need two more subclasses, which is a lot of class bits
to burn on a side feature.
Therefore, switch to manually setting the lockdep class of the rt
metadata ILOCKs. In the next patch we'll remove the rt-related ILOCK
subclasses.
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
|
|
Create a pair of helpers to deal with setting up the necessary incore
context to check metadata records against the realtime metadata. Right
now this is limited to locking the realtime bitmap and summary inodes,
but as we add rmap and reflink to the realtime device this will grow to
include btree cursors.
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
|
|
In commit 2c813ad66a72, I partially fixed a bug wherein xfs_btree_insrec
would erroneously try to update the parent's key for a block that had
been split if we decided to insert the new record into the new block.
The solution was to detect this situation and update the in-core key
value that we pass up to the caller so that the caller will (eventually)
add the new block to the parent level of the tree with the correct key.
However, I missed a subtlety about the way inode-rooted btrees work. If
the full block was a maximally sized inode root block, we'll solve that
fullness by moving the root block's records to a new block, resizing the
root block, and updating the root to point to the new block. We don't
pass a pointer to the new block to the caller because that work has
already been done. The new record will /always/ land in the new block,
so in this case we need to use xfs_btree_update_keys to update the keys.
This bug can theoretically manifest itself in the very rare case that we
split a bmbt root block and the new record lands in the very first slot
of the new block, though I've never managed to trigger it in practice.
However, it is very easy to reproduce by running generic/522 with the
realtime rmapbt patchset if rtinherit=1.
Fixes: 2c813ad66a72 ("xfs: support btrees with overlapping intervals for keys")
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
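
The general hazard behind this fix can be shown with a toy model. The sketch below is not xfs_btree_insrec; it is a minimal array-based leaf split, with all names (`toy_block`, `toy_insert`, `toy_split_insert`) invented for illustration. It shows that when the new record lands in slot 0 of the new block, the low key that the parent must store has to be read after the insert, because a key captured before the insert would be stale.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define TOY_MAXRECS 4

/* Hypothetical leaf block: sorted integer keys plus one spare slot. */
struct toy_block {
	unsigned int nrecs;
	int recs[TOY_MAXRECS + 1];
};

/* Insert rec into a sorted block; return the slot it landed in. */
unsigned int toy_insert(struct toy_block *b, int rec)
{
	unsigned int slot = 0;

	assert(b->nrecs <= TOY_MAXRECS);
	while (slot < b->nrecs && b->recs[slot] < rec)
		slot++;
	memmove(&b->recs[slot + 1], &b->recs[slot],
		(b->nrecs - slot) * sizeof(b->recs[0]));
	b->recs[slot] = rec;
	b->nrecs++;
	return slot;
}

/*
 * Split a full block, insert rec into the proper half, and return the low
 * key that the parent must store for the new right-hand block.
 */
int toy_split_insert(struct toy_block *left, struct toy_block *right, int rec)
{
	unsigned int half = left->nrecs / 2;

	assert(left->nrecs >= 2);

	/* Move the upper half of the records into the new block. */
	right->nrecs = left->nrecs - half;
	memcpy(right->recs, &left->recs[half],
	       right->nrecs * sizeof(right->recs[0]));
	left->nrecs = half;

	/* Anything above the left block's last key belongs on the right. */
	if (rec > left->recs[left->nrecs - 1])
		toy_insert(right, rec);
	else
		toy_insert(left, rec);

	/*
	 * Read the key only after the insert. If rec landed in slot 0 of
	 * the right block, the pre-insert first key would be wrong.
	 */
	return right->recs[0];
}
```

With a full block holding 10, 20, 30, 40, inserting 25 splits the block into {10, 20} and {30, 40}; the new record goes to the front of the right-hand block, so the key handed to the parent is 25 rather than 30.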
|
|
Add the necessary flags and code so that we can support storing leaf
records in the inode root block of a btree. This hasn't been necessary
before, but the realtime rmapbt will need to be able to do this.
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
|
|
In preparation for allowing records in an inode btree root, hoist the
code that copies keyptrs from an existing node child into the root block
to a separate function. Remove some unnecessary conditionals and clean
up a few function calls in the new function. Note that this change
reorders the ->free_block call with respect to the change in bc_nlevels
to make it easier to support inode root leaf blocks in the next patch.
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
|
|
In preparation for allowing records in an inode btree root, hoist the
code that copies keyptrs from an existing node root into a child block
to a separate function. Note that the new function explicitly computes
the keys of the new child block and stores that in the root block; while
the bmap btree could rely on leaving the key alone, realtime rmap needs
to set the new high key.
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
|
|
Add some logic to xfs_iroot_realloc so that we can handle leaf records
in the btree root block correctly.
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
|
|
In preparation for storing realtime rmap btree roots in an inode fork,
make xfs_iroot_realloc take an ops structure that takes care of all the
btree-specific geometry pieces.
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
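
One way to picture this change is a resize routine that never looks at entry sizes itself and instead asks an ops table how many bytes a root of a given size needs. The sketch below is a userspace illustration only; `toy_broot_ops`, `toy_fork`, `toy_bmbt_size`, and `toy_iroot_realloc` are hypothetical names, and the 8-byte header and 16-byte entry size are assumed values, not the kernel's geometry.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical ops table: each btree type supplies its own geometry. */
struct toy_broot_ops {
	/* Bytes needed for a root block holding nrecs entries. */
	size_t (*size)(unsigned int nrecs);
};

/* Hypothetical incore fork state. */
struct toy_fork {
	void   *broot;        /* incore root block buffer */
	size_t  broot_bytes;  /* current size of that buffer */
};

/* Assumed geometry: 8-byte header, 16 bytes per key/pointer pair. */
size_t toy_bmbt_size(unsigned int nrecs)
{
	return 8 + (size_t)nrecs * 16;
}

const struct toy_broot_ops toy_bmbt_ops = { .size = toy_bmbt_size };

/*
 * Resize the root buffer to hold new_nrecs entries. The generic code only
 * asks the ops table for a byte count. Returns 0 on success, -1 if the
 * allocation fails (the old buffer is left untouched in that case).
 */
int toy_iroot_realloc(struct toy_fork *ifp, const struct toy_broot_ops *ops,
		      unsigned int new_nrecs)
{
	size_t new_bytes;
	void *p;

	assert(ops && ops->size);

	new_bytes = ops->size(new_nrecs);
	p = realloc(ifp->broot, new_bytes);
	if (!p)
		return -1;
	ifp->broot = p;
	ifp->broot_bytes = new_bytes;
	return 0;
}
```

Supporting a second btree type in this model means writing another `size` callback and ops table; the resize routine itself does not change.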
|
|
Standardize the parameters in xfs_{alloc,bm,ino,rmap,refcount}bt_maxrecs
so that we have consistent calling conventions. This doesn't affect the
kernel that much, but enables us to clean up userspace a bit.
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
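
A consistent calling convention for helpers of this kind might look like the sketch below: each one takes the block length and a leaf flag, subtracts the block header, and divides by the per-entry size. The function name, the geometry structure, and the sizes used in the example are illustrative assumptions, not the kernel's actual signatures.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-btree geometry; real values depend on the btree type. */
struct toy_geom {
	size_t hdr_len;  /* bytes of block header */
	size_t rec_len;  /* bytes per leaf record */
	size_t key_len;  /* bytes per node key */
	size_t ptr_len;  /* bytes per node pointer */
};

/*
 * Number of entries that fit in a block of blocklen bytes. Leaf blocks
 * hold records; node blocks hold key/pointer pairs.
 */
size_t toy_maxrecs(const struct toy_geom *g, size_t blocklen, bool leaf)
{
	assert(g->rec_len > 0 && g->key_len + g->ptr_len > 0);

	if (blocklen < g->hdr_len)
		return 0;
	blocklen -= g->hdr_len;
	if (leaf)
		return blocklen / g->rec_len;
	return blocklen / (g->key_len + g->ptr_len);
}
```

Because every caller passes the same three arguments, kernel and userspace code can call any such helper the same way without per-btree special cases.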
|
|
Rearrange the innards of xfs_iroot_realloc so that we can reduce
duplicated code prior to genericizing the function. No functional
changes.
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
|
|
The bmap btree cannot ever have zero records in an incore btree block.
If the number of records drops to zero, that means we're converting the
fork to extents format and are trying to remove the tree. This logic
won't hold for the future realtime rmap btree, so move the logic into
the bmbt code.
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
|
|
Whenever we change the size of the memory buffer holding an inode fork
btree root block, we have to copy the contents over. Refactor all this
into a single function that handles both, in preparation for making
xfs_iroot_realloc more generic.
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
|
|
While refactoring code, I noticed that when xfs_iroot_realloc tries to
shrink a bmbt root block, it allocates a smaller new block and then
copies "records" and pointers to the new block. However, bmbt root
blocks cannot ever be leaves, which means that it's not technically
correct to copy records. We /should/ be copying keys.
Note that this has never resulted in actual memory corruption because
sizeof(bmbt_rec) == (sizeof(bmbt_key) + sizeof(bmbt_ptr)). However,
this will no longer be true when we start adding realtime rmap stuff,
so fix this now.
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
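
The size coincidence described above can be checked in a few lines of C. This is a minimal sketch: the `toy_bmbt_*` types are simplified stand-ins for the on-disk bmbt record, key, and pointer (assumed here to be two 64-bit words, one 64-bit word, and one 64-bit word), and the `toy_wide_*` types are a purely hypothetical layout used only to show how the byte counts diverge once the equality stops holding.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified stand-ins for the on-disk bmbt types (assumed layouts). */
struct toy_bmbt_rec { uint64_t l0, l1; };      /* leaf record */
struct toy_bmbt_key { uint64_t br_startoff; }; /* node key */
typedef uint64_t toy_bmbt_ptr;                 /* node pointer */

/* Hypothetical btree whose key is as wide as a whole record. */
struct toy_wide_rec { uint64_t a, b; };
struct toy_wide_key { uint64_t a, b; };
typedef uint64_t toy_wide_ptr;

/* The coincidence that hid the bug: a record is exactly key + pointer. */
static_assert(sizeof(struct toy_bmbt_rec) ==
	      sizeof(struct toy_bmbt_key) + sizeof(toy_bmbt_ptr),
	      "bmbt record size equals key size plus pointer size");

/* Bytes moved when n entries are (incorrectly) copied as records. */
size_t bytes_as_records(size_t rec_len, size_t n)
{
	return rec_len * n;
}

/* Bytes moved when n entries are (correctly) copied as key/ptr pairs. */
size_t bytes_as_keyptrs(size_t key_len, size_t ptr_len, size_t n)
{
	return (key_len + ptr_len) * n;
}
```

For the bmbt stand-ins both functions return the same byte count, which is why copying "records" out of a node root happened to move the right amount of data. For the hypothetical wide layout the two counts differ, so a records-based copy would move too few bytes.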
|
|
Now that we've created inode fork helpers to allocate and free btree
roots, create a new bmap btree helper to create a new bmbt root, and
refactor the extents <-> btree conversion functions to use our new
helpers.
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
|
|
Refactor the code that allocates and frees the incore inode fork btree
roots. This will help us disentangle some of the weird logic when we're
creating and tearing down inode-based btrees.
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
|
|
Replace all the shouty bmap btree and bmap disk root macros with actual
functions, and fix a type handling error in the xattr code that the
macros previously didn't care about.
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
|
|
Enable the metadata directory feature.
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
|
|
Teach online scrub about the metadata directory tree.
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
|
|
Allow the V5 bulkstat ioctl to return information about metadata
directory files so that xfs_scrub can find and scrub them, since they
are otherwise ordinary directories.
(Metadata files of course require per-file scrub code and hence do not
need exposure.)
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
|
|
Advertise the existence of the metadata directory feature; this will be
used by scrub to decide if it needs to scan the metadir too.
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
|
|
Metadata inodes are private files and therefore cannot be exposed to
userspace. This means no bulkstat, no open-by-handle, no linking them
into the directory tree, and no feeding them to LSMs. As such, we mark
them S_PRIVATE, which stops all that.
While we're at it, put them in a separate lockdep class so that lockdep
won't get confused by "recursive" i_rwsem locking such as what happens when we
write to a rt file and need to allocate from the rt bitmap file.
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
|
|
Ideally, we'd put all the metadata inodes in one place if we could, so
that the metadata all stay reasonably close together instead of
spreading out over the disk. Furthermore, if the log is internal we'd
probably prefer to keep the metadata near the log. Therefore, disable
AGI rotoring for metadata inode allocations.
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
|
|
Since xfs_imeta_create can create new metadata files arbitrarily deep in
the metadata directory tree, we must supply a function that can ensure
that all directories in a path exist, and call it before the quota
functions create the quota inodes.
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
|
|
Plumb in the bits we need to look up metadata inode numbers from the
metadata inode directory and save them back.
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
|
|
Add checks for the metadata inode flag so that we don't ever leak
metadata inodes out to userspace, and we don't ever try to read a
regular inode as metadata.
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
|
|
Convert the magic metadata inode lookup keys to use actual strings
for paths.
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
|
|
Load the metadata directory root inode into memory at mount time and
release it at unmount time. We also make sure that the obsolete inode
pointers in the superblock are not logged or read from the superblock.
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
|
|
Update the new metadata inode transaction reservations to handle
metadata directories if that feature is enabled.
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
|
|
Define the on-disk layout and feature flags for the metadata inode
directory feature.
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
|
|
Create an xfs_iget_meta function for metadata inodes to ensure that we
always check that the inobt thinks a metadata inode is in use.
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
|
|
Convert all open-coded sb metadata inode pointer logging to use
xfs_imeta_log.
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
|
|
Refactor the group and project quota inode pointer switcheroo that
happens only on v4 filesystems into a separate function prior to
enhancing the xfs_qm_qino_alloc function.
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
|
|
Create transaction reservation types and block reservation helpers to
help us calculate transaction requirements. Right now the reservations
are the same as always; we're just separating the symbols for a future
patch.
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
|
|
Create some helper routines to get and set metadata inode numbers
instead of open-coding them throughout xfs.
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
|
|
Get rid of the largely pointless xfs_cross_rename now that we've
refactored its parent.
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
|
|
Create a new libxfs function to rename two directory entries. The
upcoming metadata directory feature will need this to replace a metadata
inode directory entry.
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
|
|
Create a new libxfs function to exchange two directory entries.
The upcoming metadata directory feature will need this to replace a
metadata inode directory entry.
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
|
|
Create a new libxfs function to remove a (name, inode) entry from a
directory. The upcoming metadata directory feature will need this to
create a metadata directory tree.
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
|
|
Create a libxfs helper function that marks an inode free on disk.
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
|
|
Create a new libxfs function to link an existing inode into a directory.
The upcoming metadata directory feature will need this to create a
metadata directory tree.
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
|
|
Create a new libxfs function to link a newly created inode into a
directory. The upcoming metadata directory feature will need this to
create a metadata directory tree.
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
|
|
Move xfs_bumplink and xfs_droplink to libxfs.
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
|
|
Move xfs_iunlink and xfs_iunlink_remove to libxfs.
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
|
|
Create a helper that calls dqalloc to allocate and grab a reference to
dquots for the user, group, and project ids listed in an icreate
structure. This simplifies the creat-related dqalloc callsites
scattered around the code base.
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
|
|
Move the initialization of the xfs_icreate_args structure out of
xfs_create and xfs_create_tempfile into their callers so that we can set
the new inode's attributes in one place and pass that through instead of
open coding the collection of attributes all over the code.
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
|
|
Move all the code that initializes a new inode's attributes from the
icreate_args structure and the parent directory into libxfs.
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
|
|
There are two parts to initializing a newly allocated inode: setting up
the incore structures, and initializing the new inode core based on the
parent inode and the current user's environment. The initialization
code is not specific to the kernel, so we would like to share that with
userspace by hoisting it to libxfs. Therefore, split xfs_icreate into
separate functions to prepare for the next few patches.
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
|
|
Use xfs_trans_ichgtime to set the inode times when allocating an inode,
instead of open-coding them here.
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
|
|
Enable xfs_trans_ichgtime to change the inode access time so that we can
use this function to set inode times when allocating inodes instead of
open-coding it.
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
|