Age | Commit message | Author |
|
Create a polled version of xfs_inactive_force so that we can force
inactivation while holding a lock (usually the umount lock) without
tripping over the softlockup timer. This is for callers that hold vfs
locks while calling inactivation, which currently means unmount, iunlink
processing during mount, and rw->ro remount.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
|
|
Split the inode inactivation work into per-AG work items so that we can
take advantage of parallelization.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
|
|
If we think that inactivation will free enough blocks to make it easier
to satisfy an fallocate request, force inactivation.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
|
|
Any time we try to modify a file's contents and it fails due to ENOSPC
or EDQUOT, force inactivation work to free up some resources and try one
more time.
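The retry-once pattern described above can be sketched in plain C. This is
only an illustration under stated assumptions: struct sketch_fs and every
sketch_* function are hypothetical stand-ins, not the XFS reservation or
inactivation interfaces.

```c
#include <errno.h>

/*
 * Hypothetical stand-in for a filesystem: blocks that are free now, plus
 * blocks still pinned by inodes that are waiting for inactivation.
 */
struct sketch_fs {
	unsigned long	free_blocks;
	unsigned long	deferred_blocks;
};

/* Try to reserve blocks; fail with -ENOSPC if too few are free. */
int sketch_reserve(struct sketch_fs *fs, unsigned long want)
{
	if (fs->free_blocks < want)
		return -ENOSPC;
	fs->free_blocks -= want;
	return 0;
}

/* Stand-in for forcing inactivation: release everything that was deferred. */
void sketch_force_inactive(struct sketch_fs *fs)
{
	fs->free_blocks += fs->deferred_blocks;
	fs->deferred_blocks = 0;
}

/* Reserve; on ENOSPC or EDQUOT, flush deferred work and retry exactly once. */
int sketch_reserve_retry(struct sketch_fs *fs, unsigned long want)
{
	int error = sketch_reserve(fs, want);

	if (error != -ENOSPC && error != -EDQUOT)
		return error;
	sketch_force_inactive(fs);
	return sketch_reserve(fs, want);
}
```

Retrying only once keeps the failure path bounded: if the flush did not
free enough, the caller still sees the original kind of error.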
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
|
|
Instead of calling xfs_inactive directly from xfs_fs_destroy_inode,
defer the inactivation phase to a separate workqueue. With this we
avoid blocking memory reclaim on filesystem metadata updates that are
necessary to free an in-core inode, such as post-eof block freeing, COW
staging extent freeing, and truncating and freeing unlinked inodes. Now
that work is deferred to a workqueue where we can do the freeing in
batches.
We introduce two new inode flags -- NEEDS_INACTIVE and INACTIVATING.
The first flag helps our worker find inodes needing inactivation, and
the second flag marks inodes that are in the process of being
inactivated. A concurrent xfs_iget on the inode can still resurrect the
inode by clearing NEEDS_INACTIVE (or bailing if INACTIVATING is set).
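A minimal sketch of how the two flags might interact, assuming a simplified
model with no locking. The SKETCH_* bits, struct sketch_inode, and the
function names are hypothetical, and whether the worker clears
NEEDS_INACTIVE when it sets INACTIVATING is an assumption made here for
illustration.

```c
#include <stdbool.h>

/* Hypothetical flag bits, for illustration only. */
#define SKETCH_NEEDS_INACTIVE	(1u << 0)	/* queued for the worker */
#define SKETCH_INACTIVATING	(1u << 1)	/* worker is tearing it down */

struct sketch_inode {
	unsigned int	flags;
};

/* Teardown path: defer the work instead of inactivating inline. */
void sketch_queue_inactive(struct sketch_inode *ip)
{
	ip->flags |= SKETCH_NEEDS_INACTIVE;
}

/* Worker side: claim a queued inode; false means nothing to do. */
bool sketch_worker_claim(struct sketch_inode *ip)
{
	if (!(ip->flags & SKETCH_NEEDS_INACTIVE))
		return false;
	ip->flags &= ~SKETCH_NEEDS_INACTIVE;
	ip->flags |= SKETCH_INACTIVATING;
	return true;
}

/* Lookup side: resurrect a queued inode, or bail if the worker owns it. */
bool sketch_iget_resurrect(struct sketch_inode *ip)
{
	if (ip->flags & SKETCH_INACTIVATING)
		return false;
	ip->flags &= ~SKETCH_NEEDS_INACTIVE;
	return true;
}
```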
Unfortunately, deferring the inactivation has one huge downside --
eventual consistency. Since all the freeing is deferred to a worker
thread, one can rm a file but the space doesn't come back immediately.
This can cause some odd side effects with quota accounting and statfs,
so we also force inactivation scans in order to maintain the existing
behaviors, at least outwardly.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
|
|
Create an alternative version of xfs_ici_walk() that allows a caller to
pass in custom inode grab and inode release helper functions. Deferred
inode inactivation deals with xfs inodes that are still in memory but no
longer visible to the vfs, which means that it has to screen and process
those inodes differently.
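As a rough illustration of a walk that takes caller-supplied grab and
release helpers, here is a sketch over a plain array. All names
(sketch_walk_ops, sketch_walk_custom, the sketch_inactive_* helpers) are
hypothetical; the real walk iterates the per-AG radix tree and takes the
appropriate locks.

```c
#include <stdbool.h>
#include <stddef.h>

struct sketch_inode {
	bool	vfs_visible;	/* still reachable through the vfs */
	bool	needs_inactive;	/* waiting for deferred inactivation */
	int	refs;
	int	processed;
};

/* Caller-supplied helpers that decide how inodes are screened and released. */
struct sketch_walk_ops {
	bool	(*grab)(struct sketch_inode *ip);
	int	(*execute)(struct sketch_inode *ip);
	void	(*release)(struct sketch_inode *ip);
};

/* Walk every inode, running execute only on those the grab helper accepts. */
int sketch_walk_custom(struct sketch_inode *inodes, size_t nr,
		       const struct sketch_walk_ops *ops)
{
	size_t	i;
	int	error;

	for (i = 0; i < nr; i++) {
		if (!ops->grab(&inodes[i]))
			continue;
		error = ops->execute(&inodes[i]);
		ops->release(&inodes[i]);
		if (error)
			return error;
	}
	return 0;
}

/* Inactivation screens for inodes the vfs no longer sees. */
bool sketch_inactive_grab(struct sketch_inode *ip)
{
	if (ip->vfs_visible || !ip->needs_inactive)
		return false;
	ip->refs++;
	return true;
}

int sketch_inactive_execute(struct sketch_inode *ip)
{
	ip->needs_inactive = false;
	ip->processed++;
	return 0;
}

void sketch_inactive_release(struct sketch_inode *ip)
{
	ip->refs--;
}

const struct sketch_walk_ops sketch_inactive_ops = {
	.grab		= sketch_inactive_grab,
	.execute	= sketch_inactive_execute,
	.release	= sketch_inactive_release,
};
```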
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
|
|
Pass the per-AG structure to the xfs_ici_walk execute function. This
isn't needed now, but deferred inactivation will need it to modify some
per-ag data.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
|
|
Set up quota counters to track the number of inodes and blocks that will
be freed from inactivating unlinked inodes. We'll use this in the
deferred inactivation patch to hide the effects of deferred processing.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
|
|
Set up counters to track the number of inodes and blocks that will be
freed from inactivating unlinked inodes. We'll use this in the deferred
inactivation patch to hide the effects of deferred processing.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
|
|
Add a predicate function to decide if an inode needs (deferred)
inactivation. Any file that has been unlinked or has speculative
preallocations either for post-EOF writes or for CoW qualifies.
This function will also be used by the upcoming deferred inactivation
patch.
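The rule stated above could be expressed roughly as follows. The struct
fields and the function name are hypothetical simplifications; the real
predicate inspects the XFS inode and its forks.

```c
#include <stdbool.h>

/* Hypothetical, simplified view of the inode state the predicate needs. */
struct sketch_inode {
	unsigned int	nlink;			/* 0 means unlinked */
	bool		posteof_prealloc;	/* speculative post-EOF blocks */
	bool		cow_prealloc;		/* speculative CoW staging blocks */
};

/* True if tearing down this inode requires (deferred) inactivation work. */
bool sketch_inode_needs_inactive(const struct sketch_inode *ip)
{
	if (ip->nlink == 0)
		return true;
	return ip->posteof_prealloc || ip->cow_prealloc;
}
```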
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
|
|
Split the block preallocation garbage collection work into per-AG work
items so that we can take advantage of parallelization.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
|
|
Shorten the names of the two functions that start and stop block
preallocation garbage collection and move them up to the other blockgc
functions.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
|
|
Perform background block preallocation gc scans more efficiently by
walking the incore inode tree once.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
|
|
Remove the separate cowblocks work items and knob so that we can control
and run everything from a single blockgc work queue.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
|
|
The clearing of posteof blocks and cowblocks serves the same purpose:
removing speculative block preallocations from inactive files. We don't
need to burn two radix tree tags on this, so combine them into one.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
|
|
Refactor the part of _free_eofblocks that decides if it's really going
to truncate post-EOF blocks into a separate helper function. The
upcoming deferred inode inactivation patch requires us to be able to
decide this prior to actual inactivation. No functionality changes.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
|
|
If a fs modification (creation, file write, reflink, etc.) is unable to
reserve enough space to handle the modification, try clearing whatever
space the filesystem might have been hanging onto in the hopes of
speeding up the filesystem. The flushing behavior will become
particularly important when we add deferred inode inactivation because
that will increase the amount of space that isn't actively tied to user
data.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
|
|
If a fs modification (creation, file write, reflink, etc.) is unable to
reserve enough quota to handle the modification, try clearing whatever
space the filesystem might have been hanging onto in the hopes of
speeding up the filesystem. The flushing behavior will become
particularly important when we add deferred inode inactivation because
that will increase the amount of space that isn't actively tied to user
data.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
|
|
Buffered writers who have run out of quota reservation call
xfs_inode_free_quota_blocks to try to free any space reservations that
might reduce the quota usage. Unfortunately, the buffered write path
treats "out of project quota" the same as "out of overall space" so this
function has never supported scanning for space that might ease an "out
of project quota" condition.
We're about to start using this function for cases where we actually
/can/ tell if we're out of project quota, so add in this functionality.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
|
|
Move the inode dirty data flushing to a workqueue so that multiple
threads can take advantage of a single thread's flush work.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
|
|
Don't stall the cowblocks scan on a locked inode if we can possibly avoid it.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
|
|
The functions to run an eof/cowblocks scan to try to reduce quota usage
are kind of a mess -- the logic repeatedly initializes an eofb structure
and there are logic bugs in the code that result in the cowblocks scan
never actually happening.
Replace all three functions with a single function that fills out an
eofb if we're low on quota and runs both eof and cowblocks scans.
Fixes: 83104d449e8c4 ("xfs: garbage collect old cowextsz reservations")
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
|
|
Hide the incore inode walk interface because callers outside of the
icache code don't need to know about iter_flags and radix tags and other
implementation details of the incore inode cache.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
|
|
Move the xfs_inode_ag_iterator function to be nearer xfs_inode_ag_walk
so that we don't have to scroll back and forth to figure out how the
incore inode walking function works. No functional changes.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
|
|
This is a boolean variable, so use the bool type.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
|
|
There are a number of predicate functions that help the incore inode
walking code decide if we really want to apply the iteration function to
the inode. These are boolean decisions, so change the return types to
boolean to match.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
|
|
Refactor the two copies of the eofb-matching logic into a single helper so
that we don't repeat ourselves.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
|
|
This is now a pointless wrapper, so kill it.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
|
|
The incore inode walk code passes a flags argument and a pointer from
the xfs_inode_ag_iterator caller all the way to the iteration function.
We can reduce the function complexity by passing flags through the
private pointer.
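A sketch of the idea, assuming a walk over a plain array: the flags ride
inside the structure that the private pointer refers to, so the walk
function itself takes one fewer argument. struct sketch_walk_args,
SKETCH_WALK_SKIP_ODD, and the function names are hypothetical.

```c
struct sketch_inode {
	unsigned long	ino;
};

#define SKETCH_WALK_SKIP_ODD	(1u << 0)	/* hypothetical walk flag */

/* Everything the iteration function needs, including the flags. */
struct sketch_walk_args {
	unsigned int	flags;
	unsigned int	visited;
};

typedef int (*sketch_walk_fn)(struct sketch_inode *ip, void *priv);

/* The walk passes only the opaque pointer; it never looks at the flags. */
int sketch_walk(struct sketch_inode *inodes, unsigned int nr,
		sketch_walk_fn fn, void *priv)
{
	unsigned int	i;
	int		error;

	for (i = 0; i < nr; i++) {
		error = fn(&inodes[i], priv);
		if (error)
			return error;
	}
	return 0;
}

/* Iteration function: unpack the flags from the private pointer. */
int sketch_count_fn(struct sketch_inode *ip, void *priv)
{
	struct sketch_walk_args	*args = priv;

	if ((args->flags & SKETCH_WALK_SKIP_ODD) && (ip->ino & 1))
		return 0;
	args->visited++;
	return 0;
}
```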
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
|
|
Combine xfs_inode_ag_iterator_flags and xfs_inode_ag_iterator_tag into a
single wrapper function since there's only one caller of the _flags
variant.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
|
|
Not used by anyone, so get rid of it.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
|
|
Use XFS_ICI_NO_TAG instead of -1 when appropriate.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
|
|
Move xfs_fs_eofblocks_from_user into the only file that actually uses
it, so that we don't have this function cluttering up the header file.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
|
|
Whenever we encounter XFS_CORRUPT_ON failures, we should report that to
the health monitoring system for later reporting.
I started with this and massaged everything until it built:
@@
expression mp, test;
@@
- if (XFS_CORRUPT_ON(mp, test)) return -EFSCORRUPTED;
+ if (XFS_CORRUPT_ON(mp, test)) { xfs_btree_mark_sick(cur); return -EFSCORRUPTED; }
@@
expression mp, test;
identifier label, error;
@@
- if (XFS_CORRUPT_ON(mp, test)) { error = -EFSCORRUPTED; goto label; }
+ if (XFS_CORRUPT_ON(mp, test)) { xfs_btree_mark_sick(cur); error = -EFSCORRUPTED; goto label; }
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
|
|
Whenever we encounter corrupt realtime metadata blocks, we should report
that to the health monitoring system for later reporting.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
|
|
Whenever we encounter corrupt quota blocks, we should report that to the
health monitoring system for later reporting.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
|
|
Whenever we encounter corrupt inode records, we should report that to
the health monitoring system for later reporting.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
|
|
Whenever we encounter corrupt symbolic link blocks, we should report
that to the health monitoring system for later reporting.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
|
|
Whenever we encounter corrupt directory or extended attribute blocks, we
should report that to the health monitoring system for later reporting.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
|
|
Whenever we encounter corrupt btree blocks, we should report that to the
health monitoring system for later reporting.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
|
|
Whenever we encounter a corrupt block mapping, we should report that to
the health monitoring system for later reporting.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
|
|
Whenever we encounter a corrupt AG header, we should report that to the
health monitoring system for later reporting.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
|
|
Split the setting of the sick and checked masks into separate functions
as part of preparing to add the ability for regular runtime fs code
(i.e. not scrub) to mark metadata structures sick when corruptions are
found. Improve the documentation of libxfs' requirements for helper
behavior.
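One way to picture the split, as a sketch only: runtime code that trips
over corruption records sickness without claiming the structure was fully
checked, while a checker records what it examined. The struct, the mask
bits, and the three helper names below are hypothetical and are not the
libxfs helpers themselves.

```c
/* Hypothetical per-object health state: two independent bitmasks. */
struct sketch_health {
	unsigned int	sick;		/* structures known to be bad */
	unsigned int	checked;	/* structures that were examined */
};

#define SKETCH_SICK_AGF	(1u << 0)
#define SKETCH_SICK_AGI	(1u << 1)

/* Runtime code found corruption: set sick bits, leave checked alone. */
void sketch_mark_sick(struct sketch_health *h, unsigned int mask)
{
	h->sick |= mask;
}

/* A checker examined these structures: set checked bits only. */
void sketch_mark_checked(struct sketch_health *h, unsigned int mask)
{
	h->checked |= mask;
}

/* A checker found these structures clean: clear sick, record the check. */
void sketch_mark_healthy(struct sketch_health *h, unsigned int mask)
{
	h->sick &= ~mask;
	h->checked |= mask;
}
```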
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
|
|
Fix anything that causes the quota verifiers to fail.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
|
|
If a directory looks like it's in bad shape, try to sift through the
rubble to find whatever directory entries we can, zap the old tree, and
re-add the entries.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
|
|
If an attr block indicates that it could use compaction, set the preen
flag to have the attr fork rebuilt, since the attr fork rebuilder can
take care of that for us.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
|
|
If the extended attributes look bad, try to sift through the rubble to
find whatever keys/values we can, zap the attr tree, and re-add the
values.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
|
|
Create a new helper to unmap blocks from an inode's fork.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
|
|
Remove the transaction roll at the end of the loop in
xfs_itruncate_extents_flags. xfs_defer_finish takes care of rolling the
transaction as needed and reattaching the inode, which means we already
start each loop with a clean transaction.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
|
|
Create a simple 'blob array' data structure for storage of arbitrarily
sized metadata objects that will be used to reconstruct metadata. For
the intended usage (temporarily storing extended attribute names and
values) we only have to support storing objects and retrieving them.
Use the xfile abstraction to store the attribute information in memory
that can be swapped out.
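A store-and-retrieve blob array can be sketched as an append-only buffer
where each object is prefixed by its length and identified by a cookie.
This sketch assumes a malloc-backed buffer in place of the xfile, and the
sketch_blob_* names and the cookie type are hypothetical. It also assumes
sizes small enough that the length arithmetic cannot overflow.

```c
#include <stdlib.h>
#include <string.h>

/* Append-only store; the cookie is the byte offset of a blob's header. */
struct sketch_blobs {
	unsigned char	*buf;
	size_t		used;
	size_t		cap;
};

/* Store len bytes and return a cookie for later retrieval. 0 on success. */
int sketch_blob_store(struct sketch_blobs *b, const void *data, size_t len,
		      size_t *cookie)
{
	size_t	need = b->used + sizeof(len) + len;

	if (need > b->cap) {
		size_t		ncap = b->cap ? b->cap : 64;
		unsigned char	*nbuf;

		while (ncap < need)
			ncap *= 2;
		nbuf = realloc(b->buf, ncap);
		if (!nbuf)
			return -1;
		b->buf = nbuf;
		b->cap = ncap;
	}

	*cookie = b->used;
	memcpy(b->buf + b->used, &len, sizeof(len));
	memcpy(b->buf + b->used + sizeof(len), data, len);
	b->used = need;
	return 0;
}

/* Copy a blob back out; fails if the cookie or expected size is wrong. */
int sketch_blob_get(const struct sketch_blobs *b, size_t cookie, void *out,
		    size_t len)
{
	size_t	stored;

	if (cookie > b->used || b->used - cookie < sizeof(stored))
		return -1;
	memcpy(&stored, b->buf + cookie, sizeof(stored));
	if (stored != len || b->used - cookie - sizeof(stored) < stored)
		return -1;
	memcpy(out, b->buf + cookie + sizeof(stored), len);
	return 0;
}

/* Release the backing memory and reset the store. */
void sketch_blob_destroy(struct sketch_blobs *b)
{
	free(b->buf);
	memset(b, 0, sizeof(*b));
}
```

Because nothing is ever deleted or resized in place, a cookie stays valid
for the lifetime of the store, which is enough for the store-then-replay
usage described above.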
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
|