Age | Commit message | Author |
|
REQ_FUA means "skip the drive cache", and it can be used with reads too.
If there was a checksum error, we want to retry the whole read path, not
read it from cache again.
Suggested-by: Benjamin LaHaise <bcrl@kvack.org>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
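
The retry policy described above can be sketched as follows. This is a minimal illustration, not the bcachefs read path: the `SKETCH_REQ_*` constants are hypothetical stand-ins for the kernel's request flags, and `sketch_read_flags()` is an invented helper name.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Hypothetical stand-ins for block-layer request flags; the real
 * REQ_FUA is defined by the kernel, not here.
 */
enum {
	SKETCH_REQ_READ = 1u << 0,
	SKETCH_REQ_FUA  = 1u << 1,
};

/*
 * Choose flags for a read. The first attempt is a plain read; a retry
 * after a checksum error adds FUA so the data is not served from the
 * drive cache again.
 */
static unsigned sketch_read_flags(bool retry_after_csum_err)
{
	unsigned flags = SKETCH_REQ_READ;

	if (retry_after_csum_err)
		flags |= SKETCH_REQ_FUA;

	assert(flags & SKETCH_REQ_READ);
	return flags;
}
```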
|
|
Add a sysfs attribute for checking whether read fua appears to behave
properly on a device.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
This adds shrinker.to_text() methods for our shrinkers and hooks them up
to our existing to_text() functions.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
Prep work for scrub rework: repair will use the original data_update.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
Prep work for making scrub error correction use the move.c pipeline.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
Prep work for rebalance handling replicas changes.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
Disk reservations don't guarantee that a specific device won't be full.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
Make the io_read_nopromote tracepoint a bit better by giving it an error
code for this case.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
Make the error injection tests a bit better: by limiting error injection
to a single device, userspace should never see an IO error.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
It's not supported in userspace - we don't pull in idr yet.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
Specifying them at mount time is problematic for multiple reasons:
options specified at mount time aren't persistent, and IO path option
changes must create rebalance scan cookies - otherwise we'll flag IO
path option inconsistency in check_rebalance_work.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
With closure_put() now using cmpxchg, this is no longer needed.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
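
For readers unfamiliar with the pattern, a compare-and-exchange based "put" can be sketched with C11 atomics as below. This is not the kernel's closure_put(); `struct sketch_ref` and `sketch_put()` are hypothetical names, and the sketch only shows the general technique: the new value is computed from an observed old value and installed only if nothing changed in between, so the "last reference" decision is made on the value that was actually replaced.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical refcounted object; not the kernel's struct closure. */
struct sketch_ref {
	atomic_uint remaining;
};

/* Drop one reference; returns true if the caller dropped the last one. */
static bool sketch_put(struct sketch_ref *r)
{
	unsigned old = atomic_load(&r->remaining);
	unsigned new;

	do {
		assert(old > 0);	/* putting a ref we don't hold is a bug */
		new = old - 1;
		/* on failure, 'old' is reloaded with the current value */
	} while (!atomic_compare_exchange_weak(&r->remaining, &old, new));

	return new == 0;
}
```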
|
|
Avoid transaction restarts from upgrading locks; we usually need to
update here.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
It should only be kept around if we have inode options - we can't look
up the inode once it's in the reflink btree.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
Drop bch_extent_rebalance when an extent no longer needs to be erasure
coded.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
Required for propagating new filesystem options to indirect extents.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
Avoid races with the new consistency checking for rebalance opts: a scan
cookie must exist when changing the options to avoid false positives on
inconsistency checks, and we need another scan cookie after setting the
options to avoid racing with a scan starting before seeing the new
options.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
Collapse bch2_bkey_sectors_need_rebalance() and
bch2_bkey_ptrs_need_rebalance() down to a single function, which outputs
both the bitmap of pointers that need to be rebalanced and the number of
sectors that need to be processed.
This will enable adding other reasons an extent might need to be
processed by rebalance: changing the checksum type,
increasing/decreasing replication level, enabling/disabling erasure
coding, etc.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
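
The shape of such a combined function can be sketched as below. The types and names (`struct sketch_ptr`, `sketch_need_rebalance()`) are hypothetical and much simpler than the real extent pointer code; the point is only that one pass yields both the pointer bitmap and the sector count.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified view of one pointer in an extent. */
struct sketch_ptr {
	bool     needs_rebalance;	/* assumed precomputed for the sketch */
	unsigned sectors;		/* sectors this pointer occupies on disk */
};

/*
 * Return a bitmap of pointers needing rebalance and store the total
 * number of sectors to process in *sectors, in a single pass.
 */
static unsigned sketch_need_rebalance(const struct sketch_ptr *ptrs,
				      size_t nr, unsigned *sectors)
{
	unsigned mask = 0;

	assert(nr <= 8 * sizeof(mask));
	*sectors = 0;

	for (size_t i = 0; i < nr; i++)
		if (ptrs[i].needs_rebalance) {
			mask |= 1u << i;
			*sectors += ptrs[i].sectors;
		}

	return mask;
}
```

A new reason for rebalancing (checksum type, replication level, erasure coding) would then only need to feed into the per-pointer decision, not into two parallel functions.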
|
|
Factor out a small helper.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
IO path options set from the inode should override existing indirect
extent options, if REFLINK_P_MAY_UPDATE_OPTS is set.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
Signed-off-by: Nikita Ofitserov <himikof@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
Signed-off-by: Nikita Ofitserov <himikof@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
Also, consider the metadata_replicas option when better
accounting is not available.
Signed-off-by: Nikita Ofitserov <himikof@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
Introduce btree node number accounting for better progress reporting.
This change includes a mandatory upgrade/downgrade.
Add 2 new counters for BCH_DISK_ACCOUNTING_btree: total number of btree
nodes (ignoring replication) and the number of non-leaf btree nodes
(likewise). Those are to be used by recovery progress reporting instead
of estimating them.
Signed-off-by: Nikita Ofitserov <himikof@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
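
How exact counters might feed a progress report can be sketched as below. The struct and function names are hypothetical and do not reflect the real accounting layout; the sketch assumes only the two counters described above (total nodes and non-leaf nodes).

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical pair of counters, ignoring replication as described above. */
struct sketch_btree_counts {
	uint64_t nodes_total;		/* all btree nodes */
	uint64_t nodes_interior;	/* non-leaf btree nodes */
};

/* Leaf nodes follow directly from the two counters. */
static uint64_t sketch_leaf_nodes(struct sketch_btree_counts c)
{
	assert(c.nodes_interior <= c.nodes_total);
	return c.nodes_total - c.nodes_interior;
}

/* Percentage complete for a pass that visits nodes_to_visit nodes. */
static unsigned sketch_progress_pct(uint64_t nodes_seen, uint64_t nodes_to_visit)
{
	if (!nodes_to_visit)
		return 100;
	if (nodes_seen > nodes_to_visit)
		nodes_seen = nodes_to_visit;
	return (unsigned)(nodes_seen * 100 / nodes_to_visit);
}
```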
|
|
This prevents mounting of unresized image files from failing.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
It turns out we don't want to use the new fast device removal path -
which walks backpointers on a device - on old filesystems that didn't
have backpointers for cached pointers; they might still have stale
pointers.
Add a compat feature bit that indicates we know a filesystem has no
stale cached pointers, and have check_extents/check_indirect_extents
delete any stale cached pointers so we can set it after a successful
fsck.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
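
The gating described above can be sketched as follows. The bit name, superblock struct, and helpers are hypothetical; the sketch assumes only what the message states: the fast removal path is allowed once the bit is set, and the bit is set after a successful fsck.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical compat bit; the real bit name and position are not shown here. */
#define SKETCH_COMPAT_NO_STALE_CACHED_PTRS	(UINT64_C(1) << 0)

/* Hypothetical stand-in for the superblock's compat feature field. */
struct sketch_sb {
	uint64_t compat;
};

/* The backpointer-walking removal path is only safe once the bit is set. */
static bool sketch_can_use_fast_removal(const struct sketch_sb *sb)
{
	assert(sb);
	return (sb->compat & SKETCH_COMPAT_NO_STALE_CACHED_PTRS) != 0;
}

/* After fsck has deleted stale cached pointers, record that fact. */
static void sketch_fsck_done(struct sketch_sb *sb, int ret)
{
	assert(sb);
	if (!ret)
		sb->compat |= SKETCH_COMPAT_NO_STALE_CACHED_PTRS;
}
```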
|
|
We no longer have stale cached pointers on new filesystems - cached
pointers now have backpointers, for multiple reasons - so we can kill
most of this code.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
Dead code
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
Accidentally committed in a78a11900ecbb.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
We can easily print paths now.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
This was noticed as spurious 'VFS incorrectly tried to delete inode'
errors, where we were leaking a transaction restart error (and failing
to call delete_ancestor_snapshot_inodes()).
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
Fix an occasional cause of srcu stall warnings - on very heavily loaded
systems, it helps to move work outside of the transaction (and is good
to do for performance regardless).
Also add some counters, so we can observe success/fail.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
Throttle background writeback based on what the allocator is doing.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
Don't run ourselves out of free space while shutting down.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
Our inodes are bigger than they used to be - the inode backpointer
fields shouldn't have been varint fields, but they came late.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|