summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2022-05-30bcachefs: Always print when doing journal replay in fsckKent Overstreet
This logging improvement helps see when the previous fsck pass has completed. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2022-05-30bcachefs: Rename group to label for remaining strings.Daniel Hill
Signed-off-by: Daniel Hill <daniel@gluo.nz>
2022-05-30bcachefs: Fix encryption path on armKent Overstreet
flush_dcache_page() is not a noop on arm, but we were using virt_to_page() instead of vmalloc_to_page() for an address on the kernel stack - vmalloc memory, leading to an oops in flush_dcache_page(). Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2022-05-30bcachefs: Switch to key_type_user, not logonKent Overstreet
The only difference key_type_logon and key_type_user is that key_type_logon keys can't be read by userspace. However, userspace has actually been adding keys to both the logon and user keychains, because userspace fsck requires the keychain interface - so we might as well just use user and drop the logon keychain. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2022-05-30bcachefs: LRU repair tweaksKent Overstreet
- Drop old unneeded parameter for whether we're in initial GC - which was from when btree updates had to be done differently before we went RW. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2022-05-30bcachefs: Delete bch_writepageKent Overstreet
Per Dave Chinner and the xfs folks, .writepage is no longer needed, and it's better not to define it if .writepages is the intended path. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2022-05-30bcachefs: Make bch_option compatible with Rust ffiBrett Holman
Rust FFI lacks support for unnamed structs and unions. The space saved in bch_option is not enough to be significant. Signed-off-by: Brett Holman <bholman.devel@gmail.com>
2022-05-30bcachefs: Put btree_trans_verify_sorted() behind debug_check_iteratorsKent Overstreet
This is pretty expensive, and we've tested sufficiently with it now that it doesn't need to be on by default. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2022-05-30bcachefs: Fix extent mergingKent Overstreet
When merging extents, we have to check that we won't overflow size fields in any CRC entries - but the check for this was wrong, because in the loop it was in we weren't keeping a pointer to the (packed, encoded) CRC field. Fix this by moving it to its own loop. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2022-05-30bcachefs: Improve invalid bkey error messageKent Overstreet
Bkeys have gotten a lot bigger since this code was written and now are often formatted across multiple lines - while the reason a bkey is invalid will still be short and fit on a single line. This patch prints the error bfore the bkey, making it a bit more readable. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2022-05-30bcachefs: Fix journal_iters_fix()Kent Overstreet
journal_iters_fix() was incorrectly rewinding iterators past keys they had already returned, leading to those keys being double counted in the bch2_gc() path - oops. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2022-05-30bcachefs: Go RW before bch2_check_lrus()Kent Overstreet
btree updates before going RW are expensive if they're in random order, since they use the list of keys for journal replay to insert, which is just a gap buffer. This patch improves the bucket invalidate path so that if bch2_check_lrus() hasn't finished it only prints warnings instead of doing an emergency shutdown, which means we can now set BCH_FS_MAY_GO_RW before bch2_check_lrus(). Also, the filesystem state bits are reorganized a bit. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2022-05-30bcachefs: Add persistent countersDaniel Hill
This adds a new superblock field for persisting counters and adds a sysfs interface in counters/ exposing these counters. The superblock field is ignored by older versions letting us avoid an on disk version bump. Each sysfs file outputs a counter that tracks since filesystem creation and a counter for the current mount session. Signed-off-by: Daniel Hill <daniel@gluo.nz>
2022-05-30bcachefs: Tracepoint improvementsKent Overstreet
Delete some obsolete tracepoints, organize alloc tracepoints better, make a few tracepoints more consistent. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2022-05-30bcachefs: Don't kick journal reclaim unless low on spaceKent Overstreet
We shouldn't kick journal reclaim unnecessarily, it's got its own timer for that. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2022-05-30bcachefs: Lock ordering fixKent Overstreet
Can't take btree node locks while holding btree_reserve_cache_lock - it would be nice if we could check this with lockdep. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2022-05-30bcachefs: Shutdown path improvementsKent Overstreet
We're seeing occasional firings of the assertion in the key cache shutdown code that nr_dirty == 0, which means we must sometimes be doing transaction commits after we've gone read only. Cleanups & changes: - BCH_FS_ALLOC_CLEAN renamed to BCH_FS_CLEAN_SHUTDOWN - new helper bch2_btree_interior_updates_flush(), which returns true if it had to wait - bch2_btree_flush_writes() now also returns true if there were btree writes in flight - __bch2_fs_read_only now checks if btree writes were in flight in the shutdown loop: btree write completion does a transaction update, to update the pointer in the parent node - assert that !BCH_FS_CLEAN_SHUTDOWN in __bch2_trans_commit Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2022-05-30bcachefs: Fix hash_check_key()Kent Overstreet
hash_check_key() was incorrectly handling transaction restarts - switch it to for_each_btree_key_norestart(). Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2022-05-30bcachefs: Allocate some extra room in btree_key_cache_fill()Kent Overstreet
If we allocate a buffer that's a bit bigger than necessary the transaction commit path will be much less likely to have to reallocate - which requires a transaction restart. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2022-05-30bcachefs: bch2_btree_iter_peek_all_levels()Kent Overstreet
This adds bch2_btree_iter_peek_all_levels(), which returns keys from every level of the btree - interior nodes included - in monotonically increasing order, soon to be used by the backpointers check & repair code. - BTREE_ITER_ALL_LEVELS can now be passed to for_each_btree_key() to iterate thusly, much like BTREE_ITER_SLOTS - The existing algorithm in bch2_btree_iter_advance() doesn't work with peek_all_levels(): we have to defer the actual advancing until the next time we call peek, where we have the btree path traversed and uptodate. So, we add an advanced bit to btree_iter; when BTREE_ITER_ALL_LEVELS is set bch2_btree_iter_advanced() just marks the iterator as advanced. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2022-05-30bcachefs: btree_path_set_level_(up|down)Kent Overstreet
This adds two new helpers to btree_iter.c for changing the level of a path up or down - to be used by the new bch2_btree_iter_peek_all_levels(). Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2022-05-30bcachefs: bch2_btree_iter_peek_slot() now works on interior nodesKent Overstreet
The new backpointers code will be using bch2_btree_iter_peek_slot() on interior nodes - this patch updates peek_slot() to make that work. - Pass the correct level to bch2_journal_keys_peek_slot() - We should only set BTREE_ITER_CACHED or BTREE_ITER_WITH_KEY_CACHE when using bch2_trans_iter_init(), not bch2_trans_node_iter_init() - Update assertions Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2022-05-30bcachefs: btree_update_interior.c prep for backpointersKent Overstreet
Previously, btree_update_interior.c passed keys to bch2_trans_mark_* that hadn't been fully initialized - they didn't have the key field filled out, just the value. With backpointers, we need to make sure keys are fully initialized before marking them - because the backpointer points back to the original key. This patch tweaks the interior update paths to fix this. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2022-05-30bcachefs: Plumb btree_id & level to trans_markKent Overstreet
For backpointers, we'll need the full key location - that means btree_id and btree level. This patch plumbs it through. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2022-05-30bcachefs: Improve some fsck error messagesKent Overstreet
We have string names for d_type; use it. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2022-05-30bcachefs: Go emergency RO when i_blocks underflowsKent Overstreet
This improves some of our warnings and assertions - they imply possible filesystem inconsistencies, so they should be calling bch2_fs_inconsistent(). Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2022-05-30bcachefs: Ensure sysfs show fns print a newlineKent Overstreet
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2022-05-30bcachefs: Improve try_alloc_bucket()Kent Overstreet
This patch primarily improves the error reporting in try_alloc_bucket(): we now print out the actual freespace extent we were allocating from, which makes it easier to grep through the logs when something goes wrong. Also: - switch to for_each_btree_key_norestart(); bch2_bucket_alloc() already handles transaction restarts, and in general transaction restarts should only be handled in one place - switch to < instead of != when looping over buckets represented by a freespace extent Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2022-05-30bcachefs: Kill old rebuild_replicas optionKent Overstreet
This option was useful when the replicas mechism was new and still being debugged, but hasn't been used in ages - let's delete it. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2022-05-30bcachefs: In fsck, pass BTREE_UPDATE_INTERNAL_SNAPSHOT_NODE when deleting ↵Kent Overstreet
dirents A user reported an error where we hit an assertion due to deleting a key in an internal snapshot node, when deleting a dirent that points to a nonexisting inode. We try to avoid doing updates to keys for internal snapshot nodes, but upon inspection of the places where we remove dirents in fsck it appears BTREE_UPDATE_INTERNAL_SNAPSHOT_NODE is correct for all of them: either the target dirent doesn't exist, or it's a directory with multiple dirents pointing to it. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2022-05-30bcachefs: Fix for getting stuck in journal replayKent Overstreet
In journal replay, we weren't immediately dropping journal pins when we start doing updates that ewern't from journal replay - leading to journal reclaim getting stuck. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2022-05-30bcachefs: Delete a redundant tracepointKent Overstreet
Now that the bucket_alloc_fail tracepoint includes the error code, the open_bucket_alloc_fail tracepoint is redundant. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2022-05-30bcachefs: Improve error logging in fsck.cKent Overstreet
This adds error logging to a bunch of functions in fsck.c - in fsck, reduntant error messages is probably better than not enough. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2022-05-30bcachefs: Fix inode_backpointer_exists()Kent Overstreet
If the dirent an inode points to doesn't exist, we shouldn't be returning an error - just 0/false. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2022-05-30bcachefs: Improve bch2_lru_delete() error messagesKent Overstreet
When we detect a filesystem inconsistency, we should include the relevent keys in the error message. This patch adds a parameter to pass the key with the lru entry to bch2_lru_delete(), so that it can be printed. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2022-05-30bcachefs: Introduce bch2_journal_keys_peek_(upto|slot)()Kent Overstreet
When many journal replay keys have been overwritten, bch2_journal_keys_peek() was taking excessively long to scan before it found a key to return. Fix this by introducing bch2_journal_keys_peek_upto() which takes a parameter for the end of the range we want, so that we can terminate the search much sooner, and replace all uses of bch2_journal_keys_peek() with peek_upto() or peek_slot(). Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2022-05-30bcachefs: Improve error message when alloc key doesn't match lru entryKent Overstreet
Error messages should always print out the full key when available - this gives us a starting point when looking through the journal to debug what went wrong. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2022-05-30bcachefs: Ensure buckets have io_time[READ] setKent Overstreet
It's an error if a bucket is in state BCH_DATA_cached but not on the LRU btree - i.e io_time[READ] == 0 - so, make sure it's set before adding it. Also, make some of the LRU code a bit clearer and more direct. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2022-05-30bcachefs: Use bch2_trans_inconsistent_on() in more placesKent Overstreet
This gets us better error messages. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2022-05-30bcachefs: Improve bch2_open_buckets_to_text()Kent Overstreet
This patch updates bch2_open_buckets_to_text() to include the device and bucket the open_bucket owns. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2022-05-30bcachefs: Fix CPU usage in journal read pathKent Overstreet
In journal_entry_add(), we were repeatedly scanning the journal entries radix tree to scan for old entries that can be freed, with O(n^2) behaviour. This patch tweaks things to remember the previous last_seq, so we don't have to scan for entries to free from the start. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2022-05-30bcachefs: Fix a null ptr derefKent Overstreet
We start doing allocations before the GC thread is created, which means we need to check for that to avoid a null ptr deref. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2022-05-30bcachefs: Don't trigger extra assertions in journal replayKent Overstreet
We now pass a rw argument to .key_invalid methods so they can trigger assertions for updates but not on existing keys. We shouldn't trigger these extra assertions in journal replay - this patch changes the transaction commit path accordingly. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2022-05-30bcachefs: Minor device removal fixesKent Overstreet
- We weren't clearing the LRU btree - bch2_alloc_read() runs before bch2_check_alloc_key() deletes alloc keys for devices/buckets that don't exists, so it needs to check for that - bch2_check_lrus() needs to check that buckets exists - improve some error messages Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2022-05-30bcachefs: Fix a few warnings on 32 bitKent Overstreet
These showed up when building for mips. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2022-05-30bcachefs: bch2_btree_delete_extent_at()Kent Overstreet
New helper, for deleting extents. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2022-05-30bcachefs: Don't skip triggers in fcollapse()Kent Overstreet
With backpointers this doesn't work anymore - backpointers always need to be updated to point to the new extent position. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2022-05-30bcachefs: Initialize ec work structs earlyKent Overstreet
We need to ensure that work structs in bch_fs always get initialized - otherwise an error in filesystem initialization can pop a warning in the workqueue code when we try to cancel a work struct that wasn't initialized. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2022-05-30bcachefs: Use a genradix for reading journal entriesKent Overstreet
Previously, the journal read path used a linked list for storing the journal entries we read from disk. But there's been a bug that's been causing journal_flush_delay to incorrectly be set to 0, leading to far more journal entries than is normal being written out, which then means filesystems are no longer able to start due to the O(n^2) behaviour of inserting into/searching that linked list. Fix this by switching to a radix tree. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2022-05-30bcachefs: Refactor journal_keys_sort() to return an error codeKent Overstreet
When there weren't any keys in the journal there's no need to allocate the buffer - but doing that causes a spurious -ENOMEM. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>