summaryrefslogtreecommitdiff
path: root/fs
AgeCommit message (Collapse)Author
2024-05-08bcachefs: bucket_valid()Kent Overstreet
cut out a branch from doing it the obvious way Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-05-08bcachefs: bch2_trans_relock_fail() - factor out slowpathKent Overstreet
Factor out slowpath into a separate helper Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-05-08bcachefs: bch2_dir_emit() - drop_locks_do() conversionKent Overstreet
Add a new helper that calls dir_emit() and updates ctx->pos on success; this lets us convert bch2_readdir() to drop_locks_do(). Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-05-08bcachefs: bch2_btree_insert_trans() no longer specifies BTREE_ITER_cachedKent Overstreet
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-05-08bcachefs: iter/update/trigger/str_hash flag cleanupKent Overstreet
Combine iter/update/trigger/str_hash flags into a single enum, and x-macroize them for a to_text() function later. These flags are all for a specific iter/key/update context, so it makes sense to group them together - iter/update/trigger flags were already given distinct bits, this cleans up and unifies that handling. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-05-08bcachefs: __BTREE_ITER_ALL_SNAPSHOTS -> BTREE_ITER_SNAPSHOT_FIELDKent Overstreet
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-05-08bcachefs: mark_superblock cleanupKent Overstreet
Consolidate mark_superblock() and trans_mark_superblock(), like we did with the other trigger paths. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-05-08bcachefs: gc_btree_init_recurse() uses gc_mark_node()Kent Overstreet
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-05-08bcachefs: move root node topo checks to node_check_topology()Kent Overstreet
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-05-08bcachefs: move topology repair kick to gc_btrees()Kent Overstreet
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-05-08bcachefs: kill metadata only gcKent Overstreet
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-05-08bcachefs: Finish converting reconstruct_alloc to errors_silentKent Overstreet
with errors_silent, reconstruct_alloc no longer requires fsck and fix_errors to work Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-05-08bcachefs: bch2_gc() is now private to btree_gc.cKent Overstreet
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-05-08bcachefs: for_each_btree_key_continue()Kent Overstreet
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-05-08bcachefs: kill for_each_btree_key_old()Kent Overstreet
Dead code Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-05-08bcachefs: Optimize eytzinger0_sort() with bottom-up heapsortKuan-Wei Chiu
This optimization reduces the average number of comparisons required from 2*n*log2(n) - 3*n + o(n) to n*log2(n) + 0.37*n + o(n). When n is sufficiently large, it results in approximately 50% fewer comparisons. Currently, eytzinger0_sort employs the textbook version of heapsort, where during the heapify process, each level requires two comparisons to determine the maximum among three elements. In contrast, the bottom-up heapsort, during heapify, only compares two children at each level until reaching a leaf node. Then, it backtracks from the leaf node to find the correct position. Since heapify typically continues until very close to the leaf node, the standard heapify requires about 2*log2(n) comparisons, while the bottom-up variant only needs log2(n) comparisons. The experimental data presented below is based on an array generated by get_random_u32(). | N | comparisons(old) | comparisons(new) | time(old) | time(new) | |-------|------------------|------------------|-----------|-----------| | 10000 | 235381 | 136615 | 25545 us | 20366 us | | 20000 | 510694 | 293425 | 31336 us | 18312 us | | 30000 | 800384 | 457412 | 35042 us | 27386 us | | 40000 | 1101617 | 626831 | 48779 us | 38253 us | | 50000 | 1409762 | 799637 | 62238 us | 46950 us | | 60000 | 1721191 | 974521 | 75588 us | 58367 us | | 70000 | 2038536 | 1152171 | 90823 us | 68778 us | | 80000 | 2362958 | 1333472 | 104165 us | 78625 us | | 90000 | 2690900 | 1516065 | 116111 us | 89573 us | | 100000| 3019413 | 1699879 | 133638 us | 100998 us | Refs: BOTTOM-UP-HEAPSORT, a new variant of HEAPSORT beating, on an average, QUICKSORT (if n is not very small) Ingo Wegener Theoretical Computer Science, 118(1); Pages 81-98, 13 September 1993 https://doi.org/10.1016/0304-3975(93)90364-Y Signed-off-by: Kuan-Wei Chiu <visitorckw@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-05-08bcachefs: When traversing to interior nodes, propagate result to paths to ↵Kent Overstreet
same leaf node Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-05-08bcachefs: Don't read journal just for fsckKent Overstreet
reading the journal can take a decent amount of time compared to the rest of fsck, let's only read it when required. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-05-08bcachefs: allow for custom action in fsck error messagesKent Overstreet
Be more explicit to the user about what we're doing. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-05-08bcachefs: New assertion for writing to the journal after shutdownKent Overstreet
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-05-08bcachefs: bch2_btree_path_to_text()Kent Overstreet
Long form version of bch2_btree_path_to_text() - useful in error messages and tracepoints. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-05-08bcachefs: add btree_node_merging_disabled debug paramKent Overstreet
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-05-08bcachefs: bch2_hash_lookup() now returns bkey_s_cKent Overstreet
small cleanup Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-05-08bcachefs: bch2_journal_keys_dump()Kent Overstreet
debug helper Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-05-08bcachefs: bch2_btree_node_header_to_text()Kent Overstreet
better btree node read path error messages Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-05-08bcachefs: prt_printf() now respects \r\n\tKent Overstreet
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-05-08bcachefs: printbufs: prt_printf() now handles \t\r\nKent Overstreet
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-05-08bcachefs: printbuf improvementsKent Overstreet
- fix assorted (harmless) off-by-one errors - we were inconsistent on whether out->pos stays <= out->size on overflow; now it does, and printbuf.overflow exists to indicate if a printbuf has overflowed - factor out printbuf_advance_pos() - printbuf_nul_terminate_reserved(); use this to reduce the number of printbuf_make_room() calls Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-05-08bcachefs: Run upgrade/downgrade even in -o nochanges modeKent Overstreet
We need to be able to test these paths in dry run mode. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-05-08bcachefs: Better write_super() error messagesKent Overstreet
When a superblock write is silently dropped or it's been modified by another process we need to know which device it was. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-05-08bcachefs: Fix xattr_to_text() unsafetyKent Overstreet
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-05-08bcachefs: bch2_bkey_format_field_overflows()Kent Overstreet
Fix another shift-by-64 by factoring out a common helper for bch2_bkey_format_invalid() and bformat_needs_redo() (where it was already fixed). Reported-by: syzbot+9833a1d29d4a44361e2c@syzkaller.appspotmail.com Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-05-08bcachefs: Fix needs_whiteout BUG_ON() in bkey_sort()Kent Overstreet
Btree nodes are log structured; thus, we need to emit whiteouts when we're deleting a key that's been written out to disk. k->needs_whiteout tracks whether a key will need a whiteout when it's deleted, and this requires some careful handling; e.g. the key we're deleting may not have been written out to disk, but it may have overwritten a key that was - thus we need to carry this flag around on overwrites. Invariants: There may be multiple key for the same position in a given node (because of overwrites), but only one of them will be a live (non deleted) key, and only one key for a given position will have the needs_whiteout flag set. Additionally, we don't want to carry around whiteouts that need to be written in the main searchable part of a btree node - btree_iter_peek() will have to skip past them, and this can lead to an O(n^2) issues when doing sequential deletions (e.g. inode rm/truncate). So there's a separate region in the btree node buffer for unwritten whiteouts; these are merge sorted with the rest of the keys we're writing in the btree node write path. The unwritten whiteouts was a later optimization that bch2_sort_keys() didn't take into account; the unwritten whiteouts area means that we never have deleted keys with needs_whiteout set in the main searchable part of a btree node. That means we can simplify and optimize some sort paths, and eliminate an assertion that syzbot found: - Unless we're in the btree node write path, it's always ok to drop whiteouts when sorting - When sorting for a btree node write, we drop the whiteout if it's not from the unwritten whiteouts area, or if it's overwritten by a real key at the same position. This completely eliminates some tricky logic for propagating the needs_whiteout flag: syzbot was able to hit the assertion that checked that there shouldn't be more than one key at the same pos with needs_whiteout set, likely due to a combination of flipping on needs_whiteout on all written keys (they need whiteouts if overwritten), combined with not always dropping unneeded whiteouts, and the tricky logic in the sort path for preserving needs_whiteout that wasn't really needed. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-05-08bcachefs: Fix sb_clean_validate endianness conversionKent Overstreet
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-05-07bcachefs: Add missing sched_annotate_sleep() in bch2_journal_flush_seq_async()bcachefs-2024-05-07.2Kent Overstreet
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-05-07bcachefs: Fix race in bch2_write_super()Kent Overstreet
bch2_write_super() was looping over online devices multiple times - dropping and retaking io_ref each time. This meant it could race with device removal; it could increment the sequence number on a device but fail to write it - and then if the device was re-added, it would get confused the next time around thinking a superblock write was silently dropped. Fix this by taking io_ref once, and stashing pointers to online devices in a darray. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-05-06bcachefs: BCH_SB_LAYOUT_SIZE_BITS_MAXKent Overstreet
Define a constant for the max superblock size, to avoid a too-large shift. Reported-by: syzbot+a8b0fb419355c91dda7f@syzkaller.appspotmail.com Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-05-06bcachefs: Add missing skcipher_request_set_callback() callKent Overstreet
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-05-06bcachefs: Fix snapshot_t() usage in bch2_fs_quota_read_inode()Kent Overstreet
bch2_fs_quota_read_inode() wasn't entirely updated to the bch2_snapshot_tree() helper, which takes rcu lock. Reported-by: syzbot+a3a9a61224ed3b7f0010@syzkaller.appspotmail.com Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-05-06bcachefs: Fix shift-by-64 in bformat_needs_redo()Kent Overstreet
Ancient versions of bcachefs produced packed formats that could represent keys that our in memory format cannot represent; bformat_needs_redo() has some tricky shifts to check for this sort of overflow. Reported-by: syzbot+594427aebfefeebe91c6@syzkaller.appspotmail.com Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-05-06bcachefs: Guard against unknown k.k->type in __bkey_invalid()Kent Overstreet
For forwards compatibility we have to allow unknown key types, and only run the checks that make sense against them. Fix a missing guard on k.k->type being known. Reported-by: syzbot+ae4dc916da3ce51f284f@syzkaller.appspotmail.com Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-05-06bcachefs: Add missing validation for superblock section cleanKent Overstreet
We were forgetting to check for jset entries that overrun the end of the section - both in validate and to_text(); to_text() needs to be safe for types that fail to validate. Reported-by: syzbot+c48865e11e7e893ec4ab@syzkaller.appspotmail.com Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-05-06bcachefs: Fix assert in bch2_alloc_v4_invalid()Kent Overstreet
Reported-by: syzbot+10827fa6b176e1acf1d0@syzkaller.appspotmail.com Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-05-06bcachefs: fix overflow in fiemapReed Riley
filefrag (and potentially other utilities that call fiemap) sometimes pass ULONG_MAX as the length. fiemap_prep clamps excessively large lengths - but the calculation of end can overflow if it occurs before calling fiemap_prep. When this happens, filefrag assumes it has read to the end and exits. Signed-off-by: Reed Riley <reed@riley.engineer> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-05-06bcachefs: Add a better limit for maximum number of bucketsKent Overstreet
The bucket_gens array is a single array allocation (one byte per bucket), and kernel allocations are still limited to INT_MAX. Check this limit to avoid failing the bucket_gens array allocation. Reported-by: syzbot+b29f436493184ea42e2b@syzkaller.appspotmail.com Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-05-06bcachefs: Fix lifetime issue in device iterator helpersKent Overstreet
bch2_get_next_dev() and bch2_get_next_online_dev() iterate over devices, dropping and taking refs as they go; we can't access the previous device (for ca->dev_idx) after we've dropped our ref to it, unless we take rcu_read_lock() first. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-05-06bcachefs: Fix bch2_dev_lookup() refcountingKent Overstreet
bch2_dev_lookup() is supposed to take a ref on the device it returns, but for_each_member_device() takes refs as it iterates, for_each_member_device_rcu() does not. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-05-06bcachefs: Initialize bch_write_op->failed in inline data pathKent Overstreet
Normally this is initialized in __bch2_write(), which is executed in a loop, but the inline data path skips this. Reported-by: syzbot+fd3ccb331eb21f05d13b@syzkaller.appspotmail.com Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-05-06bcachefs: Fix refcount put in sb_field_resize error pathKent Overstreet
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-05-06bcachefs: Inodes need extra padding for varint_decode_fast()Kent Overstreet
Reported-by: syzbot+66b9b74f6520068596a9@syzkaller.appspotmail.com Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>