summaryrefslogtreecommitdiff
path: root/fs/bcachefs/buckets.h
AgeCommit message (Collapse)Author
2023-10-22bcachefs: btree_pathKent Overstreet
This splits btree_iter into two components: btree_iter is now the externally visible componont, and it points to a btree_path which is now reference counted. This means we no longer have to clone iterators up front if they might be mutated - btree_path can be shared by multiple iterators, and cloned if an iterator would mutate a shared btree_path. This will help us use iterators more efficiently, as well as slimming down the main long lived state in btree_trans, and significantly cleans up the logic for iterator lifetimes. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22bcachefs: Extensive triggers cleanupsKent Overstreet
- We no longer mark subsets of extents, they're marked like regular keys now - which means we can drop the offset & sectors arguments to trigger functions - Drop other arguments that are no longer needed anymore in various places - fs_usage - Drop the logic for handling extents in bch2_mark_update() that isn't needed anymore, to match bch2_trans_mark_update() - Better logic for hanlding the BTREE_ITER_CACHED_NOFILL case, where we don't have an old key to mark Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2023-10-22bcachefs: Inline fastpath of bch2_disk_reservation_add()Kent Overstreet
The fastpath now doesn't even disable preemption - instead we use a (non locked) cmpxchg. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22bcachefs: statfs resports incorrect avail blocksDan Robertson
The current implementation of bch_statfs does not scale the number of available blocks provided in f_bavail by the reserve factor. This causes an allocation of a file of this size to fail. Signed-off-by: Dan Robertson <dan@dlrobertson.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22bcachefs: gc shouldn't care about owned_by_allocatorKent Overstreet
The owned_by_allocator field is a purely in memory thing, even if/when we bring back GC at runtime there's no need for it to be recalculating this field. This is prep work for pulling it out of struct bucket, and eventually getting rid of the bucket array. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22bcachefs: Fix bch2_trans_mark_dev_sb()Kent Overstreet
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22bcachefs: Kill bch2_fs_usage_scratch_get()Kent Overstreet
This is an important cleanup, eliminating an unnecessary copy in the transaction commit path. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22bcachefs: Fix livelock calling bch2_mark_bkey_replicas()Kent Overstreet
The bug was that we were trying to find a replicas entry that wasn't sorted - but, we can also simplify the code by not using bch2_mark_bkey_replicas and instead ensuring the list of replicas entries exists directly. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22buckets.c fixups XXX squashKent Overstreet
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22bcachefs: Fix copygc thresholdKent Overstreet
Awhile back the meaning of is_available_bucket() and thus also bch_dev_usage->buckets_unavailable changed to include buckets that are owned by the allocator - this was so that the stat could be persisted like other allocation information, and wouldn't have to be regenerated by walking each bucket at mount time. This broke copygc, which needs to consider buckets that are reclaimable and haven't yet been grabbed by the allocator thread and moved onta freelist. This patch fixes that by adding dev_buckets_reclaimable() for copygc and the allocator thread, and cleans up some of the callers a bit. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22bcachefs: Journal updates to dev usageKent Overstreet
This eliminates the need to scan every bucket to regenerate dev_usage at mount time. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22bcachefs: Persist 64 bit io clocksKent Overstreet
Originally, bcachefs - going back to bcache - stored, for each bucket, a 16 bit counter corresponding to how long it had been since the bucket was read from. But, this required periodically rescaling counters on every bucket to avoid wraparound. That wasn't an issue in bcache, where we'd perodically rewrite the per bucket metadata all at once, but in bcachefs we're trying to avoid having to walk every single bucket. This patch switches to persisting 64 bit io clocks, corresponding to the 64 bit bucket timestaps introduced in the previous patch with KEY_TYPE_alloc_v2. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22bcachefs: Mark superblocks transactionallyKent Overstreet
More work towards getting rid of the in memory struct bucket: this path adds code for marking superblock and journal buckets via the btree, and uses it in the device add and journal resize paths. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22bcachefs: Kill bch2_invalidate_bucket()Kent Overstreet
This patch is working towards eventually getting rid of the in memory struct bucket, and relying only on the btree representation. Since bch2_invalidate_bucket() was only used for incrementing gens, not invalidating cached data, no other counters were being changed as a side effect - meaning it's safe for the allocator code to increment the bucket gen directly. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22bcachefs: Refactor dev usageKent Overstreet
This is to make it more amenable for serialization. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22bcachefs: Fix integer overflow in bch2_disk_reservation_get()Kent Overstreet
The sectors argument shouldn't have been a u32 - it can be up to U32_MAX (i.e. fallocate creating persistent reservations), and if replication is enabled we'll overflow when we calculate the real number of sectors to reserve. Oops. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22bcachefs: Don't write bucket IO time lazilyKent Overstreet
With the btree key cache code, we don't need to update the alloc btree lazily - and this will mean we can remove the bch2_alloc_write() call in the shutdown path. Future work: we really need to expend the bucket IO clocks from 16 to 64 bits, so that we don't have to rescale them. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22bcachefs: Update transactional triggers interface to pass old & new keysKent Overstreet
This is needed to fix a bug where we're overflowing iterators within a btree transaction, because we're updating the stripes btree (to update block counts) and the stripes btree trigger is unnecessarily updating the alloc btree - it doesn't need to update the alloc btree when the pointers within a stripe aren't changing. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22bcachefs: Delete unused argumentsKent Overstreet
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22bcachefs: Use x-macros for data typesKent Overstreet
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22bcachefs: Rework triggers interfaceKent Overstreet
The trigger for stripe keys is shortly going to need both the old and the new key passed to the trigger - this patch does that rework. For now, this just changes the in memory triggers, and this doesn't change how extent triggers work. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22bcachefs: Kill BTREE_TRIGGER_NOOVERWRITESKent Overstreet
This is prep work for reworking the triggers machinery - we have triggers that need to know both the old and the new key. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22bcachefs: Interior btree updates are now fully transactionalKent Overstreet
We now update the alloc info (bucket sector counts) atomically with journalling the update to the interior btree nodes, and we also set new btree roots atomically with the journalled part of the btree update. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22bcachefs: Move extent overwrite handling out of core btree codeKent Overstreet
Ever since the btree code was first written, handling of overwriting existing extents - including partially overwriting and splittin existing extents - was handled as part of the core btree insert path. The modern transaction and iterator infrastructure didn't exist then, so that was the only way for it to be done. This patch moves that outside of the core btree code to a pass that runs at transaction commit time. This is a significant simplification to the btree code and overall reduction in code size, but more importantly it gets us much closer to the core btree code being completely independent of extents and is important prep work for snapshots. This introduces a new feature bit; the old and new extent update models are incompatible when the filesystem needs journal replay. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22bcachefs: btree_ptr_v2Kent Overstreet
Add a new btree ptr type which contains the sequence number (random 64 bit cookie, actually) for that btree node - this lets us verify that when we read in a btree node it really is the btree node we wanted. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22bcachefs: Split out btree_trigger_flagsKent Overstreet
The trigger flags really belong with individual btree_insert_entries, not the transaction commit flags - this splits out those flags and unifies them with the BCH_BUCKET_MARK flags. Todo - split out btree_trigger.c from buckets.c Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22bcachefs: Make btree_insert_entry more private to update pathKent Overstreet
This should be private to btree_update_leaf.c, and we might end up removing it. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22bcachefs: Inline more of bch2_trans_commit hot pathKent Overstreet
The main optimization here is that if we let bch2_replicas_delta_list_apply() fail, we can completely skip calling bch2_bkey_replicas_marked_locked(). And assorted other small optimizations. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22bcachefs: Fix erasure coding disk space accountingKent Overstreet
Disk space accounting for erasure coding + compression was completely broken - we need to calculate the parity sectors delta the same way we calculate disk_sectors, by calculating the old and new usage and subtracting to get the difference. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22bcachefs: Improve pointer marking checks and error messagesKent Overstreet
Importantly, we don't want to use bch2_fs_inconsistent_on() for errors that fsck can repair, becuase that will just put us in RO mode and prevent fsck from actually fixing stuff. Probably want to get rid of it in the future. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22bcachefs: Rework calling convention for marking overwritesKent Overstreet
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22bcachefs: Update path now handles triggers that generate more triggersKent Overstreet
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22bcachefs: Improve key marking interfaceKent Overstreet
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22bcachefs: Don't use a fixed size buffer for fs_usage_deltasKent Overstreet
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22bcachefs: Fix spurious inconsistency in recoveryKent Overstreet
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22bcachefs: bch2_trans_mark_update()Kent Overstreet
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22bcachefs: Write out fs usage consistentlyKent Overstreet
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22bcachefs: Return errors from for_each_btree_key()Kent Overstreet
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22bcachefs: Mark overwrites from journal replay in initial gcKent Overstreet
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22bcachefs: add ability to run gc on metadata onlyKent Overstreet
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22bcachefs: Caller now responsible for calling mark_key for gcKent Overstreet
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22bcachefs: Allocate fs_usage in do_btree_insert_at()Kent Overstreet
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22bcachefs: kill struct btree_insertKent Overstreet
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22bcachefs: Fix gc handling of bucket gensKent Overstreet
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22bcachefs: fs_usage_u64s()Kent Overstreet
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22bcachefs: Add a mechanism for blocking the journalKent Overstreet
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22bcachefs: Convert bucket invalidation to key marking pathKent Overstreet
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22bcachefs: refactor key marking code a bitKent Overstreet
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22bcachefs: Fix double counting when gc is runningKent Overstreet
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22bcachefs: gc lock no longer needed for disk reservationsKent Overstreet
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>