author IkiWiki <ikiwiki.info> 2021-05-22 23:47:15 -0700
committer IkiWiki <ikiwiki.info> 2021-05-22 23:47:15 -0700
commit 34d001d0e23551c8550b0e2b62734cbe68d1caa2
tree 05dc385148c870a7445ac5ec0da108fb463253e8
parent 42351b958ad21a487369e6adc27eb95f74a40e44
parent d1386efe8b64c2fb1e6a269824ead10f01f785f3
Merge branch 'master' of /home/bcachefs/bcachefs
Diffstat (limited to 'Allocator.mdwn')
-rw-r--r-- Allocator.mdwn 80
1 files changed, 73 insertions, 7 deletions
diff --git a/Allocator.mdwn b/Allocator.mdwn
index 9bad84a..2c15a68 100644
--- a/Allocator.mdwn
+++ b/Allocator.mdwn
@@ -45,11 +45,77 @@ available buckets. It's too easy to end up with bugs where this scanning is
happening repeatedly (we have one now...), and it's a scalability issue and we
shouldn't be doing it at all.
-Also, we would really like to get rid of the in memory array of buckets, this is
-another scalability issue. The in memory struct bucket now mostly just mirrors
-what's in the btree, in `KEY_TYPE_alloc` keys. The one exception is
-`journal_seq`, which is the journal sequence number of the last btree update
-that touched that bucket. It's needed for knowing whether we need to flush the
-journal before allocating and writing to that bucket - perhaps it should just be
-added to `struct bch_alloc` and stored in the keys in the btree.
+## In memory bucket array - kill
+We need to get rid of the in memory array of buckets - it's another scalability
+issue, and at this point it's an anachronism since now it mostly just mirrors
+what's in the btree, in `KEY_TYPE_alloc` keys. Exceptions:
+
+* `journal_seq`, which is the journal sequence number of the last btree update
+ that touched that bucket. It's needed for knowing whether we need to flush the
+ journal before allocating and writing to that bucket - we need to create
+ another data structure for buckets that can't be used until the journal is
+ flushed.
+
+* `owned_by_allocator`, which is true if an `open_bucket` referencing that
+ bucket exists - `open_buckets` represent buckets currently being allocated
+ from; they're reference counted to prevent buckets from being double allocated
+ (with writes taking a reference when they allocate space and releasing that
+ ref after they create the extent that points to that space).
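
The reference counting described for `open_buckets` can be pictured with a minimal sketch. It assumes a single-threaded counter for brevity (real kernel code would need atomics or locking), and `open_bucket_sketch`, `ob_get` and `ob_put` are hypothetical names, not the bcachefs definitions.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical simplification: only the reference count is modelled. */
struct open_bucket_sketch {
	unsigned refs;	/* the allocator's ref plus one per in-flight write */
};

/* A write takes a ref when it allocates space from the bucket. */
static void ob_get(struct open_bucket_sketch *ob)
{
	ob->refs++;
}

/*
 * A write drops its ref after creating the extent that points to the
 * space. Returns true when the last ref is gone, i.e. the bucket would
 * no longer be `owned_by_allocator`.
 */
static bool ob_put(struct open_bucket_sketch *ob)
{
	assert(ob->refs > 0);
	return --ob->refs == 0;
}
```

While any reference is held the bucket cannot be handed out again, which is what prevents double allocation.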
+
+Currently, the main obstacle to getting rid of the bucket array is that we need
+to start the allocator threads to run journal replay, but we can't use btree
+iterators until journal replay has at least finished replaying updates to
+interior btree nodes.
+
+We have code in recovery.c for iterators that iterate over the btree with keys
+from the journal overlaid; it may be time to move this support into regular
+btree iterators in `btree_iter.c`.
+
+## WHAT WE NEED IN THE REWRITE:
+
+### Data structures:
+
+* buckets containing cached data that can be used if they are invalidated
+
+* buckets that are waiting on journal commit until they can be used
+
+* buckets ready to be discarded
+
+* buckets that are ready to be used now
+
+Buckets that are ready to be used now should be stored in sorted order.
+
+Buckets that are waiting on journal commit should be indexed by what journal
+seq needs to be flushed.
+
+Buckets that contain cached data should be stored on a heap.
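
As a rough illustration of the four bucket categories and the journal-seq index described above, the sketch below keeps waiting buckets in an array sorted by the sequence number that must be flushed. Every name here (`bucket_state`, `waiting_bucket`, `buckets_released`) is hypothetical, and the sorted array is only one possible container. None of this is taken from the bcachefs source.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical names for the four categories listed above. */
enum bucket_state {
	BUCKET_CACHED,		/* cached data, usable once invalidated */
	BUCKET_NEED_JOURNAL,	/* waiting on a journal commit */
	BUCKET_NEED_DISCARD,	/* ready to be discarded */
	BUCKET_FREE,		/* ready to be used now */
};

struct waiting_bucket {
	uint64_t bucket;	/* bucket index */
	uint64_t journal_seq;	/* seq that must be flushed first */
};

/*
 * `w` is sorted by ascending journal_seq. Returns how many entries at
 * the front no longer need to wait once `flushed_seq` has been flushed.
 */
static size_t buckets_released(const struct waiting_bucket *w, size_t nr,
			       uint64_t flushed_seq)
{
	size_t i = 0;

	assert(w || !nr);
	while (i < nr && w[i].journal_seq <= flushed_seq)
		i++;
	return i;
}
```

Indexing by sequence number means a journal flush only has to look at the front of the structure rather than scan every bucket.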
+
+TASK LIST:
+
+* Move incrementing of bucket gens to happen when a bucket becomes empty (and
+ doesn't have an `open_bucket` pointing to it) - also need to add code to
+ `open_bucket_put()` to increment bucket gen if necessary
+
+* At the same time, add bucket to list of buckets awaiting journal commit
+
+* Add a flags field to `bch_alloc_v2` and a flag indicating bucket needs to be
+ discarded
+
+* Add a flag indicating whether a bucket had cached data in it - if not, we
+ don't need to track `oldest_gen`
+
+
+STATE TRANSITIONS:
+
+Cached data bucket -> empty uncommitted bucket
+Dirty data bucket -> empty uncommitted bucket
+
+In either case, the empty bucket first goes on the list of buckets awaiting
+journal commit.
+
+After journal commit:
+
+empty uncommitted bucket -> empty undiscarded bucket
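
The transitions listed so far can be written as a small state function. This is a sketch covering only the transitions stated above; the enum and function names are hypothetical, and any (state, event) pair not listed leaves the state unchanged.

```c
/* Hypothetical names; only the states named in the text are modelled. */
enum bstate {
	B_CACHED,		/* cached data bucket */
	B_DIRTY,		/* dirty data bucket */
	B_EMPTY_UNCOMMITTED,	/* empty, awaiting journal commit */
	B_EMPTY_UNDISCARDED,	/* empty, journal committed, not yet discarded */
};

enum bevent {
	EV_EMPTIED,		/* bucket became empty */
	EV_JOURNAL_COMMIT,	/* the relevant journal seq was committed */
};

static enum bstate bucket_next(enum bstate s, enum bevent ev)
{
	switch (s) {
	case B_CACHED:
	case B_DIRTY:
		/* Either kind of data bucket becomes empty uncommitted. */
		return ev == EV_EMPTIED ? B_EMPTY_UNCOMMITTED : s;
	case B_EMPTY_UNCOMMITTED:
		return ev == EV_JOURNAL_COMMIT ? B_EMPTY_UNDISCARDED : s;
	default:
		return s;
	}
}
```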
+
+After discard