From d1386efe8b64c2fb1e6a269824ead10f01f785f3 Mon Sep 17 00:00:00 2001
From: Kent Overstreet
Date: Sun, 23 May 2021 02:47:13 -0400
Subject: fix typo

---
 Allocator.mdwn | 80 +++++++++++++++++++++++++++++++++++++++++++++++++++++-----
 1 file changed, 73 insertions(+), 7 deletions(-)

(limited to 'Allocator.mdwn')

diff --git a/Allocator.mdwn b/Allocator.mdwn
index 9bad84a..2c15a68 100644
--- a/Allocator.mdwn
+++ b/Allocator.mdwn
@@ -45,11 +45,77 @@ available buckets. It's too easy to end up with bugs where this scanning is
 happening repeatedly (we have one now...), and it's a scalability issue and we
 shouldn't be doing it at all.
 
-Also, we would really like to get rid of the in memory array of buckets, this is
-another scalability issue. The in memory struct bucket now mostly just mirrors
-what's in the btree, in `KEY_TYPE_alloc` keys. The one exception is
-`journal_seq`, which is the journal sequence number of the last btree update
-that touched that bucket. It's needed for knowing whether we need to flush the
-journal before allocating and writing to that bucket - perhaps it should just be
-added to `struct bch_alloc` and stored in the keys in the btree.
+## In memory bucket array - kill
+We need to get rid of the in memory array of buckets - it's another scalability
+issue, and at this point it's an anachronism since now it mostly just mirrors
+what's in the btree, in `KEY_TYPE_alloc` keys. Exceptions:
+
+* `journal_seq`, which is the journal sequence number of the last btree update
+  that touched that bucket. It's needed for knowing whether we need to flush the
+  journal before allocating and writing to that bucket - we need to create
+  another data structure for buckets that can't be used until the journal is
+  flushed.
+
+* `owned_by_allocator`, which is true if an `open_bucket` referencing that
+  bucket exists - `open_buckets` represent buckets currently being allocated
+  from, they're reference counted to prevent buckets from being double allocated
+  (with writes taking a reference when they allocate space and releasing that
+  ref after they create the extent that points to that space).
+
+Currently, the main obstacle to getting rid of the bucket array is that we need
+to start the allocator threads to run journal replay, but we can't use btree
+iterators until journal replay has at least finished replaying updates to
+interior btree nodes.
+
+We have code in recovery.c for iterators that iterate over the btree with keys
+from the journal overlaid; it may be time to move this support into regular
+btree iterators in `btree_iter.c`.
+
+## WHAT WE NEED IN THE REWRITE:
+
+### Data structures:
+
+* buckets containing cached data that can be used if they are invalidated
+
+* buckets that are waiting on journal commit until they can be used
+
+* buckets ready to be discarded
+
+* buckets that are ready to be used now
+
+Buckets that are ready to be used now should be stored in sorted order.
+
+Buckets that are waiting on journal commit should be indexed by what journal
+seq needs to be flushed.
+
+Buckets that contain cached data should be stored on a heap.
+
+TASK LIST:
+
+* Move incrementing of bucket gens to happen when a bucket becomes empty (and
+  doesn't have an `open_bucket` pointing to it) - also need to add code to
+  `open_bucket_put()` to increment bucket gen if necessary
+
+* At the same time, add bucket to list of buckets awaiting journal commit
+
+* Add a flags field to `bch_alloc_v2` and a flag indicating bucket needs to be
+  discarded
+
+* Add a flag indicating whether a bucket had cached data in it - if not, we
+  don't need to track `oldest_gen`
+
+
+STATE TRANSITIONS:
+
+Cached data bucket -> empty uncommitted bucket
+Dirty data bucket -> empty uncommitted bucket
+
+In either case, the empty bucket first goes on the list of buckets awaiting
+journal commit.
+
+After journal commit:
+
+empty uncommitted bucket -> empty undiscarded bucket
+
+After discard
-- 
cgit v1.2.3
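
The state transitions listed in the patch can be illustrated with a small C sketch. This is not bcachefs code: the enum values, `struct bucket`, and both function names are hypothetical and chosen only for illustration. It covers only the transitions the notes spell out (cached or dirty to empty uncommitted, then to empty undiscarded once the journal has flushed far enough). The step after discard is left out because the notes do not state it. The `open_bucket` check that the task list attaches to the gen increment is also omitted for brevity.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical state names; the notes describe these states only in prose. */
enum bucket_state {
	BUCKET_CACHED,            /* holds cached data, reusable once invalidated */
	BUCKET_DIRTY,             /* holds dirty data */
	BUCKET_EMPTY_UNCOMMITTED, /* emptied, waiting on journal commit */
	BUCKET_EMPTY_UNDISCARDED, /* journal committed, waiting on discard */
};

/* Hypothetical per-bucket record, not the real struct bucket or bch_alloc_v2. */
struct bucket {
	enum bucket_state state;
	uint8_t           gen;         /* bucket generation number */
	uint64_t          journal_seq; /* seq that must be flushed before reuse */
};

/*
 * Cached or dirty bucket -> empty uncommitted bucket.
 * Per the task list, the gen is incremented when the bucket becomes empty
 * (assumption: no open_bucket points at it; that check is omitted here).
 * Returns false if the bucket was not in a state that can be emptied.
 */
static bool bucket_make_empty(struct bucket *b, uint64_t journal_seq)
{
	if (b->state != BUCKET_CACHED && b->state != BUCKET_DIRTY)
		return false;

	b->gen++;
	b->journal_seq = journal_seq;
	b->state = BUCKET_EMPTY_UNCOMMITTED;
	return true;
}

/*
 * Empty uncommitted bucket -> empty undiscarded bucket, once the journal
 * has been flushed up to at least the bucket's recorded sequence number.
 * Returns false if the bucket is in another state or the flush has not
 * reached its sequence number yet.
 */
static bool bucket_journal_flushed(struct bucket *b, uint64_t flushed_seq)
{
	if (b->state != BUCKET_EMPTY_UNCOMMITTED || b->journal_seq > flushed_seq)
		return false;

	b->state = BUCKET_EMPTY_UNDISCARDED;
	return true;
}
```

In this sketch the journal sequence number lives in the bucket itself. The notes instead propose indexing waiting buckets by the journal seq that needs to be flushed, which would let a flush completion find every newly usable bucket without scanning.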