authorIkiWiki <ikiwiki.info>2021-03-09 14:05:47 -0800
committerIkiWiki <ikiwiki.info>2021-03-09 14:05:47 -0800
commitba059a05dc686c6360aeaa12de16a504c8b79ece (patch)
tree8484e9dd6a99311c1f79d60bd5aedde1c593fe81
parent28f6297beda4bed220440e15266ca78a2b2c0f47 (diff)
parentc15652573e158c1b24dabda05a1dd1081d7f231d (diff)
Merge branch 'master' of /home/bcachefs/bcachefs
-rw-r--r--Snapshots.mdwn34
1 file changed, 34 insertions, 0 deletions
diff --git a/Snapshots.mdwn b/Snapshots.mdwn
index bac5e54..3fbb2f1 100644
--- a/Snapshots.mdwn
+++ b/Snapshots.mdwn
@@ -117,6 +117,40 @@ In the current design, deleting a snapshot will require walking every btree that
has snapshots (extents, inodes, dirents and xattrs) to find and delete keys with
the given snapshot ID. It would be nice to improve this.
+Other performance considerations:
+=================================
+
+Snapshots seem to exist in one of those design spaces with inherent tradeoffs,
+where it's almost impossible to design something that doesn't have pathological
+performance in some use case. E.g. btrfs snapshots are known for being
+inefficient with sparse snapshots.
+
+bcachefs snapshots should perform beautifully when taking frequent periodic (and
+thus mostly fairly sparse) snapshots. The one thing we may have to watch out for
+is part of the keyspace becoming too dense with keys from unrelated snapshots -
+e.g. if we start with a 1 GB file, snapshot it 100 or 1000 times, and then have
+fio fully overwrite the file with 4k random writes in every snapshot. That would
+not be good: reading that file sequentially would then require scanning through
+more or less all the extents from every snapshot.
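To put rough numbers on the worst case above (a back-of-the-envelope sketch; the function name and figures are illustrative, not measured bcachefs behavior):

```python
# Upper bound on how many extents a sequential read might have to scan past
# in the dense-keyspace worst case: a file fully overwritten with small
# random writes in each of N snapshots leaves up to N full sets of extents
# interleaved in the same region of the keyspace.

def extents_to_scan(file_size: int, write_size: int, snapshots: int) -> int:
    """Worst-case extent count for one sequential read of the file."""
    extents_per_snapshot = file_size // write_size
    return extents_per_snapshot * snapshots

# 1 GB file, 4k random writes, 100 snapshots:
print(extents_to_scan(1 << 30, 4096, 100))  # → 26214400
```

Even at 100 snapshots that's on the order of tens of millions of extents for a single 1 GB file, which is why this pattern is worth guarding against.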
+
+I expect this to be a fairly uncommon issue, though: when we allocate new inode
+numbers we pick an inode number that's unused in any snapshot, and most files in
+a filesystem are created, written to once, and then some time later replaced by
+a new version that's renamed over the old file. The only way to trigger this
+issue is by doing steady random writes to a large existing file that's never
+recreated - which mostly means databases and virtual machine images. For
+virtual machine images, people would be better off using reflink, which we
+already support and which doesn't have this issue at all.
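For example, a space-efficient copy-on-write clone of a VM image can be made with `cp --reflink` (illustrative commands; the file names are made up, and `--reflink=auto` falls back to a plain copy on filesystems without reflink support):

```shell
# Illustrative only: create a placeholder image, then make a reflinked clone.
# With --reflink=auto, blocks are shared where the filesystem supports it
# (bcachefs, btrfs, XFS) until one copy is modified, so random writes to the
# clone only allocate new extents for the clone.
dd if=/dev/zero of=vm-image.raw bs=1M count=4 status=none
cp --reflink=auto vm-image.raw vm-image-clone.raw
```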
+
+But if this does turn out to be a real issue for people (and if someone's
+willing to fund this area), it should be perfectly solvable: we first need to
+track the number of keys for a given inode (extents/dirents/xattrs) in a given
+snapshot, and in all snapshots. When that ratio crosses some threshold, we'll
+allocate a new inode, move all the keys for that inode number and snapshot ID
+to the new inode, and mark the original inode to redirect to the new inode so
+that the user-visible inode number doesn't change. A bit tedious to implement,
+but straightforward enough.
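The redirect scheme could be sketched roughly as follows (a hypothetical illustration, not bcachefs code; the class, threshold value, and inode allocator here are all invented for the example):

```python
# Hypothetical sketch of the inode-redirect idea above: track keys per
# (inode, snapshot) and per inode across all snapshots; when one snapshot's
# keys become a small fraction of the inode's keyspace, move them to a
# freshly allocated inode and leave a redirect so the user-visible inode
# number is unchanged.

from collections import defaultdict

THRESHOLD = 0.25  # invented cutoff: split when a snapshot owns < 25% of keys


class KeyspaceTracker:
    def __init__(self):
        self.per_snapshot = defaultdict(int)  # (inode, snapshot_id) -> keys
        self.per_inode = defaultdict(int)     # inode -> keys in all snapshots
        self.redirects = {}                   # visible inode -> backing inode
        self.next_inode = 1000                # pretend inode allocator

    def add_key(self, inode, snapshot_id):
        self.per_snapshot[(inode, snapshot_id)] += 1
        self.per_inode[inode] += 1

    def maybe_split(self, inode, snapshot_id):
        """If this snapshot's keys are a small fraction of the inode's
        keyspace, move them to a new inode and record a redirect; return
        the backing inode to use from now on."""
        mine = self.per_snapshot[(inode, snapshot_id)]
        total = self.per_inode[inode]
        if total == 0 or mine / total >= THRESHOLD:
            return inode  # still dense enough, leave it alone
        new_inode = self.next_inode
        self.next_inode += 1
        # move this snapshot's keys to the new inode
        self.per_snapshot[(new_inode, snapshot_id)] = mine
        del self.per_snapshot[(inode, snapshot_id)]
        self.per_inode[inode] -= mine
        self.per_inode[new_inode] = mine
        self.redirects[inode] = new_inode
        return new_inode
```

The redirect table is what keeps the user-visible inode number stable: lookups check it first, so moving the keys is invisible to userspace.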
+
Permissions:
============