summaryrefslogtreecommitdiff
path: root/Todo.mdwn
diff options
context:
space:
mode:
authorKent Overstreet <kent.overstreet@linux.dev>2023-09-22 18:30:23 -0400
committerKent Overstreet <kent.overstreet@linux.dev>2023-09-22 18:30:23 -0400
commit1fec6f13837263d5abc35a87a78999fd26eb580f (patch)
tree9c5a53f684dac0f6d7fa760bc5919d54d98063d5 /Todo.mdwn
parenta04ce4a3391c24e44dadd0d8e54fd4a52173e135 (diff)
More website improvements
- expand, reorg frontpage - update roadmap - new debugging page - new btree perf numbers Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Diffstat (limited to 'Todo.mdwn')
-rw-r--r--Todo.mdwn157
1 files changed, 0 insertions, 157 deletions
diff --git a/Todo.mdwn b/Todo.mdwn
deleted file mode 100644
index 06fbdb5..0000000
--- a/Todo.mdwn
+++ /dev/null
@@ -1,157 +0,0 @@
-# TODO
-
-## Kernel developers
-
-### P0 features
-
- * Disk space accounting rework: now that we've got the btree write buffer, we
- can store disk space accounting in btree keys (instead of the current hacky
- mechanism where it's only stored in the journal).
-
- This will make it possible to store many more counters - meaning we'll be
- able to store per-snapshot-id accounting.
-
- We also need to add separate counters for compressed/uncompressed size, and
- broken out by compression type.
-
- It'd also be really nice to add more counters to the inode for
- compressed/uncompressed size. Right now the inode just has `bi_sectors`,
- which the ondisk size for fallocate purposes, i.e. discounting replication.
- We could add something like
-
- bi_sectors_ondisk_compressed
- bi_sectors_ondisk_uncompressed
-
-### P1 features
-
-### P2 features
-
- * When we're using compression, we end up wasting a fair amount of space on
- internal fragmentation because compressed extents get rounded up to the
- filesystem block size when they're written - usually 4k. It'd be really nice
- if we could pack them in more efficiently - probably 512 byte sector
- granularity.
-
- On the read side this is no big deal to support - we have to bounce
- compressed extents anyways. The write side is the annoying part. The options
- are:
- * Buffer up writes when we don't have full blocks to write? Highly
- problematic, not going to do this.
- * Read modify write? Not an option for raw flash, would prefer it to not be
- our only option
- * Do data journalling when we don't have a full block to write? Possible
- solution, we want data journalling anyways
-
- * Full data journalling - we're definitely going to want this for when the
- journal is on an NVRAM device (also need to implement external journalling
- (easy), and direct journal on NVRAM support (what's involved here?)).
-
- Would be good to get a simple implementation done and tested so we know what
- the on disk format is going to be.
-
-## Developers
-
- * End user documentation needs a lot of work - complete man pages, etc.
-
- * bcachefs-tools needs some fleshing out in the --help department
-
-## Users
-
- * Benchmark bcachefs performance on different configurations:
- (Note that these tests will have to be repeated in the future.)
- * SSD
- * HDD
- * SSD + HDD
-
- * Update the website / Documentation
- * Ask questions (so and we can update the documentation / website).
-
-## Wishlist
-
- * "Seed devices", hard-readonly devices that are CoWed from on write (btrfs
- has this; useful for base devices for virtualization, among other things).
- * Nonce-misuse-resistant authenticated encryption, such as [AES-SIV][AESSIV]
- or HS1-SIV (Closes potential hole regarding nonce reuse and "external"
- snapshots, as might happen to VMs or systems with externally-managed storage
- like iSCSI).
- * Some form of "secure delete" functionality. (However, see [this LWN article][secdel]
- regarding implementation strategies and pitfalls).
- * A simplified userspace API with no hierarchy, only blobs identified by
- unique integer keys (eternaleye thinks this might be useful for
- object-capability systems, such as [Robigalia][robigalia]).
- * An API like the above, but supporting multiple streams per blob, possibly
- with string identifiers (needs further examination, intent is to match the
- [needs of CephFS for OSD backends][cephback]).
- * More advanced caching algorithms; one potentially-relevant paper is
- [Pannier: A Container-based Flash Cache for Compound Objects][pannier].
- * "Asymmetrical" compression algorithms, that support only decompression (XZ
- is a nice candidate here, and would be a very good match for some seed
- device use cases).
- * RAID-6 with parity 3 or greater - could potentially use Andrea Mazzoleni's
- [technique][triple-parity] for generating Cauchy matrices compatible with
- Linux' current RAID-5 and RAID-6 formats, providing a clean upgrade path.
- * "Inline" forward error correction, possibly using a fountain code like
- [RaptorQ][RFC6330].
- * Support Trusted/Encrypted kernel keyring keys, in order to take advantage
- of TPMs.
- * Support for multiple key slots.
- * LUKS2 has a new [keyslot system][LUKS2-keyslots] that better supports
- two-factor auth and other external keying mechanisms.
- * Ponder the ramifications of (and safe defaults for) compression in the
- presence of encryption.
- * Swap file support.
- * Support (a subset of?) the Ext[234] attributes denoting special behaviors:
- * `+c` for compressed files
- * `+C` for disabling copy-on-write
- * `+e` for extent-based storage (always set? btrfs doesn't set it...)
- * `+E` for _displaying_ that a file is encrypted
- * `+N` for _displaying_ that a file's contents are inlined into the inode
- * `+s` for files that should be securely erased on delete
- * `+u` for files that should permit being "undeleted"
- * Others seem less relevant, but may be worth investigating.
- * "Remote Subvolumes", subvolumes that reside on a subset of the filesystem's
- physical devices and can be deactivated so that the physical devices can be
- detached (if all subvolumes on them are remote and deactivated) without
- unmounting the entire filesystem.
- * Per-file allocation policies - highly useful for VM disks, potential zvol
- killer.
- * Conversion of multi-device filesystems (potentially relevant for btrfs)
- * May be possible to use the new `getfsmap` ioctl; it seems to provide
- device information, and it looks like it iterates the whole FS (perfect
- for doing a whole-FS conversion, so it may even simplify single-device
- cases).
- * A fully-explicit mount syntax that does no scanning at all, possibly with
- a purely-userspace mount.bcachefs helper for scanning that then invokes
- the explicit form. Allows eliminating scanning/registration from the kernel.
- * Strawman: mount -t bcachefs bcachefs -o dev=...,dev=... /mount/point
- * Support the [`i_version` inode attribute][iversion], useful for NFSv4,
- backup tools, indexers, and other things that want to have a counter
- that updates whenever something changes about a file.
- * Support DAX, to allow using NVM devices directly and reduce page-cache
- pressure
- * Caveat: May need careful thinking if the kernel assumes it can write
- directly if DAX is supported; getting checksums invalidated would suck.
- * Support converting filesystems not just to bcachefs volumes, but to image
- files _on_ bcachefs volumes (and once reflinks are in, to both at once)
- * Notably, if conversion made an image as the first step, this could allow
- the subsequent extraction of files to bcachefs files a _completely
- userspace process_, which merely reflinks ranges of content out. This
- would also avoid the risks facing the current system, where the converter
- notes where the data lives in the "inner" bcachefs image, and then the
- content is moved out from under it (such as by GC, or btrfs rebalancing)
- * Support "adopting" devices, by importing their content _into_ an existing
- filesystem using the existing extents on the old device. In theory, could
- supplant "adding a cache" in a minimally-invasive way.
- * Add a bcachefs command to export a path as a block device (satisfy some
- bcache use cases that bcachefs is clumsy for today, and in combination
- with the above two items may make migration from other filesystems easier)
-
-[AESSIV]: https://tools.ietf.org/html/rfc5297
-[secdel]: https://lwn.net/Articles/462437/
-[robigalia]: https://robigalia.org
-[cephback]: http://bryanapperson.com/blog/ceph-osd-performance/
-[pannier]: https://pdfs.semanticscholar.org/fa5f/3aa6de62e126e6fe2986c70a34e4d678860b.pdf
-[triple-parity]: https://www.mail-archive.com/linux-btrfs@vger.kernel.org/msg28964.html
-[RFC6330]: https://tools.ietf.org/html/rfc6330
-[LUKS2-keyslots]: https://fosdem.org/2018/schedule/event/cryptsetup/attachments/slides/2506/export/events/attachments/cryptsetup/slides/2506/fosdem18_cryptsetup_aead.pdf#page=24
-[iversion]: https://jtlayton.wordpress.com/2016/12/16/the-inode-i_version-counter-in-linux/