# "The COW filesystem for Linux that won't eat your data".

Bcachefs is an advanced new filesystem for Linux, with an emphasis on
reliability and robustness and the complete set of features one would expect
from a modern filesystem.

* Copy on write (COW) - like zfs or btrfs
* Full data and metadata checksumming
* Multiple devices
* Replication
* [[Erasure coding|ErasureCoding]] (not stable)
* [[Caching, data placement|Caching]]
* [[Compression]]
* [[Encryption]]
* [[Snapshots]]
* Scalable - has been tested to 100+ TB, expected to scale far higher (testers wanted!)
* High performance, low tail latency
* Already working and stable, with a small community of users
* [[Use Cases|UseCases]]

## Documentation

* [[Getting Started|GettingStarted]]
* [[User manual|bcachefs-principles-of-operation.pdf]]
* [[FAQ]]

## Debugging tools

bcachefs has extensive debugging tools and facilities for inspecting the state
of the system while running: [[Debugging]].

## Development tools

bcachefs development is done with
[[ktest|https://evilpiepirate.org/git/ktest.git]], which is used for both
interactive and automated testing, with a large test suite.

[[Test dashboard|https://evilpiepirate.org/~testdashboard/ci]]

## Philosophy, vision

We prioritize robustness and reliability over features: we make every effort to
ensure you won't lose data. bcachefs builds on a codebase with a pedigree:
bcache already has a reasonably good track record for reliability. Starting
from there, bcachefs development has prioritized incremental development,
keeping things stable, and aggressively fixing design issues as they are found;
the bcachefs codebase is considerably more robust and mature than upstream
bcache.

The long term goal of bcachefs is to produce a truly general purpose filesystem:
 - scalable and reliable for the high end
 - simple and easy to use
 - an extensible and modular platform for new feature development, based on a
   core that is a general purpose database, including potentially distributed storage

## Some technical high points

Filesystems have conventionally been implemented with a great deal of special
purpose, ad-hoc data structures; a filesystem-as-a-database is a rarer beast:

### btree: high performance, low latency

The core of bcachefs is a high performance, low latency b+ tree.
[[Wikipedia|https://en.wikipedia.org/wiki/Bcachefs]] covers some of the
highlights - large, log structured btree nodes, which enable some novel
performance optimizations. As a result, the bcachefs b+ tree is one of the
fastest production ordered key value stores around: [[benchmarks|BtreePerformance]].

Tail latency has also historically been a difficult area for filesystems, due
largely to locking and metadata write ordering dependencies. The database
approach allows bcachefs to shine here as well: it gives us a unified way to
handle locking for all on disk state, and to introduce patterns and techniques
for avoiding the aforementioned dependencies - we can easily avoid holding btree
locks while doing blocking operations, and as a result benchmarks show write
performance to be more consistent than even XFS.

#### Sophisticated transaction model

The main interface between the database layer and the filesystem layer provides
 - Transactions: updates are queued up, and are visible to code running within
   the transaction, but not the rest of the system until a successful
   transaction commit
 - Deadlock avoidance: High level filesystem code need not concern itself with lock ordering
 - Sophisticated iterators
 - Memoized btree lookups, for efficient transaction restart handling, as well
   as greatly simplifying high level filesystem code, which need not pass
   iterators around to avoid unnecessary lookups.
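
The visibility rule in the first bullet above can be illustrated with a toy
model. This is only a sketch in Python, not the bcachefs interface: the class
`ToyTransaction`, its methods, and the `inode:1` key are hypothetical names
invented for illustration, and a plain dict stands in for the btree. It ignores
locking, transaction restarts, and journalling entirely.

```python
class ToyTransaction:
    """Buffers updates against a shared key/value store until commit."""

    _DELETED = object()  # marks a key deleted within the transaction

    def __init__(self, store):
        self.store = store    # shared state, visible to the rest of the system
        self.pending = {}     # queued updates, private to this transaction

    def get(self, key):
        # Reads inside the transaction see queued updates first.
        if key in self.pending:
            value = self.pending[key]
            return None if value is self._DELETED else value
        return self.store.get(key)

    def set(self, key, value):
        self.pending[key] = value

    def delete(self, key):
        self.pending[key] = self._DELETED

    def commit(self):
        # Apply all queued updates to the shared store, then clear the queue.
        for key, value in self.pending.items():
            if value is self._DELETED:
                self.store.pop(key, None)
            else:
                self.store[key] = value
        self.pending.clear()


store = {"inode:1": "size=0"}
trans = ToyTransaction(store)
trans.set("inode:1", "size=4096")

inside_before_commit = trans.get("inode:1")   # the queued update is visible here
outside_before_commit = store["inode:1"]      # the shared store still has the old value

trans.commit()
outside_after_commit = store["inode:1"]       # now the update is visible everywhere
```

Code running inside the transaction reads its own queued writes, while the
shared store is only modified at commit.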

#### Triggers

Like other database systems, bcachefs-the-database provides triggers: hooks run
when keys enter or leave the btree - this is used for e.g. disk space
accounting.

Coupled with the btree write buffer code, this gets us highly efficient
backpointers (for copygc), and in the future an efficient way to maintain an
index-by-hash for data deduplication.
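
The trigger idea, applied to the disk space accounting example, can be sketched
as follows. This is a toy model in Python under stated assumptions, not
bcachefs code: `ToyBtree`, `SpaceAccounting`, the `(old, new)` trigger
signature, and the `extent:*` keys are all hypothetical, and each value is
simply a sector count.

```python
class ToyBtree:
    """A dict standing in for a btree, running triggers on every key update."""

    def __init__(self):
        self.keys = {}
        self.triggers = []

    def add_trigger(self, fn):
        # fn(old, new) receives the value leaving and the value entering;
        # either may be None (pure insert or pure delete).
        self.triggers.append(fn)

    def update(self, key, new):
        old = self.keys.get(key)
        for fn in self.triggers:
            fn(old, new)
        if new is None:
            self.keys.pop(key, None)
        else:
            self.keys[key] = new


class SpaceAccounting:
    """Keeps a running total of sectors referenced by keys in the tree."""

    def __init__(self):
        self.sectors_used = 0

    def trigger(self, old, new):
        if old is not None:
            self.sectors_used -= old   # key leaving the tree
        if new is not None:
            self.sectors_used += new   # key entering the tree


accounting = SpaceAccounting()
extents = ToyBtree()
extents.add_trigger(accounting.trigger)

extents.update("extent:a", 8)      # insert
extents.update("extent:b", 16)     # insert
extents.update("extent:a", 4)      # overwrite: old value leaves, new one enters
extents.update("extent:b", None)   # delete
used_after_updates = accounting.sectors_used
```

The point of the pattern is that the accounting total is never recomputed by
scanning the tree; it is kept up to date as a side effect of each update.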

### Unified codebase

The entire bcachefs codebase can be built and used either inside the kernel, or
in userspace - notably, fsck is not a from-scratch implementation, it's just a
small module in the larger bcachefs codebase.

### Rust

We've got some initial work done on transitioning to Rust, with plans for much
more: here's an example of walking the btree, from Rust:
[[cmd_list|https://evilpiepirate.org/git/bcachefs-tools.git/tree/rust-src/src/cmd_list.rs]]

## Contact and support

Developing a filesystem is not cheap, quick, or easy; we need funding!
Please chip in on [[Patreon|https://www.patreon.com/bcachefs]]

We're also now offering contracts for support and feature development -
[[email|kent.overstreet@gmail.com]] for more info. Check the
[[roadmap|Roadmap]] for ideas on things you might like to support.

Join us in the bcache [[IRC|Irc]] channel, we have a small group of bcachefs
users and testers there: #bcache on OFTC (irc.oftc.net).

Mailing list: [[https://lore.kernel.org/linux-bcachefs/]], or
linux-bcachefs@vger.kernel.org.

Bug trackers: [[bcachefs|https://github.com/koverstreet/bcachefs/issues]],
[[bcachefs-tools|https://github.com/koverstreet/bcachefs-tools/issues]]