bcachefs.git

Age	Commit message	Author
9 days	bcachefs: split out lru_format.h	Kent Overstreet
	Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
9 days	bcachefs: Disk space accounting rewrite	Kent Overstreet
	Main part of the disk accounting rewrite.
	This is a wholesale rewrite of the existing disk space accounting, which relies on percpu counters that are sharded by journal buffer, and rolled up and added to each journal write.
	With the new scheme, every set of counters is a distinct key in the accounting btree; this fixes scaling limitations of the old scheme, where counters took up space in each journal entry and required multiple percpu counters.
	Now, in-memory accounting requires a single set of percpu counters, not multiple for each in-flight journal buffer, and in the future we'll probably also have counters that don't use in-memory percpu counters, since they're not strictly required.
	An accounting update is now a normal btree update, using the btree write buffer path. At transaction commit time, we apply accounting updates to the in-memory counters, which are percpu counters indexed in an eytzinger tree by the accounting key.
	Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
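The lookup described above (counters indexed in an eytzinger tree by accounting key, with deltas applied at commit time) can be sketched as follows. This is a generic, minimal illustration and not the code from this commit: `struct acct_slot`, `eytzinger_build()`, `eytzinger_find()` and `acct_apply_delta()` are hypothetical names, keys are reduced to a single integer, and the percpu sharding is left out.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical in-memory slot: one accounting key and one counter. */
struct acct_slot {
	uint64_t key;
	int64_t  counter;
};

/*
 * Fill eyt[1..n] from sorted[0..n-1]. In the eytzinger layout the children
 * of node i live at 2i and 2i+1, so an in-order walk of the implicit tree
 * visits the slots in sorted key order.
 */
static size_t eytzinger_build(const struct acct_slot *sorted,
			      struct acct_slot *eyt,
			      size_t n, size_t next, size_t node)
{
	if (node <= n) {
		next = eytzinger_build(sorted, eyt, n, next, 2 * node);
		eyt[node] = sorted[next++];
		next = eytzinger_build(sorted, eyt, n, next, 2 * node + 1);
	}
	return next;
}

/* Descend from the root (index 1); returns NULL if the key is absent. */
static struct acct_slot *eytzinger_find(struct acct_slot *eyt, size_t n,
					uint64_t key)
{
	size_t i = 1;

	while (i <= n) {
		if (eyt[i].key == key)
			return &eyt[i];
		i = 2 * i + (eyt[i].key < key);
	}
	return NULL;
}

/* Apply one delta to the counter for key; -1 if the key is not present. */
static int acct_apply_delta(struct acct_slot *eyt, size_t n,
			    uint64_t key, int64_t delta)
{
	struct acct_slot *s = eytzinger_find(eyt, n, key);

	if (!s)
		return -1;
	s->counter += delta;
	return 0;
}
```

The array is 1-indexed in this sketch, so a table of n keys needs n + 1 slots.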
9 days	bcachefs: KEY_TYPE_accounting	Kent Overstreet
	New key type for the disk space accounting rewrite.
	- Holds a variable sized array of u64s (may be more than one for accounting e.g. compressed and uncompressed size, or buckets and sectors for a given data type)
	- Updates are deltas, not new versions of the key: this means updates to accounting can happen via the btree write buffer, which we'll be teaching to accumulate deltas.
	Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
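Because updates are deltas rather than new versions of the key, two pending updates for the same key can be merged by element-wise addition. A minimal sketch of that idea, assuming a fixed-capacity array for simplicity (`struct acct_delta`, `ACCT_NR_MAX` and `acct_delta_accumulate()` are hypothetical names, not the on-disk format); negative deltas are represented through unsigned wraparound:

```c
#include <stddef.h>
#include <stdint.h>

#define ACCT_NR_MAX 4	/* hypothetical cap, only for this sketch */

/* Hypothetical delta: a key plus nr counters (e.g. buckets and sectors). */
struct acct_delta {
	uint64_t key;
	unsigned nr;
	uint64_t d[ACCT_NR_MAX];
};

/*
 * Fold src into dst. Unsigned addition wraps modulo 2^64, so a delta of
 * (uint64_t)-30 subtracts 30. Returns -1 if the two deltas do not describe
 * the same key with the same number of counters.
 */
static int acct_delta_accumulate(struct acct_delta *dst,
				 const struct acct_delta *src)
{
	if (dst->key != src->key || dst->nr != src->nr ||
	    src->nr > ACCT_NR_MAX)
		return -1;

	for (unsigned i = 0; i < src->nr; i++)
		dst->d[i] += src->d[i];
	return 0;
}
```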
9 days	bcachefs: metadata version bucket_stripe_sectors	Kent Overstreet
	New on disk format version for bch_alloc->stripe_sectors and BCH_DATA_unstriped - accounting for unstriped data in stripe buckets. Upgrade/downgrade requires regenerating alloc info - but only if erasure coding is in use. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
9 days	bcachefs: BCH_DATA_unstriped	Kent Overstreet
	Add a new pseudo data type, to track buckets that are members of a stripe, but have unstriped data in them. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
13 days	bcachefs: Fix safe errors by default	Kent Overstreet
	i.e. the start of automatic self healing: if errors=continue or fix_safe, we now automatically fix simple errors without user intervention.
	New error action option: fix_safe
	This replaces the existing errors=ro option, which gets a new slot, i.e. existing errors=ro users now get errors=fix_safe.
	This is currently only enabled for a limited set of errors - initially just disk accounting; errors we would never not want to fix, and we don't want to require user intervention (i.e. to make sure a bug report gets filed).
	Errors will still be counted in the superblock, so we (developers) will still know they've been occurring if a bug report gets filed (as bug reports typically include the errors superblock section).
	Eventually we'll be enabling this for a much wider set of errors, after we've done thorough error injection testing.
	Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
13 days	bcachefs: Guard against overflowing LRU_TIME_BITS	Kent Overstreet
	LRUs only have 48 bits for the time field (i.e. LRU order); thus we need overflow checks and guards. Reported-by: syzbot+df3bf3f088dcaa728857@syzkaller.appspotmail.com Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
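A guard of the kind described here might look like the sketch below. The packing (an LRU id in the top 16 bits, the time in the low 48) and the helper names `lru_pos_pack()` and `lru_pos_time()` are assumptions made for illustration; only the 48-bit width comes from the commit message.

```c
#include <stdint.h>

#define LRU_TIME_BITS	48
#define LRU_TIME_MAX	((1ULL << LRU_TIME_BITS) - 1)

/*
 * Pack an id and a time into one 64-bit position. A time that does not fit
 * in 48 bits would spill into the id bits, so it is rejected instead.
 */
static int lru_pos_pack(uint16_t lru_id, uint64_t time, uint64_t *out)
{
	if (time > LRU_TIME_MAX)
		return -1;

	*out = ((uint64_t)lru_id << LRU_TIME_BITS) | time;
	return 0;
}

/* Recover the time field from a packed position. */
static uint64_t lru_pos_time(uint64_t packed)
{
	return packed & LRU_TIME_MAX;
}
```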
13 days	bcachefs: Fix btree ID bitmasks	Kent Overstreet
	these should be 64 bit bitmasks, not 32 bit. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
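The class of bug is easy to reproduce in isolation: with a 32-bit `int`, `1U << id` is undefined once `id` reaches 32, so a mask indexed by btree ID needs a 64-bit one as the shifted operand. A minimal sketch, with hypothetical helper names:

```c
#include <assert.h>
#include <stdint.h>

/* Bit for a given ID in a 64-bit mask; the operand being shifted is 64 bits wide. */
static uint64_t btree_id_bit(unsigned id)
{
	assert(id < 64);
	return UINT64_C(1) << id;
}

/* Nonzero if the bit for id is set in mask. */
static int btree_mask_test(uint64_t mask, unsigned id)
{
	return (mask & btree_id_bit(id)) != 0;
}
```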
2024-05-28	bcachefs: Split out sb-errors_format.h	Kent Overstreet
	Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-05-28	bcachefs: Split out journal_seq_blacklist_format.h	Kent Overstreet
	Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-05-28	bcachefs: Split out replicas_format.h	Kent Overstreet
	Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-05-28	bcachefs: Split out disk_groups_format.h	Kent Overstreet
	Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-05-28	bcachefs: split out sb-downgrade_format.h	Kent Overstreet
	Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-05-28	bcachefs: split out sb-members_format.h	Kent Overstreet
	Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-05-20	bcachefs: Fix shift overflow in btree_lost_data()	Kent Overstreet
	Reported-by: syzbot+29f65db1a5fe427b5c56@syzkaller.appspotmail.com Fixes: 55936afe1107 ("bcachefs: Flag btrees with missing data") Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-05-08	bcachefs: Move BCACHEFS_STATFS_MAGIC value to UAPI magic.h	Petr Vorel
	Move BCACHEFS_STATFS_MAGIC value to UAPI <linux/magic.h> under BCACHEFS_SUPER_MAGIC definition (use common approach for name) and reuse the definition in bcachefs_format.h BCACHEFS_STATFS_MAGIC. There are other bcachefs magic definitions: BCACHE_MAGIC, BCHFS_MAGIC, which use UUID_INIT() and are used only in libbcachefs. Therefore move only BCACHEFS_STATFS_MAGIC value, which can be used outside of libbcachefs for f_type field in struct statfs in statfs() or fstatfs(). Suggested-by: Su Yue <glass.su@suse.com> Signed-off-by: Petr Vorel <pvorel@suse.cz> Acked-by: Brian Foster <bfoster@redhat.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
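From userspace, the use case mentioned here (checking the `f_type` field returned by `statfs()`) could look roughly like this on Linux. The fallback value 0xca451a4e is stated from memory of the kernel headers and should be treated as an assumption to verify against `<linux/magic.h>`; `path_is_bcachefs()` is a hypothetical helper.

```c
#include <stdint.h>
#include <sys/vfs.h>

/* Newer UAPI headers define this in <linux/magic.h>; fallback for older ones. */
#ifndef BCACHEFS_SUPER_MAGIC
#define BCACHEFS_SUPER_MAGIC 0xca451a4e	/* assumed value, verify locally */
#endif

/*
 * Returns 1 if path is on a bcachefs filesystem, 0 if it is not,
 * and -1 if statfs() fails (errno is left set by statfs()).
 */
static int path_is_bcachefs(const char *path)
{
	struct statfs st;

	if (statfs(path, &st) != 0)
		return -1;

	/* Compare as 32-bit values: f_type is a signed type on some ABIs. */
	return (uint32_t)st.f_type == (uint32_t)BCACHEFS_SUPER_MAGIC;
}
```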
2024-05-08	bcachefs: bch_member.last_journal_bucket	Kent Overstreet
	On recovery from clean shutdown we don't typically read the journal, but we still want to avoid overwriting existing entries in the journal for list_journal debugging. Thus, add some fields to the member info section so we can remember where we left off. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-05-06	bcachefs: BCH_SB_LAYOUT_SIZE_BITS_MAX	Kent Overstreet
	Define a constant for the max superblock size, to avoid a too-large shift. Reported-by: syzbot+a8b0fb419355c91dda7f@syzkaller.appspotmail.com Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-05-06	bcachefs: Add a better limit for maximum number of buckets	Kent Overstreet
	The bucket_gens array is a single array allocation (one byte per bucket), and kernel allocations are still limited to INT_MAX. Check this limit to avoid failing the bucket_gens array allocation. Reported-by: syzbot+b29f436493184ea42e2b@syzkaller.appspotmail.com Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
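A check of this shape can be sketched as below: since the array costs one byte per bucket, the bucket count itself must not exceed `INT_MAX`. The function name and the way the count is derived from device and bucket size are assumptions for illustration.

```c
#include <limits.h>
#include <stdint.h>

/*
 * Derive the bucket count for a device and reject it if a one-byte-per-bucket
 * array would exceed the INT_MAX allocation limit. Returns 0 on success.
 */
static int bucket_count_check(uint64_t dev_bytes, uint64_t bucket_bytes,
			      uint64_t *nbuckets)
{
	uint64_t n;

	if (!bucket_bytes)
		return -1;

	n = dev_bytes / bucket_bytes;
	if (n > INT_MAX)
		return -1;

	*nbuckets = n;
	return 0;
}
```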
2024-04-17	bcachefs: KEY_TYPE_error is allowed for reflink	Kent Overstreet
	KEY_TYPE_error is left behind when we have to delete all pointers in an extent in fsck; it allows errors to be correctly returned by reads later. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-04-14	bcachefs: bch_member.btree_allocated_bitmap	Kent Overstreet
	This adds a small (64 bit) per-device bitmap that tracks ranges that have btree nodes, for accelerating btree node scan if it is ever needed.
	- New helpers, bch2_dev_btree_bitmap_marked() and bch2_dev_bitmap_mark(), for checking and updating the bitmap
	- Interior btree update path updates the bitmaps when required
	- The check_allocations pass has a new fsck_err check, btree_bitmap_not_marked
	- New on disk format version, mi_btree_bitmap, which indicates the new bitmap is present
	- Upgrade table lists the required recovery pass and expected fsck error
	- Btree node scan uses the bitmap to skip ranges if we're on the new version
	Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
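The idea of a 64-bit bitmap covering a whole device can be sketched as follows: each bit stands for a range of `1 << shift` sectors, marking sets every bit a range touches, and a scan may skip any range whose bits are not all set. This is an illustration under assumptions, not the helpers named above: `struct dev_bitmap` and the `dev_bitmap_*()` functions are hypothetical, and the sketch simply asserts that the device fits in 64 ranges rather than adjusting the shift.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical per-device bitmap: bit i covers sectors [i << shift, (i + 1) << shift). */
struct dev_bitmap {
	uint64_t bits;
	unsigned shift;
};

/* First and last bit index touched by the sector range [start, start + len). */
static void dev_bitmap_range(const struct dev_bitmap *b, uint64_t start,
			     uint64_t len, uint64_t *first, uint64_t *last)
{
	assert(len > 0 && b->shift < 64);
	*first = start >> b->shift;
	*last = (start + len - 1) >> b->shift;
	assert(*last < 64);
}

/* Record that the range holds btree nodes. */
static void dev_bitmap_mark(struct dev_bitmap *b, uint64_t start, uint64_t len)
{
	uint64_t first, last;

	dev_bitmap_range(b, start, len, &first, &last);
	for (uint64_t bit = first; bit <= last; bit++)
		b->bits |= UINT64_C(1) << bit;
}

/* Nonzero only if every bit covering the range is set. */
static int dev_bitmap_marked(const struct dev_bitmap *b, uint64_t start,
			     uint64_t len)
{
	uint64_t first, last;

	dev_bitmap_range(b, start, len, &first, &last);
	for (uint64_t bit = first; bit <= last; bit++)
		if (!(b->bits & (UINT64_C(1) << bit)))
			return 0;
	return 1;
}
```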
2024-04-13	bcachefs: Standardize helpers for printing enum strs with bounds checks	Kent Overstreet
	Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-04-09	bcachefs: Don't scan for btree nodes when we can reconstruct	Kent Overstreet
	Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-04-03	bcachefs: Flag btrees with missing data	Kent Overstreet
	We need this to know when we should attempt to reconstruct the snapshots btree Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-03-13	bcachefs: omit alignment attribute on big endian struct bkey	Thomas Bertschinger
	This is needed for building Rust bindings on big endian architectures like s390x. Currently this is only done in userspace, but it might happen in-kernel in the future.
	When creating a Rust binding for struct bkey, the "packed" attribute is needed to get a type with the correct member offsets in the big endian case. However, rustc does not allow types to have both a "packed" and "align" attribute. Thus, in order to get a Rust type compatible with the C type, we must omit the "aligned" attribute in C.
	This does not affect the struct's size or member offsets, only its toplevel alignment, which should be an acceptable impact.
	The little endian version can have the "align" attribute because the "packed" attr is redundant, and rust-bindgen will omit the "packed" attr when an "align" attr is present and it can do so without changing a type's layout.
	Signed-off-by: Thomas Bertschinger <tahbertschinger@gmail.com>
	Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
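The claim that dropping "aligned" changes only the top-level alignment can be checked on a toy struct with GCC or Clang attributes. The structs below are hypothetical stand-ins, not struct bkey; they are chosen so that the packed size is already a multiple of 8, which is the condition under which the size stays the same.

```c
#include <stddef.h>
#include <stdint.h>

/* Packed only: members are laid out back to back, alignment is 1. */
struct demo_packed {
	uint32_t a;
	uint64_t b;
	uint32_t c;
} __attribute__((packed));

/* Packed and aligned: same member offsets, but the struct aligns to 8. */
struct demo_packed_aligned {
	uint32_t a;
	uint64_t b;
	uint32_t c;
} __attribute__((packed, aligned(8)));

/* Nonzero if size and member offsets agree between the two variants. */
static int demo_layouts_match(void)
{
	return sizeof(struct demo_packed) == sizeof(struct demo_packed_aligned) &&
	       offsetof(struct demo_packed, b) ==
			offsetof(struct demo_packed_aligned, b) &&
	       offsetof(struct demo_packed, c) ==
			offsetof(struct demo_packed_aligned, c);
}
```

With a packed size that is not a multiple of the requested alignment, the aligned variant would be padded at the end and the sizes would differ.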
2024-03-13	bcachefs: BTREE_ID_subvolume_children	Kent Overstreet
	Add a btree to record parent -> child subvolume relationships, according to the filesystem hierarchy.
	The subvolume_children btree is a bitset btree: if a bit is set at pos p, that means p.offset is a child of subvolume p.inode.
	This will be used for efficiently listing subvolumes, as well as recursive deletion.
	Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
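The bitset encoding can be modelled with a sorted array of positions standing in for the btree: a position being present means the bit is set, and listing the children of a subvolume is a range scan over all positions with that inode. The types and function names below are hypothetical and only illustrate the encoding.

```c
#include <stddef.h>
#include <stdint.h>

/* A set position: offset is a child of subvolume inode. */
struct pos {
	uint64_t inode;
	uint64_t offset;
};

/* Index of the first entry with entry.inode >= inode (set is sorted). */
static size_t pos_lower_bound(const struct pos *set, size_t n, uint64_t inode)
{
	size_t lo = 0, hi = n;

	while (lo < hi) {
		size_t mid = lo + (hi - lo) / 2;

		if (set[mid].inode < inode)
			lo = mid + 1;
		else
			hi = mid;
	}
	return lo;
}

/* Copy up to max children of parent into out; returns how many were found. */
static size_t subvol_list_children(const struct pos *set, size_t n,
				   uint64_t parent, uint64_t *out, size_t max)
{
	size_t nr = 0;

	for (size_t i = pos_lower_bound(set, n, parent);
	     i < n && set[i].inode == parent && nr < max; i++)
		out[nr++] = set[i].offset;
	return nr;
}
```

Recursive deletion would follow the same pattern, calling the listing again for each child that is returned.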
2024-03-13	bcachefs: bch_subvolume::fs_path_parent	Kent Overstreet
	Record the filesystem path hierarchy for subvolumes in bch_subvolume.
	Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-03-10	bcachefs: jset_entry_datetime	Kent Overstreet
	This gives us a way to record the date and time every journal entry was written - useful for debugging. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-01-21	bcachefs: logged_ops_format.h	Kent Overstreet
	Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-01-21	bcachefs: reflink_format.h	Kent Overstreet
	Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-01-21	bcachefs: extents_format.h	Kent Overstreet
	Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-01-21	bcachefs: ec_format.h	Kent Overstreet
	Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-01-21	bcachefs: subvolume_format.h	Kent Overstreet
	Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-01-21	bcachefs: snapshot_format.h	Kent Overstreet
	Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-01-21	bcachefs: alloc_background_format.h	Kent Overstreet
	Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-01-21	bcachefs: xattr_format.h	Kent Overstreet
	Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-01-21	bcachefs: dirent_format.h	Kent Overstreet
	Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-01-21	bcachefs: inode_format.h	Kent Overstreet
	Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-01-21	bcachefs: quota_format.h	Kent Overstreet
	Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-01-21	bcachefs: sb-counters_format.h	Kent Overstreet
	bcachefs_format.h has gotten too big; let's do some organizing. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-01-21	bcachefs: comment bch_subvolume	Kent Overstreet
	Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-01-21	bcachefs: bch_snapshot::btime	Kent Overstreet
	Add a field to bch_snapshot for creation time; this will be important when we start exposing the snapshot tree to userspace. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-01-05	bcachefs: Upgrades now specify errors to fix, like downgrades	Kent Overstreet
	Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-01-05	bcachefs: bch_member->seq	Kent Overstreet
	Add new fields for split brain detection:
	- bch_member->seq, which tracks the sequence number of the last superblock write that happened to each member device
	- bch_sb->write_time, which tracks the time of the last superblock write, to allow detection of when two members have diverged but had the same number of superblock writes.
	Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
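How the two fields complement each other can be shown with a deliberately simplified comparison: a lower sequence number means that superblock missed writes, while equal sequence numbers with different write times indicate the members diverged. The enum, struct and function below are hypothetical, and the real detection logic is likely more involved than this sketch.

```c
#include <stdint.h>

enum sb_relation {
	SB_IN_SYNC,	/* same seq, same write time */
	SB_A_STALE,	/* a saw fewer superblock writes than b */
	SB_B_STALE,	/* b saw fewer superblock writes than a */
	SB_DIVERGED,	/* same number of writes, but not the same writes */
};

/* Hypothetical pair of values read from one member's superblock. */
struct sb_stamp {
	uint64_t seq;
	uint64_t write_time;
};

static enum sb_relation sb_compare(struct sb_stamp a, struct sb_stamp b)
{
	if (a.seq < b.seq)
		return SB_A_STALE;
	if (a.seq > b.seq)
		return SB_B_STALE;
	return a.write_time == b.write_time ? SB_IN_SYNC : SB_DIVERGED;
}
```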
2024-01-05	bcachefs: Check journal entries for invalid keys in trans commit path	Kent Overstreet
	Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-01-01	bcachefs: fix userspace build errors	Kent Overstreet
	Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-01-01	bcachefs: btree write buffer now slurps keys from journal	Kent Overstreet
	Previously, the transaction commit path would have to add keys to the btree write buffer as a separate operation, requiring additional global synchronization.
	This patch introduces a new journal entry type, which indicates that the keys need to be copied into the btree write buffer prior to being written out. We switch the journal entry type back to JSET_ENTRY_btree_keys prior to write, so this is not an on disk format change.
	Flushing the btree write buffer may require pulling keys out of journal entries yet to be written, and quiescing outstanding journal reservations; we previously added journal->buf_lock for synchronization with the journal write path.
	We also can't put strict bounds on the number of keys in the journal destined for the write buffer, which means we might overflow the size of the preallocated buffer and have to reallocate - this introduces a potentially fatal memory allocation failure. This is something we'll have to watch for; if it becomes an issue in practice we can do additional mitigation.
	The transaction commit path no longer has to explicitly check if the write buffer is full and wait on flushing; this is another performance optimization. Instead, when the btree write buffer is close to full we change the journal watermark, so that only reservations for journal reclaim are allowed.
	Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-01-01	bcachefs: Improve btree write buffer tracepoints	Kent Overstreet
	- add a tracepoint for write_buffer_flush_sync; this is expensive
	- fix the write_buffer_flush_slowpath tracepoint
	Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-01-01	bcachefs: Kill dev_usage->buckets_ec	Kent Overstreet
	This counter is redundant; it's simply the sum of BCH_DATA_stripe and BCH_DATA_parity buckets. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-01-01	bcachefs: Rename bch_replicas_entry -> bch_replicas_entry_v1	Kent Overstreet
	Prep work for introducing bch_replicas_entry_v2 Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>