Age  Commit message  Author
2015-02-12bcache: Validate bkey formatbcache-dev-februaryKent Overstreet
Change-Id: Ie30284518f55388925da92e7e0c4e13e77f54a90
2015-02-12bcache: Hook up new btree node fieldsKent Overstreet
Change-Id: Id688153f065d685af18a8e2515edbb5991c86559
2015-02-12bcache: New btree node formatKent Overstreet
Change-Id: I48c8ee1ac9648c13411de92583fc18d43285c7d7
2015-02-12bcache: Bkey format field offsetsKent Overstreet
Change-Id: I1e873cbd57043de3c07d3c798860b9c454d3134c
2015-02-12bcache: Add accounting for nr packed/unpacked keysKent Overstreet
Change-Id: I7baddf048e05ca6c396ac74987c00f90741a6024 Signed-off-by: Kent Overstreet <>
2015-02-12bcache: New debugfs codeKent Overstreet
Change-Id: I63e2470f0145ca7ca2e1a0d44758106f3c9a539d
2015-02-12bcache: Move some assertions to debug buildsKent Overstreet
Change-Id: I091cf50d9c0bac4ce111daef169bb992523bf47a
2015-02-12bcache: Pointer compression for btree_node_iterKent Overstreet
Change-Id: Iad777ebaec333070b5d5bcd776da6a282dd96792
2015-02-12bcache: Drop btree_node_iter->b, btree_node_iter->sizeKent Overstreet
Change-Id: I2f15b3cd79c7462f59fb86fd93b936b0fcbd27a9
2015-02-12bcache: Packed bkeysKent Overstreet
Change-Id: I23b3d03524fa57d70e016030c55f9acd64affd89
2015-02-12bch_query_uuid now returns user_uuid of cache-set instead of set_uuid (which is internal uuid)Raghu Krishnamurthy
Issue DAT- Change-Id: Ide5271808526c6be2aa88b64b4ca824c7acc0b9f
2015-02-12bcache: Kill bch_btree_count_u64s()Kent Overstreet
Since we only merge extents when doing a full sort, we don't need this anymore. Change-Id: I9760c7b1b9eb0f99e127ac1935c1dd4f171eb5c1
2015-02-12bcache: Validate, show btree node sizeKent Overstreet
Change-Id: Iac244252cbdb9f069fd548e9384688b2ff624d32
2015-02-12bcache: Make __ptr_invalid() more explicitKent Overstreet
also, more useful bkey_to_text() Change-Id: Idd602ea44a36feb06d0d0823783863dc44a6a1e3
2015-02-12bcache: Better inliningKent Overstreet
Inlining bch_bset_search() turned out to be a performance regression (inlining?), also we don't actually want to inline bch_btree_node_iter_push(), so do it like this instead. Change-Id: Ibee051f7e7fe7f4a72e879c155068eeb7ef2b564
2015-02-12bcache: Fix compiler warningsKent Overstreet
gcc complains about unused results with the old definition of EBUG_ON. -Werror was accidentally turned off; turn it back on. Change-Id: Ibe35eef48362ea80cc9b64ed0592d31e1cb2b76a
2015-02-12bcache: fix rare race on first startup of fresh cache setSlava Pestov
On the very first startup of a cache set, we would set CACHE_SYNC to false and then initialize the first journal entry by calling bch_journal_next(). If the allocator thread calls bch_journal_meta() in between these two steps, bad things could happen:
a) we could dereference a NULL pointer because c->journal.cur wasn't set yet
b) if we were in between setting c->journal.cur and pushing the first entry onto c->, journal_reclaim_fast() would hang when trying to pop elements off c->
Change the order of these two steps so that bch_journal_meta() doesn't try to get a journal reservation before bch_journal_next() has been called, and change the FIFO reclaim loop to BUG_ON if the FIFO is empty instead of just looping.
Change-Id: I1a13a81ca3779bced6e453158f4793e408ef9d57
2015-02-12bcache: mca_alloc() never fails with -ENOMEMSlava Pestov
This would previously happen if all nodes in the cache were intent-locked. This is very unlikely to happen, so instead of failing IOs, just try to reap a node again. Issue DAT-1050 Change-Id: Iadc53d65985558241e9088e64683436266d1dc6e
2015-02-12bcache: skip non-extents in bch_sectors_dirty_init()Slava Pestov
On bootup, we would count discards as dirty sectors on the backing device. This was wrong. This fixes a regression from "New bkey format" or "Don't insert deleted keys with nonzero size", depending on your political beliefs. Issue DAT-1844 Change-Id: I77469274b35c4c99bbcfe320eca3cc35ae6da8ad
2015-02-12bcache: add more dynamic faults for init and device add pathsSlava Pestov
Also fix bugs in device add path exposed by these. Issue DAT-1050 Change-Id: Ic69128eaa920cedaf3301be6800acb7dbe5b0003
2015-02-12bcache: enforce minimum journal entry sizeSlava Pestov
This lowers write latency by reducing the likelihood that we fill up a journal entry really quickly while the previous journal write is still in progress. Previously, this would often happen when we were near the end of a journal bucket. Now, we just skip to the next journal bucket if we can't get a 32KiB journal entry. Change-Id: I08a0b40c75bb6b3486fd77f91a3caca7d9fffcf6
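The bucket-skipping rule in this commit message can be sketched roughly as follows. This is an illustrative userspace model, not the bcache code: struct journal_pos, next_entry_pos() and MIN_JOURNAL_ENTRY_BYTES are hypothetical names, sizes are in bytes for simplicity, and only the 32KiB threshold comes from the message itself.

```c
#include <assert.h>
#include <stddef.h>

/* Threshold taken from the commit message; the name is hypothetical. */
#define MIN_JOURNAL_ENTRY_BYTES (32u * 1024u)

/* Hypothetical model of where the next journal entry would start. */
struct journal_pos {
    size_t bucket;  /* index of the current journal bucket */
    size_t offset;  /* bytes already used in that bucket */
};

/*
 * Return the position for the next entry. If the space left in the
 * current bucket is smaller than the minimum entry size, skip to the
 * start of the next bucket (wrapping around) instead of opening a
 * tiny entry that would fill up almost immediately.
 */
struct journal_pos next_entry_pos(struct journal_pos cur,
                                  size_t bucket_bytes, size_t nr_buckets)
{
    assert(cur.offset <= bucket_bytes);

    if (bucket_bytes - cur.offset < MIN_JOURNAL_ENTRY_BYTES) {
        cur.bucket = (cur.bucket + 1) % nr_buckets;
        cur.offset = 0;
    }
    return cur;
}
```

In this model, a bucket with only 4KiB left is abandoned in favor of the next one, while an empty bucket is used as-is. A real implementation would also have to check that the next bucket is actually free, which this sketch leaves out.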
2015-02-12bcache: get c->verify mode working againSlava Pestov
Probably a waste of time, but I noticed it wasn't being tested in coverage reports. Change-Id: I68a46634ffbbb026cbcf9f16a0616463d2024d91
2015-02-12bcache: Minor refactoringKent Overstreet
Change-Id: I63a9609938432da27c51468736322f62047fb315
2015-02-12bcache: Better bch_btree_node_iter_verify()Kent Overstreet
Change-Id: Id0f750939bd99626f73dbdf2ed73757dcea7a6bf
2015-02-12bcache: Better inliningKent Overstreet
bch_bset_search() is now only called in the one place, so flatten that function instead Change-Id: I6a41b1bfa23f1abe4d8cd73bd1a1120d704a184c
2015-02-12bcache: NO_IOKent Overstreet
Change-Id: I9101859f9cd96b5a5e29ee27c1c055004dd4062f
2015-02-12bcache: don't evaluate EBUG_ON() expressions when not in debug modeKent Overstreet
Change-Id: I14c6c1328a2a7509d024aebedf86a9d8572fced9
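One common way to get a debug-only assertion that does not evaluate its argument in non-debug builds is to wrap the expression in sizeof. The sketch below shows that general technique; it is not claimed to be the definition this commit introduced. SKETCH_DEBUG, expensive_check() and run_once() are hypothetical names, and assert() stands in for the kernel's BUG_ON().

```c
#include <assert.h>

/* SKETCH_DEBUG is a hypothetical build switch, not a real kernel config symbol. */
#ifdef SKETCH_DEBUG
#define EBUG_ON(cond) assert(!(cond))
#else
/*
 * The operand of sizeof is type-checked but never evaluated (it is not a
 * variable-length array), and the (void) cast keeps the compiler from
 * warning about an unused value.
 */
#define EBUG_ON(cond) ((void) sizeof(!(cond)))
#endif

int check_calls; /* counts how often the "expensive" check really runs */

int expensive_check(void)
{
    check_calls++;
    return 0; /* 0 means "no bug found" */
}

/* Returns the number of times expensive_check() has actually been called. */
int run_once(void)
{
    EBUG_ON(expensive_check());
    return check_calls;
}
```

Built without SKETCH_DEBUG, run_once() never calls expensive_check(), yet the expression still has to compile, so it cannot silently rot.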
2015-02-12bcache: Fix a null ptr deref in btree_iter_traverse()Kent Overstreet
btree_iter_node_set() does the lookup within the node, so we don't want to do it while the node is still empty... Change-Id: I1aca2a9f47710df36f64bf8501ac7ef6aaff60b6
2015-02-12bcache: remove over-eager BUG_ONSlava Pestov
This was added in "fix journal reclaim deadlock during journal replay" but we don't believe it's actually helpful. Change-Id: I5610cad75fa1edc31716fab841b239fe3a00440b
2015-02-12bcache: make discard work like it did before when version is zeroSlava Pestov
- list_extents ioctl skips discard keys with zero version
- inserting discard key with zero version unconditionally deletes overlapping keys
Change-Id: Iff4053a5b446d9114e7ddcb764f15f6f49d6d3ab
2015-02-12bcache: fix init error pathSlava Pestov
If cache_set_alloc() fails before we add ourselves to sysfs, we would end up calling kobject_del() on a kobject that hasn't been added yet. This was exposed by the new init fault added recently. Change-Id: I8e913ceb6eab59a609f2087c13b6d96fe0a1914c
2015-02-12bcache: kick off background journal reclaim eagerlySlava Pestov
This patch changes bch_journal_reclaim_fast() to return that reclaim is needed before we're completely out of journal space. This allows background reclaim to overlap with using more journal buckets from bch_journal_next_bucket(), eliminating stalls from the write path waiting on a journal reservation. Change-Id: If13b9357ec3f76c13299e27204d60443d3719023
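The difference between reporting "reclaim needed" only when space is exhausted and reporting it early can be illustrated with a low-water mark. This is only a sketch: both function names and the one-quarter threshold are assumptions made for the example, and the commit message does not say which threshold bcache uses.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Lazy policy: only ask for reclaim once no journal buckets are free. */
bool reclaim_needed_lazy(size_t free_buckets, size_t nr_buckets)
{
    assert(free_buckets <= nr_buckets);
    return free_buckets == 0;
}

/*
 * Eager policy: ask for reclaim once a quarter or less of the buckets
 * remain free (hypothetical threshold), so background reclaim can run
 * while writers keep consuming the remaining buckets.
 */
bool reclaim_needed_eager(size_t free_buckets, size_t nr_buckets)
{
    assert(free_buckets <= nr_buckets);
    return free_buckets * 4 <= nr_buckets;
}
```

With 8 buckets, the eager policy starts reclaim when 2 are still free, whereas the lazy one waits until writers are already stalled.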
2015-02-12bcache: do btree node flushing in a work itemSlava Pestov
This is the first patch preparing us for background journal reclaim. Change-Id: Ic8622b75eba32ef99cf8694a68afc459a57cc238
2015-02-12bcache: improve journal tracepointsSlava Pestov
Change-Id: I0893e6555d399757a24d51c0495180ed87069479
2015-02-12bcache: rename c-> to c->journal.write_workSlava Pestov
Change-Id: Ia4bcd4490b7ff1f00d7c146e26b9f65ae1325ec6
2015-02-12bcache: rename journal_reclaim() to journal_next_bucket()Slava Pestov
This more accurately describes what it is doing. Change-Id: I8c8228ebccef7e2d4dbbfd521a355974b05a396c
2015-02-12bcache: don't run journal_reclaim() logic if current journal bucket has spaceSlava Pestov
We can just start a new entry in this case, only doing all the other stuff when the journal bucket is completely full. Change-Id: I0b24cd9b100dc33144507f48a49ab7d64f9a0eeb
2015-02-12bcache: move btree node flush to journal_reclaim()Slava Pestov
This is cleaner than having the caller do it. Change-Id: Ib6b5cc9cc64b94b6de662bef92b3acc62f35e762
2015-02-12bcache: don't need to call journal_reclaim() when kicking off journal writeSlava Pestov
Change-Id: I187e208bde9665f61bb95722df54d2267da88fd6
2015-02-12bcache: fix faulty logic in bch_journal_res_get()Slava Pestov
If current journal entry was completely full, we should try to write it before doing a journal reclaim. Otherwise we might do a reclaim for nothing and just sit there waiting 10ms for the timer to write the entry. This fixes a very old regression from "journal reservations". Change-Id: Ia083999130dcd6180b074efd0ba7a7fdc98d8c3a
2015-02-12bcache: only wake up journal.wait if pin refcount is 0Slava Pestov
Change-Id: Ife974d6aae1f1bebc32f9cb762a64ce6c29cc4b6
2015-02-12bcache: fix journal reclaim deadlock during journal replaySlava Pestov
Journal reclaim has to work during journal replay, because the allocator might need to invalidate buckets and write out prios and gens, or because we might need to set a new btree root. The recent patch "Fix journal replay" made this work by tracking reference counts on journal entries during replay in the same way that we do during normal operation, except that a reference is dropped once an entry has been replayed rather than dropping a reference when an entry has been written out.
The problem with that patch is that we might start replay with a completely full journal, and be unable to add any new journal entries until the first bucket of entries has been replayed. If replaying the first bucket of entries required allocating buckets, we would deadlock in the allocator thread while waiting on a journal entry to write out prios and gens, because we would be unable to reclaim any journal buckets -- no entries have been replayed yet.
Dig ourselves out of this hole by priming the allocator freelists with completely free buckets, by extending the existing logic to prime the PRIO freelist to prime all freelists. Also, wake up any threads waiting on reclaim when we drop a journal entry's reference count. Finally, add a BUG_ON() to ensure that flushing btree nodes makes forward progress during replay.
Change-Id: I608b03bdf196834d22cda16d427555da50c344e5
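The reference-count scheme this message relies on can be modeled in a few lines. The sketch below is a simplified userspace illustration, not the bcache data structures: struct pin_fifo, pin_put() and reclaim() are hypothetical names, and the fixed-size ring is an assumption made to keep the example short.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define NR_ENTRIES 4 /* hypothetical, deliberately tiny journal */

/* Ring of per-entry reference counts, oldest entry at 'front'. */
struct pin_fifo {
    int    refs[NR_ENTRIES];
    size_t front; /* index of the oldest unreclaimed entry */
    size_t used;  /* number of unreclaimed entries */
};

/*
 * Drop one reference on entry 'idx' (for example after it has been
 * replayed). Returns true when the count reaches zero, which is the
 * point at which threads waiting on reclaim should be woken.
 */
bool pin_put(struct pin_fifo *f, size_t idx)
{
    assert(f->refs[idx] > 0);
    return --f->refs[idx] == 0;
}

/*
 * Pop entries from the front while they have no references left.
 * Returns how many entries were reclaimed. An entry that is still
 * pinned blocks everything behind it, which is why a full journal
 * with nothing replayed yet cannot make progress.
 */
size_t reclaim(struct pin_fifo *f)
{
    size_t n = 0;

    while (f->used > 0 && f->refs[f->front] == 0) {
        f->front = (f->front + 1) % NR_ENTRIES;
        f->used--;
        n++;
    }
    return n;
}
```

In this model, reclaim() returns 0 for as long as the oldest entry is pinned, no matter how many later entries are free, which mirrors the deadlock the message describes when replay starts with a full journal.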
2015-02-12bcache: add BUG_ONs for suspected memory scribble around btree_node_iter_next_all()Slava Pestov
Issue DAT-1868 Change-Id: Ie34228c1425e88ba20feb7a959e89562b020b140
2015-02-12bcache: fix NULL deref in init error pathSlava Pestov
This fixes a regression from "notify user space of state changes using kobject_uevent". Change-Id: I0c2b2e2aab42d7a8c571c952e0ef662e2aec4cb9
2015-02-12bcache: add remove_failed notificationSlava Pestov
Change-Id: I1c2b68248eefd77b48fc5deb15e2908db6ef3f28
2015-02-12bcache: fix handling of PTR_LOST_DEV keysSlava Pestov
If we lose all copies of a key, we want to fail the read, not treat it as a hole in the keyspace, so don't mark the key as deleted. Instead, set the key type to error. This fixes a regression from "New bkey format". Also, instead of having bch_extent_normalize() set a key's deleted flag, just return true if the key should be dropped. If the key is an extent and has no pointers, it becomes a discard. This could come up in bch_flag_key_bad(). If a cached key points to a device that has gone bad, we end up dropping all pointers from the key. This would cause us to insert a deleted key, which triggers a BUG_ON ever since "Don't insert deleted keys with nonzero size". What we want instead is to discard this range of the keyspace -- we just lost some cached data, which is not an error. Change-Id: Ic57e8337bb111d0fb9aefe223368d131eb2229ca
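The three outcomes described in this message (drop the key, turn it into an error, or turn it into a discard) can be sketched as a small decision function. This is a loose model under stated assumptions, not bch_extent_normalize() itself: the struct layout, the enum values, the lost_all_copies flag, and the rule that zero-size keys are the ones to drop are all hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical key types for the sketch. */
enum key_type { KEY_EXTENT, KEY_DISCARD, KEY_ERROR };

struct key {
    enum key_type type;
    unsigned      size;            /* sectors covered by the key */
    unsigned      nr_ptrs;         /* pointers still attached */
    bool          lost_all_copies; /* hypothetical: every copy was on a lost device */
};

/*
 * Returns true if the caller should drop the key, instead of setting a
 * "deleted" flag on the key itself.
 */
bool extent_normalize(struct key *k)
{
    assert(k != NULL);

    if (k->size == 0)               /* assumption: empty keys are dropped */
        return true;
    if (k->type != KEY_EXTENT)
        return false;

    if (k->lost_all_copies)
        k->type = KEY_ERROR;        /* reads must fail, not see a hole */
    else if (k->nr_ptrs == 0)
        k->type = KEY_DISCARD;      /* cached data gone: not an error */
    return false;
}
```

The point of the split is that losing the only copies of dirty data and losing cached data look similar at the pointer level but need opposite treatment on the read path.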
2015-02-12bcache: re-work bbio IO error reportingSlava Pestov
We can't call bch_notify_*() from atomic context, so move it to a new ca->io_error_work. Change-Id: I958bdea2201dfe220338ad55585e885b2d54220c
2015-02-12bcache: Add sysfs internal uuid attributeJacob Malevich
Signed-off-by: Jacob Malevich <> Issue DAT-1913 Change-Id: I3b65730e43004b6bf4670de220bf740d28a70282
2015-02-12bcache: fix erroneous BUG_ON in journal.cSlava Pestov
It is fine for there to be a dirty journal entry with no keys, as long as JOURNAL_NEED_WRITE is also set, meaning the journal write is about to go down. This happens when bch_journal_meta() is called for example. Change-Id: Ia0e7cadc3e56f026dc275511cf84245d58767b4d
2015-02-12bcache: Drop bch_check_keys()Kent Overstreet
bch_btree_node_iter_next_check() is now able to check everything bch_check_keys() did. Change-Id: Ia90cda23505e9ed2a76586d352b5c275d8ab4e07