Age / Commit message / Author
2014-02-26more dio rewritingblock_stuff_1Kent Overstreet
2014-02-26kill bio_get()Kent Overstreet
Signed-off-by: Kent Overstreet <kmo@daterainc.com>
2014-02-26bio_add_page() conversionsKent Overstreet
Signed-off-by: Kent Overstreet <kmo@daterainc.com>
2014-02-26fooKent Overstreet
Signed-off-by: Kent Overstreet <kmo@daterainc.com>
2014-02-26pluggingKent Overstreet
Signed-off-by: Kent Overstreet <kmo@daterainc.com>
2014-02-26kill ll_merge_requests_fnKent Overstreet
Signed-off-by: Kent Overstreet <kmo@daterainc.com>
2014-02-26Multipage bvecsKent Overstreet
Convert merging to bio_add_page()/blk_max_segment() Signed-off-by: Kent Overstreet <kmo@daterainc.com>
2014-02-26convert integrity to new mergingKent Overstreet
Signed-off-by: Kent Overstreet <kmo@daterainc.com>
2014-02-26convert nvme to blk_max_segment()Kent Overstreet
Signed-off-by: Kent Overstreet <kmo@daterainc.com>
2014-02-26Introduce blk_max_segment()Kent Overstreet
Signed-off-by: Kent Overstreet <kmo@daterainc.com>
2014-02-26block: Convert various code to bio_for_each_page()Kent Overstreet
Signed-off-by: Kent Overstreet <kmo@daterainc.com>
2014-02-26block: Introduce bio_for_each_page()Kent Overstreet
Prep work for multipage bvecs: various code will still need to iterate over individual pages, so we add primitives to do so Signed-off-by: Kent Overstreet <kmo@daterainc.com>
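As a rough userspace illustration of why per-page iteration is still needed once a single bvec may span several pages, the sketch below walks one multipage segment page by page. `struct seg`, `seg_page_chunk()`, `count_page_chunks()` and `SKETCH_PAGE_SIZE` are hypothetical names invented for this example; this is not the kernel's bio_for_each_page() implementation.

```c
#include <stddef.h>

#define SKETCH_PAGE_SIZE 4096u  /* assumed page size, for the sketch only */

/* Hypothetical stand-in for a multipage bvec: a byte range that starts
 * `offset` bytes into its first page and runs for `len` bytes. */
struct seg {
    size_t offset;
    size_t len;
};

/* Return the length of the single-page piece at the front of *s and
 * advance *s past it. Returns 0 once the segment is exhausted. */
static size_t seg_page_chunk(struct seg *s)
{
    size_t in_page = s->offset % SKETCH_PAGE_SIZE;
    size_t room = SKETCH_PAGE_SIZE - in_page;
    size_t chunk = s->len < room ? s->len : room;

    s->offset += chunk;
    s->len -= chunk;
    return chunk;
}

/* Count how many single-page pieces a segment decomposes into. */
static unsigned int count_page_chunks(struct seg s)
{
    unsigned int n = 0;

    while (seg_page_chunk(&s) != 0)
        n++;
    return n;
}
```

Code that must touch each page separately (for example to map it) would do its per-page work inside the loop in `count_page_chunks()`.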
2014-02-26block: Kill merge_bvec_fnKent Overstreet
Signed-off-by: Kent Overstreet <kmo@daterainc.com>
2014-02-26raid5: kill bio_fits_rdev()Kent Overstreet
Signed-off-by: Kent Overstreet <kmo@daterainc.com>
2014-02-26md: Use bio_copy_data()Kent Overstreet
Signed-off-by: Kent Overstreet <kmo@daterainc.com>
2014-02-26block: kill bio_get_nr_vecs()Kent Overstreet
Signed-off-by: Kent Overstreet <kmo@daterainc.com>
2014-02-26block: Make blk_queue_bounce() handle bios larger than BIO_MAX_PAGESKent Overstreet
We'd like to eventually be able to handle bios with more than BIO_MAX_PAGES segments; this shouldn't be too hard and it'll simplify other code in the kernel. The issue is code that clones the bio and must clone the biovec (i.e. it can't use bio_clone_fast()) won't be able to allocate a bio with more than BIO_MAX_PAGES - bio_alloc_bioset() always fails in that case. Fortunately, it's easy to make blk_queue_bounce() just process part of the bio if necessary, using bi_remaining to count the splits and punting the rest back to generic_make_request(). Signed-off-by: Kent Overstreet <kmo@daterainc.com>
2014-02-26mtip32xx: handle arbitrary size biosKent Overstreet
We get a measurable performance increase by handling this in the driver when we're already looping over the biovec, instead of handling it separately in generic_make_request() (or bio_add_page() originally) Signed-off-by: Kent Overstreet <kmo@daterainc.com> Cc: Jens Axboe <axboe@kernel.dk> Cc: Asai Thambi S P <asamymuthupa@micron.com> Cc: Sam Bradshaw <sbradshaw@micron.com>
2014-02-26dio: use submit_io() for block zeroingKent Overstreet
2014-02-26direct-io: Rewrite based on immutable biovecsKent Overstreet
Near total rewrite of fs/direct-io.c. This makes use of our new bio splitting functionality, and the fact that generic_make_request() will take arbitrary size bios - we allocate a bio, pin pages to it directly, then call the getblocks() function to map it wherever the filesystem tells us - splitting as needed. Appears to pass xfstests with CONFIG_XFS_DEBUG=y (would appreciate testing from someone who uses xfstests more than I). Doesn't quite work with btrfs yet - when running xfstests it eventually gets stuck in an infinite loop somewhere. First thing in the kernel log is a warning at fs/btrfs/ordered-data.c:288 in btrfs_add_ordered_sum() - if any of the btrfs people want to take a look that would be a great help. It _definitely_ needs review/auditing from someone intimately familiar with the expected behaviour w.r.t. everything related to filesystem semantics and the handling of the getblocks() call - dio_send_bio() is what primarily implements that behaviour. Signed-off-by: Kent Overstreet <kmo@daterainc.com>
2014-02-26block: Add bio_get_user_pages()Kent Overstreet
This replaces some of the code that was in __bio_map_user_iov(), and soon we're going to use this helper in the dio code. Note that this relies on the recent change to make generic_make_request() take arbitrary sized bios - we're not using bio_add_page() here. Signed-off-by: Kent Overstreet <kmo@daterainc.com> Cc: Jens Axboe <axboe@kernel.dk>
2014-02-26block: iov_count_pages()Kent Overstreet
Signed-off-by: Kent Overstreet <kmo@daterainc.com>
2014-02-26block: convert to iov_iterKent Overstreet
2014-02-26iov_iter: Kill written arg to iov_iter_init()Kent Overstreet
This gets rid of a usually needless call to iov_iter_advance(). Signed-off-by: Kent Overstreet <kmo@daterainc.com> Cc: Alexander Viro <viro@zeniv.linux.org.uk> Cc: Jens Axboe <axboe@kernel.dk> Cc: Chris Mason <clm@fb.com> Cc: linux-btrfs@vger.kernel.org Cc: Steve French <sfrench@samba.org> Cc: linux-cifs@vger.kernel.org Cc: Miklos Szeredi <miklos@szeredi.hu> Cc: fuse-devel@lists.sourceforge.net Cc: Sage Weil <sage@inktank.com> Cc: ceph-devel@vger.kernel.org
2014-02-26iov_iter: Kill iov_iter_single_seg_count()Kent Overstreet
The new iov_iter_iovec() is a more general replacement. Signed-off-by: Kent Overstreet <kmo@daterainc.com> Cc: Alexander Viro <viro@zeniv.linux.org.uk> Cc: Jens Axboe <axboe@kernel.dk>
2014-02-26iov_iter: Move iov_iter to uio.hKent Overstreet
Going to be consolidating all the iov iter in one place, and fs.h is way too big. This also adds a new helper, iovec iov_iter_iovec(). Signed-off-by: Kent Overstreet <kmo@daterainc.com> Cc: Alexander Viro <viro@zeniv.linux.org.uk> Cc: Jens Axboe <axboe@kernel.dk>
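To make the idea of "the current iovec of an iterator" concrete, here is a minimal userspace sketch, assuming a simplified iterator that tracks only the current segment, an offset into it, and the total bytes remaining. `struct sketch_iter` and `sketch_iter_iovec()` are hypothetical names for this illustration and do not reproduce the kernel's struct iov_iter.

```c
#include <stddef.h>
#include <sys/uio.h>   /* struct iovec */

/* Hypothetical, simplified iterator; not the kernel's struct iov_iter. */
struct sketch_iter {
    const struct iovec *iov;  /* current segment */
    size_t iov_offset;        /* bytes already consumed in *iov */
    size_t count;             /* total bytes left in the whole iteration */
};

/* Return the not-yet-consumed part of the current segment,
 * clamped so that it never exceeds the bytes left overall. */
static struct iovec sketch_iter_iovec(const struct sketch_iter *it)
{
    size_t left = it->iov->iov_len - it->iov_offset;
    struct iovec v;

    v.iov_base = (char *)it->iov->iov_base + it->iov_offset;
    v.iov_len = left < it->count ? left : it->count;
    return v;
}
```

A helper of this shape returns both the base and the length, which is why it can stand in for a function that only reported the length of the current segment.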
2014-02-26btrfs: Convert to bio_for_each_segment()Kent Overstreet
This is going to be important for future (hopeful) block layer refactoring, and using the standard primitives makes the code easier to audit. Signed-off-by: Kent Overstreet <kmo@daterainc.com> Cc: Chris Mason <clm@fb.com> Cc: linux-btrfs@vger.kernel.org
2014-02-26btrfs: generic_make_request() handles arbitrary size bios nowKent Overstreet
So there's no need for btrfs to break up bios for device limits anymore Signed-off-by: Kent Overstreet <kmo@daterainc.com> Cc: Chris Mason <clm@fb.com> Cc: linux-btrfs@vger.kernel.org
2014-02-26bcache: generic_make_request() handles large bios nowKent Overstreet
So we get to delete our hacky workaround. Signed-off-by: Kent Overstreet <kmo@daterainc.com>
2014-02-26blk-lib.c: generic_make_request() handles large bios nowKent Overstreet
generic_make_request() will now do for us what the code in blk-lib.c was doing manually, with the bio_batch stuff - we still need some looping in case we're trying to discard/zeroout more than around a gigabyte, but when we can submit that much at a time doing the submissions in parallel really shouldn't matter. Signed-off-by: Kent Overstreet <kmo@daterainc.com> Cc: Jens Axboe <axboe@kernel.dk>
2014-02-26block: Gut bio_add_page()Kent Overstreet
Since generic_make_request() can now handle arbitrary size bios, all we have to do is make sure the bvec array doesn't overflow. Signed-off-by: Kent Overstreet <kmo@daterainc.com> Cc: Jens Axboe <axboe@kernel.dk>
2014-02-26block: Make generic_make_request handle arbitrary sized biosKent Overstreet
The way the block layer is currently written, it goes to great lengths to avoid having to split bios; upper layer code (such as bio_add_page()) checks what the underlying device can handle and tries to always create bios that don't need to be split.

But this approach becomes unwieldy and eventually breaks down with stacked devices and devices with dynamic limits, and it adds a lot of complexity. If the block layer could split bios as needed, we could eliminate a lot of complexity elsewhere - particularly in stacked drivers. Code that creates bios can then create whatever size bios are convenient, and more importantly stacked drivers don't have to deal with both their own bio size limitations and the limitations of the (potentially multiple) devices underneath them. In the future this will let us delete merge_bvec_fn and a bunch of other code.

We do this by adding calls to blk_queue_split() to the various make_request functions that need it - a few can already handle arbitrary size bios. Note that we add the call _after_ any call to blk_queue_bounce(); this means that blk_queue_split() and blk_recalc_rq_segments() don't need to be concerned with bouncing affecting segment merging.

Some make_request_fns were simple enough to audit and verify they don't need blk_queue_split() calls. The skipped ones are:

* nfhd_make_request (arch/m68k/emu/nfblock.c)
* axon_ram_make_request (arch/powerpc/sysdev/axonram.c)
* simdisk_make_request (arch/xtensa/platforms/iss/simdisk.c)
* brd_make_request (ramdisk - drivers/block/brd.c)
* loop_make_request
* null_queue_bio
* bcache's make_request fns

Some others are almost certainly safe to remove now, but will be left for future patches.
Signed-off-by: Kent Overstreet <kmo@daterainc.com> Cc: Jens Axboe <axboe@kernel.dk> Cc: Neil Brown <neilb@suse.de> Cc: Alasdair Kergon <agk@redhat.com> Cc: dm-devel@redhat.com Cc: Lars Ellenberg <drbd-dev@lists.linbit.com> Cc: drbd-user@lists.linbit.com Cc: Asai Thambi S P <asamymuthupa@micron.com> Cc: Sam Bradshaw <sbradshaw@micron.com> Cc: Matthew Wilcox <willy@linux.intel.com> Cc: linux-nvme@lists.infradead.org Cc: Jiri Kosina <jkosina@suse.cz> Cc: Geoff Levand <geoff@infradead.org> Cc: Jim Paris <jim@jtan.com> Cc: Joshua Morris <josh.h.morris@us.ibm.com> Cc: Philip Kelleher <pjk1939@linux.vnet.ibm.com> Cc: Minchan Kim <minchan@kernel.org> Cc: Nitin Gupta <ngupta@vflare.org> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Peng Tao <bergwolf@gmail.com>
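The splitting idea described in this commit can be sketched in userspace as follows. This is only an illustration under stated assumptions: `struct io_range`, `split_front()` and `submit_split()` are hypothetical names, the only limit modelled is a maximum sector count, and none of this is the actual blk_queue_split() code.

```c
#include <stddef.h>

/* Hypothetical request descriptor for the sketch (not struct bio). */
struct io_range {
    unsigned long long sector;  /* first sector */
    unsigned int nr_sectors;    /* length in sectors */
};

/* If *r exceeds max_sectors, carve the first max_sectors off into *front,
 * leave the remainder in *r and return 1. Otherwise return 0 and leave
 * *r untouched. A limit of 0 is treated as "no limit" to avoid looping. */
static int split_front(struct io_range *r, unsigned int max_sectors,
                       struct io_range *front)
{
    if (max_sectors == 0 || r->nr_sectors <= max_sectors)
        return 0;

    front->sector = r->sector;
    front->nr_sectors = max_sectors;
    r->sector += max_sectors;
    r->nr_sectors -= max_sectors;
    return 1;
}

/* Split an arbitrarily large range at the point of submission and
 * return how many pieces would be issued to the device. */
static unsigned int submit_split(struct io_range r, unsigned int max_sectors)
{
    struct io_range front;
    unsigned int pieces = 0;

    while (split_front(&r, max_sectors, &front))
        pieces++;           /* a driver would issue `front` here */
    return pieces + 1;      /* the remainder now fits and is issued last */
}
```

The point of the design is visible in the caller: whoever builds the range no longer needs to know `max_sectors`; only the layer that talks to the device does.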
2014-02-24smp: Rename __smp_call_function_single() to smp_call_function_single_async()Frederic Weisbecker
The name __smp_call_function_single() doesn't tell much about the properties of this function, especially when compared to smp_call_function_single(). The comments above the implementation are also misleading. The main point of this function is actually not to be able to embed the csd in an object. This is actually a requirement that results from the purpose of this function, which is to raise an IPI asynchronously. As such it can be called with interrupts disabled. And this feature comes at the cost of the caller, who then needs to serialize the IPIs on this csd. Let's rename the function and enhance the comments so that they reflect these properties. Suggested-by: Christoph Hellwig <hch@infradead.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Christoph Hellwig <hch@infradead.org> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jan Kara <jack@suse.cz> Cc: Jens Axboe <axboe@fb.com> Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Signed-off-by: Jens Axboe <axboe@fb.com>
2014-02-24smp: Remove wait argument from __smp_call_function_single()Frederic Weisbecker
The main point of calling __smp_call_function_single() is to send an IPI in a purely asynchronous way. By embedding a csd in an object, a caller can send the IPI without waiting for a previous one to complete, as is required by smp_call_function_single() for example. As such, sending this kind of IPI can be safe even when irqs are disabled.

This flexibility comes at the expense of the caller, who then needs to synchronize the csd lifecycle himself and make sure that IPIs on a single csd are serialized. This is how __smp_call_function_single() works when wait = 0, and this use case is relevant.

Now there doesn't seem to be any use case with wait = 1 that can't be covered by smp_call_function_single() instead, which is safer. Let's look at the two possible scenarios:

1) The user calls __smp_call_function_single(wait = 1) on a csd embedded in an object. It looks like a nice and convenient pattern at first sight because we can then retrieve the object from the IPI handler easily. But actually it is a waste of memory space in the object, since the csd can be allocated from the stack by smp_call_function_single(wait = 1) and the object can be passed as the IPI argument. Besides that, embedding the csd in an object is more error prone because the caller must take care of the serialization of the IPIs for this csd.

2) The user calls __smp_call_function_single(wait = 1) on a csd that is allocated on the stack. It's ok, but smp_call_function_single() can do it as well and it already takes care of the allocation on the stack. Again it's simpler and less error prone.

Therefore, using the underscore-prefixed API version with wait = 1 is a bad pattern and a sign that the caller can do something safer and simpler. There was a single user of that, which has just been converted. So let's remove this option to discourage further users.
Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Christoph Hellwig <hch@infradead.org> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jan Kara <jack@suse.cz> Cc: Jens Axboe <axboe@fb.com> Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Signed-off-by: Jens Axboe <axboe@fb.com>
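The "csd embedded in an object" pattern from scenario 1 can be illustrated with a small userspace sketch. Everything here is an assumption made for the example: `struct sketch_csd`, `struct device_state`, `sketch_container_of()` and `sketch_fire()` are hypothetical, the handler signature is simplified, and the "IPI" is just a local function call.

```c
#include <stddef.h>

/* Hypothetical stand-in for a call descriptor; not the kernel's csd. */
struct sketch_csd {
    void (*func)(struct sketch_csd *csd);
};

/* An object that embeds its own descriptor, so no allocation is
 * needed at the time the call is raised. */
struct device_state {
    int completed;
    struct sketch_csd csd;
};

/* Recover the enclosing object from a pointer to one of its members. */
#define sketch_container_of(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

/* Handler: finds its object through the embedded descriptor. */
static void mark_completed(struct sketch_csd *csd)
{
    struct device_state *dev =
        sketch_container_of(csd, struct device_state, csd);

    dev->completed++;
}

/* Stand-in for raising the call; here it simply runs the handler. */
static void sketch_fire(struct sketch_csd *csd)
{
    csd->func(csd);
}
```

What the sketch cannot show is the cost the commit message warns about: with a real asynchronous call, the owner of `device_state` must make sure a second call on the same descriptor is not raised before the first has finished.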
2014-02-24watchdog: Simplify a little the IPI callFrederic Weisbecker
In order to remotely restart the watchdog hrtimer, update_timers() allocates a csd on the stack and passes it to __smp_call_function_single(). There is no particular need, however, for a specific csd here. Let's simplify that a little by calling smp_call_function_single(), which can already take care of the csd allocation by itself. Acked-by: Don Zickus <dzickus@redhat.com> Reviewed-by: Michal Hocko <mhocko@suse.cz> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Christoph Hellwig <hch@infradead.org> Cc: Don Zickus <dzickus@redhat.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jan Kara <jack@suse.cz> Cc: Jens Axboe <axboe@fb.com> Cc: Michal Hocko <mhocko@suse.cz> Cc: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com> Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Signed-off-by: Jens Axboe <axboe@fb.com>
2014-02-24smp: Move __smp_call_function_single() below its safe versionFrederic Weisbecker
Move this function closer to smp_call_function_single(). These functions have very similar behavior and should be displayed in the same block for clarity. Reviewed-by: Jan Kara <jack@suse.cz> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Christoph Hellwig <hch@infradead.org> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jan Kara <jack@suse.cz> Cc: Jens Axboe <axboe@fb.com> Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Signed-off-by: Jens Axboe <axboe@fb.com>
2014-02-24smp: Consolidate the various smp_call_function_single() declensionsFrederic Weisbecker
__smp_call_function_single() and smp_call_function_single() share some code that can be factored out: execute inline when the target is local, check if the target is online, lock the csd, call generic_exec_single(). Let's move the common parts to generic_exec_single(). Reviewed-by: Jan Kara <jack@suse.cz> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Christoph Hellwig <hch@infradead.org> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jan Kara <jack@suse.cz> Cc: Jens Axboe <axboe@fb.com> Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Signed-off-by: Jens Axboe <axboe@fb.com>
2014-02-24smp: Teach __smp_call_function_single() to check for offline cpusJan Kara
Align __smp_call_function_single() with smp_call_function_single() so that it also checks whether requested cpu is still online. Signed-off-by: Jan Kara <jack@suse.cz> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Christoph Hellwig <hch@infradead.org> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jens Axboe <axboe@fb.com> Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Signed-off-by: Jens Axboe <axboe@fb.com>
2014-02-24smp: Remove unused list_head from csdJan Kara
Now that we got rid of all the remaining code which fiddled with csd.list, let's remove it. Signed-off-by: Jan Kara <jack@suse.cz> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Christoph Hellwig <hch@infradead.org> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jens Axboe <axboe@fb.com> Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Signed-off-by: Jens Axboe <axboe@fb.com>
2014-02-24smp: Iterate functions through llist_for_each_entry_safe()Jan Kara
The IPI function llist iteration is open coded. Let's simplify this by using an llist iterator. Also we want to keep the iteration safe against possible csd.llist->next value reuse from the IPI handler. At least the block subsystem used to do such things, so let's stay careful and use llist_for_each_entry_safe(). Signed-off-by: Jan Kara <jack@suse.cz> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Christoph Hellwig <hch@infradead.org> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jens Axboe <axboe@fb.com> Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Signed-off-by: Jens Axboe <axboe@fb.com>
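Why a "safe" iterator matters when the handler may reuse the node's next pointer can be shown with a plain singly linked list in userspace. This is a sketch of the general technique only; `struct node`, `handle_and_recycle()` and `run_all_safe()` are hypothetical names, and the kernel's llist_for_each_entry_safe() macro is not used here.

```c
#include <stddef.h>

struct node {
    struct node *next;
    int hits;
};

/* Handler that is allowed to reuse the node, including its next pointer. */
static void handle_and_recycle(struct node *n)
{
    n->hits++;
    n->next = NULL;   /* simulate the owner reusing the link field */
}

/* Walk the list, running the handler on every node. The successor is
 * read before the handler runs, so clobbering n->next inside the
 * handler cannot cut the walk short. */
static unsigned int run_all_safe(struct node *head)
{
    struct node *pos, *next;
    unsigned int ran = 0;

    for (pos = head; pos != NULL; pos = next) {
        next = pos->next;          /* saved before the handler runs */
        handle_and_recycle(pos);
        ran++;
    }
    return ran;
}
```

A loop that advanced with `pos = pos->next` after calling the handler would stop after the first node in this example.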
2014-02-24block: Stop abusing rq->csd.list in blk-softirqJan Kara
Abusing rq->csd.list for a list of requests to complete is rather ugly. We use rq->queuelist instead which is much cleaner. It is safe because queuelist is used by the block layer only for requests waiting to be submitted to a device. Thus it is unused when irq reports the request IO is finished. Signed-off-by: Jan Kara <jack@suse.cz> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Christoph Hellwig <hch@infradead.org> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jens Axboe <axboe@fb.com> Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Signed-off-by: Jens Axboe <axboe@fb.com>
2014-02-24block: Remove useless IPI struct initializationFrederic Weisbecker
rq_fifo_clear() resets the csd.list through INIT_LIST_HEAD for no clear purpose. The csd.list doesn't need to be initialized as a list head because it's only ever used as a list node. Let's remove this useless initialization. Reviewed-by: Jan Kara <jack@suse.cz> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Christoph Hellwig <hch@infradead.org> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jan Kara <jack@suse.cz> Cc: Jens Axboe <axboe@fb.com> Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Signed-off-by: Jens Axboe <axboe@fb.com>
2014-02-24block: Stop abusing csd.list for fifo_timeJan Kara
The block layer currently abuses rq->csd.list.next for storing fifo_time. That is a terrible hack and completely unnecessary as well. A union achieves the same space saving in a cleaner way. Signed-off-by: Jan Kara <jack@suse.cz> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Christoph Hellwig <hch@infradead.org> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jens Axboe <axboe@fb.com> Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Signed-off-by: Jens Axboe <axboe@fb.com>
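The space-saving argument can be sketched with ordinary C types. The layout below is hypothetical (`struct sketch_csd`, `union sched_or_completion` and `struct sketch_request` are invented for the example and do not mirror struct request); it only shows that two fields with non-overlapping lifetimes can share storage by name instead of by aliasing a pointer field.

```c
#include <stddef.h>

/* Hypothetical descriptor used only at completion time. */
struct sketch_csd {
    void *next;
    void (*func)(void *info);
    void *info;
};

/* The two members are never live at the same time, so they may overlap. */
union sched_or_completion {
    struct sketch_csd csd;     /* valid once the request is completing */
    unsigned long fifo_time;   /* valid while the request is queued */
};

struct sketch_request {
    int tag;
    union sched_or_completion u;
};

/* Named accessors replace casts through an unrelated pointer field. */
static void rq_set_fifo_time(struct sketch_request *rq, unsigned long t)
{
    rq->u.fifo_time = t;
}

static unsigned long rq_fifo_time(const struct sketch_request *rq)
{
    return rq->u.fifo_time;
}
```

The union costs no more space than the descriptor alone, which is the same saving the pointer trick achieved, but the compiler now knows both types involved.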
2014-02-21fs/bio-integrity: remove duplicate codeGu Zheng
Most of the code in bio_integrity_verify() and bio_integrity_generate() is the same, so introduce a helper function, bio_integrity_generate_verify(), to remove the duplicate code. Signed-off-by: Gu Zheng <guz.fnst@cn.fujitsu.com> Signed-off-by: Jens Axboe <axboe@fb.com>
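The refactoring pattern, one shared walk with a flag selecting "generate" or "verify", might look like the following userspace sketch. It is an illustration under assumptions: the toy additive checksum, `generate_or_verify()`, `sketch_generate()` and `sketch_verify()` are all hypothetical and unrelated to the real integrity profiles or to the signature of bio_integrity_generate_verify().

```c
#include <stddef.h>
#include <stdint.h>

/* Toy per-block checksum; a real integrity profile would use a CRC or similar. */
static uint8_t toy_sum(const uint8_t *data, size_t len)
{
    uint8_t s = 0;
    size_t i;

    for (i = 0; i < len; i++)
        s = (uint8_t)(s + data[i]);
    return s;
}

/* Shared walk over all blocks: generate != 0 writes the tags,
 * generate == 0 compares them. Returns 0 on success, -1 on the
 * first mismatch. */
static int generate_or_verify(const uint8_t *data, size_t nblocks,
                              size_t block_size, uint8_t *tags, int generate)
{
    size_t i;

    for (i = 0; i < nblocks; i++) {
        uint8_t sum = toy_sum(data + i * block_size, block_size);

        if (generate)
            tags[i] = sum;
        else if (tags[i] != sum)
            return -1;
    }
    return 0;
}

/* Thin wrappers keep the two public entry points. */
static int sketch_generate(const uint8_t *d, size_t n, size_t bs, uint8_t *t)
{
    return generate_or_verify(d, n, bs, t, 1);
}

static int sketch_verify(const uint8_t *d, size_t n, size_t bs, uint8_t *t)
{
    return generate_or_verify(d, n, bs, t, 0);
}
```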
2014-02-18block: Substitute rcu_access_pointer() for rcu_dereference_raw()Paul E. McKenney
(Trivial patch.) If the code is looking at the RCU-protected pointer itself, but not dereferencing it, the rcu_dereference() functions can be downgraded to rcu_access_pointer(). This commit makes this downgrade in blkg_destroy() and ioc_destroy_icq(), both of which simply compare the RCU-protected pointer against another pointer with no dereferencing. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Jens Axboe <axboe@kernel.dk> Signed-off-by: Jens Axboe <axboe@fb.com>
2014-02-18block: Use macros from compiler.h instead of __attribute__((...))Gideon Israel Dsouza
To increase compiler portability, <linux/compiler.h> defines several macros for various gcc __attribute__((...)) constructs. I've made sure these gcc-specific constructs were replaced with the right macro and that an #include <linux/compiler.h> was placed where needed. Signed-off-by: Gideon Israel Dsouza <gidisrael@gmail.com> Signed-off-by: Jens Axboe <axboe@fb.com>
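A userspace sketch of the same idea: hide the compiler-specific attribute syntax behind macros so that only one header needs to change for another compiler. The macro names `SKETCH_PACKED` and `SKETCH_ALIGNED` are hypothetical and chosen to avoid clashing with system headers; the sketch assumes GCC or Clang and a target where uint32_t normally has 4-byte alignment.

```c
#include <stddef.h>
#include <stdint.h>

/* One place to adapt if another compiler spells these differently. */
#define SKETCH_PACKED      __attribute__((packed))
#define SKETCH_ALIGNED(x)  __attribute__((aligned(x)))

/* No padding between members: `length` starts right after `type`. */
struct wire_header {
    uint8_t  type;
    uint32_t length;
} SKETCH_PACKED;

/* Forced onto its own 64-byte boundary, e.g. to avoid sharing a cache line. */
struct padded_counter {
    uint32_t value;
} SKETCH_ALIGNED(64);
```

Code using the structures never mentions `__attribute__` directly, which is the property the commit is after.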
2014-02-18bio: don't write "bio: create slab" messages to syslogMikulas Patocka
When using device mapper, there are many "bio: create slab" messages in the log. Device mapper targets have different front_pad values, so each time we load a target that wasn't loaded before, we allocate a slab with the appropriate front_pad and an associated "bio: create slab" message is logged. This patch removes these messages; there is no need for them. Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Signed-off-by: Jens Axboe <axboe@fb.com>
2014-02-16Linux 3.14-rc3v3.14-rc3Linus Torvalds
2014-02-16Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfsLinus Torvalds
Pull btrfs fixes from Chris Mason: "We have a small collection of fixes in my for-linus branch. The big thing that stands out is a revert of a new ioctl. Users haven't shipped yet in btrfs-progs, and Dave Sterba found a better way to export the information"
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs:
  Btrfs: use right clone root offset for compressed extents
  btrfs: fix null pointer deference at btrfs_sysfs_add_one+0x105
  Btrfs: unset DCACHE_DISCONNECTED when mounting default subvol
  Btrfs: fix max_inline mount option
  Btrfs: fix a lockdep warning when cleaning up aborted transaction
  Revert "btrfs: add ioctl to export size of global metadata reservation"
2014-02-16Merge tag 'dt-fixes-for-3.14' of git://git.kernel.org/pub/scm/linux/kernel/git/robh/linuxLinus Torvalds
Pull devicetree fixes from Rob Herring: "Fix booting on PPC boards. Changes to of_match_node matching caused the serial port on some PPC boards to stop working. Reverted the change and reimplement to split matching between new style compatible only matching and fallback to old matching algorithm"
* tag 'dt-fixes-for-3.14' of git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux:
  of: search the best compatible match first in __of_match_node()
  Revert "OF: base: match each node compatible against all given matches first"