summaryrefslogtreecommitdiff
path: root/lib
AgeCommit message (Collapse)Author
2013-06-17lib-add-lz4-compressor-module-fixAndrew Morton
make lz4_compresshcctx() static Cc: Chanho Min <chanho.min@lge.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2013-06-17lib: add lz4 compressor moduleChanho Min
This patchset is for supporting LZ4 compression and the crypto API using it. As shown below, the size of data is a little bit bigger but compressing speed is faster under the enabled unaligned memory access. We can use lz4 de/compression through crypto API as well. Also, It will be useful for another potential user of lz4 compression. lz4 Compression Benchmark: Compiler: ARM gcc 4.6.4 ARMv7, 1 GHz based board Kernel: linux 3.4 Uncompressed data Size: 101 MB Compressed Size compression Speed LZO 72.1MB 32.1MB/s, 33.0MB/s(UA) LZ4 75.1MB 30.4MB/s, 35.9MB/s(UA) LZ4HC 59.8MB 2.4MB/s, 2.5MB/s(UA) - UA: Unaligned memory Access support - Latest patch set for LZO applied This patch: Add support for LZ4 compression in the Linux Kernel. LZ4 Compression APIs for kernel are based on LZ4 implementation by Yann Collet and were changed for kernel coding style. LZ4 homepage : http://fastcompression.blogspot.com/p/lz4.html LZ4 source repository : http://code.google.com/p/lz4/ svn revision : r90 Two APIs are added: lz4_compress() support basic lz4 compression whereas lz4hc_compress() support high compression or CPU performance get lower but compression ratio get higher. Also, we require the pre-allocated working memory with the defined size and destination buffer must be allocated with the size of lz4_compressbound. Signed-off-by: Chanho Min <chanho.min@lge.com> Cc: "Darrick J. Wong" <djwong@us.ibm.com> Cc: Bob Pearson <rpearson@systemfabricworks.com> Cc: Richard Weinberger <richard@nod.at> Cc: Herbert Xu <herbert@gondor.hengli.com.au> Cc: Yann Collet <yann.collet.73@gmail.com> Cc: Kyungsik Lee <kyungsik.lee@lge.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2013-06-17lib: add support for LZ4-compressed kernelKyungsik Lee
Add support for extracting LZ4-compressed kernel images, as well as LZ4-compressed ramdisk images in the kernel boot process. Signed-off-by: Kyungsik Lee <kyungsik.lee@lge.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Russell King <rmk@arm.linux.org.uk> Cc: Borislav Petkov <bp@alien8.de> Cc: Florian Fainelli <florian@openwrt.org> Cc: Yann Collet <yann.collet.73@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2013-06-17decompressor: add LZ4 decompressor moduleKyungsik Lee
Add support for LZ4 decompression in the Linux Kernel. LZ4 Decompression APIs for kernel are based on LZ4 implementation by Yann Collet. Benchmark Results(PATCH v3) Compiler: Linaro ARM gcc 4.6.2 1. ARMv7, 1.5GHz based board Kernel: linux 3.4 Uncompressed Kernel Size: 14MB Compressed Size Decompression Speed LZO 6.7MB 20.1MB/s, 25.2MB/s(UA) LZ4 7.3MB 29.1MB/s, 45.6MB/s(UA) 2. ARMv7, 1.7GHz based board Kernel: linux 3.7 Uncompressed Kernel Size: 14MB Compressed Size Decompression Speed LZO 6.0MB 34.1MB/s, 52.2MB/s(UA) LZ4 6.5MB 86.7MB/s - UA: Unaligned memory Access support - Latest patch set for LZO applied This patch set is for adding support for LZ4-compressed Kernel. LZ4 is a very fast lossless compression algorithm and it also features an extremely fast decoder [1]. But we have five of decompressors already and one question which does arise, however, is that of where do we stop adding new ones? This issue had been discussed and came to the conclusion [2]. Russell King said that we should have: - one decompressor which is the fastest - one decompressor for the highest compression ratio - one popular decompressor (eg conventional gzip) If we have a replacement one for one of these, then it should do exactly that: replace it. The benchmark shows that an 8% increase in image size vs a 66% increase in decompression speed compared to LZO(which has been known as the fastest decompressor in the Kernel). Therefore the "fast but may not be small" compression title has clearly been taken by LZ4 [3]. [1] http://code.google.com/p/lz4/ [2] http://thread.gmane.org/gmane.linux.kbuild.devel/9157 [3] http://thread.gmane.org/gmane.linux.kbuild.devel/9347 LZ4 homepage: http://fastcompression.blogspot.com/p/lz4.html LZ4 source repository: http://code.google.com/p/lz4/ Signed-off-by: Kyungsik Lee <kyungsik.lee@lge.com> Signed-off-by: Yann Collet <yann.collet.73@gmail.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Russell King <rmk@arm.linux.org.uk> Cc: Borislav Petkov <bp@alien8.de> Cc: Florian Fainelli <florian@openwrt.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2013-06-17lib: add weak clz/ctz functionsChanho Min
Some architectures need __c[lt]z[sd]i2() for __builtin_c[lt]z[ll] and It causes build failure. They can be implemented using the fls()/__ffs() and overridden by linking arch-specific versions may not be implemented yet. This is required by "lib: add lz4 compressor module". Reference: https://lkml.org/lkml/2013/4/18/603 Signed-off-by: Chanho Min <chanho.min@lge.com> Reported-by: Geert Uytterhoeven <geert@linux-m68k.org> Cc: "Darrick J. Wong" <djwong@us.ibm.com> Cc: Bob Pearson <rpearson@systemfabricworks.com> Cc: Richard Weinberger <richard@nod.at> Cc: Herbert Xu <herbert@gondor.hengli.com.au> Cc: Yann Collet <yann.collet.73@gmail.com> Cc: Kyungsik Lee <kyungsik.lee@lge.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2013-06-17idr-print-a-stack-dump-after-ida_remove-warning-fixAndrew Morton
convert the open-coded printk+dump_stack into WARN() Cc: Jean Delvare <jdelvare@suse.de> Cc: Takashi Iwai <tiwai@suse.de> Cc: Tejun Heo <tj@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2013-06-17idr: print a stack dump after ida_remove warningJean Delvare
We print a dump stack after idr_remove warning. This is useful to find the faulty piece of code. Let's do the same for ida_remove, as it would be equally useful there. Signed-off-by: Jean Delvare <jdelvare@suse.de> Cc: Tejun Heo <tj@kernel.org> Cc: Takashi Iwai <tiwai@suse.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2013-06-17rbtree-remove-unneeded-include-fixAndrew Morton
fix lib/interval_tree.c build Cc: Dan Carpenter <dan.carpenter@oracle.com> Cc: Nathan Zimmer <nzimmer@sgi.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2013-06-17dump_stack-serialize-the-output-from-dump_stack-fixAndrew Morton
- fix comment indenting - avoid inclusion of <asm/> files - use <linux/> where possible - fix uniprocessor build (__dump_stack undefined) - remove unneeded ifdef around smp.h inclusion Cc: Alex Thorlton <athorlton@sgi.com> Cc: David S. Miller <davem@davemloft.net> Cc: Jesper Nilsson <jesper.nilsson@axis.com> Cc: Richard Kuo <rkuo@codeaurora.org> Cc: Robin Holt <holt@sgi.com> Cc: Russ Anderson <rja@sgi.com> Cc: Vineet Gupta <vgupta@synopsys.com> Cc: athorlton@sgi.com Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2013-06-17dump_stack: serialize the output from dump_stack()Alex Thorlton
tAdd adds functionality to serialize the output from dump_stack() to avoid mangling of the output when dump_stack is called simultaneously from multiple cpus. Signed-off-by: Alex Thorlton <athorlton@sgi.com> Reported-by: Russ Anderson <rja@sgi.com> Reviewed-by: Robin Holt <holt@sgi.com> Cc: Vineet Gupta <vgupta@synopsys.com> Cc: David S. Miller <davem@davemloft.net> Cc: Richard Kuo <rkuo@codeaurora.org> Cc: Jesper Nilsson <jesper.nilsson@axis.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2013-06-17Merge remote-tracking branch 'lzo-update/lzo-update'Stephen Rothwell
2013-06-17Merge remote-tracking branch 'char-misc/char-misc-next'Stephen Rothwell
2013-06-17Merge remote-tracking branch 'driver-core/driver-core-next'Stephen Rothwell
Conflicts: drivers/base/cpu.c drivers/base/memory.c include/linux/platform_device.h
2013-06-17Merge remote-tracking branch 'percpu/for-next'Stephen Rothwell
2013-06-17Merge remote-tracking branch 'trivial/for-next'Stephen Rothwell
Conflicts: Documentation/networking/netlink_mmap.txt
2013-06-17Merge remote-tracking branch 'cgroup/for-next'Stephen Rothwell
2013-06-17Merge remote-tracking branch 'crypto/master'Stephen Rothwell
2013-06-17Merge remote-tracking branch 'mips/mips-for-linux-next'Stephen Rothwell
2013-06-17Merge remote-tracking branch 'm68k/for-next'Stephen Rothwell
Conflicts: arch/cris/arch-v32/drivers/Kconfig
2013-06-16percpu-refcount: use RCU-sched insted of normal RCUTejun Heo
percpu-refcount was incorrectly using preempt_disable/enable() for RCU critical sections against call_rcu(). 6a24474da8 ("percpu-refcount: consistently use plain (non-sched) RCU") fixed it by converting the preepmtion operations with rcu_read_[un]lock() citing that there isn't any advantage in using sched-RCU over using the usual one; however, rcu_read_[un]lock() for the preemptible RCU implementation - CONFIG_TREE_PREEMPT_RCU, chosen when CONFIG_PREEMPT - are slightly more expensive than preempt_disable/enable(). In a contrived microbench which repeats the followings, - percpu_ref_get() - copy 32 bytes of data into percpu buffer - percpu_put_get() - copy 32 bytes of data into percpu buffer rcu_read_[un]lock() used in percpu_ref_get/put() makes it go slower by about 15% when compared to using sched-RCU. As the RCU critical sections are extremely short, using sched-RCU shouldn't have any latency implications. Convert to RCU-sched. Signed-off-by: Tejun Heo <tj@kernel.org> Acked-by: Kent Overstreet <koverstreet@google.com> Acked-by: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> Cc: Michal Hocko <mhocko@suse.cz> Cc: Rusty Russell <rusty@rustcorp.com.au>
2013-06-16lib: Move fonts from drivers/video/console/ to lib/fonts/Geert Uytterhoeven
Several drivers need font support independent of CONFIG_VT, cfr. commit 70ebec7b59a3271e9ed459fe4c3bf0660e9356ea, "console/font: Refactor font support code selection logic"). Hence move the fonts and their support logic from drivers/video/console/ to its own library directory lib/fonts/. This also allows to limit processing of drivers/video/console/Makefile to CONFIG_VT=y again. Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
2013-06-13percpu-refcount: implement percpu_tryget() along with ↵Tejun Heo
percpu_ref_kill_and_confirm() Implement percpu_tryget() which stops giving out references once the percpu_ref is visible as killed. Because the refcnt is per-cpu, different CPUs will start to see a refcnt as killed at different points in time and tryget() may continue to succeed on subset of cpus for a while after percpu_ref_kill() returns. For use cases where it's necessary to know when all CPUs start to see the refcnt as dead, percpu_ref_kill_and_confirm() is added. The new function takes an extra argument @confirm_kill which is invoked when the refcnt is guaranteed to be viewed as killed on all CPUs. While this isn't the prettiest interface, it doesn't force synchronous wait and is much safer than requiring the caller to do its own call_rcu(). v2: Patch description rephrased to emphasize that tryget() may continue to succeed on some CPUs after kill() returns as suggested by Kent. v3: Function comment in percpu_ref_kill_and_confirm() updated warning people to not depend on the implied RCU grace period from the confirm callback as it's an implementation detail. Signed-off-by: Tejun Heo <tj@kernel.org> Slightly-Grumpily-Acked-by: Kent Overstreet <koverstreet@google.com>
2013-06-13percpu-refcount: implement percpu_ref_cancel_init()Tejun Heo
Normally, percpu_ref_init() initializes and percpu_ref_kill() initiates destruction which completes asynchronously. The asynchronous destruction can be problematic in init failure path where the caller wants to destroy half-constructed object - distinguishing half-constructed objects from the usual release method can be painful for complex objects. This patch implements percpu_ref_cancel_init() which synchronously destroys the percpu_ref without invoking release. To avoid unintentional misuses, the function requires the ref to have finished percpu_ref_init() but never used and triggers WARN otherwise. v2: Explain the weird name and usage restriction in the function comment. Signed-off-by: Tejun Heo <tj@kernel.org> Acked-by: Kent Overstreet <koverstreet@google.com>
2013-06-13percpu-refcount: add __must_check to percpu_ref_init() and don't use ↵Tejun Heo
ACCESS_ONCE() in percpu_ref_kill_rcu() Two small changes. * Unlike most init functions, percpu_ref_init() allocates memory and may fail. Let's mark it with __must_check in case the caller forgets. * percpu_ref_kill_rcu() is unnecessarily using ACCESS_ONCE() to dereference @ref->pcpu_count, which can be misleading. The pointer is guaranteed to be valid and visible and can't change underneath the function. Drop ACCESS_ONCE(). Signed-off-by: Tejun Heo <tj@kernel.org>
2013-06-13lib/Kconfig.debug: Restrict FRAME_POINTER for MIPSMarkos Chandras
FAULT_INJECTION_STACKTRACE_FILTER selects FRAME_POINTER but that symbol is not available for MIPS. Fixes the following problem on a randconfig: warning: (LOCKDEP && FAULT_INJECTION_STACKTRACE_FILTER && LATENCYTOP && KMEMCHECK) selects FRAME_POINTER which has unmet direct dependencies (DEBUG_KERNEL && (CRIS || M68K || FRV || UML || AVR32 || SUPERH || BLACKFIN || MN10300 || METAG) || ARCH_WANT_FRAME_POINTERS) Signed-off-by: Markos Chandras <markos.chandras@imgtec.com> Acked-by: Steven J. Hill <Steven.Hill@imgtec.com> Cc: linux-mips@linux-mips.org Patchwork: https://patchwork.linux-mips.org/patch/5441/ Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2013-06-12percpu-refcount: cosmetic updatesTejun Heo
* s/percpu_ref_release/percpu_ref_func_t/ as it's customary to have _t postfix for types and the type is gonna be used for a different type of callback too. * Add @ARG to function comments. * Drop unnecessary and unaligned indentation from percpu_ref_init() function comment. Signed-off-by: Tejun Heo <tj@kernel.org> Acked-by: Kent Overstreet <koverstreet@google.com>
2013-06-12lib/mpi/mpicoder.c: looping issue, need stop when equal to zero, found by ↵Chen Gang
'EXTRA_FLAGS=-W'. For 'while' looping, need stop when 'nbytes == 0', or will cause issue. ('nbytes' is size_t which is always bigger or equal than zero). The related warning: (with EXTRA_CFLAGS=-W) lib/mpi/mpicoder.c:40:2: warning: comparison of unsigned expression >= 0 is always true [-Wtype-limits] Signed-off-by: Chen Gang <gang.chen@asianux.com> Cc: Rusty Russell <rusty@rustcorp.com.au> Cc: David Howells <dhowells@redhat.com> Cc: James Morris <james.l.morris@oracle.com> Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Acked-by: David Howells <dhowells@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-06-07kobject: sanitize argument for format stringKees Cook
Unlike kobject_set_name(), the kset_create_and_add() interface does not provide a way to use format strings, so make sure that the interface cannot be abused accidentally. It looks like all current callers use static strings, so there's no existing flaw. Signed-off-by: Kees Cook <keescook@chromium.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-06-05net: core: move mac_pton() to lib/net_utils.cAndy Shevchenko
Since we have at least one user of this function outside of CONFIG_NET scope, we have to provide this function independently. The proposed solution is to move it under lib/net_utils.c with corresponding configuration variable and select wherever it is needed. Signed-off-by: Andy Shevchenko <andy.shevchenko@gmail.com> Reported-by: Arnd Bergmann <arnd@arndb.de> Acked-by: David S. Miller <davem@davemloft.net> Acked-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-06-05crypto: crct10dif - Use PTR_RETHerbert Xu
lib/crc-t10dif.c:42:1-3: WARNING: PTR_RET can be used Use PTR_RET rather than if(IS_ERR(...)) + PTR_ERR Generated by: coccinelle/api/ptr_ret.cocci Reported-by: Fengguang Wu <fengguang.wu@intel.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2013-06-03percpu-refcount: Don't use silly cmpxchg()Kent Overstreet
The cmpxchg() was just to ensure the debug check didn't race, which was a bit excessive. The caller is supposed to do the appropriate synchronization, which means percpu_ref_kill() can just do a simple store. Signed-off-by: Kent Overstreet <koverstreet@google.com> Signed-off-by: Tejun Heo <tj@kernel.org>
2013-06-03percpu: implement generic percpu refcountingKent Overstreet
This implements a refcount with similar semantics to atomic_get()/atomic_dec_and_test() - but percpu. It also implements two stage shutdown, as we need it to tear down the percpu counts. Before dropping the initial refcount, you must call percpu_ref_kill(); this puts the refcount in "shutting down mode" and switches back to a single atomic refcount with the appropriate barriers (synchronize_rcu()). It's also legal to call percpu_ref_kill() multiple times - it only returns true once, so callers don't have to reimplement shutdown synchronization. [akpm@linux-foundation.org: fix build] [akpm@linux-foundation.org: coding-style tweak] Signed-off-by: Kent Overstreet <koverstreet@google.com> Cc: Zach Brown <zab@redhat.com> Cc: Felipe Balbi <balbi@ti.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Mark Fasheh <mfasheh@suse.com> Cc: Joel Becker <jlbec@evilplan.org> Cc: Rusty Russell <rusty@rustcorp.com.au> Cc: Jens Axboe <axboe@kernel.dk> Cc: Asai Thambi S P <asamymuthupa@micron.com> Cc: Selvan Mani <smani@micron.com> Cc: Sam Bradshaw <sbradshaw@micron.com> Cc: Jeff Moyer <jmoyer@redhat.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Benjamin LaHaise <bcrl@kvack.org> Cc: Tejun Heo <tj@kernel.org> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Christoph Lameter <cl@linux-foundation.org> Cc: Ingo Molnar <mingo@redhat.com> Reviewed-by: "Theodore Ts'o" <tytso@mit.edu> Signed-off-by: Tejun Heo <tj@kernel.org>
2013-06-03debugfs: add get/set for atomic typesSeth Jennings
debugfs currently lack the ability to create attributes that set/get atomic_t values. This patch adds support for this through a new debugfs_create_atomic_t() function. Signed-off-by: Seth Jennings <sjenning@linux.vnet.ibm.com> Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Acked-by: Mel Gorman <mgorman@suse.de> Acked-by: Rik van Riel <riel@redhat.com> Acked-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-05-29sprintf: hex_string(): fix commentSteven Rostedt
hex_string() had a typo in a comment. Signed-off-by: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Jiri Kosina <jkosina@suse.cz>
2013-05-24MPILIB: disable usage of floating point registers on pariscHelge Deller
The umul_ppmm() macro for parisc uses the xmpyu assembler statement which does calculation via a floating point register. But usage of floating point registers inside the Linux kernel are not allowed and gcc will stop compilation due to the -mdisable-fpregs compiler option. Fix this by disabling the umul_ppmm() and udiv_qrnnd() macros. The mpilib will then use the generic built-in implementations instead. Signed-off-by: Helge Deller <deller@gmx.de>
2013-05-23Merge tag 'driver-core-3.10-rc2' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core Pull driver core fixes from Greg Kroah-Hartman: "Here are 3 tiny driver core fixes for 3.10-rc2. A needed symbol export, a change to make it easier to track down offending sysfs files with incorrect attributes, and a klist bugfix. All have been in linux-next for a while" * tag 'driver-core-3.10-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core: klist: del waiter from klist_remove_waiters before wakeup waitting process driver core: print sysfs attribute name when warning about bogus permissions driver core: export subsys_virtual_register
2013-05-23lib: make iovec obj instead of libRandy Dunlap
Fix build error io vmw_vmci.ko when CONFIG_VMWARE_VMCI=m by chaning iovec.o from lib-y to obj-y. ERROR: "memcpy_toiovec" [drivers/misc/vmw_vmci/vmw_vmci.ko] undefined! ERROR: "memcpy_fromiovec" [drivers/misc/vmw_vmci/vmw_vmci.ko] undefined! Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Acked-by: Rusty Russell <rusty@rustcorp.com.au> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-05-21klist: del waiter from klist_remove_waiters before wakeup waitting processwang, biao
There is a race between klist_remove and klist_release. klist_remove uses a local var waiter saved on stack. When klist_release calls wake_up_process(waiter->process) to wake up the waiter, waiter might run immediately and reuse the stack. Then, klist_release calls list_del(&waiter->list) to change previous wait data and cause prior waiter thread corrupt. The patch fixes it against kernel 3.9. Signed-off-by: wang, biao <biao.wang@intel.com> Acked-by: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-05-20crypto: crct10dif - Wrap crc_t10dif function all to use crypto transform ↵Tim Chen
framework When CRC T10 DIF is calculated using the crypto transform framework, we wrap the crc_t10dif function call to utilize it. This allows us to take advantage of any accelerated CRC T10 DIF transform that is plugged into the crypto framework. Signed-off-by: Tim Chen <tim.c.chen@linux.intel.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2013-05-20Hoist memcpy_fromiovec/memcpy_toiovec into lib/Rusty Russell
ERROR: "memcpy_fromiovec" [drivers/vhost/vhost_scsi.ko] undefined! That function is only present with CONFIG_NET. Turns out that crypto/algif_skcipher.c also uses that outside net, but it actually needs sockets anyway. In addition, commit 6d4f0139d642c45411a47879325891ce2a7c164a added CONFIG_NET dependency to CONFIG_VMCI for memcpy_toiovec, so hoist that function and revert that commit too. socket.h already includes uio.h, so no callers need updating; trying only broke things fo x86_64 randconfig (thanks Fengguang!). Reported-by: Randy Dunlap <rdunlap@infradead.org> Acked-by: David S. Miller <davem@davemloft.net> Acked-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2013-05-08Merge branch 'for-3.10/drivers' of git://git.kernel.dk/linux-blockLinus Torvalds
Pull block driver updates from Jens Axboe: "It might look big in volume, but when categorized, not a lot of drivers are touched. The pull request contains: - mtip32xx fixes from Micron. - A slew of drbd updates, this time in a nicer series. - bcache, a flash/ssd caching framework from Kent. - Fixes for cciss" * 'for-3.10/drivers' of git://git.kernel.dk/linux-block: (66 commits) bcache: Use bd_link_disk_holder() bcache: Allocator cleanup/fixes cciss: bug fix to prevent cciss from loading in kdump crash kernel cciss: add cciss_allow_hpsa module parameter drivers/block/mg_disk.c: add CONFIG_PM_SLEEP to suspend/resume functions mtip32xx: Workaround for unaligned writes bcache: Make sure blocksize isn't smaller than device blocksize bcache: Fix merge_bvec_fn usage for when it modifies the bvm bcache: Correctly check against BIO_MAX_PAGES bcache: Hack around stuff that clones up to bi_max_vecs bcache: Set ra_pages based on backing device's ra_pages bcache: Take data offset from the bdev superblock. mtip32xx: mtip32xx: Disable TRIM support mtip32xx: fix a smatch warning bcache: Disable broken btree fuzz tester bcache: Fix a format string overflow bcache: Fix a minor memory leak on device teardown bcache: Documentation updates bcache: Use WARN_ONCE() instead of __WARN() bcache: Add missing #include <linux/prefetch.h> ...
2013-05-07rwsem: check counter to avoid cmpxchg callsDavidlohr Bueso
This patch tries to reduce the amount of cmpxchg calls in the writer failed path by checking the counter value first before issuing the instruction. If ->count is not set to RWSEM_WAITING_BIAS then there is no point wasting a cmpxchg call. Furthermore, Michel states "I suppose it helps due to the case where someone else steals the lock while we're trying to acquire sem->wait_lock." Two very different workloads and machines were used to see how this patch improves throughput: pgbench on a quad-core laptop and aim7 on a large 8 socket box with 80 cores. Some results comparing Michel's fast-path write lock stealing (tps-rwsem) on a quad-core laptop running pgbench: | db_size | clients | tps-rwsem | tps-patch | +---------+----------+----------------+--------------+ | 160 MB | 1 | 6906 | 9153 | + 32.5 | 160 MB | 2 | 15931 | 22487 | + 41.1% | 160 MB | 4 | 33021 | 32503 | | 160 MB | 8 | 34626 | 34695 | | 160 MB | 16 | 33098 | 34003 | | 160 MB | 20 | 31343 | 31440 | | 160 MB | 30 | 28961 | 28987 | | 160 MB | 40 | 26902 | 26970 | | 160 MB | 50 | 25760 | 25810 | ------------------------------------------------------ | 1.6 GB | 1 | 7729 | 7537 | | 1.6 GB | 2 | 19009 | 23508 | + 23.7% | 1.6 GB | 4 | 33185 | 32666 | | 1.6 GB | 8 | 34550 | 34318 | | 1.6 GB | 16 | 33079 | 32689 | | 1.6 GB | 20 | 31494 | 31702 | | 1.6 GB | 30 | 28535 | 28755 | | 1.6 GB | 40 | 27054 | 27017 | | 1.6 GB | 50 | 25591 | 25560 | ------------------------------------------------------ | 7.6 GB | 1 | 6224 | 7469 | + 20.0% | 7.6 GB | 2 | 13611 | 12778 | | 7.6 GB | 4 | 33108 | 32927 | | 7.6 GB | 8 | 34712 | 34878 | | 7.6 GB | 16 | 32895 | 33003 | | 7.6 GB | 20 | 31689 | 31974 | | 7.6 GB | 30 | 29003 | 28806 | | 7.6 GB | 40 | 26683 | 26976 | | 7.6 GB | 50 | 25925 | 25652 | ------------------------------------------------------ For the aim7 worloads, they overall improved on top of Michel's patchset. For full graphs on how the rwsem series plus this patch behaves on a large 8 socket machine against a vanilla kernel: http://stgolabs.net/rwsem-aim7-results.tar.gz Signed-off-by: Davidlohr Bueso <davidlohr.bueso@hp.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-05-07kref: minor cleanupAnatol Pomozov
- make warning smp-safe - result of atomic _unless_zero functions should be checked by caller to avoid use-after-free error - trivial whitespace fix. Link: https://lkml.org/lkml/2013/4/12/391 Tested: compile x86, boot machine and run xfstests Signed-off-by: Anatol Pomozov <anatol.pomozov@gmail.com> [ Removed line-break, changed to use WARN_ON_ONCE() - Linus ] Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-05-07Merge branch 'rwsem-optimizations'Linus Torvalds
Merge rwsem optimizations from Michel Lespinasse: "These patches extend Alex Shi's work (which added write lock stealing on the rwsem slow path) in order to provide rwsem write lock stealing on the fast path (that is, without taking the rwsem's wait_lock). I have unfortunately been unable to push this through -next before due to Ingo Molnar / David Howells / Peter Zijlstra being busy with other things. However, this has gotten some attention from Rik van Riel and Davidlohr Bueso who both commented that they felt this was ready for v3.10, and Ingo Molnar has said that he was OK with me pushing directly to you. So, here goes :) Davidlohr got the following test results from pgbench running on a quad-core laptop: | db_size | clients | tps-vanilla | tps-rwsem | +---------+----------+----------------+--------------+ | 160 MB | 1 | 5803 | 6906 | + 19.0% | 160 MB | 2 | 13092 | 15931 | | 160 MB | 4 | 29412 | 33021 | | 160 MB | 8 | 32448 | 34626 | | 160 MB | 16 | 32758 | 33098 | | 160 MB | 20 | 26940 | 31343 | + 16.3% | 160 MB | 30 | 25147 | 28961 | | 160 MB | 40 | 25484 | 26902 | | 160 MB | 50 | 24528 | 25760 | ------------------------------------------------------ | 1.6 GB | 1 | 5733 | 7729 | + 34.8% | 1.6 GB | 2 | 9411 | 19009 | + 101.9% | 1.6 GB | 4 | 31818 | 33185 | | 1.6 GB | 8 | 33700 | 34550 | | 1.6 GB | 16 | 32751 | 33079 | | 1.6 GB | 20 | 30919 | 31494 | | 1.6 GB | 30 | 28540 | 28535 | | 1.6 GB | 40 | 26380 | 27054 | | 1.6 GB | 50 | 25241 | 25591 | ------------------------------------------------------ | 7.6 GB | 1 | 5779 | 6224 | | 7.6 GB | 2 | 10897 | 13611 | + 24.9% | 7.6 GB | 4 | 32683 | 33108 | | 7.6 GB | 8 | 33968 | 34712 | | 7.6 GB | 16 | 32287 | 32895 | | 7.6 GB | 20 | 27770 | 31689 | + 14.1% | 7.6 GB | 30 | 26739 | 29003 | | 7.6 GB | 40 | 24901 | 26683 | | 7.6 GB | 50 | 17115 | 25925 | + 51.5% ------------------------------------------------------ (Davidlohr also has one additional patch which further improves throughput, though I will ask him to send it directly to you as I have suggested some minor changes)." * emailed patches from Michel Lespinasse <walken@google.com>: rwsem: no need for explicit signed longs x86 rwsem: avoid taking slow path when stealing write lock rwsem: do not block readers at head of queue if other readers are active rwsem: implement support for write lock stealing on the fastpath rwsem: simplify __rwsem_do_wake rwsem: skip initial trylock in rwsem_down_write_failed rwsem: avoid taking wait_lock in rwsem_down_write_failed rwsem: use cmpxchg for trying to steal write lock rwsem: more agressive lock stealing in rwsem_down_write_failed rwsem: simplify rwsem_down_write_failed rwsem: simplify rwsem_down_read_failed rwsem: move rwsem_down_failed_common code into rwsem_down_{read,write}_failed rwsem: shorter spinlocked section in rwsem_down_failed_common() rwsem: make the waiter type an enumeration rather than a bitmask
2013-05-07rwsem: no need for explicit signed longsDavidlohr Bueso
Change explicit "signed long" declarations into plain "long" as suggested by Peter Hurley. Signed-off-by: Davidlohr Bueso <davidlohr.bueso@hp.com> Reviewed-by: Michel Lespinasse <walken@google.com> Signed-off-by: Michel Lespinasse <walken@google.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-05-07rwsem: do not block readers at head of queue if other readers are activeMichel Lespinasse
This change fixes a race condition where a reader might determine it needs to block, but by the time it acquires the wait_lock the rwsem has active readers and no queued waiters. In this situation the reader can run in parallel with the existing active readers; it does not need to block until the active readers complete. Thanks to Peter Hurley for noticing this possible race. Signed-off-by: Michel Lespinasse <walken@google.com> Reviewed-by: Peter Hurley <peter@hurleysoftware.com> Acked-by: Davidlohr Bueso <davidlohr.bueso@hp.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-05-07rwsem: implement support for write lock stealing on the fastpathMichel Lespinasse
When we decide to wake up readers, we must first grant them as many read locks as necessary, and then actually wake up all these readers. But in order to know how many read shares to grant, we must first count the readers at the head of the queue. This might take a while if there are many readers, and we want to be protected against a writer stealing the lock while we're counting. To that end, we grant the first reader lock before counting how many more readers are queued. We also require some adjustments to the wake_type semantics. RWSEM_WAKE_NO_ACTIVE used to mean that we had found the count to be RWSEM_WAITING_BIAS, in which case the rwsem was known to be free as nobody could steal it while we hold the wait_lock. This doesn't make sense once we implement fastpath write lock stealing, so we now use RWSEM_WAKE_ANY in that case. Similarly, when rwsem_down_write_failed found that a read lock was active, it would use RWSEM_WAKE_READ_OWNED which signalled that new readers could be woken without checking first that the rwsem was available. We can't do that anymore since the existing readers might release their read locks, and a writer could steal the lock before we wake up additional readers. So, we have to use a new RWSEM_WAKE_READERS value to indicate we only want to wake readers, but we don't currently hold any read lock. Signed-off-by: Michel Lespinasse <walken@google.com> Reviewed-by: Peter Hurley <peter@hurleysoftware.com> Acked-by: Davidlohr Bueso <davidlohr.bueso@hp.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-05-07rwsem: simplify __rwsem_do_wakeMichel Lespinasse
This is mostly for cleanup value: - We don't need several gotos to handle the case where the first waiter is a writer. Two simple tests will do (and generate very similar code). - In the remainder of the function, we know the first waiter is a reader, so we don't have to double check that. We can use do..while loops to iterate over the readers to wake (generates slightly better code). Signed-off-by: Michel Lespinasse <walken@google.com> Reviewed-by: Peter Hurley <peter@hurleysoftware.com> Acked-by: Davidlohr Bueso <davidlohr.bueso@hp.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-05-07rwsem: skip initial trylock in rwsem_down_write_failedMichel Lespinasse
We can skip the initial trylock in rwsem_down_write_failed() if there are known active lockers already, thus saving one likely-to-fail cmpxchg. Signed-off-by: Michel Lespinasse <walken@google.com> Reviewed-by: Peter Hurley <peter@hurleysoftware.com> Acked-by: Davidlohr Bueso <davidlohr.bueso@hp.com> Acked-by: Rik van Riel <riel@redhat.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-05-07rwsem: avoid taking wait_lock in rwsem_down_write_failedMichel Lespinasse
In rwsem_down_write_failed(), if there are active locks after we wake up (i.e. the lock got stolen from us), skip taking the wait_lock and go back to sleep immediately. Signed-off-by: Michel Lespinasse <walken@google.com> Reviewed-by: Peter Hurley <peter@hurleysoftware.com> Acked-by: Davidlohr Bueso <davidlohr.bueso@hp.com> Acked-by: Rik van Riel <riel@redhat.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>