Age | Commit message (Collapse) | Author |
|
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
|
|
|
|
The "strictlimit" feature was introduced to enforce per-bdi dirty limits
for FUSE which sets bdi max_ratio to 1% by default:
http://article.gmane.org/gmane.linux.kernel.mm/105809
However the feature can be useful for other relatively slow or untrusted
BDIs like USB flash drives and DVD+RW. The patch adds a knob to enable
the feature:
echo 1 > /sys/class/bdi/X:Y/strictlimit
Being enabled, the feature enforces bdi max_ratio limit even if global
(10%) dirty limit is not reached. Of course, the effect is not visible
until /sys/class/bdi/X:Y/max_ratio is decreased to some reasonable value.
Signed-off-by: Maxim Patlasov <MPatlasov@parallels.com>
Cc: Henrique de Moraes Holschuh <hmh@hmh.eng.br>
Cc: Theodore Ts'o <tytso@mit.edu>
Cc: "Artem S. Tashkinov" <t.artem@lycos.com>
Cc: Mel Gorman <mel@csn.ul.ie>
Cc: Jan Kara <jack@suse.cz>
Cc: Wu Fengguang <fengguang.wu@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
788257d6101d9 ("ufs: remove the BKL") replaced BKL with mutex protection
using functions lock_ufs, unlock_ufs and struct mutex 'mutex' in sb_info.
b6963327e052 ("ufs: drop lock/unlock super") removed lock/unlock super and
added struct mutex 's_lock' in sb_info.
Those 2 mutexes are generally locked/unlocked at the same time except in
allocation (balloc, ialloc).
This patch merges the 2 mutexes and propagates first commit solution. It
also adds mutex destruction before kfree during ufs_fill_super failure and
ufs_put_super.
Signed-off-by: Fabian Frederick <fabf@skynet.be>
Cc: Evgeniy Dushistov <dushistov@mail.ru>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
Pointer 'usb3' to struct ufs_super_block_third acquired via
ubh_get_usb_third() is never used in function
ufs_read_cylinder_structures(= ). Thus remove it.
Detected by Coverity: CID 139939.
Signed-off-by: Christian Engelmayer <cengelma@gmx.at>
Cc: Evgeniy Dushistov <dushistov@mail.ru>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
Pointer 'usb2' to struct ufs_super_block_second acquired via
ubh_get_usb_second() is never used in function ufs_statfs(). Thus remove
it.
Detected by Coverity: CID 139940.
Signed-off-by: Christian Engelmayer <cengelma@gmx.at>
Cc: Evgeniy Dushistov <dushistov@mail.ru>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
Remove occurences of unused pointers to struct ufs_super_block_first that
w= ere acquired via ubh_get_usb_first().
Detected by Coverity: CID 139929 - CID 139936, CID 139940.
Signed-off-by: Christian Engelmayer <cengelma@gmx.at>
Cc: Evgeniy Dushistov <dushistov@mail.ru>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
init_inodecache is only called by __init init_ufs_fs.
Signed-off-by: Fabian Frederick <fabf@skynet.be>
Cc: Evgeniy Dushistov <dushistov@mail.ru>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
Add description of early_ioremap_debug kernel parameter.
Signed-off-by: Mark Salter <msalter@redhat.com>
Cc: Borislav Petkov <borislav.petkov@amd.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Dave Young <dyoung@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
Add support for early IO or memory mappings which are needed before the
normal ioremap() is usable. This also adds fixmap support for permanent
fixed mappings such as that used by the earlyprintk device register
region.
Signed-off-by: Mark Salter <msalter@redhat.com>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Cc: Borislav Petkov <borislav.petkov@amd.com>
Cc: Dave Young <dyoung@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
Presently, paging_init() calls init_mem_pgprot() to initialize pgprot
values used by macros such as PAGE_KERNEL, PAGE_KERNEL_EXEC, etc. The new
fixmap and early_ioremap support also needs to use these macros before
paging_init() is called. This patch moves the init_mem_pgprot() call out
of paging_init() and into setup_arch() so that pgprot_default gets
initialized in time for fixmap and early_ioremap.
Signed-off-by: Mark Salter <msalter@redhat.com>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Borislav Petkov <borislav.petkov@amd.com>
Cc: Dave Young <dyoung@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
Move x86 over to the generic early ioremap implementation.
Signed-off-by: Mark Salter <msalter@redhat.com>
Acked-by: H. Peter Anvin <hpa@zytor.com>
Cc: Borislav Petkov <borislav.petkov@amd.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Dave Young <dyoung@redhat.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
This patch creates a generic implementation of early_ioremap() support
based on the existing x86 implementation. early_ioremp() is useful for
early boot code which needs to temporarily map I/O or memory regions
before normal mapping functions such as ioremap() are available.
Some architectures have optional MMU. In the no-MMU case, the remap
functions simply return the passed in physical address and the unmap
functions do nothing.
Signed-off-by: Mark Salter <msalter@redhat.com>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Acked-by: H. Peter Anvin <hpa@zytor.com>
Cc: Borislav Petkov <borislav.petkov@amd.com>
Cc: Dave Young <dyoung@redhat.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
This patch series takes the common bits from the x86 early ioremap
implementation and creates a generic implementation which may be used by
other architectures. The early ioremap interfaces are intended for
situations where boot code needs to make temporary virtual mappings before
the normal ioremap interfaces are available. Typically, this means before
paging_init() has run.
This patch (of 6):
There's a lot of sparse warnings for code like below: void *a =
early_memremap(phys_addr, size);
early_memremap intend to map kernel memory with ioremap facility, the
return pointer should be a kernel ram pointer instead of iomem one.
For making the function clearer and supressing sparse warnings this patch
do below two things:
1. cast to (__force void *) for the return value of early_memremap
2. add early_memunmap function and pass (__force void __iomem *) to iounmap
From Boris:
: Ingo told me yesterday, it makes sense too. I'd guess we can try it.
: FWIW, all callers of early_memremap use the memory they get remapped as
: normal memory so we should be safe.
Signed-off-by: Dave Young <dyoung@redhat.com>
Signed-off-by: Mark Salter <msalter@redhat.com>
Acked-by: H. Peter Anvin <hpa@zytor.com>
Cc: Borislav Petkov <borislav.petkov@amd.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
When the system has only one CPU, lglock is effectively a spinlock; map it
directly to spinlock to eliminate the indirection and duplicate code.
In addition to removing overhead, this drops 1.6k of code with a defconfig
modified to have !CONFIG_SMP, and 1.1k with a minimal config.
Signed-off-by: Josh Triplett <josh@joshtriplett.org>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Michal Marek <mmarek@suse.cz>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: David Howells <dhowells@redhat.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Nick Piggin <npiggin@kernel.dk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
WARNING: storage class should be at the beginning of the declaration
#25: FILE: lib/smp_processor_id.c:10:
+notrace static unsigned int check_preemption_disabled(const char *what1,
WARNING: Prefer [subsystem eg: netdev]_err([subsystem]dev, ... then dev_err(dev, ... then pr_err(... to printk(KERN_ERR ...
#36: FILE: lib/smp_processor_id.c:42:
+ printk(KERN_ERR "BUG: using %s%s() in preemptible [%08x] code: %s/%d\n",
ERROR: space required after that ',' (ctx:VxV)
#46: FILE: lib/smp_processor_id.c:56:
+ return check_preemption_disabled("smp_processor_id","");
^
total: 1 errors, 2 warnings, 36 lines checked
./patches/percpu-add-preemption-checks-to-__this_cpu-ops-fix.patch has style problems, please review.
If any of these errors are false positives, please report
them to the maintainer, see CHECKPATCH in MAINTAINERS.
Please run checkpatch prior to sending patches
Cc: Christoph Lameter <cl@linux.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
snprintf can cause hangs. Move the string processing into the function so
that the string operations only occur when necessary after the conditions
have been checked.
Signed-off-by: Christoph Lameter <cl@linux.com>
Reported-by: Grygorii Strashko <grygorii.strashko@ti.com>
Tested-by: Grygorii Strashko <grygorii.strashko@ti.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
We define a check function in order to avoid trouble with the include
files. Then the higher level __this_cpu macros are modified to invoke the
preemption check.
Signed-off-by: Christoph Lameter <cl@linux.com>
Acked-by: Ingo Molnar <mingo@kernel.org>
Cc: Tejun Heo <tj@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
The RT_CACHE_STAT_INC macro triggers the new preemption checks
for __this_cpu ops.
I do not see any other synchronization that would allow the use
of a __this_cpu operation here however in commit
dbd2915ce87e811165da0717f8e159276ebb803e Andrew justifies
the use of raw_smp_processor_id() here because "we do not care"
about races. In the past we agreed that the price of disabling
interrupts here to get consistent counters would be too high.
These counters may be inaccurate due to race conditions.
The use of __this_cpu op improves the situation already from what commit
dbd2915ce87e811165da0717f8e159276ebb803e did since the single instruction
emitted on x86 does not allow the race to occur anymore. However,
non x86 platforms could still experience a race here.
Trace:
[ 1277.189084] __this_cpu_add operation in preemptible [00000000] code: avahi-daemon/1193
[ 1277.189085] caller is __this_cpu_preempt_check+0x38/0x60
[ 1277.189086] CPU: 1 PID: 1193 Comm: avahi-daemon Tainted: GF 3.12.0-rc4+ #187
[ 1277.189087] Hardware name: FUJITSU CELSIUS W530 Power/D3227-A1, BIOS V4.6.5.4 R1.10.0 for D3227-A1x 09/16/2013
[ 1277.189088] 0000000000000001 ffff8807ef78fa00 ffffffff816d5a57 ffff8807ef78ffd8
[ 1277.189089] ffff8807ef78fa30 ffffffff8137359c ffff8807ef78fba0 ffff88079f822b40
[ 1277.189091] 0000000020000000 ffff8807ee32c800 ffff8807ef78fa70 ffffffff813735f8
[ 1277.189093] Call Trace:
[ 1277.189094] [<ffffffff816d5a57>] dump_stack+0x4e/0x82
[ 1277.189096] [<ffffffff8137359c>] check_preemption_disabled+0xec/0x110
[ 1277.189097] [<ffffffff813735f8>] __this_cpu_preempt_check+0x38/0x60
[ 1277.189098] [<ffffffff81610d65>] __ip_route_output_key+0x575/0x8c0
[ 1277.189100] [<ffffffff816110d7>] ip_route_output_flow+0x27/0x70
[ 1277.189101] [<ffffffff81616c80>] ? ip_copy_metadata+0x1a0/0x1a0
[ 1277.189102] [<ffffffff81640b15>] udp_sendmsg+0x825/0xa20
[ 1277.189104] [<ffffffff811b4aa9>] ? do_sys_poll+0x449/0x5d0
[ 1277.189105] [<ffffffff8164c695>] inet_sendmsg+0x85/0xc0
[ 1277.189106] [<ffffffff815c6e3c>] sock_sendmsg+0x9c/0xd0
[ 1277.189108] [<ffffffff813735f8>] ? __this_cpu_preempt_check+0x38/0x60
[ 1277.189109] [<ffffffff815c7550>] ? move_addr_to_kernel+0x40/0xa0
[ 1277.189111] [<ffffffff815c71ec>] ___sys_sendmsg+0x37c/0x390
[ 1277.189112] [<ffffffff8136613a>] ? string.isra.3+0x3a/0xd0
[ 1277.189113] [<ffffffff8136613a>] ? string.isra.3+0x3a/0xd0
[ 1277.189115] [<ffffffff81367b54>] ? vsnprintf+0x364/0x650
[ 1277.189116] [<ffffffff81367ee9>] ? snprintf+0x39/0x40
[ 1277.189118] [<ffffffff813735f8>] ? __this_cpu_preempt_check+0x38/0x60
[ 1277.189119] [<ffffffff815c7ff9>] __sys_sendmsg+0x49/0x90
[ 1277.189121] [<ffffffff815c8052>] SyS_sendmsg+0x12/0x20
[ 1277.189122] [<ffffffff816e4fd3>] tracesys+0xe1/0xe6
Signed-off-by: Christoph Lameter <cl@linux.com>
Acked-by: David S. Miller <davem@davemloft.net>
Acked-by: Ingo Molnar <mingo@kernel.org>
Cc: Eric Dumazet <edumazet@google.com>
Cc: Tejun Heo <tj@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
The initialization of a structure is not subject to synchronization. The
use of __this_cpu would trigger a false positive with the additional
preemption checks for __this_cpu ops.
So simply disable the check through the use of raw_cpu ops.
Trace:
[ 0.668066] __this_cpu_write operation in preemptible [00000000] code: modprobe/286
[ 0.668108] caller is __this_cpu_preempt_check+0x38/0x60
[ 0.668111] CPU: 3 PID: 286 Comm: modprobe Tainted: GF 3.12.0-rc4+ #187
[ 0.668112] Hardware name: FUJITSU CELSIUS W530 Power/D3227-A1, BIOS V4.6.5.4 R1.10.0 for D3227-A1x 09/16/2013
[ 0.668113] 0000000000000003 ffff8807edda1d18 ffffffff816d5a57 ffff8807edda1fd8
[ 0.668117] ffff8807edda1d48 ffffffff8137359c ffff8807edda1ef8 ffffffffa0002178
[ 0.668121] ffffc90000067730 ffff8807edda1e48 ffff8807edda1d88 ffffffff813735f8
[ 0.668124] Call Trace:
[ 0.668129] [<ffffffff816d5a57>] dump_stack+0x4e/0x82
[ 0.668132] [<ffffffff8137359c>] check_preemption_disabled+0xec/0x110
[ 0.668135] [<ffffffff813735f8>] __this_cpu_preempt_check+0x38/0x60
[ 0.668139] [<ffffffff810c24fd>] load_module+0xcfd/0x2650
[ 0.668143] [<ffffffff816dd922>] ? page_fault+0x22/0x30
[ 0.668146] [<ffffffff810c3ef6>] SyS_init_module+0xa6/0xd0
[ 0.668150] [<ffffffff816e4fd3>] tracesys+0xe1/0xe6
Signed-off-by: Christoph Lameter <cl@linux.com>
Acked-by: Ingo Molnar <mingo@kernel.org>
Acked-by: Rusty Russell <rusty@rustcorp.com.au>
Cc: Tejun Heo <tj@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
With the preempt checking logic for __this_cpu_ops we will get false
positives from locations in the code that use numa_node_id.
Before the __this_cpu ops where introduced there were
no checks for preemption present either. smp_raw_processor_id()
was used. See http://www.spinics.net/lists/linux-numa/msg00641.html
Therefore we need to use raw_cpu_read here to avoid false postives.
Note that this issue has been discussed in prior years.
If the process changes nodes after retrieving the current numa node then
that is acceptable since most uses of numa_node etc are for optimization
and not for correctness.
There were suggestions to implement a raw_numa_node_id in order to
do preempt checks for numa_node_id as well. But I think we better
defer that to another patch since that would mean investigating
how numa_node_id() is used throughout the kernel which would increase
the scope of this patchset significantly. After all preemption was never
checked before when numa_node_id() was used.
Some sample traces:
__this_cpu_read operation in preemptible [00000000] code: login/1456
caller is __this_cpu_preempt_check+0x2b/0x2d
CPU: 0 PID: 1456 Comm: login Not tainted 3.12.0-rc4-cl-00062-g2fe80d3-dirty #185
Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
000000000000013c ffff88001f31ba58 ffffffff8147cf5e ffff88001f31bfd8
ffff88001f31ba88 ffffffff8127eea9 0000000000000000 ffff88001f3975c0
00000000f7707000 ffff88001f3975c0 ffff88001f31bac0 ffffffff8127eeef
Call Trace:
[<ffffffff8147cf5e>] dump_stack+0x4e/0x82
[<ffffffff8127eea9>] check_preemption_disabled+0xc5/0xe0
[<ffffffff8127eeef>] __this_cpu_preempt_check+0x2b/0x2d
[<ffffffff81030ff5>] ? show_stack+0x3b/0x3d
[<ffffffff810ebee3>] get_task_policy+0x1d/0x49
[<ffffffff810ed705>] get_vma_policy+0x14/0x76
[<ffffffff810ed8ff>] alloc_pages_vma+0x35/0xff
[<ffffffff810dad97>] handle_mm_fault+0x290/0x73b
[<ffffffff810503da>] __do_page_fault+0x3fe/0x44d
[<ffffffff8109b360>] ? trace_hardirqs_on_caller+0x142/0x19e
[<ffffffff8109b3c9>] ? trace_hardirqs_on+0xd/0xf
[<ffffffff81278bed>] ? trace_hardirqs_off_thunk+0x3a/0x3c
[<ffffffff810be97f>] ? find_get_pages_contig+0x18e/0x18e
[<ffffffff810be97f>] ? find_get_pages_contig+0x18e/0x18e
[<ffffffff81050451>] do_page_fault+0x9/0xc
[<ffffffff81483602>] page_fault+0x22/0x30
[<ffffffff810be97f>] ? find_get_pages_contig+0x18e/0x18e
[<ffffffff810be97f>] ? find_get_pages_contig+0x18e/0x18e
[<ffffffff810be4c3>] ? file_read_actor+0x3a/0x15a
[<ffffffff810be97f>] ? find_get_pages_contig+0x18e/0x18e
[<ffffffff810bffab>] generic_file_aio_read+0x38e/0x624
[<ffffffff810f6d69>] do_sync_read+0x54/0x73
[<ffffffff810f7890>] vfs_read+0x9d/0x12a
[<ffffffff810f7a59>] SyS_read+0x47/0x7e
[<ffffffff81484f21>] cstar_dispatch+0x7/0x23
caller is __this_cpu_preempt_check+0x2b/0x2d
CPU: 0 PID: 1456 Comm: login Not tainted 3.12.0-rc4-cl-00062-g2fe80d3-dirty #185
Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
00000000000000e8 ffff88001f31bbf8 ffffffff8147cf5e ffff88001f31bfd8
ffff88001f31bc28 ffffffff8127eea9 ffffffff823c5c40 00000000000213da
0000000000000000 0000000000000000 ffff88001f31bc60 ffffffff8127eeef
Call Trace:
[<ffffffff8147cf5e>] dump_stack+0x4e/0x82
[<ffffffff8127eea9>] check_preemption_disabled+0xc5/0xe0
[<ffffffff8127eeef>] __this_cpu_preempt_check+0x2b/0x2d
[<ffffffff810e006e>] ? install_special_mapping+0x11/0xe4
[<ffffffff810ec8a8>] alloc_pages_current+0x8f/0xbc
[<ffffffff810bec6b>] __page_cache_alloc+0xb/0xd
[<ffffffff810c7e90>] __do_page_cache_readahead+0xf4/0x219
[<ffffffff810c7e0e>] ? __do_page_cache_readahead+0x72/0x219
[<ffffffff810c827c>] ra_submit+0x1c/0x20
[<ffffffff810c850c>] ondemand_readahead+0x28c/0x2b4
[<ffffffff810c85e9>] page_cache_sync_readahead+0x38/0x3a
[<ffffffff810bfe7e>] generic_file_aio_read+0x261/0x624
[<ffffffff810f6d69>] do_sync_read+0x54/0x73
[<ffffffff810f7890>] vfs_read+0x9d/0x12a
[<ffffffff810f7a59>] SyS_read+0x47/0x7e
[<ffffffff81484f21>] cstar_dispatch+0x7/0x23
Signed-off-by: Christoph Lameter <cl@linux.com>
Acked-by: Ingo Molnar <mingo@kernel.org>
Cc: Alex Shi <alex.shi@intel.com>
Cc: Tejun Heo <tj@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
The kernel has never been audited to ensure that this_cpu operations are
consistently used throughout the kernel. The code generated in many
places can be improved through the use of this_cpu operations (which uses
a segment register for relocation of per cpu offsets instead of performing
address calculations).
The patch set also addresses various consistency issues in general with
the per cpu macros.
A. The semantics of __this_cpu_ptr() differs from this_cpu_ptr only
because checks are skipped. This is typically shown through a raw_
prefix. So this patch set changes the places where __this_cpu_ptr()
is used to raw_cpu_ptr().
B. There has been the long term wish by some that __this_cpu operations
would check for preemption. However, there are cases where preemption
checks need to be skipped. This patch set adds raw_cpu operations that
do not check for preemption and then adds preemption checks to the
__this_cpu operations.
C. The use of __get_cpu_var is always a reference to a percpu variable
that can also be handled via a this_cpu operation. This patch set
replaces all uses of __get_cpu_var with this_cpu operations.
D. We can then use this_cpu RMW operations in various places replacing
sequences of instructions by a single one.
E. The use of this_cpu operations throughout will allow other arches than
x86 to implement optimized references and RMV operations to work with
per cpu local data.
F. The use of this_cpu operations opens up the possibility to
further optimize code that relies on synchronization through
per cpu data.
The patch set works in a couple of stages:
I. Patch 1 adds the additional raw_cpu operations and raw_cpu_ptr().
Also converts the existing __this_cpu_xx_# primitive in the x86
code to raw_cpu_xx_#.
II. Patch 2-4 use the raw_cpu operations in places that would give
us false positives once they are enabled.
III. Patch 5 adds preemption checks to __this_cpu operations to allow
checking if preemption is properly disabled when these functions
are used.
IV. Patches 6-20 are patches that simply replace uses of __get_cpu_var
with this_cpu_ptr. They do not depend on any changes to the percpu
code. No preemption tests are skipped if they are applied.
V. Patches 21-46 are conversion patches that use this_cpu operations
in various kernel subsystems/drivers or arch code.
VI. Patches 47/48 (not included in this series) remove no longer used
functions (__this_cpu_ptr and __get_cpu_var). These should only be
applied after all the conversion patches have made it and after we
have done additional passes through the kernel to ensure that none of
the uses of these functions remain.
This patch (of 46):
The patches following this one will add preemption checks to __this_cpu
ops so we need to have an alternative way to use this_cpu operations
without preemption checks.
raw_cpu_ops will be the basis for all other ops since these will be the
operations that do not implement any checks.
Primitive operations are renamed by this patch from __this_cpu_xxx to
raw_cpu_xxxx.
Also change the uses of the x86 percpu primitives in preempt.h.
These depend directly on asm/percpu.h (header #include nesting issue).
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Christoph Lameter <cl@linux.com>
Acked-by: Ingo Molnar <mingo@kernel.org>
Cc: Tejun Heo <tj@kernel.org>
Cc: "James E.J. Bottomley" <jejb@parisc-linux.org>
Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
Cc: Alex Shi <alex.shi@intel.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Bryan Wu <cooloney@gmail.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Chris Metcalf <cmetcalf@tilera.com>
Cc: Daniel Lezcano <daniel.lezcano@linaro.org>
Cc: David Daney <david.daney@cavium.com>
Cc: David Miller <davem@davemloft.net>
Cc: David S. Miller <davem@davemloft.net>
Cc: Dimitri Sivanich <sivanich@sgi.com>
Cc: Dipankar Sarma <dipankar@in.ibm.com>
Cc: Eric Dumazet <edumazet@google.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: H. Peter Anvin <hpa@linux.intel.com>
Cc: Haavard Skinnemoen <hskinnemoen@gmail.com>
Cc: Hans-Christian Egtvedt <egtvedt@samfundet.no>
Cc: Hedi Berriche <hedi@sgi.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Helge Deller <deller@gmx.de>
Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
Cc: James Hogan <james.hogan@imgtec.com>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: John Stultz <john.stultz@linaro.org>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Matt Turner <mattst88@gmail.com>
Cc: Mike Frysinger <vapier@gentoo.org>
Cc: Mike Travis <travis@sgi.com>
Cc: Neil Brown <neilb@suse.de>
Cc: Nicolas Pitre <nicolas.pitre@linaro.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Paul Mundt <lethal@linux-sh.org>
Cc: Rafael J. Wysocki <rjw@sisk.pl>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Richard Henderson <rth@twiddle.net>
Cc: Robert Richter <rric@kernel.org>
Cc: Russell King <linux@arm.linux.org.uk>
Cc: Russell King <rmk+kernel@arm.linux.org.uk>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Wim Van Sebroeck <wim@iguana.be>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
Since 4dcfa600 ("ARM: DMA-API: better handing of DMA masks for coherent
allocations") arm_dma_limit_pfn has almost substituted the arm_dma_limit.
The remaining user is dma_contiguous_reserve(). It is also referenced in
setup_dma_zone() to calculate arm_dma_limit_pfn.
Kill the global arm_dma_limit and equip setup_zone_dma with the local one.
Signed-off-by: Vladimir Murzin <murzin.v@gmail.com>
Reported-by: Vassili Karpov <av1474@comtv.ru>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
Currently, memsetting and kfreeing the device is bad behaviour. The
device will have a reference count of 1 and hence can cause trouble
because it has kfree'd. Proper way to handle a failed device_register is
to call put_device right after it fails.
Signed-off-by: Levente Kurusa <levex@linux.com>
Acked-by: Evgeniy Polyakov <zbr@ioremap.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
The failure paths of sysfs_slab_add don't release the allocation of 'name'
made by create_unique_id() a few lines above the context of the diff below.
Create a common exit path to make it more obvious what needs freeing.
[vdavydov@parallels.com: free the name only if !unmergeable]
Signed-off-by: Dave Jones <davej@fedoraproject.org>
Signed-off-by: Vladimir Davydov <vdavydov@parallels.com>
Cc: Pekka Enberg <penberg@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
Currently, we try to arrange sysfs entries for memcg caches in the same
manner as for global caches. Apart from turning /sys/kernel/slab into a
mess when there are a lot of kmem-active memcgs created, it actually does
not work properly - we won't create more than one link to a memcg cache in
case its parent is merged with another cache. For instance, if A is a
root cache merged with another root cache B, we will have the following
sysfs setup:
X
A -> X
B -> X
where X is some unique id (see create_unique_id()). Now if memcgs M and N
start to allocate from cache A (or B, which is the same), we will get:
X
X:M
X:N
A -> X
B -> X
A:M -> X:M
A:N -> X:N
Since B is an alias for A, we won't get entries B:M and B:N, which is
confusing.
It is more logical to have entries for memcg caches under the
corresponding root cache's sysfs directory. This would allow us to keep
sysfs layout clean, and avoid such inconsistencies like one described
above.
This patch does the trick. It creates a "cgroup" kset in each root cache
kobject to keep its children caches there.
Signed-off-by: Vladimir Davydov <vdavydov@parallels.com>
Cc: Michal Hocko <mhocko@suse.cz>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: David Rientjes <rientjes@google.com>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: Glauber Costa <glommer@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
Otherwise, kzalloc() called from a memcg won't clear the whole object.
Signed-off-by: Vladimir Davydov <vdavydov@parallels.com>
Cc: Michal Hocko <mhocko@suse.cz>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: David Rientjes <rientjes@google.com>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: Glauber Costa <glommer@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
Currently we destroy children caches at the very beginning of
kmem_cache_destroy(). This is wrong, because the root cache will not
necessarily be destroyed in the end - if it has aliases (refcount > 0),
kmem_cache_destroy() will simply decrement its refcount and return. In
this case, at best we will get a bunch of warnings in dmesg, like this
one:
kmem_cache_destroy kmalloc-32:0: Slab cache still has objects
CPU: 1 PID: 7139 Comm: modprobe Tainted: G B W 3.13.0+ #117
Hardware name:
ffff88007d7a6368 ffff880039b07e48 ffffffff8168cc2e ffff88007d7a6d68
ffff88007d7a6300 ffff880039b07e68 ffffffff81175e9f 0000000000000000
ffff88007d7a6300 ffff880039b07e98 ffffffff811b67c7 ffff88003e803c00
Call Trace:
[<ffffffff8168cc2e>] dump_stack+0x49/0x5b
[<ffffffff81175e9f>] kmem_cache_destroy+0xdf/0xf0
[<ffffffff811b67c7>] kmem_cache_destroy_memcg_children+0x97/0xc0
[<ffffffff81175dcf>] kmem_cache_destroy+0xf/0xf0
[<ffffffffa0130b21>] xfs_mru_cache_uninit+0x21/0x30 [xfs]
[<ffffffffa01893ea>] exit_xfs_fs+0x2e/0xc44 [xfs]
[<ffffffff810eeb58>] SyS_delete_module+0x198/0x1f0
[<ffffffff816994f9>] system_call_fastpath+0x16/0x1b
At worst - if kmem_cache_destroy() will race with an allocation from a
memcg cache - the kernel will panic.
This patch fixes this by moving children caches destruction after the
check if the cache has aliases. Plus, it forbids destroying a root cache
if it still has children caches, because each children cache keeps a
reference to its parent.
Signed-off-by: Vladimir Davydov <vdavydov@parallels.com>
Cc: Michal Hocko <mhocko@suse.cz>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: David Rientjes <rientjes@google.com>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: Glauber Costa <glommer@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
Currently, memcg_unregister_cache(), which deletes the cache being
destroyed from the memcg_slab_caches list, is called after
__kmem_cache_shutdown() (see kmem_cache_destroy()), which starts to
destroy the cache. As a result, one can access a partially destroyed
cache while traversing a memcg_slab_caches list, which can have deadly
consequences (for instance, cache_show() called for each cache on a
memcg_slab_caches list from mem_cgroup_slabinfo_read() will dereference
pointers to already freed data).
To fix this, let's move memcg_unregister_cache() before the cache
destruction process beginning, issuing memcg_register_cache() on failure.
Signed-off-by: Vladimir Davydov <vdavydov@parallels.com>
Cc: Michal Hocko <mhocko@suse.cz>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: David Rientjes <rientjes@google.com>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: Glauber Costa <glommer@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
Memcg-awareness turned kmem_cache_create() into a dirty interweaving of
memcg-only and except-for-memcg calls. To clean this up, let's move the
code responsible for memcg cache creation to a separate function.
Signed-off-by: Vladimir Davydov <vdavydov@parallels.com>
Cc: Michal Hocko <mhocko@suse.cz>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: David Rientjes <rientjes@google.com>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: Glauber Costa <glommer@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
This patch cleans up the memcg cache creation path as follows:
- Move memcg cache name creation to a separate function to be called
from kmem_cache_create_memcg(). This allows us to get rid of the mutex
protecting the temporary buffer used for the name formatting, because
the whole cache creation path is protected by the slab_mutex.
- Get rid of memcg_create_kmem_cache(). This function serves as a proxy
to kmem_cache_create_memcg(). After separating the cache name creation
path, it would be reduced to a function call, so let's inline it.
Signed-off-by: Vladimir Davydov <vdavydov@parallels.com>
Cc: Michal Hocko <mhocko@suse.cz>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: David Rientjes <rientjes@google.com>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: Glauber Costa <glommer@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
When a kmem cache is created (kmem_cache_create_memcg()), we first try to
find a compatible cache that already exists and can handle requests from
the new cache, i.e. has the same object size, alignment, ctor, etc. If
there is such a cache, we do not create any new caches, instead we simply
increment the refcount of the cache found and return it.
Currently we do this procedure not only when creating root caches, but
also for memcg caches. However, there is no point in that, because, as
every memcg cache has exactly the same parameters as its parent and cache
merging cannot be turned off in runtime (only on boot by passing
"slub_nomerge"), the root caches of any two potentially mergeable memcg
caches should be merged already, i.e. it must be the same root cache, and
therefore we couldn't even get to the memcg cache creation, because it
already exists.
The only exception is boot caches - they are explicitly forbidden to be
merged by setting their refcount to -1. There are currently only two of
them - kmem_cache and kmem_cache_node, which are used in slab internals (I
do not count kmalloc caches as their refcount is set to 1 immediately
after creation). Since they are prevented from merging preliminary I
guess we should avoid to merge their children too.
So let's remove the useless code responsible for merging memcg caches.
Signed-off-by: Vladimir Davydov <vdavydov@parallels.com>
Cc: Michal Hocko <mhocko@suse.cz>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: David Rientjes <rientjes@google.com>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: Glauber Costa <glommer@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
arch_align_stack() moved to asm/exec.h, so change the comment referring to
asm/system.h which no longer exists.
Signed-off-by: David Howells <dhowells@redhat.com>
Cc: Jeff Dike <jdike@addtoit.com>
Cc: Richard Weinberger <richard@nod.at>
Cc: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
Clean asm/system.h from docs as nothing should refer to that header anymore.
Signed-off-by: David Howells <dhowells@redhat.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
To increase compiler portability there is <linux/compiler.h> which
provides convenience macros for various gcc constructs. Eg: __weak for
__attribute__((weak)). I've replaced all instances of gcc attributes with
the right macro in the kernel subsystem.
Signed-off-by: Gideon Israel Dsouza <gidisrael@gmail.com>
Cc: "Rafael J. Wysocki" <rjw@sisk.pl>
Cc: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
If the renamed symbol is defined lib/iomap.c implements ioport_map and
ioport_unmap and currently (nearly) all platforms define the port accessor
functions outb/inb and friend unconditionally. So HAS_IOPORT_MAP is the
better name for this. Consequently NO_IOPORT is renamed to NO_IOPORT_MAP.
The motivation for this change is to reintroduce a symbol HAS_IOPORT that
signals if outb/int et al are available. I will address that at least one
merge window later though to keep surprises to a minimum and catch new
introductions of (HAS|NO)_IOPORT.
The changes in this commit were done using:
$ git grep -l -E '(NO|HAS)_IOPORT' | xargs perl -p -i -e 's/\b((?:CONFIG_)?(?:NO|HAS)_IOPORT)\b/$1_MAP/'
Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
Unbreak i386 allmodconfig.
This is a hack - please fix properly ;)
Cc: Fabian Vogt <fabian@ritter-vogt.de>
Cc: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
Conflicts:
fs/fs-writeback.c
|
|
Conflicts:
fs/compat.c
ipc/compat_mq.c
|
|
Conflicts:
arch/powerpc/include/asm/opal.h
arch/powerpc/platforms/powernv/opal-wrappers.S
|
|
|
|
|
|
|
|
Conflicts:
arch/x86/include/asm/archrandom.h
|
|
|
|
|
|
Conflicts:
arch/arm/Kconfig
|
|
|
|
Conflicts:
drivers/gpio/gpio-ich.c
|
|
|