author | Luis R. Rodriguez <mcgrof@suse.com> | 2014-07-23 09:11:52 +1000 |
---|---|---|
committer | Stephen Rothwell <sfr@canb.auug.org.au> | 2014-07-23 09:11:52 +1000 |
commit | 3ae563276eb9aed4b74e5c64cd3f1512cdc2900a (patch) | |
tree | 5e70640455997e03b09bbb964063a2d25bc8f95d /init/Kconfig | |
parent | bd4a70eec15ac8a6306554cc0dca675df80e4098 (diff) |
printk: allow increasing the ring buffer depending on the number of CPUs

The default size of the ring buffer is too small for machines with a large
number of CPUs under heavy load. What ends up happening when debugging is
that the ring buffer wraps around and overwrites old messages, making
debugging impossible unless the size is passed as a kernel parameter. An
idle system upon boot up will on average spew out only about one or two
extra lines, but where this really matters is under heavy load, and that
will vary widely depending on the system and environment.

There are mechanisms to help increase the kernel ring buffer for tracing
through debugfs, and those interfaces even allow growing the kernel ring
buffer per CPU. We also have a static value which can be passed upon
boot. Relying on debugfs however is not ideal for production, and the
value passed upon bootup can only be used *after* an issue has crept up.
Instead of being reactive, this adds a proactive measure which lets you
scale the amount of contributions you'd expect to the kernel ring buffer
under load by each CPU in the worst case scenario.

We use num_possible_cpus() to avoid the complexity of changing the ring
buffer size dynamically at run time: num_possible_cpus() gives us the upper
limit on the number of CPUs, so we avoid having to deal with CPUs being
hotplugged on and off. This introduces the kernel configuration option
LOG_CPU_MAX_BUF_SHIFT, which specifies the maximum contribution each CPU
is expected to make to the kernel ring buffer in the worst case before the
ring buffer wraps around. The size is specified as a power of 2. The total
contribution made by all CPUs must be greater than half of the default
kernel ring buffer size (1 << LOG_BUF_SHIFT bytes) in order to trigger an
increase upon bootup. The kernel ring buffer is increased to the next power of two that
would fit the required minimum kernel ring buffer size plus the additional
CPU contribution. For example if LOG_BUF_SHIFT is 18 (256 KB) you'd
require at least 128 KB contributions by other CPUs in order to trigger an
increase of the kernel ring buffer. With a LOG_CPU_MAX_BUF_SHIFT of 12 (4 KB)
you'd require more than 33 possible CPUs (over 32 additional CPUs) to trigger an
increase. If you had 128 possible CPUs the amount of minimum required
kernel ring buffer bumps to:

    ((1 << 18) + ((128 - 1) * (1 << 12))) / 1024 = 764 KB

Since we require the ring buffer to be a power of two the new required
size would be 1024 KB.
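
The sizing rule described above can be sketched in user-space C. This is only an illustration of the rule as stated in this message, not the code added to printk: the names scaled_log_buf_len() and round_up_pow2() are hypothetical, and the number of possible CPUs is passed in as a plain argument instead of being read from num_possible_cpus().

```c
#include <assert.h>
#include <stdint.h>

/* Round v up to the next power of two; v must be at least 1. */
static uint64_t round_up_pow2(uint64_t v)
{
	uint64_t p = 1;

	while (p < v)
		p <<= 1;
	return p;
}

/*
 * Hypothetical helper: ring buffer size in bytes for the given
 * LOG_BUF_SHIFT, LOG_CPU_MAX_BUF_SHIFT and number of possible CPUs.
 * Every CPU except the first contributes (1 << log_cpu_max_buf_shift)
 * bytes, mirroring the (128 - 1) term in the example above.
 */
static uint64_t scaled_log_buf_len(unsigned int log_buf_shift,
				   unsigned int log_cpu_max_buf_shift,
				   unsigned int possible_cpus)
{
	uint64_t base, cpu_extra;

	assert(log_buf_shift < 32 && log_cpu_max_buf_shift < 32);
	assert(possible_cpus >= 1);

	base = 1ULL << log_buf_shift;
	cpu_extra = (uint64_t)(possible_cpus - 1) << log_cpu_max_buf_shift;

	/* No increase unless the contributions exceed half the default size. */
	if (cpu_extra <= base / 2)
		return base;

	/* Otherwise grow to the next power of two that fits both parts. */
	return round_up_pow2(base + cpu_extra);
}
```

With these assumptions, scaled_log_buf_len(18, 12, 128) returns 1048576 bytes (1024 KB), matching the example above, while scaled_log_buf_len(18, 12, 1) leaves the buffer at the default 262144 bytes (256 KB).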

These CPU contributions are ignored when the "log_buf_len" kernel parameter
is used, as it forces the exact size of the ring buffer to an expected
power of two value.
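
The precedence between the boot parameter and the CPU-based scaling can be sketched as follows. This is a minimal user-space illustration, not the kernel implementation: choose_log_buf_len() and is_pow2() are hypothetical names, a requested_len of 0 stands for "log_buf_len was not passed", and the handling of a request that is not larger than the built-in buffer is an assumption based on the Kconfig help text saying that a higher size can be forced.

```c
#include <assert.h>
#include <stdint.h>

/* True if v is a non-zero power of two. */
static int is_pow2(uint64_t v)
{
	return v != 0 && (v & (v - 1)) == 0;
}

/*
 * Hypothetical helper: pick the final ring buffer size in bytes.
 *
 * base_len       - built-in size, 1 << LOG_BUF_SHIFT
 * cpu_scaled_len - size after applying the per-CPU contributions
 * requested_len  - value of "log_buf_len", or 0 if it was not given
 */
static uint64_t choose_log_buf_len(uint64_t base_len,
				   uint64_t cpu_scaled_len,
				   uint64_t requested_len)
{
	assert(is_pow2(base_len) && is_pow2(cpu_scaled_len));

	if (requested_len != 0) {
		assert(is_pow2(requested_len));
		/*
		 * An explicit request wins and the CPU contributions are
		 * ignored. Assumption: a request that is not larger than
		 * the built-in buffer leaves the built-in size in place.
		 */
		return requested_len > base_len ? requested_len : base_len;
	}

	/* No explicit request: use the CPU-scaled size. */
	return cpu_scaled_len;
}
```

In this sketch a 256 KB default that would be scaled to 1024 KB stays at 512 KB when a 512 KB size is requested explicitly, because the explicit request takes priority over the CPU contributions.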
Signed-off-by: Luis R. Rodriguez <mcgrof@suse.com>
Signed-off-by: Petr Mladek <pmladek@suse.cz>
Tested-by: Davidlohr Bueso <davidlohr@hp.com>
Tested-by: Petr Mladek <pmladek@suse.cz>
Reviewed-by: Davidlohr Bueso <davidlohr@hp.com>
Cc: Andrew Lunn <andrew@lunn.ch>
Cc: Stephen Warren <swarren@wwwdotorg.org>
Cc: Michal Hocko <mhocko@suse.cz>
Cc: Petr Mladek <pmladek@suse.cz>
Cc: Joe Perches <joe@perches.com>
Cc: Arun KS <arunks.linux@gmail.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Davidlohr Bueso <davidlohr@hp.com>
Cc: Chris Metcalf <cmetcalf@tilera.com>
Cc: Jan Kara <jack@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Diffstat (limited to 'init/Kconfig')
-rw-r--r-- | init/Kconfig | 53 |
1 files changed, 52 insertions, 1 deletions
diff --git a/init/Kconfig b/init/Kconfig
index 9d76b99af1b9..573d3f6d78ab 100644
--- a/init/Kconfig
+++ b/init/Kconfig
@@ -807,7 +807,11 @@ config LOG_BUF_SHIFT
 	range 12 21
 	default 17
 	help
-	  Select kernel log buffer size as a power of 2.
+	  Select the minimal kernel log buffer size as a power of 2.
+	  The final size is affected by LOG_CPU_MAX_BUF_SHIFT config
+	  parameter, see below. Any higher size also might be forced
+	  by "log_buf_len" boot parameter.
+
 	  Examples:
 		     17 => 128 KB
 		     16 => 64 KB
@@ -816,6 +820,53 @@ config LOG_BUF_SHIFT
 		     13 =>  8 KB
 		     12 =>  4 KB
 
+config LOG_CPU_MAX_BUF_SHIFT
+	int "CPU kernel log buffer size contribution (13 => 8 KB, 17 => 128KB)"
+	range 0 21
+	default 12
+	depends on SMP
+	depends on !BASE_SMALL
+	help
+	  The kernel ring buffer will get additional data logged onto it
+	  when multiple CPUs are supported. Typically the contributions are
+	  only a few lines when idle however under under load this can vary
+	  and in the worst case it can mean losing logging information. You
+	  can use this to set the maximum expected amount of logging
+	  contribution under load by each CPU in the worst case scenario, as
+	  a power of 2. The total sum amount of contributions made by all CPUs
+	  must be greater than half of the default kernel ring buffer size
+	  ((1 << LOG_BUF_SHIFT / 2 bytes)) in order to trigger an increase upon
+	  bootup. If an increase is required the ring buffer is increased to
+	  the next power of 2 that can fit both the minimum kernel ring buffer
+	  (LOG_BUF_SHIFT) plus the additional worst case CPU contributions.
+	  For example if LOG_BUF_SHIFT is 18 (256 KB) you'd require at laest
+	  128 KB contributions by other CPUs in order to trigger an increase.
+	  With a LOG_CPU_BUF_SHIFT of 12 (4 KB) you'd require at least anything
+	  over > 64 possible CPUs to trigger an increase. If you had 128
+	  possible CPUs the new minimum required kernel ring buffer size
+	  would be:
+
+	     ((1 << 18) + ((128 - 1) * (1 << 12))) / 1024 = 764 KB
+
+	  Since we only allow powers of two for the kernel ring buffer size the
+	  new kernel ring buffer size would be 1024 KB.
+
+	  CPU contributions are ignored when "log_buf_len" kernel parameter is
+	  used as it forces an exact (power of two) size of the ring buffer to
+	  an expected value.
+
+	  The number of possible CPUs is used for this computation ignoring
+	  hotplugging making the compuation optimal for the the worst case
+	  scenerio while allowing a simple algorithm to be used from bootup.
+
+	  Examples shift values and their meaning:
+		     17 => 128 KB for each CPU
+		     16 =>  64 KB for each CPU
+		     15 =>  32 KB for each CPU
+		     14 =>  16 KB for each CPU
+		     13 =>   8 KB for each CPU
+		     12 =>   4 KB for each CPU
+
 #
 # Architectures with an unreliable sched_clock() should select this:
 #