From 9220066ea807e3efce21476bffcac16743e9db22 Mon Sep 17 00:00:00 2001 From: Alexey Gladkov Date: Mon, 15 Jan 2024 15:46:42 +0000 Subject: docs: add information about ipc sysctls limitations After 25b21cb2f6d6 ("[PATCH] IPC namespace core") and 4e9823111bdc ("[PATCH] IPC namespace - shm") the shared memory page count stopped being global and started counting per ipc namespace. The documentation and shmget(2) still says that shmall is a global option. shmget(2): SHMALL System-wide limit on the total amount of shared memory, measured in units of the system page size. On Linux, this limit can be read and modified via /proc/sys/kernel/shmall. I think the changes made in 2006 should be documented. Link: https://lkml.kernel.org/r/09e99911071766958af488beb4e8a728a4f12135.1705333426.git.legion@kernel.org Signed-off-by: Alexey Gladkov Signed-off-by: Eric W. Biederman Acked-by: "Eric W. Biederman" Link: https://lkml.kernel.org/r/ede20ddf7be48b93e8084c3be2e920841ee1a641.1663756794.git.legion@kernel.org Cc: Christian Brauner Cc: Davidlohr Bueso Cc: Joel Granados Cc: Kees Cook Cc: Luis Chamberlain Cc: Manfred Spraul Signed-off-by: Andrew Morton --- Documentation/admin-guide/sysctl/kernel.rst | 14 +++++++++++--- 1 file changed, 11 insertions(+), 3 deletions(-) (limited to 'Documentation') diff --git a/Documentation/admin-guide/sysctl/kernel.rst b/Documentation/admin-guide/sysctl/kernel.rst index 6584a1f9bfe3..bc578663619d 100644 --- a/Documentation/admin-guide/sysctl/kernel.rst +++ b/Documentation/admin-guide/sysctl/kernel.rst @@ -594,6 +594,9 @@ default (``MSGMNB``). ``msgmni`` is the maximum number of IPC queues. 32000 by default (``MSGMNI``). +All of these parameters are set per ipc namespace. The maximum number of bytes +in POSIX message queues is limited by ``RLIMIT_MSGQUEUE``. This limit is +respected hierarchically in the each user namespace. msg_next_id, sem_next_id, and shm_next_id (System V IPC) ======================================================== @@ -1274,15 +1277,20 @@ are doing anyway :) shmall ====== -This parameter sets the total amount of shared memory pages that -can be used system wide. Hence, ``shmall`` should always be at least -``ceil(shmmax/PAGE_SIZE)``. +This parameter sets the total amount of shared memory pages that can be used +inside ipc namespace. The shared memory pages counting occurs for each ipc +namespace separately and is not inherited. Hence, ``shmall`` should always be at +least ``ceil(shmmax/PAGE_SIZE)``. If you are not sure what the default ``PAGE_SIZE`` is on your Linux system, you can run the following command:: # getconf PAGE_SIZE +To reduce or disable the ability to allocate shared memory, you must create a +new ipc namespace, set this parameter to the required value and prohibit the +creation of a new ipc namespace in the current user namespace or cgroups can +be used. shmmax ====== -- cgit v1.2.3 From 9c1b86f8ce04ec4aa22c1119480b3950bf4724c8 Mon Sep 17 00:00:00 2001 From: Nathan Chancellor Date: Thu, 25 Jan 2024 15:55:07 -0700 Subject: kbuild: raise the minimum supported version of LLVM to 13.0.1 Patch series "Bump the minimum supported version of LLVM to 13.0.1". This series bumps the minimum supported version of LLVM for building the kernel to 13.0.1. The first patch does the bump and all subsequent patches clean up all the various workarounds and checks for earlier versions. Quoting the first patch's commit message for those that were only on CC for the clean ups: When __builtin_mul_overflow() has arguments that differ in terms of signedness and width, LLVM may generate a libcall to __muloti4 because it performs the checks in terms of 65-bit multiplication. This issue becomes harder to hit (but still possible) after LLVM 12.0.0, which includes a special case for matching widths but different signs. To gain access to this special case, which the kernel can take advantage of when calls to __muloti4 appear, bump the minimum supported version of LLVM for building the kernel to 13.0.1. 13.0.1 was chosen because there is minimal impact to distribution support while allowing a few more workarounds to be dropped in the kernel source than if 12.0.0 were chosen. Looking at container images of up to date distribution versions: archlinux:latest clang version 16.0.6 debian:oldoldstable-slim clang version 7.0.1-8+deb10u2 (tags/RELEASE_701/final) debian:oldstable-slim Debian clang version 11.0.1-2 debian:stable-slim Debian clang version 14.0.6 debian:testing-slim Debian clang version 16.0.6 (19) debian:unstable-slim Debian clang version 16.0.6 (19) fedora:38 clang version 16.0.6 (Fedora 16.0.6-3.fc38) fedora:latest clang version 17.0.6 (Fedora 17.0.6-1.fc39) fedora:rawhide clang version 17.0.6 (Fedora 17.0.6-1.fc40) opensuse/leap:latest clang version 15.0.7 opensuse/tumbleweed:latest clang version 17.0.6 ubuntu:focal clang version 10.0.0-4ubuntu1 ubuntu:latest Ubuntu clang version 14.0.0-1ubuntu1.1 ubuntu:rolling Ubuntu clang version 16.0.6 (15) ubuntu:devel Ubuntu clang version 17.0.6 (3) The only distribution that gets left behind is Debian Bullseye, as the default version is 11.0.1; other distributions either have a newer version than 13.0.1 or one older than the current minimum of 11.0.0. Debian has easy access to more recent LLVM versions through apt.llvm.org, so this is not as much of a concern. There are also the kernel.org LLVM toolchains, which should work with distributions with glibc 2.28 and newer. Another benefit of slimming up the number of supported versions of LLVM for building the kernel is reducing the build capacity needed to support a matrix that builds with each supported version, which allows a matrix to reallocate the freed up build capacity towards something else, such as more configuration combinations. This passes my build matrix with all supported versions. This is based on Andrew's mm-nonmm-unstable to avoid trivial conflicts with my series to update the LLVM links across the repository [1] but I can easily rebase it to linux-kbuild if Masahiro would rather these patches go through there (and defer the conflict resolution to the merge window). [1]: https://lore.kernel.org/20240109-update-llvm-links-v1-0-eb09b59db071@kernel.org/ This patch (of 11): When __builtin_mul_overflow() has arguments that differ in terms of signedness and width, LLVM may generate a libcall to __muloti4 because it performs the checks in terms of 65-bit multiplication. This issue becomes harder to hit (but still possible) after LLVM 12.0.0, which includes a special case for matching widths but different signs. To gain access to this special case, which the kernel can take advantage of when calls to __muloti4 appear, bump the minimum supported version of LLVM for building the kernel to 13.0.1. 13.0.1 was chosen because there is minimal impact to distribution support while allowing a few more workarounds to be dropped in the kernel source than if 12.0.0 were chosen. Looking at container images of up to date distribution versions: archlinux:latest clang version 16.0.6 debian:oldoldstable-slim clang version 7.0.1-8+deb10u2 (tags/RELEASE_701/final) debian:oldstable-slim Debian clang version 11.0.1-2 debian:stable-slim Debian clang version 14.0.6 debian:testing-slim Debian clang version 16.0.6 (19) debian:unstable-slim Debian clang version 16.0.6 (19) fedora:38 clang version 16.0.6 (Fedora 16.0.6-3.fc38) fedora:latest clang version 17.0.6 (Fedora 17.0.6-1.fc39) fedora:rawhide clang version 17.0.6 (Fedora 17.0.6-1.fc40) opensuse/leap:latest clang version 15.0.7 opensuse/tumbleweed:latest clang version 17.0.6 ubuntu:focal clang version 10.0.0-4ubuntu1 ubuntu:latest Ubuntu clang version 14.0.0-1ubuntu1.1 ubuntu:rolling Ubuntu clang version 16.0.6 (15) ubuntu:devel Ubuntu clang version 17.0.6 (3) The only distribution that gets left behind is Debian Bullseye, as the default version is 11.0.1; other distributions either have a newer version than 13.0.1 or one older than the current minimum of 11.0.0. Debian has easy access to more recent LLVM versions through apt.llvm.org, so this is not as much of a concern. There are also the kernel.org LLVM toolchains, which should work with distributions with glibc 2.28 and newer. Another benefit of slimming up the number of supported versions of LLVM for building the kernel is reducing the build capacity needed to support a matrix that builds with each supported version, which allows a matrix to reallocate the freed up build capacity towards something else, such as more configuration combinations. Link: https://lkml.kernel.org/r/20240125-bump-min-llvm-ver-to-13-0-1-v1-0-f5ff9bda41c5@kernel.org Closes: https://github.com/ClangBuiltLinux/linux/issues/1975 Link: https://github.com/llvm/llvm-project/issues/38013 Link: https://github.com/llvm/llvm-project/commit/3203143f1356a4e4e3ada231156fc6da6e1a9f9d Link: https://mirrors.edge.kernel.org/pub/tools/llvm/ Link: https://lkml.kernel.org/r/20240125-bump-min-llvm-ver-to-13-0-1-v1-1-f5ff9bda41c5@kernel.org Signed-off-by: Nathan Chancellor Reviewed-by: Kees Cook Cc: Albert Ou Cc: "Aneesh Kumar K.V (IBM)" Cc: Ard Biesheuvel Cc: Borislav Petkov (AMD) Cc: Catalin Marinas Cc: Conor Dooley Cc: Dave Hansen Cc: Ingo Molnar Cc: Mark Rutland Cc: Masahiro Yamada Cc: Michael Ellerman Cc: Nathan Chancellor Cc: "Naveen N. Rao" Cc: Nicholas Piggin Cc: Nicolas Schier Cc: Palmer Dabbelt Cc: Paul Walmsley Cc: Russell King Cc: Thomas Gleixner Cc: Will Deacon Signed-off-by: Andrew Morton --- Documentation/process/changes.rst | 2 +- scripts/min-tool-version.sh | 2 +- 2 files changed, 2 insertions(+), 2 deletions(-) (limited to 'Documentation') diff --git a/Documentation/process/changes.rst b/Documentation/process/changes.rst index 50b3d1cb1115..d7306b8cad13 100644 --- a/Documentation/process/changes.rst +++ b/Documentation/process/changes.rst @@ -30,7 +30,7 @@ you probably needn't concern yourself with pcmciautils. Program Minimal version Command to check the version ====================== =============== ======================================== GNU C 5.1 gcc --version -Clang/LLVM (optional) 11.0.0 clang --version +Clang/LLVM (optional) 13.0.1 clang --version Rust (optional) 1.74.1 rustc --version bindgen (optional) 0.65.1 bindgen --version GNU make 3.82 make --version diff --git a/scripts/min-tool-version.sh b/scripts/min-tool-version.sh index 9faa4d3d91e3..5d17022ee1f6 100755 --- a/scripts/min-tool-version.sh +++ b/scripts/min-tool-version.sh @@ -29,7 +29,7 @@ llvm) elif [ "$SRCARCH" = loongarch ]; then echo 18.0.0 else - echo 11.0.0 + echo 13.0.1 fi ;; rustc) -- cgit v1.2.3 From 2e3fc6ca521499a985a1829a21759ac25359c0f3 Mon Sep 17 00:00:00 2001 From: Feng Tang Date: Fri, 2 Feb 2024 21:20:42 +0800 Subject: panic: add option to dump blocked tasks in panic_print For debugging kernel panics and other bugs, there is already an option of panic_print to dump all tasks' call stacks. On today's large servers running many containers, there could be thousands of tasks or more, and this will print out huge amount of call stacks, taking a lot of time (for serial console which is main target user case of panic_print). And in many cases, only those several tasks being blocked are key for the panic, so add an option to only dump blocked tasks' call stacks. [akpm@linux-foundation.org: clarify documentation a little] Link: https://lkml.kernel.org/r/20240202132042.3609657-1-feng.tang@intel.com Signed-off-by: Feng Tang Tested-by: Guilherme G. Piccoli Cc: Jonathan Corbet Cc: Josh Poimboeuf Cc: Peter Zijlstra (Intel) Cc: Randy Dunlap Signed-off-by: Andrew Morton --- Documentation/admin-guide/kernel-parameters.txt | 1 + Documentation/admin-guide/sysctl/kernel.rst | 1 + kernel/panic.c | 4 ++++ 3 files changed, 6 insertions(+) (limited to 'Documentation') diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt index 31b3a25680d0..800b4b5dcbb4 100644 --- a/Documentation/admin-guide/kernel-parameters.txt +++ b/Documentation/admin-guide/kernel-parameters.txt @@ -4182,6 +4182,7 @@ bit 4: print ftrace buffer bit 5: print all printk messages in buffer bit 6: print all CPUs backtrace (if available in the arch) + bit 7: print only tasks in uninterruptible (blocked) state *Be aware* that this option may print a _lot_ of lines, so there are risks of losing older messages in the log. Use this option carefully, maybe worth to setup a diff --git a/Documentation/admin-guide/sysctl/kernel.rst b/Documentation/admin-guide/sysctl/kernel.rst index bc578663619d..a9b71190399d 100644 --- a/Documentation/admin-guide/sysctl/kernel.rst +++ b/Documentation/admin-guide/sysctl/kernel.rst @@ -853,6 +853,7 @@ bit 3 print locks info if ``CONFIG_LOCKDEP`` is on bit 4 print ftrace buffer bit 5 print all printk messages in buffer bit 6 print all CPUs backtrace (if available in the arch) +bit 7 print only tasks in uninterruptible (blocked) state ===== ============================================ So for example to print tasks and memory info on panic, user can:: diff --git a/kernel/panic.c b/kernel/panic.c index d49b68184c56..d4cb86d97f8f 100644 --- a/kernel/panic.c +++ b/kernel/panic.c @@ -73,6 +73,7 @@ EXPORT_SYMBOL_GPL(panic_timeout); #define PANIC_PRINT_FTRACE_INFO 0x00000010 #define PANIC_PRINT_ALL_PRINTK_MSG 0x00000020 #define PANIC_PRINT_ALL_CPU_BT 0x00000040 +#define PANIC_PRINT_BLOCKED_TASKS 0x00000080 unsigned long panic_print; ATOMIC_NOTIFIER_HEAD(panic_notifier_list); @@ -227,6 +228,9 @@ static void panic_print_sys_info(bool console_flush) if (panic_print & PANIC_PRINT_FTRACE_INFO) ftrace_dump(DUMP_ALL); + + if (panic_print & PANIC_PRINT_BLOCKED_TASKS) + show_state_filter(TASK_UNINTERRUPTIBLE); } void check_panic_on_warn(const char *origin) -- cgit v1.2.3