summaryrefslogtreecommitdiff
diff options
context:
space:
mode:
authorLinus Torvalds <torvalds@linux-foundation.org>2025-05-26 11:17:01 -0700
committerLinus Torvalds <torvalds@linux-foundation.org>2025-05-26 11:17:01 -0700
commitc5bfc48d5472fc60abafb510668d7bc3b5ecb401 (patch)
tree84cecfa8eea1e1c786977d5463378ee2b47d574d
parent7d7a103d299eb5b95d67873c5ea7db419eaaebc0 (diff)
parent4e83ae6ec87dddac070ba349d3b839589b1bb957 (diff)
Merge tag 'vfs-6.16-rc1.coredump' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs
Pull coredump updates from Christian Brauner: "This adds support for sending coredumps over an AF_UNIX socket. It also makes (implicit) use of the new SO_PEERPIDFD ability to hand out pidfds for reaped peer tasks The new coredump socket will allow userspace to not have to rely on usermode helpers for processing coredumps and provides a saf way to handle them instead of relying on super privileged coredumping helpers This will also be significantly more lightweight since the kernel doens't have to do a fork()+exec() for each crashing process to spawn a usermodehelper. Instead the kernel just connects to the AF_UNIX socket and userspace can process it concurrently however it sees fit. Support for userspace is incoming starting with systemd-coredump There's more work coming in that direction next cycle. The rest below goes into some details and background Coredumping currently supports two modes: (1) Dumping directly into a file somewhere on the filesystem. (2) Dumping into a pipe connected to a usermode helper process spawned as a child of the system_unbound_wq or kthreadd For simplicity I'm mostly ignoring (1). There's probably still some users of (1) out there but processing coredumps in this way can be considered adventurous especially in the face of set*id binaries The most common option should be (2) by now. It works by allowing userspace to put a string into /proc/sys/kernel/core_pattern like: |/usr/lib/systemd/systemd-coredump %P %u %g %s %t %c %h The "|" at the beginning indicates to the kernel that a pipe must be used. The path following the pipe indicator is a path to a binary that will be spawned as a usermode helper process. Any additional parameters pass information about the task that is generating the coredump to the binary that processes the coredump In the example the core_pattern shown causes the kernel to spawn systemd-coredump as a usermode helper. There's various conceptual consequences of this (non-exhaustive list): - systemd-coredump is spawned with file descriptor number 0 (stdin) connected to the read-end of the pipe. All other file descriptors are closed. That specifically includes 1 (stdout) and 2 (stderr). This has already caused bugs because userspace assumed that this cannot happen (Whether or not this is a sane assumption is irrelevant) - systemd-coredump will be spawned as a child of system_unbound_wq. So it is not a child of any userspace process and specifically not a child of PID 1. It cannot be waited upon and is in a weird hybrid upcall which are difficult for userspace to control correctly - systemd-coredump is spawned with full kernel privileges. This necessitates all kinds of weird privilege dropping excercises in userspace to make this safe - A new usermode helper has to be spawned for each crashing process This adds a new mode: (3) Dumping into an AF_UNIX socket Userspace can set /proc/sys/kernel/core_pattern to: @/path/to/coredump.socket The "@" at the beginning indicates to the kernel that an AF_UNIX coredump socket will be used to process coredumps The coredump socket must be located in the initial mount namespace. When a task coredumps it opens a client socket in the initial network namespace and connects to the coredump socket: - The coredump server uses SO_PEERPIDFD to get a stable handle on the connected crashing task. The retrieved pidfd will provide a stable reference even if the crashing task gets SIGKILLed while generating the coredump. That is a huge attack vector right now - By setting core_pipe_limit non-zero userspace can guarantee that the crashing task cannot be reaped behind it's back and thus process all necessary information in /proc/<pid>. The SO_PEERPIDFD can be used to detect whether /proc/<pid> still refers to the same process The core_pipe_limit isn't used to rate-limit connections to the socket. This can simply be done via AF_UNIX socket directly - The pidfd for the crashing task will contain information how the task coredumps. The PIDFD_GET_INFO ioctl gained a new flag PIDFD_INFO_COREDUMP which can be used to retreive the coredump information If the coredump gets a new coredump client connection the kernel guarantees that PIDFD_INFO_COREDUMP information is available. Currently the following information is provided in the new @coredump_mask extension to struct pidfd_info: * PIDFD_COREDUMPED is raised if the task did actually coredump * PIDFD_COREDUMP_SKIP is raised if the task skipped coredumping (e.g., undumpable) * PIDFD_COREDUMP_USER is raised if this is a regular coredump and doesn't need special care by the coredump server * PIDFD_COREDUMP_ROOT is raised if the generated coredump should be treated as sensitive and the coredump server should restrict access to the generated coredump to sufficiently privileged users" * tag 'vfs-6.16-rc1.coredump' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs: mips, net: ensure that SOCK_COREDUMP is defined selftests/coredump: add tests for AF_UNIX coredumps selftests/pidfd: add PIDFD_INFO_COREDUMP infrastructure coredump: validate socket name as it is written coredump: show supported coredump modes pidfs, coredump: add PIDFD_INFO_COREDUMP coredump: add coredump socket coredump: reflow dump helpers a little coredump: massage do_coredump() coredump: massage format_corename()
-rw-r--r--arch/mips/include/asm/socket.h9
-rw-r--r--fs/coredump.c402
-rw-r--r--fs/pidfs.c55
-rw-r--r--include/linux/net.h4
-rw-r--r--include/linux/pidfs.h5
-rw-r--r--include/uapi/linux/pidfd.h16
-rw-r--r--net/unix/af_unix.c54
-rw-r--r--tools/testing/selftests/coredump/stackdump_test.c467
-rw-r--r--tools/testing/selftests/pidfd/pidfd.h22
9 files changed, 917 insertions, 117 deletions
diff --git a/arch/mips/include/asm/socket.h b/arch/mips/include/asm/socket.h
index 4724a563c5bf..43a09f0dd3ff 100644
--- a/arch/mips/include/asm/socket.h
+++ b/arch/mips/include/asm/socket.h
@@ -36,15 +36,6 @@ enum sock_type {
SOCK_PACKET = 10,
};
-#define SOCK_MAX (SOCK_PACKET + 1)
-/* Mask which covers at least up to SOCK_MASK-1. The
- * * remaining bits are used as flags. */
-#define SOCK_TYPE_MASK 0xf
-
-/* Flags for socket, socketpair, paccept */
-#define SOCK_CLOEXEC O_CLOEXEC
-#define SOCK_NONBLOCK O_NONBLOCK
-
#define ARCH_HAS_SOCKET_TYPES 1
#endif /* _ASM_SOCKET_H */
diff --git a/fs/coredump.c b/fs/coredump.c
index d740a0411266..f217ebf2b3b6 100644
--- a/fs/coredump.c
+++ b/fs/coredump.c
@@ -44,7 +44,13 @@
#include <linux/sysctl.h>
#include <linux/elf.h>
#include <linux/pidfs.h>
+#include <linux/net.h>
+#include <linux/socket.h>
+#include <net/af_unix.h>
+#include <net/net_namespace.h>
+#include <net/sock.h>
#include <uapi/linux/pidfd.h>
+#include <uapi/linux/un.h>
#include <linux/uaccess.h>
#include <asm/mmu_context.h>
@@ -76,9 +82,16 @@ static char core_pattern[CORENAME_MAX_SIZE] = "core";
static int core_name_size = CORENAME_MAX_SIZE;
unsigned int core_file_note_size_limit = CORE_FILE_NOTE_SIZE_DEFAULT;
+enum coredump_type_t {
+ COREDUMP_FILE = 1,
+ COREDUMP_PIPE = 2,
+ COREDUMP_SOCK = 3,
+};
+
struct core_name {
char *corename;
int used, size;
+ enum coredump_type_t core_type;
};
static int expand_corename(struct core_name *cn, int size)
@@ -218,18 +231,24 @@ static int format_corename(struct core_name *cn, struct coredump_params *cprm,
{
const struct cred *cred = current_cred();
const char *pat_ptr = core_pattern;
- int ispipe = (*pat_ptr == '|');
bool was_space = false;
int pid_in_pattern = 0;
int err = 0;
cn->used = 0;
cn->corename = NULL;
+ if (*pat_ptr == '|')
+ cn->core_type = COREDUMP_PIPE;
+ else if (*pat_ptr == '@')
+ cn->core_type = COREDUMP_SOCK;
+ else
+ cn->core_type = COREDUMP_FILE;
if (expand_corename(cn, core_name_size))
return -ENOMEM;
cn->corename[0] = '\0';
- if (ispipe) {
+ switch (cn->core_type) {
+ case COREDUMP_PIPE: {
int argvs = sizeof(core_pattern) / 2;
(*argv) = kmalloc_array(argvs, sizeof(**argv), GFP_KERNEL);
if (!(*argv))
@@ -238,6 +257,45 @@ static int format_corename(struct core_name *cn, struct coredump_params *cprm,
++pat_ptr;
if (!(*pat_ptr))
return -ENOMEM;
+ break;
+ }
+ case COREDUMP_SOCK: {
+ /* skip the @ */
+ pat_ptr++;
+ if (!(*pat_ptr))
+ return -ENOMEM;
+
+ err = cn_printf(cn, "%s", pat_ptr);
+ if (err)
+ return err;
+
+ /* Require absolute paths. */
+ if (cn->corename[0] != '/')
+ return -EINVAL;
+
+ /*
+ * Ensure we can uses spaces to indicate additional
+ * parameters in the future.
+ */
+ if (strchr(cn->corename, ' ')) {
+ coredump_report_failure("Coredump socket may not %s contain spaces", cn->corename);
+ return -EINVAL;
+ }
+
+ /*
+ * Currently no need to parse any other options.
+ * Relevant information can be retrieved from the peer
+ * pidfd retrievable via SO_PEERPIDFD by the receiver or
+ * via /proc/<pid>, using the SO_PEERPIDFD to guard
+ * against pid recycling when opening /proc/<pid>.
+ */
+ return 0;
+ }
+ case COREDUMP_FILE:
+ break;
+ default:
+ WARN_ON_ONCE(true);
+ return -EINVAL;
}
/* Repeat as long as we have more pattern to process and more output
@@ -247,7 +305,7 @@ static int format_corename(struct core_name *cn, struct coredump_params *cprm,
* Split on spaces before doing template expansion so that
* %e and %E don't get split if they have spaces in them
*/
- if (ispipe) {
+ if (cn->core_type == COREDUMP_PIPE) {
if (isspace(*pat_ptr)) {
if (cn->used != 0)
was_space = true;
@@ -353,7 +411,7 @@ static int format_corename(struct core_name *cn, struct coredump_params *cprm,
* Installing a pidfd only makes sense if
* we actually spawn a usermode helper.
*/
- if (!ispipe)
+ if (cn->core_type != COREDUMP_PIPE)
break;
/*
@@ -384,12 +442,10 @@ out:
* If core_pattern does not include a %p (as is the default)
* and core_uses_pid is set, then .%pid will be appended to
* the filename. Do not do this for piped commands. */
- if (!ispipe && !pid_in_pattern && core_uses_pid) {
- err = cn_printf(cn, ".%d", task_tgid_vnr(current));
- if (err)
- return err;
- }
- return ispipe;
+ if (cn->core_type == COREDUMP_FILE && !pid_in_pattern && core_uses_pid)
+ return cn_printf(cn, ".%d", task_tgid_vnr(current));
+
+ return 0;
}
static int zap_process(struct signal_struct *signal, int exit_code)
@@ -545,6 +601,8 @@ static int umh_coredump_setup(struct subprocess_info *info, struct cred *new)
if (IS_ERR(pidfs_file))
return PTR_ERR(pidfs_file);
+ pidfs_coredump(cp);
+
/*
* Usermode helpers are childen of either
* system_unbound_wq or of kthreadd. So we know that
@@ -583,7 +641,6 @@ void do_coredump(const kernel_siginfo_t *siginfo)
const struct cred *old_cred;
struct cred *cred;
int retval = 0;
- int ispipe;
size_t *argv = NULL;
int argc = 0;
/* require nonrelative corefile path and be extra careful */
@@ -632,70 +689,14 @@ void do_coredump(const kernel_siginfo_t *siginfo)
old_cred = override_creds(cred);
- ispipe = format_corename(&cn, &cprm, &argv, &argc);
-
- if (ispipe) {
- int argi;
- int dump_count;
- char **helper_argv;
- struct subprocess_info *sub_info;
-
- if (ispipe < 0) {
- coredump_report_failure("format_corename failed, aborting core");
- goto fail_unlock;
- }
-
- if (cprm.limit == 1) {
- /* See umh_coredump_setup() which sets RLIMIT_CORE = 1.
- *
- * Normally core limits are irrelevant to pipes, since
- * we're not writing to the file system, but we use
- * cprm.limit of 1 here as a special value, this is a
- * consistent way to catch recursive crashes.
- * We can still crash if the core_pattern binary sets
- * RLIM_CORE = !1, but it runs as root, and can do
- * lots of stupid things.
- *
- * Note that we use task_tgid_vnr here to grab the pid
- * of the process group leader. That way we get the
- * right pid if a thread in a multi-threaded
- * core_pattern process dies.
- */
- coredump_report_failure("RLIMIT_CORE is set to 1, aborting core");
- goto fail_unlock;
- }
- cprm.limit = RLIM_INFINITY;
-
- dump_count = atomic_inc_return(&core_dump_count);
- if (core_pipe_limit && (core_pipe_limit < dump_count)) {
- coredump_report_failure("over core_pipe_limit, skipping core dump");
- goto fail_dropcount;
- }
-
- helper_argv = kmalloc_array(argc + 1, sizeof(*helper_argv),
- GFP_KERNEL);
- if (!helper_argv) {
- coredump_report_failure("%s failed to allocate memory", __func__);
- goto fail_dropcount;
- }
- for (argi = 0; argi < argc; argi++)
- helper_argv[argi] = cn.corename + argv[argi];
- helper_argv[argi] = NULL;
-
- retval = -ENOMEM;
- sub_info = call_usermodehelper_setup(helper_argv[0],
- helper_argv, NULL, GFP_KERNEL,
- umh_coredump_setup, NULL, &cprm);
- if (sub_info)
- retval = call_usermodehelper_exec(sub_info,
- UMH_WAIT_EXEC);
+ retval = format_corename(&cn, &cprm, &argv, &argc);
+ if (retval < 0) {
+ coredump_report_failure("format_corename failed, aborting core");
+ goto fail_unlock;
+ }
- kfree(helper_argv);
- if (retval) {
- coredump_report_failure("|%s pipe failed", cn.corename);
- goto close_fail;
- }
- } else {
+ switch (cn.core_type) {
+ case COREDUMP_FILE: {
struct mnt_idmap *idmap;
struct inode *inode;
int open_flags = O_CREAT | O_WRONLY | O_NOFOLLOW |
@@ -789,6 +790,143 @@ void do_coredump(const kernel_siginfo_t *siginfo)
if (do_truncate(idmap, cprm.file->f_path.dentry,
0, 0, cprm.file))
goto close_fail;
+ break;
+ }
+ case COREDUMP_PIPE: {
+ int argi;
+ int dump_count;
+ char **helper_argv;
+ struct subprocess_info *sub_info;
+
+ if (cprm.limit == 1) {
+ /* See umh_coredump_setup() which sets RLIMIT_CORE = 1.
+ *
+ * Normally core limits are irrelevant to pipes, since
+ * we're not writing to the file system, but we use
+ * cprm.limit of 1 here as a special value, this is a
+ * consistent way to catch recursive crashes.
+ * We can still crash if the core_pattern binary sets
+ * RLIM_CORE = !1, but it runs as root, and can do
+ * lots of stupid things.
+ *
+ * Note that we use task_tgid_vnr here to grab the pid
+ * of the process group leader. That way we get the
+ * right pid if a thread in a multi-threaded
+ * core_pattern process dies.
+ */
+ coredump_report_failure("RLIMIT_CORE is set to 1, aborting core");
+ goto fail_unlock;
+ }
+ cprm.limit = RLIM_INFINITY;
+
+ dump_count = atomic_inc_return(&core_dump_count);
+ if (core_pipe_limit && (core_pipe_limit < dump_count)) {
+ coredump_report_failure("over core_pipe_limit, skipping core dump");
+ goto fail_dropcount;
+ }
+
+ helper_argv = kmalloc_array(argc + 1, sizeof(*helper_argv),
+ GFP_KERNEL);
+ if (!helper_argv) {
+ coredump_report_failure("%s failed to allocate memory", __func__);
+ goto fail_dropcount;
+ }
+ for (argi = 0; argi < argc; argi++)
+ helper_argv[argi] = cn.corename + argv[argi];
+ helper_argv[argi] = NULL;
+
+ retval = -ENOMEM;
+ sub_info = call_usermodehelper_setup(helper_argv[0],
+ helper_argv, NULL, GFP_KERNEL,
+ umh_coredump_setup, NULL, &cprm);
+ if (sub_info)
+ retval = call_usermodehelper_exec(sub_info,
+ UMH_WAIT_EXEC);
+
+ kfree(helper_argv);
+ if (retval) {
+ coredump_report_failure("|%s pipe failed", cn.corename);
+ goto close_fail;
+ }
+ break;
+ }
+ case COREDUMP_SOCK: {
+#ifdef CONFIG_UNIX
+ struct file *file __free(fput) = NULL;
+ struct sockaddr_un addr = {
+ .sun_family = AF_UNIX,
+ };
+ ssize_t addr_len;
+ struct socket *socket;
+
+ addr_len = strscpy(addr.sun_path, cn.corename);
+ if (addr_len < 0)
+ goto close_fail;
+ addr_len += offsetof(struct sockaddr_un, sun_path) + 1;
+
+ /*
+ * It is possible that the userspace process which is
+ * supposed to handle the coredump and is listening on
+ * the AF_UNIX socket coredumps. Userspace should just
+ * mark itself non dumpable.
+ */
+
+ retval = sock_create_kern(&init_net, AF_UNIX, SOCK_STREAM, 0, &socket);
+ if (retval < 0)
+ goto close_fail;
+
+ file = sock_alloc_file(socket, 0, NULL);
+ if (IS_ERR(file))
+ goto close_fail;
+
+ /*
+ * Set the thread-group leader pid which is used for the
+ * peer credentials during connect() below. Then
+ * immediately register it in pidfs...
+ */
+ cprm.pid = task_tgid(current);
+ retval = pidfs_register_pid(cprm.pid);
+ if (retval)
+ goto close_fail;
+
+ /*
+ * ... and set the coredump information so userspace
+ * has it available after connect()...
+ */
+ pidfs_coredump(&cprm);
+
+ retval = kernel_connect(socket, (struct sockaddr *)(&addr),
+ addr_len, O_NONBLOCK | SOCK_COREDUMP);
+
+ /*
+ * ... Make sure to only put our reference after connect() took
+ * its own reference keeping the pidfs entry alive ...
+ */
+ pidfs_put_pid(cprm.pid);
+
+ if (retval) {
+ if (retval == -EAGAIN)
+ coredump_report_failure("Coredump socket %s receive queue full", addr.sun_path);
+ else
+ coredump_report_failure("Coredump socket connection %s failed %d", addr.sun_path, retval);
+ goto close_fail;
+ }
+
+ /* ... and validate that @sk_peer_pid matches @cprm.pid. */
+ if (WARN_ON_ONCE(unix_peer(socket->sk)->sk_peer_pid != cprm.pid))
+ goto close_fail;
+
+ cprm.limit = RLIM_INFINITY;
+ cprm.file = no_free_ptr(file);
+#else
+ coredump_report_failure("Core dump socket support %s disabled", cn.corename);
+ goto close_fail;
+#endif
+ break;
+ }
+ default:
+ WARN_ON_ONCE(true);
+ goto close_fail;
}
/* get us an unshared descriptor table; almost always a no-op */
@@ -823,13 +961,49 @@ void do_coredump(const kernel_siginfo_t *siginfo)
file_end_write(cprm.file);
free_vma_snapshot(&cprm);
}
- if (ispipe && core_pipe_limit)
- wait_for_dump_helpers(cprm.file);
+
+#ifdef CONFIG_UNIX
+ /* Let userspace know we're done processing the coredump. */
+ if (sock_from_file(cprm.file))
+ kernel_sock_shutdown(sock_from_file(cprm.file), SHUT_WR);
+#endif
+
+ /*
+ * When core_pipe_limit is set we wait for the coredump server
+ * or usermodehelper to finish before exiting so it can e.g.,
+ * inspect /proc/<pid>.
+ */
+ if (core_pipe_limit) {
+ switch (cn.core_type) {
+ case COREDUMP_PIPE:
+ wait_for_dump_helpers(cprm.file);
+ break;
+#ifdef CONFIG_UNIX
+ case COREDUMP_SOCK: {
+ ssize_t n;
+
+ /*
+ * We use a simple read to wait for the coredump
+ * processing to finish. Either the socket is
+ * closed or we get sent unexpected data. In
+ * both cases, we're done.
+ */
+ n = __kernel_read(cprm.file, &(char){ 0 }, 1, NULL);
+ if (n != 0)
+ coredump_report_failure("Unexpected data on coredump socket");
+ break;
+ }
+#endif
+ default:
+ break;
+ }
+ }
+
close_fail:
if (cprm.file)
filp_close(cprm.file, NULL);
fail_dropcount:
- if (ispipe)
+ if (cn.core_type == COREDUMP_PIPE)
atomic_dec(&core_dump_count);
fail_unlock:
kfree(argv);
@@ -852,10 +1026,9 @@ static int __dump_emit(struct coredump_params *cprm, const void *addr, int nr)
struct file *file = cprm->file;
loff_t pos = file->f_pos;
ssize_t n;
+
if (cprm->written + nr > cprm->limit)
return 0;
-
-
if (dump_interrupted())
return 0;
n = __kernel_write(file, addr, nr, &pos);
@@ -872,20 +1045,21 @@ static int __dump_skip(struct coredump_params *cprm, size_t nr)
{
static char zeroes[PAGE_SIZE];
struct file *file = cprm->file;
+
if (file->f_mode & FMODE_LSEEK) {
- if (dump_interrupted() ||
- vfs_llseek(file, nr, SEEK_CUR) < 0)
+ if (dump_interrupted() || vfs_llseek(file, nr, SEEK_CUR) < 0)
return 0;
cprm->pos += nr;
return 1;
- } else {
- while (nr > PAGE_SIZE) {
- if (!__dump_emit(cprm, zeroes, PAGE_SIZE))
- return 0;
- nr -= PAGE_SIZE;
- }
- return __dump_emit(cprm, zeroes, nr);
}
+
+ while (nr > PAGE_SIZE) {
+ if (!__dump_emit(cprm, zeroes, PAGE_SIZE))
+ return 0;
+ nr -= PAGE_SIZE;
+ }
+
+ return __dump_emit(cprm, zeroes, nr);
}
int dump_emit(struct coredump_params *cprm, const void *addr, int nr)
@@ -1054,7 +1228,7 @@ EXPORT_SYMBOL(dump_align);
void validate_coredump_safety(void)
{
if (suid_dumpable == SUID_DUMP_ROOT &&
- core_pattern[0] != '/' && core_pattern[0] != '|') {
+ core_pattern[0] != '/' && core_pattern[0] != '|' && core_pattern[0] != '@') {
coredump_report_failure("Unsafe core_pattern used with fs.suid_dumpable=2: "
"pipe handler or fully qualified core dump path required. "
@@ -1062,18 +1236,55 @@ void validate_coredump_safety(void)
}
}
+static inline bool check_coredump_socket(void)
+{
+ if (core_pattern[0] != '@')
+ return true;
+
+ /*
+ * Coredump socket must be located in the initial mount
+ * namespace. Don't give the impression that anything else is
+ * supported right now.
+ */
+ if (current->nsproxy->mnt_ns != init_task.nsproxy->mnt_ns)
+ return false;
+
+ /* Must be an absolute path. */
+ if (*(core_pattern + 1) != '/')
+ return false;
+
+ return true;
+}
+
static int proc_dostring_coredump(const struct ctl_table *table, int write,
void *buffer, size_t *lenp, loff_t *ppos)
{
- int error = proc_dostring(table, write, buffer, lenp, ppos);
+ int error;
+ ssize_t retval;
+ char old_core_pattern[CORENAME_MAX_SIZE];
+
+ retval = strscpy(old_core_pattern, core_pattern, CORENAME_MAX_SIZE);
+
+ error = proc_dostring(table, write, buffer, lenp, ppos);
+ if (error)
+ return error;
+ if (!check_coredump_socket()) {
+ strscpy(core_pattern, old_core_pattern, retval + 1);
+ return -EINVAL;
+ }
- if (!error)
- validate_coredump_safety();
+ validate_coredump_safety();
return error;
}
static const unsigned int core_file_note_size_min = CORE_FILE_NOTE_SIZE_DEFAULT;
static const unsigned int core_file_note_size_max = CORE_FILE_NOTE_SIZE_MAX;
+static char core_modes[] = {
+ "file\npipe"
+#ifdef CONFIG_UNIX
+ "\nsocket"
+#endif
+};
static const struct ctl_table coredump_sysctls[] = {
{
@@ -1117,6 +1328,13 @@ static const struct ctl_table coredump_sysctls[] = {
.extra1 = SYSCTL_ZERO,
.extra2 = SYSCTL_ONE,
},
+ {
+ .procname = "core_modes",
+ .data = core_modes,
+ .maxlen = sizeof(core_modes) - 1,
+ .mode = 0444,
+ .proc_handler = proc_dostring,
+ },
};
static int __init init_fs_coredump_sysctls(void)
diff --git a/fs/pidfs.c b/fs/pidfs.c
index 6d701ef75695..c1f0a067be40 100644
--- a/fs/pidfs.c
+++ b/fs/pidfs.c
@@ -20,6 +20,7 @@
#include <linux/time_namespace.h>
#include <linux/utsname.h>
#include <net/net_namespace.h>
+#include <linux/coredump.h>
#include "internal.h"
#include "mount.h"
@@ -33,6 +34,7 @@ static struct kmem_cache *pidfs_cachep __ro_after_init;
struct pidfs_exit_info {
__u64 cgroupid;
__s32 exit_code;
+ __u32 coredump_mask;
};
struct pidfs_inode {
@@ -240,6 +242,22 @@ static inline bool pid_in_current_pidns(const struct pid *pid)
return false;
}
+static __u32 pidfs_coredump_mask(unsigned long mm_flags)
+{
+ switch (__get_dumpable(mm_flags)) {
+ case SUID_DUMP_USER:
+ return PIDFD_COREDUMP_USER;
+ case SUID_DUMP_ROOT:
+ return PIDFD_COREDUMP_ROOT;
+ case SUID_DUMP_DISABLE:
+ return PIDFD_COREDUMP_SKIP;
+ default:
+ WARN_ON_ONCE(true);
+ }
+
+ return 0;
+}
+
static long pidfd_info(struct file *file, unsigned int cmd, unsigned long arg)
{
struct pidfd_info __user *uinfo = (struct pidfd_info __user *)arg;
@@ -280,6 +298,11 @@ static long pidfd_info(struct file *file, unsigned int cmd, unsigned long arg)
}
}
+ if (mask & PIDFD_INFO_COREDUMP) {
+ kinfo.mask |= PIDFD_INFO_COREDUMP;
+ kinfo.coredump_mask = READ_ONCE(pidfs_i(inode)->__pei.coredump_mask);
+ }
+
task = get_pid_task(pid, PIDTYPE_PID);
if (!task) {
/*
@@ -296,6 +319,13 @@ static long pidfd_info(struct file *file, unsigned int cmd, unsigned long arg)
if (!c)
return -ESRCH;
+ if (!(kinfo.mask & PIDFD_INFO_COREDUMP)) {
+ task_lock(task);
+ if (task->mm)
+ kinfo.coredump_mask = pidfs_coredump_mask(task->mm->flags);
+ task_unlock(task);
+ }
+
/* Unconditionally return identifiers and credentials, the rest only on request */
user_ns = current_user_ns();
@@ -559,6 +589,31 @@ void pidfs_exit(struct task_struct *tsk)
}
}
+#ifdef CONFIG_COREDUMP
+void pidfs_coredump(const struct coredump_params *cprm)
+{
+ struct pid *pid = cprm->pid;
+ struct pidfs_exit_info *exit_info;
+ struct dentry *dentry;
+ struct inode *inode;
+ __u32 coredump_mask = 0;
+
+ dentry = pid->stashed;
+ if (WARN_ON_ONCE(!dentry))
+ return;
+
+ inode = d_inode(dentry);
+ exit_info = &pidfs_i(inode)->__pei;
+ /* Note how we were coredumped. */
+ coredump_mask = pidfs_coredump_mask(cprm->mm_flags);
+ /* Note that we actually did coredump. */
+ coredump_mask |= PIDFD_COREDUMPED;
+ /* If coredumping is set to skip we should never end up here. */
+ VFS_WARN_ON_ONCE(coredump_mask & PIDFD_COREDUMP_SKIP);
+ smp_store_release(&exit_info->coredump_mask, coredump_mask);
+}
+#endif
+
static struct vfsmount *pidfs_mnt __ro_after_init;
/*
diff --git a/include/linux/net.h b/include/linux/net.h
index 0ff950eecc6b..f60fff91e1df 100644
--- a/include/linux/net.h
+++ b/include/linux/net.h
@@ -70,6 +70,7 @@ enum sock_type {
SOCK_DCCP = 6,
SOCK_PACKET = 10,
};
+#endif /* ARCH_HAS_SOCKET_TYPES */
#define SOCK_MAX (SOCK_PACKET + 1)
/* Mask which covers at least up to SOCK_MASK-1. The
@@ -81,8 +82,7 @@ enum sock_type {
#ifndef SOCK_NONBLOCK
#define SOCK_NONBLOCK O_NONBLOCK
#endif
-
-#endif /* ARCH_HAS_SOCKET_TYPES */
+#define SOCK_COREDUMP O_NOCTTY
/**
* enum sock_shutdown_cmd - Shutdown types
diff --git a/include/linux/pidfs.h b/include/linux/pidfs.h
index 2676890c4d0d..77e7db194914 100644
--- a/include/linux/pidfs.h
+++ b/include/linux/pidfs.h
@@ -2,11 +2,16 @@
#ifndef _LINUX_PID_FS_H
#define _LINUX_PID_FS_H
+struct coredump_params;
+
struct file *pidfs_alloc_file(struct pid *pid, unsigned int flags);
void __init pidfs_init(void);
void pidfs_add_pid(struct pid *pid);
void pidfs_remove_pid(struct pid *pid);
void pidfs_exit(struct task_struct *tsk);
+#ifdef CONFIG_COREDUMP
+void pidfs_coredump(const struct coredump_params *cprm);
+#endif
extern const struct dentry_operations pidfs_dentry_operations;
int pidfs_register_pid(struct pid *pid);
void pidfs_get_pid(struct pid *pid);
diff --git a/include/uapi/linux/pidfd.h b/include/uapi/linux/pidfd.h
index 8c1511edd0e9..c27a4e238e4b 100644
--- a/include/uapi/linux/pidfd.h
+++ b/include/uapi/linux/pidfd.h
@@ -25,10 +25,24 @@
#define PIDFD_INFO_CREDS (1UL << 1) /* Always returned, even if not requested */
#define PIDFD_INFO_CGROUPID (1UL << 2) /* Always returned if available, even if not requested */
#define PIDFD_INFO_EXIT (1UL << 3) /* Only returned if requested. */
+#define PIDFD_INFO_COREDUMP (1UL << 4) /* Only returned if requested. */
#define PIDFD_INFO_SIZE_VER0 64 /* sizeof first published struct */
/*
+ * Values for @coredump_mask in pidfd_info.
+ * Only valid if PIDFD_INFO_COREDUMP is set in @mask.
+ *
+ * Note, the @PIDFD_COREDUMP_ROOT flag indicates that the generated
+ * coredump should be treated as sensitive and access should only be
+ * granted to privileged users.
+ */
+#define PIDFD_COREDUMPED (1U << 0) /* Did crash and... */
+#define PIDFD_COREDUMP_SKIP (1U << 1) /* coredumping generation was skipped. */
+#define PIDFD_COREDUMP_USER (1U << 2) /* coredump was done as the user. */
+#define PIDFD_COREDUMP_ROOT (1U << 3) /* coredump was done as root. */
+
+/*
* The concept of process and threads in userland and the kernel is a confusing
* one - within the kernel every thread is a 'task' with its own individual PID,
* however from userland's point of view threads are grouped by a single PID,
@@ -92,6 +106,8 @@ struct pidfd_info {
__u32 fsuid;
__u32 fsgid;
__s32 exit_code;
+ __u32 coredump_mask;
+ __u32 __spare1;
};
#define PIDFS_IOCTL_MAGIC 0xFF
diff --git a/net/unix/af_unix.c b/net/unix/af_unix.c
index 472f8aa9ea15..59a64b2ced6e 100644
--- a/net/unix/af_unix.c
+++ b/net/unix/af_unix.c
@@ -85,10 +85,13 @@
#include <linux/file.h>
#include <linux/filter.h>
#include <linux/fs.h>
+#include <linux/fs_struct.h>
#include <linux/init.h>
#include <linux/kernel.h>
#include <linux/mount.h>
#include <linux/namei.h>
+#include <linux/net.h>
+#include <linux/pidfs.h>
#include <linux/poll.h>
#include <linux/proc_fs.h>
#include <linux/sched/signal.h>
@@ -100,7 +103,6 @@
#include <linux/splice.h>
#include <linux/string.h>
#include <linux/uaccess.h>
-#include <linux/pidfs.h>
#include <net/af_unix.h>
#include <net/net_namespace.h>
#include <net/scm.h>
@@ -1146,7 +1148,7 @@ static int unix_release(struct socket *sock)
}
static struct sock *unix_find_bsd(struct sockaddr_un *sunaddr, int addr_len,
- int type)
+ int type, int flags)
{
struct inode *inode;
struct path path;
@@ -1154,13 +1156,39 @@ static struct sock *unix_find_bsd(struct sockaddr_un *sunaddr, int addr_len,
int err;
unix_mkname_bsd(sunaddr, addr_len);
- err = kern_path(sunaddr->sun_path, LOOKUP_FOLLOW, &path);
- if (err)
- goto fail;
- err = path_permission(&path, MAY_WRITE);
- if (err)
- goto path_put;
+ if (flags & SOCK_COREDUMP) {
+ const struct cred *cred;
+ struct cred *kcred;
+ struct path root;
+
+ kcred = prepare_kernel_cred(&init_task);
+ if (!kcred) {
+ err = -ENOMEM;
+ goto fail;
+ }
+
+ task_lock(&init_task);
+ get_fs_root(init_task.fs, &root);
+ task_unlock(&init_task);
+
+ cred = override_creds(kcred);
+ err = vfs_path_lookup(root.dentry, root.mnt, sunaddr->sun_path,
+ LOOKUP_BENEATH | LOOKUP_NO_SYMLINKS |
+ LOOKUP_NO_MAGICLINKS, &path);
+ put_cred(revert_creds(cred));
+ path_put(&root);
+ if (err)
+ goto fail;
+ } else {
+ err = kern_path(sunaddr->sun_path, LOOKUP_FOLLOW, &path);
+ if (err)
+ goto fail;
+
+ err = path_permission(&path, MAY_WRITE);
+ if (err)
+ goto path_put;
+ }
err = -ECONNREFUSED;
inode = d_backing_inode(path.dentry);
@@ -1210,12 +1238,12 @@ static struct sock *unix_find_abstract(struct net *net,
static struct sock *unix_find_other(struct net *net,
struct sockaddr_un *sunaddr,
- int addr_len, int type)
+ int addr_len, int type, int flags)
{
struct sock *sk;
if (sunaddr->sun_path[0])
- sk = unix_find_bsd(sunaddr, addr_len, type);
+ sk = unix_find_bsd(sunaddr, addr_len, type, flags);
else
sk = unix_find_abstract(net, sunaddr, addr_len, type);
@@ -1473,7 +1501,7 @@ static int unix_dgram_connect(struct socket *sock, struct sockaddr *addr,
}
restart:
- other = unix_find_other(sock_net(sk), sunaddr, alen, sock->type);
+ other = unix_find_other(sock_net(sk), sunaddr, alen, sock->type, 0);
if (IS_ERR(other)) {
err = PTR_ERR(other);
goto out;
@@ -1620,7 +1648,7 @@ static int unix_stream_connect(struct socket *sock, struct sockaddr *uaddr,
restart:
/* Find listening sock. */
- other = unix_find_other(net, sunaddr, addr_len, sk->sk_type);
+ other = unix_find_other(net, sunaddr, addr_len, sk->sk_type, flags);
if (IS_ERR(other)) {
err = PTR_ERR(other);
goto out_free_skb;
@@ -2089,7 +2117,7 @@ static int unix_dgram_sendmsg(struct socket *sock, struct msghdr *msg,
if (msg->msg_namelen) {
lookup:
other = unix_find_other(sock_net(sk), msg->msg_name,
- msg->msg_namelen, sk->sk_type);
+ msg->msg_namelen, sk->sk_type, 0);
if (IS_ERR(other)) {
err = PTR_ERR(other);
goto out_free;
diff --git a/tools/testing/selftests/coredump/stackdump_test.c b/tools/testing/selftests/coredump/stackdump_test.c
index fe3c728cd6be..9984413be9f0 100644
--- a/tools/testing/selftests/coredump/stackdump_test.c
+++ b/tools/testing/selftests/coredump/stackdump_test.c
@@ -1,14 +1,20 @@
// SPDX-License-Identifier: GPL-2.0
#include <fcntl.h>
+#include <inttypes.h>
#include <libgen.h>
#include <linux/limits.h>
#include <pthread.h>
#include <string.h>
+#include <sys/mount.h>
#include <sys/resource.h>
+#include <sys/stat.h>
+#include <sys/socket.h>
+#include <sys/un.h>
#include <unistd.h>
#include "../kselftest_harness.h"
+#include "../pidfd/pidfd.h"
#define STACKDUMP_FILE "stack_values"
#define STACKDUMP_SCRIPT "stackdump"
@@ -35,6 +41,7 @@ static void crashing_child(void)
FIXTURE(coredump)
{
char original_core_pattern[256];
+ pid_t pid_coredump_server;
};
FIXTURE_SETUP(coredump)
@@ -44,6 +51,7 @@ FIXTURE_SETUP(coredump)
char *dir;
int ret;
+ self->pid_coredump_server = -ESRCH;
file = fopen("/proc/sys/kernel/core_pattern", "r");
ASSERT_NE(NULL, file);
@@ -61,10 +69,17 @@ FIXTURE_TEARDOWN(coredump)
{
const char *reason;
FILE *file;
- int ret;
+ int ret, status;
unlink(STACKDUMP_FILE);
+ if (self->pid_coredump_server > 0) {
+ kill(self->pid_coredump_server, SIGTERM);
+ waitpid(self->pid_coredump_server, &status, 0);
+ }
+ unlink("/tmp/coredump.file");
+ unlink("/tmp/coredump.socket");
+
file = fopen("/proc/sys/kernel/core_pattern", "w");
if (!file) {
reason = "Unable to open core_pattern";
@@ -154,4 +169,454 @@ TEST_F_TIMEOUT(coredump, stackdump, 120)
fclose(file);
}
+TEST_F(coredump, socket)
+{
+ int fd, pidfd, ret, status;
+ FILE *file;
+ pid_t pid, pid_coredump_server;
+ struct stat st;
+ char core_file[PATH_MAX];
+ struct pidfd_info info = {};
+ int ipc_sockets[2];
+ char c;
+ const struct sockaddr_un coredump_sk = {
+ .sun_family = AF_UNIX,
+ .sun_path = "/tmp/coredump.socket",
+ };
+ size_t coredump_sk_len = offsetof(struct sockaddr_un, sun_path) +
+ sizeof("/tmp/coredump.socket");
+
+ ret = socketpair(AF_UNIX, SOCK_STREAM | SOCK_CLOEXEC, 0, ipc_sockets);
+ ASSERT_EQ(ret, 0);
+
+ file = fopen("/proc/sys/kernel/core_pattern", "w");
+ ASSERT_NE(file, NULL);
+
+ ret = fprintf(file, "@/tmp/coredump.socket");
+ ASSERT_EQ(ret, strlen("@/tmp/coredump.socket"));
+ ASSERT_EQ(fclose(file), 0);
+
+ pid_coredump_server = fork();
+ ASSERT_GE(pid_coredump_server, 0);
+ if (pid_coredump_server == 0) {
+ int fd_server, fd_coredump, fd_peer_pidfd, fd_core_file;
+ socklen_t fd_peer_pidfd_len;
+
+ close(ipc_sockets[0]);
+
+ fd_server = socket(AF_UNIX, SOCK_STREAM | SOCK_CLOEXEC, 0);
+ if (fd_server < 0)
+ _exit(EXIT_FAILURE);
+
+ ret = bind(fd_server, (const struct sockaddr *)&coredump_sk, coredump_sk_len);
+ if (ret < 0) {
+ fprintf(stderr, "Failed to bind coredump socket\n");
+ close(fd_server);
+ close(ipc_sockets[1]);
+ _exit(EXIT_FAILURE);
+ }
+
+ ret = listen(fd_server, 1);
+ if (ret < 0) {
+ fprintf(stderr, "Failed to listen on coredump socket\n");
+ close(fd_server);
+ close(ipc_sockets[1]);
+ _exit(EXIT_FAILURE);
+ }
+
+ if (write_nointr(ipc_sockets[1], "1", 1) < 0) {
+ close(fd_server);
+ close(ipc_sockets[1]);
+ _exit(EXIT_FAILURE);
+ }
+
+ close(ipc_sockets[1]);
+
+ fd_coredump = accept4(fd_server, NULL, NULL, SOCK_CLOEXEC);
+ if (fd_coredump < 0) {
+ fprintf(stderr, "Failed to accept coredump socket connection\n");
+ close(fd_server);
+ _exit(EXIT_FAILURE);
+ }
+
+ fd_peer_pidfd_len = sizeof(fd_peer_pidfd);
+ ret = getsockopt(fd_coredump, SOL_SOCKET, SO_PEERPIDFD,
+ &fd_peer_pidfd, &fd_peer_pidfd_len);
+ if (ret < 0) {
+ fprintf(stderr, "%m - Failed to retrieve peer pidfd for coredump socket connection\n");
+ close(fd_coredump);
+ close(fd_server);
+ _exit(EXIT_FAILURE);
+ }
+
+ memset(&info, 0, sizeof(info));
+ info.mask = PIDFD_INFO_EXIT | PIDFD_INFO_COREDUMP;
+ ret = ioctl(fd_peer_pidfd, PIDFD_GET_INFO, &info);
+ if (ret < 0) {
+ fprintf(stderr, "Failed to retrieve pidfd info from peer pidfd for coredump socket connection\n");
+ close(fd_coredump);
+ close(fd_server);
+ close(fd_peer_pidfd);
+ _exit(EXIT_FAILURE);
+ }
+
+ if (!(info.mask & PIDFD_INFO_COREDUMP)) {
+ fprintf(stderr, "Missing coredump information from coredumping task\n");
+ close(fd_coredump);
+ close(fd_server);
+ close(fd_peer_pidfd);
+ _exit(EXIT_FAILURE);
+ }
+
+ if (!(info.coredump_mask & PIDFD_COREDUMPED)) {
+ fprintf(stderr, "Received connection from non-coredumping task\n");
+ close(fd_coredump);
+ close(fd_server);
+ close(fd_peer_pidfd);
+ _exit(EXIT_FAILURE);
+ }
+
+ fd_core_file = creat("/tmp/coredump.file", 0644);
+ if (fd_core_file < 0) {
+ fprintf(stderr, "Failed to create coredump file\n");
+ close(fd_coredump);
+ close(fd_server);
+ close(fd_peer_pidfd);
+ _exit(EXIT_FAILURE);
+ }
+
+ for (;;) {
+ char buffer[4096];
+ ssize_t bytes_read, bytes_write;
+
+ bytes_read = read(fd_coredump, buffer, sizeof(buffer));
+ if (bytes_read < 0) {
+ close(fd_coredump);
+ close(fd_server);
+ close(fd_peer_pidfd);
+ close(fd_core_file);
+ _exit(EXIT_FAILURE);
+ }
+
+ if (bytes_read == 0)
+ break;
+
+ bytes_write = write(fd_core_file, buffer, bytes_read);
+ if (bytes_read != bytes_write) {
+ close(fd_coredump);
+ close(fd_server);
+ close(fd_peer_pidfd);
+ close(fd_core_file);
+ _exit(EXIT_FAILURE);
+ }
+ }
+
+ close(fd_coredump);
+ close(fd_server);
+ close(fd_peer_pidfd);
+ close(fd_core_file);
+ _exit(EXIT_SUCCESS);
+ }
+ self->pid_coredump_server = pid_coredump_server;
+
+ EXPECT_EQ(close(ipc_sockets[1]), 0);
+ ASSERT_EQ(read_nointr(ipc_sockets[0], &c, 1), 1);
+ EXPECT_EQ(close(ipc_sockets[0]), 0);
+
+ pid = fork();
+ ASSERT_GE(pid, 0);
+ if (pid == 0)
+ crashing_child();
+
+ pidfd = sys_pidfd_open(pid, 0);
+ ASSERT_GE(pidfd, 0);
+
+ waitpid(pid, &status, 0);
+ ASSERT_TRUE(WIFSIGNALED(status));
+ ASSERT_TRUE(WCOREDUMP(status));
+
+ info.mask = PIDFD_INFO_EXIT | PIDFD_INFO_COREDUMP;
+ ASSERT_EQ(ioctl(pidfd, PIDFD_GET_INFO, &info), 0);
+ ASSERT_GT((info.mask & PIDFD_INFO_COREDUMP), 0);
+ ASSERT_GT((info.coredump_mask & PIDFD_COREDUMPED), 0);
+
+ waitpid(pid_coredump_server, &status, 0);
+ self->pid_coredump_server = -ESRCH;
+ ASSERT_TRUE(WIFEXITED(status));
+ ASSERT_EQ(WEXITSTATUS(status), 0);
+
+ ASSERT_EQ(stat("/tmp/coredump.file", &st), 0);
+ ASSERT_GT(st.st_size, 0);
+ /*
+ * We should somehow validate the produced core file.
+ * For now just allow for visual inspection
+ */
+ system("file /tmp/coredump.file");
+}
+
+TEST_F(coredump, socket_detect_userspace_client)
+{
+ int fd, pidfd, ret, status;
+ FILE *file;
+ pid_t pid, pid_coredump_server;
+ struct stat st;
+ char core_file[PATH_MAX];
+ struct pidfd_info info = {};
+ int ipc_sockets[2];
+ char c;
+ const struct sockaddr_un coredump_sk = {
+ .sun_family = AF_UNIX,
+ .sun_path = "/tmp/coredump.socket",
+ };
+ size_t coredump_sk_len = offsetof(struct sockaddr_un, sun_path) +
+ sizeof("/tmp/coredump.socket");
+
+ file = fopen("/proc/sys/kernel/core_pattern", "w");
+ ASSERT_NE(file, NULL);
+
+ ret = fprintf(file, "@/tmp/coredump.socket");
+ ASSERT_EQ(ret, strlen("@/tmp/coredump.socket"));
+ ASSERT_EQ(fclose(file), 0);
+
+ ret = socketpair(AF_UNIX, SOCK_STREAM | SOCK_CLOEXEC, 0, ipc_sockets);
+ ASSERT_EQ(ret, 0);
+
+ pid_coredump_server = fork();
+ ASSERT_GE(pid_coredump_server, 0);
+ if (pid_coredump_server == 0) {
+ int fd_server, fd_coredump, fd_peer_pidfd, fd_core_file;
+ socklen_t fd_peer_pidfd_len;
+
+ close(ipc_sockets[0]);
+
+ fd_server = socket(AF_UNIX, SOCK_STREAM | SOCK_CLOEXEC, 0);
+ if (fd_server < 0)
+ _exit(EXIT_FAILURE);
+
+ ret = bind(fd_server, (const struct sockaddr *)&coredump_sk, coredump_sk_len);
+ if (ret < 0) {
+ fprintf(stderr, "Failed to bind coredump socket\n");
+ close(fd_server);
+ close(ipc_sockets[1]);
+ _exit(EXIT_FAILURE);
+ }
+
+ ret = listen(fd_server, 1);
+ if (ret < 0) {
+ fprintf(stderr, "Failed to listen on coredump socket\n");
+ close(fd_server);
+ close(ipc_sockets[1]);
+ _exit(EXIT_FAILURE);
+ }
+
+ if (write_nointr(ipc_sockets[1], "1", 1) < 0) {
+ close(fd_server);
+ close(ipc_sockets[1]);
+ _exit(EXIT_FAILURE);
+ }
+
+ close(ipc_sockets[1]);
+
+ fd_coredump = accept4(fd_server, NULL, NULL, SOCK_CLOEXEC);
+ if (fd_coredump < 0) {
+ fprintf(stderr, "Failed to accept coredump socket connection\n");
+ close(fd_server);
+ _exit(EXIT_FAILURE);
+ }
+
+ fd_peer_pidfd_len = sizeof(fd_peer_pidfd);
+ ret = getsockopt(fd_coredump, SOL_SOCKET, SO_PEERPIDFD,
+ &fd_peer_pidfd, &fd_peer_pidfd_len);
+ if (ret < 0) {
+ fprintf(stderr, "%m - Failed to retrieve peer pidfd for coredump socket connection\n");
+ close(fd_coredump);
+ close(fd_server);
+ _exit(EXIT_FAILURE);
+ }
+
+ memset(&info, 0, sizeof(info));
+ info.mask = PIDFD_INFO_EXIT | PIDFD_INFO_COREDUMP;
+ ret = ioctl(fd_peer_pidfd, PIDFD_GET_INFO, &info);
+ if (ret < 0) {
+ fprintf(stderr, "Failed to retrieve pidfd info from peer pidfd for coredump socket connection\n");
+ close(fd_coredump);
+ close(fd_server);
+ close(fd_peer_pidfd);
+ _exit(EXIT_FAILURE);
+ }
+
+ if (!(info.mask & PIDFD_INFO_COREDUMP)) {
+ fprintf(stderr, "Missing coredump information from coredumping task\n");
+ close(fd_coredump);
+ close(fd_server);
+ close(fd_peer_pidfd);
+ _exit(EXIT_FAILURE);
+ }
+
+ if (info.coredump_mask & PIDFD_COREDUMPED) {
+ fprintf(stderr, "Received unexpected connection from coredumping task\n");
+ close(fd_coredump);
+ close(fd_server);
+ close(fd_peer_pidfd);
+ _exit(EXIT_FAILURE);
+ }
+
+ close(fd_coredump);
+ close(fd_server);
+ close(fd_peer_pidfd);
+ close(fd_core_file);
+ _exit(EXIT_SUCCESS);
+ }
+ self->pid_coredump_server = pid_coredump_server;
+
+ EXPECT_EQ(close(ipc_sockets[1]), 0);
+ ASSERT_EQ(read_nointr(ipc_sockets[0], &c, 1), 1);
+ EXPECT_EQ(close(ipc_sockets[0]), 0);
+
+ pid = fork();
+ ASSERT_GE(pid, 0);
+ if (pid == 0) {
+ int fd_socket;
+ ssize_t ret;
+
+ fd_socket = socket(AF_UNIX, SOCK_STREAM, 0);
+ if (fd_socket < 0)
+ _exit(EXIT_FAILURE);
+
+
+ ret = connect(fd_socket, (const struct sockaddr *)&coredump_sk, coredump_sk_len);
+ if (ret < 0)
+ _exit(EXIT_FAILURE);
+
+ (void *)write(fd_socket, &(char){ 0 }, 1);
+ close(fd_socket);
+ _exit(EXIT_SUCCESS);
+ }
+
+ pidfd = sys_pidfd_open(pid, 0);
+ ASSERT_GE(pidfd, 0);
+
+ waitpid(pid, &status, 0);
+ ASSERT_TRUE(WIFEXITED(status));
+ ASSERT_EQ(WEXITSTATUS(status), 0);
+
+ info.mask = PIDFD_INFO_EXIT | PIDFD_INFO_COREDUMP;
+ ASSERT_EQ(ioctl(pidfd, PIDFD_GET_INFO, &info), 0);
+ ASSERT_GT((info.mask & PIDFD_INFO_COREDUMP), 0);
+ ASSERT_EQ((info.coredump_mask & PIDFD_COREDUMPED), 0);
+
+ waitpid(pid_coredump_server, &status, 0);
+ self->pid_coredump_server = -ESRCH;
+ ASSERT_TRUE(WIFEXITED(status));
+ ASSERT_EQ(WEXITSTATUS(status), 0);
+
+ ASSERT_NE(stat("/tmp/coredump.file", &st), 0);
+ ASSERT_EQ(errno, ENOENT);
+}
+
+TEST_F(coredump, socket_enoent)
+{
+ int pidfd, ret, status;
+ FILE *file;
+ pid_t pid;
+ char core_file[PATH_MAX];
+
+ file = fopen("/proc/sys/kernel/core_pattern", "w");
+ ASSERT_NE(file, NULL);
+
+ ret = fprintf(file, "@/tmp/coredump.socket");
+ ASSERT_EQ(ret, strlen("@/tmp/coredump.socket"));
+ ASSERT_EQ(fclose(file), 0);
+
+ pid = fork();
+ ASSERT_GE(pid, 0);
+ if (pid == 0)
+ crashing_child();
+
+ pidfd = sys_pidfd_open(pid, 0);
+ ASSERT_GE(pidfd, 0);
+
+ waitpid(pid, &status, 0);
+ ASSERT_TRUE(WIFSIGNALED(status));
+ ASSERT_FALSE(WCOREDUMP(status));
+}
+
+TEST_F(coredump, socket_no_listener)
+{
+ int pidfd, ret, status;
+ FILE *file;
+ pid_t pid, pid_coredump_server;
+ int ipc_sockets[2];
+ char c;
+ const struct sockaddr_un coredump_sk = {
+ .sun_family = AF_UNIX,
+ .sun_path = "/tmp/coredump.socket",
+ };
+ size_t coredump_sk_len = offsetof(struct sockaddr_un, sun_path) +
+ sizeof("/tmp/coredump.socket");
+
+ ret = socketpair(AF_UNIX, SOCK_STREAM | SOCK_CLOEXEC, 0, ipc_sockets);
+ ASSERT_EQ(ret, 0);
+
+ file = fopen("/proc/sys/kernel/core_pattern", "w");
+ ASSERT_NE(file, NULL);
+
+ ret = fprintf(file, "@/tmp/coredump.socket");
+ ASSERT_EQ(ret, strlen("@/tmp/coredump.socket"));
+ ASSERT_EQ(fclose(file), 0);
+
+ pid_coredump_server = fork();
+ ASSERT_GE(pid_coredump_server, 0);
+ if (pid_coredump_server == 0) {
+ int fd_server;
+ socklen_t fd_peer_pidfd_len;
+
+ close(ipc_sockets[0]);
+
+ fd_server = socket(AF_UNIX, SOCK_STREAM | SOCK_CLOEXEC, 0);
+ if (fd_server < 0)
+ _exit(EXIT_FAILURE);
+
+ ret = bind(fd_server, (const struct sockaddr *)&coredump_sk, coredump_sk_len);
+ if (ret < 0) {
+ fprintf(stderr, "Failed to bind coredump socket\n");
+ close(fd_server);
+ close(ipc_sockets[1]);
+ _exit(EXIT_FAILURE);
+ }
+
+ if (write_nointr(ipc_sockets[1], "1", 1) < 0) {
+ close(fd_server);
+ close(ipc_sockets[1]);
+ _exit(EXIT_FAILURE);
+ }
+
+ close(fd_server);
+ close(ipc_sockets[1]);
+ _exit(EXIT_SUCCESS);
+ }
+ self->pid_coredump_server = pid_coredump_server;
+
+ EXPECT_EQ(close(ipc_sockets[1]), 0);
+ ASSERT_EQ(read_nointr(ipc_sockets[0], &c, 1), 1);
+ EXPECT_EQ(close(ipc_sockets[0]), 0);
+
+ pid = fork();
+ ASSERT_GE(pid, 0);
+ if (pid == 0)
+ crashing_child();
+
+ pidfd = sys_pidfd_open(pid, 0);
+ ASSERT_GE(pidfd, 0);
+
+ waitpid(pid, &status, 0);
+ ASSERT_TRUE(WIFSIGNALED(status));
+ ASSERT_FALSE(WCOREDUMP(status));
+
+ waitpid(pid_coredump_server, &status, 0);
+ self->pid_coredump_server = -ESRCH;
+ ASSERT_TRUE(WIFEXITED(status));
+ ASSERT_EQ(WEXITSTATUS(status), 0);
+}
+
TEST_HARNESS_MAIN
diff --git a/tools/testing/selftests/pidfd/pidfd.h b/tools/testing/selftests/pidfd/pidfd.h
index 55bcf81a2b9a..efd74063126e 100644
--- a/tools/testing/selftests/pidfd/pidfd.h
+++ b/tools/testing/selftests/pidfd/pidfd.h
@@ -131,6 +131,26 @@
#define PIDFD_INFO_EXIT (1UL << 3) /* Always returned if available, even if not requested */
#endif
+#ifndef PIDFD_INFO_COREDUMP
+#define PIDFD_INFO_COREDUMP (1UL << 4)
+#endif
+
+#ifndef PIDFD_COREDUMPED
+#define PIDFD_COREDUMPED (1U << 0) /* Did crash and... */
+#endif
+
+#ifndef PIDFD_COREDUMP_SKIP
+#define PIDFD_COREDUMP_SKIP (1U << 1) /* coredumping generation was skipped. */
+#endif
+
+#ifndef PIDFD_COREDUMP_USER
+#define PIDFD_COREDUMP_USER (1U << 2) /* coredump was done as the user. */
+#endif
+
+#ifndef PIDFD_COREDUMP_ROOT
+#define PIDFD_COREDUMP_ROOT (1U << 3) /* coredump was done as root. */
+#endif
+
#ifndef PIDFD_THREAD
#define PIDFD_THREAD O_EXCL
#endif
@@ -150,6 +170,8 @@ struct pidfd_info {
__u32 fsuid;
__u32 fsgid;
__s32 exit_code;
+ __u32 coredump_mask;
+ __u32 __spare1;
};
/*