From 39148e94e3e1f0477ce8ed3fda00123722681f3a Mon Sep 17 00:00:00 2001
From: James Hogan
Date: Mon, 19 Jan 2015 10:30:54 +0000
Subject: MIPS: fork: Fix MSA/FPU/DSP context duplication race

There is a race in the MIPS fork code which allows the child to get a
stale copy of parent MSA/FPU/DSP state that is active in hardware
registers when fork() is called. This is because copy_thread() saves the
live register state into the child context only if the hardware is
currently in use, apparently on the assumption that the hardware state
cannot have been saved and disabled since the initial duplication of the
task_struct. However, preemption is certainly possible during this
window.

An example sequence of events is as follows:

1) The parent userland process puts important data into saved floating
   point registers ($f20-$f31), which are then dirty compared to the
   process' stored context.

2) The parent process calls fork(), which does a clone system call.

3) In the kernel, do_fork() -> copy_process() -> dup_task_struct() ->
   arch_dup_task_struct() (which uses the weakly defined default
   implementation). This duplicates the parent process' task context,
   which includes a stale version of its FP context from when it was
   last saved, probably some time before (1).

4) At some point before copy_process() calls copy_thread(), such as when
   duplicating the memory map, the process is descheduled. Perhaps it is
   preempted asynchronously, or perhaps it sleeps while blocked on a
   mutex. The dirty FP state in the FP registers is saved to the parent
   process' context and the FPU is disabled.

5) When the process is rescheduled again it continues copying state
   until it gets to copy_thread(), which checks whether the FPU is in
   use, so that it can copy that dirty state to the child process' task
   context. Because of the deschedule, however, the FPU is not in use,
   so the child process' context is left with stale FP context from the
   last time the parent saved it (some time before (1)).

6) When the new child process is scheduled it reads the important data
   from the saved floating point registers, and ends up doing a NULL
   pointer dereference as a result of the stale data.

This use of saved floating point registers across function calls can be
triggered fairly easily by explicitly using inline asm with a current
(MIPS R2) compiler, but is far more likely to happen unintentionally
with a MIPS R6 compiler, where the FP registers are more likely to get
used as scratch registers for storing non-FP data.

It is easily fixed, in the same way that other architectures do it, by
overriding the implementation of arch_dup_task_struct() to sync the
dirty hardware state to the parent process' task context *prior* to
duplicating it, rather than copying straight to the child process' task
context in copy_thread(). Note that the FPU hardware is not disabled, so
the parent process may continue executing with the live register
context, but the child process is now guaranteed to have an identical
copy of it at that point.
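As an illustration of the trigger described above (not part of the
original patch), a userland reproducer might look roughly like the
following sketch. It assumes a MIPS o32 target and GCC inline asm, and
that the compiler does not reuse $f20 for its own purposes between the
two asm statements, so treat it as a sketch rather than a reliable test
case.

/*
 * Hypothetical reproducer sketch: park a value in the callee-saved FP
 * register $f20, fork(), and read it back in the child. On a kernel with
 * the race described above, the child may observe stale data if the
 * parent was descheduled between dup_task_struct() and copy_thread().
 */
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <sys/wait.h>

int main(void)
{
	double in = 42.0, out = 0.0;
	pid_t pid;

	/* Make $f20 dirty relative to the saved FP context. */
	asm volatile("mov.d $f20, %0" : : "f"(in) : "$f20", "$f21");

	pid = fork();

	/* Read $f20 back in both the parent and the child. */
	asm volatile("mov.d %0, $f20" : "=f"(out));

	if (pid == 0) {
		/* With the race present, 'out' may be stale here. */
		printf("child read %f from $f20\n", out);
		_exit(out == in ? EXIT_SUCCESS : EXIT_FAILURE);
	}

	wait(NULL);
	printf("parent read %f from $f20\n", out);
	return 0;
}
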
Signed-off-by: James Hogan
Reported-by: Matthew Fortune
Tested-by: Markos Chandras
Cc: Ralf Baechle
Cc: Paul Burton
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/9075/
Signed-off-by: Ralf Baechle
---
 arch/mips/kernel/process.c | 36 ++++++++++++++++++++++++------------
 1 file changed, 24 insertions(+), 12 deletions(-)

(limited to 'arch/mips/kernel/process.c')

diff --git a/arch/mips/kernel/process.c b/arch/mips/kernel/process.c
index eb76434828e8..85bff5d513e5 100644
--- a/arch/mips/kernel/process.c
+++ b/arch/mips/kernel/process.c
@@ -82,6 +82,30 @@ void flush_thread(void)
 {
 }
 
+int arch_dup_task_struct(struct task_struct *dst, struct task_struct *src)
+{
+	/*
+	 * Save any process state which is live in hardware registers to the
+	 * parent context prior to duplication. This prevents the new child
+	 * state becoming stale if the parent is preempted before copy_thread()
+	 * gets a chance to save the parent's live hardware registers to the
+	 * child context.
+	 */
+	preempt_disable();
+
+	if (is_msa_enabled())
+		save_msa(current);
+	else if (is_fpu_owner())
+		_save_fp(current);
+
+	save_dsp(current);
+
+	preempt_enable();
+
+	*dst = *src;
+	return 0;
+}
+
 int copy_thread(unsigned long clone_flags, unsigned long usp,
 	unsigned long arg, struct task_struct *p)
 {
@@ -92,18 +116,6 @@ int copy_thread(unsigned long clone_flags, unsigned long usp,
 
 	childksp = (unsigned long)task_stack_page(p) + THREAD_SIZE - 32;
 
-	preempt_disable();
-
-	if (is_msa_enabled())
-		save_msa(p);
-	else if (is_fpu_owner())
-		save_fp(p);
-
-	if (cpu_has_dsp)
-		save_dsp(p);
-
-	preempt_enable();
-
 	/* set up new TSS. */
 	childregs = (struct pt_regs *) childksp - 1;
 	/*  Put the stack after the struct pt_regs.  */
--
cgit v1.2.3

From 9791554b45a2acc28247f66a5fd5bbc212a6b8c8 Mon Sep 17 00:00:00 2001
From: Paul Burton
Date: Thu, 8 Jan 2015 12:17:37 +0000
Subject: MIPS,prctl: add PR_[GS]ET_FP_MODE prctl options for MIPS

Userland code may be built using an ABI which permits linking to objects
that have more restrictive floating point requirements. For example,
userland code may be built to target the O32 FPXX ABI. Such code may be
linked with other FPXX code, or code built for either one of the more
restrictive FP32 or FP64. When linking with more restrictive code, the
overall requirement of the process becomes that of the more restrictive
code. The kernel has no way to know in advance which mode the process
will need to be executed in, and indeed it may need to change during
execution. The dynamic loader is the only code which will know the
overall required mode, and so it needs to have a means to instruct the
kernel to switch the FP mode of the process.

This patch introduces 2 new options to the prctl syscall which provide
such a capability. The FP mode of the process is represented as a
simple bitmask combining a number of mode bits mirroring those present
in the hardware.

Userland can either retrieve the current FP mode of the process:

  mode = prctl(PR_GET_FP_MODE);

or modify the current FP mode of the process:

  err = prctl(PR_SET_FP_MODE, new_mode);

(A fuller userland usage sketch appears after the patch series below.)

Signed-off-by: Paul Burton
Cc: Matthew Fortune
Cc: Markos Chandras
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/8899/
Signed-off-by: Ralf Baechle
---
 arch/mips/include/asm/mmu.h         |  3 ++
 arch/mips/include/asm/mmu_context.h |  2 +
 arch/mips/include/asm/processor.h   | 11 +++++
 arch/mips/kernel/process.c          | 92 +++++++++++++++++++++++++++++++++++++
 arch/mips/kernel/traps.c            | 19 ++++++++
 include/uapi/linux/prctl.h          |  5 ++
 kernel/sys.c                        | 12 +++++
 7 files changed, 144 insertions(+)

(limited to 'arch/mips/kernel/process.c')

diff --git a/arch/mips/include/asm/mmu.h b/arch/mips/include/asm/mmu.h
index c436138945a8..1afa1f986df8 100644
--- a/arch/mips/include/asm/mmu.h
+++ b/arch/mips/include/asm/mmu.h
@@ -1,9 +1,12 @@
 #ifndef __ASM_MMU_H
 #define __ASM_MMU_H
 
+#include <linux/atomic.h>
+
 typedef struct {
 	unsigned long asid[NR_CPUS];
 	void *vdso;
+	atomic_t fp_mode_switching;
 } mm_context_t;
 
 #endif /* __ASM_MMU_H */
diff --git a/arch/mips/include/asm/mmu_context.h b/arch/mips/include/asm/mmu_context.h
index 2f82568a3ee4..87f11072f557 100644
--- a/arch/mips/include/asm/mmu_context.h
+++ b/arch/mips/include/asm/mmu_context.h
@@ -132,6 +132,8 @@ init_new_context(struct task_struct *tsk, struct mm_struct *mm)
 	for_each_possible_cpu(i)
 		cpu_context(i, mm) = 0;
 
+	atomic_set(&mm->context.fp_mode_switching, 0);
+
 	return 0;
 }
diff --git a/arch/mips/include/asm/processor.h b/arch/mips/include/asm/processor.h
index f1df4cb4a286..9daa38608cd8 100644
--- a/arch/mips/include/asm/processor.h
+++ b/arch/mips/include/asm/processor.h
@@ -399,4 +399,15 @@ unsigned long get_wchan(struct task_struct *p);
 
 #endif
 
+/*
+ * Functions & macros implementing the PR_GET_FP_MODE & PR_SET_FP_MODE options
+ * to the prctl syscall.
+ */
+extern int mips_get_process_fp_mode(struct task_struct *task);
+extern int mips_set_process_fp_mode(struct task_struct *task,
+				     unsigned int value);
+
+#define GET_FP_MODE(task)		mips_get_process_fp_mode(task)
+#define SET_FP_MODE(task,value)		mips_set_process_fp_mode(task, value)
+
 #endif /* _ASM_PROCESSOR_H */
diff --git a/arch/mips/kernel/process.c b/arch/mips/kernel/process.c
index eb76434828e8..4677b4c67da6 100644
--- a/arch/mips/kernel/process.c
+++ b/arch/mips/kernel/process.c
@@ -25,6 +25,7 @@
 #include <linux/completion.h>
 #include <linux/kallsyms.h>
 #include <linux/random.h>
+#include <linux/prctl.h>
 
 #include <asm/asm.h>
 #include <asm/bootinfo.h>
@@ -550,3 +551,94 @@ void arch_trigger_all_cpu_backtrace(bool include_self)
 {
 	smp_call_function(arch_dump_stack, NULL, 1);
 }
+
+int mips_get_process_fp_mode(struct task_struct *task)
+{
+	int value = 0;
+
+	if (!test_tsk_thread_flag(task, TIF_32BIT_FPREGS))
+		value |= PR_FP_MODE_FR;
+	if (test_tsk_thread_flag(task, TIF_HYBRID_FPREGS))
+		value |= PR_FP_MODE_FRE;
+
+	return value;
+}
+
+int mips_set_process_fp_mode(struct task_struct *task, unsigned int value)
+{
+	const unsigned int known_bits = PR_FP_MODE_FR | PR_FP_MODE_FRE;
+	unsigned long switch_count;
+	struct task_struct *t;
+
+	/* Check the value is valid */
+	if (value & ~known_bits)
+		return -EOPNOTSUPP;
+
+	/* Avoid inadvertently triggering emulation */
+	if ((value & PR_FP_MODE_FR) && cpu_has_fpu &&
+	    !(current_cpu_data.fpu_id & MIPS_FPIR_F64))
+		return -EOPNOTSUPP;
+	if ((value & PR_FP_MODE_FRE) && cpu_has_fpu && !cpu_has_fre)
+		return -EOPNOTSUPP;
+
+	/* Save FP & vector context, then disable FPU & MSA */
+	if (task->signal == current->signal)
+		lose_fpu(1);
+
+	/* Prevent any threads from obtaining live FP context */
+	atomic_set(&task->mm->context.fp_mode_switching, 1);
+	smp_mb__after_atomic();
+
+	/*
+	 * If there are multiple online CPUs then wait until all threads whose
+	 * FP mode is about to change have been context switched. This approach
+	 * allows us to only worry about whether an FP mode switch is in
+	 * progress when FP is first used in a task's time slice. Pretty much
+	 * all of the mode switch overhead can thus be confined to cases where
+	 * mode switches are actually occurring. That is, to here. However for
+	 * the thread performing the mode switch it may take a while...
+	 */
+	if (num_online_cpus() > 1) {
+		spin_lock_irq(&task->sighand->siglock);
+
+		for_each_thread(task, t) {
+			if (t == current)
+				continue;
+
+			switch_count = t->nvcsw + t->nivcsw;
+
+			do {
+				spin_unlock_irq(&task->sighand->siglock);
+				cond_resched();
+				spin_lock_irq(&task->sighand->siglock);
+			} while ((t->nvcsw + t->nivcsw) == switch_count);
+		}
+
+		spin_unlock_irq(&task->sighand->siglock);
+	}
+
+	/*
+	 * There are now no threads of the process with live FP context, so it
+	 * is safe to proceed with the FP mode switch.
+	 */
+	for_each_thread(task, t) {
+		/* Update desired FP register width */
+		if (value & PR_FP_MODE_FR) {
+			clear_tsk_thread_flag(t, TIF_32BIT_FPREGS);
+		} else {
+			set_tsk_thread_flag(t, TIF_32BIT_FPREGS);
+			clear_tsk_thread_flag(t, TIF_MSA_CTX_LIVE);
+		}
+
+		/* Update desired FP single layout */
+		if (value & PR_FP_MODE_FRE)
+			set_tsk_thread_flag(t, TIF_HYBRID_FPREGS);
+		else
+			clear_tsk_thread_flag(t, TIF_HYBRID_FPREGS);
+	}
+
+	/* Allow threads to use FP again */
+	atomic_set(&task->mm->context.fp_mode_switching, 0);
+
+	return 0;
+}
diff --git a/arch/mips/kernel/traps.c b/arch/mips/kernel/traps.c
index ad3d2031c327..d5fbfb51b9da 100644
--- a/arch/mips/kernel/traps.c
+++ b/arch/mips/kernel/traps.c
@@ -1134,10 +1134,29 @@ static int default_cu2_call(struct notifier_block *nfb, unsigned long action,
 	return NOTIFY_OK;
 }
 
+static int wait_on_fp_mode_switch(atomic_t *p)
+{
+	/*
+	 * The FP mode for this task is currently being switched. That may
+	 * involve modifications to the format of this task's FP context which
+	 * make it unsafe to proceed with execution for the moment. Instead,
+	 * schedule some other task.
+	 */
+	schedule();
+	return 0;
+}
+
 static int enable_restore_fp_context(int msa)
 {
 	int err, was_fpu_owner, prior_msa;
 
+	/*
+	 * If an FP mode switch is currently underway, wait for it to
+	 * complete before proceeding.
+	 */
+	wait_on_atomic_t(&current->mm->context.fp_mode_switching,
+			 wait_on_fp_mode_switch, TASK_KILLABLE);
+
 	if (!used_math()) {
 		/* First time FP context user. */
 		preempt_disable();
diff --git a/include/uapi/linux/prctl.h b/include/uapi/linux/prctl.h
index 89f63503f903..31891d9535e2 100644
--- a/include/uapi/linux/prctl.h
+++ b/include/uapi/linux/prctl.h
@@ -185,4 +185,9 @@ struct prctl_mm_map {
 #define PR_MPX_ENABLE_MANAGEMENT  43
 #define PR_MPX_DISABLE_MANAGEMENT 44
 
+#define PR_SET_FP_MODE			45
+#define PR_GET_FP_MODE			46
+# define PR_FP_MODE_FR			(1 << 0)	/* 64b FP registers */
+# define PR_FP_MODE_FRE			(1 << 1)	/* 32b compatibility */
+
 #endif /* _LINUX_PRCTL_H */
diff --git a/kernel/sys.c b/kernel/sys.c
index a8c9f5a7dda6..08b16bbdcdf0 100644
--- a/kernel/sys.c
+++ b/kernel/sys.c
@@ -97,6 +97,12 @@
 #ifndef MPX_DISABLE_MANAGEMENT
 # define MPX_DISABLE_MANAGEMENT(a) (-EINVAL)
 #endif
+#ifndef GET_FP_MODE
+# define GET_FP_MODE(a)		(-EINVAL)
+#endif
+#ifndef SET_FP_MODE
+# define SET_FP_MODE(a,b)	(-EINVAL)
+#endif
 
 /*
  * this is where the system-wide overflow UID and GID are defined, for
@@ -2215,6 +2221,12 @@ SYSCALL_DEFINE5(prctl, int, option, unsigned long, arg2, unsigned long, arg3,
 	case PR_MPX_DISABLE_MANAGEMENT:
 		error = MPX_DISABLE_MANAGEMENT(me);
 		break;
+	case PR_SET_FP_MODE:
+		error = SET_FP_MODE(me, arg2);
+		break;
+	case PR_GET_FP_MODE:
+		error = GET_FP_MODE(me);
+		break;
 	default:
 		error = -EINVAL;
 		break;
--
cgit v1.2.3

From 13e45f095753b8203a8446648dea527f9ce4413c Mon Sep 17 00:00:00 2001
From: Markos Chandras
Date: Tue, 13 Jan 2015 13:01:49 +0000
Subject: MIPS: kernel: process: Do not allow FR=0 on MIPS R6

A prctl() call to set FR=0 for MIPS R6 should not be allowed, since
FR=1 is the only option for R6 cores.

Cc: Paul Burton
Cc: Matthew Fortune
Signed-off-by: Markos Chandras
---
 arch/mips/kernel/process.c | 4 ++++
 1 file changed, 4 insertions(+)

(limited to 'arch/mips/kernel/process.c')

diff --git a/arch/mips/kernel/process.c b/arch/mips/kernel/process.c
index 4677b4c67da6..696d59e40fa4 100644
--- a/arch/mips/kernel/process.c
+++ b/arch/mips/kernel/process.c
@@ -581,6 +581,10 @@ int mips_set_process_fp_mode(struct task_struct *task, unsigned int value)
 	if ((value & PR_FP_MODE_FRE) && cpu_has_fpu && !cpu_has_fre)
 		return -EOPNOTSUPP;
 
+	/* FR = 0 not supported in MIPS R6 */
+	if (!(value & PR_FP_MODE_FR) && cpu_has_fpu && cpu_has_mips_r6)
+		return -EOPNOTSUPP;
+
 	/* Save FP & vector context, then disable FPU & MSA */
 	if (task->signal == current->signal)
 		lose_fpu(1);
--
cgit v1.2.3
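
The prctl interface added by the last two patches above can be exercised
from userland roughly as shown below (this is the usage sketch referred
to in the second patch's description). It is illustrative only and not
code from the patches: the program flow is hypothetical, the fallback
constants simply mirror the values added to include/uapi/linux/prctl.h
above, and error handling is deliberately minimal.

/*
 * Illustrative sketch: query the process FP mode, request FR=1 with FRE
 * enabled, then probe whether FR=0 is available. On MIPS R6 cores the
 * FR=0 request is expected to fail with EOPNOTSUPP.
 */
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <sys/prctl.h>

#ifndef PR_SET_FP_MODE
# define PR_SET_FP_MODE		45
# define PR_GET_FP_MODE		46
# define PR_FP_MODE_FR		(1 << 0)	/* 64b FP registers */
# define PR_FP_MODE_FRE		(1 << 1)	/* 32b compatibility */
#endif

int main(void)
{
	int mode;

	/* Retrieve the current FP mode of the process. */
	mode = prctl(PR_GET_FP_MODE);
	if (mode < 0) {
		fprintf(stderr, "PR_GET_FP_MODE: %s\n", strerror(errno));
		return 1;
	}
	printf("current mode: FR=%d FRE=%d\n",
	       !!(mode & PR_FP_MODE_FR), !!(mode & PR_FP_MODE_FRE));

	/* Request FR=1 with the FRE compatibility mode enabled. */
	if (prctl(PR_SET_FP_MODE, PR_FP_MODE_FR | PR_FP_MODE_FRE) != 0)
		fprintf(stderr, "FR=1/FRE=1 refused: %s\n", strerror(errno));

	/* Probe FR=0; refused with EOPNOTSUPP on R6 (third patch above). */
	if (prctl(PR_SET_FP_MODE, 0) != 0 && errno == EOPNOTSUPP)
		printf("FR=0 not supported on this CPU (e.g. MIPS R6)\n");

	return 0;
}

Note that the FR=1/FRE=1 request is also expected to fail with
EOPNOTSUPP on hardware without FRE support, which is why the sketch only
reports that error rather than treating it as fatal; a real consumer
such as the dynamic loader would pick a single mode based on the
process' actual requirements.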