| author | Stephen Rothwell <sfr@canb.auug.org.au> | 2011-08-31 14:32:33 +1000 |
|---|---|---|
| committer | Stephen Rothwell <sfr@canb.auug.org.au> | 2011-08-31 14:32:33 +1000 |
| commit | 2cd776673a43c2fbfdb0f90f79458880abad3f46 (patch) | |
| tree | e811ffb9eb26eec60afe1d763a917c68c2828e09 | |
| parent | dc978c7a7a2ceaa06db8d62ea5657d269376c98b (diff) | |
| parent | 64a8d94fbc0d7fe4e87ed41ecb1b4b9e05cc0efe (diff) | |
Merge branch 'akpm'
205 files changed, 7360 insertions, 2169 deletions
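Among the hunks below, the DocBook update documents the new debug_object_assert_init() entry point and its fixup_assert_init() hook. As a reading aid only, here is a minimal C sketch — with a hypothetical object type `my_obj` and a hypothetical static-init magic marker, neither of which is part of this merge — of how a debugobjects user might implement such a hook, following the rules stated in the DocBook text: the hook runs with ODEBUG_STATE_NOTAVAILABLE when the object is unknown to the tracker, should call debug_object_init(), and returns 1 for a real fixup or 0 for a legitimately statically initialized object.

```c
/* Illustrative sketch only; not taken from the patch below. */
#include <linux/debugobjects.h>

struct my_obj {
	unsigned long magic;	/* hypothetical marker for static initialization */
	int value;
};

#define MY_OBJ_STATIC_MAGIC	0x4d594f42UL	/* hypothetical */

static struct debug_obj_descr my_obj_debug_descr;

static int my_obj_fixup_assert_init(void *addr, enum debug_obj_state state)
{
	struct my_obj *obj = addr;

	if (state != ODEBUG_STATE_NOTAVAILABLE)
		return 0;

	if (obj->magic == MY_OBJ_STATIC_MAGIC) {
		/* Statically initialized: make it known to the tracker only. */
		debug_object_init(obj, &my_obj_debug_descr);
		return 0;	/* not a real fixup */
	}

	/* Untracked and uninitialized: initialize it and report a real fixup. */
	debug_object_init(obj, &my_obj_debug_descr);
	return 1;
}

static struct debug_obj_descr my_obj_debug_descr = {
	.name			= "my_obj",
	.fixup_assert_init	= my_obj_fixup_assert_init,
};
```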
diff --git a/Documentation/DocBook/debugobjects.tmpl b/Documentation/DocBook/debugobjects.tmpl index 08ff908aa7a2..24979f691e3e 100644 --- a/Documentation/DocBook/debugobjects.tmpl +++ b/Documentation/DocBook/debugobjects.tmpl @@ -96,6 +96,7 @@ <listitem><para>debug_object_deactivate</para></listitem> <listitem><para>debug_object_destroy</para></listitem> <listitem><para>debug_object_free</para></listitem> + <listitem><para>debug_object_assert_init</para></listitem> </itemizedlist> Each of these functions takes the address of the real object and a pointer to the object type specific debug description @@ -273,6 +274,26 @@ debug checks. </para> </sect1> + + <sect1 id="debug_object_assert_init"> + <title>debug_object_assert_init</title> + <para> + This function is called to assert that an object has been + initialized. + </para> + <para> + When the real object is not tracked by debugobjects, it calls + fixup_assert_init of the object type description structure + provided by the caller, with the hardcoded object state + ODEBUG_NOT_AVAILABLE. The fixup function can correct the problem + by calling debug_object_init and other specific initializing + functions. + </para> + <para> + When the real object is already tracked by debugobjects it is + ignored. + </para> + </sect1> </chapter> <chapter id="fixupfunctions"> <title>Fixup functions</title> @@ -381,6 +402,35 @@ statistics. </para> </sect1> + <sect1 id="fixup_assert_init"> + <title>fixup_assert_init</title> + <para> + This function is called from the debug code whenever a problem + in debug_object_assert_init is detected. + </para> + <para> + Called from debug_object_assert_init() with a hardcoded state + ODEBUG_STATE_NOTAVAILABLE when the object is not found in the + debug bucket. + </para> + <para> + The function returns 1 when the fixup was successful, + otherwise 0. The return value is used to update the + statistics. + </para> + <para> + Note, this function should make sure debug_object_init() is + called before returning. + </para> + <para> + The handling of statically initialized objects is a special + case. The fixup function should check if this is a legitimate + case of a statically initialized object or not. In this case only + debug_object_init() should be called to make the object known to + the tracker. Then the function should return 0 because this is not + a real fixup. + </para> + </sect1> </chapter> <chapter id="bugs"> <title>Known Bugs And Assumptions</title> diff --git a/Documentation/feature-removal-schedule.txt b/Documentation/feature-removal-schedule.txt index 4dc465477665..188cd811ec44 100644 --- a/Documentation/feature-removal-schedule.txt +++ b/Documentation/feature-removal-schedule.txt @@ -133,41 +133,6 @@ Who: Pavel Machek <pavel@ucw.cz> --------------------------- -What: sys_sysctl -When: September 2010 -Option: CONFIG_SYSCTL_SYSCALL -Why: The same information is available in a more convenient from - /proc/sys, and none of the sysctl variables appear to be - important performance wise. - - Binary sysctls are a long standing source of subtle kernel - bugs and security issues. - - When I looked several months ago all I could find after - searching several distributions were 5 user space programs and - glibc (which falls back to /proc/sys) using this syscall. - - The man page for sysctl(2) documents it as unusable for user - space programs. - - sysctl(2) is not generally ABI compatible to a 32bit user - space application on a 64bit and a 32bit kernel. 
- - For the last several months the policy has been no new binary - sysctls and no one has put forward an argument to use them. - - Binary sysctls issues seem to keep happening appearing so - properly deprecating them (with a warning to user space) and a - 2 year grace warning period will mean eventually we can kill - them and end the pain. - - In the mean time individual binary sysctls can be dealt with - in a piecewise fashion. - -Who: Eric Biederman <ebiederm@xmission.com> - ---------------------------- - What: /proc/<pid>/oom_adj When: August 2012 Why: /proc/<pid>/oom_adj allows userspace to influence the oom killer's diff --git a/Documentation/kernel-parameters.txt b/Documentation/kernel-parameters.txt index b7cad05751dd..ccfc61ad32e9 100644 --- a/Documentation/kernel-parameters.txt +++ b/Documentation/kernel-parameters.txt @@ -973,6 +973,9 @@ bytes respectively. Such letter suffixes can also be entirely omitted. ignore_loglevel [KNL] Ignore loglevel setting - this will print /all/ kernel messages to the console. Useful for debugging. + We also add it as printk module parameter, so users + could change it dynamically, usually by + /sys/module/printk/parameters/ignore_loglevel. ihash_entries= [KNL] Set number of hash buckets for inode cache. @@ -1665,6 +1668,11 @@ bytes respectively. Such letter suffixes can also be entirely omitted. debugging driver suspend/resume hooks). This may not work reliably with all consoles, but is known to work with serial and VGA consoles. + To facilitate more flexible debugging, we also add + console_suspend, a printk module parameter to control + it. Users could use console_suspend (usually + /sys/module/printk/parameters/console_suspend) to + turn on/off it dynamically. noaliencache [MM, NUMA, SLAB] Disables the allocation of alien caches in the slab allocator. Saves per-node memory, diff --git a/Documentation/rapidio/tsi721.txt b/Documentation/rapidio/tsi721.txt new file mode 100644 index 000000000000..335f3c6087dc --- /dev/null +++ b/Documentation/rapidio/tsi721.txt @@ -0,0 +1,49 @@ +RapidIO subsystem mport driver for IDT Tsi721 PCI Express-to-SRIO bridge. +========================================================================= + +I. Overview + +This driver implements all currently defined RapidIO mport callback functions. +It supports maintenance read and write operations, inbound and outbound RapidIO +doorbells, inbound maintenance port-writes and RapidIO messaging. + +To generate SRIO maintenance transactions this driver uses one of Tsi721 DMA +channels. This mechanism provides access to larger range of hop counts and +destination IDs without need for changes in outbound window translation. + +RapidIO messaging support uses dedicated messaging channels for each mailbox. +For inbound messages this driver uses destination ID matching to forward messages +into the corresponding message queue. Messaging callbacks are implemented to be +fully compatible with RIONET driver (Ethernet over RapidIO messaging services). + +II. Known problems + + None. + +III. To do + + Add DMA data transfers (non-messaging). + Add inbound region (SRIO-to-PCIe) mapping. + +IV. Version History + + 1.0.0 - Initial driver release. + +V. License +----------------------------------------------- + + Copyright(c) 2011 Integrated Device Technology, Inc. All rights reserved. 
+ + This program is free software; you can redistribute it and/or modify it + under the terms of the GNU General Public License as published by the Free + Software Foundation; either version 2 of the License, or (at your option) + any later version. + + This program is distributed in the hope that it will be useful, but WITHOUT + ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or + FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for + more details. + + You should have received a copy of the GNU General Public License along with + this program; if not, write to the Free Software Foundation, Inc., + 59 Temple Place - Suite 330, Boston, MA 02111-1307, USA. diff --git a/Documentation/trace/postprocess/trace-vmscan-postprocess.pl b/Documentation/trace/postprocess/trace-vmscan-postprocess.pl index 12cecc83cd91..4a37c4759cd2 100644 --- a/Documentation/trace/postprocess/trace-vmscan-postprocess.pl +++ b/Documentation/trace/postprocess/trace-vmscan-postprocess.pl @@ -379,10 +379,10 @@ EVENT_PROCESS: # To closer match vmstat scanning statistics, only count isolate_both # and isolate_inactive as scanning. isolate_active is rotation - # isolate_inactive == 0 - # isolate_active == 1 - # isolate_both == 2 - if ($isolate_mode != 1) { + # isolate_inactive == 1 + # isolate_active == 2 + # isolate_both == 3 + if ($isolate_mode != 2) { $perprocesspid{$process_pid}->{HIGH_NR_SCANNED} += $nr_scanned; } $perprocesspid{$process_pid}->{HIGH_NR_CONTIG_DIRTY} += $nr_contig_dirty; diff --git a/MAINTAINERS b/MAINTAINERS index f98ee644b1dc..644894af58b2 100644 --- a/MAINTAINERS +++ b/MAINTAINERS @@ -3989,6 +3989,7 @@ M: Eric Piel <eric.piel@tremplin-utc.net> S: Maintained F: Documentation/misc-devices/lis3lv02d F: drivers/misc/lis3lv02d/ +F: drivers/platform/x86/hp_accel.c LLC (802.2) M: Arnaldo Carvalho de Melo <acme@ghostprotocols.net> diff --git a/arch/arm/include/asm/processor.h b/arch/arm/include/asm/processor.h index b2d9df5667af..3962caf485c0 100644 --- a/arch/arm/include/asm/processor.h +++ b/arch/arm/include/asm/processor.h @@ -55,7 +55,6 @@ struct thread_struct { #define start_thread(regs,pc,sp) \ ({ \ unsigned long *stack = (unsigned long *)sp; \ - set_fs(USER_DS); \ memset(regs->uregs, 0, sizeof(regs->uregs)); \ if (current->personality & ADDR_LIMIT_32BIT) \ regs->ARM_cpsr = USR_MODE; \ diff --git a/arch/arm/mach-ux500/mbox-db5500.c b/arch/arm/mach-ux500/mbox-db5500.c index 2b2d51caf9d8..0127490218cd 100644 --- a/arch/arm/mach-ux500/mbox-db5500.c +++ b/arch/arm/mach-ux500/mbox-db5500.c @@ -168,7 +168,7 @@ static ssize_t mbox_read_fifo(struct device *dev, return sprintf(buf, "0x%X\n", mbox_value); } -static DEVICE_ATTR(fifo, S_IWUGO | S_IRUGO, mbox_read_fifo, mbox_write_fifo); +static DEVICE_ATTR(fifo, S_IWUSR | S_IRUGO, mbox_read_fifo, mbox_write_fifo); static int mbox_show(struct seq_file *s, void *data) { diff --git a/arch/ia64/include/asm/processor.h b/arch/ia64/include/asm/processor.h index d9f397fae03e..691be0b95c1e 100644 --- a/arch/ia64/include/asm/processor.h +++ b/arch/ia64/include/asm/processor.h @@ -309,7 +309,6 @@ struct thread_struct { } #define start_thread(regs,new_ip,new_sp) do { \ - set_fs(USER_DS); \ regs->cr_ipsr = ((regs->cr_ipsr | (IA64_PSR_BITS_TO_SET | IA64_PSR_CPL)) \ & ~(IA64_PSR_BITS_TO_CLEAR | IA64_PSR_RI | IA64_PSR_IS)); \ regs->cr_iip = new_ip; \ diff --git a/arch/parisc/Kconfig b/arch/parisc/Kconfig index e077b0bf56ca..5dcb59ff848d 100644 --- a/arch/parisc/Kconfig +++ b/arch/parisc/Kconfig @@ -7,6 +7,7 @@ config PARISC select 
HAVE_FUNCTION_TRACE_MCOUNT_TEST if 64BIT select RTC_CLASS select RTC_DRV_GENERIC + select ARCH_HAS_DEBUG_STRICT_USER_COPY_CHECKS select INIT_ALL_POSSIBLE select BUG select HAVE_IRQ_WORK diff --git a/arch/parisc/Kconfig.debug b/arch/parisc/Kconfig.debug index 7305ac8f7f5b..bc989e522a04 100644 --- a/arch/parisc/Kconfig.debug +++ b/arch/parisc/Kconfig.debug @@ -12,18 +12,4 @@ config DEBUG_RODATA portion of the kernel code won't be covered by a TLB anymore. If in doubt, say "N". -config DEBUG_STRICT_USER_COPY_CHECKS - bool "Strict copy size checks" - depends on DEBUG_KERNEL && !TRACE_BRANCH_PROFILING - ---help--- - Enabling this option turns a certain set of sanity checks for user - copy operations into compile time failures. - - The copy_from_user() etc checks are there to help test if there - are sufficient security checks on the length argument of - the copy operation, by having gcc prove that the argument is - within bounds. - - If unsure, or if you run an older (pre 4.4) gcc, say N. - endmenu diff --git a/arch/parisc/include/asm/processor.h b/arch/parisc/include/asm/processor.h index 9ce66e9d1c2b..7213ec9e594c 100644 --- a/arch/parisc/include/asm/processor.h +++ b/arch/parisc/include/asm/processor.h @@ -196,7 +196,6 @@ typedef unsigned int elf_caddr_t; /* offset pc for priv. level */ \ pc |= 3; \ \ - set_fs(USER_DS); \ regs->iasq[0] = spaceid; \ regs->iasq[1] = spaceid; \ regs->iaoq[0] = pc; \ @@ -299,7 +298,6 @@ on downward growing arches, it looks like this: elf_addr_t pc = (elf_addr_t)new_pc | 3; \ elf_caddr_t *argv = (elf_caddr_t *)bprm->exec + 1; \ \ - set_fs(USER_DS); \ regs->iasq[0] = spaceid; \ regs->iasq[1] = spaceid; \ regs->iaoq[0] = pc; \ diff --git a/arch/parisc/kernel/process.c b/arch/parisc/kernel/process.c index 4b4b9181a1a0..62c60b87d039 100644 --- a/arch/parisc/kernel/process.c +++ b/arch/parisc/kernel/process.c @@ -192,7 +192,6 @@ void flush_thread(void) /* Only needs to handle fpu stuff or perf monitors. ** REVISIT: several arches implement a "lazy fpu state". 
*/ - set_fs(USER_DS); } void release_thread(struct task_struct *dead_task) diff --git a/arch/powerpc/include/asm/systbl.h b/arch/powerpc/include/asm/systbl.h index fa0d27a400de..559ae1ee6706 100644 --- a/arch/powerpc/include/asm/systbl.h +++ b/arch/powerpc/include/asm/systbl.h @@ -354,3 +354,5 @@ COMPAT_SYS_SPU(clock_adjtime) SYSCALL_SPU(syncfs) COMPAT_SYS_SPU(sendmmsg) SYSCALL_SPU(setns) +COMPAT_SYS(process_vm_readv) +COMPAT_SYS(process_vm_writev) diff --git a/arch/powerpc/include/asm/unistd.h b/arch/powerpc/include/asm/unistd.h index b8b3f599362b..d3d1b5efd7eb 100644 --- a/arch/powerpc/include/asm/unistd.h +++ b/arch/powerpc/include/asm/unistd.h @@ -373,10 +373,12 @@ #define __NR_syncfs 348 #define __NR_sendmmsg 349 #define __NR_setns 350 +#define __NR_process_vm_readv 351 +#define __NR_process_vm_writev 352 #ifdef __KERNEL__ -#define __NR_syscalls 351 +#define __NR_syscalls 353 #define __NR__exit __NR_exit #define NR_syscalls __NR_syscalls diff --git a/arch/s390/Kconfig b/arch/s390/Kconfig index a9fbd43395f7..78f13f237360 100644 --- a/arch/s390/Kconfig +++ b/arch/s390/Kconfig @@ -120,6 +120,7 @@ config S390 select ARCH_INLINE_WRITE_UNLOCK_BH select ARCH_INLINE_WRITE_UNLOCK_IRQ select ARCH_INLINE_WRITE_UNLOCK_IRQRESTORE + select ARCH_HAS_DEBUG_STRICT_USER_COPY_CHECKS config SCHED_OMIT_FRAME_POINTER def_bool y diff --git a/arch/s390/Kconfig.debug b/arch/s390/Kconfig.debug index d76cef3fef37..aa1796cc237c 100644 --- a/arch/s390/Kconfig.debug +++ b/arch/s390/Kconfig.debug @@ -17,20 +17,6 @@ config STRICT_DEVMEM If you are unsure, say Y. -config DEBUG_STRICT_USER_COPY_CHECKS - def_bool n - prompt "Strict user copy size checks" - ---help--- - Enabling this option turns a certain set of sanity checks for user - copy operations into compile time warnings. - - The copy_from_user() etc checks are there to help test if there - are sufficient security checks on the length argument of - the copy operation, by having gcc prove that the argument is - within bounds. - - If unsure, or if you run an older (pre 4.4) gcc, say N. - config DEBUG_SET_MODULE_RONX def_bool y depends on MODULES diff --git a/arch/s390/lib/Makefile b/arch/s390/lib/Makefile index 761ab8b56afc..97975ec7a274 100644 --- a/arch/s390/lib/Makefile +++ b/arch/s390/lib/Makefile @@ -3,7 +3,6 @@ # lib-y += delay.o string.o uaccess_std.o uaccess_pt.o -obj-y += usercopy.o obj-$(CONFIG_32BIT) += div64.o qrnnd.o ucmpdi2.o lib-$(CONFIG_64BIT) += uaccess_mvcos.o lib-$(CONFIG_SMP) += spinlock.o diff --git a/arch/sparc/kernel/process_32.c b/arch/sparc/kernel/process_32.c index c8cc461ff75f..f793742eec2b 100644 --- a/arch/sparc/kernel/process_32.c +++ b/arch/sparc/kernel/process_32.c @@ -380,8 +380,7 @@ void flush_thread(void) #endif } - /* Now, this task is no longer a kernel thread. */ - current->thread.current_ds = USER_DS; + /* This task is no longer a kernel thread. */ if (current->thread.flags & SPARC_FLAG_KTHREAD) { current->thread.flags &= ~SPARC_FLAG_KTHREAD; diff --git a/arch/sparc/kernel/process_64.c b/arch/sparc/kernel/process_64.c index 0049e434c2f7..3739a06a76cb 100644 --- a/arch/sparc/kernel/process_64.c +++ b/arch/sparc/kernel/process_64.c @@ -368,9 +368,6 @@ void flush_thread(void) /* Clear FPU register state. */ t->fpsaved[0] = 0; - - if (get_thread_current_ds() != ASI_AIUS) - set_fs(USER_DS); } /* It's a bit more tricky when 64-bit tasks are involved... 
*/ diff --git a/arch/sparc/lib/Makefile b/arch/sparc/lib/Makefile index a3fc4375a150..ddc551cb815a 100644 --- a/arch/sparc/lib/Makefile +++ b/arch/sparc/lib/Makefile @@ -43,4 +43,3 @@ obj-y += iomap.o obj-$(CONFIG_SPARC32) += atomic32.o obj-y += ksyms.o obj-$(CONFIG_SPARC64) += PeeCeeI.o -obj-y += usercopy.o diff --git a/arch/sparc/lib/usercopy.c b/arch/sparc/lib/usercopy.c deleted file mode 100644 index 14b363fec8a2..000000000000 --- a/arch/sparc/lib/usercopy.c +++ /dev/null @@ -1,8 +0,0 @@ -#include <linux/module.h> -#include <linux/bug.h> - -void copy_from_user_overflow(void) -{ - WARN(1, "Buffer overflow detected!\n"); -} -EXPORT_SYMBOL(copy_from_user_overflow); diff --git a/arch/tile/Kconfig b/arch/tile/Kconfig index b30f71ac0d06..99ee890d0aa1 100644 --- a/arch/tile/Kconfig +++ b/arch/tile/Kconfig @@ -7,6 +7,7 @@ config TILE select GENERIC_FIND_FIRST_BIT select USE_GENERIC_SMP_HELPERS select CC_OPTIMIZE_FOR_SIZE + select ARCH_HAS_DEBUG_STRICT_USER_COPY_CHECKS select HAVE_GENERIC_HARDIRQS select GENERIC_IRQ_PROBE select GENERIC_PENDING_IRQ if SMP @@ -97,13 +98,6 @@ config STRICT_DEVMEM config SMP def_bool y -# Allow checking for compile-time determined overflow errors in -# copy_from_user(). There are still unprovable places in the -# generic code as of 2.6.34, so this option is not really compatible -# with -Werror, which is more useful in general. -config DEBUG_COPY_FROM_USER - def_bool n - config HVC_TILE select HVC_DRIVER def_bool y diff --git a/arch/tile/include/asm/uaccess.h b/arch/tile/include/asm/uaccess.h index ef34d2caa5b1..9a540be3149e 100644 --- a/arch/tile/include/asm/uaccess.h +++ b/arch/tile/include/asm/uaccess.h @@ -353,7 +353,12 @@ _copy_from_user(void *to, const void __user *from, unsigned long n) return n; } -#ifdef CONFIG_DEBUG_COPY_FROM_USER +#ifdef CONFIG_DEBUG_STRICT_USER_COPY_CHECKS +/* + * There are still unprovable places in the generic code as of 2.6.34, so this + * option is not really compatible with -Werror, which is more useful in + * general. 
+ */ extern void copy_from_user_overflow(void) __compiletime_warning("copy_from_user() size is not provably correct"); diff --git a/arch/tile/lib/uaccess.c b/arch/tile/lib/uaccess.c index f8d398c9ee7f..030abe3ee4f1 100644 --- a/arch/tile/lib/uaccess.c +++ b/arch/tile/lib/uaccess.c @@ -22,11 +22,3 @@ int __range_ok(unsigned long addr, unsigned long size) is_arch_mappable_range(addr, size)); } EXPORT_SYMBOL(__range_ok); - -#ifdef CONFIG_DEBUG_COPY_FROM_USER -void copy_from_user_overflow(void) -{ - WARN(1, "Buffer overflow detected!\n"); -} -EXPORT_SYMBOL(copy_from_user_overflow); -#endif diff --git a/arch/unicore32/include/asm/processor.h b/arch/unicore32/include/asm/processor.h index e11cb0786578..f0d780a51f9b 100644 --- a/arch/unicore32/include/asm/processor.h +++ b/arch/unicore32/include/asm/processor.h @@ -53,7 +53,6 @@ struct thread_struct { #define start_thread(regs, pc, sp) \ ({ \ unsigned long *stack = (unsigned long *)sp; \ - set_fs(USER_DS); \ memset(regs->uregs, 0, sizeof(regs->uregs)); \ regs->UCreg_asr = USER_MODE; \ regs->UCreg_pc = pc & ~1; /* pc */ \ diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig index f1833e34b16b..ff6d4e7d0b3e 100644 --- a/arch/x86/Kconfig +++ b/arch/x86/Kconfig @@ -64,6 +64,7 @@ config X86 select HAVE_TEXT_POKE_SMP select HAVE_GENERIC_HARDIRQS select HAVE_SPARSE_IRQ + select ARCH_HAS_DEBUG_STRICT_USER_COPY_CHECKS select GENERIC_FIND_FIRST_BIT select GENERIC_IRQ_PROBE select GENERIC_PENDING_IRQ if SMP @@ -2073,6 +2074,20 @@ config OLPC_XO15_SCI - AC adapter status updates - Battery status updates +config ALIX + bool "PCEngines ALIX System Support (LED setup)" + select GPIOLIB + ---help--- + This option enables system support for the PCEngines ALIX. + At present this just sets up LEDs for GPIO control on + ALIX2/3/6 boards. However, other system specific setup should + get added here. + + Note: You must still enable the drivers for GPIO and LED support + (GPIO_CS5535 & LEDS_GPIO) to actually use the LEDs + + Note: You have to set alix.force=1 for boards with Award BIOS. + endif # X86_32 config AMD_NB diff --git a/arch/x86/Kconfig.debug b/arch/x86/Kconfig.debug index c0f8a5c88910..2b00959c1061 100644 --- a/arch/x86/Kconfig.debug +++ b/arch/x86/Kconfig.debug @@ -270,18 +270,4 @@ config OPTIMIZE_INLINING If unsure, say N. -config DEBUG_STRICT_USER_COPY_CHECKS - bool "Strict copy size checks" - depends on DEBUG_KERNEL && !TRACE_BRANCH_PROFILING - ---help--- - Enabling this option turns a certain set of sanity checks for user - copy operations into compile time failures. - - The copy_from_user() etc checks are there to help test if there - are sufficient security checks on the length argument of - the copy operation, by having gcc prove that the argument is - within bounds. - - If unsure, or if you run an older (pre 4.4) gcc, say N. 
- endmenu diff --git a/arch/x86/ia32/ia32entry.S b/arch/x86/ia32/ia32entry.S index 54edb207ff3a..a6253ec1b284 100644 --- a/arch/x86/ia32/ia32entry.S +++ b/arch/x86/ia32/ia32entry.S @@ -850,4 +850,6 @@ ia32_sys_call_table: .quad sys_syncfs .quad compat_sys_sendmmsg /* 345 */ .quad sys_setns + .quad compat_sys_process_vm_readv + .quad compat_sys_process_vm_writev ia32_syscall_end: diff --git a/arch/x86/include/asm/irq_vectors.h b/arch/x86/include/asm/irq_vectors.h index 7e50f06393aa..3857b04ff38c 100644 --- a/arch/x86/include/asm/irq_vectors.h +++ b/arch/x86/include/asm/irq_vectors.h @@ -177,4 +177,54 @@ static inline int invalid_vm86_irq(int irq) # define NR_IRQS NR_IRQS_LEGACY #endif +#define irq_vector_name(sirq) { sirq, #sirq } +#define invalidate_tlb_vector_name(i) { INVALIDATE_TLB_VECTOR_END-31+i, \ + "INVALIDATE_TLB_VECTOR" } + +#define irq_vector_name_table \ + irq_vector_name(NMI_VECTOR), \ + irq_vector_name(LOCAL_TIMER_VECTOR), \ + irq_vector_name(ERROR_APIC_VECTOR), \ + irq_vector_name(RESCHEDULE_VECTOR), \ + irq_vector_name(CALL_FUNCTION_VECTOR), \ + irq_vector_name(CALL_FUNCTION_SINGLE_VECTOR), \ + irq_vector_name(THERMAL_APIC_VECTOR), \ + irq_vector_name(THRESHOLD_APIC_VECTOR), \ + irq_vector_name(REBOOT_VECTOR), \ + irq_vector_name(SPURIOUS_APIC_VECTOR), \ + irq_vector_name(IRQ_WORK_VECTOR), \ + irq_vector_name(X86_PLATFORM_IPI_VECTOR), \ + invalidate_tlb_vector_name(0), \ + invalidate_tlb_vector_name(1), \ + invalidate_tlb_vector_name(2), \ + invalidate_tlb_vector_name(3), \ + invalidate_tlb_vector_name(4), \ + invalidate_tlb_vector_name(5), \ + invalidate_tlb_vector_name(6), \ + invalidate_tlb_vector_name(7), \ + invalidate_tlb_vector_name(8), \ + invalidate_tlb_vector_name(9), \ + invalidate_tlb_vector_name(10), \ + invalidate_tlb_vector_name(11), \ + invalidate_tlb_vector_name(12), \ + invalidate_tlb_vector_name(13), \ + invalidate_tlb_vector_name(14), \ + invalidate_tlb_vector_name(15), \ + invalidate_tlb_vector_name(16), \ + invalidate_tlb_vector_name(17), \ + invalidate_tlb_vector_name(18), \ + invalidate_tlb_vector_name(19), \ + invalidate_tlb_vector_name(20), \ + invalidate_tlb_vector_name(21), \ + invalidate_tlb_vector_name(22), \ + invalidate_tlb_vector_name(23), \ + invalidate_tlb_vector_name(24), \ + invalidate_tlb_vector_name(25), \ + invalidate_tlb_vector_name(26), \ + invalidate_tlb_vector_name(27), \ + invalidate_tlb_vector_name(28), \ + invalidate_tlb_vector_name(29), \ + invalidate_tlb_vector_name(30), \ + invalidate_tlb_vector_name(31) + #endif /* _ASM_X86_IRQ_VECTORS_H */ diff --git a/arch/x86/include/asm/uaccess_64.h b/arch/x86/include/asm/uaccess_64.h index 1c66d30971ad..6ca53e507777 100644 --- a/arch/x86/include/asm/uaccess_64.h +++ b/arch/x86/include/asm/uaccess_64.h @@ -43,6 +43,14 @@ _copy_from_user(void *to, const void __user *from, unsigned len); __must_check unsigned long copy_in_user(void __user *to, const void __user *from, unsigned len); +extern void copy_from_user_overflow(void) +#ifdef CONFIG_DEBUG_STRICT_USER_COPY_CHECKS + __compiletime_error("copy_from_user() buffer size is not provably correct") +#else + __compiletime_warning("copy_from_user() buffer size is not provably correct") +#endif +; + static inline unsigned long __must_check copy_from_user(void *to, const void __user *from, unsigned long n) @@ -52,10 +60,8 @@ static inline unsigned long __must_check copy_from_user(void *to, might_fault(); if (likely(sz == -1 || sz >= n)) n = _copy_from_user(to, from, n); -#ifdef CONFIG_DEBUG_VM else - WARN(1, "Buffer overflow detected!\n"); 
-#endif + copy_from_user_overflow(); return n; } diff --git a/arch/x86/include/asm/unistd_32.h b/arch/x86/include/asm/unistd_32.h index 593485b38ab3..599c77d38f33 100644 --- a/arch/x86/include/asm/unistd_32.h +++ b/arch/x86/include/asm/unistd_32.h @@ -352,10 +352,12 @@ #define __NR_syncfs 344 #define __NR_sendmmsg 345 #define __NR_setns 346 +#define __NR_process_vm_readv 347 +#define __NR_process_vm_writev 348 #ifdef __KERNEL__ -#define NR_syscalls 347 +#define NR_syscalls 349 #define __ARCH_WANT_IPC_PARSE_VERSION #define __ARCH_WANT_OLD_READDIR diff --git a/arch/x86/include/asm/unistd_64.h b/arch/x86/include/asm/unistd_64.h index 201040573444..8226f10731dd 100644 --- a/arch/x86/include/asm/unistd_64.h +++ b/arch/x86/include/asm/unistd_64.h @@ -683,6 +683,10 @@ __SYSCALL(__NR_sendmmsg, sys_sendmmsg) __SYSCALL(__NR_setns, sys_setns) #define __NR_getcpu 309 __SYSCALL(__NR_getcpu, sys_getcpu) +#define __NR_process_vm_readv 310 +__SYSCALL(__NR_process_vm_readv, sys_process_vm_readv) +#define __NR_process_vm_writev 311 +__SYSCALL(__NR_process_vm_writev, sys_process_vm_writev) #ifndef __NO_STUBS #define __ARCH_WANT_OLD_READDIR diff --git a/arch/x86/kernel/apic/apic.c b/arch/x86/kernel/apic/apic.c index 52fa56399a50..0fe559f3639b 100644 --- a/arch/x86/kernel/apic/apic.c +++ b/arch/x86/kernel/apic/apic.c @@ -34,6 +34,7 @@ #include <linux/dmi.h> #include <linux/smp.h> #include <linux/mm.h> +#include <trace/events/irq_vectors.h> #include <asm/perf_event.h> #include <asm/x86_init.h> @@ -859,7 +860,9 @@ void __irq_entry smp_apic_timer_interrupt(struct pt_regs *regs) */ exit_idle(); irq_enter(); + trace_irq_vector_entry(LOCAL_TIMER_VECTOR); local_apic_timer_interrupt(); + trace_irq_vector_exit(LOCAL_TIMER_VECTOR); irq_exit(); set_irq_regs(old_regs); @@ -1793,6 +1796,7 @@ void smp_spurious_interrupt(struct pt_regs *regs) exit_idle(); irq_enter(); + trace_irq_vector_entry(SPURIOUS_APIC_VECTOR); /* * Check if this really is a spurious interrupt and ACK it * if it is a vectored one. Just in case... @@ -1807,6 +1811,7 @@ void smp_spurious_interrupt(struct pt_regs *regs) /* see sw-dev-man vol 3, chapter 7.4.13.5 */ pr_info("spurious APIC interrupt on CPU#%d, " "should never happen.\n", smp_processor_id()); + trace_irq_vector_exit(SPURIOUS_APIC_VECTOR); irq_exit(); } @@ -1830,6 +1835,7 @@ void smp_error_interrupt(struct pt_regs *regs) exit_idle(); irq_enter(); + trace_irq_vector_entry(ERROR_APIC_VECTOR); /* First tickle the hardware, only then report what went on. 
-- REW */ v0 = apic_read(APIC_ESR); apic_write(APIC_ESR, 0); @@ -1850,6 +1856,7 @@ void smp_error_interrupt(struct pt_regs *regs) apic_printk(APIC_DEBUG, KERN_CONT "\n"); + trace_irq_vector_exit(ERROR_APIC_VECTOR); irq_exit(); } diff --git a/arch/x86/kernel/cpu/mcheck/therm_throt.c b/arch/x86/kernel/cpu/mcheck/therm_throt.c index 787e06c84ea6..6b7edb53f00a 100644 --- a/arch/x86/kernel/cpu/mcheck/therm_throt.c +++ b/arch/x86/kernel/cpu/mcheck/therm_throt.c @@ -24,6 +24,7 @@ #include <linux/init.h> #include <linux/smp.h> #include <linux/cpu.h> +#include <trace/events/irq_vectors.h> #include <asm/processor.h> #include <asm/system.h> @@ -399,8 +400,10 @@ asmlinkage void smp_thermal_interrupt(struct pt_regs *regs) { exit_idle(); irq_enter(); + trace_irq_vector_entry(THERMAL_APIC_VECTOR); inc_irq_stat(irq_thermal_count); smp_thermal_vector(); + trace_irq_vector_exit(THERMAL_APIC_VECTOR); irq_exit(); /* Ack only at the end to avoid potential reentry */ ack_APIC_irq(); diff --git a/arch/x86/kernel/cpu/mcheck/threshold.c b/arch/x86/kernel/cpu/mcheck/threshold.c index d746df2909c9..ffde17bdd85a 100644 --- a/arch/x86/kernel/cpu/mcheck/threshold.c +++ b/arch/x86/kernel/cpu/mcheck/threshold.c @@ -3,6 +3,7 @@ */ #include <linux/interrupt.h> #include <linux/kernel.h> +#include <trace/events/irq_vectors.h> #include <asm/irq_vectors.h> #include <asm/apic.h> @@ -21,8 +22,10 @@ asmlinkage void smp_threshold_interrupt(void) { exit_idle(); irq_enter(); + trace_irq_vector_entry(THRESHOLD_APIC_VECTOR); inc_irq_stat(irq_threshold_count); mce_threshold_vector(); + trace_irq_vector_exit(THRESHOLD_APIC_VECTOR); irq_exit(); /* Ack only at the end to avoid potential reentry */ ack_APIC_irq(); diff --git a/arch/x86/kernel/e820.c b/arch/x86/kernel/e820.c index 303a0e48f076..f655f802260d 100644 --- a/arch/x86/kernel/e820.c +++ b/arch/x86/kernel/e820.c @@ -19,6 +19,7 @@ #include <linux/acpi.h> #include <linux/firmware-map.h> #include <linux/memblock.h> +#include <linux/sort.h> #include <asm/e820.h> #include <asm/proto.h> @@ -227,22 +228,38 @@ void __init e820_print_map(char *who) * ____________________33__ * ______________________4_ */ +struct change_member { + struct e820entry *pbios; /* pointer to original bios entry */ + unsigned long long addr; /* address for this change point */ +}; + +static int __init cpcompare(const void *a, const void *b) +{ + struct change_member * const *app = a, * const *bpp = b; + const struct change_member *ap = *app, *bp = *bpp; + + /* + * Inputs are pointers to two elements of change_point[]. If their + * addresses are unequal, their difference dominates. If the addresses + * are equal, then consider one that represents the end of its region + * to be greater than one that does not. + */ + if (ap->addr != bp->addr) + return ap->addr > bp->addr ? 
1 : -1; + + return (ap->addr != ap->pbios->addr) - (bp->addr != bp->pbios->addr); +} int __init sanitize_e820_map(struct e820entry *biosmap, int max_nr_map, u32 *pnr_map) { - struct change_member { - struct e820entry *pbios; /* pointer to original bios entry */ - unsigned long long addr; /* address for this change point */ - }; static struct change_member change_point_list[2*E820_X_MAX] __initdata; static struct change_member *change_point[2*E820_X_MAX] __initdata; static struct e820entry *overlap_list[E820_X_MAX] __initdata; static struct e820entry new_bios[E820_X_MAX] __initdata; - struct change_member *change_tmp; unsigned long current_type, last_type; unsigned long long last_addr; - int chgidx, still_changing; + int chgidx; int overlap_entries; int new_bios_entry; int old_nr, new_nr, chg_nr; @@ -279,35 +296,7 @@ int __init sanitize_e820_map(struct e820entry *biosmap, int max_nr_map, chg_nr = chgidx; /* sort change-point list by memory addresses (low -> high) */ - still_changing = 1; - while (still_changing) { - still_changing = 0; - for (i = 1; i < chg_nr; i++) { - unsigned long long curaddr, lastaddr; - unsigned long long curpbaddr, lastpbaddr; - - curaddr = change_point[i]->addr; - lastaddr = change_point[i - 1]->addr; - curpbaddr = change_point[i]->pbios->addr; - lastpbaddr = change_point[i - 1]->pbios->addr; - - /* - * swap entries, when: - * - * curaddr > lastaddr or - * curaddr == lastaddr and curaddr == curpbaddr and - * lastaddr != lastpbaddr - */ - if (curaddr < lastaddr || - (curaddr == lastaddr && curaddr == curpbaddr && - lastaddr != lastpbaddr)) { - change_tmp = change_point[i]; - change_point[i] = change_point[i-1]; - change_point[i-1] = change_tmp; - still_changing = 1; - } - } - } + sort(change_point, chg_nr, sizeof *change_point, cpcompare, 0); /* create a new bios memory map, removing overlaps */ overlap_entries = 0; /* number of entries in the overlap table */ diff --git a/arch/x86/kernel/irq.c b/arch/x86/kernel/irq.c index 429e0c92924e..64aad37c0ff0 100644 --- a/arch/x86/kernel/irq.c +++ b/arch/x86/kernel/irq.c @@ -18,6 +18,9 @@ #include <asm/mce.h> #include <asm/hw_irq.h> +#define CREATE_TRACE_POINTS +#include <trace/events/irq_vectors.h> + atomic_t irq_err_count; /* Function pointer for generic interrupt vector handling */ @@ -212,12 +215,13 @@ void smp_x86_platform_ipi(struct pt_regs *regs) exit_idle(); irq_enter(); - + trace_irq_vector_entry(X86_PLATFORM_IPI_VECTOR); inc_irq_stat(x86_platform_ipis); if (x86_platform_ipi_callback) x86_platform_ipi_callback(); + trace_irq_vector_exit(X86_PLATFORM_IPI_VECTOR); irq_exit(); set_irq_regs(old_regs); diff --git a/arch/x86/kernel/irq_work.c b/arch/x86/kernel/irq_work.c index ca8f703a1e70..107754c8d8ec 100644 --- a/arch/x86/kernel/irq_work.c +++ b/arch/x86/kernel/irq_work.c @@ -8,13 +8,16 @@ #include <linux/irq_work.h> #include <linux/hardirq.h> #include <asm/apic.h> +#include <trace/events/irq_vectors.h> void smp_irq_work_interrupt(struct pt_regs *regs) { irq_enter(); ack_APIC_irq(); + trace_irq_vector_entry(IRQ_WORK_VECTOR); inc_irq_stat(apic_irq_work_irqs); irq_work_run(); + trace_irq_vector_exit(IRQ_WORK_VECTOR); irq_exit(); } diff --git a/arch/x86/kernel/smp.c b/arch/x86/kernel/smp.c index 16204dc15484..17c5d01868e0 100644 --- a/arch/x86/kernel/smp.c +++ b/arch/x86/kernel/smp.c @@ -23,6 +23,7 @@ #include <linux/interrupt.h> #include <linux/cpu.h> #include <linux/gfp.h> +#include <trace/events/irq_vectors.h> #include <asm/mtrr.h> #include <asm/tlbflush.h> @@ -200,8 +201,10 @@ static void native_stop_other_cpus(int 
wait) void smp_reschedule_interrupt(struct pt_regs *regs) { ack_APIC_irq(); + trace_irq_vector_entry(RESCHEDULE_VECTOR); inc_irq_stat(irq_resched_count); scheduler_ipi(); + trace_irq_vector_exit(RESCHEDULE_VECTOR); /* * KVM uses this interrupt to force a cpu out of guest mode */ @@ -211,8 +214,10 @@ void smp_call_function_interrupt(struct pt_regs *regs) { ack_APIC_irq(); irq_enter(); + trace_irq_vector_entry(CALL_FUNCTION_VECTOR); generic_smp_call_function_interrupt(); inc_irq_stat(irq_call_count); + trace_irq_vector_exit(CALL_FUNCTION_VECTOR); irq_exit(); } @@ -220,8 +225,10 @@ void smp_call_function_single_interrupt(struct pt_regs *regs) { ack_APIC_irq(); irq_enter(); + trace_irq_vector_entry(CALL_FUNCTION_SINGLE_VECTOR); generic_smp_call_function_single_interrupt(); inc_irq_stat(irq_call_count); + trace_irq_vector_exit(CALL_FUNCTION_SINGLE_VECTOR); irq_exit(); } diff --git a/arch/x86/kernel/syscall_table_32.S b/arch/x86/kernel/syscall_table_32.S index bc19be332bc9..9a0e31293920 100644 --- a/arch/x86/kernel/syscall_table_32.S +++ b/arch/x86/kernel/syscall_table_32.S @@ -346,3 +346,5 @@ ENTRY(sys_call_table) .long sys_syncfs .long sys_sendmmsg /* 345 */ .long sys_setns + .long sys_process_vm_readv + .long sys_process_vm_writev diff --git a/arch/x86/kernel/time.c b/arch/x86/kernel/time.c index dd5fbf4101fc..25046df137f2 100644 --- a/arch/x86/kernel/time.c +++ b/arch/x86/kernel/time.c @@ -52,6 +52,13 @@ unsigned long profile_pc(struct pt_regs *regs) } EXPORT_SYMBOL(profile_pc); +static irqreturn_t timer_interrupt(int irq, void *dev_id); +static struct irqaction irq0 = { + .handler = timer_interrupt, + .flags = IRQF_DISABLED | IRQF_NOBALANCING | IRQF_IRQPOLL | IRQF_TIMER, + .name = "timer" +}; + /* * Default timer interrupt handler for PIT/HPET */ @@ -60,7 +67,9 @@ static irqreturn_t timer_interrupt(int irq, void *dev_id) /* Keep nmi watchdog up to date */ inc_irq_stat(irq0_irqs); + trace_irq_handler_entry(irq, &irq0); global_clock_event->event_handler(global_clock_event); + trace_irq_handler_exit(irq, &irq0, 1); /* MCA bus quirk: Acknowledge irq0 by setting bit 7 in port 0x61 */ if (MCA_bus) @@ -69,12 +78,6 @@ static irqreturn_t timer_interrupt(int irq, void *dev_id) return IRQ_HANDLED; } -static struct irqaction irq0 = { - .handler = timer_interrupt, - .flags = IRQF_DISABLED | IRQF_NOBALANCING | IRQF_IRQPOLL | IRQF_TIMER, - .name = "timer" -}; - void __init setup_default_timer_irq(void) { setup_irq(0, &irq0); diff --git a/arch/x86/kernel/traps.c b/arch/x86/kernel/traps.c index 6913369c234c..1286877967e9 100644 --- a/arch/x86/kernel/traps.c +++ b/arch/x86/kernel/traps.c @@ -31,6 +31,7 @@ #include <linux/mm.h> #include <linux/smp.h> #include <linux/io.h> +#include <trace/events/irq_vectors.h> #ifdef CONFIG_EISA #include <linux/ioport.h> @@ -434,12 +435,14 @@ dotraplinkage notrace __kprobes void do_nmi(struct pt_regs *regs, long error_code) { nmi_enter(); + trace_irq_vector_entry(NMI_VECTOR); inc_irq_stat(__nmi_count); if (!ignore_nmis) default_do_nmi(regs); + trace_irq_vector_exit(NMI_VECTOR); nmi_exit(); } diff --git a/arch/x86/lib/usercopy_32.c b/arch/x86/lib/usercopy_32.c index e218d5df85ff..8498684e45b0 100644 --- a/arch/x86/lib/usercopy_32.c +++ b/arch/x86/lib/usercopy_32.c @@ -883,9 +883,3 @@ _copy_from_user(void *to, const void __user *from, unsigned long n) return n; } EXPORT_SYMBOL(_copy_from_user); - -void copy_from_user_overflow(void) -{ - WARN(1, "Buffer overflow detected!\n"); -} -EXPORT_SYMBOL(copy_from_user_overflow); diff --git a/arch/x86/mm/mmap.c b/arch/x86/mm/mmap.c index 
4b5ba85eb5c9..845df6835f9f 100644 --- a/arch/x86/mm/mmap.c +++ b/arch/x86/mm/mmap.c @@ -75,9 +75,9 @@ static unsigned long mmap_rnd(void) */ if (current->flags & PF_RANDOMIZE) { if (mmap_is_ia32()) - rnd = (long)get_random_int() % (1<<8); + rnd = get_random_int() % (1<<8); else - rnd = (long)(get_random_int() % (1<<28)); + rnd = get_random_int() % (1<<28); } return rnd << PAGE_SHIFT; } diff --git a/arch/x86/mm/tlb.c b/arch/x86/mm/tlb.c index d6c0418c3e47..bf9475d97307 100644 --- a/arch/x86/mm/tlb.c +++ b/arch/x86/mm/tlb.c @@ -5,6 +5,7 @@ #include <linux/smp.h> #include <linux/interrupt.h> #include <linux/module.h> +#include <trace/events/irq_vectors.h> #include <linux/cpu.h> #include <asm/tlbflush.h> @@ -141,6 +142,7 @@ void smp_invalidate_interrupt(struct pt_regs *regs) sender = ~regs->orig_ax - INVALIDATE_TLB_VECTOR_START; f = &flush_state[sender]; + trace_irq_vector_entry(INVALIDATE_TLB_VECTOR_START + sender); if (!cpumask_test_cpu(cpu, to_cpumask(f->flush_cpumask))) goto out; /* @@ -167,6 +169,7 @@ out: cpumask_clear_cpu(cpu, to_cpumask(f->flush_cpumask)); smp_mb__after_clear_bit(); inc_irq_stat(irq_tlb_count); + trace_irq_vector_exit(INVALIDATE_TLB_VECTOR_START + sender); } static void flush_tlb_others_ipi(const struct cpumask *cpumask, diff --git a/arch/x86/platform/Makefile b/arch/x86/platform/Makefile index 021eee91c056..8d874396cb29 100644 --- a/arch/x86/platform/Makefile +++ b/arch/x86/platform/Makefile @@ -1,6 +1,7 @@ # Platform specific code goes here obj-y += ce4100/ obj-y += efi/ +obj-y += geode/ obj-y += iris/ obj-y += mrst/ obj-y += olpc/ diff --git a/arch/x86/platform/geode/Makefile b/arch/x86/platform/geode/Makefile new file mode 100644 index 000000000000..07c9cd05021a --- /dev/null +++ b/arch/x86/platform/geode/Makefile @@ -0,0 +1 @@ +obj-$(CONFIG_ALIX) += alix.o diff --git a/arch/x86/platform/geode/alix.c b/arch/x86/platform/geode/alix.c new file mode 100644 index 000000000000..ca1973699d3d --- /dev/null +++ b/arch/x86/platform/geode/alix.c @@ -0,0 +1,142 @@ +/* + * System Specific setup for PCEngines ALIX. + * At the moment this means setup of GPIO control of LEDs + * on Alix.2/3/6 boards. + * + * + * Copyright (C) 2008 Constantin Baranov <const@mimas.ru> + * Copyright (C) 2011 Ed Wildgoose <kernel@wildgooses.com> + * + * TODO: There are large similarities with leds-net5501.c + * by Alessandro Zummo <a.zummo@towertech.it> + * In the future leds-net5501.c should be migrated over to platform + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License version 2 + * as published by the Free Software Foundation. 
+ */ + +#include <linux/kernel.h> +#include <linux/init.h> +#include <linux/io.h> +#include <linux/string.h> +#include <linux/module.h> +#include <linux/leds.h> +#include <linux/platform_device.h> +#include <linux/gpio.h> + +#include <asm/geode.h> + +static int force = 0; +module_param(force, bool, 0444); +/* FIXME: Award bios is not automatically detected as Alix platform */ +MODULE_PARM_DESC(force, "Force detection as ALIX.2/ALIX.3 platform"); + +static struct gpio_led alix_leds[] = { + { + .name = "alix:1", + .gpio = 6, + .default_trigger = "default-on", + .active_low = 1, + }, + { + .name = "alix:2", + .gpio = 25, + .default_trigger = "default-off", + .active_low = 1, + }, + { + .name = "alix:3", + .gpio = 27, + .default_trigger = "default-off", + .active_low = 1, + }, +}; + +static struct gpio_led_platform_data alix_leds_data = { + .num_leds = ARRAY_SIZE(alix_leds), + .leds = alix_leds, +}; + +static struct platform_device alix_leds_dev = { + .name = "leds-gpio", + .id = -1, + .dev.platform_data = &alix_leds_data, +}; + +static void __init register_alix(void) +{ + /* Setup LED control through leds-gpio driver */ + platform_device_register(&alix_leds_dev); +} + +static int __init alix_present(unsigned long bios_phys, + const char *alix_sig, + size_t alix_sig_len) +{ + const size_t bios_len = 0x00010000; + const char *bios_virt; + const char *scan_end; + const char *p; + char name[64]; + + if (force) { + printk(KERN_NOTICE "%s: forced to skip BIOS test, " + "assume system is ALIX.2/ALIX.3\n", + KBUILD_MODNAME); + return 1; + } + + bios_virt = phys_to_virt(bios_phys); + scan_end = bios_virt + bios_len - (alix_sig_len + 2); + for (p = bios_virt; p < scan_end; p++) { + const char *tail; + char *a; + + if (memcmp(p, alix_sig, alix_sig_len) != 0) + continue; + + memcpy(name, p, sizeof(name)); + + /* remove the first \0 character from string */ + a = strchr(name, '\0'); + if (a) + *a = ' '; + + /* cut the string at a newline */ + a = strchr(name, '\r'); + if (a) + *a = '\0'; + + tail = p + alix_sig_len; + if ((tail[0] == '2' || tail[0] == '3')) { + printk(KERN_INFO + "%s: system is recognized as \"%s\"\n", + KBUILD_MODNAME, name); + return 1; + } + } + + return 0; +} + +static int __init alix_init(void) +{ + const char tinybios_sig[] = "PC Engines ALIX."; + const char coreboot_sig[] = "PC Engines\0ALIX."; + + if (!is_geode()) + return 0; + + if (alix_present(0xf0000, tinybios_sig, sizeof(tinybios_sig) - 1) || + alix_present(0x500, coreboot_sig, sizeof(coreboot_sig) - 1)) + register_alix(); + + return 0; +} + +module_init(alix_init); + +MODULE_AUTHOR("Ed Wildgoose <kernel@wildgooses.com>"); +MODULE_DESCRIPTION("PCEngines ALIX System Setup"); +MODULE_LICENSE("GPL"); diff --git a/arch/x86/platform/iris/iris.c b/arch/x86/platform/iris/iris.c index 1ba7f5ed8c9b..f1900223aca7 100644 --- a/arch/x86/platform/iris/iris.c +++ b/arch/x86/platform/iris/iris.c @@ -23,6 +23,7 @@ #include <linux/moduleparam.h> #include <linux/module.h> +#include <linux/platform_device.h> #include <linux/kernel.h> #include <linux/errno.h> #include <linux/delay.h> @@ -62,29 +63,75 @@ static void iris_power_off(void) * by reading its input port and seeing whether the read value is * meaningful. 
*/ -static int iris_init(void) +static int iris_probe(struct platform_device *pdev) { - unsigned char status; - if (force != 1) { - printk(KERN_ERR "The force parameter has not been set to 1 so the Iris poweroff handler will not be installed.\n"); - return -ENODEV; - } - status = inb(IRIS_GIO_INPUT); + unsigned char status = inb(IRIS_GIO_INPUT); if (status == IRIS_GIO_NODEV) { - printk(KERN_ERR "This machine does not seem to be an Iris. Power_off handler not installed.\n"); + printk(KERN_ERR "This machine does not seem to be an Iris. " + "Power off handler not installed.\n"); return -ENODEV; } old_pm_power_off = pm_power_off; pm_power_off = &iris_power_off; printk(KERN_INFO "Iris power_off handler installed.\n"); - return 0; } -static void iris_exit(void) +static int iris_remove(struct platform_device *pdev) { pm_power_off = old_pm_power_off; printk(KERN_INFO "Iris power_off handler uninstalled.\n"); + return 0; +} + +static struct platform_driver iris_driver = { + .driver = { + .name = "iris", + .owner = THIS_MODULE, + }, + .probe = iris_probe, + .remove = iris_remove, +}; + +static struct resource iris_resources[] = { + { + .start = IRIS_GIO_BASE, + .end = IRIS_GIO_OUTPUT, + .flags = IORESOURCE_IO, + .name = "address" + } +}; + +static struct platform_device *iris_device; + +static int iris_init(void) +{ + int ret; + if (force != 1) { + printk(KERN_ERR "The force parameter has not been set to 1." + " The Iris poweroff handler will not be installed.\n"); + return -ENODEV; + } + ret = platform_driver_register(&iris_driver); + if (ret < 0) { + printk(KERN_ERR "Failed to register iris platform driver: %d\n", + ret); + return ret; + } + iris_device = platform_device_register_simple("iris", (-1), + iris_resources, ARRAY_SIZE(iris_resources)); + if (IS_ERR(iris_device)) { + printk(KERN_ERR "Failed to register iris platform device\n"); + platform_driver_unregister(&iris_driver); + return PTR_ERR(iris_device); + } + return 0; +} + +static void iris_exit(void) +{ + platform_device_unregister(iris_device); + platform_driver_unregister(&iris_driver); } module_init(iris_init); diff --git a/drivers/base/node.c b/drivers/base/node.c index 793f796c4da3..9e58e71911b3 100644 --- a/drivers/base/node.c +++ b/drivers/base/node.c @@ -176,6 +176,7 @@ static ssize_t node_read_numastat(struct sys_device * dev, } static SYSDEV_ATTR(numastat, S_IRUGO, node_read_numastat, NULL); +#if defined(CONFIG_PROC_FS) || defined(CONFIG_SYSFS) static ssize_t node_read_vmstat(struct sys_device *dev, struct sysdev_attribute *attr, char *buf) { @@ -190,6 +191,7 @@ static ssize_t node_read_vmstat(struct sys_device *dev, return n; } static SYSDEV_ATTR(vmstat, S_IRUGO, node_read_vmstat, NULL); +#endif static ssize_t node_read_distance(struct sys_device * dev, struct sysdev_attribute *attr, char * buf) @@ -274,7 +276,7 @@ int register_node(struct node *node, int num, struct node *parent) sysdev_create_file(&node->sysdev, &attr_meminfo); sysdev_create_file(&node->sysdev, &attr_numastat); sysdev_create_file(&node->sysdev, &attr_distance); - sysdev_create_file(&node->sysdev, &attr_vmstat); + sysdev_create_file_optional(&node->sysdev, &attr_vmstat); scan_unevictable_register_node(node); @@ -299,7 +301,7 @@ void unregister_node(struct node *node) sysdev_remove_file(&node->sysdev, &attr_meminfo); sysdev_remove_file(&node->sysdev, &attr_numastat); sysdev_remove_file(&node->sysdev, &attr_distance); - sysdev_remove_file(&node->sysdev, &attr_vmstat); + sysdev_remove_file_optional(&node->sysdev, &attr_vmstat); 
scan_unevictable_unregister_node(node); hugetlb_unregister_node(node); /* no-op, if memoryless node */ diff --git a/drivers/block/brd.c b/drivers/block/brd.c index dba1c32e1ddf..a9587ee5136d 100644 --- a/drivers/block/brd.c +++ b/drivers/block/brd.c @@ -360,7 +360,7 @@ static int brd_make_request(struct request_queue *q, struct bio *bio) out: bio_endio(bio, err); - return 0; + return err; } #ifdef CONFIG_BLK_DEV_XIP diff --git a/drivers/block/floppy.c b/drivers/block/floppy.c index 98de8f418676..9955a53733b2 100644 --- a/drivers/block/floppy.c +++ b/drivers/block/floppy.c @@ -4250,7 +4250,7 @@ static int __init floppy_init(void) use_virtual_dma = can_use_virtual_dma & 1; fdc_state[0].address = FDC1; if (fdc_state[0].address == -1) { - del_timer(&fd_timeout); + del_timer_sync(&fd_timeout); err = -ENODEV; goto out_unreg_region; } @@ -4261,7 +4261,7 @@ static int __init floppy_init(void) fdc = 0; /* reset fdc in case of unexpected interrupt */ err = floppy_grab_irq_and_dma(); if (err) { - del_timer(&fd_timeout); + del_timer_sync(&fd_timeout); err = -EBUSY; goto out_unreg_region; } @@ -4318,7 +4318,7 @@ static int __init floppy_init(void) user_reset_fdc(-1, FD_RESET_ALWAYS, false); } fdc = 0; - del_timer(&fd_timeout); + del_timer_sync(&fd_timeout); current_drive = 0; initialized = true; if (have_no_fdc) { @@ -4368,7 +4368,7 @@ out_unreg_blkdev: unregister_blkdev(FLOPPY_MAJOR, "fd"); out_put_disk: while (dr--) { - del_timer(&motor_off_timer[dr]); + del_timer_sync(&motor_off_timer[dr]); if (disks[dr]->queue) blk_cleanup_queue(disks[dr]->queue); put_disk(disks[dr]); diff --git a/drivers/char/hpet.c b/drivers/char/hpet.c index 0833896cf6f2..c9aeb7fce878 100644 --- a/drivers/char/hpet.c +++ b/drivers/char/hpet.c @@ -263,16 +263,40 @@ static void hpet_timer_set_irq(struct hpet_dev *devp) static int hpet_open(struct inode *inode, struct file *file) { - struct hpet_dev *devp; struct hpets *hpetp; - int i; if (file->f_mode & FMODE_WRITE) return -EINVAL; + hpetp = hpets; + /* starting with timer-neutral instance */ + file->private_data = &hpetp->hp_dev[hpetp->hp_ntimer]; + + return 0; +} + +static int hpet_alloc_timer(struct file *file) +{ + struct hpet_dev *devp; + struct hpets *hpetp; + int i; + + /* once acquired, will remain */ + devp = file->private_data; + if (devp->hd_timer) + return 0; + mutex_lock(&hpet_mutex); spin_lock_irq(&hpet_lock); + /* check for race acquiring */ + devp = file->private_data; + if (devp->hd_timer) { + spin_unlock_irq(&hpet_lock); + mutex_unlock(&hpet_mutex); + return 0; + } + for (devp = NULL, hpetp = hpets; hpetp && !devp; hpetp = hpetp->hp_next) for (i = 0; i < hpetp->hp_ntimer; i++) if (hpetp->hp_dev[i].hd_flags & HPET_OPEN) @@ -402,6 +426,11 @@ static int hpet_mmap(struct file *file, struct vm_area_struct *vma) static int hpet_fasync(int fd, struct file *file, int on) { struct hpet_dev *devp; + int r; + + r = hpet_alloc_timer(file); + if (r < 0) + return r; devp = file->private_data; @@ -420,6 +449,9 @@ static int hpet_release(struct inode *inode, struct file *file) devp = file->private_data; timer = devp->hd_timer; + if (!timer) + goto out; + spin_lock_irq(&hpet_lock); writeq((readq(&timer->hpet_config) & ~Tn_INT_ENB_CNF_MASK), @@ -444,7 +476,7 @@ static int hpet_release(struct inode *inode, struct file *file) if (irq) free_irq(irq, devp); - +out: file->private_data = NULL; return 0; } @@ -593,6 +625,9 @@ hpet_ioctl_common(struct hpet_dev *devp, int cmd, unsigned long arg, break; case HPET_IE_ON: return hpet_ioctl_ieon(devp); + case HPET_ALLOC_TIMER: + /* nothing to 
do */ + return 0; default: return -EINVAL; } @@ -859,7 +894,11 @@ int hpet_alloc(struct hpet_data *hdp) return 0; } - siz = sizeof(struct hpets) + ((hdp->hd_nirqs - 1) * + /* + * last hpet_dev will have null timer pointer, gives timer-neutral + * representation of block + */ + siz = sizeof(struct hpets) + ((hdp->hd_nirqs) * sizeof(struct hpet_dev)); hpetp = kzalloc(siz, GFP_KERNEL); @@ -925,13 +964,16 @@ int hpet_alloc(struct hpet_data *hdp) writeq(mcfg, &hpet->hpet_config); } - for (i = 0, devp = hpetp->hp_dev; i < hpetp->hp_ntimer; i++, devp++) { + for (i = 0, devp = hpetp->hp_dev; i < hpetp->hp_ntimer + 1; + i++, devp++) { struct hpet_timer __iomem *timer; - timer = &hpet->hpet_timers[devp - hpetp->hp_dev]; - devp->hd_hpets = hpetp; devp->hd_hpet = hpet; + if (i == hpetp->hp_ntimer) + continue; + + timer = &hpet->hpet_timers[devp - hpetp->hp_dev]; devp->hd_timer = timer; /* diff --git a/drivers/cpufreq/cpufreq_conservative.c b/drivers/cpufreq/cpufreq_conservative.c index 33b56e5c5c14..c97b468ee9f7 100644 --- a/drivers/cpufreq/cpufreq_conservative.c +++ b/drivers/cpufreq/cpufreq_conservative.c @@ -120,10 +120,12 @@ static inline cputime64_t get_cpu_idle_time_jiffy(unsigned int cpu, static inline cputime64_t get_cpu_idle_time(unsigned int cpu, cputime64_t *wall) { - u64 idle_time = get_cpu_idle_time_us(cpu, wall); + u64 idle_time = get_cpu_idle_time_us(cpu, NULL); if (idle_time == -1ULL) return get_cpu_idle_time_jiffy(cpu, wall); + else + idle_time += get_cpu_iowait_time_us(cpu, wall); return idle_time; } diff --git a/drivers/cpufreq/cpufreq_ondemand.c b/drivers/cpufreq/cpufreq_ondemand.c index 629b3ec698e2..fa8af4ebb1d6 100644 --- a/drivers/cpufreq/cpufreq_ondemand.c +++ b/drivers/cpufreq/cpufreq_ondemand.c @@ -144,10 +144,12 @@ static inline cputime64_t get_cpu_idle_time_jiffy(unsigned int cpu, static inline cputime64_t get_cpu_idle_time(unsigned int cpu, cputime64_t *wall) { - u64 idle_time = get_cpu_idle_time_us(cpu, wall); + u64 idle_time = get_cpu_idle_time_us(cpu, NULL); if (idle_time == -1ULL) return get_cpu_idle_time_jiffy(cpu, wall); + else + idle_time += get_cpu_iowait_time_us(cpu, wall); return idle_time; } diff --git a/drivers/firmware/dmi_scan.c b/drivers/firmware/dmi_scan.c index bcb1126e3d00..153980be4ee6 100644 --- a/drivers/firmware/dmi_scan.c +++ b/drivers/firmware/dmi_scan.c @@ -585,14 +585,12 @@ int dmi_name_in_serial(const char *str) } /** - * dmi_name_in_vendors - Check if string is anywhere in the DMI vendor information. 
+ * dmi_name_in_vendors - Check if string is in the DMI system or board vendor name * @str: Case sensitive Name */ int dmi_name_in_vendors(const char *str) { - static int fields[] = { DMI_BIOS_VENDOR, DMI_BIOS_VERSION, DMI_SYS_VENDOR, - DMI_PRODUCT_NAME, DMI_PRODUCT_VERSION, DMI_BOARD_VENDOR, - DMI_BOARD_NAME, DMI_BOARD_VERSION, DMI_NONE }; + static int fields[] = { DMI_SYS_VENDOR, DMI_BOARD_VENDOR, DMI_NONE }; int i; for (i = 0; fields[i] != DMI_NONE; i++) { int f = fields[i]; diff --git a/drivers/gpu/drm/Kconfig b/drivers/gpu/drm/Kconfig index b493663c7ba7..b71218d03391 100644 --- a/drivers/gpu/drm/Kconfig +++ b/drivers/gpu/drm/Kconfig @@ -96,6 +96,7 @@ config DRM_I915 select FB_CFB_IMAGEBLIT # i915 depends on ACPI_VIDEO when ACPI is enabled # but for select to work, need to select ACPI_VIDEO's dependencies, ick + select BACKLIGHT_LCD_SUPPORT if ACPI select BACKLIGHT_CLASS_DEVICE if ACPI select VIDEO_OUTPUT_CONTROL if ACPI select INPUT if ACPI diff --git a/drivers/gpu/drm/vmwgfx/vmwgfx_gmrid_manager.c b/drivers/gpu/drm/vmwgfx/vmwgfx_gmrid_manager.c index ac6e0d1bd629..d51577bc1766 100644 --- a/drivers/gpu/drm/vmwgfx/vmwgfx_gmrid_manager.c +++ b/drivers/gpu/drm/vmwgfx/vmwgfx_gmrid_manager.c @@ -37,7 +37,6 @@ #include <linux/kernel.h> struct vmwgfx_gmrid_man { - spinlock_t lock; struct ida gmr_ida; uint32_t max_gmr_ids; }; @@ -49,34 +48,20 @@ static int vmw_gmrid_man_get_node(struct ttm_mem_type_manager *man, { struct vmwgfx_gmrid_man *gman = (struct vmwgfx_gmrid_man *)man->priv; - int ret; int id; mem->mm_node = NULL; - do { - if (unlikely(ida_pre_get(&gman->gmr_ida, GFP_KERNEL) == 0)) - return -ENOMEM; - - spin_lock(&gman->lock); - ret = ida_get_new(&gman->gmr_ida, &id); - - if (unlikely(ret == 0 && id >= gman->max_gmr_ids)) { - ida_remove(&gman->gmr_ida, id); - spin_unlock(&gman->lock); + id = ida_simple_get(&gman->gmr_ida, 0, gman->max_gmr_ids, GFP_KERNEL); + if (id < 0) { + if (id == -ENOSPC) return 0; - } - - spin_unlock(&gman->lock); - - } while (ret == -EAGAIN); - - if (likely(ret == 0)) { - mem->mm_node = gman; - mem->start = id; + return id; } + mem->mm_node = gman; + mem->start = id; - return ret; + return 0; } static void vmw_gmrid_man_put_node(struct ttm_mem_type_manager *man, @@ -86,9 +71,7 @@ static void vmw_gmrid_man_put_node(struct ttm_mem_type_manager *man, (struct vmwgfx_gmrid_man *)man->priv; if (mem->mm_node) { - spin_lock(&gman->lock); - ida_remove(&gman->gmr_ida, mem->start); - spin_unlock(&gman->lock); + ida_simple_remove(&gman->gmr_ida, mem->start); mem->mm_node = NULL; } } @@ -102,7 +85,6 @@ static int vmw_gmrid_man_init(struct ttm_mem_type_manager *man, if (unlikely(gman == NULL)) return -ENOMEM; - spin_lock_init(&gman->lock); ida_init(&gman->gmr_ida); gman->max_gmr_ids = p_size; man->priv = (void *) gman; diff --git a/drivers/gpu/vga/vgaarb.c b/drivers/gpu/vga/vgaarb.c index c72f1c0b5e63..18376f113b0f 100644 --- a/drivers/gpu/vga/vgaarb.c +++ b/drivers/gpu/vga/vgaarb.c @@ -993,14 +993,20 @@ static ssize_t vga_arb_write(struct file *file, const char __user * buf, uc = &priv->cards[i]; } - if (!uc) - return -EINVAL; + if (!uc) { + ret_val = -EINVAL; + goto done; + } - if (io_state & VGA_RSRC_LEGACY_IO && uc->io_cnt == 0) - return -EINVAL; + if (io_state & VGA_RSRC_LEGACY_IO && uc->io_cnt == 0) { + ret_val = -EINVAL; + goto done; + } - if (io_state & VGA_RSRC_LEGACY_MEM && uc->mem_cnt == 0) - return -EINVAL; + if (io_state & VGA_RSRC_LEGACY_MEM && uc->mem_cnt == 0) { + ret_val = -EINVAL; + goto done; + } vga_put(pdev, io_state); diff --git a/drivers/hwmon/hwmon.c 
b/drivers/hwmon/hwmon.c index a61e7815a2a9..6460487e41b5 100644 --- a/drivers/hwmon/hwmon.c +++ b/drivers/hwmon/hwmon.c @@ -27,8 +27,7 @@ static struct class *hwmon_class; -static DEFINE_IDR(hwmon_idr); -static DEFINE_SPINLOCK(idr_lock); +static DEFINE_IDA(hwmon_ida); /** * hwmon_device_register - register w/ hwmon @@ -42,30 +41,17 @@ static DEFINE_SPINLOCK(idr_lock); struct device *hwmon_device_register(struct device *dev) { struct device *hwdev; - int id, err; - -again: - if (unlikely(idr_pre_get(&hwmon_idr, GFP_KERNEL) == 0)) - return ERR_PTR(-ENOMEM); - - spin_lock(&idr_lock); - err = idr_get_new(&hwmon_idr, NULL, &id); - spin_unlock(&idr_lock); + int id; - if (unlikely(err == -EAGAIN)) - goto again; - else if (unlikely(err)) - return ERR_PTR(err); + id = ida_simple_get(&hwmon_ida, 0, 0, GFP_KERNEL); + if (id < 0) + return ERR_PTR(id); - id = id & MAX_ID_MASK; hwdev = device_create(hwmon_class, dev, MKDEV(0, 0), NULL, HWMON_ID_FORMAT, id); - if (IS_ERR(hwdev)) { - spin_lock(&idr_lock); - idr_remove(&hwmon_idr, id); - spin_unlock(&idr_lock); - } + if (IS_ERR(hwdev)) + ida_simple_remove(&hwmon_ida, id); return hwdev; } @@ -81,9 +67,7 @@ void hwmon_device_unregister(struct device *dev) if (likely(sscanf(dev_name(dev), HWMON_ID_FORMAT, &id) == 1)) { device_unregister(dev); - spin_lock(&idr_lock); - idr_remove(&hwmon_idr, id); - spin_unlock(&idr_lock); + ida_simple_remove(&hwmon_ida, id); } else dev_dbg(dev->parent, "hwmon_device_unregister() failed: bad class ID!\n"); diff --git a/drivers/hwmon/ibmaem.c b/drivers/hwmon/ibmaem.c index c6d59b1788d7..6a967d7dbdee 100644 --- a/drivers/hwmon/ibmaem.c +++ b/drivers/hwmon/ibmaem.c @@ -88,8 +88,7 @@ #define AEM_MIN_POWER_INTERVAL 200 #define UJ_PER_MJ 1000L -static DEFINE_IDR(aem_idr); -static DEFINE_SPINLOCK(aem_idr_lock); +static DEFINE_IDA(aem_ida); static struct platform_driver aem_driver = { .driver = { @@ -355,38 +354,6 @@ static void aem_msg_handler(struct ipmi_recv_msg *msg, void *user_msg_data) complete(&data->read_complete); } -/* ID functions */ - -/* Obtain an id */ -static int aem_idr_get(int *id) -{ - int i, err; - -again: - if (unlikely(!idr_pre_get(&aem_idr, GFP_KERNEL))) - return -ENOMEM; - - spin_lock(&aem_idr_lock); - err = idr_get_new(&aem_idr, NULL, &i); - spin_unlock(&aem_idr_lock); - - if (unlikely(err == -EAGAIN)) - goto again; - else if (unlikely(err)) - return err; - - *id = i & MAX_ID_MASK; - return 0; -} - -/* Release an object ID */ -static void aem_idr_put(int id) -{ - spin_lock(&aem_idr_lock); - idr_remove(&aem_idr, id); - spin_unlock(&aem_idr_lock); -} - /* Sensor support functions */ /* Read a sensor value; must be called with data->lock held */ @@ -526,7 +493,7 @@ static void aem_delete(struct aem_data *data) ipmi_destroy_user(data->ipmi.user); platform_set_drvdata(data->pdev, NULL); platform_device_unregister(data->pdev); - aem_idr_put(data->id); + ida_simple_remove(&aem_ida, data->id); kfree(data); } @@ -583,7 +550,8 @@ static int aem_init_aem1_inst(struct aem_ipmi_data *probe, u8 module_handle) data->power_period[i] = AEM_DEFAULT_POWER_INTERVAL; /* Create sub-device for this fw instance */ - if (aem_idr_get(&data->id)) + data->id = ida_simple_get(&aem_ida, 0, 0, GFP_KERNEL); + if (data->id < 0) goto id_err; data->pdev = platform_device_alloc(DRVNAME, data->id); @@ -643,7 +611,7 @@ ipmi_err: platform_set_drvdata(data->pdev, NULL); platform_device_unregister(data->pdev); dev_err: - aem_idr_put(data->id); + ida_simple_remove(&aem_ida, data->id); id_err: kfree(data); @@ -722,7 +690,8 @@ static int 
aem_init_aem2_inst(struct aem_ipmi_data *probe, data->power_period[i] = AEM_DEFAULT_POWER_INTERVAL; /* Create sub-device for this fw instance */ - if (aem_idr_get(&data->id)) + data->id = ida_simple_get(&aem_ida, 0, 0, GFP_KERNEL); + if (data->id < 0) goto id_err; data->pdev = platform_device_alloc(DRVNAME, data->id); @@ -782,7 +751,7 @@ ipmi_err: platform_set_drvdata(data->pdev, NULL); platform_device_unregister(data->pdev); dev_err: - aem_idr_put(data->id); + ida_simple_remove(&aem_ida, data->id); id_err: kfree(data); diff --git a/drivers/idle/intel_idle.c b/drivers/idle/intel_idle.c index 18767f8ab090..86e1850e1380 100644 --- a/drivers/idle/intel_idle.c +++ b/drivers/idle/intel_idle.c @@ -271,33 +271,42 @@ static void __setup_broadcast_timer(void *arg) clockevents_notify(reason, &cpu); } -static int setup_broadcast_cpuhp_notify(struct notifier_block *n, +static void auto_demotion_disable(void *dummy) +{ + unsigned long long msr_bits; + + rdmsrl(MSR_NHM_SNB_PKG_CST_CFG_CTL, msr_bits); + msr_bits &= ~auto_demotion_disable_flags; + wrmsrl(MSR_NHM_SNB_PKG_CST_CFG_CTL, msr_bits); +} + +static void __intel_idle_notify_handler(void *arg) +{ + if (auto_demotion_disable_flags) + auto_demotion_disable(NULL); + + if (lapic_timer_reliable_states != LAPIC_TIMER_ALWAYS_RELIABLE) + __setup_broadcast_timer((void *)true); +} + +static int setup_intelidle_cpuhp_notify(struct notifier_block *n, unsigned long action, void *hcpu) { int hotcpu = (unsigned long)hcpu; switch (action & 0xf) { case CPU_ONLINE: - smp_call_function_single(hotcpu, __setup_broadcast_timer, - (void *)true, 1); + smp_call_function_single(hotcpu, __intel_idle_notify_handler, + NULL, 1); break; } return NOTIFY_OK; } -static struct notifier_block setup_broadcast_notifier = { - .notifier_call = setup_broadcast_cpuhp_notify, +static struct notifier_block setup_intelidle_notifier = { + .notifier_call = setup_intelidle_cpuhp_notify, }; -static void auto_demotion_disable(void *dummy) -{ - unsigned long long msr_bits; - - rdmsrl(MSR_NHM_SNB_PKG_CST_CFG_CTL, msr_bits); - msr_bits &= ~auto_demotion_disable_flags; - wrmsrl(MSR_NHM_SNB_PKG_CST_CFG_CTL, msr_bits); -} - /* * intel_idle_probe() */ @@ -367,10 +376,8 @@ static int intel_idle_probe(void) if (boot_cpu_has(X86_FEATURE_ARAT)) /* Always Reliable APIC Timer */ lapic_timer_reliable_states = LAPIC_TIMER_ALWAYS_RELIABLE; - else { - smp_call_function(__setup_broadcast_timer, (void *)true, 1); - register_cpu_notifier(&setup_broadcast_notifier); - } + else + on_each_cpu(__setup_broadcast_timer, (void *)true, 1); pr_debug(PREFIX "v" INTEL_IDLE_VERSION " model 0x%X\n", boot_cpu_data.x86_model); @@ -460,7 +467,7 @@ static int intel_idle_cpuidle_devices_init(void) } } if (auto_demotion_disable_flags) - smp_call_function(auto_demotion_disable, NULL, 1); + on_each_cpu(auto_demotion_disable, NULL, 1); return 0; } @@ -491,6 +498,10 @@ static int __init intel_idle_init(void) return retval; } + if (auto_demotion_disable_flags || lapic_timer_reliable_states != + LAPIC_TIMER_ALWAYS_RELIABLE) + register_cpu_notifier(&setup_intelidle_notifier); + return 0; } @@ -499,10 +510,12 @@ static void __exit intel_idle_exit(void) intel_idle_cpuidle_devices_uninit(); cpuidle_unregister_driver(&intel_idle_driver); - if (lapic_timer_reliable_states != LAPIC_TIMER_ALWAYS_RELIABLE) { - smp_call_function(__setup_broadcast_timer, (void *)false, 1); - unregister_cpu_notifier(&setup_broadcast_notifier); - } + if (lapic_timer_reliable_states != LAPIC_TIMER_ALWAYS_RELIABLE) + on_each_cpu(__setup_broadcast_timer, (void *)false, 1); 
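One detail worth calling out in the intel_idle changes above (the same hunk continues right after this note): smp_call_function() runs a callback on every online CPU except the caller, while on_each_cpu() also runs it on the calling CPU, so a single call now covers the whole machine. A minimal sketch of the pattern follows; the function names and the per-CPU work are placeholders, not taken from this merge.

    #include <linux/smp.h>

    /* Illustrative only -- stands in for things like auto_demotion_disable()
     * or __setup_broadcast_timer() in the hunk above. */
    static void per_cpu_setup(void *info)
    {
            /* per-CPU work, e.g. an MSR write; runs once on every online CPU */
    }

    static void setup_all_cpus(void)
    {
            /*
             * on_each_cpu() == smp_call_function() on all other CPUs plus a
             * local call on the current CPU; the final argument (1) waits
             * until every CPU has finished running the callback.
             */
            on_each_cpu(per_cpu_setup, NULL, 1);
    }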
+ + if (auto_demotion_disable_flags || lapic_timer_reliable_states != + LAPIC_TIMER_ALWAYS_RELIABLE) + unregister_cpu_notifier(&setup_intelidle_notifier); return; } diff --git a/drivers/infiniband/core/umem.c b/drivers/infiniband/core/umem.c index cc92137b3e02..71f0c0f7df94 100644 --- a/drivers/infiniband/core/umem.c +++ b/drivers/infiniband/core/umem.c @@ -137,7 +137,7 @@ struct ib_umem *ib_umem_get(struct ib_ucontext *context, unsigned long addr, down_write(¤t->mm->mmap_sem); - locked = npages + current->mm->locked_vm; + locked = npages + current->mm->pinned_vm; lock_limit = rlimit(RLIMIT_MEMLOCK) >> PAGE_SHIFT; if ((locked > lock_limit) && !capable(CAP_IPC_LOCK)) { @@ -207,7 +207,7 @@ out: __ib_umem_release(context->device, umem, 0); kfree(umem); } else - current->mm->locked_vm = locked; + current->mm->pinned_vm = locked; up_write(¤t->mm->mmap_sem); if (vma_list) @@ -223,7 +223,7 @@ static void ib_umem_account(struct work_struct *work) struct ib_umem *umem = container_of(work, struct ib_umem, work); down_write(&umem->mm->mmap_sem); - umem->mm->locked_vm -= umem->diff; + umem->mm->pinned_vm -= umem->diff; up_write(&umem->mm->mmap_sem); mmput(umem->mm); kfree(umem); diff --git a/drivers/infiniband/hw/ipath/ipath_user_pages.c b/drivers/infiniband/hw/ipath/ipath_user_pages.c index cfed5399f074..dc66c4506916 100644 --- a/drivers/infiniband/hw/ipath/ipath_user_pages.c +++ b/drivers/infiniband/hw/ipath/ipath_user_pages.c @@ -79,7 +79,7 @@ static int __ipath_get_user_pages(unsigned long start_page, size_t num_pages, goto bail_release; } - current->mm->locked_vm += num_pages; + current->mm->pinned_vm += num_pages; ret = 0; goto bail; @@ -178,7 +178,7 @@ void ipath_release_user_pages(struct page **p, size_t num_pages) __ipath_release_user_pages(p, num_pages, 1); - current->mm->locked_vm -= num_pages; + current->mm->pinned_vm -= num_pages; up_write(¤t->mm->mmap_sem); } @@ -195,7 +195,7 @@ static void user_pages_account(struct work_struct *_work) container_of(_work, struct ipath_user_pages_work, work); down_write(&work->mm->mmap_sem); - work->mm->locked_vm -= work->num_pages; + work->mm->pinned_vm -= work->num_pages; up_write(&work->mm->mmap_sem); mmput(work->mm); kfree(work); diff --git a/drivers/infiniband/hw/qib/qib_user_pages.c b/drivers/infiniband/hw/qib/qib_user_pages.c index 7689e49c13c9..2bc1d2b96298 100644 --- a/drivers/infiniband/hw/qib/qib_user_pages.c +++ b/drivers/infiniband/hw/qib/qib_user_pages.c @@ -74,7 +74,7 @@ static int __qib_get_user_pages(unsigned long start_page, size_t num_pages, goto bail_release; } - current->mm->locked_vm += num_pages; + current->mm->pinned_vm += num_pages; ret = 0; goto bail; @@ -151,7 +151,7 @@ void qib_release_user_pages(struct page **p, size_t num_pages) __qib_release_user_pages(p, num_pages, 1); if (current->mm) { - current->mm->locked_vm -= num_pages; + current->mm->pinned_vm -= num_pages; up_write(¤t->mm->mmap_sem); } } diff --git a/drivers/iommu/dmar.c b/drivers/iommu/dmar.c index 3dc9befa5aec..f0db877d71e6 100644 --- a/drivers/iommu/dmar.c +++ b/drivers/iommu/dmar.c @@ -263,7 +263,7 @@ rmrr_parse_dev(struct dmar_rmrr_unit *rmrru) return ret; } -static LIST_HEAD(dmar_atsr_units); +LIST_HEAD(dmar_atsr_units); static int __init dmar_parse_one_atsr(struct acpi_dmar_header *hdr) { @@ -277,6 +277,7 @@ static int __init dmar_parse_one_atsr(struct acpi_dmar_header *hdr) atsru->hdr = hdr; atsru->include_all = atsr->flags & 0x1; + atsru->segment = atsr->segment; list_add(&atsru->list, &dmar_atsr_units); @@ -308,14 +309,12 @@ int 
dmar_find_matched_atsr_unit(struct pci_dev *dev) { int i; struct pci_bus *bus; - struct acpi_dmar_atsr *atsr; struct dmar_atsr_unit *atsru; dev = pci_physfn(dev); list_for_each_entry(atsru, &dmar_atsr_units, list) { - atsr = container_of(atsru->hdr, struct acpi_dmar_atsr, header); - if (atsr->segment == pci_domain_nr(dev->bus)) + if (atsru->segment == pci_domain_nr(dev->bus)) goto found; } diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c index c621c98c99da..1d85ed6615ab 100644 --- a/drivers/iommu/intel-iommu.c +++ b/drivers/iommu/intel-iommu.c @@ -633,6 +633,157 @@ static struct intel_iommu *device_to_iommu(int segment, u8 bus, u8 devfn) return NULL; } +#ifdef CONFIG_HOTPLUG +struct dev_dmaru { + struct list_head list; + void *dmaru; + int index; + int segment; + unsigned char bus; + unsigned int devfn; +}; + +static int +save_dev_dmaru(int segment, unsigned char bus, unsigned int devfn, + void *dmaru, int index, struct list_head *lh) +{ + struct dev_dmaru *m; + + m = kzalloc(sizeof(*m), GFP_KERNEL); + if (!m) + return -ENOMEM; + + m->segment = segment; + m->bus = bus; + m->devfn = devfn; + m->dmaru = dmaru; + m->index = index; + + list_add(&m->list, lh); + + return 0; +} + +static void +*get_dev_dmaru(int segment, unsigned char bus, unsigned int devfn, + int *index, struct list_head *lh) +{ + struct dev_dmaru *m; + void *dmaru = NULL; + + list_for_each_entry(m, lh, list) { + if (m->segment == segment && + m->bus == bus && m->devfn == devfn) { + *index = m->index; + dmaru = m->dmaru; + list_del(&m->list); + kfree(m); + break; + } + } + + return dmaru; +} + +static LIST_HEAD(saved_dev_drhd_list); + +static void remove_dev_from_drhd(struct pci_dev *dev) +{ + struct dmar_drhd_unit *drhd = NULL; + int segment = pci_domain_nr(dev->bus); + int i; + + for_each_drhd_unit(drhd) { + if (drhd->ignored) + continue; + if (segment != drhd->segment) + continue; + + for (i = 0; i < drhd->devices_cnt; i++) { + if (drhd->devices[i] == dev) { + /* save it at first if it is in drhd */ + save_dev_dmaru(segment, dev->bus->number, + dev->devfn, drhd, i, + &saved_dev_drhd_list); + /* always remove it */ + drhd->devices[i] = NULL; + return; + } + } + } +} + +static void restore_dev_to_drhd(struct pci_dev *dev) +{ + struct dmar_drhd_unit *drhd = NULL; + int i; + + /* find the stored drhd */ + drhd = get_dev_dmaru(pci_domain_nr(dev->bus), dev->bus->number, + dev->devfn, &i, &saved_dev_drhd_list); + /* restore that into drhd */ + if (drhd) + drhd->devices[i] = dev; +} +#else +static void remove_dev_from_drhd(struct pci_dev *dev) +{ +} + +static void restore_dev_to_drhd(struct pci_dev *dev) +{ +} +#endif + +#if defined(CONFIG_DMAR) && defined(CONFIG_HOTPLUG) +static LIST_HEAD(saved_dev_atsr_list); + +static void remove_dev_from_atsr(struct pci_dev *dev) +{ + struct dmar_atsr_unit *atsr = NULL; + int segment = pci_domain_nr(dev->bus); + int i; + + for_each_atsr_unit(atsr) { + if (segment != atsr->segment) + continue; + + for (i = 0; i < atsr->devices_cnt; i++) { + if (atsr->devices[i] == dev) { + /* save it at first if it is in drhd */ + save_dev_dmaru(segment, dev->bus->number, + dev->devfn, atsr, i, + &saved_dev_atsr_list); + /* always remove it */ + atsr->devices[i] = NULL; + return; + } + } + } +} + +static void restore_dev_to_atsr(struct pci_dev *dev) +{ + struct dmar_atsr_unit *atsr = NULL; + int i; + + /* find the stored atsr */ + atsr = get_dev_dmaru(pci_domain_nr(dev->bus), dev->bus->number, + dev->devfn, &i, &saved_dev_atsr_list); + /* restore that into atsr */ + if (atsr) + 
atsr->devices[i] = dev; +} +#else +static void remove_dev_from_atsr(struct pci_dev *dev) +{ +} + +static void restore_dev_to_atsr(struct pci_dev *dev) +{ +} +#endif + static void domain_flush_cache(struct dmar_domain *domain, void *addr, int size) { @@ -3403,20 +3554,36 @@ static int device_notifier(struct notifier_block *nb, struct pci_dev *pdev = to_pci_dev(dev); struct dmar_domain *domain; - if (iommu_no_mapping(dev)) + if (unlikely(dev->bus != &pci_bus_type)) return 0; - domain = find_domain(pdev); - if (!domain) - return 0; + switch (action) { + case BUS_NOTIFY_UNBOUND_DRIVER: + if (iommu_no_mapping(dev)) + goto out; + + if (iommu_pass_through) + goto out; + + domain = find_domain(pdev); + if (!domain) + goto out; - if (action == BUS_NOTIFY_UNBOUND_DRIVER && !iommu_pass_through) { domain_remove_one_dev_info(domain, pdev); if (!(domain->flags & DOMAIN_FLAG_VIRTUAL_MACHINE) && !(domain->flags & DOMAIN_FLAG_STATIC_IDENTITY) && list_empty(&domain->devices)) domain_exit(domain); +out: + remove_dev_from_drhd(pdev); + remove_dev_from_atsr(pdev); + + break; + case BUS_NOTIFY_ADD_DEVICE: + restore_dev_to_drhd(pdev); + restore_dev_to_atsr(pdev); + break; } return 0; diff --git a/drivers/leds/Kconfig b/drivers/leds/Kconfig index b591e726a6fa..b3e767414f77 100644 --- a/drivers/leds/Kconfig +++ b/drivers/leds/Kconfig @@ -113,14 +113,6 @@ config LEDS_WRAP help This option enables support for the PCEngines WRAP programmable LEDs. -config LEDS_ALIX2 - tristate "LED Support for ALIX.2 and ALIX.3 series" - depends on LEDS_CLASS - depends on X86 && !GPIO_CS5535 && !CS5535_GPIO - help - This option enables support for the PCEngines ALIX.2 and ALIX.3 LEDs. - You have to set leds-alix2.force=1 for boards with Award BIOS. - config LEDS_COBALT_QUBE tristate "LED Support for the Cobalt Qube series front LED" depends on LEDS_CLASS @@ -383,6 +375,18 @@ config LEDS_ASIC3 cannot be used. This driver supports hardware blinking with an on+off period from 62ms to 125s. Say Y to enable LEDs on the HP iPAQ hx4700. +config LEDS_RENESAS_TPU + bool "LED support for Renesas TPU" + depends on LEDS_CLASS && HAVE_CLK && GENERIC_GPIO + help + This option enables build of the LED TPU platform driver, + suitable to drive any TPU channel on newer Renesas SoCs. + The driver controls the GPIO pin connected to the LED via + the GPIO framework and expects the LED to be connected to + a pin that can be driven in both GPIO mode and using TPU + pin function. The latter to support brightness control. + Brightness control is supported but hardware blinking is not. 
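The LEDS_RENESAS_TPU help text above describes a driver (added as drivers/leds/leds-renesas-tpu.c later in this diff) that is configured entirely through platform data plus one MMIO resource. As a rough, hypothetical board-file sketch — the GPIO numbers, register addresses, channel offset, timer bit and device id below are invented placeholders; only the led_renesas_tpu_config field names, the memory resource and the "leds-renesas-tpu" device name come from the driver code in this merge:

    #include <linux/kernel.h>
    #include <linux/ioport.h>
    #include <linux/platform_device.h>
    #include <linux/leds-renesas-tpu.h>

    static struct resource tpu12_led_resources[] = {
            [0] = {
                    .start  = 0xe6610090,   /* placeholder: this channel's registers */
                    .end    = 0xe66100b7,
                    .flags  = IORESOURCE_MEM,
            },
    };

    static struct led_renesas_tpu_config tpu12_led_pdata = {
            .name           = "board:green:status",
            .pin_gpio_fn    = 55,   /* placeholder: pin muxed to its TPU function */
            .pin_gpio       = 155,  /* placeholder: same pin as a plain GPIO */
            .channel_offset = 0x90, /* mapbase - channel_offset must hit the shared TSTR */
            .timer_bit      = 2,    /* start/stop bit for this channel in TSTR */
            .max_brightness = 1000,
            .refresh_rate   = 100,  /* PWM refresh in Hz; 0 falls back to 100 in probe */
    };

    static struct platform_device tpu12_led_device = {
            .name           = "leds-renesas-tpu",
            .id             = 12,
            .dev            = {
                    .platform_data  = &tpu12_led_pdata,
            },
            .resource       = tpu12_led_resources,
            .num_resources  = ARRAY_SIZE(tpu12_led_resources),
    };

The driver flips the pin between plain GPIO mode (for full-on and full-off) and TPU function mode (for PWM dimming), which is why both pin_gpio and pin_gpio_fn have to be supplied for the same physical pin.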
+ config LEDS_TRIGGERS bool "LED Trigger support" depends on LEDS_CLASS diff --git a/drivers/leds/Makefile b/drivers/leds/Makefile index bbfd2e367dc0..e4f6bf568880 100644 --- a/drivers/leds/Makefile +++ b/drivers/leds/Makefile @@ -16,7 +16,6 @@ obj-$(CONFIG_LEDS_AMS_DELTA) += leds-ams-delta.o obj-$(CONFIG_LEDS_NET48XX) += leds-net48xx.o obj-$(CONFIG_LEDS_NET5501) += leds-net5501.o obj-$(CONFIG_LEDS_WRAP) += leds-wrap.o -obj-$(CONFIG_LEDS_ALIX2) += leds-alix2.o obj-$(CONFIG_LEDS_COBALT_QUBE) += leds-cobalt-qube.o obj-$(CONFIG_LEDS_COBALT_RAQ) += leds-cobalt-raq.o obj-$(CONFIG_LEDS_SUNFIRE) += leds-sunfire.o @@ -43,6 +42,7 @@ obj-$(CONFIG_LEDS_MC13783) += leds-mc13783.o obj-$(CONFIG_LEDS_NS2) += leds-ns2.o obj-$(CONFIG_LEDS_NETXBIG) += leds-netxbig.o obj-$(CONFIG_LEDS_ASIC3) += leds-asic3.o +obj-$(CONFIG_LEDS_RENESAS_TPU) += leds-renesas-tpu.o # LED SPI Drivers obj-$(CONFIG_LEDS_DAC124S085) += leds-dac124s085.o diff --git a/drivers/leds/led-triggers.c b/drivers/leds/led-triggers.c index 4bebae733349..6f1ff93d7cec 100644 --- a/drivers/leds/led-triggers.c +++ b/drivers/leds/led-triggers.c @@ -261,9 +261,12 @@ void led_trigger_register_simple(const char *name, struct led_trigger **tp) if (trigger) { trigger->name = name; err = led_trigger_register(trigger); - if (err < 0) + if (err < 0) { + kfree(trigger); + trigger = NULL; printk(KERN_WARNING "LED trigger %s failed to register" " (%d)\n", name, err); + } } else printk(KERN_WARNING "LED trigger %s failed to register" " (no memory)\n", name); diff --git a/drivers/leds/leds-alix2.c b/drivers/leds/leds-alix2.c deleted file mode 100644 index f59ffadf5125..000000000000 --- a/drivers/leds/leds-alix2.c +++ /dev/null @@ -1,239 +0,0 @@ -/* - * LEDs driver for PCEngines ALIX.2 and ALIX.3 - * - * Copyright (C) 2008 Constantin Baranov <const@mimas.ru> - */ - -#include <linux/err.h> -#include <linux/io.h> -#include <linux/kernel.h> -#include <linux/leds.h> -#include <linux/module.h> -#include <linux/platform_device.h> -#include <linux/string.h> -#include <linux/pci.h> - -static int force = 0; -module_param(force, bool, 0444); -MODULE_PARM_DESC(force, "Assume system has ALIX.2/ALIX.3 style LEDs"); - -#define MSR_LBAR_GPIO 0x5140000C -#define CS5535_GPIO_SIZE 256 - -static u32 gpio_base; - -static struct pci_device_id divil_pci[] = { - { PCI_DEVICE(PCI_VENDOR_ID_NS, PCI_DEVICE_ID_NS_CS5535_ISA) }, - { PCI_DEVICE(PCI_VENDOR_ID_AMD, PCI_DEVICE_ID_AMD_CS5536_ISA) }, - { } /* NULL entry */ -}; -MODULE_DEVICE_TABLE(pci, divil_pci); - -struct alix_led { - struct led_classdev cdev; - unsigned short port; - unsigned int on_value; - unsigned int off_value; -}; - -static void alix_led_set(struct led_classdev *led_cdev, - enum led_brightness brightness) -{ - struct alix_led *led_dev = - container_of(led_cdev, struct alix_led, cdev); - - if (brightness) - outl(led_dev->on_value, gpio_base + led_dev->port); - else - outl(led_dev->off_value, gpio_base + led_dev->port); -} - -static struct alix_led alix_leds[] = { - { - .cdev = { - .name = "alix:1", - .brightness_set = alix_led_set, - }, - .port = 0x00, - .on_value = 1 << 22, - .off_value = 1 << 6, - }, - { - .cdev = { - .name = "alix:2", - .brightness_set = alix_led_set, - }, - .port = 0x80, - .on_value = 1 << 25, - .off_value = 1 << 9, - }, - { - .cdev = { - .name = "alix:3", - .brightness_set = alix_led_set, - }, - .port = 0x80, - .on_value = 1 << 27, - .off_value = 1 << 11, - }, -}; - -static int __init alix_led_probe(struct platform_device *pdev) -{ - int i; - int ret; - - for (i = 0; i < ARRAY_SIZE(alix_leds); i++) { 
- alix_leds[i].cdev.flags |= LED_CORE_SUSPENDRESUME; - ret = led_classdev_register(&pdev->dev, &alix_leds[i].cdev); - if (ret < 0) - goto fail; - } - return 0; - -fail: - while (--i >= 0) - led_classdev_unregister(&alix_leds[i].cdev); - return ret; -} - -static int alix_led_remove(struct platform_device *pdev) -{ - int i; - - for (i = 0; i < ARRAY_SIZE(alix_leds); i++) - led_classdev_unregister(&alix_leds[i].cdev); - return 0; -} - -static struct platform_driver alix_led_driver = { - .remove = alix_led_remove, - .driver = { - .name = KBUILD_MODNAME, - .owner = THIS_MODULE, - }, -}; - -static int __init alix_present(unsigned long bios_phys, - const char *alix_sig, - size_t alix_sig_len) -{ - const size_t bios_len = 0x00010000; - const char *bios_virt; - const char *scan_end; - const char *p; - char name[64]; - - if (force) { - printk(KERN_NOTICE "%s: forced to skip BIOS test, " - "assume system has ALIX.2 style LEDs\n", - KBUILD_MODNAME); - return 1; - } - - bios_virt = phys_to_virt(bios_phys); - scan_end = bios_virt + bios_len - (alix_sig_len + 2); - for (p = bios_virt; p < scan_end; p++) { - const char *tail; - char *a; - - if (memcmp(p, alix_sig, alix_sig_len) != 0) - continue; - - memcpy(name, p, sizeof(name)); - - /* remove the first \0 character from string */ - a = strchr(name, '\0'); - if (a) - *a = ' '; - - /* cut the string at a newline */ - a = strchr(name, '\r'); - if (a) - *a = '\0'; - - tail = p + alix_sig_len; - if ((tail[0] == '2' || tail[0] == '3')) { - printk(KERN_INFO - "%s: system is recognized as \"%s\"\n", - KBUILD_MODNAME, name); - return 1; - } - } - - return 0; -} - -static struct platform_device *pdev; - -static int __init alix_pci_led_init(void) -{ - u32 low, hi; - - if (pci_dev_present(divil_pci) == 0) { - printk(KERN_WARNING KBUILD_MODNAME": DIVIL not found\n"); - return -ENODEV; - } - - /* Grab the GPIO I/O range */ - rdmsr(MSR_LBAR_GPIO, low, hi); - - /* Check the mask and whether GPIO is enabled (sanity check) */ - if (hi != 0x0000f001) { - printk(KERN_WARNING KBUILD_MODNAME": GPIO not enabled\n"); - return -ENODEV; - } - - /* Mask off the IO base address */ - gpio_base = low & 0x0000ff00; - - if (!request_region(gpio_base, CS5535_GPIO_SIZE, KBUILD_MODNAME)) { - printk(KERN_ERR KBUILD_MODNAME": can't allocate I/O for GPIO\n"); - return -ENODEV; - } - - /* Set GPIO function to output */ - outl(1 << 6, gpio_base + 0x04); - outl(1 << 9, gpio_base + 0x84); - outl(1 << 11, gpio_base + 0x84); - - return 0; -} - -static int __init alix_led_init(void) -{ - int ret = -ENODEV; - const char tinybios_sig[] = "PC Engines ALIX."; - const char coreboot_sig[] = "PC Engines\0ALIX."; - - if (alix_present(0xf0000, tinybios_sig, sizeof(tinybios_sig) - 1) || - alix_present(0x500, coreboot_sig, sizeof(coreboot_sig) - 1)) - ret = alix_pci_led_init(); - - if (ret < 0) - return ret; - - pdev = platform_device_register_simple(KBUILD_MODNAME, -1, NULL, 0); - if (!IS_ERR(pdev)) { - ret = platform_driver_probe(&alix_led_driver, alix_led_probe); - if (ret) - platform_device_unregister(pdev); - } else - ret = PTR_ERR(pdev); - - return ret; -} - -static void __exit alix_led_exit(void) -{ - platform_device_unregister(pdev); - platform_driver_unregister(&alix_led_driver); - release_region(gpio_base, CS5535_GPIO_SIZE); -} - -module_init(alix_led_init); -module_exit(alix_led_exit); - -MODULE_AUTHOR("Constantin Baranov <const@mimas.ru>"); -MODULE_DESCRIPTION("PCEngines ALIX.2 and ALIX.3 LED driver"); -MODULE_LICENSE("GPL"); diff --git a/drivers/leds/leds-renesas-tpu.c 
b/drivers/leds/leds-renesas-tpu.c new file mode 100644 index 000000000000..d9a7d72a7569 --- /dev/null +++ b/drivers/leds/leds-renesas-tpu.c @@ -0,0 +1,344 @@ +/* + * LED control using Renesas TPU + * + * Copyright (C) 2011 Magnus Damm + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License as published by + * the Free Software Foundation; either version 2 of the License + * + * This program is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + * + * You should have received a copy of the GNU General Public License + * along with this program; if not, write to the Free Software + * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA + */ + +#include <linux/module.h> +#include <linux/init.h> +#include <linux/platform_device.h> +#include <linux/spinlock.h> +#include <linux/printk.h> +#include <linux/ioport.h> +#include <linux/io.h> +#include <linux/clk.h> +#include <linux/leds.h> +#include <linux/leds-renesas-tpu.h> +#include <linux/gpio.h> +#include <linux/err.h> +#include <linux/slab.h> +#include <linux/pm_runtime.h> + +enum r_tpu_pin { R_TPU_PIN_UNUSED, R_TPU_PIN_GPIO, R_TPU_PIN_GPIO_FN }; +enum r_tpu_timer { R_TPU_TIMER_UNUSED, R_TPU_TIMER_ON }; + +struct r_tpu_priv { + struct led_classdev ldev; + void __iomem *mapbase; + struct clk *clk; + struct platform_device *pdev; + enum r_tpu_pin pin_state; + enum r_tpu_timer timer_state; + unsigned long min_rate; + unsigned int refresh_rate; +}; + +static DEFINE_SPINLOCK(r_tpu_lock); + +#define TSTR -1 /* Timer start register (shared register) */ +#define TCR 0 /* Timer control register (+0x00) */ +#define TMDR 1 /* Timer mode register (+0x04) */ +#define TIOR 2 /* Timer I/O control register (+0x08) */ +#define TIER 3 /* Timer interrupt enable register (+0x0c) */ +#define TSR 4 /* Timer status register (+0x10) */ +#define TCNT 5 /* Timer counter (+0x14) */ +#define TGRA 6 /* Timer general register A (+0x18) */ +#define TGRB 7 /* Timer general register B (+0x1c) */ +#define TGRC 8 /* Timer general register C (+0x20) */ +#define TGRD 9 /* Timer general register D (+0x24) */ + +static inline unsigned short r_tpu_read(struct r_tpu_priv *p, int reg_nr) +{ + struct led_renesas_tpu_config *cfg = p->pdev->dev.platform_data; + void __iomem *base = p->mapbase; + unsigned long offs = reg_nr << 2; + + if (reg_nr == TSTR) + return ioread16(base - cfg->channel_offset); + + return ioread16(base + offs); +} + +static inline void r_tpu_write(struct r_tpu_priv *p, int reg_nr, + unsigned short value) +{ + struct led_renesas_tpu_config *cfg = p->pdev->dev.platform_data; + void __iomem *base = p->mapbase; + unsigned long offs = reg_nr << 2; + + if (reg_nr == TSTR) { + iowrite16(value, base - cfg->channel_offset); + return; + } + + iowrite16(value, base + offs); +} + +static void r_tpu_start_stop_ch(struct r_tpu_priv *p, int start) +{ + struct led_renesas_tpu_config *cfg = p->pdev->dev.platform_data; + unsigned long flags, value; + + /* start stop register shared by multiple timer channels */ + spin_lock_irqsave(&r_tpu_lock, flags); + value = r_tpu_read(p, TSTR); + + if (start) + value |= 1 << cfg->timer_bit; + else + value &= ~(1 << cfg->timer_bit); + + r_tpu_write(p, TSTR, value); + spin_unlock_irqrestore(&r_tpu_lock, flags); +} + +static int r_tpu_enable(struct r_tpu_priv *p, enum 
led_brightness brightness) +{ + struct led_renesas_tpu_config *cfg = p->pdev->dev.platform_data; + int prescaler[] = { 1, 4, 16, 64 }; + int k, ret; + unsigned long rate, tmp; + + if (p->timer_state == R_TPU_TIMER_ON) + return 0; + + /* wake up device and enable clock */ + pm_runtime_get_sync(&p->pdev->dev); + ret = clk_enable(p->clk); + if (ret) { + dev_err(&p->pdev->dev, "cannot enable clock\n"); + return ret; + } + + /* make sure channel is disabled */ + r_tpu_start_stop_ch(p, 0); + + /* get clock rate after enabling it */ + rate = clk_get_rate(p->clk); + + /* pick the lowest acceptable rate */ + for (k = 0; k < ARRAY_SIZE(prescaler); k++) + if ((rate / prescaler[k]) < p->min_rate) + break; + + if (!k) { + dev_err(&p->pdev->dev, "clock rate mismatch\n"); + goto err0; + } + dev_dbg(&p->pdev->dev, "rate = %lu, prescaler %u\n", + rate, prescaler[k - 1]); + + /* clear TCNT on TGRB match, count on rising edge, set prescaler */ + r_tpu_write(p, TCR, 0x0040 | (k - 1)); + + /* output 0 until TGRA, output 1 until TGRB */ + r_tpu_write(p, TIOR, 0x0002); + + rate /= prescaler[k - 1] * p->refresh_rate; + r_tpu_write(p, TGRB, rate); + dev_dbg(&p->pdev->dev, "TRGB = 0x%04lx\n", rate); + + tmp = (cfg->max_brightness - brightness) * rate; + r_tpu_write(p, TGRA, tmp / cfg->max_brightness); + dev_dbg(&p->pdev->dev, "TRGA = 0x%04lx\n", tmp / cfg->max_brightness); + + /* PWM mode */ + r_tpu_write(p, TMDR, 0x0002); + + /* enable channel */ + r_tpu_start_stop_ch(p, 1); + + p->timer_state = R_TPU_TIMER_ON; + return 0; + err0: + clk_disable(p->clk); + pm_runtime_put_sync(&p->pdev->dev); + return -ENOTSUPP; +} + +static void r_tpu_disable(struct r_tpu_priv *p) +{ + if (p->timer_state == R_TPU_TIMER_UNUSED) + return; + + /* disable channel */ + r_tpu_start_stop_ch(p, 0); + + /* stop clock and mark device as idle */ + clk_disable(p->clk); + pm_runtime_put_sync(&p->pdev->dev); + + p->timer_state = R_TPU_TIMER_UNUSED; +} + +static void r_tpu_set_pin(struct r_tpu_priv *p, enum r_tpu_pin new_state, + enum led_brightness brightness) +{ + struct led_renesas_tpu_config *cfg = p->pdev->dev.platform_data; + + if (p->pin_state == new_state) { + if (p->pin_state == R_TPU_PIN_GPIO) + gpio_set_value(cfg->pin_gpio, brightness); + return; + } + + if (p->pin_state == R_TPU_PIN_GPIO) + gpio_free(cfg->pin_gpio); + + if (p->pin_state == R_TPU_PIN_GPIO_FN) + gpio_free(cfg->pin_gpio_fn); + + if (new_state == R_TPU_PIN_GPIO) { + gpio_request(cfg->pin_gpio, cfg->name); + gpio_direction_output(cfg->pin_gpio, !!brightness); + } + if (new_state == R_TPU_PIN_GPIO_FN) + gpio_request(cfg->pin_gpio_fn, cfg->name); + + p->pin_state = new_state; +} + +static void r_tpu_set_brightness(struct led_classdev *ldev, + enum led_brightness brightness) +{ + struct r_tpu_priv *p = container_of(ldev, struct r_tpu_priv, ldev); + + r_tpu_disable(p); + + /* off and maximum are handled as GPIO pins, in between PWM */ + if ((brightness == 0) || (brightness == ldev->max_brightness)) + r_tpu_set_pin(p, R_TPU_PIN_GPIO, brightness); + else { + r_tpu_set_pin(p, R_TPU_PIN_GPIO_FN, 0); + r_tpu_enable(p, brightness); + } +} + +static int __devinit r_tpu_probe(struct platform_device *pdev) +{ + struct led_renesas_tpu_config *cfg = pdev->dev.platform_data; + struct r_tpu_priv *p; + struct resource *res; + int ret = -ENXIO; + + if (!cfg) { + dev_err(&pdev->dev, "missing platform data\n"); + goto err0; + } + + p = kzalloc(sizeof(*p), GFP_KERNEL); + if (p == NULL) { + dev_err(&pdev->dev, "failed to allocate driver data\n"); + ret = -ENOMEM; + goto err0; + } + + res = 
platform_get_resource(pdev, IORESOURCE_MEM, 0); + if (!res) { + dev_err(&pdev->dev, "failed to get I/O memory\n"); + goto err1; + } + + /* map memory, let mapbase point to our channel */ + p->mapbase = ioremap_nocache(res->start, resource_size(res)); + if (p->mapbase == NULL) { + dev_err(&pdev->dev, "failed to remap I/O memory\n"); + goto err1; + } + + /* get hold of clock */ + p->clk = clk_get(&pdev->dev, NULL); + if (IS_ERR(p->clk)) { + dev_err(&pdev->dev, "cannot get clock\n"); + ret = PTR_ERR(p->clk); + goto err2; + } + + p->pdev = pdev; + p->pin_state = R_TPU_PIN_UNUSED; + p->timer_state = R_TPU_TIMER_UNUSED; + p->refresh_rate = cfg->refresh_rate ? cfg->refresh_rate : 100; + r_tpu_set_pin(p, R_TPU_PIN_GPIO, LED_OFF); + platform_set_drvdata(pdev, p); + + + p->ldev.name = cfg->name; + p->ldev.brightness = LED_OFF; + p->ldev.max_brightness = cfg->max_brightness; + p->ldev.brightness_set = r_tpu_set_brightness; + p->ldev.flags |= LED_CORE_SUSPENDRESUME; + ret = led_classdev_register(&pdev->dev, &p->ldev); + if (ret < 0) + goto err3; + + /* max_brightness may be updated by the LED core code */ + p->min_rate = p->ldev.max_brightness * p->refresh_rate; + + pm_runtime_enable(&pdev->dev); + return 0; + + err3: + r_tpu_set_pin(p, R_TPU_PIN_UNUSED, LED_OFF); + clk_put(p->clk); + err2: + iounmap(p->mapbase); + err1: + kfree(p); + err0: + return ret; +} + +static int __devexit r_tpu_remove(struct platform_device *pdev) +{ + struct r_tpu_priv *p = platform_get_drvdata(pdev); + + r_tpu_set_brightness(&p->ldev, LED_OFF); + led_classdev_unregister(&p->ldev); + r_tpu_disable(p); + r_tpu_set_pin(p, R_TPU_PIN_UNUSED, LED_OFF); + + pm_runtime_disable(&pdev->dev); + clk_put(p->clk); + + iounmap(p->mapbase); + kfree(p); + return 0; +} + +static struct platform_driver r_tpu_device_driver = { + .probe = r_tpu_probe, + .remove = __devexit_p(r_tpu_remove), + .driver = { + .name = "leds-renesas-tpu", + } +}; + +static int __init r_tpu_init(void) +{ + return platform_driver_register(&r_tpu_device_driver); +} + +static void __exit r_tpu_exit(void) +{ + platform_driver_unregister(&r_tpu_device_driver); +} + +module_init(r_tpu_init); +module_exit(r_tpu_exit); + +MODULE_AUTHOR("Magnus Damm"); +MODULE_DESCRIPTION("Renesas TPU LED Driver"); +MODULE_LICENSE("GPL v2"); diff --git a/drivers/misc/lis3lv02d/lis3lv02d.c b/drivers/misc/lis3lv02d/lis3lv02d.c index b928bc14e97b..e67dceab552b 100644 --- a/drivers/misc/lis3lv02d/lis3lv02d.c +++ b/drivers/misc/lis3lv02d/lis3lv02d.c @@ -163,7 +163,7 @@ static void lis3lv02d_get_xyz(struct lis3lv02d *lis3, int *x, int *y, int *z) int i; if (lis3->blkread) { - if (lis3_dev.whoami == WAI_12B) { + if (lis3->whoami == WAI_12B) { u16 data[3]; lis3->blkread(lis3, OUTX_L, 6, (u8 *)data); for (i = 0; i < 3; i++) @@ -195,18 +195,30 @@ static int lis3_8_rates[2] = {100, 400}; static int lis3_3dc_rates[16] = {0, 1, 10, 25, 50, 100, 200, 400, 1600, 5000}; /* ODR is Output Data Rate */ -static int lis3lv02d_get_odr(void) +static int lis3lv02d_get_odr(struct lis3lv02d *lis3) { u8 ctrl; int shift; - lis3_dev.read(&lis3_dev, CTRL_REG1, &ctrl); - ctrl &= lis3_dev.odr_mask; - shift = ffs(lis3_dev.odr_mask) - 1; - return lis3_dev.odrs[(ctrl >> shift)]; + lis3->read(&lis3, CTRL_REG1, &ctrl); + ctrl &= lis3->odr_mask; + shift = ffs(lis3->odr_mask) - 1; + return lis3->odrs[(ctrl >> shift)]; } -static int lis3lv02d_set_odr(int rate) +static int lis3lv02d_get_pwron_wait(struct lis3lv02d *lis3) +{ + int div = lis3lv02d_get_odr(lis3); + + if (WARN_ONCE(div == 0, "device returned spurious data")) + return 
-ENXIO; + + /* LIS3 power on delay is quite long */ + msleep(lis3->pwron_delay / div); + return 0; +} + +static int lis3lv02d_set_odr(struct lis3lv02d *lis3, int rate) { u8 ctrl; int i, len, shift; @@ -214,14 +226,14 @@ static int lis3lv02d_set_odr(int rate) if (!rate) return -EINVAL; - lis3_dev.read(&lis3_dev, CTRL_REG1, &ctrl); - ctrl &= ~lis3_dev.odr_mask; - len = 1 << hweight_long(lis3_dev.odr_mask); /* # of possible values */ - shift = ffs(lis3_dev.odr_mask) - 1; + lis3->read(lis3, CTRL_REG1, &ctrl); + ctrl &= ~lis3->odr_mask; + len = 1 << hweight_long(lis3->odr_mask); /* # of possible values */ + shift = ffs(lis3->odr_mask) - 1; for (i = 0; i < len; i++) - if (lis3_dev.odrs[i] == rate) { - lis3_dev.write(&lis3_dev, CTRL_REG1, + if (lis3->odrs[i] == rate) { + lis3->write(lis3, CTRL_REG1, ctrl | (i << shift)); return 0; } @@ -240,12 +252,12 @@ static int lis3lv02d_selftest(struct lis3lv02d *lis3, s16 results[3]) mutex_lock(&lis3->mutex); irq_cfg = lis3->irq_cfg; - if (lis3_dev.whoami == WAI_8B) { + if (lis3->whoami == WAI_8B) { lis3->data_ready_count[IRQ_LINE0] = 0; lis3->data_ready_count[IRQ_LINE1] = 0; /* Change interrupt cfg to data ready for selftest */ - atomic_inc(&lis3_dev.wake_thread); + atomic_inc(&lis3->wake_thread); lis3->irq_cfg = LIS3_IRQ1_DATA_READY | LIS3_IRQ2_DATA_READY; lis3->read(lis3, CTRL_REG3, &ctrl_reg_data); lis3->write(lis3, CTRL_REG3, (ctrl_reg_data & @@ -253,12 +265,12 @@ static int lis3lv02d_selftest(struct lis3lv02d *lis3, s16 results[3]) (LIS3_IRQ1_DATA_READY | LIS3_IRQ2_DATA_READY)); } - if (lis3_dev.whoami == WAI_3DC) { + if (lis3->whoami == WAI_3DC) { ctlreg = CTRL_REG4; selftest = CTRL4_ST0; } else { ctlreg = CTRL_REG1; - if (lis3_dev.whoami == WAI_12B) + if (lis3->whoami == WAI_12B) selftest = CTRL1_ST; else selftest = CTRL1_STP; @@ -266,7 +278,9 @@ static int lis3lv02d_selftest(struct lis3lv02d *lis3, s16 results[3]) lis3->read(lis3, ctlreg, ®); lis3->write(lis3, ctlreg, (reg | selftest)); - msleep(lis3->pwron_delay / lis3lv02d_get_odr()); + ret = lis3lv02d_get_pwron_wait(lis3); + if (ret) + goto fail; /* Read directly to avoid axis remap */ x = lis3->read_data(lis3, OUTX); @@ -275,7 +289,9 @@ static int lis3lv02d_selftest(struct lis3lv02d *lis3, s16 results[3]) /* back to normal settings */ lis3->write(lis3, ctlreg, reg); - msleep(lis3->pwron_delay / lis3lv02d_get_odr()); + ret = lis3lv02d_get_pwron_wait(lis3); + if (ret) + goto fail; results[0] = x - lis3->read_data(lis3, OUTX); results[1] = y - lis3->read_data(lis3, OUTY); @@ -283,9 +299,9 @@ static int lis3lv02d_selftest(struct lis3lv02d *lis3, s16 results[3]) ret = 0; - if (lis3_dev.whoami == WAI_8B) { + if (lis3->whoami == WAI_8B) { /* Restore original interrupt configuration */ - atomic_dec(&lis3_dev.wake_thread); + atomic_dec(&lis3->wake_thread); lis3->write(lis3, CTRL_REG3, ctrl_reg_data); lis3->irq_cfg = irq_cfg; @@ -363,8 +379,9 @@ void lis3lv02d_poweroff(struct lis3lv02d *lis3) } EXPORT_SYMBOL_GPL(lis3lv02d_poweroff); -void lis3lv02d_poweron(struct lis3lv02d *lis3) +int lis3lv02d_poweron(struct lis3lv02d *lis3) { + int err; u8 reg; lis3->init(lis3); @@ -382,35 +399,41 @@ void lis3lv02d_poweron(struct lis3lv02d *lis3) reg |= CTRL2_BOOT_8B; lis3->write(lis3, CTRL_REG2, reg); - /* LIS3 power on delay is quite long */ - msleep(lis3->pwron_delay / lis3lv02d_get_odr()); + err = lis3lv02d_get_pwron_wait(lis3); + if (err) + return err; if (lis3->reg_ctrl) lis3_context_restore(lis3); + + return 0; } EXPORT_SYMBOL_GPL(lis3lv02d_poweron); static void lis3lv02d_joystick_poll(struct input_polled_dev 
*pidev) { + struct lis3lv02d *lis3 = pidev->private; int x, y, z; - mutex_lock(&lis3_dev.mutex); - lis3lv02d_get_xyz(&lis3_dev, &x, &y, &z); + mutex_lock(&lis3->mutex); + lis3lv02d_get_xyz(lis3, &x, &y, &z); input_report_abs(pidev->input, ABS_X, x); input_report_abs(pidev->input, ABS_Y, y); input_report_abs(pidev->input, ABS_Z, z); input_sync(pidev->input); - mutex_unlock(&lis3_dev.mutex); + mutex_unlock(&lis3->mutex); } static void lis3lv02d_joystick_open(struct input_polled_dev *pidev) { - if (lis3_dev.pm_dev) - pm_runtime_get_sync(lis3_dev.pm_dev); + struct lis3lv02d *lis3 = pidev->private; - if (lis3_dev.pdata && lis3_dev.whoami == WAI_8B && lis3_dev.idev) - atomic_set(&lis3_dev.wake_thread, 1); + if (lis3->pm_dev) + pm_runtime_get_sync(lis3->pm_dev); + + if (lis3->pdata && lis3->whoami == WAI_8B && lis3->idev) + atomic_set(&lis3->wake_thread, 1); /* * Update coordinates for the case where poll interval is 0 and * the chip in running purely under interrupt control @@ -420,14 +443,18 @@ static void lis3lv02d_joystick_open(struct input_polled_dev *pidev) static void lis3lv02d_joystick_close(struct input_polled_dev *pidev) { - atomic_set(&lis3_dev.wake_thread, 0); - if (lis3_dev.pm_dev) - pm_runtime_put(lis3_dev.pm_dev); + struct lis3lv02d *lis3 = pidev->private; + + atomic_set(&lis3->wake_thread, 0); + if (lis3->pm_dev) + pm_runtime_put(lis3->pm_dev); } -static irqreturn_t lis302dl_interrupt(int irq, void *dummy) +static irqreturn_t lis302dl_interrupt(int irq, void *data) { - if (!test_bit(0, &lis3_dev.misc_opened)) + struct lis3lv02d *lis3 = data; + + if (!test_bit(0, &lis3->misc_opened)) goto out; /* @@ -435,12 +462,12 @@ static irqreturn_t lis302dl_interrupt(int irq, void *dummy) * the lid is closed. This leads to interrupts as soon as a little move * is done. 
*/ - atomic_inc(&lis3_dev.count); + atomic_inc(&lis3->count); - wake_up_interruptible(&lis3_dev.misc_wait); - kill_fasync(&lis3_dev.async_queue, SIGIO, POLL_IN); + wake_up_interruptible(&lis3->misc_wait); + kill_fasync(&lis3->async_queue, SIGIO, POLL_IN); out: - if (atomic_read(&lis3_dev.wake_thread)) + if (atomic_read(&lis3->wake_thread)) return IRQ_WAKE_THREAD; return IRQ_HANDLED; } @@ -512,28 +539,37 @@ static irqreturn_t lis302dl_interrupt_thread2_8b(int irq, void *data) static int lis3lv02d_misc_open(struct inode *inode, struct file *file) { - if (test_and_set_bit(0, &lis3_dev.misc_opened)) + struct lis3lv02d *lis3 = container_of(file->private_data, + struct lis3lv02d, miscdev); + + if (test_and_set_bit(0, &lis3->misc_opened)) return -EBUSY; /* already open */ - if (lis3_dev.pm_dev) - pm_runtime_get_sync(lis3_dev.pm_dev); + if (lis3->pm_dev) + pm_runtime_get_sync(lis3->pm_dev); - atomic_set(&lis3_dev.count, 0); + atomic_set(&lis3->count, 0); return 0; } static int lis3lv02d_misc_release(struct inode *inode, struct file *file) { - fasync_helper(-1, file, 0, &lis3_dev.async_queue); - clear_bit(0, &lis3_dev.misc_opened); /* release the device */ - if (lis3_dev.pm_dev) - pm_runtime_put(lis3_dev.pm_dev); + struct lis3lv02d *lis3 = container_of(file->private_data, + struct lis3lv02d, miscdev); + + fasync_helper(-1, file, 0, &lis3->async_queue); + clear_bit(0, &lis3->misc_opened); /* release the device */ + if (lis3->pm_dev) + pm_runtime_put(lis3->pm_dev); return 0; } static ssize_t lis3lv02d_misc_read(struct file *file, char __user *buf, size_t count, loff_t *pos) { + struct lis3lv02d *lis3 = container_of(file->private_data, + struct lis3lv02d, miscdev); + DECLARE_WAITQUEUE(wait, current); u32 data; unsigned char byte_data; @@ -542,10 +578,10 @@ static ssize_t lis3lv02d_misc_read(struct file *file, char __user *buf, if (count < 1) return -EINVAL; - add_wait_queue(&lis3_dev.misc_wait, &wait); + add_wait_queue(&lis3->misc_wait, &wait); while (true) { set_current_state(TASK_INTERRUPTIBLE); - data = atomic_xchg(&lis3_dev.count, 0); + data = atomic_xchg(&lis3->count, 0); if (data) break; @@ -575,22 +611,28 @@ static ssize_t lis3lv02d_misc_read(struct file *file, char __user *buf, out: __set_current_state(TASK_RUNNING); - remove_wait_queue(&lis3_dev.misc_wait, &wait); + remove_wait_queue(&lis3->misc_wait, &wait); return retval; } static unsigned int lis3lv02d_misc_poll(struct file *file, poll_table *wait) { - poll_wait(file, &lis3_dev.misc_wait, wait); - if (atomic_read(&lis3_dev.count)) + struct lis3lv02d *lis3 = container_of(file->private_data, + struct lis3lv02d, miscdev); + + poll_wait(file, &lis3->misc_wait, wait); + if (atomic_read(&lis3->count)) return POLLIN | POLLRDNORM; return 0; } static int lis3lv02d_misc_fasync(int fd, struct file *file, int on) { - return fasync_helper(fd, file, on, &lis3_dev.async_queue); + struct lis3lv02d *lis3 = container_of(file->private_data, + struct lis3lv02d, miscdev); + + return fasync_helper(fd, file, on, &lis3->async_queue); } static const struct file_operations lis3lv02d_misc_fops = { @@ -603,85 +645,80 @@ static const struct file_operations lis3lv02d_misc_fops = { .fasync = lis3lv02d_misc_fasync, }; -static struct miscdevice lis3lv02d_misc_device = { - .minor = MISC_DYNAMIC_MINOR, - .name = "freefall", - .fops = &lis3lv02d_misc_fops, -}; - -int lis3lv02d_joystick_enable(void) +int lis3lv02d_joystick_enable(struct lis3lv02d *lis3) { struct input_dev *input_dev; int err; int max_val, fuzz, flat; int btns[] = {BTN_X, BTN_Y, BTN_Z}; - if (lis3_dev.idev) + 
if (lis3->idev) return -EINVAL; - lis3_dev.idev = input_allocate_polled_device(); - if (!lis3_dev.idev) + lis3->idev = input_allocate_polled_device(); + if (!lis3->idev) return -ENOMEM; - lis3_dev.idev->poll = lis3lv02d_joystick_poll; - lis3_dev.idev->open = lis3lv02d_joystick_open; - lis3_dev.idev->close = lis3lv02d_joystick_close; - lis3_dev.idev->poll_interval = MDPS_POLL_INTERVAL; - lis3_dev.idev->poll_interval_min = MDPS_POLL_MIN; - lis3_dev.idev->poll_interval_max = MDPS_POLL_MAX; - input_dev = lis3_dev.idev->input; + lis3->idev->poll = lis3lv02d_joystick_poll; + lis3->idev->open = lis3lv02d_joystick_open; + lis3->idev->close = lis3lv02d_joystick_close; + lis3->idev->poll_interval = MDPS_POLL_INTERVAL; + lis3->idev->poll_interval_min = MDPS_POLL_MIN; + lis3->idev->poll_interval_max = MDPS_POLL_MAX; + lis3->idev->private = lis3; + input_dev = lis3->idev->input; input_dev->name = "ST LIS3LV02DL Accelerometer"; input_dev->phys = DRIVER_NAME "/input0"; input_dev->id.bustype = BUS_HOST; input_dev->id.vendor = 0; - input_dev->dev.parent = &lis3_dev.pdev->dev; + input_dev->dev.parent = &lis3->pdev->dev; set_bit(EV_ABS, input_dev->evbit); - max_val = (lis3_dev.mdps_max_val * lis3_dev.scale) / LIS3_ACCURACY; - if (lis3_dev.whoami == WAI_12B) { + max_val = (lis3->mdps_max_val * lis3->scale) / LIS3_ACCURACY; + if (lis3->whoami == WAI_12B) { fuzz = LIS3_DEFAULT_FUZZ_12B; flat = LIS3_DEFAULT_FLAT_12B; } else { fuzz = LIS3_DEFAULT_FUZZ_8B; flat = LIS3_DEFAULT_FLAT_8B; } - fuzz = (fuzz * lis3_dev.scale) / LIS3_ACCURACY; - flat = (flat * lis3_dev.scale) / LIS3_ACCURACY; + fuzz = (fuzz * lis3->scale) / LIS3_ACCURACY; + flat = (flat * lis3->scale) / LIS3_ACCURACY; input_set_abs_params(input_dev, ABS_X, -max_val, max_val, fuzz, flat); input_set_abs_params(input_dev, ABS_Y, -max_val, max_val, fuzz, flat); input_set_abs_params(input_dev, ABS_Z, -max_val, max_val, fuzz, flat); - lis3_dev.mapped_btns[0] = lis3lv02d_get_axis(abs(lis3_dev.ac.x), btns); - lis3_dev.mapped_btns[1] = lis3lv02d_get_axis(abs(lis3_dev.ac.y), btns); - lis3_dev.mapped_btns[2] = lis3lv02d_get_axis(abs(lis3_dev.ac.z), btns); + lis3->mapped_btns[0] = lis3lv02d_get_axis(abs(lis3->ac.x), btns); + lis3->mapped_btns[1] = lis3lv02d_get_axis(abs(lis3->ac.y), btns); + lis3->mapped_btns[2] = lis3lv02d_get_axis(abs(lis3->ac.z), btns); - err = input_register_polled_device(lis3_dev.idev); + err = input_register_polled_device(lis3->idev); if (err) { - input_free_polled_device(lis3_dev.idev); - lis3_dev.idev = NULL; + input_free_polled_device(lis3->idev); + lis3->idev = NULL; } return err; } EXPORT_SYMBOL_GPL(lis3lv02d_joystick_enable); -void lis3lv02d_joystick_disable(void) +void lis3lv02d_joystick_disable(struct lis3lv02d *lis3) { - if (lis3_dev.irq) - free_irq(lis3_dev.irq, &lis3_dev); - if (lis3_dev.pdata && lis3_dev.pdata->irq2) - free_irq(lis3_dev.pdata->irq2, &lis3_dev); + if (lis3->irq) + free_irq(lis3->irq, lis3); + if (lis3->pdata && lis3->pdata->irq2) + free_irq(lis3->pdata->irq2, lis3); - if (!lis3_dev.idev) + if (!lis3->idev) return; - if (lis3_dev.irq) - misc_deregister(&lis3lv02d_misc_device); - input_unregister_polled_device(lis3_dev.idev); - input_free_polled_device(lis3_dev.idev); - lis3_dev.idev = NULL; + if (lis3->irq) + misc_deregister(&lis3->miscdev); + input_unregister_polled_device(lis3->idev); + input_free_polled_device(lis3->idev); + lis3->idev = NULL; } EXPORT_SYMBOL_GPL(lis3lv02d_joystick_disable); @@ -706,6 +743,7 @@ static void lis3lv02d_sysfs_poweron(struct lis3lv02d *lis3) static ssize_t lis3lv02d_selftest_show(struct 
device *dev, struct device_attribute *attr, char *buf) { + struct lis3lv02d *lis3 = dev_get_drvdata(dev); s16 values[3]; static const char ok[] = "OK"; @@ -713,8 +751,8 @@ static ssize_t lis3lv02d_selftest_show(struct device *dev, static const char irq[] = "FAIL_IRQ"; const char *res; - lis3lv02d_sysfs_poweron(&lis3_dev); - switch (lis3lv02d_selftest(&lis3_dev, values)) { + lis3lv02d_sysfs_poweron(lis3); + switch (lis3lv02d_selftest(lis3, values)) { case SELFTEST_FAIL: res = fail; break; @@ -733,33 +771,37 @@ static ssize_t lis3lv02d_selftest_show(struct device *dev, static ssize_t lis3lv02d_position_show(struct device *dev, struct device_attribute *attr, char *buf) { + struct lis3lv02d *lis3 = dev_get_drvdata(dev); int x, y, z; - lis3lv02d_sysfs_poweron(&lis3_dev); - mutex_lock(&lis3_dev.mutex); - lis3lv02d_get_xyz(&lis3_dev, &x, &y, &z); - mutex_unlock(&lis3_dev.mutex); + lis3lv02d_sysfs_poweron(lis3); + mutex_lock(&lis3->mutex); + lis3lv02d_get_xyz(lis3, &x, &y, &z); + mutex_unlock(&lis3->mutex); return sprintf(buf, "(%d,%d,%d)\n", x, y, z); } static ssize_t lis3lv02d_rate_show(struct device *dev, struct device_attribute *attr, char *buf) { - lis3lv02d_sysfs_poweron(&lis3_dev); - return sprintf(buf, "%d\n", lis3lv02d_get_odr()); + struct lis3lv02d *lis3 = dev_get_drvdata(dev); + + lis3lv02d_sysfs_poweron(lis3); + return sprintf(buf, "%d\n", lis3lv02d_get_odr(lis3)); } static ssize_t lis3lv02d_rate_set(struct device *dev, struct device_attribute *attr, const char *buf, size_t count) { + struct lis3lv02d *lis3 = dev_get_drvdata(dev); unsigned long rate; if (strict_strtoul(buf, 0, &rate)) return -EINVAL; - lis3lv02d_sysfs_poweron(&lis3_dev); - if (lis3lv02d_set_odr(rate)) + lis3lv02d_sysfs_poweron(lis3); + if (lis3lv02d_set_odr(lis3, rate)) return -EINVAL; return count; @@ -788,6 +830,7 @@ static int lis3lv02d_add_fs(struct lis3lv02d *lis3) if (IS_ERR(lis3->pdev)) return PTR_ERR(lis3->pdev); + platform_set_drvdata(lis3->pdev, lis3); return sysfs_create_group(&lis3->pdev->dev.kobj, &lis3lv02d_attribute_group); } @@ -801,7 +844,7 @@ int lis3lv02d_remove_fs(struct lis3lv02d *lis3) /* SYSFS may have left chip running. 
Turn off if necessary */ if (!pm_runtime_suspended(lis3->pm_dev)) - lis3lv02d_poweroff(&lis3_dev); + lis3lv02d_poweroff(lis3); pm_runtime_disable(lis3->pm_dev); pm_runtime_set_suspended(lis3->pm_dev); @@ -811,24 +854,24 @@ int lis3lv02d_remove_fs(struct lis3lv02d *lis3) } EXPORT_SYMBOL_GPL(lis3lv02d_remove_fs); -static void lis3lv02d_8b_configure(struct lis3lv02d *dev, +static void lis3lv02d_8b_configure(struct lis3lv02d *lis3, struct lis3lv02d_platform_data *p) { int err; int ctrl2 = p->hipass_ctrl; if (p->click_flags) { - dev->write(dev, CLICK_CFG, p->click_flags); - dev->write(dev, CLICK_TIMELIMIT, p->click_time_limit); - dev->write(dev, CLICK_LATENCY, p->click_latency); - dev->write(dev, CLICK_WINDOW, p->click_window); - dev->write(dev, CLICK_THSZ, p->click_thresh_z & 0xf); - dev->write(dev, CLICK_THSY_X, + lis3->write(lis3, CLICK_CFG, p->click_flags); + lis3->write(lis3, CLICK_TIMELIMIT, p->click_time_limit); + lis3->write(lis3, CLICK_LATENCY, p->click_latency); + lis3->write(lis3, CLICK_WINDOW, p->click_window); + lis3->write(lis3, CLICK_THSZ, p->click_thresh_z & 0xf); + lis3->write(lis3, CLICK_THSY_X, (p->click_thresh_x & 0xf) | (p->click_thresh_y << 4)); - if (dev->idev) { - struct input_dev *input_dev = lis3_dev.idev->input; + if (lis3->idev) { + struct input_dev *input_dev = lis3->idev->input; input_set_capability(input_dev, EV_KEY, BTN_X); input_set_capability(input_dev, EV_KEY, BTN_Y); input_set_capability(input_dev, EV_KEY, BTN_Z); @@ -836,22 +879,22 @@ static void lis3lv02d_8b_configure(struct lis3lv02d *dev, } if (p->wakeup_flags) { - dev->write(dev, FF_WU_CFG_1, p->wakeup_flags); - dev->write(dev, FF_WU_THS_1, p->wakeup_thresh & 0x7f); + lis3->write(lis3, FF_WU_CFG_1, p->wakeup_flags); + lis3->write(lis3, FF_WU_THS_1, p->wakeup_thresh & 0x7f); /* pdata value + 1 to keep this backward compatible*/ - dev->write(dev, FF_WU_DURATION_1, p->duration1 + 1); + lis3->write(lis3, FF_WU_DURATION_1, p->duration1 + 1); ctrl2 ^= HP_FF_WU1; /* Xor to keep compatible with old pdata*/ } if (p->wakeup_flags2) { - dev->write(dev, FF_WU_CFG_2, p->wakeup_flags2); - dev->write(dev, FF_WU_THS_2, p->wakeup_thresh2 & 0x7f); + lis3->write(lis3, FF_WU_CFG_2, p->wakeup_flags2); + lis3->write(lis3, FF_WU_THS_2, p->wakeup_thresh2 & 0x7f); /* pdata value + 1 to keep this backward compatible*/ - dev->write(dev, FF_WU_DURATION_2, p->duration2 + 1); + lis3->write(lis3, FF_WU_DURATION_2, p->duration2 + 1); ctrl2 ^= HP_FF_WU2; /* Xor to keep compatible with old pdata*/ } /* Configure hipass filters */ - dev->write(dev, CTRL_REG2, ctrl2); + lis3->write(lis3, CTRL_REG2, ctrl2); if (p->irq2) { err = request_threaded_irq(p->irq2, @@ -859,7 +902,7 @@ static void lis3lv02d_8b_configure(struct lis3lv02d *dev, lis302dl_interrupt_thread2_8b, IRQF_TRIGGER_RISING | IRQF_ONESHOT | (p->irq_flags2 & IRQF_TRIGGER_MASK), - DRIVER_NAME, &lis3_dev); + DRIVER_NAME, lis3); if (err < 0) pr_err("No second IRQ. Limited functionality\n"); } @@ -869,93 +912,97 @@ static void lis3lv02d_8b_configure(struct lis3lv02d *dev, * Initialise the accelerometer and the various subsystems. * Should be rather independent of the bus system. 
*/ -int lis3lv02d_init_device(struct lis3lv02d *dev) +int lis3lv02d_init_device(struct lis3lv02d *lis3) { int err; irq_handler_t thread_fn; int irq_flags = 0; - dev->whoami = lis3lv02d_read_8(dev, WHO_AM_I); + lis3->whoami = lis3lv02d_read_8(lis3, WHO_AM_I); - switch (dev->whoami) { + switch (lis3->whoami) { case WAI_12B: pr_info("12 bits sensor found\n"); - dev->read_data = lis3lv02d_read_12; - dev->mdps_max_val = 2048; - dev->pwron_delay = LIS3_PWRON_DELAY_WAI_12B; - dev->odrs = lis3_12_rates; - dev->odr_mask = CTRL1_DF0 | CTRL1_DF1; - dev->scale = LIS3_SENSITIVITY_12B; - dev->regs = lis3_wai12_regs; - dev->regs_size = ARRAY_SIZE(lis3_wai12_regs); + lis3->read_data = lis3lv02d_read_12; + lis3->mdps_max_val = 2048; + lis3->pwron_delay = LIS3_PWRON_DELAY_WAI_12B; + lis3->odrs = lis3_12_rates; + lis3->odr_mask = CTRL1_DF0 | CTRL1_DF1; + lis3->scale = LIS3_SENSITIVITY_12B; + lis3->regs = lis3_wai12_regs; + lis3->regs_size = ARRAY_SIZE(lis3_wai12_regs); break; case WAI_8B: pr_info("8 bits sensor found\n"); - dev->read_data = lis3lv02d_read_8; - dev->mdps_max_val = 128; - dev->pwron_delay = LIS3_PWRON_DELAY_WAI_8B; - dev->odrs = lis3_8_rates; - dev->odr_mask = CTRL1_DR; - dev->scale = LIS3_SENSITIVITY_8B; - dev->regs = lis3_wai8_regs; - dev->regs_size = ARRAY_SIZE(lis3_wai8_regs); + lis3->read_data = lis3lv02d_read_8; + lis3->mdps_max_val = 128; + lis3->pwron_delay = LIS3_PWRON_DELAY_WAI_8B; + lis3->odrs = lis3_8_rates; + lis3->odr_mask = CTRL1_DR; + lis3->scale = LIS3_SENSITIVITY_8B; + lis3->regs = lis3_wai8_regs; + lis3->regs_size = ARRAY_SIZE(lis3_wai8_regs); break; case WAI_3DC: pr_info("8 bits 3DC sensor found\n"); - dev->read_data = lis3lv02d_read_8; - dev->mdps_max_val = 128; - dev->pwron_delay = LIS3_PWRON_DELAY_WAI_8B; - dev->odrs = lis3_3dc_rates; - dev->odr_mask = CTRL1_ODR0|CTRL1_ODR1|CTRL1_ODR2|CTRL1_ODR3; - dev->scale = LIS3_SENSITIVITY_8B; + lis3->read_data = lis3lv02d_read_8; + lis3->mdps_max_val = 128; + lis3->pwron_delay = LIS3_PWRON_DELAY_WAI_8B; + lis3->odrs = lis3_3dc_rates; + lis3->odr_mask = CTRL1_ODR0|CTRL1_ODR1|CTRL1_ODR2|CTRL1_ODR3; + lis3->scale = LIS3_SENSITIVITY_8B; break; default: - pr_err("unknown sensor type 0x%X\n", dev->whoami); + pr_err("unknown sensor type 0x%X\n", lis3->whoami); return -EINVAL; } - dev->reg_cache = kzalloc(max(sizeof(lis3_wai8_regs), + lis3->reg_cache = kzalloc(max(sizeof(lis3_wai8_regs), sizeof(lis3_wai12_regs)), GFP_KERNEL); - if (dev->reg_cache == NULL) { + if (lis3->reg_cache == NULL) { printk(KERN_ERR DRIVER_NAME "out of memory\n"); return -ENOMEM; } - mutex_init(&dev->mutex); - atomic_set(&dev->wake_thread, 0); + mutex_init(&lis3->mutex); + atomic_set(&lis3->wake_thread, 0); - lis3lv02d_add_fs(dev); - lis3lv02d_poweron(dev); + lis3lv02d_add_fs(lis3); + err = lis3lv02d_poweron(lis3); + if (err) { + lis3lv02d_remove_fs(lis3); + return err; + } - if (dev->pm_dev) { - pm_runtime_set_active(dev->pm_dev); - pm_runtime_enable(dev->pm_dev); + if (lis3->pm_dev) { + pm_runtime_set_active(lis3->pm_dev); + pm_runtime_enable(lis3->pm_dev); } - if (lis3lv02d_joystick_enable()) + if (lis3lv02d_joystick_enable(lis3)) pr_err("joystick initialization failed\n"); /* passing in platform specific data is purely optional and only * used by the SPI transport layer at the moment */ - if (dev->pdata) { - struct lis3lv02d_platform_data *p = dev->pdata; + if (lis3->pdata) { + struct lis3lv02d_platform_data *p = lis3->pdata; - if (dev->whoami == WAI_8B) - lis3lv02d_8b_configure(dev, p); + if (lis3->whoami == WAI_8B) + lis3lv02d_8b_configure(lis3, p); irq_flags = 
p->irq_flags1 & IRQF_TRIGGER_MASK; - dev->irq_cfg = p->irq_cfg; + lis3->irq_cfg = p->irq_cfg; if (p->irq_cfg) - dev->write(dev, CTRL_REG3, p->irq_cfg); + lis3->write(lis3, CTRL_REG3, p->irq_cfg); if (p->default_rate) - lis3lv02d_set_odr(p->default_rate); + lis3lv02d_set_odr(lis3, p->default_rate); } /* bail if we did not get an IRQ from the bus layer */ - if (!dev->irq) { + if (!lis3->irq) { pr_debug("No IRQ. Disabling /dev/freefall\n"); goto out; } @@ -971,23 +1018,27 @@ int lis3lv02d_init_device(struct lis3lv02d *dev) * io-apic is not configurable (and generates a warning) but I keep it * in case of support for other hardware. */ - if (dev->pdata && dev->whoami == WAI_8B) + if (lis3->pdata && lis3->whoami == WAI_8B) thread_fn = lis302dl_interrupt_thread1_8b; else thread_fn = NULL; - err = request_threaded_irq(dev->irq, lis302dl_interrupt, + err = request_threaded_irq(lis3->irq, lis302dl_interrupt, thread_fn, IRQF_TRIGGER_RISING | IRQF_ONESHOT | irq_flags, - DRIVER_NAME, &lis3_dev); + DRIVER_NAME, lis3); if (err < 0) { pr_err("Cannot get IRQ\n"); goto out; } - if (misc_register(&lis3lv02d_misc_device)) + lis3->miscdev.minor = MISC_DYNAMIC_MINOR; + lis3->miscdev.name = "freefall"; + lis3->miscdev.fops = &lis3lv02d_misc_fops; + + if (misc_register(&lis3->miscdev)) pr_err("misc_register failed\n"); out: return 0; diff --git a/drivers/misc/lis3lv02d/lis3lv02d.h b/drivers/misc/lis3lv02d/lis3lv02d.h index a1939589eb2c..2b1482ad3f16 100644 --- a/drivers/misc/lis3lv02d/lis3lv02d.h +++ b/drivers/misc/lis3lv02d/lis3lv02d.h @@ -21,6 +21,7 @@ #include <linux/platform_device.h> #include <linux/input-polldev.h> #include <linux/regulator/consumer.h> +#include <linux/miscdevice.h> /* * This driver tries to support the "digital" accelerometer chips from @@ -273,6 +274,8 @@ struct lis3lv02d { struct fasync_struct *async_queue; /* queue for the misc device */ wait_queue_head_t misc_wait; /* Wait queue for the misc device */ unsigned long misc_opened; /* bit0: whether the device is open */ + struct miscdevice miscdev; + int data_ready_count[2]; atomic_t wake_thread; unsigned char irq_cfg; @@ -282,10 +285,10 @@ struct lis3lv02d { }; int lis3lv02d_init_device(struct lis3lv02d *lis3); -int lis3lv02d_joystick_enable(void); -void lis3lv02d_joystick_disable(void); +int lis3lv02d_joystick_enable(struct lis3lv02d *lis3); +void lis3lv02d_joystick_disable(struct lis3lv02d *lis3); void lis3lv02d_poweroff(struct lis3lv02d *lis3); -void lis3lv02d_poweron(struct lis3lv02d *lis3); +int lis3lv02d_poweron(struct lis3lv02d *lis3); int lis3lv02d_remove_fs(struct lis3lv02d *lis3); extern struct lis3lv02d lis3_dev; diff --git a/drivers/misc/lis3lv02d/lis3lv02d_i2c.c b/drivers/misc/lis3lv02d/lis3lv02d_i2c.c index b20dfb4522d2..6cdc38f6a9a8 100644 --- a/drivers/misc/lis3lv02d/lis3lv02d_i2c.c +++ b/drivers/misc/lis3lv02d/lis3lv02d_i2c.c @@ -161,8 +161,13 @@ static int __devinit lis3lv02d_i2c_probe(struct i2c_client *client, if (lis3_dev.reg_ctrl) lis3_reg_ctrl(&lis3_dev, LIS3_REG_OFF); - if (ret == 0) - return 0; + if (ret) + goto fail2; + return 0; + +fail2: + regulator_bulk_free(ARRAY_SIZE(lis3_dev.regulators), + lis3_dev.regulators); fail: if (pdata && pdata->release_resources) pdata->release_resources(); @@ -177,7 +182,7 @@ static int __devexit lis3lv02d_i2c_remove(struct i2c_client *client) if (pdata && pdata->release_resources) pdata->release_resources(); - lis3lv02d_joystick_disable(); + lis3lv02d_joystick_disable(lis3); lis3lv02d_remove_fs(&lis3_dev); if (lis3_dev.reg_ctrl) diff --git 
a/drivers/misc/lis3lv02d/lis3lv02d_spi.c b/drivers/misc/lis3lv02d/lis3lv02d_spi.c index c1f8a8fbf694..b2c1be12d16f 100644 --- a/drivers/misc/lis3lv02d/lis3lv02d_spi.c +++ b/drivers/misc/lis3lv02d/lis3lv02d_spi.c @@ -83,7 +83,7 @@ static int __devinit lis302dl_spi_probe(struct spi_device *spi) static int __devexit lis302dl_spi_remove(struct spi_device *spi) { struct lis3lv02d *lis3 = spi_get_drvdata(spi); - lis3lv02d_joystick_disable(); + lis3lv02d_joystick_disable(lis3); lis3lv02d_poweroff(lis3); return lis3lv02d_remove_fs(&lis3_dev); diff --git a/drivers/oprofile/oprofilefs.c b/drivers/oprofile/oprofilefs.c index e9ff6f7770be..ee14e6e5b586 100644 --- a/drivers/oprofile/oprofilefs.c +++ b/drivers/oprofile/oprofilefs.c @@ -65,9 +65,6 @@ int oprofilefs_ulong_from_user(unsigned long *val, char const __user *buf, size_ char tmpbuf[TMPBUFSIZE]; unsigned long flags; - if (!count) - return 0; - if (count > TMPBUFSIZE - 1) return -EINVAL; @@ -97,6 +94,8 @@ static ssize_t ulong_write_file(struct file *file, char const __user *buf, size_ if (*offset) return -EINVAL; + if (count == 0) + return 0; retval = oprofilefs_ulong_from_user(&value, buf, count); if (retval) diff --git a/drivers/platform/x86/Kconfig b/drivers/platform/x86/Kconfig index 1e88d4785321..91d0d4b55ae4 100644 --- a/drivers/platform/x86/Kconfig +++ b/drivers/platform/x86/Kconfig @@ -164,7 +164,7 @@ config HP_ACCEL Support for a led indicating disk protection will be provided as hp::hddprotect. For more information on the feature, refer to - Documentation/hwmon/lis3lv02d. + Documentation/misc-devices/lis3lv02d. To compile this driver as a module, choose M here: the module will be called hp_accel. diff --git a/drivers/platform/x86/acerhdf.c b/drivers/platform/x86/acerhdf.c index 760c6d7624fe..698e45c98b07 100644 --- a/drivers/platform/x86/acerhdf.c +++ b/drivers/platform/x86/acerhdf.c @@ -161,6 +161,7 @@ static const struct bios_settings_t bios_tbl[] = { {"Acer", "Aspire 1410", "v1.3303", 0x55, 0x58, {0x9e, 0x00} }, {"Acer", "Aspire 1410", "v1.3308", 0x55, 0x58, {0x9e, 0x00} }, {"Acer", "Aspire 1410", "v1.3310", 0x55, 0x58, {0x9e, 0x00} }, + {"Acer", "Aspire 1410", "v1.3314", 0x55, 0x58, {0x9e, 0x00} }, /* Acer 1810xx */ {"Acer", "Aspire 1810TZ", "v0.3108", 0x55, 0x58, {0x9e, 0x00} }, {"Acer", "Aspire 1810T", "v0.3108", 0x55, 0x58, {0x9e, 0x00} }, diff --git a/drivers/platform/x86/hp_accel.c b/drivers/platform/x86/hp_accel.c index 1b52d00e2f90..f93a47b7136a 100644 --- a/drivers/platform/x86/hp_accel.c +++ b/drivers/platform/x86/hp_accel.c @@ -209,6 +209,8 @@ static struct dmi_system_id lis3lv02d_dmi_ids[] = { AXIS_DMI_MATCH("NC6715x", "HP Compaq 6715", y_inverted), AXIS_DMI_MATCH("NC693xx", "HP EliteBook 693", xy_rotated_right), AXIS_DMI_MATCH("NC693xx", "HP EliteBook 853", xy_swap), + AXIS_DMI_MATCH("NC854xx", "HP EliteBook 854", y_inverted), + AXIS_DMI_MATCH("NC273xx", "HP EliteBook 273", y_inverted), /* Intel-based HP Pavilion dv5 */ AXIS_DMI_MATCH2("HPDV5_I", PRODUCT_NAME, "HP Pavilion dv5", @@ -227,6 +229,7 @@ static struct dmi_system_id lis3lv02d_dmi_ids[] = { AXIS_DMI_MATCH("HPB452x", "HP ProBook 452", y_inverted), AXIS_DMI_MATCH("HPB522x", "HP ProBook 522", xy_swap), AXIS_DMI_MATCH("HPB532x", "HP ProBook 532", y_inverted), + AXIS_DMI_MATCH("HPB655x", "HP ProBook 655", xy_swap_inverted), AXIS_DMI_MATCH("Mini510x", "HP Mini 510", xy_rotated_left_usd), { NULL, } /* Laptop models without axis info (yet): @@ -320,7 +323,7 @@ static int lis3lv02d_add(struct acpi_device *device) INIT_WORK(&hpled_led.work, delayed_set_status_worker); ret 
= led_classdev_register(NULL, &hpled_led.led_classdev); if (ret) { - lis3lv02d_joystick_disable(); + lis3lv02d_joystick_disable(&lis3_dev); lis3lv02d_poweroff(&lis3_dev); flush_work(&hpled_led.work); return ret; @@ -334,7 +337,7 @@ static int lis3lv02d_remove(struct acpi_device *device, int type) if (!device) return -EINVAL; - lis3lv02d_joystick_disable(); + lis3lv02d_joystick_disable(&lis3_dev); lis3lv02d_poweroff(&lis3_dev); led_classdev_unregister(&hpled_led.led_classdev); @@ -354,8 +357,7 @@ static int lis3lv02d_suspend(struct acpi_device *device, pm_message_t state) static int lis3lv02d_resume(struct acpi_device *device) { - lis3lv02d_poweron(&lis3_dev); - return 0; + return lis3lv02d_poweron(&lis3_dev); } #else #define lis3lv02d_suspend NULL diff --git a/drivers/power/ds2780_battery.c b/drivers/power/ds2780_battery.c index ed25ef5887d4..887ec98d8c22 100644 --- a/drivers/power/ds2780_battery.c +++ b/drivers/power/ds2780_battery.c @@ -39,6 +39,7 @@ struct ds2780_device_info { struct device *dev; struct power_supply bat; struct device *w1_dev; + struct task_struct *mutex_holder; }; enum current_types { @@ -49,8 +50,8 @@ enum current_types { static const char model[] = "DS2780"; static const char manufacturer[] = "Maxim/Dallas"; -static inline struct ds2780_device_info *to_ds2780_device_info( - struct power_supply *psy) +static inline struct ds2780_device_info * +to_ds2780_device_info(struct power_supply *psy) { return container_of(psy, struct ds2780_device_info, bat); } @@ -60,17 +61,28 @@ static inline struct power_supply *to_power_supply(struct device *dev) return dev_get_drvdata(dev); } -static inline int ds2780_read8(struct device *dev, u8 *val, int addr) +static inline int ds2780_battery_io(struct ds2780_device_info *dev_info, + char *buf, int addr, size_t count, int io) { - return w1_ds2780_io(dev, val, addr, sizeof(u8), 0); + if (dev_info->mutex_holder == current) + return w1_ds2780_io_nolock(dev_info->w1_dev, buf, addr, count, io); + else + return w1_ds2780_io(dev_info->w1_dev, buf, addr, count, io); +} + +static inline int ds2780_read8(struct ds2780_device_info *dev_info, u8 *val, + int addr) +{ + return ds2780_battery_io(dev_info, val, addr, sizeof(u8), 0); } -static int ds2780_read16(struct device *dev, s16 *val, int addr) +static int ds2780_read16(struct ds2780_device_info *dev_info, s16 *val, + int addr) { int ret; u8 raw[2]; - ret = w1_ds2780_io(dev, raw, addr, sizeof(u8) * 2, 0); + ret = ds2780_battery_io(dev_info, raw, addr, sizeof(raw), 0); if (ret < 0) return ret; @@ -79,16 +91,16 @@ static int ds2780_read16(struct device *dev, s16 *val, int addr) return 0; } -static inline int ds2780_read_block(struct device *dev, u8 *val, int addr, - size_t count) +static inline int ds2780_read_block(struct ds2780_device_info *dev_info, + u8 *val, int addr, size_t count) { - return w1_ds2780_io(dev, val, addr, count, 0); + return ds2780_battery_io(dev_info, val, addr, count, 0); } -static inline int ds2780_write(struct device *dev, u8 *val, int addr, - size_t count) +static inline int ds2780_write(struct ds2780_device_info *dev_info, u8 *val, + int addr, size_t count) { - return w1_ds2780_io(dev, val, addr, count, 1); + return ds2780_battery_io(dev_info, val, addr, count, 1); } static inline int ds2780_store_eeprom(struct device *dev, int addr) @@ -122,7 +134,7 @@ static int ds2780_set_sense_register(struct ds2780_device_info *dev_info, { int ret; - ret = ds2780_write(dev_info->w1_dev, &conductance, + ret = ds2780_write(dev_info, &conductance, DS2780_RSNSP_REG, sizeof(u8)); if (ret < 
0) return ret; @@ -134,7 +146,7 @@ static int ds2780_set_sense_register(struct ds2780_device_info *dev_info, static int ds2780_get_rsgain_register(struct ds2780_device_info *dev_info, u16 *rsgain) { - return ds2780_read16(dev_info->w1_dev, rsgain, DS2780_RSGAIN_MSB_REG); + return ds2780_read16(dev_info, rsgain, DS2780_RSGAIN_MSB_REG); } /* Set RSGAIN value from 0 to 1.999 in steps of 0.001 */ @@ -144,8 +156,8 @@ static int ds2780_set_rsgain_register(struct ds2780_device_info *dev_info, int ret; u8 raw[] = {rsgain >> 8, rsgain & 0xFF}; - ret = ds2780_write(dev_info->w1_dev, raw, - DS2780_RSGAIN_MSB_REG, sizeof(u8) * 2); + ret = ds2780_write(dev_info, raw, + DS2780_RSGAIN_MSB_REG, sizeof(raw)); if (ret < 0) return ret; @@ -167,7 +179,7 @@ static int ds2780_get_voltage(struct ds2780_device_info *dev_info, * Bits 2 - 0 of the voltage value are in bits 7 - 5 of the * voltage LSB register */ - ret = ds2780_read16(dev_info->w1_dev, &voltage_raw, + ret = ds2780_read16(dev_info, &voltage_raw, DS2780_VOLT_MSB_REG); if (ret < 0) return ret; @@ -196,7 +208,7 @@ static int ds2780_get_temperature(struct ds2780_device_info *dev_info, * Bits 2 - 0 of the temperature value are in bits 7 - 5 of the * temperature LSB register */ - ret = ds2780_read16(dev_info->w1_dev, &temperature_raw, + ret = ds2780_read16(dev_info, &temperature_raw, DS2780_TEMP_MSB_REG); if (ret < 0) return ret; @@ -222,13 +234,13 @@ static int ds2780_get_current(struct ds2780_device_info *dev_info, * The units of measurement for current are dependent on the value of * the sense resistor. */ - ret = ds2780_read8(dev_info->w1_dev, &sense_res_raw, DS2780_RSNSP_REG); + ret = ds2780_read8(dev_info, &sense_res_raw, DS2780_RSNSP_REG); if (ret < 0) return ret; if (sense_res_raw == 0) { dev_err(dev_info->dev, "sense resistor value is 0\n"); - return -ENXIO; + return -EINVAL; } sense_res = 1000 / sense_res_raw; @@ -248,7 +260,7 @@ static int ds2780_get_current(struct ds2780_device_info *dev_info, * Bits 7 - 0 of the current value are in bits 7 - 0 of the current * LSB register */ - ret = ds2780_read16(dev_info->w1_dev, &current_raw, reg_msb); + ret = ds2780_read16(dev_info, &current_raw, reg_msb); if (ret < 0) return ret; @@ -267,7 +279,7 @@ static int ds2780_get_accumulated_current(struct ds2780_device_info *dev_info, * The units of measurement for accumulated current are dependent on * the value of the sense resistor. 
*/ - ret = ds2780_read8(dev_info->w1_dev, &sense_res_raw, DS2780_RSNSP_REG); + ret = ds2780_read8(dev_info, &sense_res_raw, DS2780_RSNSP_REG); if (ret < 0) return ret; @@ -285,7 +297,7 @@ static int ds2780_get_accumulated_current(struct ds2780_device_info *dev_info, * Bits 7 - 0 of the ACR value are in bits 7 - 0 of the ACR * LSB register */ - ret = ds2780_read16(dev_info->w1_dev, &current_raw, DS2780_ACR_MSB_REG); + ret = ds2780_read16(dev_info, &current_raw, DS2780_ACR_MSB_REG); if (ret < 0) return ret; @@ -299,7 +311,7 @@ static int ds2780_get_capacity(struct ds2780_device_info *dev_info, int ret; u8 raw; - ret = ds2780_read8(dev_info->w1_dev, &raw, DS2780_RARC_REG); + ret = ds2780_read8(dev_info, &raw, DS2780_RARC_REG); if (ret < 0) return ret; @@ -345,7 +357,7 @@ static int ds2780_get_charge_now(struct ds2780_device_info *dev_info, * Bits 7 - 0 of the RAAC value are in bits 7 - 0 of the RAAC * LSB register */ - ret = ds2780_read16(dev_info->w1_dev, &charge_raw, DS2780_RAAC_MSB_REG); + ret = ds2780_read16(dev_info, &charge_raw, DS2780_RAAC_MSB_REG); if (ret < 0) return ret; @@ -356,7 +368,7 @@ static int ds2780_get_charge_now(struct ds2780_device_info *dev_info, static int ds2780_get_control_register(struct ds2780_device_info *dev_info, u8 *control_reg) { - return ds2780_read8(dev_info->w1_dev, control_reg, DS2780_CONTROL_REG); + return ds2780_read8(dev_info, control_reg, DS2780_CONTROL_REG); } static int ds2780_set_control_register(struct ds2780_device_info *dev_info, @@ -364,7 +376,7 @@ static int ds2780_set_control_register(struct ds2780_device_info *dev_info, { int ret; - ret = ds2780_write(dev_info->w1_dev, &control_reg, + ret = ds2780_write(dev_info, &control_reg, DS2780_CONTROL_REG, sizeof(u8)); if (ret < 0) return ret; @@ -503,7 +515,7 @@ static ssize_t ds2780_get_sense_resistor_value(struct device *dev, struct power_supply *psy = to_power_supply(dev); struct ds2780_device_info *dev_info = to_ds2780_device_info(psy); - ret = ds2780_read8(dev_info->w1_dev, &sense_resistor, DS2780_RSNSP_REG); + ret = ds2780_read8(dev_info, &sense_resistor, DS2780_RSNSP_REG); if (ret < 0) return ret; @@ -584,7 +596,7 @@ static ssize_t ds2780_get_pio_pin(struct device *dev, struct power_supply *psy = to_power_supply(dev); struct ds2780_device_info *dev_info = to_ds2780_device_info(psy); - ret = ds2780_read8(dev_info->w1_dev, &sfr, DS2780_SFR_REG); + ret = ds2780_read8(dev_info, &sfr, DS2780_SFR_REG); if (ret < 0) return ret; @@ -611,7 +623,7 @@ static ssize_t ds2780_set_pio_pin(struct device *dev, return -EINVAL; } - ret = ds2780_write(dev_info->w1_dev, &new_setting, + ret = ds2780_write(dev_info, &new_setting, DS2780_SFR_REG, sizeof(u8)); if (ret < 0) return ret; @@ -632,7 +644,7 @@ static ssize_t ds2780_read_param_eeprom_bin(struct file *filp, DS2780_EEPROM_BLOCK1_END - DS2780_EEPROM_BLOCK1_START + 1 - off); - return ds2780_read_block(dev_info->w1_dev, buf, + return ds2780_read_block(dev_info, buf, DS2780_EEPROM_BLOCK1_START + off, count); } @@ -650,7 +662,7 @@ static ssize_t ds2780_write_param_eeprom_bin(struct file *filp, DS2780_EEPROM_BLOCK1_END - DS2780_EEPROM_BLOCK1_START + 1 - off); - ret = ds2780_write(dev_info->w1_dev, buf, + ret = ds2780_write(dev_info, buf, DS2780_EEPROM_BLOCK1_START + off, count); if (ret < 0) return ret; @@ -685,9 +697,8 @@ static ssize_t ds2780_read_user_eeprom_bin(struct file *filp, DS2780_EEPROM_BLOCK0_END - DS2780_EEPROM_BLOCK0_START + 1 - off); - return ds2780_read_block(dev_info->w1_dev, buf, + return ds2780_read_block(dev_info, buf, DS2780_EEPROM_BLOCK0_START + off, 
count); - } static ssize_t ds2780_write_user_eeprom_bin(struct file *filp, @@ -704,7 +715,7 @@ static ssize_t ds2780_write_user_eeprom_bin(struct file *filp, DS2780_EEPROM_BLOCK0_END - DS2780_EEPROM_BLOCK0_START + 1 - off); - ret = ds2780_write(dev_info->w1_dev, buf, + ret = ds2780_write(dev_info, buf, DS2780_EEPROM_BLOCK0_START + off, count); if (ret < 0) return ret; @@ -768,6 +779,7 @@ static int __devinit ds2780_battery_probe(struct platform_device *pdev) dev_info->bat.properties = ds2780_battery_props; dev_info->bat.num_properties = ARRAY_SIZE(ds2780_battery_props); dev_info->bat.get_property = ds2780_battery_get_property; + dev_info->mutex_holder = current; ret = power_supply_register(&pdev->dev, &dev_info->bat); if (ret) { @@ -797,6 +809,8 @@ static int __devinit ds2780_battery_probe(struct platform_device *pdev) goto fail_remove_bin_file; } + dev_info->mutex_holder = NULL; + return 0; fail_remove_bin_file: @@ -816,6 +830,8 @@ static int __devexit ds2780_battery_remove(struct platform_device *pdev) { struct ds2780_device_info *dev_info = platform_get_drvdata(pdev); + dev_info->mutex_holder = current; + /* remove attributes */ sysfs_remove_group(&dev_info->bat.dev->kobj, &ds2780_attr_group); diff --git a/drivers/pps/clients/Kconfig b/drivers/pps/clients/Kconfig index 8520a7f4dd62..c2e0f1e97bcd 100644 --- a/drivers/pps/clients/Kconfig +++ b/drivers/pps/clients/Kconfig @@ -29,4 +29,13 @@ config PPS_CLIENT_PARPORT If you say yes here you get support for a PPS source connected with the interrupt pin of your parallel port. +config PPS_CLIENT_GPIO + tristate "PPS client using GPIO" + depends on PPS + help + If you say yes here you get support for a PPS source using + GPIO. To be useful you must also register a platform device + specifying the GPIO pin and other options, usually in your board + setup. + endif diff --git a/drivers/pps/clients/Makefile b/drivers/pps/clients/Makefile index 4feb7e9e71ee..a461d15f4a2e 100644 --- a/drivers/pps/clients/Makefile +++ b/drivers/pps/clients/Makefile @@ -5,5 +5,6 @@ obj-$(CONFIG_PPS_CLIENT_KTIMER) += pps-ktimer.o obj-$(CONFIG_PPS_CLIENT_LDISC) += pps-ldisc.o obj-$(CONFIG_PPS_CLIENT_PARPORT) += pps_parport.o +obj-$(CONFIG_PPS_CLIENT_GPIO) += pps-gpio.o ccflags-$(CONFIG_PPS_DEBUG) := -DDEBUG diff --git a/drivers/pps/clients/pps-gpio.c b/drivers/pps/clients/pps-gpio.c new file mode 100644 index 000000000000..655055545479 --- /dev/null +++ b/drivers/pps/clients/pps-gpio.c @@ -0,0 +1,227 @@ +/* + * pps-gpio.c -- PPS client driver using GPIO + * + * + * Copyright (C) 2010 Ricardo Martins <rasm@fe.up.pt> + * Copyright (C) 2011 James Nuss <jamesnuss@nanometrics.ca> + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License as published by + * the Free Software Foundation; either version 2 of the License, or + * (at your option) any later version. + * + * This program is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + * + * You should have received a copy of the GNU General Public License + * along with this program; if not, write to the Free Software + * Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA. 
+ */ + +#define PPS_GPIO_NAME "pps-gpio" +#define pr_fmt(fmt) PPS_GPIO_NAME ": " fmt + +#include <linux/init.h> +#include <linux/kernel.h> +#include <linux/interrupt.h> +#include <linux/module.h> +#include <linux/platform_device.h> +#include <linux/slab.h> +#include <linux/pps_kernel.h> +#include <linux/pps-gpio.h> +#include <linux/gpio.h> +#include <linux/list.h> + +/* Info for each registered platform device */ +struct pps_gpio_device_data { + int irq; /* IRQ used as PPS source */ + struct pps_device *pps; /* PPS source device */ + struct pps_source_info info; /* PPS source information */ + const struct pps_gpio_platform_data *pdata; +}; + +/* + * Report the PPS event + */ + +static irqreturn_t pps_gpio_irq_handler(int irq, void *data) +{ + const struct pps_gpio_device_data *info; + struct pps_event_time ts; + int rising_edge; + + /* Get the time stamp first */ + pps_get_ts(&ts); + + info = data; + + rising_edge = gpio_get_value(info->pdata->gpio_pin); + if ((rising_edge && !info->pdata->assert_falling_edge) || + (!rising_edge && info->pdata->assert_falling_edge)) + pps_event(info->pps, &ts, PPS_CAPTUREASSERT, NULL); + else if (info->pdata->capture_clear && + ((rising_edge && info->pdata->assert_falling_edge) || + (!rising_edge && !info->pdata->assert_falling_edge))) + pps_event(info->pps, &ts, PPS_CAPTURECLEAR, NULL); + + return IRQ_HANDLED; +} + +static int pps_gpio_setup(struct platform_device *pdev) +{ + int ret; + const struct pps_gpio_platform_data *pdata = pdev->dev.platform_data; + + ret = gpio_request(pdata->gpio_pin, pdata->gpio_label); + if (ret) { + pr_warning("failed to request GPIO %u\n", pdata->gpio_pin); + return -EINVAL; + } + + ret = gpio_direction_input(pdata->gpio_pin); + if (ret) { + pr_warning("failed to set pin direction\n"); + gpio_free(pdata->gpio_pin); + return -EINVAL; + } + + return 0; +} + +static unsigned long +get_irqf_trigger_flags(const struct pps_gpio_platform_data *pdata) +{ + unsigned long flags = pdata->assert_falling_edge ? + IRQF_TRIGGER_FALLING : IRQF_TRIGGER_RISING; + + if (pdata->capture_clear) { + flags |= ((flags & IRQF_TRIGGER_RISING) ? + IRQF_TRIGGER_FALLING : IRQF_TRIGGER_RISING); + } + + return flags; +} + +static int pps_gpio_probe(struct platform_device *pdev) +{ + struct pps_gpio_device_data *data; + int irq; + int ret; + int err; + int pps_default_params; + const struct pps_gpio_platform_data *pdata = pdev->dev.platform_data; + + + /* GPIO setup */ + ret = pps_gpio_setup(pdev); + if (ret) + return -EINVAL; + + /* IRQ setup */ + irq = gpio_to_irq(pdata->gpio_pin); + if (irq < 0) { + pr_err("failed to map GPIO to IRQ: %d\n", irq); + err = -EINVAL; + goto return_error; + } + + /* allocate space for device info */ + data = kzalloc(sizeof(struct pps_gpio_device_data), GFP_KERNEL); + if (data == NULL) { + err = -ENOMEM; + goto return_error; + } + + /* initialize PPS specific parts of the bookkeeping data structure. 
*/ + data->info.mode = PPS_CAPTUREASSERT | PPS_OFFSETASSERT | + PPS_ECHOASSERT | PPS_CANWAIT | PPS_TSFMT_TSPEC; + if (pdata->capture_clear) + data->info.mode |= PPS_CAPTURECLEAR | PPS_OFFSETCLEAR | + PPS_ECHOCLEAR; + data->info.owner = THIS_MODULE; + snprintf(data->info.name, PPS_MAX_NAME_LEN - 1, "%s.%d", + pdev->name, pdev->id); + + /* register PPS source */ + pps_default_params = PPS_CAPTUREASSERT | PPS_OFFSETASSERT; + if (pdata->capture_clear) + pps_default_params |= PPS_CAPTURECLEAR | PPS_OFFSETCLEAR; + data->pps = pps_register_source(&data->info, pps_default_params); + if (data->pps == NULL) { + kfree(data); + pr_err("failed to register IRQ %d as PPS source\n", irq); + err = -EINVAL; + goto return_error; + } + + data->irq = irq; + data->pdata = pdata; + + /* register IRQ interrupt handler */ + ret = request_irq(irq, pps_gpio_irq_handler, + get_irqf_trigger_flags(pdata), data->info.name, data); + if (ret) { + pps_unregister_source(data->pps); + kfree(data); + pr_err("failed to acquire IRQ %d\n", irq); + err = -EINVAL; + goto return_error; + } + + platform_set_drvdata(pdev, data); + dev_info(data->pps->dev, "Registered IRQ %d as PPS source\n", irq); + + return 0; + +return_error: + gpio_free(pdata->gpio_pin); + return err; +} + +static int pps_gpio_remove(struct platform_device *pdev) +{ + struct pps_gpio_device_data *data = platform_get_drvdata(pdev); + const struct pps_gpio_platform_data *pdata = data->pdata; + + platform_set_drvdata(pdev, NULL); + free_irq(data->irq, data); + gpio_free(pdata->gpio_pin); + pps_unregister_source(data->pps); + pr_info("removed IRQ %d as PPS source\n", data->irq); + kfree(data); + return 0; +} + +static struct platform_driver pps_gpio_driver = { + .probe = pps_gpio_probe, + .remove = __devexit_p(pps_gpio_remove), + .driver = { + .name = PPS_GPIO_NAME, + .owner = THIS_MODULE + }, +}; + +static int __init pps_gpio_init(void) +{ + int ret = platform_driver_register(&pps_gpio_driver); + if (ret < 0) + pr_err("failed to register platform driver\n"); + return ret; +} + +static void __exit pps_gpio_exit(void) +{ + platform_driver_unregister(&pps_gpio_driver); + pr_debug("unregistered platform driver\n"); +} + +module_init(pps_gpio_init); +module_exit(pps_gpio_exit); + +MODULE_AUTHOR("Ricardo Martins <rasm@fe.up.pt>"); +MODULE_AUTHOR("James Nuss <jamesnuss@nanometrics.ca>"); +MODULE_DESCRIPTION("Use GPIO pin as PPS source"); +MODULE_LICENSE("GPL"); +MODULE_VERSION("1.0.0"); diff --git a/drivers/pps/clients/pps-ktimer.c b/drivers/pps/clients/pps-ktimer.c index 82583b0ff82d..436b4e4e71a1 100644 --- a/drivers/pps/clients/pps-ktimer.c +++ b/drivers/pps/clients/pps-ktimer.c @@ -52,17 +52,6 @@ static void pps_ktimer_event(unsigned long ptr) } /* - * The echo function - */ - -static void pps_ktimer_echo(struct pps_device *pps, int event, void *data) -{ - dev_info(pps->dev, "echo %s %s\n", - event & PPS_CAPTUREASSERT ? "assert" : "", - event & PPS_CAPTURECLEAR ? 
"clear" : ""); -} - -/* * The PPS info struct */ @@ -72,7 +61,6 @@ static struct pps_source_info pps_ktimer_info = { .mode = PPS_CAPTUREASSERT | PPS_OFFSETASSERT | PPS_ECHOASSERT | PPS_CANWAIT | PPS_TSFMT_TSPEC, - .echo = pps_ktimer_echo, .owner = THIS_MODULE, }; diff --git a/drivers/pps/clients/pps_parport.c b/drivers/pps/clients/pps_parport.c index c571d6dd8f61..e1b4705ae3ec 100644 --- a/drivers/pps/clients/pps_parport.c +++ b/drivers/pps/clients/pps_parport.c @@ -133,14 +133,6 @@ out_both: return; } -/* the PPS echo function */ -static void pps_echo(struct pps_device *pps, int event, void *data) -{ - dev_info(pps->dev, "echo %s %s\n", - event & PPS_CAPTUREASSERT ? "assert" : "", - event & PPS_CAPTURECLEAR ? "clear" : ""); -} - static void parport_attach(struct parport *port) { struct pps_client_pp *device; @@ -151,7 +143,6 @@ static void parport_attach(struct parport *port) PPS_OFFSETASSERT | PPS_OFFSETCLEAR | \ PPS_ECHOASSERT | PPS_ECHOCLEAR | \ PPS_CANWAIT | PPS_TSFMT_TSPEC, - .echo = pps_echo, .owner = THIS_MODULE, .dev = NULL }; diff --git a/drivers/pps/kapi.c b/drivers/pps/kapi.c index a4e8eb9fece6..f197e8ea185c 100644 --- a/drivers/pps/kapi.c +++ b/drivers/pps/kapi.c @@ -52,6 +52,14 @@ static void pps_add_offset(struct pps_ktime *ts, struct pps_ktime *offset) ts->sec += offset->sec; } +static void pps_echo_client_default(struct pps_device *pps, int event, + void *data) +{ + dev_info(pps->dev, "echo %s %s\n", + event & PPS_CAPTUREASSERT ? "assert" : "", + event & PPS_CAPTURECLEAR ? "clear" : ""); +} + /* * Exported functions */ @@ -80,13 +88,6 @@ struct pps_device *pps_register_source(struct pps_source_info *info, err = -EINVAL; goto pps_register_source_exit; } - if ((info->mode & (PPS_ECHOASSERT | PPS_ECHOCLEAR)) != 0 && - info->echo == NULL) { - pr_err("%s: echo function is not defined\n", - info->name); - err = -EINVAL; - goto pps_register_source_exit; - } if ((info->mode & (PPS_TSFMT_TSPEC | PPS_TSFMT_NTPFP)) == 0) { pr_err("%s: unspecified time format\n", info->name); @@ -108,6 +109,11 @@ struct pps_device *pps_register_source(struct pps_source_info *info, pps->params.mode = default_params; pps->info = *info; + /* check for default echo function */ + if ((pps->info.mode & (PPS_ECHOASSERT | PPS_ECHOCLEAR)) && + pps->info.echo == NULL) + pps->info.echo = pps_echo_client_default; + init_waitqueue_head(&pps->queue); spin_lock_init(&pps->lock); diff --git a/drivers/rapidio/Kconfig b/drivers/rapidio/Kconfig index 070211a5955c..bc8719238793 100644 --- a/drivers/rapidio/Kconfig +++ b/drivers/rapidio/Kconfig @@ -1,6 +1,8 @@ # # RapidIO configuration # +source "drivers/rapidio/devices/Kconfig" + config RAPIDIO_DISC_TIMEOUT int "Discovery timeout duration (seconds)" depends on RAPIDIO @@ -20,8 +22,6 @@ config RAPIDIO_ENABLE_RX_TX_PORTS ports for Input/Output direction to allow other traffic than Maintenance transfers. -source "drivers/rapidio/switches/Kconfig" - config RAPIDIO_DEBUG bool "RapidIO subsystem debug messages" depends on RAPIDIO @@ -32,3 +32,5 @@ config RAPIDIO_DEBUG going on. If you are unsure about this, say N here. 
+ +source "drivers/rapidio/switches/Kconfig" diff --git a/drivers/rapidio/Makefile b/drivers/rapidio/Makefile index 89b8eca825b5..ec3fb8121004 100644 --- a/drivers/rapidio/Makefile +++ b/drivers/rapidio/Makefile @@ -4,5 +4,6 @@ obj-y += rio.o rio-access.o rio-driver.o rio-scan.o rio-sysfs.o obj-$(CONFIG_RAPIDIO) += switches/ +obj-$(CONFIG_RAPIDIO) += devices/ subdir-ccflags-$(CONFIG_RAPIDIO_DEBUG) := -DDEBUG diff --git a/drivers/rapidio/devices/Kconfig b/drivers/rapidio/devices/Kconfig new file mode 100644 index 000000000000..df0b0c56cc24 --- /dev/null +++ b/drivers/rapidio/devices/Kconfig @@ -0,0 +1,10 @@ +# +# RapidIO master port configuration +# + +config RAPIDIO_TSI721 + bool "IDT Tsi721 PCI Express SRIO Controller support" + depends on RAPIDIO && PCI && PCIEPORTBUS + default "n" + ---help--- + Include support for IDT Tsi721 PCI Express Serial RapidIO controller. diff --git a/drivers/rapidio/devices/Makefile b/drivers/rapidio/devices/Makefile new file mode 100644 index 000000000000..3b7b4e2dff7c --- /dev/null +++ b/drivers/rapidio/devices/Makefile @@ -0,0 +1,5 @@ +# +# Makefile for RapidIO devices +# + +obj-$(CONFIG_RAPIDIO_TSI721) += tsi721.o diff --git a/drivers/rapidio/devices/tsi721.c b/drivers/rapidio/devices/tsi721.c new file mode 100644 index 000000000000..dafedc8cc558 --- /dev/null +++ b/drivers/rapidio/devices/tsi721.c @@ -0,0 +1,2337 @@ +/* + * RapidIO mport driver for Tsi721 PCIExpress-to-SRIO bridge + * + * Copyright 2011 Integrated Device Technology, Inc. + * Alexandre Bounine <alexandre.bounine@idt.com> + * Chul Kim <chul.kim@idt.com> + * + * This program is free software; you can redistribute it and/or modify it + * under the terms of the GNU General Public License as published by the Free + * Software Foundation; either version 2 of the License, or (at your option) + * any later version. + * + * This program is distributed in the hope that it will be useful, but WITHOUT + * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or + * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for + * more details. + * + * You should have received a copy of the GNU General Public License along with + * this program; if not, write to the Free Software Foundation, Inc., 59 + * Temple Place - Suite 330, Boston, MA 02111-1307, USA. + */ + +#include <linux/io.h> +#include <linux/errno.h> +#include <linux/init.h> +#include <linux/ioport.h> +#include <linux/kernel.h> +#include <linux/module.h> +#include <linux/pci.h> +#include <linux/rio.h> +#include <linux/rio_drv.h> +#include <linux/dma-mapping.h> +#include <linux/interrupt.h> +#include <linux/kfifo.h> +#include <linux/delay.h> + +#include "tsi721.h" + +#define DEBUG_PW /* Inbound Port-Write debugging */ + +static void tsi721_omsg_handler(struct tsi721_device *priv, int ch); +static void tsi721_imsg_handler(struct tsi721_device *priv, int ch); + +/** + * tsi721_lcread - read from local SREP config space + * @mport: RapidIO master port info + * @index: ID of RapdiIO interface + * @offset: Offset into configuration space + * @len: Length (in bytes) of the maintenance transaction + * @data: Value to be read into + * + * Generates a local SREP space read. Returns %0 on + * success or %-EINVAL on failure. 
+ */ +static int tsi721_lcread(struct rio_mport *mport, int index, u32 offset, + int len, u32 *data) +{ + struct tsi721_device *priv = mport->priv; + + if (len != sizeof(u32)) + return -EINVAL; /* only 32-bit access is supported */ + + *data = ioread32(priv->regs + offset); + + return 0; +} + +/** + * tsi721_lcwrite - write into local SREP config space + * @mport: RapidIO master port info + * @index: ID of RapdiIO interface + * @offset: Offset into configuration space + * @len: Length (in bytes) of the maintenance transaction + * @data: Value to be written + * + * Generates a local write into SREP configuration space. Returns %0 on + * success or %-EINVAL on failure. + */ +static int tsi721_lcwrite(struct rio_mport *mport, int index, u32 offset, + int len, u32 data) +{ + struct tsi721_device *priv = mport->priv; + + if (len != sizeof(u32)) + return -EINVAL; /* only 32-bit access is supported */ + + iowrite32(data, priv->regs + offset); + + return 0; +} + +/** + * tsi721_maint_dma - Helper function to generate RapidIO maintenance + * transactions using designated Tsi721 DMA channel. + * @priv: pointer to tsi721 private data + * @sys_size: RapdiIO transport system size + * @destid: Destination ID of transaction + * @hopcount: Number of hops to target device + * @offset: Offset into configuration space + * @len: Length (in bytes) of the maintenance transaction + * @data: Location to be read from or write into + * @do_wr: Operation flag (1 == MAINT_WR) + * + * Generates a RapidIO maintenance transaction (Read or Write). + * Returns %0 on success and %-EINVAL or %-EFAULT on failure. + */ +static int tsi721_maint_dma(struct tsi721_device *priv, u32 sys_size, + u16 destid, u8 hopcount, u32 offset, int len, + u32 *data, int do_wr) +{ + struct tsi721_dma_desc *bd_ptr; + u32 rd_count, swr_ptr, ch_stat; + int i, err = 0; + u32 op = do_wr ? MAINT_WR : MAINT_RD; + + if (offset > (RIO_MAINT_SPACE_SZ - len) || (len != sizeof(u32))) + return -EINVAL; + + bd_ptr = priv->bdma[TSI721_DMACH_MAINT].bd_base; + + rd_count = ioread32( + priv->regs + TSI721_DMAC_DRDCNT(TSI721_DMACH_MAINT)); + + /* Initialize DMA descriptor */ + bd_ptr[0].type_id = cpu_to_le32((DTYPE2 << 29) | (op << 19) | destid); + bd_ptr[0].bcount = cpu_to_le32((sys_size << 26) | 0x04); + bd_ptr[0].raddr_lo = cpu_to_le32((hopcount << 24) | offset); + bd_ptr[0].raddr_hi = 0; + if (do_wr) + bd_ptr[0].data[0] = cpu_to_be32p(data); + else + bd_ptr[0].data[0] = 0xffffffff; + + mb(); + + /* Start DMA operation */ + iowrite32(rd_count + 2, + priv->regs + TSI721_DMAC_DWRCNT(TSI721_DMACH_MAINT)); + (void)ioread32(priv->regs + TSI721_DMAC_DWRCNT(TSI721_DMACH_MAINT)); + i = 0; + + /* Wait until DMA transfer is finished */ + while ((ch_stat = ioread32(priv->regs + + TSI721_DMAC_STS(TSI721_DMACH_MAINT))) & TSI721_DMAC_STS_RUN) { + udelay(10); + i++; + if (i >= 5000000) { + dev_dbg(&priv->pdev->dev, + "%s : DMA[%d] read timeout ch_status=%x\n", + __func__, TSI721_DMACH_MAINT, ch_stat); + if (!do_wr) + *data = 0xffffffff; + err = -EFAULT; + goto err_out; + } + } + + if (ch_stat & TSI721_DMAC_STS_ABORT) { + /* If DMA operation aborted due to error, + * reinitialize DMA channel + */ + dev_dbg(&priv->pdev->dev, "%s : DMA ABORT ch_stat=%x\n", + __func__, ch_stat); + dev_dbg(&priv->pdev->dev, "OP=%d : destid=%x hc=%x off=%x\n", + do_wr ? 
MAINT_WR : MAINT_RD, destid, hopcount, offset); + iowrite32(TSI721_DMAC_INT_ALL, + priv->regs + TSI721_DMAC_INT(TSI721_DMACH_MAINT)); + iowrite32(TSI721_DMAC_CTL_INIT, + priv->regs + TSI721_DMAC_CTL(TSI721_DMACH_MAINT)); + udelay(10); + iowrite32(0, priv->regs + + TSI721_DMAC_DWRCNT(TSI721_DMACH_MAINT)); + udelay(1); + if (!do_wr) + *data = 0xffffffff; + err = -EFAULT; + goto err_out; + } + + if (!do_wr) + *data = be32_to_cpu(bd_ptr[0].data[0]); + + /* + * Update descriptor status FIFO RD pointer. + * NOTE: Skipping check and clear FIFO entries because we are waiting + * for transfer to be completed. + */ + swr_ptr = ioread32(priv->regs + TSI721_DMAC_DSWP(TSI721_DMACH_MAINT)); + iowrite32(swr_ptr, priv->regs + TSI721_DMAC_DSRP(TSI721_DMACH_MAINT)); +err_out: + + return err; +} + +/** + * tsi721_cread_dma - Generate a RapidIO maintenance read transaction + * using Tsi721 BDMA engine. + * @mport: RapidIO master port control structure + * @index: ID of RapdiIO interface + * @destid: Destination ID of transaction + * @hopcount: Number of hops to target device + * @offset: Offset into configuration space + * @len: Length (in bytes) of the maintenance transaction + * @val: Location to be read into + * + * Generates a RapidIO maintenance read transaction. + * Returns %0 on success and %-EINVAL or %-EFAULT on failure. + */ +static int tsi721_cread_dma(struct rio_mport *mport, int index, u16 destid, + u8 hopcount, u32 offset, int len, u32 *data) +{ + struct tsi721_device *priv = mport->priv; + + return tsi721_maint_dma(priv, mport->sys_size, destid, hopcount, + offset, len, data, 0); +} + +/** + * tsi721_cwrite_dma - Generate a RapidIO maintenance write transaction + * using Tsi721 BDMA engine + * @mport: RapidIO master port control structure + * @index: ID of RapdiIO interface + * @destid: Destination ID of transaction + * @hopcount: Number of hops to target device + * @offset: Offset into configuration space + * @len: Length (in bytes) of the maintenance transaction + * @val: Value to be written + * + * Generates a RapidIO maintenance write transaction. + * Returns %0 on success and %-EINVAL or %-EFAULT on failure. + */ +static int tsi721_cwrite_dma(struct rio_mport *mport, int index, u16 destid, + u8 hopcount, u32 offset, int len, u32 data) +{ + struct tsi721_device *priv = mport->priv; + u32 temp = data; + + return tsi721_maint_dma(priv, mport->sys_size, destid, hopcount, + offset, len, &temp, 1); +} + +/** + * tsi721_pw_handler - Tsi721 inbound port-write interrupt handler + * @mport: RapidIO master port structure + * + * Handles inbound port-write interrupts. Copies PW message from an internal + * buffer into PW message FIFO and schedules deferred routine to process + * queued messages. + */ +static int +tsi721_pw_handler(struct rio_mport *mport) +{ + struct tsi721_device *priv = mport->priv; + u32 pw_stat; + u32 pw_buf[TSI721_RIO_PW_MSG_SIZE/sizeof(u32)]; + + + pw_stat = ioread32(priv->regs + TSI721_RIO_PW_RX_STAT); + + if (pw_stat & TSI721_RIO_PW_RX_STAT_PW_VAL) { + pw_buf[0] = ioread32(priv->regs + TSI721_RIO_PW_RX_CAPT(0)); + pw_buf[1] = ioread32(priv->regs + TSI721_RIO_PW_RX_CAPT(1)); + pw_buf[2] = ioread32(priv->regs + TSI721_RIO_PW_RX_CAPT(2)); + pw_buf[3] = ioread32(priv->regs + TSI721_RIO_PW_RX_CAPT(3)); + + /* Queue PW message (if there is room in FIFO), + * otherwise discard it. 
+ */ + spin_lock(&priv->pw_fifo_lock); + if (kfifo_avail(&priv->pw_fifo) >= TSI721_RIO_PW_MSG_SIZE) + kfifo_in(&priv->pw_fifo, pw_buf, + TSI721_RIO_PW_MSG_SIZE); + else + priv->pw_discard_count++; + spin_unlock(&priv->pw_fifo_lock); + } + + /* Clear pending PW interrupts */ + iowrite32(TSI721_RIO_PW_RX_STAT_PW_DISC | TSI721_RIO_PW_RX_STAT_PW_VAL, + priv->regs + TSI721_RIO_PW_RX_STAT); + + schedule_work(&priv->pw_work); + + return 0; +} + +static void tsi721_pw_dpc(struct work_struct *work) +{ + struct tsi721_device *priv = container_of(work, struct tsi721_device, + pw_work); + unsigned long flags; + u32 msg_buffer[RIO_PW_MSG_SIZE/sizeof(u32)]; /* Use full size PW message + buffer for RIO layer */ + + /* + * Process port-write messages + */ + spin_lock_irqsave(&priv->pw_fifo_lock, flags); + while (kfifo_out(&priv->pw_fifo, (unsigned char *)msg_buffer, + TSI721_RIO_PW_MSG_SIZE)) { + /* Process one message */ + spin_unlock_irqrestore(&priv->pw_fifo_lock, flags); +#ifdef DEBUG_PW + { + u32 i; + pr_debug("%s : Port-Write Message:", __func__); + for (i = 0; i < RIO_PW_MSG_SIZE/sizeof(u32); ) { + pr_debug("0x%02x: %08x %08x %08x %08x", i*4, + msg_buffer[i], msg_buffer[i + 1], + msg_buffer[i + 2], msg_buffer[i + 3]); + i += 4; + } + pr_debug("\n"); + } +#endif + /* Pass the port-write message to RIO core for processing */ + rio_inb_pwrite_handler((union rio_pw_msg *)msg_buffer); + spin_lock_irqsave(&priv->pw_fifo_lock, flags); + } + spin_unlock_irqrestore(&priv->pw_fifo_lock, flags); +} + +/** + * tsi721_pw_enable - enable/disable port-write interface init + * @mport: Master port implementing the port write unit + * @enable: 1=enable; 0=disable port-write message handling + */ +static int tsi721_pw_enable(struct rio_mport *mport, int enable) +{ + struct tsi721_device *priv = mport->priv; + u32 rval; + + rval = ioread32(priv->regs + TSI721_RIO_EM_INT_ENABLE); + + if (enable) + rval |= TSI721_RIO_EM_INT_ENABLE_PW_RX; + else + rval &= ~TSI721_RIO_EM_INT_ENABLE_PW_RX; + + /* Clear pending PW interrupts */ + iowrite32(TSI721_RIO_PW_RX_STAT_PW_DISC | TSI721_RIO_PW_RX_STAT_PW_VAL, + priv->regs + TSI721_RIO_PW_RX_STAT); + /* Update enable bits */ + iowrite32(rval, priv->regs + TSI721_RIO_EM_INT_ENABLE); + + return 0; +} + +/** + * tsi721_dsend - Send a RapidIO doorbell + * @mport: RapidIO master port info + * @index: ID of RapidIO interface + * @destid: Destination ID of target device + * @data: 16-bit info field of RapidIO doorbell + * + * Sends a RapidIO doorbell message. Always returns %0. + */ +static int tsi721_dsend(struct rio_mport *mport, int index, + u16 destid, u16 data) +{ + struct tsi721_device *priv = mport->priv; + u32 offset; + + offset = (((mport->sys_size) ? RIO_TT_CODE_16 : RIO_TT_CODE_8) << 18) | + (destid << 2); + + dev_dbg(&priv->pdev->dev, + "Send Doorbell 0x%04x to destID 0x%x\n", data, destid); + iowrite16be(data, priv->odb_base + offset); + + return 0; +} + +/** + * tsi721_dbell_handler - Tsi721 doorbell interrupt handler + * @mport: RapidIO master port structure + * + * Handles inbound doorbell interrupts. Copies doorbell entry from an internal + * buffer into DB message FIFO and schedules deferred routine to process + * queued DBs. 
+ */ +static int +tsi721_dbell_handler(struct rio_mport *mport) +{ + struct tsi721_device *priv = mport->priv; + u32 regval; + + /* Disable IDB interrupts */ + regval = ioread32(priv->regs + TSI721_SR_CHINTE(IDB_QUEUE)); + regval &= ~TSI721_SR_CHINT_IDBQRCV; + iowrite32(regval, + priv->regs + TSI721_SR_CHINTE(IDB_QUEUE)); + + schedule_work(&priv->idb_work); + + return 0; +} + +static void tsi721_db_dpc(struct work_struct *work) +{ + struct tsi721_device *priv = container_of(work, struct tsi721_device, + idb_work); + struct rio_mport *mport; + struct rio_dbell *dbell; + int found = 0; + u32 wr_ptr, rd_ptr; + u64 *idb_entry; + u32 regval; + union { + u64 msg; + u8 bytes[8]; + } idb; + + /* + * Process queued inbound doorbells + */ + mport = priv->mport; + + wr_ptr = ioread32(priv->regs + TSI721_IDQ_WP(IDB_QUEUE)); + rd_ptr = ioread32(priv->regs + TSI721_IDQ_RP(IDB_QUEUE)); + + while (wr_ptr != rd_ptr) { + idb_entry = (u64 *)(priv->idb_base + + (TSI721_IDB_ENTRY_SIZE * rd_ptr)); + rd_ptr++; + idb.msg = *idb_entry; + *idb_entry = 0; + + /* Process one doorbell */ + list_for_each_entry(dbell, &mport->dbells, node) { + if ((dbell->res->start <= DBELL_INF(idb.bytes)) && + (dbell->res->end >= DBELL_INF(idb.bytes))) { + found = 1; + break; + } + } + + if (found) { + dbell->dinb(mport, dbell->dev_id, DBELL_SID(idb.bytes), + DBELL_TID(idb.bytes), DBELL_INF(idb.bytes)); + } else { + dev_dbg(&priv->pdev->dev, + "spurious inb doorbell, sid %2.2x tid %2.2x" + " info %4.4x\n", DBELL_SID(idb.bytes), + DBELL_TID(idb.bytes), DBELL_INF(idb.bytes)); + } + } + + iowrite32(rd_ptr & (IDB_QSIZE - 1), + priv->regs + TSI721_IDQ_RP(IDB_QUEUE)); + + /* Re-enable IDB interrupts */ + regval = ioread32(priv->regs + TSI721_SR_CHINTE(IDB_QUEUE)); + regval |= TSI721_SR_CHINT_IDBQRCV; + iowrite32(regval, + priv->regs + TSI721_SR_CHINTE(IDB_QUEUE)); +} + +/** + * tsi721_srio_msix - Tsi721 MSI-X SRIO MAC interrupt handler + * @irq: Linux interrupt number + * @ptr: Pointer to interrupt-specific data (mport structure) + * + * Handles Tsi721 interrupts from SRIO MAC. + */ +static irqreturn_t tsi721_srio_msix(int irq, void *ptr) +{ + struct tsi721_device *priv = ((struct rio_mport *)ptr)->priv; + u32 srio_int; + + /* Service SRIO MAC interrupts */ + srio_int = ioread32(priv->regs + TSI721_RIO_EM_INT_STAT); + if (srio_int & TSI721_RIO_EM_INT_STAT_PW_RX) + tsi721_pw_handler((struct rio_mport *)ptr); + + return IRQ_HANDLED; +} + +/** + * tsi721_sr2pc_ch_msix - Tsi721 MSI-X SR2PC Channel interrupt handler + * @irq: Linux interrupt number + * @ptr: Pointer to interrupt-specific data (mport structure) + * + * Handles Tsi721 interrupts from SR2PC Channel. + * NOTE: At this moment services only one SR2PC channel associated with inbound + * doorbells. 
+ */ +static irqreturn_t tsi721_sr2pc_ch_msix(int irq, void *ptr) +{ + struct tsi721_device *priv = ((struct rio_mport *)ptr)->priv; + u32 sr_ch_int; + + /* Service Inbound DB interrupt from SR2PC channel */ + sr_ch_int = ioread32(priv->regs + TSI721_SR_CHINT(IDB_QUEUE)); + if (sr_ch_int & TSI721_SR_CHINT_IDBQRCV) + tsi721_dbell_handler((struct rio_mport *)ptr); + + /* Clear interrupts */ + iowrite32(sr_ch_int, priv->regs + TSI721_SR_CHINT(IDB_QUEUE)); + /* Read back to ensure that interrupt was cleared */ + sr_ch_int = ioread32(priv->regs + TSI721_SR_CHINT(IDB_QUEUE)); + + return IRQ_HANDLED; +} + +/** + * tsi721_omsg_msix - MSI-X interrupt handler for outbound messaging + * @irq: Linux interrupt number + * @ptr: Pointer to interrupt-specific data (mport structure) + * + * Handles outbound messaging interrupts signaled using MSI-X. + */ +static irqreturn_t tsi721_omsg_msix(int irq, void *ptr) +{ + struct tsi721_device *priv = ((struct rio_mport *)ptr)->priv; + int mbox; + + mbox = (irq - priv->msix[TSI721_VECT_OMB0_DONE].vector) % RIO_MAX_MBOX; + tsi721_omsg_handler(priv, mbox); + return IRQ_HANDLED; +} + +/** + * tsi721_imsg_msix - MSI-X interrupt handler for inbound messaging + * @irq: Linux interrupt number + * @ptr: Pointer to interrupt-specific data (mport structure) + * + * Handles inbound messaging interrupts signaled using MSI-X. + */ +static irqreturn_t tsi721_imsg_msix(int irq, void *ptr) +{ + struct tsi721_device *priv = ((struct rio_mport *)ptr)->priv; + int mbox; + + mbox = (irq - priv->msix[TSI721_VECT_IMB0_RCV].vector) % RIO_MAX_MBOX; + tsi721_imsg_handler(priv, mbox + 4); + return IRQ_HANDLED; +} + +/** + * tsi721_irqhandler - Tsi721 interrupt handler + * @irq: Linux interrupt number + * @ptr: Pointer to interrupt-specific data (mport structure) + * + * Handles Tsi721 interrupts signaled using MSI and INTA. Checks reported + * interrupt events and calls an event-specific handler(s). 
+ */ +static irqreturn_t tsi721_irqhandler(int irq, void *ptr) +{ + struct rio_mport *mport = (struct rio_mport *)ptr; + struct tsi721_device *priv = mport->priv; + u32 dev_int; + u32 dev_ch_int; + u32 intval; + u32 ch_inte; + + dev_int = ioread32(priv->regs + TSI721_DEV_INT); + if (!dev_int) + return IRQ_NONE; + + dev_ch_int = ioread32(priv->regs + TSI721_DEV_CHAN_INT); + + if (dev_int & TSI721_DEV_INT_SR2PC_CH) { + /* Service SR2PC Channel interrupts */ + if (dev_ch_int & TSI721_INT_SR2PC_CHAN(IDB_QUEUE)) { + /* Service Inbound Doorbell interrupt */ + intval = ioread32(priv->regs + + TSI721_SR_CHINT(IDB_QUEUE)); + if (intval & TSI721_SR_CHINT_IDBQRCV) + tsi721_dbell_handler(mport); + else + dev_info(&priv->pdev->dev, + "Unsupported SR_CH_INT %x\n", intval); + + /* Clear interrupts */ + iowrite32(intval, + priv->regs + TSI721_SR_CHINT(IDB_QUEUE)); + (void)ioread32(priv->regs + TSI721_SR_CHINT(IDB_QUEUE)); + } + } + + if (dev_int & TSI721_DEV_INT_SMSG_CH) { + int ch; + + /* + * Service channel interrupts from Messaging Engine + */ + + if (dev_ch_int & TSI721_INT_IMSG_CHAN_M) { /* Inbound Msg */ + /* Disable signaled OB MSG Channel interrupts */ + ch_inte = ioread32(priv->regs + TSI721_DEV_CHAN_INTE); + ch_inte &= ~(dev_ch_int & TSI721_INT_IMSG_CHAN_M); + iowrite32(ch_inte, priv->regs + TSI721_DEV_CHAN_INTE); + + /* + * Process Inbound Message interrupt for each MBOX + */ + for (ch = 4; ch < RIO_MAX_MBOX + 4; ch++) { + if (!(dev_ch_int & TSI721_INT_IMSG_CHAN(ch))) + continue; + tsi721_imsg_handler(priv, ch); + } + } + + if (dev_ch_int & TSI721_INT_OMSG_CHAN_M) { /* Outbound Msg */ + /* Disable signaled OB MSG Channel interrupts */ + ch_inte = ioread32(priv->regs + TSI721_DEV_CHAN_INTE); + ch_inte &= ~(dev_ch_int & TSI721_INT_OMSG_CHAN_M); + iowrite32(ch_inte, priv->regs + TSI721_DEV_CHAN_INTE); + + /* + * Process Outbound Message interrupts for each MBOX + */ + + for (ch = 0; ch < RIO_MAX_MBOX; ch++) { + if (!(dev_ch_int & TSI721_INT_OMSG_CHAN(ch))) + continue; + tsi721_omsg_handler(priv, ch); + } + } + } + + if (dev_int & TSI721_DEV_INT_SRIO) { + /* Service SRIO MAC interrupts */ + intval = ioread32(priv->regs + TSI721_RIO_EM_INT_STAT); + if (intval & TSI721_RIO_EM_INT_STAT_PW_RX) + tsi721_pw_handler(mport); + } + + return IRQ_HANDLED; +} + +static void tsi721_interrupts_init(struct tsi721_device *priv) +{ + u32 intr; + + /* Enable IDB interrupts */ + iowrite32(TSI721_SR_CHINT_ALL, + priv->regs + TSI721_SR_CHINT(IDB_QUEUE)); + iowrite32(TSI721_SR_CHINT_IDBQRCV, + priv->regs + TSI721_SR_CHINTE(IDB_QUEUE)); + iowrite32(TSI721_INT_SR2PC_CHAN(IDB_QUEUE), + priv->regs + TSI721_DEV_CHAN_INTE); + + /* Enable SRIO MAC interrupts */ + iowrite32(TSI721_RIO_EM_DEV_INT_EN_INT, + priv->regs + TSI721_RIO_EM_DEV_INT_EN); + + if (priv->flags & TSI721_USING_MSIX) + intr = TSI721_DEV_INT_SRIO; + else + intr = TSI721_DEV_INT_SR2PC_CH | TSI721_DEV_INT_SRIO | + TSI721_DEV_INT_SMSG_CH; + + iowrite32(intr, priv->regs + TSI721_DEV_INTE); + (void)ioread32(priv->regs + TSI721_DEV_INTE); +} + +/** + * tsi721_request_msix - register interrupt service for MSI-X mode. + * @mport: RapidIO master port structure + * + * Registers MSI-X interrupt service routines for interrupts that are active + * immediately after mport initialization. Messaging interrupt service routines + * should be registered during corresponding open requests. 
+ */ +static int tsi721_request_msix(struct rio_mport *mport) +{ + struct tsi721_device *priv = mport->priv; + int err = 0; + + err = request_irq(priv->msix[TSI721_VECT_IDB].vector, + tsi721_sr2pc_ch_msix, 0, + priv->msix[TSI721_VECT_IDB].irq_name, (void *)mport); + if (err) + goto out; + + err = request_irq(priv->msix[TSI721_VECT_PWRX].vector, + tsi721_srio_msix, 0, + priv->msix[TSI721_VECT_PWRX].irq_name, (void *)mport); +out: + return err; +} + +static int tsi721_request_irq(struct rio_mport *mport) +{ + struct tsi721_device *priv = mport->priv; + int err; + + if (priv->flags & TSI721_USING_MSIX) + err = tsi721_request_msix(mport); + else + err = request_irq(priv->pdev->irq, tsi721_irqhandler, + (priv->flags & TSI721_USING_MSI) ? 0 : IRQF_SHARED, + DRV_NAME, (void *)mport); + + if (err) + dev_err(&priv->pdev->dev, + "Unable to allocate interrupt, Error: %d\n", err); + + return err; +} + +/** + * tsi721_enable_msix - Attempts to enable MSI-X support for Tsi721. + * @priv: pointer to tsi721 private data + * + * Configures MSI-X support for Tsi721. Supports only an exact number + * of requested vectors. + */ +static int tsi721_enable_msix(struct tsi721_device *priv) +{ + struct msix_entry entries[TSI721_VECT_MAX]; + int err; + int i; + + entries[TSI721_VECT_IDB].entry = TSI721_MSIX_SR2PC_IDBQ_RCV(IDB_QUEUE); + entries[TSI721_VECT_PWRX].entry = TSI721_MSIX_SRIO_MAC_INT; + + /* + * Initialize MSI-X entries for Messaging Engine: + * this driver supports four RIO mailboxes (inbound and outbound) + * NOTE: Inbound message MBOX 0...4 use IB channels 4...7. Therefore + * offset +4 is added to IB MBOX number. + */ + for (i = 0; i < RIO_MAX_MBOX; i++) { + entries[TSI721_VECT_IMB0_RCV + i].entry = + TSI721_MSIX_IMSG_DQ_RCV(i + 4); + entries[TSI721_VECT_IMB0_INT + i].entry = + TSI721_MSIX_IMSG_INT(i + 4); + entries[TSI721_VECT_OMB0_DONE + i].entry = + TSI721_MSIX_OMSG_DONE(i); + entries[TSI721_VECT_OMB0_INT + i].entry = + TSI721_MSIX_OMSG_INT(i); + } + + err = pci_enable_msix(priv->pdev, entries, ARRAY_SIZE(entries)); + if (err) { + if (err > 0) + dev_info(&priv->pdev->dev, + "Only %d MSI-X vectors available, " + "not using MSI-X\n", err); + return err; + } + + /* + * Copy MSI-X vector information into tsi721 private structure + */ + priv->msix[TSI721_VECT_IDB].vector = entries[TSI721_VECT_IDB].vector; + snprintf(priv->msix[TSI721_VECT_IDB].irq_name, IRQ_DEVICE_NAME_MAX, + DRV_NAME "-idb@pci:%s", pci_name(priv->pdev)); + priv->msix[TSI721_VECT_PWRX].vector = entries[TSI721_VECT_PWRX].vector; + snprintf(priv->msix[TSI721_VECT_PWRX].irq_name, IRQ_DEVICE_NAME_MAX, + DRV_NAME "-pwrx@pci:%s", pci_name(priv->pdev)); + + for (i = 0; i < RIO_MAX_MBOX; i++) { + priv->msix[TSI721_VECT_IMB0_RCV + i].vector = + entries[TSI721_VECT_IMB0_RCV + i].vector; + snprintf(priv->msix[TSI721_VECT_IMB0_RCV + i].irq_name, + IRQ_DEVICE_NAME_MAX, DRV_NAME "-imbr%d@pci:%s", + i, pci_name(priv->pdev)); + + priv->msix[TSI721_VECT_IMB0_INT + i].vector = + entries[TSI721_VECT_IMB0_INT + i].vector; + snprintf(priv->msix[TSI721_VECT_IMB0_INT + i].irq_name, + IRQ_DEVICE_NAME_MAX, DRV_NAME "-imbi%d@pci:%s", + i, pci_name(priv->pdev)); + + priv->msix[TSI721_VECT_OMB0_DONE + i].vector = + entries[TSI721_VECT_OMB0_DONE + i].vector; + snprintf(priv->msix[TSI721_VECT_OMB0_DONE + i].irq_name, + IRQ_DEVICE_NAME_MAX, DRV_NAME "-ombd%d@pci:%s", + i, pci_name(priv->pdev)); + + priv->msix[TSI721_VECT_OMB0_INT + i].vector = + entries[TSI721_VECT_OMB0_INT + i].vector; + snprintf(priv->msix[TSI721_VECT_OMB0_INT + i].irq_name, + IRQ_DEVICE_NAME_MAX, 
DRV_NAME "-ombi%d@pci:%s", + i, pci_name(priv->pdev)); + } + + return 0; +} + +/** + * tsi721_init_pc2sr_mapping - initializes outbound (PCIe->SRIO) + * translation regions. + * @priv: pointer to tsi721 private data + * + * Disables SREP translation regions. + */ +static void tsi721_init_pc2sr_mapping(struct tsi721_device *priv) +{ + int i; + + /* Disable all PC2SR translation windows */ + for (i = 0; i < TSI721_OBWIN_NUM; i++) + iowrite32(0, priv->regs + TSI721_OBWINLB(i)); +} + +/** + * tsi721_init_sr2pc_mapping - initializes inbound (SRIO->PCIe) + * translation regions. + * @priv: pointer to tsi721 private data + * + * Disables inbound windows. + */ +static void tsi721_init_sr2pc_mapping(struct tsi721_device *priv) +{ + int i; + + /* Disable all SR2PC inbound windows */ + for (i = 0; i < TSI721_IBWIN_NUM; i++) + iowrite32(0, priv->regs + TSI721_IBWINLB(i)); +} + +/** + * tsi721_port_write_init - Inbound port write interface init + * @priv: pointer to tsi721 private data + * + * Initializes inbound port write handler. + * Returns %0 on success or %-ENOMEM on failure. + */ +static int tsi721_port_write_init(struct tsi721_device *priv) +{ + priv->pw_discard_count = 0; + INIT_WORK(&priv->pw_work, tsi721_pw_dpc); + spin_lock_init(&priv->pw_fifo_lock); + if (kfifo_alloc(&priv->pw_fifo, + TSI721_RIO_PW_MSG_SIZE * 32, GFP_KERNEL)) { + dev_err(&priv->pdev->dev, "PW FIFO allocation failed\n"); + return -ENOMEM; + } + + /* Use reliable port-write capture mode */ + iowrite32(TSI721_RIO_PW_CTL_PWC_REL, priv->regs + TSI721_RIO_PW_CTL); + return 0; +} + +static int tsi721_doorbell_init(struct tsi721_device *priv) +{ + /* Outbound Doorbells do not require any setup. + * Tsi721 uses dedicated PCI BAR1 to generate doorbells. + * That BAR1 was mapped during the probe routine. 
+ */ + + /* Initialize Inbound Doorbell processing DPC and queue */ + priv->db_discard_count = 0; + INIT_WORK(&priv->idb_work, tsi721_db_dpc); + + /* Allocate buffer for inbound doorbells queue */ + priv->idb_base = dma_alloc_coherent(&priv->pdev->dev, + IDB_QSIZE * TSI721_IDB_ENTRY_SIZE, + &priv->idb_dma, GFP_KERNEL); + if (!priv->idb_base) + return -ENOMEM; + + memset(priv->idb_base, 0, IDB_QSIZE * TSI721_IDB_ENTRY_SIZE); + + dev_dbg(&priv->pdev->dev, "Allocated IDB buffer @ %p (phys = %llx)\n", + priv->idb_base, (unsigned long long)priv->idb_dma); + + iowrite32(TSI721_IDQ_SIZE_VAL(IDB_QSIZE), + priv->regs + TSI721_IDQ_SIZE(IDB_QUEUE)); + iowrite32(((u64)priv->idb_dma >> 32), + priv->regs + TSI721_IDQ_BASEU(IDB_QUEUE)); + iowrite32(((u64)priv->idb_dma & TSI721_IDQ_BASEL_ADDR), + priv->regs + TSI721_IDQ_BASEL(IDB_QUEUE)); + /* Enable accepting all inbound doorbells */ + iowrite32(0, priv->regs + TSI721_IDQ_MASK(IDB_QUEUE)); + + iowrite32(TSI721_IDQ_INIT, priv->regs + TSI721_IDQ_CTL(IDB_QUEUE)); + + iowrite32(0, priv->regs + TSI721_IDQ_RP(IDB_QUEUE)); + + return 0; +} + +static void tsi721_doorbell_free(struct tsi721_device *priv) +{ + if (priv->idb_base == NULL) + return; + + /* Free buffer allocated for inbound doorbell queue */ + dma_free_coherent(&priv->pdev->dev, IDB_QSIZE * TSI721_IDB_ENTRY_SIZE, + priv->idb_base, priv->idb_dma); + priv->idb_base = NULL; +} + +static int tsi721_bdma_ch_init(struct tsi721_device *priv, int chnum) +{ + struct tsi721_dma_desc *bd_ptr; + u64 *sts_ptr; + dma_addr_t bd_phys, sts_phys; + int sts_size; + int bd_num = priv->bdma[chnum].bd_num; + + dev_dbg(&priv->pdev->dev, "Init Block DMA Engine, CH%d\n", chnum); + + /* + * Initialize DMA channel for maintenance requests + */ + + /* Allocate space for DMA descriptors */ + bd_ptr = dma_alloc_coherent(&priv->pdev->dev, + bd_num * sizeof(struct tsi721_dma_desc), + &bd_phys, GFP_KERNEL); + if (!bd_ptr) + return -ENOMEM; + + priv->bdma[chnum].bd_phys = bd_phys; + priv->bdma[chnum].bd_base = bd_ptr; + + memset(bd_ptr, 0, bd_num * sizeof(struct tsi721_dma_desc)); + + dev_dbg(&priv->pdev->dev, "DMA descriptors @ %p (phys = %llx)\n", + bd_ptr, (unsigned long long)bd_phys); + + /* Allocate space for descriptor status FIFO */ + sts_size = (bd_num >= TSI721_DMA_MINSTSSZ) ? 
+ bd_num : TSI721_DMA_MINSTSSZ; + sts_size = roundup_pow_of_two(sts_size); + sts_ptr = dma_alloc_coherent(&priv->pdev->dev, + sts_size * sizeof(struct tsi721_dma_sts), + &sts_phys, GFP_KERNEL); + if (!sts_ptr) { + /* Free space allocated for DMA descriptors */ + dma_free_coherent(&priv->pdev->dev, + bd_num * sizeof(struct tsi721_dma_desc), + bd_ptr, bd_phys); + priv->bdma[chnum].bd_base = NULL; + return -ENOMEM; + } + + priv->bdma[chnum].sts_phys = sts_phys; + priv->bdma[chnum].sts_base = sts_ptr; + priv->bdma[chnum].sts_size = sts_size; + + memset(sts_ptr, 0, sts_size); + + dev_dbg(&priv->pdev->dev, + "desc status FIFO @ %p (phys = %llx) size=0x%x\n", + sts_ptr, (unsigned long long)sts_phys, sts_size); + + /* Initialize DMA descriptors ring */ + bd_ptr[bd_num - 1].type_id = cpu_to_le32(DTYPE3 << 29); + bd_ptr[bd_num - 1].next_lo = cpu_to_le32((u64)bd_phys & + TSI721_DMAC_DPTRL_MASK); + bd_ptr[bd_num - 1].next_hi = cpu_to_le32((u64)bd_phys >> 32); + + /* Setup DMA descriptor pointers */ + iowrite32(((u64)bd_phys >> 32), + priv->regs + TSI721_DMAC_DPTRH(chnum)); + iowrite32(((u64)bd_phys & TSI721_DMAC_DPTRL_MASK), + priv->regs + TSI721_DMAC_DPTRL(chnum)); + + /* Setup descriptor status FIFO */ + iowrite32(((u64)sts_phys >> 32), + priv->regs + TSI721_DMAC_DSBH(chnum)); + iowrite32(((u64)sts_phys & TSI721_DMAC_DSBL_MASK), + priv->regs + TSI721_DMAC_DSBL(chnum)); + iowrite32(TSI721_DMAC_DSSZ_SIZE(sts_size), + priv->regs + TSI721_DMAC_DSSZ(chnum)); + + /* Clear interrupt bits */ + iowrite32(TSI721_DMAC_INT_ALL, + priv->regs + TSI721_DMAC_INT(chnum)); + + (void)ioread32(priv->regs + TSI721_DMAC_INT(chnum)); + + /* Toggle DMA channel initialization */ + iowrite32(TSI721_DMAC_CTL_INIT, priv->regs + TSI721_DMAC_CTL(chnum)); + (void)ioread32(priv->regs + TSI721_DMAC_CTL(chnum)); + udelay(10); + + return 0; +} + +static int tsi721_bdma_ch_free(struct tsi721_device *priv, int chnum) +{ + u32 ch_stat; + + if (priv->bdma[chnum].bd_base == NULL) + return 0; + + /* Check if DMA channel still running */ + ch_stat = ioread32(priv->regs + TSI721_DMAC_STS(chnum)); + if (ch_stat & TSI721_DMAC_STS_RUN) + return -EFAULT; + + /* Put DMA channel into init state */ + iowrite32(TSI721_DMAC_CTL_INIT, + priv->regs + TSI721_DMAC_CTL(chnum)); + + /* Free space allocated for DMA descriptors */ + dma_free_coherent(&priv->pdev->dev, + priv->bdma[chnum].bd_num * sizeof(struct tsi721_dma_desc), + priv->bdma[chnum].bd_base, priv->bdma[chnum].bd_phys); + priv->bdma[chnum].bd_base = NULL; + + /* Free space allocated for status FIFO */ + dma_free_coherent(&priv->pdev->dev, + priv->bdma[chnum].sts_size * sizeof(struct tsi721_dma_sts), + priv->bdma[chnum].sts_base, priv->bdma[chnum].sts_phys); + priv->bdma[chnum].sts_base = NULL; + return 0; +} + +static int tsi721_bdma_init(struct tsi721_device *priv) +{ + /* Initialize BDMA channel allocated for RapidIO maintenance read/write + * request generation + */ + priv->bdma[TSI721_DMACH_MAINT].bd_num = 2; + if (tsi721_bdma_ch_init(priv, TSI721_DMACH_MAINT)) { + dev_err(&priv->pdev->dev, "Unable to initialize maintenance DMA" + " channel %d, aborting\n", TSI721_DMACH_MAINT); + return -ENOMEM; + } + + return 0; +} + +static void tsi721_bdma_free(struct tsi721_device *priv) +{ + tsi721_bdma_ch_free(priv, TSI721_DMACH_MAINT); +} + +/* Enable Inbound Messaging Interrupts */ +static void +tsi721_imsg_interrupt_enable(struct tsi721_device *priv, int ch, + u32 inte_mask) +{ + u32 rval; + + if (!inte_mask) + return; + + /* Clear pending Inbound Messaging interrupts */ + iowrite32(inte_mask, 
priv->regs + TSI721_IBDMAC_INT(ch)); + + /* Enable Inbound Messaging interrupts */ + rval = ioread32(priv->regs + TSI721_IBDMAC_INTE(ch)); + iowrite32(rval | inte_mask, priv->regs + TSI721_IBDMAC_INTE(ch)); + + if (priv->flags & TSI721_USING_MSIX) + return; /* Finished if we are in MSI-X mode */ + + /* + * For MSI and INTA interrupt signalling we need to enable next levels + */ + + /* Enable Device Channel Interrupt */ + rval = ioread32(priv->regs + TSI721_DEV_CHAN_INTE); + iowrite32(rval | TSI721_INT_IMSG_CHAN(ch), + priv->regs + TSI721_DEV_CHAN_INTE); +} + +/* Disable Inbound Messaging Interrupts */ +static void +tsi721_imsg_interrupt_disable(struct tsi721_device *priv, int ch, + u32 inte_mask) +{ + u32 rval; + + if (!inte_mask) + return; + + /* Clear pending Inbound Messaging interrupts */ + iowrite32(inte_mask, priv->regs + TSI721_IBDMAC_INT(ch)); + + /* Disable Inbound Messaging interrupts */ + rval = ioread32(priv->regs + TSI721_IBDMAC_INTE(ch)); + rval &= ~inte_mask; + iowrite32(rval, priv->regs + TSI721_IBDMAC_INTE(ch)); + + if (priv->flags & TSI721_USING_MSIX) + return; /* Finished if we are in MSI-X mode */ + + /* + * For MSI and INTA interrupt signalling we need to disable next levels + */ + + /* Disable Device Channel Interrupt */ + rval = ioread32(priv->regs + TSI721_DEV_CHAN_INTE); + rval &= ~TSI721_INT_IMSG_CHAN(ch); + iowrite32(rval, priv->regs + TSI721_DEV_CHAN_INTE); +} + +/* Enable Outbound Messaging interrupts */ +static void +tsi721_omsg_interrupt_enable(struct tsi721_device *priv, int ch, + u32 inte_mask) +{ + u32 rval; + + if (!inte_mask) + return; + + /* Clear pending Outbound Messaging interrupts */ + iowrite32(inte_mask, priv->regs + TSI721_OBDMAC_INT(ch)); + + /* Enable Outbound Messaging channel interrupts */ + rval = ioread32(priv->regs + TSI721_OBDMAC_INTE(ch)); + iowrite32(rval | inte_mask, priv->regs + TSI721_OBDMAC_INTE(ch)); + + if (priv->flags & TSI721_USING_MSIX) + return; /* Finished if we are in MSI-X mode */ + + /* + * For MSI and INTA interrupt signalling we need to enable next levels + */ + + /* Enable Device Channel Interrupt */ + rval = ioread32(priv->regs + TSI721_DEV_CHAN_INTE); + iowrite32(rval | TSI721_INT_OMSG_CHAN(ch), + priv->regs + TSI721_DEV_CHAN_INTE); +} + +/* Disable Outbound Messaging interrupts */ +static void +tsi721_omsg_interrupt_disable(struct tsi721_device *priv, int ch, + u32 inte_mask) +{ + u32 rval; + + if (!inte_mask) + return; + + /* Clear pending Outbound Messaging interrupts */ + iowrite32(inte_mask, priv->regs + TSI721_OBDMAC_INT(ch)); + + /* Disable Outbound Messaging interrupts */ + rval = ioread32(priv->regs + TSI721_OBDMAC_INTE(ch)); + rval &= ~inte_mask; + iowrite32(rval, priv->regs + TSI721_OBDMAC_INTE(ch)); + + if (priv->flags & TSI721_USING_MSIX) + return; /* Finished if we are in MSI-X mode */ + + /* + * For MSI and INTA interrupt signalling we need to disable next levels + */ + + /* Disable Device Channel Interrupt */ + rval = ioread32(priv->regs + TSI721_DEV_CHAN_INTE); + rval &= ~TSI721_INT_OMSG_CHAN(ch); + iowrite32(rval, priv->regs + TSI721_DEV_CHAN_INTE); +} + +/** + * tsi721_add_outb_message - Add message to the Tsi721 outbound message queue + * @mport: Master port with outbound message queue + * @rdev: Target of outbound message + * @mbox: Outbound mailbox + * @buffer: Message to add to outbound queue + * @len: Length of message + */ +static int +tsi721_add_outb_message(struct rio_mport *mport, struct rio_dev *rdev, int mbox, + void *buffer, size_t len) +{ + struct tsi721_device *priv = mport->priv; + 
struct tsi721_omsg_desc *desc; + u32 tx_slot; + + if (!priv->omsg_init[mbox] || + len > TSI721_MSG_MAX_SIZE || len < 8) + return -EINVAL; + + tx_slot = priv->omsg_ring[mbox].tx_slot; + + /* Copy copy message into transfer buffer */ + memcpy(priv->omsg_ring[mbox].omq_base[tx_slot], buffer, len); + + if (len & 0x7) + len += 8; + + /* Build descriptor associated with buffer */ + desc = priv->omsg_ring[mbox].omd_base; + desc[tx_slot].type_id = cpu_to_le32((DTYPE4 << 29) | rdev->destid); + if (tx_slot % 4 == 0) + desc[tx_slot].type_id |= cpu_to_le32(TSI721_OMD_IOF); + + desc[tx_slot].msg_info = + cpu_to_le32((mport->sys_size << 26) | (mbox << 22) | + (0xe << 12) | (len & 0xff8)); + desc[tx_slot].bufptr_lo = + cpu_to_le32((u64)priv->omsg_ring[mbox].omq_phys[tx_slot] & + 0xffffffff); + desc[tx_slot].bufptr_hi = + cpu_to_le32((u64)priv->omsg_ring[mbox].omq_phys[tx_slot] >> 32); + + priv->omsg_ring[mbox].wr_count++; + + /* Go to next descriptor */ + if (++priv->omsg_ring[mbox].tx_slot == priv->omsg_ring[mbox].size) { + priv->omsg_ring[mbox].tx_slot = 0; + /* Move through the ring link descriptor at the end */ + priv->omsg_ring[mbox].wr_count++; + } + + mb(); + + /* Set new write count value */ + iowrite32(priv->omsg_ring[mbox].wr_count, + priv->regs + TSI721_OBDMAC_DWRCNT(mbox)); + (void)ioread32(priv->regs + TSI721_OBDMAC_DWRCNT(mbox)); + + return 0; +} + +/** + * tsi721_omsg_handler - Outbound Message Interrupt Handler + * @priv: pointer to tsi721 private data + * @ch: number of OB MSG channel to service + * + * Services channel interrupts from outbound messaging engine. + */ +static void tsi721_omsg_handler(struct tsi721_device *priv, int ch) +{ + u32 omsg_int; + + spin_lock(&priv->omsg_ring[ch].lock); + + omsg_int = ioread32(priv->regs + TSI721_OBDMAC_INT(ch)); + + if (omsg_int & TSI721_OBDMAC_INT_ST_FULL) + dev_info(&priv->pdev->dev, + "OB MBOX%d: Status FIFO is full\n", ch); + + if (omsg_int & (TSI721_OBDMAC_INT_DONE | TSI721_OBDMAC_INT_IOF_DONE)) { + u32 srd_ptr; + u64 *sts_ptr, last_ptr = 0, prev_ptr = 0; + int i, j; + u32 tx_slot; + + /* + * Find last successfully processed descriptor + */ + + /* Check and clear descriptor status FIFO entries */ + srd_ptr = priv->omsg_ring[ch].sts_rdptr; + sts_ptr = priv->omsg_ring[ch].sts_base; + j = srd_ptr * 8; + while (sts_ptr[j]) { + for (i = 0; i < 8 && sts_ptr[j]; i++, j++) { + prev_ptr = last_ptr; + last_ptr = sts_ptr[j]; + sts_ptr[j] = 0; + } + + ++srd_ptr; + srd_ptr %= priv->omsg_ring[ch].sts_size; + j = srd_ptr * 8; + } + + if (last_ptr == 0) + goto no_sts_update; + + priv->omsg_ring[ch].sts_rdptr = srd_ptr; + iowrite32(srd_ptr, priv->regs + TSI721_OBDMAC_DSRP(ch)); + + if (!priv->mport->outb_msg[ch].mcback) + goto no_sts_update; + + /* Inform upper layer about transfer completion */ + + tx_slot = (last_ptr - (u64)priv->omsg_ring[ch].omd_phys)/ + sizeof(struct tsi721_omsg_desc); + + /* + * Check if this is a Link Descriptor (LD). + * If yes, ignore LD and use descriptor processed + * before LD. 
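 *
 * (Reviewer's note, not part of the patch: last_ptr holds the bus address
 * of the last completed descriptor as reported through the status FIFO;
 * dividing by the descriptor size turns it back into a ring index, and an
 * index equal to the ring size means the engine last fetched the
 * wrap-around link descriptor rather than a real message descriptor.)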
+ */ + if (tx_slot == priv->omsg_ring[ch].size) { + if (prev_ptr) + tx_slot = (prev_ptr - + (u64)priv->omsg_ring[ch].omd_phys)/ + sizeof(struct tsi721_omsg_desc); + else + goto no_sts_update; + } + + /* Move slot index to the next message to be sent */ + ++tx_slot; + if (tx_slot == priv->omsg_ring[ch].size) + tx_slot = 0; + BUG_ON(tx_slot >= priv->omsg_ring[ch].size); + priv->mport->outb_msg[ch].mcback(priv->mport, + priv->omsg_ring[ch].dev_id, ch, + tx_slot); + } + +no_sts_update: + + if (omsg_int & TSI721_OBDMAC_INT_ERROR) { + /* + * Outbound message operation aborted due to error, + * reinitialize OB MSG channel + */ + + dev_dbg(&priv->pdev->dev, "OB MSG ABORT ch_stat=%x\n", + ioread32(priv->regs + TSI721_OBDMAC_STS(ch))); + + iowrite32(TSI721_OBDMAC_INT_ERROR, + priv->regs + TSI721_OBDMAC_INT(ch)); + iowrite32(TSI721_OBDMAC_CTL_INIT, + priv->regs + TSI721_OBDMAC_CTL(ch)); + (void)ioread32(priv->regs + TSI721_OBDMAC_CTL(ch)); + + /* Inform upper level to clear all pending tx slots */ + if (priv->mport->outb_msg[ch].mcback) + priv->mport->outb_msg[ch].mcback(priv->mport, + priv->omsg_ring[ch].dev_id, ch, + priv->omsg_ring[ch].tx_slot); + /* Synch tx_slot tracking */ + iowrite32(priv->omsg_ring[ch].tx_slot, + priv->regs + TSI721_OBDMAC_DRDCNT(ch)); + (void)ioread32(priv->regs + TSI721_OBDMAC_DRDCNT(ch)); + priv->omsg_ring[ch].wr_count = priv->omsg_ring[ch].tx_slot; + priv->omsg_ring[ch].sts_rdptr = 0; + } + + /* Clear channel interrupts */ + iowrite32(omsg_int, priv->regs + TSI721_OBDMAC_INT(ch)); + + if (!(priv->flags & TSI721_USING_MSIX)) { + u32 ch_inte; + + /* Re-enable channel interrupts */ + ch_inte = ioread32(priv->regs + TSI721_DEV_CHAN_INTE); + ch_inte |= TSI721_INT_OMSG_CHAN(ch); + iowrite32(ch_inte, priv->regs + TSI721_DEV_CHAN_INTE); + } + + spin_unlock(&priv->omsg_ring[ch].lock); +} + +/** + * tsi721_open_outb_mbox - Initialize Tsi721 outbound mailbox + * @mport: Master port implementing Outbound Messaging Engine + * @dev_id: Device specific pointer to pass on event + * @mbox: Mailbox to open + * @entries: Number of entries in the outbound mailbox ring + */ +static int tsi721_open_outb_mbox(struct rio_mport *mport, void *dev_id, + int mbox, int entries) +{ + struct tsi721_device *priv = mport->priv; + struct tsi721_omsg_desc *bd_ptr; + int i, rc = 0; + + if ((entries < TSI721_OMSGD_MIN_RING_SIZE) || + (entries > (TSI721_OMSGD_RING_SIZE)) || + (!is_power_of_2(entries)) || mbox >= RIO_MAX_MBOX) { + rc = -EINVAL; + goto out; + } + + priv->omsg_ring[mbox].dev_id = dev_id; + priv->omsg_ring[mbox].size = entries; + priv->omsg_ring[mbox].sts_rdptr = 0; + spin_lock_init(&priv->omsg_ring[mbox].lock); + + /* Outbound Msg Buffer allocation based on + the number of maximum descriptor entries */ + for (i = 0; i < entries; i++) { + priv->omsg_ring[mbox].omq_base[i] = + dma_alloc_coherent( + &priv->pdev->dev, TSI721_MSG_BUFFER_SIZE, + &priv->omsg_ring[mbox].omq_phys[i], + GFP_KERNEL); + if (priv->omsg_ring[mbox].omq_base[i] == NULL) { + dev_dbg(&priv->pdev->dev, + "Unable to allocate OB MSG data buffer for" + " MBOX%d\n", mbox); + rc = -ENOMEM; + goto out_buf; + } + } + + /* Outbound message descriptor allocation */ + priv->omsg_ring[mbox].omd_base = dma_alloc_coherent( + &priv->pdev->dev, + (entries + 1) * sizeof(struct tsi721_omsg_desc), + &priv->omsg_ring[mbox].omd_phys, GFP_KERNEL); + if (priv->omsg_ring[mbox].omd_base == NULL) { + dev_dbg(&priv->pdev->dev, + "Unable to allocate OB MSG descriptor memory " + "for MBOX%d\n", mbox); + rc = -ENOMEM; + goto out_buf; + } + + 
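/*
 * (Reviewer's note, not part of the patch: the "+ 1" in the allocation
 * above reserves the slot at index 'entries' for the DTYPE5 link
 * descriptor set up further down, whose next_lo/next_hi point back to
 * omd_phys and close the descriptor ring for the hardware.)
 */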
memset(priv->omsg_ring[mbox].omd_base, 0, + (entries + 1) * sizeof(struct tsi721_omsg_desc)); + priv->omsg_ring[mbox].tx_slot = 0; + + /* Outbound message descriptor status FIFO allocation */ + priv->omsg_ring[mbox].sts_size = roundup_pow_of_two(entries + 1); + priv->omsg_ring[mbox].sts_base = dma_alloc_coherent(&priv->pdev->dev, + priv->omsg_ring[mbox].sts_size * + sizeof(struct tsi721_dma_sts), + &priv->omsg_ring[mbox].sts_phys, GFP_KERNEL); + if (priv->omsg_ring[mbox].sts_base == NULL) { + dev_dbg(&priv->pdev->dev, + "Unable to allocate OB MSG descriptor status FIFO " + "for MBOX%d\n", mbox); + rc = -ENOMEM; + goto out_desc; + } + + memset(priv->omsg_ring[mbox].sts_base, 0, + entries * sizeof(struct tsi721_dma_sts)); + + /* + * Configure Outbound Messaging Engine + */ + + /* Setup Outbound Message descriptor pointer */ + iowrite32(((u64)priv->omsg_ring[mbox].omd_phys >> 32), + priv->regs + TSI721_OBDMAC_DPTRH(mbox)); + iowrite32(((u64)priv->omsg_ring[mbox].omd_phys & + TSI721_OBDMAC_DPTRL_MASK), + priv->regs + TSI721_OBDMAC_DPTRL(mbox)); + + /* Setup Outbound Message descriptor status FIFO */ + iowrite32(((u64)priv->omsg_ring[mbox].sts_phys >> 32), + priv->regs + TSI721_OBDMAC_DSBH(mbox)); + iowrite32(((u64)priv->omsg_ring[mbox].sts_phys & + TSI721_OBDMAC_DSBL_MASK), + priv->regs + TSI721_OBDMAC_DSBL(mbox)); + iowrite32(TSI721_DMAC_DSSZ_SIZE(priv->omsg_ring[mbox].sts_size), + priv->regs + (u32)TSI721_OBDMAC_DSSZ(mbox)); + + /* Enable interrupts */ + + if (priv->flags & TSI721_USING_MSIX) { + /* Request interrupt service if we are in MSI-X mode */ + rc = request_irq( + priv->msix[TSI721_VECT_OMB0_DONE + mbox].vector, + tsi721_omsg_msix, 0, + priv->msix[TSI721_VECT_OMB0_DONE + mbox].irq_name, + (void *)mport); + + if (rc) { + dev_dbg(&priv->pdev->dev, + "Unable to allocate MSI-X interrupt for " + "OBOX%d-DONE\n", mbox); + goto out_stat; + } + + rc = request_irq(priv->msix[TSI721_VECT_OMB0_INT + mbox].vector, + tsi721_omsg_msix, 0, + priv->msix[TSI721_VECT_OMB0_INT + mbox].irq_name, + (void *)mport); + + if (rc) { + dev_dbg(&priv->pdev->dev, + "Unable to allocate MSI-X interrupt for " + "MBOX%d-INT\n", mbox); + free_irq( + priv->msix[TSI721_VECT_OMB0_DONE + mbox].vector, + (void *)mport); + goto out_stat; + } + } + + tsi721_omsg_interrupt_enable(priv, mbox, TSI721_OBDMAC_INT_ALL); + + /* Initialize Outbound Message descriptors ring */ + bd_ptr = priv->omsg_ring[mbox].omd_base; + bd_ptr[entries].type_id = cpu_to_le32(DTYPE5 << 29); + bd_ptr[entries].next_lo = + cpu_to_le32((u64)priv->omsg_ring[mbox].omd_phys & + TSI721_OBDMAC_DPTRL_MASK); + bd_ptr[entries].next_hi = + cpu_to_le32((u64)priv->omsg_ring[mbox].omd_phys >> 32); + priv->omsg_ring[mbox].wr_count = 0; + mb(); + + /* Initialize Outbound Message engine */ + iowrite32(TSI721_OBDMAC_CTL_INIT, priv->regs + TSI721_OBDMAC_CTL(mbox)); + (void)ioread32(priv->regs + TSI721_OBDMAC_DWRCNT(mbox)); + udelay(10); + + priv->omsg_init[mbox] = 1; + + return 0; + +out_stat: + dma_free_coherent(&priv->pdev->dev, + priv->omsg_ring[mbox].sts_size * sizeof(struct tsi721_dma_sts), + priv->omsg_ring[mbox].sts_base, + priv->omsg_ring[mbox].sts_phys); + + priv->omsg_ring[mbox].sts_base = NULL; + +out_desc: + dma_free_coherent(&priv->pdev->dev, + (entries + 1) * sizeof(struct tsi721_omsg_desc), + priv->omsg_ring[mbox].omd_base, + priv->omsg_ring[mbox].omd_phys); + + priv->omsg_ring[mbox].omd_base = NULL; + +out_buf: + for (i = 0; i < priv->omsg_ring[mbox].size; i++) { + if (priv->omsg_ring[mbox].omq_base[i]) { + dma_free_coherent(&priv->pdev->dev, + 
TSI721_MSG_BUFFER_SIZE, + priv->omsg_ring[mbox].omq_base[i], + priv->omsg_ring[mbox].omq_phys[i]); + + priv->omsg_ring[mbox].omq_base[i] = NULL; + } + } + +out: + return rc; +} + +/** + * tsi721_close_outb_mbox - Close Tsi721 outbound mailbox + * @mport: Master port implementing the outbound message unit + * @mbox: Mailbox to close + */ +static void tsi721_close_outb_mbox(struct rio_mport *mport, int mbox) +{ + struct tsi721_device *priv = mport->priv; + u32 i; + + if (!priv->omsg_init[mbox]) + return; + priv->omsg_init[mbox] = 0; + + /* Disable Interrupts */ + + tsi721_omsg_interrupt_disable(priv, mbox, TSI721_OBDMAC_INT_ALL); + + if (priv->flags & TSI721_USING_MSIX) { + free_irq(priv->msix[TSI721_VECT_OMB0_DONE + mbox].vector, + (void *)mport); + free_irq(priv->msix[TSI721_VECT_OMB0_INT + mbox].vector, + (void *)mport); + } + + /* Free OMSG Descriptor Status FIFO */ + dma_free_coherent(&priv->pdev->dev, + priv->omsg_ring[mbox].sts_size * sizeof(struct tsi721_dma_sts), + priv->omsg_ring[mbox].sts_base, + priv->omsg_ring[mbox].sts_phys); + + priv->omsg_ring[mbox].sts_base = NULL; + + /* Free OMSG descriptors */ + dma_free_coherent(&priv->pdev->dev, + (priv->omsg_ring[mbox].size + 1) * + sizeof(struct tsi721_omsg_desc), + priv->omsg_ring[mbox].omd_base, + priv->omsg_ring[mbox].omd_phys); + + priv->omsg_ring[mbox].omd_base = NULL; + + /* Free message buffers */ + for (i = 0; i < priv->omsg_ring[mbox].size; i++) { + if (priv->omsg_ring[mbox].omq_base[i]) { + dma_free_coherent(&priv->pdev->dev, + TSI721_MSG_BUFFER_SIZE, + priv->omsg_ring[mbox].omq_base[i], + priv->omsg_ring[mbox].omq_phys[i]); + + priv->omsg_ring[mbox].omq_base[i] = NULL; + } + } +} + +/** + * tsi721_imsg_handler - Inbound Message Interrupt Handler + * @priv: pointer to tsi721 private data + * @ch: inbound message channel number to service + * + * Services channel interrupts from inbound messaging engine. 
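 *
 * (Reviewer's note, not part of the patch: on TSI721_IBDMAC_INT_DQ_RCV the
 * registered inbound-mailbox callback is invoked with slot == -1; the
 * client is expected to drain the queue itself, see the usage sketch under
 * tsi721_get_inb_message() below.)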
+ */ +static void tsi721_imsg_handler(struct tsi721_device *priv, int ch) +{ + u32 mbox = ch - 4; + u32 imsg_int; + + imsg_int = ioread32(priv->regs + TSI721_IBDMAC_INT(ch)); + + if (imsg_int & TSI721_IBDMAC_INT_SRTO) + dev_info(&priv->pdev->dev, "IB MBOX%d SRIO timeout\n", + mbox); + + if (imsg_int & TSI721_IBDMAC_INT_PC_ERROR) + dev_info(&priv->pdev->dev, "IB MBOX%d PCIe error\n", + mbox); + + if (imsg_int & TSI721_IBDMAC_INT_FQ_LOW) + dev_info(&priv->pdev->dev, + "IB MBOX%d IB free queue low\n", mbox); + + /* Clear IB channel interrupts */ + iowrite32(imsg_int, priv->regs + TSI721_IBDMAC_INT(ch)); + + /* If an IB Msg is received notify the upper layer */ + if (imsg_int & TSI721_IBDMAC_INT_DQ_RCV && + priv->mport->inb_msg[mbox].mcback) + priv->mport->inb_msg[mbox].mcback(priv->mport, + priv->imsg_ring[mbox].dev_id, mbox, -1); + + if (!(priv->flags & TSI721_USING_MSIX)) { + u32 ch_inte; + + /* Re-enable channel interrupts */ + ch_inte = ioread32(priv->regs + TSI721_DEV_CHAN_INTE); + ch_inte |= TSI721_INT_IMSG_CHAN(ch); + iowrite32(ch_inte, priv->regs + TSI721_DEV_CHAN_INTE); + } +} + +/** + * tsi721_open_inb_mbox - Initialize Tsi721 inbound mailbox + * @mport: Master port implementing the Inbound Messaging Engine + * @dev_id: Device specific pointer to pass on event + * @mbox: Mailbox to open + * @entries: Number of entries in the inbound mailbox ring + */ +static int tsi721_open_inb_mbox(struct rio_mport *mport, void *dev_id, + int mbox, int entries) +{ + struct tsi721_device *priv = mport->priv; + int ch = mbox + 4; + int i; + u64 *free_ptr; + int rc = 0; + + if ((entries < TSI721_IMSGD_MIN_RING_SIZE) || + (entries > TSI721_IMSGD_RING_SIZE) || + (!is_power_of_2(entries)) || mbox >= RIO_MAX_MBOX) { + rc = -EINVAL; + goto out; + } + + /* Initialize IB Messaging Ring */ + priv->imsg_ring[mbox].dev_id = dev_id; + priv->imsg_ring[mbox].size = entries; + priv->imsg_ring[mbox].rx_slot = 0; + priv->imsg_ring[mbox].desc_rdptr = 0; + priv->imsg_ring[mbox].fq_wrptr = 0; + for (i = 0; i < priv->imsg_ring[mbox].size; i++) + priv->imsg_ring[mbox].imq_base[i] = NULL; + + /* Allocate buffers for incoming messages */ + priv->imsg_ring[mbox].buf_base = + dma_alloc_coherent(&priv->pdev->dev, + entries * TSI721_MSG_BUFFER_SIZE, + &priv->imsg_ring[mbox].buf_phys, + GFP_KERNEL); + + if (priv->imsg_ring[mbox].buf_base == NULL) { + dev_err(&priv->pdev->dev, + "Failed to allocate buffers for IB MBOX%d\n", mbox); + rc = -ENOMEM; + goto out; + } + + /* Allocate memory for circular free list */ + priv->imsg_ring[mbox].imfq_base = + dma_alloc_coherent(&priv->pdev->dev, + entries * 8, + &priv->imsg_ring[mbox].imfq_phys, + GFP_KERNEL); + + if (priv->imsg_ring[mbox].imfq_base == NULL) { + dev_err(&priv->pdev->dev, + "Failed to allocate free queue for IB MBOX%d\n", mbox); + rc = -ENOMEM; + goto out_buf; + } + + /* Allocate memory for Inbound message descriptors */ + priv->imsg_ring[mbox].imd_base = + dma_alloc_coherent(&priv->pdev->dev, + entries * sizeof(struct tsi721_imsg_desc), + &priv->imsg_ring[mbox].imd_phys, GFP_KERNEL); + + if (priv->imsg_ring[mbox].imd_base == NULL) { + dev_err(&priv->pdev->dev, + "Failed to allocate descriptor memory for IB MBOX%d\n", + mbox); + rc = -ENOMEM; + goto out_dma; + } + + /* Fill free buffer pointer list */ + free_ptr = priv->imsg_ring[mbox].imfq_base; + for (i = 0; i < entries; i++) + free_ptr[i] = cpu_to_le64( + (u64)(priv->imsg_ring[mbox].buf_phys) + + i * 0x1000); + + mb(); + + /* + * For mapping of inbound SRIO Messages into appropriate queues we need + * to set Inbound Device 
ID register in the messaging engine. We do it + * once when first inbound mailbox is requested. + */ + if (!(priv->flags & TSI721_IMSGID_SET)) { + iowrite32((u32)priv->mport->host_deviceid, + priv->regs + TSI721_IB_DEVID); + priv->flags |= TSI721_IMSGID_SET; + } + + /* + * Configure Inbound Messaging channel (ch = mbox + 4) + */ + + /* Setup Inbound Message free queue */ + iowrite32(((u64)priv->imsg_ring[mbox].imfq_phys >> 32), + priv->regs + TSI721_IBDMAC_FQBH(ch)); + iowrite32(((u64)priv->imsg_ring[mbox].imfq_phys & + TSI721_IBDMAC_FQBL_MASK), + priv->regs+TSI721_IBDMAC_FQBL(ch)); + iowrite32(TSI721_DMAC_DSSZ_SIZE(entries), + priv->regs + TSI721_IBDMAC_FQSZ(ch)); + + /* Setup Inbound Message descriptor queue */ + iowrite32(((u64)priv->imsg_ring[mbox].imd_phys >> 32), + priv->regs + TSI721_IBDMAC_DQBH(ch)); + iowrite32(((u32)priv->imsg_ring[mbox].imd_phys & + (u32)TSI721_IBDMAC_DQBL_MASK), + priv->regs+TSI721_IBDMAC_DQBL(ch)); + iowrite32(TSI721_DMAC_DSSZ_SIZE(entries), + priv->regs + TSI721_IBDMAC_DQSZ(ch)); + + /* Enable interrupts */ + + if (priv->flags & TSI721_USING_MSIX) { + /* Request interrupt service if we are in MSI-X mode */ + rc = request_irq(priv->msix[TSI721_VECT_IMB0_RCV + mbox].vector, + tsi721_imsg_msix, 0, + priv->msix[TSI721_VECT_IMB0_RCV + mbox].irq_name, + (void *)mport); + + if (rc) { + dev_dbg(&priv->pdev->dev, + "Unable to allocate MSI-X interrupt for " + "IBOX%d-DONE\n", mbox); + goto out_desc; + } + + rc = request_irq(priv->msix[TSI721_VECT_IMB0_INT + mbox].vector, + tsi721_imsg_msix, 0, + priv->msix[TSI721_VECT_IMB0_INT + mbox].irq_name, + (void *)mport); + + if (rc) { + dev_dbg(&priv->pdev->dev, + "Unable to allocate MSI-X interrupt for " + "IBOX%d-INT\n", mbox); + goto out_desc; + } + } + + tsi721_imsg_interrupt_enable(priv, ch, TSI721_IBDMAC_INT_ALL); + + /* Initialize Inbound Message Engine */ + iowrite32(TSI721_IBDMAC_CTL_INIT, priv->regs + TSI721_IBDMAC_CTL(ch)); + (void)ioread32(priv->regs + TSI721_IBDMAC_CTL(ch)); + udelay(10); + priv->imsg_ring[mbox].fq_wrptr = entries - 1; + iowrite32(entries - 1, priv->regs + TSI721_IBDMAC_FQWP(ch)); + + priv->imsg_init[mbox] = 1; + return 0; + +out_desc: + dma_free_coherent(&priv->pdev->dev, + priv->imsg_ring[mbox].size * sizeof(struct tsi721_imsg_desc), + priv->imsg_ring[mbox].imd_base, + priv->imsg_ring[mbox].imd_phys); + + priv->imsg_ring[mbox].imd_base = NULL; + +out_dma: + dma_free_coherent(&priv->pdev->dev, + priv->imsg_ring[mbox].size * 8, + priv->imsg_ring[mbox].imfq_base, + priv->imsg_ring[mbox].imfq_phys); + + priv->imsg_ring[mbox].imfq_base = NULL; + +out_buf: + dma_free_coherent(&priv->pdev->dev, + priv->imsg_ring[mbox].size * TSI721_MSG_BUFFER_SIZE, + priv->imsg_ring[mbox].buf_base, + priv->imsg_ring[mbox].buf_phys); + + priv->imsg_ring[mbox].buf_base = NULL; + +out: + return rc; +} + +/** + * tsi721_close_inb_mbox - Shut down Tsi721 inbound mailbox + * @mport: Master port implementing the Inbound Messaging Engine + * @mbox: Mailbox to close + */ +static void tsi721_close_inb_mbox(struct rio_mport *mport, int mbox) +{ + struct tsi721_device *priv = mport->priv; + u32 rx_slot; + int ch = mbox + 4; + + if (!priv->imsg_init[mbox]) /* mbox isn't initialized yet */ + return; + priv->imsg_init[mbox] = 0; + + /* Disable Inbound Messaging Engine */ + + /* Disable Interrupts */ + tsi721_imsg_interrupt_disable(priv, ch, TSI721_OBDMAC_INT_MASK); + + if (priv->flags & TSI721_USING_MSIX) { + free_irq(priv->msix[TSI721_VECT_IMB0_RCV + mbox].vector, + (void *)mport); + free_irq(priv->msix[TSI721_VECT_IMB0_INT + 
mbox].vector, + (void *)mport); + } + + /* Clear Inbound Buffer Queue */ + for (rx_slot = 0; rx_slot < priv->imsg_ring[mbox].size; rx_slot++) + priv->imsg_ring[mbox].imq_base[rx_slot] = NULL; + + /* Free memory allocated for message buffers */ + dma_free_coherent(&priv->pdev->dev, + priv->imsg_ring[mbox].size * TSI721_MSG_BUFFER_SIZE, + priv->imsg_ring[mbox].buf_base, + priv->imsg_ring[mbox].buf_phys); + + priv->imsg_ring[mbox].buf_base = NULL; + + /* Free memory allocated for free pointr list */ + dma_free_coherent(&priv->pdev->dev, + priv->imsg_ring[mbox].size * 8, + priv->imsg_ring[mbox].imfq_base, + priv->imsg_ring[mbox].imfq_phys); + + priv->imsg_ring[mbox].imfq_base = NULL; + + /* Free memory allocated for RX descriptors */ + dma_free_coherent(&priv->pdev->dev, + priv->imsg_ring[mbox].size * sizeof(struct tsi721_imsg_desc), + priv->imsg_ring[mbox].imd_base, + priv->imsg_ring[mbox].imd_phys); + + priv->imsg_ring[mbox].imd_base = NULL; +} + +/** + * tsi721_add_inb_buffer - Add buffer to the Tsi721 inbound message queue + * @mport: Master port implementing the Inbound Messaging Engine + * @mbox: Inbound mailbox number + * @buf: Buffer to add to inbound queue + */ +static int tsi721_add_inb_buffer(struct rio_mport *mport, int mbox, void *buf) +{ + struct tsi721_device *priv = mport->priv; + u32 rx_slot; + int rc = 0; + + rx_slot = priv->imsg_ring[mbox].rx_slot; + if (priv->imsg_ring[mbox].imq_base[rx_slot]) { + dev_err(&priv->pdev->dev, + "Error adding inbound buffer %d, buffer exists\n", + rx_slot); + rc = -EINVAL; + goto out; + } + + priv->imsg_ring[mbox].imq_base[rx_slot] = buf; + + if (++priv->imsg_ring[mbox].rx_slot == priv->imsg_ring[mbox].size) + priv->imsg_ring[mbox].rx_slot = 0; + +out: + return rc; +} + +/** + * tsi721_get_inb_message - Fetch inbound message from the Tsi721 MSG Queue + * @mport: Master port implementing the Inbound Messaging Engine + * @mbox: Inbound mailbox number + * + * Returns pointer to the message on success or NULL on failure. 
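 *
 * (Reviewer's sketch, not part of the patch: a typical inbound-mailbox
 * callback in a client driver drains this queue and then returns the
 * buffers to the pool.  The names my_inb_msg_cb and consume() are
 * hypothetical.)
 *
 *	static void my_inb_msg_cb(struct rio_mport *mport, void *dev_id,
 *				  int mbox, int slot)
 *	{
 *		void *buf;
 *
 *		while ((buf = rio_get_inb_message(mport, mbox)) != NULL) {
 *			consume(buf);
 *			rio_add_inb_buffer(mport, mbox, buf);
 *		}
 *	}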
+ */ +static void *tsi721_get_inb_message(struct rio_mport *mport, int mbox) +{ + struct tsi721_device *priv = mport->priv; + struct tsi721_imsg_desc *desc; + u32 rx_slot; + void *rx_virt = NULL; + u64 rx_phys; + void *buf = NULL; + u64 *free_ptr; + int ch = mbox + 4; + int msg_size; + + if (!priv->imsg_init[mbox]) + return NULL; + + desc = priv->imsg_ring[mbox].imd_base; + desc += priv->imsg_ring[mbox].desc_rdptr; + + if (!(le32_to_cpu(desc->msg_info) & TSI721_IMD_HO)) + goto out; + + rx_slot = priv->imsg_ring[mbox].rx_slot; + while (priv->imsg_ring[mbox].imq_base[rx_slot] == NULL) { + if (++rx_slot == priv->imsg_ring[mbox].size) + rx_slot = 0; + } + + rx_phys = ((u64)le32_to_cpu(desc->bufptr_hi) << 32) | + le32_to_cpu(desc->bufptr_lo); + + rx_virt = priv->imsg_ring[mbox].buf_base + + (rx_phys - (u64)priv->imsg_ring[mbox].buf_phys); + + buf = priv->imsg_ring[mbox].imq_base[rx_slot]; + msg_size = le32_to_cpu(desc->msg_info) & TSI721_IMD_BCOUNT; + if (msg_size == 0) + msg_size = RIO_MAX_MSG_SIZE; + + memcpy(buf, rx_virt, msg_size); + priv->imsg_ring[mbox].imq_base[rx_slot] = NULL; + + desc->msg_info &= cpu_to_le32(~TSI721_IMD_HO); + if (++priv->imsg_ring[mbox].desc_rdptr == priv->imsg_ring[mbox].size) + priv->imsg_ring[mbox].desc_rdptr = 0; + + iowrite32(priv->imsg_ring[mbox].desc_rdptr, + priv->regs + TSI721_IBDMAC_DQRP(ch)); + + /* Return free buffer into the pointer list */ + free_ptr = priv->imsg_ring[mbox].imfq_base; + free_ptr[priv->imsg_ring[mbox].fq_wrptr] = cpu_to_le64(rx_phys); + + if (++priv->imsg_ring[mbox].fq_wrptr == priv->imsg_ring[mbox].size) + priv->imsg_ring[mbox].fq_wrptr = 0; + + iowrite32(priv->imsg_ring[mbox].fq_wrptr, + priv->regs + TSI721_IBDMAC_FQWP(ch)); +out: + return buf; +} + +/** + * tsi721_messages_init - Initialization of Messaging Engine + * @priv: pointer to tsi721 private data + * + * Configures Tsi721 messaging engine. 
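 *
 * (Reviewer's note, not part of the patch: this only programs global
 * messaging state - the SRIO response timeout, retry counters and the
 * per-channel ECC/interrupt logs.  The per-mailbox free and descriptor
 * queues are set up on demand by tsi721_open_inb_mbox() and
 * tsi721_open_outb_mbox().)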
+ */ +static int tsi721_messages_init(struct tsi721_device *priv) +{ + int ch; + + iowrite32(0, priv->regs + TSI721_SMSG_ECC_LOG); + iowrite32(0, priv->regs + TSI721_RETRY_GEN_CNT); + iowrite32(0, priv->regs + TSI721_RETRY_RX_CNT); + + /* Set SRIO Message Request/Response Timeout */ + iowrite32(TSI721_RQRPTO_VAL, priv->regs + TSI721_RQRPTO); + + /* Initialize Inbound Messaging Engine Registers */ + for (ch = 0; ch < TSI721_IMSG_CHNUM; ch++) { + /* Clear interrupt bits */ + iowrite32(TSI721_IBDMAC_INT_MASK, + priv->regs + TSI721_IBDMAC_INT(ch)); + /* Clear Status */ + iowrite32(0, priv->regs + TSI721_IBDMAC_STS(ch)); + + iowrite32(TSI721_SMSG_ECC_COR_LOG_MASK, + priv->regs + TSI721_SMSG_ECC_COR_LOG(ch)); + iowrite32(TSI721_SMSG_ECC_NCOR_MASK, + priv->regs + TSI721_SMSG_ECC_NCOR(ch)); + } + + return 0; +} + +/** + * tsi721_disable_ints - disables all device interrupts + * @priv: pointer to tsi721 private data + */ +static void tsi721_disable_ints(struct tsi721_device *priv) +{ + int ch; + + /* Disable all device level interrupts */ + iowrite32(0, priv->regs + TSI721_DEV_INTE); + + /* Disable all Device Channel interrupts */ + iowrite32(0, priv->regs + TSI721_DEV_CHAN_INTE); + + /* Disable all Inbound Msg Channel interrupts */ + for (ch = 0; ch < TSI721_IMSG_CHNUM; ch++) + iowrite32(0, priv->regs + TSI721_IBDMAC_INTE(ch)); + + /* Disable all Outbound Msg Channel interrupts */ + for (ch = 0; ch < TSI721_OMSG_CHNUM; ch++) + iowrite32(0, priv->regs + TSI721_OBDMAC_INTE(ch)); + + /* Disable all general messaging interrupts */ + iowrite32(0, priv->regs + TSI721_SMSG_INTE); + + /* Disable all BDMA Channel interrupts */ + for (ch = 0; ch < TSI721_DMA_MAXCH; ch++) + iowrite32(0, priv->regs + TSI721_DMAC_INTE(ch)); + + /* Disable all general BDMA interrupts */ + iowrite32(0, priv->regs + TSI721_BDMA_INTE); + + /* Disable all SRIO Channel interrupts */ + for (ch = 0; ch < TSI721_SRIO_MAXCH; ch++) + iowrite32(0, priv->regs + TSI721_SR_CHINTE(ch)); + + /* Disable all general SR2PC interrupts */ + iowrite32(0, priv->regs + TSI721_SR2PC_GEN_INTE); + + /* Disable all PC2SR interrupts */ + iowrite32(0, priv->regs + TSI721_PC2SR_INTE); + + /* Disable all I2C interrupts */ + iowrite32(0, priv->regs + TSI721_I2C_INT_ENABLE); + + /* Disable SRIO MAC interrupts */ + iowrite32(0, priv->regs + TSI721_RIO_EM_INT_ENABLE); + iowrite32(0, priv->regs + TSI721_RIO_EM_DEV_INT_EN); +} + +/** + * tsi721_setup_mport - Setup Tsi721 as RapidIO subsystem master port + * @priv: pointer to tsi721 private data + * + * Configures Tsi721 as RapidIO master port. 
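 *
 * (Reviewer's note, not part of the patch: the rio_ops table filled in
 * here is how the generic RapidIO core drives the hardware - e.g.
 * rio_local_read_config_32() ends up in tsi721_lcread() and
 * rio_mport_send_doorbell() in tsi721_dsend().  Interrupt signalling
 * falls back from MSI-X to MSI to legacy INTx, depending on what the
 * platform provides.)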
+ */ +static int __devinit tsi721_setup_mport(struct tsi721_device *priv) +{ + struct pci_dev *pdev = priv->pdev; + int err = 0; + struct rio_ops *ops; + + struct rio_mport *mport; + + ops = kzalloc(sizeof(struct rio_ops), GFP_KERNEL); + if (!ops) { + dev_dbg(&pdev->dev, "Unable to allocate memory for rio_ops\n"); + return -ENOMEM; + } + + ops->lcread = tsi721_lcread; + ops->lcwrite = tsi721_lcwrite; + ops->cread = tsi721_cread_dma; + ops->cwrite = tsi721_cwrite_dma; + ops->dsend = tsi721_dsend; + ops->open_inb_mbox = tsi721_open_inb_mbox; + ops->close_inb_mbox = tsi721_close_inb_mbox; + ops->open_outb_mbox = tsi721_open_outb_mbox; + ops->close_outb_mbox = tsi721_close_outb_mbox; + ops->add_outb_message = tsi721_add_outb_message; + ops->add_inb_buffer = tsi721_add_inb_buffer; + ops->get_inb_message = tsi721_get_inb_message; + + mport = kzalloc(sizeof(struct rio_mport), GFP_KERNEL); + if (!mport) { + kfree(ops); + dev_dbg(&pdev->dev, "Unable to allocate memory for mport\n"); + return -ENOMEM; + } + + mport->ops = ops; + mport->index = 0; + mport->sys_size = 0; /* small system */ + mport->phy_type = RIO_PHY_SERIAL; + mport->priv = (void *)priv; + mport->phys_efptr = 0x100; + + INIT_LIST_HEAD(&mport->dbells); + + rio_init_dbell_res(&mport->riores[RIO_DOORBELL_RESOURCE], 0, 0xffff); + rio_init_mbox_res(&mport->riores[RIO_INB_MBOX_RESOURCE], 0, 0); + rio_init_mbox_res(&mport->riores[RIO_OUTB_MBOX_RESOURCE], 0, 0); + strcpy(mport->name, "Tsi721 mport"); + + /* Hook up interrupt handler */ + + if (!tsi721_enable_msix(priv)) + priv->flags |= TSI721_USING_MSIX; + else if (!pci_enable_msi(pdev)) + priv->flags |= TSI721_USING_MSI; + else + dev_info(&pdev->dev, + "MSI/MSI-X is not available. Using legacy INTx.\n"); + + err = tsi721_request_irq(mport); + + if (!err) { + tsi721_interrupts_init(priv); + ops->pwenable = tsi721_pw_enable; + } else + dev_err(&pdev->dev, "Unable to get assigned PCI IRQ " + "vector %02X err=0x%x\n", pdev->irq, err); + + /* Enable SRIO link */ + iowrite32(ioread32(priv->regs + TSI721_DEVCTL) | + TSI721_DEVCTL_SRBOOT_CMPL, + priv->regs + TSI721_DEVCTL); + + rio_register_mport(mport); + priv->mport = mport; + + if (mport->host_deviceid >= 0) + iowrite32(RIO_PORT_GEN_HOST | RIO_PORT_GEN_MASTER | + RIO_PORT_GEN_DISCOVERED, + priv->regs + (0x100 + RIO_PORT_GEN_CTL_CSR)); + else + iowrite32(0, priv->regs + (0x100 + RIO_PORT_GEN_CTL_CSR)); + + return 0; +} + +static int __devinit tsi721_probe(struct pci_dev *pdev, + const struct pci_device_id *id) +{ + struct tsi721_device *priv; + int i; + int err; + u32 regval; + + priv = kzalloc(sizeof(struct tsi721_device), GFP_KERNEL); + if (priv == NULL) { + dev_err(&pdev->dev, "Failed to allocate memory for device\n"); + err = -ENOMEM; + goto err_exit; + } + + err = pci_enable_device(pdev); + if (err) { + dev_err(&pdev->dev, "Failed to enable PCI device\n"); + goto err_clean; + } + + priv->pdev = pdev; + +#ifdef DEBUG + for (i = 0; i <= PCI_STD_RESOURCE_END; i++) { + dev_dbg(&pdev->dev, "res[%d] @ 0x%llx (0x%lx, 0x%lx)\n", + i, (unsigned long long)pci_resource_start(pdev, i), + (unsigned long)pci_resource_len(pdev, i), + pci_resource_flags(pdev, i)); + } +#endif + /* + * Verify BAR configuration + */ + + /* BAR_0 (registers) must be 512KB+ in 32-bit address space */ + if (!(pci_resource_flags(pdev, BAR_0) & IORESOURCE_MEM) || + pci_resource_flags(pdev, BAR_0) & IORESOURCE_MEM_64 || + pci_resource_len(pdev, BAR_0) < TSI721_REG_SPACE_SIZE) { + dev_err(&pdev->dev, + "Missing or misconfigured CSR BAR0, aborting.\n"); + err = -ENODEV; + goto 
err_disable_pdev; + } + + /* BAR_1 (outbound doorbells) must be 16MB+ in 32-bit address space */ + if (!(pci_resource_flags(pdev, BAR_1) & IORESOURCE_MEM) || + pci_resource_flags(pdev, BAR_1) & IORESOURCE_MEM_64 || + pci_resource_len(pdev, BAR_1) < TSI721_DB_WIN_SIZE) { + dev_err(&pdev->dev, + "Missing or misconfigured Doorbell BAR1, aborting.\n"); + err = -ENODEV; + goto err_disable_pdev; + } + + /* + * BAR_2 and BAR_4 (outbound translation) must be in 64-bit PCIe address + * space. + * NOTE: BAR_2 and BAR_4 are not used by this version of driver. + * It may be a good idea to keep them disabled using HW configuration + * to save PCI memory space. + */ + if ((pci_resource_flags(pdev, BAR_2) & IORESOURCE_MEM) && + (pci_resource_flags(pdev, BAR_2) & IORESOURCE_MEM_64)) { + dev_info(&pdev->dev, "Outbound BAR2 is not used but enabled.\n"); + } + + if ((pci_resource_flags(pdev, BAR_4) & IORESOURCE_MEM) && + (pci_resource_flags(pdev, BAR_4) & IORESOURCE_MEM_64)) { + dev_info(&pdev->dev, "Outbound BAR4 is not used but enabled.\n"); + } + + err = pci_request_regions(pdev, DRV_NAME); + if (err) { + dev_err(&pdev->dev, "Cannot obtain PCI resources, " + "aborting.\n"); + goto err_disable_pdev; + } + + pci_set_master(pdev); + + priv->regs = pci_ioremap_bar(pdev, BAR_0); + if (!priv->regs) { + dev_err(&pdev->dev, + "Unable to map device registers space, aborting\n"); + err = -ENOMEM; + goto err_free_res; + } + + priv->odb_base = pci_ioremap_bar(pdev, BAR_1); + if (!priv->odb_base) { + dev_err(&pdev->dev, + "Unable to map outbound doorbells space, aborting\n"); + err = -ENOMEM; + goto err_unmap_bars; + } + + /* Configure DMA attributes. */ + if (pci_set_dma_mask(pdev, DMA_BIT_MASK(64))) { + if (pci_set_dma_mask(pdev, DMA_BIT_MASK(32))) { + dev_info(&pdev->dev, "Unable to set DMA mask\n"); + goto err_unmap_bars; + } + + if (pci_set_consistent_dma_mask(pdev, DMA_BIT_MASK(32))) + dev_info(&pdev->dev, "Unable to set consistent DMA mask\n"); + } else { + err = pci_set_consistent_dma_mask(pdev, DMA_BIT_MASK(64)); + if (err) + dev_info(&pdev->dev, "Unable to set consistent DMA mask\n"); + } + + /* Clear "no snoop" and "relaxed ordering" bits. 
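 *
 * (Reviewer's note, not part of the patch: the hardcoded 0x40 base below
 * assumes the PCI Express capability sits at a fixed config-space offset
 * on this device; pci_pcie_cap(pdev) would return the offset generically.)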
*/ + pci_read_config_dword(pdev, 0x40 + PCI_EXP_DEVCTL, ®val); + regval &= ~(PCI_EXP_DEVCTL_RELAX_EN | PCI_EXP_DEVCTL_NOSNOOP_EN); + pci_write_config_dword(pdev, 0x40 + PCI_EXP_DEVCTL, regval); + + /* + * FIXUP: correct offsets of MSI-X tables in the MSI-X Capability Block + */ + pci_write_config_dword(pdev, TSI721_PCIECFG_EPCTL, 0x01); + pci_write_config_dword(pdev, TSI721_PCIECFG_MSIXTBL, + TSI721_MSIXTBL_OFFSET); + pci_write_config_dword(pdev, TSI721_PCIECFG_MSIXPBA, + TSI721_MSIXPBA_OFFSET); + pci_write_config_dword(pdev, TSI721_PCIECFG_EPCTL, 0); + /* End of FIXUP */ + + tsi721_disable_ints(priv); + + tsi721_init_pc2sr_mapping(priv); + tsi721_init_sr2pc_mapping(priv); + + if (tsi721_bdma_init(priv)) { + dev_err(&pdev->dev, "BDMA initialization failed, aborting\n"); + err = -ENOMEM; + goto err_unmap_bars; + } + + err = tsi721_doorbell_init(priv); + if (err) + goto err_free_bdma; + + tsi721_port_write_init(priv); + + err = tsi721_messages_init(priv); + if (err) + goto err_free_consistent; + + err = tsi721_setup_mport(priv); + if (err) + goto err_free_consistent; + + return 0; + +err_free_consistent: + tsi721_doorbell_free(priv); +err_free_bdma: + tsi721_bdma_free(priv); +err_unmap_bars: + if (priv->regs) + iounmap(priv->regs); + if (priv->odb_base) + iounmap(priv->odb_base); +err_free_res: + pci_release_regions(pdev); + pci_clear_master(pdev); +err_disable_pdev: + pci_disable_device(pdev); +err_clean: + kfree(priv); +err_exit: + return err; +} + +static DEFINE_PCI_DEVICE_TABLE(tsi721_pci_tbl) = { + { PCI_DEVICE(PCI_VENDOR_ID_IDT, PCI_DEVICE_ID_TSI721) }, + { 0, } /* terminate list */ +}; + +MODULE_DEVICE_TABLE(pci, tsi721_pci_tbl); + +static struct pci_driver tsi721_driver = { + .name = "tsi721", + .id_table = tsi721_pci_tbl, + .probe = tsi721_probe, +}; + +static int __init tsi721_init(void) +{ + return pci_register_driver(&tsi721_driver); +} + +static void __exit tsi721_exit(void) +{ + pci_unregister_driver(&tsi721_driver); +} + +device_initcall(tsi721_init); diff --git a/drivers/rapidio/devices/tsi721.h b/drivers/rapidio/devices/tsi721.h new file mode 100644 index 000000000000..3bea6bc26e80 --- /dev/null +++ b/drivers/rapidio/devices/tsi721.h @@ -0,0 +1,762 @@ +/* + * Tsi721 PCIExpress-to-SRIO bridge definitions + * + * Copyright 2011, Integrated Device Technology, Inc. + * + * This program is free software; you can redistribute it and/or modify it + * under the terms of the GNU General Public License as published by the Free + * Software Foundation; either version 2 of the License, or (at your option) + * any later version. + * + * This program is distributed in the hope that it will be useful, but WITHOUT + * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or + * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for + * more details. + * + * You should have received a copy of the GNU General Public License along with + * this program; if not, write to the Free Software Foundation, Inc., 59 + * Temple Place - Suite 330, Boston, MA 02111-1307, USA. 
+ */ + +#ifndef __TSI721_H +#define __TSI721_H + +#define DRV_NAME "tsi721" + +#define DEFAULT_HOPCOUNT 0xff +#define DEFAULT_DESTID 0xff + +/* PCI device ID */ +#define PCI_DEVICE_ID_TSI721 0x80ab + +#define BAR_0 0 +#define BAR_1 1 +#define BAR_2 2 +#define BAR_4 4 + +#define TSI721_PC2SR_BARS 2 +#define TSI721_PC2SR_WINS 8 +#define TSI721_PC2SR_ZONES 8 +#define TSI721_MAINT_WIN 0 /* Window for outbound maintenance requests */ +#define IDB_QUEUE 0 /* Inbound Doorbell Queue to use */ +#define IDB_QSIZE 512 /* Inbound Doorbell Queue size */ + +/* Memory space sizes */ +#define TSI721_REG_SPACE_SIZE (512 * 1024) /* 512K */ +#define TSI721_DB_WIN_SIZE (16 * 1024 * 1024) /* 16MB */ + +#define RIO_TT_CODE_8 0x00000000 +#define RIO_TT_CODE_16 0x00000001 + +#define TSI721_DMA_MAXCH 8 +#define TSI721_DMA_MINSTSSZ 32 +#define TSI721_DMA_STSBLKSZ 8 + +#define TSI721_SRIO_MAXCH 8 + +#define DBELL_SID(buf) (((u8)buf[2] << 8) | (u8)buf[3]) +#define DBELL_TID(buf) (((u8)buf[4] << 8) | (u8)buf[5]) +#define DBELL_INF(buf) (((u8)buf[0] << 8) | (u8)buf[1]) + +#define TSI721_RIO_PW_MSG_SIZE 16 /* Tsi721 saves only 16 bytes of PW msg */ + +/* Register definitions */ + +/* + * Registers in PCIe configuration space + */ + +#define TSI721_PCIECFG_MSIXTBL 0x0a4 +#define TSI721_MSIXTBL_OFFSET 0x2c000 +#define TSI721_PCIECFG_MSIXPBA 0x0a8 +#define TSI721_MSIXPBA_OFFSET 0x2a000 +#define TSI721_PCIECFG_EPCTL 0x400 + +/* + * Event Management Registers + */ + +#define TSI721_RIO_EM_INT_STAT 0x10910 +#define TSI721_RIO_EM_INT_STAT_PW_RX 0x00010000 + +#define TSI721_RIO_EM_INT_ENABLE 0x10914 +#define TSI721_RIO_EM_INT_ENABLE_PW_RX 0x00010000 + +#define TSI721_RIO_EM_DEV_INT_EN 0x10930 +#define TSI721_RIO_EM_DEV_INT_EN_INT 0x00000001 + +/* + * Port-Write Block Registers + */ + +#define TSI721_RIO_PW_CTL 0x10a04 +#define TSI721_RIO_PW_CTL_PW_TIMER 0xf0000000 +#define TSI721_RIO_PW_CTL_PWT_DIS (0 << 28) +#define TSI721_RIO_PW_CTL_PWT_103 (1 << 28) +#define TSI721_RIO_PW_CTL_PWT_205 (1 << 29) +#define TSI721_RIO_PW_CTL_PWT_410 (1 << 30) +#define TSI721_RIO_PW_CTL_PWT_820 (1 << 31) +#define TSI721_RIO_PW_CTL_PWC_MODE 0x01000000 +#define TSI721_RIO_PW_CTL_PWC_CONT 0x00000000 +#define TSI721_RIO_PW_CTL_PWC_REL 0x01000000 + +#define TSI721_RIO_PW_RX_STAT 0x10a10 +#define TSI721_RIO_PW_RX_STAT_WR_SIZE 0x0000f000 +#define TSI_RIO_PW_RX_STAT_WDPTR 0x00000100 +#define TSI721_RIO_PW_RX_STAT_PW_SHORT 0x00000008 +#define TSI721_RIO_PW_RX_STAT_PW_TRUNC 0x00000004 +#define TSI721_RIO_PW_RX_STAT_PW_DISC 0x00000002 +#define TSI721_RIO_PW_RX_STAT_PW_VAL 0x00000001 + +#define TSI721_RIO_PW_RX_CAPT(x) (0x10a20 + (x)*4) + +/* + * Inbound Doorbells + */ + +#define TSI721_IDB_ENTRY_SIZE 64 + +#define TSI721_IDQ_CTL(x) (0x20000 + (x) * 1000) +#define TSI721_IDQ_SUSPEND 0x00000002 +#define TSI721_IDQ_INIT 0x00000001 + +#define TSI721_IDQ_STS(x) (0x20004 + (x) * 1000) +#define TSI721_IDQ_RUN 0x00200000 + +#define TSI721_IDQ_MASK(x) (0x20008 + (x) * 1000) +#define TSI721_IDQ_MASK_MASK 0xffff0000 +#define TSI721_IDQ_MASK_PATT 0x0000ffff + +#define TSI721_IDQ_RP(x) (0x2000c + (x) * 1000) +#define TSI721_IDQ_RP_PTR 0x0007ffff + +#define TSI721_IDQ_WP(x) (0x20010 + (x) * 1000) +#define TSI721_IDQ_WP_PTR 0x0007ffff + +#define TSI721_IDQ_BASEL(x) (0x20014 + (x) * 1000) +#define TSI721_IDQ_BASEL_ADDR 0xffffffc0 +#define TSI721_IDQ_BASEU(x) (0x20018 + (x) * 1000) +#define TSI721_IDQ_SIZE(x) (0x2001c + (x) * 1000) +#define TSI721_IDQ_SIZE_VAL(size) (__fls(size) - 4) +#define TSI721_IDQ_SIZE_MIN 512 +#define TSI721_IDQ_SIZE_MAX (512 * 1024) + +#define 
TSI721_SR_CHINT(x) (0x20040 + (x) * 1000) +#define TSI721_SR_CHINTE(x) (0x20044 + (x) * 1000) +#define TSI721_SR_CHINTSET(x) (0x20048 + (x) * 1000) +#define TSI721_SR_CHINT_ODBOK 0x00000020 +#define TSI721_SR_CHINT_IDBQRCV 0x00000010 +#define TSI721_SR_CHINT_SUSP 0x00000008 +#define TSI721_SR_CHINT_ODBTO 0x00000004 +#define TSI721_SR_CHINT_ODBRTRY 0x00000002 +#define TSI721_SR_CHINT_ODBERR 0x00000001 +#define TSI721_SR_CHINT_ALL 0x0000003f + +#define TSI721_IBWIN_NUM 8 + +#define TSI721_IBWINLB(x) (0x29000 + (x) * 20) +#define TSI721_IBWINLB_BA 0xfffff000 +#define TSI721_IBWINLB_WEN 0x00000001 + +#define TSI721_SR2PC_GEN_INTE 0x29800 +#define TSI721_SR2PC_PWE 0x29804 +#define TSI721_SR2PC_GEN_INT 0x29808 + +#define TSI721_DEV_INTE 0x29840 +#define TSI721_DEV_INT 0x29844 +#define TSI721_DEV_INTSET 0x29848 +#define TSI721_DEV_INT_SMSG_CH 0x00000800 +#define TSI721_DEV_INT_SMSG_NCH 0x00000400 +#define TSI721_DEV_INT_SR2PC_CH 0x00000200 +#define TSI721_DEV_INT_SRIO 0x00000020 + +#define TSI721_DEV_CHAN_INTE 0x2984c +#define TSI721_DEV_CHAN_INT 0x29850 + +#define TSI721_INT_SR2PC_CHAN_M 0xff000000 +#define TSI721_INT_SR2PC_CHAN(x) (1 << (24 + (x))) +#define TSI721_INT_IMSG_CHAN_M 0x00ff0000 +#define TSI721_INT_IMSG_CHAN(x) (1 << (16 + (x))) +#define TSI721_INT_OMSG_CHAN_M 0x0000ff00 +#define TSI721_INT_OMSG_CHAN(x) (1 << (8 + (x))) + +/* + * PC2SR block registers + */ +#define TSI721_OBWIN_NUM TSI721_PC2SR_WINS + +#define TSI721_OBWINLB(x) (0x40000 + (x) * 20) +#define TSI721_OBWINLB_BA 0xffff8000 +#define TSI721_OBWINLB_WEN 0x00000001 + +#define TSI721_OBWINUB(x) (0x40004 + (x) * 20) + +#define TSI721_OBWINSZ(x) (0x40008 + (x) * 20) +#define TSI721_OBWINSZ_SIZE 0x00001f00 +#define TSI721_OBWIN_SIZE(size) (__fls(size) - 15) + +#define TSI721_ZONE_SEL 0x41300 +#define TSI721_ZONE_SEL_RD_WRB 0x00020000 +#define TSI721_ZONE_SEL_GO 0x00010000 +#define TSI721_ZONE_SEL_WIN 0x00000038 +#define TSI721_ZONE_SEL_ZONE 0x00000007 + +#define TSI721_LUT_DATA0 0x41304 +#define TSI721_LUT_DATA0_ADD 0xfffff000 +#define TSI721_LUT_DATA0_RDTYPE 0x00000f00 +#define TSI721_LUT_DATA0_NREAD 0x00000100 +#define TSI721_LUT_DATA0_MNTRD 0x00000200 +#define TSI721_LUT_DATA0_RDCRF 0x00000020 +#define TSI721_LUT_DATA0_WRCRF 0x00000010 +#define TSI721_LUT_DATA0_WRTYPE 0x0000000f +#define TSI721_LUT_DATA0_NWR 0x00000001 +#define TSI721_LUT_DATA0_MNTWR 0x00000002 +#define TSI721_LUT_DATA0_NWR_R 0x00000004 + +#define TSI721_LUT_DATA1 0x41308 + +#define TSI721_LUT_DATA2 0x4130c +#define TSI721_LUT_DATA2_HC 0xff000000 +#define TSI721_LUT_DATA2_ADD65 0x000c0000 +#define TSI721_LUT_DATA2_TT 0x00030000 +#define TSI721_LUT_DATA2_DSTID 0x0000ffff + +#define TSI721_PC2SR_INTE 0x41310 + +#define TSI721_DEVCTL 0x48004 +#define TSI721_DEVCTL_SRBOOT_CMPL 0x00000004 + +#define TSI721_I2C_INT_ENABLE 0x49120 + +/* + * Block DMA Engine Registers + * x = 0..7 + */ + +#define TSI721_DMAC_DWRCNT(x) (0x51000 + (x) * 0x1000) +#define TSI721_DMAC_DRDCNT(x) (0x51004 + (x) * 0x1000) + +#define TSI721_DMAC_CTL(x) (0x51008 + (x) * 0x1000) +#define TSI721_DMAC_CTL_SUSP 0x00000002 +#define TSI721_DMAC_CTL_INIT 0x00000001 + +#define TSI721_DMAC_INT(x) (0x5100c + (x) * 0x1000) +#define TSI721_DMAC_INT_STFULL 0x00000010 +#define TSI721_DMAC_INT_DONE 0x00000008 +#define TSI721_DMAC_INT_SUSP 0x00000004 +#define TSI721_DMAC_INT_ERR 0x00000002 +#define TSI721_DMAC_INT_IOFDONE 0x00000001 +#define TSI721_DMAC_INT_ALL 0x0000001f + +#define TSI721_DMAC_INTSET(x) (0x51010 + (x) * 0x1000) + +#define TSI721_DMAC_STS(x) (0x51014 + (x) * 0x1000) +#define 
TSI721_DMAC_STS_ABORT 0x00400000 +#define TSI721_DMAC_STS_RUN 0x00200000 +#define TSI721_DMAC_STS_CS 0x001f0000 + +#define TSI721_DMAC_INTE(x) (0x51018 + (x) * 0x1000) + +#define TSI721_DMAC_DPTRL(x) (0x51024 + (x) * 0x1000) +#define TSI721_DMAC_DPTRL_MASK 0xffffffe0 + +#define TSI721_DMAC_DPTRH(x) (0x51028 + (x) * 0x1000) + +#define TSI721_DMAC_DSBL(x) (0x5102c + (x) * 0x1000) +#define TSI721_DMAC_DSBL_MASK 0xffffffc0 + +#define TSI721_DMAC_DSBH(x) (0x51030 + (x) * 0x1000) + +#define TSI721_DMAC_DSSZ(x) (0x51034 + (x) * 0x1000) +#define TSI721_DMAC_DSSZ_SIZE_M 0x0000000f +#define TSI721_DMAC_DSSZ_SIZE(size) (__fls(size) - 4) + + +#define TSI721_DMAC_DSRP(x) (0x51038 + (x) * 0x1000) +#define TSI721_DMAC_DSRP_MASK 0x0007ffff + +#define TSI721_DMAC_DSWP(x) (0x5103c + (x) * 0x1000) +#define TSI721_DMAC_DSWP_MASK 0x0007ffff + +#define TSI721_BDMA_INTE 0x5f000 + +/* + * Messaging definitions + */ +#define TSI721_MSG_BUFFER_SIZE RIO_MAX_MSG_SIZE +#define TSI721_MSG_MAX_SIZE RIO_MAX_MSG_SIZE +#define TSI721_IMSG_MAXCH 8 +#define TSI721_IMSG_CHNUM TSI721_IMSG_MAXCH +#define TSI721_IMSGD_MIN_RING_SIZE 32 +#define TSI721_IMSGD_RING_SIZE 512 + +#define TSI721_OMSG_CHNUM 4 /* One channel per MBOX */ +#define TSI721_OMSGD_MIN_RING_SIZE 32 +#define TSI721_OMSGD_RING_SIZE 512 + +/* + * Outbound Messaging Engine Registers + * x = 0..7 + */ + +#define TSI721_OBDMAC_DWRCNT(x) (0x61000 + (x) * 0x1000) + +#define TSI721_OBDMAC_DRDCNT(x) (0x61004 + (x) * 0x1000) + +#define TSI721_OBDMAC_CTL(x) (0x61008 + (x) * 0x1000) +#define TSI721_OBDMAC_CTL_MASK 0x00000007 +#define TSI721_OBDMAC_CTL_RETRY_THR 0x00000004 +#define TSI721_OBDMAC_CTL_SUSPEND 0x00000002 +#define TSI721_OBDMAC_CTL_INIT 0x00000001 + +#define TSI721_OBDMAC_INT(x) (0x6100c + (x) * 0x1000) +#define TSI721_OBDMAC_INTSET(x) (0x61010 + (x) * 0x1000) +#define TSI721_OBDMAC_INTE(x) (0x61018 + (x) * 0x1000) +#define TSI721_OBDMAC_INT_MASK 0x0000001F +#define TSI721_OBDMAC_INT_ST_FULL 0x00000010 +#define TSI721_OBDMAC_INT_DONE 0x00000008 +#define TSI721_OBDMAC_INT_SUSPENDED 0x00000004 +#define TSI721_OBDMAC_INT_ERROR 0x00000002 +#define TSI721_OBDMAC_INT_IOF_DONE 0x00000001 +#define TSI721_OBDMAC_INT_ALL TSI721_OBDMAC_INT_MASK + +#define TSI721_OBDMAC_STS(x) (0x61014 + (x) * 0x1000) +#define TSI721_OBDMAC_STS_MASK 0x007f0000 +#define TSI721_OBDMAC_STS_ABORT 0x00400000 +#define TSI721_OBDMAC_STS_RUN 0x00200000 +#define TSI721_OBDMAC_STS_CS 0x001f0000 + +#define TSI721_OBDMAC_PWE(x) (0x6101c + (x) * 0x1000) +#define TSI721_OBDMAC_PWE_MASK 0x00000002 +#define TSI721_OBDMAC_PWE_ERROR_EN 0x00000002 + +#define TSI721_OBDMAC_DPTRL(x) (0x61020 + (x) * 0x1000) +#define TSI721_OBDMAC_DPTRL_MASK 0xfffffff0 + +#define TSI721_OBDMAC_DPTRH(x) (0x61024 + (x) * 0x1000) +#define TSI721_OBDMAC_DPTRH_MASK 0xffffffff + +#define TSI721_OBDMAC_DSBL(x) (0x61040 + (x) * 0x1000) +#define TSI721_OBDMAC_DSBL_MASK 0xffffffc0 + +#define TSI721_OBDMAC_DSBH(x) (0x61044 + (x) * 0x1000) +#define TSI721_OBDMAC_DSBH_MASK 0xffffffff + +#define TSI721_OBDMAC_DSSZ(x) (0x61048 + (x) * 0x1000) +#define TSI721_OBDMAC_DSSZ_MASK 0x0000000f + +#define TSI721_OBDMAC_DSRP(x) (0x6104c + (x) * 0x1000) +#define TSI721_OBDMAC_DSRP_MASK 0x0007ffff + +#define TSI721_OBDMAC_DSWP(x) (0x61050 + (x) * 0x1000) +#define TSI721_OBDMAC_DSWP_MASK 0x0007ffff + +#define TSI721_RQRPTO 0x60010 +#define TSI721_RQRPTO_MASK 0x00ffffff +#define TSI721_RQRPTO_VAL 400 /* Response TO value */ + +/* + * Inbound Messaging Engine Registers + * x = 0..7 + */ + +#define TSI721_IB_DEVID_GLOBAL 0xffff +#define TSI721_IBDMAC_FQBL(x) 
(0x61200 + (x) * 0x1000) +#define TSI721_IBDMAC_FQBL_MASK 0xffffffc0 + +#define TSI721_IBDMAC_FQBH(x) (0x61204 + (x) * 0x1000) +#define TSI721_IBDMAC_FQBH_MASK 0xffffffff + +#define TSI721_IBDMAC_FQSZ_ENTRY_INX TSI721_IMSGD_RING_SIZE +#define TSI721_IBDMAC_FQSZ(x) (0x61208 + (x) * 0x1000) +#define TSI721_IBDMAC_FQSZ_MASK 0x0000000f + +#define TSI721_IBDMAC_FQRP(x) (0x6120c + (x) * 0x1000) +#define TSI721_IBDMAC_FQRP_MASK 0x0007ffff + +#define TSI721_IBDMAC_FQWP(x) (0x61210 + (x) * 0x1000) +#define TSI721_IBDMAC_FQWP_MASK 0x0007ffff + +#define TSI721_IBDMAC_FQTH(x) (0x61214 + (x) * 0x1000) +#define TSI721_IBDMAC_FQTH_MASK 0x0007ffff + +#define TSI721_IB_DEVID 0x60020 +#define TSI721_IB_DEVID_MASK 0x0000ffff + +#define TSI721_IBDMAC_CTL(x) (0x61240 + (x) * 0x1000) +#define TSI721_IBDMAC_CTL_MASK 0x00000003 +#define TSI721_IBDMAC_CTL_SUSPEND 0x00000002 +#define TSI721_IBDMAC_CTL_INIT 0x00000001 + +#define TSI721_IBDMAC_STS(x) (0x61244 + (x) * 0x1000) +#define TSI721_IBDMAC_STS_MASK 0x007f0000 +#define TSI721_IBSMAC_STS_ABORT 0x00400000 +#define TSI721_IBSMAC_STS_RUN 0x00200000 +#define TSI721_IBSMAC_STS_CS 0x001f0000 + +#define TSI721_IBDMAC_INT(x) (0x61248 + (x) * 0x1000) +#define TSI721_IBDMAC_INTSET(x) (0x6124c + (x) * 0x1000) +#define TSI721_IBDMAC_INTE(x) (0x61250 + (x) * 0x1000) +#define TSI721_IBDMAC_INT_MASK 0x0000100f +#define TSI721_IBDMAC_INT_SRTO 0x00001000 +#define TSI721_IBDMAC_INT_SUSPENDED 0x00000008 +#define TSI721_IBDMAC_INT_PC_ERROR 0x00000004 +#define TSI721_IBDMAC_INT_FQ_LOW 0x00000002 +#define TSI721_IBDMAC_INT_DQ_RCV 0x00000001 +#define TSI721_IBDMAC_INT_ALL TSI721_IBDMAC_INT_MASK + +#define TSI721_IBDMAC_PWE(x) (0x61254 + (x) * 0x1000) +#define TSI721_IBDMAC_PWE_MASK 0x00001700 +#define TSI721_IBDMAC_PWE_SRTO 0x00001000 +#define TSI721_IBDMAC_PWE_ILL_FMT 0x00000400 +#define TSI721_IBDMAC_PWE_ILL_DEC 0x00000200 +#define TSI721_IBDMAC_PWE_IMP_SP 0x00000100 + +#define TSI721_IBDMAC_DQBL(x) (0x61300 + (x) * 0x1000) +#define TSI721_IBDMAC_DQBL_MASK 0xffffffc0 +#define TSI721_IBDMAC_DQBL_ADDR 0xffffffc0 + +#define TSI721_IBDMAC_DQBH(x) (0x61304 + (x) * 0x1000) +#define TSI721_IBDMAC_DQBH_MASK 0xffffffff + +#define TSI721_IBDMAC_DQRP(x) (0x61308 + (x) * 0x1000) +#define TSI721_IBDMAC_DQRP_MASK 0x0007ffff + +#define TSI721_IBDMAC_DQWR(x) (0x6130c + (x) * 0x1000) +#define TSI721_IBDMAC_DQWR_MASK 0x0007ffff + +#define TSI721_IBDMAC_DQSZ(x) (0x61314 + (x) * 0x1000) +#define TSI721_IBDMAC_DQSZ_MASK 0x0000000f + +/* + * Messaging Engine Interrupts + */ + +#define TSI721_SMSG_PWE 0x6a004 + +#define TSI721_SMSG_INTE 0x6a000 +#define TSI721_SMSG_INT 0x6a008 +#define TSI721_SMSG_INTSET 0x6a010 +#define TSI721_SMSG_INT_MASK 0x0086ffff +#define TSI721_SMSG_INT_UNS_RSP 0x00800000 +#define TSI721_SMSG_INT_ECC_NCOR 0x00040000 +#define TSI721_SMSG_INT_ECC_COR 0x00020000 +#define TSI721_SMSG_INT_ECC_NCOR_CH 0x0000ff00 +#define TSI721_SMSG_INT_ECC_COR_CH 0x000000ff + +#define TSI721_SMSG_ECC_LOG 0x6a014 +#define TSI721_SMSG_ECC_LOG_MASK 0x00070007 +#define TSI721_SMSG_ECC_LOG_ECC_NCOR_M 0x00070000 +#define TSI721_SMSG_ECC_LOG_ECC_COR_M 0x00000007 + +#define TSI721_RETRY_GEN_CNT 0x6a100 +#define TSI721_RETRY_GEN_CNT_MASK 0xffffffff + +#define TSI721_RETRY_RX_CNT 0x6a104 +#define TSI721_RETRY_RX_CNT_MASK 0xffffffff + +#define TSI721_SMSG_ECC_COR_LOG(x) (0x6a300 + (x) * 4) +#define TSI721_SMSG_ECC_COR_LOG_MASK 0x000000ff + +#define TSI721_SMSG_ECC_NCOR(x) (0x6a340 + (x) * 4) +#define TSI721_SMSG_ECC_NCOR_MASK 0x000000ff + +/* + * Block DMA Descriptors + */ + +struct tsi721_dma_desc { + __le32 
type_id; + +#define TSI721_DMAD_DEVID 0x0000ffff +#define TSI721_DMAD_CRF 0x00010000 +#define TSI721_DMAD_PRIO 0x00060000 +#define TSI721_DMAD_RTYPE 0x00780000 +#define TSI721_DMAD_IOF 0x08000000 +#define TSI721_DMAD_DTYPE 0xe0000000 + + __le32 bcount; + +#define TSI721_DMAD_BCOUNT1 0x03ffffff /* if DTYPE == 1 */ +#define TSI721_DMAD_BCOUNT2 0x0000000f /* if DTYPE == 2 */ +#define TSI721_DMAD_TT 0x0c000000 +#define TSI721_DMAD_RADDR0 0xc0000000 + + union { + __le32 raddr_lo; /* if DTYPE == (1 || 2) */ + __le32 next_lo; /* if DTYPE == 3 */ + }; + +#define TSI721_DMAD_CFGOFF 0x00ffffff +#define TSI721_DMAD_HOPCNT 0xff000000 + + union { + __le32 raddr_hi; /* if DTYPE == (1 || 2) */ + __le32 next_hi; /* if DTYPE == 3 */ + }; + + union { + struct { /* if DTYPE == 1 */ + __le32 bufptr_lo; + __le32 bufptr_hi; + __le32 s_dist; + __le32 s_size; + } t1; + __le32 data[4]; /* if DTYPE == 2 */ + u32 reserved[4]; /* if DTYPE == 3 */ + }; +} __attribute__((aligned(32))); + +/* + * Inbound Messaging Descriptor + */ +struct tsi721_imsg_desc { + __le32 type_id; + +#define TSI721_IMD_DEVID 0x0000ffff +#define TSI721_IMD_CRF 0x00010000 +#define TSI721_IMD_PRIO 0x00060000 +#define TSI721_IMD_TT 0x00180000 +#define TSI721_IMD_DTYPE 0xe0000000 + + __le32 msg_info; + +#define TSI721_IMD_BCOUNT 0x00000ff8 +#define TSI721_IMD_SSIZE 0x0000f000 +#define TSI721_IMD_LETER 0x00030000 +#define TSI721_IMD_XMBOX 0x003c0000 +#define TSI721_IMD_MBOX 0x00c00000 +#define TSI721_IMD_CS 0x78000000 +#define TSI721_IMD_HO 0x80000000 + + __le32 bufptr_lo; + __le32 bufptr_hi; + u32 reserved[12]; + +} __attribute__((aligned(64))); + +/* + * Outbound Messaging Descriptor + */ +struct tsi721_omsg_desc { + __le32 type_id; + +#define TSI721_OMD_DEVID 0x0000ffff +#define TSI721_OMD_CRF 0x00010000 +#define TSI721_OMD_PRIO 0x00060000 +#define TSI721_OMD_IOF 0x08000000 +#define TSI721_OMD_DTYPE 0xe0000000 +#define TSI721_OMD_RSRVD 0x17f80000 + + __le32 msg_info; + +#define TSI721_OMD_BCOUNT 0x00000ff8 +#define TSI721_OMD_SSIZE 0x0000f000 +#define TSI721_OMD_LETER 0x00030000 +#define TSI721_OMD_XMBOX 0x003c0000 +#define TSI721_OMD_MBOX 0x00c00000 +#define TSI721_OMD_TT 0x0c000000 + + union { + __le32 bufptr_lo; /* if DTYPE == 4 */ + __le32 next_lo; /* if DTYPE == 5 */ + }; + + union { + __le32 bufptr_hi; /* if DTYPE == 4 */ + __le32 next_hi; /* if DTYPE == 5 */ + }; + +} __attribute__((aligned(16))); + +struct tsi721_dma_sts { + __le64 desc_sts[8]; +} __attribute__((aligned(64))); + +struct tsi721_desc_sts_fifo { + union { + __le64 da64; + struct { + __le32 lo; + __le32 hi; + } da32; + } stat[8]; +} __attribute__((aligned(64))); + +/* Descriptor types for BDMA and Messaging blocks */ +enum dma_dtype { + DTYPE1 = 1, /* Data Transfer DMA Descriptor */ + DTYPE2 = 2, /* Immediate Data Transfer DMA Descriptor */ + DTYPE3 = 3, /* Block Pointer DMA Descriptor */ + DTYPE4 = 4, /* Outbound Msg DMA Descriptor */ + DTYPE5 = 5, /* OB Messaging Block Pointer Descriptor */ + DTYPE6 = 6 /* Inbound Messaging Descriptor */ +}; + +enum dma_rtype { + NREAD = 0, + LAST_NWRITE_R = 1, + ALL_NWRITE = 2, + ALL_NWRITE_R = 3, + MAINT_RD = 4, + MAINT_WR = 5 +}; + +/* + * mport Driver Definitions + */ +#define TSI721_DMA_CHNUM TSI721_DMA_MAXCH + +#define TSI721_DMACH_MAINT 0 /* DMA channel for maint requests */ +#define TSI721_DMACH_MAINT_NBD 32 /* Number of BDs for maint requests */ + +#define MSG_DMA_ENTRY_INX_TO_SIZE(x) ((0x10 << (x)) & 0xFFFF0) + +enum tsi721_smsg_int_flag { + SMSG_INT_NONE = 0x00000000, + SMSG_INT_ECC_COR_CH = 0x000000ff, + SMSG_INT_ECC_NCOR_CH = 
0x0000ff00, + SMSG_INT_ECC_COR = 0x00020000, + SMSG_INT_ECC_NCOR = 0x00040000, + SMSG_INT_UNS_RSP = 0x00800000, + SMSG_INT_ALL = 0x0006ffff +}; + +/* Structures */ + +struct dma_chan { + int bd_num; /* number of buffer descriptors */ + void *bd_base; /* start of DMA descriptors */ + dma_addr_t bd_phys; + void *sts_base; /* start of DMA BD status FIFO */ + dma_addr_t sts_phys; + int sts_size; +}; + +struct tsi721_imsg_ring { + u32 size; + /* VA/PA of data buffers for incoming messages */ + void *buf_base; + dma_addr_t buf_phys; + /* VA/PA of circular free buffer list */ + void *imfq_base; + dma_addr_t imfq_phys; + /* VA/PA of Inbound message descriptors */ + void *imd_base; + dma_addr_t imd_phys; + /* Inbound Queue buffer pointers */ + void *imq_base[TSI721_IMSGD_RING_SIZE]; + + u32 rx_slot; + void *dev_id; + u32 fq_wrptr; + u32 desc_rdptr; +}; + +struct tsi721_omsg_ring { + u32 size; + /* VA/PA of OB Msg descriptors */ + void *omd_base; + dma_addr_t omd_phys; + /* VA/PA of OB Msg data buffers */ + void *omq_base[TSI721_OMSGD_RING_SIZE]; + dma_addr_t omq_phys[TSI721_OMSGD_RING_SIZE]; + /* VA/PA of OB Msg descriptor status FIFO */ + void *sts_base; + dma_addr_t sts_phys; + u32 sts_size; /* # of allocated status entries */ + u32 sts_rdptr; + + u32 tx_slot; + void *dev_id; + u32 wr_count; + spinlock_t lock; +}; + +enum tsi721_flags { + TSI721_USING_MSI = (1 << 0), + TSI721_USING_MSIX = (1 << 1), + TSI721_IMSGID_SET = (1 << 2), +}; + +/* + * MSI-X Table Entries (0 ... 69) + */ +#define TSI721_MSIX_DMACH_DONE(x) (0 + (x)) +#define TSI721_MSIX_DMACH_INT(x) (8 + (x)) +#define TSI721_MSIX_BDMA_INT 16 +#define TSI721_MSIX_OMSG_DONE(x) (17 + (x)) +#define TSI721_MSIX_OMSG_INT(x) (25 + (x)) +#define TSI721_MSIX_IMSG_DQ_RCV(x) (33 + (x)) +#define TSI721_MSIX_IMSG_INT(x) (41 + (x)) +#define TSI721_MSIX_MSG_INT 49 +#define TSI721_MSIX_SR2PC_IDBQ_RCV(x) (50 + (x)) +#define TSI721_MSIX_SR2PC_CH_INT(x) (58 + (x)) +#define TSI721_MSIX_SR2PC_INT 66 +#define TSI721_MSIX_PC2SR_INT 67 +#define TSI721_MSIX_SRIO_MAC_INT 68 +#define TSI721_MSIX_I2C_INT 69 + +/* MSI-X vector and init table entry indexes */ +enum tsi721_msix_vect { + TSI721_VECT_IDB, + TSI721_VECT_PWRX, /* PW_RX is part of SRIO MAC Interrupt reporting */ + TSI721_VECT_OMB0_DONE, + TSI721_VECT_OMB1_DONE, + TSI721_VECT_OMB2_DONE, + TSI721_VECT_OMB3_DONE, + TSI721_VECT_OMB0_INT, + TSI721_VECT_OMB1_INT, + TSI721_VECT_OMB2_INT, + TSI721_VECT_OMB3_INT, + TSI721_VECT_IMB0_RCV, + TSI721_VECT_IMB1_RCV, + TSI721_VECT_IMB2_RCV, + TSI721_VECT_IMB3_RCV, + TSI721_VECT_IMB0_INT, + TSI721_VECT_IMB1_INT, + TSI721_VECT_IMB2_INT, + TSI721_VECT_IMB3_INT, + TSI721_VECT_MAX +}; + +#define IRQ_DEVICE_NAME_MAX 64 + +struct msix_irq { + u16 vector; + char irq_name[IRQ_DEVICE_NAME_MAX]; +}; + +struct tsi721_device { + struct pci_dev *pdev; + struct rio_mport *mport; + u32 flags; + void __iomem *regs; + struct msix_irq msix[TSI721_VECT_MAX]; + + /* Doorbells */ + void __iomem *odb_base; + void *idb_base; + dma_addr_t idb_dma; + struct work_struct idb_work; + u32 db_discard_count; + + /* Inbound Port-Write */ + struct work_struct pw_work; + struct kfifo pw_fifo; + spinlock_t pw_fifo_lock; + u32 pw_discard_count; + + /* BDMA Engine */ + struct dma_chan bdma[TSI721_DMA_CHNUM]; + + /* Inbound Messaging */ + int imsg_init[TSI721_IMSG_CHNUM]; + struct tsi721_imsg_ring imsg_ring[TSI721_IMSG_CHNUM]; + + /* Outbound Messaging */ + int omsg_init[TSI721_OMSG_CHNUM]; + struct tsi721_omsg_ring omsg_ring[TSI721_OMSG_CHNUM]; +}; + +#endif diff --git a/drivers/rapidio/rio-scan.c 
b/drivers/rapidio/rio-scan.c index ebe77dd87daf..0914f49b0a53 100644 --- a/drivers/rapidio/rio-scan.c +++ b/drivers/rapidio/rio-scan.c @@ -923,7 +923,7 @@ static int __devinit rio_enum_peer(struct rio_net *net, struct rio_mport *port, * rio_enum_complete- Tests if enumeration of a network is complete * @port: Master port to send transaction * - * Tests the Component Tag CSR for non-zero value (enumeration + * Tests the PGCCSR discovered bit for non-zero value (enumeration * complete flag). Return %1 if enumeration is complete or %0 if * enumeration is incomplete. */ @@ -933,7 +933,7 @@ static int rio_enum_complete(struct rio_mport *port) rio_local_read_config_32(port, port->phys_efptr + RIO_PORT_GEN_CTL_CSR, ®val); - return (regval & RIO_PORT_GEN_MASTER) ? 1 : 0; + return (regval & RIO_PORT_GEN_DISCOVERED) ? 1 : 0; } /** diff --git a/drivers/rtc/class.c b/drivers/rtc/class.c index 01a7df5317c1..e8326f26fa2f 100644 --- a/drivers/rtc/class.c +++ b/drivers/rtc/class.c @@ -21,16 +21,13 @@ #include "rtc-core.h" -static DEFINE_IDR(rtc_idr); -static DEFINE_MUTEX(idr_lock); +static DEFINE_IDA(rtc_ida); struct class *rtc_class; static void rtc_device_release(struct device *dev) { struct rtc_device *rtc = to_rtc_device(dev); - mutex_lock(&idr_lock); - idr_remove(&rtc_idr, rtc->id); - mutex_unlock(&idr_lock); + ida_simple_remove(&rtc_ida, rtc->id); kfree(rtc); } @@ -146,25 +143,16 @@ struct rtc_device *rtc_device_register(const char *name, struct device *dev, struct rtc_wkalrm alrm; int id, err; - if (idr_pre_get(&rtc_idr, GFP_KERNEL) == 0) { - err = -ENOMEM; + id = ida_simple_get(&rtc_ida, 0, 0, GFP_KERNEL); + if (id < 0) { + err = id; goto exit; } - - mutex_lock(&idr_lock); - err = idr_get_new(&rtc_idr, NULL, &id); - mutex_unlock(&idr_lock); - - if (err < 0) - goto exit; - - id = id & MAX_ID_MASK; - rtc = kzalloc(sizeof(struct rtc_device), GFP_KERNEL); if (rtc == NULL) { err = -ENOMEM; - goto exit_idr; + goto exit_ida; } rtc->id = id; @@ -222,10 +210,8 @@ struct rtc_device *rtc_device_register(const char *name, struct device *dev, exit_kfree: kfree(rtc); -exit_idr: - mutex_lock(&idr_lock); - idr_remove(&rtc_idr, id); - mutex_unlock(&idr_lock); +exit_ida: + ida_simple_remove(&rtc_ida, id); exit: dev_err(dev, "rtc core: unable to register %s, err = %d\n", @@ -276,7 +262,7 @@ static void __exit rtc_exit(void) { rtc_dev_exit(); class_destroy(rtc_class); - idr_destroy(&rtc_idr); + ida_destroy(&rtc_ida); } subsys_initcall(rtc_init); diff --git a/drivers/scsi/aacraid/commctrl.c b/drivers/scsi/aacraid/commctrl.c index 8a0b33033177..0bd38da4ada0 100644 --- a/drivers/scsi/aacraid/commctrl.c +++ b/drivers/scsi/aacraid/commctrl.c @@ -650,6 +650,7 @@ static int aac_send_raw_srb(struct aac_dev* dev, void __user * arg) AAC_OPT_NEW_COMM) ? 
(dev->scsi_host_ptr->max_sectors << 9) : 65536)) { + kfree(usg); rcode = -EINVAL; goto cleanup; } diff --git a/drivers/scsi/megaraid.c b/drivers/scsi/megaraid.c index 5c1776406c96..1408ebb49111 100644 --- a/drivers/scsi/megaraid.c +++ b/drivers/scsi/megaraid.c @@ -310,15 +310,15 @@ mega_query_adapter(adapter_t *adapter) if (adapter->product_info.subsysvid == HP_SUBSYS_VID) { sprintf (adapter->fw_version, "%c%d%d.%d%d", adapter->product_info.fw_version[2], - adapter->product_info.fw_version[1] >> 8, + adapter->product_info.fw_version[1] >> 4, adapter->product_info.fw_version[1] & 0x0f, - adapter->product_info.fw_version[0] >> 8, + adapter->product_info.fw_version[0] >> 4, adapter->product_info.fw_version[0] & 0x0f); sprintf (adapter->bios_version, "%c%d%d.%d%d", adapter->product_info.bios_version[2], - adapter->product_info.bios_version[1] >> 8, + adapter->product_info.bios_version[1] >> 4, adapter->product_info.bios_version[1] & 0x0f, - adapter->product_info.bios_version[0] >> 8, + adapter->product_info.bios_version[0] >> 4, adapter->product_info.bios_version[0] & 0x0f); } else { memcpy(adapter->fw_version, diff --git a/drivers/scsi/osd/osd_uld.c b/drivers/scsi/osd/osd_uld.c index b31a8e3841d7..fa849bdd230f 100644 --- a/drivers/scsi/osd/osd_uld.c +++ b/drivers/scsi/osd/osd_uld.c @@ -387,7 +387,7 @@ static void __remove(struct device *dev) if (oud->disk) put_disk(oud->disk); - ida_remove(&osd_minor_ida, oud->minor); + ida_simple_remove(&osd_minor_ida, oud->minor); kfree(oud); } @@ -403,18 +403,12 @@ static int osd_probe(struct device *dev) if (scsi_device->type != TYPE_OSD) return -ENODEV; - do { - if (!ida_pre_get(&osd_minor_ida, GFP_KERNEL)) - return -ENODEV; - - error = ida_get_new(&osd_minor_ida, &minor); - } while (error == -EAGAIN); - - if (error) - return error; - if (minor >= SCSI_OSD_MAX_MINOR) { - error = -EBUSY; - goto err_retract_minor; + minor = ida_simple_get(&osd_minor_ida, 0, + SCSI_OSD_MAX_MINOR, GFP_KERNEL); + if (minor < 0) { + if (minor == -ENOSPC) + return -EBUSY; + return minor; } error = -ENOMEM; @@ -491,7 +485,7 @@ err_free_osd: dev_set_drvdata(dev, NULL); kfree(oud); err_retract_minor: - ida_remove(&osd_minor_ida, minor); + ida_simple_remove(&osd_minor_ida, minor); return error; } diff --git a/drivers/scsi/sd.c b/drivers/scsi/sd.c index a7942e5c8be8..4e9a42642d09 100644 --- a/drivers/scsi/sd.c +++ b/drivers/scsi/sd.c @@ -111,7 +111,6 @@ static void scsi_disk_release(struct device *cdev); static void sd_print_sense_hdr(struct scsi_disk *, struct scsi_sense_hdr *); static void sd_print_result(struct scsi_disk *, int); -static DEFINE_SPINLOCK(sd_index_lock); static DEFINE_IDA(sd_index_ida); /* This semaphore is used to mediate the 0->1 reference get in the @@ -2581,22 +2580,15 @@ static int sd_probe(struct device *dev) if (!gd) goto out_free; - do { - if (!ida_pre_get(&sd_index_ida, GFP_KERNEL)) - goto out_put; - - spin_lock(&sd_index_lock); - error = ida_get_new(&sd_index_ida, &index); - spin_unlock(&sd_index_lock); - } while (error == -EAGAIN); - - if (error) + index = ida_simple_get(&sd_index_ida, 0, SD_MAX_DISKS, GFP_KERNEL); + if (index < 0) { + error = index; + if (error == -ENOSPC) { + sdev_printk(KERN_WARNING, sdp, + "SCSI disk (sd) name space exhausted.\n"); + error = -ENODEV; + } goto out_put; - - if (index >= SD_MAX_DISKS) { - error = -ENODEV; - sdev_printk(KERN_WARNING, sdp, "SCSI disk (sd) name space exhausted.\n"); - goto out_free_index; } error = sd_format_disk_name("sd", index, gd->disk_name, DISK_NAME_LEN); @@ -2634,9 +2626,7 @@ static int 
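/*
 * [Editor's aside, not part of the patch]  The rtc, osd and sd hunks in
 * this section all replace the old idr_pre_get()/idr_get_new() retry
 * loop (plus its private lock) with the ida_simple_*() helpers, which
 * handle locking and retrying internally.  A minimal sketch of the
 * resulting pattern, with purely illustrative names:
 */
#include <linux/idr.h>

static DEFINE_IDA(example_ida);

static int example_register(void)
{
	/* second/third arguments are the allowed range; 0, 0 means "any id" */
	int id = ida_simple_get(&example_ida, 0, 0, GFP_KERNEL);

	if (id < 0)
		return id;	/* -ENOMEM, or -ENOSPC if the range is full */

	/* ... register the device under this id ... */
	return 0;
}

static void example_unregister(int id)
{
	ida_simple_remove(&example_ida, id);
}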
sd_probe(struct device *dev) return 0; out_free_index: - spin_lock(&sd_index_lock); - ida_remove(&sd_index_ida, index); - spin_unlock(&sd_index_lock); + ida_simple_remove(&sd_index_ida, index); out_put: put_disk(gd); out_free: @@ -2692,9 +2682,7 @@ static void scsi_disk_release(struct device *dev) struct scsi_disk *sdkp = to_scsi_disk(dev); struct gendisk *disk = sdkp->disk; - spin_lock(&sd_index_lock); - ida_remove(&sd_index_ida, sdkp->index); - spin_unlock(&sd_index_lock); + ida_simple_remove(&sd_index_ida, sdkp->index); disk->private_data = NULL; put_disk(disk); diff --git a/drivers/scsi/sg.c b/drivers/scsi/sg.c index 909ed9ed24c0..ce76d0c40aaf 100644 --- a/drivers/scsi/sg.c +++ b/drivers/scsi/sg.c @@ -2367,16 +2367,15 @@ static ssize_t sg_proc_write_adio(struct file *filp, const char __user *buffer, size_t count, loff_t *off) { - int num; - char buff[11]; + int err; + unsigned long num; if (!capable(CAP_SYS_ADMIN) || !capable(CAP_SYS_RAWIO)) return -EACCES; - num = (count < 10) ? count : 10; - if (copy_from_user(buff, buffer, num)) - return -EFAULT; - buff[num] = '\0'; - sg_allow_dio = simple_strtoul(buff, NULL, 10) ? 1 : 0; + err = kstrtoul_from_user(buffer, count, 0, &num); + if (err) + return err; + sg_allow_dio = num ? 1 : 0; return count; } @@ -2389,17 +2388,15 @@ static ssize_t sg_proc_write_dressz(struct file *filp, const char __user *buffer, size_t count, loff_t *off) { - int num; + int err; unsigned long k = ULONG_MAX; - char buff[11]; if (!capable(CAP_SYS_ADMIN) || !capable(CAP_SYS_RAWIO)) return -EACCES; - num = (count < 10) ? count : 10; - if (copy_from_user(buff, buffer, num)) - return -EFAULT; - buff[num] = '\0'; - k = simple_strtoul(buff, NULL, 10); + + err = kstrtoul_from_user(buffer, count, 0, &k); + if (err) + return err; if (k <= 1048576) { /* limit "big buff" to 1 MB */ sg_big_buff = k; return count; diff --git a/drivers/staging/rts5139/ms.c b/drivers/staging/rts5139/ms.c index b0e9071c8e52..badc35cc9140 100644 --- a/drivers/staging/rts5139/ms.c +++ b/drivers/staging/rts5139/ms.c @@ -25,6 +25,7 @@ #include <linux/blkdev.h> #include <linux/kthread.h> +#include <linux/vmalloc.h> #include <linux/sched.h> #include <linux/slab.h> #include <linux/vmalloc.h> diff --git a/drivers/staging/rts5139/rts51x_scsi.c b/drivers/staging/rts5139/rts51x_scsi.c index 3b32f9e6e4f0..4b8279cd7b9a 100644 --- a/drivers/staging/rts5139/rts51x_scsi.c +++ b/drivers/staging/rts5139/rts51x_scsi.c @@ -26,6 +26,7 @@ #include <linux/blkdev.h> #include <linux/kthread.h> #include <linux/sched.h> +#include <linux/vmalloc.h> #include <linux/slab.h> #include <linux/vmalloc.h> diff --git a/drivers/staging/rts5139/xd.c b/drivers/staging/rts5139/xd.c index 5820605d1806..6cd53c47e04d 100644 --- a/drivers/staging/rts5139/xd.c +++ b/drivers/staging/rts5139/xd.c @@ -25,6 +25,7 @@ #include <linux/blkdev.h> #include <linux/kthread.h> +#include <linux/vmalloc.h> #include <linux/sched.h> #include <linux/vmalloc.h> diff --git a/drivers/video/fbmem.c b/drivers/video/fbmem.c index 5aac00eb1830..ee2c060b034d 100644 --- a/drivers/video/fbmem.c +++ b/drivers/video/fbmem.c @@ -1076,14 +1076,16 @@ static long do_fb_ioctl(struct fb_info *info, unsigned int cmd, case FBIOPUT_VSCREENINFO: if (copy_from_user(&var, argp, sizeof(var))) return -EFAULT; - if (!lock_fb_info(info)) - return -ENODEV; console_lock(); + if (!lock_fb_info(info)) { + console_unlock(); + return -ENODEV; + } info->flags |= FBINFO_MISC_USEREVENT; ret = fb_set_var(info, &var); info->flags &= ~FBINFO_MISC_USEREVENT; - console_unlock(); unlock_fb_info(info); + 
console_unlock(); if (!ret && copy_to_user(argp, &var, sizeof(var))) ret = -EFAULT; break; @@ -1112,12 +1114,14 @@ static long do_fb_ioctl(struct fb_info *info, unsigned int cmd, case FBIOPAN_DISPLAY: if (copy_from_user(&var, argp, sizeof(var))) return -EFAULT; - if (!lock_fb_info(info)) - return -ENODEV; console_lock(); + if (!lock_fb_info(info)) { + console_unlock(); + return -ENODEV; + } ret = fb_pan_display(info, &var); - console_unlock(); unlock_fb_info(info); + console_unlock(); if (ret == 0 && copy_to_user(argp, &var, sizeof(var))) return -EFAULT; break; @@ -1159,14 +1163,16 @@ static long do_fb_ioctl(struct fb_info *info, unsigned int cmd, unlock_fb_info(info); break; case FBIOBLANK: - if (!lock_fb_info(info)) - return -ENODEV; console_lock(); + if (!lock_fb_info(info)) { + console_unlock(); + return -ENODEV; + } info->flags |= FBINFO_MISC_USEREVENT; ret = fb_blank(info, arg); info->flags &= ~FBINFO_MISC_USEREVENT; - console_unlock(); unlock_fb_info(info); + console_unlock(); break; default: if (!lock_fb_info(info)) diff --git a/drivers/w1/slaves/w1_ds2760.c b/drivers/w1/slaves/w1_ds2760.c index 483d45180911..5754c9a4f58b 100644 --- a/drivers/w1/slaves/w1_ds2760.c +++ b/drivers/w1/slaves/w1_ds2760.c @@ -114,43 +114,7 @@ static struct bin_attribute w1_ds2760_bin_attr = { .read = w1_ds2760_read_bin, }; -static DEFINE_IDR(bat_idr); -static DEFINE_MUTEX(bat_idr_lock); - -static int new_bat_id(void) -{ - int ret; - - while (1) { - int id; - - ret = idr_pre_get(&bat_idr, GFP_KERNEL); - if (ret == 0) - return -ENOMEM; - - mutex_lock(&bat_idr_lock); - ret = idr_get_new(&bat_idr, NULL, &id); - mutex_unlock(&bat_idr_lock); - - if (ret == 0) { - ret = id & MAX_ID_MASK; - break; - } else if (ret == -EAGAIN) { - continue; - } else { - break; - } - } - - return ret; -} - -static void release_bat_id(int id) -{ - mutex_lock(&bat_idr_lock); - idr_remove(&bat_idr, id); - mutex_unlock(&bat_idr_lock); -} +static DEFINE_IDA(bat_ida); static int w1_ds2760_add_slave(struct w1_slave *sl) { @@ -158,7 +122,7 @@ static int w1_ds2760_add_slave(struct w1_slave *sl) int id; struct platform_device *pdev; - id = new_bat_id(); + id = ida_simple_get(&bat_ida, 0, 0, GFP_KERNEL); if (id < 0) { ret = id; goto noid; @@ -187,7 +151,7 @@ bin_attr_failed: pdev_add_failed: platform_device_unregister(pdev); pdev_alloc_failed: - release_bat_id(id); + ida_simple_remove(&bat_ida, id); noid: success: return ret; @@ -199,7 +163,7 @@ static void w1_ds2760_remove_slave(struct w1_slave *sl) int id = pdev->id; platform_device_unregister(pdev); - release_bat_id(id); + ida_simple_remove(&bat_ida, id); sysfs_remove_bin_file(&sl->dev.kobj, &w1_ds2760_bin_attr); } @@ -217,14 +181,14 @@ static int __init w1_ds2760_init(void) { printk(KERN_INFO "1-Wire driver for the DS2760 battery monitor " " chip - (c) 2004-2005, Szabolcs Gyurko\n"); - idr_init(&bat_idr); + ida_init(&bat_ida); return w1_register_family(&w1_ds2760_family); } static void __exit w1_ds2760_exit(void) { w1_unregister_family(&w1_ds2760_family); - idr_destroy(&bat_idr); + ida_destroy(&bat_ida); } EXPORT_SYMBOL(w1_ds2760_read); diff --git a/drivers/w1/slaves/w1_ds2780.c b/drivers/w1/slaves/w1_ds2780.c index 274c8f38303f..39f78c0b143c 100644 --- a/drivers/w1/slaves/w1_ds2780.c +++ b/drivers/w1/slaves/w1_ds2780.c @@ -26,20 +26,14 @@ #include "../w1_family.h" #include "w1_ds2780.h" -int w1_ds2780_io(struct device *dev, char *buf, int addr, size_t count, - int io) +static int w1_ds2780_do_io(struct device *dev, char *buf, int addr, + size_t count, int io) { struct w1_slave *sl = 
container_of(dev, struct w1_slave, dev); - if (!dev) - return -ENODEV; + if (addr > DS2780_DATA_SIZE || addr < 0) + return 0; - mutex_lock(&sl->master->mutex); - - if (addr > DS2780_DATA_SIZE || addr < 0) { - count = 0; - goto out; - } count = min_t(int, count, DS2780_DATA_SIZE - addr); if (w1_reset_select_slave(sl) == 0) { @@ -47,7 +41,6 @@ int w1_ds2780_io(struct device *dev, char *buf, int addr, size_t count, w1_write_8(sl->master, W1_DS2780_WRITE_DATA); w1_write_8(sl->master, addr); w1_write_block(sl->master, buf, count); - /* XXX w1_write_block returns void, not n_written */ } else { w1_write_8(sl->master, W1_DS2780_READ_DATA); w1_write_8(sl->master, addr); @@ -55,13 +48,42 @@ int w1_ds2780_io(struct device *dev, char *buf, int addr, size_t count, } } -out: + return count; +} + +int w1_ds2780_io(struct device *dev, char *buf, int addr, size_t count, + int io) +{ + struct w1_slave *sl = container_of(dev, struct w1_slave, dev); + int ret; + + if (!dev) + return -ENODEV; + + mutex_lock(&sl->master->mutex); + + ret = w1_ds2780_do_io(dev, buf, addr, count, io); + mutex_unlock(&sl->master->mutex); - return count; + return ret; } EXPORT_SYMBOL(w1_ds2780_io); +int w1_ds2780_io_nolock(struct device *dev, char *buf, int addr, size_t count, + int io) +{ + int ret; + + if (!dev) + return -ENODEV; + + ret = w1_ds2780_do_io(dev, buf, addr, count, io); + + return ret; +} +EXPORT_SYMBOL(w1_ds2780_io_nolock); + int w1_ds2780_eeprom_cmd(struct device *dev, int addr, int cmd) { struct w1_slave *sl = container_of(dev, struct w1_slave, dev); @@ -99,43 +121,7 @@ static struct bin_attribute w1_ds2780_bin_attr = { .read = w1_ds2780_read_bin, }; -static DEFINE_IDR(bat_idr); -static DEFINE_MUTEX(bat_idr_lock); - -static int new_bat_id(void) -{ - int ret; - - while (1) { - int id; - - ret = idr_pre_get(&bat_idr, GFP_KERNEL); - if (ret == 0) - return -ENOMEM; - - mutex_lock(&bat_idr_lock); - ret = idr_get_new(&bat_idr, NULL, &id); - mutex_unlock(&bat_idr_lock); - - if (ret == 0) { - ret = id & MAX_ID_MASK; - break; - } else if (ret == -EAGAIN) { - continue; - } else { - break; - } - } - - return ret; -} - -static void release_bat_id(int id) -{ - mutex_lock(&bat_idr_lock); - idr_remove(&bat_idr, id); - mutex_unlock(&bat_idr_lock); -} +static DEFINE_IDA(bat_ida); static int w1_ds2780_add_slave(struct w1_slave *sl) { @@ -143,7 +129,7 @@ static int w1_ds2780_add_slave(struct w1_slave *sl) int id; struct platform_device *pdev; - id = new_bat_id(); + id = ida_simple_get(&bat_ida, 0, 0, GFP_KERNEL); if (id < 0) { ret = id; goto noid; @@ -172,7 +158,7 @@ bin_attr_failed: pdev_add_failed: platform_device_unregister(pdev); pdev_alloc_failed: - release_bat_id(id); + ida_simple_remove(&bat_ida, id); noid: return ret; } @@ -183,7 +169,7 @@ static void w1_ds2780_remove_slave(struct w1_slave *sl) int id = pdev->id; platform_device_unregister(pdev); - release_bat_id(id); + ida_simple_remove(&bat_ida, id); sysfs_remove_bin_file(&sl->dev.kobj, &w1_ds2780_bin_attr); } @@ -199,14 +185,14 @@ static struct w1_family w1_ds2780_family = { static int __init w1_ds2780_init(void) { - idr_init(&bat_idr); + ida_init(&bat_ida); return w1_register_family(&w1_ds2780_family); } static void __exit w1_ds2780_exit(void) { w1_unregister_family(&w1_ds2780_family); - idr_destroy(&bat_idr); + ida_destroy(&bat_ida); } module_init(w1_ds2780_init); diff --git a/drivers/w1/slaves/w1_ds2780.h b/drivers/w1/slaves/w1_ds2780.h index a1fba79eb1b5..737379365021 100644 --- a/drivers/w1/slaves/w1_ds2780.h +++ b/drivers/w1/slaves/w1_ds2780.h @@ -124,6 +124,8 @@ 
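/*
 * [Editor's aside, not part of the patch]  The w1_ds2780 rework above
 * moves the actual register access into w1_ds2780_do_io(): the existing
 * w1_ds2780_io() still takes sl->master->mutex itself, while the new
 * w1_ds2780_io_nolock() is for callers that already hold that mutex and
 * want to issue several transfers atomically.  A hedged sketch of such a
 * caller (the address 0, length 2 and function name are illustrative):
 */
static int example_locked_read(struct device *dev, char *buf)
{
	struct w1_slave *sl = container_of(dev, struct w1_slave, dev);
	int ret;

	mutex_lock(&sl->master->mutex);
	ret = w1_ds2780_io_nolock(dev, buf, 0, 2, 0);	/* io == 0 -> read */
	mutex_unlock(&sl->master->mutex);

	return ret;	/* bytes transferred, or -ENODEV */
}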
extern int w1_ds2780_io(struct device *dev, char *buf, int addr, size_t count, int io); +extern int w1_ds2780_io_nolock(struct device *dev, char *buf, int addr, + size_t count, int io); extern int w1_ds2780_eeprom_cmd(struct device *dev, int addr, int cmd); #endif /* !_W1_DS2780_H */ diff --git a/fs/Kconfig b/fs/Kconfig index 9fe0b349f4cd..5f4c45d4aa10 100644 --- a/fs/Kconfig +++ b/fs/Kconfig @@ -109,7 +109,7 @@ source "fs/proc/Kconfig" source "fs/sysfs/Kconfig" config TMPFS - bool "Virtual memory file system support (former shm fs)" + bool "Tmpfs virtual memory file system support (former shm fs)" depends on SHMEM help Tmpfs is a file system which keeps all files in virtual memory. @@ -1387,13 +1387,13 @@ static ssize_t aio_setup_vectored_rw(int type, struct kiocb *kiocb, bool compat) ret = compat_rw_copy_check_uvector(type, (struct compat_iovec __user *)kiocb->ki_buf, kiocb->ki_nbytes, 1, &kiocb->ki_inline_vec, - &kiocb->ki_iovec); + &kiocb->ki_iovec, 1); else #endif ret = rw_copy_check_uvector(type, (struct iovec __user *)kiocb->ki_buf, kiocb->ki_nbytes, 1, &kiocb->ki_inline_vec, - &kiocb->ki_iovec); + &kiocb->ki_iovec, 1); if (ret < 0) goto out; diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c index 0ccc7438ad34..c46d7ed18d90 100644 --- a/fs/btrfs/inode.c +++ b/fs/btrfs/inode.c @@ -1863,7 +1863,8 @@ static int btrfs_io_failed_hook(struct bio *failed_bio, read_lock(&em_tree->lock); em = lookup_extent_mapping(em_tree, start, failrec->len); - if (em->start > start || em->start + em->len < start) { + if (em && !IS_ERR(em) && (em->start > start || + em->start + em->len < start)) { free_extent_map(em); em = NULL; } diff --git a/fs/compat.c b/fs/compat.c index 58b1da459893..c76adb46b0ce 100644 --- a/fs/compat.c +++ b/fs/compat.c @@ -550,7 +550,7 @@ out: ssize_t compat_rw_copy_check_uvector(int type, const struct compat_iovec __user *uvector, unsigned long nr_segs, unsigned long fast_segs, struct iovec *fast_pointer, - struct iovec **ret_pointer) + struct iovec **ret_pointer, int check_access) { compat_ssize_t tot_len; struct iovec *iov = *ret_pointer = fast_pointer; @@ -597,7 +597,8 @@ ssize_t compat_rw_copy_check_uvector(int type, } if (len < 0) /* size_t not fitting in compat_ssize_t .. */ goto out; - if (!access_ok(vrfy_dir(type), compat_ptr(buf), len)) { + if (check_access && + !access_ok(vrfy_dir(type), compat_ptr(buf), len)) { ret = -EFAULT; goto out; } @@ -1111,7 +1112,7 @@ static ssize_t compat_do_readv_writev(int type, struct file *file, goto out; tot_len = compat_rw_copy_check_uvector(type, uvector, nr_segs, - UIO_FASTIOV, iovstack, &iov); + UIO_FASTIOV, iovstack, &iov, 1); if (tot_len == 0) { ret = 0; goto out; diff --git a/fs/eventpoll.c b/fs/eventpoll.c index 9026fc91fe3b..828e750af23a 100644 --- a/fs/eventpoll.c +++ b/fs/eventpoll.c @@ -70,6 +70,15 @@ * simultaneous inserts (A into B and B into A) from racing and * constructing a cycle without either insert observing that it is * going to. + * It is necessary to acquire multiple "ep->mtx"es at once in the + * case when one epoll fd is added to another. In this case, we + * always acquire the locks in the order of nesting (i.e. after + * epoll_ctl(e1, EPOLL_CTL_ADD, e2), e1->mtx will always be acquired + * before e2->mtx). Since we disallow cycles of epoll file + * descriptors, this ensures that the mutexes are well-ordered. In + * order to communicate this nesting to lockdep, when walking a tree + * of epoll file descriptors, we use the current recursion depth as + * the lockdep subkey. 
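/*
 * [Editor's aside, not part of the patch]  Concretely, the depth that is
 * now threaded through ep_scan_ready_list() and ep_loop_check_proc()
 * becomes the lockdep subclass, so two different ep->mtx mutexes taken
 * in nesting order no longer trigger a false deadlock report.  Roughly,
 * for an epoll fd e1 that contains another epoll fd e2:
 */
static void example_nested_epoll_lock(struct eventpoll *e1_ep,
				      struct eventpoll *e2_ep)
{
	mutex_lock_nested(&e1_ep->mtx, 0);	/* outer fd, subclass 0  */
	mutex_lock_nested(&e2_ep->mtx, 1);	/* nested fd, subclass 1 */
	/* ... walk e2's ready list on behalf of e1 ... */
	mutex_unlock(&e2_ep->mtx);
	mutex_unlock(&e1_ep->mtx);
}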
* It is possible to drop the "ep->mtx" and to use the global * mutex "epmutex" (together with "ep->lock") to have it working, * but having "ep->mtx" will make the interface more scalable. @@ -464,13 +473,15 @@ static void ep_unregister_pollwait(struct eventpoll *ep, struct epitem *epi) * @ep: Pointer to the epoll private data structure. * @sproc: Pointer to the scan callback. * @priv: Private opaque data passed to the @sproc callback. + * @depth: The current depth of recursive f_op->poll calls. * * Returns: The same integer error code returned by the @sproc callback. */ static int ep_scan_ready_list(struct eventpoll *ep, int (*sproc)(struct eventpoll *, struct list_head *, void *), - void *priv) + void *priv, + int depth) { int error, pwake = 0; unsigned long flags; @@ -481,7 +492,7 @@ static int ep_scan_ready_list(struct eventpoll *ep, * We need to lock this because we could be hit by * eventpoll_release_file() and epoll_ctl(). */ - mutex_lock(&ep->mtx); + mutex_lock_nested(&ep->mtx, depth); /* * Steal the ready list, and re-init the original one to the @@ -670,7 +681,7 @@ static int ep_read_events_proc(struct eventpoll *ep, struct list_head *head, static int ep_poll_readyevents_proc(void *priv, void *cookie, int call_nests) { - return ep_scan_ready_list(priv, ep_read_events_proc, NULL); + return ep_scan_ready_list(priv, ep_read_events_proc, NULL, call_nests + 1); } static unsigned int ep_eventpoll_poll(struct file *file, poll_table *wait) @@ -737,7 +748,7 @@ void eventpoll_release_file(struct file *file) ep = epi->ep; list_del_init(&epi->fllink); - mutex_lock(&ep->mtx); + mutex_lock_nested(&ep->mtx, 0); ep_remove(ep, epi); mutex_unlock(&ep->mtx); } @@ -1134,7 +1145,7 @@ static int ep_send_events(struct eventpoll *ep, esed.maxevents = maxevents; esed.events = events; - return ep_scan_ready_list(ep, ep_send_events_proc, &esed); + return ep_scan_ready_list(ep, ep_send_events_proc, &esed, 0); } static inline struct timespec ep_set_mstimeout(long ms) @@ -1267,7 +1278,7 @@ static int ep_loop_check_proc(void *priv, void *cookie, int call_nests) struct rb_node *rbp; struct epitem *epi; - mutex_lock(&ep->mtx); + mutex_lock_nested(&ep->mtx, call_nests + 1); for (rbp = rb_first(&ep->rbr); rbp; rbp = rb_next(rbp)) { epi = rb_entry(rbp, struct epitem, rbn); if (unlikely(is_file_epoll(epi->ffd.file))) { @@ -1409,7 +1420,7 @@ SYSCALL_DEFINE4(epoll_ctl, int, epfd, int, op, int, fd, } - mutex_lock(&ep->mtx); + mutex_lock_nested(&ep->mtx, 0); /* * Try to lookup the file inside our RB tree, Since we grabbed "mtx" diff --git a/fs/ext4/ext4.h b/fs/ext4/ext4.h index db761d593df3..a4171a972ae6 100644 --- a/fs/ext4/ext4.h +++ b/fs/ext4/ext4.h @@ -933,12 +933,13 @@ struct ext4_inode_info { #define test_opt2(sb, opt) (EXT4_SB(sb)->s_mount_opt2 & \ EXT4_MOUNT2_##opt) -#define ext4_set_bit __test_and_set_bit_le +#define ext4_test_and_set_bit __test_and_set_bit_le +#define ext4_set_bit __set_bit_le #define ext4_set_bit_atomic ext2_set_bit_atomic -#define ext4_clear_bit __test_and_clear_bit_le +#define ext4_test_and_clear_bit __test_and_clear_bit_le +#define ext4_clear_bit __clear_bit_le #define ext4_clear_bit_atomic ext2_clear_bit_atomic #define ext4_test_bit test_bit_le -#define ext4_find_first_zero_bit find_first_zero_bit_le #define ext4_find_next_zero_bit find_next_zero_bit_le #define ext4_find_next_bit find_next_bit_le diff --git a/fs/ext4/ialloc.c b/fs/ext4/ialloc.c index 9c63f273b550..fd96ea1facd3 100644 --- a/fs/ext4/ialloc.c +++ b/fs/ext4/ialloc.c @@ -252,7 +252,7 @@ void ext4_free_inode(handle_t *handle, 
struct inode *inode) fatal = ext4_journal_get_write_access(handle, bh2); } ext4_lock_group(sb, block_group); - cleared = ext4_clear_bit(bit, bitmap_bh->b_data); + cleared = ext4_test_and_clear_bit(bit, bitmap_bh->b_data); if (fatal || !cleared) { ext4_unlock_group(sb, block_group); goto out; @@ -729,7 +729,7 @@ static int ext4_claim_inode(struct super_block *sb, */ down_read(&grp->alloc_sem); ext4_lock_group(sb, group); - if (ext4_set_bit(ino, inode_bitmap_bh->b_data)) { + if (ext4_test_and_set_bit(ino, inode_bitmap_bh->b_data)) { /* not a free inode */ retval = 1; goto err_ret; diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c index 6f49843b3bfc..0655865816a5 100644 --- a/fs/ext4/inode.c +++ b/fs/ext4/inode.c @@ -1811,8 +1811,12 @@ static int ext4_writepage(struct page *page, * We don't want to do block allocation, so redirty * the page and return. We may reach here when we do * a journal commit via journal_submit_inode_data_buffers. - * We can also reach here via shrink_page_list + * We can also reach here via shrink_page_list but it + * should never be for direct reclaim so warn if that + * happens */ + WARN_ON_ONCE((current->flags & (PF_MEMALLOC|PF_KSWAPD)) == + PF_MEMALLOC); goto redirty_page; } if (commit_write) diff --git a/fs/logfs/logfs.h b/fs/logfs/logfs.h index 68e4fd275a02..1ba6e21e5956 100644 --- a/fs/logfs/logfs.h +++ b/fs/logfs/logfs.h @@ -619,7 +619,6 @@ static inline int logfs_buf_recover(struct logfs_area *area, u64 ofs, struct page *emergency_read_begin(struct address_space *mapping, pgoff_t index); void emergency_read_end(struct page *page); void logfs_crash_dump(struct super_block *sb); -void *memchr_inv(const void *s, int c, size_t n); int logfs_statfs(struct dentry *dentry, struct kstatfs *stats); int logfs_check_ds(struct logfs_disk_super *ds); int logfs_write_sb(struct super_block *sb); diff --git a/fs/logfs/super.c b/fs/logfs/super.c index 2e998b787261..50203fc5d4ec 100644 --- a/fs/logfs/super.c +++ b/fs/logfs/super.c @@ -92,28 +92,6 @@ void logfs_crash_dump(struct super_block *sb) } /* - * TODO: move to lib/string.c - */ -/** - * memchr_inv - Find a character in an area of memory. - * @s: The memory area - * @c: The byte to search for - * @n: The size of the area. - * - * returns the address of the first character other than @c, or %NULL - * if the whole buffer contains just @c. - */ -void *memchr_inv(const void *s, int c, size_t n) -{ - const unsigned char *p = s; - while (n-- != 0) - if ((unsigned char)c != *p++) - return (void *)(p - 1); - - return NULL; -} - -/* * FIXME: There should be a reserve for root, similar to ext2. 
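/*
 * [Editor's aside, not part of the patch]  memchr_inv() is being hoisted
 * out of logfs into lib/string.c; its declaration is added to
 * <linux/string.h> further down in this merge.  Typical use is checking
 * that a buffer contains nothing but a single byte value, for example:
 */
static bool example_region_is_zeroed(const void *buf, size_t len)
{
	/* memchr_inv() returns NULL when every byte equals the given value */
	return memchr_inv(buf, 0, len) == NULL;
}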
*/ int logfs_statfs(struct dentry *dentry, struct kstatfs *stats) diff --git a/fs/namei.c b/fs/namei.c index 2826db35dc25..e8b5ee05170c 100644 --- a/fs/namei.c +++ b/fs/namei.c @@ -137,7 +137,8 @@ static int do_getname(const char __user *filename, char *page) return retval; } -static char *getname_flags(const char __user * filename, int flags) +static char *getname_flags_empty(const char __user * filename, + int flags, int *empty) { char *tmp, *result; @@ -148,6 +149,8 @@ static char *getname_flags(const char __user * filename, int flags) result = tmp; if (retval < 0) { + if (retval == -ENOENT && empty) + *empty = 1; if (retval != -ENOENT || !(flags & LOOKUP_EMPTY)) { __putname(tmp); result = ERR_PTR(retval); @@ -160,7 +163,7 @@ static char *getname_flags(const char __user * filename, int flags) char *getname(const char __user * filename) { - return getname_flags(filename, 0); + return getname_flags_empty(filename, 0, 0); } #ifdef CONFIG_AUDITSYSCALL @@ -1807,11 +1810,11 @@ struct dentry *lookup_one_len(const char *name, struct dentry *base, int len) return __lookup_hash(&this, base, NULL); } -int user_path_at(int dfd, const char __user *name, unsigned flags, - struct path *path) +int user_path_at_empty(int dfd, const char __user *name, unsigned flags, + struct path *path, int *empty) { struct nameidata nd; - char *tmp = getname_flags(name, flags); + char *tmp = getname_flags_empty(name, flags, empty); int err = PTR_ERR(tmp); if (!IS_ERR(tmp)) { @@ -1825,6 +1828,12 @@ int user_path_at(int dfd, const char __user *name, unsigned flags, return err; } +int user_path_at(int dfd, const char __user *name, unsigned flags, + struct path *path) +{ + return user_path_at_empty(dfd, name, flags, path, 0); +} + static int user_path_parent(int dfd, const char __user *path, struct nameidata *nd, char **name) { diff --git a/fs/ocfs2/ocfs2.h b/fs/ocfs2/ocfs2.h index bcde467acca3..d355e6e36b36 100644 --- a/fs/ocfs2/ocfs2.h +++ b/fs/ocfs2/ocfs2.h @@ -849,5 +849,52 @@ static inline void _ocfs2_clear_bit(unsigned int bit, unsigned long *bitmap) #define ocfs2_test_bit test_bit_le #define ocfs2_find_next_zero_bit find_next_zero_bit_le #define ocfs2_find_next_bit find_next_bit_le + +static inline void *correct_addr_and_bit_unaligned(int *bit, void *addr) +{ +#if BITS_PER_LONG == 64 + *bit += ((unsigned long) addr & 7UL) << 3; + addr = (void *) ((unsigned long) addr & ~7UL); +#elif BITS_PER_LONG == 32 + *bit += ((unsigned long) addr & 3UL) << 3; + addr = (void *) ((unsigned long) addr & ~3UL); +#else +#error "how many bits you are?!" 
+#endif + return addr; +} + +static inline void ocfs2_set_bit_unaligned(int bit, void *bitmap) +{ + bitmap = correct_addr_and_bit_unaligned(&bit, bitmap); + ocfs2_set_bit(bit, bitmap); +} + +static inline void ocfs2_clear_bit_unaligned(int bit, void *bitmap) +{ + bitmap = correct_addr_and_bit_unaligned(&bit, bitmap); + ocfs2_clear_bit(bit, bitmap); +} + +static inline int ocfs2_test_bit_unaligned(int bit, void *bitmap) +{ + bitmap = correct_addr_and_bit_unaligned(&bit, bitmap); + return ocfs2_test_bit(bit, bitmap); +} + +static inline int ocfs2_find_next_zero_bit_unaligned(void *bitmap, int max, + int start) +{ + int fix = 0, ret, tmpmax; + bitmap = correct_addr_and_bit_unaligned(&fix, bitmap); + tmpmax = max + fix; + start += fix; + + ret = ocfs2_find_next_zero_bit(bitmap, tmpmax, start) - fix; + if (ret > max) + return max; + return ret; +} + #endif /* OCFS2_H */ diff --git a/fs/ocfs2/quota_local.c b/fs/ocfs2/quota_local.c index 942fd65bdad3..f100bf70a906 100644 --- a/fs/ocfs2/quota_local.c +++ b/fs/ocfs2/quota_local.c @@ -551,8 +551,8 @@ static int ocfs2_recover_local_quota_file(struct inode *lqinode, goto out_commit; } lock_buffer(qbh); - WARN_ON(!ocfs2_test_bit(bit, dchunk->dqc_bitmap)); - ocfs2_clear_bit(bit, dchunk->dqc_bitmap); + WARN_ON(!ocfs2_test_bit_unaligned(bit, dchunk->dqc_bitmap)); + ocfs2_clear_bit_unaligned(bit, dchunk->dqc_bitmap); le32_add_cpu(&dchunk->dqc_free, 1); unlock_buffer(qbh); ocfs2_journal_dirty(handle, qbh); @@ -949,7 +949,7 @@ static struct ocfs2_quota_chunk *ocfs2_find_free_entry(struct super_block *sb, * ol_quota_entries_per_block(sb); } - found = ocfs2_find_next_zero_bit(dchunk->dqc_bitmap, len, 0); + found = ocfs2_find_next_zero_bit_unaligned(dchunk->dqc_bitmap, len, 0); /* We failed? */ if (found == len) { mlog(ML_ERROR, "Did not find empty entry in chunk %d with %u" @@ -1213,7 +1213,7 @@ static void olq_alloc_dquot(struct buffer_head *bh, void *private) struct ocfs2_local_disk_chunk *dchunk; dchunk = (struct ocfs2_local_disk_chunk *)bh->b_data; - ocfs2_set_bit(*offset, dchunk->dqc_bitmap); + ocfs2_set_bit_unaligned(*offset, dchunk->dqc_bitmap); le32_add_cpu(&dchunk->dqc_free, -1); } @@ -1294,7 +1294,7 @@ int ocfs2_local_release_dquot(handle_t *handle, struct dquot *dquot) (od->dq_chunk->qc_headerbh->b_data); /* Mark structure as freed */ lock_buffer(od->dq_chunk->qc_headerbh); - ocfs2_clear_bit(offset, dchunk->dqc_bitmap); + ocfs2_clear_bit_unaligned(offset, dchunk->dqc_bitmap); le32_add_cpu(&dchunk->dqc_free, 1); unlock_buffer(od->dq_chunk->qc_headerbh); ocfs2_journal_dirty(handle, od->dq_chunk->qc_headerbh); diff --git a/fs/proc/proc_sysctl.c b/fs/proc/proc_sysctl.c index 1435ef5e94ca..928afdf91ffd 100644 --- a/fs/proc/proc_sysctl.c +++ b/fs/proc/proc_sysctl.c @@ -360,6 +360,7 @@ static const struct file_operations proc_sys_file_operations = { }; static const struct file_operations proc_sys_dir_file_operations = { + .read = generic_read_dir, .readdir = proc_sys_readdir, .llseek = generic_file_llseek, }; diff --git a/fs/proc/stat.c b/fs/proc/stat.c index 9758b654a1bc..42b274da92c3 100644 --- a/fs/proc/stat.c +++ b/fs/proc/stat.c @@ -10,6 +10,7 @@ #include <linux/time.h> #include <linux/irqnr.h> #include <asm/cputime.h> +#include <linux/tick.h> #ifndef arch_irq_stat_cpu #define arch_irq_stat_cpu(cpu) 0 @@ -21,6 +22,35 @@ #define arch_idle_time(cpu) 0 #endif +static cputime64_t get_idle_time(int cpu) +{ + u64 idle_time = get_cpu_idle_time_us(cpu, NULL); + cputime64_t idle; + + if (idle_time == -1ULL) { + /* !NO_HZ so we can rely on cpustat.idle */ + 
idle = kstat_cpu(cpu).cpustat.idle; + idle = cputime64_add(idle, arch_idle_time(cpu)); + } else + idle = usecs_to_cputime(idle_time); + + return idle; +} + +static cputime64_t get_iowait_time(int cpu) +{ + u64 iowait_time = get_cpu_iowait_time_us(cpu, NULL); + cputime64_t iowait; + + if (iowait_time == -1ULL) + /* !NO_HZ so we can rely on cpustat.iowait */ + iowait = kstat_cpu(cpu).cpustat.iowait; + else + iowait = usecs_to_cputime(iowait_time); + + return iowait; +} + static int show_stat(struct seq_file *p, void *v) { int i, j; @@ -42,9 +72,8 @@ static int show_stat(struct seq_file *p, void *v) user = cputime64_add(user, kstat_cpu(i).cpustat.user); nice = cputime64_add(nice, kstat_cpu(i).cpustat.nice); system = cputime64_add(system, kstat_cpu(i).cpustat.system); - idle = cputime64_add(idle, kstat_cpu(i).cpustat.idle); - idle = cputime64_add(idle, arch_idle_time(i)); - iowait = cputime64_add(iowait, kstat_cpu(i).cpustat.iowait); + idle = cputime64_add(idle, get_idle_time(i)); + iowait = cputime64_add(iowait, get_iowait_time(i)); irq = cputime64_add(irq, kstat_cpu(i).cpustat.irq); softirq = cputime64_add(softirq, kstat_cpu(i).cpustat.softirq); steal = cputime64_add(steal, kstat_cpu(i).cpustat.steal); @@ -76,14 +105,12 @@ static int show_stat(struct seq_file *p, void *v) (unsigned long long)cputime64_to_clock_t(guest), (unsigned long long)cputime64_to_clock_t(guest_nice)); for_each_online_cpu(i) { - /* Copy values here to work around gcc-2.95.3, gcc-2.96 */ user = kstat_cpu(i).cpustat.user; nice = kstat_cpu(i).cpustat.nice; system = kstat_cpu(i).cpustat.system; - idle = kstat_cpu(i).cpustat.idle; - idle = cputime64_add(idle, arch_idle_time(i)); - iowait = kstat_cpu(i).cpustat.iowait; + idle = get_idle_time(i); + iowait = get_iowait_time(i); irq = kstat_cpu(i).cpustat.irq; softirq = kstat_cpu(i).cpustat.softirq; steal = kstat_cpu(i).cpustat.steal; diff --git a/fs/proc/task_mmu.c b/fs/proc/task_mmu.c index 25b6a887adb9..d1931b9ab21b 100644 --- a/fs/proc/task_mmu.c +++ b/fs/proc/task_mmu.c @@ -44,6 +44,7 @@ void task_mem(struct seq_file *m, struct mm_struct *mm) "VmPeak:\t%8lu kB\n" "VmSize:\t%8lu kB\n" "VmLck:\t%8lu kB\n" + "VmPin:\t%8lu kB\n" "VmHWM:\t%8lu kB\n" "VmRSS:\t%8lu kB\n" "VmData:\t%8lu kB\n" @@ -55,6 +56,7 @@ void task_mem(struct seq_file *m, struct mm_struct *mm) hiwater_vm << (PAGE_SHIFT-10), (total_vm - mm->reserved_vm) << (PAGE_SHIFT-10), mm->locked_vm << (PAGE_SHIFT-10), + mm->pinned_vm << (PAGE_SHIFT-10), hiwater_rss << (PAGE_SHIFT-10), total_rss << (PAGE_SHIFT-10), data << (PAGE_SHIFT-10), diff --git a/fs/read_write.c b/fs/read_write.c index 179f1c33ea57..7a220bec217c 100644 --- a/fs/read_write.c +++ b/fs/read_write.c @@ -617,7 +617,8 @@ ssize_t do_loop_readv_writev(struct file *filp, struct iovec *iov, ssize_t rw_copy_check_uvector(int type, const struct iovec __user * uvector, unsigned long nr_segs, unsigned long fast_segs, struct iovec *fast_pointer, - struct iovec **ret_pointer) + struct iovec **ret_pointer, + int check_access) { unsigned long seg; ssize_t ret; @@ -673,7 +674,8 @@ ssize_t rw_copy_check_uvector(int type, const struct iovec __user * uvector, ret = -EINVAL; goto out; } - if (unlikely(!access_ok(vrfy_dir(type), buf, len))) { + if (check_access + && unlikely(!access_ok(vrfy_dir(type), buf, len))) { ret = -EFAULT; goto out; } @@ -705,7 +707,7 @@ static ssize_t do_readv_writev(int type, struct file *file, } ret = rw_copy_check_uvector(type, uvector, nr_segs, - ARRAY_SIZE(iovstack), iovstack, &iov); + ARRAY_SIZE(iovstack), iovstack, &iov, 1); if (ret <= 0) goto 
out; diff --git a/fs/stat.c b/fs/stat.c index ba5316ffac61..9f290b04fbdb 100644 --- a/fs/stat.c +++ b/fs/stat.c @@ -296,15 +296,16 @@ SYSCALL_DEFINE4(readlinkat, int, dfd, const char __user *, pathname, { struct path path; int error; + int empty = 0; if (bufsiz <= 0) return -EINVAL; - error = user_path_at(dfd, pathname, LOOKUP_EMPTY, &path); + error = user_path_at_empty(dfd, pathname, LOOKUP_EMPTY, &path, &empty); if (!error) { struct inode *inode = path.dentry->d_inode; - error = -EINVAL; + error = (empty) ? -ENOENT : -EINVAL; if (inode->i_op->readlink) { error = security_inode_readlink(path.dentry); if (!error) { diff --git a/fs/xfs/xfs_aops.c b/fs/xfs/xfs_aops.c index 63e971e2b837..b90189798d37 100644 --- a/fs/xfs/xfs_aops.c +++ b/fs/xfs/xfs_aops.c @@ -925,11 +925,11 @@ xfs_vm_writepage( * random callers for direct reclaim or memcg reclaim. We explicitly * allow reclaim from kswapd as the stack usage there is relatively low. * - * This should really be done by the core VM, but until that happens - * filesystems like XFS, btrfs and ext4 have to take care of this - * by themselves. + * This should never happen except in the case of a VM regression so + * warn about it. */ - if ((current->flags & (PF_MEMALLOC|PF_KSWAPD)) == PF_MEMALLOC) + if (WARN_ON_ONCE((current->flags & (PF_MEMALLOC|PF_KSWAPD)) == + PF_MEMALLOC)) goto redirty; /* diff --git a/include/asm-generic/cputime.h b/include/asm-generic/cputime.h index 61e03dd7939e..62ce6823c0f2 100644 --- a/include/asm-generic/cputime.h +++ b/include/asm-generic/cputime.h @@ -38,8 +38,8 @@ typedef u64 cputime64_t; /* * Convert cputime to microseconds and back. */ -#define cputime_to_usecs(__ct) jiffies_to_usecs(__ct); -#define usecs_to_cputime(__msecs) usecs_to_jiffies(__msecs); +#define cputime_to_usecs(__ct) jiffies_to_usecs(__ct) +#define usecs_to_cputime(__msecs) usecs_to_jiffies(__msecs) /* * Convert cputime to seconds and back. 
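/*
 * [Editor's aside, not part of the patch]  The asm-generic/cputime.h hunk
 * above removes stray trailing semicolons from the conversion macros.
 * With the semicolon, the macro can only be used as a full statement; an
 * expression such as the (illustrative) one below would expand to
 * "jiffies_to_usecs(ct); + 1" and fail to compile:
 */
static inline unsigned long example_usecs_plus_one(cputime_t ct)
{
	return cputime_to_usecs(ct) + 1;	/* fine with the fixed macro */
}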
diff --git a/include/linux/cgroup.h b/include/linux/cgroup.h index da7e4bc34e8c..1b7f9d525013 100644 --- a/include/linux/cgroup.h +++ b/include/linux/cgroup.h @@ -516,7 +516,7 @@ struct cgroup_subsys { struct list_head sibling; /* used when use_id == true */ struct idr idr; - spinlock_t id_lock; + rwlock_t id_lock; /* should be defined only by modular subsystems */ struct module *module; diff --git a/include/linux/compat.h b/include/linux/compat.h index c6e7523bf765..154bf5683015 100644 --- a/include/linux/compat.h +++ b/include/linux/compat.h @@ -547,7 +547,8 @@ extern ssize_t compat_rw_copy_check_uvector(int type, const struct compat_iovec __user *uvector, unsigned long nr_segs, unsigned long fast_segs, struct iovec *fast_pointer, - struct iovec **ret_pointer); + struct iovec **ret_pointer, + int check_access); extern void __user *compat_alloc_user_space(unsigned long len); diff --git a/include/linux/debugobjects.h b/include/linux/debugobjects.h index 65970b811e22..0e5f5785d9f2 100644 --- a/include/linux/debugobjects.h +++ b/include/linux/debugobjects.h @@ -46,6 +46,8 @@ struct debug_obj { * fails * @fixup_free: fixup function, which is called when the free check * fails + * @fixup_assert_init: fixup function, which is called when the assert_init + * check fails */ struct debug_obj_descr { const char *name; @@ -54,6 +56,7 @@ struct debug_obj_descr { int (*fixup_activate) (void *addr, enum debug_obj_state state); int (*fixup_destroy) (void *addr, enum debug_obj_state state); int (*fixup_free) (void *addr, enum debug_obj_state state); + int (*fixup_assert_init)(void *addr, enum debug_obj_state state); }; #ifdef CONFIG_DEBUG_OBJECTS @@ -64,6 +67,7 @@ extern void debug_object_activate (void *addr, struct debug_obj_descr *descr); extern void debug_object_deactivate(void *addr, struct debug_obj_descr *descr); extern void debug_object_destroy (void *addr, struct debug_obj_descr *descr); extern void debug_object_free (void *addr, struct debug_obj_descr *descr); +extern void debug_object_assert_init(void *addr, struct debug_obj_descr *descr); /* * Active state: @@ -89,6 +93,8 @@ static inline void debug_object_destroy (void *addr, struct debug_obj_descr *descr) { } static inline void debug_object_free (void *addr, struct debug_obj_descr *descr) { } +static inline void +debug_object_assert_init(void *addr, struct debug_obj_descr *descr) { } static inline void debug_objects_early_init(void) { } static inline void debug_objects_mem_init(void) { } diff --git a/include/linux/dmar.h b/include/linux/dmar.h index 7b776d71d36d..6701d9432f0b 100644 --- a/include/linux/dmar.h +++ b/include/linux/dmar.h @@ -219,14 +219,19 @@ struct dmar_rmrr_unit { #define for_each_rmrr_units(rmrr) \ list_for_each_entry(rmrr, &dmar_rmrr_units, list) +extern struct list_head dmar_atsr_units; struct dmar_atsr_unit { struct list_head list; /* list of ATSR units */ struct acpi_dmar_header *hdr; /* ACPI header */ struct pci_dev **devices; /* target devices */ int devices_cnt; /* target device count */ + u16 segment; /* PCI domain */ u8 include_all:1; /* include all ports */ }; +#define for_each_atsr_unit(atsr) \ + list_for_each_entry(atsr, &dmar_atsr_units, list) + extern int intel_iommu_init(void); #else /* !CONFIG_DMAR: */ static inline int intel_iommu_init(void) { return -ENODEV; } diff --git a/include/linux/fs.h b/include/linux/fs.h index ba98668a1826..e005f34d4da8 100644 --- a/include/linux/fs.h +++ b/include/linux/fs.h @@ -392,8 +392,8 @@ struct inodes_stat_t { #include <linux/semaphore.h> #include <linux/fiemap.h> #include 
<linux/rculist_bl.h> -#include <linux/shrinker.h> #include <linux/atomic.h> +#include <linux/shrinker.h> #include <asm/byteorder.h> @@ -1627,9 +1627,10 @@ struct inode_operations { struct seq_file; ssize_t rw_copy_check_uvector(int type, const struct iovec __user * uvector, - unsigned long nr_segs, unsigned long fast_segs, - struct iovec *fast_pointer, - struct iovec **ret_pointer); + unsigned long nr_segs, unsigned long fast_segs, + struct iovec *fast_pointer, + struct iovec **ret_pointer, + int check_access); extern ssize_t vfs_read(struct file *, char __user *, size_t, loff_t *); extern ssize_t vfs_write(struct file *, const char __user *, size_t, loff_t *); diff --git a/include/linux/hpet.h b/include/linux/hpet.h index 219ca4f6bea6..d690c0fe4b8b 100644 --- a/include/linux/hpet.h +++ b/include/linux/hpet.h @@ -125,6 +125,7 @@ struct hpet_info { #define HPET_EPI _IO('h', 0x04) /* enable periodic */ #define HPET_DPI _IO('h', 0x05) /* disable periodic */ #define HPET_IRQFREQ _IOW('h', 0x6, unsigned long) /* IRQFREQ usec */ +#define HPET_ALLOC_TIMER _IO('h', 0x7) #define MAX_HPET_TBS 8 /* maximum hpet timer blocks */ diff --git a/include/linux/huge_mm.h b/include/linux/huge_mm.h index 48c32ebf65a7..a9ace9c32507 100644 --- a/include/linux/huge_mm.h +++ b/include/linux/huge_mm.h @@ -22,6 +22,11 @@ extern int zap_huge_pmd(struct mmu_gather *tlb, extern int mincore_huge_pmd(struct vm_area_struct *vma, pmd_t *pmd, unsigned long addr, unsigned long end, unsigned char *vec); +extern int move_huge_pmd(struct vm_area_struct *vma, + struct vm_area_struct *new_vma, + unsigned long old_addr, + unsigned long new_addr, unsigned long old_end, + pmd_t *old_pmd, pmd_t *new_pmd); extern int change_huge_pmd(struct vm_area_struct *vma, pmd_t *pmd, unsigned long addr, pgprot_t newprot); diff --git a/include/linux/jiffies.h b/include/linux/jiffies.h index f97672a36fa8..265e2c3cbd1c 100644 --- a/include/linux/jiffies.h +++ b/include/linux/jiffies.h @@ -303,7 +303,7 @@ extern void jiffies_to_timespec(const unsigned long jiffies, extern unsigned long timeval_to_jiffies(const struct timeval *value); extern void jiffies_to_timeval(const unsigned long jiffies, struct timeval *value); -extern clock_t jiffies_to_clock_t(long x); +extern clock_t jiffies_to_clock_t(unsigned long x); extern unsigned long clock_t_to_jiffies(unsigned long x); extern u64 jiffies_64_to_clock_t(u64 x); extern u64 nsec_to_clock_t(u64 x); diff --git a/include/linux/leds-renesas-tpu.h b/include/linux/leds-renesas-tpu.h new file mode 100644 index 000000000000..055387086fc1 --- /dev/null +++ b/include/linux/leds-renesas-tpu.h @@ -0,0 +1,14 @@ +#ifndef __LEDS_RENESAS_TPU_H__ +#define __LEDS_RENESAS_TPU_H__ + +struct led_renesas_tpu_config { + char *name; + unsigned pin_gpio_fn; + unsigned pin_gpio; + unsigned int channel_offset; + unsigned int timer_bit; + unsigned int max_brightness; + unsigned int refresh_rate; +}; + +#endif /* __LEDS_RENESAS_TPU_H__ */ diff --git a/include/linux/magic.h b/include/linux/magic.h index 1e5df2af8d84..2d4beab0d5b7 100644 --- a/include/linux/magic.h +++ b/include/linux/magic.h @@ -30,11 +30,11 @@ #define ANON_INODE_FS_MAGIC 0x09041934 #define PSTOREFS_MAGIC 0x6165676C -#define MINIX_SUPER_MAGIC 0x137F /* original minix fs */ -#define MINIX_SUPER_MAGIC2 0x138F /* minix fs, 30 char names */ -#define MINIX2_SUPER_MAGIC 0x2468 /* minix V2 fs */ -#define MINIX2_SUPER_MAGIC2 0x2478 /* minix V2 fs, 30 char names */ -#define MINIX3_SUPER_MAGIC 0x4d5a /* minix V3 fs */ +#define MINIX_SUPER_MAGIC 0x137F /* minix v1 fs, 14 char 
names */ +#define MINIX_SUPER_MAGIC2 0x138F /* minix v1 fs, 30 char names */ +#define MINIX2_SUPER_MAGIC 0x2468 /* minix v2 fs, 14 char names */ +#define MINIX2_SUPER_MAGIC2 0x2478 /* minix v2 fs, 30 char names */ +#define MINIX3_SUPER_MAGIC 0x4d5a /* minix v3 fs, 60 char names */ #define MSDOS_SUPER_MAGIC 0x4d44 /* MD */ #define NCP_SUPER_MAGIC 0x564c /* Guess, what 0x564c is :-) */ diff --git a/include/linux/memblock.h b/include/linux/memblock.h index 7525e38c434d..e6b843e16e81 100644 --- a/include/linux/memblock.h +++ b/include/linux/memblock.h @@ -80,6 +80,7 @@ extern phys_addr_t __memblock_alloc_base(phys_addr_t size, phys_addr_t align, phys_addr_t max_addr); extern phys_addr_t memblock_phys_mem_size(void); +extern phys_addr_t memblock_start_of_DRAM(void); extern phys_addr_t memblock_end_of_DRAM(void); extern void memblock_enforce_memory_limit(phys_addr_t memory_limit); extern int memblock_is_memory(phys_addr_t addr); diff --git a/include/linux/memcontrol.h b/include/linux/memcontrol.h index 3b535db00a94..fb1ed1c2d80d 100644 --- a/include/linux/memcontrol.h +++ b/include/linux/memcontrol.h @@ -35,7 +35,8 @@ enum mem_cgroup_page_stat_item { extern unsigned long mem_cgroup_isolate_pages(unsigned long nr_to_scan, struct list_head *dst, unsigned long *scanned, int order, - int mode, struct zone *z, + isolate_mode_t mode, + struct zone *z, struct mem_cgroup *mem_cont, int active, int file); @@ -87,8 +88,8 @@ extern void mem_cgroup_uncharge_end(void); extern void mem_cgroup_uncharge_page(struct page *page); extern void mem_cgroup_uncharge_cache_page(struct page *page); -extern void mem_cgroup_out_of_memory(struct mem_cgroup *mem, gfp_t gfp_mask); -int task_in_mem_cgroup(struct task_struct *task, const struct mem_cgroup *mem); +extern void mem_cgroup_out_of_memory(struct mem_cgroup *memcg, gfp_t gfp_mask); +int task_in_mem_cgroup(struct task_struct *task, const struct mem_cgroup *memcg); extern struct mem_cgroup *try_get_mem_cgroup_from_page(struct page *page); extern struct mem_cgroup *mem_cgroup_from_task(struct task_struct *p); @@ -97,19 +98,19 @@ extern struct mem_cgroup *try_get_mem_cgroup_from_mm(struct mm_struct *mm); static inline int mm_match_cgroup(const struct mm_struct *mm, const struct mem_cgroup *cgroup) { - struct mem_cgroup *mem; + struct mem_cgroup *memcg; rcu_read_lock(); - mem = mem_cgroup_from_task(rcu_dereference((mm)->owner)); + memcg = mem_cgroup_from_task(rcu_dereference((mm)->owner)); rcu_read_unlock(); - return cgroup == mem; + return cgroup == memcg; } -extern struct cgroup_subsys_state *mem_cgroup_css(struct mem_cgroup *mem); +extern struct cgroup_subsys_state *mem_cgroup_css(struct mem_cgroup *memcg); extern int mem_cgroup_prepare_migration(struct page *page, struct page *newpage, struct mem_cgroup **ptr, gfp_t gfp_mask); -extern void mem_cgroup_end_migration(struct mem_cgroup *mem, +extern void mem_cgroup_end_migration(struct mem_cgroup *memcg, struct page *oldpage, struct page *newpage, bool migration_ok); /* @@ -166,7 +167,7 @@ static inline void mem_cgroup_dec_page_stat(struct page *page, unsigned long mem_cgroup_soft_limit_reclaim(struct zone *zone, int order, gfp_t gfp_mask, unsigned long *total_scanned); -u64 mem_cgroup_get_limit(struct mem_cgroup *mem); +u64 mem_cgroup_get_limit(struct mem_cgroup *memcg); void mem_cgroup_count_vm_event(struct mm_struct *mm, enum vm_event_item idx); #ifdef CONFIG_TRANSPARENT_HUGEPAGE @@ -262,18 +263,20 @@ static inline struct mem_cgroup *try_get_mem_cgroup_from_mm(struct mm_struct *mm return NULL; } -static inline int 
mm_match_cgroup(struct mm_struct *mm, struct mem_cgroup *mem) +static inline int mm_match_cgroup(struct mm_struct *mm, + struct mem_cgroup *memcg) { return 1; } static inline int task_in_mem_cgroup(struct task_struct *task, - const struct mem_cgroup *mem) + const struct mem_cgroup *memcg) { return 1; } -static inline struct cgroup_subsys_state *mem_cgroup_css(struct mem_cgroup *mem) +static inline struct cgroup_subsys_state + *mem_cgroup_css(struct mem_cgroup *memcg) { return NULL; } @@ -285,22 +288,22 @@ mem_cgroup_prepare_migration(struct page *page, struct page *newpage, return 0; } -static inline void mem_cgroup_end_migration(struct mem_cgroup *mem, +static inline void mem_cgroup_end_migration(struct mem_cgroup *memcg, struct page *oldpage, struct page *newpage, bool migration_ok) { } -static inline int mem_cgroup_get_reclaim_priority(struct mem_cgroup *mem) +static inline int mem_cgroup_get_reclaim_priority(struct mem_cgroup *memcg) { return 0; } -static inline void mem_cgroup_note_reclaim_priority(struct mem_cgroup *mem, +static inline void mem_cgroup_note_reclaim_priority(struct mem_cgroup *memcg, int priority) { } -static inline void mem_cgroup_record_reclaim_priority(struct mem_cgroup *mem, +static inline void mem_cgroup_record_reclaim_priority(struct mem_cgroup *memcg, int priority) { } @@ -366,7 +369,7 @@ unsigned long mem_cgroup_soft_limit_reclaim(struct zone *zone, int order, } static inline -u64 mem_cgroup_get_limit(struct mem_cgroup *mem) +u64 mem_cgroup_get_limit(struct mem_cgroup *memcg) { return 0; } diff --git a/include/linux/mm.h b/include/linux/mm.h index 7438071b44aa..23d95283cac4 100644 --- a/include/linux/mm.h +++ b/include/linux/mm.h @@ -10,6 +10,7 @@ #include <linux/mmzone.h> #include <linux/rbtree.h> #include <linux/prio_tree.h> +#include <linux/atomic.h> #include <linux/debug_locks.h> #include <linux/mm_types.h> #include <linux/range.h> diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h index 7870e473033c..499611005c96 100644 --- a/include/linux/mm_types.h +++ b/include/linux/mm_types.h @@ -293,8 +293,15 @@ struct mm_struct { unsigned long hiwater_rss; /* High-watermark of RSS usage */ unsigned long hiwater_vm; /* High-water virtual memory usage */ - unsigned long total_vm, locked_vm, shared_vm, exec_vm; - unsigned long stack_vm, reserved_vm, def_flags, nr_ptes; + unsigned long total_vm; /* Total pages mapped */ + unsigned long locked_vm; /* Pages that have PG_mlocked set */ + unsigned long pinned_vm; /* Refcount permanently increased */ + unsigned long shared_vm; /* Shared pages (files) */ + unsigned long exec_vm; /* VM_EXEC & ~VM_WRITE */ + unsigned long stack_vm; /* VM_GROWSUP/DOWN */ + unsigned long reserved_vm; /* VM_RESERVED|VM_IO pages */ + unsigned long def_flags; + unsigned long nr_ptes; /* Page table pages */ unsigned long start_code, end_code, start_data, end_data; unsigned long start_brk, brk, start_stack; unsigned long arg_start, arg_end, env_start, env_end; diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h index be1ac8d7789b..1ed4116bcc3d 100644 --- a/include/linux/mmzone.h +++ b/include/linux/mmzone.h @@ -100,6 +100,7 @@ enum zone_stat_item { NR_UNSTABLE_NFS, /* NFS unstable pages */ NR_BOUNCE, NR_VMSCAN_WRITE, + NR_VMSCAN_IMMEDIATE, /* Prioritise for reclaim when writeback ends */ NR_WRITEBACK_TEMP, /* Writeback using temporary buffers */ NR_ISOLATED_ANON, /* Temporary isolated pages from anon lru */ NR_ISOLATED_FILE, /* Temporary isolated pages from file lru */ @@ -164,6 +165,18 @@ static inline int 
is_unevictable_lru(enum lru_list l) #define LRU_ALL_EVICTABLE (LRU_ALL_FILE | LRU_ALL_ANON) #define LRU_ALL ((1 << NR_LRU_LISTS) - 1) +/* Isolate inactive pages */ +#define ISOLATE_INACTIVE ((__force fmode_t)0x1) +/* Isolate active pages */ +#define ISOLATE_ACTIVE ((__force fmode_t)0x2) +/* Isolate clean file */ +#define ISOLATE_CLEAN ((__force fmode_t)0x4) +/* Isolate unmapped file */ +#define ISOLATE_UNMAPPED ((__force fmode_t)0x8) + +/* LRU Isolation modes. */ +typedef unsigned __bitwise__ isolate_mode_t; + enum zone_watermarks { WMARK_MIN, WMARK_LOW, diff --git a/include/linux/namei.h b/include/linux/namei.h index 76fe2c62ae71..c30012740c74 100644 --- a/include/linux/namei.h +++ b/include/linux/namei.h @@ -66,6 +66,7 @@ enum {LAST_NORM, LAST_ROOT, LAST_DOT, LAST_DOTDOT, LAST_BIND}; #define LOOKUP_EMPTY 0x4000 extern int user_path_at(int, const char __user *, unsigned, struct path *); +extern int user_path_at_empty(int, const char __user *, unsigned, struct path *, int *empty); #define user_path(name, path) user_path_at(AT_FDCWD, name, LOOKUP_FOLLOW, path) #define user_lpath(name, path) user_path_at(AT_FDCWD, name, 0, path) diff --git a/include/linux/pps-gpio.h b/include/linux/pps-gpio.h new file mode 100644 index 000000000000..0035abe41b9a --- /dev/null +++ b/include/linux/pps-gpio.h @@ -0,0 +1,32 @@ +/* + * pps-gpio.h -- PPS client for GPIOs + * + * + * Copyright (C) 2011 James Nuss <jamesnuss@nanometrics.ca> + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License as published by + * the Free Software Foundation; either version 2 of the License, or + * (at your option) any later version. + * + * This program is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + * + * You should have received a copy of the GNU General Public License + * along with this program; if not, write to the Free Software + * Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA. 
+ */ + +#ifndef _PPS_GPIO_H +#define _PPS_GPIO_H + +struct pps_gpio_platform_data { + bool assert_falling_edge; + bool capture_clear; + unsigned int gpio_pin; + const char *gpio_label; +}; + +#endif diff --git a/include/linux/rio_ids.h b/include/linux/rio_ids.h index 0cee0152aca9..b66d13d1bdc0 100644 --- a/include/linux/rio_ids.h +++ b/include/linux/rio_ids.h @@ -39,5 +39,6 @@ #define RIO_DID_IDTCPS1616 0x0379 #define RIO_DID_IDTVPS1616 0x0377 #define RIO_DID_IDTSPS1616 0x0378 +#define RIO_DID_TSI721 0x80ab #endif /* LINUX_RIO_IDS_H */ diff --git a/include/linux/security.h b/include/linux/security.h index d9f7ec41ba51..0fbea8b5deff 100644 --- a/include/linux/security.h +++ b/include/linux/security.h @@ -2042,7 +2042,7 @@ static inline void security_inode_free(struct inode *inode) static inline int security_inode_init_security(struct inode *inode, struct inode *dir, const struct qstr *qstr, - initxattrs initxattrs, + const initxattrs initxattrs, void *fs_data) { return 0; diff --git a/include/linux/shrinker.h b/include/linux/shrinker.h index 790651b4e5ba..ac6b8ee07825 100644 --- a/include/linux/shrinker.h +++ b/include/linux/shrinker.h @@ -34,7 +34,7 @@ struct shrinker { /* These are for internal use */ struct list_head list; - long nr; /* objs pending delete */ + atomic_long_t nr_in_batch; /* objs pending delete */ }; #define DEFAULT_SEEKS 2 /* A good number if you don't know better. */ extern void register_shrinker(struct shrinker *); diff --git a/include/linux/string.h b/include/linux/string.h index a176db2f2c85..e033564f10ba 100644 --- a/include/linux/string.h +++ b/include/linux/string.h @@ -114,6 +114,7 @@ extern int memcmp(const void *,const void *,__kernel_size_t); #ifndef __HAVE_ARCH_MEMCHR extern void * memchr(const void *,int,__kernel_size_t); #endif +void *memchr_inv(const void *s, int c, size_t n); extern char *kstrdup(const char *s, gfp_t gfp); extern char *kstrndup(const char *s, size_t len, gfp_t gfp); diff --git a/include/linux/swap.h b/include/linux/swap.h index 5d4edbf889d4..7c997f271a8a 100644 --- a/include/linux/swap.h +++ b/include/linux/swap.h @@ -247,15 +247,10 @@ static inline void lru_cache_add_file(struct page *page) __lru_cache_add(page, LRU_INACTIVE_FILE); } -/* LRU Isolation modes. */ -#define ISOLATE_INACTIVE 0 /* Isolate inactive pages. */ -#define ISOLATE_ACTIVE 1 /* Isolate active pages. */ -#define ISOLATE_BOTH 2 /* Isolate both active and inactive pages. 
*/ - /* linux/mm/vmscan.c */ extern unsigned long try_to_free_pages(struct zonelist *zonelist, int order, gfp_t gfp_mask, nodemask_t *mask); -extern int __isolate_lru_page(struct page *page, int mode, int file); +extern int __isolate_lru_page(struct page *page, isolate_mode_t mode, int file); extern unsigned long shrink_all_memory(unsigned long nr_pages); extern int vm_swappiness; extern int remove_mapping(struct address_space *mapping, struct page *page); diff --git a/include/linux/syscalls.h b/include/linux/syscalls.h index 1ff0ec2a5e8d..86a24b1166d1 100644 --- a/include/linux/syscalls.h +++ b/include/linux/syscalls.h @@ -844,4 +844,17 @@ asmlinkage long sys_open_by_handle_at(int mountdirfd, struct file_handle __user *handle, int flags); asmlinkage long sys_setns(int fd, int nstype); +asmlinkage long sys_process_vm_readv(pid_t pid, + const struct iovec __user *lvec, + unsigned long liovcnt, + const struct iovec __user *rvec, + unsigned long riovcnt, + unsigned long flags); +asmlinkage long sys_process_vm_writev(pid_t pid, + const struct iovec __user *lvec, + unsigned long liovcnt, + const struct iovec __user *rvec, + unsigned long riovcnt, + unsigned long flags); + #endif diff --git a/include/linux/sysdev.h b/include/linux/sysdev.h index 20f63d3e6144..6692a8e50eea 100644 --- a/include/linux/sysdev.h +++ b/include/linux/sysdev.h @@ -114,6 +114,20 @@ struct sysdev_attribute { extern int sysdev_create_file(struct sys_device *, struct sysdev_attribute *); extern void sysdev_remove_file(struct sys_device *, struct sysdev_attribute *); +#if defined(CONFIG_PROC_FS) || defined(CONFIG_SYSFS) +#define sysdev_create_file_optional(sysdev, sysdevattr) \ + return sysdev_create_file(sysdev, sysdevattr); + +#define sysdev_remove_file_optional(sysdev, sysdevattr) \ + sysdev_remove_file(sysdev, sysdevattr); +#else +#define sysdev_create_file_optional(sysdev, sysdevattr) \ + (0) + +#define sysdev_remove_file_optional(sysdev, sysdevattr) \ + do {} while (0) +#endif + /* Create/remove NULL terminated attribute list */ static inline int sysdev_create_files(struct sys_device *d, struct sysdev_attribute **a) diff --git a/include/linux/vmalloc.h b/include/linux/vmalloc.h index 9332e52ea8c2..687fb11e2010 100644 --- a/include/linux/vmalloc.h +++ b/include/linux/vmalloc.h @@ -13,6 +13,7 @@ struct vm_area_struct; /* vma defining user mapping in mm_types.h */ #define VM_MAP 0x00000004 /* vmap()ed pages */ #define VM_USERMAP 0x00000008 /* suitable for remap_vmalloc_range */ #define VM_VPAGES 0x00000010 /* buffer for pages was vmalloc'ed */ +#define VM_UNLIST 0x00000020 /* vm_struct is not listed in vmlist */ /* bits [20..32] reserved for arch specific ioremap internals */ /* diff --git a/include/linux/writeback.h b/include/linux/writeback.h index 2b8963ff0f35..a5f495f36928 100644 --- a/include/linux/writeback.h +++ b/include/linux/writeback.h @@ -118,8 +118,6 @@ extern int vm_highmem_is_dirtyable; extern int block_dump; extern int laptop_mode; -extern unsigned long determine_dirtyable_memory(void); - extern int dirty_background_ratio_handler(struct ctl_table *table, int write, void __user *buffer, size_t *lenp, loff_t *ppos); diff --git a/include/scsi/scsi_netlink.h b/include/scsi/scsi_netlink.h index 58ce8fe44783..5cb20ccb1956 100644 --- a/include/scsi/scsi_netlink.h +++ b/include/scsi/scsi_netlink.h @@ -23,7 +23,7 @@ #define SCSI_NETLINK_H #include <linux/netlink.h> - +#include <linux/types.h> /* * This file intended to be included by both kernel and user space diff --git a/include/trace/events/irq_vectors.h 
b/include/trace/events/irq_vectors.h new file mode 100644 index 000000000000..699ddaae3ddc --- /dev/null +++ b/include/trace/events/irq_vectors.h @@ -0,0 +1,56 @@ +#undef TRACE_SYSTEM +#define TRACE_SYSTEM irq_vectors + +#if !defined(_TRACE_IRQ_VECTORS_H) || defined(TRACE_HEADER_MULTI_READ) +#define _TRACE_IRQ_VECTORS_H + +#include <linux/tracepoint.h> +#include <asm/irq.h> + +#ifndef irq_vector_name_table +#define irq_vector_name_table { -1, NULL } +#endif + +DECLARE_EVENT_CLASS(irq_vector, + + TP_PROTO(int irq), + + TP_ARGS(irq), + + TP_STRUCT__entry( + __field( int, irq ) + ), + + TP_fast_assign( + __entry->irq = irq; + ), + + TP_printk("irq=%d name=%s", __entry->irq, + __print_symbolic(__entry->irq, irq_vector_name_table)) +); + +/* + * irq_vector_entry - called before enterring a interrupt vector handler + */ +DEFINE_EVENT(irq_vector, irq_vector_entry, + + TP_PROTO(int irq), + + TP_ARGS(irq) +); + +/* + * irq_vector_exit - called immediately after the interrupt vector + * handler returns + */ +DEFINE_EVENT(irq_vector, irq_vector_exit, + + TP_PROTO(int irq), + + TP_ARGS(irq) +); + +#endif /* _TRACE_IRQ_VECTORS_H */ + +/* This part must be outside protection */ +#include <trace/define_trace.h> diff --git a/include/trace/events/vmscan.h b/include/trace/events/vmscan.h index 36851f7f13da..edc4b3d25a2d 100644 --- a/include/trace/events/vmscan.h +++ b/include/trace/events/vmscan.h @@ -266,7 +266,7 @@ DECLARE_EVENT_CLASS(mm_vmscan_lru_isolate_template, unsigned long nr_lumpy_taken, unsigned long nr_lumpy_dirty, unsigned long nr_lumpy_failed, - int isolate_mode), + isolate_mode_t isolate_mode), TP_ARGS(order, nr_requested, nr_scanned, nr_taken, nr_lumpy_taken, nr_lumpy_dirty, nr_lumpy_failed, isolate_mode), @@ -278,7 +278,7 @@ DECLARE_EVENT_CLASS(mm_vmscan_lru_isolate_template, __field(unsigned long, nr_lumpy_taken) __field(unsigned long, nr_lumpy_dirty) __field(unsigned long, nr_lumpy_failed) - __field(int, isolate_mode) + __field(isolate_mode_t, isolate_mode) ), TP_fast_assign( @@ -312,7 +312,7 @@ DEFINE_EVENT(mm_vmscan_lru_isolate_template, mm_vmscan_lru_isolate, unsigned long nr_lumpy_taken, unsigned long nr_lumpy_dirty, unsigned long nr_lumpy_failed, - int isolate_mode), + isolate_mode_t isolate_mode), TP_ARGS(order, nr_requested, nr_scanned, nr_taken, nr_lumpy_taken, nr_lumpy_dirty, nr_lumpy_failed, isolate_mode) @@ -327,7 +327,7 @@ DEFINE_EVENT(mm_vmscan_lru_isolate_template, mm_vmscan_memcg_isolate, unsigned long nr_lumpy_taken, unsigned long nr_lumpy_dirty, unsigned long nr_lumpy_failed, - int isolate_mode), + isolate_mode_t isolate_mode), TP_ARGS(order, nr_requested, nr_scanned, nr_taken, nr_lumpy_taken, nr_lumpy_dirty, nr_lumpy_failed, isolate_mode) diff --git a/init/Kconfig b/init/Kconfig index dc7e27bf89a8..3ecfa32b1a10 100644 --- a/init/Kconfig +++ b/init/Kconfig @@ -935,7 +935,7 @@ config UID16 config SYSCTL_SYSCALL bool "Sysctl syscall support" if EXPERT depends on PROC_SYSCTL - default y + default n select SYSCTL ---help--- sys_sysctl uses binary paths that have been found challenging @@ -947,7 +947,7 @@ config SYSCTL_SYSCALL trying to save some space it is probably safe to disable this, making your kernel marginally smaller. - If unsure say Y here. + If unsure say N here. 
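[Editor's note] The new irq_vectors trace header is meant to be instantiated by arch code, which supplies irq_vector_name_table before including it; the tracepoint machinery then emits trace_irq_vector_entry()/trace_irq_vector_exit() hooks. A hedged sketch of such a user (the vector numbers and names are invented):

/* Hypothetical arch-side instantiation; vectors and names are made up. */
#define irq_vector_name_table			\
	{ 0xec, "local_timer" },		\
	{ 0xfd, "reschedule"  },		\
	{ -1,   NULL }

#define CREATE_TRACE_POINTS
#include <trace/events/irq_vectors.h>

static void do_vector(int vector)
{
	trace_irq_vector_entry(vector);
	/* ... dispatch the actual handler for this vector ... */
	trace_irq_vector_exit(vector);
}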
config KALLSYMS bool "Load all symbols for debugging/ksymoops" if EXPERT diff --git a/init/do_mounts.c b/init/do_mounts.c index c0851a8e030c..0f6e1d985a3b 100644 --- a/init/do_mounts.c +++ b/init/do_mounts.c @@ -28,7 +28,7 @@ int __initdata rd_doload; /* 1 = load RAM disk, 0 = don't load */ int root_mountflags = MS_RDONLY | MS_SILENT; static char * __initdata root_device_name; static char __initdata saved_root_name[64]; -static int __initdata root_wait; +static int root_wait; dev_t ROOT_DEV; @@ -85,12 +85,15 @@ no_match: /** * devt_from_partuuid - looks up the dev_t of a partition by its UUID - * @uuid: 36 byte char array containing a hex ascii UUID + * @uuid: min 36 byte char array containing a hex ascii UUID * * The function will return the first partition which contains a matching * UUID value in its partition_meta_info struct. This does not search * by filesystem UUIDs. * + * If @uuid is followed by a "/PARTNROFF=%d", then the number will be + * extracted and used as an offset from the partition identified by the UUID. + * * Returns the matching dev_t on success or 0 on failure. */ static dev_t devt_from_partuuid(char *uuid_str) @@ -98,6 +101,28 @@ static dev_t devt_from_partuuid(char *uuid_str) dev_t res = 0; struct device *dev = NULL; u8 uuid[16]; + struct gendisk *disk; + struct hd_struct *part; + int offset = 0; + + if (strlen(uuid_str) < 36) + goto done; + + /* Check for optional partition number offset attributes. */ + if (uuid_str[36]) { + char c = 0; + /* Explicitly fail on poor PARTUUID syntax. */ + if (sscanf(&uuid_str[36], + "/PARTNROFF=%d%c", &offset, &c) != 1) { + printk(KERN_ERR "VFS: PARTUUID= is invalid.\n" + "Expected PARTUUID=<valid-uuid-id>[/PARTNROFF=%%d]\n"); + if (root_wait) + printk(KERN_ERR + "Disabling rootwait; root= is invalid.\n"); + root_wait = 0; + goto done; + } + } /* Pack the requested UUID in the expected format. */ part_pack_uuid(uuid_str, uuid); @@ -107,8 +132,21 @@ static dev_t devt_from_partuuid(char *uuid_str) goto done; res = dev->devt; - put_device(dev); + /* Attempt to find the partition by offset. */ + if (!offset) + goto no_offset; + + res = 0; + disk = part_to_disk(dev_to_part(dev)); + part = disk_get_part(disk, dev_to_part(dev)->partno + offset); + if (part) { + res = part_devt(part); + put_device(part_to_dev(part)); + } + +no_offset: + put_device(dev); done: return res; } @@ -126,6 +164,8 @@ done: * used when disk name of partitioned disk ends on a digit. * 6) PARTUUID=00112233-4455-6677-8899-AABBCCDDEEFF representing the * unique id of a partition if the partition table provides it. + * 7) PARTUUID=<UUID>/PARTNROFF=<int> to select a partition in relation to + * a partition with a known unique id. * * If name doesn't have fall into the categories above, we return (0,0). * block_class is used to check if something is a disk name. 
If the disk @@ -143,8 +183,6 @@ dev_t name_to_dev_t(char *name) #ifdef CONFIG_BLOCK if (strncmp(name, "PARTUUID=", 9) == 0) { name += 9; - if (strlen(name) != 36) - goto fail; res = devt_from_partuuid(name); if (!res) goto fail; diff --git a/ipc/shm.c b/ipc/shm.c index 02ecf2c078fc..5da50165a5bd 100644 --- a/ipc/shm.c +++ b/ipc/shm.c @@ -74,7 +74,7 @@ void shm_init_ns(struct ipc_namespace *ns) ns->shm_ctlmax = SHMMAX; ns->shm_ctlall = SHMALL; ns->shm_ctlmni = SHMMNI; - ns->shm_rmid_forced = 0; + ns->shm_rmid_forced = 1; ns->shm_tot = 0; ipc_init_ids(&shm_ids(ns)); } diff --git a/kernel/audit.c b/kernel/audit.c index 09fae2677a45..2c1d6ab7106e 100644 --- a/kernel/audit.c +++ b/kernel/audit.c @@ -1260,12 +1260,13 @@ static void audit_log_vformat(struct audit_buffer *ab, const char *fmt, avail = audit_expand(ab, max_t(unsigned, AUDIT_BUFSIZ, 1+len-avail)); if (!avail) - goto out; + goto out_va_end; len = vsnprintf(skb_tail_pointer(skb), avail, fmt, args2); } - va_end(args2); if (len > 0) skb_put(skb, len); +out_va_end: + va_end(args2); out: return; } diff --git a/kernel/cgroup.c b/kernel/cgroup.c index 1d2b6ceea95d..bc3caf0ee454 100644 --- a/kernel/cgroup.c +++ b/kernel/cgroup.c @@ -4880,9 +4880,9 @@ void free_css_id(struct cgroup_subsys *ss, struct cgroup_subsys_state *css) rcu_assign_pointer(id->css, NULL); rcu_assign_pointer(css->id, NULL); - spin_lock(&ss->id_lock); + write_lock(&ss->id_lock); idr_remove(&ss->idr, id->id); - spin_unlock(&ss->id_lock); + write_unlock(&ss->id_lock); kfree_rcu(id, rcu_head); } EXPORT_SYMBOL_GPL(free_css_id); @@ -4908,10 +4908,10 @@ static struct css_id *get_new_cssid(struct cgroup_subsys *ss, int depth) error = -ENOMEM; goto err_out; } - spin_lock(&ss->id_lock); + write_lock(&ss->id_lock); /* Don't use 0. allocates an ID of 1-65535 */ error = idr_get_new_above(&ss->idr, newid, 1, &myid); - spin_unlock(&ss->id_lock); + write_unlock(&ss->id_lock); /* Returns error when there are no free spaces for new ID.*/ if (error) { @@ -4926,9 +4926,9 @@ static struct css_id *get_new_cssid(struct cgroup_subsys *ss, int depth) return newid; remove_idr: error = -ENOSPC; - spin_lock(&ss->id_lock); + write_lock(&ss->id_lock); idr_remove(&ss->idr, myid); - spin_unlock(&ss->id_lock); + write_unlock(&ss->id_lock); err_out: kfree(newid); return ERR_PTR(error); @@ -4940,7 +4940,7 @@ static int __init_or_module cgroup_init_idr(struct cgroup_subsys *ss, { struct css_id *newid; - spin_lock_init(&ss->id_lock); + rwlock_init(&ss->id_lock); idr_init(&ss->idr); newid = get_new_cssid(ss, 0); @@ -5035,9 +5035,9 @@ css_get_next(struct cgroup_subsys *ss, int id, * scan next entry from bitmap(tree), tmpid is updated after * idr_get_next(). 
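[Editor's note] The do_mounts.c change above lets the root partition be named relative to another partition's UUID. An illustrative kernel command line (the UUID value is made up) would be:

	root=PARTUUID=00112233-4455-6677-8899-AABBCCDDEEFF/PARTNROFF=1 rootwait

which boots from the partition immediately following the one carrying that UUID; a malformed suffix now also cancels rootwait, as the added error path shows.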
*/ - spin_lock(&ss->id_lock); + read_lock(&ss->id_lock); tmp = idr_get_next(&ss->idr, &tmpid); - spin_unlock(&ss->id_lock); + read_unlock(&ss->id_lock); if (!tmp) break; diff --git a/kernel/events/core.c b/kernel/events/core.c index 5a0eeda4ad9e..115d701a2c36 100644 --- a/kernel/events/core.c +++ b/kernel/events/core.c @@ -3501,7 +3501,7 @@ static void perf_mmap_close(struct vm_area_struct *vma) struct ring_buffer *rb = event->rb; atomic_long_sub((size >> PAGE_SHIFT) + 1, &user->locked_vm); - vma->vm_mm->locked_vm -= event->mmap_locked; + vma->vm_mm->pinned_vm -= event->mmap_locked; rcu_assign_pointer(event->rb, NULL); mutex_unlock(&event->mmap_mutex); @@ -3582,7 +3582,7 @@ static int perf_mmap(struct file *file, struct vm_area_struct *vma) lock_limit = rlimit(RLIMIT_MEMLOCK); lock_limit >>= PAGE_SHIFT; - locked = vma->vm_mm->locked_vm + extra; + locked = vma->vm_mm->pinned_vm + extra; if ((locked > lock_limit) && perf_paranoid_tracepoint_raw() && !capable(CAP_IPC_LOCK)) { @@ -3608,7 +3608,7 @@ static int perf_mmap(struct file *file, struct vm_area_struct *vma) atomic_long_add(user_extra, &user->locked_vm); event->mmap_locked = extra; event->mmap_user = get_current_user(); - vma->vm_mm->locked_vm += event->mmap_locked; + vma->vm_mm->pinned_vm += event->mmap_locked; unlock: if (!ret) diff --git a/kernel/kprobes.c b/kernel/kprobes.c index 184d8987489d..d3466d09a7ca 100644 --- a/kernel/kprobes.c +++ b/kernel/kprobes.c @@ -2198,7 +2198,7 @@ static ssize_t write_enabled_file_bool(struct file *file, const char __user *user_buf, size_t count, loff_t *ppos) { char buf[32]; - int buf_size; + size_t buf_size; buf_size = min(count, (sizeof(buf)-1)); if (copy_from_user(buf, user_buf, buf_size)) diff --git a/kernel/printk.c b/kernel/printk.c index 28a40d8171b8..e86148856736 100644 --- a/kernel/printk.c +++ b/kernel/printk.c @@ -532,6 +532,9 @@ static int __init ignore_loglevel_setup(char *str) } early_param("ignore_loglevel", ignore_loglevel_setup); +module_param_named(ignore_loglevel, ignore_loglevel, bool, S_IRUGO | S_IWUSR); +MODULE_PARM_DESC(ignore_loglevel, "ignore loglevel setting, to" + "print all kernel messages to the console."); /* * Write out chars from start to end - 1 inclusive @@ -1108,6 +1111,10 @@ static int __init console_suspend_disable(char *str) return 1; } __setup("no_console_suspend", console_suspend_disable); +module_param_named(console_suspend, console_suspend_enabled, + bool, S_IRUGO | S_IWUSR); +MODULE_PARM_DESC(console_suspend, "suspend console during suspend" + " and hibernate operations"); /** * suspend_console - suspend the console subsystem diff --git a/kernel/sched.c b/kernel/sched.c index 7c0c81b39cc3..16ec21cbd1d2 100644 --- a/kernel/sched.c +++ b/kernel/sched.c @@ -7435,6 +7435,7 @@ static void __sdt_free(const struct cpumask *cpu_map) struct sched_domain *sd = *per_cpu_ptr(sdd->sd, j); if (sd && (sd->flags & SD_OVERLAP)) free_sched_groups(sd->groups, 0); + kfree(*per_cpu_ptr(sdd->sd, j)); kfree(*per_cpu_ptr(sdd->sg, j)); kfree(*per_cpu_ptr(sdd->sgp, j)); } diff --git a/kernel/sys_ni.c b/kernel/sys_ni.c index a9a5de07c4f1..47bfa16430d7 100644 --- a/kernel/sys_ni.c +++ b/kernel/sys_ni.c @@ -145,6 +145,10 @@ cond_syscall(sys_io_submit); cond_syscall(sys_io_cancel); cond_syscall(sys_io_getevents); cond_syscall(sys_syslog); +cond_syscall(sys_process_vm_readv); +cond_syscall(sys_process_vm_writev); +cond_syscall(compat_sys_process_vm_readv); +cond_syscall(compat_sys_process_vm_writev); /* arch-specific weak syscall entries */ cond_syscall(sys_pciconfig_read); diff --git 
a/kernel/time.c b/kernel/time.c index 58636b7ee5e0..73e416db0a1e 100644 --- a/kernel/time.c +++ b/kernel/time.c @@ -575,7 +575,7 @@ EXPORT_SYMBOL(jiffies_to_timeval); /* * Convert jiffies/jiffies_64 to clock_t and back. */ -clock_t jiffies_to_clock_t(long x) +clock_t jiffies_to_clock_t(unsigned long x) { #if (TICK_NSEC % (NSEC_PER_SEC / USER_HZ)) == 0 # if HZ < USER_HZ diff --git a/kernel/time/tick-sched.c b/kernel/time/tick-sched.c index eb98e55196b9..4eb7c1f7efff 100644 --- a/kernel/time/tick-sched.c +++ b/kernel/time/tick-sched.c @@ -158,9 +158,10 @@ update_ts_time_stats(int cpu, struct tick_sched *ts, ktime_t now, u64 *last_upda if (ts->idle_active) { delta = ktime_sub(now, ts->idle_entrytime); - ts->idle_sleeptime = ktime_add(ts->idle_sleeptime, delta); if (nr_iowait_cpu(cpu) > 0) ts->iowait_sleeptime = ktime_add(ts->iowait_sleeptime, delta); + else + ts->idle_sleeptime = ktime_add(ts->idle_sleeptime, delta); ts->idle_entrytime = now; } @@ -196,11 +197,11 @@ static ktime_t tick_nohz_start_idle(int cpu, struct tick_sched *ts) /** * get_cpu_idle_time_us - get the total idle time of a cpu * @cpu: CPU number to query - * @last_update_time: variable to store update time in + * @last_update_time: variable to store update time in. Do not update + * counters if NULL. * * Return the cummulative idle time (since boot) for a given - * CPU, in microseconds. The idle time returned includes - * the iowait time (unlike what "top" and co report). + * CPU, in microseconds. * * This time is measured via accounting rather than sampling, * and is as accurate as ktime_get() is. @@ -210,20 +211,33 @@ static ktime_t tick_nohz_start_idle(int cpu, struct tick_sched *ts) u64 get_cpu_idle_time_us(int cpu, u64 *last_update_time) { struct tick_sched *ts = &per_cpu(tick_cpu_sched, cpu); + ktime_t now, idle; if (!tick_nohz_enabled) return -1; - update_ts_time_stats(cpu, ts, ktime_get(), last_update_time); + now = ktime_get(); + if (last_update_time) { + update_ts_time_stats(cpu, ts, now, last_update_time); + idle = ts->idle_sleeptime; + } else { + if (ts->idle_active && !nr_iowait_cpu(cpu)) { + ktime_t delta = ktime_sub(now, ts->idle_entrytime); + idle = ktime_add(ts->idle_sleeptime, delta); + } else + idle = ts->idle_sleeptime; + } + + return ktime_to_us(idle); - return ktime_to_us(ts->idle_sleeptime); } EXPORT_SYMBOL_GPL(get_cpu_idle_time_us); -/* +/** * get_cpu_iowait_time_us - get the total iowait time of a cpu * @cpu: CPU number to query - * @last_update_time: variable to store update time in + * @last_update_time: variable to store update time in. Do not update + * counters if NULL. * * Return the cummulative iowait time (since boot) for a given * CPU, in microseconds. 
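[Editor's note] With the tick-sched rework above, idle time no longer includes iowait, and passing a NULL last_update_time turns the accessors into read-only queries. A hedged sketch of a consumer in the style of a cpufreq governor (the busy/iowait split shown is a policy choice, not mandated by the API):

/* Sketch only; returns cumulative non-idle, non-iowait time since boot. */
static u64 cpu_busy_time_us(int cpu)
{
	u64 wall_us;
	u64 idle_us   = get_cpu_idle_time_us(cpu, &wall_us);	/* also updates counters */
	u64 iowait_us = get_cpu_iowait_time_us(cpu, NULL);	/* read-only query */

	/* iowait is no longer folded into idle, so account for it explicitly */
	return wall_us - idle_us - iowait_us;
}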
@@ -236,13 +250,24 @@ EXPORT_SYMBOL_GPL(get_cpu_idle_time_us); u64 get_cpu_iowait_time_us(int cpu, u64 *last_update_time) { struct tick_sched *ts = &per_cpu(tick_cpu_sched, cpu); + ktime_t now, iowait; if (!tick_nohz_enabled) return -1; - update_ts_time_stats(cpu, ts, ktime_get(), last_update_time); + now = ktime_get(); + if (last_update_time) { + update_ts_time_stats(cpu, ts, now, last_update_time); + iowait = ts->iowait_sleeptime; + } else { + if (ts->idle_active && nr_iowait_cpu(cpu) > 0) { + ktime_t delta = ktime_sub(now, ts->idle_entrytime); + iowait = ktime_add(ts->iowait_sleeptime, delta); + } else + iowait = ts->iowait_sleeptime; + } - return ktime_to_us(ts->iowait_sleeptime); + return ktime_to_us(iowait); } EXPORT_SYMBOL_GPL(get_cpu_iowait_time_us); diff --git a/kernel/timer.c b/kernel/timer.c index dbaa62422b13..42d02c1e877d 100644 --- a/kernel/timer.c +++ b/kernel/timer.c @@ -480,12 +480,41 @@ static int timer_fixup_free(void *addr, enum debug_obj_state state) } } +/* + * fixup_assert_init is called when: + * - an untracked/uninit-ed object is found + */ +static int timer_fixup_assert_init(void *addr, enum debug_obj_state state) +{ + struct timer_list *timer = addr; + + switch (state) { + case ODEBUG_STATE_NOTAVAILABLE: + if (timer->entry.prev == TIMER_ENTRY_STATIC) { + /* + * This is not really a fixup. The timer was + * statically initialized. We just make sure that it + * is tracked in the object tracker. + */ + debug_object_init(timer, &timer_debug_descr); + return 0; + } else { + WARN_ON(1); + init_timer(timer); + return 1; + } + default: + return 0; + } +} + static struct debug_obj_descr timer_debug_descr = { - .name = "timer_list", - .debug_hint = timer_debug_hint, - .fixup_init = timer_fixup_init, - .fixup_activate = timer_fixup_activate, - .fixup_free = timer_fixup_free, + .name = "timer_list", + .debug_hint = timer_debug_hint, + .fixup_init = timer_fixup_init, + .fixup_activate = timer_fixup_activate, + .fixup_free = timer_fixup_free, + .fixup_assert_init = timer_fixup_assert_init, }; static inline void debug_timer_init(struct timer_list *timer) @@ -508,6 +537,11 @@ static inline void debug_timer_free(struct timer_list *timer) debug_object_free(timer, &timer_debug_descr); } +static inline void debug_timer_assert_init(struct timer_list *timer) +{ + debug_object_assert_init(timer, &timer_debug_descr); +} + static void __init_timer(struct timer_list *timer, const char *name, struct lock_class_key *key); @@ -531,6 +565,7 @@ EXPORT_SYMBOL_GPL(destroy_timer_on_stack); static inline void debug_timer_init(struct timer_list *timer) { } static inline void debug_timer_activate(struct timer_list *timer) { } static inline void debug_timer_deactivate(struct timer_list *timer) { } +static inline void debug_timer_assert_init(struct timer_list *timer) { } #endif static inline void debug_init(struct timer_list *timer) @@ -552,6 +587,11 @@ static inline void debug_deactivate(struct timer_list *timer) trace_timer_cancel(timer); } +static inline void debug_assert_init(struct timer_list *timer) +{ + debug_timer_assert_init(timer); +} + static void __init_timer(struct timer_list *timer, const char *name, struct lock_class_key *key) @@ -902,6 +942,8 @@ int del_timer(struct timer_list *timer) unsigned long flags; int ret = 0; + debug_assert_init(timer); + timer_stats_timer_clear_start_info(timer); if (timer_pending(timer)) { base = lock_timer_base(timer, &flags); @@ -932,6 +974,8 @@ int try_to_del_timer_sync(struct timer_list *timer) unsigned long flags; int ret = -1; + 
debug_assert_init(timer); + base = lock_timer_base(timer, &flags); if (base->running_timer == timer) diff --git a/lib/Kconfig.debug b/lib/Kconfig.debug index 639c95aca53c..4d6100d83e59 100644 --- a/lib/Kconfig.debug +++ b/lib/Kconfig.debug @@ -1120,6 +1120,24 @@ config SYSCTL_SYSCALL_CHECK to properly maintain and use. This enables checks that help you to keep things correct. +config ARCH_HAS_DEBUG_STRICT_USER_COPY_CHECKS + bool + +config DEBUG_STRICT_USER_COPY_CHECKS + bool "Strict user copy size checks" + depends on ARCH_HAS_DEBUG_STRICT_USER_COPY_CHECKS + depends on DEBUG_KERNEL && !TRACE_BRANCH_PROFILING + help + Enabling this option turns a certain set of sanity checks for user + copy operations into compile time failures. + + The copy_from_user() etc checks are there to help test if there + are sufficient security checks on the length argument of + the copy operation, by having gcc prove that the argument is + within bounds. + + If unsure, say N. + source mm/Kconfig.debug source kernel/trace/Kconfig diff --git a/lib/Makefile b/lib/Makefile index 3f5bc6d903e0..7d500a5f283f 100644 --- a/lib/Makefile +++ b/lib/Makefile @@ -14,6 +14,7 @@ lib-y := ctype.o string.o vsprintf.o cmdline.o \ proportions.o prio_heap.o ratelimit.o show_mem.o \ is_single_threaded.o plist.o decompress.o +obj-$(CONFIG_ARCH_HAS_DEBUG_STRICT_USER_COPY_CHECKS) += usercopy.o lib-$(CONFIG_MMU) += ioremap.o lib-$(CONFIG_SMP) += cpumask.o diff --git a/lib/crc32.c b/lib/crc32.c index a6e633a48cea..07005ba65c4b 100644 --- a/lib/crc32.c +++ b/lib/crc32.c @@ -1,4 +1,8 @@ /* + * July 20, 2011 Bob Pearson <rpearson at systemfabricworks.com> + * added slice by 8 algorithm to the existing conventional and + * slice by 4 algorithms. + * * Oct 15, 2000 Matt Domsch <Matt_Domsch@dell.com> * Nicer crc32 functions/docs submitted by linux@horizon.com. Thanks! * Code was from the public domain, copyright abandoned. Code was @@ -19,7 +23,6 @@ * This source code is licensed under the GNU General Public License, * Version 2. See the file COPYING for more details. 
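[Editor's note] The kernel/timer.c hunks above wire del_timer() and try_to_del_timer_sync() into the new assert-init check. A hedged illustration of what that catches with CONFIG_DEBUG_OBJECTS_TIMERS enabled (the surrounding setup is invented):

static struct timer_list never_inited;			/* no init_timer() call */
static DEFINE_TIMER(static_timer, NULL, 0, 0);		/* statically initialized */

static void demo(void)
{
	/* fixup sees TIMER_ENTRY_STATIC and merely starts tracking the object */
	del_timer(&static_timer);

	/* fixup WARNs and repairs the object by calling init_timer() */
	del_timer(&never_inited);
}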
*/ - #include <linux/crc32.h> #include <linux/kernel.h> #include <linux/module.h> @@ -28,14 +31,15 @@ #include <linux/init.h> #include <linux/atomic.h> #include "crc32defs.h" -#if CRC_LE_BITS == 8 -# define tole(x) __constant_cpu_to_le32(x) + +#if CRC_LE_BITS > 8 +# define tole(x) (__force u32) __constant_cpu_to_le32(x) #else # define tole(x) (x) #endif -#if CRC_BE_BITS == 8 -# define tobe(x) __constant_cpu_to_be32(x) +#if CRC_BE_BITS > 8 +# define tobe(x) (__force u32) __constant_cpu_to_be32(x) #else # define tobe(x) (x) #endif @@ -45,54 +49,228 @@ MODULE_AUTHOR("Matt Domsch <Matt_Domsch@dell.com>"); MODULE_DESCRIPTION("Ethernet CRC32 calculations"); MODULE_LICENSE("GPL"); -#if CRC_LE_BITS == 8 || CRC_BE_BITS == 8 +#if CRC_LE_BITS > 8 +static inline u32 crc32_le_body(u32 crc, u8 const *buf, size_t len) +{ + const u8 *p8; + const u32 *p32; + int init_bytes, end_bytes; + size_t words; + int i; + u32 q; + u8 i0, i1, i2, i3; + + crc = (__force u32) __cpu_to_le32(crc); + +#if CRC_LE_BITS == 64 + p8 = buf; + p32 = (u32 *)(((uintptr_t)p8 + 7) & ~7); + + init_bytes = (uintptr_t)p32 - (uintptr_t)p8; + if (init_bytes > len) + init_bytes = len; + words = (len - init_bytes) >> 3; + end_bytes = (len - init_bytes) & 7; +#else + p8 = buf; + p32 = (u32 *)(((uintptr_t)p8 + 3) & ~3); + + init_bytes = (uintptr_t)p32 - (uintptr_t)p8; + if (init_bytes > len) + init_bytes = len; + words = (len - init_bytes) >> 2; + end_bytes = (len - init_bytes) & 3; +#endif + + for (i = 0; i < init_bytes; i++) { +#ifdef __LITTLE_ENDIAN + i0 = *p8++ ^ crc; + crc = t0_le[i0] ^ (crc >> 8); +#else + i0 = *p8++ ^ (crc >> 24); + crc = t0_le[i0] ^ (crc << 8); +#endif + } + + for (i = 0; i < words; i++) { +#ifdef __LITTLE_ENDIAN +# if CRC_LE_BITS == 64 + /* slice by 8 algorithm */ + q = *p32++ ^ crc; + i3 = q; + i2 = q >> 8; + i1 = q >> 16; + i0 = q >> 24; + crc = t7_le[i3] ^ t6_le[i2] ^ t5_le[i1] ^ t4_le[i0]; + + q = *p32++; + i3 = q; + i2 = q >> 8; + i1 = q >> 16; + i0 = q >> 24; + crc ^= t3_le[i3] ^ t2_le[i2] ^ t1_le[i1] ^ t0_le[i0]; +# else + /* slice by 4 algorithm */ + q = *p32++ ^ crc; + i3 = q; + i2 = q >> 8; + i1 = q >> 16; + i0 = q >> 24; + crc = t3_le[i3] ^ t2_le[i2] ^ t1_le[i1] ^ t0_le[i0]; +# endif +#else +# if CRC_LE_BITS == 64 + q = *p32++ ^ crc; + i3 = q >> 24; + i2 = q >> 16; + i1 = q >> 8; + i0 = q; + crc = t7_le[i3] ^ t6_le[i2] ^ t5_le[i1] ^ t4_le[i0]; + + q = *p32++; + i3 = q >> 24; + i2 = q >> 16; + i1 = q >> 8; + i0 = q; + crc ^= t3_le[i3] ^ t2_le[i2] ^ t1_le[i1] ^ t0_le[i0]; +# else + q = *p32++ ^ crc; + i3 = q >> 24; + i2 = q >> 16; + i1 = q >> 8; + i0 = q; + crc = t3_le[i3] ^ t2_le[i2] ^ t1_le[i1] ^ t0_le[i0]; +# endif +#endif + } + + p8 = (u8 *)p32; + + for (i = 0; i < end_bytes; i++) { +#ifdef __LITTLE_ENDIAN + i0 = *p8++ ^ crc; + crc = t0_le[i0] ^ (crc >> 8); +#else + i0 = *p8++ ^ (crc >> 24); + crc = t0_le[i0] ^ (crc << 8); +#endif + } -static inline u32 -crc32_body(u32 crc, unsigned char const *buf, size_t len, const u32 (*tab)[256]) + return __le32_to_cpu((__force __le32)crc); +} +#endif + +#if CRC_BE_BITS > 8 +static inline u32 crc32_be_body(u32 crc, u8 const *buf, size_t len) { -# ifdef __LITTLE_ENDIAN -# define DO_CRC(x) crc = tab[0][(crc ^ (x)) & 255] ^ (crc >> 8) -# define DO_CRC4 crc = tab[3][(crc) & 255] ^ \ - tab[2][(crc >> 8) & 255] ^ \ - tab[1][(crc >> 16) & 255] ^ \ - tab[0][(crc >> 24) & 255] -# else -# define DO_CRC(x) crc = tab[0][((crc >> 24) ^ (x)) & 255] ^ (crc << 8) -# define DO_CRC4 crc = tab[0][(crc) & 255] ^ \ - tab[1][(crc >> 8) & 255] ^ \ - tab[2][(crc >> 16) & 255] ^ \ - 
tab[3][(crc >> 24) & 255] -# endif - const u32 *b; - size_t rem_len; - - /* Align it */ - if (unlikely((long)buf & 3 && len)) { - do { - DO_CRC(*buf++); - } while ((--len) && ((long)buf)&3); + const u8 *p8; + const u32 *p32; + int init_bytes, end_bytes; + size_t words; + int i; + u32 q; + u8 i0, i1, i2, i3; + + crc = (__force u32) __cpu_to_be32(crc); + +#if CRC_LE_BITS == 64 + p8 = buf; + p32 = (u32 *)(((uintptr_t)p8 + 7) & ~7); + + init_bytes = (uintptr_t)p32 - (uintptr_t)p8; + if (init_bytes > len) + init_bytes = len; + words = (len - init_bytes) >> 3; + end_bytes = (len - init_bytes) & 7; +#else + p8 = buf; + p32 = (u32 *)(((uintptr_t)p8 + 3) & ~3); + + init_bytes = (uintptr_t)p32 - (uintptr_t)p8; + if (init_bytes > len) + init_bytes = len; + words = (len - init_bytes) >> 2; + end_bytes = (len - init_bytes) & 3; +#endif + + for (i = 0; i < init_bytes; i++) { +#ifdef __LITTLE_ENDIAN + i0 = *p8++ ^ crc; + crc = t0_be[i0] ^ (crc >> 8); +#else + i0 = *p8++ ^ (crc >> 24); + crc = t0_be[i0] ^ (crc << 8); +#endif } - rem_len = len & 3; - /* load data 32 bits wide, xor data 32 bits wide. */ - len = len >> 2; - b = (const u32 *)buf; - for (--b; len; --len) { - crc ^= *++b; /* use pre increment for speed */ - DO_CRC4; + + for (i = 0; i < words; i++) { +#ifdef __LITTLE_ENDIAN +# if CRC_LE_BITS == 64 + /* slice by 8 algorithm */ + q = *p32++ ^ crc; + i3 = q; + i2 = q >> 8; + i1 = q >> 16; + i0 = q >> 24; + crc = t7_be[i3] ^ t6_be[i2] ^ t5_be[i1] ^ t4_be[i0]; + + q = *p32++; + i3 = q; + i2 = q >> 8; + i1 = q >> 16; + i0 = q >> 24; + crc ^= t3_be[i3] ^ t2_be[i2] ^ t1_be[i1] ^ t0_be[i0]; +# else + /* slice by 4 algorithm */ + q = *p32++ ^ crc; + i3 = q; + i2 = q >> 8; + i1 = q >> 16; + i0 = q >> 24; + crc = t3_be[i3] ^ t2_be[i2] ^ t1_be[i1] ^ t0_be[i0]; +# endif +#else +# if CRC_LE_BITS == 64 + q = *p32++ ^ crc; + i3 = q >> 24; + i2 = q >> 16; + i1 = q >> 8; + i0 = q; + crc = t7_be[i3] ^ t6_be[i2] ^ t5_be[i1] ^ t4_be[i0]; + + q = *p32++; + i3 = q >> 24; + i2 = q >> 16; + i1 = q >> 8; + i0 = q; + crc ^= t3_be[i3] ^ t2_be[i2] ^ t1_be[i1] ^ t0_be[i0]; +# else + q = *p32++ ^ crc; + i3 = q >> 24; + i2 = q >> 16; + i1 = q >> 8; + i0 = q; + crc = t3_be[i3] ^ t2_be[i2] ^ t1_be[i1] ^ t0_be[i0]; +# endif +#endif } - len = rem_len; - /* And the last few bytes */ - if (len) { - u8 *p = (u8 *)(b + 1) - 1; - do { - DO_CRC(*++p); /* use pre increment for speed */ - } while (--len); + + p8 = (u8 *)p32; + + for (i = 0; i < end_bytes; i++) { +#ifdef __LITTLE_ENDIAN + i0 = *p8++ ^ crc; + crc = t0_be[i0] ^ (crc >> 8); +#else + i0 = *p8++ ^ (crc >> 24); + crc = t0_be[i0] ^ (crc << 8); +#endif } - return crc; -#undef DO_CRC -#undef DO_CRC4 + + return __be32_to_cpu((__force __be32)crc); } #endif + /** * crc32_le() - Calculate bitwise little-endian Ethernet AUTODIN II CRC32 * @crc: seed value for computation. ~0 for Ethernet, sometimes 0 for @@ -100,53 +278,40 @@ crc32_body(u32 crc, unsigned char const *buf, size_t len, const u32 (*tab)[256]) * @p: pointer to buffer over which CRC is run * @len: length of buffer @p */ -u32 __pure crc32_le(u32 crc, unsigned char const *p, size_t len); - -#if CRC_LE_BITS == 1 -/* - * In fact, the table-based code will work in this case, but it can be - * simplified by inlining the table in ?: form. - */ - u32 __pure crc32_le(u32 crc, unsigned char const *p, size_t len) { +#if CRC_LE_BITS == 1 int i; while (len--) { crc ^= *p++; for (i = 0; i < 8; i++) crc = (crc >> 1) ^ ((crc & 1) ? 
CRCPOLY_LE : 0); } - return crc; -} -#else /* Table-based approach */ - -u32 __pure crc32_le(u32 crc, unsigned char const *p, size_t len) -{ -# if CRC_LE_BITS == 8 - const u32 (*tab)[] = crc32table_le; - - crc = __cpu_to_le32(crc); - crc = crc32_body(crc, p, len, tab); - return __le32_to_cpu(crc); +# elif CRC_LE_BITS == 2 + while (len--) { + crc ^= *p++; + crc = (crc >> 2) ^ t0_le[crc & 0x03]; + crc = (crc >> 2) ^ t0_le[crc & 0x03]; + crc = (crc >> 2) ^ t0_le[crc & 0x03]; + crc = (crc >> 2) ^ t0_le[crc & 0x03]; + } # elif CRC_LE_BITS == 4 while (len--) { crc ^= *p++; - crc = (crc >> 4) ^ crc32table_le[crc & 15]; - crc = (crc >> 4) ^ crc32table_le[crc & 15]; + crc = (crc >> 4) ^ t0_le[crc & 0x0f]; + crc = (crc >> 4) ^ t0_le[crc & 0x0f]; } - return crc; -# elif CRC_LE_BITS == 2 +# elif CRC_LE_BITS == 8 while (len--) { crc ^= *p++; - crc = (crc >> 2) ^ crc32table_le[crc & 3]; - crc = (crc >> 2) ^ crc32table_le[crc & 3]; - crc = (crc >> 2) ^ crc32table_le[crc & 3]; - crc = (crc >> 2) ^ crc32table_le[crc & 3]; + crc = (crc >> 8) ^ t0_le[crc & 0xff]; } - return crc; +# else + crc = crc32_le_body(crc, p, len); # endif + return crc; } -#endif +EXPORT_SYMBOL(crc32_le); /** * crc32_be() - Calculate bitwise big-endian Ethernet AUTODIN II CRC32 @@ -155,57 +320,40 @@ u32 __pure crc32_le(u32 crc, unsigned char const *p, size_t len) * @p: pointer to buffer over which CRC is run * @len: length of buffer @p */ -u32 __pure crc32_be(u32 crc, unsigned char const *p, size_t len); - -#if CRC_BE_BITS == 1 -/* - * In fact, the table-based code will work in this case, but it can be - * simplified by inlining the table in ?: form. - */ - u32 __pure crc32_be(u32 crc, unsigned char const *p, size_t len) { +#if CRC_BE_BITS == 1 int i; while (len--) { crc ^= *p++ << 24; for (i = 0; i < 8; i++) - crc = - (crc << 1) ^ ((crc & 0x80000000) ? CRCPOLY_BE : - 0); + crc = (crc << 1) ^ + ((crc & 0x80000000) ? CRCPOLY_BE : 0); + } +# elif CRC_BE_BITS == 2 + while (len--) { + crc ^= *p++ << 24; + crc = (crc << 2) ^ t0_be[crc >> 30]; + crc = (crc << 2) ^ t0_be[crc >> 30]; + crc = (crc << 2) ^ t0_be[crc >> 30]; + crc = (crc << 2) ^ t0_be[crc >> 30]; } - return crc; -} - -#else /* Table-based approach */ -u32 __pure crc32_be(u32 crc, unsigned char const *p, size_t len) -{ -# if CRC_BE_BITS == 8 - const u32 (*tab)[] = crc32table_be; - - crc = __cpu_to_be32(crc); - crc = crc32_body(crc, p, len, tab); - return __be32_to_cpu(crc); # elif CRC_BE_BITS == 4 while (len--) { crc ^= *p++ << 24; - crc = (crc << 4) ^ crc32table_be[crc >> 28]; - crc = (crc << 4) ^ crc32table_be[crc >> 28]; + crc = (crc << 4) ^ t0_be[crc >> 28]; + crc = (crc << 4) ^ t0_be[crc >> 28]; } - return crc; -# elif CRC_BE_BITS == 2 +# elif CRC_BE_BITS == 8 while (len--) { crc ^= *p++ << 24; - crc = (crc << 2) ^ crc32table_be[crc >> 30]; - crc = (crc << 2) ^ crc32table_be[crc >> 30]; - crc = (crc << 2) ^ crc32table_be[crc >> 30]; - crc = (crc << 2) ^ crc32table_be[crc >> 30]; + crc = (crc << 8) ^ t0_be[crc >> 24]; } - return crc; +# else + crc = crc32_be_body(crc, p, len); # endif + return crc; } -#endif - -EXPORT_SYMBOL(crc32_le); EXPORT_SYMBOL(crc32_be); /* diff --git a/lib/crc32defs.h b/lib/crc32defs.h index 9b6773d73749..fe3fccdbd65d 100644 --- a/lib/crc32defs.h +++ b/lib/crc32defs.h @@ -6,27 +6,29 @@ #define CRCPOLY_LE 0xedb88320 #define CRCPOLY_BE 0x04c11db7 -/* How many bits at a time to use. Requires a table of 4<<CRC_xx_BITS bytes. */ +/* How many bits at a time to use. Valid values are 1, 2, 4, 8, 32 and 64. 
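[Editor's note] The crc32_le()/crc32_be() interfaces are unchanged by the slice-by-4/8 rework; only the table width selected by CRC_LE_BITS/CRC_BE_BITS differs. For reference, a minimal caller (the ~0 seed and final inversion follow the common Ethernet convention and are not required by the API itself):

#include <linux/crc32.h>

/* Sketch of a typical caller over an arbitrary byte buffer. */
static u32 frame_crc(const u8 *data, size_t len)
{
	return crc32_le(~0U, data, len) ^ ~0U;
}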
*/ /* For less performance-sensitive, use 4 */ #ifndef CRC_LE_BITS -# define CRC_LE_BITS 8 +# define CRC_LE_BITS 64 #endif #ifndef CRC_BE_BITS -# define CRC_BE_BITS 8 +# define CRC_BE_BITS 64 #endif /* * Little-endian CRC computation. Used with serial bit streams sent * lsbit-first. Be sure to use cpu_to_le32() to append the computed CRC. */ -#if CRC_LE_BITS > 8 || CRC_LE_BITS < 1 || CRC_LE_BITS & CRC_LE_BITS-1 -# error CRC_LE_BITS must be a power of 2 between 1 and 8 +#if CRC_LE_BITS > 64 || CRC_LE_BITS < 1 || CRC_LE_BITS == 16 || \ + CRC_LE_BITS & CRC_LE_BITS-1 +# error "CRC_LE_BITS must be one of {1, 2, 4, 8, 32, 64}" #endif /* * Big-endian CRC computation. Used with serial bit streams sent * msbit-first. Be sure to use cpu_to_be32() to append the computed CRC. */ -#if CRC_BE_BITS > 8 || CRC_BE_BITS < 1 || CRC_BE_BITS & CRC_BE_BITS-1 -# error CRC_BE_BITS must be a power of 2 between 1 and 8 +#if CRC_BE_BITS > 64 || CRC_BE_BITS < 1 || CRC_BE_BITS == 16 || \ + CRC_BE_BITS & CRC_BE_BITS-1 +# error "CRC_BE_BITS must be one of {1, 2, 4, 8, 32, 64}" #endif diff --git a/lib/debugobjects.c b/lib/debugobjects.c index a78b7c6e042c..b07b5b82f396 100644 --- a/lib/debugobjects.c +++ b/lib/debugobjects.c @@ -563,6 +563,39 @@ out_unlock: } /** + * debug_object_assert_init - debug checks when object should be init-ed + * @addr: address of the object + * @descr: pointer to an object specific debug description structure + */ +void debug_object_assert_init(void *addr, struct debug_obj_descr *descr) +{ + struct debug_bucket *db; + struct debug_obj *obj; + unsigned long flags; + + if (!debug_objects_enabled) + return; + + db = get_bucket((unsigned long) addr); + + raw_spin_lock_irqsave(&db->lock, flags); + + obj = lookup_object(addr, db); + if (!obj) { + raw_spin_unlock_irqrestore(&db->lock, flags); + /* + * Maybe the object is static. Let the type specific + * code decide what to do. + */ + debug_object_fixup(descr->fixup_assert_init, addr, + ODEBUG_STATE_NOTAVAILABLE); + return; + } + + raw_spin_unlock_irqrestore(&db->lock, flags); +} + +/** * debug_object_active_state - debug checks object usage state machine * @addr: address of the object * @descr: pointer to an object specific debug description structure diff --git a/lib/gen_crc32table.c b/lib/gen_crc32table.c index 85d0e412a04f..e1303d7c6b51 100644 --- a/lib/gen_crc32table.c +++ b/lib/gen_crc32table.c @@ -4,11 +4,20 @@ #define ENTRIES_PER_LINE 4 +#if CRC_LE_BITS <= 8 #define LE_TABLE_SIZE (1 << CRC_LE_BITS) +#else +#define LE_TABLE_SIZE 256 +#endif + +#if CRC_BE_BITS <= 8 #define BE_TABLE_SIZE (1 << CRC_BE_BITS) +#else +#define BE_TABLE_SIZE 256 +#endif -static uint32_t crc32table_le[4][LE_TABLE_SIZE]; -static uint32_t crc32table_be[4][BE_TABLE_SIZE]; +static uint32_t crc32table_le[8][256]; +static uint32_t crc32table_be[8][256]; /** * crc32init_le() - allocate and initialize LE table data @@ -24,14 +33,14 @@ static void crc32init_le(void) crc32table_le[0][0] = 0; - for (i = 1 << (CRC_LE_BITS - 1); i; i >>= 1) { + for (i = LE_TABLE_SIZE >> 1; i; i >>= 1) { crc = (crc >> 1) ^ ((crc & 1) ? 
CRCPOLY_LE : 0); for (j = 0; j < LE_TABLE_SIZE; j += 2 * i) crc32table_le[0][i + j] = crc ^ crc32table_le[0][j]; } for (i = 0; i < LE_TABLE_SIZE; i++) { crc = crc32table_le[0][i]; - for (j = 1; j < 4; j++) { + for (j = 1; j < 8; j++) { crc = crc32table_le[0][crc & 0xff] ^ (crc >> 8); crc32table_le[j][i] = crc; } @@ -55,44 +64,58 @@ static void crc32init_be(void) } for (i = 0; i < BE_TABLE_SIZE; i++) { crc = crc32table_be[0][i]; - for (j = 1; j < 4; j++) { + for (j = 1; j < 8; j++) { crc = crc32table_be[0][(crc >> 24) & 0xff] ^ (crc << 8); crc32table_be[j][i] = crc; } } } -static void output_table(uint32_t table[4][256], int len, char *trans) +static void output_table(uint32_t table[8][256], int len, char trans) { int i, j; - for (j = 0 ; j < 4; j++) { - printf("{"); + for (j = 0 ; j < 8; j++) { + printf("static const u32 t%d_%ce[] = {", j, trans); for (i = 0; i < len - 1; i++) { - if (i % ENTRIES_PER_LINE == 0) + if ((i % ENTRIES_PER_LINE) == 0) printf("\n"); - printf("%s(0x%8.8xL), ", trans, table[j][i]); + printf("to%ce(0x%8.8xL),", trans, table[j][i]); + if ((i % ENTRIES_PER_LINE) != (ENTRIES_PER_LINE - 1)) + printf(" "); + } + printf("to%ce(0x%8.8xL)};\n\n", trans, table[j][len - 1]); + + if (trans == 'l') { + if ((j+1)*8 >= CRC_LE_BITS) + break; + } else { + if ((j+1)*8 >= CRC_BE_BITS) + break; } - printf("%s(0x%8.8xL)},\n", trans, table[j][len - 1]); } } int main(int argc, char** argv) { - printf("/* this file is generated - do not edit */\n\n"); + printf("/*\n"); + printf(" * crc32table.h - CRC32 tables\n"); + printf(" * this file is generated - do not edit\n"); + printf(" * # gen_crc32table > crc32table.h\n"); + printf(" * with\n"); + printf(" * CRC_LE_BITS = %d\n", CRC_LE_BITS); + printf(" * CRC_BE_BITS = %d\n", CRC_BE_BITS); + printf(" */\n"); + printf("\n"); if (CRC_LE_BITS > 1) { crc32init_le(); - printf("static const u32 crc32table_le[4][256] = {"); - output_table(crc32table_le, LE_TABLE_SIZE, "tole"); - printf("};\n"); + output_table(crc32table_le, LE_TABLE_SIZE, 'l'); } if (CRC_BE_BITS > 1) { crc32init_be(); - printf("static const u32 crc32table_be[4][256] = {"); - output_table(crc32table_be, BE_TABLE_SIZE, "tobe"); - printf("};\n"); + output_table(crc32table_be, BE_TABLE_SIZE, 'b'); } return 0; diff --git a/lib/kstrtox.c b/lib/kstrtox.c index 5e066759f551..7a94c8f14e29 100644 --- a/lib/kstrtox.c +++ b/lib/kstrtox.c @@ -18,26 +18,40 @@ #include <linux/module.h> #include <linux/types.h> #include <asm/uaccess.h> +#include "kstrtox.h" -static int _kstrtoull(const char *s, unsigned int base, unsigned long long *res) +const char *_parse_integer_fixup_radix(const char *s, unsigned int *base) { - unsigned long long acc; - int ok; - - if (base == 0) { + if (*base == 0) { if (s[0] == '0') { if (_tolower(s[1]) == 'x' && isxdigit(s[2])) - base = 16; + *base = 16; else - base = 8; + *base = 8; } else - base = 10; + *base = 10; } - if (base == 16 && s[0] == '0' && _tolower(s[1]) == 'x') + if (*base == 16 && s[0] == '0' && _tolower(s[1]) == 'x') s += 2; + return s; +} - acc = 0; - ok = 0; +/* + * Convert non-negative integer string representation in explicitly given radix + * to an integer. + * Return number of characters consumed maybe or-ed with overflow bit. + * If overflow occurs, result integer (incorrect) is still returned. + * + * Don't you dare use this function. 
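[Editor's note] _parse_integer() above is an internal helper; ordinary kernel code keeps using the kstrto*() wrappers built on top of it, which report overflow and reject trailing junk. A hedged usage sketch (the store-style helper name is invented):

/* Hypothetical sysfs-style parameter parser on top of the kstrto* helpers. */
static int parse_threshold(const char *buf, unsigned long *out)
{
	/* base 0: auto-detects decimal, 0x-prefixed hex and 0-prefixed octal */
	int err = kstrtoul(buf, 0, out);

	if (err)	/* -ERANGE on overflow, -EINVAL on malformed input */
		return err;
	return 0;
}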
+ */ +unsigned int _parse_integer(const char *s, unsigned int base, unsigned long long *res) +{ + unsigned int rv; + int overflow; + + *res = 0; + rv = 0; + overflow = 0; while (*s) { unsigned int val; @@ -45,23 +59,40 @@ static int _kstrtoull(const char *s, unsigned int base, unsigned long long *res) val = *s - '0'; else if ('a' <= _tolower(*s) && _tolower(*s) <= 'f') val = _tolower(*s) - 'a' + 10; - else if (*s == '\n' && *(s + 1) == '\0') - break; else - return -EINVAL; + break; if (val >= base) - return -EINVAL; - if (acc > div_u64(ULLONG_MAX - val, base)) - return -ERANGE; - acc = acc * base + val; - ok = 1; - + break; + if (*res > div_u64(ULLONG_MAX - val, base)) + overflow = 1; + *res = *res * base + val; + rv++; s++; } - if (!ok) + if (overflow) + rv |= KSTRTOX_OVERFLOW; + return rv; +} + +static int _kstrtoull(const char *s, unsigned int base, unsigned long long *res) +{ + unsigned long long _res; + unsigned int rv; + + s = _parse_integer_fixup_radix(s, &base); + rv = _parse_integer(s, base, &_res); + if (rv & KSTRTOX_OVERFLOW) + return -ERANGE; + rv &= ~KSTRTOX_OVERFLOW; + if (rv == 0) + return -EINVAL; + s += rv; + if (*s == '\n') + s++; + if (*s) return -EINVAL; - *res = acc; + *res = _res; return 0; } diff --git a/lib/kstrtox.h b/lib/kstrtox.h new file mode 100644 index 000000000000..f13eeeaf441d --- /dev/null +++ b/lib/kstrtox.h @@ -0,0 +1,8 @@ +#ifndef _LIB_KSTRTOX_H +#define _LIB_KSTRTOX_H + +#define KSTRTOX_OVERFLOW (1U << 31) +const char *_parse_integer_fixup_radix(const char *s, unsigned int *base); +unsigned int _parse_integer(const char *s, unsigned int base, unsigned long long *res); + +#endif diff --git a/lib/radix-tree.c b/lib/radix-tree.c index a2f9da59c197..d9df7454519c 100644 --- a/lib/radix-tree.c +++ b/lib/radix-tree.c @@ -576,7 +576,6 @@ int radix_tree_tag_get(struct radix_tree_root *root, { unsigned int height, shift; struct radix_tree_node *node; - int saw_unset_tag = 0; /* check the root's tag bit */ if (!root_tag_get(root, tag)) @@ -603,15 +602,10 @@ int radix_tree_tag_get(struct radix_tree_root *root, return 0; offset = (index >> shift) & RADIX_TREE_MAP_MASK; - - /* - * This is just a debug check. Later, we can bale as soon as - * we see an unset tag. - */ if (!tag_get(node, tag, offset)) - saw_unset_tag = 1; + return 0; if (height == 1) - return !!tag_get(node, tag, offset); + return 1; node = rcu_dereference_raw(node->slots[offset]); shift -= RADIX_TREE_MAP_SHIFT; height--; diff --git a/lib/string.c b/lib/string.c index 01fad9b203e1..18f511194bc6 100644 --- a/lib/string.c +++ b/lib/string.c @@ -756,3 +756,57 @@ void *memchr(const void *s, int c, size_t n) } EXPORT_SYMBOL(memchr); #endif + +static void *check_bytes8(const u8 *start, u8 value, unsigned int bytes) +{ + while (bytes) { + if (*start != value) + return (void *)start; + start++; + bytes--; + } + return NULL; +} + +/** + * memchr_inv - Find a character in an area of memory. + * @s: The memory area + * @c: The byte to search for + * @n: The size of the area. + * + * returns the address of the first character other than @c, or %NULL + * if the whole buffer contains just @c. 
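[Editor's note] memchr_inv() replaces open-coded "is this buffer entirely byte X" loops, as the debug-pagealloc conversion further down shows. A minimal sketch of a caller (the 0xaa fill value and report format are illustrative):

#include <linux/string.h>
#include <linux/printk.h>

/* Sketch: return false and report the first byte breaking an all-0xaa fill. */
static bool buf_is_all_filled(u8 *buf, size_t len)
{
	u8 *bad = memchr_inv(buf, 0xaa, len);

	if (bad)
		pr_warn("fill check failed at offset %td (0x%02x)\n",
			bad - buf, *bad);
	return bad == NULL;
}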
+ */ +void *memchr_inv(const void *start, int c, size_t bytes) +{ + u8 value = c; + u64 value64; + unsigned int words, prefix; + + if (bytes <= 16) + return check_bytes8(start, value, bytes); + + value64 = value | value << 8 | value << 16 | value << 24; + value64 = (value64 & 0xffffffff) | value64 << 32; + prefix = 8 - ((unsigned long)start) % 8; + + if (prefix) { + u8 *r = check_bytes8(start, value, prefix); + if (r) + return r; + start += prefix; + bytes -= prefix; + } + + words = bytes / 8; + + while (words) { + if (*(u64 *)start != value64) + return check_bytes8(start, value, 8); + start += 8; + words--; + } + + return check_bytes8(start, value, bytes % 8); +} +EXPORT_SYMBOL(memchr_inv); diff --git a/arch/s390/lib/usercopy.c b/lib/usercopy.c index 14b363fec8a2..14b363fec8a2 100644 --- a/arch/s390/lib/usercopy.c +++ b/lib/usercopy.c diff --git a/lib/vsprintf.c b/lib/vsprintf.c index d7222a9c8267..c1a3927326eb 100644 --- a/lib/vsprintf.c +++ b/lib/vsprintf.c @@ -31,17 +31,7 @@ #include <asm/div64.h> #include <asm/sections.h> /* for dereference_function_descriptor() */ -static unsigned int simple_guess_base(const char *cp) -{ - if (cp[0] == '0') { - if (_tolower(cp[1]) == 'x' && isxdigit(cp[2])) - return 16; - else - return 8; - } else { - return 10; - } -} +#include "kstrtox.h" /** * simple_strtoull - convert a string to an unsigned long long @@ -51,23 +41,14 @@ static unsigned int simple_guess_base(const char *cp) */ unsigned long long simple_strtoull(const char *cp, char **endp, unsigned int base) { - unsigned long long result = 0; + unsigned long long result; + unsigned int rv; - if (!base) - base = simple_guess_base(cp); + cp = _parse_integer_fixup_radix(cp, &base); + rv = _parse_integer(cp, base, &result); + /* FIXME */ + cp += (rv & ~KSTRTOX_OVERFLOW); - if (base == 16 && cp[0] == '0' && _tolower(cp[1]) == 'x') - cp += 2; - - while (isxdigit(*cp)) { - unsigned int value; - - value = isdigit(*cp) ? 
*cp - '0' : _tolower(*cp) - 'a' + 10; - if (value >= base) - break; - result = result * base + value; - cp++; - } if (endp) *endp = (char *)cp; diff --git a/mm/Kconfig b/mm/Kconfig index 804d26e2b969..10d7986b64b5 100644 --- a/mm/Kconfig +++ b/mm/Kconfig @@ -131,6 +131,9 @@ config SPARSEMEM_VMEMMAP config HAVE_MEMBLOCK boolean +config NO_BOOTMEM + boolean + # eventually, we can have this option just 'select SPARSEMEM' config MEMORY_HOTPLUG bool "Allow for memory hot-add" diff --git a/mm/Makefile b/mm/Makefile index 72c9e4fdde00..306742a28266 100644 --- a/mm/Makefile +++ b/mm/Makefile @@ -5,7 +5,8 @@ mmu-y := nommu.o mmu-$(CONFIG_MMU) := fremap.o highmem.o madvise.o memory.o mincore.o \ mlock.o mmap.o mprotect.o mremap.o msync.o rmap.o \ - vmalloc.o pagewalk.o pgtable-generic.o + vmalloc.o pagewalk.o pgtable-generic.o \ + process_vm_access.o obj-y := filemap.o mempool.o oom_kill.o fadvise.o \ maccess.o page_alloc.o page-writeback.o \ diff --git a/mm/compaction.c b/mm/compaction.c index 6cc604bd5649..a0e420207ebf 100644 --- a/mm/compaction.c +++ b/mm/compaction.c @@ -35,10 +35,6 @@ struct compact_control { unsigned long migrate_pfn; /* isolate_migratepages search base */ bool sync; /* Synchronous migration */ - /* Account for isolated anon and file pages */ - unsigned long nr_anon; - unsigned long nr_file; - unsigned int order; /* order a direct compactor needs */ int migratetype; /* MOVABLE, RECLAIMABLE etc */ struct zone *zone; @@ -223,17 +219,13 @@ static void isolate_freepages(struct zone *zone, static void acct_isolated(struct zone *zone, struct compact_control *cc) { struct page *page; - unsigned int count[NR_LRU_LISTS] = { 0, }; + unsigned int count[2] = { 0, }; - list_for_each_entry(page, &cc->migratepages, lru) { - int lru = page_lru_base_type(page); - count[lru]++; - } + list_for_each_entry(page, &cc->migratepages, lru) + count[!!page_is_file_cache(page)]++; - cc->nr_anon = count[LRU_ACTIVE_ANON] + count[LRU_INACTIVE_ANON]; - cc->nr_file = count[LRU_ACTIVE_FILE] + count[LRU_INACTIVE_FILE]; - __mod_zone_page_state(zone, NR_ISOLATED_ANON, cc->nr_anon); - __mod_zone_page_state(zone, NR_ISOLATED_FILE, cc->nr_file); + __mod_zone_page_state(zone, NR_ISOLATED_ANON, count[0]); + __mod_zone_page_state(zone, NR_ISOLATED_FILE, count[1]); } /* Similar to reclaim, but different enough that they don't share logic */ @@ -269,6 +261,7 @@ static isolate_migrate_t isolate_migratepages(struct zone *zone, unsigned long last_pageblock_nr = 0, pageblock_nr; unsigned long nr_scanned = 0, nr_isolated = 0; struct list_head *migratelist = &cc->migratepages; + isolate_mode_t mode = ISOLATE_ACTIVE|ISOLATE_INACTIVE; /* Do not scan outside zone boundaries */ low_pfn = max(cc->migrate_pfn, zone->zone_start_pfn); @@ -356,8 +349,11 @@ static isolate_migrate_t isolate_migratepages(struct zone *zone, continue; } + if (!cc->sync) + mode |= ISOLATE_CLEAN; + /* Try isolate the page */ - if (__isolate_lru_page(page, ISOLATE_BOTH, 0) != 0) + if (__isolate_lru_page(page, mode, 0) != 0) continue; VM_BUG_ON(PageTransCompound(page)); diff --git a/mm/debug-pagealloc.c b/mm/debug-pagealloc.c index a1e3324de2b5..2618933efdb3 100644 --- a/mm/debug-pagealloc.c +++ b/mm/debug-pagealloc.c @@ -1,7 +1,9 @@ #include <linux/kernel.h> +#include <linux/string.h> #include <linux/mm.h> #include <linux/page-debug-flags.h> #include <linux/poison.h> +#include <linux/ratelimit.h> static inline void set_page_poison(struct page *page) { @@ -59,14 +61,12 @@ static bool single_bit_flip(unsigned char a, unsigned char b) static void 
check_poison_mem(unsigned char *mem, size_t bytes) { + static DEFINE_RATELIMIT_STATE(ratelimit, 5 * HZ, 10); unsigned char *start; unsigned char *end; - for (start = mem; start < mem + bytes; start++) { - if (*start != PAGE_POISON) - break; - } - if (start == mem + bytes) + start = memchr_inv(mem, PAGE_POISON, bytes); + if (!start) return; for (end = mem + bytes - 1; end > start; end--) { @@ -74,7 +74,7 @@ static void check_poison_mem(unsigned char *mem, size_t bytes) break; } - if (!printk_ratelimit()) + if (!__ratelimit(&ratelimit)) return; else if (start == end && single_bit_flip(*start, PAGE_POISON)) printk(KERN_ERR "pagealloc: single bit error\n"); diff --git a/mm/huge_memory.c b/mm/huge_memory.c index e2d1587be269..6b072bdccf81 100644 --- a/mm/huge_memory.c +++ b/mm/huge_memory.c @@ -1052,6 +1052,51 @@ int mincore_huge_pmd(struct vm_area_struct *vma, pmd_t *pmd, return ret; } +int move_huge_pmd(struct vm_area_struct *vma, struct vm_area_struct *new_vma, + unsigned long old_addr, + unsigned long new_addr, unsigned long old_end, + pmd_t *old_pmd, pmd_t *new_pmd) +{ + int ret = 0; + pmd_t pmd; + + struct mm_struct *mm = vma->vm_mm; + + if ((old_addr & ~HPAGE_PMD_MASK) || + (new_addr & ~HPAGE_PMD_MASK) || + old_end - old_addr < HPAGE_PMD_SIZE || + (new_vma->vm_flags & VM_NOHUGEPAGE)) + goto out; + + /* + * The destination pmd shouldn't be established, free_pgtables() + * should have release it. + */ + if (WARN_ON(!pmd_none(*new_pmd))) { + VM_BUG_ON(pmd_trans_huge(*new_pmd)); + goto out; + } + + spin_lock(&mm->page_table_lock); + if (likely(pmd_trans_huge(*old_pmd))) { + if (pmd_trans_splitting(*old_pmd)) { + spin_unlock(&mm->page_table_lock); + wait_split_huge_page(vma->anon_vma, old_pmd); + ret = -1; + } else { + pmd = pmdp_get_and_clear(mm, old_addr, old_pmd); + VM_BUG_ON(!pmd_none(*new_pmd)); + set_pmd_at(mm, new_addr, new_pmd, pmd); + spin_unlock(&mm->page_table_lock); + ret = 1; + } + } else { + spin_unlock(&mm->page_table_lock); + } +out: + return ret; +} + int change_huge_pmd(struct vm_area_struct *vma, pmd_t *pmd, unsigned long addr, pgprot_t newprot) { diff --git a/mm/memblock.c b/mm/memblock.c index ccbf97339592..b7ed63633581 100644 --- a/mm/memblock.c +++ b/mm/memblock.c @@ -626,6 +626,12 @@ phys_addr_t __init memblock_phys_mem_size(void) return memblock.memory_size; } +/* lowest address */ +phys_addr_t __init_memblock memblock_start_of_DRAM(void) +{ + return memblock.memory.regions[0].base; +} + phys_addr_t __init_memblock memblock_end_of_DRAM(void) { int idx = memblock.memory.cnt - 1; diff --git a/mm/memcontrol.c b/mm/memcontrol.c index 88706a76d886..54b35b35ea04 100644 --- a/mm/memcontrol.c +++ b/mm/memcontrol.c @@ -202,8 +202,8 @@ struct mem_cgroup_eventfd_list { struct eventfd_ctx *eventfd; }; -static void mem_cgroup_threshold(struct mem_cgroup *mem); -static void mem_cgroup_oom_notify(struct mem_cgroup *mem); +static void mem_cgroup_threshold(struct mem_cgroup *memcg); +static void mem_cgroup_oom_notify(struct mem_cgroup *memcg); enum { SCAN_BY_LIMIT, @@ -408,29 +408,29 @@ enum charge_type { #define MEM_CGROUP_RECLAIM_SOFT_BIT 0x2 #define MEM_CGROUP_RECLAIM_SOFT (1 << MEM_CGROUP_RECLAIM_SOFT_BIT) -static void mem_cgroup_get(struct mem_cgroup *mem); -static void mem_cgroup_put(struct mem_cgroup *mem); -static struct mem_cgroup *parent_mem_cgroup(struct mem_cgroup *mem); -static void drain_all_stock_async(struct mem_cgroup *mem); +static void mem_cgroup_get(struct mem_cgroup *memcg); +static void mem_cgroup_put(struct mem_cgroup *memcg); +static struct mem_cgroup 
*parent_mem_cgroup(struct mem_cgroup *memcg); +static void drain_all_stock_async(struct mem_cgroup *memcg); static struct mem_cgroup_per_zone * -mem_cgroup_zoneinfo(struct mem_cgroup *mem, int nid, int zid) +mem_cgroup_zoneinfo(struct mem_cgroup *memcg, int nid, int zid) { - return &mem->info.nodeinfo[nid]->zoneinfo[zid]; + return &memcg->info.nodeinfo[nid]->zoneinfo[zid]; } -struct cgroup_subsys_state *mem_cgroup_css(struct mem_cgroup *mem) +struct cgroup_subsys_state *mem_cgroup_css(struct mem_cgroup *memcg) { - return &mem->css; + return &memcg->css; } static struct mem_cgroup_per_zone * -page_cgroup_zoneinfo(struct mem_cgroup *mem, struct page *page) +page_cgroup_zoneinfo(struct mem_cgroup *memcg, struct page *page) { int nid = page_to_nid(page); int zid = page_zonenum(page); - return mem_cgroup_zoneinfo(mem, nid, zid); + return mem_cgroup_zoneinfo(memcg, nid, zid); } static struct mem_cgroup_tree_per_zone * @@ -449,7 +449,7 @@ soft_limit_tree_from_page(struct page *page) } static void -__mem_cgroup_insert_exceeded(struct mem_cgroup *mem, +__mem_cgroup_insert_exceeded(struct mem_cgroup *memcg, struct mem_cgroup_per_zone *mz, struct mem_cgroup_tree_per_zone *mctz, unsigned long long new_usage_in_excess) @@ -483,7 +483,7 @@ __mem_cgroup_insert_exceeded(struct mem_cgroup *mem, } static void -__mem_cgroup_remove_exceeded(struct mem_cgroup *mem, +__mem_cgroup_remove_exceeded(struct mem_cgroup *memcg, struct mem_cgroup_per_zone *mz, struct mem_cgroup_tree_per_zone *mctz) { @@ -494,17 +494,17 @@ __mem_cgroup_remove_exceeded(struct mem_cgroup *mem, } static void -mem_cgroup_remove_exceeded(struct mem_cgroup *mem, +mem_cgroup_remove_exceeded(struct mem_cgroup *memcg, struct mem_cgroup_per_zone *mz, struct mem_cgroup_tree_per_zone *mctz) { spin_lock(&mctz->lock); - __mem_cgroup_remove_exceeded(mem, mz, mctz); + __mem_cgroup_remove_exceeded(memcg, mz, mctz); spin_unlock(&mctz->lock); } -static void mem_cgroup_update_tree(struct mem_cgroup *mem, struct page *page) +static void mem_cgroup_update_tree(struct mem_cgroup *memcg, struct page *page) { unsigned long long excess; struct mem_cgroup_per_zone *mz; @@ -517,9 +517,9 @@ static void mem_cgroup_update_tree(struct mem_cgroup *mem, struct page *page) * Necessary to update all ancestors when hierarchy is used. * because their event counter is not touched. */ - for (; mem; mem = parent_mem_cgroup(mem)) { - mz = mem_cgroup_zoneinfo(mem, nid, zid); - excess = res_counter_soft_limit_excess(&mem->res); + for (; memcg; memcg = parent_mem_cgroup(memcg)) { + mz = mem_cgroup_zoneinfo(memcg, nid, zid); + excess = res_counter_soft_limit_excess(&memcg->res); /* * We have to update the tree if mz is on RB-tree or * mem is over its softlimit. @@ -528,18 +528,18 @@ static void mem_cgroup_update_tree(struct mem_cgroup *mem, struct page *page) spin_lock(&mctz->lock); /* if on-tree, remove it */ if (mz->on_tree) - __mem_cgroup_remove_exceeded(mem, mz, mctz); + __mem_cgroup_remove_exceeded(memcg, mz, mctz); /* * Insert again. mz->usage_in_excess will be updated. * If excess is 0, no tree ops. 
*/ - __mem_cgroup_insert_exceeded(mem, mz, mctz, excess); + __mem_cgroup_insert_exceeded(memcg, mz, mctz, excess); spin_unlock(&mctz->lock); } } } -static void mem_cgroup_remove_from_trees(struct mem_cgroup *mem) +static void mem_cgroup_remove_from_trees(struct mem_cgroup *memcg) { int node, zone; struct mem_cgroup_per_zone *mz; @@ -547,9 +547,9 @@ static void mem_cgroup_remove_from_trees(struct mem_cgroup *mem) for_each_node_state(node, N_POSSIBLE) { for (zone = 0; zone < MAX_NR_ZONES; zone++) { - mz = mem_cgroup_zoneinfo(mem, node, zone); + mz = mem_cgroup_zoneinfo(memcg, node, zone); mctz = soft_limit_tree_node_zone(node, zone); - mem_cgroup_remove_exceeded(mem, mz, mctz); + mem_cgroup_remove_exceeded(memcg, mz, mctz); } } } @@ -610,7 +610,7 @@ mem_cgroup_largest_soft_limit_node(struct mem_cgroup_tree_per_zone *mctz) * common workload, threashold and synchonization as vmstat[] should be * implemented. */ -static long mem_cgroup_read_stat(struct mem_cgroup *mem, +static long mem_cgroup_read_stat(struct mem_cgroup *memcg, enum mem_cgroup_stat_index idx) { long val = 0; @@ -618,81 +618,79 @@ static long mem_cgroup_read_stat(struct mem_cgroup *mem, get_online_cpus(); for_each_online_cpu(cpu) - val += per_cpu(mem->stat->count[idx], cpu); + val += per_cpu(memcg->stat->count[idx], cpu); #ifdef CONFIG_HOTPLUG_CPU - spin_lock(&mem->pcp_counter_lock); - val += mem->nocpu_base.count[idx]; - spin_unlock(&mem->pcp_counter_lock); + spin_lock(&memcg->pcp_counter_lock); + val += memcg->nocpu_base.count[idx]; + spin_unlock(&memcg->pcp_counter_lock); #endif put_online_cpus(); return val; } -static void mem_cgroup_swap_statistics(struct mem_cgroup *mem, +static void mem_cgroup_swap_statistics(struct mem_cgroup *memcg, bool charge) { int val = (charge) ? 1 : -1; - this_cpu_add(mem->stat->count[MEM_CGROUP_STAT_SWAPOUT], val); + this_cpu_add(memcg->stat->count[MEM_CGROUP_STAT_SWAPOUT], val); } -void mem_cgroup_pgfault(struct mem_cgroup *mem, int val) +void mem_cgroup_pgfault(struct mem_cgroup *memcg, int val) { - this_cpu_add(mem->stat->events[MEM_CGROUP_EVENTS_PGFAULT], val); + this_cpu_add(memcg->stat->events[MEM_CGROUP_EVENTS_PGFAULT], val); } -void mem_cgroup_pgmajfault(struct mem_cgroup *mem, int val) +void mem_cgroup_pgmajfault(struct mem_cgroup *memcg, int val) { - this_cpu_add(mem->stat->events[MEM_CGROUP_EVENTS_PGMAJFAULT], val); + this_cpu_add(memcg->stat->events[MEM_CGROUP_EVENTS_PGMAJFAULT], val); } -static unsigned long mem_cgroup_read_events(struct mem_cgroup *mem, +static unsigned long mem_cgroup_read_events(struct mem_cgroup *memcg, enum mem_cgroup_events_index idx) { unsigned long val = 0; int cpu; for_each_online_cpu(cpu) - val += per_cpu(mem->stat->events[idx], cpu); + val += per_cpu(memcg->stat->events[idx], cpu); #ifdef CONFIG_HOTPLUG_CPU - spin_lock(&mem->pcp_counter_lock); - val += mem->nocpu_base.events[idx]; - spin_unlock(&mem->pcp_counter_lock); + spin_lock(&memcg->pcp_counter_lock); + val += memcg->nocpu_base.events[idx]; + spin_unlock(&memcg->pcp_counter_lock); #endif return val; } -static void mem_cgroup_charge_statistics(struct mem_cgroup *mem, +static void mem_cgroup_charge_statistics(struct mem_cgroup *memcg, bool file, int nr_pages) { - preempt_disable(); - if (file) - __this_cpu_add(mem->stat->count[MEM_CGROUP_STAT_CACHE], nr_pages); + this_cpu_add(memcg->stat->count[MEM_CGROUP_STAT_CACHE], + nr_pages); else - __this_cpu_add(mem->stat->count[MEM_CGROUP_STAT_RSS], nr_pages); + this_cpu_add(memcg->stat->count[MEM_CGROUP_STAT_RSS], + nr_pages); /* pagein of a big page is an 
event. So, ignore page size */ - if (nr_pages > 0) - __this_cpu_inc(mem->stat->events[MEM_CGROUP_EVENTS_PGPGIN]); - else { - __this_cpu_inc(mem->stat->events[MEM_CGROUP_EVENTS_PGPGOUT]); + if (nr_pages > 0) { + this_cpu_inc(memcg->stat->events[MEM_CGROUP_EVENTS_PGPGIN]); + } else { + this_cpu_inc(memcg->stat->events[MEM_CGROUP_EVENTS_PGPGOUT]); nr_pages = -nr_pages; /* for event */ } - __this_cpu_add(mem->stat->events[MEM_CGROUP_EVENTS_COUNT], nr_pages); - - preempt_enable(); + this_cpu_add(memcg->stat->events[MEM_CGROUP_EVENTS_COUNT], nr_pages); } unsigned long -mem_cgroup_zone_nr_lru_pages(struct mem_cgroup *mem, int nid, int zid, +mem_cgroup_zone_nr_lru_pages(struct mem_cgroup *memcg, int nid, int zid, unsigned int lru_mask) { struct mem_cgroup_per_zone *mz; enum lru_list l; unsigned long ret = 0; - mz = mem_cgroup_zoneinfo(mem, nid, zid); + mz = mem_cgroup_zoneinfo(memcg, nid, zid); for_each_lru(l) { if (BIT(l) & lru_mask) @@ -702,44 +700,45 @@ mem_cgroup_zone_nr_lru_pages(struct mem_cgroup *mem, int nid, int zid, } static unsigned long -mem_cgroup_node_nr_lru_pages(struct mem_cgroup *mem, +mem_cgroup_node_nr_lru_pages(struct mem_cgroup *memcg, int nid, unsigned int lru_mask) { u64 total = 0; int zid; for (zid = 0; zid < MAX_NR_ZONES; zid++) - total += mem_cgroup_zone_nr_lru_pages(mem, nid, zid, lru_mask); + total += mem_cgroup_zone_nr_lru_pages(memcg, + nid, zid, lru_mask); return total; } -static unsigned long mem_cgroup_nr_lru_pages(struct mem_cgroup *mem, +static unsigned long mem_cgroup_nr_lru_pages(struct mem_cgroup *memcg, unsigned int lru_mask) { int nid; u64 total = 0; for_each_node_state(nid, N_HIGH_MEMORY) - total += mem_cgroup_node_nr_lru_pages(mem, nid, lru_mask); + total += mem_cgroup_node_nr_lru_pages(memcg, nid, lru_mask); return total; } -static bool __memcg_event_check(struct mem_cgroup *mem, int target) +static bool __memcg_event_check(struct mem_cgroup *memcg, int target) { unsigned long val, next; - val = this_cpu_read(mem->stat->events[MEM_CGROUP_EVENTS_COUNT]); - next = this_cpu_read(mem->stat->targets[target]); + val = this_cpu_read(memcg->stat->events[MEM_CGROUP_EVENTS_COUNT]); + next = this_cpu_read(memcg->stat->targets[target]); /* from time_after() in jiffies.h */ return ((long)next - (long)val < 0); } -static void __mem_cgroup_target_update(struct mem_cgroup *mem, int target) +static void __mem_cgroup_target_update(struct mem_cgroup *memcg, int target) { unsigned long val, next; - val = this_cpu_read(mem->stat->events[MEM_CGROUP_EVENTS_COUNT]); + val = this_cpu_read(memcg->stat->events[MEM_CGROUP_EVENTS_COUNT]); switch (target) { case MEM_CGROUP_TARGET_THRESH: @@ -755,30 +754,30 @@ static void __mem_cgroup_target_update(struct mem_cgroup *mem, int target) return; } - this_cpu_write(mem->stat->targets[target], next); + this_cpu_write(memcg->stat->targets[target], next); } /* * Check events in order. 
* */ -static void memcg_check_events(struct mem_cgroup *mem, struct page *page) +static void memcg_check_events(struct mem_cgroup *memcg, struct page *page) { /* threshold event is triggered in finer grain than soft limit */ - if (unlikely(__memcg_event_check(mem, MEM_CGROUP_TARGET_THRESH))) { - mem_cgroup_threshold(mem); - __mem_cgroup_target_update(mem, MEM_CGROUP_TARGET_THRESH); - if (unlikely(__memcg_event_check(mem, + if (unlikely(__memcg_event_check(memcg, MEM_CGROUP_TARGET_THRESH))) { + mem_cgroup_threshold(memcg); + __mem_cgroup_target_update(memcg, MEM_CGROUP_TARGET_THRESH); + if (unlikely(__memcg_event_check(memcg, MEM_CGROUP_TARGET_SOFTLIMIT))) { - mem_cgroup_update_tree(mem, page); - __mem_cgroup_target_update(mem, + mem_cgroup_update_tree(memcg, page); + __mem_cgroup_target_update(memcg, MEM_CGROUP_TARGET_SOFTLIMIT); } #if MAX_NUMNODES > 1 - if (unlikely(__memcg_event_check(mem, + if (unlikely(__memcg_event_check(memcg, MEM_CGROUP_TARGET_NUMAINFO))) { - atomic_inc(&mem->numainfo_events); - __mem_cgroup_target_update(mem, + atomic_inc(&memcg->numainfo_events); + __mem_cgroup_target_update(memcg, MEM_CGROUP_TARGET_NUMAINFO); } #endif @@ -808,7 +807,7 @@ struct mem_cgroup *mem_cgroup_from_task(struct task_struct *p) struct mem_cgroup *try_get_mem_cgroup_from_mm(struct mm_struct *mm) { - struct mem_cgroup *mem = NULL; + struct mem_cgroup *memcg = NULL; if (!mm) return NULL; @@ -819,25 +818,25 @@ struct mem_cgroup *try_get_mem_cgroup_from_mm(struct mm_struct *mm) */ rcu_read_lock(); do { - mem = mem_cgroup_from_task(rcu_dereference(mm->owner)); - if (unlikely(!mem)) + memcg = mem_cgroup_from_task(rcu_dereference(mm->owner)); + if (unlikely(!memcg)) break; - } while (!css_tryget(&mem->css)); + } while (!css_tryget(&memcg->css)); rcu_read_unlock(); - return mem; + return memcg; } /* The caller has to guarantee "mem" exists before calling this */ -static struct mem_cgroup *mem_cgroup_start_loop(struct mem_cgroup *mem) +static struct mem_cgroup *mem_cgroup_start_loop(struct mem_cgroup *memcg) { struct cgroup_subsys_state *css; int found; - if (!mem) /* ROOT cgroup has the smallest ID */ + if (!memcg) /* ROOT cgroup has the smallest ID */ return root_mem_cgroup; /*css_put/get against root is ignored*/ - if (!mem->use_hierarchy) { - if (css_tryget(&mem->css)) - return mem; + if (!memcg->use_hierarchy) { + if (css_tryget(&memcg->css)) + return memcg; return NULL; } rcu_read_lock(); @@ -845,13 +844,13 @@ static struct mem_cgroup *mem_cgroup_start_loop(struct mem_cgroup *mem) * searching a memory cgroup which has the smallest ID under given * ROOT cgroup. 
(ID >= 1) */ - css = css_get_next(&mem_cgroup_subsys, 1, &mem->css, &found); + css = css_get_next(&mem_cgroup_subsys, 1, &memcg->css, &found); if (css && css_tryget(css)) - mem = container_of(css, struct mem_cgroup, css); + memcg = container_of(css, struct mem_cgroup, css); else - mem = NULL; + memcg = NULL; rcu_read_unlock(); - return mem; + return memcg; } static struct mem_cgroup *mem_cgroup_get_next(struct mem_cgroup *iter, @@ -905,29 +904,29 @@ static struct mem_cgroup *mem_cgroup_get_next(struct mem_cgroup *iter, for_each_mem_cgroup_tree_cond(iter, NULL, true) -static inline bool mem_cgroup_is_root(struct mem_cgroup *mem) +static inline bool mem_cgroup_is_root(struct mem_cgroup *memcg) { - return (mem == root_mem_cgroup); + return (memcg == root_mem_cgroup); } void mem_cgroup_count_vm_event(struct mm_struct *mm, enum vm_event_item idx) { - struct mem_cgroup *mem; + struct mem_cgroup *memcg; if (!mm) return; rcu_read_lock(); - mem = mem_cgroup_from_task(rcu_dereference(mm->owner)); - if (unlikely(!mem)) + memcg = mem_cgroup_from_task(rcu_dereference(mm->owner)); + if (unlikely(!memcg)) goto out; switch (idx) { case PGMAJFAULT: - mem_cgroup_pgmajfault(mem, 1); + mem_cgroup_pgmajfault(memcg, 1); break; case PGFAULT: - mem_cgroup_pgfault(mem, 1); + mem_cgroup_pgfault(memcg, 1); break; default: BUG(); @@ -1112,18 +1111,18 @@ void mem_cgroup_move_lists(struct page *page, * Checks whether given mem is same or in the root_mem's * hierarchy subtree */ -static bool mem_cgroup_same_or_subtree(const struct mem_cgroup *root_mem, - struct mem_cgroup *mem) +static bool mem_cgroup_same_or_subtree(const struct mem_cgroup *root_memcg, + struct mem_cgroup *memcg) { - if (root_mem != mem) { - return (root_mem->use_hierarchy && - css_is_ancestor(&mem->css, &root_mem->css)); + if (root_memcg != memcg) { + return (root_memcg->use_hierarchy && + css_is_ancestor(&memcg->css, &root_memcg->css)); } return true; } -int task_in_mem_cgroup(struct task_struct *task, const struct mem_cgroup *mem) +int task_in_mem_cgroup(struct task_struct *task, const struct mem_cgroup *memcg) { int ret; struct mem_cgroup *curr = NULL; @@ -1137,12 +1136,12 @@ int task_in_mem_cgroup(struct task_struct *task, const struct mem_cgroup *mem) if (!curr) return 0; /* - * We should check use_hierarchy of "mem" not "curr". Because checking + * We should check use_hierarchy of "memcg" not "curr". Because checking * use_hierarchy of "curr" here make this function true if hierarchy is - * enabled in "curr" and "curr" is a child of "mem" in *cgroup* - * hierarchy(even if use_hierarchy is disabled in "mem"). + * enabled in "curr" and "curr" is a child of "memcg" in *cgroup* + * hierarchy(even if use_hierarchy is disabled in "memcg"). */ - ret = mem_cgroup_same_or_subtree(mem, curr); + ret = mem_cgroup_same_or_subtree(memcg, curr); css_put(&curr->css); return ret; } @@ -1231,7 +1230,8 @@ mem_cgroup_get_reclaim_stat_from_page(struct page *page) unsigned long mem_cgroup_isolate_pages(unsigned long nr_to_scan, struct list_head *dst, unsigned long *scanned, int order, - int mode, struct zone *z, + isolate_mode_t mode, + struct zone *z, struct mem_cgroup *mem_cont, int active, int file) { @@ -1299,13 +1299,13 @@ unsigned long mem_cgroup_isolate_pages(unsigned long nr_to_scan, * Returns the maximum amount of memory @mem can be charged with, in * pages. 
*/ -static unsigned long mem_cgroup_margin(struct mem_cgroup *mem) +static unsigned long mem_cgroup_margin(struct mem_cgroup *memcg) { unsigned long long margin; - margin = res_counter_margin(&mem->res); + margin = res_counter_margin(&memcg->res); if (do_swap_account) - margin = min(margin, res_counter_margin(&mem->memsw)); + margin = min(margin, res_counter_margin(&memcg->memsw)); return margin >> PAGE_SHIFT; } @@ -1320,33 +1320,33 @@ int mem_cgroup_swappiness(struct mem_cgroup *memcg) return memcg->swappiness; } -static void mem_cgroup_start_move(struct mem_cgroup *mem) +static void mem_cgroup_start_move(struct mem_cgroup *memcg) { int cpu; get_online_cpus(); - spin_lock(&mem->pcp_counter_lock); + spin_lock(&memcg->pcp_counter_lock); for_each_online_cpu(cpu) - per_cpu(mem->stat->count[MEM_CGROUP_ON_MOVE], cpu) += 1; - mem->nocpu_base.count[MEM_CGROUP_ON_MOVE] += 1; - spin_unlock(&mem->pcp_counter_lock); + per_cpu(memcg->stat->count[MEM_CGROUP_ON_MOVE], cpu) += 1; + memcg->nocpu_base.count[MEM_CGROUP_ON_MOVE] += 1; + spin_unlock(&memcg->pcp_counter_lock); put_online_cpus(); synchronize_rcu(); } -static void mem_cgroup_end_move(struct mem_cgroup *mem) +static void mem_cgroup_end_move(struct mem_cgroup *memcg) { int cpu; - if (!mem) + if (!memcg) return; get_online_cpus(); - spin_lock(&mem->pcp_counter_lock); + spin_lock(&memcg->pcp_counter_lock); for_each_online_cpu(cpu) - per_cpu(mem->stat->count[MEM_CGROUP_ON_MOVE], cpu) -= 1; - mem->nocpu_base.count[MEM_CGROUP_ON_MOVE] -= 1; - spin_unlock(&mem->pcp_counter_lock); + per_cpu(memcg->stat->count[MEM_CGROUP_ON_MOVE], cpu) -= 1; + memcg->nocpu_base.count[MEM_CGROUP_ON_MOVE] -= 1; + spin_unlock(&memcg->pcp_counter_lock); put_online_cpus(); } /* @@ -1361,13 +1361,13 @@ static void mem_cgroup_end_move(struct mem_cgroup *mem) * waiting at hith-memory prressure caused by "move". */ -static bool mem_cgroup_stealed(struct mem_cgroup *mem) +static bool mem_cgroup_stealed(struct mem_cgroup *memcg) { VM_BUG_ON(!rcu_read_lock_held()); - return this_cpu_read(mem->stat->count[MEM_CGROUP_ON_MOVE]) > 0; + return this_cpu_read(memcg->stat->count[MEM_CGROUP_ON_MOVE]) > 0; } -static bool mem_cgroup_under_move(struct mem_cgroup *mem) +static bool mem_cgroup_under_move(struct mem_cgroup *memcg) { struct mem_cgroup *from; struct mem_cgroup *to; @@ -1382,17 +1382,17 @@ static bool mem_cgroup_under_move(struct mem_cgroup *mem) if (!from) goto unlock; - ret = mem_cgroup_same_or_subtree(mem, from) - || mem_cgroup_same_or_subtree(mem, to); + ret = mem_cgroup_same_or_subtree(memcg, from) + || mem_cgroup_same_or_subtree(memcg, to); unlock: spin_unlock(&mc.lock); return ret; } -static bool mem_cgroup_wait_acct_move(struct mem_cgroup *mem) +static bool mem_cgroup_wait_acct_move(struct mem_cgroup *memcg) { if (mc.moving_task && current != mc.moving_task) { - if (mem_cgroup_under_move(mem)) { + if (mem_cgroup_under_move(memcg)) { DEFINE_WAIT(wait); prepare_to_wait(&mc.waitq, &wait, TASK_INTERRUPTIBLE); /* moving charge context might have finished. */ @@ -1476,12 +1476,12 @@ done: * This function returns the number of memcg under hierarchy tree. Returns * 1(self count) if no children. */ -static int mem_cgroup_count_children(struct mem_cgroup *mem) +static int mem_cgroup_count_children(struct mem_cgroup *memcg) { int num = 0; struct mem_cgroup *iter; - for_each_mem_cgroup_tree(iter, mem) + for_each_mem_cgroup_tree(iter, memcg) num++; return num; } @@ -1511,21 +1511,21 @@ u64 mem_cgroup_get_limit(struct mem_cgroup *memcg) * that to reclaim free pages from. 
*/ static struct mem_cgroup * -mem_cgroup_select_victim(struct mem_cgroup *root_mem) +mem_cgroup_select_victim(struct mem_cgroup *root_memcg) { struct mem_cgroup *ret = NULL; struct cgroup_subsys_state *css; int nextid, found; - if (!root_mem->use_hierarchy) { - css_get(&root_mem->css); - ret = root_mem; + if (!root_memcg->use_hierarchy) { + css_get(&root_memcg->css); + ret = root_memcg; } while (!ret) { rcu_read_lock(); - nextid = root_mem->last_scanned_child + 1; - css = css_get_next(&mem_cgroup_subsys, nextid, &root_mem->css, + nextid = root_memcg->last_scanned_child + 1; + css = css_get_next(&mem_cgroup_subsys, nextid, &root_memcg->css, &found); if (css && css_tryget(css)) ret = container_of(css, struct mem_cgroup, css); @@ -1534,9 +1534,9 @@ mem_cgroup_select_victim(struct mem_cgroup *root_mem) /* Updates scanning parameter */ if (!css) { /* this means start scan from ID:1 */ - root_mem->last_scanned_child = 0; + root_memcg->last_scanned_child = 0; } else - root_mem->last_scanned_child = found; + root_memcg->last_scanned_child = found; } return ret; @@ -1552,14 +1552,14 @@ mem_cgroup_select_victim(struct mem_cgroup *root_mem) * reclaimable pages on a node. Returns true if there are any reclaimable * pages in the node. */ -static bool test_mem_cgroup_node_reclaimable(struct mem_cgroup *mem, +static bool test_mem_cgroup_node_reclaimable(struct mem_cgroup *memcg, int nid, bool noswap) { - if (mem_cgroup_node_nr_lru_pages(mem, nid, LRU_ALL_FILE)) + if (mem_cgroup_node_nr_lru_pages(memcg, nid, LRU_ALL_FILE)) return true; if (noswap || !total_swap_pages) return false; - if (mem_cgroup_node_nr_lru_pages(mem, nid, LRU_ALL_ANON)) + if (mem_cgroup_node_nr_lru_pages(memcg, nid, LRU_ALL_ANON)) return true; return false; @@ -1572,29 +1572,29 @@ static bool test_mem_cgroup_node_reclaimable(struct mem_cgroup *mem, * nodes based on the zonelist. So update the list loosely once per 10 secs. * */ -static void mem_cgroup_may_update_nodemask(struct mem_cgroup *mem) +static void mem_cgroup_may_update_nodemask(struct mem_cgroup *memcg) { int nid; /* * numainfo_events > 0 means there was at least NUMAINFO_EVENTS_TARGET * pagein/pageout changes since the last update. */ - if (!atomic_read(&mem->numainfo_events)) + if (!atomic_read(&memcg->numainfo_events)) return; - if (atomic_inc_return(&mem->numainfo_updating) > 1) + if (atomic_inc_return(&memcg->numainfo_updating) > 1) return; /* make a nodemask where this memcg uses memory from */ - mem->scan_nodes = node_states[N_HIGH_MEMORY]; + memcg->scan_nodes = node_states[N_HIGH_MEMORY]; for_each_node_mask(nid, node_states[N_HIGH_MEMORY]) { - if (!test_mem_cgroup_node_reclaimable(mem, nid, false)) - node_clear(nid, mem->scan_nodes); + if (!test_mem_cgroup_node_reclaimable(memcg, nid, false)) + node_clear(nid, memcg->scan_nodes); } - atomic_set(&mem->numainfo_events, 0); - atomic_set(&mem->numainfo_updating, 0); + atomic_set(&memcg->numainfo_events, 0); + atomic_set(&memcg->numainfo_updating, 0); } /* @@ -1609,16 +1609,16 @@ static void mem_cgroup_may_update_nodemask(struct mem_cgroup *mem) * * Now, we use round-robin. Better algorithm is welcomed. 
*/ -int mem_cgroup_select_victim_node(struct mem_cgroup *mem) +int mem_cgroup_select_victim_node(struct mem_cgroup *memcg) { int node; - mem_cgroup_may_update_nodemask(mem); - node = mem->last_scanned_node; + mem_cgroup_may_update_nodemask(memcg); + node = memcg->last_scanned_node; - node = next_node(node, mem->scan_nodes); + node = next_node(node, memcg->scan_nodes); if (node == MAX_NUMNODES) - node = first_node(mem->scan_nodes); + node = first_node(memcg->scan_nodes); /* * We call this when we hit limit, not when pages are added to LRU. * No LRU may hold pages because all pages are UNEVICTABLE or @@ -1628,7 +1628,7 @@ int mem_cgroup_select_victim_node(struct mem_cgroup *mem) if (unlikely(node == MAX_NUMNODES)) node = numa_node_id(); - mem->last_scanned_node = node; + memcg->last_scanned_node = node; return node; } @@ -1638,7 +1638,7 @@ int mem_cgroup_select_victim_node(struct mem_cgroup *mem) * unused nodes. But scan_nodes is lazily updated and may not cotain * enough new information. We need to do double check. */ -bool mem_cgroup_reclaimable(struct mem_cgroup *mem, bool noswap) +bool mem_cgroup_reclaimable(struct mem_cgroup *memcg, bool noswap) { int nid; @@ -1646,12 +1646,12 @@ bool mem_cgroup_reclaimable(struct mem_cgroup *mem, bool noswap) * quick check...making use of scan_node. * We can skip unused nodes. */ - if (!nodes_empty(mem->scan_nodes)) { - for (nid = first_node(mem->scan_nodes); + if (!nodes_empty(memcg->scan_nodes)) { + for (nid = first_node(memcg->scan_nodes); nid < MAX_NUMNODES; - nid = next_node(nid, mem->scan_nodes)) { + nid = next_node(nid, memcg->scan_nodes)) { - if (test_mem_cgroup_node_reclaimable(mem, nid, noswap)) + if (test_mem_cgroup_node_reclaimable(memcg, nid, noswap)) return true; } } @@ -1659,23 +1659,23 @@ bool mem_cgroup_reclaimable(struct mem_cgroup *mem, bool noswap) * Check rest of nodes. 
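/*
 * Illustrative sketch, not part of the patch above.
 * mem_cgroup_select_victim_node() walks memcg->scan_nodes round-robin:
 * next_node() after the last scanned node, wrapping back to first_node()
 * when the end of the mask is reached.  Minimal userspace approximation
 * with a plain bitmask standing in for nodemask_t; the node count and the
 * mask value are made up.
 */
#include <stdio.h>

#define MAX_NODES 8

static int next_set_bit(unsigned int mask, int after)
{
	for (int n = after + 1; n < MAX_NODES; n++)
		if (mask & (1u << n))
			return n;
	return MAX_NODES;		/* mimics next_node() returning MAX_NUMNODES */
}

static int first_set_bit(unsigned int mask)
{
	return next_set_bit(mask, -1);
}

int main(void)
{
	unsigned int scan_nodes = 0x16;	/* nodes 1, 2 and 4 have reclaimable pages */
	int last = -1;

	for (int i = 0; i < 6; i++) {
		int node = next_set_bit(scan_nodes, last);
		if (node == MAX_NODES)	/* wrap around, like first_node() */
			node = first_set_bit(scan_nodes);
		printf("scan node %d\n", node);	/* prints 1 2 4 1 2 4 */
		last = node;
	}
	return 0;
}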
*/ for_each_node_state(nid, N_HIGH_MEMORY) { - if (node_isset(nid, mem->scan_nodes)) + if (node_isset(nid, memcg->scan_nodes)) continue; - if (test_mem_cgroup_node_reclaimable(mem, nid, noswap)) + if (test_mem_cgroup_node_reclaimable(memcg, nid, noswap)) return true; } return false; } #else -int mem_cgroup_select_victim_node(struct mem_cgroup *mem) +int mem_cgroup_select_victim_node(struct mem_cgroup *memcg) { return 0; } -bool mem_cgroup_reclaimable(struct mem_cgroup *mem, bool noswap) +bool mem_cgroup_reclaimable(struct mem_cgroup *memcg, bool noswap) { - return test_mem_cgroup_node_reclaimable(mem, 0, noswap); + return test_mem_cgroup_node_reclaimable(memcg, 0, noswap); } #endif @@ -1700,21 +1700,21 @@ static void __mem_cgroup_record_scanstat(unsigned long *stats, static void mem_cgroup_record_scanstat(struct memcg_scanrecord *rec) { - struct mem_cgroup *mem; + struct mem_cgroup *memcg; int context = rec->context; if (context >= NR_SCAN_CONTEXT) return; - mem = rec->mem; - spin_lock(&mem->scanstat.lock); - __mem_cgroup_record_scanstat(mem->scanstat.stats[context], rec); - spin_unlock(&mem->scanstat.lock); + memcg = rec->mem; + spin_lock(&memcg->scanstat.lock); + __mem_cgroup_record_scanstat(memcg->scanstat.stats[context], rec); + spin_unlock(&memcg->scanstat.lock); - mem = rec->root; - spin_lock(&mem->scanstat.lock); - __mem_cgroup_record_scanstat(mem->scanstat.rootstats[context], rec); - spin_unlock(&mem->scanstat.lock); + memcg = rec->root; + spin_lock(&memcg->scanstat.lock); + __mem_cgroup_record_scanstat(memcg->scanstat.rootstats[context], rec); + spin_unlock(&memcg->scanstat.lock); } /* @@ -1722,14 +1722,14 @@ static void mem_cgroup_record_scanstat(struct memcg_scanrecord *rec) * we reclaimed from, so that we don't end up penalizing one child extensively * based on its position in the children list. * - * root_mem is the original ancestor that we've been reclaim from. + * root_memcg is the original ancestor that we've been reclaim from. * - * We give up and return to the caller when we visit root_mem twice. + * We give up and return to the caller when we visit root_memcg twice. * (other groups can be removed while we're walking....) * * If shrink==true, for avoiding to free too much, this returns immedieately. */ -static int mem_cgroup_hierarchical_reclaim(struct mem_cgroup *root_mem, +static int mem_cgroup_hierarchical_reclaim(struct mem_cgroup *root_memcg, struct zone *zone, gfp_t gfp_mask, unsigned long reclaim_options, @@ -1745,10 +1745,10 @@ static int mem_cgroup_hierarchical_reclaim(struct mem_cgroup *root_mem, unsigned long excess; unsigned long scanned; - excess = res_counter_soft_limit_excess(&root_mem->res) >> PAGE_SHIFT; + excess = res_counter_soft_limit_excess(&root_memcg->res) >> PAGE_SHIFT; /* If memsw_is_minimum==1, swap-out is of-no-use. */ - if (!check_soft && !shrink && root_mem->memsw_is_minimum) + if (!check_soft && !shrink && root_memcg->memsw_is_minimum) noswap = true; if (shrink) @@ -1758,11 +1758,11 @@ static int mem_cgroup_hierarchical_reclaim(struct mem_cgroup *root_mem, else rec.context = SCAN_BY_LIMIT; - rec.root = root_mem; + rec.root = root_memcg; while (1) { - victim = mem_cgroup_select_victim(root_mem); - if (victim == root_mem) { + victim = mem_cgroup_select_victim(root_memcg); + if (victim == root_memcg) { loop++; /* * We are not draining per cpu cached charges during @@ -1771,7 +1771,7 @@ static int mem_cgroup_hierarchical_reclaim(struct mem_cgroup *root_mem, * charges will not give any. 
*/ if (!check_soft && loop >= 1) - drain_all_stock_async(root_mem); + drain_all_stock_async(root_memcg); if (loop >= 2) { /* * If we have not been able to reclaim @@ -1827,9 +1827,9 @@ static int mem_cgroup_hierarchical_reclaim(struct mem_cgroup *root_mem, return ret; total += ret; if (check_soft) { - if (!res_counter_soft_limit_excess(&root_mem->res)) + if (!res_counter_soft_limit_excess(&root_memcg->res)) return total; - } else if (mem_cgroup_margin(root_mem)) + } else if (mem_cgroup_margin(root_memcg)) return total; } return total; @@ -1840,12 +1840,12 @@ static int mem_cgroup_hierarchical_reclaim(struct mem_cgroup *root_mem, * If someone is running, return false. * Has to be called with memcg_oom_lock */ -static bool mem_cgroup_oom_lock(struct mem_cgroup *mem) +static bool mem_cgroup_oom_lock(struct mem_cgroup *memcg) { struct mem_cgroup *iter, *failed = NULL; bool cond = true; - for_each_mem_cgroup_tree_cond(iter, mem, cond) { + for_each_mem_cgroup_tree_cond(iter, memcg, cond) { if (iter->oom_lock) { /* * this subtree of our hierarchy is already locked @@ -1865,7 +1865,7 @@ static bool mem_cgroup_oom_lock(struct mem_cgroup *mem) * what we set up to the failing subtree */ cond = true; - for_each_mem_cgroup_tree_cond(iter, mem, cond) { + for_each_mem_cgroup_tree_cond(iter, memcg, cond) { if (iter == failed) { cond = false; continue; @@ -1878,24 +1878,24 @@ static bool mem_cgroup_oom_lock(struct mem_cgroup *mem) /* * Has to be called with memcg_oom_lock */ -static int mem_cgroup_oom_unlock(struct mem_cgroup *mem) +static int mem_cgroup_oom_unlock(struct mem_cgroup *memcg) { struct mem_cgroup *iter; - for_each_mem_cgroup_tree(iter, mem) + for_each_mem_cgroup_tree(iter, memcg) iter->oom_lock = false; return 0; } -static void mem_cgroup_mark_under_oom(struct mem_cgroup *mem) +static void mem_cgroup_mark_under_oom(struct mem_cgroup *memcg) { struct mem_cgroup *iter; - for_each_mem_cgroup_tree(iter, mem) + for_each_mem_cgroup_tree(iter, memcg) atomic_inc(&iter->under_oom); } -static void mem_cgroup_unmark_under_oom(struct mem_cgroup *mem) +static void mem_cgroup_unmark_under_oom(struct mem_cgroup *memcg) { struct mem_cgroup *iter; @@ -1904,7 +1904,7 @@ static void mem_cgroup_unmark_under_oom(struct mem_cgroup *mem) * mem_cgroup_oom_lock() may not be called. We have to use * atomic_add_unless() here. */ - for_each_mem_cgroup_tree(iter, mem) + for_each_mem_cgroup_tree(iter, memcg) atomic_add_unless(&iter->under_oom, -1, 0); } @@ -1919,85 +1919,85 @@ struct oom_wait_info { static int memcg_oom_wake_function(wait_queue_t *wait, unsigned mode, int sync, void *arg) { - struct mem_cgroup *wake_mem = (struct mem_cgroup *)arg, - *oom_wait_mem; + struct mem_cgroup *wake_memcg = (struct mem_cgroup *)arg, + *oom_wait_memcg; struct oom_wait_info *oom_wait_info; oom_wait_info = container_of(wait, struct oom_wait_info, wait); - oom_wait_mem = oom_wait_info->mem; + oom_wait_memcg = oom_wait_info->mem; /* * Both of oom_wait_info->mem and wake_mem are stable under us. * Then we can use css_is_ancestor without taking care of RCU. */ - if (!mem_cgroup_same_or_subtree(oom_wait_mem, wake_mem) - && !mem_cgroup_same_or_subtree(wake_mem, oom_wait_mem)) + if (!mem_cgroup_same_or_subtree(oom_wait_memcg, wake_memcg) + && !mem_cgroup_same_or_subtree(wake_memcg, oom_wait_memcg)) return 0; return autoremove_wake_function(wait, mode, sync, arg); } -static void memcg_wakeup_oom(struct mem_cgroup *mem) +static void memcg_wakeup_oom(struct mem_cgroup *memcg) { - /* for filtering, pass "mem" as argument. 
*/ - __wake_up(&memcg_oom_waitq, TASK_NORMAL, 0, mem); + /* for filtering, pass "memcg" as argument. */ + __wake_up(&memcg_oom_waitq, TASK_NORMAL, 0, memcg); } -static void memcg_oom_recover(struct mem_cgroup *mem) +static void memcg_oom_recover(struct mem_cgroup *memcg) { - if (mem && atomic_read(&mem->under_oom)) - memcg_wakeup_oom(mem); + if (memcg && atomic_read(&memcg->under_oom)) + memcg_wakeup_oom(memcg); } /* * try to call OOM killer. returns false if we should exit memory-reclaim loop. */ -bool mem_cgroup_handle_oom(struct mem_cgroup *mem, gfp_t mask) +bool mem_cgroup_handle_oom(struct mem_cgroup *memcg, gfp_t mask) { struct oom_wait_info owait; bool locked, need_to_kill; - owait.mem = mem; + owait.mem = memcg; owait.wait.flags = 0; owait.wait.func = memcg_oom_wake_function; owait.wait.private = current; INIT_LIST_HEAD(&owait.wait.task_list); need_to_kill = true; - mem_cgroup_mark_under_oom(mem); + mem_cgroup_mark_under_oom(memcg); - /* At first, try to OOM lock hierarchy under mem.*/ + /* At first, try to OOM lock hierarchy under memcg.*/ spin_lock(&memcg_oom_lock); - locked = mem_cgroup_oom_lock(mem); + locked = mem_cgroup_oom_lock(memcg); /* * Even if signal_pending(), we can't quit charge() loop without * accounting. So, UNINTERRUPTIBLE is appropriate. But SIGKILL * under OOM is always welcomed, use TASK_KILLABLE here. */ prepare_to_wait(&memcg_oom_waitq, &owait.wait, TASK_KILLABLE); - if (!locked || mem->oom_kill_disable) + if (!locked || memcg->oom_kill_disable) need_to_kill = false; if (locked) - mem_cgroup_oom_notify(mem); + mem_cgroup_oom_notify(memcg); spin_unlock(&memcg_oom_lock); if (need_to_kill) { finish_wait(&memcg_oom_waitq, &owait.wait); - mem_cgroup_out_of_memory(mem, mask); + mem_cgroup_out_of_memory(memcg, mask); } else { schedule(); finish_wait(&memcg_oom_waitq, &owait.wait); } spin_lock(&memcg_oom_lock); if (locked) - mem_cgroup_oom_unlock(mem); - memcg_wakeup_oom(mem); + mem_cgroup_oom_unlock(memcg); + memcg_wakeup_oom(memcg); spin_unlock(&memcg_oom_lock); - mem_cgroup_unmark_under_oom(mem); + mem_cgroup_unmark_under_oom(memcg); if (test_thread_flag(TIF_MEMDIE) || fatal_signal_pending(current)) return false; /* Give chance to dying process */ - schedule_timeout(1); + schedule_timeout_uninterruptible(1); return true; } @@ -2028,7 +2028,7 @@ bool mem_cgroup_handle_oom(struct mem_cgroup *mem, gfp_t mask) void mem_cgroup_update_page_stat(struct page *page, enum mem_cgroup_page_stat_item idx, int val) { - struct mem_cgroup *mem; + struct mem_cgroup *memcg; struct page_cgroup *pc = lookup_page_cgroup(page); bool need_unlock = false; unsigned long uninitialized_var(flags); @@ -2037,16 +2037,16 @@ void mem_cgroup_update_page_stat(struct page *page, return; rcu_read_lock(); - mem = pc->mem_cgroup; - if (unlikely(!mem || !PageCgroupUsed(pc))) + memcg = pc->mem_cgroup; + if (unlikely(!memcg || !PageCgroupUsed(pc))) goto out; /* pc->mem_cgroup is unstable ? 
*/ - if (unlikely(mem_cgroup_stealed(mem)) || PageTransHuge(page)) { + if (unlikely(mem_cgroup_stealed(memcg)) || PageTransHuge(page)) { /* take a lock against to access pc->mem_cgroup */ move_lock_page_cgroup(pc, &flags); need_unlock = true; - mem = pc->mem_cgroup; - if (!mem || !PageCgroupUsed(pc)) + memcg = pc->mem_cgroup; + if (!memcg || !PageCgroupUsed(pc)) goto out; } @@ -2062,7 +2062,7 @@ void mem_cgroup_update_page_stat(struct page *page, BUG(); } - this_cpu_add(mem->stat->count[idx], val); + this_cpu_add(memcg->stat->count[idx], val); out: if (unlikely(need_unlock)) @@ -2093,13 +2093,13 @@ static DEFINE_MUTEX(percpu_charge_mutex); * cgroup which is not current target, returns false. This stock will be * refilled. */ -static bool consume_stock(struct mem_cgroup *mem) +static bool consume_stock(struct mem_cgroup *memcg) { struct memcg_stock_pcp *stock; bool ret = true; stock = &get_cpu_var(memcg_stock); - if (mem == stock->cached && stock->nr_pages) + if (memcg == stock->cached && stock->nr_pages) stock->nr_pages--; else /* need to call res_counter_charge */ ret = false; @@ -2140,24 +2140,24 @@ static void drain_local_stock(struct work_struct *dummy) * Cache charges(val) which is from res_counter, to local per_cpu area. * This will be consumed by consume_stock() function, later. */ -static void refill_stock(struct mem_cgroup *mem, unsigned int nr_pages) +static void refill_stock(struct mem_cgroup *memcg, unsigned int nr_pages) { struct memcg_stock_pcp *stock = &get_cpu_var(memcg_stock); - if (stock->cached != mem) { /* reset if necessary */ + if (stock->cached != memcg) { /* reset if necessary */ drain_stock(stock); - stock->cached = mem; + stock->cached = memcg; } stock->nr_pages += nr_pages; put_cpu_var(memcg_stock); } /* - * Drains all per-CPU charge caches for given root_mem resp. subtree + * Drains all per-CPU charge caches for given root_memcg resp. subtree * of the hierarchy under it. sync flag says whether we should block * until the work is done. */ -static void drain_all_stock(struct mem_cgroup *root_mem, bool sync) +static void drain_all_stock(struct mem_cgroup *root_memcg, bool sync) { int cpu, curcpu; @@ -2166,12 +2166,12 @@ static void drain_all_stock(struct mem_cgroup *root_mem, bool sync) curcpu = get_cpu(); for_each_online_cpu(cpu) { struct memcg_stock_pcp *stock = &per_cpu(memcg_stock, cpu); - struct mem_cgroup *mem; + struct mem_cgroup *memcg; - mem = stock->cached; - if (!mem || !stock->nr_pages) + memcg = stock->cached; + if (!memcg || !stock->nr_pages) continue; - if (!mem_cgroup_same_or_subtree(root_mem, mem)) + if (!mem_cgroup_same_or_subtree(root_memcg, memcg)) continue; if (!test_and_set_bit(FLUSHING_CACHED_CHARGE, &stock->flags)) { if (cpu == curcpu) @@ -2200,23 +2200,23 @@ out: * expects some charges will be back to res_counter later but cannot wait for * it. */ -static void drain_all_stock_async(struct mem_cgroup *root_mem) +static void drain_all_stock_async(struct mem_cgroup *root_memcg) { /* * If someone calls draining, avoid adding more kworker runs. */ if (!mutex_trylock(&percpu_charge_mutex)) return; - drain_all_stock(root_mem, false); + drain_all_stock(root_memcg, false); mutex_unlock(&percpu_charge_mutex); } /* This is a synchronous drain interface. 
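/*
 * Illustrative sketch, not part of the patch above.  The memcg_stock_pcp
 * machinery caches a batch of pre-charged pages per CPU so the common
 * charge path (consume_stock) avoids touching the shared res_counter;
 * refill_stock() parks any surplus from a batched charge, and the
 * drain_*_stock() helpers flush the caches back when the charges are
 * needed again.  The userspace sketch below shows only the consume/refill
 * bookkeeping for a single CPU; the names and sizes are simplified
 * stand-ins.
 */
#include <stdbool.h>
#include <stdio.h>

struct stock {
	void *cached_group;		/* group the cached charges belong to */
	unsigned int nr_pages;		/* pre-charged pages parked on this cpu */
};

/* cheap path: succeed only if the cache matches and still has pages */
static bool consume_stock(struct stock *s, void *group)
{
	if (s->cached_group == group && s->nr_pages) {
		s->nr_pages--;
		return true;
	}
	return false;			/* caller falls back to the slow charge path */
}

/* park surplus charges left over from a batched slow-path charge */
static void refill_stock(struct stock *s, void *group, unsigned int nr_pages)
{
	if (s->cached_group != group) {
		s->nr_pages = 0;	/* a real drain would return these first */
		s->cached_group = group;
	}
	s->nr_pages += nr_pages;
}

int main(void)
{
	struct stock s = { 0 };
	int group;			/* stand-in for a struct mem_cgroup */

	refill_stock(&s, &group, 31);	/* batch charged 32 pages, 1 already used */
	printf("%d %u\n", consume_stock(&s, &group), s.nr_pages);	/* 1 30 */
	return 0;
}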
*/ -static void drain_all_stock_sync(struct mem_cgroup *root_mem) +static void drain_all_stock_sync(struct mem_cgroup *root_memcg) { /* called when force_empty is called */ mutex_lock(&percpu_charge_mutex); - drain_all_stock(root_mem, true); + drain_all_stock(root_memcg, true); mutex_unlock(&percpu_charge_mutex); } @@ -2224,35 +2224,35 @@ static void drain_all_stock_sync(struct mem_cgroup *root_mem) * This function drains percpu counter value from DEAD cpu and * move it to local cpu. Note that this function can be preempted. */ -static void mem_cgroup_drain_pcp_counter(struct mem_cgroup *mem, int cpu) +static void mem_cgroup_drain_pcp_counter(struct mem_cgroup *memcg, int cpu) { int i; - spin_lock(&mem->pcp_counter_lock); + spin_lock(&memcg->pcp_counter_lock); for (i = 0; i < MEM_CGROUP_STAT_DATA; i++) { - long x = per_cpu(mem->stat->count[i], cpu); + long x = per_cpu(memcg->stat->count[i], cpu); - per_cpu(mem->stat->count[i], cpu) = 0; - mem->nocpu_base.count[i] += x; + per_cpu(memcg->stat->count[i], cpu) = 0; + memcg->nocpu_base.count[i] += x; } for (i = 0; i < MEM_CGROUP_EVENTS_NSTATS; i++) { - unsigned long x = per_cpu(mem->stat->events[i], cpu); + unsigned long x = per_cpu(memcg->stat->events[i], cpu); - per_cpu(mem->stat->events[i], cpu) = 0; - mem->nocpu_base.events[i] += x; + per_cpu(memcg->stat->events[i], cpu) = 0; + memcg->nocpu_base.events[i] += x; } /* need to clear ON_MOVE value, works as a kind of lock. */ - per_cpu(mem->stat->count[MEM_CGROUP_ON_MOVE], cpu) = 0; - spin_unlock(&mem->pcp_counter_lock); + per_cpu(memcg->stat->count[MEM_CGROUP_ON_MOVE], cpu) = 0; + spin_unlock(&memcg->pcp_counter_lock); } -static void synchronize_mem_cgroup_on_move(struct mem_cgroup *mem, int cpu) +static void synchronize_mem_cgroup_on_move(struct mem_cgroup *memcg, int cpu) { int idx = MEM_CGROUP_ON_MOVE; - spin_lock(&mem->pcp_counter_lock); - per_cpu(mem->stat->count[idx], cpu) = mem->nocpu_base.count[idx]; - spin_unlock(&mem->pcp_counter_lock); + spin_lock(&memcg->pcp_counter_lock); + per_cpu(memcg->stat->count[idx], cpu) = memcg->nocpu_base.count[idx]; + spin_unlock(&memcg->pcp_counter_lock); } static int __cpuinit memcg_cpu_hotplug_callback(struct notifier_block *nb, @@ -2290,7 +2290,7 @@ enum { CHARGE_OOM_DIE, /* the current is killed because of OOM */ }; -static int mem_cgroup_do_charge(struct mem_cgroup *mem, gfp_t gfp_mask, +static int mem_cgroup_do_charge(struct mem_cgroup *memcg, gfp_t gfp_mask, unsigned int nr_pages, bool oom_check) { unsigned long csize = nr_pages * PAGE_SIZE; @@ -2299,16 +2299,16 @@ static int mem_cgroup_do_charge(struct mem_cgroup *mem, gfp_t gfp_mask, unsigned long flags = 0; int ret; - ret = res_counter_charge(&mem->res, csize, &fail_res); + ret = res_counter_charge(&memcg->res, csize, &fail_res); if (likely(!ret)) { if (!do_swap_account) return CHARGE_OK; - ret = res_counter_charge(&mem->memsw, csize, &fail_res); + ret = res_counter_charge(&memcg->memsw, csize, &fail_res); if (likely(!ret)) return CHARGE_OK; - res_counter_uncharge(&mem->res, csize); + res_counter_uncharge(&memcg->res, csize); mem_over_limit = mem_cgroup_from_res_counter(fail_res, memsw); flags |= MEM_CGROUP_RECLAIM_NOSWAP; } else @@ -2366,12 +2366,12 @@ static int mem_cgroup_do_charge(struct mem_cgroup *mem, gfp_t gfp_mask, static int __mem_cgroup_try_charge(struct mm_struct *mm, gfp_t gfp_mask, unsigned int nr_pages, - struct mem_cgroup **memcg, + struct mem_cgroup **ptr, bool oom) { unsigned int batch = max(CHARGE_BATCH, nr_pages); int nr_oom_retries = MEM_CGROUP_RECLAIM_RETRIES; - struct 
mem_cgroup *mem = NULL; + struct mem_cgroup *memcg = NULL; int ret; /* @@ -2389,17 +2389,17 @@ static int __mem_cgroup_try_charge(struct mm_struct *mm, * thread group leader migrates. It's possible that mm is not * set, if so charge the init_mm (happens for pagecache usage). */ - if (!*memcg && !mm) + if (!*ptr && !mm) goto bypass; again: - if (*memcg) { /* css should be a valid one */ - mem = *memcg; - VM_BUG_ON(css_is_removed(&mem->css)); - if (mem_cgroup_is_root(mem)) + if (*ptr) { /* css should be a valid one */ + memcg = *ptr; + VM_BUG_ON(css_is_removed(&memcg->css)); + if (mem_cgroup_is_root(memcg)) goto done; - if (nr_pages == 1 && consume_stock(mem)) + if (nr_pages == 1 && consume_stock(memcg)) goto done; - css_get(&mem->css); + css_get(&memcg->css); } else { struct task_struct *p; @@ -2407,7 +2407,7 @@ again: p = rcu_dereference(mm->owner); /* * Because we don't have task_lock(), "p" can exit. - * In that case, "mem" can point to root or p can be NULL with + * In that case, "memcg" can point to root or p can be NULL with * race with swapoff. Then, we have small risk of mis-accouning. * But such kind of mis-account by race always happens because * we don't have cgroup_mutex(). It's overkill and we allo that @@ -2415,12 +2415,12 @@ again: * (*) swapoff at el will charge against mm-struct not against * task-struct. So, mm->owner can be NULL. */ - mem = mem_cgroup_from_task(p); - if (!mem || mem_cgroup_is_root(mem)) { + memcg = mem_cgroup_from_task(p); + if (!memcg || mem_cgroup_is_root(memcg)) { rcu_read_unlock(); goto done; } - if (nr_pages == 1 && consume_stock(mem)) { + if (nr_pages == 1 && consume_stock(memcg)) { /* * It seems dagerous to access memcg without css_get(). * But considering how consume_stok works, it's not @@ -2433,7 +2433,7 @@ again: goto done; } /* after here, we may be blocked. we need to get refcnt */ - if (!css_tryget(&mem->css)) { + if (!css_tryget(&memcg->css)) { rcu_read_unlock(); goto again; } @@ -2445,7 +2445,7 @@ again: /* If killed, bypass charge */ if (fatal_signal_pending(current)) { - css_put(&mem->css); + css_put(&memcg->css); goto bypass; } @@ -2455,43 +2455,43 @@ again: nr_oom_retries = MEM_CGROUP_RECLAIM_RETRIES; } - ret = mem_cgroup_do_charge(mem, gfp_mask, batch, oom_check); + ret = mem_cgroup_do_charge(memcg, gfp_mask, batch, oom_check); switch (ret) { case CHARGE_OK: break; case CHARGE_RETRY: /* not in OOM situation but retry */ batch = nr_pages; - css_put(&mem->css); - mem = NULL; + css_put(&memcg->css); + memcg = NULL; goto again; case CHARGE_WOULDBLOCK: /* !__GFP_WAIT */ - css_put(&mem->css); + css_put(&memcg->css); goto nomem; case CHARGE_NOMEM: /* OOM routine works */ if (!oom) { - css_put(&mem->css); + css_put(&memcg->css); goto nomem; } /* If oom, we never return -ENOMEM */ nr_oom_retries--; break; case CHARGE_OOM_DIE: /* Killed by OOM Killer */ - css_put(&mem->css); + css_put(&memcg->css); goto bypass; } } while (ret != CHARGE_OK); if (batch > nr_pages) - refill_stock(mem, batch - nr_pages); - css_put(&mem->css); + refill_stock(memcg, batch - nr_pages); + css_put(&memcg->css); done: - *memcg = mem; + *ptr = memcg; return 0; nomem: - *memcg = NULL; + *ptr = NULL; return -ENOMEM; bypass: - *memcg = NULL; + *ptr = NULL; return 0; } @@ -2500,15 +2500,15 @@ bypass: * This function is for that and do uncharge, put css's refcnt. * gotten by try_charge(). 
*/ -static void __mem_cgroup_cancel_charge(struct mem_cgroup *mem, +static void __mem_cgroup_cancel_charge(struct mem_cgroup *memcg, unsigned int nr_pages) { - if (!mem_cgroup_is_root(mem)) { + if (!mem_cgroup_is_root(memcg)) { unsigned long bytes = nr_pages * PAGE_SIZE; - res_counter_uncharge(&mem->res, bytes); + res_counter_uncharge(&memcg->res, bytes); if (do_swap_account) - res_counter_uncharge(&mem->memsw, bytes); + res_counter_uncharge(&memcg->memsw, bytes); } } @@ -2533,7 +2533,7 @@ static struct mem_cgroup *mem_cgroup_lookup(unsigned short id) struct mem_cgroup *try_get_mem_cgroup_from_page(struct page *page) { - struct mem_cgroup *mem = NULL; + struct mem_cgroup *memcg = NULL; struct page_cgroup *pc; unsigned short id; swp_entry_t ent; @@ -2543,23 +2543,23 @@ struct mem_cgroup *try_get_mem_cgroup_from_page(struct page *page) pc = lookup_page_cgroup(page); lock_page_cgroup(pc); if (PageCgroupUsed(pc)) { - mem = pc->mem_cgroup; - if (mem && !css_tryget(&mem->css)) - mem = NULL; + memcg = pc->mem_cgroup; + if (memcg && !css_tryget(&memcg->css)) + memcg = NULL; } else if (PageSwapCache(page)) { ent.val = page_private(page); id = lookup_swap_cgroup(ent); rcu_read_lock(); - mem = mem_cgroup_lookup(id); - if (mem && !css_tryget(&mem->css)) - mem = NULL; + memcg = mem_cgroup_lookup(id); + if (memcg && !css_tryget(&memcg->css)) + memcg = NULL; rcu_read_unlock(); } unlock_page_cgroup(pc); - return mem; + return memcg; } -static void __mem_cgroup_commit_charge(struct mem_cgroup *mem, +static void __mem_cgroup_commit_charge(struct mem_cgroup *memcg, struct page *page, unsigned int nr_pages, struct page_cgroup *pc, @@ -2568,14 +2568,14 @@ static void __mem_cgroup_commit_charge(struct mem_cgroup *mem, lock_page_cgroup(pc); if (unlikely(PageCgroupUsed(pc))) { unlock_page_cgroup(pc); - __mem_cgroup_cancel_charge(mem, nr_pages); + __mem_cgroup_cancel_charge(memcg, nr_pages); return; } /* * we don't need page_cgroup_lock about tail pages, becase they are not * accessed by any other context at this point. */ - pc->mem_cgroup = mem; + pc->mem_cgroup = memcg; /* * We access a page_cgroup asynchronously without lock_page_cgroup(). * Especially when a page_cgroup is taken from a page, pc->mem_cgroup @@ -2598,14 +2598,14 @@ static void __mem_cgroup_commit_charge(struct mem_cgroup *mem, break; } - mem_cgroup_charge_statistics(mem, PageCgroupCache(pc), nr_pages); + mem_cgroup_charge_statistics(memcg, PageCgroupCache(pc), nr_pages); unlock_page_cgroup(pc); /* * "charge_statistics" updated event counter. Then, check it. * Insert ancestor (and ancestor's ancestors), to softlimit RB-tree. * if they exceeds softlimit. 
*/ - memcg_check_events(mem, page); + memcg_check_events(memcg, page); } #ifdef CONFIG_TRANSPARENT_HUGEPAGE @@ -2700,10 +2700,8 @@ static int mem_cgroup_move_account(struct page *page, if (PageCgroupFileMapped(pc)) { /* Update mapped_file data for mem_cgroup */ - preempt_disable(); - __this_cpu_dec(from->stat->count[MEM_CGROUP_STAT_FILE_MAPPED]); - __this_cpu_inc(to->stat->count[MEM_CGROUP_STAT_FILE_MAPPED]); - preempt_enable(); + this_cpu_dec(from->stat->count[MEM_CGROUP_STAT_FILE_MAPPED]); + this_cpu_inc(to->stat->count[MEM_CGROUP_STAT_FILE_MAPPED]); } mem_cgroup_charge_statistics(from, PageCgroupCache(pc), -nr_pages); if (uncharge) @@ -2792,7 +2790,7 @@ out: static int mem_cgroup_charge_common(struct page *page, struct mm_struct *mm, gfp_t gfp_mask, enum charge_type ctype) { - struct mem_cgroup *mem = NULL; + struct mem_cgroup *memcg = NULL; unsigned int nr_pages = 1; struct page_cgroup *pc; bool oom = true; @@ -2811,11 +2809,11 @@ static int mem_cgroup_charge_common(struct page *page, struct mm_struct *mm, pc = lookup_page_cgroup(page); BUG_ON(!pc); /* XXX: remove this and move pc lookup into commit */ - ret = __mem_cgroup_try_charge(mm, gfp_mask, nr_pages, &mem, oom); - if (ret || !mem) + ret = __mem_cgroup_try_charge(mm, gfp_mask, nr_pages, &memcg, oom); + if (ret || !memcg) return ret; - __mem_cgroup_commit_charge(mem, page, nr_pages, pc, ctype); + __mem_cgroup_commit_charge(memcg, page, nr_pages, pc, ctype); return 0; } @@ -2844,7 +2842,7 @@ __mem_cgroup_commit_charge_swapin(struct page *page, struct mem_cgroup *ptr, enum charge_type ctype); static void -__mem_cgroup_commit_charge_lrucare(struct page *page, struct mem_cgroup *mem, +__mem_cgroup_commit_charge_lrucare(struct page *page, struct mem_cgroup *memcg, enum charge_type ctype) { struct page_cgroup *pc = lookup_page_cgroup(page); @@ -2854,7 +2852,7 @@ __mem_cgroup_commit_charge_lrucare(struct page *page, struct mem_cgroup *mem, * LRU. Take care of it. */ mem_cgroup_lru_del_before_commit(page); - __mem_cgroup_commit_charge(mem, page, 1, pc, ctype); + __mem_cgroup_commit_charge(memcg, page, 1, pc, ctype); mem_cgroup_lru_add_after_commit(page); return; } @@ -2862,7 +2860,7 @@ __mem_cgroup_commit_charge_lrucare(struct page *page, struct mem_cgroup *mem, int mem_cgroup_cache_charge(struct page *page, struct mm_struct *mm, gfp_t gfp_mask) { - struct mem_cgroup *mem = NULL; + struct mem_cgroup *memcg = NULL; int ret; if (mem_cgroup_disabled()) @@ -2874,8 +2872,8 @@ int mem_cgroup_cache_charge(struct page *page, struct mm_struct *mm, mm = &init_mm; if (page_is_file_cache(page)) { - ret = __mem_cgroup_try_charge(mm, gfp_mask, 1, &mem, true); - if (ret || !mem) + ret = __mem_cgroup_try_charge(mm, gfp_mask, 1, &memcg, true); + if (ret || !memcg) return ret; /* @@ -2883,15 +2881,15 @@ int mem_cgroup_cache_charge(struct page *page, struct mm_struct *mm, * put that would remove them from the LRU list, make * sure that they get relinked properly. 
*/ - __mem_cgroup_commit_charge_lrucare(page, mem, + __mem_cgroup_commit_charge_lrucare(page, memcg, MEM_CGROUP_CHARGE_TYPE_CACHE); return ret; } /* shmem */ if (PageSwapCache(page)) { - ret = mem_cgroup_try_charge_swapin(mm, page, gfp_mask, &mem); + ret = mem_cgroup_try_charge_swapin(mm, page, gfp_mask, &memcg); if (!ret) - __mem_cgroup_commit_charge_swapin(page, mem, + __mem_cgroup_commit_charge_swapin(page, memcg, MEM_CGROUP_CHARGE_TYPE_SHMEM); } else ret = mem_cgroup_charge_common(page, mm, gfp_mask, @@ -2910,7 +2908,7 @@ int mem_cgroup_try_charge_swapin(struct mm_struct *mm, struct page *page, gfp_t mask, struct mem_cgroup **ptr) { - struct mem_cgroup *mem; + struct mem_cgroup *memcg; int ret; *ptr = NULL; @@ -2928,12 +2926,12 @@ int mem_cgroup_try_charge_swapin(struct mm_struct *mm, */ if (!PageSwapCache(page)) goto charge_cur_mm; - mem = try_get_mem_cgroup_from_page(page); - if (!mem) + memcg = try_get_mem_cgroup_from_page(page); + if (!memcg) goto charge_cur_mm; - *ptr = mem; + *ptr = memcg; ret = __mem_cgroup_try_charge(NULL, mask, 1, ptr, true); - css_put(&mem->css); + css_put(&memcg->css); return ret; charge_cur_mm: if (unlikely(!mm)) @@ -2993,16 +2991,16 @@ void mem_cgroup_commit_charge_swapin(struct page *page, struct mem_cgroup *ptr) MEM_CGROUP_CHARGE_TYPE_MAPPED); } -void mem_cgroup_cancel_charge_swapin(struct mem_cgroup *mem) +void mem_cgroup_cancel_charge_swapin(struct mem_cgroup *memcg) { if (mem_cgroup_disabled()) return; - if (!mem) + if (!memcg) return; - __mem_cgroup_cancel_charge(mem, 1); + __mem_cgroup_cancel_charge(memcg, 1); } -static void mem_cgroup_do_uncharge(struct mem_cgroup *mem, +static void mem_cgroup_do_uncharge(struct mem_cgroup *memcg, unsigned int nr_pages, const enum charge_type ctype) { @@ -3020,7 +3018,7 @@ static void mem_cgroup_do_uncharge(struct mem_cgroup *mem, * uncharges. Then, it's ok to ignore memcg's refcnt. */ if (!batch->memcg) - batch->memcg = mem; + batch->memcg = memcg; /* * do_batch > 0 when unmapping pages or inode invalidate/truncate. * In those cases, all pages freed continuously can be expected to be in @@ -3040,7 +3038,7 @@ static void mem_cgroup_do_uncharge(struct mem_cgroup *mem, * merge a series of uncharges to an uncharge of res_counter. * If not, we uncharge res_counter ony by one. 
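/*
 * Illustrative sketch, not part of the patch above.
 * mem_cgroup_do_uncharge() defers uncharges while current->memcg_batch is
 * active (page free / truncate paths): consecutive frees hitting the same
 * memcg are only counted, and the accumulated total is handed back to the
 * res_counter once at the end instead of one res_counter_uncharge() per
 * page.  Minimal userspace sketch of that merge rule; the field names are
 * simplified stand-ins.
 */
#include <stdio.h>

struct uncharge_batch {
	void *group;			/* memcg this batch is collecting for */
	unsigned long nr_pages;		/* pages to hand back in one go */
};

static unsigned long direct_uncharges;	/* stands in for res_counter updates */

static void do_uncharge(struct uncharge_batch *b, void *group, unsigned long nr)
{
	if (!b->group)
		b->group = group;	/* first uncharge fixes the batch owner */
	if (b->group == group) {
		b->nr_pages += nr;	/* same memcg: just remember it */
		return;
	}
	direct_uncharges += nr;		/* different memcg: uncharge right away */
}

int main(void)
{
	struct uncharge_batch batch = { 0 };
	int a, b;			/* stand-ins for two mem_cgroups */

	do_uncharge(&batch, &a, 1);
	do_uncharge(&batch, &a, 1);
	do_uncharge(&batch, &b, 1);	/* odd one out goes straight through */
	printf("batched=%lu direct=%lu\n", batch.nr_pages, direct_uncharges);	/* 2 1 */
	return 0;
}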
*/ - if (batch->memcg != mem) + if (batch->memcg != memcg) goto direct_uncharge; /* remember freed charge and uncharge it later */ batch->nr_pages++; @@ -3048,11 +3046,11 @@ static void mem_cgroup_do_uncharge(struct mem_cgroup *mem, batch->memsw_nr_pages++; return; direct_uncharge: - res_counter_uncharge(&mem->res, nr_pages * PAGE_SIZE); + res_counter_uncharge(&memcg->res, nr_pages * PAGE_SIZE); if (uncharge_memsw) - res_counter_uncharge(&mem->memsw, nr_pages * PAGE_SIZE); - if (unlikely(batch->memcg != mem)) - memcg_oom_recover(mem); + res_counter_uncharge(&memcg->memsw, nr_pages * PAGE_SIZE); + if (unlikely(batch->memcg != memcg)) + memcg_oom_recover(memcg); return; } @@ -3062,7 +3060,7 @@ direct_uncharge: static struct mem_cgroup * __mem_cgroup_uncharge_common(struct page *page, enum charge_type ctype) { - struct mem_cgroup *mem = NULL; + struct mem_cgroup *memcg = NULL; unsigned int nr_pages = 1; struct page_cgroup *pc; @@ -3085,7 +3083,7 @@ __mem_cgroup_uncharge_common(struct page *page, enum charge_type ctype) lock_page_cgroup(pc); - mem = pc->mem_cgroup; + memcg = pc->mem_cgroup; if (!PageCgroupUsed(pc)) goto unlock_out; @@ -3108,7 +3106,7 @@ __mem_cgroup_uncharge_common(struct page *page, enum charge_type ctype) break; } - mem_cgroup_charge_statistics(mem, PageCgroupCache(pc), -nr_pages); + mem_cgroup_charge_statistics(memcg, PageCgroupCache(pc), -nr_pages); ClearPageCgroupUsed(pc); /* @@ -3123,15 +3121,15 @@ __mem_cgroup_uncharge_common(struct page *page, enum charge_type ctype) * even after unlock, we have mem->res.usage here and this memcg * will never be freed. */ - memcg_check_events(mem, page); + memcg_check_events(memcg, page); if (do_swap_account && ctype == MEM_CGROUP_CHARGE_TYPE_SWAPOUT) { - mem_cgroup_swap_statistics(mem, true); - mem_cgroup_get(mem); + mem_cgroup_swap_statistics(memcg, true); + mem_cgroup_get(memcg); } - if (!mem_cgroup_is_root(mem)) - mem_cgroup_do_uncharge(mem, nr_pages, ctype); + if (!mem_cgroup_is_root(memcg)) + mem_cgroup_do_uncharge(memcg, nr_pages, ctype); - return mem; + return memcg; unlock_out: unlock_page_cgroup(pc); @@ -3321,7 +3319,7 @@ static inline int mem_cgroup_move_swap_account(swp_entry_t entry, int mem_cgroup_prepare_migration(struct page *page, struct page *newpage, struct mem_cgroup **ptr, gfp_t gfp_mask) { - struct mem_cgroup *mem = NULL; + struct mem_cgroup *memcg = NULL; struct page_cgroup *pc; enum charge_type ctype; int ret = 0; @@ -3335,8 +3333,8 @@ int mem_cgroup_prepare_migration(struct page *page, pc = lookup_page_cgroup(page); lock_page_cgroup(pc); if (PageCgroupUsed(pc)) { - mem = pc->mem_cgroup; - css_get(&mem->css); + memcg = pc->mem_cgroup; + css_get(&memcg->css); /* * At migrating an anonymous page, its mapcount goes down * to 0 and uncharge() will be called. But, even if it's fully @@ -3374,12 +3372,12 @@ int mem_cgroup_prepare_migration(struct page *page, * If the page is not charged at this point, * we return here. 
*/ - if (!mem) + if (!memcg) return 0; - *ptr = mem; + *ptr = memcg; ret = __mem_cgroup_try_charge(NULL, gfp_mask, 1, ptr, false); - css_put(&mem->css);/* drop extra refcnt */ + css_put(&memcg->css);/* drop extra refcnt */ if (ret || *ptr == NULL) { if (PageAnon(page)) { lock_page_cgroup(pc); @@ -3405,21 +3403,21 @@ int mem_cgroup_prepare_migration(struct page *page, ctype = MEM_CGROUP_CHARGE_TYPE_CACHE; else ctype = MEM_CGROUP_CHARGE_TYPE_SHMEM; - __mem_cgroup_commit_charge(mem, page, 1, pc, ctype); + __mem_cgroup_commit_charge(memcg, page, 1, pc, ctype); return ret; } /* remove redundant charge if migration failed*/ -void mem_cgroup_end_migration(struct mem_cgroup *mem, +void mem_cgroup_end_migration(struct mem_cgroup *memcg, struct page *oldpage, struct page *newpage, bool migration_ok) { struct page *used, *unused; struct page_cgroup *pc; - if (!mem) + if (!memcg) return; /* blocks rmdir() */ - cgroup_exclude_rmdir(&mem->css); + cgroup_exclude_rmdir(&memcg->css); if (!migration_ok) { used = oldpage; unused = newpage; @@ -3455,7 +3453,7 @@ void mem_cgroup_end_migration(struct mem_cgroup *mem, * So, rmdir()->pre_destroy() can be called while we do this charge. * In that case, we need to call pre_destroy() again. check it here. */ - cgroup_release_and_wakeup_rmdir(&mem->css); + cgroup_release_and_wakeup_rmdir(&memcg->css); } #ifdef CONFIG_DEBUG_VM @@ -3534,7 +3532,7 @@ static int mem_cgroup_resize_limit(struct mem_cgroup *memcg, /* * Rather than hide all in some function, I do this in * open coded manner. You see what this really does. - * We have to guarantee mem->res.limit < mem->memsw.limit. + * We have to guarantee memcg->res.limit < memcg->memsw.limit. */ mutex_lock(&set_limit_mutex); memswlimit = res_counter_read_u64(&memcg->memsw, RES_LIMIT); @@ -3596,7 +3594,7 @@ static int mem_cgroup_resize_memsw_limit(struct mem_cgroup *memcg, /* * Rather than hide all in some function, I do this in * open coded manner. You see what this really does. - * We have to guarantee mem->res.limit < mem->memsw.limit. + * We have to guarantee memcg->res.limit < memcg->memsw.limit. */ mutex_lock(&set_limit_mutex); memlimit = res_counter_read_u64(&memcg->res, RES_LIMIT); @@ -3734,7 +3732,7 @@ unsigned long mem_cgroup_soft_limit_reclaim(struct zone *zone, int order, * This routine traverse page_cgroup in given list and drop them all. * *And* this routine doesn't reclaim page itself, just removes page_cgroup. */ -static int mem_cgroup_force_empty_list(struct mem_cgroup *mem, +static int mem_cgroup_force_empty_list(struct mem_cgroup *memcg, int node, int zid, enum lru_list lru) { struct zone *zone; @@ -3745,7 +3743,7 @@ static int mem_cgroup_force_empty_list(struct mem_cgroup *mem, int ret = 0; zone = &NODE_DATA(node)->node_zones[zid]; - mz = mem_cgroup_zoneinfo(mem, node, zid); + mz = mem_cgroup_zoneinfo(memcg, node, zid); list = &mz->lists[lru]; loop = MEM_CGROUP_ZSTAT(mz, lru); @@ -3772,7 +3770,7 @@ static int mem_cgroup_force_empty_list(struct mem_cgroup *mem, page = lookup_cgroup_page(pc); - ret = mem_cgroup_move_parent(page, pc, mem, GFP_KERNEL); + ret = mem_cgroup_move_parent(page, pc, memcg, GFP_KERNEL); if (ret == -ENOMEM) break; @@ -3793,14 +3791,14 @@ static int mem_cgroup_force_empty_list(struct mem_cgroup *mem, * make mem_cgroup's charge to be 0 if there is no task. * This enables deleting this mem_cgroup. 
*/ -static int mem_cgroup_force_empty(struct mem_cgroup *mem, bool free_all) +static int mem_cgroup_force_empty(struct mem_cgroup *memcg, bool free_all) { int ret; int node, zid, shrink; int nr_retries = MEM_CGROUP_RECLAIM_RETRIES; - struct cgroup *cgrp = mem->css.cgroup; + struct cgroup *cgrp = memcg->css.cgroup; - css_get(&mem->css); + css_get(&memcg->css); shrink = 0; /* should free all ? */ @@ -3816,14 +3814,14 @@ move_account: goto out; /* This is for making all *used* pages to be on LRU. */ lru_add_drain_all(); - drain_all_stock_sync(mem); + drain_all_stock_sync(memcg); ret = 0; - mem_cgroup_start_move(mem); + mem_cgroup_start_move(memcg); for_each_node_state(node, N_HIGH_MEMORY) { for (zid = 0; !ret && zid < MAX_NR_ZONES; zid++) { enum lru_list l; for_each_lru(l) { - ret = mem_cgroup_force_empty_list(mem, + ret = mem_cgroup_force_empty_list(memcg, node, zid, l); if (ret) break; @@ -3832,16 +3830,16 @@ move_account: if (ret) break; } - mem_cgroup_end_move(mem); - memcg_oom_recover(mem); + mem_cgroup_end_move(memcg); + memcg_oom_recover(memcg); /* it seems parent cgroup doesn't have enough mem */ if (ret == -ENOMEM) goto try_to_free; cond_resched(); /* "ret" should also be checked to ensure all lists are empty. */ - } while (mem->res.usage > 0 || ret); + } while (memcg->res.usage > 0 || ret); out: - css_put(&mem->css); + css_put(&memcg->css); return ret; try_to_free: @@ -3854,7 +3852,7 @@ try_to_free: lru_add_drain_all(); /* try to free all pages in this cgroup */ shrink = 1; - while (nr_retries && mem->res.usage > 0) { + while (nr_retries && memcg->res.usage > 0) { struct memcg_scanrecord rec; int progress; @@ -3863,9 +3861,9 @@ try_to_free: goto out; } rec.context = SCAN_BY_SHRINK; - rec.mem = mem; - rec.root = mem; - progress = try_to_free_mem_cgroup_pages(mem, GFP_KERNEL, + rec.mem = memcg; + rec.root = memcg; + progress = try_to_free_mem_cgroup_pages(memcg, GFP_KERNEL, false, &rec); if (!progress) { nr_retries--; @@ -3894,12 +3892,12 @@ static int mem_cgroup_hierarchy_write(struct cgroup *cont, struct cftype *cft, u64 val) { int retval = 0; - struct mem_cgroup *mem = mem_cgroup_from_cont(cont); + struct mem_cgroup *memcg = mem_cgroup_from_cont(cont); struct cgroup *parent = cont->parent; - struct mem_cgroup *parent_mem = NULL; + struct mem_cgroup *parent_memcg = NULL; if (parent) - parent_mem = mem_cgroup_from_cont(parent); + parent_memcg = mem_cgroup_from_cont(parent); cgroup_lock(); /* @@ -3910,10 +3908,10 @@ static int mem_cgroup_hierarchy_write(struct cgroup *cont, struct cftype *cft, * For the root cgroup, parent_mem is NULL, we allow value to be * set if there are no children. */ - if ((!parent_mem || !parent_mem->use_hierarchy) && + if ((!parent_memcg || !parent_memcg->use_hierarchy) && (val == 1 || val == 0)) { if (list_empty(&cont->children)) - mem->use_hierarchy = val; + memcg->use_hierarchy = val; else retval = -EBUSY; } else @@ -3924,14 +3922,14 @@ static int mem_cgroup_hierarchy_write(struct cgroup *cont, struct cftype *cft, } -static unsigned long mem_cgroup_recursive_stat(struct mem_cgroup *mem, +static unsigned long mem_cgroup_recursive_stat(struct mem_cgroup *memcg, enum mem_cgroup_stat_index idx) { struct mem_cgroup *iter; long val = 0; /* Per-cpu values can be negative, use a signed accumulator */ - for_each_mem_cgroup_tree(iter, mem) + for_each_mem_cgroup_tree(iter, memcg) val += mem_cgroup_read_stat(iter, idx); if (val < 0) /* race ? 
*/ @@ -3939,29 +3937,29 @@ static unsigned long mem_cgroup_recursive_stat(struct mem_cgroup *mem, return val; } -static inline u64 mem_cgroup_usage(struct mem_cgroup *mem, bool swap) +static inline u64 mem_cgroup_usage(struct mem_cgroup *memcg, bool swap) { u64 val; - if (!mem_cgroup_is_root(mem)) { + if (!mem_cgroup_is_root(memcg)) { if (!swap) - return res_counter_read_u64(&mem->res, RES_USAGE); + return res_counter_read_u64(&memcg->res, RES_USAGE); else - return res_counter_read_u64(&mem->memsw, RES_USAGE); + return res_counter_read_u64(&memcg->memsw, RES_USAGE); } - val = mem_cgroup_recursive_stat(mem, MEM_CGROUP_STAT_CACHE); - val += mem_cgroup_recursive_stat(mem, MEM_CGROUP_STAT_RSS); + val = mem_cgroup_recursive_stat(memcg, MEM_CGROUP_STAT_CACHE); + val += mem_cgroup_recursive_stat(memcg, MEM_CGROUP_STAT_RSS); if (swap) - val += mem_cgroup_recursive_stat(mem, MEM_CGROUP_STAT_SWAPOUT); + val += mem_cgroup_recursive_stat(memcg, MEM_CGROUP_STAT_SWAPOUT); return val << PAGE_SHIFT; } static u64 mem_cgroup_read(struct cgroup *cont, struct cftype *cft) { - struct mem_cgroup *mem = mem_cgroup_from_cont(cont); + struct mem_cgroup *memcg = mem_cgroup_from_cont(cont); u64 val; int type, name; @@ -3970,15 +3968,15 @@ static u64 mem_cgroup_read(struct cgroup *cont, struct cftype *cft) switch (type) { case _MEM: if (name == RES_USAGE) - val = mem_cgroup_usage(mem, false); + val = mem_cgroup_usage(memcg, false); else - val = res_counter_read_u64(&mem->res, name); + val = res_counter_read_u64(&memcg->res, name); break; case _MEMSWAP: if (name == RES_USAGE) - val = mem_cgroup_usage(mem, true); + val = mem_cgroup_usage(memcg, true); else - val = res_counter_read_u64(&mem->memsw, name); + val = res_counter_read_u64(&memcg->memsw, name); break; default: BUG(); @@ -4066,24 +4064,24 @@ out: static int mem_cgroup_reset(struct cgroup *cont, unsigned int event) { - struct mem_cgroup *mem; + struct mem_cgroup *memcg; int type, name; - mem = mem_cgroup_from_cont(cont); + memcg = mem_cgroup_from_cont(cont); type = MEMFILE_TYPE(event); name = MEMFILE_ATTR(event); switch (name) { case RES_MAX_USAGE: if (type == _MEM) - res_counter_reset_max(&mem->res); + res_counter_reset_max(&memcg->res); else - res_counter_reset_max(&mem->memsw); + res_counter_reset_max(&memcg->memsw); break; case RES_FAILCNT: if (type == _MEM) - res_counter_reset_failcnt(&mem->res); + res_counter_reset_failcnt(&memcg->res); else - res_counter_reset_failcnt(&mem->memsw); + res_counter_reset_failcnt(&memcg->memsw); break; } @@ -4100,7 +4098,7 @@ static u64 mem_cgroup_move_charge_read(struct cgroup *cgrp, static int mem_cgroup_move_charge_write(struct cgroup *cgrp, struct cftype *cft, u64 val) { - struct mem_cgroup *mem = mem_cgroup_from_cont(cgrp); + struct mem_cgroup *memcg = mem_cgroup_from_cont(cgrp); if (val >= (1 << NR_MOVE_TYPE)) return -EINVAL; @@ -4110,7 +4108,7 @@ static int mem_cgroup_move_charge_write(struct cgroup *cgrp, * inconsistent. 
*/ cgroup_lock(); - mem->move_charge_at_immigrate = val; + memcg->move_charge_at_immigrate = val; cgroup_unlock(); return 0; @@ -4167,49 +4165,49 @@ struct { static void -mem_cgroup_get_local_stat(struct mem_cgroup *mem, struct mcs_total_stat *s) +mem_cgroup_get_local_stat(struct mem_cgroup *memcg, struct mcs_total_stat *s) { s64 val; /* per cpu stat */ - val = mem_cgroup_read_stat(mem, MEM_CGROUP_STAT_CACHE); + val = mem_cgroup_read_stat(memcg, MEM_CGROUP_STAT_CACHE); s->stat[MCS_CACHE] += val * PAGE_SIZE; - val = mem_cgroup_read_stat(mem, MEM_CGROUP_STAT_RSS); + val = mem_cgroup_read_stat(memcg, MEM_CGROUP_STAT_RSS); s->stat[MCS_RSS] += val * PAGE_SIZE; - val = mem_cgroup_read_stat(mem, MEM_CGROUP_STAT_FILE_MAPPED); + val = mem_cgroup_read_stat(memcg, MEM_CGROUP_STAT_FILE_MAPPED); s->stat[MCS_FILE_MAPPED] += val * PAGE_SIZE; - val = mem_cgroup_read_events(mem, MEM_CGROUP_EVENTS_PGPGIN); + val = mem_cgroup_read_events(memcg, MEM_CGROUP_EVENTS_PGPGIN); s->stat[MCS_PGPGIN] += val; - val = mem_cgroup_read_events(mem, MEM_CGROUP_EVENTS_PGPGOUT); + val = mem_cgroup_read_events(memcg, MEM_CGROUP_EVENTS_PGPGOUT); s->stat[MCS_PGPGOUT] += val; if (do_swap_account) { - val = mem_cgroup_read_stat(mem, MEM_CGROUP_STAT_SWAPOUT); + val = mem_cgroup_read_stat(memcg, MEM_CGROUP_STAT_SWAPOUT); s->stat[MCS_SWAP] += val * PAGE_SIZE; } - val = mem_cgroup_read_events(mem, MEM_CGROUP_EVENTS_PGFAULT); + val = mem_cgroup_read_events(memcg, MEM_CGROUP_EVENTS_PGFAULT); s->stat[MCS_PGFAULT] += val; - val = mem_cgroup_read_events(mem, MEM_CGROUP_EVENTS_PGMAJFAULT); + val = mem_cgroup_read_events(memcg, MEM_CGROUP_EVENTS_PGMAJFAULT); s->stat[MCS_PGMAJFAULT] += val; /* per zone stat */ - val = mem_cgroup_nr_lru_pages(mem, BIT(LRU_INACTIVE_ANON)); + val = mem_cgroup_nr_lru_pages(memcg, BIT(LRU_INACTIVE_ANON)); s->stat[MCS_INACTIVE_ANON] += val * PAGE_SIZE; - val = mem_cgroup_nr_lru_pages(mem, BIT(LRU_ACTIVE_ANON)); + val = mem_cgroup_nr_lru_pages(memcg, BIT(LRU_ACTIVE_ANON)); s->stat[MCS_ACTIVE_ANON] += val * PAGE_SIZE; - val = mem_cgroup_nr_lru_pages(mem, BIT(LRU_INACTIVE_FILE)); + val = mem_cgroup_nr_lru_pages(memcg, BIT(LRU_INACTIVE_FILE)); s->stat[MCS_INACTIVE_FILE] += val * PAGE_SIZE; - val = mem_cgroup_nr_lru_pages(mem, BIT(LRU_ACTIVE_FILE)); + val = mem_cgroup_nr_lru_pages(memcg, BIT(LRU_ACTIVE_FILE)); s->stat[MCS_ACTIVE_FILE] += val * PAGE_SIZE; - val = mem_cgroup_nr_lru_pages(mem, BIT(LRU_UNEVICTABLE)); + val = mem_cgroup_nr_lru_pages(memcg, BIT(LRU_UNEVICTABLE)); s->stat[MCS_UNEVICTABLE] += val * PAGE_SIZE; } static void -mem_cgroup_get_total_stat(struct mem_cgroup *mem, struct mcs_total_stat *s) +mem_cgroup_get_total_stat(struct mem_cgroup *memcg, struct mcs_total_stat *s) { struct mem_cgroup *iter; - for_each_mem_cgroup_tree(iter, mem) + for_each_mem_cgroup_tree(iter, memcg) mem_cgroup_get_local_stat(iter, s); } @@ -4433,20 +4431,20 @@ static int compare_thresholds(const void *a, const void *b) return _a->threshold - _b->threshold; } -static int mem_cgroup_oom_notify_cb(struct mem_cgroup *mem) +static int mem_cgroup_oom_notify_cb(struct mem_cgroup *memcg) { struct mem_cgroup_eventfd_list *ev; - list_for_each_entry(ev, &mem->oom_notify, list) + list_for_each_entry(ev, &memcg->oom_notify, list) eventfd_signal(ev->eventfd, 1); return 0; } -static void mem_cgroup_oom_notify(struct mem_cgroup *mem) +static void mem_cgroup_oom_notify(struct mem_cgroup *memcg) { struct mem_cgroup *iter; - for_each_mem_cgroup_tree(iter, mem) + for_each_mem_cgroup_tree(iter, memcg) mem_cgroup_oom_notify_cb(iter); } @@ -4636,7 
+4634,7 @@ static int mem_cgroup_oom_register_event(struct cgroup *cgrp, static void mem_cgroup_oom_unregister_event(struct cgroup *cgrp, struct cftype *cft, struct eventfd_ctx *eventfd) { - struct mem_cgroup *mem = mem_cgroup_from_cont(cgrp); + struct mem_cgroup *memcg = mem_cgroup_from_cont(cgrp); struct mem_cgroup_eventfd_list *ev, *tmp; int type = MEMFILE_TYPE(cft->private); @@ -4644,7 +4642,7 @@ static void mem_cgroup_oom_unregister_event(struct cgroup *cgrp, spin_lock(&memcg_oom_lock); - list_for_each_entry_safe(ev, tmp, &mem->oom_notify, list) { + list_for_each_entry_safe(ev, tmp, &memcg->oom_notify, list) { if (ev->eventfd == eventfd) { list_del(&ev->list); kfree(ev); @@ -4657,11 +4655,11 @@ static void mem_cgroup_oom_unregister_event(struct cgroup *cgrp, static int mem_cgroup_oom_control_read(struct cgroup *cgrp, struct cftype *cft, struct cgroup_map_cb *cb) { - struct mem_cgroup *mem = mem_cgroup_from_cont(cgrp); + struct mem_cgroup *memcg = mem_cgroup_from_cont(cgrp); - cb->fill(cb, "oom_kill_disable", mem->oom_kill_disable); + cb->fill(cb, "oom_kill_disable", memcg->oom_kill_disable); - if (atomic_read(&mem->under_oom)) + if (atomic_read(&memcg->under_oom)) cb->fill(cb, "under_oom", 1); else cb->fill(cb, "under_oom", 0); @@ -4671,7 +4669,7 @@ static int mem_cgroup_oom_control_read(struct cgroup *cgrp, static int mem_cgroup_oom_control_write(struct cgroup *cgrp, struct cftype *cft, u64 val) { - struct mem_cgroup *mem = mem_cgroup_from_cont(cgrp); + struct mem_cgroup *memcg = mem_cgroup_from_cont(cgrp); struct mem_cgroup *parent; /* cannot set to root cgroup and only 0 and 1 are allowed */ @@ -4683,13 +4681,13 @@ static int mem_cgroup_oom_control_write(struct cgroup *cgrp, cgroup_lock(); /* oom-kill-disable is a flag for subhierarchy. */ if ((parent->use_hierarchy) || - (mem->use_hierarchy && !list_empty(&cgrp->children))) { + (memcg->use_hierarchy && !list_empty(&cgrp->children))) { cgroup_unlock(); return -EINVAL; } - mem->oom_kill_disable = val; + memcg->oom_kill_disable = val; if (!val) - memcg_oom_recover(mem); + memcg_oom_recover(memcg); cgroup_unlock(); return 0; } @@ -4714,33 +4712,35 @@ static int mem_cgroup_vmscan_stat_read(struct cgroup *cgrp, struct cftype *cft, struct cgroup_map_cb *cb) { - struct mem_cgroup *mem = mem_cgroup_from_cont(cgrp); + struct mem_cgroup *memcg = mem_cgroup_from_cont(cgrp); char string[64]; int i; for (i = 0; i < NR_SCANSTATS; i++) { strcpy(string, scanstat_string[i]); strcat(string, SCANSTAT_WORD_LIMIT); - cb->fill(cb, string, mem->scanstat.stats[SCAN_BY_LIMIT][i]); + cb->fill(cb, string, memcg->scanstat.stats[SCAN_BY_LIMIT][i]); } for (i = 0; i < NR_SCANSTATS; i++) { strcpy(string, scanstat_string[i]); strcat(string, SCANSTAT_WORD_SYSTEM); - cb->fill(cb, string, mem->scanstat.stats[SCAN_BY_SYSTEM][i]); + cb->fill(cb, string, memcg->scanstat.stats[SCAN_BY_SYSTEM][i]); } for (i = 0; i < NR_SCANSTATS; i++) { strcpy(string, scanstat_string[i]); strcat(string, SCANSTAT_WORD_LIMIT); strcat(string, SCANSTAT_WORD_HIERARCHY); - cb->fill(cb, string, mem->scanstat.rootstats[SCAN_BY_LIMIT][i]); + cb->fill(cb, + string, memcg->scanstat.rootstats[SCAN_BY_LIMIT][i]); } for (i = 0; i < NR_SCANSTATS; i++) { strcpy(string, scanstat_string[i]); strcat(string, SCANSTAT_WORD_SYSTEM); strcat(string, SCANSTAT_WORD_HIERARCHY); - cb->fill(cb, string, mem->scanstat.rootstats[SCAN_BY_SYSTEM][i]); + cb->fill(cb, + string, memcg->scanstat.rootstats[SCAN_BY_SYSTEM][i]); } return 0; } @@ -4748,12 +4748,13 @@ static int mem_cgroup_vmscan_stat_read(struct cgroup *cgrp, 
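For context, the memory.oom_control / cgroup.event_control pair touched above is the user-visible side of mem_cgroup_oom_notify(): user space registers an eventfd and eventfd_signal() wakes it when the group hits OOM. A minimal user-space sketch, assuming a v1 memory controller is mounted and a group exists at the (purely illustrative) path below:

/* Wait for an OOM notification from a v1 memory cgroup (illustrative paths). */
#define _GNU_SOURCE
#include <sys/eventfd.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>
#include <fcntl.h>
#include <unistd.h>

int main(void)
{
	const char *grp = "/sys/fs/cgroup/memory/mygroup";	/* hypothetical group */
	char path[256], cmd[64];
	int efd, oom_fd, ctl_fd;
	uint64_t cnt;

	efd = eventfd(0, 0);
	snprintf(path, sizeof(path), "%s/memory.oom_control", grp);
	oom_fd = open(path, O_RDONLY);
	snprintf(path, sizeof(path), "%s/cgroup.event_control", grp);
	ctl_fd = open(path, O_WRONLY);
	if (efd < 0 || oom_fd < 0 || ctl_fd < 0)
		return 1;

	/* "<eventfd> <oom_control fd>" registers the eventfd for OOM events */
	snprintf(cmd, sizeof(cmd), "%d %d", efd, oom_fd);
	if (write(ctl_fd, cmd, strlen(cmd)) < 0)
		return 1;

	read(efd, &cnt, sizeof(cnt));	/* blocks until eventfd_signal() fires */
	printf("memcg reported %llu OOM event(s)\n", (unsigned long long)cnt);
	return 0;
}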
static int mem_cgroup_reset_vmscan_stat(struct cgroup *cgrp, unsigned int event) { - struct mem_cgroup *mem = mem_cgroup_from_cont(cgrp); + struct mem_cgroup *memcg = mem_cgroup_from_cont(cgrp); - spin_lock(&mem->scanstat.lock); - memset(&mem->scanstat.stats, 0, sizeof(mem->scanstat.stats)); - memset(&mem->scanstat.rootstats, 0, sizeof(mem->scanstat.rootstats)); - spin_unlock(&mem->scanstat.lock); + spin_lock(&memcg->scanstat.lock); + memset(&memcg->scanstat.stats, 0, sizeof(memcg->scanstat.stats)); + memset(&memcg->scanstat.rootstats, + 0, sizeof(memcg->scanstat.rootstats)); + spin_unlock(&memcg->scanstat.lock); return 0; } @@ -4878,7 +4879,7 @@ static int register_memsw_files(struct cgroup *cont, struct cgroup_subsys *ss) } #endif -static int alloc_mem_cgroup_per_zone_info(struct mem_cgroup *mem, int node) +static int alloc_mem_cgroup_per_zone_info(struct mem_cgroup *memcg, int node) { struct mem_cgroup_per_node *pn; struct mem_cgroup_per_zone *mz; @@ -4898,21 +4899,21 @@ static int alloc_mem_cgroup_per_zone_info(struct mem_cgroup *mem, int node) if (!pn) return 1; - mem->info.nodeinfo[node] = pn; for (zone = 0; zone < MAX_NR_ZONES; zone++) { mz = &pn->zoneinfo[zone]; for_each_lru(l) INIT_LIST_HEAD(&mz->lists[l]); mz->usage_in_excess = 0; mz->on_tree = false; - mz->mem = mem; + mz->mem = memcg; } + memcg->info.nodeinfo[node] = pn; return 0; } -static void free_mem_cgroup_per_zone_info(struct mem_cgroup *mem, int node) +static void free_mem_cgroup_per_zone_info(struct mem_cgroup *memcg, int node) { - kfree(mem->info.nodeinfo[node]); + kfree(memcg->info.nodeinfo[node]); } static struct mem_cgroup *mem_cgroup_alloc(void) @@ -4954,51 +4955,51 @@ out_free: * Removal of cgroup itself succeeds regardless of refs from swap. */ -static void __mem_cgroup_free(struct mem_cgroup *mem) +static void __mem_cgroup_free(struct mem_cgroup *memcg) { int node; - mem_cgroup_remove_from_trees(mem); - free_css_id(&mem_cgroup_subsys, &mem->css); + mem_cgroup_remove_from_trees(memcg); + free_css_id(&mem_cgroup_subsys, &memcg->css); for_each_node_state(node, N_POSSIBLE) - free_mem_cgroup_per_zone_info(mem, node); + free_mem_cgroup_per_zone_info(memcg, node); - free_percpu(mem->stat); + free_percpu(memcg->stat); if (sizeof(struct mem_cgroup) < PAGE_SIZE) - kfree(mem); + kfree(memcg); else - vfree(mem); + vfree(memcg); } -static void mem_cgroup_get(struct mem_cgroup *mem) +static void mem_cgroup_get(struct mem_cgroup *memcg) { - atomic_inc(&mem->refcnt); + atomic_inc(&memcg->refcnt); } -static void __mem_cgroup_put(struct mem_cgroup *mem, int count) +static void __mem_cgroup_put(struct mem_cgroup *memcg, int count) { - if (atomic_sub_and_test(count, &mem->refcnt)) { - struct mem_cgroup *parent = parent_mem_cgroup(mem); - __mem_cgroup_free(mem); + if (atomic_sub_and_test(count, &memcg->refcnt)) { + struct mem_cgroup *parent = parent_mem_cgroup(memcg); + __mem_cgroup_free(memcg); if (parent) mem_cgroup_put(parent); } } -static void mem_cgroup_put(struct mem_cgroup *mem) +static void mem_cgroup_put(struct mem_cgroup *memcg) { - __mem_cgroup_put(mem, 1); + __mem_cgroup_put(memcg, 1); } /* * Returns the parent mem_cgroup in memcgroup hierarchy with hierarchy enabled. 
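The __mem_cgroup_put() rename above keeps the existing scheme intact: drop this group's count and, when it reaches zero, free the group and release the reference it held on its parent. A rough user-space sketch of that "drop self, then drop parent" pattern with C11 atomics; the struct and function names are invented for illustration:

/* Illustrative hierarchical refcount put; each node pins one ref on its parent. */
#include <stdatomic.h>
#include <stdlib.h>

struct node {
	atomic_int refcnt;	/* starts at 1 when the node is created */
	struct node *parent;	/* holds one reference on the parent */
};

static void node_put(struct node *n)
{
	while (n) {
		struct node *parent = n->parent;

		/* fetch_sub returns the old value; 1 means we dropped the last ref */
		if (atomic_fetch_sub(&n->refcnt, 1) != 1)
			break;
		free(n);
		n = parent;	/* now release the reference held on the parent */
	}
}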
*/ -static struct mem_cgroup *parent_mem_cgroup(struct mem_cgroup *mem) +static struct mem_cgroup *parent_mem_cgroup(struct mem_cgroup *memcg) { - if (!mem->res.parent) + if (!memcg->res.parent) return NULL; - return mem_cgroup_from_res_counter(mem->res.parent, res); + return mem_cgroup_from_res_counter(memcg->res.parent, res); } #ifdef CONFIG_CGROUP_MEM_RES_CTLR_SWAP @@ -5041,16 +5042,16 @@ static int mem_cgroup_soft_limit_tree_init(void) static struct cgroup_subsys_state * __ref mem_cgroup_create(struct cgroup_subsys *ss, struct cgroup *cont) { - struct mem_cgroup *mem, *parent; + struct mem_cgroup *memcg, *parent; long error = -ENOMEM; int node; - mem = mem_cgroup_alloc(); - if (!mem) + memcg = mem_cgroup_alloc(); + if (!memcg) return ERR_PTR(error); for_each_node_state(node, N_POSSIBLE) - if (alloc_mem_cgroup_per_zone_info(mem, node)) + if (alloc_mem_cgroup_per_zone_info(memcg, node)) goto free_out; /* root ? */ @@ -5058,7 +5059,7 @@ mem_cgroup_create(struct cgroup_subsys *ss, struct cgroup *cont) int cpu; enable_swap_cgroup(); parent = NULL; - root_mem_cgroup = mem; + root_mem_cgroup = memcg; if (mem_cgroup_soft_limit_tree_init()) goto free_out; for_each_possible_cpu(cpu) { @@ -5069,13 +5070,13 @@ mem_cgroup_create(struct cgroup_subsys *ss, struct cgroup *cont) hotcpu_notifier(memcg_cpu_hotplug_callback, 0); } else { parent = mem_cgroup_from_cont(cont->parent); - mem->use_hierarchy = parent->use_hierarchy; - mem->oom_kill_disable = parent->oom_kill_disable; + memcg->use_hierarchy = parent->use_hierarchy; + memcg->oom_kill_disable = parent->oom_kill_disable; } if (parent && parent->use_hierarchy) { - res_counter_init(&mem->res, &parent->res); - res_counter_init(&mem->memsw, &parent->memsw); + res_counter_init(&memcg->res, &parent->res); + res_counter_init(&memcg->memsw, &parent->memsw); /* * We increment refcnt of the parent to ensure that we can * safely access it on res_counter_charge/uncharge. 
@@ -5084,22 +5085,22 @@ mem_cgroup_create(struct cgroup_subsys *ss, struct cgroup *cont) */ mem_cgroup_get(parent); } else { - res_counter_init(&mem->res, NULL); - res_counter_init(&mem->memsw, NULL); + res_counter_init(&memcg->res, NULL); + res_counter_init(&memcg->memsw, NULL); } - mem->last_scanned_child = 0; - mem->last_scanned_node = MAX_NUMNODES; - INIT_LIST_HEAD(&mem->oom_notify); + memcg->last_scanned_child = 0; + memcg->last_scanned_node = MAX_NUMNODES; + INIT_LIST_HEAD(&memcg->oom_notify); if (parent) - mem->swappiness = mem_cgroup_swappiness(parent); - atomic_set(&mem->refcnt, 1); - mem->move_charge_at_immigrate = 0; - mutex_init(&mem->thresholds_lock); - spin_lock_init(&mem->scanstat.lock); - return &mem->css; + memcg->swappiness = mem_cgroup_swappiness(parent); + atomic_set(&memcg->refcnt, 1); + memcg->move_charge_at_immigrate = 0; + mutex_init(&memcg->thresholds_lock); + spin_lock_init(&memcg->scanstat.lock); + return &memcg->css; free_out: - __mem_cgroup_free(mem); + __mem_cgroup_free(memcg); root_mem_cgroup = NULL; return ERR_PTR(error); } @@ -5107,17 +5108,17 @@ free_out: static int mem_cgroup_pre_destroy(struct cgroup_subsys *ss, struct cgroup *cont) { - struct mem_cgroup *mem = mem_cgroup_from_cont(cont); + struct mem_cgroup *memcg = mem_cgroup_from_cont(cont); - return mem_cgroup_force_empty(mem, false); + return mem_cgroup_force_empty(memcg, false); } static void mem_cgroup_destroy(struct cgroup_subsys *ss, struct cgroup *cont) { - struct mem_cgroup *mem = mem_cgroup_from_cont(cont); + struct mem_cgroup *memcg = mem_cgroup_from_cont(cont); - mem_cgroup_put(mem); + mem_cgroup_put(memcg); } static int mem_cgroup_populate(struct cgroup_subsys *ss, @@ -5140,9 +5141,9 @@ static int mem_cgroup_do_precharge(unsigned long count) { int ret = 0; int batch_count = PRECHARGE_COUNT_AT_ONCE; - struct mem_cgroup *mem = mc.to; + struct mem_cgroup *memcg = mc.to; - if (mem_cgroup_is_root(mem)) { + if (mem_cgroup_is_root(memcg)) { mc.precharge += count; /* we don't need css_get for root */ return ret; @@ -5151,16 +5152,16 @@ static int mem_cgroup_do_precharge(unsigned long count) if (count > 1) { struct res_counter *dummy; /* - * "mem" cannot be under rmdir() because we've already checked + * "memcg" cannot be under rmdir() because we've already checked * by cgroup_lock_live_cgroup() that it is not removed and we * are still under the same cgroup_mutex. So we can postpone * css_get(). 
*/ - if (res_counter_charge(&mem->res, PAGE_SIZE * count, &dummy)) + if (res_counter_charge(&memcg->res, PAGE_SIZE * count, &dummy)) goto one_by_one; - if (do_swap_account && res_counter_charge(&mem->memsw, + if (do_swap_account && res_counter_charge(&memcg->memsw, PAGE_SIZE * count, &dummy)) { - res_counter_uncharge(&mem->res, PAGE_SIZE * count); + res_counter_uncharge(&memcg->res, PAGE_SIZE * count); goto one_by_one; } mc.precharge += count; @@ -5177,8 +5178,9 @@ one_by_one: batch_count = PRECHARGE_COUNT_AT_ONCE; cond_resched(); } - ret = __mem_cgroup_try_charge(NULL, GFP_KERNEL, 1, &mem, false); - if (ret || !mem) + ret = __mem_cgroup_try_charge(NULL, + GFP_KERNEL, 1, &memcg, false); + if (ret || !memcg) /* mem_cgroup_clear_mc() will do uncharge later */ return -ENOMEM; mc.precharge++; @@ -5452,13 +5454,13 @@ static int mem_cgroup_can_attach(struct cgroup_subsys *ss, struct task_struct *p) { int ret = 0; - struct mem_cgroup *mem = mem_cgroup_from_cont(cgroup); + struct mem_cgroup *memcg = mem_cgroup_from_cont(cgroup); - if (mem->move_charge_at_immigrate) { + if (memcg->move_charge_at_immigrate) { struct mm_struct *mm; struct mem_cgroup *from = mem_cgroup_from_task(p); - VM_BUG_ON(from == mem); + VM_BUG_ON(from == memcg); mm = get_task_mm(p); if (!mm) @@ -5473,7 +5475,7 @@ static int mem_cgroup_can_attach(struct cgroup_subsys *ss, mem_cgroup_start_move(from); spin_lock(&mc.lock); mc.from = from; - mc.to = mem; + mc.to = memcg; spin_unlock(&mc.lock); /* We set mc.moving_task later */ diff --git a/mm/mempolicy.c b/mm/mempolicy.c index 8b57173c1dd5..905f2027b2a1 100644 --- a/mm/mempolicy.c +++ b/mm/mempolicy.c @@ -1412,7 +1412,9 @@ asmlinkage long compat_sys_get_mempolicy(int __user *policy, err = sys_get_mempolicy(policy, nm, nr_bits+1, addr, flags); if (!err && nmask) { - err = copy_from_user(bm, nm, alloc_size); + unsigned long copy_size; + copy_size = min_t(unsigned long, sizeof(bm), alloc_size); + err = copy_from_user(bm, nm, copy_size); /* ensure entire bitmap is zeroed */ err |= clear_user(nmask, ALIGN(maxnode-1, 8) / 8); err |= compat_put_bitmap(nmask, bm, nr_bits); diff --git a/mm/migrate.c b/mm/migrate.c index 666e4e677414..71713fc49962 100644 --- a/mm/migrate.c +++ b/mm/migrate.c @@ -621,38 +621,18 @@ static int move_to_new_page(struct page *newpage, struct page *page, return rc; } -/* - * Obtain the lock on page, remove all ptes and migrate the page - * to the newly allocated page in newpage. - */ -static int unmap_and_move(new_page_t get_new_page, unsigned long private, - struct page *page, int force, bool offlining, bool sync) +static int __unmap_and_move(struct page *page, struct page *newpage, + int force, bool offlining, bool sync) { - int rc = 0; - int *result = NULL; - struct page *newpage = get_new_page(page, private, &result); + int rc = -EAGAIN; int remap_swapcache = 1; int charge = 0; struct mem_cgroup *mem; struct anon_vma *anon_vma = NULL; - if (!newpage) - return -ENOMEM; - - if (page_count(page) == 1) { - /* page was freed from under us. So we are done. */ - goto move_newpage; - } - if (unlikely(PageTransHuge(page))) - if (unlikely(split_huge_page(page))) - goto move_newpage; - - /* prepare cgroup just returns 0 or -ENOMEM */ - rc = -EAGAIN; - if (!trylock_page(page)) { if (!force || !sync) - goto move_newpage; + goto out; /* * It's not safe for direct compaction to call lock_page. @@ -668,7 +648,7 @@ static int unmap_and_move(new_page_t get_new_page, unsigned long private, * altogether. 
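The mm/mempolicy.c hunk above fixes a stack overread by clamping the copy length to the size of the on-stack bitmap before copy_from_user(). The same defensive idiom in plain C, with invented names, just to make the pattern explicit:

/* Illustrative: never copy more than the destination can hold. */
#include <string.h>
#include <stddef.h>

#define MIN(a, b) ((a) < (b) ? (a) : (b))

static void copy_clamped(void *dst, size_t dst_size,
			 const void *src, size_t requested)
{
	size_t n = MIN(dst_size, requested);	/* mirrors min_t() in the patch */

	memcpy(dst, src, n);
}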
*/ if (current->flags & PF_MEMALLOC) - goto move_newpage; + goto out; lock_page(page); } @@ -785,27 +765,52 @@ uncharge: mem_cgroup_end_migration(mem, page, newpage, rc == 0); unlock: unlock_page(page); +out: + return rc; +} -move_newpage: +/* + * Obtain the lock on page, remove all ptes and migrate the page + * to the newly allocated page in newpage. + */ +static int unmap_and_move(new_page_t get_new_page, unsigned long private, + struct page *page, int force, bool offlining, bool sync) +{ + int rc = 0; + int *result = NULL; + struct page *newpage = get_new_page(page, private, &result); + + if (!newpage) + return -ENOMEM; + + if (page_count(page) == 1) { + /* page was freed from under us. So we are done. */ + goto out; + } + + if (unlikely(PageTransHuge(page))) + if (unlikely(split_huge_page(page))) + goto out; + + rc = __unmap_and_move(page, newpage, force, offlining, sync); +out: if (rc != -EAGAIN) { - /* - * A page that has been migrated has all references - * removed and will be freed. A page that has not been - * migrated will have kepts its references and be - * restored. - */ - list_del(&page->lru); + /* + * A page that has been migrated has all references + * removed and will be freed. A page that has not been + * migrated will have kepts its references and be + * restored. + */ + list_del(&page->lru); dec_zone_page_state(page, NR_ISOLATED_ANON + page_is_file_cache(page)); putback_lru_page(page); } - /* * Move the new page to the LRU. If migration was not successful * then this will free the page. */ putback_lru_page(newpage); - if (result) { if (rc) *result = rc; diff --git a/mm/mremap.c b/mm/mremap.c index 506fa44403df..d6959cb4df58 100644 --- a/mm/mremap.c +++ b/mm/mremap.c @@ -41,8 +41,7 @@ static pmd_t *get_old_pmd(struct mm_struct *mm, unsigned long addr) return NULL; pmd = pmd_offset(pud, addr); - split_huge_page_pmd(mm, pmd); - if (pmd_none_or_clear_bad(pmd)) + if (pmd_none(*pmd)) return NULL; return pmd; @@ -65,8 +64,6 @@ static pmd_t *alloc_new_pmd(struct mm_struct *mm, struct vm_area_struct *vma, return NULL; VM_BUG_ON(pmd_trans_huge(*pmd)); - if (pmd_none(*pmd) && __pte_alloc(mm, vma, pmd, addr)) - return NULL; return pmd; } @@ -80,11 +77,7 @@ static void move_ptes(struct vm_area_struct *vma, pmd_t *old_pmd, struct mm_struct *mm = vma->vm_mm; pte_t *old_pte, *new_pte, pte; spinlock_t *old_ptl, *new_ptl; - unsigned long old_start; - old_start = old_addr; - mmu_notifier_invalidate_range_start(vma->vm_mm, - old_start, old_end); if (vma->vm_file) { /* * Subtle point from Rajesh Venkatasubramanian: before @@ -111,7 +104,7 @@ static void move_ptes(struct vm_area_struct *vma, pmd_t *old_pmd, new_pte++, new_addr += PAGE_SIZE) { if (pte_none(*old_pte)) continue; - pte = ptep_clear_flush(vma, old_addr, old_pte); + pte = ptep_get_and_clear(mm, old_addr, old_pte); pte = move_pte(pte, new_vma->vm_page_prot, old_addr, new_addr); set_pte_at(mm, new_addr, new_pte, pte); } @@ -123,7 +116,6 @@ static void move_ptes(struct vm_area_struct *vma, pmd_t *old_pmd, pte_unmap_unlock(old_pte - 1, old_ptl); if (mapping) mutex_unlock(&mapping->i_mmap_mutex); - mmu_notifier_invalidate_range_end(vma->vm_mm, old_start, old_end); } #define LATENCY_LIMIT (64 * PAGE_SIZE) @@ -134,22 +126,43 @@ unsigned long move_page_tables(struct vm_area_struct *vma, { unsigned long extent, next, old_end; pmd_t *old_pmd, *new_pmd; + bool need_flush = false; old_end = old_addr + len; flush_cache_range(vma, old_addr, old_end); + mmu_notifier_invalidate_range_start(vma->vm_mm, old_addr, old_end); + for (; old_addr < 
old_end; old_addr += extent, new_addr += extent) { cond_resched(); next = (old_addr + PMD_SIZE) & PMD_MASK; - if (next - 1 > old_end) - next = old_end; + /* even if next overflowed, extent below will be ok */ extent = next - old_addr; + if (extent > old_end - old_addr) + extent = old_end - old_addr; old_pmd = get_old_pmd(vma->vm_mm, old_addr); if (!old_pmd) continue; new_pmd = alloc_new_pmd(vma->vm_mm, vma, new_addr); if (!new_pmd) break; + if (pmd_trans_huge(*old_pmd)) { + int err = 0; + if (extent == HPAGE_PMD_SIZE) + err = move_huge_pmd(vma, new_vma, old_addr, + new_addr, old_end, + old_pmd, new_pmd); + if (err > 0) { + need_flush = true; + continue; + } else if (!err) { + split_huge_page_pmd(vma->vm_mm, old_pmd); + } + VM_BUG_ON(pmd_trans_huge(*old_pmd)); + } + if (pmd_none(*new_pmd) && __pte_alloc(new_vma->vm_mm, new_vma, + new_pmd, new_addr)) + break; next = (new_addr + PMD_SIZE) & PMD_MASK; if (extent > next - new_addr) extent = next - new_addr; @@ -157,7 +170,12 @@ unsigned long move_page_tables(struct vm_area_struct *vma, extent = LATENCY_LIMIT; move_ptes(vma, old_pmd, old_addr, old_addr + extent, new_vma, new_pmd, new_addr); + need_flush = true; } + if (likely(need_flush)) + flush_tlb_range(vma, old_end-len, old_addr); + + mmu_notifier_invalidate_range_end(vma->vm_mm, old_end-len, old_end); return len + old_addr - old_end; /* how much done */ } diff --git a/mm/oom_kill.c b/mm/oom_kill.c index 626303b52f3c..64bccffb848e 100644 --- a/mm/oom_kill.c +++ b/mm/oom_kill.c @@ -435,7 +435,7 @@ static int oom_kill_task(struct task_struct *p, struct mem_cgroup *mem) task_unlock(p); /* - * Kill all processes sharing p->mm in other thread groups, if any. + * Kill all user processes sharing p->mm in other thread groups, if any. * They don't get access to memory reserves or a higher scheduler * priority, though, to avoid depletion of all memory or task * starvation. This prevents mm->mmap_sem livelock when an oom killed @@ -445,7 +445,8 @@ static int oom_kill_task(struct task_struct *p, struct mem_cgroup *mem) * signal. */ for_each_process(q) - if (q->mm == mm && !same_thread_group(q, p)) { + if (q->mm == mm && !same_thread_group(q, p) && + !(q->flags & PF_KTHREAD)) { task_lock(q); /* Protect ->comm from prctl() */ pr_err("Kill process %d (%s) sharing same memory\n", task_pid_nr(q), q->comm); diff --git a/mm/page-writeback.c b/mm/page-writeback.c index 0e309cd1b5b9..da6d263a8c8c 100644 --- a/mm/page-writeback.c +++ b/mm/page-writeback.c @@ -143,6 +143,66 @@ static struct prop_descriptor vm_completions; static struct prop_descriptor vm_dirties; /* + * Work out the current dirty-memory clamping and background writeout + * thresholds. + * + * The main aim here is to lower them aggressively if there is a lot of mapped + * memory around. To avoid stressing page reclaim with lots of unreclaimable + * pages. It is better to clamp down on writers than to start swapping, and + * performing lots of scanning. + * + * We only allow 1/2 of the currently-unmapped memory to be dirtied. + * + * We don't permit the clamping level to fall below 5% - that is getting rather + * excessive. + * + * We make sure that the background writeout level is below the adjusted + * clamping level. 
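The rewritten move_page_tables() loop above advances in PMD-sized steps, clamping each extent so it neither crosses a PMD boundary nor runs past old_end; as the added comment notes, the clamp stays correct even if the rounded-up "next" wraps. A small stand-alone sketch of that stepping arithmetic, with PMD_SIZE fixed at 2 MiB purely for the example:

/* Walk [addr, addr + len) in steps that never cross a PMD-sized boundary. */
#include <stdint.h>
#include <stdio.h>

#define PMD_SIZE ((uint64_t)2 << 20)
#define PMD_MASK (~(PMD_SIZE - 1))

int main(void)
{
	uint64_t addr = 0x1ff000, end = 0x403000;	/* made-up range */

	while (addr < end) {
		uint64_t next = (addr + PMD_SIZE) & PMD_MASK;
		uint64_t extent = next - addr;		/* ok even if next wrapped */

		if (extent > end - addr)
			extent = end - addr;
		printf("step: %#llx + %#llx\n",
		       (unsigned long long)addr, (unsigned long long)extent);
		addr += extent;
	}
	return 0;
}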
+ */ +static unsigned long highmem_dirtyable_memory(unsigned long total) +{ +#ifdef CONFIG_HIGHMEM + int node; + unsigned long x = 0; + + for_each_node_state(node, N_HIGH_MEMORY) { + struct zone *z = + &NODE_DATA(node)->node_zones[ZONE_HIGHMEM]; + + x += zone_page_state(z, NR_FREE_PAGES) + + zone_reclaimable_pages(z); + } + /* + * Make sure that the number of highmem pages is never larger + * than the number of the total dirtyable memory. This can only + * occur in very strange VM situations but we want to make sure + * that this does not occur. + */ + return min(x, total); +#else + return 0; +#endif +} + +/** + * determine_dirtyable_memory - amount of memory that may be used + * + * Returns the numebr of pages that can currently be freed and used + * by the kernel for direct mappings. + */ +static unsigned long determine_dirtyable_memory(void) +{ + unsigned long x; + + x = global_page_state(NR_FREE_PAGES) + global_reclaimable_pages(); + + if (!vm_highmem_is_dirtyable) + x -= highmem_dirtyable_memory(x); + + return x + 1; /* Ensure that we never return 0 */ +} + +/* * couple the period to the dirty_ratio: * * period/2 ~ roundup_pow_of_two(dirty limit) @@ -208,7 +268,6 @@ int dirty_ratio_handler(struct ctl_table *table, int write, return ret; } - int dirty_bytes_handler(struct ctl_table *table, int write, void __user *buffer, size_t *lenp, loff_t *ppos) @@ -350,67 +409,6 @@ int bdi_set_max_ratio(struct backing_dev_info *bdi, unsigned max_ratio) } EXPORT_SYMBOL(bdi_set_max_ratio); -/* - * Work out the current dirty-memory clamping and background writeout - * thresholds. - * - * The main aim here is to lower them aggressively if there is a lot of mapped - * memory around. To avoid stressing page reclaim with lots of unreclaimable - * pages. It is better to clamp down on writers than to start swapping, and - * performing lots of scanning. - * - * We only allow 1/2 of the currently-unmapped memory to be dirtied. - * - * We don't permit the clamping level to fall below 5% - that is getting rather - * excessive. - * - * We make sure that the background writeout level is below the adjusted - * clamping level. - */ - -static unsigned long highmem_dirtyable_memory(unsigned long total) -{ -#ifdef CONFIG_HIGHMEM - int node; - unsigned long x = 0; - - for_each_node_state(node, N_HIGH_MEMORY) { - struct zone *z = - &NODE_DATA(node)->node_zones[ZONE_HIGHMEM]; - - x += zone_page_state(z, NR_FREE_PAGES) + - zone_reclaimable_pages(z); - } - /* - * Make sure that the number of highmem pages is never larger - * than the number of the total dirtyable memory. This can only - * occur in very strange VM situations but we want to make sure - * that this does not occur. - */ - return min(x, total); -#else - return 0; -#endif -} - -/** - * determine_dirtyable_memory - amount of memory that may be used - * - * Returns the numebr of pages that can currently be freed and used - * by the kernel for direct mappings. 
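determine_dirtyable_memory() above only supplies the base figure (free plus reclaimable pages, optionally minus highmem, never zero); the dirty thresholds are then taken as a fraction of it. A tiny numeric illustration with made-up values, roughly how a vm.dirty_ratio-style percentage would be applied:

/* Illustrative arithmetic only; the page counts are invented. */
#include <stdio.h>

int main(void)
{
	unsigned long free_pages = 200000, reclaimable = 300000;
	unsigned long dirtyable = free_pages + reclaimable + 1;	/* never 0 */
	unsigned int dirty_ratio = 20;				/* like vm.dirty_ratio */

	unsigned long thresh = dirtyable * dirty_ratio / 100;
	printf("dirty threshold: %lu of %lu pages\n", thresh, dirtyable);
	return 0;
}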
- */ -unsigned long determine_dirtyable_memory(void) -{ - unsigned long x; - - x = global_page_state(NR_FREE_PAGES) + global_reclaimable_pages(); - - if (!vm_highmem_is_dirtyable) - x -= highmem_dirtyable_memory(x); - - return x + 1; /* Ensure that we never return 0 */ -} - static unsigned long hard_dirty_limit(unsigned long thresh) { return max(thresh, global_dirty_limit); diff --git a/mm/page_alloc.c b/mm/page_alloc.c index 6e8ecb6e021c..83a02052bce4 100644 --- a/mm/page_alloc.c +++ b/mm/page_alloc.c @@ -318,6 +318,7 @@ static void bad_page(struct page *page) current->comm, page_to_pfn(page)); dump_page(page); + print_modules(); dump_stack(); out: /* Leave bad fields for debug, except PageBuddy could make trouble */ diff --git a/mm/page_cgroup.c b/mm/page_cgroup.c index 39d216d535ea..6bdc67dbbc28 100644 --- a/mm/page_cgroup.c +++ b/mm/page_cgroup.c @@ -513,11 +513,10 @@ int swap_cgroup_swapon(int type, unsigned long max_pages) length = DIV_ROUND_UP(max_pages, SC_PER_PAGE); array_size = length * sizeof(void *); - array = vmalloc(array_size); + array = vzalloc(array_size); if (!array) goto nomem; - memset(array, 0, array_size); ctrl = &swap_cgroup_ctrl[type]; mutex_lock(&swap_cgroup_mutex); ctrl->length = length; diff --git a/mm/process_vm_access.c b/mm/process_vm_access.c new file mode 100644 index 000000000000..e920aa3ce104 --- /dev/null +++ b/mm/process_vm_access.c @@ -0,0 +1,496 @@ +/* + * linux/mm/process_vm_access.c + * + * Copyright (C) 2010-2011 Christopher Yeoh <cyeoh@au1.ibm.com>, IBM Corp. + * + * This program is free software; you can redistribute it and/or + * modify it under the terms of the GNU General Public License + * as published by the Free Software Foundation; either version + * 2 of the License, or (at your option) any later version. 
+ */ + +#include <linux/mm.h> +#include <linux/uio.h> +#include <linux/sched.h> +#include <linux/highmem.h> +#include <linux/ptrace.h> +#include <linux/slab.h> +#include <linux/syscalls.h> + +#ifdef CONFIG_COMPAT +#include <linux/compat.h> +#endif + +/** + * process_vm_rw_pages - read/write pages from task specified + * @task: task to read/write from + * @mm: mm for task + * @process_pages: struct pages area that can store at least + * nr_pages_to_copy struct page pointers + * @pa: address of page in task to start copying from/to + * @start_offset: offset in page to start copying from/to + * @len: number of bytes to copy + * @lvec: iovec array specifying where to copy to/from + * @lvec_cnt: number of elements in iovec array + * @lvec_current: index in iovec array we are up to + * @lvec_offset: offset in bytes from current iovec iov_base we are up to + * @vm_write: 0 means copy from, 1 means copy to + * @nr_pages_to_copy: number of pages to copy + * @bytes_copied: returns number of bytes successfully copied + * Returns 0 on success, error code otherwise + */ +static int process_vm_rw_pages(struct task_struct *task, + struct mm_struct *mm, + struct page **process_pages, + unsigned long pa, + unsigned long start_offset, + unsigned long len, + const struct iovec *lvec, + unsigned long lvec_cnt, + unsigned long *lvec_current, + size_t *lvec_offset, + int vm_write, + unsigned int nr_pages_to_copy, + ssize_t *bytes_copied) +{ + int pages_pinned; + void *target_kaddr; + int pgs_copied = 0; + int j; + int ret; + ssize_t bytes_to_copy; + ssize_t rc = 0; + + *bytes_copied = 0; + + /* Get the pages we're interested in */ + down_read(&mm->mmap_sem); + pages_pinned = get_user_pages(task, mm, pa, + nr_pages_to_copy, + vm_write, 0, process_pages, NULL); + up_read(&mm->mmap_sem); + + if (pages_pinned != nr_pages_to_copy) { + rc = -EFAULT; + goto end; + } + + /* Do the copy for each page */ + for (pgs_copied = 0; + (pgs_copied < nr_pages_to_copy) && (*lvec_current < lvec_cnt); + pgs_copied++) { + /* Make sure we have a non zero length iovec */ + while (*lvec_current < lvec_cnt + && lvec[*lvec_current].iov_len == 0) + (*lvec_current)++; + if (*lvec_current == lvec_cnt) + break; + + /* + * Will copy smallest of: + * - bytes remaining in page + * - bytes remaining in destination iovec + */ + bytes_to_copy = min_t(ssize_t, PAGE_SIZE - start_offset, + len - *bytes_copied); + bytes_to_copy = min_t(ssize_t, bytes_to_copy, + lvec[*lvec_current].iov_len + - *lvec_offset); + + target_kaddr = kmap(process_pages[pgs_copied]) + start_offset; + + if (vm_write) + ret = copy_from_user(target_kaddr, + lvec[*lvec_current].iov_base + + *lvec_offset, + bytes_to_copy); + else + ret = copy_to_user(lvec[*lvec_current].iov_base + + *lvec_offset, + target_kaddr, bytes_to_copy); + kunmap(process_pages[pgs_copied]); + if (ret) { + *bytes_copied += bytes_to_copy - ret; + pgs_copied++; + rc = -EFAULT; + goto end; + } + *bytes_copied += bytes_to_copy; + *lvec_offset += bytes_to_copy; + if (*lvec_offset == lvec[*lvec_current].iov_len) { + /* + * Need to copy remaining part of page into the + * next iovec if there are any bytes left in page + */ + (*lvec_current)++; + *lvec_offset = 0; + start_offset = (start_offset + bytes_to_copy) + % PAGE_SIZE; + if (start_offset) + pgs_copied--; + } else { + start_offset = 0; + } + } + +end: + if (vm_write) { + for (j = 0; j < pages_pinned; j++) { + if (j < pgs_copied) + set_page_dirty_lock(process_pages[j]); + put_page(process_pages[j]); + } + } else { + for (j = 0; j < pages_pinned; j++) + 
put_page(process_pages[j]); + } + + return rc; +} + +/* Maximum number of pages kmalloc'd to hold struct page's during copy */ +#define PVM_MAX_KMALLOC_PAGES (PAGE_SIZE * 2) + +/** + * process_vm_rw_single_vec - read/write pages from task specified + * @addr: start memory address of target process + * @len: size of area to copy to/from + * @lvec: iovec array specifying where to copy to/from locally + * @lvec_cnt: number of elements in iovec array + * @lvec_current: index in iovec array we are up to + * @lvec_offset: offset in bytes from current iovec iov_base we are up to + * @process_pages: struct pages area that can store at least + * nr_pages_to_copy struct page pointers + * @mm: mm for task + * @task: task to read/write from + * @vm_write: 0 means copy from, 1 means copy to + * @bytes_copied: returns number of bytes successfully copied + * Returns 0 on success or on failure error code + */ +static int process_vm_rw_single_vec(unsigned long addr, + unsigned long len, + const struct iovec *lvec, + unsigned long lvec_cnt, + unsigned long *lvec_current, + size_t *lvec_offset, + struct page **process_pages, + struct mm_struct *mm, + struct task_struct *task, + int vm_write, + ssize_t *bytes_copied) +{ + unsigned long pa = addr & PAGE_MASK; + unsigned long start_offset = addr - pa; + unsigned long nr_pages; + ssize_t bytes_copied_loop; + ssize_t rc = 0; + unsigned long nr_pages_copied = 0; + unsigned long nr_pages_to_copy; + unsigned long max_pages_per_loop = PVM_MAX_KMALLOC_PAGES + / sizeof(struct pages *); + + *bytes_copied = 0; + + /* Work out address and page range required */ + if (len == 0) + return 0; + nr_pages = (addr + len - 1) / PAGE_SIZE - addr / PAGE_SIZE + 1; + + while ((nr_pages_copied < nr_pages) && (*lvec_current < lvec_cnt)) { + nr_pages_to_copy = min(nr_pages - nr_pages_copied, + max_pages_per_loop); + + rc = process_vm_rw_pages(task, mm, process_pages, pa, + start_offset, len, + lvec, lvec_cnt, + lvec_current, lvec_offset, + vm_write, nr_pages_to_copy, + &bytes_copied_loop); + start_offset = 0; + *bytes_copied += bytes_copied_loop; + + if (rc < 0) { + return rc; + } else { + len -= bytes_copied_loop; + nr_pages_copied += nr_pages_to_copy; + pa += nr_pages_to_copy * PAGE_SIZE; + } + } + + return rc; +} + +/* Maximum number of entries for process pages array + which lives on stack */ +#define PVM_MAX_PP_ARRAY_COUNT 16 + +/** + * process_vm_rw_core - core of reading/writing pages from task specified + * @pid: PID of process to read/write from/to + * @lvec: iovec array specifying where to copy to/from locally + * @liovcnt: size of lvec array + * @rvec: iovec array specifying where to copy to/from in the other process + * @riovcnt: size of rvec array + * @flags: currently unused + * @vm_write: 0 if reading from other process, 1 if writing to other process + * Returns the number of bytes read/written or error code. May + * return less bytes than expected if an error occurs during the copying + * process. 
+ */ +static ssize_t process_vm_rw_core(pid_t pid, const struct iovec *lvec, + unsigned long liovcnt, + const struct iovec *rvec, + unsigned long riovcnt, + unsigned long flags, int vm_write) +{ + struct task_struct *task; + struct page *pp_stack[PVM_MAX_PP_ARRAY_COUNT]; + struct page **process_pages = pp_stack; + struct mm_struct *mm; + unsigned long i; + ssize_t rc = 0; + ssize_t bytes_copied_loop; + ssize_t bytes_copied = 0; + unsigned long nr_pages = 0; + unsigned long nr_pages_iov; + unsigned long iov_l_curr_idx = 0; + size_t iov_l_curr_offset = 0; + ssize_t iov_len; + + /* + * Work out how many pages of struct pages we're going to need + * when eventually calling get_user_pages + */ + for (i = 0; i < riovcnt; i++) { + iov_len = rvec[i].iov_len; + if (iov_len > 0) { + nr_pages_iov = ((unsigned long)rvec[i].iov_base + + iov_len) + / PAGE_SIZE - (unsigned long)rvec[i].iov_base + / PAGE_SIZE + 1; + nr_pages = max(nr_pages, nr_pages_iov); + } + } + + if (nr_pages == 0) + return 0; + + if (nr_pages > PVM_MAX_PP_ARRAY_COUNT) { + /* For reliability don't try to kmalloc more than + 2 pages worth */ + process_pages = kmalloc(min_t(size_t, PVM_MAX_KMALLOC_PAGES, + sizeof(struct pages *)*nr_pages), + GFP_KERNEL); + + if (!process_pages) + return -ENOMEM; + } + + /* Get process information */ + rcu_read_lock(); + task = find_task_by_vpid(pid); + if (task) + get_task_struct(task); + rcu_read_unlock(); + if (!task) { + rc = -ESRCH; + goto free_proc_pages; + } + + task_lock(task); + if (__ptrace_may_access(task, PTRACE_MODE_ATTACH)) { + task_unlock(task); + rc = -EPERM; + goto put_task_struct; + } + mm = task->mm; + + if (!mm || (task->flags & PF_KTHREAD)) { + task_unlock(task); + rc = -EINVAL; + goto put_task_struct; + } + + atomic_inc(&mm->mm_users); + task_unlock(task); + + for (i = 0; i < riovcnt && iov_l_curr_idx < liovcnt; i++) { + rc = process_vm_rw_single_vec( + (unsigned long)rvec[i].iov_base, rvec[i].iov_len, + lvec, liovcnt, &iov_l_curr_idx, &iov_l_curr_offset, + process_pages, mm, task, vm_write, &bytes_copied_loop); + bytes_copied += bytes_copied_loop; + if (rc != 0) { + /* If we have managed to copy any data at all then + we return the number of bytes copied. Otherwise + we return the error code */ + if (bytes_copied) + rc = bytes_copied; + goto put_mm; + } + } + + rc = bytes_copied; +put_mm: + mmput(mm); + +put_task_struct: + put_task_struct(task); + +free_proc_pages: + if (process_pages != pp_stack) + kfree(process_pages); + return rc; +} + +/** + * process_vm_rw - check iovecs before calling core routine + * @pid: PID of process to read/write from/to + * @lvec: iovec array specifying where to copy to/from locally + * @liovcnt: size of lvec array + * @rvec: iovec array specifying where to copy to/from in the other process + * @riovcnt: size of rvec array + * @flags: currently unused + * @vm_write: 0 if reading from other process, 1 if writing to other process + * Returns the number of bytes read/written or error code. May + * return less bytes than expected if an error occurs during the copying + * process. 
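process_vm_rw_single_vec() above sizes its get_user_pages() batches from the number of pages the remote range touches. The same page-span arithmetic as a stand-alone check, with PAGE_SIZE hard-coded to 4 KiB for the example:

/* Number of pages a buffer [addr, addr+len) touches. */
#include <stdio.h>

#define PAGE_SIZE 4096UL

static unsigned long pages_spanned(unsigned long addr, unsigned long len)
{
	if (len == 0)
		return 0;
	return (addr + len - 1) / PAGE_SIZE - addr / PAGE_SIZE + 1;
}

int main(void)
{
	printf("%lu\n", pages_spanned(0x1000, 4096));	/* 1: exactly one page */
	printf("%lu\n", pages_spanned(0x1ffc, 8));	/* 2: straddles a page */
	return 0;
}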
+ */ +static ssize_t process_vm_rw(pid_t pid, + const struct iovec __user *lvec, + unsigned long liovcnt, + const struct iovec __user *rvec, + unsigned long riovcnt, + unsigned long flags, int vm_write) +{ + struct iovec iovstack_l[UIO_FASTIOV]; + struct iovec iovstack_r[UIO_FASTIOV]; + struct iovec *iov_l = iovstack_l; + struct iovec *iov_r = iovstack_r; + ssize_t rc; + + if (flags != 0) + return -EINVAL; + + /* Check iovecs */ + if (vm_write) + rc = rw_copy_check_uvector(WRITE, lvec, liovcnt, UIO_FASTIOV, + iovstack_l, &iov_l, 1); + else + rc = rw_copy_check_uvector(READ, lvec, liovcnt, UIO_FASTIOV, + iovstack_l, &iov_l, 1); + if (rc <= 0) + goto free_iovecs; + + rc = rw_copy_check_uvector(READ, rvec, riovcnt, UIO_FASTIOV, + iovstack_r, &iov_r, 0); + if (rc <= 0) + goto free_iovecs; + + rc = process_vm_rw_core(pid, iov_l, liovcnt, iov_r, riovcnt, flags, + vm_write); + +free_iovecs: + if (iov_r != iovstack_r) + kfree(iov_r); + if (iov_l != iovstack_l) + kfree(iov_l); + + return rc; +} + +SYSCALL_DEFINE6(process_vm_readv, pid_t, pid, const struct iovec __user *, lvec, + unsigned long, liovcnt, const struct iovec __user *, rvec, + unsigned long, riovcnt, unsigned long, flags) +{ + return process_vm_rw(pid, lvec, liovcnt, rvec, riovcnt, flags, 0); +} + +SYSCALL_DEFINE6(process_vm_writev, pid_t, pid, + const struct iovec __user *, lvec, + unsigned long, liovcnt, const struct iovec __user *, rvec, + unsigned long, riovcnt, unsigned long, flags) +{ + return process_vm_rw(pid, lvec, liovcnt, rvec, riovcnt, flags, 1); +} + +#ifdef CONFIG_COMPAT + +asmlinkage ssize_t +compat_process_vm_rw(compat_pid_t pid, + const struct compat_iovec __user *lvec, + unsigned long liovcnt, + const struct compat_iovec __user *rvec, + unsigned long riovcnt, + unsigned long flags, int vm_write) +{ + struct iovec iovstack_l[UIO_FASTIOV]; + struct iovec iovstack_r[UIO_FASTIOV]; + struct iovec *iov_l = iovstack_l; + struct iovec *iov_r = iovstack_r; + ssize_t rc = -EFAULT; + + if (flags != 0) + return -EINVAL; + + if (!access_ok(VERIFY_READ, lvec, liovcnt * sizeof(*lvec))) + goto out; + + if (!access_ok(VERIFY_READ, rvec, riovcnt * sizeof(*rvec))) + goto out; + + if (vm_write) + rc = compat_rw_copy_check_uvector(WRITE, lvec, liovcnt, + UIO_FASTIOV, iovstack_l, + &iov_l, 1); + else + rc = compat_rw_copy_check_uvector(READ, lvec, liovcnt, + UIO_FASTIOV, iovstack_l, + &iov_l, 1); + if (rc <= 0) + goto free_iovecs; + rc = compat_rw_copy_check_uvector(READ, rvec, riovcnt, + UIO_FASTIOV, iovstack_r, + &iov_r, 0); + if (rc <= 0) + goto free_iovecs; + + rc = process_vm_rw_core(pid, iov_l, liovcnt, iov_r, riovcnt, flags, + vm_write); + +free_iovecs: + if (iov_r != iovstack_r) + kfree(iov_r); + if (iov_l != iovstack_l) + kfree(iov_l); + +out: + return rc; +} + +asmlinkage ssize_t +compat_sys_process_vm_readv(compat_pid_t pid, + const struct compat_iovec __user *lvec, + unsigned long liovcnt, + const struct compat_iovec __user *rvec, + unsigned long riovcnt, + unsigned long flags) +{ + return compat_process_vm_rw(pid, lvec, liovcnt, rvec, + riovcnt, flags, 0); +} + +asmlinkage ssize_t +compat_sys_process_vm_writev(compat_pid_t pid, + const struct compat_iovec __user *lvec, + unsigned long liovcnt, + const struct compat_iovec __user *rvec, + unsigned long riovcnt, + unsigned long flags) +{ + return compat_process_vm_rw(pid, lvec, liovcnt, rvec, + riovcnt, flags, 1); +} + +#endif diff --git a/mm/slab.c b/mm/slab.c index 6d90a091fdca..10a3624d8a13 100644 --- a/mm/slab.c +++ b/mm/slab.c @@ -1927,8 +1927,8 @@ static void 
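From user space the new syscalls behave like a cross-process readv/writev. A minimal self-test that reads from the caller's own address space, assuming a libc new enough to provide the process_vm_readv() wrapper (otherwise syscall(2) with the raw syscall number is needed):

/* Self-contained demo: copy a buffer out of our own pid via process_vm_readv(). */
#define _GNU_SOURCE
#include <sys/uio.h>
#include <stdio.h>
#include <unistd.h>

int main(void)
{
	char src[] = "hello from the 'remote' process";
	char dst[sizeof(src)];
	struct iovec local  = { .iov_base = dst, .iov_len = sizeof(dst) };
	struct iovec remote = { .iov_base = src, .iov_len = sizeof(src) };

	/* Reading from our own pid keeps the example self-contained. */
	ssize_t n = process_vm_readv(getpid(), &local, 1, &remote, 1, 0);
	if (n < 0) {
		perror("process_vm_readv");
		return 1;
	}
	printf("copied %zd bytes: %s\n", n, dst);
	return 0;
}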
check_poison_obj(struct kmem_cache *cachep, void *objp) /* Print header */ if (lines == 0) { printk(KERN_ERR - "Slab corruption: %s start=%p, len=%d\n", - cachep->name, realobj, size); + "Slab corruption (%s): %s start=%p, len=%d\n", + print_tainted(), cachep->name, realobj, size); print_objinfo(cachep, objp, 0); } /* Hexdump the affected line */ @@ -3037,8 +3037,8 @@ static void check_slabp(struct kmem_cache *cachep, struct slab *slabp) if (entries != cachep->num - slabp->inuse) { bad: printk(KERN_ERR "slab: Internal list corruption detected in " - "cache '%s'(%d), slabp %p(%d). Hexdump:\n", - cachep->name, cachep->num, slabp, slabp->inuse); + "cache '%s'(%d), slabp %p(%d). Tainted(%s). Hexdump:\n", + cachep->name, cachep->num, slabp, slabp->inuse, print_tainted()); for (i = 0; i < sizeof(*slabp) + cachep->num * sizeof(kmem_bufctl_t); i++) { diff --git a/mm/slub.c b/mm/slub.c index 537cec14cd2d..acdbe862ee00 100644 --- a/mm/slub.c +++ b/mm/slub.c @@ -596,7 +596,7 @@ static void slab_bug(struct kmem_cache *s, char *fmt, ...) va_end(args); printk(KERN_ERR "========================================" "=====================================\n"); - printk(KERN_ERR "BUG %s: %s\n", s->name, buf); + printk(KERN_ERR "BUG %s (%s): %s\n", s->name, print_tainted(), buf); printk(KERN_ERR "----------------------------------------" "-------------------------------------\n\n"); } @@ -681,49 +681,6 @@ static void init_object(struct kmem_cache *s, void *object, u8 val) memset(p + s->objsize, val, s->inuse - s->objsize); } -static u8 *check_bytes8(u8 *start, u8 value, unsigned int bytes) -{ - while (bytes) { - if (*start != value) - return start; - start++; - bytes--; - } - return NULL; -} - -static u8 *check_bytes(u8 *start, u8 value, unsigned int bytes) -{ - u64 value64; - unsigned int words, prefix; - - if (bytes <= 16) - return check_bytes8(start, value, bytes); - - value64 = value | value << 8 | value << 16 | value << 24; - value64 = (value64 & 0xffffffff) | value64 << 32; - prefix = 8 - ((unsigned long)start) % 8; - - if (prefix) { - u8 *r = check_bytes8(start, value, prefix); - if (r) - return r; - start += prefix; - bytes -= prefix; - } - - words = bytes / 8; - - while (words) { - if (*(u64 *)start != value64) - return check_bytes8(start, value, 8); - start += 8; - words--; - } - - return check_bytes8(start, value, bytes % 8); -} - static void restore_bytes(struct kmem_cache *s, char *message, u8 data, void *from, void *to) { @@ -738,7 +695,7 @@ static int check_bytes_and_report(struct kmem_cache *s, struct page *page, u8 *fault; u8 *end; - fault = check_bytes(start, value, bytes); + fault = memchr_inv(start, value, bytes); if (!fault) return 1; @@ -831,7 +788,7 @@ static int slab_pad_check(struct kmem_cache *s, struct page *page) if (!remainder) return 1; - fault = check_bytes(end - remainder, POISON_INUSE, remainder); + fault = memchr_inv(end - remainder, POISON_INUSE, remainder); if (!fault) return 1; while (end > fault && end[-1] == POISON_INUSE) diff --git a/mm/vmalloc.c b/mm/vmalloc.c index 7ef0903058ee..0aca3ce85d95 100644 --- a/mm/vmalloc.c +++ b/mm/vmalloc.c @@ -1253,18 +1253,22 @@ EXPORT_SYMBOL_GPL(map_vm_area); DEFINE_RWLOCK(vmlist_lock); struct vm_struct *vmlist; -static void insert_vmalloc_vm(struct vm_struct *vm, struct vmap_area *va, +static void setup_vmalloc_vm(struct vm_struct *vm, struct vmap_area *va, unsigned long flags, void *caller) { - struct vm_struct *tmp, **p; - vm->flags = flags; vm->addr = (void *)va->va_start; vm->size = va->va_end - va->va_start; vm->caller = caller; 
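check_bytes()/check_bytes8() are dropped from slub.c above in favour of the shared memchr_inv() helper, which returns the address of the first byte that differs from the expected value, or NULL when the whole range matches. A plain byte-at-a-time sketch of those semantics (the shared helper does the same thing word-at-a-time):

/* First byte in the range that is NOT equal to 'value', or NULL if all match. */
#include <stddef.h>

static void *memchr_inv_sketch(const void *start, int value, size_t bytes)
{
	const unsigned char *p = start;

	while (bytes--) {
		if (*p != (unsigned char)value)
			return (void *)p;
		p++;
	}
	return NULL;
}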
va->private = vm; va->flags |= VM_VM_AREA; +} + +static void insert_vmalloc_vmlist(struct vm_struct *vm) +{ + struct vm_struct *tmp, **p; + vm->flags &= ~VM_UNLIST; write_lock(&vmlist_lock); for (p = &vmlist; (tmp = *p) != NULL; p = &tmp->next) { if (tmp->addr >= vm->addr) @@ -1275,6 +1279,13 @@ static void insert_vmalloc_vm(struct vm_struct *vm, struct vmap_area *va, write_unlock(&vmlist_lock); } +static void insert_vmalloc_vm(struct vm_struct *vm, struct vmap_area *va, + unsigned long flags, void *caller) +{ + setup_vmalloc_vm(vm, va, flags, caller); + insert_vmalloc_vmlist(vm); +} + static struct vm_struct *__get_vm_area_node(unsigned long size, unsigned long align, unsigned long flags, unsigned long start, unsigned long end, int node, gfp_t gfp_mask, void *caller) @@ -1313,7 +1324,18 @@ static struct vm_struct *__get_vm_area_node(unsigned long size, return NULL; } - insert_vmalloc_vm(area, va, flags, caller); + /* + * When this function is called from __vmalloc_node_range, + * we do not add vm_struct to vmlist here to avoid + * accessing uninitialized members of vm_struct such as + * pages and nr_pages fields. They will be set later. + * To distinguish it from others, we use a VM_UNLIST flag. + */ + if (flags & VM_UNLIST) + setup_vmalloc_vm(area, va, flags, caller); + else + insert_vmalloc_vm(area, va, flags, caller); + return area; } @@ -1381,17 +1403,20 @@ struct vm_struct *remove_vm_area(const void *addr) va = find_vmap_area((unsigned long)addr); if (va && va->flags & VM_VM_AREA) { struct vm_struct *vm = va->private; - struct vm_struct *tmp, **p; - /* - * remove from list and disallow access to this vm_struct - * before unmap. (address range confliction is maintained by - * vmap.) - */ - write_lock(&vmlist_lock); - for (p = &vmlist; (tmp = *p) != vm; p = &tmp->next) - ; - *p = tmp->next; - write_unlock(&vmlist_lock); + + if (!(vm->flags & VM_UNLIST)) { + struct vm_struct *tmp, **p; + /* + * remove from list and disallow access to + * this vm_struct before unmap. (address range + * confliction is maintained by vmap.) + */ + write_lock(&vmlist_lock); + for (p = &vmlist; (tmp = *p) != vm; p = &tmp->next) + ; + *p = tmp->next; + write_unlock(&vmlist_lock); + } vmap_debug_free_range(va->va_start, va->va_end); free_unmap_vmap_area(va); @@ -1602,8 +1627,8 @@ void *__vmalloc_node_range(unsigned long size, unsigned long align, if (!size || (size >> PAGE_SHIFT) > totalram_pages) return NULL; - area = __get_vm_area_node(size, align, VM_ALLOC, start, end, node, - gfp_mask, caller); + area = __get_vm_area_node(size, align, VM_ALLOC | VM_UNLIST, + start, end, node, gfp_mask, caller); if (!area) return NULL; @@ -1611,6 +1636,12 @@ void *__vmalloc_node_range(unsigned long size, unsigned long align, addr = __vmalloc_area_node(area, gfp_mask, prot, node, caller); /* + * In this function, newly allocated vm_struct is not added + * to vmlist at __get_vm_area_node(). so, it is added here. + */ + insert_vmalloc_vmlist(area); + + /* * A ref_count = 3 is needed because the vm_struct and vmap_area * structures allocated in the __get_vm_area_node() function contain * references to the virtual address of the vmalloc'ed block. 
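The setup_vmalloc_vm()/insert_vmalloc_vmlist() split above exists so that __vmalloc_node_range() can finish filling in the vm_struct (its pages and nr_pages fields) before it becomes reachable through the global vmlist. The general initialize-then-publish shape, sketched in user-space C with invented names:

/* Finish all setup first, then make the object reachable under the lock. */
#include <pthread.h>
#include <stdlib.h>

struct item {
	int ready_data;
	struct item *next;
};

static struct item *list_head;
static pthread_mutex_t list_lock = PTHREAD_MUTEX_INITIALIZER;

static struct item *item_create(int data)
{
	struct item *it = calloc(1, sizeof(*it));

	if (!it)
		return NULL;
	it->ready_data = data;		/* fully initialize ... */

	pthread_mutex_lock(&list_lock);	/* ... only then publish */
	it->next = list_head;
	list_head = it;
	pthread_mutex_unlock(&list_lock);
	return it;
}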
diff --git a/mm/vmscan.c b/mm/vmscan.c index 510488f19c13..668f7ec4b839 100644 --- a/mm/vmscan.c +++ b/mm/vmscan.c @@ -184,7 +184,7 @@ static unsigned long zone_nr_lru_pages(struct zone *zone, */ void register_shrinker(struct shrinker *shrinker) { - shrinker->nr = 0; + atomic_long_set(&shrinker->nr_in_batch, 0); down_write(&shrinker_rwsem); list_add_tail(&shrinker->list, &shrinker_list); up_write(&shrinker_rwsem); @@ -248,25 +248,26 @@ unsigned long shrink_slab(struct shrink_control *shrink, list_for_each_entry(shrinker, &shrinker_list, list) { unsigned long long delta; - unsigned long total_scan; - unsigned long max_pass; + long total_scan; + long max_pass; int shrink_ret = 0; long nr; long new_nr; long batch_size = shrinker->batch ? shrinker->batch : SHRINK_BATCH; + max_pass = do_shrinker_shrink(shrinker, shrink, 0); + if (max_pass <= 0) + continue; + /* * copy the current shrinker scan count into a local variable * and zero it so that other concurrent shrinker invocations * don't also do this scanning work. */ - do { - nr = shrinker->nr; - } while (cmpxchg(&shrinker->nr, nr, 0) != nr); + nr = atomic_long_xchg(&shrinker->nr_in_batch, 0); total_scan = nr; - max_pass = do_shrinker_shrink(shrinker, shrink, 0); delta = (4 * nr_pages_scanned) / shrinker->seeks; delta *= max_pass; do_div(delta, lru_pages + 1); @@ -326,12 +327,11 @@ unsigned long shrink_slab(struct shrink_control *shrink, * manner that handles concurrent updates. If we exhausted the * scan, there is no need to do an update. */ - do { - nr = shrinker->nr; - new_nr = total_scan + nr; - if (total_scan <= 0) - break; - } while (cmpxchg(&shrinker->nr, nr, new_nr) != nr); + if (total_scan > 0) + new_nr = atomic_long_add_return(total_scan, + &shrinker->nr_in_batch); + else + new_nr = atomic_long_read(&shrinker->nr_in_batch); trace_mm_shrink_slab_end(shrinker, shrink_ret, nr, new_nr); } @@ -496,15 +496,6 @@ static pageout_t pageout(struct page *page, struct address_space *mapping, return PAGE_ACTIVATE; } - /* - * Wait on writeback if requested to. This happens when - * direct reclaiming a large contiguous area and the - * first attempt to free a range of pages fails. - */ - if (PageWriteback(page) && - (sc->reclaim_mode & RECLAIM_MODE_SYNC)) - wait_on_page_writeback(page); - if (!PageWriteback(page)) { /* synchronous write or broken a_ops? */ ClearPageReclaim(page); @@ -724,7 +715,13 @@ static enum page_references page_check_references(struct page *page, */ SetPageReferenced(page); - if (referenced_page) + if (referenced_page || referenced_ptes > 1) + return PAGEREF_ACTIVATE; + + /* + * Activate file-backed executable pages after first usage. 
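In the shrink_slab() changes above, the open-coded cmpxchg loops on shrinker->nr become a single atomic exchange that drains the batch counter, plus an add_return that hands back whatever was not scanned. Roughly the same pattern with C11 atomics; the names are invented:

/* Drain a shared counter with xchg; return leftovers with fetch_add. */
#include <stdatomic.h>

static atomic_long nr_in_batch;

static long drain_batch(void)
{
	/* take the whole pending count so concurrent callers see zero */
	return atomic_exchange(&nr_in_batch, 0);
}

static long return_unscanned(long leftover)
{
	/* hand back what we did not process and report the new total */
	return atomic_fetch_add(&nr_in_batch, leftover) + leftover;
}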
+ */ + if (vm_flags & VM_EXEC) return PAGEREF_ACTIVATE; return PAGEREF_KEEP; @@ -760,7 +757,10 @@ static noinline_for_stack void free_page_list(struct list_head *free_pages) */ static unsigned long shrink_page_list(struct list_head *page_list, struct zone *zone, - struct scan_control *sc) + struct scan_control *sc, + int priority, + unsigned long *ret_nr_dirty, + unsigned long *ret_nr_writeback) { LIST_HEAD(ret_pages); LIST_HEAD(free_pages); @@ -768,6 +768,7 @@ static unsigned long shrink_page_list(struct list_head *page_list, unsigned long nr_dirty = 0; unsigned long nr_congested = 0; unsigned long nr_reclaimed = 0; + unsigned long nr_writeback = 0; cond_resched(); @@ -804,13 +805,12 @@ static unsigned long shrink_page_list(struct list_head *page_list, (PageSwapCache(page) && (sc->gfp_mask & __GFP_IO)); if (PageWriteback(page)) { + nr_writeback++; /* - * Synchronous reclaim is performed in two passes, - * first an asynchronous pass over the list to - * start parallel writeback, and a second synchronous - * pass to wait for the IO to complete. Wait here - * for any page for which writeback has already - * started. + * Synchronous reclaim cannot queue pages for + * writeback due to the possibility of stack overflow + * but if it encounters a page under writeback, wait + * for the IO to complete. */ if ((sc->reclaim_mode & RECLAIM_MODE_SYNC) && may_enter_fs) @@ -866,6 +866,25 @@ static unsigned long shrink_page_list(struct list_head *page_list, if (PageDirty(page)) { nr_dirty++; + /* + * Only kswapd can writeback filesystem pages to + * avoid risk of stack overflow but do not writeback + * unless under significant pressure. + */ + if (page_is_file_cache(page) && + (!current_is_kswapd() || priority >= DEF_PRIORITY - 2)) { + /* + * Immediately reclaim when written back. + * Similar in principal to deactivate_page() + * except we already have the page isolated + * and know it's dirty + */ + inc_zone_page_state(page, NR_VMSCAN_IMMEDIATE); + SetPageReclaim(page); + + goto keep_locked; + } + if (references == PAGEREF_RECLAIM_CLEAN) goto keep_locked; if (!may_enter_fs) @@ -1000,6 +1019,8 @@ keep_lumpy: list_splice(&ret_pages, page_list); count_vm_events(PGACTIVATE, pgactivate); + *ret_nr_dirty += nr_dirty; + *ret_nr_writeback += nr_writeback; return nr_reclaimed; } @@ -1013,23 +1034,27 @@ keep_lumpy: * * returns 0 on success, -ve errno on failure. */ -int __isolate_lru_page(struct page *page, int mode, int file) +int __isolate_lru_page(struct page *page, isolate_mode_t mode, int file) { + bool all_lru_mode; int ret = -EINVAL; /* Only take pages on the LRU. */ if (!PageLRU(page)) return ret; + all_lru_mode = (mode & (ISOLATE_ACTIVE|ISOLATE_INACTIVE)) == + (ISOLATE_ACTIVE|ISOLATE_INACTIVE); + /* * When checking the active state, we need to be sure we are * dealing with comparible boolean values. Take the logical not * of each. 
*/ - if (mode != ISOLATE_BOTH && (!PageActive(page) != !mode)) + if (!all_lru_mode && !PageActive(page) != !(mode & ISOLATE_ACTIVE)) return ret; - if (mode != ISOLATE_BOTH && page_is_file_cache(page) != file) + if (!all_lru_mode && !!page_is_file_cache(page) != file) return ret; /* @@ -1042,6 +1067,12 @@ int __isolate_lru_page(struct page *page, int mode, int file) ret = -EBUSY; + if ((mode & ISOLATE_CLEAN) && (PageDirty(page) || PageWriteback(page))) + return ret; + + if ((mode & ISOLATE_UNMAPPED) && page_mapped(page)) + return ret; + if (likely(get_page_unless_zero(page))) { /* * Be careful not to clear PageLRU until after we're @@ -1077,7 +1108,8 @@ int __isolate_lru_page(struct page *page, int mode, int file) */ static unsigned long isolate_lru_pages(unsigned long nr_to_scan, struct list_head *src, struct list_head *dst, - unsigned long *scanned, int order, int mode, int file) + unsigned long *scanned, int order, isolate_mode_t mode, + int file) { unsigned long nr_taken = 0; unsigned long nr_lumpy_taken = 0; @@ -1202,8 +1234,8 @@ static unsigned long isolate_lru_pages(unsigned long nr_to_scan, static unsigned long isolate_pages_global(unsigned long nr, struct list_head *dst, unsigned long *scanned, int order, - int mode, struct zone *z, - int active, int file) + isolate_mode_t mode, + struct zone *z, int active, int file) { int lru = LRU_BASE; if (active) @@ -1401,7 +1433,7 @@ static noinline_for_stack void update_isolated_counts(struct zone *zone, } /* - * Returns true if the caller should wait to clean dirty/writeback pages. + * Returns true if a direct reclaim should wait on pages under writeback. * * If we are direct reclaiming for contiguous pages and we do not reclaim * everything in the list, try again and wait for writeback IO to complete. @@ -1455,6 +1487,9 @@ shrink_inactive_list(unsigned long nr_to_scan, struct zone *zone, unsigned long nr_taken; unsigned long nr_anon; unsigned long nr_file; + unsigned long nr_dirty = 0; + unsigned long nr_writeback = 0; + isolate_mode_t reclaim_mode = ISOLATE_INACTIVE; while (unlikely(too_many_isolated(zone, file, sc))) { congestion_wait(BLK_RW_ASYNC, HZ/10); @@ -1465,15 +1500,21 @@ shrink_inactive_list(unsigned long nr_to_scan, struct zone *zone, } set_reclaim_mode(priority, sc, false); + if (sc->reclaim_mode & RECLAIM_MODE_LUMPYRECLAIM) + reclaim_mode |= ISOLATE_ACTIVE; + lru_add_drain(); + + if (!sc->may_unmap) + reclaim_mode |= ISOLATE_UNMAPPED; + if (!sc->may_writepage) + reclaim_mode |= ISOLATE_CLEAN; + spin_lock_irq(&zone->lru_lock); if (scanning_global_lru(sc)) { - nr_taken = isolate_pages_global(nr_to_scan, - &page_list, &nr_scanned, sc->order, - sc->reclaim_mode & RECLAIM_MODE_LUMPYRECLAIM ? - ISOLATE_BOTH : ISOLATE_INACTIVE, - zone, 0, file); + nr_taken = isolate_pages_global(nr_to_scan, &page_list, + &nr_scanned, sc->order, reclaim_mode, zone, 0, file); zone->pages_scanned += nr_scanned; if (current_is_kswapd()) __count_zone_vm_events(PGSCAN_KSWAPD, zone, @@ -1482,12 +1523,9 @@ shrink_inactive_list(unsigned long nr_to_scan, struct zone *zone, __count_zone_vm_events(PGSCAN_DIRECT, zone, nr_scanned); } else { - nr_taken = mem_cgroup_isolate_pages(nr_to_scan, - &page_list, &nr_scanned, sc->order, - sc->reclaim_mode & RECLAIM_MODE_LUMPYRECLAIM ? - ISOLATE_BOTH : ISOLATE_INACTIVE, - zone, sc->mem_cgroup, - 0, file); + nr_taken = mem_cgroup_isolate_pages(nr_to_scan, &page_list, + &nr_scanned, sc->order, reclaim_mode, zone, + sc->mem_cgroup, 0, file); /* * mem_cgroup_isolate_pages() keeps track of * scanned pages on its own. 
@@ -1503,12 +1541,14 @@ shrink_inactive_list(unsigned long nr_to_scan, struct zone *zone, spin_unlock_irq(&zone->lru_lock); - nr_reclaimed = shrink_page_list(&page_list, zone, sc); + nr_reclaimed = shrink_page_list(&page_list, zone, sc, priority, + &nr_dirty, &nr_writeback); /* Check if we should syncronously wait for writeback */ if (should_reclaim_stall(nr_taken, nr_reclaimed, priority, sc)) { set_reclaim_mode(priority, sc, true); - nr_reclaimed += shrink_page_list(&page_list, zone, sc); + nr_reclaimed += shrink_page_list(&page_list, zone, sc, + priority, &nr_dirty, &nr_writeback); } if (!scanning_global_lru(sc)) @@ -1521,6 +1561,16 @@ shrink_inactive_list(unsigned long nr_to_scan, struct zone *zone, putback_lru_pages(zone, sc, nr_anon, nr_file, &page_list); + /* + * If we have encountered a high number of dirty pages under writeback + * then we are reaching the end of the LRU too quickly and global + * limits are not enough to throttle processes due to the page + * distribution throughout zones. Scale the number of dirty pages that + * must be under writeback before being throttled to priority. + */ + if (nr_writeback && nr_writeback >= (nr_taken >> (DEF_PRIORITY-priority))) + wait_iff_congested(zone, BLK_RW_ASYNC, HZ/10); + trace_mm_vmscan_lru_shrink_inactive(zone->zone_pgdat->node_id, zone_idx(zone), nr_scanned, nr_reclaimed, @@ -1592,19 +1642,26 @@ static void shrink_active_list(unsigned long nr_pages, struct zone *zone, struct page *page; struct zone_reclaim_stat *reclaim_stat = get_reclaim_stat(zone, sc); unsigned long nr_rotated = 0; + isolate_mode_t reclaim_mode = ISOLATE_ACTIVE; lru_add_drain(); + + if (!sc->may_unmap) + reclaim_mode |= ISOLATE_UNMAPPED; + if (!sc->may_writepage) + reclaim_mode |= ISOLATE_CLEAN; + spin_lock_irq(&zone->lru_lock); if (scanning_global_lru(sc)) { nr_taken = isolate_pages_global(nr_pages, &l_hold, &pgscanned, sc->order, - ISOLATE_ACTIVE, zone, + reclaim_mode, zone, 1, file); zone->pages_scanned += pgscanned; } else { nr_taken = mem_cgroup_isolate_pages(nr_pages, &l_hold, &pgscanned, sc->order, - ISOLATE_ACTIVE, zone, + reclaim_mode, zone, sc->mem_cgroup, 1, file); /* * mem_cgroup_isolate_pages() keeps track of @@ -1808,23 +1865,22 @@ static void get_scan_count(struct zone *zone, struct scan_control *sc, u64 fraction[2], denominator; enum lru_list l; int noswap = 0; - int force_scan = 0; - unsigned long nr_force_scan[2]; - - - anon = zone_nr_lru_pages(zone, sc, LRU_ACTIVE_ANON) + - zone_nr_lru_pages(zone, sc, LRU_INACTIVE_ANON); - file = zone_nr_lru_pages(zone, sc, LRU_ACTIVE_FILE) + - zone_nr_lru_pages(zone, sc, LRU_INACTIVE_FILE); + bool force_scan = false; - if (((anon + file) >> priority) < SWAP_CLUSTER_MAX) { - /* kswapd does zone balancing and need to scan this zone */ - if (scanning_global_lru(sc) && current_is_kswapd()) - force_scan = 1; - /* memcg may have small limit and need to avoid priority drop */ - if (!scanning_global_lru(sc)) - force_scan = 1; - } + /* + * If the zone or memcg is small, nr[l] can be 0. This + * results in no scanning on this priority and a potential + * priority drop. Global direct reclaim can go to the next + * zone and tends to have no problems. Global kswapd is for + * zone balancing and it needs to scan a minimum amount. When + * reclaiming for a memcg, a priority drop can cause high + * latencies, so it's better to scan a minimum amount there as + * well. 
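The new writeback throttle above compares nr_writeback against nr_taken >> (DEF_PRIORITY - priority), so the trigger point falls as reclaim priority rises. A quick numeric illustration, taking DEF_PRIORITY as 12 (its value in this kernel) and a made-up batch size:

/* Show how the throttle threshold scales with reclaim priority. */
#include <stdio.h>

#define DEF_PRIORITY 12

int main(void)
{
	unsigned long nr_taken = 32;	/* pages isolated this round (made up) */
	int priority;

	for (priority = DEF_PRIORITY; priority >= 9; priority--)
		printf("priority %2d: throttle once %lu of %lu are under writeback\n",
		       priority, nr_taken >> (DEF_PRIORITY - priority), nr_taken);
	return 0;
}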
+ */ + if (scanning_global_lru(sc) && current_is_kswapd()) + force_scan = true; + if (!scanning_global_lru(sc)) + force_scan = true; /* If we have no swap space, do not bother scanning anon pages. */ if (!sc->may_swap || (nr_swap_pages <= 0)) { @@ -1832,11 +1888,14 @@ static void get_scan_count(struct zone *zone, struct scan_control *sc, fraction[0] = 0; fraction[1] = 1; denominator = 1; - nr_force_scan[0] = 0; - nr_force_scan[1] = SWAP_CLUSTER_MAX; goto out; } + anon = zone_nr_lru_pages(zone, sc, LRU_ACTIVE_ANON) + + zone_nr_lru_pages(zone, sc, LRU_INACTIVE_ANON); + file = zone_nr_lru_pages(zone, sc, LRU_ACTIVE_FILE) + + zone_nr_lru_pages(zone, sc, LRU_INACTIVE_FILE); + if (scanning_global_lru(sc)) { free = zone_page_state(zone, NR_FREE_PAGES); /* If we have very few page cache pages, @@ -1845,8 +1904,6 @@ static void get_scan_count(struct zone *zone, struct scan_control *sc, fraction[0] = 1; fraction[1] = 0; denominator = 1; - nr_force_scan[0] = SWAP_CLUSTER_MAX; - nr_force_scan[1] = 0; goto out; } } @@ -1895,11 +1952,6 @@ static void get_scan_count(struct zone *zone, struct scan_control *sc, fraction[0] = ap; fraction[1] = fp; denominator = ap + fp + 1; - if (force_scan) { - unsigned long scan = SWAP_CLUSTER_MAX; - nr_force_scan[0] = div64_u64(scan * ap, denominator); - nr_force_scan[1] = div64_u64(scan * fp, denominator); - } out: for_each_evictable_lru(l) { int file = is_file_lru(l); @@ -1908,20 +1960,10 @@ out: scan = zone_nr_lru_pages(zone, sc, l); if (priority || noswap) { scan >>= priority; + if (!scan && force_scan) + scan = SWAP_CLUSTER_MAX; scan = div64_u64(scan * fraction[file], denominator); } - - /* - * If zone is small or memcg is small, nr[l] can be 0. - * This results no-scan on this priority and priority drop down. - * For global direct reclaim, it can visit next zone and tend - * not to have problems. For global kswapd, it's for zone - * balancing and it need to scan a small amounts. When using - * memcg, priority drop can cause big latency. So, it's better - * to scan small amount. See may_noscan above. 
- */ - if (!scan && force_scan) - scan = nr_force_scan[file]; nr[l] = scan; } } @@ -2000,12 +2042,14 @@ static void shrink_zone(int priority, struct zone *zone, enum lru_list l; unsigned long nr_reclaimed, nr_scanned; unsigned long nr_to_reclaim = sc->nr_to_reclaim; + struct blk_plug plug; restart: nr_reclaimed = 0; nr_scanned = sc->nr_scanned; get_scan_count(zone, sc, nr, priority); + blk_start_plug(&plug); while (nr[LRU_INACTIVE_ANON] || nr[LRU_ACTIVE_FILE] || nr[LRU_INACTIVE_FILE]) { for_each_evictable_lru(l) { @@ -2029,6 +2073,7 @@ restart: if (nr_reclaimed >= nr_to_reclaim && priority < DEF_PRIORITY) break; } + blk_finish_plug(&plug); sc->nr_reclaimed += nr_reclaimed; /* @@ -2722,6 +2767,8 @@ out: /* If balanced, clear the congested flag */ zone_clear_flag(zone, ZONE_CONGESTED); + if (i <= *classzone_idx) + balanced += zone->present_pages; } } diff --git a/mm/vmstat.c b/mm/vmstat.c index 20c18b7694b2..471b20bfa03a 100644 --- a/mm/vmstat.c +++ b/mm/vmstat.c @@ -702,6 +702,7 @@ const char * const vmstat_text[] = { "nr_unstable", "nr_bounce", "nr_vmscan_write", + "nr_vmscan_immediate_reclaim", "nr_writeback_temp", "nr_isolated_anon", "nr_isolated_file", diff --git a/security/keys/compat.c b/security/keys/compat.c index 338b510e9027..4c48e13448f8 100644 --- a/security/keys/compat.c +++ b/security/keys/compat.c @@ -38,7 +38,7 @@ long compat_keyctl_instantiate_key_iov( ret = compat_rw_copy_check_uvector(WRITE, _payload_iov, ioc, ARRAY_SIZE(iovstack), - iovstack, &iov); + iovstack, &iov, 1); if (ret < 0) return ret; if (ret == 0) diff --git a/security/keys/keyctl.c b/security/keys/keyctl.c index eca51918c951..0b3f5d72af1c 100644 --- a/security/keys/keyctl.c +++ b/security/keys/keyctl.c @@ -1065,7 +1065,7 @@ long keyctl_instantiate_key_iov(key_serial_t id, goto no_payload; ret = rw_copy_check_uvector(WRITE, _payload_iov, ioc, - ARRAY_SIZE(iovstack), iovstack, &iov); + ARRAY_SIZE(iovstack), iovstack, &iov, 1); if (ret < 0) return ret; if (ret == 0) diff --git a/security/selinux/selinuxfs.c b/security/selinux/selinuxfs.c index 55d92cbb177a..902187657bc2 100644 --- a/security/selinux/selinuxfs.c +++ b/security/selinux/selinuxfs.c @@ -752,14 +752,6 @@ out: return length; } -static inline int hexcode_to_int(int code) { - if (code == '\0' || !isxdigit(code)) - return -1; - if (isdigit(code)) - return code - '0'; - return tolower(code) - 'a' + 10; -} - static ssize_t sel_write_create(struct file *file, char *buf, size_t size) { char *scon = NULL, *tcon = NULL; @@ -811,9 +803,11 @@ static ssize_t sel_write_create(struct file *file, char *buf, size_t size) if (c1 == '+') c1 = ' '; else if (c1 == '%') { - if ((c1 = hexcode_to_int(*r++)) < 0) + c1 = hex_to_bin(*r++); + if (c1 < 0) goto out; - if ((c2 = hexcode_to_int(*r++)) < 0) + c2 = hex_to_bin(*r++); + if (c2 < 0) goto out; c1 = (c1 << 4) | c2; } |
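The vmscan hunks above replace the single ISOLATE_BOTH/ISOLATE_INACTIVE/ISOLATE_ACTIVE argument with an isolate_mode_t bitmask that callers assemble from the scan_control constraints (lumpy reclaim, may_unmap, may_writepage). The stand-alone C sketch below illustrates that flag composition only; the numeric flag values, the reduced scan_control fields, and the helper name build_reclaim_mode() are assumptions made for this example and are not taken from the patch.

/*
 * Illustrative sketch only: builds an isolate_mode_t bitmask from reclaim
 * constraints, mirroring the reclaim_mode logic added to
 * shrink_inactive_list()/shrink_active_list() above.  Flag values and the
 * reduced scan_control fields are assumed for demonstration, not kernel API.
 */
#include <stdio.h>

typedef unsigned int isolate_mode_t;

#define ISOLATE_INACTIVE  0x1   /* isolate inactive pages */
#define ISOLATE_ACTIVE    0x2   /* isolate active pages */
#define ISOLATE_CLEAN     0x4   /* isolate clean pages only */
#define ISOLATE_UNMAPPED  0x8   /* isolate unmapped pages only */

struct scan_control_sketch {
        int may_unmap;          /* allowed to reclaim mapped pages? */
        int may_writepage;      /* allowed to write back dirty pages? */
        int lumpy_reclaim;      /* lumpy reclaim active? */
};

static isolate_mode_t build_reclaim_mode(const struct scan_control_sketch *sc)
{
        isolate_mode_t mode = ISOLATE_INACTIVE;

        if (sc->lumpy_reclaim)          /* lumpy reclaim also takes active pages */
                mode |= ISOLATE_ACTIVE;
        if (!sc->may_unmap)             /* must not touch mapped pages */
                mode |= ISOLATE_UNMAPPED;
        if (!sc->may_writepage)         /* must only take clean pages */
                mode |= ISOLATE_CLEAN;

        return mode;
}

int main(void)
{
        struct scan_control_sketch sc = {
                .may_unmap = 0, .may_writepage = 1, .lumpy_reclaim = 0,
        };

        /* Expected: ISOLATE_INACTIVE | ISOLATE_UNMAPPED == 0x9 */
        printf("reclaim_mode = 0x%x\n", build_reclaim_mode(&sc));
        return 0;
}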