summaryrefslogtreecommitdiff
path: root/arch
AgeCommit message (Collapse)Author
2015-09-29lib/decompressors: use real out buf size for gunzip with kernelYinghai Lu
commit 2d3862d26e67a59340ba1cf1748196c76c5787de upstream. When loading x86 64bit kernel above 4GiB with patched grub2, got kernel gunzip error. | early console in decompress_kernel | decompress_kernel: | input: [0x807f2143b4-0x807ff61aee] | output: [0x807cc00000-0x807f3ea29b] 0x027ea29c: output_len | boot via startup_64 | KASLR using RDTSC... | new output: [0x46fe000000-0x470138cfff] 0x0338d000: output_run_size | decompress: [0x46fe000000-0x47007ea29b] <=== [0x807f2143b4-0x807ff61aee] | | Decompressing Linux... gz... | | uncompression error | | -- System halted the new buffer is at 0x46fe000000ULL, decompressor_gzip is using 0xffffffb901ffffff as out_len. gunzip in lib/zlib_inflate/inflate.c cap that len to 0x01ffffff and decompress fails later. We could hit this problem with crashkernel booting that uses kexec loading kernel above 4GiB. We have decompress_* support: 1. inbuf[]/outbuf[] for kernel preboot. 2. inbuf[]/flush() for initramfs 3. fill()/flush() for initrd. This bug only affect kernel preboot path that use outbuf[]. Add __decompress and take real out_buf_len for gunzip instead of guessing wrong buf size. Fixes: 1431574a1c4 (lib/decompressors: fix "no limit" output buffer length) Signed-off-by: Yinghai Lu <yinghai@kernel.org> Cc: Alexandre Courbot <acourbot@nvidia.com> Cc: Jon Medhurst <tixy@linaro.org> Cc: Stephen Warren <swarren@wwwdotorg.org> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-09-29parisc: Filter out spurious interrupts in PA-RISC irq handlerHelge Deller
commit b1b4e435e4ef7de77f07bf2a42c8380b960c2d44 upstream. When detecting a serial port on newer PA-RISC machines (with iosapic) we have a long way to go to find the right IRQ line, registering it, then registering the serial port and the irq handler for the serial port. During this phase spurious interrupts for the serial port may happen which then crashes the kernel because the action handler might not have been set up yet. So, basically it's a race condition between the serial port hardware and the CPU which sets up the necessary fields in the irq sructs. The main reason for this race is, that we unmask the serial port irqs too early without having set up everything properly before (which isn't easily possible because we need the IRQ number to register the serial ports). This patch is a work-around for this problem. It adds checks to the CPU irq handler to verify if the IRQ action field has been initialized already. If not, we just skip this interrupt (which isn't critical for a serial port at bootup). The real fix would probably involve rewriting all PA-RISC specific IRQ code (for CPU, IOSAPIC, GSC and EISA) to use IRQ domains with proper parenting of the irq chips and proper irq enabling along this line. This bug has been in the PA-RISC port since the beginning, but the crashes happened very rarely with currently used hardware. But on the latest machine which I bought (a C8000 workstation), which uses the fastest CPUs (4 x PA8900, 1GHz) and which has the largest possible L1 cache size (64MB each), the kernel crashed at every boot because of this race. So, without this patch the machine would currently be unuseable. For the record, here is the flow logic: 1. serial_init_chip() in 8250_gsc.c calls iosapic_serial_irq(). 2. iosapic_serial_irq() calls txn_alloc_irq() to find the irq. 3. iosapic_serial_irq() calls cpu_claim_irq() to register the CPU irq 4. cpu_claim_irq() unmasks the CPU irq (which it shouldn't!) 5. serial_init_chip() then registers the 8250 port. Problems: - In step 4 the CPU irq shouldn't have been registered yet, but after step 5 - If serial irq happens between 4 and 5 have finished, the kernel will crash Signed-off-by: Helge Deller <deller@gmx.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-09-29parisc: Use double word condition in 64bit CAS operationJohn David Anglin
commit 1b59ddfcf1678de38a1f8ca9fb8ea5eebeff1843 upstream. The attached change fixes the condition used in the "sub" instruction. A double word comparison is needed. This fixes the 64-bit LWS CAS operation on 64-bit kernels. I can now enable 64-bit atomic support in GCC. Signed-off-by: John David Anglin <dave.anglin> Signed-off-by: Helge Deller <deller@gmx.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-09-29x86/mm: Initialize pmd_idx in page_table_range_init_count()Minfei Huang
commit 9962eea9e55f797f05f20ba6448929cab2a9f018 upstream. The variable pmd_idx is not initialized for the first iteration of the for loop. Assign the proper value which indexes the start address. Fixes: 719272c45b82 'x86, mm: only call early_ioremap_page_table_range_init() once' Signed-off-by: Minfei Huang <mnfhuang@gmail.com> Cc: tony.luck@intel.com Cc: wangnan0@huawei.com Cc: david.vrabel@citrix.com Reviewed-by: yinghai@kernel.org Link: http://lkml.kernel.org/r/1436703522-29552-1-git-send-email-mhuang@redhat.com Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-09-29powerpc/mm: Recompute hash value after a failed updateAneesh Kumar K.V
commit 36b35d5d807b7e57aff7d08e63de8b17731ee211 upstream. If we had secondary hash flag set, we ended up modifying hash value in the updatepp code path. Hence with a failed updatepp we will be using a wrong hash value for the following hash insert. Fix this by recomputing hash before insert. Without this patch we can end up with using wrong slot number in linux pte. That can result in us missing an hash pte update or invalidate which can cause memory corruption or even machine check. Fixes: 6d492ecc6489 ("powerpc/THP: Add code to handle HPTE faults for hugepages") Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Reviewed-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-09-29powerpc/boot: Specify ABI v2 when building an LE boot wrapperBenjamin Herrenschmidt
commit 655471f54c2e395ba29ae4156ba0f49928177cc1 upstream. The kernel does it, not the boot wrapper, which breaks with some cross compilers that still default to ABI v1. Fixes: 147c05168fc8 ("powerpc/boot: Add support for 64bit little endian wrapper") Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-09-29powerpc/pseries: Release DRC when configure_connector failsBharata B Rao
commit daebaabb5cfbe4a6f09ca0e0f8b7673efc704960 upstream. Commit f32393c943e2 ("powerpc/pseries: Correct cpu affinity for dlpar added cpus") moved dlpar_acquire_drc() call to before dlpar_configure_connector() call in dlpar_cpu_probe(), but missed to release the DRC if dlpar_configure_connector() failed. During CPU hotplug, if configure-connector fails for any reason, then this will result in subsequent CPU hotplug attempts to fail. Release the acquired DRC if dlpar_configure_connector() call fails so that the DRC is left in right isolation and allocation state for the subsequent hotplug operation to succeed. Fixes: f32393c943e2 ("powerpc/pseries: Correct cpu affinity for dlpar added cpus") Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com> Reviewed-by: Nathan Fontenot <nfont@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-09-29powerpc/powernv/pci-ioda: fix kdump with non-power-of-2 crashkernel=Nishanth Aravamudan
commit fa14486979b3a47307bcdb10f8b5baa875a5cf68 upstream. The 32-bit TCE table initialization relies on the DMA window having a size equal to a power of 2 (and checks for it explicitly). But crashkernel= has no constraint that requires a power-of-2 be specified. This causes the kdump kernel to fail to boot as none of the PCI devices (including the disk controller) are successfully initialized. After this change, the PCI devices successfully set up the 32-bit TCE table and kdump succeeds. Fixes: aca6913f5551 ("powerpc/powernv/ioda2: Introduce helpers to allocate TCE pages") Signed-off-by: Nishanth Aravamudan <nacc@linux.vnet.ibm.com> Tested-by: Jan Stancek <jstancek@redhat.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-09-29powerpc/powernv/pci-ioda: fix 32-bit TCE table init in kdump kernelNishanth Aravamudan
commit bb0054552d080dd929907c5925d4bedc8bf6def7 upstream. When attempting to kdump with the 4.2 kernel, we see for each PCI device: pci 0003:01 : [PE# 000] Assign DMA32 space pci 0003:01 : [PE# 000] Setting up 32-bit TCE table at 0..80000000 pci 0003:01 : [PE# 000] Failed to create 32-bit TCE table, err -22 PCI: Domain 0004 has 8 available 32-bit DMA segments PCI: 4 PE# for a total weight of 70 pci 0004:01 : [PE# 002] Assign DMA32 space pci 0004:01 : [PE# 002] Setting up 32-bit TCE table at 0..80000000 pci 0004:01 : [PE# 002] Failed to create 32-bit TCE table, err -22 pci 0004:0d : [PE# 005] Assign DMA32 space pci 0004:0d : [PE# 005] Setting up 32-bit TCE table at 0..80000000 pci 0004:0d : [PE# 005] Failed to create 32-bit TCE table, err -22 pci 0004:0e : [PE# 006] Assign DMA32 space pci 0004:0e : [PE# 006] Setting up 32-bit TCE table at 0..80000000 pci 0004:0e : [PE# 006] Failed to create 32-bit TCE table, err -22 pci 0004:10 : [PE# 008] Assign DMA32 space pci 0004:10 : [PE# 008] Setting up 32-bit TCE table at 0..80000000 pci 0004:10 : [PE# 008] Failed to create 32-bit TCE table, err -22 and eventually the kdump kernel fails to boot as none of the PCI devices (including the disk controller) are successfully initialized. The EINVAL response is because the DMA window (the 2GB base window) is larger than the kdump kernel's reserved memory (crashkernel=, in this case specified to be 1024M). The check in question, if ((window_size > memory_hotplug_max()) || !is_power_of_2(window_size)) is a valid sanity check for pnv_pci_ioda2_table_alloc_pages(), so adjust the caller to pass in a smaller window size if our maximum memory value is smaller than the DMA window. After this change, the PCI devices successfully set up the 32-bit TCE table and kdump succeeds. The problem was seen on a Firestone machine originally. Fixes: aca6913f5551 ("powerpc/powernv/ioda2: Introduce helpers to allocate TCE pages") Signed-off-by: Nishanth Aravamudan <nacc@linux.vnet.ibm.com> Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru> [mpe: Coding style pedantry, use u64, change the indentation] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-09-29powerpc: Uncomment and make enable_kernel_vsx() routine availableLeonidas Da Silva Barbosa
commit 72cd7b44bc99376b3f3c93cedcd052663fcdf705 upstream. enable_kernel_vsx() function was commented since anything was using it. However, vmx-crypto driver uses VSX instructions which are only available if VSX is enable. Otherwise it rises an exception oops. This patch uncomment enable_kernel_vsx() routine and makes it available. Signed-off-by: Leonidas S. Barbosa <leosilva@linux.vnet.ibm.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-09-29powerpc/rtas: Introduce rtas_get_sensor_fast() for IRQ handlersThomas Huth
commit 1c2cb594441d02815d304cccec9742ff5c707495 upstream. The EPOW interrupt handler uses rtas_get_sensor(), which in turn uses rtas_busy_delay() to wait for RTAS becoming ready in case it is necessary. But rtas_busy_delay() is annotated with might_sleep() and thus may not be used by interrupts handlers like the EPOW handler! This leads to the following BUG when CONFIG_DEBUG_ATOMIC_SLEEP is enabled: BUG: sleeping function called from invalid context at arch/powerpc/kernel/rtas.c:496 in_atomic(): 1, irqs_disabled(): 1, pid: 0, name: swapper/1 CPU: 1 PID: 0 Comm: swapper/1 Not tainted 4.2.0-rc2-thuth #6 Call Trace: [c00000007ffe7b90] [c000000000807670] dump_stack+0xa0/0xdc (unreliable) [c00000007ffe7bc0] [c0000000000e1f14] ___might_sleep+0x134/0x180 [c00000007ffe7c20] [c00000000002aec0] rtas_busy_delay+0x30/0xd0 [c00000007ffe7c50] [c00000000002bde4] rtas_get_sensor+0x74/0xe0 [c00000007ffe7ce0] [c000000000083264] ras_epow_interrupt+0x44/0x450 [c00000007ffe7d90] [c000000000120260] handle_irq_event_percpu+0xa0/0x300 [c00000007ffe7e70] [c000000000120524] handle_irq_event+0x64/0xc0 [c00000007ffe7eb0] [c000000000124dbc] handle_fasteoi_irq+0xec/0x260 [c00000007ffe7ef0] [c00000000011f4f0] generic_handle_irq+0x50/0x80 [c00000007ffe7f20] [c000000000010f3c] __do_irq+0x8c/0x200 [c00000007ffe7f90] [c0000000000236cc] call_do_irq+0x14/0x24 [c00000007e6f39e0] [c000000000011144] do_IRQ+0x94/0x110 [c00000007e6f3a30] [c000000000002594] hardware_interrupt_common+0x114/0x180 Fix this issue by introducing a new rtas_get_sensor_fast() function that does not use rtas_busy_delay() - and thus can only be used for sensors that do not cause a BUSY condition - known as "fast" sensors. The EPOW sensor is defined to be "fast" in sPAPR - mpe. Fixes: 587f83e8dd50 ("powerpc/pseries: Use rtas_get_sensor in RAS code") Signed-off-by: Thomas Huth <thuth@redhat.com> Reviewed-by: Nathan Fontenot <nfont@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-09-29powerpc/mm: Fix pte_pagesize_index() crash on 4K w/64K hashMichael Ellerman
commit 74b5037baa2011a2799e2c43adde7d171b072f9e upstream. The powerpc kernel can be built to have either a 4K PAGE_SIZE or a 64K PAGE_SIZE. However when built with a 4K PAGE_SIZE there is an additional config option which can be enabled, PPC_HAS_HASH_64K, which means the kernel also knows how to hash a 64K page even though the base PAGE_SIZE is 4K. This is used in one obscure configuration, to support 64K pages for SPU local store on the Cell processor when the rest of the kernel is using 4K pages. In this configuration, pte_pagesize_index() is defined to just pass through its arguments to get_slice_psize(). However pte_pagesize_index() is called for both user and kernel addresses, whereas get_slice_psize() only knows how to handle user addresses. This has been broken forever, however until recently it happened to work. That was because in get_slice_psize() the large kernel address would cause the right shift of the slice mask to return zero. However in commit 7aa0727f3302 ("powerpc/mm: Increase the slice range to 64TB"), the get_slice_psize() code was changed so that instead of a right shift we do an array lookup based on the address. When passed a kernel address this means we index way off the end of the slice array and return random junk. That is only fatal if we happen to hit something non-zero, but when we do return a non-zero value we confuse the MMU code and eventually cause a check stop. This fix is ugly, but simple. When we're called for a kernel address we return 4K, which is always correct in this configuration, otherwise we use the slice mask. Fixes: 7aa0727f3302 ("powerpc/mm: Increase the slice range to 64TB") Reported-by: Cyril Bur <cyrilbur@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-09-29powerpc/eeh: Fix fenced PHB caused by eeh_slot_error_detail()Gavin Shan
commit 259800135c654a098d9f0adfdd3d1f20eef1f231 upstream. The config space of some PCI devices can't be accessed when their PEs are in frozen state. Otherwise, fenced PHB might be seen. Those PEs are identified with flag EEH_PE_CFG_RESTRICTED, meaing EEH_PE_CFG_BLOCKED is set automatically when the PE is put to frozen state (EEH_PE_ISOLATED). eeh_slot_error_detail() restores PCI device BARs with eeh_pe_restore_bars(), which then calls eeh_ops->restore_config() to reinitialize the PCI device in (OPAL) firmware. eeh_ops->restore_config() produces PCI config access that causes fenced PHB. The problem was reported on below adapter: 0001:01:00.0 0200: 14e4:168e (rev 10) 0001:01:00.0 Ethernet controller: Broadcom Corporation \ NetXtreme II BCM57810 10 Gigabit Ethernet (rev 10) This fixes the issue by skipping eeh_pe_restore_bars() in eeh_slot_error_detail() when EEH_PE_CFG_BLOCKED is set for the PE. Fixes: b6541db1 ("powerpc/eeh: Block PCI config access upon frozen PE") Reported-by: Manvanthara B. Puttashankar <mputtash@in.ibm.com> Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-09-29powerpc/eeh: Probe after unbalanced kref checkDaniel Axtens
commit e642d11bdbfe8eb10116ab3959a2b5d75efda832 upstream. In the complete hotplug case, EEH PEs are supposed to be released and set to NULL. Normally, this is done by eeh_remove_device(), which is called from pcibios_release_device(). However, if something is holding a kref to the device, it will not be released, and the PE will remain. eeh_add_device_late() has a check for this which will explictly destroy the PE in this case. This check in eeh_add_device_late() occurs after a call to eeh_ops->probe(). On PowerNV, probe is a pointer to pnv_eeh_probe(), which will exit without probing if there is an existing PE. This means that on PowerNV, devices with outstanding krefs will not be rediscovered by EEH correctly after a complete hotplug. This is affecting CXL (CAPI) devices in the field. Put the probe after the kref check so that the PE is destroyed and affected devices are correctly rediscovered by EEH. Fixes: d91dafc02f42 ("powerpc/eeh: Delay probing EEH device during hotplug") Cc: Gavin Shan <gwshan@linux.vnet.ibm.com> Signed-off-by: Daniel Axtens <dja@axtens.net> Acked-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-09-29powerpc/pseries: Fix corrupted pdn listGavin Shan
commit 590c7567a2895f939525ead57b0334c6d47986f0 upstream. Commit cca87d30 ("powerpc/pci: Refactor pci_dn") introduced pdn list for SRIOV VFs. It means the pdn is be put into the child list of its parent pdn when the pdn is created. When doing PCI hot unplugging on pSeries, the PCI device node as well as its pdn are released through procfs entry "powerpc/ofdt". Some one else grabs the memory chunk of the pdn and update it accordingly. At the same time, the pdn is still tracked in the child list of parent pdn. It leads to corrupted child list in the parent pdn. This fixes above issue by removing the pdn from the child list of its parent pdn when the device node is detached from the system. Note the pdn is free'd when the device node is released if the device node is dynamic one. Otherwise, the device node as well as the pdn won't be released. Fixes: cca87d30 ("powerpc/pci: Refactor pci_dn") Reported-by: Santwana Samantray <santwana.samantray@in.ibm.com> Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-09-29arm64: KVM: Disable virtual timer even if the guest is not using itMarc Zyngier
commit c4cbba9fa078f55d9f6d081dbb4aec7cf969e7c7 upstream. When running a guest with the architected timer disabled (with QEMU and the kernel_irqchip=off option, for example), it is important to make sure the timer gets turned off. Otherwise, the guest may try to enable it anyway, leading to a screaming HW interrupt. The fix is to unconditionally turn off the virtual timer on guest exit. Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-09-29KVM: arm64: add workaround for Cortex-A57 erratum #852523Will Deacon
commit 43297dda0a51e4ffed0888ce727c218cfb7474b6 upstream. When restoring the system register state for an AArch32 guest at EL2, writes to DACR32_EL2 may not be correctly synchronised by Cortex-A57, which can lead to the guest effectively running with junk in the DACR and running into unexpected domain faults. This patch works around the issue by re-ordering our restoration of the AArch32 register aliases so that they happen before the AArch64 system registers. Ensuring that the registers are restored in this order guarantees that they will be correctly synchronised by the core. Reviewed-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-09-29arm/arm64: KVM: vgic: Check for !irqchip_in_kernel() when mapping resourcesPavel Fedin
commit c2f58514cfb374d5368c9da945f1765cd48eb0da upstream. Until b26e5fdac43c ("arm/arm64: KVM: introduce per-VM ops"), kvm_vgic_map_resources() used to include a check on irqchip_in_kernel(), and vgic_v2_map_resources() still has it. But now vm_ops are not initialized until we call kvm_vgic_create(). Therefore kvm_vgic_map_resources() can being called without a VGIC, and we die because vm_ops.map_resources is NULL. Fixing this restores QEMU's kernel-irqchip=off option to a working state, allowing to use GIC emulation in userspace. Fixes: b26e5fdac43c ("arm/arm64: KVM: introduce per-VM ops") Signed-off-by: Pavel Fedin <p.fedin@samsung.com> [maz: reworked commit message] Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-09-29arm64: errata: add module build workaround for erratum #843419Will Deacon
commit df057cc7b4fa59e9b55f07ffdb6c62bf02e99a00 upstream. Cortex-A53 processors <= r0p4 are affected by erratum #843419 which can lead to a memory access using an incorrect address in certain sequences headed by an ADRP instruction. There is a linker fix to generate veneers for ADRP instructions, but this doesn't work for kernel modules which are built as unlinked ELF objects. This patch adds a new config option for the erratum which, when enabled, builds kernel modules with the mcmodel=large flag. This uses absolute addressing for all kernel symbols, thereby removing the use of ADRP as a PC-relative form of addressing. The ADRP relocs are removed from the module loader so that we fail to load any potentially affected modules. Acked-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-09-29arm64: head.S: initialise mdcr_el2 in el2_setupWill Deacon
commit d10bcd473301888f957ec4b6b12aa3621be78d59 upstream. When entering the kernel at EL2, we fail to initialise the MDCR_EL2 register which controls debug access and PMU capabilities at EL1. This patch ensures that the register is initialised so that all traps are disabled and all the PMU counters are available to the host. When a guest is scheduled, KVM takes care to configure trapping appropriately. Acked-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-09-29arm64: compat: fix vfp save/restore across signal handlers in big-endianWill Deacon
commit bdec97a855ef1e239f130f7a11584721c9a1bf04 upstream. When saving/restoring the VFP registers from a compat (AArch32) signal frame, we rely on the compat registers forming a prefix of the native register file and therefore make use of copy_{to,from}_user to transfer between the native fpsimd_state and the compat_vfp_sigframe. Unfortunately, this doesn't work so well in a big-endian environment. Our fpsimd save/restore code operates directly on 128-bit quantities (Q registers) whereas the compat_vfp_sigframe represents the registers as an array of 64-bit (D) registers. The architecture packs the compat D registers into the Q registers, with the least significant bytes holding the lower register. Consequently, we need to swap the 64-bit halves when converting between these two representations on a big-endian machine. This patch replaces the __copy_{to,from}_user invocations in our compat VFP signal handling code with explicit __put_user loops that operate on 64-bit values and swap them accordingly. Reviewed-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-09-29arm64: set MAX_MEMBLOCK_ADDR according to linear region sizeArd Biesheuvel
commit 34ba2c4247e5c4b1542b1106e156af324660c4f0 upstream. The linear region size of a 39-bit VA kernel is only 256 GB, which may be insufficient to cover all of system RAM, even on platforms that have much less than 256 GB of memory but which is laid out very sparsely. So make sure we clip the memory we will not be able to map before installing it into the memblock memory table, by setting MAX_MEMBLOCK_ADDR accordingly. Reviewed-by: Catalin Marinas <catalin.marinas@arm.com> Tested-by: Stuart Yoder <stuart.yoder@freescale.com> Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-09-29arm64: flush FP/SIMD state correctly after execve()Ard Biesheuvel
commit 674c242c9323d3c293fc4f9a3a3a619fe3063290 upstream. When a task calls execve(), its FP/SIMD state is flushed so that none of the original program state is observeable by the incoming program. However, since this flushing consists of setting the in-memory copy of the FP/SIMD state to all zeroes, the CPU field is set to CPU 0 as well, which indicates to the lazy FP/SIMD preserve/restore code that the FP/SIMD state does not need to be reread from memory if the task is scheduled again on CPU 0 without any other tasks having entered userland (or used the FP/SIMD in kernel mode) on the same CPU in the mean time. If this happens, the FP/SIMD state of the old program will still be present in the registers when the new program starts. So set the CPU field to the invalid value of NR_CPUS when performing the flush, by calling fpsimd_flush_task_state(). Reported-by: Chunyan Zhang <chunyan.zhang@spreadtrum.com> Reported-by: Janet Liu <janet.liu@spreadtrum.com> Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-09-29arm64: entry: always restore x0 from the stack on syscall returnWill Deacon
commit 412fcb6cebd758d080cacd5a41a0cbc656ea5fce upstream. We have a micro-optimisation on the fast syscall return path where we take care to keep x0 live with the return value from the syscall so that we can avoid restoring it from the stack. The benefit of doing this is fairly suspect, since we will be restoring x1 from the stack anyway (which lives adjacent in the pt_regs structure) and the only additional cost is saving x0 back to pt_regs after the syscall handler, which could be seen as a poor man's prefetch. More importantly, this causes issues with the context tracking code. The ct_user_enter macro ends up branching into C code, which is free to use x0 as a scratch register and consequently leads to us returning junk back to userspace as the syscall return value. Rather than special case the context-tracking code, this patch removes the questionable optimisation entirely. Cc: Larry Bassel <larry.bassel@linaro.org> Cc: Kevin Hilman <khilman@linaro.org> Reviewed-by: Catalin Marinas <catalin.marinas@arm.com> Reported-by: Hanjun Guo <hanjun.guo@linaro.org> Tested-by: Hanjun Guo <hanjun.guo@linaro.org> Signed-off-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-09-29arm64: kconfig: Move LIST_POISON to a safe valueJeff Vander Stoep
commit bf0c4e04732479f650ff59d1ee82de761c0071f0 upstream. Move the poison pointer offset to 0xdead000000000000, a recognized value that is not mappable by user-space exploits. Acked-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Thierry Strudel <tstrudel@google.com> Signed-off-by: Jeff Vander Stoep <jeffv@google.com> Signed-off-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-09-29MIPS: math-emu: Emulate missing BC1{EQ,NE}Z instructionsMarkos Chandras
commit c909ca718e8f50cf484ef06a8dd935e738e8e53d upstream. Commit c8a34581ec09 ("MIPS: Emulate the BC1{EQ,NE}Z FPU instructions") added support for emulating the new R6 BC1{EQ,NE}Z branches but it missed the case where the instruction that caused the exception was not on a DS. Signed-off-by: Markos Chandras <markos.chandras@imgtec.com> Fixes: c8a34581ec09 ("MIPS: Emulate the BC1{EQ,NE}Z FPU instructions") Cc: linux-mips@linux-mips.org Patchwork: https://patchwork.linux-mips.org/patch/10738/ Signed-off-by: Ralf Baechle <ralf@linux-mips.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-09-29MIPS: math-emu: Allow m{f,t}hc emulation on MIPS R6Markos Chandras
commit e8f80cc1a6d80587136b015e989a12827e1fcfe5 upstream. The mfhc/mthc instructions are supported on MIPS R6 so emulate them if needed. Signed-off-by: Markos Chandras <markos.chandras@imgtec.com> Cc: linux-mips@linux-mips.org Patchwork: https://patchwork.linux-mips.org/patch/10737/ Signed-off-by: Ralf Baechle <ralf@linux-mips.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-09-29MIPS: CPS: use 32b accesses to GCRsPaul Burton
commit 90996511187d6282db6d02d3f97006b4dbb5c457 upstream. Commit b677bc03d757 ("MIPS: cps-vec: Use macros for various arithmetics and memory operations") replaced various load & store instructions through cps-vec.S with the PTR_L & PTR_S macros. However it was somewhat overzealous in doing so for CM GCR accesses, since the bit width of the CM doesn't necessarily match that of the CPU. The registers accessed (GCR_CL_COHERENCE & GCR_CL_ID) should be safe to simply always access using 32b instructions, so do so in order to avoid issues when using a 32b CM with a 64b CPU. Signed-off-by: Paul Burton <paul.burton@imgtec.com> Cc: Markos Chandras <markos.chandras@imgtec.com> Cc: linux-mips@linux-mips.org Cc: linux-kernel@vger.kernel.org Cc: James Hogan <james.hogan@imgtec.com> Patchwork: https://patchwork.linux-mips.org/patch/10864/ Signed-off-by: Ralf Baechle <ralf@linux-mips.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-09-21ARM: rockchip: fix broken buildCaesar Wang
commit cb8cc37f4d38d96552f2c52deb15e511cdacf906 upstream. The following was seen in branch[0] build. arch/arm/mach-rockchip/platsmp.c:154:23: error: 'rockchip_secondary_startup' undeclared (first use in this function) branch[0]: git://git.kernel.org/pub/scm/linux/kernel/git/mmind/linux-rockchip.git v4.3-armsoc/soc The broken build is caused by the commit fe4407c0dc58 ("ARM: rockchip: fix the CPU soft reset"). Signed-off-by: Caesar Wang <wxt@rock-chips.com> The breakage was a result of it being wrongly merged in my branch with the cache invalidation rework from Russell 02b4e2756e01c ("ARM: v7 setup function should invalidate L1 cache"). Signed-off-by: Heiko Stuebner <heiko@sntech.de> Cc: Willy Tarreau <w@1wt.eu> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-09-21ACPI, PCI: Penalize legacy IRQ used by ACPI SCIJiang Liu
commit 5d0ddfebb93069061880fc57ee4ba7246bd1e1ee upstream. Nick Meier reported a regression with HyperV that " After rebooting the VM, the following messages are logged in syslog when trying to load the tulip driver: tulip: Linux Tulip drivers version 1.1.15 (Feb 27, 2007) tulip: 0000:00:0a.0: PCI INT A: failed to register GSI tulip: Cannot enable tulip board #0, aborting tulip: probe of 0000:00:0a.0 failed with error -16 Errors occur in 3.19.0 kernel Works in 3.17 kernel. " According to the ACPI dump file posted by Nick at https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1440072 The ACPI MADT table includes an interrupt source overridden entry for ACPI SCI: [236h 0566 1] Subtable Type : 02 <Interrupt Source Override> [237h 0567 1] Length : 0A [238h 0568 1] Bus : 00 [239h 0569 1] Source : 09 [23Ah 0570 4] Interrupt : 00000009 [23Eh 0574 2] Flags (decoded below) : 000D Polarity : 1 Trigger Mode : 3 And in DSDT table, we have _PRT method to define PCI interrupts, which eventually goes to: Name (PRSA, ResourceTemplate () { IRQ (Level, ActiveLow, Shared, ) {3,4,5,7,9,10,11,12,14,15} }) Name (PRSB, ResourceTemplate () { IRQ (Level, ActiveLow, Shared, ) {3,4,5,7,9,10,11,12,14,15} }) Name (PRSC, ResourceTemplate () { IRQ (Level, ActiveLow, Shared, ) {3,4,5,7,9,10,11,12,14,15} }) Name (PRSD, ResourceTemplate () { IRQ (Level, ActiveLow, Shared, ) {3,4,5,7,9,10,11,12,14,15} }) According to the MADT and DSDT tables, IRQ 9 may be used for: 1) ACPI SCI in level, high mode 2) PCI legacy IRQ in level, low mode So there's a conflict in polarity setting for IRQ 9. Prior to commit cd68f6bd53cf ("x86, irq, acpi: Get rid of special handling of GSI for ACPI SCI"), ACPI SCI is handled specially and there's no check for conflicts between ACPI SCI and PCI legagy IRQ. And it seems that the HyperV hypervisor doesn't make use of the polarity configuration in IOAPIC entry, so it just works. Commit cd68f6bd53cf gets rid of the specially handling of ACPI SCI, and then the pin attribute checking code discloses the conflicts between ACPI SCI and PCI legacy IRQ on HyperV virtual machine, and rejects the request to assign IRQ9 to PCI devices. So penalize legacy IRQ used by ACPI SCI and mark it unusable if ACPI SCI attributes conflict with PCI IRQ attributes. Please refer to following links for more information: https://bugzilla.kernel.org/show_bug.cgi?id=101301 https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1440072 Fixes: cd68f6bd53cf ("x86, irq, acpi: Get rid of special handling of GSI for ACPI SCI") Reported-and-tested-by: Nick Meier <nmeier@microsoft.com> Acked-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Jiang Liu <jiang.liu@linux.intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-09-21ARM: dts: rockchip: fix rk3288 watchdog irqHeiko Stuebner
commit 1a1b698b115467242303daf5fe1d3c9886c2fa17 upstream. The watchdog irq is actually SPI 79, which translates to the original 111 in the manual where the SPI irqs start at 32. The current dw_wdt driver does not use the irq at all, so this issue never surfaced. Nevertheless fix this for a time we want to use the irq. Fixes: 2ab557b72d46 ("ARM: dts: rockchip: add core rk3288 dtsi") Signed-off-by: Heiko Stuebner <heiko@sntech.de> Reviewed-by: Douglas Anderson <dianders@chromium.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-09-21ARM: rockchip: fix the CPU soft resetCaesar Wang
commit fe4407c0dc58215a7abfb7532740d79ddabe7a7a upstream. We need different orderings when turning a core on and turning a core off. In one case we need to assert reset before turning power off. In ther other case we need to turn power on and the deassert reset. In general, the correct flow is: CPU off: reset_control_assert regmap_update_bits(pmu, PMU_PWRDN_CON, BIT(pd), BIT(pd)) wait_for_power_domain_to_turn_off CPU on: regmap_update_bits(pmu, PMU_PWRDN_CON, BIT(pd), 0) wait_for_power_domain_to_turn_on reset_control_deassert This is needed for stressing CPU up/down, as per: cd /sys/devices/system/cpu/ for i in $(seq 10000); do echo "================= $i ============" for j in $(seq 100); do while [[ "$(cat cpu1/online)$(cat cpu2/online)$(cat cpu3/online)" != "000"" ]] echo 0 > cpu1/online echo 0 > cpu2/online echo 0 > cpu3/online done while [[ "$(cat cpu1/online)$(cat cpu2/online)$(cat cpu3/online)" != "111" ]]; do echo 1 > cpu1/online echo 1 > cpu2/online echo 1 > cpu3/online done done done The following is reproducable log: [34466.186812] PM: noirq suspend of devices complete after 0.669 msecs [34466.186824] Disabling non-boot CPUs ... [34466.187509] CPU1: shutdown [34466.188672] CPU2: shutdown [34473.736627] Kernel panic - not syncing:Watchdog detected hard LOCKUP on cpu 0 ....... or others similar log: ....... [ 4072.454453] CPU1: shutdown [ 4072.504436] CPU2: shutdown [ 4072.554426] CPU3: shutdown [ 4072.577827] CPU1: Booted secondary processor [ 4072.582611] CPU2: Booted secondary processor <hang> Tested by cpu up/down scripts, the results told us need delay more time before write the sram. The wait time is affected by many aspects (e.g: cpu frequency, bootrom frequency, sram frequency, bus speed, ...). Although the cpus other than cpu0 will write the sram, the speedy is no the same as cpu0, if the cpu0 early wake up, perhaps the other cpus can't startup. As we know, the cpu0 can wake up when the cpu1/2/3 write the 'sram+4/8' and send the sev. Anyway..... At the moment, 1ms delay will be happy work for cpu up/down scripts test. Signed-off-by: Caesar Wang <wxt@rock-chips.com> Reviewed-by: Doug Anderson <dianders@chromium.org> Reviewed-by: Kever Yang <kever.yang@rock-chips.com> Fixes: 3ee851e212d0 ("ARM: rockchip: add basic smp support for rk3288") Signed-off-by: Heiko Stuebner <heiko@sntech.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-09-21ARM: OMAP2+: DRA7: clockdomain: change l4per2_7xx_clkdm to SW_WKUPVignesh R
commit b9e23f321940d2db2c9def8ff723b8464fb86343 upstream. Legacy IPs like PWMSS, present under l4per2_7xx_clkdm, cannot support smart-idle when its clock domain is in HW_AUTO on DRA7 SoCs. Hence, program clock domain to SW_WKUP. Signed-off-by: Vignesh R <vigneshr@ti.com> Acked-by: Tero Kristo <t-kristo@ti.com> Reviewed-by: Paul Walmsley <paul@pwsan.com> Signed-off-by: Paul Walmsley <paul@pwsan.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-09-21ARM: dts: fix clock-frequency of display timing0 for exynos3250-rinatoHyungwon Hwang
commit 65e3293381e1cf1abcfe1aa22b914650a40e3af4 upstream. After the commit abc0b1447d49 ("drm: Perform basic sanity checks on probed modes"), proper clock-frequency becomes mandatory for validating the mode of panel. The display does not work if there is no mode validated. Also, this clock-frequency must be set appropriately for getting required frame rate. Fixes: abc0b1447d49 ("drm: Perform basic sanity checks on probed modes") Signed-off-by: Hyungwon Hwang <human.hwang@samsung.com> Signed-off-by: Krzysztof Kozlowski <k.kozlowski@samsung.com> Sigend-off-by: Kukjin Kim <kgene@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-09-21ARM: orion5x: fix legacy orion5x IRQ numbersBenjamin Cama
commit 5be9fc23cdb42e1d383ecc8eae8a8ff70a752708 upstream. Since v3.18, attempts to deliver IRQ0 are rejected, breaking orion5x. Fix this by increasing all interrupts by one, as did 5d6bed2a9c8b for dove. Also, force MULTI_IRQ_HANDLER for all orion platforms (including dove) as the specific handler is needed to shift back IRQ numbers by one. [gregory.clement@free-electrons.com]: moved the select MULTI_IRQ_HANDLER from PLAT_ORION_LEGACY to ARCH_ORION5X as it broke the build for dove. Fixes: a71b092a9c68 ("ARM: Convert handle_IRQ to use __handle_domain_irq") Signed-off-by: Benjamin Cama <benoar@dolka.fr> Signed-off-by: Gregory CLEMENT <gregory.clement@free-electrons.com> Tested-by: Detlef Vollmann <dv@vollmann.ch> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-09-21ARM: BCM63xx: fix parameter to of_get_cpu_node in bcm63138_smp_boot_secondarySudeep Holla
commit a6b4b25bd15c0a490769422a7eee355e9e91a57b upstream. of_get_cpu_node provides the device node associated with the given logical CPU and cpu_logical_map contains the physical id for each CPU in the logical ordering. Passing cpu_logical_map(cpu) to of_get_cpu_node is incorrect. This patch fixes the issue by passing the logical CPU number to of_get_cpu_node Fixes: ed5cd8163da8 ("ARM: BCM63xx: Add SMP support for BCM63138") Cc: Florian Fainelli <f.fainelli@gmail.com> Cc: bcm-kernel-feedback-list@broadcom.com Signed-off-by: Sudeep Holla <sudeep.holla@arm.com> Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-09-21x86/mce: Reenable CMCI banks when swiching back to interrupt modeXie XiuQi
commit 1b48465500611a2dc5e75800c61ac352e22d41c3 upstream. Zhang Liguang reported the following issue: 1) System detects a CMCI storm on the current CPU. 2) Kernel disables the CMCI interrupt on banks owned by the current CPU and switches to poll mode 3) After the CMCI storm subsides, kernel switches back to interrupt mode 4) We expect the system to reenable the CMCI interrupt on banks owned by the current CPU mce_intel_adjust_timer |-> cmci_reenable |-> cmci_discover # owned banks are ignored here static void cmci_discover(int banks) ... for (i = 0; i < banks; i++) { ... if (test_bit(i, owned)) # ownd banks is ignore here continue; So convert cmci_storm_disable_banks() to cmci_toggle_interrupt_mode() which controls whether to enable or disable CMCI interrupts with its argument. NB: We cannot clear the owned bit because the banks won't be polled, otherwise. See: 27f6c573e0f7 ("x86, CMCI: Add proper detection of end of CMCI storms") for more info. Reported-by: Zhang Liguang <zhangliguang@huawei.com> Signed-off-by: Xie XiuQi <xiexiuqi@huawei.com> Signed-off-by: Borislav Petkov <bp@suse.de> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Tony Luck <tony.luck@intel.com> Cc: huawei.libin@huawei.com Cc: linux-edac <linux-edac@vger.kernel.org> Cc: rui.xiang@huawei.com Link: http://lkml.kernel.org/r/1439396985-12812-10-git-send-email-bp@alien8.de Signed-off-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-09-21xtensa: fix kernel register spillingMax Filippov
commit 77d6273e79e3a86552fcf10cdd31a69b46ed2ce6 upstream. call12 can't be safely used as the first call in the inline function, because the compiler does not extend the stack frame of the bounding function accordingly, which may result in corruption of local variables. If a call needs to be done, do call8 first followed by call12. For pure assembly code in _switch_to increase stack frame size of the bounding function. Signed-off-by: Max Filippov <jcmvbkbc@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-09-21xtensa: fix threadptr reload on return to userspaceMax Filippov
commit 4229fb12a03e5da5882b420b0aa4a02e77447b86 upstream. Userspace return code may skip restoring THREADPTR register if there are no registers that need to be zeroed. This leads to spurious failures in libc NPTL tests. Always restore THREADPTR on return to userspace. Signed-off-by: Max Filippov <jcmvbkbc@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-09-21KVM: PPC: Book3S HV: Fix race in reading change bit when removing HPTEPaul Mackerras
commit 1e5bf454f58731e360e504253e85bae7aaa2d298 upstream. The reference (R) and change (C) bits in a HPT entry can be set by hardware at any time up until the HPTE is invalidated and the TLB invalidation sequence has completed. This means that when removing a HPTE, we need to read the HPTE after the invalidation sequence has completed in order to obtain reliable values of R and C. The code in kvmppc_do_h_remove() used to do this. However, commit 6f22bd3265fb ("KVM: PPC: Book3S HV: Make HTAB code LE host aware") removed the read after invalidation as a side effect of other changes. This restores the read of the HPTE after invalidation. The user-visible effect of this bug would be that when migrating a guest, there is a small probability that a page modified by the guest and then unmapped by the guest might not get re-transmitted and thus the destination might end up with a stale copy of the page. Fixes: 6f22bd3265fb Signed-off-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Alexander Graf <agraf@suse.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-09-21KVM: PPC: Book3S HV: Exit on H_DOORBELL if HOST_IPI is setGautham R. Shenoy
commit 06554d9f6cc8f0b5ec903db19726a15dfc7b09d6 upstream. The code that handles the case when we receive a H_DOORBELL interrupt has a comment which says "Hypervisor doorbell - exit only if host IPI flag set". However, the current code does not actually check if the host IPI flag is set. This is due to a comparison instruction that got missed. As a result, the current code performs the exit to host only if some sibling thread or a sibling sub-core is exiting to the host. This implies that, an IPI sent to a sibling core in (subcores-per-core != 1) mode will be missed by the host unless the sibling core is on the exit path to the host. This patch adds the missing comparison operation which will ensure that when HOST_IPI flag is set, we unconditionally exit to the host. Fixes: 66feed61cdf6 Signed-off-by: Gautham R. Shenoy <ego@linux.vnet.ibm.com> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-09-21KVM: MMU: fix validation of mmio page faultXiao Guangrong
commit 6f691251c0350ac52a007c54bf3ef62e9d8cdc5e upstream. We got the bug that qemu complained with "KVM: unknown exit, hardware reason 31" and KVM shown these info: [84245.284948] EPT: Misconfiguration. [84245.285056] EPT: GPA: 0xfeda848 [84245.285154] ept_misconfig_inspect_spte: spte 0x5eaef50107 level 4 [84245.285344] ept_misconfig_inspect_spte: spte 0x5f5fadc107 level 3 [84245.285532] ept_misconfig_inspect_spte: spte 0x5141d18107 level 2 [84245.285723] ept_misconfig_inspect_spte: spte 0x52e40dad77 level 1 This is because we got a mmio #PF and the handler see the mmio spte becomes normal (points to the ram page) However, this is valid after introducing fast mmio spte invalidation which increases the generation-number instead of zapping mmio sptes, a example is as follows: 1. QEMU drops mmio region by adding a new memslot 2. invalidate all mmio sptes 3. VCPU 0 VCPU 1 access the invalid mmio spte access the region originally was MMIO before set the spte to the normal ram map mmio #PF check the spte and see it becomes normal ram mapping !!! This patch fixes the bug just by dropping the check in mmio handler, it's good for backport. Full check will be introduced in later patches Reported-by: Pavel Shirshov <ru.pchel@gmail.com> Tested-by: Pavel Shirshov <ru.pchel@gmail.com> Signed-off-by: Xiao Guangrong <guangrong.xiao@linux.intel.com> Cc: stable@vger.kernel.org Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-09-21crypto: ghash-clmulni: specify context size for ghash async algorithmAndrey Ryabinin
commit 71c6da846be478a61556717ef1ee1cea91f5d6a8 upstream. Currently context size (cra_ctxsize) doesn't specified for ghash_async_alg. Which means it's zero. Thus crypto_create_tfm() doesn't allocate needed space for ghash_async_ctx, so any read/write to ctx (e.g. in ghash_async_init_tfm()) is not valid. Signed-off-by: Andrey Ryabinin <aryabinin@odin.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-09-21s390/setup: fix novx parameterMartin Schwidefsky
commit 89b1145e93771d727645c96e323539c029b63f1c upstream. The novx parameter disables the vector facility but the HWCAP_S390_VXRS bit in the ELf hardware capabilies is always set if the machine has the vector facility. If the user space program uses the "vx" string in the features field of /proc/cpuinfo to utilize vector instruction it will crash if the novx kernel paramter is set. Convert setup_hwcaps to an arch_initcall and use MACHINE_HAS_VX to decide if the HWCAPS_S390_VXRS bit needs to be set. Reported-by: Ulrich Weigand <uweigand@de.ibm.com> Reviewed-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-08-27Merge tag 'powerpc-4.2-4' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux Pull powerpc fixes from Michael Ellerman: "Fix MSI/MSI-X on pseries from Guilherme" * tag 'powerpc-4.2-4' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: powerpc/PCI: Disable MSI/MSI-X interrupts at PCI probe time in OF case PCI: Make pci_msi_setup_pci_dev() non-static for use by arch code
2015-08-27Merge tag 'arm64-fixes' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux Pull amr64 kvm fix from Will Deacon: "We've uncovered a nasty bug in the arm64 KVM code which allows a badly behaved 32-bit guest to bring down the host. The fix is simple (it's what I believe we call a "brown paper bag" bug) and I don't think it makes sense to sit on this, particularly as Russell ended up triggering this rather than just somebody noticing a potential problem by inspection. Usually arm64 KVM changes would go via Paolo's tree, but he's on holiday at the moment and the deal is that anything urgent gets shuffled via the arch trees, so here it is. Summary: Fix arm64 KVM issue when injecting an abort into a 32-bit guest, which would lead to an illegal exception return at EL2 and a subsequent host crash" * tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: arm64: KVM: Fix host crash when injecting a fault into a 32bit guest
2015-08-27arm64: KVM: Fix host crash when injecting a fault into a 32bit guestMarc Zyngier
When injecting a fault into a misbehaving 32bit guest, it seems rather idiotic to also inject a 64bit fault that is only going to corrupt the guest state. This leads to a situation where we perform an illegal exception return at EL2 causing the host to crash instead of killing the guest. Just fix the stupid bug that has been there from day 1. Cc: <stable@vger.kernel.org> Reported-by: Russell King <rmk+kernel@arm.linux.org.uk> Tested-by: Russell King <rmk+kernel@arm.linux.org.uk> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com>
2015-08-26powerpc/PCI: Disable MSI/MSI-X interrupts at PCI probe time in OF caseGuilherme G. Piccoli
Since commit 1851617cd2da ("PCI/MSI: Disable MSI at enumeration even if kernel doesn't support MSI"), the setup of dev->msi_cap/msix_cap and the disable of MSI/MSI-X interrupts isn't being done at PCI probe time, as the logic responsible for this was moved in the aforementioned commit from pci_device_add() to pci_setup_device(). The latter function is not reachable on PowerPC pseries platform during Open Firmware PCI probing time. This exhibits as drivers not being able to enable MSI, eg: bnx2x 0000:01:00.0: no msix capability found This patch calls pci_msi_setup_pci_dev() explicitly to disable MSI/MSI-X during PCI probe time on pSeries platform. Fixes: 1851617cd2da ("PCI/MSI: Disable MSI at enumeration even if kernel doesn't support MSI") [mpe: Flesh out change log and clarify comment] Signed-off-by: Guilherme G. Piccoli <gpiccoli@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-08-25Merge branch 'x86-urgent-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 fix from Thomas Gleixner: "A single fix for a APIC regression introduced in 4.0 which went undetected until now. I screwed up the x2apic cleanup in a subtle way. The screwup is only visible on systems which have x2apic preenabled in the BIOS and need to disable it during boot" * 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/apic: Fix fallout from x2apic cleanup
2015-08-23Merge branch 'upstream' of git://git.linux-mips.org/pub/scm/ralf/upstream-linusLinus Torvalds
Pull MIPS bug fixes from Ralf Baechle: "Two more fixes for 4.2. One fixes a build issue with the LLVM assembler - LLVM assembler macro names are case sensitive, GNU as macro names are insensitive; the other corrects a license string (GPL v2, not GPLv2) such that the module loader will recognice the license correctly" * 'upstream' of git://git.linux-mips.org/pub/scm/ralf/upstream-linus: FIRMWARE: bcm47xx_nvram: Fix module license. MIPS: Fix LLVM build issue.