summaryrefslogtreecommitdiff
path: root/drivers/xen
AgeCommit message (Collapse)Author
2019-02-27xen/pvcalls: remove set but not used variable 'intf'YueHaibing
[ Upstream commit 1f8ce09b36c41a026a37a24b20efa32000892a64 ] Fixes gcc '-Wunused-but-set-variable' warning: drivers/xen/pvcalls-back.c: In function 'pvcalls_sk_state_change': drivers/xen/pvcalls-back.c:286:28: warning: variable 'intf' set but not used [-Wunused-but-set-variable] It not used since e6587cdbd732 ("pvcalls-back: set -ENOTCONN in pvcalls_conn_back_read") Signed-off-by: YueHaibing <yuehaibing@huawei.com> Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-02-27pvcalls-back: set -ENOTCONN in pvcalls_conn_back_readStefano Stabellini
[ Upstream commit e6587cdbd732eacb4c7ce592ed46f7bbcefb655f ] When a connection is closing we receive on pvcalls_sk_state_change notification. Instead of setting the connection as closed immediately (-ENOTCONN), let's read one more time from it: pvcalls_conn_back_read will set the connection as closed when necessary. That way, we avoid races between pvcalls_sk_state_change and pvcalls_back_ioworker. Signed-off-by: Stefano Stabellini <stefanos@xilinx.com> Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-01-31xen: Fix x86 sched_clock() interface for xenJuergen Gross
commit 867cefb4cb1012f42cada1c7d1f35ac8dd276071 upstream. Commit f94c8d11699759 ("sched/clock, x86/tsc: Rework the x86 'unstable' sched_clock() interface") broke Xen guest time handling across migration: [ 187.249951] Freezing user space processes ... (elapsed 0.001 seconds) done. [ 187.251137] OOM killer disabled. [ 187.251137] Freezing remaining freezable tasks ... (elapsed 0.001 seconds) done. [ 187.252299] suspending xenstore... [ 187.266987] xen:grant_table: Grant tables using version 1 layout [18446743811.706476] OOM killer enabled. [18446743811.706478] Restarting tasks ... done. [18446743811.720505] Setting capacity to 16777216 Fix that by setting xen_sched_clock_offset at resume time to ensure a monotonic clock value. [boris: replaced pr_info() with pr_info_once() in xen_callback_vector() to avoid printing with incorrect timestamp during resume (as we haven't re-adjusted the clock yet)] Fixes: f94c8d11699759 ("sched/clock, x86/tsc: Rework the x86 'unstable' sched_clock() interface") Cc: <stable@vger.kernel.org> # 4.11 Reported-by: Hans van Kranenburg <hans.van.kranenburg@mendix.com> Signed-off-by: Juergen Gross <jgross@suse.com> Tested-by: Hans van Kranenburg <hans.van.kranenburg@mendix.com> Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Signed-off-by: Juergen Gross <jgross@suse.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-12-17Revert "xen/balloon: Mark unallocated host memory as UNUSABLE"Igor Druzhinin
[ Upstream commit 123664101aa2156d05251704fc63f9bcbf77741a ] This reverts commit b3cf8528bb21febb650a7ecbf080d0647be40b9f. That commit unintentionally broke Xen balloon memory hotplug with "hotplug_unpopulated" set to 1. As long as "System RAM" resource got assigned under a new "Unusable memory" resource in IO/Mem tree any attempt to online this memory would fail due to general kernel restrictions on having "System RAM" resources as 1st level only. The original issue that commit has tried to workaround fa564ad96366 ("x86/PCI: Enable a 64bit BAR on AMD Family 15h (Models 00-1f, 30-3f, 60-7f)") also got amended by the following 03a551734 ("x86/PCI: Move and shrink AMD 64-bit window to avoid conflict") which made the original fix to Xen ballooning unnecessary. Signed-off-by: Igor Druzhinin <igor.druzhinin@citrix.com> Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Signed-off-by: Juergen Gross <jgross@suse.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2018-12-17xen: xlate_mmu: add missing header to fix 'W=1' warningSrikanth Boddepalli
[ Upstream commit 72791ac854fea36034fa7976b748fde585008e78 ] Add a missing header otherwise compiler warns about missed prototype: drivers/xen/xlate_mmu.c:183:5: warning: no previous prototype for 'xen_xlate_unmap_gfn_range?' [-Wmissing-prototypes] int xen_xlate_unmap_gfn_range(struct vm_area_struct *vma, ^~~~~~~~~~~~~~~~~~~~~~~~~ Signed-off-by: Srikanth Boddepalli <boddepalli.srikanth@gmail.com> Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Reviewed-by: Joey Pabalinas <joeypabalinas@gmail.com> Signed-off-by: Juergen Gross <jgross@suse.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2018-11-13xen/balloon: Support xend-based toolstackBoris Ostrovsky
commit 3aa6c19d2f38be9c6e9a8ad5fa8e3c9d29ee3c35 upstream. Xend-based toolstacks don't have static-max entry in xenstore. The equivalent node for those toolstacks is memory_static_max. Fixes: 5266b8e4445c (xen: fix booting ballooned down hvm guest) Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Cc: <stable@vger.kernel.org> # 4.13 Reviewed-by: Juergen Gross <jgross@suse.com> Signed-off-by: Juergen Gross <jgross@suse.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-11-13xen-swiotlb: use actually allocated size on check physical continuousJoe Jin
commit 7250f422da0480d8512b756640f131b9b893ccda upstream. xen_swiotlb_{alloc,free}_coherent() allocate/free memory based on the order of the pages and not size argument (bytes). This is inconsistent with range_straddles_page_boundary and memset which use the 'size' value, which may lead to not exchanging memory with Xen (range_straddles_page_boundary() returned true). And then the call to xen_swiotlb_free_coherent() would actually try to exchange the memory with Xen, leading to the kernel hitting an BUG (as the hypercall returned an error). This patch fixes it by making the 'size' variable be of the same size as the amount of memory allocated. CC: stable@vger.kernel.org Signed-off-by: Joe Jin <joe.jin@oracle.com> Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com> Cc: Christoph Helwig <hch@lst.de> Cc: Dongli Zhang <dongli.zhang@oracle.com> Cc: John Sobecki <john.sobecki@oracle.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-10-10xen: fix GCC warning and remove duplicate EVTCHN_ROW/EVTCHN_COL usageJosh Abraham
[ Upstream commit 4dca864b59dd150a221730775e2f21f49779c135 ] This patch removes duplicate macro useage in events_base.c. It also fixes gcc warning: variable ‘col’ set but not used [-Wunused-but-set-variable] Signed-off-by: Joshua Abraham <j.abraham1776@gmail.com> Reviewed-by: Juergen Gross <jgross@suse.com> Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-10-10xen: avoid crash in disable_hotplug_cpuOlaf Hering
[ Upstream commit 3366cdb6d350d95466ee430ac50f3c8415ca8f46 ] The command 'xl vcpu-set 0 0', issued in dom0, will crash dom0: BUG: unable to handle kernel NULL pointer dereference at 00000000000002d8 PGD 0 P4D 0 Oops: 0000 [#1] PREEMPT SMP NOPTI CPU: 7 PID: 65 Comm: xenwatch Not tainted 4.19.0-rc2-1.ga9462db-default #1 openSUSE Tumbleweed (unreleased) Hardware name: Intel Corporation S5520UR/S5520UR, BIOS S5500.86B.01.00.0050.050620101605 05/06/2010 RIP: e030:device_offline+0x9/0xb0 Code: 77 24 00 e9 ce fe ff ff 48 8b 13 e9 68 ff ff ff 48 8b 13 e9 29 ff ff ff 48 8b 13 e9 ea fe ff ff 90 66 66 66 66 90 41 54 55 53 <f6> 87 d8 02 00 00 01 0f 85 88 00 00 00 48 c7 c2 20 09 60 81 31 f6 RSP: e02b:ffffc90040f27e80 EFLAGS: 00010203 RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000000 RDX: ffff8801f3800000 RSI: ffffc90040f27e70 RDI: 0000000000000000 RBP: 0000000000000000 R08: ffffffff820e47b3 R09: 0000000000000000 R10: 0000000000007ff0 R11: 0000000000000000 R12: ffffffff822e6d30 R13: dead000000000200 R14: dead000000000100 R15: ffffffff8158b4e0 FS: 00007ffa595158c0(0000) GS:ffff8801f39c0000(0000) knlGS:0000000000000000 CS: e033 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00000000000002d8 CR3: 00000001d9602000 CR4: 0000000000002660 Call Trace: handle_vcpu_hotplug_event+0xb5/0xc0 xenwatch_thread+0x80/0x140 ? wait_woken+0x80/0x80 kthread+0x112/0x130 ? kthread_create_worker_on_cpu+0x40/0x40 ret_from_fork+0x3a/0x50 This happens because handle_vcpu_hotplug_event is called twice. In the first iteration cpu_present is still true, in the second iteration cpu_present is false which causes get_cpu_device to return NULL. In case of cpu#0, cpu_online is apparently always true. Fix this crash by checking if the cpu can be hotplugged, which is false for a cpu that was just removed. Also check if the cpu was actually offlined by device_remove, otherwise leave the cpu_present state as it is. Rearrange to code to do all work with device_hotplug_lock held. Signed-off-by: Olaf Hering <olaf@aepfle.de> Reviewed-by: Juergen Gross <jgross@suse.com> Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-10-10xen/manage: don't complain about an empty value in control/sysrq nodeVitaly Kuznetsov
[ Upstream commit 87dffe86d406bee8782cac2db035acb9a28620a7 ] When guest receives a sysrq request from the host it acknowledges it by writing '\0' to control/sysrq xenstore node. This, however, make xenstore watch fire again but xenbus_scanf() fails to parse empty value with "%c" format string: sysrq: SysRq : Emergency Sync Emergency Sync complete xen:manage: Error -34 reading sysrq code in control/sysrq Ignore -ERANGE the same way we already ignore -ENOENT, empty value in control/sysrq is totally legal. Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Reviewed-by: Wei Liu <wei.liu2@citrix.com> Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-09-15xen/balloon: fix balloon initialization for PVH Dom0Roger Pau Monne
[ Upstream commit 3596924a233e45aa918c961a902170fc4916461b ] The current balloon code tries to calculate a delta factor for the balloon target when running in HVM mode in order to account for memory used by the firmware. This workaround for memory accounting doesn't work properly on a PVH Dom0, that has a static-max value different from the target value even at startup. Note that this is not a problem for DomUs because guests are started with a static-max value that matches the amount of RAM in the memory map. Fix this by forcefully setting target_diff for Dom0, regardless of it's mode. Reported-by: Gabriel Bercarug <bercarug@amazon.com> Signed-off-by: Roger Pau Monné <roger.pau@citrix.com> Reviewed-by: Juergen Gross <jgross@suse.com> Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-08-24xen/scsiback: add error handling for xenbus_printfZhouyang Jia
[ Upstream commit 7c63ca24c878e0051c91904b72174029320ef4bd ] When xenbus_printf fails, the lack of error-handling code may cause unexpected results. This patch adds error-handling code after calling xenbus_printf. Signed-off-by: Zhouyang Jia <jiazhouyang09@gmail.com> Reviewed-by: Juergen Gross <jgross@suse.com> Signed-off-by: Juergen Gross <jgross@suse.com> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-08-24xen: add error handling for xenbus_printfZhouyang Jia
[ Upstream commit 84c029a73327cef571eaa61c7d6e67e8031b52ec ] When xenbus_printf fails, the lack of error-handling code may cause unexpected results. This patch adds error-handling code after calling xenbus_printf. Signed-off-by: Zhouyang Jia <jiazhouyang09@gmail.com> Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Signed-off-by: Juergen Gross <jgross@suse.com> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-07-03xen: Remove unnecessary BUG_ON from __unbind_from_irq()Boris Ostrovsky
commit eef04c7b3786ff0c9cb1019278b6c6c2ea0ad4ff upstream. Commit 910f8befdf5b ("xen/pirq: fix error path cleanup when binding MSIs") fixed a couple of errors in error cleanup path of xen_bind_pirq_msi_to_irq(). This cleanup allowed a call to __unbind_from_irq() with an unbound irq, which would result in triggering the BUG_ON there. Since there is really no reason for the BUG_ON (xen_free_irq() can operate on unbound irqs) we can remove it. Reported-by: Ben Hutchings <ben.hutchings@codethink.co.uk> Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Cc: stable@vger.kernel.org Reviewed-by: Juergen Gross <jgross@suse.com> Signed-off-by: Juergen Gross <jgross@suse.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-06-21xen: xenbus_dev_frontend: Really return response stringSimon Gaiser
[ Upstream commit ebf04f331fa15a966262341a7dc6b1a0efd633e4 ] xenbus_command_reply() did not actually copy the response string and leaked stack content instead. Fixes: 9a6161fe73bd ("xen: return xenstore command failures via response instead of rc") Signed-off-by: Simon Gaiser <simon@invisiblethingslab.com> Reviewed-by: Juergen Gross <jgross@suse.com> Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-05-30xen/acpi: off by one in read_acpi_id()Dan Carpenter
[ Upstream commit c37a3c94775855567b90f91775b9691e10bd2806 ] If acpi_id is == nr_acpi_bits, then we access one element beyond the end of the acpi_psd[] array or we set one bit beyond the end of the bit map when we do __set_bit(acpi_id, acpi_id_present); Fixes: 59a568029181 ("xen/acpi-processor: C and P-state driver that uploads said data to hypervisor.") Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Reviewed-by: Joao Martins <joao.m.martins@oracle.com> Reviewed-by: Juergen Gross <jgross@suse.com> Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-05-30xen: xenbus: use put_device() instead of kfree()Arvind Yadav
[ Upstream commit 351b2bccede1cb673ec7957b35ea997ea24c8884 ] Never directly free @dev after calling device_register(), even if it returned an error! Always use put_device() to give up the reference initialized. Signed-off-by: Arvind Yadav <arvind.yadav.cs@gmail.com> Reviewed-by: Juergen Gross <jgross@suse.com> Signed-off-by: Juergen Gross <jgross@suse.com> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-05-30xen/pirq: fix error path cleanup when binding MSIsRoger Pau Monne
[ Upstream commit 910f8befdf5bccf25287d9f1743e3e546bcb7ce0 ] Current cleanup in the error path of xen_bind_pirq_msi_to_irq is wrong. First of all there's an off-by-one in the cleanup loop, which can lead to unbinding wrong IRQs. Secondly IRQs not bound won't be freed, thus leaking IRQ numbers. Note that there's no need to differentiate between bound and unbound IRQs when freeing them, __unbind_from_irq will deal with both of them correctly. Fixes: 4892c9b4ada9f9 ("xen: add support for MSI message groups") Reported-by: Hooman Mirhadi <mirhadih@amazon.com> Signed-off-by: Roger Pau Monné <roger.pau@citrix.com> Reviewed-by: Amit Shah <aams@amazon.com> Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Signed-off-by: Juergen Gross <jgross@suse.com> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-05-30xen/pvcalls: fix null pointer dereference on map->sockColin Ian King
[ Upstream commit 68d2059be660944152ba667e43c3b4ec225974bc ] Currently if map is null then a potential null pointer deference occurs when calling sock_release on map->sock. I believe the actual intention was to call sock_release on sock instead. Fix this. Fixes: 5db4d286a8ef ("xen/pvcalls: implement connect command") Signed-off-by: Colin Ian King <colin.king@canonical.com> Reviewed-by: Juergen Gross <jgross@suse.com> Signed-off-by: Juergen Gross <jgross@suse.com> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-05-30xen-swiotlb: fix the check condition for xen_swiotlb_free_coherentJoe Jin
commit 4855c92dbb7b3b85c23e88ab7ca04f99b9677b41 upstream. When run raidconfig from Dom0 we found that the Xen DMA heap is reduced, but Dom Heap is increased by the same size. Tracing raidconfig we found that the related ioctl() in megaraid_sas will call dma_alloc_coherent() to apply memory. If the memory allocated by Dom0 is not in the DMA area, it will exchange memory with Xen to meet the requiment. Later drivers call dma_free_coherent() to free the memory, on xen_swiotlb_free_coherent() the check condition (dev_addr + size - 1 <= dma_mask) is always false, it prevents calling xen_destroy_contiguous_region() to return the memory to the Xen DMA heap. This issue introduced by commit 6810df88dcfc2 "xen-swiotlb: When doing coherent alloc/dealloc check before swizzling the MFNs.". Signed-off-by: Joe Jin <joe.jin@oracle.com> Tested-by: John Sobecki <john.sobecki@oracle.com> Reviewed-by: Rzeszutek Wilk <konrad.wilk@oracle.com> Cc: stable@vger.kernel.org Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-04-26xen/grant-table: Use put_page instead of free_pageRoss Lagerwall
[ Upstream commit 3ac7292a25db1c607a50752055a18aba32ac2176 ] The page given to gnttab_end_foreign_access() to free could be a compound page so use put_page() instead of free_page() since it can handle both compound and single pages correctly. This bug was discovered when migrating a Xen VM with several VIFs and CONFIG_DEBUG_VM enabled. It hits a BUG usually after fewer than 10 iterations. All netfront devices disconnect from the backend during a suspend/resume and this will call gnttab_end_foreign_access() if a netfront queue has an outstanding skb. The mismatch between calling get_page() and free_page() on a compound page causes a reference counting error which is detected when DEBUG_VM is enabled. Signed-off-by: Ross Lagerwall <ross.lagerwall@citrix.com> Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Signed-off-by: Juergen Gross <jgross@suse.com> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-04-19xen: xenbus_dev_frontend: Fix XS_TRANSACTION_END handlingSimon Gaiser
commit 2a22ee6c3ab1d761bc9c04f1e4117edd55b82f09 upstream. Commit fd8aa9095a95 ("xen: optimize xenbus driver for multiple concurrent xenstore accesses") made a subtle change to the semantic of xenbus_dev_request_and_reply() and xenbus_transaction_end(). Before on an error response to XS_TRANSACTION_END xenbus_dev_request_and_reply() would not decrement the active transaction counter. But xenbus_transaction_end() has always counted the transaction as finished regardless of the response. The new behavior is that xenbus_dev_request_and_reply() and xenbus_transaction_end() will always count the transaction as finished regardless the response code (handled in xs_request_exit()). But xenbus_dev_frontend tries to end a transaction on closing of the device if the XS_TRANSACTION_END failed before. Trying to close the transaction twice corrupts the reference count. So fix this by also considering a transaction closed if we have sent XS_TRANSACTION_END once regardless of the return code. Cc: <stable@vger.kernel.org> # 4.11 Fixes: fd8aa9095a95 ("xen: optimize xenbus driver for multiple concurrent xenstore accesses") Signed-off-by: Simon Gaiser <simon@invisiblethingslab.com> Reviewed-by: Juergen Gross <jgross@suse.com> Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-03-03xen/gntdev: Fix partial gntdev_mmap() cleanupRoss Lagerwall
[ Upstream commit cf2acf66ad43abb39735568f55e1f85f9844e990 ] When cleaning up after a partially successful gntdev_mmap(), unmap the successfully mapped grant pages otherwise Xen will kill the domain if in debug mode (Attempt to implicitly unmap a granted PTE) or Linux will kill the process and emit "BUG: Bad page map in process" if Xen is in release mode. This is only needed when use_ptemod is true because gntdev_put_map() will unmap grant pages itself when use_ptemod is false. Signed-off-by: Ross Lagerwall <ross.lagerwall@citrix.com> Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-03-03xen/gntdev: Fix off-by-one error when unmapping with holesRoss Lagerwall
[ Upstream commit 951a010233625b77cde3430b4b8785a9a22968d1 ] If the requested range has a hole, the calculation of the number of pages to unmap is off by one. Fix it. Signed-off-by: Ross Lagerwall <ross.lagerwall@citrix.com> Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-03-03xen/balloon: Mark unallocated host memory as UNUSABLEBoris Ostrovsky
[ Upstream commit b3cf8528bb21febb650a7ecbf080d0647be40b9f ] Commit f5775e0b6116 ("x86/xen: discard RAM regions above the maximum reservation") left host memory not assigned to dom0 as available for memory hotplug. Unfortunately this also meant that those regions could be used by others. Specifically, commit fa564ad96366 ("x86/PCI: Enable a 64bit BAR on AMD Family 15h (Models 00-1f, 30-3f, 60-7f)") may try to map those addresses as MMIO. To prevent this mark unallocated host memory as E820_TYPE_UNUSABLE (thus effectively reverting f5775e0b6116) and keep track of that region as a hostmem resource that can be used for the hotplug. Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Reviewed-by: Juergen Gross <jgross@suse.com> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-02-28mm, swap, frontswap: fix THP swap if frontswap enabledHuang Ying
commit 7ba716698cc53f8d5367766c93c538c7da6c68ce upstream. It was reported by Sergey Senozhatsky that if THP (Transparent Huge Page) and frontswap (via zswap) are both enabled, when memory goes low so that swap is triggered, segfault and memory corruption will occur in random user space applications as follow, kernel: urxvt[338]: segfault at 20 ip 00007fc08889ae0d sp 00007ffc73a7fc40 error 6 in libc-2.26.so[7fc08881a000+1ae000] #0 0x00007fc08889ae0d _int_malloc (libc.so.6) #1 0x00007fc08889c2f3 malloc (libc.so.6) #2 0x0000560e6004bff7 _Z14rxvt_wcstoutf8PKwi (urxvt) #3 0x0000560e6005e75c n/a (urxvt) #4 0x0000560e6007d9f1 _ZN16rxvt_perl_interp6invokeEP9rxvt_term9hook_typez (urxvt) #5 0x0000560e6003d988 _ZN9rxvt_term9cmd_parseEv (urxvt) #6 0x0000560e60042804 _ZN9rxvt_term6pty_cbERN2ev2ioEi (urxvt) #7 0x0000560e6005c10f _Z17ev_invoke_pendingv (urxvt) #8 0x0000560e6005cb55 ev_run (urxvt) #9 0x0000560e6003b9b9 main (urxvt) #10 0x00007fc08883af4a __libc_start_main (libc.so.6) #11 0x0000560e6003f9da _start (urxvt) After bisection, it was found the first bad commit is bd4c82c22c36 ("mm, THP, swap: delay splitting THP after swapped out"). The root cause is as follows: When the pages are written to swap device during swapping out in swap_writepage(), zswap (fontswap) is tried to compress the pages to improve performance. But zswap (frontswap) will treat THP as a normal page, so only the head page is saved. After swapping in, tail pages will not be restored to their original contents, causing memory corruption in the applications. This is fixed by refusing to save page in the frontswap store functions if the page is a THP. So that the THP will be swapped out to swap device. Another choice is to split THP if frontswap is enabled. But it is found that the frontswap enabling isn't flexible. For example, if CONFIG_ZSWAP=y (cannot be module), frontswap will be enabled even if zswap itself isn't enabled. Frontswap has multiple backends, to make it easy for one backend to enable THP support, the THP checking is put in backend frontswap store functions instead of the general interfaces. Link: http://lkml.kernel.org/r/20180209084947.22749-1-ying.huang@intel.com Fixes: bd4c82c22c367e068 ("mm, THP, swap: delay splitting THP after swapped out") Signed-off-by: "Huang, Ying" <ying.huang@intel.com> Reported-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com> Tested-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com> Suggested-by: Minchan Kim <minchan@kernel.org> [put THP checking in backend] Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Cc: Dan Streetman <ddstreet@ieee.org> Cc: Seth Jennings <sjenning@redhat.com> Cc: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Cc: Shaohua Li <shli@kernel.org> Cc: Michal Hocko <mhocko@suse.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Mel Gorman <mgorman@techsingularity.net> Cc: Shakeel Butt <shakeelb@google.com> Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com> Cc: Juergen Gross <jgross@suse.com> Cc: <stable@vger.kernel.org> [4.14] Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-02-25xen: XEN_ACPI_PROCESSOR is Dom0-onlyJan Beulich
[ Upstream commit c4f9d9cb2c29ff04c6b4bb09b72802d8aedfc7cb ] Add a respective dependency. Signed-off-by: Jan Beulich <jbeulich@suse.com> Reviewed-by: Juergen Gross <jgross@suse.com> Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-02-22xenbus: track caller request idJoao Martins
commit 29fee6eed2811ff1089b30fc579a2d19d78016ab upstream. Commit fd8aa9095a95 ("xen: optimize xenbus driver for multiple concurrent xenstore accesses") optimized xenbus concurrent accesses but in doing so broke UABI of /dev/xen/xenbus. Through /dev/xen/xenbus applications are in charge of xenbus message exchange with the correct header and body. Now, after the mentioned commit the replies received by application will no longer have the header req_id echoed back as it was on request (see specification below for reference), because that particular field is being overwritten by kernel. struct xsd_sockmsg { uint32_t type; /* XS_??? */ uint32_t req_id;/* Request identifier, echoed in daemon's response. */ uint32_t tx_id; /* Transaction id (0 if not related to a transaction). */ uint32_t len; /* Length of data following this. */ /* Generally followed by nul-terminated string(s). */ }; Before there was only one request at a time so req_id could simply be forwarded back and forth. To allow simultaneous requests we need a different req_id for each message thus kernel keeps a monotonic increasing counter for this field and is written on every request irrespective of userspace value. Forwarding again the req_id on userspace requests is not a solution because we would open the possibility of userspace-generated req_id colliding with kernel ones. So this patch instead takes another route which is to artificially keep user req_id while keeping the xenbus logic as is. We do that by saving the original req_id before xs_send(), use the private kernel counter as req_id and then once reply comes and was validated, we restore back the original req_id. Cc: <stable@vger.kernel.org> # 4.11 Fixes: fd8aa9095a ("xen: optimize xenbus driver for multiple concurrent xenstore accesses") Reported-by: Bhavesh Davda <bhavesh.davda@oracle.com> Signed-off-by: Joao Martins <joao.m.martins@oracle.com> Reviewed-by: Juergen Gross <jgross@suse.com> Signed-off-by: Juergen Gross <jgross@suse.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-11-02Merge tag 'spdx_identifiers-4.14-rc8' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core Pull initial SPDX identifiers from Greg KH: "License cleanup: add SPDX license identifiers to some files Many source files in the tree are missing licensing information, which makes it harder for compliance tools to determine the correct license. By default all files without license information are under the default license of the kernel, which is GPL version 2. Update the files which contain no license information with the 'GPL-2.0' SPDX license identifier. The SPDX identifier is a legally binding shorthand, which can be used instead of the full boiler plate text. This patch is based on work done by Thomas Gleixner and Kate Stewart and Philippe Ombredanne. How this work was done: Patches were generated and checked against linux-4.14-rc6 for a subset of the use cases: - file had no licensing information it it. - file was a */uapi/* one with no licensing information in it, - file was a */uapi/* one with existing licensing information, Further patches will be generated in subsequent months to fix up cases where non-standard license headers were used, and references to license had to be inferred by heuristics based on keywords. The analysis to determine which SPDX License Identifier to be applied to a file was done in a spreadsheet of side by side results from of the output of two independent scanners (ScanCode & Windriver) producing SPDX tag:value files created by Philippe Ombredanne. Philippe prepared the base worksheet, and did an initial spot review of a few 1000 files. The 4.13 kernel was the starting point of the analysis with 60,537 files assessed. Kate Stewart did a file by file comparison of the scanner results in the spreadsheet to determine which SPDX license identifier(s) to be applied to the file. She confirmed any determination that was not immediately clear with lawyers working with the Linux Foundation. Criteria used to select files for SPDX license identifier tagging was: - Files considered eligible had to be source code files. - Make and config files were included as candidates if they contained >5 lines of source - File already had some variant of a license header in it (even if <5 lines). All documentation files were explicitly excluded. The following heuristics were used to determine which SPDX license identifiers to apply. - when both scanners couldn't find any license traces, file was considered to have no license information in it, and the top level COPYING file license applied. For non */uapi/* files that summary was: SPDX license identifier # files ---------------------------------------------------|------- GPL-2.0 11139 and resulted in the first patch in this series. If that file was a */uapi/* path one, it was "GPL-2.0 WITH Linux-syscall-note" otherwise it was "GPL-2.0". Results of that was: SPDX license identifier # files ---------------------------------------------------|------- GPL-2.0 WITH Linux-syscall-note 930 and resulted in the second patch in this series. - if a file had some form of licensing information in it, and was one of the */uapi/* ones, it was denoted with the Linux-syscall-note if any GPL family license was found in the file or had no licensing in it (per prior point). Results summary: SPDX license identifier # files ---------------------------------------------------|------ GPL-2.0 WITH Linux-syscall-note 270 GPL-2.0+ WITH Linux-syscall-note 169 ((GPL-2.0 WITH Linux-syscall-note) OR BSD-2-Clause) 21 ((GPL-2.0 WITH Linux-syscall-note) OR BSD-3-Clause) 17 LGPL-2.1+ WITH Linux-syscall-note 15 GPL-1.0+ WITH Linux-syscall-note 14 ((GPL-2.0+ WITH Linux-syscall-note) OR BSD-3-Clause) 5 LGPL-2.0+ WITH Linux-syscall-note 4 LGPL-2.1 WITH Linux-syscall-note 3 ((GPL-2.0 WITH Linux-syscall-note) OR MIT) 3 ((GPL-2.0 WITH Linux-syscall-note) AND MIT) 1 and that resulted in the third patch in this series. - when the two scanners agreed on the detected license(s), that became the concluded license(s). - when there was disagreement between the two scanners (one detected a license but the other didn't, or they both detected different licenses) a manual inspection of the file occurred. - In most cases a manual inspection of the information in the file resulted in a clear resolution of the license that should apply (and which scanner probably needed to revisit its heuristics). - When it was not immediately clear, the license identifier was confirmed with lawyers working with the Linux Foundation. - If there was any question as to the appropriate license identifier, the file was flagged for further research and to be revisited later in time. In total, over 70 hours of logged manual review was done on the spreadsheet to determine the SPDX license identifiers to apply to the source files by Kate, Philippe, Thomas and, in some cases, confirmation by lawyers working with the Linux Foundation. Kate also obtained a third independent scan of the 4.13 code base from FOSSology, and compared selected files where the other two scanners disagreed against that SPDX file, to see if there was new insights. The Windriver scanner is based on an older version of FOSSology in part, so they are related. Thomas did random spot checks in about 500 files from the spreadsheets for the uapi headers and agreed with SPDX license identifier in the files he inspected. For the non-uapi files Thomas did random spot checks in about 15000 files. In initial set of patches against 4.14-rc6, 3 files were found to have copy/paste license identifier errors, and have been fixed to reflect the correct identifier. Additionally Philippe spent 10 hours this week doing a detailed manual inspection and review of the 12,461 patched files from the initial patch version early this week with: - a full scancode scan run, collecting the matched texts, detected license ids and scores - reviewing anything where there was a license detected (about 500+ files) to ensure that the applied SPDX license was correct - reviewing anything where there was no detection but the patch license was not GPL-2.0 WITH Linux-syscall-note to ensure that the applied SPDX license was correct This produced a worksheet with 20 files needing minor correction. This worksheet was then exported into 3 different .csv files for the different types of files to be modified. These .csv files were then reviewed by Greg. Thomas wrote a script to parse the csv files and add the proper SPDX tag to the file, in the format that the file expected. This script was further refined by Greg based on the output to detect more types of files automatically and to distinguish between header and source .c files (which need different comment types.) Finally Greg ran the script using the .csv files to generate the patches. Reviewed-by: Kate Stewart <kstewart@linuxfoundation.org> Reviewed-by: Philippe Ombredanne <pombredanne@nexb.com> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>" * tag 'spdx_identifiers-4.14-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core: License cleanup: add SPDX license identifier to uapi header files with a license License cleanup: add SPDX license identifier to uapi header files with no license License cleanup: add SPDX GPL-2.0 license identifier to files with no license
2017-11-02License cleanup: add SPDX GPL-2.0 license identifier to files with no licenseGreg Kroah-Hartman
Many source files in the tree are missing licensing information, which makes it harder for compliance tools to determine the correct license. By default all files without license information are under the default license of the kernel, which is GPL version 2. Update the files which contain no license information with the 'GPL-2.0' SPDX license identifier. The SPDX identifier is a legally binding shorthand, which can be used instead of the full boiler plate text. This patch is based on work done by Thomas Gleixner and Kate Stewart and Philippe Ombredanne. How this work was done: Patches were generated and checked against linux-4.14-rc6 for a subset of the use cases: - file had no licensing information it it. - file was a */uapi/* one with no licensing information in it, - file was a */uapi/* one with existing licensing information, Further patches will be generated in subsequent months to fix up cases where non-standard license headers were used, and references to license had to be inferred by heuristics based on keywords. The analysis to determine which SPDX License Identifier to be applied to a file was done in a spreadsheet of side by side results from of the output of two independent scanners (ScanCode & Windriver) producing SPDX tag:value files created by Philippe Ombredanne. Philippe prepared the base worksheet, and did an initial spot review of a few 1000 files. The 4.13 kernel was the starting point of the analysis with 60,537 files assessed. Kate Stewart did a file by file comparison of the scanner results in the spreadsheet to determine which SPDX license identifier(s) to be applied to the file. She confirmed any determination that was not immediately clear with lawyers working with the Linux Foundation. Criteria used to select files for SPDX license identifier tagging was: - Files considered eligible had to be source code files. - Make and config files were included as candidates if they contained >5 lines of source - File already had some variant of a license header in it (even if <5 lines). All documentation files were explicitly excluded. The following heuristics were used to determine which SPDX license identifiers to apply. - when both scanners couldn't find any license traces, file was considered to have no license information in it, and the top level COPYING file license applied. For non */uapi/* files that summary was: SPDX license identifier # files ---------------------------------------------------|------- GPL-2.0 11139 and resulted in the first patch in this series. If that file was a */uapi/* path one, it was "GPL-2.0 WITH Linux-syscall-note" otherwise it was "GPL-2.0". Results of that was: SPDX license identifier # files ---------------------------------------------------|------- GPL-2.0 WITH Linux-syscall-note 930 and resulted in the second patch in this series. - if a file had some form of licensing information in it, and was one of the */uapi/* ones, it was denoted with the Linux-syscall-note if any GPL family license was found in the file or had no licensing in it (per prior point). Results summary: SPDX license identifier # files ---------------------------------------------------|------ GPL-2.0 WITH Linux-syscall-note 270 GPL-2.0+ WITH Linux-syscall-note 169 ((GPL-2.0 WITH Linux-syscall-note) OR BSD-2-Clause) 21 ((GPL-2.0 WITH Linux-syscall-note) OR BSD-3-Clause) 17 LGPL-2.1+ WITH Linux-syscall-note 15 GPL-1.0+ WITH Linux-syscall-note 14 ((GPL-2.0+ WITH Linux-syscall-note) OR BSD-3-Clause) 5 LGPL-2.0+ WITH Linux-syscall-note 4 LGPL-2.1 WITH Linux-syscall-note 3 ((GPL-2.0 WITH Linux-syscall-note) OR MIT) 3 ((GPL-2.0 WITH Linux-syscall-note) AND MIT) 1 and that resulted in the third patch in this series. - when the two scanners agreed on the detected license(s), that became the concluded license(s). - when there was disagreement between the two scanners (one detected a license but the other didn't, or they both detected different licenses) a manual inspection of the file occurred. - In most cases a manual inspection of the information in the file resulted in a clear resolution of the license that should apply (and which scanner probably needed to revisit its heuristics). - When it was not immediately clear, the license identifier was confirmed with lawyers working with the Linux Foundation. - If there was any question as to the appropriate license identifier, the file was flagged for further research and to be revisited later in time. In total, over 70 hours of logged manual review was done on the spreadsheet to determine the SPDX license identifiers to apply to the source files by Kate, Philippe, Thomas and, in some cases, confirmation by lawyers working with the Linux Foundation. Kate also obtained a third independent scan of the 4.13 code base from FOSSology, and compared selected files where the other two scanners disagreed against that SPDX file, to see if there was new insights. The Windriver scanner is based on an older version of FOSSology in part, so they are related. Thomas did random spot checks in about 500 files from the spreadsheets for the uapi headers and agreed with SPDX license identifier in the files he inspected. For the non-uapi files Thomas did random spot checks in about 15000 files. In initial set of patches against 4.14-rc6, 3 files were found to have copy/paste license identifier errors, and have been fixed to reflect the correct identifier. Additionally Philippe spent 10 hours this week doing a detailed manual inspection and review of the 12,461 patched files from the initial patch version early this week with: - a full scancode scan run, collecting the matched texts, detected license ids and scores - reviewing anything where there was a license detected (about 500+ files) to ensure that the applied SPDX license was correct - reviewing anything where there was no detection but the patch license was not GPL-2.0 WITH Linux-syscall-note to ensure that the applied SPDX license was correct This produced a worksheet with 20 files needing minor correction. This worksheet was then exported into 3 different .csv files for the different types of files to be modified. These .csv files were then reviewed by Greg. Thomas wrote a script to parse the csv files and add the proper SPDX tag to the file, in the format that the file expected. This script was further refined by Greg based on the output to detect more types of files automatically and to distinguish between header and source .c files (which need different comment types.) Finally Greg ran the script using the .csv files to generate the patches. Reviewed-by: Kate Stewart <kstewart@linuxfoundation.org> Reviewed-by: Philippe Ombredanne <pombredanne@nexb.com> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-10-27Merge tag 'for-linus-4.14c-rc7-tag' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip Pull xen fixes from Juergen Gross: - a fix for the Xen gntdev device repairing an issue in case of partial failure of mapping multiple pages of another domain - a fix of a regression in the Xen balloon driver introduced in 4.13 - a build fix for Xen on ARM which will trigger e.g. for Linux RT - a maintainers update for pvops (not really Xen, but carrying through this tree just for convenience) * tag 'for-linus-4.14c-rc7-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip: maintainers: drop Chris Wright from pvops arm/xen: don't inclide rwlock.h directly. xen: fix booting ballooned down hvm guest xen/gntdev: avoid out of bounds access in case of partial gntdev_mmap()
2017-10-26xen: fix booting ballooned down hvm guestJuergen Gross
Commit 96edd61dcf44362d3ef0bed1a5361e0ac7886a63 ("xen/balloon: don't online new memory initially") introduced a regression when booting a HVM domain with memory less than mem-max: instead of ballooning down immediately the system would try to use the memory up to mem-max resulting in Xen crashing the domain. For HVM domains the current size will be reflected in Xenstore node memory/static-max instead of memory/target. Additionally we have to trigger the ballooning process at once. Cc: <stable@vger.kernel.org> # 4.13 Fixes: 96edd61dcf44362d3ef0bed1a5361e0ac7886a63 ("xen/balloon: don't online new memory initially") Reported-by: Simon Gaiser <hw42@ipsumj.de> Suggested-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Signed-off-by: Juergen Gross <jgross@suse.com> Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
2017-10-25xen/gntdev: avoid out of bounds access in case of partial gntdev_mmap()Juergen Gross
In case gntdev_mmap() succeeds only partially in mapping grant pages it will leave some vital information uninitialized needed later for cleanup. This will lead to an out of bounds array access when unmapping the already mapped pages. So just initialize the data needed for unmapping the pages a little bit earlier. Cc: <stable@vger.kernel.org> Reported-by: Arthur Borsboom <arthurborsboom@gmail.com> Signed-off-by: Juergen Gross <jgross@suse.com> Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
2017-09-29Merge tag 'for-linus-4.14c-rc3-tag' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip Pull xen fixes from Juergen Gross: - avoid a warning when compiling with clang - consider read-only bits in xen-pciback when writing to a BAR - fix a boot crash of pv-domains * tag 'for-linus-4.14c-rc3-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip: xen/mmu: Call xen_cleanhighmap() with 4MB aligned for page tables mapping xen-pciback: relax BAR sizing write value check x86/xen: clean up clang build warning
2017-09-28xen-pciback: relax BAR sizing write value checkJan Beulich
Just like done in d2bd05d88d ("xen-pciback: return proper values during BAR sizing") for the ROM BAR, ordinary ones also shouldn't compare the written value directly against ~0, but consider the r/o bits at the bottom (if any). Signed-off-by: Jan Beulich <jbeulich@suse.com> Reviewed-by: Juergen Gross <jgross@suse.com> Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
2017-09-22Merge tag 'for-linus-4.14b-rc2-tag' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip Pull xen fixes from Juergen Gross: "A fix for a missing __init annotation and two cleanup patches" * tag 'for-linus-4.14b-rc2-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip: xen, arm64: drop dummy lookup_address() xen: don't compile pv-specific parts if XEN_PV isn't configured xen: x86: mark xen_find_pt_base as __init
2017-09-18xen: don't compile pv-specific parts if XEN_PV isn't configuredJuergen Gross
xenbus_client.c contains some functions specific for pv guests. Enclose them with #ifdef CONFIG_XEN_PV to avoid compiling them when they are not needed (e.g. on ARM). Signed-off-by: Juergen Gross <jgross@suse.com> Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
2017-09-13mm: treewide: remove GFP_TEMPORARY allocation flagMichal Hocko
GFP_TEMPORARY was introduced by commit e12ba74d8ff3 ("Group short-lived and reclaimable kernel allocations") along with __GFP_RECLAIMABLE. It's primary motivation was to allow users to tell that an allocation is short lived and so the allocator can try to place such allocations close together and prevent long term fragmentation. As much as this sounds like a reasonable semantic it becomes much less clear when to use the highlevel GFP_TEMPORARY allocation flag. How long is temporary? Can the context holding that memory sleep? Can it take locks? It seems there is no good answer for those questions. The current implementation of GFP_TEMPORARY is basically GFP_KERNEL | __GFP_RECLAIMABLE which in itself is tricky because basically none of the existing caller provide a way to reclaim the allocated memory. So this is rather misleading and hard to evaluate for any benefits. I have checked some random users and none of them has added the flag with a specific justification. I suspect most of them just copied from other existing users and others just thought it might be a good idea to use without any measuring. This suggests that GFP_TEMPORARY just motivates for cargo cult usage without any reasoning. I believe that our gfp flags are quite complex already and especially those with highlevel semantic should be clearly defined to prevent from confusion and abuse. Therefore I propose dropping GFP_TEMPORARY and replace all existing users to simply use GFP_KERNEL. Please note that SLAB users with shrinkers will still get __GFP_RECLAIMABLE heuristic and so they will be placed properly for memory fragmentation prevention. I can see reasons we might want some gfp flag to reflect shorterm allocations but I propose starting from a clear semantic definition and only then add users with proper justification. This was been brought up before LSF this year by Matthew [1] and it turned out that GFP_TEMPORARY really doesn't have a clear semantic. It seems to be a heuristic without any measured advantage for most (if not all) its current users. The follow up discussion has revealed that opinions on what might be temporary allocation differ a lot between developers. So rather than trying to tweak existing users into a semantic which they haven't expected I propose to simply remove the flag and start from scratch if we really need a semantic for short term allocations. [1] http://lkml.kernel.org/r/20170118054945.GD18349@bombadil.infradead.org [akpm@linux-foundation.org: fix typo] [akpm@linux-foundation.org: coding-style fixes] [sfr@canb.auug.org.au: drm/i915: fix up] Link: http://lkml.kernel.org/r/20170816144703.378d4f4d@canb.auug.org.au Link: http://lkml.kernel.org/r/20170728091904.14627-1-mhocko@kernel.org Signed-off-by: Michal Hocko <mhocko@suse.com> Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au> Acked-by: Mel Gorman <mgorman@suse.de> Acked-by: Vlastimil Babka <vbabka@suse.cz> Cc: Matthew Wilcox <willy@infradead.org> Cc: Neil Brown <neilb@suse.de> Cc: "Theodore Ts'o" <tytso@mit.edu> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-09-07Merge tag 'for-linus-4.14b-rc1-tag' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip Pull xen updates from Juergen Gross: - the new pvcalls backend for routing socket calls from a guest to dom0 - some cleanups of Xen code - a fix for wrong usage of {get,put}_cpu() * tag 'for-linus-4.14b-rc1-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip: (27 commits) xen/mmu: set MMU_NORMAL_PT_UPDATE in remap_area_mfn_pte_fn xen: Don't try to call xen_alloc_p2m_entry() on autotranslating guests xen/events: events_fifo: Don't use {get,put}_cpu() in xen_evtchn_fifo_init() xen/pvcalls: use WARN_ON(1) instead of __WARN() xen: remove not used trace functions xen: remove unused function xen_set_domain_pte() xen: remove tests for pvh mode in pure pv paths xen-platform: constify pci_device_id. xen: cleanup xen.h xen: introduce a Kconfig option to enable the pvcalls backend xen/pvcalls: implement write xen/pvcalls: implement read xen/pvcalls: implement the ioworker functions xen/pvcalls: disconnect and module_exit xen/pvcalls: implement release command xen/pvcalls: implement poll command xen/pvcalls: implement accept command xen/pvcalls: implement listen command xen/pvcalls: implement bind command xen/pvcalls: implement connect command ...
2017-09-05Merge tag 'driver-core-4.14-rc1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core Pull driver core update from Greg KH: "Here is the "big" driver core update for 4.14-rc1. It's really not all that big, the largest thing here being some firmware tests to help ensure that that crazy api is working properly. There's also a new uevent for when a driver is bound or unbound from a device, fixing a hole in the driver model that's been there since the very beginning. Many thanks to Dmitry for being persistent and pointing out how wrong I was about this all along :) Patches for the new uevents are already in the systemd tree, if people want to play around with them. Otherwise just a number of other small api changes and updates here, nothing major. All of these patches have been in linux-next for a while with no reported issues" * tag 'driver-core-4.14-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core: (28 commits) driver core: bus: Fix a potential double free Do not disable driver and bus shutdown hook when class shutdown hook is set. base: topology: constify attribute_group structures. base: Convert to using %pOF instead of full_name kernfs: Clarify lockdep name for kn->count fbdev: uvesafb: remove DRIVER_ATTR() usage xen: xen-pciback: remove DRIVER_ATTR() usage driver core: Document struct device:dma_ops mod_devicetable: Remove excess description from structured comment test_firmware: add batched firmware tests firmware: enable a debug print for batched requests firmware: define pr_fmt firmware: send -EINTR on signal abort on fallback mechanism test_firmware: add test case for SIGCHLD on sync fallback initcall_debug: add deferred probe times Input: axp20x-pek - switch to using devm_device_add_group() Input: synaptics_rmi4 - use devm_device_add_group() for attributes in F01 Input: gpio_keys - use devm_device_add_group() for attributes driver core: add devm_device_add_group() and friends driver core: add device_{add|remove}_group() helpers ...
2017-09-04Merge branch 'x86-apic-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 apic updates from Thomas Gleixner: "This update provides: - Cleanup of the IDT management including the removal of the extra tracing IDT. A first step to cleanup the vector management code. - The removal of the paravirt op adjust_exception_frame. This is a XEN specific issue, but merged through this branch to avoid nasty merge collisions - Prevent dmesg spam about the TSC DEADLINE bug, when the CPU has disabled the TSC DEADLINE timer in CPUID. - Adjust a debug message in the ioapic code to print out the information correctly" * 'x86-apic-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (51 commits) x86/idt: Fix the X86_TRAP_BP gate x86/xen: Get rid of paravirt op adjust_exception_frame x86/eisa: Add missing include x86/idt: Remove superfluous ALIGNment x86/apic: Silence "FW_BUG TSC_DEADLINE disabled due to Errata" on CPUs without the feature x86/idt: Remove the tracing IDT leftovers x86/idt: Hide set_intr_gate() x86/idt: Simplify alloc_intr_gate() x86/idt: Deinline setup functions x86/idt: Remove unused functions/inlines x86/idt: Move interrupt gate initialization to IDT code x86/idt: Move APIC gate initialization to tables x86/idt: Move regular trap init to tables x86/idt: Move IST stack based traps to table init x86/idt: Move debug stack init to table based x86/idt: Switch early trap init to IDT tables x86/idt: Prepare for table based init x86/idt: Move early IDT setup out of 32-bit asm x86/idt: Move early IDT handler setup to IDT code x86/idt: Consolidate IDT invalidation ...
2017-08-31xen/gntdev: update to new mmu_notifier semanticJérôme Glisse
Calls to mmu_notifier_invalidate_page() were replaced by calls to mmu_notifier_invalidate_range() and are now bracketed by calls to mmu_notifier_invalidate_range_start()/end() Remove now useless invalidate_page callback. Signed-off-by: Jérôme Glisse <jglisse@redhat.com> Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Cc: Roger Pau Monné <roger.pau@citrix.com> Cc: xen-devel@lists.xenproject.org (moderated for non-subscribers) Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Andrea Arcangeli <aarcange@redhat.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-08-31xen: Don't try to call xen_alloc_p2m_entry() on autotranslating guestsBoris Ostrovsky
Commit aba831a69632 ("xen: remove tests for pvh mode in pure pv paths") removed XENFEAT_auto_translated_physmap test in xen_alloc_p2m_entry() since it is assumed that the routine is never called by non-PV guests. However, alloc_xenballooned_pages() may make this call on a PVH guest. Prevent this from happening by adding XENFEAT_auto_translated_physmap check there. Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Reviewed-by: Juergen Gross <jgross@suse.com> Fixes: aba831a69632 ("xen: remove tests for pvh mode in pure pv paths")
2017-08-31xen/events: events_fifo: Don't use {get,put}_cpu() in xen_evtchn_fifo_init()Julien Grall
When booting Linux as Xen guest with CONFIG_DEBUG_ATOMIC, the following splat appears: [ 0.002323] Mountpoint-cache hash table entries: 1024 (order: 1, 8192 bytes) [ 0.019717] ASID allocator initialised with 65536 entries [ 0.020019] xen:grant_table: Grant tables using version 1 layout [ 0.020051] Grant table initialized [ 0.020069] BUG: sleeping function called from invalid context at /data/src/linux/mm/page_alloc.c:4046 [ 0.020100] in_atomic(): 1, irqs_disabled(): 0, pid: 1, name: swapper/0 [ 0.020123] no locks held by swapper/0/1. [ 0.020143] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 4.13.0-rc5 #598 [ 0.020166] Hardware name: FVP Base (DT) [ 0.020182] Call trace: [ 0.020199] [<ffff00000808a5c0>] dump_backtrace+0x0/0x270 [ 0.020222] [<ffff00000808a95c>] show_stack+0x24/0x30 [ 0.020244] [<ffff000008c1ef20>] dump_stack+0xb8/0xf0 [ 0.020267] [<ffff0000081128c0>] ___might_sleep+0x1c8/0x1f8 [ 0.020291] [<ffff000008112948>] __might_sleep+0x58/0x90 [ 0.020313] [<ffff0000082171b8>] __alloc_pages_nodemask+0x1c0/0x12e8 [ 0.020338] [<ffff00000827a110>] alloc_page_interleave+0x38/0x88 [ 0.020363] [<ffff00000827a904>] alloc_pages_current+0xdc/0xf0 [ 0.020387] [<ffff000008211f38>] __get_free_pages+0x28/0x50 [ 0.020411] [<ffff0000086566a4>] evtchn_fifo_alloc_control_block+0x2c/0xa0 [ 0.020437] [<ffff0000091747b0>] xen_evtchn_fifo_init+0x38/0xb4 [ 0.020461] [<ffff0000091746c0>] xen_init_IRQ+0x44/0xc8 [ 0.020484] [<ffff000009128adc>] xen_guest_init+0x250/0x300 [ 0.020507] [<ffff000008083974>] do_one_initcall+0x44/0x130 [ 0.020531] [<ffff000009120df8>] kernel_init_freeable+0x120/0x288 [ 0.020556] [<ffff000008c31ca8>] kernel_init+0x18/0x110 [ 0.020578] [<ffff000008083710>] ret_from_fork+0x10/0x40 [ 0.020606] xen:events: Using FIFO-based ABI [ 0.020658] Xen: initializing cpu0 [ 0.027727] Hierarchical SRCU implementation. [ 0.036235] EFI services will not be available. [ 0.043810] smp: Bringing up secondary CPUs ... This is because get_cpu() in xen_evtchn_fifo_init() will disable preemption, but __get_free_page() might sleep (GFP_ATOMIC is not set). xen_evtchn_fifo_init() will always be called before SMP is initialized, so {get,put}_cpu() could be replaced by a simple smp_processor_id(). This also avoid to modify evtchn_fifo_alloc_control_block that will be called in other context. Signed-off-by: Julien Grall <julien.grall@arm.com> Reported-by: Andre Przywara <andre.przywara@arm.com> Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Fixes: 1fe565517b57 ("xen/events: use the FIFO-based ABI if available") Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
2017-08-31xen/pvcalls: use WARN_ON(1) instead of __WARN()Arnd Bergmann
__WARN() is an internal helper that is only available on some architectures, but causes a build error e.g. on ARM64 in some configurations: drivers/xen/pvcalls-back.c: In function 'set_backend_state': drivers/xen/pvcalls-back.c:1097:5: error: implicit declaration of function '__WARN' [-Werror=implicit-function-declaration] Unfortunately, there is no equivalent of BUG() that takes no arguments, but WARN_ON(1) is commonly used in other drivers and works on all configurations. Fixes: 7160378206b2 ("xen/pvcalls: xenbus state handling") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Reviewed-by: Stefano Stabellini <sstabellini@kernel.org> Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
2017-08-31xen-platform: constify pci_device_id.Arvind Yadav
pci_device_id are not supposed to change at runtime. All functions working with pci_device_id provided by <linux/pci.h> work with const pci_device_id. So mark the non-const structs as const. Signed-off-by: Arvind Yadav <arvind.yadav.cs@gmail.com> Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
2017-08-31xen: introduce a Kconfig option to enable the pvcalls backendStefano Stabellini
Also add pvcalls-back to the Makefile. Signed-off-by: Stefano Stabellini <stefano@aporeto.com> Reviewed-by: Juergen Gross <jgross@suse.com> CC: boris.ostrovsky@oracle.com CC: jgross@suse.com Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
2017-08-31xen/pvcalls: implement writeStefano Stabellini
When the other end notifies us that there is data to be written (pvcalls_back_conn_event), increment the io and write counters, and schedule the ioworker. Implement the write function called by ioworker by reading the data from the data ring, writing it to the socket by calling inet_sendmsg. Set out_error on error. Signed-off-by: Stefano Stabellini <stefano@aporeto.com> Reviewed-by: Juergen Gross <jgross@suse.com> CC: boris.ostrovsky@oracle.com CC: jgross@suse.com Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
2017-08-31xen/pvcalls: implement readStefano Stabellini
When an active socket has data available, increment the io and read counters, and schedule the ioworker. Implement the read function by reading from the socket, writing the data to the data ring. Set in_error on error. Signed-off-by: Stefano Stabellini <stefano@aporeto.com> Reviewed-by: Juergen Gross <jgross@suse.com> CC: boris.ostrovsky@oracle.com CC: jgross@suse.com Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
2017-08-31xen/pvcalls: implement the ioworker functionsStefano Stabellini
We have one ioworker per socket. Each ioworker goes through the list of outstanding read/write requests. Once all requests have been dealt with, it returns. We use one atomic counter per socket for "read" operations and one for "write" operations to keep track of the reads/writes to do. We also use one atomic counter ("io") per ioworker to keep track of how many outstanding requests we have in total assigned to the ioworker. The ioworker finishes when there are none. Signed-off-by: Stefano Stabellini <stefano@aporeto.com> Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Reviewed-by: Juergen Gross <jgross@suse.com> CC: boris.ostrovsky@oracle.com CC: jgross@suse.com Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>