summaryrefslogtreecommitdiff
path: root/arch/sparc/kernel
AgeCommit message (Collapse)Author
2017-07-12mm, tree wide: replace __GFP_REPEAT by __GFP_RETRY_MAYFAIL with more useful ↵Michal Hocko
semantic __GFP_REPEAT was designed to allow retry-but-eventually-fail semantic to the page allocator. This has been true but only for allocations requests larger than PAGE_ALLOC_COSTLY_ORDER. It has been always ignored for smaller sizes. This is a bit unfortunate because there is no way to express the same semantic for those requests and they are considered too important to fail so they might end up looping in the page allocator for ever, similarly to GFP_NOFAIL requests. Now that the whole tree has been cleaned up and accidental or misled usage of __GFP_REPEAT flag has been removed for !costly requests we can give the original flag a better name and more importantly a more useful semantic. Let's rename it to __GFP_RETRY_MAYFAIL which tells the user that the allocator would try really hard but there is no promise of a success. This will work independent of the order and overrides the default allocator behavior. Page allocator users have several levels of guarantee vs. cost options (take GFP_KERNEL as an example) - GFP_KERNEL & ~__GFP_RECLAIM - optimistic allocation without _any_ attempt to free memory at all. The most light weight mode which even doesn't kick the background reclaim. Should be used carefully because it might deplete the memory and the next user might hit the more aggressive reclaim - GFP_KERNEL & ~__GFP_DIRECT_RECLAIM (or GFP_NOWAIT)- optimistic allocation without any attempt to free memory from the current context but can wake kswapd to reclaim memory if the zone is below the low watermark. Can be used from either atomic contexts or when the request is a performance optimization and there is another fallback for a slow path. - (GFP_KERNEL|__GFP_HIGH) & ~__GFP_DIRECT_RECLAIM (aka GFP_ATOMIC) - non sleeping allocation with an expensive fallback so it can access some portion of memory reserves. Usually used from interrupt/bh context with an expensive slow path fallback. - GFP_KERNEL - both background and direct reclaim are allowed and the _default_ page allocator behavior is used. That means that !costly allocation requests are basically nofail but there is no guarantee of that behavior so failures have to be checked properly by callers (e.g. OOM killer victim is allowed to fail currently). - GFP_KERNEL | __GFP_NORETRY - overrides the default allocator behavior and all allocation requests fail early rather than cause disruptive reclaim (one round of reclaim in this implementation). The OOM killer is not invoked. - GFP_KERNEL | __GFP_RETRY_MAYFAIL - overrides the default allocator behavior and all allocation requests try really hard. The request will fail if the reclaim cannot make any progress. The OOM killer won't be triggered. - GFP_KERNEL | __GFP_NOFAIL - overrides the default allocator behavior and all allocation requests will loop endlessly until they succeed. This might be really dangerous especially for larger orders. Existing users of __GFP_REPEAT are changed to __GFP_RETRY_MAYFAIL because they already had their semantic. No new users are added. __alloc_pages_slowpath is changed to bail out for __GFP_RETRY_MAYFAIL if there is no progress and we have already passed the OOM point. This means that all the reclaim opportunities have been exhausted except the most disruptive one (the OOM killer) and a user defined fallback behavior is more sensible than keep retrying in the page allocator. [akpm@linux-foundation.org: fix arch/sparc/kernel/mdesc.c] [mhocko@suse.com: semantic fix] Link: http://lkml.kernel.org/r/20170626123847.GM11534@dhcp22.suse.cz [mhocko@kernel.org: address other thing spotted by Vlastimil] Link: http://lkml.kernel.org/r/20170626124233.GN11534@dhcp22.suse.cz Link: http://lkml.kernel.org/r/20170623085345.11304-3-mhocko@kernel.org Signed-off-by: Michal Hocko <mhocko@suse.com> Acked-by: Vlastimil Babka <vbabka@suse.cz> Cc: Alex Belits <alex.belits@cavium.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Christoph Hellwig <hch@infradead.org> Cc: Darrick J. Wong <darrick.wong@oracle.com> Cc: David Daney <david.daney@cavium.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Mel Gorman <mgorman@suse.de> Cc: NeilBrown <neilb@suse.com> Cc: Ralf Baechle <ralf@linux-mips.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-07-12kernel/watchdog: introduce arch_touch_nmi_watchdog()Nicholas Piggin
For architectures that define HAVE_NMI_WATCHDOG, instead of having them provide the complete touch_nmi_watchdog() function, just have them provide arch_touch_nmi_watchdog(). This gives the generic code more flexibility in implementing this function, and arch implementations don't miss out on touching the softlockup watchdog or other generic details. Link: http://lkml.kernel.org/r/20170616065715.18390-3-npiggin@gmail.com Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Reviewed-by: Don Zickus <dzickus@redhat.com> Reviewed-by: Babu Moger <babu.moger@oracle.com> Tested-by: Babu Moger <babu.moger@oracle.com> [sparc] Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-07-08Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc-nextLinus Torvalds
Pull sparc updates from David Miller: 1) Queued spinlocks and rwlocks for sparc64, from Babu Moger. 2) Some const'ification from Arvind Yadav. 3) LDC/VIO driver infrastructure changes to facilitate future upcoming drivers, from Jag Raman. 4) Initialize sched_clock() et al. early so that the initial printk timestamps are all done while the implementation is available and functioning. From Pavel Tatashin. * git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc-next: (38 commits) sparc: kernel: pmc: make of_device_ids const. sparc64: fix typo in property sparc64: add port_id to VIO device metadata sparc64: Enhance search for VIO device in MDESC sparc64: enhance VIO device probing sparc64: check if a client is allowed to register for MDESC notifications sparc64: remove restriction on VIO device name size sparc64: refactor code to obtain cfg_handle property from MDESC sparc64: add MDESC node name property to VIO device metadata sparc64: mdesc: use __GFP_REPEAT action modifier for VM allocation sparc64: expand MDESC interface sparc64: skip handshake for LDC channels in RAW mode sparc64: specify the device class in VIO version info. packet sparc64: ensure VIO operations are defined while being used sparc: kernel: apc: make of_device_ids const sparc/time: make of_device_ids const sparc64: broken %tick frequency on spitfire cpus sparc64: use prom interface to get %stick frequency sparc64: optimize functions that access tick sparc64: add hot-patched and inlined get_tick() ...
2017-07-06Merge tag 'dma-mapping-4.13' of git://git.infradead.org/users/hch/dma-mappingLinus Torvalds
Pull dma-mapping infrastructure from Christoph Hellwig: "This is the first pull request for the new dma-mapping subsystem In this new subsystem we'll try to properly maintain all the generic code related to dma-mapping, and will further consolidate arch code into common helpers. This pull request contains: - removal of the DMA_ERROR_CODE macro, replacing it with calls to ->mapping_error so that the dma_map_ops instances are more self contained and can be shared across architectures (me) - removal of the ->set_dma_mask method, which duplicates the ->dma_capable one in terms of functionality, but requires more duplicate code. - various updates for the coherent dma pool and related arm code (Vladimir) - various smaller cleanups (me)" * tag 'dma-mapping-4.13' of git://git.infradead.org/users/hch/dma-mapping: (56 commits) ARM: dma-mapping: Remove traces of NOMMU code ARM: NOMMU: Set ARM_DMA_MEM_BUFFERABLE for M-class cpus ARM: NOMMU: Introduce dma operations for noMMU drivers: dma-mapping: allow dma_common_mmap() for NOMMU drivers: dma-coherent: Introduce default DMA pool drivers: dma-coherent: Account dma_pfn_offset when used with device tree dma: Take into account dma_pfn_offset dma-mapping: replace dmam_alloc_noncoherent with dmam_alloc_attrs dma-mapping: remove dmam_free_noncoherent crypto: qat - avoid an uninitialized variable warning au1100fb: remove a bogus dma_free_nonconsistent call MAINTAINERS: add entry for dma mapping helpers powerpc: merge __dma_set_mask into dma_set_mask dma-mapping: remove the set_dma_mask method powerpc/cell: use the dma_supported method for ops switching powerpc/cell: clean up fixed mapping dma_ops initialization tile: remove dma_supported and mapping_error methods xen-swiotlb: remove xen_swiotlb_set_dma_mask arm: implement ->dma_supported instead of ->set_dma_mask mips/loongson64: implement ->dma_supported instead of ->set_dma_mask ...
2017-07-03Merge tag 'driver-core-4.13-rc1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core Pull driver core updates from Greg KH: "Here is the big driver core update for 4.13-rc1. The large majority of this is a lot of cleanup of old fields in the driver core structures and their remaining usages in random drivers. All of those fixes have been reviewed by the various subsystem maintainers. There's also some small firmware updates in here, a new kobject uevent api interface that makes userspace interaction easier, and a few other minor things. All of these have been in linux-next for a long while with no reported issues" * tag 'driver-core-4.13-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core: (56 commits) arm: mach-rpc: ecard: fix build error zram: convert remaining CLASS_ATTR() to CLASS_ATTR_RO() driver-core: remove struct bus_type.dev_attrs powerpc: vio_cmo: use dev_groups and not dev_attrs for bus_type powerpc: vio: use dev_groups and not dev_attrs for bus_type USB: usbip: convert to use DRIVER_ATTR_RW s390: drivers: convert to use DRIVER_ATTR_RO/WO platform: thinkpad_acpi: convert to use DRIVER_ATTR_RO/RW pcmcia: ds: convert to use DRIVER_ATTR_RO wireless: ipw2x00: convert to use DRIVER_ATTR_RW net: ehea: convert to use DRIVER_ATTR_RO net: caif: convert to use DRIVER_ATTR_RO TTY: hvc: convert to use DRIVER_ATTR_RW PCI: pci-driver: convert to use DRIVER_ATTR_WO IB: nes: convert to use DRIVER_ATTR_RW HID: hid-core: convert to use DRIVER_ATTR_RO and drv_groups arm: ecard: fix dev_groups patch typo tty: serdev: use dev_groups and not dev_attrs for bus_type sparc: vio: use dev_groups and not dev_attrs for bus_type hid: intel-ish-hid: use dev_groups and not dev_attrs for bus_type ...
2017-07-03Merge branch 'smp-hotplug-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull SMP hotplug updates from Thomas Gleixner: "This update is primarily a cleanup of the CPU hotplug locking code. The hotplug locking mechanism is an open coded RWSEM, which allows recursive locking. The main problem with that is the recursive nature as it evades the full lockdep coverage and hides potential deadlocks. The rework replaces the open coded RWSEM with a percpu RWSEM and establishes full lockdep coverage that way. The bulk of the changes fix up recursive locking issues and address the now fully reported potential deadlocks all over the place. Some of these deadlocks have been observed in the RT tree, but on mainline the probability was low enough to hide them away." * 'smp-hotplug-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (37 commits) cpu/hotplug: Constify attribute_group structures powerpc: Only obtain cpu_hotplug_lock if called by rtasd ARM/hw_breakpoint: Fix possible recursive locking for arch_hw_breakpoint_init cpu/hotplug: Remove unused check_for_tasks() function perf/core: Don't release cred_guard_mutex if not taken cpuhotplug: Link lock stacks for hotplug callbacks acpi/processor: Prevent cpu hotplug deadlock sched: Provide is_percpu_thread() helper cpu/hotplug: Convert hotplug locking to percpu rwsem s390: Prevent hotplug rwsem recursion arm: Prevent hotplug rwsem recursion arm64: Prevent cpu hotplug rwsem recursion kprobes: Cure hotplug lock ordering issues jump_label: Reorder hotplug lock and jump_label_lock perf/tracing/cpuhotplug: Fix locking order ACPI/processor: Use cpu_hotplug_disable() instead of get_online_cpus() PCI: Replace the racy recursion prevention PCI: Use cpu_hotplug_disable() instead of get_online_cpus() perf/x86/intel: Drop get_online_cpus() in intel_snb_check_microcode() x86/perf: Drop EXPORT of perf_check_microcode ...
2017-07-03sparc: kernel: pmc: make of_device_ids const.Arvind Yadav
of_device_ids are not supposed to change at runtime. All functions working with of_device_ids provided by <linux/of.h> work with const of_device_ids. So mark the non-const structs as const. Signed-off-by: Arvind Yadav <arvind.yadav.cs@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-28arch: remove unused macro/function thread_saved_pc()Tobias Klauser
The only user of thread_saved_pc() in non-arch-specific code was removed in commit 8243d5597793 ("sched/core: Remove pointless printout in sched_show_task()"). Remove the implementations as well. Some architectures use thread_saved_pc() in their arch-specific code. Leave their thread_saved_pc() intact. Signed-off-by: Tobias Klauser <tklauser@distanz.ch> Acked-by: Geert Uytterhoeven <geert@linux-m68k.org> Cc: Ingo Molnar <mingo@kernel.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-06-28sparc: remove arch specific dma_supported implementationsChristoph Hellwig
Usually dma_supported decisions are done by the dma_map_ops instance. Switch sparc to that model by providing a ->dma_supported instance for sbus that always returns false, and implementations tailored to the sun4u and sun4v cases for sparc64, and leave it unimplemented for PCI on sparc32, which means always supported. Signed-off-by: Christoph Hellwig <hch@lst.de> Acked-by: David S. Miller <davem@davemloft.net>
2017-06-28sparc: remove leon_dma_opsChristoph Hellwig
We can just use pci32_dma_ops directly. Signed-off-by: Christoph Hellwig <hch@lst.de> Acked-by: David S. Miller <davem@davemloft.net>
2017-06-28sparc: implement ->mapping_errorChristoph Hellwig
DMA_ERROR_CODE is going to go away, so don't rely on it. Signed-off-by: Christoph Hellwig <hch@lst.de> Acked-by: David S. Miller <davem@davemloft.net>
2017-06-25sparc64: fix typo in propertyPavel Tatashin
There is a typo in a comment that propagated into code: upa-portis instead of upa-portid This problem was detected by code inspection. Fixes: eea9833453bd ("sparc64: broken %tick frequency on spitfire cpus" Signed-off-by: Pavel Tatashin <pasha.tatashin@oracle.com> Reported-by: Steven Sistare <steven.sistare@oracle.com> Reviewed-by: Steven Sistare <steven.sistare@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-25sparc64: add port_id to VIO device metadataJag Raman
Add port_id field to VIO device metadata to identify the port of VIO device. Signed-off-by: Jagannathan Raman <jag.raman@oracle.com> Reviewed-by: Liam Merwick <liam.merwick@oracle.com> Reviewed-by: Shannon Nelson <shannon.nelson@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-25sparc64: Enhance search for VIO device in MDESCJag Raman
Enhances search for VIO device in MDESC by leveraging already existing MDESC APIs. Enhances changes in earlier patch, "sparc: Machine description indices can vary", by using existing MD search functions. It also specifies a match function, thereby enabling device_find_child() to use it for the purpose of matching device nodes in MDESC. An API to find VDEV node in MDESC based on its md_node_info is also added. It is planned to be used by VIO device clients in the future. Signed-off-by: Jagannathan Raman <jag.raman@oracle.com> Reviewed-by: Liam Merwick <liam.merwick@oracle.com> Reviewed-by: Shannon Nelson <shannon.nelson@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-25sparc64: enhance VIO device probingJag Raman
- Allocate IRQs for VIO devices during probing. - Allow clients to specify if IRQs would be allocated for a given VIO device. - Cache the device handle of the root node of channel-devices sub-tree in Machine Description (MDESC). Signed-off-by: Jagannathan Raman <jag.raman@oracle.com> Reviewed-by: Liam Merwick <liam.merwick@oracle.com> Reviewed-by: Shannon Nelson <shannon.nelson@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-25sparc64: check if a client is allowed to register for MDESC notificationsJag Raman
Check if a client is supported, by comparing against a whitelist, to register for notifications from Machine Description (MDESC) Signed-off-by: Jagannathan Raman <jag.raman@oracle.com> Reviewed-by: Liam Merwick <liam.merwick@oracle.com> Reviewed-by: Shannon Nelson <shannon.nelson@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-25sparc64: remove restriction on VIO device name sizeJag Raman
Removes restriction on VIO device's size limit. Since KOBJ_NAME_LEN has been dropped from kobject, there doesn't seem to be a restriction on the device name anymore. This limit therefore doesn't make sense. Signed-off-by: Jagannathan Raman <jag.raman@oracle.com> Reviewed-by: Liam Merwick <liam.merwick@oracle.com> Reviewed-by: Shannon Nelson <shannon.nelson@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-25sparc64: refactor code to obtain cfg_handle property from MDESCJag Raman
Refactors code to get the cfg_handle property of a node from Machine Description (MDESC) Signed-off-by: Jagannathan Raman <jag.raman@oracle.com> Reviewed-by: Liam Merwick <liam.merwick@oracle.com> Reviewed-by: Shannon Nelson <shannon.nelson@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-25sparc64: add MDESC node name property to VIO device metadataJag Raman
Add the MDESC node name of MDESC client to VIO device metadata. It is later used to uniquely identify a node in the MDESC. VIO & MDESC APIs are updated to handle this node name. Signed-off-by: Jagannathan Raman <jag.raman@oracle.com> Reviewed-by: Liam Merwick <liam.merwick@oracle.com> Reviewed-by: Shannon Nelson <shannon.nelson@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-25sparc64: mdesc: use __GFP_REPEAT action modifier for VM allocationJag Raman
During MDESC handle allocation, use the __GFP_REPEAT flag instead of __GFP_NOFAIL. If memory is not available, the caller expects a NULL pointer instead of waiting until memory is allocated. Signed-off-by: Jagannathan Raman <jag.raman@oracle.com> Reviewed-by: Liam Merwick <liam.merwick@oracle.com> Reviewed-by: Shannon Nelson <shannon.nelson@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-25sparc64: expand MDESC interfaceJag Raman
Add the following two APIs to Machine Description (MDESC) - mdesc_get_node: Searches for a node in the Machine Description tree based on given information about that node. - mdesc_get_node_info: Retrieves information about a given node. Signed-off-by: Jagannathan Raman <jag.raman@oracle.com> Reviewed-by: Liam Merwick <liam.merwick@oracle.com> Reviewed-by: Shannon Nelson <shannon.nelson@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-25sparc64: skip handshake for LDC channels in RAW modeJag Raman
LDC channels in RAW mode does not provide any session management. No handshake protocol is defined for LDC channels in RAW mode. It's therefore skipped. Signed-off-by: Jagannathan Raman <jag.raman@oracle.com> Reviewed-by: Liam Merwick <liam.merwick@oracle.com> Reviewed-by: Shannon Nelson <shannon.nelson@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-25sparc64: specify the device class in VIO version info. packetJag Raman
Specify the class of VIO device in the version info. packet. The device's class identifies the type of VIO device, whether it's DISK, CONSOLE, NETWORK, etc... This packet is used in the handshake between the client and server for this device. Signed-off-by: Jagannathan Raman <jag.raman@oracle.com> Reviewed-by: Liam Merwick <liam.merwick@oracle.com> Reviewed-by: Shannon Nelson <shannon.nelson@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-25sparc64: ensure VIO operations are defined while being usedJag Raman
It's possible that VIO operations are not defined for some VIO clients. In that case, VIO ops pointer should be checked for NULL before being used Signed-off-by: Jagannathan Raman <jag.raman@oracle.com> Reviewed-by: Liam Merwick <liam.merwick@oracle.com> Reviewed-by: Shannon Nelson <shannon.nelson@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-25sparc: kernel: apc: make of_device_ids constArvind Yadav
of_device_ids are not supposed to change at runtime. All functions working with of_device_ids provided by <linux/of.h> work with const of_device_ids. So mark the non-const structs as const. Signed-off-by: Arvind Yadav <arvind.yadav.cs@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-19mm: larger stack guard gap, between vmasHugh Dickins
Stack guard page is a useful feature to reduce a risk of stack smashing into a different mapping. We have been using a single page gap which is sufficient to prevent having stack adjacent to a different mapping. But this seems to be insufficient in the light of the stack usage in userspace. E.g. glibc uses as large as 64kB alloca() in many commonly used functions. Others use constructs liks gid_t buffer[NGROUPS_MAX] which is 256kB or stack strings with MAX_ARG_STRLEN. This will become especially dangerous for suid binaries and the default no limit for the stack size limit because those applications can be tricked to consume a large portion of the stack and a single glibc call could jump over the guard page. These attacks are not theoretical, unfortunatelly. Make those attacks less probable by increasing the stack guard gap to 1MB (on systems with 4k pages; but make it depend on the page size because systems with larger base pages might cap stack allocations in the PAGE_SIZE units) which should cover larger alloca() and VLA stack allocations. It is obviously not a full fix because the problem is somehow inherent, but it should reduce attack space a lot. One could argue that the gap size should be configurable from userspace, but that can be done later when somebody finds that the new 1MB is wrong for some special case applications. For now, add a kernel command line option (stack_guard_gap) to specify the stack gap size (in page units). Implementation wise, first delete all the old code for stack guard page: because although we could get away with accounting one extra page in a stack vma, accounting a larger gap can break userspace - case in point, a program run with "ulimit -S -v 20000" failed when the 1MB gap was counted for RLIMIT_AS; similar problems could come with RLIMIT_MLOCK and strict non-overcommit mode. Instead of keeping gap inside the stack vma, maintain the stack guard gap as a gap between vmas: using vm_start_gap() in place of vm_start (or vm_end_gap() in place of vm_end if VM_GROWSUP) in just those few places which need to respect the gap - mainly arch_get_unmapped_area(), and and the vma tree's subtree_gap support for that. Original-patch-by: Oleg Nesterov <oleg@redhat.com> Original-patch-by: Michal Hocko <mhocko@suse.com> Signed-off-by: Hugh Dickins <hughd@google.com> Acked-by: Michal Hocko <mhocko@suse.com> Tested-by: Helge Deller <deller@gmx.de> # parisc Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-06-15sparc/time: make of_device_ids constArvind Yadav
of_device_ids are not supposed to change at runtime. All functions working with of_device_ids provided by <linux/of.h> work with const of_device_ids. So mark the non-const structs as const. Signed-off-by: Arvind Yadav <arvind.yadav.cs@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-15sparc64: broken %tick frequency on spitfire cpusPavel Tatashin
After early boot time stamps project the %tick frequency is detected incorrectly on spittfire cpus. We must use cpuid of boot cpu to find corresponding cpu node in OpenBoot, and extract clock-frequency property from there. Signed-off-by: Pavel Tatashin <pasha.tatashin@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-15sparc64: use prom interface to get %stick frequencyPavel Tatashin
We initialize time early, we must use prom interface instead of open firmware driver, which is not yet initialized. Also, use prom_getintdefault() instead of prom_getint() to be compatible with the code before early boot timestamps project. Signed-off-by: Pavel Tatashin <pasha.tatashin@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-12sparc64: optimize functions that access tickPavel Tatashin
Replace read tick function pointers with the new hot-patched get_tick(). This optimizes the performance of functions such as: sched_clock() Signed-off-by: Pavel Tatashin <pasha.tatashin@oracle.com> Reviewed-by: Steven Sistare <steven.sistare@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-12sparc64: add hot-patched and inlined get_tick()Pavel Tatashin
Add the new get_tick() function that is hot-patched during boot based on processor we are booting on. Signed-off-by: Pavel Tatashin <pasha.tatashin@oracle.com> Reviewed-by: Steven Sistare <steven.sistare@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-12sparc64: initialize time earlyPavel Tatashin
In Linux it is possible to configure printk() to output timestamp next to every line. This is very useful to determine the slow parts of the boot process, and also to avoid regressions, as boot time is visiable to everyone. Also, there are scripts that change these time stamps to intervals. However, on larger machines these timestamps start appearing many seconds, and even minutes into the boot process. This patch gets stick-frequency property early from OpenBoot, and uses its value to initialize time stamps before the first printk() messages are printed. Signed-off-by: Pavel Tatashin <pasha.tatashin@oracle.com> Reviewed-by: Shannon Nelson <shannon.nelson@oracle.com> Reviewed-by: Steven Sistare <steven.sistare@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-12sparc64: improve modularity tick optionsPavel Tatashin
This patch prepares the code for early boot time stamps by making it more modular. - init_tick_ops() to initialize struct sparc64_tick_ops - new sparc64_tick_ops operation get_frequency() which returns a frequency Signed-off-by: Pavel Tatashin <pasha.tatashin@oracle.com> Reviewed-by: Bob Picco <bob.picco@oracle.com> Reviewed-by: Steven Sistare <steven.sistare@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-12sparc64: optimize loads in clock_sched()Pavel Tatashin
In clock sched we now have three loads: - Function pointer - quotient for multiplication - offset However, it is possible to improve performance substantially, by guaranteeing that all three loads are from the same cacheline. By moving these three values first in sparc64_tick_ops, and by having tick_operations 64-byte aligned we guarantee this. Signed-off-by: Pavel Tatashin <pasha.tatashin@oracle.com> Reviewed-by: Shannon Nelson <shannon.nelson@oracle.com> Reviewed-by: Steven Sistare <steven.sistare@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-12sparc64: show time stamps from zeroPavel Tatashin
On most platforms, time is shown from the beginning of boot. This patch is adding offset to sched_clock() for SPARC, to also show time from 0. This means we will have one more load, but we saved one in an ealier patch. Signed-off-by: Pavel Tatashin <pasha.tatashin@oracle.com> Reviewed-by: Bob Picco <bob.picco@oracle.com> Reviewed-by: Steven Sistare <steven.sistare@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-12sparc64: access tick function from variablePavel Tatashin
In timer_64.c tick functions are access via pointer (tick_ops), every time clock is read, there is one extra load to get to the function. This patch optimizes it, by accessing functions pointer from value. Current ched_clock(): sethi %hi(0xb9b400), %g1 ldx [ %g1 + 0x250 ], %g1 ! <tick_ops> ldx [ %g1 ], %g1 call %g1 nop sethi %hi(0xb9b400), %g1 ldx [ %g1 + 0x300 ], %g1 ! <timer_ticks_per_nsec_quotient> mulx %o0, %g1, %g1 rett %i7 + 8 srlx %g1, 0xa, %o0 New sched_clock(): sethi %hi(0xb9b400), %g1 ldx [ %g1 + 0x340 ], %g1 call %g1 nop sethi %hi(0xb9b400), %g1 ldx [ %g1 + 0x378 ], %g1 mulx %o0, %g1, %g1 rett %i7 + 8 srlx %g1, 0xa, %o0 Before three loads, now two loads. Signed-off-by: Pavel Tatashin <pasha.tatashin@oracle.com> Reviewed-by: Shannon Nelson <shannon.nelson@oracle.com> Reviewed-by: Steven Sistare <steven.sistare@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-12sparc64: remove trailing white spacesPavel Tatashin
A few changes that were reported by checkpatch, removed all trailing white spaces in these two files. Signed-off-by: Pavel Tatashin <pasha.tatashin@oracle.com> Reviewed-by: Shannon Nelson <shannon.nelson@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-10sparc64: print debug messages when reading from LDC channelJag Raman
Print debug messages when reading from given LDC channel. Signed-off-by: Jagannathan Raman <jag.raman@oracle.com> Reviewed-by: Aaron Young <aaron.young@oracle.com> Reviewed-by: Alexandre Chartre <alexandre.chartre@oracle.com> Reviewed-by: Bijan Mottahedeh <bijan.mottahedeh@oracle.com> Reviewed-by: Liam Merwick <liam.merwick@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-10sparc64: ldc abort during vds iso bootJag Raman
Orabug: 20902628 When an ldc control-only packet is received during data exchange in read_nonraw(), a new rx head is calculated but the rx queue head is not actually advanced (rx_set_head() is not called) and a branch is taken to 'no_data' at which point two things can happen depending on the value of the newly calculated rx head and the current rx tail: - If the rx queue is determined to be not empty, then the wrong packet is picked up. - If the rx queue is determined to be empty, then a read error (EAGAIN) is eventually returned since it is falsely assumed that more data was expected. The fix is to update the rx head and return in case of a control only packet during data exchange. Signed-off-by: Jagannathan Raman <jag.raman@oracle.com> Reviewed-by: Aaron Young <aaron.young@oracle.com> Reviewed-by: Alexandre Chartre <alexandre.chartre@oracle.com> Reviewed-by: Bijan Mottahedeh <bijan.mottahedeh@oracle.com> Reviewed-by: Liam Merwick <liam.merwick@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-10sparc64: ensure LDC channel is ready before communicationJag Raman
Ensure that LDC channel is up before writing to it, in RAW mode. Generate event to bring the LDC channel up, if it's not up already. Signed-off-by: Jagannathan Raman <jag.raman@oracle.com> Reviewed-by: Aaron Young <aaron.young@oracle.com> Reviewed-by: Alexandre Chartre <alexandre.chartre@oracle.com> Reviewed-by: Bijan Mottahedeh <bijan.mottahedeh@oracle.com> Reviewed-by: Liam Merwick <liam.merwick@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-10sparc64: enhance ldc_abort to print messageJag Raman
Enhance ldc_abort to accept a message to be printed when it is called. Add a macro, LDC_ABORT, to print info. about the function that calls ldc_abort. Signed-off-by: Jagannathan Raman <jag.raman@oracle.com> Reviewed-by: Aaron Young <aaron.young@oracle.com> Reviewed-by: Alexandre Chartre <alexandre.chartre@oracle.com> Reviewed-by: Bijan Mottahedeh <bijan.mottahedeh@oracle.com> Reviewed-by: Liam Merwick <liam.merwick@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-10sparc64: expand LDC interfaceJag Raman
Add the following LDC APIs which are planned to be used by LDC clients in the future: - ldc_set_state: Sets given LDC channel to given state - ldc_mode: Returns the mode of given LDC channel - ldc_print: Prints info about given LDC channel - ldc_rx_reset: Reset the RX queue of given LDC channel Signed-off-by: Jagannathan Raman <jag.raman@oracle.com> Reviewed-by: Aaron Young <aaron.young@oracle.com> Reviewed-by: Alexandre Chartre <alexandre.chartre@oracle.com> Reviewed-by: Bijan Mottahedeh <bijan.mottahedeh@oracle.com> Reviewed-by: Liam Merwick <liam.merwick@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-09sparc: vio: use dev_groups and not dev_attrs for bus_typeGreg Kroah-Hartman
The dev_attrs field has long been "depreciated" and is finally being removed, so move the driver to use the "correct" dev_groups field instead for struct bus_type. Acked-by: "David S. Miller" <davem@davemloft.net> Cc: <sparclinux@vger.kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-06-06sparc64: delete old wrap codePavel Tatashin
The old method that is using xcall and softint to get new context id is deleted, as it is replaced by a method of using per_cpu_secondary_mm without xcall to perform the context wrap. Signed-off-by: Pavel Tatashin <pasha.tatashin@oracle.com> Reviewed-by: Bob Picco <bob.picco@oracle.com> Reviewed-by: Steven Sistare <steven.sistare@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-06sparc: Machine description indices can varyJames Clarke
VIO devices were being looked up by their index in the machine description node block, but this often varies over time as devices are added and removed. Instead, store the ID and look up using the type, config handle and ID. Signed-off-by: James Clarke <jrtc27@jrtc27.com> Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=112541 Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-06sparc64: mm: fix copy_tsb to correctly copy huge page TSBsMike Kravetz
When a TSB grows beyond its current capacity, a new TSB is allocated and copy_tsb is called to copy entries from the old TSB to the new. A hash shift based on page size is used to calculate the index of an entry in the TSB. copy_tsb has hard coded PAGE_SHIFT in these calculations. However, for huge page TSBs the value REAL_HPAGE_SHIFT should be used. As a result, when copy_tsb is called for a huge page TSB the entries are placed at the incorrect index in the newly allocated TSB. When doing hardware table walk, the MMU does not match these entries and we end up in the TSB miss handling code. This code will then create and write an entry to the correct index in the TSB. We take a performance hit for the table walk miss and recreation of these entries. Pass a new parameter to copy_tsb that is the page size shift to be used when copying the TSB. Suggested-by: Anthony Yznaga <anthony.yznaga@oracle.com> Signed-off-by: Mike Kravetz <mike.kravetz@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-06arch/sparc: support NR_CPUS = 4096Jane Chu
Linux SPARC64 limits NR_CPUS to 4064 because init_cpu_send_mondo_info() only allocates a single page for NR_CPUS mondo entries. Thus we cannot use all 4096 CPUs on some SPARC platforms. To fix, allocate (2^order) pages where order is set according to the size of cpu_list for possible cpus. Since cpu_list_pa and cpu_mondo_block_pa are not used in asm code, there are no imm13 offsets from the base PA that will break because they can only reach one page. Orabug: 25505750 Signed-off-by: Jane Chu <jane.chu@oracle.com> Reviewed-by: Bob Picco <bob.picco@oracle.com> Reviewed-by: Atish Patra <atish.patra@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-01sparc64: Fix build warnings with gcc 7.David S. Miller
arch/sparc/kernel/ds.c: In function ‘register_services’: arch/sparc/kernel/ds.c:912:3: error: ‘strcpy’: writing at least 1 byte into a region of size 0 overflows the destination Reported-by: Anatoly Pugachev <matorola@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-05-26jump_label: Reorder hotplug lock and jump_label_lockThomas Gleixner
The conversion of the hotplug locking to a percpu rwsem unearthed lock ordering issues all over the place. The jump_label code has two issues: 1) Nested get_online_cpus() invocations 2) Ordering problems vs. the cpus rwsem and the jump_label_mutex To cure these, the following lock order has been established; cpus_rwsem -> jump_label_lock -> text_mutex Even if not all architectures need protection against CPU hotplug, taking cpus_rwsem before jump_label_lock is now mandatory in code pathes which actually modify code and therefor need text_mutex protection. Move the get_online_cpus() invocations into the core jump label code and establish the proper lock order where required. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Ingo Molnar <mingo@kernel.org> Acked-by: "David S. Miller" <davem@davemloft.net> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Chris Metcalf <cmetcalf@mellanox.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Sebastian Siewior <bigeasy@linutronix.de> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Jason Baron <jbaron@akamai.com> Cc: Ralf Baechle <ralf@linux-mips.org> Link: http://lkml.kernel.org/r/20170524081549.025830817@linutronix.de
2017-05-17sparc/ftrace: Fix ftrace graph time measurementLiam R. Howlett
The ftrace function_graph time measurements of a given function is not accurate according to those recorded by ftrace using the function filters. This change pulls the x86_64 fix from 'commit 722b3c746953 ("ftrace/graph: Trace function entry before updating index")' into the sparc specific prepare_ftrace_return which stops ftrace from counting interrupted tasks in the time measurement. Example measurements for select_task_rq_fair running "hackbench 100 process 1000": | tracing/trace_stat/function0 | function_graph Before patch | 2.802 us | 4.255 us After patch | 2.749 us | 3.094 us Signed-off-by: Liam R. Howlett <Liam.Howlett@Oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>