summaryrefslogtreecommitdiff
path: root/Documentation
AgeCommit message (Collapse)Author
2021-07-28Documentation: Fix intiramfs script nameRobert Richter
commit 5e60f363b38fd40e4d8838b5d6f4d4ecee92c777 upstream. Documentation was not changed when renaming the script in commit 80e715a06c2d ("initramfs: rename gen_initramfs_list.sh to gen_initramfs.sh"). Fixing this. Basically does: $ sed -i -e s/gen_initramfs_list.sh/gen_initramfs.sh/g $(git grep -l gen_initramfs_list.sh) Fixes: 80e715a06c2d ("initramfs: rename gen_initramfs_list.sh to gen_initramfs.sh") Signed-off-by: Robert Richter <rrichter@amd.com> Signed-off-by: Masahiro Yamada <masahiroy@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-07-28userfaultfd: do not untag user pointersPeter Collingbourne
commit e71e2ace5721a8b921dca18b045069e7bb411277 upstream. Patch series "userfaultfd: do not untag user pointers", v5. If a user program uses userfaultfd on ranges of heap memory, it may end up passing a tagged pointer to the kernel in the range.start field of the UFFDIO_REGISTER ioctl. This can happen when using an MTE-capable allocator, or on Android if using the Tagged Pointers feature for MTE readiness [1]. When a fault subsequently occurs, the tag is stripped from the fault address returned to the application in the fault.address field of struct uffd_msg. However, from the application's perspective, the tagged address *is* the memory address, so if the application is unaware of memory tags, it may get confused by receiving an address that is, from its point of view, outside of the bounds of the allocation. We observed this behavior in the kselftest for userfaultfd [2] but other applications could have the same problem. Address this by not untagging pointers passed to the userfaultfd ioctls. Instead, let the system call fail. Also change the kselftest to use mmap so that it doesn't encounter this problem. [1] https://source.android.com/devices/tech/debug/tagged-pointers [2] tools/testing/selftests/vm/userfaultfd.c This patch (of 2): Do not untag pointers passed to the userfaultfd ioctls. Instead, let the system call fail. This will provide an early indication of problems with tag-unaware userspace code instead of letting the code get confused later, and is consistent with how we decided to handle brk/mmap/mremap in commit dcde237319e6 ("mm: Avoid creating virtual address aliases in brk()/mmap()/mremap()"), as well as being consistent with the existing tagged address ABI documentation relating to how ioctl arguments are handled. The code change is a revert of commit 7d0325749a6c ("userfaultfd: untag user pointers") plus some fixups to some additional calls to validate_range that have appeared since then. [1] https://source.android.com/devices/tech/debug/tagged-pointers [2] tools/testing/selftests/vm/userfaultfd.c Link: https://lkml.kernel.org/r/20210714195437.118982-1-pcc@google.com Link: https://lkml.kernel.org/r/20210714195437.118982-2-pcc@google.com Link: https://linux-review.googlesource.com/id/I761aa9f0344454c482b83fcfcce547db0a25501b Fixes: 63f0c6037965 ("arm64: Introduce prctl() options to control the tagged user addresses ABI") Signed-off-by: Peter Collingbourne <pcc@google.com> Reviewed-by: Andrey Konovalov <andreyknvl@gmail.com> Reviewed-by: Catalin Marinas <catalin.marinas@arm.com> Cc: Alistair Delva <adelva@google.com> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Dave Martin <Dave.Martin@arm.com> Cc: Evgenii Stepanov <eugenis@google.com> Cc: Lokesh Gidra <lokeshgidra@google.com> Cc: Mitch Phillips <mitchp@google.com> Cc: Vincenzo Frascino <vincenzo.frascino@arm.com> Cc: Will Deacon <will@kernel.org> Cc: William McVicker <willmcvicker@google.com> Cc: <stable@vger.kernel.org> [5.4] Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-07-28tracing/histogram: Rename "cpu" to "common_cpu"Steven Rostedt (VMware)
commit 1e3bac71c5053c99d438771fc9fa5082ae5d90aa upstream. Currently the histogram logic allows the user to write "cpu" in as an event field, and it will record the CPU that the event happened on. The problem with this is that there's a lot of events that have "cpu" as a real field, and using "cpu" as the CPU it ran on, makes it impossible to run histograms on the "cpu" field of events. For example, if I want to have a histogram on the count of the workqueue_queue_work event on its cpu field, running: ># echo 'hist:keys=cpu' > events/workqueue/workqueue_queue_work/trigger Gives a misleading and wrong result. Change the command to "common_cpu" as no event should have "common_*" fields as that's a reserved name for fields used by all events. And this makes sense here as common_cpu would be a field used by all events. Now we can even do: ># echo 'hist:keys=common_cpu,cpu if cpu < 100' > events/workqueue/workqueue_queue_work/trigger ># cat events/workqueue/workqueue_queue_work/hist # event histogram # # trigger info: hist:keys=common_cpu,cpu:vals=hitcount:sort=hitcount:size=2048 if cpu < 100 [active] # { common_cpu: 0, cpu: 2 } hitcount: 1 { common_cpu: 0, cpu: 4 } hitcount: 1 { common_cpu: 7, cpu: 7 } hitcount: 1 { common_cpu: 0, cpu: 7 } hitcount: 1 { common_cpu: 0, cpu: 1 } hitcount: 1 { common_cpu: 0, cpu: 6 } hitcount: 2 { common_cpu: 0, cpu: 5 } hitcount: 2 { common_cpu: 1, cpu: 1 } hitcount: 4 { common_cpu: 6, cpu: 6 } hitcount: 4 { common_cpu: 5, cpu: 5 } hitcount: 14 { common_cpu: 4, cpu: 4 } hitcount: 26 { common_cpu: 0, cpu: 0 } hitcount: 39 { common_cpu: 2, cpu: 2 } hitcount: 184 Now for backward compatibility, I added a trick. If "cpu" is used, and the field is not found, it will fall back to "common_cpu" and work as it did before. This way, it will still work for old programs that use "cpu" to get the actual CPU, but if the event has a "cpu" as a field, it will get that event's "cpu" field, which is probably what it wants anyway. I updated the tracefs/README to include documentation about both the common_timestamp and the common_cpu. This way, if that text is present in the README, then an application can know that common_cpu is supported over just plain "cpu". Link: https://lkml.kernel.org/r/20210721110053.26b4f641@oasis.local.home Cc: Namhyung Kim <namhyung@kernel.org> Cc: Ingo Molnar <mingo@kernel.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: stable@vger.kernel.org Fixes: 8b7622bf94a44 ("tracing: Add cpu field for hist triggers") Reviewed-by: Tom Zanussi <zanussi@kernel.org> Reviewed-by: Masami Hiramatsu <mhiramat@kernel.org> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-07-28tcp: disable TFO blackhole logic by defaultWei Wang
[ Upstream commit 213ad73d06073b197a02476db3a4998e219ddb06 ] Multiple complaints have been raised from the TFO users on the internet stating that the TFO blackhole logic is too aggressive and gets falsely triggered too often. (e.g. https://blog.apnic.net/2021/07/05/tcp-fast-open-not-so-fast/) Considering that most middleboxes no longer drop TFO packets, we decide to disable the blackhole logic by setting /proc/sys/net/ipv4/tcp_fastopen_blackhole_timeout_set to 0 by default. Fixes: cf1ef3f0719b4 ("net/tcp_fastopen: Disable active side TFO in certain scenarios") Signed-off-by: Wei Wang <weiwan@google.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Acked-by: Neal Cardwell <ncardwell@google.com> Acked-by: Soheil Hassas Yeganeh <soheil@google.com> Acked-by: Yuchung Cheng <ycheng@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-07-20dt-bindings: i2c: at91: fix example for scl-gpiosNicolas Ferre
[ Upstream commit 92e669017ff1616ba7d8ba3c65f5193bc2a7acbe ] The SCL gpio pin used by I2C bus for recovery needs to be configured as open drain, so fix the binding example accordingly. In relation with fix c5a283802573 ("ARM: dts: at91: Configure I2C SCL gpio as open drain"). Signed-off-by: Nicolas Ferre <nicolas.ferre@microchip.com> Fixes: 19e5cef058a0 ("dt-bindings: i2c: at91: document optional bus recovery properties") Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-07-20f2fs: fix to avoid adding tab before doc sectionChao Yu
[ Upstream commit 3c16dc40aab84bab9cf54c2b61a458bb86b180c3 ] Otherwise whole section after tab will be invisible in compiled html format document. Cc: Mauro Carvalho Chehab <mchehab+huawei@kernel.org> Fixes: 89272ca1102e ("docs: filesystems: convert f2fs.txt to ReST") Signed-off-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-07-14powerpc/papr_scm: Make 'perf_stats' invisible if perf-stats unavailableVaibhav Jain
[ Upstream commit ed78f56e1271f108e8af61baeba383dcd77adbec ] In case performance stats for an nvdimm are not available, reading the 'perf_stats' sysfs file returns an -ENOENT error. A better approach is to make the 'perf_stats' file entirely invisible to indicate that performance stats for an nvdimm are unavailable. So this patch updates 'papr_nd_attribute_group' to add a 'is_visible' callback implemented as newly introduced 'papr_nd_attribute_visible()' that returns an appropriate mode in case performance stats aren't supported in a given nvdimm. Also the initialization of 'papr_scm_priv.stat_buffer_len' is moved from papr_scm_nvdimm_init() to papr_scm_probe() so that it value is available when 'papr_nd_attribute_visible()' is called during nvdimm initialization. Even though 'perf_stats' attribute is available since v5.9, there are no known user-space tools/scripts that are dependent on presence of its sysfs file. Hence I dont expect any user-space breakage with this patch. Fixes: 2d02bf835e57 ("powerpc/papr_scm: Fetch nvdimm performance stats from PHYP") Signed-off-by: Vaibhav Jain <vaibhav@linux.ibm.com> Reviewed-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20210513092349.285021-1-vaibhav@linux.ibm.com Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-07-14mm/debug_vm_pgtable/basic: add validation for dirtiness after write protectAnshuman Khandual
[ Upstream commit bb5c47ced46797409f4791d0380db3116d93134c ] Patch series "mm/debug_vm_pgtable: Some minor updates", v3. This series contains some cleanups and new test suggestions from Catalin from an earlier discussion. https://lore.kernel.org/linux-mm/20201123142237.GF17833@gaia/ This patch (of 2): This adds validation tests for dirtiness after write protect conversion for each page table level. There are two new separate test types involved here. The first test ensures that a given page table entry does not become dirty after pxx_wrprotect(). This is important for platforms like arm64 which transfers and drops the hardware dirty bit (!PTE_RDONLY) to the software dirty bit while making it an write protected one. This test ensures that no fresh page table entry could be created with hardware dirty bit set. The second test ensures that a given page table entry always preserve the dirty information across pxx_wrprotect(). This adds two previously missing PUD level basic tests and while here fixes pxx_wrprotect() related typos in the documentation file. Link: https://lkml.kernel.org/r/1611137241-26220-1-git-send-email-anshuman.khandual@arm.com Link: https://lkml.kernel.org/r/1611137241-26220-2-git-send-email-anshuman.khandual@arm.com Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com> Suggested-by: Catalin Marinas <catalin.marinas@arm.com> Tested-by: Gerald Schaefer <gerald.schaefer@de.ibm.com> [s390] Cc: Christophe Leroy <christophe.leroy@csgroup.eu> Cc: Gerald Schaefer <gerald.schaefer@linux.ibm.com> Cc: Vineet Gupta <vgupta@synopsys.com> Cc: Paul Walmsley <paul.walmsley@sifive.com> Cc: Steven Price <steven.price@arm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-07-14clocksource: Retry clock read if long delays detectedPaul E. McKenney
[ Upstream commit db3a34e17433de2390eb80d436970edcebd0ca3e ] When the clocksource watchdog marks a clock as unstable, this might be due to that clock being unstable or it might be due to delays that happen to occur between the reads of the two clocks. Yes, interrupts are disabled across those two reads, but there are no shortage of things that can delay interrupts-disabled regions of code ranging from SMI handlers to vCPU preemption. It would be good to have some indication as to why the clock was marked unstable. Therefore, re-read the watchdog clock on either side of the read from the clock under test. If the watchdog clock shows an excessive time delta between its pair of reads, the reads are retried. The maximum number of retries is specified by a new kernel boot parameter clocksource.max_cswd_read_retries, which defaults to three, that is, up to four reads, one initial and up to three retries. If more than one retry was required, a message is printed on the console (the occasional single retry is expected behavior, especially in guest OSes). If the maximum number of retries is exceeded, the clock under test will be marked unstable. However, the probability of this happening due to various sorts of delays is quite small. In addition, the reason (clock-read delays) for the unstable marking will be apparent. Reported-by: Chris Mason <clm@fb.com> Signed-off-by: Paul E. McKenney <paulmck@kernel.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Feng Tang <feng.tang@intel.com> Link: https://lore.kernel.org/r/20210527190124.440372-1-paulmck@kernel.org Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-07-14hwmon: (max31790) Fix pwmX_enable attributesGuenter Roeck
[ Upstream commit 148c847c9e5a54b99850617bf9c143af9a344f92 ] pwmX_enable supports three possible values: 0: Fan control disabled. Duty cycle is fixed to 0% 1: Fan control enabled, pwm mode. Duty cycle is determined by values written into Target Duty Cycle registers. 2: Fan control enabled, rpm mode Duty cycle is adjusted such that fan speed matches the values in Target Count registers The current code does not do this; instead, it mixes pwm control configuration with fan speed monitoring configuration. Worse, it reports that pwm control would be disabled (pwmX_enable==0) when it is in fact enabled in pwm mode. Part of the problem may be that the chip sets the "TACH input enable" bit on its own whenever the mode bit is set to RPM mode, but that doesn't mean that "TACH input enable" accurately reflects the pwm mode. Fix it up and only handle pwm control with the pwmX_enable attributes. In the documentation, clarify that disabling pwm control (pwmX_enable=0) sets the pwm duty cycle to 0%. In the code, explain why TACH_INPUT_EN is set together with RPM_MODE. While at it, only update the configuration register if the configuration has changed, and only update the cached configuration if updating the chip configuration was successful. Cc: Jan Kundrát <jan.kundrat@cesnet.cz> Cc: Václav Kubernát <kubernat@cesnet.cz> Signed-off-by: Guenter Roeck <linux@roeck-us.net> Tested-by: Václav Kubernát <kubernat@cesnet.cz> Reviewed-by: Jan Kundrát <jan.kundrat@cesnet.cz> Link: https://lore.kernel.org/r/20210526154022.3223012-4-linux@roeck-us.net Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-07-14hwmon: (max31790) Report correct current pwm duty cyclesGuenter Roeck
[ Upstream commit 897f6339893b741a5d68ae8e2475df65946041c2 ] The MAX31790 has two sets of registers for pwm duty cycles, one to request a duty cycle and one to read the actual current duty cycle. Both do not have to be the same. When reporting the pwm duty cycle to the user, the actual pwm duty cycle from pwm duty cycle registers needs to be reported. When setting it, the pwm target duty cycle needs to be written. Since we don't know the actual pwm duty cycle after a target pwm duty cycle has been written, set the valid flag to false to indicate that actual pwm duty cycle should be read from the chip instead of using cached values. Cc: Jan Kundrát <jan.kundrat@cesnet.cz> Cc: Václav Kubernát <kubernat@cesnet.cz> Signed-off-by: Guenter Roeck <linux@roeck-us.net> Tested-by: Václav Kubernát <kubernat@ceesnet.cz> Reviewed-by: Jan Kundrát <jan.kundrat@cesnet.cz> Link: https://lore.kernel.org/r/20210526154022.3223012-3-linux@roeck-us.net Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-07-14media: hevc: Fix dependent slice segment flagsJernej Skrabec
[ Upstream commit 67a7e53d5b21f3a84efc03a4e62db7caf97841ef ] Dependent slice segment flag for PPS control is misnamed. It should have "enabled" at the end. It only tells if this flag is present in slice header or not and not the actual value. Fix this by renaming the PPS flag and introduce another flag for slice control which tells actual value. Signed-off-by: Jernej Skrabec <jernej.skrabec@siol.net> Signed-off-by: Hans Verkuil <hverkuil-cisco@xs4all.nl> Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-07-14evm: Refuse EVM_ALLOW_METADATA_WRITES only if an HMAC key is loadedRoberto Sassu
commit 9acc89d31f0c94c8e573ed61f3e4340bbd526d0c upstream. EVM_ALLOW_METADATA_WRITES is an EVM initialization flag that can be set to temporarily disable metadata verification until all xattrs/attrs necessary to verify an EVM portable signature are copied to the file. This flag is cleared when EVM is initialized with an HMAC key, to avoid that the HMAC is calculated on unverified xattrs/attrs. Currently EVM unnecessarily denies setting this flag if EVM is initialized with a public key, which is not a concern as it cannot be used to trust xattrs/attrs updates. This patch removes this limitation. Fixes: ae1ba1676b88e ("EVM: Allow userland to permit modification of EVM-protected metadata") Signed-off-by: Roberto Sassu <roberto.sassu@huawei.com> Cc: stable@vger.kernel.org # 4.16.x Signed-off-by: Mimi Zohar <zohar@linux.ibm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-06-23mm/slub: clarify verification reportingKees Cook
commit 8669dbab2ae56085c128894b181c2aa50f97e368 upstream. Patch series "Actually fix freelist pointer vs redzoning", v4. This fixes redzoning vs the freelist pointer (both for middle-position and very small caches). Both are "theoretical" fixes, in that I see no evidence of such small-sized caches actually be used in the kernel, but that's no reason to let the bugs continue to exist, especially since people doing local development keep tripping over it. :) This patch (of 3): Instead of repeating "Redzone" and "Poison", clarify which sides of those zones got tripped. Additionally fix column alignment in the trailer. Before: BUG test (Tainted: G B ): Redzone overwritten ... Redzone (____ptrval____): bb bb bb bb bb bb bb bb ........ Object (____ptrval____): f6 f4 a5 40 1d e8 ...@.. Redzone (____ptrval____): 1a aa .. Padding (____ptrval____): 00 00 00 00 00 00 00 00 ........ After: BUG test (Tainted: G B ): Right Redzone overwritten ... Redzone (____ptrval____): bb bb bb bb bb bb bb bb ........ Object (____ptrval____): f6 f4 a5 40 1d e8 ...@.. Redzone (____ptrval____): 1a aa .. Padding (____ptrval____): 00 00 00 00 00 00 00 00 ........ The earlier commits that slowly resulted in the "Before" reporting were: d86bd1bece6f ("mm/slub: support left redzone") ffc79d288000 ("slub: use print_hex_dump") 2492268472e7 ("SLUB: change error reporting format to follow lockdep loosely") Link: https://lkml.kernel.org/r/20210608183955.280836-1-keescook@chromium.org Link: https://lkml.kernel.org/r/20210608183955.280836-2-keescook@chromium.org Link: https://lore.kernel.org/lkml/cfdb11d7-fb8e-e578-c939-f7f5fb69a6bd@suse.cz/ Signed-off-by: Kees Cook <keescook@chromium.org> Acked-by: Vlastimil Babka <vbabka@suse.cz> Cc: Marco Elver <elver@google.com> Cc: "Lin, Zhenpeng" <zplin@psu.edu> Cc: Christoph Lameter <cl@linux.com> Cc: Pekka Enberg <penberg@kernel.org> Cc: David Rientjes <rientjes@google.com> Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com> Cc: Roman Gushchin <guro@fb.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-06-16ASoC: meson: gx-card: fix sound-dai dt schemaJerome Brunet
commit d031d99b02eaf7363c33f5b27b38086cc8104082 upstream. There is a fair amount of warnings when running 'make dtbs_check' with amlogic,gx-sound-card.yaml. Ex: arch/arm64/boot/dts/amlogic/meson-gxm-q200.dt.yaml: sound: dai-link-0:sound-dai:0:1: missing phandle tag in 0 arch/arm64/boot/dts/amlogic/meson-gxm-q200.dt.yaml: sound: dai-link-0:sound-dai:0:2: missing phandle tag in 0 arch/arm64/boot/dts/amlogic/meson-gxm-q200.dt.yaml: sound: dai-link-0:sound-dai:0: [66, 0, 0] is too long The reason is that the sound-dai phandle provided has cells, and in such case the schema should use 'phandle-array' instead of 'phandle'. Fixes: fd00366b8e41 ("ASoC: meson: gx: add sound card dt-binding documentation") Signed-off-by: Jerome Brunet <jbrunet@baylibre.com> Link: https://lore.kernel.org/r/20210524093448.357140-1-jbrunet@baylibre.com Signed-off-by: Mark Brown <broonie@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-06-16KVM: X86: MMU: Use the correct inherited permissions to get shadow pageLai Jiangshan
commit b1bd5cba3306691c771d558e94baa73e8b0b96b7 upstream. When computing the access permissions of a shadow page, use the effective permissions of the walk up to that point, i.e. the logic AND of its parents' permissions. Two guest PxE entries that point at the same table gfn need to be shadowed with different shadow pages if their parents' permissions are different. KVM currently uses the effective permissions of the last non-leaf entry for all non-leaf entries. Because all non-leaf SPTEs have full ("uwx") permissions, and the effective permissions are recorded only in role.access and merged into the leaves, this can lead to incorrect reuse of a shadow page and eventually to a missing guest protection page fault. For example, here is a shared pagetable: pgd[] pud[] pmd[] virtual address pointers /->pmd1(u--)->pte1(uw-)->page1 <- ptr1 (u--) /->pud1(uw-)--->pmd2(uw-)->pte2(uw-)->page2 <- ptr2 (uw-) pgd-| (shared pmd[] as above) \->pud2(u--)--->pmd1(u--)->pte1(uw-)->page1 <- ptr3 (u--) \->pmd2(uw-)->pte2(uw-)->page2 <- ptr4 (u--) pud1 and pud2 point to the same pmd table, so: - ptr1 and ptr3 points to the same page. - ptr2 and ptr4 points to the same page. (pud1 and pud2 here are pud entries, while pmd1 and pmd2 here are pmd entries) - First, the guest reads from ptr1 first and KVM prepares a shadow page table with role.access=u--, from ptr1's pud1 and ptr1's pmd1. "u--" comes from the effective permissions of pgd, pud1 and pmd1, which are stored in pt->access. "u--" is used also to get the pagetable for pud1, instead of "uw-". - Then the guest writes to ptr2 and KVM reuses pud1 which is present. The hypervisor set up a shadow page for ptr2 with pt->access is "uw-" even though the pud1 pmd (because of the incorrect argument to kvm_mmu_get_page in the previous step) has role.access="u--". - Then the guest reads from ptr3. The hypervisor reuses pud1's shadow pmd for pud2, because both use "u--" for their permissions. Thus, the shadow pmd already includes entries for both pmd1 and pmd2. - At last, the guest writes to ptr4. This causes no vmexit or pagefault, because pud1's shadow page structures included an "uw-" page even though its role.access was "u--". Any kind of shared pagetable might have the similar problem when in virtual machine without TDP enabled if the permissions are different from different ancestors. In order to fix the problem, we change pt->access to be an array, and any access in it will not include permissions ANDed from child ptes. The test code is: https://lore.kernel.org/kvm/20210603050537.19605-1-jiangshanlai@gmail.com/ Remember to test it with TDP disabled. The problem had existed long before the commit 41074d07c78b ("KVM: MMU: Fix inherited permissions for emulated guest pte updates"), and it is hard to find which is the culprit. So there is no fixes tag here. Signed-off-by: Lai Jiangshan <laijs@linux.alibaba.com> Message-Id: <20210603052455.21023-1-jiangshanlai@gmail.com> Cc: stable@vger.kernel.org Fixes: cea0f0e7ea54 ("[PATCH] KVM: MMU: Shadow page table caching") Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-06-03Documentation: seccomp: Fix user notification documentationSargun Dhillon
commit aac902925ea646e461c95edc98a8a57eb0def917 upstream. The documentation had some previously incorrect information about how userspace notifications (and responses) were handled due to a change from a previously proposed patchset. Signed-off-by: Sargun Dhillon <sargun@sargun.me> Acked-by: Tycho Andersen <tycho@tycho.pizza> Acked-by: Christian Brauner <christian.brauner@ubuntu.com> Signed-off-by: Kees Cook <keescook@chromium.org> Fixes: 6a21cc50f0c7 ("seccomp: add a return code to trap to userspace") Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20210517193908.3113-2-sargun@sargun.me Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-05-26powerpc/64s/syscall: Use pt_regs.trap to distinguish syscall ABI difference ↵Nicholas Piggin
between sc and scv syscalls commit 5665bc35c1ed917ac8fd06cb651317bb47a65b10 upstream. The sc and scv 0 system calls have different ABI conventions, and ptracers need to know which system call type is being used if they want to look at the syscall registers. Document that pt_regs.trap can be used for this, and fix one in-tree user to work with scv 0 syscalls. Fixes: 7fa95f9adaee ("powerpc/64s: system call support for scv/rfscv instructions") Cc: stable@vger.kernel.org # v5.9+ Reported-by: "Dmitry V. Levin" <ldv@altlinux.org> Suggested-by: "Dmitry V. Levin" <ldv@altlinux.org> Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20210520111931.2597127-1-npiggin@gmail.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-05-22tweewide: Fix most Shebang linesFinn Behrens
commit c25ce589dca10d64dde139ae093abc258a32869c upstream. Change every shebang which does not need an argument to use /usr/bin/env. This is needed as not every distro has everything under /usr/bin, sometimes not even bash. Signed-off-by: Finn Behrens <me@kloenk.de> Signed-off-by: Masahiro Yamada <masahiroy@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-05-19dt-bindings: serial: 8250: Remove duplicated compatible stringsZhen Lei
commit a7277a73984114b38dcb62c8548850800ffe864e upstream. The compatible strings "mediatek,*" appears two times, remove one of them. Fixes: e69f5dc623f9 ("dt-bindings: serial: Convert 8250 to json-schema") Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com> Link: https://lore.kernel.org/r/20210422090857.583-1-thunder.leizhen@huawei.com Signed-off-by: Rob Herring <robh@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-05-19dt-bindings: media: renesas,vin: Make resets optional on R-Car Gen1Geert Uytterhoeven
commit 7935bb56e21b2add81149f4def8e59b4133fe57c upstream. The "resets" property is not present on R-Car Gen1 SoCs. Supporting it would require migrating from renesas,cpg-clocks to renesas,cpg-mssr. Fixes: 905fc6b1bfb4a631 ("dt-bindings: rcar-vin: Convert bindings to json-schema") Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be> Reviewed-by: Niklas Söderlund <niklas.soderlund+renesas@ragnatech.se> Link: https://lore.kernel.org/r/217c8197efaee7d803b22d433abb0ea8e33b84c6.1619700314.git.geert+renesas@glider.be Signed-off-by: Rob Herring <robh@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-05-19ARM: 9012/1: move device tree mapping out of linear regionArd Biesheuvel
commit 7a1be318f5795cb66fa0dc86b3ace427fe68057f upstream On ARM, setting up the linear region is tricky, given the constraints around placement and alignment of the memblocks, and how the kernel itself as well as the DT are placed in physical memory. Let's simplify matters a bit, by moving the device tree mapping to the top of the address space, right between the end of the vmalloc region and the start of the the fixmap region, and create a read-only mapping for it that is independent of the size of the linear region, and how it is organized. Since this region was formerly used as a guard region, which will now be populated fully on LPAE builds by this read-only mapping (which will still be able to function as a guard region for stray writes), bump the start of the [underutilized] fixmap region by 512 KB as well, to ensure that there is always a proper guard region here. Doing so still leaves ample room for the fixmap space, even with NR_CPUS set to its maximum value of 32. Tested-by: Linus Walleij <linus.walleij@linaro.org> Reviewed-by: Linus Walleij <linus.walleij@linaro.org> Reviewed-by: Nicolas Pitre <nico@fluxnic.net> Signed-off-by: Ard Biesheuvel <ardb@kernel.org> Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk> Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-05-19kbuild: generate Module.symvers only when vmlinux existsMasahiro Yamada
[ Upstream commit 69bc8d386aebbd91a6bb44b6d33f77c8dfa9ed8c ] The external module build shows the following warning if Module.symvers is missing in the kernel tree. WARNING: Symbol version dump "Module.symvers" is missing. Modules may not have dependencies or modversions. I think this is an important heads-up because the resulting modules may not work as expected. This happens when you did not build the entire kernel tree, for example, you might have prepared the minimal setups for external modules by 'make defconfig && make modules_preapre'. A problem is that 'make modules' creates Module.symvers even without vmlinux. In this case, that warning is suppressed since Module.symvers already exists in spite of its incomplete content. The incomplete (i.e. invalid) Module.symvers should not be created. This commit changes the second pass of modpost to dump symbols into modules-only.symvers. The final Module.symvers is created by concatenating vmlinux.symvers and modules-only.symvers if both exist. Module.symvers is supposed to collect symbols from both vmlinux and modules. It might be a bit confusing, and I am not quite sure if it is an official interface, but presumably it is difficult to rename it because some tools (e.g. kmod) parse it. Signed-off-by: Masahiro Yamada <masahiroy@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-05-14media: docs: Fix data organization of MEDIA_BUS_FMT_RGB101010_1X30Liu Ying
[ Upstream commit c451ee146d449bbe39835fc3d9007b7f06332415 ] The media bus bit width of MEDIA_BUS_FMT_RGB101010_1X30 is 30. So, 'Bit31' and 'Bit30' cells for the 'MEDIA_BUS_FMT_RGB101010_1X30' row should be spaces instead of '0's. Fixes: 54f38fcae536 ("media: docs: move uAPI book to userspace-api/media") Signed-off-by: Liu Ying <victor.liu@nxp.com> Reviewed-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com> Signed-off-by: Hans Verkuil <hverkuil-cisco@xs4all.nl> Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-05-14dt-bindings: serial: stm32: Use 'type: object' instead of false for ↵dillon min
'additionalProperties' [ Upstream commit 9f299d3264c67a892af87337dbaa0bdd20830c0c ] To use additional properties 'bluetooth' on serial, need replace false with 'type: object' for 'additionalProperties' to make it as a node, else will run into dtbs_check warnings. 'arch/arm/boot/dts/stm32h750i-art-pi.dt.yaml: serial@40004800: 'bluetooth' does not match any of the regexes: 'pinctrl-[0-9]+' Fixes: af1c2d81695b ("dt-bindings: serial: Convert STM32 UART to json-schema") Reported-by: kernel test robot <lkp@intel.com> Tested-by: Valentin Caron <valentin.caron@foss.st.com> Signed-off-by: dillon min <dillon.minfei@gmail.com> Reviewed-by: Rob Herring <robh@kernel.org> Link: https://lore.kernel.org/r/1616757302-7889-8-git-send-email-dillon.minfei@gmail.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-05-14firmware: xilinx: Remove zynqmp_pm_get_eemi_ops() in ↵Nobuhiro Iwamatsu
IS_REACHABLE(CONFIG_ZYNQMP_FIRMWARE) [ Upstream commit 79bfe480a0a0b259ab9fddcd2fe52c03542b1196 ] zynqmp_pm_get_eemi_ops() was removed in commit 4db8180ffe7c: "Firmware: xilinx: Remove eemi ops for fpga related APIs", but not in IS_REACHABLE(CONFIG_ZYNQMP_FIRMWARE). Any driver who want to communicate with PMC using EEMI APIs use the functions provided for each function This removed zynqmp_pm_get_eemi_ops() in IS_REACHABLE(CONFIG_ZYNQMP_FIRMWARE), and also modify the documentation for this driver. Fixes: 4db8180ffe7c ("firmware: xilinx: Remove eemi ops for fpga related APIs") Signed-off-by: Nobuhiro Iwamatsu <iwamatsu@nigauri.org> Link: https://lore.kernel.org/r/20210215155849.2425846-1-iwamatsu@nigauri.org Signed-off-by: Michal Simek <michal.simek@xilinx.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-04-14dt-bindings: net: ethernet-controller: fix typo in NVMEMRafał Miłecki
commit af9d316f3dd6d1385fbd1631b5103e620fc4298a upstream. The correct property name is "nvmem-cell-names". This is what: 1. Was originally documented in the ethernet.txt 2. Is used in DTS files 3. Matches standard syntax for phandles 4. Linux net subsystem checks for Fixes: 9d3de3c58347 ("dt-bindings: net: Add YAML schemas for the generic Ethernet options") Signed-off-by: Rafał Miłecki <rafal@milecki.pl> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-03-30KVM: x86: Protect userspace MSR filter with SRCU, and set atomically-ishSean Christopherson
[ Upstream commit b318e8decf6b9ef1bcf4ca06fae6d6a2cb5d5c5c ] Fix a plethora of issues with MSR filtering by installing the resulting filter as an atomic bundle instead of updating the live filter one range at a time. The KVM_X86_SET_MSR_FILTER ioctl() isn't truly atomic, as the hardware MSR bitmaps won't be updated until the next VM-Enter, but the relevant software struct is atomically updated, which is what KVM really needs. Similar to the approach used for modifying memslots, make arch.msr_filter a SRCU-protected pointer, do all the work configuring the new filter outside of kvm->lock, and then acquire kvm->lock only when the new filter has been vetted and created. That way vCPU readers either see the old filter or the new filter in their entirety, not some half-baked state. Yuan Yao pointed out a use-after-free in ksm_msr_allowed() due to a TOCTOU bug, but that's just the tip of the iceberg... - Nothing is __rcu annotated, making it nigh impossible to audit the code for correctness. - kvm_add_msr_filter() has an unpaired smp_wmb(). Violation of kernel coding style aside, the lack of a smb_rmb() anywhere casts all code into doubt. - kvm_clear_msr_filter() has a double free TOCTOU bug, as it grabs count before taking the lock. - kvm_clear_msr_filter() also has memory leak due to the same TOCTOU bug. The entire approach of updating the live filter is also flawed. While installing a new filter is inherently racy if vCPUs are running, fixing the above issues also makes it trivial to ensure certain behavior is deterministic, e.g. KVM can provide deterministic behavior for MSRs with identical settings in the old and new filters. An atomic update of the filter also prevents KVM from getting into a half-baked state, e.g. if installing a filter fails, the existing approach would leave the filter in a half-baked state, having already committed whatever bits of the filter were already processed. [*] https://lkml.kernel.org/r/20210312083157.25403-1-yaoyuan0329os@gmail.com Fixes: 1a155254ff93 ("KVM: x86: Introduce MSR filtering") Cc: stable@vger.kernel.org Cc: Alexander Graf <graf@amazon.com> Reported-by: Yuan Yao <yaoyuan0329os@gmail.com> Signed-off-by: Sean Christopherson <seanjc@google.com> Message-Id: <20210316184436.2544875-2-seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-03-25scsi: libsas: Introduce a _gfp() variant of event notifiersAhmed S. Darwish
[ Upstream commit c2d0f1a65ab9fbabebb463bf36f50ea8f4633386 ] sas_alloc_event() uses in_interrupt() to decide which allocation should be used. The usage of in_interrupt() in drivers is phased out and Linus clearly requested that code which changes behaviour depending on context should either be separated or the context be conveyed in an argument passed by the caller, which usually knows the context. The in_interrupt() check is also only partially correct, because it fails to choose the correct code path when just preemption or interrupts are disabled. For example, as in the following call chain: mvsas/mv_sas.c: mvs_work_queue() [process context] spin_lock_irqsave(mvs_info::lock, ) -> libsas/sas_event.c: sas_notify_phy_event() -> sas_alloc_event() -> in_interrupt() = false -> invalid GFP_KERNEL allocation -> libsas/sas_event.c: sas_notify_port_event() -> sas_alloc_event() -> in_interrupt() = false -> invalid GFP_KERNEL allocation Introduce sas_alloc_event_gfp(), sas_notify_port_event_gfp(), and sas_notify_phy_event_gfp(), which all behave like the non _gfp() variants but use a caller-passed GFP mask for allocations. For bisectability, all callers will be modified first to pass GFP context, then the non _gfp() libsas API variants will be modified to take a gfp_t by default. Link: https://lore.kernel.org/r/20210118100955.1761652-4-a.darwish@linutronix.de Fixes: 1c393b970e0f ("scsi: libsas: Use dynamic alloced work to avoid sas event lost") Cc: Jason Yan <yanaijie@huawei.com> Reviewed-by: John Garry <john.garry@huawei.com> Signed-off-by: Ahmed S. Darwish <a.darwish@linutronix.de> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-03-25scsi: libsas: Remove notifier indirectionJohn Garry
[ Upstream commit 121181f3f839c29d8dd9fdc3cc9babbdc74227f8 ] LLDDs report events to libsas with .notify_port_event and .notify_phy_event callbacks. These callbacks are fixed and so there is no reason why the functions cannot be called directly, so do that. This neatens the code slightly, makes it more obvious, and reduces function pointer usage, which is generally a good thing. Downside is that there are 2x more symbol exports. [a.darwish@linutronix.de: Remove the now unused "sas_ha" local variables] Link: https://lore.kernel.org/r/20210118100955.1761652-3-a.darwish@linutronix.de Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Jack Wang <jinpu.wang@cloud.ionos.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Ahmed S. Darwish <a.darwish@linutronix.de> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-03-17KVM: arm64: Reject VM creation when the default IPA size is unsupportedMarc Zyngier
commit 7d717558dd5ef10d28866750d5c24ff892ea3778 upstream. KVM/arm64 has forever used a 40bit default IPA space, partially due to its 32bit heritage (where the only choice is 40bit). However, there are implementations in the wild that have a *cough* much smaller *cough* IPA space, which leads to a misprogramming of VTCR_EL2, and a guest that is stuck on its first memory access if userspace dares to ask for the default IPA setting (which most VMMs do). Instead, blundly reject the creation of such VM, as we can't satisfy the requirements from userspace (with a one-off warning). Also clarify the boot warning, and document that the VM creation will fail when an unsupported IPA size is provided. Although this is an ABI change, it doesn't really change much for userspace: - the guest couldn't run before this change, but no error was returned. At least userspace knows what is happening. - a memory slot that was accepted because it did fit the default IPA space now doesn't even get a chance to be registered. The other thing that is left doing is to convince userspace to actually use the IPA space setting instead of relying on the antiquated default. Fixes: 233a7cb23531 ("kvm: arm64: Allow tuning the physical address size for VM") Signed-off-by: Marc Zyngier <maz@kernel.org> Cc: stable@vger.kernel.org Reviewed-by: Andrew Jones <drjones@redhat.com> Reviewed-by: Eric Auger <eric.auger@redhat.com> Link: https://lore.kernel.org/r/20210311100016.3830038-2-maz@kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-03-17drivers/base/memory: don't store phys_device in memory blocksDavid Hildenbrand
[ Upstream commit e9a2e48e8704c9d20a625c6f2357147d03ea7b97 ] No need to store the value for each and every memory block, as we can easily query the value at runtime. Reshuffle the members to optimize the memory layout. Also, let's clarify what the interface once was used for and why it's legacy nowadays. "phys_device" was used on s390x in older versions of lsmem[2]/chmem[3], back when they were still part of s390x-tools. They were later replaced by the variants in linux-utils. For example, RHEL6 and RHEL7 contain lsmem/chmem from s390-utils. RHEL8 switched to versions from util-linux on s390x [4]. "phys_device" was added with sysfs support for memory hotplug in commit 3947be1969a9 ("[PATCH] memory hotplug: sysfs and add/remove functions") in 2005. It always returned 0. s390x started returning something != 0 on some setups (if sclp.rzm is set by HW) in 2010 via commit 57b552ba0b2f ("memory hotplug/s390: set phys_device"). For s390x, it allowed for identifying which memory block devices belong to the same storage increment (RZM). Only if all memory block devices comprising a single storage increment were offline, the memory could actually be removed in the hypervisor. Since commit e5d709bb5fb7 ("s390/memory hotplug: provide memory_block_size_bytes() function") in 2013 a memory block device spans at least one storage increment - which is why the interface isn't really helpful/used anymore (except by old lsmem/chmem tools). There were once RFC patches to make use of "phys_device" in ACPI context; however, the underlying problem could be solved using different interfaces [1]. [1] https://patchwork.kernel.org/patch/2163871/ [2] https://github.com/ibm-s390-tools/s390-tools/blob/v2.1.0/zconf/lsmem [3] https://github.com/ibm-s390-tools/s390-tools/blob/v2.1.0/zconf/chmem [4] https://bugzilla.redhat.com/show_bug.cgi?id=1504134 Link: https://lkml.kernel.org/r/20210201181347.13262-2-david@redhat.com Signed-off-by: David Hildenbrand <david@redhat.com> Acked-by: Michal Hocko <mhocko@suse.com> Reviewed-by: Oscar Salvador <osalvador@suse.de> Cc: Dave Hansen <dave.hansen@intel.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Gerald Schaefer <gerald.schaefer@linux.ibm.com> Cc: Jonathan Corbet <corbet@lwn.net> Cc: "Rafael J. Wysocki" <rafael@kernel.org> Cc: Mauro Carvalho Chehab <mchehab+huawei@kernel.org> Cc: Ilya Dryomov <idryomov@gmail.com> Cc: Vaibhav Jain <vaibhav@linux.ibm.com> Cc: Tom Rix <trix@redhat.com> Cc: Geert Uytterhoeven <geert+renesas@glider.be> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-03-17drm: Use USB controller's DMA mask when importing dmabufsThomas Zimmermann
commit 659ab7a49cbebe0deffcbe1f9560e82006b21817 upstream. USB devices cannot perform DMA and hence have no dma_mask set in their device structure. Therefore importing dmabuf into a USB-based driver fails, which breaks joining and mirroring of display in X11. For USB devices, pick the associated USB controller as attachment device. This allows the DRM import helpers to perform the DMA setup. If the DMA controller does not support DMA transfers, we're out of luck and cannot import. Our current USB-based DRM drivers don't use DMA, so the actual DMA device is not important. Tested by joining/mirroring displays of udl and radeon under Gnome/X11. v8: * release dmadev if device initialization fails (Noralf) * fix commit description (Noralf) v7: * fix use-before-init bug in gm12u320 (Dan) v6: * implement workaround in DRM drivers and hold reference to DMA device while USB device is in use * remove dev_is_usb() (Greg) * collapse USB helper into usb_intf_get_dma_device() (Alan) * integrate Daniel's TODO statement (Daniel) * fix typos (Greg) v5: * provide a helper for USB interfaces (Alan) * add FIXME item to documentation and TODO list (Daniel) v4: * implement workaround with USB helper functions (Greg) * use struct usb_device->bus->sysdev as DMA device (Takashi) v3: * drop gem_create_object * use DMA mask of USB controller, if any (Daniel, Christian, Noralf) v2: * move fix to importer side (Christian, Daniel) * update SHMEM and CMA helpers for new PRIME callbacks Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de> Fixes: 6eb0233ec2d0 ("usb: don't inherity DMA properties for USB devices") Tested-by: Pavel Machek <pavel@ucw.cz> Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Acked-by: Christian König <christian.koenig@amd.com> Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch> Acked-by: Noralf Trønnes <noralf@tronnes.org> Cc: Christoph Hellwig <hch@lst.de> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: <stable@vger.kernel.org> # v5.10+ Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de> Link: https://patchwork.freedesktop.org/patch/msgid/20210303133229.3288-1-tzimmermann@suse.de Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-03-17docs: networking: drop special stable handlingJakub Kicinski
commit dbbe7c962c3a8163bf724dbc3c9fdfc9b16d3117 upstream. Leave it to Greg. Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-03-07dt-bindings: net: btusb: DT fix s/interrupt-name/interrupt-names/Geert Uytterhoeven
commit f288988930e93857e0375bdf88bb670c312b82eb upstream. The standard DT property name is "interrupt-names". Fixes: fd913ef7ce619467 ("Bluetooth: btusb: Add out-of-band wakeup support") Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be> Acked-by: Rob Herring <robh@kernel.org> Reviewed-by: Brian Norris <briannorris@chromium.org> Acked-by: Rajat Jain <rajatja@google.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-03-07dt-bindings: ethernet-controller: fix fixed-link specificationRussell King
commit 322322d15b9b912bc8710c367a95a7de62220a72 upstream. The original fixed-link.txt allowed a pause property for fixed link. This has been missed in the conversion to yaml format. Fixes: 9d3de3c58347 ("dt-bindings: net: Add YAML schemas for the generic Ethernet options") Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Link: https://lore.kernel.org/r/E1l6W2G-0002Ga-0O@rmk-PC.armlinux.org.uk Signed-off-by: Rob Herring <robh@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-03-07tcp: fix tcp_rmem documentationEric Dumazet
commit 1d1be91254bbdd189796041561fd430f7553bb88 upstream. tcp_rmem[1] has been changed to 131072, we should update the documentation to reflect this. Fixes: a337531b942b ("tcp: up initial rmem to 128KB and SYN rwin to around 64KB") Signed-off-by: Eric Dumazet <edumazet@google.com> Reported-by: Zhibin Liu <zhibinliu@google.com> Cc: Yuchung Cheng <ycheng@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-03-04mm/vmscan: restore zone_reclaim_mode ABIDave Hansen
commit 519983645a9f2ec339cabfa0c6ef7b09be985dd0 upstream. I went to go add a new RECLAIM_* mode for the zone_reclaim_mode sysctl. Like a good kernel developer, I also went to go update the documentation. I noticed that the bits in the documentation didn't match the bits in the #defines. The VM never explicitly checks the RECLAIM_ZONE bit. The bit is, however implicitly checked when checking 'node_reclaim_mode==0'. The RECLAIM_ZONE #define was removed in a cleanup. That, by itself is fine. But, when the bit was removed (bit 0) the _other_ bit locations also got changed. That's not OK because the bit values are documented to mean one specific thing. Users surely do not expect the meaning to change from kernel to kernel. The end result is that if someone had a script that did: sysctl vm.zone_reclaim_mode=1 it would have gone from enabling node reclaim for clean unmapped pages to writing out pages during node reclaim after the commit in question. That's not great. Put the bits back the way they were and add a comment so something like this is a bit harder to do again. Update the documentation to make it clear that the first bit is ignored. Link: https://lkml.kernel.org/r/20210219172555.FF0CDF23@viggo.jf.intel.com Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com> Fixes: 648b5cf368e0 ("mm/vmscan: remove unused RECLAIM_OFF/RECLAIM_ZONE") Reviewed-by: Ben Widawsky <ben.widawsky@intel.com> Reviewed-by: Oscar Salvador <osalvador@suse.de> Acked-by: David Rientjes <rientjes@google.com> Acked-by: Christoph Lameter <cl@linux.com> Cc: Alex Shi <alex.shi@linux.alibaba.com> Cc: Daniel Wagner <dwagner@suse.de> Cc: "Tobin C. Harding" <tobin@kernel.org> Cc: Christoph Lameter <cl@linux.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Huang Ying <ying.huang@intel.com> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Qian Cai <cai@lca.pw> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-03-04seq_file: document how per-entry resources are managed.NeilBrown
commit b3656d8227f4c45812c6b40815d8f4e446ed372a upstream. Patch series "Fix some seq_file users that were recently broken". A recent change to seq_file broke some users which were using seq_file in a non-"standard" way ... though the "standard" isn't documented, so they can be excused. The result is a possible leak - of memory in one case, of references to a 'transport' in the other. These three patches: 1/ document and explain the problem 2/ fix the problem user in x86 3/ fix the problem user in net/sctp This patch (of 3): Users of seq_file will sometimes find it convenient to take a resource, such as a lock or memory allocation, in the ->start or ->next operations. These are per-entry resources, distinct from per-session resources which are taken in ->start and released in ->stop. The preferred management of these is release the resource on the subsequent call to ->next or ->stop. However prior to Commit 1f4aace60b0e ("fs/seq_file.c: simplify seq_file iteration code and interface") it happened that ->show would always be called after ->start or ->next, and a few users chose to release the resource in ->show. This is no longer reliable. Since the mentioned commit, ->next will always come after a successful ->show (to ensure m->index is updated correctly), so the original ordering cannot be maintained. This patch updates the documentation to clearly state the required behaviour. Other patches will fix the few problematic users. [akpm@linux-foundation.org: fix typo, per Willy] Link: https://lkml.kernel.org/r/161248518659.21478.2484341937387294998.stgit@noble1 Link: https://lkml.kernel.org/r/161248539020.21478.3147971477400875336.stgit@noble1 Fixes: 1f4aace60b0e ("fs/seq_file.c: simplify seq_file iteration code and interface") Signed-off-by: NeilBrown <neilb@suse.de> Cc: Xin Long <lucien.xin@gmail.com> Cc: Alexander Viro <viro@zeniv.linux.org.uk> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Ingo Molnar <mingo@redhat.com> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Vlad Yasevich <vyasevich@gmail.com> Cc: Neil Horman <nhorman@tuxdriver.com> Cc: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com> Cc: "David S. Miller" <davem@davemloft.net> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-03-04watch_queue: Drop references to /dev/watch_queueGabriel Krisman Bertazi
[ Upstream commit 8fe62e0c0e2efa5437f3ee81b65d69e70a45ecd2 ] The merged API doesn't use a watch_queue device, but instead relies on pipes, so let the documentation reflect that. Fixes: f7e47677e39a ("watch_queue: Add a key/keyring notification facility") Signed-off-by: Gabriel Krisman Bertazi <krisman@collabora.com> Signed-off-by: David Howells <dhowells@redhat.com> Acked-by: Jarkko Sakkinen <jarkko@kernel.org> Reviewed-by: Ben Boeckel <mathstuf@gmail.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-03-04perf/arm-cmn: Fix PMU instance namingRobin Murphy
[ Upstream commit 79d7c3dca99fa96033695ddf5d495b775a3a137b ] Although it's neat to avoid the suffix for the typical case of a single PMU, it means systems with multiple CMN instances end up with inconsistent naming. I think it also breaks perf tool's "uncore alias" logic if the common instance prefix is also the full name of one. Avoid any surprises by not trying to be clever and simply numbering every instance, even when it might technically prove redundant. Fixes: 0ba64770a2f2 ("perf: Add Arm CMN-600 PMU driver") Signed-off-by: Robin Murphy <robin.murphy@arm.com> Link: https://lore.kernel.org/r/649a2281233f193d59240b13ed91b57337c77b32.1611839564.git.robin.murphy@arm.com Signed-off-by: Will Deacon <will@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-03-04scsi: libsas: docs: Remove notify_ha_event()Ahmed S. Darwish
commit 3f901c81dfad6930de5d4e6b582c4fde880cdada upstream. The ->notify_ha_event() hook has long been removed from the libsas event interface. Remove it from documentation. Link: https://lore.kernel.org/r/20210118100955.1761652-2-a.darwish@linutronix.de Fixes: 042ebd293b86 ("scsi: libsas: kill useless ha_event and do some cleanup") Cc: stable@vger.kernel.org Reviewed-by: John Garry <john.garry@huawei.com> Reviewed-by: Jack Wang <jinpu.wang@cloud.ionos.com> Signed-off-by: Ahmed S. Darwish <a.darwish@linutronix.de> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-02-10ovl: implement volatile-specific fsync error behaviourSargun Dhillon
commit 335d3fc57941e5c6164c69d439aec1cb7a800876 upstream. Overlayfs's volatile option allows the user to bypass all forced sync calls to the upperdir filesystem. This comes at the cost of safety. We can never ensure that the user's data is intact, but we can make a best effort to expose whether or not the data is likely to be in a bad state. The best way to handle this in the time being is that if an overlayfs's upperdir experiences an error after a volatile mount occurs, that error will be returned on fsync, fdatasync, sync, and syncfs. This is contradictory to the traditional behaviour of VFS which fails the call once, and only raises an error if a subsequent fsync error has occurred, and been raised by the filesystem. One awkward aspect of the patch is that we have to manually set the superblock's errseq_t after the sync_fs callback as opposed to just returning an error from syncfs. This is because the call chain looks something like this: sys_syncfs -> sync_filesystem -> __sync_filesystem -> /* The return value is ignored here sb->s_op->sync_fs(sb) _sync_blockdev /* Where the VFS fetches the error to raise to userspace */ errseq_check_and_advance Because of this we call errseq_set every time the sync_fs callback occurs. Due to the nature of this seen / unseen dichotomy, if the upperdir is an inconsistent state at the initial mount time, overlayfs will refuse to mount, as overlayfs cannot get a snapshot of the upperdir's errseq that will increment on error until the user calls syncfs. Signed-off-by: Sargun Dhillon <sargun@sargun.me> Suggested-by: Amir Goldstein <amir73il@gmail.com> Reviewed-by: Amir Goldstein <amir73il@gmail.com> Fixes: c86243b090bc ("ovl: provide a mount option "volatile"") Cc: stable@vger.kernel.org Reviewed-by: Vivek Goyal <vgoyal@redhat.com> Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-02-03KVM: Documentation: Fix spec for KVM_CAP_ENABLE_CAP_VMQuentin Perret
commit a10f373ad3c760dd40b41e2f69a800ee7b8da15e upstream. The documentation classifies KVM_ENABLE_CAP with KVM_CAP_ENABLE_CAP_VM as a vcpu ioctl, which is incorrect. Fix it by specifying it as a VM ioctl. Fixes: e5d83c74a580 ("kvm: make KVM_CAP_ENABLE_CAP_VM architecture agnostic") Signed-off-by: Quentin Perret <qperret@google.com> Message-Id: <20210108165349.747359-1-qperret@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-02-03KVM: Forbid the use of tagged userspace addresses for memslotsMarc Zyngier
commit 139bc8a6146d92822c866cf2fd410159c56b3648 upstream. The use of a tagged address could be pretty confusing for the whole memslot infrastructure as well as the MMU notifiers. Forbid it altogether, as it never quite worked the first place. Cc: stable@vger.kernel.org Reported-by: Rick Edgecombe <rick.p.edgecombe@intel.com> Reviewed-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-02-03x86/entry: Emit a symbol for register restoring thunkNick Desaulniers
commit 5e6dca82bcaa49348f9e5fcb48df4881f6d6c4ae upstream. Arnd found a randconfig that produces the warning: arch/x86/entry/thunk_64.o: warning: objtool: missing symbol for insn at offset 0x3e when building with LLVM_IAS=1 (Clang's integrated assembler). Josh notes: With the LLVM assembler not generating section symbols, objtool has no way to reference this code when it generates ORC unwinder entries, because this code is outside of any ELF function. The limitation now being imposed by objtool is that all code must be contained in an ELF symbol. And .L symbols don't create such symbols. So basically, you can use an .L symbol *inside* a function or a code segment, you just can't use the .L symbol to contain the code using a SYM_*_START/END annotation pair. Fangrui notes that this optimization is helpful for reducing image size when compiling with -ffunction-sections and -fdata-sections. I have observed on the order of tens of thousands of symbols for the kernel images built with those flags. A patch has been authored against GNU binutils to match this behavior of not generating unused section symbols ([1]), so this will also become a problem for users of GNU binutils once they upgrade to 2.36. Omit the .L prefix on a label so that the assembler will emit an entry into the symbol table for the label, with STB_LOCAL binding. This enables objtool to generate proper unwind info here with LLVM_IAS=1 or GNU binutils 2.36+. [ bp: Massage commit message. ] Reported-by: Arnd Bergmann <arnd@arndb.de> Suggested-by: Josh Poimboeuf <jpoimboe@redhat.com> Suggested-by: Borislav Petkov <bp@alien8.de> Suggested-by: Mark Brown <broonie@kernel.org> Signed-off-by: Nick Desaulniers <ndesaulniers@google.com> Signed-off-by: Borislav Petkov <bp@suse.de> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: Josh Poimboeuf <jpoimboe@redhat.com> Link: https://lkml.kernel.org/r/20210112194625.4181814-1-ndesaulniers@google.com Link: https://github.com/ClangBuiltLinux/linux/issues/1209 Link: https://reviews.llvm.org/D93783 Link: https://sourceware.org/binutils/docs/as/Symbol-Names.html Link: https://sourceware.org/git/?p=binutils-gdb.git;a=commit;h=d1bcae833b32f1408485ce69f844dcd7ded093a8 [1] Cc: Chris Clayton <chris2553@googlemail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-01-27driver core: Fix device link device name collisionSaravana Kannan
commit e020ff611ba9be54e959e6b548038f8a020da1c9 upstream. The device link device's name was of the form: <supplier-dev-name>--<consumer-dev-name> This can cause name collision as reported here [1] as device names are not globally unique. Since device names have to be unique within the bus/class, add the bus/class name as a prefix to the device names used to construct the device link device name. So the devuce link device's name will be of the form: <supplier-bus-name>:<supplier-dev-name>--<consumer-bus-name>:<consumer-dev-name> [1] - https://lore.kernel.org/lkml/20201229033440.32142-1-michael@walle.cc/ Fixes: 287905e68dd2 ("driver core: Expose device link details in sysfs") Cc: stable@vger.kernel.org Reported-by: Michael Walle <michael@walle.cc> Tested-by: Michael Walle <michael@walle.cc> Signed-off-by: Saravana Kannan <saravanak@google.com> Link: https://lore.kernel.org/r/20210110175408.1465657-1-saravanak@google.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-01-27x86/xen: Add xen_no_vector_callback option to test PCI INTX deliveryDavid Woodhouse
[ Upstream commit b36b0fe96af13460278bf9b173beced1bd15f85d ] It's useful to be able to test non-vector event channel delivery, to make sure Linux will work properly on older Xen which doesn't have it. It's also useful for those working on Xen and Xen-compatible hypervisors, because there are guest kernels still in active use which use PCI INTX even when vector delivery is available. Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Link: https://lore.kernel.org/r/20210106153958.584169-4-dwmw2@infradead.org Signed-off-by: Juergen Gross <jgross@suse.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-01-27dm integrity: conditionally disable "recalculate" featureMikulas Patocka
commit 5c02406428d5219c367c5f53457698c58bc5f917 upstream. Otherwise a malicious user could (ab)use the "recalculate" feature that makes dm-integrity calculate the checksums in the background while the device is already usable. When the system restarts before all checksums have been calculated, the calculation continues where it was interrupted even if the recalculate feature is not requested the next time the dm device is set up. Disable recalculating if we use internal_hash or journal_hash with a key (e.g. HMAC) and we don't have the "legacy_recalculate" flag. This may break activation of a volume, created by an older kernel, that is not yet fully recalculated -- if this happens, the user should add the "legacy_recalculate" flag to constructor parameters. Cc: stable@vger.kernel.org Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Reported-by: Daniel Glockner <dg@emlix.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-01-23dt-bindings: net: renesas,etheravb: RZ/G2H needs tx-internal-delay-psGeert Uytterhoeven
[ Upstream commit f97844f9c518172f813b7ece18a9956b1f70c1bb ] The merge resolution of the interaction of commits 307eea32b202864c ("dt-bindings: net: renesas,ravb: Add support for r8a774e1 SoC") and d7adf6331189cbe9 ("dt-bindings: net: renesas,etheravb: Convert to json-schema") missed that "tx-internal-delay-ps" should be a required property on RZ/G2H. Fixes: 8b0308fe319b8002 ("Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net") Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be> Link: https://lore.kernel.org/r/20210105151516.1540653-1-geert+renesas@glider.be Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>