summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2022-04-02KVM: X86: Change the type of access u32 to u64Lai Jiangshan
Change the type of access u32 to u64 for FNAME(walk_addr) and ->gva_to_gpa(). The kinds of accesses are usually combinations of UWX, and VMX/SVM's nested paging adds a new factor of access: is it an access for a guest page table or for a final guest physical address. And SMAP relies a factor for supervisor access: explicit or implicit. So @access in FNAME(walk_addr) and ->gva_to_gpa() is better to include all these information to do the walk. Although @access(u32) has enough bits to encode all the kinds, this patch extends it to u64: o Extra bits will be in the higher 32 bits, so that we can easily obtain the traditional access mode (UWX) by converting it to u32. o Reuse the value for the access kind defined by SVM's nested paging (PFERR_GUEST_FINAL_MASK and PFERR_GUEST_PAGE_MASK) as @error_code in kvm_handle_page_fault(). Signed-off-by: Lai Jiangshan <jiangshan.ljs@antgroup.com> Message-Id: <20220311070346.45023-2-jiangshanlai@gmail.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-04-02KVM: Remove dirty handling from gfn_to_pfn_cache completelyDavid Woodhouse
It isn't OK to cache the dirty status of a page in internal structures for an indefinite period of time. Any time a vCPU exits the run loop to userspace might be its last; the VMM might do its final check of the dirty log, flush the last remaining dirty pages to the destination and complete a live migration. If we have internal 'dirty' state which doesn't get flushed until the vCPU is finally destroyed on the source after migration is complete, then we have lost data because that will escape the final copy. This problem already exists with the use of kvm_vcpu_unmap() to mark pages dirty in e.g. VMX nesting. Note that the actual Linux MM already considers the page to be dirty since we have a writeable mapping of it. This is just about the KVM dirty logging. For the nesting-style use cases (KVM_GUEST_USES_PFN) we will need to track which gfn_to_pfn_caches have been used and explicitly mark the corresponding pages dirty before returning to userspace. But we would have needed external tracking of that anyway, rather than walking the full list of GPCs to find those belonging to this vCPU which are dirty. So let's rely *solely* on that external tracking, and keep it simple rather than laying a tempting trap for callers to fall into. Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Message-Id: <20220303154127.202856-3-dwmw2@infradead.org> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-04-02KVM: Use enum to track if cached PFN will be used in guest and/or hostSean Christopherson
Replace the guest_uses_pa and kernel_map booleans in the PFN cache code with a unified enum/bitmask. Using explicit names makes it easier to review and audit call sites. Opportunistically add a WARN to prevent passing garbage; instantating a cache without declaring its usage is either buggy or pointless. Signed-off-by: Sean Christopherson <seanjc@google.com> Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Message-Id: <20220303154127.202856-2-dwmw2@infradead.org> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-04-02KVM: SVM: Fix kvm_cache_regs.h inclusions for is_guest_mode()Peter Gonda
Include kvm_cache_regs.h to pick up the definition of is_guest_mode(), which is referenced by nested_svm_virtualize_tpr() in svm.h. Remove include from svm_onhpyerv.c which was done only because of lack of include in svm.h. Fixes: 883b0a91f41ab ("KVM: SVM: Move Nested SVM Implementation to nested.c") Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Sean Christopherson <seanjc@google.com> Cc: kvm@vger.kernel.org Cc: linux-kernel@vger.kernel.org Signed-off-by: Peter Gonda <pgonda@google.com> Message-Id: <20220304161032.2270688-1-pgonda@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-04-02KVM: x86/pmu: Use different raw event masks for AMD and IntelJim Mattson
The third nybble of AMD's event select overlaps with Intel's IN_TX and IN_TXCP bits. Therefore, we can't use AMD64_RAW_EVENT_MASK on Intel platforms that support TSX. Declare a raw_event_mask in the kvm_pmu structure, initialize it in the vendor-specific pmu_refresh() functions, and use that mask for PERF_TYPE_RAW configurations in reprogram_gp_counter(). Fixes: 710c47651431 ("KVM: x86/pmu: Use AMD64_RAW_EVENT_MASK for PERF_TYPE_RAW") Signed-off-by: Jim Mattson <jmattson@google.com> Message-Id: <20220308012452.3468611-1-jmattson@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-04-02KVM: Don't actually set a request when evicting vCPUs for GFN cache invdSean Christopherson
Don't actually set a request bit in vcpu->requests when making a request purely to force a vCPU to exit the guest. Logging a request but not actually consuming it would cause the vCPU to get stuck in an infinite loop during KVM_RUN because KVM would see the pending request and bail from VM-Enter to service the request. Note, it's currently impossible for KVM to set KVM_REQ_GPC_INVALIDATE as nothing in KVM is wired up to set guest_uses_pa=true. But, it'd be all too easy for arch code to introduce use of kvm_gfn_to_pfn_cache_init() without implementing handling of the request, especially since getting test coverage of MMU notifier interaction with specific KVM features usually requires a directed test. Opportunistically rename gfn_to_pfn_cache_invalidate_start()'s wake_vcpus to evict_vcpus. The purpose of the request is to get vCPUs out of guest mode, it's supposed to _avoid_ waking vCPUs that are blocking. Opportunistically rename KVM_REQ_GPC_INVALIDATE to be more specific as to what it wants to accomplish, and to genericize the name so that it can used for similar but unrelated scenarios, should they arise in the future. Add a comment and documentation to explain why the "no action" request exists. Add compile-time assertions to help detect improper usage. Use the inner assertless helper in the one s390 path that makes requests without a hardcoded request. Cc: David Woodhouse <dwmw@amazon.co.uk> Signed-off-by: Sean Christopherson <seanjc@google.com> Message-Id: <20220223165302.3205276-1-seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-04-02KVM: avoid double put_page with gfn-to-pfn cacheDavid Woodhouse
If the cache's user host virtual address becomes invalid, there is still a path from kvm_gfn_to_pfn_cache_refresh() where __release_gpc() could release the pfn but the gpc->pfn field has not been overwritten with an error value. If this happens, kvm_gfn_to_pfn_cache_unmap will call put_page again on the same page. Cc: stable@vger.kernel.org Fixes: 982ed0de4753 ("KVM: Reinstate gfn_to_pfn_cache with invalidation support") Signed-off-by: David Woodhouse <dwmw2@infradead.org> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-04-02KVM: x86/mmu: Zap only TDP MMU leafs in zap range and mmu_notifier unmapSean Christopherson
Re-introduce zapping only leaf SPTEs in kvm_zap_gfn_range() and kvm_tdp_mmu_unmap_gfn_range(), this time without losing a pending TLB flush when processing multiple roots (including nested TDP shadow roots). Dropping the TLB flush resulted in random crashes when running Hyper-V Server 2019 in a guest with KSM enabled in the host (or any source of mmu_notifier invalidations, KSM is just the easiest to force). This effectively revert commits 873dd122172f8cce329113cfb0dfe3d2344d80c0 and fcb93eb6d09dd302cbef22bd95a5858af75e4156, and thus restores commit cf3e26427c08ad9015956293ab389004ac6a338e, plus this delta on top: bool kvm_tdp_mmu_zap_leafs(struct kvm *kvm, int as_id, gfn_t start, gfn_t end, struct kvm_mmu_page *root; for_each_tdp_mmu_root_yield_safe(kvm, root, as_id) - flush = tdp_mmu_zap_leafs(kvm, root, start, end, can_yield, false); + flush = tdp_mmu_zap_leafs(kvm, root, start, end, can_yield, flush); return flush; } Cc: Ben Gardon <bgardon@google.com> Signed-off-by: Sean Christopherson <seanjc@google.com> Tested-by: Vitaly Kuznetsov <vkuznets@redhat.com> Message-Id: <20220325230348.2587437-1-seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-04-02KVM: SVM: fix panic on out-of-bounds guest IRQYi Wang
As guest_irq is coming from KVM_IRQFD API call, it may trigger crash in svm_update_pi_irte() due to out-of-bounds: crash> bt PID: 22218 TASK: ffff951a6ad74980 CPU: 73 COMMAND: "vcpu8" #0 [ffffb1ba6707fa40] machine_kexec at ffffffff8565b397 #1 [ffffb1ba6707fa90] __crash_kexec at ffffffff85788a6d #2 [ffffb1ba6707fb58] crash_kexec at ffffffff8578995d #3 [ffffb1ba6707fb70] oops_end at ffffffff85623c0d #4 [ffffb1ba6707fb90] no_context at ffffffff856692c9 #5 [ffffb1ba6707fbf8] exc_page_fault at ffffffff85f95b51 #6 [ffffb1ba6707fc50] asm_exc_page_fault at ffffffff86000ace [exception RIP: svm_update_pi_irte+227] RIP: ffffffffc0761b53 RSP: ffffb1ba6707fd08 RFLAGS: 00010086 RAX: ffffb1ba6707fd78 RBX: ffffb1ba66d91000 RCX: 0000000000000001 RDX: 00003c803f63f1c0 RSI: 000000000000019a RDI: ffffb1ba66db2ab8 RBP: 000000000000019a R8: 0000000000000040 R9: ffff94ca41b82200 R10: ffffffffffffffcf R11: 0000000000000001 R12: 0000000000000001 R13: 0000000000000001 R14: ffffffffffffffcf R15: 000000000000005f ORIG_RAX: ffffffffffffffff CS: 0010 SS: 0018 #7 [ffffb1ba6707fdb8] kvm_irq_routing_update at ffffffffc09f19a1 [kvm] #8 [ffffb1ba6707fde0] kvm_set_irq_routing at ffffffffc09f2133 [kvm] #9 [ffffb1ba6707fe18] kvm_vm_ioctl at ffffffffc09ef544 [kvm] RIP: 00007f143c36488b RSP: 00007f143a4e04b8 RFLAGS: 00000246 RAX: ffffffffffffffda RBX: 00007f05780041d0 RCX: 00007f143c36488b RDX: 00007f05780041d0 RSI: 000000004008ae6a RDI: 0000000000000020 RBP: 00000000000004e8 R8: 0000000000000008 R9: 00007f05780041e0 R10: 00007f0578004560 R11: 0000000000000246 R12: 00000000000004e0 R13: 000000000000001a R14: 00007f1424001c60 R15: 00007f0578003bc0 ORIG_RAX: 0000000000000010 CS: 0033 SS: 002b Vmx have been fix this in commit 3a8b0677fc61 (KVM: VMX: Do not BUG() on out-of-bounds guest IRQ), so we can just copy source from that to fix this. Co-developed-by: Yi Liu <liu.yi24@zte.com.cn> Signed-off-by: Yi Liu <liu.yi24@zte.com.cn> Signed-off-by: Yi Wang <wang.yi59@zte.com.cn> Message-Id: <20220309113025.44469-1-wang.yi59@zte.com.cn> Cc: stable@vger.kernel.org Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-04-02KVM: MMU: propagate alloc_workqueue failurePaolo Bonzini
If kvm->arch.tdp_mmu_zap_wq cannot be created, the failure has to be propagated up to kvm_mmu_init_vm and kvm_arch_init_vm. kvm_arch_init_vm also has to undo all the initialization, so group all the MMU initialization code at the beginning and handle cleaning up of kvm_page_track_init. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-04-01Merge branch 'work.misc' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs Pull vfs updates from Al Viro: "Assorted bits and pieces" * 'work.misc' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: aio: drop needless assignment in aio_read() clean overflow checks in count_mounts() a bit seq_file: fix NULL pointer arithmetic warning uml/x86: use x86 load_unaligned_zeropad() asm/user.h: killed unused macros constify struct path argument of finish_automount()/do_add_mount() fs: Remove FIXME comment in generic_write_checks()
2022-04-01Merge tag 'vfs-5.18-merge-1' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linuxLinus Torvalds
Pull vfs fix from Darrick Wong: "The erofs developers felt that FIEMAP should handle ranged requests starting at s_maxbytes by returning EFBIG instead of passing the filesystem implementation a nonsense 0-byte request. Not sure why they keep tagging this 'iomap', but the VFS shouldn't be asking for information about ranges of a file that the filesystem already declared that it does not support. - Fix a potential infinite loop in FIEMAP by fixing an off by one error when comparing the requested range against s_maxbytes" * tag 'vfs-5.18-merge-1' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux: fs: fix an infinite loop in iomap_fiemap
2022-04-01Merge tag 'xfs-5.18-merge-4' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linuxLinus Torvalds
Pull xfs fixes from Darrick Wong: "This fixes multiple problems in the reserve pool sizing functions: an incorrect free space calculation, a pointless infinite loop, and even more braindamage that could result in the pool being overfilled. The pile of patches from Dave fix myriad races and UAF bugs in the log recovery code that much to our mutual surprise nobody's tripped over. Dave also fixed a performance optimization that had turned into a regression. Dave Chinner is taking over as XFS maintainer starting Sunday and lasting until 5.19-rc1 is tagged so that I can focus on starting a massive design review for the (feature complete after five years) online repair feature. From then on, he and I will be moving XFS to a co-maintainership model by trading duties every other release. NOTE: I hope very strongly that the other pieces of the (X)FS ecosystem (fstests and xfsprogs) will make similar changes to spread their maintenance load. Summary: - Fix an incorrect free space calculation in xfs_reserve_blocks that could lead to a request for free blocks that will never succeed. - Fix a hang in xfs_reserve_blocks caused by an infinite loop and the incorrect free space calculation. - Fix yet a third problem in xfs_reserve_blocks where multiple racing threads can overfill the reserve pool. - Fix an accounting error that lead to us reporting reserved space as "available". - Fix a race condition during abnormal fs shutdown that could cause UAF problems when memory reclaim and log shutdown try to clean up inodes. - Fix a bug where log shutdown can race with unmount to tear down the log, thereby causing UAF errors. - Disentangle log and filesystem shutdown to reduce confusion. - Fix some confusion in xfs_trans_commit such that a race between transaction commit and filesystem shutdown can cause unlogged dirty inode metadata to be committed, thereby corrupting the filesystem. - Remove a performance optimization in the log as it was discovered that certain storage hardware handle async log flushes so poorly as to cause serious performance regressions. Recent restructuring of other parts of the logging code mean that no performance benefit is seen on hardware that handle it well" * tag 'xfs-5.18-merge-4' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux: xfs: drop async cache flushes from CIL commits. xfs: shutdown during log recovery needs to mark the log shutdown xfs: xfs_trans_commit() path must check for log shutdown xfs: xfs_do_force_shutdown needs to block racing shutdowns xfs: log shutdown triggers should only shut down the log xfs: run callbacks before waking waiters in xlog_state_shutdown_callbacks xfs: shutdown in intent recovery has non-intent items in the AIL xfs: aborting inodes on shutdown may need buffer lock xfs: don't report reserved bnobt space as available xfs: fix overfilling of reserve pool xfs: always succeed at setting the reserve pool size xfs: remove infinite loop when reserving free block pool xfs: don't include bnobt blocks when reserving free block pool xfs: document the XFS_ALLOC_AGFL_RESERVE constant
2022-04-01Merge tag 'riscv-for-linus-5.18-mw2' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux Pull RISC-V fix from Palmer Dabbelt: - Fix the RISC-V section of the generic CPU idle bindings to comply with the recently tightened DT schema. * tag 'riscv-for-linus-5.18-mw2' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux: dt-bindings: Fix phandle-array issues in the idle-states bindings
2022-04-01Merge tag 'for-5.18/drivers-2022-04-01' of git://git.kernel.dk/linux-blockLinus Torvalds
Pull block driver fixes from Jens Axboe: "Followup block driver updates and fixes for the 5.18-rc1 merge window. In detail: - NVMe pull request - Fix multipath hang when disk goes live over reconnect (Anton Eidelman) - fix RCU hole that allowed for endless looping in multipath round robin (Chris Leech) - remove redundant assignment after left shift (Colin Ian King) - add quirks for Samsung X5 SSDs (Monish Kumar R) - fix the read-only state for zoned namespaces with unsupposed features (Pankaj Raghav) - use a private workqueue instead of the system workqueue in nvmet (Sagi Grimberg) - allow duplicate NSIDs for private namespaces (Sungup Moon) - expose use_threaded_interrupts read-only in sysfs (Xin Hao)" - nbd minor allocation fix (Zhang) - drbd fixes and maintainer addition (Lars, Jakob, Christoph) - n64cart build fix (Jackie) - loop compat ioctl fix (Carlos) - misc fixes (Colin, Dongli)" * tag 'for-5.18/drivers-2022-04-01' of git://git.kernel.dk/linux-block: drbd: remove check of list iterator against head past the loop body drbd: remove usage of list iterator variable after loop nbd: fix possible overflow on 'first_minor' in nbd_dev_add() MAINTAINERS: add drbd co-maintainer drbd: fix potential silent data corruption loop: fix ioctl calls using compat_loop_info nvme-multipath: fix hang when disk goes live over reconnect nvme: fix RCU hole that allowed for endless looping in multipath round robin nvme: allow duplicate NSIDs for private namespaces nvmet: remove redundant assignment after left shift nvmet: use a private workqueue instead of the system workqueue nvme-pci: add quirks for Samsung X5 SSDs nvme-pci: expose use_threaded_interrupts read-only in sysfs nvme: fix the read-only state for zoned namespaces with unsupposed features n64cart: convert bi_disk to bi_bdev->bd_disk fix build xen/blkfront: fix comment for need_copy xen-blkback: remove redundant assignment to variable i
2022-04-01Merge tag 'for-5.18/block-2022-04-01' of git://git.kernel.dk/linux-blockLinus Torvalds
Pull block fixes from Jens Axboe: "Either fixes or a few additions that got missed in the initial merge window pull. In detail: - List iterator fix to avoid leaking value post loop (Jakob) - One-off fix in minor count (Christophe) - Fix for a regression in how io priority setting works for an exiting task (Jiri) - Fix a regression in this merge window with blkg_free() being called in an inappropriate context (Ming) - Misc fixes (Ming, Tom)" * tag 'for-5.18/block-2022-04-01' of git://git.kernel.dk/linux-block: blk-wbt: remove wbt_track stub block: use dedicated list iterator variable block: Fix the maximum minor value is blk_alloc_ext_minor() block: restore the old set_task_ioprio() behaviour wrt PF_EXITING block: avoid calling blkg_free() in atomic context lib/sbitmap: allocate sb->map via kvzalloc_node
2022-04-01Merge tag 'for-5.18/io_uring-2022-04-01' of git://git.kernel.dk/linux-blockLinus Torvalds
Pull io_uring fixes from Jens Axboe: "A little bit all over the map, some regression fixes for this merge window, and some general fixes that are stable bound. In detail: - Fix an SQPOLL memory ordering issue (Almog) - Accept fixes (Dylan) - Poll fixes (me) - Fixes for provided buffers and recycling (me) - Tweak to IORING_OP_MSG_RING command added in this merge window (me) - Memory leak fix (Pavel) - Misc fixes and tweaks (Pavel, me)" * tag 'for-5.18/io_uring-2022-04-01' of git://git.kernel.dk/linux-block: io_uring: defer msg-ring file validity check until command issue io_uring: fail links if msg-ring doesn't succeeed io_uring: fix memory leak of uid in files registration io_uring: fix put_kbuf without proper locking io_uring: fix invalid flags for io_put_kbuf() io_uring: improve req fields comments io_uring: enable EPOLLEXCLUSIVE for accept poll io_uring: improve task work cache utilization io_uring: fix async accept on O_NONBLOCK sockets io_uring: remove IORING_CQE_F_MSG io_uring: add flag for disabling provided buffer recycling io_uring: ensure recv and recvmsg handle MSG_WAITALL correctly io_uring: don't recycle provided buffer if punted to async worker io_uring: fix assuming triggered poll waitqueue is the single poll io_uring: bump poll refs to full 31-bits io_uring: remove poll entry from list when canceling all io_uring: fix memory ordering when SQPOLL thread goes to sleep io_uring: ensure that fsnotify is always called io_uring: recycle provided before arming poll
2022-04-01Merge tag 'for-5.18/dm-fixes' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm Pull device mapper fixes from Mike Snitzer: - Fix DM integrity shrink crash due to journal entry not being marked unused. - Fix DM bio polling to handle possibility that underlying device(s) return BLK_STS_AGAIN during submission. - Fix dm_io and dm_target_io flags race condition on Alpha. - Add some pr_err debugging to help debug cases when DM ioctl structure is corrupted. * tag 'for-5.18/dm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm: dm: fix bio polling to handle possibile BLK_STS_AGAIN dm: fix dm_io and dm_target_io flags race condition on Alpha dm integrity: set journal entry unused when shrinking device dm ioctl: log an error if the ioctl structure is corrupted
2022-04-01dt-bindings: Fix phandle-array issues in the idle-states bindingsPalmer Dabbelt
As per 39bd2b6a3783 ("dt-bindings: Improve phandle-array schemas"), the phandle-array bindings have been disambiguated. This fixes the new RISC-V idle-states bindings to comply with the schema. Fixes: 1bd524f7e8d8 ("dt-bindings: Add common bindings for ARM and RISC-V idle states") Reviewed-by: Rob Herring <robh@kernel.org> Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2022-04-01Merge tag '5.18-rc-ksmbd-server-fixes' of git://git.samba.org/ksmbdLinus Torvalds
Pull ksmbd updates from Steve French: - three cleanup fixes - shorten module load warning - two documentation fixes * tag '5.18-rc-ksmbd-server-fixes' of git://git.samba.org/ksmbd: ksmbd: replace usage of found with dedicated list iterator variable ksmbd: Remove a redundant zeroing of memory MAINTAINERS: ksmbd: switch Sergey to reviewer ksmbd: shorten experimental warning on loading the module ksmbd: use netif_is_bridge_port Documentation: ksmbd: update Feature Status table
2022-04-01Merge tag '5.18-smb3-fixes-part2' of git://git.samba.org/sfrench/cifs-2.6Linus Torvalds
Pull more cifs updates from Steve French: - three fixes for big endian issues in how Persistent and Volatile file ids were stored - Various misc. fixes: including some for oops, 2 for ioctls, 1 for writeback - cleanup of how tcon (tree connection) status is tracked - Four changesets to move various duplicated protocol definitions (defined both in cifs.ko and ksmbd) into smbfs_common/smb2pdu.h - important performance improvement to use cached handles in some key compounding code paths (reduces numbers of opens/closes sent in some workloads) - fix to allow alternate DFS target to be used to retry on a failed i/o * tag '5.18-smb3-fixes-part2' of git://git.samba.org/sfrench/cifs-2.6: cifs: fix NULL ptr dereference in smb2_ioctl_query_info() cifs: prevent bad output lengths in smb2_ioctl_query_info() smb3: fix ksmbd bigendian bug in oplock break, and move its struct to smbfs_common smb3: cleanup and clarify status of tree connections smb3: move defines for query info and query fsinfo to smbfs_common smb3: move defines for ioctl protocol header and SMB2 sizes to smbfs_common [smb3] move more common protocol header definitions to smbfs_common cifs: fix incorrect use of list iterator after the loop ksmbd: store fids as opaque u64 integers cifs: fix bad fids sent over wire cifs: change smb2_query_info_compound to use a cached fid, if available cifs: convert the path to utf16 in smb2_query_info_compound cifs: writeback fix cifs: do not skip link targets when an I/O fails
2022-04-01Merge tag 'exfat-for-5.18-rc1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/linkinjeon/exfat Pull exfat updates from Namjae Jeon: - Add keep_last_dots mount option to allow access to paths with trailing dots - Avoid repetitive volume dirty bit set/clear to improve storage life time * tag 'exfat-for-5.18-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/linkinjeon/exfat: exfat: do not clear VolumeDirty in writeback exfat: allow access to paths with trailing dots
2022-04-01Merge tag 'folio-5.18d' of git://git.infradead.org/users/willy/pagecacheLinus Torvalds
Pull more filesystem folio updates from Matthew Wilcox: "A mixture of odd changes that didn't quite make it into the original pull and fixes for things that did. Also the readpages changes had to wait for the NFS tree to be pulled first. - Remove ->readpages infrastructure - Remove AOP_FLAG_CONT_EXPAND - Move read_descriptor_t to networking code - Pass the iocb to generic_perform_write - Minor updates to iomap, btrfs, ext4, f2fs, ntfs" * tag 'folio-5.18d' of git://git.infradead.org/users/willy/pagecache: btrfs: Remove a use of PAGE_SIZE in btrfs_invalidate_folio() ntfs: Correct mark_ntfs_record_dirty() folio conversion f2fs: Get the superblock from the mapping instead of the page f2fs: Correct f2fs_dirty_data_folio() conversion ext4: Correct ext4_journalled_dirty_folio() conversion filemap: Remove AOP_FLAG_CONT_EXPAND fs: Pass an iocb to generic_perform_write() fs, net: Move read_descriptor_t to net.h fs: Remove read_actor_t iomap: Simplify is_partially_uptodate a little readahead: Update comments mm: remove the skip_page argument to read_pages mm: remove the pages argument to read_pages fs: Remove ->readpages address space operation readahead: Remove read_cache_pages()
2022-04-01Merge tag 'xarray-5.18' of git://git.infradead.org/users/willy/xarrayLinus Torvalds
Pull XArray updates from Matthew Wilcox: - Documentation update - Fix test-suite build after move of bitmap.h - Fix xas_create_range() when a large entry is already present - Fix xas_split() of a shadow entry * tag 'xarray-5.18' of git://git.infradead.org/users/willy/xarray: XArray: Update the LRU list in xas_split() XArray: Fix xas_create_range() when multi-order entry present XArray: Include bitmap.h from xarray.h XArray: Document the locking requirement for the xa_state
2022-04-01Merge tag 'riscv-for-linus-5.18-mw1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux Pull more RISC-V updates from Palmer Dabbelt: "This has a handful of new features: - Support for CURRENT_STACK_POINTER, which enables some extra stack debugging for HARDENED_USERCOPY. - Support for the new SBI CPU idle extension, via cpuidle and suspend drivers. - Profiling has been enabled in the defconfigs. but is mostly fixes and cleanups" * tag 'riscv-for-linus-5.18-mw1' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux: (21 commits) RISC-V: K210 defconfigs: Drop redundant MEMBARRIER=n RISC-V: defconfig: Drop redundant SBI HVC and earlycon Documentation: riscv: remove non-existent directory from table of contents riscv: cpu.c: don't use kernel-doc markers for comments RISC-V: Enable profiling by default RISC-V: module: fix apply_r_riscv_rcv_branch_rela typo RISC-V: Declare per cpu boot data as static RISC-V: Fix a comment typo in riscv_of_parent_hartid() riscv: Increase stack size under KASAN riscv: Fix fill_callchain return value riscv: dts: canaan: Fix SPI3 bus width riscv: Rename "sp_in_global" to "current_stack_pointer" riscv module: remove (NOLOAD) RISC-V: Enable RISC-V SBI CPU Idle driver for QEMU virt machine dt-bindings: Add common bindings for ARM and RISC-V idle states cpuidle: Add RISC-V SBI CPU idle driver cpuidle: Factor-out power domain related code from PSCI domain driver RISC-V: Add SBI HSM suspend related defines RISC-V: Add arch functions for non-retentive suspend entry/exit RISC-V: Rename relocate() and make it global ...
2022-04-01Merge tag 's390-5.18-2' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux Pull more s390 updates from Vasily Gorbik: - Add kretprobes framepointer verification and return address recovery in stacktrace. - Support control domain masks on custom zcrypt devices and filter admin requests. - Cleanup timer API usage. - Rework absolute lowcore access helpers. - Other various small improvements and fixes. * tag 's390-5.18-2' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux: (26 commits) s390/alternatives: avoid using jgnop mnemonic s390/pci: rename get_zdev_by_bus() to zdev_from_bus() s390/pci: improve zpci_dev reference counting s390/smp: use physical address for SIGP_SET_PREFIX command s390: cleanup timer API use s390/zcrypt: fix using the correct variable for sizeof() s390/vfio-ap: fix kernel doc and signature of group notifier functions s390/maccess: rework absolute lowcore accessors s390/smp: cleanup control register update routines s390/smp: cleanup target CPU callback starting s390/test_unwind: verify __kretprobe_trampoline is replaced s390/unwind: avoid duplicated unwinding entries for kretprobes s390/unwind: recover kretprobe modified return address in stacktrace s390/kprobes: enable kretprobes framepointer verification s390/test_unwind: extend kretprobe test s390/ap: adjust whitespace s390/ap: use insn format for new instructions s390/alternatives: use insn format for new instructions s390/alternatives: use instructions instead of byte patterns s390/traps: improve panic message for translation-specification exception ...
2022-04-01Merge tag 'soc-fixes-5.18-1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc Pull ARM SoC fixes from Arnd BergmannL "The introduction of vmap-stack on 32-bit arm caused a regression on a few omap3/omap4 machines that pass a stack variable into a firmware interface. The early pre-ACPI AMD Seattle machines have been broken for a while, Ard Biesheuvel has a series to bring them back for now. A few machines with multiple DMA channels used on a device have the channels in the wrong order according to the binding, which causes a harmless warning. Reversing the order is easier than fixing the tools to suppress the warning" * tag 'soc-fixes-5.18-1' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc: arm64: dts: ls1046a: Update i2c node dma properties arm64: dts: ls1043a: Update i2c dma properties ARM: dts: spear1340: Update serial node properties ARM: dts: spear13xx: Update SPI dma properties ARM: OMAP2+: Fix regression for smc calls for vmap stack dt: amd-seattle: add a description of the CPUs and caches dt: amd-seattle: disable IPMI controller and some GPIO blocks on B0 dt: amd-seattle: add description of the SATA/CCP SMMUs dt: amd-seattle: add a description of the PCIe SMMU dt: amd-seattle: fix PCIe legacy interrupt routing dt: amd-seattle: upgrade AMD Seattle XGBE to new SMMU binding dt: amd-seattle: remove Overdrive revision A0 support dt: amd-seattle: remove Husky platform
2022-04-01perf python: Convert tracepoint.py example to python3Tanu M
Convert the tracepoint.py file to python3 as many of the files in tools/perf are already written in python3. Committer testing: # export PYTHONPATH=/tmp/build/perf/python/ # python3 ~acme/git/perf/tools/perf/python/tracepoint.py | head time 67394457376909 prev_comm=swapper/12 prev_pid=0 prev_prio=120 prev_state=0x0 ==> next_comm=gnome-terminal- next_pid=3313 next_prio=120 time 67394457807669 prev_comm=python3 prev_pid=1485930 prev_prio=120 prev_state=0x1 ==> next_comm=swapper/13 next_pid=0 next_prio=120 time 67394457811859 prev_comm=swapper/13 prev_pid=0 prev_prio=120 prev_state=0x0 ==> next_comm=python3 next_pid=1485930 next_prio=120 time 67394457824929 prev_comm=python3 prev_pid=1485930 prev_prio=120 prev_state=0x1 ==> next_comm=swapper/13 next_pid=0 next_prio=120 time 67394457831899 prev_comm=swapper/13 prev_pid=0 prev_prio=120 prev_state=0x0 ==> next_comm=python3 next_pid=1485930 next_prio=120 time 67394457842299 prev_comm=python3 prev_pid=1485930 prev_prio=120 prev_state=0x1 ==> next_comm=swapper/13 next_pid=0 next_prio=120 time 67394457844179 prev_comm=swapper/13 prev_pid=0 prev_prio=120 prev_state=0x0 ==> next_comm=python3 next_pid=1485930 next_prio=120 time 67394457853879 prev_comm=python3 prev_pid=1485930 prev_prio=120 prev_state=0x1 ==> next_comm=swapper/13 next_pid=0 next_prio=120 time 67394457856339 prev_comm=swapper/13 prev_pid=0 prev_prio=120 prev_state=0x0 ==> next_comm=python3 next_pid=1485930 next_prio=120 time 67394457865659 prev_comm=python3 prev_pid=1485930 prev_prio=120 prev_state=0x1 ==> next_comm=swapper/13 next_pid=0 next_prio=120 Traceback (most recent call last): File "/var/home/acme/git/perf/tools/perf/python/tracepoint.py", line 48, in <module> main() File "/var/home/acme/git/perf/tools/perf/python/tracepoint.py", line 37, in main print("time %u prev_comm=%s prev_pid=%d prev_prio=%d prev_state=0x%x ==> next_comm=%s next_pid=%d next_prio=%d" % ( BrokenPipeError: [Errno 32] Broken pipe # Signed-off-by: Tanu M <tanu235m@gmail.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/linux-perf-users/CAPS78prawYzRZnyhWjgOnGw4EwoswNwztvfZFdCOPOydFzVwzQ@mail.gmail.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-04-01perf evlist: Directly return instead of using local ret variableHaowen Bai
Addresses this coccinelle warning: ./tools/perf/util/evlist.c:1333:5-8: Unneeded variable: "err". Return "- ENOMEM" on line 1358 Signed-off-by: Haowen Bai <baihaowen@meizu.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Andrii Nakryiko <andrii@kernel.org> Cc: Daniel Borkmann <daniel@iogearbox.net> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Fastabend <john.fastabend@gmail.com> Cc: KP Singh <kpsingh@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Martin KaFai Lau <kafai@fb.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Song Liu <songliubraving@fb.com> Cc: Yonghong Song <yhs@fb.com> Cc: bpf@vger.kernel.org Cc: netdev@vger.kernel.org Link: http://lore.kernel.org/lkml/1648432532-23151-1-git-send-email-baihaowen@meizu.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-04-01perf cpumap: More cpu map reuse by merge.Ian Rogers
perf_cpu_map__merge() will reuse one of its arguments if they are equal or the other argument is NULL. The arguments could be reused if it is known one set of values is a subset of the other. For example, a map of 0-1 and a map of just 0 when merged yields the map of 0-1. Currently a new map is created rather than adding a reference count to the original 0-1 map. Signed-off-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Antonov <alexander.antonov@linux.intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Alexey Bayduraev <alexey.v.bayduraev@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Andrii Nakryiko <andrii@kernel.org> Cc: Daniel Borkmann <daniel@iogearbox.net> Cc: German Gomez <german.gomez@arm.com> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Fastabend <john.fastabend@gmail.com> Cc: John Garry <john.garry@huawei.com> Cc: KP Singh <kpsingh@kernel.org> Cc: Kajol Jain <kjain@linux.ibm.com> Cc: Leo Yan <leo.yan@linaro.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Martin KaFai Lau <kafai@fb.com> Cc: Mathieu Poirier <mathieu.poirier@linaro.org> Cc: Mike Leach <mike.leach@linaro.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Riccardo Mancini <rickyman7@gmail.com> Cc: Song Liu <songliubraving@fb.com> Cc: Stephane Eranian <eranian@google.com> Cc: Suzuki Poulouse <suzuki.poulose@arm.com> Cc: Will Deacon <will@kernel.org> Cc: Yonghong Song <yhs@fb.com> Cc: bpf@vger.kernel.org Cc: coresight@lists.linaro.org Cc: linux-arm-kernel@lists.infradead.org Cc: netdev@vger.kernel.org Link: http://lore.kernel.org/lkml/20220328232648.2127340-5-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-04-01perf cpumap: Add is_subset functionIan Rogers
Returns true if the second argument is a subset of the first. Signed-off-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Antonov <alexander.antonov@linux.intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Alexey Bayduraev <alexey.v.bayduraev@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Andrii Nakryiko <andrii@kernel.org> Cc: Daniel Borkmann <daniel@iogearbox.net> Cc: German Gomez <german.gomez@arm.com> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Fastabend <john.fastabend@gmail.com> Cc: John Garry <john.garry@huawei.com> Cc: KP Singh <kpsingh@kernel.org> Cc: Kajol Jain <kjain@linux.ibm.com> Cc: Leo Yan <leo.yan@linaro.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Martin KaFai Lau <kafai@fb.com> Cc: Mathieu Poirier <mathieu.poirier@linaro.org> Cc: Mike Leach <mike.leach@linaro.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Riccardo Mancini <rickyman7@gmail.com> Cc: Song Liu <songliubraving@fb.com> Cc: Stephane Eranian <eranian@google.com> Cc: Suzuki Poulouse <suzuki.poulose@arm.com> Cc: Will Deacon <will@kernel.org> Cc: Yonghong Song <yhs@fb.com> Cc: bpf@vger.kernel.org Cc: coresight@lists.linaro.org Cc: linux-arm-kernel@lists.infradead.org Cc: netdev@vger.kernel.org Link: http://lore.kernel.org/lkml/20220328232648.2127340-4-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-04-01perf evlist: Rename cpus to user_requested_cpusIan Rogers
evlist contains cpus and all_cpus. all_cpus is the union of the cpu maps of all evsels. For non-task targets, cpus is set to be cpus requested from the command line, defaulting to all online cpus if no cpus are specified. For an uncore event, all_cpus may be just CPU 0 or every online CPU. This causes all_cpus to have fewer values than the cpus variable which is confusing given the 'all' in the name. To try to make the behavior clearer, rename cpus to user_requested_cpus and add comments on the two struct variables. Signed-off-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Antonov <alexander.antonov@linux.intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Alexey Bayduraev <alexey.v.bayduraev@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Andrii Nakryiko <andrii@kernel.org> Cc: Daniel Borkmann <daniel@iogearbox.net> Cc: German Gomez <german.gomez@arm.com> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Fastabend <john.fastabend@gmail.com> Cc: John Garry <john.garry@huawei.com> Cc: KP Singh <kpsingh@kernel.org> Cc: Kajol Jain <kjain@linux.ibm.com> Cc: Leo Yan <leo.yan@linaro.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Martin KaFai Lau <kafai@fb.com> Cc: Mathieu Poirier <mathieu.poirier@linaro.org> Cc: Mike Leach <mike.leach@linaro.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Riccardo Mancini <rickyman7@gmail.com> Cc: Song Liu <songliubraving@fb.com> Cc: Stephane Eranian <eranian@google.com> Cc: Suzuki Poulouse <suzuki.poulose@arm.com> Cc: Will Deacon <will@kernel.org> Cc: Yonghong Song <yhs@fb.com> Cc: bpf@vger.kernel.org Cc: coresight@lists.linaro.org Cc: linux-arm-kernel@lists.infradead.org Cc: netdev@vger.kernel.org Link: http://lore.kernel.org/lkml/20220328232648.2127340-3-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-04-01perf tools: Stop depending on .git files for building PERF-VERSION-FILEJohn Garry
This essentially reverts commit c72e3f04b45fb2e5 ("tools/perf/build: Speed up git-version test on re-make") and commit 4e666cdb06eede20 ("perf tools: Fix dependency for version file creation") In commit c72e3f04b45fb2e5 ("tools/perf/build: Speed up git-version test on re-make"), a makefile dependency on .git/HEAD was added. The background is that running PERF-VERSION-FILE is relatively slow, and commands like "git describe" are particularly slow. In commit 4e666cdb06eede20 ("perf tools: Fix dependency for version file creation"), an additional dependency on .git/ORIG_HEAD was added, as .git/HEAD may not change for "git reset --hard HEAD^" command. However, depending on whether we're on a branch or not, a "git cherry-pick" may not lead to the version being updated. As discussed with the git community in [0], using git internal files for dependencies is not reliable. Commit 4e666cdb06ee also breaks some build scenarios [1]. As mentioned, c72e3f04b45fb2e5 ("tools/perf/build: Speed up git-version test on re-make") was added to speed up the build. However in commit 7572733b84997d23 ("perf tools: Fix version kernel tag") we removed the call to "git describe", so just revert Makefile.perf back to same as pre c72e3f04b45fb2e5 ("tools/perf/build: Speed up git-version test on re-make") and the build should not be so slow, as below: Pre 7572733b8499: $> time util/PERF-VERSION-GEN PERF_VERSION = 5.17.rc8.g4e666cdb06ee real 0m0.110s user 0m0.091s sys 0m0.019s Post 7572733b8499: $> time util/PERF-VERSION-GEN PERF_VERSION = 5.17.rc8.g7572733b8499 real 0m0.039s user 0m0.036s sys 0m0.007s [0] https://lore.kernel.org/git/87wngkpddp.fsf@igel.home/T/#m4a4dd6de52fdbe21179306cd57b3761eb07f45f8 [1] https://lore.kernel.org/linux-perf-users/20220329093120.4173283-1-matthieu.baerts@tessares.net/T/#u Committer testing: After a fresh rebuild using 'make -C tools/perf O=/tmp/build/perf install-bin': $ perf -v perf version 5.17.g162f9db407b6 $ git log --oneline -1 162f9db407b6a6e5 (HEAD -> perf/core) perf tools: Stop depending on .git files for building PERF-VERSION-FILE $ Now using a detached tarball, i.e. outside the kernel source tree: $ ls -la perf*tar ls: cannot access 'perf*tar': No such file or directory $ make perf-tar-src-pkg TAR PERF_VERSION = 5.17.g31d10b3ef133 $ ls -la perf*tar -rw-r--r--. 1 acme acme 22241280 Mar 30 13:26 perf-5.17.0.tar $ mv perf-5.17.0.tar /tmp $ cd /tmp $ tar xf perf-5.17.0.tar $ cd perf-5.17.0/ $ make -C tools/perf |& tail CC util/pmu.o CC util/pmu-flex.o CC util/expr-flex.o CC util/expr.o LD util/scripting-engines/perf-in.o LD util/intel-pt-decoder/perf-in.o LD util/perf-in.o LD perf-in.o LINK perf make: Leaving directory '/tmp/perf-5.17.0/tools/perf' $ tools/perf/perf -v perf version 5.17.g31d10b3ef133 $ pwd /tmp/perf-5.17.0 $ cat PERF-VERSION-FILE #define PERF_VERSION "5.17.g31d10b3ef133" $ Fixes: 4e666cdb06eede20 ("perf tools: Fix dependency for version file creation") Reported-by: Matthieu Baerts <matthieu.baerts@tessares.net> Signed-off-by: John Garry <john.garry@huawei.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Tested-by: Matthieu Baerts <matthieu.baerts@tessares.net> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Garry <john.garry@huawei.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/1648635774-14581-1-git-send-email-john.garry@huawei.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-04-01tools headers cpufeatures: Sync with the kernel sourcesArnaldo Carvalho de Melo
To pick the changes from: 991625f3dd2cbc4b ("x86/ibt: Add IBT feature, MSR and #CP handling") This only causes these perf files to be rebuilt: CC /tmp/build/perf/bench/mem-memcpy-x86-64-asm.o CC /tmp/build/perf/bench/mem-memset-x86-64-asm.o And addresses this perf build warning: Warning: Kernel ABI header at 'tools/arch/x86/include/asm/cpufeatures.h' differs from latest version at 'arch/x86/include/asm/cpufeatures.h' diff -u tools/arch/x86/include/asm/cpufeatures.h arch/x86/include/asm/cpufeatures.h Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/lkml/YkSCx2kr4ambH+Qe@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-04-01tools headers UAPI: Sync drm/i915_drm.h with the kernel sourcesArnaldo Carvalho de Melo
To pick up the changes in: caa574ffc4aaf4f2 ("drm/i915/uapi: document behaviour for DG2 64K support") That don't add any new ioctl, so no changes in tooling. This silences this perf build warning: Warning: Kernel ABI header at 'tools/include/uapi/drm/i915_drm.h' differs from latest version at 'include/uapi/drm/i915_drm.h' diff -u tools/include/uapi/drm/i915_drm.h include/uapi/drm/i915_drm.h Cc: Lucas De Marchi <lucas.demarchi@intel.com> Cc: Matthew Auld <matthew.auld@intel.com> Link: http://lore.kernel.org/lkml/YkSChHqaOApscFQ0@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-04-01tools headers UAPI: Sync linux/kvm.h with the kernel sourcesArnaldo Carvalho de Melo
To pick the changes in: 6d8491910fcd3324 ("KVM: x86: Introduce KVM_CAP_DISABLE_QUIRKS2") ef11c9463ae00630 ("KVM: s390: Add vm IOCTL for key checked guest absolute memory access") e9e9feebcbc14b17 ("KVM: s390: Add optional storage key checking to MEMOP IOCTL") That just rebuilds perf, as these patches don't add any new KVM ioctl to be harvested for the the 'perf trace' ioctl syscall argument beautifiers. This is also by now used by tools/testing/selftests/kvm/, a simple test build succeeded. This silences this perf build warning: Warning: Kernel ABI header at 'tools/include/uapi/linux/kvm.h' differs from latest version at 'include/uapi/linux/kvm.h' diff -u tools/include/uapi/linux/kvm.h include/uapi/linux/kvm.h Cc: Christian Borntraeger <borntraeger@linux.ibm.com> Cc: Janis Schoetterl-Glausch <scgl@linux.ibm.com> Cc: Oliver Upton <oupton@google.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Link: http://lore.kernel.org/lkml/YkSCOWHQdir1lhdJ@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-04-01tools kvm headers arm64: Update KVM headers from the kernel sourcesArnaldo Carvalho de Melo
To pick the changes from: 34739fd95fab3a5e ("KVM: arm64: Indicate SYSTEM_RESET2 in kvm_run::system_event flags field") 583cda1b0e7d5d49 ("KVM: arm64: Refuse to run VCPU if the PMU doesn't match the physical CPU") That don't causes any changes in tooling (when built on x86), only addresses this perf build warning: Warning: Kernel ABI header at 'tools/arch/arm64/include/uapi/asm/kvm.h' differs from latest version at 'arch/arm64/include/uapi/asm/kvm.h' diff -u tools/arch/arm64/include/uapi/asm/kvm.h arch/arm64/include/uapi/asm/kvm.h Cc: Marc Zyngier <maz@kernel.org> Cc: Will Deacon <will@kernel.org> Cc: Alexandru Elisei <alexandru.elisei@arm.com> Link: https://lore.kernel.org/lkml/YkSB4Q7kWmnaqeZU@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-04-01tools arch x86: Sync the msr-index.h copy with the kernel sourcesArnaldo Carvalho de Melo
To pick up the changes in: 991625f3dd2cbc4b ("x86/ibt: Add IBT feature, MSR and #CP handling") Addressing these tools/perf build warnings: diff -u tools/arch/x86/include/asm/msr-index.h arch/x86/include/asm/msr-index.h Warning: Kernel ABI header at 'tools/arch/x86/include/asm/msr-index.h' differs from latest version at 'arch/x86/include/asm/msr-index.h' That makes the beautification scripts to pick some new entries: $ tools/perf/trace/beauty/tracepoints/x86_msr.sh > before $ cp arch/x86/include/asm/msr-index.h tools/arch/x86/include/asm/msr-index.h $ tools/perf/trace/beauty/tracepoints/x86_msr.sh > after $ diff -u before after --- before 2022-03-29 16:23:07.678740040 -0300 +++ after 2022-03-29 16:23:16.960978524 -0300 @@ -220,6 +220,13 @@ [0x00000669] = "MC6_DEMOTION_POLICY_CONFIG", [0x00000680] = "LBR_NHM_FROM", [0x00000690] = "CORE_PERF_LIMIT_REASONS", + [0x000006a0] = "IA32_U_CET", + [0x000006a2] = "IA32_S_CET", + [0x000006a4] = "IA32_PL0_SSP", + [0x000006a5] = "IA32_PL1_SSP", + [0x000006a6] = "IA32_PL2_SSP", + [0x000006a7] = "IA32_PL3_SSP", + [0x000006a8] = "IA32_INT_SSP_TAB", [0x000006B0] = "GFX_PERF_LIMIT_REASONS", [0x000006B1] = "RING_PERF_LIMIT_REASONS", [0x000006c0] = "LBR_NHM_TO", $ And this gets rebuilt: CC /tmp/build/perf/trace/beauty/tracepoints/x86_msr.o LD /tmp/build/perf/trace/beauty/tracepoints/perf-in.o LD /tmp/build/perf/trace/beauty/perf-in.o CC /tmp/build/perf/util/amd-sample-raw.o LD /tmp/build/perf/util/perf-in.o LD /tmp/build/perf/perf-in.o LINK /tmp/build/perf/perf Now one can trace systemwide asking to see backtraces to where those MSRs are being read/written with: # perf trace -e msr:*_msr/max-stack=32/ --filter="msr>=IA32_U_CET && msr<=IA32_INT_SSP_TAB" ^C# If we use -v (verbose mode) we can see what it does behind the scenes: # perf trace -v -e msr:*_msr/max-stack=32/ --filter="msr>=IA32_U_CET && msr<=IA32_INT_SSP_TAB" Using CPUID AuthenticAMD-25-21-0 0x6a0 0x6a8 New filter for msr:read_msr: (msr>=0x6a0 && msr<=0x6a8) && (common_pid != 597499 && common_pid != 3313) 0x6a0 0x6a8 New filter for msr:write_msr: (msr>=0x6a0 && msr<=0x6a8) && (common_pid != 597499 && common_pid != 3313) mmap size 528384B ^C# Example with a frequent msr: # perf trace -v -e msr:*_msr/max-stack=32/ --filter="msr==IA32_SPEC_CTRL" --max-events 2 Using CPUID AuthenticAMD-25-21-0 0x48 New filter for msr:read_msr: (msr==0x48) && (common_pid != 2612129 && common_pid != 3841) 0x48 New filter for msr:write_msr: (msr==0x48) && (common_pid != 2612129 && common_pid != 3841) mmap size 528384B Looking at the vmlinux_path (8 entries long) symsrc__init: build id mismatch for vmlinux. Using /proc/kcore for kernel data Using /proc/kallsyms for symbols 0.000 Timer/2525383 msr:write_msr(msr: IA32_SPEC_CTRL, val: 6) do_trace_write_msr ([kernel.kallsyms]) do_trace_write_msr ([kernel.kallsyms]) __switch_to_xtra ([kernel.kallsyms]) __switch_to ([kernel.kallsyms]) __schedule ([kernel.kallsyms]) schedule ([kernel.kallsyms]) futex_wait_queue_me ([kernel.kallsyms]) futex_wait ([kernel.kallsyms]) do_futex ([kernel.kallsyms]) __x64_sys_futex ([kernel.kallsyms]) do_syscall_64 ([kernel.kallsyms]) entry_SYSCALL_64_after_hwframe ([kernel.kallsyms]) __futex_abstimed_wait_common64 (/usr/lib64/libpthread-2.33.so) 0.030 :0/0 msr:write_msr(msr: IA32_SPEC_CTRL, val: 2) do_trace_write_msr ([kernel.kallsyms]) do_trace_write_msr ([kernel.kallsyms]) __switch_to_xtra ([kernel.kallsyms]) __switch_to ([kernel.kallsyms]) __schedule ([kernel.kallsyms]) schedule_idle ([kernel.kallsyms]) do_idle ([kernel.kallsyms]) cpu_startup_entry ([kernel.kallsyms]) secondary_startup_64_no_verify ([kernel.kallsyms]) # Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lore.kernel.org/lkml/YkNd7Ky+vi7H2Zl2@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-04-01tools headers UAPI: Sync asm-generic/mman-common.h with the kernelArnaldo Carvalho de Melo
To pick the changes from: 9457056ac426e5ed ("mm: madvise: MADV_DONTNEED_LOCKED") That result in these changes in the tools: $ diff -u tools/include/uapi/asm-generic/mman-common.h include/uapi/asm-generic/mman-common.h --- tools/include/uapi/asm-generic/mman-common.h 2022-03-29 16:17:50.461694991 -0300 +++ include/uapi/asm-generic/mman-common.h 2022-03-27 19:12:48.923250468 -0300 @@ -75,6 +75,8 @@ #define MADV_POPULATE_READ 22 /* populate (prefault) page tables readable */ #define MADV_POPULATE_WRITE 23 /* populate (prefault) page tables writable */ +#define MADV_DONTNEED_LOCKED 24 /* like DONTNEED, but drop locked pages too */ + /* compatibility flags */ #define MAP_FILE 0 $ tools/perf/trace/beauty/madvise_behavior.sh > before $ cp include/uapi/asm-generic/mman-common.h tools/include/uapi/asm-generic/mman-common.h $ tools/perf/trace/beauty/madvise_behavior.sh > after $ diff -u before after --- before 2022-03-29 16:18:04.091044244 -0300 +++ after 2022-03-29 16:18:11.692238906 -0300 @@ -20,6 +20,7 @@ [21] = "PAGEOUT", [22] = "POPULATE_READ", [23] = "POPULATE_WRITE", + [24] = "DONTNEED_LOCKED", [100] = "HWPOISON", [101] = "SOFT_OFFLINE", }; $ I.e. now when madvise gets those behaviours as args, 'perf trace' will be able to translate from the number to a human readable string and to use the strings in tracepoint filter expressions. This addresses the following perf build warning: Warning: Kernel ABI header at 'tools/include/uapi/asm-generic/mman-common.h' differs from latest version at 'include/uapi/asm-generic/mman-common.h' diff -u tools/include/uapi/asm-generic/mman-common.h include/uapi/asm-generic/mman-common.h Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Namhyung Kim <namhyung@kernel.org> Link: http://lore.kernel.org/lkml/YkNcUfeh795yqGMV@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-04-01perf beauty: Update copy of linux/socket.h with the kernel sourcesArnaldo Carvalho de Melo
To pick the changes in: a6a6fe27bab48f0d ("net/smc: Dynamic control handshake limitation by socket options") This automagically adds support for the SOL_MNC socket level: $ diff -u tools/perf/trace/beauty/include/linux/socket.h include/linux/socket.h --- tools/perf/trace/beauty/include/linux/socket.h 2022-03-14 17:55:22.277148656 -0300 +++ include/linux/socket.h 2022-03-27 19:12:48.908250063 -0300 @@ -366,6 +366,7 @@ #define SOL_XDP 283 #define SOL_MPTCP 284 #define SOL_MCTP 285 +#define SOL_SMC 286 /* IPX options */ #define IPX_TYPE 1 $ tools/perf/trace/beauty/socket.sh > before $ cp include/linux/socket.h tools/perf/trace/beauty/include/linux/socket.h $ tools/perf/trace/beauty/socket.sh > after $ diff -u before after --- before 2022-03-29 11:47:56.390258780 -0300 +++ after 2022-03-29 11:48:03.158436189 -0300 @@ -67,6 +67,7 @@ [283] = "XDP", [284] = "MPTCP", [285] = "MCTP", + [286] = "SMC", }; DEFINE_STRARRAY(socket_level, "SOL_"); $ This will allow 'perf trace' to translate 286 into "SMC" as is done with the other socket levels: # perf trace -e setsockopt --max-events 4 344.916 ( 0.003 ms): Socket Thread/3816 setsockopt(fd: 168, level: TCP, optname: 5, optval: 0x7f5797b9c4f8, optlen: 4) = 0 344.920 ( 0.002 ms): Socket Thread/3816 setsockopt(fd: 168, level: TCP, optname: 6, optval: 0x7f5797b9c4f4, optlen: 4) = 0 1246.974 ( 0.010 ms): systemd-resolv/1128 setsockopt(fd: 22, level: IP, optname: 11, optval: 0x7ffc96cd7244, optlen: 4) = 0 1246.986 ( 0.002 ms): systemd-resolv/1128 setsockopt(fd: 22, level: IP, optname: 8, optval: 0x7ffc96cd7264, optlen: 4) = 0 This addresses this perf build warning: Warning: Kernel ABI header at 'tools/perf/trace/beauty/include/linux/socket.h' differs from latest version at 'include/linux/socket.h' diff -u tools/perf/trace/beauty/include/linux/socket.h include/linux/socket.h Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David S. Miller <davem@davemloft.net> Cc: D. Wythe <alibuda@linux.alibaba.com> Cc: Ian Rogers <irogers@google.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Link: http://lore.kernel.org/lkml/YkMdpzzjPu5VZtW3@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-04-01perf tools: Update copy of libbpf's hashmap.cArnaldo Carvalho de Melo
To pick the changes in: fba60b171a032283 ("libbpf: Use IS_ERR_OR_NULL() in hashmap__free()") That don't entail any changes in tools/perf. This addresses this perf build warning: Warning: Kernel ABI header at 'tools/perf/util/hashmap.h' differs from latest version at 'tools/lib/bpf/hashmap.h' diff -u tools/perf/util/hashmap.h tools/lib/bpf/hashmap.h Not a kernel ABI, its just that this uses the mechanism in place for checking kernel ABI files drift. Cc: Andrii Nakryiko <andrii@kernel.org> Cc: Ian Rogers <irogers@google.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Mauricio Vásquez <mauricio@kinvolk.io> Cc: Namhyung Kim <namhyung@kernel.org> Link: http://lore.kernel.org/lkml/YkMb2SAIai2VeuUD@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-04-01perf stat: Avoid SEGV if core.cpus isn't setIan Rogers
Passing NULL to perf_cpu_map__max doesn't make sense as there is no valid max. Avoid this problem by null checking in perf_stat_init_aggr_mode. Signed-off-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Antonov <alexander.antonov@linux.intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Alexey Bayduraev <alexey.v.bayduraev@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Andrii Nakryiko <andrii@kernel.org> Cc: Daniel Borkmann <daniel@iogearbox.net> Cc: German Gomez <german.gomez@arm.com> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Fastabend <john.fastabend@gmail.com> Cc: John Garry <john.garry@huawei.com> Cc: KP Singh <kpsingh@kernel.org> Cc: Kajol Jain <kjain@linux.ibm.com> Cc: Leo Yan <leo.yan@linaro.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Martin KaFai Lau <kafai@fb.com> Cc: Mathieu Poirier <mathieu.poirier@linaro.org> Cc: Mike Leach <mike.leach@linaro.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Riccardo Mancini <rickyman7@gmail.com> Cc: Song Liu <songliubraving@fb.com> Cc: Stephane Eranian <eranian@google.com> Cc: Suzuki Poulouse <suzuki.poulose@arm.com> Cc: Will Deacon <will@kernel.org> Cc: Yonghong Song <yhs@fb.com> Cc: bpf@vger.kernel.org Cc: coresight@lists.linaro.org Cc: linux-arm-kernel@lists.infradead.org Cc: netdev@vger.kernel.org Link: http://lore.kernel.org/lkml/20220328062414.1893550-5-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-04-01Merge branch 'akpm' (patches from Andrew)Linus Torvalds
Merge still more updates from Andrew Morton: "16 patches. Subsystems affected by this patch series: ofs2, nilfs2, mailmap, and mm (madvise, mlock, mfence, memory-failure, kasan, debug, kmemleak, and damon)" * emailed patches from Andrew Morton <akpm@linux-foundation.org>: mm/damon: prevent activated scheme from sleeping by deactivated schemes mm/kmemleak: reset tag when compare object pointer doc/vm/page_owner.rst: remove content related to -c option tools/vm/page_owner_sort.c: remove -c option mm, kasan: fix __GFP_BITS_SHIFT definition breaking LOCKDEP mm,hwpoison: unmap poisoned page before invalidation mailmap: update Kirill's email mm: kfence: fix objcgs vector allocation mm/munlock: protect the per-CPU pagevec by a local_lock_t mm/munlock: update Documentation/vm/unevictable-lru.rst mm/munlock: add lru_add_drain() to fix memcg_stat_test nilfs2: get rid of nilfs_mapping_init() nilfs2: fix lockdep warnings during disk space reclamation nilfs2: fix lockdep warnings in page operations for btree nodes ocfs2: fix crash when mount with quota enabled Revert "mm: madvise: skip unmapped vma holes passed to process_madvise"
2022-04-01mm/damon: prevent activated scheme from sleeping by deactivated schemesJonghyeon Kim
In the DAMON, the minimum wait time of the schemes decides whether the kernel wakes up 'kdamon_fn()'. But since the minimum wait time is initialized to zero, there are corner cases against the original objective. For example, if we have several schemes for one target, and if the wait time of the first scheme is zero, the minimum wait time will set zero, which means 'kdamond_fn()' should wake up to apply this scheme. However, in the following scheme, wait time can be set to non-zero. Thus, the mininum wait time will be set to non-zero, which can cause sleeping this interval for 'kdamon_fn()' due to one deactivated last scheme. This commit prevents making DAMON monitoring inactive state due to other deactivated schemes. Link: https://lkml.kernel.org/r/20220330105302.32114-1-tome01@ajou.ac.kr Signed-off-by: Jonghyeon Kim <tome01@ajou.ac.kr> Reviewed-by: SeongJae Park <sj@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2022-04-01mm/kmemleak: reset tag when compare object pointerKuan-Ying Lee
When we use HW-tag based kasan and enable vmalloc support, we hit the following bug. It is due to comparison between tagged object and non-tagged pointer. We need to reset the kasan tag when we need to compare tagged object and non-tagged pointer. kmemleak: [name:kmemleak&]Scan area larger than object 0xffffffe77076f440 CPU: 4 PID: 1 Comm: init Tainted: G S W 5.15.25-android13-0-g5cacf919c2bc #1 Hardware name: MT6983(ENG) (DT) Call trace: add_scan_area+0xc4/0x244 kmemleak_scan_area+0x40/0x9c layout_and_allocate+0x1e8/0x288 load_module+0x2c8/0xf00 __se_sys_finit_module+0x190/0x1d0 __arm64_sys_finit_module+0x20/0x30 invoke_syscall+0x60/0x170 el0_svc_common+0xc8/0x114 do_el0_svc+0x28/0xa0 el0_svc+0x60/0xf8 el0t_64_sync_handler+0x88/0xec el0t_64_sync+0x1b4/0x1b8 kmemleak: [name:kmemleak&]Object 0xf5ffffe77076b000 (size 32768): kmemleak: [name:kmemleak&] comm "init", pid 1, jiffies 4294894197 kmemleak: [name:kmemleak&] min_count = 0 kmemleak: [name:kmemleak&] count = 0 kmemleak: [name:kmemleak&] flags = 0x1 kmemleak: [name:kmemleak&] checksum = 0 kmemleak: [name:kmemleak&] backtrace: module_alloc+0x9c/0x120 move_module+0x34/0x19c layout_and_allocate+0x1c4/0x288 load_module+0x2c8/0xf00 __se_sys_finit_module+0x190/0x1d0 __arm64_sys_finit_module+0x20/0x30 invoke_syscall+0x60/0x170 el0_svc_common+0xc8/0x114 do_el0_svc+0x28/0xa0 el0_svc+0x60/0xf8 el0t_64_sync_handler+0x88/0xec el0t_64_sync+0x1b4/0x1b8 Link: https://lkml.kernel.org/r/20220318034051.30687-1-Kuan-Ying.Lee@mediatek.com Signed-off-by: Kuan-Ying Lee <Kuan-Ying.Lee@mediatek.com> Reviewed-by: Catalin Marinas <catalin.marinas@arm.com> Cc: Matthias Brugger <matthias.bgg@gmail.com> Cc: Chinwen Chang <chinwen.chang@mediatek.com> Cc: Nicholas Tang <nicholas.tang@mediatek.com> Cc: Yee Lee <yee.lee@mediatek.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2022-04-01doc/vm/page_owner.rst: remove content related to -c optionYinan Zhang
-c option has been removed from page_owner_sort.c. Remove the usage of -c option from Documentation. This work is coauthored by Shenghong Han Yixuan Cao Chongxi Zhao Jiajian Ye Yuhong Feng Yongqiang Liu Link: https://lkml.kernel.org/r/20220326085920.1470081-2-zhangyinan2019@email.szu.edu.cn Signed-off-by: Yinan Zhang <zhangyinan2019@email.szu.edu.cn> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Stephen Rothwell <sfr@canb.auug.org.au> Cc: Sean Anderson <seanga2@gmail.com> Cc: Tang Bin <tangbin@cmss.chinamobile.com> Cc: Zhenliang Wei <weizhenliang@huawei.com> Cc: Georgi Djakov <georgi.djakov@linaro.org> Cc: Chongxi Zhao <zhaochongxi2019@email.szu.edu.cn> Cc: Jiajian Ye <yejiajian2018@email.szu.edu.cn> Cc: Yixuan Cao <caoyixuan2019@email.szu.edu.cn> Cc: Yuhong Feng <yuhongf@szu.edu.cn> Cc: Yongqiang Liu <liuyongqiang13@huawei.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2022-04-01tools/vm/page_owner_sort.c: remove -c optionYinan Zhang
The -c option is used to cull by stacktrace. Now, --cull option has been Added in page_owner_sort.c. Culling by stacktrace is one of the function of "--cull". No need to set an extra parameter. So remove -c option. Remove parsing of -c when parse parameter and remove "-c" from usage. This work is coauthored by Shenghong Han Yixuan Cao Chongxi Zhao Jiajian Ye Yuhong Feng Yongqiang Liu Link: https://lkml.kernel.org/r/20220326085920.1470081-1-zhangyinan2019@email.szu.edu.cn Signed-off-by: Yinan Zhang <zhangyinan2019@email.szu.edu.cn> Cc: Chongxi Zhao <zhaochongxi2019@email.szu.edu.cn> Cc: Georgi Djakov <georgi.djakov@linaro.org> Cc: Jiajian Ye <yejiajian2018@email.szu.edu.cn> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Sean Anderson <seanga2@gmail.com> Cc: Stephen Rothwell <sfr@canb.auug.org.au> Cc: Tang Bin <tangbin@cmss.chinamobile.com> Cc: Yixuan Cao <caoyixuan2019@email.szu.edu.cn> Cc: Yongqiang Liu <liuyongqiang13@huawei.com> Cc: Yuhong Feng <yuhongf@szu.edu.cn> Cc: Zhenliang Wei <weizhenliang@huawei.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2022-04-01mm, kasan: fix __GFP_BITS_SHIFT definition breaking LOCKDEPAndrey Konovalov
KASAN changes that added new GFP flags mistakenly updated __GFP_BITS_SHIFT as the total number of GFP bits instead of as a shift used to define __GFP_BITS_MASK. This broke LOCKDEP, as __GFP_BITS_MASK now gets the 25th bit enabled instead of the 28th for __GFP_NOLOCKDEP. Update __GFP_BITS_SHIFT to always count KASAN GFP bits. In the future, we could handle all combinations of KASAN and LOCKDEP to occupy as few bits as possible. For now, we have enough GFP bits to be inefficient in this quick fix. Link: https://lkml.kernel.org/r/462ff52742a1fcc95a69778685737f723ee4dfb3.1648400273.git.andreyknvl@google.com Fixes: 9353ffa6e9e9 ("kasan, page_alloc: allow skipping memory init for HW_TAGS") Fixes: 53ae233c30a6 ("kasan, page_alloc: allow skipping unpoisoning for HW_TAGS") Fixes: f49d9c5bb15c ("kasan, mm: only define ___GFP_SKIP_KASAN_POISON with HW_TAGS") Signed-off-by: Andrey Konovalov <andreyknvl@google.com> Reported-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Tested-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Acked-by: Vlastimil Babka <vbabka@suse.cz> Cc: Marco Elver <elver@google.com> Cc: Alexander Potapenko <glider@google.com> Cc: Dmitry Vyukov <dvyukov@google.com> Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com> Cc: Matthew Wilcox <willy@infradead.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2022-04-01mm,hwpoison: unmap poisoned page before invalidationRik van Riel
In some cases it appears the invalidation of a hwpoisoned page fails because the page is still mapped in another process. This can cause a program to be continuously restarted and die when it page faults on the page that was not invalidated. Avoid that problem by unmapping the hwpoisoned page when we find it. Another issue is that sometimes we end up oopsing in finish_fault, if the code tries to do something with the now-NULL vmf->page. I did not hit this error when submitting the previous patch because there are several opportunities for alloc_set_pte to bail out before accessing vmf->page, and that apparently happened on those systems, and most of the time on other systems, too. However, across several million systems that error does occur a handful of times a day. It can be avoided by returning VM_FAULT_NOPAGE which will cause do_read_fault to return before calling finish_fault. Link: https://lkml.kernel.org/r/20220325161428.5068d97e@imladris.surriel.com Fixes: e53ac7374e64 ("mm: invalidate hwpoison page cache page in fault path") Signed-off-by: Rik van Riel <riel@surriel.com> Reviewed-by: Miaohe Lin <linmiaohe@huawei.com> Tested-by: Naoya Horiguchi <naoya.horiguchi@nec.com> Reviewed-by: Oscar Salvador <osalvador@suse.de> Cc: Mel Gorman <mgorman@suse.de> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2022-04-01mailmap: update Kirill's emailKirill Tkhai
My new email address is kirill.tkhai@openvz.org. Link: https://lkml.kernel.org/r/164846762354.278960.13129571556274098855.stgit@pro Signed-off-by: Kirill Tkhai <kirill.tkhai@openvz.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>