summaryrefslogtreecommitdiff
path: root/include/linux
AgeCommit message (Collapse)Author
2021-05-24blk-mq: Use request queue-wide tags for tagset-wide sbitmapJohn Garry
The tags used for an IO scheduler are currently per hctx. As such, when q->nr_hw_queues grows, so does the request queue total IO scheduler tag depth. This may cause problems for SCSI MQ HBAs whose total driver depth is fixed. Ming and Yanhui report higher CPU usage and lower throughput in scenarios where the fixed total driver tag depth is appreciably lower than the total scheduler tag depth: https://lore.kernel.org/linux-block/440dfcfc-1a2c-bd98-1161-cec4d78c6dfc@huawei.com/T/#mc0d6d4f95275a2743d1c8c3e4dc9ff6c9aa3a76b In that scenario, since the scheduler tag is got first, much contention is introduced since a driver tag may not be available after we have got the sched tag. Improve this scenario by introducing request queue-wide tags for when a tagset-wide sbitmap is used. The static sched requests are still allocated per hctx, as requests are initialised per hctx, as in blk_mq_init_request(..., hctx_idx, ...) -> set->ops->init_request(.., hctx_idx, ...). For simplicity of resizing the request queue sbitmap when updating the request queue depth, just init at the max possible size, so we don't need to deal with the possibly with swapping out a new sbitmap for old if we need to grow. Signed-off-by: John Garry <john.garry@huawei.com> Reviewed-by: Ming Lei <ming.lei@redhat.com> Link: https://lore.kernel.org/r/1620907258-30910-3-git-send-email-john.garry@huawei.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-05-24block_dump: remove block_dump featurezhangyi (F)
We have already delete block_dump feature in mark_inode_dirty() because it can be replaced by tracepoints, now we also remove the part in submit_bio() for the same reason. The part of block dump feature in submit_bio() dump the write process, write region and sectors on the target disk into kernel message. it can be replaced by block_bio_queue tracepoint in submit_bio_checks(), so we do not need block_dump anymore, remove the whole block_dump feature. Signed-off-by: zhangyi (F) <yi.zhang@huawei.com> Reviewed-by: Jan Kara <jack@suse.cz> Reviewed-by: Christoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/r/20210313030146.2882027-3-yi.zhang@huawei.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-05-22linux/bits.h: fix compilation error with GENMASKRikard Falkeborn
GENMASK() has an input check which uses __builtin_choose_expr() to enable a compile time sanity check of its inputs if they are known at compile time. However, it turns out that __builtin_constant_p() does not always return a compile time constant [0]. It was thought this problem was fixed with gcc 4.9 [1], but apparently this is not the case [2]. Switch to use __is_constexpr() instead which always returns a compile time constant, regardless of its inputs. Link: https://lore.kernel.org/lkml/42b4342b-aefc-a16a-0d43-9f9c0d63ba7a@rasmusvillemoes.dk [0] Link: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=19449 [1] Link: https://lore.kernel.org/lkml/1ac7bbc2-45d9-26ed-0b33-bf382b8d858b@I-love.SAKURA.ne.jp [2] Link: https://lkml.kernel.org/r/20210511203716.117010-1-rikard.falkeborn@gmail.com Signed-off-by: Rikard Falkeborn <rikard.falkeborn@gmail.com> Reported-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Acked-by: Arnd Bergmann <arnd@arndb.de> Reviewed-by: Andy Shevchenko <andy.shevchenko@gmail.com> Cc: Ard Biesheuvel <ardb@kernel.org> Cc: Yury Norov <yury.norov@gmail.com> Cc: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-05-22Merge tag 'block-5.13-2021-05-22' of git://git.kernel.dk/linux-blockLinus Torvalds
Pull block fixes from Jens Axboe: - Fix BLKRRPART and deletion race (Gulam, Christoph) - NVMe pull request (Christoph): - nvme-tcp corruption and timeout fixes (Sagi Grimberg, Keith Busch) - nvme-fc teardown fix (James Smart) - nvmet/nvme-loop memory leak fixes (Wu Bo)" * tag 'block-5.13-2021-05-22' of git://git.kernel.dk/linux-block: block: fix a race between del_gendisk and BLKRRPART block: prevent block device lookups at the beginning of del_gendisk nvme-fc: clear q_live at beginning of association teardown nvme-tcp: rerun io_work if req_list is not empty nvme-tcp: fix possible use-after-completion nvme-loop: fix memory leak in nvme_loop_create_ctrl() nvmet: fix memory leak in nvmet_alloc_ctrl()
2021-05-21Merge branch 'for-v5.13-rc3' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace Pull siginfo fix from Eric Biederman: "During the merge window an issue with si_perf and the siginfo ABI came up. The alpha and sparc siginfo structure layout had changed with the addition of SIGTRAP TRAP_PERF and the new field si_perf. The reason only alpha and sparc were affected is that they are the only architectures that use si_trapno. Looking deeper it was discovered that si_trapno is used for only a few select signals on alpha and sparc, and that none of the other _sigfault fields past si_addr are used at all. Which means technically no regression on alpha and sparc. While the alignment concerns might be dismissed the abuse of si_errno by SIGTRAP TRAP_PERF does have the potential to cause regressions in existing userspace. While we still have time before userspace starts using and depending on the new definition siginfo for SIGTRAP TRAP_PERF this set of changes cleans up siginfo_t. - The si_trapno field is demoted from magic alpha and sparc status and made an ordinary union member of the _sigfault member of siginfo_t. Without moving it of course. - si_perf is replaced with si_perf_data and si_perf_type ending the abuse of si_errno. - Unnecessary additions to signalfd_siginfo are removed" * 'for-v5.13-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace: signalfd: Remove SIL_PERF_EVENT fields from signalfd_siginfo signal: Deliver all of the siginfo perf data in _perf signal: Factor force_sig_perf out of perf_sigtrap signal: Implement SIL_FAULT_TRAPNO siginfo: Move si_trapno inside the union inside _si_fault
2021-05-20Merge tag 'platform-drivers-x86-v5.13-2' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/pdx86/platform-drivers-x86 Pull x86 platform driver fixes from Hans de Goede: "Assorted pdx86 bug-fixes and model-specific quirks for 5.13" * tag 'platform-drivers-x86-v5.13-2' of git://git.kernel.org/pub/scm/linux/kernel/git/pdx86/platform-drivers-x86: platform/x86: touchscreen_dmi: Add info for the Chuwi Hi10 Pro (CWI529) tablet platform/x86: touchscreen_dmi: Add info for the Mediacom Winpad 7.0 W700 tablet platform/x86: intel_punit_ipc: Append MODULE_DEVICE_TABLE for ACPI platform/x86: dell-smbios-wmi: Fix oops on rmmod dell_smbios platform/x86: hp-wireless: add AMD's hardware id to the supported list platform/x86: intel_int0002_vgpio: Only call enable_irq_wake() when using s2idle platform/x86: gigabyte-wmi: add support for B550 Aorus Elite platform/x86: gigabyte-wmi: add support for X570 UD platform/x86: gigabyte-wmi: streamline dmi matching platform/mellanox: mlxbf-tmfifo: Fix a memory barrier issue platform/surface: dtx: Fix poll function platform/surface: aggregator: Add platform-drivers-x86 list to MAINTAINERS entry platform/surface: aggregator: avoid clang -Wconstant-conversion warning platform/surface: aggregator: Do not mark interrupt as shared platform/x86: hp_accel: Avoid invoking _INI to speed up resume platform/x86: ideapad-laptop: fix method name typo platform/x86: ideapad-laptop: fix a NULL pointer dereference
2021-05-20Merge tag 'char-misc-5.13-rc3' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc Pull char/misc driver fixes from Greg KH: "Here is a big set of char/misc/other driver fixes for 5.13-rc3. The majority here is the fallout of the umn.edu re-review of all prior submissions. That resulted in a bunch of reverts along with the "correct" changes made, such that there is no regression of any of the potential fixes that were made by those individuals. I would like to thank the over 80 different developers who helped with the review and fixes for this mess. Other than that, there's a few habanna driver fixes for reported issues, and some dyndbg fixes for reported problems. All of these have been in linux-next for a while with no reported problems" * tag 'char-misc-5.13-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc: (82 commits) misc: eeprom: at24: check suspend status before disable regulator uio_hv_generic: Fix another memory leak in error handling paths uio_hv_generic: Fix a memory leak in error handling paths uio/uio_pci_generic: fix return value changed in refactoring Revert "Revert "ALSA: usx2y: Fix potential NULL pointer dereference"" dyndbg: drop uninformative vpr_info dyndbg: avoid calling dyndbg_emit_prefix when it has no work binder: Return EFAULT if we fail BINDER_ENABLE_ONEWAY_SPAM_DETECTION cdrom: gdrom: initialize global variable at init time brcmfmac: properly check for bus register errors Revert "brcmfmac: add a check for the status of usb_register" video: imsttfb: check for ioremap() failures Revert "video: imsttfb: fix potential NULL pointer dereferences" net: liquidio: Add missing null pointer checks Revert "net: liquidio: fix a NULL pointer dereference" media: gspca: properly check for errors in po1030_probe() Revert "media: gspca: Check the return value of write_bridge for timeout" media: gspca: mt9m111: Check write_bridge for timeout Revert "media: gspca: mt9m111: Check write_bridge for timeout" media: dvb: Add check on sp8870_readreg return ...
2021-05-20block: prevent block device lookups at the beginning of del_gendiskChristoph Hellwig
As an artifact of how gendisk lookup used to work in earlier kernels, GENHD_FL_UP is only cleared very late in del_gendisk, and a global lock is used to prevent opens from succeeding while del_gendisk is tearing down the gendisk. Switch to clearing the flag early and under bd_mutex so that callers can use bd_mutex to stabilize the flag, which removes the need for the global mutex. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Ming Lei <ming.lei@redhat.com> Link: https://lore.kernel.org/r/20210514131842.1600568-2-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-05-19platform/surface: aggregator: avoid clang -Wconstant-conversion warningArnd Bergmann
Clang complains about the assignment of SSAM_ANY_IID to ssam_device_uid->instance: drivers/platform/surface/surface_aggregator_registry.c:478:25: error: implicit conversion from 'int' to '__u8' (aka 'unsigned char') changes value from 65535 to 255 [-Werror,-Wconstant-conversion] { SSAM_VDEV(HUB, 0x02, SSAM_ANY_IID, 0x00) }, ~ ^~~~~~~~~~~~ include/linux/surface_aggregator/device.h:71:23: note: expanded from macro 'SSAM_ANY_IID' #define SSAM_ANY_IID 0xffff ^~~~~~ include/linux/surface_aggregator/device.h:126:63: note: expanded from macro 'SSAM_VDEV' SSAM_DEVICE(SSAM_DOMAIN_VIRTUAL, SSAM_VIRTUAL_TC_##cat, tid, iid, fun) ^~~ include/linux/surface_aggregator/device.h:102:41: note: expanded from macro 'SSAM_DEVICE' .instance = ((iid) != SSAM_ANY_IID) ? (iid) : 0, \ ^~~ The assignment doesn't actually happen, but clang checks the type limits before checking whether this assignment is reached. Replace the ?: operator with a __builtin_choose_expr() invocation that avoids the warning for the untaken part. Fixes: eb0e90a82098 ("platform/surface: aggregator: Add dedicated bus and device type") Cc: platform-driver-x86@vger.kernel.org Signed-off-by: Arnd Bergmann <arnd@arndb.de> Reviewed-by: Nathan Chancellor <nathan@kernel.org> Reviewed-by: Maximilian Luz <luzmaximilian@gmail.com> Link: https://lore.kernel.org/r/20210514200453.1542978-1-arnd@kernel.org Signed-off-by: Hans de Goede <hdegoede@redhat.com>
2021-05-18signal: Deliver all of the siginfo perf data in _perfEric W. Biederman
Don't abuse si_errno and deliver all of the perf data in _perf member of siginfo_t. Note: The data field in the perf data structures in a u64 to allow a pointer to be encoded without needed to implement a 32bit and 64bit version of the same structure. There already exists a 32bit and 64bit versions siginfo_t, and the 32bit version can not include a 64bit member as it only has 32bit alignment. So unsigned long is used in siginfo_t instead of a u64 as unsigned long can encode a pointer on all architectures linux supports. v1: https://lkml.kernel.org/r/m11rarqqx2.fsf_-_@fess.ebiederm.org v2: https://lkml.kernel.org/r/20210503203814.25487-10-ebiederm@xmission.com v3: https://lkml.kernel.org/r/20210505141101.11519-11-ebiederm@xmission.com Link: https://lkml.kernel.org/r/20210517195748.8880-4-ebiederm@xmission.com Reviewed-by: Marco Elver <elver@google.com> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
2021-05-18signal: Factor force_sig_perf out of perf_sigtrapEric W. Biederman
Separate filling in siginfo for TRAP_PERF from deciding that siginal needs to be sent. There are enough little details that need to be correct when properly filling in siginfo_t that it is easy to make mistakes if filling in the siginfo_t is in the same function with other logic. So factor out force_sig_perf to reduce the cognative load of on reviewers, maintainers and implementors. v1: https://lkml.kernel.org/r/m17dkjqqxz.fsf_-_@fess.ebiederm.org v2: https://lkml.kernel.org/r/20210505141101.11519-10-ebiederm@xmission.com Link: https://lkml.kernel.org/r/20210517195748.8880-3-ebiederm@xmission.com Reviewed-by: Marco Elver <elver@google.com> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
2021-05-18signal: Implement SIL_FAULT_TRAPNOEric W. Biederman
Now that si_trapno is part of the union in _si_fault and available on all architectures, add SIL_FAULT_TRAPNO and update siginfo_layout to return SIL_FAULT_TRAPNO when the code assumes si_trapno is valid. There is room for future changes to reduce when si_trapno is valid but this is all that is needed to make si_trapno and the other members of the the union in _sigfault mutually exclusive. Update the code that uses siginfo_layout to deal with SIL_FAULT_TRAPNO and have the same code ignore si_trapno in in all other cases. v1: https://lkml.kernel.org/r/m1o8dvs7s7.fsf_-_@fess.ebiederm.org v2: https://lkml.kernel.org/r/20210505141101.11519-6-ebiederm@xmission.com Link: https://lkml.kernel.org/r/20210517195748.8880-2-ebiederm@xmission.com Reviewed-by: Marco Elver <elver@google.com> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
2021-05-18siginfo: Move si_trapno inside the union inside _si_faultEric W. Biederman
It turns out that linux uses si_trapno very sparingly, and as such it can be considered extra information for a very narrow selection of signals, rather than information that is present with every fault reported in siginfo. As such move si_trapno inside the union inside of _si_fault. This results in no change in placement, and makes it eaiser to extend _si_fault in the future as this reduces the number of special cases. In particular with si_trapno included in the union it is no longer a concern that the union must be pointer aligned on most architectures because the union follows immediately after si_addr which is a pointer. This change results in a difference in siginfo field placement on sparc and alpha for the fields si_addr_lsb, si_lower, si_upper, si_pkey, and si_perf. These architectures do not implement the signals that would use si_addr_lsb, si_lower, si_upper, si_pkey, and si_perf. Further these architecture have not yet implemented the userspace that would use si_perf. The point of this change is in fact to correct these placement issues before sparc or alpha grow userspace that cares. This change was discussed[1] and the agreement is that this change is currently safe. [1]: https://lkml.kernel.org/r/CAK8P3a0+uKYwL1NhY6Hvtieghba2hKYGD6hcKx5n8=4Gtt+pHA@mail.gmail.com Acked-by: Marco Elver <elver@google.com> v1: https://lkml.kernel.org/r/m1tunns7yf.fsf_-_@fess.ebiederm.org v2: https://lkml.kernel.org/r/20210505141101.11519-5-ebiederm@xmission.com Link: https://lkml.kernel.org/r/20210517195748.8880-1-ebiederm@xmission.com Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
2021-05-16Merge tag 'driver-core-5.13-rc2' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core Pull driver core fixes from Greg KH: "Here are two driver fixes for driver core changes that happened in 5.13-rc1. The clk driver fix resolves a many-reported issue with booting some devices, and the USB typec fix resolves the reported problem of USB systems on some embedded boards. Both of these have been in linux-next this week with no reported issues" * tag 'driver-core-5.13-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core: clk: Skip clk provider registration when np is NULL usb: typec: tcpm: Don't block probing of consumers of "connector" nodes
2021-05-15Merge tag 'core-urgent-2021-05-15' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 stack randomization fix from Ingo Molnar: "Fix an assembly constraint that affected LLVM up to version 12" * tag 'core-urgent-2021-05-15' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: stack: Replace "o" output with "r" input constraint
2021-05-15Merge branch 'akpm' (patches from Andrew)Linus Torvalds
Merge misc fixes from Andrew Morton: "13 patches. Subsystems affected by this patch series: resource, squashfs, hfsplus, modprobe, and mm (hugetlb, slub, userfaultfd, ksm, pagealloc, kasan, pagemap, and ioremap)" * emailed patches from Andrew Morton <akpm@linux-foundation.org>: mm/ioremap: fix iomap_max_page_shift docs: admin-guide: update description for kernel.modprobe sysctl hfsplus: prevent corruption in shrinking truncate mm/filemap: fix readahead return types kasan: fix unit tests with CONFIG_UBSAN_LOCAL_BOUNDS enabled mm: fix struct page layout on 32-bit systems ksm: revert "use GET_KSM_PAGE_NOLOCK to get ksm page in remove_rmap_item_from_tree()" userfaultfd: release page in error path to avoid BUG_ON squashfs: fix divide error in calculate_skip() kernel/resource: fix return code check in __request_free_mem_region mm, slub: move slub_debug static key enabling outside slab_mutex mm/hugetlb: fix cow where page writtable in child mm/hugetlb: fix F_SEAL_FUTURE_WRITE
2021-05-15Merge tag 'block-5.13-2021-05-14' of git://git.kernel.dk/linux-blockLinus Torvalds
Pull block fixes from Jens Axboe: - Fix for shared tag set exit (Bart) - Correct ioctl range for zoned ioctls (Damien) - Removed dead/unused function (Lin) - Fix perf regression for shared tags (Ming) - Fix out-of-bounds issue with kyber and preemption (Omar) - BFQ merge fix (Paolo) - Two error handling fixes for nbd (Sun) - Fix weight update in blk-iocost (Tejun) - NVMe pull request (Christoph): - correct the check for using the inline bio in nvmet (Chaitanya Kulkarni) - demote unsupported command warnings (Chaitanya Kulkarni) - fix corruption due to double initializing ANA state (me, Hou Pu) - reset ns->file when open fails (Daniel Wagner) - fix a NULL deref when SEND is completed with error in nvmet-rdma (Michal Kalderon) - Fix kernel-doc warning (Bart) * tag 'block-5.13-2021-05-14' of git://git.kernel.dk/linux-block: block/partitions/efi.c: Fix the efi_partition() kernel-doc header blk-mq: Swap two calls in blk_mq_exit_queue() blk-mq: plug request for shared sbitmap nvmet: use new ana_log_size instead the old one nvmet: seset ns->file when open fails nbd: share nbd_put and return by goto put_nbd nbd: Fix NULL pointer in flush_workqueue blkdev.h: remove unused codes blk_account_rq block, bfq: avoid circular stable merges blk-iocost: fix weight updates of inner active iocgs nvmet: demote fabrics cmd parse err msg to debug nvmet: use helper to remove the duplicate code nvmet: demote discovery cmd parse err msg to debug nvmet-rdma: Fix NULL deref when SEND is completed with error nvmet: fix inline bio check for passthru nvmet: fix inline bio check for bdev-ns nvme-multipath: fix double initialization of ANA state kyber: fix out of bounds access when preempted block: uapi: fix comment about block device ioctl
2021-05-15Merge tag 'libnvdimm-fixes-5.13-rc2' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm Pull libnvdimm fixes from Dan Williams: "A regression fix for a bootup crash condition introduced in this merge window and some other minor fixups: - Fix regression in ACPI NFIT table handling leading to crashes and driver load failures. - Move the nvdimm mailing list - Miscellaneous minor fixups" * tag 'libnvdimm-fixes-5.13-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm: ACPI: NFIT: Fix support for variable 'SPA' structure size MAINTAINERS: Move nvdimm mailing list tools/testing/nvdimm: Make symbol '__nfit_test_ioremap' static libnvdimm: Remove duplicate struct declaration
2021-05-14mm/filemap: fix readahead return typesMatthew Wilcox (Oracle)
A readahead request will not allocate more memory than can be represented by a size_t, even on systems that have HIGHMEM available. Change the length functions from returning an loff_t to a size_t. Link: https://lkml.kernel.org/r/20210510201201.1558972-1-willy@infradead.org Fixes: 32c0a6bcaa1f57 ("btrfs: add and use readahead_batch_length") Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Reported-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-05-14mm: fix struct page layout on 32-bit systemsMatthew Wilcox (Oracle)
32-bit architectures which expect 8-byte alignment for 8-byte integers and need 64-bit DMA addresses (arm, mips, ppc) had their struct page inadvertently expanded in 2019. When the dma_addr_t was added, it forced the alignment of the union to 8 bytes, which inserted a 4 byte gap between 'flags' and the union. Fix this by storing the dma_addr_t in one or two adjacent unsigned longs. This restores the alignment to that of an unsigned long. We always store the low bits in the first word to prevent the PageTail bit from being inadvertently set on a big endian platform. If that happened, get_user_pages_fast() racing against a page which was freed and reallocated to the page_pool could dereference a bogus compound_head(), which would be hard to trace back to this cause. Link: https://lkml.kernel.org/r/20210510153211.1504886-1-willy@infradead.org Fixes: c25fff7171be ("mm: add dma_addr_t to struct page") Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Acked-by: Ilias Apalodimas <ilias.apalodimas@linaro.org> Acked-by: Jesper Dangaard Brouer <brouer@redhat.com> Acked-by: Vlastimil Babka <vbabka@suse.cz> Tested-by: Matteo Croce <mcroce@linux.microsoft.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-05-14mm/hugetlb: fix F_SEAL_FUTURE_WRITEPeter Xu
Patch series "mm/hugetlb: Fix issues on file sealing and fork", v2. Hugh reported issue with F_SEAL_FUTURE_WRITE not applied correctly to hugetlbfs, which I can easily verify using the memfd_test program, which seems that the program is hardly run with hugetlbfs pages (as by default shmem). Meanwhile I found another probably even more severe issue on that hugetlb fork won't wr-protect child cow pages, so child can potentially write to parent private pages. Patch 2 addresses that. After this series applied, "memfd_test hugetlbfs" should start to pass. This patch (of 2): F_SEAL_FUTURE_WRITE is missing for hugetlb starting from the first day. There is a test program for that and it fails constantly. $ ./memfd_test hugetlbfs memfd-hugetlb: CREATE memfd-hugetlb: BASIC memfd-hugetlb: SEAL-WRITE memfd-hugetlb: SEAL-FUTURE-WRITE mmap() didn't fail as expected Aborted (core dumped) I think it's probably because no one is really running the hugetlbfs test. Fix it by checking FUTURE_WRITE also in hugetlbfs_file_mmap() as what we do in shmem_mmap(). Generalize a helper for that. Link: https://lkml.kernel.org/r/20210503234356.9097-1-peterx@redhat.com Link: https://lkml.kernel.org/r/20210503234356.9097-2-peterx@redhat.com Fixes: ab3948f58ff84 ("mm/memfd: add an F_SEAL_FUTURE_WRITE seal to memfd") Signed-off-by: Peter Xu <peterx@redhat.com> Reported-by: Hugh Dickins <hughd@google.com> Reviewed-by: Mike Kravetz <mike.kravetz@oracle.com> Cc: Joel Fernandes (Google) <joel@joelfernandes.org> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-05-13Merge tag 'pm-5.13-rc2' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm Pull power management fixes from Rafael Wysocki: "These close a coverage gap in the intel_pstate driver and fix runtime PM child count imbalance related to interactions with system-wide suspend. Specifics: - Make intel_pstate work as expected on systems where the platform firmware enables HWP even though the HWP EPP support is not advertised (Rafael Wysocki). - Fix possible runtime PM child count imbalance that may occur if other runtime PM functions are called after invoking pm_runtime_force_suspend() and before pm_runtime_force_resume() is called (Tony Lindgren)" * tag 'pm-5.13-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: PM: runtime: Fix unpaired parent child_count for force_resume cpufreq: intel_pstate: Use HWP if enabled by platform firmware
2021-05-13dyndbg: avoid calling dyndbg_emit_prefix when it has no workJim Cromie
Wrap function in a static-inline one, which checks flags to avoid calling the function unnecessarily. And hoist its output-buffer initialization to the grand-caller, which is already allocating the buffer on the stack, and can trivially initialize it too. Signed-off-by: Jim Cromie <jim.cromie@gmail.com> Link: https://lore.kernel.org/r/20210504222235.1033685-2-jim.cromie@gmail.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-05-13Merge branch 'resizex' (patches from Maciej)Linus Torvalds
Merge VT_RESIZEX fixes from Maciej Rozycki: "I got to the bottom of the issue with VT_RESIZEX recently discussed and came up with this small patch series, fixing an additional issue that I originally thought might be broken VGA hardware emulation with my laptop, which however turned out to be intertwined with the original problem and also a regression introduced somewhat later. The fix for that because the first patch, and then to make backporting feasible I had to put a revert of the offending change from last September next, followed by a proper fix for the framebuffer issue that change had tried to address. See individual change descriptions for details. These have been verified with true VGA hardware (a Trident TVGA8900 ISA video adapter) using various combinations of `svgatextmode' and `setfont' command invocations to change both the VT size and the font size, and also switching between the text console and X11, both by starting/stopping the X server and by switching between VTs. All this to ensure bringing the behaviour of VGA text console back to correct operation as it used to be with Linux 2.6.18" * emailed patches from Maciej W. Rozycki <macro@orcam.me.uk>: vt: Fix character height handling with VT_RESIZEX vt_ioctl: Revert VT_RESIZEX parameter handling removal vgacon: Record video mode changes with VT_RESIZEX
2021-05-13vt: Fix character height handling with VT_RESIZEXMaciej W. Rozycki
Restore the original intent of the VT_RESIZEX ioctl's `v_clin' parameter which is the number of pixel rows per character (cell) rather than the height of the font used. For framebuffer devices the two values are always the same, because the former is inferred from the latter one. For VGA used as a true text mode device these two parameters are independent from each other: the number of pixel rows per character is set in the CRT controller, while font height is in fact hardwired to 32 pixel rows and fonts of heights below that value are handled by padding their data with blanks when loaded to hardware for use by the character generator. One can change the setting in the CRT controller and it will update the screen contents accordingly regardless of the font loaded. The `v_clin' parameter is used by the `vgacon' driver to set the height of the character cell and then the cursor position within. Make the parameter explicit then, by defining a new `vc_cell_height' struct member of `vc_data', set it instead of `vc_font.height' from `v_clin' in the VT_RESIZEX ioctl, and then use it throughout the `vgacon' driver except where actual font data is accessed which as noted above is independent from the CRTC setting. This way the framebuffer console driver is free to ignore the `v_clin' parameter as irrelevant, as it always should have, avoiding any issues attempts to give the parameter a meaning there could have caused, such as one that has led to commit 988d0763361b ("vt_ioctl: make VT_RESIZEX behave like VT_RESIZE"): "syzbot is reporting UAF/OOB read at bit_putcs()/soft_cursor() [1][2], for vt_resizex() from ioctl(VT_RESIZEX) allows setting font height larger than actual font height calculated by con_font_set() from ioctl(PIO_FONT). Since fbcon_set_font() from con_font_set() allocates minimal amount of memory based on actual font height calculated by con_font_set(), use of vt_resizex() can cause UAF/OOB read for font data." The problem first appeared around Linux 2.5.66 which predates our repo history, but the origin could be identified with the old MIPS/Linux repo also at: <git://git.kernel.org/pub/scm/linux/kernel/git/ralf/linux.git> as commit 9736a3546de7 ("Merge with Linux 2.5.66."), where VT_RESIZEX code in `vt_ioctl' was updated as follows: if (clin) - video_font_height = clin; + vc->vc_font.height = clin; making the parameter apply to framebuffer devices as well, perhaps due to the use of "font" in the name of the original `video_font_height' variable. Use "cell" in the new struct member then to avoid ambiguity. References: [1] https://syzkaller.appspot.com/bug?id=32577e96d88447ded2d3b76d71254fb855245837 [2] https://syzkaller.appspot.com/bug?id=6b8355d27b2b94fb5cedf4655e3a59162d9e48e3 Signed-off-by: Maciej W. Rozycki <macro@orcam.me.uk> Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Cc: stable@vger.kernel.org # v2.6.12+ Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-05-12libnvdimm: Remove duplicate struct declarationWan Jiabing
struct device is declared at 133rd line. The second declaration is unnecessary, remove it. Signed-off-by: Wan Jiabing <wanjiabing@vivo.com> Link: https://lore.kernel.org/r/20210419112725.42145-1-wanjiabing@vivo.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2021-05-12blkdev.h: remove unused codes blk_account_rqLin Feng
Last users of blk_account_rq gone with patch commit a1ce35fa49852db ("block: remove dead elevator code") and now it gets no caller, it can be safely removed. Signed-off-by: Lin Feng <linf@wangsu.com> Link: https://lore.kernel.org/r/20210512100124.173769-1-linf@wangsu.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-05-11kyber: fix out of bounds access when preemptedOmar Sandoval
__blk_mq_sched_bio_merge() gets the ctx and hctx for the current CPU and passes the hctx to ->bio_merge(). kyber_bio_merge() then gets the ctx for the current CPU again and uses that to get the corresponding Kyber context in the passed hctx. However, the thread may be preempted between the two calls to blk_mq_get_ctx(), and the ctx returned the second time may no longer correspond to the passed hctx. This "works" accidentally most of the time, but it can cause us to read garbage if the second ctx came from an hctx with more ctx's than the first one (i.e., if ctx->index_hw[hctx->type] > hctx->nr_ctx). This manifested as this UBSAN array index out of bounds error reported by Jakub: UBSAN: array-index-out-of-bounds in ../kernel/locking/qspinlock.c:130:9 index 13106 is out of range for type 'long unsigned int [128]' Call Trace: dump_stack+0xa4/0xe5 ubsan_epilogue+0x5/0x40 __ubsan_handle_out_of_bounds.cold.13+0x2a/0x34 queued_spin_lock_slowpath+0x476/0x480 do_raw_spin_lock+0x1c2/0x1d0 kyber_bio_merge+0x112/0x180 blk_mq_submit_bio+0x1f5/0x1100 submit_bio_noacct+0x7b0/0x870 submit_bio+0xc2/0x3a0 btrfs_map_bio+0x4f0/0x9d0 btrfs_submit_data_bio+0x24e/0x310 submit_one_bio+0x7f/0xb0 submit_extent_page+0xc4/0x440 __extent_writepage_io+0x2b8/0x5e0 __extent_writepage+0x28d/0x6e0 extent_write_cache_pages+0x4d7/0x7a0 extent_writepages+0xa2/0x110 do_writepages+0x8f/0x180 __writeback_single_inode+0x99/0x7f0 writeback_sb_inodes+0x34e/0x790 __writeback_inodes_wb+0x9e/0x120 wb_writeback+0x4d2/0x660 wb_workfn+0x64d/0xa10 process_one_work+0x53a/0xa80 worker_thread+0x69/0x5b0 kthread+0x20b/0x240 ret_from_fork+0x1f/0x30 Only Kyber uses the hctx, so fix it by passing the request_queue to ->bio_merge() instead. BFQ and mq-deadline just use that, and Kyber can map the queues itself to avoid the mismatch. Fixes: a6088845c2bf ("block: kyber: make kyber more friendly with merging") Reported-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Omar Sandoval <osandov@fb.com> Link: https://lore.kernel.org/r/c7598605401a48d5cfeadebb678abd10af22b83f.1620691329.git.osandov@fb.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-05-11stack: Replace "o" output with "r" input constraintNick Desaulniers
"o" isn't a common asm() constraint to use; it triggers an assertion in assert-enabled builds of LLVM that it's not recognized when targeting aarch64 (though it appears to fall back to "m"). It's fixed in LLVM 13 now, but there isn't really a good reason to use "o" in particular here. To avoid causing build issues for those using assert-enabled builds of earlier LLVM versions, the constraint needs changing. Instead, if the point is to retain the __builtin_alloca(), make ptr appear to "escape" via being an input to an empty inline asm block. This is preferable anyways, since otherwise this looks like a dead store. While the use of "r" was considered in https://lore.kernel.org/lkml/202104011447.2E7F543@keescook/ it was only tested as an output (which looks like a dead store, and wasn't sufficient). Use "r" as an input constraint instead, which behaves correctly across compilers and architectures. Fixes: 39218ff4c625 ("stack: Optionally randomize kernel stack offset each syscall") Signed-off-by: Nick Desaulniers <ndesaulniers@google.com> Signed-off-by: Kees Cook <keescook@chromium.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Tested-by: Kees Cook <keescook@chromium.org> Tested-by: Nathan Chancellor <nathan@kernel.org> Reviewed-by: Nathan Chancellor <nathan@kernel.org> Link: https://reviews.llvm.org/D100412 Link: https://bugs.llvm.org/show_bug.cgi?id=49956 Link: https://lore.kernel.org/r/20210419231741.4084415-1-keescook@chromium.org
2021-05-10PM: runtime: Fix unpaired parent child_count for force_resumeTony Lindgren
As pm_runtime_need_not_resume() relies also on usage_count, it can return a different value in pm_runtime_force_suspend() compared to when called in pm_runtime_force_resume(). Different return values can happen if anything calls PM runtime functions in between, and causes the parent child_count to increase on every resume. So far I've seen the issue only for omapdrm that does complicated things with PM runtime calls during system suspend for legacy reasons: omap_atomic_commit_tail() for omapdrm.0 dispc_runtime_get() wakes up 58000000.dss as it's the dispc parent dispc_runtime_resume() rpm_resume() increases parent child_count dispc_runtime_put() won't idle, PM runtime suspend blocked pm_runtime_force_suspend() for 58000000.dss, !pm_runtime_need_not_resume() __update_runtime_status() system suspended pm_runtime_force_resume() for 58000000.dss, pm_runtime_need_not_resume() pm_runtime_enable() only called because of pm_runtime_need_not_resume() omap_atomic_commit_tail() for omapdrm.0 dispc_runtime_get() wakes up 58000000.dss as it's the dispc parent dispc_runtime_resume() rpm_resume() increases parent child_count dispc_runtime_put() won't idle, PM runtime suspend blocked ... rpm_suspend for 58000000.dss but parent child_count is now unbalanced Let's fix the issue by adding a flag for needs_force_resume and use it in pm_runtime_force_resume() instead of pm_runtime_need_not_resume(). Additionally omapdrm system suspend could be simplified later on to avoid lots of unnecessary PM runtime calls and the complexity it adds. The driver can just use internal functions that are shared between the PM runtime and system suspend related functions. Fixes: 4918e1f87c5f ("PM / runtime: Rework pm_runtime_force_suspend/resume()") Signed-off-by: Tony Lindgren <tony@atomide.com> Reviewed-by: Ulf Hansson <ulf.hansson@linaro.org> Tested-by: Tomi Valkeinen <tomi.valkeinen@ideasonboard.com> Cc: 4.16+ <stable@vger.kernel.org> # 4.16+ Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2021-05-10usb: typec: tcpm: Don't block probing of consumers of "connector" nodesSaravana Kannan
fw_devlink expects DT device nodes with "compatible" property to have struct devices created for them. Since the connector node might not be populated as a device, mark it as such so that fw_devlink knows not to wait on this fwnode being populated as a struct device. Without this patch, USB functionality can be broken on some boards. Fixes: f7514a663016 ("of: property: fw_devlink: Add support for remote-endpoint") Reported-by: John Stultz <john.stultz@linaro.org> Tested-by: John Stultz <john.stultz@linaro.org> Signed-off-by: Saravana Kannan <saravanak@google.com> Link: https://lore.kernel.org/r/20210506004423.345199-1-saravanak@google.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-05-09Merge tag 'block-5.13-2021-05-09' of git://git.kernel.dk/linux-blockLinus Torvalds
Pull block fix from Jens Axboe: "Turns out the bio max size change still has issues, so let's get it reverted for 5.13-rc1. We'll shake out the issues there and defer it to 5.14 instead" * tag 'block-5.13-2021-05-09' of git://git.kernel.dk/linux-block: Revert "bio: limit bio max size"
2021-05-09Merge tag 'locking-urgent-2021-05-09' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull locking fixes from Thomas Gleixner: "A set of locking related fixes and updates: - Two fixes for the futex syscall related to the timeout handling. FUTEX_LOCK_PI does not support the FUTEX_CLOCK_REALTIME bit and because it's not set the time namespace adjustment for clock MONOTONIC is applied wrongly. FUTEX_WAIT cannot support the FUTEX_CLOCK_REALTIME bit because its always a relative timeout. - Cleanups in the futex syscall entry points which became obvious when the two timeout handling bugs were fixed. - Cleanup of queued_write_lock_slowpath() as suggested by Linus - Fixup of the smp_call_function_single_async() prototype" * tag 'locking-urgent-2021-05-09' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: futex: Make syscall entry points less convoluted futex: Get rid of the val2 conditional dance futex: Do not apply time namespace adjustment on FUTEX_LOCK_PI Revert 337f13046ff0 ("futex: Allow FUTEX_CLOCK_REALTIME with FUTEX_WAIT op") locking/qrwlock: Cleanup queued_write_lock_slowpath() smp: Fix smp_call_function_single_async prototype
2021-05-09Merge tag 'x86_urgent_for_v5.13_rc1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 fixes from Borislav Petkov: "A bunch of things accumulated for x86 in the last two weeks: - Fix guest vtime accounting so that ticks happening while the guest is running can also be accounted to it. Along with a consolidation to the guest-specific context tracking helpers. - Provide for the host NMI handler running after a VMX VMEXIT to be able to run on the kernel stack correctly. - Initialize MSR_TSC_AUX when RDPID is supported and not RDTSCP (virt relevant - real hw supports both) - A code generation improvement to TASK_SIZE_MAX through the use of alternatives - The usual misc and related cleanups and improvements" * tag 'x86_urgent_for_v5.13_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: KVM: x86: Consolidate guest enter/exit logic to common helpers context_tracking: KVM: Move guest enter/exit wrappers to KVM's domain context_tracking: Consolidate guest enter/exit wrappers sched/vtime: Move guest enter/exit vtime accounting to vtime.h sched/vtime: Move vtime accounting external declarations above inlines KVM: x86: Defer vtime accounting 'til after IRQ handling context_tracking: Move guest exit vtime accounting to separate helpers context_tracking: Move guest exit context tracking to separate helpers KVM/VMX: Invoke NMI non-IST entry instead of IST entry x86/cpu: Remove write_tsc() and write_rdtscp_aux() wrappers x86/cpu: Initialize MSR_TSC_AUX if RDTSCP *or* RDPID is supported x86/resctrl: Fix init const confusion x86: Delete UD0, UD1 traces x86/smpboot: Remove duplicate includes x86/cpu: Use alternative to generate the TASK_SIZE_MAX constant
2021-05-08Revert "bio: limit bio max size"block-5.13-2021-05-09Jens Axboe
This reverts commit cd2c7545ae1beac3b6aae033c7f31193b3255946. Alex reports that the commit causes corruption with LUKS on ext4. Revert it for now so that this can be investigated properly. Link: https://lore.kernel.org/linux-block/1620493841.bxdq8r5haw.none@localhost/ Reported-by: Alex Xu (Hello71) <alex_y_xu@yahoo.ca> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-05-08Merge tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsiLinus Torvalds
Pull more SCSI updates from James Bottomley: "This is a set of minor fixes in various drivers (qla2xxx, ufs, scsi_debug, lpfc) one doc fix and a fairly large update to the fnic driver to remove the open coded iteration functions in favour of the scsi provided ones" * tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: scsi: fnic: Use scsi_host_busy_iter() to traverse commands scsi: fnic: Kill 'exclude_id' argument to fnic_cleanup_io() scsi: scsi_debug: Fix cmd_per_lun, set to max_queue scsi: ufs: core: Narrow down fast path in system suspend path scsi: ufs: core: Cancel rpm_dev_flush_recheck_work during system suspend scsi: ufs: core: Do not put UFS power into LPM if link is broken scsi: qla2xxx: Prevent PRLI in target mode scsi: qla2xxx: Add marginal path handling support scsi: target: tcmu: Return from tcmu_handle_completions() if cmd_id not found scsi: ufs: core: Fix a typo in ufs-sysfs.c scsi: lpfc: Fix bad memory access during VPD DUMP mailbox command scsi: lpfc: Fix DMA virtual address ptr assignment in bsg scsi: lpfc: Fix illegal memory access on Abort IOCBs scsi: blk-mq: Fix build warning when making htmldocs
2021-05-08Merge tag 'kbuild-v5.13-2' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild Pull more Kbuild updates from Masahiro Yamada: - Convert sh and sparc to use generic shell scripts to generate the syscall headers - refactor .gitignore files - Update kernel/config_data.gz only when the content of the .config is really changed, which avoids the unneeded re-link of vmlinux - move "remove stale files" workarounds to scripts/remove-stale-files - suppress unused-but-set-variable warnings by default for Clang as well - fix locale setting LANG=C to LC_ALL=C - improve 'make distclean' - always keep intermediate objects from scripts/link-vmlinux.sh - move IF_ENABLED out of <linux/kconfig.h> to make it self-contained - misc cleanups * tag 'kbuild-v5.13-2' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild: (25 commits) linux/kconfig.h: replace IF_ENABLED() with PTR_IF() in <linux/kernel.h> kbuild: Don't remove link-vmlinux temporary files on exit/signal kbuild: remove the unneeded comments for external module builds kbuild: make distclean remove tag files in sub-directories kbuild: make distclean work against $(objtree) instead of $(srctree) kbuild: refactor modname-multi by using suffix-search kbuild: refactor fdtoverlay rule kbuild: parameterize the .o part of suffix-search arch: use cross_compiling to check whether it is a cross build or not kbuild: remove ARCH=sh64 support from top Makefile .gitignore: prefix local generated files with a slash kbuild: replace LANG=C with LC_ALL=C Makefile: Move -Wno-unused-but-set-variable out of GCC only block kbuild: add a script to remove stale generated files kbuild: update config_data.gz only when the content of .config is changed .gitignore: ignore only top-level modules.builtin .gitignore: move tags and TAGS close to other tag files kernel/.gitgnore: remove stale timeconst.h and hz.bc usr/include: refactor .gitignore genksyms: fix stale comment ...
2021-05-08Merge tag 'net-5.13-rc1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net Pull networking fixes from Jakub Kicinski: "Networking fixes for 5.13-rc1, including fixes from bpf, can and netfilter trees. Self-contained fixes, nothing risky. Current release - new code bugs: - dsa: ksz: fix a few bugs found by static-checker in the new driver - stmmac: fix frame preemption handshake not triggering after interface restart Previous releases - regressions: - make nla_strcmp handle more then one trailing null character - fix stack OOB reads while fragmenting IPv4 packets in openvswitch and net/sched - sctp: do asoc update earlier in sctp_sf_do_dupcook_a - sctp: delay auto_asconf init until binding the first addr - stmmac: clear receive all(RA) bit when promiscuous mode is off - can: mcp251x: fix resume from sleep before interface was brought up Previous releases - always broken: - bpf: fix leakage of uninitialized bpf stack under speculation - bpf: fix masking negation logic upon negative dst register - netfilter: don't assume that skb_header_pointer() will never fail - only allow init netns to set default tcp cong to a restricted algo - xsk: fix xp_aligned_validate_desc() when len == chunk_size to avoid false positive errors - ethtool: fix missing NLM_F_MULTI flag when dumping - can: m_can: m_can_tx_work_queue(): fix tx_skb race condition - sctp: fix a SCTP_MIB_CURRESTAB leak in sctp_sf_do_dupcook_b - bridge: fix NULL-deref caused by a races between assigning rx_handler_data and setting the IFF_BRIDGE_PORT bit Latecomer: - seg6: add counters support for SRv6 Behaviors" * tag 'net-5.13-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (73 commits) atm: firestream: Use fallthrough pseudo-keyword net: stmmac: Do not enable RX FIFO overflow interrupts mptcp: fix splat when closing unaccepted socket i40e: Remove LLDP frame filters i40e: Fix PHY type identifiers for 2.5G and 5G adapters i40e: fix the restart auto-negotiation after FEC modified i40e: Fix use-after-free in i40e_client_subtask() i40e: fix broken XDP support netfilter: nftables: avoid potential overflows on 32bit arches netfilter: nftables: avoid overflows in nft_hash_buckets() tcp: Specify cmsgbuf is user pointer for receive zerocopy. mlxsw: spectrum_mr: Update egress RIF list before route's action net: ipa: fix inter-EE IRQ register definitions can: m_can: m_can_tx_work_queue(): fix tx_skb race condition can: mcp251x: fix resume from sleep before interface was brought up can: mcp251xfd: mcp251xfd_probe(): add missing can_rx_offload_del() in error path can: mcp251xfd: mcp251xfd_probe(): fix an error pointer dereference in probe netfilter: nftables: Fix a memleak from userdata error path in new objects netfilter: remove BUG_ON() after skb_header_pointer() netfilter: nfnetlink_osf: Fix a missing skb_header_pointer() NULL check ...
2021-05-09linux/kconfig.h: replace IF_ENABLED() with PTR_IF() in <linux/kernel.h>Masahiro Yamada
<linux/kconfig.h> is included from all the kernel-space source files, including C, assembly, linker scripts. It is intended to contain a minimal set of macros to evaluate CONFIG options. IF_ENABLED() is an intruder here because (x ? y : z) is C code, which should not be included from assembly files or linker scripts. Also, <linux/kconfig.h> is no longer self-contained because NULL is defined in <linux/stddef.h>. Move IF_ENABLED() out to <linux/kernel.h> as PTR_IF(). PTF_IF() takes the general boolean expression instead of a CONFIG option so that it fits better in <linux/kernel.h>. Signed-off-by: Masahiro Yamada <masahiroy@kernel.org> Reviewed-by: Kees Cook <keescook@chromium.org>
2021-05-07Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nfJakub Kicinski
Pablo Neira Ayuso says: ==================== Netfilter fixes for net 1) Add SECMARK revision 1 to fix incorrect layout that prevents from remove rule with this target, from Phil Sutter. 2) Fix pernet exit path spat in arptables, from Florian Westphal. 3) Missing rcu_read_unlock() for unknown nfnetlink callbacks, reported by syzbot, from Eric Dumazet. 4) Missing check for skb_header_pointer() NULL pointer in nfnetlink_osf. 5) Remove BUG_ON() after skb_header_pointer() from packet path in several conntrack helper and the TCP tracker. 6) Fix memleak in the new object error path of userdata. 7) Avoid overflows in nft_hash_buckets(), reported by syzbot, also from Eric. 8) Avoid overflows in 32bit arches, from Eric. * git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf: netfilter: nftables: avoid potential overflows on 32bit arches netfilter: nftables: avoid overflows in nft_hash_buckets() netfilter: nftables: Fix a memleak from userdata error path in new objects netfilter: remove BUG_ON() after skb_header_pointer() netfilter: nfnetlink_osf: Fix a missing skb_header_pointer() NULL check netfilter: nfnetlink: add a missing rcu_read_unlock() netfilter: arptables: use pernet ops struct during unregister netfilter: xt_SECMARK: add new revision to fix structure layout ==================== Link: https://lore.kernel.org/r/20210507174739.1850-1-pablo@netfilter.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-05-07Merge tag 'tag-chrome-platform-for-v5.13' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/chrome-platform/linux Pull chrome platform updates from Benson Leung: "cros_ec_typec: - Changes around DP mode check, hard reset, tracking port change. cros_ec misc: - wilco_ec: Convert stream-like files from nonseekable to stream open - cros_usbpd_notify: Listen to EC_HSOT_EVENT_USB_MUX host event - fix format warning in cros_ec_typec" * tag 'tag-chrome-platform-for-v5.13' of git://git.kernel.org/pub/scm/linux/kernel/git/chrome-platform/linux: platform/chrome: cros_ec_lpc: Use DEFINE_MUTEX() for mutex lock platform/chrome: cros_usbpd_notify: Listen to EC_HOST_EVENT_USB_MUX host event platform/chrome: cros_ec_typec: Add DP mode check platform/chrome: cros_ec_typec: Handle hard reset platform/chrome: cros_ec: Add Type C hard reset platform/chrome: cros_ec_typec: Track port role platform/chrome: cros_ec_typec: fix clang -Wformat warning platform/chrome: cros_ec_typec: Check for device within remove function platform/chrome: wilco_ec: convert stream-like files from nonseekable_open -> stream_open
2021-05-07Merge tag 'block-5.13-2021-05-07' of git://git.kernel.dk/linux-blockLinus Torvalds
Pull block fixes from Jens Axboe: - dasd spelling fixes (Bhaskar) - Limit bio max size on multi-page bvecs to the hardware limit, to avoid overly large bio's (and hence latencies). Originally queued for the merge window, but needed a fix and was dropped from the initial pull (Changheun) - NVMe pull request (Christoph): - reset the bdev to ns head when failover (Daniel Wagner) - remove unsupported command noise (Keith Busch) - misc passthrough improvements (Kanchan Joshi) - fix controller ioctl through ns_head (Minwoo Im) - fix controller timeouts during reset (Tao Chiu) - rnbd fixes/cleanups (Gioh, Md, Dima) - Fix iov_iter re-expansion (yangerkun) * tag 'block-5.13-2021-05-07' of git://git.kernel.dk/linux-block: block: reexpand iov_iter after read/write nvmet: remove unsupported command noise nvme-multipath: reset bdev to ns head when failover nvme-pci: fix controller reset hang when racing with nvme_timeout nvme: move the fabrics queue ready check routines to core nvme: avoid memset for passthrough requests nvme: add nvme_get_ns helper nvme: fix controller ioctl through ns_head bio: limit bio max size RDMA/rtrs: fix uninitialized symbol 'cnt' s390: dasd: Mundane spelling fixes block/rnbd: Remove all likely and unlikely block/rnbd-clt: Check the return value of the function rtrs_clt_query block/rnbd: Fix style issues block/rnbd-clt: Change queue_depth type in rnbd_clt_session to size_t
2021-05-07Merge tag 'nfs-for-5.13-1' of git://git.linux-nfs.org/projects/trondmy/linux-nfsLinus Torvalds
Pull NFS client updates from Trond Myklebust: "Highlights include: Stable fixes: - Add validation of the UDP retrans parameter to prevent shift out-of-bounds - Don't discard pNFS layout segments that are marked for return Bugfixes: - Fix a NULL dereference crash in xprt_complete_bc_request() when the NFSv4.1 server misbehaves. - Fix the handling of NFS READDIR cookie verifiers - Sundry fixes to ensure attribute revalidation works correctly when the server does not return post-op attributes. - nfs4_bitmask_adjust() must not change the server global bitmasks - Fix major timeout handling in the RPC code. - NFSv4.2 fallocate() fixes. - Fix the NFSv4.2 SEEK_HOLE/SEEK_DATA end-of-file handling - Copy offload attribute revalidation fixes - Fix an incorrect filehandle size check in the pNFS flexfiles driver - Fix several RDMA transport setup/teardown races - Fix several RDMA queue wrapping issues - Fix a misplaced memory read barrier in sunrpc's call_decode() Features: - Micro optimisation of the TCP transmission queue using TCP_CORK - statx() performance improvements by further splitting up the tracking of invalid cached file metadata. - Support the NFSv4.2 'change_attr_type' attribute and use it to optimise handling of change attribute updates" * tag 'nfs-for-5.13-1' of git://git.linux-nfs.org/projects/trondmy/linux-nfs: (85 commits) xprtrdma: Fix a NULL dereference in frwr_unmap_sync() sunrpc: Fix misplaced barrier in call_decode NFSv4.2: Remove ifdef CONFIG_NFSD from NFSv4.2 client SSC code. xprtrdma: Move fr_mr field to struct rpcrdma_mr xprtrdma: Move the Work Request union to struct rpcrdma_mr xprtrdma: Move fr_linv_done field to struct rpcrdma_mr xprtrdma: Move cqe to struct rpcrdma_mr xprtrdma: Move fr_cid to struct rpcrdma_mr xprtrdma: Remove the RPC/RDMA QP event handler xprtrdma: Don't display r_xprt memory addresses in tracepoints xprtrdma: Add an rpcrdma_mr_completion_class xprtrdma: Add tracepoints showing FastReg WRs and remote invalidation xprtrdma: Avoid Send Queue wrapping xprtrdma: Do not wake RPC consumer on a failed LocalInv xprtrdma: Do not recycle MR after FastReg/LocalInv flushes xprtrdma: Clarify use of barrier in frwr_wc_localinv_done() xprtrdma: Rename frwr_release_mr() xprtrdma: rpcrdma_mr_pop() already does list_del_init() xprtrdma: Delete rpcrdma_recv_buffer_put() xprtrdma: Fix cwnd update ordering ...
2021-05-07Merge branch 'akpm' (patches from Andrew)Linus Torvalds
Merge yet more updates from Andrew Morton: "This is everything else from -mm for this merge window. 90 patches. Subsystems affected by this patch series: mm (cleanups and slub), alpha, procfs, sysctl, misc, core-kernel, bitmap, lib, compat, checkpatch, epoll, isofs, nilfs2, hpfs, exit, fork, kexec, gcov, panic, delayacct, gdb, resource, selftests, async, initramfs, ipc, drivers/char, and spelling" * emailed patches from Andrew Morton <akpm@linux-foundation.org>: (90 commits) mm: fix typos in comments mm: fix typos in comments treewide: remove editor modelines and cruft ipc/sem.c: spelling fix fs: fat: fix spelling typo of values kernel/sys.c: fix typo kernel/up.c: fix typo kernel/user_namespace.c: fix typos kernel/umh.c: fix some spelling mistakes include/linux/pgtable.h: few spelling fixes mm/slab.c: fix spelling mistake "disired" -> "desired" scripts/spelling.txt: add "overflw" scripts/spelling.txt: Add "diabled" typo scripts/spelling.txt: add "overlfow" arm: print alloc free paths for address in registers mm/vmalloc: remove vwrite() mm: remove xlate_dev_kmem_ptr() drivers/char: remove /dev/kmem for good mm: fix some typos and code style problems ipc/sem.c: mundane typo fixes ...
2021-05-07mm: fix typos in commentsIngo Molnar
Fix ~94 single-word typos in locking code comments, plus a few very obvious grammar mistakes. Link: https://lkml.kernel.org/r/20210322212624.GA1963421@gmail.com Link: https://lore.kernel.org/r/20210322205203.GB1959563@gmail.com Signed-off-by: Ingo Molnar <mingo@kernel.org> Reviewed-by: Matthew Wilcox (Oracle) <willy@infradead.org> Reviewed-by: Randy Dunlap <rdunlap@infradead.org> Cc: Bhaskar Chowdhury <unixbhaskar@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-05-07treewide: remove editor modelines and cruftMasahiro Yamada
The section "19) Editor modelines and other cruft" in Documentation/process/coding-style.rst clearly says, "Do not include any of these in source files." I recently receive a patch to explicitly add a new one. Let's do treewide cleanups, otherwise some people follow the existing code and attempt to upstream their favoriate editor setups. It is even nicer if scripts/checkpatch.pl can check it. If we like to impose coding style in an editor-independent manner, I think editorconfig (patch [1]) is a saner solution. [1] https://lore.kernel.org/lkml/20200703073143.423557-1-danny@kdrag0n.dev/ Link: https://lkml.kernel.org/r/20210324054457.1477489-1-masahiroy@kernel.org Signed-off-by: Masahiro Yamada <masahiroy@kernel.org> Acked-by: Geert Uytterhoeven <geert@linux-m68k.org> Reviewed-by: Miguel Ojeda <ojeda@kernel.org> [auxdisplay] Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-05-07include/linux/pgtable.h: few spelling fixesBhaskar Chowdhury
Few spelling fixes throughout the file. Link: https://lkml.kernel.org/r/20210318201404.6380-1-unixbhaskar@gmail.com Signed-off-by: Bhaskar Chowdhury <unixbhaskar@gmail.com> Acked-by: Randy Dunlap <rdunlap@infradead.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-05-07mm/vmalloc: remove vwrite()David Hildenbrand
The last user (/dev/kmem) is gone. Let's drop it. Link: https://lkml.kernel.org/r/20210324102351.6932-4-david@redhat.com Signed-off-by: David Hildenbrand <david@redhat.com> Acked-by: Michal Hocko <mhocko@suse.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Hillf Danton <hdanton@sina.com> Cc: Matthew Wilcox <willy@infradead.org> Cc: Oleksiy Avramchenko <oleksiy.avramchenko@sonymobile.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Minchan Kim <minchan@kernel.org> Cc: huang ying <huang.ying.caritas@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-05-07drivers/char: remove /dev/kmem for goodDavid Hildenbrand
Patch series "drivers/char: remove /dev/kmem for good". Exploring /dev/kmem and /dev/mem in the context of memory hot(un)plug and memory ballooning, I started questioning the existence of /dev/kmem. Comparing it with the /proc/kcore implementation, it does not seem to be able to deal with things like a) Pages unmapped from the direct mapping (e.g., to be used by secretmem) -> kern_addr_valid(). virt_addr_valid() is not sufficient. b) Special cases like gart aperture memory that is not to be touched -> mem_pfn_is_ram() Unless I am missing something, it's at least broken in some cases and might fault/crash the machine. Looks like its existence has been questioned before in 2005 and 2010 [1], after ~11 additional years, it might make sense to revive the discussion. CONFIG_DEVKMEM is only enabled in a single defconfig (on purpose or by mistake?). All distributions disable it: in Ubuntu it has been disabled for more than 10 years, in Debian since 2.6.31, in Fedora at least starting with FC3, in RHEL starting with RHEL4, in SUSE starting from 15sp2, and OpenSUSE has it disabled as well. 1) /dev/kmem was popular for rootkits [2] before it got disabled basically everywhere. Ubuntu documents [3] "There is no modern user of /dev/kmem any more beyond attackers using it to load kernel rootkits.". RHEL documents in a BZ [5] "it served no practical purpose other than to serve as a potential security problem or to enable binary module drivers to access structures/functions they shouldn't be touching" 2) /proc/kcore is a decent interface to have a controlled way to read kernel memory for debugging puposes. (will need some extensions to deal with memory offlining/unplug, memory ballooning, and poisoned pages, though) 3) It might be useful for corner case debugging [1]. KDB/KGDB might be a better fit, especially, to write random memory; harder to shoot yourself into the foot. 4) "Kernel Memory Editor" [4] hasn't seen any updates since 2000 and seems to be incompatible with 64bit [1]. For educational purposes, /proc/kcore might be used to monitor value updates -- or older kernels can be used. 5) It's broken on arm64, and therefore, completely disabled there. Looks like it's essentially unused and has been replaced by better suited interfaces for individual tasks (/proc/kcore, KDB/KGDB). Let's just remove it. [1] https://lwn.net/Articles/147901/ [2] https://www.linuxjournal.com/article/10505 [3] https://wiki.ubuntu.com/Security/Features#A.2Fdev.2Fkmem_disabled [4] https://sourceforge.net/projects/kme/ [5] https://bugzilla.redhat.com/show_bug.cgi?id=154796 Link: https://lkml.kernel.org/r/20210324102351.6932-1-david@redhat.com Link: https://lkml.kernel.org/r/20210324102351.6932-2-david@redhat.com Signed-off-by: David Hildenbrand <david@redhat.com> Acked-by: Michal Hocko <mhocko@suse.com> Acked-by: Kees Cook <keescook@chromium.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: "Alexander A. Klimov" <grandmaster@al2klimov.de> Cc: Alexander Viro <viro@zeniv.linux.org.uk> Cc: Alexandre Belloni <alexandre.belloni@bootlin.com> Cc: Andrew Lunn <andrew@lunn.ch> Cc: Andrey Zhizhikin <andrey.zhizhikin@leica-geosystems.com> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Brian Cain <bcain@codeaurora.org> Cc: Christian Borntraeger <borntraeger@de.ibm.com> Cc: Christophe Leroy <christophe.leroy@csgroup.eu> Cc: Chris Zankel <chris@zankel.net> Cc: Corentin Labbe <clabbe@baylibre.com> Cc: "David S. Miller" <davem@davemloft.net> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Cc: Geert Uytterhoeven <geert@linux-m68k.org> Cc: Gerald Schaefer <gerald.schaefer@linux.ibm.com> Cc: Greentime Hu <green.hu@gmail.com> Cc: Gregory Clement <gregory.clement@bootlin.com> Cc: Heiko Carstens <hca@linux.ibm.com> Cc: Helge Deller <deller@gmx.de> Cc: Hillf Danton <hdanton@sina.com> Cc: huang ying <huang.ying.caritas@gmail.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru> Cc: "James E.J. Bottomley" <James.Bottomley@HansenPartnership.com> Cc: James Troup <james.troup@canonical.com> Cc: Jiaxun Yang <jiaxun.yang@flygoat.com> Cc: Jonas Bonn <jonas@southpole.se> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Kairui Song <kasong@redhat.com> Cc: Krzysztof Kozlowski <krzk@kernel.org> Cc: Kuninori Morimoto <kuninori.morimoto.gx@renesas.com> Cc: Liviu Dudau <liviu.dudau@arm.com> Cc: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> Cc: Luc Van Oostenryck <luc.vanoostenryck@gmail.com> Cc: Luis Chamberlain <mcgrof@kernel.org> Cc: Matthew Wilcox <willy@infradead.org> Cc: Matt Turner <mattst88@gmail.com> Cc: Max Filippov <jcmvbkbc@gmail.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Mike Rapoport <rppt@kernel.org> Cc: Mikulas Patocka <mpatocka@redhat.com> Cc: Minchan Kim <minchan@kernel.org> Cc: Niklas Schnelle <schnelle@linux.ibm.com> Cc: Oleksiy Avramchenko <oleksiy.avramchenko@sonymobile.com> Cc: openrisc@lists.librecores.org Cc: Palmer Dabbelt <palmerdabbelt@google.com> Cc: Paul Mackerras <paulus@samba.org> Cc: "Pavel Machek (CIP)" <pavel@denx.de> Cc: Pavel Machek <pavel@ucw.cz> Cc: "Peter Zijlstra (Intel)" <peterz@infradead.org> Cc: Pierre Morel <pmorel@linux.ibm.com> Cc: Randy Dunlap <rdunlap@infradead.org> Cc: Richard Henderson <rth@twiddle.net> Cc: Rich Felker <dalias@libc.org> Cc: Robert Richter <rric@kernel.org> Cc: Rob Herring <robh@kernel.org> Cc: Russell King <linux@armlinux.org.uk> Cc: Sam Ravnborg <sam@ravnborg.org> Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Cc: Sebastian Hesselbarth <sebastian.hesselbarth@gmail.com> Cc: sparclinux@vger.kernel.org Cc: Stafford Horne <shorne@gmail.com> Cc: Stefan Kristiansson <stefan.kristiansson@saunalahti.fi> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Sudeep Holla <sudeep.holla@arm.com> Cc: Theodore Dubois <tblodt@icloud.com> Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vasily Gorbik <gor@linux.ibm.com> Cc: Viresh Kumar <viresh.kumar@linaro.org> Cc: William Cohen <wcohen@redhat.com> Cc: Xiaoming Ni <nixiaoming@huawei.com> Cc: Yoshinori Sato <ysato@users.sourceforge.jp> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-05-07mm: fix some typos and code style problemsShijie Luo
fix some typos and code style problems in mm. gfp.h: s/MAXNODES/MAX_NUMNODES mmzone.h: s/then/than rmap.c: s/__vma_split()/__vma_adjust() swap.c: s/__mod_zone_page_stat/__mod_zone_page_state, s/is is/is swap_state.c: s/whoes/whose z3fold.c: code style problem fix in z3fold_unregister_migration zsmalloc.c: s/of/or, s/give/given Link: https://lkml.kernel.org/r/20210419083057.64820-1-luoshijie1@huawei.com Signed-off-by: Shijie Luo <luoshijie1@huawei.com> Signed-off-by: Miaohe Lin <linmiaohe@huawei.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>