summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2019-10-09xfs: create a big array data structureDarrick J. Wong
Create a simple 'big array' data structure for storage of fixed-size metadata records that will be used to reconstruct a btree index. For repair operations, the most important operations are append, iterate, and sort. Earlier implementations of the big array used linked lists and suffered from severe problems -- pinning all records in kernel memory was not a good idea and frequently lead to OOM situations; random access was very inefficient; and record overhead for the lists was unacceptably high at 40-60%. Therefore, the big memory array relies on the 'xfile' abstraction, which creates a memfd file and stores the records in page cache pages. Since the memfd is created in tmpfs, the memory pages can be pushed out to disk if necessary and we have a built-in usage limit of 50% of physical memory. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2019-10-09xfs: log EFIs for all btree blocks being used to stage a btreerepair-prep-for-bulk-loading_2019-10-09Darrick J. Wong
We need to log EFIs for every extent that we allocate for the purpose of staging a new btree so that if we fail then the blocks will be freed during log recovery. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2019-10-09xfs: implement block reservation accounting for btrees we're stagingDarrick J. Wong
Create a new xrep_newbt structure to encapsulate a fake root for creating a staged btree cursor as well as to track all the blocks that we need to reserve in order to build that btree. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2019-10-09xfs: add debug knobs to control btree bulk load slack factorsDarrick J. Wong
Add some debug knobs so that we can control the leaf and node block slack when rebuilding btrees. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2019-10-09xfs: always rescan allegedly healthy per-ag metadata after repairrepair-health_2019-10-09Darrick J. Wong
After an online repair function runs for a per-AG metadata structure, sc->sick_mask is supposed to reflect the per-AG metadata that the repair function fixed. Our next move is to re-check the metadata to assess the completeness of our repair, so we don't want the rebuilt structure to be excluded from the rescan just because the health system previously logged a problem with the data structure. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2019-10-09xfs: add a repair revalidation function pointerDarrick J. Wong
Allow repair functions to set a separate function pointer to validate the metadata that they've rebuilt. This prevents us from exiting from a repair function that rebuilds both A and B without checking that both A and B can pass a scrub test. We'll need this for the free space and inode btree repair strategies. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2019-10-09xfs: convert xbitmap to interval treerepair-bitmap-rework_2019-10-09Darrick J. Wong
Convert the xbitmap code to use interval trees instead of linked lists. This reduces the amount of coding required to handle the disunion operation and in the future will make it easier to set bits in arbitrary order yet later be able to extract maximally sized extents, which we'll need for rebuilding certain structures. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2019-10-09xfs: remove the for_each_xbitmap_ helpersDarrick J. Wong
Remove the for_each_xbitmap_ macros in favor of proper iterator functions. We'll soon be switching this data structure over to an interval tree implementation, which means that we can't allow callers to modify the bitmap during iteration without telling us. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2019-10-09xfs: replace open-coded bitmap weight logicDarrick J. Wong
Add a xbitmap_hweight helper function so that we can get rid of the open-coded loop. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2019-10-09xfs: rename xfs_bitmap to xbitmapDarrick J. Wong
Shorten the name of xfs_bitmap to xbitmap since the scrub bitmap has nothing to do with the libxfs bitmap. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2019-10-09xfs: use deferred frees to reap old btree blocksrepair-reap-fixes_2019-10-09Darrick J. Wong
Use deferred frees (EFIs) to reap the blocks of a btree that we just replaced. This helps us to shrink the window in which those old blocks could be lost due to a system crash, though we try to flush the EFIs every few hundred blocks so that we don't also overflow the transaction reservations during and after we commit the new btree. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2019-10-09xfs: only invalidate blocks if we're going to free themDarrick J. Wong
When we're discarding old btree blocks after a repair, only invalidate the buffers for the ones that we're freeing -- if the metadata was crosslinked with another data structure, we don't want to touch it. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2019-10-09xfs: xrep_reap_extents should not destroy the bitmapDarrick J. Wong
Remove the xfs_bitmap_destroy call from the end of xrep_reap_extents because this sort of violates our rule that the function initializing a structure should destroy it. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2019-10-09xfs: support staging cursors for per-AG btree typesbtree-bulk-loading_2019-10-09Darrick J. Wong
Add support for btree staging cursors for the per-AG btree types. This is needed both for online repair and also to convert xfs_repair to use btree bulk loading. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2019-10-09xfs: support bulk loading of staged btreesDarrick J. Wong
Add a new btree function that enables us to bulk load a btree cursor. This will be used by the upcoming online repair patches to generate new btrees. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2019-10-09xfs: introduce fake roots for inode-rooted btreesDarrick J. Wong
Create an in-core fake root for inode-rooted btree types so that callers can generate a whole new btree using the upcoming btree bulk load function without making the new tree accessible from the rest of the filesystem. It is up to the individual btree type to provide a function to create a staged cursor (presumably with the appropriate callouts to update the fakeroot) and then commit the staged root back into the filesystem. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2019-10-09xfs: introduce fake roots for ag-rooted btreesDarrick J. Wong
Create an in-core fake root for AG-rooted btree types so that callers can generate a whole new btree using the upcoming btree bulk load function without making the new tree accessible from the rest of the filesystem. It is up to the individual btree type to provide a function to create a staged cursor (presumably with the appropriate callouts to update the fakeroot) and then commit the staged root back into the filesystem. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2019-10-09xfs: increase the default parallelism levels of pwork clientspwork-parallelism_2019-10-09Darrick J. Wong
Increase the default parallelism level for pwork clients so that we can take advantage of computers with a lot of CPUs and a lot of hardware. 8x raid0 spinning rust running quotacheck: 1 39s 2 29s 4 26s 8 24s 24 (nr_cpus) 24s 4x raid0 sata ssds running quotacheck: 1 12s 2 12s 4 12s 8 13s 24 (nr_cpus) 14s 4x raid0 nvme ssds running quotacheck: 1 18s 2 18s 4 19s 8 20s 20 (nr_cpus) 20s So, mixed results... Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2019-10-09xfs: extend the range of flush_unmap rangesstale-exposure_2019-10-09Darrick J. Wong
If we have to initiate writeback of a range that starts beyond the on-disk EOF, extend the flushed range to start at the on-disk EOF so that there's no chance that we put real extents in the data fork having not actually flushed the data. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2019-10-09xfs: force writes to delalloc regions to unwrittenDarrick J. Wong
When writing to a delalloc region in the data fork, commit the new allocations (of the da reservation) as unwritten so that the mappings are only marked written once writeback completes successfully. This fixes the problem of stale data exposure if the system goes down during targeted writeback of a specific region of a file, as tested by generic/042. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2019-10-09"BUG: kernel NULL pointer dereference, address: 0000000000000002"xfs-5.4-fixes_2019-10-09Darrick J. Wong
#PF: supervisor write access in kernel mode #PF: error_code(0x0002) - not-present page PGD 0 P4D 0 Oops: 0002 [#1] SMP DEBUG_PAGEALLOC CPU: 40 PID: 555153 Comm: kworker/u98:50 Kdump: loaded Not tainted ... Workqueue: writeback wb_workfn (flush-btrfs-1) RIP: 0010:_raw_spin_lock_irqsave+0x10/0x30 Code: 48 89 d8 5b c3 e8 50 db 6b ff eb f4 0f 1f 40 00 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 53 9c 5b fa 31 c0 ba 01 00 00 00 <f0> 0f b1 17 75 05 48 89 d8 5b c3 89 c6 e8 fe ca 6b ff eb f2 66 90 RSP: 0018:ffffc90049b27d98 EFLAGS: 00010046 RAX: 0000000000000000 RBX: 0000000000000246 RCX: 0000000000000000 RDX: 0000000000000001 RSI: 0000000000000003 RDI: 0000000000000002 RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000001 R10: ffff889fff407600 R11: ffff88ba9395d740 R12: 000000000000e300 R13: 0000000000000003 R14: 0000000000000000 R15: 0000000000000000 FS: 0000000000000000(0000) GS:ffff88bfdfa00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000000000002 CR3: 0000000002409005 CR4: 00000000001606e0 Call Trace: __wake_up_common_lock+0x63/0xc0 wb_workfn+0xd2/0x3e0 process_one_work+0x1f5/0x3f0 worker_thread+0x2d/0x3d0 kthread+0x111/0x130 ret_from_fork+0x1f/0x30 Fix it by reading and caching @done->waitq before decrementing @done->cnt. Signed-off-by: Tejun Heo <tj@kernel.org> Debugged-by: Chris Mason <clm@fb.com> Fixes: 30638b0125e1 ("writeback: Generalize and expose wb_completion") Cc: stable@vger.kernel.org # v5.2+
2019-10-09xfs: removed unused error variable from xchk_refcountbt_recAliasgar Surti
Removed unused error variable. Instead of using error variable, returned the value directly as it wasn't updated. Signed-off-by: Aliasgar Surti <aliasgar.surti500@gmail.com> Reviewed-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2019-10-09xfs: remove unused flags arg from xfs_get_aghdr_buf()Eric Sandeen
The flags arg is always passed as zero, so remove it. (xfs_buf_get_uncached takes flags to support XBF_NO_IOACCT for the sb, but that should never be relevant for xfs_get_aghdr_buf) Signed-off-by: Eric Sandeen <sandeen@redhat.com> Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2019-10-09xfs: Fix tail rounding in xfs_alloc_file_space()Max Reitz
To ensure that all blocks touched by the range [offset, offset + count) are allocated, we need to calculate the block count from the difference of the range end (rounded up) and the range start (rounded down). Before this patch, we just round up the byte count, which may lead to unaligned ranges not being fully allocated: $ touch test_file $ block_size=$(stat -fc '%S' test_file) $ fallocate -o $((block_size / 2)) -l $block_size test_file $ xfs_bmap test_file test_file: 0: [0..7]: 1396264..1396271 1: [8..15]: hole There should not be a hole there. Instead, the first two blocks should be fully allocated. With this patch applied, the result is something like this: $ touch test_file $ block_size=$(stat -fc '%S' test_file) $ fallocate -o $((block_size / 2)) -l $block_size test_file $ xfs_bmap test_file test_file: 0: [0..15]: 11024..11039 Signed-off-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2019-10-06Linux 5.4-rc2v5.4-rc2Linus Torvalds
2019-10-06elf: don't use MAP_FIXED_NOREPLACE for elf executable mappingsLinus Torvalds
In commit 4ed28639519c ("fs, elf: drop MAP_FIXED usage from elf_map") we changed elf to use MAP_FIXED_NOREPLACE instead of MAP_FIXED for the executable mappings. Then, people reported that it broke some binaries that had overlapping segments from the same file, and commit ad55eac74f20 ("elf: enforce MAP_FIXED on overlaying elf segments") re-instated MAP_FIXED for some overlaying elf segment cases. But only some - despite the summary line of that commit, it only did it when it also does a temporary brk vma for one obvious overlapping case. Now Russell King reports another overlapping case with old 32-bit x86 binaries, which doesn't trigger that limited case. End result: we had better just drop MAP_FIXED_NOREPLACE entirely, and go back to MAP_FIXED. Yes, it's a sign of old binaries generated with old tool-chains, but we do pride ourselves on not breaking existing setups. This still leaves MAP_FIXED_NOREPLACE in place for the load_elf_interp() and the old load_elf_library() use-cases, because nobody has reported breakage for those. Yet. Note that in all the cases seen so far, the overlapping elf sections seem to be just re-mapping of the same executable with different section attributes. We could possibly introduce a new MAP_FIXED_NOFILECHANGE flag or similar, which acts like NOREPLACE, but allows just remapping the same executable file using different protection flags. It's not clear that would make a huge difference to anything, but if people really hate that "elf remaps over previous maps" behavior, maybe at least a more limited form of remapping would alleviate some concerns. Alternatively, we should take a look at our elf_map() logic to see if we end up not mapping things properly the first time. In the meantime, this is the minimal "don't do that then" patch while people hopefully think about it more. Reported-by: Russell King <linux@armlinux.org.uk> Fixes: 4ed28639519c ("fs, elf: drop MAP_FIXED usage from elf_map") Fixes: ad55eac74f20 ("elf: enforce MAP_FIXED on overlaying elf segments") Cc: Michal Hocko <mhocko@suse.com> Cc: Kees Cook <keescook@chromium.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-10-06Merge tag 'dma-mapping-5.4-1' of git://git.infradead.org/users/hch/dma-mappingLinus Torvalds
Pull dma-mapping regression fix from Christoph Hellwig: "Revert an incorret hunk from a patch that caused problems on various arm boards (Andrey Smirnov)" * tag 'dma-mapping-5.4-1' of git://git.infradead.org/users/hch/dma-mapping: dma-mapping: fix false positive warnings in dma_common_free_remap()
2019-10-05Merge tag 'armsoc-fixes' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc Pull ARM SoC fixes from Olof Johansson: "A few fixes this time around: - Fixup of some clock specifications for DRA7 (device-tree fix) - Removal of some dead/legacy CPU OPP/PM code for OMAP that throws warnings at boot - A few more minor fixups for OMAPs, most around display - Enable STM32 QSPI as =y since their rootfs sometimes comes from there - Switch CONFIG_REMOTEPROC to =y since it went from tristate to bool - Fix of thermal zone definition for ux500 (5.4 regression)" * tag 'armsoc-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc: ARM: multi_v7_defconfig: Fix SPI_STM32_QSPI support ARM: dts: ux500: Fix up the CPU thermal zone arm64/ARM: configs: Change CONFIG_REMOTEPROC from m to y ARM: dts: am4372: Set memory bandwidth limit for DISPC ARM: OMAP2+: Fix warnings with broken omap2_set_init_voltage() ARM: OMAP2+: Add missing LCDC midlemode for am335x ARM: OMAP2+: Fix missing reset done flag for am3 and am43 ARM: dts: Fix gpio0 flags for am335x-icev2 ARM: omap2plus_defconfig: Enable more droid4 devices as loadable modules ARM: omap2plus_defconfig: Enable DRM_TI_TFP410 DTS: ARM: gta04: introduce legacy spi-cs-high to make display work again ARM: dts: Fix wrong clocks for dra7 mcasp clk: ti: dra7: Fix mcasp8 clock bits
2019-10-05Merge tag 'kbuild-fixes-v5.4' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild Pull Kbuild fixes from Masahiro Yamada: - remove unneeded ar-option and KBUILD_ARFLAGS - remove long-deprecated SUBDIRS - fix modpost to suppress false-positive warnings for UML builds - fix namespace.pl to handle relative paths to ${objtree}, ${srctree} - make setlocalversion work for /bin/sh - make header archive reproducible - fix some Makefiles and documents * tag 'kbuild-fixes-v5.4' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild: kheaders: make headers archive reproducible kbuild: update compile-test header list for v5.4-rc2 kbuild: two minor updates for Documentation/kbuild/modules.rst scripts/setlocalversion: clear local variable to make it work for sh namespace: fix namespace.pl script to support relative paths video/logo: do not generate unneeded logo C files video/logo: remove unneeded *.o pattern from clean-files integrity: remove pointless subdir-$(CONFIG_...) integrity: remove unneeded, broken attempt to add -fshort-wchar modpost: fix static EXPORT_SYMBOL warnings for UML build kbuild: correct formatting of header in kbuild module docs kbuild: remove SUBDIRS support kbuild: remove ar-option and KBUILD_ARFLAGS
2019-10-05Merge tag 'scsi-fixes' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi Pull SCSI fixes from James Bottomley: "Twelve patches mostly small but obvious fixes or cosmetic but small updates" * tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: scsi: qla2xxx: Fix Nport ID display value scsi: qla2xxx: Fix N2N link up fail scsi: qla2xxx: Fix N2N link reset scsi: qla2xxx: Optimize NPIV tear down process scsi: qla2xxx: Fix stale mem access on driver unload scsi: qla2xxx: Fix unbound sleep in fcport delete path. scsi: qla2xxx: Silence fwdump template message scsi: hisi_sas: Make three functions static scsi: megaraid: disable device when probe failed after enabled device scsi: storvsc: setup 1:1 mapping between hardware queue and CPU queue scsi: qedf: Remove always false 'tmp_prio < 0' statement scsi: ufs: skip shutdown if hba is not powered scsi: bnx2fc: Handle scope bits when array returns BUSY or TSF
2019-10-05Merge branch 'readdir' (readdir speedup and sanity checking)Linus Torvalds
This makes getdents() and getdents64() do sanity checking on the pathname that it gives to user space. And to mitigate the performance impact of that, it first cleans up the way it does the user copying, so that the code avoids doing the SMAP/PAN updates between each part of the dirent structure write. I really wanted to do this during the merge window, but didn't have time. The conversion of filldir to unsafe_put_user() is something I've had around for years now in a private branch, but the extra pathname checking finally made me clean it up to the point where it is mergable. It's worth noting that the filename validity checking really should be a bit smarter: it would be much better to delay the error reporting until the end of the readdir, so that non-corrupted filenames are still returned. But that involves bigger changes, so let's see if anybody actually hits the corrupt directory entry case before worrying about it further. * branch 'readdir': Make filldir[64]() verify the directory entry filename is valid Convert filldir[64]() from __put_user() to unsafe_put_user()
2019-10-05Make filldir[64]() verify the directory entry filename is validLinus Torvalds
This has been discussed several times, and now filesystem people are talking about doing it individually at the filesystem layer, so head that off at the pass and just do it in getdents{64}(). This is partially based on a patch by Jann Horn, but checks for NUL bytes as well, and somewhat simplified. There's also commentary about how it might be better if invalid names due to filesystem corruption don't cause an immediate failure, but only an error at the end of the readdir(), so that people can still see the filenames that are ok. There's also been discussion about just how much POSIX strictly speaking requires this since it's about filesystem corruption. It's really more "protect user space from bad behavior" as pointed out by Jann. But since Eric Biederman looked up the POSIX wording, here it is for context: "From readdir: The readdir() function shall return a pointer to a structure representing the directory entry at the current position in the directory stream specified by the argument dirp, and position the directory stream at the next entry. It shall return a null pointer upon reaching the end of the directory stream. The structure dirent defined in the <dirent.h> header describes a directory entry. From definitions: 3.129 Directory Entry (or Link) An object that associates a filename with a file. Several directory entries can associate names with the same file. ... 3.169 Filename A name consisting of 1 to {NAME_MAX} bytes used to name a file. The characters composing the name may be selected from the set of all character values excluding the slash character and the null byte. The filenames dot and dot-dot have special meaning. A filename is sometimes referred to as a 'pathname component'." Note that I didn't bother adding the checks to any legacy interfaces that nobody uses. Also note that if this ends up being noticeable as a performance regression, we can fix that to do a much more optimized model that checks for both NUL and '/' at the same time one word at a time. We haven't really tended to optimize 'memchr()', and it only checks for one pattern at a time anyway, and we really _should_ check for NUL too (but see the comment about "soft errors" in the code about why it currently only checks for '/') See the CONFIG_DCACHE_WORD_ACCESS case of hash_name() for how the name lookup code looks for pathname terminating characters in parallel. Link: https://lore.kernel.org/lkml/20190118161440.220134-2-jannh@google.com/ Cc: Alexander Viro <viro@zeniv.linux.org.uk> Cc: Jann Horn <jannh@google.com> Cc: Eric W. Biederman <ebiederm@xmission.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-10-05Convert filldir[64]() from __put_user() to unsafe_put_user()Linus Torvalds
We really should avoid the "__{get,put}_user()" functions entirely, because they can easily be mis-used and the original intent of being used for simple direct user accesses no longer holds in a post-SMAP/PAN world. Manually optimizing away the user access range check makes no sense any more, when the range check is generally much cheaper than the "enable user accesses" code that the __{get,put}_user() functions still need. So instead of __put_user(), use the unsafe_put_user() interface with user_access_{begin,end}() that really does generate better code these days, and which is generally a nicer interface. Under some loads, the multiple user writes that filldir() does are actually quite noticeable. This also makes the dirent name copy use unsafe_put_user() with a couple of macros. We do not want to make function calls with SMAP/PAN disabled, and the code this generates is quite good when the architecture uses "asm goto" for unsafe_put_user() like x86 does. Note that this doesn't bother with the legacy cases. Nobody should use them anyway, so performance doesn't really matter there. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-10-05Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netLinus Torvalds
Pull networking fixes from David Miller: 1) Fix ieeeu02154 atusb driver use-after-free, from Johan Hovold. 2) Need to validate TCA_CBQ_WRROPT netlink attributes, from Eric Dumazet. 3) txq null deref in mac80211, from Miaoqing Pan. 4) ionic driver needs to select NET_DEVLINK, from Arnd Bergmann. 5) Need to disable bh during nft_connlimit GC, from Pablo Neira Ayuso. 6) Avoid division by zero in taprio scheduler, from Vladimir Oltean. 7) Various xgmac fixes in stmmac driver from Jose Abreu. 8) Avoid 64-bit division in mlx5 leading to link errors on 32-bit from Michal Kubecek. 9) Fix bad VLAN check in rtl8366 DSA driver, from Linus Walleij. 10) Fix sleep while atomic in sja1105, from Vladimir Oltean. 11) Suspend/resume deadlock in stmmac, from Thierry Reding. 12) Various UDP GSO fixes from Josh Hunt. 13) Fix slab out of bounds access in tcp_zerocopy_receive(), from Eric Dumazet. 14) Fix OOPS in __ipv6_ifa_notify(), from David Ahern. 15) Memory leak in NFC's llcp_sock_bind, from Eric Dumazet. * git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (72 commits) selftests/net: add nettest to .gitignore net: qlogic: Fix memory leak in ql_alloc_large_buffers nfc: fix memory leak in llcp_sock_bind() sch_dsmark: fix potential NULL deref in dsmark_init() net: phy: at803x: use operating parameters from PHY-specific status net: phy: extract pause mode net: phy: extract link partner advertisement reading net: phy: fix write to mii-ctrl1000 register ipv6: Handle missing host route in __ipv6_ifa_notify net: phy: allow for reset line to be tied to a sleepy GPIO controller net: ipv4: avoid mixed n_redirects and rate_tokens usage r8152: Set macpassthru in reset_resume callback cxgb4:Fix out-of-bounds MSI-X info array access Revert "ipv6: Handle race in addrconf_dad_work" net: make sock_prot_memory_pressure() return "const char *" rxrpc: Fix rxrpc_recvmsg tracepoint qmi_wwan: add support for Cinterion CLS8 devices tcp: fix slab-out-of-bounds in tcp_zerocopy_receive() lib: textsearch: fix escapes in example code udp: only do GSO if # of segs > 1 ...
2019-10-05Merge tag 's390-5.4-3' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux Pull s390 fixes from Vasily Gorbik: - defconfig updates - Fix build errors with CC_OPTIMIZE_FOR_SIZE due to usage of "i" constraint for function arguments. Two kvm changes acked-by Christian Borntraeger. - Fix -Wunused-but-set-variable warnings in mm code. - Avoid a constant misuse in qdio. - Handle a case when cpumf is temporarily unavailable. * tag 's390-5.4-3' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux: KVM: s390: mark __insn32_query() as __always_inline KVM: s390: fix __insn32_query() inline assembly s390: update defconfigs s390/pci: mark function(s) __always_inline s390/mm: mark function(s) __always_inline s390/jump_label: mark function(s) __always_inline s390/cpu_mf: mark function(s) __always_inline s390/atomic,bitops: mark function(s) __always_inline s390/mm: fix -Wunused-but-set-variable warnings s390: mark __cpacf_query() as __always_inline s390/qdio: clarify size of the QIB parm area s390/cpumf: Fix indentation in sampling device driver s390/cpumsf: Check for CPU Measurement sampling s390/cpumf: Use consistant debug print format
2019-10-05KVM: s390: mark __insn32_query() as __always_inlineHeiko Carstens
__insn32_query() will not compile if the compiler decides to not inline it, since it contains an inline assembly with an "i" constraint with variable contents. Acked-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2019-10-05KVM: s390: fix __insn32_query() inline assemblyHeiko Carstens
The inline assembly constraints of __insn32_query() tell the compiler that only the first byte of "query" is being written to. Intended was probably that 32 bytes are written to. Fix and simplify the code and just use a "memory" clobber. Fixes: d668139718a9 ("KVM: s390: provide query function for instructions returning 32 byte") Cc: stable@vger.kernel.org # v5.2+ Acked-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2019-10-05dma-mapping: fix false positivse warnings in dma_common_free_remap()Andrey Smirnov
Commit 5cf4537975bb ("dma-mapping: introduce a dma_common_find_pages helper") changed invalid input check in dma_common_free_remap() from: if (!area || !area->flags != VM_DMA_COHERENT) to if (!area || !area->flags != VM_DMA_COHERENT || !area->pages) which seem to produce false positives for memory obtained via dma_common_contiguous_remap() This triggers the following warning message when doing "reboot" on ZII VF610 Dev Board Rev B: WARNING: CPU: 0 PID: 1 at kernel/dma/remap.c:112 dma_common_free_remap+0x88/0x8c trying to free invalid coherent area: 9ef82980 Modules linked in: CPU: 0 PID: 1 Comm: systemd-shutdow Not tainted 5.3.0-rc6-next-20190820 #119 Hardware name: Freescale Vybrid VF5xx/VF6xx (Device Tree) Backtrace: [<8010d1ec>] (dump_backtrace) from [<8010d588>] (show_stack+0x20/0x24) r7:8015ed78 r6:00000009 r5:00000000 r4:9f4d9b14 [<8010d568>] (show_stack) from [<8077e3f0>] (dump_stack+0x24/0x28) [<8077e3cc>] (dump_stack) from [<801197a0>] (__warn.part.3+0xcc/0xe4) [<801196d4>] (__warn.part.3) from [<80119830>] (warn_slowpath_fmt+0x78/0x94) r6:00000070 r5:808e540c r4:81c03048 [<801197bc>] (warn_slowpath_fmt) from [<8015ed78>] (dma_common_free_remap+0x88/0x8c) r3:9ef82980 r2:808e53e0 r7:00001000 r6:a0b1e000 r5:a0b1e000 r4:00001000 [<8015ecf0>] (dma_common_free_remap) from [<8010fa9c>] (remap_allocator_free+0x60/0x68) r5:81c03048 r4:9f4d9b78 [<8010fa3c>] (remap_allocator_free) from [<801100d0>] (__arm_dma_free.constprop.3+0xf8/0x148) r5:81c03048 r4:9ef82900 [<8010ffd8>] (__arm_dma_free.constprop.3) from [<80110144>] (arm_dma_free+0x24/0x2c) r5:9f563410 r4:80110120 [<80110120>] (arm_dma_free) from [<8015d80c>] (dma_free_attrs+0xa0/0xdc) [<8015d76c>] (dma_free_attrs) from [<8020f3e4>] (dma_pool_destroy+0xc0/0x154) r8:9efa8860 r7:808f02f0 r6:808f02d0 r5:9ef82880 r4:9ef82780 [<8020f324>] (dma_pool_destroy) from [<805525d0>] (ehci_mem_cleanup+0x6c/0x150) r7:9f563410 r6:9efa8810 r5:00000000 r4:9efd0148 [<80552564>] (ehci_mem_cleanup) from [<80558e0c>] (ehci_stop+0xac/0xc0) r5:9efd0148 r4:9efd0000 [<80558d60>] (ehci_stop) from [<8053c4bc>] (usb_remove_hcd+0xf4/0x1b0) r7:9f563410 r6:9efd0074 r5:81c03048 r4:9efd0000 [<8053c3c8>] (usb_remove_hcd) from [<8056361c>] (host_stop+0x48/0xb8) r7:9f563410 r6:9efd0000 r5:9f5f4040 r4:9f5f5040 [<805635d4>] (host_stop) from [<80563d0c>] (ci_hdrc_host_destroy+0x34/0x38) r7:9f563410 r6:9f5f5040 r5:9efa8800 r4:9f5f4040 [<80563cd8>] (ci_hdrc_host_destroy) from [<8055ef18>] (ci_hdrc_remove+0x50/0x10c) [<8055eec8>] (ci_hdrc_remove) from [<804a2ed8>] (platform_drv_remove+0x34/0x4c) r7:9f563410 r6:81c4f99c r5:9efa8810 r4:9efa8810 [<804a2ea4>] (platform_drv_remove) from [<804a18a8>] (device_release_driver_internal+0xec/0x19c) r5:00000000 r4:9efa8810 [<804a17bc>] (device_release_driver_internal) from [<804a1978>] (device_release_driver+0x20/0x24) r7:9f563410 r6:81c41ed0 r5:9efa8810 r4:9f4a1dac [<804a1958>] (device_release_driver) from [<804a01b8>] (bus_remove_device+0xdc/0x108) [<804a00dc>] (bus_remove_device) from [<8049c204>] (device_del+0x150/0x36c) r7:9f563410 r6:81c03048 r5:9efa8854 r4:9efa8810 [<8049c0b4>] (device_del) from [<804a3368>] (platform_device_del.part.2+0x20/0x84) r10:9f563414 r9:809177e0 r8:81cb07dc r7:81c78320 r6:9f563454 r5:9efa8800 r4:9efa8800 [<804a3348>] (platform_device_del.part.2) from [<804a3420>] (platform_device_unregister+0x28/0x34) r5:9f563400 r4:9efa8800 [<804a33f8>] (platform_device_unregister) from [<8055dce0>] (ci_hdrc_remove_device+0x1c/0x30) r5:9f563400 r4:00000001 [<8055dcc4>] (ci_hdrc_remove_device) from [<805652ac>] (ci_hdrc_imx_remove+0x38/0x118) r7:81c78320 r6:9f563454 r5:9f563410 r4:9f541010 [<8056538c>] (ci_hdrc_imx_shutdown) from [<804a2970>] (platform_drv_shutdown+0x2c/0x30) [<804a2944>] (platform_drv_shutdown) from [<8049e4fc>] (device_shutdown+0x158/0x1f0) [<8049e3a4>] (device_shutdown) from [<8013ac80>] (kernel_restart_prepare+0x44/0x48) r10:00000058 r9:9f4d8000 r8:fee1dead r7:379ce700 r6:81c0b280 r5:81c03048 r4:00000000 [<8013ac3c>] (kernel_restart_prepare) from [<8013ad14>] (kernel_restart+0x1c/0x60) [<8013acf8>] (kernel_restart) from [<8013af84>] (__do_sys_reboot+0xe0/0x1d8) r5:81c03048 r4:00000000 [<8013aea4>] (__do_sys_reboot) from [<8013b0ec>] (sys_reboot+0x18/0x1c) r8:80101204 r7:00000058 r6:00000000 r5:00000000 r4:00000000 [<8013b0d4>] (sys_reboot) from [<80101000>] (ret_fast_syscall+0x0/0x54) Exception stack(0x9f4d9fa8 to 0x9f4d9ff0) 9fa0: 00000000 00000000 fee1dead 28121969 01234567 379ce700 9fc0: 00000000 00000000 00000000 00000058 00000000 00000000 00000000 00016d04 9fe0: 00028e0c 7ec87c64 000135ec 76c1f410 Restore original invalid input check in dma_common_free_remap() to avoid this problem. Fixes: 5cf4537975bb ("dma-mapping: introduce a dma_common_find_pages helper") Signed-off-by: Andrey Smirnov <andrew.smirnov@gmail.com> [hch: just revert the offending hunk instead of creating a new helper] Signed-off-by: Christoph Hellwig <hch@lst.de>
2019-10-05kheaders: make headers archive reproducibleDmitry Goldin
In commit 43d8ce9d65a5 ("Provide in-kernel headers to make extending kernel easier") a new mechanism was introduced, for kernels >=5.2, which embeds the kernel headers in the kernel image or a module and exposes them in procfs for use by userland tools. The archive containing the header files has nondeterminism caused by header files metadata. This patch normalizes the metadata and utilizes KBUILD_BUILD_TIMESTAMP if provided and otherwise falls back to the default behaviour. In commit f7b101d33046 ("kheaders: Move from proc to sysfs") it was modified to use sysfs and the script for generation of the archive was renamed to what is being patched. Signed-off-by: Dmitry Goldin <dgoldin+lkml@protonmail.ch> Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Reviewed-by: Joel Fernandes (Google) <joel@joelfernandes.org> Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
2019-10-05kbuild: update compile-test header list for v5.4-rc2Masahiro Yamada
Commit 6dc280ebeed2 ("coda: remove uapi/linux/coda_psdev.h") removed a header in question. Some more build errors were fixed. Add more headers into the test coverage. Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
2019-10-05kbuild: two minor updates for Documentation/kbuild/modules.rstMasahiro Yamada
Capitalize the first word in the sentence. Use obj-m instead of obj-y. obj-y still works, but we have no built-in objects in external module builds. So, obj-m is better IMHO. Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
2019-10-05scripts/setlocalversion: clear local variable to make it work for shMasahiro Yamada
Geert Uytterhoeven reports a strange side-effect of commit 858805b336be ("kbuild: add $(BASH) to run scripts with bash-extension"), which inserts the contents of a localversion file in the build directory twice. [Steps to Reproduce] $ echo bar > localversion $ mkdir build $ cd build/ $ echo foo > localversion $ make -s -f ../Makefile defconfig include/config/kernel.release $ cat include/config/kernel.release 5.4.0-rc1foofoobar This comes down to the behavior change of local variables. The 'man sh' on my Ubuntu machine, where sh is an alias to dash, explains as follows: When a variable is made local, it inherits the initial value and exported and readonly flags from the variable with the same name in the surrounding scope, if there is one. Otherwise, the variable is initially unset. [Test Code] foo () { local res echo "res: $res" } res=1 foo [Result] $ sh test.sh res: 1 $ bash test.sh res: So, scripts/setlocalversion correctly works only for bash in spite of its hashbang being #!/bin/sh. Nobody had noticed it before because CONFIG_SHELL was previously set to bash almost all the time. Now that CONFIG_SHELL is set to sh, we must write portable and correct code. I gave the Fixes tag to the commit that uncovered the issue. Clear the variable 'res' in collect_files() to make it work for sh (and it also works on distributions where sh is an alias to bash). Fixes: 858805b336be ("kbuild: add $(BASH) to run scripts with bash-extension") Reported-by: Geert Uytterhoeven <geert+renesas@glider.be> Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com> Tested-by: Geert Uytterhoeven <geert+renesas@glider.be>
2019-10-05namespace: fix namespace.pl script to support relative pathsJacob Keller
The namespace.pl script does not work properly if objtree is not set to an absolute path. The do_nm function is run from within the find function, which changes directories. Because of this, appending objtree, $File::Find::dir, and $source, will return a path which is not valid from the current directory. This used to work when objtree was set to an absolute path when using "make namespacecheck". It appears to have not worked when calling ./scripts/namespace.pl directly. This behavior was changed in 7e1c04779efd ("kbuild: Use relative path for $(objtree)", 2014-05-14) Rather than fixing the Makefile to set objtree to an absolute path, just fix namespace.pl to work when srctree and objtree are relative. Also fix the script to use an absolute path for these by default. Use the File::Spec module for this purpose. It's been part of perl 5 since 5.005. The curdir() function is used to get the current directory when the objtree and srctree aren't set in the environment. rel2abs() is used to convert possibly relative objtree and srctree environment variables to absolute paths. Finally, the catfile() function is used instead of string appending paths together, since this is more robust when joining paths together. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Acked-by: Randy Dunlap <rdunlap@infradead.org> Tested-by: Randy Dunlap <rdunlap@infradead.org> Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
2019-10-05video/logo: do not generate unneeded logo C filesMasahiro Yamada
Currently, all the logo C files are generated irrespective of the CONFIG options. Adding them to extra-y is wrong. What we need to do here is to add them to 'targets' so that if_changed works properly. Files listed in 'targets' are cleaned, so clean-files is unneeded. Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
2019-10-05video/logo: remove unneeded *.o pattern from clean-filesMasahiro Yamada
The pattern *.o is cleaned up globally by the top Makefile. Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
2019-10-05integrity: remove pointless subdir-$(CONFIG_...)Masahiro Yamada
The ima/ and evm/ sub-directories contain built-in objects, so obj-$(CONFIG_...) is the correct way to descend into them. subdir-$(CONFIG_...) is redundant. Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
2019-10-05integrity: remove unneeded, broken attempt to add -fshort-wcharMasahiro Yamada
I guess commit 15ea0e1e3e18 ("efi: Import certificates from UEFI Secure Boot") attempted to add -fshort-wchar for building load_uefi.o, but it has never worked as intended. load_uefi.o is created in the platform_certs/ sub-directory. If you really want to add -fshort-wchar, the correct code is: $(obj)/platform_certs/load_uefi.o: KBUILD_CFLAGS += -fshort-wchar But, you do not need to fix it. Commit 8c97023cf051 ("Kbuild: use -fshort-wchar globally") had already added -fshort-wchar globally. This code was unneeded in the first place. Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
2019-10-04selftests/net: add nettest to .gitignoreJakub Kicinski
nettest is missing from gitignore. Fixes: acda655fefae ("selftests: Add nettest") Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-10-04net: qlogic: Fix memory leak in ql_alloc_large_buffersNavid Emamdoost
In ql_alloc_large_buffers, a new skb is allocated via netdev_alloc_skb. This skb should be released if pci_dma_mapping_error fails. Fixes: 0f8ab89e825f ("qla3xxx: Check return code from pci_map_single() in ql_release_to_lrg_buf_free_list(), ql_populate_free_queue(), ql_alloc_large_buffers(), and ql3xxx_send()") Signed-off-by: Navid Emamdoost <navid.emamdoost@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-10-04nfc: fix memory leak in llcp_sock_bind()Eric Dumazet
sysbot reported a memory leak after a bind() has failed. While we are at it, abort the operation if kmemdup() has failed. BUG: memory leak unreferenced object 0xffff888105d83ec0 (size 32): comm "syz-executor067", pid 7207, jiffies 4294956228 (age 19.430s) hex dump (first 32 bytes): 00 69 6c 65 20 72 65 61 64 00 6e 65 74 3a 5b 34 .ile read.net:[4 30 32 36 35 33 33 30 39 37 5d 00 00 00 00 00 00 026533097]...... backtrace: [<0000000036bac473>] kmemleak_alloc_recursive /./include/linux/kmemleak.h:43 [inline] [<0000000036bac473>] slab_post_alloc_hook /mm/slab.h:522 [inline] [<0000000036bac473>] slab_alloc /mm/slab.c:3319 [inline] [<0000000036bac473>] __do_kmalloc /mm/slab.c:3653 [inline] [<0000000036bac473>] __kmalloc_track_caller+0x169/0x2d0 /mm/slab.c:3670 [<000000000cd39d07>] kmemdup+0x27/0x60 /mm/util.c:120 [<000000008e57e5fc>] kmemdup /./include/linux/string.h:432 [inline] [<000000008e57e5fc>] llcp_sock_bind+0x1b3/0x230 /net/nfc/llcp_sock.c:107 [<000000009cb0b5d3>] __sys_bind+0x11c/0x140 /net/socket.c:1647 [<00000000492c3bbc>] __do_sys_bind /net/socket.c:1658 [inline] [<00000000492c3bbc>] __se_sys_bind /net/socket.c:1656 [inline] [<00000000492c3bbc>] __x64_sys_bind+0x1e/0x30 /net/socket.c:1656 [<0000000008704b2a>] do_syscall_64+0x76/0x1a0 /arch/x86/entry/common.c:296 [<000000009f4c57a4>] entry_SYSCALL_64_after_hwframe+0x44/0xa9 Fixes: 30cc4587659e ("NFC: Move LLCP code to the NFC top level diirectory") Signed-off-by: Eric Dumazet <edumazet@google.com> Reported-by: syzbot <syzkaller@googlegroups.com> Signed-off-by: David S. Miller <davem@davemloft.net>