Age  Commit message  Author
2023-08-19  t_ofd_locks: fix stalled semaphore handling  Stas Sergeev
Currently IPC_RMID was attempted on a semid returned after a failed semget() with flags=IPC_CREAT|IPC_EXCL, so nothing was actually removed. This patch introduces a much more reliable scheme where the wrapper script creates and removes semaphores, passing a sem key to the test binary via the new -K option. This patch speeds up the test ~5 times by removing the sem-awaiting loop in the lock-getter process. As the semaphore is now created before the test process starts, there is no need to wait for anything. CC: fstests@vger.kernel.org CC: Murphy Zhou <xzhou@redhat.com> CC: Jeff Layton <jlayton@kernel.org> CC: Zorro Lang <zlang@redhat.com> Signed-off-by: Stas Sergeev <stsp2@yandex.ru> Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Zorro Lang <zlang@kernel.org>
2023-08-19  btrfs/213: fix failure due to misspelled function name  Filipe Manana
The test is calling _not_run but it should be _notrun, so fix that. Signed-off-by: Filipe Manana <fdmanana@suse.com> Reviewed-by: Zorro Lang <zlang@redhat.com> Signed-off-by: Zorro Lang <zlang@kernel.org>
2023-08-05  xfs: skip fragmentation tests when alwayscow mode is enabled, part 2  (tag: v2023.08.06)  Darrick J. Wong
If the always_cow debugging flag is enabled, all file writes turn into copy writes. This dramatically ramps up fragmentation in the filesystem (intentionally!) so there's no point in complaining about fragmentation. I missed these two in the original commit because readahead for md5sum would create large folios at the start of the file. This resulted in the fdatasync after the random writes issuing writeback for the whole large folio, which reduced file fragmentation to the point where this test started passing. With Ritesh's patchset implementing sub-folio dirty tracking, this test goes back to failing due to high fragmentation (as it did before large folios) so we need to mask these off too. Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Zorro Lang <zlang@redhat.com> Signed-off-by: Zorro Lang <zlang@kernel.org>
2023-08-05  generic/642: fix SOAK_DURATION usage in generic/642  Darrick J. Wong
Misspelled variable name. Yay bash. Fixes: 3e85dd4fe4 ("misc: add duration for long soak tests") Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Zorro Lang <zlang@redhat.com> Signed-off-by: Zorro Lang <zlang@kernel.org>
2023-08-05  fstests: add helper to canonicalize devices used to enable persistent disks  Luis Chamberlain
The filesystem configuration file does not allow you to use symlinks to real devices, because the existing sanity checks verify that the target end device matches the source. Device mapper links work, but symlinks to real drives do not.

Using a symlink is desirable if you want to enable persistent tests across reboots. For example, you may want to use /dev/disk/by-id/nvme-eui.* to ensure that the same drives are used even after a reboot. This is very useful if you are testing, for example, in a virtualized environment and are using PCIe passthrough with other qemu NVMe drives with one or many NVMe drives.

To enable support, just add a helper to canonicalize devices prior to running the tests.

This allows one test runner, kdevops, which I just extended with support for real NVMe drives, to use NVMe EUI symlinks, falling back to NVMe model + serial symlinks as not all NVMe drives support EUIs. The drives it uses for the filesystem configuration can optionally be given as NVMe EUI symlinks, so that the same drives are used across reboots.

For instance, this works today with real NVMe drives:

    mkfs.xfs -f /dev/nvme0n1
    mount /dev/nvme0n1 /mnt
    TEST_DIR=/mnt TEST_DEV=/dev/nvme0n1 FSTYP=xfs ./check generic/110
    FSTYP -- xfs (debug)
    PLATFORM -- Linux/x86_64 flax-mtr01 6.5.0-rc3-djwx #rc3 SMP PREEMPT_DYNAMIC Wed Jul 26 14:26:48 PDT 2023
    generic/110 2s
    Ran: generic/110
    Passed all 1 tests

But this does not:

    TEST_DIR=/mnt TEST_DEV=/dev/disk/by-id/nvme-eui.0035385411904c1e FSTYP=xfs ./check generic/110
    mount: /mnt: /dev/disk/by-id/nvme-eui.0035385411904c1e already mounted on /mnt.
    common/rc: retrying test device mount with external set
    mount: /mnt: /dev/disk/by-id/nvme-eui.0035385411904c1e already mounted on /mnt.
    common/rc: could not mount /dev/disk/by-id/nvme-eui.0035385411904c1e on /mnt

    umount /mnt
    TEST_DIR=/mnt TEST_DEV=/dev/disk/by-id/nvme-eui.0035385411904c1e FSTYP=xfs ./check generic/110
    TEST_DEV=/dev/disk/by-id/nvme-eui.0035385411904c1e is mounted but not on TEST_DIR=/mnt - aborting
    Already mounted result:
    /dev/disk/by-id/nvme-eui.0035385411904c1e /mnt

This patch fixes that, so the same real drives can be used for a test over and over across reboots.

Signed-off-by: Luis Chamberlain <mcgrof@kernel.org> Reviewed-by: Zorro Lang <zlang@redhat.com> Signed-off-by: Zorro Lang <zlang@kernel.org>
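The canonicalization step described above can be sketched roughly as follows. This is only an illustration, not the helper the commit adds: the function name canonicalize_dev is hypothetical, and it assumes GNU readlink with the -e option is available.

```bash
# Hypothetical helper (not the function added by the commit): resolve a
# symlinked device path to the node it points at. Paths that are not
# symlinks, or that dangle, are returned unchanged so that later sanity
# checks can still report them.
canonicalize_dev() {
    local dev="$1"
    if [ -n "$dev" ] && [ -L "$dev" ]; then
        # GNU readlink -e fails if any path component is missing.
        readlink -e "$dev" || echo "$dev"
    else
        echo "$dev"
    fi
}
```

A config loader could then run something like TEST_DEV=$(canonicalize_dev "$TEST_DEV") before the mount checks, so that a /dev/disk/by-id/ link and the device node it resolves to compare equal.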
2023-08-05  check: generate gcov code coverage reports at the end of each section  Darrick J. Wong
Support collecting kernel code coverage information as reported in debugfs. At the start of each section, we reset the gcov counters; during the section wrapup, we'll collect the kernel gcov data. If lcov is installed and the kernel source code is available, it will also generate a nice html report. If a CLI web browser is available, it will also format the html report into text for easy grepping. This requires the test runner to set REPORT_GCOV=1 explicitly and gcov to be enabled in the kernel. Cc: tytso@mit.edu Cc: kent.overstreet@linux.dev Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Zorro Lang <zlang@redhat.com> Signed-off-by: Zorro Lang <zlang@kernel.org>
2023-08-05  btrfs/276: make test accurate regarding number of expected extents  Filipe Manana
btrfs/276 creates a 16G file with compression enabled in order to quickly and efficiently create a file with many extents and have a fs tree with a height of 3 (root node at level 2), so that it can test that fiemap is correctly reporting extent sharedness when we have shared subtrees of the fs tree due to a snapshot.

Compression results in extents with a maximum size of 128K and the test is expecting only extents of 128K, which normally happens if the machine has a large amount of RAM and writeback is not triggered before the xfs_io command finishes. However, if writeback is triggered in the meanwhile, due to memory pressure for example, then we can end up with some extents that are smaller than 128K, therefore increasing the total number of extents in the test file and making the test fail. This seems to happen often on test machines with small amounts of RAM, such as 4G, as reported by Qu in the following thread:

https://lore.kernel.org/linux-btrfs/20230801065529.50122-1-wqu@suse.com/

So to address this, create a file with holes and direct IO to make sure we always get a specific number of extents in the test file. To speed up the test, create 2000 64K extents, with holes in between them, so that it works on a fs with any sector size, and then create a bunch of files with large xattrs to quickly bump the fs tree height to 3 for any node size (4K to 64K). This also guarantees that the file extent items are spread over multiple leaves, in order to exercise fiemap's correctness when reporting shared extents due to shared subtrees.

Reported-by: Qu Wenruo <wqu@suse.com> Signed-off-by: Filipe Manana <fdmanana@suse.com> Reviewed-by: Qu Wenruo <wqu@suse.com> Tested-by: Qu Wenruo <wqu@suse.com> Signed-off-by: Zorro Lang <zlang@kernel.org>
2023-08-05  fstests: add smoketest group  Zorro Lang
Darrick suggests that fstests can provide a simple smoketest, by running several generic filesystem smoke tests for five minutes apiece (SOAK_DURATION="5m"). Since there are only five smoke tests, this is effectively a 20min super-quick test. With gcov enabled, running these tests yields ~75% coverage for iomap and ~60% for xfs; or ~50% for ext4 and ~75% for ext4; and ~45% for btrfs. Coverage was ~65% for the pagecache. Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Zorro Lang <zlang@kernel.org>
2023-08-05  xfs/122: adjust test for flexarray conversions in 6.5  Darrick J. Wong
Adjust the output of this test to handle the flexarray declaration conversions in linux v6.5, commit a49bbce58ea9 ("xfs: convert flex-array declarations in xfs attr leaf blocks") Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Zorro Lang <zlang@redhat.com> Signed-off-by: Zorro Lang <zlang@kernel.org>
2023-08-05  generic: add a test for device removal without dirty data  Christoph Hellwig
Test the removal of the underlying device when the file system still does not have dirty data. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Zorro Lang <zlang@redhat.com> Signed-off-by: Zorro Lang <zlang@kernel.org>
2023-08-05  generic: add a test for device removal with dirty data  Christoph Hellwig
Test the removal of the underlying device when the file system still has dirty data. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Zorro Lang <zlang@redhat.com> Signed-off-by: Zorro Lang <zlang@kernel.org>
2023-08-05  btrfs: add a test case to make sure scrub can repair parity corruption  Qu Wenruo
There is a kernel regression caused by commit 75b470332965 ("btrfs: raid56: migrate recovery and scrub recovery path to use error_bitmap"), which leads to scrub not repairing corrupted parity stripes.

So here we add a test case to verify the P/Q stripe scrub behavior by:

- Create a RAID5 or RAID6 btrfs with the minimal number of devices
  This means 2 devices for RAID5, and 3 devices for RAID6. This makes the parity stripe a mirror of the only data stripe. And since we have control of the content of data stripes, the content of the P stripe is also fixed.

- Create a 64K file
  The file would cover one data stripe.

- Corrupt the P stripe

- Scrub the fs
  If scrub is working, the P stripe would be repaired. Unfortunately scrub can not report any P/Q corruption, limited by its reporting structure. So we can not use the return value of scrub to determine if we repaired anything.

- Verify the content of the P stripe

- Use "btrfs check --check-data-csum" to double check

By the above steps, we can verify if the P stripe is properly fixed.

Signed-off-by: Qu Wenruo <wqu@suse.com> Reviewed-by: Anand Jain <anand.jain@oracle.com> Signed-off-by: Zorro Lang <zlang@kernel.org>
2023-08-05  btrfs/294: reject zoned devices for now  Qu Wenruo
The test case itself is utilizing RAID5/6, which is not yet supported on zoned devices. In the future we would use the raid-stripe-tree (RST) feature, but for now just reject zoned devices completely. And since we're here, also update the _fixed_by_kernel_commit lines, as the proper fix is already merged upstream. Signed-off-by: Qu Wenruo <wqu@suse.com> Reviewed-by: Naohiro Aota <naohiro.aota@wdc.com> Reviewed-by: Zorro Lang <zlang@redhat.com> Signed-off-by: Zorro Lang <zlang@kernel.org>
2023-08-04  fstests: install soak_duration.awk  Theodore Ts'o
Commit 3e85dd4fe423 ("misc: add duration for long soak tests") added a helper executable, soak_duration.awk, which is used by the check script if SOAK_DURATION is set. This script translates a "human-friendly" time duration specifier, such as 4m or 2d, into an integer number of seconds. We need to make sure that this script is installed or the check script will bomb out if SOAK_DURATION is set (and if the fstests installation doesn't include a full set of fstests source, but just those files installed by "make install"). Fixes: 3e85dd4fe423 ("misc: add duration for long soak tests") Cc: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Theodore Ts'o <tytso@mit.edu> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Zorro Lang <zlang@redhat.com> Signed-off-by: Zorro Lang <zlang@kernel.org>
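To illustrate the kind of translation described above, here is a minimal shell sketch. It is not soak_duration.awk itself: the function name duration_to_seconds and the accepted suffixes (s, m, h, d, or none for seconds) are assumptions made for the example.

```bash
# Hypothetical stand-in for a duration parser: turn "4m" or "2d" into
# an integer number of seconds. Accepts an optional s/m/h/d suffix.
duration_to_seconds() {
    local spec="$1" unit num mult
    unit="${spec##*[0-9]}"      # whatever follows the last digit
    num="${spec%"$unit"}"       # the leading digits
    case "$num" in
        ''|*[!0-9]*) echo "invalid duration: $spec" >&2; return 1 ;;
    esac
    # Strip leading zeros so the shell does not read the number as octal.
    while [ "${#num}" -gt 1 ] && [ "${num#0}" != "$num" ]; do
        num="${num#0}"
    done
    case "$unit" in
        ''|s) mult=1 ;;
        m) mult=60 ;;
        h) mult=3600 ;;
        d) mult=86400 ;;
        *) echo "invalid duration: $spec" >&2; return 1 ;;
    esac
    echo $((num * mult))
}
```

With this sketch, duration_to_seconds 4m prints 240 and duration_to_seconds 2d prints 172800.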
2023-07-23  btrfs: add a test case to verify that per-fs features directory gets updated  (tag: v2023.07.23)  Qu Wenruo
Although btrfs has a per-fs feature directory, it's not properly refreshed after new features are enabled.

We had some attempts to do that properly, like commit 14e46e04958d ("btrfs: synchronize incompat feature bits with sysfs files"). But unfortunately that commit was later reverted, as some call sites are not safe for updating sysfs files.

Now we have a new commit b7625f461da6 ("btrfs: sysfs: update fs features directory asynchronously") to properly refresh that per-fs features directory. So it's time to add a test case for it.

The test case itself is pretty straightforward:

- Make a very basic 3 disks btrfs
  Only using the very basic profiles (DUP/SINGLE) so that even older mkfs.btrfs can support it.

- Make sure the per-fs features directory doesn't contain the "raid1c34" file

- Balance the metadata to RAID1C3 profile

- Verify the per-fs features directory contains the "raid1c34" feature file

Signed-off-by: Qu Wenruo <wqu@suse.com> Reviewed-by: Anand Jain <anand.jain@oracle.com> [ Update commit log. Remove commented code. Add _fixed_by_kernel_commit. Check mkfs status. Add sync. ] Signed-off-by: Anand Jain <anand.jain@oracle.com>
2023-07-23  btrfs: add a test case to check btrfs won't crash on certain corruption  Qu Wenruo
The test case would reproduce the situation by creating an empty fs with SINGLE metadata profile, then corrupting the tree root manually. Finally, try mounting the corrupted fs; the mount should fail, but the kernel should not crash. Signed-off-by: Qu Wenruo <wqu@suse.com> Reviewed-by: Anand Jain <anand.jain@oracle.com> [ Update commit log. Fix a line gt 80 chars. Use append to $seqres.full. Fix comment ] Signed-off-by: Anand Jain <anand.jain@oracle.com>
2023-07-23  btrfs: add a test case to verify the write behavior of large RAID5 data chunks  Qu Wenruo
There is a recent regression during the v6.4 merge window, where a u32 left shift overflow can cause problems with large data chunks (over 4G in size). This is especially nasty for RAID56, which can lead to ASSERT() during regular writes, or corrupt memory if CONFIG_BTRFS_ASSERT is not enabled.

This is the regression test case for it.

Unlike btrfs/292, btrfs doesn't support trim inside RAID56 chunks, thus the workflow is simplified:

- Create a RAID5 or RAID6 data chunk during mkfs

- Fill the fs with 5G data and sync
  For an unpatched kernel, the sync would crash the kernel.

- Make sure everything is fine

Signed-off-by: Qu Wenruo <wqu@suse.com> Reviewed-by: Anand Jain <anand.jain@oracle.com> Signed-off-by: Zorro Lang <zlang@kernel.org>
2023-07-23  generic/558: avoid forkbombs on filesystems with many free inodes  Darrick J. Wong
Mikulas reported that this test became a forkbomb on his system when he tested it with bcachefs. Unlike XFS and ext4, which have large inodes consuming hundreds of bytes, bcachefs has very tiny ones. Therefore, it reports a large number of free inodes on a freshly mounted 1GB fs (~15 million), which causes this test to try to create 15000 processes. There's really no reason to do that -- all this test wanted to do was to exhaust the number of inodes as quickly as possible using all available CPUs, and then it ran xfs_repair to try to reproduce a bug. Set the number of subshells to 4x the CPU count and spread the work among them instead of forking thousands of processes. Reported-by: Mikulas Patocka <mpatocka@redhat.com> Signed-off-by: Darrick J. Wong <djwong@kernel.org> Tested-by: Mikulas Patocka <mpatocka@redhat.com> Reviewed-by: Bill O'Donnell <bodonnel@redhat.com> Reviewed-by: Zorro Lang <zlang@redhat.com> Signed-off-by: Zorro Lang <zlang@kernel.org>
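The "4x the CPU count" scheme described above can be sketched as below. This is an illustration only, not the code in generic/558: the function names are hypothetical, and each worker merely reports its share instead of creating files.

```bash
# Hypothetical sketch: cap the number of worker subshells at 4x the CPU
# count and split the work evenly between them.
worker_count() {
    local cpus
    cpus=$(nproc 2>/dev/null) || cpus=1
    echo $((cpus * 4))
}

# Print how many of $3 items worker number $1 (0-based) out of $2 handles.
worker_share() {
    local idx="$1" workers="$2" items="$3" base extra
    base=$((items / workers))
    extra=$((items % workers))
    if [ "$idx" -lt "$extra" ]; then
        echo $((base + 1))   # the first "extra" workers take one more item
    else
        echo "$base"
    fi
}

# Start one background subshell per worker and wait for all of them.
spread_work() {
    local items="$1" workers i
    workers=$(worker_count)
    i=0
    while [ "$i" -lt "$workers" ]; do
        # A real test would create files here; this only reports the share.
        ( echo "worker $i: $(worker_share "$i" "$workers" "$items") items" ) &
        i=$((i + 1))
    done
    wait
}
```

The number of processes now depends only on the machine's CPU count, no matter how many free inodes the filesystem reports.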
2023-07-23  xfs: add a couple more tests for ascii-ci problems  Darrick J. Wong
Add some tests to make sure that userspace and the kernel actually agree on how to do ascii case-insensitive directory lookups, and that metadump can actually obfuscate such filesystems. Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Zorro Lang <zlang@redhat.com> Signed-off-by: Zorro Lang <zlang@kernel.org>
2023-07-23  common/rc: cleanup old .kmemleak files  Luís Henriques
I've spent a non-negligible amount of time looking into a kmemleak that didn't exist in the code I was testing because there was an old .kmemleak file in the results directory. I don't think this is an intended behaviour, so I'm proposing to remove these files every time we capture the result of a new scan. Signed-off-by: Luís Henriques <lhenriques@suse.de> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Zorro Lang <zlang@kernel.org>
2023-07-23  overlay: Add test coverage for fs-verity support  Alexander Larsson
This tests that the right xattrs are set during copy-up, and that we properly fail on missing or erroneous fs-verity digests when validating. We also ensure that verity=require fails if a metacopy has no fs-verity, and doesn't do a meta-copy-up if the base file lacks verity. Signed-off-by: Alexander Larsson <alexl@redhat.com> Reviewed-by: Amir Goldstein <amir73il@gmail.com> Reviewed-by: Zorro Lang <zlang@redhat.com> Signed-off-by: Zorro Lang <zlang@kernel.org>
2023-07-23  overlay: Add test for follow of lowerdata in data-only layers  Amir Goldstein
Add test coverage for following metacopy from lower layer to data-only lower layers. Data-only lower layers are configured using the syntax: lowerdir=<lowerdir1>:<lowerdir2>::<lowerdata1>::<lowerdata2>. Test that lowerdata files can be followed only by absolute redirect from lower layer. Test that with two lowerdata dirs, we can lookup individual lowerdata files in both, and that a shared file is resolved from the uppermost lowerdata dir. There is also test case for lazy-data lookups, where we remove the lowerdata file and validate that we get metadata from the metacopy file, but open fails. Signed-off-by: Alexander Larsson <alexl@redhat.com> Signed-off-by: Amir Goldstein <amir73il@gmail.com> Reviewed-by: Zorro Lang <zlang@redhat.com> Signed-off-by: Zorro Lang <zlang@kernel.org>
2023-07-23  overlay/060: add test cases of follow to lowerdata  Amir Goldstein
Add test coverage for following metacopy from lower layer to lower data with absolute, relative and no redirect. Signed-off-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Alexander Larsson <alexl@redhat.com> Reviewed-by: Zorro Lang <zlang@redhat.com> Signed-off-by: Zorro Lang <zlang@kernel.org>
2023-07-23  overlay: add helper for mounting rdonly overlay  Amir Goldstein
Allow passing empty upperdir to _overlay_mount_dirs(). Signed-off-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Alexander Larsson <alexl@redhat.com> Reviewed-by: Zorro Lang <zlang@redhat.com> Signed-off-by: Zorro Lang <zlang@kernel.org>
2023-07-09  report: remove xmlns specifier  (tag: v2023.07.09)  Theodore Ts'o
Specifying "xmlns=https://git.kernel.org/.../xfstests-dev.git" causes XML-compliant parsers, such as the one used by the python junitparser library, to put all of the XML elements into a namespace, which then causes junitparser to toss its cookies. This can be worked around in a test runner script via: sed -i.orig -e 's/xmlns=\".*\"//' "$RESULT_BASE/result.xml" but it's better not to include the xmlns line at all in the first place, since this may cause other users of fstests who are using the Python junitparser library a lot of headaches. Signed-off-by: Theodore Ts'o <tytso@mit.edu> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Zorro Lang <zlang@kernel.org>
2023-07-09  report: safely update the result.xml file  Theodore Ts'o
After every single test, we rewrite result.xml from scratch. This ensures that the XML file is always in a valid, parseable state, even if the check script is killed or the machine crashes in the middle of a test. If the test is being run in a Cloud VM as a "spot" (Amazon, Azure, or GCE) or "preemptible" (Oracle) instance, the VM can be halted whenever the Cloud provider needs the capacity for customers who are willing to pay full price. ("Spot" instances can be 60% to 90% cheaper --- allowing the frugal kernel developer to get up to 10 times more testing for the same amount of money. :-) Since a "spot" VM can get terminated at any time, it is possible for the check script to be killed while it is in the middle of rewriting the result.xml file. If the result.xml file is only partially written, information regarding the tests run before VM termination will be lost. To address this race, write the new result.xml file as result.xml.new, and only rename it to result.xml after the XML file is fully written out. Signed-off-by: Theodore Ts'o <tytso@mit.edu> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Zorro Lang <zlang@kernel.org>
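The write-then-rename pattern described above can be sketched like this. The function name write_atomically is hypothetical and the generator command is whatever produces the report; this is not the code in the fstests report module.

```bash
# Hypothetical sketch: write the output of a command to "$out.new" and
# only rename it over "$out" once it has been written completely.
# Usage: write_atomically <output-file> <command> [args...]
write_atomically() {
    local out="$1"
    shift
    local tmp="$out.new"
    if ! "$@" > "$tmp"; then
        rm -f "$tmp"    # leave the previous file untouched on failure
        return 1
    fi
    # Renaming within one directory replaces the old file in one step.
    mv -f "$tmp" "$out"
}
```

If the script is killed while the generator is still running, only result.xml.new is left incomplete and the previous result.xml remains parseable. This sketch does not sync the data to disk; whether that is needed depends on how much crash safety is wanted.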
2023-07-07  xfs: test growfs of the realtime device  Darrick J. Wong
Create a new test to make sure that growfs on the realtime device works without corrupting anything. Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Andrey Albershteyn <aalbersh@redhat.com> Signed-off-by: Zorro Lang <zlang@kernel.org>
2023-07-07  xfs/041: force create files on the data device  Darrick J. Wong
Since we're testing growfs of the data device, we should create the files there, even if the mkfs configuration enables rtinherit on the root dir. Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Andrey Albershteyn <aalbersh@redhat.com> Signed-off-by: Zorro Lang <zlang@kernel.org>
2023-07-07  xfs/439: amend test to work with new log geometry validation  Darrick J. Wong
An upcoming patch moves more log validation checks to the superblock verifier, so update this test as needed. Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Zorro Lang <zlang@redhat.com> Signed-off-by: Zorro Lang <zlang@kernel.org>
2023-07-07  xfs/569: skip post-test fsck run  Darrick J. Wong
This test examines the behavior of mkfs.xfs with specific filesystem configuration files by formatting the scratch device directly with those exact parameters. IOWs, it doesn't include external log devices or realtime devices. If external devices are set up, the post-test fsck run fails because the filesystem doesn't use the (allegedly) configured external devices. Fix that by adding _require_scratch_nocheck. Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Andrey Albershteyn <aalbersh@redhat.com> Signed-off-by: Zorro Lang <zlang@kernel.org>
2023-07-07  xfs/529: fix bogus failure when realtime is configured  Darrick J. Wong
If I have a realtime volume configured, this test will sometimes trip over this:

    XFS: Assertion failed: nmaps == 1, file: fs/xfs/xfs_dquot.c, line: 360
    Call Trace:
     xfs_dquot_disk_alloc+0x3dc/0x400 [xfs 97e1fa8953d397b1fb9732df4de7fa9070bda501]
     xfs_qm_dqread+0xc9/0x190 [xfs 97e1fa8953d397b1fb9732df4de7fa9070bda501]
     xfs_qm_dqget+0xa8/0x230 [xfs 97e1fa8953d397b1fb9732df4de7fa9070bda501]
     xfs_qm_vop_dqalloc+0x160/0x600 [xfs 97e1fa8953d397b1fb9732df4de7fa9070bda501]
     xfs_setattr_nonsize+0x318/0x520 [xfs 97e1fa8953d397b1fb9732df4de7fa9070bda501]
     notify_change+0x30e/0x490
     chown_common+0x13e/0x1f0
     do_fchownat+0x8d/0xe0
     __x64_sys_fchownat+0x1b/0x20
     do_syscall_64+0x2b/0x80
     entry_SYSCALL_64_after_hwframe+0x46/0xb0
    RIP: 0033:0x7fa6985e2cae

The test injects the bmap_alloc_minlen_extent error, which refuses to allocate file space unless it's exactly minlen long. However, a precondition of this injection point is that the free space on the data device must be sufficiently fragmented that there are small free extents.

However, if realtime and rtinherit are enabled, the punch-alternating call will operate on a realtime file, which only serves to write 0x55 patterns into the realtime bitmap. Hence the test preconditions are not satisfied, so the test is not serving its purpose. Fix it by disabling rtinherit=1 on the rootdir so that we actually fragment the bnobt/cntbt as required.

Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Andrey Albershteyn <aalbersh@redhat.com> Signed-off-by: Zorro Lang <zlang@kernel.org>
2023-07-07  common/btrfs: handle dmdust as mounted device in _btrfs_buffered_read_on_mirror()  Qu Wenruo
[BUG] After commit ab41f0bddb73 ("common/btrfs: use _scratch_cycle_mount to ensure all page caches are dropped"), the test case btrfs/143 can fail like below:

    btrfs/143 6s ... [failed, exit status 1]- output mismatch (see ~/xfstests/results//btrfs/143.out.bad)
        --- tests/btrfs/143.out 2020-06-10 19:29:03.818519162 +0100
        +++ ~/xfstests/results//btrfs/143.out.bad 2023-06-19 17:04:00.575033899 +0100
        @@ -1,37 +1,6 @@
         QA output created by 143
         wrote 131072/131072 bytes
         XXX Bytes, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
        -XXXXXXXX: aa aa aa aa aa aa aa aa aa aa aa aa aa aa aa aa ................
        -XXXXXXXX: aa aa aa aa aa aa aa aa aa aa aa aa aa aa aa aa ................
        -XXXXXXXX: aa aa aa aa aa aa aa aa aa aa aa aa aa aa aa aa ................
        -XXXXXXXX: aa aa aa aa aa aa aa aa aa aa aa aa aa aa aa aa ................

[CAUSE] Test case btrfs/143 uses a dm-dust device to emulate read errors, which means we can not use _scratch_cycle_mount to cycle mount $SCRATCH_MNT, as it would mount $SCRATCH_DEV, not the dm-dust device, at $SCRATCH_MNT. This prevents us from triggering read-repair (since no error would be hit), and thus fails the test.

[FIX] Since we can mount whatever device at $SCRATCH_MNT, we can not use _scratch_cycle_mount in this case. Instead, implement a small helper to grab the mounted device and its mount options, and use the same device and mount options to cycle the $SCRATCH_MNT mount. This would fix btrfs/143 and hopefully future test cases which use dm devices.

Reported-by: Filipe Manana <fdmanana@suse.com> Reviewed-by: Filipe Manana <fdmanana@suse.com> Reviewed-by: Zorro Lang <zlang@redhat.com> Signed-off-by: Qu Wenruo <wqu@suse.com> Signed-off-by: Zorro Lang <zlang@kernel.org>
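A helper of the kind described in [FIX] might look roughly like the sketch below. Both function names are hypothetical and this is not the code added to common/btrfs. It assumes a /proc/mounts-style table (device, mount point, type, options) and mount points without spaces.

```bash
# Hypothetical helper: print "<device> <options>" for whatever is
# currently mounted at mount point $1. The table defaults to
# /proc/mounts; the last matching entry wins because later mounts
# shadow earlier ones.
mounted_dev_and_opts() {
    local mnt="$1" table="${2:-/proc/mounts}"
    awk -v mnt="$mnt" '
        $2 == mnt { dev = $1; opts = $4 }
        END { if (dev == "") exit 1; print dev, opts }
    ' "$table"
}

# Hypothetical cycle mount that reuses the device actually mounted
# (for example a dm-dust target) rather than a configured default.
cycle_mount_same_dev() {
    local mnt="$1" info dev opts
    info=$(mounted_dev_and_opts "$mnt") || return 1
    dev="${info%% *}"
    opts="${info#* }"
    umount "$mnt" && mount -o "$opts" "$dev" "$mnt"
}
```

The key point is that the device is looked up from the mount table at the time of the cycle, so a test that mounted a device-mapper target keeps using it.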
2023-07-07  btrfs: add test case to verify the behavior with large RAID0 data chunks  Qu Wenruo
There is a recent regression during the v6.4 merge window, where a u32 left shift overflow can cause problems with large data chunks (over 4G in size).

This is the regression test case for it.

The test case itself would:

- Create a RAID0 chunk with a single 6G data chunk
  This requires at least 6 devices from SCRATCH_DEV_POOL, and each should be larger than 2G.

- Fill the fs with 5G data

- Make sure everything is fine
  Including the data csums.

- Delete half of the data

- Do a fstrim
  This would lead to out-of-boundary trim if the kernel is not patched.

- Make sure everything is fine again
  If not patched, we may have corrupted data due to the bad fstrim above.

For now, this test case only checks the behavior for RAID0. As for RAID10, we need 12 devices, which is out of reach for a lot of test environments. RAID56 gets a different test case, as it doesn't support discard inside the RAID56 chunks.

Signed-off-by: Qu Wenruo <wqu@suse.com> Reviewed-by: Anand Jain <anand.jain@oracle.com> Reviewed-by: Filipe Manana <fdmanana@suse.com> Signed-off-by: Zorro Lang <zlang@kernel.org>
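The arithmetic behind a u32 left shift overflow can be shown with a small shell sketch. It assumes, for illustration only, that a stripe number is converted to a byte offset by shifting left by 16 (a 64K stripe length); the commit message does not spell out the exact kernel expression, and the function names are made up.

```bash
# Emulate a 32-bit unsigned left shift. Shell arithmetic is wider than
# 32 bits, so the result is masked down to the low 32 bits.
u32_shl() {
    echo $(( ($1 << $2) & 0xFFFFFFFF ))
}

# The same shift performed in a wider type keeps the full value.
u64_shl() {
    echo $(( $1 << $2 ))
}
```

For stripe number 65536, which corresponds to an offset of exactly 4G with a 64K stripe, u64_shl 65536 16 gives 4294967296 while u32_shl 65536 16 wraps around to 0. That is why only chunks larger than 4G are affected.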
2023-07-07  btrfs: test activating swapfile in the presence of snapshots  Filipe Manana
Test that if we have a subvolume with a non-active swap file, we can not activate it if there are any snapshots. Also test that after all the snapshots are removed, we will be able to activate the swapfile. Signed-off-by: Filipe Manana <fdmanana@suse.com> Reviewed-by: Josef Bacik <josef@toxicpanda.com> Signed-off-by: Zorro Lang <zlang@kernel.org>
2023-07-07  generic/604: Fix for overlayfs  Amir Goldstein
Since v6.3, I noticed that generic/604 does not run on overlayfs because:

    generic/604 -- upper fs needs to support d_type

This is odd because the base fs I am using (xfs) does support d_type.

The reason is that for overlayfs, this sequence run by this test:

    _scratch_unmount &
    _scratch_mount

Translates to:

    umount $OVL_MNT; umount $BASE_MNT &
    mount $BASE_MNT ...; mount $OVL_MNT ...

Which can end up reordered as:

    umount $OVL_MNT;
    mount $BASE_MNT ...
    umount $BASE_MNT &
    mount $OVL_MNT ...

and overlayfs is trying to use a non-existing upper fs.

Use UMOUNT_PROG directly instead of the _scratch_unmount helper, to avoid unmounting the base fs.

Incidentally, the only thing that has changed in overlayfs in v6.3 is idmapped mounts support, and the test in question was run without idmapped mounts enabled, so the change in behavior must be related to some subtle timing change.

Signed-off-by: Amir Goldstein <amir73il@gmail.com> Reviewed-by: Christian Brauner <brauner@kernel.org> Signed-off-by: Zorro Lang <zlang@kernel.org>
2023-07-07  check: fix excluded tests are only expunged in the first iteration  Yuezhang Mo
If iterating more than once and excluding some tests, the excluded tests are expunged in the first iteration, but run in subsequent iterations. This is not expected. The problem was caused by the temporary file saving the excluded tests being deleted by `rm -f $tmp.*` in _wrapup() at the end of the first iteration. This commit saves the excluded tests into a variable instead of a temporary file. Signed-off-by: Yuezhang Mo <Yuezhang.Mo@foxmail.com> Reviewed-by: Zorro Lang <zlang@redhat.com> Signed-off-by: Zorro Lang <zlang@kernel.org>
2023-06-28  common/config: redirect modprobe helpinfo to stdout for busybox  Stas Sergeev
busybox's modprobe writes its help text to stderr. We need to redirect it to stdout, or it will end up in the test results. Signed-off-by: Stas Sergeev <stsp2@yandex.ru> Reviewed-by: Zorro Lang <zlang@redhat.com> Signed-off-by: Zorro Lang <zlang@kernel.org>
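The effect of the redirection can be sketched as follows. fake_modprobe is a made-up stand-in that prints an invented usage line to stderr, and help_mentions is a hypothetical helper; neither is taken from common/config.

```bash
# Stand-in for a tool that prints its help text to stderr. The usage
# line below is invented for this example.
fake_modprobe() {
    echo "Usage: fake_modprobe [OPTIONS] MODULE" >&2
    return 1
}

# Succeed if the help text of command $1 mentions string $2, no matter
# which stream the tool writes it to.
help_mentions() {
    local cmd="$1" needle="$2"
    # Without 2>&1 the text would bypass grep and leak into the output.
    "$cmd" --help 2>&1 | grep -q -- "$needle"
}
```

Merging stderr into the pipe means the feature probe sees the help text and nothing stray is left to land in a test's output.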
2023-06-23  fstests: reduce runtime of check -n  Amir Goldstein
kvm-xfstests invokes check -n twice to pre-process and generate the tests-to-run list, which is then passed as a long list of tests when invoking check on the command line.

check invokes dirname, basename and sed several times per test just for doing basic string prefix/suffix trimming. Use bash string pattern matching instead, which is much faster.

Note that the following pattern matching expression change:

    < test_dir=${test_dir#$SRC_DIR/*}
    > t=${t#$SRC_DIR/}

does not change the meaning of the expression, because the shortest match of "$SRC_DIR/*" that is being trimmed is "$SRC_DIR/" and removing the tests/ prefix is what this code intended to do.

With check -n, there is no need to clean up the results dir, but check -n is doing that for every single listed test. Move the cleanup of the results dir to just before actually running the test.

These improvements to the check pre-test code cut down several minutes from the time until tests actually start to run with kvm-xfstests.

Signed-off-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Zorro Lang <zlang@kernel.org>
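The kind of trimming described above can be done entirely with shell parameter expansion. The sketch below is an illustration, not the code in check; the function name split_test_path is hypothetical.

```bash
# Split a test path such as "tests/generic/110" into its directory and
# test name without forking dirname, basename or sed. src_dir is the
# tests root to strip (SRC_DIR in the commit message above).
split_test_path() {
    local src_dir="$1" t="$2" rel
    rel="${t#"$src_dir"/}"        # drop the leading "tests/"
    echo "${rel%/*} ${rel##*/}"   # directory part, then test name
}
```

Here "#" removes the shortest matching prefix, "%" the shortest matching suffix and "##" the longest matching prefix, so split_test_path tests tests/generic/110 prints "generic 110". Because "#" takes the shortest match, a trailing "*" in the pattern matches nothing, which is why the two expressions quoted in the commit message are equivalent.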
2023-06-18  common/rc: Enable _test_mkfs to force a mkfs on a xfs filesystem  (tag: v2023.06.18)  Carlos Maiolino
Calling _test_mkfs on an already initialized xfs FS will fail, as the initialization is not enforced by the '-f' argument unless it's included in MKFS_OPTIONS. So adding 'RECREATE_TEST_DEV=true' to the config file ends up being useless for xfs filesystems. Adding an xfs-specific case to _test_mkfs that passes the -f argument makes RECREATE_TEST_DEV actually useful. Signed-off-by: Carlos Maiolino <cmaiolino@redhat.com> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Zorro Lang <zlang@kernel.org>
2023-06-18  common/rc: skip ceph-fuse when atime is required  Xiubo Li
Ceph won't maintain the atime, so just skip the tests when the atime is required. Fixes: https://tracker.ceph.com/issues/61551 Signed-off-by: Xiubo Li <xiubli@redhat.com> Reviewed-by: Zorro Lang <zlang@redhat.com> Signed-off-by: Zorro Lang <zlang@kernel.org>
2023-06-18  generic/020: add ceph-fuse support  Xiubo Li
For ceph fuse client the fs type will be "ceph-fuse". Fixes: https://tracker.ceph.com/issues/61496 Signed-off-by: Xiubo Li <xiubli@redhat.com> Reviewed-by: Zorro Lang <zlang@redhat.com> Signed-off-by: Zorro Lang <zlang@kernel.org>
2023-06-18  generic/506: fix to call _scratch_enable_pquota()  Chao Yu
Otherwise, this testcase will fail because the project quota feature is not enabled in the image. Signed-off-by: Chao Yu <chao@kernel.org> Reviewed-by: Zorro Lang <zlang@redhat.com> Signed-off-by: Zorro Lang <zlang@kernel.org>
2023-06-18common/btrfs: use _scratch_cycle_mount to ensure all page caches are droppedQu Wenruo
[BUG]
There is a chance that btrfs/266 fails on aarch64 with 64K page size, regardless of whether the sector size is 4K or 64K. The failure indicates that one or more mirrors are not properly fixed.

[CAUSE]
I have added some trace events into the btrfs IO paths, including __btrfs_submit_bio() and __end_bio_extent_readpage(). When the test failed, the trace looks like this:

 112819.764977: __btrfs_submit_bio: r/i=5/257 fileoff=0 mirror=1 len=196608 pid=33663
 ^^^ Initial read on the full 192K file
 112819.766604: __btrfs_submit_bio: r/i=5/257 fileoff=0 mirror=2 len=65536 pid=21806
 ^^^ Repair on the first 64K block, which would succeed
 112819.766760: __btrfs_submit_bio: r/i=5/257 fileoff=65536 mirror=2 len=65536 pid=21806
 ^^^ Repair on the second 64K block, which would fail
 112819.767625: __btrfs_submit_bio: r/i=5/257 fileoff=65536 mirror=3 len=65536 pid=21806
 ^^^ Repair on the third 64K block, which would succeed
 112819.770999: end_bio_extent_readpage: read finished, r/i=5/257 fileoff=0 len=196608 mirror=1 status=0
 ^^^ The repair succeeded, the full 192K read finished.

 112819.864851: __btrfs_submit_bio: r/i=5/257 fileoff=0 mirror=3 len=196608 pid=33665
 112819.874854: __btrfs_submit_bio: r/i=5/257 fileoff=0 mirror=1 len=65536 pid=31369
 112819.875029: __btrfs_submit_bio: r/i=5/257 fileoff=131072 mirror=1 len=65536 pid=31369
 112819.876926: end_bio_extent_readpage: read finished, r/i=5/257 fileoff=0 len=196608 mirror=3 status=0

But the reads above only happen for mirror 1 and mirror 3; mirror 2 is not involved. This means that the read on mirror 2 somehow didn't happen, most likely due to something going wrong during the drop_caches call. It may be an async operation or some hardware specific behavior.

On the other hand, the test case doing the same operation with direct IO, btrfs/267, never fails, as it never populates the page cache and thus always reads from the disk.

[WORKAROUND]
The root cause is in the "echo 3 > drop_caches", which I'm still debugging.

But at the same time, we can work around the problem by forcing a cycle mount of the scratch device inside _btrfs_buffered_read_on_mirror(). This ensures all page caches are dropped no matter whether drop_caches is working correctly.

With this patch, I no longer hit the failure on aarch64 with 64K page size, while before it the test case would always fail during a -I 10 run.

[zlang: remove the duplicated drop_caches line]

Signed-off-by: Qu Wenruo <wqu@suse.com>
Reviewed-by: Christoph Hellwig <hch@infradead.org>
Reviewed-by: Zorro Lang <zlang@redhat.com>
Signed-off-by: Zorro Lang <zlang@kernel.org>
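For readers unfamiliar with a cycle mount, the sketch below shows the general shape: unmount the filesystem, then mount it again, so that no cached pages for it survive. The function name cycle_mount, the RUN hook, and the device and mount point are all hypothetical; this is not the fstests helper itself.

```shell
# RUN is a hypothetical dry-run hook. It defaults to echo so that the
# sketch only prints the commands; mounting for real needs root and an
# actual device (set RUN to an empty string to execute them).
RUN=${RUN-echo}

# Hypothetical stand-in for a cycle mount: unmount, then mount again.
cycle_mount() {
    dev=$1
    mnt=$2
    $RUN umount "$mnt" && $RUN mount "$dev" "$mnt"
}

# Placeholder device and mount point.
cycle_mount /dev/sdX /mnt/scratch
```

Unlike writing to drop_caches, the unmount cannot leave pages of that filesystem behind, which is why the workaround does not depend on drop_caches behaving correctly.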
2023-06-18btrfs/106: avoid hard coded output to handle different page sizesQu Wenruo
[BUG]
Test case btrfs/106 is known to fail if the system has a page size other than 4K. This test case can fail like this:

 btrfs/106 5s ... - output mismatch (see ~/xfstests-dev/results//btrfs/106.out.bad)
    --- tests/btrfs/106.out 2022-11-24 19:53:53.140469437 +0800
    +++ ~/xfstests-dev/results//btrfs/106.out.bad 2023-06-02 16:12:57.014273249 +0800
    @@ -5,19 +5,19 @@
     File contents before unmount:
     0 aa aa aa aa aa aa aa aa aa aa aa aa aa aa aa aa
     *
    -40
    +1000
     File contents after remount:
     0 aa aa aa aa aa aa aa aa aa aa aa aa aa aa aa aa
    ...
    (Run 'diff -u ~/xfstests-dev/tests/btrfs/106.out /home/adam/xfstests-dev/results//btrfs/106.out.bad' to see the entire diff)

This is particularly problematic for systems like Aarch64 or PPC64, which support 64K page size.

[CAUSE]
The test case uses the page size to calculate the amount of data to write, so its golden output only matches when the page size is 4K.

[FIX]
Instead of relying on the golden output, compute md5sums and compare them before and after the remount. The new md5sums only go into $seqres.full for debugging, not into the golden output, to avoid false alerts on different page sizes.

Signed-off-by: Qu Wenruo <wqu@suse.com>
Reviewed-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: Zorro Lang <zlang@kernel.org>
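The comparison described under [FIX] can be sketched as follows. This is an illustration under stated assumptions: a temporary file stands in for the file under test, and the remount step is omitted because it needs a real filesystem.

```shell
# Temporary file standing in for the file under test (hypothetical).
f=$(mktemp)
printf 'aaaaaaaa' > "$f"

# Checksum before the (omitted) unmount/mount cycle.
before=$(md5sum "$f" | cut -d' ' -f1)

# ... the real test would remount the filesystem here ...

# Checksum after the cycle.
after=$(md5sum "$f" | cut -d' ' -f1)

# Print only a verdict that does not depend on the amount of data written;
# the raw checksums would go to a debug log, not to the compared output.
if [ "$before" = "$after" ]; then
    echo "checksums match"
else
    echo "checksum mismatch: $before vs $after"
fi
rm -f "$f"
```

Because only the verdict is printed, the expected output stays the same whatever amount of data the page size leads the test to write.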
2023-06-18btrfs/122: fix nodesize option in mkfs.btrfsAnand Jain
btrfs/122 is failing on a system with 64k page size:

 QA output created by 122
 +ERROR: illegal nodesize 16384 (smaller than 65536)
 +mount: /mnt/scratch: wrong fs type, bad option, bad superblock on /dev/vdb2, missing codepage or helper program, or other error.
 +mount /dev/vdb2 /mnt/scratch failed
 +(see /xfstests-dev/results//btrfs/122.full for details)

mkfs.btrfs sets the default node size to 16K when the sector size is less than 16K, and it matches the sector size when it's greater than 16K. So, there's no need to explicitly set it.

Signed-off-by: Anand Jain <anand.jain@oracle.com>
Reviewed-by: Zorro Lang <zlang@redhat.com>
Tested-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com>
Signed-off-by: Zorro Lang <zlang@kernel.org>
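The default described in this message amounts to a simple rule, restated below as a sketch. The helper default_nodesize is hypothetical and does not call mkfs.btrfs; it only encodes the behaviour as the commit message states it.

```shell
# Hypothetical helper: node size that would be picked by default for a
# given sector size, following the rule stated in the commit message.
default_nodesize() {
    sectorsize=$1
    if [ "$sectorsize" -lt 16384 ]; then
        echo 16384
    else
        echo "$sectorsize"
    fi
}

default_nodesize 4096
default_nodesize 65536
# prints 16384, then 65536
```

A hard-coded 16384 is therefore only valid while the sector size does not exceed 16K, which matches the "illegal nodesize 16384 (smaller than 65536)" error shown above.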
2023-06-10common/xfs: compress online repair rebuild output by defaultDarrick J. Wong
Force-repairing the filesystem after a test can fill up /tmp with quite a lot of log messages. We don't have a better place to stash that output in case the scrub fails and we need to analyze it later, so compress it with gzip and only decompress it when it is needed. Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Andrey Albershteyn <aalbersh@redhat.com> Reviewed-by: Zorro Lang <zlang@redhat.com> Signed-off-by: Zorro Lang <zlang@kernel.org>
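A minimal sketch of the compress-now, decompress-later pattern. The noisy_repair function and the log path are hypothetical stand-ins for the real repair tool and its output file.

```shell
# Hypothetical noisy command standing in for the repair tool.
noisy_repair() {
    echo "rebuilding metadata"
    echo "warning: sample message" >&2
}

log=$(mktemp)

# Capture stdout and stderr, compressing on the fly.
noisy_repair 2>&1 | gzip > "$log.gz"

# Later, only if the output has to be analyzed:
gzip -dc "$log.gz"

rm -f "$log" "$log.gz"
```

Piping straight into gzip means the uncompressed log never touches the disk, so the space taken in /tmp is bounded by the compressed size.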
2023-06-10xfs/503: don't rebuild the fs metadata when testing metadumpDarrick J. Wong
This test exercises metadump with the standard populate image. There's no need to test rebuilding the entire fs every step of the way since we're just going to metadump over it. Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Andrey Albershteyn <aalbersh@redhat.com> Reviewed-by: Zorro Lang <zlang@redhat.com> Signed-off-by: Zorro Lang <zlang@kernel.org>
2023-06-10fuzzy: disallow post-test online rebuilds when testing online fsckDarrick J. Wong
If we're testing the online fsck code or running fuzz tests of the filesystem, don't let the post-test checks rebuild the filesystem metadata, because this is redundant with the test and will disturb the metadata if the tools fail. Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Andrey Albershteyn <aalbersh@redhat.com> Reviewed-by: Zorro Lang <zlang@redhat.com> Signed-off-by: Zorro Lang <zlang@kernel.org>
2023-06-10xfs/155: improve logging in this testDarrick J. Wong
If this test fails after a certain number of writes, we should state the exact number of writes so that we can coordinate with 155.full. Instead, we state the pre-randomization number, which isn't all that helpful. Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Bill O'Donnell <bodonnel@redhat.com> Signed-off-by: Zorro Lang <zlang@kernel.org>
2023-06-10xfs/155: discard stderr when checking for NEEDSREPAIRDarrick J. Wong
This test deliberately crashes xfs_repair midway through writing metadata to check that NEEDSREPAIR is always triggered by filesystem writes. However, the subsequent scan for the NEEDSREPAIR feature bit prints verifier errors to stderr. On a filesystem with metadata directories, this leads to the test failing with this recorded in the golden output:

 +Metadata CRC error detected at 0x55c0a2dd0d38, xfs_dir3_block block 0xc0/0x1000
 +dir block owner 0x82 doesnt match block 0xbb8cd37e44eb3623

This isn't specific to metadata directories -- any repair crash could leave a metadata structure in a weird state such that starting xfs_db will spray verifier errors. For _check_scratch_xfs_features here, we don't care if the filesystem is corrupt; we /only/ care that the superblock feature bit is set. Route all that noise to devnull.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Andrey Albershteyn <aalbersh@redhat.com>
Signed-off-by: Zorro Lang <zlang@kernel.org>
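The redirection itself is small. The sketch below uses a hypothetical query_features function in place of the real feature check, emitting one line of noise on stderr and the answer on stdout.

```shell
# Hypothetical stand-in for a feature query that also emits verifier noise.
query_features() {
    echo "Metadata CRC error detected (sample noise)" >&2
    echo "NEEDSREPAIR"
}

# Keep stdout, discard stderr.
features=$(query_features 2>/dev/null)

case "$features" in
*NEEDSREPAIR*) echo "feature bit set" ;;
*) echo "feature bit not set" ;;
esac
# prints: feature bit set
```

Only stderr is dropped, so the feature answer on stdout still reaches the test while the verifier messages stay out of the compared output.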