summaryrefslogtreecommitdiff
path: root/fs/xfs/scrub
AgeCommit message (Collapse)Author
2018-02-22xfs: use memset to initialize xfs_scrub_agfl_infoEric Sandeen
Apparently different gcc versions have competing and incompatible notions of how to initialize at declaration, so just give up and fall back to the time-tested memset(). Signed-off-by: Eric Sandeen <sandeen@redhat.com> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2018-01-29xfs: don't clobber inobt/finobt cursors when xref with rmapDarrick J. Wong
Even if we can't use the inobt/finobt cursors to count the number of inode btree blocks, we are never allowed to clobber the cursor of the btree being checked, so don't do this. Found by fuzzing level = ones in xfs/364. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Christoph Hellwig <hch@lst.de>
2018-01-29xfs: make tracepoint inode number format consistentDarrick J. Wong
Fix all the inode number formats to be consistently (0x%llx) in all trace point definitions. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de>
2018-01-17xfs: check that br_blockcount doesn't overflowDarrick J. Wong
xfs_bmbt_irec.br_blockcount is declared as xfs_filblks_t, which is an unsigned 64-bit integer. Though the bmbt helpers will never set a value larger than 2^21 (since the underlying on-disk extent record has a length field that is only 21 bits wide), we should be a little defensive about checking that a bmbt record doesn't exceed what we're expecting or overflow into the next AG. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Dave Chinner <dchinner@redhat.com>
2018-01-17xfs: directory scrubber must walk through data block to offsetDarrick J. Wong
In xfs_scrub_dir_rec, we must walk through the directory block entries to arrive at the offset given by the hash structure. If we blindly trust the hash address, we can end up midway into a directory entry and stray outside the block. Found by lastbit fuzzing lents[3].address in xfs/390 with KASAN enabled. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Dave Chinner <dchinner@redhat.com>
2018-01-17xfs: don't iunlock unlocked inodesDarrick J. Wong
Don't iunlock an unlocked inode, which can happen if the parent pointer scrubber bails out with sc->ip unlocked while trying to grab the parent directory inode. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Brian Foster <bfoster@redhat.com>
2018-01-17xfs: scrub in-core metadataDarrick J. Wong
Whenever we load a buffer, explicitly re-call the structure verifier to ensure that memory isn't corrupting things. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Dave Chinner <dchinner@redhat.com>
2018-01-17xfs: cross-reference the block mappings when possibleDarrick J. Wong
Use an inode's block mappings to cross-reference inode block counters. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Dave Chinner <dchinner@redhat.com>
2018-01-17xfs: cross-reference the realtime bitmapDarrick J. Wong
While we're scrubbing various btrees, cross-reference the records with the other metadata. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Dave Chinner <dchinner@redhat.com>
2018-01-17xfs: cross-reference refcount btree during scrubDarrick J. Wong
During metadata btree scrub, we should cross-reference with the reference counts. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Dave Chinner <dchinner@redhat.com>
2018-01-17xfs: cross-reference the rmapbt data with the refcountbtDarrick J. Wong
Cross reference the refcount data with the rmap data to check that the number of rmaps for a given block match the refcount of that block, and that CoW blocks (which are owned entirely by the refcountbt) are tracked as well. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Dave Chinner <dchinner@redhat.com>
2018-01-17xfs: cross-reference reverse-mapping btreeDarrick J. Wong
When scrubbing various btrees, we should cross-reference the records with the reverse mapping btree and ensure that traversing the btree finds the same number of blocks that the rmapbt thinks are owned by that btree. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Dave Chinner <dchinner@redhat.com>
2018-01-17xfs: cross-reference inode btrees during scrubDarrick J. Wong
Cross-reference the inode btrees with the other metadata when we scrub the filesystem. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Dave Chinner <dchinner@redhat.com>
2018-01-17xfs: cross-reference bnobt records with cntbtDarrick J. Wong
Scrub should make sure that each bnobt record has a corresponding cntbt record. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Dave Chinner <dchinner@redhat.com>
2018-01-17xfs: cross-reference with the bnobtDarrick J. Wong
When we're scrubbing various btrees, cross-reference the records with the bnobt to ensure that we don't also think the space is free. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Dave Chinner <dchinner@redhat.com>
2018-01-17xfs: introduce scrubber cross-referencing stubsDarrick J. Wong
Create some stubs that will be used to cross-reference metadata records. The actual cross-referencing will be filled in by subsequent patches. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Dave Chinner <dchinner@redhat.com>
2018-01-17xfs: check btree block ownership with bnobt/rmapbt when scrubbing btreeDarrick J. Wong
When scanning a metadata btree block, cross-reference the block location with the free space btree and the reverse mapping btree to ensure that the rmapbt knows about the block and the bnobt does not. Add a mechanism to defer checks when we happen to be scanning the bnobt/rmapbt itself because it's less efficient to repeatedly clone and destroy the cursor. This patch provides the framework to make btree block owner checks happen; the actual meat will be added in subsequent patches. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Dave Chinner <dchinner@redhat.com>
2018-01-17xfs: fix a few erroneous process_error calls in the scrubbersDarrick J. Wong
There are a few places where we make a libxfs api call on behalf of some object other than the one we're scrubbing but inadvertently call the regular process_error function. When this happens we mark the object corrupt even though it was corruption in /some other/ object that actually produced the -EFSCORRUPTED code. The correct output flag for these situations is SCRUB_OFLAG_XFAIL, not SCRUB_OFLAG_CORRUPT, so fix this now that we also have a helper to set these. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Dave Chinner <dchinner@redhat.com>
2018-01-17xfs: set up scrub cross-referencing helpersDarrick J. Wong
Create some helper functions that we'll use later to deal with problems we might encounter while cross referencing metadata with other metadata. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Dave Chinner <dchinner@redhat.com>
2018-01-12xfs: use %pS printk format for direct instruction addressesDarrick J. Wong
Use the %pS instead of the %pF printk format specifier for printing symbols from direct addresses. This is needed for the ia64, ppc64 and parisc64 architectures. While we're at it, be consistent with the capitalization of the 'S'. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Dave Chinner <dchinner@redhat.com>
2018-01-09xfs: harden directory integrity checks some moreDarrick J. Wong
If a malicious filesystem image contains a block+ format directory wherein the directory inode's core.mode is set such that S_ISDIR(core.mode) == 0, and if there are subdirectories of the corrupted directory, an attempt to traverse up the directory tree will crash the kernel in __xfs_dir3_data_check. Running the online scrub's parent checks will tend to do this. The crash occurs because the directory inode's d_ops get set to xfs_dir[23]_nondir_ops (it's not a directory) but the parent pointer scrubber's indiscriminate call to xfs_readdir proceeds past the ASSERT if we have non fatal asserts configured. Fix the null pointer dereference crash in __xfs_dir3_data_check by looking for S_ISDIR or wrong d_ops; and teach the parent scrubber to bail out if it is fed a non-directory "parent". Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Brian Foster <bfoster@redhat.com>
2018-01-08xfs: have buffer verifier functions report failing addressDarrick J. Wong
Modify each function that checks the contents of a metadata buffer to return the instruction address of the failing test so that we can report more precise failure errors to the log. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Dave Chinner <dchinner@redhat.com>
2018-01-08xfs: distinguish between corrupt inode and invalid inum in xfs_scrub_get_inodeDarrick J. Wong
In xfs_scrub_get_inode, we don't do a good enough job distinguishing EINVAL returns from xfs_iget w/ IGET_UNTRUSTED -- this can happen if the passed in inode number is invalid (past eofs, inobt says it isn't an inode) or if the inum is actually valid but the inode buffer fails verifier. In the first case we still want to return ENOENT, but in the second case we want to capture the corruption error. Therefore, if xfs_iget returns EINVAL, try the raw imap lookup. If that succeeds, we conclude it's a corruption error, otherwise we just bounce out to userspace. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Dave Chinner <dchinner@redhat.com>
2018-01-08xfs: always grab transaction when scrubbing inodeDarrick J. Wong
Always allocate a transaction for inode scrubbing, even if the _iget fails. This is something that is nice to have now for consistency with the other scrubbers but will become critical when we get to online repair where we'll actually use the transaction + raw buffer read to fix the verifier errors. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Dave Chinner <dchinner@redhat.com>
2018-01-08xfs: xfs_scrub_bmap should use for_each_xfs_iextDarrick J. Wong
Refactor xfs_scrub_bmap to use for_each_xfs_iext now that it exists. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Dave Chinner <dchinner@redhat.com>
2018-01-08xfs: catch a few more error codes when scrubbing secondary sbDarrick J. Wong
The superblock validation routines return a variety of error codes to reject a mount request. For scrub we can assume that the mount succeeded, so if we see these things appear when scrubbing secondary sb X, we can treat them all like corruption. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Dave Chinner <dchinner@redhat.com>
2018-01-08xfs: ignore agfl read errors when not scrubbing agflDarrick J. Wong
In xfs_scrub_ag_read_headers, if we're not scrubbing the AGFL but hit a read error reading the AGFL, we should reset the error code so that it doesn't propagate up into the caller. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Dave Chinner <dchinner@redhat.com>
2018-01-08xfs: scrub inode nsec fieldsDarrick J. Wong
Check that the nanosecond fields in each timestamp aren't larger than a billion. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Dave Chinner <dchinner@redhat.com>
2018-01-08xfs: move all scrub input checking to xfs_scrub_validateEric Sandeen
There were ad-hoc checks for some scrub types but not others; mark each scrub type with ... it's type, and use that to validate the allowed and/or required input fields. Moving these checks out of xfs_scrub_setup_ag_header makes it a thin wrapper, so unwrap it in the process. Signed-off-by: Eric Sandeen <sandeen@redhat.com> [darrick: add xfs_ prefix to enum, check scrub args after checking type] Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2018-01-08xfs: factor out scrub input checkingEric Sandeen
Do this before adding more core checks. Signed-off-by: Eric Sandeen <sandeen@redhat.com> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2018-01-08xfs: explicitly initialize meta_scrub_ops array by typeEric Sandeen
An implicit mapping to type by order of initialization seems error-prone, and doesn't lend itself to cscope-ing. Also add sanity checks about size of array vs. max types, and a defensive check that ->scrub exists before using it. Signed-off-by: Eric Sandeen <sandeen@redhat.com> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2017-12-08fs: xfs: remove duplicate includesPravin Shedge
These duplicate includes have been found with scripts/checkincludes.pl but they have been removed manually to avoid removing false positives. Signed-off-by: Pravin Shedge <pravin.shedge4linux@gmail.com> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2017-11-30xfs: scrub inode mode properlyxfs-4.15-fixes-3Darrick J. Wong
Since we've used up all the bits in i_mode, the existing mode check doesn't actually do anything useful. However, we've not used all the bit values in the format portion of i_mode, so we /do/ need to test that for bad values. Fixes: 80e4e1268 ("xfs: scrub inodes") Fixes-coverity-id: 1423992 Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Brian Foster <bfoster@redhat.com>
2017-11-28xfs: calculate correct offset in xfs_scrub_quota_itemxfs-4.15-fixes-2Eric Sandeen
It's only used for tracepoints so it's relatively harmless, but the offset is calculated incorrectly in xfs_scrub_quota_item. qi_dqperchunk is the nr. of dquots per "chunk" which we have conveniently *cough* defined to always be 1 FSB. Therefore block_offset * qi_dqperchunk == first id in that chunk, and so offset = id / qi_dqperchunk id * dqperchunk is ... meaningless. Fixes-coverity-id: 1423965 Fixes: c2fc338c ("xfs: scrub quota information") Signed-off-by: Eric Sandeen <sandeen@redhat.com> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2017-11-28xfs: fix uninitialized variable in xfs_scrub_quotaEric Sandeen
On the first pass through the while(1) loop, we get to xfs_scrub_should_terminate() which can test the uninitialized error variable. Fixes-coverity-id: 1423737 Fixes: c2fc338c ("xfs: scrub quota information") Signed-off-by: Eric Sandeen <sandeen@redhat.com> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2017-11-09xfs: check the uniqueness of the AGFL entriesDarrick J. Wong
Make sure we don't list a block twice in the agfl by copying the contents of the AGFL to an array, sorting it, and looking for duplicates. We can easily check that the number of agfl entries we see actually matches the flcount, so do that too. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Dave Chinner <dchinner@redhat.com>
2017-11-09xfs: only check da node header padding on v5 filesystemsDarrick J. Wong
It turns out that we only started zeroing a new da btree node's block header on v5 filesystems. Prior to that, we just wouldn't set anything at all, which means that the pad field never got set and would retain whatever happened to be in memory. Therefore, we can only check the pad for zeroness on v5 filesystems. shared/006 on a v4 filesystem exposes this scrub bug. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Dave Chinner <dchinner@redhat.com>
2017-11-09xfs: fix btree scrub deref checkDarrick J. Wong
The btree scrubber has some custom code to retrieve and check a btree block via xfs_btree_lookup_get_block. This function will either return an error code (verifiers failed) or a *pblock will be untouched (bad pointer). Since we previously set *pblock to NULL, we need to check *pblock, not pblock, to trigger the early bailout. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Dave Chinner <dchinner@redhat.com>
2017-11-09xfs: fix uninitialized return values in scrub codeDarrick J. Wong
Fix smatch complaints about uninitialized return codes. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Dave Chinner <dchinner@redhat.com>
2017-11-09xfs: pass inode number to xfs_scrub_ino_set_{preen,warning}Darrick J. Wong
There are two ways to scrub an inode -- calling xfs_iget and checking the raw inode core, or by loading the inode cluster buffer and checking the on-disk contents directly. The second method is only useful if _iget fails the verifiers; when this is the case, sc->ip is NULL and calling the tracepoint will cause a system crash. Therefore, pass the raw inode number directly into the _preen and _warning functions. Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Dave Chinner <dchinner@redhat.com>
2017-11-09xfs: refactor the directory data block bestfree checksDarrick J. Wong
In a directory data block, the zeroth bestfree item must point to the longest free space. Therefore, when we check the bestfree block's records against the data blocks, we only need to compare with bf[0] and don't need the loop. The weird loop was most probably the result of an earlier refactoring gone bad. Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Dave Chinner <dchinner@redhat.com>
2017-11-06xfs: trivial sparse fixes for the new scrub codeChristoph Hellwig
[darrick: fix broken initializer in xfs_scrub_xattr] Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2017-11-06xfs: use a b+tree for the in-core extent listChristoph Hellwig
Replace the current linear list and the indirection array for the in-core extent list with a b+tree to avoid the need for larger memory allocations for the indirection array when lots of extents are present. The current extent list implementations leads to heavy pressure on the memory allocator when modifying files with a high extent count, and can lead to high latencies because of that. The replacement is a b+tree with a few quirks. The leaf nodes directly store the extent record in two u64 values. The encoding is a little bit different from the existing in-core extent records so that the start offset and length which are required for lookups can be retreived with simple mask operations. The inner nodes store a 64-bit key containing the start offset in the first half of the node, and the pointers to the next lower level in the second half. In either case we walk the node from the beginninig to the end and do a linear search, as that is more efficient for the low number of cache lines touched during a search (2 for the inner nodes, 4 for the leaf nodes) than a binary search. We store termination markers (zero length for the leaf nodes, an otherwise impossible high bit for the inner nodes) to terminate the key list / records instead of storing a count to use the available cache lines as efficiently as possible. One quirk of the algorithm is that while we normally split a node half and half like usual btree implementations we just spill over entries added at the very end of the list to a new node on its own. This means we get a 100% fill grade for the common cases of bulk insertion when reading an inode into memory, and when only sequentially appending to a file. The downside is a slightly higher chance of splits on the first random insertions. Both insert and removal manually recurse into the lower levels, but the bulk deletion of the whole tree is still implemented as a recursive function call, although one limited by the overall depth and with very little stack usage in every iteration. For the first few extents we dynamically grow the list from a single extent to the next powers of two until we have a first full leaf block and that building the actual tree. The code started out based on the generic lib/btree.c code from Joern Engel based on earlier work from Peter Zijlstra, but has since been rewritten beyond recognition. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2017-11-06xfs: introduce the xfs_iext_cursor abstractionChristoph Hellwig
Add a new xfs_iext_cursor structure to hide the direct extent map index manipulations. In addition to the existing lookup/get/insert/ remove and update routines new primitives to get the first and last extent cursor, as well as moving up and down by one extent are provided. Also new are convenience to increment/decrement the cursor and retreive the new extent, as well as to peek into the previous/next extent without updating the cursor and last but not least a macro to iterate over all extents in a fork. [darrick: rename for_each_iext to for_each_xfs_iext] Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2017-11-03xfs: scrub: avoid uninitialized return codeDarrick J. Wong
The newly added xfs_scrub_da_btree_block() function has one code path that returns the 'error' variable without initializing it first, as shown by this compiler warning: fs/xfs/scrub/dabtree.c: In function 'xfs_scrub_da_btree_block': fs/xfs/scrub/dabtree.c:462:9: error: 'error' may be used uninitialized in this function [-Werror=maybe-uninitialized] Return zero since the caller will exit the scrub code if we don't produce a buffer pointer. Fixes: 7c4a07a424c1 ("xfs: scrub directory/attribute btrees") Reported-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Brian Foster <bfoster@redhat.com>
2017-11-01xfs: scrub extended attribute leaf spaceDarrick J. Wong
As we walk the attribute btree, explicitly check the structure of the attribute leaves to make sure the pointers make sense and the freemap is sensible. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Dave Chinner <dchinner@redhat.com>
2017-10-27xfs: compare btree block keys to parent block's keys during scrubDarrick J. Wong
When we're done checking all the records/keys in a btree block, compute the low and high key of the block and compare them to the associated key in the parent btree block. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Brian Foster <bfoster@redhat.com>
2017-10-26xfs: scrub quota informationDarrick J. Wong
Perform some quick sanity testing of the disk quota information. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Dave Chinner <dchinner@redhat.com>
2017-10-26xfs: scrub realtime bitmap/summaryDarrick J. Wong
Perform simple tests of the realtime bitmap and summary. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Dave Chinner <dchinner@redhat.com>
2017-10-26xfs: scrub directory parent pointersDarrick J. Wong
Scrub parent pointers, sort of. For directories, we can ride the '..' entry up to the parent to confirm that there's at most one dentry that points back to this directory. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Dave Chinner <dchinner@redhat.com>