path: root/fs
Age  Commit message  Author
2019-10-24  xfs: Sanity check flags of Q_XQUOTARM call  [xfs-5.5-merge-3]  Jan Kara
Flags passed to Q_XQUOTARM were not sanity checked for invalid values. Fix that. Fixes: 9da93f9b7cdf ("xfs: fix Q_XQUOTARM ioctl") Reported-by: Yang Xu <xuyang2018.jy@cn.fujitsu.com> Signed-off-by: Jan Kara <jack@suse.cz> Reviewed-by: Eric Sandeen <sandeen@redhat.com> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
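The kind of check this message describes can be sketched as a mask test that rejects any bit outside a supported set. The flag names and values below are hypothetical stand-ins, not the kernel's quota flag definitions, and the function illustrates the idea rather than reproducing the patched XFS code.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical flag bits, for illustration only. */
#define DEMO_USER_QUOTA   (1u << 0)
#define DEMO_GROUP_QUOTA  (1u << 1)
#define DEMO_PROJ_QUOTA   (1u << 2)
#define DEMO_ALL_QUOTA    (DEMO_USER_QUOTA | DEMO_GROUP_QUOTA | DEMO_PROJ_QUOTA)

/* Return 0 if every set bit is a known flag, -EINVAL otherwise. */
int demo_check_rm_flags(unsigned int flags)
{
    if (flags & ~DEMO_ALL_QUOTA)    /* at least one unknown bit is set */
        return -EINVAL;
    return 0;
}

/* Usage example: known flags pass, stray bits are rejected. */
void demo_check_rm_flags_example(void)
{
    assert(demo_check_rm_flags(DEMO_USER_QUOTA | DEMO_PROJ_QUOTA) == 0);
    assert(demo_check_rm_flags(DEMO_USER_QUOTA | (1u << 7)) == -EINVAL);
}
```

Testing against the complement of the supported mask means new flags only need to be added in one place.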
2019-10-23  xfs: add mising include of xfs_pnfs.h for missing declarations  Ben Dooks (Codethink)
The xfs_pnfs.c file is missing an include of xfs_pnfs.h to add the prototypes of the functions it exports. Include this file to fix the following sparse warnings: fs/xfs/xfs_pnfs.c:27:1: warning: symbol 'xfs_break_leased_layouts' was not declared. Should it be static? fs/xfs/xfs_pnfs.c:52:1: warning: symbol 'xfs_fs_get_uuid' was not declared. Should it be static? fs/xfs/xfs_pnfs.c:77:1: warning: symbol 'xfs_fs_map_blocks' was not declared. Should it be static? fs/xfs/xfs_pnfs.c:226:1: warning: symbol 'xfs_fs_commit_blocks' was not declared. Should it be static? Signed-off-by: Ben Dooks (Codethink) <ben.dooks@codethink.co.uk> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2019-10-23  xfs: don't set bmapi total block req where minleft is  Brian Foster
xfs_bmapi_write() takes a total block requirement parameter that is passed down to the block allocation code and is used to specify the total block requirement of the associated transaction. This is used to try and select an AG that can not only satisfy the requested extent allocation, but can also accommodate subsequent allocations that might be required to complete the transaction. For example, additional bmbt block allocations may be required on insertion of the resulting extent to an inode data fork. While it's important for callers to calculate and reserve such extra blocks in the transaction, it is not necessary to pass the total value to xfs_bmapi_write() in all cases. The latter automatically sets minleft to ensure that sufficient free blocks remain after the allocation attempt to expand the format of the associated inode (e.g., extent to btree conversion, btree splits, etc.). Therefore, any callers that pass a total block requirement of the bmap mapping length plus worst case bmbt expansion essentially specify the additional reservation requirement twice. These callers can pass a total of zero to rely on the bmapi minleft policy. Beyond being superfluous, the primary motivation for this change is that the total reservation logic in the bmbt code is dubious in scenarios where minlen < maxlen and a maxlen extent cannot be allocated (which is more common for data extent allocations where contiguity is not required). The total value is based on maxlen in the xfs_bmapi_write() caller. If the bmbt code falls back to an allocation between minlen and maxlen, that allocation will not succeed until total is reset to minlen, which essentially throws away any additional reservation included in total by the caller. In addition, the total value is not reset until after alignment is dropped, which means that such callers drop alignment far more aggressively than necessary.
Update all callers of xfs_bmapi_write() that pass a total block value of the mapping length plus bmbt reservation to instead pass zero and rely on xfs_bmapi_minleft() to enforce the bmbt reservation requirement. This trades off slightly less conservative AG selection for the ability to preserve alignment in more scenarios. xfs_bmapi_write() callers that incorporate unrelated or additional reservations in total beyond what is already included in minleft must continue to use the former. Signed-off-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2019-10-23  xfs: cap longest free extent to maximum allocatable  Dave Chinner
Cap longest extent to the largest we can allocate based on limits calculated at mount time. Dynamic state (such as finobt blocks) can result in the longest free extent exceeding the size we can allocate, and that results in failure to align full AG allocations when the AG is empty. Result: xfs_io-4413 [003] 426.412459: xfs_alloc_vextent_loopfailed: dev 8:96 agno 0 agbno 32 minlen 243968 maxlen 244000 mod 0 prod 1 minleft 1 total 262148 alignment 32 minalignslop 0 len 0 type NEAR_BNO otype START_BNO wasdel 0 wasfromfl 0 resv 0 datatype 0x5 firstblock 0xffffffffffffffff minlen and maxlen are now separated by the alignment size, and allocation fails because args.total > free space in the AG. [bfoster: Added xfs_bmap_btalloc() changes.] Signed-off-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Carlos Maiolino <cmaiolino@redhat.com> Signed-off-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2019-10-21  xfs: remove the duplicated inode log fieldmask set  kaixuxia
The xfs_bumplink() call has set the inode log fieldmask XFS_ILOG_CORE, so the next xfs_trans_log_inode() call is not necessary. Signed-off-by: kaixuxia <kaixuxia@tencent.com> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2019-10-21  xfs: improve the IOMAP_NOWAIT check for COW inodes  [xfs-5.5-merge-2]  Christoph Hellwig
Only bail out once we know that a COW allocation is actually required, similar to how we handle normal data fork allocations. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2019-10-21  xfs: cleanup xfs_direct_write_iomap_begin  Christoph Hellwig
Move more checks into the helpers that determine if we need a COW operation or allocation and split the return path for when an existing data fork allocation has been found versus a new allocation. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2019-10-21  xfs: rename the whichfork variable in xfs_buffered_write_iomap_begin  Christoph Hellwig
Renaming whichfork to allocfork in xfs_buffered_write_iomap_begin makes the usage of this variable a little more clear. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Allison Collins <allison.henderson@oracle.com> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2019-10-21  xfs: split the iomap ops for buffered vs direct writes  Christoph Hellwig
Instead of lots of magic conditionals in the main write_begin handler this makes the intent very clear. Things will become even better once we support delayed allocations for extent size hints and realtime allocations. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2019-10-21  xfs: move xfs_file_iomap_begin_delay around  Christoph Hellwig
Move xfs_file_iomap_begin_delay near the end of the file next to the other iomap functions to prepare for additional refactoring. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2019-10-21  xfs: split out a new set of read-only iomap ops  Christoph Hellwig
Start untangling xfs_file_iomap_begin by splitting out the read-only case into its own set of iomap_ops with a very simple iomap_begin helper. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2019-10-21  xfs: factor out a helper to calculate the end_fsb  Christoph Hellwig
We have lots of places that want to calculate the final fsb for an offset + count in bytes and check that the result fits into s_maxbytes. Factor out a helper for that. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Allison Collins <allison.henderson@oracle.com> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
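As a rough illustration of the calculation this message refers to, the sketch below rounds offset + count up to filesystem blocks and clamps the result to the block count implied by a maximum file size. All names are hypothetical, the block size is assumed to be a power of two given as a shift, and byte values are assumed small enough that the additions do not overflow; this is not the helper added by the commit.

```c
#include <assert.h>
#include <stdint.h>

/* Round a byte count up to blocks; the block size is 1 << blocklog. */
uint64_t demo_bytes_to_fsb(uint64_t bytes, unsigned int blocklog)
{
    uint64_t mask = ((uint64_t)1 << blocklog) - 1;

    return (bytes + mask) >> blocklog;
}

/*
 * First block past the byte range [offset, offset + count), limited to
 * the number of blocks that fit below maxbytes.
 */
uint64_t demo_end_fsb(uint64_t offset, uint64_t count,
                      uint64_t maxbytes, unsigned int blocklog)
{
    uint64_t end = demo_bytes_to_fsb(offset + count, blocklog);
    uint64_t limit = demo_bytes_to_fsb(maxbytes, blocklog);

    assert(offset <= maxbytes);    /* the range must start inside the limit */
    return end < limit ? end : limit;
}
```

With 4096-byte blocks (blocklog 12), a 4096-byte write at offset 100 ends in block 2, and a write that would run past an 8192-byte limit is clamped to block 2 as well.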
2019-10-21  xfs: fill out the srcmap in iomap_begin  Christoph Hellwig
Replace our local hacks to report the source block in the main iomap with the proper srcmap reporting. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2019-10-21  xfs: refactor xfs_file_iomap_begin_delay  Christoph Hellwig
Rejuggle the return path to prepare for filling out a source iomap. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2019-10-21  xfs: pass two imaps to xfs_reflink_allocate_cow  Christoph Hellwig
xfs_reflink_allocate_cow consumes the source data fork imap, and potentially returns the COW fork imap. Split the arguments in two to clear up the calling conventions and to prepare for returning a source iomap from ->iomap_begin. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2019-10-21  xfs: remove xfs_reflink_dirty_extents  Christoph Hellwig
Now that xfs_file_unshare is not completely dumb we can just call it directly without iterating the extent and reflink btrees ourselves. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2019-10-21  xfs: also call xfs_file_iomap_end_delalloc for zeroing operations  Christoph Hellwig
There is no reason not to punch out stale delalloc blocks for zeroing operations, as they otherwise behave exactly like normal writes. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2019-10-21  xfs: fix inode fork extent count overflow  Dave Chinner
[commit message is verbose for discussion purposes - will trim it down later. Some questions about implementation details at the end.]

Zorro Lang recently ran a new test to stress single inode extent counts now that they are no longer limited by memory allocation. The test was simply:

    # xfs_io -f -c "falloc 0 40t" /mnt/scratch/big-file
    # ~/src/xfstests-dev/punch-alternating /mnt/scratch/big-file

This test uncovered a problem where the hole punching operation appeared to finish with no error, but apparently only created 268M extents instead of the 10 billion it was supposed to. Further, trying to punch out extents that should have been present resulted in success, but no change in the extent count. It looked like a silent failure.

While running the test and observing the behaviour in real time, I observed the extent count growing at ~2M extents/minute, and saw this after about an hour:

    # xfs_io -f -c "stat" /mnt/scratch/big-file |grep next ; \
    > sleep 60 ; \
    > xfs_io -f -c "stat" /mnt/scratch/big-file |grep next
    fsxattr.nextents = 127657993
    fsxattr.nextents = 129683339
    #

And a few minutes later this:

    # xfs_io -f -c "stat" /mnt/scratch/big-file |grep next
    fsxattr.nextents = 4177861124
    #

Ah, what? Where did those 4 billion extra extents suddenly come from?

Stop the workload, unmount, mount:

    # xfs_io -f -c "stat" /mnt/scratch/big-file |grep next
    fsxattr.nextents = 166044375
    #

And it's back at the expected number. i.e. the extent count is correct on disk, but it's screwed up in memory. I loaded up the extent list, and immediately:

    # xfs_io -f -c "stat" /mnt/scratch/big-file |grep next
    fsxattr.nextents = 4192576215
    #

It's bad again. So, where does that number come from? xfs_fill_fsxattr():

    if (ip->i_df.if_flags & XFS_IFEXTENTS)
        fa->fsx_nextents = xfs_iext_count(&ip->i_df);
    else
        fa->fsx_nextents = ip->i_d.di_nextents;

And that's the behaviour I just saw in a nutshell. The on disk count is correct, but once the tree is loaded into memory, it goes whacky. Clearly there's something wrong with xfs_iext_count():

    inline xfs_extnum_t xfs_iext_count(struct xfs_ifork *ifp)
    {
        return ifp->if_bytes / sizeof(struct xfs_iext_rec);
    }

Simple enough, but 134M extents is 2**27, and that's right about where things went wrong. A struct xfs_iext_rec is 16 bytes in size, which means 2**27 * 2**4 = 2**31 and we're right on target for an integer overflow. And, sure enough:

    struct xfs_ifork {
        int if_bytes; /* bytes in if_u1 */
        ....

Once we get 2**27 extents in a file, we overflow if_bytes and the in-core extent count goes wrong. And when we reach 2**28 extents, if_bytes wraps back to zero and things really start to go wrong there. This is where the silent failure comes from - only the first 2**28 extents can be looked up directly due to the overflow, all the extents above this index wrap back to somewhere in the first 2**28 extents. Hence with a regular pattern, trying to punch a hole in the range that didn't have holes mapped to a hole in the first 2**28 extents and so "succeeded" without changing anything. Hence "silent failure"...

Fix this by converting if_bytes to an int64_t and converting all the index variables and size calculations to use int64_t types to avoid overflows in future. Signed integers are still used to enable easy detection of extent count underflows. This enables scalability of extent counts to the limits of the on-disk format - MAXEXTNUM (2**31) extents.

Current testing is at over 500M extents and still going:

    fsxattr.nextents = 517310478

Reported-by: Zorro Lang <zlang@redhat.com>
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
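The arithmetic in this message can be reproduced with a small user-space model. The sketch below is not kernel code: the function names are hypothetical, the 16-byte record size is taken from the message, and it assumes the wrapped byte count is divided as a signed 32-bit value and then displayed as an unsigned 32-bit number.

```c
#include <assert.h>
#include <stdint.h>

#define DEMO_REC_SIZE 16    /* record size quoted in the message */

/* Value a two's complement 32-bit signed int would hold for this byte count. */
int64_t demo_wrap_to_int32(int64_t bytes)
{
    uint32_t low = (uint32_t)bytes;    /* keep only the low 32 bits */

    return low >= 0x80000000u ? (int64_t)low - 4294967296LL : (int64_t)low;
}

/* Extent count as derived from a 32-bit byte counter, shown as unsigned. */
uint32_t demo_count_int32(int64_t nextents)
{
    int64_t stored;

    assert(nextents >= 0);
    stored = demo_wrap_to_int32(nextents * DEMO_REC_SIZE);
    return (uint32_t)(stored / DEMO_REC_SIZE);
}

/* Extent count as derived from a 64-bit byte counter. */
int64_t demo_count_int64(int64_t nextents)
{
    assert(nextents >= 0);
    return (nextents * DEMO_REC_SIZE) / DEMO_REC_SIZE;
}
```

Under this model, 2**27 extents read back as 4160749568 and 2**28 extents read back as 0, matching the two thresholds described above. A back-calculated input of 151329284 extents (not a figure from the message) would read back as 4177861124, the first bad value in the transcript. The 64-bit variant returns the true count all the way up to 2**31.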
2019-10-21  xfs: remove the XLOG_STATE_DO_CALLBACK state  Christoph Hellwig
XLOG_STATE_DO_CALLBACK is only entered through XLOG_STATE_DONE_SYNC and just used in a single debug check. Remove the flag and thus simplify the calling conventions for xlog_state_do_callback and xlog_state_iodone_process_iclog. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Brian Foster <bfoster@redhat.com>
2019-10-21  xfs: turn ic_state into an enum  Christoph Hellwig
ic_state really is a set of different states, even if the values are encoded as non-conflicting bits and we sometimes use logical and operations to check for them. Switch all comparisons to check for exact values (and use switch statements in a few places to make it more clear) and turn the values into an implicitly enumerated enum type. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Brian Foster <bfoster@redhat.com>
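A minimal sketch of the pattern this message describes, using hypothetical state names rather than the real ic_state values: once the states are implicitly numbered 0, 1, 2 and so on, a bitwise test such as `state & (A | B)` no longer works (the first state is 0), so membership checks become exact comparisons, often written as a switch.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical states, implicitly enumerated from 0. */
enum demo_state {
    DEMO_ACTIVE,
    DEMO_WANT_SYNC,
    DEMO_SYNCING,
    DEMO_DONE_SYNC
};

/* True for the states that precede I/O submission in this toy model. */
bool demo_state_is_pre_io(enum demo_state state)
{
    switch (state) {
    case DEMO_ACTIVE:
    case DEMO_WANT_SYNC:
        return true;
    case DEMO_SYNCING:
    case DEMO_DONE_SYNC:
        return false;
    default:
        assert(!"unknown state");    /* not a valid enum value */
        return false;
    }
}
```

Listing every state explicitly also lets the compiler warn when a new enumerator is added but not handled, provided the default case is dropped.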
2019-10-21  xfs: remove the unused XLOG_STATE_ALL and XLOG_STATE_UNUSED flags  Christoph Hellwig
Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Brian Foster <bfoster@redhat.com>
2019-10-21  xfs: remove dead ifdef XFSERRORDEBUG code  Christoph Hellwig
XFSERRORDEBUG is never set and the code isn't all that useful, so remove it. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Brian Foster <bfoster@redhat.com>
2019-10-21  xfs: call xlog_state_release_iclog with l_icloglock held  Christoph Hellwig
All but one caller of xlog_state_release_iclog hold l_icloglock and need to drop and reacquire it to call xlog_state_release_iclog. Switch the xlog_state_release_iclog calling conventions to expect the lock to be held, and open code the logic (using a shared helper) in the only remaining caller that does not have the lock (and where not holding it is a nice performance optimization). Also move the refactored code to require the least amount of forward declarations. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> [darrick: minor whitespace cleanup] Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Brian Foster <bfoster@redhat.com>
2019-10-21  xfs: move the locking from xlog_state_finish_copy to the callers  Christoph Hellwig
This will allow optimizing various locking cycles in the following patches. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Brian Foster <bfoster@redhat.com>
2019-10-21  xfs: remove the unused ic_io_size field from xlog_in_core  Christoph Hellwig
ic_io_size is only used inside xlog_write_iclog, where we can just use the count parameter instead. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Brian Foster <bfoster@redhat.com>
2019-10-21  xfs: pass the correct flag to xlog_write_iclog  Christoph Hellwig
xlog_write_iclog expects a bool for the second argument. While any non-0 value happens to work fine this makes all calls consistent. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Brian Foster <bfoster@redhat.com>
2019-10-21  xfs: optimize near mode bnobt scans with concurrent cntbt lookups  Brian Foster
The near mode fallback algorithm consists of a left/right scan of the bnobt. This algorithm has very poor breakdown characteristics under worst case free space fragmentation conditions. If a suitable extent is far enough from the locality hint, each allocation may scan most or all of the bnobt before it completes. This causes pathological behavior and extremely high allocation latencies. While locality is important to near mode allocations, it is not so important as to incur pathological allocation latency to provide the absolute best available locality for every allocation. If the allocation is large enough or far enough away, there is a point of diminishing returns. As such, we can bound the overall operation by including an iterative cntbt lookup in the broader search. The cntbt lookup is optimized to immediately find the extent with best locality for the given size on each iteration. Since the cntbt is indexed by extent size, the lookup repeats with a variably aggressive increasing search key size until it runs off the edge of the tree. This approach provides a natural balance between the two algorithms for various situations. For example, the bnobt scan is able to satisfy smaller allocations such as for inode chunks or btree blocks more quickly where the cntbt search may have to search through a large set of extent sizes when the search key starts off small relative to the largest extent in the tree. On the other hand, the cntbt search more deterministically covers the set of suitable extents for larger data extent allocation requests that the bnobt scan may have to search the entire tree to locate. Signed-off-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2019-10-21  xfs: factor out tree fixup logic into helper  Brian Foster
Lift the btree fixup path into a helper function. Signed-off-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2019-10-21  xfs: refactor near mode alloc bnobt scan into separate function  Brian Foster
In preparation to enhance the near mode allocation bnobt scan algorithm, lift it into a separate function. No functional changes. Signed-off-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2019-10-21  xfs: refactor and reuse best extent scanning logic  Brian Foster
The bnobt "find best" helper implements a simple btree walker function. This general pattern, or a subset thereof, is reused in various parts of a near mode allocation operation. For example, the bnobt left/right scans are each iterative btree walks along with the cntbt lastblock scan. Rework this function into a generic btree walker, add a couple parameters to control termination behavior from various contexts and reuse it where applicable. Signed-off-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2019-10-21  xfs: refactor allocation tree fixup code  Brian Foster
Both algorithms duplicate the same btree allocation code. Eliminate the duplication and reuse the fallback algorithm codepath. Signed-off-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2019-10-21  xfs: reuse best extent tracking logic for bnobt scan  Brian Foster
The near mode bnobt scan searches left and right in the bnobt looking for the closest free extent to the allocation hint that satisfies minlen. Once such an extent is found, the left/right search terminates, we search one more time in the opposite direction and finish the allocation with the best overall extent. The left/right and find best searches are currently controlled via a combination of cursor state and local variables. Clean up this code and prepare for further improvements to the near mode fallback algorithm by reusing the allocation cursor best extent tracking mechanism. Update the tracking logic to deactivate bnobt cursors when out of allocation range and replace open-coded extent checks with calls to the common helper. In doing so, rename some misnamed local variables in the top-level near mode allocation function. Signed-off-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2019-10-21  xfs: refactor cntbt lastblock scan best extent logic into helper  Brian Foster
The cntbt lastblock scan checks the size, alignment, locality, etc. of each free extent in the block and compares it with the current best candidate. This logic will be reused by the upcoming optimized cntbt algorithm, so refactor it into a separate helper. Note that acur->diff is now initialized to -1 (unsigned) instead of 0 to support the more granular comparison logic in the new helper. Signed-off-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
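One plausible reading of the note about initializing the difference to -1 (unsigned) is that the largest representable value acts as a "no candidate yet" marker, so the first suitable extent always wins a strict less-than comparison. The sketch below illustrates only that idea; the structure, field names, and comparison rule are hypothetical and much simpler than the helper the message describes.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

struct demo_best {
    uint32_t start;    /* start block of the best candidate so far */
    uint32_t len;      /* its length in blocks */
    uint32_t diff;     /* distance from the hint; all ones means "none yet" */
};

void demo_best_init(struct demo_best *best)
{
    best->start = 0;
    best->len = 0;
    best->diff = (uint32_t)-1;    /* any real distance compares lower */
}

/* Consider one free extent; return true if it became the new best. */
bool demo_best_consider(struct demo_best *best, uint32_t hint,
                        uint32_t minlen, uint32_t start, uint32_t len)
{
    uint32_t diff;

    if (len < minlen)    /* too small to satisfy the request */
        return false;
    diff = start > hint ? start - hint : hint - start;
    if (diff >= best->diff)    /* not closer than the current best */
        return false;
    best->start = start;
    best->len = len;
    best->diff = diff;
    return true;
}
```

Had the difference started at 0, this comparison would never accept any candidate, which is why the sentinel has to be the maximum value in this model.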
2019-10-21  xfs: track best extent from cntbt lastblock scan in alloc cursor  Brian Foster
If the size lookup lands in the last block of the by-size btree, the near mode algorithm scans the entire block for the extent with best available locality. In preparation for similar best available extent tracking across both btrees, extend the allocation cursor with best extent data and lift the associated state from the cntbt last block scan code. No functional changes. Signed-off-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2019-10-21  xfs: track allocation busy state in allocation cursor  Brian Foster
Extend the allocation cursor to track extent busy state for an allocation attempt. No functional changes. Signed-off-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2019-10-21  xfs: introduce allocation cursor data structure  Brian Foster
Introduce a new allocation cursor data structure to encapsulate the various states and structures used to perform an extent allocation. This structure will eventually be used to track overall allocation state across different search algorithms on both free space btrees. To start, include the three btree cursors (one for the cntbt and two for the bnobt left/right search) used by the near mode allocation algorithm and refactor the cursor setup and teardown code into helpers. This slightly changes cursor memory allocation patterns, but otherwise makes no functional changes to the allocation algorithm. Signed-off-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> [darrick: fix sparse complaints] Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2019-10-21  xfs: track active state of allocation btree cursors  Brian Foster
The upcoming allocation algorithm update searches multiple allocation btree cursors concurrently. As such, it requires an active state to track when a particular cursor should continue searching. While active state will be modified based on higher level logic, we can define base functionality based on the result of allocation btree lookups. Define an active flag in the private area of the btree cursor. Update it based on the result of lookups in the existing allocation btree helpers. Finally, provide a new helper to query the current state. Signed-off-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2019-10-21  xfs: ignore extent size hints for always COW inodes  Christoph Hellwig
There is no point in applying extent size hints for always COW inodes, as we would just have to COW any extra allocation beyond the data actually written. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2019-10-21  xfs: include QUOTA, FATAL ASSERT build options in XFS_BUILD_OPTIONS  yu kuai
In commit d03a2f1b9fa8 ("xfs: include WARN, REPAIR build options in XFS_BUILD_OPTIONS"), Eric pointed out that the XFS_BUILD_OPTIONS string, shown at module init time and in modinfo output, does not currently include all available build options. So, he added in CONFIG_XFS_WARN and CONFIG_XFS_REPAIR. However, this is not enough; also add in CONFIG_XFS_QUOTA and CONFIG_XFS_ASSERT_FATAL. Signed-off-by: yu kuai <yukuai3@huawei.com> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2019-10-21  iomap: use a srcmap for a read-modify-write I/O  [iomap-5.5-merge-5]  Goldwyn Rodrigues
The srcmap is used to identify where the read is to be performed from. It is passed to ->iomap_begin, which can fill it in if we need to read data for partially written blocks from a different location than the write target. The srcmap is only supported for buffered writes so far. Signed-off-by: Goldwyn Rodrigues <rgoldwyn@suse.com> [hch: merged two patches, removed the IOMAP_F_COW flag, use iomap as srcmap if not set, adjust length down to srcmap end as well] Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Acked-by: Goldwyn Rodrigues <rgoldwyn@suse.com>
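Two points from the bracketed note in this message (fall back to the iomap when no srcmap is set, and trim the length to the end of the source mapping) can be sketched as follows. The types, the "unset" marker, and the function names are hypothetical and are not the iomap API.

```c
#include <assert.h>
#include <stdint.h>

enum demo_map_type { DEMO_MAP_UNSET = 0, DEMO_MAP_MAPPED, DEMO_MAP_SHARED };

struct demo_map {
    enum demo_map_type type;
    uint64_t offset;    /* byte offset where the mapping starts */
    uint64_t length;    /* mapping length in bytes */
};

/* Read from srcmap if the filesystem filled it in, else from iomap. */
const struct demo_map *demo_pick_source(const struct demo_map *iomap,
                                        const struct demo_map *srcmap)
{
    return srcmap->type != DEMO_MAP_UNSET ? srcmap : iomap;
}

/* Trim a request starting at pos so that it ends inside the source mapping. */
uint64_t demo_clamp_length(uint64_t pos, uint64_t length,
                           const struct demo_map *src)
{
    uint64_t end = src->offset + src->length;

    assert(pos >= src->offset && pos < end);
    return pos + length > end ? end - pos : length;
}
```

In this model the write target and the read source can differ, for example when the target is a fresh copy-on-write allocation and the old data still lives in a shared extent.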
2019-10-21  iomap: use write_begin to read pages to unshare  Christoph Hellwig
Use the existing iomap write_begin code to read the pages unshared by iomap_file_unshare. That avoids the extra ->readpage call and extent tree lookup currently done by read_mapping_page. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2019-10-21  iomap: move the zeroing case out of iomap_read_page_sync  Christoph Hellwig
That keeps the function a little easier to understand, and easier to modify for pending enhancements. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2019-10-21  iomap: ignore non-shared or non-data blocks in xfs_file_dirty  Christoph Hellwig
xfs_file_dirty is used to unshare reflink blocks. Rename the function to xfs_file_unshare to better document that purpose, and skip iomaps that are not shared and don't need zeroing. This will allow simplifying the caller. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2019-10-21  iomap: always use AOP_FLAG_NOFS in iomap_write_begin  Christoph Hellwig
All callers pass AOP_FLAG_NOFS, so lift that flag to iomap_write_begin to allow reusing the flags arguments for an internal flags namespace soon. Also remove the local index variable that is only used once. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2019-10-21  iomap: remove the unused iomap argument to __iomap_write_end  Christoph Hellwig
Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Allison Collins <allison.henderson@oracle.com> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2019-10-21  iomap: enhance writeback error message  Darrick J. Wong
If we encounter an IO error during writeback, log the inode, offset, and sector number of the failure, instead of forcing the user to do some sort of reverse mapping to figure out which file is affected. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de>
2019-10-21  iomap: pass a struct page to iomap_finish_page_writeback  Christoph Hellwig
No need to pass the full bio_vec. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2019-10-21  iomap: cleanup iomap_ioend_compare  Christoph Hellwig
Move the initialization of ia and ib to the declaration line and remove a superfluous else. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2019-10-21  iomap: move struct iomap_page out of iomap.h  Christoph Hellwig
Now that all the writepage code is in the iomap code there is no need to keep this structure public. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2019-10-21  iomap: warn on inline maps in iomap_writepage_map  Christoph Hellwig
An inline mapping should never mark the page dirty and thus never end up in writepages. Add a check for that condition and warn if it happens. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>