path: root/fs
Age  Commit message  Author
2009-09-04  Merge commit 'hwpoison/hwpoison'  Stephen Rothwell
2009-09-04  Merge commit 'tip/auto-latest'  Stephen Rothwell
Conflicts: arch/x86/include/asm/socket.h arch/x86/kernel/setup.c drivers/pci/dmar.c drivers/pci/intel-iommu.c include/linux/rcupdate.h kernel/fork.c
2009-09-04  Merge commit 'fsnotify/for-next'  Stephen Rothwell
2009-09-04  Merge commit 'trivial/for-next'  Stephen Rothwell
2009-09-04  Merge commit 'osd/linux-next'  Stephen Rothwell
2009-09-04  Merge commit 'cputime/cputime'  Stephen Rothwell
2009-09-04  Merge commit 'refs/next/20090902/security-testing'  Stephen Rothwell
2009-09-04  Merge commit 'block/for-next'  Stephen Rothwell
Conflicts: fs/ubifs/super.c
2009-09-04  Merge commit 'sound/for-next'  Stephen Rothwell
2009-09-04  Merge commit 'mtd/master'  Stephen Rothwell
2009-09-04  Merge commit 'net/master'  Stephen Rothwell
2009-09-04  Merge commit 'dlm/next'  Stephen Rothwell
2009-09-04  Merge branch 'quilt/i2c'  Stephen Rothwell
2009-09-04  Merge commit 'reiserfs-bkl/reiserfs/kill-bkl'  Stephen Rothwell
2009-09-04  Merge commit 'xfs/master'  Stephen Rothwell
Conflicts: fs/xfs/linux-2.6/xfs_lrw.c
2009-09-04  Merge commit 'ubifs/linux-next'  Stephen Rothwell
2009-09-04  Merge commit 'udf/for_next'  Stephen Rothwell
2009-09-04  Merge commit 'ocfs2/linux-next'  Stephen Rothwell
2009-09-04  Merge commit 'nilfs2/for-next'  Stephen Rothwell
2009-09-04  Merge commit 'nfsd/nfsd-next'  Stephen Rothwell
Conflicts: net/sunrpc/cache.c
2009-09-04  Merge commit 'nfs/linux-next'  Stephen Rothwell
2009-09-04  Merge commit 'gfs2/master'  Stephen Rothwell
2009-09-04  Merge commit 'fuse/for-next'  Stephen Rothwell
2009-09-04  Merge commit 'fatfs/master'  Stephen Rothwell
2009-09-04  Merge commit 'ext4/next'  Stephen Rothwell
2009-09-04  Merge commit 'ext3/for_next'  Stephen Rothwell
2009-09-04  Merge commit 'ecryptfs/next'  Stephen Rothwell
2009-09-04  Merge commit 'cifs/master'  Stephen Rothwell
2009-09-03  cifs: consolidate reconnect logic in smb_init routines  Jeff Layton
There's a large cut and paste chunk of code in smb_init and small_smb_init to handle reconnects. Break it out into a separate function, clean it up and have both routines call it. Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Steve French <sfrench@us.ibm.com>
2009-09-03  JFFS2: add missing verify buffer allocation/deallocation  Massimo Cirillo
The function jffs2_nor_wbuf_flash_setup() doesn't allocate the verify buffer if CONFIG_JFFS2_FS_WBUF_VERIFY is defined, so causing a kernel panic when that macro is enabled and the verify function is called. Similarly the jffs2_nor_wbuf_flash_cleanup() must free the buffer if CONFIG_JFFS2_FS_WBUF_VERIFY is enabled. The following patch fixes the problem. The following patch applies to 2.6.30 kernel. Signed-off-by: Massimo Cirillo <maxcir@gmail.com> Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com> Signed-off-by: David Woodhouse <David.Woodhouse@intel.com> Cc: stable@kernel.org
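The setup/cleanup pairing described in this message can be sketched in userspace C. This is only a toy model under stated assumptions: `MODEL_WBUF_VERIFY`, `struct wbuf_state`, `model_wbuf_setup()` and `model_wbuf_cleanup()` are hypothetical names standing in for the kernel config option and the JFFS2 functions, and `malloc`/`free` stand in for the kernel allocator. It is not the patch itself.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/* Hypothetical stand-in for the kernel config option; set to 1 so the
 * verify-buffer path is compiled in for this sketch. */
#define MODEL_WBUF_VERIFY 1

/* Toy model of the write-buffer state; field names are illustrative. */
struct wbuf_state {
    size_t pagesize;
    unsigned char *wbuf;
    unsigned char *wbuf_verify; /* only allocated when verification is on */
};

/* Allocate the write buffer and, if enabled, the verify buffer too. */
static int model_wbuf_setup(struct wbuf_state *c, size_t pagesize)
{
    assert(pagesize > 0);
    c->pagesize = pagesize;
    c->wbuf_verify = NULL;
    c->wbuf = malloc(pagesize);
    if (!c->wbuf)
        return -ENOMEM;
#if MODEL_WBUF_VERIFY
    c->wbuf_verify = malloc(pagesize);
    if (!c->wbuf_verify) {
        /* Undo the first allocation so nothing leaks on failure. */
        free(c->wbuf);
        c->wbuf = NULL;
        return -ENOMEM;
    }
#endif
    return 0;
}

/* Release everything setup allocated; free(NULL) is a no-op. */
static void model_wbuf_cleanup(struct wbuf_state *c)
{
    free(c->wbuf_verify);
    free(c->wbuf);
    c->wbuf = NULL;
    c->wbuf_verify = NULL;
}
```

The point of the sketch is the symmetry: whatever the setup path allocates under the option, the cleanup path must release under the same condition.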
2009-09-03  Merge branch 'osync_inode' into for_next  Jan Kara
2009-09-03  ext3: Fix possible deadlock between ext3_truncate() and ext3_get_blocks()  Jan Kara
During truncate we are sometimes forced to start a new transaction as the amount of blocks to be journaled is both quite large and hard to predict. So far we restarted a transaction while holding truncate_mutex and that violates lock ordering because truncate_mutex ranks below transaction start (and it can lead to a real deadlock with ext3_get_blocks() allocating new blocks from ext3_writepage()). Luckily, the problem is easy to fix: We just drop the truncate_mutex before restarting the transaction and acquire it afterwards. We are safe to do this as by the time ext3_truncate() is called, all the page cache for the truncated part of the file is dropped and so writepage() cannot come and allocate new blocks in the part of the file we are truncating. The rest of writers is stopped by us holding i_mutex. Signed-off-by: Jan Kara <jack@suse.cz>
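The drop-restart-reacquire pattern from this message can be illustrated with a small userspace model. Everything here is an assumption made for illustration: `struct toy_mutex` only tracks ownership and stands in for truncate_mutex, `toy_restart()` stands in for restarting the transaction, and `truncate_restart_model()` is a hypothetical name, not the function the patch adds.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy lock that only tracks ownership; stands in for truncate_mutex. */
struct toy_mutex {
    bool held;
};

static void toy_lock(struct toy_mutex *m)
{
    assert(!m->held);
    m->held = true;
}

static void toy_unlock(struct toy_mutex *m)
{
    assert(m->held);
    m->held = false;
}

/* Toy journal handle that records what happened during restarts. */
struct toy_handle {
    int restarts;
    bool lock_held_during_restart;
};

/* Stand-in for the transaction restart; notes whether the lock was held,
 * which would be the lock-ordering violation described above. */
static void toy_restart(struct toy_handle *h, const struct toy_mutex *m)
{
    if (m->held)
        h->lock_held_during_restart = true;
    h->restarts++;
}

/* Caller holds the lock on entry and holds it again on return. */
static void truncate_restart_model(struct toy_handle *h, struct toy_mutex *m)
{
    toy_unlock(m);     /* drop the lower-ranked lock first */
    toy_restart(h, m); /* restart the transaction with no lock held */
    toy_lock(m);       /* reacquire before touching block mappings again */
}
```

In the model the restart never observes the lock as held, which is the property the commit relies on to avoid the ordering violation.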
2009-09-03  jbd: Annotate transaction start also for journal_restart()  Jan Kara
lockdep annotation for a transaction start has been at the end of journal_start(). But a transaction is also started from journal_restart(). Move the lockdep annotation to start_this_handle() which covers both cases. Signed-off-by: Jan Kara <jack@suse.cz>
2009-09-03  jbd: Journal block numbers can ever be only 32-bit use unsigned int for them  Jan Kara
It does not make sense to store block number for journal as unsigned long since they can be only 32-bit (because of on-disk format limitation). So change in-memory structures and variables to use unsigned int instead. Signed-off-by: Jan Kara <jack@suse.cz>
2009-09-03  JBD: round commit timer up to avoid uncommitted transaction  Andreas Dilger
Fix jiffie rounding in jbd commit timer setup code. Rounding down could cause the timer to be fired before the corresponding transaction has expired. That transaction can stay not committed forever if no new transaction is created or explicit sync/umount happens. Signed-off-by: Andreas Dilger <adilger@sun.com> Signed-off-by: Jan Kara <jack@suse.cz>
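Why rounding direction matters can be shown with a simplified userspace model. The assumptions are loud ones: `MODEL_HZ` is an arbitrary tick rate of 250 per second, and `model_round_down()` / `model_round_up()` round to a whole second, which is a simplification of whatever rounding the kernel timer helpers actually perform. None of these names come from the patch.

```c
#include <assert.h>
#include <limits.h>

/* Assumed tick rate for the model (ticks per second). */
#define MODEL_HZ 250UL

/* Round a tick count down to the start of its second. */
static unsigned long model_round_down(unsigned long j)
{
    return j - (j % MODEL_HZ);
}

/* Round a tick count up to the next whole second (no-op if already aligned). */
static unsigned long model_round_up(unsigned long j)
{
    unsigned long rem = j % MODEL_HZ;

    assert(j <= ULONG_MAX - MODEL_HZ); /* keep the model free of wraparound */
    return rem ? j + (MODEL_HZ - rem) : j;
}

/* A timer that fires before the transaction's expiry finds nothing to commit. */
static int fires_early(unsigned long timer, unsigned long expires)
{
    return timer < expires;
}
```

With an expiry at tick 1001, rounding down arms the timer at tick 1000, one tick before the transaction expires, while rounding up arms it at tick 1250, safely after.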
2009-09-03  fsync: wait for data writeout completion before calling ->fsync  Christoph Hellwig
Currently vfs_fsync(_range) first calls filemap_fdatawrite to write out the data, then calls into ->fsync to write out the metadata, and finally calls filemap_fdatawait to wait for the data I/O to complete. What sounds like a clever micro-optimization is actually a nasty trap for many filesystems.

For many modern filesystems i_size or other inode information is only updated on I/O completion, and we need to wait for I/O to finish before we can write out the metadata. For old-fashioned filesystems that instantiate blocks during the actual write and also update the metadata at that point, it opens up a large window where we could expose uninitialized blocks after a crash. While a few filesystems that need it already wait for the I/O to finish inside their ->fsync methods, that is rather suboptimal, as it is done under the i_mutex and also always for the whole file instead of just a part, as we could do for O_SYNC handling.

Here is a small audit of all fsync instances in the tree:

 - spufs_mfc_fsync:
 - ps3flash_fsync:
 - vol_cdev_fsync:
 - printer_fsync:
 - fb_deferred_io_fsync:
 - bad_file_fsync:
 - simple_sync_file:
     don't care - filesystems/drivers don't use the page cache or are purely in-memory.

 - simple_fsync:
 - file_fsync:
 - affs_file_fsync:
 - fat_file_fsync:
 - jfs_fsync:
 - ubifs_fsync:
 - reiserfs_dir_fsync:
 - reiserfs_sync_file:
     never touch pagecache themselves. We need to wait before if we do not want to expose stale data after an allocation.

 - afs_fsync:
 - fuse_fsync_common:
     do the waiting writeback itself in awkward ways, would benefit from proper semantics

 - block_fsync:
     Does a filemap_write_and_wait on the block device inode. Because we now have f_mapping that is the same inode we call it on in vfs_fsync. So just removing it and letting the VFS do the work in one go would be an improvement.

 - btrfs_sync_file:
 - cifs_fsync:
 - xfs_file_fsync:
     need the wait first and currently do it themselves. would benefit from doing it outside i_mutex.

 - coda_fsync:
 - ecryptfs_fsync:
 - exofs_file_fsync:
 - shm_fsync:
     only passes the fsync through to the lower layer

 - ext3_sync_file:
     doesn't seem to care, comments are confusing.

 - ext4_sync_file:
     would need the wait to work correctly for delalloc mode with late i_size updates. Otherwise the ext3 comment applies.
     currently implements its own writeback and wait in an odd way, could benefit from doing it properly.

 - gfs2_fsync:
     not needed for journaled data mode, but probably harmless there. Currently writes back data asynchronously itself. Needs some major audit.

 - hostfs_fsync:
     just calls fsync/datasync on the host FD. Without the wait before data might not even be inflight yet if we're unlucky.

 - hpfs_file_fsync:
 - ncp_fsync:
     no-ops. Dangerous before and after.

 - jffs2_fsync:
     just calls jffs2_flush_wbuf_gc, not sure how this relates to data.

 - nfs_fsync_dir:
     just increments stats, claims all directory operations are synchronous

 - nfs_file_fsync:
     only writes out data??? Looks very odd.

 - nilfs_sync_file:
     looks like it expects all data done, but not sure from the code

 - ntfs_dir_fsync:
 - ntfs_file_fsync:
     appear to do their own data writeback. Very convoluted code.

 - ocfs2_sync_file:
     does its own data writeback, but no wait. probably needs the wait.

 - smb_fsync:
     according to a comment expects all pages written already, probably needs the wait before.

This patch only changes vfs_fsync_range, removal of the wait in the methods that have it is left to the filesystem maintainers. Note that most filesystems really do need an audit for their fsync methods given the gems found in this very brief audit.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jan Kara <jack@suse.cz>
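The ordering change this commit message describes can be made concrete with a tiny trace model. It is a sketch only: the three steps are recorded as letters rather than performed, and `struct trace`, `record()`, `fsync_old_order()` and `fsync_new_order()` are hypothetical names, not kernel functions.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Each step is recorded as one letter:
 *   'w' = start data writeout
 *   'm' = write metadata via the filesystem's ->fsync method
 *   'd' = wait for the data I/O to complete */
struct trace {
    char log[8];
    size_t n;
};

static void record(struct trace *t, char step)
{
    assert(t->n + 1 < sizeof(t->log));
    t->log[t->n++] = step;
    t->log[t->n] = '\0';
}

/* Ordering before the patch: metadata goes out while data may be in flight. */
static void fsync_old_order(struct trace *t)
{
    record(t, 'w');
    record(t, 'm');
    record(t, 'd');
}

/* Ordering after the patch: data is on disk before ->fsync runs. */
static void fsync_new_order(struct trace *t)
{
    record(t, 'w');
    record(t, 'd');
    record(t, 'm');
}
```

Comparing the two traces shows the only difference: the wait moves ahead of the ->fsync call, so a filesystem that updates i_size on I/O completion sees the final value when it writes metadata.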
2009-09-03  vfs: Remove generic_osync_inode() and sync_page_range{_nolock}()  Jan Kara
Remove these three functions since nobody uses them anymore. Signed-off-by: Jan Kara <jack@suse.cz>
2009-09-03  fat: Opencode sync_page_range_nolock()  Jan Kara
fat_cont_expand() is the only user of sync_page_range_nolock(). It's also the only user of generic_osync_inode() which does not have a file open. So opencode needed actions for FAT so that we can convert generic_osync_inode() to a standard syncing path. Update a comment about generic_osync_inode(). CC: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp> Signed-off-by: Jan Kara <jack@suse.cz>
2009-09-03  xfs: Convert sync_page_range() to simple filemap_write_and_wait_range()  Jan Kara
Christoph Hellwig says that it is enough for XFS to call filemap_write_and_wait_range() instead of sync_page_range() because we do all the metadata syncing when forcing the log. CC: Felix Blyakher <felixb@sgi.com> CC: xfs@oss.sgi.com CC: Christoph Hellwig <hch@lst.de> Signed-off-by: Jan Kara <jack@suse.cz>
2009-09-03  ocfs2: Update syncing after splicing to match generic version  Jan Kara
Update ocfs2 specific splicing code to use generic syncing helper. The sync now does not happen under rw_lock because generic_write_sync() acquires i_mutex which ranks above rw_lock. That should not matter because standard fsync path does not hold it either. Acked-by: Joel Becker <Joel.Becker@oracle.com> Acked-by: Mark Fasheh <mfasheh@suse.com> CC: ocfs2-devel@oss.oracle.com Signed-off-by: Jan Kara <jack@suse.cz>
2009-09-03  ntfs: Use new syncing helpers and update comments  Jan Kara
Use new syncing helpers in .write and .aio_write functions. Also remove superfluous syncing in ntfs_file_buffered_write() and update comments about generic_osync_inode(). CC: Anton Altaparmakov <aia21@cantab.net> CC: linux-ntfs-dev@lists.sourceforge.net Signed-off-by: Jan Kara <jack@suse.cz>
2009-09-03  ext4: Remove syncing logic from ext4_file_write  Jan Kara
The syncing is now properly handled by generic_file_aio_write() so no special ext4 code is needed. CC: linux-ext4@vger.kernel.org CC: tytso@mit.edu Signed-off-by: Jan Kara <jack@suse.cz>
2009-09-03  ext3: Remove syncing logic from ext3_file_write  Jan Kara
Syncing is now properly done by generic_file_aio_write() so no special logic is needed in ext3. CC: linux-ext4@vger.kernel.org Signed-off-by: Jan Kara <jack@suse.cz>
2009-09-03  ext2: Update comment about generic_osync_inode  Jan Kara
We rely on generic_write_sync() now. CC: linux-ext4@vger.kernel.org Signed-off-by: Jan Kara <jack@suse.cz>
2009-09-03  vfs: Introduce new helpers for syncing after writing to O_SYNC file or IS_SYNC inode  Jan Kara
Introduce new function for generic inode syncing (vfs_fsync_range) and use it from fsync() path. Introduce also new helper for syncing after a sync write (generic_write_sync) using the generic function. Use these new helpers for syncing from generic VFS functions. This makes O_SYNC writes to block devices acquire i_mutex for syncing. If we really care about this, we can make block_fsync() drop the i_mutex and reacquire it before it returns. CC: Evgeniy Polyakov <zbr@ioremap.net> CC: ocfs2-devel@oss.oracle.com CC: Joel Becker <joel.becker@oracle.com> CC: Felix Blyakher <felixb@sgi.com> CC: xfs@oss.sgi.com CC: Anton Altaparmakov <aia21@cantab.net> CC: linux-ntfs-dev@lists.sourceforge.net CC: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp> CC: linux-ext4@vger.kernel.org CC: tytso@mit.edu Acked-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jan Kara <jack@suse.cz>
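The decision a helper like generic_write_sync has to make (does this write need syncing, and over which byte range) can be modelled in userspace. This is a sketch under assumptions: `write_needs_sync()` and `struct sync_range` are hypothetical names, the `inode_is_sync` flag stands in for the IS_SYNC inode check, and treating the range as the inclusive span of the bytes just written is an assumption of the model rather than something the message states.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdbool.h>

/* Inclusive byte range that would be handed to the range-sync function. */
struct sync_range {
    long long start;
    long long end;
};

/* Returns true and fills *out when a completed write of `count` bytes at
 * offset `pos` must be synced, i.e. the file was opened with O_SYNC or the
 * inode itself is marked synchronous. Returns false otherwise. */
static bool write_needs_sync(int open_flags, bool inode_is_sync,
                             long long pos, long long count,
                             struct sync_range *out)
{
    assert(pos >= 0 && count > 0);
    if (!(open_flags & O_SYNC) && !inode_is_sync)
        return false; /* ordinary buffered write: nothing to do */
    out->start = pos;
    out->end = pos + count - 1; /* assumed inclusive end of the written span */
    return true;
}
```

Syncing only the written range, rather than the whole file, is what lets one generic helper replace the per-filesystem syncing code removed in the neighbouring commits.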
2009-09-03  vfs: Rename generic_file_aio_write_nolock  Christoph Hellwig
generic_file_aio_write_nolock() is now used only by block devices and raw character device. Filesystems should use __generic_file_aio_write() in case generic_file_aio_write() doesn't suit them. So rename the function to blkdev_aio_write() and move it to fs/blockdev.c. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jan Kara <jack@suse.cz>
2009-09-03  Merge branch 'writeback' into for-next  Jens Axboe
2009-09-03  Merge branch 'for-2.6.32' into for-next  Jens Axboe
2009-09-03  vm: Add an tuning knob for vm.max_writeback_pages  Theodore Ts'o
Originally, MAX_WRITEBACK_PAGES was hard-coded to 1024 because of a concern of not holding I_SYNC for too long. (At least, that was the comment previously.) This doesn't make sense now because the only time we wait for I_SYNC is if we are calling sync or fsync, and in that case we need to write out all of the data anyway. Previously there may have been other code paths that waited on I_SYNC, but not any more. According to Christoph, the current writeback size is way too small, and XFS had a hack that bumped out nr_to_write to four times the value sent by the VM to be able to saturate medium-sized RAID arrays. This value was also problematic for ext4, as it caused large files to become interleaved on disk in 8 megabyte chunks (we bumped up the nr_to_write by a factor of two). So, in this patch, we make the MAX_WRITEBACK_PAGES a tunable, and change the default to be 32768 blocks. http://bugzilla.kernel.org/show_bug.cgi?id=13930 Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
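The sizes mentioned in this message follow from simple arithmetic, sketched below. The 4 KiB page size (`MODEL_PAGE_SIZE`) is an assumption, as is reading the new default of 32768 as a count of such pages; `writeback_chunk_bytes()` is a hypothetical helper, not a kernel function.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed page size for the model: 4 KiB. */
#define MODEL_PAGE_SIZE 4096ULL

/* Bytes written per writeback pass for a given page limit and a
 * filesystem-specific multiplier applied to nr_to_write. */
static uint64_t writeback_chunk_bytes(uint64_t nr_pages, uint64_t multiplier)
{
    assert(multiplier > 0);
    return nr_pages * multiplier * MODEL_PAGE_SIZE;
}
```

Under these assumptions the old limit of 1024 pages is 4 MiB per pass, the ext4 doubling gives the 8 MiB chunks the message mentions, the XFS factor of four gives 16 MiB, and a limit of 32768 pages corresponds to 128 MiB.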
2009-09-03  writeback: check for registered bdi in flusher add and inode dirty  Jens Axboe
Also a debugging aid. We want to catch dirty inodes being added to backing devices that don't do writeback. Signed-off-by: Jens Axboe <jens.axboe@oracle.com>