summaryrefslogtreecommitdiff
path: root/fs
AgeCommit message (Collapse)Author
2008-04-01Revert "novfs: Add the Novell filesystem client kernel module"Stephen Rothwell
This reverts commit 6c4a4f4eb4a71367ba71c861c8d7fe412ca86e5a.
2008-04-01Merge branch 'quilt/ldp'Stephen Rothwell
2008-04-01Merge commit 'vfs/vfs-2.6.25'Stephen Rothwell
Conflicts: fs/xfs/linux-2.6/xfs_ioctl.c fs/xfs/xfs_ialloc.c fs/xfs/xfs_sb.h
2008-04-01Merge commit 'mtd/master'Stephen Rothwell
2008-04-01Merge commit 'net/master'Stephen Rothwell
Conflicts: MAINTAINERS
2008-04-01Merge commit 'udf/for_next'Stephen Rothwell
2008-04-01Merge commit 'ext4/next'Stephen Rothwell
Conflicts: fs/jbd2/journal.c fs/jbd2/revoke.c
2008-04-01Merge commit 'ocfs2/linux-next'Stephen Rothwell
2008-04-01Merge commit 'dlm/next'Stephen Rothwell
2008-04-01Merge commit 'nfsd/nfsd-next'Stephen Rothwell
2008-04-01Merge commit 'xfs/master'Stephen Rothwell
2008-04-01Merge commit 'nfs/linux-next'Stephen Rothwell
2008-04-01Merge commit 'jfs/next'Stephen Rothwell
2008-04-01Merge commit 's390/features'Stephen Rothwell
2008-04-01novfs: Add the Novell filesystem client kernel moduleGreg Kroah-Hartman
This adds the Novell filesystem client kernel module. Things to do before it can be submitted: - coding style cleanups - remove typedefs - function name lowercase - 80 chars wide - sparse cleanups - __user markings - endian markings - remove functions that are never called and structures never used - yeah, there are a lot of them... - remove wrapper functions - private kmalloc/free? - resolve FIXME markings that have been added to the code - wrong types passed to functions!!! - userspace interface revisit - uses /proc/novfs, not nice. - might need userspace tools rework Cc: Lonnie Iverson <ldiverson@novell.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2008-04-01sysfs: crash debuggingAndrew Morton
Print the name of the last-accessed sysfs file when we oops, to help track down oopses which occur in sysfs store/read handlers. Because these oopses tend to not leave any trace of the offending code in the stack traces. Cc: Kay Sievers <kay.sievers@vrfy.org> Cc: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2008-04-01SYSFS: Explicitly include required header file slab.h.Robert P. J. Day
After an experimental deletion of the unnecessary inclusion of <linux/slab.h> from the header file <linux/percpu.h>, the following files under fs/sysfs were exposed as needing to explicitly include <linux/slab.h>. Signed-off-by: Robert P. J. Day <rpjday@crashcourse.ca> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2008-04-01block: send disk "change" event for rescan_partitions()Kay Sievers
Userspace likes to get notified that the disk may have changed, when rescan_partitions() is called after partitioning or media change. It will make it possible to update the state of the disk with the "change" event, before the following partition "add" events are handled. Cc: David Zeuthen <david@fubar.dk> Signed-off-by: Kay Sievers <kay.sievers@vrfy.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2008-03-31dlm: recover nodes that are removed and re-addedDavid Teigland
If a node is removed from a lockspace, and then added back before the dlm is notified of the removal, the dlm will not detect the removal and won't clear the old state from the node. This is fixed by using a list of added nodes so the membership recovery can detect when a newly added node is already in the member list. Signed-off-by: David Teigland <teigland@redhat.com>
2008-03-30cifs: fix misannotationsAl Viro
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-03-30NULL noise: fs/*, mm/*, kernel/*Al Viro
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-03-30jbd/jbd2 NULL noiseAl Viro
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-03-28Merge branch 'o2net' into linux-nextMark Fasheh
2008-03-28ocfs2/cluster: Get rid of arguments to the timeout routinesJeff Mahoney
We keep seeing bug reports related to NULL pointer derefs in o2net_set_nn_state(). When I originally wrote up the configurable timeout patch, I had tried to plan for multiple clusters. This was silly. The timeout routines all use o2nm_single_cluster so there's no point in passing an argument at all. This patch removes the arguments and kills those bugs dead. Signed-off-by: Jeff Mahoney <jeffm@suse.com> Signed-off-by: Mark Fasheh <mfasheh@suse.com>
2008-03-28Merge branch 'iserr_fixes' into linux-nextMark Fasheh
2008-03-28[patch 1/1] fs/ocfs2/aops.c: test for IS_ERR rather than 0Julia Lawall
The function ocfs2_start_trans always returns either a valid pointer or a value made with ERR_PTR, so its result should be tested with IS_ERR, not with a test for 0. Signed-off-by: Julia Lawall <julia@diku.dk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Mark Fasheh <mfasheh@suse.com>
2008-03-28Merge branch 'for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6 * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6: [PATCH] mnt_expire is protected by namespace_sem, no need for vfsmount_lock [PATCH] do shrink_submounts() for all fs types [PATCH] sanitize locking in mark_mounts_for_expiry() and shrink_submounts() [PATCH] count ghost references to vfsmounts [PATCH] reduce stack footprint in namespace.c
2008-03-28afs: prevent double cell registrationSven Schnelle
kafs doesn't check if the cell already exists - so if you do an echo "add newcell.org 1.2.3.4" >/proc/fs/afs/cells it will try to create this cell again. kobject will also complain about a double registration. To prevent such problems, return -EEXIST in that case. Signed-off-by: Sven Schnelle <svens@stackframe.org> Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-03-28vfs: fix data leak in nobh_write_end()Dmitri Monakhov
Current nobh_write_end() implementation ignore partial writes(copied < len) case if page was fully mapped and simply mark page as Uptodate, which is totally wrong because area [pos+copied, pos+len) wasn't updated explicitly in previous write_begin call. It simply contains garbage from pagecache and result in data leakage. #TEST_CASE_BEGIN: ~~~~~~~~~~~~~~~~ In fact issue triggered by classical testcase open("/mnt/test", O_RDWR|O_CREAT|O_TRUNC, 0666) = 3 ftruncate(3, 409600) = 0 writev(3, [{"a", 1}, {NULL, 4095}], 2) = 1 ##TESTCASE_SOURCE: ~~~~~~~~~~~~~~~~~ #include <stdio.h> #include <stdlib.h> #include <fcntl.h> #include <sys/uio.h> #include <sys/mman.h> #include <errno.h> int main(int argc, char **argv) { int fd, ret; void* p; struct iovec iov[2]; fd = open(argv[1], O_RDWR|O_CREAT|O_TRUNC, 0666); ftruncate(fd, 409600); iov[0].iov_base="a"; iov[0].iov_len=1; iov[1].iov_base=NULL; iov[1].iov_len=4096; ret = writev(fd, iov, sizeof(iov)/sizeof(struct iovec)); printf("writev = %d, err = %d\n", ret, errno); return 0; } ##TESTCASE RESULT: ~~~~~~~~~~~~~~~~~~ [root@ts63 ~]# mount | grep mnt2 /dev/mapper/test on /mnt2 type ext2 (rw,nobh) [root@ts63 ~]# /tmp/writev /mnt2/test writev = 1, err = 0 [root@ts63 ~]# hexdump -C /mnt2/test 00000000 61 65 62 6f 6f 74 00 00 f0 b9 b4 59 3a 00 00 00 |aeboot.....Y:...| 00000010 20 00 00 00 00 00 00 00 21 00 00 00 00 00 00 00 | .......!.......| 00000020 df df df df df df df df df df df df df df df df |................| 00000030 3a 00 00 00 2a 00 00 00 21 00 00 00 00 00 00 00 |:...*...!.......| 00000040 60 c0 8c 00 00 00 00 00 40 4a 8d 00 00 00 00 00 |`.......@J......| 00000050 00 00 00 00 00 00 00 00 41 00 00 00 00 00 00 00 |........A.......| 00000060 74 69 6d 65 20 64 64 20 69 66 3d 2f 64 65 76 2f |time dd if=/dev/| 00000070 6c 6f 6f 70 30 20 20 6f 66 3d 2f 64 65 76 2f 6e |loop0 of=/dev/n| skip.. 00000f50 00 00 00 00 00 00 00 00 31 00 00 00 00 00 00 00 |........1.......| 00000f60 6d 6b 66 73 2e 65 78 74 33 20 2f 64 65 76 2f 76 |mkfs.ext3 /dev/v| 00000f70 7a 76 67 2f 74 65 73 74 20 2d 62 34 30 39 36 00 |zvg/test -b4096.| 00000f80 a0 fe 8c 00 00 00 00 00 21 00 00 00 00 00 00 00 |........!.......| 00000f90 23 31 32 30 35 39 35 30 34 30 34 00 3a 00 00 00 |#1205950404.:...| 00000fa0 20 00 8d 00 00 00 00 00 21 00 00 00 00 00 00 00 | .......!.......| 00000fb0 d0 cf 8c 00 00 00 00 00 10 d0 8c 00 00 00 00 00 |................| 00000fc0 00 00 00 00 00 00 00 00 41 00 00 00 00 00 00 00 |........A.......| 00000fd0 6d 6f 75 6e 74 20 2f 64 65 76 2f 76 7a 76 67 2f |mount /dev/vzvg/| 00000fe0 74 65 73 74 20 20 2f 76 7a 20 2d 6f 20 64 61 74 |test /vz -o dat| 00000ff0 61 3d 77 72 69 74 65 62 61 63 6b 00 00 00 00 00 |a=writeback.....| 00001000 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................| As you can see file's page contains garbage from pagecache instead of zeros. #TEST_CASE_END Attached patch: - Add sanity check BUG_ON in order to prevent incorrect usage by caller, This is function invariant because page can has buffers and in no zero *fadata pointer at the same time. - Always attach buffers to page is it is partial write case. - Always switch back to generic_write_end if page has buffers. This is reasonable because if page already has buffer then generic_write_begin was called previously. Signed-off-by: Dmitri Monakhov <dmonakhov@openvz.org> Reviewed-by: Nick Piggin <npiggin@suse.de> Cc: <stable@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-03-28[S390] System z large page support.Gerald Schaefer
This adds hugetlbfs support on System z, using both hardware large page support if available and software large page emulation on older hardware. Shared (large) page tables are implemented in software emulation mode, by using page->index of the first tail page from a compound large page to store page table information. Signed-off-by: Gerald Schaefer <geraldsc@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2008-03-28Merge branch 'mountinfo' into vfs-2.6.25Al Viro
2008-03-28[PATCH] fix race in kvm_dev_ioctl_create_vm(), sanitize anon_inode_getfd()Al Viro
a) none of the callers even looks at inode returned by anon_inode_getfd() b) none of correct callers looks at file returned by anon_inode_getfd() c) the only caller that does is (inevitably) racy, since by the time it returns we might have raced with close() from another thread and that file would be pining for fjords. Fixed by adding a variant that pins the file (used by the only odd caller) and by turning anon_inode_getfd() into wrapper around it. And losing the damn pfile and pinode from anon_inode_getfd(), to prevent similar breakage in the future. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2008-03-28[patch 7/7] vfs: mountinfo: show dominating group idMiklos Szeredi
Show peer group ID of nearest dominating group that has intersection with the mount's namespace. Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2008-03-28[patch 6/7] vfs: mountinfo: add /proc/<pid>/mountinfoRam Pai
[mszeredi@suse.cz] rewrite and split big patch into managable chunks /proc/mounts in its current form lacks important information: - propagation state - root of mount for bind mounts - the st_dev value used within the filesystem - identifier for each mount and it's parent It also suffers from the following problems: - not easily extendable - ambiguity of mountpoints within a chrooted environment - doesn't distinguish between filesystem dependent and independent options - doesn't distinguish between per mount and per super block options This patch introduces /proc/<pid>/mountinfo which attempts to address all these deficiencies. Code shared between /proc/<pid>/mounts and /proc/<pid>/mountinfo is extracted into separate functions. Thanks to Al Viro for the help in getting the design right. Signed-off-by: Ram Pai <linuxram@us.ibm.com> Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2008-03-28[patch 5/7] vfs: mountinfo: allow using process rootMiklos Szeredi
Allow /proc/<pid>/mountinfo to use the root of <pid> to calculate mountpoints. - move definition of 'struct proc_mounts' to <linux/mnt_namespace.h> - add the process's namespace and root to this structure - pass a pointer to 'struct proc_mounts' into seq_operations In addition the following cleanups are made: - use a common open function for /proc/<pid>/{mounts,mountstat} - surround namespace.c part of these proc files with #ifdef CONFIG_PROC_FS - make the seq_operations structures const Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2008-03-28[patch 4/7] vfs: mountinfo: add mount peer group IDMiklos Szeredi
Add a unique ID to each peer group using the IDR infrastructure. The identifiers are reused after the peer group dissolves. The IDR structures are protected by holding namepspace_sem for write while allocating or deallocating IDs. IDs are allocated when a previously unshared vfsmount becomes the first member of a peer group. When a new member is added to an existing group, the ID is copied from one of the old members. IDs are freed when the last member of a peer group is unshared. Setting the MNT_SHARED flag on members of a subtree is done as a separate step, after all the IDs have been allocated. This way an allocation failure can be cleaned up easilty, without affecting the propagation state. Based on design sketch by Al Viro. Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2008-03-28[patch 3/7] vfs: mountinfo: add mount IDMiklos Szeredi
Add a unique ID to each vfsmount using the IDR infrastructure. The identifiers are reused after the vfsmount is freed. Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2008-03-28[patch 2/7] vfs: mountinfo: add seq_file_root()Miklos Szeredi
Add a new function: seq_file_root() This is similar to seq_path(), but calculates the path relative to the given root, instead of current->fs->root. If the path was unreachable from root, then modify the root parameter to reflect this. Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2008-03-28[patch 1/7] vfs: mountinfo: add dentry_path()Ram Pai
[mszeredi@suse.cz] split big patch into managable chunks Add the following functions: dentry_path() seq_dentry() These are similar to d_path() and seq_path(). But instead of calculating the path within a mount namespace, they calculate the path from the root of the filesystem to a given dentry, ignoring mounts completely. Signed-off-by: Ram Pai <linuxram@us.ibm.com> Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2008-03-28[PATCH] teach seq_file to discard entriesAl Viro
Allow ->show() return SEQ_SKIP; that will discard all output from that element and move on. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2008-03-28Merge branch 'ro-bind.b4' into mountinfoAl Viro
2008-03-28[PATCH] r/o bind mounts: debugging for missed callsDave Hansen
There have been a few oopses caused by 'struct file's with NULL f_vfsmnts. There was also a set of potentially missed mnt_want_write()s from dentry_open() calls. This patch provides a very simple debugging framework to catch these kinds of bugs. It will WARN_ON() them, but should stop us from having any oopses or mnt_writer count imbalances. I'm quite convinced that this is a good thing because it found bugs in the stuff I was working on as soon as I wrote it. [hch: made it conditional on a debug option. But it's still a little bit too ugly] [hch: merged forced remount r/o fix from Dave and akpm's fix for the fix] Signed-off-by: Dave Hansen <haveblue@us.ibm.com> Acked-by: Al Viro <viro@ZenIV.linux.org.uk> Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2008-03-28[PATCH] r/o bind mounts: honor mount writer counts at remountDave Hansen
Originally from: Herbert Poetzl <herbert@13thfloor.at> This is the core of the read-only bind mount patch set. Note that this does _not_ add a "ro" option directly to the bind mount operation. If you require such a mount, you must first do the bind, then follow it up with a 'mount -o remount,ro' operation: If you wish to have a r/o bind mount of /foo on bar: mount --bind /foo /bar mount -o remount,ro /bar Acked-by: Al Viro <viro@ZenIV.linux.org.uk> Signed-off-by: Christoph Hellwig <hch@infradead.org> Signed-off-by: Dave Hansen <haveblue@us.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2008-03-28[PATCH] r/o bind mounts: track numbers of writers to mountsDave Hansen
This is the real meat of the entire series. It actually implements the tracking of the number of writers to a mount. However, it causes scalability problems because there can be hundreds of cpus doing open()/close() on files on the same mnt at the same time. Even an atomic_t in the mnt has massive scalaing problems because the cacheline gets so terribly contended. This uses a statically-allocated percpu variable. All want/drop operations are local to a cpu as long that cpu operates on the same mount, and there are no writer count imbalances. Writer count imbalances happen when a write is taken on one cpu, and released on another, like when an open/close pair is performed on two Upon a remount,ro request, all of the data from the percpu variables is collected (expensive, but very rare) and we determine if there are any outstanding writers to the mount. I've written a little benchmark to sit in a loop for a couple of seconds in several cpus in parallel doing open/write/close loops. http://sr71.net/~dave/linux/openbench.c The code in here is a a worst-possible case for this patch. It does opens on a _pair_ of files in two different mounts in parallel. This should cause my code to lose its "operate on the same mount" optimization completely. This worst-case scenario causes a 3% degredation in the benchmark. I could probably get rid of even this 3%, but it would be more complex than what I have here, and I think this is getting into acceptable territory. In practice, I expect writing more than 3 bytes to a file, as well as disk I/O to mask any effects that this has. (To get rid of that 3%, we could have an #defined number of mounts in the percpu variable. So, instead of a CPU getting operate only on percpu data when it accesses only one mount, it could stay on percpu data when it only accesses N or fewer mounts.) [AV] merged fix for __clear_mnt_mount() stepping on freed vfsmount Acked-by: Al Viro <viro@ZenIV.linux.org.uk> Signed-off-by: Christoph Hellwig <hch@infradead.org> Signed-off-by: Dave Hansen <haveblue@us.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2008-03-28[PATCH] r/o bind mounts: get callers of vfs_mknod/create()Dave Hansen
This takes care of all of the direct callers of vfs_mknod(). Since a few of these cases also handle normal file creation as well, this also covers some calls to vfs_create(). So that we don't have to make three mnt_want/drop_write() calls inside of the switch statement, we move some of its logic outside of the switch and into a helper function suggested by Christoph. This also encapsulates a fix for mknod(S_IFREG) that Miklos found. Acked-by: Al Viro <viro@ZenIV.linux.org.uk> Signed-off-by: Christoph Hellwig <hch@infradead.org> Signed-off-by: Dave Hansen <haveblue@us.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2008-03-28[PATCH] r/o bind mounts: check mnt instead of superblock directlyDave Hansen
If we depend on the inodes for writeability, we will not catch the r/o mounts when implemented. This patches uses __mnt_want_write(). It does not guarantee that the mount will stay writeable after the check. But, this is OK for one of the checks because it is just for a printk(). The other two are probably unnecessary and duplicate existing checks in the VFS. This won't make them better checks than before, but it will make them detect r/o mounts. Acked-by: Al Viro <viro@ZenIV.linux.org.uk> Signed-off-by: Christoph Hellwig <hch@infradead.org> Signed-off-by: Dave Hansen <haveblue@us.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2008-03-28[PATCH] r/o bind mounts: make access() use new r/o helperDave Hansen
It is OK to let access() go without using a mnt_want/drop_write() pair because it doesn't actually do writes to the filesystem, and it is inherently racy anyway. This is a rare case when it is OK to use __mnt_is_readonly() directly. Acked-by: Al Viro <viro@ZenIV.linux.org.uk> Signed-off-by: Christoph Hellwig <hch@infradead.org> Signed-off-by: Dave Hansen <haveblue@us.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2008-03-28[PATCH] r/o bind mounts: elevate count for xfs timestamp updatesDave Hansen
Elevate the write count during the xfs m/ctime updates. XFS has to do it's own timestamp updates due to an unfortunate VFS design limitation, so it will have to track writers by itself aswell. [hch: split out from the touch_atime patch as it's not related to it at all] Acked-by: Al Viro <viro@ZenIV.linux.org.uk> Signed-off-by: Christoph Hellwig <hch@infradead.org> Signed-off-by: Dave Hansen <haveblue@us.ibm.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2008-03-28[PATCH] r/o bind mounts: write counts for truncate()Dave Hansen
Acked-by: Al Viro <viro@ZenIV.linux.org.uk> Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Dave Hansen <haveblue@us.ibm.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2008-03-28[PATCH] r/o bind mounts: elevate write count for chmod/chown callersDave Hansen
chown/chmod,etc... don't call permission in the same way that the normal "open for write" calls do. They still write to the filesystem, so bump the write count during these operations. Acked-by: Al Viro <viro@ZenIV.linux.org.uk> Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Dave Hansen <haveblue@us.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>