From 743162013d40ca612b4cb53d3a200dff2d9ab26e Mon Sep 17 00:00:00 2001 From: NeilBrown Date: Mon, 7 Jul 2014 15:16:04 +1000 Subject: sched: Remove proliferation of wait_on_bit() action functions The current "wait_on_bit" interface requires an 'action' function to be provided which does the actual waiting. There are over 20 such functions, many of them identical. Most cases can be satisfied by one of just two functions, one which uses io_schedule() and one which just uses schedule(). So: Rename wait_on_bit and wait_on_bit_lock to wait_on_bit_action and wait_on_bit_lock_action to make it explicit that they need an action function. Introduce new wait_on_bit{,_lock} and wait_on_bit{,_lock}_io which are *not* given an action function but implicitly use a standard one. The decision to error-out if a signal is pending is now made based on the 'mode' argument rather than being encoded in the action function. All instances of the old wait_on_bit and wait_on_bit_lock which can use the new version have been changed accordingly and their action functions have been discarded. wait_on_bit{_lock} does not return any specific error code in the event of a signal so the caller must check for non-zero and interpolate their own error code as appropriate. The wait_on_bit() call in __fscache_wait_on_invalidate() was ambiguous as it specified TASK_UNINTERRUPTIBLE but used fscache_wait_bit_interruptible as an action function. David Howells confirms this should be uniformly "uninterruptible" The main remaining user of wait_on_bit{,_lock}_action is NFS which needs to use a freezer-aware schedule() call. A comment in fs/gfs2/glock.c notes that having multiple 'action' functions is useful as they display differently in the 'wchan' field of 'ps'. (and /proc/$PID/wchan). As the new bit_wait{,_io} functions are tagged "__sched", they will not show up at all, but something higher in the stack. So the distinction will still be visible, only with different function names (gds2_glock_wait versus gfs2_glock_dq_wait in the gfs2/glock.c case). Since first version of this patch (against 3.15) two new action functions appeared, on in NFS and one in CIFS. CIFS also now uses an action function that makes the same freezer aware schedule call as NFS. Signed-off-by: NeilBrown Acked-by: David Howells (fscache, keys) Acked-by: Steven Whitehouse (gfs2) Acked-by: Peter Zijlstra Cc: Oleg Nesterov Cc: Steve French Cc: Linus Torvalds Link: http://lkml.kernel.org/r/20140707051603.28027.72349.stgit@notabene.brown Signed-off-by: Ingo Molnar --- Documentation/filesystems/caching/operations.txt | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) (limited to 'Documentation/filesystems') diff --git a/Documentation/filesystems/caching/operations.txt b/Documentation/filesystems/caching/operations.txt index bee2a5f93d60..a1c052cbba35 100644 --- a/Documentation/filesystems/caching/operations.txt +++ b/Documentation/filesystems/caching/operations.txt @@ -90,7 +90,7 @@ operations: to be cleared before proceeding: wait_on_bit(&op->flags, FSCACHE_OP_WAITING, - fscache_wait_bit, TASK_UNINTERRUPTIBLE); + TASK_UNINTERRUPTIBLE); (2) The operation may be fast asynchronous (FSCACHE_OP_FAST), in which case it -- cgit v1.2.3 From 854d06d9fc350130617debdd9e9f49d1c4ba5206 Mon Sep 17 00:00:00 2001 From: Cyrill Gorcunov Date: Wed, 16 Jul 2014 01:54:53 +0400 Subject: docs: Procfs -- Document timerfd output Signed-off-by: Cyrill Gorcunov Cc: Andrew Morton Cc: Michael Kerrisk Cc: Andrey Vagin Cc: Pavel Emelyanov Cc: Vladimir Davydov Link: http://lkml.kernel.org/r/20140715215703.199905126@openvz.org Signed-off-by: Thomas Gleixner --- Documentation/filesystems/proc.txt | 19 +++++++++++++++++++ 1 file changed, 19 insertions(+) (limited to 'Documentation/filesystems') diff --git a/Documentation/filesystems/proc.txt b/Documentation/filesystems/proc.txt index ddc531a74d04..eb8a10e22f7c 100644 --- a/Documentation/filesystems/proc.txt +++ b/Documentation/filesystems/proc.txt @@ -1743,6 +1743,25 @@ pair provide additional information particular to the objects they represent. While the first three lines are mandatory and always printed, the rest is optional and may be omitted if no marks created yet. + Timerfd files + ~~~~~~~~~~~~~ + + pos: 0 + flags: 02 + mnt_id: 9 + clockid: 0 + ticks: 0 + settime flags: 01 + it_value: (0, 49406829) + it_interval: (1, 0) + + where 'clockid' is the clock type and 'ticks' is the number of the timer expirations + that have occurred [see timerfd_create(2) for details]. 'settime flags' are + flags in octal form been used to setup the timer [see timerfd_settime(2) for + details]. 'it_value' is remaining time until the timer exiration. + 'it_interval' is the interval for the timer. Note the timer might be set up + with TIMER_ABSTIME option which will be shown in 'settime flags', but 'it_value' + still exhibits timer's remaining time. ------------------------------------------------------------------------------ Configuring procfs -- cgit v1.2.3 From 0f7b2abd188089a44f60e2bf8521d1363ada9e12 Mon Sep 17 00:00:00 2001 From: Jaegeuk Kim Date: Wed, 23 Jul 2014 09:57:31 -0700 Subject: f2fs: add nobarrier mount option This patch adds a mount option, nobarrier, in f2fs. The assumption in here is that file system keeps the IO ordering, but doesn't care about cache flushes inside the storages. Reviewed-by: Chao Yu Signed-off-by: Jaegeuk Kim --- Documentation/filesystems/f2fs.txt | 5 +++++ fs/f2fs/data.c | 5 ++++- fs/f2fs/f2fs.h | 1 + fs/f2fs/segment.c | 3 +++ fs/f2fs/super.c | 7 +++++++ 5 files changed, 20 insertions(+), 1 deletion(-) (limited to 'Documentation/filesystems') diff --git a/Documentation/filesystems/f2fs.txt b/Documentation/filesystems/f2fs.txt index 51afba17bbae..a2046a7d0a9d 100644 --- a/Documentation/filesystems/f2fs.txt +++ b/Documentation/filesystems/f2fs.txt @@ -126,6 +126,11 @@ flush_merge Merge concurrent cache_flush commands as much as possible to eliminate redundant command issues. If the underlying device handles the cache_flush command relatively slowly, recommend to enable this option. +nobarrier This option can be used if underlying storage guarantees + its cached data should be written to the novolatile area. + If this option is set, no cache_flush commands are issued + but f2fs still guarantees the write ordering of all the + data writes. ================================================================================ DEBUGFS ENTRIES diff --git a/fs/f2fs/data.c b/fs/f2fs/data.c index c77c66723d70..482313dd9e24 100644 --- a/fs/f2fs/data.c +++ b/fs/f2fs/data.c @@ -139,7 +139,10 @@ void f2fs_submit_merged_bio(struct f2fs_sb_info *sbi, /* change META to META_FLUSH in the checkpoint procedure */ if (type >= META_FLUSH) { io->fio.type = META_FLUSH; - io->fio.rw = WRITE_FLUSH_FUA | REQ_META | REQ_PRIO; + if (test_opt(sbi, NOBARRIER)) + io->fio.rw = WRITE_FLUSH | REQ_META | REQ_PRIO; + else + io->fio.rw = WRITE_FLUSH_FUA | REQ_META | REQ_PRIO; } __submit_merged_bio(io); up_write(&io->io_rwsem); diff --git a/fs/f2fs/f2fs.h b/fs/f2fs/f2fs.h index 8f507d4c0f71..e999eec200b7 100644 --- a/fs/f2fs/f2fs.h +++ b/fs/f2fs/f2fs.h @@ -41,6 +41,7 @@ #define F2FS_MOUNT_INLINE_XATTR 0x00000080 #define F2FS_MOUNT_INLINE_DATA 0x00000100 #define F2FS_MOUNT_FLUSH_MERGE 0x00000200 +#define F2FS_MOUNT_NOBARRIER 0x00000400 #define clear_opt(sbi, option) (sbi->mount_opt.opt &= ~F2FS_MOUNT_##option) #define set_opt(sbi, option) (sbi->mount_opt.opt |= F2FS_MOUNT_##option) diff --git a/fs/f2fs/segment.c b/fs/f2fs/segment.c index 8a6e57d83c79..9fce0f47eb35 100644 --- a/fs/f2fs/segment.c +++ b/fs/f2fs/segment.c @@ -239,6 +239,9 @@ int f2fs_issue_flush(struct f2fs_sb_info *sbi) struct flush_cmd_control *fcc = SM_I(sbi)->cmd_control_info; struct flush_cmd cmd; + if (test_opt(sbi, NOBARRIER)) + return 0; + if (!test_opt(sbi, FLUSH_MERGE)) return blkdev_issue_flush(sbi->sb->s_bdev, GFP_KERNEL, NULL); diff --git a/fs/f2fs/super.c b/fs/f2fs/super.c index 34649aa66e04..93593ceec175 100644 --- a/fs/f2fs/super.c +++ b/fs/f2fs/super.c @@ -52,6 +52,7 @@ enum { Opt_inline_xattr, Opt_inline_data, Opt_flush_merge, + Opt_nobarrier, Opt_err, }; @@ -69,6 +70,7 @@ static match_table_t f2fs_tokens = { {Opt_inline_xattr, "inline_xattr"}, {Opt_inline_data, "inline_data"}, {Opt_flush_merge, "flush_merge"}, + {Opt_nobarrier, "nobarrier"}, {Opt_err, NULL}, }; @@ -339,6 +341,9 @@ static int parse_options(struct super_block *sb, char *options) case Opt_flush_merge: set_opt(sbi, FLUSH_MERGE); break; + case Opt_nobarrier: + set_opt(sbi, NOBARRIER); + break; default: f2fs_msg(sb, KERN_ERR, "Unrecognized mount option \"%s\" or missing value", @@ -544,6 +549,8 @@ static int f2fs_show_options(struct seq_file *seq, struct dentry *root) seq_puts(seq, ",inline_data"); if (!f2fs_readonly(sbi->sb) && test_opt(sbi, FLUSH_MERGE)) seq_puts(seq, ",flush_merge"); + if (test_opt(sbi, NOBARRIER)) + seq_puts(seq, ",nobarrier"); seq_printf(seq, ",active_logs=%u", sbi->active_logs); return 0; -- cgit v1.2.3 From 480e83275a0284a5fed75323aaa4b32547b14308 Mon Sep 17 00:00:00 2001 From: Steve French Date: Fri, 1 Aug 2014 15:46:22 -0500 Subject: Add Pavel to contributor list in cifs AUTHORS file Signed-off-by: Steve French CC: Pavel Shilovsky --- Documentation/filesystems/cifs/AUTHORS | 1 + 1 file changed, 1 insertion(+) (limited to 'Documentation/filesystems') diff --git a/Documentation/filesystems/cifs/AUTHORS b/Documentation/filesystems/cifs/AUTHORS index ca4a67a0bb1e..c98800df677f 100644 --- a/Documentation/filesystems/cifs/AUTHORS +++ b/Documentation/filesystems/cifs/AUTHORS @@ -40,6 +40,7 @@ Gunter Kukkukk (testing and suggestions for support of old servers) Igor Mammedov (DFS support) Jeff Layton (many, many fixes, as well as great work on the cifs Kerberos code) Scott Lovenberg +Pavel Shilovsky (for great work adding SMB2 support, and various SMB3 features) Test case and Bug Report contributors ------------------------------------- -- cgit v1.2.3 From 2075cf0b71bfffc4942e232f1487ac98bdfb7bf5 Mon Sep 17 00:00:00 2001 From: Steve French Date: Fri, 1 Aug 2014 16:00:01 -0500 Subject: update CIFS TODO list Signed-off-by: Steve French --- Documentation/filesystems/cifs/TODO | 97 ++++++++++++++----------------------- 1 file changed, 36 insertions(+), 61 deletions(-) (limited to 'Documentation/filesystems') diff --git a/Documentation/filesystems/cifs/TODO b/Documentation/filesystems/cifs/TODO index 355abcdcda98..066ffddc3964 100644 --- a/Documentation/filesystems/cifs/TODO +++ b/Documentation/filesystems/cifs/TODO @@ -1,4 +1,4 @@ -Version 1.53 May 20, 2008 +Version 2.03 August 1, 2014 A Partial List of Missing Features ================================== @@ -7,63 +7,49 @@ Contributions are welcome. There are plenty of opportunities for visible, important contributions to this module. Here is a partial list of the known problems and missing features: -a) Support for SecurityDescriptors(Windows/CIFS ACLs) for chmod/chgrp/chown -so that these operations can be supported to Windows servers +a) SMB3 (and SMB3.02) missing optional features: + - RDMA + - multichannel (started) + - directory leases (improved metadata caching) + - T10 copy offload (copy chunk is only mechanism supported) + - encrypted shares -b) Mapping POSIX ACLs (and eventually NFSv4 ACLs) to CIFS -SecurityDescriptors +b) improved sparse file support -c) Better pam/winbind integration (e.g. to handle uid mapping -better) - -d) Cleanup now unneeded SessSetup code in -fs/cifs/connect.c and add back in NTLMSSP code if any servers -need it - -e) fix NTLMv2 signing when two mounts with different users to same -server. - -f) Directory entry caching relies on a 1 second timer, rather than +c) Directory entry caching relies on a 1 second timer, rather than using FindNotify or equivalent. - (started) -g) quota support (needs minor kernel change since quota calls +d) quota support (needs minor kernel change since quota calls to make it to network filesystems or deviceless filesystems) -h) investigate sync behavior (including syncpage) and check -for proper behavior of intr/nointr - -i) improve support for very old servers (OS/2 and Win9x for example) +e) improve support for very old servers (OS/2 and Win9x for example) Including support for changing the time remotely (utimes command). -j) hook lower into the sockets api (as NFS/SunRPC does) to avoid the +f) hook lower into the sockets api (as NFS/SunRPC does) to avoid the extra copy in/out of the socket buffers in some cases. -k) Better optimize open (and pathbased setfilesize) to reduce the +g) Better optimize open (and pathbased setfilesize) to reduce the oplock breaks coming from windows srv. Piggyback identical file opens on top of each other by incrementing reference count rather than resending (helps reduce server resource utilization and avoid spurious oplock breaks). -l) Improve performance of readpages by sending more than one read -at a time when 8 pages or more are requested. In conjuntion -add support for async_cifs_readpages. - -m) Add support for storing symlink info to Windows servers +h) Add support for storing symlink info to Windows servers in the Extended Attribute format their SFU clients would recognize. -n) Finish fcntl D_NOTIFY support so kde and gnome file list windows +i) Finish inotify support so kde and gnome file list windows will autorefresh (partially complete by Asser). Needs minor kernel vfs change to support removing D_NOTIFY on a file. -o) Add GUI tool to configure /proc/fs/cifs settings and for display of +j) Add GUI tool to configure /proc/fs/cifs settings and for display of the CIFS statistics (started) -p) implement support for security and trusted categories of xattrs +k) implement support for security and trusted categories of xattrs (requires minor protocol extension) to enable better support for SELINUX -q) Implement O_DIRECT flag on open (already supported on mount) +l) Implement O_DIRECT flag on open (already supported on mount) -r) Create UID mapping facility so server UIDs can be mapped on a per +m) Create UID mapping facility so server UIDs can be mapped on a per mount or a per server basis to client UIDs or nobody if no mapping exists. This is helpful when Unix extensions are negotiated to allow better permission checking when UIDs differ on the server @@ -71,28 +57,29 @@ and client. Add new protocol request to the CIFS protocol standard for asking the server for the corresponding name of a particular uid. -s) Add support for CIFS Unix and also the newer POSIX extensions to the -server side for Samba 4. +n) DOS attrs - returned as pseudo-xattr in Samba format (check VFAT and NTFS for this too) + +o) mount check for unmatched uids -t) In support for OS/2 (LANMAN 1.2 and LANMAN2.1 based SMB servers) -need to add ability to set time to server (utimes command) +p) Add support for new vfs entry point for fallocate -u) DOS attrs - returned as pseudo-xattr in Samba format (check VFAT and NTFS for this too) +q) Add tools to take advantage of cifs/smb3 specific ioctls and features +such as "CopyChunk" (fast server side file copy) -v) mount check for unmatched uids +r) encrypted file support -w) Add support for new vfs entry point for fallocate +s) improved stats gathering, tools (perhaps integration with nfsometer?) -x) Fix Samba 3 server to handle Linux kernel aio so dbench with lots of -processes can proceed better in parallel (on the server) +t) allow setting more NTFS/SMB3 file attributes remotely (currently limited to compressed +file attribute via chflags) -y) Fix Samba 3 to handle reads/writes over 127K (and remove the cifs mount -restriction of wsize max being 127K) +u) mount helper GUI (to simplify the various configuration options on mount) -KNOWN BUGS (updated April 24, 2007) + +KNOWN BUGS ==================================== See http://bugzilla.samba.org - search on product "CifsVFS" for -current bug list. +current bug list. Also check http://bugzilla.kernel.org (Product = File System, Component = CIFS) 1) existing symbolic links (Windows reparse points) are recognized but can not be created remotely. They are implemented for Samba and those that @@ -100,30 +87,18 @@ support the CIFS Unix extensions, although earlier versions of Samba overly restrict the pathnames. 2) follow_link and readdir code does not follow dfs junctions but recognizes them -3) create of new files to FAT partitions on Windows servers can -succeed but still return access denied (appears to be Windows -server not cifs client problem) and has not been reproduced recently. -NTFS partitions do not have this problem. -4) Unix/POSIX capabilities are reset after reconnection, and affect -a few fields in the tree connection but we do do not know which -superblocks to apply these changes to. We should probably walk -the list of superblocks to set these. Also need to check the -flags on the second mount to the same share, and see if we -can do the same trick that NFS does to remount duplicate shares. Misc testing to do ================== 1) check out max path names and max path name components against various server types. Try nested symlinks (8 deep). Return max path name in stat -f information -2) Modify file portion of ltp so it can run against a mounted network -share and run it against cifs vfs in automated fashion. +2) Improve xfstest's cifs enablement and adapt xfstests where needed to test +cifs better 3) Additional performance testing and optimization using iozone and similar - there are some easy changes that can be done to parallelize sequential writes, and when signing is disabled to request larger read sizes (larger than negotiated size) and send larger write sizes to modern servers. -4) More exhaustively test against less common servers. More testing -against Windows 9x, Windows ME servers. - +4) More exhaustively test against less common servers -- cgit v1.2.3 From b8faf035ea9d0011b04856a8198e2e212d93346a Mon Sep 17 00:00:00 2001 From: NeilBrown Date: Mon, 4 Aug 2014 17:06:29 +1000 Subject: VFS: allow ->d_manage() to declare -EISDIR in rcu_walk mode. In REF-walk mode, ->d_manage can return -EISDIR to indicate that the dentry is not really a mount trap (or even a mount point) and that any mounts or any DCACHE_NEED_AUTOMOUNT flag should be ignored. RCU-walk mode doesn't currently support this, so if there is a dentry with DCACHE_NEED_AUTOMOUNT set but which shouldn't be a mount-trap, lookup_fast() will always drop in REF-walk mode. With this patch, an -EISDIR from ->d_manage will always cause mounts and automounts to be ignored, both in REF-walk and RCU-walk. Bug-fixed-by: Dan Carpenter Cc: Ian Kent Signed-off-by: NeilBrown Signed-off-by: Al Viro --- Documentation/filesystems/vfs.txt | 3 ++- fs/namei.c | 27 ++++++++++++++++----------- 2 files changed, 18 insertions(+), 12 deletions(-) (limited to 'Documentation/filesystems') diff --git a/Documentation/filesystems/vfs.txt b/Documentation/filesystems/vfs.txt index a1d0d7a30165..61d65cc65c54 100644 --- a/Documentation/filesystems/vfs.txt +++ b/Documentation/filesystems/vfs.txt @@ -1053,7 +1053,8 @@ struct dentry_operations { If the 'rcu_walk' parameter is true, then the caller is doing a pathwalk in RCU-walk mode. Sleeping is not permitted in this mode, and the caller can be asked to leave it and call again by returning - -ECHILD. + -ECHILD. -EISDIR may also be returned to tell pathwalk to + ignore d_automount or any mounts. This function is only used if DCACHE_MANAGE_TRANSIT is set on the dentry being transited from. diff --git a/fs/namei.c b/fs/namei.c index 0ff23cecb1bb..8a217c48f6db 100644 --- a/fs/namei.c +++ b/fs/namei.c @@ -1091,10 +1091,10 @@ int follow_down_one(struct path *path) } EXPORT_SYMBOL(follow_down_one); -static inline bool managed_dentry_might_block(struct dentry *dentry) +static inline int managed_dentry_rcu(struct dentry *dentry) { - return (dentry->d_flags & DCACHE_MANAGE_TRANSIT && - dentry->d_op->d_manage(dentry, true) < 0); + return (dentry->d_flags & DCACHE_MANAGE_TRANSIT) ? + dentry->d_op->d_manage(dentry, true) : 0; } /* @@ -1110,11 +1110,18 @@ static bool __follow_mount_rcu(struct nameidata *nd, struct path *path, * Don't forget we might have a non-mountpoint managed dentry * that wants to block transit. */ - if (unlikely(managed_dentry_might_block(path->dentry))) + switch (managed_dentry_rcu(path->dentry)) { + case -ECHILD: + default: return false; + case -EISDIR: + return true; + case 0: + break; + } if (!d_mountpoint(path->dentry)) - return true; + return !(path->dentry->d_flags & DCACHE_NEED_AUTOMOUNT); mounted = __lookup_mnt(path->mnt, path->dentry); if (!mounted) @@ -1130,7 +1137,8 @@ static bool __follow_mount_rcu(struct nameidata *nd, struct path *path, */ *inode = path->dentry->d_inode; } - return read_seqretry(&mount_lock, nd->m_seq); + return read_seqretry(&mount_lock, nd->m_seq) && + !(path->dentry->d_flags & DCACHE_NEED_AUTOMOUNT); } static int follow_dotdot_rcu(struct nameidata *nd) @@ -1402,11 +1410,8 @@ static int lookup_fast(struct nameidata *nd, } path->mnt = mnt; path->dentry = dentry; - if (unlikely(!__follow_mount_rcu(nd, path, inode))) - goto unlazy; - if (unlikely(path->dentry->d_flags & DCACHE_NEED_AUTOMOUNT)) - goto unlazy; - return 0; + if (likely(__follow_mount_rcu(nd, path, inode))) + return 0; unlazy: if (unlazy_walk(nd, dentry)) return -ECHILD; -- cgit v1.2.3 From 9635389534a765c8357d988e53d323bf033dcdd0 Mon Sep 17 00:00:00 2001 From: "J. Bruce Fields" Date: Tue, 18 Feb 2014 12:31:31 -0500 Subject: exportfs: update Exporting documentation Minor documentation updates: - refer to d_obtain_alias rather than d_alloc_anon - explain when to use d_splice_alias and when d_materialise_unique. - cut some details of d_splice_alias/d_materialise_unique implementation. Signed-off-by: J. Bruce Fields Signed-off-by: Al Viro --- Documentation/filesystems/nfs/Exporting | 38 ++++++++++++++++++++------------- 1 file changed, 23 insertions(+), 15 deletions(-) (limited to 'Documentation/filesystems') diff --git a/Documentation/filesystems/nfs/Exporting b/Documentation/filesystems/nfs/Exporting index e543b1a619cc..c8f036a9b13f 100644 --- a/Documentation/filesystems/nfs/Exporting +++ b/Documentation/filesystems/nfs/Exporting @@ -66,23 +66,31 @@ b/ A per-superblock list "s_anon" of dentries which are the roots of c/ Helper routines to allocate anonymous dentries, and to help attach loose directory dentries at lookup time. They are: - d_alloc_anon(inode) will return a dentry for the given inode. + d_obtain_alias(inode) will return a dentry for the given inode. If the inode already has a dentry, one of those is returned. If it doesn't, a new anonymous (IS_ROOT and DCACHE_DISCONNECTED) dentry is allocated and attached. In the case of a directory, care is taken that only one dentry can ever be attached. - d_splice_alias(inode, dentry) will make sure that there is a - dentry with the same name and parent as the given dentry, and - which refers to the given inode. - If the inode is a directory and already has a dentry, then that - dentry is d_moved over the given dentry. - If the passed dentry gets attached, care is taken that this is - mutually exclusive to a d_alloc_anon operation. - If the passed dentry is used, NULL is returned, else the used - dentry is returned. This corresponds to the calling pattern of - ->lookup. - + d_splice_alias(inode, dentry) or d_materialise_unique(dentry, inode) + will introduce a new dentry into the tree; either the passed-in + dentry or a preexisting alias for the given inode (such as an + anonymous one created by d_obtain_alias), if appropriate. The two + functions differ in their handling of directories with preexisting + aliases: + d_splice_alias will use any existing IS_ROOT dentry, but it will + return -EIO rather than try to move a dentry with a different + parent. This is appropriate for local filesystems, which + should never see such an alias unless the filesystem is + corrupted somehow (for example, if two on-disk directory + entries refer to the same directory.) + d_materialise_unique will attempt to move any dentry. This is + appropriate for distributed filesystems, where finding a + directory other than where we last cached it may be a normal + consequence of concurrent operations on other hosts. + Both functions return NULL when the passed-in dentry is used, + following the calling convention of ->lookup. + Filesystem Issues ----------------- @@ -120,12 +128,12 @@ struct which has the following members: fh_to_dentry (mandatory) Given a filehandle fragment, this should find the implied object and - create a dentry for it (possibly with d_alloc_anon). + create a dentry for it (possibly with d_obtain_alias). fh_to_parent (optional but strongly recommended) Given a filehandle fragment, this should find the parent of the - implied object and create a dentry for it (possibly with d_alloc_anon). - May fail if the filehandle fragment is too small. + implied object and create a dentry for it (possibly with + d_obtain_alias). May fail if the filehandle fragment is too small. get_parent (optional but strongly recommended) When given a dentry for a directory, this should return a dentry for -- cgit v1.2.3 From 2ece173e4715031c031de9114491eee80a69cf68 Mon Sep 17 00:00:00 2001 From: Jeff Layton Date: Tue, 12 Aug 2014 10:38:07 -0400 Subject: locks: update Locking documentation to clarify fl_release_private behavior Acked-by: J. Bruce Fields Signed-off-by: Jeff Layton --- Documentation/filesystems/Locking | 6 +++++- 1 file changed, 5 insertions(+), 1 deletion(-) (limited to 'Documentation/filesystems') diff --git a/Documentation/filesystems/Locking b/Documentation/filesystems/Locking index b18dd1779029..f1997e9da61f 100644 --- a/Documentation/filesystems/Locking +++ b/Documentation/filesystems/Locking @@ -349,7 +349,11 @@ prototypes: locking rules: inode->i_lock may block fl_copy_lock: yes no -fl_release_private: maybe no +fl_release_private: maybe maybe[1] + +[1]: ->fl_release_private for flock or POSIX locks is currently allowed +to block. Leases however can still be freed while the i_lock is held and +so fl_release_private called on a lease should not block. ----------------------- lock_manager_operations --------------------------- prototypes: -- cgit v1.2.3 From 77be4daf4e65eb1da70e6623ec61ecde62f5de95 Mon Sep 17 00:00:00 2001 From: Rob Jones Date: Sun, 7 Sep 2014 11:24:40 -0700 Subject: Documentation: seq_file: Document seq_open_private(), seq_release_private() Despite the fact that these functions have been around for years, they are little used (only 15 uses in 13 files at the preseht time) even though many other files use work-arounds to achieve the same result. By documenting them, hopefully they will become more widely used. Signed-off-by: Rob Jones Acked-by: Steven Whitehouse Signed-off-by: Randy Dunlap Signed-off-by: Linus Torvalds --- Documentation/filesystems/seq_file.txt | 33 +++++++++++++++++++++++++++++++++ 1 file changed, 33 insertions(+) (limited to 'Documentation/filesystems') diff --git a/Documentation/filesystems/seq_file.txt b/Documentation/filesystems/seq_file.txt index 1fe0ccb1af55..8ea3e90ace07 100644 --- a/Documentation/filesystems/seq_file.txt +++ b/Documentation/filesystems/seq_file.txt @@ -235,6 +235,39 @@ be used for more than one file, you can store an arbitrary pointer in the private field of the seq_file structure; that value can then be retrieved by the iterator functions. +There is also a wrapper function to seq_open() called seq_open_private(). It +kmallocs a zero filled block of memory and stores a pointer to it in the +private field of the seq_file structure, returning 0 on success. The +block size is specified in a third parameter to the function, e.g.: + + static int ct_open(struct inode *inode, struct file *file) + { + return seq_open_private(file, &ct_seq_ops, + sizeof(struct mystruct)); + } + +There is also a variant function, __seq_open_private(), which is functionally +identical except that, if successful, it returns the pointer to the allocated +memory block, allowing further initialisation e.g.: + + static int ct_open(struct inode *inode, struct file *file) + { + struct mystruct *p = + __seq_open_private(file, &ct_seq_ops, sizeof(*p)); + + if (!p) + return -ENOMEM; + + p->foo = bar; /* initialize my stuff */ + ... + p->baz = true; + + return 0; + } + +A corresponding close function, seq_release_private() is available which +frees the memory allocated in the corresponding open. + The other operations of interest - read(), llseek(), and release() - are all implemented by the seq_file code itself. So a virtual file's file_operations structure will look like: -- cgit v1.2.3 From 731d5cca82729c85ca3296902a64836619f4ba2d Mon Sep 17 00:00:00 2001 From: Paul Bolle Date: Sun, 7 Sep 2014 11:25:55 -0700 Subject: Documentation: NFS/RDMA: Document separate Kconfig symbols The NFS/RDMA Kconfig symbol was split into separate options for client and server in commit 2e8c12e1b765 ("xprtrdma: add separate Kconfig options for NFSoRDMA client and server support"). Update the documentation to reflect this split. Signed-off-by: Paul Bolle Reviewed-by: Jeff Layton Signed-off-by: Randy Dunlap Cc: "J. Bruce Fields" Signed-off-by: Linus Torvalds --- Documentation/filesystems/nfs/nfs-rdma.txt | 16 +++++++++------- 1 file changed, 9 insertions(+), 7 deletions(-) (limited to 'Documentation/filesystems') diff --git a/Documentation/filesystems/nfs/nfs-rdma.txt b/Documentation/filesystems/nfs/nfs-rdma.txt index e386f7e4bcee..724043858b08 100644 --- a/Documentation/filesystems/nfs/nfs-rdma.txt +++ b/Documentation/filesystems/nfs/nfs-rdma.txt @@ -138,9 +138,9 @@ Installation - Build, install, reboot The NFS/RDMA code will be enabled automatically if NFS and RDMA - are turned on. The NFS/RDMA client and server are configured via the hidden - SUNRPC_XPRT_RDMA config option that depends on SUNRPC and INFINIBAND. The - value of SUNRPC_XPRT_RDMA will be: + are turned on. The NFS/RDMA client and server are configured via the + SUNRPC_XPRT_RDMA_CLIENT and SUNRPC_XPRT_RDMA_SERVER config options that both + depend on SUNRPC and INFINIBAND. The default value of both options will be: - N if either SUNRPC or INFINIBAND are N, in this case the NFS/RDMA client and server will not be built @@ -235,8 +235,9 @@ NFS/RDMA Setup - Start the NFS server - If the NFS/RDMA server was built as a module (CONFIG_SUNRPC_XPRT_RDMA=m in - kernel config), load the RDMA transport module: + If the NFS/RDMA server was built as a module + (CONFIG_SUNRPC_XPRT_RDMA_SERVER=m in kernel config), load the RDMA + transport module: $ modprobe svcrdma @@ -255,8 +256,9 @@ NFS/RDMA Setup - On the client system - If the NFS/RDMA client was built as a module (CONFIG_SUNRPC_XPRT_RDMA=m in - kernel config), load the RDMA client module: + If the NFS/RDMA client was built as a module + (CONFIG_SUNRPC_XPRT_RDMA_CLIENT=m in kernel config), load the RDMA client + module: $ modprobe xprtrdma.ko -- cgit v1.2.3