summaryrefslogtreecommitdiff
path: root/Documentation
diff options
context:
space:
mode:
Diffstat (limited to 'Documentation')
-rw-r--r--Documentation/00-INDEX6
-rw-r--r--Documentation/ABI/removed/o2cb (renamed from Documentation/ABI/obsolete/o2cb)9
-rw-r--r--Documentation/ABI/testing/sysfs-kernel-mm-cleancache11
-rw-r--r--Documentation/DocBook/dvb/dvbproperty.xml5
-rw-r--r--Documentation/DocBook/media-entities.tmpl7
-rw-r--r--Documentation/DocBook/mtdnand.tmpl3
-rw-r--r--Documentation/DocBook/v4l/media-controller.xml6
-rw-r--r--Documentation/DocBook/v4l/pixfmt.xml1
-rw-r--r--Documentation/DocBook/v4l/subdev-formats.xml10
-rw-r--r--Documentation/RCU/trace.txt17
-rw-r--r--Documentation/accounting/getdelays.c38
-rw-r--r--Documentation/acpi/method-customizing.txt5
-rw-r--r--Documentation/arm/Booting33
-rw-r--r--Documentation/arm/Samsung/Overview.txt2
-rw-r--r--Documentation/atomic_ops.txt2
-rw-r--r--Documentation/cgroups/cgroups.txt41
-rw-r--r--Documentation/devicetree/booting-without-of.txt48
-rw-r--r--Documentation/dmaengine.txt97
-rw-r--r--Documentation/feature-removal-schedule.txt46
-rw-r--r--Documentation/filesystems/Locking4
-rw-r--r--Documentation/filesystems/configfs/configfs_example_explicit.c6
-rw-r--r--Documentation/filesystems/configfs/configfs_example_macros.c6
-rw-r--r--Documentation/filesystems/ext4.txt4
-rw-r--r--Documentation/filesystems/nfs/idmapper.txt4
-rw-r--r--Documentation/filesystems/ocfs2.txt8
-rw-r--r--Documentation/filesystems/vfs.txt2
-rw-r--r--Documentation/filesystems/xfs.txt6
-rw-r--r--Documentation/kernel-parameters.txt7
-rw-r--r--Documentation/laptops/acer-wmi.txt184
-rw-r--r--Documentation/lockstat.txt36
-rw-r--r--Documentation/networking/dns_resolver.txt4
-rw-r--r--Documentation/power/regulator/machine.txt4
-rw-r--r--Documentation/scsi/ChangeLog.megaraid_sas14
-rw-r--r--Documentation/security/00-INDEX18
-rw-r--r--Documentation/security/SELinux.txt (renamed from Documentation/SELinux.txt)0
-rw-r--r--Documentation/security/Smack.txt (renamed from Documentation/Smack.txt)0
-rw-r--r--Documentation/security/apparmor.txt (renamed from Documentation/apparmor.txt)0
-rw-r--r--Documentation/security/credentials.txt (renamed from Documentation/credentials.txt)2
-rw-r--r--Documentation/security/keys-request-key.txt (renamed from Documentation/keys-request-key.txt)4
-rw-r--r--Documentation/security/keys-trusted-encrypted.txt (renamed from Documentation/keys-trusted-encrypted.txt)0
-rw-r--r--Documentation/security/keys.txt (renamed from Documentation/keys.txt)4
-rw-r--r--Documentation/security/tomoyo.txt (renamed from Documentation/tomoyo.txt)0
-rw-r--r--Documentation/sysctl/kernel.txt3
-rw-r--r--Documentation/virtual/lguest/Makefile2
-rw-r--r--Documentation/virtual/lguest/lguest.c22
-rw-r--r--Documentation/vm/cleancache.txt278
46 files changed, 690 insertions, 319 deletions
diff --git a/Documentation/00-INDEX b/Documentation/00-INDEX
index 1b777b960492..1f89424c36a6 100644
--- a/Documentation/00-INDEX
+++ b/Documentation/00-INDEX
@@ -192,10 +192,6 @@ kernel-docs.txt
- listing of various WWW + books that document kernel internals.
kernel-parameters.txt
- summary listing of command line / boot prompt args for the kernel.
-keys-request-key.txt
- - description of the kernel key request service.
-keys.txt
- - description of the kernel key retention service.
kobject.txt
- info of the kobject infrastructure of the Linux kernel.
kprobes.txt
@@ -294,6 +290,8 @@ scheduler/
- directory with info on the scheduler.
scsi/
- directory with info on Linux scsi support.
+security/
+ - directory that contains security-related info
serial/
- directory with info on the low level serial API.
serial-console.txt
diff --git a/Documentation/ABI/obsolete/o2cb b/Documentation/ABI/removed/o2cb
index 9c49d8e6c0cc..7f5daa465093 100644
--- a/Documentation/ABI/obsolete/o2cb
+++ b/Documentation/ABI/removed/o2cb
@@ -1,11 +1,10 @@
What: /sys/o2cb symlink
-Date: Dec 2005
-KernelVersion: 2.6.16
+Date: May 2011
+KernelVersion: 2.6.40
Contact: ocfs2-devel@oss.oracle.com
-Description: This is a symlink: /sys/o2cb to /sys/fs/o2cb. The symlink will
- be removed when new versions of ocfs2-tools which know to look
+Description: This is a symlink: /sys/o2cb to /sys/fs/o2cb. The symlink is
+ removed when new versions of ocfs2-tools which know to look
in /sys/fs/o2cb are sufficiently prevalent. Don't code new
software to look here, it should try /sys/fs/o2cb instead.
- See Documentation/ABI/stable/o2cb for more information on usage.
Users: ocfs2-tools. It's sufficient to mail proposed changes to
ocfs2-devel@oss.oracle.com.
diff --git a/Documentation/ABI/testing/sysfs-kernel-mm-cleancache b/Documentation/ABI/testing/sysfs-kernel-mm-cleancache
new file mode 100644
index 000000000000..662ae646ea12
--- /dev/null
+++ b/Documentation/ABI/testing/sysfs-kernel-mm-cleancache
@@ -0,0 +1,11 @@
+What: /sys/kernel/mm/cleancache/
+Date: April 2011
+Contact: Dan Magenheimer <dan.magenheimer@oracle.com>
+Description:
+ /sys/kernel/mm/cleancache/ contains a number of files which
+ record a count of various cleancache operations
+ (sum across all filesystems):
+ succ_gets
+ failed_gets
+ puts
+ flushes
diff --git a/Documentation/DocBook/dvb/dvbproperty.xml b/Documentation/DocBook/dvb/dvbproperty.xml
index 52d5e3c7cf6c..b5365f61d69b 100644
--- a/Documentation/DocBook/dvb/dvbproperty.xml
+++ b/Documentation/DocBook/dvb/dvbproperty.xml
@@ -141,13 +141,15 @@ struct dtv_properties {
</row></tbody></tgroup></informaltable>
</section>
+<section>
+ <title>Property types</title>
<para>
On <link linkend="FE_GET_PROPERTY">FE_GET_PROPERTY</link>/<link linkend="FE_SET_PROPERTY">FE_SET_PROPERTY</link>,
the actual action is determined by the dtv_property cmd/data pairs. With one single ioctl, is possible to
get/set up to 64 properties. The actual meaning of each property is described on the next sections.
</para>
-<para>The Available frontend property types are:</para>
+<para>The available frontend property types are:</para>
<programlisting>
#define DTV_UNDEFINED 0
#define DTV_TUNE 1
@@ -193,6 +195,7 @@ get/set up to 64 properties. The actual meaning of each property is described on
#define DTV_ISDBT_LAYER_ENABLED 41
#define DTV_ISDBS_TS_ID 42
</programlisting>
+</section>
<section id="fe_property_common">
<title>Parameters that are common to all Digital TV standards</title>
diff --git a/Documentation/DocBook/media-entities.tmpl b/Documentation/DocBook/media-entities.tmpl
index c8abb23ef1e7..e5fe09430fd9 100644
--- a/Documentation/DocBook/media-entities.tmpl
+++ b/Documentation/DocBook/media-entities.tmpl
@@ -293,6 +293,7 @@
<!ENTITY sub-yuyv SYSTEM "v4l/pixfmt-yuyv.xml">
<!ENTITY sub-yvyu SYSTEM "v4l/pixfmt-yvyu.xml">
<!ENTITY sub-srggb10 SYSTEM "v4l/pixfmt-srggb10.xml">
+<!ENTITY sub-srggb12 SYSTEM "v4l/pixfmt-srggb12.xml">
<!ENTITY sub-srggb8 SYSTEM "v4l/pixfmt-srggb8.xml">
<!ENTITY sub-y10 SYSTEM "v4l/pixfmt-y10.xml">
<!ENTITY sub-y12 SYSTEM "v4l/pixfmt-y12.xml">
@@ -373,9 +374,9 @@
<!ENTITY sub-media-indices SYSTEM "media-indices.tmpl">
<!ENTITY sub-media-controller SYSTEM "v4l/media-controller.xml">
-<!ENTITY sub-media-open SYSTEM "v4l/media-func-open.xml">
-<!ENTITY sub-media-close SYSTEM "v4l/media-func-close.xml">
-<!ENTITY sub-media-ioctl SYSTEM "v4l/media-func-ioctl.xml">
+<!ENTITY sub-media-func-open SYSTEM "v4l/media-func-open.xml">
+<!ENTITY sub-media-func-close SYSTEM "v4l/media-func-close.xml">
+<!ENTITY sub-media-func-ioctl SYSTEM "v4l/media-func-ioctl.xml">
<!ENTITY sub-media-ioc-device-info SYSTEM "v4l/media-ioc-device-info.xml">
<!ENTITY sub-media-ioc-enum-entities SYSTEM "v4l/media-ioc-enum-entities.xml">
<!ENTITY sub-media-ioc-enum-links SYSTEM "v4l/media-ioc-enum-links.xml">
diff --git a/Documentation/DocBook/mtdnand.tmpl b/Documentation/DocBook/mtdnand.tmpl
index 6f242d5dee9a..17910e2052ad 100644
--- a/Documentation/DocBook/mtdnand.tmpl
+++ b/Documentation/DocBook/mtdnand.tmpl
@@ -189,8 +189,7 @@ static void __iomem *baseaddr;
<title>Partition defines</title>
<para>
If you want to divide your device into partitions, then
- enable the configuration switch CONFIG_MTD_PARTITIONS and define
- a partitioning scheme suitable to your board.
+ define a partitioning scheme suitable to your board.
</para>
<programlisting>
#define NUM_PARTITIONS 2
diff --git a/Documentation/DocBook/v4l/media-controller.xml b/Documentation/DocBook/v4l/media-controller.xml
index 2dc25e1d4089..873ac3a621f0 100644
--- a/Documentation/DocBook/v4l/media-controller.xml
+++ b/Documentation/DocBook/v4l/media-controller.xml
@@ -78,9 +78,9 @@
<appendix id="media-user-func">
<title>Function Reference</title>
<!-- Keep this alphabetically sorted. -->
- &sub-media-open;
- &sub-media-close;
- &sub-media-ioctl;
+ &sub-media-func-open;
+ &sub-media-func-close;
+ &sub-media-func-ioctl;
<!-- All ioctls go here. -->
&sub-media-ioc-device-info;
&sub-media-ioc-enum-entities;
diff --git a/Documentation/DocBook/v4l/pixfmt.xml b/Documentation/DocBook/v4l/pixfmt.xml
index dbfe3b08435f..deb660207f94 100644
--- a/Documentation/DocBook/v4l/pixfmt.xml
+++ b/Documentation/DocBook/v4l/pixfmt.xml
@@ -673,6 +673,7 @@ access the palette, this must be done with ioctls of the Linux framebuffer API.<
&sub-srggb8;
&sub-sbggr16;
&sub-srggb10;
+ &sub-srggb12;
</section>
<section id="yuv-formats">
diff --git a/Documentation/DocBook/v4l/subdev-formats.xml b/Documentation/DocBook/v4l/subdev-formats.xml
index a26b10c07857..8d3409d2c632 100644
--- a/Documentation/DocBook/v4l/subdev-formats.xml
+++ b/Documentation/DocBook/v4l/subdev-formats.xml
@@ -2531,13 +2531,13 @@
<constant>_JPEG</constant> prefix the format code is made of
the following information.
<itemizedlist>
- <listitem>The number of bus samples per entropy encoded byte.</listitem>
- <listitem>The bus width.</listitem>
+ <listitem><para>The number of bus samples per entropy encoded byte.</para></listitem>
+ <listitem><para>The bus width.</para></listitem>
</itemizedlist>
+ </para>
- <para>For instance, for a JPEG baseline process and an 8-bit bus width
- the format will be named <constant>V4L2_MBUS_FMT_JPEG_1X8</constant>.
- </para>
+ <para>For instance, for a JPEG baseline process and an 8-bit bus width
+ the format will be named <constant>V4L2_MBUS_FMT_JPEG_1X8</constant>.
</para>
<para>The following table lists existing JPEG compressed formats.</para>
diff --git a/Documentation/RCU/trace.txt b/Documentation/RCU/trace.txt
index c078ad48f7a1..8173cec473aa 100644
--- a/Documentation/RCU/trace.txt
+++ b/Documentation/RCU/trace.txt
@@ -99,18 +99,11 @@ o "qp" indicates that RCU still expects a quiescent state from
o "dt" is the current value of the dyntick counter that is incremented
when entering or leaving dynticks idle state, either by the
- scheduler or by irq. The number after the "/" is the interrupt
- nesting depth when in dyntick-idle state, or one greater than
- the interrupt-nesting depth otherwise.
-
- This field is displayed only for CONFIG_NO_HZ kernels.
-
-o "dn" is the current value of the dyntick counter that is incremented
- when entering or leaving dynticks idle state via NMI. If both
- the "dt" and "dn" values are even, then this CPU is in dynticks
- idle mode and may be ignored by RCU. If either of these two
- counters is odd, then RCU must be alert to the possibility of
- an RCU read-side critical section running on this CPU.
+ scheduler or by irq. This number is even if the CPU is in
+ dyntick idle mode and odd otherwise. The number after the first
+ "/" is the interrupt nesting depth when in dyntick-idle state,
+ or one greater than the interrupt-nesting depth otherwise.
+ The number after the second "/" is the NMI nesting depth.
This field is displayed only for CONFIG_NO_HZ kernels.
diff --git a/Documentation/accounting/getdelays.c b/Documentation/accounting/getdelays.c
index e9c77788a39d..f6318f6d7baf 100644
--- a/Documentation/accounting/getdelays.c
+++ b/Documentation/accounting/getdelays.c
@@ -177,6 +177,8 @@ static int get_family_id(int sd)
rc = send_cmd(sd, GENL_ID_CTRL, getpid(), CTRL_CMD_GETFAMILY,
CTRL_ATTR_FAMILY_NAME, (void *)name,
strlen(TASKSTATS_GENL_NAME)+1);
+ if (rc < 0)
+ return 0; /* sendto() failure? */
rep_len = recv(sd, &ans, sizeof(ans), 0);
if (ans.n.nlmsg_type == NLMSG_ERROR ||
@@ -191,30 +193,37 @@ static int get_family_id(int sd)
return id;
}
+#define average_ms(t, c) (t / 1000000ULL / (c ? c : 1))
+
static void print_delayacct(struct taskstats *t)
{
- printf("\n\nCPU %15s%15s%15s%15s\n"
- " %15llu%15llu%15llu%15llu\n"
- "IO %15s%15s\n"
- " %15llu%15llu\n"
- "SWAP %15s%15s\n"
- " %15llu%15llu\n"
- "RECLAIM %12s%15s\n"
- " %15llu%15llu\n",
- "count", "real total", "virtual total", "delay total",
+ printf("\n\nCPU %15s%15s%15s%15s%15s\n"
+ " %15llu%15llu%15llu%15llu%15.3fms\n"
+ "IO %15s%15s%15s\n"
+ " %15llu%15llu%15llums\n"
+ "SWAP %15s%15s%15s\n"
+ " %15llu%15llu%15llums\n"
+ "RECLAIM %12s%15s%15s\n"
+ " %15llu%15llu%15llums\n",
+ "count", "real total", "virtual total",
+ "delay total", "delay average",
(unsigned long long)t->cpu_count,
(unsigned long long)t->cpu_run_real_total,
(unsigned long long)t->cpu_run_virtual_total,
(unsigned long long)t->cpu_delay_total,
- "count", "delay total",
+ average_ms((double)t->cpu_delay_total, t->cpu_count),
+ "count", "delay total", "delay average",
(unsigned long long)t->blkio_count,
(unsigned long long)t->blkio_delay_total,
- "count", "delay total",
+ average_ms(t->blkio_delay_total, t->blkio_count),
+ "count", "delay total", "delay average",
(unsigned long long)t->swapin_count,
(unsigned long long)t->swapin_delay_total,
- "count", "delay total",
+ average_ms(t->swapin_delay_total, t->swapin_count),
+ "count", "delay total", "delay average",
(unsigned long long)t->freepages_count,
- (unsigned long long)t->freepages_delay_total);
+ (unsigned long long)t->freepages_delay_total,
+ average_ms(t->freepages_delay_total, t->freepages_count));
}
static void task_context_switch_counts(struct taskstats *t)
@@ -433,8 +442,6 @@ int main(int argc, char *argv[])
}
do {
- int i;
-
rep_len = recv(nl_sd, &msg, sizeof(msg), 0);
PRINTF("received %d bytes\n", rep_len);
@@ -459,7 +466,6 @@ int main(int argc, char *argv[])
na = (struct nlattr *) GENLMSG_DATA(&msg);
len = 0;
- i = 0;
while (len < rep_len) {
len += NLA_ALIGN(na->nla_len);
switch (na->nla_type) {
diff --git a/Documentation/acpi/method-customizing.txt b/Documentation/acpi/method-customizing.txt
index 3e1d25aee3fb..5f55373dd53b 100644
--- a/Documentation/acpi/method-customizing.txt
+++ b/Documentation/acpi/method-customizing.txt
@@ -66,3 +66,8 @@ Note: We can use a kernel with multiple custom ACPI method running,
But each individual write to debugfs can implement a SINGLE
method override. i.e. if we want to insert/override multiple
ACPI methods, we need to redo step c) ~ g) for multiple times.
+
+Note: Be aware that root can mis-use this driver to modify arbitrary
+ memory and gain additional rights, if root's privileges got
+ restricted (for example if root is not allowed to load additional
+ modules after boot).
diff --git a/Documentation/arm/Booting b/Documentation/arm/Booting
index 76850295af8f..4e686a2ed91e 100644
--- a/Documentation/arm/Booting
+++ b/Documentation/arm/Booting
@@ -65,13 +65,19 @@ looks at the connected hardware is beyond the scope of this document.
The boot loader must ultimately be able to provide a MACH_TYPE_xxx
value to the kernel. (see linux/arch/arm/tools/mach-types).
-
-4. Setup the kernel tagged list
--------------------------------
+4. Setup boot data
+------------------
Existing boot loaders: OPTIONAL, HIGHLY RECOMMENDED
New boot loaders: MANDATORY
+The boot loader must provide either a tagged list or a dtb image for
+passing configuration data to the kernel. The physical address of the
+boot data is passed to the kernel in register r2.
+
+4a. Setup the kernel tagged list
+--------------------------------
+
The boot loader must create and initialise the kernel tagged list.
A valid tagged list starts with ATAG_CORE and ends with ATAG_NONE.
The ATAG_CORE tag may or may not be empty. An empty ATAG_CORE tag
@@ -101,6 +107,24 @@ The tagged list must be placed in a region of memory where neither
the kernel decompressor nor initrd 'bootp' program will overwrite
it. The recommended placement is in the first 16KiB of RAM.
+4b. Setup the device tree
+-------------------------
+
+The boot loader must load a device tree image (dtb) into system ram
+at a 64bit aligned address and initialize it with the boot data. The
+dtb format is documented in Documentation/devicetree/booting-without-of.txt.
+The kernel will look for the dtb magic value of 0xd00dfeed at the dtb
+physical address to determine if a dtb has been passed instead of a
+tagged list.
+
+The boot loader must pass at a minimum the size and location of the
+system memory, and the root filesystem location. The dtb must be
+placed in a region of memory where the kernel decompressor will not
+overwrite it. The recommended placement is in the first 16KiB of RAM
+with the caveat that it may not be located at physical address 0 since
+the kernel interprets a value of 0 in r2 to mean neither a tagged list
+nor a dtb were passed.
+
5. Calling the kernel image
---------------------------
@@ -125,7 +149,8 @@ In either case, the following conditions must be met:
- CPU register settings
r0 = 0,
r1 = machine type number discovered in (3) above.
- r2 = physical address of tagged list in system RAM.
+ r2 = physical address of tagged list in system RAM, or
+ physical address of device tree block (dtb) in system RAM
- CPU mode
All forms of interrupts must be disabled (IRQs and FIQs)
diff --git a/Documentation/arm/Samsung/Overview.txt b/Documentation/arm/Samsung/Overview.txt
index c3094ea51aa7..658abb258cef 100644
--- a/Documentation/arm/Samsung/Overview.txt
+++ b/Documentation/arm/Samsung/Overview.txt
@@ -14,7 +14,6 @@ Introduction
- S3C24XX: See Documentation/arm/Samsung-S3C24XX/Overview.txt for full list
- S3C64XX: S3C6400 and S3C6410
- S5P6440
- - S5P6442
- S5PC100
- S5PC110 / S5PV210
@@ -36,7 +35,6 @@ Configuration
unifying all the SoCs into one kernel.
s5p6440_defconfig - S5P6440 specific default configuration
- s5p6442_defconfig - S5P6442 specific default configuration
s5pc100_defconfig - S5PC100 specific default configuration
s5pc110_defconfig - S5PC110 specific default configuration
s5pv210_defconfig - S5PV210 specific default configuration
diff --git a/Documentation/atomic_ops.txt b/Documentation/atomic_ops.txt
index ac4d47187122..3bd585b44927 100644
--- a/Documentation/atomic_ops.txt
+++ b/Documentation/atomic_ops.txt
@@ -12,7 +12,7 @@ Also, it should be made opaque such that any kind of cast to a normal
C integer type will fail. Something like the following should
suffice:
- typedef struct { volatile int counter; } atomic_t;
+ typedef struct { int counter; } atomic_t;
Historically, counter has been declared volatile. This is now discouraged.
See Documentation/volatile-considered-harmful.txt for the complete rationale.
diff --git a/Documentation/cgroups/cgroups.txt b/Documentation/cgroups/cgroups.txt
index aedf1bd02fdd..0ed99f08f1f3 100644
--- a/Documentation/cgroups/cgroups.txt
+++ b/Documentation/cgroups/cgroups.txt
@@ -236,7 +236,8 @@ containing the following files describing that cgroup:
- cgroup.procs: list of tgids in the cgroup. This list is not
guaranteed to be sorted or free of duplicate tgids, and userspace
should sort/uniquify the list if this property is required.
- This is a read-only file, for now.
+ Writing a thread group id into this file moves all threads in that
+ group into this cgroup.
- notify_on_release flag: run the release agent on exit?
- release_agent: the path to use for release notifications (this file
exists in the top cgroup only)
@@ -430,6 +431,12 @@ You can attach the current shell task by echoing 0:
# echo 0 > tasks
+You can use the cgroup.procs file instead of the tasks file to move all
+threads in a threadgroup at once. Echoing the pid of any task in a
+threadgroup to cgroup.procs causes all tasks in that threadgroup to be
+be attached to the cgroup. Writing 0 to cgroup.procs moves all tasks
+in the writing task's threadgroup.
+
Note: Since every task is always a member of exactly one cgroup in each
mounted hierarchy, to remove a task from its current cgroup you must
move it into a new cgroup (possibly the root cgroup) by writing to the
@@ -575,7 +582,7 @@ rmdir() will fail with it. From this behavior, pre_destroy() can be
called multiple times against a cgroup.
int can_attach(struct cgroup_subsys *ss, struct cgroup *cgrp,
- struct task_struct *task, bool threadgroup)
+ struct task_struct *task)
(cgroup_mutex held by caller)
Called prior to moving a task into a cgroup; if the subsystem
@@ -584,9 +591,14 @@ task is passed, then a successful result indicates that *any*
unspecified task can be moved into the cgroup. Note that this isn't
called on a fork. If this method returns 0 (success) then this should
remain valid while the caller holds cgroup_mutex and it is ensured that either
-attach() or cancel_attach() will be called in future. If threadgroup is
-true, then a successful result indicates that all threads in the given
-thread's threadgroup can be moved together.
+attach() or cancel_attach() will be called in future.
+
+int can_attach_task(struct cgroup *cgrp, struct task_struct *tsk);
+(cgroup_mutex held by caller)
+
+As can_attach, but for operations that must be run once per task to be
+attached (possibly many when using cgroup_attach_proc). Called after
+can_attach.
void cancel_attach(struct cgroup_subsys *ss, struct cgroup *cgrp,
struct task_struct *task, bool threadgroup)
@@ -598,15 +610,24 @@ function, so that the subsystem can implement a rollback. If not, not necessary.
This will be called only about subsystems whose can_attach() operation have
succeeded.
+void pre_attach(struct cgroup *cgrp);
+(cgroup_mutex held by caller)
+
+For any non-per-thread attachment work that needs to happen before
+attach_task. Needed by cpuset.
+
void attach(struct cgroup_subsys *ss, struct cgroup *cgrp,
- struct cgroup *old_cgrp, struct task_struct *task,
- bool threadgroup)
+ struct cgroup *old_cgrp, struct task_struct *task)
(cgroup_mutex held by caller)
Called after the task has been attached to the cgroup, to allow any
post-attachment activity that requires memory allocations or blocking.
-If threadgroup is true, the subsystem should take care of all threads
-in the specified thread's threadgroup. Currently does not support any
+
+void attach_task(struct cgroup *cgrp, struct task_struct *tsk);
+(cgroup_mutex held by caller)
+
+As attach, but for operations that must be run once per task to be attached,
+like can_attach_task. Called before attach. Currently does not support any
subsystem that might need the old_cgrp for every thread in the group.
void fork(struct cgroup_subsy *ss, struct task_struct *task)
@@ -630,7 +651,7 @@ always handled well.
void post_clone(struct cgroup_subsys *ss, struct cgroup *cgrp)
(cgroup_mutex held by caller)
-Called at the end of cgroup_clone() to do any parameter
+Called during cgroup_create() to do any parameter
initialization which might be required before a task could attach. For
example in cpusets, no task may attach before 'cpus' and 'mems' are set
up.
diff --git a/Documentation/devicetree/booting-without-of.txt b/Documentation/devicetree/booting-without-of.txt
index 50619a0720a8..7c1329de0596 100644
--- a/Documentation/devicetree/booting-without-of.txt
+++ b/Documentation/devicetree/booting-without-of.txt
@@ -12,8 +12,9 @@ Table of Contents
=================
I - Introduction
- 1) Entry point for arch/powerpc
- 2) Entry point for arch/x86
+ 1) Entry point for arch/arm
+ 2) Entry point for arch/powerpc
+ 3) Entry point for arch/x86
II - The DT block format
1) Header
@@ -148,7 +149,46 @@ upgrades without significantly impacting the kernel code or cluttering
it with special cases.
-1) Entry point for arch/powerpc
+1) Entry point for arch/arm
+---------------------------
+
+ There is one single entry point to the kernel, at the start
+ of the kernel image. That entry point supports two calling
+ conventions. A summary of the interface is described here. A full
+ description of the boot requirements is documented in
+ Documentation/arm/Booting
+
+ a) ATAGS interface. Minimal information is passed from firmware
+ to the kernel with a tagged list of predefined parameters.
+
+ r0 : 0
+
+ r1 : Machine type number
+
+ r2 : Physical address of tagged list in system RAM
+
+ b) Entry with a flattened device-tree block. Firmware loads the
+ physical address of the flattened device tree block (dtb) into r2,
+ r1 is not used, but it is considered good practise to use a valid
+ machine number as described in Documentation/arm/Booting.
+
+ r0 : 0
+
+ r1 : Valid machine type number. When using a device tree,
+ a single machine type number will often be assigned to
+ represent a class or family of SoCs.
+
+ r2 : physical pointer to the device-tree block
+ (defined in chapter II) in RAM. Device tree can be located
+ anywhere in system RAM, but it should be aligned on a 64 bit
+ boundary.
+
+ The kernel will differentiate between ATAGS and device tree booting by
+ reading the memory pointed to by r2 and looking for either the flattened
+ device tree block magic value (0xd00dfeed) or the ATAG_CORE value at
+ offset 0x4 from r2 (0x54410001).
+
+2) Entry point for arch/powerpc
-------------------------------
There is one single entry point to the kernel, at the start
@@ -226,7 +266,7 @@ it with special cases.
cannot support both configurations with Book E and configurations
with classic Powerpc architectures.
-2) Entry point for arch/x86
+3) Entry point for arch/x86
-------------------------------
There is one single 32bit entry point to the kernel at code32_start,
diff --git a/Documentation/dmaengine.txt b/Documentation/dmaengine.txt
index 0c1c2f63c0a9..5a0cb1ef6164 100644
--- a/Documentation/dmaengine.txt
+++ b/Documentation/dmaengine.txt
@@ -1 +1,96 @@
-See Documentation/crypto/async-tx-api.txt
+ DMA Engine API Guide
+ ====================
+
+ Vinod Koul <vinod dot koul at intel.com>
+
+NOTE: For DMA Engine usage in async_tx please see:
+ Documentation/crypto/async-tx-api.txt
+
+
+Below is a guide to device driver writers on how to use the Slave-DMA API of the
+DMA Engine. This is applicable only for slave DMA usage only.
+
+The slave DMA usage consists of following steps
+1. Allocate a DMA slave channel
+2. Set slave and controller specific parameters
+3. Get a descriptor for transaction
+4. Submit the transaction and wait for callback notification
+
+1. Allocate a DMA slave channel
+Channel allocation is slightly different in the slave DMA context, client
+drivers typically need a channel from a particular DMA controller only and even
+in some cases a specific channel is desired. To request a channel
+dma_request_channel() API is used.
+
+Interface:
+struct dma_chan *dma_request_channel(dma_cap_mask_t mask,
+ dma_filter_fn filter_fn,
+ void *filter_param);
+where dma_filter_fn is defined as:
+typedef bool (*dma_filter_fn)(struct dma_chan *chan, void *filter_param);
+
+When the optional 'filter_fn' parameter is set to NULL dma_request_channel
+simply returns the first channel that satisfies the capability mask. Otherwise,
+when the mask parameter is insufficient for specifying the necessary channel,
+the filter_fn routine can be used to disposition the available channels in the
+system. The filter_fn routine is called once for each free channel in the
+system. Upon seeing a suitable channel filter_fn returns DMA_ACK which flags
+that channel to be the return value from dma_request_channel. A channel
+allocated via this interface is exclusive to the caller, until
+dma_release_channel() is called.
+
+2. Set slave and controller specific parameters
+Next step is always to pass some specific information to the DMA driver. Most of
+the generic information which a slave DMA can use is in struct dma_slave_config.
+It allows the clients to specify DMA direction, DMA addresses, bus widths, DMA
+burst lengths etc. If some DMA controllers have more parameters to be sent then
+they should try to embed struct dma_slave_config in their controller specific
+structure. That gives flexibility to client to pass more parameters, if
+required.
+
+Interface:
+int dmaengine_slave_config(struct dma_chan *chan,
+ struct dma_slave_config *config)
+
+3. Get a descriptor for transaction
+For slave usage the various modes of slave transfers supported by the
+DMA-engine are:
+slave_sg - DMA a list of scatter gather buffers from/to a peripheral
+dma_cyclic - Perform a cyclic DMA operation from/to a peripheral till the
+ operation is explicitly stopped.
+The non NULL return of this transfer API represents a "descriptor" for the given
+transaction.
+
+Interface:
+struct dma_async_tx_descriptor *(*chan->device->device_prep_dma_sg)(
+ struct dma_chan *chan,
+ struct scatterlist *dst_sg, unsigned int dst_nents,
+ struct scatterlist *src_sg, unsigned int src_nents,
+ unsigned long flags);
+struct dma_async_tx_descriptor *(*chan->device->device_prep_dma_cyclic)(
+ struct dma_chan *chan, dma_addr_t buf_addr, size_t buf_len,
+ size_t period_len, enum dma_data_direction direction);
+
+4. Submit the transaction and wait for callback notification
+To schedule the transaction to be scheduled by dma device, the "descriptor"
+returned in above (3) needs to be submitted.
+To tell the dma driver that a transaction is ready to be serviced, the
+descriptor->submit() callback needs to be invoked. This chains the descriptor to
+the pending queue.
+The transactions in the pending queue can be activated by calling the
+issue_pending API. If channel is idle then the first transaction in queue is
+started and subsequent ones queued up.
+On completion of the DMA operation the next in queue is submitted and a tasklet
+triggered. The tasklet would then call the client driver completion callback
+routine for notification, if set.
+Interface:
+void dma_async_issue_pending(struct dma_chan *chan);
+
+==============================================================================
+
+Additional usage notes for dma driver writers
+1/ Although DMA engine specifies that completion callback routines cannot submit
+any new operations, but typically for slave DMA subsequent transaction may not
+be available for submit prior to callback routine being called. This requirement
+is not a requirement for DMA-slave devices. But they should take care to drop
+the spin-lock they might be holding before calling the callback routine
diff --git a/Documentation/feature-removal-schedule.txt b/Documentation/feature-removal-schedule.txt
index 95788ad2506c..1a9446b59153 100644
--- a/Documentation/feature-removal-schedule.txt
+++ b/Documentation/feature-removal-schedule.txt
@@ -6,6 +6,42 @@ be removed from this file.
---------------------------
+What: x86 floppy disable_hlt
+When: 2012
+Why: ancient workaround of dubious utility clutters the
+ code used by everybody else.
+Who: Len Brown <len.brown@intel.com>
+
+---------------------------
+
+What: CONFIG_APM_CPU_IDLE, and its ability to call APM BIOS in idle
+When: 2012
+Why: This optional sub-feature of APM is of dubious reliability,
+ and ancient APM laptops are likely better served by calling HLT.
+ Deleting CONFIG_APM_CPU_IDLE allows x86 to stop exporting
+ the pm_idle function pointer to modules.
+Who: Len Brown <len.brown@intel.com>
+
+----------------------------
+
+What: x86_32 "no-hlt" cmdline param
+When: 2012
+Why: remove a branch from idle path, simplify code used by everybody.
+ This option disabled the use of HLT in idle and machine_halt()
+ for hardware that was flakey 15-years ago. Today we have
+ "idle=poll" that removed HLT from idle, and so if such a machine
+ is still running the upstream kernel, "idle=poll" is likely sufficient.
+Who: Len Brown <len.brown@intel.com>
+
+----------------------------
+
+What: x86 "idle=mwait" cmdline param
+When: 2012
+Why: simplify x86 idle code
+Who: Len Brown <len.brown@intel.com>
+
+----------------------------
+
What: PRISM54
When: 2.6.34
@@ -262,16 +298,6 @@ Who: Michael Buesch <mb@bu3sch.de>
---------------------------
-What: /sys/o2cb symlink
-When: January 2010
-Why: /sys/fs/o2cb is the proper location for this information - /sys/o2cb
- exists as a symlink for backwards compatibility for old versions of
- ocfs2-tools. 2 years should be sufficient time to phase in new versions
- which know to look in /sys/fs/o2cb.
-Who: ocfs2-devel@oss.oracle.com
-
----------------------------
-
What: Ability for non root users to shm_get hugetlb pages based on mlock
resource limits
When: 2.6.31
diff --git a/Documentation/filesystems/Locking b/Documentation/filesystems/Locking
index 61b31acb9176..57d827d6071d 100644
--- a/Documentation/filesystems/Locking
+++ b/Documentation/filesystems/Locking
@@ -104,7 +104,7 @@ of the locking scheme for directory operations.
prototypes:
struct inode *(*alloc_inode)(struct super_block *sb);
void (*destroy_inode)(struct inode *);
- void (*dirty_inode) (struct inode *);
+ void (*dirty_inode) (struct inode *, int flags);
int (*write_inode) (struct inode *, struct writeback_control *wbc);
int (*drop_inode) (struct inode *);
void (*evict_inode) (struct inode *);
@@ -126,7 +126,7 @@ locking rules:
s_umount
alloc_inode:
destroy_inode:
-dirty_inode: (must not sleep)
+dirty_inode:
write_inode:
drop_inode: !!!inode->i_lock!!!
evict_inode:
diff --git a/Documentation/filesystems/configfs/configfs_example_explicit.c b/Documentation/filesystems/configfs/configfs_example_explicit.c
index fd53869f5633..1420233dfa55 100644
--- a/Documentation/filesystems/configfs/configfs_example_explicit.c
+++ b/Documentation/filesystems/configfs/configfs_example_explicit.c
@@ -464,9 +464,8 @@ static int __init configfs_example_init(void)
return 0;
out_unregister:
- for (; i >= 0; i--) {
+ for (i--; i >= 0; i--)
configfs_unregister_subsystem(example_subsys[i]);
- }
return ret;
}
@@ -475,9 +474,8 @@ static void __exit configfs_example_exit(void)
{
int i;
- for (i = 0; example_subsys[i]; i++) {
+ for (i = 0; example_subsys[i]; i++)
configfs_unregister_subsystem(example_subsys[i]);
- }
}
module_init(configfs_example_init);
diff --git a/Documentation/filesystems/configfs/configfs_example_macros.c b/Documentation/filesystems/configfs/configfs_example_macros.c
index d8e30a0378aa..327dfbc640a9 100644
--- a/Documentation/filesystems/configfs/configfs_example_macros.c
+++ b/Documentation/filesystems/configfs/configfs_example_macros.c
@@ -427,9 +427,8 @@ static int __init configfs_example_init(void)
return 0;
out_unregister:
- for (; i >= 0; i--) {
+ for (i--; i >= 0; i--)
configfs_unregister_subsystem(example_subsys[i]);
- }
return ret;
}
@@ -438,9 +437,8 @@ static void __exit configfs_example_exit(void)
{
int i;
- for (i = 0; example_subsys[i]; i++) {
+ for (i = 0; example_subsys[i]; i++)
configfs_unregister_subsystem(example_subsys[i]);
- }
}
module_init(configfs_example_init);
diff --git a/Documentation/filesystems/ext4.txt b/Documentation/filesystems/ext4.txt
index c79ec58fd7f6..3ae9bc94352a 100644
--- a/Documentation/filesystems/ext4.txt
+++ b/Documentation/filesystems/ext4.txt
@@ -226,10 +226,6 @@ acl Enables POSIX Access Control Lists support.
noacl This option disables POSIX Access Control List
support.
-reservation
-
-noreservation
-
bsddf (*) Make 'df' act like BSD.
minixdf Make 'df' act like Minix.
diff --git a/Documentation/filesystems/nfs/idmapper.txt b/Documentation/filesystems/nfs/idmapper.txt
index b9b4192ea8b5..9c8fd6148656 100644
--- a/Documentation/filesystems/nfs/idmapper.txt
+++ b/Documentation/filesystems/nfs/idmapper.txt
@@ -47,8 +47,8 @@ request-key will find the first matching line and corresponding program. In
this case, /some/other/program will handle all uid lookups and
/usr/sbin/nfs.idmap will handle gid, user, and group lookups.
-See <file:Documentation/keys-request-keys.txt> for more information about the
-request-key function.
+See <file:Documentation/security/keys-request-keys.txt> for more information
+about the request-key function.
=========
diff --git a/Documentation/filesystems/ocfs2.txt b/Documentation/filesystems/ocfs2.txt
index 9ed920a8cd79..7618a287aa41 100644
--- a/Documentation/filesystems/ocfs2.txt
+++ b/Documentation/filesystems/ocfs2.txt
@@ -46,9 +46,15 @@ errors=panic Panic and halt the machine if an error occurs.
intr (*) Allow signals to interrupt cluster operations.
nointr Do not allow signals to interrupt cluster
operations.
+noatime Do not update access time.
+relatime(*) Update atime if the previous atime is older than
+ mtime or ctime
+strictatime Always update atime, but the minimum update interval
+ is specified by atime_quantum.
atime_quantum=60(*) OCFS2 will not update atime unless this number
of seconds has passed since the last update.
- Set to zero to always update atime.
+ Set to zero to always update atime. This option need
+ work with strictatime.
data=ordered (*) All data are forced directly out to the main file
system prior to its metadata being committed to the
journal.
diff --git a/Documentation/filesystems/vfs.txt b/Documentation/filesystems/vfs.txt
index 21a7dc467bba..88b9f5519af9 100644
--- a/Documentation/filesystems/vfs.txt
+++ b/Documentation/filesystems/vfs.txt
@@ -211,7 +211,7 @@ struct super_operations {
struct inode *(*alloc_inode)(struct super_block *sb);
void (*destroy_inode)(struct inode *);
- void (*dirty_inode) (struct inode *);
+ void (*dirty_inode) (struct inode *, int flags);
int (*write_inode) (struct inode *, int);
void (*drop_inode) (struct inode *);
void (*delete_inode) (struct inode *);
diff --git a/Documentation/filesystems/xfs.txt b/Documentation/filesystems/xfs.txt
index 7bff3e4f35df..3fc0c31a6f5d 100644
--- a/Documentation/filesystems/xfs.txt
+++ b/Documentation/filesystems/xfs.txt
@@ -39,6 +39,12 @@ When mounting an XFS filesystem, the following options are accepted.
drive level write caching to be enabled, for devices that
support write barriers.
+ discard
+ Issue command to let the block device reclaim space freed by the
+ filesystem. This is useful for SSD devices, thinly provisioned
+ LUNs and virtual machine images, but may have a performance
+ impact. This option is incompatible with the nodelaylog option.
+
dmapi
Enable the DMAPI (Data Management API) event callouts.
Use with the "mtpt" option.
diff --git a/Documentation/kernel-parameters.txt b/Documentation/kernel-parameters.txt
index 5438a2d7907f..fd248a318211 100644
--- a/Documentation/kernel-parameters.txt
+++ b/Documentation/kernel-parameters.txt
@@ -999,7 +999,10 @@ bytes respectively. Such letter suffixes can also be entirely omitted.
With this option on every unmap_single operation will
result in a hardware IOTLB flush operation as opposed
to batching them for performance.
-
+ sp_off [Default Off]
+ By default, super page will be supported if Intel IOMMU
+ has the capability. With this option, super page will
+ not be supported.
intremap= [X86-64, Intel-IOMMU]
Format: { on (default) | off | nosid }
on enable Interrupt Remapping (default)
@@ -2595,6 +2598,8 @@ bytes respectively. Such letter suffixes can also be entirely omitted.
unlock ejectable media);
m = MAX_SECTORS_64 (don't transfer more
than 64 sectors = 32 KB at a time);
+ n = INITIAL_READ10 (force a retry of the
+ initial READ(10) command);
o = CAPACITY_OK (accept the capacity
reported by the device);
r = IGNORE_RESIDUE (the device reports
diff --git a/Documentation/laptops/acer-wmi.txt b/Documentation/laptops/acer-wmi.txt
deleted file mode 100644
index 4beafa663dd6..000000000000
--- a/Documentation/laptops/acer-wmi.txt
+++ /dev/null
@@ -1,184 +0,0 @@
-Acer Laptop WMI Extras Driver
-http://code.google.com/p/aceracpi
-Version 0.3
-4th April 2009
-
-Copyright 2007-2009 Carlos Corbacho <carlos@strangeworlds.co.uk>
-
-acer-wmi is a driver to allow you to control various parts of your Acer laptop
-hardware under Linux which are exposed via ACPI-WMI.
-
-This driver completely replaces the old out-of-tree acer_acpi, which I am
-currently maintaining for bug fixes only on pre-2.6.25 kernels. All development
-work is now focused solely on acer-wmi.
-
-Disclaimer
-**********
-
-Acer and Wistron have provided nothing towards the development acer_acpi or
-acer-wmi. All information we have has been through the efforts of the developers
-and the users to discover as much as possible about the hardware.
-
-As such, I do warn that this could break your hardware - this is extremely
-unlikely of course, but please bear this in mind.
-
-Background
-**********
-
-acer-wmi is derived from acer_acpi, originally developed by Mark
-Smith in 2005, then taken over by Carlos Corbacho in 2007, in order to activate
-the wireless LAN card under a 64-bit version of Linux, as acerhk[1] (the
-previous solution to the problem) relied on making 32 bit BIOS calls which are
-not possible in kernel space from a 64 bit OS.
-
-[1] acerhk: http://www.cakey.de/acerhk/
-
-Supported Hardware
-******************
-
-NOTE: The Acer Aspire One is not supported hardware. It cannot work with
-acer-wmi until Acer fix their ACPI-WMI implementation on them, so has been
-blacklisted until that happens.
-
-Please see the website for the current list of known working hardware:
-
-http://code.google.com/p/aceracpi/wiki/SupportedHardware
-
-If your laptop is not listed, or listed as unknown, and works with acer-wmi,
-please contact me with a copy of the DSDT.
-
-If your Acer laptop doesn't work with acer-wmi, I would also like to see the
-DSDT.
-
-To send me the DSDT, as root/sudo:
-
-cat /sys/firmware/acpi/tables/DSDT > dsdt
-
-And send me the resulting 'dsdt' file.
-
-Usage
-*****
-
-On Acer laptops, acer-wmi should already be autoloaded based on DMI matching.
-For non-Acer laptops, until WMI based autoloading support is added, you will
-need to manually load acer-wmi.
-
-acer-wmi creates /sys/devices/platform/acer-wmi, and fills it with various
-files whose usage is detailed below, which enables you to control some of the
-following (varies between models):
-
-* the wireless LAN card radio
-* inbuilt Bluetooth adapter
-* inbuilt 3G card
-* mail LED of your laptop
-* brightness of the LCD panel
-
-Wireless
-********
-
-With regards to wireless, all acer-wmi does is enable the radio on the card. It
-is not responsible for the wireless LED - once the radio is enabled, this is
-down to the wireless driver for your card. So the behaviour of the wireless LED,
-once you enable the radio, will depend on your hardware and driver combination.
-
-e.g. With the BCM4318 on the Acer Aspire 5020 series:
-
-ndiswrapper: Light blinks on when transmitting
-b43: Solid light, blinks off when transmitting
-
-Wireless radio control is unconditionally enabled - all Acer laptops that support
-acer-wmi come with built-in wireless. However, should you feel so inclined to
-ever wish to remove the card, or swap it out at some point, please get in touch
-with me, as we may well be able to gain some data on wireless card detection.
-
-The wireless radio is exposed through rfkill.
-
-Bluetooth
-*********
-
-For bluetooth, this is an internal USB dongle, so once enabled, you will get
-a USB device connection event, and a new USB device appears. When you disable
-bluetooth, you get the reverse - a USB device disconnect event, followed by the
-device disappearing again.
-
-Bluetooth is autodetected by acer-wmi, so if you do not have a bluetooth module
-installed in your laptop, this file won't exist (please be aware that it is
-quite common for Acer not to fit bluetooth to their laptops - so just because
-you have a bluetooth button on the laptop, doesn't mean that bluetooth is
-installed).
-
-For the adventurously minded - if you want to buy an internal bluetooth
-module off the internet that is compatible with your laptop and fit it, then
-it will work just fine with acer-wmi.
-
-Bluetooth is exposed through rfkill.
-
-3G
-**
-
-3G is currently not autodetected, so the 'threeg' file is always created under
-sysfs. So far, no-one in possession of an Acer laptop with 3G built-in appears to
-have tried Linux, or reported back, so we don't have any information on this.
-
-If you have an Acer laptop that does have a 3G card in, please contact me so we
-can properly detect these, and find out a bit more about them.
-
-To read the status of the 3G card (0=off, 1=on):
-cat /sys/devices/platform/acer-wmi/threeg
-
-To enable the 3G card:
-echo 1 > /sys/devices/platform/acer-wmi/threeg
-
-To disable the 3G card:
-echo 0 > /sys/devices/platform/acer-wmi/threeg
-
-To set the state of the 3G card when loading acer-wmi, pass:
-threeg=X (where X is 0 or 1)
-
-Mail LED
-********
-
-This can be found in most older Acer laptops supported by acer-wmi, and many
-newer ones - it is built into the 'mail' button, and blinks when active.
-
-On newer (WMID) laptops though, we have no way of detecting the mail LED. If
-your laptop identifies itself in dmesg as a WMID model, then please try loading
-acer_acpi with:
-
-force_series=2490
-
-This will use a known alternative method of reading/ writing the mail LED. If
-it works, please report back to me with the DMI data from your laptop so this
-can be added to acer-wmi.
-
-The LED is exposed through the LED subsystem, and can be found in:
-
-/sys/devices/platform/acer-wmi/leds/acer-wmi::mail/
-
-The mail LED is autodetected, so if you don't have one, the LED device won't
-be registered.
-
-Backlight
-*********
-
-The backlight brightness control is available on all acer-wmi supported
-hardware. The maximum brightness level is usually 15, but on some newer laptops
-it's 10 (this is again autodetected).
-
-The backlight is exposed through the backlight subsystem, and can be found in:
-
-/sys/devices/platform/acer-wmi/backlight/acer-wmi/
-
-Credits
-*******
-
-Olaf Tauber, who did the real hard work when he developed acerhk
-http://www.cakey.de/acerhk/
-All the authors of laptop ACPI modules in the kernel, whose work
-was an inspiration in the early days of acer_acpi
-Mathieu Segaud, who solved the problem with having to modprobe the driver
-twice in acer_acpi 0.2.
-Jim Ramsay, who added support for the WMID interface
-Mark Smith, who started the original acer_acpi
-
-And the many people who have used both acer_acpi and acer-wmi.
diff --git a/Documentation/lockstat.txt b/Documentation/lockstat.txt
index 9c0a80d17a23..cef00d42ed5b 100644
--- a/Documentation/lockstat.txt
+++ b/Documentation/lockstat.txt
@@ -12,8 +12,9 @@ Because things like lock contention can severely impact performance.
- HOW
Lockdep already has hooks in the lock functions and maps lock instances to
-lock classes. We build on that. The graph below shows the relation between
-the lock functions and the various hooks therein.
+lock classes. We build on that (see Documentation/lockdep-design.txt).
+The graph below shows the relation between the lock functions and the various
+hooks therein.
__acquire
|
@@ -128,6 +129,37 @@ points are the points we're contending with.
The integer part of the time values is in us.
+Dealing with nested locks, subclasses may appear:
+
+32...............................................................................................................................................................................................
+33
+34 &rq->lock: 13128 13128 0.43 190.53 103881.26 97454 3453404 0.00 401.11 13224683.11
+35 ---------
+36 &rq->lock 645 [<ffffffff8103bfc4>] task_rq_lock+0x43/0x75
+37 &rq->lock 297 [<ffffffff8104ba65>] try_to_wake_up+0x127/0x25a
+38 &rq->lock 360 [<ffffffff8103c4c5>] select_task_rq_fair+0x1f0/0x74a
+39 &rq->lock 428 [<ffffffff81045f98>] scheduler_tick+0x46/0x1fb
+40 ---------
+41 &rq->lock 77 [<ffffffff8103bfc4>] task_rq_lock+0x43/0x75
+42 &rq->lock 174 [<ffffffff8104ba65>] try_to_wake_up+0x127/0x25a
+43 &rq->lock 4715 [<ffffffff8103ed4b>] double_rq_lock+0x42/0x54
+44 &rq->lock 893 [<ffffffff81340524>] schedule+0x157/0x7b8
+45
+46...............................................................................................................................................................................................
+47
+48 &rq->lock/1: 11526 11488 0.33 388.73 136294.31 21461 38404 0.00 37.93 109388.53
+49 -----------
+50 &rq->lock/1 11526 [<ffffffff8103ed58>] double_rq_lock+0x4f/0x54
+51 -----------
+52 &rq->lock/1 5645 [<ffffffff8103ed4b>] double_rq_lock+0x42/0x54
+53 &rq->lock/1 1224 [<ffffffff81340524>] schedule+0x157/0x7b8
+54 &rq->lock/1 4336 [<ffffffff8103ed58>] double_rq_lock+0x4f/0x54
+55 &rq->lock/1 181 [<ffffffff8104ba65>] try_to_wake_up+0x127/0x25a
+
+Line 48 shows statistics for the second subclass (/1) of &rq->lock class
+(subclass starts from 0), since in this case, as line 50 suggests,
+double_rq_lock actually acquires a nested lock of two spinlocks.
+
View the top contending locks:
# grep : /proc/lock_stat | head
diff --git a/Documentation/networking/dns_resolver.txt b/Documentation/networking/dns_resolver.txt
index 04ca06325b08..7f531ad83285 100644
--- a/Documentation/networking/dns_resolver.txt
+++ b/Documentation/networking/dns_resolver.txt
@@ -139,8 +139,8 @@ the key will be discarded and recreated when the data it holds has expired.
dns_query() returns a copy of the value attached to the key, or an error if
that is indicated instead.
-See <file:Documentation/keys-request-key.txt> for further information about
-request-key function.
+See <file:Documentation/security/keys-request-key.txt> for further
+information about request-key function.
=========
diff --git a/Documentation/power/regulator/machine.txt b/Documentation/power/regulator/machine.txt
index bdec39b9bd75..b42419b52e44 100644
--- a/Documentation/power/regulator/machine.txt
+++ b/Documentation/power/regulator/machine.txt
@@ -53,11 +53,11 @@ static struct regulator_init_data regulator1_data = {
Regulator-1 supplies power to Regulator-2. This relationship must be registered
with the core so that Regulator-1 is also enabled when Consumer A enables its
-supply (Regulator-2). The supply regulator is set by the supply_regulator_dev
+supply (Regulator-2). The supply regulator is set by the supply_regulator
field below:-
static struct regulator_init_data regulator2_data = {
- .supply_regulator_dev = &platform_regulator1_device.dev,
+ .supply_regulator = "regulator_name",
.constraints = {
.min_uV = 1800000,
.max_uV = 2000000,
diff --git a/Documentation/scsi/ChangeLog.megaraid_sas b/Documentation/scsi/ChangeLog.megaraid_sas
index 4d9ce73ff730..9ed1d9d96783 100644
--- a/Documentation/scsi/ChangeLog.megaraid_sas
+++ b/Documentation/scsi/ChangeLog.megaraid_sas
@@ -1,3 +1,17 @@
+Release Date : Wed. May 11, 2011 17:00:00 PST 2010 -
+ (emaild-id:megaraidlinux@lsi.com)
+ Adam Radford
+Current Version : 00.00.05.38-rc1
+Old Version : 00.00.05.34-rc1
+ 1. Remove MSI-X black list, use MFI_REG_STATE.ready.msiEnable.
+ 2. Remove un-used function megasas_return_cmd_for_smid().
+ 3. Check MFI_REG_STATE.fault.resetAdapter in megasas_reset_fusion().
+ 4. Disable interrupts/free_irq() in megasas_shutdown().
+ 5. Fix bug where AENs could be lost in probe() and resume().
+ 6. Convert 6,10,12 byte CDB's to 16 byte CDB for large LBA's for FastPath
+ IO.
+ 7. Add 1078 OCR support.
+-------------------------------------------------------------------------------
Release Date : Thu. Feb 24, 2011 17:00:00 PST 2010 -
(emaild-id:megaraidlinux@lsi.com)
Adam Radford
diff --git a/Documentation/security/00-INDEX b/Documentation/security/00-INDEX
new file mode 100644
index 000000000000..19bc49439cac
--- /dev/null
+++ b/Documentation/security/00-INDEX
@@ -0,0 +1,18 @@
+00-INDEX
+ - this file.
+SELinux.txt
+ - how to get started with the SELinux security enhancement.
+Smack.txt
+ - documentation on the Smack Linux Security Module.
+apparmor.txt
+ - documentation on the AppArmor security extension.
+credentials.txt
+ - documentation about credentials in Linux.
+keys-request-key.txt
+ - description of the kernel key request service.
+keys-trusted-encrypted.txt
+ - info on the Trusted and Encrypted keys in the kernel key ring service.
+keys.txt
+ - description of the kernel key retention service.
+tomoyo.txt
+ - documentation on the TOMOYO Linux Security Module.
diff --git a/Documentation/SELinux.txt b/Documentation/security/SELinux.txt
index 07eae00f3314..07eae00f3314 100644
--- a/Documentation/SELinux.txt
+++ b/Documentation/security/SELinux.txt
diff --git a/Documentation/Smack.txt b/Documentation/security/Smack.txt
index e9dab41c0fe0..e9dab41c0fe0 100644
--- a/Documentation/Smack.txt
+++ b/Documentation/security/Smack.txt
diff --git a/Documentation/apparmor.txt b/Documentation/security/apparmor.txt
index 93c1fd7d0635..93c1fd7d0635 100644
--- a/Documentation/apparmor.txt
+++ b/Documentation/security/apparmor.txt
diff --git a/Documentation/credentials.txt b/Documentation/security/credentials.txt
index 995baf379c07..fc0366cbd7ce 100644
--- a/Documentation/credentials.txt
+++ b/Documentation/security/credentials.txt
@@ -216,7 +216,7 @@ The Linux kernel supports the following types of credentials:
When a process accesses a key, if not already present, it will normally be
cached on one of these keyrings for future accesses to find.
- For more information on using keys, see Documentation/keys.txt.
+ For more information on using keys, see Documentation/security/keys.txt.
(5) LSM
diff --git a/Documentation/keys-request-key.txt b/Documentation/security/keys-request-key.txt
index 69686ad12c66..51987bfecfed 100644
--- a/Documentation/keys-request-key.txt
+++ b/Documentation/security/keys-request-key.txt
@@ -3,8 +3,8 @@
===================
The key request service is part of the key retention service (refer to
-Documentation/keys.txt). This document explains more fully how the requesting
-algorithm works.
+Documentation/security/keys.txt). This document explains more fully how
+the requesting algorithm works.
The process starts by either the kernel requesting a service by calling
request_key*():
diff --git a/Documentation/keys-trusted-encrypted.txt b/Documentation/security/keys-trusted-encrypted.txt
index 8fb79bc1ac4b..8fb79bc1ac4b 100644
--- a/Documentation/keys-trusted-encrypted.txt
+++ b/Documentation/security/keys-trusted-encrypted.txt
diff --git a/Documentation/keys.txt b/Documentation/security/keys.txt
index 6523a9e6f293..4d75931d2d79 100644
--- a/Documentation/keys.txt
+++ b/Documentation/security/keys.txt
@@ -434,7 +434,7 @@ The main syscalls are:
/sbin/request-key will be invoked in an attempt to obtain a key. The
callout_info string will be passed as an argument to the program.
- See also Documentation/keys-request-key.txt.
+ See also Documentation/security/keys-request-key.txt.
The keyctl syscall functions are:
@@ -864,7 +864,7 @@ payload contents" for more information.
If successful, the key will have been attached to the default keyring for
implicitly obtained request-key keys, as set by KEYCTL_SET_REQKEY_KEYRING.
- See also Documentation/keys-request-key.txt.
+ See also Documentation/security/keys-request-key.txt.
(*) To search for a key, passing auxiliary data to the upcaller, call:
diff --git a/Documentation/tomoyo.txt b/Documentation/security/tomoyo.txt
index 200a2d37cbc8..200a2d37cbc8 100644
--- a/Documentation/tomoyo.txt
+++ b/Documentation/security/tomoyo.txt
diff --git a/Documentation/sysctl/kernel.txt b/Documentation/sysctl/kernel.txt
index 36f007514db3..5e7cb39ad195 100644
--- a/Documentation/sysctl/kernel.txt
+++ b/Documentation/sysctl/kernel.txt
@@ -161,7 +161,8 @@ core_pattern is used to specify a core dumpfile pattern name.
%s signal number
%t UNIX time of dump
%h hostname
- %e executable filename
+ %e executable filename (may be shortened)
+ %E executable path
%<OTHER> both are dropped
. If the first character of the pattern is a '|', the kernel will treat
the rest of the pattern as a command to run. The core dump will be
diff --git a/Documentation/virtual/lguest/Makefile b/Documentation/virtual/lguest/Makefile
index bebac6b4f332..0ac34206f7a7 100644
--- a/Documentation/virtual/lguest/Makefile
+++ b/Documentation/virtual/lguest/Makefile
@@ -1,5 +1,5 @@
# This creates the demonstration utility "lguest" which runs a Linux guest.
-# Missing headers? Add "-I../../include -I../../arch/x86/include"
+# Missing headers? Add "-I../../../include -I../../../arch/x86/include"
CFLAGS:=-m32 -Wall -Wmissing-declarations -Wmissing-prototypes -O3 -U_FORTIFY_SOURCE
all: lguest
diff --git a/Documentation/virtual/lguest/lguest.c b/Documentation/virtual/lguest/lguest.c
index d9da7e148538..cd9d6af61d07 100644
--- a/Documentation/virtual/lguest/lguest.c
+++ b/Documentation/virtual/lguest/lguest.c
@@ -49,7 +49,7 @@
#include <linux/virtio_rng.h>
#include <linux/virtio_ring.h>
#include <asm/bootparam.h>
-#include "../../include/linux/lguest_launcher.h"
+#include "../../../include/linux/lguest_launcher.h"
/*L:110
* We can ignore the 42 include files we need for this program, but I do want
* to draw attention to the use of kernel-style types.
@@ -135,9 +135,6 @@ struct device {
/* Is it operational */
bool running;
- /* Does Guest want an intrrupt on empty? */
- bool irq_on_empty;
-
/* Device-specific data. */
void *priv;
};
@@ -637,10 +634,7 @@ static void trigger_irq(struct virtqueue *vq)
/* If they don't want an interrupt, don't send one... */
if (vq->vring.avail->flags & VRING_AVAIL_F_NO_INTERRUPT) {
- /* ... unless they've asked us to force one on empty. */
- if (!vq->dev->irq_on_empty
- || lg_last_avail(vq) != vq->vring.avail->idx)
- return;
+ return;
}
/* Send the Guest an interrupt tell them we used something up. */
@@ -1057,15 +1051,6 @@ static void create_thread(struct virtqueue *vq)
close(vq->eventfd);
}
-static bool accepted_feature(struct device *dev, unsigned int bit)
-{
- const u8 *features = get_feature_bits(dev) + dev->feature_len;
-
- if (dev->feature_len < bit / CHAR_BIT)
- return false;
- return features[bit / CHAR_BIT] & (1 << (bit % CHAR_BIT));
-}
-
static void start_device(struct device *dev)
{
unsigned int i;
@@ -1079,8 +1064,6 @@ static void start_device(struct device *dev)
verbose(" %02x", get_feature_bits(dev)
[dev->feature_len+i]);
- dev->irq_on_empty = accepted_feature(dev, VIRTIO_F_NOTIFY_ON_EMPTY);
-
for (vq = dev->vq; vq; vq = vq->next) {
if (vq->service)
create_thread(vq);
@@ -1564,7 +1547,6 @@ static void setup_tun_net(char *arg)
/* Set up the tun device. */
configure_device(ipfd, tapif, ip);
- add_feature(dev, VIRTIO_F_NOTIFY_ON_EMPTY);
/* Expect Guest to handle everything except UFO */
add_feature(dev, VIRTIO_NET_F_CSUM);
add_feature(dev, VIRTIO_NET_F_GUEST_CSUM);
diff --git a/Documentation/vm/cleancache.txt b/Documentation/vm/cleancache.txt
new file mode 100644
index 000000000000..36c367c73084
--- /dev/null
+++ b/Documentation/vm/cleancache.txt
@@ -0,0 +1,278 @@
+MOTIVATION
+
+Cleancache is a new optional feature provided by the VFS layer that
+potentially dramatically increases page cache effectiveness for
+many workloads in many environments at a negligible cost.
+
+Cleancache can be thought of as a page-granularity victim cache for clean
+pages that the kernel's pageframe replacement algorithm (PFRA) would like
+to keep around, but can't since there isn't enough memory. So when the
+PFRA "evicts" a page, it first attempts to use cleancache code to
+put the data contained in that page into "transcendent memory", memory
+that is not directly accessible or addressable by the kernel and is
+of unknown and possibly time-varying size.
+
+Later, when a cleancache-enabled filesystem wishes to access a page
+in a file on disk, it first checks cleancache to see if it already
+contains it; if it does, the page of data is copied into the kernel
+and a disk access is avoided.
+
+Transcendent memory "drivers" for cleancache are currently implemented
+in Xen (using hypervisor memory) and zcache (using in-kernel compressed
+memory) and other implementations are in development.
+
+FAQs are included below.
+
+IMPLEMENTATION OVERVIEW
+
+A cleancache "backend" that provides transcendent memory registers itself
+to the kernel's cleancache "frontend" by calling cleancache_register_ops,
+passing a pointer to a cleancache_ops structure with funcs set appropriately.
+Note that cleancache_register_ops returns the previous settings so that
+chaining can be performed if desired. The functions provided must conform to
+certain semantics as follows:
+
+Most important, cleancache is "ephemeral". Pages which are copied into
+cleancache have an indefinite lifetime which is completely unknowable
+by the kernel and so may or may not still be in cleancache at any later time.
+Thus, as its name implies, cleancache is not suitable for dirty pages.
+Cleancache has complete discretion over what pages to preserve and what
+pages to discard and when.
+
+Mounting a cleancache-enabled filesystem should call "init_fs" to obtain a
+pool id which, if positive, must be saved in the filesystem's superblock;
+a negative return value indicates failure. A "put_page" will copy a
+(presumably about-to-be-evicted) page into cleancache and associate it with
+the pool id, a file key, and a page index into the file. (The combination
+of a pool id, a file key, and an index is sometimes called a "handle".)
+A "get_page" will copy the page, if found, from cleancache into kernel memory.
+A "flush_page" will ensure the page no longer is present in cleancache;
+a "flush_inode" will flush all pages associated with the specified file;
+and, when a filesystem is unmounted, a "flush_fs" will flush all pages in
+all files specified by the given pool id and also surrender the pool id.
+
+An "init_shared_fs", like init_fs, obtains a pool id but tells cleancache
+to treat the pool as shared using a 128-bit UUID as a key. On systems
+that may run multiple kernels (such as hard partitioned or virtualized
+systems) that may share a clustered filesystem, and where cleancache
+may be shared among those kernels, calls to init_shared_fs that specify the
+same UUID will receive the same pool id, thus allowing the pages to
+be shared. Note that any security requirements must be imposed outside
+of the kernel (e.g. by "tools" that control cleancache). Or a
+cleancache implementation can simply disable shared_init by always
+returning a negative value.
+
+If a get_page is successful on a non-shared pool, the page is flushed (thus
+making cleancache an "exclusive" cache). On a shared pool, the page
+is NOT flushed on a successful get_page so that it remains accessible to
+other sharers. The kernel is responsible for ensuring coherency between
+cleancache (shared or not), the page cache, and the filesystem, using
+cleancache flush operations as required.
+
+Note that cleancache must enforce put-put-get coherency and get-get
+coherency. For the former, if two puts are made to the same handle but
+with different data, say AAA by the first put and BBB by the second, a
+subsequent get can never return the stale data (AAA). For get-get coherency,
+if a get for a given handle fails, subsequent gets for that handle will
+never succeed unless preceded by a successful put with that handle.
+
+Last, cleancache provides no SMP serialization guarantees; if two
+different Linux threads are simultaneously putting and flushing a page
+with the same handle, the results are indeterminate. Callers must
+lock the page to ensure serial behavior.
+
+CLEANCACHE PERFORMANCE METRICS
+
+Cleancache monitoring is done by sysfs files in the
+/sys/kernel/mm/cleancache directory. The effectiveness of cleancache
+can be measured (across all filesystems) with:
+
+succ_gets - number of gets that were successful
+failed_gets - number of gets that failed
+puts - number of puts attempted (all "succeed")
+flushes - number of flushes attempted
+
+A backend implementatation may provide additional metrics.
+
+FAQ
+
+1) Where's the value? (Andrew Morton)
+
+Cleancache provides a significant performance benefit to many workloads
+in many environments with negligible overhead by improving the
+effectiveness of the pagecache. Clean pagecache pages are
+saved in transcendent memory (RAM that is otherwise not directly
+addressable to the kernel); fetching those pages later avoids "refaults"
+and thus disk reads.
+
+Cleancache (and its sister code "frontswap") provide interfaces for
+this transcendent memory (aka "tmem"), which conceptually lies between
+fast kernel-directly-addressable RAM and slower DMA/asynchronous devices.
+Disallowing direct kernel or userland reads/writes to tmem
+is ideal when data is transformed to a different form and size (such
+as with compression) or secretly moved (as might be useful for write-
+balancing for some RAM-like devices). Evicted page-cache pages (and
+swap pages) are a great use for this kind of slower-than-RAM-but-much-
+faster-than-disk transcendent memory, and the cleancache (and frontswap)
+"page-object-oriented" specification provides a nice way to read and
+write -- and indirectly "name" -- the pages.
+
+In the virtual case, the whole point of virtualization is to statistically
+multiplex physical resources across the varying demands of multiple
+virtual machines. This is really hard to do with RAM and efforts to
+do it well with no kernel change have essentially failed (except in some
+well-publicized special-case workloads). Cleancache -- and frontswap --
+with a fairly small impact on the kernel, provide a huge amount
+of flexibility for more dynamic, flexible RAM multiplexing.
+Specifically, the Xen Transcendent Memory backend allows otherwise
+"fallow" hypervisor-owned RAM to not only be "time-shared" between multiple
+virtual machines, but the pages can be compressed and deduplicated to
+optimize RAM utilization. And when guest OS's are induced to surrender
+underutilized RAM (e.g. with "self-ballooning"), page cache pages
+are the first to go, and cleancache allows those pages to be
+saved and reclaimed if overall host system memory conditions allow.
+
+And the identical interface used for cleancache can be used in
+physical systems as well. The zcache driver acts as a memory-hungry
+device that stores pages of data in a compressed state. And
+the proposed "RAMster" driver shares RAM across multiple physical
+systems.
+
+2) Why does cleancache have its sticky fingers so deep inside the
+ filesystems and VFS? (Andrew Morton and Christoph Hellwig)
+
+The core hooks for cleancache in VFS are in most cases a single line
+and the minimum set are placed precisely where needed to maintain
+coherency (via cleancache_flush operations) between cleancache,
+the page cache, and disk. All hooks compile into nothingness if
+cleancache is config'ed off and turn into a function-pointer-
+compare-to-NULL if config'ed on but no backend claims the ops
+functions, or to a compare-struct-element-to-negative if a
+backend claims the ops functions but a filesystem doesn't enable
+cleancache.
+
+Some filesystems are built entirely on top of VFS and the hooks
+in VFS are sufficient, so don't require an "init_fs" hook; the
+initial implementation of cleancache didn't provide this hook.
+But for some filesystems (such as btrfs), the VFS hooks are
+incomplete and one or more hooks in fs-specific code are required.
+And for some other filesystems, such as tmpfs, cleancache may
+be counterproductive. So it seemed prudent to require a filesystem
+to "opt in" to use cleancache, which requires adding a hook in
+each filesystem. Not all filesystems are supported by cleancache
+only because they haven't been tested. The existing set should
+be sufficient to validate the concept, the opt-in approach means
+that untested filesystems are not affected, and the hooks in the
+existing filesystems should make it very easy to add more
+filesystems in the future.
+
+The total impact of the hooks to existing fs and mm files is only
+about 40 lines added (not counting comments and blank lines).
+
+3) Why not make cleancache asynchronous and batched so it can
+ more easily interface with real devices with DMA instead
+ of copying each individual page? (Minchan Kim)
+
+The one-page-at-a-time copy semantics simplifies the implementation
+on both the frontend and backend and also allows the backend to
+do fancy things on-the-fly like page compression and
+page deduplication. And since the data is "gone" (copied into/out
+of the pageframe) before the cleancache get/put call returns,
+a great deal of race conditions and potential coherency issues
+are avoided. While the interface seems odd for a "real device"
+or for real kernel-addressable RAM, it makes perfect sense for
+transcendent memory.
+
+4) Why is non-shared cleancache "exclusive"? And where is the
+ page "flushed" after a "get"? (Minchan Kim)
+
+The main reason is to free up space in transcendent memory and
+to avoid unnecessary cleancache_flush calls. If you want inclusive,
+the page can be "put" immediately following the "get". If
+put-after-get for inclusive becomes common, the interface could
+be easily extended to add a "get_no_flush" call.
+
+The flush is done by the cleancache backend implementation.
+
+5) What's the performance impact?
+
+Performance analysis has been presented at OLS'09 and LCA'10.
+Briefly, performance gains can be significant on most workloads,
+especially when memory pressure is high (e.g. when RAM is
+overcommitted in a virtual workload); and because the hooks are
+invoked primarily in place of or in addition to a disk read/write,
+overhead is negligible even in worst case workloads. Basically
+cleancache replaces I/O with memory-copy-CPU-overhead; on older
+single-core systems with slow memory-copy speeds, cleancache
+has little value, but in newer multicore machines, especially
+consolidated/virtualized machines, it has great value.
+
+6) How do I add cleancache support for filesystem X? (Boaz Harrash)
+
+Filesystems that are well-behaved and conform to certain
+restrictions can utilize cleancache simply by making a call to
+cleancache_init_fs at mount time. Unusual, misbehaving, or
+poorly layered filesystems must either add additional hooks
+and/or undergo extensive additional testing... or should just
+not enable the optional cleancache.
+
+Some points for a filesystem to consider:
+
+- The FS should be block-device-based (e.g. a ram-based FS such
+ as tmpfs should not enable cleancache)
+- To ensure coherency/correctness, the FS must ensure that all
+ file removal or truncation operations either go through VFS or
+ add hooks to do the equivalent cleancache "flush" operations
+- To ensure coherency/correctness, either inode numbers must
+ be unique across the lifetime of the on-disk file OR the
+ FS must provide an "encode_fh" function.
+- The FS must call the VFS superblock alloc and deactivate routines
+ or add hooks to do the equivalent cleancache calls done there.
+- To maximize performance, all pages fetched from the FS should
+ go through the do_mpag_readpage routine or the FS should add
+ hooks to do the equivalent (cf. btrfs)
+- Currently, the FS blocksize must be the same as PAGESIZE. This
+ is not an architectural restriction, but no backends currently
+ support anything different.
+- A clustered FS should invoke the "shared_init_fs" cleancache
+ hook to get best performance for some backends.
+
+7) Why not use the KVA of the inode as the key? (Christoph Hellwig)
+
+If cleancache would use the inode virtual address instead of
+inode/filehandle, the pool id could be eliminated. But, this
+won't work because cleancache retains pagecache data pages
+persistently even when the inode has been pruned from the
+inode unused list, and only flushes the data page if the file
+gets removed/truncated. So if cleancache used the inode kva,
+there would be potential coherency issues if/when the inode
+kva is reused for a different file. Alternately, if cleancache
+flushed the pages when the inode kva was freed, much of the value
+of cleancache would be lost because the cache of pages in cleanache
+is potentially much larger than the kernel pagecache and is most
+useful if the pages survive inode cache removal.
+
+8) Why is a global variable required?
+
+The cleancache_enabled flag is checked in all of the frequently-used
+cleancache hooks. The alternative is a function call to check a static
+variable. Since cleancache is enabled dynamically at runtime, systems
+that don't enable cleancache would suffer thousands (possibly
+tens-of-thousands) of unnecessary function calls per second. So the
+global variable allows cleancache to be enabled by default at compile
+time, but have insignificant performance impact when cleancache remains
+disabled at runtime.
+
+9) Does cleanache work with KVM?
+
+The memory model of KVM is sufficiently different that a cleancache
+backend may have less value for KVM. This remains to be tested,
+especially in an overcommitted system.
+
+10) Does cleancache work in userspace? It sounds useful for
+ memory hungry caches like web browsers. (Jamie Lokier)
+
+No plans yet, though we agree it sounds useful, at least for
+apps that bypass the page cache (e.g. O_DIRECT).
+
+Last updated: Dan Magenheimer, April 13 2011