summaryrefslogtreecommitdiff
path: root/Documentation
diff options
context:
space:
mode:
authorStephen Rothwell <sfr@canb.auug.org.au>2014-07-30 15:58:29 +1000
committerStephen Rothwell <sfr@canb.auug.org.au>2014-07-30 15:58:29 +1000
commit83e2c7c43e4b21b072793f40a2fd6b0e1b86c4d9 (patch)
treed2a81e6013d8d0301a6a4de49c503652c5278f49 /Documentation
parentb60e084d85151334dcdbc44659b6c28e6b12a44f (diff)
parent1c91efb247998d2a2cb452f812fc4efda6f36506 (diff)
Merge remote-tracking branch 'rcu/rcu/next'
Diffstat (limited to 'Documentation')
-rw-r--r--Documentation/RCU/rcuref.txt9
-rw-r--r--Documentation/kernel-parameters.txt7
-rw-r--r--Documentation/memory-barriers.txt47
3 files changed, 52 insertions, 11 deletions
diff --git a/Documentation/RCU/rcuref.txt b/Documentation/RCU/rcuref.txt
index 141d531aa14b..613033ff2b9b 100644
--- a/Documentation/RCU/rcuref.txt
+++ b/Documentation/RCU/rcuref.txt
@@ -1,5 +1,14 @@
Reference-count design for elements of lists/arrays protected by RCU.
+
+Please note that the percpu-ref feature is likely your first
+stop if you need to combine reference counts and RCU. Please see
+include/linux/percpu-refcount.h for more information. However, in
+those unusual cases where percpu-ref would consume too much memory,
+please read on.
+
+------------------------------------------------------------------------
+
Reference counting on elements of lists which are protected by traditional
reader/writer spinlocks or semaphores are straightforward:
diff --git a/Documentation/kernel-parameters.txt b/Documentation/kernel-parameters.txt
index ee7c7f1f59e6..f5326acdf1a4 100644
--- a/Documentation/kernel-parameters.txt
+++ b/Documentation/kernel-parameters.txt
@@ -2850,6 +2850,13 @@ bytes respectively. Such letter suffixes can also be entirely omitted.
quiescent states. Units are jiffies, minimum
value is one, and maximum value is HZ.
+ rcutree.rcu_nocb_leader_stride= [KNL]
+ Set the number of NOCB kthread groups, which
+ defaults to the square root of the number of
+ CPUs. Larger numbers reduces the wakeup overhead
+ on the per-CPU grace-period kthreads, but increases
+ that same overhead on each group's leader.
+
rcutree.qhimark= [KNL]
Set threshold of queued RCU callbacks beyond which
batch limiting is disabled.
diff --git a/Documentation/memory-barriers.txt b/Documentation/memory-barriers.txt
index f1dc4a215593..abec3f9273bf 100644
--- a/Documentation/memory-barriers.txt
+++ b/Documentation/memory-barriers.txt
@@ -697,12 +697,13 @@ should do something like the following:
}
Finally, control dependencies do -not- provide transitivity. This is
-demonstrated by two related examples:
+demonstrated by two related examples, with the initial values of
+x and y both being zero:
CPU 0 CPU 1
===================== =====================
r1 = ACCESS_ONCE(x); r2 = ACCESS_ONCE(y);
- if (r1 >= 0) if (r2 >= 0)
+ if (r1 > 0) if (r2 > 0)
ACCESS_ONCE(y) = 1; ACCESS_ONCE(x) = 1;
assert(!(r1 == 1 && r2 == 1));
@@ -711,16 +712,21 @@ The above two-CPU example will never trigger the assert(). However,
if control dependencies guaranteed transitivity (which they do not),
then adding the following two CPUs would guarantee a related assertion:
- CPU 2 CPU 3
- ===================== =====================
- ACCESS_ONCE(x) = 2; ACCESS_ONCE(y) = 2;
+ CPU 2
+ =====================
+ ACCESS_ONCE(x) = 2;
- assert(!(r1 == 2 && r2 == 2 && x == 1 && y == 1)); /* FAILS!!! */
+ assert(!(r1 == 2 && r2 == 1 && x == 2)); /* FAILS!!! */
But because control dependencies do -not- provide transitivity, the
above assertion can fail after the combined four-CPU example completes.
If you need the four-CPU example to provide ordering, you will need
-smp_mb() between the loads and stores in the CPU 0 and CPU 1 code fragments.
+smp_mb() between the loads and stores in the CPU 0 and CPU 1 code fragments,
+that is, just before or just after the "if" statements.
+
+These two examples are the LB and WWC litmus tests from this paper:
+http://www.cl.cam.ac.uk/users/pes20/ppc-supplemental/test6.pdf and this
+site: https://www.cl.cam.ac.uk/~pes20/ppcmem/index.html.
In summary:
@@ -757,10 +763,14 @@ SMP BARRIER PAIRING
When dealing with CPU-CPU interactions, certain types of memory barrier should
always be paired. A lack of appropriate pairing is almost certainly an error.
-A write barrier should always be paired with a data dependency barrier or read
-barrier, though a general barrier would also be viable. Similarly a read
-barrier or a data dependency barrier should always be paired with at least an
-write barrier, though, again, a general barrier is viable:
+General barriers pair with each other, though they also pair with
+most other types of barriers, albeit without transitivity. An acquire
+barrier pairs with a release barrier, but both may also pair with other
+barriers, including of course general barriers. A write barrier pairs
+with a data dependency barrier, an acquire barrier, a release barrier,
+a read barrier, or a general barrier. Similarly a read barrier or a
+data dependency barrier pairs with a write barrier, an acquire barrier,
+a release barrier, or a general barrier:
CPU 1 CPU 2
=============== ===============
@@ -1893,6 +1903,21 @@ between the STORE to indicate the event and the STORE to set TASK_RUNNING:
<general barrier> STORE current->state
LOAD event_indicated
+To repeat, this write memory barrier is present if and only if something
+is actually awakened. To see this, consider the following sequence of
+events, where X and Y are both initially zero:
+
+ CPU 1 CPU 2
+ =============================== ===============================
+ X = 1; STORE event_indicated
+ smp_mb(); wake_up();
+ Y = 1; wait_event(wq, Y == 1);
+ wake_up(); load from Y sees 1, no memory barrier
+ load from X might see 0
+
+In contrast, if a wakeup does occur, CPU 2's load from X would be guaranteed
+to see 1.
+
The available waker functions include:
complete();