author Eli Friedman <eli.friedman@gmail.com> 2011-08-12 21:50:54 +0000
committer Eli Friedman <eli.friedman@gmail.com> 2011-08-12 21:50:54 +0000
commit 91a44dd9ccd8ec3a10fa35315c381cffade91d5b
tree 6b8ddd50a4a0df4f1caef1db7562559a76fabb01 /docs
parent 53cae1362dca8aa312c3e36c10b106ea7d349f93
Some reorganization of atomic docs. Added explicit section for NonAtomic. Added example for illegal non-atomic operation.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@137520 91177308-0d34-0410-b5e6-96231b3b80d8
Diffstat (limited to 'docs')
-rw-r--r-- docs/Atomics.html 143
1 files changed, 111 insertions, 32 deletions
diff --git a/docs/Atomics.html b/docs/Atomics.html
index 967ebdddb1..357f43167b 100644
--- a/docs/Atomics.html
+++ b/docs/Atomics.html
@@ -14,8 +14,8 @@
<ol>
<li><a href="#introduction">Introduction</a></li>
- <li><a href="#loadstore">Load and store</a></li>
- <li><a href="#otherinst">Other atomic instructions</a></li>
+ <li><a href="#outsideatomic">Optimization outside atomic</a></li>
+ <li><a href="#atomicinst">Atomic instructions</a></li>
<li><a href="#ordering">Atomic orderings</a></li>
<li><a href="#iropt">Atomics and IR optimization</a></li>
<li><a href="#codegen">Atomics and Codegen</a></li>
@@ -75,51 +75,84 @@ instructions has been clarified in the IR.</p>
<!-- *********************************************************************** -->
<h2>
- <a name="loadstore">Load and store</a>
+ <a name="outsideatomic">Optimization outside atomic</a>
</h2>
<!-- *********************************************************************** -->
<div>
<p>The basic <code>'load'</code> and <code>'store'</code> allow a variety of
- optimizations, but can have unintuitive results in a concurrent environment.
- For a frontend writer, the rule is essentially that all memory accessed
- with basic loads and stores by multiple threads should be protected by a
- lock or other synchronization; otherwise, you are likely to run into
- undefined behavior. (Do not use volatile as a substitute for atomics; it
- might work on some platforms, but does not provide the necessary guarantees
- in general.)</p>
+ optimizations, but can lead to undefined results in a concurrent environment;
+ see <a href="#o_notatomic">NotAtomic</a>. This section covers the one
+ optimizer restriction that applies in concurrent environments. It gets an
+ extended description because any optimization dealing with stores needs to
+ be aware of it.</p>
<p>From the optimizer's point of view, the rule is that if there
are not any instructions with atomic ordering involved, concurrency does
not matter, with one exception: if a variable might be visible to another
thread or signal handler, a store cannot be inserted along a path where it
- might not execute otherwise. For example, suppose LICM wants to take all the
- loads and stores in a loop to and from a particular address and promote them
- to registers. LICM is not allowed to insert an unconditional store after
- the loop with the computed value unless a store unconditionally executes
- within the loop. Note that speculative loads are allowed; a load which
+ might not execute otherwise. Take the following example:</p>
+
+<pre>
+/* C code, for readability; run through clang -O2 -S -emit-llvm to get
+ equivalent IR */
+int x;
+void f(int* a) {
+ for (int i = 0; i &lt; 100; i++) {
+ if (a[i])
+ x += 1;
+ }
+}
+</pre>
+
+<p>The following is equivalent in non-concurrent situations:</p>
+
+<pre>
+int x;
+void f(int* a) {
+ int xtemp = x;
+ for (int i = 0; i &lt; 100; i++) {
+ if (a[i])
+ xtemp += 1;
+ }
+ x = xtemp;
+}
+</pre>
+
+<p>However, LLVM is not allowed to transform the former to the latter: it could
+ introduce undefined behavior if another thread can access x at the same time.
+ (This example is particularly of interest because before the concurrency model
+ was implemented, LLVM would perform this transformation.)</p>
+
+<p>Note that speculative loads are allowed; a load which
is part of a race returns <code>undef</code>, but does not have undefined
behavior.</p>
-<p>For cases where simple loads and stores are not sufficient, LLVM provides
- atomic loads and stores with varying levels of guarantees.</p>
</div>
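As a rough illustration of the store-insertion rule above, the following C sketch shows one way the loop could still keep the running value in a local variable without adding a store on a path that had none. The function name `f_guarded` and the flag `stored` are assumptions made for this sketch; it is not a description of what LLVM actually emits.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

int x;

/* Hypothetical variant of f(); the name f_guarded and the flag
   `stored` are assumptions made for this sketch. */
void f_guarded(const int *a) {
  assert(a != NULL);
  int xtemp = x;       /* speculative load of x: allowed */
  bool stored = false; /* records whether the original loop stored to x */
  for (int i = 0; i < 100; i++) {
    if (a[i]) {
      xtemp += 1;
      stored = true;
    }
  }
  if (stored)
    x = xtemp;         /* store only on paths where f() also stored */
}
```

When every `a[i]` is zero, this version never writes `x`, just like the original loop, so it cannot introduce a store that another thread might observe.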
<!-- *********************************************************************** -->
<h2>
- <a name="otherinst">Other atomic instructions</a>
+ <a name="atomicinst">Atomic instructions</a>
</h2>
<!-- *********************************************************************** -->
<div>
+<p>For cases where simple loads and stores are not sufficient, LLVM provides
+ various atomic instructions. The exact guarantees provided depend on the
+ ordering; see <a href="#ordering">Atomic orderings</a>.</p>
+
+<p><code>load atomic</code> and <code>store atomic</code> provide the same
+ basic functionality as non-atomic loads and stores, but provide additional
+ guarantees in situations where threads and signals are involved.</p>
+
<p><code>cmpxchg</code> and <code>atomicrmw</code> are essentially like an
atomic load followed by an atomic store (where the store is conditional for
- <code>cmpxchg</code>), but no other memory operation can happen between
- the load and store. Note that our cmpxchg does not have quite as many
- options for making cmpxchg weaker as the C++0x version.</p>
+ <code>cmpxchg</code>), but no other memory operation can happen on any thread
+ between the load and store. Note that LLVM's cmpxchg does not provide quite
+ as many options as the C++0x version.</p>
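For readers more familiar with source-level atomics, the sketch below uses the C11 `<stdatomic.h>` operations that correspond most closely to `atomicrmw add` and `cmpxchg`. The helper names `fetch_add_one` and `set_if_equal` are hypothetical, and the exact IR a given frontend produces for them is not guaranteed here.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper: an indivisible read-modify-write, comparable to
   `atomicrmw add`. Returns the value held before the addition. */
int fetch_add_one(atomic_int *p) {
  assert(p != NULL);
  return atomic_fetch_add_explicit(p, 1, memory_order_seq_cst);
}

/* Hypothetical helper: a conditional store, comparable to `cmpxchg`.
   Writes `desired` only if *p currently equals `expected`; returns
   true if the store happened. */
bool set_if_equal(atomic_int *p, int expected, int desired) {
  assert(p != NULL);
  return atomic_compare_exchange_strong_explicit(
      p, &expected, desired, memory_order_seq_cst, memory_order_seq_cst);
}
```

In both helpers the load and the store form a single step, so no other thread can modify `*p` in between.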
<p>A <code>fence</code> provides Acquire and/or Release ordering which is not
part of another operation; it is normally used along with Monotonic memory
@@ -148,6 +181,54 @@ instructions has been clarified in the IR.</p>
<!-- ======================================================================= -->
<h3>
+ <a name="o_notatomic">NotAtomic</a>
+</h3>
+
+<div>
+
+<p>NotAtomic is the obvious: a load or store which is not atomic. (This isn't
+ really a level of atomicity, but is listed here for comparison.) This is
+ essentially a regular load or store. If there is a race on a memory
+ location, loads from that location return 'undef'.</p>
+
+<dl>
+ <dt>Relevant standard</dt>
+ <dd>This is intended to match shared variables in C/C++, and to be used
+ in any other context where memory access is necessary, and
+ a race is impossible.</dd>
+ <dt>Notes for frontends</dt>
+ <dd>The rule is essentially that all memory accessed with basic loads and
+ stores by multiple threads should be protected by a lock or other
+ synchronization; otherwise, you are likely to run into undefined
+ behavior. If your frontend is for a "safe" language like Java,
+ use Unordered to load and store any shared variable. Note that NotAtomic
+ volatile loads and stores are not properly atomic; do not try to use
+ them as a substitute. (Per the C/C++ standards, volatile does provide
+ some limited guarantees around asynchronous signals, but atomics are
+ generally a better solution.)</dd>
+ <dt>Notes for optimizers</dt>
+ <dd>Introducing loads to shared variables along a codepath where they would
+ not otherwise exist is allowed; introducing stores to shared variables
+ is not. See <a href="#outsideatomic">Optimization outside
+ atomic</a>.</dd>
+ <dt>Notes for code generation</dt>
+ <dd>The one interesting restriction here is that it is not allowed to write
+ to bytes outside of the bytes relevant to a store. This is mostly
+ relevant to unaligned stores: it is not allowed in general to convert
+ an unaligned store into two aligned stores of the same width as the
+ unaligned store. Backends are also expected to generate an i8 store
+ as an i8 store, and not an instruction which writes to surrounding
+ bytes. (If you are writing a backend for an architecture which cannot
+ satisfy these restrictions and cares about concurrency, please send an
+ email to llvmdev.)</dd>
+</dl>
+
+</div>
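The code generation restriction above can be pictured with a small C model. Both function names are hypothetical, and the 4-byte grouping is an assumption chosen only for illustration; the buffer length is assumed to be a multiple of 4.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Touches exactly one byte, as an i8 store should. */
void store_byte(uint8_t *buf, size_t idx, uint8_t v) {
  assert(buf != NULL);
  buf[idx] = v;
}

/* Models the kind of lowering that is NOT allowed: the byte store is
   widened to a read-modify-write of the surrounding 4-byte group. */
void store_byte_widened(uint8_t *buf, size_t idx, uint8_t v) {
  assert(buf != NULL);
  size_t base = idx & ~(size_t)3; /* start of the 4-byte group */
  uint8_t group[4];
  memcpy(group, buf + base, 4);   /* read the neighbouring bytes */
  group[idx - base] = v;
  memcpy(buf + base, group, 4);   /* rewrites 3 bytes the store never named */
}
```

In a single thread the two functions leave the buffer in the same state. With a second thread writing one of the neighbouring bytes between the read and the write-back, the widened version could silently undo that write, which is why the restriction exists.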
+
+
+<!-- ======================================================================= -->
+<h3>
<a name="o_unordered">Unordered</a>
</h3>
@@ -379,24 +460,22 @@ instructions has been clarified in the IR.</p>
<ul>
<li>isSimple(): A load or store which is not volatile or atomic. This is
what, for example, memcpyopt would check for operations it might
- transform.
+ transform.</li>
<li>isUnordered(): A load or store which is not volatile and at most
Unordered. This would be checked, for example, by LICM before hoisting
- an operation.
+ an operation.</li>
<li>mayReadFromMemory()/mayWriteToMemory(): Existing predicates, but note
that they return true for any operation which is volatile or at least
- Monotonic.
+ Monotonic.</li>
<li>Alias analysis: Note that AA will return ModRef for anything Acquire or
- Release, and for the address accessed by any Monotonic operation.
+ Release, and for the address accessed by any Monotonic operation.</li>
</ul>
-<p>There are essentially two components to supporting atomic operations. The
- first is making sure to query isSimple() or isUnordered() instead
- of isVolatile() before transforming an operation. The other piece is
- making sure that a transform does not end up replacing, for example, an
- Unordered operation with a non-atomic operation. Most of the other
- necessary checks automatically fall out from existing predicates and
- alias analysis queries.</p>
+<p>To support optimizing around atomic operations, make sure you are using
+ the right predicates; everything should work if that is done. If your
+ pass should optimize some atomic operations (Unordered operations in
+ particular), make sure it doesn't replace an atomic load or store with
+ a non-atomic operation.</p>
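The predicates described above can be summarized with a toy model in C. This is not the LLVM API: the types `ordering_t` and `mem_op_t` and the functions below are hypothetical stand-ins that only mirror the definitions given in the list.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model only; these names are assumptions, not LLVM declarations. */
typedef enum {
  NOT_ATOMIC, UNORDERED, MONOTONIC, ACQUIRE, RELEASE, ACQ_REL, SEQ_CST
} ordering_t;

typedef struct {
  bool is_volatile;
  ordering_t ordering;
} mem_op_t;

/* Mirrors isSimple(): neither volatile nor atomic. */
bool is_simple(const mem_op_t *op) {
  assert(op != NULL);
  return !op->is_volatile && op->ordering == NOT_ATOMIC;
}

/* Mirrors isUnordered(): not volatile and at most Unordered. */
bool is_unordered(const mem_op_t *op) {
  assert(op != NULL);
  return !op->is_volatile && op->ordering <= UNORDERED;
}

/* A replacement must not turn an atomic operation into a non-atomic one. */
bool keeps_atomicity(const mem_op_t *old_op, const mem_op_t *new_op) {
  assert(old_op != NULL && new_op != NULL);
  return old_op->ordering == NOT_ATOMIC || new_op->ordering != NOT_ATOMIC;
}
```

A pass that checks only for volatility would treat an Unordered load as a plain one; in this model `is_simple` rejects it while `is_unordered` accepts it, which is the distinction the list draws.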
<p>Some examples of how optimizations interact with various kinds of atomic
operations: