summaryrefslogtreecommitdiff
path: root/docs
diff options
context:
space:
mode:
authorTim Northover <tnorthover@apple.com>2014-06-13 14:24:07 +0000
committerTim Northover <tnorthover@apple.com>2014-06-13 14:24:07 +0000
commit8f2a85e0995a5bb2943bf9c5021950beaf48265e (patch)
tree4cb75b9eb20ea78ff6c905e4995176e68eccb937 /docs
parent1d5e37946d2ec6ca7834e333a850276569593a90 (diff)
downloadllvm-8f2a85e0995a5bb2943bf9c5021950beaf48265e.tar.gz
llvm-8f2a85e0995a5bb2943bf9c5021950beaf48265e.tar.bz2
llvm-8f2a85e0995a5bb2943bf9c5021950beaf48265e.tar.xz
IR: add "cmpxchg weak" variant to support permitted failure.
This commit adds a weak variant of the cmpxchg operation, as described in C++11. A cmpxchg instruction with this modifier is permitted to fail to store, even if the comparison indicated it should. As a result, cmpxchg instructions must return a flag indicating success in addition to their original iN value loaded. Thus, for uniformity *all* cmpxchg instructions now return "{ iN, i1 }". The second flag is 1 when the store succeeded. At the DAG level, a new ATOMIC_CMP_SWAP_WITH_SUCCESS node has been added as the natural representation for the new cmpxchg instructions. It is a strong cmpxchg. By default this gets Expanded to the existing ATOMIC_CMP_SWAP during Legalization, so existing backends should see no change in behaviour. If they wish to deal with the enhanced node instead, they can call setOperationAction on it. Beware: as a node with 2 results, it cannot be selected from TableGen. Currently, no use is made of the extra information provided in this patch. Test updates are almost entirely adapting the input IR to the new scheme. Summary for out of tree users: ------------------------------ + Legacy Bitcode files are upgraded during read. + Legacy assembly IR files will be invalid. + Front-ends must adapt to different type for "cmpxchg". + Backends should be unaffected by default. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@210903 91177308-0d34-0410-b5e6-96231b3b80d8
Diffstat (limited to 'docs')
-rw-r--r--docs/Atomics.rst10
-rw-r--r--docs/LangRef.rst26
2 files changed, 21 insertions, 15 deletions
diff --git a/docs/Atomics.rst b/docs/Atomics.rst
index 1243f34548..5f17c6154f 100644
--- a/docs/Atomics.rst
+++ b/docs/Atomics.rst
@@ -110,8 +110,7 @@ where threads and signals are involved.
``cmpxchg`` and ``atomicrmw`` are essentially like an atomic load followed by an
atomic store (where the store is conditional for ``cmpxchg``), but no other
-memory operation can happen on any thread between the load and store. Note that
-LLVM's cmpxchg does not provide quite as many options as the C++0x version.
+memory operation can happen on any thread between the load and store.
A ``fence`` provides Acquire and/or Release ordering which is not part of
another operation; it is normally used along with Monotonic memory operations.
@@ -430,10 +429,9 @@ other ``atomicrmw`` operations generate a loop with ``LOCK CMPXCHG``. Depending
on the users of the result, some ``atomicrmw`` operations can be translated into
operations like ``LOCK AND``, but that does not work in general.
-On ARM, MIPS, and many other RISC architectures, Acquire, Release, and
-SequentiallyConsistent semantics require barrier instructions for every such
+On ARM (before v8), MIPS, and many other RISC architectures, Acquire, Release,
+and SequentiallyConsistent semantics require barrier instructions for every such
operation. Loads and stores generate normal instructions. ``cmpxchg`` and
``atomicrmw`` can be represented using a loop with LL/SC-style instructions
which take some sort of exclusive lock on a cache line (``LDREX`` and ``STREX``
-on ARM, etc.). At the moment, the IR does not provide any way to represent a
-weak ``cmpxchg`` which would not require a loop.
+on ARM, etc.).
diff --git a/docs/LangRef.rst b/docs/LangRef.rst
index ee95cb9f85..8958505e9d 100644
--- a/docs/LangRef.rst
+++ b/docs/LangRef.rst
@@ -5059,14 +5059,14 @@ Syntax:
::
- cmpxchg [volatile] <ty>* <pointer>, <ty> <cmp>, <ty> <new> [singlethread] <success ordering> <failure ordering> ; yields {ty}
+ cmpxchg [weak] [volatile] <ty>* <pointer>, <ty> <cmp>, <ty> <new> [singlethread] <success ordering> <failure ordering> ; yields { <ty>, i1 }
Overview:
"""""""""
The '``cmpxchg``' instruction is used to atomically modify memory. It
loads a value in memory and compares it to a given value. If they are
-equal, it stores a new value into the memory.
+equal, it tries to store a new value into the memory.
Arguments:
""""""""""
@@ -5099,10 +5099,17 @@ equal to the size in memory of the operand.
Semantics:
""""""""""
-The contents of memory at the location specified by the '``<pointer>``'
-operand is read and compared to '``<cmp>``'; if the read value is the
-equal, '``<new>``' is written. The original value at the location is
-returned.
+The contents of memory at the location specified by the '``<pointer>``' operand
+is read and compared to '``<cmp>``'; if the read value is the equal, the
+'``<new>``' is written. The original value at the location is returned, together
+with a flag indicating success (true) or failure (false).
+
+If the cmpxchg operation is marked as ``weak`` then a spurious failure is
+permitted: the operation may not write ``<new>`` even if the comparison
+matched.
+
+If the cmpxchg operation is strong (the default), the i1 value is 1 if and only
+if the value loaded equals ``cmp``.
A successful ``cmpxchg`` is a read-modify-write instruction for the purpose of
identifying release sequences. A failed ``cmpxchg`` is equivalent to an atomic
@@ -5114,14 +5121,15 @@ Example:
.. code-block:: llvm
entry:
- %orig = atomic load i32* %ptr unordered ; yields {i32}
+ %orig = atomic load i32* %ptr unordered ; yields i32
br label %loop
loop:
%cmp = phi i32 [ %orig, %entry ], [%old, %loop]
%squared = mul i32 %cmp, %cmp
- %old = cmpxchg i32* %ptr, i32 %cmp, i32 %squared acq_rel monotonic ; yields {i32}
- %success = icmp eq i32 %cmp, %old
+ %val_success = cmpxchg i32* %ptr, i32 %cmp, i32 %squared acq_rel monotonic ; yields { i32, i1 }
+ %value_loaded = extractvalue { i32, i1 } %val_success, 0
+ %success = extractvalue { i32, i1 } %val_success, 1
br i1 %success, label %done, label %loop
done: