2 files changed, 101 insertions, 79 deletions
diff --git a/docs/InAlloca.rst b/docs/InAlloca.rst
index b1779874e0..a5df96da77 100644
--- a/docs/InAlloca.rst
+++ b/docs/InAlloca.rst
@@ -7,19 +7,19 @@ Introduction
 
 .. Warning:: This feature is unstable and not fully implemented.
 
-The :ref:`attr_inalloca` attribute is designed to allow taking the
-address of an aggregate argument that is being passed by value through
-memory.  Primarily, this feature is required for compatibility with the
-Microsoft C++ ABI.  Under that ABI, class instances that are passed by
-value are constructed directly into argument stack memory.  Prior to the
-addition of inalloca, calls in LLVM were indivisible instructions.
-There was no way to perform intermediate work, such as object
-construction, between the first stack adjustment and the final control
-transfer.  With inalloca, each argument is modelled as an alloca, which
-can be stored to independently of the call.  Unfortunately, this
-complicated feature comes with a large set of restrictions designed to
-bound the lifetime of the argument memory around the call, which are
-explained in this document.
+The :ref:`inalloca <attr_inalloca>` attribute is designed to allow
+taking the address of an aggregate argument that is being passed by
+value through memory.  Primarily, this feature is required for
+compatibility with the Microsoft C++ ABI.  Under that ABI, class
+instances that are passed by value are constructed directly into
+argument stack memory.  Prior to the addition of inalloca, calls in LLVM
+were indivisible instructions.  There was no way to perform intermediate
+work, such as object construction, between the first stack adjustment
+and the final control transfer.  With inalloca, all arguments passed in
+memory are modelled as a single alloca, which can be stored to prior to
+the call.  Unfortunately, this complicated feature comes with a large
+set of restrictions designed to bound the lifetime of the argument
+memory around the call.
 
 For now, it is recommended that frontends and optimizers avoid producing
 this construct, primarily because it forces the use of a base pointer.
@@ -30,48 +30,60 @@ passing by value with a copy.
 Intended Usage
 ==============
 
-In the example below, ``f`` is attempting to pass a default-constructed
-``Foo`` object to ``g`` by value.
+The example below is the intended LLVM IR lowering for some C++ code
+that passes two default-constructed ``Foo`` objects to ``g`` in the
+32-bit Microsoft C++ ABI.
+
+.. code-block:: c++
+
+    // Foo is non-trivial.
+    struct Foo { int a, b; Foo(); ~Foo(); Foo(const Foo &); };
+    void g(Foo a, Foo b);
+    void f() {
+      g(Foo(), Foo());
+    }
 
 .. code-block:: llvm
 
-    %Foo = type { i32, i32 }
+    %struct.Foo = type { i32, i32 }
+    %callframe.f = type { %struct.Foo, %struct.Foo }
     declare void @Foo_ctor(%Foo* %this)
-    declare void @g(%Foo* inalloca %arg)
+    declare void @Foo_dtor(%struct.Foo* %this)
+    declare void @g(%callframe.f* inalloca %memargs)
 
     define void @f() {
-      ...
-
-    bb1:
+    entry:
       %base = call i8* @llvm.stacksave()
-      %arg = alloca %Foo
-      invoke void @Foo_ctor(%Foo* %arg)
+      %memargs = alloca %callframe.f
+      %b = getelementptr %callframe.f* %memargs, i32 0, i32 1
+      %a = getelementptr %callframe.f* %memargs, i32 0, i32 0
+      call void @Foo_ctor(%struct.Foo* %b)
+
+      ; If a's ctor throws, we must destruct b.
+      invoke void @Foo_ctor(%struct.Foo* %a)
           to label %invoke.cont unwind %invoke.unwind
 
     invoke.cont:
-      call void @g(%Foo* inalloca %arg)
+      ; Both arguments are now constructed in place in %memargs.
+      call void @g(%callframe.f* inalloca %memargs)
       call void @llvm.stackrestore(i8* %base)
       ...
 
     invoke.unwind:
+      call void @Foo_dtor(%struct.Foo* %b)
       call void @llvm.stackrestore(i8* %base)
       ...
     }
 
-The alloca in this example is dynamic, meaning it is not in the entry
-block, and it can be executed more than once.  Due to the restrictions
-against allocas between an alloca used with inalloca and its associated
-call site, all allocas used with inalloca are considered dynamic.
-
-To avoid any stack leakage, the frontend saves the current stack pointer
-with a call to :ref:`llvm.stacksave <int_stacksave>`.  Then, it
-allocates the argument stack space with alloca and calls the default
-constructor.  One important consideration is that the default
-constructor could throw an exception, so the frontend has to create a
-landing pad.  At this point, if there were any other inalloca arguments,
-the frontend would have to destruct them before restoring the stack
-pointer.  If the constructor does not unwind, ``g`` is called, and then
-the stack is restored.
+To avoid stack leaks, the frontend saves the current stack pointer with
+a call to :ref:`llvm.stacksave <int_stacksave>`.  Then, it allocates the
+argument stack space with alloca and constructs ``b`` and then ``a`` in
+place.  The constructor of ``a`` could throw an exception, so the
+frontend has to create a landing pad that destroys the already
+constructed argument ``b`` before restoring the stack pointer.  If the
+constructor does not unwind, ``g`` is called.  In the Microsoft C++ ABI,
+``g`` will destroy its arguments, and then the stack is restored in
+``f``.
 
 Design Considerations
 =====================
@@ -81,31 +93,43 @@ Lifetime
 
 The biggest design consideration for this feature is object lifetime.
 We cannot model the arguments as static allocas in the entry block,
-because all calls need to use the memory that is at the end of the call
-frame to pass arguments.  We cannot vend pointers to that memory at
-function entry because after code generation they will alias.  In the
-current design, the rule against allocas between the inalloca alloca
-values and the call site avoids this problem, but it creates a cleanup
-problem.  Cleanup and lifetime is handled explicitly with stack save and
-restore calls.  In the future, we may be able to avoid this by using
-:ref:`llvm.lifetime.start <int_lifestart>` and :ref:`llvm.lifetime.end
-<int_lifeend>` instead.
+because all calls need to use the memory at the top of the stack to pass
+arguments.  We cannot vend pointers to that memory at function entry
+because after code generation they will alias.
+
+The rule against allocas between argument allocations and the call site
+avoids this problem, but it creates a cleanup problem.  Cleanup and
+lifetime are handled explicitly with stack save and restore calls.  In
+the future, we may want to introduce a new construct such as ``freea``
+or ``afree`` to make it clear that this stack-adjusting cleanup is less
+powerful than a full stack save and restore.
 
 Nested Calls and Copy Elision
 -----------------------------
 
-The next consideration is the ability for the frontend to perform copy
-elision in the face of nested calls.  Consider the evaluation of
-``foo(foo(Bar()))``, where ``foo`` takes and returns a ``Bar`` object by
-value and ``Bar`` has non-trivial constructors.  In this case, we want
-to be able to elide copies into ``foo``'s argument slots.  That means we
-need to have more than one set of argument frames active at the same
-time.  First, we need to allocate the frame for the outer call so we can
-pass it in as the hidden struct return pointer to the middle call.  Then
-we do the same for the middle call, allocating a frame and passing its
-address to ``Bar``'s default constructor.  By wrapping the evaluation of
-the inner ``foo`` with stack save and restore, we can have multiple
-overlapping active call frames.
+We also want to be able to support copy elision into these argument
+slots.  This means we have to support multiple live argument
+allocations.
+
+Consider the evaluation of:
+
+.. code-block:: c++
+
+    // Foo is non-trivial.
+    struct Foo { int a; Foo(); Foo(const Foo &); ~Foo(); };
+    Foo bar(Foo b);
+    int main() {
+      bar(bar(Foo()));
+    }
+
+In this case, we want to be able to elide copies into ``bar``'s argument
+slots.  That means we need to have more than one set of argument frames
+active at the same time.  First, we need to allocate the frame for the
+outer call so we can pass it in as the hidden struct return pointer to
+the inner call.  Then we do the same for the inner call, allocating a
+frame and passing its address to ``Foo``'s default constructor.  By
+wrapping the evaluation of the inner ``bar`` with stack save and
+restore, we can have multiple overlapping active call frames.
 
 Callee-cleanup Calling Conventions
 ----------------------------------
diff --git a/docs/LangRef.rst b/docs/LangRef.rst
index a1b3eb47ec..d450b2a465 100644
--- a/docs/LangRef.rst
+++ b/docs/LangRef.rst
@@ -727,29 +727,27 @@ Currently, only the following parameter attributes are defined:
 
 .. Warning:: This feature is unstable and not fully implemented.
 
-    The ``inalloca`` argument attribute allows the caller to get the
-    address of an outgoing argument to a ``call`` or ``invoke`` before
-    it executes.  It is similar to ``byval`` in that it is used to pass
-    arguments by value, but it guarantees that the argument will not be
-    copied.
-
-    To be :ref:`well formed <wellformed>`, the caller must pass in an
-    alloca value into an ``inalloca`` parameter, and an alloca may be
-    used as an ``inalloca`` argument at most once.  The attribute can
-    only be applied to parameters that would be passed in memory and not
-    registers.  The ``inalloca`` attribute cannot be used in conjunction
-    with other attributes that affect argument storage, like ``inreg``,
-    ``nest``, ``sret``, or ``byval``.  The ``inalloca`` stack space is
-    considered to be clobbered by any call that uses it, so any
+    The ``inalloca`` argument attribute allows the caller to take the
+    address of all stack-allocated arguments to a ``call`` or ``invoke``
+    before it executes.  It is similar to ``byval`` in that it is used
+    to pass arguments by value, but it guarantees that the argument will
+    not be copied.
+
+    To be :ref:`well formed <wellformed>`, an alloca may be used as an
+    ``inalloca`` argument at most once.  The attribute can only be
+    applied to the last parameter, and it guarantees that the argument
+    is passed in memory.  The ``inalloca`` attribute cannot be used in
+    conjunction with other attributes that affect argument storage, like
+    ``inreg``, ``nest``, ``sret``, or ``byval``.  The ``inalloca`` stack
+    space is considered to be clobbered by any call that uses it, so any
     ``inalloca`` parameters cannot be marked ``readonly``.
 
-    Allocas passed with ``inalloca`` to a call must be in the opposite
-    order of the parameter list, meaning that the rightmost argument
-    must be allocated first.  If a call has inalloca arguments, no other
-    allocas can occur between the first alloca used by the call and the
-    call site, unless they are are cleared by calls to
-    :ref:`llvm.stackrestore <int_stackrestore>`.  Violating these rules
-    results in undefined behavior at runtime.
+    When the call site is reached, the argument allocation must have
+    been the most recent stack allocation that is still live, or the
+    results are undefined.  It is possible to allocate additional stack
+    space after an argument allocation and before its call site, but it
+    must be cleared off with :ref:`llvm.stackrestore
+    <int_stackrestore>`.
 
     See :doc:`InAlloca` for more information on how to use this
     attribute.