Diffstat (limited to 'docs')
-rw-r--r-- | docs/InAlloca.rst | 140 |
-rw-r--r-- | docs/LangRef.rst | 37 |
2 files changed, 177 insertions, 0 deletions
diff --git a/docs/InAlloca.rst b/docs/InAlloca.rst
new file mode 100644
index 0000000000..b1779874e0
--- /dev/null
+++ b/docs/InAlloca.rst
@@ -0,0 +1,140 @@
+==========================================
+Design and Usage of the InAlloca Attribute
+==========================================
+
+Introduction
+============
+
+.. Warning:: This feature is unstable and not fully implemented.
+
+The :ref:`attr_inalloca` attribute is designed to allow taking the
+address of an aggregate argument that is being passed by value through
+memory. Primarily, this feature is required for compatibility with the
+Microsoft C++ ABI. Under that ABI, class instances that are passed by
+value are constructed directly into argument stack memory. Prior to the
+addition of inalloca, calls in LLVM were indivisible instructions.
+There was no way to perform intermediate work, such as object
+construction, between the first stack adjustment and the final control
+transfer. With inalloca, each argument is modelled as an alloca, which
+can be stored to independently of the call. Unfortunately, this
+complicated feature comes with a large set of restrictions designed to
+bound the lifetime of the argument memory around the call, which are
+explained in this document.
+
+For now, it is recommended that frontends and optimizers avoid producing
+this construct, primarily because it forces the use of a base pointer.
+This feature may grow in the future to allow general mid-level
+optimization, but for now, it should be regarded as less efficient than
+passing by value with a copy.
+
+Intended Usage
+==============
+
+In the example below, ``f`` is attempting to pass a default-constructed
+``Foo`` object to ``g`` by value.
+
+.. code-block:: llvm
+
+    %Foo = type { i32, i32 }
+    declare void @Foo_ctor(%Foo* %this)
+    declare void @g(%Foo* inalloca %arg)
+
+    define void @f() {
+      ...
+
+    bb1:
+      %base = call i8* @llvm.stacksave()
+      %arg = alloca %Foo
+      invoke void @Foo_ctor(%Foo* %arg)
+          to label %invoke.cont unwind %invoke.unwind
+
+    invoke.cont:
+      call void @g(%Foo* inalloca %arg)
+      call void @llvm.stackrestore(i8* %base)
+      ...
+
+    invoke.unwind:
+      call void @llvm.stackrestore(i8* %base)
+      ...
+    }
+
+The alloca in this example is dynamic, meaning it is not in the entry
+block, and it can be executed more than once. Due to the restrictions
+against allocas between an alloca used with inalloca and its associated
+call site, all allocas used with inalloca are considered dynamic.
+
+To avoid any stack leakage, the frontend saves the current stack pointer
+with a call to :ref:`llvm.stacksave <int_stacksave>`. Then, it
+allocates the argument stack space with alloca and calls the default
+constructor. One important consideration is that the default
+constructor could throw an exception, so the frontend has to create a
+landing pad. At this point, if there were any other inalloca arguments,
+the frontend would have to destruct them before restoring the stack
+pointer. If the constructor does not unwind, ``g`` is called, and then
+the stack is restored.
+
+Design Considerations
+=====================
+
+Lifetime
+--------
+
+The biggest design consideration for this feature is object lifetime.
+We cannot model the arguments as static allocas in the entry block,
+because all calls need to use the memory that is at the end of the call
+frame to pass arguments. We cannot vend pointers to that memory at
+function entry because after code generation they will alias. In the
+current design, the rule against allocas between the inalloca alloca
+values and the call site avoids this problem, but it creates a cleanup
+problem. Cleanup and lifetime are handled explicitly with stack save and
+restore calls. In the future, we may be able to avoid this by using
+:ref:`llvm.lifetime.start <int_lifestart>` and :ref:`llvm.lifetime.end
+<int_lifeend>` instead.
+
+Nested Calls and Copy Elision
+-----------------------------
+
+The next consideration is the ability for the frontend to perform copy
+elision in the face of nested calls. Consider the evaluation of
+``foo(foo(Bar()))``, where ``foo`` takes and returns a ``Bar`` object by
+value and ``Bar`` has non-trivial constructors. In this case, we want
+to be able to elide copies into ``foo``'s argument slots. That means we
+need to have more than one set of argument frames active at the same
+time. First, we need to allocate the frame for the outer call so we can
+pass it in as the hidden struct return pointer to the middle call. Then
+we do the same for the middle call, allocating a frame and passing its
+address to ``Bar``'s default constructor. By wrapping the evaluation of
+the inner ``foo`` with stack save and restore, we can have multiple
+overlapping active call frames.
+
+Callee-cleanup Calling Conventions
+----------------------------------
+
+Another wrinkle is the existence of callee-cleanup conventions. On
+Windows, all methods and many other functions adjust the stack to clear
+the memory used to pass their arguments. In some sense, this means that
+the allocas are automatically cleared by the call. However, LLVM
+instead models this as a write of undef to all of the inalloca values
+passed to the call instead of a stack adjustment. Frontends should
+still restore the stack pointer to avoid a stack leak.
+
+Exceptions
+----------
+
+There is also the possibility of an exception. If argument evaluation
+or copy construction throws an exception, the landing pad must do
+cleanup, which includes adjusting the stack pointer to avoid a stack
+leak. This means the cleanup of the stack memory cannot be tied to the
+call itself. There needs to be a separate IR-level instruction that can
+perform independent cleanup of arguments.
+
+Efficiency
+----------
+
+Eventually, it should be possible to generate efficient code for this
+construct. In particular, using inalloca should not require a base
+pointer. If the backend can prove that all points in the CFG only have
+one possible stack level, then it can address the stack directly from
+the stack pointer. While this is not yet implemented, the plan is that
+the inalloca attribute should not change much, but the frontend IR
+generation recommendations may change.
diff --git a/docs/LangRef.rst b/docs/LangRef.rst
index 2faa15692b..86b5a15f25 100644
--- a/docs/LangRef.rst
+++ b/docs/LangRef.rst
@@ -697,6 +697,39 @@ Currently, only the following parameter attributes are defined:
     site. If the alignment is not specified, then the code generator
     makes a target-specific assumption.
 
+.. _attr_inalloca:
+
+``inalloca``
+
+.. Warning:: This feature is unstable and not fully implemented.
+
+    The ``inalloca`` argument attribute allows the caller to get the
+    address of an outgoing argument to a ``call`` or ``invoke`` before
+    it executes. It is similar to ``byval`` in that it is used to pass
+    arguments by value, but it guarantees that the argument will not be
+    copied.
+
+    To be :ref:`well formed <wellformed>`, the caller must pass in an
+    alloca value into an ``inalloca`` parameter, and an alloca may be
+    used as an ``inalloca`` argument at most once. The attribute can
+    only be applied to parameters that would be passed in memory and not
+    registers. The ``inalloca`` attribute cannot be used in conjunction
+    with other attributes that affect argument storage, like ``inreg``,
+    ``nest``, ``sret``, or ``byval``. The ``inalloca`` stack space is
+    considered to be clobbered by any call that uses it, so any
+    ``inalloca`` parameters cannot be marked ``readonly``.
+
+    Allocas passed with ``inalloca`` to a call must be in the opposite
+    order of the parameter list, meaning that the rightmost argument
+    must be allocated first. If a call has inalloca arguments, no other
+    allocas can occur between the first alloca used by the call and the
+    call site, unless they are cleared by calls to
+    :ref:`llvm.stackrestore <int_stackrestore>`. Violating these rules
+    results in undefined behavior at runtime.
+
+    See :doc:`InAlloca` for more information on how to use this
+    attribute.
+
 ``sret``
     This indicates that the pointer parameter specifies the address of a
     structure that is the return value of the function in the source
@@ -8419,6 +8452,8 @@ Memory Use Markers
 This class of intrinsics exists to information about the lifetime of
 memory objects and ranges where variables are immutable.
 
+.. _int_lifestart:
+
 '``llvm.lifetime.start``' Intrinsic
 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
 
@@ -8450,6 +8485,8 @@ of the memory pointed to by ``ptr`` is dead. This means that it is known
 to never be used and has an undefined value. A load from the pointer
 that precedes this intrinsic can be replaced with ``'undef'``.
 
+.. _int_lifeend:
+
 '``llvm.lifetime.end``' Intrinsic
 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
 
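
Reader's note on the nested-call case in the new InAlloca.rst: the patch describes ``foo(foo(Bar()))`` only in prose, so a source-level sketch may help when reviewing it. The C++ below is not part of the commit. ``CtorCounts``, ``counts`` and ``run_nested_call`` are hypothetical names introduced here for illustration, and the counters only make constructor activity observable. It assumes nothing about how any particular ABI lowers the calls; it just shows that the temporary is default-constructed once and that no copy constructor has to run along the way.

```cpp
#include <cassert>

// Hypothetical bookkeeping, not from the patch: counts constructor calls.
struct CtorCounts {
  int default_ctor = 0;
  int copy_ctor = 0;
  int move_ctor = 0;
};
static CtorCounts counts;

// Stand-in for the ``Bar`` of the text: non-trivial constructors mean the
// object has to be passed through memory rather than in registers.
struct Bar {
  int value;
  Bar() : value(42) { ++counts.default_ctor; }
  Bar(const Bar &other) : value(other.value) { ++counts.copy_ctor; }
  Bar(Bar &&other) noexcept : value(other.value) { ++counts.move_ctor; }
  ~Bar() {}
};

// ``foo`` takes and returns a ``Bar`` by value, as in the text.
Bar foo(Bar b) {
  b.value += 1;
  return b;  // returning a by-value parameter selects the move constructor
}

// Evaluates the nested expression from the text and returns the final value.
int run_nested_call() {
  counts = CtorCounts();
  Bar result = foo(foo(Bar()));
  // The temporary is built once; every later transfer is a move or is elided.
  assert(counts.default_ctor == 1);
  assert(counts.copy_ctor == 0);
  return result.value;
}
```

The number of moves is deliberately left unchecked, since it depends on the language version and on how much elision the compiler performs. The argument-slot elision that the patch is concerned with happens below this level, in how the frontend allocates the outgoing argument memory.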