From bef3ef99752ba2753decefc5d7f9e80c3e5d47b6 Mon Sep 17 00:00:00 2001
From: Bill Wendling
Date: Sun, 7 Oct 2012 04:56:08 +0000
Subject: Sphinxify the ExtendingLLVM documentation.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@165371 91177308-0d34-0410-b5e6-96231b3b80d8
---
 docs/ExtendingLLVM.rst | 306 +++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 306 insertions(+)
 create mode 100644 docs/ExtendingLLVM.rst

diff --git a/docs/ExtendingLLVM.rst b/docs/ExtendingLLVM.rst
new file mode 100644
index 0000000000..e41cfd996e
--- /dev/null
+++ b/docs/ExtendingLLVM.rst
@@ -0,0 +1,306 @@
+.. _extending_llvm:
+
+============================================================
+Extending LLVM: Adding instructions, intrinsics, types, etc.
+============================================================
+
+Introduction and Warning
+========================
+
+
+During the course of using LLVM, you may wish to customize it for your research
+project or for experimentation. At this point, you may realize that you need to
+add something to LLVM, whether it be a new fundamental type, a new intrinsic
+function, or a whole new instruction.
+
+When you come to this realization, stop and think. Do you really need to extend
+LLVM? Is it a new fundamental capability that LLVM does not support in its
+current incarnation, or can it be synthesized from existing LLVM elements? If
+you are not sure, ask on the LLVM-dev list. Extending LLVM gets involved
+because you need to update all the different passes that you intend to use with
+your extension, and there are *many* LLVM analyses and transformations, so it
+may be quite a bit of work.
+
+Adding an `intrinsic function`_ is far easier than adding an
+instruction, and is transparent to optimization passes. If your added
+functionality can be expressed as a function call, an intrinsic function is the
+method of choice for LLVM extension.
+
+Before you invest a significant amount of effort into a non-trivial extension,
+**ask on the list** if what you are looking to do can be done with
+already-existing infrastructure, or if maybe someone else is already working on
+it. You will save yourself a lot of time and effort by doing so.
+
+.. _intrinsic function:
+
+Adding a new intrinsic function
+===============================
+
+Adding a new intrinsic function to LLVM is much easier than adding a new
+instruction. Almost all extensions to LLVM should start as an intrinsic
+function and then be turned into an instruction if warranted.
+
+#. ``llvm/docs/LangRef.html``:
+
+   Document the intrinsic. Decide whether it is code generator specific and
+   what the restrictions are. Talk to other people about it so that you are
+   sure it's a good idea.
+
+#. ``llvm/include/llvm/Intrinsics*.td``:
+
+   Add an entry for your intrinsic. Describe its memory access characteristics
+   for optimization (this controls whether it will be DCE'd, CSE'd, etc.). Note
+   that any intrinsic using the ``llvm_int_ty`` type for an argument will
+   be deemed by ``tblgen`` as overloaded and the corresponding suffix will
+   be required on the intrinsic's name.
+
+#. ``llvm/lib/Analysis/ConstantFolding.cpp``:
+
+   If it is possible to constant fold your intrinsic, add support for it in the
+   ``canConstantFoldCallTo`` and ``ConstantFoldCall`` functions.
+
+#. ``llvm/test/Regression/*``:
+
+   Add test cases for your intrinsic to the test suite.
+
+Once the intrinsic has been added to the system, you must add code generator
+support for it. Generally you must do the following steps:
+
+Add support to the .td file for the target(s) of your choice in
+``lib/Target/*/*.td``.
+
+  This is usually a matter of adding a pattern to the .td file that matches the
+  intrinsic, though it may also require adding the instructions you want to
+  generate. There are lots of examples in the PowerPC and X86 backends to
+  follow.
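As a rough illustration of the ``ConstantFolding.cpp`` step in the intrinsic checklist, the sketch below mimics the two-function split that step describes: a predicate that says whether an intrinsic can be folded, and a routine that computes the folded value from constant arguments. Everything in it is a hypothetical stand-in. ``ToyIntrinsic``, ``canConstantFoldToy``, and ``constantFoldToy`` are invented names and are not the signatures of LLVM's ``canConstantFoldCallTo`` or ``ConstantFoldCall``, which operate on LLVM IR objects rather than plain integers.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical intrinsic IDs, invented for this sketch.
enum class ToyIntrinsic { ByteSwap32, PopCount32, ReadCycleCounter };

// Plays the role of a "can this call be folded?" predicate: only intrinsics
// whose result depends solely on their arguments qualify.
bool canConstantFoldToy(ToyIntrinsic ID) {
  switch (ID) {
  case ToyIntrinsic::ByteSwap32:
  case ToyIntrinsic::PopCount32:
    return true;
  case ToyIntrinsic::ReadCycleCounter:
    return false; // Result depends on machine state, not on arguments.
  }
  return false;
}

// Plays the role of the folding routine: given constant arguments, compute
// the result ahead of time. Returns false when folding is not possible.
bool constantFoldToy(ToyIntrinsic ID, const std::vector<std::uint32_t> &Args,
                     std::uint32_t &Result) {
  if (!canConstantFoldToy(ID) || Args.size() != 1)
    return false;
  const std::uint32_t V = Args[0];
  switch (ID) {
  case ToyIntrinsic::ByteSwap32:
    // Reverse the order of the four bytes.
    Result = ((V & 0x000000FFu) << 24) | ((V & 0x0000FF00u) << 8) |
             ((V & 0x00FF0000u) >> 8) | ((V & 0xFF000000u) >> 24);
    return true;
  case ToyIntrinsic::PopCount32: {
    std::uint32_t Count = 0;
    for (std::uint32_t Bits = V; Bits != 0; Bits &= Bits - 1)
      ++Count; // Each iteration clears the lowest set bit.
    assert(Count <= 32 && "a 32-bit value has at most 32 set bits");
    Result = Count;
    return true;
  }
  case ToyIntrinsic::ReadCycleCounter:
    break; // Already rejected by the predicate above.
  }
  return false;
}
```

Keeping the predicate separate from the folding routine lets a caller ask cheaply whether folding is worth attempting before it gathers constant operands; that motivation is an assumption of this sketch rather than something the checklist states.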
+
+Adding a new SelectionDAG node
+==============================
+
+As with intrinsics, adding a new SelectionDAG node to LLVM is much easier than
+adding a new instruction. New nodes are often added to help represent
+instructions common to many targets. These nodes often map to an LLVM
+instruction (add, sub) or intrinsic (byteswap, population count). In other
+cases, new nodes have been added to allow many targets to perform a common task
+(converting between floating point and integer representation) or capture more
+complicated behavior in a single node (rotate).
+
+#. ``include/llvm/CodeGen/ISDOpcodes.h``:
+
+   Add an enum value for the new SelectionDAG node.
+
+#. ``lib/CodeGen/SelectionDAG/SelectionDAG.cpp``:
+
+   Add code to print the node to ``getOperationName``. If your new node can be
+   evaluated at compile time when given constant arguments (such as an add of a
+   constant with another constant), find the ``getNode`` method that takes the
+   appropriate number of arguments, and add a case for your node to the switch
+   statement that performs constant folding for nodes that take the same number
+   of arguments as your new node.
+
+#. ``lib/CodeGen/SelectionDAG/LegalizeDAG.cpp``:
+
+   Add code to legalize, promote, and expand the node as necessary. At a
+   minimum, you will need to add a case statement for your node in
+   ``LegalizeOp`` which calls ``LegalizeOp`` on the node's operands, and
+   returns a new node if any of the operands changed as a result of being
+   legalized. It is likely that not all targets supported by the SelectionDAG
+   framework will natively support the new node. In this case, you must also
+   add code in your node's case statement in ``LegalizeOp`` to Expand your node
+   into simpler, legal operations. The case for ``ISD::UREM`` for expanding a
+   remainder into a divide, multiply, and a subtract is a good example.
+
+#. ``lib/CodeGen/SelectionDAG/LegalizeDAG.cpp``:
+
+   If targets may support the new node being added only at certain sizes, you
+   will also need to add code to your node's case statement in ``LegalizeOp``
+   to Promote your node's operands to a larger size, and perform the correct
+   operation. You will also need to add code to ``PromoteOp`` to do this.
+   For a good example, see ``ISD::BSWAP``, which promotes its operand to
+   a wider size, performs the byteswap, and then shifts the correct bytes right
+   to emulate the narrower byteswap in the wider type.
+
+#. ``lib/CodeGen/SelectionDAG/LegalizeDAG.cpp``:
+
+   Add a case for your node in ``ExpandOp`` to teach the legalizer how to
+   perform the action represented by the new node on a value that has been split
+   into high and low halves. This case will be used to support your node with a
+   64-bit operand on a 32-bit target.
+
+#. ``lib/CodeGen/SelectionDAG/DAGCombiner.cpp``:
+
+   If your node can be combined with itself, or other existing nodes in a
+   peephole-like fashion, add a visit function for it, and call that function
+   from the combiner. There are several good examples for simple combines you
+   can do; ``visitFABS`` and ``visitSRL`` are good starting places.
+
+#. ``lib/Target/PowerPC/PPCISelLowering.cpp``:
+
+   Each target has an implementation of the ``TargetLowering`` class, usually in
+   its own file (although some targets include it in the same file as the
+   DAGToDAGISel). The default behavior for a target is to assume that your new
+   node is legal for all types that are legal for that target. If this target
+   does not natively support your node, then tell the target to either Promote
+   it (if it is supported at a larger type) or Expand it. This will cause the
+   code you wrote in ``LegalizeOp`` above to decompose your new node into other
+   legal nodes for this target.
+
+#. ``lib/Target/TargetSelectionDAG.td``:
+
+   Most current targets supported by LLVM generate code using the DAGToDAG
+   method, where SelectionDAG nodes are pattern matched to target-specific
+   nodes, which represent individual instructions. In order for the targets to
+   match an instruction to your new node, you must add a def for that node to
+   the list in this file, with the appropriate type constraints. Look at
+   ``add``, ``bswap``, and ``fadd`` for examples.
+
+#. ``lib/Target/PowerPC/PPCInstrInfo.td``:
+
+   Each target has a tablegen file that describes the target's instruction set.
+   For targets that use the DAGToDAG instruction selection framework, add a
+   pattern for your new node that uses one or more target nodes. Documentation
+   for this is a bit sparse right now, but there are several decent examples.
+   See the patterns for ``rotl`` in ``PPCInstrInfo.td``.
+
+#. TODO: document complex patterns.
+
+#. ``llvm/test/Regression/CodeGen/*``:
+
+   Add test cases for your new node to the test suite.
+   ``llvm/test/Regression/CodeGen/X86/bswap.ll`` is a good example.
+
+Adding a new instruction
+========================
+
+.. warning::
+
+  Adding instructions changes the bitcode format, and it will take some effort
+  to maintain compatibility with the previous version. Only add an instruction
+  if it is absolutely necessary.
+
+#. ``llvm/include/llvm/Instruction.def``:
+
+   add a number for your instruction and an enum name
+
+#. ``llvm/include/llvm/Instructions.h``:
+
+   add a definition for the class that will represent your instruction
+
+#. ``llvm/include/llvm/Support/InstVisitor.h``:
+
+   add a prototype for a visitor to your new instruction type
+
+#. ``llvm/lib/AsmParser/Lexer.l``:
+
+   add a new token to parse your instruction from an assembly text file
+
+#. ``llvm/lib/AsmParser/llvmAsmParser.y``:
+
+   add the grammar on how your instruction can be read and what it will
+   construct as a result
+
+#. ``llvm/lib/Bitcode/Reader/Reader.cpp``:
+
+   add a case for your instruction and how it will be parsed from bitcode
+
+#. ``llvm/lib/VMCore/Instruction.cpp``:
+
+   add a case for how your instruction will be printed out to assembly
+
+#. ``llvm/lib/VMCore/Instructions.cpp``:
+
+   implement the class you defined in ``llvm/include/llvm/Instructions.h``
+
+#. Test your instruction
+
+#. ``llvm/lib/Target/*``:
+
+   add support for your instruction to code generators, or add a lowering pass.
+
+#. ``llvm/test/Regression/*``:
+
+   add your test cases to the test suite.
+
+Also, you need to implement (or modify) any analyses or passes that you want to
+understand this new instruction.
+
+Adding a new type
+=================
+
+.. warning::
+
+  Adding new types changes the bitcode format, and will break compatibility with
+  currently-existing LLVM installations. Only add new types if it is absolutely
+  necessary.
+
+Adding a fundamental type
+-------------------------
+
+#. ``llvm/include/llvm/Type.h``:
+
+   add enum for the new type; add static ``Type*`` for this type
+
+#. ``llvm/lib/VMCore/Type.cpp``:
+
+   add mapping from ``TypeID`` => ``Type*``; initialize the static ``Type*``
+
+#. ``llvm/lib/AsmReader/Lexer.l``:
+
+   add ability to parse in the type from text assembly
+
+#. ``llvm/lib/AsmReader/llvmAsmParser.y``:
+
+   add a token for that type
+
+Adding a derived type
+---------------------
+
+#. ``llvm/include/llvm/Type.h``:
+
+   add enum for the new type; add a forward declaration of the type also
+
+#. ``llvm/include/llvm/DerivedTypes.h``:
+
+   add a new class to represent the new type in the hierarchy; add forward
+   declaration to the TypeMap value type
+
+#. ``llvm/lib/VMCore/Type.cpp``:
+
+   add support for derived type to:
+
+   .. code:: c++
+
+     std::string getTypeDescription(const Type &Ty,
+                                    std::vector<const Type*> &TypeStack)
+     bool TypesEqual(const Type *Ty, const Type *Ty2,
+                     std::map<const Type*, const Type*> &EqTypes)
+
+   add necessary member functions for type, and factory methods
+
+#. ``llvm/lib/AsmReader/Lexer.l``:
+
+   add ability to parse in the type from text assembly
+
+#. ``llvm/lib/BitCode/Writer/Writer.cpp``:
+
+   modify ``void BitcodeWriter::outputType(const Type *T)`` to serialize your
+   type
+
+#. ``llvm/lib/BitCode/Reader/Reader.cpp``:
+
+   modify ``const Type *BitcodeReader::ParseType()`` to read your data type
+
+#. ``llvm/lib/VMCore/AsmWriter.cpp``:
+
+   modify
+
+   .. code:: c++
+
+     void calcTypeName(const Type *Ty,
+                       std::vector<const Type*> &TypeStack,
+                       std::map<const Type*, std::string> &TypeNames,
+                       std::string &Result)
+
+   to output the new derived type
--
cgit v1.2.3
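The ``Writer.cpp`` and ``Reader.cpp`` steps for a derived type have to stay symmetric: whatever fields the writer emits for the new type, the reader must consume in the same order. The following self-contained sketch illustrates only that idea and does not use LLVM's bitcode API. ``ToyTypeID``, ``ToyType``, ``writeToyType``, and ``readToyType`` are hypothetical names, and the derived type ``FixedVec`` (an element bit width plus an element count) is made up for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical type tags; adding a type means adding an enumerator here.
enum class ToyTypeID : std::uint64_t { Int = 0, FixedVec = 1 };

struct ToyType {
  ToyTypeID ID;
  std::uint64_t BitWidth; // Width of the integer, or of each vector element.
  std::uint64_t NumElems; // Only meaningful for FixedVec; 0 otherwise.
};

// Writer side: emit the tag first, then the fields that the tag implies.
void writeToyType(const ToyType &T, std::vector<std::uint64_t> &Out) {
  assert((T.ID == ToyTypeID::Int || T.ID == ToyTypeID::FixedVec) &&
         "unknown type tag");
  Out.push_back(static_cast<std::uint64_t>(T.ID));
  Out.push_back(T.BitWidth);
  if (T.ID == ToyTypeID::FixedVec)
    Out.push_back(T.NumElems);
}

// Reader side: consume exactly what the writer produced, in the same order.
// Advances Pos past the record on success; returns false on truncated input
// or an unknown tag and leaves Pos unchanged.
bool readToyType(const std::vector<std::uint64_t> &In, std::size_t &Pos,
                 ToyType &T) {
  if (Pos > In.size() || In.size() - Pos < 2)
    return false;
  const std::uint64_t Tag = In[Pos];
  const std::uint64_t Width = In[Pos + 1];
  if (Tag == static_cast<std::uint64_t>(ToyTypeID::Int)) {
    T = ToyType{ToyTypeID::Int, Width, 0};
    Pos += 2;
    return true;
  }
  if (Tag == static_cast<std::uint64_t>(ToyTypeID::FixedVec)) {
    if (In.size() - Pos < 3)
      return false;
    T = ToyType{ToyTypeID::FixedVec, Width, In[Pos + 2]};
    Pos += 3;
    return true;
  }
  return false; // Unknown tag: the reader has no case for this type.
}
```

A round trip through ``writeToyType`` and ``readToyType`` should reproduce the original value. Forgetting to add the ``FixedVec`` case on the reader side makes ``readToyType`` return false for that tag, which is the toy analogue of the compatibility break the warning under "Adding a new type" describes.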