Writing an LLVM backend

In general, you want to follow the format of SPARC, X86 or PowerPC (in lib/Target). SPARC is the simplest backend, and is RISC, so if you're working on a RISC target, it is a good one to start with.

To create a static compiler (one that emits text assembly), you need to implement the following:

Describe the register set.
- Create a TableGen description of the register set and register classes
- Implement a subclass of TargetRegisterInfo
Describe the instruction set.
- Create a TableGen description of the instruction set
- Implement a subclass of TargetInstrInfo
Describe the target machine.
- Create a TableGen description of the target that describes the pointer size and references the instruction set
- Implement a subclass of TargetMachine, which configures TargetData correctly
- Register your new target using the RegisterTarget template:
```
RegisterTarget<MyTargetMachine> M("short_name", "  Target name");
```
  Here, MyTargetMachine is the name of your subclass of TargetMachine, short_name is the name given after -march= to select the target in llc and lli, and the last string is the description of your target shown in the -help listing.
Implement the assembly printer for the architecture.
- Define all of the assembly strings for your target, adding them to the instructions in your *InstrInfo.td file.
- Implement the llvm::AsmPrinter interface.
Implement an instruction selector for the architecture.
- The recommended method is the pattern-matching DAG-to-DAG instruction selector (for example, see the PowerPC backend in PPCISelDAGtoDAG.cpp). Parts of instruction selector creation can be performed by adding patterns to the instructions in your .td file.
Optionally, add subtarget support.
- If your target has multiple subtargets (e.g. variants with different capabilities), implement the llvm::TargetSubtarget interface for your architecture. This allows you to add -mcpu= and -mattr= options.
Optionally, add JIT support.
- Create a subclass of TargetJITInfo
- Create a machine code emitter that will be used to emit binary code directly into memory, given MachineInstrs
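
The registration step above relies on a global object whose constructor runs at program start-up and records the target under its short name. The following is a minimal, self-contained sketch of that idea only. It does not use the LLVM headers, and every name in it (ToyTargetMachine, ToyRegisterTarget, toyRegistry, lookupToyTarget) is hypothetical rather than part of the LLVM API.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <string>

// Hypothetical stand-in for TargetMachine; NOT the LLVM class.
struct ToyTargetMachine {
  virtual ~ToyTargetMachine() = default;
  virtual const char *name() const = 0;
};

// Hypothetical stand-in for your own TargetMachine subclass.
struct DummyToyTargetMachine : ToyTargetMachine {
  const char *name() const override { return "Dummy"; }
};

// One registry entry: the -help description plus a factory function.
struct ToyTargetEntry {
  std::string Description;
  std::unique_ptr<ToyTargetMachine> (*Create)();
};

// Function-local static, so the map exists before any registrar runs.
inline std::map<std::string, ToyTargetEntry> &toyRegistry() {
  static std::map<std::string, ToyTargetEntry> Registry;
  return Registry;
}

// Same shape as RegisterTarget<T>: constructing an object registers T.
template <typename TargetMachineT> struct ToyRegisterTarget {
  ToyRegisterTarget(const char *ShortName, const char *Description) {
    assert(toyRegistry().count(ShortName) == 0 && "duplicate target name");
    ToyTargetEntry Entry;
    Entry.Description = Description;
    Entry.Create = &create;
    toyRegistry()[ShortName] = Entry;
  }
  static std::unique_ptr<ToyTargetMachine> create() {
    return std::unique_ptr<ToyTargetMachine>(new TargetMachineT());
  }
};

// Global registrar object: its constructor runs before main().
static ToyRegisterTarget<DummyToyTargetMachine> X("dummy", "  Dummy target");

// Roughly what a driver would do with the name given after -march=.
inline const ToyTargetEntry *lookupToyTarget(const std::string &ShortName) {
  auto It = toyRegistry().find(ShortName);
  return It == toyRegistry().end() ? nullptr : &It->second;
}
```

In this sketch, lookupToyTarget("dummy") returns the entry created by the global X, and its Create function builds a DummyToyTargetMachine. An unknown name yields a null pointer.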

TableGen register info description: define a class that stores each register's number as it appears in the binary encoding of an instruction (e.g., for JIT purposes).

You also need to define register classes to contain these registers, such as the integer register class and floating-point register class, so that you can allocate virtual registers to instructions from these sets, and let the target-independent register allocator automatically choose the actual architected registers.
```
// class Register is defined in Target.td
class TargetReg<string name> : Register<name> {
  let Namespace = "Target";
}

class IntReg<bits<5> num, string name> : TargetReg<name> {
  field bits<5> Num = num;
}

def R0 : IntReg<0, "%R0">;
...

// class RegisterClass is defined in Target.td
def IReg : RegisterClass<i64, 64, [R0, ... ]>;
```
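
To illustrate why register classes matter, here is a deliberately naive, self-contained sketch of assigning virtual registers to physical registers drawn from one class. It is not LLVM's register allocator and does not use any LLVM types. ToyReg, ToyRegClass and ToyAllocator are hypothetical names, and the allocator simply hands out registers in order without spilling.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Hypothetical mirror of an IntReg record: a name plus its encoding number.
struct ToyReg {
  std::string Name;
  unsigned Num; // value placed in an instruction's register field
};

// Hypothetical mirror of a RegisterClass: the set an allocator may pick from.
struct ToyRegClass {
  std::vector<ToyReg> Regs;
};

// Naive allocator: first come, first served; no liveness analysis, no spills.
class ToyAllocator {
  const ToyRegClass &RC;
  std::map<unsigned, const ToyReg *> Assigned; // virtual reg -> physical reg

public:
  // The register class must outlive the allocator.
  explicit ToyAllocator(const ToyRegClass &Class) : RC(Class) {}

  const ToyReg &assign(unsigned VirtReg) {
    auto It = Assigned.find(VirtReg);
    if (It != Assigned.end())
      return *It->second; // already mapped: reuse the same physical register
    assert(Assigned.size() < RC.Regs.size() &&
           "out of registers; a real allocator would spill here");
    const ToyReg *Phys = &RC.Regs[Assigned.size()];
    Assigned[VirtReg] = Phys;
    return *Phys;
  }
};
```

The point of the sketch is the division of labour: the target only lists which registers belong to the class, and the target-independent side decides which one each virtual register receives.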
TableGen instruction info description: break up the instructions into classes. Usually the manufacturer has already done this (see the instruction manual). Define a class for each instruction category, then define each opcode as a subclass of its category, with appropriate parameters such as the fixed binary encoding of opcodes and extended opcodes. Map the register bits to the bits of the instruction in which they are encoded (for the JIT). Also specify how the instruction should be printed, so that it can use the automatic assembly printer, e.g.:
```
// class Instruction is defined in Target.td
class Form<bits<6> opcode, dag OL, string asmstr> : Instruction {
  field bits<42> Inst;

  let Namespace = "Target";
  let Inst{0-5} = opcode;  // opcode is 6 bits wide
  let OperandList = OL;
  let AsmString = asmstr;
}

def ADD : Form<42, (ops IReg:$rD, IReg:$rA, IReg:$rB), "add $rD, $rA, $rB">;
```
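
The two jobs of an instruction description, binary encoding and assembly printing, can be sketched in plain C++ as follows. This is an illustration under stated assumptions, not what TableGen generates: the bit layout (opcode in bits 0-5, followed by three 5-bit register fields) and the function names encodeToyInst and printToyInst are hypothetical.

```cpp
#include <cassert>
#include <cctype>
#include <cstdint>
#include <map>
#include <string>

// Assumed layout for illustration only: opcode in bits 0-5, then rD, rA, rB
// in consecutive 5-bit fields. A real layout comes from the target's manual.
inline uint64_t encodeToyInst(unsigned Opcode, unsigned RD, unsigned RA,
                              unsigned RB) {
  assert(Opcode < 64 && RD < 32 && RA < 32 && RB < 32 && "field overflow");
  return uint64_t(Opcode) | (uint64_t(RD) << 6) | (uint64_t(RA) << 11) |
         (uint64_t(RB) << 16);
}

// Expands "$name" placeholders in an AsmString-like template.
inline std::string
printToyInst(const std::string &AsmString,
             const std::map<std::string, std::string> &Operands) {
  std::string Out;
  for (std::size_t I = 0; I < AsmString.size();) {
    if (AsmString[I] != '$') {
      Out += AsmString[I++]; // ordinary character: copy through
      continue;
    }
    // Collect the operand name: letters and digits following '$'.
    std::size_t J = I + 1;
    while (J < AsmString.size() &&
           std::isalnum(static_cast<unsigned char>(AsmString[J])))
      ++J;
    auto It = Operands.find(AsmString.substr(I + 1, J - I - 1));
    assert(It != Operands.end() && "unknown operand name");
    Out += It->second;
    I = J;
  }
  return Out;
}
```

With the operands rD, rA and rB bound to "%R1", "%R2" and "%R3", printToyInst expands the string "add $rD, $rA, $rB" into "add %R1, %R2, %R3".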

For now, just take a look at lib/Target/CBackend for an example of how the C backend is written.

To actually create your backend, you need to create and modify a few files. Only the absolute minimum is discussed here. To use LLVM's target-independent code generator, you must implement additional pieces.

First of all, you should create a subdirectory under lib/Target, which will hold all the files related to your target. Assuming that our target is called "Dummy", we would create the directory lib/Target/Dummy.

In this new directory, you should put a Makefile. You can probably copy one from another target and modify it. It should at least contain the LEVEL, LIBRARYNAME and TARGET variables, and then include $(LEVEL)/Makefile.common. Be careful to give the library the correct name: it must be named LLVMDummy (see the MIPS target, for example). Alternatively, you can split the library into LLVMDummyCodeGen and LLVMDummyAsmPrinter, the latter of which should be implemented in a subdirectory below lib/Target/Dummy (see the PowerPC target, for example).

Note that these two naming schemes are hardcoded into llvm-config. Using any other naming scheme will confuse llvm-config and produce lots of (seemingly unrelated) linker errors when linking llc.

To make your target actually do something, you need to implement a subclass of TargetMachine. This implementation should typically be in the file lib/Target/Dummy/DummyTargetMachine.cpp, but any file in the lib/Target/Dummy directory will be built and should work. To use LLVM's target-independent code generator, create a subclass of LLVMTargetMachine; this is what all current machine backends do. To create a target from scratch, subclass TargetMachine directly; this is what the current language backends do.

To get LLVM to actually build and link your target, you also need to add it to the TARGETS_TO_BUILD variable. To do this, modify the configure script so that it knows about your target when parsing the --enable-targets option. Search the configure script for TARGETS_TO_BUILD, add your target to the lists there (some creativity required) and then reconfigure. Alternatively, you can change autoconf/configure.ac and regenerate configure by running ./autoconf/AutoRegen.sh.