summaryrefslogtreecommitdiff
path: root/utils/TableGen/DisassemblerEmitter.cpp
diff options
context:
space:
mode:
authorSean Callanan <scallanan@apple.com>2009-12-19 02:59:52 +0000
committerSean Callanan <scallanan@apple.com>2009-12-19 02:59:52 +0000
commit8ed9f51663bc5533f36ca62e5668ae08e9a1313f (patch)
tree3054645839caee367e9403507d8487538819ed5b /utils/TableGen/DisassemblerEmitter.cpp
parente9ec6ad1ba5fd9ad70f5d0c059c5a5aa44f501f7 (diff)
downloadllvm-8ed9f51663bc5533f36ca62e5668ae08e9a1313f.tar.gz
llvm-8ed9f51663bc5533f36ca62e5668ae08e9a1313f.tar.bz2
llvm-8ed9f51663bc5533f36ca62e5668ae08e9a1313f.tar.xz
Table-driven disassembler for the X86 architecture (16-, 32-, and 64-bit
incarnations), integrated into the MC framework. The disassembler is table-driven, using a custom TableGen backend to generate hierarchical tables optimized for fast decode. The disassembler consumes MemoryObjects and produces arrays of MCInsts, adhering to the abstract base class MCDisassembler (llvm/MC/MCDisassembler.h). The disassembler is documented in detail in - lib/Target/X86/Disassembler/X86Disassembler.cpp (disassembler runtime) - utils/TableGen/DisassemblerEmitter.cpp (table emitter) You can test the disassembler by running llvm-mc -disassemble for i386 or x86_64 targets. Please let me know if you encounter any problems with it. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@91749 91177308-0d34-0410-b5e6-96231b3b80d8
Diffstat (limited to 'utils/TableGen/DisassemblerEmitter.cpp')
-rw-r--r--utils/TableGen/DisassemblerEmitter.cpp99
1 files changed, 99 insertions, 0 deletions
diff --git a/utils/TableGen/DisassemblerEmitter.cpp b/utils/TableGen/DisassemblerEmitter.cpp
index cc131257cf..61b9b1583b 100644
--- a/utils/TableGen/DisassemblerEmitter.cpp
+++ b/utils/TableGen/DisassemblerEmitter.cpp
@@ -10,7 +10,86 @@
#include "DisassemblerEmitter.h"
#include "CodeGenTarget.h"
#include "Record.h"
+#include "X86DisassemblerTables.h"
+#include "X86RecognizableInstr.h"
using namespace llvm;
+using namespace llvm::X86Disassembler;
+
+/// DisassemblerEmitter - Contains disassembler table emitters for various
+/// architectures.
+
+/// X86 Disassembler Emitter
+///
+/// *** IF YOU'RE HERE TO RESOLVE A "Primary decode conflict", LOOK DOWN NEAR
+/// THE END OF THIS COMMENT!
+///
+/// The X86 disassembler emitter is part of the X86 Disassembler, which is
+/// documented in lib/Target/X86/X86Disassembler.h.
+///
+/// The emitter produces the tables that the disassembler uses to translate
+/// instructions. The emitter generates the following tables:
+///
+/// - One table (CONTEXTS_SYM) that contains a mapping of attribute masks to
+/// instruction contexts. Although for each attribute there are cases where
+/// that attribute determines decoding, in the majority of cases decoding is
+/// the same whether or not an attribute is present. For example, a 64-bit
+/// instruction with an OPSIZE prefix and an XS prefix decodes the same way in
+/// all cases as a 64-bit instruction with only OPSIZE set. (The XS prefix
+/// may have effects on its execution, but does not change the instruction
+/// returned.) This allows considerable space savings in other tables.
+/// - Four tables (ONEBYTE_SYM, TWOBYTE_SYM, THREEBYTE38_SYM, and
+/// THREEBYTE3A_SYM) contain the hierarchy that the decoder traverses while
+/// decoding an instruction. At the lowest level of this hierarchy are
+/// instruction UIDs, 16-bit integers that can be used to uniquely identify
+/// the instruction and correspond exactly to its position in the list of
+/// CodeGenInstructions for the target.
+/// - One table (INSTRUCTIONS_SYM) contains information about the operands of
+/// each instruction and how to decode them.
+///
+/// During table generation, there may be conflicts between instructions that
+/// occupy the same space in the decode tables. These conflicts are resolved as
+/// follows in setTableFields() (X86DisassemblerTables.cpp)
+///
+/// - If the current context is the native context for one of the instructions
+/// (that is, the attributes specified for it in the LLVM tables specify
+/// precisely the current context), then it has priority.
+/// - If the current context isn't native for either of the instructions, then
+/// the higher-priority context wins (that is, the one that is more specific).
+/// That hierarchy is determined by outranks() (X86DisassemblerTables.cpp)
+/// - If the current context is native for both instructions, then the table
+/// emitter reports a conflict and dies.
+///
+/// *** RESOLUTION FOR "Primary decode conflict"S
+///
+/// If two instructions collide, typically the solution is (in order of
+/// likelihood):
+///
+/// (1) to filter out one of the instructions by editing filter()
+/// (X86RecognizableInstr.cpp). This is the most common resolution, but
+/// check the Intel manuals first to make sure that (2) and (3) are not the
+/// problem.
+/// (2) to fix the tables (X86.td and its subsidiaries) so the opcodes are
+/// accurate. Sometimes they are not.
+/// (3) to fix the tables to reflect the actual context (for example, required
+/// prefixes), and possibly to add a new context by editing
+/// lib/Target/X86/X86DisassemblerDecoderCommon.h. This is unlikely to be
+/// the cause.
+///
+/// DisassemblerEmitter.cpp contains the implementation for the emitter,
+/// which simply pulls out instructions from the CodeGenTarget and pushes them
+/// into X86DisassemblerTables.
+/// X86DisassemblerTables.h contains the interface for the instruction tables,
+/// which manage and emit the structures discussed above.
+/// X86DisassemblerTables.cpp contains the implementation for the instruction
+/// tables.
+/// X86ModRMFilters.h contains filters that can be used to determine which
+/// ModR/M values are valid for a particular instruction. These are used to
+/// populate ModRMDecisions.
+/// X86RecognizableInstr.h contains the interface for a single instruction,
+/// which knows how to translate itself from a CodeGenInstruction and provide
+/// the information necessary for integration into the tables.
+/// X86RecognizableInstr.cpp contains the implementation for a single
+/// instruction.
void DisassemblerEmitter::run(raw_ostream &OS) {
CodeGenTarget Target;
@@ -25,6 +104,26 @@ void DisassemblerEmitter::run(raw_ostream &OS) {
<< " *===---------------------------------------------------------------"
<< "-------===*/\n";
+ // X86 uses a custom disassembler.
+ if (Target.getName() == "X86") {
+ DisassemblerTables Tables;
+
+ std::vector<const CodeGenInstruction*> numberedInstructions;
+ Target.getInstructionsByEnumValue(numberedInstructions);
+
+ for (unsigned i = 0, e = numberedInstructions.size(); i != e; ++i)
+ RecognizableInstr::processInstr(Tables, *numberedInstructions[i], i);
+
+ // FIXME: As long as we are using exceptions, might as well drop this to the
+ // actual conflict site.
+ if (Tables.hasConflicts())
+ throw TGError(Target.getTargetRecord()->getLoc(),
+ "Primary decode conflict");
+
+ Tables.emit(OS);
+ return;
+ }
+
throw TGError(Target.getTargetRecord()->getLoc(),
"Unable to generate disassembler for this target");
}