author | Jim Laskey <jlaskey@mac.com> | 2006-03-14 18:08:46 +0000 |
---|---|---|
committer | Jim Laskey <jlaskey@mac.com> | 2006-03-14 18:08:46 +0000 |
commit | cec12a5c30cf6dbb96733f5f01cd9cbbc8fbe249 (patch) | |
tree | 7b074c7d62ae72ca9a3d677a70181973120d605b /docs/SourceLevelDebugging.html | |
parent | a08610c8a534501bc4301c5037e883f180b19a99 (diff) | |
Bring debugging information up to date.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@26759 91177308-0d34-0410-b5e6-96231b3b80d8
Diffstat (limited to 'docs/SourceLevelDebugging.html')
-rw-r--r-- | docs/SourceLevelDebugging.html | 1802 |
1 files changed, 1166 insertions, 636 deletions
diff --git a/docs/SourceLevelDebugging.html b/docs/SourceLevelDebugging.html index c735e4e781..6a3d675080 100644 --- a/docs/SourceLevelDebugging.html +++ b/docs/SourceLevelDebugging.html @@ -17,46 +17,41 @@ <ol> <li><a href="#phil">Philosophy behind LLVM debugging information</a></li> <li><a href="#debugopt">Debugging optimized code</a></li> - <li><a href="#future">Future work</a></li> </ol></li> - <li><a href="#llvm-db">Using the <tt>llvm-db</tt> tool</a> - <ol> - <li><a href="#limitations">Limitations of <tt>llvm-db</tt></a></li> - <li><a href="#sample">A sample <tt>llvm-db</tt> session</a></li> - <li><a href="#startup">Starting the debugger</a></li> - <li><a href="#commands">Commands recognized by the debugger</a></li> - </ol></li> - - <li><a href="#architecture">Architecture of the LLVM debugger</a> - <ol> - <li><a href="#arch_debugger">The Debugger and InferiorProcess classes</a></li> - <li><a href="#arch_info">The RuntimeInfo, ProgramInfo, and SourceLanguage classes</a></li> - <li><a href="#arch_llvm-db">The <tt>llvm-db</tt> tool</a></li> - <li><a href="#arch_todo">Short-term TODO list</a></li> - </ol></li> - <li><a href="#format">Debugging information format</a> <ol> - <li><a href="#format_common_anchors">Anchors for global objects</a></li> - <li><a href="#format_common_stoppoint">Representing stopping points in the source program</a></li> - <li><a href="#format_common_lifetime">Object lifetimes and scoping</a></li> - <li><a href="#format_common_descriptors">Object descriptor formats</a> + <li><a href="#debug_info_descriptors">Debug information descriptors</a> <ul> - <li><a href="#format_common_source_files">Representation of source files</a></li> - <li><a href="#format_common_program_objects">Representation of program objects</a></li> - <li><a href="#format_common_object_contexts">Program object contexts</a></li> + <li><a href="#format_anchors">Anchor descriptors</a></li> + <li><a href="#format_compile_units">Compile unit descriptors</a></li> + <li><a 
href="#format_global_variables">Global variable descriptors</a></li> + <li><a href="#format_subprograms">Subprogram descriptors</a></li> + <li><a href="#format_basic_type">Basic type descriptors</a></li> + <li><a href="#format_derived_type">Derived type descriptors</a></li> + <li><a href="#format_composite_type">Composite type descriptors</a></li> + <li><a href="#format_subrange">Subrange descriptors</a></li> + <li><a href="#format_enumeration">Enumerator descriptors</a></li> + </ul></li> + <li><a href="#format_common_intrinsics">Debugger intrinsic functions</a> + <ul> + <li><a href="#format_common_stoppoint">llvm.dbg.stoppoint</a></li> + <li><a href="#format_common_func_start">llvm.dbg.func.start</a></li> + <li><a href="#format_common_region_start">llvm.dbg.region.start</a></li> + <li><a href="#format_common_region_end">llvm.dbg.region.end</a></li> + <li><a href="#format_common_declare">llvm.dbg.declare</a></li> </ul></li> - <li><a href="#format_common_intrinsics">Debugger intrinsic functions</a></li> - <li><a href="#format_common_tags">Values for debugger tags</a></li> + <li><a href="#format_common_stoppoints">Representing stopping points in the + source program</a></li> </ol></li> <li><a href="#ccxx_frontend">C/C++ front-end specific debug information</a> <ol> - <li><a href="#ccxx_pse">Program Scope Entries</a> - <ul> - <li><a href="#ccxx_compilation_units">Compilation unit entries</a></li> - <li><a href="#ccxx_modules">Module, namespace, and importing entries</a></li> - </ul></li> - <li><a href="#ccxx_dataobjects">Data objects (program variables)</a></li> + <li><a href="#ccxx_compile_units">C/C++ source file information</a></li> + <li><a href="#ccxx_global_variable">C/C++ global variable information</a></li> + <li><a href="#ccxx_subprogram">C/C++ function information</a></li> + <li><a href="#ccxx_basic_types">C/C++ basic types</a></li> + <li><a href="#ccxx_derived_types">C/C++ derived types</a></li> + <li><a href="#ccxx_composite_types">C/C++ struct/union 
types</a></li> + <li><a href="#ccxx_enumeration_types">C/C++ enumeration types</a></li> </ol></li> </ul> </td> @@ -67,7 +62,8 @@ height="369"> </tr></table> <div class="doc_author"> - <p>Written by <a href="mailto:sabre@nondot.org">Chris Lattner</a></p> + <p>Written by <a href="mailto:sabre@nondot.org">Chris Lattner</a> + and <a href="mailto:jlaskey@apple.com">Jim Laskey</a></p> </div> @@ -78,15 +74,10 @@ height="369"> <div class="doc_text"> <p>This document is the central repository for all information pertaining to -debug information in LLVM. It describes the <a href="#llvm-db">user -interface</a> for the <tt>llvm-db</tt> tool, which provides a -powerful <a href="#llvm-db">source-level debugger</a> -to users of LLVM-based compilers. It then describes the <a -href="#architecture">various components</a> that make up the debugger and the -libraries which future clients may use. Finally, it describes the <a -href="#format">actual format that the LLVM debug information</a> takes, -which is useful for those interested in creating front-ends or dealing directly -with the information.</p> +debug information in LLVM. It describes the <a href="#format">actual format +that the LLVM debug information</a> takes, which is useful for those interested +in creating front-ends or dealing directly with the information. Further, this +document provides specifc examples of what debug information for C/C++.</p> </div> @@ -133,15 +124,13 @@ href="#ccxx_frontend">implementation-defined format</a> (the C/C++ front-end currently uses working draft 7 of the <a href="http://www.eagercon.com/dwarf/dwarf3std.htm">Dwarf 3 standard</a>).</p> -<p>When a program is debugged, the debugger interacts with the user and turns -the stored debug information into source-language specific information. As -such, the debugger must be aware of the source-language, and is thus tied to a -specific language of family of languages. 
The <a href="#llvm-db">LLVM -debugger</a> is designed to be modular in its support for source-languages.</p> +<p>When a program is being debugged, a debugger interacts with the user and +turns the stored debug information into source-language specific information. +As such, the debugger must be aware of the source-language, and is thus tied to +a specific language of family of languages.</p> </div> - <!-- ======================================================================= --> <div class="doc_subsection"> <a name="debugopt">Debugging optimized code</a> @@ -195,508 +184,531 @@ completely.</p> </div> -<!-- ======================================================================= --> -<div class="doc_subsection"> - <a name="future">Future work</a> +<!-- *********************************************************************** --> +<div class="doc_section"> + <a name="format">Debugging information format</a> </div> +<!-- *********************************************************************** --> <div class="doc_text"> -<p>There are several important extensions that could be eventually added to the -LLVM debugger. The most important extension would be to upgrade the LLVM code -generators to support debugging information. This would also allow, for -example, the X86 code generator to emit native objects that contain debugging -information consumable by traditional source-level debuggers like GDB or -DBX.</p> -<p>Additionally, LLVM optimizations can be upgraded to incrementally update the -debugging information, <a href="#commands">new commands</a> can be added to the -debugger, and thread support could be added to the debugger.</p> +<p>LLVM debugging information has been carefully designed to make it possible +for the optimizer to optimize the program and debugging information without +necessarily having to know anything about debugging information. 
In particular, +the global constant merging pass automatically eliminates duplicated debugging +information (often caused by header files), the global dead code elimination +pass automatically deletes debugging information for a function if it decides to +delete the function, and the linker eliminates debug information when it merges +<tt>linkonce</tt> functions.</p> -<p>The "SourceLanguage" modules provided by <tt>llvm-db</tt> could be -substantially improved to provide good support for C++ language features like -namespaces and scoping rules.</p> +<p>To do this, most of the debugging information (descriptors for types, +variables, functions, source files, etc) is inserted by the language front-end +in the form of LLVM global variables. These LLVM global variables are no +different from any other global variables, except that they have a web of LLVM +intrinsic functions that point to them. If the last references to a particular +piece of debugging information are deleted (for example, by the +<tt>-globaldce</tt> pass), the extraneous debug information will automatically +become dead and be removed by the optimizer.</p> + +<p>Debug information is designed to be agnostic about the target debugger and +debugging information representation (e.g. DWARF/Stabs/etc). It uses a generic +machine debug information pass to decode the information that represents +variables, types, functions, namespaces, etc: this allows for arbitrary +source-language semantics and type-systems to be used, as long as there is a +module written for the target debugger to interpret the information. In +addition, debug global variables are declared in the <tt>"llvm.metadata"</tt> +section. 
All values declared in this section are stripped away after target +debug information is constructed and before the program object is emitted.</p> -<p>After working with the debugger for a while, perhaps the nicest improvement -would be to add some sort of line editor, such as GNU readline (but one that is -compatible with the LLVM license).</p> +<p>To provide basic functionality, the LLVM debugger does have to make some +assumptions about the source-level language being debugged, though it keeps +these to a minimum. The only common features that the LLVM debugger assumes +exist are <a href="#format_compile_units">source files</a>, and <a +href="#format_global_variables">program objects</a>. These abstract objects are +used by the debugger to form stack traces, show information about local +variables, etc.</p> -<p>For someone so inclined, it should be straight-forward to write different -front-ends for the LLVM debugger, as the LLVM debugging engine is cleanly -separated from the <tt>llvm-db</tt> front-end. A new LLVM GUI debugger or IDE -would be nice.</p> +<p>This section of the documentation first describes the representation aspects +common to any source-language. The <a href="#ccxx_frontend">next section</a> +describes the data layout conventions used by the C and C++ front-ends.</p> </div> -<!-- *********************************************************************** --> -<div class="doc_section"> - <a name="llvm-db">Using the <tt>llvm-db</tt> tool</a> +<!-- ======================================================================= --> +<div class="doc_subsection"> + <a name="debug_info_descriptors">Debug information descriptors</a> </div> -<!-- *********************************************************************** --> <div class="doc_text"> +<p>In consideration of the complexity and volume of debug information, LLVM +provides a specification for well formed debug global variables. 
The constant +value of each of these globals is one of a limited set of structures, known as +debug descriptors.</p> + +<p>Consumers of LLVM debug information expect the descriptors for program +objects to start in a canonical format, but the descriptors can include +additional information appended at the end that is source-language specific. +All LLVM debugging information is versioned, allowing backwards compatibility in +the case that the core structures need to change in some way. Also, all +debugging information objects start with a tag to indicate what type of object +it is. The source-language is allowed to define its own objects, by using +unreserved tag numbers.</p> + +<p>The fields of debug descriptors used internally by LLVM (MachineDebugInfo) +are restricted to only the simple data types <tt>int</tt>, <tt>uint</tt>, +<tt>bool</tt>, <tt>float</tt>, <tt>double</tt>, <tt>sbyte*</tt> and <tt> { }* +</tt>. References to arbitrary values are handled using a <tt> { }* </tt> and a +cast to <tt> { }* </tt> expression; typically references to other field +descriptors, arrays of descriptors or global variables.</p> + +<pre> + %llvm.dbg.object.type = type { + uint, ;; A tag + ... + } +</pre> -<p>The <tt>llvm-db</tt> tool provides a GDB-like interface for source-level -debugging of programs. This tool provides many standard commands for inspecting -and modifying the program as it executes, loading new programs, single stepping, -placing breakpoints, etc. This section describes how to use the debugger.</p> +<p>The first field of a descriptor is always an <tt>uint</tt> containing a tag +value identifying the content of the descriptor. The remaining fields are +specific to the descriptor. The values of tags are loosely bound to the tag +values of Dwarf information entries. However, that does not restrict the use of +the information supplied to Dwarf targets.</p> -<p><tt>llvm-db</tt> has been designed to be as similar to GDB in its user -interface as possible. 
This should make it extremely easy to learn -<tt>llvm-db</tt> if you already know <tt>GDB</tt>. In general, <tt>llvm-db</tt> -provides the subset of GDB commands that are applicable to LLVM debugging users. -If there is a command missing that make a reasonable amount of sense within the -<a href="#limitations">limitations of <tt>llvm-db</tt></a>, please report it as -a bug or, better yet, submit a patch to add it.</p> +<p>The details of the various descriptors follow.</p> </div> <!-- ======================================================================= --> -<div class="doc_subsection"> - <a name="limitations">Limitations of <tt>llvm-db</tt></a> +<div class="doc_subsubsection"> + <a name="format_anchors">Anchor descriptors</a> </div> <div class="doc_text"> -<p><tt>llvm-db</tt> is designed to be modular and easy to extend. This -extensibility was key to getting the debugger up-and-running quickly, because we -can start with simple-but-unsophisicated implementations of various components. -Because of this, it is currently missing many features, though they should be -easy to add over time (patches welcomed!). The biggest inherent limitations of -<tt>llvm-db</tt> are currently due to extremely simple <a -href="#arch_debugger">debugger backend</a> (implemented in -"lib/Debugger/UnixLocalInferiorProcess.cpp") which is designed to work without -any cooperation from the code generators. Because it is so simple, it suffers -from the following inherent limitations:</p> +<pre> + %<a href="#format_anchors">llvm.dbg.anchor.type</a> = type { + uint, ;; Tag = 0 + uint ;; Tag of descriptors grouped by the anchor + } +</pre> -<ul> +<p>One important aspect of the LLVM debug representation is that it allows the +LLVM debugger to efficiently index all of the global objects without having the +scan the program. To do this, all of the global objects use "anchor" +descriptors with designated names. 
All of the global objects of a particular +type (e.g., compile units) contain a pointer to the anchor. This pointer allows +the debugger to use def-use chains to find all global objects of that type.</p> -<li>Running a program in <tt>llvm-db</tt> is a bit slower than running it with -<tt>lli</tt> (i.e., in the JIT).</li> +<p>The following names are recognized as anchors by LLVM:</p> -<li>Inspection of the target hardware is not supported. This means that you -cannot, for example, print the contents of X86 registers.</li> +<pre> + %<a href="#format_compile_units">llvm.dbg.compile_units</a> = linkonce constant %<a href="#format_anchors">llvm.dbg.anchor.type</a> { uint 0, uint 17 } ;; DW_TAG_compile_unit + %<a href="#format_global_variables">llvm.dbg.global_variables</a> = linkonce constant %<a href="#format_anchors">llvm.dbg.anchor.type</a> { uint 0, uint 52 } ;; DW_TAG_variable + %<a href="#format_subprograms">llvm.dbg.subprograms</a> = linkonce constant %<a href="#format_anchors">llvm.dbg.anchor.type</a> { uint 0, uint 46 } ;; DW_TAG_subprogram +</pre> -<li>Inspection of LLVM code is not supported. This means that you cannot print -the contents of arbitrary LLVM values, or use commands such as <tt>stepi</tt>. -This also means that you cannot debug code without debug information.</li> +<p>Using anchors in this way (where the compile unit descriptor points to the +anchors, as opposed to having a list of compile unit descriptors) allows for the +standard dead global elimination and merging passes to automatically remove +unused debugging information. If the globals were kept track of through lists, +there would always be an object pointing to the descriptors, thus would never be +deleted.</p> -<li>Portions of the debugger run in the same address space as the program being -debugged. 
This means that memory corruption by the program could trample on -portions of the debugger.</li> +</div> -<li>Attaching to existing processes and core files is not currently -supported.</li> +<!-- ======================================================================= --> +<div class="doc_subsubsection"> + <a name="format_compile_units">Compile unit descriptors</a> +</div> -</ul> +<div class="doc_text"> -<p>That said, the debugger is still quite useful, and all of these limitations -can be eliminated by integrating support for the debugger into the code -generators, and writing a new <a href="#arch_debugger">InferiorProcess</a> -subclass to use it. See the <a href="#future">future work</a> section for ideas -of how to extend the LLVM debugger despite these limitations.</p> +<pre> + %<a href="#format_compile_units">llvm.dbg.compile_unit.type</a> = type { + uint, ;; Tag = 17 (DW_TAG_compile_unit) + { }*, ;; Compile unit anchor = cast = (%<a href="#format_anchors">llvm.dbg.anchor.type</a>* %<a href="#format_compile_units">llvm.dbg.compile_units</a> to { }*) + uint, ;; LLVM debug version number = 1 + uint, ;; Dwarf language identifier (ex. DW_LANG_C89) + sbyte*, ;; Source file name + sbyte*, ;; Source file directory (includes trailing slash) + sbyte* ;; Producer (ex. "4.0.1 LLVM (LLVM research group)") + } +</pre> -</div> +<p>These descriptors contain the version number for the debug info (currently +1), a source language ID for the file (we use the Dwarf 3.0 ID numbers, such as +<tt>DW_LANG_C89</tt>, <tt>DW_LANG_C_plus_plus</tt>, <tt>DW_LANG_Cobol74</tt>, +etc), three strings describing the filename, working directory of the compiler, +and an identifier string for the compiler that produced it.</p> +<p> Compile unit descriptors provide the root context for objects declared in a +specific source file. Global variables and top level functions would be defined +using this context. 
Compile unit descriptors also provide context for source +line correspondence.</p> + +</div> <!-- ======================================================================= --> -<div class="doc_subsection"> - <a name="sample">A sample <tt>llvm-db</tt> session</a> +<div class="doc_subsubsection"> + <a name="format_global_variables">Global variable descriptors</a> </div> <div class="doc_text"> -<p>TODO: this is obviously lame, when more is implemented, this can be much -better.</p> - <pre> -$ <b>llvm-db funccall</b> -llvm-db: The LLVM source-level debugger -Loading program... successfully loaded 'funccall.bc'! -(llvm-db) <b>create</b> -Starting program: funccall.bc -main at funccall.c:9:2 -9 -> q = 0; -(llvm-db) <b>list main</b> -4 void foo() { -5 int t = q; -6 q = t + 1; -7 } -8 int main() { -9 -> q = 0; -10 foo(); -11 q = q - 1; -12 -13 return q; -(llvm-db) <b>list</b> -14 } -(llvm-db) <b>step</b> -10 -> foo(); -(llvm-db) <b>s</b> -foo at funccall.c:5:2 -5 -> int t = q; -(llvm-db) <b>bt</b> -#0 -> 0x85ffba0 in foo at funccall.c:5:2 -#1 0x85ffd98 in main at funccall.c:10:2 -(llvm-db) <b>finish</b> -main at funccall.c:11:2 -11 -> q = q - 1; -(llvm-db) <b>s</b> -13 -> return q; -(llvm-db) <b>s</b> -The program stopped with exit code 0 -(llvm-db) <b>quit</b> -$ + %<a href="#format_global_variables">llvm.dbg.global_variable.type</a> = type { + uint, ;; Tag = 52 (DW_TAG_variable) + { }*, ;; Global variable anchor = cast (%<a href="#format_anchors">llvm.dbg.anchor.type</a>* %<a href="#format_global_variables">llvm.dbg.global_variables</a> to { }*), + { }*, ;; Reference to compile unit + sbyte*, ;; Name + { }*, ;; Reference to type descriptor + bool, ;; True if the global is local to compile unit (static) + bool, ;; True if the global is defined in the compile unit (not extern) + { }*, ;; Reference to the global variable + uint ;; Line number in compile unit where variable is defined + } </pre> +<p>These descriptors provide debug information about globals variables. 
The +provide details such as name, type and where the variable is defined.</p> + +</div> + +<!-- ======================================================================= --> +<div class="doc_subsubsection"> + <a name="format_subprograms">Subprogram descriptors</a> </div> +<div class="doc_text"> + +<pre> + %<a href="#format_subprograms">llvm.dbg.subprogram.type</a> = type { + uint, ;; Tag = 46 (DW_TAG_subprogram) + { }*, ;; Subprogram anchor = cast (%<a href="#format_anchors">llvm.dbg.anchor.type</a>* %<a href="#format_subprograms">llvm.dbg.subprograms</a> to { }*), + { }*, ;; Reference to compile unit + sbyte*, ;; Name + { }*, ;; Reference to type descriptor + bool, ;; True if the global is local to compile unit (static) + bool ;; True if the global is defined in the compile unit (not extern) + TODO - MORE TO COME + } + +</pre> + +<p>These descriptors provide debug information about functions, methods and +subprograms. The provide details such as name, return and argument types and +where the subprogram is defined.</p> +</div> <!-- ======================================================================= --> -<div class="doc_subsection"> - <a name="startup">Starting the debugger</a> +<div class="doc_subsubsection"> + <a name="format_basic_type">Basic type descriptors</a> </div> <div class="doc_text"> -<p>There are three ways to start up the <tt>llvm-db</tt> debugger:</p> +<pre> + %<a href="#format_basic_type">llvm.dbg.basictype.type</a> = type { + uint, ;; Tag = 36 (DW_TAG_base_type) + { }*, ;; Reference to context (typically a compile unit) + sbyte*, ;; Name (may be "" for anonymous types) + { }*, ;; Reference to compile unit where defined (may be NULL) + int, ;; Line number where defined (may be 0) + uint, ;; Size in bits + uint, ;; Alignment in bits + uint, ;; Offset in bits + uint ;; Dwarf type encoding + } +</pre> -<p>When run with no options, just <tt>llvm-db</tt>, the debugger starts up -without a program loaded at all. 
You must use the <a -href="#c_file"><tt>file</tt> command</a> to load a program, and the <a -href="#c_set_args"><tt>set args</tt></a> or <a href="#c_run"><tt>run</tt></a> -commands to specify the arguments for the program.</p> +<p>These descriptors define primitive types used in the code. Example int, bool +and float. The context provides the scope of the type, which is usually the top +level. Since basic types are not usually user defined the compile unit and line +number can be left as NULL and 0. The size, alignment and offset are expressed +in bits and can be 64 bit values. The alignment is used to round the offset +when embedded in a <a href="#format_composite_type">composite type</a> +(example to keep float doubles on 64 bit boundaries.) The offset is the bit +offset if embedded in a <a href="#format_composite_type">composite +type</a>.</p> -<p>If you start the debugger with one argument, as <tt>llvm-db -<program></tt>, the debugger will start up and load in the specified -program. You can then optionally specify arguments to the program with the <a -href="#c_set_args"><tt>set args</tt></a> or <a href="#c_run"><tt>run</tt></a> -commands.</p> +<p>The type encoding provides the details of the type. The values are typically +one of the following;</p> -<p>The third way to start the program is with the <tt>--args</tt> option. This -option allows you to specify the program to load and the arguments to start out -with. <!-- No options to <tt>llvm-db</tt> may be specified after the -<tt>-args</tt> option. 
--> Example use: <tt>llvm-db --args ls /home</tt></p> +<pre> + DW_ATE_address = 1 + DW_ATE_boolean = 2 + DW_ATE_float = 4 + DW_ATE_signed = 5 + DW_ATE_signed_char = 6 + DW_ATE_unsigned = 7 + DW_ATE_unsigned_char = 8 +</pre> </div> <!-- ======================================================================= --> -<div class="doc_subsection"> - <a name="commands">Commands recognized by the debugger</a> +<div class="doc_subsubsection"> + <a name="format_derived_type">Derived type descriptors</a> </div> <div class="doc_text"> -<p>FIXME: this needs work obviously. See the <a -href="http://sources.redhat.com/gdb/documentation/">GDB documentation</a> for -information about what these do, or try '<tt>help [command]</tt>' within -<tt>llvm-db</tt> to get information.</p> +<pre> + %<a href="#format_derived_type">llvm.dbg.derivedtype.type</a> = type { + uint, ;; Tag (see below) + { }*, ;; Reference to context + sbyte*, ;; Name (may be "" for anonymous types) + { }*, ;; Reference to compile unit where defined (may be NULL) + int, ;; Line number where defined (may be 0) + uint, ;; Size in bits + uint, ;; Alignment in bits + uint, ;; Offset in bits + { }* ;; Reference to type derived from + } +</pre> -<p> -<h2>General usage:</h2> -<ul> -<li>help [command]</li> -<li>quit</li> -<li><a name="c_file">file</a> [program]</li> -</ul> +<p>These descriptors are used to define types derived from other types. The +value of the tag varies depending on the meaning. 
The following are possible +tag values;</p> -<h2>Program inspection and interaction:</h2> -<ul> -<li>create (start the program, stopping it ASAP in <tt>main</tt>)</li> -<li>kill</li> -<li>run [args]</li> -<li>step [num]</li> -<li>next [num]</li> -<li>cont</li> -<li>finish</li> - -<li>list [start[, end]]</li> -<li>info source</li> -<li>info sources</li> -<li>info functions</li> -</ul> +<pre> + DW_TAG_member = 13 + DW_TAG_pointer_type = 15 + DW_TAG_reference_type = 16 + DW_TAG_typedef = 22 + DW_TAG_const_type = 38 + DW_TAG_volatile_type = 53 + DW_TAG_restrict_type = 55 +</pre> -<h2>Call stack inspection:</h2> -<ul> -<li>backtrace</li> -<li>up [n]</li> -<li>down [n]</li> -<li>frame [n]</li> -</ul> +<p> <tt>DW_TAG_member</tt> is used to define a member of a <a +href="#format_composite_type">composite type</a>. The type of the member is the +<a href="#format_derived_type">derived type</a>.</p> +<p><tt>DW_TAG_typedef</tt> is used to +provide a name for the derived type.</p> -<h2>Debugger inspection and interaction:</h2> -<ul> -<li>info target</li> -<li>show prompt</li> -<li>set prompt</li> -<li>show listsize</li> -<li>set listsize</li> -<li>show language</li> -<li>set language</li> -<li>show args</li> -<li>set args [args]</li> -</ul> +<p><tt>DW_TAG_pointer_type</tt>, +<tt>DW_TAG_reference_type</tt>, <tt>DW_TAG_const_type</tt>, +<tt>DW_TAG_volatile_type</tt> and <tt>DW_TAG_restrict_type</tt> are used to +qualify the <a href="#format_derived_type">derived type</a>. </p> -<h2>TODO:</h2> -<ul> -<li>info frame</li> -<li>break</li> -<li>print</li> -<li>ptype</li> - -<li>info types</li> -<li>info variables</li> -<li>info program</li> - -<li>info args</li> -<li>info locals</li> -<li>info catch</li> -<li>... many others</li> -</ul> +<p><a href="#format_derived_type">Derived type</a> location can be determined +from the compile unit and line number. The size, alignment and offset are +expressed in bits and can be 64 bit values. 
The alignment is used to round the +offset when embedded in a <a href="#format_composite_type">composite type</a> +(example to keep float doubles on 64 bit boundaries.) The offset is the bit +offset if embedded in a <a href="#format_composite_type">composite +type</a>.</p> + +<p>Note that the <tt>void *</tt> type is expressed as a +<tt>llvm.dbg.derivedtype.type</tt> with tag of <tt>DW_TAG_pointer_type</tt> and +NULL derived type.</p> </div> -<!-- *********************************************************************** --> -<div class="doc_section"> - <a name="architecture">Architecture of the LLVM debugger</a> +<!-- ======================================================================= --> +<div class="doc_subsubsection"> + <a name="format_composite_type">Composite type descriptors</a> </div> -<!-- *********************************************************************** --> <div class="doc_text"> -<p>The LLVM debugger is built out of three distinct layers of software. These -layers provide clients with different interface options depending on what pieces -of they want to implement themselves, and it also promotes code modularity and -good design. The three layers are the <a href="#arch_debugger">Debugger -interface</a>, the <a href="#arch_info">"info" interfaces</a>, and the <a -href="#arch_llvm-db"><tt>llvm-db</tt> tool</a> itself.</p> + +<pre> + %<a href="#format_composite_type">llvm.dbg.compositetype.type</a> = type { + uint, ;; Tag (see below) + { }*, ;; Reference to context + sbyte*, ;; Name (may be "" for anonymous types) + { }*, ;; Reference to compile unit where defined (may be NULL) + int, ;; Line number where defined (may be 0) + uint, ;; Size in bits + uint, ;; Alignment in bits + uint, ;; Offset in bits + { }* ;; Reference to array of member descriptors + } +</pre> + +<p>These descriptors are used to define types that are composed of 0 or more +elements. The value of the tag varies depending on the meaning. 
The following +are possible tag values;</p> + +<pre> + DW_TAG_array_type = 1 + DW_TAG_enumeration_type = 4 + DW_TAG_structure_type = 19 + DW_TAG_union_type = 23 +</pre> + +<p>The members of array types (tag = <tt>DW_TAG_array_type</tt>) are <a +href="#format_subrange">subrange descriptors</a>, each representing the range of +subscripts at that level of indexing.</p> + +<p>The members of enumeration types (tag = <tt>DW_TAG_enumeration_type</tt>) are +<a href="#format_enumeration">enumerator descriptors</a>, each representing the +definition of enumeration value +for the set.</p> + +<p>The members of structure (tag = <tt>DW_TAG_structure_type</tt>) or union (tag += <tt>DW_TAG_union_type</tt>) types are any one of the <a +href="#format_basic_type">basic</a>, <a href="#format_derived_type">derived</a> +or <a href="#format_composite_type">composite</a> type descriptors, each +representing a field member of the structure or union.</p> + +<p><a href="#format_composite_type">Composite type</a> location can be +determined from the compile unit and line number. The size, alignment and +offset are expressed in bits and can be 64 bit values. The alignment is used to +round the offset when embedded in a <a href="#format_composite_type">composite +type</a> (as an example, to keep float doubles on 64 bit boundaries.) The offset +is the bit offset if embedded in a <a href="#format_composite_type">composite +type</a>.</p> + </div> <!-- ======================================================================= --> -<div class="doc_subsection"> - <a name="arch_debugger">The Debugger and InferiorProcess classes</a> +<div class="doc_subsubsection"> + <a name="format_subrange">Subrange descriptors</a> </div> <div class="doc_text"> -<p>The Debugger class (defined in the <tt>include/llvm/Debugger/</tt> directory) -is a low-level class which is used to maintain information about the loaded -program, as well as start and stop the program running as necessary. 
This class -does not provide any high-level analysis or control over the program, only -exposing simple interfaces like <tt>load/unloadProgram</tt>, -<tt>create/killProgram</tt>, <tt>step/next/finish/contProgram</tt>, and -low-level methods for installing breakpoints.</p> - -<p> -The Debugger class is itself a wrapper around the lowest-level InferiorProcess -class. This class is used to represent an instance of the program running under -debugger control. The InferiorProcess class can be implemented in different -ways for different targets and execution scenarios (e.g., remote debugging). -The InferiorProcess class exposes a small and simple collection of interfaces -which are useful for inspecting the current state of the program (such as -collecting stack trace information, reading the memory image of the process, -etc). The interfaces in this class are designed to be as low-level and simple -as possible, to make it easy to create new instances of the class. -</p> - -<p> -The Debugger class exposes the currently active instance of InferiorProcess -through the <tt>Debugger::getRunningProcess</tt> method, which returns a -<tt>const</tt> reference to the class. This means that clients of the Debugger -class can only <b>inspect</b> the running instance of the program directly. To -change the executing process in some way, they must use the interces exposed by -the Debugger class. -</p> + +<pre> + %<a href="#format_subrange">llvm.dbg.subrange.type</a> = type { + uint, ;; Tag = 33 (DW_TAG_subrange_type) + uint, ;; Low value + uint ;; High value + } +</pre> + +<p>These descriptors are used to define ranges of array subscripts for an array +<a href="#format_composite_type">composite type</a>. The low value defines the +lower bounds typically zero for C/C++. The high value is the upper bounds. +Values are 64 bit. High - low + 1 is the size of the array. 
If +low == high the array will be unbounded.</p> + </div> <!-- ======================================================================= --> -<div class="doc_subsection"> - <a name="arch_info">The RuntimeInfo, ProgramInfo, and SourceLanguage classes</a> +<div class="doc_subsubsection"> + <a name="format_enumeration">Enumerator descriptors</a> </div> <div class="doc_text"> -<p> -The next-highest level of debugger abstraction is provided through the -ProgramInfo, RuntimeInfo, SourceLanguage and related classes (also defined in -the <tt>include/llvm/Debugger/</tt> directory). These classes efficiently -decode the debugging information and low-level interfaces exposed by -InferiorProcess into a higher-level representation, suitable for analysis by the -debugger. -</p> - -<p> -The ProgramInfo class exposes a variety of different kinds of information about -the program objects in the source-level-language. The SourceFileInfo class -represents a source-file in the program (e.g. a .cpp or .h file). The -SourceFileInfo class captures information such as which SourceLanguage was used -to compile the file, where the debugger can get access to the actual file text -(which is lazily loaded on demand), etc. The SourceFunctionInfo class -represents a... <b>FIXME: finish</b>. The ProgramInfo class provides interfaces -to lazily find and decode the information needed to create the Source*Info -classes requested by the debugger. -</p> - -<p> -The RuntimeInfo class exposes information about the currently executed program, -by decoding information from the InferiorProcess and ProgramInfo classes. It -provides a StackFrame class which provides an easy-to-use interface for -inspecting the current and suspended stack frames in the program. -</p> - -<p> -The SourceLanguage class is an abstract interface used by the debugger to -perform all source-language-specific tasks. 
For example, this interface is used -by the ProgramInfo class to decode language-specific types and functions and by -the debugger front-end (such as <a href="#arch_llvm-db"><tt>llvm-db</tt></a> to -evaluate source-langauge expressions typed into the debugger. This class uses -the RuntimeInfo & ProgramInfo classes to get information about the current -execution context and the loaded program, respectively. -</p> + +<pre> + %<a href="#format_enumeration">llvm.dbg.enumerator.type</a> = type { + uint, ;; Tag = 40 (DW_TAG_enumerator) + sbyte*, ;; Name + uint ;; Value + } +</pre> + +<p>These descriptors are used to define members of an enumeration <a +href="#format_composite_type">composite type</a>; each associates a name with a +value.</p> </div> <!-- ======================================================================= --> <div class="doc_subsection"> - <a name="arch_llvm-db">The <tt>llvm-db</tt> tool</a> + <a name="format_common_intrinsics">Debugger intrinsic functions</a> </div> <div class="doc_text"> -<p> -The <tt>llvm-db</tt> is designed to be a debugger providing an interface as <a -href="#llvm-db">similar to GDB</a> as reasonable, but no more so than that. -Because the <a href="#arch_debugger">Debugger</a> and <a -href="#arch_info">info</a> classes implement all of the heavy lifting and -analysis, <tt>llvm-db</tt> (which lives in <tt>llvm/tools/llvm-db</tt>) consists -mainly of of code to interact with the user and parse commands. The CLIDebugger -constructor registers all of the builtin commands for the debugger, and each -command is implemented as a CLIDebugger::[name]Command method. 
-</p> -</div> +<p>LLVM uses several intrinsic functions (name prefixed with "llvm.dbg") to +provide debug information at various points in generated code.</p> + +</div> <!-- ======================================================================= --> -<div class="doc_subsection"> - <a name="arch_todo">Short-term TODO list</a> +<div class="doc_subsubsection"> + <a name="format_common_stoppoint">llvm.dbg.stoppoint</a> </div> <div class="doc_text"> +<pre> + void %<a href="#format_common_stoppoint">llvm.dbg.stoppoint</a>( uint, uint, %<a href="#format_compile_units">llvm.dbg.compile_unit</a>* ) +</pre> -<p> -FIXME: this section will eventually go away. These are notes to myself of -things that should be implemented, but haven't yet. -</p> - -<p> -<b>Breakpoints:</b> Support is already implemented in the 'InferiorProcess' -class, though it hasn't been tested yet. To finish breakpoint support, we need -to implement breakCommand (which should reuse the linespec parser from the list -command), and handle the fact that 'break foo' or 'break file.c:53' may insert -multiple breakpoints. Also, if you say 'break file.c:53' and there is no -stoppoint on line 53, the breakpoint should go on the next available line. My -idea was to have the Debugger class provide a "Breakpoint" class which -encapsulated this messiness, giving the debugger front-end a simple interface. -The debugger front-end would have to map the really complex semantics of -temporary breakpoints and 'conditional' breakpoints onto this intermediate -level. Also, breakpoints should survive as much as possible across program -reloads. -</p> - -<p> -<b>UnixLocalInferiorProcess.cpp speedup</b>: There is no reason for the debugged -process to code gen the globals corresponding to debug information. The -IntrinsicLowering object could instead change descriptors into constant expr -casts of the constant address of the LLVM objects for the descriptors. 
This -would also allow us to eliminate the mapping back and forth between physical -addresses that must be done.</p> - -<p> -<b>Process deaths</b>: The InferiorProcessDead exception should be extended to -know "how" a process died, i.e., it was killed by a signal. This is easy to -collect in the UnixLocalInferiorProcess, we just need to represent it.</p> +<p>This intrinsic is used to provide correspondence between the source file and +the generated code. The first argument is the line number (1-based), the second +argument is the column number (0 if unknown) and the third argument is the source +compile unit. Code following a call to this intrinsic will have been defined in +close proximity to the line, column and file. This information holds until the +next call to <a href="#format_common_stoppoint">llvm.dbg.stoppoint</a>.</p> </div> -<!-- *********************************************************************** --> -<div class="doc_section"> - <a name="format">Debugging information format</a> +<!-- ======================================================================= --> +<div class="doc_subsubsection"> + <a name="format_common_func_start">llvm.dbg.func.start</a> </div> -<!-- *********************************************************************** --> <div class="doc_text"> +<pre> + void %<a href="#format_common_func_start">llvm.dbg.func.start</a>( %<a href="#format_subprograms">llvm.dbg.subprogram.type</a>* ) +</pre> -<p>LLVM debugging information has been carefully designed to make it possible -for the optimizer to optimize the program and debugging information without -necessarily having to know anything about debugging information. 
In particular, -the global constant merging pass automatically eliminates duplicated debugging -information (often caused by header files), the global dead code elimination -pass automatically deletes debugging information for a function if it decides to -delete the function, and the linker eliminates debug information when it merges -<tt>linkonce</tt> functions.</p> +<p>This intrinsic is used to link the debug information in <tt>%<a +href="#format_subprograms">llvm.dbg.subprogram</a></tt> to the function. It also +defines the beginning of the function's declarative region (scope). The +intrinsic should be called early in the function, after all the alloca +instructions.</p> -<p>To do this, most of the debugging information (descriptors for types, -variables, functions, source files, etc) is inserted by the language front-end -in the form of LLVM global variables. These LLVM global variables are no -different from any other global variables, except that they have a web of LLVM -intrinsic functions that point to them. If the last references to a particular -piece of debugging information are deleted (for example, by the -<tt>-globaldce</tt> pass), the extraneous debug information will automatically -become dead and be removed by the optimizer.</p> +</div> -<p>The debugger is designed to be agnostic about the contents of most of the -debugging information. 
It uses a <a href="#arch_info">source-language-specific -module</a> to decode the information that represents variables, types, -functions, namespaces, etc: this allows for arbitrary source-language semantics -and type-systems to be used, as long as there is a module written for the -debugger to interpret the information.</p> +<!-- ======================================================================= --> +<div class="doc_subsubsection"> + <a name="format_common_region_start">llvm.dbg.region.start</a> +</div> -<p>To provide basic functionality, the LLVM debugger does have to make some -assumptions about the source-level language being debugged, though it keeps -these to a minimum. The only common features that the LLVM debugger assumes -exist are <a href="#format_common_source_files">source files</a>, and <a -href="#format_program_objects">program objects</a>. These abstract objects are -used by the debugger to form stack traces, show information about local -variables, etc.</p> +<div class="doc_text"> +<pre> + void %<a href="#format_common_region_start">llvm.dbg.region.start</a>() +</pre> -<p>This section of the documentation first describes the representation aspects -common to any source-language. The <a href="#ccxx_frontend">next section</a> -describes the data layout conventions used by the C and C++ front-ends.</p> +<p>This intrinsic is used to define the beginning of a declarative scope (ex. +block) for local language elements. 
It should be paired off with a closing +<tt>%<a href="#format_common_region_end">llvm.dbg.region.end</a></tt>.</p> </div> <!-- ======================================================================= --> -<div class="doc_subsection"> - <a name="format_common_anchors">Anchors for global objects</a> +<div class="doc_subsubsection"> + <a name="format_common_region_end">llvm.dbg.region.end</a> </div> <div class="doc_text"> -<p>One important aspect of the LLVM debug representation is that it allows the -LLVM debugger to efficiently index all of the global objects without having the -scan the program. To do this, all of the global objects use "anchor" globals of -type "<tt>{}</tt>", with designated names. These anchor objects obviously do -not contain any content or meaning by themselves, but all of the global objects -of a particular type (e.g., source file descriptors) contain a pointer to the -anchor. This pointer allows the debugger to use def-use chains to find all -global objects of that type.</p> +<pre> + void %<a href="#format_common_region_end">llvm.dbg.region.end</a>() +</pre> -<p>So far, the following names are recognized as anchors by the LLVM -debugger:</p> +<p>This intrinsic is used to define the end of a declarative scope (ex. block) +for local language elements. It should be paired off with an opening <tt>%<a +href="#format_common_region_start">llvm.dbg.region.start</a></tt> or <tt>%<a +href="#format_common_func_start">llvm.dbg.func.start</a></tt>.</p> +</div> + +<!-- ======================================================================= --> +<div class="doc_subsubsection"> + <a name="format_common_declare">llvm.dbg.declare</a> +</div> + +<div class="doc_text"> <pre> - %<a href="#format_common_source_files">llvm.dbg.translation_units</a> = linkonce global {} {} - %<a href="#format_program_objects">llvm.dbg.globals</a> = linkonce global {} {} + void %<a href="#format_common_declare">llvm.dbg.declare</a>( {} *, ... 
) </pre> -<p>Using anchors in this way (where the source file descriptor points to the -anchors, as opposed to having a list of source file descriptors) allows for the -standard dead global elimination and merging passes to automatically remove -unused debugging information. If the globals were kept track of through lists, -there would always be an object pointing to the descriptors, thus would never be -deleted.</p> +<p>This intrinsic provides information about a local element (ex. variable.) +TODO - details.</p> </div> <!-- ======================================================================= --> <div class="doc_subsection"> - <a name="format_common_stoppoint"> + <a name="format_common_stoppoints"> Representing stopping points in the source program </a> </div> @@ -706,13 +718,14 @@ deleted.</p> <p>LLVM debugger "stop points" are a key part of the debugging representation that allows the LLVM to maintain simple semantics for <a href="#debugopt">debugging optimized code</a>. The basic idea is that the -front-end inserts calls to the <tt>%llvm.dbg.stoppoint</tt> intrinsic function -at every point in the program where the debugger should be able to inspect the -program (these correspond to places the debugger stops when you "<tt>step</tt>" -through it). The front-end can choose to place these as fine-grained as it -would like (for example, before every subexpression evaluated), but it is -recommended to only put them after every source statement that includes -executable code.</p> +front-end inserts calls to the <a +href="#format_common_stoppoint">%<tt>llvm.dbg.stoppoint</tt></a> intrinsic +function at every point in the program where the debugger should be able to +inspect the program (these correspond to places the debugger stops when you +"<tt>step</tt>" through it). 
The front-end can choose to place these as +fine-grained as it would like (for example, before every subexpression +evaluated), but it is recommended to only put them after every source statement +that includes executable code.</p> <p>Using calls to this intrinsic function to demark legal points for the debugger to inspect the program automatically disables any optimizations that @@ -724,12 +737,6 @@ such as code motion of non-trapping instructions, nor does it impact optimization of subexpressions, code duplication transformations, or basic-block reordering transformations.</p> -<p>An important aspect of the calls to the <tt>%llvm.dbg.stoppoint</tt> -intrinsic is that the function-local debugging information is woven together -with use-def chains. This makes it easy for the debugger to, for example, -locate the 'next' stop point. For a concrete example of stop points, see the -example in <a href="#format_common_lifetime">the next section</a>.</p> - </div> @@ -764,54 +771,67 @@ lifetime expires. Consider the following C fragment, for example:</p> 9. } </pre> -<p>Compiled to LLVM, this function would be represented like this (FIXME: CHECK -AND UPDATE THIS):</p> +<p>Compiled to LLVM, this function would be represented like this:</p> <pre> void %foo() { +entry: %X = alloca int %Y = alloca int %Z = alloca int - <a name="#icl_ex_D1">%D1</a> = call {}* %llvm.dbg.func.start(<a href="#format_program_objects">%lldb.global</a>* %d.foo) - %D2 = call {}* <a href="#format_common_stoppoint">%llvm.dbg.stoppoint</a>({}* %D1, uint 2, uint 2, <a href="#format_common_source_files">%lldb.compile_unit</a>* %file) - - %D3 = call {}* %llvm.dbg.DEFINEVARIABLE({}* %D2, ...) + + ... 
+ + call void %<a href="#format_common_func_start">llvm.dbg.func.start</a>( %<a href="#format_subprograms">llvm.dbg.subprogram.type</a>* %llvm.dbg.subprogram ) + + call void %<a href="#format_common_stoppoint">llvm.dbg.stoppoint</a>( uint 2, uint 2, %<a href="#format_compile_units">llvm.dbg.compile_unit</a>* %llvm.dbg.compile_unit ) + + call void %<a href="#format_common_declare">llvm.dbg.declare</a>({}* %X, ...) + call void %<a href="#format_common_declare">llvm.dbg.declare</a>({}* %Y, ...) + <i>;; Evaluate expression on line 2, assigning to X.</i> - %D4 = call {}* <a href="#format_common_stoppoint">%llvm.dbg.stoppoint</a>({}* %D3, uint 3, uint 2, <a href="#format_common_source_files">%lldb.compile_unit</a>* %file) - - %D5 = call {}* %llvm.dbg.DEFINEVARIABLE({}* %D4, ...) + + call void %<a href="#format_common_stoppoint">llvm.dbg.stoppoint</a>( uint 3, uint 2, %<a href="#format_compile_units">llvm.dbg.compile_unit</a>* %llvm.dbg.compile_unit ) + <i>;; Evaluate expression on line 3, assigning to Y.</i> - %D6 = call {}* <a href="#format_common_stoppoint">%llvm.dbg.stoppoint</a>({}* %D5, uint 5, uint 4, <a href="#format_common_source_files">%lldb.compile_unit</a>* %file) - - <a name="#icl_ex_D1">%D7</a> = call {}* %llvm.region.start({}* %D6) - %D8 = call {}* %llvm.dbg.DEFINEVARIABLE({}* %D7, ...) + + call void %<a href="#format_common_region_start">llvm.dbg.region.start</a>() + call void %<a href="#format_common_stoppoint">llvm.dbg.stoppoint</a>( uint 5, uint 4, %<a href="#format_compile_units">llvm.dbg.compile_unit</a>* %llvm.dbg.compile_unit ) + call void %<a href="#format_common_declare">llvm.dbg.declare</a>({}* %Z, ...) 
+ <i>;; Evaluate expression on line 5, assigning to Z.</i> - %D9 = call {}* <a href="#format_common_stoppoint">%llvm.dbg.stoppoint</a>({}* %D8, uint 6, uint 4, <a href="#format_common_source_files">%lldb.compile_unit</a>* %file) - - <i>;; Code for line 6.</i> - %D10 = call {}* %llvm.region.end({}* %D9) - %D11 = call {}* <a href="#format_common_stoppoint">%llvm.dbg.stoppoint</a>({}* %D10, uint 8, uint 2, <a href="#format_common_source_files">%lldb.compile_unit</a>* %file) - - <i>;; Code for line 8.</i> - <a name="#icl_ex_D1">%D12</a> = call {}* %llvm.region.end({}* %D11) + + call void %<a href="#format_common_stoppoint">llvm.dbg.stoppoint</a>( uint 7, uint 2, %<a href="#format_compile_units">llvm.dbg.compile_unit</a>* %llvm.dbg.compile_unit ) + call void %<a href="#format_common_region_end">llvm.dbg.region.end</a>() + + call void %<a href="#format_common_stoppoint">llvm.dbg.stoppoint</a>( uint 9, uint 2, %<a href="#format_compile_units">llvm.dbg.compile_unit</a>* %llvm.dbg.compile_unit ) + + call void %<a href="#format_common_region_end">llvm.dbg.region.end</a>() + ret void } </pre> <p>This example illustrates a few important details about the LLVM debugging -information. In particular, it shows how the various intrinsics used are woven -together with def-use and use-def chains, similar to how <a -href="#format_common_anchors">anchors</a> are used with globals. This allows -the debugger to analyze the relationship between statements, variable -definitions, and the code used to implement the function.</p> - -<p>In this example, two explicit regions are defined, one with the <a -href="#icl_ex_D1">definition of the <tt>%D1</tt> variable</a> and one with the -<a href="#icl_ex_D7">definition of <tt>%D7</tt></a>. In the case of -<tt>%D1</tt>, the debug information indicates that the function whose <a -href="#format_program_objects">descriptor</a> is specified as an argument to the -intrinsic. 
This defines a new stack frame whose lifetime ends when the region -is ended by <a href="#icl_ex_D12">the <tt>%D12</tt> call</a>.</p> +information. In particular, it shows how the various intrinsics are applied +together to allow a debugger to analyze the relationship between statements, +variable definitions, and the code used to implement the function.</p> + +<p>The first intrinsic <tt>%<a +href="#format_common_func_start">llvm.dbg.func.start</a></tt> provides +a link with the <a href="#format_subprograms">subprogram descriptor</a> +containing the details of this function. This call also defines the beginning +of the function region, bounded by the <tt>%<a +href="#format_common_region_end">llvm.dbg.region.end</a></tt> at the end of +the function. This region is used to bracket the lifetime of variables declared +within. For a function, this outer region defines a new stack frame whose +lifetime ends when the region is ended.</p> + +<p>It is possible to define inner regions for short-lived variables by using the +%<a href="#format_common_region_start"><tt>llvm.dbg.region.start</tt></a> and <a +href="#format_common_region_end"><tt>%llvm.dbg.region.end</tt></a> to bound a +region. The inner region in this example would be for the block containing the +declaration of Z.</p> <p>Using regions to represent the boundaries of source-level functions allows LLVM interprocedural optimizations to arbitrarily modify LLVM functions without @@ -824,280 +844,790 @@ its caller that it will not be possible for the user to manually invoke the inlined function from the debugger).</p> <p>Once the function has been defined, the <a -href="#format_common_stoppoint">stopping point</a> corresponding to line #2 of -the function is encountered. At this point in the function, <b>no</b> local -variables are live. 
As lines 2 and 3 of the example are executed, their -variable definitions are automatically introduced into the program, without the +href="#format_common_stoppoint"><tt>stopping point</tt></a> corresponding to +line #2 (column #2) of the function is encountered. At this point in the +function, <b>no</b> local variables are live. As lines 2 and 3 of the example +are executed, their variable definitions are introduced into the program using +%<a href="#format_common_declare"><tt>llvm.dbg.declare</tt></a>, without the need to specify a new region. These variables do not require new regions to be introduced because they go out of scope at the same point in the program: line 9.</p> <p>In contrast, the <tt>Z</tt> variable goes out of scope at a different time, -on line 7. For this reason, it is defined within <a href="#icl_ex_D7">the -<tt>%D7</tt> region</a>, which kills the availability of <tt>Z</tt> before the -code for line 8 is executed. In this way, regions can support arbitrary -source-language scoping rules, as long as they can only be nested (ie, one scope -cannot partially overlap with a part of another scope).</p> +on line 7. For this reason, it is defined within the inner region, which kills +the availability of <tt>Z</tt> before the code for line 8 is executed. In this +way, regions can support arbitrary source-language scoping rules, as long as +they can only be nested (ie, one scope cannot partially overlap with a part of +another scope).</p> <p>It is worth noting that this scoping mechanism is used to control scoping of all declarations, not just variable declarations. 
For example, the scope of a -C++ using declaration is controlled with this, and the <tt>llvm-db</tt> C++ -support routines could use this to change how name lookup is performed (though -this is not implemented yet).</p> +C++ using declaration is controlled with this and could change how name lookup is +performed.</p> + +</div> + + + +<!-- *********************************************************************** --> +<div class="doc_section"> + <a name="ccxx_frontend">C/C++ front-end specific debug information</a> +</div> +<!-- *********************************************************************** --> + +<div class="doc_text"> + +<p>The C and C++ front-ends represent information about the program in a format +that is effectively identical to <a +href="http://www.eagercon.com/dwarf/dwarf3std.htm">Dwarf 3.0</a> in terms of +information content. This allows code generators to trivially support native +debuggers by generating standard dwarf information, and contains enough +information for non-dwarf targets to translate it as needed.</p> + +<p>This section describes the forms used to represent C and C++ programs. Other +languages could pattern themselves after this (which itself is tuned to +representing programs in the same way that Dwarf 3 does), or they could choose +to provide completely different forms if they don't fit into the Dwarf model. 
+As support for debugging information gets added to the various LLVM +source-language front-ends, the information used should be documented here.</p> + +<p>The following sections provide examples of various C/C++ constructs and the +debug information that would best describe those constructs.</p> </div> <!-- ======================================================================= --> <div class="doc_subsection"> - <a name="format_common_descriptors">Object descriptor formats</a> + <a name="ccxx_compile_units">C/C++ source file information</a> </div> <div class="doc_text"> -<p>The LLVM debugger expects the descriptors for program objects to start in a -canonical format, but the descriptors can include additional information -appended at the end that is source-language specific. All LLVM debugging -information is versioned, allowing backwards compatibility in the case that the -core structures need to change in some way. Also, all debugging information -objects start with a <a href="#format_common_tags">tag</a> to indicate what type -of object it is. The source-language is allows to define its own objects, by -using unreserved tag numbers.</p> -<p>The lowest-level descriptor are those describing <a -href="#format_common_source_files">the files containing the program source -code</a>, as most other descriptors (sometimes indirectly) refer to them. -</p> +<p>Given the source files "MySource.cpp" and "MyHeader.h" located in the +directory "/Users/mine/sources", the following code;</p> + +<pre> +#include "MyHeader.h" + +int main(int argc, char *argv[]) { + return 0; +} +</pre> + +<p>a C/C++ front-end would generate the following descriptors;</p> + +<pre> +... +;; +;; Define types used. In this case we need one for compile unit anchors and one +;; for compile units. +;; +%<a href="#format_anchors">llvm.dbg.anchor.type</a> = type { uint, uint } +%<a href="#format_compile_units">llvm.dbg.compile_unit.type</a> = type { uint, { }*, uint, uint, sbyte*, sbyte*, sbyte* } +... 
+;; +;; Define the anchor for compile units. Note that the second field of the +;; anchor is 17, which is the same as the tag for compile units +;; (17 = DW_TAG_compile_unit.) +;; +%<a href="#format_compile_units">llvm.dbg.compile_units</a> = linkonce constant %<a href="#format_anchors">llvm.dbg.anchor.type</a> { uint 0, uint 17 }, section "llvm.metadata" + +;; +;; Define the compile unit for the source file "/Users/mine/sources/MySource.cpp". +;; +%<a href="#format_compile_units">llvm.dbg.compile_unit1</a> = internal constant %<a href="#format_compile_units">llvm.dbg.compile_unit.type</a> { + uint 17, + { }* cast (%<a href="#format_anchors">llvm.dbg.anchor.type</a>* %<a href="#format_compile_units">llvm.dbg.compile_units</a> to { }*), + uint 1, + uint 1, + sbyte* getelementptr ([13 x sbyte]* %str1, int 0, int 0), + sbyte* getelementptr ([21 x sbyte]* %str2, int 0, int 0), + sbyte* getelementptr ([33 x sbyte]* %str3, int 0, int 0) }, section "llvm.metadata" + +;; +;; Define the compile unit for the header file "/Users/mine/sources/MyHeader.h". +;; +%<a href="#format_compile_units">llvm.dbg.compile_unit2</a> = internal constant %<a href="#format_compile_units">llvm.dbg.compile_unit.type</a> { + uint 17, + { }* cast (%<a href="#format_anchors">llvm.dbg.anchor.type</a>* %<a href="#format_compile_units">llvm.dbg.compile_units</a> to { }*), + uint 1, + uint 1, + sbyte* getelementptr ([11 x sbyte]* %str4, int 0, int 0), + sbyte* getelementptr ([21 x sbyte]* %str2, int 0, int 0), + sbyte* getelementptr ([33 x sbyte]* %str3, int 0, int 0) }, section "llvm.metadata" + +;; +;; Define each of the strings used in the compile units. 
+;; +%str1 = internal constant [13 x sbyte] c"MySource.cpp\00", section "llvm.metadata"; +%str2 = internal constant [21 x sbyte] c"/Users/mine/sources/\00", section "llvm.metadata"; +%str3 = internal constant [33 x sbyte] c"4.0.1 LLVM (LLVM research group)\00", section "llvm.metadata"; +%str4 = internal constant [11 x sbyte] c"MyHeader.h\00", section "llvm.metadata"; +... +</pre> + +</div> + +<!-- ======================================================================= --> +<div class="doc_subsection"> + <a name="ccxx_global_variable">C/C++ global variable information</a> </div> +<div class="doc_text"> + +<p>Given an integer global variable declared as follows;</p> + +<pre> +int MyGlobal = 100; +</pre> + +<p>a C/C++ front-end would generate the following descriptors;</p> + +<pre> +;; +;; Define types used. One for global variable anchors, one for the global +;; variable descriptor, one for the global's basic type and one for the global's +;; compile unit. +;; +%<a href="#format_anchors">llvm.dbg.anchor.type</a> = type { uint, uint } +%<a href="#format_global_variables">llvm.dbg.global_variable.type</a> = type { uint, { }*, { }*, sbyte*, { }*, bool, bool, { }*, uint } +%<a href="#format_basic_type">llvm.dbg.basictype.type</a> = type { uint, { }*, sbyte*, { }*, int, uint, uint, uint, uint } +%<a href="#format_compile_units">llvm.dbg.compile_unit.type</a> = ... +... +;; +;; Define the global itself. +;; +%MyGlobal = global int 100 +... +;; +;; Define the anchor for global variables. Note that the second field of the +;; anchor is 52, which is the same as the tag for global variables +;; (52 = DW_TAG_variable.) +;; +%<a href="#format_global_variables">llvm.dbg.global_variables</a> = linkonce constant %<a href="#format_anchors">llvm.dbg.anchor.type</a> { uint 0, uint 52 }, section "llvm.metadata" + +;; +;; Define the global variable descriptor. Note the reference to the global +;; variable anchor and the global variable itself. 
+;; +%<a href="#format_global_variables">llvm.dbg.global_variable</a> = internal constant %<a href="#format_global_variables">llvm.dbg.global_variable.type</a> { + uint 52, + { }* cast (%<a href="#format_anchors">llvm.dbg.anchor.type</a>* %<a href="#format_global_variables">llvm.dbg.global_variables</a> to { }*), + { }* cast (%<a href="#format_compile_units">llvm.dbg.compile_unit.type</a>* %<a href="#format_compile_units">llvm.dbg.compile_unit</a> to { }*), + sbyte* getelementptr ([9 x sbyte]* %str1, int 0, int 0), + { }* cast (%<a href="#format_basic_type">llvm.dbg.basictype.type</a>* %<a href="#format_basic_type">llvm.dbg.basictype</a> to { }*), + bool false, + bool true, + { }* cast (int* %MyGlobal to { }*), + uint 1 }, section "llvm.metadata" + +;; +;; Define the basic type of 32 bit signed integer. Note that since int is an +;; intrinsic type the source file is NULL and line 0. +;; +%<a href="#format_basic_type">llvm.dbg.basictype</a> = internal constant %<a href="#format_basic_type">llvm.dbg.basictype.type</a> { + uint 36, + { }* cast (%<a href="#format_compile_units">llvm.dbg.compile_unit.type</a>* %<a href="#format_compile_units">llvm.dbg.compile_unit</a> to { }*), + sbyte* getelementptr ([4 x sbyte]* %str2, int 0, int 0), + { }* null, + int 0, + uint 32, + uint 32, + uint 0, + uint 5 }, section "llvm.metadata" + +;; +;; Define the names of the global variable and basic type. 
+;; +%str1 = internal constant [9 x sbyte] c"MyGlobal\00", section "llvm.metadata" +%str2 = internal constant [4 x sbyte] c"int\00", section "llvm.metadata" +</pre> -<!-- ------------------------------------------------------------------------ -> -<div class="doc_subsubsection"> - <a name="format_common_source_files">Representation of source files</a> +</div> + +<!-- ======================================================================= --> +<div class="doc_subsection"> + <a name="ccxx_subprogram">C/C++ function information</a> </div> <div class="doc_text"> -<p> -Source file descriptors are patterned after the Dwarf "compile_unit" object. -The descriptor currently is defined to have at least the following LLVM -type entries:</p> + +<p>Given a function declared as follows;</p> <pre> -%lldb.compile_unit = type { - uint, <i>;; Tag: <a href="#tag_compile_unit">LLVM_COMPILE_UNIT</a></i> - ushort, <i>;; LLVM debug version number</i> - ushort, <i>;; Dwarf language identifier</i> - sbyte*, <i>;; Filename</i> - sbyte*, <i>;; Working directory when compiled</i> - sbyte* <i>;; Producer of the debug information</i> +int main(int argc, char *argv[]) { + return 0; } </pre> -<p> -These descriptors contain the version number for the debug info, a source -language ID for the file (we use the Dwarf 3.0 ID numbers, such as -<tt>DW_LANG_C89</tt>, <tt>DW_LANG_C_plus_plus</tt>, <tt>DW_LANG_Cobol74</tt>, -etc), three strings describing the filename, working directory of the compiler, -and an identifier string for the compiler that produced it. Note that actual -compile_unit declarations must also include an <a -href="#format_common_anchors">anchor</a> to <tt>llvm.dbg.translation_units</tt>, -but it is not specified where the anchor is to be located. 
Here is an example -descriptor: -</p> - -<p><pre> -%arraytest_source_file = internal constant %lldb.compile_unit { - <a href="#tag_compile_unit">uint 17</a>, ; Tag value - ushort 0, ; Version #0 - ushort 1, ; DW_LANG_C89 - sbyte* getelementptr ([12 x sbyte]* %.str_1, long 0, long 0), ; filename - sbyte* getelementptr ([12 x sbyte]* %.str_2, long 0, long 0), ; working dir - sbyte* getelementptr ([12 x sbyte]* %.str_3, long 0, long 0), ; producer - {}* %llvm.dbg.translation_units ; Anchor +<p>a C/C++ front-end would generate the following descriptors;</p> + +<pre> +;; +;; Define types used. One for subprogram anchors, one for the subprogram +;; descriptor and one for the subprogram's compile unit. +;; +%<a href="#format_subprograms">llvm.dbg.subprogram.type</a> = type { uint, { }*, { }*, sbyte*, { }*, bool, bool } +%<a href="#format_anchors">llvm.dbg.anchor.type</a> = type { uint, uint } +%<a href="#format_compile_units">llvm.dbg.compile_unit.type</a> = ... + +;; +;; Define the anchor for subprograms. Note that the second field of the +;; anchor is 46, which is the same as the tag for subprograms +;; (46 = DW_TAG_subprogram.) +;; +%<a href="#format_subprograms">llvm.dbg.subprograms</a> = linkonce constant %<a href="#format_anchors">llvm.dbg.anchor.type</a> { uint 0, uint 46 }, section "llvm.metadata" + +;; +;; Define the descriptor for the subprogram. TODO - more details. 
+;; +%<a href="#format_subprograms">llvm.dbg.subprogram</a> = internal constant %<a href="#format_subprograms">llvm.dbg.subprogram.type</a> { + uint 46, + { }* cast (%<a href="#format_anchors">llvm.dbg.anchor.type</a>* %<a href="#format_subprograms">llvm.dbg.subprograms</a> to { }*), + { }* cast (%<a href="#format_compile_units">llvm.dbg.compile_unit.type</a>* %<a href="#format_compile_units">llvm.dbg.compile_unit</a> to { }*), + sbyte* getelementptr ([5 x sbyte]* %str1, int 0, int 0), + { }* null, + bool false, + bool true }, section "llvm.metadata" + +;; +;; Define the name of the subprogram. +;; +%str1 = internal constant [5 x sbyte] c"main\00", section "llvm.metadata" + +;; +;; Define the subprogram itself. +;; +int %main(int %argc, sbyte** %argv) { +... } -%.str_1 = internal constant [12 x sbyte] c"arraytest.c\00" -%.str_2 = internal constant [12 x sbyte] c"/home/sabre\00" -%.str_3 = internal constant [12 x sbyte] c"llvmgcc 3.4\00" -</pre></p> +</pre> -<p> -Note that the LLVM constant merging pass should eliminate duplicate copies of -the strings that get emitted to each translation unit, such as the producer. -</p> +</div> +<!-- ======================================================================= --> +<div class="doc_subsection"> + <a name="ccxx_basic_types">C/C++ basic types</a> </div> +<div class="doc_text"> + +<p>The following are the basic type descriptors for C/C++ core types;</p> + +</div> -<!-- ----------------------------------------------------------------------- --> +<!-- ======================================================================= --> <div class="doc_subsubsection"> - <a name="format_program_objects">Representation of program objects</a> + <a name="ccxx_basic_type_bool">bool</a> </div> <div class="doc_text"> -<p> -The LLVM debugger needs to know about some source-language program objects, in -order to build stack traces, print information about local variables, and other -related activities. 
The LLVM debugger differentiates between three different -types of program objects: subprograms (functions, messages, methods, etc), -variables (locals and globals), and others. Because source-languages have -widely varying forms of these objects, the LLVM debugger expects only a few -fields in the descriptor for each object: -</p> <pre> -%lldb.object = type { - uint, <i>;; <a href="#format_common_tag">A tag</a></i> - <i>any</i>*, <i>;; The <a href="#format_common_object_contexts">context</a> for the object</i> - sbyte* <i>;; The object 'name'</i> -} +%<a href="#format_basic_type">llvm.dbg.basictype</a> = internal constant %<a href="#format_basic_type">llvm.dbg.basictype.type</a> { + uint 36, + { }* cast (%<a href="#format_compile_units">llvm.dbg.compile_unit.type</a>* %<a href="#format_compile_units">llvm.dbg.compile_unit</a> to { }*), + sbyte* getelementptr ([5 x sbyte]* %str1, int 0, int 0), + { }* null, + int 0, + uint 32, + uint 32, + uint 0, + uint 2 }, section "llvm.metadata" +%str1 = internal constant [5 x sbyte] c"bool\00", section "llvm.metadata" </pre> -<p>The first field contains a tag for the descriptor. The second field contains -either a pointer to the descriptor for the containing <a -href="#format_common_source_files">source file</a>, or it contains a pointer to -another program object whose context pointer eventually reaches a source file. -Through this <a href="#format_common_object_contexts">context</a> pointer, the -LLVM debugger can establish the debug version number of the object.</p> - -<p>The third field contains a string that the debugger can use to identify the -object if it does not contain explicit support for the source-language in use -(ie, the 'unknown' source language handler uses this string). 
This should be -some sort of unmangled string that corresponds to the object, but it is a -quality of implementation issue what exactly it contains (it is legal, though -not useful, for all of these strings to be null).</p> +</div> -<p>Note again that descriptors can be extended to include -source-language-specific information in addition to the fields required by the -LLVM debugger. See the <a href="#ccxx_descriptors">section on the C/C++ -front-end</a> for more information. Also remember that global objects -(functions, selectors, global variables, etc) must contain an <a -href="#format_common_anchors">anchor</a> to the <tt>llvm.dbg.globals</tt> -variable.</p> +<!-- ======================================================================= --> +<div class="doc_subsubsection"> + <a name="ccxx_basic_char">char</a> </div> +<div class="doc_text"> + +<pre> +%<a href="#format_basic_type">llvm.dbg.basictype</a> = internal constant %<a href="#format_basic_type">llvm.dbg.basictype.type</a> { + uint 36, + { }* cast (%<a href="#format_compile_units">llvm.dbg.compile_unit.type</a>* %<a href="#format_compile_units">llvm.dbg.compile_unit</a> to { }*), + sbyte* getelementptr ([5 x sbyte]* %str1, int 0, int 0), + { }* null, + int 0, + uint 8, + uint 8, + uint 0, + uint 6 }, section "llvm.metadata" +%str1 = internal constant [5 x sbyte] c"char\00", section "llvm.metadata" +</pre> + +</div> <!-- ======================================================================= --> -<div class="doc_subsection"> - <a name="format_common_object_contexts">Program object contexts</a> +<div class="doc_subsubsection"> + <a name="ccxx_basic_unsigned_char">unsigned char</a> </div> <div class="doc_text"> + <pre> -Allow source-language specific contexts, use to identify namespaces etc -Must end up in a source file descriptor. -Debugger core ignores all unknown context objects. 
+%<a href="#format_basic_type">llvm.dbg.basictype</a> = internal constant %<a href="#format_basic_type">llvm.dbg.basictype.type</a> { + uint 36, + { }* cast (%<a href="#format_compile_units">llvm.dbg.compile_unit.type</a>* %<a href="#format_compile_units">llvm.dbg.compile_unit</a> to { }*), + sbyte* getelementptr ([14 x sbyte]* %str1, int 0, int 0), + { }* null, + int 0, + uint 8, + uint 8, + uint 0, + uint 8 }, section "llvm.metadata" +%str1 = internal constant [14 x sbyte] c"unsigned char\00", section "llvm.metadata" </pre> + </div> <!-- ======================================================================= --> -<div class="doc_subsection"> - <a name="format_common_intrinsics">Debugger intrinsic functions</a> +<div class="doc_subsubsection"> + <a name="ccxx_basic_short">short</a> </div> <div class="doc_text"> -<pre> -Define each intrinsics, as an extension of the language reference manual. -llvm.dbg.stoppoint -llvm.dbg.region.start -llvm.dbg.region.end -llvm.dbg.function.start -llvm.dbg.declare +<pre> +%<a href="#format_basic_type">llvm.dbg.basictype</a> = internal constant %<a href="#format_basic_type">llvm.dbg.basictype.type</a> { + uint 36, + { }* cast (%<a href="#format_compile_units">llvm.dbg.compile_unit.type</a>* %<a href="#format_compile_units">llvm.dbg.compile_unit</a> to { }*), + sbyte* getelementptr ([10 x sbyte]* %str1, int 0, int 0), + { }* null, + int 0, + uint 16, + uint 16, + uint 0, + uint 5 }, section "llvm.metadata" +%str1 = internal constant [10 x sbyte] c"short int\00", section "llvm.metadata" </pre> + </div> <!-- ======================================================================= --> -<div class="doc_subsection"> - <a name="format_common_tags">Values for debugger tags</a> +<div class="doc_subsubsection"> + <a name="ccxx_basic_unsigned_short">unsigned short</a> </div> <div class="doc_text"> -<p>Happen to be the same value as the similarly named Dwarf-3 tags, this may -change in the future.</p> - <pre> - <a 
name="tag_compile_unit">LLVM_COMPILE_UNIT</a> : 17 - <a name="tag_subprogram">LLVM_SUBPROGRAM</a> : 46 - <a name="tag_variable">LLVM_VARIABLE</a> : 52 -<!-- <a name="tag_formal_parameter">LLVM_FORMAL_PARAMETER : 5--> +%<a href="#format_basic_type">llvm.dbg.basictype</a> = internal constant %<a href="#format_basic_type">llvm.dbg.basictype.type</a> { + uint 36, + { }* cast (%<a href="#format_compile_units">llvm.dbg.compile_unit.type</a>* %<a href="#format_compile_units">llvm.dbg.compile_unit</a> to { }*), + sbyte* getelementptr ([19 x sbyte]* %str1, int 0, int 0), + { }* null, + int 0, + uint 16, + uint 16, + uint 0, + uint 7 }, section "llvm.metadata" +%str1 = internal constant [19 x sbyte] c"short unsigned int\00", section "llvm.metadata" </pre> + </div> +<!-- ======================================================================= --> +<div class="doc_subsubsection"> + <a name="ccxx_basic_int">int</a> +</div> +<div class="doc_text"> + +<pre> +%<a href="#format_basic_type">llvm.dbg.basictype</a> = internal constant %<a href="#format_basic_type">llvm.dbg.basictype.type</a> { + uint 36, + { }* cast (%<a href="#format_compile_units">llvm.dbg.compile_unit.type</a>* %<a href="#format_compile_units">llvm.dbg.compile_unit</a> to { }*), + sbyte* getelementptr ([4 x sbyte]* %str1, int 0, int 0), + { }* null, + int 0, + uint 32, + uint 32, + uint 0, + uint 5 }, section "llvm.metadata" +%str1 = internal constant [4 x sbyte] c"int\00", section "llvm.metadata" +</pre> -<!-- *********************************************************************** --> -<div class="doc_section"> - <a name="ccxx_frontend">C/C++ front-end specific debug information</a> </div> -<!-- *********************************************************************** --> + +<!-- ======================================================================= --> +<div class="doc_subsubsection"> + <a name="ccxx_basic_unsigned_int">unsigned int</a> +</div> <div class="doc_text"> -<p>The C and C++ front-ends represent 
information about the program in a format -that is effectively identical to <a -href="http://www.eagercon.com/dwarf/dwarf3std.htm">Dwarf 3.0</a> in terms of -information content. This allows code generators to trivially support native -debuggers by generating standard dwarf information, and contains enough -information for non-dwarf targets to translate it as needed.</p> +<pre> +%<a href="#format_basic_type">llvm.dbg.basictype</a> = internal constant %<a href="#format_basic_type">llvm.dbg.basictype.type</a> { + uint 36, + { }* cast (%<a href="#format_compile_units">llvm.dbg.compile_unit.type</a>* %<a href="#format_compile_units">llvm.dbg.compile_unit</a> to { }*), + sbyte* getelementptr ([13 x sbyte]* %str1, int 0, int 0), + { }* null, + int 0, + uint 32, + uint 32, + uint 0, + uint 7 }, section "llvm.metadata" +%str1 = internal constant [13 x sbyte] c"unsigned int\00", section "llvm.metadata" +</pre> -<p>The basic debug information required by the debugger is (intentionally) -designed to be as minimal as possible. This basic information is so minimal -that it is unlikely that <b>any</b> source-language could be adequately -described by it. Because of this, the debugger format was designed for -extension to support source-language-specific information. The extended -descriptors are read and interpreted by the <a -href="#arch_info">language-specific</a> modules in the debugger if there is -support available, otherwise it is ignored.</p> - -<p>This section describes the extensions used to represent C and C++ programs. -Other languages could pattern themselves after this (which itself is tuned to -representing programs in the same way that Dwarf 3 does), or they could choose -to provide completely different extensions if they don't fit into the Dwarf -model. 
As support for debugging information gets added to the various LLVM -source-language front-ends, the information used should be documented here.</p> +</div> + +<!-- ======================================================================= --> +<div class="doc_subsubsection"> + <a name="ccxx_basic_long_long">long long</a> +</div> + +<div class="doc_text"> + +<pre> +%<a href="#format_basic_type">llvm.dbg.basictype</a> = internal constant %<a href="#format_basic_type">llvm.dbg.basictype.type</a> { + uint 36, + { }* cast (%<a href="#format_compile_units">llvm.dbg.compile_unit.type</a>* %<a href="#format_compile_units">llvm.dbg.compile_unit</a> to { }*), + sbyte* getelementptr ([14 x sbyte]* %str1, int 0, int 0), + { }* null, + int 0, + uint 64, + uint 64, + uint 0, + uint 5 }, section "llvm.metadata" +%str1 = internal constant [14 x sbyte] c"long long int\00", section "llvm.metadata" +</pre> </div> <!-- ======================================================================= --> -<div class="doc_subsection"> - <a name="ccxx_pse">Program Scope Entries</a> +<div class="doc_subsubsection"> + <a name="ccxx_basic_unsigned_long_long">unsigned long long</a> </div> <div class="doc_text"> -<p>TODO</p> + +<pre> +%<a href="#format_basic_type">llvm.dbg.basictype</a> = internal constant %<a href="#format_basic_type">llvm.dbg.basictype.type</a> { + uint 36, + { }* cast (%<a href="#format_compile_units">llvm.dbg.compile_unit.type</a>* %<a href="#format_compile_units">llvm.dbg.compile_unit</a> to { }*), + sbyte* getelementptr ([23 x sbyte]* %str1, int 0, int 0), + { }* null, + int 0, + uint 64, + uint 64, + uint 0, + uint 7 }, section "llvm.metadata" +%str1 = internal constant [23 x sbyte] c"long long unsigned int\00", section "llvm.metadata" +</pre> + </div> -<!-- --------------------------------------------------------------------------> +<!-- ======================================================================= --> <div class="doc_subsubsection"> - <a 
name="ccxx_compilation_units">Compilation unit entries</a> + <a name="ccxx_basic_float">float</a> </div> <div class="doc_text"> -<p> -Translation units do not add any information over the standard <a -href="#format_common_source_files">source file representation</a> already -expected by the debugger. As such, it uses descriptors of the type specified, -with a trailing <a href="#format_common_anchors">anchor</a>. -</p> + +<pre> +%<a href="#format_basic_type">llvm.dbg.basictype</a> = internal constant %<a href="#format_basic_type">llvm.dbg.basictype.type</a> { + uint 36, + { }* cast (%<a href="#format_compile_units">llvm.dbg.compile_unit.type</a>* %<a href="#format_compile_units">llvm.dbg.compile_unit</a> to { }*), + sbyte* getelementptr ([6 x sbyte]* %str1, int 0, int 0), + { }* null, + int 0, + uint 32, + uint 32, + uint 0, + uint 4 }, section "llvm.metadata" +%str1 = internal constant [6 x sbyte] c"float\00", section "llvm.metadata" +</pre> + </div> -<!-- --------------------------------------------------------------------------> +<!-- ======================================================================= --> <div class="doc_subsubsection"> - <a name="ccxx_modules">Module, namespace, and importing entries</a> + <a name="ccxx_basic_double">double</a> +</div> + +<div class="doc_text"> + +<pre> +%<a href="#format_basic_type">llvm.dbg.basictype</a> = internal constant %<a href="#format_basic_type">llvm.dbg.basictype.type</a> { + uint 36, + { }* cast (%<a href="#format_compile_units">llvm.dbg.compile_unit.type</a>* %<a href="#format_compile_units">llvm.dbg.compile_unit</a> to { }*), + sbyte* getelementptr ([7 x sbyte]* %str1, int 0, int 0), + { }* null, + int 0, + uint 64, + uint 64, + uint 0, + uint 4 }, section "llvm.metadata" +%str1 = internal constant [7 x sbyte] c"double\00", section "llvm.metadata" +</pre> + +</div> + +<!-- ======================================================================= --> +<div class="doc_subsection"> + <a 
name="ccxx_derived_types">C/C++ derived types</a> </div> <div class="doc_text"> -<p>TODO</p> + +<p>Given the following as an example of C/C++ derived type;</p> + +<pre> +typedef const int *IntPtr; +</pre> + +<p>a C/C++ front-end would generate the following descriptors;</p> + +<pre> +;; +;; Define the typedef "IntPtr". +;; +%<a href="#format_derived_type">llvm.dbg.derivedtype1</a> = internal constant %<a href="#format_derived_type">llvm.dbg.derivedtype.type</a> { + uint 22, + { }* cast (%<a href="#format_compile_units">llvm.dbg.compile_unit.type</a>* %<a href="#format_compile_units">llvm.dbg.compile_unit</a> to { }*), + sbyte* getelementptr ([7 x sbyte]* %str1, int 0, int 0), + { }* cast (%<a href="#format_compile_units">llvm.dbg.compile_unit.type</a>* %<a href="#format_compile_units">llvm.dbg.compile_unit</a> to { }*), + int 1, + uint 0, + uint 0, + uint 0, + { }* cast (%<a href="#format_derived_type">llvm.dbg.derivedtype.type</a>* %<a href="#format_derived_type">llvm.dbg.derivedtype2</a> to { }*) }, section "llvm.metadata" +%str1 = internal constant [7 x sbyte] c"IntPtr\00", section "llvm.metadata" + +;; +;; Define the pointer type. +;; +%<a href="#format_derived_type">llvm.dbg.derivedtype2</a> = internal constant %<a href="#format_derived_type">llvm.dbg.derivedtype.type</a> { + uint 15, + { }* cast (%<a href="#format_compile_units">llvm.dbg.compile_unit.type</a>* %<a href="#format_compile_units">llvm.dbg.compile_unit</a> to { }*), + sbyte* getelementptr ([1 x sbyte]* %str2, int 0, int 0), + { }* null, + int 0, + uint 32, + uint 32, + uint 0, + { }* cast (%<a href="#format_derived_type">llvm.dbg.derivedtype.type</a>* %<a href="#format_derived_type">llvm.dbg.derivedtype3</a> to { }*) }, section "llvm.metadata" +%str2 = internal constant [1 x sbyte] zeroinitializer, section "llvm.metadata" + +;; +;; Define the const type. 
+;; +%<a href="#format_derived_type">llvm.dbg.derivedtype3</a> = internal constant %<a href="#format_derived_type">llvm.dbg.derivedtype.type</a> { + uint 38, + { }* cast (%<a href="#format_compile_units">llvm.dbg.compile_unit.type</a>* %<a href="#format_compile_units">llvm.dbg.compile_unit</a> to { }*), + sbyte* getelementptr ([1 x sbyte]* %str2, int 0, int 0), + { }* null, + int 0, + uint 0, + uint 0, + uint 0, + { }* cast (%<a href="#format_basic_type">llvm.dbg.basictype.type</a>* %<a href="#format_basic_type">llvm.dbg.basictype1</a> to { }*) }, section "llvm.metadata" + +;; +;; Define the int type. +;; +%<a href="#format_basic_type">llvm.dbg.basictype1</a> = internal constant %<a href="#format_basic_type">llvm.dbg.basictype.type</a> { + uint 36, + { }* cast (%<a href="#format_compile_units">llvm.dbg.compile_unit.type</a>* %<a href="#format_compile_units">llvm.dbg.compile_unit</a> to { }*), + sbyte* getelementptr ([4 x sbyte]* %str4, int 0, int 0), + { }* null, + int 0, + uint 32, + uint 32, + uint 0, + uint 5 }, section "llvm.metadata" +%str4 = internal constant [4 x sbyte] c"int\00", section "llvm.metadata" +</pre> + </div> <!-- ======================================================================= --> <div class="doc_subsection"> - <a name="ccxx_dataobjects">Data objects (program variables)</a> + <a name="ccxx_composite_types">C/C++ struct/union types</a> </div> <div class="doc_text"> -<p>TODO</p> + +<p>Given the following as an example of C/C++ struct type;</p> + +<pre> +struct Color { + unsigned Red; + unsigned Green; + unsigned Blue; +}; +</pre> + +<p>a C/C++ front-end would generate the following descriptors;</p> + +<pre> +;; +;; Define basic type for unsigned int. 
+;; +%<a href="#format_basic_type">llvm.dbg.basictype</a> = internal constant %<a href="#format_basic_type">llvm.dbg.basictype.type</a> { + uint 36, + { }* cast (%<a href="#format_compile_units">llvm.dbg.compile_unit.type</a>* %<a href="#format_compile_units">llvm.dbg.compile_unit</a> to { }*), + sbyte* getelementptr ([13 x sbyte]* %str1, int 0, int 0), + { }* null, + int 0, + uint 32, + uint 32, + uint 0, + uint 7 }, section "llvm.metadata" +%str1 = internal constant [13 x sbyte] c"unsigned int\00", section "llvm.metadata" + +;; +;; Define composite type for struct Color. +;; +%<a href="#format_composite_type">llvm.dbg.compositetype</a> = internal constant %<a href="#format_composite_type">llvm.dbg.compositetype.type</a> { + uint 19, + { }* cast (%<a href="#format_compile_units">llvm.dbg.compile_unit.type</a>* %<a href="#format_compile_units">llvm.dbg.compile_unit</a> to { }*), + sbyte* getelementptr ([6 x sbyte]* %str2, int 0, int 0), + { }* cast (%<a href="#format_compile_units">llvm.dbg.compile_unit.type</a>* %<a href="#format_compile_units">llvm.dbg.compile_unit</a> to { }*), + int 1, + uint 96, + uint 32, + uint 0, + { }* null, + { }* cast ([3 x { }*]* %llvm.dbg.array to { }*) }, section "llvm.metadata" +%str2 = internal constant [6 x sbyte] c"Color\00", section "llvm.metadata" + +;; +;; Define the Red field. 
+;; +%<a href="#format_derived_type">llvm.dbg.derivedtype1</a> = internal constant %<a href="#format_derived_type">llvm.dbg.derivedtype.type</a> { + uint 13, + { }* null, + sbyte* getelementptr ([4 x sbyte]* %str3, int 0, int 0), + { }* cast (%<a href="#format_compile_units">llvm.dbg.compile_unit.type</a>* %<a href="#format_compile_units">llvm.dbg.compile_unit</a> to { }*), + int 2, + uint 32, + uint 32, + uint 0, + { }* cast (%<a href="#format_basic_type">llvm.dbg.basictype.type</a>* %<a href="#format_basic_type">llvm.dbg.basictype</a> to { }*) }, section "llvm.metadata" +%str3 = internal constant [4 x sbyte] c"Red\00", section "llvm.metadata" + +;; +;; Define the Green field. +;; +%<a href="#format_derived_type">llvm.dbg.derivedtype2</a> = internal constant %<a href="#format_derived_type">llvm.dbg.derivedtype.type</a> { + uint 13, + { }* null, + sbyte* getelementptr ([6 x sbyte]* %str4, int 0, int 0), + { }* cast (%<a href="#format_compile_units">llvm.dbg.compile_unit.type</a>* %<a href="#format_compile_units">llvm.dbg.compile_unit</a> to { }*), + int 3, + uint 32, + uint 32, + uint 32, + { }* cast (%<a href="#format_basic_type">llvm.dbg.basictype.type</a>* %<a href="#format_basic_type">llvm.dbg.basictype</a> to { }*) }, section "llvm.metadata" +%str4 = internal constant [6 x sbyte] c"Green\00", section "llvm.metadata" + +;; +;; Define the Blue field. 
+;; +%<a href="#format_derived_type">llvm.dbg.derivedtype3</a> = internal constant %<a href="#format_derived_type">llvm.dbg.derivedtype.type</a> { + uint 13, + { }* null, + sbyte* getelementptr ([5 x sbyte]* %str5, int 0, int 0), + { }* cast (%<a href="#format_compile_units">llvm.dbg.compile_unit.type</a>* %<a href="#format_compile_units">llvm.dbg.compile_unit</a> to { }*), + int 4, + uint 32, + uint 32, + uint 64, + { }* cast (%<a href="#format_basic_type">llvm.dbg.basictype.type</a>* %<a href="#format_basic_type">llvm.dbg.basictype</a> to { }*) }, section "llvm.metadata" +%str5 = internal constant [5 x sbyte] c"Blue\00", section "llvm.metadata" + +;; +;; Define the array of fields used by the composite type Color. +;; +%llvm.dbg.array = internal constant [3 x { }*] [ + { }* cast (%<a href="#format_derived_type">llvm.dbg.derivedtype.type</a>* %<a href="#format_derived_type">llvm.dbg.derivedtype1</a> to { }*), + { }* cast (%<a href="#format_derived_type">llvm.dbg.derivedtype.type</a>* %<a href="#format_derived_type">llvm.dbg.derivedtype2</a> to { }*), + { }* cast (%<a href="#format_derived_type">llvm.dbg.derivedtype.type</a>* %<a href="#format_derived_type">llvm.dbg.derivedtype3</a> to { }*) ], section "llvm.metadata" +</pre> + +</div> + +<!-- ======================================================================= --> +<div class="doc_subsection"> + <a name="ccxx_enumeration_types">C/C++ enumeration types</a> </div> +<div class="doc_text"> + +<p>Given the following as an example of C/C++ enumeration type;</p> + +<pre> +enum Trees { + Spruce = 100, + Oak = 200, + Maple = 300 +}; +</pre> + +<p>a C/C++ front-end would generate the following descriptors;</p> + +<pre> +;; +;; Define composite type for enum Trees +;; +%<a href="#format_composite_type">llvm.dbg.compositetype</a> = internal constant %<a href="#format_composite_type">llvm.dbg.compositetype.type</a> { + uint 4, + { }* cast (%<a href="#format_compile_units">llvm.dbg.compile_unit.type</a>* %<a 
href="#format_compile_units">llvm.dbg.compile_unit</a> to { }*), + sbyte* getelementptr ([6 x sbyte]* %str1, int 0, int 0), + { }* cast (%<a href="#format_compile_units">llvm.dbg.compile_unit.type</a>* %<a href="#format_compile_units">llvm.dbg.compile_unit</a> to { }*), + int 1, + uint 32, + uint 32, + uint 0, + { }* null, + { }* cast ([3 x { }*]* %llvm.dbg.array to { }*) }, section "llvm.metadata" +%str1 = internal constant [6 x sbyte] c"Trees\00", section "llvm.metadata" + +;; +;; Define Spruce enumerator. +;; +%<a href="#format_enumeration">llvm.dbg.enumerator1</a> = internal constant %<a href="#format_enumeration">llvm.dbg.enumerator.type</a> { + uint 40, + sbyte* getelementptr ([7 x sbyte]* %str2, int 0, int 0), + int 100 }, section "llvm.metadata" +%str2 = internal constant [7 x sbyte] c"Spruce\00", section "llvm.metadata" + +;; +;; Define Oak enumerator. +;; +%<a href="#format_enumeration">llvm.dbg.enumerator2</a> = internal constant %<a href="#format_enumeration">llvm.dbg.enumerator.type</a> { + uint 40, + sbyte* getelementptr ([4 x sbyte]* %str3, int 0, int 0), + int 200 }, section "llvm.metadata" +%str3 = internal constant [4 x sbyte] c"Oak\00", section "llvm.metadata" + +;; +;; Define Maple enumerator. +;; +%<a href="#format_enumeration">llvm.dbg.enumerator3</a> = internal constant %<a href="#format_enumeration">llvm.dbg.enumerator.type</a> { + uint 40, + sbyte* getelementptr ([6 x sbyte]* %str4, int 0, int 0), + int 300 }, section "llvm.metadata" +%str4 = internal constant [6 x sbyte] c"Maple\00", section "llvm.metadata" + +;; +;; Define the array of enumerators used by composite type Trees. 
+;; +%llvm.dbg.array = internal constant [3 x { }*] [ + { }* cast (%<a href="#format_enumeration">llvm.dbg.enumerator.type</a>* %<a href="#format_enumeration">llvm.dbg.enumerator1</a> to { }*), + { }* cast (%<a href="#format_enumeration">llvm.dbg.enumerator.type</a>* %<a href="#format_enumeration">llvm.dbg.enumerator2</a> to { }*), + { }* cast (%<a href="#format_enumeration">llvm.dbg.enumerator.type</a>* %<a href="#format_enumeration">llvm.dbg.enumerator3</a> to { }*) ], section "llvm.metadata" +</pre> + +</div> <!-- *********************************************************************** --> |
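The enumerator descriptors in the `enum Trees` example each carry three fields: the tag 40, a pointer to a null-terminated name string, and the enumerator's value. The following is a minimal sketch, not part of LLVM, that recomputes those fields and the `[N x sbyte]` array lengths for the names used above. The helper names `sbyte_array_length` and `enumerator_fields` are hypothetical and exist only for illustration; the tag numbers 4 and 40 are the ones quoted in the descriptors.

```python
# Tag numbers as they appear in the descriptors above.
DW_TAG_ENUMERATION_TYPE = 4   # tag of the composite type for enum Trees
DW_TAG_ENUMERATOR = 40        # tag of each enumerator descriptor


def sbyte_array_length(name: str) -> int:
    """Return N for the [N x sbyte] constant: the name bytes plus one
    byte for the trailing null terminator."""
    return len(name.encode("ascii")) + 1


def enumerator_fields(name: str, value: int) -> tuple:
    """Hypothetical helper: the (tag, name, value) triple that one
    enumerator descriptor holds."""
    return (DW_TAG_ENUMERATOR, name, value)


# The enumerators of enum Trees from the C/C++ example.
TREES = [("Spruce", 100), ("Oak", 200), ("Maple", 300)]

descriptors = [enumerator_fields(name, value) for name, value in TREES]
name_lengths = {name: sbyte_array_length(name) for name, _ in TREES}

# Print one summary line per enumerator descriptor.
for tag, name, value in descriptors:
    print(f'uint {tag}, [{name_lengths[name]} x sbyte] c"{name}\\00", int {value}')
```

The array length is always the name length plus one, which is a quick way to check the `getelementptr` operand types against the string constants they point to.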