summaryrefslogtreecommitdiff
path: root/docs/GarbageCollection.html
diff options
context:
space:
mode:
authorChris Lattner <sabre@nondot.org>2004-05-27 05:52:10 +0000
committerChris Lattner <sabre@nondot.org>2004-05-27 05:52:10 +0000
commit9b2a1849083284b213c16bea6edd181e9e337885 (patch)
tree9ac7a8fa9c8c961627a0973ddf5862c3d87e806f /docs/GarbageCollection.html
parent9772629794cc28bc93b3c610386f1af50594e38c (diff)
downloadllvm-9b2a1849083284b213c16bea6edd181e9e337885.tar.gz
llvm-9b2a1849083284b213c16bea6edd181e9e337885.tar.bz2
llvm-9b2a1849083284b213c16bea6edd181e9e337885.tar.xz
Continue the exposition
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@13819 91177308-0d34-0410-b5e6-96231b3b80d8
Diffstat (limited to 'docs/GarbageCollection.html')
-rw-r--r--docs/GarbageCollection.html158
1 files changed, 133 insertions, 25 deletions
diff --git a/docs/GarbageCollection.html b/docs/GarbageCollection.html
index 7502fb630f..35bb60a598 100644
--- a/docs/GarbageCollection.html
+++ b/docs/GarbageCollection.html
@@ -21,7 +21,6 @@
<li><a href="#interfaces">Interfaces for user programs</a>
<ul>
<li><a href="#roots">Identifying GC roots on the stack: <tt>llvm.gcroot</tt></a></li>
- <li><a href="#gcdescriptors">GC descriptor format for heap objects</a></li>
<li><a href="#allocate">Allocating memory from the GC</a></li>
<li><a href="#barriers">Reading and writing references to the heap</a></li>
<li><a href="#explicit">Explicit invocation of the garbage collector</a></li>
@@ -31,10 +30,13 @@
<li><a href="#gcimpl">Implementing a garbage collector</a>
<ul>
<li><a href="#llvm_gc_readwrite">Implementing <tt>llvm_gc_read</tt> and <tt>llvm_gc_write</tt></a></li>
- <li><a href="#traceroots">Tracing the GC roots from the program stack</a></li>
- <li><a href="#gcimpls">GC implementations available</a></li>
+ <li><a href="#callbacks">Callback functions used to implement the garbage collector</a></li>
</ul>
</li>
+ <li><a href="#gcimpls">GC implementations available</a>
+ <ul>
+ <li><a href="#semispace">SemiSpace - A simple copying garbage collector</a></li>
+ </li>
<!--
<li><a href="#codegen">Implementing GC support in a code generator</a></li>
@@ -208,20 +210,6 @@ CodeBlock:
<!-- ======================================================================= -->
<div class="doc_subsection">
- <a name="gcdescriptors">GC descriptor format for heap objects</a>
-</div>
-
-<div class="doc_text">
-
-<p>
-TODO: Either from root meta data, or from object headers. Front-end can provide a
-call-back to get descriptor from object without meta-data.
-</p>
-
-</div>
-
-<!-- ======================================================================= -->
-<div class="doc_subsection">
<a name="allocate">Allocating memory from the GC</a>
</div>
@@ -282,13 +270,14 @@ normal LLVM loads/stores.</p>
<div class="doc_text">
<div class="doc_code"><tt>
- void %llvm_gc_initialize()
+ void %llvm_gc_initialize(unsigned %InitialHeapSize)
</tt></div>
<p>
The <tt>llvm_gc_initialize</tt> function should be called once before any other
garbage collection functions are called. This gives the garbage collector the
-chance to initialize itself and allocate the heap spaces.
+chance to initialize itself and allocate the heap spaces. The initial heap size
+to allocate should be specified as an argument.
</p>
</div>
@@ -323,7 +312,14 @@ the garbage collector.
<div class="doc_text">
<p>
-Implementing a garbage collector for LLVM is fairly straight-forward. The
+Implementing a garbage collector for LLVM is fairly straight-forward. The LLVM
+garbage collectors are provided in a form that makes them easy to link into the
+language-specific runtime that a language front-end would use. They require
+functionality from the language-specific runtime to get information about <a
+href="#gcdescriptors">where pointers are located in heap objects</a>.
+</p>
+
+<p>The
implementation must include the <a
href="#allocate"><tt>llvm_gc_allocate</tt></a> and <a
href="#explicit"><tt>llvm_gc_collect</tt></a> functions, and it must implement
@@ -363,7 +359,15 @@ as a parameter in the future, if needed.
<!-- ======================================================================= -->
<div class="doc_subsection">
- <a name="traceroots">Tracing the GC roots from the program stack</a>
+ <a name="callbacks">Callback functions used to implement the garbage collector</a></li>
+</div>
+
+Garbage collector implementations make use of call-back functions that are
+implemented by other parts of the LLVM system.
+
+<!--_________________________________________________________________________-->
+<div class="doc_subsubsection">
+ <a name="traceroots">Tracing GC pointers from the program stack</a>
</div>
<div class="doc_text">
@@ -380,25 +384,129 @@ href="#gcroot"><tt>llvm.gcroot</tt></a> intrinsic) are provided.
</p>
</div>
+<!--_________________________________________________________________________-->
+<div class="doc_subsubsection">
+ <a name="staticroots">Tracing GC pointers from static roots</a>
+</div>
-<!-- ======================================================================= -->
-<div class="doc_subsection">
+<div class="doc_text">
+TODO
+</div>
+
+
+<!--_________________________________________________________________________-->
+<div class="doc_subsubsection">
+ <a name="gcdescriptors">Tracing GC pointers from heap objects</a>
+</div>
+
+<div class="doc_text">
+<p>
+The three most common ways to keep track of where pointers live in heap objects
+are (listed in order of space overhead required):</p>
+
+<ol>
+<li>In languages with polymorphic objects, pointers from an object header are
+usually used to identify the GC pointers in the heap object. This is common for
+object-oriented languages like Self, Smalltalk, Java, or C#.</li>
+
+<li>If heap objects are not polymorphic, often the "shape" of the heap can be
+determined from the roots of the heap or from some other meta-data [<a
+href="#appel89">Appel89</a>, <a href="#goldberg91">Goldberg91</a>, <a
+href="#tolmach94">Tolmach94</a>]. In this case, the garbage collector can
+propagate the information around from meta data stored with the roots. This
+often eliminates the need to have a header on objects in the heap. This is
+common in the ML family.</li>
+
+<li>If all heap objects have pointers in the same locations, or pointers can be
+distinguished just by looking at them (e.g., the low order bit is clear), no
+book-keeping is needed at all. This is common for Lisp-like languages.</li>
+</ol>
+
+<p>The LLVM garbage collectors are capable of supporting all of these styles of
+language, including ones that mix various implementations. To do this, it
+allows the source-language to associate meta-data with the <a
+href="#roots">stack roots</a>, and the heap tracing routines can propagate the
+information. In addition, LLVM allows the front-end to extract GC information
+from in any form from a specific object pointer (this supports situations #1 and
+#3).
+</p>
+
+<p><b>Making this efficient</b></p>
+
+
+
+</div>
+
+
+
+<!-- *********************************************************************** -->
+<div class="doc_section">
<a name="gcimpls">GC implementations available</a>
</div>
+<!-- *********************************************************************** -->
<div class="doc_text">
<p>
To make this more concrete, the currently implemented LLVM garbage collectors
-all live in the llvm/runtime/GC directory in the LLVM source-base.
+all live in the <tt>llvm/runtime/GC/*</tt> directories in the LLVM source-base.
+If you are interested in implementing an algorithm, there are many interesting
+possibilities (mark/sweep, a generational collector, a reference counting
+collector, etc), or you could choose to improve one of the existing algorithms.
</p>
+</div>
+
+<!-- ======================================================================= -->
+<div class="doc_subsection">
+ <a name="semispace">SemiSpace - A simple copying garbage collector</a></li>
+</div>
+
+<div class="doc_text">
<p>
-TODO: Brief overview of each.
+SemiSpace is a very simple copying collector. When it starts up, it allocates
+two blocks of memory for the heap. It uses a simple bump-pointer allocator to
+allocate memory from the first block until it runs out of space. When it runs
+out of space, it traces through all of the roots of the program, copying blocks
+to the other half of the memory space.
</p>
</div>
+<!--_________________________________________________________________________-->
+<div class="doc_subsubsection">
+ Possible Improvements
+</div>
+
+<div class="doc_text">
+
+<p>
+If a collection cycle happens and the heap is not compacted very much (say less
+than 25% of the allocated memory was freed), the memory regions should be
+doubled in size.</p>
+
+</div>
+
+<!-- *********************************************************************** -->
+<div class="doc_section">
+ <a name="references">References</a>
+</div>
+<!-- *********************************************************************** -->
+
+<div class="doc_text">
+
+<p><a name="appel89">[Appel89]</a> Runtime Tags Aren't Necessary. Andrew
+W. Appel. Lisp and Symbolic Computation 19(7):703-705, July 1989.</p>
+
+<p><a name="goldberg91">[Goldberg91]</a> Tag-free garbage collection for
+strongly typed programming languages. Benjamin Goldberg. ACM SIGPLAN
+PLDI'91.</p>
+
+<p><a name="tolmach94">[Tolmach94]</a> Tag-free garbage collection using
+explicit type parameters. Andrew Tolmach. Proceedings of the 1994 ACM
+conference on LISP and functional programming.</p>
+
+</div>
<!-- *********************************************************************** -->