Diffstat (limited to 'docs')
-rw-r--r--   docs/ChrisNotes.txt                                             50
-rw-r--r--   docs/HistoricalNotes/2000-11-18-EarlyDesignIdeas.txt            74
-rw-r--r--   docs/HistoricalNotes/2000-11-18-EarlyDesignIdeasResp.txt       199
-rw-r--r--   docs/HistoricalNotes/2000-12-06-EncodingIdea.txt                30
-rw-r--r--   docs/HistoricalNotes/2000-12-06-MeetingSummary.txt              83
-rw-r--r--   docs/HistoricalNotes/2001-01-31-UniversalIRIdea.txt             39
-rw-r--r--   docs/HistoricalNotes/2001-02-06-TypeNotationDebate.txt          67
-rw-r--r--   docs/HistoricalNotes/2001-02-06-TypeNotationDebateResp1.txt     75
-rw-r--r--   docs/HistoricalNotes/2001-02-06-TypeNotationDebateResp2.txt     53
-rw-r--r--   docs/HistoricalNotes/2001-02-06-TypeNotationDebateResp4.txt     89
-rw-r--r--   docs/HistoricalNotes/2001-02-09-AdveComments.txt               120
-rw-r--r--   docs/HistoricalNotes/2001-02-09-AdveCommentsResponse.txt       245
-rw-r--r--   docs/HistoricalNotes/2001-02-13-Reference-Memory.txt            39
-rw-r--r--   docs/HistoricalNotes/2001-02-13-Reference-MemoryResponse.txt    47
-rw-r--r--   docs/HistoricalNotes/2001-04-16-DynamicCompilation.txt          12
-rw-r--r--   docs/HistoricalNotes/2001-05-18-ExceptionHandling.txt          202
-rw-r--r--   docs/HistoricalNotes/2001-05-19-ExceptionResponse.txt           45
-rw-r--r--   docs/HistoricalNotes/2001-06-01-GCCOptimizations.txt            63
-rw-r--r--   docs/HistoricalNotes/2001-06-01-GCCOptimizations2.txt           71
-rw-r--r--   docs/LangRef.html                                             1376
20 files changed, 2979 insertions(+), 0 deletions(-)
diff --git a/docs/ChrisNotes.txt b/docs/ChrisNotes.txt
new file mode 100644
index 0000000000..f0ea5c6f06
--- /dev/null
+++ b/docs/ChrisNotes.txt
@@ -0,0 +1,50 @@
+* Provide a pass that eliminates critical edges from the CFG
+* Provide a print hook to print out xvcg format files for vis
+* I need to provide an option to the bytecode loader to ignore memory
+ dependence edges. Instead, the VM would just treat memory operations
+ (load, store, getfield, putfield, call) as pinned instructions.
+* I need to have a way to prevent taking the address of a constant pool
+ reference. You should only be able to take the address of a variable.
+ Maybe taking the address of a constant copies it? What about virtual
+ function tables? Maybe a const pointer would be better...
+* Structures should be accessed something like this: ubyte is ok. Limits
+ structure size to 256 members. This can be fixed later by either:
+ 1. adding a variant that takes ushort
+ 2. Splitting structures into nested structures each of half size
+ <float> %f = loadfield *{int, {float}} Str, ubyte 1, ubyte 0
+ storefield float %f, *{int, {float}} Str, ubyte 1, ubyte 0
+* I'm noticing myself writing a lot of code that looks like this (dtor material here):
+ ConstPool.dropAllReferences();
+ ConstPool.delete_all();
+ ConstPool.setParent(0);
+ ~ConstPool
+
+* Need a way to attach bytecode block info at various levels of asm code.
+* Rename "ConstantPool" to "ConstPool"
+* Maybe ConstantPool objects should keep themselves sorted as things are
+ inserted.
+* Need to be able to inflate recursive types. %x = { *%x }, %x = %x ()
+* Recognize and save comments in assembly and bytecode format
+* Encode line number table in bytecode (like #line), optional table
+
+* Encode negative relative offsets in the bytecode file
+
+* Implement switch to switch on a constant pool array of type:
+ [{ label, int }] or [label] (lookup vs index switch)
+* Apparently bison has a %pure_parser option. Maybe useful for Assembly/Parser
+
+* Implement a header file that can read either assembly or bytecode, implement
+ a writer that can output either based on what is read with this reader..
+* Implement the following derived types:
+ * structure/record { int %foo, int %bar} or { %foo = int, int }
+ * pointer int *
+ * "packed format", like this: [4 x sbyte]: Packed SIMD datatype
+* Maybe 'tailcall' also?
+* It might be nice to support enumerations of some sort... especially for use
+ as a compiler IR
+* Include a method level bytecode block that defines a mapping between values
+ and registers that defines a minimally register allocated code. This can
+ make me finally address how to encode extensions in assembly.
+* Bytecode reader should use extensions that may or may not be linked into the
+ application to read blocks. Thus an easy way to ignore symbol table info
+ would be to not link that reader into the app.
diff --git a/docs/HistoricalNotes/2000-11-18-EarlyDesignIdeas.txt b/docs/HistoricalNotes/2000-11-18-EarlyDesignIdeas.txt
new file mode 100644
index 0000000000..f086181192
--- /dev/null
+++ b/docs/HistoricalNotes/2000-11-18-EarlyDesignIdeas.txt
@@ -0,0 +1,74 @@
+Date: Sat, 18 Nov 2000 09:19:35 -0600 (CST)
+From: Vikram Adve <vadve@cs.uiuc.edu>
+To: Chris Lattner <lattner@cs.uiuc.edu>
+Subject: a few thoughts
+
+I've been mulling over the virtual machine problem and I had some
+thoughts about some things for us to think about and discuss:
+
+1. We need to be clear on our goals for the VM. Do we want to emphasize
+ portability and safety like the Java VM? Or shall we focus on the
+ architecture interface first (i.e., consider the code generation and
+ processor issues), since the architecture interface question is also
+ important for portable Java-type VMs?
+
+ This is important because the audiences for these two goals are very
+ different. Architects and many compiler people care much more about
+ the second question. The Java compiler and OS community care much more
+ about the first one.
+
+ Also, while the architecture interface question is important for
+ Java-type VMs, the design constraints are very different.
+
+
+2. Design issues to consider (an initial list that we should continue
+ to modify). Note that I'm not trying to suggest actual solutions here,
+ but just various directions we can pursue:
+
+ a. A single-assignment VM, which we've both already been thinking about.
+
+ b. A strongly-typed VM. One question is do we need the types to be
+ explicitly declared or should they be inferred by the dynamic compiler?
+
+ c. How do we get more high-level information into the VM while keeping
+ to a low-level VM design?
+
+ o Explicit array references as operands? An alternative is
+ to have just an array type, and let the index computations be
+ separate 3-operand instructions.
+
+ o Explicit instructions to handle aliasing, e.g.s:
+ -- an instruction to say "I speculate that these two values are not
+ aliased, but check at runtime", like speculative execution in
+ EPIC?
+ -- or an instruction to check whether two values are aliased and
+ execute different code depending on the answer, somewhat like
+ predicated code in EPIC
+
+ o (This one is a difficult but powerful idea.)
+ A "thread-id" field on every instruction that allows the static
+ compiler to generate a set of parallel threads, and then have
+ the runtime compiler and hardware do what they please with it.
+ This has very powerful uses, but thread-id on every instruction
+ is expensive in terms of instruction size and code size.
+ We would need to compactly encode it somehow.
+
+ Also, this will require some reading on at least two other
+ projects:
+ -- Multiscalar architecture from Wisconsin
+ -- Simultaneous multithreading architecture from Washington
+
+ o Or forget all this and stick to a traditional instruction set?
+
+
+BTW, on an unrelated note, after the meeting yesterday, I did remember
+that you had suggested doing instruction scheduling on SSA form instead
+of a dependence DAG earlier in the semester. When we talked about
+it yesterday, I didn't remember where the idea had come from but I
+remembered later. Just giving credit where it's due...
+
+Perhaps you can save the above as a file under RCS so you and I can
+continue to expand on this.
+
+--Vikram
+
diff --git a/docs/HistoricalNotes/2000-11-18-EarlyDesignIdeasResp.txt b/docs/HistoricalNotes/2000-11-18-EarlyDesignIdeasResp.txt
new file mode 100644
index 0000000000..1c725f5aa7
--- /dev/null
+++ b/docs/HistoricalNotes/2000-11-18-EarlyDesignIdeasResp.txt
@@ -0,0 +1,199 @@
+Date: Sun, 19 Nov 2000 16:23:57 -0600 (CST)
+From: Chris Lattner <sabre@nondot.org>
+To: Vikram Adve <vadve@cs.uiuc.edu>
+Subject: Re: a few thoughts
+
+Okay... here are a few of my thoughts on this (it's good to know that we
+think so alike!):
+
+> 1. We need to be clear on our goals for the VM. Do we want to emphasize
+> portability and safety like the Java VM? Or shall we focus on the
+> architecture interface first (i.e., consider the code generation and
+> processor issues), since the architecture interface question is also
+> important for portable Java-type VMs?
+
+I foresee the architecture looking kinda like this: (which is completely
+subject to change)
+
+1. The VM code is NOT guaranteed safe in a java sense. Doing so makes it
+ basically impossible to support C like languages. Besides that,
+ certifying a register based language as safe at run time would be a
+ pretty expensive operation to have to do. Additionally, we would like
+ to be able to statically eliminate many bounds checks in Java
+ programs... for example.
+
+ 2. Instead, we can do the following (eventually):
+ * Java bytecode is used as our "safe" representation (to avoid
+ reinventing something that we don't add much value to). When the
+ user chooses to execute Java bytecodes directly (ie, not
+ precompiled) the runtime compiler can do some very simple
+ transformations (JIT style) to convert it into valid input for our
+ VM. Performance is not wonderful, but it works right.
+ * The file is scheduled to be compiled (rigorously) at a later
+ time. This could be done by some background process or by a second
+ processor in the system during idle time or something...
+ * To keep things "safe" ie to enforce a sandbox on Java/foreign code,
+ we could sign the generated VM code with a host specific private
+ key. Then before the code is executed/loaded, we can check to see if
+ the trusted compiler generated the code. This would be much quicker
+ than having to validate consistency (especially if bounds checks have
+ been removed, for example)
+
+> This is important because the audiences for these two goals are very
+> different. Architects and many compiler people care much more about
+> the second question. The Java compiler and OS community care much more
+> about the first one.
+
+3. By focusing on a more low level virtual machine, we have much more room
+ for value add. The nice safe "sandbox" VM can be provided as a layer
+ on top of it. It also lets us focus on the more interesting compilers
+ related projects.
+
+> 2. Design issues to consider (an initial list that we should continue
+> to modify). Note that I'm not trying to suggest actual solutions here,
+> but just various directions we can pursue:
+
+Understood. :)
+
+> a. A single-assignment VM, which we've both already been thinking
+> about.
+
+Yup, I think that this makes a lot of sense. I am still intrigued,
+however, by the prospect of a minimally allocated VM representation... I
+think that it could have definite advantages for certain applications
+(think very small machines, like PDAs). I don't, however, think that our
+initial implementations should focus on this. :)
+
+Here are some other auxiliary goals that I think we should consider:
+
+1. Primary goal: Support a high performance dynamic compilation
+ system. This means that we have an "ideal" division of labor between
+ the runtime and static compilers. Of course, the other goals of the
+ system somewhat reduce the importance of this point (f.e. portability
+ reduces performance, but hopefully not much)
+2. Portability to different processors. Since we are most familiar with
+ x86 and solaris, I think that these two are excellent candidates when
+ we get that far...
+3. Support for all languages & styles of programming (general purpose
+ VM). This is the point that disallows java style bytecodes, where all
+ array refs are checked for bounds, etc...
+4. Support linking between different language families. For example, call
+ C functions directly from Java without using the nasty/slow/gross JNI
+ layer. This involves several subpoints:
+ A. Support for languages that require garbage collectors and integration
+ with languages that don't. As a base point, we could insist on
+ always using a conservative GC, but implement free as a noop, f.e.
+
+> b. A strongly-typed VM. One question is do we need the types to be
+> explicitly declared or should they be inferred by the dynamic
+> compiler?
+
+ B. This is kind of similar to another idea that I have: make OOP
+ constructs (virtual function tables, class hierarchies, etc) explicit
+ in the VM representation. I believe that the number of additional
+ constructs would be fairly low, but would give us lots of important
+ information... something else that would/could be important is to
+ have exceptions as first class types so that they would be handled in
+ a uniform way for the entire VM... so that C functions can call Java
+ functions for example...
+
+> c. How do we get more high-level information into the VM while keeping
+> to a low-level VM design?
+> o Explicit array references as operands? An alternative is
+> to have just an array type, and let the index computations be
+> separate 3-operand instructions.
+
+ C. In the model I was thinking of (subject to change of course), we
+ would just have an array type (distinct from the pointer
+ types). This would allow us to have arbitrarily complex index
+ expressions, while still distinguishing "load" from "Array load",
+ for example. Perhaps also, switch jump tables would be first class
+ types as well? This would allow better reasoning about the program.
+
+5. Support dynamic loading of code from various sources. Already
+ mentioned above was the example of loading java bytecodes, but we want
+ to support dynamic loading of VM code as well. This makes the job of
+ the runtime compiler much more interesting: it can do interprocedural
+ optimizations that the static compiler can't do, because it doesn't
+ have all of the required information (for example, inlining from
+ shared libraries, etc...)
+
+6. Define a set of generally useful annotations to add to the VM
+ representation. For example, a function can be analysed to see if it
+ has any side effects when run... also, the MOD/REF sets could be
+ calculated, etc... we would have to determine what is reasonable. This
+ would generally be used to make IP optimizations cheaper for the
+ runtime compiler...
+
+> o Explicit instructions to handle aliasing, e.g.s:
+> -- an instruction to say "I speculate that these two values are not
+> aliased, but check at runtime", like speculative execution in
+> EPIC?
+> -- or an instruction to check whether two values are aliased and
+> execute different code depending on the answer, somewhat like
+> predicated code in EPIC
+
+These are also very good points... if this can be determined at compile
+time. I think that an EPIC style of representation (not the instruction
+packing, just the information presented) could be a very interesting model
+to use... more later...
+
+> o (This one is a difficult but powerful idea.)
+> A "thread-id" field on every instruction that allows the static
+> compiler to generate a set of parallel threads, and then have
+> the runtime compiler and hardware do what they please with it.
+> This has very powerful uses, but thread-id on every instruction
+> is expensive in terms of instruction size and code size.
+> We would need to compactly encode it somehow.
+
+Yes yes yes! :) I think it would be *VERY* useful to include this kind
+of information (which EPIC architectures *implicitly* encode). The trend
+that we are seeing supports this greatly:
+
+1. Commodity processors are getting massive SIMD support:
+ * Intel/AMD MMX/MMX2
+ * AMD's 3Dnow!
+ * Intel's SSE/SSE2
+ * Sun's VIS
+2. SMP is becoming much more common, especially in the server space.
+3. Multiple processors on a die are right around the corner.
+
+If nothing else, not designing this in would severely limit our future
+expansion of the project...
+
+> Also, this will require some reading on at least two other
+> projects:
+> -- Multiscalar architecture from Wisconsin
+> -- Simultaneous multithreading architecture from Washington
+>
+> o Or forget all this and stick to a traditional instruction set?
+
+Heh... :) Well, from a pure research point of view, it is almost more
+attractive to go with the most extreme/different ISA possible. On one axis
+you get safety and conservatism, and on the other you get degree of
+influence that the results have. Of course the problem with pure research
+is that often times there is no concrete product of the research... :)
+
+> BTW, on an unrelated note, after the meeting yesterday, I did remember
+> that you had suggested doing instruction scheduling on SSA form instead
+> of a dependence DAG earlier in the semester. When we talked about
+> it yesterday, I didn't remember where the idea had come from but I
+> remembered later. Just giving credit where it's due...
+
+:) Thanks.
+
+> Perhaps you can save the above as a file under RCS so you and I can
+> continue to expand on this.
+
+I think it makes sense to do so when we get our ideas more formalized and
+bounce it back and forth a couple of times... then I'll do a more formal
+writeup of our goals and ideas. Obviously our first implementation will
+not want to do all of the stuff that I pointed out above... but we will
+want to design the project so that we do not artificially limit ourselves
+at some time in the future...
+
+Anyways, let me know what you think about these ideas... and if they sound
+reasonable...
+
+-Chris
+
diff --git a/docs/HistoricalNotes/2000-12-06-EncodingIdea.txt b/docs/HistoricalNotes/2000-12-06-EncodingIdea.txt
new file mode 100644
index 0000000000..8c452924dd
--- /dev/null
+++ b/docs/HistoricalNotes/2000-12-06-EncodingIdea.txt
@@ -0,0 +1,30 @@
+From: Chris Lattner [mailto:sabre@nondot.org]
+Sent: Wednesday, December 06, 2000 6:41 PM
+To: Vikram S. Adve
+Subject: Additional idea with respect to encoding
+
+Here's another idea with respect to keeping the common case instruction
+size down (less than 32 bits ideally):
+
+Instead of encoding an instruction to operate on two register numbers,
+have it operate on two negative offsets based on the current register
+number. Therefore, instead of using:
+
+r57 = add r55, r56 (r57 is the implicit dest register, of course)
+
+We could use:
+
+r57 = add -2, -1
+
+My guess is that most SSA references are to recent values (especially if
+they correspond to expressions like (x+y*z+p*q/ ...)), so the negative
+numbers would tend to stay small, even at the end of the procedure (where
+the implicit register destination number could be quite large). Of course
+the negative sign is redundant, so you would be storing small integers
+almost all of the time, and 5-6 bits worth of register number would be
+plenty for most cases...
+
+What do you think?
+
+-Chris
+
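A minimal C sketch of the delta idea above; the helper names, and the assumption that every operand is defined before the instruction that uses it (SSA), are illustrative rather than part of any actual encoding:

    #include <assert.h>

    /* "r57 = add r55, r56" would store the deltas 2 and 1 instead of the
     * absolute register numbers 55 and 56; recent values stay small. */
    static unsigned encode_operand(unsigned dest_reg, unsigned operand_reg)
    {
        assert(operand_reg < dest_reg);  /* SSA: operands are defined earlier */
        return dest_reg - operand_reg;   /* small positive delta, sign implied */
    }

    static unsigned decode_operand(unsigned dest_reg, unsigned delta)
    {
        return dest_reg - delta;         /* recover the absolute register number */
    }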
diff --git a/docs/HistoricalNotes/2000-12-06-MeetingSummary.txt b/docs/HistoricalNotes/2000-12-06-MeetingSummary.txt
new file mode 100644
index 0000000000..b66e18556f
--- /dev/null
+++ b/docs/HistoricalNotes/2000-12-06-MeetingSummary.txt
@@ -0,0 +1,83 @@
+SUMMARY
+-------
+
+We met to discuss the LLVM instruction format and bytecode representation:
+
+ISSUES RESOLVED
+---------------
+
+1. We decided that we shall use a flat namespace to represent our
+ variables in SSA form, as opposed to having a two dimensional namespace
+ of the original variable and the SSA instance subscript.
+
+ARGUMENT AGAINST:
+ * A two dimensional namespace would be valuable when doing alias
+ analysis because the extra information can help limit the scope of
+ analysis.
+
+ARGUMENT FOR:
+ * Including this information would require that all users of the LLVM
+ bytecode would have to parse and handle it. This would slow down the
+ common case and inflate the instruction representation with another
+ infinite variable space.
+
+REASONING:
+ * It was decided that because original variable sources could be
+ reconstructed from SSA form in linear time, it would be an
+ unjustified expense for the common case to include the extra
+ information for one optimization. Alias analysis itself is typically
+ greater than linear in asymptotic complexity, so this extra analysis
+ would not affect the runtime of the optimization in a significant
+ way. Additionally, this would be an unlikely optimization to do at
+ runtime.
+
+
+IDEAS TO CONSIDER
+-----------------
+
+1. Including dominator information in the LLVM bytecode
+ representation. This is one example of an analysis result that may be
+ packaged with the bytecodes themselves. As a conceptual implementation
+ idea, we could include an immediate dominator number for each basic block
+ in the LLVM bytecode program. Basic blocks could be numbered according
+ to the order of occurrence in the bytecode representation.
+
+2. Including loop header and body information. This would facilitate
+ detection of intervals and natural loops.
+
+UNRESOLVED ISSUES
+-----------------
+
+1. Will oSUIF provide enough of an infrastructure to support the research
+ that we will be doing? We know that it has less than stellar
+ performance, but hope that this will be of little importance for our
+ static compiler. This could affect us if we decided to do some IP
+ research. Also we do not yet understand the level of exception support
+ currently implemented.
+
+2. Should we consider the requirements of a direct hardware implementation
+ of the LLVM when we design it? If so, several design issues should
+ have their priorities shifted. The other option is to focus on a
+ software layer interpreting the LLVM in all cases.
+
+3. Should we use some form of packetized format to improve forward
+ compatibility? For example, we could design the system to encode a
+ packet type and length field before analysis information, to allow a
+ runtime to skip information that it didn't understand in a bytecode
+ stream. The obvious benefit would be for compatibility; the drawback
+ is that it would tend to splinter the 'standard' LLVM definition.
+
+4. Should we use fixed length instructions or variable length
+ instructions? Fetching variable length instructions is expensive (for
+ either hardware or software based LLVM runtimes), but we have several
+ 'infinite' spaces that instructions operate in (SSA register numbers,
+ type spaces, or packet length [if packets were implemented]). Several
+ options were mentioned including:
+ A. Using 16 or 32 bit numbers, which would be 'big enough'
+ B. A scheme similar to how UTF-8 works, to encode infinite numbers
+ while keeping small numbers small.
+ C. Use something similar to Huffman encoding, so that the most common
+ numbers are the smallest.
+
+-Chris
+
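A minimal C sketch of option B above (a UTF-8-style variable-length integer); the 7-bit payload per byte and the continuation-bit layout are just one plausible choice:

    #include <stdint.h>
    #include <stddef.h>

    /* Write 'value' as a sequence of bytes: 7 payload bits per byte, with the
     * high bit set when more bytes follow.  Small numbers take a single byte. */
    static size_t write_varint(uint8_t *out, uint64_t value)
    {
        size_t n = 0;
        do {
            uint8_t byte = value & 0x7F;
            value >>= 7;
            if (value != 0)
                byte |= 0x80;            /* continuation bit */
            out[n++] = byte;
        } while (value != 0);
        return n;                        /* bytes written: 1 for values < 128 */
    }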
diff --git a/docs/HistoricalNotes/2001-01-31-UniversalIRIdea.txt b/docs/HistoricalNotes/2001-01-31-UniversalIRIdea.txt
new file mode 100644
index 0000000000..111706a344
--- /dev/null
+++ b/docs/HistoricalNotes/2001-01-31-UniversalIRIdea.txt
@@ -0,0 +1,39 @@
+Date: Wed, 31 Jan 2001 12:04:33 -0600
+From: Vikram S. Adve <vadve@cs.uiuc.edu>
+To: Chris Lattner <lattner@cs.uiuc.edu>
+Subject: another thought
+
+I have a budding idea about making LLVM a little more ambitious: a
+customizable runtime system that can be used to implement language-specific
+virtual machines for many different languages. E.g., a C vm, a C++ vm, a
+Java vm, a Lisp vm, ..
+
+The idea would be that LLVM would provide a standard set of runtime features
+(some low-level like standard assembly instructions with code generation and
+static and runtime optimization; some higher-level like type-safety and
+perhaps a garbage collection library). Each language vm would select the
+runtime features needed for that language, extending or customizing them as
+needed. Most of the machine-dependent code-generation and optimization
+features as well as low-level machine-independent optimizations (like PRE)
+could be provided by LLVM and should be sufficient for any language,
+simplifying the language compiler. (This would also help interoperability
+between languages.) Also, some or most of the higher-level
+machine-independent features like type-safety and access safety should be
+reusable by different languages, with minor extensions. The language
+compiler could then focus on language-specific analyses and optimizations.
+
+The risk is that this sounds like a universal IR -- something that the
+compiler community has tried and failed to develop for decades, and is
+universally skeptical about. No matter what we say, we won't be able to
+convince anyone that we have a universal IR that will work. We need to
+think about whether LLVM is different or if it has something novel that might
+convince people. E.g., the idea of providing a package of separable
+features that different languages select from. Also, using SSA with or
+without type-safety as the intermediate representation.
+
+One interesting starting point would be to discuss how a JVM would be
+implemented on top of LLVM a bit more. That might give us clues on how to
+structure LLVM to support one or more language VMs.
+
+--Vikram
+
diff --git a/docs/HistoricalNotes/2001-02-06-TypeNotationDebate.txt b/docs/HistoricalNotes/2001-02-06-TypeNotationDebate.txt
new file mode 100644
index 0000000000..c09cf1f03c
--- /dev/null
+++ b/docs/HistoricalNotes/2001-02-06-TypeNotationDebate.txt
@@ -0,0 +1,67 @@
+Date: Tue, 6 Feb 2001 20:27:37 -0600 (CST)
+From: Chris Lattner <sabre@nondot.org>
+To: Vikram S. Adve <vadve@cs.uiuc.edu>
+Subject: Type notation debate...
+
+This is the way that I am currently planning on implementing types:
+
+Primitive Types:
+type ::= void|bool|sbyte|ubyte|short|ushort|int|uint|long|ulong
+
+Method:
+typelist ::= typelisth | /*empty*/
+typelisth ::= type | typelisth ',' type
+type ::= type (typelist)
+
+Arrays (without and with size):
+type ::= '[' type ']' | '[' INT ',' type ']'
+
+Pointer:
+type ::= type '*'
+
+Structure:
+type ::= '{' typelist '}'
+
+Packed:
+type ::= '<' INT ',' type '>'
+
+Simple examples:
+
+[[ %4, int ]] - array of (array of 4 (int))
+[ { int, int } ] - Array of structure
+[ < %4, int > ] - Array of 128 bit SIMD packets
+int (int, [[int, %4]]) - Method taking a 2d array and int, returning int
+
+
+Okay before you comment, please look at:
+
+http://www.research.att.com/~bs/devXinterview.html
+
+Search for "In another interview, you defined the C declarator syntax as
+an experiment that failed. However, this syntactic construct has been
+around for 27 years and perhaps more; why do you consider it problematic
+(except for its cumbersome syntax)?" and read that response for me. :)
+
+Now with this syntax, his example would be represented as:
+
+[ %10, bool (int, int) * ] *
+
+vs
+
+bool (*(*)[10])(int, int)
+
+in C.
+
+Basically, my argument for this type construction system is that it is
+VERY simple to use and understand (although it IS different than C, it is
+very simple and straightforward, which C is NOT). In fact, I would assert
+that most programmers TODAY do not understand pointers to member
+functions, and have to look up an example when they have to write them.
+
+In my opinion, it is critically important to have clear and concise type
+specifications, because types are going to be all over the programs.
+
+Let me know your thoughts on this. :)
+
+-Chris
+
diff --git a/docs/HistoricalNotes/2001-02-06-TypeNotationDebateResp1.txt b/docs/HistoricalNotes/2001-02-06-TypeNotationDebateResp1.txt
new file mode 100644
index 0000000000..8bfefbf69f
--- /dev/null
+++ b/docs/HistoricalNotes/2001-02-06-TypeNotationDebateResp1.txt
@@ -0,0 +1,75 @@
+Date: Thu, 8 Feb 2001 08:42:04 -0600
+From: Vikram S. Adve <vadve@cs.uiuc.edu>
+To: Chris Lattner <sabre@nondot.org>
+Subject: RE: Type notation debate...
+
+Chris,
+
+> Okay before you comment, please look at:
+>
+> http://www.research.att.com/~bs/devXinterview.html
+
+I read this argument. Even before that, I was already in agreement with you
+and him that the C declarator syntax is difficult and confusing.
+
+But in fact, if you read the entire answer carefully, he came to the same
+conclusion I do: that you have to go with familiar syntax over logical
+syntax because familiarity is such a strong force:
+
+ "However, familiarity is a strong force. To compare, in English, we
+live
+more or less happily with the absurd rules for "to be" (am, are, is, been,
+was, were, ...) and all attempts to simplify are treated with contempt or
+(preferably) humor. It be a curious world and it always beed."
+
+> Basically, my argument for this type construction system is that it is
+> VERY simple to use and understand (although it IS different than C, it is
+> very simple and straightforward, which C is NOT). In fact, I would assert
+> that most programmers TODAY do not understand pointers to member
+> functions, and have to look up an example when they have to write them.
+
+Again, I don't disagree with this at all. But to some extent this
+particular problem is inherently difficult. Your syntax for the above
+example may be easier for you to read because this is the way you have been
+thinking about it. Honestly, I don't find it much easier than the C syntax.
+In either case, I would have to look up an example to write pointers to
+member functions.
+
+But pointers to member functions are nowhere near as common as arrays. And
+the old array syntax:
+ type [ int, int, ...]
+is just much more familiar and clear to people than anything new you
+introduce, no matter how logical it is. Introducing a new syntax that may
+make function pointers easier but makes arrays much more difficult seems
+very risky to me.
+
+> In my opinion, it is critically important to have clear and concise type
+> specifications, because types are going to be all over the programs.
+
+I absolutely agree. But the question is, what is more clear and concise?
+The syntax programmers are used to out of years of experience or a new
+syntax that they have never seen that has a more logical structure. I think
+the answer is the former. Sometimes, you have to give up a better idea
+because you can't overcome sociological barriers to it. Qwerty keyboards
+and Windows are two classic examples of bad technology that are difficult to
+root out.
+
+P.S. Also, while I agree that most of your syntax is more logical, there is
+one part that isn't:
+
+Arrays (without and with size):
+type ::= '[' type ']' | '[' INT ',' type ']'.
+
+The arrays with size lists the dimensions and the type in a single list.
+That is just too confusing:
+ [10, 40, int]
+This seems to be a 3-D array where the third dimension is something strange.
+It is too confusing to have a list of 3 things, some of which are dimensions
+and one is a type. Either of the following would be better:
+
+ array [10, 40] of int
+or
+ int [10, 40]
+
+--Vikram
+
diff --git a/docs/HistoricalNotes/2001-02-06-TypeNotationDebateResp2.txt b/docs/HistoricalNotes/2001-02-06-TypeNotationDebateResp2.txt
new file mode 100644
index 0000000000..6e9784158a
--- /dev/null
+++ b/docs/HistoricalNotes/2001-02-06-TypeNotationDebateResp2.txt
@@ -0,0 +1,53 @@
+Date: Thu, 8 Feb 2001 14:31:05 -0600 (CST)
+From: Chris Lattner <sabre@nondot.org>
+To: Vikram S. Adve <vadve@cs.uiuc.edu>
+Subject: RE: Type notation debate...
+
+> Arrays (without and with size):
+> type ::= '[' type ']' | '[' INT ',' type ']'.
+>
+> The arrays with size lists the dimensions and the type in a single list.
+> That is just too confusing:
+
+> [10, 40, int]
+> This seems to be a 3-D array where the third dimension is something strange.
+> It is too confusing to have a list of 3 things, some of which are dimensions
+> and one is a type.
+
+The above grammar indicates that there is only one integer parameter, ie
+the upper bound. The lower bound is always implied to be zero, for
+several reasons:
+
+* As a low level VM, we want to expose addressing computations
+ explicitly. Since the lower bound must always be known in a high level
+ language statically, the language front end can do the translation
+ automatically.
+* This fits more closely with what Java needs, ie what we need in the
+ short term. Java arrays are always zero based.
+
+If a two element list is too confusing, I would recommend an alternate
+syntax of:
+
+type ::= '[' type ']' | '[' INT 'x' type ']'.
+
+For example:
+ [12 x int]
+ [12x int]
+ [ 12 x [ 4x int ]]
+
+Which is syntactically nicer, and more explicit.
+
+> Either of the following would be better:
+> array [10, 40] of int
+
+I considered this approach for arrays in general (ie array of int/ array
+of 12 int), but found that it made declarations WAY too long. Remember
+that because of the nature of LLVM, you get a lot of types strewn all over
+the program, and using the 'typedef' like facility is not a wonderful
+option, because then types aren't explicit anymore.
+
+I find this email interesting, because you contradict the previous email
+you sent, where you recommend that we stick to C syntax....
+
+-Chris
+
diff --git a/docs/HistoricalNotes/2001-02-06-TypeNotationDebateResp4.txt b/docs/HistoricalNotes/2001-02-06-TypeNotationDebateResp4.txt
new file mode 100644
index 0000000000..7b9032742a
--- /dev/null
+++ b/docs/HistoricalNotes/2001-02-06-TypeNotationDebateResp4.txt
@@ -0,0 +1,89 @@
+> But in fact, if you read the entire answer carefully, he came to the same
+> conclusion I do: that you have to go with familiar syntax over logical
+> syntax because familiarity is such a strong force:
+> "However, familiarity is a strong force. To compare, in English, we
+live
+> more or less happily with the absurd rules for "to be" (am, are, is, been,
+> was, were, ...) and all attempts to simplify are treated with contempt or
+> (preferably) humor. It be a curious world and it always beed."
+
+Although you have to remember that his situation was considerably
+different than ours. He was in a position where he was designing a high
+level language that had to be COMPATIBLE with C. Our language is such
+that a new person would have to learn the new, different, syntax
+anyways. Making them learn about the type system does not seem like much
+of a stretch from learning the opcodes and how SSA form works, and how
+everything ties together...
+
+> > Basically, my argument for this type construction system is that it is
+> > VERY simple to use and understand (although it IS different than C, it is
+> > very simple and straightforward, which C is NOT). In fact, I would assert
+> > that most programmers TODAY do not understand pointers to member
+> > functions, and have to look up an example when they have to write them.
+
+> Again, I don't disagree with this at all. But to some extent this
+> particular problem is inherently difficult. Your syntax for the above
+> example may be easier for you to read because this is the way you have been
+> thinking about it. Honestly, I don't find it much easier than the C syntax.
+> In either case, I would have to look up an example to write pointers to
+> member functions.
+
+I would argue that because the lexical structure of the language is self
+consistent, any person who spent a significant amount of time programming
+in LLVM directly would understand how to do it without looking it up in a
+manual. The reason this does not work for C is because you rarely have to
+declare these pointers, and the syntax is inconsistent with the method
+declaration and calling syntax.
+
+> But pointers to member functions are nowhere near as common as arrays.
+
+Very true. If you're implementing an object oriented language, however,
+remember that you have to do all the pointer to member function stuff
+yourself.... so every time you invoke a virtual method one is involved
+(instead of having C++ hide it for you behind "syntactic sugar").
+
+> And the old array syntax:
+> type [ int, int, ...]
+> is just much more familiar and clear to people than anything new you
+> introduce, no matter how logical it is.
+
+Erm... excuse me but how is this the "old array syntax"? If you are
+arguing for consistency with C, you should be asking for 'type int []',
+which is significantly different than the above (beside the above
+introduces a new operator and duplicates information
+needlessly). Basically what I am suggesting is exactly the above without
+the fluff. So instead of:
+
+ type [ int, int, ...]
+
+you use:
+
+ type [ int ]
+
+> Introducing a new syntax that may
+> make function pointers easier but makes arrays much more difficult seems
+> very risky to me.
+
+This is not about function pointers. This is about consistency in the
+type system, and consistency with the rest of the language. The point
+above does not make arrays any more difficult to use, and makes the
+structure of types much more obvious than the "c way".
+
+> > In my opinion, it is critically important to have clear and concise type
+> > specifications, because types are going to be all over the programs.
+>
+> I absolutely agree. But the question is, what is more clear and concise?
+> The syntax programmers are used to out of years of experience or a new
+> syntax that they have never seen that has a more logical structure. I think
+> the answer is the former. Sometimes, you have to give up a better idea
+> because you can't overcome sociological barriers to it. Qwerty keyboards
+> and Windows are two classic examples of bad technology that are difficult to
+> root out.
+
+Very true, but you seem to be advocating a completely different Type
+system than C has, in addition to it not offering the advantages of clear
+structure that the system I recommended does... so you seem to not have a
+problem with changing this, just with what I change it to. :)
+
+-Chris
+
diff --git a/docs/HistoricalNotes/2001-02-09-AdveComments.txt b/docs/HistoricalNotes/2001-02-09-AdveComments.txt
new file mode 100644
index 0000000000..5503233c1e
--- /dev/null
+++ b/docs/HistoricalNotes/2001-02-09-AdveComments.txt
@@ -0,0 +1,120 @@
+Ok, here are my comments and suggestions about the LLVM instruction set.
+We should discuss some now, but can discuss many of them later, when we
+revisit synchronization, type inference, and other issues.
+(We have discussed some of the comments already.)
+
+
+o We should consider eliminating the type annotation in cases where it is
+ essentially obvious from the instruction type, e.g., in br, it is obvious
+ that the first arg. should be a bool and the other args should be labels:
+
+ br bool <cond>, label <iftrue>, label <iffalse>
+
+ I think your point was that making all types explicit improves clarity
+ and readability. I agree to some extent, but it also comes at the cost
+ of verbosity. And when the types are obvious from people's experience
+ (e.g., in the br instruction), it doesn't seem to help as much.
+
+
+o On reflection, I really like your idea of having the two different switch
+ types (even though they encode implementation techniques rather than
+ semantics). It should simplify building the CFG and my guess is it could
+ enable some significant optimizations, though we should think about which.
+
+
+o In the lookup-indirect form of the switch, is there a reason not to make
+ the val-type uint? Most HLL switch statements (including Java and C++)
+ require that anyway. And it would also make the val-type uniform
+ in the two forms of the switch.
+
+ I did see the switch-on-bool examples and, while cute, we can just use
+ the branch instructions in that particular case.
+
+
+o I agree with your comment that we don't need 'neg'.
+
+
+o There's a trade-off with the cast instruction:
+ + it avoids having to define all the upcasts and downcasts that are
+ valid for the operands of each instruction (you probably have thought
+ of other benefits also)
+ - it could make the bytecode significantly larger because there could
+ be a lot of cast operations
+
+
+o Making the second arg. to 'shl' a ubyte seems good enough to me.
+ 255 positions seems adequate for several generations of machines
+ and is more compact than uint.
+
+
+o I still have some major concerns about including malloc and free in the
+ language (either as builtin functions or instructions). LLVM must be
+ able to represent code from many different languages. Languages such as
+ C, C++, Java, and Fortran 90 would not be able to use our malloc anyway
+ because each of them will want to provide a library implementation of it.
+
+ This gets even worse when code from different languages is linked
+ into a single executable (which is fairly common in large apps).
+ Having a single malloc would just not suffice, and instead would simply
+ complicate the picture further because it adds an extra variant in
+ addition to the one each language provides.
+
+ Instead, providing a default library version of malloc and free
+ (and perhaps a malloc_gc with garbage collection instead of free)
+ would make a good implementation available to anyone who wants it.
+
+ I don't recall all your arguments in favor so let's discuss this again,
+ and soon.
+
+
+o 'alloca' on the other hand sounds like a good idea, and the
+ implementation seems fairly language-independent so it doesn't have the
+ problems with malloc listed above.
+
+
+o About indirect call:
+ Your option #2 sounded good to me. I'm not sure I understand your
+ concern about an explicit 'icall' instruction?
+
+
+o A pair of important synchronization instr'ns to think about:
+ load-linked
+ store-conditional
+
+
+o Other classes of instructions that are valuable for pipeline performance:
+ conditional-move
+ predicated instructions
+
+
+o I believe tail calls are relatively easy to identify; do you know why
+ .NET has a tailcall instruction?
+
+
+o I agree that we need a static data space. Otherwise, emulating global
+ data gets unnecessarily complex.
+
+
+o About explicit parallelism:
+
+ We once talked about adding a symbolic thread-id field to each
+ instruction. (It could be optional so single-threaded codes are
+ not penalized.) This could map well to multi-threaded architectures
+ while providing easy ILP for single-threaded ones. But it is probably
+ too radical an idea to include in a base version of LLVM. Instead, it
+ could be a great topic for a separate study.
+
+ What is the semantics of the IA64 stop bit?
+
+
+
+
+o And finally, another thought about the syntax for arrays :-)
+
+ Although this syntax:
+ array <dimension-list> of <type>
+ is verbose, it will be used only in the human-readable assembly code so
+ size should not matter. I think we should consider it because I find it
+ to be the clearest syntax. It could even make arrays of function
+ pointers somewhat readable.
+
diff --git a/docs/HistoricalNotes/2001-02-09-AdveCommentsResponse.txt b/docs/HistoricalNotes/2001-02-09-AdveCommentsResponse.txt
new file mode 100644
index 0000000000..4d2879554a
--- /dev/null
+++ b/docs/HistoricalNotes/2001-02-09-AdveCommentsResponse.txt
@@ -0,0 +1,245 @@
+From: Chris Lattner <sabre@nondot.org>
+To: "Vikram S. Adve" <vadve@cs.uiuc.edu>
+Subject: Re: LLVM Feedback
+
+I've included your feedback in the /home/vadve/lattner/llvm/docs directory
+so that it will live in CVS eventually with the rest of LLVM. I've
+significantly updated the documentation to reflect the changes you
+suggested, as specified below:
+
+> We should consider eliminating the type annotation in cases where it is
+> essentially obvious from the instruction type:
+> br bool <cond>, label <iftrue>, label <iffalse>
+> I think your point was that making all types explicit improves clarity
+> and readability. I agree to some extent, but it also comes at the
+> cost of verbosity. And when the types are obvious from people's
+> experience (e.g., in the br instruction), it doesn't seem to help as
+> much.
+
+Very true. We should discuss this more, but my reasoning is more of a
+consistency argument. There are VERY few instructions that can have all
+of the types eliminated, and doing so when available unnecessarily makes
+the language more difficult to handle. Especially when you see 'int
+%this' and 'bool %that' all over the place, I think it would be
+disorienting to see:
+
+ br %predicate, %iftrue, %iffalse
+
+for branches. Even just typing that once gives me the creeps. ;) Like I
+said, we should probably discuss this further in person...
+
+> On reflection, I really like your idea of having the two different
+> switch types (even though they encode implementation techniques rather
+> than semantics). It should simplify building the CFG and my guess is it
+> could enable some significant optimizations, though we should think
+> about which.
+
+Great. I added a note to the switch section commenting on how the VM
+should just use the instruction type as a hint, and that the
+implementation may choose alternate representations (such as predicated
+branches).
+
+> In the lookup-indirect form of the switch, is there a reason not to
+> make the val-type uint?
+
+No. This was something I was debating for a while, and didn't really feel
+strongly about either way. It is common to switch on other types in HLL's
+(for example signed int's are particularly common), but in this case, all
+that will be added is an additional 'cast' instruction. I removed that
+from the spec.
+
+> I agree with your comment that we don't need 'neg'
+
+Removed.
+
+> There's a trade-off with the cast instruction:
+> + it avoids having to define all the upcasts and downcasts that are
+> valid for the operands of each instruction (you probably have
+> thought of other benefits also)
+> - it could make the bytecode significantly larger because there could
+> be a lot of cast operations
+
+ + You NEED casts to represent things like:
+ void foo(float);
+ ...
+ int x;
+ ...
+ foo(x);
+ in a language like C. Even in a Java like language, you need upcasts
+ and some way to implement dynamic downcasts.
+ + Not all forms of instructions take every type (for example you can't
+ shift by a floating point number of bits), thus SOME programs will need
+ implicit casts.
+
+To be efficient and to avoid your '-' point above, we just have to be
+careful to specify that the instructions shall operate on all common
+types, therefore casting should be relatively uncommon. For example all
+of the arithmetic operations work on almost all data types.
+
+> Making the second arg. to 'shl' a ubyte seems good enough to me.
+> 255 positions seems adequate for several generations of machines
+
+Okay, that comment is removed.
+
+> and is more compact than uint.
+
+No, it isn't. Remember that the bytecode encoding saves value slots into
+the bytecode instructions themselves, not constant values. This is
+another case where we may introduce more cast instructions (but we will
+also reduce the number of opcode variants that must be supported by a
+virtual machine). Because most shifts are by constant values, I don't
+think that we'll have to cast many shifts. :)
+
+> I still have some major concerns about including malloc and free in the
+> language (either as builtin functions or instructions).
+
+Agreed. How about this proposal:
+
+malloc/free are either built in functions or actual opcodes. They provide
+all of the type safety that the document would indicate, blah blah
+blah. :)
+
+Now, because of all of the excellent points that you raised, an
+implementation may want to override the default malloc/free behavior of
+the program. To do this, they simply implement a "malloc" and
+"free" function. The virtual machine will then be defined to use the user
+defined malloc/free function (which return/take void*'s, not type'd
+pointers like the builtin function would) if one is available, otherwise
+fall back on a system malloc/free.
+
+Does this sound like a good compromise? It would give us all of the
+typesafety/elegance in the language while still allowing the user to do
+all the cool stuff they want to...
+
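A rough C sketch of that compromise; the weak-symbol trick and the vm_/user_ names are only one way to express "use the program's own allocator if one is available, otherwise fall back on the system malloc/free":

    #include <stdlib.h>

    /* If the program supplies these, the runtime prefers them. */
    void *user_malloc(size_t size) __attribute__((weak));
    void  user_free(void *ptr)     __attribute__((weak));

    static void *vm_malloc(size_t size)
    {
        return user_malloc ? user_malloc(size) : malloc(size);
    }

    static void vm_free(void *ptr)
    {
        if (user_free)
            user_free(ptr);
        else
            free(ptr);
    }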
+> 'alloca' on the other hand sounds like a good idea, and the
+> implementation seems fairly language-independent so it doesn't have the
+> problems with malloc listed above.
+
+Okay, once we get the above stuff figured out, I'll put it all in the
+spec.
+
+> About indirect call:
+> Your option #2 sounded good to me. I'm not sure I understand your
+> concern about an explicit 'icall' instruction?
+
+I worry too much. :) The other alternative has been removed. 'icall' is
+now up in the instruction list next to 'call'.
+
+> I believe tail calls are relatively easy to identify; do you know why
+> .NET has a tailcall instruction?
+
+Although I am just guessing, I believe it probably has to do with the fact
+that they want languages like Haskell and lisp to be efficiently runnable
+on their VM. Of course this means that the VM MUST implement tail calls
+'correctly', or else life will suck. :) I would put this into a future
+feature bin, because it could be pretty handy...
+
+> A pair of important synchronization instr'ns to think about:
+> load-linked
+> store-conditional
+
+What is 'load-linked'? I think that (at least for now) I should add these
+to the 'possible extensions' section, because they are not immediately
+needed...
+
+> Other classes of instructions that are valuable for pipeline
+> performance:
+> conditional-move
+> predicated instructions
+
+Conditional move is effectively a special case of a predicated
+instruction... and I think that all predicated instructions can possibly
+be implemented later in LLVM. It would significantly change things, and
+it doesn't seem to be very necessary right now. It would seem to
+complicate flow control analysis a LOT in the virtual machine. I would
+tend to prefer that a predicated architecture like IA64 convert from a
+"basic block" representation to a predicated rep as part of its dynamic
+compilation phase. Also, if a basic block contains ONLY a move, then
+that can be trivially translated into a conditional move...
+
+> I agree that we need a static data space. Otherwise, emulating global
+> data gets unnecessarily complex.
+
+Definitely. Also a later item though. :)
+
+> We once talked about adding a symbolic thread-id field to each
+> ..
+> Instead, it could be a great topic for a separate study.
+
+Agreed. :)
+
+> What is the semantics of the IA64 stop bit?
+
+Basically, the IA64 writes instructions like this:
+mov ...
+add ...
+sub ...
+op xxx
+op xxx
+;;
+mov ...
+add ...
+sub ...
+op xxx
+op xxx
+;;
+
+Where the ;; delimits a group of instructions with no dependencies between
+them, which can all be executed concurrently (to the limits of the
+available functional units). The ;; gets translated into a bit set in one
+of the opcodes.
+
+The advantage of this representation is that you don't have to do some
+kind of 'thread id scheduling' pass by having to specify ahead of time how
+many threads to use, and the representation doesn't have a per instruction
+overhead...
+
+> And finally, another thought about the syntax for arrays :-)
+> Although this syntax:
+> array <dimension-list> of <type>
+> is verbose, it will be used only in the human-readable assembly code so
+> size should not matter. I think we should consider it because I find it
+> to be the clearest syntax. It could even make arrays of function
+> pointers somewhat readable.
+
+My only comment will be to give you an example of why this is a bad
+idea. :)
+
+Here is an example of using the switch statement (with my recommended
+syntax):
+
+switch uint %val, label %otherwise,
+ [%3 x {uint, label}] [ { uint %57, label %l1 },
+ { uint %20, label %l2 },
+ { uint %14, label %l3 } ]
+
+Here it is with the syntax you are proposing:
+
+switch uint %val, label %otherwise,
+ array %3 of {uint, label}
+ array of {uint, label}
+ { uint %57, label %l1 },
+ { uint %20, label %l2 },
+ { uint %14, label %l3 }
+
+Which is ambiguous and very verbose. It would be possible to specify
+constants with [] brackets as in my syntax, which would look like this:
+
+switch uint %val, label %otherwise,
+ array %3 of {uint, label} [ { uint %57, label %l1 },
+ { uint %20, label %l2 },
+ { uint %14, label %l3 } ]
+
+But then the syntax is inconsistent between type definition and constant
+definition (why do []'s enclose the constants but not the types??).
+
+Anyways, I'm sure that there is much debate still to be had over
+this... :)
+
+-Chris
+
+http://www.nondot.org/~sabre/os/
+http://www.nondot.org/MagicStats/
+http://korbit.sourceforge.net/
+
+
diff --git a/docs/HistoricalNotes/2001-02-13-Reference-Memory.txt b/docs/HistoricalNotes/2001-02-13-Reference-Memory.txt
new file mode 100644
index 0000000000..2c7534d9da
--- /dev/null
+++ b/docs/HistoricalNotes/2001-02-13-Reference-Memory.txt
@@ -0,0 +1,39 @@
+Date: Tue, 13 Feb 2001 13:29:52 -0600 (CST)
+From: Chris Lattner <sabre@nondot.org>
+To: Vikram S. Adve <vadve@cs.uiuc.edu>
+Subject: LLVM Concerns...
+
+
+I've updated the documentation to include load, store, and allocation
+instructions (please take a look and let me know if I'm on the right
+track):
+
+file:/home/vadve/lattner/llvm/docs/LangRef.html#memoryops
+
+I have a couple of concerns I would like to bring up:
+
+1. Reference types
+ Right now, I've spec'd out the language to have a pointer type, which
+ works fine for lots of stuff... except that Java really has
+ references: constrained pointers that cannot be manipulated: added and
+ subtracted, moved, etc... Do we want to have a type like this? It
+ could be very nice for analysis (pointer always points to the start of
+ an object, etc...) and more closely matches Java semantics. The
+ pointer type would be kept for C++ like semantics. Through analysis,
+ C++ pointers could be promoted to references in the LLVM
+ representation.
+
+2. Our "implicit" memory references in assembly language:
+ After thinking about it, this model has two problems:
+ A. If you do pointer analysis and realize that two stores are
+ independent and can share the same memory source object, there is
+ no way to represent this in either the bytecode or assembly.
+ B. When parsing assembly/bytecode, we effectively have to do a full
+ SSA generation/PHI node insertion pass to build the dependencies
+ when we don't want the "pinned" representation. This is not
+ cool.
+ I'm tempted to make memory references explicit in both the assembly and
+ bytecode to get around this... what do you think?
+
+-Chris
+
diff --git a/docs/HistoricalNotes/2001-02-13-Reference-MemoryResponse.txt b/docs/HistoricalNotes/2001-02-13-Reference-MemoryResponse.txt
new file mode 100644
index 0000000000..505343378d
--- /dev/null
+++ b/docs/HistoricalNotes/2001-02-13-Reference-MemoryResponse.txt
@@ -0,0 +1,47 @@
+Date: Tue, 13 Feb 2001 18:25:42 -0600
+From: Vikram S. Adve <vadve@cs.uiuc.edu>
+To: Chris Lattner <sabre@nondot.org>
+Subject: RE: LLVM Concerns...
+
+> 1. Reference types
+> Right now, I've spec'd out the language to have a pointer type, which
+> works fine for lots of stuff... except that Java really has
+> references: constrained pointers that cannot be manipulated: added and
+> subtracted, moved, etc... Do we want to have a type like this? It
+> could be very nice for analysis (pointer always points to the start of
+> an object, etc...) and more closely matches Java semantics. The
+> pointer type would be kept for C++ like semantics. Through analysis,
+> C++ pointers could be promoted to references in the LLVM
+> representation.
+
+
+You're right, having references would be useful. Even for C++ the *static*
+compiler could generate references instead of pointers with fairly
+straightforward analysis. Let's include a reference type for now. But I'm
+also really concerned that LLVM is becoming big and complex and (perhaps)
+too high-level. After we get some initial performance results, we may have
+a clearer idea of what our goals should be and we should revisit this
+question then.
+
+> 2. Our "implicit" memory references in assembly language:
+> After thinking about it, this model has two problems:
+> A. If you do pointer analysis and realize that two stores are
+> independent and can share the same memory source object,
+
+not sure what you meant by "share the same memory source object"
+
+> there is
+> no way to represent this in either the bytecode or assembly.
+> B. When parsing assembly/bytecode, we effectively have to do a full
+> SSA generation/PHI node insertion pass to build the dependencies
+> when we don't want the "pinned" representation. This is not
+> cool.
+
+I understand the concern. But again, let's focus on the performance first
+and then look at the language design issues. E.g., it would be good to know
+how big the bytecode files are before expanding them further. I am pretty
+keen to explore the implications of LLVM for mobile devices. Both bytecode
+size and power consumption are important to consider there.
+
+--Vikram
+
diff --git a/docs/HistoricalNotes/2001-04-16-DynamicCompilation.txt b/docs/HistoricalNotes/2001-04-16-DynamicCompilation.txt
new file mode 100644
index 0000000000..923aa62037
--- /dev/null
+++ b/docs/HistoricalNotes/2001-04-16-DynamicCompilation.txt
@@ -0,0 +1,12 @@
+By Chris:
+
+LLVM has been designed with two primary goals in mind. First we strive to enable the best possible division of labor between static and dynamic compilers, and second, we need a flexible and powerful interface between these two complementary stages of compilation. We feel that providing a solution to these two goals will yield an excellent solution to the performance problem faced by modern architectures and programming languages.
+
+A key insight into current compiler and runtime systems is that a compiler may fall anywhere in a "continuum of compilation" to do its job. On one side, scripting languages statically compile nothing and dynamically compile (or equivalently, interpret) everything. At the other extreme, traditional static compilers process everything statically and nothing dynamically. These approaches have typically been seen as a tradeoff between performance and portability. On a deeper level, however, there are two reasons that optimal system performance may be obtained by a system somewhere in between these two extremes: Dynamic application behavior and social constraints.
+
+From a technical perspective, pure static compilation cannot ever give optimal performance in all cases, because applications have varying dynamic behavior that the static compiler cannot take into consideration. Even compilers that support profile guided optimization generate poor code in the real world, because using such optimization tunes the application to one particular usage pattern, whereas real programs (as opposed to benchmarks) often have several different usage patterns.
+
+On a social level, static compilation is a very shortsighted solution to the performance problem. Instruction set architectures (ISAs) continuously evolve, and each implementation of an ISA (a processor) must choose a set of tradeoffs that make sense in the market context that it is designed for. With every new processor introduced, the vendor faces two fundamental problems: First, there is a lag time between when a processor is introduced and when compilers generate quality code for the architecture. Second, even when compilers catch up to the new architecture, there is often a large body of legacy code that was compiled for previous generations and will not or cannot be upgraded. Thus a large percentage of code running on a processor may be compiled quite sub-optimally for the current characteristics of the dynamic execution environment.
+
+For these reasons, LLVM has been designed from the beginning as a long-term solution to these problems. Its design allows the large body of platform independent, static, program optimizations currently in compilers to be reused unchanged in their current form. It also provides important static type information to enable powerful dynamic and link time optimizations to be performed quickly and efficiently. This combination enables an increase in effective system performance for real world environments.
+
diff --git a/docs/HistoricalNotes/2001-05-18-ExceptionHandling.txt b/docs/HistoricalNotes/2001-05-18-ExceptionHandling.txt
new file mode 100644
index 0000000000..2e0b794075
--- /dev/null
+++ b/docs/HistoricalNotes/2001-05-18-ExceptionHandling.txt
@@ -0,0 +1,202 @@
+Meeting notes: Implementation idea: Exception Handling in C++/Java
+
+The 5/18/01 meeting discussed ideas for implementing exceptions in LLVM.
+We decided that the best solution requires a set of library calls provided by
+the VM, as well as an extension to the LLVM function invocation syntax.
+
+The LLVM function invocation instruction previously looked like this (ignoring
+types):
+
+ call func(arg1, arg2, arg3)
+
+The extension discussed today adds an optional "with" clause that
+associates a label with the call site. The new syntax looks like this:
+
+ call func(arg1, arg2, arg3) with funcCleanup
+
+The funcCleanup label always stays tightly associated with the call site (being
+encoded directly into the call opcode itself), and should be used whenever
+there is cleanup work that needs to be done for the current function if
+an exception is thrown by func (or if we are in a try block).
+
+To support this, the VM/Runtime provide the following simple library
+functions (all syntax in this document is very abstract):
+
+typedef struct { something } %frame;
+ The VM must export a "frame type", that is an opaque structure used to
+ implement different types of stack walking that may be used by various
+ language runtime libraries. We imagine that it would be typical to
+ represent a frame with a PC and frame pointer pair, although that is not
+ required.
+
+%frame getStackCurrentFrame();
+ Get a frame object for the current function. Note that if the current
+ function was inlined into its caller, the "current" frame will belong to
+ the "caller".
+
+bool isFirstFrame(%frame f);
+ Returns true if the specified frame is the top level (first activated) frame
+ for this thread. For the main thread, this corresponds to the main()
+  function; for a spawned thread, it corresponds to the thread function.
+
+%frame getNextFrame(%frame f);
+ Return the previous frame on the stack. This function is undefined if f
+ satisfies the predicate isFirstFrame(f).
+
+Label *getFrameLabel(%frame f);
+ If a label was associated with f (as discussed below), this function returns
+ it. Otherwise, it returns a null pointer.
+
+doNonLocalBranch(Label *L);
+ At this point, it is not clear whether this should be a function or
+ intrinsic. It should probably be an intrinsic in LLVM, but we'll deal with
+ this issue later.
+
+
+Here is a motivating example that illustrates how these facilities could be
+used to implement the C++ exception model:
+
+void TestFunction(...) {
+ A a; B b;
+ foo(); // Any function call may throw
+ bar();
+ C c;
+
+ try {
+ D d;
+ baz();
+ } catch (int) {
+ ...int Stuff...
+ // execution continues after the try block: the exception is consumed
+ } catch (double) {
+ ...double stuff...
+ throw; // Exception is propogated
+ }
+}
+
+This function would compile to approximately the following code (heavy
+pseudocode follows):
+
+Func:
+ %a = alloca A
+ A::A(%a) // These ctors & dtors could throw, but we ignore this
+ %b = alloca B // minor detail for this example
+ B::B(%b)
+
+ call foo() with fooCleanup // An exception in foo is propagated to fooCleanup
+ call bar() with barCleanup // An exception in bar is propagated to barCleanup
+
+ %c = alloca C
+ C::C(%c)
+ %d = alloca D
+ D::D(%d)
+ call baz() with bazCleanup // An exception in baz is propagated to bazCleanup
+ d->~D();
+EndTry: // This label corresponds to the end of the try block
+ c->~C() // These could also throw, these are also ignored
+ b->~B()
+ a->~A()
+ return
+
+Note that this is a very straightforward and literal translation: exactly
+what we want for zero cost (when unused) exception handling. Especially on
+platforms with many registers (ie, the IA64) setjmp/longjmp style exception
+handling is *very* impractical. Also, the "with" clauses describe the
+control flow paths explicitly so that analysis is not adversely affected.
+
+The foo/barCleanup labels are implemented as:
+
+TryCleanup: // Executed if an exception escapes the try block
+ c->~C()
+barCleanup: // Executed if an exception escapes from bar()
+ // fall through
+fooCleanup: // Executed if an exception escapes from foo()
+ b->~B()
+ a->~A()
+ Exception *E = getThreadLocalException()
+ call throw(E) // Implemented by the C++ runtime, described below
+
+Which does the work one would expect. getThreadLocalException is a function
+implemented by the C++ support library. It returns the current exception
+object for the current thread. Note that we do not attempt to recycle the
+shutdown code from before, because performance of the mainline code is
+critically important. Also, obviously fooCleanup and barCleanup may be
+merged and one of them eliminated. This just shows how the code generator
+would most likely emit code.
+
+The bazCleanup label is more interesting. Because the exception may be caught
+by the try block, we must dispatch to its handler... but it does not exist
+on the call stack (it does not have a VM Call->Label mapping installed), so
+we must dispatch statically with a goto. The bazCleanup label thus appears as:
+
+bazCleanup:
+ d->~D(); // destruct D as it goes out of scope when entering catch clauses
+ goto TryHandler
+
+In general, TryHandler is not the same as bazCleanup, because multiple
+function calls could be made from the try block. In this case, trivial
+optimization could merge the two basic blocks. TryHandler is the code
+that actually determines the type of exception, based on the Exception object
+itself. For this discussion, assume that the exception object contains *at
+least*:
+
+1. A pointer to the RTTI info for the contained object
+2. A pointer to the dtor for the contained object
+3. The contained object itself
+
+Note that it is necessary to maintain #1 & #2 in the exception object itself
+because objects without virtual function tables may be thrown (as in this
+example). Assuming this, TryHandler would look something like this:
+
+TryHandler:
+ Exception *E = getThreadLocalException();
+ switch (E->RTTIType) {
+ case IntRTTIInfo:
+ ...int Stuff... // The action to perform from the catch block
+ break;
+ case DoubleRTTIInfo:
+ ...double Stuff... // The action to perform from the catch block
+ goto TryCleanup // This catch block rethrows the exception
+ break; // Redundant, eliminated by the optimizer
+ default:
+ goto TryCleanup // Exception not caught, rethrow
+ }
+
+ // Exception was consumed
+ if (E->dtor)
+ E->dtor(E->object) // Invoke the dtor on the object if it exists
+ goto EndTry // Continue mainline code...
+
+And that is all there is to it.
+
+The throw(E) function would then be implemented like this (which may be
+inlined into the caller through standard optimization):
+
+function throw(Exception *E) {
+ // Get the start of the stack trace...
+ %frame %f = call getStackCurrentFrame()
+
+ // Get the label information that corresponds to it
+ label * %L = call getFrameLabel(%f)
+ while (%L == 0 && !isFirstFrame(%f)) {
+ // Loop until a cleanup handler is found
+ %f = call getNextFrame(%f)
+ %L = call getFrameLabel(%f)
+ }
+
+ if (%L != 0) {
+ call setThreadLocalException(E) // Allow handlers access to this...
+ call doNonLocalBranch(%L)
+ }
+ // No handler found!
+ call BlowUp() // Ends up calling the terminate() method in use
+}
+
+That's a brief rundown of how C++ exception handling could be implemented in
+llvm. Java would be very similar, except it only uses destructors to unlock
+synchronized blocks, not to destroy data. Also, it uses two stack walks: a
+nondestructive walk that builds a stack trace, then a destructive walk that
+unwinds the stack as shown here.
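+
+A rough sketch of the nondestructive walk, using only the library calls
+described above (recordFrame is a hypothetical helper in the Java runtime,
+not part of the VM interface):
+
+function buildStackTrace() {
+  %frame %f = call getStackCurrentFrame()
+  while (true) {
+    call recordFrame(%f)         // hypothetical: append %f to the Java stack trace
+    if (isFirstFrame(%f)) break  // stop at the thread's first activated frame
+    %f = call getNextFrame(%f)   // step to the caller's frame
+  }
+}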
+
+It would be trivial to get exception interoperability between C++ and Java.
+
diff --git a/docs/HistoricalNotes/2001-05-19-ExceptionResponse.txt b/docs/HistoricalNotes/2001-05-19-ExceptionResponse.txt
new file mode 100644
index 0000000000..3375365f54
--- /dev/null
+++ b/docs/HistoricalNotes/2001-05-19-ExceptionResponse.txt
@@ -0,0 +1,45 @@
+Date: Sat, 19 May 2001 19:09:13 -0500 (CDT)
+From: Chris Lattner <sabre@nondot.org>
+To: Vikram S. Adve <vadve@cs.uiuc.edu>
+Subject: RE: Meeting writeup
+
+> I read it through and it looks great!
+
+Thanks!
+
+> The finally clause in Java may need more thought. The code for this clause
+> is like a subroutine because it needs to be entered from many points (end of
+> try block and beginning of each catch block), and then needs to *return to
+> the place from where the code was entered*. That's why JVM has the
+> jsr/jsr_w instruction.
+
+Hrm... I guess that is an implementation decision. It can either be
+modelled as a subroutine (as java bytecodes do), which is really
+gross... or it can be modelled as code duplication (emitted once inline,
+then once in the exception path). Because this could, at worst,
+slightly less than double the amount of code in a function (it is
+bounded), I don't think this is a big deal. One of the really nice things
+about the LLVM representation is that it still allows for runtime code
+generation for exception paths (exception paths are not compiled until
+needed). Obviously a static compiler couldn't do this though. :)
+
+In this case, only one copy of the code would be compiled... until the
+other one is needed on demand. Also this strategy fits with the "zero
+cost" exception model... the standard case is not burdened with extra
+branches or "call"s.
+
+> I suppose you could save the return address in a particular register
+> (specific to this finally block), jump to the finally block, and then at the
+> end of the finally block, jump back indirectly through this register. It
+> will complicate building the CFG but I suppose that can be handled. It is
+> also unsafe in terms of checking where control returns (which is I suppose
+> why the JVM doesn't use this).
+
+I think that a code duplication method would be cleaner, and would avoid
+the caveats that you mention. Also, it does not slow down the normal case
+with an indirect branch...
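+
+For example, a finally block that just releases a monitor could be emitted
+roughly like this (an informal sketch; the names are made up for
+illustration):
+
+    ...try body...
+    call unlock(%m)      // finally code, inline copy (normal path)
+    br label %AfterTry
+
+  TryCleanup:            // reached only if an exception escapes the try
+    call unlock(%m)      // finally code, duplicated into the exception path
+    ...continue unwinding as in the meeting notes...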
+
+Like everything, we can probably defer a final decision until later. :)
+
+-Chris
+
diff --git a/docs/HistoricalNotes/2001-06-01-GCCOptimizations.txt b/docs/HistoricalNotes/2001-06-01-GCCOptimizations.txt
new file mode 100644
index 0000000000..d542fb478c
--- /dev/null
+++ b/docs/HistoricalNotes/2001-06-01-GCCOptimizations.txt
@@ -0,0 +1,63 @@
+Date: Fri, 1 Jun 2001 16:38:17 -0500 (CDT)
+From: Chris Lattner <sabre@nondot.org>
+To: Vikram S. Adve <vadve@cs.uiuc.edu>
+Subject: Interesting: GCC passes
+
+
+Take a look at this document (which describes the order of optimizations
+that GCC performs):
+
+http://gcc.gnu.org/onlinedocs/gcc_17.html
+
+The rundown is that after RTL generation, the following happens:
+
+1 . [t] jump optimization (jumps to jumps, etc)
+2 . [t] Delete unreachable code
+3 . Compute live ranges for CSE
+4 . [t] Jump threading (jumps to jumps with identical or inverse conditions)
+5 . [t] CSE
+6 . *** Conversion to SSA
+7 . [t] SSA Based DCE
+8 . *** Conversion to LLVM
+9 . UnSSA
+10. GCSE
+11. LICM
+12. Strength Reduction
+13. Loop unrolling
+14. [t] CSE
+15. [t] DCE
+16. Instruction combination, register movement, scheduling... etc.
+
+I've marked optimizations with a [t] to indicate things that I believe to
+be relatively trivial to implement in LLVM itself. The time consuming
+things to reimplement would be SSA based PRE, Strength reduction & loop
+unrolling... these would be the major things we would miss out on if we
+did LLVM creation from tree code [inlining and other high level
+optimizations are done on the tree representation].
+
+Given the lack of "strong" optimizations that would take a long time to
+reimplement, I am leaning a bit more towards creating LLVM from the tree
+code. Especially given that SGI has GPL'd their compiler, including many
+SSA based optimizations that could be adapted (besides the fact that their
+code looks MUCH nicer than GCC :)
+
+Even if we choose to do LLVM code emission from RTL, we will almost
+certainly want to move LLVM emission from step 8 down until at least CSE
+has been rerun... which causes me to wonder if the SSA generation code
+will still work (due to global variable dependencies and stuff). I assume
+that it can be made to work, but might be a little more involved than we
+would like.
+
+I'm continuing to look at the Tree -> RTL code. It is pretty gross
+because they do some of the translation a statement at a time, and some
+of it a function at a time... I'm not quite clear why and how the
+distinction is drawn, but it does not appear that there is a wonderful
+place to attach extra info.
+
+Anyways, I'm proceeding with the RTL -> LLVM conversion phase for now. We
+can talk about this more on Monday.
+
+Wouldn't it be nice if there were an obvious decision to be made? :)
+
+-Chris
+
diff --git a/docs/HistoricalNotes/2001-06-01-GCCOptimizations2.txt b/docs/HistoricalNotes/2001-06-01-GCCOptimizations2.txt
new file mode 100644
index 0000000000..6c9e0971a0
--- /dev/null
+++ b/docs/HistoricalNotes/2001-06-01-GCCOptimizations2.txt
@@ -0,0 +1,71 @@
+Date: Fri, 1 Jun 2001 17:08:44 -0500 (CDT)
+From: Chris Lattner <sabre@nondot.org>
+To: Vikram S. Adve <vadve@cs.uiuc.edu>
+Subject: RE: Interesting: GCC passes
+
+> That is very interesting. I agree that some of these could be done on LLVM
+> at link-time, but it is the extra time required that concerns me. Link-time
+> optimization is severely time-constrained.
+
+If we were to reimplement any of these optimizations, I assume that we
+could do them a translation unit at a time, just as GCC does now. This
+would lead to a pipeline like this:
+
+Static optimizations, xlation unit at a time:
+.c --GCC--> .llvm --llvmopt--> .llvm
+
+Link time optimizations:
+.llvm --llvm-ld--> .llvm --llvm-link-opt--> .llvm
+
+Of course, many optimizations could be shared between llvmopt and
+llvm-link-opt, but they wouldn't need to be shared... Thus compile time
+could be faster, because we are using a "smarter" IR (SSA based).
+
+> BTW, about SGI, "borrowing" SSA-based optimizations from one compiler and
+> putting it into another is not necessarily easier than re-doing it.
+> Optimization code is usually heavily tied in to the specific IR they use.
+
+Understood. The only reason that I brought this up is because SGI's IR is
+more similar to LLVM than it is different in many respects (SSA based,
+relatively low level, etc), and could be easily adapted. Also their
+optimizations are written in C++ and are actually somewhat
+structured... of course it would be no walk in the park, but it would be
+much less time consuming to adapt, say, SSA-PRE than to rewrite it.
+
+> But your larger point is valid that adding SSA based optimizations is
+> feasible and should be fun. (Again, link time cost is the issue.)
+
+Assuming linktime cost wasn't an issue, the question is:
+Does using GCC's backend buy us anything?
+
+> It also occurs to me that GCC is probably doing quite a bit of back-end
+> optimization (step 16 in your list). Do you have a breakdown of that?
+
+Not really. The irritating part of GCC is that it mixes it all up and
+doesn't have a clean separation of concerns.  A lot of the "back end
+optimization" happens right along with other data optimizations (ie, CSE
+of machine specific things).
+
+As far as REAL back end optimizations go, it looks something like this:
+
+1. Instruction combination: try to make CISCy instructions, if available
+2. Register movement: try to get registers in the right places for the
+architecture to avoid register to register moves. For example, try to get
+the first argument of a function to naturally land in %o0 for sparc.
+3. Instruction scheduling: 'nuff said :)
+4. Register class preferencing: ??
+5. Local register allocation
+6. Global register allocation
+7. Spilling
+8. Local regalloc
+9. Jump optimization
+10. Delay slot scheduling
+11. Branch shortening for CISC machines
+12. Instruction selection & peephole optimization
+13. Debug info output
+
+But none of this would be usable for LLVM anyways, unless we were using
+GCC as a static compiler.
+
+-Chris
+
diff --git a/docs/LangRef.html b/docs/LangRef.html
new file mode 100644
index 0000000000..b3d6d52121
--- /dev/null
+++ b/docs/LangRef.html
@@ -0,0 +1,1376 @@
+<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
+<html><head><title>llvm Assembly Language Reference Manual</title></head>
+<body bgcolor=white>
+
+<table width="100%" bgcolor="#330077" border=0 cellpadding=4 cellspacing=0>
+<tr><td>&nbsp; <font size=+5 color="#EEEEFF" face="Georgia,Palatino,Times,Roman"><b>llvm Assembly Language Reference Manual</b></font></td>
+</tr></table>
+
+<ol>
+ <li><a href="#abstract">Abstract</a>
+ <li><a href="#introduction">Introduction</a>
+ <li><a href="#identifiers">Identifiers</a>
+ <li><a href="#typesystem">Type System</a>
+ <ol>
+ <li><a href="#t_primitive">Primitive Types</a>
+ <ol>
+ <li><a href="#t_classifications">Type Classifications</a>
+ </ol>
+ <li><a href="#t_derived">Derived Types</a>
+ <ol>
+ <li><a href="#t_array" >Array Type</a>
+ <li><a href="#t_method" >Method Type</a>
+ <li><a href="#t_pointer">Pointer Type</a>
+ <li><a href="#t_struct" >Structure Type</a>
+ <li><a href="#t_packed" >Packed Type</a>
+ </ol>
+ </ol>
+ <li><a href="#highlevel">High Level Structure</a>
+ <ol>
+ <li><a href="#modulestructure">Module Structure</a>
+ <li><a href="#methodstructure">Method Structure</a>
+ </ol>
+ <li><a href="#instref">Instruction Reference</a>
+ <ol>
+ <li><a href="#terminators">Terminator Instructions</a>
+ <ol>
+ <li><a href="#i_ret" >'<tt>ret</tt>' Instruction</a>
+ <li><a href="#i_br" >'<tt>br</tt>' Instruction</a>
+ <li><a href="#i_switch" >'<tt>switch</tt>' Instruction</a>
+ <li><a href="#i_callwith">'<tt>call .. with</tt>' Instruction</a>
+ </ol>
+ <li><a href="#unaryops">Unary Operations</a>
+ <ol>
+ <li><a href="#i_not" >'<tt>not</tt>' Instruction</a>
+ <li><a href="#i_cast">'<tt>cast .. to</tt>' Instruction</a>
+ </ol>
+ <li><a href="#binaryops">Binary Operations</a>
+ <ol>
+ <li><a href="#i_add" >'<tt>add</tt>' Instruction</a>
+ <li><a href="#i_sub" >'<tt>sub</tt>' Instruction</a>
+ <li><a href="#i_mul" >'<tt>mul</tt>' Instruction</a>
+ <li><a href="#i_div" >'<tt>div</tt>' Instruction</a>
+ <li><a href="#i_rem" >'<tt>rem</tt>' Instruction</a>
+ <li><a href="#i_setcc">'<tt>set<i>cc</i></tt>' Instructions</a>
+ </ol>
+ <li><a href="#bitwiseops">Bitwise Binary Operations</a>
+ <ol>
+ <li><a href="#i_and">'<tt>and</tt>' Instruction</a>
+ <li><a href="#i_or" >'<tt>or</tt>' Instruction</a>
+ <li><a href="#i_xor">'<tt>xor</tt>' Instruction</a>
+ <li><a href="#i_shl">'<tt>shl</tt>' Instruction</a>
+ <li><a href="#i_shr">'<tt>shr</tt>' Instruction</a>
+ </ol>
+ <li><a href="#memoryops">Memory Access Operations</a>
+ <ol>
+ <li><a href="#i_malloc" >'<tt>malloc</tt>' Instruction</a>
+ <li><a href="#i_free" >'<tt>free</tt>' Instruction</a>
+ <li><a href="#i_alloca" >'<tt>alloca</tt>' Instruction</a>
+ <li><a href="#i_load" >'<tt>load</tt>' Instruction</a>
+ <li><a href="#i_store" >'<tt>store</tt>' Instruction</a>
+ <li><a href="#i_getfieldptr">'<tt>getfieldptr</tt>' Instruction</a>
+ </ol>
+ <li><a href="#otherops">Other Operations</a>
+ <ol>
+ <li><a href="#i_call" >'<tt>call</tt>' Instruction</a>
+ <li><a href="#i_icall">'<tt>icall</tt>' Instruction</a>
+ <li><a href="#i_phi" >'<tt>phi</tt>' Instruction</a>
+ </ol>
+ <li><a href="#builtinfunc">Builtin Functions</a>
+ </ol>
+ <li><a href="#todo">TODO List</a>
+ <ol>
+ <li><a href="#exception">Exception Handling Instructions</a>
+ <li><a href="#synchronization">Synchronization Instructions</a>
+ </ol>
+ <li><a href="#extensions">Possible Extensions</a>
+ <ol>
+ <li><a href="#i_tailcall">'<tt>tailcall</tt>' Instruction</a>
+ <li><a href="#globalvars">Global Variables</a>
+ <li><a href="#explicitparrellelism">Explicit Parrellelism</a>
+ </ol>
+ <li><a href="#related">Related Work</a>
+</ol>
+
+
+<!-- *********************************************************************** -->
+<p><table width="100%" bgcolor="#330077" border=0 cellpadding=4 cellspacing=0><tr><td align=center><font color="#EEEEFF" size=+2 face="Georgia,Palatino"><b>
+<a name="abstract">Abstract
+</b></font></td></tr></table><ul>
+<!-- *********************************************************************** -->
+
+<blockquote>
+ This document describes the LLVM assembly language IR/VM. LLVM is an SSA
+ based representation that attempts to be a useful midlevel IR by providing
+ type safety, low level operations, flexibility, and the capability to
+ represent 'all' high level languages cleanly.
+</blockquote>
+
+
+
+
+<!-- *********************************************************************** -->
+</ul><table width="100%" bgcolor="#330077" border=0 cellpadding=4 cellspacing=0><tr><td align=center><font color="#EEEEFF" size=+2 face="Georgia,Palatino"><b>
+<a name="introduction">Introduction
+</b></font></td></tr></table><ul>
+<!-- *********************************************************************** -->
+
+The LLVM is designed to exhibit a dual nature: on one hand, it is a useful compiler IR; on the other hand, it is a bytecode representation for dynamic compilation. We contend that this is a natural and good thing, making LLVM a natural form of communication between different compiler phases, and also between a static and dynamic compiler.<p>
+
+This dual nature leads to three different representations of LLVM (the human readable assembly representation, the compact bytecode representation, and the in memory, pointer based, representation). This document describes the human readable representation and notation.<p>
+
+The LLVM representation aims to be light weight and low level while being expressive, type safe, and extensible at the same time. It aims to be a "universal IR" of sorts, by being at a low enough level that high level ideas may be cleanly mapped to it. By providing type safety, LLVM can be used as the target of optimizations: for example, through pointer analysis, it can be proven that a C automatic variable is never accessed outside of the current function... allowing it to be promoted to a simple SSA value instead of a memory location.<p>
+
+<!-- _______________________________________________________________________ -->
+</ul><a name="wellformed"><h4><hr size=0>Well Formedness</h4><ul>
+
+It is important to note that this document describes 'well formed' llvm assembly language. There is a difference between what the parser accepts and what is considered 'well formed'. For example, the following instruction is syntactically okay, but not well formed:<p>
+
+<pre>
+ %x = <a href="#i_add">add</a> int 1, %x
+</pre>
+
+...because only a <tt><a href="#i_phi">phi</a></tt> node may refer to itself. The LLVM API provides a verification function (<tt>verify</tt>) that may be used to verify that a whole module or a single method is well formed. This is useful for checking that an optimization pass performed a well formed transformation of the code.<p>
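+
+For instance, the well formed way for a value to refer to itself is through a '<tt><a href="#i_phi">phi</a></tt>' node (the exact syntax shown here is only a sketch):<p>
+
+<pre>
+Loop:
+  %x = <a href="#i_phi">phi</a> int [ 0, %Entry ], [ %x, %Loop ]  <i>; well formed self reference</i>
+</pre>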
+
+
+Describe the typesetting conventions here.
+
+
+<!-- *********************************************************************** -->
+</ul><table width="100%" bgcolor="#330077" border=0 cellpadding=4 cellspacing=0><tr><td align=center><font color="#EEEEFF" size=+2 face="Georgia,Palatino"><b>
+<a name="identifiers">Identifiers
+</b></font></td></tr></table><ul>
+<!-- *********************************************************************** -->
+
+LLVM uses three different forms of identifiers, for different purposes:<p>
+
+<ol>
+<li>Numeric constants are represented as you would expect: 12, -3, 123.421, etc.
+<li>Named values are represented as a string of characters with a '%' prefix. For example, %foo, %DivisionByZero, %a.really.long.identifier. The actual regular expression used is '<tt>%[a-zA-Z$._][a-zA-Z$._0-9]*</tt>'.
+<li>Unnamed values are represented as an unsigned numeric value with a '%' prefix. For example, %12, %2, %44.
+</ol><p>
+
+LLVM requires that values start with a '%' sign for two reasons: Compilers don't need to worry about name clashes with reserved words, and the set of reserved words may be expanded in the future without penalty. Additionally, unnamed identifiers allow a compiler to quickly come up with a temporary variable without having to avoid symbol table conflicts.<p>
+
+Reserved words in LLVM are very similar to reserved words in other languages. There are keywords for different opcodes ('<tt><a href="#i_add">add</a></tt>', '<tt><a href="#i_cast">cast</a></tt>', '<tt><a href="#i_ret">ret</a></tt>', etc...), for primitive type names ('<tt><a href="#t_void">void</a></tt>', '<tt><a href="#t_uint">uint</a></tt>', etc...), and others. These reserved words cannot conflict with variable names, because none of them may start with a '%' character.<p>
+
+Here is an example of LLVM code to multiply the integer variable '<tt>%X</tt>' by 8:<p>
+
+The easy way:
+<pre>
+ %result = <a href="#i_mul">mul</a> int %X, 8
+</pre>
+
+After strength reduction:
+<pre>
+ %result = <a href="#i_shl">shl</a> int %X, ubyte 3
+</pre>
+
+And the hard way:
+<pre>
+ <a href="#i_add">add</a> int %X, %X <i>; yields {int}:%0</i>
+ <a href="#i_add">add</a> int %0, %0 <i>; yields {int}:%1</i>
+ %result = <a href="#i_add">add</a> int %1, %1
+</pre>
+
+This last way of multiplying <tt>%X</tt> by 8 illustrates several important lexical features of LLVM:<p>
+
+<ol>
+<li>Comments are delimited with a '<tt>;</tt>' and go until the end of line.
+<li>Unnamed temporaries are created when the result of a computation is not assigned to a named value.
+<li>Unnamed temporaries are numbered sequentially
+</ol><p>
+
+...and it also shows a convention that we follow in this document. When demonstrating instructions, we will follow an instruction with a comment that defines the type and name of the value produced. Comments are shown in italic text.<p>
+
+
+
+<!-- *********************************************************************** -->
+</ul><table width="100%" bgcolor="#330077" border=0 cellpadding=4 cellspacing=0><tr><td align=center><font color="#EEEEFF" size=+2 face="Georgia,Palatino"><b>
+<a name="typesystem">Type System
+</b></font></td></tr></table><ul>
+<!-- *********************************************************************** -->
+
+The LLVM type system is important to the overall usefulness of the language and VM runtime. By being strongly typed, a number of optimizations may be performed on the IR directly, without having to do extra analysis to derive types. A strong type system also makes it easier to comprehend generated code and assists with safety concerns.<p>
+
+The assembly language form for the type system was heavily influenced by the type problems in the C language<sup><a href="#rw_stroustrup">1</a></sup>.<p>
+
+
+
+<!-- ======================================================================= -->
+</ul><table width="100%" bgcolor="#441188" border=0 cellpadding=4 cellspacing=0><tr><td>&nbsp;</td><td width="100%">&nbsp; <font color="#EEEEFF" face="Georgia,Palatino"><b>
+<a name="t_primitive">Primitive Types
+</b></font></td></tr></table><ul>
+
+The primitive types are the fundamental building blocks of the LLVM system. The current set of primitive types is as follows:<p>
+
+<table border=0 align=center><tr><td>
+
+<table border=1 cellspacing=0 cellpadding=4 align=center>
+<tr><td><tt>void</tt></td> <td>No value</td></tr>
+<tr><td><tt>ubyte</tt></td> <td>Unsigned 8 bit value</td></tr>
+<tr><td><tt>ushort</tt></td><td>Unsigned 16 bit value</td></tr>
+<tr><td><tt>uint</tt></td> <td>Unsigned 32 bit value</td></tr>
+<tr><td><tt>ulong</tt></td> <td>Unsigned 64 bit value</td></tr>
+<tr><td><tt>float</tt></td> <td>32 bit floating point value</td></tr>
+<tr><td><tt>label</tt></td> <td>Branch destination</td></tr>
+</table>
+
+</td><td>
+
+<table border=1 cellspacing=0 cellpadding=4 align=center>
+<tr><td><tt>bool</tt></td> <td>True or False value</td></tr>
+<tr><td><tt>sbyte</tt></td> <td>Signed 8 bit value</td></tr>
+<tr><td><tt>short</tt></td> <td>Signed 16 bit value</td></tr>
+<tr><td><tt>int</tt></td> <td>Signed 32 bit value</td></tr>
+<tr><td><tt>long</tt></td> <td>Signed 64 bit value</td></tr>
+<tr><td><tt>double</tt></td><td>64 bit floating point value</td></tr>
+<tr><td><tt>lock</tt></td> <td>Recursive mutex value</td></tr>
+</table>
+
+</td></tr></table><p>
+
+
+
+<!-- _______________________________________________________________________ -->
+</ul><a name="t_classifications"><h4><hr size=0>Type Classifications</h4><ul>
+
+These different primitive types fall into a few useful classifications:<p>
+
+<table border=1 cellspacing=0 cellpadding=4 align=center>
+<tr><td><a name="t_signed">signed</td> <td><tt>sbyte, short, int, long, float, double</tt></td></tr>
+<tr><td><a name="t_unsigned">unsigned</td><td><tt>ubyte, ushort, uint, ulong</tt></td></tr>
+<tr><td><a name="t_integral">integral</td><td><tt>ubyte, sbyte, ushort, short, uint, int, ulong, long</tt></td></tr>
+<tr><td><a name="t_floating">floating point</td><td><tt>float, double</tt></td></tr>
+<tr><td><a name="t_firstclass">first class</td><td><tt>bool, ubyte, sbyte, ushort, short, uint, int, ulong, long, float, double, lock</tt></td></tr>
+</table><p>
+
+
+
+
+
+<!-- ======================================================================= -->
+</ul><table width="100%" bgcolor="#441188" border=0 cellpadding=4 cellspacing=0><tr><td>&nbsp;</td><td width="100%">&nbsp; <font color="#EEEEFF" face="Georgia,Palatino"><b>
+<a name="t_derived">Derived Types
+</b></font></td></tr></table><ul>
+
+The real power in LLVM comes from the derived types in the system. This is what allows a programmer to represent arrays, methods, pointers, and other useful types. Note that these derived types may be recursive: For example, it is possible to have a two dimensional array.<p>
+
+
+
+<!-- _______________________________________________________________________ -->
+</ul><a name="t_array"><h4><hr size=0>Array Type</h4><ul>
+
+<h5>Overview:</h5>
+
+The array type is a very simple derived type. It arranges elements sequentially in memory. There are two different forms of the array type:<p>
+
+<ol>
+<a name="t_array_fixed"><b><li>Fixed size array type:</b><br>
+  The simplest form of the array type has a size hard coded in as part of the type. Thus the following are three distinct types:<p>
+
+ <tt>[40 x int ]</tt>: Array of 40 integer values.<br>
+ <tt>[41 x int ]</tt>: Array of 41 integer values.<br>
+ <tt>[40 x uint]</tt>: Array of 40 unsigned integer values.<p>
+
+Fixed sized arrays are very useful for compiler optimization passes and for representing analysis results. Additionally, multidimensional arrays must have fixed sizes for all dimensions except the outer-most dimension.<p>
+
+<a name="t_array_unsized"><b><li>Dynamically sized array type:</b><br>
+ The dynamically sized arrays are very similar to the fixed size arrays, except that the size of the array is calculated at runtime by the virtual machine. This is useful for representing generic methods that take any size array as an argument, or when representing Java style arrays.
+</ol><p>
+
+Here are some examples of multidimensional arrays:<p>
+<ul>
+<table border=0 cellpadding=0 cellspacing=0>
+<tr><td><tt>[3 x [4 x int]]</tt></td><td>: 3x4 array of integer values.</td></tr>
+<tr><td><tt>[[10 x int]]</tt></td><td>: Nx10 array of integer values.</td></tr>
+<tr><td><tt>[2 x [3 x [4 x uint]]]</tt></td><td>: 2x3x4 array of unsigned integer values.</td></tr>
+</table>
+</ul>
+
+
+
+<!-- _______________________________________________________________________ -->
+</ul><a name="t_method"><h4><hr size=0>Method Type</h4><ul>
+
+<h5>Overview:</h5>
+
+The method type can be thought of as a method signature. It consists of a return type and a list of formal parameter types. Method types are usually used to build virtual function tables (which are structures of pointers to methods) and for indirect method calls.<p>
+
+<h5>Syntax:</h5>
+<pre>
+ &lt;returntype&gt; (&lt;parameter list&gt;)
+</pre>
+
+Where '<tt>&lt;parameter list&gt;</tt>' is a comma separated list of type specifiers.<p>
+
+<h5>Examples:</h5>
+<ul>
+<table border=0 cellpadding=0 cellspacing=0>
+<tr><td><tt>int (int)</tt></td><td>: method taking an <tt>int</tt>, returning an <tt>int</tt></td></tr>
+<tr><td><tt>float (int, int *) *</tt></td><td>: <a href="#t_pointer">Pointer</a> to a method that takes an <tt>int</tt> and a <a href="#t_pointer">pointer</a> to <tt>int</tt>, returning <tt>float</tt>.</td></tr>
+</table>
+</ul>
+
+
+
+<!-- _______________________________________________________________________ -->
+</ul><a name="t_struct"><h4><hr size=0>Structure Type</h4><ul>
+
+<h5>Overview:</h5>
+
+The structure type is used to represent a collection of data members together in memory. Although the runtime is allowed to lay out the data members any way that it would like, they are guaranteed to be "close" to each other.<p>
+
+Structures are accessed using the '<tt><a href="#i_load">load</a></tt>' and '<tt><a href="#i_store">store</a></tt>' instructions, by getting a pointer to a field with the '<tt><a href="#i_getfieldptr">getfieldptr</a></tt>' instruction.<p>
+
+<h5>Syntax:</h5>
+<pre>
+ { &lt;type list&gt; }
+</pre>
+
+
+<h5>Examples:</h5>
+<table border=0 cellpadding=0 cellspacing=0>
+<tr><td><tt>{ int, int, int }</tt></td><td>: a triple of three <tt>int</tt> values</td></tr>
+<tr><td><tt>{ float, int (int *) * }</tt></td><td>: A pair, where the first element is a <tt>float</tt> and the second element is a <a href="#t_pointer">pointer</a> to a <a href="#t_method">method</a> that takes an <tt>int</tt>, returning an <tt>int</tt>.</td></tr>
+</table>
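+
+For illustration only (the '<tt><a href="#i_getfieldptr">getfieldptr</a></tt>', '<tt><a href="#i_load">load</a></tt>', and '<tt><a href="#i_store">store</a></tt>' forms used here are assumptions, since those instructions are not yet specified in this document), reading and writing the second field of a <tt>{ int, float }</tt> might look something like this:<p>
+
+<pre>
+  %p = getfieldptr { int, float } * %S, ubyte 1  <i>; hypothetical: pointer to the float field</i>
+  %v = load float * %p                           <i>; hypothetical: read the field</i>
+  store float %v, float * %p                     <i>; hypothetical: write the field</i>
+</pre>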
+
+
+<!-- _______________________________________________________________________ -->
+</ul><a name="t_pointer"><h4><hr size=0>Pointer Type</h4><ul>
+
+
+<!-- _______________________________________________________________________ -->
+</ul><a name="t_packed"><h4><hr size=0>Packed Type</h4><ul>
+
+Mention/decide that packed types work with saturation or not. Maybe have a packed+saturated type in addition to just a packed type.<p>
+
+Packed types should be 'nonsaturated' because standard data types are not saturated. Maybe have a saturated packed type?<p>
+
+
+<!-- *********************************************************************** -->
+</ul><table width="100%" bgcolor="#330077" border=0 cellpadding=4 cellspacing=0><tr><td align=center><font color="#EEEEFF" size=+2 face="Georgia,Palatino"><b>
+<a name="highlevel">High Level Structure
+</b></font></td></tr></table><ul>
+<!-- *********************************************************************** -->
+
+
+<!-- ======================================================================= -->
+</ul><table width="100%" bgcolor="#441188" border=0 cellpadding=4 cellspacing=0><tr><td>&nbsp;</td><td width="100%">&nbsp; <font color="#EEEEFF" face="Georgia,Palatino"><b>
+<a name="modulestructure">Module Structure
+</b></font></td></tr></table><ul>
+
+
+talk about the elements of a module: constant pool and method list.<p>
+
+
+<!-- ======================================================================= -->
+</ul><table width="100%" bgcolor="#441188" border=0 cellpadding=4 cellspacing=0><tr><td>&nbsp;</td><td width="100%">&nbsp; <font color="#EEEEFF" face="Georgia,Palatino"><b>
+<a name="methodstructure">Method Structure
+</b></font></td></tr></table><ul>
+
+
+talk about the constant pool<p>
+talk about how basic blocks delineate labels<p>
+talk about how basic blocks end with terminators<p>
+
+
+<!-- *********************************************************************** -->
+</ul><table width="100%" bgcolor="#330077" border=0 cellpadding=4 cellspacing=0><tr><td align=center><font color="#EEEEFF" size=+2 face="Georgia,Palatino"><b>
+<a name="instref">Instruction Reference
+</b></font></td></tr></table><ul>
+<!-- *********************************************************************** -->
+
+List all of the instructions, list valid types that they accept. Tell what they
+do and stuff also.
+
+<!-- ======================================================================= -->
+</ul><table width="100%" bgcolor="#441188" border=0 cellpadding=4 cellspacing=0><tr><td>&nbsp;</td><td width="100%">&nbsp; <font color="#EEEEFF" face="Georgia,Palatino"><b>
+<a name="terminators">Terminator Instructions
+</b></font></td></tr></table><ul>
+
+
+
+As was mentioned <a href="#methodstructure">previously</a>, every basic block in
+a program ends with a "Terminator" instruction. Additionally, all terminators yield a '<tt>void</tt>' value: they produce control flow, not values.<p>
+
+There are four different terminator instructions: the '<a href="#i_ret"><tt>ret</tt></a>' instruction, the '<a href="#i_br"><tt>br</tt></a>' instruction, the '<a href="#i_switch"><tt>switch</tt></a>' instruction, and the '<a href="#i_callwith"><tt>call .. with</tt></a>' instruction.<p>
+
+
+<!-- _______________________________________________________________________ -->
+</ul><a name="i_ret"><h4><hr size=0>'<tt>ret</tt>' Instruction</h4><ul>
+
+<h5>Syntax:</h5>
+<pre>
+ ret &lt;type&gt; &lt;value&gt; <i>; Return a value from a non-void method</i>
+ ret void <i>; Return from void method</i>
+</pre>
+
+<h5>Overview:</h5>
+The '<tt>ret</tt>' instruction is used to return control flow (and optionally a value) from a method, back to the caller.<p>
+
+There are two forms of the '<tt>ret</tt>' instruction: one that returns a value and then causes control flow, and one that just causes control flow to occur.<p>
+
+<h5>Arguments:</h5>
+The '<tt>ret</tt>' instruction may return any '<a href="#t_firstclass">first class</a>' type. Notice that a method is not <a href="#wellformed">well formed</a> if there exists a '<tt>ret</tt>' instruction inside of the method that returns a value that does not match the return type of the method.<p>
+
+<h5>Semantics:</h5>
+When the '<tt>ret</tt>' instruction is executed, control flow returns back to the calling method's context. If the instruction returns a value, that value shall be propagated into the calling method's data space.<p>
+
+<h5>Example:</h5>
+<pre>
+ ret int 5 <i>; Return an integer value of 5</i>
+ ret void <i>; Return from a void method</i>
+</pre>
+
+
+<!-- _______________________________________________________________________ -->
+</ul><a name="i_br"><h4><hr size=0>'<tt>br</tt>' Instruction</h4><ul>
+
+<h5>Syntax:</h5>
+<pre>
+ br bool &lt;cond&gt;, label &lt;iftrue&gt;, label &lt;iffalse&gt;
+ br label &lt;dest&gt; <i>; Unconditional branch</i>
+</pre>
+
+<h5>Overview:</h5>
+The '<tt>br</tt>' instruction is used to cause control flow to transfer to a different basic block in the current method. There are two forms of this instruction, corresponding to a conditional branch and an unconditional branch. The '<tt>br</tt>' instruction is a (useful) special case of the '<tt><a href="#i_switch">switch</a></tt>' instruction.<p>
+
+<h5>Arguments:</h5>
+
+The conditional branch form of the '<tt>br</tt>' instruction shall take a single '<tt>bool</tt>' value and two '<tt>label</tt>' values. The unconditional form of the '<tt>br</tt>' instruction takes a single '<tt>label</tt>' value as a target.<p>
+
+<h5>Semantics:</h5>
+
+Upon execution of a conditional '<tt>br</tt>' instruction, the '<tt>bool</tt>' argument is evaluated. If the value is <tt>true</tt>, control flows to the '<tt>iftrue</tt>' '<tt>label</tt>' argument. If "cond" is <tt>false</tt>, control flows to the '<tt>iffalse</tt>' '<tt>label</tt>' argument.<p>
+
+<h5>Example:</h5>
+<pre>
+Test:
+ %cond = <a href="#i_setcc">seteq</a> int %a, %b
+ br bool %cond, label %IfEqual, label %IfUnequal
+IfEqual:
+ <a href="#i_ret">ret</a> bool true
+IfUnequal:
+ <a href="#i_ret">ret</a> bool false
+</pre>
+
+
+<!-- _______________________________________________________________________ -->
+</ul><a name="i_switch"><h4><hr size=0>'<tt>switch</tt>' Instruction</h4><ul>
+
+<h5>Syntax:</h5>
+<pre>
+ <i>; Definitions for lookup indirect branch</i>
+ %switchtype = type [&lt;anysize&gt; x { uint, label }]
+
+ <i>; Lookup indirect branch</i>
+ switch uint &lt;value&gt;, label &lt;defaultdest&gt;, %switchtype &lt;switchtable&gt;
+
+ <i>; Indexed indirect branch</i>
+ switch uint &lt;idxvalue&gt;, label &lt;defaultdest&gt;, [&lt;anysize&gt; x label] &lt;desttable&gt;
+</pre>
+
+<h5>Overview:</h5>
+The '<tt>switch</tt>' instruction is used to transfer control flow to one of several different places. It is a simple generalization of the '<tt>br</tt>' instruction, and supports a strict superset of its functionality.<p>
+
+The '<tt>switch</tt>' statement supports two different styles of indirect branching: lookup branching and indexed branching. Lookup branching is generally useful if the values to switch on are spread far apart, whereas indexed branching is useful if the values to switch on are generally dense.<p>
+
+The two different forms of the '<tt>switch</tt>' statement are simple hints to the underlying virtual machine implementation. For example, a virtual machine may choose to implement a small indirect branch table as a series of predicated comparisons, if that is faster for the target architecture.<p>
+
+<h5>Arguments:</h5>
+The lookup form of the '<tt>switch</tt>' instruction uses three parameters: a '<tt>uint</tt>' comparison value '<tt>value</tt>', a default '<tt>label</tt>' destination, and a sized array of pairs of comparison value constants and '<tt>label</tt>'s. The sized array must be a constant value.<p>
+
+The indexed form of the '<tt>switch</tt>' instruction uses three parameters: a '<tt>uint</tt>' index value, a default '<tt>label</tt>', and a sized array of '<tt>label</tt>'s. The '<tt>desttable</tt>' array must be a constant array.<p>
+
+<h5>Semantics:</h5>
+
+The lookup style switch statement specifies a table of values and destinations. When the '<tt>switch</tt>' instruction is executed, this table is searched for the given value. If the value is found, the corresponding destination is branched to. <p>
+The index branch form simply looks up a label element directly in a table and branches to it.<p>
+
+In either case, the compiler knows the static size of the array, because it is provided as part of the constant value's type.<p>
+
+<h5>Example:</h5>
+<pre>
+ <i>; Emulate a conditional br instruction</i>
+ %Val = <a href="#i_cast">cast</a> bool %value to uint
+ switch uint %Val, label %truedest, [1 x label] [label %falsedest ]
+
+ <i>; Emulate an unconditional br instruction</i>
+ switch uint 0, label %dest, [ 0 x label] [ ]
+
+ <i>; Implement a jump table using the constant pool:</i>
+ void "testmeth"(int %arg0)
+ %switchdests = [3 x label] [ label %onzero, label %onone, label %ontwo ]
+ {
+ ...
+ switch uint %val, label %otherwise, [3 x label] %switchdests...
+ ...
+ }
+
+  <i>; Implement the equivalent jump table directly:</i>
+ switch uint %val, label %otherwise, [3 x label] [ label %onzero,
+ label %onone,
+ label %ontwo ]
+
+</pre>
+
+
+
+<!-- _______________________________________________________________________ -->
+</ul><a name="i_callwith"><h4><hr size=0>'<tt>call .. with</tt>' Instruction</h4><ul>
+
+<h5>Syntax:</h5>
+<pre>
+ &lt;result&gt; = call &lt;method ty&gt; %&lt;method name&gt;(&lt;method args&gt;) with label &lt;break label&gt;
+</pre>
+
+<h5>Overview:</h5>
+The '<tt>call .. with</tt>' instruction is used to cause control flow to transfer to a specified method, with the possibility of control flow transfer to the '<tt>break label</tt>' label, in addition to the possibility of fallthrough to the next basic block. The '<tt><a href="#i_call">call</a></tt>' instruction is closely related, but guarantees that control flow either never returns from the invoked method, or that it returns to the instruction succeeding the '<tt><a href="#i_call">call</a></tt>' instruction.<p>
+
+TODO: icall .. with needs to be defined as well for an indirect call.<p>
+
+<h5>Arguments:</h5>
+
+This instruction requires several arguments:<p>
+<ol>
+<li>'<tt>method ty</tt>': shall be the signature of the named method being invoked. This must be a <a href="#t_method">method type</a>.
+<li>'<tt>method name</tt>': method name to be invoked.
+<li>'<tt>method args</tt>': argument list whose types match the method signature argument types.
+<li>'<tt>break label</tt>': a label that specifies the break label associated with this call.
+</ol>
+
+<h5>Semantics:</h5>
+
+This instruction is designed to operate as a standard '<tt><a href="#i_call">call</a></tt>' instruction in most regards. The primary difference is that it associates a label with the method invocation that may be accessed via the runtime library provided by the execution environment. This instruction is used in languages with destructors to ensure that proper cleanup is performed in the case of either a <tt>longjmp</tt> or a thrown exception. Additionally, this is important for implementation of '<tt>catch</tt>' clauses in high-level languages that support them.<p>
+
+For a more comprehensive explanation of this instruction, look in the llvm/docs/2001-05-18-ExceptionHandling.txt document.<p>
+
+<h5>Example:</h5>
+<pre>
+ %retval = call int (int) %Test(int 15) with label %TestCleanup <i>; {int}:retval set</i>
+</pre>
+
+
+
+<!-- ======================================================================= -->
+</ul><table width="100%" bgcolor="#441188" border=0 cellpadding=4 cellspacing=0><tr><td>&nbsp;</td><td width="100%">&nbsp; <font color="#EEEEFF" face="Georgia,Palatino"><b>
+<a name="unaryops">Unary Operations
+</b></font></td></tr></table><ul>
+
+Unary operators are used to perform a simple operation on a single value.<p>
+
+There are two different unary operators: the '<a href="#i_not"><tt>not</tt></a>' instruction and the '<a href="#i_cast"><tt>cast</tt></a>' instruction.<p>
+
+
+<!-- _______________________________________________________________________ -->
+</ul><a name="i_not"><h4><hr size=0>'<tt>not</tt>' Instruction</h4><ul>
+
+<h5>Syntax:</h5>
+<pre>
+ &lt;result&gt; = not &lt;ty&gt; &lt;var&gt; <i>; yields {ty}:result</i>
+</pre>
+
+<h5>Overview:</h5>
+The '<tt>not</tt>' instruction returns the <a href="#logical_integrals">logical</a> inverse of its operand.<p>
+
+<h5>Arguments:</h5>
+The single argument to '<tt>not</tt>' must be of <a href="#t_integral">integral</a> type.<p>
+
+
+<h5>Semantics:</h5>
+The '<tt>not</tt>' instruction returns the <a href="#logical_integrals">logical</a> inverse of an <a href="#t_integral">integral</a> type.<p>
+
+Note that the '<tt>not</tt>' instruction is not defined over the '<tt>bool</tt>' type. To invert a boolean value, the recommended method is to use:<p>
+
+<pre>
+ &lt;result&gt; = xor bool true, &lt;var&gt; <i>; yields {bool}:result</i>
+</pre>
+
+<h5>Example:</h5>
+<pre>
+ %x = not int 1 <i>; {int}:x is now equal to 0</i>
+  %y = xor bool true, %b <i>; {bool}:y is now the inverse of {bool}:b</i>
+</pre>
+
+
+
+<!-- _______________________________________________________________________ -->
+</ul><a name="i_cast"><h4><hr size=0>'<tt>cast .. to</tt>' Instruction</h4><ul>
+
+<h1>TODO</h1>
+
+<a name="logical_integrals">
+ Talk about what is considered true or false for integrals.
+
+
+
+<h5>Syntax:</h5>
+<pre>
+</pre>
+
+<h5>Overview:</h5>
+
+
+<h5>Arguments:</h5>
+
+
+<h5>Semantics:</h5>
+
+
+<h5>Example:</h5>
+<pre>
+</pre>
+
+
+<!-- ======================================================================= -->
+</ul><table width="100%" bgcolor="#441188" border=0 cellpadding=4 cellspacing=0><tr><td>&nbsp;</td><td width="100%">&nbsp; <font color="#EEEEFF" face="Georgia,Palatino"><b>
+<a name="binaryops">Binary Operations
+</b></font></td></tr></table><ul>
+
+Binary operators are used to do most of the computation in a program. They require two operands, execute an operation on them, and produce a single value. The result value of a binary operator is not necessarily the same type as its operands.<p>
+
+There are several different binary operators:<p>
+
+
+<!-- _______________________________________________________________________ -->
+</ul><a name="i_add"><h4><hr size=0>'<tt>add</tt>' Instruction</h4><ul>
+
+<h5>Syntax:</h5>
+<pre>
+ &lt;result&gt; = add &lt;ty&gt; &lt;var1&gt;, &lt;var2&gt; <i>; yields {ty}:result</i>
+</pre>
+
+<h5>Overview:</h5>
+The '<tt>add</tt>' instruction returns the sum of its two operands.<p>
+
+<h5>Arguments:</h5>
+The two arguments to the '<tt>add</tt>' instruction must be either <a href="#t_integral">integral</a> or <a href="#t_floating">floating point</a> values. Both arguments must have identical types.<p>
+
+<h5>Semantics:</h5>
+...<p>
+
+<h5>Example:</h5>
+<pre>
+ &lt;result&gt; = add int 4, %var <i>; yields {int}:result = 4 + %var</i>
+</pre>
+
+
+<!-- _______________________________________________________________________ -->
+</ul><a name="i_sub"><h4><hr size=0>'<tt>sub</tt>' Instruction</h4><ul>
+
+<h5>Syntax:</h5>
+<pre>
+ &lt;result&gt; = sub &lt;ty&gt; &lt;var1&gt;, &lt;var2&gt; <i>; yields {ty}:result</i>
+</pre>
+
+<h5>Overview:</h5>
+The '<tt>sub</tt>' instruction returns the difference of its two operands.<p>
+
+Note that the '<tt>sub</tt>' instruction is also the canonical way to represent a '<tt>neg</tt>' operation.<p>
+
+<h5>Arguments:</h5>
+The two arguments to the '<tt>sub</tt>' instruction must be either <a href="#t_integral">integral</a> or <a href="#t_floating">floating point</a> values. Both arguments must have identical types.<p>
+
+<h5>Semantics:</h5>
+...<p>
+
+<h5>Example:</h5>
+<pre>
+ &lt;result&gt; = sub int 4, %var <i>; yields {int}:result = 4 - %var</i>
+  &lt;result&gt; = sub int 0, %var <i>; yields {int}:result = -%var</i>
+</pre>
+
+<!-- _______________________________________________________________________ -->
+</ul><a name="i_mul"><h4><hr size=0>'<tt>mul</tt>' Instruction</h4><ul>
+
+<h5>Syntax:</h5>
+<pre>
+ &lt;result&gt; = mul &lt;ty&gt; &lt;var1&gt;, &lt;var2&gt; <i>; yields {ty}:result</i>
+</pre>
+
+<h5>Overview:</h5>
+The '<tt>mul</tt>' instruction returns the product of its two operands.<p>
+
+<h5>Arguments:</h5>
+The two arguments to the '<tt>mul</tt>' instruction must be either <a href="#t_integral">integral</a> or <a href="#t_floating">floating point</a> values. Both arguments must have identical types.<p>
+
+<h5>Semantics:</h5>
+...<p>
+There is no signed vs unsigned multiplication. The appropriate action is taken based on the type of the operand. <p>
+
+
+<h5>Example:</h5>
+<pre>
+ &lt;result&gt; = mul int 4, %var <i>; yields {int}:result = 4 * %var</i>
+</pre>
+
+
+<!-- _______________________________________________________________________ -->
+</ul><a name="i_div"><h4><hr size=0>'<tt>div</tt>' Instruction</h4><ul>
+
+<h5>Syntax:</h5>
+<pre>
+ &lt;result&gt; = div &lt;ty&gt; &lt;var1&gt;, &lt;var2&gt; <i>; yields {ty}:result</i>
+</pre>
+
+<h5>Overview:</h5>
+The '<tt>div</tt>' instruction returns the quotient of its two operands.<p>
+
+<h5>Arguments:</h5>
+The two arguments to the '<tt>div</tt>' instruction must be either <a href="#t_integral">integral</a> or <a href="#t_floating">floating point</a> values. Both arguments must have identical types.<p>
+
+<h5>Semantics:</h5>
+...<p>
+
+<h5>Example:</h5>
+<pre>
+ &lt;result&gt; = div int 4, %var <i>; yields {int}:result = 4 / %var</i>
+</pre>
+
+
+<!-- _______________________________________________________________________ -->
+</ul><a name="i_rem"><h4><hr size=0>'<tt>rem</tt>' Instruction</h4><ul>
+
+<h5>Syntax:</h5>
+<pre>
+ &lt;result&gt; = rem &lt;ty&gt; &lt;var1&gt;, &lt;var2&gt; <i>; yields {ty}:result</i>
+</pre>
+
+<h5>Overview:</h5>
+The '<tt>rem</tt>' instruction returns the remainder from the division of its two operands.<p>
+
+<h5>Arguments:</h5>
+The two arguments to the '<tt>rem</tt>' instruction must be either <a href="#t_integral">integral</a> or <a href="#t_floating">floating point</a> values. Both arguments must have identical types.<p>
+
+<h5>Semantics:</h5>
+TODO: remainder or modulus?<p>
+...<p>
+
+<h5>Example:</h5>
+<pre>
+ &lt;result&gt; = rem int 4, %var <i>; yields {int}:result = 4 % %var</i>
+</pre>
+
+
+<!-- _______________________________________________________________________ -->
+</ul><a name="i_setcc"><h4><hr size=0>'<tt>set<i>cc</i></tt>' Instructions</h4><ul>
+
+<h5>Syntax:</h5>
+<pre>
+ &lt;result&gt; = seteq &lt;ty&gt; &lt;var1&gt;, &lt;var2&gt; <i>; yields {bool}:result</i>
+ &lt;result&gt; = setne &lt;ty&gt; &lt;var1&gt;, &lt;var2&gt; <i>; yields {bool}:result</i>
+ &lt;result&gt; = setlt &lt;ty&gt; &lt;var1&gt;, &lt;var2&gt; <i>; yields {bool}:result</i>
+ &lt;result&gt; = setgt &lt;ty&gt; &lt;var1&gt;, &lt;var2&gt; <i>; yields {bool}:result</i>
+ &lt;result&gt; = setle &lt;ty&gt; &lt;var1&gt;, &lt;var2&gt; <i>; yields {bool}:result</i>
+ &lt;result&gt; = setge &lt;ty&gt; &lt;var1&gt;, &lt;var2&gt; <i>; yields {bool}:result</i>
+</pre>
+
+<h5>Overview:</h5>
+The '<tt>set<i>cc</i></tt>' family of instructions returns a boolean value based on a comparison of their two operands.<p>
+
+<h5>Arguments:</h5>
+The two arguments to the '<tt>set<i>cc</i></tt>' instructions must be of <a href="#t_firstclass">first class</a> or <a href="#t_derived">derived</a> type (it is not possible to compare '<tt>label</tt>'s or '<tt>void</tt>' values). Both arguments must have identical types.<p>
+
+The '<tt>setlt</tt>', '<tt>setgt</tt>', '<tt>setle</tt>', and '<tt>setge</tt>' instructions do not operate on '<tt>bool</tt>' typed arguments.<p>
+
+<h5>Semantics:</h5>
+The '<tt>seteq</tt>' instruction yields a <tt>true</tt> '<tt>bool</tt>' value if both operands are equal.<br>
+The '<tt>setne</tt>' instruction yields a <tt>true</tt> '<tt>bool</tt>' value if both operands are unequal.<br>
+The '<tt>setlt</tt>' instruction yields a <tt>true</tt> '<tt>bool</tt>' value if the first operand is less than the second operand.<br>
+The '<tt>setgt</tt>' instruction yields a <tt>true</tt> '<tt>bool</tt>' value if the first operand is greater than the second operand.<br>
+The '<tt>setle</tt>' instruction yields a <tt>true</tt> '<tt>bool</tt>' value if the first operand is less than or equal to the second operand.<br>
+The '<tt>setge</tt>' instruction yields a <tt>true</tt> '<tt>bool</tt>' value if the first operand is greater than or equal to the second operand.<p>
+
+<h5>Example:</h5>
+<pre>
+ &lt;result&gt; = seteq int 4, 5 <i>; yields {bool}:result = false</i>
+ &lt;result&gt; = setne float 4, 5 <i>; yields {bool}:result = true</i>
+ &lt;result&gt; = setlt uint 4, 5 <i>; yields {bool}:result = true</i>
+ &lt;result&gt; = setgt sbyte 4, 5 <i>; yields {bool}:result = false</i>
+ &lt;result&gt; = setle sbyte 4, 5 <i>; yields {bool}:result = true</i>
+ &lt;result&gt; = setge sbyte 4, 5 <i>; yields {bool}:result = false</i>
+</pre>
+
+
+
+<!-- ======================================================================= -->
+</ul><table width="100%" bgcolor="#441188" border=0 cellpadding=4 cellspacing=0><tr><td>&nbsp;</td><td width="100%">&nbsp; <font color="#EEEEFF" face="Georgia,Palatino"><b>
+<a name="bitwiseops">Bitwise Binary Operations
+</b></font></td></tr></table><ul>
+
+Bitwise binary operators are used to do various forms of bit-twiddling in a program. They are generally very efficient instructions, and can commonly be strength reduced from other instructions. They require two operands, execute an operation on them, and produce a single value. The resulting value of a bitwise binary operator always has the same type as its first operand.<p>
+
+<!-- _______________________________________________________________________ -->
+</ul><a name="i_and"><h4><hr size=0>'<tt>and</tt>' Instruction</h4><ul>
+
+<h5>Syntax:</h5>
+<pre>
+ &lt;result&gt; = and &lt;ty&gt; &lt;var1&gt;, &lt;var2&gt; <i>; yields {ty}:result</i>
+</pre>
+
+<h5>Overview:</h5>
+The '<tt>and</tt>' instruction returns the bitwise logical and of its two operands.<p>
+
+<h5>Arguments:</h5>
+The two arguments to the '<tt>and</tt>' instruction must be either <a href="#t_integral">integral</a> or <a href="#t_bool"><tt>bool</tt></a> values. Both arguments must have identical types.<p>
+
+
+<h5>Semantics:</h5>
+...<p>
+
+
+<h5>Example:</h5>
+<pre>
+ &lt;result&gt; = and int 4, %var <i>; yields {int}:result = 4 & %var</i>
+ &lt;result&gt; = and int 15, 40 <i>; yields {int}:result = 8</i>
+ &lt;result&gt; = and int 4, 8 <i>; yields {int}:result = 0</i>
+</pre>
+
+
+
+<!-- _______________________________________________________________________ -->
+</ul><a name="i_or"><h4><hr size=0>'<tt>or</tt>' Instruction</h4><ul>
+
+<h5>Syntax:</h5>
+<pre>
+ &lt;result&gt; = or &lt;ty&gt; &lt;var1&gt;, &lt;var2&gt; <i>; yields {ty}:result</i>
+</pre>
+
+<h5>Overview:</h5>
+The '<tt>or</tt>' instruction returns the bitwise logical inclusive or of its two operands.<p>
+
+<h5>Arguments:</h5>
+The two arguments to the '<tt>or</tt>' instruction must be either <a href="#t_integral">integral</a> or <a href="#t_bool"><tt>bool</tt></a> values. Both arguments must have identical types.<p>
+
+
+<h5>Semantics:</h5>
+...<p>
+
+
+<h5>Example:</h5>
+<pre>
+ &lt;result&gt; = or int 4, %var <i>; yields {int}:result = 4 | %var</i>
+ &lt;result&gt; = or int 15, 40 <i>; yields {int}:result = 47</i>
+ &lt;result&gt; = or int 4, 8 <i>; yields {int}:result = 12</i>
+</pre>
+
+
+<!-- _______________________________________________________________________ -->
+</ul><a name="i_xor"><h4><hr size=0>'<tt>xor</tt>' Instruction</h4><ul>
+
+<h5>Syntax:</h5>
+<pre>
+ &lt;result&gt; = xor &lt;ty&gt; &lt;var1&gt;, &lt;var2&gt; <i>; yields {ty}:result</i>
+</pre>
+
+<h5>Overview:</h5>
+The '<tt>xor</tt>' instruction returns the bitwise logical exclusive or of its two operands.<p>
+
+<h5>Arguments:</h5>
+The two arguments to the '<tt>xor</tt>' instruction must be either <a href="#t_integral">integral</a> or <a href="#t_bool"><tt>bool</tt></a> values. Both arguments must have identical types.<p>
+
+
+<h5>Semantics:</h5>
+...<p>
+
+
+<h5>Example:</h5>
+<pre>
+ &lt;result&gt; = xor int 4, %var <i>; yields {int}:result = 4 ^ %var</i>
+ &lt;result&gt; = xor int 15, 40 <i>; yields {int}:result = 39</i>
+ &lt;result&gt; = xor int 4, 8 <i>; yields {int}:result = 12</i>
+</pre>
+
+
+<!-- _______________________________________________________________________ -->
+</ul><a name="i_shl"><h4><hr size=0>'<tt>shl</tt>' Instruction</h4><ul>
+
+<h5>Syntax:</h5>
+<pre>
+ &lt;result&gt; = shl &lt;ty&gt; &lt;var1&gt;, ubyte &lt;var2&gt; <i>; yields {ty}:result</i>
+</pre>
+
+<h5>Overview:</h5>
+The '<tt>shl</tt>' instruction returns the first operand shifted to the left a specified number of bits.
+
+<h5>Arguments:</h5>
+The first argument to the '<tt>shl</tt>' instruction must be a value of <a href="#t_integral">integral</a> type. The second argument must be of type '<tt>ubyte</tt>'.<p>
+
+<h5>Semantics:</h5>
+... 0 bits are shifted into the emptied bit positions...<p>
+
+
+<h5>Example:</h5>
+<pre>
+ &lt;result&gt; = shl int 4, ubyte %var <i>; yields {int}:result = 4 << %var</i>
+ &lt;result&gt; = shl int 4, ubyte 2 <i>; yields {int}:result = 16</i>
+ &lt;result&gt; = shl int 1, ubyte 10 <i>; yields {int}:result = 1024</i>
+</pre>
+
+
+<!-- _______________________________________________________________________ -->
+</ul><a name="i_shr"><h4><hr size=0>'<tt>shr</tt>' Instruction</h4><ul>
+
+
+<h5>Syntax:</h5>
+<pre>
+ &lt;result&gt; = shr &lt;ty&gt; &lt;var1&gt;, ubyte &lt;var2&gt; <i>; yields {ty}:result</i>
+</pre>
+
+<h5>Overview:</h5>
+The '<tt>shr</tt>' instruction returns the first operand shifted to the right a specified number of bits.
+
+<h5>Arguments:</h5>
+The first argument to the '<tt>shr</tt>' instruction must be a value of <a href="#t_integral">integral</a> type. The second argument must be of type '<tt>ubyte</tt>'.<p>
+
+<h5>Semantics:</h5>
+... if the first argument is a <a href="#t_signed">signed</a> type, the most significant bit is duplicated in the newly freed bit positions. If the first argument is unsigned, zeros shall fill the empty positions...<p>
+
+<h5>Example:</h5>
+<pre>
+ &lt;result&gt; = shr int 4, ubyte %var <i>; yields {int}:result = 4 >> %var</i>
+ &lt;result&gt; = shr int 4, ubyte 1 <i>; yields {int}:result = 2</i>
+ &lt;result&gt; = shr int 4, ubyte 2 <i>; yields {int}:result = 1</i>
+ &lt;result&gt; = shr int 4, ubyte 3 <i>; yields {int}:result = 0</i>
+</pre>
+
+
+
+
+
+<!-- ======================================================================= -->
+</ul><table width="100%" bgcolor="#441188" border=0 cellpadding=4 cellspacing=0><tr><td>&nbsp;</td><td width="100%">&nbsp; <font color="#EEEEFF" face="Georgia,Palatino"><b>
+<a name="memoryops">Memory Access Operations
+</b></font></td></tr></table><ul>
+
+Accessing memory in SSA form is, well, sticky at best. This section describes how to read and write memory in LLVM.<p>
+
+
+<!-- _______________________________________________________________________ -->
+</ul><a name="i_malloc"><h4><hr size=0>'<tt>malloc</tt>' Instruction</h4><ul>
+
+<h5>Syntax:</h5>
+<pre>
+ &lt;result&gt; = malloc &lt;type&gt; <i>; yields { type *}:result</i>
+ &lt;result&gt; = malloc [&lt;type&gt;], uint &lt;NumElements&gt; <i>; yields {[type] *}:result</i>
+</pre>
+
+<h5>Overview:</h5>
+The '<tt>malloc</tt>' instruction allocates memory from the system heap and returns a pointer to it.<p>
+
+<h5>Arguments:</h5>
+
+There are two forms of the '<tt>malloc</tt>' instruction, one for allocating a variable of a fixed type, and one for allocating an array. The array form is used when the number of elements is not known until run time. If the upper bound is known at compile time, it is recommended that the first form be used with a <a href="#t_array_fixed">sized array type</a>.<p>
+
+'<tt>type</tt>' may be any type except for an <a href="#t_array_unsized">unsized array type</a>.<p>
+
+<h5>Semantics:</h5>
+Memory is allocated, a pointer is returned.<p>
+
+<h5>Example:</h5>
+<pre>
+  %array = malloc [4 x ubyte ]                    <i>; yields {[4 x ubyte]*}:array</i>
+
+ %size = <a href="#i_add">add</a> uint 2, 2 <i>; yields {uint}:size = uint 4</i>
+ %array1 = malloc [ubyte], uint 4 <i>; yields {[ubyte]*}:array1</i>
+ %array2 = malloc [ubyte], uint %size <i>; yields {[ubyte]*}:array2</i>
+</pre>
+
+
+<!-- _______________________________________________________________________ -->
+</ul><a name="i_free"><h4><hr size=0>'<tt>free</tt>' Instruction</h4><ul>
+
+<h5>Syntax:</h5>
+<pre>
+ free &lt;type&gt; &lt;value&gt; <i>; yields {void}</i>
+</pre>
+
+
+<h5>Overview:</h5>
+The '<tt>free</tt>' instruction returns memory back to the unused memory heap, to be reallocated in the future.<p>
+
+
+<h5>Arguments:</h5>
+
+'<tt>value</tt>' shall be a pointer value that points to a value that was allocated with the '<tt><a href="#i_malloc">malloc</a></tt>' instruction.<p>
+
+
+<h5>Semantics:</h5>
+Memory is available for use after this point. The contents of the '<tt>value</tt>' pointer are undefined after this instruction.<p>
+
+
+<h5>Example:</h5>
+<pre>
+ %array = <a href="#i_malloc">malloc</a> [4 x ubyte] <i>; yields {[4 x ubyte]*}:array</i>
+ free [4 x ubyte]* %array
+</pre>
+
+
+<!-- _______________________________________________________________________ -->
+</ul><a name="i_alloca"><h4><hr size=0>'<tt>alloca</tt>' Instruction</h4><ul>
+
+<h5>Syntax:</h5>
+<pre>
+ &lt;result&gt; = alloca &lt;type&gt; <i>; yields {type*}:result</i>
+ &lt;result&gt; = alloca [&lt;type&gt;], uint &lt;NumElements&gt; <i>; yields {[type] *}:result</i>
+</pre>
+
+<h5>Overview:</h5>
+
+The '<tt>alloca</tt>' instruction allocates memory on the stack frame of the currently executing method. The memory is live as long as the method does not return.<p>
+
+<h5>Arguments:</h5>
+There are two forms of the '<tt>alloca</tt>' instruction, one for allocating a variable of a fixed type, and one for allocating an array. The array form is used when the number of elements is not known until run time. If the upper bound is known at compile time, it is recommended that the first form be used with a <a href="#t_array_fixed">sized array type</a>.<p>
+
+'<tt>type</tt>' may be any type except for an <a href="#t_array_unsized">unsized array type</a>.<p>
+
+Note that a virtual machine may generate more efficient native code for a method if all of the fixed size '<tt>alloca</tt>' instructions live in the first basic block of that method.
+
+
+<h5>Semantics:</h5>
+Memory is allocated, a pointer is returned. '<tt>alloca</tt>'d memory is automatically released when the method returns. The '<tt>alloca</tt>' utility is how variable spills shall be implemented.<p>
+
+<h5>Example:</h5>
+<pre>
+ %ptr = alloca int <i>; yields {int*}:ptr</i>
+ %ptr = alloca [int], uint 4 <i>; yields {[int]*}:ptr</i>
+</pre>
+
+
+<!-- _______________________________________________________________________ -->
+</ul><a name="i_load"><h4><hr size=0>'<tt>load</tt>' Instruction</h4><ul>
+
+<h5>Syntax:</h5>
+<pre>
+ &lt;result&gt; = load &lt;ty&gt;* &lt;pointer&gt; <i>; yields {ty}:result</i>
+ &lt;result&gt; = load &lt;ty&gt;* &lt;arrayptr&gt;, uint &lt;idx&gt; <i>; yields {ty}:result</i>
+</pre>
+
+<h5>Overview:</h5>
+The '<tt>load</tt>' instruction is used to read from memory.<p>
+
+<h5>Arguments:</h5>
+
+There are two forms of the '<tt>load</tt>' instruction: one for reading from a general pointer, and one for reading from a pointer to an array.<p>
+
+In the first form, '<tt>&lt;ty&gt;</tt>' may be any pointer type. If it is a pointer to an array, the first (zeroth) element is read from. In the second form, '<tt>&lt;ty&gt;</tt>' must be a pointer to an array. No bounds checking is performed on array reads.<p>
+
+
+<h5>Semantics:</h5>
+...
+
+<h5>Examples:</h5>
+<pre>
+ %ptr = <a href="#i_alloca">alloca</a> int <i>; yields {int*}:ptr</i>
+ <a href="#i_store">store</a> int* %ptr, int 3 <i>; yields {void}</i>
+ %val = load int* %ptr <i>; yields {int}:val = int 3</i>
+
+ %array = <a href="#i_malloc">malloc</a> [4 x ubyte] <i>; yields {[4 x ubyte]*}:array</i>
+ <a href="#i_store">store</a> [4 x ubyte]* %array,
+ uint 4, ubyte 124
+ %val = load [4 x ubyte]* %array, uint 4 <i>; yields {ubyte}:val = ubyte 124</i>
+
+</pre>
+
+
+
+
+<!-- _______________________________________________________________________ -->
+</ul><a name="i_store"><h4><hr size=0>'<tt>store</tt>' Instruction</h4><ul>
+
+<h5>Syntax:</h5>
+<pre>
+ store &lt;ty&gt;* &lt;pointer&gt;, &lt;ty&gt; &lt;value&gt; <i>; yields {void}</i>
+ store &lt;ty&gt;* &lt;arrayptr&gt;, uint &lt;idx&gt;, &lt;ty&gt; &lt;value&gt; <i>; yields {void}</i>
+</pre>
+
+<h5>Overview:</h5>
+The '<tt>store</tt>' instruction is used to write to memory.<p>
+
+
+<h5>Arguments:</h5>
+There are two forms of the '<tt>store</tt>' instruction: one for writing through a general pointer, and one for writing through a pointer to an array.<p>
+
+In the first form, '<tt>&lt;ty&gt;</tt>' may be any pointer type. If it is a pointer to an array, the first (zeroth) element is written to. In the second form, '<tt>&lt;ty&gt;</tt>' must be a pointer to an array. No bounds checking is performed on array writes.<p>
+
+
+<h5>Semantics:</h5>
+...
+
+<h5>Example:</h5>
+<pre>
+ %ptr = <a href="#i_alloca">alloca</a> int <i>; yields {int*}:ptr</i>
+ store int* %ptr, int 3 <i>; yields {void}</i>
+ %val = <a href="#i_load">load</a> int* %ptr <i>; yields {int}:val = int 3</i>
+
+ %array = <a href="#i_malloc">malloc</a> [4 x ubyte] <i>; yields {[4 x ubyte]*}:array</i>
+ store [4 x ubyte]* %array,
+ uint 4, ubyte 124
+ %val = <a href="#i_load">load</a> [4 x ubyte]* %array, uint 4 <i>; yields {ubyte}:val = ubyte 124</i>
+</pre>
+
+
+
+
+<!-- _______________________________________________________________________ -->
+</ul><a name="i_getfieldptr"><h4><hr size=0>'<tt>getfieldptr</tt>' Instruction</h4><ul>
+
+<h5>Syntax:</h5>
+<pre>
+
+</pre>
+
+<h5>Overview:</h5>
+
+The '<tt>getfieldptr</tt>' instruction takes a structure pointer and an unsigned byte. It returns a pointer to the specified element, of the correct type. At the implementation level, this would be compiled down to the addition of a constant offset to the pointer.<p>
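+
+The exact syntax has not been settled; the following is only a rough sketch of how a '<tt>getfieldptr</tt>' might be written. The structure type, operand order, and field numbering used here are assumptions, not final syntax:
+<pre>
+  %S = <a href="#i_malloc">malloc</a> { int, uint }                  <i>; yields {{int, uint}*}:%S</i>
+  %fptr = getfieldptr { int, uint }* %S, ubyte 1      <i>; hypothetical: yields {uint*}:%fptr</i>
+  <a href="#i_store">store</a> uint* %fptr, uint 42                 <i>; yields {void}</i>
+</pre>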
+
+<h5>Arguments:</h5>
+
+
+<h5>Semantics:</h5>
+
+
+<h5>Example:</h5>
+<pre>
+
+</pre>
+
+
+
+<!-- ======================================================================= -->
+</ul><table width="100%" bgcolor="#441188" border=0 cellpadding=4 cellspacing=0><tr><td>&nbsp;</td><td width="100%">&nbsp; <font color="#EEEEFF" face="Georgia,Palatino"><b>
+<a name="otherops">Other Operations
+</b></font></td></tr></table><ul>
+
+The instructions in this category are the "miscellaneous" instructions that defy better classification.<p>
+
+
+<!-- _______________________________________________________________________ -->
+</ul><a name="i_call"><h4><hr size=0>'<tt>call</tt>' Instruction</h4><ul>
+
+<h5>Syntax:</h5>
+<pre>
+
+</pre>
+
+<h5>Overview:</h5>
+
+
+<h5>Arguments:</h5>
+
+
+<h5>Semantics:</h5>
+
+
+<h5>Example:</h5>
+<pre>
+ %retval = call int %test(int %argc)
+</pre>
+
+
+<!-- _______________________________________________________________________ --></ul><a name="i_icall"><h3><hr size=0>'<tt>icall</tt>' Instruction</h3><ul>
+
+Indirect calls are desperately needed to implement virtual function tables (C++, Java) and function pointers (C, C++, ...).<p>
+
+A new instruction <tt>icall</tt> or similar should be introduced to represent an indirect call.<p>
+
+Example:
+<pre>
+ %retval = icall int %funcptr(int %arg1) <i>; yields {int}:%retval</i>
+</pre>
+
+
+
+<!-- _______________________________________________________________________ -->
+</ul><a name="i_phi"><h4><hr size=0>'<tt>phi</tt>' Instruction</h4><ul>
+
+<h5>Syntax:</h5>
+<pre>
+</pre>
+
+<h5>Overview:</h5>
+
+
+<h5>Arguments:</h5>
+
+
+<h5>Semantics:</h5>
+
+
+<h5>Example:</h5>
+<pre>
+</pre>
+
+
+<!-- ======================================================================= -->
+</ul><table width="100%" bgcolor="#441188" border=0 cellpadding=4 cellspacing=0><tr><td>&nbsp;</td><td width="100%">&nbsp; <font color="#EEEEFF" face="Georgia,Palatino"><b>
+<a name="builtinfunc">Builtin Functions
+</b></font></td></tr></table><ul>
+
+<b>Notice:</b> Preliminary idea!<p>
+
+Builtin functions are very similar to normal functions, except they are defined by the implementation. Invocations of these functions are very similar to method invocations, except that the syntax is a little less verbose.<p>
+
+Builtin functions are useful to implement semi-high level ideas like a '<tt>min</tt>' or '<tt>max</tt>' operation that can have important properties when doing program analysis. For example:
+
+<ul>
+<li>Some optimizations can make use of identities defined over the functions,
+    for example a parallelizing compiler could make use of '<tt>min</tt>'
+    identities to parallelize a loop.
+<li>Builtin functions would have polymorphic types, where normal method calls
+ may only have a single type.
+<li>Builtin functions would be known to not have side effects, simplifying
+ analysis over straight method calls.
+<li>The syntax of a builtin invocation is cleaner than the syntax of the
+    '<a href="#i_call"><tt>call</tt></a>' instruction (a very minor point).
+</ul>
+
+Because these invocations are explicit in the representation, the runtime can choose to implement these builtin functions any way that it wants, including:
+
+<ul>
+<li>Inlining the code directly into the invocation
+<li>Implementing the functions in some sort of Runtime class, converting the
+    invocation to a standard method call.
+<li>Implementing the functions in some sort of Runtime class, and performing
+    standard inlining optimizations on them.
+</ul>
+
+Note that these builtins do not use quoted identifiers: the name of the builtin effectively becomes an identifier in the language.<p>
+
+Example:
+<pre>
+ ; Example of a normal method call
+ %maximum = call int %maximum(int %arg1, int %arg2) <i>; yields {int}:%maximum</i>
+
+ ; Examples of potential builtin functions
+ %max = max(int %arg1, int %arg2) <i>; yields {int}:%max</i>
+ %min = min(int %arg1, int %arg2) <i>; yields {int}:%min</i>
+ %sin = sin(double %arg) <i>; yields {double}:%sin</i>
+ %cos = cos(double %arg) <i>; yields {double}:%cos</i>
+
+  ; Show that builtins are polymorphic, like instructions
+ %max = max(float %arg1, float %arg2) <i>; yields {float}:%max</i>
+ %cos = cos(float %arg) <i>; yields {float}:%cos</i>
+</pre>
+
+The '<tt>maximum</tt>' vs '<tt>max</tt>' example illustrates the difference in calling semantics between a '<a href="#i_call"><tt>call</tt></a>' instruction and a builtin function invocation. Notice that the '<tt>maximum</tt>' example assumes that the method is defined locally to the caller.<p>
+
+
+
+
+<!-- *********************************************************************** -->
+</ul><table width="100%" bgcolor="#330077" border=0 cellpadding=4 cellspacing=0><tr><td align=center><font color="#EEEEFF" size=+2 face="Georgia,Palatino"><b>
+<a name="todo">TODO List
+</b></font></td></tr></table><ul>
+<!-- *********************************************************************** -->
+
+This list of random topics includes things that will <b>need</b> to be addressed before LLVM may be used to implement a Java-like language. Right now, it is pretty much useless for any language, given the unavailability of structure types.<p>
+
+<!-- _______________________________________________________________________ -->
+</ul><a name="synchronization"><h3><hr size=0>Synchronization Instructions</h3><ul>
+
+We will need some type of synchronization instructions to be able to implement Java well. The way I currently envision doing this is to introduce a '<tt>lock</tt>' type, and then add two operations (builtins or instructions) to lock and unlock the lock.<p>
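+
+As a purely speculative sketch (neither the '<tt>lock</tt>' type nor the '<tt>lock</tt>'/'<tt>unlock</tt>' operations below exist in the language yet), this might look something like:
+<pre>
+  %mutex = <a href="#i_alloca">alloca</a> lock                   <i>; hypothetical: yields {lock*}:%mutex</i>
+  lock lock* %mutex                                   <i>; hypothetical: acquire the lock</i>
+  ; ... critical section ...
+  unlock lock* %mutex                                 <i>; hypothetical: release the lock</i>
+</pre>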
+
+
+<!-- *********************************************************************** -->
+</ul><table width="100%" bgcolor="#330077" border=0 cellpadding=4 cellspacing=0><tr><td align=center><font color="#EEEEFF" size=+2 face="Georgia,Palatino"><b>
+<a name="extensions">Possible Extensions
+</b></font></td></tr></table><ul>
+<!-- *********************************************************************** -->
+
+These extensions are distinct from the TODO list, as they are mostly "interesting" ideas that could be implemented in the future by someone so motivated. They are not directly required to get <a href="#rw_java">Java</a>-like languages working.<p>
+
+<!-- _______________________________________________________________________ -->
+</ul><a name="i_tailcall"><h3><hr size=0>'<tt>tailcall</tt>' Instruction</h3><ul>
+
+This could be useful. Who knows. '.net' does it, but is the optimization really worth the extra hassle? Using strong typing would make this trivial to implement, and a runtime could always fall back to converting it into a normal '<a href="#i_call"><tt>call</tt></a>' instruction.<p>
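+
+If it were added, a tail call would presumably be written just like a normal call; the following line is purely illustrative syntax, not part of the language:
+<pre>
+  %res = tailcall int %helper(int %n, int %acc)       <i>; hypothetical: reuses the caller's stack frame</i>
+</pre>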
+
+
+<!-- _______________________________________________________________________ -->
+</ul><a name="globalvars"><h3><hr size=0>Global Variables</h3><ul>
+
+In order to represent programs written in languages like C, we need to be able to support variables at the module (global) scope. Perhaps they should even be written outside of the module definition. Maybe global functions should be handled like this as well.<p>
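+
+One possible notation, given purely as an illustration (no such construct exists in the language yet), would be to declare the variable at module scope and access it through a pointer:
+<pre>
+  %Counter = global int 0                             <i>; hypothetical module-scope variable</i>
+  %val = <a href="#i_load">load</a> int* %Counter                 <i>; yields {int}:%val</i>
+</pre>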
+
+
+<!-- _______________________________________________________________________ -->
+</ul><a name="explicitparrellelism"><h3><hr size=0>Explicit Parrellelism</h3><ul>
+
+With the rise of massively parallel architectures (like <a href="#rw_ia64">the IA64 architecture</a>, multithreaded CPU cores, and SIMD data sets), it is becoming increasingly important to extract as much ILP from a code stream as possible. It would be interesting to research encoding methods that can explicitly represent this. One straightforward way to do this would be to introduce a "stop" instruction that is equivalent to the IA64 stop bit.<p>
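+
+As a rough sketch of the idea (the '<tt>stop</tt>' instruction below is hypothetical), independent instructions could be grouped and terminated by a stop marker:
+<pre>
+  %a = <a href="#i_add">add</a> int %x, %y                       <i>; these two instructions are independent...</i>
+  %b = mul int %p, %q                                 <i>; ...and could issue in the same group</i>
+  stop                                                <i>; hypothetical: end of the parallel issue group</i>
+  %c = <a href="#i_add">add</a> int %a, %b                       <i>; depends on the results computed above</i>
+</pre>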
+
+
+
+<!-- *********************************************************************** -->
+</ul><table width="100%" bgcolor="#330077" border=0 cellpadding=4 cellspacing=0><tr><td align=center><font color="#EEEEFF" size=+2 face="Georgia,Palatino"><b>
+<a name="related">Related Work
+</b></font></td></tr></table><ul>
+<!-- *********************************************************************** -->
+
+
+Codesigned virtual machines.<p>
+
+<dl>
+<a name="rw_safetsa">
+<dt>SafeTSA
+<DD>Description here<p>
+
+<a name="rw_java">
+<dt><a href="http://www.javasoft.com">Java</a>
+<DD>Description here<p>
+
+<a name="rw_net">
+<dt><a href="http://www.microsoft.com/net">Microsoft .net</a>
+<DD>Description here<p>
+
+<a name="rw_gccrtl">
+<dt><a href="http://www.math.umn.edu/systems_guide/gcc-2.95.1/gcc_15.html">GNU RTL Intermediate Representation</a>
+<DD>Description here<p>
+
+<a name="rw_ia64">
+<dt><a href="http://developer.intel.com/design/ia-64/index.htm">IA64 Architecture &amp; Instruction Set</a>
+<DD>Description here<p>
+
+<a name="rw_mmix">
+<dt><a href="http://www-cs-faculty.stanford.edu/~knuth/mmix-news.html">MMIX Instruction Set</a>
+<DD>Description here<p>
+
+<a name="rw_stroustrup">
+<dt><a href="http://www.research.att.com/~bs/devXinterview.html">"Interview With Bjarne Stroustrup"</a>
+<DD>This interview influenced the design and thought process behind LLVM in several ways, most notably the way that derived types are written in text format. See the question that starts with "you defined the C declarator syntax as an experiment that failed".<p>
+</dl>
+
+<!-- _______________________________________________________________________ -->
+</ul><a name="rw_vectorization"><h3><hr size=0>Vectorized Architectures</h3><ul>
+
+<dl>
+<a name="rw_intel_simd">
+<dt>Intel MMX, MMX2, SSE, SSE2
+<DD>Description here<p>
+
+<a name="rw_amd_simd">
+<dt><a href="http://www.nondot.org/~sabre/os/H1ChipFeatures/3DNow!TechnologyManual.pdf">AMD 3Dnow!, 3Dnow! 2</a>
+<DD>Description here<p>
+
+<a name="rw_sun_simd">
+<dt><a href="http://www.nondot.org/~sabre/os/H1ChipFeatures/VISInstructionSetUsersManual.pdf">Sun VIS ISA</a>
+<DD>Description here<p>
+
+
+</dl>
+
+more...
+
+<!-- *********************************************************************** -->
+</ul>
+<!-- *********************************************************************** -->
+
+
+<hr>
+<font size=-1>
+<address><a href="mailto:sabre@nondot.org">Chris Lattner</a></address>
+<!-- Created: Tue Jan 23 15:19:28 CST 2001 -->
+<!-- hhmts start -->
+Last modified: Thu May 31 17:36:39 CDT 2001
+<!-- hhmts end -->
+</font>
+</body></html>