Diffstat (limited to 'docs/HistoricalNotes/2000-11-18-EarlyDesignIdeasResp.txt')
-rw-r--r-- docs/HistoricalNotes/2000-11-18-EarlyDesignIdeasResp.txt | 199
1 files changed, 199 insertions, 0 deletions
diff --git a/docs/HistoricalNotes/2000-11-18-EarlyDesignIdeasResp.txt b/docs/HistoricalNotes/2000-11-18-EarlyDesignIdeasResp.txt
new file mode 100644
index 0000000000..1c725f5aa7
--- /dev/null
+++ b/docs/HistoricalNotes/2000-11-18-EarlyDesignIdeasResp.txt
@@ -0,0 +1,199 @@
+Date: Sun, 19 Nov 2000 16:23:57 -0600 (CST)
+From: Chris Lattner <sabre@nondot.org>
+To: Vikram Adve <vadve@cs.uiuc.edu>
+Subject: Re: a few thoughts
+
+Okay... here are a few of my thoughts on this (it's good to know that we
+think so alike!):
+
+> 1. We need to be clear on our goals for the VM. Do we want to emphasize
+> portability and safety like the Java VM? Or shall we focus on the
+> architecture interface first (i.e., consider the code generation and
+> processor issues), since the architecture interface question is also
+> important for portable Java-type VMs?
+
+I foresee the architecture looking kinda like this: (which is completely
+subject to change)
+
+1. The VM code is NOT guaranteed safe in a Java sense. Doing so makes it
+ basically impossible to support C-like languages. Besides that,
+ certifying a register-based language as safe at run time would be a
+ pretty expensive operation to have to do. Additionally, we would like
+ to be able to statically eliminate many bounds checks in Java
+ programs... for example.
+
+ 2. Instead, we can do the following (eventually):
+ * Java bytecode is used as our "safe" representation (to avoid
+ reinventing something that we don't add much value to). When the
+ user chooses to execute Java bytecodes directly (i.e., not
+ precompiled) the runtime compiler can do some very simple
+ transformations (JIT style) to convert it into valid input for our
+ VM. Performance is not wonderful, but it works right.
+ * The file is scheduled to be compiled (rigorously) at a later
+ time. This could be done by some background process or by a second
+ processor in the system during idle time or something...
+ * To keep things "safe", i.e., to enforce a sandbox on Java/foreign code,
+ we could sign the generated VM code with a host-specific private
+ key. Then before the code is executed/loaded, we can check to see if
+ the trusted compiler generated the code. This would be much quicker
+ than having to validate consistency (especially if bounds checks have
+ been removed, for example)
+
+> This is important because the audiences for these two goals are very
+> different. Architects and many compiler people care much more about
+> the second question. The Java compiler and OS community care much more
+> about the first one.
+
+3. By focusing on a more low-level virtual machine, we have much more room
+ for value add. The nice safe "sandbox" VM can be provided as a layer
+ on top of it. It also lets us focus on the more interesting
+ compiler-related projects.
+
+> 2. Design issues to consider (an initial list that we should continue
+> to modify). Note that I'm not trying to suggest actual solutions here,
+> but just various directions we can pursue:
+
+Understood. :)
+
+> a. A single-assignment VM, which we've both already been thinking
+> about.
+
+Yup, I think that this makes a lot of sense. I am still intrigued,
+however, by the prospect of a minimally allocated VM representation... I
+think that it could have definite advantages for certain applications
+(think very small machines, like PDAs). I don't, however, think that our
+initial implementations should focus on this. :)
+
+Here are some other auxiliary goals that I think we should consider:
+
+1. Primary goal: Support a high performance dynamic compilation
+ system. This means that we have an "ideal" division of labor between
+ the runtime and static compilers. Of course, the other goals of the
+ system somewhat reduce the importance of this point (e.g., portability
+ reduces performance, but hopefully not much)
+2. Portability to different processors. Since we are most familiar with
+ x86 and Solaris, I think that these two are excellent candidates when
+ we get that far...
+3. Support for all languages & styles of programming (general purpose
+ VM). This is the point that disallows Java-style bytecodes, where all
+ array refs are checked for bounds, etc...
+4. Support linking between different language families. For example, call
+ C functions directly from Java without using the nasty/slow/gross JNI
+ layer. This involves several subpoints:
+ A. Support for languages that require garbage collectors and integration
+ with languages that don't. As a base point, we could insist on
+ always using a conservative GC, but implement free as a no-op, e.g.
+
+> b. A strongly-typed VM. One question is do we need the types to be
+> explicitly declared or should they be inferred by the dynamic
+> compiler?
+
+ B. This is kind of similar to another idea that I have: make OOP
+ constructs (virtual function tables, class hierarchies, etc) explicit
+ in the VM representation. I believe that the number of additional
+ constructs would be fairly low, but would give us lots of important
+ information... something else that would/could be important is to
+ have exceptions as first-class types so that they would be handled in
+ a uniform way for the entire VM... so that C functions can call Java
+ functions for example...
+
+> c. How do we get more high-level information into the VM while keeping
+> to a low-level VM design?
+> o Explicit array references as operands? An alternative is
+> to have just an array type, and let the index computations be
+> separate 3-operand instructions.
+
+ C. In the model I was thinking of (subject to change of course), we
+ would just have an array type (distinct from the pointer
+ types). This would allow us to have arbitrarily complex index
+ expressions, while still distinguishing "load" from "Array load",
+ for example. Perhaps also, switch jump tables would be first-class
+ types as well? This would allow better reasoning about the program.
+
+5. Support dynamic loading of code from various sources. Already
+ mentioned above was the example of loading Java bytecodes, but we want
+ to support dynamic loading of VM code as well. This makes the job of
+ the runtime compiler much more interesting: it can do interprocedural
+ optimizations that the static compiler can't do, because it doesn't
+ have all of the required information (for example, inlining from
+ shared libraries, etc...)
+
+6. Define a set of generally useful annotations to add to the VM
+ representation. For example, a function can be analysed to see if it
+ has any side effects when run... also, the MOD/REF sets could be
+ calculated, etc... we would have to determine what is reasonable. This
+ would generally be used to make IP optimizations cheaper for the
+ runtime compiler...
+
+> o Explicit instructions to handle aliasing, e.g.s:
+> -- an instruction to say "I speculate that these two values are not
+> aliased, but check at runtime", like speculative execution in
+> EPIC?
+> -- or an instruction to check whether two values are aliased and
+> execute different code depending on the answer, somewhat like
+> predicated code in EPIC
+
+These are also very good points... if this can be determined at compile
+time. I think that an EPIC style of representation (not the instruction
+packing, just the information presented) could be a very interesting model
+to use... more later...
+
+> o (This one is a difficult but powerful idea.)
+> A "thread-id" field on every instruction that allows the static
+> compiler to generate a set of parallel threads, and then have
+> the runtime compiler and hardware do what they please with it.
+> This has very powerful uses, but thread-id on every instruction
+> is expensive in terms of instruction size and code size.
+> We would need to compactly encode it somehow.
+
+Yes yes yes! :) I think it would be *VERY* useful to include this kind
+of information (which EPIC architectures *implicitly* encode). The trend
+that we are seeing supports this greatly:
+
+1. Commodity processors are getting massive SIMD support:
+ * Intel/AMD MMX/MMX2
+ * AMD's 3DNow!
+ * Intel's SSE/SSE2
+ * Sun's VIS
+2. SMP is becoming much more common, especially in the server space.
+3. Multiple processors on a die are right around the corner.
+
+If nothing else, not designing this in would severely limit our future
+expansion of the project...
+
+> Also, this will require some reading on at least two other
+> projects:
+> -- Multiscalar architecture from Wisconsin
+> -- Simultaneous multithreading architecture from Washington
+>
+> o Or forget all this and stick to a traditional instruction set?
+
+Heh... :) Well, from a pure research point of view, it is almost more
+attractive to go with the most extreme/different ISA possible. On one axis
+you get safety and conservatism, and on the other you get degree of
+influence that the results have. Of course the problem with pure research
+is that often times there is no concrete product of the research... :)
+
+> BTW, on an unrelated note, after the meeting yesterday, I did remember
+> that you had suggested doing instruction scheduling on SSA form instead
+> of a dependence DAG earlier in the semester. When we talked about
+> it yesterday, I didn't remember where the idea had come from but I
+> remembered later. Just giving credit where it's due...
+
+:) Thanks.
+
+> Perhaps you can save the above as a file under RCS so you and I can
+> continue to expand on this.
+
+I think it makes sense to do so when we get our ideas more formalized and
+bounce it back and forth a couple of times... then I'll do a more formal
+writeup of our goals and ideas. Obviously our first implementation will
+not want to do all of the stuff that I pointed out above... but we will
+want to design the project so that we do not artificially limit ourselves
+at some time in the future...
+
+Anyways, let me know what you think about these ideas... and if they sound
+reasonable...
+
+-Chris
+