author Chris Lattner <sabre@nondot.org> 2010-04-21 06:42:24 +0000
committer Chris Lattner <sabre@nondot.org> 2010-04-21 06:42:24 +0000
commit a54c1f70b8bffa78316d1447756d5ba400bda895
tree dc066f47fe0b0c013f3278d1b0d12f92e9c10a01 /docs
parent 450a31edde46234dff2a681006878a853efc1027
final hacking for tonight, still more to go.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@101995 91177308-0d34-0410-b5e6-96231b3b80d8
Diffstat (limited to 'docs')
-rw-r--r-- docs/ReleaseNotes.html 116
1 files changed, 58 insertions, 58 deletions
diff --git a/docs/ReleaseNotes.html b/docs/ReleaseNotes.html
index 9b65c6f3d5..129a4057c7 100644
--- a/docs/ReleaseNotes.html
+++ b/docs/ReleaseNotes.html
@@ -501,28 +501,48 @@ release includes a few major enhancements and additions to the optimizers:</p>
<ul>
-<li>...</li>
-Inliner reuses arrays allocas when inlining multiple callers to reduce stack usage.
-Optimal Edge Profiling?
-Instcombine is now a library, has its own IRBuilder to simplify itself.
-Better code size analysis in loop unswitch, inliner code split out to a new
- CodeMetrics class for reuse.
-Many changes to the pass ordering for improved optimization effectiveness.
-BasicAA improved to be less dependent on "type safe" pointers, it can now look
- through bitcasts more aggressively.
-GVN PHI Translation improvements. blog post: http://blog.llvm.org/2009/12/advanced-topics-in-redundant-load.html
-New SCEV AA pass: -scev-aa
-Target data now has notion of 'native' integer data types which optimizations can use.
-Opt now works conservatively if no target data is set (is this fully working?)
-New Analysis/InstructionSimplify.h interface for simplifying instructions that don't exist.
-Jump threading is now much more aggressive at simplifying correlated
+<li>Inliner reuses arrays allocas when inlining multiple callers to reduce stack usage.</li>
+<li>Instcombine is now a library, has its own IRBuilder to simplify itself.</li>
+<li>Better code size analysis in loop unswitch, inliner code split out to a new
+ CodeMetrics class for reuse.</li>
+<li>Many changes to the pass ordering for improved optimization
+ effectiveness.</li>
+<li>BasicAA improved to be less dependent on "type safe" pointers, it can now look
+ through bitcasts more aggressively.</li>
+<li>GVN PHI Translation improvements. blog post: http://blog.llvm.org/2009/12/advanced-topics-in-redundant-load.html</li>
+<li>New SCEV AA pass: -scev-aa</li>
+<li>Target data now has notion of 'native' integer data types which optimizations can use.</li>
+<li>Opt now works conservatively if no target data is set (is this fully working?)</li>
+<li>New Analysis/InstructionSimplify.h interface for simplifying instructions that don't exist.</li>
+<li>Jump threading is now much more aggressive at simplifying correlated
conditionals and threading blocks with otherwise complex logic. CondProp pass
- removed (functionality merged into jump threading).
-New SSAUpdater and MachineSSAUpdater classes for unstructured ssa updating,
+ removed (functionality merged into jump threading).</li>
+<li>New SSAUpdater and MachineSSAUpdater classes for unstructured ssa updating,
changed jump threading, GVN, etc to use it which simplified them and speed
- them up.
+ them up.</li>
+<li>
+The Optimal Edge Profiling implementation in 2.6 was more a proof of
+concept. The current implementation (the one that will go into 2.7) is
+now stable and (as far as my tests go) bug free.
+
+The profiling with instrumentation via "opt" and analysis via the tool
+"llvm-prof" should Work As Expected (TM).
+
+Two things are missing:
+
+*) Still missing is the modification of all -std-compile-opt passes to
+update the profiling information according to the changes made to the
+CFG, I'm planning to do this after my master thesis is finished. This
+will enable all passes to use the ProfileInfo if available and base
+decisions on that information.
+
+*) GCC has the options "-pg", "-fprofile-arcs" and "--coverage" that
+insert profiling code and "-fprofile-use" to use them the next time
+during compilation. I guess this options should also work properly in
+llvm-gcc and clang?</li>
+
</ul>
</div>
@@ -568,25 +588,20 @@ it run faster:</p>
<ul>
<li>New instruction selector [blog post?].</li>
-
-Code generator MC'ized except for debug info and EH.
-
-New CodeGen Level CSE
-Combiner-AA improvements, why not on by default?
-Pre-regalloc tail duplication
-New LSR with "full strength reduction" mode. Description?
-Codegen level OptimizeExtsPass pass, takes advantage of x86 subregs.
-Support for the GCC option -fno-schedule-insns
-non-temporal load/store
-MachineSSAUpdater.h
-X86 and XCore supports returning arbitrary return values, returning too many values is
- supported by returning through a hidden pointer.
-verbose-asm now produces information about spill slots and loop nests
-GHC Haskell ABI / calling conv support.
-Many improvements to debug info
-
-
-<li>...</li>
+<li>New LSR with "full strength reduction" mode. Description?</li>
+<li>Code generator MC'ized except for debug info and EH.</li>
+<li>New CodeGen Level CSE</li>
+<li>Combiner-AA improvements, why not on by default?</li>
+<li>Pre-regalloc tail duplication</li>
+<li>Codegen level OptimizeExtsPass pass, takes advantage of x86 subregs. </li>
+<li>Support for the GCC option -fno-schedule-insns</li>
+<li>Non-temporal load/store, only implemented on X86, see LangRef.html#i_load.</li>
+<li>MachineSSAUpdater.h</li>
+<li>X86 and XCore supports returning arbitrary return values, returning too many values is
+ supported by returning through a hidden pointer.</li>
+<li>verbose-asm now produces information about spill slots and loop nests</li>
+<li>GHC Haskell ABI / calling conv support.</li>
+<li>Many improvements to debug info</li>
</ul>
</div>
@@ -600,10 +615,13 @@ Many improvements to debug info
</p>
<ul>
+<li>The X86 backend now optimizes tails calls much more aggressively for
+ functions that use the standard C calling convention.</li>
+<li>The X86 backend now models scalar SSE registers as subregs of the SSE vector
+ registers, making the code generator more aggressive in cases where scalars
+ and vector types are mixed.</li>
-<li>PostRA scheduler for X86?</li>
-<li>x86 sibcall / tailcall optimization in CCC mode.</li>
-<li>X86: XMM subreg modeling for extraction of the low element.</li>
+<li>PostRA scheduler for X86? FIXME: is this on by default in 2.7?</li>
</ul>
@@ -642,21 +660,6 @@ href="http://blog.llvm.org/2010/04/arm-advanced-simd-neon-intrinsics-and.html">
<!--=========================================================================-->
<div class="doc_subsection">
-<a name="OtherTarget">Other Target Specific Improvements</a>
-</div>
-
-<div class="doc_text">
-<p>New features of other targets include:
-</p>
-
-<ul>
-<li>...</li>
-</ul>
-
-</div>
-
-<!--=========================================================================-->
-<div class="doc_subsection">
<a name="newapis">New Useful APIs</a>
</div>
@@ -917,9 +920,6 @@ compilation, and lacks support for debug information.</li>
<div class="doc_text">
<ul>
-<li>Support for the Advanced SIMD (Neon) instruction set is still incomplete
-and not well tested. Some features may not work at all, and the code quality
-may be poor in some cases.</li>
<li>Thumb mode works only on ARMv6 or higher processors. On sub-ARMv6
processors, thumb programs can crash or produce wrong
results (<a href="http://llvm.org/PR1388">PR1388</a>).</li>