summaryrefslogtreecommitdiff
path: root/lib/Target/README.txt
diff options
context:
space:
mode:
authorChris Lattner <sabre@nondot.org>2008-08-10 01:14:08 +0000
committerChris Lattner <sabre@nondot.org>2008-08-10 01:14:08 +0000
commit26e150f361d7643c3f1501962a5fc44f42f037eb (patch)
tree402fa285b8ac9c4f7da501eb8a3e6c66c9f27951 /lib/Target/README.txt
parentc90b866797d95357cb5051574d1402620003cf7d (diff)
downloadllvm-26e150f361d7643c3f1501962a5fc44f42f037eb.tar.gz
llvm-26e150f361d7643c3f1501962a5fc44f42f037eb.tar.bz2
llvm-26e150f361d7643c3f1501962a5fc44f42f037eb.tar.xz
move some more stuff out of my email into readme.txt
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@54603 91177308-0d34-0410-b5e6-96231b3b80d8
Diffstat (limited to 'lib/Target/README.txt')
-rw-r--r--lib/Target/README.txt73
1 files changed, 73 insertions, 0 deletions
diff --git a/lib/Target/README.txt b/lib/Target/README.txt
index fe11900353..1ee316c477 100644
--- a/lib/Target/README.txt
+++ b/lib/Target/README.txt
@@ -809,3 +809,76 @@ multiplication trees.
//===---------------------------------------------------------------------===//
+We generate a horrible libcall for llvm.powi. For example, we compile:
+
+#include <cmath>
+double f(double a) { return std::pow(a, 4); }
+
+into:
+
+__Z1fd:
+ subl $12, %esp
+ movsd 16(%esp), %xmm0
+ movsd %xmm0, (%esp)
+ movl $4, 8(%esp)
+ call L___powidf2$stub
+ addl $12, %esp
+ ret
+
+GCC produces:
+
+__Z1fd:
+ subl $12, %esp
+ movsd 16(%esp), %xmm0
+ mulsd %xmm0, %xmm0
+ mulsd %xmm0, %xmm0
+ movsd %xmm0, (%esp)
+ fldl (%esp)
+ addl $12, %esp
+ ret
+
+//===---------------------------------------------------------------------===//
+
+We compile this program: (from GCC PR11680)
+http://gcc.gnu.org/bugzilla/attachment.cgi?id=4487
+
+Into code that runs the same speed in fast/slow modes, but both modes run 2x
+slower than when compile with GCC (either 4.0 or 4.2):
+
+$ llvm-g++ perf.cpp -O3 -fno-exceptions
+$ time ./a.out fast
+1.821u 0.003s 0:01.82 100.0% 0+0k 0+0io 0pf+0w
+
+$ g++ perf.cpp -O3 -fno-exceptions
+$ time ./a.out fast
+0.821u 0.001s 0:00.82 100.0% 0+0k 0+0io 0pf+0w
+
+It looks like we are making the same inlining decisions, so this may be raw
+codegen badness or something else (haven't investigated).
+
+//===---------------------------------------------------------------------===//
+
+We miss some instcombines for stuff like this:
+void bar (void);
+void foo (unsigned int a) {
+ /* This one is equivalent to a >= (3 << 2). */
+ if ((a >> 2) >= 3)
+ bar ();
+}
+
+A few other related ones are in GCC PR14753.
+
+//===---------------------------------------------------------------------===//
+
+Divisibility by constant can be simplified (according to GCC PR12849) from
+being a mulhi to being a mul lo (cheaper). Testcase:
+
+void bar(unsigned n) {
+ if (n % 3 == 0)
+ true();
+}
+
+I think this basically amounts to a dag combine to simplify comparisons against
+multiply hi's into a comparison against the mullo.
+
+//===---------------------------------------------------------------------===//