diff options
author | Andrew Trick <atrick@apple.com> | 2014-04-07 21:29:22 +0000 |
---|---|---|
committer | Andrew Trick <atrick@apple.com> | 2014-04-07 21:29:22 +0000 |
commit | 7f7263354767fb201c53da88b83d1ec4e3511432 (patch) | |
tree | 650e3024d1ee0a36cff93a5ba0f8d3811185d02e /lib/CodeGen | |
parent | 1d8c7eb225672b08422089c12c0badf548882ff3 (diff) | |
download | llvm-7f7263354767fb201c53da88b83d1ec4e3511432.tar.gz llvm-7f7263354767fb201c53da88b83d1ec4e3511432.tar.bz2 llvm-7f7263354767fb201c53da88b83d1ec4e3511432.tar.xz |
Put a limit on ScheduleDAGSDNodes::ClusterNeighboringLoads to avoid blowing up compile time.
Fixes PR16365 - Extremely slow compilation in -O1 and -O2.
The SD scheduler has a quadratic implementation of load clustering
which absolutely blows up compile time for large blocks with constant
pool loads. The MI scheduler has a better implementation of load
clustering. However, we have not done the work yet to completely
eliminate the SD scheduler. Some benchmarks still seem to benefit from
early load clustering, although maybe by chance.
As an intermediate term fix, I just put a nice limit on the number of
DAG users to search before finding a match. With this limit there are no
binary differences in the LLVM test suite, and the PR16365 test case
does not suffer any compile time impact from this routine.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@205738 91177308-0d34-0410-b5e6-96231b3b80d8
Diffstat (limited to 'lib/CodeGen')
-rw-r--r-- | lib/CodeGen/SelectionDAG/ScheduleDAGSDNodes.cpp | 7 |
1 files changed, 6 insertions, 1 deletions
diff --git a/lib/CodeGen/SelectionDAG/ScheduleDAGSDNodes.cpp b/lib/CodeGen/SelectionDAG/ScheduleDAGSDNodes.cpp index 5639894d09..ae9a580184 100644 --- a/lib/CodeGen/SelectionDAG/ScheduleDAGSDNodes.cpp +++ b/lib/CodeGen/SelectionDAG/ScheduleDAGSDNodes.cpp @@ -219,8 +219,11 @@ void ScheduleDAGSDNodes::ClusterNeighboringLoads(SDNode *Node) { DenseMap<long long, SDNode*> O2SMap; // Map from offset to SDNode. bool Cluster = false; SDNode *Base = Node; + // This algorithm requires a reasonably low use count before finding a match + // to avoid uselessly blowing up compile time in large blocks. + unsigned UseCount = 0; for (SDNode::use_iterator I = Chain->use_begin(), E = Chain->use_end(); - I != E; ++I) { + I != E && UseCount < 100; ++I, ++UseCount) { SDNode *User = *I; if (User == Node || !Visited.insert(User)) continue; @@ -237,6 +240,8 @@ void ScheduleDAGSDNodes::ClusterNeighboringLoads(SDNode *Node) { if (Offset2 < Offset1) Base = User; Cluster = true; + // Reset UseCount to allow more matches. + UseCount = 0; } if (!Cluster) |