summaryrefslogtreecommitdiff
path: root/docs/ReleaseNotes.html
blob: 290fbf6908354b242a8ae8c7a94b387182572eb4 (plain)
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
395
396
397
398
399
400
401
402
403
404
405
406
407
408
409
410
411
412
413
414
415
416
417
418
419
420
421
422
423
424
425
426
427
428
429
430
431
432
433
434
435
436
437
438
439
440
441
442
443
444
445
446
447
448
449
450
451
452
453
454
455
456
457
458
459
460
461
462
463
464
465
466
467
468
469
470
471
472
473
474
475
476
477
478
479
480
481
482
483
484
485
486
487
488
489
490
491
492
493
494
495
496
497
498
499
500
501
502
503
504
505
506
507
508
509
510
511
512
513
514
515
516
517
518
519
520
521
522
523
524
525
526
527
528
529
530
531
532
533
534
535
536
537
538
539
540
541
542
543
544
545
546
547
548
549
550
551
552
553
554
555
556
557
558
559
560
561
562
563
564
565
566
567
568
569
570
571
572
573
574
575
576
577
578
579
580
581
582
583
584
585
586
587
588
589
590
591
592
593
594
595
596
597
598
599
600
601
602
603
604
605
606
607
608
609
610
611
612
613
614
615
616
617
618
619
620
621
622
623
624
625
626
627
628
629
630
631
632
633
634
635
636
637
638
639
640
641
642
643
644
645
646
647
648
649
650
651
652
653
654
655
656
657
658
659
660
661
662
663
664
665
666
667
668
669
670
671
672
673
674
675
676
677
678
679
680
681
682
683
684
685
686
687
688
689
690
691
692
693
694
695
696
697
698
699
700
701
702
703
704
705
706
707
708
709
710
711
712
713
714
715
716
717
718
719
720
721
722
723
724
725
726
727
728
729
730
731
732
733
734
735
736
737
738
739
740
741
742
743
744
745
746
747
748
749
750
751
752
753
754
755
756
757
758
759
760
761
762
763
764
765
766
767
768
769
770
771
772
773
774
775
776
777
778
779
780
781
782
783
784
785
786
787
788
789
790
791
792
793
794
795
796
797
798
799
800
801
802
803
804
805
806
807
808
809
810
811
812
813
814
815
816
817
818
819
820
821
822
823
824
825
826
827
828
829
830
831
832
833
834
835
836
837
838
839
840
841
842
843
844
845
846
847
848
849
850
851
852
853
854
855
856
857
858
859
860
861
862
863
864
865
866
867
868
869
870
871
872
873
874
875
876
877
878
879
880
881
882
883
884
885
886
887
888
889
890
891
892
893
894
895
896
897
898
899
900
901
902
903
904
905
906
907
908
909
910
911
912
913
914
915
916
917
918
919
920
921
922
923
924
925
926
927
928
929
930
931
932
933
934
935
936
937
938
939
940
941
942
943
944
945
946
947
948
949
950
951
952
953
954
955
956
957
958
959
960
961
962
963
964
965
966
967
968
969
970
971
972
973
974
975
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN"
                      "http://www.w3.org/TR/html4/strict.dtd">
<html>
<head>
  <meta http-equiv="Content-Type" content="text/html; charset=utf-8">
  <link rel="stylesheet" href="_static/llvm.css" type="text/css">
  <title>LLVM 3.2 Release Notes</title>
</head>
<body>

<h1>LLVM 3.2 Release Notes</h1>

<div>
<img style="float:right" src="http://llvm.org/img/DragonSmall.png"
     width="136" height="136" alt="LLVM Dragon Logo">
</div>

<ol>
  <li><a href="#intro">Introduction</a></li>
  <li><a href="#subproj">Sub-project Status Update</a></li>
  <li><a href="#externalproj">External Projects Using LLVM 3.2</a></li>
  <li><a href="#whatsnew">What's New in LLVM?</a></li>
  <li><a href="GettingStarted.html">Installation Instructions</a></li>
  <li><a href="#knownproblems">Known Problems</a></li>
  <li><a href="#additionalinfo">Additional Information</a></li>
</ol>

<div class="doc_author">
  <p>Written by the <a href="http://llvm.org/">LLVM Team</a></p>
</div>

<!-- *********************************************************************** -->
<h2>
  <a name="intro">Introduction</a>
</h2>
<!-- *********************************************************************** -->

<div>

<p>This document contains the release notes for the LLVM Compiler
   Infrastructure, release 3.2.  Here we describe the status of LLVM, including
   major improvements from the previous release, improvements in various
   sub-projects of LLVM, and some of the current users of the code.  All LLVM
   releases may be downloaded from the <a href="http://llvm.org/releases/">LLVM
   releases web site</a>.</p>

<p>For more information about LLVM, including information about the latest
   release, please check out the <a href="http://llvm.org/">main LLVM web
   site</a>.  If you have questions or comments,
   the <a href="http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev">LLVM
   Developer's Mailing List</a> is a good place to send them.</p>

<p>Note that if you are reading this file from a Subversion checkout or the main
   LLVM web page, this document applies to the <i>next</i> release, not the
   current one.  To see the release notes for a specific release, please see the
   <a href="http://llvm.org/releases/">releases page</a>.</p>

</div>


<!-- *********************************************************************** -->
<h2>
  <a name="subproj">Sub-project Status Update</a>
</h2>
<!-- *********************************************************************** -->

<div>

<p>The LLVM 3.2 distribution currently consists of production-quality code
   from the core LLVM repository, which roughly includes the LLVM optimizers,
   code generators and supporting tools, as well as Clang, DragonEgg and 
   compiler-rt sub-project repositories. In addition to this code, the LLVM 
   Project includes other sub-projects that are in development. Here we 
   include updates on these sub-projects.</p>

<!--=========================================================================-->
<h3>
<a name="clang">Clang: C/C++/Objective-C Frontend Toolkit</a>
</h3>

<div>

<p><a href="http://clang.llvm.org/">Clang</a> is an LLVM front end for the C,
   C++, and Objective-C languages. Clang aims to provide a better user
   experience through expressive diagnostics, a high level of conformance to
   language standards, fast compilation, and low memory use. Like LLVM, Clang
   provides a modular, library-based architecture that makes it suitable for
   creating or integrating with other development tools.</p>

<p>In the LLVM 3.2 time-frame, the Clang team has made many improvements.
   Highlights include:</p>
<ul>
  <li>Improvements to Clang's diagnostics</li>
  <li>Support for tls_model attribute</li>
  <li>Type safety attributes</li>
</ul>

<p>For more details about the changes to Clang since the 3.1 release, see the
   <a href="http://llvm.org/releases/3.2/tools/clang/docs/ReleaseNotes.html">Clang 3.2 release
   notes.</a></p>

<p>If Clang rejects your code but another compiler accepts it, please take a
   look at the <a href="http://clang.llvm.org/compatibility.html">language
   compatibility</a> guide to make sure this is not intentional or a known
   issue.</p>

</div>

<!--=========================================================================-->
<h3>
<a name="dragonegg">DragonEgg: GCC front-ends, LLVM back-end</a>
</h3>

<div>

<p><a href="http://dragonegg.llvm.org/">DragonEgg</a> is a
   <a href="http://gcc.gnu.org/wiki/plugins">gcc plugin</a> that replaces GCC's
   optimizers and code generators with LLVM's. It works with gcc-4.5 and gcc-4.6
   (and partially with gcc-4.7), can target the x86-32/x86-64 and ARM processor
   families, and has been successfully used on the Darwin, FreeBSD, KFreeBSD,
   Linux and OpenBSD platforms.  It fully supports Ada, C, C++ and Fortran.  It
   has partial support for Go, Java, Obj-C and Obj-C++.</p>

<p>The 3.2 release has the following notable changes:</p>

<ul>
 <li>Able to load LLVM plugins such as Polly.</li>
 <li>Supports thread-local storage models.</li>
 <li>Passes knowledge of variable lifetimes to the LLVM optimizers.</li>
 <li>No longer requires GCC to be built with LTO support.</li>
</ul>

</div>

<!--=========================================================================-->
<h3>
<a name="compiler-rt">compiler-rt: Compiler Runtime Library</a>
</h3>

<div>


<p>The LLVM <a href="http://compiler-rt.llvm.org/">compiler-rt project</a>
   is a simple library that provides an implementation of the low-level
   target-specific hooks required by code generation and other runtime
   components.  For example, when compiling for a 32-bit target, converting a
   double to a 64-bit unsigned integer is compiled into a runtime call to the
   <code>__fixunsdfdi</code> function. The compiler-rt library provides highly
   optimized implementations of this and other low-level routines (some are 3x
   faster than the equivalent libgcc routines).</p>

<p>The 3.2 release has the following notable changes:</p>

<ul>
  <li><a href="http://llvm.org/releases/3.2/tools/clang/docs/ThreadSanitizer.html">ThreadSanitizer (TSan)</a> - data race detector run-time library for C/C++ has been added.</li>
  <li>Improvements to <a href="http://llvm.org/releases/3.2/tools/clang/docs/AddressSanitizer.html">AddressSanitizer</a> including: better portability 
  (OSX, Android NDK), support for cmake based builds, enhanced error reporting and lots of bug fixes.</li>
  <li>Added support for A6 'Swift' CPU.</li>
  <li><code>divsi3</code> function has been enhanced to take advantage of a hardware unsigned divide when it is available.</li>
</ul>

</div>

<!--=========================================================================-->
<h3>
<a name="lldb">LLDB: Low Level Debugger</a>
</h3>

<div>

<p><a href="http://lldb.llvm.org">LLDB</a> is a ground-up implementation of a
   command line debugger, as well as a debugger API that can be used from other
   applications.  LLDB makes use of the Clang parser to provide high-fidelity
   expression parsing (particularly for C++) and uses the LLVM JIT for target
   support.</p>

<p>The 3.2 release has the following notable changes:</p>

<ul>
  <li>Linux build fixes for clang (see <a href="http://lldb.llvm.org/build.html">Building LLDB</a>)</li>
  <li>Some Linux stability and usability improvements</li>
  <li>Switch expression evaluation to use MCJIT (from legacy JIT) on Linux</li>
</ul>

</div>

<!--=========================================================================-->
<h3>
<a name="libc++">libc++: C++ Standard Library</a>
</h3>

<div>

<p>Like compiler_rt, libc++ is now <a href="DeveloperPolicy.html#license">dual
   licensed</a> under the MIT and UIUC license, allowing it to be used more
   permissively.</p>

<p>Within the LLVM 3.2 time-frame there were the following highlights:</p>

<ul>
  <li> C++11 shared_ptr atomic access API (20.7.2.5) has been implemented.</li>
  <li>Applied noexcept and constexpr throughout library.</li>
  <li>Improved C++11 conformance in associative container emplace.</li>
  <li>Performance improvements in: std::rotate algorithm and I/O.</li>
  <li>Operator new/delete and type_infos for exception types moved from libc++ to libc++abi.</li>
  <li>Bug fixes in: <code>&lt;atomic&gt;</code>; vector<code>&lt;bool&gt;</code> algorithms,
    <code>&lt;future&gt;</code>,<code>&lt;tuple&gt;</code>,
    <code>&lt;type_traits&gt;</code>,<code>&lt;fstream&gt;</code>,<code>&lt;istream&gt;</code>,
    <code>&lt;iterator&gt;</code>, <code>&lt;condition_variable&gt;</code>,<code>&lt;complex&gt;</code> as well as visibility fixes.
</ul>

</div>

<!--=========================================================================-->
<h3>
<a name="vmkit">VMKit</a>
</h3>

<div>

<p>The <a href="http://vmkit.llvm.org/">VMKit project</a> is an implementation
  of a Java Virtual Machine (Java VM or JVM) that uses LLVM for static and
  just-in-time compilation.</p>

<p>The 3.2 release has the following notable changes:</p>

<ul>
  <li>Bug fixes only, no functional changes.</li>
</ul>

</div>


<!--=========================================================================-->
<h3>
<a name="Polly">Polly: Polyhedral Optimizer</a>
</h3>

<div>

<p><a href="http://polly.llvm.org/">Polly</a> is an <em>experimental</em>
  optimizer for data locality and parallelism. It currently provides high-level
  loop optimizations and automatic parallelization (using the OpenMP run time).
  Work in the area of automatic SIMD and accelerator code generation was
  started.</p>

<p>Within the LLVM 3.2 time-frame there were the following highlights:</p>

<ul>
  <li>isl, the integer set library used by Polly, was relicensed under the MIT license.</li>
  <li>isl based code generation.</li>
  <li>MIT licensed replacement for CLooG (LGPLv2).</li>
  <li>Fine grained option handling (separation of core and border computations, control overhead vs. code size).</li>
  <li>Support for FORTRAN and Dragonegg.</li>
  <li>OpenMP code generation fixes.</li>
</ul>

</div>

<!--=========================================================================-->
<h3>
<a name="StaticAnalyzer">Clang Static Analyzer</a>
</h3>

<div>

<p>The <a href="http://clang-analyzer.llvm.org/">Clang Static Analyzer</a> 
    is an advanced source code analysis tool integrated into Clang that performs
    a deep analysis of code to find potential bugs.</p>
    
<p>In the LLVM 3.2 release, the static analyzer has made significant improvements
    in many areas, with notable highlights such as:</p>
    
<ul>
    <li>Improved interprocedural analysis within a translation unit (see details below), which greatly amplified the analyzer's ability to find bugs.</li>
    <li>New infrastructure to model &quot;well-known&quot; APIs, allowing the analyzer to do a much better job when modeling calls to such functions.</li>
    <li>Significant improvements to the APIs to write static analyzer checkers, with a more unified way of representing function/method calls in the checker API.  Details can be found in the <a href="http://llvm.org/devmtg/2012-11#talk13">Building a Checker in 24 hours</a> talk.
</ul>

<p>The release specifically includes notable improvements for Objective-C analysis, including:</p>

<ul>
    <li>Interprocedural analysis for Objective-C methods.</li>
    <li>Interprocedural analysis of calls to &quot;blocks&quot;.</li>
    <li>Precise modeling of GCD APIs such as <tt>dispatch_once</tt> and friends.</li>
    <li>Improved support for recently added Objective-C constructs such as array and dictionary literals.</li>
</ul>

<p>The release specifically includes notable improvements for C++ analysis, including:</p>

<ul>
    <li>Interprocedural analysis for C++ methods (within a translation unit).</li>
    <li>More precise modeling of C++ initializers and destructors.</li>
</ul>

<p>Finally, this release includes many small improvements to <tt>scan-build</tt>, which can be used to drive the analyzer from the command line or a continuous integration system.  This includes a directory-traversal issue, which could cause potential security problems in some cases.  We would like to acknowledge Tim Brown of Portcullis Computer Security Ltd for reporting this issue.</p>
    
</div>

</div>

<!-- *********************************************************************** -->
<h2>
  <a name="externalproj">External Open Source Projects Using LLVM 3.2</a>
</h2>
<!-- *********************************************************************** -->

<div>

<p>An exciting aspect of LLVM is that it is used as an enabling technology for
   a lot of other language and tools projects. This section lists some of the
   projects that have already been updated to work with LLVM 3.2.</p>

<h3>Crack</h3>

<div>

<p><a href="http://code.google.com/p/crack-language/">Crack</a> aims to provide
   the ease of development of a scripting language with the performance of a
   compiled language. The language derives concepts from C++, Java and Python,
   incorporating object-oriented programming, operator overloading and strong
   typing.</p>

</div>

<h3>EmbToolkit</h3>

<div>

<p><a href="http://www.embtoolkit.org/">EmbToolkit</a> provides Linux cross-compiler 
    toolchain/SDK (GCC/binutils/C library (uclibc,eglibc,musl)), a build system for 
    package cross-compilation and optionally various root file systems. 
    It supports ARM and MIPS. There is an ongoing effort to provide a clang+llvm 
    environment for the 3.2 releases, 
</p>

</div>

<h3>FAUST</h3>

<div>

<p><a href="http://faust.grame.fr/">FAUST</a> is a compiled language for
   real-time audio signal processing. The name FAUST stands for Functional
   AUdio STream. Its programming model combines two approaches: functional
   programming and block diagram composition. In addition with the C, C++, Java,
   JavaScript output formats, the Faust compiler can generate LLVM bitcode, and
   works with LLVM 2.7-3.2.</p>

</div>

<h3>Glasgow Haskell Compiler (GHC)</h3>

<div>

<p><a href="http://www.haskell.org/ghc/">GHC</a> is an open source compiler and
   programming suite for Haskell, a lazy functional programming language. It
   includes an optimizing static compiler generating good code for a variety of
   platforms, together with an interactive system for convenient, quick
   development.</p>

<p>GHC 7.0 and onwards include an LLVM code generator, supporting LLVM 2.8 and
   later.</p>

</div>

<h3>Julia</h3>

<div>

<p><a href="https://github.com/JuliaLang/julia">Julia</a> is a high-level,
   high-performance dynamic language for technical computing. It provides a
   sophisticated compiler, distributed parallel execution, numerical accuracy,
   and an extensive mathematical function library. The compiler uses type
   inference to generate fast code without any type declarations, and uses
   LLVM's optimization passes and JIT compiler. The
   <a href="http://julialang.org/"> Julia Language</a> is designed
   around multiple dispatch, giving programs a large degree of flexibility. It
   is ready for use on many kinds of problems.</p>

</div>

<h3>LLVM D Compiler</h3>

<div>

<p><a href="https://github.com/ldc-developers/ldc">LLVM D Compiler</a> (LDC) is
   a compiler for the D programming Language. It is based on the DMD frontend
   and uses LLVM as backend.</p>

</div>

<h3>Open Shading Language</h3>

<div>

<p><a href="https://github.com/imageworks/OpenShadingLanguage/">Open Shading
   Language (OSL)</a> is a small but rich language for programmable shading in
   advanced global illumination renderers and other applications, ideal for
   describing materials, lights, displacement, and pattern generation. It uses
   LLVM to JIT complex shader networks to x86 code at runtime.</p>

<p>OSL was developed by Sony Pictures Imageworks for use in its in-house
   renderer used for feature film animation and visual effects, and is
   distributed as open source software with the "New BSD" license.
   It has been used for all the shading on such films as The Amazing Spider-Man,
   Men in Black III, Hotel Transylvania, and may other films in-progress, 
   and also has been incorporated into several commercial and open source 
   rendering products such as Blender, VRay, and Autodesk Beast.</p>

</div>

<h3>Portable OpenCL (pocl)</h3>

<div>

<p>In addition to producing an easily portable open source OpenCL
   implementation, another major goal of <a href="http://pocl.sourceforge.net/">
   pocl</a> is improving performance portability of OpenCL programs with
   compiler optimizations, reducing the need for target-dependent manual
   optimizations. An important part of pocl is a set of LLVM passes used to
   statically parallelize multiple work-items with the kernel compiler, even in
   the presence of work-group barriers. This enables static parallelization of
   the fine-grained static concurrency in the work groups in multiple ways
   (SIMD, VLIW, superscalar,...).</p>

</div>

<h3>Pure</h3>

<div>

<p><a href="http://pure-lang.googlecode.com/">Pure</a> is an
   algebraic/functional programming language based on term rewriting. Programs
   are collections of equations which are used to evaluate expressions in a
   symbolic fashion. The interpreter uses LLVM as a backend to JIT-compile Pure
   programs to fast native code. Pure offers dynamic typing, eager and lazy
   evaluation, lexical closures, a hygienic macro system (also based on term
   rewriting), built-in list and matrix support (including list and matrix
   comprehensions) and an easy-to-use interface to C and other programming
   languages (including the ability to load LLVM bitcode modules, and inline C,
   C++, Fortran and Faust code in Pure programs if the corresponding
   LLVM-enabled compilers are installed).</p>

<p>Pure version 0.56 has been tested and is known to work with LLVM 3.2 (and
   continues to work with older LLVM releases >= 2.5).</p>

</div>

<h3>TTA-based Co-design Environment (TCE)</h3>

<div>

<p><a href="http://tce.cs.tut.fi/">TCE</a> is a toolset for designing
   application-specific processors (ASP) based on the Transport triggered
   architecture (TTA). The toolset provides a complete co-design flow from C/C++
   programs down to synthesizable VHDL/Verilog and parallel program binaries.
   Processor customization points include the register files, function units,
   supported operations, and the interconnection network.</p>

<p>TCE uses Clang and LLVM for C/C++ language support, target independent
   optimizations and also for parts of code generation. It generates new
   LLVM-based code generators "on the fly" for the designed TTA processors and
   loads them in to the compiler backend as runtime libraries to avoid
   per-target recompilation of larger parts of the compiler chain.</p>

</div>

</div>

<!-- *********************************************************************** -->
<h2>
  <a name="whatsnew">What's New in LLVM 3.2?</a>
</h2>
<!-- *********************************************************************** -->

<div>

<p>This release includes a huge number of bug fixes, performance tweaks and
   minor improvements. Some of the major improvements and new features are
   listed in this section.</p>

<!--=========================================================================-->
<h3>
<a name="majorfeatures">Major New Features</a>
</h3>

<div>

  <!-- Features that need text if they're finished for 3.2:
   ARM EHABI
   combiner-aa?
   strong phi elim
   loop dependence analysis
   CorrelatedValuePropagation
   lib/Transforms/IPO/MergeFunctions.cpp => consider for 3.2.
   Integrated assembler on by default for arm/thumb?

   -->

  <!-- Near dead:
   Analysis/RegionInfo.h + Dom Frontiers
   SparseBitVector: used in LiveVar.
   llvm/lib/Archive - replace with lib object?
   -->

<p>LLVM 3.2 includes several major changes and big features:</p>

<ul>
  <li>Loop Vectorizer.</li>
  <li>New implementation of SROA.</li>
  <li>New NVPTX back-end (replacing existing PTX back-end) based on NVIDIA sources.</li>
</ul>

</div>


<!--=========================================================================-->
<h3>
<a name="coreimprovements">LLVM IR and Core Improvements</a>
</h3>

<div>

<p>LLVM IR has several new features for better support of new targets and that
   expose new optimization opportunities:</p>

<ul>
  <li>Thread local variables may have a specified TLS model. See the
  <a href="LangRef.html#globalvars">Language Reference Manual</a>.</li>
  <li>'TYPE_CODE_FUNCTION_OLD' type code and autoupgrade code for old function attributes format has been removed.</li>
  <li>Internal representation of the Attributes class has been converted into a pointer to an
         opaque object that's uniqued by and stored in the LLVMContext object. 
         The Attributes class then becomes a thin wrapper around this opaque object.</li>
</ul>

</div>

<!--=========================================================================-->
<h3>
<a name="optimizer">Optimizer Improvements</a>
</h3>

<div>

<p>In addition to many minor performance tweaks and bug fixes, this release
   includes a few major enhancements and additions to the optimizers:</p>

<p> Loop Vectorizer - We've added a loop vectorizer and we are now able to
    vectorize small loops. The loop vectorizer is disabled by default and
    can be enabled using the <b>-mllvm -vectorize-loops</b> flag.
    The SIMD vector width can be specified using the flag
    <b>-mllvm -force-vector-width=4</b>.
    The default value is <b>0</b> which means auto-select.
    <br/>
    We can now vectorize this function:

    <pre class="doc_code">
    unsigned sum_arrays(int *A, int *B, int start, int end) {
      unsigned sum = 0;
      for (int i = start; i &lt; end; ++i)
        sum += A[i] + B[i] + i;

      return sum;
    }
    </pre>

    We vectorize under the following loops:
    <ul>
    <li>The inner most loops must have a single basic block.</li>
    <li>The number of iterations are known before the loop starts to execute.</li>
    <li>The loop counter needs to be incremented by one.</li>
    <li>The loop trip count <b>can</b> be a variable.</li>
    <li>Loops do <b>not</b> need to start at zero.</li>
    <li>The induction variable can be used inside the loop.</li>
    <li>Loop reductions are supported.</li>
    <li>Arrays with affine access pattern do <b>not</b> need to be marked as 'noalias' and are checked at runtime.</li>
    </ul>

</p>

<p>SROA - We&#8217;ve re-written SROA to be significantly more powerful and generate
code which is much more friendly to the rest of the optimization pipeline.
Previously this pass had scaling problems that required it to only operate on
relatively small aggregates, and at times it would mistakenly replace a large
aggregate with a single very large integer in order to make it a scalar SSA
value. The result was a large number of i1024 and i2048 values representing any
small stack buffer. These in turn slowed down many subsequent optimization
paths.</p>
<p>The new SROA pass uses a different algorithm that allows it to only promote to
scalars the pieces of the aggregate actively in use. Because of this it doesn&#8217;t
require any thresholds. It also always deduces the scalar values from the uses
of the aggregate rather than the specific LLVM type of the aggregate. These
features combine to both optimize more code with the pass but to improve the
compile time of many functions dramatically.</p>

<ul>
  <li>Branch weight metadata is preserved through more of the optimizer.</li>
</ul>

</div>

<!--=========================================================================-->
<h3>
<a name="mc">MC Level Improvements</a>
</h3>

<div>

<p>The LLVM Machine Code (aka MC) subsystem was created to solve a number of
   problems in the realm of assembly, disassembly, object file format handling,
   and a number of other related areas that CPU instruction-set level tools work
   in. For more information, please see the
   <a href="http://blog.llvm.org/2010/04/intro-to-llvm-mc-project.html">Intro
   to the LLVM MC Project Blog Post</a>.</p>

<ul>    
  <li> Added support for following assembler directives: <code>.ifb</code>, <code>.ifnb</code>, <code>.ifc</code>, 
          <code>.ifnc</code>, <code>.purgem</code>, <code>.rept</code> and <code>.version</code> (ELF) as well as Darwin specific
	<code>.pushsection</code>, <code>.popsection</code> and  <code>.previous</code> .</li>
  <li>Enhanced handling of <code>.lcomm directive</code>.</li>
  <li>MS style inline assembler: added implementation of the offset and TYPE operators.</li>
  <li>Targets can specify minimum supported NOP size for NOP padding.</li>
  <li>ELF improvements: added support for generating ELF objects on Windows.</li>
  <li>MachO improvements:  symbol-difference variables are marked as N_ABS, added direct-to-object attribute for data-in-code markers.</li>
  <li>Added support for annotated disassembly output for x86 and arm targets.</li>
  <li>Arm support has been improved by adding support for ARM TARGET2 relocation
          and fixing hadling of ARM-style "$d.*" labels.</li>
   <li>Implemented local-exec TLS on PowerPC.</li>
</ul>

</div>

<!--=========================================================================-->
<h3>
<a name="codegen">Target Independent Code Generator Improvements</a>
</h3>

<div>

<p>Stack Coloring - We have implemented a new optimization pass
  to merge stack objects which are used in disjoin areas of the code.
  This optimization reduces the required stack space significantly, in cases
  where it is clear to the optimizer that the stack slot is not shared.
  We use the lifetime markers to tell the codegen that a certain alloca
  is used within a region.</p>

<p> We now merge consecutive loads and stores. </p>

<p>We have put a significant amount of work into the code generator
   infrastructure, which allows us to implement more aggressive algorithms and
   make it run faster:</p>

<p> We added new TableGen infrastructure to support bundling for
    Very Long Instruction Word (VLIW) architectures. TableGen can now
    automatically generate a deterministic finite automaton from a VLIW
    target's schedule description which can be queried to determine
    legal groupings of instructions in a bundle.</p>

<p> We have added a new target independent VLIW packetizer based on the
    DFA infrastructure to group machine instructions into bundles.</p>

<p> We have added new TableGen infrastructure to support relationship maps
    between instructions. This feature enables TableGen to automatically
    construct a set of relation tables and query functions that can be used
    to switch between various forms of instructions. For more information,
    please refer to <a href="http://llvm.org/docs/HowToUseInstrMappings.html">
    How To Use Instruction Mappings</a>.</p> 

</div>

<h4>
<a name="blockplacement">Basic Block Placement</a>
</h4>

<div>

<p>A probability based block placement and code layout algorithm was added to
   LLVM's code generator. This layout pass supports probabilities derived from
   static heuristics as well as source code annotations such as
   <code>__builtin_expect</code>.</p>

</div>

<!--=========================================================================-->
<h3>
<a name="x86">X86-32 and X86-64 Target Improvements</a>
</h3>

<div>

<p>New features and major changes in the X86 target include:</p>

<ul>
  <li>Small codegen optimizations, especially for AVX2.</li>
</ul>

</div>

<!--=========================================================================-->
<h3>
<a name="ARM">ARM Target Improvements</a>
</h3>

<div>

<p>New features of the ARM target include:</p>

<ul>
  <li>Support and performance tuning for the A6 'Swift' CPU.</li>
</ul>

<!--_________________________________________________________________________-->

<h4>
<a name="armintegratedassembler">ARM Integrated Assembler</a>
</h4>

<div>

<p>The ARM target now includes a full featured macro assembler, including
   direct-to-object module support for clang. The assembler is currently enabled
   by default for Darwin only pending testing and any additional necessary
   platform specific support for Linux.</p>

<p>Full support is included for Thumb1, Thumb2 and ARM modes, along with
   sub-target and CPU specific extensions for VFP2, VFP3 and NEON.</p>

<p>The assembler is Unified Syntax only (see ARM Architecural Reference Manual
   for details). While there is some, and growing, support for pre-unfied
   (divided) syntax, there are still significant gaps in that support.</p>

</div>

</div>

<!--=========================================================================-->
<h3>
<a name="MIPS">MIPS Target Improvements</a>
</h3>

<div>

<p>New features and major changes in the MIPS target include:</p>

<ul>
  <li>Integrated assembler support: 
         MIPS32 works for both PIC and static, known limitation is the PR14456 where 
         R_MIPS_GPREL16 relocation is generated with the wrong addend.
         MIPS64 support is incomplete, for example exception handling is not working.</li>
   <li>Support for fast calling convention has been added.</li>
   <li>Support for Android MIPS toolchain has been added to clang driver.</li>
   <li>Added clang driver support for MIPS N32 ABI through "-mabi=n32" option.</li>
   <li>MIPS32 and MIPS64 disassembler has been implemented.</li>
   <li>Support for compiling programs with large GOTs (exceeding 64kB in size) has been added 
	through llc option "-mxgot".</li>
  <li>Added experimental support for MIPS32 DSP intrinsics.</li>
  <li>Experimental support for MIPS16 with following limitations: only soft float is supported,
         C++ exceptions are not supported, large stack frames (> 32000 bytes) are not supported,
         direct object code emission is not supported only .s .</li>
  <li>Standalone assembler (llvm-mc):  implementation is in progress and considered experimental.</li>
  <li>All classic JIT and MCJIT tests pass on Little and Big Endian MIPS32 platforms.</li>
  <li>Inline asm support: all common constraints and operand modifiers have been implemented.</li>
  <li>Added tail call optimization support, use llc option "-enable-mips-tail-calls"
      or clang options "-mllvm -enable-mips-tail-calls"to enable it.</li>
  <li>Improved register allocation by removing registers $fp, $gp, $ra and $at from the list of reserved registers.</li>
  <li>Long branch expansion pass has been implemented, which expands branch
      instructions with offsets that do not fit in the 16-bit field.</li>
  <li>Cavium Octeon II board is used for testing builds (llvm-mips-linux builder).</li>
</ul>

</div>

<!--=========================================================================-->
<h3>
<a name="PowerPC">PowerPC Target Improvements</a>
</h3>

<div>

<p>Many fixes and changes across LLVM (and Clang) for better compliance with
   the 64-bit PowerPC ELF Application Binary Interface, interoperability with
   GCC, and overall 64-bit PowerPC support.   Some highlights include:</p>
<ul>
  <li>  MCJIT support added.</li>
  <li>  PPC64 relocation support and (small code model) TOC handling
        added.</li>
  <li>  Parameter passing and return value fixes (alignment issues,
        padding, varargs support, proper register usage, odd-sized
        structure support, float support, extension of return values
        for i32 return values).</li>
  <li>  Fixes in spill and reload code for vector registers.</li>
  <li>  C++ exception handling enabled.</li>
  <li>  Changes to remediate double-rounding compatibility issues with
        respect to GCC behavior.</li>
  <li>  Refactoring to disentangle ppc64-elf-linux ABI from Darwin
        ppc64 ABI support.</li>
  <li>  Assorted new test cases and test case fixes (endian and word
        size issues).</li>
  <li>  Fixes for big-endian codegen bugs, instruction encodings, and
        instruction constraints.</li>
  <li>  Implemented -integrated-as support.</li>
  <li>  Additional support for Altivec compare operations.</li>
  <li>  IBM long double support.</li>
</ul>
<p>There have also been code generation improvements for both 32- and 64-bit
   code. Instruction scheduling support for the Freescale e500mc and e5500
   cores has been added.</p>

</div>

<!--=========================================================================-->
<h3>
<a name="NVPTX">PTX/NVPTX Target Improvements</a>
</h3>

<div>

<p>The PTX back-end has been replaced by the NVPTX back-end, which is based on
   the LLVM back-end used by NVIDIA in their CUDA (nvcc) and OpenCL compiler.
   Some highlights include:</p>
<ul>
  <li>Compatibility with PTX 3.1 and SM 3.5</li>
  <li>Support for NVVM intrinsics as defined in the NVIDIA Compiler SDK</li>
  <li>Full compatibility with old PTX back-end, with much greater coverage of
      LLVM IR</li>
</ul>

<p>Please submit any back-end bugs to the LLVM Bugzilla site.</p>

</div>

<!--=========================================================================-->
<h3>
<a name="OtherTS">Other Target Specific Improvements</a>
</h3>

<div>

<ul>
  <li>Added support for custom names for library functions in TargetLibraryInfo.</li>
</ul>

</div>

<!--=========================================================================-->
<h3>
<a name="changes">Major Changes and Removed Features</a>
</h3>

<div>

<p>If you're already an LLVM user or developer with out-of-tree changes based on
   LLVM 3.2, this section lists some "gotchas" that you may run into upgrading
   from the previous release.</p>

<ul>
<li>llvm-ld and llvm-stub have been removed, llvm-ld functionality can be partially replaced by 
        llvm-link | opt | {llc | as, llc -filetype=obj} | ld, or fully replaced by Clang. </li>
<li>MCJIT: added support for inline assembly (requires asm parser), added faux remote target execution to lli option '-remote-mcjit'.</li>
</ul> 
 
</div>

<!--=========================================================================-->
<h3>
<a name="api_changes">Internal API Changes</a>
</h3>

<div>

<p>In addition, many APIs have changed in this release.  Some of the major
   LLVM API changes are:</p>

<p> We've added a new interface for allowing IR-level passes to access
  target-specific information. A new IR-level pass, called
  "TargetTransformInfo" provides a number of low-level interfaces.
  LSR and LowerInvoke already use the new interface. </p>

<p> The TargetData structure has been renamed to DataLayout and moved to VMCore
to remove a dependency on Target. </p>

</div>

<!--=========================================================================-->
<h3>
<a name="tools_changes">Tools Changes</a>
</h3>

<div>

<p>In addition, some tools have changed in this release. Some of the changes are:</p>

<ul>
<li>opt: added support for '-mtriple' option.</li>
<li>llvm-mc : - added '-disassemble' support for '-show-inst' and '-show-encoding' options, added '-edis' option to produce annotated 
        disassembly output for X86 and ARM targets.</li>
<li>libprofile: allows the profile data file name to be specified by the LLVMPROF_OUTPUT environment variable.</li>
<li>llvm-objdump: has been changed to display available targets, '-arch' option accepts x86 and x86-64 as valid arch names.</li>
<li>llc and opt: added FMA formation from pairs of FADD + FMUL or FSUB + FMUL enabled by option '-enable-excess-fp-precision' or option '-enable-unsafe-fp-math',
       option '-fp-contract' controls the creation by optimizations of fused FP by selecting Fast, Standard, or Strict mode.</li>
<li>llc: object file output from llc is no longer considered experimental.</li>
<li>gold plugin: handles Position Independent Executables.</li>
</ul>

</div>


<!-- *********************************************************************** -->
<h2>
  <a name="knownproblems">Known Problems</a>
</h2>
<!-- *********************************************************************** -->

<div>

<p>LLVM is generally a production quality compiler, and is used by a broad range
   of applications and shipping in many products.  That said, not every
   subsystem is as mature as the aggregate, particularly the more obscure
   targets.  If you run into a problem, please check
   the <a href="http://llvm.org/bugs/">LLVM bug database</a> and submit a bug if
   there isn't already one or ask on
   the <a href="http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev">LLVMdev
   list</a>.</p>

  <p>Known problem areas include:</p>

<ul>
  <li>The CellSPU, MSP430, and XCore backends are experimental, and the CellSPU backend will be removed in LLVM 3.3.</li>

  <li>The integrated assembler, disassembler, and JIT is not supported by
      several targets. If an integrated assembler is not supported, then a
      system assembler is required.  For more details, see the <a
      href="CodeGenerator.html#targetfeatures">Target Features Matrix</a>.
  </li>
</ul>

</div>

<!-- *********************************************************************** -->
<h2>
  <a name="additionalinfo">Additional Information</a>
</h2>
<!-- *********************************************************************** -->

<div>

<p>A wide variety of additional information is available on
   the <a href="http://llvm.org/">LLVM web page</a>, in particular in
   the <a href="http://llvm.org/docs/">documentation</a> section.  The web page
   also contains versions of the API documentation which is up-to-date with the
   Subversion version of the source code.  You can access versions of these
   documents specific to this release by going into the "<tt>llvm/doc/</tt>"
   directory in the LLVM tree.</p>

<p>If you have any questions or comments about LLVM, please feel free to contact
   us via the <a href="http://llvm.org/docs/#maillist"> mailing lists</a>.</p>

</div>

<!-- *********************************************************************** -->

<hr>
<address>
  <a href="http://jigsaw.w3.org/css-validator/check/referer"><img
  src="http://jigsaw.w3.org/css-validator/images/vcss-blue" alt="Valid CSS"></a>
  <a href="http://validator.w3.org/check/referer"><img
  src="http://www.w3.org/Icons/valid-html401-blue" alt="Valid HTML 4.01"></a>

  <a href="http://llvm.org/">LLVM Compiler Infrastructure</a><br>
  Last modified: $Date$
</address>

</body>
</html>