Hacking on Clang

This document provides some hints for how to get started hacking on Clang for developers who are new to the Clang and/or LLVM codebases.

Coding Standards
Developer Documentation
Debugging
Testing
Creating Patch Files
LLVM IR Generation

Coding Standards

Clang follows the LLVM Coding Standards. When submitting patches, please take care to follow these standards and to match the style of the surrounding Clang code (for example, in terms of indentation, bracing, and statement spacing).

Clang has a few additional coding standards:

cstdio is forbidden: library code should not output diagnostics or other information using cstdio; debugging routines should use llvm::errs(). Other uses of cstdio impose behavior upon clients and block integrating Clang as a library. Libraries should support raw_ostream based interfaces for textual output. See Coding Standards.

Developer Documentation

Both Clang and LLVM use doxygen to provide API documentation. Their respective web pages (generated nightly) are here:

Clang
LLVM

For work on the LLVM IR generation, the LLVM assembly language reference manual is also useful.

Debugging

Inspecting data structures in a debugger:

Many LLVM and Clang data structures provide a dump() method which will print a description of the data structure to stderr.
The QualType structure is used pervasively. This is a simple value class for wrapping types with qualifiers; you can use isConstQualified(), for example, to query one of the qualifiers, and the getTypePtr() method to get the wrapped Type*, which you can then dump.
For LLDB users, there are data formatters for Clang data structures in utils/ClangDataFormat.py.

Debugging using Visual Studio

The files utils/llvm.natvis and utils/clang.natvis provide debugger visualizers that make debugging of more complex data types much easier.

Put the files into %USERPROFILE%\Documents\Visual Studio 2012\Visualizers or create a symbolic link so they update automatically.

Testing

Testing on Unix-like Systems

Clang includes a basic regression suite in the tree which can be run with make test from the top-level clang directory, or just make in the test sub-directory. make VERBOSE=1 can be used to show more detail about what is being run.

If you built LLVM and Clang using CMake, the test suite can be run with make clang-test from the top-level LLVM directory.

The tests primarily consist of a test runner script running the compiler under test on individual test files grouped in the directories under the test directory. The individual test files include comments at the beginning indicating the Clang compile options to use, to be read by the test runner. Embedded comments also can do things like telling the test runner that an error is expected at the current line. Any output files produced by the test will be placed under a created Output directory.
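
To make the role of these embedded comments concrete, here is a minimal sketch of how a script might pull them out of a test file. It assumes the usual conventions of the lit-based runner: a "RUN:" marker for the command to execute and "expected-error {{...}}" style markers for diagnostics expected on a line. The sample file contents and the helper name parse_test_comments are made up for illustration and are not part of the real test runner.

```python
import re

# Hypothetical test file contents, for illustration only.
SAMPLE_TEST = """\
// RUN: %clang_cc1 -fsyntax-only -verify %s
int f(void) {
  return x; // expected-error {{use of undeclared identifier 'x'}}
}
"""

# Assumed comment conventions: "RUN:" introduces a command line, and
# "expected-<kind> {{message}}" marks a diagnostic expected on that line.
RUN_LINE = re.compile(r"RUN:\s*(.*)$")
EXPECTED = re.compile(r"expected-(error|warning|note)\s*\{\{(.*?)\}\}")


def parse_test_comments(text):
    """Return (run_commands, expectations) found in a test file's text."""
    run_commands = []
    expectations = []  # tuples of (line number, kind, message)
    for lineno, line in enumerate(text.splitlines(), start=1):
        match = RUN_LINE.search(line)
        if match:
            run_commands.append(match.group(1).strip())
        for kind, message in EXPECTED.findall(line):
            expectations.append((lineno, kind, message))
    return run_commands, expectations


runs, expects = parse_test_comments(SAMPLE_TEST)
print(runs)
print(expects)
```

The real runner does considerably more (substitutions such as %s, multiple RUN lines, and so on); the sketch only shows where the information lives.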

During the run of make test, the terminal output will display a line similar to the following:

--- Running clang tests for i686-pc-linux-gnu ---

followed by a line continually overwritten with the current test file being compiled, and an overall completion percentage.

After the make test run completes, the absence of any Failing Tests (count): message indicates that no tests failed unexpectedly. If any tests did fail, the Failing Tests (count): message will be followed by a list of the test source file paths that failed. For example:

  Failing Tests (3):
      /home/john/llvm/tools/clang/test/SemaCXX/member-name-lookup.cpp
      /home/john/llvm/tools/clang/test/SemaCXX/namespace-alias.cpp
      /home/john/llvm/tools/clang/test/SemaCXX/using-directive.cpp
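
If you capture the output of a test run in a log, a small script can extract this list for you. The following is a sketch only: it assumes the output has exactly the shape shown above (a "Failing Tests (count):" header followed by one path per line), and the function name failing_tests is hypothetical.

```python
import re

# Sample output copied from the example above.
SAMPLE_OUTPUT = """\
--- Running clang tests for i686-pc-linux-gnu ---
  Failing Tests (3):
      /home/john/llvm/tools/clang/test/SemaCXX/member-name-lookup.cpp
      /home/john/llvm/tools/clang/test/SemaCXX/namespace-alias.cpp
      /home/john/llvm/tools/clang/test/SemaCXX/using-directive.cpp
"""

HEADER = re.compile(r"^\s*Failing Tests \((\d+)\):\s*$")


def failing_tests(output):
    """Return the failing test paths, or an empty list if none are reported."""
    lines = output.splitlines()
    for index, line in enumerate(lines):
        match = HEADER.match(line)
        if match:
            count = int(match.group(1))
            # The paths are the `count` lines that follow the header.
            candidates = lines[index + 1:index + 1 + count]
            return [path.strip() for path in candidates if path.strip()]
    return []


for path in failing_tests(SAMPLE_OUTPUT):
    print(path)
```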

If you used the make VERBOSE=1 option, the terminal output will reflect the error messages from the compiler and test runner.

The regression suite can also be run with Valgrind by running make test VG=1 in the top-level clang directory.

For more intensive changes, running the LLVM Test Suite with clang is recommended. Currently the best way to do this is to override LLVMGCC, as in: make LLVMGCC="clang -std=gnu89" TEST=nightly report (make sure clang is in your PATH or use the full path).

Testing using Visual Studio on Windows

The Clang test suite can be run from either Visual Studio or the command line.

Note that the test runner is based on Python, which must be installed. Find Python at: http://www.python.org/download/. Download the latest stable version (2.6.2 at the time of this writing).

The GnuWin32 tools are also necessary for running the tests. Get them from http://getgnuwin32.sourceforge.net/. If the environment variable %PATH% does not include GnuWin32, or if another grep supersedes the GnuWin32 one on %PATH%, you should specify LLVM_LIT_TOOLS_DIR to CMake explicitly.

The CMake build tool is set up to create Visual Studio project files for running the tests, "clang-test" being the root. Therefore, to run the tests from Visual Studio, right-click the clang-test project and select "Build".

Testing on the Command Line

If you want more control over how the tests are run, it may be convenient to run the test harness on the command line directly. Before running tests from the command line, you will need to ensure that lit.site.cfg files have been created for your build. You can do this by running the tests as described in the previous sections. Once the tests have started running, you can stop them with Ctrl+C, as the files are generated before any tests are run.

Once that is done, to run all the tests from the command line, execute a command like the following:

  python (path to llvm)\llvm\utils\lit\lit.py -sv
  --param=build_mode=Win32 --param=build_config=Debug
  --param=clang_site_config=(build dir)\tools\clang\test\lit.site.cfg
  (path to llvm)\llvm\tools\clang\test

For CMake builds (for example, on Windows with Visual Studio), you will need to specify your build configuration (Debug, Release, etc.) via --param=build_config=(build config). You may also need to specify the build mode (Win32, etc.) via --param=build_mode=(build mode).

Additionally, you will need to specify the lit site configuration which lives in (build dir)\tools\clang\test, via --param=clang_site_config=(build dir)\tools\clang\test\lit.site.cfg.
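
Since the command is long and easy to mistype, it can help to assemble it in a small script. The sketch below builds the argument list from the parameters described above. The helper name lit_command and its argument names are hypothetical, the directory values are sample values only, and ntpath is used so that the Windows-style paths come out the same on any host.

```python
import ntpath  # Windows-style path joins, independent of the host platform


def lit_command(llvm_src, build_dir, build_config="Debug",
                build_mode="Win32", test=None):
    """Assemble the lit command line as a list of arguments.

    llvm_src  -- LLVM source root (contains utils\\lit\\lit.py)
    build_dir -- build directory (contains tools\\clang\\test\\lit.site.cfg)
    test      -- optional path relative to the Clang test directory;
                 if omitted, the whole test directory is run
    """
    test_root = ntpath.join(llvm_src, "tools", "clang", "test")
    site_cfg = ntpath.join(build_dir, "tools", "clang", "test", "lit.site.cfg")
    target = ntpath.join(test_root, test) if test else test_root
    return [
        "python",
        ntpath.join(llvm_src, "utils", "lit", "lit.py"),
        "-sv",
        "--param=build_mode=" + build_mode,
        "--param=build_config=" + build_config,
        "--param=clang_site_config=" + site_cfg,
        target,
    ]


# Sample directories, for illustration only.
cmd = lit_command(r"C:\Tools\llvm", r"C:\Tools\build", test=r"Sema\wchar.c")
print(" ".join(cmd))
```

The resulting list could be passed to subprocess.run(cmd) to launch the tests; that step is left out so the sketch runs without an LLVM checkout.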

To run a single test:

  python (path to llvm)\llvm\utils\lit\lit.py -sv
  --param=build_mode=Win32 --param=build_config=Debug
  --param=clang_site_config=(build dir)\tools\clang\test\lit.site.cfg
  (path to llvm)\llvm\tools\clang\test\(dir)\(test)

For example:

  python C:\Tools\llvm\utils\lit\lit.py -sv
  --param=build_mode=Win32 --param=build_config=Debug
  --param=clang_site_config=c:\Tools\build\tools\clang\test\lit.site.cfg
  C:\Tools\llvm\tools\clang\test\Sema\wchar.c

The -sv option above tells the runner to show the test output if any tests failed, to help you determine the cause of failure.

You can also pass in the --no-progress-bar option if you wish to disable progress indications while the tests are running.

Your output might look something like this:

lit.py: lit.cfg:152: note: using clang: 'C:\Tools\llvm\bin\Release\clang.EXE'
-- Testing: 2534 tests, 4 threads --
Testing: 0 .. 10.. 20.. 30.. 40.. 50.. 60.. 70.. 80.. 90..
Testing Time: 81.52s
  Expected Passes    : 2503
  Expected Failures  : 28
  Unsupported Tests  : 3

The statistic, "Unexpected Failures" (not shown if all tests pass), is the important one.
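
For scripted runs, the summary can be checked automatically. The following sketch parses the "name : count" lines shown above and treats a run as clean when no "Unexpected Failures" line is present (or its count is zero). It assumes the summary format matches the sample output; the function names are hypothetical.

```python
import re

# Summary lines copied from the sample output above.
SAMPLE_SUMMARY = """\
Testing Time: 81.52s
  Expected Passes    : 2503
  Expected Failures  : 28
  Unsupported Tests  : 3
"""

# A statistic line is a label made of letters and spaces, a colon, and an integer.
STAT_LINE = re.compile(r"^\s*([A-Za-z ]+?)\s*:\s*(\d+)\s*$")


def summary_counts(output):
    """Map each statistic name in the summary to its integer count."""
    counts = {}
    for line in output.splitlines():
        match = STAT_LINE.match(line)
        if match:
            counts[match.group(1)] = int(match.group(2))
    return counts


def run_is_clean(output):
    """True if the summary reports no unexpected failures."""
    return summary_counts(output).get("Unexpected Failures", 0) == 0


print(summary_counts(SAMPLE_SUMMARY))
print(run_is_clean(SAMPLE_SUMMARY))
```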

Creating Patch Files

To contribute changes back to the Clang team, unless you have check-in privileges, the preferred way is to send patch files to the cfe-commits mailing list, with an explanation of what the patch is for. Clang follows LLVM's developer policy. If your patch requires a wider discussion (for example, because it is an architectural change), you can use the cfe-dev mailing list.

To create these patch files, change directory to the llvm/tools/clang root and run:

svn diff (relative path) >(patch file name)

For example, to get the diffs for all of Clang:

svn diff . >~/mypatchfile.patch

Or, to get the diffs for a single file:

svn diff lib/Parse/ParseDeclCXX.cpp >~/ParseDeclCXX.patch

Note that the paths embedded in the patch depend on the directory in which you run svn diff, so changing to the llvm/tools/clang directory first is recommended.
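
One quick way to check the embedded paths before sending a patch is to list the files it touches. The sketch below assumes the patch was produced by svn diff, which introduces each file with an "Index: (path)" line; the sample patch text, its revision number, and the function name patched_files are invented for illustration.

```python
# Hypothetical patch contents in the shape svn diff produces, for illustration only.
SAMPLE_PATCH = """\
Index: lib/Parse/ParseDeclCXX.cpp
===================================================================
--- lib/Parse/ParseDeclCXX.cpp\t(revision 1)
+++ lib/Parse/ParseDeclCXX.cpp\t(working copy)
@@ -1,1 +1,1 @@
-old line
+new line
"""

INDEX_PREFIX = "Index: "


def patched_files(patch_text):
    """Return the file paths named on the patch's "Index:" lines."""
    return [
        line[len(INDEX_PREFIX):].strip()
        for line in patch_text.splitlines()
        if line.startswith(INDEX_PREFIX)
    ]


for path in patched_files(SAMPLE_PATCH):
    print(path)
```

If the printed paths start with something other than a directory inside llvm/tools/clang (such as lib/ or test/), the diff was probably taken from the wrong directory.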

LLVM IR Generation

The LLVM IR generation part of Clang converts the AST nodes output by the Sema module to the LLVM Intermediate Representation (IR). Historically, this was referred to as "codegen", and the Clang code for this lives in lib/CodeGen.

The output is most easily inspected using the -emit-llvm option to clang (possibly in conjunction with -o -). You can also use -emit-llvm-bc to write an LLVM bitcode file which can be processed by the suite of LLVM tools like llvm-dis, llvm-nm, etc. See the LLVM Command Guide for more information.