From dceb002f826db10e260a7843e12a48b9fadde349 Mon Sep 17 00:00:00 2001
From: Justin Holewinski
Date: Thu, 11 Aug 2011 17:34:16 +0000
Subject: PTX: Add basic documentation to CodeGenerator.html

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@137315 91177308-0d34-0410-b5e6-96231b3b80d8
---
 docs/CodeGenerator.html | 65 +++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 65 insertions(+)

diff --git a/docs/CodeGenerator.html b/docs/CodeGenerator.html
index db62780c25..248a85c1b8 100644
--- a/docs/CodeGenerator.html
+++ b/docs/CodeGenerator.html
@@ -114,6 +114,7 @@
  • Prolog/Epilog
  • Dynamic Allocation
  • The PTX backend
@@ -2912,6 +2913,70 @@ MOVSX32rm16 -> movsx, 32-bit register, 16-bit memory

    + The PTX backend +

    + +
    + +

    The PTX code generator lives in the lib/Target/PTX directory. It is currently a work-in-progress, but already supports most of the code generation functionality needed to generate correct PTX kernels for CUDA devices.

    + +

    The code generator can target PTX 2.0+ and shader model 1.0+. The PTX ISA Reference Manual is used as the primary source of ISA information, though an effort is made to make the output of the code generator match the output of the NVIDIA nvcc compiler whenever possible.

    + +

    Code Generator Options:

    Option            Description
    double            If enabled, the map_f64_to_f32 directive is disabled in the PTX output, allowing native double-precision arithmetic
    no-fma            Disable generation of fused multiply-add instructions, which may be beneficial for some devices
    smxy / computexy  Set shader model/compute capability to x.y, e.g. sm20 or compute13
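
    The smxy / computexy naming scheme above packs the version x.y into two trailing digits. As a rough illustration only, the C++ sketch below decodes such a name. The TargetVersion struct and the parseTargetVersion function are hypothetical names invented for this example and are not part of the PTX backend. The sketch also assumes exactly one digit each for x and y, as in the sm20 and compute13 examples.

```cpp
#include <cctype>
#include <optional>
#include <string>
#include <string_view>

// Hypothetical helper, NOT part of the PTX backend: holds the decoded
// pieces of an option name such as "sm20" or "compute13".
struct TargetVersion {
  std::string kind;   // "sm" or "compute"
  int majorVersion;   // the "x" in x.y
  int minorVersion;   // the "y" in x.y
};

// Decodes "smxy" or "computexy" into kind and version x.y.
// Assumption: x and y are each a single decimal digit.
// Returns std::nullopt for any name that does not fit that shape.
std::optional<TargetVersion> parseTargetVersion(std::string_view name) {
  const std::string_view prefixes[] = {"sm", "compute"};
  for (std::string_view prefix : prefixes) {
    // The name must be the prefix followed by exactly two characters.
    if (name.size() != prefix.size() + 2) continue;
    if (name.substr(0, prefix.size()) != prefix) continue;
    unsigned char x = static_cast<unsigned char>(name[prefix.size()]);
    unsigned char y = static_cast<unsigned char>(name[prefix.size() + 1]);
    // Both trailing characters must be decimal digits.
    if (!std::isdigit(x) || !std::isdigit(y)) continue;
    return TargetVersion{std::string(prefix), x - '0', y - '0'};
  }
  return std::nullopt;
}
```

    Under these assumptions, parseTargetVersion("sm20") yields kind "sm" with version 2.0, parseTargetVersion("compute13") yields kind "compute" with version 1.3, and a name such as "sm2" is rejected.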
    + +

    Working:

    + + +

    In Progress:

    + + +
    -- cgit v1.2.3