<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE flagsdescription SYSTEM "http://www.spec.org/dtd/cpuflags1.dtd">
<!-- The lines above are NOT optional. If you're adept at reading DTDs,
the one that this file conforms to is at the URL listed above.
But most humans writing a flags file will want to have it automatically
checked using a validating parser such as RXP (available at
http://www.ltg.ed.ac.uk/~richard/rxp.html), or use one of the on-line
The parser used by the CPU tools is _not_ a validating parser, so it
may be possible to sneak things by it that would not pass the checkers
above. However, if the checkers above say that your file is clean, it's
Flag files submitted to SPEC _will_ be checked by a validating parser.
Invalid or not-well-formed flag files will be rejected.
This file is
Copyright (C) 2006 Standard Performance Evaluation Corporation
All Rights Reserved
This file may be freely modified and redistributed, provided that the
copyright notice above and this notice remain unaltered.
$Id: flags-simple.xml 4007 2006-03-17 11:34:42Z cloyce $
Unless otherwise explicitly noted, all references to "section n.nn"
refer to flag_description.html, available at
<title>IBM XL Compiler Flags and Common Unix Commands and Environment Settings</title>
body { background: white; }
<!-- =====================================================================
The <header> section is also entirely optional. If it is provided, and
no class is specified, then it will be inserted verbatim at the top
of the flags dump.
If a class is specified, that text will be inserted verbatim before flags
of that class.
As the contents should be HTML, it will save lots of time to just enclose
the whole thing in a CDATA section. Section 2.3.1 again.
<p>Compilers: IBM XL C/C++ Enterprise Edition Version 8.0 for AIX</p>
<p>Compilers: IBM XL Fortran Enterprise Edition Version 10.1 for AIX</p>
<p>Compilers: IBM XL C/C++ Enterprise Edition Version 9.0 for AIX</p>
<p>Compilers: IBM XL Fortran Enterprise Edition Version 11.1 for AIX</p>
<p>Compilers: IBM XL C/C++ Version 10.1 for AIX</p>
<p>Compilers: IBM XL Fortran Version 12.1 for AIX</p>
<p>Last updated: 13-Apr-2009</p> ]]>
<!-- =====================================================================
Information about the meaning of boot-time settings, BIOS options,
kernel tuning, and so forth can go in the 'platform_settings' section.
They'll be appended to the end of both the flags dump and per-result flag report.
As the contents should be HTML, it will save lots of time to just enclose
the whole thing in a CDATA section. Section 2.3.1 again.
<li> fdpr -q -O4 -A 32 -bldcg -shci 90 -sdp 9</li>
The fdpr command (Feedback Directed Program Restructuring) is a performance-tuning utility that may help
improve the execution time and the real memory utilization of user-level application programs. The fdpr program
optimizes the executable image of a program by collecting information on the behavior of the program while the
program is used for some typical workload, and then creating a new version of the program that is optimized for
that workload. The new program generated by fdpr typically runs faster and uses less real memory.
fdpr [options] -p program [-x invocation]
where -p specifies the input program, in the form of an executable, shared object,
or archive file
-x specifies how to invoke the program
[options] can be one or more of the following:
Action Options:
-123 Specifies which actions/phases to run, where:
-1 generates instrumented program for profile gathering
-2 runs the instrumented program and updates profile data (requires -x <invocation>)
-3 generates optimized program
Default is set to run all three phases (-123)
-a/--action [action] Specifies customized actions
where [action] can be one of the following:
anl analyze program
instr generate instrumented program for profile gathering (same as -1)
opt generate optimized program (same as -3)
check_sign check fdpr signature in the input program
Analysis Options:
-aawc/-noaawc, --analyze-assembly-written-csects/--noanalyze-assembly-written-csects
Analyze/Do not analyze objects written in Assembly. The
default is set to analyze modules written in Assembly
-acf <analysis configuration file>, --analysis-configuration-file <analysis configuration file>
Provide a configuration file of analysis information
(advanced option)
-asd, --analyze-static-data
Analyze static data objects as distinct data elements
for data reordering (unsafe for certain compilers)
-esa, --extra-safe-analysis
Limit analysis phase to compiler generated code
-fca, --funcsect-analysis
Apply special analysis for an input executable that was
compiled with the -qfuncsect compiler option
-ff <string>, --file-format <string>
Input file format: can be LM (load module) or PO
(program object)
-ifl <file>, --ignored-function-list <file>
Set the ignored function list. The file contains names
of functions that should not be instrumented or
-iinf, --ignore-info Ignore .info sections produced with the -qfdpr option
during compile time
Instrumentation Options:
-anl, --analyze-program
Analyze the program but do not create any modified
binary. This option is used to provide a dump of
profile/code coverage information. When used with the
-d option it will dump the disassembly of the original
-ccf <coverage_file>, --code-coverage-file <coverage_file>
Use file mapped to shared memory to collect coverage
information at run-time
-ccgi <mode>, --code-coverage-generate-info <mode>
Produce coverage information to given file based on
profile information. Use <mode>=XML for XML output and
<mode>=FLAT for flat formatted text file. The generated
file is <output file>.cci[.xml]
-cci, --code-coverage-instrumentation
Instrument program in order to obtain code coverage
information. The program must be compiled with line number
debug info
-ccl <level>, --code-coverage-level <level>
Perform Code Coverage at Basic Block level (BB) or at
function level (Func). Default is BB
-ccm <coverage_map>, --code-coverage-map <coverage_map>
Defines the map file name of coverage instrumentation.
Default is <output file>.cc
-ei, --embedded-instrumentation
Perform embedded instrumentation. The profile will be
collected into global variables
-fd <Fdesc>, --file-descriptor <Fdesc>
Set the file descriptor number to be used when opening
the profile file. The default of <Fdesc> is set to the
maximum-allowed number of open files
-imullX, --mullX-instrumentation
Perform value profiling of RA and RB operands in mullX
-infp, --ignore-not-found-procedures
Ignore not found procedures
-ipcr/-noipcr, --instrumentation-preserve-condition-register/--noinstrumentation-preserve-condition-register
Preserve/Do not preserve Condition Register while
calling stubs
-ipctr/-noipctr, --instrumentation-preserve-count-register/--noinstrumentation-preserve-count-register
Preserve/Do not preserve Count Register while calling
-ipe/-noipe, --instrumentation-preserve-environment/--noinstrumentation-preserve-environment
Do not preserve registers that are not overwritten while
calling stubs. -noipe implies -noipvr -noipspr
-iplr/-noiplr, --instrumentation-preserve-link-register/--noinstrumentation-preserve-link-register
Preserve/Do not preserve link register while calling
-ipnvr, --instrumentation-preserve-non-volatile-registers
Preserve non volatile registers while calling stubs
-ipspr/-noipspr, --instrumentation-preserve-special-registers/--noinstrumentation-preserve-special-registers
Preserve/Do not preserve special purpose registers while
calling stubs
-ipvr/-noipvr, --instrumentation-preserve-volatile-registers/--noinstrumentation-preserve-volatile-registers
Preserve/Do not preserve volatile registers while
calling stubs. -noipvr implies -noipnvr and -nosfp
-ipxer/-noipxer, --instrumentation-preserve-fixed-point-exception-register/--noinstrumentation-preserve-fixed-point-exception-register
Preserve/Do not preserve Fixed-Point Exception Register
while calling stubs
-issu, --instrumentation-safe-stack-usage
Ensure additional stack space is properly allocated for
the instrumented run. Use this option if your
application uses stack extensively (e.g., when the
program uses alloca()). Note that this option adds
extra overhead on instrumentation code
-iso <offset>, --instrumentation-stack-offset <offset>
Set the offset from the stack, a negative number, where
the instrumentation's area for saving registers is kept
at runtime. Use with care
-M <addr>, --profile-map <addr>
Set shared memory segment address for profiling.
Alternative shared memory addresses are needed when the
instrumented program application creates a conflict
with the shared-memory addresses preserved for the
profiling. Typical alternative values are 0x40000000,
0x50000000, ... up to 0xC0000000. The default is set to
-pi, --profile-instrumentation
Instrument program in order to obtain execution count
-ri/-nori, --register-instrumentation/--noregister-instrumentation
Instrument/Do not instrument the input program file to
collect profile information about indirect branches via
registers. The default is set to collect the profile
-sfp/-nosfp, --save-floating-point-registers/--nosave-floating-point-registers
Save/Do not save floating point registers in
instrumented code. The default is set to save floating
point registers
-spescr <0-127>, --spe-scratch-register <0-127>
Specify a global SPE scratch register, decreasing
instrumentation overhead, in order to minimize
possibility of local store overflow
-ui, --user-instrumentation
Instrument program by inserting calls to user-supplied
functions compiled into a shared library
Profile Files Options:
-af <prof_file>, --ascii-profile-file <prof_file>
Set the name of an ASCII profile file containing profile
information. There are three different XML entry
options: <Simple .. >, <Cond .. > and <Reg .. > for
profiling data on regular, conditional or branch via
register instructions, respectively
-aop, --accept-old-profile
Accept the old profile file collected on previous
versions of the input program file (requires the -f
-f <prof_file>, --profile-file <prof_file>
Set the profile file name. The profile file is created
during the instrumentation phase and read during the
optimization phase. The profile file is updated each
time you run the instrumented program
Optimization Options:
-A <alignment>, --align-code <alignment>
Align program so that hot code will be aligned on
<alignment>-byte addresses
-abb <factor>, --align-basic-blocks <factor>
Align basic blocks that are hotter than the average by a
given (float) <factor>. This is a lower-level
machine-specific alignment compared to --align-code.
Value of -1 (the default) disables this option
-bh <factor>, --branch-hint <factor>
Add branch hints to basic blocks that are hotter than
the average by a given (float) <factor>. This is an SPE
specific optimization. Value of -1 (the default)
disables this option
-bldcg, --build-dcg Build a Data Connectivity Graph (DCG) for enhanced data
reordering (applicable only with the -RD flag)
-btcar, --branch-table-csect-anchor-removal
Eliminate load instructions used when accessing branch
-cbtd, --convert-bss-to-data
Convert BSS section into a data section. This is useful
for more aggressive tocload and RD optimizations
-cib-opt, --convert-indirect-branches-optimization
Convert indirect branch to direct branch
-cRD, --conservativeRD
Perform conservative static data reordering by packing
together all frequently referenced static variables
-dce, --dead-code-elimination
Eliminate instructions related to unused local variables
within frequently executed functions. This is useful
mainly after applying function inlining optimization
-dp, --data-prefetch Insert data-cache prefetch instructions to improve
data-cache performance
-dpht <threshold>, --data-placement-hotness-threshold <threshold>
Set data placement algorithm hotness threshold between
(0,1), where 0 reorders the static variables in large
groups based on the control flow, and 1 reorders the
variables in very small groups based on their access
frequency. (This is applicable only with the -RD flag)
-dpnf <factor>, --data-placement-normalization-factor <factor>
Set data placement algorithm normalization factor
between (0,1), where 0 causes static variables to be
reordered regardless of their size, and 1 locates only
small sized variables first. (applicable only with the
-RD flag)
-ece, --epilog-code-eliminate
Reduce code size by grouping common instructions in
function epilogs, into a single unified code
-fc, --function-cloning
Enable function cloning phase only during function
inlining optimizations (applicable only with function
inlining flags: -i, -si, -ihf, -isf, -shci)
-hr, --hco-reschedule Relocate instructions from frequently executed code to
rarely executed code areas, when possible
-hrf <factor>, --hco-resched-factor <factor>
Set the aggressiveness of the -hr optimization option
according to a factor value between (0,1), where 0 is
the least aggressive factor (applicable only with the
-hr option)
-i, --inline Same as --selective-inline with --inline-small-funcs 12
-icm-opt, --icm-optimization
Replace a sequence of l/ltr or ly/ltr instructions with
an icm or icmy instruction respectively
-ihf <pct>, --inline-hot-functions <pct>
Inline all function call sites to functions that have a
frequency count greater than the given <pct> frequency
-iplte, --inline-plt-entries
Replaces the call to a PLT entry with the PLT entry code
itself, by inlining the first part of the entry
-isf <size>, --inline-small-funcs <size>
Inline all functions that are smaller than or equal to
the given <size> in bytes
-kr, --killed-registers
Eliminate stores and restores of registers that are
killed (overwritten) after frequently executed function
-lal-opt, --load-after-load-optimization
Replace two load instructions from the same memory
location with one load instruction and one placement
-lap, --load-address-propagation
Eliminate load instructions of variable addresses by
re-using pre-loaded addresses of adjacent variables
-larl-opt, --larl-optimization
Replace a sequence of bras/const area/llgt instructions
with a single lalr instruction
-las, --load-after-store
Add NOP instructions to place each load instruction
further apart following a store instruction that
references the same memory address
-ldce, --local-dead-code-optimization
Local dead code elimination (basic block scope only) -
needless when using -dce
-ldp-opt, --long-displacement-optimization
Replace an instruction which has long displacement with
the matching instruction which has short displacement,
according to the displacement operand (e.g. ay-->a,
oy-->o, xy-->x, etc.)
-lgfr-opt, --lgfr-optimization
Replace, when possible, a 32-bit instruction with its matching
64-bit instruction
-llgh-opt, --llgh-optimization
Replace a sequence of lh/nilh/llgfr instructions with a
single llgh instruction
-lro, --link-register-optimization
Eliminate saves and restores of the link register in
frequently-executed functions
-lu <aggressiveness_factor>, --loop-unroll <aggressiveness_factor>
Unroll short loops containing one to several basic
blocks according to an aggressiveness factor between
(1,9), where 1 is the least aggressive unrolling option
for very hot and short loops
-lun <unrolling_number>, --loop-unrolling-number <unrolling_number>
Set the number of unrolled iterations in each unrolled
loop. The allowed range is between (2,50). Default is
set to 2. (Applicable only with the -lu flag)
-mvc-opt, --mvc-optimization
Replace an mvc instruction with lg/stg instructions
-nillr15-opt, --nillr15-optimization
Remove a nill r15,0xfffe instruction if followed by an
stmg r14,r12,8(r13) instruction
-O Switch on basic optimizations only. Same as -RC -nop -bp
-O2 Switch on less aggressive optimization flags. Same as -O
-hr -pto -isf 8 -tlo -kr
-O3 Switch on aggressive optimization flags. Same as -O2 -RD
-isf 12 -si -dp -lro -las -vro -btcar -lu 9 -rt 0 -so
-O4 Switch on aggressive optimization flags together with
aggressive function inlining. Same as -O3 -sidf 50 -ihf
20 -sdp 9 -shci 90 and -bldcg (for XCOFF files)
-O5 Switch on aggressive optimization flags together with
HLR optimization. Same as -O4 -sa -gcpyp -gcnstp -dce
-omullX, --mullX-optimization
Optimize mullX instructions by adding a run-time check
on RA and RB and performing equivalent operations with
lower penalty. The optimization requires the use of
-imullX in the instrumentation phase
-pbsi, --path-based-selective-inline
Perform selective inlining of dominant hot function
calls based on the control flow paths leading to hot
-pc, --preserve-csects
Preserve CSects' boundaries in reordered code
-pca, --propagate-constant-area
Relocate the constant variables area to the top of the
code section when possible
-pfb, --preserve-first-bb
Preserve original location of the entry point basic
block in program
-pp, --preserve-functions
Preserve functions' boundaries in reordered code
-pr/-nopr, --ptrgl-r11/--noptrgl-r11
Perform/Do not perform removal of R11 load instruction
in _ptrgl csect (the default is to perform the
-pto, --ptrgl-optimization
Perform optimization of indirect call instructions via
registers by replacing them with conditional direct
-ptoht <heatness_threshold>, --ptrgl-optimization-heatness-threshold <heatness_threshold>
Set the frequency threshold for indirect calls that are
to be optimized by -pto optimization. Allowed range
between 0 and 1. Default is set to 0.8. (Applicable
only with -pto flag)
-ptosl <limit_size>, --ptrgl-optimization-size-limit <limit_size>
Set the limit of the number of conditional statements
generated by -pto optimization. Allowed values are
between 1 and 100. Default value is set to 3.
(Applicable only with the -pto flag)
-rcaf <aggressiveness_factor>, --reorder-code-aggressivenes-factor <aggressiveness_factor>
Set the aggressiveness of code reordering optimization.
Allowed values are [0 | 1 | 2], where 0 preserves the
original code order and 2 is the most aggressive.
Default is set to 1. (Applicable only with the -RC
-rccrf <reversal_factor>, --reorder-code-condition-reversal-factor <reversal_factor>
Set the threshold fraction that determines when to
enable condition reversal for each conditional branch
during code reordering. Allowed input range is between
0.0 and 1.0 where 0.0 tries to preserve original
condition direction and 1.0 ignores it. Default is set
to 0.8 (Applicable only with the -RC flag)
-rcctf <termination_factor>, --reorder-code-chain-termination-factor <termination_factor>
Set the threshold fraction that determines when to
terminate each chain of basic blocks during code
reordering. Allowed input range is between 0.0 and 1.0
where 0.0 generates long chains and 1.0 creates single
basic block chains. Default is set to 0.05. (Applicable
only with the -RC flag)
-RD, --reorder-data Perform static data reordering
-rmte, --remove-multiple-toc-entries
Remove multiple TOC entries pointing to the same
location in the input program file
-rt <removal_factor>, --reduce-toc <removal_factor>
Perform removal of TOC entries according to a removal
factor between (0,1), where 0 removes non-accessed TOC
entries only and 1 removes all possible TOC entries
-rtb, --remove-traceback-tables
Remove traceback tables in reordered code
-sal-opt, --store-after-load-optimization
Remove store after load when there is no change
-sdp <aggressiveness_factor>, --stride-data-prefetch <aggressiveness_factor>
Perform data prefetching within frequently executed
loops based on stride analysis, according to an
aggressiveness factor between (1,9), where 1 is the
least aggressive
-sdpla <iterations_number>, --stride-data-prefetch-look-ahead <iterations_number>
Set the number of iterations for which data is
prefetched into the cache ahead of time. Default value
is set to 4 iterations. (Applicable only with the -sdp
-sdpms <stride_min_size>, --stride-data-prefetch-min-size <stride_min_size>
Set the minimal stride size in bytes, for which data
will be considered a candidate for prefetching. Default
value is set to 128 bytes. (Applicable only with the
-sdp flag)
-shci <pct>, --selective-hot-code-inline <pct>
Perform selective inlining of functions in order to
decrease the total number of execution counts, so that
only functions with hotness above the given percentage
are inlined
-si, --selective-inline
Perform selective inlining of dominant hot function
-sidf <percentage_factor>, --selective-inline-dominant-factor <percentage_factor>
Set a dominant factor percentage for selective inline
optimization. The allowed range is between 0 and 100.
Default is set to 80. (Applicable only with the -si and
-pbsi flags)
-siht <frequency_factor>, --selective-inline-hotness-threshold <frequency_factor>
Set a hotness threshold factor percentage for selective
inline optimization to inline all dominant function
calls that have a frequency count greater than the
given frequency percentage. Default is set to 100.
(Applicable only with the -si -pbsi flags)
-slbp, --spinlock-branch-prediction
Perform branch prediction bit setting for conditional
branches in spinlock code containing l*arx and st*cx
instructions. (Applicable after -bp flag)
-sldp, --spinlock-data-prefetch
Perform data prefetching for memory access instructions
preceding spinlock code containing l*arx and st*cx
-sll <Lib1:Prof1,...,LibN:ProfN>, --static-link-libraries <Lib1:Prof1,...,LibN:ProfN>
Statically link hot code from specified dynamically
linked libraries to the input program. The parameter
consists of a comma-separated list of libraries and
their profiles. IMPORTANT: Licensing rights of
specified libraries should be observed when applying
this copying optimization
-sllht <hotness_threshold>, --static-link-libraries-hotness-threshold <hotness_threshold>
Set hotness threshold for the --static-link-libraries
optimization. The allowed input range is between 0
(least aggressive) and 1, or -1, which does not require
a profile and selects all code that might be called by
the input program from the given libraries. Default is
set at 0.5
-so, --stack-optimization
Reduce the stack frame size of functions that are called
with a small number of arguments
-spc, --shortcut-plt-calls
Shortcut PLT calls in shared libraries to local
functions if they exist. Note: Resolving to external
symbols is disabled for such calls
-stf, --stack-flattening
Merge the stack frames of inlined functions with the
frames of the calling functions
-tb, --preserve-traceback-tables
Force the restructuring of traceback tables in reordered
code. If -tb option is omitted, traceback tables are
automatically included only for C++ applications that
use the Try & Catch mechanism
-tlo, --tocload-optimization
Replace each load instruction that references the TOC
with a corresponding add-immediate instruction via the
TOC anchor register, where possible
-ucde, --unreachable-code-data-elimination
Remove unreachable code and non-accessed static data
-vro, --volatile-registers-optimization
Eliminate stores and restores of non-volatile registers
in frequently executed functions by using available
volatile registers
-vrox, --volatile-registers-extended-optimization
Eliminate stores and restores of non-volatile registers
in frequently executed functions by using available
volatile registers, the extended version supports FP
registers and transparency
Output Options:
-bcdf <file>, --binary-code-dump-file <file>
Create a binary dump of the code (opcodes) with
annotations of addresses
-cep, --complement-edge-profile
Complements partial profile information given for the
basic blocks' frequencies by adding missing basic
block-to-basic block edge counts
-d, --disassemble-text
Print the disassembled text section of the output
program into <output_file>.dis_text file
-dap, --dump-ascii-profile
Dump profile information in ASCII format into
<program>.aprof (requires the -f flag)
-db, --disassemble-bss
Print the disassembled bss section of the output program
into <output_file>.dis_bss file
-dd, --disassemble-data
Print the disassembled data section of the output
program into <output_file>.dis_data file
-diap, --dump-initial-ascii-profile
Dump initial profile information in ASCII format into
<program>.aprof.init (requires the -f flag)
-dim, --dump-instruction-mix
Dump instruction mix statistics based on gathered
profile information
-dm, --dump-mapper Print a map of basic blocks and static variables with
their respective new -> old addresses into a
<program>.mapper file
-enc, --encapsulate Encapsulate SPE executables present in the PPE input
(see --spe-directory)
-o <output_file>, --output-file <output_file>
Set the name of the output file. The default
instrumented file is <program>.instr. The default
optimized file is <program>.fdpr
-pif, --print-inlined-funcs
Print the list of inlined functions along with their
corresponding calling functions into a
<output_file>.inl_list file (requires the -si or -i or
-isf flags)
-plc, --preserve-linkage-conventions
Preserve linkage conventions
-ppcf, --print-prof-counts-file
Print the profiling counters in ASCII format into a
<program>.counts file (requires the -f flag)
-sf, --strip-file Strip the optimized output file
-simo, --single-input-multiple-outputs
Optimize in parallel into multiple outputs as specified
by option sets read from stdin
-spedir <directory>, --spe-directory <directory>
Set the directory into which SPE executables will be
extracted and from which they will be encapsulated
General Options:
-cell, --cell-supervisor
Integrated PPE/SPE processing. Perform SPE extraction,
processing, and encapsulation automatically prior to
PPE processing
-h, --help Print online help
-m <machine-model>, --machine <machine-model>
Generate code for the specified machine model. Target
machine can be one of the following models: power2,
power3, ppc405, ppc440, power4, ppc970, power5, power6,
ppe, spe, spe_edp, zArch6, zArch5. Default is set to no
-q, --quiet Set quiet output mode, suppressing informational
-st <stat_file>, --statistics <stat_file>
Output statistics information to <stat_file>. If
<stat_file> is '-', the output goes to standard output.
See --verbose for the default
-v <level>, --verbose <level>
Set verbose output mode level. When set, various
statistics about the target optimized program are
printed into the file <program>.stat. Allowed level
range is between 0 and 3. Default is set to 0
-V, --version Print version
-armember For archive files - list of archive members to be
optimized. If -armember is not specified, all members
will be optimized
- Compiler declarations.
<flag name="xlc" class="compiler" regexp="(\S*\/)?xlc(_r)?\b">
Invoke the IBM XL C compiler. 32-bit binaries are produced by default.
<flag name="xlC" class="compiler" regexp="(\S*\/)?xlC(_r)?\b">
Invoke the IBM XL C++ compiler. 32-bit binaries are produced by default.
<flag name="xlf95" class="compiler" regexp="(\S*\/)?xlf95(_r)?\b">
Invoke the IBM XL Fortran compiler. 32-bit binaries are produced by default.
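<!--
A minimal sketch for checking the three compiler patterns above, assuming they
behave the same under Python's re module as under the Perl engine used by the
SPEC tools. Matching a whole token with re.fullmatch is only a stand-in for
however the tools anchor each pattern; the sample paths and the names
COMPILER_PATTERNS and classify are hypothetical.

```python
import re

# Patterns copied from the regexp attributes of the three compiler flags.
COMPILER_PATTERNS = {
    "xlc": r"(\S*\/)?xlc(_r)?\b",
    "xlC": r"(\S*\/)?xlC(_r)?\b",
    "xlf95": r"(\S*\/)?xlf95(_r)?\b",
}


def classify(token):
    """Return the names of all compiler flags whose pattern matches the whole token."""
    return [
        name
        for name, pattern in COMPILER_PATTERNS.items()
        if re.fullmatch(pattern, token)
    ]


if __name__ == "__main__":
    # Hypothetical sample invocations, with and without a leading path.
    for token in ["xlc", "/usr/vac/bin/xlc_r", "xlC_r", "xlf95", "xlclang"]:
        print(token, classify(token))
```

Trying a few sample invocations like this before submitting a flags file can
show whether a pattern is too loose or too strict. The patterns are case
sensitive, so xlc and xlC stay distinct.
-->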
- Aggregated optimization flags.
<flag name="F-O5" class="optimization" regexp="-O5\b">
Perform optimizations for maximum performance. This includes maximum
interprocedural analysis on all of the objects presented on the "link"
step. This level of optimization will increase the compiler's memory
usage and compile time requirements. -O5 provides all of the functionality
of the -O4 option, but also provides the functionality of the
-qipa=level=2 option.
-O5 is equivalent to the following flags
<li> <tt>-O4</tt> </li>
<li> <tt>-qipa=level=2</tt> </li>
<include flag="F-O4"/>
<include flag="F-qipa:level" flagtext="-qipa=level=2"/>
<flag name="F-O4" class="optimization" regexp="-O4\b">
Perform optimizations for maximum performance. This includes
interprocedural analysis on all of the objects presented on the "link"
-O4 is equivalent to the following flags
<li> <tt>-O3</tt> </li>
<li> <tt>-qipa=level=1</tt> </li>
<li> <tt>-qarch=auto</tt> </li>
<li> <tt>-qtune=auto</tt> </li>
<include flag="F-O3"/>
<include flag="F-qipa:level" flagtext="-qipa=level=1"/>
<include flag="F-qarch" flagtext="-qarch=auto"/>
<include flag="F-qtune" flagtext="-qtune=auto"/>
<flag name="F-O3" class="optimization" regexp="-O3\b">
-O3 performs additional optimizations that are memory intensive, compile-time
intensive, and may change the semantics of the program slightly, unless
-qstrict is specified. We recommend these optimizations when the desire for
run-time speed improvements outweighs the concern for limiting compile-time
resources. The optimizations provided include:
<li> In-depth memory access analysis </li>
<li> Better loop scheduling </li>
<li> High-order loop analysis and transformations (-qhot=level=0) </li>
<li> Inlining of small procedures within a compilation unit by default </li>
<li> Eliminating implicit compile-time memory usage limits </li>
<li> Widening, which merges adjacent load/stores and other operations </li>
<li> Pointer aliasing improvements to enhance other optimizations </li>
-O3 is equivalent to the following flags
<li> <tt>-O2</tt> </li>
<li> <tt>-qhot=level=0</tt> </li>
<include flag="F-O2"/>
<include flag="F-qhot" flagtext="-qhot=level=0"/>
<flag name="F-O2" class="optimization" regexp="-O2\b">
-O2 performs a set of optimizations intended to offer improved
performance without an unreasonable increase in the time or storage
required for compilation, including:
<li> Eliminates redundant code </li>
<li> Basic loop optimization </li>
<li> Can structure code to take advantage of -qarch and -qtune settings </li>
<include flag="F-O"/>
<flag name="F-O" class="optimization" regexp="-O\b">
-O enables the level of optimization that represents the best tradeoff
between compilation speed and run-time performance.
If you need a specific level of optimization, specify the appropriate
numeric value.
Currently, -O is equivalent to -O2.
<include flag="F-O2"/>
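<!--
A minimal sketch of how the include chain among the aggregated optimization
levels above unfolds, assuming each include is expanded recursively. The
INCLUDES table is transcribed by hand from the include elements of F-O5, F-O4,
F-O3, F-O2 and F-O; the table name and the expand function are hypothetical,
and this is not the algorithm used by the SPEC tools. Because F-O2 and F-O
include each other as listed, the sketch tracks flags already visited.

```python
# Flag text mapped to the flag text of each include listed for it above.
INCLUDES = {
    "-O5": ["-O4", "-qipa=level=2"],
    "-O4": ["-O3", "-qipa=level=1", "-qarch=auto", "-qtune=auto"],
    "-O3": ["-O2", "-qhot=level=0"],
    "-O2": ["-O"],
    "-O": ["-O2"],
}


def expand(flag, seen=None):
    """Return flag plus everything it pulls in, depth first, each entry once."""
    seen = [] if seen is None else seen
    if flag in seen:
        return seen  # already visited; this also stops the -O and -O2 cycle
    seen.append(flag)
    for child in INCLUDES.get(flag, []):
        expand(child, seen)
    return seen


if __name__ == "__main__":
    print(expand("-O5"))
```

Listing the expansion this way makes it easier to see which individual
descriptions a reader should expect in the report for a single aggregated flag.
-->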
- Optimization flags: individual methods.
<flag name="F-qarch" class="optimization" regexp="-qarch=(\S+)\b">
Produces object code containing instructions that will run on the
specified processors. "auto" selects the processor the compile
is being done on. "pwr5x" is the POWER5+ processor.
<p>Supported values for this flag are</p>
<li>auto - Use the processor on which the program is compiled.</li>
<li>pwr6e - The POWER6 processor in "Enhanced" mode based systems.</li>
<li>pwr6 - The POWER6 processor based systems.</li>
<li>pwr5x - The POWER5+ processor based systems.</li>
<li>pwr5 - The POWER5 processor based systems.</li>
<li>pwr4 - The POWER4 processor based systems.</li>
<li>ppc970 - The PPC970 processor based systems.</li>
<flag name="F-qtune" class="optimization" regexp="-qtune=(\S+)\b">
Specifies the system architecture for which the executable program
is optimized. This includes instruction scheduling and cache setting.
<p>The supported values for <tt>suboption</tt> are</p>
<li>auto - Use the processor on which the program is compiled.</li>
<li>pwr6e - The POWER6 processor in "Enhanced" mode based systems.</li>
<li>pwr6 - The POWER6 processor based systems.</li>
<li>pwr5x - The POWER5+ processor based systems.</li>
<li>pwr5 - The POWER5 processor based systems.</li>
<li>pwr4 - The POWER4 processor based systems.</li>
<li>ppc970 - The PPC970 processor based systems.</li>
<flag name="F-qnoinline" class="optimization" regexp="-qnoinline\b">
This option specifies that no functions are to be inlined.
<flag name="F-qinlglue" class="optimization" regexp="-qinlglue\b">
This option inlines glue code that optimizes external
function calls when compiling.
<flag name="F-qhot" class="optimization" regexp="-qhot(=arraypad|=simd|=vector|=level=[01])?\b">
Performs high-order transformations on loops during optimization.
The supported values for <tt>suboption</tt> are:
<li>arraypad - The compiler will pad any arrays where it infers that there may be a benefit.</li>
<li>level=0 - The compiler performs a limited set of high-order loop transformations.</li>
<li>level=1 - The compiler performs its full set of high-order loop transformations.</li>
<li>simd - Replaces certain instruction sequences with vector instructions.</li>
<li>vector - Replaces certain instruction sequences with calls to the MASS library.</li>
Specifying -qhot without suboptions implies -qhot=nosimd, -qhot=noarraypad, -qhot=vector and
-qhot=level=1. The -qhot option is also implied by -O4 and -O5.
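The regexp on this flag only covers some spellings of the option. The following is a minimal sketch, assuming that Python's re module treats this pattern the same way as the Perl engine used by the SPEC tools; the helper name is hypothetical and is not part of any SPEC tool.

```python
import re

# Pattern copied from the F-qhot flag definition above.
QHOT = re.compile(r"-qhot(=arraypad|=simd|=vector|=level=[01])?\b")


def is_fully_described(option):
    """Return True when the whole option string is covered by the pattern."""
    return QHOT.fullmatch(option) is not None


if __name__ == "__main__":
    # Illustrative option spellings; only some are covered by the pattern.
    for option in ["-qhot", "-qhot=simd", "-qhot=level=1",
                   "-qhot=level=2", "-qhot=nosimd"]:
        print(option, is_fully_described(option))
```

Spellings the pattern does not cover completely, such as -qhot=nosimd, would need their own flag description or a wider regexp.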
<flag name="F-qipa:level" class="optimization" regexp="-qipa=level=[012]\b">
Enhances optimization by doing detailed analysis across procedures
(interprocedural analysis or IPA).
The <tt>level</tt> determines the amount of interprocedural analysis
and optimization that is performed.
<tt>level=0</tt> does only minimal interprocedural analysis and optimization.
<tt>level=1</tt> turns on inlining, limited alias analysis, and limited
call-site tailoring.
<tt>level=2</tt> turns on full interprocedural data flow and alias analysis.
<flag name="F-qnoipa" class="optimization" regexp="-qnoipa\b">
Suppresses interprocedural analysis (IPA), which is enabled by default
at optimization levels -O4 and -O5.
<flag name="F-qpdf1" class="optimization" regexp="-qpdf1\b">
The option used in the first pass of a profile directed feedback compile
that causes PDF information to be generated.
The profile directed feedback optimization gathers data on both execution
path and data values. It does not use hardware counters, nor gather any
data other than path and data values for PDF specific optimizations.
<flag name="F-qpdf2" class="optimization" regexp="-qpdf2\b">
The option used in the second pass of a profile directed feedback compile
that causes PDF information to be utilized during optimization.
<flag name="F-qfdpr" class="optimization" regexp="-qfdpr\b">
The compiler generates additional symbol information for use by the AIX "fdpr"
binary optimization tool.
<flag name="F-qxlf90" class="optimization" regexp="-qxlf90=(signedzero|nosignedzero|autodealloc|noautodealloc|oldpad|nooldpad|)\b">
Determines whether the compiler provides the
Fortran 90 or the Fortran 95 level of support for
certain aspects of the language. <suboption> can be
one of the following:
signedzero | nosignedzero
Determines how the SIGN(A,B) function handles
signed real 0.0. In addition, determines
whether negative internal values will be
prefixed with a minus when formatted output
would produce a negative sign zero.
autodealloc | noautodealloc
Determines whether the compiler deallocates
allocatable arrays that are declared locally
without either the SAVE or the STATIC
attribute and have a status of currently
allocated when the subprogram terminates.
oldpad | nooldpad
When the PAD=specifier is present in the
INQUIRE statement, specifying -qxlf90=nooldpad
returns UNDEFINED when there is no connection,
or when the connection is for unformatted I/O.
This behavior conforms with the Fortran 95
standard and above. Specifying -qxlf90=oldpad
preserves the Fortran 90 behavior.
The defaults are:
o signedzero, autodealloc and nooldpad for the
xlf95, xlf95_r, xlf95_r7 and f95 invocation
commands.
o nosignedzero, noautodealloc and oldpad for
all other invocation commands.
- Optimization flags: memory allocation.
<flag name="F-q64" class="optimization" regexp="-q64\b">
Generates 64-bit ABI binaries. The default is to generate 32-bit ABI binaries.
<flag name="F-qlargepage" class="optimization" regexp="-qlargepage\b">
Indicates that a program, designed to execute in a
large page memory environment, can take advantage
of large 16 MB pages provided on POWER4 and higher
based systems.
<flag name="F-qalloca" class="optimization" regexp="-qalloca\b">
Indicates that the compiler understands how to do alloca().
<flag name="F-qsmallstack:dynlenonheap" class="optimization" regexp="-qsmallstack=dynlenonheap\b">
Causes the Fortran compiler to allocate dynamic arrays on the heap instead
of the stack.
<flag name="F-qsave" class="optimization" regexp="-qsave\b">
Specifies that all local variables be treated as STATIC.
- Optimization flags: vector calculations.
<flag name="F-qenablevmx" class="optimization" regexp="-q(no)?enablevmx\b">
Enables the generation of vector instructions for processors
that support them.
<flag name="F-qvecnvol" class="optimization" regexp="-qvecnvol\b">
Specifies whether to use volatile or non-volatile vector
registers. Volatile vector registers are registers whose
value is not preserved across function calls so the
compiler will not depend on values in them across function calls.
- Optimization flags: support libraries.
<flag name="F-lmass" class="optimization" regexp="-lmass\b">
Link the mathematical acceleration subsystem libraries (MASS),
which contain libraries of tuned mathematical intrinsic functions.
<flag name="F-lessl" class="optimization" regexp="-lessl\b">
Link the Engineering and Scientific Subroutine Library (ESSL).
<flag name="F-qessl" class="optimization" regexp="-qessl\b">
Specifies that, if either -lessl or -lesslsmp are also
specified, then Engineering and Scientific Subroutine Library
(ESSL) routines should be used in place of some Fortran 90
intrinsic procedures when there is a safe opportunity to do so.
- Mixed: Semantic compliance issues.
<flag name="F-qrtti:all" class="optimization" regexp="-qrtti=all\b">
Causes the C++ compiler to generate Run Time Type Identification code.
<flag name="F-qchars:signed" class="portability" regexp="-qchars=signed\b">
Causes the compiler to treat "char" variables as signed instead of the
default of unsigned.
- Portability flags: syntactic compliance.
<flag name="F-qfixed" class="portability" regexp="-qfixed\b">
Indicates that the input Fortran source program is in fixed form.
<flag name="F-qextname" class="portability" regexp="-qextname\b">
Adds an underscore to global entities to match the C compiler ABI.
<flag name="F-qcpluscmt" class="portability" regexp="-qcpluscmt\b">
Permits the usage of "//" to introduce a comment
that lasts until the end of the current source
line, as in C++.
- Other flags: optimizations and non-compliant code.
<flag name="F-qalias" class="optimization" regexp="-qalias=(noansi|nostd)\b">
qalias=ansi | noansi
If ansi is specified, type-based aliasing is
used during optimization, which restricts the
lvalues that can be safely used to access a
data object. The default is ansi for the xlc,
xlC, and c89 commands. This option has no
effect unless you also specify the -O option.
qalias=std | nostd
Indicates whether the compilation units contain
any non-standard aliasing (see Compiler Reference
for more information). If so, specify nostd.
<flag name="F-qalign" class="optimization" regexp="-qalign=(\S+)\b">
Specifies what aggregate alignment rules the
compiler uses for file compilation, where the
alignment options are:
bit_packed
The compiler uses the bit_packed alignment
rules.
full
The compiler uses the RISC System/6000
alignment rules. This is the same as power.
mac68k
The compiler uses the Macintosh alignment
rules. This suboption is valid only for
32-bit compilations.
natural
The compiler maps structure members to their
natural boundaries.
packed
The compiler uses the packed alignment rules.
power
The compiler uses the RISC System/6000
alignment rules.
twobyte
The compiler uses the Macintosh alignment
rules. This suboption is valid only for
32-bit compilations. The mac68k option is the
same as twobyte.
The default is -qalign=full.
<flag name="F-qsmp:auto" class="optimization" parallel="yes" regexp="-qsmp=auto\b">
Causes the compiler to automatically generate parallel code using
OMP controls when possible.
<flag name="F-qsmp:omp" class="optimization" parallel="yes" regexp="-qsmp=omp\b">
Tells the compiler that OMP controls are used to identify parallel code.
<flag name="F-qstrict" class="optimization" regexp="-q(no)?strict\b">
Ensures that optimizations done by default at
optimization levels -O3 and higher, and, optionally
at -O2, do not alter the semantics of a program.
The -qstrict=all, -qstrict=precision,
-qstrict=exceptions, -qstrict=ieeefp, and
-qstrict=order suboptions and their negative forms
are group suboptions that affect multiple,
individual suboptions. Group suboptions act as if
either the positive or the no form of every
suboption of the group is specified.
o Always -qstrict or -qstrict=all when the
-qnoopt or -O0 optimization level is in effect
o -qstrict or -qstrict=all is the default when
the -O2 or -O optimization level is in effect
o -qnostrict or -qstrict=none is the default
when -O3 or a higher optimization level is in effect.
<suboptions_list> is a colon-separated list of one
or more of the following:
all | none
all disables all semantics-changing
transformations, including those controlled by
the ieeefp, order, library, precision, and
exceptions suboptions. none enables these transformations.
precision | noprecision
precision disables all transformations that
are likely to affect floating-point precision,
including those controlled by the subnormals,
operationprecision, association,
reductionorder, and library suboptions.
noprecision enables these transformations.
exceptions | noexceptions
exceptions disables all transformations likely
to affect exceptions or be affected by them,
including those controlled by the nans,
infinities, subnormals, guards, and library
suboptions. noexceptions enables these transformations.
ieeefp | noieeefp
ieeefp disables transformations that affect
IEEE floating-point compliance, including
those controlled by the nans, infinities,
subnormals, zerosigns, and operationprecision
suboptions. noieeefp enables these transformations.
nans | nonans
nans disables transformations that may produce
incorrect results in the presence of, or that
may incorrectly produce IEEE floating-point
signaling NaN (not-a-number) values. nonans
enables these transformations.
infinities | noinfinities
infinities disables transformations that may
produce incorrect results in the presence of,
or that may incorrectly produce floating-point
infinities. noinfinities enables these transformations.
subnormals | nosubnormals
subnormals disables transformations that may
produce incorrect results in the presence of,
or that may incorrectly produce IEEE
floating-point subnormals (formerly known as
denorms). nosubnormals enables these transformations.
zerosigns | nozerosigns
zerosigns disables transformations that may
affect or be affected by whether the sign of a
floating-point zero is correct. nozerosigns
enables these transformations.
operationprecision | nooperationprecision
operationprecision disables transformations
that produce approximate results for
individual floating-point operations.
nooperationprecision enables these transformations.
order | noorder
order disables all code reordering between
multiple operations that may affect results or
exceptions, including those controlled by the
association, reductionorder, and guards
suboptions. noorder enables code reordering.
association | noassociation
association disables reordering operations
within an expression. noassociation enables
reordering operations.
reductionorder | noreductionorder
reductionorder disables parallelizing
floating-point reductions. noreductionorder
enables these reductions.
guards | noguards
guards disables moving operations past guards
or calls which control whether the operation
should be executed or not. noguards enables these
moving operations.
library | nolibrary
library disables transformations that affect
floating-point library functions. nolibrary
enables these transformations.
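The group relationships described above can be sketched in a few lines of Python. This is only an illustration built from the suboption lists in this description: the function names are hypothetical, and the rule that a later entry in the colon-separated list overrides an earlier one is an assumption, not something this description states.

```python
# Group suboptions and their members, as listed in the description above.
GROUPS = {
    "all": ["ieeefp", "order", "library", "precision", "exceptions"],
    "precision": ["subnormals", "operationprecision", "association",
                  "reductionorder", "library"],
    "exceptions": ["nans", "infinities", "subnormals", "guards", "library"],
    "ieeefp": ["nans", "infinities", "subnormals", "zerosigns",
               "operationprecision"],
    "order": ["association", "reductionorder", "guards"],
}


def expand(name, enabled, settings):
    """Record an individual suboption, or recurse into a group's members."""
    members = GROUPS.get(name)
    if members is None:
        settings[name] = enabled
        return
    for member in members:
        expand(member, enabled, settings)


def parse_strict(suboptions_list):
    """Expand a colon-separated -qstrict suboption list into individual settings.

    Assumption: entries are applied left to right, so later entries win.
    """
    settings = {}
    for token in suboptions_list.split(":"):
        if token == "none":
            expand("all", False, settings)
        elif token.startswith("no"):
            expand(token[2:], False, settings)
        else:
            expand(token, True, settings)
    return settings


if __name__ == "__main__":
    print(parse_strict("precision:noorder"))
```

For example, parse_strict("precision:noorder") keeps the subnormals, operationprecision, and library restrictions in place while allowing reassociation and reordered reductions.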
<flag name="F-qlanglvl:extc99" class="compiler" regexp="-qlanglvl=extc99\b">
Accepts almost any C dialect.
- Other flags: compiler resource consumption.
<flag name="F-qipa:noobject" class="other" regexp="-qipa=noobject\b">
Specifies whether to include standard object code in the object files.
The <tt>noobject</tt> suboption can substantially reduce overall
compilation time, by not generating object code during the first IPA phase.
This option does not affect the code in the final binary created.
<flag name="F-qipa:threads" class="other" regexp="-qipa=threads(=\d+)?\b">
The <tt>threads</tt> suboption allows the IPA optimizer to run portions
of the optimization process in parallel threads, which can speed up the
compilation process on multi-processor systems. All the available
threads, or the number specified by N, may be used. N must be a positive
integer. Specifying <tt>nothreads</tt> does not run any parallel threads;
this is equivalent to running one serial thread.
This option does not affect the code in the final binary created.
<flag name="F-qspillsize" class="other" regexp="-qspillsize=\d+\b">
Specifies the size of the compiler's internal program storage areas, in bytes.
- Other flags: error & warning messages.
<flag name="F-qdebug:except" class="other" regexp="-qdebug=except\b">
Causes the compiler to output a traceback if it abends.
<flag name="F-qsuppress" class="other" regexp="-qsuppress=([^:\s]+):(\S+)">
<include text="-qsuppress=$2"/>
<include text="-qsuppress=$1"/>
<display enable="0"/>
<flag name="F-qsuppress:" class="other" regexp="-qsuppress=([^:\s]+)\b">
Suppresses the message with the specified message number or, with cmpmsg, suppresses the informational messages that report compilation progress and successful completion.
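The two -qsuppress rules work together: the first splits a colon-separated list into separate -qsuppress options, and the second describes each single entry. The sketch below imitates that splitting in Python. It assumes that the included text is matched again against the same rules and that Python's re module behaves like the Perl engine for these patterns; the function name and the sample message number are illustrative only.

```python
import re

# Patterns copied from the two -qsuppress flag definitions above.
MULTI = re.compile(r"-qsuppress=([^:\s]+):(\S+)")
SINGLE = re.compile(r"-qsuppress=([^:\s]+)\b")


def split_suppress(option):
    """Split one -qsuppress option into a list of single-entry options."""
    match = MULTI.fullmatch(option)
    if match:
        head, rest = match.group(1), match.group(2)
        # The first entry stands alone; the remainder is split again.
        return ["-qsuppress=" + head] + split_suppress("-qsuppress=" + rest)
    if SINGLE.fullmatch(option):
        return [option]
    raise ValueError("not a -qsuppress option: " + option)


if __name__ == "__main__":
    # The message number is a placeholder chosen for illustration.
    print(split_suppress("-qsuppress=1500-036:cmpmsg"))
```

The order of the resulting entries is chosen for readability and is not meant to mirror the order in which the tools apply the include rules.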
<flag name="F-w" class="other" regexp="-w\b">
Suppresses informational, language-level, and warning messages. This option sets
- Other flags: instrumentation & debugging.