zhangdaolong/speccpu2006-config-flags

IBM-Linux-XL.20090713.00.xml 36.28 KB
zhangdaolong committed on 2024-04-07 09:28: add flag file
<?xml version="1.0"?>
<!DOCTYPE flagsdescription
SYSTEM "http://www.spec.org/dtd/cpuflags1.dtd"
>
<!--
-->
<!-- The lines above are NOT optional. If you're adept at reading DTDs,
the one that this file conforms to is at the URL listed above.
But most humans writing a flags file will want to have it automatically
checked using a validating parser such as RXP (available at
http://www.ltg.ed.ac.uk/~richard/rxp.html), or use one of the on-line
parsers:
http://www.stg.brown.edu/service/xmlvalid/
http://www.cogsci.ed.ac.uk/~richard/xml-check.html
The parser used by the CPU tools is _not_ a validating parser, so it
may be possible to sneak things by it that would not pass the checkers
above. However, if the checkers above say that your file is clean, it's
clean.
Flag files submitted to SPEC _will_ be checked by a validating parser.
Invalid or not-well-formed flag files will be rejected.
-->
<!-- **********************************************************************
**********************************************************************
Unless otherwise explicitly noted, all references to "section n.nn"
refer to flag_description.html, available at
http://www.spec.org/cpu2006/docs/flag_description.html
**********************************************************************
********************************************************************** -->
<!--
This file is
Copyright (C) 2006 Standard Performance Evaluation Corporation
All Rights Reserved
This file may be freely modified and redistributed, provided that the
copyright notice above and this notice remain unaltered.
-->
<flagsdescription>
<filename>IBM-Linux-XL.xml</filename>
<title>Linux on Power with IBM XL Compilers SPEC CPU 2006 Flags</title>
<style>
<![CDATA[
body { background: white; }
]]>
</style>
<!-- =====================================================================
The <header> section is also entirely optional. If it is provided, and
no class is specified, then it will be inserted verbatim at the top
of the flags dump.
If a class is specified, that text will be inserted verbatim before flags
of that class.
As the contents should be HTML, it will save lots of time to just enclose
the whole thing in a CDATA section. Section 2.3.1 again.
===================================================================-->
<header>
<![CDATA[
<p>Compilers: IBM XL C/C++ Advanced Edition for Linux V9.0 and XL Fortran Advanced Edition for Linux V11.1</p>
<p>Operating systems: SUSE Linux Enterprise 10 and Red Hat Enterprise Linux Advanced Platform 5</p>
]]>
</header>
<!-- =====================================================================
Information about the meaning of boot-time settings, BIOS options,
kernel tuning, and so forth can go in the 'platform_settings' section.
They'll be appended to the end of both the flags dump and per-result flag report.
As the contents should be HTML, it will save lots of time to just enclose
the whole thing in a CDATA section. Section 2.3.1 again.
===================================================================-->
<platform_settings>
<![CDATA[
<UL>
<li> <kbd>ulimit -s unlimited</kbd>
<br />Sets the stack size to "<kbd>unlimited</kbd>" to allow the stack size to grow without limit. </li>
<li> To reserve 200 huge pages out of the physical memory pool, issue the following command </li>
<pre>
echo 200 > /proc/sys/vm/nr_hugepages
</pre>
<li> chsyscfg -m <tt>system</tt> -r prof -i name=<tt>profile</tt>,lpar_name=<tt>partition</tt>,lpar_proc_compat_mode=POWER6_enhanced <br>
This command enables the POWERPC architecture optional instructions supported on POWER6.<BR>
<pre>
Usage: chsyscfg -r lpar | prof | sys | sysprof | frame
-m &lt;managed system&gt; | -e &lt;managed frame&gt;
-f &lt;configuration file&gt; | -i "&lt;configuration data&gt;"
[--help]
Changes partitions, partition profiles, system profiles, or the attributes of a
managed system or a managed frame.
-r - the type of resource(s) to be changed:
lpar - partition
prof - partition profile
sys - managed system
sysprof - system profile
frame - managed frame
-m &lt;managed system&gt; - the managed system's name
-e &lt;managed frame&gt; - the managed frame's name
-f &lt;configuration file&gt; - the name of the file containing the
configuration data for this command.
The format is:
attr_name1=value,attr_name2=value,...
or
"attr_name1=value1,value2,...",...
-i "&lt;configuration data&gt;" - the configuration data for this command.
The format is:
"attr_name1=value,attr_name2=value,..."
or
""attr_name1=value1,value2,...",..."
--help - prints this help
The valid attribute names for this command are:
-r prof required: name, lpar_id | lpar_name
optional: ...
lpar_proc_compat_mode (default | POWER6_enhanced)
</pre>
<li> Each process was bound to a cpu using submit= with the numactl command </li>
<pre>
submit = numactl --membind=\$SPECCOPYNUM --physcpubind=\$SPECCOPYNUM $command
</pre>
<li> numactl : Control NUMA policy for processes or shared memory
<pre>
--membind=nodes
Only allocate memory from nodes. Allocation will fail when
there is not enough memory available on these nodes.
--physcpubind=cpus
Only execute process on cpus. This accepts physical cpu numbers
as shown in the processor fields of /proc/cpuinfo.
</pre>
<li>Environment variables that can be set before the run: </li>
<pre>
HUGETLB_VERBOSE=0 : Turn off any debugging message from libhugetlbfs
HUGETLB_MORECORE=yes: Instructs libhugetlbfs to override libc's normal morecore() function with a hugepage version and use it for malloc().
HUGETLB_MORECORE_HEAPBASE=0x50000000: Specifies that the hugepage heap starts at address 0x50000000.
XLFRTEOPTS=intrinthrds=1 : Causes the Fortran runtime to only use a single thread.
</pre>
<li>Post-Link Optimization (fdprpro): </li>
<pre>
- First we copied the original executable (baseexe) to baseexe.orig.
- Then, the executable is instrumented and its initial profile generated, as follows:
$ fdprpro -a instr baseexe
The output will be generated (by default) in baseexe.instr and its profile in baseexe.nprof.
- Next, run baseexe.instr using the training data. This will fill the profile file with information that characterizes the training workload.
- Finally, re-run FDPR-Pro with the profile file provided, as follows:
$ fdprpro -a opt -f baseexe.nprof [optimization options] baseexe
- We use the following optimization options : -q -O4 -A 32 -shci 90 -sdp 9
Optimization Options Descriptions:
-A alignment, --align-code alignment
Align program code so that hot code will be aligned on alignment-byte addresses.
-abb factor, --align-basic-blocks factor
Align basic blocks that are hotter than the average by the given (float) factor. This is a lower-level
machine-specific alignment compared to --align-code. Value of -1 (the default) disables this option.
-bf, --branch-folding
Eliminate branch to branch instructions.
-bp, --branch-prediction
Set branch prediction bit for conditional branches.
-dce, --dead-code-elimination
Eliminate instructions related to unused local variables within frequently executed functions (useful
mainly after applying function inlining optimization).
-dp, --data-prefetch
Insert dcbt instructions to improve data-cache performance.
-ece, --epilog-code-eliminate
Reduce code size by grouping common instructions in functions' epilogs, into a single unified code.
-hr, --hco-reschedule
Relocate instructions from frequently executed code to rarely executed code areas, when possible.
-hrf factor, --hco-resched-factor factor
Set the aggressiveness of the -hr optimization option according to a factor value between (0,1), where
0 is the least aggressive factor (applicable only with the -hr option).
-i, --inline
Same as --selective-inline with --inline-small-funcs 12.
-ihf pct, --inline-hot-functions pct
Inline all function call sites to functions that have a frequency count greater than the given pct
frequency percentage.
-isf size, --inline-small-funcs size
Inline all functions that are smaller or equal to the given size in bytes.
-kr, --killed-registers
Eliminate stores and restores of registers that are killed (overwritten) after frequently executed
function calls.
-lap, --load-address-propagation
Eliminate load instructions of variables' addresses by re-using pre-loaded addresses of adjacent
variables.
-las, --load-after-store
Add NOP instructions to place each load instruction further apart following a store instruction that
reference the same memory address.
-lro, --link-register-optimization
Eliminate saves and restores of the link register in frequently-executed functions.
-lu aggressiveness_factor, --loop-unroll aggressiveness_factor
Unroll short loops consisting of one to several basic blocks according to an aggressiveness factor
between (1,9), where 1 is the least aggressive unrolling option for very hot and short loops.
-lun unrolling_number, --loop-unrolling-number unrolling_number
Set the number of unrolled iterations in each unrolled loop. The allowed range is between (2,50).
Default is set to 2. (applicable only with the -lu flag).
-nop, --nop-removal
Remove NOP instructions from reordered code.
-O Switch on basic optimizations only. Same as -RC -nop -bp -bf.
-O2 Switch on less aggressive optimization flags. Same as -O -hr -pto -isf 8 -tlo -kr.
-O3 Switch on aggressive optimization flags. Same as -O2 -RD -isf 12 -si -dp -lro -las -vro -btcar -lu 9
-rt 0 -pbsi.
-O4 Switch on aggressive optimization flags together with aggressive function inlining. Same as -O3 -sidf
50 -ihf 20 -sdp 9 -shci 90 and -bldcg (for XCOFF files).
-O5 Switch on aggressive optimization flags together with HLR optimization. Same as -O4 -sa -gcpyp -gcnstp
-dce.
-pbsi, --path-based-selective-inline
Perform selective inlining of dominant hot function calls based on control flow paths leading to hot
functions.
-pca, --propagate-constant-area
Relocate the constant variables area to the top of the code section when possible.
-[no]pr, --[no]ptrgl-r11
Perform removal of R11 load instruction in _ptrgl csect.
-pto, --ptrgl-optimization
Perform optimization of indirect call instructions via registers by replacing them with conditional
direct jumps.
-ptosl limit_size, --ptrgl-optimization-size-limit limit_size
Set the limit of the number of conditional statements generated by -pto optimization. Allowed values
are between 1..100. Default value set to 3. (applicable only with the -pto flag).
-ptoht heatness_threshold, --ptrgl-optimization-heatness-threshold heatness_threshold
Set the frequency threshold for indirect calls that are to be optimized by -pto optimization. Allowed
range between 0..1. Default is set to 0.8. (applicable only with -pto flag).
-RC, --reorder-code
Perform code reordering.
-rcaf aggressiveness_factor, --reorder-code-aggressivenes-factor aggressiveness_factor
Set the aggressiveness of code reordering optimization. Allowed values are [0 | 1 | 2], where 0
preserves original code order and 2 is the most aggressive. Default is set to 1. (applicable only with
the -RC flag).
-rcctf termination_factor, --reorder-code-chain-termination-factor termination_factor
Set the threshold fraction which determines when to terminate each chain of basic blocks during code
reordering. Allowed input range is between 0.0 to 1.0 where 0.0 generates long chains and 1.0 creates
single basic block chains. Default is set to 0.05. (applicable only with the -RC flag).
-rccrf reversal_factor, --reorder-code-condition-reversal-factor reversal_factor
Set the threshold fraction which determines when to enable condition reversal for each conditional
branch during code reordering. Allowed input range is between 0.0 to 1.0 when 0.0 tries to preserve
original condition direction and 1.0 ignores it. Default is set to 0.8 (applicable only with the -RC
flag).
-RD, --reorder-data
Perform static data reordering.
-rmte, --remove-multiple-toc-entries
Remove multiple TOC entries pointing to the same location in the input program file.
-rt removal_factor, --reduce-toc removal_factor
Perform removal of TOC entries according to a removal factor between (0,1), where 0 removes
non-accessed TOC entries only, and 1 removes all possible TOC entries.
-sdp aggressiveness_factor, --stride-data-prefetch aggressiveness_factor
Perform data prefetching within frequently executed loops based on stride analysis, according to an
aggressiveness factor between (1,9), where 1 is least aggressive.
-sdpla iterations_number, --stride-data-prefetch-look-ahead iterations_number
Set the number of iterations for which data is prefetched into the cache ahead of time. Default value
is set to 4 iterations. (applicable only with the -sdp flag).
-sdpms stride_min_size, --stride-data-prefetch-min-size stride_min_size
Set the minimal stride size in bytes, for which data will be considered as a candidate for
prefetching. Default value is set to 128 bytes. (applicable only with the -sdp flag).
-shci pct, --selective-hot-code-inline pct
Perform selective inlining of functions in order to decrease the total number of execution counts, so
that only functions whose hotness is above the given percentage are inlined.
-si, --selective-inline
Perform selective inlining of dominant hot function calls.
-sll Lib1:Prof1,...,LibN:ProfN, --static-link-libraries Lib1:Prof1,...,LibN:ProfN
Statically link hot code from specified dynamically linked libraries to the input program. The
parameter consists of a comma-separated list of libraries and their profiles. IMPORTANT: licensing rights of
specified libraries should be observed when applying this copying optimization.
-sllht hotness_threshold, --static-link-libraries-hotness-threshold hotness_threshold
Set hotness threshold for the --static-link-libraries optimization. The allowed input range is between
0 (least aggressive) to 1, or -1, which does not require profile and selects all code that might be
called by the input program from the given libraries. Default is 0.5.
-sidf percentage_factor, --selective-inline-dominant-factor percentage_factor
Set a dominant factor percentage for selective inline optimization. The allowed range is between
(0,100). Default is set to 80 (applicable only with the -si and -pbsi flags).
-siht frequency_factor, --selective-inline-hotness-threshold frequency_factor
Set a hotness threshold factor percentage for selective inline optimization to inline all dominant
function calls that have a frequency count greater than the given frequency percentage. Default is set
to 100 (applicable only with the -si and -pbsi flags).
-so, --stack-optimization
Reduce the stack frame size of functions which are called with a small number of arguments.
-tb, --preserve-traceback-tables
Force the restructuring of traceback tables in reordered code. If -tb option is omitted, traceback
tables are automatically included only for C++ applications which use the Try & Catch mechanism.
-rtb, --remove-traceback-tables
Remove traceback tables in reordered code.
-tlo, --tocload-optimization
Replace each load instruction that references the TOC with a corresponding add-immediate instruction
via the TOC anchor register, when possible.
-vro, --volatile-registers-optimization
Eliminate stores and restores of non-volatile registers in frequently executed functions by using
available volatile registers.
</pre>
</ul>
]]>
</platform_settings>
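<!--
The post-link optimization steps listed in the platform settings above can be
sketched as a short Python helper. This is only an illustration: it builds the
command lines quoted in the notes and does not run anything. The function name,
the "./" prefix on the instrumented binary, and the train_args parameter are
assumptions made for the example, not part of fdprpro or the SPEC tools.

```python
import shlex

# Optimization options quoted in the platform notes above.
FDPR_OPT_FLAGS = ["-q", "-O4", "-A", "32", "-shci", "90", "-sdp", "9"]


def fdpr_steps(exe, train_args=()):
    """Return the four post-link steps as argument lists (nothing is executed).

    exe and train_args are hypothetical placeholders for the benchmark binary
    and its training workload arguments.
    """
    return [
        ["cp", exe, exe + ".orig"],            # keep a copy of the original executable
        ["fdprpro", "-a", "instr", exe],       # produces exe.instr and exe.nprof
        ["./" + exe + ".instr", *train_args],  # training run fills the profile file
        ["fdprpro", "-a", "opt", "-f", exe + ".nprof", *FDPR_OPT_FLAGS, exe],
    ]


if __name__ == "__main__":
    for step in fdpr_steps("baseexe"):
        print(shlex.join(step))
```

Keeping the steps as argument lists rather than shell strings avoids quoting
problems if a real driver later passes them to subprocess.run.
-->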
<flag
name="invocation_path_stripper"
class="other"
regexp="(?:/\S+/)?(xlc|xlC|xlf95|fdpr)\b"
>
<include flag="INVOCATION_PATH" flagtext=" $1 " />
<include text="$2" />
<display enable="0" />
</flag>
<flag
name="lop_xlc"
class="compiler"
regexp="xlc\b">
<example>exampleOFxlc</example>
<![CDATA[
<p>
Invoke the IBM XL C compiler. 32-bit binaries are produced by default.
</p>
]]>
</flag>
<flag
name="lop_xlcpp"
class="compiler"
regexp="xlC\b">
<example>exampleOFxlC</example>
<![CDATA[
<p>
Invoke the IBM XL C++ compiler. 32-bit binaries are produced by default.
</p>
]]>
</flag>
<flag
name="lop_xlf95"
class="compiler"
regexp="xlf95\b">
<example>exampleOFxlf95</example>
<![CDATA[
<p>
Invoke the IBM XL Fortran compiler. 32-bit binaries are produced by default.
</p>
]]>
</flag>
<flag
name="lop_xlf95_r"
class="compiler"
regexp="xlf95_r\b">
<example>exampleOFxlf95</example>
<![CDATA[
<p>
Invoke the IBM XL Fortran compiler with the 'r' (thread-safe) capabilities.
</p>
]]>
</flag>
<flag
name="lop_fdpr"
class="compiler"
regexp="fdpr\b">
<example>fdpr -O3</example>
<![CDATA[
<p>
Invoke the IBM fdpr FDO program to do FDO optimizations on a binary module.
</p>
]]>
</flag>
<flag name="F-O5"
class="optimization"
>
<example>
-O5
</example>
<![CDATA[
<p>
Perform optimizations for maximum performance. This includes maximum
interprocedural analysis on all of the objects presented on the "link"
step. This level of optimization will increase the compiler's memory
usage and compile time requirements. -O5 provides all of the functionality
of the -O4 option, but also provides the functionality of the
-qipa=level=2 option.
</p>
<p>
-O5 is equivalent to the following flags
<ul>
<li> <tt>-O4</tt> </li>
<li> <tt>-qipa=level=2</tt> </li>
<li> <tt>-qarch=auto</tt> </li>
<li> <tt>-qtune=auto</tt> </li>
</ul>
</p>
]]>
<include flag="F-O4" />
<include flag="F-qipa:level" flagtext="-qipa=level=2" />
<include flag="F-qarch" flagtext="-qarch=auto" />
<include flag="F-qtune" flagtext="-qtune=auto" />
</flag>
<flag name="F-O4"
class="optimization"
>
<example>
-O4
</example>
<![CDATA[
<p>
Perform optimizations for maximum performance. This includes
interprocedural analysis on all of the objects presented on the "link"
step.
</p>
<p>
-O4 is equivalent to the following flags
<ul>
<li> <tt>-O3</tt> </li>
<li> <tt>-qipa=level=1</tt> </li>
<li> <tt>-qarch=auto</tt> </li>
<li> <tt>-qtune=auto</tt> </li>
</ul>
</p>
]]>
<include flag="F-O3" />
<include flag="F-qipa:level" flagtext="-qipa=level=1" />
<include flag="F-qarch" flagtext="-qarch=auto" />
<include flag="F-qtune" flagtext="-qtune=auto" />
</flag>
<flag name="F-O3"
class="optimization"
>
<example>-O3</example>
<![CDATA[
<p>
Performs additional optimizations that are memory intensive, compile-time
intensive, and may change the semantics of the program slightly, unless
-qstrict is specified. We recommend these optimizations when the desire for
run-time speed improvements outweighs the concern for limiting compile-time
resources. The optimizations provided include:
<ul>
<li> In-depth memory access analysis </li>
<li> Better loop scheduling </li>
<li> High-order loop analysis and transformations (-qhot=level=0) </li>
<li> Inlining of small procedures within a compilation unit by default </li>
<li> Eliminating implicit compile-time memory usage limits </li>
<li> Widening, which merges adjacent load/stores and other operations </li>
<li> Pointer aliasing improvements to enhance other optimizations </li>
</ul>
</p>
<p>
-O3 is equivalent to the following flags
<ul>
<li> <tt>-O2</tt> </li>
<li> <tt>-qhot=level=0</tt> </li>
</ul>
</p>
]]>
<include flag="F-O2" />
<include flag="F-qhot" flagtext="-qhot=level=0" />
</flag>
<flag name="F-O2"
class="optimization"
regexp="-O2\b">
<example>-O2</example>
<![CDATA[
<p>
Performs a set of optimizations that are intended to offer improved
performance without an unreasonable increase in time or storage that is
required for compilation including:
<ul>
<li> Eliminates redundant code </li>
<li> Basic loop optimization </li>
<li> Can structure code to take advantage of -qarch and -qtune settings </li>
</ul>
</p>
]]>
<include flag="F-O" />
</flag>
<flag name="F-O"
class="optimization"
>
<example>-O</example>
<![CDATA[
<p>
Enables the level of optimization that represents the best tradeoff between compilation speed and run-time performance. If you need a specific level of optimization, specify the appropriate numeric value. Currently, -O is equivalent to -O2.
</p>
]]>
<include flag="F-O2" />
</flag>
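<!--
The include elements of the optimization-level flags above form a chain
(F-O5 pulls in F-O4, which pulls in F-O3, and so on), and F-O2 and F-O include
each other. A minimal Python sketch of walking that chain follows. It is an
illustration only: the dictionary is transcribed by hand from this file, the
function name is hypothetical, and how the SPEC tools themselves resolve the
mutual include is not stated here.

```python
# Include relationships transcribed from the include elements above.
INCLUDES = {
    "F-O5": ["F-O4", "F-qipa:level", "F-qarch", "F-qtune"],
    "F-O4": ["F-O3", "F-qipa:level", "F-qarch", "F-qtune"],
    "F-O3": ["F-O2", "F-qhot"],
    "F-O2": ["F-O"],
    "F-O": ["F-O2"],
}


def expand(flag, seen=None):
    """Return every flag name reachable from flag, in first-visit order."""
    if seen is None:
        seen = []
    if flag in seen:  # already visited: this stops the F-O2 / F-O loop
        return seen
    seen.append(flag)
    for child in INCLUDES.get(flag, []):
        expand(child, seen)
    return seen


if __name__ == "__main__":
    print(expand("F-O5"))
```

The visited list is what keeps the traversal finite; without it the mutual
include between F-O2 and F-O would recurse forever.
-->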
<flag name="F-qhot"
class="optimization"
>
<example>-qhot</example>
<![CDATA[
<pre>
Performs high-order transformations on loops during optimization.
o arraypad
The compiler will pad any arrays where it infers that there may be a benefit.
o level=0
The compiler performs a limited set of high-order loop transformations.
o level=1
The compiler performs its full set of high-order loop transformations.
o simd
Replaces certain instruction sequences with vector instructions.
o vector
Replaces certain instruction sequences with calls to the MASS library.
Specifying -qhot without suboptions implies -qhot=nosimd, -qhot=noarraypad, -qhot=vector and -qhot=level=1. The -qhot option is also implied by -O4, and -O5.
</pre>
]]>
</flag>
<flag name="F-qarch"
class="optimization"
regexp="-qarch=(\S+)\b"
>
<example>-qarch=pwr5x, -qarch=auto</example>
<![CDATA[
<p>
Produces object code containing instructions that will run on the
specified processors. "auto" selects the processor on which the compile
is being done. "pwr5x" is the POWER5+ processor.
</p>
<p>Supported values for this flag are</p>
<ul>
<li>auto: Use the processor on which the program is compiled. </li>
<li>pwr6e: The POWER6 processor in "Enhanced" mode based systems. </li>
<li>pwr6: The POWER6 processor based systems. </li>
<li>pwr5x: The POWER5+ processor based systems. </li>
<li>pwr5: The POWER5 processor based systems. </li>
<li>pwr4: The POWER4 processor based systems. </li>
<li>ppc970: The PPC970 processor based systems. </li>
</ul>
]]>
</flag>
<flag name="F-qtune"
class="optimization"
regexp="-qtune=(\S+)\b"
>
<example>-qtune=pwr4, -qtune=auto</example>
<![CDATA[
<p>
Specifies the architecture system for which the executable program
is optimized. This includes instruction scheduling and cache setting.
The supported values for <tt>suboption</tt> are:
</p>
<ul>
<li>auto: Use the processor on which the program is compiled. </li>
<li>pwr6: The POWER6 processor based systems. </li>
<li>pwr5x: The POWER5+ processor based systems. </li>
<li>pwr5: The POWER5 processor based systems. </li>
<li>pwr4: The POWER4 processor based systems. </li>
<li>ppc970: The PPC970 processor based systems. </li>
</ul>
]]>
</flag>
<flag name="F-qipa:level"
class="optimization"
regexp="-qipa=level=[012]\b">
<example>
-qipa=level
</example>
<![CDATA[
<p>
Enhances optimization by doing detailed analysis across procedures
(interprocedural analysis or IPA).
The <tt>level</tt> determines the amount of interprocedural analysis
and optimization that is performed.
</p>
<p>
<tt>level=0</tt> Does only minimal interprocedural analysis and optimization
</p>
<p>
<tt>level=1</tt> turns on inlining, limited alias analysis, and limited
call-site tailoring
</p>
<p>
<tt>level=2</tt> turns on full interprocedural data flow and alias analysis
</p>
]]>
</flag>
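<!--
The regexp attributes on the three flags above decide which command-line
options each description is attached to. The sketch below tries those patterns
against a sample option string. It is an approximation under stated
assumptions: the SPEC tools evaluate these as Perl regular expressions, and
Python's re module is used here only as a stand-in. The function name and the
sample option string are hypothetical.

```python
import re

# Patterns copied from the regexp attributes of the flags above.
PATTERNS = {
    "F-qarch": r"-qarch=(\S+)\b",
    "F-qtune": r"-qtune=(\S+)\b",
    "F-qipa:level": r"-qipa=level=[012]\b",
}


def classify(options):
    """Map each whitespace-separated option to the first flag whose pattern matches."""
    found = {}
    for opt in options.split():
        for name, pattern in PATTERNS.items():
            if re.match(pattern, opt):
                found[opt] = name
                break
    return found


if __name__ == "__main__":
    print(classify("-O5 -qarch=pwr6 -qtune=auto -qipa=level=2"))
```

Options with no matching pattern, such as -O5 in this sample, are simply left
out of the result; in this file they are covered by flags that have no regexp
attribute.
-->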
<flag name="F-qalias"
class="optimization"
regexp="-qalias=(noansi|nostd)\b">
<example>
-qalias=noansi
</example>
<![CDATA[
<pre>
qalias=ansi | noansi
If ansi is specified, type-based aliasing is
used during optimization, which restricts the
lvalues that can be safely used to access a
data object. The default is ansi for the xlc,
xlC, and c89 commands. This option has no
effect unless you also specify the -O option.
qalias=std | nostd
Indicates whether the compilation units contain
any non-standard aliasing (see Compiler Reference
for more information). If so, specify nostd.
</pre>
]]>
</flag>
<flag name="F-lhugetlbfs"
class="optimization"
>
Link with libhugetlbfs.so. This enables the heap to be backed by 16-megabyte pages.
</flag>
<flag name="F-qfixed"
class="portability"
>
Indicates that the input Fortran source program is in fixed form.
</flag>
<flag name="F-qextname"
class="portability"
>
Adds an underscore to global entities to match the C compiler ABI.
</flag>
<flag name="F-qchars:signed"
class="portability"
>
Causes the compiler to treat "char" variables as signed instead of the
default of unsigned.
</flag>
<flag name="F-qalloca"
class="optimization"
>
Indicates that the compiler understands how to do alloca().
</flag>
<flag name="F-q64"
class="optimization"
>
Generates 64-bit ABI binaries. The default is to generate 32-bit binaries.
</flag>
<flag name="F-qalign"
class="optimization"
>
<![CDATA[
<pre>
Specifies what aggregate alignment rules the compiler uses for file compilation,
where the alignment options are:
bit_packed
The compiler uses the bit_packed alignment rules.
full
The compiler uses the RISC System/6000 alignment rules. This is the same
as power.
mac68k
The compiler uses the Macintosh alignment rules. This suboption is valid only
for 32-bit compilations.
natural
The compiler maps structure members to their natural boundaries.
packed
The compiler uses the packed alignment rules.
power
The compiler uses the RISC System/6000 alignment rules.
twobyte
The compiler uses the Macintosh alignment rules. This suboption is valid
only for 32-bit compilations. The mac68k option is the same as twobyte.
The default is -qalign=full.
</pre>
]]>
</flag>
<flag name="F-lxlf90_r"
class="optimization"
>
Link the Fortran runtime library libxlf90_r.so which is required by libessl.so.
</flag>
<flag name="F-lmass"
class="optimization"
>
Link the Mathematical Acceleration Subsystem (MASS) libraries, which contain tuned mathematical intrinsic functions.
</flag>
<flag name="F-lessl"
class="optimization"
>
<![CDATA[
<p>
Link the Engineering and Scientific Subroutine Library (ESSL), libessl.so.
ESSL is a collection of subroutines providing a wide range of performance-tuned mathematical functions for many common scientific and engineering applications. The mathematical subroutines are divided into nine computational areas:
</p>
<ul>
<li> Linear Algebra Subprograms
<li> Matrix Operations
<li> Linear Algebraic Equations
<li> Eigensystem Analysis
<li> Fourier Transforms, Convolutions, Correlations and Related Computations
<li> Sorting and Searching
<li> Interpolation
<li> Numerical Quadrature
<li> Random Number Generation
</ul>
]]>
</flag>
<flag name="F-qessl"
class="optimization"
>
Specifies that, if either -lessl or -lesslsmp are also specified, then Engineering and Scientific Subroutine Library (ESSL) routines should be used in place of some Fortran 90 intrinsic procedures when there is a safe opportunity to do so.
</flag>
<flag name="F-qpdf1"
class="optimization"
>
The option used in the first pass of a profile directed feedback (PDF) compile that causes PDF information
to be generated. The profile directed feedback optimization gathers data on both execution path and
data values. It does not use hardware counters, nor does it gather any data other than path and data values
for PDF-specific optimizations.
</flag>
<flag name="F-qpdf2"
class="optimization"
>
The option used in the second pass of a profile directed feedback compile that causes PDF information
to be utilized during optimization.
</flag>
<flag name="F-qlanglvl:extc99"
class="compiler"
>
Supports the ISO C99 standard and accepts implementation-specific language extensions.
</flag>
<flag name="F-lsmartheap"
class="optimization"
>
Link with MicroQuill's SmartHeap (32-bit) library for Linux on POWER. This is a library that
optimizes calls to new, delete, malloc and free.
</flag>
<flag name="F-tl"
class="optimization"
>
<![CDATA[
Applies the prefix specified by the -B option to the designated components.
<table>
<tr>
<th align="left">Parameter</th>
<th align="left">Description</th>
<th align="left">Executable name</th>
</tr>
<tr>
<td>a</td>
<td>Assembler</td>
<td>as</td>
</tr>
<tr>
<td>b</td>
<td>Low-level optimizer</td>
<td>xlfcode</td>
</tr>
<tr>
<td>c</td>
<td>Compiler front end</td>
<td>xlfentry</td>
</tr>
<tr>
<td>d</td>
<td>Disassembler</td>
<td>dis</td>
</tr>
<tr>
<td>F</td>
<td>C preprocessor</td>
<td>cpp</td>
</tr>
<tr>
<td>h</td>
<td>Array language optimizer</td>
<td>xlfhot</td>
</tr>
<tr>
<td>I</td>
<td>High-level optimizer, compile step</td>
<td>ipa</td>
</tr>
<tr>
<td>l</td>
<td>Linker</td>
<td>ld</td>
</tr>
<tr>
<td>z</td>
<td>Binder</td>
<td>bolt</td>
</tr>
</table>
]]>
</flag>
<flag name="F-qxlf90"
class="optimization"
regexp="-qxlf90=nosignedzero\b">
<example>
-qxlf90=nosignedzero
</example>
<![CDATA[
<pre>
-qxlf90=&lt;suboption&gt;
Determines whether the compiler provides the
Fortran 90 or the Fortran 95 level of support for
certain aspects of the language. &lt;suboption&gt; can be
one of the following:
signedzero | nosignedzero
Determines how the SIGN(A,B) function handles
signed real 0.0. In addition, determines
whether negative internal values will be
prefixed with a minus when formatted output
would produce a negative sign zero.
autodealloc | noautodealloc
Determines whether the compiler deallocates
allocatable arrays that are declared locally
without either the SAVE or the STATIC
attribute and have a status of currently
allocated when the subprogram terminates.
oldpad | nooldpad
When the PAD= specifier is present in the
INQUIRE statement, specifying -qxlf90=nooldpad
returns UNDEFINED when there is no connection,
or when the connection is for unformatted I/O.
This behavior conforms with the Fortran 95
standard and above. Specifying -qxlf90=oldpad
preserves the Fortran 90 behavior.
Default:
o signedzero, autodealloc and nooldpad for the
xlf95, xlf95_r, xlf95_r7 and f95 invocation
commands.
o nosignedzero, noautodealloc and oldpad for
all other invocation commands.
</pre>
]]>
</flag>
<flag name="F-qstrict"
class="optimization"
regexp="-qstrict|-qnostrict\b"
>
<![CDATA[
<pre>
-qstrict
Turns off aggressive optimizations that have the potential to alter the
semantics of your program. -qstrict sets -qfloat=nofltint:norsqrt.
-qnostrict
Sets -qfloat=rsqrt.
These options are only valid with -O2 or higher optimization levels.
Default:
o -qnostrict at -O3 or higher.
o -qstrict otherwise.
</pre>
]]>
</flag>
<flag name="F-qstaticlink"
class="optimization"
>
Controls how shared and non-shared runtime libraries are linked into an application.
When -qstaticlink is in effect, the compiler links only static libraries with the object file named in the invocation. When -qnostaticlink is in effect, the compiler links shared libraries with the object file named in the invocation.
This option provides the ability to specify linking rules that are equivalent to those implied by the GNU options -static, -static-libgcc, and -shared-libgcc, used singly and in combination.
</flag>
<flag name="F-qnoenablevmx"
class="optimization"
>
Disables generation of vector instructions for processors that support them.
</flag>
<flag name="link_whole_archive"
class="optimization"
regexp="-Wl,--whole-archive\s/\S*"
>
<example>
"-Wl,--whole-archive /usr/lib/libhugetlbfs.a"
</example>
Instructs the linker to include every object file in the specified library,
rather than searching the library for the required object files.
</flag>
<flag name="link_no_whole_archive"
class="optimization"
regexp="-Wl,--no-whole-archive"
>
Turns off the effect of the --whole-archive flag for archives that follow it on the link line.
</flag>
<flag name="hugetlbfs_BDT"
class="optimization"
regexp="-Wl,--hugetlbfs-link=BDT"
>
Pass the --hugetlbfs-link=BDT flag to the linker so that
the text, initialized data, and BSS segments of the application are backed by hugepages.
</flag>
<flag name="F-B"
class="optimization"
regexp="-B/\S*"
>
<example>
-B/usr/share/libhugetlbfs/
</example>
Specifies substitute path names for XL Fortran executables such as the compiler, assembler, linker, and preprocessor. It can be combined with the -t option, which selects the components that -B applies to. For example, -B/usr/share/libhugetlbfs/ -tl applies the prefix to the linker only.
</flag>
<flag name="link_emit_relocation"
class="optimization"
regexp="-Wl,-q\b"
>
Passes the -q flag to the linker, causing the final executable to retain its relocation information.
</flag>
<flag name="F-DSPEC_CPU_LINUX_PPC"
class="portability"
>
This macro indicates that the benchmark is being compiled on a PowerPC-based Linux system.
</flag>
<flag name="F-qrtti"
class="optimization"
>
Causes the C++ compiler to generate Run-Time Type Identification (RTTI) code for exception handling and for use by the typeid and dynamic_cast operators.
</flag>
<flag name="F-qsmallstack:dynlenonheap"
class="optimization"
>
Causes the Fortran compiler to allocate dynamic arrays on the heap instead of the stack.
</flag>
<flag name="F-qipa:noobject"
class="other"
regexp="-qipa=noobject\b">
<example>
-qipa=noobject
</example>
<![CDATA[
<p>
Specifies whether to include standard object code in the object files.
The <tt>noobject</tt> suboption can substantially reduce overall
compilation time, by not generating object code during the first IPA phase.
This option does not affect the code in the final binary created.
</p>
]]>
</flag>
<flag name="F-qipa:threads"
class="other"
regexp="-qipa=threads\b">
<example>
-qipa=threads
</example>
<![CDATA[
<p>
The <tt>threads</tt> suboption allows the IPA optimizer to run portions
of the optimization process in parallel threads, which can speed up the
compilation process on multi-processor systems. By default, all available
threads may be used; with <tt>threads=N</tt>, up to N threads are used. N must be a positive
integer. Specifying <tt>nothreads</tt> does not run any parallel threads;
this is equivalent to running one serial thread.
This option does not affect the code in the final binary created.
</p>
]]>
</flag>
<flag
name="INVOCATION_PATH"
class="other"
regexp="/\S+/bin/"
>
The path used to invoke the compilers.
<display enable="0" />
</flag>
</flagsdescription>