A compilation of slightly totally insane GCC ricer flags

Quite a while ago, I went through the full GCC documentation of the --param settings, trying to find some magical setting that would make a certain program go faster. I ended up with the ridiculous set you see below, recompiled the project, and the speedup was a big fat zero percent.
Ricer settings indeed.
But at least in my case nothing broke, so here they are, mostly having a theme of letting the compiler use all the RAM and CPU time it wants.
Maybe there is some program out there that would benefit from these, so if you are in a “why the hell not?” mood, you are welcome to try them.

--param max-crossjump-edges=100000 --param max-delay-slot-insn-search=100000 --param max-delay-slot-live-search=100000 --param max-gcse-memory=2000000000 --param max-pending-list-length=100000 --param max-modulo-backtrack-attempts=100000 --param large-function-growth=100000 --param inline-unit-growth=100000 --param ipcp-unit-growth=100000 --param large-stack-frame-growth=1000000 --param max-early-inliner-iterations=1000 --param max-hoist-depth=1000 --param max-tail-merge-comparisons=1000 --param max-tail-merge-iterations=1000 --param iv-consider-all-candidates-bound=1000000 --param iv-max-considered-uses=1000000 --param scev-max-expr-size=100000 --param scev-max-expr-complexity=100000 --param max-iterations-to-track=1000 --param max-cse-path-length=100000 --param max-cse-insns=1000000 --param max-reload-search-insns=100000 --param max-cselib-memory-locations=100000 --param max-sched-ready-insns=100000 --param max-sched-region-blocks=100000 --param max-pipeline-region-blocks=100000 --param max-sched-region-insns=100000 --param max-pipeline-region-insns=100000 --param selsched-max-lookahead=100000 --param selsched-max-sched-times=10000 --param selsched-insns-to-rename=1000 --param max-partial-antic-length=1000000000 --param sccvn-max-scc-size=10000000 --param sccvn-max-alias-queries-per-access=100000 --param ira-max-loops-num=10000 --param ira-max-conflict-table-size=10000 --param loop-invariant-max-bbs-in-loop=10000000 --param loop-max-datarefs-for-datadeps=100000 --param max-vartrack-size=100000 --param max-vartrack-expr-depth=100000 --param ipa-cp-value-list-size=10000 --param ipa-max-agg-items=10000 --param max-slsr-cand-scan=10000
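If you do try these: the GCC manual warns that --param names and their values are tied to compiler internals and can change between releases, so your compiler may not accept the whole list. Here is a rough sketch of a filter, assuming a POSIX shell. `CC`, `param_ok` and `filter_params` are names made up for this example. The probe compiles a trivial program with one pair at a time and keeps only the pairs the compiler does not reject:

```sh
#!/bin/sh
# Sketch: keep only the "--param name=value" pairs that the compiler accepts.
# CC, param_ok and filter_params are illustrative names, not GCC features.
CC="${CC:-gcc}"

# Succeeds if "$CC" compiles a trivial program with this one name=value pair.
param_ok() {
    echo 'int main(void) { return 0; }' |
        "$CC" -x c -c -o /dev/null --param "$1" - 2>/dev/null
}

# $1 = whitespace-separated flag string; prints the surviving flags on one line.
filter_params() {
    kept=""
    want_value=0
    for word in $1; do
        if [ "$want_value" -eq 1 ]; then
            want_value=0
            # Drop the pair when the compiler rejects it.
            if param_ok "$word"; then kept="$kept --param $word"; fi
        elif [ "$word" = "--param" ]; then
            want_value=1
        else
            # Anything that is not part of a --param pair passes through.
            kept="$kept $word"
        fi
    done
    printf '%s\n' "${kept# }"
}
```

Something like `CFLAGS="-O3 $(filter_params "$RICER_FLAGS")"` would then use it, where `RICER_FLAGS` is a hypothetical variable holding the list above. Probing one pair per compiler invocation is slow, but it does not depend on the output format of any help option.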


@AnotherDev @Kat

Get in here. GCC stuff.


You want execution/run time to go faster or you want compilation to go faster?


The goal was generating the fastest possible executable, without any regard for compilation time.


:joy:

Using flags in GCC is like running around naked, high on drugs, in a minefield. The further you go from the defaults, the less tested all of it is and the more subtle bugs you can expect to blow up in your face. Wouldn’t recommend lol


Yeah, for regular usage I use something like this:
g++ -O3 -std=c++14 -march=<whatever> -flto -fno-exceptions -fno-rtti -fno-unwind-tables
Fairly standard, except for disabling RTTI and exception handling.


gcc -Wall -Wextra -Wshadow -pedantic -std=c11 is my default

I add -O2 when doing “release” builds

Oh god… WHY?!

why what?

If you’re so desperate for performance consider writing some of your program in assembly :stuck_out_tongue:


I’ve heard that so many times from people who should know better that I get Vietnam flashbacks whenever someone says it…


Yeah, I looked up the inline ASM syntax for GCC once, recoiled in horror and nausea, and decided to never touch that. The AT&T style is garbage by itself, but nooo, they had to make it even worse.

So much this.

With ricer flags you may get 1-2 percent if you’re lucky, at the cost of throwing the well-known behaviour of the code under “standard” settings out of the window.

The amount of time you spend fucking with ricer compiler flags will not net you anything.

Use the flags the developer put in their Makefile. They probably know what’s best for their code better than you do.
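Whether a given flag set buys anything is easy enough to check before arguing about it. Here is a minimal sketch, assuming you have already timed the same workload with a baseline build and with a ricer build. `speedup_pct` is a made-up helper, and the numbers in the usage line are placeholders, not measurements:

```sh
# Sketch: percentage of run time saved by a candidate build versus a baseline.
# $1 = baseline seconds, $2 = candidate seconds; a negative result means slower.
speedup_pct() {
    awk -v base="$1" -v cand="$2" \
        'BEGIN { printf "%.1f\n", (base - cand) / base * 100 }'
}
```

For example, `speedup_pct 8 6` prints `25.0`, and two identical timings print `0.0`. Run-to-run noise can easily exceed 1-2 percent, so a single pair of timings probably proves nothing either way.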

No love for -Os?

-Os is usually far worse than -O2/-O3 because it disables way too many things. There might be a couple of rare cases where it is faster because the entire hot code fits into the caches (L1i and µop), but by and large it is just for highly memory- or storage-constrained situations, like AVRs and tiny ARM SoCs.
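The size side of that trade-off is easy to look at on your own code. Here is a rough sketch, assuming a POSIX shell and a single C source file. `compare_opt_sizes` is a hypothetical helper, and raw object-file size is only a crude proxy for how much hot code ends up in the caches:

```sh
#!/bin/sh
# Sketch: compile one C file at several optimisation levels and print the
# size of each resulting object file. CC and compare_opt_sizes are
# illustrative names.
CC="${CC:-gcc}"

# $1 = path to a .c file; prints "<level> <bytes>" per optimisation level.
compare_opt_sizes() {
    src=$1
    for level in -Os -O2 -O3; do
        # e.g. foo.c built with -Os becomes foo-Os.o next to the source
        obj="${src%.c}${level}.o"
        "$CC" "$level" -c "$src" -o "$obj" || return 1
        printf '%s %s\n' "$level" "$(wc -c < "$obj" | tr -d ' ')"
    done
}
```

Calling `compare_opt_sizes foo.c` (with `foo.c` standing in for your own file) gives three lines to compare. Whether the smaller build is also the faster one still has to be measured separately.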

Oh snap! That will go right into my cringe comp CFLAGS and CXXFLAGS in make.conf!

2022 version, updated for GCC 9.3, with some changes.
Inlining is left alone for this one, as the codebase was already… rather flat.

-fgraphite-identity -floop-nest-optimize -fweb -frename-registers -frerun-cse-after-loop --param max-crossjump-edges=1000000 --param max-goto-duplication-insns=1000000 --param max-delay-slot-insn-search=1000000 --param max-delay-slot-live-search=1000000 --param max-gcse-memory=60000000000 --param max-pending-list-length=1000000 --param max-modulo-backtrack-attempts=1000000 --param gcse-unrestricted-cost=0 --param max-hoist-depth=0 --param max-tail-merge-comparisons=1000000 --param max-tail-merge-iterations=1000000 --param iv-consider-all-candidates-bound=1000000 --param iv-max-considered-uses=1000000 --param dse-max-object-size=100000000 --param dse-max-alias-queries-per-store=1000000 --param scev-max-expr-size=1000000 --param scev-max-expr-complexity=1000000 --param max-tree-if-conversion-phi-args=1000000 --param max-iterations-to-track=100000 --param max-cse-path-length=1000000 --param max-cse-insns=1000000 --param max-reload-search-insns=1000000 --param max-cselib-memory-locations=1000000 --param max-sched-ready-insns=100000 --param max-sched-region-blocks=100000 --param max-pipeline-region-blocks=100000 --param max-sched-region-insns=100000 --param max-pipeline-region-insns=100000 --param max-sched-extend-regions-iters=3 --param selsched-max-lookahead=100000 --param selsched-max-sched-times=10000 --param selsched-insns-to-rename=1000 --param max-last-value-rtl=100000 --param max-combine-insns=4 --param l1-cache-line-size=64 --param l1-cache-size=32 --param max-partial-antic-length=0 --param sccvn-max-alias-queries-per-access=1000000 --param ira-max-loops-num=1000000 --param ira-max-conflict-table-size=60000 --param loop-invariant-max-bbs-in-loop=1000000 --param loop-max-datarefs-for-datadeps=1000000 --param graphite-max-nb-scop-params=0 --param allow-store-data-races=1 --param max-slsr-cand-scan=100000 --param max-ssa-name-query-depth=9 --param slp-max-insns-in-bb=10000000 --param lra-max-considered-reload-pseudos=100000 --param max-dse-active-local-stores=1000000 --param max-iterations-computation-cost=100000 --param max-isl-operations=0 --param graphite-max-arrays-per-scop=100000 --param fsm-maximum-phi-arguments=100000
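A command line this long is awkward to paste around. GCC can read options from a file given as `@file`, so one way to keep it manageable is to store the list in a file. A small sketch follows; `write_flag_file` and the file name `ricer.flags` are placeholders of mine:

```sh
# Sketch: store a long flag string in a response file, one word per line.
# $1 = whitespace-separated flag string, $2 = path of the file to write.
write_flag_file() {
    # $1 is deliberately left unquoted so the shell splits it into words.
    printf '%s\n' $1 > "$2"
}
```

After `write_flag_file "$RICER_FLAGS" ricer.flags`, a build would look something like `gcc -O3 @ricer.flags -c foo.c`, with `RICER_FLAGS` and `foo.c` standing in for your own list and source file. That `--param` and its `name=value` end up on separate lines should not matter, since the options in the file are separated by whitespace.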
