
A compilation of slightly totally insane GCC ricer flags


#1

Quite a while ago, I went through the full GCC documentation of the --param settings, hoping to find some magical setting to make a certain program go faster. I ended up with the ridiculous set you see below, recompiled the project, and the speedup was a big fat zero percent.
Ricer settings indeed.
But at least in my case nothing broke, so here they are; most of them share the theme of letting the compiler use all the RAM and CPU time it wants.
Maybe there is some program out there that would benefit from these, so if you are in a “why the hell not?” mood, you are welcome to try them.

--param max-crossjump-edges=100000 --param max-delay-slot-insn-search=100000 --param max-delay-slot-live-search=100000 --param max-gcse-memory=2000000000 --param max-pending-list-length=100000 --param max-modulo-backtrack-attempts=100000 --param large-function-growth=100000 --param inline-unit-growth=100000 --param ipcp-unit-growth=100000 --param large-stack-frame-growth=1000000 --param max-early-inliner-iterations=1000 --param max-hoist-depth=1000 --param max-tail-merge-comparisons=1000 --param max-tail-merge-iterations=1000 --param iv-consider-all-candidates-bound=1000000 --param iv-max-considered-uses=1000000 --param scev-max-expr-size=100000 --param scev-max-expr-complexity=100000 --param max-iterations-to-track=1000 --param max-cse-path-length=100000 --param max-cse-insns=1000000 --param max-reload-search-insns=100000 --param max-cselib-memory-locations=100000 --param max-sched-ready-insns=100000 --param max-sched-region-blocks=100000 --param max-pipeline-region-blocks=100000 --param max-sched-region-insns=100000 --param max-pipeline-region-insns=100000 --param selsched-max-lookahead=100000 --param selsched-max-sched-times=10000 --param selsched-insns-to-rename=1000 --param max-partial-antic-length=1000000000 --param sccvn-max-scc-size=10000000 --param sccvn-max-alias-queries-per-access=100000 --param ira-max-loops-num=10000 --param ira-max-conflict-table-size=10000 --param loop-invariant-max-bbs-in-loop=10000000 --param loop-max-datarefs-for-datadeps=100000 --param max-vartrack-size=100000 --param max-vartrack-expr-depth=100000 --param ipa-cp-value-list-size=10000 --param ipa-max-agg-items=10000 --param max-slsr-cand-scan=10000
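If you want to actually try a handful of them, they just get appended to the compile line like any other flag. A minimal sketch (the file name and the particular subset of params here are only illustrative; pick whatever you like from the list above):

    // ricer_test.cpp - throwaway file, only here to have something to compile.
    // Illustrative build line; any subset of the params above is used the same way:
    //
    //   g++ -O3 -march=native \
    //       --param max-gcse-memory=2000000000 \
    //       --param inline-unit-growth=100000 \
    //       --param large-function-growth=100000 \
    //       ricer_test.cpp -o ricer_test
    //
    #include <cstdio>

    int main() {
        // Nothing clever here; the point is only where the --param options go.
        std::puts("built with very patient compiler settings");
        return 0;
    }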


#2

@AdminDev @Kat

Get in here. GCC stuff.


#3

You want execution/run time to go faster or you want compilation to go faster?


#4

The goal was generating the fastest possible executable, without any regard for compilation time.


#5

:joy:

Using non-default flags in GCC is like running around naked, high on drugs, in a minefield. The further you go from the defaults, the less tested everything is and the more subtle bugs you can expect to blow up in your face. Wouldn’t recommend lol


#6

Yeah, for regular usage I use something like this:
g++ -O3 -std=c++14 -march=<whatever> -flto -fno-exceptions -fno-rtti -fno-unwind-tables
Fairly standard, except for disabling RTTI and exception handling.
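In case anyone hasn’t built with those two switches before, here is roughly what they take away (the class names are made up, and the quoted errors are approximately what current GCC prints):

    // demo.cpp - what -fno-exceptions / -fno-rtti rule out.
    // Illustrative build line: g++ -O3 -std=c++14 -fno-exceptions -fno-rtti -c demo.cpp

    struct Base    { virtual ~Base() = default; };
    struct Derived : Base {};

    Derived* downcast(Base* b) {
        // With -fno-rtti the usual checked downcast is rejected, roughly:
        //   error: 'dynamic_cast' not permitted with '-fno-rtti'
        // return dynamic_cast<Derived*>(b);
        return static_cast<Derived*>(b);  // so the caller must guarantee the type
    }

    int parse(const char* s) {
        // With -fno-exceptions a throw fails to compile, roughly:
        //   error: exception handling disabled, use '-fexceptions' to enable it
        // so errors have to travel as return codes instead.
        if (!s) return -1;
        return 0;
    }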


#7

gcc -Wall -Wextra -Wshadow -pedantic -std=c11 is my default

I add -O2 when doing “release” builds
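Those warning flags earn their keep on stuff like this (C++ shown here to match the rest of the thread, but -Wshadow fires the same way under -std=c11):

    // shadow.cpp - the kind of bug -Wshadow exists to catch.
    // Illustrative build line: g++ -Wall -Wextra -Wshadow -pedantic shadow.cpp
    //                     (or: gcc -Wall -Wextra -Wshadow -pedantic -std=c11 shadow.c)
    #include <cstdio>

    int total = 0;

    void accumulate(const int* v, int n) {
        for (int i = 0; i < n; ++i) {
            int total = v[i];                  // -Wshadow: declaration shadows the global 'total'
            std::printf("adding %d\n", total); // looks like it works...
        }
        // ...but the global 'total' was never updated.
    }

    int main() {
        int data[] = {1, 2, 3};
        accumulate(data, 3);
        std::printf("total = %d\n", total);    // prints 0, probably not what was intended
        return 0;
    }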


#8

Oh god… WHY?!


#9

why what?


#10

If you’re so desperate for performance consider writing some of your program in assembly :stuck_out_tongue:


#11

I’ve heard that so many times from people who should know better that I get Vietnam flashbacks when someone says it…


#12

Yeah, I looked up the inline ASM syntax for GCC once, recoiled in horror and nausea, and decided to never touch that. The AT&T style is garbage by itself, but nooo, they had to make it even worse.
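For anyone who hasn’t seen it, this is roughly the extended-asm syntax in question (x86-64, AT&T flavour; the snippet just adds two ints):

    // inline_asm.cpp - a small taste of GCC extended asm.
    // Illustrative build line: g++ -O2 inline_asm.cpp   (x86-64 only)
    #include <cstdio>

    int add(int a, int b) {
        // AT&T order is "addl source, destination", i.e. reversed vs Intel syntax.
        // %0 and %1 refer to the operands after the colons; the constraint strings
        // describe them: "+r" = register that is read and written, "r" = register input.
        asm("addl %1, %0" : "+r"(a) : "r"(b));
        return a;
    }

    int main() {
        std::printf("%d\n", add(2, 3));  // 5
        return 0;
    }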


#13

So much this.

With ricer flags you may get 1-2 percent if you’re lucky, at the cost of throwing the well-known behaviour of the code under “standard” settings out of the window.

The time you spend fucking with ricer compiler flags will not net you anything.

Use the flags the developer put in their makefile. They probably know what works for their code better than you do.


#14

No love for -Os?


#15

-Os is usually far worse than -O2/-O3; it disables way too many optimizations. There might be a couple of rare cases where it is faster because the entire hot path fits into the caches (L1i and µop), but by and large it is for highly memory/storage-constrained situations, like AVRs and tiny ARM SoCs.
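If you want to see the trade-off on a concrete translation unit, compiling the same file both ways and comparing the .text size is enough (the file here is made up; size(1) is the binutils tool):

    // size_demo.cpp - compile both ways and compare:
    //
    //   g++ -O3 -c size_demo.cpp -o o3.o && size o3.o
    //   g++ -Os -c size_demo.cpp -o os.o && size os.o
    //
    // -O3 will typically unroll/vectorize the loop into a noticeably larger .text
    // section; -Os keeps it small, which only pays off when the hot code is
    // fighting for L1i / µop cache space, or for flash on a tiny MCU.
    int sum(const int* v, int n) {
        int s = 0;
        for (int i = 0; i < n; ++i)
            s += v[i];
        return s;
    }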


#16

Oh snap! That will go right into my cringe-comp CFLAGS and CXXFLAGS in make.conf!