diff options
author | Stephen Rothwell <sfr@canb.auug.org.au> | 2014-12-12 11:56:22 +1100 |
---|---|---|
committer | Stephen Rothwell <sfr@canb.auug.org.au> | 2014-12-12 11:56:22 +1100 |
commit | 7b980deb5ecb3019c5b82b364debf5cbdee70b93 (patch) | |
tree | f0ec394ad1a988e3b05b099f5a53b4b03868b109 /Documentation | |
parent | d91827e81aca207de09fb1ba62a7c50b562bafd1 (diff) | |
parent | 657eeca391288f24e794bc35ea9da2f1865216fd (diff) |
Merge remote-tracking branch 'kbuild/for-next'
Conflicts:
scripts/Kbuild.include
Diffstat (limited to 'Documentation')
-rw-r--r-- | Documentation/lto-build | 173 |
1 files changed, 173 insertions, 0 deletions
diff --git a/Documentation/lto-build b/Documentation/lto-build new file mode 100644 index 000000000000..5dcce1e9cc25 --- /dev/null +++ b/Documentation/lto-build @@ -0,0 +1,173 @@ +Link time optimization (LTO) for the Linux kernel + +This is an experimental feature. + +Link Time Optimization allows the compiler to optimize the complete program +instead of just each file. LTO requires at least gcc 4.8 (but +works more efficiently with 4.9+) LTO requires Linux binutils (the normal FSF +releases used in many distributions do not work at the moment) + +The compiler can inline functions between files and do various other global +optimizations, like specializing functions for common parameters, +determing when global variables are clobbered, making functions pure/const, +propagating constants globally, removing unneeded data and others. + +It will also drop unused functions which can make the kernel +image smaller in some circumstances, in particular for small kernel +configurations. + +For small monolithic kernels it can throw away unused code very effectively +(especially when modules are disabled) and usually shrinks +the code size. + +Build time and memory consumption at build time will increase, depending +on the size of the largest binary. Modular kernels are less affected. +With LTO incremental builds are less incremental, as always the whole +binary needs to be re-optimized (but not re-parsed) + +Oops can be somewhat more difficult to read, due to the more aggressive +inlining. + +Normal "reasonable" builds work with less than 4GB of RAM, but very large +configurations like allyesconfig may need more memory. The actual +memory needed depends on the available memory (gcc sizes its garbage +collector pools based on that or on the ulimit -m limits) and +the compiler version. + +gcc 4.9+ has much better build performance and less memory consumption + +- A few kernel features are currently incompatible with LTO, in particular +function tracing, because they require special compiler flags for +specific files, which is not supported in LTO right now. +- Jobserver control for -j does not work correctly for the final +LTO phase due to some problems with the kernel's pipe code. +The makefiles hard codes -j<number of online cpus> for the final +LTO phase to work around for this + +Configuration: +- Enable CONFIG_LTO_MENU and then disable CONFIG_LTO_DISABLE. +This is mainly to not have allyesconfig default to LTO. +- FUNCTION_TRACER, STACK_TRACER, FUNCTION_GRAPH_TRACER, KALLSYMS_ALL, GCOV +have to disabled because they are currently incompatible with LTO. +- MODVERSIONS have to be disabled (may work with 4.9+) + +Requirements: +- Enough memory: 4GB for a standard build, more for allyesconfig +The peak memory usage happens single threaded (when lto-wpa merges types), +so dialing back -j options will not help much. + +A 32bit compiler is unlikely to work due to the memory requirements. +You can however build a kernel targeted at 32bit on a 64bit host. + +Example build procedure: + +Simplified procedure for distributions that have gcc 4.8, but not +the Linux binutils (for example openSUSE 13.1 or FC20): + +The LTO builds requires gcc-nm/gcc-ar. Some distributions ship +those in separate packages, which may need to be explicitely installed. + +- Get the latest Linux binutils from +http://www.kernel.org/pub/linux/devel/binutils/ +and unpack it. + +We install it in a separate directory to not overwrite the system binutils. + +# replace VERSION with respective version numbers + +cd binutils* +# don't forget the --enable-plugins! +./configure --prefix=/opt/binutils-VERSION --enable-plugins +make -j $(getconf _NPROCESSORS_ONLN) && sudo make install + +Fix up the kernel configuration to allow LTO: + +<start with a suitable kernel configuration> +./source/scripts/config --disable function_tracer \ + --disable function_graph_tracer \ + --disable stack_tracer --enable lto_menu \ + --disable lto_disable \ + --disable gcov \ + --disable kallsyms_all \ + --disable modversions +make oldconfig + +Then you can build with + +# The COMPILER_PATH is needed to let gcc use the new binutils +# as the LTO plugin linker +# if you installed gcc in a separate directory like below also +# add it to the PATH line below before the regular $PATH +# The COMPILER_PATH setting is only needed if the gcc was not built +# with --with-plugin-ld pointing to the Linux binutils ld +# The AR/NM setting works around a Makefile bug +COMPILER_PATH=/opt/binutils-VERSION/bin PATH=$COMPILER_PATH:$PATH \ +make -j$(getconf _NPROCESSORS_ONLN) AR=gcc-ar NM=gcc-nm + +If you don't have gcc 4.8+ as system compiler you would also need +to install that compiler. In this case I recommend getting +a gcc 4.9+ snapshot from http://gcc.gnu.org (or release when available), +as it builds much faster for LTO than 4.8. + +Here's an example build procedure: + +Assuming gcc is unpacked in gcc-VERSION + +cd gcc-VERSION +./contrib/download_preqrequisites +cd .. + +mkdir obj-gcc +# please don't skip this cd. the build will not work correctly in the +# source dir, you have to use the separate object dir +cd obj-gcc +../gcc-VERSION/configure --prefix=/opt/gcc-VERSION --enable-lto \ +--with-plugin-ld=/opt/binutils-VERSION/bin/ld +--disable-nls --enable-languages=c,c++ \ +--disable-libstdcxx-pch +make -j$(getconf _NPROCESSORS_ONLN) +sudo make install-no-fixedincludes + +FAQs: + +Q: I get a section type attribute conflict +A: Usually because of someone doing +const __initdata (should be const __initconst) or const __read_mostly +(should be just const). Check both symbols reported by gcc. + +Q: I see lots of undefined symbols for memcmp etc. +A: Usually because NM=gcc-nm AR=gcc-ar are missing. +The Makefile tries to set those automatically, but it doesn't always +work. Better to set it manually on the make command line. + +Q: It's quite slow / uses too much memory. +A: Consider a gcc 4.9 snapshot/release (not released yet) +The main problem in 4.8 is the type merging in the single threaded WPA pass, +which has been improved considerably in 4.9 by running it distributed. + +Q: It's still slow +A: It'll always be somewhat slower than non LTO sorry. + +Q: What's up with .XXXXX numeric post fixes +A: This is due LTO turning (near) all symbols to static +Use gcc 4.9, it avoids them in most cases. They are also filtered out +in kallsyms. + +References: + +Presentation on Kernel LTO +(note, performance numbers/details outdated. In particular gcc 4.9 fixed +most of the build time problems): +http://halobates.de/kernel-lto.pdf + +Generic gcc LTO: +http://www.ucw.cz/~hubicka/slides/labs2013.pdf +http://www.hipeac.net/system/files/barcelona.pdf + +Somewhat outdated too: +http://gcc.gnu.org/projects/lto/lto.pdf +http://gcc.gnu.org/projects/lto/whopr.pdf + +Happy Link-Time-Optimizing! + +Andi Kleen |