wiki:April 2011 Meeting/Getting compile time under control

Version 2 (modified by scherkus@chromium.org, 8 years ago)

TODO

  • Run Include What You Use
  • Measure compile times of individual files
  • Experiment with splitting Frame data members into a base class
  • Experiment with splitting inline functions into a separate file so that they are not actually inlined in Debug builds, especially for commonly-included files
  • Create a way to measure build time and track that over time, perhaps including compiling old builds to get historical data
  • Experiment with un-inlining functions in commonly-included headers

Notes

What are the major issues with compile time? What is causing slowness?

How many files recompile when you make a change? How long does it take one file to compile? How long is linking? How long do generation scripts take? How long does the build system take to calculate dependencies?

How long is a no-op build?

  • Chromium Linux no-op build is <10 seconds (make)
  • Ninja (make replacement) is 0.1 seconds for dependency analysis

Geoff: Fixed cost of no-op build is 30-60 seconds, linking is 30-60 seconds, rebuilding lots of files is 20-30 minutes

Sam: When I started on the project I could build and run tests in under an hour

Gavin: Platform.h causes lots of recompilation and is touched frequently. It would be interesting to look at stats of how frequently certain files are changed vs. how many files they cause to be recompiled
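A minimal sketch of the stats Gavin describes, assuming the change history is available as plain text with one touched path per line (for example from `git log --name-only --pretty=format:`) and that a map from header to number of dependent files has been computed separately. The function names and the "touches x dependents" score are illustrative assumptions, not an existing WebKit tool.

```python
from collections import Counter

def count_touches(name_only_log):
    """Count how many times each path appears in one-path-per-line log output."""
    return Counter(line.strip() for line in name_only_log.splitlines() if line.strip())

def rank_by_rebuild_cost(touches, dependents):
    """Rank headers by (number of changes) x (files recompiled per change).

    `dependents` is an assumed input: header name -> number of files that
    must be recompiled when that header changes.
    """
    scores = {header: touches.get(header, 0) * count
              for header, count in dependents.items()}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)
```

A header that is rarely touched scores low even if many files depend on it, which is the distinction the stats are meant to surface.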

abarth: Someone wrote a script for analyzing the header dependency graph. What are the biggest wins for reducing header includes?

Sam: Document.h, Frame.h, FrameLoader.h, etc.
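The script mentioned here is not part of these notes. As a rough sketch of what such an analysis could look like, the following counts, for each header, how many .cpp files include it directly or transitively. It assumes sources are given as a dict of file name to text and that quoted include strings match file names exactly (no include-path resolution, no conditional compilation); all names are hypothetical.

```python
import re
from collections import defaultdict

# Matches lines such as: #include "Frame.h"
INCLUDE_RE = re.compile(r'^\s*#\s*include\s*"([^"]+)"', re.MULTILINE)

def rebuild_fanout(sources):
    """Map each included header to the number of .cpp files that pull it in."""
    # Invert the include graph: header -> files that include it directly.
    included_by = defaultdict(set)
    for name, text in sources.items():
        for header in INCLUDE_RE.findall(text):
            included_by[header].add(name)

    fanout = {}
    for header in included_by:
        seen, stack = set(), [header]
        while stack:
            # Walk upward through everything that includes the current file.
            for parent in included_by.get(stack.pop(), ()):
                if parent not in seen:
                    seen.add(parent)
                    stack.append(parent)
        fanout[header] = sum(1 for name in seen if name.endswith(".cpp"))
    return fanout
```

Sorting the result by count gives a first guess at which headers are worth slimming down.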

Eric: Frame.h includes some headers so that some of its members don't have to be pointers

abarth: How long does Frame.cpp take to compile?

Sam: 3rd-longest in the source tree

Michael: Reducing the number of files that have to be compiled is the biggest win

Can we extract data members into simple base classes that don't include all the member functions? A lot of files just need to access Frame's members and don't need its other functions, so this could be a big win

Eric: .h files including other .h files contributes to .o file size, maybe also to linking time etc.

Dan: >10% of the object files on Mac are essentially empty object files from .cpp files for disabled features

Dan: It's unlikely that those rebuild a lot, but the linker still has to open and process them

Dan: Maybe we can avoid compiling files for disabled features at all (e.g., by using a smarter build system or better naming conventions)

abarth: Generating a build system can help with this

Dan: Also built-in features of Xcode can do it

Geoff: How can we reduce the time when you *do* need to rebuild everything (like I often do)?

Geoff: It's not disk-bound, but I don't know where the time is being spent

Sam: How can we figure out what the costs are?

Geoff: Run a clean world build and sort by most expensive files to compile, then compile those individually and profile the compiler

Eric: And look at the intermediate files
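The "sort by most expensive files" step could be sketched as below, assuming per-file timings have already been logged as lines of the form "<seconds> <path>". That log format and the function name are assumptions made for illustration only.

```python
def slowest_files(timing_lines, top=10):
    """Return the `top` (seconds, path) pairs with the largest compile time."""
    timings = []
    for line in timing_lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines in the log
        seconds, path = line.split(None, 1)
        timings.append((float(seconds), path))
    return sorted(timings, reverse=True)[:top]
```

The files at the top of this list are the ones to compile individually under a profiler.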

Nico: Worked on Chrome build, made it 25% faster (not familiar with WebKit though)

  • Main reason for long build times is that we have too much code
  • One way is to remove features
  • Inline code gets compiled multiple times, then the linker has to strip the duplicates out
  • Implicit constructors/destructors are also inlined
  • Templates also generate a lot of code; one solution is to pull most code into a non-template base class
  • WTF::Vector uses 700k of binary data in WebKit, etc.
  • Include What You Use tool can help identify unnecessary includes

Sam: Switching to Clang is a big speedup, too, but can't switch right now for some reason

We can use the CC environment variable to find out how long it takes to compile individual files (maybe add a flag to build-webkit)
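One way the CC idea could work is a small wrapper that runs the real compiler and logs the elapsed time. This is a sketch under assumptions: the wrapper receives the real compiler as its first argument, logs tab-separated "seconds, sources" lines to stderr, and recognises sources by the extensions listed. None of this is an existing build-webkit feature.

```python
import subprocess
import sys
import time

def timed_compile(argv):
    """Run the real compiler command `argv`; return (exit_code, elapsed_seconds)."""
    start = time.perf_counter()
    exit_code = subprocess.call(argv)
    return exit_code, time.perf_counter() - start

def main(args, log=sys.stderr):
    # args[0] is the real compiler (assumed convention); the rest is passed through.
    exit_code, elapsed = timed_compile(args)
    sources = [a for a in args if a.endswith((".c", ".cpp", ".mm"))]
    print(f"{elapsed:.3f}\t{' '.join(sources) or '(no source)'}", file=log)
    return exit_code

# Only act as a wrapper when invoked as a script with a compiler command.
if __name__ == "__main__" and len(sys.argv) > 1:
    sys.exit(main(sys.argv[1:]))
```

With a make-based build this might be used as something like CC="python time_cc.py gcc", where time_cc.py is a hypothetical file name; whether a given build system accepts a multi-word CC would need checking.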

Does #pragma once help? No, not any better than #ifndef include guards

We include RefPtr/PassRefPtr in a lot of places

  • We have Forward.h but don't know when to use it
  • Forward.h can let you avoid including RefPtr.h in a header, but inline functions that use RefPtrs defeat this strategy

What about not having any #include statements in .h files? Might be able to help based on experience with another project, but it's ugly

Chromium bots keep a graph of compile time

  • General trend is up, but it's fairly level
  • No-op builds are around 90 seconds, longest builds are around 1 hour

Tony Gentilcore worked on using forward-declarations more aggressively. Helped with clean build time but not rebuild time

Even if we get rid of all unnecessary #includes, we'll still have a lot of inline functions which require you to #include lots of things

  • Maybe we could put all inline functions into a single header file
  • Could try to include that file as little as possible
  • Could also, in Debug builds, put the function definitions into a .cpp file so that Debug builds don't pay the compile-time cost of inline functions
  • Could do this on a per-class basis
  • Can we just use a don't-inline flag instead? No, the compiler still has to read the function definitions, etc.

Could instead use a linker that knows how to inline at link time

More generally, we should move everything we can into .cpp files if it doesn't affect performance

Would be good to have historical data for build time, lines of code, etc.

Can we measure the incremental effect of a new commit?

  • Should we yell at people for increasing compile time?
  • What do we measure? Clean builds, incremental builds, something else?
  • Maybe we should only yell for individual commits that massively increase compile time
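As a sketch of what "only yell for massive increases" could mean in practice, the following compares each commit's build time with its predecessor's and flags jumps above a threshold. The 10% threshold, the input shape (a list of (commit, seconds) pairs in commit order) and the function name are all assumptions; a real setup would also need to deal with measurement noise, for example by averaging several runs.

```python
def flag_regressions(build_times, threshold=1.10):
    """Return (commit, added_seconds) for commits whose build is slower than
    `threshold` times the build of the commit just before them."""
    flagged = []
    for (_, before), (commit, after) in zip(build_times, build_times[1:]):
        if after > threshold * before:
            flagged.append((commit, after - before))
    return flagged
```

Comparing against the immediate predecessor blames only the commit where the jump happened, not the commits that follow it at the new, slower level.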

Can use Clang plugins to detect when an inlined function contains a large template instantiation, etc.

What are we optimizing for? Link time (good for people with infinite CPUs)? Compile time?

A compile farm for WebKit developers' use could perhaps reduce compile times

  • Maybe this won't work for Apple
  • Maybe this isn't practical across the whole internet

How much time is spent on code generation vs. preprocessing? Once upon a time, a lot of time was spent statting directories to find include files (gets worse as the number of include paths goes up)
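To illustrate why include lookup gets worse as include paths are added, here is a simplified model of the search: each #include is tried against the include directories in order, one file-system check per directory, until it is found. This is a sketch of the general mechanism, not of any particular compiler; the function names and the injectable `exists` check are assumptions made so the model can be exercised without touching the disk.

```python
import os

def count_probes(include_name, include_dirs, exists=os.path.isfile):
    """Return (directory_found_in, probes) for one #include lookup."""
    probes = 0
    for directory in include_dirs:
        probes += 1  # one file-system check per directory tried
        if exists(os.path.join(directory, include_name)):
            return directory, probes
    return None, probes

def total_probes(include_names, include_dirs, exists=os.path.isfile):
    """Total file-system checks needed to resolve every include in the list."""
    return sum(count_probes(name, include_dirs, exists)[1] for name in include_names)
```

In this model a header that lives in the last of N directories costs N checks every time it is included, so the total grows with both the number of includes and the number of include paths.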

SSDs give questionable improvements (maybe just for incremental builds, maybe just makes your computer more usable while compiling). Some tests showed that, given sufficient RAM, SSDs didn't help much

If you don't have enough RAM to fit all of WebKit in Mac OS X's vnode cache, your build time is waaaaay longer

lipoing GCC down to 32 bits helps reduce memory usage because GCC stores lots of pointers

Is it possible to split WebKit into multiple libraries?

  • On Mac OS X, we have 4 frameworks: JSC, WebCore, WebKit, WebKit2
  • We've discussed splitting out WebCore/platform, but layering violations make it hard
  • Also discussed splitting out WTF

Chromium uses a bindings layer on top of WebCore (.a)

Could try un-inlining functions in commonly-included headers

  • Hard to measure the effect
  • Could be dangerous not to inline trivial data accessors