Speed up source file set operations.

Profiling the Fuchsia Zircon build's "gn gen" operation [1] shows
that a lot of time is spent copying source file sets during
a very large number of scope merges.

This CL provides a 30% speedup for the Zircon "gn gen", without
degrading the Chromium one (see [2]). This is done
by introducing SourceFileSet, a dedicated data type that avoids
std::string allocations and has much faster lookup, insert and
copy operations than std::set<SourceFile>.

More precisely, by:

  - Introducing a StringAtom data type, modelling a pointer to
    a globally unique constant string, with a heavily optimized
    lookup implementation. See comments in src/gn/string_atom.h
    and src/gn/string_atom.cc for usage and full details.

  - Modifying SourceFile to use a StringAtom, instead of an
    std::string to store the file path. This reduces the size
    and copy cost of each instance.

  - Replacing std::set<SourceFile> with the new SourceFileSet data
    type (defined as a base::flat_set<SourceFile>, with fast
    pointer-based comparisons, which turns out to be significantly
    faster than other alternatives).

Note that this works well because source file paths are immutable
and shared among many scopes / SourceFileSet instances.
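
As a rough illustration of the idea behind StringAtom (not the actual
code in src/gn/string_atom.h), the sketch below interns strings in a
global pool so that equal paths share one pointer. The class name
InternedString, the mutex-protected std::unordered_set pool, and all
method names are assumptions made for this example; the real
implementation is described as heavily optimized and may look quite
different.

```cpp
#include <cassert>
#include <functional>
#include <mutex>
#include <string>
#include <string_view>
#include <unordered_set>

// Hypothetical illustration only; not the code in src/gn/string_atom.h.
class InternedString {
 public:
  InternedString() : value_(Intern(std::string_view())) {}
  explicit InternedString(std::string_view s) : value_(Intern(s)) {}

  const std::string& str() const { return *value_; }

  // Equal contents always share one pooled string, so equality and
  // ordering only compare addresses and never read the characters.
  bool operator==(const InternedString& other) const {
    return value_ == other.value_;
  }
  bool operator!=(const InternedString& other) const {
    return value_ != other.value_;
  }
  // This order follows memory addresses, not alphabetical order.
  bool operator<(const InternedString& other) const {
    return std::less<const std::string*>()(value_, other.value_);
  }

 private:
  // Returns the address of the unique pooled copy of |s|.
  // std::unordered_set keeps element addresses stable across rehashes.
  static const std::string* Intern(std::string_view s) {
    static std::mutex mutex;
    static std::unordered_set<std::string> pool;
    std::lock_guard<std::mutex> lock(mutex);
    return &*pool.insert(std::string(s)).first;
  }

  const std::string* value_;  // Copying an instance copies one pointer.
};
```

With such a type, two instances built from the same path compare equal
by pointer, and each instance is no larger than a pointer, which is the
property the bullets above rely on.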

Note that despite the fact that SourceFileSet is unordered for
performance reasons, this CL does not modify the outputs of
"gn gen", at least when used for a Chromium Linux build. See [3]
for the methodology used to verify that.
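
To make the "flat set with pointer-based comparisons" idea concrete,
here is a minimal sketch of a sorted-vector set keyed on addresses. It
assumes each distinct path is represented by exactly one std::string
object. FlatPathSet and SortedByValue are hypothetical names, and the
sort-by-value step is only one generic way a consumer could obtain
stable output from an address-ordered set; this CL does not state that
gn does it this way.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>
#include <vector>

// Hypothetical stand-in for a flat set of interned paths: a sorted
// vector of pointers. Not the actual SourceFileSet definition.
class FlatPathSet {
 public:
  // Returns true if |path| was newly inserted.
  bool insert(const std::string* path) {
    auto it = std::lower_bound(items_.begin(), items_.end(), path,
                               std::less<const std::string*>());
    if (it != items_.end() && *it == path)
      return false;
    items_.insert(it, path);
    return true;
  }

  // Binary search over addresses; no string contents are compared.
  bool contains(const std::string* path) const {
    return std::binary_search(items_.begin(), items_.end(), path,
                              std::less<const std::string*>());
  }

  std::size_t size() const { return items_.size(); }

  // Iteration order follows addresses, so a caller that needs stable
  // output can copy the values and sort them by content.
  std::vector<std::string> SortedByValue() const {
    std::vector<std::string> result;
    result.reserve(items_.size());
    for (const std::string* p : items_)
      result.push_back(*p);
    std::sort(result.begin(), result.end());
    return result;
  }

 private:
  std::vector<const std::string*> items_;  // Copy = one contiguous copy.
};
```

Copying such a set copies a single contiguous array of pointers, which
is much cheaper than cloning a node-based std::set of strings.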

Finally, note that the patch reduces memory usage by nearly 1 GiB (!!)
for the Zircon build [4] (but only 6 MiB for the Chromium one).

[1] Performed from a Fuchsia source workspace with:
    $ gn gen --root=zircon out/default.zircon

[2] Results comparing the performance of 'gn' before and
    after the patch. Both binaries were built with a recent
    clang toolchain with build/gen.py -use-icf -use-lto:

--------------------------------------------------------
// ZIRCON BUILD

[fuchsia] $ gn gen --root=zircon out/default.zircon

//// BEFORE (* = best of 5 runs)

Done. Made 51431 targets from 944 files in 12825ms *
Done. Made 51431 targets from 944 files in 13932ms
Done. Made 51431 targets from 944 files in 14886ms
Done. Made 51431 targets from 944 files in 14642ms
Done. Made 51431 targets from 944 files in 13548ms

//// AFTER (* = best of 5 runs)

Done. Made 51431 targets from 944 files in 10623ms
Done. Made 51431 targets from 944 files in 10185ms
Done. Made 51431 targets from 944 files in 10623ms
Done. Made 51431 targets from 944 files in 9759ms  *
Done. Made 51431 targets from 944 files in 10656ms

Overall improvement: 12825 - 9759 = 3066 ms (31% faster)

--------------------------------------------------------
// CHROMIUM BUILD

[chromium/src] $ gn gen out/default

//// BEFORE
Done. Made 12150 targets from 2121 files in 4319ms
Done. Made 12150 targets from 2121 files in 4545ms
Done. Made 12150 targets from 2121 files in 4346ms
Done. Made 12150 targets from 2121 files in 4298ms *
Done. Made 12150 targets from 2121 files in 4390ms

//// AFTER
Done. Made 12150 targets from 2121 files in 4274ms
Done. Made 12150 targets from 2121 files in 4223ms
Done. Made 12150 targets from 2121 files in 4098ms
Done. Made 12150 targets from 2121 files in 4419ms
Done. Made 12150 targets from 2121 files in 4096ms *

Overall improvement: 4298 - 4096 = 202 ms  (4.9% faster)
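
The two percentages quoted above are consistent with dividing the saved
time by the "after" time (a speedup ratio) rather than by the "before"
time (the fraction of time saved). This is inferred from the numbers,
not stated in the CL; the helper names below are made up for the
example.

```cpp
#include <cassert>
#include <cmath>

// Speedup expressed relative to the new (after) time, in percent.
double SpeedupPercent(double before_ms, double after_ms) {
  return (before_ms - after_ms) / after_ms * 100.0;
}

// Wall-clock time saved relative to the old (before) time, in percent.
double TimeSavedPercent(double before_ms, double after_ms) {
  return (before_ms - after_ms) / before_ms * 100.0;
}
```

For the best-of-5 figures, SpeedupPercent gives about 31.4 (Zircon) and
4.9 (Chromium), while TimeSavedPercent gives about 23.9 and 4.7.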

[3] To verify that "gn gen" outputs were not affected
by this CL, the following experiment was set up on a
Linux workstation.

  1) Create a fresh chromium source checkout on
     a btrfs subvolume directory (e.g. /work/chromium,
     where /work is a btrfs mount point).

  2) cd /work/chromium/src && touch out/args.gn
     # Create an empty args.gn there.

  3) cd /work/chromium/src && <original-gn> gen out
     # Generate out/ files using original gn program.

  4) cd /work &&
     sudo btrfs subvolume snapshot chromium chromium-original
     # Snapshot the generated state to /work/chromium-original

  5) cd /work/chromium/src && <new-gn> gen out
     # Regenerate out/ files using patched gn program.

  6) diff -burN /work/chromium-original/src/out /work/chromium/src/out
     # Compare the files under out/

  --- /work/chromium0/src/out/default/build.ninja	2020-01-31 13:24:32.643343218 +0100
  +++ /work/cr-xx0/src/out/default/build.ninja	2020-01-31 13:21:42.105117112 +0100
  @@ -1,7 +1,7 @@
   ninja_required_version = 1.7.2

   rule gn
  -  command = ../../../../../tmp/gn-sourcefileset --root=../.. -q gen .
  +  command = ../../../../../tmp/gn-master --root=../.. -q gen .
     description = Regenerating ninja files

   build build.ninja: gn

The result shows that the only difference is the name of the "gn"
executable captured in build.ninja, all other files being
identical.

[4] Peak resident set size (RSS) usage as reported with /usr/bin/time -v

// ZIRCON BUILD

BEFORE: Maximum resident set size (kbytes): 2793824
AFTER:  Maximum resident set size (kbytes): 1817880
DIFF:   2793824 - 1817880 = 975944 kiB = 953 MiB (!!)

// CHROMIUM BUILD

BEFORE: Maximum resident set size (kbytes): 544296
AFTER:  Maximum resident set size (kbytes): 538068
DIFF:   544296 - 538068 = 6228 kiB = 6.1 MiB

Change-Id: I4f11fcddeb7b84a6126b748d2b4d66afd1642545
Reviewed-on: https://gn-review.googlesource.com/c/gn/+/7280
Commit-Queue: Brett Wilson <brettw@chromium.org>
Reviewed-by: Brett Wilson <brettw@chromium.org>
Reviewed-by: Scott Graham <scottmg@chromium.org>
22 files changed
README.md

GN

GN is a meta-build system that generates build files for Ninja.

Related resources:

Getting a binary

You can download the latest version of the GN binary for Linux, macOS and Windows.

Alternatively, you can build GN from source:

git clone https://gn.googlesource.com/gn
cd gn
python build/gen.py
ninja -C out
# To run tests:
out/gn_unittests

On Windows, it is expected that cl.exe, link.exe, and lib.exe can be found in PATH, so you'll want to run from a Visual Studio command prompt, or similar.

On Linux and Mac, the default compiler is clang++; a recent version is expected to be found in PATH. This can be overridden by setting CC, CXX, and AR.

Examples

There is a simple example in examples/simple_build directory that is a good place to get started with the minimal configuration.

To build and run the simple example with the default gcc compiler:

cd examples/simple_build
../../out/gn gen -C out
ninja -C out
./out/hello

For a maximal configuration see the Chromium setup:

and the Fuchsia setup:

Reporting bugs

If you find a bug, you can see if it is known or report it in the bug database.

Sending patches

GN uses Gerrit for code review. The short version of how to patch is:

Register at https://gn-review.googlesource.com.

... edit code ...
ninja -C out && out/gn_unittests

Then, to upload a change for review:

git commit
git push origin HEAD:refs/for/master

The first time you do this you'll get an error from the server about a missing change-ID. Follow the directions in the error message to install the change-ID hook and run git commit --amend to apply the hook to the current commit.

When revising a change, use:

git commit --amend
git push origin HEAD:refs/for/master

which will add the new changes to the existing code review, rather than creating a new one.

We ask that all contributors sign Google's Contributor License Agreement (either individual or corporate as appropriate, select ‘any other Google project’).

Community

You may ask questions and follow along with GN's development on Chromium's gn-dev@ Google Group.