Rollup merge of #89581 - jblazquez:master, r=Mark-Simulacrum

Add -Z no-unique-section-names to reduce ELF header bloat.

This change adds a new compiler flag that can help reduce the size of ELF binaries that contain many functions.

By default, when enabling function sections (which is the default for most targets), the LLVM backend will generate different section names for each function. For example, a function `func` would generate a section called `.text.func`. Normally this is fine because the linker will merge all those sections into a single one in the binary. However, starting with [LLVM 12](https://github.com/llvm/llvm-project/commit/ee5d1a04), the backend will also generate unique section names for exception handling, resulting in thousands of `.gcc_except_table.*` sections ending up in the final binary because some linkers like LLD don't currently merge or strip these EH sections (see discussion [here](https://reviews.llvm.org/D83655)). This can bloat the ELF headers and string table significantly in binaries that contain many functions.

The new option is analogous to Clang's `-fno-unique-section-names`, and instructs LLVM to generate the same `.text` and `.gcc_except_table` section for each function, resulting in a smaller final binary.

The motivation to add this new option was because we have a binary that ended up with so many ELF sections (over 65,000) that it broke some existing ELF tools, which couldn't handle so many sections.

Here's our old binary:

```
$ readelf --sections old.elf | head -1
There are 71746 section headers, starting at offset 0x2a246508:

$ readelf --sections old.elf | grep shstrtab
  [71742] .shstrtab      STRTAB          0000000000000000 2977204c ad44bb 00      0   0  1
```

That's an 11MB+ string table. Here's the new binary using this option:

```
$ readelf --sections new.elf | head -1
There are 43 section headers, starting at offset 0x29143ca8:

$ readelf --sections new.elf | grep shstrtab
  [40] .shstrtab         STRTAB          0000000000000000 29143acc 0001db 00      0   0  1
```

The whole binary size went down by over 20MB, which is quite significant.
This commit is contained in:
Matthias Krüger 2021-10-25 22:59:46 +02:00 committed by GitHub
commit 2f67647606
No known key found for this signature in database
GPG key ID: 4AEE18F83AFDEB23
6 changed files with 17 additions and 0 deletions

View file

@ -161,6 +161,7 @@ pub fn target_machine_factory(
let ffunction_sections =
sess.opts.debugging_opts.function_sections.unwrap_or(sess.target.function_sections);
let fdata_sections = ffunction_sections;
let funique_section_names = !sess.opts.debugging_opts.no_unique_section_names;
let code_model = to_llvm_code_model(sess.code_model());
@ -205,6 +206,7 @@ pub fn target_machine_factory(
use_softfp,
ffunction_sections,
fdata_sections,
funique_section_names,
trap_unreachable,
singlethread,
asm_comments,

View file

@ -2187,6 +2187,7 @@ extern "C" {
UseSoftFP: bool,
FunctionSections: bool,
DataSections: bool,
UniqueSectionNames: bool,
TrapUnreachable: bool,
Singlethread: bool,
AsmComments: bool,

View file

@ -744,6 +744,7 @@ fn test_debugging_options_tracking_hash() {
tracked!(new_llvm_pass_manager, Some(true));
tracked!(no_generate_arange_section, true);
tracked!(no_link, true);
tracked!(no_unique_section_names, true);
tracked!(no_profiler_runtime, true);
tracked!(osx_rpath_install_name, true);
tracked!(panic_abort_tests, true);

View file

@ -462,6 +462,7 @@ extern "C" LLVMTargetMachineRef LLVMRustCreateTargetMachine(
LLVMRustCodeGenOptLevel RustOptLevel, bool UseSoftFloat,
bool FunctionSections,
bool DataSections,
bool UniqueSectionNames,
bool TrapUnreachable,
bool Singlethread,
bool AsmComments,
@ -491,6 +492,7 @@ extern "C" LLVMTargetMachineRef LLVMRustCreateTargetMachine(
}
Options.DataSections = DataSections;
Options.FunctionSections = FunctionSections;
Options.UniqueSectionNames = UniqueSectionNames;
Options.MCOptions.AsmVerbose = AsmComments;
Options.MCOptions.PreserveAsmComments = AsmComments;
Options.MCOptions.ABIName = ABIStr;

View file

@ -1214,6 +1214,8 @@ options! {
"compile without linking"),
no_parallel_llvm: bool = (false, parse_no_flag, [UNTRACKED],
"run LLVM in non-parallel mode (while keeping codegen-units and ThinLTO)"),
no_unique_section_names: bool = (false, parse_bool, [TRACKED],
"do not use unique names for text and data sections when -Z function-sections is used"),
no_profiler_runtime: bool = (false, parse_no_flag, [TRACKED],
"prevent automatic injection of the profiler_builtins crate"),
normalize_docs: bool = (false, parse_bool, [TRACKED],

View file

@ -0,0 +1,9 @@
# `no-unique-section-names`
------------------------
This flag currently applies only to ELF-based targets using the LLVM codegen backend. It prevents the generation of unique ELF section names for each separate code and data item when `-Z function-sections` is also in use, which is the default for most targets. This option can reduce the size of object files, and depending on the linker, the final ELF binary as well.
For example, a function `func` will by default generate a code section called `.text.func`. Normally this is fine because the linker will merge all those `.text.*` sections into a single one in the binary. However, starting with [LLVM 12](https://github.com/llvm/llvm-project/commit/ee5d1a04), the backend will also generate unique section names for exception handling, so you would see a section name of `.gcc_except_table.func` in the object file and potentially in the final ELF binary, which could add significant bloat to programs that contain many functions.
This flag instructs LLVM to use the same `.text` and `.gcc_except_table` section name for each function, and it is analogous to Clang's `-fno-unique-section-names` option.