Commit graph

54 commits

Author SHA1 Message Date
Amir Ayupov aff52d1f08 [BOLT][CMAKE] Check build target architecture for runtime libs
Account for cross-compilation build scenarios (X86 to ARM, Linux
to Windows, etc).

Reviewed By: rafauler

Differential Revision: https://reviews.llvm.org/D124712
2022-05-05 10:40:14 -07:00
Amir Ayupov 8634aa2503 [BOLT][CMAKE] Simplify Clang/LLD identification
Refactor nested conditions. NFC

Reviewed By: maksfb

Differential Revision: https://reviews.llvm.org/D123861
2022-04-23 12:17:24 -04:00
Amir Ayupov 2a9386726b [BOLT][NFC] Use LLVM_REVISION instead of BOLT_VERSION_STRING
Remove duplicate version string identification

Reviewed By: rafauler

Differential Revision: https://reviews.llvm.org/D123549
2022-04-14 19:16:35 -07:00
Vladislav Khmelevsky 4956e0e197 [BOLT] Fix plt relocations symbol match
The bfd linker adds the symbol versioning string to the symbol name in symtab.
Skip the versioning part in order to find the registered PLT function.

Vladislav Khmelevsky,
Advanced Software Technology Lab, Huawei

Differential Revision: https://reviews.llvm.org/D122039
2022-04-05 15:57:26 +03:00
Yi Kong 8142ace0a7 Revert "Add CMake option not to build BOLT tests"
This reverts commit d8f4d54664.

Merged by accident.
2022-03-08 02:01:15 +08:00
Yi Kong d8f4d54664 Add CMake option not to build BOLT tests 2022-03-08 01:59:44 +08:00
Vladislav Khmelevsky 20e9d4caf0 [BOLT] Prepare BOLT for unit-testing
This patch adds unit testing support for BOLT. In order to do this we will need at least do this changes on the code level:
* Make createMCPlusBuilder accessible externally
* Remove positional InputFilename argument to bolt utlity sources
And prepare the cmake and lit for the new tests.

Vladislav Khmelevsky,
Advanced Software Technology Lab, Huawei

Reviewed By: maksfb, Amir

Differential Revision: https://reviews.llvm.org/D118271
2022-01-27 00:22:13 +03:00
Amir Ayupov de3e3fcfa3 [BOLT][CMAKE] Accept BOLT_CLANG_EXE and BOLT_LLD_EXE
Add CMake options to supply clang and lld binaries for use in check-bolt
instead of requiring the build of clang and lld projects.

Suggested by Mehdi Amini in https://lists.llvm.org/pipermail/llvm-dev/2021-December/154426.html

Test Plan:
```
cmake -G Ninja ~/local/llvm-project/llvm \
-DLLVM_TARGETS_TO_BUILD="X86"  \
-DCMAKE_BUILD_TYPE=Release \
-DLLVM_ENABLE_ASSERTIONS=ON \
-DLLVM_ENABLE_PROJECTS="bolt"  \
-DBOLT_CLANG_EXE=~/local/bin/clang \
-DBOLT_LLD_EXE=~/local/bin/lld

ninja check-bolt
...
llvm-lit: /home/aaupov/local/llvm-project/llvm/utils/lit/lit/llvm/config.py:436: note: using clang: /home/aaupov/local/bin/clang
llvm-lit: /home/aaupov/local/llvm-project/llvm/utils/lit/lit/llvm/config.py:436: note: using ld.lld: /home/aaupov/local/bin/ld.lld
llvm-lit: /home/aaupov/local/llvm-project/llvm/utils/lit/lit/llvm/config.py:436: note: using lld-link: /home/aaupov/local/bin/lld-link
llvm-lit: /home/aaupov/local/llvm-project/llvm/utils/lit/lit/llvm/config.py:436: note: using ld64.lld: /home/aaupov/local/bin/ld64.lld
llvm-lit: /home/aaupov/local/llvm-project/llvm/utils/lit/lit/llvm/config.py:436: note: using wasm-ld: /home/aaupov/local/bin/wasm-ld
...
```

Tested all configurations:
- LLVM_ENABLE_PROJECTS="bolt;clang;lld" + no BOLT_*_EXE
- LLVM_ENABLE_PROJECTS="bolt;clang" + BOLT_LLD_EXE
- LLVM_ENABLE_PROJECTS="bolt;lld" + BOLT_CLANG_EXE
- LLVM_ENABLE_PROJECTS="bolt" + BOLT_CLANG_EXE + BOLT_LLD_EXE
- LLVM_ENABLE_PROJECTS="bolt;clang;lld" + BOLT_CLANG_EXE + BOLT_LLD_EXE

Reviewed By: maksfb

Differential Revision: https://reviews.llvm.org/D117061
2022-01-15 04:37:29 -08:00
Amir Ayupov c34adaa3ca [BOLT][CMAKE] Use IN_LIST check
Summary:
Address @smeenai feedback https://reviews.llvm.org/D117061#inline-1122106:
>CMake has if(IN_LIST) now, which you can use instead of the string(FIND)

IN_LIST is available since CMake 3.3 released in 2015.

Reviewed By: smeenai

FBD33590959
2022-01-14 15:47:14 -08:00
Amir Ayupov cd7a630585 [BOLT][DOCS] Build doxygen documentation
Summary:
Added doxygen configuration files and CMake directives, copy-pasta from flang.

```cmake -G Ninja ../llvm-project/llvm \
  -DLLVM_ENABLE_PROJECTS="bolt" \
  -DBOLT_INCLUDE_DOCS=YES \
  -DLLVM_ENABLE_DOXYGEN=YES
ninja doxygen-bolt
```

(cherry picked from FBD33303249)
2021-12-23 15:23:39 -08:00
Rafael Auler b392ec696b Re-enable Windows build and fix issues
Summary:
Fix missing string header file inclusion and link_fdata find
problem in lit tests. Change root-level tests to require
linux. Re-enable Windows in our root CMakeLists.txt.

(cherry picked from FBD33296290)
2021-12-23 05:59:35 -08:00
Rafael Auler 283a87743e Fix install-bolt_rt dependencies
Summary:
This rule depends on bolt target, since it will run
install-bolt as well.

(cherry picked from FBD33152071)
2021-12-15 18:53:52 -08:00
Rafael Auler 5fc8adb529 Add bolt target to cmake
Summary:
Create a new high-level target named bolt that builds all
BOLT artifacts, as well as a install-bolt target that installs them.

(cherry picked from FBD33133002)
2021-12-15 10:16:58 -08:00
Rafael Auler bb201ca3e8 Disable Windows build
Summary:
Build is already broken because VS fails to locate
llvm-boltdiff when running tests, and VS also complains that
include/bolt/Passes/InstrumentationSummary.h is lacking an include
string header. Disable this until we have a Windows buildbot to make
sure this build is sane.

(cherry picked from FBD33039972)
2021-12-10 21:19:18 -08:00
Rafael Auler ae585be11c [BOLT] Fix Windows build
Summary:
Make BOLT build in VisualStudio compiler and run without
crashing on a simple test. Other tests are not running.

(cherry picked from FBD32378736)
2021-11-11 18:14:53 -08:00
Rafael Auler 0559dab546 [BOLT] Improve cmake configs for opensource
Summary:
Change cmake config in BOLT to only support Linux. In other
platforms, we print a warning that we won't build BOLT.  Change
configs to determine whether we will build BOLT runtime libs. This
only happens in x86 hosts. If true, we will build the runtime and
enable bolt-runtime tests. New tests that depend on the bolt_rt lib
needs to be marked REQUIRES:bolt-runtime. I updated the relevant
tests.  Fix cmake to do not crash when building llvm with a target
that BOLT does not support.

(cherry picked from FBD31935760)
2021-10-26 12:26:23 -07:00
Rafael Auler a34c753fe7 Rebase: [NFC] Refactor sources to be buildable in shared mode
Summary:
Moves source files into separate components, and make explicit
component dependency on each other, so LLVM build system knows how to
build BOLT in BUILD_SHARED_LIBS=ON.

Please use the -c merge.renamelimit=230 git option when rebasing your
work on top of this change.

To achieve this, we create a new library to hold core IR files (most
classes beginning with Binary in their names), a new library to hold
Utils, some command line options shared across both RewriteInstance
and core IR files, a new library called Rewrite to hold most classes
concerned with running top-level functions coordinating the binary
rewriting process, and a new library called Profile to hold classes
dealing with profile reading and writing.

To remove the dependency from BinaryContext into X86-specific classes,
we do some refactoring on the BinaryContext constructor to receive a
reference to the specific backend directly from RewriteInstance. Then,
the dependency on X86 or AArch64-specific classes is transfered to the
Rewrite library. We can't have the Core library depend on targets
because targets depend on Core (which would create a cycle).

Files implementing the entry point of a tool are transferred to the
tools/ folder. All header files are transferred to the include/
folder. The src/ folder was renamed to lib/.

(cherry picked from FBD32746834)
2021-10-08 11:47:10 -07:00
Rafael Auler 62aa74f836 [BOLT] Support instrumentation via runtime library
Summary:
To allow the development of future instrumentation work, this
patch adds support in BOLT for linking arbitrary libraries into the
binary processed by BOLT. We use orc relocation handling mechanism for
that. With this support, this patch also moves code programatically
generated in X86 assembly language by X86MCPlusBuilder to C code written
in a new library called bolt_rt. Change CMake to support this library as
an external project in the same way as clang does with compiler_rt. This
library is installed in the lib/ folder relative to BOLT root
installation and by default instrumentation will look for the library
at that location to finish processing the binary with instrumentation.

(cherry picked from FBD16572013)
2019-07-24 14:03:43 -07:00
laith sakka d3c1821f5f Compile Bolt using std 14.
Summary:
Compile Bolt using std 14.
We want that to be able to use some threading the locking tools that do not exists in std 11.

(cherry picked from FBD15671736)
2019-06-05 10:32:29 -07:00
Rafael Auler 8f717dd25e [BOLT] Add initial bolt-only test infra
Summary:
Create folders and setup to make LIT run BOLT-only tests. Add
a test example. This will add a new make/ninja rule "check-bolt" that
the user can invoke to run LIT on this folder.

(cherry picked from FBD8595786)
2018-06-22 13:50:07 -07:00
Maksim Panchenko 9c6f965616 [BOLT] Getting open-source ready
Summary:
BOLT sources are being moved under tools/llvm-bolt/src
and tools/llvm-bolt will contain more files such as LICENSE.txt,
README.txt, etc.

Remove trailing white spaces from our sources.

Create llvm.patch by running

  > git diff f137ed238db11440f03083b1c88b7ffc0f4af65e include lib > \
    tools/llvm-bolt/llvm.patch

README.txt has instructions on checking out sources and applying the
patch.

(cherry picked from FBD7878380)
2018-05-04 10:10:41 -07:00
Maksim Panchenko 48ae32a33b [BOLT] Introduce MCPlus layer
Summary:
Refactor architecture-specific code out of llvm into llvm-bolt.

Introduce MCPlusBuilder, a class that is taking over MCInstrAnalysis
responsibilities, i.e. creating, analyzing, and modifying instructions.
To access the builder use BC->MIB, i.e. substitute MIA with MIB.
MIB is an acronym for MCInstBuilder, that's what MCPlusBuilder used
to be. The name stuck, and I find it better than MPB.

Instructions are still MCInst, and a bunch of BOLT-specific code still
lives in LLVM, but the staff under Target/* is significantly reduced.

(cherry picked from FBD7300101)
2018-03-09 09:45:13 -08:00
Rafael Auler 6644548c74 [BOLTDIFF] Add a tool to audit performance differences
Summary:
This is a simple bolt-based tool that instantiates two
RewriteInstances objects and compares them. Add a method to
RewriteInstance to enable us to compare two objects. Include a mechanism
to match functions from binary 1 to binary 2 and finally print the
largest differences in profiling data from one binary to another.

(cherry picked from FBD6517076)
2017-12-07 15:00:41 -08:00
Bill Nell 0e4d86bf19 [BOLT] Refactor global symbol handling code.
Summary:
This is preparation work for static data reordering.

I've created a new class called BinaryData which represents a symbol
contained in a section.  It records almost all the information relevant
for dealing with data, e.g. names, address, size, alignment, profiling
data, etc.  BinaryContext still stores and manages BinaryData objects
similar to how it managed symbols and global addresses before.  The
interfaces are not changed too drastically from before either.  There is
a bit of overlap between BinaryData and BinaryFunction.  I would have
liked to do some more refactoring to make a BinaryFunctionFragment that
subclassed from BinaryData and then have BinaryFunction be composed or
associated with BinaryFunctionFragments.

I've also attempted to use (symbol + offset) for when addresses are
pointing into the middle of symbols with known sizes.  This changes the
simplify rodata loads optimization slightly since the expression on an
instruction can now also be a (symbol + offset) rather than just a symbol.

One of the overall goals for this refactoring is to make sure every
relocation is associated with a BinaryData object.  This requires adding
"hole" BinaryData's wherever there are gaps in a section's address space.
Most of the holes seem to be data that has no associated symbol info. In
this case we can't do any better than lumping all the adjacent hole
symbols into one big symbol (there may be more than one actual data
object that contributes to a hole). At least the combined holes should
be moveable.

Jump tables have similar issues. They appear to mostly be sub-objects
for top level local symbols. The main problem is that we can't recognize
jump tables at the time we scan the symbol table, we have to wait til
disassembly. When a jump table is discovered we add it as a sub-object
to the existing local symbol. If there are one or more existing
BinaryData's that appear in the address range of a newly created jump
table, those are added as sub-objects as well.

(cherry picked from FBD6362544)
2017-11-14 20:05:11 -08:00
Rafael Auler 8a5a30156e [BOLT rebase] Rebase fixes on top of LLVM Feb2018
Summary:
This commit includes all code necessary to make BOLT working again
after the rebase. This includes a redesign of the EHFrame work,
cherry-pick of the 3dnow disassembly work, compilation error fixes,
and port of the debug_info work. The macroop fusion feature is not
ported yet.

The rebased version has minor changes to the "executed instructions"
dynostats counter because REP prefixes are considered a part of the
instruction it applies to. Also, some X86 instructions had the "mayLoad"
tablegen property removed, which BOLT  uses to identify and account
for loads, thus reducing the total number of loads reported by
dynostats. This was observed in X86::MOVDQUmr. TRAP instructions are
not terminators anymore, changing our CFG. This commit adds compensation
to preserve this old behavior and minimize tests changes. debug_info
sections are now slightly larger. The discriminator field in the line
table is slightly different due to a change upstream. New profiles
generated with the other bolt are incompatible with this version
because of different hash values calculated for functions, so they will
be considered 100% stale. This commit changes the corresponding test
to XFAIL so it can be updated. The hash function changes because it
relies on raw opcode values, which change according to the opcodes
described in the X86 tablegen files. When processing HHVM, bolt was
observed to be using about 800MB more memory in the rebased version
and being about 5% slower.

(cherry picked from FBD7078072)
2018-02-06 15:00:23 -08:00
Bill Nell 2640b4071f [BOLT] Refactoring - add BinarySection class
Summary: Add BinarySection class that is a wrapper around SectionRef.  This is refactoring work for static data reordering.

(cherry picked from FBD6792785)
2018-01-23 15:10:24 -08:00
Maksim Panchenko b6cb112feb [BOLT] New profile format
Summary:
A new profile that is more resilient to minor binary modifications.

BranchData is eliminated. For calls, the data is converted into instruction
annotations if the profile matches a function. If a profile cannot be matched,
AllCallSites data should have call sites profiles.

The new profile format is YAML, which is quite verbose. It still takes
less space than the older format because we avoid function name repetition.

The plan is to get rid of the old profile format eventually.

merge-fdata does not work with the new format yet.

(cherry picked from FBD6753747)
2017-12-13 23:12:01 -08:00
Maksim Panchenko d15b93bade [BOLT] Major overhaul of profiling in BOLT
Summary:
Profile reading was tightly coupled with building CFG. Since I plan
to move to a new profile format that will be associated with CFG
it is critical to decouple the two phases.

We now have read profile right after the cfg was constructed, but
before it is "canonicalized", i.e. CTCs will till be there.

After reading the profile, we do a post-processing pass that fixes
CFG and does some post-processing for debug info, such as
inference of fall-throughs, which is still required with the current
format.

Another good reason for decoupling is that we can use profile with
CFG to more accurately record fall-through branches during
aggregation.

At the moment we use "Offset" annotations to facilitate location
of instructions corresponding to the profile. This might not be
super efficient. However, once we switch to the new profile format
the offsets would be no longer needed. We might keep them for
the aggregator, but if we have to trust LBR data that might
not be strictly necessary.

I've tried to make changes while keeping backwards compatibly. This makes
it easier to verify correctness of the changes, but that also means
that we lose accuracy of the profile.

Some refactoring is included.

Flag "-prof-compat-mode" (on by default) is used for bug-level
backwards compatibility. Disable it for more accurate tracing.

(cherry picked from FBD6506156)
2017-11-28 09:57:21 -08:00
spupyrev b77172ce2f updating cache metrics
Summary:
This is a replacement of a previous diff. The implemented metric
('graph distance') is not very useful at the moment but I plan to add
more relevant metrics in the subsequent diff. This diff fixes some
obvious problems and moves the call of CalcMetrics::printAll to the
right place.

(cherry picked from FBD6072312)
2017-10-16 16:53:50 -07:00
Rafael Auler 42f957bb75 [BOLT] Integrate perf2bolt into llvm-bolt
Summary:
Move the data aggregator logic from our python script to
our C++ LLVM/BOLT libs. This has a dramatic reduction in processing
time for profiling data (from 45 minutes for HHVM to 5 minutes) because
we directly use BOLT as a disassembler in order to validate traces found
in the LBR and to add the fallthrough counts. Previously, the python
approach relied on parsing the output objdump to check traces.

(cherry picked from FBD5761313)
2017-09-01 18:13:51 -07:00
Bohan Ren ec304396c3 [BOLT] Call Distance Metric
Summary:
Designed a new metric, which shows 93.46% correltation with Cache Miss
and 86% correlation with CPU Time.

Definition:

One can get all the traversal path for each function. And for each traversal,
we will define a distance. The distance represents how far two connected
basic blocks are. Therefore, for each traversal, I will go through the
basic blocks one by one, until the end of the traversal and sum up the
distance for the neighboring basic blocks.
Distance between two connected basic blocks is the distance of the
centers of two blocks in the binary file.

(cherry picked from FBD5242526)
2017-06-13 16:29:39 -07:00
Bill Nell 5cd58961a9 Add .bolt_info notes section containing BOLT revision and command line args.
Summary:
Optinally add a .bolt_info notes section containing BOLT revision and command line args.
The new section is controlled by the -add-bolt-info flag which is on by default.

(cherry picked from FBD5125890)
2017-05-24 14:14:16 -07:00
Maksim Panchenko f241e252fc [BOLT] Detect and handle __builtin_unreachable().
Summary:
Calls to __builtin_unreachable() can result in a inconsistent CFG.
It was possible for basic block to end with a conditional branche
and have a single successor. Or there could exist non-terminated
basic block without successors.

We also often treated conditional jumps with destination past the end
of a function as conditional tail calls. This can be prevented
reliably at least when the byte past the end of the function does
not belong to the next function.

This diff includes several changes:
  * At disassembly stage jumps past the end of a function are converted
    into 'nops'. This is done only for cases when we can guarantee that
    the jump is not a tail call. Conversion to nop is required since the
    instruction could be referenced either by exception handling
    tables and/or debug info. Nops are later removed.
  * In CFG insert 'ret' into non-terminated basic blocks without
    successors (this almost never happens).
  * Conditional jumps at the end of the function are removed from
    CFG. The block will still have a single successor.
  * Cases where a destination of a jump instruction is the start
    of the next function, are still conservatively handled as
    (conditional) tail calls.

(cherry picked from FBD4655046)
2017-03-03 11:35:41 -08:00
Maksim Panchenko 88244a10bb [BOLT] Move BOLT passes under Passes subdirectory (NFC).
Summary:
Move passes under Passes subdirectory.

Move inlining passes under Passes/Inliner.*

(cherry picked from FBD4575832)
2017-02-16 14:57:57 -08:00
Rafael Auler a75bbfc640 Add a frame optimization pass
Summary:
This is a first attempt to perform data flow analyses on bolt
and try to rebuild the stack frame for functions. The goal of the frame
optimization pass is to detect instructions that are accessing stack and,
if loading values, evaluate whether this load is redundant and we can
substitute the memory operation for a register load or immediate load.
To find opportunities, this pass also builds a map of clobbered registers
by function, so we use this in our analysis at call sites. If a call site
is found out to not clobber a caller-saved register but the caller is
spilling it anyway to the stack (to comply with the ABI), we should
detect these cases and remove this unnecessary move.

(cherry picked from FBD4337238)
2016-12-05 11:47:08 -08:00
Maksim Panchenko f2d82919d0 Move debug-handling code into DWARFRewriter (NFC).
Summary: RewriteInstance.cpp is getting too big. Split the code.

(cherry picked from FBD3596103)
2016-05-31 19:12:26 -07:00
Theodoros Kasampalis d09b00ebff Refactoring of the reordering algorithms
Summary:
The various reorder and clustering algorithms have been refactored
into separate classes, so that it is easier to add new algorithms and/or
change the logic of algorithm selection.

(cherry picked from FBD3473656)
2016-06-16 18:47:57 -07:00
Gabriel Poesia e6acc7bb53 Optimize calls to functions that are a single unconditional jump
Summary:
Many functions (around 600) in the HHVM binary are simply
a single unconditional jump instruction to another function. These can
be trivially optimized by modifying the call sites to directly call the
branch target instead (because it also happens with more than one jump
in sequence, we do it iteratively).

This diff also adds a very simple analysis/optimization pass system in
which this pass is the first one to be implemented. A follow-up to this
could be to move the current optimizations to other passes.

(cherry picked from FBD3211138)
2016-04-15 15:59:52 -07:00
Maksim Panchenko ff68b34553 Tool to merge .fdata files.
Summary:
merge-fdata tool takes multiple .fdata files and outputs to stdout
combined fdata. Takes about 2 seconds per each additional .fdata
file with hhvm production data.

(cherry picked from FBD3216430)
2016-04-08 12:18:06 -07:00
Gabriel Poesia ad344c4387 Group debugging info representation and serialization code.
Summary:
Moved the classes related to representing and serializing DWARF entities into a single
header, DebugData.h.

(cherry picked from FBD3153279)
2016-04-07 15:06:43 -07:00
Gabriel Poesia 4b4db40174 Update DWARF location lists after optimization.
Summary:

    Summary: Update DWARF location lists in .debug_loc and pointers to
    them in .debug_info so that gdb can print variables which change
    location during their lifetime.

    The following changes were made:

    - Refactored BasicBlockOffsetRanges to allow ranges to be tied to binary information (so that we can reuse it for location lists)
    - Implemented range compression optimization in BasicBlockOffsetRanges (needed otherwise too much data was being generated).
    - Added representation for location lists (LocationList.h, BinaryContext.h)
    - Implemented .debug_loc serializer that keeps the updated offsets (DebugLocWriter.{h,cpp})
    - After disassembly, traverse entries in .debug_loc and save them in context (BinaryContext.cpp)
    - After optimizations, serialize .debug_loc and update pointers in .debug_info (RewriteInstance.cpp)

(cherry picked from FBD3130682)
2016-04-01 11:37:28 -07:00
Gabriel Poesia ffa9641e16 Update DWARF lexical blocks address ranges.
Summary:
Updates DWARF lexical blocks address ranges in the output binary after optimizations.
This is similar to updating function address ranges except that the ranges representation needs
to be more general, since address ranges can begin or end in the middle of a basic block.

The following changes were made:

- Added a data structure for iterating over the basic blocks that intersect an address range: BasicBlockTable.h
- Added some more bookkeeping in BinaryBasicBlock. Basically, I needed to keep track of the block's size in the input binary as well as its address in the output binary. This information is mostly set by BinaryFunction after disassembly.
- Added a representation for address ranges relative to basic blocks (BasicBlockOffsetRanges.h). Will also serve for location lists.
- Added a representation for Lexical Blocks (LexicalBlock.h)
- Small refactorings in DebugArangesWriter:
-- Renamed to DebugRangesSectionsWriter since it also writes .debug_ranges
-- Refactored it not to depend on BinaryFunction but instead on anything that can be assined an aoffset in .debug_ranges (added an interface for that)
- Iterate over the DIE tree during initialization to find lexical blocks in .debug_info (BinaryContext.cpp)
- Added patches to .debug_abbrev and .debug_info in RewriteInstance to update lexical blocks attributes (in fact, this part is very similar to what was done to function address ranges and I just refactored/reused that code)
- Added small test case (lexical_blocks_address_ranges_debug.test)

(cherry picked from FBD3113181)
2016-03-28 17:45:22 -07:00
Gabriel Poesia 466cbae866 Update subroutine address ranges in binary.
Summary:
[WIP] Update DWARF info for function address ranges.

This diff currently does not work for unknown reasons,
but I'm describing here what's the current state.
According to both llvm-dwarf and readelf our output seems correct,
but GDB does not interpret it as expected. All details go below in
hope I missed something.

I couldn't actually track the whole change that introduced support for
what we need in gdb yet, but I think I can get to it
(2007-12-04: Support
lexical bocks and function bodies that occupy non-contiguous address ranges). I have reasons to believe gdb at least at some
nges).

The set of introduced changes was basically this:

- After disassembly, iterate over the DIEs in .debug_info and find the
ones that correspond to each BinaryFunction.
- Refactor DebugArangesWriter to also write addresses of functions to
.debug_ranges and track the offsets of function address ranges there
- Add some infrastructure to facilitate patching the binary in
simple ways (BinaryPatcher.h)
- In RewriteInstance, after writing .debug_ranges already with
function address ranges, for each function do:
-- Find the abbreviation corresponding to the function
-- Patch .debug_abbrev to replace DW_AT_low_pc with DW_AT_ranges and
DW_AT_high_pc with DW_AT_producer (I'll explain this hack below).
Also patch the corresponding forms to DW_FORM_sec_offset and
DW_FORM_string (null-terminated in-place string).
-- Patch debug_info with the .debug_ranges offset in place of
the first 4 bytes of DW_AT_low_pc (DW_AT_ranges only occupies 4
bytes whereas low_pc occupies 8), and write an arbitrary string
in-place in the other 12 bytes that were the 4 MSB of low_pc
and the 8 bytes of high_pc before the patch. This depends on
low_pc and high_pc being put consecutively by the compiler, but
it serves to validate the idea. I tried another way of doing it
that does not rely on this but it didn't work either and I believe
the reason for either not working is the same (and still unknown,
but unrelated to them. I might be wrong though, and if I find yet
another way of doing it I may try it). The other way was to
use a form of DW_FORM_data8 for the section offset. This is
disallowed by the specification, but I doubt gdb validates this,
as it's just easier to store it as 64-bit anyway as this is even
necessary to support 64-bit DWARF (which is not what gcc generates
by default apparently).

I still need to make changes to the diff to make it production-ready,
but first I want to figure out why it doesn't work as expected.

By looking at the output of llvm-dwarfdump or readelf, all of
.debug_ranges, .debug_abbrev and .debug_info seem to have been
correctly updated. However, gdb seems to have serious problems with
what we write.

(In fact, readelf --debug-dump=Ranges shows some funny warning messages
of the form ("Warning: There is a hole [0x100 - 0x120] in .debug_ranges"),
but I played around with this and it seems it's just because no
compile unit was using these ranges. Changing .debug_info apparently
changes these warnings, so they seem to be unrelated to the section
itself. Also looking at the hex dump of the section doesn't help,
as everything seems fine. llvm-dwarfdump doesn't say anything.
So I think .debug_ranges is fine.)

The result is that gdb not only doesn't show the function name as we
wanted, but it also stops showing line number information.
Apparently it's not reading/interpreting the address ranges at all,
and so the functions now have no associated address ranges, only the
symbol value which allows one to put a breakpoint in the function,
but not to show source code.

As this left me without more ideas of what to try to feed gdb with,
I believe the most promising next trial is to try to debug gdb itself,
unless someone spots anything I missed.
I found where the interesting part of the code lies for this
case (gdb/dwarf2read.c and some other related files, but mainly that one).
It seems in some parts gdb uses DW_AT_ranges for only getting
its lowest and highest addresses and setting that as low_pc and
high_pc (see dwarf2_get_pc_bounds in gdb's code and where it's called).
I really hope this is not actually the case for
function address ranges. I'll investigate this further. Otherwise
I don't think any changes we make will make it work as initially
intended, as we'll simply need gdb to support it and in that case it
doesn't.

(cherry picked from FBD3073641)
2016-03-16 18:08:29 -07:00
Gabriel Poesia 80ea31b24e Write updated .debug_aranges section after optimizations.
Summary:
Write the .debug_aranges section after optimizations to the output binary.
Each function generates at least one range and at most two (one extra for its cold part).
The writing is done manually because LLVM's implementation is tied to the output of
.debug_info (see EmitGenDwarfInfo and EmitGenDwarfARanges in lib/MC/MCDwarf.cpp),
which we don't want to trigger right now.

(cherry picked from FBD3043108)
2016-03-11 11:30:30 -08:00
Gabriel Poesia 77a6b72842 BOLT: Read and tie .debug_line info to IR.
Summary:
Reads information in the DWARF .debug_line section using LLVM and
tie every MCInst to one line of a line table from the input binary. Subsequent
diffs will update this information to match the final binary layout and
output updated line tables.

(cherry picked from FBD2989813)
2016-02-25 16:57:07 -08:00
Maksim Panchenko d1526083fc Rename binary optimizer to BOLT.
Summary:
BOLT - Binary Optimization and Layout Tool replaces FLO.
I'm keeping .fdata extension for "feedback data".

(cherry picked from FBD2908028)
2016-02-05 14:42:04 -08:00
Rafael Auler c67a753e3c Refactoring llvm-flo.cpp into a new class RewriteInstance, NFC.
Summary:
Previously, llvm-flo.cpp contained a long function doing lots of
different tasks. This patch refactors this logic into a separate class with
different member functions, exposing the relationship between each step of
the rewritting process and making it easier to coordinate/change it.

(cherry picked from FBD2691674)
2015-11-23 17:54:18 -08:00
Rafael Auler 2088875656 Teach llvm-flo how to read .eh_frame information from binaries
Summary: In order to reorder binaries with C++ exceptions, we first need to
read DWARF CFI (call frame info) from binaries in a table in the .eh_frame
ELF section. This table contains unwinding information we need to be aware of
when reordering basic blocks, so as to avoid corrupting it. This patch also
cleans up some code from Exceptions.cpp due to a refactoring where we moved
some functions to the LLVM's libSupport.

(cherry picked from FBD2614464)
2015-11-05 13:37:30 -08:00
Maksim Panchenko 21cc191ea8 Added function to parse and dump .gcc_except_table
Summary: Use '-print-exceptions' option to dump contents of .gcc_except_table.

(cherry picked from FBD2609925)
2015-11-02 11:50:53 -07:00
Maksim Panchenko b4ed5cc942 Make FLO work on hhvm binary.
Summary: Fixes several issues that prevented us from running hhvm binary.

(cherry picked from FBD2543057)
2015-10-14 15:35:14 -07:00