Commit graph

13 commits

Author SHA1 Message Date
Maksim Panchenko d1526083fc Rename binary optimizer to BOLT.
Summary:
BOLT - Binary Optimization and Layout Tool replaces FLO.
I'm keeping .fdata extension for "feedback data".

(cherry picked from FBD2908028)
2016-02-05 14:42:04 -08:00
Maksim Panchenko b91d1f1299 Enable REPNZ prefix support.
Summary:
I didn't see a case where REPNZ were not disassembled/reassembled
properly.

(cherry picked from FBD2869229)
2016-01-26 17:53:08 -08:00
Maksim Panchenko 218c5f0916 Fix a bug with outlining first basic block.
Summary:
We should never outline the first basic block.
Also add an option to accept a file with the list of
functions to optimize.

(cherry picked from FBD2868184)
2016-01-26 16:03:58 -08:00
Maksim Panchenko 89578e2314 Allow to partially split functions with exceptions.
Summary:
We could split functions with exceptions even without creating
a new exception handling table. This limits us to only move
basic blocks that never throw, and are not a start of a
landing pad.

(cherry picked from FBD2862937)
2016-01-22 16:45:39 -08:00
Rafael Auler ccbbb8f8b9 Teach llvm-flo how to split functions into hot and cold regions
Summary:
After basic block reordering, it may be possible that the reordered
function is now larger than the original because of the following reasons:

- jump offsets may change, forcing some jump instructions to use 4-byte
immediate operand instead of the 1-byte, shorter version.
- fall-throughs change, forcing us to emit an extra jump instruction to jump
to the original fall-through at the end of a basic block.

Since we currently do not change function addresses, we need to rewrite the
function back in the binary in the original location. If it doesn't fit, we were
dropping the function.

This patch adds a flag -split-functions that tells llvm-flo to split hot
functions into hot and cold separate regions. The hot region is written back
in the original function location, while the cold region is written in a
separate, far-away region reserved to flo via a linker script.

This patch also adds the logic to create and extra FDE to supply unwinding
information to the cold part of the function. Owing to this, we now need to
rewrite .eh_frame_hdr to another location and patch the EH_FRAME ELF segment
to point to this new .eh_frame_hdr.

(cherry picked from FBD2677996)
2015-11-19 17:59:41 -08:00
Rafael Auler 6c851dc2e3 Attempts to fix CFI state after reordering
Summary:
This patch introduces logic to check how the CFI instructions define a
table to help during stack unwinding at exception run time and attempts to fix
any problem in this table that may have been introduced by reordering the basic
blocks. If it fails to fix this problem, the function is marked as not simple
and not eligible for rewriting.

(cherry picked from FBD2633696)
2015-11-08 12:23:54 -08:00
Rafael Auler 13a520ab30 Implement two cluster layout heuristics
Summary:
Pettis' paper on block layout (PLDI'90) suggests we should order
clusters (or chains, using the paper terminology) using a specific criterion.
This patch implements two distinct ideas for cluster layout that can be
activated using different command-line flags. The first one reflects Pettis'
ideas on minimizing branch mispredictions and the second one is targeted at
reducing I-cache misses, described in the Ispike paper (CGO'04).

(cherry picked from FBD2588693)
2015-10-23 09:38:26 -07:00
Maksim Panchenko 7f44331773 Issue warning when relaxed tail call is seen on input.
Summary:
Issue warning when we see a 2-byte tail call. Currently we
will increase the size of these instructions.

(cherry picked from FBD2575520)
2015-10-20 10:51:17 -07:00
Maksim Panchenko 85b99eb7b7 Eliminate nop instruction in input and derive alignment.
Summary:
Nop instructions are primarily used for alignment purposes on the input.
We remove all nops when we build CFG and derive alignment of basic blocks
based on existing alignment and a presence of nops before it. This
will not always work as some basic blocks will be naturally aligned
without necessity for nops. However, it's better than random alignment.
We would also add heuristics for BB alignment based on execution profile.

(cherry picked from FBD2561740)
2015-10-20 10:51:17 -07:00
Rafael Auler cd6250d1e3 Fixes branches after reordering basic blocks in a binary function
Summary:
Adds logic in BinaryFunction to be able to fix branches (invert
its condition, delete or add a branch), making the new function work with the
new layout proposed by the layout pass. All the architecture-specific content
was designed to live in the LLVM Target library, in the MCInstrAnalysis pass.
For now, we only introduce such logic to the X86 backend.

(cherry picked from FBD2551479)
2015-10-16 09:49:04 -07:00
Maksim Panchenko b4ed5cc942 Make FLO work on hhvm binary.
Summary: Fixes several issues that prevented us from running hhvm binary.

(cherry picked from FBD2543057)
2015-10-14 15:35:14 -07:00
Rafael Auler 9b58b2e64b Make llvm-flo infer branch count data for fall-through edges
Summary:
The LBR only has information about taken branches and does not record
information when a branch is not taken. In our CFG, we call these edges
"fall-through" edges. This patch teaches llvm-flo how to infer fall-through
edge frequencies.

(cherry picked from FBD2536633)
2015-10-13 10:25:45 -07:00
Maksim Panchenko 9a2fe7ebe4 Commit FLO with control flow graph.
Summary:
llvm-flo disassembles, builds control flow graph, and re-writes
simple functions.

(cherry picked from FBD2524024)
2015-10-09 17:21:14 -07:00