Commit graph

398447 commits

Author SHA1 Message Date
Kazu Hirata dfc46f0268 [clang-tidy] Drop unnecessary const from return types (NFC)
Identified with readability-const-return-type.
2021-09-05 08:37:27 -07:00
David Green 1b83aaaefa [DAG] Remove oneuse check in select_cc setgt X, -1, C, ~C fold
This appears to produce better code, even if the condition may need to
be replicated.
2021-09-05 16:18:31 +01:00
Simon Pilgrim f114ef3731 [CostModel][X86] Add generic costs for vXi32 MUL -> v2Xi16 PMADDDW folds
Based off the improved fold in D108522

This should eventually allow us to replace the SLM only cost patterns with generic versions.
2021-09-05 16:08:11 +01:00
Simon Pilgrim 9962ebaee5 [CostModel][X86] Add vXi32 multiply pattern tests
Add tests for vXi32 multiplies where the operands have been extended from vXi8/vXi16
2021-09-05 16:08:11 +01:00
David Green 8523fb96a6 [DAG] Fold select_cc setgt X, -1, C, ~C -> xor (ashr X, BW-1), C
Given a select_cc producing a constant and a invertion of the constant
for a comparison more than zero, we can produce an xor with ashr
instead, which produces smaller code. The ashr either sets all bits or
clear all bits depending on if the value is negative. This is then xor'd
with the constant to optionally negate the value.
https://alive2.llvm.org/ce/z/DTFaBZ

This includes a OneUseCheck on the Cmp, which seems to make thinks a
little worse and will be removed in a followup.

Differential Revision: https://reviews.llvm.org/D109149
2021-09-05 16:04:01 +01:00
David Green 79845ed6df [DAG] Fold setcc eq with ashr to compare to zero.
Pulled out of D109149, this folds set_cc seteq (ashr X, BW-1), -1 ->
set_cc setlt X, 0 to prevent some regressions later on when folding
select_cc setgt X, -1, C, ~C -> xor (ashr X, BW-1), C

Differential Revision: https://reviews.llvm.org/D109214
2021-09-05 14:06:47 +01:00
Dávid Bolvanský 9c476172b9 [InstCombine] stpcpy(d,s) -> strcpy(d,s) if the result is not used 2021-09-05 12:12:07 +02:00
David Green 7801d7963d [DAG] Add tests for select_cc and setcc with constant patterns. 2021-09-05 10:17:21 +01:00
Cheng Wang 9b015383f1 [libc][Obvious] Reorder CMakelists alphabetically. 2021-09-05 11:01:05 +08:00
Cheng Wang 7abd8f6c6e [libc][Obvious] Fix typos 2021-09-05 10:54:52 +08:00
Michael Kruse 650bbc5620 [OpenMP][OpenMPIRBuilder] Implement loop unrolling.
Recommit of 707ce34b06. Don't introduce a
dependency to the LLVMPasses component, instead register the required
passes individually.

Add methods for loop unrolling to the OpenMPIRBuilder class and use them in Clang if `-fopenmp-enable-irbuilder` is enabled. The unrolling methods are:

 * `unrollLoopFull`
 * `unrollLoopPartial`
 * `unrollLoopHeuristic`

`unrollLoopPartial` and `unrollLoopHeuristic` can use compiler heuristics to automatically determine the unroll factor. If possible, that is if no CanonicalLoopInfo is required to pass to another method, metadata for LLVM's LoopUnrollPass is added. Otherwise the unroll factor is determined using the same heurstics as user by LoopUnrollPass. Not requiring a CanonicalLoopInfo, especially with `unrollLoopHeuristic` allows greater flexibility.

With full unrolling and partial unrolling with known unroll factor, instead of duplicating instructions by the OpenMPIRBuilder, the full unroll is still delegated to the LoopUnrollPass. In case of partial unrolling the loop is first tiled using the existing `tileLoops` methods, then the inner loop fully unrolled using the same mechanism.

Reviewed By: jdoerfert, kiranchandramohan

Differential Revision: https://reviews.llvm.org/D107764
2021-09-04 19:18:58 -05:00
Arthur Eubanks 37e6a27da7 [test] Fixup tests with -analyze in llvm/test/Transforms 2021-09-04 16:45:51 -07:00
Min-Yih Hsu 28868027f7 [M68k][test] Migrate the remaining fixup and relaxation tests
Migrate the tests regarding fixup and relaxation on branch and call
targets.
This patch wraps up the migration from `test/CodeGen/M68k/Encoding` to
`test/MC/M68k`.
2021-09-04 16:27:13 -07:00
Arthur Eubanks bd020bbbd2 [test] Cleanup tests with -enable-new-pm in llvm/test/Analysis 2021-09-04 16:06:10 -07:00
Arthur Eubanks d896f22fda [test] Cleanup legacy PM tests in llvm/test/Analyis/ScalarEvolution 2021-09-04 15:57:30 -07:00
Arthur Eubanks 0a0f62e8d6 [test] Cleanup legacy PM tests in llvm/test/DebugInfo 2021-09-04 15:52:43 -07:00
Anton Afanasyev dd028c359e [SLP][Test] Add tests for PR47624 and PR49933
Add tests monitoring issues fix. They should be fixed when
https://reviews.llvm.org/D57059 ("Initial support for the vectorization
of the non-power-of-2 vectors") is landed.
2021-09-05 01:16:59 +03:00
Jez Ng d9ab62ca3d [lld-macho] Initialize LTO backend with diagnostic handler
Failing to do so results in `std::bad_function_call` being
thrown when a pass tries to emit a diagnostic.

I've copied the relevant test over from LLD-ELF's test suite.

Reviewed By: #lld-macho, thevinster

Differential Revision: https://reviews.llvm.org/D109274
2021-09-04 17:40:07 -04:00
Nikita Popov ab79ffdb74 [verify-uselistorder] Support -force-opaque-pointers
By creating LLVMContext after parsing parameters.
2021-09-04 22:41:31 +02:00
Brad Smith 89f0587154 [CMake] Re-enable use --gc-sections on OpenBSD
Most archs have switched to lld.
2021-09-04 14:18:30 -04:00
Dávid Bolvanský 2572c76ec9 [NFC] Added testcases for new binop with select transformation 2021-09-04 20:15:50 +02:00
Fangrui Song e03c8d309a [AsmPrinter] Remove unneeded MCSubtargetInfo temporary after D14346. NFC
The temporary object was used as a workaround when the target parser may
change STI. D14346 made the MCSubtargetInfo argument to
createMCAsmParser const, so we no longer need the temporary object.
2021-09-04 10:50:10 -07:00
Dávid Bolvanský 3a696f6092 [InstCombine] rotate(X,Z) eq/ne rotate(Y,Z) ---> X eq/ne Y (PR51565)
```

----------------------------------------
define i1 @src(i8 %x, i8 %y, i8 %z) {
%0:
  %f = fshl i8 %x, i8 %x, i8 %z
  %f2 = fshl i8 %y, i8 %y, i8 %z
  %r = icmp eq i8 %f, %f2
  ret i1 %r
}
=>
define i1 @tgt(i8 %x, i8 %y, i8 %z) {
%0:
  %r = icmp eq i8 %x, %y
  ret i1 %r
}
Transformation seems to be correct!

```

https://alive2.llvm.org/ce/z/qAZp8f

Solves PR51565

Reviewed By: spatel

Differential Revision: https://reviews.llvm.org/D109271
2021-09-04 18:58:44 +02:00
Bjorn Pettersson 0f0344dd1e [SimpleLoopUnswitch] Inform pass manager when child loops are deleted
As part of the nontrivial unswitching we could end up removing child
loops. This patch add a notification to the pass manager when
that happens (using the markLoopAsDeleted callback).

Without this there could be stale LoopAccessAnalysis results cached
in the analysis manager. Those analysis results are cached based on
a Loop* as key. Since the BumpPtrAllocator used to allocate
Loop objects could be resetted between different runs of for
example the loop-distribute pass (running on different functions),
a new Loop object could be created using the same Loop pointer.
And then when requiring the LoopAccessAnalysis for the loop we
got the stale (corrupt) result from the destroyed loop.

Reviewed By: aeubanks

Differential Revision: https://reviews.llvm.org/D109257
2021-09-04 17:54:39 +02:00
Shivam Gupta 59c954f76a [LLDB][Docs] Indicate PS1 variable by $ 2021-09-04 20:57:59 +05:30
Kazu Hirata 15cd16aaf0 [Driver] Drop unnecessary const from return types (NFC)
Identified with readability-const-return-type.
2021-09-04 08:05:27 -07:00
Shivam Gupta 5449d2da65 [NFC] Run clang-format on llvm/lib/Trget/AVR/
The current inconsistency confuse contributors which coding guidlines to follow.
It would be better to have it consistent using clang-format tool.

Reviewed By: mhjacobson

Differential Revision: https://reviews.llvm.org/D109270
2021-09-04 20:05:15 +05:30
Simon Pilgrim cb8d96e72f Fix Wdocumentation unknown parameter warning. NFCI. 2021-09-04 15:06:53 +01:00
Simon Pilgrim 2005ae15a6 [X86][SLM] WriteVecIMul instructions only take 1uop (REAPPLIED)
The xmm variant have half the throughput (and +1cy latency) of the mmx variants, but are still 1uop.

I still need to do more thorough testing of SLM on test-suite before fixing the obvious bad numbers for WritePMULLD.

But this helps the D103695 helper script get to more accurate numbers for vXi32 multiplies of extended operands (i.e. we can use PMADDWD, PMULLW/PMULHW etc). Matches what Intel AoM / Agner / llvm-exegesis reports.
2021-09-04 15:03:56 +01:00
Simon Pilgrim ac51d69208 Revert rG994da657076900f5ad7fe593c3b5e5f89ab3d53d "[X86][SLM] WriteVecIMul instructions only take 1uop"
This changed some codegen tests that I forgot about in my rebase, I'll recommit shortly with a fix.
2021-09-04 13:39:10 +01:00
Dávid Bolvanský 73e1ba6215 [NFC] Added tests for PR51565 2021-09-04 14:38:43 +02:00
Simon Pilgrim 994da65707 [X86][SLM] WriteVecIMul instructions only take 1uop
The xmm variant have half the throughput (and +1cy latency) of the mmx variants, but are still 1uop.

I still need to do more thorough testing of SLM on test-suite before fixing the obvious bad numbers for WritePMULLD.

But this helps the D103695 helper script get to more accurate numbers for vXi32 multiplies of extended operands (i.e. we can use PMADDWD, PMULLW/PMULHW etc). Matches what Intel AoM / Agner / llvm-exegesis reports.
2021-09-04 13:21:34 +01:00
Simon Pilgrim c6371020a8 [X86][SLM] RMW instructions don't require an extra uop
For RMW instructions, the load and store hold the MEC for an extra cycle, but within the same single uop. This is alluded to in the Intel AOM:

"The MEC also owns the MEC RSV, which is responsible for scheduling of all loads and stores. Load and
store instructions go through addresses generation phase in program order to avoid on-the-fly memory
ordering later in the pipeline. Therefore, an unknown address will stall younger memory instructions."

Noticed while trying to get a cheap SLM test box up and running with llvm-exegesis - RMW arithmetic is always 1uop - and matches what Agner / InstLatX64 report as well.
2021-09-04 13:21:34 +01:00
Simon Pilgrim da965a77d5 [X86][SLM] Fix MUL uops, latency and throughput
These were all set to the same best case mul i32 values (which seems to be the only version of MUL that SLM actually performs well with).

Noticed while trying to improve multiplication costs for vectorization via the D103695 helper script. Confirmed with Intel AoM / Agner / InstLatX64.
2021-09-04 13:21:34 +01:00
Eugene Zhulenev fd52b4357a [mlir] Async: check awaited operand error state after sync await
Previously only await inside the async function (coroutine after lowering to async runtime) would check the error state

Reviewed By: mehdi_amini

Differential Revision: https://reviews.llvm.org/D109229
2021-09-04 05:00:17 -07:00
David Carlier 2833a2edac [Sanitizers] netbsd build fix due to wordexp interception. 2021-09-04 12:50:28 +01:00
Mark de Wever fea130cec9 [libc++][doc] Update format status.
Marked the entries solely depending on D103357 or D96664 as complete.
Initial work on implementing P2216 has started.
2021-09-04 13:31:29 +02:00
Simon Pilgrim 7d062d2c47 [X86][Atom] MUL/DIV instructions require both ports, not either.
Noticed while trying to improve multiplication costs for vectorization via the D103695 helper script. Confirmed with Intel AoM.
2021-09-04 11:58:09 +01:00
Simon Pilgrim 0d0f39b0f3 [X86][Atom] Add missing UOps override to AtomWriteResPair multiclass
Make it easier to describe microcoded instructions.
2021-09-04 11:58:09 +01:00
David Carlier 08c3cdb8b8 [Sanitizers][PGO] missing return statement 2021-09-04 11:40:58 +01:00
Mark de Wever df2af9936c [libc++][format] Add a CMake Unicode option.
This option is used to select between the format headers output column
width option. This option should be independent of the locale setting.
It's encouraged to default to Unicode unless the platform doesn't offer
that option.

[format.string.std]/10
```
  For the purposes of width computation, a string is assumed to be in a
  locale-independent, implementation-defined encoding. Implementations
  should use a Unicode encoding on platforms capable of displaying Unicode
```

Reviewed By: #libc, ldionne, vitaut

Differential Revision: https://reviews.llvm.org/D103379
2021-09-04 11:55:10 +02:00
LLVM GN Syncbot a1ea479f0a [gn build] Port d7444d9f41 2021-09-04 09:41:32 +00:00
Mark de Wever d7444d9f41 [libc++][format] Implement formatters.
This implements the initial version of the `std::formatter` class and its specializations. It also implements the following formatting functions:
- `format`
- `vformat`
- `format_to`
- `vformat_to`
- `format_to_n`
- `formatted_size`

All functions have a `char` and `wchar_t` version. Parsing the format-spec and
using the parsed format-spec hasn't been implemented. The code isn't optimized,
neither for speed, nor for size.

The goal is to have the rudimentary basics working, which can be used as a
basis to improve upon. The formatters used in this commit are simple stubs that
will be replaced by real formatters in later commits.

The formatters that are slated to be replaced in this patch series don't have
an availability macro to avoid merge conflicts.

Note the formatter for `bool` uses `0` and `1` instead of "false" and
"true". This will be fixed when the stub is replaced with a real
formatter.

Implements parts of:
- P0645 Text Formatting

Completes:
- LWG3539 format_to must not copy models of output_iterator<const charT&>

Reviewed By: ldionne, #libc, vitaut

Differential Revision: https://reviews.llvm.org/D96664
2021-09-04 11:41:08 +02:00
Nikita Popov 66a54af967 [WebAssembly] Support opaque pointers in AddMissingPrototypes
The change here is basically the same as in D108880: Rather than
looking at bitcasts, look at calls and their function type. We
still need to look through bitcasts to find those calls.

The change in llvm/test/CodeGen/WebAssembly/add-prototypes-conflict.ll
is due to different visitation order. add-prototypes-opaque-ptrs.ll
is a copy of add-prototypes.ll with -force-opaque-pointers.

Differential Revision: https://reviews.llvm.org/D109256
2021-09-04 11:25:42 +02:00
Dávid Bolvanský 9e06c767a4 [NFC] Added testcase for PR39116 2021-09-04 10:52:46 +02:00
Dávid Bolvanský 2aea581004 [NFC] Added testcase for PR48641 2021-09-04 10:44:39 +02:00
Kazuaki Ishizaki a1e7e401d2 [compiler-rt] NFC: Fix trivial typo
Reviewed By: xgupta

Differential Revision: https://reviews.llvm.org/D77457
2021-09-04 14:12:58 +05:30
Balazs Benics d6ca91ea42 [clang][AST] Add support for SubstTemplateTypeParmPackType to ASTImporter
Thank you @martong for acquiring a suitable test case!

Reviewed By: shafik, martong

Differential Revision: https://reviews.llvm.org/D109237
2021-09-04 10:19:57 +02:00
Balazs Benics b97a96400a [analyzer] SValBuilder should have an easy access to AnalyzerOptions
`SVB.getStateManager().getOwningEngine().getAnalysisManager().getAnalyzerOptions()`
is quite a mouthful and might involve a few pointer indirections to get
such a simple thing like an analyzer option.

This patch introduces an `AnalyzerOptions` reference to the `SValBuilder`
abstract class, while refactors a few cases to use this /simpler/ accessor.

Reviewed By: martong, Szelethus

Differential Revision: https://reviews.llvm.org/D108824
2021-09-04 10:19:57 +02:00
Balazs Benics 91c07eb8ee [analyzer] Ignore single element arrays in getStaticSize() conditionally
Quoting https://gcc.gnu.org/onlinedocs/gcc/Zero-Length.html:
> In the absence of the zero-length array extension, in ISO C90 the contents
> array in the example above would typically be declared to have a single
> element.

We should not assume that the size of the //flexible array member// field has
a single element, because in some cases they use it as a fallback for not
having the //zero-length array// language extension.
In this case, the analyzer should return `Unknown` as the extent of the field
instead.

Reviewed By: martong

Differential Revision: https://reviews.llvm.org/D108230
2021-09-04 10:19:57 +02:00