squiid/llvm

Author	SHA1	Message	Date
Mehdi Amini	63d1dc6665	Add a doc/tutorial on traversing the IR Reviewed By: stephenneuendorffer Differential Revision: https://reviews.llvm.org/D87221	2020-09-08 00:07:03 +00:00
Mehdi Amini	0a63679267	Add documentation for getDependentDialects() in the PassManagement infra docs Reviewed By: rriddle Differential Revision: https://reviews.llvm.org/D87181	2020-09-07 23:59:11 +00:00
Zequan Wu	3e782bf809	[Sema][MSVC] warn at dynamic_cast when /GR- is given Differential Revision: https://reviews.llvm.org/D86369	2020-09-07 16:46:58 -07:00
Florian Hahn	efb8e156da	[DSE,MemorySSA] Add an early check for read clobbers to traversal. Depending on the benchmark, this early exit can save a substantial amount of compile-time: http://llvm-compile-time-tracker.com/compare.php?from=505f2d817aa8e07ba98e5fd4a8f6ff0666f89df1&to=eb4e441147f9b4b7a5fcbbc57428cadbe9e01f10&stat=instructions	2020-09-07 23:22:10 +01:00
Fangrui Song	5f5a0bb087	[asan][test] Use --image-base for Linux/asan_prelink_test.cpp if ld is LLD LLD supports -Ttext but with the option there is still a PT_LOAD at address zero and thus the Linux kernel will map it to a different address and the test will fail. Use --image-base instead.	2020-09-07 14:45:21 -07:00
Roman Lebedev	bb7d3af113	Reland [SimplifyCFG][LoopRotate] SimplifyCFG: disable common instruction hoisting by default, enable late in pipeline This was reverted in `503deec218` because it caused a gigantic increase (3x) in branch mispredictions in certain benchmarks on certain CPUs, see https://reviews.llvm.org/D84108#2227365. It has since been investigated and here are the results: https://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20200907/827578.html > It's an amazingly severe regression, but it's also all due to branch > mispredicts (about 3x without this). The code layout looks ok so there's > probably something else to deal with. I'm not sure there's anything we can > reasonably do so we'll just have to take the hit for now and wait for > another code reorganization to make the branch predictor a bit more happy :) > > Thanks for giving us some time to investigate and feel free to recommit > whenever you'd like. > > -eric So let's just reland this. Original commit message: I've been looking at missed vectorizations in one codebase. One particular thing that stands out is that some of the loops reach the vectorizer in a rather mangled form, with weird PHIs, and some of the loops aren't even in a rotated form. After taking a more detailed look, that happened because the loops' headers were too big by then. It is evident that SimplifyCFG's common code hoisting transform is at fault there, because the pattern it handles is precisely the unrotated loop basic block structure. Surprisingly, `SimplifyCFGOpt::HoistThenElseCodeToIf()` is enabled by default, and is always run, unlike its friend, the common code sinking transform, `SinkCommonCodeFromPredecessors()`, which is not enabled by default and is only run once very late in the pipeline. I'm proposing to harmonize this, and disable common code hoisting until //late// in the pipeline. 
The definition of //late// may vary; here I've currently picked the same one as for code sinking, but I suppose we could enable it as soon as right after loop rotation happens. Experimentation shows that this does indeed unsurprisingly help, more loops got rotated, although other issues remain elsewhere. Now, this undoubtedly seriously shakes phase ordering. This will undoubtedly be a mixed bag in terms of both compile- and run-time performance, and code size. Since we no longer aggressively hoist+deduplicate common code, we don't pay the price of said hoisting (which wasn't big). That may allow more loops to be rotated, so we pay that price. That, in turn, may enable all the transforms that require canonical (rotated) loop form, including but not limited to vectorization, so we pay that too. And in general, no deduplication means more [duplicate] instructions going through the optimizations. But there's still late hoisting, so some of them will be caught late. As per the benchmarks I've run {F12360204}, this is mostly within the noise, there are some small improvements, some small regressions. One big regression I saw I fixed in rG8d487668d09fb0e4e54f36207f07c1480ffabbfd, but I'm sure this will expose many more pre-existing missed optimizations, as usual :S llvm-compile-time-tracker.com thoughts on this: http://llvm-compile-time-tracker.com/compare.php?from=e40315d2b4ed1e38962a8f33ff151693ed4ada63&to=c8289c0ecbf235da9fb0e3bc052e3c0d6bff5cf9&stat=instructions * this does regress compile-time by +0.5% geomean (unsurprisingly) * size impact varies; for ThinLTO it's actually an improvement The largest fallout appears to be in GVN's load partial redundancy elimination, which spends much more time in `MemoryDependenceResults::getNonLocalPointerDependency()`. Non-local `MemoryDependenceResults` is widely known to be, uh, costly. 
There does not appear to be a proper solution to this issue, other than silencing the compile-time performance regression by tuning cut-off thresholds in `MemoryDependenceResults`, at the cost of potentially regressing run-time performance. D84609 attempts to move in that direction, but the path is unclear and is going to take some time. If we look at stats before/after diffs, some excerpts: * RawSpeed (the target) {F12360200} * -14 (-73.68%) loops not rotated due to the header size (yay) * -272 (-0.67%) `"Number of live out of a loop variables"` - good for vectorizer * -3937 (-64.19%) common instructions hoisted * +561 (+0.06%) x86 asm instructions * -2 basic blocks * +2418 (+0.11%) IR instructions * vanilla test-suite + RawSpeed + darktable {F12360201} * -36396 (-65.29%) common instructions hoisted * +1676 (+0.02%) x86 asm instructions * +662 (+0.06%) basic blocks * +4395 (+0.04%) IR instructions It is likely to be sub-optimal when optimizing for code size, so one might want to tune the pipeline by enabling sinking/hoisting when optimizing for size. Reviewed By: mkazantsev Differential Revision: https://reviews.llvm.org/D84108 This reverts commit `503deec218`.	2020-09-08 00:24:03 +03:00
Nikita Popov	ddab4cd83e	[KnownBits] Avoid some copies (NFC) These lambdas don't need copies, use const reference.	2020-09-07 22:19:29 +02:00
Nikita Popov	9fb46a452d	[SCCP] Compute ranges for supported intrinsics For intrinsics supported by ConstantRange, compute the result range based on the argument ranges. We do this independently of whether some or all of the input ranges are full, as we can often still constrain the result in some way. Differential Revision: https://reviews.llvm.org/D87183	2020-09-07 22:16:06 +02:00
Craig Topper	da79b1eecc	[SelectionDAG][X86][ARM] Teach ExpandIntRes_ABS to use sra+add+xor expansion when ADDCARRY is supported. Rather than using SELECT instructions, use SRA, UADDO/ADDCARRY and XORs to expand ABS. This is the multi-part version of the sequence we use in LegalizeDAG. It's also the same as the Custom sequence uses for i64 on 32-bit and i128 on 64-bit. So we can remove the X86 customization. Reviewed By: RKSimon Differential Revision: https://reviews.llvm.org/D87215	2020-09-07 13:15:26 -07:00
Sanjay Patel	8b30067919	[InstCombine] improve fold of pointer differences This was supposed to be an NFC cleanup, but there's a real logic difference (did not drop 'nsw') visible in some tests in addition to an efficiency improvement. This is because in the case where we have 2 GEPs, the code was always swapping the operands and negating the result. But if we have 2 GEPs, we should never need swapping/negation AFAICT. This is part of improving flags propagation noticed with PR47430.	2020-09-07 15:54:32 -04:00
Sanjay Patel	70207816e3	[InstCombine] add ptr difference tests; NFC	2020-09-07 15:54:32 -04:00
Craig Topper	01b3e16757	[X86] Use the same sequence for i128 ISD::ABS on 64-bit targets as we use for i64 on 32-bit targets. Differential Revision: https://reviews.llvm.org/D87214	2020-09-07 11:14:05 -07:00
Craig Topper	f3a6f6ccfd	[X86] Pre-commit new test case for D87214. NFC	2020-09-07 11:14:05 -07:00
Sanjay Patel	7a06b166b1	[DAGCombiner] allow more store merging for non-i8 truncated ops This is a follow-up suggested in D86420 - if we have a pair of stores in inverted order for the target endian, we can rotate the source bits into place. The "be_i64_to_i16_order" test shows a limitation of the current function (which might be avoided if we integrate this function with the other cases in mergeConsecutiveStores). In the earlier "be_i64_to_i16" test, we skip the first 2 stores because we do not match the full set as consecutive or rotate-able, but then we reach the last 2 stores and see that they are an inverted pair of 16-bit stores. The "be_i64_to_i16_order" test alters the program order of the stores, so we miss matching the sub-pattern. Differential Revision: https://reviews.llvm.org/D87112	2020-09-07 14:12:36 -04:00
Eric Astor	a3ec4a3158	[ms] [llvm-ml] Allow use of locally-defined variables in expressions MASM allows variables defined by equate statements to be used in expressions. Reviewed By: thakis Differential Revision: https://reviews.llvm.org/D86946	2020-09-07 14:00:14 -04:00
Eric Astor	2feb6e9b84	[ms] [llvm-ml] Fix STRUCT field alignment MASM aligns fields to the _minimum_ of the STRUCT alignment value and the size of the next field. Reviewed By: thakis Differential Revision: https://reviews.llvm.org/D86945	2020-09-07 13:58:59 -04:00
Eric Astor	e52e7ad54d	[ms] [llvm-ml] Add support for bitwise named operators (AND, NOT, OR) in MASM Add support for expressions of the form '1 or 2', etc. Reviewed By: thakis Differential Revision: https://reviews.llvm.org/D86944	2020-09-07 13:57:54 -04:00
Simon Pilgrim	5ea9e655ef	VPlan.h - remove unnecessary forward declarations. NFCI. Already defined in includes.	2020-09-07 18:35:06 +01:00
Simon Pilgrim	4e89a0ab02	MipsISelLowering.h - remove CCState/CCValAssign forward declarations. NFCI. These are already defined in the CallingConvLower.h include.	2020-09-07 18:15:26 +01:00
Simon Pilgrim	95ca3aacf0	BTFDebug.h - reduce MachineInstr.h include to forward declaration. NFCI.	2020-09-07 17:51:13 +01:00
Simon Pilgrim	dfc333050b	LeonPasses.h - remove unnecessary includes. NFCI. Reduce to forward declarations and move includes to LeonPasses.cpp where necessary.	2020-09-07 17:51:12 +01:00
Simon Pilgrim	1c34ac03a2	LeonPasses.h - remove orphan function declarations. NFCI. The implementations no longer exist.	2020-09-07 17:51:12 +01:00
Sanjay Patel	7a6d6f0f70	[InstCombine] improve folds for icmp with multiply operands (PR47432) Check for no overflow along with an odd constant before we lose information by converting to bitwise logic. https://rise4fun.com/Alive/2Xl Pre: C1 != 0 %mx = mul nsw i8 %x, C1 %my = mul nsw i8 %y, C1 %r = icmp eq i8 %mx, %my => %r = icmp eq i8 %x, %y Name: nuw ne Pre: C1 != 0 %mx = mul nuw i8 %x, C1 %my = mul nuw i8 %y, C1 %r = icmp ne i8 %mx, %my => %r = icmp ne i8 %x, %y Name: odd ne Pre: C1 % 2 != 0 %mx = mul i8 %x, C1 %my = mul i8 %y, C1 %r = icmp ne i8 %mx, %my => %r = icmp ne i8 %x, %y	2020-09-07 12:40:37 -04:00
Sanjay Patel	11d8eedfa5	[InstCombine] move/add tests for icmp with mul operands; NFC	2020-09-07 12:40:37 -04:00
alex-t	2480a31e5d	[AMDGPU] SILowerControlFlow::optimizeEndCF should remove empty basic block optimizeEndCF removes the EXEC restoring instruction in case this instruction is the only one except the branch to the single successor and that successor contains an EXEC mask restoring instruction that was lowered from END_CF belonging to IF_ELSE. As a result of such optimization we get a basic block whose only instruction is a branch to the single successor. In case the control flow can reach such an empty block from S_CBRANCH_EXEZ/EXECNZ it might happen that spill/reload instructions that were inserted later by the register allocator are placed under the exec == 0 condition and never execute. Removing the empty block solves the problem. This change requires further work to re-implement LIS updates. Currently, LIS is always nullptr in this pass. To enable it we need another patch to fix many places across the codegen. Reviewed By: rampitec Differential Revision: https://reviews.llvm.org/D86634	2020-09-07 19:37:27 +03:00
Momchil Velikov	eb482afaf5	Reduce the number of memory allocations when displaying a warning about clobbering reserved registers (NFC). Also address some minor inefficiencies and style issues. Differential Revision: https://reviews.llvm.org/D86088	2020-09-07 17:04:00 +01:00
Gabor Marton	8248c2af94	[analyzer][StdLibraryFunctionsChecker] Have proper weak dependencies We want the generic StdLibraryFunctionsChecker to report only if there are no specific checkers that would handle the argument constraint for a function. Note, the assumptions are still evaluated, even if the argument constraint checker is set to not report. This means that the assumptions made in the generic StdLibraryFunctionsChecker should be an over-approximation of the assumptions made in the specific checkers. But most importantly, the assumptions should not contradict. Differential Revision: https://reviews.llvm.org/D87240	2020-09-07 17:56:26 +02:00
Richard Barton	7e5dab5fca	[flang] Spelling and format edits to README.txt. NFC.	2020-09-07 16:49:08 +01:00
Gabor Marton	d01280587d	[analyzer][StdLibraryFunctionsChecker] Add POSIX pthread handling functions Differential Revision: https://reviews.llvm.org/D84415	2020-09-07 17:47:01 +02:00
Richard Barton	2e1827271c	[flang] Fix link to old repo location in doxygen mainpage. NFC.	2020-09-07 16:43:08 +01:00
Simon Pilgrim	783d7116dc	AntiDepBreaker.h - remove unnecessary ScheduleDAG.h include. NFCI.	2020-09-07 16:39:42 +01:00
Simon Pilgrim	c4056f8428	[Sparc] Add reduced funnel shift test case for PR47303	2020-09-07 16:17:31 +01:00
Simon Pilgrim	9de0a3da6a	[X86][SSE] Don't use LowerVSETCCWithSUBUS for unsigned compare with +ve operands (PR47448) We already simplify the unsigned comparisons if we've found the operands are non-negative, but we were still calling LowerVSETCCWithSUBUS which resulted in the PR47448 regressions.	2020-09-07 16:11:40 +01:00
Simon Pilgrim	7993431dad	[X86][SSE] Add test cases for PR47448	2020-09-07 15:57:18 +01:00
Simon Pilgrim	60162626a5	[X86] Replace UpgradeX86AddSubSatIntrinsics with UpgradeX86BinaryIntrinsics generic helper. NFCI. Feed the Intrinsic::ID value directly instead of via the IsSigned/IsAddition bool flags.	2020-09-07 15:57:18 +01:00
Sanjay Patel	b22910daab	[InstCombine] erase instructions leading up to unreachable Normal dead code elimination ignores assume intrinsics, so we fail to delete assumes that are not meaningful (and potentially worse if they cause conflicts with other assumptions). The motivating example in https://llvm.org/PR47416 suggests that we might have problems upstream from here (difference between C and C++), but this should be a cheap way to make sure we remove more dead code. Differential Revision: https://reviews.llvm.org/D87149	2020-09-07 10:44:08 -04:00
Frederik Gossen	a70f2eb3e3	[MLIR][Shape] Merge `shape` to `std`/`scf` lowerings. Merge the two lowering passes because they are not useful by themselves. The new pass lowers to `std` and `scf` is considered an auxiliary dialect. See also https://llvm.discourse.group/t/conversions-with-multiple-target-dialects/1541/12 Differential Revision: https://reviews.llvm.org/D86779	2020-09-07 14:39:37 +00:00
Sjoerd Meijer	288c582fc9	Follow up of rG5f1cad4d296a, slightly reduced test case. NFC.	2020-09-07 15:11:10 +01:00
Simon Pilgrim	96e0f34be7	[X86] Auto upgrade SSE/AVX PABS intrinsics to generic Intrinsic::abs Minor followup to D87101, we were expanding this to a neg+icmp+select pattern like we were in CGBuiltin	2020-09-07 15:07:26 +01:00
Simon Pilgrim	4b530f7519	[X86][SSE] Use llvm.abs.* vector intrinsics instead of old (deprecated) SSE/AVX intrinsics for combine tests This also allows us to extend testing to SSE2+ targets	2020-09-07 14:27:37 +01:00
Alex Zinenko	1e1a4a4819	[mlir] Take ValueRange instead of ArrayRef<Value> in StructuredIndexed This was likely overlooked when ValueRange was first introduced. There is no reason why StructuredIndexed needs specifically an ArrayRef so use ValueRange for better type compatibility with the rest of the APIs. Reviewed By: nicolasvasilache, mehdi_amini Differential Revision: https://reviews.llvm.org/D87127	2020-09-07 15:17:39 +02:00
Esme-Yi	a5046f7ace	[NFC][PowerPC] Add tests in constants-i64.ll.	2020-09-07 13:14:00 +00:00
Georgii Rymar	4368739941	[llvm-readobj] - Remove code duplication when printing dynamic relocations. NFCI. LLVM style code can be simplified to avoid the duplication of logic related to printing dynamic relocations. Differential revision: https://reviews.llvm.org/D87089	2020-09-07 16:11:12 +03:00
Daniel Muñoz	6b954f1b79	[KillTheDoctor/CMake] Add missing keyword PRIVATE in target_link_libraries Add PRIVATE keyword in target_link_libraries to prevent CMake Error on Windows. While trying to compile llvm/clang on Windows, the following CMake error occurred. The reason is a missing PUBLIC/PRIVATE/INTERFACE keyword in target_link_libraries. ` CMake Error at utils/KillTheDoctor/CMakeLists.txt:5 (target_link_libraries): The keyword signature for target_link_libraries has already been used with the target "KillTheDoctor". All uses of target_link_libraries with a target must be either all-keyword or all-plain. The uses of the keyword signature are here: * cmake/modules/AddLLVM.cmake:771 (target_link_libraries) ` Reviewed By: tambre Differential Revision: https://reviews.llvm.org/D87203	2020-09-07 16:08:55 +03:00
Simon Pilgrim	f6db681a78	[X86][SSE] Move llvm.x86.ssse3.pabs.*.128 intrinsics to ssse3-intrinsics-x86-upgrade.ll These have been auto upgraded for some time so this is just a tidyup.	2020-09-07 13:54:12 +01:00
Simon Pilgrim	2853ae3c1b	[X86] Update SSE/AVX ABS intrinsics to emit llvm.abs.* (PR46851) We're now getting close to having the necessary analysis/combines etc. for the new generic llvm.abs.* intrinsics. This patch updates the SSE/AVX ABS vector intrinsics to emit the generic equivalents instead of the icmp+sub+select code pattern. Differential Revision: https://reviews.llvm.org/D87101	2020-09-07 13:54:12 +01:00
LLVM GN Syncbot	bb73fcfd07	[gn build] Port `23f700c785`	2020-09-07 12:51:23 +00:00
Raphael Isemann	23f700c785	Revert "[clang] Prevent that Decl::dump on a CXXRecordDecl deserialises further declarations." This reverts commit `0478720157`. This probably doesn't work when forcing deserialising while dumping (which the ASTDumper optionally supports).	2020-09-07 14:50:13 +02:00
David Truby	973800dc7c	Revert "[MLIR][Shape] Merge `shape` to `std`/`scf` lowerings." This reverts commit `15acdd7543`.	2020-09-07 13:37:32 +01:00
Georgii Rymar	dbb8188195	[llvm-readobj/elf] - Generalize the code for printing dynamic relocations. NFCI. Currently we have 2 large `printDynamicRelocations` methods that have very similar code for GNU/LLVM styles. This patch removes the duplication and renames them to `printDynamicReloc` for consistency. Differential revision: https://reviews.llvm.org/D87087	2020-09-07 15:36:51 +03:00
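
Note on 7a6d6f0f70 ([InstCombine] improve folds for icmp with multiply operands): the three Alive preconditions in that message all come down to the multiply being injective on the values where it is defined. The following is a minimal brute-force sketch of that idea over i8, written in Python purely for illustration. The function names are hypothetical and this is not the InstCombine implementation; the nsw/nuw cases are modelled by discarding inputs whose product would overflow.

```python
# Illustrative check only; names are hypothetical, not LLVM APIs.

def wrapping_mul_is_injective(c):
    """'odd' case: plain i8 mul (wraps modulo 256) by constant c."""
    return len({(x * c) & 0xFF for x in range(256)}) == 256

def nsw_mul_is_injective(c):
    """'nsw' case: only consider x where x * c does not overflow signed i8."""
    products = [x * c for x in range(-128, 128) if -128 <= x * c <= 127]
    return len(products) == len(set(products))

def nuw_mul_is_injective(c):
    """'nuw' case: only consider x where x * c does not overflow unsigned i8."""
    products = [x * c for x in range(256) if x * c <= 255]
    return len(products) == len(set(products))

if __name__ == "__main__":
    print(all(wrapping_mul_is_injective(c) for c in range(1, 256, 2)))
    print(all(nsw_mul_is_injective(c) for c in range(-128, 128) if c != 0))
    print(all(nuw_mul_is_injective(c) for c in range(1, 256)))
```

When the multiply is injective, comparing the products for eq/ne gives the same answer as comparing the operands, which is why the `mul` can be dropped. An even constant without a no-overflow flag fails the check (for example, 0 and 128 both map to 0 when multiplied by 2), which matches the `C1 % 2 != 0` precondition.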

365501 commits
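
Note on da79b1eecc ([SelectionDAG][X86][ARM] Teach ExpandIntRes_ABS to use sra+add+xor expansion): the message describes expanding a wide ABS with SRA, UADDO/ADDCARRY and XORs instead of SELECTs. The sketch below models that arithmetic for an i64 split into two 32-bit halves. It is a Python illustration under that assumption, with a hypothetical function name, and is not the LLVM code itself.

```python
MASK32 = 0xFFFFFFFF
MASK64 = 0xFFFFFFFFFFFFFFFF

def abs_i64_via_halves(x):
    """Compute abs(x) for a signed 64-bit x using only 32-bit pieces.

    Hypothetical helper: models sign = sra(hi, 31), then
    (value + sign) ^ sign carried across the two halves.
    """
    u = x & MASK64
    lo, hi = u & MASK32, (u >> 32) & MASK32
    # SRA of the high half by 31: all ones if negative, else zero.
    sign = MASK32 if hi & 0x80000000 else 0
    # UADDO: add the low halves and capture the carry-out.
    total = lo + sign
    lo_sum, carry = total & MASK32, total >> 32
    # ADDCARRY: add the high halves plus the carry-in.
    hi_sum = (hi + sign + carry) & MASK32
    # XOR both halves with the sign mask.
    result = ((hi_sum ^ sign) << 32) | (lo_sum ^ sign)
    # Reinterpret the 64-bit pattern as a signed value.
    return result - (1 << 64) if result & (1 << 63) else result

if __name__ == "__main__":
    for value in (0, 5, -1, -(1 << 32), (1 << 63) - 1):
        print(value, abs_i64_via_halves(value))
```

As with any two's-complement ABS, the minimum value has no positive counterpart, so in this sketch `abs_i64_via_halves(-(1 << 63))` wraps back to `-(1 << 63)`.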