[SYCL-MLIR] Merge from intel/llvm sycl branch #10895
Closed
This allows using VPRecipeWithIRFlags for VPInstruction and reduces the diff for D157144 & D157194.
Currently we only consider basic blocks with a unique predecessor when estimating the size of dead code. However, we could expand this to consider blocks with a back-edge, or blocks preceded by dead blocks. Differential Revision: https://reviews.llvm.org/D156903
This ports over the test cases half-convert.ll and implements patterns or RISCVISelLowering.cpp changes for all of the most straight-forward cases (those that don't require changes outside of lib/Target/RISCV). The remaining cases and noted poor codegen for saturating conversions will be handled in follow-up patches. Differential Revision: https://reviews.llvm.org/D156943
This patch moves directive sets defined internally in Semantics to a header accessible by other stages of the compiler to enable reuse. Some sets are renamed/rearranged and others are lifted from local definitions to provide a single source of truth. Differential Revision: https://reviews.llvm.org/D157090
…nversion Extending to f32 first (as is done for f16) results in better generated code for RISC-V (and affects no other in-tree tests). Additionally, performing the FP_EXTEND first seems equally justified for bf16 as for f16. Differential Revision: https://reviews.llvm.org/D156944
Most test cases have a blank line after the end of the RUN lines for readability.
This reduces the number of places where we have to check for a list of DS_GWS_* opcodes. Differential Revision: https://reviews.llvm.org/D157099
The affected lit tests failed when they were run in a path that contained the word "call". CHECK-NOT lines that were supposed to match only the IR ended up matching the path printed in the output. Fixed this by checking for "call void" instead.
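The failure mode described above can be reproduced with a deliberately simplified model of a CHECK-NOT line, treated here as a plain substring search over the whole tool output. The `check_not` helper, the output text, and the `/home/user/callgraph/...` path are all hypothetical and are not taken from the affected tests.

```python
def check_not(pattern: str, text: str) -> bool:
    """Simplified CHECK-NOT: passes only if `pattern` occurs nowhere in `text`."""
    return pattern not in text


# Hypothetical tool output: the first line echoes the build path, the rest is IR
# that contains no call instruction at all.
output = (
    "; ModuleID = '/home/user/callgraph/build/test.ll'\n"
    "define void @f() {\n"
    "  ret void\n"
    "}\n"
)

# The loose pattern also matches the "call" inside the path, so the check fails.
loose_passes = check_not("call", output)

# The tighter pattern only matches an actual call instruction in the IR.
tight_passes = check_not("call void", output)

print(loose_passes, tight_passes)
```

The tighter pattern still relies on the path not containing the literal text "call void", which is far less likely than a path containing "call".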
Reviewed By: aaron.ballman Differential Revision: https://reviews.llvm.org/D152141
Split off suggested refactoring from D157144. Also adds an assert to make sure this is only used when OpType is FPMathOp.
6640df9 did not actually remove it, just its final user. cannotBeOrderedLessThanZeroImpl still has a user which needs to be updated before it can be removed. The users of SignBitMustBeZero currently have broken expectations for nan handling, so replacing them requires more work.
Reviewed By: DavidSpickett Differential Revision: https://reviews.llvm.org/D157214
Currently `isTriviallyReMaterializable` calls `isReallyTriviallyReMaterializable` and `isReallyTriviallyReMaterializableGeneric`. The two interfaces are confusing, but there are also some real issues with this.

The documentation of this function (see below) suggests that `isReallyTriviallyReMaterializable` allows the target to override the default behaviour.

    /// For instructions with opcodes for which the M_REMATERIALIZABLE flag is
    /// set, this hook lets the target specify whether the instruction is actually
    /// trivially rematerializable, taking into consideration its operands.

It however implements something different. The default behaviour is the analysis done in `isReallyTriviallyReMaterializableGeneric`, which tests whether it is safe to rematerialize the MachineInstr. The result of `isReallyTriviallyReMaterializable` is only considered if `isReallyTriviallyReMaterializableGeneric` returns `false`. That means there is no way to override the default behaviour if `isReallyTriviallyReMaterializableGeneric` returns `true` (i.e. it is safe to rematerialize, but we'd rather not).

By making this a single interface, we can override the interface to do either.

Reviewed By: craig.topper, nemanjai
Differential Revision: https://reviews.llvm.org/D156520
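The control-flow problem in this commit message can be modelled outside LLVM. The sketch below is a language-neutral illustration in Python, not the LLVM C++ code: the class names, the `Instr` record, its `safe` field (a stand-in for the result of the generic safety analysis), and the `EXPENSIVE` opcode are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Instr:
    opcode: str
    safe: bool  # stand-in for the outcome of the generic safety analysis


class OldStyleTarget:
    """Two hooks: the target hook is only consulted when the generic one says no."""

    def generic(self, mi: Instr) -> bool:
        return mi.safe

    def target_hook(self, mi: Instr) -> bool:
        # The target would like to veto EXPENSIVE, but a False here cannot
        # override a True from the generic analysis.
        return False

    def is_trivially_remat(self, mi: Instr) -> bool:
        return self.generic(mi) or self.target_hook(mi)


class SingleHookTarget:
    """One overridable hook whose default is the generic analysis."""

    def is_trivially_remat(self, mi: Instr) -> bool:
        return mi.safe


class PickyTarget(SingleHookTarget):
    def is_trivially_remat(self, mi: Instr) -> bool:
        # Safe to rematerialize, but this target would rather not.
        if mi.opcode == "EXPENSIVE":
            return False
        return super().is_trivially_remat(mi)


mi = Instr(opcode="EXPENSIVE", safe=True)
old_result = OldStyleTarget().is_trivially_remat(mi)
new_result = PickyTarget().is_trivially_remat(mi)
print(old_result, new_result)
```

With two hooks combined by a logical or, the target can only widen the set of rematerializable instructions. With a single hook it can widen or narrow it, because the override decides whether and when to defer to the default.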
…s for PACKSS/PACKUS Begin to consolidate the similar matching code we have - all have semi-similar constraints that still need merging together to ensure we get consistent codegen depending on when the truncate is lowered.
…vector types Fuzz testing noticed that the sub-128-bit vector splitting added in ef4330f didn't correctly halt at <2 x iXX> truncations.
As requested in review for https://reviews.llvm.org/D156990. This additionally consistently uses the ilp32d/lp64d ABIs when the D extension is enabled.
redundant get() call on smart pointer.
Support IR that is generated by the vector-to-scf lowering of 2D vector transfers with a mask. Only 2D transfers that were fully unrolled are supported at the moment. Differential Revision: https://reviews.llvm.org/D156695
Use APInt to represent numeric variables and expressions, thereby removing overflow concerns. The only remaining concern is underflow, which occurs when the format of an expression is unsigned (incl. hex values) but the result is negative. Note that this can only happen when substituting an expression, not when capturing, since the regex used to capture an unsigned value will not include a minus sign; hence all the code removal for match propagation testing. This is what this patch implements. Reviewed By: arichardson Differential Revision: https://reviews.llvm.org/D150880
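The remaining underflow case can be illustrated with Python integers, which are arbitrary precision and so play the role APInt plays in the patch. This is a minimal sketch and not FileCheck's implementation: `format_value` and its use of `OverflowError` are hypothetical, and only the format letters `d`, `u`, `x` and `X` are modelled.

```python
def format_value(value: int, fmt: str) -> str:
    """Render an already-evaluated expression value in the given format.

    Evaluation itself cannot overflow because the integers are arbitrary
    precision; the only failure left is a negative value in an unsigned format.
    """
    if fmt == "d":
        return str(value)
    if fmt not in ("u", "x", "X"):
        raise ValueError(f"unknown format: {fmt}")
    if value < 0:
        raise OverflowError(f"cannot print {value} in unsigned format %{fmt}")
    if fmt == "u":
        return str(value)
    return format(value, fmt)  # "x" gives lowercase hex, "X" uppercase hex


# A value beyond 64 bits is no longer a problem.
big = format_value(2**64 + 1, "u")

# A negative result is fine for a signed format...
signed = format_value(3 - 5, "d")

# ...but is rejected for an unsigned one.
try:
    format_value(3 - 5, "u")
    underflow_detected = False
except OverflowError:
    underflow_detected = True

print(big, signed, underflow_detected)
```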
Add Matcher dependentSizedExtVectorType for DependentSizedExtVectorType. Reviewed By: aaron.ballman Differential Revision: https://reviews.llvm.org/D157237
Add Matcher convertVectorExpr. Reviewed By: aaron.ballman Differential Revision: https://reviews.llvm.org/D157248
…me from CF_OPTIONS This cherry-picks swiftlang/llvm-project#6431 since without it, macOS 14 SDK headers don't compile when targeting catalyst. Fixes #64438.
Similar to the other ValueTracking function, switch over the instruction opcode instead of doing a long sequence of match()es.
CONFLICT (content): Merge conflict in llvm/lib/Passes/PassBuilderPipelines.cpp
…f a live register This patch tweaks the fix in D20627 "Do not rename registers that do not start an independent live range" to only consider Data dependencies, not Output or Anti dependencies. An Output or Anti dependency to a superreg does not imply that that superreg is live at the current instruction. This enables breaking anti-dependencies in a few more cases as shown by the lit test updates. Differential Revision: https://reviews.llvm.org/D156879
…ister

This patch reworks the fix from D20627 "Do not rename registers that do not start an independent live range". That fix depended on the scheduler dependency graph having redundant edges. Those edges are removed by D156552 "[MachineScheduler] Track physical register dependencies per-regunit" with the result that on several Hexagon lit tests, the post-RA scheduler would schedule the code in a way that fails machine verification.

Consider this code where D11 is a pair R23:R22:

    SU(0): %R2<def> = A2_add %R23, %R17<kill>
      (anti dependency on R23 here)
    SU(8): %R23<def> = S2_asr_i_r %R22, 31
      (data dependency on R23->D11 here)
    SU(10): %D0<def> = A2_tfrp %D11<kill>

The original fix would detect this situation by examining the dependency from SU(8) to SU(10) and seeing that D11 is not a subreg of R23.

A slightly more complicated example:

    SU(0): %R2<def> = A2_add %R23, %R17<kill>
      (anti dependency on R23 here)
    SU(8): %R23<def> = S2_asr_i_r %R22, 31
      (data dependency on R23 here)
    SU(9): %R23<def> = S2_asr_i_r %R23, 31
      (data dependency on R23->D11 here)
    SU(10): %D0<def> = A2_tfrp %D11<kill>

The original fix also worked on this example, but only because ScheduleDAGInstrs adds an extra data dependency edge directly from SU(8) to SU(10). This edge is redundant, since you could infer it transitively from the edges SU(8)->SU(9) and SU(9)->SU(10), and since none of the data that SU(8) writes to R23 is read by SU(10).

After D156552 the redundant edge SU(8)->SU(10) will not be present, so when we examine the successors of SU(8) we will not find any that read from a superreg of R23.

This patch removes the original fix from D20627, which examined edges in the dependency graph. Instead it extends a check that was already being done in FindSuitableFreeRegisters: instead of checking that *some* register is a superreg of all registers in the rename group, we now check that the specific register that carries the anti-dependency that we want to break is a superreg of all registers in the rename group.

Differential Revision: https://reviews.llvm.org/D156880
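The difference between the old and new rename-group checks can be sketched as two predicates. This is an illustrative Python model rather than the code in FindSuitableFreeRegisters: the `SUBREGS` table, the `covers`, `old_check` and `new_check` helpers, and the assumption that the rename group for the Hexagon example is {R23, D11} are all hypothetical. A register is treated as covering itself.

```python
# Hypothetical sub-register table for the Hexagon example: D11 is the pair R23:R22.
SUBREGS = {
    "D11": {"R22", "R23"},
    "R22": set(),
    "R23": set(),
}


def covers(sup: str, reg: str) -> bool:
    """True if `sup` is `reg` itself or a super-register of `reg`."""
    return sup == reg or reg in SUBREGS[sup]


def old_check(group: set) -> bool:
    """Old condition: *some* register in the group covers every group member."""
    return any(all(covers(sup, reg) for reg in group) for sup in group)


def new_check(group: set, anti_dep_reg: str) -> bool:
    """New condition: the register carrying the anti-dependency covers every member."""
    return all(covers(anti_dep_reg, reg) for reg in group)


group = {"R23", "D11"}

# D11 covers both members, so the old check allows the rename.
old_allows = old_check(group)

# The anti-dependency is on R23, which does not cover D11, so the new check
# refuses to rename.
new_allows = new_check(group, "R23")

print(old_allows, new_allows)
```

Under these assumptions the new predicate rejects the rename without consulting any edge of the dependency graph, which is why it does not depend on the redundant SU(8)->SU(10) edge.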
Change the scheduler's physical register dependency tracking from registers-and-their-aliases to regunits. This has a couple of advantages when subregisters are used:

- The dependency tracking is more accurate and creates fewer useless edges in the dependency graph. An AMDGPU example, edited for clarity:

      SU(0): $vgpr1 = V_MOV_B32 $sgpr0
      SU(1): $vgpr1 = V_ADDC_U32 0, $vgpr1
      SU(2): $vgpr0_vgpr1 = FLAT_LOAD_DWORDX2 $vgpr0_vgpr1, 0, 0

  There is a data dependency on $vgpr1 from SU(0) to SU(1) and from SU(1) to SU(2). But the old dependency tracking code also added a useless edge from SU(0) to SU(2) because it thought that SU(0)'s def of $vgpr1 aliased with SU(2)'s use of $vgpr0_vgpr1.

- On targets like AMDGPU that make heavy use of subregisters, each register can have a huge number of aliases - it can be quadratic in the size of the largest defined register tuple. There is a much lower bound on the number of regunits per register, so iterating over regunits is faster than iterating over aliases.

The LLVM compile-time tracker shows a tiny overall improvement of 0.03% on X86. I expect a larger compile-time improvement on targets like AMDGPU.

Recommit after fixing AggressiveAntiDepBreaker in D156552's dependency D156880.

Differential Revision: https://reviews.llvm.org/D156552
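The AMDGPU example can be replayed with a small model of both tracking schemes. This is a simplified sketch that only models data dependencies and walks the instructions bottom-up; it is an assumption about the shape of the old and new bookkeeping, not the ScheduleDAGInstrs code. The `REGUNITS` table and all helper names are hypothetical.

```python
from collections import defaultdict

# Hypothetical register description: each register maps to its set of regunits.
REGUNITS = {
    "sgpr0": {"sgpr0"},
    "vgpr0": {"vgpr0"},
    "vgpr1": {"vgpr1"},
    "vgpr0_vgpr1": {"vgpr0", "vgpr1"},
}


def aliases(reg):
    """Registers sharing at least one regunit with `reg`, including `reg`."""
    return {r for r, units in REGUNITS.items() if units & REGUNITS[reg]}


def subregs_inclusive(reg):
    """`reg` and every register fully contained in it."""
    return {r for r, units in REGUNITS.items() if units <= REGUNITS[reg]}


# (defs, uses) per scheduling unit, taken from the example in the commit message.
PROGRAM = [
    (["vgpr1"], ["sgpr0"]),              # SU(0)
    (["vgpr1"], ["vgpr1"]),              # SU(1)
    (["vgpr0_vgpr1"], ["vgpr0_vgpr1"]),  # SU(2)
]


def data_edges(program, lookup_keys, clear_keys, record_keys):
    """Walk bottom-up, linking each def to the later uses it reaches."""
    uses = defaultdict(list)  # key -> later scheduling units reading that key
    edges = set()
    for su in reversed(range(len(program))):
        defs, used = program[su]
        for reg in defs:
            for key in lookup_keys(reg):
                edges.update((su, user) for user in uses[key])
            for key in clear_keys(reg):
                uses.pop(key, None)
        for reg in used:
            for key in record_keys(reg):
                uses[key].append(su)
    return edges


# Old scheme: uses are recorded per register and looked up through aliases.
alias_edges = data_edges(PROGRAM, aliases, subregs_inclusive, lambda r: {r})

# New scheme: everything is keyed by regunit.
units = REGUNITS.__getitem__
regunit_edges = data_edges(PROGRAM, units, units, units)

print(sorted(alias_edges))
print(sorted(regunit_edges))
```

In the alias-based model, SU(1)'s def of `vgpr1` cannot clear SU(2)'s use because that use is filed under `vgpr0_vgpr1`, so SU(0) later finds it through the alias lookup and the extra SU(0)->SU(2) edge appears. Keyed by regunit, the same def clears the `vgpr1` unit and the edge is never created.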
865f284 to c5344b7
This reverts commit 474ec69.
Signed-off-by: Tsang, Whitney
979d22e to 1f587e5
Signed-off-by: Tsang, Whitney
Signed-off-by: Tsang, Whitney
This was referenced Aug 23, 2023
Labels
disable-lint: Skip linter check step and proceed with build jobs
sycl-mlir: Pull requests or issues for sycl-mlir branch
Please only review 80301a6 and 2fee8ea and files with conflict:
A number of commits are reverted due to opaque pointers; this is tracked in #8616.
f39c399 "[Driver] -###: exit with code 1 if hasErrorOccurred" causes `ninja check-clang-driver` tests to fail. They are XFAILed temporarily; this is tracked in #10932.
Please do not squash and merge this PR.