Releases: intel/intel-graphics-compiler
igc-1.0.4241
Fixed Issues / Improvements
- Rename enableSimd32 variable to reflect the usage.
- Add Frame-Pointer support to stack calls.
- Add support for explicit split coalescing in LRA.
- Replace assert with error for unsupported LLVM version in Scalar.h.
- ZEBinWriter: Fix standalone build and add test option.
- Add missing fence before EOT W/A when a kernel only has typed writes.
- Flag to override the value imposed on the kernel by CL_DEVICE_MAX_PARAMETER_SIZE.
- Fix select handling in ResolveGAS.
- Generate bindless access for images.
- Allow alignment improvement for all cases except stateful accesses, where the base of the stateful surface may be only DW-aligned, so only align 4 can be assumed on stateful messages.
- Fix spilled variable offset alignment when scratch space compression is enabled.
- Fix the SWSB global token allocation bug.
- Avoid dst and src overlap when they use the same variable.
- Added support for DG1 platform.
- Fix the acc number of channels value in SubPair method.
- VISA R0 variable name for easier debugging.
- Support ForceBestSIMD on pixel shaders.
- DWARF debugger location expressions fixes.
- Adding new CustomSafeOpt pattern for Ldrawvector.
- Add missing set in CMakeLists.
- Disable implicit args for functions called from indirect functions.
- Adding needed out of bounds check for constant coalescing.
- Reduce the even align to improve BCR.
- Fix a bug in variable split. Fix condition for right bound check in split verification pass.
- Fix the loop info in SWSB.
- Relax condition in Simd32Profitability.
- Disable LRA when split changes IR. Fix coalescing in color assignment to make it work with preRA scheduled code. Add more conditions in split verification.
- Check string types after int/float/vec types since we can distinguish between those based only on their type.
- Prevent rematerialization of relocation mov.
- Adjust TPM size for CM.
- Wrap methods related to indirect calls.
- Move variable split pass invocation to Optimizer before pre-RA scheduler. Add verification step to check assignment overlap.
- Limit the INT64 HW support. Fix the acc number of channels value.
- Remove alwaysinline attribute for function calls with aggregate and GAS pointer arguments. Support stack calls for struct type and GAS pointer args.
- Removed asserts that can actually trigger currently. For example:
- Fix condition to guarantee 'IfBB' is non-null below.
Ubuntu 18.04 binary packages for LLVM10/Clang10 are included.
igc-1.0.4155
Fixed Issues / Improvements
- Fix wave intrinsics fp16 emulation
Ubuntu 18.04 binary packages for LLVM10/Clang10 are included.
igc-1.0.4154
Fixed Issues / Improvements
- Add implementation for proper half precision using mul and mad instructions.
- Perform read-modify-write only for spills inside divergent control flow.
- Implement emulation of 64b mov.
- Run Emu64OpsPass preprocessor also if DataLayout contains 64bit global or local pointers.
- Better heuristic for A64WA.
- Check for dst/src overlap for IEEE FP64 macros.
- Enable SWSBDepReduction option by default.
- Minor fixes and improvements.
Ubuntu 18.04 binary packages for LLVM10/Clang10 are included.
igc-1.0.4111
Fixed Issues / Improvements
- Removed splitVariables pass as it should be subsumed by the split4GRFVars pass.
- When stack is used, make sure the spill/fill temp is at least one GRF.
- Fixed offset calculation for indirect function calls.
- Added optimization stats.
- Aligned caller-save and callee-save frame offset to 64 byte.
- Moved updateDebugLoc() function into help.hpp to avoid static definition of this function in several .cpp files.
- Fixed non-uniform calls with return on stack.
- Added vISA verifier check for input that straddles GRF boundary.
- Refactored hard-coding register size in a few legalization checks.
- Removed indentation from namespaces blocks.
- Removed m_ShaderMode from PS codegen, as its base already has it.
- Renamed m_ShaderMode -> m_ShaderDispatchMode in visa emitter.
- Use int copy instead of float copy during spill code generation.
- vISA syntax now allows more identifiers (VAR) that used to be lexical keywords, such as "byte" and "word".
- Fix WaDisableSendSrcDstOverlap() for instructions in divergent control flow.
- Fix -timestats issue so VISA SPILL is not counted twice.
- Updated IGA.
- Unpaired Op and Subfunction. Now there are Op::MATH and subfunction MathFC::INV, instead of Op::MATH_INV.
- Enabled color syntax in terminal output of IGAExe.
- Improvements to diagnostics in parser.
- Moved some OpSpec methods to cpp file to make header interface clearer.
- Old ld/st syntax removed.
- Changes to vISA IGA adapter to use new IR.
- Other vISA IGA adapter refactoring and improvements.
- Fixed file path in .asm dumps (missing slash between directory and file).
- Fixed a leak from CreateSystemThreadKernel.
- Added capturing some statistics: loop count, send instruction count, spill/fill operation count and estimated cycle count regardless of dump settings.
- Added fix in variable split to detect multiple definitions correctly.
- Fixed gtpin header with explicit padding.
- Removed alignment requirement for definitions in non-divergent BBs.
- Refactored GenIntrinsics to include comments in files and remove old csv comment file.
- Dst/src overlap bug fix.
- Dumping SPIRV files in clLinkProgram scenario.
- Emit top line number instead of zero.
Ubuntu 18.04 binary packages for LLVM10/Clang10 are included.
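The 64-byte frame offset alignment noted in the igc-1.0.4111 list is an instance of the usual round-up-to-a-power-of-two computation. The following is a minimal sketch of that arithmetic only, not IGC's implementation; `align_up` is a hypothetical helper name chosen for illustration.

```python
def align_up(offset: int, alignment: int = 64) -> int:
    """Round a non-negative offset up to the next multiple of alignment.

    Hypothetical helper for illustration; alignment must be a power of two.
    """
    if alignment <= 0 or alignment & (alignment - 1):
        raise ValueError("alignment must be a positive power of two")
    if offset < 0:
        raise ValueError("offset must be non-negative")
    # Adding (alignment - 1) and clearing the low bits rounds up.
    return (offset + alignment - 1) & ~(alignment - 1)
```

Under this sketch an offset that is already a multiple of 64 is left unchanged, and any other offset moves to the next 64-byte boundary.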
igc-1.0.4062
Fixed Issues / Improvements
- Vary private memory stack size based on SIMD width.
- Make register pressure estimate more accurate by considering alignment for <1GRF variables.
- Minor fixes and improvements.
Ubuntu 18.04 binary packages for LLVM10/Clang10 are included.
igc-1.0.4053
Fixed Issues / Improvements
- Add option for the first fit window in local RA
- Plumb LLVM names into vISA to improve debugging and readability of both vISA and GEN assembly
- Add WriteOnly Attribute for Intrinsic generation
- Make pre-defined FE SP/BP 64-bits by default
- Extended math instructions are given slightly higher weight than other instructions
- Fix build with system LLVM: do not hardcode LLVM library path
- Fixed non-uniform call code gen
- Fix a bug in reducing maximum IGC stack size from 8K to 1K
- ZEBinWriter: Refine TargetFlags definition for ELF header
- Set buffer size per inline buffer. Multiple inline buffers can be defined
- Removing the dst/src overlap checking after augmentation
- Rebuild physical register pool when kernel parameters are updated.
- Add ZEBinWriter for building ZE binary object
- Remove redundant sampler header movs
- Avoid hard-coding CS local_id alignment.
- Reduce IGC stack size from 8K to 1K when there's stack call
- Fix for mad_sat using long type
- Update max supported spirv version in SPIRV Translator
- Change more instruction creation to use createMov/createBinOp wrapper.
- Includes some minor fixes for SWSB and send decoding
- Use Target attribute whenever it is possible, rather than target option.
- Optimization for inline constant/global data buffers.
- Fix the bug in the response length calculation for SVM block read.
- Fixing Debug Info generation for IndirectlyCalled functions.
Ubuntu 18.04 binary packages for LLVM10/Clang10 are included.
igc-1.0.3977
Fixed Issues / Improvements
- Added AnnotateUniformAllocas pass.
- Added handling of StructType in metadata.
- Improvements to ShaderDumpEnable asm output; specifically the input arguments (added types).
- Refactored fence before EOT and handle function calls.
- Compute natural loops after modifying CFG.
- Copying struct by element instead of by byte.
- Disabling MTP when stack calls are present.
- Removed an obsolete W/A.
- Extended LVN; fixed a memory leak.
- Removed hard-coding message's and declare's size for pixel interpolator message.
- Removed hard-coding SIMD size for render target's write with stencil.
- Removed hard-coding R0 and R1 sizes in payload.
- Replaced isInSimdFlow() with isDivergent() with a new function isAllLaneActive().
- Made Inst->isBackward() return true for while loops.
- Added GTPin build options.
- Marked CVariable fields "const".
- Other minor changes and improvements.
Ubuntu 18.04 binary packages for LLVM10/Clang10 are included.
igc-1.0.3951
Fixed Issues / Improvements
- Fixed performance regression due to ConstantBufferCount.
- Add GTPin flags to support L0 driver path.
- Fixed struct return uniform to non-uniform copy.
- Indirect call support in SWSB.
- Add AnnotateUniformAllocas pass.
- Extend LVN.
- Minor fixes and improvements.
Ubuntu 18.04 binary packages for LLVM10/Clang10 are included.
igc-1.0.3899
Fixed Issues / Improvements
- Fix for setting Src1.Length in IGA IR.
- Small refactor to render target write code generation.
- Avoid splitting RetVal variable into two.
- Not limiting the loop unroll to small loops for high register pressure shaders if constant folding
- Handling the math intrinsics in FindInterestingConstants pass.
- Improve MergeURBWrites to also merge URBWrite instructions with dynamic URB offsets.
- Add support to dump GRF usage chart per instruction. Fix some bugs with variable splitting.
- Add GTPin flags to support L0 driver path
- Relax ACC substitution constraints
- Improve ConstantBufferCount estimate
- Merge from IGA; includes new platforms and some IGA IR refactoring.
- Fix an infinite loop that led to a compilation hang.
- Using attribute SpillMemOffset instead of option -spillMemOffset, so that it is encoded in isa binary.
- Refactor to avoid hard-coded simd size for vertex shader compilation.
Ubuntu 18.04 binary packages for LLVM10/Clang10 are included.
igc-1.0.3864
Fixed Issues / Improvements
- Fix get_sub_group_id builtin handling
- Remove unused GTPin vISA option
- Further refactoring of vISA attributes
- vISA will use dispatch SIMD size from an attribute instead of scanning all instructions
- Fix an infinite loop that led to a compilation hang
- Fix the performance regression caused by scalar suppression
- Refactor pixel shader code gen to avoid hard-coding SIMD size and GRF size
- Fix to SIMD16 shuffle down
- Add possibility to control partial component packing via shader metadata
- Fix sub_group_clustered_reduce_* on 64-bit types
- Extend LVN to detect more instructions
- Use attributes to get attributes names instead of hard-coding them wherever possible
- Added SimdSize kernel attribute (not used yet)
- Added sin/cos to sinpi/cospi optimization to improve performance
- Fixed the BCR static check
- Added getPlatform() function to IR_Builder and G4_INST
- Fixes in L0 tests
- Added vISA documentation
- Minor cleanup for G4_Declare
- Fixes for getAttributeID (searching all attributes instead of only kernels')
- Fix for subgroup_scan_exclusive for char and short
- Updated fcl interface to latest
- Added include guards, needed overrides, and some miscellaneous path fixes
- Added a debug knob to change the max block push constants threshold.
- Other minor bugfixes and improvements
Ubuntu 18.04 binary packages for LLVM10/Clang10 are included.
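The sin/cos to sinpi/cospi item in the igc-1.0.3864 list does not say which patterns are rewritten. Assuming it targets expressions of the form sin(pi * x), the sketch below illustrates why a sinpi-style function can help: the argument is reduced before any multiplication by pi, so integer inputs give an exact zero. `sinpi` here is a hypothetical Python stand-in written for illustration, not the OpenCL builtin and not IGC's code.

```python
import math


def sinpi(x: float) -> float:
    """Compute sin(pi * x) with range reduction done on x itself.

    Hypothetical illustration of the sinpi idea; not IGC's implementation.
    """
    # Reduce x into (-2, 2); fmod is exact for floating-point inputs.
    r = math.fmod(x, 2.0)
    # Integer arguments are exact zeros of sin(pi * x).
    if r == math.floor(r):
        return 0.0
    return math.sin(math.pi * r)
```

By contrast, `math.sin(math.pi * 1.0)` returns a small nonzero value, because `math.pi` is only an approximation of pi.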