Commit graph

12 commits

Author SHA1 Message Date
Seunghoon Lee d7714d84c0
Add support of ROCm 6. (#27)
* Add support of ROCm 6.1.2 for Windows.

* Fix CI.

* Use llvm.sqrt.f64.
2024-07-13 13:47:35 +09:00
Seunghoon Lee 4baac34d4e
Implement cublasDotEx. 2024-04-10 15:20:58 +09:00
Seunghoon Lee 0f191c2354
Fix type errors. 2024-03-25 14:09:02 +09:00
Seunghoon Lee 1122cc0e83
Implement cublasSdot. 2024-03-25 11:30:23 +09:00
Seunghoon Lee cd1e0a3d50
Implement cublasDgetrsBatched. 2024-03-18 22:38:57 +09:00
Seunghoon Lee 2aa6fa3491
Implement cublasDgemm. 2024-02-23 21:57:52 +09:00
Seunghoon Lee d325f6a43c
Enable build for Linux. 2024-02-18 23:38:55 +09:00
Seunghoon Lee ad970a7665
Implement cublasSgetrsBatched. 2024-02-18 21:34:25 +09:00
Seunghoon Lee 8f3c1292b0
Merge remote-tracking branch 'upstream/master' 2024-02-17 04:13:06 +09:00
Andrzej Janik 4a81dbffb5
Update llama.cpp support (#102)
Add sign extension support to prmt, allow set.<op>.f16x2.f16x2, add more BLAS mappings
2024-02-16 00:01:21 +01:00
Seunghoon Lee 1ef7ef3938
Add support of cuBLAS, cuSPARSE for Windows. 2024-02-15 06:54:21 +09:00
Andrzej Janik 1b9ba2b233
Nobody expects the Red Team
Too many changes to list, but broadly:
* Remove Intel GPU support from the compiler
* Add AMD GPU support to the compiler
* Remove Intel GPU host code
* Add AMD GPU host code
* More device instructions. From 40 to 68
* More host functions. From 48 to 184
* Add proof of concept implementation of OptiX framework
* Add minimal support of cuDNN, cuBLAS, cuSPARSE, cuFFT, NCCL, NVML
* Improve ZLUDA launcher for Windows
2024-02-11 20:45:51 +01:00