Commit graph

266 commits

Author SHA1 Message Date
Seunghoon Lee 86cdab3b14
Follow-up changes of #27 (#32)
* Update README.

* Drop dead codes.

* Use addr_of.

* Update Dockerfiles.

* Disable warnings for nvrtc.

* nvrtc.
2024-07-14 16:37:39 +09:00
Seunghoon Lee d7714d84c0
Add support of ROCm 6. (#27)
* Add support of ROCm 6.1.2 for Windows.

* Fix CI.

* Use llvm.sqrt.f64.
2024-07-13 13:47:35 +09:00
Seunghoon Lee d23b3f4636
Drop cudart. 2024-07-11 14:52:36 +09:00
Seunghoon Lee 53b51ddd55
Revert "Fix CI."
This reverts commit 0e558cc0ae.
2024-07-11 11:08:38 +09:00
Seunghoon Lee 0e558cc0ae
Fix CI. 2024-07-11 11:07:13 +09:00
Seunghoon Lee 4d5838ecec
Merge remote-tracking branch 'upstream/master' 2024-07-10 17:14:47 +09:00
Andrzej Janik 9e56862ebb
Update README with CGBN information (#237) 2024-05-23 13:28:17 +02:00
Seunghoon Lee 2ad9ad6851
[Fix] Clean up Runtime API. 2024-05-21 10:49:19 +09:00
Seunghoon Lee 11cc584451
Implement cuda runtime api. (cudart) (#17)
* [WIP] Implement cudart.

* wip

* wip

* Implement cudart.

* wip

* Ready to merge.
2024-05-17 13:15:16 +09:00
Seunghoon Lee 95f881004f
[CI] Rename CI jobs. 2024-05-17 13:08:52 +09:00
Seunghoon Lee 4f12e8cfe9
Merge remote-tracking branch 'upstream/master' 2024-05-17 12:44:07 +09:00
Seunghoon Lee 6fe6b7f843
[CI] Add CI for PR. 2024-05-17 12:34:31 +09:00
Andrzej Janik 2d8c47f147
Support Meshroom (#153) 2024-05-17 00:35:38 +02:00
NyanCatTW1 fcd7a57888
Fix + improve vprintf implementation (#211) 2024-05-16 00:38:52 +02:00
Andrzej Janik f0c905db15
Fix trap instruction codegen, don't fail build with older Rust versions (#229) 2024-05-08 15:19:59 +02:00
Andrzej Janik 27c0e13677
Minor codegen improvements (#225) 2024-05-06 00:28:49 +02:00
Seunghoon Lee 7538ae61c6
Remove zluda_dnn remains. 2024-04-29 22:56:19 +09:00
Seunghoon Lee 2804604c29
Merge remote-tracking branch 'upstream/master' 2024-04-29 22:32:59 +09:00
Andrzej Janik bdc652f9eb
Correctly report emulated wave32 CUDA device (#216) 2024-04-29 15:09:14 +02:00
Seunghoon Lee cb5fb0e633
[CI] Revise Windows HIP SDK dependency. 2024-04-28 14:39:42 +09:00
Seunghoon Lee 17af40c848
[CI] Add needs. 2024-04-28 13:53:26 +09:00
Seunghoon Lee c2710a88f1
Merge remote-tracking branch 'upstream/master' 2024-04-28 13:00:19 +09:00
Andrzej Janik 995bc95174
Build improvements (#206)
* Allow to create .zip package on Windows
* Allow to create .tar.gz package on Linux
* Add configuration for post-build Github CI
2024-04-28 01:22:43 +02:00
Seunghoon Lee b9e932b9c8
Rewrite nvrtcVersion. 2024-04-21 00:32:58 +09:00
Seunghoon Lee cf262c6889
Implement nvmlDeviceGetCudaComputeCapability. 2024-04-21 00:31:57 +09:00
Seunghoon Lee 12c2da499f
Merge remote-tracking branch 'upstream/master' 2024-04-21 00:08:33 +09:00
Andrzej Janik 5d5f7cca75
Rewrite surface implementation to more accurately support unofficial CUDA semantics (#203)
This fixes black screen in some CompuBench tests (TV-L1 Optical Flow) and other apps that use CUDA surfaces incorrectly
2024-04-14 02:39:34 +02:00
Seunghoon Lee 4baac34d4e
Implement cublasDotEx. 2024-04-10 15:20:58 +09:00
Seunghoon Lee 9e97c717c3
Disable DNN build on Windows. 2024-04-06 13:51:46 +09:00
Seunghoon Lee fcf2654050
Merge remote-tracking branch 'upstream/master' 2024-04-06 13:40:31 +09:00
Andrzej Janik 774f4bcb37
Implement sad instruction (#198) 2024-04-06 01:23:53 +02:00
Andrzej Janik 0d9ace2475
Fix buggy carry flags when mixing subc/sub.cc with addc/add.cc (#197) 2024-04-05 23:26:08 +02:00
NyanCatTW1 76bae5f91b
Implement mad.hi.cc (#196) 2024-04-05 19:12:59 +02:00
Andrzej Janik b695f44c18
Support old PTX compression scheme (#188) 2024-03-29 02:03:23 +01:00
Andrzej Janik 7d4147c8b2
Add Blender 4.2 support (#184)
Redo primary context and fix various long-standing bugs around this API
2024-03-28 17:12:10 +01:00
Seunghoon Lee 0f191c2354
Fix type errors. 2024-03-25 14:09:02 +09:00
Seunghoon Lee 1122cc0e83
Implement cublasSdot. 2024-03-25 11:30:23 +09:00
Seunghoon Lee 7c3891e6b3
Fix cusparseDnMatGet. 2024-03-21 22:07:36 +09:00
Seunghoon Lee ff1bc6d9b6
Fix nvrtcGetErrorString. 2024-03-21 22:06:42 +09:00
Seunghoon Lee 87d2e3e163
Add match order fallback. 2024-03-20 14:59:10 +09:00
Seunghoon Lee 2b52d0a040
Fix typo. 2024-03-20 14:38:49 +09:00
Seunghoon Lee 6b2488395d
Implement cusparseCreateDnMat, cusparseDestroyDnMat, cusparseDnMat*. 2024-03-20 14:12:00 +09:00
Seunghoon Lee 2812b1db44
Include zluda.exe in Windows release. 2024-03-20 11:22:57 +09:00
Seunghoon Lee 4fe7c2b1d4
Merge remote-tracking branch 'upstream/master' 2024-03-20 11:21:42 +09:00
Seunghoon Lee 4fab2af0f4
Implement cusparseXcoo2csr. 2024-03-20 10:30:31 +09:00
Seunghoon Lee f52edbd132
Implement cusparseXcoo2csr. 2024-03-20 10:29:09 +09:00
Seunghoon Lee cd1e0a3d50
Implement cublasDgetrsBatched. 2024-03-18 22:38:57 +09:00
Andrzej Janik 1ede61c696
Disable even more optional LLVM components (#179) 2024-03-17 14:53:15 +01:00
Seunghoon Lee 605254b38e
fix upload_url 2024-03-17 18:39:24 +09:00
Seunghoon Lee 65dbb30d2e
job deps 2024-03-17 18:09:08 +09:00