Donato Capitella
2c2c36d3da
add rocm-7.2.1-pr21344 toolbox (gfx1151 MMQ/MMVQ tile + nwarp tuning)
...
Adds a new toolbox variant based on PR #21344 (pedapudi/llama.cpp@gfx1151-opt)
which tunes MMQ tile sizes (x_max=48, y=64) and warp counts (nwarps=4) for
RDNA3_5 gfx1151, yielding up to +100% prefill throughput at small batch sizes.
Also adds BMI2/FMA/F16C CPU SIMD flags and GGML_CUDA_FA_ALL_QUANTS=ON to match
the benchmark build used in the PR. Wire up CI (build matrix + prune), the
refresh script, and run_benchmarks.sh so results land alongside rocm-7.2.1.
2026-04-15 09:23:58 +01:00
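A rough sketch of how the build flags named in this commit might appear on a cmake configure line. Only `GGML_CUDA_FA_ALL_QUANTS=ON` and the gfx1151 target come from the commit message. The `GGML_BMI2`, `GGML_FMA` and `GGML_F16C` option names, `GGML_HIP=ON` and `AMDGPU_TARGETS` are assumptions about how the toolbox Dockerfile passes them, and the MMQ tile and warp tuning lives in the PR's source changes rather than in any flag shown here.

```shell
#!/bin/sh
# Sketch only: prints a configure line, it does not run cmake.
# Assumption: the SIMD features are enabled through ggml's cmake options;
# the real Dockerfile may pass compiler flags such as -mbmi2 instead.
cmake_args() {
    printf '%s\n' \
        "-DGGML_HIP=ON" \
        "-DAMDGPU_TARGETS=gfx1151" \
        "-DGGML_CUDA_FA_ALL_QUANTS=ON" \
        "-DGGML_BMI2=ON" \
        "-DGGML_FMA=ON" \
        "-DGGML_F16C=ON"
}

# Join the arguments onto one line so the result can be copied into a RUN step.
echo "cmake -S . -B build $(cmake_args | tr '\n' ' ')"
```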
Donato Capitella
4ac481e7d1
chore: upgrade ROCm version from 7.2 to 7.2.1 across configuration and documentation
2026-04-09 18:33:52 +01:00
Donato Capitella
d1e49d4aa0
chore: remove llama.cpp PR 21566 patch from rocm7-nightlies Dockerfile
2026-04-07 18:33:18 +01:00
Donato Capitella
a58d133c5e
chore: update llama.cpp patch to PR 21566 for gemma-4 inference fix
2026-04-07 17:49:16 +01:00
Donato Capitella
d0281bb526
feat: apply upstream llama.cpp patch to fix Gemma-4 inference issues
2026-04-06 10:25:42 +01:00
Donato Capitella
bbd8f02014
build: remove -DGGML_CUDA_DISABLE_FUSION=1 from cmake configuration in rocm7-nightlies Dockerfile (this was for a temporary test)
2026-04-03 15:21:58 +01:00
Donato Capitella
b376d1558b
build: disable GGML CUDA fusion in ROCm build configuration (temporary test)
2026-04-03 15:16:12 +01:00
Donato Capitella
614b00af3e
fixed patch (AI slop!!!)
2026-03-25 09:36:50 +00:00
Donato Capitella
ca84f4cbf3
patch: increasing MAX_REPETITION_THRESHOLD to allow complex agentic workflows
2026-03-25 09:23:19 +00:00
Donato Capitella
5f4698c959
build: Remove amdgpu-unroll-threshold-local CMAKE_HIP_FLAG from ROCm 7 nightlies Dockerfile.
2026-03-03 12:54:45 +00:00
Trevor Starick
95aaf23a47
fix: remove trailing backslash (#60)
...
* feat: add REPO/BRANCH build args for llama.cpp
- Introduce ARG REPO and ARG BRANCH to replace the hardcoded git clone with: `git clone -b ${BRANCH} --single-branch --recursive ${REPO}`. This allows overriding the llama.cpp repository and branch at build time via `--build-arg`.
- Update `docs/building.md` to recommend using `--build-arg` instead of updating the file
* fix: remove trailing backslash
2026-02-17 21:41:55 +00:00
Trevor Starick
be936d6b59
feat: add REPO/BRANCH build args for llama.cpp (#59)
...
- Introduce ARG REPO and ARG BRANCH to replace the hardcoded git clone with: `git clone -b ${BRANCH} --single-branch --recursive ${REPO}`. This allows overriding the llama.cpp repository and branch at build time via `--build-arg`.
- Update `docs/building.md` to recommend using `--build-arg` instead of updating the file
2026-02-17 19:29:48 +00:00
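The REPO/BRANCH mechanism described in this commit can be sketched in plain shell. The clone command is the one quoted in the commit body. The default values below are hypothetical (the commit does not state the Dockerfile's defaults), and the Dockerfile path in the usage comment is a placeholder.

```shell
#!/bin/sh
# Hypothetical defaults, standing in for the Dockerfile's ARG defaults.
REPO="${REPO:-https://github.com/ggml-org/llama.cpp.git}"
BRANCH="${BRANCH:-master}"

# clone_cmd BRANCH REPO: print the clone command quoted in the commit body.
clone_cmd() {
    printf 'git clone -b %s --single-branch --recursive %s\n' "$1" "$2"
}

# Print instead of cloning, so the sketch runs offline.
clone_cmd "$BRANCH" "$REPO"

# Overriding both values at image build time, using the fork and branch
# named in the rocm-7.2.1-pr21344 commit as the example:
#   docker build \
#     --build-arg REPO=https://github.com/pedapudi/llama.cpp.git \
#     --build-arg BRANCH=gfx1151-opt \
#     -f <Dockerfile> .
```

Keeping the repository and branch as build arguments means a PR branch can be tested without editing the Dockerfile, which is what the `docs/building.md` change recommends.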
Donato Capitella
06fc789eba
chore: deprecate and remove ROCm 7.1.1 toolbox and all associated references.
2026-02-04 17:56:41 +00:00
Donato Capitella
785f27b100
CMAKE_HIP_FLAGS to fix performance regression
2026-02-04 17:17:51 +00:00
Donato Capitella
606bc292b9
attempting other ways to apply LLVM patch to rocm7
2026-02-04 16:59:43 +00:00
Donato Capitella
bd8069fe2f
remove AI slop and use correct envs to pass flags to HIP compiler
2026-02-04 16:21:32 +00:00
Donato Capitella
7ffa22d8de
fix: Add temporary workaround for ROCm 7 performance regression by setting HIP_LLVM_FLAGS.
2026-02-04 14:50:32 +00:00
Donato Capitella
353686ac79
moving 6.4.4 toolbox to use official fedora 43 rocm packages that include backported fixes for kernel compatibility
2026-01-24 11:47:35 +00:00
Donato Capitella
1807e8cff2
Adding ROCm 7.2 backend
2026-01-23 08:07:40 +00:00
Donato Capitella
ea03c773c6
adding procps-ng to the toolbox runtime
2026-01-15 09:43:05 +00:00
Donato Capitella
783998589e
clean up of legacy toolboxes, removal of rocwmma and renamed rocm7-alpha to rocm-7nightlies. Added new benchmarks
2026-01-10 10:31:04 +00:00
Donato Capitella
9ba6812003
feat: upgrade ROCm to 7.1.1 and update associated tooling and documentation
2025-12-07 09:30:14 +00:00
Donato Capitella
df54882433
remove manual application of RPC performance PR (this is merged into master now)
2025-11-28 14:20:03 +00:00
Donato Capitella
1b5ced1255
make PR-15405 application explicit in logs
2025-11-25 10:02:32 +00:00
Donato Capitella
528923aa66
restore PR_15405 for Vulkan backends
2025-11-17 11:44:11 +00:00
Donato Capitella
eae357f9dd
disable PR_15405 for vulkan
2025-11-17 11:19:51 +00:00
Donato Capitella
79a2438861
copy rpc-server binary to runtime container
2025-11-17 08:04:02 +00:00
Donato Capitella
9254f7b9e2
revert static library flag
2025-11-16 22:43:14 +00:00
kyuz0
c0e74afbb8
Disable RPC PR for .rocm-7alpha-rocwmma-improved
2025-11-16 10:32:54 +00:00
Donato Capitella
7e583193d0
migrated to fedora 43 from rawhide to fix build issues
2025-11-16 10:04:39 +00:00
Donato Capitella
a164b2308b
switching from rawhide to 43
2025-11-16 09:44:29 +00:00
Donato Capitella
5253e1143b
trying DBUILD_SHARED_LIBS to check if it fixes HIP backend build issues
2025-11-16 09:37:53 +00:00
Donato Capitella
8cea1363f3
remove dangling
2025-11-16 08:39:39 +00:00
Donato Capitella
bf0d083975
Dropping PR 15405 (llama.cpp RPC experimental improvement) due to compile issue
2025-11-16 08:25:25 +00:00
Donato Capitella
9de07b1d25
Enable RPC builds and merge PR 15405 across Dockerfiles
2025-11-16 07:54:49 +00:00
Donato Capitella
40a47116a9
Merge remote-tracking branch 'origin/main' into pr-20
2025-11-12 13:19:56 +00:00
Donato Capitella
9529c03e61
Fix Docker build parallelism flag by removing extra quoting around -j$(nproc)
2025-11-12 12:32:09 +00:00
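The commit does not show the line that was changed, so the following is only an illustration of one common way extra quoting breaks a parallelism flag: inside single quotes the shell performs no command substitution, so the build tool receives the literal text `-j$(nproc)` instead of a number. The variable names are made up for the example.

```shell
#!/bin/sh
# Core count, with a fallback for systems that lack nproc.
jobs=$(nproc 2>/dev/null || getconf _NPROCESSORS_ONLN)

quoted='-j$(nproc)'    # single quotes: passed through literally, never expanded
unquoted="-j${jobs}"   # expanded form: -j followed by the core count

printf 'quoted:   %s\n' "$quoted"
printf 'expanded: %s\n' "$unquoted"
```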
Donato Capitella
a044056534
Use absolute include path for HIP shuffle shim to fix CMake compiler detection
2025-11-12 11:54:52 +00:00
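The shim commits from 2025-11-12 describe force-including a header and then switching to an absolute path for it. A relative path would plausibly fail during CMake's compiler check, which compiles a test program in a scratch directory. A minimal sketch of that idea follows. The helper name and the shim location are hypothetical, and `-include` is the clang force-include option assumed to be passed via `CMAKE_HIP_FLAGS`.

```shell
#!/bin/sh
# hip_include_flag PATH: print a force-include flag, refusing relative paths.
hip_include_flag() {
    case "$1" in
        /*) printf '%s\n' "-include $1" ;;
        *)  echo "error: shim path must be absolute: $1" >&2
            return 1 ;;
    esac
}

# Hypothetical location of the shim inside the build image.
SHIM="${SHIM:-/opt/shims/hip_shfl_compat.h}"

# Print the flag as it would be handed to cmake; other arguments are elided.
flag=$(hip_include_flag "$SHIM") && echo "cmake ... -DCMAKE_HIP_FLAGS=\"$flag\""
```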
Donato Capitella
0fc19e1849
moving folder to the right place
2025-11-12 11:12:07 +00:00
Donato Capitella
e9ed0bac22
Copy local HIP shuffle shim into build image to restore __shfl_sync support on gfx1151
2025-11-12 10:44:26 +00:00
Donato Capitella
42bbc2301e
Force-include HIP shuffle shim to fix missing __shfl_sync on gfx1151 builds
2025-11-12 10:16:34 +00:00
Donato Capitella
b6de7881dd
fix heredoc syntax in rocm-6.4.4-rocwmma
2025-11-12 09:53:55 +00:00
Donato Capitella
ff0ef125cc
Add HIP shuffle macro shim to restore __shfl_sync support on gfx1151
2025-11-12 09:43:12 +00:00
Donato Capitella
be28cb2ad5
Add HIP shuffle compatibility shim for gfx1151 builds
2025-11-12 08:28:03 +00:00
Donato Capitella
e36bd3e4ec
trying another fix for rocm-6.4.4-rocwmma
2025-11-12 07:55:03 +00:00
Donato Capitella
52ee9d50f2
fix rocm-6.4.4-rocwmma
2025-11-12 07:46:55 +00:00
Donato Capitella
07536d3c42
Merge branch 'main' of github.com:kyuz0/amd-strix-halo-toolboxes
2025-11-10 19:30:35 +00:00
Donato Capitella
73f6a69310
fix
2025-11-10 19:29:30 +00:00
kyuz0
b21300027b
Merge pull request #11 from darkbasic/rocm-7alpha
...
Add rocm-7alpha, rocm-7alpha-rocwmma and rocm-7alpha-rocwmma-improved Dockerfiles
2025-11-10 19:26:08 +00:00
Donato Capitella
6d121bc88a
Merge PR#15405 to make RPC server faster
2025-11-10 19:21:21 +00:00