chore: update ROCm version to 7.2.3 and remove deprecated pr21344 toolbox
@@ -28,7 +28,7 @@ jobs:
 IN='${{ inputs.backends }}'
 
 if [[ "$IN" == "all" || -z "$IN" ]]; then
-JSON='["rocm-6.4.4","rocm-7.2.2","rocm-7.2.2-pr21344","rocm7-nightlies","vulkan-amdvlk","vulkan-radv"]'
+JSON='["rocm-6.4.4","rocm-7.2.3","rocm7-nightlies","vulkan-amdvlk","vulkan-radv"]'
 else
 # Remove spaces and build JSON array from comma list
 IN_CLEAN=$(echo "$IN" | tr -d '[:space:]')
@@ -44,7 +44,7 @@ jobs:
 run: |
 IN='${{ github.event.inputs.backends }}'
 if [[ "$IN" == "all" || -z "$IN" ]]; then
-JSON='["rocm-6.4.2","rocm-6.4.3","rocm-6.4.4","rocm-7.1.1","rocm-7.2","rocm-7.2.1","rocm-7.2.1-pr21344","rocm-7.2.2","rocm-7.2.2-pr21344","rocm-7beta","rocm7-nightlies","vulkan-amdvlk","vulkan-radv"]'
+JSON='["rocm-6.4.2","rocm-6.4.3","rocm-6.4.4","rocm-7.1.1","rocm-7.2","rocm-7.2.1","rocm-7.2.1-pr21344","rocm-7.2.2","rocm-7.2.3","rocm-7beta","rocm7-nightlies","vulkan-amdvlk","vulkan-radv"]'
 else
 IN_CLEAN=$(echo "$IN" | tr -d '[:space:]')
 JSON='["'${IN_CLEAN//,/\",\"}'"]'
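Note on the `else` branch in the two workflow hunks: it turns a comma-separated backend list into a JSON array by stripping whitespace and replacing each comma with `","`. A minimal standalone sketch of that conversion follows. The function name `to_json_array` and the `sep` variable are assumptions made for illustration and do not appear in the workflow.

```bash
#!/usr/bin/env bash
# Hypothetical helper mirroring the workflow's else-branch (name is an assumption).
to_json_array() {
  local in_clean sep='","'
  # Strip all whitespace so "a, b" and "a,b" produce the same result.
  in_clean=$(printf '%s' "$1" | tr -d '[:space:]')
  # Replace every comma with "," and wrap the result in [" ... "].
  printf '["%s"]\n' "${in_clean//,/$sep}"
}

to_json_array 'rocm-7.2.3, vulkan-radv'
```

The sketch does not validate backend names, and an empty input would yield `[""]`. In the workflow, the `-z "$IN"` check handles the empty case before this branch is reached.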
@@ -8,7 +8,7 @@
 * **Hardware / Drivers**: AMD "Strix Halo" APUs (Gfx1151). Implementations depend on ROCm (v6.4.4, v7.x) and Vulkan (Mesa RADV, AMDVLK).
 
 ## Repository Structure Overview
-* `/toolboxes/`: Dockerfiles used to build the container images (e.g., `rocm-6.4.4`, `rocm-7.2.2`, `vulkan-radv`). These often use multi-stage builds to compile Llama.cpp and extract standalone binaries.
+* `/toolboxes/`: Dockerfiles used to build the container images (e.g., `rocm-6.4.4`, `rocm-7.2.3`, `vulkan-radv`). These often use multi-stage builds to compile Llama.cpp and extract standalone binaries.
 * `/benchmark/`: Shell scripts and Python utilities (like `generate_results_json.py`) to systematically test Llama.cpp throughput, latency, and RPC performance.
 * `/docs/`: Markdown documents, along with HTML/CSS/JS (e.g., `index.html`, `assets/`) for the GitHub Pages website (`strix-halo-toolboxes.com`), plus interactive benchmark viewers and documentation on VRAM estimation.
 * `/scripts/`: Python utilities, including `run_distributed_llama.py` for distributed inference across nodes.
@@ -44,7 +44,7 @@ This is currently the most stable setup. Kernels older than 6.18.4 have a bug th
 ## Supported Toolboxes
 
 > [!WARNING]
-> Current `rocm7-nightlies` builds have a bug that caps memory allocation to 64GB. If you need larger models, prefer stable builds like `rocm-7.2.2` (performance is similar). Track the issue here: https://github.com/ROCm/TheRock/issues/4645
+> Current `rocm7-nightlies` builds have a bug that caps memory allocation to 64GB. If you need larger models, prefer stable builds like `rocm-7.2.3` (performance is similar). Track the issue here: https://github.com/ROCm/TheRock/issues/4645
 
 You can check the containers on DockerHub: [kyuz0/amd-strix-halo-toolboxes](https://hub.docker.com/r/kyuz0/amd-strix-halo-toolboxes/tags).
 
@@ -53,7 +53,7 @@ You can check the containers on DockerHub: [kyuz0/amd-strix-halo-toolboxes](http
 | `vulkan-amdvlk` | Vulkan (AMDVLK) | Fastest backend—AMD open-source driver. ≤2 GiB single buffer allocation limit, some large models won't load. |
 | `vulkan-radv` | Vulkan (Mesa RADV) | Most stable and compatible. Recommended for most users and all models. |
 | `rocm-6.4.4` | ROCm 6.4.4 (Fedora 43) | Latest stable 6.x build. Uses Fedora 43 packages with backported patch for **kernel 6.18.4+** support. |
-| `rocm-7.2.2` | ROCm 7.2.2 | Latest stable 7.x build. Includes patch for **kernel 6.18.4+** support. |
+| `rocm-7.2.3` | ROCm 7.2.3 | Latest stable 7.x build. Includes patch for **kernel 6.18.4+** support. |
 | `rocm7-nightlies` | ROCm 7 Nightly | Tracks nightly builds. Includes patch for **kernel 6.18.4+** support. |
 
 > These containers are **automatically** rebuilt whenever the Llama.cpp master branch is updated. Legacy images (`rocm-6.4.2`, `rocm-6.4.3`, `rocm-7.1.1`) are excluded from this list.
@@ -73,12 +73,12 @@ toolbox enter llama-vulkan-radv
 
 **Option B: ROCm (Recommended for Performance)**
 ```sh
-toolbox create llama-rocm-7.2.2 \
---image docker.io/kyuz0/amd-strix-halo-toolboxes:rocm-7.2.2 \
+toolbox create llama-rocm-7.2.3 \
+--image docker.io/kyuz0/amd-strix-halo-toolboxes:rocm-7.2.3 \
 -- --device /dev/dri --device /dev/kfd \
 --group-add video --group-add render --group-add sudo --security-opt seccomp=unconfined
 
-toolbox enter llama-rocm-7.2.2
+toolbox enter llama-rocm-7.2.3
 ```
 
 ### 2. Check GPU Access
@@ -62,8 +62,7 @@ echo
 
 declare -A CMDS=(
 [rocm6_4_4]="toolbox run -c llama-rocm-6.4.4 -- /usr/local/bin/llama-bench"
-[rocm-7_2_2]="toolbox run -c llama-rocm-7.2.2 -- /usr/local/bin/llama-bench"
-[rocm-7_2_2-pr21344]="toolbox run -c llama-rocm-7.2.2-pr21344 -- /usr/local/bin/llama-bench"
+[rocm-7_2_3]="toolbox run -c llama-rocm-7.2.3 -- /usr/local/bin/llama-bench"
 [rocm7-nightlies]="toolbox run -c llama-rocm7-nightlies -- /usr/local/bin/llama-bench"
 [vulkan_amdvlk]="toolbox run -c llama-vulkan-amdvlk -- /usr/sbin/llama-bench"
 [vulkan_radv]="toolbox run -c llama-vulkan-radv -- /usr/sbin/llama-bench"
@@ -8,8 +8,7 @@ declare -A TOOLBOXES
 TOOLBOXES["llama-vulkan-amdvlk"]="docker.io/kyuz0/amd-strix-halo-toolboxes:vulkan-amdvlk --device /dev/dri --group-add video --security-opt seccomp=unconfined"
 TOOLBOXES["llama-vulkan-radv"]="docker.io/kyuz0/amd-strix-halo-toolboxes:vulkan-radv --device /dev/dri --group-add video --security-opt seccomp=unconfined"
 TOOLBOXES["llama-rocm-6.4.4"]="docker.io/kyuz0/amd-strix-halo-toolboxes:rocm-6.4.4 --device /dev/dri --device /dev/kfd --group-add video --group-add render --group-add sudo --security-opt seccomp=unconfined"
-TOOLBOXES["llama-rocm-7.2.2"]="docker.io/kyuz0/amd-strix-halo-toolboxes:rocm-7.2.2 --device /dev/dri --device /dev/kfd --group-add video --group-add render --group-add sudo --security-opt seccomp=unconfined"
-TOOLBOXES["llama-rocm-7.2.2-pr21344"]="docker.io/kyuz0/amd-strix-halo-toolboxes:rocm-7.2.2-pr21344 --device /dev/dri --device /dev/kfd --group-add video --group-add render --group-add sudo --security-opt seccomp=unconfined"
+TOOLBOXES["llama-rocm-7.2.3"]="docker.io/kyuz0/amd-strix-halo-toolboxes:rocm-7.2.3 --device /dev/dri --device /dev/kfd --group-add video --group-add render --group-add sudo --security-opt seccomp=unconfined"
 TOOLBOXES["llama-rocm7-nightlies"]="docker.io/kyuz0/amd-strix-halo-toolboxes:rocm7-nightlies --device /dev/dri --device /dev/kfd --group-add video --group-add render --group-add sudo --security-opt seccomp=unconfined"
 
 function usage() {
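Note on the `TOOLBOXES` entries above: each value is an image reference followed by container options, separated by the first space. The rest of the script is not part of this diff, so how it consumes these entries is unknown. The dry-run sketch below only illustrates how such a value could be split and mapped onto the `toolbox create NAME --image IMAGE -- OPTIONS` form used in the README hunk. The helper `print_create_cmd` is hypothetical and prints the command instead of running it.

```bash
#!/usr/bin/env bash
declare -A TOOLBOXES
TOOLBOXES["llama-rocm-7.2.3"]="docker.io/kyuz0/amd-strix-halo-toolboxes:rocm-7.2.3 --device /dev/dri --device /dev/kfd --group-add video --group-add render --group-add sudo --security-opt seccomp=unconfined"

# Hypothetical dry-run helper (not taken from the repository).
print_create_cmd() {
  local name=$1
  local entry=${TOOLBOXES[$name]}
  local image=${entry%% *}   # everything before the first space
  local opts=${entry#* }     # everything after the first space
  printf 'toolbox create %s --image %s -- %s\n' "$name" "$image" "$opts"
}

print_create_cmd llama-rocm-7.2.3
```

Printing rather than executing keeps the sketch safe to run on a machine without `toolbox` installed.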
@@ -1,124 +0,0 @@
-# build stage
-# Based on Dockerfile.rocm-7.2.2, but clones pedapudi/llama.cpp@gfx1151-opt
-# (PR #21344: gfx1151 nwarps, tile sizing to curb VGPR pressure)
-FROM registry.fedoraproject.org/fedora:43 AS builder
-
-# rocm 7.2.2 repo
-RUN <<'EOF'
-tee /etc/yum.repos.d/rocm.repo <<REPO
-[ROCm-7.2.2]
-name=ROCm7.2.2
-baseurl=https://repo.radeon.com/rocm/rhel10/7.2.2/main
-enabled=1
-priority=50
-gpgcheck=1
-gpgkey=https://repo.radeon.com/rocm/rocm.gpg.key
-REPO
-EOF
-
-# deps
-RUN dnf -y --nodocs --setopt=install_weak_deps=False \
---exclude='*sdk*' --exclude='*samples*' --exclude='*-doc*' --exclude='*-docs*' \
-install \
-make gcc cmake lld clang clang-devel compiler-rt libcurl-devel ninja-build \
-rocm-llvm rocm-device-libs hip-runtime-amd hip-devel \
-rocblas rocblas-devel hipblas hipblas-devel rocm-cmake libomp-devel libomp \
-rocminfo radeontop \
-git-core vim sudo rsync patch \
-&& dnf clean all && rm -rf /var/cache/dnf/*
-
-# rocm env
-ENV ROCM_PATH=/opt/rocm \
-HIP_PATH=/opt/rocm \
-HIP_CLANG_PATH=/opt/rocm/llvm/bin \
-HIP_DEVICE_LIB_PATH=/opt/rocm/amdgcn/bitcode \
-PATH=/opt/rocm/bin:/opt/rocm/llvm/bin:$PATH
-
-# llama.cpp — PR #21344 fork (gfx1151 MMQ/MMVQ tile + nwarp tuning)
-WORKDIR /opt/llama.cpp
-ARG REPO=https://github.com/pedapudi/llama.cpp.git
-ARG BRANCH=gfx1151-opt
-RUN git clone -b ${BRANCH} --single-branch --recursive ${REPO} .
-
-COPY llama-grammar.patch /tmp/llama-grammar.patch
-
-# build
-RUN git clean -xdf \
-&& git submodule update --recursive \
-&& patch -p1 < /tmp/llama-grammar.patch \
-&& cmake -S . -B build \
--DGGML_HIP=ON \
--DCMAKE_HIP_COMPILER=${HIP_CLANG_PATH}/clang \
--DCMAKE_HIP_FLAGS="--rocm-path=/opt/rocm -mllvm --amdgpu-unroll-threshold-local=600" \
--DAMDGPU_TARGETS=gfx1151 \
--DCMAKE_BUILD_TYPE=Release \
--DGGML_RPC=ON \
--DLLAMA_HIP_UMA=ON \
--DGGML_CUDA_ENABLE_UNIFIED_MEMORY=ON \
--DGGML_BMI2=ON \
--DGGML_FMA=ON \
--DGGML_F16C=ON \
--DGGML_CUDA_FA_ALL_QUANTS=ON \
--DLLAMA_BUILD_TESTS=OFF \
--DLLAMA_BUILD_EXAMPLES=OFF \
--DROCM_PATH=/opt/rocm \
--DHIP_PATH=/opt/rocm \
--DHIP_PLATFORM=amd \
-&& cmake --build build --config Release -- -j$(nproc) \
-&& cmake --install build --config Release
-
-# libs
-RUN find /opt/llama.cpp/build -type f -name 'lib*.so*' -exec cp {} /usr/lib64/ \; \
-&& ldconfig
-
-# helper
-COPY gguf-vram-estimator.py /usr/local/bin/gguf-vram-estimator.py
-RUN chmod +x /usr/local/bin/gguf-vram-estimator.py
-
-# runtime stage
-FROM registry.fedoraproject.org/fedora-minimal:43
-
-# rocm 7.2.2 repo
-RUN <<'EOF'
-tee /etc/yum.repos.d/rocm.repo <<REPO
-[ROCm-7.2.2]
-name=ROCm7.2.2
-baseurl=https://repo.radeon.com/rocm/rhel10/7.2.2/main
-enabled=1
-priority=50
-gpgcheck=1
-gpgkey=https://repo.radeon.com/rocm/rocm.gpg.key
-REPO
-EOF
-
-# runtime deps
-RUN microdnf -y --nodocs --setopt=install_weak_deps=0 \
---exclude='*sdk*' --exclude='*samples*' --exclude='*-doc*' --exclude='*-docs*' \
-install \
-bash ca-certificates libatomic libstdc++ libgcc libgomp sudo \
-hip-runtime-amd rocblas hipblas \
-rocminfo radeontop procps-ng \
-&& microdnf clean all && rm -rf /var/cache/dnf/*
-
-# copy
-COPY --from=builder /usr/local/ /usr/local/
-COPY --from=builder /opt/llama.cpp/build/bin/rpc-* /usr/local/bin/
-
-# ld
-RUN echo "/usr/local/lib" > /etc/ld.so.conf.d/local.conf \
-&& echo "/usr/local/lib64" >> /etc/ld.so.conf.d/local.conf \
-&& ldconfig \
-&& cp -n /usr/local/lib/libllama*.so* /usr/lib64/ 2>/dev/null || true \
-&& ldconfig
-
-# helper
-COPY gguf-vram-estimator.py /usr/local/bin/gguf-vram-estimator.py
-RUN chmod +x /usr/local/bin/gguf-vram-estimator.py
-
-# profile
-RUN printf '%s\n' \
-> /etc/profile.d/rocm.sh && chmod +x /etc/profile.d/rocm.sh \
-&& echo 'source /etc/profile.d/rocm.sh' >> /etc/bashrc
-
-# shell
-CMD ["/bin/bash"]
@@ -1,12 +1,12 @@
 # build stage
 FROM registry.fedoraproject.org/fedora:43 AS builder
 
-# rocm 7.2.2 repo
+# rocm 7.2.3 repo
 RUN <<'EOF'
 tee /etc/yum.repos.d/rocm.repo <<REPO
-[ROCm-7.2.2]
-name=ROCm7.2.2
-baseurl=https://repo.radeon.com/rocm/rhel10/7.2.2/main
+[ROCm-7.2.3]
+name=ROCm7.2.3
+baseurl=https://repo.radeon.com/rocm/rhel10/7.2.3/main
 enabled=1
 priority=50
 gpgcheck=1
@@ -69,12 +69,12 @@ RUN chmod +x /usr/local/bin/gguf-vram-estimator.py
 # runtime stage
 FROM registry.fedoraproject.org/fedora-minimal:43
 
-# rocm 7.2.2 repo
+# rocm 7.2.3 repo
 RUN <<'EOF'
 tee /etc/yum.repos.d/rocm.repo <<REPO
-[ROCm-7.2.2]
-name=ROCm7.2.2
-baseurl=https://repo.radeon.com/rocm/rhel10/7.2.2/main
+[ROCm-7.2.3]
+name=ROCm7.2.3
+baseurl=https://repo.radeon.com/rocm/rhel10/7.2.3/main
 enabled=1
 priority=50
 gpgcheck=1
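Note on the two Dockerfile hunks above: the builder and runtime stages repeat the same repo stanza, and a version bump touches three lines in each. A minimal sketch of generating the stanza from a single version value is shown below. The function `rocm_repo_stanza` is hypothetical, and the `gpgkey` line is taken from the removed pr21344 Dockerfile in this commit, on the assumption that the updated Dockerfile uses the same key.

```bash
#!/usr/bin/env bash
# Hypothetical generator for the repo stanza (not part of the repository).
rocm_repo_stanza() {
  local version=$1
  # Unquoted heredoc delimiter so ${version} is expanded in the body.
  cat <<REPO
[ROCm-${version}]
name=ROCm${version}
baseurl=https://repo.radeon.com/rocm/rhel10/${version}/main
enabled=1
priority=50
gpgcheck=1
gpgkey=https://repo.radeon.com/rocm/rocm.gpg.key
REPO
}

rocm_repo_stanza 7.2.3
```

In a Dockerfile, a build argument could play the same role as `version`, though the commit itself keeps the version hard-coded.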