diff --git a/README.md b/README.md
index 41eadfd..fcb4b4f 100644
--- a/README.md
+++ b/README.md
@@ -30,7 +30,9 @@ This project provides pre-built containers (“toolboxes”) for running LLMs on
 7. [More Documentation](#7-more-documentation)  
 8. [References](#8-references)
 
+## 🚨 Updates — 2025-09-28
 
+Released ROCm 6.4.4 toolboxes. ROCm 6.4.4 + ROCWMMA is currently the recommended toolbox for most use cases, but always check the benchmarks to find the backend that performs best with your model architecture and quantization of choice: [Performance Benchmarks (Key Results)](#3-performance-benchmarks-key-results)
 
 ## 1. Llama.cpp Compiled for Every Backend
 
@@ -47,8 +49,8 @@ You can check the containers on DockerHub: https://hub.docker.com/r/kyuz0/amd-st
 | -------------------- | ------------------------ | --------------- |
 | `vulkan-amdvlk`      | Vulkan (AMDVLK)           | Fastest backend—AMD open-source driver. ≤2 GiB single buffer allocation limit, some large models won't load. |
 | `vulkan-radv`        | Vulkan (Mesa RADV)        | Most stable and compatible. Recommended for most users and all models. |
-| `rocm-6.4.3`         | ROCm 6.4.3 (HIP) + hipBLASLt*          | Latest stable ROCm. Great for BF16 models. Occasional crashes possible. |
-| `rocm-6.4.3-rocwmma` | ROCm 6.4.3 (HIP) + ROCWMMA + hipBLASLt*  | ROCm with ROCWMMA enabled for improved flash attention on RDNA3+/CDNA. |
+| `rocm-6.4.4`         | ROCm 6.4.4 (HIP) + hipBLASLt*          | Latest stable ROCm. Great for BF16 models. Occasional crashes possible. |
+| `rocm-6.4.4-rocwmma` | ROCm 6.4.4 (HIP) + ROCWMMA + hipBLASLt*  | ROCm with ROCWMMA enabled for improved flash attention on RDNA3+/CDNA. |
 | `rocm-7rc`           | ROCm 7.0 RC (HIP) + hipBLASLt*         | Release candidate for ROCm 7.0. |
 | `rocm-7rc-rocwmma`   | ROCm 7.0 RC (HIP) + ROCWMMA + hipBLASLt*       | Release candidate for ROCm 7.0, with hipBLASLt and ROCWMMA for improved flash attention on RDNA3+/CDNA |
 
@@ -56,7 +58,7 @@ You can check the containers on DockerHub: https://hub.docker.com/r/kyuz0/amd-st
 
 > These containers are **automatically** rebuilt whenever the Llama.cpp master branch is updated, ensuring you get the latest bug fixes and new model support. The easiest way to update to the newest versions is by running the `refresh-toolboxes.sh` [script below](#211-toolbox-refresh-script-automatic-updates).
 
-> *rocm-6.4.2* and *rocm-7beta* coontainers have been retired in favour of *rocm-6.4.3* and *rocm_7rc*.
+> The *rocm-6.4.2*, *rocm-6.4.3* and *rocm-7beta* containers have been retired in favour of *rocm-6.4.4* and *rocm-7rc*.
 
 ---
 
@@ -78,7 +80,7 @@ To use Llama.cpp with hardware acceleration inside a toolbox container, you must
 * **For ROCm:** You must expose both `/dev/dri` and `/dev/kfd`, and add the user to extra groups for compute access.
 
   ```sh
-  toolbox create llama-rocm-6.4.3-rocwmma \
+  toolbox create llama-rocm-6.4.4-rocwmma \
-    --image docker.io/kyuz0/amd-strix-halo-toolboxes:rocm-6.4.3-rocwmma \
+    --image docker.io/kyuz0/amd-strix-halo-toolboxes:rocm-6.4.4-rocwmma \
     -- --device /dev/dri --device /dev/kfd \
     --group-add video --group-add render --group-add sudo --security-opt seccomp=unconfined
@@ -166,33 +168,36 @@ Benchmarks were analysed with **error-aware ties** (mean ± σ). If two backends
 **Prompt Processing (pp512)**
 | Backend | 1st | 2nd | 3rd |
 | --- | ---: | ---: | ---: |
-| ROCm 6.4.3 + ROCWMMA (hipBLASLt) | 9 | 6 | 0 |
-| Vulkan AMDVLK | 4 | 0 | 2 |
-| ROCm 7 RC + ROCWMMA (hipBLASLt OFF) | 3 | 3 | 8 |
-| ROCm 7 RC + ROCWMMA + hipBLASLt | 1 | 8 | 5 |
-| ROCm 6.4.3 + ROCWMMA (hipBLASLt OFF) | 0 | 0 | 1 |
-| Vulkan RADV | 0 | 0 | 1 |
+| ROCm 6.4.4 (hipBLASLt) | 6 | 2 | 2 |
+| Vulkan AMDVLK | 6 | 1 | 0 |
+| ROCm 6.4.4 (hipBLASLt OFF) | 3 | 2 | 3 |
+| Vulkan RADV | 1 | 2 | 0 |
+| ROCm 7 RC (hipBLASLt) | 1 | 1 | 1 |
+| ROCm 6.4.4 + ROCWMMA (hipBLASLt OFF) | 0 | 5 | 4 |
+| ROCm 6.4.4 + ROCWMMA (hipBLASLt) | 0 | 4 | 2 |
+| ROCm 7 RC (hipBLASLt OFF) | 0 | 0 | 2 |
+| ROCm 7 RC + ROCWMMA + hipBLASLt | 0 | 0 | 3 |
 
 **Token Generation (tg128)**
 | Backend | 1st | 2nd | 3rd |
 | --- | ---: | ---: | ---: |
-| Vulkan RADV | 14 | 0 | 0 |
-| ROCm 6.4.3 (hipBLASLt) | 3 | 0 | 1 |
-| ROCm 6.4.3 + ROCWMMA (hipBLASLt) | 1 | 4 | 3 |
-| ROCm 6.4.3 + ROCWMMA (hipBLASLt OFF) | 1 | 2 | 4 |
-| ROCm 6.4.3 (hipBLASLt OFF) | 1 | 1 | 1 |
-| ROCm 7 RC (hipBLASLt) | 1 | 1 | 4 |
-| ROCm 7 RC (hipBLASLt OFF) | 1 | 1 | 2 |
-| ROCm 7 RC + ROCWMMA (hipBLASLt OFF) | 1 | 1 | 1 |
-| Vulkan AMDVLK | 0 | 10 | 0 |
-| ROCm 7 RC + ROCWMMA + hipBLASLt | 0 | 1 | 2 |
+| Vulkan RADV | 10 | 1 | 2 |
+| Vulkan AMDVLK | 3 | 10 | 0 |
+| ROCm 6.4.4 + ROCWMMA (hipBLASLt OFF) | 2 | 3 | 7 |
+| ROCm 6.4.4 (hipBLASLt) | 1 | 4 | 3 |
+| ROCm 6.4.4 (hipBLASLt OFF) | 1 | 3 | 5 |
+| ROCm 6.4.4 + ROCWMMA (hipBLASLt) | 1 | 2 | 6 |
+| ROCm 7 RC (hipBLASLt) | 1 | 0 | 1 |
+| ROCm 7 RC (hipBLASLt OFF) | 0 | 1 | 1 |
+| ROCm 7 RC + ROCWMMA + hipBLASLt | 0 | 1 | 1 |
+| ROCm 7 RC + ROCWMMA (hipBLASLt OFF) | 0 | 1 | 1 |
 
 ### Summary & Recommendations
-- **Fastest prompt processing:** ROCm 6.4.3 + ROCWMMA (hipBLASLt) (most 1st-place finishes).
+- **Fastest prompt processing:** Vulkan AMDVLK and ROCm 6.4.4 (hipBLASLt), tied with 6 first-place finishes each.
 - **Fastest token generation:** Vulkan RADV (most 1st-place finishes).
-- **Balanced choice:** ROCm 6.4.3 + ROCWMMA (hipBLASLt) (consistently near the top across PP/TG).
+- **Balanced choice:** Vulkan AMDVLK (consistently near the top across PP/TG).
 
-> **Note (ROCm 7):** Toolboxes enable **hipBLASLt** by default. The benchmark suite also runs **hipBLASLt OFF** variants to show its impact.
+> **Note (ROCm):** ROCm toolboxes enable **hipBLASLt** by default because it performs better in *most* cases. The benchmark suite also runs **hipBLASLt OFF** variants to show its impact.
 
 📄 Full per-model analysis: [docs/benchmarks.md](docs/benchmarks.md)
 
diff --git a/benchmark/generate_markdown_results.py b/benchmark/generate_markdown_results.py
index 8651b79..c9e31e6 100644
--- a/benchmark/generate_markdown_results.py
+++ b/benchmark/generate_markdown_results.py
@@ -23,11 +23,11 @@ ENV_LABEL: Dict[str, str] = {
     "rocm7_rc-hblt0": "ROCm 7 RC (hipBLASLt OFF)",
     "rocm7_rc-rocwmma-hblt0": "ROCm 7 RC + ROCWMMA (hipBLASLt OFF)",
 
-    # ROCm 6.4.3
-    "rocm6_4_3": "ROCm 6.4.3 (hipBLASLt)",
-    "rocm6_4_3-hblt0": "ROCm 6.4.3 (hipBLASLt OFF)",
-    "rocm6_4_3-rocwmma": "ROCm 6.4.3 + ROCWMMA (hipBLASLt)",
-    "rocm6_4_3-rocwmma-hblt0": "ROCm 6.4.3 + ROCWMMA (hipBLASLt OFF)",
+    # ROCm 6.4.4
+    "rocm6_4_4": "ROCm 6.4.4 (hipBLASLt)",
+    "rocm6_4_4-hblt0": "ROCm 6.4.4 (hipBLASLt OFF)",
+    "rocm6_4_4-rocwmma": "ROCm 6.4.4 + ROCWMMA (hipBLASLt)",
+    "rocm6_4_4-rocwmma-hblt0": "ROCm 6.4.4 + ROCWMMA (hipBLASLt OFF)",
 
     # Vulkan
     "vulkan_amdvlk": "Vulkan AMDVLK",
@@ -461,17 +461,17 @@ def build_benchmarks_doc(
         lines.append(md_row([ENV_LABEL.get(env, env), fmt_eff(row_pp), fmt_eff(row_tg)]))
     lines.append("")
 
-    # ROCWMMA effect — check both ROCm 7 and 6.4.3 families if present
+    # ROCWMMA effect — check both ROCm 7 and 6.4.4 families if present
     lines.append("### Impact of ROCWMMA")
     rocwmma_pairs = []
     if "rocm7_rc-rocwmma" in envs and "rocm7_rc" in envs:
         rocwmma_pairs.append(("rocm7_rc-rocwmma", "rocm7_rc", "ROCm 7 RC (hipBLASLt)"))
     if "rocm7_rc-rocwmma-hblt0" in envs and "rocm7_rc-hblt0" in envs:
         rocwmma_pairs.append(("rocm7_rc-rocwmma-hblt0", "rocm7_rc-hblt0", "ROCm 7 RC (hipBLASLt OFF)"))
-    if "rocm6_4_3-rocwmma" in envs and "rocm6_4_3" in envs:
-        rocwmma_pairs.append(("rocm6_4_3-rocwmma", "rocm6_4_3", "ROCm 6.4.3 (hipBLASLt)"))
-    if "rocm6_4_3-rocwmma-hblt0" in envs and "rocm6_4_3-hblt0" in envs:
-        rocwmma_pairs.append(("rocm6_4_3-rocwmma-hblt0", "rocm6_4_3-hblt0", "ROCm 6.4.3 (hipBLASLt OFF)"))
+    if "rocm6_4_4-rocwmma" in envs and "rocm6_4_4" in envs:
+        rocwmma_pairs.append(("rocm6_4_4-rocwmma", "rocm6_4_4", "ROCm 6.4.4 (hipBLASLt)"))
+    if "rocm6_4_4-rocwmma-hblt0" in envs and "rocm6_4_4-hblt0" in envs:
+        rocwmma_pairs.append(("rocm6_4_4-rocwmma-hblt0", "rocm6_4_4-hblt0", "ROCm 6.4.4 (hipBLASLt OFF)"))
 
     rocwmma_rows = rocwmma_effect(runs, rocwmma_pairs, TESTS)
     lines.append(md_row(["Context", "Test", "Compared Envs", "Pairs", "Median Δ%"]))
@@ -480,17 +480,17 @@ def build_benchmarks_doc(
         lines.append(md_row([label, test, f"{ENV_LABEL.get(env_on, env_on)} vs {ENV_LABEL.get(env_off, env_off)}", str(n), f"{delta}%"]))
     lines.append("")
 
-    # hipBLASLt effect — for both ROCm 7 and 6.4.3 families
+    # hipBLASLt effect — for both ROCm 7 and 6.4.4 families
     lines.append("### Impact of hipBLASLt")
     hip_pairs = []
     if "rocm7_rc" in envs and "rocm7_rc-hblt0" in envs:
         hip_pairs.append(("rocm7_rc", "rocm7_rc-hblt0", "ROCm 7 RC (no ROCWMMA)"))
     if "rocm7_rc-rocwmma" in envs and "rocm7_rc-rocwmma-hblt0" in envs:
         hip_pairs.append(("rocm7_rc-rocwmma", "rocm7_rc-rocwmma-hblt0", "ROCm 7 RC + ROCWMMA"))
-    if "rocm6_4_3" in envs and "rocm6_4_3-hblt0" in envs:
-        hip_pairs.append(("rocm6_4_3", "rocm6_4_3-hblt0", "ROCm 6.4.3 (no ROCWMMA)"))
-    if "rocm6_4_3-rocwmma" in envs and "rocm6_4_3-rocwmma-hblt0" in envs:
-        hip_pairs.append(("rocm6_4_3-rocwmma", "rocm6_4_3-rocwmma-hblt0", "ROCm 6.4.3 + ROCWMMA"))
+    if "rocm6_4_4" in envs and "rocm6_4_4-hblt0" in envs:
+        hip_pairs.append(("rocm6_4_4", "rocm6_4_4-hblt0", "ROCm 6.4.4 (no ROCWMMA)"))
+    if "rocm6_4_4-rocwmma" in envs and "rocm6_4_4-rocwmma-hblt0" in envs:
+        hip_pairs.append(("rocm6_4_4-rocwmma", "rocm6_4_4-rocwmma-hblt0", "ROCm 6.4.4 + ROCWMMA"))
 
     hip_rows = hipblaslt_effect(runs, hip_pairs, TESTS)
     lines.append(md_row(["Context", "Test", "Compared Envs", "Pairs", "Median Δ%"]))
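Review note: the hard-coded `if` checks above have to be edited by hand on every ROCm point release, as this change shows. A minimal sketch of one alternative follows, assuming a hypothetical `FAMILIES` mapping and two hypothetical helpers (`rocwmma_pairs_for`, `hipblaslt_pairs_for`) that are not part of the script. It only reproduces the tuples built above from the env keys visible in this diff.

```python
from typing import Dict, List, Set, Tuple

Pair = Tuple[str, str, str]  # (env_on, env_off, context label)

# Hypothetical table: env-key prefix -> display name. A version bump
# would then touch only this mapping (plus ENV_LABEL).
FAMILIES: Dict[str, str] = {
    "rocm7_rc": "ROCm 7 RC",
    "rocm6_4_4": "ROCm 6.4.4",
}


def rocwmma_pairs_for(envs: Set[str]) -> List[Pair]:
    """Pair each ROCWMMA env with its non-ROCWMMA counterpart, if both ran."""
    pairs: List[Pair] = []
    for key, name in FAMILIES.items():
        for suffix, label in (("", "hipBLASLt"), ("-hblt0", "hipBLASLt OFF")):
            env_on, env_off = f"{key}-rocwmma{suffix}", f"{key}{suffix}"
            if env_on in envs and env_off in envs:
                pairs.append((env_on, env_off, f"{name} ({label})"))
    return pairs


def hipblaslt_pairs_for(envs: Set[str]) -> List[Pair]:
    """Pair each hipBLASLt env with its hipBLASLt-OFF counterpart, if both ran."""
    pairs: List[Pair] = []
    for key, name in FAMILIES.items():
        for infix, label in (("", "(no ROCWMMA)"), ("-rocwmma", "+ ROCWMMA")):
            env_on, env_off = f"{key}{infix}", f"{key}{infix}-hblt0"
            if env_on in envs and env_off in envs:
                pairs.append((env_on, env_off, f"{name} {label}"))
    return pairs
```

Pairs come out in the same order as the explicit checks (ROCm 7 RC first, then 6.4.4), so the generated tables should keep their row order under this sketch.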
diff --git a/benchmark/results/GLM-4.5-Air-UD-Q4_K_XL-00001-of-00002__rocm6_4_4-rocwmma.log b/benchmark/results/GLM-4.5-Air-UD-Q4_K_XL-00001-of-00002__rocm6_4_4-rocwmma.log
new file mode 100644
index 0000000..daa1793
--- /dev/null
+++ b/benchmark/results/GLM-4.5-Air-UD-Q4_K_XL-00001-of-00002__rocm6_4_4-rocwmma.log
@@ -0,0 +1,15 @@
+ggml_cuda_init: GGML_CUDA_FORCE_MMQ:    no
+ggml_cuda_init: GGML_CUDA_FORCE_CUBLAS: no
+ggml_cuda_init: found 1 ROCm devices:
+  Device 0: Radeon 8060S Graphics, gfx1151 (0x1151), VMM: no, Wave Size: 32
+rocBLAS error: No hipBLASLt solution found
+This message will be only be displayed once, unless the ROCBLAS_VERBOSE_HIPBLASLT_ERROR environment variable is set.
+
+rocBLAS warning: hipBlasLT failed, falling back to tensile. 
+This message will be only be displayed once, unless the ROCBLAS_VERBOSE_TENSILE_ERROR environment variable is set.
+| model                          |       size |     params | backend    | ngl | mmap |            test |                  t/s |
+| ------------------------------ | ---------: | ---------: | ---------- | --: | ---: | --------------: | -------------------: |
+| glm4moe 106B.A12B Q4_K - Medium |  68.01 GiB |   110.47 B | ROCm       |  99 |    0 |           pp512 |        128.18 ± 0.37 |
+| glm4moe 106B.A12B Q4_K - Medium |  68.01 GiB |   110.47 B | ROCm       |  99 |    0 |           tg128 |         20.51 ± 0.00 |
+
+build: 4807e8f9 (6609)
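Review note: each result log ends with a llama-bench table whose last column holds `mean ± σ` in tokens per second. The sketch below shows one way such rows could be read back; the regex and the `parse_bench_log` name are illustrative assumptions and are not taken from `generate_markdown_results.py`, whose parser is not shown in this diff.

```python
import re
from typing import Dict, Tuple

# Matches the last two columns of a result row: test name, then "mean ± sigma".
ROW = re.compile(r"\|\s*(pp\d+|tg\d+)\s*\|\s*([\d.]+)\s*±\s*([\d.]+)\s*\|\s*$")

SAMPLE = """\
| model                          |       size |     params | backend    | ngl | mmap |            test |                  t/s |
| ------------------------------ | ---------: | ---------: | ---------- | --: | ---: | --------------: | -------------------: |
| glm4moe 106B.A12B Q4_K - Medium |  68.01 GiB |   110.47 B | ROCm       |  99 |    0 |           pp512 |        128.18 ± 0.37 |
| glm4moe 106B.A12B Q4_K - Medium |  68.01 GiB |   110.47 B | ROCm       |  99 |    0 |           tg128 |         20.51 ± 0.00 |
"""


def parse_bench_log(text: str) -> Dict[str, Tuple[float, float]]:
    """Return {test_name: (mean_tps, sigma_tps)} for every result row found."""
    results: Dict[str, Tuple[float, float]] = {}
    for line in text.splitlines():
        match = ROW.search(line)
        if match:
            results[match.group(1)] = (float(match.group(2)), float(match.group(3)))
    return results


print(parse_bench_log(SAMPLE))
```

Header, separator, and rocBLAS warning lines do not match the pattern, so they are skipped without special handling.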
diff --git a/benchmark/results/GLM-4.5-Air-UD-Q4_K_XL-00001-of-00002__rocm6_4_4-rocwmma__fa1.log b/benchmark/results/GLM-4.5-Air-UD-Q4_K_XL-00001-of-00002__rocm6_4_4-rocwmma__fa1.log
new file mode 100644
index 0000000..e798784
--- /dev/null
+++ b/benchmark/results/GLM-4.5-Air-UD-Q4_K_XL-00001-of-00002__rocm6_4_4-rocwmma__fa1.log
@@ -0,0 +1,10 @@
+ggml_cuda_init: GGML_CUDA_FORCE_MMQ:    no
+ggml_cuda_init: GGML_CUDA_FORCE_CUBLAS: no
+ggml_cuda_init: found 1 ROCm devices:
+  Device 0: Radeon 8060S Graphics, gfx1151 (0x1151), VMM: no, Wave Size: 32
+| model                          |       size |     params | backend    | ngl | fa | mmap |            test |                  t/s |
+| ------------------------------ | ---------: | ---------: | ---------- | --: | -: | ---: | --------------: | -------------------: |
+| glm4moe 106B.A12B Q4_K - Medium |  68.01 GiB |   110.47 B | ROCm       |  99 |  1 |    0 |           pp512 |        134.92 ± 0.21 |
+| glm4moe 106B.A12B Q4_K - Medium |  68.01 GiB |   110.47 B | ROCm       |  99 |  1 |    0 |           tg128 |         21.08 ± 0.00 |
+
+build: 4807e8f9 (6609)
diff --git a/benchmark/results/GLM-4.5-Air-UD-Q4_K_XL-00001-of-00002__rocm6_4_4-rocwmma__hblt0.log b/benchmark/results/GLM-4.5-Air-UD-Q4_K_XL-00001-of-00002__rocm6_4_4-rocwmma__hblt0.log
new file mode 100644
index 0000000..2d8f0ca
--- /dev/null
+++ b/benchmark/results/GLM-4.5-Air-UD-Q4_K_XL-00001-of-00002__rocm6_4_4-rocwmma__hblt0.log
@@ -0,0 +1,10 @@
+ggml_cuda_init: GGML_CUDA_FORCE_MMQ:    no
+ggml_cuda_init: GGML_CUDA_FORCE_CUBLAS: no
+ggml_cuda_init: found 1 ROCm devices:
+  Device 0: Radeon 8060S Graphics, gfx1151 (0x1151), VMM: no, Wave Size: 32
+| model                          |       size |     params | backend    | ngl | mmap |            test |                  t/s |
+| ------------------------------ | ---------: | ---------: | ---------- | --: | ---: | --------------: | -------------------: |
+| glm4moe 106B.A12B Q4_K - Medium |  68.01 GiB |   110.47 B | ROCm       |  99 |    0 |           pp512 |        159.31 ± 0.83 |
+| glm4moe 106B.A12B Q4_K - Medium |  68.01 GiB |   110.47 B | ROCm       |  99 |    0 |           tg128 |         20.34 ± 0.01 |
+
+build: 4807e8f9 (6609)
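Review note: for this model the hipBLASLt OFF run is noticeably faster on pp512 than the default run logged earlier (159.31 vs 128.18 t/s), while tg128 barely moves. A small worked example of the relative change, assuming a plain percentage delta of ON against OFF; the exact formula behind the script's "Median Δ%" column is not shown in this diff, so treat `pct_delta` as a hypothetical helper.

```python
def pct_delta(on: float, off: float) -> float:
    """Relative change of the hipBLASLt-ON result against the OFF result, in percent."""
    return (on - off) / off * 100.0


# GLM-4.5-Air Q4_K_XL, rocm6_4_4-rocwmma, flash attention off (values from the logs above).
pp512_delta = pct_delta(on=128.18, off=159.31)
tg128_delta = pct_delta(on=20.51, off=20.34)

print(f"pp512: {pp512_delta:+.2f}%")
print(f"tg128: {tg128_delta:+.2f}%")
```

Note that the default run also logged a rocBLAS fallback to Tensile, so the ON figure here may not reflect a hipBLASLt kernel at all.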
diff --git a/benchmark/results/GLM-4.5-Air-UD-Q4_K_XL-00001-of-00002__rocm6_4_4-rocwmma__hblt0__fa1.log b/benchmark/results/GLM-4.5-Air-UD-Q4_K_XL-00001-of-00002__rocm6_4_4-rocwmma__hblt0__fa1.log
new file mode 100644
index 0000000..5aa0185
--- /dev/null
+++ b/benchmark/results/GLM-4.5-Air-UD-Q4_K_XL-00001-of-00002__rocm6_4_4-rocwmma__hblt0__fa1.log
@@ -0,0 +1,10 @@
+ggml_cuda_init: GGML_CUDA_FORCE_MMQ:    no
+ggml_cuda_init: GGML_CUDA_FORCE_CUBLAS: no
+ggml_cuda_init: found 1 ROCm devices:
+  Device 0: Radeon 8060S Graphics, gfx1151 (0x1151), VMM: no, Wave Size: 32
+| model                          |       size |     params | backend    | ngl | fa | mmap |            test |                  t/s |
+| ------------------------------ | ---------: | ---------: | ---------- | --: | -: | ---: | --------------: | -------------------: |
+| glm4moe 106B.A12B Q4_K - Medium |  68.01 GiB |   110.47 B | ROCm       |  99 |  1 |    0 |           pp512 |        171.67 ± 0.36 |
+| glm4moe 106B.A12B Q4_K - Medium |  68.01 GiB |   110.47 B | ROCm       |  99 |  1 |    0 |           tg128 |         21.04 ± 0.00 |
+
+build: 4807e8f9 (6609)
diff --git a/benchmark/results/GLM-4.5-Air-UD-Q4_K_XL-00001-of-00002__rocm6_4_4.log b/benchmark/results/GLM-4.5-Air-UD-Q4_K_XL-00001-of-00002__rocm6_4_4.log
new file mode 100644
index 0000000..e40c1b6
--- /dev/null
+++ b/benchmark/results/GLM-4.5-Air-UD-Q4_K_XL-00001-of-00002__rocm6_4_4.log
@@ -0,0 +1,15 @@
+ggml_cuda_init: GGML_CUDA_FORCE_MMQ:    no
+ggml_cuda_init: GGML_CUDA_FORCE_CUBLAS: no
+ggml_cuda_init: found 1 ROCm devices:
+  Device 0: Radeon 8060S Graphics, gfx1151 (0x1151), VMM: no, Wave Size: 32
+rocBLAS error: No hipBLASLt solution found
+This message will be only be displayed once, unless the ROCBLAS_VERBOSE_HIPBLASLT_ERROR environment variable is set.
+
+rocBLAS warning: hipBlasLT failed, falling back to tensile. 
+This message will be only be displayed once, unless the ROCBLAS_VERBOSE_TENSILE_ERROR environment variable is set.
+| model                          |       size |     params | backend    | ngl | mmap |            test |                  t/s |
+| ------------------------------ | ---------: | ---------: | ---------- | --: | ---: | --------------: | -------------------: |
+| glm4moe 106B.A12B Q4_K - Medium |  68.01 GiB |   110.47 B | ROCm       |  99 |    0 |           pp512 |        128.02 ± 0.30 |
+| glm4moe 106B.A12B Q4_K - Medium |  68.01 GiB |   110.47 B | ROCm       |  99 |    0 |           tg128 |         20.53 ± 0.01 |
+
+build: 4807e8f9 (6609)
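Review note: the README describes the rankings as using error-aware ties (mean ± σ). The sketch below assumes the simplest reading, that two results tie when their one-sigma intervals overlap; the actual rule in the analysis script is not visible in this diff, and `is_tie` is a hypothetical name.

```python
from typing import Tuple

Result = Tuple[float, float]  # (mean t/s, sigma t/s)


def is_tie(a: Result, b: Result) -> bool:
    """True when [mean - sigma, mean + sigma] of a and b overlap."""
    (mean_a, sigma_a), (mean_b, sigma_b) = a, b
    return abs(mean_a - mean_b) <= sigma_a + sigma_b


# pp512 for GLM-4.5-Air Q4_K_XL, taken from the logs in this diff.
rocwmma = (128.18, 0.37)        # rocm6_4_4-rocwmma
plain = (128.02, 0.30)          # rocm6_4_4
rocwmma_hblt0 = (159.31, 0.83)  # rocm6_4_4-rocwmma, hipBLASLt OFF

print(is_tie(rocwmma, plain))          # intervals overlap
print(is_tie(rocwmma, rocwmma_hblt0))  # intervals are far apart
```

Under this assumption the ROCWMMA and plain ROCm 6.4.4 builds would share a placement for pp512 on this model, which is consistent with how close their entries sit in the summary tables.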
diff --git a/benchmark/results/GLM-4.5-Air-UD-Q4_K_XL-00001-of-00002__rocm6_4_4__fa1.log b/benchmark/results/GLM-4.5-Air-UD-Q4_K_XL-00001-of-00002__rocm6_4_4__fa1.log
new file mode 100644
index 0000000..6554256
--- /dev/null
+++ b/benchmark/results/GLM-4.5-Air-UD-Q4_K_XL-00001-of-00002__rocm6_4_4__fa1.log
@@ -0,0 +1,10 @@
+ggml_cuda_init: GGML_CUDA_FORCE_MMQ:    no
+ggml_cuda_init: GGML_CUDA_FORCE_CUBLAS: no
+ggml_cuda_init: found 1 ROCm devices:
+  Device 0: Radeon 8060S Graphics, gfx1151 (0x1151), VMM: no, Wave Size: 32
+| model                          |       size |     params | backend    | ngl | fa | mmap |            test |                  t/s |
+| ------------------------------ | ---------: | ---------: | ---------- | --: | -: | ---: | --------------: | -------------------: |
+| glm4moe 106B.A12B Q4_K - Medium |  68.01 GiB |   110.47 B | ROCm       |  99 |  1 |    0 |           pp512 |        136.15 ± 0.32 |
+| glm4moe 106B.A12B Q4_K - Medium |  68.01 GiB |   110.47 B | ROCm       |  99 |  1 |    0 |           tg128 |         21.05 ± 0.00 |
+
+build: 4807e8f9 (6609)
diff --git a/benchmark/results/GLM-4.5-Air-UD-Q4_K_XL-00001-of-00002__rocm6_4_4__hblt0.log b/benchmark/results/GLM-4.5-Air-UD-Q4_K_XL-00001-of-00002__rocm6_4_4__hblt0.log
new file mode 100644
index 0000000..5da6d51
--- /dev/null
+++ b/benchmark/results/GLM-4.5-Air-UD-Q4_K_XL-00001-of-00002__rocm6_4_4__hblt0.log
@@ -0,0 +1,10 @@
+ggml_cuda_init: GGML_CUDA_FORCE_MMQ:    no
+ggml_cuda_init: GGML_CUDA_FORCE_CUBLAS: no
+ggml_cuda_init: found 1 ROCm devices:
+  Device 0: Radeon 8060S Graphics, gfx1151 (0x1151), VMM: no, Wave Size: 32
+| model                          |       size |     params | backend    | ngl | mmap |            test |                  t/s |
+| ------------------------------ | ---------: | ---------: | ---------- | --: | ---: | --------------: | -------------------: |
+| glm4moe 106B.A12B Q4_K - Medium |  68.01 GiB |   110.47 B | ROCm       |  99 |    0 |           pp512 |        160.41 ± 0.61 |
+| glm4moe 106B.A12B Q4_K - Medium |  68.01 GiB |   110.47 B | ROCm       |  99 |    0 |           tg128 |         20.50 ± 0.01 |
+
+build: 4807e8f9 (6609)
diff --git a/benchmark/results/GLM-4.5-Air-UD-Q4_K_XL-00001-of-00002__rocm6_4_4__hblt0__fa1.log b/benchmark/results/GLM-4.5-Air-UD-Q4_K_XL-00001-of-00002__rocm6_4_4__hblt0__fa1.log
new file mode 100644
index 0000000..f00f375
--- /dev/null
+++ b/benchmark/results/GLM-4.5-Air-UD-Q4_K_XL-00001-of-00002__rocm6_4_4__hblt0__fa1.log
@@ -0,0 +1,10 @@
+ggml_cuda_init: GGML_CUDA_FORCE_MMQ:    no
+ggml_cuda_init: GGML_CUDA_FORCE_CUBLAS: no
+ggml_cuda_init: found 1 ROCm devices:
+  Device 0: Radeon 8060S Graphics, gfx1151 (0x1151), VMM: no, Wave Size: 32
+| model                          |       size |     params | backend    | ngl | fa | mmap |            test |                  t/s |
+| ------------------------------ | ---------: | ---------: | ---------- | --: | -: | ---: | --------------: | -------------------: |
+| glm4moe 106B.A12B Q4_K - Medium |  68.01 GiB |   110.47 B | ROCm       |  99 |  1 |    0 |           pp512 |        161.32 ± 0.19 |
+| glm4moe 106B.A12B Q4_K - Medium |  68.01 GiB |   110.47 B | ROCm       |  99 |  1 |    0 |           tg128 |         21.06 ± 0.00 |
+
+build: 4807e8f9 (6609)
diff --git a/benchmark/results/GLM-4.5-Air-UD-Q6_K_XL-00001-of-00003__rocm6_4_4-rocwmma.log b/benchmark/results/GLM-4.5-Air-UD-Q6_K_XL-00001-of-00003__rocm6_4_4-rocwmma.log
new file mode 100644
index 0000000..3b36313
--- /dev/null
+++ b/benchmark/results/GLM-4.5-Air-UD-Q6_K_XL-00001-of-00003__rocm6_4_4-rocwmma.log
@@ -0,0 +1,15 @@
+ggml_cuda_init: GGML_CUDA_FORCE_MMQ:    no
+ggml_cuda_init: GGML_CUDA_FORCE_CUBLAS: no
+ggml_cuda_init: found 1 ROCm devices:
+  Device 0: Radeon 8060S Graphics, gfx1151 (0x1151), VMM: no, Wave Size: 32
+rocBLAS error: No hipBLASLt solution found
+This message will be only be displayed once, unless the ROCBLAS_VERBOSE_HIPBLASLT_ERROR environment variable is set.
+
+rocBLAS warning: hipBlasLT failed, falling back to tensile. 
+This message will be only be displayed once, unless the ROCBLAS_VERBOSE_TENSILE_ERROR environment variable is set.
+| model                          |       size |     params | backend    | ngl | mmap |            test |                  t/s |
+| ------------------------------ | ---------: | ---------: | ---------- | --: | ---: | --------------: | -------------------: |
+| glm4moe 106B.A12B Q6_K         |  94.57 GiB |   110.47 B | ROCm       |  99 |    0 |           pp512 |        123.24 ± 0.42 |
+| glm4moe 106B.A12B Q6_K         |  94.57 GiB |   110.47 B | ROCm       |  99 |    0 |           tg128 |         15.84 ± 0.00 |
+
+build: 4807e8f9 (6609)
diff --git a/benchmark/results/GLM-4.5-Air-UD-Q6_K_XL-00001-of-00003__rocm6_4_4-rocwmma__fa1.log b/benchmark/results/GLM-4.5-Air-UD-Q6_K_XL-00001-of-00003__rocm6_4_4-rocwmma__fa1.log
new file mode 100644
index 0000000..771d380
--- /dev/null
+++ b/benchmark/results/GLM-4.5-Air-UD-Q6_K_XL-00001-of-00003__rocm6_4_4-rocwmma__fa1.log
@@ -0,0 +1,10 @@
+ggml_cuda_init: GGML_CUDA_FORCE_MMQ:    no
+ggml_cuda_init: GGML_CUDA_FORCE_CUBLAS: no
+ggml_cuda_init: found 1 ROCm devices:
+  Device 0: Radeon 8060S Graphics, gfx1151 (0x1151), VMM: no, Wave Size: 32
+| model                          |       size |     params | backend    | ngl | fa | mmap |            test |                  t/s |
+| ------------------------------ | ---------: | ---------: | ---------- | --: | -: | ---: | --------------: | -------------------: |
+| glm4moe 106B.A12B Q6_K         |  94.57 GiB |   110.47 B | ROCm       |  99 |  1 |    0 |           pp512 |        129.37 ± 0.24 |
+| glm4moe 106B.A12B Q6_K         |  94.57 GiB |   110.47 B | ROCm       |  99 |  1 |    0 |           tg128 |         16.17 ± 0.00 |
+
+build: 4807e8f9 (6609)
diff --git a/benchmark/results/GLM-4.5-Air-UD-Q6_K_XL-00001-of-00003__rocm6_4_4-rocwmma__hblt0.log b/benchmark/results/GLM-4.5-Air-UD-Q6_K_XL-00001-of-00003__rocm6_4_4-rocwmma__hblt0.log
new file mode 100644
index 0000000..a85e834
--- /dev/null
+++ b/benchmark/results/GLM-4.5-Air-UD-Q6_K_XL-00001-of-00003__rocm6_4_4-rocwmma__hblt0.log
@@ -0,0 +1,10 @@
+ggml_cuda_init: GGML_CUDA_FORCE_MMQ:    no
+ggml_cuda_init: GGML_CUDA_FORCE_CUBLAS: no
+ggml_cuda_init: found 1 ROCm devices:
+  Device 0: Radeon 8060S Graphics, gfx1151 (0x1151), VMM: no, Wave Size: 32
+| model                          |       size |     params | backend    | ngl | mmap |            test |                  t/s |
+| ------------------------------ | ---------: | ---------: | ---------- | --: | ---: | --------------: | -------------------: |
+| glm4moe 106B.A12B Q6_K         |  94.57 GiB |   110.47 B | ROCm       |  99 |    0 |           pp512 |        151.03 ± 0.45 |
+| glm4moe 106B.A12B Q6_K         |  94.57 GiB |   110.47 B | ROCm       |  99 |    0 |           tg128 |         15.79 ± 0.00 |
+
+build: 4807e8f9 (6609)
diff --git a/benchmark/results/GLM-4.5-Air-UD-Q6_K_XL-00001-of-00003__rocm6_4_4-rocwmma__hblt0__fa1.log b/benchmark/results/GLM-4.5-Air-UD-Q6_K_XL-00001-of-00003__rocm6_4_4-rocwmma__hblt0__fa1.log
new file mode 100644
index 0000000..a8f8332
--- /dev/null
+++ b/benchmark/results/GLM-4.5-Air-UD-Q6_K_XL-00001-of-00003__rocm6_4_4-rocwmma__hblt0__fa1.log
@@ -0,0 +1,10 @@
+ggml_cuda_init: GGML_CUDA_FORCE_MMQ:    no
+ggml_cuda_init: GGML_CUDA_FORCE_CUBLAS: no
+ggml_cuda_init: found 1 ROCm devices:
+  Device 0: Radeon 8060S Graphics, gfx1151 (0x1151), VMM: no, Wave Size: 32
+| model                          |       size |     params | backend    | ngl | fa | mmap |            test |                  t/s |
+| ------------------------------ | ---------: | ---------: | ---------- | --: | -: | ---: | --------------: | -------------------: |
+| glm4moe 106B.A12B Q6_K         |  94.57 GiB |   110.47 B | ROCm       |  99 |  1 |    0 |           pp512 |        155.49 ± 0.74 |
+| glm4moe 106B.A12B Q6_K         |  94.57 GiB |   110.47 B | ROCm       |  99 |  1 |    0 |           tg128 |         16.18 ± 0.00 |
+
+build: 4807e8f9 (6609)
diff --git a/benchmark/results/GLM-4.5-Air-UD-Q6_K_XL-00001-of-00003__rocm6_4_4.log b/benchmark/results/GLM-4.5-Air-UD-Q6_K_XL-00001-of-00003__rocm6_4_4.log
new file mode 100644
index 0000000..591d402
--- /dev/null
+++ b/benchmark/results/GLM-4.5-Air-UD-Q6_K_XL-00001-of-00003__rocm6_4_4.log
@@ -0,0 +1,15 @@
+ggml_cuda_init: GGML_CUDA_FORCE_MMQ:    no
+ggml_cuda_init: GGML_CUDA_FORCE_CUBLAS: no
+ggml_cuda_init: found 1 ROCm devices:
+  Device 0: Radeon 8060S Graphics, gfx1151 (0x1151), VMM: no, Wave Size: 32
+rocBLAS error: No hipBLASLt solution found
+This message will be only be displayed once, unless the ROCBLAS_VERBOSE_HIPBLASLT_ERROR environment variable is set.
+
+rocBLAS warning: hipBlasLT failed, falling back to tensile. 
+This message will be only be displayed once, unless the ROCBLAS_VERBOSE_TENSILE_ERROR environment variable is set.
+| model                          |       size |     params | backend    | ngl | mmap |            test |                  t/s |
+| ------------------------------ | ---------: | ---------: | ---------- | --: | ---: | --------------: | -------------------: |
+| glm4moe 106B.A12B Q6_K         |  94.57 GiB |   110.47 B | ROCm       |  99 |    0 |           pp512 |        122.48 ± 0.34 |
+| glm4moe 106B.A12B Q6_K         |  94.57 GiB |   110.47 B | ROCm       |  99 |    0 |           tg128 |         15.86 ± 0.01 |
+
+build: 4807e8f9 (6609)
diff --git a/benchmark/results/GLM-4.5-Air-UD-Q6_K_XL-00001-of-00003__rocm6_4_4__fa1.log b/benchmark/results/GLM-4.5-Air-UD-Q6_K_XL-00001-of-00003__rocm6_4_4__fa1.log
new file mode 100644
index 0000000..f4aac77
--- /dev/null
+++ b/benchmark/results/GLM-4.5-Air-UD-Q6_K_XL-00001-of-00003__rocm6_4_4__fa1.log
@@ -0,0 +1,10 @@
+ggml_cuda_init: GGML_CUDA_FORCE_MMQ:    no
+ggml_cuda_init: GGML_CUDA_FORCE_CUBLAS: no
+ggml_cuda_init: found 1 ROCm devices:
+  Device 0: Radeon 8060S Graphics, gfx1151 (0x1151), VMM: no, Wave Size: 32
+| model                          |       size |     params | backend    | ngl | fa | mmap |            test |                  t/s |
+| ------------------------------ | ---------: | ---------: | ---------- | --: | -: | ---: | --------------: | -------------------: |
+| glm4moe 106B.A12B Q6_K         |  94.57 GiB |   110.47 B | ROCm       |  99 |  1 |    0 |           pp512 |        130.06 ± 0.38 |
+| glm4moe 106B.A12B Q6_K         |  94.57 GiB |   110.47 B | ROCm       |  99 |  1 |    0 |           tg128 |         16.18 ± 0.00 |
+
+build: 4807e8f9 (6609)
diff --git a/benchmark/results/GLM-4.5-Air-UD-Q6_K_XL-00001-of-00003__rocm6_4_4__hblt0.log b/benchmark/results/GLM-4.5-Air-UD-Q6_K_XL-00001-of-00003__rocm6_4_4__hblt0.log
new file mode 100644
index 0000000..fb204a9
--- /dev/null
+++ b/benchmark/results/GLM-4.5-Air-UD-Q6_K_XL-00001-of-00003__rocm6_4_4__hblt0.log
@@ -0,0 +1,10 @@
+ggml_cuda_init: GGML_CUDA_FORCE_MMQ:    no
+ggml_cuda_init: GGML_CUDA_FORCE_CUBLAS: no
+ggml_cuda_init: found 1 ROCm devices:
+  Device 0: Radeon 8060S Graphics, gfx1151 (0x1151), VMM: no, Wave Size: 32
+| model                          |       size |     params | backend    | ngl | mmap |            test |                  t/s |
+| ------------------------------ | ---------: | ---------: | ---------- | --: | ---: | --------------: | -------------------: |
+| glm4moe 106B.A12B Q6_K         |  94.57 GiB |   110.47 B | ROCm       |  99 |    0 |           pp512 |        150.67 ± 0.75 |
+| glm4moe 106B.A12B Q6_K         |  94.57 GiB |   110.47 B | ROCm       |  99 |    0 |           tg128 |         15.84 ± 0.01 |
+
+build: 4807e8f9 (6609)
diff --git a/benchmark/results/GLM-4.5-Air-UD-Q6_K_XL-00001-of-00003__rocm6_4_4__hblt0__fa1.log b/benchmark/results/GLM-4.5-Air-UD-Q6_K_XL-00001-of-00003__rocm6_4_4__hblt0__fa1.log
new file mode 100644
index 0000000..773856c
--- /dev/null
+++ b/benchmark/results/GLM-4.5-Air-UD-Q6_K_XL-00001-of-00003__rocm6_4_4__hblt0__fa1.log
@@ -0,0 +1,10 @@
+ggml_cuda_init: GGML_CUDA_FORCE_MMQ:    no
+ggml_cuda_init: GGML_CUDA_FORCE_CUBLAS: no
+ggml_cuda_init: found 1 ROCm devices:
+  Device 0: Radeon 8060S Graphics, gfx1151 (0x1151), VMM: no, Wave Size: 32
+| model                          |       size |     params | backend    | ngl | fa | mmap |            test |                  t/s |
+| ------------------------------ | ---------: | ---------: | ---------- | --: | -: | ---: | --------------: | -------------------: |
+| glm4moe 106B.A12B Q6_K         |  94.57 GiB |   110.47 B | ROCm       |  99 |  1 |    0 |           pp512 |        149.93 ± 0.58 |
+| glm4moe 106B.A12B Q6_K         |  94.57 GiB |   110.47 B | ROCm       |  99 |  1 |    0 |           tg128 |         16.18 ± 0.00 |
+
+build: 4807e8f9 (6609)
diff --git a/benchmark/results/Llama-3.3-70B-Instruct-UD-Q8_K_XL-00001-of-00002__rocm6_4_4-rocwmma.log b/benchmark/results/Llama-3.3-70B-Instruct-UD-Q8_K_XL-00001-of-00002__rocm6_4_4-rocwmma.log
new file mode 100644
index 0000000..27c4fe3
--- /dev/null
+++ b/benchmark/results/Llama-3.3-70B-Instruct-UD-Q8_K_XL-00001-of-00002__rocm6_4_4-rocwmma.log
@@ -0,0 +1,15 @@
+ggml_cuda_init: GGML_CUDA_FORCE_MMQ:    no
+ggml_cuda_init: GGML_CUDA_FORCE_CUBLAS: no
+ggml_cuda_init: found 1 ROCm devices:
+  Device 0: Radeon 8060S Graphics, gfx1151 (0x1151), VMM: no, Wave Size: 32
+hipBLASLt error: Heuristic Fetch Failed!
+This message will be only be displayed once, unless the ROCBLAS_VERBOSE_HIPBLASLT_ERROR environment variable is set.
+
+rocBLAS warning: hipBlasLT failed, falling back to tensile. 
+This message will be only be displayed once, unless the ROCBLAS_VERBOSE_TENSILE_ERROR environment variable is set.
+| model                          |       size |     params | backend    | ngl | mmap |            test |                  t/s |
+| ------------------------------ | ---------: | ---------: | ---------- | --: | ---: | --------------: | -------------------: |
+| llama 70B Q8_0                 |  75.65 GiB |    70.55 B | ROCm       |  99 |    0 |           pp512 |         98.87 ± 0.18 |
+| llama 70B Q8_0                 |  75.65 GiB |    70.55 B | ROCm       |  99 |    0 |           tg128 |          2.77 ± 0.00 |
+
+build: 4807e8f9 (6609)
diff --git a/benchmark/results/Llama-3.3-70B-Instruct-UD-Q8_K_XL-00001-of-00002__rocm6_4_4-rocwmma__fa1.log b/benchmark/results/Llama-3.3-70B-Instruct-UD-Q8_K_XL-00001-of-00002__rocm6_4_4-rocwmma__fa1.log
new file mode 100644
index 0000000..ba04c4f
--- /dev/null
+++ b/benchmark/results/Llama-3.3-70B-Instruct-UD-Q8_K_XL-00001-of-00002__rocm6_4_4-rocwmma__fa1.log
@@ -0,0 +1,10 @@
+ggml_cuda_init: GGML_CUDA_FORCE_MMQ:    no
+ggml_cuda_init: GGML_CUDA_FORCE_CUBLAS: no
+ggml_cuda_init: found 1 ROCm devices:
+  Device 0: Radeon 8060S Graphics, gfx1151 (0x1151), VMM: no, Wave Size: 32
+| model                          |       size |     params | backend    | ngl | fa | mmap |            test |                  t/s |
+| ------------------------------ | ---------: | ---------: | ---------- | --: | -: | ---: | --------------: | -------------------: |
+| llama 70B Q8_0                 |  75.65 GiB |    70.55 B | ROCm       |  99 |  1 |    0 |           pp512 |        104.31 ± 0.07 |
+| llama 70B Q8_0                 |  75.65 GiB |    70.55 B | ROCm       |  99 |  1 |    0 |           tg128 |          2.79 ± 0.00 |
+
+build: 4807e8f9 (6609)
diff --git a/benchmark/results/Llama-3.3-70B-Instruct-UD-Q8_K_XL-00001-of-00002__rocm6_4_4-rocwmma__hblt0.log b/benchmark/results/Llama-3.3-70B-Instruct-UD-Q8_K_XL-00001-of-00002__rocm6_4_4-rocwmma__hblt0.log
new file mode 100644
index 0000000..2cf5854
--- /dev/null
+++ b/benchmark/results/Llama-3.3-70B-Instruct-UD-Q8_K_XL-00001-of-00002__rocm6_4_4-rocwmma__hblt0.log
@@ -0,0 +1,10 @@
+ggml_cuda_init: GGML_CUDA_FORCE_MMQ:    no
+ggml_cuda_init: GGML_CUDA_FORCE_CUBLAS: no
+ggml_cuda_init: found 1 ROCm devices:
+  Device 0: Radeon 8060S Graphics, gfx1151 (0x1151), VMM: no, Wave Size: 32
+| model                          |       size |     params | backend    | ngl | mmap |            test |                  t/s |
+| ------------------------------ | ---------: | ---------: | ---------- | --: | ---: | --------------: | -------------------: |
+| llama 70B Q8_0                 |  75.65 GiB |    70.55 B | ROCm       |  99 |    0 |           pp512 |         97.43 ± 0.23 |
+| llama 70B Q8_0                 |  75.65 GiB |    70.55 B | ROCm       |  99 |    0 |           tg128 |          2.76 ± 0.00 |
+
+build: 4807e8f9 (6609)
diff --git a/benchmark/results/Llama-3.3-70B-Instruct-UD-Q8_K_XL-00001-of-00002__rocm6_4_4-rocwmma__hblt0__fa1.log b/benchmark/results/Llama-3.3-70B-Instruct-UD-Q8_K_XL-00001-of-00002__rocm6_4_4-rocwmma__hblt0__fa1.log
new file mode 100644
index 0000000..ca99086
--- /dev/null
+++ b/benchmark/results/Llama-3.3-70B-Instruct-UD-Q8_K_XL-00001-of-00002__rocm6_4_4-rocwmma__hblt0__fa1.log
@@ -0,0 +1,10 @@
+ggml_cuda_init: GGML_CUDA_FORCE_MMQ:    no
+ggml_cuda_init: GGML_CUDA_FORCE_CUBLAS: no
+ggml_cuda_init: found 1 ROCm devices:
+  Device 0: Radeon 8060S Graphics, gfx1151 (0x1151), VMM: no, Wave Size: 32
+| model                          |       size |     params | backend    | ngl | fa | mmap |            test |                  t/s |
+| ------------------------------ | ---------: | ---------: | ---------- | --: | -: | ---: | --------------: | -------------------: |
+| llama 70B Q8_0                 |  75.65 GiB |    70.55 B | ROCm       |  99 |  1 |    0 |           pp512 |        103.81 ± 0.09 |
+| llama 70B Q8_0                 |  75.65 GiB |    70.55 B | ROCm       |  99 |  1 |    0 |           tg128 |          2.78 ± 0.00 |
+
+build: 4807e8f9 (6609)
diff --git a/benchmark/results/Llama-3.3-70B-Instruct-UD-Q8_K_XL-00001-of-00002__rocm6_4_4.log b/benchmark/results/Llama-3.3-70B-Instruct-UD-Q8_K_XL-00001-of-00002__rocm6_4_4.log
new file mode 100644
index 0000000..c9ad273
--- /dev/null
+++ b/benchmark/results/Llama-3.3-70B-Instruct-UD-Q8_K_XL-00001-of-00002__rocm6_4_4.log
@@ -0,0 +1,15 @@
+ggml_cuda_init: GGML_CUDA_FORCE_MMQ:    no
+ggml_cuda_init: GGML_CUDA_FORCE_CUBLAS: no
+ggml_cuda_init: found 1 ROCm devices:
+  Device 0: Radeon 8060S Graphics, gfx1151 (0x1151), VMM: no, Wave Size: 32
+hipBLASLt error: Heuristic Fetch Failed!
+This message will be only be displayed once, unless the ROCBLAS_VERBOSE_HIPBLASLT_ERROR environment variable is set.
+
+rocBLAS warning: hipBlasLT failed, falling back to tensile. 
+This message will be only be displayed once, unless the ROCBLAS_VERBOSE_TENSILE_ERROR environment variable is set.
+| model                          |       size |     params | backend    | ngl | mmap |            test |                  t/s |
+| ------------------------------ | ---------: | ---------: | ---------- | --: | ---: | --------------: | -------------------: |
+| llama 70B Q8_0                 |  75.65 GiB |    70.55 B | ROCm       |  99 |    0 |           pp512 |         99.32 ± 0.17 |
+| llama 70B Q8_0                 |  75.65 GiB |    70.55 B | ROCm       |  99 |    0 |           tg128 |          2.78 ± 0.00 |
+
+build: 4807e8f9 (6609)
diff --git a/benchmark/results/Llama-3.3-70B-Instruct-UD-Q8_K_XL-00001-of-00002__rocm6_4_4__fa1.log b/benchmark/results/Llama-3.3-70B-Instruct-UD-Q8_K_XL-00001-of-00002__rocm6_4_4__fa1.log
new file mode 100644
index 0000000..5ab870f
--- /dev/null
+++ b/benchmark/results/Llama-3.3-70B-Instruct-UD-Q8_K_XL-00001-of-00002__rocm6_4_4__fa1.log
@@ -0,0 +1,10 @@
+ggml_cuda_init: GGML_CUDA_FORCE_MMQ:    no
+ggml_cuda_init: GGML_CUDA_FORCE_CUBLAS: no
+ggml_cuda_init: found 1 ROCm devices:
+  Device 0: Radeon 8060S Graphics, gfx1151 (0x1151), VMM: no, Wave Size: 32
+| model                          |       size |     params | backend    | ngl | fa | mmap |            test |                  t/s |
+| ------------------------------ | ---------: | ---------: | ---------- | --: | -: | ---: | --------------: | -------------------: |
+| llama 70B Q8_0                 |  75.65 GiB |    70.55 B | ROCm       |  99 |  1 |    0 |           pp512 |        104.93 ± 0.11 |
+| llama 70B Q8_0                 |  75.65 GiB |    70.55 B | ROCm       |  99 |  1 |    0 |           tg128 |          2.79 ± 0.00 |
+
+build: 4807e8f9 (6609)
diff --git a/benchmark/results/Llama-3.3-70B-Instruct-UD-Q8_K_XL-00001-of-00002__rocm6_4_4__hblt0.log b/benchmark/results/Llama-3.3-70B-Instruct-UD-Q8_K_XL-00001-of-00002__rocm6_4_4__hblt0.log
new file mode 100644
index 0000000..89f6c3b
--- /dev/null
+++ b/benchmark/results/Llama-3.3-70B-Instruct-UD-Q8_K_XL-00001-of-00002__rocm6_4_4__hblt0.log
@@ -0,0 +1,10 @@
+ggml_cuda_init: GGML_CUDA_FORCE_MMQ:    no
+ggml_cuda_init: GGML_CUDA_FORCE_CUBLAS: no
+ggml_cuda_init: found 1 ROCm devices:
+  Device 0: Radeon 8060S Graphics, gfx1151 (0x1151), VMM: no, Wave Size: 32
+| model                          |       size |     params | backend    | ngl | mmap |            test |                  t/s |
+| ------------------------------ | ---------: | ---------: | ---------- | --: | ---: | --------------: | -------------------: |
+| llama 70B Q8_0                 |  75.65 GiB |    70.55 B | ROCm       |  99 |    0 |           pp512 |         98.99 ± 0.21 |
+| llama 70B Q8_0                 |  75.65 GiB |    70.55 B | ROCm       |  99 |    0 |           tg128 |          2.78 ± 0.00 |
+
+build: 4807e8f9 (6609)
diff --git a/benchmark/results/Llama-3.3-70B-Instruct-UD-Q8_K_XL-00001-of-00002__rocm6_4_4__hblt0__fa1.log b/benchmark/results/Llama-3.3-70B-Instruct-UD-Q8_K_XL-00001-of-00002__rocm6_4_4__hblt0__fa1.log
new file mode 100644
index 0000000..83ddd35
--- /dev/null
+++ b/benchmark/results/Llama-3.3-70B-Instruct-UD-Q8_K_XL-00001-of-00002__rocm6_4_4__hblt0__fa1.log
@@ -0,0 +1,10 @@
+ggml_cuda_init: GGML_CUDA_FORCE_MMQ:    no
+ggml_cuda_init: GGML_CUDA_FORCE_CUBLAS: no
+ggml_cuda_init: found 1 ROCm devices:
+  Device 0: Radeon 8060S Graphics, gfx1151 (0x1151), VMM: no, Wave Size: 32
+| model                          |       size |     params | backend    | ngl | fa | mmap |            test |                  t/s |
+| ------------------------------ | ---------: | ---------: | ---------- | --: | -: | ---: | --------------: | -------------------: |
+| llama 70B Q8_0                 |  75.65 GiB |    70.55 B | ROCm       |  99 |  1 |    0 |           pp512 |        103.03 ± 0.23 |
+| llama 70B Q8_0                 |  75.65 GiB |    70.55 B | ROCm       |  99 |  1 |    0 |           tg128 |          2.79 ± 0.00 |
+
+build: 4807e8f9 (6609)
diff --git a/benchmark/results/Llama-4-Scout-17B-16E-Instruct-Q6_K-00001-of-00002__rocm6_4_4-rocwmma.log b/benchmark/results/Llama-4-Scout-17B-16E-Instruct-Q6_K-00001-of-00002__rocm6_4_4-rocwmma.log
new file mode 100644
index 0000000..0942b1d
--- /dev/null
+++ b/benchmark/results/Llama-4-Scout-17B-16E-Instruct-Q6_K-00001-of-00002__rocm6_4_4-rocwmma.log
@@ -0,0 +1,15 @@
+ggml_cuda_init: GGML_CUDA_FORCE_MMQ:    no
+ggml_cuda_init: GGML_CUDA_FORCE_CUBLAS: no
+ggml_cuda_init: found 1 ROCm devices:
+  Device 0: Radeon 8060S Graphics, gfx1151 (0x1151), VMM: no, Wave Size: 32
+rocBLAS error: No hipBLASLt solution found
+This message will be only be displayed once, unless the ROCBLAS_VERBOSE_HIPBLASLT_ERROR environment variable is set.
+
+rocBLAS warning: hipBlasLT failed, falling back to tensile. 
+This message will be only be displayed once, unless the ROCBLAS_VERBOSE_TENSILE_ERROR environment variable is set.
+| model                          |       size |     params | backend    | ngl | mmap |            test |                  t/s |
+| ------------------------------ | ---------: | ---------: | ---------- | --: | ---: | --------------: | -------------------: |
+| llama4 17Bx16E (Scout) Q6_K    |  82.35 GiB |   107.77 B | ROCm       |  99 |    0 |           pp512 |        276.88 ± 1.57 |
+| llama4 17Bx16E (Scout) Q6_K    |  82.35 GiB |   107.77 B | ROCm       |  99 |    0 |           tg128 |         14.66 ± 0.00 |
+
+build: 4807e8f9 (6609)
diff --git a/benchmark/results/Llama-4-Scout-17B-16E-Instruct-Q6_K-00001-of-00002__rocm6_4_4-rocwmma__fa1.log b/benchmark/results/Llama-4-Scout-17B-16E-Instruct-Q6_K-00001-of-00002__rocm6_4_4-rocwmma__fa1.log
new file mode 100644
index 0000000..23ba9a1
--- /dev/null
+++ b/benchmark/results/Llama-4-Scout-17B-16E-Instruct-Q6_K-00001-of-00002__rocm6_4_4-rocwmma__fa1.log
@@ -0,0 +1,10 @@
+ggml_cuda_init: GGML_CUDA_FORCE_MMQ:    no
+ggml_cuda_init: GGML_CUDA_FORCE_CUBLAS: no
+ggml_cuda_init: found 1 ROCm devices:
+  Device 0: Radeon 8060S Graphics, gfx1151 (0x1151), VMM: no, Wave Size: 32
+| model                          |       size |     params | backend    | ngl | fa | mmap |            test |                  t/s |
+| ------------------------------ | ---------: | ---------: | ---------- | --: | -: | ---: | --------------: | -------------------: |
+| llama4 17Bx16E (Scout) Q6_K    |  82.35 GiB |   107.77 B | ROCm       |  99 |  1 |    0 |           pp512 |        292.47 ± 1.18 |
+| llama4 17Bx16E (Scout) Q6_K    |  82.35 GiB |   107.77 B | ROCm       |  99 |  1 |    0 |           tg128 |         14.83 ± 0.00 |
+
+build: 4807e8f9 (6609)
diff --git a/benchmark/results/Llama-4-Scout-17B-16E-Instruct-Q6_K-00001-of-00002__rocm6_4_4-rocwmma__hblt0.log b/benchmark/results/Llama-4-Scout-17B-16E-Instruct-Q6_K-00001-of-00002__rocm6_4_4-rocwmma__hblt0.log
new file mode 100644
index 0000000..3ad139d
--- /dev/null
+++ b/benchmark/results/Llama-4-Scout-17B-16E-Instruct-Q6_K-00001-of-00002__rocm6_4_4-rocwmma__hblt0.log
@@ -0,0 +1,10 @@
+ggml_cuda_init: GGML_CUDA_FORCE_MMQ:    no
+ggml_cuda_init: GGML_CUDA_FORCE_CUBLAS: no
+ggml_cuda_init: found 1 ROCm devices:
+  Device 0: Radeon 8060S Graphics, gfx1151 (0x1151), VMM: no, Wave Size: 32
+| model                          |       size |     params | backend    | ngl | mmap |            test |                  t/s |
+| ------------------------------ | ---------: | ---------: | ---------- | --: | ---: | --------------: | -------------------: |
+| llama4 17Bx16E (Scout) Q6_K    |  82.35 GiB |   107.77 B | ROCm       |  99 |    0 |           pp512 |        277.79 ± 0.94 |
+| llama4 17Bx16E (Scout) Q6_K    |  82.35 GiB |   107.77 B | ROCm       |  99 |    0 |           tg128 |         14.65 ± 0.00 |
+
+build: 4807e8f9 (6609)
diff --git a/benchmark/results/Llama-4-Scout-17B-16E-Instruct-Q6_K-00001-of-00002__rocm6_4_4-rocwmma__hblt0__fa1.log b/benchmark/results/Llama-4-Scout-17B-16E-Instruct-Q6_K-00001-of-00002__rocm6_4_4-rocwmma__hblt0__fa1.log
new file mode 100644
index 0000000..17338e8
--- /dev/null
+++ b/benchmark/results/Llama-4-Scout-17B-16E-Instruct-Q6_K-00001-of-00002__rocm6_4_4-rocwmma__hblt0__fa1.log
@@ -0,0 +1,10 @@
+ggml_cuda_init: GGML_CUDA_FORCE_MMQ:    no
+ggml_cuda_init: GGML_CUDA_FORCE_CUBLAS: no
+ggml_cuda_init: found 1 ROCm devices:
+  Device 0: Radeon 8060S Graphics, gfx1151 (0x1151), VMM: no, Wave Size: 32
+| model                          |       size |     params | backend    | ngl | fa | mmap |            test |                  t/s |
+| ------------------------------ | ---------: | ---------: | ---------- | --: | -: | ---: | --------------: | -------------------: |
+| llama4 17Bx16E (Scout) Q6_K    |  82.35 GiB |   107.77 B | ROCm       |  99 |  1 |    0 |           pp512 |        292.17 ± 1.61 |
+| llama4 17Bx16E (Scout) Q6_K    |  82.35 GiB |   107.77 B | ROCm       |  99 |  1 |    0 |           tg128 |         14.83 ± 0.00 |
+
+build: 4807e8f9 (6609)
diff --git a/benchmark/results/Llama-4-Scout-17B-16E-Instruct-Q6_K-00001-of-00002__rocm6_4_4.log b/benchmark/results/Llama-4-Scout-17B-16E-Instruct-Q6_K-00001-of-00002__rocm6_4_4.log
new file mode 100644
index 0000000..0fba6ba
--- /dev/null
+++ b/benchmark/results/Llama-4-Scout-17B-16E-Instruct-Q6_K-00001-of-00002__rocm6_4_4.log
@@ -0,0 +1,15 @@
+ggml_cuda_init: GGML_CUDA_FORCE_MMQ:    no
+ggml_cuda_init: GGML_CUDA_FORCE_CUBLAS: no
+ggml_cuda_init: found 1 ROCm devices:
+  Device 0: Radeon 8060S Graphics, gfx1151 (0x1151), VMM: no, Wave Size: 32
+rocBLAS error: No hipBLASLt solution found
+This message will be only be displayed once, unless the ROCBLAS_VERBOSE_HIPBLASLT_ERROR environment variable is set.
+
+rocBLAS warning: hipBlasLT failed, falling back to tensile. 
+This message will be only be displayed once, unless the ROCBLAS_VERBOSE_TENSILE_ERROR environment variable is set.
+| model                          |       size |     params | backend    | ngl | mmap |            test |                  t/s |
+| ------------------------------ | ---------: | ---------: | ---------- | --: | ---: | --------------: | -------------------: |
+| llama4 17Bx16E (Scout) Q6_K    |  82.35 GiB |   107.77 B | ROCm       |  99 |    0 |           pp512 |        276.97 ± 1.15 |
+| llama4 17Bx16E (Scout) Q6_K    |  82.35 GiB |   107.77 B | ROCm       |  99 |    0 |           tg128 |         14.71 ± 0.00 |
+
+build: 4807e8f9 (6609)
diff --git a/benchmark/results/Llama-4-Scout-17B-16E-Instruct-Q6_K-00001-of-00002__rocm6_4_4__fa1.log b/benchmark/results/Llama-4-Scout-17B-16E-Instruct-Q6_K-00001-of-00002__rocm6_4_4__fa1.log
new file mode 100644
index 0000000..7a9c31a
--- /dev/null
+++ b/benchmark/results/Llama-4-Scout-17B-16E-Instruct-Q6_K-00001-of-00002__rocm6_4_4__fa1.log
@@ -0,0 +1,10 @@
+ggml_cuda_init: GGML_CUDA_FORCE_MMQ:    no
+ggml_cuda_init: GGML_CUDA_FORCE_CUBLAS: no
+ggml_cuda_init: found 1 ROCm devices:
+  Device 0: Radeon 8060S Graphics, gfx1151 (0x1151), VMM: no, Wave Size: 32
+| model                          |       size |     params | backend    | ngl | fa | mmap |            test |                  t/s |
+| ------------------------------ | ---------: | ---------: | ---------- | --: | -: | ---: | --------------: | -------------------: |
+| llama4 17Bx16E (Scout) Q6_K    |  82.35 GiB |   107.77 B | ROCm       |  99 |  1 |    0 |           pp512 |        293.79 ± 2.33 |
+| llama4 17Bx16E (Scout) Q6_K    |  82.35 GiB |   107.77 B | ROCm       |  99 |  1 |    0 |           tg128 |         14.84 ± 0.00 |
+
+build: 4807e8f9 (6609)
diff --git a/benchmark/results/Llama-4-Scout-17B-16E-Instruct-Q6_K-00001-of-00002__rocm6_4_4__hblt0.log b/benchmark/results/Llama-4-Scout-17B-16E-Instruct-Q6_K-00001-of-00002__rocm6_4_4__hblt0.log
new file mode 100644
index 0000000..f465241
--- /dev/null
+++ b/benchmark/results/Llama-4-Scout-17B-16E-Instruct-Q6_K-00001-of-00002__rocm6_4_4__hblt0.log
@@ -0,0 +1,10 @@
+ggml_cuda_init: GGML_CUDA_FORCE_MMQ:    no
+ggml_cuda_init: GGML_CUDA_FORCE_CUBLAS: no
+ggml_cuda_init: found 1 ROCm devices:
+  Device 0: Radeon 8060S Graphics, gfx1151 (0x1151), VMM: no, Wave Size: 32
+| model                          |       size |     params | backend    | ngl | mmap |            test |                  t/s |
+| ------------------------------ | ---------: | ---------: | ---------- | --: | ---: | --------------: | -------------------: |
+| llama4 17Bx16E (Scout) Q6_K    |  82.35 GiB |   107.77 B | ROCm       |  99 |    0 |           pp512 |        278.59 ± 1.22 |
+| llama4 17Bx16E (Scout) Q6_K    |  82.35 GiB |   107.77 B | ROCm       |  99 |    0 |           tg128 |         14.70 ± 0.00 |
+
+build: 4807e8f9 (6609)
diff --git a/benchmark/results/Llama-4-Scout-17B-16E-Instruct-Q6_K-00001-of-00002__rocm6_4_4__hblt0__fa1.log b/benchmark/results/Llama-4-Scout-17B-16E-Instruct-Q6_K-00001-of-00002__rocm6_4_4__hblt0__fa1.log
new file mode 100644
index 0000000..1ceb016
--- /dev/null
+++ b/benchmark/results/Llama-4-Scout-17B-16E-Instruct-Q6_K-00001-of-00002__rocm6_4_4__hblt0__fa1.log
@@ -0,0 +1,10 @@
+ggml_cuda_init: GGML_CUDA_FORCE_MMQ:    no
+ggml_cuda_init: GGML_CUDA_FORCE_CUBLAS: no
+ggml_cuda_init: found 1 ROCm devices:
+  Device 0: Radeon 8060S Graphics, gfx1151 (0x1151), VMM: no, Wave Size: 32
+| model                          |       size |     params | backend    | ngl | fa | mmap |            test |                  t/s |
+| ------------------------------ | ---------: | ---------: | ---------- | --: | -: | ---: | --------------: | -------------------: |
+| llama4 17Bx16E (Scout) Q6_K    |  82.35 GiB |   107.77 B | ROCm       |  99 |  1 |    0 |           pp512 |        296.61 ± 0.98 |
+| llama4 17Bx16E (Scout) Q6_K    |  82.35 GiB |   107.77 B | ROCm       |  99 |  1 |    0 |           tg128 |         14.83 ± 0.00 |
+
+build: 4807e8f9 (6609)
diff --git a/benchmark/results/Llama-4-Scout-17B-16E-Instruct-Q8_0-00001-of-00003__rocm6_4_4-rocwmma.log b/benchmark/results/Llama-4-Scout-17B-16E-Instruct-Q8_0-00001-of-00003__rocm6_4_4-rocwmma.log
new file mode 100644
index 0000000..a90f13a
--- /dev/null
+++ b/benchmark/results/Llama-4-Scout-17B-16E-Instruct-Q8_0-00001-of-00003__rocm6_4_4-rocwmma.log
@@ -0,0 +1,15 @@
+ggml_cuda_init: GGML_CUDA_FORCE_MMQ:    no
+ggml_cuda_init: GGML_CUDA_FORCE_CUBLAS: no
+ggml_cuda_init: found 1 ROCm devices:
+  Device 0: Radeon 8060S Graphics, gfx1151 (0x1151), VMM: no, Wave Size: 32
+rocBLAS error: No hipBLASLt solution found
+This message will be only be displayed once, unless the ROCBLAS_VERBOSE_HIPBLASLT_ERROR environment variable is set.
+
+rocBLAS warning: hipBlasLT failed, falling back to tensile. 
+This message will be only be displayed once, unless the ROCBLAS_VERBOSE_TENSILE_ERROR environment variable is set.
+| model                          |       size |     params | backend    | ngl | mmap |            test |                  t/s |
+| ------------------------------ | ---------: | ---------: | ---------- | --: | ---: | --------------: | -------------------: |
+| llama4 17Bx16E (Scout) Q8_0    | 106.65 GiB |   107.77 B | ROCm       |  99 |    0 |           pp512 |        281.33 ± 2.60 |
+| llama4 17Bx16E (Scout) Q8_0    | 106.65 GiB |   107.77 B | ROCm       |  99 |    0 |           tg128 |         11.89 ± 0.00 |
+
+build: 4807e8f9 (6609)
diff --git a/benchmark/results/Llama-4-Scout-17B-16E-Instruct-Q8_0-00001-of-00003__rocm6_4_4-rocwmma__fa1.log b/benchmark/results/Llama-4-Scout-17B-16E-Instruct-Q8_0-00001-of-00003__rocm6_4_4-rocwmma__fa1.log
new file mode 100644
index 0000000..e75d6ae
--- /dev/null
+++ b/benchmark/results/Llama-4-Scout-17B-16E-Instruct-Q8_0-00001-of-00003__rocm6_4_4-rocwmma__fa1.log
@@ -0,0 +1,10 @@
+ggml_cuda_init: GGML_CUDA_FORCE_MMQ:    no
+ggml_cuda_init: GGML_CUDA_FORCE_CUBLAS: no
+ggml_cuda_init: found 1 ROCm devices:
+  Device 0: Radeon 8060S Graphics, gfx1151 (0x1151), VMM: no, Wave Size: 32
+| model                          |       size |     params | backend    | ngl | fa | mmap |            test |                  t/s |
+| ------------------------------ | ---------: | ---------: | ---------- | --: | -: | ---: | --------------: | -------------------: |
+| llama4 17Bx16E (Scout) Q8_0    | 106.65 GiB |   107.77 B | ROCm       |  99 |  1 |    0 |           pp512 |        297.14 ± 1.58 |
+| llama4 17Bx16E (Scout) Q8_0    | 106.65 GiB |   107.77 B | ROCm       |  99 |  1 |    0 |           tg128 |         12.00 ± 0.00 |
+
+build: 4807e8f9 (6609)
diff --git a/benchmark/results/Llama-4-Scout-17B-16E-Instruct-Q8_0-00001-of-00003__rocm6_4_4-rocwmma__hblt0.log b/benchmark/results/Llama-4-Scout-17B-16E-Instruct-Q8_0-00001-of-00003__rocm6_4_4-rocwmma__hblt0.log
new file mode 100644
index 0000000..321f56b
--- /dev/null
+++ b/benchmark/results/Llama-4-Scout-17B-16E-Instruct-Q8_0-00001-of-00003__rocm6_4_4-rocwmma__hblt0.log
@@ -0,0 +1,10 @@
+ggml_cuda_init: GGML_CUDA_FORCE_MMQ:    no
+ggml_cuda_init: GGML_CUDA_FORCE_CUBLAS: no
+ggml_cuda_init: found 1 ROCm devices:
+  Device 0: Radeon 8060S Graphics, gfx1151 (0x1151), VMM: no, Wave Size: 32
+| model                          |       size |     params | backend    | ngl | mmap |            test |                  t/s |
+| ------------------------------ | ---------: | ---------: | ---------- | --: | ---: | --------------: | -------------------: |
+| llama4 17Bx16E (Scout) Q8_0    | 106.65 GiB |   107.77 B | ROCm       |  99 |    0 |           pp512 |        280.36 ± 0.42 |
+| llama4 17Bx16E (Scout) Q8_0    | 106.65 GiB |   107.77 B | ROCm       |  99 |    0 |           tg128 |         11.88 ± 0.00 |
+
+build: 4807e8f9 (6609)
diff --git a/benchmark/results/Llama-4-Scout-17B-16E-Instruct-Q8_0-00001-of-00003__rocm6_4_4-rocwmma__hblt0__fa1.log b/benchmark/results/Llama-4-Scout-17B-16E-Instruct-Q8_0-00001-of-00003__rocm6_4_4-rocwmma__hblt0__fa1.log
new file mode 100644
index 0000000..541b6d2
--- /dev/null
+++ b/benchmark/results/Llama-4-Scout-17B-16E-Instruct-Q8_0-00001-of-00003__rocm6_4_4-rocwmma__hblt0__fa1.log
@@ -0,0 +1,10 @@
+ggml_cuda_init: GGML_CUDA_FORCE_MMQ:    no
+ggml_cuda_init: GGML_CUDA_FORCE_CUBLAS: no
+ggml_cuda_init: found 1 ROCm devices:
+  Device 0: Radeon 8060S Graphics, gfx1151 (0x1151), VMM: no, Wave Size: 32
+| model                          |       size |     params | backend    | ngl | fa | mmap |            test |                  t/s |
+| ------------------------------ | ---------: | ---------: | ---------- | --: | -: | ---: | --------------: | -------------------: |
+| llama4 17Bx16E (Scout) Q8_0    | 106.65 GiB |   107.77 B | ROCm       |  99 |  1 |    0 |           pp512 |        298.12 ± 2.72 |
+| llama4 17Bx16E (Scout) Q8_0    | 106.65 GiB |   107.77 B | ROCm       |  99 |  1 |    0 |           tg128 |         12.00 ± 0.00 |
+
+build: 4807e8f9 (6609)
diff --git a/benchmark/results/Llama-4-Scout-17B-16E-Instruct-Q8_0-00001-of-00003__rocm6_4_4.log b/benchmark/results/Llama-4-Scout-17B-16E-Instruct-Q8_0-00001-of-00003__rocm6_4_4.log
new file mode 100644
index 0000000..9662d5c
--- /dev/null
+++ b/benchmark/results/Llama-4-Scout-17B-16E-Instruct-Q8_0-00001-of-00003__rocm6_4_4.log
@@ -0,0 +1,15 @@
+ggml_cuda_init: GGML_CUDA_FORCE_MMQ:    no
+ggml_cuda_init: GGML_CUDA_FORCE_CUBLAS: no
+ggml_cuda_init: found 1 ROCm devices:
+  Device 0: Radeon 8060S Graphics, gfx1151 (0x1151), VMM: no, Wave Size: 32
+rocBLAS error: No hipBLASLt solution found
+This message will be only be displayed once, unless the ROCBLAS_VERBOSE_HIPBLASLT_ERROR environment variable is set.
+
+rocBLAS warning: hipBlasLT failed, falling back to tensile. 
+This message will be only be displayed once, unless the ROCBLAS_VERBOSE_TENSILE_ERROR environment variable is set.
+| model                          |       size |     params | backend    | ngl | mmap |            test |                  t/s |
+| ------------------------------ | ---------: | ---------: | ---------- | --: | ---: | --------------: | -------------------: |
+| llama4 17Bx16E (Scout) Q8_0    | 106.65 GiB |   107.77 B | ROCm       |  99 |    0 |           pp512 |        279.89 ± 0.66 |
+| llama4 17Bx16E (Scout) Q8_0    | 106.65 GiB |   107.77 B | ROCm       |  99 |    0 |           tg128 |         11.92 ± 0.00 |
+
+build: 4807e8f9 (6609)
diff --git a/benchmark/results/Llama-4-Scout-17B-16E-Instruct-Q8_0-00001-of-00003__rocm6_4_4__fa1.log b/benchmark/results/Llama-4-Scout-17B-16E-Instruct-Q8_0-00001-of-00003__rocm6_4_4__fa1.log
new file mode 100644
index 0000000..29a7042
--- /dev/null
+++ b/benchmark/results/Llama-4-Scout-17B-16E-Instruct-Q8_0-00001-of-00003__rocm6_4_4__fa1.log
@@ -0,0 +1,10 @@
+ggml_cuda_init: GGML_CUDA_FORCE_MMQ:    no
+ggml_cuda_init: GGML_CUDA_FORCE_CUBLAS: no
+ggml_cuda_init: found 1 ROCm devices:
+  Device 0: Radeon 8060S Graphics, gfx1151 (0x1151), VMM: no, Wave Size: 32
+| model                          |       size |     params | backend    | ngl | fa | mmap |            test |                  t/s |
+| ------------------------------ | ---------: | ---------: | ---------- | --: | -: | ---: | --------------: | -------------------: |
+| llama4 17Bx16E (Scout) Q8_0    | 106.65 GiB |   107.77 B | ROCm       |  99 |  1 |    0 |           pp512 |        297.68 ± 2.90 |
+| llama4 17Bx16E (Scout) Q8_0    | 106.65 GiB |   107.77 B | ROCm       |  99 |  1 |    0 |           tg128 |         11.97 ± 0.09 |
+
+build: 4807e8f9 (6609)
diff --git a/benchmark/results/Llama-4-Scout-17B-16E-Instruct-Q8_0-00001-of-00003__rocm6_4_4__hblt0.log b/benchmark/results/Llama-4-Scout-17B-16E-Instruct-Q8_0-00001-of-00003__rocm6_4_4__hblt0.log
new file mode 100644
index 0000000..64d07c9
--- /dev/null
+++ b/benchmark/results/Llama-4-Scout-17B-16E-Instruct-Q8_0-00001-of-00003__rocm6_4_4__hblt0.log
@@ -0,0 +1,10 @@
+ggml_cuda_init: GGML_CUDA_FORCE_MMQ:    no
+ggml_cuda_init: GGML_CUDA_FORCE_CUBLAS: no
+ggml_cuda_init: found 1 ROCm devices:
+  Device 0: Radeon 8060S Graphics, gfx1151 (0x1151), VMM: no, Wave Size: 32
+| model                          |       size |     params | backend    | ngl | mmap |            test |                  t/s |
+| ------------------------------ | ---------: | ---------: | ---------- | --: | ---: | --------------: | -------------------: |
+| llama4 17Bx16E (Scout) Q8_0    | 106.65 GiB |   107.77 B | ROCm       |  99 |    0 |           pp512 |        284.44 ± 3.25 |
+| llama4 17Bx16E (Scout) Q8_0    | 106.65 GiB |   107.77 B | ROCm       |  99 |    0 |           tg128 |         11.90 ± 0.04 |
+
+build: 4807e8f9 (6609)
diff --git a/benchmark/results/Llama-4-Scout-17B-16E-Instruct-Q8_0-00001-of-00003__rocm6_4_4__hblt0__fa1.log b/benchmark/results/Llama-4-Scout-17B-16E-Instruct-Q8_0-00001-of-00003__rocm6_4_4__hblt0__fa1.log
new file mode 100644
index 0000000..906e40b
--- /dev/null
+++ b/benchmark/results/Llama-4-Scout-17B-16E-Instruct-Q8_0-00001-of-00003__rocm6_4_4__hblt0__fa1.log
@@ -0,0 +1,10 @@
+ggml_cuda_init: GGML_CUDA_FORCE_MMQ:    no
+ggml_cuda_init: GGML_CUDA_FORCE_CUBLAS: no
+ggml_cuda_init: found 1 ROCm devices:
+  Device 0: Radeon 8060S Graphics, gfx1151 (0x1151), VMM: no, Wave Size: 32
+| model                          |       size |     params | backend    | ngl | fa | mmap |            test |                  t/s |
+| ------------------------------ | ---------: | ---------: | ---------- | --: | -: | ---: | --------------: | -------------------: |
+| llama4 17Bx16E (Scout) Q8_0    | 106.65 GiB |   107.77 B | ROCm       |  99 |  1 |    0 |           pp512 |        300.04 ± 1.45 |
+| llama4 17Bx16E (Scout) Q8_0    | 106.65 GiB |   107.77 B | ROCm       |  99 |  1 |    0 |           tg128 |         12.00 ± 0.00 |
+
+build: 4807e8f9 (6609)
diff --git a/benchmark/results/Llama-4-Scout-17B-16E-Instruct-UD-Q4_K_XL-00001-of-00002__rocm6_4_4-rocwmma.log b/benchmark/results/Llama-4-Scout-17B-16E-Instruct-UD-Q4_K_XL-00001-of-00002__rocm6_4_4-rocwmma.log
new file mode 100644
index 0000000..a4df56c
--- /dev/null
+++ b/benchmark/results/Llama-4-Scout-17B-16E-Instruct-UD-Q4_K_XL-00001-of-00002__rocm6_4_4-rocwmma.log
@@ -0,0 +1,15 @@
+ggml_cuda_init: GGML_CUDA_FORCE_MMQ:    no
+ggml_cuda_init: GGML_CUDA_FORCE_CUBLAS: no
+ggml_cuda_init: found 1 ROCm devices:
+  Device 0: Radeon 8060S Graphics, gfx1151 (0x1151), VMM: no, Wave Size: 32
+hipBLASLt error: Heuristic Fetch Failed!
+This message will be only be displayed once, unless the ROCBLAS_VERBOSE_HIPBLASLT_ERROR environment variable is set.
+
+rocBLAS warning: hipBlasLT failed, falling back to tensile. 
+This message will be only be displayed once, unless the ROCBLAS_VERBOSE_TENSILE_ERROR environment variable is set.
+| model                          |       size |     params | backend    | ngl | mmap |            test |                  t/s |
+| ------------------------------ | ---------: | ---------: | ---------- | --: | ---: | --------------: | -------------------: |
+| llama4 17Bx16E (Scout) Q4_K - Medium |  57.73 GiB |   107.77 B | ROCm       |  99 |    0 |           pp512 |        291.19 ± 2.35 |
+| llama4 17Bx16E (Scout) Q4_K - Medium |  57.73 GiB |   107.77 B | ROCm       |  99 |    0 |           tg128 |         17.82 ± 0.00 |
+
+build: 4807e8f9 (6609)
diff --git a/benchmark/results/Llama-4-Scout-17B-16E-Instruct-UD-Q4_K_XL-00001-of-00002__rocm6_4_4-rocwmma__fa1.log b/benchmark/results/Llama-4-Scout-17B-16E-Instruct-UD-Q4_K_XL-00001-of-00002__rocm6_4_4-rocwmma__fa1.log
new file mode 100644
index 0000000..b692d74
--- /dev/null
+++ b/benchmark/results/Llama-4-Scout-17B-16E-Instruct-UD-Q4_K_XL-00001-of-00002__rocm6_4_4-rocwmma__fa1.log
@@ -0,0 +1,10 @@
+ggml_cuda_init: GGML_CUDA_FORCE_MMQ:    no
+ggml_cuda_init: GGML_CUDA_FORCE_CUBLAS: no
+ggml_cuda_init: found 1 ROCm devices:
+  Device 0: Radeon 8060S Graphics, gfx1151 (0x1151), VMM: no, Wave Size: 32
+| model                          |       size |     params | backend    | ngl | fa | mmap |            test |                  t/s |
+| ------------------------------ | ---------: | ---------: | ---------- | --: | -: | ---: | --------------: | -------------------: |
+| llama4 17Bx16E (Scout) Q4_K - Medium |  57.73 GiB |   107.77 B | ROCm       |  99 |  1 |    0 |           pp512 |        307.71 ± 1.77 |
+| llama4 17Bx16E (Scout) Q4_K - Medium |  57.73 GiB |   107.77 B | ROCm       |  99 |  1 |    0 |           tg128 |         18.00 ± 0.00 |
+
+build: 4807e8f9 (6609)
diff --git a/benchmark/results/Llama-4-Scout-17B-16E-Instruct-UD-Q4_K_XL-00001-of-00002__rocm6_4_4-rocwmma__hblt0.log b/benchmark/results/Llama-4-Scout-17B-16E-Instruct-UD-Q4_K_XL-00001-of-00002__rocm6_4_4-rocwmma__hblt0.log
new file mode 100644
index 0000000..b358a77
--- /dev/null
+++ b/benchmark/results/Llama-4-Scout-17B-16E-Instruct-UD-Q4_K_XL-00001-of-00002__rocm6_4_4-rocwmma__hblt0.log
@@ -0,0 +1,10 @@
+ggml_cuda_init: GGML_CUDA_FORCE_MMQ:    no
+ggml_cuda_init: GGML_CUDA_FORCE_CUBLAS: no
+ggml_cuda_init: found 1 ROCm devices:
+  Device 0: Radeon 8060S Graphics, gfx1151 (0x1151), VMM: no, Wave Size: 32
+| model                          |       size |     params | backend    | ngl | mmap |            test |                  t/s |
+| ------------------------------ | ---------: | ---------: | ---------- | --: | ---: | --------------: | -------------------: |
+| llama4 17Bx16E (Scout) Q4_K - Medium |  57.73 GiB |   107.77 B | ROCm       |  99 |    0 |           pp512 |        291.96 ± 2.18 |
+| llama4 17Bx16E (Scout) Q4_K - Medium |  57.73 GiB |   107.77 B | ROCm       |  99 |    0 |           tg128 |         17.82 ± 0.00 |
+
+build: 4807e8f9 (6609)
diff --git a/benchmark/results/Llama-4-Scout-17B-16E-Instruct-UD-Q4_K_XL-00001-of-00002__rocm6_4_4-rocwmma__hblt0__fa1.log b/benchmark/results/Llama-4-Scout-17B-16E-Instruct-UD-Q4_K_XL-00001-of-00002__rocm6_4_4-rocwmma__hblt0__fa1.log
new file mode 100644
index 0000000..05bade8
--- /dev/null
+++ b/benchmark/results/Llama-4-Scout-17B-16E-Instruct-UD-Q4_K_XL-00001-of-00002__rocm6_4_4-rocwmma__hblt0__fa1.log
@@ -0,0 +1,10 @@
+ggml_cuda_init: GGML_CUDA_FORCE_MMQ:    no
+ggml_cuda_init: GGML_CUDA_FORCE_CUBLAS: no
+ggml_cuda_init: found 1 ROCm devices:
+  Device 0: Radeon 8060S Graphics, gfx1151 (0x1151), VMM: no, Wave Size: 32
+| model                          |       size |     params | backend    | ngl | fa | mmap |            test |                  t/s |
+| ------------------------------ | ---------: | ---------: | ---------- | --: | -: | ---: | --------------: | -------------------: |
+| llama4 17Bx16E (Scout) Q4_K - Medium |  57.73 GiB |   107.77 B | ROCm       |  99 |  1 |    0 |           pp512 |        310.84 ± 1.35 |
+| llama4 17Bx16E (Scout) Q4_K - Medium |  57.73 GiB |   107.77 B | ROCm       |  99 |  1 |    0 |           tg128 |         18.01 ± 0.00 |
+
+build: 4807e8f9 (6609)
diff --git a/benchmark/results/Llama-4-Scout-17B-16E-Instruct-UD-Q4_K_XL-00001-of-00002__rocm6_4_4.log b/benchmark/results/Llama-4-Scout-17B-16E-Instruct-UD-Q4_K_XL-00001-of-00002__rocm6_4_4.log
new file mode 100644
index 0000000..b0ad559
--- /dev/null
+++ b/benchmark/results/Llama-4-Scout-17B-16E-Instruct-UD-Q4_K_XL-00001-of-00002__rocm6_4_4.log
@@ -0,0 +1,15 @@
+ggml_cuda_init: GGML_CUDA_FORCE_MMQ:    no
+ggml_cuda_init: GGML_CUDA_FORCE_CUBLAS: no
+ggml_cuda_init: found 1 ROCm devices:
+  Device 0: Radeon 8060S Graphics, gfx1151 (0x1151), VMM: no, Wave Size: 32
+hipBLASLt error: Heuristic Fetch Failed!
+This message will be only be displayed once, unless the ROCBLAS_VERBOSE_HIPBLASLT_ERROR environment variable is set.
+
+rocBLAS warning: hipBlasLT failed, falling back to tensile. 
+This message will be only be displayed once, unless the ROCBLAS_VERBOSE_TENSILE_ERROR environment variable is set.
+| model                          |       size |     params | backend    | ngl | mmap |            test |                  t/s |
+| ------------------------------ | ---------: | ---------: | ---------- | --: | ---: | --------------: | -------------------: |
+| llama4 17Bx16E (Scout) Q4_K - Medium |  57.73 GiB |   107.77 B | ROCm       |  99 |    0 |           pp512 |        291.26 ± 0.79 |
+| llama4 17Bx16E (Scout) Q4_K - Medium |  57.73 GiB |   107.77 B | ROCm       |  99 |    0 |           tg128 |         17.83 ± 0.00 |
+
+build: 4807e8f9 (6609)
diff --git a/benchmark/results/Llama-4-Scout-17B-16E-Instruct-UD-Q4_K_XL-00001-of-00002__rocm6_4_4__fa1.log b/benchmark/results/Llama-4-Scout-17B-16E-Instruct-UD-Q4_K_XL-00001-of-00002__rocm6_4_4__fa1.log
new file mode 100644
index 0000000..ff01df1
--- /dev/null
+++ b/benchmark/results/Llama-4-Scout-17B-16E-Instruct-UD-Q4_K_XL-00001-of-00002__rocm6_4_4__fa1.log
@@ -0,0 +1,10 @@
+ggml_cuda_init: GGML_CUDA_FORCE_MMQ:    no
+ggml_cuda_init: GGML_CUDA_FORCE_CUBLAS: no
+ggml_cuda_init: found 1 ROCm devices:
+  Device 0: Radeon 8060S Graphics, gfx1151 (0x1151), VMM: no, Wave Size: 32
+| model                          |       size |     params | backend    | ngl | fa | mmap |            test |                  t/s |
+| ------------------------------ | ---------: | ---------: | ---------- | --: | -: | ---: | --------------: | -------------------: |
+| llama4 17Bx16E (Scout) Q4_K - Medium |  57.73 GiB |   107.77 B | ROCm       |  99 |  1 |    0 |           pp512 |        311.26 ± 1.06 |
+| llama4 17Bx16E (Scout) Q4_K - Medium |  57.73 GiB |   107.77 B | ROCm       |  99 |  1 |    0 |           tg128 |         17.97 ± 0.00 |
+
+build: 4807e8f9 (6609)
diff --git a/benchmark/results/Llama-4-Scout-17B-16E-Instruct-UD-Q4_K_XL-00001-of-00002__rocm6_4_4__hblt0.log b/benchmark/results/Llama-4-Scout-17B-16E-Instruct-UD-Q4_K_XL-00001-of-00002__rocm6_4_4__hblt0.log
new file mode 100644
index 0000000..606cad7
--- /dev/null
+++ b/benchmark/results/Llama-4-Scout-17B-16E-Instruct-UD-Q4_K_XL-00001-of-00002__rocm6_4_4__hblt0.log
@@ -0,0 +1,10 @@
+ggml_cuda_init: GGML_CUDA_FORCE_MMQ:    no
+ggml_cuda_init: GGML_CUDA_FORCE_CUBLAS: no
+ggml_cuda_init: found 1 ROCm devices:
+  Device 0: Radeon 8060S Graphics, gfx1151 (0x1151), VMM: no, Wave Size: 32
+| model                          |       size |     params | backend    | ngl | mmap |            test |                  t/s |
+| ------------------------------ | ---------: | ---------: | ---------- | --: | ---: | --------------: | -------------------: |
+| llama4 17Bx16E (Scout) Q4_K - Medium |  57.73 GiB |   107.77 B | ROCm       |  99 |    0 |           pp512 |        290.78 ± 1.38 |
+| llama4 17Bx16E (Scout) Q4_K - Medium |  57.73 GiB |   107.77 B | ROCm       |  99 |    0 |           tg128 |         17.81 ± 0.01 |
+
+build: 4807e8f9 (6609)
diff --git a/benchmark/results/Llama-4-Scout-17B-16E-Instruct-UD-Q4_K_XL-00001-of-00002__rocm6_4_4__hblt0__fa1.log b/benchmark/results/Llama-4-Scout-17B-16E-Instruct-UD-Q4_K_XL-00001-of-00002__rocm6_4_4__hblt0__fa1.log
new file mode 100644
index 0000000..b6018d7
--- /dev/null
+++ b/benchmark/results/Llama-4-Scout-17B-16E-Instruct-UD-Q4_K_XL-00001-of-00002__rocm6_4_4__hblt0__fa1.log
@@ -0,0 +1,10 @@
+ggml_cuda_init: GGML_CUDA_FORCE_MMQ:    no
+ggml_cuda_init: GGML_CUDA_FORCE_CUBLAS: no
+ggml_cuda_init: found 1 ROCm devices:
+  Device 0: Radeon 8060S Graphics, gfx1151 (0x1151), VMM: no, Wave Size: 32
+| model                          |       size |     params | backend    | ngl | fa | mmap |            test |                  t/s |
+| ------------------------------ | ---------: | ---------: | ---------- | --: | -: | ---: | --------------: | -------------------: |
+| llama4 17Bx16E (Scout) Q4_K - Medium |  57.73 GiB |   107.77 B | ROCm       |  99 |  1 |    0 |           pp512 |        310.36 ± 1.62 |
+| llama4 17Bx16E (Scout) Q4_K - Medium |  57.73 GiB |   107.77 B | ROCm       |  99 |  1 |    0 |           tg128 |         18.00 ± 0.00 |
+
+build: 4807e8f9 (6609)
diff --git a/benchmark/results/Qwen3-235B-A22B-Instruct-2507-UD-Q3_K_XL-00001-of-00003__rocm6_4_4-rocwmma.log b/benchmark/results/Qwen3-235B-A22B-Instruct-2507-UD-Q3_K_XL-00001-of-00003__rocm6_4_4-rocwmma.log
new file mode 100644
index 0000000..f7ed4a1
--- /dev/null
+++ b/benchmark/results/Qwen3-235B-A22B-Instruct-2507-UD-Q3_K_XL-00001-of-00003__rocm6_4_4-rocwmma.log
@@ -0,0 +1,15 @@
+ggml_cuda_init: GGML_CUDA_FORCE_MMQ:    no
+ggml_cuda_init: GGML_CUDA_FORCE_CUBLAS: no
+ggml_cuda_init: found 1 ROCm devices:
+  Device 0: Radeon 8060S Graphics, gfx1151 (0x1151), VMM: no, Wave Size: 32
+rocBLAS error: No hipBLASLt solution found
+This message will be only be displayed once, unless the ROCBLAS_VERBOSE_HIPBLASLT_ERROR environment variable is set.
+
+rocBLAS warning: hipBlasLT failed, falling back to tensile. 
+This message will be only be displayed once, unless the ROCBLAS_VERBOSE_TENSILE_ERROR environment variable is set.
+| model                          |       size |     params | backend    | ngl | mmap |            test |                  t/s |
+| ------------------------------ | ---------: | ---------: | ---------- | --: | ---: | --------------: | -------------------: |
+| qwen3moe 235B.A22B Q3_K - Medium |  96.99 GiB |   235.09 B | ROCm       |  99 |    0 |           pp512 |        134.57 ± 0.66 |
+| qwen3moe 235B.A22B Q3_K - Medium |  96.99 GiB |   235.09 B | ROCm       |  99 |    0 |           tg128 |         14.57 ± 0.00 |
+
+build: 4807e8f9 (6609)
diff --git a/benchmark/results/Qwen3-235B-A22B-Instruct-2507-UD-Q3_K_XL-00001-of-00003__rocm6_4_4-rocwmma__fa1.log b/benchmark/results/Qwen3-235B-A22B-Instruct-2507-UD-Q3_K_XL-00001-of-00003__rocm6_4_4-rocwmma__fa1.log
new file mode 100644
index 0000000..b462385
--- /dev/null
+++ b/benchmark/results/Qwen3-235B-A22B-Instruct-2507-UD-Q3_K_XL-00001-of-00003__rocm6_4_4-rocwmma__fa1.log
@@ -0,0 +1,10 @@
+ggml_cuda_init: GGML_CUDA_FORCE_MMQ:    no
+ggml_cuda_init: GGML_CUDA_FORCE_CUBLAS: no
+ggml_cuda_init: found 1 ROCm devices:
+  Device 0: Radeon 8060S Graphics, gfx1151 (0x1151), VMM: no, Wave Size: 32
+| model                          |       size |     params | backend    | ngl | fa | mmap |            test |                  t/s |
+| ------------------------------ | ---------: | ---------: | ---------- | --: | -: | ---: | --------------: | -------------------: |
+| qwen3moe 235B.A22B Q3_K - Medium |  96.99 GiB |   235.09 B | ROCm       |  99 |  1 |    0 |           pp512 |        144.38 ± 0.73 |
+| qwen3moe 235B.A22B Q3_K - Medium |  96.99 GiB |   235.09 B | ROCm       |  99 |  1 |    0 |           tg128 |         14.90 ± 0.00 |
+
+build: 4807e8f9 (6609)
diff --git a/benchmark/results/Qwen3-235B-A22B-Instruct-2507-UD-Q3_K_XL-00001-of-00003__rocm6_4_4-rocwmma__hblt0.log b/benchmark/results/Qwen3-235B-A22B-Instruct-2507-UD-Q3_K_XL-00001-of-00003__rocm6_4_4-rocwmma__hblt0.log
new file mode 100644
index 0000000..f807cdd
--- /dev/null
+++ b/benchmark/results/Qwen3-235B-A22B-Instruct-2507-UD-Q3_K_XL-00001-of-00003__rocm6_4_4-rocwmma__hblt0.log
@@ -0,0 +1,10 @@
+ggml_cuda_init: GGML_CUDA_FORCE_MMQ:    no
+ggml_cuda_init: GGML_CUDA_FORCE_CUBLAS: no
+ggml_cuda_init: found 1 ROCm devices:
+  Device 0: Radeon 8060S Graphics, gfx1151 (0x1151), VMM: no, Wave Size: 32
+| model                          |       size |     params | backend    | ngl | mmap |            test |                  t/s |
+| ------------------------------ | ---------: | ---------: | ---------- | --: | ---: | --------------: | -------------------: |
+| qwen3moe 235B.A22B Q3_K - Medium |  96.99 GiB |   235.09 B | ROCm       |  99 |    0 |           pp512 |        134.69 ± 1.05 |
+| qwen3moe 235B.A22B Q3_K - Medium |  96.99 GiB |   235.09 B | ROCm       |  99 |    0 |           tg128 |         14.58 ± 0.00 |
+
+build: 4807e8f9 (6609)
diff --git a/benchmark/results/Qwen3-235B-A22B-Instruct-2507-UD-Q3_K_XL-00001-of-00003__rocm6_4_4-rocwmma__hblt0__fa1.log b/benchmark/results/Qwen3-235B-A22B-Instruct-2507-UD-Q3_K_XL-00001-of-00003__rocm6_4_4-rocwmma__hblt0__fa1.log
new file mode 100644
index 0000000..ee3cd12
--- /dev/null
+++ b/benchmark/results/Qwen3-235B-A22B-Instruct-2507-UD-Q3_K_XL-00001-of-00003__rocm6_4_4-rocwmma__hblt0__fa1.log
@@ -0,0 +1,10 @@
+ggml_cuda_init: GGML_CUDA_FORCE_MMQ:    no
+ggml_cuda_init: GGML_CUDA_FORCE_CUBLAS: no
+ggml_cuda_init: found 1 ROCm devices:
+  Device 0: Radeon 8060S Graphics, gfx1151 (0x1151), VMM: no, Wave Size: 32
+| model                          |       size |     params | backend    | ngl | fa | mmap |            test |                  t/s |
+| ------------------------------ | ---------: | ---------: | ---------- | --: | -: | ---: | --------------: | -------------------: |
+| qwen3moe 235B.A22B Q3_K - Medium |  96.99 GiB |   235.09 B | ROCm       |  99 |  1 |    0 |           pp512 |        143.45 ± 0.41 |
+| qwen3moe 235B.A22B Q3_K - Medium |  96.99 GiB |   235.09 B | ROCm       |  99 |  1 |    0 |           tg128 |         14.97 ± 0.00 |
+
+build: 4807e8f9 (6609)
diff --git a/benchmark/results/Qwen3-235B-A22B-Instruct-2507-UD-Q3_K_XL-00001-of-00003__rocm6_4_4.log b/benchmark/results/Qwen3-235B-A22B-Instruct-2507-UD-Q3_K_XL-00001-of-00003__rocm6_4_4.log
new file mode 100644
index 0000000..5491561
--- /dev/null
+++ b/benchmark/results/Qwen3-235B-A22B-Instruct-2507-UD-Q3_K_XL-00001-of-00003__rocm6_4_4.log
@@ -0,0 +1,15 @@
+ggml_cuda_init: GGML_CUDA_FORCE_MMQ:    no
+ggml_cuda_init: GGML_CUDA_FORCE_CUBLAS: no
+ggml_cuda_init: found 1 ROCm devices:
+  Device 0: Radeon 8060S Graphics, gfx1151 (0x1151), VMM: no, Wave Size: 32
+rocBLAS error: No hipBLASLt solution found
+This message will be only be displayed once, unless the ROCBLAS_VERBOSE_HIPBLASLT_ERROR environment variable is set.
+
+rocBLAS warning: hipBlasLT failed, falling back to tensile. 
+This message will be only be displayed once, unless the ROCBLAS_VERBOSE_TENSILE_ERROR environment variable is set.
+| model                          |       size |     params | backend    | ngl | mmap |            test |                  t/s |
+| ------------------------------ | ---------: | ---------: | ---------- | --: | ---: | --------------: | -------------------: |
+| qwen3moe 235B.A22B Q3_K - Medium |  96.99 GiB |   235.09 B | ROCm       |  99 |    0 |           pp512 |        133.50 ± 0.67 |
+| qwen3moe 235B.A22B Q3_K - Medium |  96.99 GiB |   235.09 B | ROCm       |  99 |    0 |           tg128 |         14.55 ± 0.00 |
+
+build: 4807e8f9 (6609)
diff --git a/benchmark/results/Qwen3-235B-A22B-Instruct-2507-UD-Q3_K_XL-00001-of-00003__rocm6_4_4__fa1.log b/benchmark/results/Qwen3-235B-A22B-Instruct-2507-UD-Q3_K_XL-00001-of-00003__rocm6_4_4__fa1.log
new file mode 100644
index 0000000..2cc99e2
--- /dev/null
+++ b/benchmark/results/Qwen3-235B-A22B-Instruct-2507-UD-Q3_K_XL-00001-of-00003__rocm6_4_4__fa1.log
@@ -0,0 +1,10 @@
+ggml_cuda_init: GGML_CUDA_FORCE_MMQ:    no
+ggml_cuda_init: GGML_CUDA_FORCE_CUBLAS: no
+ggml_cuda_init: found 1 ROCm devices:
+  Device 0: Radeon 8060S Graphics, gfx1151 (0x1151), VMM: no, Wave Size: 32
+| model                          |       size |     params | backend    | ngl | fa | mmap |            test |                  t/s |
+| ------------------------------ | ---------: | ---------: | ---------- | --: | -: | ---: | --------------: | -------------------: |
+| qwen3moe 235B.A22B Q3_K - Medium |  96.99 GiB |   235.09 B | ROCm       |  99 |  1 |    0 |           pp512 |        144.31 ± 0.58 |
+| qwen3moe 235B.A22B Q3_K - Medium |  96.99 GiB |   235.09 B | ROCm       |  99 |  1 |    0 |           tg128 |         14.93 ± 0.00 |
+
+build: 4807e8f9 (6609)
diff --git a/benchmark/results/Qwen3-235B-A22B-Instruct-2507-UD-Q3_K_XL-00001-of-00003__rocm6_4_4__hblt0.log b/benchmark/results/Qwen3-235B-A22B-Instruct-2507-UD-Q3_K_XL-00001-of-00003__rocm6_4_4__hblt0.log
new file mode 100644
index 0000000..06bee5e
--- /dev/null
+++ b/benchmark/results/Qwen3-235B-A22B-Instruct-2507-UD-Q3_K_XL-00001-of-00003__rocm6_4_4__hblt0.log
@@ -0,0 +1,10 @@
+ggml_cuda_init: GGML_CUDA_FORCE_MMQ:    no
+ggml_cuda_init: GGML_CUDA_FORCE_CUBLAS: no
+ggml_cuda_init: found 1 ROCm devices:
+  Device 0: Radeon 8060S Graphics, gfx1151 (0x1151), VMM: no, Wave Size: 32
+| model                          |       size |     params | backend    | ngl | mmap |            test |                  t/s |
+| ------------------------------ | ---------: | ---------: | ---------- | --: | ---: | --------------: | -------------------: |
+| qwen3moe 235B.A22B Q3_K - Medium |  96.99 GiB |   235.09 B | ROCm       |  99 |    0 |           pp512 |        133.54 ± 0.74 |
+| qwen3moe 235B.A22B Q3_K - Medium |  96.99 GiB |   235.09 B | ROCm       |  99 |    0 |           tg128 |         14.54 ± 0.00 |
+
+build: 4807e8f9 (6609)
diff --git a/benchmark/results/Qwen3-235B-A22B-Instruct-2507-UD-Q3_K_XL-00001-of-00003__rocm6_4_4__hblt0__fa1.log b/benchmark/results/Qwen3-235B-A22B-Instruct-2507-UD-Q3_K_XL-00001-of-00003__rocm6_4_4__hblt0__fa1.log
new file mode 100644
index 0000000..c8241e6
--- /dev/null
+++ b/benchmark/results/Qwen3-235B-A22B-Instruct-2507-UD-Q3_K_XL-00001-of-00003__rocm6_4_4__hblt0__fa1.log
@@ -0,0 +1,10 @@
+ggml_cuda_init: GGML_CUDA_FORCE_MMQ:    no
+ggml_cuda_init: GGML_CUDA_FORCE_CUBLAS: no
+ggml_cuda_init: found 1 ROCm devices:
+  Device 0: Radeon 8060S Graphics, gfx1151 (0x1151), VMM: no, Wave Size: 32
+| model                          |       size |     params | backend    | ngl | fa | mmap |            test |                  t/s |
+| ------------------------------ | ---------: | ---------: | ---------- | --: | -: | ---: | --------------: | -------------------: |
+| qwen3moe 235B.A22B Q3_K - Medium |  96.99 GiB |   235.09 B | ROCm       |  99 |  1 |    0 |           pp512 |        144.26 ± 0.29 |
+| qwen3moe 235B.A22B Q3_K - Medium |  96.99 GiB |   235.09 B | ROCm       |  99 |  1 |    0 |           tg128 |         14.92 ± 0.00 |
+
+build: 4807e8f9 (6609)
diff --git a/benchmark/results/Qwen3-30B-A3B-BF16-00001-of-00002__rocm6_4_4-rocwmma.log b/benchmark/results/Qwen3-30B-A3B-BF16-00001-of-00002__rocm6_4_4-rocwmma.log
new file mode 100644
index 0000000..0769aaf
--- /dev/null
+++ b/benchmark/results/Qwen3-30B-A3B-BF16-00001-of-00002__rocm6_4_4-rocwmma.log
@@ -0,0 +1,15 @@
+ggml_cuda_init: GGML_CUDA_FORCE_MMQ:    no
+ggml_cuda_init: GGML_CUDA_FORCE_CUBLAS: no
+ggml_cuda_init: found 1 ROCm devices:
+  Device 0: Radeon 8060S Graphics, gfx1151 (0x1151), VMM: no, Wave Size: 32
+hipBLASLt error: Heuristic Fetch Failed!
+This message will be only be displayed once, unless the ROCBLAS_VERBOSE_HIPBLASLT_ERROR environment variable is set.
+
+rocBLAS warning: hipBlasLT failed, falling back to tensile. 
+This message will be only be displayed once, unless the ROCBLAS_VERBOSE_TENSILE_ERROR environment variable is set.
+| model                          |       size |     params | backend    | ngl | mmap |            test |                  t/s |
+| ------------------------------ | ---------: | ---------: | ---------- | --: | ---: | --------------: | -------------------: |
+| qwen3moe 30B.A3B BF16          |  56.89 GiB |    30.53 B | ROCm       |  99 |    0 |           pp512 |        451.60 ± 1.80 |
+| qwen3moe 30B.A3B BF16          |  56.89 GiB |    30.53 B | ROCm       |  99 |    0 |           tg128 |         25.54 ± 0.00 |
+
+build: 4807e8f9 (6609)
diff --git a/benchmark/results/Qwen3-30B-A3B-BF16-00001-of-00002__rocm6_4_4-rocwmma__fa1.log b/benchmark/results/Qwen3-30B-A3B-BF16-00001-of-00002__rocm6_4_4-rocwmma__fa1.log
new file mode 100644
index 0000000..0258221
--- /dev/null
+++ b/benchmark/results/Qwen3-30B-A3B-BF16-00001-of-00002__rocm6_4_4-rocwmma__fa1.log
@@ -0,0 +1,10 @@
+ggml_cuda_init: GGML_CUDA_FORCE_MMQ:    no
+ggml_cuda_init: GGML_CUDA_FORCE_CUBLAS: no
+ggml_cuda_init: found 1 ROCm devices:
+  Device 0: Radeon 8060S Graphics, gfx1151 (0x1151), VMM: no, Wave Size: 32
+| model                          |       size |     params | backend    | ngl | fa | mmap |            test |                  t/s |
+| ------------------------------ | ---------: | ---------: | ---------- | --: | -: | ---: | --------------: | -------------------: |
+| qwen3moe 30B.A3B BF16          |  56.89 GiB |    30.53 B | ROCm       |  99 |  1 |    0 |           pp512 |        482.09 ± 5.55 |
+| qwen3moe 30B.A3B BF16          |  56.89 GiB |    30.53 B | ROCm       |  99 |  1 |    0 |           tg128 |         25.77 ± 0.00 |
+
+build: 4807e8f9 (6609)
diff --git a/benchmark/results/Qwen3-30B-A3B-BF16-00001-of-00002__rocm6_4_4-rocwmma__hblt0.log b/benchmark/results/Qwen3-30B-A3B-BF16-00001-of-00002__rocm6_4_4-rocwmma__hblt0.log
new file mode 100644
index 0000000..80c30d1
--- /dev/null
+++ b/benchmark/results/Qwen3-30B-A3B-BF16-00001-of-00002__rocm6_4_4-rocwmma__hblt0.log
@@ -0,0 +1,10 @@
+ggml_cuda_init: GGML_CUDA_FORCE_MMQ:    no
+ggml_cuda_init: GGML_CUDA_FORCE_CUBLAS: no
+ggml_cuda_init: found 1 ROCm devices:
+  Device 0: Radeon 8060S Graphics, gfx1151 (0x1151), VMM: no, Wave Size: 32
+| model                          |       size |     params | backend    | ngl | mmap |            test |                  t/s |
+| ------------------------------ | ---------: | ---------: | ---------- | --: | ---: | --------------: | -------------------: |
+| qwen3moe 30B.A3B BF16          |  56.89 GiB |    30.53 B | ROCm       |  99 |    0 |           pp512 |        345.46 ± 3.07 |
+| qwen3moe 30B.A3B BF16          |  56.89 GiB |    30.53 B | ROCm       |  99 |    0 |           tg128 |         25.49 ± 0.00 |
+
+build: 4807e8f9 (6609)
diff --git a/benchmark/results/Qwen3-30B-A3B-BF16-00001-of-00002__rocm6_4_4-rocwmma__hblt0__fa1.log b/benchmark/results/Qwen3-30B-A3B-BF16-00001-of-00002__rocm6_4_4-rocwmma__hblt0__fa1.log
new file mode 100644
index 0000000..0825c92
--- /dev/null
+++ b/benchmark/results/Qwen3-30B-A3B-BF16-00001-of-00002__rocm6_4_4-rocwmma__hblt0__fa1.log
@@ -0,0 +1,10 @@
+ggml_cuda_init: GGML_CUDA_FORCE_MMQ:    no
+ggml_cuda_init: GGML_CUDA_FORCE_CUBLAS: no
+ggml_cuda_init: found 1 ROCm devices:
+  Device 0: Radeon 8060S Graphics, gfx1151 (0x1151), VMM: no, Wave Size: 32
+| model                          |       size |     params | backend    | ngl | fa | mmap |            test |                  t/s |
+| ------------------------------ | ---------: | ---------: | ---------- | --: | -: | ---: | --------------: | -------------------: |
+| qwen3moe 30B.A3B BF16          |  56.89 GiB |    30.53 B | ROCm       |  99 |  1 |    0 |           pp512 |        354.93 ± 5.65 |
+| qwen3moe 30B.A3B BF16          |  56.89 GiB |    30.53 B | ROCm       |  99 |  1 |    0 |           tg128 |         25.80 ± 0.00 |
+
+build: 4807e8f9 (6609)
diff --git a/benchmark/results/Qwen3-30B-A3B-BF16-00001-of-00002__rocm6_4_4.log b/benchmark/results/Qwen3-30B-A3B-BF16-00001-of-00002__rocm6_4_4.log
new file mode 100644
index 0000000..b6bb252
--- /dev/null
+++ b/benchmark/results/Qwen3-30B-A3B-BF16-00001-of-00002__rocm6_4_4.log
@@ -0,0 +1,15 @@
+ggml_cuda_init: GGML_CUDA_FORCE_MMQ:    no
+ggml_cuda_init: GGML_CUDA_FORCE_CUBLAS: no
+ggml_cuda_init: found 1 ROCm devices:
+  Device 0: Radeon 8060S Graphics, gfx1151 (0x1151), VMM: no, Wave Size: 32
+hipBLASLt error: Heuristic Fetch Failed!
+This message will be only be displayed once, unless the ROCBLAS_VERBOSE_HIPBLASLT_ERROR environment variable is set.
+
+rocBLAS warning: hipBlasLT failed, falling back to tensile. 
+This message will be only be displayed once, unless the ROCBLAS_VERBOSE_TENSILE_ERROR environment variable is set.
+| model                          |       size |     params | backend    | ngl | mmap |            test |                  t/s |
+| ------------------------------ | ---------: | ---------: | ---------- | --: | ---: | --------------: | -------------------: |
+| qwen3moe 30B.A3B BF16          |  56.89 GiB |    30.53 B | ROCm       |  99 |    0 |           pp512 |        448.97 ± 7.97 |
+| qwen3moe 30B.A3B BF16          |  56.89 GiB |    30.53 B | ROCm       |  99 |    0 |           tg128 |         25.57 ± 0.01 |
+
+build: 4807e8f9 (6609)
diff --git a/benchmark/results/Qwen3-30B-A3B-BF16-00001-of-00002__rocm6_4_4__fa1.log b/benchmark/results/Qwen3-30B-A3B-BF16-00001-of-00002__rocm6_4_4__fa1.log
new file mode 100644
index 0000000..c4e8076
--- /dev/null
+++ b/benchmark/results/Qwen3-30B-A3B-BF16-00001-of-00002__rocm6_4_4__fa1.log
@@ -0,0 +1,10 @@
+ggml_cuda_init: GGML_CUDA_FORCE_MMQ:    no
+ggml_cuda_init: GGML_CUDA_FORCE_CUBLAS: no
+ggml_cuda_init: found 1 ROCm devices:
+  Device 0: Radeon 8060S Graphics, gfx1151 (0x1151), VMM: no, Wave Size: 32
+| model                          |       size |     params | backend    | ngl | fa | mmap |            test |                  t/s |
+| ------------------------------ | ---------: | ---------: | ---------- | --: | -: | ---: | --------------: | -------------------: |
+| qwen3moe 30B.A3B BF16          |  56.89 GiB |    30.53 B | ROCm       |  99 |  1 |    0 |           pp512 |        489.49 ± 3.92 |
+| qwen3moe 30B.A3B BF16          |  56.89 GiB |    30.53 B | ROCm       |  99 |  1 |    0 |           tg128 |         25.78 ± 0.00 |
+
+build: 4807e8f9 (6609)
diff --git a/benchmark/results/Qwen3-30B-A3B-BF16-00001-of-00002__rocm6_4_4__hblt0.log b/benchmark/results/Qwen3-30B-A3B-BF16-00001-of-00002__rocm6_4_4__hblt0.log
new file mode 100644
index 0000000..993b363
--- /dev/null
+++ b/benchmark/results/Qwen3-30B-A3B-BF16-00001-of-00002__rocm6_4_4__hblt0.log
@@ -0,0 +1,10 @@
+ggml_cuda_init: GGML_CUDA_FORCE_MMQ:    no
+ggml_cuda_init: GGML_CUDA_FORCE_CUBLAS: no
+ggml_cuda_init: found 1 ROCm devices:
+  Device 0: Radeon 8060S Graphics, gfx1151 (0x1151), VMM: no, Wave Size: 32
+| model                          |       size |     params | backend    | ngl | mmap |            test |                  t/s |
+| ------------------------------ | ---------: | ---------: | ---------- | --: | ---: | --------------: | -------------------: |
+| qwen3moe 30B.A3B BF16          |  56.89 GiB |    30.53 B | ROCm       |  99 |    0 |           pp512 |        343.78 ± 1.91 |
+| qwen3moe 30B.A3B BF16          |  56.89 GiB |    30.53 B | ROCm       |  99 |    0 |           tg128 |         25.48 ± 0.01 |
+
+build: 4807e8f9 (6609)
diff --git a/benchmark/results/Qwen3-30B-A3B-BF16-00001-of-00002__rocm6_4_4__hblt0__fa1.log b/benchmark/results/Qwen3-30B-A3B-BF16-00001-of-00002__rocm6_4_4__hblt0__fa1.log
new file mode 100644
index 0000000..26e3a84
--- /dev/null
+++ b/benchmark/results/Qwen3-30B-A3B-BF16-00001-of-00002__rocm6_4_4__hblt0__fa1.log
@@ -0,0 +1,10 @@
+ggml_cuda_init: GGML_CUDA_FORCE_MMQ:    no
+ggml_cuda_init: GGML_CUDA_FORCE_CUBLAS: no
+ggml_cuda_init: found 1 ROCm devices:
+  Device 0: Radeon 8060S Graphics, gfx1151 (0x1151), VMM: no, Wave Size: 32
+| model                          |       size |     params | backend    | ngl | fa | mmap |            test |                  t/s |
+| ------------------------------ | ---------: | ---------: | ---------- | --: | -: | ---: | --------------: | -------------------: |
+| qwen3moe 30B.A3B BF16          |  56.89 GiB |    30.53 B | ROCm       |  99 |  1 |    0 |           pp512 |        363.09 ± 8.05 |
+| qwen3moe 30B.A3B BF16          |  56.89 GiB |    30.53 B | ROCm       |  99 |  1 |    0 |           tg128 |         25.75 ± 0.01 |
+
+build: 4807e8f9 (6609)
diff --git a/benchmark/results/Qwen3-30B-A3B-Instruct-2507-UD-Q6_K_XL__rocm6_4_4-rocwmma.log b/benchmark/results/Qwen3-30B-A3B-Instruct-2507-UD-Q6_K_XL__rocm6_4_4-rocwmma.log
new file mode 100644
index 0000000..185f173
--- /dev/null
+++ b/benchmark/results/Qwen3-30B-A3B-Instruct-2507-UD-Q6_K_XL__rocm6_4_4-rocwmma.log
@@ -0,0 +1,15 @@
+ggml_cuda_init: GGML_CUDA_FORCE_MMQ:    no
+ggml_cuda_init: GGML_CUDA_FORCE_CUBLAS: no
+ggml_cuda_init: found 1 ROCm devices:
+  Device 0: Radeon 8060S Graphics, gfx1151 (0x1151), VMM: no, Wave Size: 32
+rocBLAS error: No hipBLASLt solution found
+This message will be only be displayed once, unless the ROCBLAS_VERBOSE_HIPBLASLT_ERROR environment variable is set.
+
+rocBLAS warning: hipBlasLT failed, falling back to tensile. 
+This message will be only be displayed once, unless the ROCBLAS_VERBOSE_TENSILE_ERROR environment variable is set.
+| model                          |       size |     params | backend    | ngl | mmap |            test |                  t/s |
+| ------------------------------ | ---------: | ---------: | ---------- | --: | ---: | --------------: | -------------------: |
+| qwen3moe 30B.A3B Q6_K          |  24.53 GiB |    30.53 B | ROCm       |  99 |    0 |           pp512 |        577.98 ± 6.34 |
+| qwen3moe 30B.A3B Q6_K          |  24.53 GiB |    30.53 B | ROCm       |  99 |    0 |           tg128 |         55.37 ± 0.01 |
+
+build: 4807e8f9 (6609)
diff --git a/benchmark/results/Qwen3-30B-A3B-Instruct-2507-UD-Q6_K_XL__rocm6_4_4-rocwmma__fa1.log b/benchmark/results/Qwen3-30B-A3B-Instruct-2507-UD-Q6_K_XL__rocm6_4_4-rocwmma__fa1.log
new file mode 100644
index 0000000..fcaa877
--- /dev/null
+++ b/benchmark/results/Qwen3-30B-A3B-Instruct-2507-UD-Q6_K_XL__rocm6_4_4-rocwmma__fa1.log
@@ -0,0 +1,10 @@
+ggml_cuda_init: GGML_CUDA_FORCE_MMQ:    no
+ggml_cuda_init: GGML_CUDA_FORCE_CUBLAS: no
+ggml_cuda_init: found 1 ROCm devices:
+  Device 0: Radeon 8060S Graphics, gfx1151 (0x1151), VMM: no, Wave Size: 32
+| model                          |       size |     params | backend    | ngl | fa | mmap |            test |                  t/s |
+| ------------------------------ | ---------: | ---------: | ---------- | --: | -: | ---: | --------------: | -------------------: |
+| qwen3moe 30B.A3B Q6_K          |  24.53 GiB |    30.53 B | ROCm       |  99 |  1 |    0 |           pp512 |        623.53 ± 3.70 |
+| qwen3moe 30B.A3B Q6_K          |  24.53 GiB |    30.53 B | ROCm       |  99 |  1 |    0 |           tg128 |         56.76 ± 0.01 |
+
+build: 4807e8f9 (6609)
diff --git a/benchmark/results/Qwen3-30B-A3B-Instruct-2507-UD-Q6_K_XL__rocm6_4_4-rocwmma__hblt0.log b/benchmark/results/Qwen3-30B-A3B-Instruct-2507-UD-Q6_K_XL__rocm6_4_4-rocwmma__hblt0.log
new file mode 100644
index 0000000..fe00584
--- /dev/null
+++ b/benchmark/results/Qwen3-30B-A3B-Instruct-2507-UD-Q6_K_XL__rocm6_4_4-rocwmma__hblt0.log
@@ -0,0 +1,10 @@
+ggml_cuda_init: GGML_CUDA_FORCE_MMQ:    no
+ggml_cuda_init: GGML_CUDA_FORCE_CUBLAS: no
+ggml_cuda_init: found 1 ROCm devices:
+  Device 0: Radeon 8060S Graphics, gfx1151 (0x1151), VMM: no, Wave Size: 32
+| model                          |       size |     params | backend    | ngl | mmap |            test |                  t/s |
+| ------------------------------ | ---------: | ---------: | ---------- | --: | ---: | --------------: | -------------------: |
+| qwen3moe 30B.A3B Q6_K          |  24.53 GiB |    30.53 B | ROCm       |  99 |    0 |           pp512 |        582.34 ± 4.27 |
+| qwen3moe 30B.A3B Q6_K          |  24.53 GiB |    30.53 B | ROCm       |  99 |    0 |           tg128 |         55.34 ± 0.02 |
+
+build: 4807e8f9 (6609)
diff --git a/benchmark/results/Qwen3-30B-A3B-Instruct-2507-UD-Q6_K_XL__rocm6_4_4-rocwmma__hblt0__fa1.log b/benchmark/results/Qwen3-30B-A3B-Instruct-2507-UD-Q6_K_XL__rocm6_4_4-rocwmma__hblt0__fa1.log
new file mode 100644
index 0000000..14fa7cd
--- /dev/null
+++ b/benchmark/results/Qwen3-30B-A3B-Instruct-2507-UD-Q6_K_XL__rocm6_4_4-rocwmma__hblt0__fa1.log
@@ -0,0 +1,10 @@
+ggml_cuda_init: GGML_CUDA_FORCE_MMQ:    no
+ggml_cuda_init: GGML_CUDA_FORCE_CUBLAS: no
+ggml_cuda_init: found 1 ROCm devices:
+  Device 0: Radeon 8060S Graphics, gfx1151 (0x1151), VMM: no, Wave Size: 32
+| model                          |       size |     params | backend    | ngl | fa | mmap |            test |                  t/s |
+| ------------------------------ | ---------: | ---------: | ---------- | --: | -: | ---: | --------------: | -------------------: |
+| qwen3moe 30B.A3B Q6_K          |  24.53 GiB |    30.53 B | ROCm       |  99 |  1 |    0 |           pp512 |        622.32 ± 5.83 |
+| qwen3moe 30B.A3B Q6_K          |  24.53 GiB |    30.53 B | ROCm       |  99 |  1 |    0 |           tg128 |         56.82 ± 0.01 |
+
+build: 4807e8f9 (6609)
diff --git a/benchmark/results/Qwen3-30B-A3B-Instruct-2507-UD-Q6_K_XL__rocm6_4_4.log b/benchmark/results/Qwen3-30B-A3B-Instruct-2507-UD-Q6_K_XL__rocm6_4_4.log
new file mode 100644
index 0000000..feb7eb4
--- /dev/null
+++ b/benchmark/results/Qwen3-30B-A3B-Instruct-2507-UD-Q6_K_XL__rocm6_4_4.log
@@ -0,0 +1,15 @@
+ggml_cuda_init: GGML_CUDA_FORCE_MMQ:    no
+ggml_cuda_init: GGML_CUDA_FORCE_CUBLAS: no
+ggml_cuda_init: found 1 ROCm devices:
+  Device 0: Radeon 8060S Graphics, gfx1151 (0x1151), VMM: no, Wave Size: 32
+rocBLAS error: No hipBLASLt solution found
+This message will be only be displayed once, unless the ROCBLAS_VERBOSE_HIPBLASLT_ERROR environment variable is set.
+
+rocBLAS warning: hipBlasLT failed, falling back to tensile. 
+This message will be only be displayed once, unless the ROCBLAS_VERBOSE_TENSILE_ERROR environment variable is set.
+| model                          |       size |     params | backend    | ngl | mmap |            test |                  t/s |
+| ------------------------------ | ---------: | ---------: | ---------- | --: | ---: | --------------: | -------------------: |
+| qwen3moe 30B.A3B Q6_K          |  24.53 GiB |    30.53 B | ROCm       |  99 |    0 |           pp512 |        582.99 ± 4.97 |
+| qwen3moe 30B.A3B Q6_K          |  24.53 GiB |    30.53 B | ROCm       |  99 |    0 |           tg128 |         55.33 ± 0.02 |
+
+build: 4807e8f9 (6609)
diff --git a/benchmark/results/Qwen3-30B-A3B-Instruct-2507-UD-Q6_K_XL__rocm6_4_4__fa1.log b/benchmark/results/Qwen3-30B-A3B-Instruct-2507-UD-Q6_K_XL__rocm6_4_4__fa1.log
new file mode 100644
index 0000000..f79c6d2
--- /dev/null
+++ b/benchmark/results/Qwen3-30B-A3B-Instruct-2507-UD-Q6_K_XL__rocm6_4_4__fa1.log
@@ -0,0 +1,10 @@
+ggml_cuda_init: GGML_CUDA_FORCE_MMQ:    no
+ggml_cuda_init: GGML_CUDA_FORCE_CUBLAS: no
+ggml_cuda_init: found 1 ROCm devices:
+  Device 0: Radeon 8060S Graphics, gfx1151 (0x1151), VMM: no, Wave Size: 32
+| model                          |       size |     params | backend    | ngl | fa | mmap |            test |                  t/s |
+| ------------------------------ | ---------: | ---------: | ---------- | --: | -: | ---: | --------------: | -------------------: |
+| qwen3moe 30B.A3B Q6_K          |  24.53 GiB |    30.53 B | ROCm       |  99 |  1 |    0 |           pp512 |        632.12 ± 3.63 |
+| qwen3moe 30B.A3B Q6_K          |  24.53 GiB |    30.53 B | ROCm       |  99 |  1 |    0 |           tg128 |         56.73 ± 0.00 |
+
+build: 4807e8f9 (6609)
diff --git a/benchmark/results/Qwen3-30B-A3B-Instruct-2507-UD-Q6_K_XL__rocm6_4_4__hblt0.log b/benchmark/results/Qwen3-30B-A3B-Instruct-2507-UD-Q6_K_XL__rocm6_4_4__hblt0.log
new file mode 100644
index 0000000..4fb2d36
--- /dev/null
+++ b/benchmark/results/Qwen3-30B-A3B-Instruct-2507-UD-Q6_K_XL__rocm6_4_4__hblt0.log
@@ -0,0 +1,10 @@
+ggml_cuda_init: GGML_CUDA_FORCE_MMQ:    no
+ggml_cuda_init: GGML_CUDA_FORCE_CUBLAS: no
+ggml_cuda_init: found 1 ROCm devices:
+  Device 0: Radeon 8060S Graphics, gfx1151 (0x1151), VMM: no, Wave Size: 32
+| model                          |       size |     params | backend    | ngl | mmap |            test |                  t/s |
+| ------------------------------ | ---------: | ---------: | ---------- | --: | ---: | --------------: | -------------------: |
+| qwen3moe 30B.A3B Q6_K          |  24.53 GiB |    30.53 B | ROCm       |  99 |    0 |           pp512 |        582.14 ± 4.21 |
+| qwen3moe 30B.A3B Q6_K          |  24.53 GiB |    30.53 B | ROCm       |  99 |    0 |           tg128 |         55.39 ± 0.01 |
+
+build: 4807e8f9 (6609)
diff --git a/benchmark/results/Qwen3-30B-A3B-Instruct-2507-UD-Q6_K_XL__rocm6_4_4__hblt0__fa1.log b/benchmark/results/Qwen3-30B-A3B-Instruct-2507-UD-Q6_K_XL__rocm6_4_4__hblt0__fa1.log
new file mode 100644
index 0000000..350eefc
--- /dev/null
+++ b/benchmark/results/Qwen3-30B-A3B-Instruct-2507-UD-Q6_K_XL__rocm6_4_4__hblt0__fa1.log
@@ -0,0 +1,10 @@
+ggml_cuda_init: GGML_CUDA_FORCE_MMQ:    no
+ggml_cuda_init: GGML_CUDA_FORCE_CUBLAS: no
+ggml_cuda_init: found 1 ROCm devices:
+  Device 0: Radeon 8060S Graphics, gfx1151 (0x1151), VMM: no, Wave Size: 32
+| model                          |       size |     params | backend    | ngl | fa | mmap |            test |                  t/s |
+| ------------------------------ | ---------: | ---------: | ---------- | --: | -: | ---: | --------------: | -------------------: |
+| qwen3moe 30B.A3B Q6_K          |  24.53 GiB |    30.53 B | ROCm       |  99 |  1 |    0 |           pp512 |        632.63 ± 4.35 |
+| qwen3moe 30B.A3B Q6_K          |  24.53 GiB |    30.53 B | ROCm       |  99 |  1 |    0 |           tg128 |         56.77 ± 0.01 |
+
+build: 4807e8f9 (6609)
diff --git a/benchmark/results/gemma-3-12b-it-UD-Q8_K_XL__rocm6_4_4-rocwmma.log b/benchmark/results/gemma-3-12b-it-UD-Q8_K_XL__rocm6_4_4-rocwmma.log
new file mode 100644
index 0000000..17398bd
--- /dev/null
+++ b/benchmark/results/gemma-3-12b-it-UD-Q8_K_XL__rocm6_4_4-rocwmma.log
@@ -0,0 +1,15 @@
+ggml_cuda_init: GGML_CUDA_FORCE_MMQ:    no
+ggml_cuda_init: GGML_CUDA_FORCE_CUBLAS: no
+ggml_cuda_init: found 1 ROCm devices:
+  Device 0: Radeon 8060S Graphics, gfx1151 (0x1151), VMM: no, Wave Size: 32
+rocBLAS error: No hipBLASLt solution found
+This message will be only be displayed once, unless the ROCBLAS_VERBOSE_HIPBLASLT_ERROR environment variable is set.
+
+rocBLAS warning: hipBlasLT failed, falling back to tensile. 
+This message will be only be displayed once, unless the ROCBLAS_VERBOSE_TENSILE_ERROR environment variable is set.
+| model                          |       size |     params | backend    | ngl | mmap |            test |                  t/s |
+| ------------------------------ | ---------: | ---------: | ---------- | --: | ---: | --------------: | -------------------: |
+| gemma3 12B Q8_0                |  13.40 GiB |    11.77 B | ROCm       |  99 |    0 |           pp512 |        754.71 ± 0.79 |
+| gemma3 12B Q8_0                |  13.40 GiB |    11.77 B | ROCm       |  99 |    0 |           tg128 |         14.16 ± 0.00 |
+
+build: 4807e8f9 (6609)
diff --git a/benchmark/results/gemma-3-12b-it-UD-Q8_K_XL__rocm6_4_4-rocwmma__fa1.log b/benchmark/results/gemma-3-12b-it-UD-Q8_K_XL__rocm6_4_4-rocwmma__fa1.log
new file mode 100644
index 0000000..1ce720a
--- /dev/null
+++ b/benchmark/results/gemma-3-12b-it-UD-Q8_K_XL__rocm6_4_4-rocwmma__fa1.log
@@ -0,0 +1,10 @@
+ggml_cuda_init: GGML_CUDA_FORCE_MMQ:    no
+ggml_cuda_init: GGML_CUDA_FORCE_CUBLAS: no
+ggml_cuda_init: found 1 ROCm devices:
+  Device 0: Radeon 8060S Graphics, gfx1151 (0x1151), VMM: no, Wave Size: 32
+| model                          |       size |     params | backend    | ngl | fa | mmap |            test |                  t/s |
+| ------------------------------ | ---------: | ---------: | ---------- | --: | -: | ---: | --------------: | -------------------: |
+| gemma3 12B Q8_0                |  13.40 GiB |    11.77 B | ROCm       |  99 |  1 |    0 |           pp512 |        803.95 ± 0.73 |
+| gemma3 12B Q8_0                |  13.40 GiB |    11.77 B | ROCm       |  99 |  1 |    0 |           tg128 |         14.07 ± 0.00 |
+
+build: 4807e8f9 (6609)
diff --git a/benchmark/results/gemma-3-12b-it-UD-Q8_K_XL__rocm6_4_4-rocwmma__hblt0.log b/benchmark/results/gemma-3-12b-it-UD-Q8_K_XL__rocm6_4_4-rocwmma__hblt0.log
new file mode 100644
index 0000000..2b06f55
--- /dev/null
+++ b/benchmark/results/gemma-3-12b-it-UD-Q8_K_XL__rocm6_4_4-rocwmma__hblt0.log
@@ -0,0 +1,10 @@
+ggml_cuda_init: GGML_CUDA_FORCE_MMQ:    no
+ggml_cuda_init: GGML_CUDA_FORCE_CUBLAS: no
+ggml_cuda_init: found 1 ROCm devices:
+  Device 0: Radeon 8060S Graphics, gfx1151 (0x1151), VMM: no, Wave Size: 32
+| model                          |       size |     params | backend    | ngl | mmap |            test |                  t/s |
+| ------------------------------ | ---------: | ---------: | ---------- | --: | ---: | --------------: | -------------------: |
+| gemma3 12B Q8_0                |  13.40 GiB |    11.77 B | ROCm       |  99 |    0 |           pp512 |        768.26 ± 1.35 |
+| gemma3 12B Q8_0                |  13.40 GiB |    11.77 B | ROCm       |  99 |    0 |           tg128 |         14.15 ± 0.00 |
+
+build: 4807e8f9 (6609)
diff --git a/benchmark/results/gemma-3-12b-it-UD-Q8_K_XL__rocm6_4_4-rocwmma__hblt0__fa1.log b/benchmark/results/gemma-3-12b-it-UD-Q8_K_XL__rocm6_4_4-rocwmma__hblt0__fa1.log
new file mode 100644
index 0000000..df3734a
--- /dev/null
+++ b/benchmark/results/gemma-3-12b-it-UD-Q8_K_XL__rocm6_4_4-rocwmma__hblt0__fa1.log
@@ -0,0 +1,10 @@
+ggml_cuda_init: GGML_CUDA_FORCE_MMQ:    no
+ggml_cuda_init: GGML_CUDA_FORCE_CUBLAS: no
+ggml_cuda_init: found 1 ROCm devices:
+  Device 0: Radeon 8060S Graphics, gfx1151 (0x1151), VMM: no, Wave Size: 32
+| model                          |       size |     params | backend    | ngl | fa | mmap |            test |                  t/s |
+| ------------------------------ | ---------: | ---------: | ---------- | --: | -: | ---: | --------------: | -------------------: |
+| gemma3 12B Q8_0                |  13.40 GiB |    11.77 B | ROCm       |  99 |  1 |    0 |           pp512 |        814.89 ± 0.73 |
+| gemma3 12B Q8_0                |  13.40 GiB |    11.77 B | ROCm       |  99 |  1 |    0 |           tg128 |         14.08 ± 0.00 |
+
+build: 4807e8f9 (6609)
diff --git a/benchmark/results/gemma-3-12b-it-UD-Q8_K_XL__rocm6_4_4.log b/benchmark/results/gemma-3-12b-it-UD-Q8_K_XL__rocm6_4_4.log
new file mode 100644
index 0000000..672a797
--- /dev/null
+++ b/benchmark/results/gemma-3-12b-it-UD-Q8_K_XL__rocm6_4_4.log
@@ -0,0 +1,15 @@
+ggml_cuda_init: GGML_CUDA_FORCE_MMQ:    no
+ggml_cuda_init: GGML_CUDA_FORCE_CUBLAS: no
+ggml_cuda_init: found 1 ROCm devices:
+  Device 0: Radeon 8060S Graphics, gfx1151 (0x1151), VMM: no, Wave Size: 32
+rocBLAS error: No hipBLASLt solution found
+This message will be only be displayed once, unless the ROCBLAS_VERBOSE_HIPBLASLT_ERROR environment variable is set.
+
+rocBLAS warning: hipBlasLT failed, falling back to tensile. 
+This message will be only be displayed once, unless the ROCBLAS_VERBOSE_TENSILE_ERROR environment variable is set.
+| model                          |       size |     params | backend    | ngl | mmap |            test |                  t/s |
+| ------------------------------ | ---------: | ---------: | ---------- | --: | ---: | --------------: | -------------------: |
+| gemma3 12B Q8_0                |  13.40 GiB |    11.77 B | ROCm       |  99 |    0 |           pp512 |        751.85 ± 1.59 |
+| gemma3 12B Q8_0                |  13.40 GiB |    11.77 B | ROCm       |  99 |    0 |           tg128 |         14.16 ± 0.00 |
+
+build: 4807e8f9 (6609)
diff --git a/benchmark/results/gemma-3-12b-it-UD-Q8_K_XL__rocm6_4_4__fa1.log b/benchmark/results/gemma-3-12b-it-UD-Q8_K_XL__rocm6_4_4__fa1.log
new file mode 100644
index 0000000..6023ebe
--- /dev/null
+++ b/benchmark/results/gemma-3-12b-it-UD-Q8_K_XL__rocm6_4_4__fa1.log
@@ -0,0 +1,10 @@
+ggml_cuda_init: GGML_CUDA_FORCE_MMQ:    no
+ggml_cuda_init: GGML_CUDA_FORCE_CUBLAS: no
+ggml_cuda_init: found 1 ROCm devices:
+  Device 0: Radeon 8060S Graphics, gfx1151 (0x1151), VMM: no, Wave Size: 32
+| model                          |       size |     params | backend    | ngl | fa | mmap |            test |                  t/s |
+| ------------------------------ | ---------: | ---------: | ---------- | --: | -: | ---: | --------------: | -------------------: |
+| gemma3 12B Q8_0                |  13.40 GiB |    11.77 B | ROCm       |  99 |  1 |    0 |           pp512 |        814.18 ± 1.01 |
+| gemma3 12B Q8_0                |  13.40 GiB |    11.77 B | ROCm       |  99 |  1 |    0 |           tg128 |         14.08 ± 0.00 |
+
+build: 4807e8f9 (6609)
diff --git a/benchmark/results/gemma-3-12b-it-UD-Q8_K_XL__rocm6_4_4__hblt0.log b/benchmark/results/gemma-3-12b-it-UD-Q8_K_XL__rocm6_4_4__hblt0.log
new file mode 100644
index 0000000..12ac183
--- /dev/null
+++ b/benchmark/results/gemma-3-12b-it-UD-Q8_K_XL__rocm6_4_4__hblt0.log
@@ -0,0 +1,10 @@
+ggml_cuda_init: GGML_CUDA_FORCE_MMQ:    no
+ggml_cuda_init: GGML_CUDA_FORCE_CUBLAS: no
+ggml_cuda_init: found 1 ROCm devices:
+  Device 0: Radeon 8060S Graphics, gfx1151 (0x1151), VMM: no, Wave Size: 32
+| model                          |       size |     params | backend    | ngl | mmap |            test |                  t/s |
+| ------------------------------ | ---------: | ---------: | ---------- | --: | ---: | --------------: | -------------------: |
+| gemma3 12B Q8_0                |  13.40 GiB |    11.77 B | ROCm       |  99 |    0 |           pp512 |        769.51 ± 0.90 |
+| gemma3 12B Q8_0                |  13.40 GiB |    11.77 B | ROCm       |  99 |    0 |           tg128 |         14.15 ± 0.00 |
+
+build: 4807e8f9 (6609)
diff --git a/benchmark/results/gemma-3-12b-it-UD-Q8_K_XL__rocm6_4_4__hblt0__fa1.log b/benchmark/results/gemma-3-12b-it-UD-Q8_K_XL__rocm6_4_4__hblt0__fa1.log
new file mode 100644
index 0000000..98f7cdf
--- /dev/null
+++ b/benchmark/results/gemma-3-12b-it-UD-Q8_K_XL__rocm6_4_4__hblt0__fa1.log
@@ -0,0 +1,10 @@
+ggml_cuda_init: GGML_CUDA_FORCE_MMQ:    no
+ggml_cuda_init: GGML_CUDA_FORCE_CUBLAS: no
+ggml_cuda_init: found 1 ROCm devices:
+  Device 0: Radeon 8060S Graphics, gfx1151 (0x1151), VMM: no, Wave Size: 32
+| model                          |       size |     params | backend    | ngl | fa | mmap |            test |                  t/s |
+| ------------------------------ | ---------: | ---------: | ---------- | --: | -: | ---: | --------------: | -------------------: |
+| gemma3 12B Q8_0                |  13.40 GiB |    11.77 B | ROCm       |  99 |  1 |    0 |           pp512 |        824.93 ± 0.75 |
+| gemma3 12B Q8_0                |  13.40 GiB |    11.77 B | ROCm       |  99 |  1 |    0 |           tg128 |         14.08 ± 0.00 |
+
+build: 4807e8f9 (6609)
diff --git a/benchmark/results/gemma-3-27b-it-BF16-00001-of-00002__rocm6_4_4-rocwmma.log b/benchmark/results/gemma-3-27b-it-BF16-00001-of-00002__rocm6_4_4-rocwmma.log
new file mode 100644
index 0000000..d4000d3
--- /dev/null
+++ b/benchmark/results/gemma-3-27b-it-BF16-00001-of-00002__rocm6_4_4-rocwmma.log
@@ -0,0 +1,15 @@
+ggml_cuda_init: GGML_CUDA_FORCE_MMQ:    no
+ggml_cuda_init: GGML_CUDA_FORCE_CUBLAS: no
+ggml_cuda_init: found 1 ROCm devices:
+  Device 0: Radeon 8060S Graphics, gfx1151 (0x1151), VMM: no, Wave Size: 32
+hipBLASLt error: Heuristic Fetch Failed!
+This message will be only be displayed once, unless the ROCBLAS_VERBOSE_HIPBLASLT_ERROR environment variable is set.
+
+rocBLAS warning: hipBlasLT failed, falling back to tensile. 
+This message will be only be displayed once, unless the ROCBLAS_VERBOSE_TENSILE_ERROR environment variable is set.
+| model                          |       size |     params | backend    | ngl | mmap |            test |                  t/s |
+| ------------------------------ | ---------: | ---------: | ---------- | --: | ---: | --------------: | -------------------: |
+| gemma3 27B BF16                |  50.31 GiB |    27.01 B | ROCm       |  99 |    0 |           pp512 |        425.33 ± 1.61 |
+| gemma3 27B BF16                |  50.31 GiB |    27.01 B | ROCm       |  99 |    0 |           tg128 |          4.11 ± 0.00 |
+
+build: 4807e8f9 (6609)
diff --git a/benchmark/results/gemma-3-27b-it-BF16-00001-of-00002__rocm6_4_4-rocwmma__fa1.log b/benchmark/results/gemma-3-27b-it-BF16-00001-of-00002__rocm6_4_4-rocwmma__fa1.log
new file mode 100644
index 0000000..a3a80d9
--- /dev/null
+++ b/benchmark/results/gemma-3-27b-it-BF16-00001-of-00002__rocm6_4_4-rocwmma__fa1.log
@@ -0,0 +1,10 @@
+ggml_cuda_init: GGML_CUDA_FORCE_MMQ:    no
+ggml_cuda_init: GGML_CUDA_FORCE_CUBLAS: no
+ggml_cuda_init: found 1 ROCm devices:
+  Device 0: Radeon 8060S Graphics, gfx1151 (0x1151), VMM: no, Wave Size: 32
+| model                          |       size |     params | backend    | ngl | fa | mmap |            test |                  t/s |
+| ------------------------------ | ---------: | ---------: | ---------- | --: | -: | ---: | --------------: | -------------------: |
+| gemma3 27B BF16                |  50.31 GiB |    27.01 B | ROCm       |  99 |  1 |    0 |           pp512 |        470.80 ± 1.97 |
+| gemma3 27B BF16                |  50.31 GiB |    27.01 B | ROCm       |  99 |  1 |    0 |           tg128 |          4.10 ± 0.00 |
+
+build: 4807e8f9 (6609)
diff --git a/benchmark/results/gemma-3-27b-it-BF16-00001-of-00002__rocm6_4_4-rocwmma__hblt0.log b/benchmark/results/gemma-3-27b-it-BF16-00001-of-00002__rocm6_4_4-rocwmma__hblt0.log
new file mode 100644
index 0000000..7dcd8b2
--- /dev/null
+++ b/benchmark/results/gemma-3-27b-it-BF16-00001-of-00002__rocm6_4_4-rocwmma__hblt0.log
@@ -0,0 +1,10 @@
+ggml_cuda_init: GGML_CUDA_FORCE_MMQ:    no
+ggml_cuda_init: GGML_CUDA_FORCE_CUBLAS: no
+ggml_cuda_init: found 1 ROCm devices:
+  Device 0: Radeon 8060S Graphics, gfx1151 (0x1151), VMM: no, Wave Size: 32
+| model                          |       size |     params | backend    | ngl | mmap |            test |                  t/s |
+| ------------------------------ | ---------: | ---------: | ---------- | --: | ---: | --------------: | -------------------: |
+| gemma3 27B BF16                |  50.31 GiB |    27.01 B | ROCm       |  99 |    0 |           pp512 |        469.59 ± 0.76 |
+| gemma3 27B BF16                |  50.31 GiB |    27.01 B | ROCm       |  99 |    0 |           tg128 |          4.04 ± 0.00 |
+
+build: 4807e8f9 (6609)
diff --git a/benchmark/results/gemma-3-27b-it-BF16-00001-of-00002__rocm6_4_4-rocwmma__hblt0__fa1.log b/benchmark/results/gemma-3-27b-it-BF16-00001-of-00002__rocm6_4_4-rocwmma__hblt0__fa1.log
new file mode 100644
index 0000000..d193232
--- /dev/null
+++ b/benchmark/results/gemma-3-27b-it-BF16-00001-of-00002__rocm6_4_4-rocwmma__hblt0__fa1.log
@@ -0,0 +1,10 @@
+ggml_cuda_init: GGML_CUDA_FORCE_MMQ:    no
+ggml_cuda_init: GGML_CUDA_FORCE_CUBLAS: no
+ggml_cuda_init: found 1 ROCm devices:
+  Device 0: Radeon 8060S Graphics, gfx1151 (0x1151), VMM: no, Wave Size: 32
+| model                          |       size |     params | backend    | ngl | fa | mmap |            test |                  t/s |
+| ------------------------------ | ---------: | ---------: | ---------- | --: | -: | ---: | --------------: | -------------------: |
+| gemma3 27B BF16                |  50.31 GiB |    27.01 B | ROCm       |  99 |  1 |    0 |           pp512 |        524.38 ± 0.70 |
+| gemma3 27B BF16                |  50.31 GiB |    27.01 B | ROCm       |  99 |  1 |    0 |           tg128 |          4.10 ± 0.00 |
+
+build: 4807e8f9 (6609)
diff --git a/benchmark/results/gemma-3-27b-it-BF16-00001-of-00002__rocm6_4_4.log b/benchmark/results/gemma-3-27b-it-BF16-00001-of-00002__rocm6_4_4.log
new file mode 100644
index 0000000..42dd307
--- /dev/null
+++ b/benchmark/results/gemma-3-27b-it-BF16-00001-of-00002__rocm6_4_4.log
@@ -0,0 +1,15 @@
+ggml_cuda_init: GGML_CUDA_FORCE_MMQ:    no
+ggml_cuda_init: GGML_CUDA_FORCE_CUBLAS: no
+ggml_cuda_init: found 1 ROCm devices:
+  Device 0: Radeon 8060S Graphics, gfx1151 (0x1151), VMM: no, Wave Size: 32
+hipBLASLt error: Heuristic Fetch Failed!
+This message will be only be displayed once, unless the ROCBLAS_VERBOSE_HIPBLASLT_ERROR environment variable is set.
+
+rocBLAS warning: hipBlasLT failed, falling back to tensile. 
+This message will be only be displayed once, unless the ROCBLAS_VERBOSE_TENSILE_ERROR environment variable is set.
+| model                          |       size |     params | backend    | ngl | mmap |            test |                  t/s |
+| ------------------------------ | ---------: | ---------: | ---------- | --: | ---: | --------------: | -------------------: |
+| gemma3 27B BF16                |  50.31 GiB |    27.01 B | ROCm       |  99 |    0 |           pp512 |        418.14 ± 0.79 |
+| gemma3 27B BF16                |  50.31 GiB |    27.01 B | ROCm       |  99 |    0 |           tg128 |          4.10 ± 0.00 |
+
+build: 4807e8f9 (6609)
diff --git a/benchmark/results/gemma-3-27b-it-BF16-00001-of-00002__rocm6_4_4__fa1.log b/benchmark/results/gemma-3-27b-it-BF16-00001-of-00002__rocm6_4_4__fa1.log
new file mode 100644
index 0000000..b0b764e
--- /dev/null
+++ b/benchmark/results/gemma-3-27b-it-BF16-00001-of-00002__rocm6_4_4__fa1.log
@@ -0,0 +1,10 @@
+ggml_cuda_init: GGML_CUDA_FORCE_MMQ:    no
+ggml_cuda_init: GGML_CUDA_FORCE_CUBLAS: no
+ggml_cuda_init: found 1 ROCm devices:
+  Device 0: Radeon 8060S Graphics, gfx1151 (0x1151), VMM: no, Wave Size: 32
+| model                          |       size |     params | backend    | ngl | fa | mmap |            test |                  t/s |
+| ------------------------------ | ---------: | ---------: | ---------- | --: | -: | ---: | --------------: | -------------------: |
+| gemma3 27B BF16                |  50.31 GiB |    27.01 B | ROCm       |  99 |  1 |    0 |           pp512 |        472.28 ± 1.24 |
+| gemma3 27B BF16                |  50.31 GiB |    27.01 B | ROCm       |  99 |  1 |    0 |           tg128 |          4.10 ± 0.00 |
+
+build: 4807e8f9 (6609)
diff --git a/benchmark/results/gemma-3-27b-it-BF16-00001-of-00002__rocm6_4_4__hblt0.log b/benchmark/results/gemma-3-27b-it-BF16-00001-of-00002__rocm6_4_4__hblt0.log
new file mode 100644
index 0000000..9a91629
--- /dev/null
+++ b/benchmark/results/gemma-3-27b-it-BF16-00001-of-00002__rocm6_4_4__hblt0.log
@@ -0,0 +1,10 @@
+ggml_cuda_init: GGML_CUDA_FORCE_MMQ:    no
+ggml_cuda_init: GGML_CUDA_FORCE_CUBLAS: no
+ggml_cuda_init: found 1 ROCm devices:
+  Device 0: Radeon 8060S Graphics, gfx1151 (0x1151), VMM: no, Wave Size: 32
+| model                          |       size |     params | backend    | ngl | mmap |            test |                  t/s |
+| ------------------------------ | ---------: | ---------: | ---------- | --: | ---: | --------------: | -------------------: |
+| gemma3 27B BF16                |  50.31 GiB |    27.01 B | ROCm       |  99 |    0 |           pp512 |        471.56 ± 0.60 |
+| gemma3 27B BF16                |  50.31 GiB |    27.01 B | ROCm       |  99 |    0 |           tg128 |          4.10 ± 0.00 |
+
+build: 4807e8f9 (6609)
diff --git a/benchmark/results/gemma-3-27b-it-BF16-00001-of-00002__rocm6_4_4__hblt0__fa1.log b/benchmark/results/gemma-3-27b-it-BF16-00001-of-00002__rocm6_4_4__hblt0__fa1.log
new file mode 100644
index 0000000..d91b6ce
--- /dev/null
+++ b/benchmark/results/gemma-3-27b-it-BF16-00001-of-00002__rocm6_4_4__hblt0__fa1.log
@@ -0,0 +1,10 @@
+ggml_cuda_init: GGML_CUDA_FORCE_MMQ:    no
+ggml_cuda_init: GGML_CUDA_FORCE_CUBLAS: no
+ggml_cuda_init: found 1 ROCm devices:
+  Device 0: Radeon 8060S Graphics, gfx1151 (0x1151), VMM: no, Wave Size: 32
+| model                          |       size |     params | backend    | ngl | fa | mmap |            test |                  t/s |
+| ------------------------------ | ---------: | ---------: | ---------- | --: | -: | ---: | --------------: | -------------------: |
+| gemma3 27B BF16                |  50.31 GiB |    27.01 B | ROCm       |  99 |  1 |    0 |           pp512 |        530.58 ± 0.66 |
+| gemma3 27B BF16                |  50.31 GiB |    27.01 B | ROCm       |  99 |  1 |    0 |           tg128 |          4.11 ± 0.00 |
+
+build: 4807e8f9 (6609)
diff --git a/benchmark/results/gemma-3-4b-it-Q3_K_S__rocm6_4_4-rocwmma.log b/benchmark/results/gemma-3-4b-it-Q3_K_S__rocm6_4_4-rocwmma.log
new file mode 100644
index 0000000..092f913
--- /dev/null
+++ b/benchmark/results/gemma-3-4b-it-Q3_K_S__rocm6_4_4-rocwmma.log
@@ -0,0 +1,15 @@
+ggml_cuda_init: GGML_CUDA_FORCE_MMQ:    no
+ggml_cuda_init: GGML_CUDA_FORCE_CUBLAS: no
+ggml_cuda_init: found 1 ROCm devices:
+  Device 0: Radeon 8060S Graphics, gfx1151 (0x1151), VMM: no, Wave Size: 32
+rocBLAS error: No hipBLASLt solution found
+This message will be only be displayed once, unless the ROCBLAS_VERBOSE_HIPBLASLT_ERROR environment variable is set.
+
+rocBLAS warning: hipBlasLT failed, falling back to tensile. 
+This message will be only be displayed once, unless the ROCBLAS_VERBOSE_TENSILE_ERROR environment variable is set.
+| model                          |       size |     params | backend    | ngl | mmap |            test |                  t/s |
+| ------------------------------ | ---------: | ---------: | ---------- | --: | ---: | --------------: | -------------------: |
+| gemma3 4B Q3_K - Small         |   1.80 GiB |     3.88 B | ROCm       |  99 |    0 |           pp512 |       2110.44 ± 6.13 |
+| gemma3 4B Q3_K - Small         |   1.80 GiB |     3.88 B | ROCm       |  99 |    0 |           tg128 |         79.31 ± 0.03 |
+
+build: 4807e8f9 (6609)
diff --git a/benchmark/results/gemma-3-4b-it-Q3_K_S__rocm6_4_4-rocwmma__fa1.log b/benchmark/results/gemma-3-4b-it-Q3_K_S__rocm6_4_4-rocwmma__fa1.log
new file mode 100644
index 0000000..7aeae92
--- /dev/null
+++ b/benchmark/results/gemma-3-4b-it-Q3_K_S__rocm6_4_4-rocwmma__fa1.log
@@ -0,0 +1,10 @@
+ggml_cuda_init: GGML_CUDA_FORCE_MMQ:    no
+ggml_cuda_init: GGML_CUDA_FORCE_CUBLAS: no
+ggml_cuda_init: found 1 ROCm devices:
+  Device 0: Radeon 8060S Graphics, gfx1151 (0x1151), VMM: no, Wave Size: 32
+| model                          |       size |     params | backend    | ngl | fa | mmap |            test |                  t/s |
+| ------------------------------ | ---------: | ---------: | ---------- | --: | -: | ---: | --------------: | -------------------: |
+| gemma3 4B Q3_K - Small         |   1.80 GiB |     3.88 B | ROCm       |  99 |  1 |    0 |           pp512 |       2261.02 ± 8.46 |
+| gemma3 4B Q3_K - Small         |   1.80 GiB |     3.88 B | ROCm       |  99 |  1 |    0 |           tg128 |         77.07 ± 0.04 |
+
+build: 4807e8f9 (6609)
diff --git a/benchmark/results/gemma-3-4b-it-Q3_K_S__rocm6_4_4-rocwmma__hblt0.log b/benchmark/results/gemma-3-4b-it-Q3_K_S__rocm6_4_4-rocwmma__hblt0.log
new file mode 100644
index 0000000..ffa9c72
--- /dev/null
+++ b/benchmark/results/gemma-3-4b-it-Q3_K_S__rocm6_4_4-rocwmma__hblt0.log
@@ -0,0 +1,10 @@
+ggml_cuda_init: GGML_CUDA_FORCE_MMQ:    no
+ggml_cuda_init: GGML_CUDA_FORCE_CUBLAS: no
+ggml_cuda_init: found 1 ROCm devices:
+  Device 0: Radeon 8060S Graphics, gfx1151 (0x1151), VMM: no, Wave Size: 32
+| model                          |       size |     params | backend    | ngl | mmap |            test |                  t/s |
+| ------------------------------ | ---------: | ---------: | ---------- | --: | ---: | --------------: | -------------------: |
+| gemma3 4B Q3_K - Small         |   1.80 GiB |     3.88 B | ROCm       |  99 |    0 |           pp512 |       2040.30 ± 9.11 |
+| gemma3 4B Q3_K - Small         |   1.80 GiB |     3.88 B | ROCm       |  99 |    0 |           tg128 |         79.33 ± 0.05 |
+
+build: 4807e8f9 (6609)
diff --git a/benchmark/results/gemma-3-4b-it-Q3_K_S__rocm6_4_4-rocwmma__hblt0__fa1.log b/benchmark/results/gemma-3-4b-it-Q3_K_S__rocm6_4_4-rocwmma__hblt0__fa1.log
new file mode 100644
index 0000000..de860ae
--- /dev/null
+++ b/benchmark/results/gemma-3-4b-it-Q3_K_S__rocm6_4_4-rocwmma__hblt0__fa1.log
@@ -0,0 +1,10 @@
+ggml_cuda_init: GGML_CUDA_FORCE_MMQ:    no
+ggml_cuda_init: GGML_CUDA_FORCE_CUBLAS: no
+ggml_cuda_init: found 1 ROCm devices:
+  Device 0: Radeon 8060S Graphics, gfx1151 (0x1151), VMM: no, Wave Size: 32
+| model                          |       size |     params | backend    | ngl | fa | mmap |            test |                  t/s |
+| ------------------------------ | ---------: | ---------: | ---------- | --: | -: | ---: | --------------: | -------------------: |
+| gemma3 4B Q3_K - Small         |   1.80 GiB |     3.88 B | ROCm       |  99 |  1 |    0 |           pp512 |       2143.83 ± 3.82 |
+| gemma3 4B Q3_K - Small         |   1.80 GiB |     3.88 B | ROCm       |  99 |  1 |    0 |           tg128 |         77.19 ± 0.02 |
+
+build: 4807e8f9 (6609)
diff --git a/benchmark/results/gemma-3-4b-it-Q3_K_S__rocm6_4_4.log b/benchmark/results/gemma-3-4b-it-Q3_K_S__rocm6_4_4.log
new file mode 100644
index 0000000..b1c6f95
--- /dev/null
+++ b/benchmark/results/gemma-3-4b-it-Q3_K_S__rocm6_4_4.log
@@ -0,0 +1,15 @@
+ggml_cuda_init: GGML_CUDA_FORCE_MMQ:    no
+ggml_cuda_init: GGML_CUDA_FORCE_CUBLAS: no
+ggml_cuda_init: found 1 ROCm devices:
+  Device 0: Radeon 8060S Graphics, gfx1151 (0x1151), VMM: no, Wave Size: 32
+rocBLAS error: No hipBLASLt solution found
+This message will be only be displayed once, unless the ROCBLAS_VERBOSE_HIPBLASLT_ERROR environment variable is set.
+
+rocBLAS warning: hipBlasLT failed, falling back to tensile. 
+This message will be only be displayed once, unless the ROCBLAS_VERBOSE_TENSILE_ERROR environment variable is set.
+| model                          |       size |     params | backend    | ngl | mmap |            test |                  t/s |
+| ------------------------------ | ---------: | ---------: | ---------- | --: | ---: | --------------: | -------------------: |
+| gemma3 4B Q3_K - Small         |   1.80 GiB |     3.88 B | ROCm       |  99 |    0 |           pp512 |       2099.80 ± 6.34 |
+| gemma3 4B Q3_K - Small         |   1.80 GiB |     3.88 B | ROCm       |  99 |    0 |           tg128 |         79.43 ± 0.05 |
+
+build: 4807e8f9 (6609)
diff --git a/benchmark/results/gemma-3-4b-it-Q3_K_S__rocm6_4_4__fa1.log b/benchmark/results/gemma-3-4b-it-Q3_K_S__rocm6_4_4__fa1.log
new file mode 100644
index 0000000..6dd1c51
--- /dev/null
+++ b/benchmark/results/gemma-3-4b-it-Q3_K_S__rocm6_4_4__fa1.log
@@ -0,0 +1,10 @@
+ggml_cuda_init: GGML_CUDA_FORCE_MMQ:    no
+ggml_cuda_init: GGML_CUDA_FORCE_CUBLAS: no
+ggml_cuda_init: found 1 ROCm devices:
+  Device 0: Radeon 8060S Graphics, gfx1151 (0x1151), VMM: no, Wave Size: 32
+| model                          |       size |     params | backend    | ngl | fa | mmap |            test |                  t/s |
+| ------------------------------ | ---------: | ---------: | ---------- | --: | -: | ---: | --------------: | -------------------: |
+| gemma3 4B Q3_K - Small         |   1.80 GiB |     3.88 B | ROCm       |  99 |  1 |    0 |           pp512 |       2262.00 ± 6.48 |
+| gemma3 4B Q3_K - Small         |   1.80 GiB |     3.88 B | ROCm       |  99 |  1 |    0 |           tg128 |         77.04 ± 0.03 |
+
+build: 4807e8f9 (6609)
diff --git a/benchmark/results/gemma-3-4b-it-Q3_K_S__rocm6_4_4__hblt0.log b/benchmark/results/gemma-3-4b-it-Q3_K_S__rocm6_4_4__hblt0.log
new file mode 100644
index 0000000..a41b287
--- /dev/null
+++ b/benchmark/results/gemma-3-4b-it-Q3_K_S__rocm6_4_4__hblt0.log
@@ -0,0 +1,10 @@
+ggml_cuda_init: GGML_CUDA_FORCE_MMQ:    no
+ggml_cuda_init: GGML_CUDA_FORCE_CUBLAS: no
+ggml_cuda_init: found 1 ROCm devices:
+  Device 0: Radeon 8060S Graphics, gfx1151 (0x1151), VMM: no, Wave Size: 32
+| model                          |       size |     params | backend    | ngl | mmap |            test |                  t/s |
+| ------------------------------ | ---------: | ---------: | ---------- | --: | ---: | --------------: | -------------------: |
+| gemma3 4B Q3_K - Small         |   1.80 GiB |     3.88 B | ROCm       |  99 |    0 |           pp512 |       2038.14 ± 6.72 |
+| gemma3 4B Q3_K - Small         |   1.80 GiB |     3.88 B | ROCm       |  99 |    0 |           tg128 |         79.41 ± 0.04 |
+
+build: 4807e8f9 (6609)
diff --git a/benchmark/results/gemma-3-4b-it-Q3_K_S__rocm6_4_4__hblt0__fa1.log b/benchmark/results/gemma-3-4b-it-Q3_K_S__rocm6_4_4__hblt0__fa1.log
new file mode 100644
index 0000000..11e13dd
--- /dev/null
+++ b/benchmark/results/gemma-3-4b-it-Q3_K_S__rocm6_4_4__hblt0__fa1.log
@@ -0,0 +1,10 @@
+ggml_cuda_init: GGML_CUDA_FORCE_MMQ:    no
+ggml_cuda_init: GGML_CUDA_FORCE_CUBLAS: no
+ggml_cuda_init: found 1 ROCm devices:
+  Device 0: Radeon 8060S Graphics, gfx1151 (0x1151), VMM: no, Wave Size: 32
+| model                          |       size |     params | backend    | ngl | fa | mmap |            test |                  t/s |
+| ------------------------------ | ---------: | ---------: | ---------- | --: | -: | ---: | --------------: | -------------------: |
+| gemma3 4B Q3_K - Small         |   1.80 GiB |     3.88 B | ROCm       |  99 |  1 |    0 |           pp512 |       2141.85 ± 6.83 |
+| gemma3 4B Q3_K - Small         |   1.80 GiB |     3.88 B | ROCm       |  99 |  1 |    0 |           tg128 |         77.14 ± 0.02 |
+
+build: 4807e8f9 (6609)
diff --git a/benchmark/results/gpt-oss-120b-F16__rocm6_4_4-rocwmma.log b/benchmark/results/gpt-oss-120b-F16__rocm6_4_4-rocwmma.log
new file mode 100644
index 0000000..6f18e5c
--- /dev/null
+++ b/benchmark/results/gpt-oss-120b-F16__rocm6_4_4-rocwmma.log
@@ -0,0 +1,15 @@
+ggml_cuda_init: GGML_CUDA_FORCE_MMQ:    no
+ggml_cuda_init: GGML_CUDA_FORCE_CUBLAS: no
+ggml_cuda_init: found 1 ROCm devices:
+  Device 0: Radeon 8060S Graphics, gfx1151 (0x1151), VMM: no, Wave Size: 32
+hipBLASLt error: Heuristic Fetch Failed!
+This message will be only be displayed once, unless the ROCBLAS_VERBOSE_HIPBLASLT_ERROR environment variable is set.
+
+rocBLAS warning: hipBlasLT failed, falling back to tensile. 
+This message will be only be displayed once, unless the ROCBLAS_VERBOSE_TENSILE_ERROR environment variable is set.
+| model                          |       size |     params | backend    | ngl | mmap |            test |                  t/s |
+| ------------------------------ | ---------: | ---------: | ---------- | --: | ---: | --------------: | -------------------: |
+| gpt-oss 120B F16               |  60.87 GiB |   116.83 B | ROCm       |  99 |    0 |           pp512 |        683.95 ± 7.54 |
+| gpt-oss 120B F16               |  60.87 GiB |   116.83 B | ROCm       |  99 |    0 |           tg128 |         34.82 ± 0.00 |
+
+build: 4807e8f9 (6609)
diff --git a/benchmark/results/gpt-oss-120b-F16__rocm6_4_4-rocwmma__fa1.log b/benchmark/results/gpt-oss-120b-F16__rocm6_4_4-rocwmma__fa1.log
new file mode 100644
index 0000000..e759809
--- /dev/null
+++ b/benchmark/results/gpt-oss-120b-F16__rocm6_4_4-rocwmma__fa1.log
@@ -0,0 +1,10 @@
+ggml_cuda_init: GGML_CUDA_FORCE_MMQ:    no
+ggml_cuda_init: GGML_CUDA_FORCE_CUBLAS: no
+ggml_cuda_init: found 1 ROCm devices:
+  Device 0: Radeon 8060S Graphics, gfx1151 (0x1151), VMM: no, Wave Size: 32
+| model                          |       size |     params | backend    | ngl | fa | mmap |            test |                  t/s |
+| ------------------------------ | ---------: | ---------: | ---------- | --: | -: | ---: | --------------: | -------------------: |
+| gpt-oss 120B F16               |  60.87 GiB |   116.83 B | ROCm       |  99 |  1 |    0 |           pp512 |        783.37 ± 6.29 |
+| gpt-oss 120B F16               |  60.87 GiB |   116.83 B | ROCm       |  99 |  1 |    0 |           tg128 |         35.06 ± 0.01 |
+
+build: 4807e8f9 (6609)
diff --git a/benchmark/results/gpt-oss-120b-F16__rocm6_4_4-rocwmma__hblt0.log b/benchmark/results/gpt-oss-120b-F16__rocm6_4_4-rocwmma__hblt0.log
new file mode 100644
index 0000000..f285cae
--- /dev/null
+++ b/benchmark/results/gpt-oss-120b-F16__rocm6_4_4-rocwmma__hblt0.log
@@ -0,0 +1,10 @@
+ggml_cuda_init: GGML_CUDA_FORCE_MMQ:    no
+ggml_cuda_init: GGML_CUDA_FORCE_CUBLAS: no
+ggml_cuda_init: found 1 ROCm devices:
+  Device 0: Radeon 8060S Graphics, gfx1151 (0x1151), VMM: no, Wave Size: 32
+| model                          |       size |     params | backend    | ngl | mmap |            test |                  t/s |
+| ------------------------------ | ---------: | ---------: | ---------- | --: | ---: | --------------: | -------------------: |
+| gpt-oss 120B F16               |  60.87 GiB |   116.83 B | ROCm       |  99 |    0 |           pp512 |        689.85 ± 4.60 |
+| gpt-oss 120B F16               |  60.87 GiB |   116.83 B | ROCm       |  99 |    0 |           tg128 |         34.84 ± 0.00 |
+
+build: 4807e8f9 (6609)
diff --git a/benchmark/results/gpt-oss-120b-F16__rocm6_4_4-rocwmma__hblt0__fa1.log b/benchmark/results/gpt-oss-120b-F16__rocm6_4_4-rocwmma__hblt0__fa1.log
new file mode 100644
index 0000000..aca0d6e
--- /dev/null
+++ b/benchmark/results/gpt-oss-120b-F16__rocm6_4_4-rocwmma__hblt0__fa1.log
@@ -0,0 +1,10 @@
+ggml_cuda_init: GGML_CUDA_FORCE_MMQ:    no
+ggml_cuda_init: GGML_CUDA_FORCE_CUBLAS: no
+ggml_cuda_init: found 1 ROCm devices:
+  Device 0: Radeon 8060S Graphics, gfx1151 (0x1151), VMM: no, Wave Size: 32
+| model                          |       size |     params | backend    | ngl | fa | mmap |            test |                  t/s |
+| ------------------------------ | ---------: | ---------: | ---------- | --: | -: | ---: | --------------: | -------------------: |
+| gpt-oss 120B F16               |  60.87 GiB |   116.83 B | ROCm       |  99 |  1 |    0 |           pp512 |        789.94 ± 5.16 |
+| gpt-oss 120B F16               |  60.87 GiB |   116.83 B | ROCm       |  99 |  1 |    0 |           tg128 |         35.17 ± 0.00 |
+
+build: 4807e8f9 (6609)
diff --git a/benchmark/results/gpt-oss-120b-F16__rocm6_4_4.log b/benchmark/results/gpt-oss-120b-F16__rocm6_4_4.log
new file mode 100644
index 0000000..0e3533b
--- /dev/null
+++ b/benchmark/results/gpt-oss-120b-F16__rocm6_4_4.log
@@ -0,0 +1,15 @@
+ggml_cuda_init: GGML_CUDA_FORCE_MMQ:    no
+ggml_cuda_init: GGML_CUDA_FORCE_CUBLAS: no
+ggml_cuda_init: found 1 ROCm devices:
+  Device 0: Radeon 8060S Graphics, gfx1151 (0x1151), VMM: no, Wave Size: 32
+hipBLASLt error: Heuristic Fetch Failed!
+This message will be only be displayed once, unless the ROCBLAS_VERBOSE_HIPBLASLT_ERROR environment variable is set.
+
+rocBLAS warning: hipBlasLT failed, falling back to tensile. 
+This message will be only be displayed once, unless the ROCBLAS_VERBOSE_TENSILE_ERROR environment variable is set.
+| model                          |       size |     params | backend    | ngl | mmap |            test |                  t/s |
+| ------------------------------ | ---------: | ---------: | ---------- | --: | ---: | --------------: | -------------------: |
+| gpt-oss 120B F16               |  60.87 GiB |   116.83 B | ROCm       |  99 |    0 |           pp512 |        682.09 ± 3.61 |
+| gpt-oss 120B F16               |  60.87 GiB |   116.83 B | ROCm       |  99 |    0 |           tg128 |         34.89 ± 0.00 |
+
+build: 4807e8f9 (6609)
diff --git a/benchmark/results/gpt-oss-120b-F16__rocm6_4_4__fa1.log b/benchmark/results/gpt-oss-120b-F16__rocm6_4_4__fa1.log
new file mode 100644
index 0000000..2f2e390
--- /dev/null
+++ b/benchmark/results/gpt-oss-120b-F16__rocm6_4_4__fa1.log
@@ -0,0 +1,10 @@
+ggml_cuda_init: GGML_CUDA_FORCE_MMQ:    no
+ggml_cuda_init: GGML_CUDA_FORCE_CUBLAS: no
+ggml_cuda_init: found 1 ROCm devices:
+  Device 0: Radeon 8060S Graphics, gfx1151 (0x1151), VMM: no, Wave Size: 32
+| model                          |       size |     params | backend    | ngl | fa | mmap |            test |                  t/s |
+| ------------------------------ | ---------: | ---------: | ---------- | --: | -: | ---: | --------------: | -------------------: |
+| gpt-oss 120B F16               |  60.87 GiB |   116.83 B | ROCm       |  99 |  1 |    0 |           pp512 |        790.76 ± 6.72 |
+| gpt-oss 120B F16               |  60.87 GiB |   116.83 B | ROCm       |  99 |  1 |    0 |           tg128 |         35.06 ± 0.00 |
+
+build: 4807e8f9 (6609)
diff --git a/benchmark/results/gpt-oss-120b-F16__rocm6_4_4__hblt0.log b/benchmark/results/gpt-oss-120b-F16__rocm6_4_4__hblt0.log
new file mode 100644
index 0000000..a70aa6f
--- /dev/null
+++ b/benchmark/results/gpt-oss-120b-F16__rocm6_4_4__hblt0.log
@@ -0,0 +1,10 @@
+ggml_cuda_init: GGML_CUDA_FORCE_MMQ:    no
+ggml_cuda_init: GGML_CUDA_FORCE_CUBLAS: no
+ggml_cuda_init: found 1 ROCm devices:
+  Device 0: Radeon 8060S Graphics, gfx1151 (0x1151), VMM: no, Wave Size: 32
+| model                          |       size |     params | backend    | ngl | mmap |            test |                  t/s |
+| ------------------------------ | ---------: | ---------: | ---------- | --: | ---: | --------------: | -------------------: |
+| gpt-oss 120B F16               |  60.87 GiB |   116.83 B | ROCm       |  99 |    0 |           pp512 |        688.37 ± 4.43 |
+| gpt-oss 120B F16               |  60.87 GiB |   116.83 B | ROCm       |  99 |    0 |           tg128 |         34.74 ± 0.01 |
+
+build: 4807e8f9 (6609)
diff --git a/benchmark/results/gpt-oss-120b-F16__rocm6_4_4__hblt0__fa1.log b/benchmark/results/gpt-oss-120b-F16__rocm6_4_4__hblt0__fa1.log
new file mode 100644
index 0000000..5a2445a
--- /dev/null
+++ b/benchmark/results/gpt-oss-120b-F16__rocm6_4_4__hblt0__fa1.log
@@ -0,0 +1,10 @@
+ggml_cuda_init: GGML_CUDA_FORCE_MMQ:    no
+ggml_cuda_init: GGML_CUDA_FORCE_CUBLAS: no
+ggml_cuda_init: found 1 ROCm devices:
+  Device 0: Radeon 8060S Graphics, gfx1151 (0x1151), VMM: no, Wave Size: 32
+| model                          |       size |     params | backend    | ngl | fa | mmap |            test |                  t/s |
+| ------------------------------ | ---------: | ---------: | ---------- | --: | -: | ---: | --------------: | -------------------: |
+| gpt-oss 120B F16               |  60.87 GiB |   116.83 B | ROCm       |  99 |  1 |    0 |           pp512 |       777.75 ± 25.64 |
+| gpt-oss 120B F16               |  60.87 GiB |   116.83 B | ROCm       |  99 |  1 |    0 |           tg128 |         35.12 ± 0.00 |
+
+build: 4807e8f9 (6609)
diff --git a/benchmark/results/gpt-oss-120b-mxfp4-00001-of-00003__rocm6_4_4-rocwmma.log b/benchmark/results/gpt-oss-120b-mxfp4-00001-of-00003__rocm6_4_4-rocwmma.log
new file mode 100644
index 0000000..58ee406
--- /dev/null
+++ b/benchmark/results/gpt-oss-120b-mxfp4-00001-of-00003__rocm6_4_4-rocwmma.log
@@ -0,0 +1,15 @@
+ggml_cuda_init: GGML_CUDA_FORCE_MMQ:    no
+ggml_cuda_init: GGML_CUDA_FORCE_CUBLAS: no
+ggml_cuda_init: found 1 ROCm devices:
+  Device 0: Radeon 8060S Graphics, gfx1151 (0x1151), VMM: no, Wave Size: 32
+rocBLAS error: No hipBLASLt solution found
+This message will be only be displayed once, unless the ROCBLAS_VERBOSE_HIPBLASLT_ERROR environment variable is set.
+
+rocBLAS warning: hipBlasLT failed, falling back to tensile. 
+This message will be only be displayed once, unless the ROCBLAS_VERBOSE_TENSILE_ERROR environment variable is set.
+| model                          |       size |     params | backend    | ngl | mmap |            test |                  t/s |
+| ------------------------------ | ---------: | ---------: | ---------- | --: | ---: | --------------: | -------------------: |
+| gpt-oss 120B MXFP4 MoE         |  59.02 GiB |   116.83 B | ROCm       |  99 |    0 |           pp512 |        668.07 ± 3.99 |
+| gpt-oss 120B MXFP4 MoE         |  59.02 GiB |   116.83 B | ROCm       |  99 |    0 |           tg128 |         47.22 ± 0.01 |
+
+build: 4807e8f9 (6609)
diff --git a/benchmark/results/gpt-oss-120b-mxfp4-00001-of-00003__rocm6_4_4-rocwmma__fa1.log b/benchmark/results/gpt-oss-120b-mxfp4-00001-of-00003__rocm6_4_4-rocwmma__fa1.log
new file mode 100644
index 0000000..7b6a6b6
--- /dev/null
+++ b/benchmark/results/gpt-oss-120b-mxfp4-00001-of-00003__rocm6_4_4-rocwmma__fa1.log
@@ -0,0 +1,10 @@
+ggml_cuda_init: GGML_CUDA_FORCE_MMQ:    no
+ggml_cuda_init: GGML_CUDA_FORCE_CUBLAS: no
+ggml_cuda_init: found 1 ROCm devices:
+  Device 0: Radeon 8060S Graphics, gfx1151 (0x1151), VMM: no, Wave Size: 32
+| model                          |       size |     params | backend    | ngl | fa | mmap |            test |                  t/s |
+| ------------------------------ | ---------: | ---------: | ---------- | --: | -: | ---: | --------------: | -------------------: |
+| gpt-oss 120B MXFP4 MoE         |  59.02 GiB |   116.83 B | ROCm       |  99 |  1 |    0 |           pp512 |        767.63 ± 5.37 |
+| gpt-oss 120B MXFP4 MoE         |  59.02 GiB |   116.83 B | ROCm       |  99 |  1 |    0 |           tg128 |         47.72 ± 0.01 |
+
+build: 4807e8f9 (6609)
diff --git a/benchmark/results/gpt-oss-120b-mxfp4-00001-of-00003__rocm6_4_4-rocwmma__hblt0.log b/benchmark/results/gpt-oss-120b-mxfp4-00001-of-00003__rocm6_4_4-rocwmma__hblt0.log
new file mode 100644
index 0000000..2509d50
--- /dev/null
+++ b/benchmark/results/gpt-oss-120b-mxfp4-00001-of-00003__rocm6_4_4-rocwmma__hblt0.log
@@ -0,0 +1,10 @@
+ggml_cuda_init: GGML_CUDA_FORCE_MMQ:    no
+ggml_cuda_init: GGML_CUDA_FORCE_CUBLAS: no
+ggml_cuda_init: found 1 ROCm devices:
+  Device 0: Radeon 8060S Graphics, gfx1151 (0x1151), VMM: no, Wave Size: 32
+| model                          |       size |     params | backend    | ngl | mmap |            test |                  t/s |
+| ------------------------------ | ---------: | ---------: | ---------- | --: | ---: | --------------: | -------------------: |
+| gpt-oss 120B MXFP4 MoE         |  59.02 GiB |   116.83 B | ROCm       |  99 |    0 |           pp512 |        685.61 ± 4.60 |
+| gpt-oss 120B MXFP4 MoE         |  59.02 GiB |   116.83 B | ROCm       |  99 |    0 |           tg128 |         47.15 ± 0.00 |
+
+build: 4807e8f9 (6609)
diff --git a/benchmark/results/gpt-oss-120b-mxfp4-00001-of-00003__rocm6_4_4-rocwmma__hblt0__fa1.log b/benchmark/results/gpt-oss-120b-mxfp4-00001-of-00003__rocm6_4_4-rocwmma__hblt0__fa1.log
new file mode 100644
index 0000000..85d58d8
--- /dev/null
+++ b/benchmark/results/gpt-oss-120b-mxfp4-00001-of-00003__rocm6_4_4-rocwmma__hblt0__fa1.log
@@ -0,0 +1,10 @@
+ggml_cuda_init: GGML_CUDA_FORCE_MMQ:    no
+ggml_cuda_init: GGML_CUDA_FORCE_CUBLAS: no
+ggml_cuda_init: found 1 ROCm devices:
+  Device 0: Radeon 8060S Graphics, gfx1151 (0x1151), VMM: no, Wave Size: 32
+| model                          |       size |     params | backend    | ngl | fa | mmap |            test |                  t/s |
+| ------------------------------ | ---------: | ---------: | ---------- | --: | -: | ---: | --------------: | -------------------: |
+| gpt-oss 120B MXFP4 MoE         |  59.02 GiB |   116.83 B | ROCm       |  99 |  1 |    0 |           pp512 |        785.43 ± 4.63 |
+| gpt-oss 120B MXFP4 MoE         |  59.02 GiB |   116.83 B | ROCm       |  99 |  1 |    0 |           tg128 |         47.65 ± 0.01 |
+
+build: 4807e8f9 (6609)
diff --git a/benchmark/results/gpt-oss-120b-mxfp4-00001-of-00003__rocm6_4_4.log b/benchmark/results/gpt-oss-120b-mxfp4-00001-of-00003__rocm6_4_4.log
new file mode 100644
index 0000000..b26e84b
--- /dev/null
+++ b/benchmark/results/gpt-oss-120b-mxfp4-00001-of-00003__rocm6_4_4.log
@@ -0,0 +1,15 @@
+ggml_cuda_init: GGML_CUDA_FORCE_MMQ:    no
+ggml_cuda_init: GGML_CUDA_FORCE_CUBLAS: no
+ggml_cuda_init: found 1 ROCm devices:
+  Device 0: Radeon 8060S Graphics, gfx1151 (0x1151), VMM: no, Wave Size: 32
+rocBLAS error: No hipBLASLt solution found
+This message will be only be displayed once, unless the ROCBLAS_VERBOSE_HIPBLASLT_ERROR environment variable is set.
+
+rocBLAS warning: hipBlasLT failed, falling back to tensile. 
+This message will be only be displayed once, unless the ROCBLAS_VERBOSE_TENSILE_ERROR environment variable is set.
+| model                          |       size |     params | backend    | ngl | mmap |            test |                  t/s |
+| ------------------------------ | ---------: | ---------: | ---------- | --: | ---: | --------------: | -------------------: |
+| gpt-oss 120B MXFP4 MoE         |  59.02 GiB |   116.83 B | ROCm       |  99 |    0 |           pp512 |        664.62 ± 3.53 |
+| gpt-oss 120B MXFP4 MoE         |  59.02 GiB |   116.83 B | ROCm       |  99 |    0 |           tg128 |         47.11 ± 0.01 |
+
+build: 4807e8f9 (6609)
diff --git a/benchmark/results/gpt-oss-120b-mxfp4-00001-of-00003__rocm6_4_4__fa1.log b/benchmark/results/gpt-oss-120b-mxfp4-00001-of-00003__rocm6_4_4__fa1.log
new file mode 100644
index 0000000..e27bc1a
--- /dev/null
+++ b/benchmark/results/gpt-oss-120b-mxfp4-00001-of-00003__rocm6_4_4__fa1.log
@@ -0,0 +1,10 @@
+ggml_cuda_init: GGML_CUDA_FORCE_MMQ:    no
+ggml_cuda_init: GGML_CUDA_FORCE_CUBLAS: no
+ggml_cuda_init: found 1 ROCm devices:
+  Device 0: Radeon 8060S Graphics, gfx1151 (0x1151), VMM: no, Wave Size: 32
+| model                          |       size |     params | backend    | ngl | fa | mmap |            test |                  t/s |
+| ------------------------------ | ---------: | ---------: | ---------- | --: | -: | ---: | --------------: | -------------------: |
+| gpt-oss 120B MXFP4 MoE         |  59.02 GiB |   116.83 B | ROCm       |  99 |  1 |    0 |           pp512 |        773.25 ± 6.50 |
+| gpt-oss 120B MXFP4 MoE         |  59.02 GiB |   116.83 B | ROCm       |  99 |  1 |    0 |           tg128 |         47.69 ± 0.00 |
+
+build: 4807e8f9 (6609)
diff --git a/benchmark/results/gpt-oss-120b-mxfp4-00001-of-00003__rocm6_4_4__hblt0.log b/benchmark/results/gpt-oss-120b-mxfp4-00001-of-00003__rocm6_4_4__hblt0.log
new file mode 100644
index 0000000..b4438a1
--- /dev/null
+++ b/benchmark/results/gpt-oss-120b-mxfp4-00001-of-00003__rocm6_4_4__hblt0.log
@@ -0,0 +1,10 @@
+ggml_cuda_init: GGML_CUDA_FORCE_MMQ:    no
+ggml_cuda_init: GGML_CUDA_FORCE_CUBLAS: no
+ggml_cuda_init: found 1 ROCm devices:
+  Device 0: Radeon 8060S Graphics, gfx1151 (0x1151), VMM: no, Wave Size: 32
+| model                          |       size |     params | backend    | ngl | mmap |            test |                  t/s |
+| ------------------------------ | ---------: | ---------: | ---------- | --: | ---: | --------------: | -------------------: |
+| gpt-oss 120B MXFP4 MoE         |  59.02 GiB |   116.83 B | ROCm       |  99 |    0 |           pp512 |        686.92 ± 5.29 |
+| gpt-oss 120B MXFP4 MoE         |  59.02 GiB |   116.83 B | ROCm       |  99 |    0 |           tg128 |         47.15 ± 0.01 |
+
+build: 4807e8f9 (6609)
diff --git a/benchmark/results/gpt-oss-120b-mxfp4-00001-of-00003__rocm6_4_4__hblt0__fa1.log b/benchmark/results/gpt-oss-120b-mxfp4-00001-of-00003__rocm6_4_4__hblt0__fa1.log
new file mode 100644
index 0000000..380679c
--- /dev/null
+++ b/benchmark/results/gpt-oss-120b-mxfp4-00001-of-00003__rocm6_4_4__hblt0__fa1.log
@@ -0,0 +1,10 @@
+ggml_cuda_init: GGML_CUDA_FORCE_MMQ:    no
+ggml_cuda_init: GGML_CUDA_FORCE_CUBLAS: no
+ggml_cuda_init: found 1 ROCm devices:
+  Device 0: Radeon 8060S Graphics, gfx1151 (0x1151), VMM: no, Wave Size: 32
+| model                          |       size |     params | backend    | ngl | fa | mmap |            test |                  t/s |
+| ------------------------------ | ---------: | ---------: | ---------- | --: | -: | ---: | --------------: | -------------------: |
+| gpt-oss 120B MXFP4 MoE         |  59.02 GiB |   116.83 B | ROCm       |  99 |  1 |    0 |           pp512 |        781.60 ± 6.15 |
+| gpt-oss 120B MXFP4 MoE         |  59.02 GiB |   116.83 B | ROCm       |  99 |  1 |    0 |           tg128 |         47.76 ± 0.01 |
+
+build: 4807e8f9 (6609)
diff --git a/benchmark/results/gpt-oss-20b-F32__rocm6_4_4-rocwmma.log b/benchmark/results/gpt-oss-20b-F32__rocm6_4_4-rocwmma.log
new file mode 100644
index 0000000..50331f8
--- /dev/null
+++ b/benchmark/results/gpt-oss-20b-F32__rocm6_4_4-rocwmma.log
@@ -0,0 +1,15 @@
+ggml_cuda_init: GGML_CUDA_FORCE_MMQ:    no
+ggml_cuda_init: GGML_CUDA_FORCE_CUBLAS: no
+ggml_cuda_init: found 1 ROCm devices:
+  Device 0: Radeon 8060S Graphics, gfx1151 (0x1151), VMM: no, Wave Size: 32
+hipBLASLt error: Heuristic Fetch Failed!
+This message will be only be displayed once, unless the ROCBLAS_VERBOSE_HIPBLASLT_ERROR environment variable is set.
+
+rocBLAS warning: hipBlasLT failed, falling back to tensile. 
+This message will be only be displayed once, unless the ROCBLAS_VERBOSE_TENSILE_ERROR environment variable is set.
+| model                          |       size |     params | backend    | ngl | mmap |            test |                  t/s |
+| ------------------------------ | ---------: | ---------: | ---------- | --: | ---: | --------------: | -------------------: |
+| gpt-oss 20B BF16               |  38.97 GiB |    20.91 B | ROCm       |  99 |    0 |           pp512 |       1253.42 ± 6.47 |
+| gpt-oss 20B BF16               |  38.97 GiB |    20.91 B | ROCm       |  99 |    0 |           tg128 |         27.29 ± 0.00 |
+
+build: 4807e8f9 (6609)
diff --git a/benchmark/results/gpt-oss-20b-F32__rocm6_4_4-rocwmma__fa1.log b/benchmark/results/gpt-oss-20b-F32__rocm6_4_4-rocwmma__fa1.log
new file mode 100644
index 0000000..b0a6f2f
--- /dev/null
+++ b/benchmark/results/gpt-oss-20b-F32__rocm6_4_4-rocwmma__fa1.log
@@ -0,0 +1,10 @@
+ggml_cuda_init: GGML_CUDA_FORCE_MMQ:    no
+ggml_cuda_init: GGML_CUDA_FORCE_CUBLAS: no
+ggml_cuda_init: found 1 ROCm devices:
+  Device 0: Radeon 8060S Graphics, gfx1151 (0x1151), VMM: no, Wave Size: 32
+| model                          |       size |     params | backend    | ngl | fa | mmap |            test |                  t/s |
+| ------------------------------ | ---------: | ---------: | ---------- | --: | -: | ---: | --------------: | -------------------: |
+| gpt-oss 20B BF16               |  38.97 GiB |    20.91 B | ROCm       |  99 |  1 |    0 |           pp512 |       1502.41 ± 9.99 |
+| gpt-oss 20B BF16               |  38.97 GiB |    20.91 B | ROCm       |  99 |  1 |    0 |           tg128 |         27.35 ± 0.00 |
+
+build: 4807e8f9 (6609)
diff --git a/benchmark/results/gpt-oss-20b-F32__rocm6_4_4-rocwmma__hblt0.log b/benchmark/results/gpt-oss-20b-F32__rocm6_4_4-rocwmma__hblt0.log
new file mode 100644
index 0000000..fd5dc3d
--- /dev/null
+++ b/benchmark/results/gpt-oss-20b-F32__rocm6_4_4-rocwmma__hblt0.log
@@ -0,0 +1,10 @@
+ggml_cuda_init: GGML_CUDA_FORCE_MMQ:    no
+ggml_cuda_init: GGML_CUDA_FORCE_CUBLAS: no
+ggml_cuda_init: found 1 ROCm devices:
+  Device 0: Radeon 8060S Graphics, gfx1151 (0x1151), VMM: no, Wave Size: 32
+| model                          |       size |     params | backend    | ngl | mmap |            test |                  t/s |
+| ------------------------------ | ---------: | ---------: | ---------- | --: | ---: | --------------: | -------------------: |
+| gpt-oss 20B BF16               |  38.97 GiB |    20.91 B | ROCm       |  99 |    0 |           pp512 |      1234.38 ± 12.52 |
+| gpt-oss 20B BF16               |  38.97 GiB |    20.91 B | ROCm       |  99 |    0 |           tg128 |         27.25 ± 0.00 |
+
+build: 4807e8f9 (6609)
diff --git a/benchmark/results/gpt-oss-20b-F32__rocm6_4_4-rocwmma__hblt0__fa1.log b/benchmark/results/gpt-oss-20b-F32__rocm6_4_4-rocwmma__hblt0__fa1.log
new file mode 100644
index 0000000..5a05042
--- /dev/null
+++ b/benchmark/results/gpt-oss-20b-F32__rocm6_4_4-rocwmma__hblt0__fa1.log
@@ -0,0 +1,10 @@
+ggml_cuda_init: GGML_CUDA_FORCE_MMQ:    no
+ggml_cuda_init: GGML_CUDA_FORCE_CUBLAS: no
+ggml_cuda_init: found 1 ROCm devices:
+  Device 0: Radeon 8060S Graphics, gfx1151 (0x1151), VMM: no, Wave Size: 32
+| model                          |       size |     params | backend    | ngl | fa | mmap |            test |                  t/s |
+| ------------------------------ | ---------: | ---------: | ---------- | --: | -: | ---: | --------------: | -------------------: |
+| gpt-oss 20B BF16               |  38.97 GiB |    20.91 B | ROCm       |  99 |  1 |    0 |           pp512 |       1463.75 ± 8.49 |
+| gpt-oss 20B BF16               |  38.97 GiB |    20.91 B | ROCm       |  99 |  1 |    0 |           tg128 |         27.34 ± 0.00 |
+
+build: 4807e8f9 (6609)
diff --git a/benchmark/results/gpt-oss-20b-F32__rocm6_4_4.log b/benchmark/results/gpt-oss-20b-F32__rocm6_4_4.log
new file mode 100644
index 0000000..007f05a
--- /dev/null
+++ b/benchmark/results/gpt-oss-20b-F32__rocm6_4_4.log
@@ -0,0 +1,15 @@
+ggml_cuda_init: GGML_CUDA_FORCE_MMQ:    no
+ggml_cuda_init: GGML_CUDA_FORCE_CUBLAS: no
+ggml_cuda_init: found 1 ROCm devices:
+  Device 0: Radeon 8060S Graphics, gfx1151 (0x1151), VMM: no, Wave Size: 32
+hipBLASLt error: Heuristic Fetch Failed!
+This message will be only be displayed once, unless the ROCBLAS_VERBOSE_HIPBLASLT_ERROR environment variable is set.
+
+rocBLAS warning: hipBlasLT failed, falling back to tensile. 
+This message will be only be displayed once, unless the ROCBLAS_VERBOSE_TENSILE_ERROR environment variable is set.
+| model                          |       size |     params | backend    | ngl | mmap |            test |                  t/s |
+| ------------------------------ | ---------: | ---------: | ---------- | --: | ---: | --------------: | -------------------: |
+| gpt-oss 20B BF16               |  38.97 GiB |    20.91 B | ROCm       |  99 |    0 |           pp512 |      1258.74 ± 12.44 |
+| gpt-oss 20B BF16               |  38.97 GiB |    20.91 B | ROCm       |  99 |    0 |           tg128 |         27.27 ± 0.00 |
+
+build: 4807e8f9 (6609)
diff --git a/benchmark/results/gpt-oss-20b-F32__rocm6_4_4__fa1.log b/benchmark/results/gpt-oss-20b-F32__rocm6_4_4__fa1.log
new file mode 100644
index 0000000..c19f17f
--- /dev/null
+++ b/benchmark/results/gpt-oss-20b-F32__rocm6_4_4__fa1.log
@@ -0,0 +1,10 @@
+ggml_cuda_init: GGML_CUDA_FORCE_MMQ:    no
+ggml_cuda_init: GGML_CUDA_FORCE_CUBLAS: no
+ggml_cuda_init: found 1 ROCm devices:
+  Device 0: Radeon 8060S Graphics, gfx1151 (0x1151), VMM: no, Wave Size: 32
+| model                          |       size |     params | backend    | ngl | fa | mmap |            test |                  t/s |
+| ------------------------------ | ---------: | ---------: | ---------- | --: | -: | ---: | --------------: | -------------------: |
+| gpt-oss 20B BF16               |  38.97 GiB |    20.91 B | ROCm       |  99 |  1 |    0 |           pp512 |      1513.34 ± 10.79 |
+| gpt-oss 20B BF16               |  38.97 GiB |    20.91 B | ROCm       |  99 |  1 |    0 |           tg128 |         27.35 ± 0.00 |
+
+build: 4807e8f9 (6609)
diff --git a/benchmark/results/gpt-oss-20b-F32__rocm6_4_4__hblt0.log b/benchmark/results/gpt-oss-20b-F32__rocm6_4_4__hblt0.log
new file mode 100644
index 0000000..41dede3
--- /dev/null
+++ b/benchmark/results/gpt-oss-20b-F32__rocm6_4_4__hblt0.log
@@ -0,0 +1,10 @@
+ggml_cuda_init: GGML_CUDA_FORCE_MMQ:    no
+ggml_cuda_init: GGML_CUDA_FORCE_CUBLAS: no
+ggml_cuda_init: found 1 ROCm devices:
+  Device 0: Radeon 8060S Graphics, gfx1151 (0x1151), VMM: no, Wave Size: 32
+| model                          |       size |     params | backend    | ngl | mmap |            test |                  t/s |
+| ------------------------------ | ---------: | ---------: | ---------- | --: | ---: | --------------: | -------------------: |
+| gpt-oss 20B BF16               |  38.97 GiB |    20.91 B | ROCm       |  99 |    0 |           pp512 |       1235.02 ± 7.10 |
+| gpt-oss 20B BF16               |  38.97 GiB |    20.91 B | ROCm       |  99 |    0 |           tg128 |         27.26 ± 0.00 |
+
+build: 4807e8f9 (6609)
diff --git a/benchmark/results/gpt-oss-20b-F32__rocm6_4_4__hblt0__fa1.log b/benchmark/results/gpt-oss-20b-F32__rocm6_4_4__hblt0__fa1.log
new file mode 100644
index 0000000..b6c968d
--- /dev/null
+++ b/benchmark/results/gpt-oss-20b-F32__rocm6_4_4__hblt0__fa1.log
@@ -0,0 +1,10 @@
+ggml_cuda_init: GGML_CUDA_FORCE_MMQ:    no
+ggml_cuda_init: GGML_CUDA_FORCE_CUBLAS: no
+ggml_cuda_init: found 1 ROCm devices:
+  Device 0: Radeon 8060S Graphics, gfx1151 (0x1151), VMM: no, Wave Size: 32
+| model                          |       size |     params | backend    | ngl | fa | mmap |            test |                  t/s |
+| ------------------------------ | ---------: | ---------: | ---------- | --: | -: | ---: | --------------: | -------------------: |
+| gpt-oss 20B BF16               |  38.97 GiB |    20.91 B | ROCm       |  99 |  1 |    0 |           pp512 |      1475.65 ± 12.28 |
+| gpt-oss 20B BF16               |  38.97 GiB |    20.91 B | ROCm       |  99 |  1 |    0 |           tg128 |         27.32 ± 0.00 |
+
+build: 4807e8f9 (6609)
diff --git a/benchmark/results/gpt-oss-20b-mxfp4__rocm6_4_4-rocwmma.log b/benchmark/results/gpt-oss-20b-mxfp4__rocm6_4_4-rocwmma.log
new file mode 100644
index 0000000..b8cffdf
--- /dev/null
+++ b/benchmark/results/gpt-oss-20b-mxfp4__rocm6_4_4-rocwmma.log
@@ -0,0 +1,15 @@
+ggml_cuda_init: GGML_CUDA_FORCE_MMQ:    no
+ggml_cuda_init: GGML_CUDA_FORCE_CUBLAS: no
+ggml_cuda_init: found 1 ROCm devices:
+  Device 0: Radeon 8060S Graphics, gfx1151 (0x1151), VMM: no, Wave Size: 32
+rocBLAS error: No hipBLASLt solution found
+This message will be only be displayed once, unless the ROCBLAS_VERBOSE_HIPBLASLT_ERROR environment variable is set.
+
+rocBLAS warning: hipBlasLT failed, falling back to tensile. 
+This message will be only be displayed once, unless the ROCBLAS_VERBOSE_TENSILE_ERROR environment variable is set.
+| model                          |       size |     params | backend    | ngl | mmap |            test |                  t/s |
+| ------------------------------ | ---------: | ---------: | ---------- | --: | ---: | --------------: | -------------------: |
+| gpt-oss 20B MXFP4 MoE          |  11.27 GiB |    20.91 B | ROCm       |  99 |    0 |           pp512 |      1276.57 ± 15.26 |
+| gpt-oss 20B MXFP4 MoE          |  11.27 GiB |    20.91 B | ROCm       |  99 |    0 |           tg128 |         67.47 ± 0.00 |
+
+build: 4807e8f9 (6609)
diff --git a/benchmark/results/gpt-oss-20b-mxfp4__rocm6_4_4-rocwmma__fa1.log b/benchmark/results/gpt-oss-20b-mxfp4__rocm6_4_4-rocwmma__fa1.log
new file mode 100644
index 0000000..7e59410
--- /dev/null
+++ b/benchmark/results/gpt-oss-20b-mxfp4__rocm6_4_4-rocwmma__fa1.log
@@ -0,0 +1,10 @@
+ggml_cuda_init: GGML_CUDA_FORCE_MMQ:    no
+ggml_cuda_init: GGML_CUDA_FORCE_CUBLAS: no
+ggml_cuda_init: found 1 ROCm devices:
+  Device 0: Radeon 8060S Graphics, gfx1151 (0x1151), VMM: no, Wave Size: 32
+| model                          |       size |     params | backend    | ngl | fa | mmap |            test |                  t/s |
+| ------------------------------ | ---------: | ---------: | ---------- | --: | -: | ---: | --------------: | -------------------: |
+| gpt-oss 20B MXFP4 MoE          |  11.27 GiB |    20.91 B | ROCm       |  99 |  1 |    0 |           pp512 |      1520.24 ± 18.05 |
+| gpt-oss 20B MXFP4 MoE          |  11.27 GiB |    20.91 B | ROCm       |  99 |  1 |    0 |           tg128 |         68.08 ± 0.01 |
+
+build: 4807e8f9 (6609)
diff --git a/benchmark/results/gpt-oss-20b-mxfp4__rocm6_4_4-rocwmma__hblt0.log b/benchmark/results/gpt-oss-20b-mxfp4__rocm6_4_4-rocwmma__hblt0.log
new file mode 100644
index 0000000..5098e0d
--- /dev/null
+++ b/benchmark/results/gpt-oss-20b-mxfp4__rocm6_4_4-rocwmma__hblt0.log
@@ -0,0 +1,10 @@
+ggml_cuda_init: GGML_CUDA_FORCE_MMQ:    no
+ggml_cuda_init: GGML_CUDA_FORCE_CUBLAS: no
+ggml_cuda_init: found 1 ROCm devices:
+  Device 0: Radeon 8060S Graphics, gfx1151 (0x1151), VMM: no, Wave Size: 32
+| model                          |       size |     params | backend    | ngl | mmap |            test |                  t/s |
+| ------------------------------ | ---------: | ---------: | ---------- | --: | ---: | --------------: | -------------------: |
+| gpt-oss 20B MXFP4 MoE          |  11.27 GiB |    20.91 B | ROCm       |  99 |    0 |           pp512 |       1335.36 ± 7.22 |
+| gpt-oss 20B MXFP4 MoE          |  11.27 GiB |    20.91 B | ROCm       |  99 |    0 |           tg128 |         67.28 ± 0.01 |
+
+build: 4807e8f9 (6609)
diff --git a/benchmark/results/gpt-oss-20b-mxfp4__rocm6_4_4-rocwmma__hblt0__fa1.log b/benchmark/results/gpt-oss-20b-mxfp4__rocm6_4_4-rocwmma__hblt0__fa1.log
new file mode 100644
index 0000000..8ee2cf2
--- /dev/null
+++ b/benchmark/results/gpt-oss-20b-mxfp4__rocm6_4_4-rocwmma__hblt0__fa1.log
@@ -0,0 +1,10 @@
+ggml_cuda_init: GGML_CUDA_FORCE_MMQ:    no
+ggml_cuda_init: GGML_CUDA_FORCE_CUBLAS: no
+ggml_cuda_init: found 1 ROCm devices:
+  Device 0: Radeon 8060S Graphics, gfx1151 (0x1151), VMM: no, Wave Size: 32
+| model                          |       size |     params | backend    | ngl | fa | mmap |            test |                  t/s |
+| ------------------------------ | ---------: | ---------: | ---------- | --: | -: | ---: | --------------: | -------------------: |
+| gpt-oss 20B MXFP4 MoE          |  11.27 GiB |    20.91 B | ROCm       |  99 |  1 |    0 |           pp512 |      1575.76 ± 15.77 |
+| gpt-oss 20B MXFP4 MoE          |  11.27 GiB |    20.91 B | ROCm       |  99 |  1 |    0 |           tg128 |         68.18 ± 0.02 |
+
+build: 4807e8f9 (6609)
diff --git a/benchmark/results/gpt-oss-20b-mxfp4__rocm6_4_4.log b/benchmark/results/gpt-oss-20b-mxfp4__rocm6_4_4.log
new file mode 100644
index 0000000..f71c809
--- /dev/null
+++ b/benchmark/results/gpt-oss-20b-mxfp4__rocm6_4_4.log
@@ -0,0 +1,15 @@
+ggml_cuda_init: GGML_CUDA_FORCE_MMQ:    no
+ggml_cuda_init: GGML_CUDA_FORCE_CUBLAS: no
+ggml_cuda_init: found 1 ROCm devices:
+  Device 0: Radeon 8060S Graphics, gfx1151 (0x1151), VMM: no, Wave Size: 32
+rocBLAS error: No hipBLASLt solution found
+This message will be only be displayed once, unless the ROCBLAS_VERBOSE_HIPBLASLT_ERROR environment variable is set.
+
+rocBLAS warning: hipBlasLT failed, falling back to tensile. 
+This message will be only be displayed once, unless the ROCBLAS_VERBOSE_TENSILE_ERROR environment variable is set.
+| model                          |       size |     params | backend    | ngl | mmap |            test |                  t/s |
+| ------------------------------ | ---------: | ---------: | ---------- | --: | ---: | --------------: | -------------------: |
+| gpt-oss 20B MXFP4 MoE          |  11.27 GiB |    20.91 B | ROCm       |  99 |    0 |           pp512 |       1270.02 ± 3.61 |
+| gpt-oss 20B MXFP4 MoE          |  11.27 GiB |    20.91 B | ROCm       |  99 |    0 |           tg128 |         67.37 ± 0.01 |
+
+build: 4807e8f9 (6609)
diff --git a/benchmark/results/gpt-oss-20b-mxfp4__rocm6_4_4__fa1.log b/benchmark/results/gpt-oss-20b-mxfp4__rocm6_4_4__fa1.log
new file mode 100644
index 0000000..e53da19
--- /dev/null
+++ b/benchmark/results/gpt-oss-20b-mxfp4__rocm6_4_4__fa1.log
@@ -0,0 +1,10 @@
+ggml_cuda_init: GGML_CUDA_FORCE_MMQ:    no
+ggml_cuda_init: GGML_CUDA_FORCE_CUBLAS: no
+ggml_cuda_init: found 1 ROCm devices:
+  Device 0: Radeon 8060S Graphics, gfx1151 (0x1151), VMM: no, Wave Size: 32
+| model                          |       size |     params | backend    | ngl | fa | mmap |            test |                  t/s |
+| ------------------------------ | ---------: | ---------: | ---------- | --: | -: | ---: | --------------: | -------------------: |
+| gpt-oss 20B MXFP4 MoE          |  11.27 GiB |    20.91 B | ROCm       |  99 |  1 |    0 |           pp512 |      1533.65 ± 17.58 |
+| gpt-oss 20B MXFP4 MoE          |  11.27 GiB |    20.91 B | ROCm       |  99 |  1 |    0 |           tg128 |         68.13 ± 0.01 |
+
+build: 4807e8f9 (6609)
diff --git a/benchmark/results/gpt-oss-20b-mxfp4__rocm6_4_4__hblt0.log b/benchmark/results/gpt-oss-20b-mxfp4__rocm6_4_4__hblt0.log
new file mode 100644
index 0000000..49fe2c4
--- /dev/null
+++ b/benchmark/results/gpt-oss-20b-mxfp4__rocm6_4_4__hblt0.log
@@ -0,0 +1,10 @@
+ggml_cuda_init: GGML_CUDA_FORCE_MMQ:    no
+ggml_cuda_init: GGML_CUDA_FORCE_CUBLAS: no
+ggml_cuda_init: found 1 ROCm devices:
+  Device 0: Radeon 8060S Graphics, gfx1151 (0x1151), VMM: no, Wave Size: 32
+| model                          |       size |     params | backend    | ngl | mmap |            test |                  t/s |
+| ------------------------------ | ---------: | ---------: | ---------- | --: | ---: | --------------: | -------------------: |
+| gpt-oss 20B MXFP4 MoE          |  11.27 GiB |    20.91 B | ROCm       |  99 |    0 |           pp512 |      1337.89 ± 14.39 |
+| gpt-oss 20B MXFP4 MoE          |  11.27 GiB |    20.91 B | ROCm       |  99 |    0 |           tg128 |         67.39 ± 0.01 |
+
+build: 4807e8f9 (6609)
diff --git a/benchmark/results/gpt-oss-20b-mxfp4__rocm6_4_4__hblt0__fa1.log b/benchmark/results/gpt-oss-20b-mxfp4__rocm6_4_4__hblt0__fa1.log
new file mode 100644
index 0000000..a972365
--- /dev/null
+++ b/benchmark/results/gpt-oss-20b-mxfp4__rocm6_4_4__hblt0__fa1.log
@@ -0,0 +1,10 @@
+ggml_cuda_init: GGML_CUDA_FORCE_MMQ:    no
+ggml_cuda_init: GGML_CUDA_FORCE_CUBLAS: no
+ggml_cuda_init: found 1 ROCm devices:
+  Device 0: Radeon 8060S Graphics, gfx1151 (0x1151), VMM: no, Wave Size: 32
+| model                          |       size |     params | backend    | ngl | fa | mmap |            test |                  t/s |
+| ------------------------------ | ---------: | ---------: | ---------- | --: | -: | ---: | --------------: | -------------------: |
+| gpt-oss 20B MXFP4 MoE          |  11.27 GiB |    20.91 B | ROCm       |  99 |  1 |    0 |           pp512 |      1587.21 ± 12.01 |
+| gpt-oss 20B MXFP4 MoE          |  11.27 GiB |    20.91 B | ROCm       |  99 |  1 |    0 |           tg128 |         68.25 ± 0.01 |
+
+build: 4807e8f9 (6609)
diff --git a/benchmark/results/llama-2-7b.Q4_0__rocm6_4_4-rocwmma.log b/benchmark/results/llama-2-7b.Q4_0__rocm6_4_4-rocwmma.log
new file mode 100644
index 0000000..dde3de4
--- /dev/null
+++ b/benchmark/results/llama-2-7b.Q4_0__rocm6_4_4-rocwmma.log
@@ -0,0 +1,10 @@
+ggml_cuda_init: GGML_CUDA_FORCE_MMQ:    no
+ggml_cuda_init: GGML_CUDA_FORCE_CUBLAS: no
+ggml_cuda_init: found 1 ROCm devices:
+  Device 0: Radeon 8060S Graphics, gfx1151 (0x1151), VMM: no, Wave Size: 32
+| model                          |       size |     params | backend    | ngl | mmap |            test |                  t/s |
+| ------------------------------ | ---------: | ---------: | ---------- | --: | ---: | --------------: | -------------------: |
+| llama 7B Q4_0                  |   3.56 GiB |     6.74 B | ROCm       |  99 |    0 |           pp512 |        979.59 ± 0.72 |
+| llama 7B Q4_0                  |   3.56 GiB |     6.74 B | ROCm       |  99 |    0 |           tg128 |         49.85 ± 0.01 |
+
+build: 4807e8f9 (6609)
diff --git a/benchmark/results/llama-2-7b.Q4_0__rocm6_4_4-rocwmma__fa1.log b/benchmark/results/llama-2-7b.Q4_0__rocm6_4_4-rocwmma__fa1.log
new file mode 100644
index 0000000..2e9c7a7
--- /dev/null
+++ b/benchmark/results/llama-2-7b.Q4_0__rocm6_4_4-rocwmma__fa1.log
@@ -0,0 +1,10 @@
+ggml_cuda_init: GGML_CUDA_FORCE_MMQ:    no
+ggml_cuda_init: GGML_CUDA_FORCE_CUBLAS: no
+ggml_cuda_init: found 1 ROCm devices:
+  Device 0: Radeon 8060S Graphics, gfx1151 (0x1151), VMM: no, Wave Size: 32
+| model                          |       size |     params | backend    | ngl | fa | mmap |            test |                  t/s |
+| ------------------------------ | ---------: | ---------: | ---------- | --: | -: | ---: | --------------: | -------------------: |
+| llama 7B Q4_0                  |   3.56 GiB |     6.74 B | ROCm       |  99 |  1 |    0 |           pp512 |       1098.00 ± 4.05 |
+| llama 7B Q4_0                  |   3.56 GiB |     6.74 B | ROCm       |  99 |  1 |    0 |           tg128 |         49.40 ± 0.01 |
+
+build: 4807e8f9 (6609)
diff --git a/benchmark/results/llama-2-7b.Q4_0__rocm6_4_4-rocwmma__hblt0.log b/benchmark/results/llama-2-7b.Q4_0__rocm6_4_4-rocwmma__hblt0.log
new file mode 100644
index 0000000..51e1090
--- /dev/null
+++ b/benchmark/results/llama-2-7b.Q4_0__rocm6_4_4-rocwmma__hblt0.log
@@ -0,0 +1,10 @@
+ggml_cuda_init: GGML_CUDA_FORCE_MMQ:    no
+ggml_cuda_init: GGML_CUDA_FORCE_CUBLAS: no
+ggml_cuda_init: found 1 ROCm devices:
+  Device 0: Radeon 8060S Graphics, gfx1151 (0x1151), VMM: no, Wave Size: 32
+| model                          |       size |     params | backend    | ngl | mmap |            test |                  t/s |
+| ------------------------------ | ---------: | ---------: | ---------- | --: | ---: | --------------: | -------------------: |
+| llama 7B Q4_0                  |   3.56 GiB |     6.74 B | ROCm       |  99 |    0 |           pp512 |        899.84 ± 2.29 |
+| llama 7B Q4_0                  |   3.56 GiB |     6.74 B | ROCm       |  99 |    0 |           tg128 |         49.81 ± 0.01 |
+
+build: 4807e8f9 (6609)
diff --git a/benchmark/results/llama-2-7b.Q4_0__rocm6_4_4-rocwmma__hblt0__fa1.log b/benchmark/results/llama-2-7b.Q4_0__rocm6_4_4-rocwmma__hblt0__fa1.log
new file mode 100644
index 0000000..e7d62d8
--- /dev/null
+++ b/benchmark/results/llama-2-7b.Q4_0__rocm6_4_4-rocwmma__hblt0__fa1.log
@@ -0,0 +1,10 @@
+ggml_cuda_init: GGML_CUDA_FORCE_MMQ:    no
+ggml_cuda_init: GGML_CUDA_FORCE_CUBLAS: no
+ggml_cuda_init: found 1 ROCm devices:
+  Device 0: Radeon 8060S Graphics, gfx1151 (0x1151), VMM: no, Wave Size: 32
+| model                          |       size |     params | backend    | ngl | fa | mmap |            test |                  t/s |
+| ------------------------------ | ---------: | ---------: | ---------- | --: | -: | ---: | --------------: | -------------------: |
+| llama 7B Q4_0                  |   3.56 GiB |     6.74 B | ROCm       |  99 |  1 |    0 |           pp512 |       1005.78 ± 1.42 |
+| llama 7B Q4_0                  |   3.56 GiB |     6.74 B | ROCm       |  99 |  1 |    0 |           tg128 |         49.37 ± 0.01 |
+
+build: 4807e8f9 (6609)
diff --git a/benchmark/results/llama-2-7b.Q4_0__rocm6_4_4.log b/benchmark/results/llama-2-7b.Q4_0__rocm6_4_4.log
new file mode 100644
index 0000000..013bd5b
--- /dev/null
+++ b/benchmark/results/llama-2-7b.Q4_0__rocm6_4_4.log
@@ -0,0 +1,10 @@
+ggml_cuda_init: GGML_CUDA_FORCE_MMQ:    no
+ggml_cuda_init: GGML_CUDA_FORCE_CUBLAS: no
+ggml_cuda_init: found 1 ROCm devices:
+  Device 0: Radeon 8060S Graphics, gfx1151 (0x1151), VMM: no, Wave Size: 32
+| model                          |       size |     params | backend    | ngl | mmap |            test |                  t/s |
+| ------------------------------ | ---------: | ---------: | ---------- | --: | ---: | --------------: | -------------------: |
+| llama 7B Q4_0                  |   3.56 GiB |     6.74 B | ROCm       |  99 |    0 |           pp512 |        979.86 ± 1.66 |
+| llama 7B Q4_0                  |   3.56 GiB |     6.74 B | ROCm       |  99 |    0 |           tg128 |         49.87 ± 0.01 |
+
+build: 4807e8f9 (6609)
diff --git a/benchmark/results/llama-2-7b.Q4_0__rocm6_4_4__fa1.log b/benchmark/results/llama-2-7b.Q4_0__rocm6_4_4__fa1.log
new file mode 100644
index 0000000..16d2791
--- /dev/null
+++ b/benchmark/results/llama-2-7b.Q4_0__rocm6_4_4__fa1.log
@@ -0,0 +1,10 @@
+ggml_cuda_init: GGML_CUDA_FORCE_MMQ:    no
+ggml_cuda_init: GGML_CUDA_FORCE_CUBLAS: no
+ggml_cuda_init: found 1 ROCm devices:
+  Device 0: Radeon 8060S Graphics, gfx1151 (0x1151), VMM: no, Wave Size: 32
+| model                          |       size |     params | backend    | ngl | fa | mmap |            test |                  t/s |
+| ------------------------------ | ---------: | ---------: | ---------- | --: | -: | ---: | --------------: | -------------------: |
+| llama 7B Q4_0                  |   3.56 GiB |     6.74 B | ROCm       |  99 |  1 |    0 |           pp512 |       1117.04 ± 3.47 |
+| llama 7B Q4_0                  |   3.56 GiB |     6.74 B | ROCm       |  99 |  1 |    0 |           tg128 |         49.38 ± 0.01 |
+
+build: 4807e8f9 (6609)
diff --git a/benchmark/results/llama-2-7b.Q4_0__rocm6_4_4__hblt0.log b/benchmark/results/llama-2-7b.Q4_0__rocm6_4_4__hblt0.log
new file mode 100644
index 0000000..c0db236
--- /dev/null
+++ b/benchmark/results/llama-2-7b.Q4_0__rocm6_4_4__hblt0.log
@@ -0,0 +1,10 @@
+ggml_cuda_init: GGML_CUDA_FORCE_MMQ:    no
+ggml_cuda_init: GGML_CUDA_FORCE_CUBLAS: no
+ggml_cuda_init: found 1 ROCm devices:
+  Device 0: Radeon 8060S Graphics, gfx1151 (0x1151), VMM: no, Wave Size: 32
+| model                          |       size |     params | backend    | ngl | mmap |            test |                  t/s |
+| ------------------------------ | ---------: | ---------: | ---------- | --: | ---: | --------------: | -------------------: |
+| llama 7B Q4_0                  |   3.56 GiB |     6.74 B | ROCm       |  99 |    0 |           pp512 |        895.65 ± 0.66 |
+| llama 7B Q4_0                  |   3.56 GiB |     6.74 B | ROCm       |  99 |    0 |           tg128 |         49.89 ± 0.00 |
+
+build: 4807e8f9 (6609)
diff --git a/benchmark/results/llama-2-7b.Q4_0__rocm6_4_4__hblt0__fa1.log b/benchmark/results/llama-2-7b.Q4_0__rocm6_4_4__hblt0__fa1.log
new file mode 100644
index 0000000..81ee700
--- /dev/null
+++ b/benchmark/results/llama-2-7b.Q4_0__rocm6_4_4__hblt0__fa1.log
@@ -0,0 +1,10 @@
+ggml_cuda_init: GGML_CUDA_FORCE_MMQ:    no
+ggml_cuda_init: GGML_CUDA_FORCE_CUBLAS: no
+ggml_cuda_init: found 1 ROCm devices:
+  Device 0: Radeon 8060S Graphics, gfx1151 (0x1151), VMM: no, Wave Size: 32
+| model                          |       size |     params | backend    | ngl | fa | mmap |            test |                  t/s |
+| ------------------------------ | ---------: | ---------: | ---------- | --: | -: | ---: | --------------: | -------------------: |
+| llama 7B Q4_0                  |   3.56 GiB |     6.74 B | ROCm       |  99 |  1 |    0 |           pp512 |       1020.22 ± 1.63 |
+| llama 7B Q4_0                  |   3.56 GiB |     6.74 B | ROCm       |  99 |  1 |    0 |           tg128 |         49.36 ± 0.01 |
+
+build: 4807e8f9 (6609)
diff --git a/benchmark/run_benchmarks.sh b/benchmark/run_benchmarks.sh
index 1aca5c0..ab7f79d 100755
--- a/benchmark/run_benchmarks.sh
+++ b/benchmark/run_benchmarks.sh
@@ -41,7 +41,7 @@ for MODEL_PATH in "${MODEL_PATHS[@]}"; do
     CMD="${CMDS[$ENV]}"
 
     # For ROCm 6.4.4 and 7 envs, run default + HIPBLASLT=0 variants; others: default only
-    if [[ "$ENV" == rocm7_* || "$ENV" == rocm6_4_4* ]]; then
+    if [[ "$ENV" == rocm7_* || "$ENV" == rocm6_4_* ]]; then
       HBLT_MODES=( default off )
     else
       HBLT_MODES=( default )
diff --git a/docs/benchmarks.md b/docs/benchmarks.md
index 5e650aa..3e8235b 100644
--- a/docs/benchmarks.md
+++ b/docs/benchmarks.md
@@ -26,9 +26,9 @@
 - Winners per model/test are **margin-aware**; multiple winners are possible when mean±σ overlap
 - Built from the same llama.cpp commit for consistency
 
-**Backends in this dataset:** ROCm 7 RC + ROCWMMA + hipBLASLt, ROCm 7 RC (hipBLASLt), ROCm 7 RC (hipBLASLt OFF), ROCm 7 RC + ROCWMMA (hipBLASLt OFF), ROCm 6.4.3 (hipBLASLt), ROCm 6.4.3 (hipBLASLt OFF), ROCm 6.4.3 + ROCWMMA (hipBLASLt), ROCm 6.4.3 + ROCWMMA (hipBLASLt OFF), Vulkan AMDVLK, Vulkan RADV
+**Backends in this dataset:** ROCm 7 RC + ROCWMMA + hipBLASLt, ROCm 7 RC (hipBLASLt), ROCm 7 RC (hipBLASLt OFF), ROCm 7 RC + ROCWMMA (hipBLASLt OFF), ROCm 6.4.4 (hipBLASLt), ROCm 6.4.4 (hipBLASLt OFF), ROCm 6.4.4 + ROCWMMA (hipBLASLt), ROCm 6.4.4 + ROCWMMA (hipBLASLt OFF), Vulkan AMDVLK, Vulkan RADV
 
-**ROCm hipBLASLt policy:** Toolboxes ship with **hipBLASLt enabled** by default (`ROCBLAS_USE_HIPBLASLT=1`). The benchmark script also runs **hipBLASLt OFF** variants (`-hblt0`) to measure its effect.
+**ROCm 7 hipBLASLt policy:** Toolboxes ship with **hipBLASLt enabled** by default (`ROCBLAS_USE_HIPBLASLT=1`). The benchmark script also runs **hipBLASLt OFF** variants (`-hblt0`) to measure its effect.
 
 ---
 
@@ -38,62 +38,68 @@
 **Prompt Processing (pp512)**
 | Backend | 1st | 2nd | 3rd |
 | --- | ---: | ---: | ---: |
-| ROCm 6.4.3 + ROCWMMA (hipBLASLt) | 9 | 6 | 0 |
-| Vulkan AMDVLK | 4 | 0 | 2 |
-| ROCm 7 RC + ROCWMMA (hipBLASLt OFF) | 3 | 3 | 8 |
-| ROCm 7 RC + ROCWMMA + hipBLASLt | 1 | 8 | 5 |
-| ROCm 6.4.3 + ROCWMMA (hipBLASLt OFF) | 0 | 0 | 1 |
-| Vulkan RADV | 0 | 0 | 1 |
+| ROCm 6.4.4 (hipBLASLt) | 6 | 2 | 2 |
+| Vulkan AMDVLK | 6 | 1 | 0 |
+| ROCm 6.4.4 (hipBLASLt OFF) | 3 | 2 | 3 |
+| Vulkan RADV | 1 | 2 | 0 |
+| ROCm 7 RC (hipBLASLt) | 1 | 1 | 1 |
+| ROCm 6.4.4 + ROCWMMA (hipBLASLt OFF) | 0 | 5 | 4 |
+| ROCm 6.4.4 + ROCWMMA (hipBLASLt) | 0 | 4 | 2 |
+| ROCm 7 RC (hipBLASLt OFF) | 0 | 0 | 2 |
+| ROCm 7 RC + ROCWMMA + hipBLASLt | 0 | 0 | 3 |
 
 **Token Generation (tg128)**
 | Backend | 1st | 2nd | 3rd |
 | --- | ---: | ---: | ---: |
-| Vulkan RADV | 14 | 0 | 0 |
-| ROCm 6.4.3 (hipBLASLt) | 3 | 0 | 1 |
-| ROCm 6.4.3 + ROCWMMA (hipBLASLt) | 1 | 4 | 3 |
-| ROCm 6.4.3 + ROCWMMA (hipBLASLt OFF) | 1 | 2 | 4 |
-| ROCm 6.4.3 (hipBLASLt OFF) | 1 | 1 | 1 |
-| ROCm 7 RC (hipBLASLt) | 1 | 1 | 4 |
-| ROCm 7 RC (hipBLASLt OFF) | 1 | 1 | 2 |
-| ROCm 7 RC + ROCWMMA (hipBLASLt OFF) | 1 | 1 | 1 |
-| Vulkan AMDVLK | 0 | 10 | 0 |
-| ROCm 7 RC + ROCWMMA + hipBLASLt | 0 | 1 | 2 |
+| Vulkan RADV | 10 | 1 | 2 |
+| Vulkan AMDVLK | 3 | 10 | 0 |
+| ROCm 6.4.4 + ROCWMMA (hipBLASLt OFF) | 2 | 3 | 7 |
+| ROCm 6.4.4 (hipBLASLt) | 1 | 4 | 3 |
+| ROCm 6.4.4 (hipBLASLt OFF) | 1 | 3 | 5 |
+| ROCm 6.4.4 + ROCWMMA (hipBLASLt) | 1 | 2 | 6 |
+| ROCm 7 RC (hipBLASLt) | 1 | 0 | 1 |
+| ROCm 7 RC (hipBLASLt OFF) | 0 | 1 | 1 |
+| ROCm 7 RC + ROCWMMA + hipBLASLt | 0 | 1 | 1 |
+| ROCm 7 RC + ROCWMMA (hipBLASLt OFF) | 0 | 1 | 1 |
 
 ### Pairwise head-to-head wins
 For any model+quant where both backends succeeded, this counts who was faster (ties when equal).
 | Comparison | Test | A wins | B wins | Ties | Total |
 | --- | --- | ---: | ---: | ---: | ---: |
-| ROCm 7 RC + ROCWMMA + hipBLASLt vs Vulkan AMDVLK | pp512 | 11 | 5 | 0 | 16 |
-| ROCm 7 RC + ROCWMMA + hipBLASLt vs Vulkan AMDVLK | tg128 | 4 | 11 | 1 | 16 |
-| ROCm 7 RC + ROCWMMA + hipBLASLt vs Vulkan RADV | pp512 | 15 | 2 | 0 | 17 |
-| ROCm 7 RC + ROCWMMA + hipBLASLt vs Vulkan RADV | tg128 | 3 | 14 | 0 | 17 |
-| Vulkan AMDVLK vs Vulkan RADV | pp512 | 14 | 2 | 0 | 16 |
-| Vulkan AMDVLK vs Vulkan RADV | tg128 | 2 | 14 | 0 | 16 |
+| ROCm 7 RC + ROCWMMA + hipBLASLt vs Vulkan AMDVLK | pp512 | 9 | 7 | 0 | 16 |
+| ROCm 7 RC + ROCWMMA + hipBLASLt vs Vulkan AMDVLK | tg128 | 2 | 14 | 0 | 16 |
+| ROCm 7 RC + ROCWMMA + hipBLASLt vs Vulkan RADV | pp512 | 14 | 3 | 0 | 17 |
+| ROCm 7 RC + ROCWMMA + hipBLASLt vs Vulkan RADV | tg128 | 4 | 12 | 1 | 17 |
+| Vulkan AMDVLK vs Vulkan RADV | pp512 | 12 | 4 | 0 | 16 |
+| Vulkan AMDVLK vs Vulkan RADV | tg128 | 5 | 11 | 0 | 16 |
 
 ### Average ranks
 **Prompt Processing (pp512)**
 | Backend | Avg Rank (↓ is better) |
 | --- | ---: |
-| ROCm 6.4.3 + ROCWMMA (hipBLASLt) | 1.4 |
-| Vulkan AMDVLK | 1.67 |
-| ROCm 7 RC + ROCWMMA + hipBLASLt | 2.29 |
-| ROCm 7 RC + ROCWMMA (hipBLASLt OFF) | 2.36 |
-| ROCm 6.4.3 + ROCWMMA (hipBLASLt OFF) | 3.0 |
-| Vulkan RADV | 3.0 |
+| Vulkan AMDVLK | 1.14 |
+| ROCm 6.4.4 (hipBLASLt) | 1.6 |
+| Vulkan RADV | 1.67 |
+| ROCm 6.4.4 (hipBLASLt OFF) | 2.0 |
+| ROCm 7 RC (hipBLASLt) | 2.0 |
+| ROCm 6.4.4 + ROCWMMA (hipBLASLt) | 2.33 |
+| ROCm 6.4.4 + ROCWMMA (hipBLASLt OFF) | 2.44 |
+| ROCm 7 RC (hipBLASLt OFF) | 3.0 |
+| ROCm 7 RC + ROCWMMA + hipBLASLt | 3.0 |
 
 **Token Generation (tg128)**
 | Backend | Avg Rank (↓ is better) |
 | --- | ---: |
-| Vulkan RADV | 1.0 |
-| ROCm 6.4.3 (hipBLASLt) | 1.5 |
-| Vulkan AMDVLK | 2.0 |
-| ROCm 7 RC + ROCWMMA (hipBLASLt OFF) | 2.0 |
-| ROCm 6.4.3 (hipBLASLt OFF) | 2.0 |
-| ROCm 6.4.3 + ROCWMMA (hipBLASLt) | 2.25 |
-| ROCm 7 RC (hipBLASLt OFF) | 2.25 |
-| ROCm 6.4.3 + ROCWMMA (hipBLASLt OFF) | 2.43 |
-| ROCm 7 RC (hipBLASLt) | 2.5 |
-| ROCm 7 RC + ROCWMMA + hipBLASLt | 2.67 |
+| Vulkan RADV | 1.38 |
+| Vulkan AMDVLK | 1.77 |
+| ROCm 7 RC (hipBLASLt) | 2.0 |
+| ROCm 6.4.4 (hipBLASLt) | 2.25 |
+| ROCm 6.4.4 + ROCWMMA (hipBLASLt OFF) | 2.42 |
+| ROCm 6.4.4 (hipBLASLt OFF) | 2.44 |
+| ROCm 7 RC + ROCWMMA + hipBLASLt | 2.5 |
+| ROCm 7 RC (hipBLASLt OFF) | 2.5 |
+| ROCm 7 RC + ROCWMMA (hipBLASLt OFF) | 2.5 |
+| ROCm 6.4.4 + ROCWMMA (hipBLASLt) | 2.56 |
 
 ---
 
@@ -103,54 +109,54 @@ For any model+quant where both backends succeeded, this counts who was faster (t
 Median % change when **Flash Attention ON vs OFF**, paired by model+quant, per backend:
 | Backend | pp512 Δ% (median, min..max, n) | tg128 Δ% (median, min..max, n) |
 | --- | --- | --- |
-| ROCm 7 RC + ROCWMMA + hipBLASLt | 8.8% (3.6..65.6), n=15 | -1.2% (-8.2..-0.3), n=15 |
-| ROCm 7 RC (hipBLASLt) | -20.7% (-30.1..6.5), n=11 | -0.9% (-8.5..3.0), n=11 |
-| ROCm 7 RC (hipBLASLt OFF) | -22.9% (-28.2..-16.1), n=10 | -1.5% (-8.6..0.1), n=10 |
-| ROCm 7 RC + ROCWMMA (hipBLASLt OFF) | 5.8% (1.3..24.1), n=17 | -1.4% (-7.4..15.1), n=17 |
-| ROCm 6.4.3 (hipBLASLt) | -20.9% (-29.8..-11.9), n=13 | -1.2% (-6.9..0.8), n=13 |
-| ROCm 6.4.3 (hipBLASLt OFF) | -10.9% (-22.3..3.6), n=10 | -1.4% (-11.1..0.0), n=10 |
-| ROCm 6.4.3 + ROCWMMA (hipBLASLt) | 11.3% (3.9..25.7), n=16 | -0.7% (-7.5..3.0), n=16 |
-| ROCm 6.4.3 + ROCWMMA (hipBLASLt OFF) | 5.9% (1.8..12.3), n=11 | -0.9% (-6.5..2.3), n=11 |
-| Vulkan AMDVLK | 1.1% (-45.4..20.2), n=16 | -1.3% (-28.6..0.1), n=16 |
-| Vulkan RADV | 3.7% (-2.6..12.5), n=17 | 0.0% (-5.8..2.4), n=17 |
+| ROCm 7 RC + ROCWMMA + hipBLASLt | 11.4% (4.2..34.1), n=17 | -0.5% (-8.8..0.8), n=17 |
+| ROCm 7 RC (hipBLASLt) | 11.7% (-23.0..25.6), n=14 | -1.1% (-8.7..1.0), n=14 |
+| ROCm 7 RC (hipBLASLt OFF) | 6.8% (2.1..18.4), n=15 | -0.8% (-9.0..0.5), n=15 |
+| ROCm 7 RC + ROCWMMA (hipBLASLt OFF) | 6.3% (-5.5..17.4), n=16 | -0.8% (-15.1..0.6), n=16 |
+| ROCm 6.4.4 (hipBLASLt) | 8.3% (5.6..20.8), n=17 | 0.8% (-3.0..2.6), n=17 |
+| ROCm 6.4.4 (hipBLASLt OFF) | 7.2% (-0.5..19.5), n=17 | 1.1% (-2.9..2.7), n=17 |
+| ROCm 6.4.4 + ROCWMMA (hipBLASLt) | 7.1% (5.0..19.9), n=17 | 0.9% (-2.8..2.8), n=17 |
+| ROCm 6.4.4 + ROCWMMA (hipBLASLt OFF) | 6.5% (2.7..18.6), n=17 | 1.1% (-2.7..3.4), n=17 |
+| Vulkan AMDVLK | 1.3% (-10.8..27.8), n=16 | -1.2% (-6.8..0.1), n=16 |
+| Vulkan RADV | 4.8% (-0.5..20.1), n=17 | -0.1% (-2.1..2.0), n=17 |
 
 ### Impact of ROCWMMA
 | Context | Test | Compared Envs | Pairs | Median Δ% |
 | --- | --- | --- | ---: | ---: |
-| ROCm 7 RC (hipBLASLt) | pp512 | ROCm 7 RC + ROCWMMA + hipBLASLt vs ROCm 7 RC (hipBLASLt) | 17 | 17.6% |
-| ROCm 7 RC (hipBLASLt) | tg128 | ROCm 7 RC + ROCWMMA + hipBLASLt vs ROCm 7 RC (hipBLASLt) | 17 | -0.8% |
-| ROCm 7 RC (hipBLASLt OFF) | pp512 | ROCm 7 RC + ROCWMMA (hipBLASLt OFF) vs ROCm 7 RC (hipBLASLt OFF) | 16 | 14.6% |
-| ROCm 7 RC (hipBLASLt OFF) | tg128 | ROCm 7 RC + ROCWMMA (hipBLASLt OFF) vs ROCm 7 RC (hipBLASLt OFF) | 16 | -0.9% |
-| ROCm 6.4.3 (hipBLASLt) | pp512 | ROCm 6.4.3 + ROCWMMA (hipBLASLt) vs ROCm 6.4.3 (hipBLASLt) | 16 | 17.5% |
-| ROCm 6.4.3 (hipBLASLt) | tg128 | ROCm 6.4.3 + ROCWMMA (hipBLASLt) vs ROCm 6.4.3 (hipBLASLt) | 16 | -0.3% |
-| ROCm 6.4.3 (hipBLASLt OFF) | pp512 | ROCm 6.4.3 + ROCWMMA (hipBLASLt OFF) vs ROCm 6.4.3 (hipBLASLt OFF) | 10 | 9.7% |
-| ROCm 6.4.3 (hipBLASLt OFF) | tg128 | ROCm 6.4.3 + ROCWMMA (hipBLASLt OFF) vs ROCm 6.4.3 (hipBLASLt OFF) | 10 | 0.2% |
+| ROCm 7 RC (hipBLASLt) | pp512 | ROCm 7 RC + ROCWMMA + hipBLASLt vs ROCm 7 RC (hipBLASLt) | 15 | -0.0% |
+| ROCm 7 RC (hipBLASLt) | tg128 | ROCm 7 RC + ROCWMMA + hipBLASLt vs ROCm 7 RC (hipBLASLt) | 15 | 0.0% |
+| ROCm 7 RC (hipBLASLt OFF) | pp512 | ROCm 7 RC + ROCWMMA (hipBLASLt OFF) vs ROCm 7 RC (hipBLASLt OFF) | 17 | -0.2% |
+| ROCm 7 RC (hipBLASLt OFF) | tg128 | ROCm 7 RC + ROCWMMA (hipBLASLt OFF) vs ROCm 7 RC (hipBLASLt OFF) | 17 | 0.0% |
+| ROCm 6.4.4 (hipBLASLt) | pp512 | ROCm 6.4.4 + ROCWMMA (hipBLASLt) vs ROCm 6.4.4 (hipBLASLt) | 17 | -0.4% |
+| ROCm 6.4.4 (hipBLASLt) | tg128 | ROCm 6.4.4 + ROCWMMA (hipBLASLt) vs ROCm 6.4.4 (hipBLASLt) | 17 | 0.0% |
+| ROCm 6.4.4 (hipBLASLt OFF) | pp512 | ROCm 6.4.4 + ROCWMMA (hipBLASLt OFF) vs ROCm 6.4.4 (hipBLASLt OFF) | 17 | -0.5% |
+| ROCm 6.4.4 (hipBLASLt OFF) | tg128 | ROCm 6.4.4 + ROCWMMA (hipBLASLt OFF) vs ROCm 6.4.4 (hipBLASLt OFF) | 17 | -0.1% |
 
 ### Impact of hipBLASLt
 | Context | Test | Compared Envs | Pairs | Median Δ% |
 | --- | --- | --- | ---: | ---: |
-| ROCm 7 RC (no ROCWMMA) | pp512 | ROCm 7 RC (hipBLASLt) vs ROCm 7 RC (hipBLASLt OFF) | 16 | 0.4% |
-| ROCm 7 RC (no ROCWMMA) | tg128 | ROCm 7 RC (hipBLASLt) vs ROCm 7 RC (hipBLASLt OFF) | 16 | -0.1% |
-| ROCm 7 RC + ROCWMMA | pp512 | ROCm 7 RC + ROCWMMA + hipBLASLt vs ROCm 7 RC + ROCWMMA (hipBLASLt OFF) | 17 | 2.0% |
+| ROCm 7 RC (no ROCWMMA) | pp512 | ROCm 7 RC (hipBLASLt) vs ROCm 7 RC (hipBLASLt OFF) | 15 | -0.2% |
+| ROCm 7 RC (no ROCWMMA) | tg128 | ROCm 7 RC (hipBLASLt) vs ROCm 7 RC (hipBLASLt OFF) | 15 | 0.0% |
+| ROCm 7 RC + ROCWMMA | pp512 | ROCm 7 RC + ROCWMMA + hipBLASLt vs ROCm 7 RC + ROCWMMA (hipBLASLt OFF) | 17 | -0.1% |
 | ROCm 7 RC + ROCWMMA | tg128 | ROCm 7 RC + ROCWMMA + hipBLASLt vs ROCm 7 RC + ROCWMMA (hipBLASLt OFF) | 17 | 0.0% |
-| ROCm 6.4.3 (no ROCWMMA) | pp512 | ROCm 6.4.3 (hipBLASLt) vs ROCm 6.4.3 (hipBLASLt OFF) | 10 | 154.8% |
-| ROCm 6.4.3 (no ROCWMMA) | tg128 | ROCm 6.4.3 (hipBLASLt) vs ROCm 6.4.3 (hipBLASLt OFF) | 10 | 0.0% |
-| ROCm 6.4.3 + ROCWMMA | pp512 | ROCm 6.4.3 + ROCWMMA (hipBLASLt) vs ROCm 6.4.3 + ROCWMMA (hipBLASLt OFF) | 14 | 117.0% |
-| ROCm 6.4.3 + ROCWMMA | tg128 | ROCm 6.4.3 + ROCWMMA (hipBLASLt) vs ROCm 6.4.3 + ROCWMMA (hipBLASLt OFF) | 14 | -0.0% |
+| ROCm 6.4.4 (no ROCWMMA) | pp512 | ROCm 6.4.4 (hipBLASLt) vs ROCm 6.4.4 (hipBLASLt OFF) | 17 | 0.0% |
+| ROCm 6.4.4 (no ROCWMMA) | tg128 | ROCm 6.4.4 (hipBLASLt) vs ROCm 6.4.4 (hipBLASLt OFF) | 17 | 0.0% |
+| ROCm 6.4.4 + ROCWMMA | pp512 | ROCm 6.4.4 + ROCWMMA (hipBLASLt) vs ROCm 6.4.4 + ROCWMMA (hipBLASLt OFF) | 17 | -0.3% |
+| ROCm 6.4.4 + ROCWMMA | tg128 | ROCm 6.4.4 + ROCWMMA (hipBLASLt) vs ROCm 6.4.4 + ROCWMMA (hipBLASLt OFF) | 17 | 0.0% |
 
 ### Vulkan: AMDVLK vs RADV
 Head-to-head wins with selected Flash Attention filter:
 | Test | AMDVLK wins | RADV wins | Ties | Total |
 | --- | ---: | ---: | ---: | ---: |
-| pp512 | 14 | 2 | 0 | 16 |
-| tg128 | 2 | 14 | 0 | 16 |
+| pp512 | 12 | 4 | 0 | 16 |
+| tg128 | 5 | 11 | 0 | 16 |
 
 ---
 
 ## Recommendations
-- **Fastest prompt processing:** ROCm 6.4.3 + ROCWMMA (hipBLASLt) (most 1st-place finishes with selected Flash Attention filter).
+- **Fastest prompt processing:** Vulkan AMDVLK and ROCm 6.4.4 (hipBLASLt) (tied for most 1st-place finishes with selected Flash Attention filter).
 - **Fastest token generation:** Vulkan RADV (most 1st-place finishes with selected Flash Attention filter).
-- **Balanced choice:** ROCm 6.4.3 + ROCWMMA (hipBLASLt) (consistently near the top across PP/TG).
+- **Balanced choice:** Vulkan AMDVLK (consistently near the top across PP/TG).
 
 ---
 
diff --git a/docs/index.html b/docs/index.html
index 6cd3e4a..010ca08 100644
--- a/docs/index.html
+++ b/docs/index.html
@@ -523,7 +523,7 @@
                 <div class="colbox">
                 <label for="${id}">
                 <input id="${id}" type="checkbox" data-env="${env}"
-                    ${/(vulkan_amdvlk|vulkan_radv|rocm6_4_3-rocwmma|rocm7_rc-rocwmma)(?![-\w])/.test(env.trim()) ? 'checked' : ''}>
+                    ${/(vulkan_amdvlk|vulkan_radv|rocm6_4_4-rocwmma|rocm7_rc-rocwmma)(?![-\w])/.test(env.trim()) ? 'checked' : ''}>
                     <span>
                     <strong>${env}</strong>
                     ${roc ? '<span class="badge roc">rocWMMA</span>' : ''}
diff --git a/docs/results.json b/docs/results.json
index 695b5e1..5cc4a33 100644
--- a/docs/results.json
+++ b/docs/results.json
@@ -1,8 +1,12 @@
 {
   "meta": {
-    "generated_at": "2025-09-16T21:05:07Z",
+    "generated_at": "2025-09-28T08:26:03Z",
     "os_kernel": "Fedora 42 \u2014 Linux 6.15.9-201.fc42.x86_64 (Sat Aug  2 11:37:34 UTC 2025)",
     "llamacpp_builds": [
+      {
+        "hash": "4807e8f9",
+        "number": "6609"
+      },
       {
         "hash": "f1fbffb5",
         "number": "6486"
@@ -13,6 +17,10 @@
       "rocm6_4_3-hblt0",
       "rocm6_4_3-rocwmma",
       "rocm6_4_3-rocwmma-hblt0",
+      "rocm6_4_4",
+      "rocm6_4_4-hblt0",
+      "rocm6_4_4-rocwmma",
+      "rocm6_4_4-rocwmma-hblt0",
       "rocm7_rc",
       "rocm7_rc-hblt0",
       "rocm7_rc-rocwmma",
@@ -425,6 +433,406 @@
         "number": "6486"
       }
     },
+    {
+      "model": "GLM-4.5-Air-UD-Q4_K_XL-00001-of-00002",
+      "model_clean": "GLM-4.5-Air-UD-Q4_K_XL",
+      "env": "rocm6_4_4-rocwmma",
+      "env_base": "rocm6_4_4",
+      "env_variant": "rocwmma",
+      "fa": false,
+      "test": "pp512",
+      "tps_mean": 128.18,
+      "tps_std": 0.37,
+      "error": false,
+      "error_type": null,
+      "backend": "ROCm",
+      "ngl": 99,
+      "mmap": 0,
+      "params_b": 110.47,
+      "file_size_gib": 68.01,
+      "name_params_b": 110.47,
+      "quant": "Q4_K_XL",
+      "log": "results/GLM-4.5-Air-UD-Q4_K_XL-00001-of-00002__rocm6_4_4-rocwmma.log",
+      "build": {
+        "hash": "4807e8f9",
+        "number": "6609"
+      }
+    },
+    {
+      "model": "GLM-4.5-Air-UD-Q4_K_XL-00001-of-00002",
+      "model_clean": "GLM-4.5-Air-UD-Q4_K_XL",
+      "env": "rocm6_4_4-rocwmma",
+      "env_base": "rocm6_4_4",
+      "env_variant": "rocwmma",
+      "fa": false,
+      "test": "tg128",
+      "tps_mean": 20.51,
+      "tps_std": 0.0,
+      "error": false,
+      "error_type": null,
+      "backend": "ROCm",
+      "ngl": 99,
+      "mmap": 0,
+      "params_b": 110.47,
+      "file_size_gib": 68.01,
+      "name_params_b": 110.47,
+      "quant": "Q4_K_XL",
+      "log": "results/GLM-4.5-Air-UD-Q4_K_XL-00001-of-00002__rocm6_4_4-rocwmma.log",
+      "build": {
+        "hash": "4807e8f9",
+        "number": "6609"
+      }
+    },
+    {
+      "model": "GLM-4.5-Air-UD-Q4_K_XL-00001-of-00002",
+      "model_clean": "GLM-4.5-Air-UD-Q4_K_XL",
+      "env": "rocm6_4_4-rocwmma",
+      "env_base": "rocm6_4_4",
+      "env_variant": "rocwmma",
+      "fa": true,
+      "test": "pp512",
+      "tps_mean": 134.92,
+      "tps_std": 0.21,
+      "error": false,
+      "error_type": null,
+      "backend": "ROCm",
+      "ngl": 99,
+      "mmap": 0,
+      "params_b": 110.47,
+      "file_size_gib": 68.01,
+      "name_params_b": 110.47,
+      "quant": "Q4_K_XL",
+      "log": "results/GLM-4.5-Air-UD-Q4_K_XL-00001-of-00002__rocm6_4_4-rocwmma__fa1.log",
+      "build": {
+        "hash": "4807e8f9",
+        "number": "6609"
+      }
+    },
+    {
+      "model": "GLM-4.5-Air-UD-Q4_K_XL-00001-of-00002",
+      "model_clean": "GLM-4.5-Air-UD-Q4_K_XL",
+      "env": "rocm6_4_4-rocwmma",
+      "env_base": "rocm6_4_4",
+      "env_variant": "rocwmma",
+      "fa": true,
+      "test": "tg128",
+      "tps_mean": 21.08,
+      "tps_std": 0.0,
+      "error": false,
+      "error_type": null,
+      "backend": "ROCm",
+      "ngl": 99,
+      "mmap": 0,
+      "params_b": 110.47,
+      "file_size_gib": 68.01,
+      "name_params_b": 110.47,
+      "quant": "Q4_K_XL",
+      "log": "results/GLM-4.5-Air-UD-Q4_K_XL-00001-of-00002__rocm6_4_4-rocwmma__fa1.log",
+      "build": {
+        "hash": "4807e8f9",
+        "number": "6609"
+      }
+    },
+    {
+      "model": "GLM-4.5-Air-UD-Q4_K_XL-00001-of-00002",
+      "model_clean": "GLM-4.5-Air-UD-Q4_K_XL",
+      "env": "rocm6_4_4-rocwmma-hblt0",
+      "env_base": "rocm6_4_4",
+      "env_variant": "rocwmma-hblt0",
+      "fa": false,
+      "test": "pp512",
+      "tps_mean": 159.31,
+      "tps_std": 0.83,
+      "error": false,
+      "error_type": null,
+      "backend": "ROCm",
+      "ngl": 99,
+      "mmap": 0,
+      "params_b": 110.47,
+      "file_size_gib": 68.01,
+      "name_params_b": 110.47,
+      "quant": "Q4_K_XL",
+      "log": "results/GLM-4.5-Air-UD-Q4_K_XL-00001-of-00002__rocm6_4_4-rocwmma__hblt0.log",
+      "build": {
+        "hash": "4807e8f9",
+        "number": "6609"
+      }
+    },
+    {
+      "model": "GLM-4.5-Air-UD-Q4_K_XL-00001-of-00002",
+      "model_clean": "GLM-4.5-Air-UD-Q4_K_XL",
+      "env": "rocm6_4_4-rocwmma-hblt0",
+      "env_base": "rocm6_4_4",
+      "env_variant": "rocwmma-hblt0",
+      "fa": false,
+      "test": "tg128",
+      "tps_mean": 20.34,
+      "tps_std": 0.01,
+      "error": false,
+      "error_type": null,
+      "backend": "ROCm",
+      "ngl": 99,
+      "mmap": 0,
+      "params_b": 110.47,
+      "file_size_gib": 68.01,
+      "name_params_b": 110.47,
+      "quant": "Q4_K_XL",
+      "log": "results/GLM-4.5-Air-UD-Q4_K_XL-00001-of-00002__rocm6_4_4-rocwmma__hblt0.log",
+      "build": {
+        "hash": "4807e8f9",
+        "number": "6609"
+      }
+    },
+    {
+      "model": "GLM-4.5-Air-UD-Q4_K_XL-00001-of-00002",
+      "model_clean": "GLM-4.5-Air-UD-Q4_K_XL",
+      "env": "rocm6_4_4-rocwmma-hblt0",
+      "env_base": "rocm6_4_4",
+      "env_variant": "rocwmma-hblt0",
+      "fa": true,
+      "test": "pp512",
+      "tps_mean": 171.67,
+      "tps_std": 0.36,
+      "error": false,
+      "error_type": null,
+      "backend": "ROCm",
+      "ngl": 99,
+      "mmap": 0,
+      "params_b": 110.47,
+      "file_size_gib": 68.01,
+      "name_params_b": 110.47,
+      "quant": "Q4_K_XL",
+      "log": "results/GLM-4.5-Air-UD-Q4_K_XL-00001-of-00002__rocm6_4_4-rocwmma__hblt0__fa1.log",
+      "build": {
+        "hash": "4807e8f9",
+        "number": "6609"
+      }
+    },
+    {
+      "model": "GLM-4.5-Air-UD-Q4_K_XL-00001-of-00002",
+      "model_clean": "GLM-4.5-Air-UD-Q4_K_XL",
+      "env": "rocm6_4_4-rocwmma-hblt0",
+      "env_base": "rocm6_4_4",
+      "env_variant": "rocwmma-hblt0",
+      "fa": true,
+      "test": "tg128",
+      "tps_mean": 21.04,
+      "tps_std": 0.0,
+      "error": false,
+      "error_type": null,
+      "backend": "ROCm",
+      "ngl": 99,
+      "mmap": 0,
+      "params_b": 110.47,
+      "file_size_gib": 68.01,
+      "name_params_b": 110.47,
+      "quant": "Q4_K_XL",
+      "log": "results/GLM-4.5-Air-UD-Q4_K_XL-00001-of-00002__rocm6_4_4-rocwmma__hblt0__fa1.log",
+      "build": {
+        "hash": "4807e8f9",
+        "number": "6609"
+      }
+    },
+    {
+      "model": "GLM-4.5-Air-UD-Q4_K_XL-00001-of-00002",
+      "model_clean": "GLM-4.5-Air-UD-Q4_K_XL",
+      "env": "rocm6_4_4",
+      "env_base": "rocm6_4_4",
+      "env_variant": null,
+      "fa": false,
+      "test": "pp512",
+      "tps_mean": 128.02,
+      "tps_std": 0.3,
+      "error": false,
+      "error_type": null,
+      "backend": "ROCm",
+      "ngl": 99,
+      "mmap": 0,
+      "params_b": 110.47,
+      "file_size_gib": 68.01,
+      "name_params_b": 110.47,
+      "quant": "Q4_K_XL",
+      "log": "results/GLM-4.5-Air-UD-Q4_K_XL-00001-of-00002__rocm6_4_4.log",
+      "build": {
+        "hash": "4807e8f9",
+        "number": "6609"
+      }
+    },
+    {
+      "model": "GLM-4.5-Air-UD-Q4_K_XL-00001-of-00002",
+      "model_clean": "GLM-4.5-Air-UD-Q4_K_XL",
+      "env": "rocm6_4_4",
+      "env_base": "rocm6_4_4",
+      "env_variant": null,
+      "fa": false,
+      "test": "tg128",
+      "tps_mean": 20.53,
+      "tps_std": 0.01,
+      "error": false,
+      "error_type": null,
+      "backend": "ROCm",
+      "ngl": 99,
+      "mmap": 0,
+      "params_b": 110.47,
+      "file_size_gib": 68.01,
+      "name_params_b": 110.47,
+      "quant": "Q4_K_XL",
+      "log": "results/GLM-4.5-Air-UD-Q4_K_XL-00001-of-00002__rocm6_4_4.log",
+      "build": {
+        "hash": "4807e8f9",
+        "number": "6609"
+      }
+    },
+    {
+      "model": "GLM-4.5-Air-UD-Q4_K_XL-00001-of-00002",
+      "model_clean": "GLM-4.5-Air-UD-Q4_K_XL",
+      "env": "rocm6_4_4",
+      "env_base": "rocm6_4_4",
+      "env_variant": null,
+      "fa": true,
+      "test": "pp512",
+      "tps_mean": 136.15,
+      "tps_std": 0.32,
+      "error": false,
+      "error_type": null,
+      "backend": "ROCm",
+      "ngl": 99,
+      "mmap": 0,
+      "params_b": 110.47,
+      "file_size_gib": 68.01,
+      "name_params_b": 110.47,
+      "quant": "Q4_K_XL",
+      "log": "results/GLM-4.5-Air-UD-Q4_K_XL-00001-of-00002__rocm6_4_4__fa1.log",
+      "build": {
+        "hash": "4807e8f9",
+        "number": "6609"
+      }
+    },
+    {
+      "model": "GLM-4.5-Air-UD-Q4_K_XL-00001-of-00002",
+      "model_clean": "GLM-4.5-Air-UD-Q4_K_XL",
+      "env": "rocm6_4_4",
+      "env_base": "rocm6_4_4",
+      "env_variant": null,
+      "fa": true,
+      "test": "tg128",
+      "tps_mean": 21.05,
+      "tps_std": 0.0,
+      "error": false,
+      "error_type": null,
+      "backend": "ROCm",
+      "ngl": 99,
+      "mmap": 0,
+      "params_b": 110.47,
+      "file_size_gib": 68.01,
+      "name_params_b": 110.47,
+      "quant": "Q4_K_XL",
+      "log": "results/GLM-4.5-Air-UD-Q4_K_XL-00001-of-00002__rocm6_4_4__fa1.log",
+      "build": {
+        "hash": "4807e8f9",
+        "number": "6609"
+      }
+    },
+    {
+      "model": "GLM-4.5-Air-UD-Q4_K_XL-00001-of-00002",
+      "model_clean": "GLM-4.5-Air-UD-Q4_K_XL",
+      "env": "rocm6_4_4-hblt0",
+      "env_base": "rocm6_4_4",
+      "env_variant": "hblt0",
+      "fa": false,
+      "test": "pp512",
+      "tps_mean": 160.41,
+      "tps_std": 0.61,
+      "error": false,
+      "error_type": null,
+      "backend": "ROCm",
+      "ngl": 99,
+      "mmap": 0,
+      "params_b": 110.47,
+      "file_size_gib": 68.01,
+      "name_params_b": 110.47,
+      "quant": "Q4_K_XL",
+      "log": "results/GLM-4.5-Air-UD-Q4_K_XL-00001-of-00002__rocm6_4_4__hblt0.log",
+      "build": {
+        "hash": "4807e8f9",
+        "number": "6609"
+      }
+    },
+    {
+      "model": "GLM-4.5-Air-UD-Q4_K_XL-00001-of-00002",
+      "model_clean": "GLM-4.5-Air-UD-Q4_K_XL",
+      "env": "rocm6_4_4-hblt0",
+      "env_base": "rocm6_4_4",
+      "env_variant": "hblt0",
+      "fa": false,
+      "test": "tg128",
+      "tps_mean": 20.5,
+      "tps_std": 0.01,
+      "error": false,
+      "error_type": null,
+      "backend": "ROCm",
+      "ngl": 99,
+      "mmap": 0,
+      "params_b": 110.47,
+      "file_size_gib": 68.01,
+      "name_params_b": 110.47,
+      "quant": "Q4_K_XL",
+      "log": "results/GLM-4.5-Air-UD-Q4_K_XL-00001-of-00002__rocm6_4_4__hblt0.log",
+      "build": {
+        "hash": "4807e8f9",
+        "number": "6609"
+      }
+    },
+    {
+      "model": "GLM-4.5-Air-UD-Q4_K_XL-00001-of-00002",
+      "model_clean": "GLM-4.5-Air-UD-Q4_K_XL",
+      "env": "rocm6_4_4-hblt0",
+      "env_base": "rocm6_4_4",
+      "env_variant": "hblt0",
+      "fa": true,
+      "test": "pp512",
+      "tps_mean": 161.32,
+      "tps_std": 0.19,
+      "error": false,
+      "error_type": null,
+      "backend": "ROCm",
+      "ngl": 99,
+      "mmap": 0,
+      "params_b": 110.47,
+      "file_size_gib": 68.01,
+      "name_params_b": 110.47,
+      "quant": "Q4_K_XL",
+      "log": "results/GLM-4.5-Air-UD-Q4_K_XL-00001-of-00002__rocm6_4_4__hblt0__fa1.log",
+      "build": {
+        "hash": "4807e8f9",
+        "number": "6609"
+      }
+    },
+    {
+      "model": "GLM-4.5-Air-UD-Q4_K_XL-00001-of-00002",
+      "model_clean": "GLM-4.5-Air-UD-Q4_K_XL",
+      "env": "rocm6_4_4-hblt0",
+      "env_base": "rocm6_4_4",
+      "env_variant": "hblt0",
+      "fa": true,
+      "test": "tg128",
+      "tps_mean": 21.06,
+      "tps_std": 0.0,
+      "error": false,
+      "error_type": null,
+      "backend": "ROCm",
+      "ngl": 99,
+      "mmap": 0,
+      "params_b": 110.47,
+      "file_size_gib": 68.01,
+      "name_params_b": 110.47,
+      "quant": "Q4_K_XL",
+      "log": "results/GLM-4.5-Air-UD-Q4_K_XL-00001-of-00002__rocm6_4_4__hblt0__fa1.log",
+      "build": {
+        "hash": "4807e8f9",
+        "number": "6609"
+      }
+    },
     {
       "model": "GLM-4.5-Air-UD-Q4_K_XL-00001-of-00002",
       "model_clean": "GLM-4.5-Air-UD-Q4_K_XL",
@@ -1541,6 +1949,406 @@
         "number": "6486"
       }
     },
+    {
+      "model": "GLM-4.5-Air-UD-Q6_K_XL-00001-of-00003",
+      "model_clean": "GLM-4.5-Air-UD-Q6_K_XL",
+      "env": "rocm6_4_4-rocwmma",
+      "env_base": "rocm6_4_4",
+      "env_variant": "rocwmma",
+      "fa": false,
+      "test": "pp512",
+      "tps_mean": 123.24,
+      "tps_std": 0.42,
+      "error": false,
+      "error_type": null,
+      "backend": "ROCm",
+      "ngl": 99,
+      "mmap": 0,
+      "params_b": 110.47,
+      "file_size_gib": 94.57,
+      "name_params_b": 110.47,
+      "quant": "Q6_K_XL",
+      "log": "results/GLM-4.5-Air-UD-Q6_K_XL-00001-of-00003__rocm6_4_4-rocwmma.log",
+      "build": {
+        "hash": "4807e8f9",
+        "number": "6609"
+      }
+    },
+    {
+      "model": "GLM-4.5-Air-UD-Q6_K_XL-00001-of-00003",
+      "model_clean": "GLM-4.5-Air-UD-Q6_K_XL",
+      "env": "rocm6_4_4-rocwmma",
+      "env_base": "rocm6_4_4",
+      "env_variant": "rocwmma",
+      "fa": false,
+      "test": "tg128",
+      "tps_mean": 15.84,
+      "tps_std": 0.0,
+      "error": false,
+      "error_type": null,
+      "backend": "ROCm",
+      "ngl": 99,
+      "mmap": 0,
+      "params_b": 110.47,
+      "file_size_gib": 94.57,
+      "name_params_b": 110.47,
+      "quant": "Q6_K_XL",
+      "log": "results/GLM-4.5-Air-UD-Q6_K_XL-00001-of-00003__rocm6_4_4-rocwmma.log",
+      "build": {
+        "hash": "4807e8f9",
+        "number": "6609"
+      }
+    },
+    {
+      "model": "GLM-4.5-Air-UD-Q6_K_XL-00001-of-00003",
+      "model_clean": "GLM-4.5-Air-UD-Q6_K_XL",
+      "env": "rocm6_4_4-rocwmma",
+      "env_base": "rocm6_4_4",
+      "env_variant": "rocwmma",
+      "fa": true,
+      "test": "pp512",
+      "tps_mean": 129.37,
+      "tps_std": 0.24,
+      "error": false,
+      "error_type": null,
+      "backend": "ROCm",
+      "ngl": 99,
+      "mmap": 0,
+      "params_b": 110.47,
+      "file_size_gib": 94.57,
+      "name_params_b": 110.47,
+      "quant": "Q6_K_XL",
+      "log": "results/GLM-4.5-Air-UD-Q6_K_XL-00001-of-00003__rocm6_4_4-rocwmma__fa1.log",
+      "build": {
+        "hash": "4807e8f9",
+        "number": "6609"
+      }
+    },
+    {
+      "model": "GLM-4.5-Air-UD-Q6_K_XL-00001-of-00003",
+      "model_clean": "GLM-4.5-Air-UD-Q6_K_XL",
+      "env": "rocm6_4_4-rocwmma",
+      "env_base": "rocm6_4_4",
+      "env_variant": "rocwmma",
+      "fa": true,
+      "test": "tg128",
+      "tps_mean": 16.17,
+      "tps_std": 0.0,
+      "error": false,
+      "error_type": null,
+      "backend": "ROCm",
+      "ngl": 99,
+      "mmap": 0,
+      "params_b": 110.47,
+      "file_size_gib": 94.57,
+      "name_params_b": 110.47,
+      "quant": "Q6_K_XL",
+      "log": "results/GLM-4.5-Air-UD-Q6_K_XL-00001-of-00003__rocm6_4_4-rocwmma__fa1.log",
+      "build": {
+        "hash": "4807e8f9",
+        "number": "6609"
+      }
+    },
+    {
+      "model": "GLM-4.5-Air-UD-Q6_K_XL-00001-of-00003",
+      "model_clean": "GLM-4.5-Air-UD-Q6_K_XL",
+      "env": "rocm6_4_4-rocwmma-hblt0",
+      "env_base": "rocm6_4_4",
+      "env_variant": "rocwmma-hblt0",
+      "fa": false,
+      "test": "pp512",
+      "tps_mean": 151.03,
+      "tps_std": 0.45,
+      "error": false,
+      "error_type": null,
+      "backend": "ROCm",
+      "ngl": 99,
+      "mmap": 0,
+      "params_b": 110.47,
+      "file_size_gib": 94.57,
+      "name_params_b": 110.47,
+      "quant": "Q6_K_XL",
+      "log": "results/GLM-4.5-Air-UD-Q6_K_XL-00001-of-00003__rocm6_4_4-rocwmma__hblt0.log",
+      "build": {
+        "hash": "4807e8f9",
+        "number": "6609"
+      }
+    },
+    {
+      "model": "GLM-4.5-Air-UD-Q6_K_XL-00001-of-00003",
+      "model_clean": "GLM-4.5-Air-UD-Q6_K_XL",
+      "env": "rocm6_4_4-rocwmma-hblt0",
+      "env_base": "rocm6_4_4",
+      "env_variant": "rocwmma-hblt0",
+      "fa": false,
+      "test": "tg128",
+      "tps_mean": 15.79,
+      "tps_std": 0.0,
+      "error": false,
+      "error_type": null,
+      "backend": "ROCm",
+      "ngl": 99,
+      "mmap": 0,
+      "params_b": 110.47,
+      "file_size_gib": 94.57,
+      "name_params_b": 110.47,
+      "quant": "Q6_K_XL",
+      "log": "results/GLM-4.5-Air-UD-Q6_K_XL-00001-of-00003__rocm6_4_4-rocwmma__hblt0.log",
+      "build": {
+        "hash": "4807e8f9",
+        "number": "6609"
+      }
+    },
+    {
+      "model": "GLM-4.5-Air-UD-Q6_K_XL-00001-of-00003",
+      "model_clean": "GLM-4.5-Air-UD-Q6_K_XL",
+      "env": "rocm6_4_4-rocwmma-hblt0",
+      "env_base": "rocm6_4_4",
+      "env_variant": "rocwmma-hblt0",
+      "fa": true,
+      "test": "pp512",
+      "tps_mean": 155.49,
+      "tps_std": 0.74,
+      "error": false,
+      "error_type": null,
+      "backend": "ROCm",
+      "ngl": 99,
+      "mmap": 0,
+      "params_b": 110.47,
+      "file_size_gib": 94.57,
+      "name_params_b": 110.47,
+      "quant": "Q6_K_XL",
+      "log": "results/GLM-4.5-Air-UD-Q6_K_XL-00001-of-00003__rocm6_4_4-rocwmma__hblt0__fa1.log",
+      "build": {
+        "hash": "4807e8f9",
+        "number": "6609"
+      }
+    },
+    {
+      "model": "GLM-4.5-Air-UD-Q6_K_XL-00001-of-00003",
+      "model_clean": "GLM-4.5-Air-UD-Q6_K_XL",
+      "env": "rocm6_4_4-rocwmma-hblt0",
+      "env_base": "rocm6_4_4",
+      "env_variant": "rocwmma-hblt0",
+      "fa": true,
+      "test": "tg128",
+      "tps_mean": 16.18,
+      "tps_std": 0.0,
+      "error": false,
+      "error_type": null,
+      "backend": "ROCm",
+      "ngl": 99,
+      "mmap": 0,
+      "params_b": 110.47,
+      "file_size_gib": 94.57,
+      "name_params_b": 110.47,
+      "quant": "Q6_K_XL",
+      "log": "results/GLM-4.5-Air-UD-Q6_K_XL-00001-of-00003__rocm6_4_4-rocwmma__hblt0__fa1.log",
+      "build": {
+        "hash": "4807e8f9",
+        "number": "6609"
+      }
+    },
+    {
+      "model": "GLM-4.5-Air-UD-Q6_K_XL-00001-of-00003",
+      "model_clean": "GLM-4.5-Air-UD-Q6_K_XL",
+      "env": "rocm6_4_4",
+      "env_base": "rocm6_4_4",
+      "env_variant": null,
+      "fa": false,
+      "test": "pp512",
+      "tps_mean": 122.48,
+      "tps_std": 0.34,
+      "error": false,
+      "error_type": null,
+      "backend": "ROCm",
+      "ngl": 99,
+      "mmap": 0,
+      "params_b": 110.47,
+      "file_size_gib": 94.57,
+      "name_params_b": 110.47,
+      "quant": "Q6_K_XL",
+      "log": "results/GLM-4.5-Air-UD-Q6_K_XL-00001-of-00003__rocm6_4_4.log",
+      "build": {
+        "hash": "4807e8f9",
+        "number": "6609"
+      }
+    },
+    {
+      "model": "GLM-4.5-Air-UD-Q6_K_XL-00001-of-00003",
+      "model_clean": "GLM-4.5-Air-UD-Q6_K_XL",
+      "env": "rocm6_4_4",
+      "env_base": "rocm6_4_4",
+      "env_variant": null,
+      "fa": false,
+      "test": "tg128",
+      "tps_mean": 15.86,
+      "tps_std": 0.01,
+      "error": false,
+      "error_type": null,
+      "backend": "ROCm",
+      "ngl": 99,
+      "mmap": 0,
+      "params_b": 110.47,
+      "file_size_gib": 94.57,
+      "name_params_b": 110.47,
+      "quant": "Q6_K_XL",
+      "log": "results/GLM-4.5-Air-UD-Q6_K_XL-00001-of-00003__rocm6_4_4.log",
+      "build": {
+        "hash": "4807e8f9",
+        "number": "6609"
+      }
+    },
+    {
+      "model": "GLM-4.5-Air-UD-Q6_K_XL-00001-of-00003",
+      "model_clean": "GLM-4.5-Air-UD-Q6_K_XL",
+      "env": "rocm6_4_4",
+      "env_base": "rocm6_4_4",
+      "env_variant": null,
+      "fa": true,
+      "test": "pp512",
+      "tps_mean": 130.06,
+      "tps_std": 0.38,
+      "error": false,
+      "error_type": null,
+      "backend": "ROCm",
+      "ngl": 99,
+      "mmap": 0,
+      "params_b": 110.47,
+      "file_size_gib": 94.57,
+      "name_params_b": 110.47,
+      "quant": "Q6_K_XL",
+      "log": "results/GLM-4.5-Air-UD-Q6_K_XL-00001-of-00003__rocm6_4_4__fa1.log",
+      "build": {
+        "hash": "4807e8f9",
+        "number": "6609"
+      }
+    },
+    {
+      "model": "GLM-4.5-Air-UD-Q6_K_XL-00001-of-00003",
+      "model_clean": "GLM-4.5-Air-UD-Q6_K_XL",
+      "env": "rocm6_4_4",
+      "env_base": "rocm6_4_4",
+      "env_variant": null,
+      "fa": true,
+      "test": "tg128",
+      "tps_mean": 16.18,
+      "tps_std": 0.0,
+      "error": false,
+      "error_type": null,
+      "backend": "ROCm",
+      "ngl": 99,
+      "mmap": 0,
+      "params_b": 110.47,
+      "file_size_gib": 94.57,
+      "name_params_b": 110.47,
+      "quant": "Q6_K_XL",
+      "log": "results/GLM-4.5-Air-UD-Q6_K_XL-00001-of-00003__rocm6_4_4__fa1.log",
+      "build": {
+        "hash": "4807e8f9",
+        "number": "6609"
+      }
+    },
+    {
+      "model": "GLM-4.5-Air-UD-Q6_K_XL-00001-of-00003",
+      "model_clean": "GLM-4.5-Air-UD-Q6_K_XL",
+      "env": "rocm6_4_4-hblt0",
+      "env_base": "rocm6_4_4",
+      "env_variant": "hblt0",
+      "fa": false,
+      "test": "pp512",
+      "tps_mean": 150.67,
+      "tps_std": 0.75,
+      "error": false,
+      "error_type": null,
+      "backend": "ROCm",
+      "ngl": 99,
+      "mmap": 0,
+      "params_b": 110.47,
+      "file_size_gib": 94.57,
+      "name_params_b": 110.47,
+      "quant": "Q6_K_XL",
+      "log": "results/GLM-4.5-Air-UD-Q6_K_XL-00001-of-00003__rocm6_4_4__hblt0.log",
+      "build": {
+        "hash": "4807e8f9",
+        "number": "6609"
+      }
+    },
+    {
+      "model": "GLM-4.5-Air-UD-Q6_K_XL-00001-of-00003",
+      "model_clean": "GLM-4.5-Air-UD-Q6_K_XL",
+      "env": "rocm6_4_4-hblt0",
+      "env_base": "rocm6_4_4",
+      "env_variant": "hblt0",
+      "fa": false,
+      "test": "tg128",
+      "tps_mean": 15.84,
+      "tps_std": 0.01,
+      "error": false,
+      "error_type": null,
+      "backend": "ROCm",
+      "ngl": 99,
+      "mmap": 0,
+      "params_b": 110.47,
+      "file_size_gib": 94.57,
+      "name_params_b": 110.47,
+      "quant": "Q6_K_XL",
+      "log": "results/GLM-4.5-Air-UD-Q6_K_XL-00001-of-00003__rocm6_4_4__hblt0.log",
+      "build": {
+        "hash": "4807e8f9",
+        "number": "6609"
+      }
+    },
+    {
+      "model": "GLM-4.5-Air-UD-Q6_K_XL-00001-of-00003",
+      "model_clean": "GLM-4.5-Air-UD-Q6_K_XL",
+      "env": "rocm6_4_4-hblt0",
+      "env_base": "rocm6_4_4",
+      "env_variant": "hblt0",
+      "fa": true,
+      "test": "pp512",
+      "tps_mean": 149.93,
+      "tps_std": 0.58,
+      "error": false,
+      "error_type": null,
+      "backend": "ROCm",
+      "ngl": 99,
+      "mmap": 0,
+      "params_b": 110.47,
+      "file_size_gib": 94.57,
+      "name_params_b": 110.47,
+      "quant": "Q6_K_XL",
+      "log": "results/GLM-4.5-Air-UD-Q6_K_XL-00001-of-00003__rocm6_4_4__hblt0__fa1.log",
+      "build": {
+        "hash": "4807e8f9",
+        "number": "6609"
+      }
+    },
+    {
+      "model": "GLM-4.5-Air-UD-Q6_K_XL-00001-of-00003",
+      "model_clean": "GLM-4.5-Air-UD-Q6_K_XL",
+      "env": "rocm6_4_4-hblt0",
+      "env_base": "rocm6_4_4",
+      "env_variant": "hblt0",
+      "fa": true,
+      "test": "tg128",
+      "tps_mean": 16.18,
+      "tps_std": 0.0,
+      "error": false,
+      "error_type": null,
+      "backend": "ROCm",
+      "ngl": 99,
+      "mmap": 0,
+      "params_b": 110.47,
+      "file_size_gib": 94.57,
+      "name_params_b": 110.47,
+      "quant": "Q6_K_XL",
+      "log": "results/GLM-4.5-Air-UD-Q6_K_XL-00001-of-00003__rocm6_4_4__hblt0__fa1.log",
+      "build": {
+        "hash": "4807e8f9",
+        "number": "6609"
+      }
+    },
     {
       "model": "GLM-4.5-Air-UD-Q6_K_XL-00001-of-00003",
       "model_clean": "GLM-4.5-Air-UD-Q6_K_XL",
@@ -2601,6 +3409,406 @@
       "log": "results/Llama-3.3-70B-Instruct-UD-Q8_K_XL-00001-of-00002__rocm6_4_3__hblt0__fa1.log",
       "build": null
     },
+    {
+      "model": "Llama-3.3-70B-Instruct-UD-Q8_K_XL-00001-of-00002",
+      "model_clean": "Llama-3.3-70B-Instruct-UD-Q8_K_XL",
+      "env": "rocm6_4_4-rocwmma",
+      "env_base": "rocm6_4_4",
+      "env_variant": "rocwmma",
+      "fa": false,
+      "test": "pp512",
+      "tps_mean": 98.87,
+      "tps_std": 0.18,
+      "error": false,
+      "error_type": null,
+      "backend": "ROCm",
+      "ngl": 99,
+      "mmap": 0,
+      "params_b": 70.55,
+      "file_size_gib": 75.65,
+      "name_params_b": 70.55,
+      "quant": "Q8_K_XL",
+      "log": "results/Llama-3.3-70B-Instruct-UD-Q8_K_XL-00001-of-00002__rocm6_4_4-rocwmma.log",
+      "build": {
+        "hash": "4807e8f9",
+        "number": "6609"
+      }
+    },
+    {
+      "model": "Llama-3.3-70B-Instruct-UD-Q8_K_XL-00001-of-00002",
+      "model_clean": "Llama-3.3-70B-Instruct-UD-Q8_K_XL",
+      "env": "rocm6_4_4-rocwmma",
+      "env_base": "rocm6_4_4",
+      "env_variant": "rocwmma",
+      "fa": false,
+      "test": "tg128",
+      "tps_mean": 2.77,
+      "tps_std": 0.0,
+      "error": false,
+      "error_type": null,
+      "backend": "ROCm",
+      "ngl": 99,
+      "mmap": 0,
+      "params_b": 70.55,
+      "file_size_gib": 75.65,
+      "name_params_b": 70.55,
+      "quant": "Q8_K_XL",
+      "log": "results/Llama-3.3-70B-Instruct-UD-Q8_K_XL-00001-of-00002__rocm6_4_4-rocwmma.log",
+      "build": {
+        "hash": "4807e8f9",
+        "number": "6609"
+      }
+    },
+    {
+      "model": "Llama-3.3-70B-Instruct-UD-Q8_K_XL-00001-of-00002",
+      "model_clean": "Llama-3.3-70B-Instruct-UD-Q8_K_XL",
+      "env": "rocm6_4_4-rocwmma",
+      "env_base": "rocm6_4_4",
+      "env_variant": "rocwmma",
+      "fa": true,
+      "test": "pp512",
+      "tps_mean": 104.31,
+      "tps_std": 0.07,
+      "error": false,
+      "error_type": null,
+      "backend": "ROCm",
+      "ngl": 99,
+      "mmap": 0,
+      "params_b": 70.55,
+      "file_size_gib": 75.65,
+      "name_params_b": 70.55,
+      "quant": "Q8_K_XL",
+      "log": "results/Llama-3.3-70B-Instruct-UD-Q8_K_XL-00001-of-00002__rocm6_4_4-rocwmma__fa1.log",
+      "build": {
+        "hash": "4807e8f9",
+        "number": "6609"
+      }
+    },
+    {
+      "model": "Llama-3.3-70B-Instruct-UD-Q8_K_XL-00001-of-00002",
+      "model_clean": "Llama-3.3-70B-Instruct-UD-Q8_K_XL",
+      "env": "rocm6_4_4-rocwmma",
+      "env_base": "rocm6_4_4",
+      "env_variant": "rocwmma",
+      "fa": true,
+      "test": "tg128",
+      "tps_mean": 2.79,
+      "tps_std": 0.0,
+      "error": false,
+      "error_type": null,
+      "backend": "ROCm",
+      "ngl": 99,
+      "mmap": 0,
+      "params_b": 70.55,
+      "file_size_gib": 75.65,
+      "name_params_b": 70.55,
+      "quant": "Q8_K_XL",
+      "log": "results/Llama-3.3-70B-Instruct-UD-Q8_K_XL-00001-of-00002__rocm6_4_4-rocwmma__fa1.log",
+      "build": {
+        "hash": "4807e8f9",
+        "number": "6609"
+      }
+    },
+    {
+      "model": "Llama-3.3-70B-Instruct-UD-Q8_K_XL-00001-of-00002",
+      "model_clean": "Llama-3.3-70B-Instruct-UD-Q8_K_XL",
+      "env": "rocm6_4_4-rocwmma-hblt0",
+      "env_base": "rocm6_4_4",
+      "env_variant": "rocwmma-hblt0",
+      "fa": false,
+      "test": "pp512",
+      "tps_mean": 97.43,
+      "tps_std": 0.23,
+      "error": false,
+      "error_type": null,
+      "backend": "ROCm",
+      "ngl": 99,
+      "mmap": 0,
+      "params_b": 70.55,
+      "file_size_gib": 75.65,
+      "name_params_b": 70.55,
+      "quant": "Q8_K_XL",
+      "log": "results/Llama-3.3-70B-Instruct-UD-Q8_K_XL-00001-of-00002__rocm6_4_4-rocwmma__hblt0.log",
+      "build": {
+        "hash": "4807e8f9",
+        "number": "6609"
+      }
+    },
+    {
+      "model": "Llama-3.3-70B-Instruct-UD-Q8_K_XL-00001-of-00002",
+      "model_clean": "Llama-3.3-70B-Instruct-UD-Q8_K_XL",
+      "env": "rocm6_4_4-rocwmma-hblt0",
+      "env_base": "rocm6_4_4",
+      "env_variant": "rocwmma-hblt0",
+      "fa": false,
+      "test": "tg128",
+      "tps_mean": 2.76,
+      "tps_std": 0.0,
+      "error": false,
+      "error_type": null,
+      "backend": "ROCm",
+      "ngl": 99,
+      "mmap": 0,
+      "params_b": 70.55,
+      "file_size_gib": 75.65,
+      "name_params_b": 70.55,
+      "quant": "Q8_K_XL",
+      "log": "results/Llama-3.3-70B-Instruct-UD-Q8_K_XL-00001-of-00002__rocm6_4_4-rocwmma__hblt0.log",
+      "build": {
+        "hash": "4807e8f9",
+        "number": "6609"
+      }
+    },
+    {
+      "model": "Llama-3.3-70B-Instruct-UD-Q8_K_XL-00001-of-00002",
+      "model_clean": "Llama-3.3-70B-Instruct-UD-Q8_K_XL",
+      "env": "rocm6_4_4-rocwmma-hblt0",
+      "env_base": "rocm6_4_4",
+      "env_variant": "rocwmma-hblt0",
+      "fa": true,
+      "test": "pp512",
+      "tps_mean": 103.81,
+      "tps_std": 0.09,
+      "error": false,
+      "error_type": null,
+      "backend": "ROCm",
+      "ngl": 99,
+      "mmap": 0,
+      "params_b": 70.55,
+      "file_size_gib": 75.65,
+      "name_params_b": 70.55,
+      "quant": "Q8_K_XL",
+      "log": "results/Llama-3.3-70B-Instruct-UD-Q8_K_XL-00001-of-00002__rocm6_4_4-rocwmma__hblt0__fa1.log",
+      "build": {
+        "hash": "4807e8f9",
+        "number": "6609"
+      }
+    },
+    {
+      "model": "Llama-3.3-70B-Instruct-UD-Q8_K_XL-00001-of-00002",
+      "model_clean": "Llama-3.3-70B-Instruct-UD-Q8_K_XL",
+      "env": "rocm6_4_4-rocwmma-hblt0",
+      "env_base": "rocm6_4_4",
+      "env_variant": "rocwmma-hblt0",
+      "fa": true,
+      "test": "tg128",
+      "tps_mean": 2.78,
+      "tps_std": 0.0,
+      "error": false,
+      "error_type": null,
+      "backend": "ROCm",
+      "ngl": 99,
+      "mmap": 0,
+      "params_b": 70.55,
+      "file_size_gib": 75.65,
+      "name_params_b": 70.55,
+      "quant": "Q8_K_XL",
+      "log": "results/Llama-3.3-70B-Instruct-UD-Q8_K_XL-00001-of-00002__rocm6_4_4-rocwmma__hblt0__fa1.log",
+      "build": {
+        "hash": "4807e8f9",
+        "number": "6609"
+      }
+    },
+    {
+      "model": "Llama-3.3-70B-Instruct-UD-Q8_K_XL-00001-of-00002",
+      "model_clean": "Llama-3.3-70B-Instruct-UD-Q8_K_XL",
+      "env": "rocm6_4_4",
+      "env_base": "rocm6_4_4",
+      "env_variant": null,
+      "fa": false,
+      "test": "pp512",
+      "tps_mean": 99.32,
+      "tps_std": 0.17,
+      "error": false,
+      "error_type": null,
+      "backend": "ROCm",
+      "ngl": 99,
+      "mmap": 0,
+      "params_b": 70.55,
+      "file_size_gib": 75.65,
+      "name_params_b": 70.55,
+      "quant": "Q8_K_XL",
+      "log": "results/Llama-3.3-70B-Instruct-UD-Q8_K_XL-00001-of-00002__rocm6_4_4.log",
+      "build": {
+        "hash": "4807e8f9",
+        "number": "6609"
+      }
+    },
+    {
+      "model": "Llama-3.3-70B-Instruct-UD-Q8_K_XL-00001-of-00002",
+      "model_clean": "Llama-3.3-70B-Instruct-UD-Q8_K_XL",
+      "env": "rocm6_4_4",
+      "env_base": "rocm6_4_4",
+      "env_variant": null,
+      "fa": false,
+      "test": "tg128",
+      "tps_mean": 2.78,
+      "tps_std": 0.0,
+      "error": false,
+      "error_type": null,
+      "backend": "ROCm",
+      "ngl": 99,
+      "mmap": 0,
+      "params_b": 70.55,
+      "file_size_gib": 75.65,
+      "name_params_b": 70.55,
+      "quant": "Q8_K_XL",
+      "log": "results/Llama-3.3-70B-Instruct-UD-Q8_K_XL-00001-of-00002__rocm6_4_4.log",
+      "build": {
+        "hash": "4807e8f9",
+        "number": "6609"
+      }
+    },
+    {
+      "model": "Llama-3.3-70B-Instruct-UD-Q8_K_XL-00001-of-00002",
+      "model_clean": "Llama-3.3-70B-Instruct-UD-Q8_K_XL",
+      "env": "rocm6_4_4",
+      "env_base": "rocm6_4_4",
+      "env_variant": null,
+      "fa": true,
+      "test": "pp512",
+      "tps_mean": 104.93,
+      "tps_std": 0.11,
+      "error": false,
+      "error_type": null,
+      "backend": "ROCm",
+      "ngl": 99,
+      "mmap": 0,
+      "params_b": 70.55,
+      "file_size_gib": 75.65,
+      "name_params_b": 70.55,
+      "quant": "Q8_K_XL",
+      "log": "results/Llama-3.3-70B-Instruct-UD-Q8_K_XL-00001-of-00002__rocm6_4_4__fa1.log",
+      "build": {
+        "hash": "4807e8f9",
+        "number": "6609"
+      }
+    },
+    {
+      "model": "Llama-3.3-70B-Instruct-UD-Q8_K_XL-00001-of-00002",
+      "model_clean": "Llama-3.3-70B-Instruct-UD-Q8_K_XL",
+      "env": "rocm6_4_4",
+      "env_base": "rocm6_4_4",
+      "env_variant": null,
+      "fa": true,
+      "test": "tg128",
+      "tps_mean": 2.79,
+      "tps_std": 0.0,
+      "error": false,
+      "error_type": null,
+      "backend": "ROCm",
+      "ngl": 99,
+      "mmap": 0,
+      "params_b": 70.55,
+      "file_size_gib": 75.65,
+      "name_params_b": 70.55,
+      "quant": "Q8_K_XL",
+      "log": "results/Llama-3.3-70B-Instruct-UD-Q8_K_XL-00001-of-00002__rocm6_4_4__fa1.log",
+      "build": {
+        "hash": "4807e8f9",
+        "number": "6609"
+      }
+    },
+    {
+      "model": "Llama-3.3-70B-Instruct-UD-Q8_K_XL-00001-of-00002",
+      "model_clean": "Llama-3.3-70B-Instruct-UD-Q8_K_XL",
+      "env": "rocm6_4_4-hblt0",
+      "env_base": "rocm6_4_4",
+      "env_variant": "hblt0",
+      "fa": false,
+      "test": "pp512",
+      "tps_mean": 98.99,
+      "tps_std": 0.21,
+      "error": false,
+      "error_type": null,
+      "backend": "ROCm",
+      "ngl": 99,
+      "mmap": 0,
+      "params_b": 70.55,
+      "file_size_gib": 75.65,
+      "name_params_b": 70.55,
+      "quant": "Q8_K_XL",
+      "log": "results/Llama-3.3-70B-Instruct-UD-Q8_K_XL-00001-of-00002__rocm6_4_4__hblt0.log",
+      "build": {
+        "hash": "4807e8f9",
+        "number": "6609"
+      }
+    },
+    {
+      "model": "Llama-3.3-70B-Instruct-UD-Q8_K_XL-00001-of-00002",
+      "model_clean": "Llama-3.3-70B-Instruct-UD-Q8_K_XL",
+      "env": "rocm6_4_4-hblt0",
+      "env_base": "rocm6_4_4",
+      "env_variant": "hblt0",
+      "fa": false,
+      "test": "tg128",
+      "tps_mean": 2.78,
+      "tps_std": 0.0,
+      "error": false,
+      "error_type": null,
+      "backend": "ROCm",
+      "ngl": 99,
+      "mmap": 0,
+      "params_b": 70.55,
+      "file_size_gib": 75.65,
+      "name_params_b": 70.55,
+      "quant": "Q8_K_XL",
+      "log": "results/Llama-3.3-70B-Instruct-UD-Q8_K_XL-00001-of-00002__rocm6_4_4__hblt0.log",
+      "build": {
+        "hash": "4807e8f9",
+        "number": "6609"
+      }
+    },
+    {
+      "model": "Llama-3.3-70B-Instruct-UD-Q8_K_XL-00001-of-00002",
+      "model_clean": "Llama-3.3-70B-Instruct-UD-Q8_K_XL",
+      "env": "rocm6_4_4-hblt0",
+      "env_base": "rocm6_4_4",
+      "env_variant": "hblt0",
+      "fa": true,
+      "test": "pp512",
+      "tps_mean": 103.03,
+      "tps_std": 0.23,
+      "error": false,
+      "error_type": null,
+      "backend": "ROCm",
+      "ngl": 99,
+      "mmap": 0,
+      "params_b": 70.55,
+      "file_size_gib": 75.65,
+      "name_params_b": 70.55,
+      "quant": "Q8_K_XL",
+      "log": "results/Llama-3.3-70B-Instruct-UD-Q8_K_XL-00001-of-00002__rocm6_4_4__hblt0__fa1.log",
+      "build": {
+        "hash": "4807e8f9",
+        "number": "6609"
+      }
+    },
+    {
+      "model": "Llama-3.3-70B-Instruct-UD-Q8_K_XL-00001-of-00002",
+      "model_clean": "Llama-3.3-70B-Instruct-UD-Q8_K_XL",
+      "env": "rocm6_4_4-hblt0",
+      "env_base": "rocm6_4_4",
+      "env_variant": "hblt0",
+      "fa": true,
+      "test": "tg128",
+      "tps_mean": 2.79,
+      "tps_std": 0.0,
+      "error": false,
+      "error_type": null,
+      "backend": "ROCm",
+      "ngl": 99,
+      "mmap": 0,
+      "params_b": 70.55,
+      "file_size_gib": 75.65,
+      "name_params_b": 70.55,
+      "quant": "Q8_K_XL",
+      "log": "results/Llama-3.3-70B-Instruct-UD-Q8_K_XL-00001-of-00002__rocm6_4_4__hblt0__fa1.log",
+      "build": {
+        "hash": "4807e8f9",
+        "number": "6609"
+      }
+    },
     {
       "model": "Llama-3.3-70B-Instruct-UD-Q8_K_XL-00001-of-00002",
       "model_clean": "Llama-3.3-70B-Instruct-UD-Q8_K_XL",
@@ -3661,6 +4869,406 @@
       "log": "results/Llama-4-Scout-17B-16E-Instruct-Q6_K-00001-of-00002__rocm6_4_3__hblt0__fa1.log",
       "build": null
     },
+    {
+      "model": "Llama-4-Scout-17B-16E-Instruct-Q6_K-00001-of-00002",
+      "model_clean": "Llama-4-Scout-17B-16E-Instruct-Q6_K",
+      "env": "rocm6_4_4-rocwmma",
+      "env_base": "rocm6_4_4",
+      "env_variant": "rocwmma",
+      "fa": false,
+      "test": "pp512",
+      "tps_mean": 276.88,
+      "tps_std": 1.57,
+      "error": false,
+      "error_type": null,
+      "backend": "ROCm",
+      "ngl": 99,
+      "mmap": 0,
+      "params_b": 107.77,
+      "file_size_gib": 82.35,
+      "name_params_b": 107.77,
+      "quant": "Q6_K",
+      "log": "results/Llama-4-Scout-17B-16E-Instruct-Q6_K-00001-of-00002__rocm6_4_4-rocwmma.log",
+      "build": {
+        "hash": "4807e8f9",
+        "number": "6609"
+      }
+    },
+    {
+      "model": "Llama-4-Scout-17B-16E-Instruct-Q6_K-00001-of-00002",
+      "model_clean": "Llama-4-Scout-17B-16E-Instruct-Q6_K",
+      "env": "rocm6_4_4-rocwmma",
+      "env_base": "rocm6_4_4",
+      "env_variant": "rocwmma",
+      "fa": false,
+      "test": "tg128",
+      "tps_mean": 14.66,
+      "tps_std": 0.0,
+      "error": false,
+      "error_type": null,
+      "backend": "ROCm",
+      "ngl": 99,
+      "mmap": 0,
+      "params_b": 107.77,
+      "file_size_gib": 82.35,
+      "name_params_b": 107.77,
+      "quant": "Q6_K",
+      "log": "results/Llama-4-Scout-17B-16E-Instruct-Q6_K-00001-of-00002__rocm6_4_4-rocwmma.log",
+      "build": {
+        "hash": "4807e8f9",
+        "number": "6609"
+      }
+    },
+    {
+      "model": "Llama-4-Scout-17B-16E-Instruct-Q6_K-00001-of-00002",
+      "model_clean": "Llama-4-Scout-17B-16E-Instruct-Q6_K",
+      "env": "rocm6_4_4-rocwmma",
+      "env_base": "rocm6_4_4",
+      "env_variant": "rocwmma",
+      "fa": true,
+      "test": "pp512",
+      "tps_mean": 292.47,
+      "tps_std": 1.18,
+      "error": false,
+      "error_type": null,
+      "backend": "ROCm",
+      "ngl": 99,
+      "mmap": 0,
+      "params_b": 107.77,
+      "file_size_gib": 82.35,
+      "name_params_b": 107.77,
+      "quant": "Q6_K",
+      "log": "results/Llama-4-Scout-17B-16E-Instruct-Q6_K-00001-of-00002__rocm6_4_4-rocwmma__fa1.log",
+      "build": {
+        "hash": "4807e8f9",
+        "number": "6609"
+      }
+    },
+    {
+      "model": "Llama-4-Scout-17B-16E-Instruct-Q6_K-00001-of-00002",
+      "model_clean": "Llama-4-Scout-17B-16E-Instruct-Q6_K",
+      "env": "rocm6_4_4-rocwmma",
+      "env_base": "rocm6_4_4",
+      "env_variant": "rocwmma",
+      "fa": true,
+      "test": "tg128",
+      "tps_mean": 14.83,
+      "tps_std": 0.0,
+      "error": false,
+      "error_type": null,
+      "backend": "ROCm",
+      "ngl": 99,
+      "mmap": 0,
+      "params_b": 107.77,
+      "file_size_gib": 82.35,
+      "name_params_b": 107.77,
+      "quant": "Q6_K",
+      "log": "results/Llama-4-Scout-17B-16E-Instruct-Q6_K-00001-of-00002__rocm6_4_4-rocwmma__fa1.log",
+      "build": {
+        "hash": "4807e8f9",
+        "number": "6609"
+      }
+    },
+    {
+      "model": "Llama-4-Scout-17B-16E-Instruct-Q6_K-00001-of-00002",
+      "model_clean": "Llama-4-Scout-17B-16E-Instruct-Q6_K",
+      "env": "rocm6_4_4-rocwmma-hblt0",
+      "env_base": "rocm6_4_4",
+      "env_variant": "rocwmma-hblt0",
+      "fa": false,
+      "test": "pp512",
+      "tps_mean": 277.79,
+      "tps_std": 0.94,
+      "error": false,
+      "error_type": null,
+      "backend": "ROCm",
+      "ngl": 99,
+      "mmap": 0,
+      "params_b": 107.77,
+      "file_size_gib": 82.35,
+      "name_params_b": 107.77,
+      "quant": "Q6_K",
+      "log": "results/Llama-4-Scout-17B-16E-Instruct-Q6_K-00001-of-00002__rocm6_4_4-rocwmma__hblt0.log",
+      "build": {
+        "hash": "4807e8f9",
+        "number": "6609"
+      }
+    },
+    {
+      "model": "Llama-4-Scout-17B-16E-Instruct-Q6_K-00001-of-00002",
+      "model_clean": "Llama-4-Scout-17B-16E-Instruct-Q6_K",
+      "env": "rocm6_4_4-rocwmma-hblt0",
+      "env_base": "rocm6_4_4",
+      "env_variant": "rocwmma-hblt0",
+      "fa": false,
+      "test": "tg128",
+      "tps_mean": 14.65,
+      "tps_std": 0.0,
+      "error": false,
+      "error_type": null,
+      "backend": "ROCm",
+      "ngl": 99,
+      "mmap": 0,
+      "params_b": 107.77,
+      "file_size_gib": 82.35,
+      "name_params_b": 107.77,
+      "quant": "Q6_K",
+      "log": "results/Llama-4-Scout-17B-16E-Instruct-Q6_K-00001-of-00002__rocm6_4_4-rocwmma__hblt0.log",
+      "build": {
+        "hash": "4807e8f9",
+        "number": "6609"
+      }
+    },
+    {
+      "model": "Llama-4-Scout-17B-16E-Instruct-Q6_K-00001-of-00002",
+      "model_clean": "Llama-4-Scout-17B-16E-Instruct-Q6_K",
+      "env": "rocm6_4_4-rocwmma-hblt0",
+      "env_base": "rocm6_4_4",
+      "env_variant": "rocwmma-hblt0",
+      "fa": true,
+      "test": "pp512",
+      "tps_mean": 292.17,
+      "tps_std": 1.61,
+      "error": false,
+      "error_type": null,
+      "backend": "ROCm",
+      "ngl": 99,
+      "mmap": 0,
+      "params_b": 107.77,
+      "file_size_gib": 82.35,
+      "name_params_b": 107.77,
+      "quant": "Q6_K",
+      "log": "results/Llama-4-Scout-17B-16E-Instruct-Q6_K-00001-of-00002__rocm6_4_4-rocwmma__hblt0__fa1.log",
+      "build": {
+        "hash": "4807e8f9",
+        "number": "6609"
+      }
+    },
+    {
+      "model": "Llama-4-Scout-17B-16E-Instruct-Q6_K-00001-of-00002",
+      "model_clean": "Llama-4-Scout-17B-16E-Instruct-Q6_K",
+      "env": "rocm6_4_4-rocwmma-hblt0",
+      "env_base": "rocm6_4_4",
+      "env_variant": "rocwmma-hblt0",
+      "fa": true,
+      "test": "tg128",
+      "tps_mean": 14.83,
+      "tps_std": 0.0,
+      "error": false,
+      "error_type": null,
+      "backend": "ROCm",
+      "ngl": 99,
+      "mmap": 0,
+      "params_b": 107.77,
+      "file_size_gib": 82.35,
+      "name_params_b": 107.77,
+      "quant": "Q6_K",
+      "log": "results/Llama-4-Scout-17B-16E-Instruct-Q6_K-00001-of-00002__rocm6_4_4-rocwmma__hblt0__fa1.log",
+      "build": {
+        "hash": "4807e8f9",
+        "number": "6609"
+      }
+    },
+    {
+      "model": "Llama-4-Scout-17B-16E-Instruct-Q6_K-00001-of-00002",
+      "model_clean": "Llama-4-Scout-17B-16E-Instruct-Q6_K",
+      "env": "rocm6_4_4",
+      "env_base": "rocm6_4_4",
+      "env_variant": null,
+      "fa": false,
+      "test": "pp512",
+      "tps_mean": 276.97,
+      "tps_std": 1.15,
+      "error": false,
+      "error_type": null,
+      "backend": "ROCm",
+      "ngl": 99,
+      "mmap": 0,
+      "params_b": 107.77,
+      "file_size_gib": 82.35,
+      "name_params_b": 107.77,
+      "quant": "Q6_K",
+      "log": "results/Llama-4-Scout-17B-16E-Instruct-Q6_K-00001-of-00002__rocm6_4_4.log",
+      "build": {
+        "hash": "4807e8f9",
+        "number": "6609"
+      }
+    },
+    {
+      "model": "Llama-4-Scout-17B-16E-Instruct-Q6_K-00001-of-00002",
+      "model_clean": "Llama-4-Scout-17B-16E-Instruct-Q6_K",
+      "env": "rocm6_4_4",
+      "env_base": "rocm6_4_4",
+      "env_variant": null,
+      "fa": false,
+      "test": "tg128",
+      "tps_mean": 14.71,
+      "tps_std": 0.0,
+      "error": false,
+      "error_type": null,
+      "backend": "ROCm",
+      "ngl": 99,
+      "mmap": 0,
+      "params_b": 107.77,
+      "file_size_gib": 82.35,
+      "name_params_b": 107.77,
+      "quant": "Q6_K",
+      "log": "results/Llama-4-Scout-17B-16E-Instruct-Q6_K-00001-of-00002__rocm6_4_4.log",
+      "build": {
+        "hash": "4807e8f9",
+        "number": "6609"
+      }
+    },
+    {
+      "model": "Llama-4-Scout-17B-16E-Instruct-Q6_K-00001-of-00002",
+      "model_clean": "Llama-4-Scout-17B-16E-Instruct-Q6_K",
+      "env": "rocm6_4_4",
+      "env_base": "rocm6_4_4",
+      "env_variant": null,
+      "fa": true,
+      "test": "pp512",
+      "tps_mean": 293.79,
+      "tps_std": 2.33,
+      "error": false,
+      "error_type": null,
+      "backend": "ROCm",
+      "ngl": 99,
+      "mmap": 0,
+      "params_b": 107.77,
+      "file_size_gib": 82.35,
+      "name_params_b": 107.77,
+      "quant": "Q6_K",
+      "log": "results/Llama-4-Scout-17B-16E-Instruct-Q6_K-00001-of-00002__rocm6_4_4__fa1.log",
+      "build": {
+        "hash": "4807e8f9",
+        "number": "6609"
+      }
+    },
+    {
+      "model": "Llama-4-Scout-17B-16E-Instruct-Q6_K-00001-of-00002",
+      "model_clean": "Llama-4-Scout-17B-16E-Instruct-Q6_K",
+      "env": "rocm6_4_4",
+      "env_base": "rocm6_4_4",
+      "env_variant": null,
+      "fa": true,
+      "test": "tg128",
+      "tps_mean": 14.84,
+      "tps_std": 0.0,
+      "error": false,
+      "error_type": null,
+      "backend": "ROCm",
+      "ngl": 99,
+      "mmap": 0,
+      "params_b": 107.77,
+      "file_size_gib": 82.35,
+      "name_params_b": 107.77,
+      "quant": "Q6_K",
+      "log": "results/Llama-4-Scout-17B-16E-Instruct-Q6_K-00001-of-00002__rocm6_4_4__fa1.log",
+      "build": {
+        "hash": "4807e8f9",
+        "number": "6609"
+      }
+    },
+    {
+      "model": "Llama-4-Scout-17B-16E-Instruct-Q6_K-00001-of-00002",
+      "model_clean": "Llama-4-Scout-17B-16E-Instruct-Q6_K",
+      "env": "rocm6_4_4-hblt0",
+      "env_base": "rocm6_4_4",
+      "env_variant": "hblt0",
+      "fa": false,
+      "test": "pp512",
+      "tps_mean": 278.59,
+      "tps_std": 1.22,
+      "error": false,
+      "error_type": null,
+      "backend": "ROCm",
+      "ngl": 99,
+      "mmap": 0,
+      "params_b": 107.77,
+      "file_size_gib": 82.35,
+      "name_params_b": 107.77,
+      "quant": "Q6_K",
+      "log": "results/Llama-4-Scout-17B-16E-Instruct-Q6_K-00001-of-00002__rocm6_4_4__hblt0.log",
+      "build": {
+        "hash": "4807e8f9",
+        "number": "6609"
+      }
+    },
+    {
+      "model": "Llama-4-Scout-17B-16E-Instruct-Q6_K-00001-of-00002",
+      "model_clean": "Llama-4-Scout-17B-16E-Instruct-Q6_K",
+      "env": "rocm6_4_4-hblt0",
+      "env_base": "rocm6_4_4",
+      "env_variant": "hblt0",
+      "fa": false,
+      "test": "tg128",
+      "tps_mean": 14.7,
+      "tps_std": 0.0,
+      "error": false,
+      "error_type": null,
+      "backend": "ROCm",
+      "ngl": 99,
+      "mmap": 0,
+      "params_b": 107.77,
+      "file_size_gib": 82.35,
+      "name_params_b": 107.77,
+      "quant": "Q6_K",
+      "log": "results/Llama-4-Scout-17B-16E-Instruct-Q6_K-00001-of-00002__rocm6_4_4__hblt0.log",
+      "build": {
+        "hash": "4807e8f9",
+        "number": "6609"
+      }
+    },
+    {
+      "model": "Llama-4-Scout-17B-16E-Instruct-Q6_K-00001-of-00002",
+      "model_clean": "Llama-4-Scout-17B-16E-Instruct-Q6_K",
+      "env": "rocm6_4_4-hblt0",
+      "env_base": "rocm6_4_4",
+      "env_variant": "hblt0",
+      "fa": true,
+      "test": "pp512",
+      "tps_mean": 296.61,
+      "tps_std": 0.98,
+      "error": false,
+      "error_type": null,
+      "backend": "ROCm",
+      "ngl": 99,
+      "mmap": 0,
+      "params_b": 107.77,
+      "file_size_gib": 82.35,
+      "name_params_b": 107.77,
+      "quant": "Q6_K",
+      "log": "results/Llama-4-Scout-17B-16E-Instruct-Q6_K-00001-of-00002__rocm6_4_4__hblt0__fa1.log",
+      "build": {
+        "hash": "4807e8f9",
+        "number": "6609"
+      }
+    },
+    {
+      "model": "Llama-4-Scout-17B-16E-Instruct-Q6_K-00001-of-00002",
+      "model_clean": "Llama-4-Scout-17B-16E-Instruct-Q6_K",
+      "env": "rocm6_4_4-hblt0",
+      "env_base": "rocm6_4_4",
+      "env_variant": "hblt0",
+      "fa": true,
+      "test": "tg128",
+      "tps_mean": 14.83,
+      "tps_std": 0.0,
+      "error": false,
+      "error_type": null,
+      "backend": "ROCm",
+      "ngl": 99,
+      "mmap": 0,
+      "params_b": 107.77,
+      "file_size_gib": 82.35,
+      "name_params_b": 107.77,
+      "quant": "Q6_K",
+      "log": "results/Llama-4-Scout-17B-16E-Instruct-Q6_K-00001-of-00002__rocm6_4_4__hblt0__fa1.log",
+      "build": {
+        "hash": "4807e8f9",
+        "number": "6609"
+      }
+    },
     {
       "model": "Llama-4-Scout-17B-16E-Instruct-Q6_K-00001-of-00002",
       "model_clean": "Llama-4-Scout-17B-16E-Instruct-Q6_K",
@@ -4665,6 +6273,406 @@
       "log": "results/Llama-4-Scout-17B-16E-Instruct-Q8_0-00001-of-00003__rocm6_4_3__hblt0__fa1.log",
       "build": null
     },
+    {
+      "model": "Llama-4-Scout-17B-16E-Instruct-Q8_0-00001-of-00003",
+      "model_clean": "Llama-4-Scout-17B-16E-Instruct-Q8_0",
+      "env": "rocm6_4_4-rocwmma",
+      "env_base": "rocm6_4_4",
+      "env_variant": "rocwmma",
+      "fa": false,
+      "test": "pp512",
+      "tps_mean": 281.33,
+      "tps_std": 2.6,
+      "error": false,
+      "error_type": null,
+      "backend": "ROCm",
+      "ngl": 99,
+      "mmap": 0,
+      "params_b": 107.77,
+      "file_size_gib": 106.65,
+      "name_params_b": 107.77,
+      "quant": "Q8_0",
+      "log": "results/Llama-4-Scout-17B-16E-Instruct-Q8_0-00001-of-00003__rocm6_4_4-rocwmma.log",
+      "build": {
+        "hash": "4807e8f9",
+        "number": "6609"
+      }
+    },
+    {
+      "model": "Llama-4-Scout-17B-16E-Instruct-Q8_0-00001-of-00003",
+      "model_clean": "Llama-4-Scout-17B-16E-Instruct-Q8_0",
+      "env": "rocm6_4_4-rocwmma",
+      "env_base": "rocm6_4_4",
+      "env_variant": "rocwmma",
+      "fa": false,
+      "test": "tg128",
+      "tps_mean": 11.89,
+      "tps_std": 0.0,
+      "error": false,
+      "error_type": null,
+      "backend": "ROCm",
+      "ngl": 99,
+      "mmap": 0,
+      "params_b": 107.77,
+      "file_size_gib": 106.65,
+      "name_params_b": 107.77,
+      "quant": "Q8_0",
+      "log": "results/Llama-4-Scout-17B-16E-Instruct-Q8_0-00001-of-00003__rocm6_4_4-rocwmma.log",
+      "build": {
+        "hash": "4807e8f9",
+        "number": "6609"
+      }
+    },
+    {
+      "model": "Llama-4-Scout-17B-16E-Instruct-Q8_0-00001-of-00003",
+      "model_clean": "Llama-4-Scout-17B-16E-Instruct-Q8_0",
+      "env": "rocm6_4_4-rocwmma",
+      "env_base": "rocm6_4_4",
+      "env_variant": "rocwmma",
+      "fa": true,
+      "test": "pp512",
+      "tps_mean": 297.14,
+      "tps_std": 1.58,
+      "error": false,
+      "error_type": null,
+      "backend": "ROCm",
+      "ngl": 99,
+      "mmap": 0,
+      "params_b": 107.77,
+      "file_size_gib": 106.65,
+      "name_params_b": 107.77,
+      "quant": "Q8_0",
+      "log": "results/Llama-4-Scout-17B-16E-Instruct-Q8_0-00001-of-00003__rocm6_4_4-rocwmma__fa1.log",
+      "build": {
+        "hash": "4807e8f9",
+        "number": "6609"
+      }
+    },
+    {
+      "model": "Llama-4-Scout-17B-16E-Instruct-Q8_0-00001-of-00003",
+      "model_clean": "Llama-4-Scout-17B-16E-Instruct-Q8_0",
+      "env": "rocm6_4_4-rocwmma",
+      "env_base": "rocm6_4_4",
+      "env_variant": "rocwmma",
+      "fa": true,
+      "test": "tg128",
+      "tps_mean": 12.0,
+      "tps_std": 0.0,
+      "error": false,
+      "error_type": null,
+      "backend": "ROCm",
+      "ngl": 99,
+      "mmap": 0,
+      "params_b": 107.77,
+      "file_size_gib": 106.65,
+      "name_params_b": 107.77,
+      "quant": "Q8_0",
+      "log": "results/Llama-4-Scout-17B-16E-Instruct-Q8_0-00001-of-00003__rocm6_4_4-rocwmma__fa1.log",
+      "build": {
+        "hash": "4807e8f9",
+        "number": "6609"
+      }
+    },
+    {
+      "model": "Llama-4-Scout-17B-16E-Instruct-Q8_0-00001-of-00003",
+      "model_clean": "Llama-4-Scout-17B-16E-Instruct-Q8_0",
+      "env": "rocm6_4_4-rocwmma-hblt0",
+      "env_base": "rocm6_4_4",
+      "env_variant": "rocwmma-hblt0",
+      "fa": false,
+      "test": "pp512",
+      "tps_mean": 280.36,
+      "tps_std": 0.42,
+      "error": false,
+      "error_type": null,
+      "backend": "ROCm",
+      "ngl": 99,
+      "mmap": 0,
+      "params_b": 107.77,
+      "file_size_gib": 106.65,
+      "name_params_b": 107.77,
+      "quant": "Q8_0",
+      "log": "results/Llama-4-Scout-17B-16E-Instruct-Q8_0-00001-of-00003__rocm6_4_4-rocwmma__hblt0.log",
+      "build": {
+        "hash": "4807e8f9",
+        "number": "6609"
+      }
+    },
+    {
+      "model": "Llama-4-Scout-17B-16E-Instruct-Q8_0-00001-of-00003",
+      "model_clean": "Llama-4-Scout-17B-16E-Instruct-Q8_0",
+      "env": "rocm6_4_4-rocwmma-hblt0",
+      "env_base": "rocm6_4_4",
+      "env_variant": "rocwmma-hblt0",
+      "fa": false,
+      "test": "tg128",
+      "tps_mean": 11.88,
+      "tps_std": 0.0,
+      "error": false,
+      "error_type": null,
+      "backend": "ROCm",
+      "ngl": 99,
+      "mmap": 0,
+      "params_b": 107.77,
+      "file_size_gib": 106.65,
+      "name_params_b": 107.77,
+      "quant": "Q8_0",
+      "log": "results/Llama-4-Scout-17B-16E-Instruct-Q8_0-00001-of-00003__rocm6_4_4-rocwmma__hblt0.log",
+      "build": {
+        "hash": "4807e8f9",
+        "number": "6609"
+      }
+    },
+    {
+      "model": "Llama-4-Scout-17B-16E-Instruct-Q8_0-00001-of-00003",
+      "model_clean": "Llama-4-Scout-17B-16E-Instruct-Q8_0",
+      "env": "rocm6_4_4-rocwmma-hblt0",
+      "env_base": "rocm6_4_4",
+      "env_variant": "rocwmma-hblt0",
+      "fa": true,
+      "test": "pp512",
+      "tps_mean": 298.12,
+      "tps_std": 2.72,
+      "error": false,
+      "error_type": null,
+      "backend": "ROCm",
+      "ngl": 99,
+      "mmap": 0,
+      "params_b": 107.77,
+      "file_size_gib": 106.65,
+      "name_params_b": 107.77,
+      "quant": "Q8_0",
+      "log": "results/Llama-4-Scout-17B-16E-Instruct-Q8_0-00001-of-00003__rocm6_4_4-rocwmma__hblt0__fa1.log",
+      "build": {
+        "hash": "4807e8f9",
+        "number": "6609"
+      }
+    },
+    {
+      "model": "Llama-4-Scout-17B-16E-Instruct-Q8_0-00001-of-00003",
+      "model_clean": "Llama-4-Scout-17B-16E-Instruct-Q8_0",
+      "env": "rocm6_4_4-rocwmma-hblt0",
+      "env_base": "rocm6_4_4",
+      "env_variant": "rocwmma-hblt0",
+      "fa": true,
+      "test": "tg128",
+      "tps_mean": 12.0,
+      "tps_std": 0.0,
+      "error": false,
+      "error_type": null,
+      "backend": "ROCm",
+      "ngl": 99,
+      "mmap": 0,
+      "params_b": 107.77,
+      "file_size_gib": 106.65,
+      "name_params_b": 107.77,
+      "quant": "Q8_0",
+      "log": "results/Llama-4-Scout-17B-16E-Instruct-Q8_0-00001-of-00003__rocm6_4_4-rocwmma__hblt0__fa1.log",
+      "build": {
+        "hash": "4807e8f9",
+        "number": "6609"
+      }
+    },
+    {
+      "model": "Llama-4-Scout-17B-16E-Instruct-Q8_0-00001-of-00003",
+      "model_clean": "Llama-4-Scout-17B-16E-Instruct-Q8_0",
+      "env": "rocm6_4_4",
+      "env_base": "rocm6_4_4",
+      "env_variant": null,
+      "fa": false,
+      "test": "pp512",
+      "tps_mean": 279.89,
+      "tps_std": 0.66,
+      "error": false,
+      "error_type": null,
+      "backend": "ROCm",
+      "ngl": 99,
+      "mmap": 0,
+      "params_b": 107.77,
+      "file_size_gib": 106.65,
+      "name_params_b": 107.77,
+      "quant": "Q8_0",
+      "log": "results/Llama-4-Scout-17B-16E-Instruct-Q8_0-00001-of-00003__rocm6_4_4.log",
+      "build": {
+        "hash": "4807e8f9",
+        "number": "6609"
+      }
+    },
+    {
+      "model": "Llama-4-Scout-17B-16E-Instruct-Q8_0-00001-of-00003",
+      "model_clean": "Llama-4-Scout-17B-16E-Instruct-Q8_0",
+      "env": "rocm6_4_4",
+      "env_base": "rocm6_4_4",
+      "env_variant": null,
+      "fa": false,
+      "test": "tg128",
+      "tps_mean": 11.92,
+      "tps_std": 0.0,
+      "error": false,
+      "error_type": null,
+      "backend": "ROCm",
+      "ngl": 99,
+      "mmap": 0,
+      "params_b": 107.77,
+      "file_size_gib": 106.65,
+      "name_params_b": 107.77,
+      "quant": "Q8_0",
+      "log": "results/Llama-4-Scout-17B-16E-Instruct-Q8_0-00001-of-00003__rocm6_4_4.log",
+      "build": {
+        "hash": "4807e8f9",
+        "number": "6609"
+      }
+    },
+    {
+      "model": "Llama-4-Scout-17B-16E-Instruct-Q8_0-00001-of-00003",
+      "model_clean": "Llama-4-Scout-17B-16E-Instruct-Q8_0",
+      "env": "rocm6_4_4",
+      "env_base": "rocm6_4_4",
+      "env_variant": null,
+      "fa": true,
+      "test": "pp512",
+      "tps_mean": 297.68,
+      "tps_std": 2.9,
+      "error": false,
+      "error_type": null,
+      "backend": "ROCm",
+      "ngl": 99,
+      "mmap": 0,
+      "params_b": 107.77,
+      "file_size_gib": 106.65,
+      "name_params_b": 107.77,
+      "quant": "Q8_0",
+      "log": "results/Llama-4-Scout-17B-16E-Instruct-Q8_0-00001-of-00003__rocm6_4_4__fa1.log",
+      "build": {
+        "hash": "4807e8f9",
+        "number": "6609"
+      }
+    },
+    {
+      "model": "Llama-4-Scout-17B-16E-Instruct-Q8_0-00001-of-00003",
+      "model_clean": "Llama-4-Scout-17B-16E-Instruct-Q8_0",
+      "env": "rocm6_4_4",
+      "env_base": "rocm6_4_4",
+      "env_variant": null,
+      "fa": true,
+      "test": "tg128",
+      "tps_mean": 11.97,
+      "tps_std": 0.09,
+      "error": false,
+      "error_type": null,
+      "backend": "ROCm",
+      "ngl": 99,
+      "mmap": 0,
+      "params_b": 107.77,
+      "file_size_gib": 106.65,
+      "name_params_b": 107.77,
+      "quant": "Q8_0",
+      "log": "results/Llama-4-Scout-17B-16E-Instruct-Q8_0-00001-of-00003__rocm6_4_4__fa1.log",
+      "build": {
+        "hash": "4807e8f9",
+        "number": "6609"
+      }
+    },
+    {
+      "model": "Llama-4-Scout-17B-16E-Instruct-Q8_0-00001-of-00003",
+      "model_clean": "Llama-4-Scout-17B-16E-Instruct-Q8_0",
+      "env": "rocm6_4_4-hblt0",
+      "env_base": "rocm6_4_4",
+      "env_variant": "hblt0",
+      "fa": false,
+      "test": "pp512",
+      "tps_mean": 284.44,
+      "tps_std": 3.25,
+      "error": false,
+      "error_type": null,
+      "backend": "ROCm",
+      "ngl": 99,
+      "mmap": 0,
+      "params_b": 107.77,
+      "file_size_gib": 106.65,
+      "name_params_b": 107.77,
+      "quant": "Q8_0",
+      "log": "results/Llama-4-Scout-17B-16E-Instruct-Q8_0-00001-of-00003__rocm6_4_4__hblt0.log",
+      "build": {
+        "hash": "4807e8f9",
+        "number": "6609"
+      }
+    },
+    {
+      "model": "Llama-4-Scout-17B-16E-Instruct-Q8_0-00001-of-00003",
+      "model_clean": "Llama-4-Scout-17B-16E-Instruct-Q8_0",
+      "env": "rocm6_4_4-hblt0",
+      "env_base": "rocm6_4_4",
+      "env_variant": "hblt0",
+      "fa": false,
+      "test": "tg128",
+      "tps_mean": 11.9,
+      "tps_std": 0.04,
+      "error": false,
+      "error_type": null,
+      "backend": "ROCm",
+      "ngl": 99,
+      "mmap": 0,
+      "params_b": 107.77,
+      "file_size_gib": 106.65,
+      "name_params_b": 107.77,
+      "quant": "Q8_0",
+      "log": "results/Llama-4-Scout-17B-16E-Instruct-Q8_0-00001-of-00003__rocm6_4_4__hblt0.log",
+      "build": {
+        "hash": "4807e8f9",
+        "number": "6609"
+      }
+    },
+    {
+      "model": "Llama-4-Scout-17B-16E-Instruct-Q8_0-00001-of-00003",
+      "model_clean": "Llama-4-Scout-17B-16E-Instruct-Q8_0",
+      "env": "rocm6_4_4-hblt0",
+      "env_base": "rocm6_4_4",
+      "env_variant": "hblt0",
+      "fa": true,
+      "test": "pp512",
+      "tps_mean": 300.04,
+      "tps_std": 1.45,
+      "error": false,
+      "error_type": null,
+      "backend": "ROCm",
+      "ngl": 99,
+      "mmap": 0,
+      "params_b": 107.77,
+      "file_size_gib": 106.65,
+      "name_params_b": 107.77,
+      "quant": "Q8_0",
+      "log": "results/Llama-4-Scout-17B-16E-Instruct-Q8_0-00001-of-00003__rocm6_4_4__hblt0__fa1.log",
+      "build": {
+        "hash": "4807e8f9",
+        "number": "6609"
+      }
+    },
+    {
+      "model": "Llama-4-Scout-17B-16E-Instruct-Q8_0-00001-of-00003",
+      "model_clean": "Llama-4-Scout-17B-16E-Instruct-Q8_0",
+      "env": "rocm6_4_4-hblt0",
+      "env_base": "rocm6_4_4",
+      "env_variant": "hblt0",
+      "fa": true,
+      "test": "tg128",
+      "tps_mean": 12.0,
+      "tps_std": 0.0,
+      "error": false,
+      "error_type": null,
+      "backend": "ROCm",
+      "ngl": 99,
+      "mmap": 0,
+      "params_b": 107.77,
+      "file_size_gib": 106.65,
+      "name_params_b": 107.77,
+      "quant": "Q8_0",
+      "log": "results/Llama-4-Scout-17B-16E-Instruct-Q8_0-00001-of-00003__rocm6_4_4__hblt0__fa1.log",
+      "build": {
+        "hash": "4807e8f9",
+        "number": "6609"
+      }
+    },
     {
       "model": "Llama-4-Scout-17B-16E-Instruct-Q8_0-00001-of-00003",
       "model_clean": "Llama-4-Scout-17B-16E-Instruct-Q8_0",
@@ -5781,6 +7789,406 @@
         "number": "6486"
       }
     },
+    {
+      "model": "Llama-4-Scout-17B-16E-Instruct-UD-Q4_K_XL-00001-of-00002",
+      "model_clean": "Llama-4-Scout-17B-16E-Instruct-UD-Q4_K_XL",
+      "env": "rocm6_4_4-rocwmma",
+      "env_base": "rocm6_4_4",
+      "env_variant": "rocwmma",
+      "fa": false,
+      "test": "pp512",
+      "tps_mean": 291.19,
+      "tps_std": 2.35,
+      "error": false,
+      "error_type": null,
+      "backend": "ROCm",
+      "ngl": 99,
+      "mmap": 0,
+      "params_b": 107.77,
+      "file_size_gib": 57.73,
+      "name_params_b": 107.77,
+      "quant": "Q4_K_XL",
+      "log": "results/Llama-4-Scout-17B-16E-Instruct-UD-Q4_K_XL-00001-of-00002__rocm6_4_4-rocwmma.log",
+      "build": {
+        "hash": "4807e8f9",
+        "number": "6609"
+      }
+    },
+    {
+      "model": "Llama-4-Scout-17B-16E-Instruct-UD-Q4_K_XL-00001-of-00002",
+      "model_clean": "Llama-4-Scout-17B-16E-Instruct-UD-Q4_K_XL",
+      "env": "rocm6_4_4-rocwmma",
+      "env_base": "rocm6_4_4",
+      "env_variant": "rocwmma",
+      "fa": false,
+      "test": "tg128",
+      "tps_mean": 17.82,
+      "tps_std": 0.0,
+      "error": false,
+      "error_type": null,
+      "backend": "ROCm",
+      "ngl": 99,
+      "mmap": 0,
+      "params_b": 107.77,
+      "file_size_gib": 57.73,
+      "name_params_b": 107.77,
+      "quant": "Q4_K_XL",
+      "log": "results/Llama-4-Scout-17B-16E-Instruct-UD-Q4_K_XL-00001-of-00002__rocm6_4_4-rocwmma.log",
+      "build": {
+        "hash": "4807e8f9",
+        "number": "6609"
+      }
+    },
+    {
+      "model": "Llama-4-Scout-17B-16E-Instruct-UD-Q4_K_XL-00001-of-00002",
+      "model_clean": "Llama-4-Scout-17B-16E-Instruct-UD-Q4_K_XL",
+      "env": "rocm6_4_4-rocwmma",
+      "env_base": "rocm6_4_4",
+      "env_variant": "rocwmma",
+      "fa": true,
+      "test": "pp512",
+      "tps_mean": 307.71,
+      "tps_std": 1.77,
+      "error": false,
+      "error_type": null,
+      "backend": "ROCm",
+      "ngl": 99,
+      "mmap": 0,
+      "params_b": 107.77,
+      "file_size_gib": 57.73,
+      "name_params_b": 107.77,
+      "quant": "Q4_K_XL",
+      "log": "results/Llama-4-Scout-17B-16E-Instruct-UD-Q4_K_XL-00001-of-00002__rocm6_4_4-rocwmma__fa1.log",
+      "build": {
+        "hash": "4807e8f9",
+        "number": "6609"
+      }
+    },
+    {
+      "model": "Llama-4-Scout-17B-16E-Instruct-UD-Q4_K_XL-00001-of-00002",
+      "model_clean": "Llama-4-Scout-17B-16E-Instruct-UD-Q4_K_XL",
+      "env": "rocm6_4_4-rocwmma",
+      "env_base": "rocm6_4_4",
+      "env_variant": "rocwmma",
+      "fa": true,
+      "test": "tg128",
+      "tps_mean": 18.0,
+      "tps_std": 0.0,
+      "error": false,
+      "error_type": null,
+      "backend": "ROCm",
+      "ngl": 99,
+      "mmap": 0,
+      "params_b": 107.77,
+      "file_size_gib": 57.73,
+      "name_params_b": 107.77,
+      "quant": "Q4_K_XL",
+      "log": "results/Llama-4-Scout-17B-16E-Instruct-UD-Q4_K_XL-00001-of-00002__rocm6_4_4-rocwmma__fa1.log",
+      "build": {
+        "hash": "4807e8f9",
+        "number": "6609"
+      }
+    },
+    {
+      "model": "Llama-4-Scout-17B-16E-Instruct-UD-Q4_K_XL-00001-of-00002",
+      "model_clean": "Llama-4-Scout-17B-16E-Instruct-UD-Q4_K_XL",
+      "env": "rocm6_4_4-rocwmma-hblt0",
+      "env_base": "rocm6_4_4",
+      "env_variant": "rocwmma-hblt0",
+      "fa": false,
+      "test": "pp512",
+      "tps_mean": 291.96,
+      "tps_std": 2.18,
+      "error": false,
+      "error_type": null,
+      "backend": "ROCm",
+      "ngl": 99,
+      "mmap": 0,
+      "params_b": 107.77,
+      "file_size_gib": 57.73,
+      "name_params_b": 107.77,
+      "quant": "Q4_K_XL",
+      "log": "results/Llama-4-Scout-17B-16E-Instruct-UD-Q4_K_XL-00001-of-00002__rocm6_4_4-rocwmma__hblt0.log",
+      "build": {
+        "hash": "4807e8f9",
+        "number": "6609"
+      }
+    },
+    {
+      "model": "Llama-4-Scout-17B-16E-Instruct-UD-Q4_K_XL-00001-of-00002",
+      "model_clean": "Llama-4-Scout-17B-16E-Instruct-UD-Q4_K_XL",
+      "env": "rocm6_4_4-rocwmma-hblt0",
+      "env_base": "rocm6_4_4",
+      "env_variant": "rocwmma-hblt0",
+      "fa": false,
+      "test": "tg128",
+      "tps_mean": 17.82,
+      "tps_std": 0.0,
+      "error": false,
+      "error_type": null,
+      "backend": "ROCm",
+      "ngl": 99,
+      "mmap": 0,
+      "params_b": 107.77,
+      "file_size_gib": 57.73,
+      "name_params_b": 107.77,
+      "quant": "Q4_K_XL",
+      "log": "results/Llama-4-Scout-17B-16E-Instruct-UD-Q4_K_XL-00001-of-00002__rocm6_4_4-rocwmma__hblt0.log",
+      "build": {
+        "hash": "4807e8f9",
+        "number": "6609"
+      }
+    },
+    {
+      "model": "Llama-4-Scout-17B-16E-Instruct-UD-Q4_K_XL-00001-of-00002",
+      "model_clean": "Llama-4-Scout-17B-16E-Instruct-UD-Q4_K_XL",
+      "env": "rocm6_4_4-rocwmma-hblt0",
+      "env_base": "rocm6_4_4",
+      "env_variant": "rocwmma-hblt0",
+      "fa": true,
+      "test": "pp512",
+      "tps_mean": 310.84,
+      "tps_std": 1.35,
+      "error": false,
+      "error_type": null,
+      "backend": "ROCm",
+      "ngl": 99,
+      "mmap": 0,
+      "params_b": 107.77,
+      "file_size_gib": 57.73,
+      "name_params_b": 107.77,
+      "quant": "Q4_K_XL",
+      "log": "results/Llama-4-Scout-17B-16E-Instruct-UD-Q4_K_XL-00001-of-00002__rocm6_4_4-rocwmma__hblt0__fa1.log",
+      "build": {
+        "hash": "4807e8f9",
+        "number": "6609"
+      }
+    },
+    {
+      "model": "Llama-4-Scout-17B-16E-Instruct-UD-Q4_K_XL-00001-of-00002",
+      "model_clean": "Llama-4-Scout-17B-16E-Instruct-UD-Q4_K_XL",
+      "env": "rocm6_4_4-rocwmma-hblt0",
+      "env_base": "rocm6_4_4",
+      "env_variant": "rocwmma-hblt0",
+      "fa": true,
+      "test": "tg128",
+      "tps_mean": 18.01,
+      "tps_std": 0.0,
+      "error": false,
+      "error_type": null,
+      "backend": "ROCm",
+      "ngl": 99,
+      "mmap": 0,
+      "params_b": 107.77,
+      "file_size_gib": 57.73,
+      "name_params_b": 107.77,
+      "quant": "Q4_K_XL",
+      "log": "results/Llama-4-Scout-17B-16E-Instruct-UD-Q4_K_XL-00001-of-00002__rocm6_4_4-rocwmma__hblt0__fa1.log",
+      "build": {
+        "hash": "4807e8f9",
+        "number": "6609"
+      }
+    },
+    {
+      "model": "Llama-4-Scout-17B-16E-Instruct-UD-Q4_K_XL-00001-of-00002",
+      "model_clean": "Llama-4-Scout-17B-16E-Instruct-UD-Q4_K_XL",
+      "env": "rocm6_4_4",
+      "env_base": "rocm6_4_4",
+      "env_variant": null,
+      "fa": false,
+      "test": "pp512",
+      "tps_mean": 291.26,
+      "tps_std": 0.79,
+      "error": false,
+      "error_type": null,
+      "backend": "ROCm",
+      "ngl": 99,
+      "mmap": 0,
+      "params_b": 107.77,
+      "file_size_gib": 57.73,
+      "name_params_b": 107.77,
+      "quant": "Q4_K_XL",
+      "log": "results/Llama-4-Scout-17B-16E-Instruct-UD-Q4_K_XL-00001-of-00002__rocm6_4_4.log",
+      "build": {
+        "hash": "4807e8f9",
+        "number": "6609"
+      }
+    },
+    {
+      "model": "Llama-4-Scout-17B-16E-Instruct-UD-Q4_K_XL-00001-of-00002",
+      "model_clean": "Llama-4-Scout-17B-16E-Instruct-UD-Q4_K_XL",
+      "env": "rocm6_4_4",
+      "env_base": "rocm6_4_4",
+      "env_variant": null,
+      "fa": false,
+      "test": "tg128",
+      "tps_mean": 17.83,
+      "tps_std": 0.0,
+      "error": false,
+      "error_type": null,
+      "backend": "ROCm",
+      "ngl": 99,
+      "mmap": 0,
+      "params_b": 107.77,
+      "file_size_gib": 57.73,
+      "name_params_b": 107.77,
+      "quant": "Q4_K_XL",
+      "log": "results/Llama-4-Scout-17B-16E-Instruct-UD-Q4_K_XL-00001-of-00002__rocm6_4_4.log",
+      "build": {
+        "hash": "4807e8f9",
+        "number": "6609"
+      }
+    },
+    {
+      "model": "Llama-4-Scout-17B-16E-Instruct-UD-Q4_K_XL-00001-of-00002",
+      "model_clean": "Llama-4-Scout-17B-16E-Instruct-UD-Q4_K_XL",
+      "env": "rocm6_4_4",
+      "env_base": "rocm6_4_4",
+      "env_variant": null,
+      "fa": true,
+      "test": "pp512",
+      "tps_mean": 311.26,
+      "tps_std": 1.06,
+      "error": false,
+      "error_type": null,
+      "backend": "ROCm",
+      "ngl": 99,
+      "mmap": 0,
+      "params_b": 107.77,
+      "file_size_gib": 57.73,
+      "name_params_b": 107.77,
+      "quant": "Q4_K_XL",
+      "log": "results/Llama-4-Scout-17B-16E-Instruct-UD-Q4_K_XL-00001-of-00002__rocm6_4_4__fa1.log",
+      "build": {
+        "hash": "4807e8f9",
+        "number": "6609"
+      }
+    },
+    {
+      "model": "Llama-4-Scout-17B-16E-Instruct-UD-Q4_K_XL-00001-of-00002",
+      "model_clean": "Llama-4-Scout-17B-16E-Instruct-UD-Q4_K_XL",
+      "env": "rocm6_4_4",
+      "env_base": "rocm6_4_4",
+      "env_variant": null,
+      "fa": true,
+      "test": "tg128",
+      "tps_mean": 17.97,
+      "tps_std": 0.0,
+      "error": false,
+      "error_type": null,
+      "backend": "ROCm",
+      "ngl": 99,
+      "mmap": 0,
+      "params_b": 107.77,
+      "file_size_gib": 57.73,
+      "name_params_b": 107.77,
+      "quant": "Q4_K_XL",
+      "log": "results/Llama-4-Scout-17B-16E-Instruct-UD-Q4_K_XL-00001-of-00002__rocm6_4_4__fa1.log",
+      "build": {
+        "hash": "4807e8f9",
+        "number": "6609"
+      }
+    },
+    {
+      "model": "Llama-4-Scout-17B-16E-Instruct-UD-Q4_K_XL-00001-of-00002",
+      "model_clean": "Llama-4-Scout-17B-16E-Instruct-UD-Q4_K_XL",
+      "env": "rocm6_4_4-hblt0",
+      "env_base": "rocm6_4_4",
+      "env_variant": "hblt0",
+      "fa": false,
+      "test": "pp512",
+      "tps_mean": 290.78,
+      "tps_std": 1.38,
+      "error": false,
+      "error_type": null,
+      "backend": "ROCm",
+      "ngl": 99,
+      "mmap": 0,
+      "params_b": 107.77,
+      "file_size_gib": 57.73,
+      "name_params_b": 107.77,
+      "quant": "Q4_K_XL",
+      "log": "results/Llama-4-Scout-17B-16E-Instruct-UD-Q4_K_XL-00001-of-00002__rocm6_4_4__hblt0.log",
+      "build": {
+        "hash": "4807e8f9",
+        "number": "6609"
+      }
+    },
+    {
+      "model": "Llama-4-Scout-17B-16E-Instruct-UD-Q4_K_XL-00001-of-00002",
+      "model_clean": "Llama-4-Scout-17B-16E-Instruct-UD-Q4_K_XL",
+      "env": "rocm6_4_4-hblt0",
+      "env_base": "rocm6_4_4",
+      "env_variant": "hblt0",
+      "fa": false,
+      "test": "tg128",
+      "tps_mean": 17.81,
+      "tps_std": 0.01,
+      "error": false,
+      "error_type": null,
+      "backend": "ROCm",
+      "ngl": 99,
+      "mmap": 0,
+      "params_b": 107.77,
+      "file_size_gib": 57.73,
+      "name_params_b": 107.77,
+      "quant": "Q4_K_XL",
+      "log": "results/Llama-4-Scout-17B-16E-Instruct-UD-Q4_K_XL-00001-of-00002__rocm6_4_4__hblt0.log",
+      "build": {
+        "hash": "4807e8f9",
+        "number": "6609"
+      }
+    },
+    {
+      "model": "Llama-4-Scout-17B-16E-Instruct-UD-Q4_K_XL-00001-of-00002",
+      "model_clean": "Llama-4-Scout-17B-16E-Instruct-UD-Q4_K_XL",
+      "env": "rocm6_4_4-hblt0",
+      "env_base": "rocm6_4_4",
+      "env_variant": "hblt0",
+      "fa": true,
+      "test": "pp512",
+      "tps_mean": 310.36,
+      "tps_std": 1.62,
+      "error": false,
+      "error_type": null,
+      "backend": "ROCm",
+      "ngl": 99,
+      "mmap": 0,
+      "params_b": 107.77,
+      "file_size_gib": 57.73,
+      "name_params_b": 107.77,
+      "quant": "Q4_K_XL",
+      "log": "results/Llama-4-Scout-17B-16E-Instruct-UD-Q4_K_XL-00001-of-00002__rocm6_4_4__hblt0__fa1.log",
+      "build": {
+        "hash": "4807e8f9",
+        "number": "6609"
+      }
+    },
+    {
+      "model": "Llama-4-Scout-17B-16E-Instruct-UD-Q4_K_XL-00001-of-00002",
+      "model_clean": "Llama-4-Scout-17B-16E-Instruct-UD-Q4_K_XL",
+      "env": "rocm6_4_4-hblt0",
+      "env_base": "rocm6_4_4",
+      "env_variant": "hblt0",
+      "fa": true,
+      "test": "tg128",
+      "tps_mean": 18.0,
+      "tps_std": 0.0,
+      "error": false,
+      "error_type": null,
+      "backend": "ROCm",
+      "ngl": 99,
+      "mmap": 0,
+      "params_b": 107.77,
+      "file_size_gib": 57.73,
+      "name_params_b": 107.77,
+      "quant": "Q4_K_XL",
+      "log": "results/Llama-4-Scout-17B-16E-Instruct-UD-Q4_K_XL-00001-of-00002__rocm6_4_4__hblt0__fa1.log",
+      "build": {
+        "hash": "4807e8f9",
+        "number": "6609"
+      }
+    },
     {
       "model": "Llama-4-Scout-17B-16E-Instruct-UD-Q4_K_XL-00001-of-00002",
       "model_clean": "Llama-4-Scout-17B-16E-Instruct-UD-Q4_K_XL",
@@ -6897,6 +9305,406 @@
         "number": "6486"
       }
     },
+    {
+      "model": "Qwen3-235B-A22B-Instruct-2507-UD-Q3_K_XL-00001-of-00003",
+      "model_clean": "Qwen3-235B-A22B-Instruct-2507-UD-Q3_K_XL",
+      "env": "rocm6_4_4-rocwmma",
+      "env_base": "rocm6_4_4",
+      "env_variant": "rocwmma",
+      "fa": false,
+      "test": "pp512",
+      "tps_mean": 134.57,
+      "tps_std": 0.66,
+      "error": false,
+      "error_type": null,
+      "backend": "ROCm",
+      "ngl": 99,
+      "mmap": 0,
+      "params_b": 235.09,
+      "file_size_gib": 96.99,
+      "name_params_b": 235.09,
+      "quant": "Q3_K_XL",
+      "log": "results/Qwen3-235B-A22B-Instruct-2507-UD-Q3_K_XL-00001-of-00003__rocm6_4_4-rocwmma.log",
+      "build": {
+        "hash": "4807e8f9",
+        "number": "6609"
+      }
+    },
+    {
+      "model": "Qwen3-235B-A22B-Instruct-2507-UD-Q3_K_XL-00001-of-00003",
+      "model_clean": "Qwen3-235B-A22B-Instruct-2507-UD-Q3_K_XL",
+      "env": "rocm6_4_4-rocwmma",
+      "env_base": "rocm6_4_4",
+      "env_variant": "rocwmma",
+      "fa": false,
+      "test": "tg128",
+      "tps_mean": 14.57,
+      "tps_std": 0.0,
+      "error": false,
+      "error_type": null,
+      "backend": "ROCm",
+      "ngl": 99,
+      "mmap": 0,
+      "params_b": 235.09,
+      "file_size_gib": 96.99,
+      "name_params_b": 235.09,
+      "quant": "Q3_K_XL",
+      "log": "results/Qwen3-235B-A22B-Instruct-2507-UD-Q3_K_XL-00001-of-00003__rocm6_4_4-rocwmma.log",
+      "build": {
+        "hash": "4807e8f9",
+        "number": "6609"
+      }
+    },
+    {
+      "model": "Qwen3-235B-A22B-Instruct-2507-UD-Q3_K_XL-00001-of-00003",
+      "model_clean": "Qwen3-235B-A22B-Instruct-2507-UD-Q3_K_XL",
+      "env": "rocm6_4_4-rocwmma",
+      "env_base": "rocm6_4_4",
+      "env_variant": "rocwmma",
+      "fa": true,
+      "test": "pp512",
+      "tps_mean": 144.38,
+      "tps_std": 0.73,
+      "error": false,
+      "error_type": null,
+      "backend": "ROCm",
+      "ngl": 99,
+      "mmap": 0,
+      "params_b": 235.09,
+      "file_size_gib": 96.99,
+      "name_params_b": 235.09,
+      "quant": "Q3_K_XL",
+      "log": "results/Qwen3-235B-A22B-Instruct-2507-UD-Q3_K_XL-00001-of-00003__rocm6_4_4-rocwmma__fa1.log",
+      "build": {
+        "hash": "4807e8f9",
+        "number": "6609"
+      }
+    },
+    {
+      "model": "Qwen3-235B-A22B-Instruct-2507-UD-Q3_K_XL-00001-of-00003",
+      "model_clean": "Qwen3-235B-A22B-Instruct-2507-UD-Q3_K_XL",
+      "env": "rocm6_4_4-rocwmma",
+      "env_base": "rocm6_4_4",
+      "env_variant": "rocwmma",
+      "fa": true,
+      "test": "tg128",
+      "tps_mean": 14.9,
+      "tps_std": 0.0,
+      "error": false,
+      "error_type": null,
+      "backend": "ROCm",
+      "ngl": 99,
+      "mmap": 0,
+      "params_b": 235.09,
+      "file_size_gib": 96.99,
+      "name_params_b": 235.09,
+      "quant": "Q3_K_XL",
+      "log": "results/Qwen3-235B-A22B-Instruct-2507-UD-Q3_K_XL-00001-of-00003__rocm6_4_4-rocwmma__fa1.log",
+      "build": {
+        "hash": "4807e8f9",
+        "number": "6609"
+      }
+    },
+    {
+      "model": "Qwen3-235B-A22B-Instruct-2507-UD-Q3_K_XL-00001-of-00003",
+      "model_clean": "Qwen3-235B-A22B-Instruct-2507-UD-Q3_K_XL",
+      "env": "rocm6_4_4-rocwmma-hblt0",
+      "env_base": "rocm6_4_4",
+      "env_variant": "rocwmma-hblt0",
+      "fa": false,
+      "test": "pp512",
+      "tps_mean": 134.69,
+      "tps_std": 1.05,
+      "error": false,
+      "error_type": null,
+      "backend": "ROCm",
+      "ngl": 99,
+      "mmap": 0,
+      "params_b": 235.09,
+      "file_size_gib": 96.99,
+      "name_params_b": 235.09,
+      "quant": "Q3_K_XL",
+      "log": "results/Qwen3-235B-A22B-Instruct-2507-UD-Q3_K_XL-00001-of-00003__rocm6_4_4-rocwmma__hblt0.log",
+      "build": {
+        "hash": "4807e8f9",
+        "number": "6609"
+      }
+    },
+    {
+      "model": "Qwen3-235B-A22B-Instruct-2507-UD-Q3_K_XL-00001-of-00003",
+      "model_clean": "Qwen3-235B-A22B-Instruct-2507-UD-Q3_K_XL",
+      "env": "rocm6_4_4-rocwmma-hblt0",
+      "env_base": "rocm6_4_4",
+      "env_variant": "rocwmma-hblt0",
+      "fa": false,
+      "test": "tg128",
+      "tps_mean": 14.58,
+      "tps_std": 0.0,
+      "error": false,
+      "error_type": null,
+      "backend": "ROCm",
+      "ngl": 99,
+      "mmap": 0,
+      "params_b": 235.09,
+      "file_size_gib": 96.99,
+      "name_params_b": 235.09,
+      "quant": "Q3_K_XL",
+      "log": "results/Qwen3-235B-A22B-Instruct-2507-UD-Q3_K_XL-00001-of-00003__rocm6_4_4-rocwmma__hblt0.log",
+      "build": {
+        "hash": "4807e8f9",
+        "number": "6609"
+      }
+    },
+    {
+      "model": "Qwen3-235B-A22B-Instruct-2507-UD-Q3_K_XL-00001-of-00003",
+      "model_clean": "Qwen3-235B-A22B-Instruct-2507-UD-Q3_K_XL",
+      "env": "rocm6_4_4-rocwmma-hblt0",
+      "env_base": "rocm6_4_4",
+      "env_variant": "rocwmma-hblt0",
+      "fa": true,
+      "test": "pp512",
+      "tps_mean": 143.45,
+      "tps_std": 0.41,
+      "error": false,
+      "error_type": null,
+      "backend": "ROCm",
+      "ngl": 99,
+      "mmap": 0,
+      "params_b": 235.09,
+      "file_size_gib": 96.99,
+      "name_params_b": 235.09,
+      "quant": "Q3_K_XL",
+      "log": "results/Qwen3-235B-A22B-Instruct-2507-UD-Q3_K_XL-00001-of-00003__rocm6_4_4-rocwmma__hblt0__fa1.log",
+      "build": {
+        "hash": "4807e8f9",
+        "number": "6609"
+      }
+    },
+    {
+      "model": "Qwen3-235B-A22B-Instruct-2507-UD-Q3_K_XL-00001-of-00003",
+      "model_clean": "Qwen3-235B-A22B-Instruct-2507-UD-Q3_K_XL",
+      "env": "rocm6_4_4-rocwmma-hblt0",
+      "env_base": "rocm6_4_4",
+      "env_variant": "rocwmma-hblt0",
+      "fa": true,
+      "test": "tg128",
+      "tps_mean": 14.97,
+      "tps_std": 0.0,
+      "error": false,
+      "error_type": null,
+      "backend": "ROCm",
+      "ngl": 99,
+      "mmap": 0,
+      "params_b": 235.09,
+      "file_size_gib": 96.99,
+      "name_params_b": 235.09,
+      "quant": "Q3_K_XL",
+      "log": "results/Qwen3-235B-A22B-Instruct-2507-UD-Q3_K_XL-00001-of-00003__rocm6_4_4-rocwmma__hblt0__fa1.log",
+      "build": {
+        "hash": "4807e8f9",
+        "number": "6609"
+      }
+    },
+    {
+      "model": "Qwen3-235B-A22B-Instruct-2507-UD-Q3_K_XL-00001-of-00003",
+      "model_clean": "Qwen3-235B-A22B-Instruct-2507-UD-Q3_K_XL",
+      "env": "rocm6_4_4",
+      "env_base": "rocm6_4_4",
+      "env_variant": null,
+      "fa": false,
+      "test": "pp512",
+      "tps_mean": 133.5,
+      "tps_std": 0.67,
+      "error": false,
+      "error_type": null,
+      "backend": "ROCm",
+      "ngl": 99,
+      "mmap": 0,
+      "params_b": 235.09,
+      "file_size_gib": 96.99,
+      "name_params_b": 235.09,
+      "quant": "Q3_K_XL",
+      "log": "results/Qwen3-235B-A22B-Instruct-2507-UD-Q3_K_XL-00001-of-00003__rocm6_4_4.log",
+      "build": {
+        "hash": "4807e8f9",
+        "number": "6609"
+      }
+    },
+    {
+      "model": "Qwen3-235B-A22B-Instruct-2507-UD-Q3_K_XL-00001-of-00003",
+      "model_clean": "Qwen3-235B-A22B-Instruct-2507-UD-Q3_K_XL",
+      "env": "rocm6_4_4",
+      "env_base": "rocm6_4_4",
+      "env_variant": null,
+      "fa": false,
+      "test": "tg128",
+      "tps_mean": 14.55,
+      "tps_std": 0.0,
+      "error": false,
+      "error_type": null,
+      "backend": "ROCm",
+      "ngl": 99,
+      "mmap": 0,
+      "params_b": 235.09,
+      "file_size_gib": 96.99,
+      "name_params_b": 235.09,
+      "quant": "Q3_K_XL",
+      "log": "results/Qwen3-235B-A22B-Instruct-2507-UD-Q3_K_XL-00001-of-00003__rocm6_4_4.log",
+      "build": {
+        "hash": "4807e8f9",
+        "number": "6609"
+      }
+    },
+    {
+      "model": "Qwen3-235B-A22B-Instruct-2507-UD-Q3_K_XL-00001-of-00003",
+      "model_clean": "Qwen3-235B-A22B-Instruct-2507-UD-Q3_K_XL",
+      "env": "rocm6_4_4",
+      "env_base": "rocm6_4_4",
+      "env_variant": null,
+      "fa": true,
+      "test": "pp512",
+      "tps_mean": 144.31,
+      "tps_std": 0.58,
+      "error": false,
+      "error_type": null,
+      "backend": "ROCm",
+      "ngl": 99,
+      "mmap": 0,
+      "params_b": 235.09,
+      "file_size_gib": 96.99,
+      "name_params_b": 235.09,
+      "quant": "Q3_K_XL",
+      "log": "results/Qwen3-235B-A22B-Instruct-2507-UD-Q3_K_XL-00001-of-00003__rocm6_4_4__fa1.log",
+      "build": {
+        "hash": "4807e8f9",
+        "number": "6609"
+      }
+    },
+    {
+      "model": "Qwen3-235B-A22B-Instruct-2507-UD-Q3_K_XL-00001-of-00003",
+      "model_clean": "Qwen3-235B-A22B-Instruct-2507-UD-Q3_K_XL",
+      "env": "rocm6_4_4",
+      "env_base": "rocm6_4_4",
+      "env_variant": null,
+      "fa": true,
+      "test": "tg128",
+      "tps_mean": 14.93,
+      "tps_std": 0.0,
+      "error": false,
+      "error_type": null,
+      "backend": "ROCm",
+      "ngl": 99,
+      "mmap": 0,
+      "params_b": 235.09,
+      "file_size_gib": 96.99,
+      "name_params_b": 235.09,
+      "quant": "Q3_K_XL",
+      "log": "results/Qwen3-235B-A22B-Instruct-2507-UD-Q3_K_XL-00001-of-00003__rocm6_4_4__fa1.log",
+      "build": {
+        "hash": "4807e8f9",
+        "number": "6609"
+      }
+    },
+    {
+      "model": "Qwen3-235B-A22B-Instruct-2507-UD-Q3_K_XL-00001-of-00003",
+      "model_clean": "Qwen3-235B-A22B-Instruct-2507-UD-Q3_K_XL",
+      "env": "rocm6_4_4-hblt0",
+      "env_base": "rocm6_4_4",
+      "env_variant": "hblt0",
+      "fa": false,
+      "test": "pp512",
+      "tps_mean": 133.54,
+      "tps_std": 0.74,
+      "error": false,
+      "error_type": null,
+      "backend": "ROCm",
+      "ngl": 99,
+      "mmap": 0,
+      "params_b": 235.09,
+      "file_size_gib": 96.99,
+      "name_params_b": 235.09,
+      "quant": "Q3_K_XL",
+      "log": "results/Qwen3-235B-A22B-Instruct-2507-UD-Q3_K_XL-00001-of-00003__rocm6_4_4__hblt0.log",
+      "build": {
+        "hash": "4807e8f9",
+        "number": "6609"
+      }
+    },
+    {
+      "model": "Qwen3-235B-A22B-Instruct-2507-UD-Q3_K_XL-00001-of-00003",
+      "model_clean": "Qwen3-235B-A22B-Instruct-2507-UD-Q3_K_XL",
+      "env": "rocm6_4_4-hblt0",
+      "env_base": "rocm6_4_4",
+      "env_variant": "hblt0",
+      "fa": false,
+      "test": "tg128",
+      "tps_mean": 14.54,
+      "tps_std": 0.0,
+      "error": false,
+      "error_type": null,
+      "backend": "ROCm",
+      "ngl": 99,
+      "mmap": 0,
+      "params_b": 235.09,
+      "file_size_gib": 96.99,
+      "name_params_b": 235.09,
+      "quant": "Q3_K_XL",
+      "log": "results/Qwen3-235B-A22B-Instruct-2507-UD-Q3_K_XL-00001-of-00003__rocm6_4_4__hblt0.log",
+      "build": {
+        "hash": "4807e8f9",
+        "number": "6609"
+      }
+    },
+    {
+      "model": "Qwen3-235B-A22B-Instruct-2507-UD-Q3_K_XL-00001-of-00003",
+      "model_clean": "Qwen3-235B-A22B-Instruct-2507-UD-Q3_K_XL",
+      "env": "rocm6_4_4-hblt0",
+      "env_base": "rocm6_4_4",
+      "env_variant": "hblt0",
+      "fa": true,
+      "test": "pp512",
+      "tps_mean": 144.26,
+      "tps_std": 0.29,
+      "error": false,
+      "error_type": null,
+      "backend": "ROCm",
+      "ngl": 99,
+      "mmap": 0,
+      "params_b": 235.09,
+      "file_size_gib": 96.99,
+      "name_params_b": 235.09,
+      "quant": "Q3_K_XL",
+      "log": "results/Qwen3-235B-A22B-Instruct-2507-UD-Q3_K_XL-00001-of-00003__rocm6_4_4__hblt0__fa1.log",
+      "build": {
+        "hash": "4807e8f9",
+        "number": "6609"
+      }
+    },
+    {
+      "model": "Qwen3-235B-A22B-Instruct-2507-UD-Q3_K_XL-00001-of-00003",
+      "model_clean": "Qwen3-235B-A22B-Instruct-2507-UD-Q3_K_XL",
+      "env": "rocm6_4_4-hblt0",
+      "env_base": "rocm6_4_4",
+      "env_variant": "hblt0",
+      "fa": true,
+      "test": "tg128",
+      "tps_mean": 14.92,
+      "tps_std": 0.0,
+      "error": false,
+      "error_type": null,
+      "backend": "ROCm",
+      "ngl": 99,
+      "mmap": 0,
+      "params_b": 235.09,
+      "file_size_gib": 96.99,
+      "name_params_b": 235.09,
+      "quant": "Q3_K_XL",
+      "log": "results/Qwen3-235B-A22B-Instruct-2507-UD-Q3_K_XL-00001-of-00003__rocm6_4_4__hblt0__fa1.log",
+      "build": {
+        "hash": "4807e8f9",
+        "number": "6609"
+      }
+    },
     {
       "model": "Qwen3-235B-A22B-Instruct-2507-UD-Q3_K_XL-00001-of-00003",
       "model_clean": "Qwen3-235B-A22B-Instruct-2507-UD-Q3_K_XL",
@@ -8069,6 +10877,406 @@
         "number": "6486"
       }
     },
+    {
+      "model": "Qwen3-30B-A3B-BF16-00001-of-00002",
+      "model_clean": "Qwen3-30B-A3B-BF16",
+      "env": "rocm6_4_4-rocwmma",
+      "env_base": "rocm6_4_4",
+      "env_variant": "rocwmma",
+      "fa": false,
+      "test": "pp512",
+      "tps_mean": 451.6,
+      "tps_std": 1.8,
+      "error": false,
+      "error_type": null,
+      "backend": "ROCm",
+      "ngl": 99,
+      "mmap": 0,
+      "params_b": 30.53,
+      "file_size_gib": 56.89,
+      "name_params_b": 30.53,
+      "quant": "BF16",
+      "log": "results/Qwen3-30B-A3B-BF16-00001-of-00002__rocm6_4_4-rocwmma.log",
+      "build": {
+        "hash": "4807e8f9",
+        "number": "6609"
+      }
+    },
+    {
+      "model": "Qwen3-30B-A3B-BF16-00001-of-00002",
+      "model_clean": "Qwen3-30B-A3B-BF16",
+      "env": "rocm6_4_4-rocwmma",
+      "env_base": "rocm6_4_4",
+      "env_variant": "rocwmma",
+      "fa": false,
+      "test": "tg128",
+      "tps_mean": 25.54,
+      "tps_std": 0.0,
+      "error": false,
+      "error_type": null,
+      "backend": "ROCm",
+      "ngl": 99,
+      "mmap": 0,
+      "params_b": 30.53,
+      "file_size_gib": 56.89,
+      "name_params_b": 30.53,
+      "quant": "BF16",
+      "log": "results/Qwen3-30B-A3B-BF16-00001-of-00002__rocm6_4_4-rocwmma.log",
+      "build": {
+        "hash": "4807e8f9",
+        "number": "6609"
+      }
+    },
+    {
+      "model": "Qwen3-30B-A3B-BF16-00001-of-00002",
+      "model_clean": "Qwen3-30B-A3B-BF16",
+      "env": "rocm6_4_4-rocwmma",
+      "env_base": "rocm6_4_4",
+      "env_variant": "rocwmma",
+      "fa": true,
+      "test": "pp512",
+      "tps_mean": 482.09,
+      "tps_std": 5.55,
+      "error": false,
+      "error_type": null,
+      "backend": "ROCm",
+      "ngl": 99,
+      "mmap": 0,
+      "params_b": 30.53,
+      "file_size_gib": 56.89,
+      "name_params_b": 30.53,
+      "quant": "BF16",
+      "log": "results/Qwen3-30B-A3B-BF16-00001-of-00002__rocm6_4_4-rocwmma__fa1.log",
+      "build": {
+        "hash": "4807e8f9",
+        "number": "6609"
+      }
+    },
+    {
+      "model": "Qwen3-30B-A3B-BF16-00001-of-00002",
+      "model_clean": "Qwen3-30B-A3B-BF16",
+      "env": "rocm6_4_4-rocwmma",
+      "env_base": "rocm6_4_4",
+      "env_variant": "rocwmma",
+      "fa": true,
+      "test": "tg128",
+      "tps_mean": 25.77,
+      "tps_std": 0.0,
+      "error": false,
+      "error_type": null,
+      "backend": "ROCm",
+      "ngl": 99,
+      "mmap": 0,
+      "params_b": 30.53,
+      "file_size_gib": 56.89,
+      "name_params_b": 30.53,
+      "quant": "BF16",
+      "log": "results/Qwen3-30B-A3B-BF16-00001-of-00002__rocm6_4_4-rocwmma__fa1.log",
+      "build": {
+        "hash": "4807e8f9",
+        "number": "6609"
+      }
+    },
+    {
+      "model": "Qwen3-30B-A3B-BF16-00001-of-00002",
+      "model_clean": "Qwen3-30B-A3B-BF16",
+      "env": "rocm6_4_4-rocwmma-hblt0",
+      "env_base": "rocm6_4_4",
+      "env_variant": "rocwmma-hblt0",
+      "fa": false,
+      "test": "pp512",
+      "tps_mean": 345.46,
+      "tps_std": 3.07,
+      "error": false,
+      "error_type": null,
+      "backend": "ROCm",
+      "ngl": 99,
+      "mmap": 0,
+      "params_b": 30.53,
+      "file_size_gib": 56.89,
+      "name_params_b": 30.53,
+      "quant": "BF16",
+      "log": "results/Qwen3-30B-A3B-BF16-00001-of-00002__rocm6_4_4-rocwmma__hblt0.log",
+      "build": {
+        "hash": "4807e8f9",
+        "number": "6609"
+      }
+    },
+    {
+      "model": "Qwen3-30B-A3B-BF16-00001-of-00002",
+      "model_clean": "Qwen3-30B-A3B-BF16",
+      "env": "rocm6_4_4-rocwmma-hblt0",
+      "env_base": "rocm6_4_4",
+      "env_variant": "rocwmma-hblt0",
+      "fa": false,
+      "test": "tg128",
+      "tps_mean": 25.49,
+      "tps_std": 0.0,
+      "error": false,
+      "error_type": null,
+      "backend": "ROCm",
+      "ngl": 99,
+      "mmap": 0,
+      "params_b": 30.53,
+      "file_size_gib": 56.89,
+      "name_params_b": 30.53,
+      "quant": "BF16",
+      "log": "results/Qwen3-30B-A3B-BF16-00001-of-00002__rocm6_4_4-rocwmma__hblt0.log",
+      "build": {
+        "hash": "4807e8f9",
+        "number": "6609"
+      }
+    },
+    {
+      "model": "Qwen3-30B-A3B-BF16-00001-of-00002",
+      "model_clean": "Qwen3-30B-A3B-BF16",
+      "env": "rocm6_4_4-rocwmma-hblt0",
+      "env_base": "rocm6_4_4",
+      "env_variant": "rocwmma-hblt0",
+      "fa": true,
+      "test": "pp512",
+      "tps_mean": 354.93,
+      "tps_std": 5.65,
+      "error": false,
+      "error_type": null,
+      "backend": "ROCm",
+      "ngl": 99,
+      "mmap": 0,
+      "params_b": 30.53,
+      "file_size_gib": 56.89,
+      "name_params_b": 30.53,
+      "quant": "BF16",
+      "log": "results/Qwen3-30B-A3B-BF16-00001-of-00002__rocm6_4_4-rocwmma__hblt0__fa1.log",
+      "build": {
+        "hash": "4807e8f9",
+        "number": "6609"
+      }
+    },
+    {
+      "model": "Qwen3-30B-A3B-BF16-00001-of-00002",
+      "model_clean": "Qwen3-30B-A3B-BF16",
+      "env": "rocm6_4_4-rocwmma-hblt0",
+      "env_base": "rocm6_4_4",
+      "env_variant": "rocwmma-hblt0",
+      "fa": true,
+      "test": "tg128",
+      "tps_mean": 25.8,
+      "tps_std": 0.0,
+      "error": false,
+      "error_type": null,
+      "backend": "ROCm",
+      "ngl": 99,
+      "mmap": 0,
+      "params_b": 30.53,
+      "file_size_gib": 56.89,
+      "name_params_b": 30.53,
+      "quant": "BF16",
+      "log": "results/Qwen3-30B-A3B-BF16-00001-of-00002__rocm6_4_4-rocwmma__hblt0__fa1.log",
+      "build": {
+        "hash": "4807e8f9",
+        "number": "6609"
+      }
+    },
+    {
+      "model": "Qwen3-30B-A3B-BF16-00001-of-00002",
+      "model_clean": "Qwen3-30B-A3B-BF16",
+      "env": "rocm6_4_4",
+      "env_base": "rocm6_4_4",
+      "env_variant": null,
+      "fa": false,
+      "test": "pp512",
+      "tps_mean": 448.97,
+      "tps_std": 7.97,
+      "error": false,
+      "error_type": null,
+      "backend": "ROCm",
+      "ngl": 99,
+      "mmap": 0,
+      "params_b": 30.53,
+      "file_size_gib": 56.89,
+      "name_params_b": 30.53,
+      "quant": "BF16",
+      "log": "results/Qwen3-30B-A3B-BF16-00001-of-00002__rocm6_4_4.log",
+      "build": {
+        "hash": "4807e8f9",
+        "number": "6609"
+      }
+    },
+    {
+      "model": "Qwen3-30B-A3B-BF16-00001-of-00002",
+      "model_clean": "Qwen3-30B-A3B-BF16",
+      "env": "rocm6_4_4",
+      "env_base": "rocm6_4_4",
+      "env_variant": null,
+      "fa": false,
+      "test": "tg128",
+      "tps_mean": 25.57,
+      "tps_std": 0.01,
+      "error": false,
+      "error_type": null,
+      "backend": "ROCm",
+      "ngl": 99,
+      "mmap": 0,
+      "params_b": 30.53,
+      "file_size_gib": 56.89,
+      "name_params_b": 30.53,
+      "quant": "BF16",
+      "log": "results/Qwen3-30B-A3B-BF16-00001-of-00002__rocm6_4_4.log",
+      "build": {
+        "hash": "4807e8f9",
+        "number": "6609"
+      }
+    },
+    {
+      "model": "Qwen3-30B-A3B-BF16-00001-of-00002",
+      "model_clean": "Qwen3-30B-A3B-BF16",
+      "env": "rocm6_4_4",
+      "env_base": "rocm6_4_4",
+      "env_variant": null,
+      "fa": true,
+      "test": "pp512",
+      "tps_mean": 489.49,
+      "tps_std": 3.92,
+      "error": false,
+      "error_type": null,
+      "backend": "ROCm",
+      "ngl": 99,
+      "mmap": 0,
+      "params_b": 30.53,
+      "file_size_gib": 56.89,
+      "name_params_b": 30.53,
+      "quant": "BF16",
+      "log": "results/Qwen3-30B-A3B-BF16-00001-of-00002__rocm6_4_4__fa1.log",
+      "build": {
+        "hash": "4807e8f9",
+        "number": "6609"
+      }
+    },
+    {
+      "model": "Qwen3-30B-A3B-BF16-00001-of-00002",
+      "model_clean": "Qwen3-30B-A3B-BF16",
+      "env": "rocm6_4_4",
+      "env_base": "rocm6_4_4",
+      "env_variant": null,
+      "fa": true,
+      "test": "tg128",
+      "tps_mean": 25.78,
+      "tps_std": 0.0,
+      "error": false,
+      "error_type": null,
+      "backend": "ROCm",
+      "ngl": 99,
+      "mmap": 0,
+      "params_b": 30.53,
+      "file_size_gib": 56.89,
+      "name_params_b": 30.53,
+      "quant": "BF16",
+      "log": "results/Qwen3-30B-A3B-BF16-00001-of-00002__rocm6_4_4__fa1.log",
+      "build": {
+        "hash": "4807e8f9",
+        "number": "6609"
+      }
+    },
+    {
+      "model": "Qwen3-30B-A3B-BF16-00001-of-00002",
+      "model_clean": "Qwen3-30B-A3B-BF16",
+      "env": "rocm6_4_4-hblt0",
+      "env_base": "rocm6_4_4",
+      "env_variant": "hblt0",
+      "fa": false,
+      "test": "pp512",
+      "tps_mean": 343.78,
+      "tps_std": 1.91,
+      "error": false,
+      "error_type": null,
+      "backend": "ROCm",
+      "ngl": 99,
+      "mmap": 0,
+      "params_b": 30.53,
+      "file_size_gib": 56.89,
+      "name_params_b": 30.53,
+      "quant": "BF16",
+      "log": "results/Qwen3-30B-A3B-BF16-00001-of-00002__rocm6_4_4__hblt0.log",
+      "build": {
+        "hash": "4807e8f9",
+        "number": "6609"
+      }
+    },
+    {
+      "model": "Qwen3-30B-A3B-BF16-00001-of-00002",
+      "model_clean": "Qwen3-30B-A3B-BF16",
+      "env": "rocm6_4_4-hblt0",
+      "env_base": "rocm6_4_4",
+      "env_variant": "hblt0",
+      "fa": false,
+      "test": "tg128",
+      "tps_mean": 25.48,
+      "tps_std": 0.01,
+      "error": false,
+      "error_type": null,
+      "backend": "ROCm",
+      "ngl": 99,
+      "mmap": 0,
+      "params_b": 30.53,
+      "file_size_gib": 56.89,
+      "name_params_b": 30.53,
+      "quant": "BF16",
+      "log": "results/Qwen3-30B-A3B-BF16-00001-of-00002__rocm6_4_4__hblt0.log",
+      "build": {
+        "hash": "4807e8f9",
+        "number": "6609"
+      }
+    },
+    {
+      "model": "Qwen3-30B-A3B-BF16-00001-of-00002",
+      "model_clean": "Qwen3-30B-A3B-BF16",
+      "env": "rocm6_4_4-hblt0",
+      "env_base": "rocm6_4_4",
+      "env_variant": "hblt0",
+      "fa": true,
+      "test": "pp512",
+      "tps_mean": 363.09,
+      "tps_std": 8.05,
+      "error": false,
+      "error_type": null,
+      "backend": "ROCm",
+      "ngl": 99,
+      "mmap": 0,
+      "params_b": 30.53,
+      "file_size_gib": 56.89,
+      "name_params_b": 30.53,
+      "quant": "BF16",
+      "log": "results/Qwen3-30B-A3B-BF16-00001-of-00002__rocm6_4_4__hblt0__fa1.log",
+      "build": {
+        "hash": "4807e8f9",
+        "number": "6609"
+      }
+    },
+    {
+      "model": "Qwen3-30B-A3B-BF16-00001-of-00002",
+      "model_clean": "Qwen3-30B-A3B-BF16",
+      "env": "rocm6_4_4-hblt0",
+      "env_base": "rocm6_4_4",
+      "env_variant": "hblt0",
+      "fa": true,
+      "test": "tg128",
+      "tps_mean": 25.75,
+      "tps_std": 0.01,
+      "error": false,
+      "error_type": null,
+      "backend": "ROCm",
+      "ngl": 99,
+      "mmap": 0,
+      "params_b": 30.53,
+      "file_size_gib": 56.89,
+      "name_params_b": 30.53,
+      "quant": "BF16",
+      "log": "results/Qwen3-30B-A3B-BF16-00001-of-00002__rocm6_4_4__hblt0__fa1.log",
+      "build": {
+        "hash": "4807e8f9",
+        "number": "6609"
+      }
+    },
     {
       "model": "Qwen3-30B-A3B-BF16-00001-of-00002",
       "model_clean": "Qwen3-30B-A3B-BF16",
@@ -9269,6 +12477,406 @@
         "number": "6486"
       }
     },
+    {
+      "model": "Qwen3-30B-A3B-Instruct-2507-UD-Q6_K_XL",
+      "model_clean": "Qwen3-30B-A3B-Instruct-2507-UD-Q6_K_XL",
+      "env": "rocm6_4_4-rocwmma",
+      "env_base": "rocm6_4_4",
+      "env_variant": "rocwmma",
+      "fa": false,
+      "test": "pp512",
+      "tps_mean": 577.98,
+      "tps_std": 6.34,
+      "error": false,
+      "error_type": null,
+      "backend": "ROCm",
+      "ngl": 99,
+      "mmap": 0,
+      "params_b": 30.53,
+      "file_size_gib": 24.53,
+      "name_params_b": 30.53,
+      "quant": "Q6_K_XL",
+      "log": "results/Qwen3-30B-A3B-Instruct-2507-UD-Q6_K_XL__rocm6_4_4-rocwmma.log",
+      "build": {
+        "hash": "4807e8f9",
+        "number": "6609"
+      }
+    },
+    {
+      "model": "Qwen3-30B-A3B-Instruct-2507-UD-Q6_K_XL",
+      "model_clean": "Qwen3-30B-A3B-Instruct-2507-UD-Q6_K_XL",
+      "env": "rocm6_4_4-rocwmma",
+      "env_base": "rocm6_4_4",
+      "env_variant": "rocwmma",
+      "fa": false,
+      "test": "tg128",
+      "tps_mean": 55.37,
+      "tps_std": 0.01,
+      "error": false,
+      "error_type": null,
+      "backend": "ROCm",
+      "ngl": 99,
+      "mmap": 0,
+      "params_b": 30.53,
+      "file_size_gib": 24.53,
+      "name_params_b": 30.53,
+      "quant": "Q6_K_XL",
+      "log": "results/Qwen3-30B-A3B-Instruct-2507-UD-Q6_K_XL__rocm6_4_4-rocwmma.log",
+      "build": {
+        "hash": "4807e8f9",
+        "number": "6609"
+      }
+    },
+    {
+      "model": "Qwen3-30B-A3B-Instruct-2507-UD-Q6_K_XL",
+      "model_clean": "Qwen3-30B-A3B-Instruct-2507-UD-Q6_K_XL",
+      "env": "rocm6_4_4-rocwmma",
+      "env_base": "rocm6_4_4",
+      "env_variant": "rocwmma",
+      "fa": true,
+      "test": "pp512",
+      "tps_mean": 623.53,
+      "tps_std": 3.7,
+      "error": false,
+      "error_type": null,
+      "backend": "ROCm",
+      "ngl": 99,
+      "mmap": 0,
+      "params_b": 30.53,
+      "file_size_gib": 24.53,
+      "name_params_b": 30.53,
+      "quant": "Q6_K_XL",
+      "log": "results/Qwen3-30B-A3B-Instruct-2507-UD-Q6_K_XL__rocm6_4_4-rocwmma__fa1.log",
+      "build": {
+        "hash": "4807e8f9",
+        "number": "6609"
+      }
+    },
+    {
+      "model": "Qwen3-30B-A3B-Instruct-2507-UD-Q6_K_XL",
+      "model_clean": "Qwen3-30B-A3B-Instruct-2507-UD-Q6_K_XL",
+      "env": "rocm6_4_4-rocwmma",
+      "env_base": "rocm6_4_4",
+      "env_variant": "rocwmma",
+      "fa": true,
+      "test": "tg128",
+      "tps_mean": 56.76,
+      "tps_std": 0.01,
+      "error": false,
+      "error_type": null,
+      "backend": "ROCm",
+      "ngl": 99,
+      "mmap": 0,
+      "params_b": 30.53,
+      "file_size_gib": 24.53,
+      "name_params_b": 30.53,
+      "quant": "Q6_K_XL",
+      "log": "results/Qwen3-30B-A3B-Instruct-2507-UD-Q6_K_XL__rocm6_4_4-rocwmma__fa1.log",
+      "build": {
+        "hash": "4807e8f9",
+        "number": "6609"
+      }
+    },
+    {
+      "model": "Qwen3-30B-A3B-Instruct-2507-UD-Q6_K_XL",
+      "model_clean": "Qwen3-30B-A3B-Instruct-2507-UD-Q6_K_XL",
+      "env": "rocm6_4_4-rocwmma-hblt0",
+      "env_base": "rocm6_4_4",
+      "env_variant": "rocwmma-hblt0",
+      "fa": false,
+      "test": "pp512",
+      "tps_mean": 582.34,
+      "tps_std": 4.27,
+      "error": false,
+      "error_type": null,
+      "backend": "ROCm",
+      "ngl": 99,
+      "mmap": 0,
+      "params_b": 30.53,
+      "file_size_gib": 24.53,
+      "name_params_b": 30.53,
+      "quant": "Q6_K_XL",
+      "log": "results/Qwen3-30B-A3B-Instruct-2507-UD-Q6_K_XL__rocm6_4_4-rocwmma__hblt0.log",
+      "build": {
+        "hash": "4807e8f9",
+        "number": "6609"
+      }
+    },
+    {
+      "model": "Qwen3-30B-A3B-Instruct-2507-UD-Q6_K_XL",
+      "model_clean": "Qwen3-30B-A3B-Instruct-2507-UD-Q6_K_XL",
+      "env": "rocm6_4_4-rocwmma-hblt0",
+      "env_base": "rocm6_4_4",
+      "env_variant": "rocwmma-hblt0",
+      "fa": false,
+      "test": "tg128",
+      "tps_mean": 55.34,
+      "tps_std": 0.02,
+      "error": false,
+      "error_type": null,
+      "backend": "ROCm",
+      "ngl": 99,
+      "mmap": 0,
+      "params_b": 30.53,
+      "file_size_gib": 24.53,
+      "name_params_b": 30.53,
+      "quant": "Q6_K_XL",
+      "log": "results/Qwen3-30B-A3B-Instruct-2507-UD-Q6_K_XL__rocm6_4_4-rocwmma__hblt0.log",
+      "build": {
+        "hash": "4807e8f9",
+        "number": "6609"
+      }
+    },
+    {
+      "model": "Qwen3-30B-A3B-Instruct-2507-UD-Q6_K_XL",
+      "model_clean": "Qwen3-30B-A3B-Instruct-2507-UD-Q6_K_XL",
+      "env": "rocm6_4_4-rocwmma-hblt0",
+      "env_base": "rocm6_4_4",
+      "env_variant": "rocwmma-hblt0",
+      "fa": true,
+      "test": "pp512",
+      "tps_mean": 622.32,
+      "tps_std": 5.83,
+      "error": false,
+      "error_type": null,
+      "backend": "ROCm",
+      "ngl": 99,
+      "mmap": 0,
+      "params_b": 30.53,
+      "file_size_gib": 24.53,
+      "name_params_b": 30.53,
+      "quant": "Q6_K_XL",
+      "log": "results/Qwen3-30B-A3B-Instruct-2507-UD-Q6_K_XL__rocm6_4_4-rocwmma__hblt0__fa1.log",
+      "build": {
+        "hash": "4807e8f9",
+        "number": "6609"
+      }
+    },
+    {
+      "model": "Qwen3-30B-A3B-Instruct-2507-UD-Q6_K_XL",
+      "model_clean": "Qwen3-30B-A3B-Instruct-2507-UD-Q6_K_XL",
+      "env": "rocm6_4_4-rocwmma-hblt0",
+      "env_base": "rocm6_4_4",
+      "env_variant": "rocwmma-hblt0",
+      "fa": true,
+      "test": "tg128",
+      "tps_mean": 56.82,
+      "tps_std": 0.01,
+      "error": false,
+      "error_type": null,
+      "backend": "ROCm",
+      "ngl": 99,
+      "mmap": 0,
+      "params_b": 30.53,
+      "file_size_gib": 24.53,
+      "name_params_b": 30.53,
+      "quant": "Q6_K_XL",
+      "log": "results/Qwen3-30B-A3B-Instruct-2507-UD-Q6_K_XL__rocm6_4_4-rocwmma__hblt0__fa1.log",
+      "build": {
+        "hash": "4807e8f9",
+        "number": "6609"
+      }
+    },
+    {
+      "model": "Qwen3-30B-A3B-Instruct-2507-UD-Q6_K_XL",
+      "model_clean": "Qwen3-30B-A3B-Instruct-2507-UD-Q6_K_XL",
+      "env": "rocm6_4_4",
+      "env_base": "rocm6_4_4",
+      "env_variant": null,
+      "fa": false,
+      "test": "pp512",
+      "tps_mean": 582.99,
+      "tps_std": 4.97,
+      "error": false,
+      "error_type": null,
+      "backend": "ROCm",
+      "ngl": 99,
+      "mmap": 0,
+      "params_b": 30.53,
+      "file_size_gib": 24.53,
+      "name_params_b": 30.53,
+      "quant": "Q6_K_XL",
+      "log": "results/Qwen3-30B-A3B-Instruct-2507-UD-Q6_K_XL__rocm6_4_4.log",
+      "build": {
+        "hash": "4807e8f9",
+        "number": "6609"
+      }
+    },
+    {
+      "model": "Qwen3-30B-A3B-Instruct-2507-UD-Q6_K_XL",
+      "model_clean": "Qwen3-30B-A3B-Instruct-2507-UD-Q6_K_XL",
+      "env": "rocm6_4_4",
+      "env_base": "rocm6_4_4",
+      "env_variant": null,
+      "fa": false,
+      "test": "tg128",
+      "tps_mean": 55.33,
+      "tps_std": 0.02,
+      "error": false,
+      "error_type": null,
+      "backend": "ROCm",
+      "ngl": 99,
+      "mmap": 0,
+      "params_b": 30.53,
+      "file_size_gib": 24.53,
+      "name_params_b": 30.53,
+      "quant": "Q6_K_XL",
+      "log": "results/Qwen3-30B-A3B-Instruct-2507-UD-Q6_K_XL__rocm6_4_4.log",
+      "build": {
+        "hash": "4807e8f9",
+        "number": "6609"
+      }
+    },
+    {
+      "model": "Qwen3-30B-A3B-Instruct-2507-UD-Q6_K_XL",
+      "model_clean": "Qwen3-30B-A3B-Instruct-2507-UD-Q6_K_XL",
+      "env": "rocm6_4_4",
+      "env_base": "rocm6_4_4",
+      "env_variant": null,
+      "fa": true,
+      "test": "pp512",
+      "tps_mean": 632.12,
+      "tps_std": 3.63,
+      "error": false,
+      "error_type": null,
+      "backend": "ROCm",
+      "ngl": 99,
+      "mmap": 0,
+      "params_b": 30.53,
+      "file_size_gib": 24.53,
+      "name_params_b": 30.53,
+      "quant": "Q6_K_XL",
+      "log": "results/Qwen3-30B-A3B-Instruct-2507-UD-Q6_K_XL__rocm6_4_4__fa1.log",
+      "build": {
+        "hash": "4807e8f9",
+        "number": "6609"
+      }
+    },
+    {
+      "model": "Qwen3-30B-A3B-Instruct-2507-UD-Q6_K_XL",
+      "model_clean": "Qwen3-30B-A3B-Instruct-2507-UD-Q6_K_XL",
+      "env": "rocm6_4_4",
+      "env_base": "rocm6_4_4",
+      "env_variant": null,
+      "fa": true,
+      "test": "tg128",
+      "tps_mean": 56.73,
+      "tps_std": 0.0,
+      "error": false,
+      "error_type": null,
+      "backend": "ROCm",
+      "ngl": 99,
+      "mmap": 0,
+      "params_b": 30.53,
+      "file_size_gib": 24.53,
+      "name_params_b": 30.53,
+      "quant": "Q6_K_XL",
+      "log": "results/Qwen3-30B-A3B-Instruct-2507-UD-Q6_K_XL__rocm6_4_4__fa1.log",
+      "build": {
+        "hash": "4807e8f9",
+        "number": "6609"
+      }
+    },
+    {
+      "model": "Qwen3-30B-A3B-Instruct-2507-UD-Q6_K_XL",
+      "model_clean": "Qwen3-30B-A3B-Instruct-2507-UD-Q6_K_XL",
+      "env": "rocm6_4_4-hblt0",
+      "env_base": "rocm6_4_4",
+      "env_variant": "hblt0",
+      "fa": false,
+      "test": "pp512",
+      "tps_mean": 582.14,
+      "tps_std": 4.21,
+      "error": false,
+      "error_type": null,
+      "backend": "ROCm",
+      "ngl": 99,
+      "mmap": 0,
+      "params_b": 30.53,
+      "file_size_gib": 24.53,
+      "name_params_b": 30.53,
+      "quant": "Q6_K_XL",
+      "log": "results/Qwen3-30B-A3B-Instruct-2507-UD-Q6_K_XL__rocm6_4_4__hblt0.log",
+      "build": {
+        "hash": "4807e8f9",
+        "number": "6609"
+      }
+    },
+    {
+      "model": "Qwen3-30B-A3B-Instruct-2507-UD-Q6_K_XL",
+      "model_clean": "Qwen3-30B-A3B-Instruct-2507-UD-Q6_K_XL",
+      "env": "rocm6_4_4-hblt0",
+      "env_base": "rocm6_4_4",
+      "env_variant": "hblt0",
+      "fa": false,
+      "test": "tg128",
+      "tps_mean": 55.39,
+      "tps_std": 0.01,
+      "error": false,
+      "error_type": null,
+      "backend": "ROCm",
+      "ngl": 99,
+      "mmap": 0,
+      "params_b": 30.53,
+      "file_size_gib": 24.53,
+      "name_params_b": 30.53,
+      "quant": "Q6_K_XL",
+      "log": "results/Qwen3-30B-A3B-Instruct-2507-UD-Q6_K_XL__rocm6_4_4__hblt0.log",
+      "build": {
+        "hash": "4807e8f9",
+        "number": "6609"
+      }
+    },
+    {
+      "model": "Qwen3-30B-A3B-Instruct-2507-UD-Q6_K_XL",
+      "model_clean": "Qwen3-30B-A3B-Instruct-2507-UD-Q6_K_XL",
+      "env": "rocm6_4_4-hblt0",
+      "env_base": "rocm6_4_4",
+      "env_variant": "hblt0",
+      "fa": true,
+      "test": "pp512",
+      "tps_mean": 632.63,
+      "tps_std": 4.35,
+      "error": false,
+      "error_type": null,
+      "backend": "ROCm",
+      "ngl": 99,
+      "mmap": 0,
+      "params_b": 30.53,
+      "file_size_gib": 24.53,
+      "name_params_b": 30.53,
+      "quant": "Q6_K_XL",
+      "log": "results/Qwen3-30B-A3B-Instruct-2507-UD-Q6_K_XL__rocm6_4_4__hblt0__fa1.log",
+      "build": {
+        "hash": "4807e8f9",
+        "number": "6609"
+      }
+    },
+    {
+      "model": "Qwen3-30B-A3B-Instruct-2507-UD-Q6_K_XL",
+      "model_clean": "Qwen3-30B-A3B-Instruct-2507-UD-Q6_K_XL",
+      "env": "rocm6_4_4-hblt0",
+      "env_base": "rocm6_4_4",
+      "env_variant": "hblt0",
+      "fa": true,
+      "test": "tg128",
+      "tps_mean": 56.77,
+      "tps_std": 0.01,
+      "error": false,
+      "error_type": null,
+      "backend": "ROCm",
+      "ngl": 99,
+      "mmap": 0,
+      "params_b": 30.53,
+      "file_size_gib": 24.53,
+      "name_params_b": 30.53,
+      "quant": "Q6_K_XL",
+      "log": "results/Qwen3-30B-A3B-Instruct-2507-UD-Q6_K_XL__rocm6_4_4__hblt0__fa1.log",
+      "build": {
+        "hash": "4807e8f9",
+        "number": "6609"
+      }
+    },
     {
       "model": "Qwen3-30B-A3B-Instruct-2507-UD-Q6_K_XL",
       "model_clean": "Qwen3-30B-A3B-Instruct-2507-UD-Q6_K_XL",
@@ -10469,6 +14077,406 @@
         "number": "6486"
       }
     },
+    {
+      "model": "gemma-3-12b-it-UD-Q8_K_XL",
+      "model_clean": "gemma-3-12b-it-UD-Q8_K_XL",
+      "env": "rocm6_4_4-rocwmma",
+      "env_base": "rocm6_4_4",
+      "env_variant": "rocwmma",
+      "fa": false,
+      "test": "pp512",
+      "tps_mean": 754.71,
+      "tps_std": 0.79,
+      "error": false,
+      "error_type": null,
+      "backend": "ROCm",
+      "ngl": 99,
+      "mmap": 0,
+      "params_b": 11.77,
+      "file_size_gib": 13.4,
+      "name_params_b": 11.77,
+      "quant": "Q8_K_XL",
+      "log": "results/gemma-3-12b-it-UD-Q8_K_XL__rocm6_4_4-rocwmma.log",
+      "build": {
+        "hash": "4807e8f9",
+        "number": "6609"
+      }
+    },
+    {
+      "model": "gemma-3-12b-it-UD-Q8_K_XL",
+      "model_clean": "gemma-3-12b-it-UD-Q8_K_XL",
+      "env": "rocm6_4_4-rocwmma",
+      "env_base": "rocm6_4_4",
+      "env_variant": "rocwmma",
+      "fa": false,
+      "test": "tg128",
+      "tps_mean": 14.16,
+      "tps_std": 0.0,
+      "error": false,
+      "error_type": null,
+      "backend": "ROCm",
+      "ngl": 99,
+      "mmap": 0,
+      "params_b": 11.77,
+      "file_size_gib": 13.4,
+      "name_params_b": 11.77,
+      "quant": "Q8_K_XL",
+      "log": "results/gemma-3-12b-it-UD-Q8_K_XL__rocm6_4_4-rocwmma.log",
+      "build": {
+        "hash": "4807e8f9",
+        "number": "6609"
+      }
+    },
+    {
+      "model": "gemma-3-12b-it-UD-Q8_K_XL",
+      "model_clean": "gemma-3-12b-it-UD-Q8_K_XL",
+      "env": "rocm6_4_4-rocwmma",
+      "env_base": "rocm6_4_4",
+      "env_variant": "rocwmma",
+      "fa": true,
+      "test": "pp512",
+      "tps_mean": 803.95,
+      "tps_std": 0.73,
+      "error": false,
+      "error_type": null,
+      "backend": "ROCm",
+      "ngl": 99,
+      "mmap": 0,
+      "params_b": 11.77,
+      "file_size_gib": 13.4,
+      "name_params_b": 11.77,
+      "quant": "Q8_K_XL",
+      "log": "results/gemma-3-12b-it-UD-Q8_K_XL__rocm6_4_4-rocwmma__fa1.log",
+      "build": {
+        "hash": "4807e8f9",
+        "number": "6609"
+      }
+    },
+    {
+      "model": "gemma-3-12b-it-UD-Q8_K_XL",
+      "model_clean": "gemma-3-12b-it-UD-Q8_K_XL",
+      "env": "rocm6_4_4-rocwmma",
+      "env_base": "rocm6_4_4",
+      "env_variant": "rocwmma",
+      "fa": true,
+      "test": "tg128",
+      "tps_mean": 14.07,
+      "tps_std": 0.0,
+      "error": false,
+      "error_type": null,
+      "backend": "ROCm",
+      "ngl": 99,
+      "mmap": 0,
+      "params_b": 11.77,
+      "file_size_gib": 13.4,
+      "name_params_b": 11.77,
+      "quant": "Q8_K_XL",
+      "log": "results/gemma-3-12b-it-UD-Q8_K_XL__rocm6_4_4-rocwmma__fa1.log",
+      "build": {
+        "hash": "4807e8f9",
+        "number": "6609"
+      }
+    },
+    {
+      "model": "gemma-3-12b-it-UD-Q8_K_XL",
+      "model_clean": "gemma-3-12b-it-UD-Q8_K_XL",
+      "env": "rocm6_4_4-rocwmma-hblt0",
+      "env_base": "rocm6_4_4",
+      "env_variant": "rocwmma-hblt0",
+      "fa": false,
+      "test": "pp512",
+      "tps_mean": 768.26,
+      "tps_std": 1.35,
+      "error": false,
+      "error_type": null,
+      "backend": "ROCm",
+      "ngl": 99,
+      "mmap": 0,
+      "params_b": 11.77,
+      "file_size_gib": 13.4,
+      "name_params_b": 11.77,
+      "quant": "Q8_K_XL",
+      "log": "results/gemma-3-12b-it-UD-Q8_K_XL__rocm6_4_4-rocwmma__hblt0.log",
+      "build": {
+        "hash": "4807e8f9",
+        "number": "6609"
+      }
+    },
+    {
+      "model": "gemma-3-12b-it-UD-Q8_K_XL",
+      "model_clean": "gemma-3-12b-it-UD-Q8_K_XL",
+      "env": "rocm6_4_4-rocwmma-hblt0",
+      "env_base": "rocm6_4_4",
+      "env_variant": "rocwmma-hblt0",
+      "fa": false,
+      "test": "tg128",
+      "tps_mean": 14.15,
+      "tps_std": 0.0,
+      "error": false,
+      "error_type": null,
+      "backend": "ROCm",
+      "ngl": 99,
+      "mmap": 0,
+      "params_b": 11.77,
+      "file_size_gib": 13.4,
+      "name_params_b": 11.77,
+      "quant": "Q8_K_XL",
+      "log": "results/gemma-3-12b-it-UD-Q8_K_XL__rocm6_4_4-rocwmma__hblt0.log",
+      "build": {
+        "hash": "4807e8f9",
+        "number": "6609"
+      }
+    },
+    {
+      "model": "gemma-3-12b-it-UD-Q8_K_XL",
+      "model_clean": "gemma-3-12b-it-UD-Q8_K_XL",
+      "env": "rocm6_4_4-rocwmma-hblt0",
+      "env_base": "rocm6_4_4",
+      "env_variant": "rocwmma-hblt0",
+      "fa": true,
+      "test": "pp512",
+      "tps_mean": 814.89,
+      "tps_std": 0.73,
+      "error": false,
+      "error_type": null,
+      "backend": "ROCm",
+      "ngl": 99,
+      "mmap": 0,
+      "params_b": 11.77,
+      "file_size_gib": 13.4,
+      "name_params_b": 11.77,
+      "quant": "Q8_K_XL",
+      "log": "results/gemma-3-12b-it-UD-Q8_K_XL__rocm6_4_4-rocwmma__hblt0__fa1.log",
+      "build": {
+        "hash": "4807e8f9",
+        "number": "6609"
+      }
+    },
+    {
+      "model": "gemma-3-12b-it-UD-Q8_K_XL",
+      "model_clean": "gemma-3-12b-it-UD-Q8_K_XL",
+      "env": "rocm6_4_4-rocwmma-hblt0",
+      "env_base": "rocm6_4_4",
+      "env_variant": "rocwmma-hblt0",
+      "fa": true,
+      "test": "tg128",
+      "tps_mean": 14.08,
+      "tps_std": 0.0,
+      "error": false,
+      "error_type": null,
+      "backend": "ROCm",
+      "ngl": 99,
+      "mmap": 0,
+      "params_b": 11.77,
+      "file_size_gib": 13.4,
+      "name_params_b": 11.77,
+      "quant": "Q8_K_XL",
+      "log": "results/gemma-3-12b-it-UD-Q8_K_XL__rocm6_4_4-rocwmma__hblt0__fa1.log",
+      "build": {
+        "hash": "4807e8f9",
+        "number": "6609"
+      }
+    },
+    {
+      "model": "gemma-3-12b-it-UD-Q8_K_XL",
+      "model_clean": "gemma-3-12b-it-UD-Q8_K_XL",
+      "env": "rocm6_4_4",
+      "env_base": "rocm6_4_4",
+      "env_variant": null,
+      "fa": false,
+      "test": "pp512",
+      "tps_mean": 751.85,
+      "tps_std": 1.59,
+      "error": false,
+      "error_type": null,
+      "backend": "ROCm",
+      "ngl": 99,
+      "mmap": 0,
+      "params_b": 11.77,
+      "file_size_gib": 13.4,
+      "name_params_b": 11.77,
+      "quant": "Q8_K_XL",
+      "log": "results/gemma-3-12b-it-UD-Q8_K_XL__rocm6_4_4.log",
+      "build": {
+        "hash": "4807e8f9",
+        "number": "6609"
+      }
+    },
+    {
+      "model": "gemma-3-12b-it-UD-Q8_K_XL",
+      "model_clean": "gemma-3-12b-it-UD-Q8_K_XL",
+      "env": "rocm6_4_4",
+      "env_base": "rocm6_4_4",
+      "env_variant": null,
+      "fa": false,
+      "test": "tg128",
+      "tps_mean": 14.16,
+      "tps_std": 0.0,
+      "error": false,
+      "error_type": null,
+      "backend": "ROCm",
+      "ngl": 99,
+      "mmap": 0,
+      "params_b": 11.77,
+      "file_size_gib": 13.4,
+      "name_params_b": 11.77,
+      "quant": "Q8_K_XL",
+      "log": "results/gemma-3-12b-it-UD-Q8_K_XL__rocm6_4_4.log",
+      "build": {
+        "hash": "4807e8f9",
+        "number": "6609"
+      }
+    },
+    {
+      "model": "gemma-3-12b-it-UD-Q8_K_XL",
+      "model_clean": "gemma-3-12b-it-UD-Q8_K_XL",
+      "env": "rocm6_4_4",
+      "env_base": "rocm6_4_4",
+      "env_variant": null,
+      "fa": true,
+      "test": "pp512",
+      "tps_mean": 814.18,
+      "tps_std": 1.01,
+      "error": false,
+      "error_type": null,
+      "backend": "ROCm",
+      "ngl": 99,
+      "mmap": 0,
+      "params_b": 11.77,
+      "file_size_gib": 13.4,
+      "name_params_b": 11.77,
+      "quant": "Q8_K_XL",
+      "log": "results/gemma-3-12b-it-UD-Q8_K_XL__rocm6_4_4__fa1.log",
+      "build": {
+        "hash": "4807e8f9",
+        "number": "6609"
+      }
+    },
+    {
+      "model": "gemma-3-12b-it-UD-Q8_K_XL",
+      "model_clean": "gemma-3-12b-it-UD-Q8_K_XL",
+      "env": "rocm6_4_4",
+      "env_base": "rocm6_4_4",
+      "env_variant": null,
+      "fa": true,
+      "test": "tg128",
+      "tps_mean": 14.08,
+      "tps_std": 0.0,
+      "error": false,
+      "error_type": null,
+      "backend": "ROCm",
+      "ngl": 99,
+      "mmap": 0,
+      "params_b": 11.77,
+      "file_size_gib": 13.4,
+      "name_params_b": 11.77,
+      "quant": "Q8_K_XL",
+      "log": "results/gemma-3-12b-it-UD-Q8_K_XL__rocm6_4_4__fa1.log",
+      "build": {
+        "hash": "4807e8f9",
+        "number": "6609"
+      }
+    },
+    {
+      "model": "gemma-3-12b-it-UD-Q8_K_XL",
+      "model_clean": "gemma-3-12b-it-UD-Q8_K_XL",
+      "env": "rocm6_4_4-hblt0",
+      "env_base": "rocm6_4_4",
+      "env_variant": "hblt0",
+      "fa": false,
+      "test": "pp512",
+      "tps_mean": 769.51,
+      "tps_std": 0.9,
+      "error": false,
+      "error_type": null,
+      "backend": "ROCm",
+      "ngl": 99,
+      "mmap": 0,
+      "params_b": 11.77,
+      "file_size_gib": 13.4,
+      "name_params_b": 11.77,
+      "quant": "Q8_K_XL",
+      "log": "results/gemma-3-12b-it-UD-Q8_K_XL__rocm6_4_4__hblt0.log",
+      "build": {
+        "hash": "4807e8f9",
+        "number": "6609"
+      }
+    },
+    {
+      "model": "gemma-3-12b-it-UD-Q8_K_XL",
+      "model_clean": "gemma-3-12b-it-UD-Q8_K_XL",
+      "env": "rocm6_4_4-hblt0",
+      "env_base": "rocm6_4_4",
+      "env_variant": "hblt0",
+      "fa": false,
+      "test": "tg128",
+      "tps_mean": 14.15,
+      "tps_std": 0.0,
+      "error": false,
+      "error_type": null,
+      "backend": "ROCm",
+      "ngl": 99,
+      "mmap": 0,
+      "params_b": 11.77,
+      "file_size_gib": 13.4,
+      "name_params_b": 11.77,
+      "quant": "Q8_K_XL",
+      "log": "results/gemma-3-12b-it-UD-Q8_K_XL__rocm6_4_4__hblt0.log",
+      "build": {
+        "hash": "4807e8f9",
+        "number": "6609"
+      }
+    },
+    {
+      "model": "gemma-3-12b-it-UD-Q8_K_XL",
+      "model_clean": "gemma-3-12b-it-UD-Q8_K_XL",
+      "env": "rocm6_4_4-hblt0",
+      "env_base": "rocm6_4_4",
+      "env_variant": "hblt0",
+      "fa": true,
+      "test": "pp512",
+      "tps_mean": 824.93,
+      "tps_std": 0.75,
+      "error": false,
+      "error_type": null,
+      "backend": "ROCm",
+      "ngl": 99,
+      "mmap": 0,
+      "params_b": 11.77,
+      "file_size_gib": 13.4,
+      "name_params_b": 11.77,
+      "quant": "Q8_K_XL",
+      "log": "results/gemma-3-12b-it-UD-Q8_K_XL__rocm6_4_4__hblt0__fa1.log",
+      "build": {
+        "hash": "4807e8f9",
+        "number": "6609"
+      }
+    },
+    {
+      "model": "gemma-3-12b-it-UD-Q8_K_XL",
+      "model_clean": "gemma-3-12b-it-UD-Q8_K_XL",
+      "env": "rocm6_4_4-hblt0",
+      "env_base": "rocm6_4_4",
+      "env_variant": "hblt0",
+      "fa": true,
+      "test": "tg128",
+      "tps_mean": 14.08,
+      "tps_std": 0.0,
+      "error": false,
+      "error_type": null,
+      "backend": "ROCm",
+      "ngl": 99,
+      "mmap": 0,
+      "params_b": 11.77,
+      "file_size_gib": 13.4,
+      "name_params_b": 11.77,
+      "quant": "Q8_K_XL",
+      "log": "results/gemma-3-12b-it-UD-Q8_K_XL__rocm6_4_4__hblt0__fa1.log",
+      "build": {
+        "hash": "4807e8f9",
+        "number": "6609"
+      }
+    },
     {
       "model": "gemma-3-12b-it-UD-Q8_K_XL",
       "model_clean": "gemma-3-12b-it-UD-Q8_K_XL",
@@ -11669,6 +15677,406 @@
         "number": "6486"
       }
     },
+    {
+      "model": "gemma-3-27b-it-BF16-00001-of-00002",
+      "model_clean": "gemma-3-27b-it-BF16",
+      "env": "rocm6_4_4-rocwmma",
+      "env_base": "rocm6_4_4",
+      "env_variant": "rocwmma",
+      "fa": false,
+      "test": "pp512",
+      "tps_mean": 425.33,
+      "tps_std": 1.61,
+      "error": false,
+      "error_type": null,
+      "backend": "ROCm",
+      "ngl": 99,
+      "mmap": 0,
+      "params_b": 27.01,
+      "file_size_gib": 50.31,
+      "name_params_b": 27.01,
+      "quant": "BF16",
+      "log": "results/gemma-3-27b-it-BF16-00001-of-00002__rocm6_4_4-rocwmma.log",
+      "build": {
+        "hash": "4807e8f9",
+        "number": "6609"
+      }
+    },
+    {
+      "model": "gemma-3-27b-it-BF16-00001-of-00002",
+      "model_clean": "gemma-3-27b-it-BF16",
+      "env": "rocm6_4_4-rocwmma",
+      "env_base": "rocm6_4_4",
+      "env_variant": "rocwmma",
+      "fa": false,
+      "test": "tg128",
+      "tps_mean": 4.11,
+      "tps_std": 0.0,
+      "error": false,
+      "error_type": null,
+      "backend": "ROCm",
+      "ngl": 99,
+      "mmap": 0,
+      "params_b": 27.01,
+      "file_size_gib": 50.31,
+      "name_params_b": 27.01,
+      "quant": "BF16",
+      "log": "results/gemma-3-27b-it-BF16-00001-of-00002__rocm6_4_4-rocwmma.log",
+      "build": {
+        "hash": "4807e8f9",
+        "number": "6609"
+      }
+    },
+    {
+      "model": "gemma-3-27b-it-BF16-00001-of-00002",
+      "model_clean": "gemma-3-27b-it-BF16",
+      "env": "rocm6_4_4-rocwmma",
+      "env_base": "rocm6_4_4",
+      "env_variant": "rocwmma",
+      "fa": true,
+      "test": "pp512",
+      "tps_mean": 470.8,
+      "tps_std": 1.97,
+      "error": false,
+      "error_type": null,
+      "backend": "ROCm",
+      "ngl": 99,
+      "mmap": 0,
+      "params_b": 27.01,
+      "file_size_gib": 50.31,
+      "name_params_b": 27.01,
+      "quant": "BF16",
+      "log": "results/gemma-3-27b-it-BF16-00001-of-00002__rocm6_4_4-rocwmma__fa1.log",
+      "build": {
+        "hash": "4807e8f9",
+        "number": "6609"
+      }
+    },
+    {
+      "model": "gemma-3-27b-it-BF16-00001-of-00002",
+      "model_clean": "gemma-3-27b-it-BF16",
+      "env": "rocm6_4_4-rocwmma",
+      "env_base": "rocm6_4_4",
+      "env_variant": "rocwmma",
+      "fa": true,
+      "test": "tg128",
+      "tps_mean": 4.1,
+      "tps_std": 0.0,
+      "error": false,
+      "error_type": null,
+      "backend": "ROCm",
+      "ngl": 99,
+      "mmap": 0,
+      "params_b": 27.01,
+      "file_size_gib": 50.31,
+      "name_params_b": 27.01,
+      "quant": "BF16",
+      "log": "results/gemma-3-27b-it-BF16-00001-of-00002__rocm6_4_4-rocwmma__fa1.log",
+      "build": {
+        "hash": "4807e8f9",
+        "number": "6609"
+      }
+    },
+    {
+      "model": "gemma-3-27b-it-BF16-00001-of-00002",
+      "model_clean": "gemma-3-27b-it-BF16",
+      "env": "rocm6_4_4-rocwmma-hblt0",
+      "env_base": "rocm6_4_4",
+      "env_variant": "rocwmma-hblt0",
+      "fa": false,
+      "test": "pp512",
+      "tps_mean": 469.59,
+      "tps_std": 0.76,
+      "error": false,
+      "error_type": null,
+      "backend": "ROCm",
+      "ngl": 99,
+      "mmap": 0,
+      "params_b": 27.01,
+      "file_size_gib": 50.31,
+      "name_params_b": 27.01,
+      "quant": "BF16",
+      "log": "results/gemma-3-27b-it-BF16-00001-of-00002__rocm6_4_4-rocwmma__hblt0.log",
+      "build": {
+        "hash": "4807e8f9",
+        "number": "6609"
+      }
+    },
+    {
+      "model": "gemma-3-27b-it-BF16-00001-of-00002",
+      "model_clean": "gemma-3-27b-it-BF16",
+      "env": "rocm6_4_4-rocwmma-hblt0",
+      "env_base": "rocm6_4_4",
+      "env_variant": "rocwmma-hblt0",
+      "fa": false,
+      "test": "tg128",
+      "tps_mean": 4.04,
+      "tps_std": 0.0,
+      "error": false,
+      "error_type": null,
+      "backend": "ROCm",
+      "ngl": 99,
+      "mmap": 0,
+      "params_b": 27.01,
+      "file_size_gib": 50.31,
+      "name_params_b": 27.01,
+      "quant": "BF16",
+      "log": "results/gemma-3-27b-it-BF16-00001-of-00002__rocm6_4_4-rocwmma__hblt0.log",
+      "build": {
+        "hash": "4807e8f9",
+        "number": "6609"
+      }
+    },
+    {
+      "model": "gemma-3-27b-it-BF16-00001-of-00002",
+      "model_clean": "gemma-3-27b-it-BF16",
+      "env": "rocm6_4_4-rocwmma-hblt0",
+      "env_base": "rocm6_4_4",
+      "env_variant": "rocwmma-hblt0",
+      "fa": true,
+      "test": "pp512",
+      "tps_mean": 524.38,
+      "tps_std": 0.7,
+      "error": false,
+      "error_type": null,
+      "backend": "ROCm",
+      "ngl": 99,
+      "mmap": 0,
+      "params_b": 27.01,
+      "file_size_gib": 50.31,
+      "name_params_b": 27.01,
+      "quant": "BF16",
+      "log": "results/gemma-3-27b-it-BF16-00001-of-00002__rocm6_4_4-rocwmma__hblt0__fa1.log",
+      "build": {
+        "hash": "4807e8f9",
+        "number": "6609"
+      }
+    },
+    {
+      "model": "gemma-3-27b-it-BF16-00001-of-00002",
+      "model_clean": "gemma-3-27b-it-BF16",
+      "env": "rocm6_4_4-rocwmma-hblt0",
+      "env_base": "rocm6_4_4",
+      "env_variant": "rocwmma-hblt0",
+      "fa": true,
+      "test": "tg128",
+      "tps_mean": 4.1,
+      "tps_std": 0.0,
+      "error": false,
+      "error_type": null,
+      "backend": "ROCm",
+      "ngl": 99,
+      "mmap": 0,
+      "params_b": 27.01,
+      "file_size_gib": 50.31,
+      "name_params_b": 27.01,
+      "quant": "BF16",
+      "log": "results/gemma-3-27b-it-BF16-00001-of-00002__rocm6_4_4-rocwmma__hblt0__fa1.log",
+      "build": {
+        "hash": "4807e8f9",
+        "number": "6609"
+      }
+    },
+    {
+      "model": "gemma-3-27b-it-BF16-00001-of-00002",
+      "model_clean": "gemma-3-27b-it-BF16",
+      "env": "rocm6_4_4",
+      "env_base": "rocm6_4_4",
+      "env_variant": null,
+      "fa": false,
+      "test": "pp512",
+      "tps_mean": 418.14,
+      "tps_std": 0.79,
+      "error": false,
+      "error_type": null,
+      "backend": "ROCm",
+      "ngl": 99,
+      "mmap": 0,
+      "params_b": 27.01,
+      "file_size_gib": 50.31,
+      "name_params_b": 27.01,
+      "quant": "BF16",
+      "log": "results/gemma-3-27b-it-BF16-00001-of-00002__rocm6_4_4.log",
+      "build": {
+        "hash": "4807e8f9",
+        "number": "6609"
+      }
+    },
+    {
+      "model": "gemma-3-27b-it-BF16-00001-of-00002",
+      "model_clean": "gemma-3-27b-it-BF16",
+      "env": "rocm6_4_4",
+      "env_base": "rocm6_4_4",
+      "env_variant": null,
+      "fa": false,
+      "test": "tg128",
+      "tps_mean": 4.1,
+      "tps_std": 0.0,
+      "error": false,
+      "error_type": null,
+      "backend": "ROCm",
+      "ngl": 99,
+      "mmap": 0,
+      "params_b": 27.01,
+      "file_size_gib": 50.31,
+      "name_params_b": 27.01,
+      "quant": "BF16",
+      "log": "results/gemma-3-27b-it-BF16-00001-of-00002__rocm6_4_4.log",
+      "build": {
+        "hash": "4807e8f9",
+        "number": "6609"
+      }
+    },
+    {
+      "model": "gemma-3-27b-it-BF16-00001-of-00002",
+      "model_clean": "gemma-3-27b-it-BF16",
+      "env": "rocm6_4_4",
+      "env_base": "rocm6_4_4",
+      "env_variant": null,
+      "fa": true,
+      "test": "pp512",
+      "tps_mean": 472.28,
+      "tps_std": 1.24,
+      "error": false,
+      "error_type": null,
+      "backend": "ROCm",
+      "ngl": 99,
+      "mmap": 0,
+      "params_b": 27.01,
+      "file_size_gib": 50.31,
+      "name_params_b": 27.01,
+      "quant": "BF16",
+      "log": "results/gemma-3-27b-it-BF16-00001-of-00002__rocm6_4_4__fa1.log",
+      "build": {
+        "hash": "4807e8f9",
+        "number": "6609"
+      }
+    },
+    {
+      "model": "gemma-3-27b-it-BF16-00001-of-00002",
+      "model_clean": "gemma-3-27b-it-BF16",
+      "env": "rocm6_4_4",
+      "env_base": "rocm6_4_4",
+      "env_variant": null,
+      "fa": true,
+      "test": "tg128",
+      "tps_mean": 4.1,
+      "tps_std": 0.0,
+      "error": false,
+      "error_type": null,
+      "backend": "ROCm",
+      "ngl": 99,
+      "mmap": 0,
+      "params_b": 27.01,
+      "file_size_gib": 50.31,
+      "name_params_b": 27.01,
+      "quant": "BF16",
+      "log": "results/gemma-3-27b-it-BF16-00001-of-00002__rocm6_4_4__fa1.log",
+      "build": {
+        "hash": "4807e8f9",
+        "number": "6609"
+      }
+    },
+    {
+      "model": "gemma-3-27b-it-BF16-00001-of-00002",
+      "model_clean": "gemma-3-27b-it-BF16",
+      "env": "rocm6_4_4-hblt0",
+      "env_base": "rocm6_4_4",
+      "env_variant": "hblt0",
+      "fa": false,
+      "test": "pp512",
+      "tps_mean": 471.56,
+      "tps_std": 0.6,
+      "error": false,
+      "error_type": null,
+      "backend": "ROCm",
+      "ngl": 99,
+      "mmap": 0,
+      "params_b": 27.01,
+      "file_size_gib": 50.31,
+      "name_params_b": 27.01,
+      "quant": "BF16",
+      "log": "results/gemma-3-27b-it-BF16-00001-of-00002__rocm6_4_4__hblt0.log",
+      "build": {
+        "hash": "4807e8f9",
+        "number": "6609"
+      }
+    },
+    {
+      "model": "gemma-3-27b-it-BF16-00001-of-00002",
+      "model_clean": "gemma-3-27b-it-BF16",
+      "env": "rocm6_4_4-hblt0",
+      "env_base": "rocm6_4_4",
+      "env_variant": "hblt0",
+      "fa": false,
+      "test": "tg128",
+      "tps_mean": 4.1,
+      "tps_std": 0.0,
+      "error": false,
+      "error_type": null,
+      "backend": "ROCm",
+      "ngl": 99,
+      "mmap": 0,
+      "params_b": 27.01,
+      "file_size_gib": 50.31,
+      "name_params_b": 27.01,
+      "quant": "BF16",
+      "log": "results/gemma-3-27b-it-BF16-00001-of-00002__rocm6_4_4__hblt0.log",
+      "build": {
+        "hash": "4807e8f9",
+        "number": "6609"
+      }
+    },
+    {
+      "model": "gemma-3-27b-it-BF16-00001-of-00002",
+      "model_clean": "gemma-3-27b-it-BF16",
+      "env": "rocm6_4_4-hblt0",
+      "env_base": "rocm6_4_4",
+      "env_variant": "hblt0",
+      "fa": true,
+      "test": "pp512",
+      "tps_mean": 530.58,
+      "tps_std": 0.66,
+      "error": false,
+      "error_type": null,
+      "backend": "ROCm",
+      "ngl": 99,
+      "mmap": 0,
+      "params_b": 27.01,
+      "file_size_gib": 50.31,
+      "name_params_b": 27.01,
+      "quant": "BF16",
+      "log": "results/gemma-3-27b-it-BF16-00001-of-00002__rocm6_4_4__hblt0__fa1.log",
+      "build": {
+        "hash": "4807e8f9",
+        "number": "6609"
+      }
+    },
+    {
+      "model": "gemma-3-27b-it-BF16-00001-of-00002",
+      "model_clean": "gemma-3-27b-it-BF16",
+      "env": "rocm6_4_4-hblt0",
+      "env_base": "rocm6_4_4",
+      "env_variant": "hblt0",
+      "fa": true,
+      "test": "tg128",
+      "tps_mean": 4.11,
+      "tps_std": 0.0,
+      "error": false,
+      "error_type": null,
+      "backend": "ROCm",
+      "ngl": 99,
+      "mmap": 0,
+      "params_b": 27.01,
+      "file_size_gib": 50.31,
+      "name_params_b": 27.01,
+      "quant": "BF16",
+      "log": "results/gemma-3-27b-it-BF16-00001-of-00002__rocm6_4_4__hblt0__fa1.log",
+      "build": {
+        "hash": "4807e8f9",
+        "number": "6609"
+      }
+    },
     {
       "model": "gemma-3-27b-it-BF16-00001-of-00002",
       "model_clean": "gemma-3-27b-it-BF16",
@@ -12813,6 +17221,406 @@
         "number": "6486"
       }
     },
+    {
+      "model": "gemma-3-4b-it-Q3_K_S",
+      "model_clean": "gemma-3-4b-it-Q3_K_S",
+      "env": "rocm6_4_4-rocwmma",
+      "env_base": "rocm6_4_4",
+      "env_variant": "rocwmma",
+      "fa": false,
+      "test": "pp512",
+      "tps_mean": 2110.44,
+      "tps_std": 6.13,
+      "error": false,
+      "error_type": null,
+      "backend": "ROCm",
+      "ngl": 99,
+      "mmap": 0,
+      "params_b": 3.88,
+      "file_size_gib": 1.8,
+      "name_params_b": 3.88,
+      "quant": "Q3_K_S",
+      "log": "results/gemma-3-4b-it-Q3_K_S__rocm6_4_4-rocwmma.log",
+      "build": {
+        "hash": "4807e8f9",
+        "number": "6609"
+      }
+    },
+    {
+      "model": "gemma-3-4b-it-Q3_K_S",
+      "model_clean": "gemma-3-4b-it-Q3_K_S",
+      "env": "rocm6_4_4-rocwmma",
+      "env_base": "rocm6_4_4",
+      "env_variant": "rocwmma",
+      "fa": false,
+      "test": "tg128",
+      "tps_mean": 79.31,
+      "tps_std": 0.03,
+      "error": false,
+      "error_type": null,
+      "backend": "ROCm",
+      "ngl": 99,
+      "mmap": 0,
+      "params_b": 3.88,
+      "file_size_gib": 1.8,
+      "name_params_b": 3.88,
+      "quant": "Q3_K_S",
+      "log": "results/gemma-3-4b-it-Q3_K_S__rocm6_4_4-rocwmma.log",
+      "build": {
+        "hash": "4807e8f9",
+        "number": "6609"
+      }
+    },
+    {
+      "model": "gemma-3-4b-it-Q3_K_S",
+      "model_clean": "gemma-3-4b-it-Q3_K_S",
+      "env": "rocm6_4_4-rocwmma",
+      "env_base": "rocm6_4_4",
+      "env_variant": "rocwmma",
+      "fa": true,
+      "test": "pp512",
+      "tps_mean": 2261.02,
+      "tps_std": 8.46,
+      "error": false,
+      "error_type": null,
+      "backend": "ROCm",
+      "ngl": 99,
+      "mmap": 0,
+      "params_b": 3.88,
+      "file_size_gib": 1.8,
+      "name_params_b": 3.88,
+      "quant": "Q3_K_S",
+      "log": "results/gemma-3-4b-it-Q3_K_S__rocm6_4_4-rocwmma__fa1.log",
+      "build": {
+        "hash": "4807e8f9",
+        "number": "6609"
+      }
+    },
+    {
+      "model": "gemma-3-4b-it-Q3_K_S",
+      "model_clean": "gemma-3-4b-it-Q3_K_S",
+      "env": "rocm6_4_4-rocwmma",
+      "env_base": "rocm6_4_4",
+      "env_variant": "rocwmma",
+      "fa": true,
+      "test": "tg128",
+      "tps_mean": 77.07,
+      "tps_std": 0.04,
+      "error": false,
+      "error_type": null,
+      "backend": "ROCm",
+      "ngl": 99,
+      "mmap": 0,
+      "params_b": 3.88,
+      "file_size_gib": 1.8,
+      "name_params_b": 3.88,
+      "quant": "Q3_K_S",
+      "log": "results/gemma-3-4b-it-Q3_K_S__rocm6_4_4-rocwmma__fa1.log",
+      "build": {
+        "hash": "4807e8f9",
+        "number": "6609"
+      }
+    },
+    {
+      "model": "gemma-3-4b-it-Q3_K_S",
+      "model_clean": "gemma-3-4b-it-Q3_K_S",
+      "env": "rocm6_4_4-rocwmma-hblt0",
+      "env_base": "rocm6_4_4",
+      "env_variant": "rocwmma-hblt0",
+      "fa": false,
+      "test": "pp512",
+      "tps_mean": 2040.3,
+      "tps_std": 9.11,
+      "error": false,
+      "error_type": null,
+      "backend": "ROCm",
+      "ngl": 99,
+      "mmap": 0,
+      "params_b": 3.88,
+      "file_size_gib": 1.8,
+      "name_params_b": 3.88,
+      "quant": "Q3_K_S",
+      "log": "results/gemma-3-4b-it-Q3_K_S__rocm6_4_4-rocwmma__hblt0.log",
+      "build": {
+        "hash": "4807e8f9",
+        "number": "6609"
+      }
+    },
+    {
+      "model": "gemma-3-4b-it-Q3_K_S",
+      "model_clean": "gemma-3-4b-it-Q3_K_S",
+      "env": "rocm6_4_4-rocwmma-hblt0",
+      "env_base": "rocm6_4_4",
+      "env_variant": "rocwmma-hblt0",
+      "fa": false,
+      "test": "tg128",
+      "tps_mean": 79.33,
+      "tps_std": 0.05,
+      "error": false,
+      "error_type": null,
+      "backend": "ROCm",
+      "ngl": 99,
+      "mmap": 0,
+      "params_b": 3.88,
+      "file_size_gib": 1.8,
+      "name_params_b": 3.88,
+      "quant": "Q3_K_S",
+      "log": "results/gemma-3-4b-it-Q3_K_S__rocm6_4_4-rocwmma__hblt0.log",
+      "build": {
+        "hash": "4807e8f9",
+        "number": "6609"
+      }
+    },
+    {
+      "model": "gemma-3-4b-it-Q3_K_S",
+      "model_clean": "gemma-3-4b-it-Q3_K_S",
+      "env": "rocm6_4_4-rocwmma-hblt0",
+      "env_base": "rocm6_4_4",
+      "env_variant": "rocwmma-hblt0",
+      "fa": true,
+      "test": "pp512",
+      "tps_mean": 2143.83,
+      "tps_std": 3.82,
+      "error": false,
+      "error_type": null,
+      "backend": "ROCm",
+      "ngl": 99,
+      "mmap": 0,
+      "params_b": 3.88,
+      "file_size_gib": 1.8,
+      "name_params_b": 3.88,
+      "quant": "Q3_K_S",
+      "log": "results/gemma-3-4b-it-Q3_K_S__rocm6_4_4-rocwmma__hblt0__fa1.log",
+      "build": {
+        "hash": "4807e8f9",
+        "number": "6609"
+      }
+    },
+    {
+      "model": "gemma-3-4b-it-Q3_K_S",
+      "model_clean": "gemma-3-4b-it-Q3_K_S",
+      "env": "rocm6_4_4-rocwmma-hblt0",
+      "env_base": "rocm6_4_4",
+      "env_variant": "rocwmma-hblt0",
+      "fa": true,
+      "test": "tg128",
+      "tps_mean": 77.19,
+      "tps_std": 0.02,
+      "error": false,
+      "error_type": null,
+      "backend": "ROCm",
+      "ngl": 99,
+      "mmap": 0,
+      "params_b": 3.88,
+      "file_size_gib": 1.8,
+      "name_params_b": 3.88,
+      "quant": "Q3_K_S",
+      "log": "results/gemma-3-4b-it-Q3_K_S__rocm6_4_4-rocwmma__hblt0__fa1.log",
+      "build": {
+        "hash": "4807e8f9",
+        "number": "6609"
+      }
+    },
+    {
+      "model": "gemma-3-4b-it-Q3_K_S",
+      "model_clean": "gemma-3-4b-it-Q3_K_S",
+      "env": "rocm6_4_4",
+      "env_base": "rocm6_4_4",
+      "env_variant": null,
+      "fa": false,
+      "test": "pp512",
+      "tps_mean": 2099.8,
+      "tps_std": 6.34,
+      "error": false,
+      "error_type": null,
+      "backend": "ROCm",
+      "ngl": 99,
+      "mmap": 0,
+      "params_b": 3.88,
+      "file_size_gib": 1.8,
+      "name_params_b": 3.88,
+      "quant": "Q3_K_S",
+      "log": "results/gemma-3-4b-it-Q3_K_S__rocm6_4_4.log",
+      "build": {
+        "hash": "4807e8f9",
+        "number": "6609"
+      }
+    },
+    {
+      "model": "gemma-3-4b-it-Q3_K_S",
+      "model_clean": "gemma-3-4b-it-Q3_K_S",
+      "env": "rocm6_4_4",
+      "env_base": "rocm6_4_4",
+      "env_variant": null,
+      "fa": false,
+      "test": "tg128",
+      "tps_mean": 79.43,
+      "tps_std": 0.05,
+      "error": false,
+      "error_type": null,
+      "backend": "ROCm",
+      "ngl": 99,
+      "mmap": 0,
+      "params_b": 3.88,
+      "file_size_gib": 1.8,
+      "name_params_b": 3.88,
+      "quant": "Q3_K_S",
+      "log": "results/gemma-3-4b-it-Q3_K_S__rocm6_4_4.log",
+      "build": {
+        "hash": "4807e8f9",
+        "number": "6609"
+      }
+    },
+    {
+      "model": "gemma-3-4b-it-Q3_K_S",
+      "model_clean": "gemma-3-4b-it-Q3_K_S",
+      "env": "rocm6_4_4",
+      "env_base": "rocm6_4_4",
+      "env_variant": null,
+      "fa": true,
+      "test": "pp512",
+      "tps_mean": 2262.0,
+      "tps_std": 6.48,
+      "error": false,
+      "error_type": null,
+      "backend": "ROCm",
+      "ngl": 99,
+      "mmap": 0,
+      "params_b": 3.88,
+      "file_size_gib": 1.8,
+      "name_params_b": 3.88,
+      "quant": "Q3_K_S",
+      "log": "results/gemma-3-4b-it-Q3_K_S__rocm6_4_4__fa1.log",
+      "build": {
+        "hash": "4807e8f9",
+        "number": "6609"
+      }
+    },
+    {
+      "model": "gemma-3-4b-it-Q3_K_S",
+      "model_clean": "gemma-3-4b-it-Q3_K_S",
+      "env": "rocm6_4_4",
+      "env_base": "rocm6_4_4",
+      "env_variant": null,
+      "fa": true,
+      "test": "tg128",
+      "tps_mean": 77.04,
+      "tps_std": 0.03,
+      "error": false,
+      "error_type": null,
+      "backend": "ROCm",
+      "ngl": 99,
+      "mmap": 0,
+      "params_b": 3.88,
+      "file_size_gib": 1.8,
+      "name_params_b": 3.88,
+      "quant": "Q3_K_S",
+      "log": "results/gemma-3-4b-it-Q3_K_S__rocm6_4_4__fa1.log",
+      "build": {
+        "hash": "4807e8f9",
+        "number": "6609"
+      }
+    },
+    {
+      "model": "gemma-3-4b-it-Q3_K_S",
+      "model_clean": "gemma-3-4b-it-Q3_K_S",
+      "env": "rocm6_4_4-hblt0",
+      "env_base": "rocm6_4_4",
+      "env_variant": "hblt0",
+      "fa": false,
+      "test": "pp512",
+      "tps_mean": 2038.14,
+      "tps_std": 6.72,
+      "error": false,
+      "error_type": null,
+      "backend": "ROCm",
+      "ngl": 99,
+      "mmap": 0,
+      "params_b": 3.88,
+      "file_size_gib": 1.8,
+      "name_params_b": 3.88,
+      "quant": "Q3_K_S",
+      "log": "results/gemma-3-4b-it-Q3_K_S__rocm6_4_4__hblt0.log",
+      "build": {
+        "hash": "4807e8f9",
+        "number": "6609"
+      }
+    },
+    {
+      "model": "gemma-3-4b-it-Q3_K_S",
+      "model_clean": "gemma-3-4b-it-Q3_K_S",
+      "env": "rocm6_4_4-hblt0",
+      "env_base": "rocm6_4_4",
+      "env_variant": "hblt0",
+      "fa": false,
+      "test": "tg128",
+      "tps_mean": 79.41,
+      "tps_std": 0.04,
+      "error": false,
+      "error_type": null,
+      "backend": "ROCm",
+      "ngl": 99,
+      "mmap": 0,
+      "params_b": 3.88,
+      "file_size_gib": 1.8,
+      "name_params_b": 3.88,
+      "quant": "Q3_K_S",
+      "log": "results/gemma-3-4b-it-Q3_K_S__rocm6_4_4__hblt0.log",
+      "build": {
+        "hash": "4807e8f9",
+        "number": "6609"
+      }
+    },
+    {
+      "model": "gemma-3-4b-it-Q3_K_S",
+      "model_clean": "gemma-3-4b-it-Q3_K_S",
+      "env": "rocm6_4_4-hblt0",
+      "env_base": "rocm6_4_4",
+      "env_variant": "hblt0",
+      "fa": true,
+      "test": "pp512",
+      "tps_mean": 2141.85,
+      "tps_std": 6.83,
+      "error": false,
+      "error_type": null,
+      "backend": "ROCm",
+      "ngl": 99,
+      "mmap": 0,
+      "params_b": 3.88,
+      "file_size_gib": 1.8,
+      "name_params_b": 3.88,
+      "quant": "Q3_K_S",
+      "log": "results/gemma-3-4b-it-Q3_K_S__rocm6_4_4__hblt0__fa1.log",
+      "build": {
+        "hash": "4807e8f9",
+        "number": "6609"
+      }
+    },
+    {
+      "model": "gemma-3-4b-it-Q3_K_S",
+      "model_clean": "gemma-3-4b-it-Q3_K_S",
+      "env": "rocm6_4_4-hblt0",
+      "env_base": "rocm6_4_4",
+      "env_variant": "hblt0",
+      "fa": true,
+      "test": "tg128",
+      "tps_mean": 77.14,
+      "tps_std": 0.02,
+      "error": false,
+      "error_type": null,
+      "backend": "ROCm",
+      "ngl": 99,
+      "mmap": 0,
+      "params_b": 3.88,
+      "file_size_gib": 1.8,
+      "name_params_b": 3.88,
+      "quant": "Q3_K_S",
+      "log": "results/gemma-3-4b-it-Q3_K_S__rocm6_4_4__hblt0__fa1.log",
+      "build": {
+        "hash": "4807e8f9",
+        "number": "6609"
+      }
+    },
     {
       "model": "gemma-3-4b-it-Q3_K_S",
       "model_clean": "gemma-3-4b-it-Q3_K_S",
@@ -13985,6 +18793,406 @@
         "number": "6486"
       }
     },
+    {
+      "model": "gpt-oss-120b-F16",
+      "model_clean": "gpt-oss-120b-F16",
+      "env": "rocm6_4_4-rocwmma",
+      "env_base": "rocm6_4_4",
+      "env_variant": "rocwmma",
+      "fa": false,
+      "test": "pp512",
+      "tps_mean": 683.95,
+      "tps_std": 7.54,
+      "error": false,
+      "error_type": null,
+      "backend": "ROCm",
+      "ngl": 99,
+      "mmap": 0,
+      "params_b": 116.83,
+      "file_size_gib": 60.87,
+      "name_params_b": 116.83,
+      "quant": "F16",
+      "log": "results/gpt-oss-120b-F16__rocm6_4_4-rocwmma.log",
+      "build": {
+        "hash": "4807e8f9",
+        "number": "6609"
+      }
+    },
+    {
+      "model": "gpt-oss-120b-F16",
+      "model_clean": "gpt-oss-120b-F16",
+      "env": "rocm6_4_4-rocwmma",
+      "env_base": "rocm6_4_4",
+      "env_variant": "rocwmma",
+      "fa": false,
+      "test": "tg128",
+      "tps_mean": 34.82,
+      "tps_std": 0.0,
+      "error": false,
+      "error_type": null,
+      "backend": "ROCm",
+      "ngl": 99,
+      "mmap": 0,
+      "params_b": 116.83,
+      "file_size_gib": 60.87,
+      "name_params_b": 116.83,
+      "quant": "F16",
+      "log": "results/gpt-oss-120b-F16__rocm6_4_4-rocwmma.log",
+      "build": {
+        "hash": "4807e8f9",
+        "number": "6609"
+      }
+    },
+    {
+      "model": "gpt-oss-120b-F16",
+      "model_clean": "gpt-oss-120b-F16",
+      "env": "rocm6_4_4-rocwmma",
+      "env_base": "rocm6_4_4",
+      "env_variant": "rocwmma",
+      "fa": true,
+      "test": "pp512",
+      "tps_mean": 783.37,
+      "tps_std": 6.29,
+      "error": false,
+      "error_type": null,
+      "backend": "ROCm",
+      "ngl": 99,
+      "mmap": 0,
+      "params_b": 116.83,
+      "file_size_gib": 60.87,
+      "name_params_b": 116.83,
+      "quant": "F16",
+      "log": "results/gpt-oss-120b-F16__rocm6_4_4-rocwmma__fa1.log",
+      "build": {
+        "hash": "4807e8f9",
+        "number": "6609"
+      }
+    },
+    {
+      "model": "gpt-oss-120b-F16",
+      "model_clean": "gpt-oss-120b-F16",
+      "env": "rocm6_4_4-rocwmma",
+      "env_base": "rocm6_4_4",
+      "env_variant": "rocwmma",
+      "fa": true,
+      "test": "tg128",
+      "tps_mean": 35.06,
+      "tps_std": 0.01,
+      "error": false,
+      "error_type": null,
+      "backend": "ROCm",
+      "ngl": 99,
+      "mmap": 0,
+      "params_b": 116.83,
+      "file_size_gib": 60.87,
+      "name_params_b": 116.83,
+      "quant": "F16",
+      "log": "results/gpt-oss-120b-F16__rocm6_4_4-rocwmma__fa1.log",
+      "build": {
+        "hash": "4807e8f9",
+        "number": "6609"
+      }
+    },
+    {
+      "model": "gpt-oss-120b-F16",
+      "model_clean": "gpt-oss-120b-F16",
+      "env": "rocm6_4_4-rocwmma-hblt0",
+      "env_base": "rocm6_4_4",
+      "env_variant": "rocwmma-hblt0",
+      "fa": false,
+      "test": "pp512",
+      "tps_mean": 689.85,
+      "tps_std": 4.6,
+      "error": false,
+      "error_type": null,
+      "backend": "ROCm",
+      "ngl": 99,
+      "mmap": 0,
+      "params_b": 116.83,
+      "file_size_gib": 60.87,
+      "name_params_b": 116.83,
+      "quant": "F16",
+      "log": "results/gpt-oss-120b-F16__rocm6_4_4-rocwmma__hblt0.log",
+      "build": {
+        "hash": "4807e8f9",
+        "number": "6609"
+      }
+    },
+    {
+      "model": "gpt-oss-120b-F16",
+      "model_clean": "gpt-oss-120b-F16",
+      "env": "rocm6_4_4-rocwmma-hblt0",
+      "env_base": "rocm6_4_4",
+      "env_variant": "rocwmma-hblt0",
+      "fa": false,
+      "test": "tg128",
+      "tps_mean": 34.84,
+      "tps_std": 0.0,
+      "error": false,
+      "error_type": null,
+      "backend": "ROCm",
+      "ngl": 99,
+      "mmap": 0,
+      "params_b": 116.83,
+      "file_size_gib": 60.87,
+      "name_params_b": 116.83,
+      "quant": "F16",
+      "log": "results/gpt-oss-120b-F16__rocm6_4_4-rocwmma__hblt0.log",
+      "build": {
+        "hash": "4807e8f9",
+        "number": "6609"
+      }
+    },
+    {
+      "model": "gpt-oss-120b-F16",
+      "model_clean": "gpt-oss-120b-F16",
+      "env": "rocm6_4_4-rocwmma-hblt0",
+      "env_base": "rocm6_4_4",
+      "env_variant": "rocwmma-hblt0",
+      "fa": true,
+      "test": "pp512",
+      "tps_mean": 789.94,
+      "tps_std": 5.16,
+      "error": false,
+      "error_type": null,
+      "backend": "ROCm",
+      "ngl": 99,
+      "mmap": 0,
+      "params_b": 116.83,
+      "file_size_gib": 60.87,
+      "name_params_b": 116.83,
+      "quant": "F16",
+      "log": "results/gpt-oss-120b-F16__rocm6_4_4-rocwmma__hblt0__fa1.log",
+      "build": {
+        "hash": "4807e8f9",
+        "number": "6609"
+      }
+    },
+    {
+      "model": "gpt-oss-120b-F16",
+      "model_clean": "gpt-oss-120b-F16",
+      "env": "rocm6_4_4-rocwmma-hblt0",
+      "env_base": "rocm6_4_4",
+      "env_variant": "rocwmma-hblt0",
+      "fa": true,
+      "test": "tg128",
+      "tps_mean": 35.17,
+      "tps_std": 0.0,
+      "error": false,
+      "error_type": null,
+      "backend": "ROCm",
+      "ngl": 99,
+      "mmap": 0,
+      "params_b": 116.83,
+      "file_size_gib": 60.87,
+      "name_params_b": 116.83,
+      "quant": "F16",
+      "log": "results/gpt-oss-120b-F16__rocm6_4_4-rocwmma__hblt0__fa1.log",
+      "build": {
+        "hash": "4807e8f9",
+        "number": "6609"
+      }
+    },
+    {
+      "model": "gpt-oss-120b-F16",
+      "model_clean": "gpt-oss-120b-F16",
+      "env": "rocm6_4_4",
+      "env_base": "rocm6_4_4",
+      "env_variant": null,
+      "fa": false,
+      "test": "pp512",
+      "tps_mean": 682.09,
+      "tps_std": 3.61,
+      "error": false,
+      "error_type": null,
+      "backend": "ROCm",
+      "ngl": 99,
+      "mmap": 0,
+      "params_b": 116.83,
+      "file_size_gib": 60.87,
+      "name_params_b": 116.83,
+      "quant": "F16",
+      "log": "results/gpt-oss-120b-F16__rocm6_4_4.log",
+      "build": {
+        "hash": "4807e8f9",
+        "number": "6609"
+      }
+    },
+    {
+      "model": "gpt-oss-120b-F16",
+      "model_clean": "gpt-oss-120b-F16",
+      "env": "rocm6_4_4",
+      "env_base": "rocm6_4_4",
+      "env_variant": null,
+      "fa": false,
+      "test": "tg128",
+      "tps_mean": 34.89,
+      "tps_std": 0.0,
+      "error": false,
+      "error_type": null,
+      "backend": "ROCm",
+      "ngl": 99,
+      "mmap": 0,
+      "params_b": 116.83,
+      "file_size_gib": 60.87,
+      "name_params_b": 116.83,
+      "quant": "F16",
+      "log": "results/gpt-oss-120b-F16__rocm6_4_4.log",
+      "build": {
+        "hash": "4807e8f9",
+        "number": "6609"
+      }
+    },
+    {
+      "model": "gpt-oss-120b-F16",
+      "model_clean": "gpt-oss-120b-F16",
+      "env": "rocm6_4_4",
+      "env_base": "rocm6_4_4",
+      "env_variant": null,
+      "fa": true,
+      "test": "pp512",
+      "tps_mean": 790.76,
+      "tps_std": 6.72,
+      "error": false,
+      "error_type": null,
+      "backend": "ROCm",
+      "ngl": 99,
+      "mmap": 0,
+      "params_b": 116.83,
+      "file_size_gib": 60.87,
+      "name_params_b": 116.83,
+      "quant": "F16",
+      "log": "results/gpt-oss-120b-F16__rocm6_4_4__fa1.log",
+      "build": {
+        "hash": "4807e8f9",
+        "number": "6609"
+      }
+    },
+    {
+      "model": "gpt-oss-120b-F16",
+      "model_clean": "gpt-oss-120b-F16",
+      "env": "rocm6_4_4",
+      "env_base": "rocm6_4_4",
+      "env_variant": null,
+      "fa": true,
+      "test": "tg128",
+      "tps_mean": 35.06,
+      "tps_std": 0.0,
+      "error": false,
+      "error_type": null,
+      "backend": "ROCm",
+      "ngl": 99,
+      "mmap": 0,
+      "params_b": 116.83,
+      "file_size_gib": 60.87,
+      "name_params_b": 116.83,
+      "quant": "F16",
+      "log": "results/gpt-oss-120b-F16__rocm6_4_4__fa1.log",
+      "build": {
+        "hash": "4807e8f9",
+        "number": "6609"
+      }
+    },
+    {
+      "model": "gpt-oss-120b-F16",
+      "model_clean": "gpt-oss-120b-F16",
+      "env": "rocm6_4_4-hblt0",
+      "env_base": "rocm6_4_4",
+      "env_variant": "hblt0",
+      "fa": false,
+      "test": "pp512",
+      "tps_mean": 688.37,
+      "tps_std": 4.43,
+      "error": false,
+      "error_type": null,
+      "backend": "ROCm",
+      "ngl": 99,
+      "mmap": 0,
+      "params_b": 116.83,
+      "file_size_gib": 60.87,
+      "name_params_b": 116.83,
+      "quant": "F16",
+      "log": "results/gpt-oss-120b-F16__rocm6_4_4__hblt0.log",
+      "build": {
+        "hash": "4807e8f9",
+        "number": "6609"
+      }
+    },
+    {
+      "model": "gpt-oss-120b-F16",
+      "model_clean": "gpt-oss-120b-F16",
+      "env": "rocm6_4_4-hblt0",
+      "env_base": "rocm6_4_4",
+      "env_variant": "hblt0",
+      "fa": false,
+      "test": "tg128",
+      "tps_mean": 34.74,
+      "tps_std": 0.01,
+      "error": false,
+      "error_type": null,
+      "backend": "ROCm",
+      "ngl": 99,
+      "mmap": 0,
+      "params_b": 116.83,
+      "file_size_gib": 60.87,
+      "name_params_b": 116.83,
+      "quant": "F16",
+      "log": "results/gpt-oss-120b-F16__rocm6_4_4__hblt0.log",
+      "build": {
+        "hash": "4807e8f9",
+        "number": "6609"
+      }
+    },
+    {
+      "model": "gpt-oss-120b-F16",
+      "model_clean": "gpt-oss-120b-F16",
+      "env": "rocm6_4_4-hblt0",
+      "env_base": "rocm6_4_4",
+      "env_variant": "hblt0",
+      "fa": true,
+      "test": "pp512",
+      "tps_mean": 777.75,
+      "tps_std": 25.64,
+      "error": false,
+      "error_type": null,
+      "backend": "ROCm",
+      "ngl": 99,
+      "mmap": 0,
+      "params_b": 116.83,
+      "file_size_gib": 60.87,
+      "name_params_b": 116.83,
+      "quant": "F16",
+      "log": "results/gpt-oss-120b-F16__rocm6_4_4__hblt0__fa1.log",
+      "build": {
+        "hash": "4807e8f9",
+        "number": "6609"
+      }
+    },
+    {
+      "model": "gpt-oss-120b-F16",
+      "model_clean": "gpt-oss-120b-F16",
+      "env": "rocm6_4_4-hblt0",
+      "env_base": "rocm6_4_4",
+      "env_variant": "hblt0",
+      "fa": true,
+      "test": "tg128",
+      "tps_mean": 35.12,
+      "tps_std": 0.0,
+      "error": false,
+      "error_type": null,
+      "backend": "ROCm",
+      "ngl": 99,
+      "mmap": 0,
+      "params_b": 116.83,
+      "file_size_gib": 60.87,
+      "name_params_b": 116.83,
+      "quant": "F16",
+      "log": "results/gpt-oss-120b-F16__rocm6_4_4__hblt0__fa1.log",
+      "build": {
+        "hash": "4807e8f9",
+        "number": "6609"
+      }
+    },
     {
       "model": "gpt-oss-120b-F16",
       "model_clean": "gpt-oss-120b-F16",
@@ -15157,6 +20365,406 @@
         "number": "6486"
       }
     },
+    {
+      "model": "gpt-oss-120b-mxfp4-00001-of-00003",
+      "model_clean": "gpt-oss-120b-mxfp4",
+      "env": "rocm6_4_4-rocwmma",
+      "env_base": "rocm6_4_4",
+      "env_variant": "rocwmma",
+      "fa": false,
+      "test": "pp512",
+      "tps_mean": 668.07,
+      "tps_std": 3.99,
+      "error": false,
+      "error_type": null,
+      "backend": "ROCm",
+      "ngl": 99,
+      "mmap": 0,
+      "params_b": 116.83,
+      "file_size_gib": 59.02,
+      "name_params_b": 116.83,
+      "quant": "MXFP4",
+      "log": "results/gpt-oss-120b-mxfp4-00001-of-00003__rocm6_4_4-rocwmma.log",
+      "build": {
+        "hash": "4807e8f9",
+        "number": "6609"
+      }
+    },
+    {
+      "model": "gpt-oss-120b-mxfp4-00001-of-00003",
+      "model_clean": "gpt-oss-120b-mxfp4",
+      "env": "rocm6_4_4-rocwmma",
+      "env_base": "rocm6_4_4",
+      "env_variant": "rocwmma",
+      "fa": false,
+      "test": "tg128",
+      "tps_mean": 47.22,
+      "tps_std": 0.01,
+      "error": false,
+      "error_type": null,
+      "backend": "ROCm",
+      "ngl": 99,
+      "mmap": 0,
+      "params_b": 116.83,
+      "file_size_gib": 59.02,
+      "name_params_b": 116.83,
+      "quant": "MXFP4",
+      "log": "results/gpt-oss-120b-mxfp4-00001-of-00003__rocm6_4_4-rocwmma.log",
+      "build": {
+        "hash": "4807e8f9",
+        "number": "6609"
+      }
+    },
+    {
+      "model": "gpt-oss-120b-mxfp4-00001-of-00003",
+      "model_clean": "gpt-oss-120b-mxfp4",
+      "env": "rocm6_4_4-rocwmma",
+      "env_base": "rocm6_4_4",
+      "env_variant": "rocwmma",
+      "fa": true,
+      "test": "pp512",
+      "tps_mean": 767.63,
+      "tps_std": 5.37,
+      "error": false,
+      "error_type": null,
+      "backend": "ROCm",
+      "ngl": 99,
+      "mmap": 0,
+      "params_b": 116.83,
+      "file_size_gib": 59.02,
+      "name_params_b": 116.83,
+      "quant": "MXFP4",
+      "log": "results/gpt-oss-120b-mxfp4-00001-of-00003__rocm6_4_4-rocwmma__fa1.log",
+      "build": {
+        "hash": "4807e8f9",
+        "number": "6609"
+      }
+    },
+    {
+      "model": "gpt-oss-120b-mxfp4-00001-of-00003",
+      "model_clean": "gpt-oss-120b-mxfp4",
+      "env": "rocm6_4_4-rocwmma",
+      "env_base": "rocm6_4_4",
+      "env_variant": "rocwmma",
+      "fa": true,
+      "test": "tg128",
+      "tps_mean": 47.72,
+      "tps_std": 0.01,
+      "error": false,
+      "error_type": null,
+      "backend": "ROCm",
+      "ngl": 99,
+      "mmap": 0,
+      "params_b": 116.83,
+      "file_size_gib": 59.02,
+      "name_params_b": 116.83,
+      "quant": "MXFP4",
+      "log": "results/gpt-oss-120b-mxfp4-00001-of-00003__rocm6_4_4-rocwmma__fa1.log",
+      "build": {
+        "hash": "4807e8f9",
+        "number": "6609"
+      }
+    },
+    {
+      "model": "gpt-oss-120b-mxfp4-00001-of-00003",
+      "model_clean": "gpt-oss-120b-mxfp4",
+      "env": "rocm6_4_4-rocwmma-hblt0",
+      "env_base": "rocm6_4_4",
+      "env_variant": "rocwmma-hblt0",
+      "fa": false,
+      "test": "pp512",
+      "tps_mean": 685.61,
+      "tps_std": 4.6,
+      "error": false,
+      "error_type": null,
+      "backend": "ROCm",
+      "ngl": 99,
+      "mmap": 0,
+      "params_b": 116.83,
+      "file_size_gib": 59.02,
+      "name_params_b": 116.83,
+      "quant": "MXFP4",
+      "log": "results/gpt-oss-120b-mxfp4-00001-of-00003__rocm6_4_4-rocwmma__hblt0.log",
+      "build": {
+        "hash": "4807e8f9",
+        "number": "6609"
+      }
+    },
+    {
+      "model": "gpt-oss-120b-mxfp4-00001-of-00003",
+      "model_clean": "gpt-oss-120b-mxfp4",
+      "env": "rocm6_4_4-rocwmma-hblt0",
+      "env_base": "rocm6_4_4",
+      "env_variant": "rocwmma-hblt0",
+      "fa": false,
+      "test": "tg128",
+      "tps_mean": 47.15,
+      "tps_std": 0.0,
+      "error": false,
+      "error_type": null,
+      "backend": "ROCm",
+      "ngl": 99,
+      "mmap": 0,
+      "params_b": 116.83,
+      "file_size_gib": 59.02,
+      "name_params_b": 116.83,
+      "quant": "MXFP4",
+      "log": "results/gpt-oss-120b-mxfp4-00001-of-00003__rocm6_4_4-rocwmma__hblt0.log",
+      "build": {
+        "hash": "4807e8f9",
+        "number": "6609"
+      }
+    },
+    {
+      "model": "gpt-oss-120b-mxfp4-00001-of-00003",
+      "model_clean": "gpt-oss-120b-mxfp4",
+      "env": "rocm6_4_4-rocwmma-hblt0",
+      "env_base": "rocm6_4_4",
+      "env_variant": "rocwmma-hblt0",
+      "fa": true,
+      "test": "pp512",
+      "tps_mean": 785.43,
+      "tps_std": 4.63,
+      "error": false,
+      "error_type": null,
+      "backend": "ROCm",
+      "ngl": 99,
+      "mmap": 0,
+      "params_b": 116.83,
+      "file_size_gib": 59.02,
+      "name_params_b": 116.83,
+      "quant": "MXFP4",
+      "log": "results/gpt-oss-120b-mxfp4-00001-of-00003__rocm6_4_4-rocwmma__hblt0__fa1.log",
+      "build": {
+        "hash": "4807e8f9",
+        "number": "6609"
+      }
+    },
+    {
+      "model": "gpt-oss-120b-mxfp4-00001-of-00003",
+      "model_clean": "gpt-oss-120b-mxfp4",
+      "env": "rocm6_4_4-rocwmma-hblt0",
+      "env_base": "rocm6_4_4",
+      "env_variant": "rocwmma-hblt0",
+      "fa": true,
+      "test": "tg128",
+      "tps_mean": 47.65,
+      "tps_std": 0.01,
+      "error": false,
+      "error_type": null,
+      "backend": "ROCm",
+      "ngl": 99,
+      "mmap": 0,
+      "params_b": 116.83,
+      "file_size_gib": 59.02,
+      "name_params_b": 116.83,
+      "quant": "MXFP4",
+      "log": "results/gpt-oss-120b-mxfp4-00001-of-00003__rocm6_4_4-rocwmma__hblt0__fa1.log",
+      "build": {
+        "hash": "4807e8f9",
+        "number": "6609"
+      }
+    },
+    {
+      "model": "gpt-oss-120b-mxfp4-00001-of-00003",
+      "model_clean": "gpt-oss-120b-mxfp4",
+      "env": "rocm6_4_4",
+      "env_base": "rocm6_4_4",
+      "env_variant": null,
+      "fa": false,
+      "test": "pp512",
+      "tps_mean": 664.62,
+      "tps_std": 3.53,
+      "error": false,
+      "error_type": null,
+      "backend": "ROCm",
+      "ngl": 99,
+      "mmap": 0,
+      "params_b": 116.83,
+      "file_size_gib": 59.02,
+      "name_params_b": 116.83,
+      "quant": "MXFP4",
+      "log": "results/gpt-oss-120b-mxfp4-00001-of-00003__rocm6_4_4.log",
+      "build": {
+        "hash": "4807e8f9",
+        "number": "6609"
+      }
+    },
+    {
+      "model": "gpt-oss-120b-mxfp4-00001-of-00003",
+      "model_clean": "gpt-oss-120b-mxfp4",
+      "env": "rocm6_4_4",
+      "env_base": "rocm6_4_4",
+      "env_variant": null,
+      "fa": false,
+      "test": "tg128",
+      "tps_mean": 47.11,
+      "tps_std": 0.01,
+      "error": false,
+      "error_type": null,
+      "backend": "ROCm",
+      "ngl": 99,
+      "mmap": 0,
+      "params_b": 116.83,
+      "file_size_gib": 59.02,
+      "name_params_b": 116.83,
+      "quant": "MXFP4",
+      "log": "results/gpt-oss-120b-mxfp4-00001-of-00003__rocm6_4_4.log",
+      "build": {
+        "hash": "4807e8f9",
+        "number": "6609"
+      }
+    },
+    {
+      "model": "gpt-oss-120b-mxfp4-00001-of-00003",
+      "model_clean": "gpt-oss-120b-mxfp4",
+      "env": "rocm6_4_4",
+      "env_base": "rocm6_4_4",
+      "env_variant": null,
+      "fa": true,
+      "test": "pp512",
+      "tps_mean": 773.25,
+      "tps_std": 6.5,
+      "error": false,
+      "error_type": null,
+      "backend": "ROCm",
+      "ngl": 99,
+      "mmap": 0,
+      "params_b": 116.83,
+      "file_size_gib": 59.02,
+      "name_params_b": 116.83,
+      "quant": "MXFP4",
+      "log": "results/gpt-oss-120b-mxfp4-00001-of-00003__rocm6_4_4__fa1.log",
+      "build": {
+        "hash": "4807e8f9",
+        "number": "6609"
+      }
+    },
+    {
+      "model": "gpt-oss-120b-mxfp4-00001-of-00003",
+      "model_clean": "gpt-oss-120b-mxfp4",
+      "env": "rocm6_4_4",
+      "env_base": "rocm6_4_4",
+      "env_variant": null,
+      "fa": true,
+      "test": "tg128",
+      "tps_mean": 47.69,
+      "tps_std": 0.0,
+      "error": false,
+      "error_type": null,
+      "backend": "ROCm",
+      "ngl": 99,
+      "mmap": 0,
+      "params_b": 116.83,
+      "file_size_gib": 59.02,
+      "name_params_b": 116.83,
+      "quant": "MXFP4",
+      "log": "results/gpt-oss-120b-mxfp4-00001-of-00003__rocm6_4_4__fa1.log",
+      "build": {
+        "hash": "4807e8f9",
+        "number": "6609"
+      }
+    },
+    {
+      "model": "gpt-oss-120b-mxfp4-00001-of-00003",
+      "model_clean": "gpt-oss-120b-mxfp4",
+      "env": "rocm6_4_4-hblt0",
+      "env_base": "rocm6_4_4",
+      "env_variant": "hblt0",
+      "fa": false,
+      "test": "pp512",
+      "tps_mean": 686.92,
+      "tps_std": 5.29,
+      "error": false,
+      "error_type": null,
+      "backend": "ROCm",
+      "ngl": 99,
+      "mmap": 0,
+      "params_b": 116.83,
+      "file_size_gib": 59.02,
+      "name_params_b": 116.83,
+      "quant": "MXFP4",
+      "log": "results/gpt-oss-120b-mxfp4-00001-of-00003__rocm6_4_4__hblt0.log",
+      "build": {
+        "hash": "4807e8f9",
+        "number": "6609"
+      }
+    },
+    {
+      "model": "gpt-oss-120b-mxfp4-00001-of-00003",
+      "model_clean": "gpt-oss-120b-mxfp4",
+      "env": "rocm6_4_4-hblt0",
+      "env_base": "rocm6_4_4",
+      "env_variant": "hblt0",
+      "fa": false,
+      "test": "tg128",
+      "tps_mean": 47.15,
+      "tps_std": 0.01,
+      "error": false,
+      "error_type": null,
+      "backend": "ROCm",
+      "ngl": 99,
+      "mmap": 0,
+      "params_b": 116.83,
+      "file_size_gib": 59.02,
+      "name_params_b": 116.83,
+      "quant": "MXFP4",
+      "log": "results/gpt-oss-120b-mxfp4-00001-of-00003__rocm6_4_4__hblt0.log",
+      "build": {
+        "hash": "4807e8f9",
+        "number": "6609"
+      }
+    },
+    {
+      "model": "gpt-oss-120b-mxfp4-00001-of-00003",
+      "model_clean": "gpt-oss-120b-mxfp4",
+      "env": "rocm6_4_4-hblt0",
+      "env_base": "rocm6_4_4",
+      "env_variant": "hblt0",
+      "fa": true,
+      "test": "pp512",
+      "tps_mean": 781.6,
+      "tps_std": 6.15,
+      "error": false,
+      "error_type": null,
+      "backend": "ROCm",
+      "ngl": 99,
+      "mmap": 0,
+      "params_b": 116.83,
+      "file_size_gib": 59.02,
+      "name_params_b": 116.83,
+      "quant": "MXFP4",
+      "log": "results/gpt-oss-120b-mxfp4-00001-of-00003__rocm6_4_4__hblt0__fa1.log",
+      "build": {
+        "hash": "4807e8f9",
+        "number": "6609"
+      }
+    },
+    {
+      "model": "gpt-oss-120b-mxfp4-00001-of-00003",
+      "model_clean": "gpt-oss-120b-mxfp4",
+      "env": "rocm6_4_4-hblt0",
+      "env_base": "rocm6_4_4",
+      "env_variant": "hblt0",
+      "fa": true,
+      "test": "tg128",
+      "tps_mean": 47.76,
+      "tps_std": 0.01,
+      "error": false,
+      "error_type": null,
+      "backend": "ROCm",
+      "ngl": 99,
+      "mmap": 0,
+      "params_b": 116.83,
+      "file_size_gib": 59.02,
+      "name_params_b": 116.83,
+      "quant": "MXFP4",
+      "log": "results/gpt-oss-120b-mxfp4-00001-of-00003__rocm6_4_4__hblt0__fa1.log",
+      "build": {
+        "hash": "4807e8f9",
+        "number": "6609"
+      }
+    },
     {
       "model": "gpt-oss-120b-mxfp4-00001-of-00003",
       "model_clean": "gpt-oss-120b-mxfp4",
@@ -16357,6 +21965,406 @@
         "number": "6486"
       }
     },
+    {
+      "model": "gpt-oss-20b-F32",
+      "model_clean": "gpt-oss-20b-F32",
+      "env": "rocm6_4_4-rocwmma",
+      "env_base": "rocm6_4_4",
+      "env_variant": "rocwmma",
+      "fa": false,
+      "test": "pp512",
+      "tps_mean": 1253.42,
+      "tps_std": 6.47,
+      "error": false,
+      "error_type": null,
+      "backend": "ROCm",
+      "ngl": 99,
+      "mmap": 0,
+      "params_b": 20.91,
+      "file_size_gib": 38.97,
+      "name_params_b": 20.91,
+      "quant": "F32",
+      "log": "results/gpt-oss-20b-F32__rocm6_4_4-rocwmma.log",
+      "build": {
+        "hash": "4807e8f9",
+        "number": "6609"
+      }
+    },
+    {
+      "model": "gpt-oss-20b-F32",
+      "model_clean": "gpt-oss-20b-F32",
+      "env": "rocm6_4_4-rocwmma",
+      "env_base": "rocm6_4_4",
+      "env_variant": "rocwmma",
+      "fa": false,
+      "test": "tg128",
+      "tps_mean": 27.29,
+      "tps_std": 0.0,
+      "error": false,
+      "error_type": null,
+      "backend": "ROCm",
+      "ngl": 99,
+      "mmap": 0,
+      "params_b": 20.91,
+      "file_size_gib": 38.97,
+      "name_params_b": 20.91,
+      "quant": "F32",
+      "log": "results/gpt-oss-20b-F32__rocm6_4_4-rocwmma.log",
+      "build": {
+        "hash": "4807e8f9",
+        "number": "6609"
+      }
+    },
+    {
+      "model": "gpt-oss-20b-F32",
+      "model_clean": "gpt-oss-20b-F32",
+      "env": "rocm6_4_4-rocwmma",
+      "env_base": "rocm6_4_4",
+      "env_variant": "rocwmma",
+      "fa": true,
+      "test": "pp512",
+      "tps_mean": 1502.41,
+      "tps_std": 9.99,
+      "error": false,
+      "error_type": null,
+      "backend": "ROCm",
+      "ngl": 99,
+      "mmap": 0,
+      "params_b": 20.91,
+      "file_size_gib": 38.97,
+      "name_params_b": 20.91,
+      "quant": "F32",
+      "log": "results/gpt-oss-20b-F32__rocm6_4_4-rocwmma__fa1.log",
+      "build": {
+        "hash": "4807e8f9",
+        "number": "6609"
+      }
+    },
+    {
+      "model": "gpt-oss-20b-F32",
+      "model_clean": "gpt-oss-20b-F32",
+      "env": "rocm6_4_4-rocwmma",
+      "env_base": "rocm6_4_4",
+      "env_variant": "rocwmma",
+      "fa": true,
+      "test": "tg128",
+      "tps_mean": 27.35,
+      "tps_std": 0.0,
+      "error": false,
+      "error_type": null,
+      "backend": "ROCm",
+      "ngl": 99,
+      "mmap": 0,
+      "params_b": 20.91,
+      "file_size_gib": 38.97,
+      "name_params_b": 20.91,
+      "quant": "F32",
+      "log": "results/gpt-oss-20b-F32__rocm6_4_4-rocwmma__fa1.log",
+      "build": {
+        "hash": "4807e8f9",
+        "number": "6609"
+      }
+    },
+    {
+      "model": "gpt-oss-20b-F32",
+      "model_clean": "gpt-oss-20b-F32",
+      "env": "rocm6_4_4-rocwmma-hblt0",
+      "env_base": "rocm6_4_4",
+      "env_variant": "rocwmma-hblt0",
+      "fa": false,
+      "test": "pp512",
+      "tps_mean": 1234.38,
+      "tps_std": 12.52,
+      "error": false,
+      "error_type": null,
+      "backend": "ROCm",
+      "ngl": 99,
+      "mmap": 0,
+      "params_b": 20.91,
+      "file_size_gib": 38.97,
+      "name_params_b": 20.91,
+      "quant": "F32",
+      "log": "results/gpt-oss-20b-F32__rocm6_4_4-rocwmma__hblt0.log",
+      "build": {
+        "hash": "4807e8f9",
+        "number": "6609"
+      }
+    },
+    {
+      "model": "gpt-oss-20b-F32",
+      "model_clean": "gpt-oss-20b-F32",
+      "env": "rocm6_4_4-rocwmma-hblt0",
+      "env_base": "rocm6_4_4",
+      "env_variant": "rocwmma-hblt0",
+      "fa": false,
+      "test": "tg128",
+      "tps_mean": 27.25,
+      "tps_std": 0.0,
+      "error": false,
+      "error_type": null,
+      "backend": "ROCm",
+      "ngl": 99,
+      "mmap": 0,
+      "params_b": 20.91,
+      "file_size_gib": 38.97,
+      "name_params_b": 20.91,
+      "quant": "F32",
+      "log": "results/gpt-oss-20b-F32__rocm6_4_4-rocwmma__hblt0.log",
+      "build": {
+        "hash": "4807e8f9",
+        "number": "6609"
+      }
+    },
+    {
+      "model": "gpt-oss-20b-F32",
+      "model_clean": "gpt-oss-20b-F32",
+      "env": "rocm6_4_4-rocwmma-hblt0",
+      "env_base": "rocm6_4_4",
+      "env_variant": "rocwmma-hblt0",
+      "fa": true,
+      "test": "pp512",
+      "tps_mean": 1463.75,
+      "tps_std": 8.49,
+      "error": false,
+      "error_type": null,
+      "backend": "ROCm",
+      "ngl": 99,
+      "mmap": 0,
+      "params_b": 20.91,
+      "file_size_gib": 38.97,
+      "name_params_b": 20.91,
+      "quant": "F32",
+      "log": "results/gpt-oss-20b-F32__rocm6_4_4-rocwmma__hblt0__fa1.log",
+      "build": {
+        "hash": "4807e8f9",
+        "number": "6609"
+      }
+    },
+    {
+      "model": "gpt-oss-20b-F32",
+      "model_clean": "gpt-oss-20b-F32",
+      "env": "rocm6_4_4-rocwmma-hblt0",
+      "env_base": "rocm6_4_4",
+      "env_variant": "rocwmma-hblt0",
+      "fa": true,
+      "test": "tg128",
+      "tps_mean": 27.34,
+      "tps_std": 0.0,
+      "error": false,
+      "error_type": null,
+      "backend": "ROCm",
+      "ngl": 99,
+      "mmap": 0,
+      "params_b": 20.91,
+      "file_size_gib": 38.97,
+      "name_params_b": 20.91,
+      "quant": "F32",
+      "log": "results/gpt-oss-20b-F32__rocm6_4_4-rocwmma__hblt0__fa1.log",
+      "build": {
+        "hash": "4807e8f9",
+        "number": "6609"
+      }
+    },
+    {
+      "model": "gpt-oss-20b-F32",
+      "model_clean": "gpt-oss-20b-F32",
+      "env": "rocm6_4_4",
+      "env_base": "rocm6_4_4",
+      "env_variant": null,
+      "fa": false,
+      "test": "pp512",
+      "tps_mean": 1258.74,
+      "tps_std": 12.44,
+      "error": false,
+      "error_type": null,
+      "backend": "ROCm",
+      "ngl": 99,
+      "mmap": 0,
+      "params_b": 20.91,
+      "file_size_gib": 38.97,
+      "name_params_b": 20.91,
+      "quant": "F32",
+      "log": "results/gpt-oss-20b-F32__rocm6_4_4.log",
+      "build": {
+        "hash": "4807e8f9",
+        "number": "6609"
+      }
+    },
+    {
+      "model": "gpt-oss-20b-F32",
+      "model_clean": "gpt-oss-20b-F32",
+      "env": "rocm6_4_4",
+      "env_base": "rocm6_4_4",
+      "env_variant": null,
+      "fa": false,
+      "test": "tg128",
+      "tps_mean": 27.27,
+      "tps_std": 0.0,
+      "error": false,
+      "error_type": null,
+      "backend": "ROCm",
+      "ngl": 99,
+      "mmap": 0,
+      "params_b": 20.91,
+      "file_size_gib": 38.97,
+      "name_params_b": 20.91,
+      "quant": "F32",
+      "log": "results/gpt-oss-20b-F32__rocm6_4_4.log",
+      "build": {
+        "hash": "4807e8f9",
+        "number": "6609"
+      }
+    },
+    {
+      "model": "gpt-oss-20b-F32",
+      "model_clean": "gpt-oss-20b-F32",
+      "env": "rocm6_4_4",
+      "env_base": "rocm6_4_4",
+      "env_variant": null,
+      "fa": true,
+      "test": "pp512",
+      "tps_mean": 1513.34,
+      "tps_std": 10.79,
+      "error": false,
+      "error_type": null,
+      "backend": "ROCm",
+      "ngl": 99,
+      "mmap": 0,
+      "params_b": 20.91,
+      "file_size_gib": 38.97,
+      "name_params_b": 20.91,
+      "quant": "F32",
+      "log": "results/gpt-oss-20b-F32__rocm6_4_4__fa1.log",
+      "build": {
+        "hash": "4807e8f9",
+        "number": "6609"
+      }
+    },
+    {
+      "model": "gpt-oss-20b-F32",
+      "model_clean": "gpt-oss-20b-F32",
+      "env": "rocm6_4_4",
+      "env_base": "rocm6_4_4",
+      "env_variant": null,
+      "fa": true,
+      "test": "tg128",
+      "tps_mean": 27.35,
+      "tps_std": 0.0,
+      "error": false,
+      "error_type": null,
+      "backend": "ROCm",
+      "ngl": 99,
+      "mmap": 0,
+      "params_b": 20.91,
+      "file_size_gib": 38.97,
+      "name_params_b": 20.91,
+      "quant": "F32",
+      "log": "results/gpt-oss-20b-F32__rocm6_4_4__fa1.log",
+      "build": {
+        "hash": "4807e8f9",
+        "number": "6609"
+      }
+    },
+    {
+      "model": "gpt-oss-20b-F32",
+      "model_clean": "gpt-oss-20b-F32",
+      "env": "rocm6_4_4-hblt0",
+      "env_base": "rocm6_4_4",
+      "env_variant": "hblt0",
+      "fa": false,
+      "test": "pp512",
+      "tps_mean": 1235.02,
+      "tps_std": 7.1,
+      "error": false,
+      "error_type": null,
+      "backend": "ROCm",
+      "ngl": 99,
+      "mmap": 0,
+      "params_b": 20.91,
+      "file_size_gib": 38.97,
+      "name_params_b": 20.91,
+      "quant": "F32",
+      "log": "results/gpt-oss-20b-F32__rocm6_4_4__hblt0.log",
+      "build": {
+        "hash": "4807e8f9",
+        "number": "6609"
+      }
+    },
+    {
+      "model": "gpt-oss-20b-F32",
+      "model_clean": "gpt-oss-20b-F32",
+      "env": "rocm6_4_4-hblt0",
+      "env_base": "rocm6_4_4",
+      "env_variant": "hblt0",
+      "fa": false,
+      "test": "tg128",
+      "tps_mean": 27.26,
+      "tps_std": 0.0,
+      "error": false,
+      "error_type": null,
+      "backend": "ROCm",
+      "ngl": 99,
+      "mmap": 0,
+      "params_b": 20.91,
+      "file_size_gib": 38.97,
+      "name_params_b": 20.91,
+      "quant": "F32",
+      "log": "results/gpt-oss-20b-F32__rocm6_4_4__hblt0.log",
+      "build": {
+        "hash": "4807e8f9",
+        "number": "6609"
+      }
+    },
+    {
+      "model": "gpt-oss-20b-F32",
+      "model_clean": "gpt-oss-20b-F32",
+      "env": "rocm6_4_4-hblt0",
+      "env_base": "rocm6_4_4",
+      "env_variant": "hblt0",
+      "fa": true,
+      "test": "pp512",
+      "tps_mean": 1475.65,
+      "tps_std": 12.28,
+      "error": false,
+      "error_type": null,
+      "backend": "ROCm",
+      "ngl": 99,
+      "mmap": 0,
+      "params_b": 20.91,
+      "file_size_gib": 38.97,
+      "name_params_b": 20.91,
+      "quant": "F32",
+      "log": "results/gpt-oss-20b-F32__rocm6_4_4__hblt0__fa1.log",
+      "build": {
+        "hash": "4807e8f9",
+        "number": "6609"
+      }
+    },
+    {
+      "model": "gpt-oss-20b-F32",
+      "model_clean": "gpt-oss-20b-F32",
+      "env": "rocm6_4_4-hblt0",
+      "env_base": "rocm6_4_4",
+      "env_variant": "hblt0",
+      "fa": true,
+      "test": "tg128",
+      "tps_mean": 27.32,
+      "tps_std": 0.0,
+      "error": false,
+      "error_type": null,
+      "backend": "ROCm",
+      "ngl": 99,
+      "mmap": 0,
+      "params_b": 20.91,
+      "file_size_gib": 38.97,
+      "name_params_b": 20.91,
+      "quant": "F32",
+      "log": "results/gpt-oss-20b-F32__rocm6_4_4__hblt0__fa1.log",
+      "build": {
+        "hash": "4807e8f9",
+        "number": "6609"
+      }
+    },
     {
       "model": "gpt-oss-20b-F32",
       "model_clean": "gpt-oss-20b-F32",
@@ -17557,6 +23565,406 @@
         "number": "6486"
       }
     },
+    {
+      "model": "gpt-oss-20b-mxfp4",
+      "model_clean": "gpt-oss-20b-mxfp4",
+      "env": "rocm6_4_4-rocwmma",
+      "env_base": "rocm6_4_4",
+      "env_variant": "rocwmma",
+      "fa": false,
+      "test": "pp512",
+      "tps_mean": 1276.57,
+      "tps_std": 15.26,
+      "error": false,
+      "error_type": null,
+      "backend": "ROCm",
+      "ngl": 99,
+      "mmap": 0,
+      "params_b": 20.91,
+      "file_size_gib": 11.27,
+      "name_params_b": 20.91,
+      "quant": "MXFP4",
+      "log": "results/gpt-oss-20b-mxfp4__rocm6_4_4-rocwmma.log",
+      "build": {
+        "hash": "4807e8f9",
+        "number": "6609"
+      }
+    },
+    {
+      "model": "gpt-oss-20b-mxfp4",
+      "model_clean": "gpt-oss-20b-mxfp4",
+      "env": "rocm6_4_4-rocwmma",
+      "env_base": "rocm6_4_4",
+      "env_variant": "rocwmma",
+      "fa": false,
+      "test": "tg128",
+      "tps_mean": 67.47,
+      "tps_std": 0.0,
+      "error": false,
+      "error_type": null,
+      "backend": "ROCm",
+      "ngl": 99,
+      "mmap": 0,
+      "params_b": 20.91,
+      "file_size_gib": 11.27,
+      "name_params_b": 20.91,
+      "quant": "MXFP4",
+      "log": "results/gpt-oss-20b-mxfp4__rocm6_4_4-rocwmma.log",
+      "build": {
+        "hash": "4807e8f9",
+        "number": "6609"
+      }
+    },
+    {
+      "model": "gpt-oss-20b-mxfp4",
+      "model_clean": "gpt-oss-20b-mxfp4",
+      "env": "rocm6_4_4-rocwmma",
+      "env_base": "rocm6_4_4",
+      "env_variant": "rocwmma",
+      "fa": true,
+      "test": "pp512",
+      "tps_mean": 1520.24,
+      "tps_std": 18.05,
+      "error": false,
+      "error_type": null,
+      "backend": "ROCm",
+      "ngl": 99,
+      "mmap": 0,
+      "params_b": 20.91,
+      "file_size_gib": 11.27,
+      "name_params_b": 20.91,
+      "quant": "MXFP4",
+      "log": "results/gpt-oss-20b-mxfp4__rocm6_4_4-rocwmma__fa1.log",
+      "build": {
+        "hash": "4807e8f9",
+        "number": "6609"
+      }
+    },
+    {
+      "model": "gpt-oss-20b-mxfp4",
+      "model_clean": "gpt-oss-20b-mxfp4",
+      "env": "rocm6_4_4-rocwmma",
+      "env_base": "rocm6_4_4",
+      "env_variant": "rocwmma",
+      "fa": true,
+      "test": "tg128",
+      "tps_mean": 68.08,
+      "tps_std": 0.01,
+      "error": false,
+      "error_type": null,
+      "backend": "ROCm",
+      "ngl": 99,
+      "mmap": 0,
+      "params_b": 20.91,
+      "file_size_gib": 11.27,
+      "name_params_b": 20.91,
+      "quant": "MXFP4",
+      "log": "results/gpt-oss-20b-mxfp4__rocm6_4_4-rocwmma__fa1.log",
+      "build": {
+        "hash": "4807e8f9",
+        "number": "6609"
+      }
+    },
+    {
+      "model": "gpt-oss-20b-mxfp4",
+      "model_clean": "gpt-oss-20b-mxfp4",
+      "env": "rocm6_4_4-rocwmma-hblt0",
+      "env_base": "rocm6_4_4",
+      "env_variant": "rocwmma-hblt0",
+      "fa": false,
+      "test": "pp512",
+      "tps_mean": 1335.36,
+      "tps_std": 7.22,
+      "error": false,
+      "error_type": null,
+      "backend": "ROCm",
+      "ngl": 99,
+      "mmap": 0,
+      "params_b": 20.91,
+      "file_size_gib": 11.27,
+      "name_params_b": 20.91,
+      "quant": "MXFP4",
+      "log": "results/gpt-oss-20b-mxfp4__rocm6_4_4-rocwmma__hblt0.log",
+      "build": {
+        "hash": "4807e8f9",
+        "number": "6609"
+      }
+    },
+    {
+      "model": "gpt-oss-20b-mxfp4",
+      "model_clean": "gpt-oss-20b-mxfp4",
+      "env": "rocm6_4_4-rocwmma-hblt0",
+      "env_base": "rocm6_4_4",
+      "env_variant": "rocwmma-hblt0",
+      "fa": false,
+      "test": "tg128",
+      "tps_mean": 67.28,
+      "tps_std": 0.01,
+      "error": false,
+      "error_type": null,
+      "backend": "ROCm",
+      "ngl": 99,
+      "mmap": 0,
+      "params_b": 20.91,
+      "file_size_gib": 11.27,
+      "name_params_b": 20.91,
+      "quant": "MXFP4",
+      "log": "results/gpt-oss-20b-mxfp4__rocm6_4_4-rocwmma__hblt0.log",
+      "build": {
+        "hash": "4807e8f9",
+        "number": "6609"
+      }
+    },
+    {
+      "model": "gpt-oss-20b-mxfp4",
+      "model_clean": "gpt-oss-20b-mxfp4",
+      "env": "rocm6_4_4-rocwmma-hblt0",
+      "env_base": "rocm6_4_4",
+      "env_variant": "rocwmma-hblt0",
+      "fa": true,
+      "test": "pp512",
+      "tps_mean": 1575.76,
+      "tps_std": 15.77,
+      "error": false,
+      "error_type": null,
+      "backend": "ROCm",
+      "ngl": 99,
+      "mmap": 0,
+      "params_b": 20.91,
+      "file_size_gib": 11.27,
+      "name_params_b": 20.91,
+      "quant": "MXFP4",
+      "log": "results/gpt-oss-20b-mxfp4__rocm6_4_4-rocwmma__hblt0__fa1.log",
+      "build": {
+        "hash": "4807e8f9",
+        "number": "6609"
+      }
+    },
+    {
+      "model": "gpt-oss-20b-mxfp4",
+      "model_clean": "gpt-oss-20b-mxfp4",
+      "env": "rocm6_4_4-rocwmma-hblt0",
+      "env_base": "rocm6_4_4",
+      "env_variant": "rocwmma-hblt0",
+      "fa": true,
+      "test": "tg128",
+      "tps_mean": 68.18,
+      "tps_std": 0.02,
+      "error": false,
+      "error_type": null,
+      "backend": "ROCm",
+      "ngl": 99,
+      "mmap": 0,
+      "params_b": 20.91,
+      "file_size_gib": 11.27,
+      "name_params_b": 20.91,
+      "quant": "MXFP4",
+      "log": "results/gpt-oss-20b-mxfp4__rocm6_4_4-rocwmma__hblt0__fa1.log",
+      "build": {
+        "hash": "4807e8f9",
+        "number": "6609"
+      }
+    },
+    {
+      "model": "gpt-oss-20b-mxfp4",
+      "model_clean": "gpt-oss-20b-mxfp4",
+      "env": "rocm6_4_4",
+      "env_base": "rocm6_4_4",
+      "env_variant": null,
+      "fa": false,
+      "test": "pp512",
+      "tps_mean": 1270.02,
+      "tps_std": 3.61,
+      "error": false,
+      "error_type": null,
+      "backend": "ROCm",
+      "ngl": 99,
+      "mmap": 0,
+      "params_b": 20.91,
+      "file_size_gib": 11.27,
+      "name_params_b": 20.91,
+      "quant": "MXFP4",
+      "log": "results/gpt-oss-20b-mxfp4__rocm6_4_4.log",
+      "build": {
+        "hash": "4807e8f9",
+        "number": "6609"
+      }
+    },
+    {
+      "model": "gpt-oss-20b-mxfp4",
+      "model_clean": "gpt-oss-20b-mxfp4",
+      "env": "rocm6_4_4",
+      "env_base": "rocm6_4_4",
+      "env_variant": null,
+      "fa": false,
+      "test": "tg128",
+      "tps_mean": 67.37,
+      "tps_std": 0.01,
+      "error": false,
+      "error_type": null,
+      "backend": "ROCm",
+      "ngl": 99,
+      "mmap": 0,
+      "params_b": 20.91,
+      "file_size_gib": 11.27,
+      "name_params_b": 20.91,
+      "quant": "MXFP4",
+      "log": "results/gpt-oss-20b-mxfp4__rocm6_4_4.log",
+      "build": {
+        "hash": "4807e8f9",
+        "number": "6609"
+      }
+    },
+    {
+      "model": "gpt-oss-20b-mxfp4",
+      "model_clean": "gpt-oss-20b-mxfp4",
+      "env": "rocm6_4_4",
+      "env_base": "rocm6_4_4",
+      "env_variant": null,
+      "fa": true,
+      "test": "pp512",
+      "tps_mean": 1533.65,
+      "tps_std": 17.58,
+      "error": false,
+      "error_type": null,
+      "backend": "ROCm",
+      "ngl": 99,
+      "mmap": 0,
+      "params_b": 20.91,
+      "file_size_gib": 11.27,
+      "name_params_b": 20.91,
+      "quant": "MXFP4",
+      "log": "results/gpt-oss-20b-mxfp4__rocm6_4_4__fa1.log",
+      "build": {
+        "hash": "4807e8f9",
+        "number": "6609"
+      }
+    },
+    {
+      "model": "gpt-oss-20b-mxfp4",
+      "model_clean": "gpt-oss-20b-mxfp4",
+      "env": "rocm6_4_4",
+      "env_base": "rocm6_4_4",
+      "env_variant": null,
+      "fa": true,
+      "test": "tg128",
+      "tps_mean": 68.13,
+      "tps_std": 0.01,
+      "error": false,
+      "error_type": null,
+      "backend": "ROCm",
+      "ngl": 99,
+      "mmap": 0,
+      "params_b": 20.91,
+      "file_size_gib": 11.27,
+      "name_params_b": 20.91,
+      "quant": "MXFP4",
+      "log": "results/gpt-oss-20b-mxfp4__rocm6_4_4__fa1.log",
+      "build": {
+        "hash": "4807e8f9",
+        "number": "6609"
+      }
+    },
+    {
+      "model": "gpt-oss-20b-mxfp4",
+      "model_clean": "gpt-oss-20b-mxfp4",
+      "env": "rocm6_4_4-hblt0",
+      "env_base": "rocm6_4_4",
+      "env_variant": "hblt0",
+      "fa": false,
+      "test": "pp512",
+      "tps_mean": 1337.89,
+      "tps_std": 14.39,
+      "error": false,
+      "error_type": null,
+      "backend": "ROCm",
+      "ngl": 99,
+      "mmap": 0,
+      "params_b": 20.91,
+      "file_size_gib": 11.27,
+      "name_params_b": 20.91,
+      "quant": "MXFP4",
+      "log": "results/gpt-oss-20b-mxfp4__rocm6_4_4__hblt0.log",
+      "build": {
+        "hash": "4807e8f9",
+        "number": "6609"
+      }
+    },
+    {
+      "model": "gpt-oss-20b-mxfp4",
+      "model_clean": "gpt-oss-20b-mxfp4",
+      "env": "rocm6_4_4-hblt0",
+      "env_base": "rocm6_4_4",
+      "env_variant": "hblt0",
+      "fa": false,
+      "test": "tg128",
+      "tps_mean": 67.39,
+      "tps_std": 0.01,
+      "error": false,
+      "error_type": null,
+      "backend": "ROCm",
+      "ngl": 99,
+      "mmap": 0,
+      "params_b": 20.91,
+      "file_size_gib": 11.27,
+      "name_params_b": 20.91,
+      "quant": "MXFP4",
+      "log": "results/gpt-oss-20b-mxfp4__rocm6_4_4__hblt0.log",
+      "build": {
+        "hash": "4807e8f9",
+        "number": "6609"
+      }
+    },
+    {
+      "model": "gpt-oss-20b-mxfp4",
+      "model_clean": "gpt-oss-20b-mxfp4",
+      "env": "rocm6_4_4-hblt0",
+      "env_base": "rocm6_4_4",
+      "env_variant": "hblt0",
+      "fa": true,
+      "test": "pp512",
+      "tps_mean": 1587.21,
+      "tps_std": 12.01,
+      "error": false,
+      "error_type": null,
+      "backend": "ROCm",
+      "ngl": 99,
+      "mmap": 0,
+      "params_b": 20.91,
+      "file_size_gib": 11.27,
+      "name_params_b": 20.91,
+      "quant": "MXFP4",
+      "log": "results/gpt-oss-20b-mxfp4__rocm6_4_4__hblt0__fa1.log",
+      "build": {
+        "hash": "4807e8f9",
+        "number": "6609"
+      }
+    },
+    {
+      "model": "gpt-oss-20b-mxfp4",
+      "model_clean": "gpt-oss-20b-mxfp4",
+      "env": "rocm6_4_4-hblt0",
+      "env_base": "rocm6_4_4",
+      "env_variant": "hblt0",
+      "fa": true,
+      "test": "tg128",
+      "tps_mean": 68.25,
+      "tps_std": 0.01,
+      "error": false,
+      "error_type": null,
+      "backend": "ROCm",
+      "ngl": 99,
+      "mmap": 0,
+      "params_b": 20.91,
+      "file_size_gib": 11.27,
+      "name_params_b": 20.91,
+      "quant": "MXFP4",
+      "log": "results/gpt-oss-20b-mxfp4__rocm6_4_4__hblt0__fa1.log",
+      "build": {
+        "hash": "4807e8f9",
+        "number": "6609"
+      }
+    },
     {
       "model": "gpt-oss-20b-mxfp4",
       "model_clean": "gpt-oss-20b-mxfp4",
@@ -18757,6 +25165,406 @@
         "number": "6486"
       }
     },
+    {
+      "model": "llama-2-7b.Q4_0",
+      "model_clean": "llama-2-7b.Q4_0",
+      "env": "rocm6_4_4-rocwmma",
+      "env_base": "rocm6_4_4",
+      "env_variant": "rocwmma",
+      "fa": false,
+      "test": "pp512",
+      "tps_mean": 979.59,
+      "tps_std": 0.72,
+      "error": false,
+      "error_type": null,
+      "backend": "ROCm",
+      "ngl": 99,
+      "mmap": 0,
+      "params_b": 6.74,
+      "file_size_gib": 3.56,
+      "name_params_b": 6.74,
+      "quant": "Q4_0",
+      "log": "results/llama-2-7b.Q4_0__rocm6_4_4-rocwmma.log",
+      "build": {
+        "hash": "4807e8f9",
+        "number": "6609"
+      }
+    },
+    {
+      "model": "llama-2-7b.Q4_0",
+      "model_clean": "llama-2-7b.Q4_0",
+      "env": "rocm6_4_4-rocwmma",
+      "env_base": "rocm6_4_4",
+      "env_variant": "rocwmma",
+      "fa": false,
+      "test": "tg128",
+      "tps_mean": 49.85,
+      "tps_std": 0.01,
+      "error": false,
+      "error_type": null,
+      "backend": "ROCm",
+      "ngl": 99,
+      "mmap": 0,
+      "params_b": 6.74,
+      "file_size_gib": 3.56,
+      "name_params_b": 6.74,
+      "quant": "Q4_0",
+      "log": "results/llama-2-7b.Q4_0__rocm6_4_4-rocwmma.log",
+      "build": {
+        "hash": "4807e8f9",
+        "number": "6609"
+      }
+    },
+    {
+      "model": "llama-2-7b.Q4_0",
+      "model_clean": "llama-2-7b.Q4_0",
+      "env": "rocm6_4_4-rocwmma",
+      "env_base": "rocm6_4_4",
+      "env_variant": "rocwmma",
+      "fa": true,
+      "test": "pp512",
+      "tps_mean": 1098.0,
+      "tps_std": 4.05,
+      "error": false,
+      "error_type": null,
+      "backend": "ROCm",
+      "ngl": 99,
+      "mmap": 0,
+      "params_b": 6.74,
+      "file_size_gib": 3.56,
+      "name_params_b": 6.74,
+      "quant": "Q4_0",
+      "log": "results/llama-2-7b.Q4_0__rocm6_4_4-rocwmma__fa1.log",
+      "build": {
+        "hash": "4807e8f9",
+        "number": "6609"
+      }
+    },
+    {
+      "model": "llama-2-7b.Q4_0",
+      "model_clean": "llama-2-7b.Q4_0",
+      "env": "rocm6_4_4-rocwmma",
+      "env_base": "rocm6_4_4",
+      "env_variant": "rocwmma",
+      "fa": true,
+      "test": "tg128",
+      "tps_mean": 49.4,
+      "tps_std": 0.01,
+      "error": false,
+      "error_type": null,
+      "backend": "ROCm",
+      "ngl": 99,
+      "mmap": 0,
+      "params_b": 6.74,
+      "file_size_gib": 3.56,
+      "name_params_b": 6.74,
+      "quant": "Q4_0",
+      "log": "results/llama-2-7b.Q4_0__rocm6_4_4-rocwmma__fa1.log",
+      "build": {
+        "hash": "4807e8f9",
+        "number": "6609"
+      }
+    },
+    {
+      "model": "llama-2-7b.Q4_0",
+      "model_clean": "llama-2-7b.Q4_0",
+      "env": "rocm6_4_4-rocwmma-hblt0",
+      "env_base": "rocm6_4_4",
+      "env_variant": "rocwmma-hblt0",
+      "fa": false,
+      "test": "pp512",
+      "tps_mean": 899.84,
+      "tps_std": 2.29,
+      "error": false,
+      "error_type": null,
+      "backend": "ROCm",
+      "ngl": 99,
+      "mmap": 0,
+      "params_b": 6.74,
+      "file_size_gib": 3.56,
+      "name_params_b": 6.74,
+      "quant": "Q4_0",
+      "log": "results/llama-2-7b.Q4_0__rocm6_4_4-rocwmma__hblt0.log",
+      "build": {
+        "hash": "4807e8f9",
+        "number": "6609"
+      }
+    },
+    {
+      "model": "llama-2-7b.Q4_0",
+      "model_clean": "llama-2-7b.Q4_0",
+      "env": "rocm6_4_4-rocwmma-hblt0",
+      "env_base": "rocm6_4_4",
+      "env_variant": "rocwmma-hblt0",
+      "fa": false,
+      "test": "tg128",
+      "tps_mean": 49.81,
+      "tps_std": 0.01,
+      "error": false,
+      "error_type": null,
+      "backend": "ROCm",
+      "ngl": 99,
+      "mmap": 0,
+      "params_b": 6.74,
+      "file_size_gib": 3.56,
+      "name_params_b": 6.74,
+      "quant": "Q4_0",
+      "log": "results/llama-2-7b.Q4_0__rocm6_4_4-rocwmma__hblt0.log",
+      "build": {
+        "hash": "4807e8f9",
+        "number": "6609"
+      }
+    },
+    {
+      "model": "llama-2-7b.Q4_0",
+      "model_clean": "llama-2-7b.Q4_0",
+      "env": "rocm6_4_4-rocwmma-hblt0",
+      "env_base": "rocm6_4_4",
+      "env_variant": "rocwmma-hblt0",
+      "fa": true,
+      "test": "pp512",
+      "tps_mean": 1005.78,
+      "tps_std": 1.42,
+      "error": false,
+      "error_type": null,
+      "backend": "ROCm",
+      "ngl": 99,
+      "mmap": 0,
+      "params_b": 6.74,
+      "file_size_gib": 3.56,
+      "name_params_b": 6.74,
+      "quant": "Q4_0",
+      "log": "results/llama-2-7b.Q4_0__rocm6_4_4-rocwmma__hblt0__fa1.log",
+      "build": {
+        "hash": "4807e8f9",
+        "number": "6609"
+      }
+    },
+    {
+      "model": "llama-2-7b.Q4_0",
+      "model_clean": "llama-2-7b.Q4_0",
+      "env": "rocm6_4_4-rocwmma-hblt0",
+      "env_base": "rocm6_4_4",
+      "env_variant": "rocwmma-hblt0",
+      "fa": true,
+      "test": "tg128",
+      "tps_mean": 49.37,
+      "tps_std": 0.01,
+      "error": false,
+      "error_type": null,
+      "backend": "ROCm",
+      "ngl": 99,
+      "mmap": 0,
+      "params_b": 6.74,
+      "file_size_gib": 3.56,
+      "name_params_b": 6.74,
+      "quant": "Q4_0",
+      "log": "results/llama-2-7b.Q4_0__rocm6_4_4-rocwmma__hblt0__fa1.log",
+      "build": {
+        "hash": "4807e8f9",
+        "number": "6609"
+      }
+    },
+    {
+      "model": "llama-2-7b.Q4_0",
+      "model_clean": "llama-2-7b.Q4_0",
+      "env": "rocm6_4_4",
+      "env_base": "rocm6_4_4",
+      "env_variant": null,
+      "fa": false,
+      "test": "pp512",
+      "tps_mean": 979.86,
+      "tps_std": 1.66,
+      "error": false,
+      "error_type": null,
+      "backend": "ROCm",
+      "ngl": 99,
+      "mmap": 0,
+      "params_b": 6.74,
+      "file_size_gib": 3.56,
+      "name_params_b": 6.74,
+      "quant": "Q4_0",
+      "log": "results/llama-2-7b.Q4_0__rocm6_4_4.log",
+      "build": {
+        "hash": "4807e8f9",
+        "number": "6609"
+      }
+    },
+    {
+      "model": "llama-2-7b.Q4_0",
+      "model_clean": "llama-2-7b.Q4_0",
+      "env": "rocm6_4_4",
+      "env_base": "rocm6_4_4",
+      "env_variant": null,
+      "fa": false,
+      "test": "tg128",
+      "tps_mean": 49.87,
+      "tps_std": 0.01,
+      "error": false,
+      "error_type": null,
+      "backend": "ROCm",
+      "ngl": 99,
+      "mmap": 0,
+      "params_b": 6.74,
+      "file_size_gib": 3.56,
+      "name_params_b": 6.74,
+      "quant": "Q4_0",
+      "log": "results/llama-2-7b.Q4_0__rocm6_4_4.log",
+      "build": {
+        "hash": "4807e8f9",
+        "number": "6609"
+      }
+    },
+    {
+      "model": "llama-2-7b.Q4_0",
+      "model_clean": "llama-2-7b.Q4_0",
+      "env": "rocm6_4_4",
+      "env_base": "rocm6_4_4",
+      "env_variant": null,
+      "fa": true,
+      "test": "pp512",
+      "tps_mean": 1117.04,
+      "tps_std": 3.47,
+      "error": false,
+      "error_type": null,
+      "backend": "ROCm",
+      "ngl": 99,
+      "mmap": 0,
+      "params_b": 6.74,
+      "file_size_gib": 3.56,
+      "name_params_b": 6.74,
+      "quant": "Q4_0",
+      "log": "results/llama-2-7b.Q4_0__rocm6_4_4__fa1.log",
+      "build": {
+        "hash": "4807e8f9",
+        "number": "6609"
+      }
+    },
+    {
+      "model": "llama-2-7b.Q4_0",
+      "model_clean": "llama-2-7b.Q4_0",
+      "env": "rocm6_4_4",
+      "env_base": "rocm6_4_4",
+      "env_variant": null,
+      "fa": true,
+      "test": "tg128",
+      "tps_mean": 49.38,
+      "tps_std": 0.01,
+      "error": false,
+      "error_type": null,
+      "backend": "ROCm",
+      "ngl": 99,
+      "mmap": 0,
+      "params_b": 6.74,
+      "file_size_gib": 3.56,
+      "name_params_b": 6.74,
+      "quant": "Q4_0",
+      "log": "results/llama-2-7b.Q4_0__rocm6_4_4__fa1.log",
+      "build": {
+        "hash": "4807e8f9",
+        "number": "6609"
+      }
+    },
+    {
+      "model": "llama-2-7b.Q4_0",
+      "model_clean": "llama-2-7b.Q4_0",
+      "env": "rocm6_4_4-hblt0",
+      "env_base": "rocm6_4_4",
+      "env_variant": "hblt0",
+      "fa": false,
+      "test": "pp512",
+      "tps_mean": 895.65,
+      "tps_std": 0.66,
+      "error": false,
+      "error_type": null,
+      "backend": "ROCm",
+      "ngl": 99,
+      "mmap": 0,
+      "params_b": 6.74,
+      "file_size_gib": 3.56,
+      "name_params_b": 6.74,
+      "quant": "Q4_0",
+      "log": "results/llama-2-7b.Q4_0__rocm6_4_4__hblt0.log",
+      "build": {
+        "hash": "4807e8f9",
+        "number": "6609"
+      }
+    },
+    {
+      "model": "llama-2-7b.Q4_0",
+      "model_clean": "llama-2-7b.Q4_0",
+      "env": "rocm6_4_4-hblt0",
+      "env_base": "rocm6_4_4",
+      "env_variant": "hblt0",
+      "fa": false,
+      "test": "tg128",
+      "tps_mean": 49.89,
+      "tps_std": 0.0,
+      "error": false,
+      "error_type": null,
+      "backend": "ROCm",
+      "ngl": 99,
+      "mmap": 0,
+      "params_b": 6.74,
+      "file_size_gib": 3.56,
+      "name_params_b": 6.74,
+      "quant": "Q4_0",
+      "log": "results/llama-2-7b.Q4_0__rocm6_4_4__hblt0.log",
+      "build": {
+        "hash": "4807e8f9",
+        "number": "6609"
+      }
+    },
+    {
+      "model": "llama-2-7b.Q4_0",
+      "model_clean": "llama-2-7b.Q4_0",
+      "env": "rocm6_4_4-hblt0",
+      "env_base": "rocm6_4_4",
+      "env_variant": "hblt0",
+      "fa": true,
+      "test": "pp512",
+      "tps_mean": 1020.22,
+      "tps_std": 1.63,
+      "error": false,
+      "error_type": null,
+      "backend": "ROCm",
+      "ngl": 99,
+      "mmap": 0,
+      "params_b": 6.74,
+      "file_size_gib": 3.56,
+      "name_params_b": 6.74,
+      "quant": "Q4_0",
+      "log": "results/llama-2-7b.Q4_0__rocm6_4_4__hblt0__fa1.log",
+      "build": {
+        "hash": "4807e8f9",
+        "number": "6609"
+      }
+    },
+    {
+      "model": "llama-2-7b.Q4_0",
+      "model_clean": "llama-2-7b.Q4_0",
+      "env": "rocm6_4_4-hblt0",
+      "env_base": "rocm6_4_4",
+      "env_variant": "hblt0",
+      "fa": true,
+      "test": "tg128",
+      "tps_mean": 49.36,
+      "tps_std": 0.01,
+      "error": false,
+      "error_type": null,
+      "backend": "ROCm",
+      "ngl": 99,
+      "mmap": 0,
+      "params_b": 6.74,
+      "file_size_gib": 3.56,
+      "name_params_b": 6.74,
+      "quant": "Q4_0",
+      "log": "results/llama-2-7b.Q4_0__rocm6_4_4__hblt0__fa1.log",
+      "build": {
+        "hash": "4807e8f9",
+        "number": "6609"
+      }
+    },
     {
       "model": "llama-2-7b.Q4_0",
       "model_clean": "llama-2-7b.Q4_0",