Updated benchmarks
@@ -9,7 +9,7 @@ rocBLAS warning: hipBlasLT failed, falling back to tensile.
|
|||||||
This message will be only be displayed once, unless the ROCBLAS_VERBOSE_TENSILE_ERROR environment variable is set.
|
This message will be only be displayed once, unless the ROCBLAS_VERBOSE_TENSILE_ERROR environment variable is set.
|
||||||
| model | size | params | backend | ngl | mmap | test | t/s |
|
| model | size | params | backend | ngl | mmap | test | t/s |
|
||||||
| ------------------------------ | ---------: | ---------: | ---------- | --: | ---: | --------------: | -------------------: |
|
| ------------------------------ | ---------: | ---------: | ---------- | --: | ---: | --------------: | -------------------: |
|
||||||
| glm4moe 106B.A12B Q4_K - Medium | 68.01 GiB | 110.47 B | ROCm | 99 | 0 | pp512 | 126.62 ± 0.10 |
|
| glm4moe 106B.A12B Q4_K - Medium | 68.01 GiB | 110.47 B | ROCm | 99 | 0 | pp512 | 125.93 ± 0.26 |
|
||||||
| glm4moe 106B.A12B Q4_K - Medium | 68.01 GiB | 110.47 B | ROCm | 99 | 0 | tg128 | 19.95 ± 0.02 |
|
| glm4moe 106B.A12B Q4_K - Medium | 68.01 GiB | 110.47 B | ROCm | 99 | 0 | tg128 | 20.52 ± 0.00 |
|
||||||
|
|
||||||
build: 1fe00296 (6182)
|
build: f1fbffb5 (6486)
|
||||||
|
|||||||
@@ -4,7 +4,7 @@ ggml_cuda_init: found 1 ROCm devices:
|
|||||||
Device 0: Radeon 8060S Graphics, gfx1151 (0x1151), VMM: no, Wave Size: 32
|
Device 0: Radeon 8060S Graphics, gfx1151 (0x1151), VMM: no, Wave Size: 32
|
||||||
| model | size | params | backend | ngl | fa | mmap | test | t/s |
|
| model | size | params | backend | ngl | fa | mmap | test | t/s |
|
||||||
| ------------------------------ | ---------: | ---------: | ---------- | --: | -: | ---: | --------------: | -------------------: |
|
| ------------------------------ | ---------: | ---------: | ---------- | --: | -: | ---: | --------------: | -------------------: |
|
||||||
| glm4moe 106B.A12B Q4_K - Medium | 68.01 GiB | 110.47 B | ROCm | 99 | 1 | 0 | pp512 | 135.10 ± 0.35 |
|
| glm4moe 106B.A12B Q4_K - Medium | 68.01 GiB | 110.47 B | ROCm | 99 | 1 | 0 | pp512 | 135.40 ± 0.23 |
|
||||||
| glm4moe 106B.A12B Q4_K - Medium | 68.01 GiB | 110.47 B | ROCm | 99 | 1 | 0 | tg128 | 20.14 ± 0.01 |
|
| glm4moe 106B.A12B Q4_K - Medium | 68.01 GiB | 110.47 B | ROCm | 99 | 1 | 0 | tg128 | 20.69 ± 0.00 |
|
||||||
|
|
||||||
build: 1fe00296 (6182)
|
build: f1fbffb5 (6486)
|
||||||
|
|||||||
+3
-3
@@ -4,7 +4,7 @@ ggml_cuda_init: found 1 ROCm devices:
|
|||||||
Device 0: Radeon 8060S Graphics, gfx1151 (0x1151), VMM: no, Wave Size: 32
|
Device 0: Radeon 8060S Graphics, gfx1151 (0x1151), VMM: no, Wave Size: 32
|
||||||
| model | size | params | backend | ngl | mmap | test | t/s |
|
| model | size | params | backend | ngl | mmap | test | t/s |
|
||||||
| ------------------------------ | ---------: | ---------: | ---------- | --: | ---: | --------------: | -------------------: |
|
| ------------------------------ | ---------: | ---------: | ---------- | --: | ---: | --------------: | -------------------: |
|
||||||
| glm4moe 106B.A12B Q4_K - Medium | 68.01 GiB | 110.47 B | ROCm | 99 | 0 | pp512 | 130.99 ± 0.36 |
|
| glm4moe 106B.A12B Q4_K - Medium | 68.01 GiB | 110.47 B | ROCm | 99 | 0 | pp512 | 132.28 ± 0.14 |
|
||||||
| glm4moe 106B.A12B Q4_K - Medium | 68.01 GiB | 110.47 B | ROCm | 99 | 0 | tg128 | 20.14 ± 0.01 |
|
| glm4moe 106B.A12B Q4_K - Medium | 68.01 GiB | 110.47 B | ROCm | 99 | 0 | tg128 | 20.50 ± 0.00 |
|
||||||
|
|
||||||
build: 1fe00296 (6182)
|
build: f1fbffb5 (6486)
|
||||||
|
|||||||
+3
-3
@@ -4,7 +4,7 @@ ggml_cuda_init: found 1 ROCm devices:
|
|||||||
Device 0: Radeon 8060S Graphics, gfx1151 (0x1151), VMM: no, Wave Size: 32
|
Device 0: Radeon 8060S Graphics, gfx1151 (0x1151), VMM: no, Wave Size: 32
|
||||||
| model | size | params | backend | ngl | fa | mmap | test | t/s |
|
| model | size | params | backend | ngl | fa | mmap | test | t/s |
|
||||||
| ------------------------------ | ---------: | ---------: | ---------- | --: | -: | ---: | --------------: | -------------------: |
|
| ------------------------------ | ---------: | ---------: | ---------- | --: | -: | ---: | --------------: | -------------------: |
|
||||||
| glm4moe 106B.A12B Q4_K - Medium | 68.01 GiB | 110.47 B | ROCm | 99 | 1 | 0 | pp512 | 140.15 ± 0.41 |
|
| glm4moe 106B.A12B Q4_K - Medium | 68.01 GiB | 110.47 B | ROCm | 99 | 1 | 0 | pp512 | 139.86 ± 0.32 |
|
||||||
| glm4moe 106B.A12B Q4_K - Medium | 68.01 GiB | 110.47 B | ROCm | 99 | 1 | 0 | tg128 | 20.15 ± 0.01 |
|
| glm4moe 106B.A12B Q4_K - Medium | 68.01 GiB | 110.47 B | ROCm | 99 | 1 | 0 | tg128 | 20.70 ± 0.00 |
|
||||||
|
|
||||||
build: 1fe00296 (6182)
|
build: f1fbffb5 (6486)
|
||||||
|
|||||||
@@ -9,7 +9,7 @@ rocBLAS warning: hipBlasLT failed, falling back to tensile.
|
|||||||
This message will be only be displayed once, unless the ROCBLAS_VERBOSE_TENSILE_ERROR environment variable is set.
|
This message will be only be displayed once, unless the ROCBLAS_VERBOSE_TENSILE_ERROR environment variable is set.
|
||||||
| model | size | params | backend | ngl | mmap | test | t/s |
|
| model | size | params | backend | ngl | mmap | test | t/s |
|
||||||
| ------------------------------ | ---------: | ---------: | ---------- | --: | ---: | --------------: | -------------------: |
|
| ------------------------------ | ---------: | ---------: | ---------- | --: | ---: | --------------: | -------------------: |
|
||||||
| glm4moe 106B.A12B Q4_K - Medium | 68.01 GiB | 110.47 B | ROCm | 99 | 0 | pp512 | 126.66 ± 0.22 |
|
| glm4moe 106B.A12B Q4_K - Medium | 68.01 GiB | 110.47 B | ROCm | 99 | 0 | pp512 | 125.92 ± 0.27 |
|
||||||
| glm4moe 106B.A12B Q4_K - Medium | 68.01 GiB | 110.47 B | ROCm | 99 | 0 | tg128 | 20.14 ± 0.00 |
|
| glm4moe 106B.A12B Q4_K - Medium | 68.01 GiB | 110.47 B | ROCm | 99 | 0 | tg128 | 20.52 ± 0.01 |
|
||||||
|
|
||||||
build: 1fe00296 (6182)
|
build: f1fbffb5 (6486)
|
||||||
|
|||||||
@@ -4,7 +4,7 @@ ggml_cuda_init: found 1 ROCm devices:
|
|||||||
Device 0: Radeon 8060S Graphics, gfx1151 (0x1151), VMM: no, Wave Size: 32
|
Device 0: Radeon 8060S Graphics, gfx1151 (0x1151), VMM: no, Wave Size: 32
|
||||||
| model | size | params | backend | ngl | fa | mmap | test | t/s |
|
| model | size | params | backend | ngl | fa | mmap | test | t/s |
|
||||||
| ------------------------------ | ---------: | ---------: | ---------- | --: | -: | ---: | --------------: | -------------------: |
|
| ------------------------------ | ---------: | ---------: | ---------- | --: | -: | ---: | --------------: | -------------------: |
|
||||||
| glm4moe 106B.A12B Q4_K - Medium | 68.01 GiB | 110.47 B | ROCm | 99 | 1 | 0 | pp512 | 100.20 ± 0.13 |
|
| glm4moe 106B.A12B Q4_K - Medium | 68.01 GiB | 110.47 B | ROCm | 99 | 1 | 0 | pp512 | 134.12 ± 0.59 |
|
||||||
| glm4moe 106B.A12B Q4_K - Medium | 68.01 GiB | 110.47 B | ROCm | 99 | 1 | 0 | tg128 | 20.30 ± 0.01 |
|
| glm4moe 106B.A12B Q4_K - Medium | 68.01 GiB | 110.47 B | ROCm | 99 | 1 | 0 | tg128 | 20.66 ± 0.00 |
|
||||||
|
|
||||||
build: 1fe00296 (6182)
|
build: f1fbffb5 (6486)
|
||||||
|
|||||||
@@ -2,5 +2,9 @@ ggml_cuda_init: GGML_CUDA_FORCE_MMQ: no
|
|||||||
ggml_cuda_init: GGML_CUDA_FORCE_CUBLAS: no
|
ggml_cuda_init: GGML_CUDA_FORCE_CUBLAS: no
|
||||||
ggml_cuda_init: found 1 ROCm devices:
|
ggml_cuda_init: found 1 ROCm devices:
|
||||||
Device 0: Radeon 8060S Graphics, gfx1151 (0x1151), VMM: no, Wave Size: 32
|
Device 0: Radeon 8060S Graphics, gfx1151 (0x1151), VMM: no, Wave Size: 32
|
||||||
HW Exception by GPU node-1 (Agent handle: 0x2624d340) reason :GPU Hang
|
| model | size | params | backend | ngl | mmap | test | t/s |
|
||||||
✖ ! [rocm6_4_3] GLM-4.5-Air-UD-Q4_K_XL-00001-of-00002 __hblt0 failed (exit 134)
|
| ------------------------------ | ---------: | ---------: | ---------- | --: | ---: | --------------: | -------------------: |
|
||||||
|
| glm4moe 106B.A12B Q4_K - Medium | 68.01 GiB | 110.47 B | ROCm | 99 | 0 | pp512 | 131.45 ± 0.35 |
|
||||||
|
| glm4moe 106B.A12B Q4_K - Medium | 68.01 GiB | 110.47 B | ROCm | 99 | 0 | tg128 | 20.53 ± 0.00 |
|
||||||
|
|
||||||
|
build: f1fbffb5 (6486)
|
||||||
|
|||||||
@@ -2,5 +2,9 @@ ggml_cuda_init: GGML_CUDA_FORCE_MMQ: no
|
|||||||
ggml_cuda_init: GGML_CUDA_FORCE_CUBLAS: no
|
ggml_cuda_init: GGML_CUDA_FORCE_CUBLAS: no
|
||||||
ggml_cuda_init: found 1 ROCm devices:
|
ggml_cuda_init: found 1 ROCm devices:
|
||||||
Device 0: Radeon 8060S Graphics, gfx1151 (0x1151), VMM: no, Wave Size: 32
|
Device 0: Radeon 8060S Graphics, gfx1151 (0x1151), VMM: no, Wave Size: 32
|
||||||
Memory access fault by GPU node-1 (Agent handle: 0x37c5d340) on address 0x7f2e3516f000. Reason: Page not present or supervisor privilege.
|
| model | size | params | backend | ngl | fa | mmap | test | t/s |
|
||||||
✖ ! [rocm6_4_3] GLM-4.5-Air-UD-Q4_K_XL-00001-of-00002 __hblt0__fa1 failed (exit 134)
|
| ------------------------------ | ---------: | ---------: | ---------- | --: | -: | ---: | --------------: | -------------------: |
|
||||||
|
| glm4moe 106B.A12B Q4_K - Medium | 68.01 GiB | 110.47 B | ROCm | 99 | 1 | 0 | pp512 | 140.67 ± 0.26 |
|
||||||
|
| glm4moe 106B.A12B Q4_K - Medium | 68.01 GiB | 110.47 B | ROCm | 99 | 1 | 0 | tg128 | 20.67 ± 0.00 |
|
||||||
|
|
||||||
|
build: f1fbffb5 (6486)
|
||||||
|
|||||||
+10
@@ -0,0 +1,10 @@
|
|||||||
|
ggml_cuda_init: GGML_CUDA_FORCE_MMQ: no
|
||||||
|
ggml_cuda_init: GGML_CUDA_FORCE_CUBLAS: no
|
||||||
|
ggml_cuda_init: found 1 ROCm devices:
|
||||||
|
Device 0: AMD Radeon Graphics, gfx1151 (0x1151), VMM: no, Wave Size: 32
|
||||||
|
| model | size | params | backend | ngl | mmap | test | t/s |
|
||||||
|
| ------------------------------ | ---------: | ---------: | ---------- | --: | ---: | --------------: | -------------------: |
|
||||||
|
| glm4moe 106B.A12B Q4_K - Medium | 68.01 GiB | 110.47 B | ROCm | 99 | 0 | pp512 | 94.56 ± 0.11 |
|
||||||
|
| glm4moe 106B.A12B Q4_K - Medium | 68.01 GiB | 110.47 B | ROCm | 99 | 0 | tg128 | 19.90 ± 0.00 |
|
||||||
|
|
||||||
|
build: f1fbffb5 (6486)
|
||||||
+10
@@ -0,0 +1,10 @@
|
|||||||
|
ggml_cuda_init: GGML_CUDA_FORCE_MMQ: no
|
||||||
|
ggml_cuda_init: GGML_CUDA_FORCE_CUBLAS: no
|
||||||
|
ggml_cuda_init: found 1 ROCm devices:
|
||||||
|
Device 0: AMD Radeon Graphics, gfx1151 (0x1151), VMM: no, Wave Size: 32
|
||||||
|
| model | size | params | backend | ngl | fa | mmap | test | t/s |
|
||||||
|
| ------------------------------ | ---------: | ---------: | ---------- | --: | -: | ---: | --------------: | -------------------: |
|
||||||
|
| glm4moe 106B.A12B Q4_K - Medium | 68.01 GiB | 110.47 B | ROCm | 99 | 1 | 0 | pp512 | 127.25 ± 0.57 |
|
||||||
|
| glm4moe 106B.A12B Q4_K - Medium | 68.01 GiB | 110.47 B | ROCm | 99 | 1 | 0 | tg128 | 20.66 ± 0.00 |
|
||||||
|
|
||||||
|
build: f1fbffb5 (6486)
|
||||||
+10
@@ -0,0 +1,10 @@
|
|||||||
|
ggml_cuda_init: GGML_CUDA_FORCE_MMQ: no
|
||||||
|
ggml_cuda_init: GGML_CUDA_FORCE_CUBLAS: no
|
||||||
|
ggml_cuda_init: found 1 ROCm devices:
|
||||||
|
Device 0: AMD Radeon Graphics, gfx1151 (0x1151), VMM: no, Wave Size: 32
|
||||||
|
| model | size | params | backend | ngl | mmap | test | t/s |
|
||||||
|
| ------------------------------ | ---------: | ---------: | ---------- | --: | ---: | --------------: | -------------------: |
|
||||||
|
| glm4moe 106B.A12B Q4_K - Medium | 68.01 GiB | 110.47 B | ROCm | 99 | 0 | pp512 | 128.69 ± 0.57 |
|
||||||
|
| glm4moe 106B.A12B Q4_K - Medium | 68.01 GiB | 110.47 B | ROCm | 99 | 0 | tg128 | 20.56 ± 0.00 |
|
||||||
|
|
||||||
|
build: f1fbffb5 (6486)
|
||||||
+10
@@ -0,0 +1,10 @@
|
|||||||
|
ggml_cuda_init: GGML_CUDA_FORCE_MMQ: no
|
||||||
|
ggml_cuda_init: GGML_CUDA_FORCE_CUBLAS: no
|
||||||
|
ggml_cuda_init: found 1 ROCm devices:
|
||||||
|
Device 0: AMD Radeon Graphics, gfx1151 (0x1151), VMM: no, Wave Size: 32
|
||||||
|
| model | size | params | backend | ngl | fa | mmap | test | t/s |
|
||||||
|
| ------------------------------ | ---------: | ---------: | ---------- | --: | -: | ---: | --------------: | -------------------: |
|
||||||
|
| glm4moe 106B.A12B Q4_K - Medium | 68.01 GiB | 110.47 B | ROCm | 99 | 1 | 0 | pp512 | 169.19 ± 0.12 |
|
||||||
|
| glm4moe 106B.A12B Q4_K - Medium | 68.01 GiB | 110.47 B | ROCm | 99 | 1 | 0 | tg128 | 20.67 ± 0.00 |
|
||||||
|
|
||||||
|
build: f1fbffb5 (6486)
|
||||||
@@ -4,7 +4,7 @@ ggml_cuda_init: found 1 ROCm devices:
|
|||||||
Device 0: AMD Radeon Graphics, gfx1151 (0x1151), VMM: no, Wave Size: 32
|
Device 0: AMD Radeon Graphics, gfx1151 (0x1151), VMM: no, Wave Size: 32
|
||||||
| model | size | params | backend | ngl | mmap | test | t/s |
|
| model | size | params | backend | ngl | mmap | test | t/s |
|
||||||
| ------------------------------ | ---------: | ---------: | ---------- | --: | ---: | --------------: | -------------------: |
|
| ------------------------------ | ---------: | ---------: | ---------- | --: | ---: | --------------: | -------------------: |
|
||||||
| glm4moe 106B.A12B Q4_K - Medium | 68.01 GiB | 110.47 B | ROCm | 99 | 0 | pp512 | 117.48 ± 0.53 |
|
| glm4moe 106B.A12B Q4_K - Medium | 68.01 GiB | 110.47 B | ROCm | 99 | 0 | pp512 | 94.71 ± 0.12 |
|
||||||
| glm4moe 106B.A12B Q4_K - Medium | 68.01 GiB | 110.47 B | ROCm | 99 | 0 | tg128 | 20.11 ± 0.00 |
|
| glm4moe 106B.A12B Q4_K - Medium | 68.01 GiB | 110.47 B | ROCm | 99 | 0 | tg128 | 20.53 ± 0.00 |
|
||||||
|
|
||||||
build: de219279 (6181)
|
build: f1fbffb5 (6486)
|
||||||
|
|||||||
@@ -4,7 +4,7 @@ ggml_cuda_init: found 1 ROCm devices:
|
|||||||
Device 0: AMD Radeon Graphics, gfx1151 (0x1151), VMM: no, Wave Size: 32
|
Device 0: AMD Radeon Graphics, gfx1151 (0x1151), VMM: no, Wave Size: 32
|
||||||
| model | size | params | backend | ngl | fa | mmap | test | t/s |
|
| model | size | params | backend | ngl | fa | mmap | test | t/s |
|
||||||
| ------------------------------ | ---------: | ---------: | ---------- | --: | -: | ---: | --------------: | -------------------: |
|
| ------------------------------ | ---------: | ---------: | ---------- | --: | -: | ---: | --------------: | -------------------: |
|
||||||
| glm4moe 106B.A12B Q4_K - Medium | 68.01 GiB | 110.47 B | ROCm | 99 | 1 | 0 | pp512 | 126.27 ± 0.47 |
|
| glm4moe 106B.A12B Q4_K - Medium | 68.01 GiB | 110.47 B | ROCm | 99 | 1 | 0 | pp512 | 126.97 ± 0.54 |
|
||||||
| glm4moe 106B.A12B Q4_K - Medium | 68.01 GiB | 110.47 B | ROCm | 99 | 1 | 0 | tg128 | 19.86 ± 0.00 |
|
| glm4moe 106B.A12B Q4_K - Medium | 68.01 GiB | 110.47 B | ROCm | 99 | 1 | 0 | tg128 | 20.70 ± 0.00 |
|
||||||
|
|
||||||
build: de219279 (6181)
|
build: f1fbffb5 (6486)
|
||||||
|
|||||||
+3
-3
@@ -4,7 +4,7 @@ ggml_cuda_init: found 1 ROCm devices:
|
|||||||
Device 0: AMD Radeon Graphics, gfx1151 (0x1151), VMM: no, Wave Size: 32
|
Device 0: AMD Radeon Graphics, gfx1151 (0x1151), VMM: no, Wave Size: 32
|
||||||
| model | size | params | backend | ngl | mmap | test | t/s |
|
| model | size | params | backend | ngl | mmap | test | t/s |
|
||||||
| ------------------------------ | ---------: | ---------: | ---------- | --: | ---: | --------------: | -------------------: |
|
| ------------------------------ | ---------: | ---------: | ---------- | --: | ---: | --------------: | -------------------: |
|
||||||
| glm4moe 106B.A12B Q4_K - Medium | 68.01 GiB | 110.47 B | ROCm | 99 | 0 | pp512 | 158.54 ± 0.42 |
|
| glm4moe 106B.A12B Q4_K - Medium | 68.01 GiB | 110.47 B | ROCm | 99 | 0 | pp512 | 160.39 ± 0.34 |
|
||||||
| glm4moe 106B.A12B Q4_K - Medium | 68.01 GiB | 110.47 B | ROCm | 99 | 0 | tg128 | 20.11 ± 0.00 |
|
| glm4moe 106B.A12B Q4_K - Medium | 68.01 GiB | 110.47 B | ROCm | 99 | 0 | tg128 | 20.56 ± 0.00 |
|
||||||
|
|
||||||
build: de219279 (6181)
|
build: f1fbffb5 (6486)
|
||||||
|
|||||||
+3
-3
@@ -4,7 +4,7 @@ ggml_cuda_init: found 1 ROCm devices:
|
|||||||
Device 0: AMD Radeon Graphics, gfx1151 (0x1151), VMM: no, Wave Size: 32
|
Device 0: AMD Radeon Graphics, gfx1151 (0x1151), VMM: no, Wave Size: 32
|
||||||
| model | size | params | backend | ngl | fa | mmap | test | t/s |
|
| model | size | params | backend | ngl | fa | mmap | test | t/s |
|
||||||
| ------------------------------ | ---------: | ---------: | ---------- | --: | -: | ---: | --------------: | -------------------: |
|
| ------------------------------ | ---------: | ---------: | ---------- | --: | -: | ---: | --------------: | -------------------: |
|
||||||
| glm4moe 106B.A12B Q4_K - Medium | 68.01 GiB | 110.47 B | ROCm | 99 | 1 | 0 | pp512 | 166.11 ± 0.32 |
|
| glm4moe 106B.A12B Q4_K - Medium | 68.01 GiB | 110.47 B | ROCm | 99 | 1 | 0 | pp512 | 169.35 ± 0.56 |
|
||||||
| glm4moe 106B.A12B Q4_K - Medium | 68.01 GiB | 110.47 B | ROCm | 99 | 1 | 0 | tg128 | 19.83 ± 0.00 |
|
| glm4moe 106B.A12B Q4_K - Medium | 68.01 GiB | 110.47 B | ROCm | 99 | 1 | 0 | tg128 | 20.65 ± 0.00 |
|
||||||
|
|
||||||
build: de219279 (6181)
|
build: f1fbffb5 (6486)
|
||||||
|
|||||||
@@ -4,7 +4,7 @@ ggml_cuda_init: found 1 ROCm devices:
|
|||||||
Device 0: AMD Radeon Graphics, gfx1151 (0x1151), VMM: no, Wave Size: 32
|
Device 0: AMD Radeon Graphics, gfx1151 (0x1151), VMM: no, Wave Size: 32
|
||||||
| model | size | params | backend | ngl | mmap | test | t/s |
|
| model | size | params | backend | ngl | mmap | test | t/s |
|
||||||
| ------------------------------ | ---------: | ---------: | ---------- | --: | ---: | --------------: | -------------------: |
|
| ------------------------------ | ---------: | ---------: | ---------- | --: | ---: | --------------: | -------------------: |
|
||||||
| glm4moe 106B.A12B Q4_K - Medium | 68.01 GiB | 110.47 B | ROCm | 99 | 0 | pp512 | 89.60 ± 0.20 |
|
| glm4moe 106B.A12B Q4_K - Medium | 68.01 GiB | 110.47 B | ROCm | 99 | 0 | pp512 | 94.73 ± 0.22 |
|
||||||
| glm4moe 106B.A12B Q4_K - Medium | 68.01 GiB | 110.47 B | ROCm | 99 | 0 | tg128 | 20.22 ± 0.00 |
|
| glm4moe 106B.A12B Q4_K - Medium | 68.01 GiB | 110.47 B | ROCm | 99 | 0 | tg128 | 20.47 ± 0.00 |
|
||||||
|
|
||||||
build: de219279 (6181)
|
build: f1fbffb5 (6486)
|
||||||
|
|||||||
@@ -4,7 +4,7 @@ ggml_cuda_init: found 1 ROCm devices:
|
|||||||
Device 0: AMD Radeon Graphics, gfx1151 (0x1151), VMM: no, Wave Size: 32
|
Device 0: AMD Radeon Graphics, gfx1151 (0x1151), VMM: no, Wave Size: 32
|
||||||
| model | size | params | backend | ngl | fa | mmap | test | t/s |
|
| model | size | params | backend | ngl | fa | mmap | test | t/s |
|
||||||
| ------------------------------ | ---------: | ---------: | ---------- | --: | -: | ---: | --------------: | -------------------: |
|
| ------------------------------ | ---------: | ---------: | ---------- | --: | -: | ---: | --------------: | -------------------: |
|
||||||
| glm4moe 106B.A12B Q4_K - Medium | 68.01 GiB | 110.47 B | ROCm | 99 | 1 | 0 | pp512 | 64.66 ± 0.16 |
|
| glm4moe 106B.A12B Q4_K - Medium | 68.01 GiB | 110.47 B | ROCm | 99 | 1 | 0 | pp512 | 93.27 ± 0.18 |
|
||||||
| glm4moe 106B.A12B Q4_K - Medium | 68.01 GiB | 110.47 B | ROCm | 99 | 1 | 0 | tg128 | 20.35 ± 0.00 |
|
| glm4moe 106B.A12B Q4_K - Medium | 68.01 GiB | 110.47 B | ROCm | 99 | 1 | 0 | tg128 | 20.67 ± 0.00 |
|
||||||
|
|
||||||
build: de219279 (6181)
|
build: f1fbffb5 (6486)
|
||||||
|
|||||||
@@ -2,5 +2,9 @@ ggml_cuda_init: GGML_CUDA_FORCE_MMQ: no
|
|||||||
ggml_cuda_init: GGML_CUDA_FORCE_CUBLAS: no
|
ggml_cuda_init: GGML_CUDA_FORCE_CUBLAS: no
|
||||||
ggml_cuda_init: found 1 ROCm devices:
|
ggml_cuda_init: found 1 ROCm devices:
|
||||||
Device 0: AMD Radeon Graphics, gfx1151 (0x1151), VMM: no, Wave Size: 32
|
Device 0: AMD Radeon Graphics, gfx1151 (0x1151), VMM: no, Wave Size: 32
|
||||||
HW Exception by GPU node-1 (Agent handle: 0x1d380ea0) reason :GPU Hang
|
| model | size | params | backend | ngl | mmap | test | t/s |
|
||||||
✖ ! [rocm7_rc] GLM-4.5-Air-UD-Q4_K_XL-00001-of-00002 __hblt0 failed (exit 134)
|
| ------------------------------ | ---------: | ---------: | ---------- | --: | ---: | --------------: | -------------------: |
|
||||||
|
| glm4moe 106B.A12B Q4_K - Medium | 68.01 GiB | 110.47 B | ROCm | 99 | 0 | pp512 | 159.89 ± 0.44 |
|
||||||
|
| glm4moe 106B.A12B Q4_K - Medium | 68.01 GiB | 110.47 B | ROCm | 99 | 0 | tg128 | 20.55 ± 0.00 |
|
||||||
|
|
||||||
|
build: f1fbffb5 (6486)
|
||||||
|
|||||||
@@ -2,5 +2,9 @@ ggml_cuda_init: GGML_CUDA_FORCE_MMQ: no
|
|||||||
ggml_cuda_init: GGML_CUDA_FORCE_CUBLAS: no
|
ggml_cuda_init: GGML_CUDA_FORCE_CUBLAS: no
|
||||||
ggml_cuda_init: found 1 ROCm devices:
|
ggml_cuda_init: found 1 ROCm devices:
|
||||||
Device 0: AMD Radeon Graphics, gfx1151 (0x1151), VMM: no, Wave Size: 32
|
Device 0: AMD Radeon Graphics, gfx1151 (0x1151), VMM: no, Wave Size: 32
|
||||||
Memory access fault by GPU node-1 (Agent handle: 0x4a0fea0) on address 0x7f3bf796f000. Reason: Page not present or supervisor privilege.
|
| model | size | params | backend | ngl | fa | mmap | test | t/s |
|
||||||
✖ ! [rocm7_rc] GLM-4.5-Air-UD-Q4_K_XL-00001-of-00002 __hblt0__fa1 failed (exit 134)
|
| ------------------------------ | ---------: | ---------: | ---------- | --: | -: | ---: | --------------: | -------------------: |
|
||||||
|
| glm4moe 106B.A12B Q4_K - Medium | 68.01 GiB | 110.47 B | ROCm | 99 | 1 | 0 | pp512 | 170.42 ± 0.33 |
|
||||||
|
| glm4moe 106B.A12B Q4_K - Medium | 68.01 GiB | 110.47 B | ROCm | 99 | 1 | 0 | tg128 | 20.66 ± 0.00 |
|
||||||
|
|
||||||
|
build: f1fbffb5 (6486)
|
||||||
|
|||||||
@@ -2,7 +2,7 @@ ggml_vulkan: Found 1 Vulkan devices:
|
|||||||
ggml_vulkan: 0 = Radeon 8060S Graphics (AMD open-source driver) | uma: 1 | fp16: 1 | bf16: 0 | warp size: 64 | shared memory: 32768 | int dot: 1 | matrix cores: KHR_coopmat
|
ggml_vulkan: 0 = Radeon 8060S Graphics (AMD open-source driver) | uma: 1 | fp16: 1 | bf16: 0 | warp size: 64 | shared memory: 32768 | int dot: 1 | matrix cores: KHR_coopmat
|
||||||
| model | size | params | backend | ngl | mmap | test | t/s |
|
| model | size | params | backend | ngl | mmap | test | t/s |
|
||||||
| ------------------------------ | ---------: | ---------: | ---------- | --: | ---: | --------------: | -------------------: |
|
| ------------------------------ | ---------: | ---------: | ---------- | --: | ---: | --------------: | -------------------: |
|
||||||
| glm4moe 106B.A12B Q4_K - Medium | 68.01 GiB | 110.47 B | Vulkan | 99 | 0 | pp512 | 197.95 ± 0.29 |
|
| glm4moe 106B.A12B Q4_K - Medium | 68.01 GiB | 110.47 B | Vulkan | 99 | 0 | pp512 | 217.22 ± 0.49 |
|
||||||
| glm4moe 106B.A12B Q4_K - Medium | 68.01 GiB | 110.47 B | Vulkan | 99 | 0 | tg128 | 23.24 ± 0.00 |
|
| glm4moe 106B.A12B Q4_K - Medium | 68.01 GiB | 110.47 B | Vulkan | 99 | 0 | tg128 | 24.18 ± 0.01 |
|
||||||
|
|
||||||
build: 1fe00296 (6182)
|
build: f1fbffb5 (6486)
|
||||||
|
|||||||
@@ -2,7 +2,7 @@ ggml_vulkan: Found 1 Vulkan devices:
|
|||||||
ggml_vulkan: 0 = Radeon 8060S Graphics (AMD open-source driver) | uma: 1 | fp16: 1 | bf16: 0 | warp size: 64 | shared memory: 32768 | int dot: 1 | matrix cores: KHR_coopmat
|
ggml_vulkan: 0 = Radeon 8060S Graphics (AMD open-source driver) | uma: 1 | fp16: 1 | bf16: 0 | warp size: 64 | shared memory: 32768 | int dot: 1 | matrix cores: KHR_coopmat
|
||||||
| model | size | params | backend | ngl | fa | mmap | test | t/s |
|
| model | size | params | backend | ngl | fa | mmap | test | t/s |
|
||||||
| ------------------------------ | ---------: | ---------: | ---------- | --: | -: | ---: | --------------: | -------------------: |
|
| ------------------------------ | ---------: | ---------: | ---------- | --: | -: | ---: | --------------: | -------------------: |
|
||||||
| glm4moe 106B.A12B Q4_K - Medium | 68.01 GiB | 110.47 B | Vulkan | 99 | 1 | 0 | pp512 | 199.40 ± 0.35 |
|
| glm4moe 106B.A12B Q4_K - Medium | 68.01 GiB | 110.47 B | Vulkan | 99 | 1 | 0 | pp512 | 219.61 ± 0.55 |
|
||||||
| glm4moe 106B.A12B Q4_K - Medium | 68.01 GiB | 110.47 B | Vulkan | 99 | 1 | 0 | tg128 | 23.26 ± 0.00 |
|
| glm4moe 106B.A12B Q4_K - Medium | 68.01 GiB | 110.47 B | Vulkan | 99 | 1 | 0 | tg128 | 24.21 ± 0.01 |
|
||||||
|
|
||||||
build: 1fe00296 (6182)
|
build: f1fbffb5 (6486)
|
||||||
|
|||||||
@@ -2,7 +2,7 @@ ggml_vulkan: Found 1 Vulkan devices:
|
|||||||
ggml_vulkan: 0 = Radeon 8060S Graphics (RADV GFX1151) (radv) | uma: 1 | fp16: 1 | bf16: 0 | warp size: 64 | shared memory: 65536 | int dot: 1 | matrix cores: KHR_coopmat
|
ggml_vulkan: 0 = Radeon 8060S Graphics (RADV GFX1151) (radv) | uma: 1 | fp16: 1 | bf16: 0 | warp size: 64 | shared memory: 65536 | int dot: 1 | matrix cores: KHR_coopmat
|
||||||
| model | size | params | backend | ngl | mmap | test | t/s |
|
| model | size | params | backend | ngl | mmap | test | t/s |
|
||||||
| ------------------------------ | ---------: | ---------: | ---------- | --: | ---: | --------------: | -------------------: |
|
| ------------------------------ | ---------: | ---------: | ---------- | --: | ---: | --------------: | -------------------: |
|
||||||
| glm4moe 106B.A12B Q4_K - Medium | 68.01 GiB | 110.47 B | Vulkan | 99 | 0 | pp512 | 126.28 ± 0.17 |
|
| glm4moe 106B.A12B Q4_K - Medium | 68.01 GiB | 110.47 B | Vulkan | 99 | 0 | pp512 | 212.60 ± 0.74 |
|
||||||
| glm4moe 106B.A12B Q4_K - Medium | 68.01 GiB | 110.47 B | Vulkan | 99 | 0 | tg128 | 23.33 ± 0.01 |
|
| glm4moe 106B.A12B Q4_K - Medium | 68.01 GiB | 110.47 B | Vulkan | 99 | 0 | tg128 | 24.18 ± 0.03 |
|
||||||
|
|
||||||
build: 1fe00296 (6182)
|
build: f1fbffb5 (6486)
|
||||||
|
|||||||
@@ -2,7 +2,7 @@ ggml_vulkan: Found 1 Vulkan devices:
|
|||||||
ggml_vulkan: 0 = Radeon 8060S Graphics (RADV GFX1151) (radv) | uma: 1 | fp16: 1 | bf16: 0 | warp size: 64 | shared memory: 65536 | int dot: 1 | matrix cores: KHR_coopmat
|
ggml_vulkan: 0 = Radeon 8060S Graphics (RADV GFX1151) (radv) | uma: 1 | fp16: 1 | bf16: 0 | warp size: 64 | shared memory: 65536 | int dot: 1 | matrix cores: KHR_coopmat
|
||||||
| model | size | params | backend | ngl | fa | mmap | test | t/s |
|
| model | size | params | backend | ngl | fa | mmap | test | t/s |
|
||||||
| ------------------------------ | ---------: | ---------: | ---------- | --: | -: | ---: | --------------: | -------------------: |
|
| ------------------------------ | ---------: | ---------: | ---------- | --: | -: | ---: | --------------: | -------------------: |
|
||||||
| glm4moe 106B.A12B Q4_K - Medium | 68.01 GiB | 110.47 B | Vulkan | 99 | 1 | 0 | pp512 | 131.64 ± 0.32 |
|
| glm4moe 106B.A12B Q4_K - Medium | 68.01 GiB | 110.47 B | Vulkan | 99 | 1 | 0 | pp512 | 224.85 ± 2.55 |
|
||||||
| glm4moe 106B.A12B Q4_K - Medium | 68.01 GiB | 110.47 B | Vulkan | 99 | 1 | 0 | tg128 | 23.88 ± 0.01 |
|
| glm4moe 106B.A12B Q4_K - Medium | 68.01 GiB | 110.47 B | Vulkan | 99 | 1 | 0 | tg128 | 24.64 ± 0.01 |
|
||||||
|
|
||||||
build: 1fe00296 (6182)
|
build: f1fbffb5 (6486)
|
||||||
|
|||||||
@@ -9,7 +9,7 @@ rocBLAS warning: hipBlasLT failed, falling back to tensile.
|
|||||||
This message will be only be displayed once, unless the ROCBLAS_VERBOSE_TENSILE_ERROR environment variable is set.
|
This message will be only be displayed once, unless the ROCBLAS_VERBOSE_TENSILE_ERROR environment variable is set.
|
||||||
| model | size | params | backend | ngl | mmap | test | t/s |
|
| model | size | params | backend | ngl | mmap | test | t/s |
|
||||||
| ------------------------------ | ---------: | ---------: | ---------- | --: | ---: | --------------: | -------------------: |
|
| ------------------------------ | ---------: | ---------: | ---------- | --: | ---: | --------------: | -------------------: |
|
||||||
| glm4moe 106B.A12B Q6_K | 94.57 GiB | 110.47 B | ROCm | 99 | 0 | pp512 | 121.82 ± 0.35 |
|
| glm4moe 106B.A12B Q6_K | 94.57 GiB | 110.47 B | ROCm | 99 | 0 | pp512 | 120.87 ± 0.23 |
|
||||||
| glm4moe 106B.A12B Q6_K | 94.57 GiB | 110.47 B | ROCm | 99 | 0 | tg128 | 15.59 ± 0.00 |
|
| glm4moe 106B.A12B Q6_K | 94.57 GiB | 110.47 B | ROCm | 99 | 0 | tg128 | 15.86 ± 0.00 |
|
||||||
|
|
||||||
build: 1fe00296 (6182)
|
build: f1fbffb5 (6486)
|
||||||
|
|||||||
@@ -4,7 +4,7 @@ ggml_cuda_init: found 1 ROCm devices:
|
|||||||
Device 0: Radeon 8060S Graphics, gfx1151 (0x1151), VMM: no, Wave Size: 32
|
Device 0: Radeon 8060S Graphics, gfx1151 (0x1151), VMM: no, Wave Size: 32
|
||||||
| model | size | params | backend | ngl | fa | mmap | test | t/s |
|
| model | size | params | backend | ngl | fa | mmap | test | t/s |
|
||||||
| ------------------------------ | ---------: | ---------: | ---------- | --: | -: | ---: | --------------: | -------------------: |
|
| ------------------------------ | ---------: | ---------: | ---------- | --: | -: | ---: | --------------: | -------------------: |
|
||||||
| glm4moe 106B.A12B Q6_K | 94.57 GiB | 110.47 B | ROCm | 99 | 1 | 0 | pp512 | 126.60 ± 0.30 |
|
| glm4moe 106B.A12B Q6_K | 94.57 GiB | 110.47 B | ROCm | 99 | 1 | 0 | pp512 | 128.65 ± 0.59 |
|
||||||
| glm4moe 106B.A12B Q6_K | 94.57 GiB | 110.47 B | ROCm | 99 | 1 | 0 | tg128 | 15.62 ± 0.04 |
|
| glm4moe 106B.A12B Q6_K | 94.57 GiB | 110.47 B | ROCm | 99 | 1 | 0 | tg128 | 15.96 ± 0.00 |
|
||||||
|
|
||||||
build: 1fe00296 (6182)
|
build: f1fbffb5 (6486)
|
||||||
|
|||||||
+1
-1
@@ -2,5 +2,5 @@ ggml_cuda_init: GGML_CUDA_FORCE_MMQ: no
|
|||||||
ggml_cuda_init: GGML_CUDA_FORCE_CUBLAS: no
|
ggml_cuda_init: GGML_CUDA_FORCE_CUBLAS: no
|
||||||
ggml_cuda_init: found 1 ROCm devices:
|
ggml_cuda_init: found 1 ROCm devices:
|
||||||
Device 0: Radeon 8060S Graphics, gfx1151 (0x1151), VMM: no, Wave Size: 32
|
Device 0: Radeon 8060S Graphics, gfx1151 (0x1151), VMM: no, Wave Size: 32
|
||||||
Memory access fault by GPU node-1 (Agent handle: 0x26e36340) on address 0x7fcef3635000. Reason: Page not present or supervisor privilege.
|
HW Exception by GPU node-1 (Agent handle: 0xe6e7340) reason :GPU Hang
|
||||||
✖ ! [rocm6_4_3-rocwmma] GLM-4.5-Air-UD-Q6_K_XL-00001-of-00003 __hblt0 failed (exit 134)
|
✖ ! [rocm6_4_3-rocwmma] GLM-4.5-Air-UD-Q6_K_XL-00001-of-00003 __hblt0 failed (exit 134)
|
||||||
|
|||||||
+1
-1
@@ -2,5 +2,5 @@ ggml_cuda_init: GGML_CUDA_FORCE_MMQ: no
|
|||||||
ggml_cuda_init: GGML_CUDA_FORCE_CUBLAS: no
|
ggml_cuda_init: GGML_CUDA_FORCE_CUBLAS: no
|
||||||
ggml_cuda_init: found 1 ROCm devices:
|
ggml_cuda_init: found 1 ROCm devices:
|
||||||
Device 0: Radeon 8060S Graphics, gfx1151 (0x1151), VMM: no, Wave Size: 32
|
Device 0: Radeon 8060S Graphics, gfx1151 (0x1151), VMM: no, Wave Size: 32
|
||||||
HW Exception by GPU node-1 (Agent handle: 0x35263340) reason :GPU Hang
|
Memory access fault by GPU node-1 (Agent handle: 0x400a9340) on address 0x7ef17b435000. Reason: Page not present or supervisor privilege.
|
||||||
✖ ! [rocm6_4_3-rocwmma] GLM-4.5-Air-UD-Q6_K_XL-00001-of-00003 __hblt0__fa1 failed (exit 134)
|
✖ ! [rocm6_4_3-rocwmma] GLM-4.5-Air-UD-Q6_K_XL-00001-of-00003 __hblt0__fa1 failed (exit 134)
|
||||||
|
|||||||
@@ -9,7 +9,7 @@ rocBLAS warning: hipBlasLT failed, falling back to tensile.
|
|||||||
This message will be only be displayed once, unless the ROCBLAS_VERBOSE_TENSILE_ERROR environment variable is set.
|
This message will be only be displayed once, unless the ROCBLAS_VERBOSE_TENSILE_ERROR environment variable is set.
|
||||||
| model | size | params | backend | ngl | mmap | test | t/s |
|
| model | size | params | backend | ngl | mmap | test | t/s |
|
||||||
| ------------------------------ | ---------: | ---------: | ---------- | --: | ---: | --------------: | -------------------: |
|
| ------------------------------ | ---------: | ---------: | ---------- | --: | ---: | --------------: | -------------------: |
|
||||||
| glm4moe 106B.A12B Q6_K | 94.57 GiB | 110.47 B | ROCm | 99 | 0 | pp512 | 117.95 ± 0.30 |
|
| glm4moe 106B.A12B Q6_K | 94.57 GiB | 110.47 B | ROCm | 99 | 0 | pp512 | 120.53 ± 0.28 |
|
||||||
| glm4moe 106B.A12B Q6_K | 94.57 GiB | 110.47 B | ROCm | 99 | 0 | tg128 | 15.65 ± 0.01 |
|
| glm4moe 106B.A12B Q6_K | 94.57 GiB | 110.47 B | ROCm | 99 | 0 | tg128 | 15.87 ± 0.00 |
|
||||||
|
|
||||||
build: 1fe00296 (6182)
|
build: f1fbffb5 (6486)
|
||||||
|
|||||||
@@ -2,5 +2,9 @@ ggml_cuda_init: GGML_CUDA_FORCE_MMQ: no
|
|||||||
ggml_cuda_init: GGML_CUDA_FORCE_CUBLAS: no
|
ggml_cuda_init: GGML_CUDA_FORCE_CUBLAS: no
|
||||||
ggml_cuda_init: found 1 ROCm devices:
|
ggml_cuda_init: found 1 ROCm devices:
|
||||||
Device 0: Radeon 8060S Graphics, gfx1151 (0x1151), VMM: no, Wave Size: 32
|
Device 0: Radeon 8060S Graphics, gfx1151 (0x1151), VMM: no, Wave Size: 32
|
||||||
Memory access fault by GPU node-1 (Agent handle: 0x28aa3340) on address 0x7fb93761b000. Reason: Page not present or supervisor privilege.
|
| model | size | params | backend | ngl | fa | mmap | test | t/s |
|
||||||
✖ ! [rocm6_4_3] GLM-4.5-Air-UD-Q6_K_XL-00001-of-00003 __fa1 failed (exit 134)
|
| ------------------------------ | ---------: | ---------: | ---------- | --: | -: | ---: | --------------: | -------------------: |
|
||||||
|
| glm4moe 106B.A12B Q6_K | 94.57 GiB | 110.47 B | ROCm | 99 | 1 | 0 | pp512 | 129.22 ± 0.41 |
|
||||||
|
| glm4moe 106B.A12B Q6_K | 94.57 GiB | 110.47 B | ROCm | 99 | 1 | 0 | tg128 | 15.95 ± 0.00 |
|
||||||
|
|
||||||
|
build: f1fbffb5 (6486)
|
||||||
|
|||||||
@@ -2,5 +2,5 @@ ggml_cuda_init: GGML_CUDA_FORCE_MMQ: no
|
|||||||
ggml_cuda_init: GGML_CUDA_FORCE_CUBLAS: no
|
ggml_cuda_init: GGML_CUDA_FORCE_CUBLAS: no
|
||||||
ggml_cuda_init: found 1 ROCm devices:
|
ggml_cuda_init: found 1 ROCm devices:
|
||||||
Device 0: Radeon 8060S Graphics, gfx1151 (0x1151), VMM: no, Wave Size: 32
|
Device 0: Radeon 8060S Graphics, gfx1151 (0x1151), VMM: no, Wave Size: 32
|
||||||
HW Exception by GPU node-1 (Agent handle: 0x14d05340) reason :GPU Hang
|
Memory access fault by GPU node-1 (Agent handle: 0x22558310) on address 0x7f7830fad000. Reason: Page not present or supervisor privilege.
|
||||||
✖ ! [rocm6_4_3] GLM-4.5-Air-UD-Q6_K_XL-00001-of-00003 __hblt0 failed (exit 134)
|
✖ ! [rocm6_4_3] GLM-4.5-Air-UD-Q6_K_XL-00001-of-00003 __hblt0 failed (exit 134)
|
||||||
|
|||||||
@@ -2,5 +2,9 @@ ggml_cuda_init: GGML_CUDA_FORCE_MMQ: no
|
|||||||
ggml_cuda_init: GGML_CUDA_FORCE_CUBLAS: no
|
ggml_cuda_init: GGML_CUDA_FORCE_CUBLAS: no
|
||||||
ggml_cuda_init: found 1 ROCm devices:
|
ggml_cuda_init: found 1 ROCm devices:
|
||||||
Device 0: Radeon 8060S Graphics, gfx1151 (0x1151), VMM: no, Wave Size: 32
|
Device 0: Radeon 8060S Graphics, gfx1151 (0x1151), VMM: no, Wave Size: 32
|
||||||
HW Exception by GPU node-1 (Agent handle: 0x265e8340) reason :GPU Hang
|
| model | size | params | backend | ngl | fa | mmap | test | t/s |
|
||||||
✖ ! [rocm6_4_3] GLM-4.5-Air-UD-Q6_K_XL-00001-of-00003 __hblt0__fa1 failed (exit 134)
|
| ------------------------------ | ---------: | ---------: | ---------- | --: | -: | ---: | --------------: | -------------------: |
|
||||||
|
| glm4moe 106B.A12B Q6_K | 94.57 GiB | 110.47 B | ROCm | 99 | 1 | 0 | pp512 | 128.68 ± 0.22 |
|
||||||
|
| glm4moe 106B.A12B Q6_K | 94.57 GiB | 110.47 B | ROCm | 99 | 1 | 0 | tg128 | 15.96 ± 0.00 |
|
||||||
|
|
||||||
|
build: f1fbffb5 (6486)
|
||||||
|
|||||||
+10
@@ -0,0 +1,10 @@
|
|||||||
|
ggml_cuda_init: GGML_CUDA_FORCE_MMQ: no
|
||||||
|
ggml_cuda_init: GGML_CUDA_FORCE_CUBLAS: no
|
||||||
|
ggml_cuda_init: found 1 ROCm devices:
|
||||||
|
Device 0: AMD Radeon Graphics, gfx1151 (0x1151), VMM: no, Wave Size: 32
|
||||||
|
| model | size | params | backend | ngl | mmap | test | t/s |
|
||||||
|
| ------------------------------ | ---------: | ---------: | ---------- | --: | ---: | --------------: | -------------------: |
|
||||||
|
| glm4moe 106B.A12B Q6_K | 94.57 GiB | 110.47 B | ROCm | 99 | 0 | pp512 | 91.95 ± 0.25 |
|
||||||
|
| glm4moe 106B.A12B Q6_K | 94.57 GiB | 110.47 B | ROCm | 99 | 0 | tg128 | 15.76 ± 0.00 |
|
||||||
|
|
||||||
|
build: f1fbffb5 (6486)
|
||||||
+10
@@ -0,0 +1,10 @@
|
|||||||
|
ggml_cuda_init: GGML_CUDA_FORCE_MMQ: no
|
||||||
|
ggml_cuda_init: GGML_CUDA_FORCE_CUBLAS: no
|
||||||
|
ggml_cuda_init: found 1 ROCm devices:
|
||||||
|
Device 0: AMD Radeon Graphics, gfx1151 (0x1151), VMM: no, Wave Size: 32
|
||||||
|
| model | size | params | backend | ngl | fa | mmap | test | t/s |
|
||||||
|
| ------------------------------ | ---------: | ---------: | ---------- | --: | -: | ---: | --------------: | -------------------: |
|
||||||
|
| glm4moe 106B.A12B Q6_K | 94.57 GiB | 110.47 B | ROCm | 99 | 1 | 0 | pp512 | 70.00 ± 0.17 |
|
||||||
|
| glm4moe 106B.A12B Q6_K | 94.57 GiB | 110.47 B | ROCm | 99 | 1 | 0 | tg128 | 15.98 ± 0.00 |
|
||||||
|
|
||||||
|
build: f1fbffb5 (6486)
|
||||||
+10
@@ -0,0 +1,10 @@
|
|||||||
|
ggml_cuda_init: GGML_CUDA_FORCE_MMQ: no
|
||||||
|
ggml_cuda_init: GGML_CUDA_FORCE_CUBLAS: no
|
||||||
|
ggml_cuda_init: found 1 ROCm devices:
|
||||||
|
Device 0: AMD Radeon Graphics, gfx1151 (0x1151), VMM: no, Wave Size: 32
|
||||||
|
| model | size | params | backend | ngl | mmap | test | t/s |
|
||||||
|
| ------------------------------ | ---------: | ---------: | ---------- | --: | ---: | --------------: | -------------------: |
|
||||||
|
| glm4moe 106B.A12B Q6_K | 94.57 GiB | 110.47 B | ROCm | 99 | 0 | pp512 | 134.22 ± 0.50 |
|
||||||
|
| glm4moe 106B.A12B Q6_K | 94.57 GiB | 110.47 B | ROCm | 99 | 0 | tg128 | 15.90 ± 0.00 |
|
||||||
|
|
||||||
|
build: f1fbffb5 (6486)
|
||||||
+10
@@ -0,0 +1,10 @@
|
|||||||
|
ggml_cuda_init: GGML_CUDA_FORCE_MMQ: no
|
||||||
|
ggml_cuda_init: GGML_CUDA_FORCE_CUBLAS: no
|
||||||
|
ggml_cuda_init: found 1 ROCm devices:
|
||||||
|
Device 0: AMD Radeon Graphics, gfx1151 (0x1151), VMM: no, Wave Size: 32
|
||||||
|
| model | size | params | backend | ngl | fa | mmap | test | t/s |
|
||||||
|
| ------------------------------ | ---------: | ---------: | ---------- | --: | -: | ---: | --------------: | -------------------: |
|
||||||
|
| glm4moe 106B.A12B Q6_K | 94.57 GiB | 110.47 B | ROCm | 99 | 1 | 0 | pp512 | 159.75 ± 0.33 |
|
||||||
|
| glm4moe 106B.A12B Q6_K | 94.57 GiB | 110.47 B | ROCm | 99 | 1 | 0 | tg128 | 15.99 ± 0.00 |
|
||||||
|
|
||||||
|
build: f1fbffb5 (6486)
|
||||||
@@ -4,7 +4,7 @@ ggml_cuda_init: found 1 ROCm devices:
|
|||||||
Device 0: AMD Radeon Graphics, gfx1151 (0x1151), VMM: no, Wave Size: 32
|
Device 0: AMD Radeon Graphics, gfx1151 (0x1151), VMM: no, Wave Size: 32
|
||||||
| model | size | params | backend | ngl | mmap | test | t/s |
|
| model | size | params | backend | ngl | mmap | test | t/s |
|
||||||
| ------------------------------ | ---------: | ---------: | ---------- | --: | ---: | --------------: | -------------------: |
|
| ------------------------------ | ---------: | ---------: | ---------- | --: | ---: | --------------: | -------------------: |
|
||||||
| glm4moe 106B.A12B Q6_K | 94.57 GiB | 110.47 B | ROCm | 99 | 0 | pp512 | 69.19 ± 0.20 |
|
| glm4moe 106B.A12B Q6_K | 94.57 GiB | 110.47 B | ROCm | 99 | 0 | pp512 | 92.18 ± 0.04 |
|
||||||
| glm4moe 106B.A12B Q6_K | 94.57 GiB | 110.47 B | ROCm | 99 | 0 | tg128 | 15.64 ± 0.00 |
|
| glm4moe 106B.A12B Q6_K | 94.57 GiB | 110.47 B | ROCm | 99 | 0 | tg128 | 15.92 ± 0.00 |
|
||||||
|
|
||||||
build: de219279 (6181)
|
build: f1fbffb5 (6486)
|
||||||
|
|||||||
@@ -4,7 +4,7 @@ ggml_cuda_init: found 1 ROCm devices:
|
|||||||
Device 0: AMD Radeon Graphics, gfx1151 (0x1151), VMM: no, Wave Size: 32
|
Device 0: AMD Radeon Graphics, gfx1151 (0x1151), VMM: no, Wave Size: 32
|
||||||
| model | size | params | backend | ngl | fa | mmap | test | t/s |
|
| model | size | params | backend | ngl | fa | mmap | test | t/s |
|
||||||
| ------------------------------ | ---------: | ---------: | ---------- | --: | -: | ---: | --------------: | -------------------: |
|
| ------------------------------ | ---------: | ---------: | ---------- | --: | -: | ---: | --------------: | -------------------: |
|
||||||
| glm4moe 106B.A12B Q6_K | 94.57 GiB | 110.47 B | ROCm | 99 | 1 | 0 | pp512 | 114.61 ± 0.20 |
|
| glm4moe 106B.A12B Q6_K | 94.57 GiB | 110.47 B | ROCm | 99 | 1 | 0 | pp512 | 121.75 ± 0.32 |
|
||||||
| glm4moe 106B.A12B Q6_K | 94.57 GiB | 110.47 B | ROCm | 99 | 1 | 0 | tg128 | 15.51 ± 0.00 |
|
| glm4moe 106B.A12B Q6_K | 94.57 GiB | 110.47 B | ROCm | 99 | 1 | 0 | tg128 | 15.97 ± 0.00 |
|
||||||
|
|
||||||
build: de219279 (6181)
|
build: f1fbffb5 (6486)
|
||||||
|
|||||||
+3
-3
@@ -4,7 +4,7 @@ ggml_cuda_init: found 1 ROCm devices:
|
|||||||
Device 0: AMD Radeon Graphics, gfx1151 (0x1151), VMM: no, Wave Size: 32
|
Device 0: AMD Radeon Graphics, gfx1151 (0x1151), VMM: no, Wave Size: 32
|
||||||
| model | size | params | backend | ngl | mmap | test | t/s |
|
| model | size | params | backend | ngl | mmap | test | t/s |
|
||||||
| ------------------------------ | ---------: | ---------: | ---------- | --: | ---: | --------------: | -------------------: |
|
| ------------------------------ | ---------: | ---------: | ---------- | --: | ---: | --------------: | -------------------: |
|
||||||
| glm4moe 106B.A12B Q6_K | 94.57 GiB | 110.47 B | ROCm | 99 | 0 | pp512 | 120.88 ± 0.92 |
|
| glm4moe 106B.A12B Q6_K | 94.57 GiB | 110.47 B | ROCm | 99 | 0 | pp512 | 151.32 ± 0.45 |
|
||||||
| glm4moe 106B.A12B Q6_K | 94.57 GiB | 110.47 B | ROCm | 99 | 0 | tg128 | 15.61 ± 0.09 |
|
| glm4moe 106B.A12B Q6_K | 94.57 GiB | 110.47 B | ROCm | 99 | 0 | tg128 | 15.90 ± 0.00 |
|
||||||
|
|
||||||
build: de219279 (6181)
|
build: f1fbffb5 (6486)
|
||||||
|
|||||||
+3
-3
@@ -4,7 +4,7 @@ ggml_cuda_init: found 1 ROCm devices:
|
|||||||
Device 0: AMD Radeon Graphics, gfx1151 (0x1151), VMM: no, Wave Size: 32
|
Device 0: AMD Radeon Graphics, gfx1151 (0x1151), VMM: no, Wave Size: 32
|
||||||
| model | size | params | backend | ngl | fa | mmap | test | t/s |
|
| model | size | params | backend | ngl | fa | mmap | test | t/s |
|
||||||
| ------------------------------ | ---------: | ---------: | ---------- | --: | -: | ---: | --------------: | -------------------: |
|
| ------------------------------ | ---------: | ---------: | ---------- | --: | -: | ---: | --------------: | -------------------: |
|
||||||
| glm4moe 106B.A12B Q6_K | 94.57 GiB | 110.47 B | ROCm | 99 | 1 | 0 | pp512 | 150.07 ± 0.56 |
|
| glm4moe 106B.A12B Q6_K | 94.57 GiB | 110.47 B | ROCm | 99 | 1 | 0 | pp512 | 161.10 ± 0.36 |
|
||||||
| glm4moe 106B.A12B Q6_K | 94.57 GiB | 110.47 B | ROCm | 99 | 1 | 0 | tg128 | 15.52 ± 0.00 |
|
| glm4moe 106B.A12B Q6_K | 94.57 GiB | 110.47 B | ROCm | 99 | 1 | 0 | tg128 | 15.99 ± 0.00 |
|
||||||
|
|
||||||
build: de219279 (6181)
|
build: f1fbffb5 (6486)
|
||||||
|
|||||||
@@ -4,7 +4,7 @@ ggml_cuda_init: found 1 ROCm devices:
|
|||||||
Device 0: AMD Radeon Graphics, gfx1151 (0x1151), VMM: no, Wave Size: 32
|
Device 0: AMD Radeon Graphics, gfx1151 (0x1151), VMM: no, Wave Size: 32
|
||||||
| model | size | params | backend | ngl | mmap | test | t/s |
|
| model | size | params | backend | ngl | mmap | test | t/s |
|
||||||
| ------------------------------ | ---------: | ---------: | ---------- | --: | ---: | --------------: | -------------------: |
|
| ------------------------------ | ---------: | ---------: | ---------- | --: | ---: | --------------: | -------------------: |
|
||||||
| glm4moe 106B.A12B Q6_K | 94.57 GiB | 110.47 B | ROCm | 99 | 0 | pp512 | 69.52 ± 0.17 |
|
| glm4moe 106B.A12B Q6_K | 94.57 GiB | 110.47 B | ROCm | 99 | 0 | pp512 | 92.20 ± 0.11 |
|
||||||
| glm4moe 106B.A12B Q6_K | 94.57 GiB | 110.47 B | ROCm | 99 | 0 | tg128 | 15.63 ± 0.00 |
|
| glm4moe 106B.A12B Q6_K | 94.57 GiB | 110.47 B | ROCm | 99 | 0 | tg128 | 15.85 ± 0.00 |
|
||||||
|
|
||||||
build: de219279 (6181)
|
build: f1fbffb5 (6486)
|
||||||
|
|||||||
@@ -4,7 +4,7 @@ ggml_cuda_init: found 1 ROCm devices:
|
|||||||
Device 0: AMD Radeon Graphics, gfx1151 (0x1151), VMM: no, Wave Size: 32
|
Device 0: AMD Radeon Graphics, gfx1151 (0x1151), VMM: no, Wave Size: 32
|
||||||
| model | size | params | backend | ngl | fa | mmap | test | t/s |
|
| model | size | params | backend | ngl | fa | mmap | test | t/s |
|
||||||
| ------------------------------ | ---------: | ---------: | ---------- | --: | -: | ---: | --------------: | -------------------: |
|
| ------------------------------ | ---------: | ---------: | ---------- | --: | -: | ---: | --------------: | -------------------: |
|
||||||
| glm4moe 106B.A12B Q6_K | 94.57 GiB | 110.47 B | ROCm | 99 | 1 | 0 | pp512 | 74.02 ± 0.13 |
|
| glm4moe 106B.A12B Q6_K | 94.57 GiB | 110.47 B | ROCm | 99 | 1 | 0 | pp512 | 71.02 ± 0.16 |
|
||||||
| glm4moe 106B.A12B Q6_K | 94.57 GiB | 110.47 B | ROCm | 99 | 1 | 0 | tg128 | 15.73 ± 0.00 |
|
| glm4moe 106B.A12B Q6_K | 94.57 GiB | 110.47 B | ROCm | 99 | 1 | 0 | tg128 | 15.96 ± 0.00 |
|
||||||
|
|
||||||
build: de219279 (6181)
|
build: f1fbffb5 (6486)
|
||||||
|
|||||||
@@ -4,7 +4,7 @@ ggml_cuda_init: found 1 ROCm devices:
|
|||||||
Device 0: AMD Radeon Graphics, gfx1151 (0x1151), VMM: no, Wave Size: 32
|
Device 0: AMD Radeon Graphics, gfx1151 (0x1151), VMM: no, Wave Size: 32
|
||||||
| model | size | params | backend | ngl | mmap | test | t/s |
|
| model | size | params | backend | ngl | mmap | test | t/s |
|
||||||
| ------------------------------ | ---------: | ---------: | ---------- | --: | ---: | --------------: | -------------------: |
|
| ------------------------------ | ---------: | ---------: | ---------- | --: | ---: | --------------: | -------------------: |
|
||||||
| glm4moe 106B.A12B Q6_K | 94.57 GiB | 110.47 B | ROCm | 99 | 0 | pp512 | 142.67 ± 0.75 |
|
| glm4moe 106B.A12B Q6_K | 94.57 GiB | 110.47 B | ROCm | 99 | 0 | pp512 | 147.32 ± 0.43 |
|
||||||
| glm4moe 106B.A12B Q6_K | 94.57 GiB | 110.47 B | ROCm | 99 | 0 | tg128 | 15.68 ± 0.00 |
|
| glm4moe 106B.A12B Q6_K | 94.57 GiB | 110.47 B | ROCm | 99 | 0 | tg128 | 15.91 ± 0.00 |
|
||||||
|
|
||||||
build: de219279 (6181)
|
build: f1fbffb5 (6486)
|
||||||
|
|||||||
@@ -2,5 +2,9 @@ ggml_cuda_init: GGML_CUDA_FORCE_MMQ: no
|
|||||||
ggml_cuda_init: GGML_CUDA_FORCE_CUBLAS: no
|
ggml_cuda_init: GGML_CUDA_FORCE_CUBLAS: no
|
||||||
ggml_cuda_init: found 1 ROCm devices:
|
ggml_cuda_init: found 1 ROCm devices:
|
||||||
Device 0: AMD Radeon Graphics, gfx1151 (0x1151), VMM: no, Wave Size: 32
|
Device 0: AMD Radeon Graphics, gfx1151 (0x1151), VMM: no, Wave Size: 32
|
||||||
Memory access fault by GPU node-1 (Agent handle: 0x1c536ea0) on address 0x7f623b57e000. Reason: Page not present or supervisor privilege.
|
| model | size | params | backend | ngl | fa | mmap | test | t/s |
|
||||||
✖ ! [rocm7_rc] GLM-4.5-Air-UD-Q6_K_XL-00001-of-00003 __hblt0__fa1 failed (exit 134)
|
| ------------------------------ | ---------: | ---------: | ---------- | --: | -: | ---: | --------------: | -------------------: |
|
||||||
|
| glm4moe 106B.A12B Q6_K | 94.57 GiB | 110.47 B | ROCm | 99 | 1 | 0 | pp512 | 161.37 ± 0.36 |
|
||||||
|
| glm4moe 106B.A12B Q6_K | 94.57 GiB | 110.47 B | ROCm | 99 | 1 | 0 | tg128 | 15.99 ± 0.00 |
|
||||||
|
|
||||||
|
build: f1fbffb5 (6486)
|
||||||
|
|||||||
@@ -2,7 +2,7 @@ ggml_vulkan: Found 1 Vulkan devices:
|
|||||||
ggml_vulkan: 0 = Radeon 8060S Graphics (AMD open-source driver) | uma: 1 | fp16: 1 | bf16: 0 | warp size: 64 | shared memory: 32768 | int dot: 1 | matrix cores: KHR_coopmat
|
ggml_vulkan: 0 = Radeon 8060S Graphics (AMD open-source driver) | uma: 1 | fp16: 1 | bf16: 0 | warp size: 64 | shared memory: 32768 | int dot: 1 | matrix cores: KHR_coopmat
|
||||||
| model | size | params | backend | ngl | mmap | test | t/s |
|
| model | size | params | backend | ngl | mmap | test | t/s |
|
||||||
| ------------------------------ | ---------: | ---------: | ---------- | --: | ---: | --------------: | -------------------: |
|
| ------------------------------ | ---------: | ---------: | ---------- | --: | ---: | --------------: | -------------------: |
|
||||||
| glm4moe 106B.A12B Q6_K | 94.57 GiB | 110.47 B | Vulkan | 99 | 0 | pp512 | 219.81 ± 0.70 |
|
| glm4moe 106B.A12B Q6_K | 94.57 GiB | 110.47 B | Vulkan | 99 | 0 | pp512 | 264.50 ± 0.99 |
|
||||||
| glm4moe 106B.A12B Q6_K | 94.57 GiB | 110.47 B | Vulkan | 99 | 0 | tg128 | 16.80 ± 0.00 |
|
| glm4moe 106B.A12B Q6_K | 94.57 GiB | 110.47 B | Vulkan | 99 | 0 | tg128 | 17.27 ± 0.00 |
|
||||||
|
|
||||||
build: 1fe00296 (6182)
|
build: f1fbffb5 (6486)
|
||||||
|
|||||||
@@ -2,7 +2,7 @@ ggml_vulkan: Found 1 Vulkan devices:
|
|||||||
ggml_vulkan: 0 = Radeon 8060S Graphics (AMD open-source driver) | uma: 1 | fp16: 1 | bf16: 0 | warp size: 64 | shared memory: 32768 | int dot: 1 | matrix cores: KHR_coopmat
|
ggml_vulkan: 0 = Radeon 8060S Graphics (AMD open-source driver) | uma: 1 | fp16: 1 | bf16: 0 | warp size: 64 | shared memory: 32768 | int dot: 1 | matrix cores: KHR_coopmat
|
||||||
| model | size | params | backend | ngl | fa | mmap | test | t/s |
|
| model | size | params | backend | ngl | fa | mmap | test | t/s |
|
||||||
| ------------------------------ | ---------: | ---------: | ---------- | --: | -: | ---: | --------------: | -------------------: |
|
| ------------------------------ | ---------: | ---------: | ---------- | --: | -: | ---: | --------------: | -------------------: |
|
||||||
| glm4moe 106B.A12B Q6_K | 94.57 GiB | 110.47 B | Vulkan | 99 | 1 | 0 | pp512 | 222.20 ± 0.63 |
|
| glm4moe 106B.A12B Q6_K | 94.57 GiB | 110.47 B | Vulkan | 99 | 1 | 0 | pp512 | 267.86 ± 1.22 |
|
||||||
| glm4moe 106B.A12B Q6_K | 94.57 GiB | 110.47 B | Vulkan | 99 | 1 | 0 | tg128 | 16.82 ± 0.01 |
|
| glm4moe 106B.A12B Q6_K | 94.57 GiB | 110.47 B | Vulkan | 99 | 1 | 0 | tg128 | 17.28 ± 0.00 |
|
||||||
|
|
||||||
build: 1fe00296 (6182)
|
build: f1fbffb5 (6486)
|
||||||
|
|||||||
@@ -2,7 +2,7 @@ ggml_vulkan: Found 1 Vulkan devices:
|
|||||||
ggml_vulkan: 0 = Radeon 8060S Graphics (RADV GFX1151) (radv) | uma: 1 | fp16: 1 | bf16: 0 | warp size: 64 | shared memory: 65536 | int dot: 1 | matrix cores: KHR_coopmat
|
ggml_vulkan: 0 = Radeon 8060S Graphics (RADV GFX1151) (radv) | uma: 1 | fp16: 1 | bf16: 0 | warp size: 64 | shared memory: 65536 | int dot: 1 | matrix cores: KHR_coopmat
|
||||||
| model | size | params | backend | ngl | mmap | test | t/s |
|
| model | size | params | backend | ngl | mmap | test | t/s |
|
||||||
| ------------------------------ | ---------: | ---------: | ---------- | --: | ---: | --------------: | -------------------: |
|
| ------------------------------ | ---------: | ---------: | ---------- | --: | ---: | --------------: | -------------------: |
|
||||||
| glm4moe 106B.A12B Q6_K | 94.57 GiB | 110.47 B | Vulkan | 99 | 0 | pp512 | 126.55 ± 0.40 |
|
| glm4moe 106B.A12B Q6_K | 94.57 GiB | 110.47 B | Vulkan | 99 | 0 | pp512 | 208.01 ± 0.73 |
|
||||||
| glm4moe 106B.A12B Q6_K | 94.57 GiB | 110.47 B | Vulkan | 99 | 0 | tg128 | 17.07 ± 0.01 |
|
| glm4moe 106B.A12B Q6_K | 94.57 GiB | 110.47 B | Vulkan | 99 | 0 | tg128 | 17.49 ± 0.02 |
|
||||||
|
|
||||||
build: 1fe00296 (6182)
|
build: f1fbffb5 (6486)
|
||||||
|
|||||||
@@ -2,7 +2,7 @@ ggml_vulkan: Found 1 Vulkan devices:
|
|||||||
ggml_vulkan: 0 = Radeon 8060S Graphics (RADV GFX1151) (radv) | uma: 1 | fp16: 1 | bf16: 0 | warp size: 64 | shared memory: 65536 | int dot: 1 | matrix cores: KHR_coopmat
|
ggml_vulkan: 0 = Radeon 8060S Graphics (RADV GFX1151) (radv) | uma: 1 | fp16: 1 | bf16: 0 | warp size: 64 | shared memory: 65536 | int dot: 1 | matrix cores: KHR_coopmat
|
||||||
| model | size | params | backend | ngl | fa | mmap | test | t/s |
|
| model | size | params | backend | ngl | fa | mmap | test | t/s |
|
||||||
| ------------------------------ | ---------: | ---------: | ---------- | --: | -: | ---: | --------------: | -------------------: |
|
| ------------------------------ | ---------: | ---------: | ---------- | --: | -: | ---: | --------------: | -------------------: |
|
||||||
| glm4moe 106B.A12B Q6_K | 94.57 GiB | 110.47 B | Vulkan | 99 | 1 | 0 | pp512 | 131.25 ± 0.50 |
|
| glm4moe 106B.A12B Q6_K | 94.57 GiB | 110.47 B | Vulkan | 99 | 1 | 0 | pp512 | 221.63 ± 1.26 |
|
||||||
| glm4moe 106B.A12B Q6_K | 94.57 GiB | 110.47 B | Vulkan | 99 | 1 | 0 | tg128 | 17.31 ± 0.00 |
|
| glm4moe 106B.A12B Q6_K | 94.57 GiB | 110.47 B | Vulkan | 99 | 1 | 0 | tg128 | 17.71 ± 0.01 |
|
||||||
|
|
||||||
build: 1fe00296 (6182)
|
build: f1fbffb5 (6486)
|
||||||
|
|||||||
+2
-6
@@ -7,9 +7,5 @@ This message will be only be displayed once, unless the ROCBLAS_VERBOSE_HIPBLASL
|
|||||||
|
|
||||||
rocBLAS warning: hipBlasLT failed, falling back to tensile.
|
rocBLAS warning: hipBlasLT failed, falling back to tensile.
|
||||||
This message will be only be displayed once, unless the ROCBLAS_VERBOSE_TENSILE_ERROR environment variable is set.
|
This message will be only be displayed once, unless the ROCBLAS_VERBOSE_TENSILE_ERROR environment variable is set.
|
||||||
| model | size | params | backend | ngl | mmap | test | t/s |
|
HW Exception by GPU node-1 (Agent handle: 0x284c3340) reason :GPU Hang
|
||||||
| ------------------------------ | ---------: | ---------: | ---------- | --: | ---: | --------------: | -------------------: |
|
✖ ! [rocm6_4_3-rocwmma] Llama-3.3-70B-Instruct-UD-Q8_K_XL-00001-of-00002 failed (exit 134)
|
||||||
| llama 70B Q8_0 | 75.65 GiB | 70.55 B | ROCm | 99 | 0 | pp512 | 98.02 ± 0.18 |
|
|
||||||
| llama 70B Q8_0 | 75.65 GiB | 70.55 B | ROCm | 99 | 0 | tg128 | 2.77 ± 0.00 |
|
|
||||||
|
|
||||||
build: 1fe00296 (6182)
|
|
||||||
|
|||||||
+3
-3
@@ -4,7 +4,7 @@ ggml_cuda_init: found 1 ROCm devices:
|
|||||||
Device 0: Radeon 8060S Graphics, gfx1151 (0x1151), VMM: no, Wave Size: 32
|
Device 0: Radeon 8060S Graphics, gfx1151 (0x1151), VMM: no, Wave Size: 32
|
||||||
| model | size | params | backend | ngl | fa | mmap | test | t/s |
|
| model | size | params | backend | ngl | fa | mmap | test | t/s |
|
||||||
| ------------------------------ | ---------: | ---------: | ---------- | --: | -: | ---: | --------------: | -------------------: |
|
| ------------------------------ | ---------: | ---------: | ---------- | --: | -: | ---: | --------------: | -------------------: |
|
||||||
| llama 70B Q8_0 | 75.65 GiB | 70.55 B | ROCm | 99 | 1 | 0 | pp512 | 101.83 ± 0.11 |
|
| llama 70B Q8_0 | 75.65 GiB | 70.55 B | ROCm | 99 | 1 | 0 | pp512 | 101.82 ± 0.06 |
|
||||||
| llama 70B Q8_0 | 75.65 GiB | 70.55 B | ROCm | 99 | 1 | 0 | tg128 | 2.77 ± 0.00 |
|
| llama 70B Q8_0 | 75.65 GiB | 70.55 B | ROCm | 99 | 1 | 0 | tg128 | 2.78 ± 0.00 |
|
||||||
|
|
||||||
build: 1fe00296 (6182)
|
build: f1fbffb5 (6486)
|
||||||
|
|||||||
+1
-1
@@ -2,5 +2,5 @@ ggml_cuda_init: GGML_CUDA_FORCE_MMQ: no
|
|||||||
ggml_cuda_init: GGML_CUDA_FORCE_CUBLAS: no
|
ggml_cuda_init: GGML_CUDA_FORCE_CUBLAS: no
|
||||||
ggml_cuda_init: found 1 ROCm devices:
|
ggml_cuda_init: found 1 ROCm devices:
|
||||||
Device 0: Radeon 8060S Graphics, gfx1151 (0x1151), VMM: no, Wave Size: 32
|
Device 0: Radeon 8060S Graphics, gfx1151 (0x1151), VMM: no, Wave Size: 32
|
||||||
HW Exception by GPU node-1 (Agent handle: 0x21da1340) reason :GPU Hang
|
HW Exception by GPU node-1 (Agent handle: 0x7166340) reason :GPU Hang
|
||||||
✖ ! [rocm6_4_3-rocwmma] Llama-3.3-70B-Instruct-UD-Q8_K_XL-00001-of-00002 __hblt0 failed (exit 134)
|
✖ ! [rocm6_4_3-rocwmma] Llama-3.3-70B-Instruct-UD-Q8_K_XL-00001-of-00002 __hblt0 failed (exit 134)
|
||||||
|
|||||||
+1
-1
@@ -2,5 +2,5 @@ ggml_cuda_init: GGML_CUDA_FORCE_MMQ: no
|
|||||||
ggml_cuda_init: GGML_CUDA_FORCE_CUBLAS: no
|
ggml_cuda_init: GGML_CUDA_FORCE_CUBLAS: no
|
||||||
ggml_cuda_init: found 1 ROCm devices:
|
ggml_cuda_init: found 1 ROCm devices:
|
||||||
Device 0: Radeon 8060S Graphics, gfx1151 (0x1151), VMM: no, Wave Size: 32
|
Device 0: Radeon 8060S Graphics, gfx1151 (0x1151), VMM: no, Wave Size: 32
|
||||||
HW Exception by GPU node-1 (Agent handle: 0x15ac2340) reason :GPU Hang
|
HW Exception by GPU node-1 (Agent handle: 0x37f0e340) reason :GPU Hang
|
||||||
✖ ! [rocm6_4_3-rocwmma] Llama-3.3-70B-Instruct-UD-Q8_K_XL-00001-of-00002 __hblt0__fa1 failed (exit 134)
|
✖ ! [rocm6_4_3-rocwmma] Llama-3.3-70B-Instruct-UD-Q8_K_XL-00001-of-00002 __hblt0__fa1 failed (exit 134)
|
||||||
|
|||||||
@@ -9,7 +9,7 @@ rocBLAS warning: hipBlasLT failed, falling back to tensile.
|
|||||||
This message will be only be displayed once, unless the ROCBLAS_VERBOSE_TENSILE_ERROR environment variable is set.
|
This message will be only be displayed once, unless the ROCBLAS_VERBOSE_TENSILE_ERROR environment variable is set.
|
||||||
| model | size | params | backend | ngl | mmap | test | t/s |
|
| model | size | params | backend | ngl | mmap | test | t/s |
|
||||||
| ------------------------------ | ---------: | ---------: | ---------- | --: | ---: | --------------: | -------------------: |
|
| ------------------------------ | ---------: | ---------: | ---------- | --: | ---: | --------------: | -------------------: |
|
||||||
| llama 70B Q8_0 | 75.65 GiB | 70.55 B | ROCm | 99 | 0 | pp512 | 97.13 ± 0.17 |
|
| llama 70B Q8_0 | 75.65 GiB | 70.55 B | ROCm | 99 | 0 | pp512 | 94.79 ± 0.14 |
|
||||||
| llama 70B Q8_0 | 75.65 GiB | 70.55 B | ROCm | 99 | 0 | tg128 | 2.78 ± 0.00 |
|
| llama 70B Q8_0 | 75.65 GiB | 70.55 B | ROCm | 99 | 0 | tg128 | 2.78 ± 0.00 |
|
||||||
|
|
||||||
build: 1fe00296 (6182)
|
build: f1fbffb5 (6486)
|
||||||
|
|||||||
+2
-2
@@ -4,7 +4,7 @@ ggml_cuda_init: found 1 ROCm devices:
|
|||||||
Device 0: Radeon 8060S Graphics, gfx1151 (0x1151), VMM: no, Wave Size: 32
|
Device 0: Radeon 8060S Graphics, gfx1151 (0x1151), VMM: no, Wave Size: 32
|
||||||
| model | size | params | backend | ngl | fa | mmap | test | t/s |
|
| model | size | params | backend | ngl | fa | mmap | test | t/s |
|
||||||
| ------------------------------ | ---------: | ---------: | ---------- | --: | -: | ---: | --------------: | -------------------: |
|
| ------------------------------ | ---------: | ---------: | ---------- | --: | -: | ---: | --------------: | -------------------: |
|
||||||
| llama 70B Q8_0 | 75.65 GiB | 70.55 B | ROCm | 99 | 1 | 0 | pp512 | 80.42 ± 0.08 |
|
| llama 70B Q8_0 | 75.65 GiB | 70.55 B | ROCm | 99 | 1 | 0 | pp512 | 104.62 ± 0.08 |
|
||||||
| llama 70B Q8_0 | 75.65 GiB | 70.55 B | ROCm | 99 | 1 | 0 | tg128 | 2.78 ± 0.00 |
|
| llama 70B Q8_0 | 75.65 GiB | 70.55 B | ROCm | 99 | 1 | 0 | tg128 | 2.78 ± 0.00 |
|
||||||
|
|
||||||
build: 1fe00296 (6182)
|
build: f1fbffb5 (6486)
|
||||||
|
|||||||
+1
-1
@@ -2,5 +2,5 @@ ggml_cuda_init: GGML_CUDA_FORCE_MMQ: no
|
|||||||
ggml_cuda_init: GGML_CUDA_FORCE_CUBLAS: no
|
ggml_cuda_init: GGML_CUDA_FORCE_CUBLAS: no
|
||||||
ggml_cuda_init: found 1 ROCm devices:
|
ggml_cuda_init: found 1 ROCm devices:
|
||||||
Device 0: Radeon 8060S Graphics, gfx1151 (0x1151), VMM: no, Wave Size: 32
|
Device 0: Radeon 8060S Graphics, gfx1151 (0x1151), VMM: no, Wave Size: 32
|
||||||
HW Exception by GPU node-1 (Agent handle: 0x2c1e5340) reason :GPU Hang
|
HW Exception by GPU node-1 (Agent handle: 0x12cee310) reason :GPU Hang
|
||||||
✖ ! [rocm6_4_3] Llama-3.3-70B-Instruct-UD-Q8_K_XL-00001-of-00002 __hblt0 failed (exit 134)
|
✖ ! [rocm6_4_3] Llama-3.3-70B-Instruct-UD-Q8_K_XL-00001-of-00002 __hblt0 failed (exit 134)
|
||||||
|
|||||||
+1
-1
@@ -2,5 +2,5 @@ ggml_cuda_init: GGML_CUDA_FORCE_MMQ: no
|
|||||||
ggml_cuda_init: GGML_CUDA_FORCE_CUBLAS: no
|
ggml_cuda_init: GGML_CUDA_FORCE_CUBLAS: no
|
||||||
ggml_cuda_init: found 1 ROCm devices:
|
ggml_cuda_init: found 1 ROCm devices:
|
||||||
Device 0: Radeon 8060S Graphics, gfx1151 (0x1151), VMM: no, Wave Size: 32
|
Device 0: Radeon 8060S Graphics, gfx1151 (0x1151), VMM: no, Wave Size: 32
|
||||||
Memory access fault by GPU node-1 (Agent handle: 0x3e536340) on address 0x7f9182f6f000. Reason: Page not present or supervisor privilege.
|
Memory access fault by GPU node-1 (Agent handle: 0x367c310) on address 0x7fc07ad93000. Reason: Page not present or supervisor privilege.
|
||||||
✖ ! [rocm6_4_3] Llama-3.3-70B-Instruct-UD-Q8_K_XL-00001-of-00002 __hblt0__fa1 failed (exit 134)
|
✖ ! [rocm6_4_3] Llama-3.3-70B-Instruct-UD-Q8_K_XL-00001-of-00002 __hblt0__fa1 failed (exit 134)
|
||||||
|
|||||||
+10
@@ -0,0 +1,10 @@
|
|||||||
|
ggml_cuda_init: GGML_CUDA_FORCE_MMQ: no
|
||||||
|
ggml_cuda_init: GGML_CUDA_FORCE_CUBLAS: no
|
||||||
|
ggml_cuda_init: found 1 ROCm devices:
|
||||||
|
Device 0: AMD Radeon Graphics, gfx1151 (0x1151), VMM: no, Wave Size: 32
|
||||||
|
| model | size | params | backend | ngl | mmap | test | t/s |
|
||||||
|
| ------------------------------ | ---------: | ---------: | ---------- | --: | ---: | --------------: | -------------------: |
|
||||||
|
| llama 70B Q8_0 | 75.65 GiB | 70.55 B | ROCm | 99 | 0 | pp512 | 98.15 ± 0.16 |
|
||||||
|
| llama 70B Q8_0 | 75.65 GiB | 70.55 B | ROCm | 99 | 0 | tg128 | 2.77 ± 0.00 |
|
||||||
|
|
||||||
|
build: f1fbffb5 (6486)
|
||||||
+10
@@ -0,0 +1,10 @@
|
|||||||
|
ggml_cuda_init: GGML_CUDA_FORCE_MMQ: no
|
||||||
|
ggml_cuda_init: GGML_CUDA_FORCE_CUBLAS: no
|
||||||
|
ggml_cuda_init: found 1 ROCm devices:
|
||||||
|
Device 0: AMD Radeon Graphics, gfx1151 (0x1151), VMM: no, Wave Size: 32
|
||||||
|
| model | size | params | backend | ngl | fa | mmap | test | t/s |
|
||||||
|
| ------------------------------ | ---------: | ---------: | ---------- | --: | -: | ---: | --------------: | -------------------: |
|
||||||
|
| llama 70B Q8_0 | 75.65 GiB | 70.55 B | ROCm | 99 | 1 | 0 | pp512 | 102.79 ± 0.14 |
|
||||||
|
| llama 70B Q8_0 | 75.65 GiB | 70.55 B | ROCm | 99 | 1 | 0 | tg128 | 2.78 ± 0.00 |
|
||||||
|
|
||||||
|
build: f1fbffb5 (6486)
|
||||||
+10
@@ -0,0 +1,10 @@
|
|||||||
|
ggml_cuda_init: GGML_CUDA_FORCE_MMQ: no
|
||||||
|
ggml_cuda_init: GGML_CUDA_FORCE_CUBLAS: no
|
||||||
|
ggml_cuda_init: found 1 ROCm devices:
|
||||||
|
Device 0: AMD Radeon Graphics, gfx1151 (0x1151), VMM: no, Wave Size: 32
|
||||||
|
| model | size | params | backend | ngl | mmap | test | t/s |
|
||||||
|
| ------------------------------ | ---------: | ---------: | ---------- | --: | ---: | --------------: | -------------------: |
|
||||||
|
| llama 70B Q8_0 | 75.65 GiB | 70.55 B | ROCm | 99 | 0 | pp512 | 93.89 ± 0.22 |
|
||||||
|
| llama 70B Q8_0 | 75.65 GiB | 70.55 B | ROCm | 99 | 0 | tg128 | 2.78 ± 0.00 |
|
||||||
|
|
||||||
|
build: f1fbffb5 (6486)
|
||||||
+10
@@ -0,0 +1,10 @@
|
|||||||
|
ggml_cuda_init: GGML_CUDA_FORCE_MMQ: no
|
||||||
|
ggml_cuda_init: GGML_CUDA_FORCE_CUBLAS: no
|
||||||
|
ggml_cuda_init: found 1 ROCm devices:
|
||||||
|
Device 0: AMD Radeon Graphics, gfx1151 (0x1151), VMM: no, Wave Size: 32
|
||||||
|
| model | size | params | backend | ngl | fa | mmap | test | t/s |
|
||||||
|
| ------------------------------ | ---------: | ---------: | ---------- | --: | -: | ---: | --------------: | -------------------: |
|
||||||
|
| llama 70B Q8_0 | 75.65 GiB | 70.55 B | ROCm | 99 | 1 | 0 | pp512 | 97.53 ± 0.17 |
|
||||||
|
| llama 70B Q8_0 | 75.65 GiB | 70.55 B | ROCm | 99 | 1 | 0 | tg128 | 2.78 ± 0.00 |
|
||||||
|
|
||||||
|
build: f1fbffb5 (6486)
|
||||||
+2
-2
@@ -4,7 +4,7 @@ ggml_cuda_init: found 1 ROCm devices:
|
|||||||
Device 0: AMD Radeon Graphics, gfx1151 (0x1151), VMM: no, Wave Size: 32
|
Device 0: AMD Radeon Graphics, gfx1151 (0x1151), VMM: no, Wave Size: 32
|
||||||
| model | size | params | backend | ngl | mmap | test | t/s |
|
| model | size | params | backend | ngl | mmap | test | t/s |
|
||||||
| ------------------------------ | ---------: | ---------: | ---------- | --: | ---: | --------------: | -------------------: |
|
| ------------------------------ | ---------: | ---------: | ---------- | --: | ---: | --------------: | -------------------: |
|
||||||
| llama 70B Q8_0 | 75.65 GiB | 70.55 B | ROCm | 99 | 0 | pp512 | 97.31 ± 0.20 |
|
| llama 70B Q8_0 | 75.65 GiB | 70.55 B | ROCm | 99 | 0 | pp512 | 97.42 ± 0.12 |
|
||||||
| llama 70B Q8_0 | 75.65 GiB | 70.55 B | ROCm | 99 | 0 | tg128 | 2.78 ± 0.00 |
|
| llama 70B Q8_0 | 75.65 GiB | 70.55 B | ROCm | 99 | 0 | tg128 | 2.78 ± 0.00 |
|
||||||
|
|
||||||
build: de219279 (6181)
|
build: f1fbffb5 (6486)
|
||||||
|
|||||||
+3
-3
@@ -4,7 +4,7 @@ ggml_cuda_init: found 1 ROCm devices:
|
|||||||
Device 0: AMD Radeon Graphics, gfx1151 (0x1151), VMM: no, Wave Size: 32
|
Device 0: AMD Radeon Graphics, gfx1151 (0x1151), VMM: no, Wave Size: 32
|
||||||
| model | size | params | backend | ngl | fa | mmap | test | t/s |
|
| model | size | params | backend | ngl | fa | mmap | test | t/s |
|
||||||
| ------------------------------ | ---------: | ---------: | ---------- | --: | -: | ---: | --------------: | -------------------: |
|
| ------------------------------ | ---------: | ---------: | ---------- | --: | -: | ---: | --------------: | -------------------: |
|
||||||
| llama 70B Q8_0 | 75.65 GiB | 70.55 B | ROCm | 99 | 1 | 0 | pp512 | 100.85 ± 0.13 |
|
| llama 70B Q8_0 | 75.65 GiB | 70.55 B | ROCm | 99 | 1 | 0 | pp512 | 101.56 ± 0.04 |
|
||||||
| llama 70B Q8_0 | 75.65 GiB | 70.55 B | ROCm | 99 | 1 | 0 | tg128 | 2.77 ± 0.00 |
|
| llama 70B Q8_0 | 75.65 GiB | 70.55 B | ROCm | 99 | 1 | 0 | tg128 | 2.78 ± 0.00 |
|
||||||
|
|
||||||
build: de219279 (6181)
|
build: f1fbffb5 (6486)
|
||||||
|
|||||||
+2
-2
@@ -4,7 +4,7 @@ ggml_cuda_init: found 1 ROCm devices:
|
|||||||
Device 0: AMD Radeon Graphics, gfx1151 (0x1151), VMM: no, Wave Size: 32
|
Device 0: AMD Radeon Graphics, gfx1151 (0x1151), VMM: no, Wave Size: 32
|
||||||
| model | size | params | backend | ngl | mmap | test | t/s |
|
| model | size | params | backend | ngl | mmap | test | t/s |
|
||||||
| ------------------------------ | ---------: | ---------: | ---------- | --: | ---: | --------------: | -------------------: |
|
| ------------------------------ | ---------: | ---------: | ---------- | --: | ---: | --------------: | -------------------: |
|
||||||
| llama 70B Q8_0 | 75.65 GiB | 70.55 B | ROCm | 99 | 0 | pp512 | 93.00 ± 0.22 |
|
| llama 70B Q8_0 | 75.65 GiB | 70.55 B | ROCm | 99 | 0 | pp512 | 92.02 ± 0.17 |
|
||||||
| llama 70B Q8_0 | 75.65 GiB | 70.55 B | ROCm | 99 | 0 | tg128 | 2.78 ± 0.00 |
|
| llama 70B Q8_0 | 75.65 GiB | 70.55 B | ROCm | 99 | 0 | tg128 | 2.78 ± 0.00 |
|
||||||
|
|
||||||
build: de219279 (6181)
|
build: f1fbffb5 (6486)
|
||||||
|
|||||||
+3
-3
@@ -4,7 +4,7 @@ ggml_cuda_init: found 1 ROCm devices:
|
|||||||
Device 0: AMD Radeon Graphics, gfx1151 (0x1151), VMM: no, Wave Size: 32
|
Device 0: AMD Radeon Graphics, gfx1151 (0x1151), VMM: no, Wave Size: 32
|
||||||
| model | size | params | backend | ngl | fa | mmap | test | t/s |
|
| model | size | params | backend | ngl | fa | mmap | test | t/s |
|
||||||
| ------------------------------ | ---------: | ---------: | ---------- | --: | -: | ---: | --------------: | -------------------: |
|
| ------------------------------ | ---------: | ---------: | ---------- | --: | -: | ---: | --------------: | -------------------: |
|
||||||
| llama 70B Q8_0 | 75.65 GiB | 70.55 B | ROCm | 99 | 1 | 0 | pp512 | 97.88 ± 0.09 |
|
| llama 70B Q8_0 | 75.65 GiB | 70.55 B | ROCm | 99 | 1 | 0 | pp512 | 97.10 ± 0.17 |
|
||||||
| llama 70B Q8_0 | 75.65 GiB | 70.55 B | ROCm | 99 | 1 | 0 | tg128 | 2.77 ± 0.00 |
|
| llama 70B Q8_0 | 75.65 GiB | 70.55 B | ROCm | 99 | 1 | 0 | tg128 | 2.78 ± 0.00 |
|
||||||
|
|
||||||
build: de219279 (6181)
|
build: f1fbffb5 (6486)
|
||||||
|
|||||||
@@ -4,7 +4,7 @@ ggml_cuda_init: found 1 ROCm devices:
|
|||||||
Device 0: AMD Radeon Graphics, gfx1151 (0x1151), VMM: no, Wave Size: 32
|
Device 0: AMD Radeon Graphics, gfx1151 (0x1151), VMM: no, Wave Size: 32
|
||||||
| model | size | params | backend | ngl | mmap | test | t/s |
|
| model | size | params | backend | ngl | mmap | test | t/s |
|
||||||
| ------------------------------ | ---------: | ---------: | ---------- | --: | ---: | --------------: | -------------------: |
|
| ------------------------------ | ---------: | ---------: | ---------- | --: | ---: | --------------: | -------------------: |
|
||||||
| llama 70B Q8_0 | 75.65 GiB | 70.55 B | ROCm | 99 | 0 | pp512 | 99.41 ± 0.36 |
|
| llama 70B Q8_0 | 75.65 GiB | 70.55 B | ROCm | 99 | 0 | pp512 | 95.12 ± 0.17 |
|
||||||
| llama 70B Q8_0 | 75.65 GiB | 70.55 B | ROCm | 99 | 0 | tg128 | 2.77 ± 0.00 |
|
| llama 70B Q8_0 | 75.65 GiB | 70.55 B | ROCm | 99 | 0 | tg128 | 2.77 ± 0.00 |
|
||||||
|
|
||||||
build: de219279 (6181)
|
build: f1fbffb5 (6486)
|
||||||
|
|||||||
+6
-2
@@ -2,5 +2,9 @@ ggml_cuda_init: GGML_CUDA_FORCE_MMQ: no
|
|||||||
ggml_cuda_init: GGML_CUDA_FORCE_CUBLAS: no
|
ggml_cuda_init: GGML_CUDA_FORCE_CUBLAS: no
|
||||||
ggml_cuda_init: found 1 ROCm devices:
|
ggml_cuda_init: found 1 ROCm devices:
|
||||||
Device 0: AMD Radeon Graphics, gfx1151 (0x1151), VMM: no, Wave Size: 32
|
Device 0: AMD Radeon Graphics, gfx1151 (0x1151), VMM: no, Wave Size: 32
|
||||||
Memory access fault by GPU node-1 (Agent handle: 0x1f66bec0) on address 0x7f3e84b6f000. Reason: Page not present or supervisor privilege.
|
| model | size | params | backend | ngl | fa | mmap | test | t/s |
|
||||||
✖ ! [rocm7_rc] Llama-3.3-70B-Instruct-UD-Q8_K_XL-00001-of-00002 __fa1 failed (exit 134)
|
| ------------------------------ | ---------: | ---------: | ---------- | --: | -: | ---: | --------------: | -------------------: |
|
||||||
|
| llama 70B Q8_0 | 75.65 GiB | 70.55 B | ROCm | 99 | 1 | 0 | pp512 | 103.16 ± 0.07 |
|
||||||
|
| llama 70B Q8_0 | 75.65 GiB | 70.55 B | ROCm | 99 | 1 | 0 | tg128 | 2.78 ± 0.01 |
|
||||||
|
|
||||||
|
build: f1fbffb5 (6486)
|
||||||
|
|||||||
+2
-2
@@ -4,7 +4,7 @@ ggml_cuda_init: found 1 ROCm devices:
|
|||||||
Device 0: AMD Radeon Graphics, gfx1151 (0x1151), VMM: no, Wave Size: 32
|
Device 0: AMD Radeon Graphics, gfx1151 (0x1151), VMM: no, Wave Size: 32
|
||||||
| model | size | params | backend | ngl | mmap | test | t/s |
|
| model | size | params | backend | ngl | mmap | test | t/s |
|
||||||
| ------------------------------ | ---------: | ---------: | ---------- | --: | ---: | --------------: | -------------------: |
|
| ------------------------------ | ---------: | ---------: | ---------- | --: | ---: | --------------: | -------------------: |
|
||||||
| llama 70B Q8_0 | 75.65 GiB | 70.55 B | ROCm | 99 | 0 | pp512 | 94.06 ± 0.09 |
|
| llama 70B Q8_0 | 75.65 GiB | 70.55 B | ROCm | 99 | 0 | pp512 | 93.86 ± 0.18 |
|
||||||
| llama 70B Q8_0 | 75.65 GiB | 70.55 B | ROCm | 99 | 0 | tg128 | 2.78 ± 0.00 |
|
| llama 70B Q8_0 | 75.65 GiB | 70.55 B | ROCm | 99 | 0 | tg128 | 2.78 ± 0.00 |
|
||||||
|
|
||||||
build: de219279 (6181)
|
build: f1fbffb5 (6486)
|
||||||
|
|||||||
+6
-2
@@ -2,5 +2,9 @@ ggml_cuda_init: GGML_CUDA_FORCE_MMQ: no
|
|||||||
ggml_cuda_init: GGML_CUDA_FORCE_CUBLAS: no
|
ggml_cuda_init: GGML_CUDA_FORCE_CUBLAS: no
|
||||||
ggml_cuda_init: found 1 ROCm devices:
|
ggml_cuda_init: found 1 ROCm devices:
|
||||||
Device 0: AMD Radeon Graphics, gfx1151 (0x1151), VMM: no, Wave Size: 32
|
Device 0: AMD Radeon Graphics, gfx1151 (0x1151), VMM: no, Wave Size: 32
|
||||||
Memory access fault by GPU node-1 (Agent handle: 0xac09ec0) on address 0x7f283f56f000. Reason: Page not present or supervisor privilege.
|
| model | size | params | backend | ngl | fa | mmap | test | t/s |
|
||||||
✖ ! [rocm7_rc] Llama-3.3-70B-Instruct-UD-Q8_K_XL-00001-of-00002 __hblt0__fa1 failed (exit 134)
|
| ------------------------------ | ---------: | ---------: | ---------- | --: | -: | ---: | --------------: | -------------------: |
|
||||||
|
| llama 70B Q8_0 | 75.65 GiB | 70.55 B | ROCm | 99 | 1 | 0 | pp512 | 95.87 ± 0.08 |
|
||||||
|
| llama 70B Q8_0 | 75.65 GiB | 70.55 B | ROCm | 99 | 1 | 0 | tg128 | 2.78 ± 0.00 |
|
||||||
|
|
||||||
|
build: f1fbffb5 (6486)
|
||||||
|
|||||||
+3
-3
@@ -2,7 +2,7 @@ ggml_vulkan: Found 1 Vulkan devices:
|
|||||||
ggml_vulkan: 0 = Radeon 8060S Graphics (AMD open-source driver) | uma: 1 | fp16: 1 | bf16: 0 | warp size: 64 | shared memory: 32768 | int dot: 1 | matrix cores: KHR_coopmat
|
ggml_vulkan: 0 = Radeon 8060S Graphics (AMD open-source driver) | uma: 1 | fp16: 1 | bf16: 0 | warp size: 64 | shared memory: 32768 | int dot: 1 | matrix cores: KHR_coopmat
|
||||||
| model | size | params | backend | ngl | mmap | test | t/s |
|
| model | size | params | backend | ngl | mmap | test | t/s |
|
||||||
| ------------------------------ | ---------: | ---------: | ---------- | --: | ---: | --------------: | -------------------: |
|
| ------------------------------ | ---------: | ---------: | ---------- | --: | ---: | --------------: | -------------------: |
|
||||||
| llama 70B Q8_0 | 75.65 GiB | 70.55 B | Vulkan | 99 | 0 | pp512 | 98.03 ± 0.24 |
|
| llama 70B Q8_0 | 75.65 GiB | 70.55 B | Vulkan | 99 | 0 | pp512 | 97.72 ± 0.36 |
|
||||||
| llama 70B Q8_0 | 75.65 GiB | 70.55 B | Vulkan | 99 | 0 | tg128 | 2.78 ± 0.00 |
|
| llama 70B Q8_0 | 75.65 GiB | 70.55 B | Vulkan | 99 | 0 | tg128 | 2.81 ± 0.00 |
|
||||||
|
|
||||||
build: 1fe00296 (6182)
|
build: f1fbffb5 (6486)
|
||||||
|
|||||||
+3
-3
@@ -2,7 +2,7 @@ ggml_vulkan: Found 1 Vulkan devices:
|
|||||||
ggml_vulkan: 0 = Radeon 8060S Graphics (AMD open-source driver) | uma: 1 | fp16: 1 | bf16: 0 | warp size: 64 | shared memory: 32768 | int dot: 1 | matrix cores: KHR_coopmat
|
ggml_vulkan: 0 = Radeon 8060S Graphics (AMD open-source driver) | uma: 1 | fp16: 1 | bf16: 0 | warp size: 64 | shared memory: 32768 | int dot: 1 | matrix cores: KHR_coopmat
|
||||||
| model | size | params | backend | ngl | fa | mmap | test | t/s |
|
| model | size | params | backend | ngl | fa | mmap | test | t/s |
|
||||||
| ------------------------------ | ---------: | ---------: | ---------- | --: | -: | ---: | --------------: | -------------------: |
|
| ------------------------------ | ---------: | ---------: | ---------- | --: | -: | ---: | --------------: | -------------------: |
|
||||||
| llama 70B Q8_0 | 75.65 GiB | 70.55 B | Vulkan | 99 | 1 | 0 | pp512 | 99.12 ± 0.25 |
|
| llama 70B Q8_0 | 75.65 GiB | 70.55 B | Vulkan | 99 | 1 | 0 | pp512 | 99.04 ± 0.31 |
|
||||||
| llama 70B Q8_0 | 75.65 GiB | 70.55 B | Vulkan | 99 | 1 | 0 | tg128 | 2.77 ± 0.00 |
|
| llama 70B Q8_0 | 75.65 GiB | 70.55 B | Vulkan | 99 | 1 | 0 | tg128 | 2.80 ± 0.00 |
|
||||||
|
|
||||||
build: 1fe00296 (6182)
|
build: f1fbffb5 (6486)
|
||||||
|
|||||||
@@ -2,7 +2,7 @@ ggml_vulkan: Found 1 Vulkan devices:
|
|||||||
ggml_vulkan: 0 = Radeon 8060S Graphics (RADV GFX1151) (radv) | uma: 1 | fp16: 1 | bf16: 0 | warp size: 64 | shared memory: 65536 | int dot: 1 | matrix cores: KHR_coopmat
|
ggml_vulkan: 0 = Radeon 8060S Graphics (RADV GFX1151) (radv) | uma: 1 | fp16: 1 | bf16: 0 | warp size: 64 | shared memory: 65536 | int dot: 1 | matrix cores: KHR_coopmat
|
||||||
| model | size | params | backend | ngl | mmap | test | t/s |
|
| model | size | params | backend | ngl | mmap | test | t/s |
|
||||||
| ------------------------------ | ---------: | ---------: | ---------- | --: | ---: | --------------: | -------------------: |
|
| ------------------------------ | ---------: | ---------: | ---------- | --: | ---: | --------------: | -------------------: |
|
||||||
| llama 70B Q8_0 | 75.65 GiB | 70.55 B | Vulkan | 99 | 0 | pp512 | 75.59 ± 0.28 |
|
| llama 70B Q8_0 | 75.65 GiB | 70.55 B | Vulkan | 99 | 0 | pp512 | 78.94 ± 0.51 |
|
||||||
| llama 70B Q8_0 | 75.65 GiB | 70.55 B | Vulkan | 99 | 0 | tg128 | 2.78 ± 0.00 |
|
| llama 70B Q8_0 | 75.65 GiB | 70.55 B | Vulkan | 99 | 0 | tg128 | 2.78 ± 0.00 |
|
||||||
|
|
||||||
build: 1fe00296 (6182)
|
build: f1fbffb5 (6486)
|
||||||
|
|||||||
+2
-2
@@ -2,7 +2,7 @@ ggml_vulkan: Found 1 Vulkan devices:
|
|||||||
ggml_vulkan: 0 = Radeon 8060S Graphics (RADV GFX1151) (radv) | uma: 1 | fp16: 1 | bf16: 0 | warp size: 64 | shared memory: 65536 | int dot: 1 | matrix cores: KHR_coopmat
|
ggml_vulkan: 0 = Radeon 8060S Graphics (RADV GFX1151) (radv) | uma: 1 | fp16: 1 | bf16: 0 | warp size: 64 | shared memory: 65536 | int dot: 1 | matrix cores: KHR_coopmat
|
||||||
| model | size | params | backend | ngl | fa | mmap | test | t/s |
|
| model | size | params | backend | ngl | fa | mmap | test | t/s |
|
||||||
| ------------------------------ | ---------: | ---------: | ---------- | --: | -: | ---: | --------------: | -------------------: |
|
| ------------------------------ | ---------: | ---------: | ---------- | --: | -: | ---: | --------------: | -------------------: |
|
||||||
| llama 70B Q8_0 | 75.65 GiB | 70.55 B | Vulkan | 99 | 1 | 0 | pp512 | 80.09 ± 0.38 |
|
| llama 70B Q8_0 | 75.65 GiB | 70.55 B | Vulkan | 99 | 1 | 0 | pp512 | 80.90 ± 0.77 |
|
||||||
| llama 70B Q8_0 | 75.65 GiB | 70.55 B | Vulkan | 99 | 1 | 0 | tg128 | 2.78 ± 0.00 |
|
| llama 70B Q8_0 | 75.65 GiB | 70.55 B | Vulkan | 99 | 1 | 0 | tg128 | 2.78 ± 0.00 |
|
||||||
|
|
||||||
build: 1fe00296 (6182)
|
build: f1fbffb5 (6486)
|
||||||
|
|||||||
+6
-2
@@ -7,5 +7,9 @@ This message will be only be displayed once, unless the ROCBLAS_VERBOSE_HIPBLASL
|
|||||||
|
|
||||||
rocBLAS warning: hipBlasLT failed, falling back to tensile.
|
rocBLAS warning: hipBlasLT failed, falling back to tensile.
|
||||||
This message will be only be displayed once, unless the ROCBLAS_VERBOSE_TENSILE_ERROR environment variable is set.
|
This message will be only be displayed once, unless the ROCBLAS_VERBOSE_TENSILE_ERROR environment variable is set.
|
||||||
Memory access fault by GPU node-1 (Agent handle: 0x1a840340) on address 0x7f3babb56000. Reason: Page not present or supervisor privilege.
|
| model | size | params | backend | ngl | mmap | test | t/s |
|
||||||
✖ ! [rocm6_4_3-rocwmma] Llama-4-Scout-17B-16E-Instruct-Q6_K-00001-of-00002 failed (exit 134)
|
| ------------------------------ | ---------: | ---------: | ---------- | --: | ---: | --------------: | -------------------: |
|
||||||
|
| llama4 17Bx16E (Scout) Q6_K | 82.35 GiB | 107.77 B | ROCm | 99 | 0 | pp512 | 265.76 ± 0.95 |
|
||||||
|
| llama4 17Bx16E (Scout) Q6_K | 82.35 GiB | 107.77 B | ROCm | 99 | 0 | tg128 | 14.69 ± 0.00 |
|
||||||
|
|
||||||
|
build: f1fbffb5 (6486)
|
||||||
|
|||||||
+3
-3
@@ -4,7 +4,7 @@ ggml_cuda_init: found 1 ROCm devices:
|
|||||||
Device 0: Radeon 8060S Graphics, gfx1151 (0x1151), VMM: no, Wave Size: 32
|
Device 0: Radeon 8060S Graphics, gfx1151 (0x1151), VMM: no, Wave Size: 32
|
||||||
| model | size | params | backend | ngl | fa | mmap | test | t/s |
|
| model | size | params | backend | ngl | fa | mmap | test | t/s |
|
||||||
| ------------------------------ | ---------: | ---------: | ---------- | --: | -: | ---: | --------------: | -------------------: |
|
| ------------------------------ | ---------: | ---------: | ---------- | --: | -: | ---: | --------------: | -------------------: |
|
||||||
| llama4 17Bx16E (Scout) Q6_K | 82.35 GiB | 107.77 B | ROCm | 99 | 1 | 0 | pp512 | 291.08 ± 1.26 |
|
| llama4 17Bx16E (Scout) Q6_K | 82.35 GiB | 107.77 B | ROCm | 99 | 1 | 0 | pp512 | 289.14 ± 1.57 |
|
||||||
| llama4 17Bx16E (Scout) Q6_K | 82.35 GiB | 107.77 B | ROCm | 99 | 1 | 0 | tg128 | 14.53 ± 0.00 |
|
| llama4 17Bx16E (Scout) Q6_K | 82.35 GiB | 107.77 B | ROCm | 99 | 1 | 0 | tg128 | 14.64 ± 0.15 |
|
||||||
|
|
||||||
build: 1fe00296 (6182)
|
build: f1fbffb5 (6486)
|
||||||
|
|||||||
+2
-6
@@ -2,9 +2,5 @@ ggml_cuda_init: GGML_CUDA_FORCE_MMQ: no
|
|||||||
ggml_cuda_init: GGML_CUDA_FORCE_CUBLAS: no
|
ggml_cuda_init: GGML_CUDA_FORCE_CUBLAS: no
|
||||||
ggml_cuda_init: found 1 ROCm devices:
|
ggml_cuda_init: found 1 ROCm devices:
|
||||||
Device 0: Radeon 8060S Graphics, gfx1151 (0x1151), VMM: no, Wave Size: 32
|
Device 0: Radeon 8060S Graphics, gfx1151 (0x1151), VMM: no, Wave Size: 32
|
||||||
| model | size | params | backend | ngl | mmap | test | t/s |
|
HW Exception by GPU node-1 (Agent handle: 0x24187340) reason :GPU Hang
|
||||||
| ------------------------------ | ---------: | ---------: | ---------- | --: | ---: | --------------: | -------------------: |
|
✖ ! [rocm6_4_3-rocwmma] Llama-4-Scout-17B-16E-Instruct-Q6_K-00001-of-00002 __hblt0 failed (exit 134)
|
||||||
| llama4 17Bx16E (Scout) Q6_K | 82.35 GiB | 107.77 B | ROCm | 99 | 0 | pp512 | 134.19 ± 1.49 |
|
|
||||||
| llama4 17Bx16E (Scout) Q6_K | 82.35 GiB | 107.77 B | ROCm | 99 | 0 | tg128 | 14.56 ± 0.01 |
|
|
||||||
|
|
||||||
build: 1fe00296 (6182)
|
|
||||||
|
|||||||
+1
-1
@@ -2,5 +2,5 @@ ggml_cuda_init: GGML_CUDA_FORCE_MMQ: no
|
|||||||
ggml_cuda_init: GGML_CUDA_FORCE_CUBLAS: no
|
ggml_cuda_init: GGML_CUDA_FORCE_CUBLAS: no
|
||||||
ggml_cuda_init: found 1 ROCm devices:
|
ggml_cuda_init: found 1 ROCm devices:
|
||||||
Device 0: Radeon 8060S Graphics, gfx1151 (0x1151), VMM: no, Wave Size: 32
|
Device 0: Radeon 8060S Graphics, gfx1151 (0x1151), VMM: no, Wave Size: 32
|
||||||
HW Exception by GPU node-1 (Agent handle: 0x1de78340) reason :GPU Hang
|
HW Exception by GPU node-1 (Agent handle: 0x3da9340) reason :GPU Hang
|
||||||
✖ ! [rocm6_4_3-rocwmma] Llama-4-Scout-17B-16E-Instruct-Q6_K-00001-of-00002 __hblt0__fa1 failed (exit 134)
|
✖ ! [rocm6_4_3-rocwmma] Llama-4-Scout-17B-16E-Instruct-Q6_K-00001-of-00002 __hblt0__fa1 failed (exit 134)
|
||||||
|
|||||||
@@ -7,9 +7,5 @@ This message will be only be displayed once, unless the ROCBLAS_VERBOSE_HIPBLASL
|
|||||||
|
|
||||||
rocBLAS warning: hipBlasLT failed, falling back to tensile.
|
rocBLAS warning: hipBlasLT failed, falling back to tensile.
|
||||||
This message will be only be displayed once, unless the ROCBLAS_VERBOSE_TENSILE_ERROR environment variable is set.
|
This message will be only be displayed once, unless the ROCBLAS_VERBOSE_TENSILE_ERROR environment variable is set.
|
||||||
| model | size | params | backend | ngl | mmap | test | t/s |
|
HW Exception by GPU node-1 (Agent handle: 0x11bc3310) reason :GPU Hang
|
||||||
| ------------------------------ | ---------: | ---------: | ---------- | --: | ---: | --------------: | -------------------: |
|
✖ ! [rocm6_4_3] Llama-4-Scout-17B-16E-Instruct-Q6_K-00001-of-00002 failed (exit 134)
|
||||||
| llama4 17Bx16E (Scout) Q6_K | 82.35 GiB | 107.77 B | ROCm | 99 | 0 | pp512 | 270.28 ± 1.29 |
|
|
||||||
| llama4 17Bx16E (Scout) Q6_K | 82.35 GiB | 107.77 B | ROCm | 99 | 0 | tg128 | 14.58 ± 0.03 |
|
|
||||||
|
|
||||||
build: 1fe00296 (6182)
|
|
||||||
|
|||||||
+6
-2
@@ -2,5 +2,9 @@ ggml_cuda_init: GGML_CUDA_FORCE_MMQ: no
|
|||||||
ggml_cuda_init: GGML_CUDA_FORCE_CUBLAS: no
|
ggml_cuda_init: GGML_CUDA_FORCE_CUBLAS: no
|
||||||
ggml_cuda_init: found 1 ROCm devices:
|
ggml_cuda_init: found 1 ROCm devices:
|
||||||
Device 0: Radeon 8060S Graphics, gfx1151 (0x1151), VMM: no, Wave Size: 32
|
Device 0: Radeon 8060S Graphics, gfx1151 (0x1151), VMM: no, Wave Size: 32
|
||||||
Memory access fault by GPU node-1 (Agent handle: 0x2162b340) on address 0x7f500556f000. Reason: Page not present or supervisor privilege.
|
| model | size | params | backend | ngl | fa | mmap | test | t/s |
|
||||||
✖ ! [rocm6_4_3] Llama-4-Scout-17B-16E-Instruct-Q6_K-00001-of-00002 __fa1 failed (exit 134)
|
| ------------------------------ | ---------: | ---------: | ---------- | --: | -: | ---: | --------------: | -------------------: |
|
||||||
|
| llama4 17Bx16E (Scout) Q6_K | 82.35 GiB | 107.77 B | ROCm | 99 | 1 | 0 | pp512 | 291.67 ± 0.91 |
|
||||||
|
| llama4 17Bx16E (Scout) Q6_K | 82.35 GiB | 107.77 B | ROCm | 99 | 1 | 0 | tg128 | 14.71 ± 0.00 |
|
||||||
|
|
||||||
|
build: f1fbffb5 (6486)
|
||||||
|
|||||||
+1
-1
@@ -2,5 +2,5 @@ ggml_cuda_init: GGML_CUDA_FORCE_MMQ: no
|
|||||||
ggml_cuda_init: GGML_CUDA_FORCE_CUBLAS: no
|
ggml_cuda_init: GGML_CUDA_FORCE_CUBLAS: no
|
||||||
ggml_cuda_init: found 1 ROCm devices:
|
ggml_cuda_init: found 1 ROCm devices:
|
||||||
Device 0: Radeon 8060S Graphics, gfx1151 (0x1151), VMM: no, Wave Size: 32
|
Device 0: Radeon 8060S Graphics, gfx1151 (0x1151), VMM: no, Wave Size: 32
|
||||||
HW Exception by GPU node-1 (Agent handle: 0xdacf340) reason :GPU Hang
|
HW Exception by GPU node-1 (Agent handle: 0x8a0a310) reason :GPU Hang
|
||||||
✖ ! [rocm6_4_3] Llama-4-Scout-17B-16E-Instruct-Q6_K-00001-of-00002 __hblt0 failed (exit 134)
|
✖ ! [rocm6_4_3] Llama-4-Scout-17B-16E-Instruct-Q6_K-00001-of-00002 __hblt0 failed (exit 134)
|
||||||
|
|||||||
+1
-1
@@ -2,5 +2,5 @@ ggml_cuda_init: GGML_CUDA_FORCE_MMQ: no
|
|||||||
ggml_cuda_init: GGML_CUDA_FORCE_CUBLAS: no
|
ggml_cuda_init: GGML_CUDA_FORCE_CUBLAS: no
|
||||||
ggml_cuda_init: found 1 ROCm devices:
|
ggml_cuda_init: found 1 ROCm devices:
|
||||||
Device 0: Radeon 8060S Graphics, gfx1151 (0x1151), VMM: no, Wave Size: 32
|
Device 0: Radeon 8060S Graphics, gfx1151 (0x1151), VMM: no, Wave Size: 32
|
||||||
HW Exception by GPU node-1 (Agent handle: 0x3dc00340) reason :GPU Hang
|
HW Exception by GPU node-1 (Agent handle: 0x1ada6310) reason :GPU Hang
|
||||||
✖ ! [rocm6_4_3] Llama-4-Scout-17B-16E-Instruct-Q6_K-00001-of-00002 __hblt0__fa1 failed (exit 134)
|
✖ ! [rocm6_4_3] Llama-4-Scout-17B-16E-Instruct-Q6_K-00001-of-00002 __hblt0__fa1 failed (exit 134)
|
||||||
|
|||||||
+10
@@ -0,0 +1,10 @@
|
|||||||
|
ggml_cuda_init: GGML_CUDA_FORCE_MMQ: no
|
||||||
|
ggml_cuda_init: GGML_CUDA_FORCE_CUBLAS: no
|
||||||
|
ggml_cuda_init: found 1 ROCm devices:
|
||||||
|
Device 0: AMD Radeon Graphics, gfx1151 (0x1151), VMM: no, Wave Size: 32
|
||||||
|
| model | size | params | backend | ngl | mmap | test | t/s |
|
||||||
|
| ------------------------------ | ---------: | ---------: | ---------- | --: | ---: | --------------: | -------------------: |
|
||||||
|
| llama4 17Bx16E (Scout) Q6_K | 82.35 GiB | 107.77 B | ROCm | 99 | 0 | pp512 | 276.44 ± 1.46 |
|
||||||
|
| llama4 17Bx16E (Scout) Q6_K | 82.35 GiB | 107.77 B | ROCm | 99 | 0 | tg128 | 14.55 ± 0.00 |
|
||||||
|
|
||||||
|
build: f1fbffb5 (6486)
|
||||||
+10
@@ -0,0 +1,10 @@
|
|||||||
|
ggml_cuda_init: GGML_CUDA_FORCE_MMQ: no
|
||||||
|
ggml_cuda_init: GGML_CUDA_FORCE_CUBLAS: no
|
||||||
|
ggml_cuda_init: found 1 ROCm devices:
|
||||||
|
Device 0: AMD Radeon Graphics, gfx1151 (0x1151), VMM: no, Wave Size: 32
|
||||||
|
| model | size | params | backend | ngl | fa | mmap | test | t/s |
|
||||||
|
| ------------------------------ | ---------: | ---------: | ---------- | --: | -: | ---: | --------------: | -------------------: |
|
||||||
|
| llama4 17Bx16E (Scout) Q6_K | 82.35 GiB | 107.77 B | ROCm | 99 | 1 | 0 | pp512 | 292.67 ± 1.04 |
|
||||||
|
| llama4 17Bx16E (Scout) Q6_K | 82.35 GiB | 107.77 B | ROCm | 99 | 1 | 0 | tg128 | 14.71 ± 0.00 |
|
||||||
|
|
||||||
|
build: f1fbffb5 (6486)
|
||||||
+10
@@ -0,0 +1,10 @@
|
|||||||
|
ggml_cuda_init: GGML_CUDA_FORCE_MMQ: no
|
||||||
|
ggml_cuda_init: GGML_CUDA_FORCE_CUBLAS: no
|
||||||
|
ggml_cuda_init: found 1 ROCm devices:
|
||||||
|
Device 0: AMD Radeon Graphics, gfx1151 (0x1151), VMM: no, Wave Size: 32
|
||||||
|
| model | size | params | backend | ngl | mmap | test | t/s |
|
||||||
|
| ------------------------------ | ---------: | ---------: | ---------- | --: | ---: | --------------: | -------------------: |
|
||||||
|
| llama4 17Bx16E (Scout) Q6_K | 82.35 GiB | 107.77 B | ROCm | 99 | 0 | pp512 | 273.88 ± 1.14 |
|
||||||
|
| llama4 17Bx16E (Scout) Q6_K | 82.35 GiB | 107.77 B | ROCm | 99 | 0 | tg128 | 14.70 ± 0.00 |
|
||||||
|
|
||||||
|
build: f1fbffb5 (6486)
|
||||||
+10
@@ -0,0 +1,10 @@
|
|||||||
|
ggml_cuda_init: GGML_CUDA_FORCE_MMQ: no
|
||||||
|
ggml_cuda_init: GGML_CUDA_FORCE_CUBLAS: no
|
||||||
|
ggml_cuda_init: found 1 ROCm devices:
|
||||||
|
Device 0: AMD Radeon Graphics, gfx1151 (0x1151), VMM: no, Wave Size: 32
|
||||||
|
| model | size | params | backend | ngl | fa | mmap | test | t/s |
|
||||||
|
| ------------------------------ | ---------: | ---------: | ---------- | --: | -: | ---: | --------------: | -------------------: |
|
||||||
|
| llama4 17Bx16E (Scout) Q6_K | 82.35 GiB | 107.77 B | ROCm | 99 | 1 | 0 | pp512 | 284.81 ± 1.55 |
|
||||||
|
| llama4 17Bx16E (Scout) Q6_K | 82.35 GiB | 107.77 B | ROCm | 99 | 1 | 0 | tg128 | 14.72 ± 0.00 |
|
||||||
|
|
||||||
|
build: f1fbffb5 (6486)
|
||||||
+6
-2
@@ -2,5 +2,9 @@ ggml_cuda_init: GGML_CUDA_FORCE_MMQ: no
|
|||||||
ggml_cuda_init: GGML_CUDA_FORCE_CUBLAS: no
|
ggml_cuda_init: GGML_CUDA_FORCE_CUBLAS: no
|
||||||
ggml_cuda_init: found 1 ROCm devices:
|
ggml_cuda_init: found 1 ROCm devices:
|
||||||
Device 0: AMD Radeon Graphics, gfx1151 (0x1151), VMM: no, Wave Size: 32
|
Device 0: AMD Radeon Graphics, gfx1151 (0x1151), VMM: no, Wave Size: 32
|
||||||
HW Exception by GPU node-1 (Agent handle: 0x3882bf60) reason :GPU Hang
|
| model | size | params | backend | ngl | mmap | test | t/s |
|
||||||
✖ ! [rocm7_rc-rocwmma] Llama-4-Scout-17B-16E-Instruct-Q6_K-00001-of-00002 failed (exit 134)
|
| ------------------------------ | ---------: | ---------: | ---------- | --: | ---: | --------------: | -------------------: |
|
||||||
|
| llama4 17Bx16E (Scout) Q6_K | 82.35 GiB | 107.77 B | ROCm | 99 | 0 | pp512 | 274.13 ± 0.84 |
|
||||||
|
| llama4 17Bx16E (Scout) Q6_K | 82.35 GiB | 107.77 B | ROCm | 99 | 0 | tg128 | 14.71 ± 0.00 |
|
||||||
|
|
||||||
|
build: f1fbffb5 (6486)
|
||||||
|
|||||||
+3
-3
@@ -4,7 +4,7 @@ ggml_cuda_init: found 1 ROCm devices:
|
|||||||
Device 0: AMD Radeon Graphics, gfx1151 (0x1151), VMM: no, Wave Size: 32
|
Device 0: AMD Radeon Graphics, gfx1151 (0x1151), VMM: no, Wave Size: 32
|
||||||
| model | size | params | backend | ngl | fa | mmap | test | t/s |
|
| model | size | params | backend | ngl | fa | mmap | test | t/s |
|
||||||
| ------------------------------ | ---------: | ---------: | ---------- | --: | -: | ---: | --------------: | -------------------: |
|
| ------------------------------ | ---------: | ---------: | ---------- | --: | -: | ---: | --------------: | -------------------: |
|
||||||
| llama4 17Bx16E (Scout) Q6_K | 82.35 GiB | 107.77 B | ROCm | 99 | 1 | 0 | pp512 | 285.84 ± 9.41 |
|
| llama4 17Bx16E (Scout) Q6_K | 82.35 GiB | 107.77 B | ROCm | 99 | 1 | 0 | pp512 | 292.92 ± 2.63 |
|
||||||
| llama4 17Bx16E (Scout) Q6_K | 82.35 GiB | 107.77 B | ROCm | 99 | 1 | 0 | tg128 | 14.37 ± 0.00 |
|
| llama4 17Bx16E (Scout) Q6_K | 82.35 GiB | 107.77 B | ROCm | 99 | 1 | 0 | tg128 | 14.71 ± 0.00 |
|
||||||
|
|
||||||
build: de219279 (6181)
|
build: f1fbffb5 (6486)
|
||||||
|
|||||||
+3
-3
@@ -4,7 +4,7 @@ ggml_cuda_init: found 1 ROCm devices:
|
|||||||
Device 0: AMD Radeon Graphics, gfx1151 (0x1151), VMM: no, Wave Size: 32
|
Device 0: AMD Radeon Graphics, gfx1151 (0x1151), VMM: no, Wave Size: 32
|
||||||
| model | size | params | backend | ngl | mmap | test | t/s |
|
| model | size | params | backend | ngl | mmap | test | t/s |
|
||||||
| ------------------------------ | ---------: | ---------: | ---------- | --: | ---: | --------------: | -------------------: |
|
| ------------------------------ | ---------: | ---------: | ---------- | --: | ---: | --------------: | -------------------: |
|
||||||
| llama4 17Bx16E (Scout) Q6_K | 82.35 GiB | 107.77 B | ROCm | 99 | 0 | pp512 | 273.97 ± 1.67 |
|
| llama4 17Bx16E (Scout) Q6_K | 82.35 GiB | 107.77 B | ROCm | 99 | 0 | pp512 | 273.23 ± 1.35 |
|
||||||
| llama4 17Bx16E (Scout) Q6_K | 82.35 GiB | 107.77 B | ROCm | 99 | 0 | tg128 | 14.57 ± 0.05 |
|
| llama4 17Bx16E (Scout) Q6_K | 82.35 GiB | 107.77 B | ROCm | 99 | 0 | tg128 | 14.70 ± 0.00 |
|
||||||
|
|
||||||
build: de219279 (6181)
|
build: f1fbffb5 (6486)
|
||||||
|
|||||||
+2
-6
@@ -2,9 +2,5 @@ ggml_cuda_init: GGML_CUDA_FORCE_MMQ: no
|
|||||||
ggml_cuda_init: GGML_CUDA_FORCE_CUBLAS: no
|
ggml_cuda_init: GGML_CUDA_FORCE_CUBLAS: no
|
||||||
ggml_cuda_init: found 1 ROCm devices:
|
ggml_cuda_init: found 1 ROCm devices:
|
||||||
Device 0: AMD Radeon Graphics, gfx1151 (0x1151), VMM: no, Wave Size: 32
|
Device 0: AMD Radeon Graphics, gfx1151 (0x1151), VMM: no, Wave Size: 32
|
||||||
| model | size | params | backend | ngl | fa | mmap | test | t/s |
|
HW Exception by GPU node-1 (Agent handle: 0x13c5d180) reason :GPU Hang
|
||||||
| ------------------------------ | ---------: | ---------: | ---------- | --: | -: | ---: | --------------: | -------------------: |
|
✖ ! [rocm7_rc-rocwmma] Llama-4-Scout-17B-16E-Instruct-Q6_K-00001-of-00002 __hblt0__fa1 failed (exit 134)
|
||||||
| llama4 17Bx16E (Scout) Q6_K | 82.35 GiB | 107.77 B | ROCm | 99 | 1 | 0 | pp512 | 285.26 ± 1.79 |
|
|
||||||
| llama4 17Bx16E (Scout) Q6_K | 82.35 GiB | 107.77 B | ROCm | 99 | 1 | 0 | tg128 | 14.33 ± 0.03 |
|
|
||||||
|
|
||||||
build: de219279 (6181)
|
|
||||||
|
|||||||
@@ -2,9 +2,5 @@ ggml_cuda_init: GGML_CUDA_FORCE_MMQ: no
|
|||||||
ggml_cuda_init: GGML_CUDA_FORCE_CUBLAS: no
|
ggml_cuda_init: GGML_CUDA_FORCE_CUBLAS: no
|
||||||
ggml_cuda_init: found 1 ROCm devices:
|
ggml_cuda_init: found 1 ROCm devices:
|
||||||
Device 0: AMD Radeon Graphics, gfx1151 (0x1151), VMM: no, Wave Size: 32
|
Device 0: AMD Radeon Graphics, gfx1151 (0x1151), VMM: no, Wave Size: 32
|
||||||
| model | size | params | backend | ngl | mmap | test | t/s |
|
Memory access fault by GPU node-1 (Agent handle: 0x381db160) on address 0x7f72baf68000. Reason: Page not present or supervisor privilege.
|
||||||
| ------------------------------ | ---------: | ---------: | ---------- | --: | ---: | --------------: | -------------------: |
|
✖ ! [rocm7_rc] Llama-4-Scout-17B-16E-Instruct-Q6_K-00001-of-00002 failed (exit 134)
|
||||||
| llama4 17Bx16E (Scout) Q6_K | 82.35 GiB | 107.77 B | ROCm | 99 | 0 | pp512 | 276.37 ± 1.65 |
|
|
||||||
| llama4 17Bx16E (Scout) Q6_K | 82.35 GiB | 107.77 B | ROCm | 99 | 0 | tg128 | 14.57 ± 0.04 |
|
|
||||||
|
|
||||||
build: de219279 (6181)
|
|
||||||
|
|||||||
+1
-1
@@ -2,5 +2,5 @@ ggml_cuda_init: GGML_CUDA_FORCE_MMQ: no
|
|||||||
ggml_cuda_init: GGML_CUDA_FORCE_CUBLAS: no
|
ggml_cuda_init: GGML_CUDA_FORCE_CUBLAS: no
|
||||||
ggml_cuda_init: found 1 ROCm devices:
|
ggml_cuda_init: found 1 ROCm devices:
|
||||||
Device 0: AMD Radeon Graphics, gfx1151 (0x1151), VMM: no, Wave Size: 32
|
Device 0: AMD Radeon Graphics, gfx1151 (0x1151), VMM: no, Wave Size: 32
|
||||||
Memory access fault by GPU node-1 (Agent handle: 0xa893ec0) on address 0x7f070a3a9000. Reason: Page not present or supervisor privilege.
|
HW Exception by GPU node-1 (Agent handle: 0x34902180) reason :GPU Hang
|
||||||
✖ ! [rocm7_rc] Llama-4-Scout-17B-16E-Instruct-Q6_K-00001-of-00002 __fa1 failed (exit 134)
|
✖ ! [rocm7_rc] Llama-4-Scout-17B-16E-Instruct-Q6_K-00001-of-00002 __fa1 failed (exit 134)
|
||||||
|
|||||||
+3
-3
@@ -4,7 +4,7 @@ ggml_cuda_init: found 1 ROCm devices:
|
|||||||
Device 0: AMD Radeon Graphics, gfx1151 (0x1151), VMM: no, Wave Size: 32
|
Device 0: AMD Radeon Graphics, gfx1151 (0x1151), VMM: no, Wave Size: 32
|
||||||
| model | size | params | backend | ngl | mmap | test | t/s |
|
| model | size | params | backend | ngl | mmap | test | t/s |
|
||||||
| ------------------------------ | ---------: | ---------: | ---------- | --: | ---: | --------------: | -------------------: |
|
| ------------------------------ | ---------: | ---------: | ---------- | --: | ---: | --------------: | -------------------: |
|
||||||
| llama4 17Bx16E (Scout) Q6_K | 82.35 GiB | 107.77 B | ROCm | 99 | 0 | pp512 | 269.17 ± 0.99 |
|
| llama4 17Bx16E (Scout) Q6_K | 82.35 GiB | 107.77 B | ROCm | 99 | 0 | pp512 | 274.52 ± 1.78 |
|
||||||
| llama4 17Bx16E (Scout) Q6_K | 82.35 GiB | 107.77 B | ROCm | 99 | 0 | tg128 | 14.63 ± 0.01 |
|
| llama4 17Bx16E (Scout) Q6_K | 82.35 GiB | 107.77 B | ROCm | 99 | 0 | tg128 | 14.70 ± 0.00 |
|
||||||
|
|
||||||
build: de219279 (6181)
|
build: f1fbffb5 (6486)
|
||||||
|
|||||||
+6
-2
@@ -2,5 +2,9 @@ ggml_cuda_init: GGML_CUDA_FORCE_MMQ: no
|
|||||||
ggml_cuda_init: GGML_CUDA_FORCE_CUBLAS: no
|
ggml_cuda_init: GGML_CUDA_FORCE_CUBLAS: no
|
||||||
ggml_cuda_init: found 1 ROCm devices:
|
ggml_cuda_init: found 1 ROCm devices:
|
||||||
Device 0: AMD Radeon Graphics, gfx1151 (0x1151), VMM: no, Wave Size: 32
|
Device 0: AMD Radeon Graphics, gfx1151 (0x1151), VMM: no, Wave Size: 32
|
||||||
Memory access fault by GPU node-1 (Agent handle: 0x1db86ec0) on address 0x7f2273f6f000. Reason: Page not present or supervisor privilege.
|
| model | size | params | backend | ngl | fa | mmap | test | t/s |
|
||||||
✖ ! [rocm7_rc] Llama-4-Scout-17B-16E-Instruct-Q6_K-00001-of-00002 __hblt0__fa1 failed (exit 134)
|
| ------------------------------ | ---------: | ---------: | ---------- | --: | -: | ---: | --------------: | -------------------: |
|
||||||
|
| llama4 17Bx16E (Scout) Q6_K | 82.35 GiB | 107.77 B | ROCm | 99 | 1 | 0 | pp512 | 287.04 ± 1.92 |
|
||||||
|
| llama4 17Bx16E (Scout) Q6_K | 82.35 GiB | 107.77 B | ROCm | 99 | 1 | 0 | tg128 | 14.71 ± 0.00 |
|
||||||
|
|
||||||
|
build: f1fbffb5 (6486)
|
||||||
|
|||||||
+3
-3
@@ -2,7 +2,7 @@ ggml_vulkan: Found 1 Vulkan devices:
|
|||||||
ggml_vulkan: 0 = Radeon 8060S Graphics (AMD open-source driver) | uma: 1 | fp16: 1 | bf16: 0 | warp size: 64 | shared memory: 32768 | int dot: 1 | matrix cores: KHR_coopmat
|
ggml_vulkan: 0 = Radeon 8060S Graphics (AMD open-source driver) | uma: 1 | fp16: 1 | bf16: 0 | warp size: 64 | shared memory: 32768 | int dot: 1 | matrix cores: KHR_coopmat
|
||||||
| model | size | params | backend | ngl | mmap | test | t/s |
|
| model | size | params | backend | ngl | mmap | test | t/s |
|
||||||
| ------------------------------ | ---------: | ---------: | ---------- | --: | ---: | --------------: | -------------------: |
|
| ------------------------------ | ---------: | ---------: | ---------- | --: | ---: | --------------: | -------------------: |
|
||||||
| llama4 17Bx16E (Scout) Q6_K | 82.35 GiB | 107.77 B | Vulkan | 99 | 0 | pp512 | 242.07 ± 1.05 |
|
| llama4 17Bx16E (Scout) Q6_K | 82.35 GiB | 107.77 B | Vulkan | 99 | 0 | pp512 | 224.02 ± 2.86 |
|
||||||
| llama4 17Bx16E (Scout) Q6_K | 82.35 GiB | 107.77 B | Vulkan | 99 | 0 | tg128 | 15.56 ± 0.01 |
|
| llama4 17Bx16E (Scout) Q6_K | 82.35 GiB | 107.77 B | Vulkan | 99 | 0 | tg128 | 15.98 ± 0.00 |
|
||||||
|
|
||||||
build: 1fe00296 (6182)
|
build: f1fbffb5 (6486)
|
||||||
|
|||||||
+3
-3
@@ -2,7 +2,7 @@ ggml_vulkan: Found 1 Vulkan devices:
|
|||||||
ggml_vulkan: 0 = Radeon 8060S Graphics (AMD open-source driver) | uma: 1 | fp16: 1 | bf16: 0 | warp size: 64 | shared memory: 32768 | int dot: 1 | matrix cores: KHR_coopmat
|
ggml_vulkan: 0 = Radeon 8060S Graphics (AMD open-source driver) | uma: 1 | fp16: 1 | bf16: 0 | warp size: 64 | shared memory: 32768 | int dot: 1 | matrix cores: KHR_coopmat
|
||||||
| model | size | params | backend | ngl | fa | mmap | test | t/s |
|
| model | size | params | backend | ngl | fa | mmap | test | t/s |
|
||||||
| ------------------------------ | ---------: | ---------: | ---------- | --: | -: | ---: | --------------: | -------------------: |
|
| ------------------------------ | ---------: | ---------: | ---------- | --: | -: | ---: | --------------: | -------------------: |
|
||||||
| llama4 17Bx16E (Scout) Q6_K | 82.35 GiB | 107.77 B | Vulkan | 99 | 1 | 0 | pp512 | 244.49 ± 1.13 |
|
| llama4 17Bx16E (Scout) Q6_K | 82.35 GiB | 107.77 B | Vulkan | 99 | 1 | 0 | pp512 | 234.30 ± 1.10 |
|
||||||
| llama4 17Bx16E (Scout) Q6_K | 82.35 GiB | 107.77 B | Vulkan | 99 | 1 | 0 | tg128 | 15.33 ± 0.00 |
|
| llama4 17Bx16E (Scout) Q6_K | 82.35 GiB | 107.77 B | Vulkan | 99 | 1 | 0 | tg128 | 15.75 ± 0.00 |
|
||||||
|
|
||||||
build: 1fe00296 (6182)
|
build: f1fbffb5 (6486)
|
||||||
|
|||||||
+3
-3
@@ -2,7 +2,7 @@ ggml_vulkan: Found 1 Vulkan devices:
|
|||||||
ggml_vulkan: 0 = Radeon 8060S Graphics (RADV GFX1151) (radv) | uma: 1 | fp16: 1 | bf16: 0 | warp size: 64 | shared memory: 65536 | int dot: 1 | matrix cores: KHR_coopmat
|
ggml_vulkan: 0 = Radeon 8060S Graphics (RADV GFX1151) (radv) | uma: 1 | fp16: 1 | bf16: 0 | warp size: 64 | shared memory: 65536 | int dot: 1 | matrix cores: KHR_coopmat
|
||||||
| model | size | params | backend | ngl | mmap | test | t/s |
|
| model | size | params | backend | ngl | mmap | test | t/s |
|
||||||
| ------------------------------ | ---------: | ---------: | ---------- | --: | ---: | --------------: | -------------------: |
|
| ------------------------------ | ---------: | ---------: | ---------- | --: | ---: | --------------: | -------------------: |
|
||||||
| llama4 17Bx16E (Scout) Q6_K | 82.35 GiB | 107.77 B | Vulkan | 99 | 0 | pp512 | 147.08 ± 0.98 |
|
| llama4 17Bx16E (Scout) Q6_K | 82.35 GiB | 107.77 B | Vulkan | 99 | 0 | pp512 | 201.49 ± 2.22 |
|
||||||
| llama4 17Bx16E (Scout) Q6_K | 82.35 GiB | 107.77 B | Vulkan | 99 | 0 | tg128 | 15.50 ± 0.01 |
|
| llama4 17Bx16E (Scout) Q6_K | 82.35 GiB | 107.77 B | Vulkan | 99 | 0 | tg128 | 15.77 ± 0.01 |
|
||||||
|
|
||||||
build: 1fe00296 (6182)
|
build: f1fbffb5 (6486)
|
||||||
|
|||||||
+3
-3
@@ -2,7 +2,7 @@ ggml_vulkan: Found 1 Vulkan devices:
|
|||||||
ggml_vulkan: 0 = Radeon 8060S Graphics (RADV GFX1151) (radv) | uma: 1 | fp16: 1 | bf16: 0 | warp size: 64 | shared memory: 65536 | int dot: 1 | matrix cores: KHR_coopmat
|
ggml_vulkan: 0 = Radeon 8060S Graphics (RADV GFX1151) (radv) | uma: 1 | fp16: 1 | bf16: 0 | warp size: 64 | shared memory: 65536 | int dot: 1 | matrix cores: KHR_coopmat
|
||||||
| model | size | params | backend | ngl | fa | mmap | test | t/s |
|
| model | size | params | backend | ngl | fa | mmap | test | t/s |
|
||||||
| ------------------------------ | ---------: | ---------: | ---------- | --: | -: | ---: | --------------: | -------------------: |
|
| ------------------------------ | ---------: | ---------: | ---------- | --: | -: | ---: | --------------: | -------------------: |
|
||||||
| llama4 17Bx16E (Scout) Q6_K | 82.35 GiB | 107.77 B | Vulkan | 99 | 1 | 0 | pp512 | 149.97 ± 1.10 |
|
| llama4 17Bx16E (Scout) Q6_K | 82.35 GiB | 107.77 B | Vulkan | 99 | 1 | 0 | pp512 | 202.49 ± 5.98 |
|
||||||
| llama4 17Bx16E (Scout) Q6_K | 82.35 GiB | 107.77 B | Vulkan | 99 | 1 | 0 | tg128 | 15.49 ± 0.00 |
|
| llama4 17Bx16E (Scout) Q6_K | 82.35 GiB | 107.77 B | Vulkan | 99 | 1 | 0 | tg128 | 15.74 ± 0.00 |
|
||||||
|
|
||||||
build: 1fe00296 (6182)
|
build: f1fbffb5 (6486)
|
||||||
|
|||||||
+3
-3
@@ -9,7 +9,7 @@ rocBLAS warning: hipBlasLT failed, falling back to tensile.
|
|||||||
This message will be only be displayed once, unless the ROCBLAS_VERBOSE_TENSILE_ERROR environment variable is set.
|
This message will be only be displayed once, unless the ROCBLAS_VERBOSE_TENSILE_ERROR environment variable is set.
|
||||||
| model | size | params | backend | ngl | mmap | test | t/s |
|
| model | size | params | backend | ngl | mmap | test | t/s |
|
||||||
| ------------------------------ | ---------: | ---------: | ---------- | --: | ---: | --------------: | -------------------: |
|
| ------------------------------ | ---------: | ---------: | ---------- | --: | ---: | --------------: | -------------------: |
|
||||||
| llama4 17Bx16E (Scout) Q8_0 | 106.65 GiB | 107.77 B | ROCm | 99 | 0 | pp512 | 270.35 ± 3.39 |
|
| llama4 17Bx16E (Scout) Q8_0 | 106.65 GiB | 107.77 B | ROCm | 99 | 0 | pp512 | 264.44 ± 24.69 |
|
||||||
| llama4 17Bx16E (Scout) Q8_0 | 106.65 GiB | 107.77 B | ROCm | 99 | 0 | tg128 | 11.78 ± 0.03 |
|
| llama4 17Bx16E (Scout) Q8_0 | 106.65 GiB | 107.77 B | ROCm | 99 | 0 | tg128 | 11.88 ± 0.05 |
|
||||||
|
|
||||||
build: 1fe00296 (6182)
|
build: f1fbffb5 (6486)
|
||||||
|
|||||||
+3
-3
@@ -4,7 +4,7 @@ ggml_cuda_init: found 1 ROCm devices:
|
|||||||
Device 0: Radeon 8060S Graphics, gfx1151 (0x1151), VMM: no, Wave Size: 32
|
Device 0: Radeon 8060S Graphics, gfx1151 (0x1151), VMM: no, Wave Size: 32
|
||||||
| model | size | params | backend | ngl | fa | mmap | test | t/s |
|
| model | size | params | backend | ngl | fa | mmap | test | t/s |
|
||||||
| ------------------------------ | ---------: | ---------: | ---------- | --: | -: | ---: | --------------: | -------------------: |
|
| ------------------------------ | ---------: | ---------: | ---------- | --: | -: | ---: | --------------: | -------------------: |
|
||||||
| llama4 17Bx16E (Scout) Q8_0 | 106.65 GiB | 107.77 B | ROCm | 99 | 1 | 0 | pp512 | 292.23 ± 3.13 |
|
| llama4 17Bx16E (Scout) Q8_0 | 106.65 GiB | 107.77 B | ROCm | 99 | 1 | 0 | pp512 | 298.83 ± 1.59 |
|
||||||
| llama4 17Bx16E (Scout) Q8_0 | 106.65 GiB | 107.77 B | ROCm | 99 | 1 | 0 | tg128 | 11.73 ± 0.03 |
|
| llama4 17Bx16E (Scout) Q8_0 | 106.65 GiB | 107.77 B | ROCm | 99 | 1 | 0 | tg128 | 11.89 ± 0.06 |
|
||||||
|
|
||||||
build: 1fe00296 (6182)
|
build: f1fbffb5 (6486)
|
||||||
|
|||||||
+1
-1
@@ -2,5 +2,5 @@ ggml_cuda_init: GGML_CUDA_FORCE_MMQ: no
|
|||||||
ggml_cuda_init: GGML_CUDA_FORCE_CUBLAS: no
|
ggml_cuda_init: GGML_CUDA_FORCE_CUBLAS: no
|
||||||
ggml_cuda_init: found 1 ROCm devices:
|
ggml_cuda_init: found 1 ROCm devices:
|
||||||
Device 0: Radeon 8060S Graphics, gfx1151 (0x1151), VMM: no, Wave Size: 32
|
Device 0: Radeon 8060S Graphics, gfx1151 (0x1151), VMM: no, Wave Size: 32
|
||||||
HW Exception by GPU node-1 (Agent handle: 0x5f69340) reason :GPU Hang
|
HW Exception by GPU node-1 (Agent handle: 0x3265f340) reason :GPU Hang
|
||||||
✖ ! [rocm6_4_3-rocwmma] Llama-4-Scout-17B-16E-Instruct-Q8_0-00001-of-00003 __hblt0 failed (exit 134)
|
✖ ! [rocm6_4_3-rocwmma] Llama-4-Scout-17B-16E-Instruct-Q8_0-00001-of-00003 __hblt0 failed (exit 134)
|
||||||
|
|||||||
+2
-6
@@ -2,9 +2,5 @@ ggml_cuda_init: GGML_CUDA_FORCE_MMQ: no
|
|||||||
ggml_cuda_init: GGML_CUDA_FORCE_CUBLAS: no
|
ggml_cuda_init: GGML_CUDA_FORCE_CUBLAS: no
|
||||||
ggml_cuda_init: found 1 ROCm devices:
|
ggml_cuda_init: found 1 ROCm devices:
|
||||||
Device 0: Radeon 8060S Graphics, gfx1151 (0x1151), VMM: no, Wave Size: 32
|
Device 0: Radeon 8060S Graphics, gfx1151 (0x1151), VMM: no, Wave Size: 32
|
||||||
| model | size | params | backend | ngl | fa | mmap | test | t/s |
|
HW Exception by GPU node-1 (Agent handle: 0x33cad340) reason :GPU Hang
|
||||||
| ------------------------------ | ---------: | ---------: | ---------- | --: | -: | ---: | --------------: | -------------------: |
|
✖ ! [rocm6_4_3-rocwmma] Llama-4-Scout-17B-16E-Instruct-Q8_0-00001-of-00003 __hblt0__fa1 failed (exit 134)
|
||||||
| llama4 17Bx16E (Scout) Q8_0 | 106.65 GiB | 107.77 B | ROCm | 99 | 1 | 0 | pp512 | 140.27 ± 0.97 |
|
|
||||||
| llama4 17Bx16E (Scout) Q8_0 | 106.65 GiB | 107.77 B | ROCm | 99 | 1 | 0 | tg128 | 11.74 ± 0.00 |
|
|
||||||
|
|
||||||
build: 1fe00296 (6182)
|
|
||||||
|
|||||||
Some files were not shown because too many files have changed in this diff