added @64k benchmarks

Donato Capitella
2026-05-03 16:20:42 +01:00
parent 1bffd6505f
commit 07d2131d8c
121 changed files with 7350 additions and 1 deletions
@@ -0,0 +1,8 @@
ggml_cuda_init: found 1 ROCm devices (Total VRAM: 126976 MiB):
Device 0: Radeon 8060S Graphics, gfx1151 (0x1151), VMM: no, Wave Size: 32, VRAM: 126976 MiB
| model | size | params | backend | ngl | n_ubatch | fa | mmap | test | t/s |
| ------------------------------ | ---------: | ---------: | ---------- | --: | -------: | -: | ---: | --------------: | -------------------: |
| llama ?B Q4_K - Medium | 70.31 GiB | 125.03 B | ROCm | 99 | 2048 | 1 | 0 | pp2048 @ d65536 | 7.94 ± 0.04 |
| llama ?B Q4_K - Medium | 70.31 GiB | 125.03 B | ROCm | 99 | 2048 | 1 | 0 | tg32 @ d65536 | 1.49 ± 0.00 |
build: 7957de9dc (8645)
@@ -0,0 +1,8 @@
ggml_cuda_init: found 1 ROCm devices (Total VRAM: 126976 MiB):
Device 0: Radeon 8060S Graphics, gfx1151 (0x1151), VMM: no, Wave Size: 32, VRAM: 126976 MiB
| model | size | params | backend | ngl | n_ubatch | fa | mmap | test | t/s |
| ------------------------------ | ---------: | ---------: | ---------- | --: | -------: | -: | ---: | --------------: | -------------------: |
| llama ?B Q4_K - Medium | 70.31 GiB | 125.03 B | ROCm | 99 | 2048 | 1 | 0 | pp2048 @ d65536 | 8.12 ± 0.03 |
| llama ?B Q4_K - Medium | 70.31 GiB | 125.03 B | ROCm | 99 | 2048 | 1 | 0 | tg32 @ d65536 | 1.48 ± 0.00 |
build: ab6120cde (8997)
@@ -0,0 +1,8 @@
ggml_cuda_init: found 1 ROCm devices (Total VRAM: 126976 MiB):
Device 0: Radeon 8060S Graphics, gfx1151 (0x1151), VMM: no, Wave Size: 32, VRAM: 126976 MiB
| model | size | params | backend | ngl | n_ubatch | fa | mmap | test | t/s |
| ------------------------------ | ---------: | ---------: | ---------- | --: | -------: | -: | ---: | --------------: | -------------------: |
| llama ?B Q4_K - Medium | 70.31 GiB | 125.03 B | ROCm | 99 | 2048 | 1 | 0 | pp2048 @ d65536 | 8.16 ± 0.08 |
| llama ?B Q4_K - Medium | 70.31 GiB | 125.03 B | ROCm | 99 | 2048 | 1 | 0 | tg32 @ d65536 | 1.58 ± 0.00 |
build: ab6120cde (8997)
@@ -0,0 +1,8 @@
ggml_cuda_init: found 1 ROCm devices (Total VRAM: 126976 MiB):
Device 0: Radeon 8060S Graphics, gfx1151 (0x1151), VMM: no, Wave Size: 32, VRAM: 126976 MiB
| model | size | params | backend | ngl | n_ubatch | fa | mmap | test | t/s |
| ------------------------------ | ---------: | ---------: | ---------- | --: | -------: | -: | ---: | --------------: | -------------------: |
| llama ?B Q4_K - Medium | 70.31 GiB | 125.03 B | ROCm | 99 | 2048 | 1 | 0 | pp2048 @ d65536 | 8.45 ± 0.02 |
| llama ?B Q4_K - Medium | 70.31 GiB | 125.03 B | ROCm | 99 | 2048 | 1 | 0 | tg32 @ d65536 | 1.43 ± 0.02 |
build: ab6120cde (8997)
@@ -0,0 +1,23 @@
ggml_vulkan: Found 1 Vulkan devices:
ggml_vulkan: 0 = Radeon 8060S Graphics (AMD open-source driver) | uma: 1 | fp16: 1 | bf16: 0 | warp size: 64 | shared memory: 32768 | int dot: 1 | matrix cores: KHR_coopmat
/lib64/libggml-base.so.0(+0x4465) [0x7f69acb1c465]
/lib64/libggml-base.so.0(ggml_print_backtrace+0x1eb) [0x7f69acb1c83b]
/lib64/libggml-base.so.0(+0x16f19) [0x7f69acb2ef19]
/lib64/libstdc++.so.6(+0x1ebfc) [0x7f69ac8bfbfc]
/lib64/libstdc++.so.6(_ZSt10unexpectedv+0x0) [0x7f69ac8a9d3a]
/lib64/libstdc++.so.6(+0x1eea8) [0x7f69ac8bfea8]
/lib64/libggml-vulkan.so.0(+0x1728d) [0x7f69acbf128d]
/lib64/libggml-vulkan.so.0(+0x10a410) [0x7f69acce4410]
/lib64/libggml-base.so.0(ggml_backend_sched_graph_compute_async+0x3b2) [0x7f69acb38192]
/lib64/libllama.so.0(_ZN13llama_context13graph_computeEP11ggml_cgraphb+0xa0) [0x7f69b07a6c70]
/lib64/libllama.so.0(_ZN13llama_context14process_ubatchERK12llama_ubatch14llm_graph_typeP22llama_memory_context_iR11ggml_status+0xe5) [0x7f69b07a9255]
/lib64/libllama.so.0(_ZN13llama_context6decodeERK11llama_batch+0x35f) [0x7f69b07af98f]
/lib64/libllama.so.0(llama_decode+0xe) [0x7f69b07b132e]
/usr/sbin/llama-bench() [0x40663b]
/usr/sbin/llama-bench() [0x4038b9]
/lib64/libc.so.6(+0x35b5) [0x7f69ac5905b5]
/lib64/libc.so.6(__libc_start_main+0x88) [0x7f69ac590668]
/usr/sbin/llama-bench() [0x404e65]
terminate called after throwing an instance of 'vk::DeviceLostError'
what(): vk::Queue::submit: ErrorDeviceLost
✖ ! [vulkan_amdvlk] Devstral-2-123B-Instruct-2512-UD-Q4_K_XL-00001-of-00002__fa1 __longctx65536 failed (exit 0)
@@ -0,0 +1,25 @@
ggml_vulkan: Found 1 Vulkan devices:
ggml_vulkan: 0 = Radeon 8060S Graphics (RADV GFX1151) (radv) | uma: 1 | fp16: 1 | bf16: 0 | warp size: 64 | shared memory: 65536 | int dot: 1 | matrix cores: KHR_coopmat
radv/amdgpu: The CS has been cancelled because the context is lost. This context is innocent.
/lib64/libggml-base.so.0(+0x4465) [0x7fd3fbd2c465]
/lib64/libggml-base.so.0(ggml_print_backtrace+0x1eb) [0x7fd3fbd2c83b]
/lib64/libggml-base.so.0(+0x16f19) [0x7fd3fbd3ef19]
/lib64/libstdc++.so.6(+0x1ebfc) [0x7fd3fbacfbfc]
/lib64/libstdc++.so.6(_ZSt10unexpectedv+0x0) [0x7fd3fbab9d3a]
/lib64/libstdc++.so.6(+0x1eea8) [0x7fd3fbacfea8]
/lib64/libggml-vulkan.so.0(+0x1569b) [0x7fd3fbdff69b]
/lib64/libggml-vulkan.so.0(+0x14505a) [0x7fd3fbf2f05a]
/lib64/libggml-vulkan.so.0(+0x145c31) [0x7fd3fbf2fc31]
/lib64/libggml-base.so.0(ggml_backend_sched_graph_compute_async+0x7f3) [0x7fd3fbd485d3]
/lib64/libllama.so.0(_ZN13llama_context13graph_computeEP11ggml_cgraphb+0xa0) [0x7fd3ff9b6c70]
/lib64/libllama.so.0(_ZN13llama_context14process_ubatchERK12llama_ubatch14llm_graph_typeP22llama_memory_context_iR11ggml_status+0xe5) [0x7fd3ff9b9255]
/lib64/libllama.so.0(_ZN13llama_context6decodeERK11llama_batch+0x35f) [0x7fd3ff9bf98f]
/lib64/libllama.so.0(llama_decode+0xe) [0x7fd3ff9c132e]
/usr/sbin/llama-bench() [0x40663b]
/usr/sbin/llama-bench() [0x403a5b]
/lib64/libc.so.6(+0x35b5) [0x7fd3fb7a05b5]
/lib64/libc.so.6(__libc_start_main+0x88) [0x7fd3fb7a0668]
/usr/sbin/llama-bench() [0x404e65]
terminate called after throwing an instance of 'vk::DeviceLostError'
what(): vk::Queue::submit: ErrorDeviceLost
✖ ! [vulkan_radv] Devstral-2-123B-Instruct-2512-UD-Q4_K_XL-00001-of-00002__fa1 __longctx65536 failed (exit 0)
@@ -0,0 +1,8 @@
ggml_cuda_init: found 1 ROCm devices (Total VRAM: 126976 MiB):
Device 0: Radeon 8060S Graphics, gfx1151 (0x1151), VMM: no, Wave Size: 32, VRAM: 126976 MiB
| model | size | params | backend | ngl | n_ubatch | fa | mmap | test | t/s |
| ------------------------------ | ---------: | ---------: | ---------- | --: | -------: | -: | ---: | --------------: | -------------------: |
| deepseek2 30B.A3B BF16 | 55.79 GiB | 29.94 B | ROCm | 99 | 2048 | 1 | 0 | pp2048 @ d65536 | 46.54 ± 0.17 |
| deepseek2 30B.A3B BF16 | 55.79 GiB | 29.94 B | ROCm | 99 | 2048 | 1 | 0 | tg32 @ d65536 | 11.91 ± 0.00 |
build: 7957de9dc (8645)
@@ -0,0 +1,8 @@
ggml_cuda_init: found 1 ROCm devices (Total VRAM: 126976 MiB):
Device 0: Radeon 8060S Graphics, gfx1151 (0x1151), VMM: no, Wave Size: 32, VRAM: 126976 MiB
| model | size | params | backend | ngl | n_ubatch | fa | mmap | test | t/s |
| ------------------------------ | ---------: | ---------: | ---------- | --: | -------: | -: | ---: | --------------: | -------------------: |
| deepseek2 30B.A3B BF16 | 55.79 GiB | 29.94 B | ROCm | 99 | 2048 | 1 | 0 | pp2048 @ d65536 | 46.47 ± 0.09 |
| deepseek2 30B.A3B BF16 | 55.79 GiB | 29.94 B | ROCm | 99 | 2048 | 1 | 0 | tg32 @ d65536 | 11.63 ± 0.19 |
build: ab6120cde (8997)
@@ -0,0 +1,8 @@
ggml_cuda_init: found 1 ROCm devices (Total VRAM: 126976 MiB):
Device 0: Radeon 8060S Graphics, gfx1151 (0x1151), VMM: no, Wave Size: 32, VRAM: 126976 MiB
| model | size | params | backend | ngl | n_ubatch | fa | mmap | test | t/s |
| ------------------------------ | ---------: | ---------: | ---------- | --: | -------: | -: | ---: | --------------: | -------------------: |
| deepseek2 30B.A3B BF16 | 55.79 GiB | 29.94 B | ROCm | 99 | 2048 | 1 | 0 | pp2048 @ d65536 | 50.95 ± 0.07 |
| deepseek2 30B.A3B BF16 | 55.79 GiB | 29.94 B | ROCm | 99 | 2048 | 1 | 0 | tg32 @ d65536 | 11.15 ± 0.09 |
build: ab6120cde (8997)
@@ -0,0 +1,8 @@
ggml_cuda_init: found 1 ROCm devices (Total VRAM: 126976 MiB):
Device 0: Radeon 8060S Graphics, gfx1151 (0x1151), VMM: no, Wave Size: 32, VRAM: 126976 MiB
| model | size | params | backend | ngl | n_ubatch | fa | mmap | test | t/s |
| ------------------------------ | ---------: | ---------: | ---------- | --: | -------: | -: | ---: | --------------: | -------------------: |
| deepseek2 30B.A3B BF16 | 55.79 GiB | 29.94 B | ROCm | 99 | 2048 | 1 | 0 | pp2048 @ d65536 | 45.39 ± 0.36 |
| deepseek2 30B.A3B BF16 | 55.79 GiB | 29.94 B | ROCm | 99 | 2048 | 1 | 0 | tg32 @ d65536 | 11.22 ± 0.13 |
build: ab6120cde (8997)
@@ -0,0 +1,24 @@
ggml_vulkan: Found 1 Vulkan devices:
ggml_vulkan: 0 = Radeon 8060S Graphics (AMD open-source driver) | uma: 1 | fp16: 1 | bf16: 0 | warp size: 64 | shared memory: 32768 | int dot: 1 | matrix cores: KHR_coopmat
/lib64/libggml-base.so.0(+0x4465) [0x7fa5c507b465]
/lib64/libggml-base.so.0(ggml_print_backtrace+0x1eb) [0x7fa5c507b83b]
/lib64/libggml-base.so.0(+0x16f19) [0x7fa5c508df19]
/lib64/libstdc++.so.6(+0x1ebfc) [0x7fa5c4e1ebfc]
/lib64/libstdc++.so.6(_ZSt10unexpectedv+0x0) [0x7fa5c4e08d3a]
/lib64/libstdc++.so.6(+0x1eea8) [0x7fa5c4e1eea8]
/lib64/libggml-vulkan.so.0(+0x1569b) [0x7fa5c514e69b]
/lib64/libggml-vulkan.so.0(+0x14505a) [0x7fa5c527e05a]
/lib64/libggml-vulkan.so.0(+0x145c31) [0x7fa5c527ec31]
/lib64/libggml-base.so.0(ggml_backend_sched_graph_compute_async+0x7f3) [0x7fa5c50975d3]
/lib64/libllama.so.0(_ZN13llama_context13graph_computeEP11ggml_cgraphb+0xa0) [0x7fa5c8d05c70]
/lib64/libllama.so.0(_ZN13llama_context14process_ubatchERK12llama_ubatch14llm_graph_typeP22llama_memory_context_iR11ggml_status+0xe5) [0x7fa5c8d08255]
/lib64/libllama.so.0(_ZN13llama_context6decodeERK11llama_batch+0x35f) [0x7fa5c8d0e98f]
/lib64/libllama.so.0(llama_decode+0xe) [0x7fa5c8d1032e]
/usr/sbin/llama-bench() [0x40663b]
/usr/sbin/llama-bench() [0x403a5b]
/lib64/libc.so.6(+0x35b5) [0x7fa5c4aef5b5]
/lib64/libc.so.6(__libc_start_main+0x88) [0x7fa5c4aef668]
/usr/sbin/llama-bench() [0x404e65]
terminate called after throwing an instance of 'vk::DeviceLostError'
what(): vk::Queue::submit: ErrorDeviceLost
✖ ! [vulkan_amdvlk] GLM-4.7-Flash-BF16-00001-of-00002__fa1 __longctx65536 failed (exit 0)
@@ -0,0 +1,8 @@
ggml_vulkan: Found 1 Vulkan devices:
ggml_vulkan: 0 = Radeon 8060S Graphics (RADV GFX1151) (radv) | uma: 1 | fp16: 1 | bf16: 0 | warp size: 64 | shared memory: 65536 | int dot: 1 | matrix cores: KHR_coopmat
| model | size | params | backend | ngl | fa | mmap | test | t/s |
| ------------------------------ | ---------: | ---------: | ---------- | --: | -: | ---: | --------------: | -------------------: |
| deepseek2 30B.A3B BF16 | 55.79 GiB | 29.94 B | Vulkan | 99 | 1 | 0 | pp2048 @ d65536 | 61.52 ± 0.01 |
| deepseek2 30B.A3B BF16 | 55.79 GiB | 29.94 B | Vulkan | 99 | 1 | 0 | tg32 @ d65536 | 6.57 ± 0.00 |
build: ab6120cde (8997)
@@ -0,0 +1,8 @@
ggml_cuda_init: found 1 ROCm devices (Total VRAM: 126976 MiB):
Device 0: Radeon 8060S Graphics, gfx1151 (0x1151), VMM: no, Wave Size: 32, VRAM: 126976 MiB
| model | size | params | backend | ngl | n_ubatch | fa | mmap | test | t/s |
| ------------------------------ | ---------: | ---------: | ---------- | --: | -------: | -: | ---: | --------------: | -------------------: |
| deepseek2 30B.A3B Q8_0 | 32.70 GiB | 29.94 B | ROCm | 99 | 2048 | 1 | 0 | pp2048 @ d65536 | 47.63 ± 0.04 |
| deepseek2 30B.A3B Q8_0 | 32.70 GiB | 29.94 B | ROCm | 99 | 2048 | 1 | 0 | tg32 @ d65536 | 14.98 ± 0.00 |
build: 7957de9dc (8645)
@@ -0,0 +1,8 @@
ggml_cuda_init: found 1 ROCm devices (Total VRAM: 126976 MiB):
Device 0: Radeon 8060S Graphics, gfx1151 (0x1151), VMM: no, Wave Size: 32, VRAM: 126976 MiB
| model | size | params | backend | ngl | n_ubatch | fa | mmap | test | t/s |
| ------------------------------ | ---------: | ---------: | ---------- | --: | -------: | -: | ---: | --------------: | -------------------: |
| deepseek2 30B.A3B Q8_0 | 32.70 GiB | 29.94 B | ROCm | 99 | 2048 | 1 | 0 | pp2048 @ d65536 | 47.62 ± 0.15 |
| deepseek2 30B.A3B Q8_0 | 32.70 GiB | 29.94 B | ROCm | 99 | 2048 | 1 | 0 | tg32 @ d65536 | 14.87 ± 0.00 |
build: ab6120cde (8997)
@@ -0,0 +1,8 @@
ggml_cuda_init: found 1 ROCm devices (Total VRAM: 126976 MiB):
Device 0: Radeon 8060S Graphics, gfx1151 (0x1151), VMM: no, Wave Size: 32, VRAM: 126976 MiB
| model | size | params | backend | ngl | n_ubatch | fa | mmap | test | t/s |
| ------------------------------ | ---------: | ---------: | ---------- | --: | -------: | -: | ---: | --------------: | -------------------: |
| deepseek2 30B.A3B Q8_0 | 32.70 GiB | 29.94 B | ROCm | 99 | 2048 | 1 | 0 | pp2048 @ d65536 | 51.79 ± 0.10 |
| deepseek2 30B.A3B Q8_0 | 32.70 GiB | 29.94 B | ROCm | 99 | 2048 | 1 | 0 | tg32 @ d65536 | 14.36 ± 0.00 |
build: ab6120cde (8997)
@@ -0,0 +1,8 @@
ggml_cuda_init: found 1 ROCm devices (Total VRAM: 126976 MiB):
Device 0: Radeon 8060S Graphics, gfx1151 (0x1151), VMM: no, Wave Size: 32, VRAM: 126976 MiB
| model | size | params | backend | ngl | n_ubatch | fa | mmap | test | t/s |
| ------------------------------ | ---------: | ---------: | ---------- | --: | -------: | -: | ---: | --------------: | -------------------: |
| deepseek2 30B.A3B Q8_0 | 32.70 GiB | 29.94 B | ROCm | 99 | 2048 | 1 | 0 | pp2048 @ d65536 | 46.43 ± 0.14 |
| deepseek2 30B.A3B Q8_0 | 32.70 GiB | 29.94 B | ROCm | 99 | 2048 | 1 | 0 | tg32 @ d65536 | 14.43 ± 0.00 |
build: ab6120cde (8997)
@@ -0,0 +1,23 @@
ggml_vulkan: Found 1 Vulkan devices:
ggml_vulkan: 0 = Radeon 8060S Graphics (AMD open-source driver) | uma: 1 | fp16: 1 | bf16: 0 | warp size: 64 | shared memory: 32768 | int dot: 1 | matrix cores: KHR_coopmat
/lib64/libggml-base.so.0(+0x4465) [0x7fb35126c465]
/lib64/libggml-base.so.0(ggml_print_backtrace+0x1eb) [0x7fb35126c83b]
/lib64/libggml-base.so.0(+0x16f19) [0x7fb35127ef19]
/lib64/libstdc++.so.6(+0x1ebfc) [0x7fb35100fbfc]
/lib64/libstdc++.so.6(_ZSt10unexpectedv+0x0) [0x7fb350ff9d3a]
/lib64/libstdc++.so.6(+0x1eea8) [0x7fb35100fea8]
/lib64/libggml-vulkan.so.0(+0x1728d) [0x7fb35134128d]
/lib64/libggml-vulkan.so.0(+0x10a410) [0x7fb351434410]
/lib64/libggml-base.so.0(ggml_backend_sched_graph_compute_async+0x3b2) [0x7fb351288192]
/lib64/libllama.so.0(_ZN13llama_context13graph_computeEP11ggml_cgraphb+0xa0) [0x7fb354ef6c70]
/lib64/libllama.so.0(_ZN13llama_context14process_ubatchERK12llama_ubatch14llm_graph_typeP22llama_memory_context_iR11ggml_status+0xe5) [0x7fb354ef9255]
/lib64/libllama.so.0(_ZN13llama_context6decodeERK11llama_batch+0x35f) [0x7fb354eff98f]
/lib64/libllama.so.0(llama_decode+0xe) [0x7fb354f0132e]
/usr/sbin/llama-bench() [0x40663b]
/usr/sbin/llama-bench() [0x403a5b]
/lib64/libc.so.6(+0x35b5) [0x7fb350ce05b5]
/lib64/libc.so.6(__libc_start_main+0x88) [0x7fb350ce0668]
/usr/sbin/llama-bench() [0x404e65]
terminate called after throwing an instance of 'vk::DeviceLostError'
what(): vk::Queue::submit: ErrorDeviceLost
✖ ! [vulkan_amdvlk] GLM-4.7-Flash-UD-Q8_K_XL__fa1 __longctx65536 failed (exit 0)
@@ -0,0 +1,8 @@
ggml_vulkan: Found 1 Vulkan devices:
ggml_vulkan: 0 = Radeon 8060S Graphics (RADV GFX1151) (radv) | uma: 1 | fp16: 1 | bf16: 0 | warp size: 64 | shared memory: 65536 | int dot: 1 | matrix cores: KHR_coopmat
| model | size | params | backend | ngl | fa | mmap | test | t/s |
| ------------------------------ | ---------: | ---------: | ---------- | --: | -: | ---: | --------------: | -------------------: |
| deepseek2 30B.A3B Q8_0 | 32.70 GiB | 29.94 B | Vulkan | 99 | 1 | 0 | pp2048 @ d65536 | 72.80 ± 0.00 |
| deepseek2 30B.A3B Q8_0 | 32.70 GiB | 29.94 B | Vulkan | 99 | 1 | 0 | tg32 @ d65536 | 14.39 ± 0.01 |
build: ab6120cde (8997)
@@ -0,0 +1,6 @@
ggml_cuda_init: found 1 ROCm devices (Total VRAM: 126976 MiB):
Device 0: Radeon 8060S Graphics, gfx1151 (0x1151), VMM: no, Wave Size: 32, VRAM: 126976 MiB
main: error: failed to load model '/home/kyuz0/models/mini-max-m2.7/UD-Q3_K_S/MiniMax-M2.7-UD-Q3_K_S-00001-of-00003.gguf'
| model | size | params | backend | ngl | n_ubatch | fa | mmap | test | t/s |
| ------------------------------ | ---------: | ---------: | ---------- | --: | -------: | -: | ---: | --------------: | -------------------: |
✖ ! [rocm-7_2_2-pr21344] MiniMax-M2.7-UD-Q3_K_S-00001-of-00003__fa1 __longctx65536 failed (exit 0)
@@ -0,0 +1,6 @@
ggml_cuda_init: found 1 ROCm devices (Total VRAM: 126976 MiB):
Device 0: Radeon 8060S Graphics, gfx1151 (0x1151), VMM: no, Wave Size: 32, VRAM: 126976 MiB
main: error: failed to load model '/home/kyuz0/models/mini-max-m2.7/UD-Q3_K_S/MiniMax-M2.7-UD-Q3_K_S-00001-of-00003.gguf'
| model | size | params | backend | ngl | n_ubatch | fa | mmap | test | t/s |
| ------------------------------ | ---------: | ---------: | ---------- | --: | -------: | -: | ---: | --------------: | -------------------: |
✖ ! [rocm-7_2_2] MiniMax-M2.7-UD-Q3_K_S-00001-of-00003__fa1 __longctx65536 failed (exit 0)
@@ -0,0 +1,3 @@
ggml_cuda_init: found 1 ROCm devices (Total VRAM: 126976 MiB):
Device 0: Radeon 8060S Graphics, gfx1151 (0x1151), VMM: no, Wave Size: 32, VRAM: 126976 MiB
✖ ! [rocm6_4_4] MiniMax-M2.7-UD-Q3_K_S-00001-of-00003__fa1 __longctx65536 failed (exit 0)
@@ -0,0 +1,3 @@
ggml_cuda_init: found 1 ROCm devices (Total VRAM: 126976 MiB):
Device 0: Radeon 8060S Graphics, gfx1151 (0x1151), VMM: no, Wave Size: 32, VRAM: 126976 MiB
✖ ! [rocm7-nightlies] MiniMax-M2.7-UD-Q3_K_S-00001-of-00003__fa1 __longctx65536 failed (exit 0)
@@ -0,0 +1,3 @@
ggml_vulkan: Found 1 Vulkan devices:
ggml_vulkan: 0 = Radeon 8060S Graphics (AMD open-source driver) | uma: 1 | fp16: 1 | bf16: 0 | warp size: 64 | shared memory: 32768 | int dot: 1 | matrix cores: KHR_coopmat
✖ ! [vulkan_amdvlk] MiniMax-M2.7-UD-Q3_K_S-00001-of-00003__fa1 __longctx65536 failed (exit 0)
@@ -0,0 +1,3 @@
ggml_vulkan: Found 1 Vulkan devices:
ggml_vulkan: 0 = Radeon 8060S Graphics (RADV GFX1151) (radv) | uma: 1 | fp16: 1 | bf16: 0 | warp size: 64 | shared memory: 65536 | int dot: 1 | matrix cores: KHR_coopmat
✖ ! [vulkan_radv] MiniMax-M2.7-UD-Q3_K_S-00001-of-00003__fa1 __longctx65536 failed (exit 0)
@@ -0,0 +1,8 @@
ggml_cuda_init: found 1 ROCm devices (Total VRAM: 126976 MiB):
Device 0: Radeon 8060S Graphics, gfx1151 (0x1151), VMM: no, Wave Size: 32, VRAM: 126976 MiB
| model | size | params | backend | ngl | n_ubatch | fa | mmap | test | t/s |
| ------------------------------ | ---------: | ---------: | ---------- | --: | -------: | -: | ---: | --------------: | -------------------: |
| mistral3 14B BF16 | 25.16 GiB | 13.51 B | ROCm | 99 | 2048 | 1 | 0 | pp2048 @ d65536 | 75.02 ± 1.56 |
| mistral3 14B BF16 | 25.16 GiB | 13.51 B | ROCm | 99 | 2048 | 1 | 0 | tg32 @ d65536 | 6.11 ± 0.00 |
build: 7957de9dc (8645)
@@ -0,0 +1,8 @@
ggml_cuda_init: found 1 ROCm devices (Total VRAM: 126976 MiB):
Device 0: Radeon 8060S Graphics, gfx1151 (0x1151), VMM: no, Wave Size: 32, VRAM: 126976 MiB
| model | size | params | backend | ngl | n_ubatch | fa | mmap | test | t/s |
| ------------------------------ | ---------: | ---------: | ---------- | --: | -------: | -: | ---: | --------------: | -------------------: |
| mistral3 14B BF16 | 25.16 GiB | 13.51 B | ROCm | 99 | 2048 | 1 | 0 | pp2048 @ d65536 | 80.08 ± 2.35 |
| mistral3 14B BF16 | 25.16 GiB | 13.51 B | ROCm | 99 | 2048 | 1 | 0 | tg32 @ d65536 | 6.11 ± 0.00 |
build: ab6120cde (8997)
@@ -0,0 +1,8 @@
ggml_cuda_init: found 1 ROCm devices (Total VRAM: 126976 MiB):
Device 0: Radeon 8060S Graphics, gfx1151 (0x1151), VMM: no, Wave Size: 32, VRAM: 126976 MiB
| model | size | params | backend | ngl | n_ubatch | fa | mmap | test | t/s |
| ------------------------------ | ---------: | ---------: | ---------- | --: | -------: | -: | ---: | --------------: | -------------------: |
| mistral3 14B BF16 | 25.16 GiB | 13.51 B | ROCm | 99 | 2048 | 1 | 0 | pp2048 @ d65536 | 71.24 ± 0.32 |
| mistral3 14B BF16 | 25.16 GiB | 13.51 B | ROCm | 99 | 2048 | 1 | 0 | tg32 @ d65536 | 6.09 ± 0.00 |
build: ab6120cde (8997)
@@ -0,0 +1,8 @@
ggml_cuda_init: found 1 ROCm devices (Total VRAM: 126976 MiB):
Device 0: Radeon 8060S Graphics, gfx1151 (0x1151), VMM: no, Wave Size: 32, VRAM: 126976 MiB
| model | size | params | backend | ngl | n_ubatch | fa | mmap | test | t/s |
| ------------------------------ | ---------: | ---------: | ---------- | --: | -------: | -: | ---: | --------------: | -------------------: |
| mistral3 14B BF16 | 25.16 GiB | 13.51 B | ROCm | 99 | 2048 | 1 | 0 | pp2048 @ d65536 | 73.23 ± 0.94 |
| mistral3 14B BF16 | 25.16 GiB | 13.51 B | ROCm | 99 | 2048 | 1 | 0 | tg32 @ d65536 | 6.11 ± 0.00 |
build: ab6120cde (8997)
@@ -0,0 +1,23 @@
ggml_vulkan: Found 1 Vulkan devices:
ggml_vulkan: 0 = Radeon 8060S Graphics (AMD open-source driver) | uma: 1 | fp16: 1 | bf16: 0 | warp size: 64 | shared memory: 32768 | int dot: 1 | matrix cores: KHR_coopmat
/lib64/libggml-base.so.0(+0x4465) [0x7f41f07d5465]
/lib64/libggml-base.so.0(ggml_print_backtrace+0x1eb) [0x7f41f07d583b]
/lib64/libggml-base.so.0(+0x16f19) [0x7f41f07e7f19]
/lib64/libstdc++.so.6(+0x1ebfc) [0x7f41f0578bfc]
/lib64/libstdc++.so.6(_ZSt10unexpectedv+0x0) [0x7f41f0562d3a]
/lib64/libstdc++.so.6(+0x1eea8) [0x7f41f0578ea8]
/lib64/libggml-vulkan.so.0(+0x1728d) [0x7f41f08aa28d]
/lib64/libggml-vulkan.so.0(+0x10a410) [0x7f41f099d410]
/lib64/libggml-base.so.0(ggml_backend_sched_graph_compute_async+0x3b2) [0x7f41f07f1192]
/lib64/libllama.so.0(_ZN13llama_context13graph_computeEP11ggml_cgraphb+0xa0) [0x7f41f445fc70]
/lib64/libllama.so.0(_ZN13llama_context14process_ubatchERK12llama_ubatch14llm_graph_typeP22llama_memory_context_iR11ggml_status+0xe5) [0x7f41f4462255]
/lib64/libllama.so.0(_ZN13llama_context6decodeERK11llama_batch+0x35f) [0x7f41f446898f]
/lib64/libllama.so.0(llama_decode+0xe) [0x7f41f446a32e]
/usr/sbin/llama-bench() [0x40663b]
/usr/sbin/llama-bench() [0x4038b9]
/lib64/libc.so.6(+0x35b5) [0x7f41f02495b5]
/lib64/libc.so.6(__libc_start_main+0x88) [0x7f41f0249668]
/usr/sbin/llama-bench() [0x404e65]
terminate called after throwing an instance of 'vk::DeviceLostError'
what(): vk::Queue::submit: ErrorDeviceLost
✖ ! [vulkan_amdvlk] Ministral-3-14B-Instruct-2512-BF16__fa1 __longctx65536 failed (exit 0)
@@ -0,0 +1,8 @@
ggml_vulkan: Found 1 Vulkan devices:
ggml_vulkan: 0 = Radeon 8060S Graphics (RADV GFX1151) (radv) | uma: 1 | fp16: 1 | bf16: 0 | warp size: 64 | shared memory: 65536 | int dot: 1 | matrix cores: KHR_coopmat
| model | size | params | backend | ngl | fa | mmap | test | t/s |
| ------------------------------ | ---------: | ---------: | ---------- | --: | -: | ---: | --------------: | -------------------: |
| mistral3 14B BF16 | 25.16 GiB | 13.51 B | Vulkan | 99 | 1 | 0 | pp2048 @ d65536 | 42.48 ± 0.11 |
| mistral3 14B BF16 | 25.16 GiB | 13.51 B | Vulkan | 99 | 1 | 0 | tg32 @ d65536 | 5.81 ± 0.00 |
build: ab6120cde (8997)
@@ -0,0 +1,8 @@
ggml_cuda_init: found 1 ROCm devices (Total VRAM: 126976 MiB):
Device 0: Radeon 8060S Graphics, gfx1151 (0x1151), VMM: no, Wave Size: 32, VRAM: 126976 MiB
| model | size | params | backend | ngl | n_ubatch | fa | mmap | test | t/s |
| ------------------------------ | ---------: | ---------: | ---------- | --: | -------: | -: | ---: | --------------: | -------------------: |
| nemotron_h_moe 120B.A12B Q4_K - Medium | 78.02 GiB | 120.67 B | ROCm | 99 | 2048 | 1 | 0 | pp2048 @ d65536 | 250.79 ± 0.32 |
| nemotron_h_moe 120B.A12B Q4_K - Medium | 78.02 GiB | 120.67 B | ROCm | 99 | 2048 | 1 | 0 | tg32 @ d65536 | 15.03 ± 0.04 |
build: 7957de9dc (8645)
@@ -0,0 +1,8 @@
ggml_cuda_init: found 1 ROCm devices (Total VRAM: 126976 MiB):
Device 0: Radeon 8060S Graphics, gfx1151 (0x1151), VMM: no, Wave Size: 32, VRAM: 126976 MiB
| model | size | params | backend | ngl | n_ubatch | fa | mmap | test | t/s |
| ------------------------------ | ---------: | ---------: | ---------- | --: | -------: | -: | ---: | --------------: | -------------------: |
| nemotron_h_moe 120B.A12B Q4_K - Medium | 78.02 GiB | 120.67 B | ROCm | 99 | 2048 | 1 | 0 | pp2048 @ d65536 | 249.66 ± 0.90 |
| nemotron_h_moe 120B.A12B Q4_K - Medium | 78.02 GiB | 120.67 B | ROCm | 99 | 2048 | 1 | 0 | tg32 @ d65536 | 15.42 ± 0.07 |
build: ab6120cde (8997)
@@ -0,0 +1,8 @@
ggml_cuda_init: found 1 ROCm devices (Total VRAM: 126976 MiB):
Device 0: Radeon 8060S Graphics, gfx1151 (0x1151), VMM: no, Wave Size: 32, VRAM: 126976 MiB
| model | size | params | backend | ngl | n_ubatch | fa | mmap | test | t/s |
| ------------------------------ | ---------: | ---------: | ---------- | --: | -------: | -: | ---: | --------------: | -------------------: |
| nemotron_h_moe 120B.A12B Q4_K - Medium | 78.02 GiB | 120.67 B | ROCm | 99 | 2048 | 1 | 0 | pp2048 @ d65536 | 290.67 ± 0.26 |
| nemotron_h_moe 120B.A12B Q4_K - Medium | 78.02 GiB | 120.67 B | ROCm | 99 | 2048 | 1 | 0 | tg32 @ d65536 | 15.11 ± 0.05 |
build: ab6120cde (8997)
@@ -0,0 +1,8 @@
ggml_cuda_init: found 1 ROCm devices (Total VRAM: 126976 MiB):
Device 0: Radeon 8060S Graphics, gfx1151 (0x1151), VMM: no, Wave Size: 32, VRAM: 126976 MiB
| model | size | params | backend | ngl | n_ubatch | fa | mmap | test | t/s |
| ------------------------------ | ---------: | ---------: | ---------- | --: | -------: | -: | ---: | --------------: | -------------------: |
| nemotron_h_moe 120B.A12B Q4_K - Medium | 78.02 GiB | 120.67 B | ROCm | 99 | 2048 | 1 | 0 | pp2048 @ d65536 | 242.09 ± 0.33 |
| nemotron_h_moe 120B.A12B Q4_K - Medium | 78.02 GiB | 120.67 B | ROCm | 99 | 2048 | 1 | 0 | tg32 @ d65536 | 15.47 ± 0.10 |
build: ab6120cde (8997)
@@ -0,0 +1,8 @@
ggml_vulkan: Found 1 Vulkan devices:
ggml_vulkan: 0 = Radeon 8060S Graphics (AMD open-source driver) | uma: 1 | fp16: 1 | bf16: 0 | warp size: 64 | shared memory: 32768 | int dot: 1 | matrix cores: KHR_coopmat
| model | size | params | backend | ngl | fa | mmap | test | t/s |
| ------------------------------ | ---------: | ---------: | ---------- | --: | -: | ---: | --------------: | -------------------: |
| nemotron_h_moe 120B.A12B Q4_K - Medium | 78.02 GiB | 120.67 B | Vulkan | 99 | 1 | 0 | pp2048 @ d65536 | 85.09 ± 0.16 |
| nemotron_h_moe 120B.A12B Q4_K - Medium | 78.02 GiB | 120.67 B | Vulkan | 99 | 1 | 0 | tg32 @ d65536 | 13.21 ± 0.04 |
build: ab6120cde (8997)
@@ -0,0 +1,8 @@
ggml_vulkan: Found 1 Vulkan devices:
ggml_vulkan: 0 = Radeon 8060S Graphics (RADV GFX1151) (radv) | uma: 1 | fp16: 1 | bf16: 0 | warp size: 64 | shared memory: 65536 | int dot: 1 | matrix cores: KHR_coopmat
| model | size | params | backend | ngl | fa | mmap | test | t/s |
| ------------------------------ | ---------: | ---------: | ---------- | --: | -: | ---: | --------------: | -------------------: |
| nemotron_h_moe 120B.A12B Q4_K - Medium | 78.02 GiB | 120.67 B | Vulkan | 99 | 1 | 0 | pp2048 @ d65536 | 174.22 ± 0.19 |
| nemotron_h_moe 120B.A12B Q4_K - Medium | 78.02 GiB | 120.67 B | Vulkan | 99 | 1 | 0 | tg32 @ d65536 | 13.55 ± 0.03 |
build: ab6120cde (8997)
@@ -0,0 +1,8 @@
ggml_cuda_init: found 1 ROCm devices (Total VRAM: 126976 MiB):
Device 0: Radeon 8060S Graphics, gfx1151 (0x1151), VMM: no, Wave Size: 32, VRAM: 126976 MiB
| model | size | params | backend | ngl | n_ubatch | fa | mmap | test | t/s |
| ------------------------------ | ---------: | ---------: | ---------- | --: | -------: | -: | ---: | --------------: | -------------------: |
| qwen3moe 30B.A3B Q4_K - Medium | 17.35 GiB | 30.53 B | ROCm | 99 | 2048 | 1 | 0 | pp2048 @ d65536 | 110.56 ± 0.10 |
| qwen3moe 30B.A3B Q4_K - Medium | 17.35 GiB | 30.53 B | ROCm | 99 | 2048 | 1 | 0 | tg32 @ d65536 | 22.59 ± 0.01 |
build: 7957de9dc (8645)
@@ -0,0 +1,8 @@
ggml_cuda_init: found 1 ROCm devices (Total VRAM: 126976 MiB):
Device 0: Radeon 8060S Graphics, gfx1151 (0x1151), VMM: no, Wave Size: 32, VRAM: 126976 MiB
| model | size | params | backend | ngl | n_ubatch | fa | mmap | test | t/s |
| ------------------------------ | ---------: | ---------: | ---------- | --: | -------: | -: | ---: | --------------: | -------------------: |
| qwen3moe 30B.A3B Q4_K - Medium | 17.35 GiB | 30.53 B | ROCm | 99 | 2048 | 1 | 0 | pp2048 @ d65536 | 102.26 ± 0.13 |
| qwen3moe 30B.A3B Q4_K - Medium | 17.35 GiB | 30.53 B | ROCm | 99 | 2048 | 1 | 0 | tg32 @ d65536 | 22.53 ± 0.07 |
build: ab6120cde (8997)
@@ -0,0 +1,8 @@
ggml_cuda_init: found 1 ROCm devices (Total VRAM: 126976 MiB):
Device 0: Radeon 8060S Graphics, gfx1151 (0x1151), VMM: no, Wave Size: 32, VRAM: 126976 MiB
| model | size | params | backend | ngl | n_ubatch | fa | mmap | test | t/s |
| ------------------------------ | ---------: | ---------: | ---------- | --: | -------: | -: | ---: | --------------: | -------------------: |
| qwen3moe 30B.A3B Q4_K - Medium | 17.35 GiB | 30.53 B | ROCm | 99 | 2048 | 1 | 0 | pp2048 @ d65536 | 159.40 ± 0.10 |
| qwen3moe 30B.A3B Q4_K - Medium | 17.35 GiB | 30.53 B | ROCm | 99 | 2048 | 1 | 0 | tg32 @ d65536 | 22.26 ± 0.17 |
build: ab6120cde (8997)
@@ -0,0 +1,8 @@
ggml_cuda_init: found 1 ROCm devices (Total VRAM: 126976 MiB):
Device 0: Radeon 8060S Graphics, gfx1151 (0x1151), VMM: no, Wave Size: 32, VRAM: 126976 MiB
| model | size | params | backend | ngl | n_ubatch | fa | mmap | test | t/s |
| ------------------------------ | ---------: | ---------: | ---------- | --: | -------: | -: | ---: | --------------: | -------------------: |
| qwen3moe 30B.A3B Q4_K - Medium | 17.35 GiB | 30.53 B | ROCm | 99 | 2048 | 1 | 0 | pp2048 @ d65536 | 102.96 ± 0.05 |
| qwen3moe 30B.A3B Q4_K - Medium | 17.35 GiB | 30.53 B | ROCm | 99 | 2048 | 1 | 0 | tg32 @ d65536 | 23.26 ± 0.12 |
build: ab6120cde (8997)
@@ -0,0 +1,8 @@
ggml_vulkan: Found 1 Vulkan devices:
ggml_vulkan: 0 = Radeon 8060S Graphics (AMD open-source driver) | uma: 1 | fp16: 1 | bf16: 0 | warp size: 64 | shared memory: 32768 | int dot: 1 | matrix cores: KHR_coopmat
| model | size | params | backend | ngl | fa | mmap | test | t/s |
| ------------------------------ | ---------: | ---------: | ---------- | --: | -: | ---: | --------------: | -------------------: |
| qwen3moe 30B.A3B Q4_K - Medium | 17.35 GiB | 30.53 B | Vulkan | 99 | 1 | 0 | pp2048 @ d65536 | 60.42 ± 0.04 |
| qwen3moe 30B.A3B Q4_K - Medium | 17.35 GiB | 30.53 B | Vulkan | 99 | 1 | 0 | tg32 @ d65536 | 17.44 ± 0.01 |
build: ab6120cde (8997)
@@ -0,0 +1,8 @@
ggml_vulkan: Found 1 Vulkan devices:
ggml_vulkan: 0 = Radeon 8060S Graphics (RADV GFX1151) (radv) | uma: 1 | fp16: 1 | bf16: 0 | warp size: 64 | shared memory: 65536 | int dot: 1 | matrix cores: KHR_coopmat
| model | size | params | backend | ngl | fa | mmap | test | t/s |
| ------------------------------ | ---------: | ---------: | ---------- | --: | -: | ---: | --------------: | -------------------: |
| qwen3moe 30B.A3B Q4_K - Medium | 17.35 GiB | 30.53 B | Vulkan | 99 | 1 | 0 | pp2048 @ d65536 | 63.65 ± 0.38 |
| qwen3moe 30B.A3B Q4_K - Medium | 17.35 GiB | 30.53 B | Vulkan | 99 | 1 | 0 | tg32 @ d65536 | 24.54 ± 0.02 |
build: ab6120cde (8997)
@@ -0,0 +1,8 @@
ggml_cuda_init: found 1 ROCm devices (Total VRAM: 126976 MiB):
Device 0: Radeon 8060S Graphics, gfx1151 (0x1151), VMM: no, Wave Size: 32, VRAM: 126976 MiB
| model | size | params | backend | ngl | n_ubatch | fa | mmap | test | t/s |
| ------------------------------ | ---------: | ---------: | ---------- | --: | -------: | -: | ---: | --------------: | -------------------: |
| qwen35moe 122B.A10B Q5_K - Medium | 85.60 GiB | 122.11 B | ROCm | 99 | 2048 | 1 | 0 | pp2048 @ d65536 | 170.79 ± 0.22 |
| qwen35moe 122B.A10B Q5_K - Medium | 85.60 GiB | 122.11 B | ROCm | 99 | 2048 | 1 | 0 | tg32 @ d65536 | 14.19 ± 0.06 |
build: 7957de9dc (8645)
@@ -0,0 +1,8 @@
ggml_cuda_init: found 1 ROCm devices (Total VRAM: 126976 MiB):
Device 0: Radeon 8060S Graphics, gfx1151 (0x1151), VMM: no, Wave Size: 32, VRAM: 126976 MiB
| model | size | params | backend | ngl | n_ubatch | fa | mmap | test | t/s |
| ------------------------------ | ---------: | ---------: | ---------- | --: | -------: | -: | ---: | --------------: | -------------------: |
| qwen35moe 122B.A10B Q5_K - Medium | 85.60 GiB | 122.11 B | ROCm | 99 | 2048 | 1 | 0 | pp2048 @ d65536 | 152.12 ± 0.13 |
| qwen35moe 122B.A10B Q5_K - Medium | 85.60 GiB | 122.11 B | ROCm | 99 | 2048 | 1 | 0 | tg32 @ d65536 | 14.21 ± 0.22 |
build: ab6120cde (8997)
@@ -0,0 +1,8 @@
ggml_cuda_init: found 1 ROCm devices (Total VRAM: 126976 MiB):
Device 0: Radeon 8060S Graphics, gfx1151 (0x1151), VMM: no, Wave Size: 32, VRAM: 126976 MiB
| model | size | params | backend | ngl | n_ubatch | fa | mmap | test | t/s |
| ------------------------------ | ---------: | ---------: | ---------- | --: | -------: | -: | ---: | --------------: | -------------------: |
| qwen35moe 122B.A10B Q5_K - Medium | 85.60 GiB | 122.11 B | ROCm | 99 | 2048 | 1 | 0 | pp2048 @ d65536 | 187.01 ± 3.13 |
| qwen35moe 122B.A10B Q5_K - Medium | 85.60 GiB | 122.11 B | ROCm | 99 | 2048 | 1 | 0 | tg32 @ d65536 | 16.91 ± 0.09 |
build: ab6120cde (8997)
@@ -0,0 +1,8 @@
ggml_cuda_init: found 1 ROCm devices (Total VRAM: 126976 MiB):
Device 0: Radeon 8060S Graphics, gfx1151 (0x1151), VMM: no, Wave Size: 32, VRAM: 126976 MiB
| model | size | params | backend | ngl | n_ubatch | fa | mmap | test | t/s |
| ------------------------------ | ---------: | ---------: | ---------- | --: | -------: | -: | ---: | --------------: | -------------------: |
| qwen35moe 122B.A10B Q5_K - Medium | 85.60 GiB | 122.11 B | ROCm | 99 | 2048 | 1 | 0 | pp2048 @ d65536 | 138.80 ± 0.17 |
| qwen35moe 122B.A10B Q5_K - Medium | 85.60 GiB | 122.11 B | ROCm | 99 | 2048 | 1 | 0 | tg32 @ d65536 | 17.26 ± 0.15 |
build: ab6120cde (8997)
@@ -0,0 +1,8 @@
ggml_vulkan: Found 1 Vulkan devices:
ggml_vulkan: 0 = Radeon 8060S Graphics (AMD open-source driver) | uma: 1 | fp16: 1 | bf16: 0 | warp size: 64 | shared memory: 32768 | int dot: 1 | matrix cores: KHR_coopmat
| model | size | params | backend | ngl | fa | mmap | test | t/s |
| ------------------------------ | ---------: | ---------: | ---------- | --: | -: | ---: | --------------: | -------------------: |
| qwen35moe 122B.A10B Q5_K - Medium | 85.60 GiB | 122.11 B | Vulkan | 99 | 1 | 0 | pp2048 @ d65536 | 70.58 ± 0.04 |
| qwen35moe 122B.A10B Q5_K - Medium | 85.60 GiB | 122.11 B | Vulkan | 99 | 1 | 0 | tg32 @ d65536 | 16.94 ± 0.01 |
build: ab6120cde (8997)
@@ -0,0 +1,8 @@
ggml_vulkan: Found 1 Vulkan devices:
ggml_vulkan: 0 = Radeon 8060S Graphics (RADV GFX1151) (radv) | uma: 1 | fp16: 1 | bf16: 0 | warp size: 64 | shared memory: 65536 | int dot: 1 | matrix cores: KHR_coopmat
| model | size | params | backend | ngl | fa | mmap | test | t/s |
| ------------------------------ | ---------: | ---------: | ---------- | --: | -: | ---: | --------------: | -------------------: |
| qwen35moe 122B.A10B Q5_K - Medium | 85.60 GiB | 122.11 B | Vulkan | 99 | 1 | 0 | pp2048 @ d65536 | 156.41 ± 0.22 |
| qwen35moe 122B.A10B Q5_K - Medium | 85.60 GiB | 122.11 B | Vulkan | 99 | 1 | 0 | tg32 @ d65536 | 18.78 ± 0.01 |
build: ab6120cde (8997)
@@ -0,0 +1,8 @@
ggml_cuda_init: found 1 ROCm devices (Total VRAM: 126976 MiB):
Device 0: Radeon 8060S Graphics, gfx1151 (0x1151), VMM: no, Wave Size: 32, VRAM: 126976 MiB
| model | size | params | backend | ngl | n_ubatch | fa | mmap | test | t/s |
| ------------------------------ | ---------: | ---------: | ---------- | --: | -------: | -: | ---: | --------------: | -------------------: |
| qwen35moe 35B.A3B BF16 | 64.60 GiB | 34.66 B | ROCm | 99 | 2048 | 1 | 0 | pp2048 @ d65536 | 296.43 ± 1.09 |
| qwen35moe 35B.A3B BF16 | 64.60 GiB | 34.66 B | ROCm | 99 | 2048 | 1 | 0 | tg32 @ d65536 | 19.60 ± 0.00 |
build: 7957de9dc (8645)
@@ -0,0 +1,8 @@
ggml_cuda_init: found 1 ROCm devices (Total VRAM: 126976 MiB):
Device 0: Radeon 8060S Graphics, gfx1151 (0x1151), VMM: no, Wave Size: 32, VRAM: 126976 MiB
| model | size | params | backend | ngl | n_ubatch | fa | mmap | test | t/s |
| ------------------------------ | ---------: | ---------: | ---------- | --: | -------: | -: | ---: | --------------: | -------------------: |
| qwen35moe 35B.A3B BF16 | 64.60 GiB | 34.66 B | ROCm | 99 | 2048 | 1 | 0 | pp2048 @ d65536 | 264.06 ± 0.65 |
| qwen35moe 35B.A3B BF16 | 64.60 GiB | 34.66 B | ROCm | 99 | 2048 | 1 | 0 | tg32 @ d65536 | 20.14 ± 0.01 |
build: ab6120cde (8997)
@@ -0,0 +1,8 @@
ggml_cuda_init: found 1 ROCm devices (Total VRAM: 126976 MiB):
Device 0: Radeon 8060S Graphics, gfx1151 (0x1151), VMM: no, Wave Size: 32, VRAM: 126976 MiB
| model | size | params | backend | ngl | n_ubatch | fa | mmap | test | t/s |
| ------------------------------ | ---------: | ---------: | ---------- | --: | -------: | -: | ---: | --------------: | -------------------: |
| qwen35moe 35B.A3B BF16 | 64.60 GiB | 34.66 B | ROCm | 99 | 2048 | 1 | 0 | pp2048 @ d65536 | 326.10 ± 1.55 |
| qwen35moe 35B.A3B BF16 | 64.60 GiB | 34.66 B | ROCm | 99 | 2048 | 1 | 0 | tg32 @ d65536 | 19.05 ± 0.01 |
build: ab6120cde (8997)
@@ -0,0 +1,8 @@
ggml_cuda_init: found 1 ROCm devices (Total VRAM: 126976 MiB):
Device 0: Radeon 8060S Graphics, gfx1151 (0x1151), VMM: no, Wave Size: 32, VRAM: 126976 MiB
| model | size | params | backend | ngl | n_ubatch | fa | mmap | test | t/s |
| ------------------------------ | ---------: | ---------: | ---------- | --: | -------: | -: | ---: | --------------: | -------------------: |
| qwen35moe 35B.A3B BF16 | 64.60 GiB | 34.66 B | ROCm | 99 | 2048 | 1 | 0 | pp2048 @ d65536 | 256.49 ± 1.26 |
| qwen35moe 35B.A3B BF16 | 64.60 GiB | 34.66 B | ROCm | 99 | 2048 | 1 | 0 | tg32 @ d65536 | 20.93 ± 0.06 |
build: ab6120cde (8997)
@@ -0,0 +1,8 @@
ggml_vulkan: Found 1 Vulkan devices:
ggml_vulkan: 0 = Radeon 8060S Graphics (AMD open-source driver) | uma: 1 | fp16: 1 | bf16: 0 | warp size: 64 | shared memory: 32768 | int dot: 1 | matrix cores: KHR_coopmat
| model | size | params | backend | ngl | fa | mmap | test | t/s |
| ------------------------------ | ---------: | ---------: | ---------- | --: | -: | ---: | --------------: | -------------------: |
| qwen35moe 35B.A3B BF16 | 64.60 GiB | 34.66 B | Vulkan | 99 | 1 | 0 | pp2048 @ d65536 | 82.44 ± 0.12 |
| qwen35moe 35B.A3B BF16 | 64.60 GiB | 34.66 B | Vulkan | 99 | 1 | 0 | tg32 @ d65536 | 10.40 ± 0.01 |
build: ab6120cde (8997)
@@ -0,0 +1,8 @@
ggml_vulkan: Found 1 Vulkan devices:
ggml_vulkan: 0 = Radeon 8060S Graphics (RADV GFX1151) (radv) | uma: 1 | fp16: 1 | bf16: 0 | warp size: 64 | shared memory: 65536 | int dot: 1 | matrix cores: KHR_coopmat
| model | size | params | backend | ngl | fa | mmap | test | t/s |
| ------------------------------ | ---------: | ---------: | ---------- | --: | -: | ---: | --------------: | -------------------: |
| qwen35moe 35B.A3B BF16 | 64.60 GiB | 34.66 B | Vulkan | 99 | 1 | 0 | pp2048 @ d65536 | 226.78 ± 1.49 |
| qwen35moe 35B.A3B BF16 | 64.60 GiB | 34.66 B | Vulkan | 99 | 1 | 0 | tg32 @ d65536 | 10.05 ± 0.00 |
build: ab6120cde (8997)
@@ -0,0 +1,8 @@
ggml_cuda_init: found 1 ROCm devices (Total VRAM: 126976 MiB):
Device 0: Radeon 8060S Graphics, gfx1151 (0x1151), VMM: no, Wave Size: 32, VRAM: 126976 MiB
| model | size | params | backend | ngl | n_ubatch | fa | mmap | test | t/s |
| ------------------------------ | ---------: | ---------: | ---------- | --: | -------: | -: | ---: | --------------: | -------------------: |
| qwen35moe 35B.A3B Q4_K - Medium | 20.81 GiB | 34.66 B | ROCm | 99 | 2048 | 1 | 0 | pp2048 @ d65536 | 412.60 ± 0.33 |
| qwen35moe 35B.A3B Q4_K - Medium | 20.81 GiB | 34.66 B | ROCm | 99 | 2048 | 1 | 0 | tg32 @ d65536 | 34.90 ± 0.02 |
build: 7957de9dc (8645)
@@ -0,0 +1,8 @@
ggml_cuda_init: found 1 ROCm devices (Total VRAM: 126976 MiB):
Device 0: Radeon 8060S Graphics, gfx1151 (0x1151), VMM: no, Wave Size: 32, VRAM: 126976 MiB
| model | size | params | backend | ngl | n_ubatch | fa | mmap | test | t/s |
| ------------------------------ | ---------: | ---------: | ---------- | --: | -------: | -: | ---: | --------------: | -------------------: |
| qwen35moe 35B.A3B Q4_K - Medium | 20.81 GiB | 34.66 B | ROCm | 99 | 2048 | 1 | 0 | pp2048 @ d65536 | 460.71 ± 1.18 |
| qwen35moe 35B.A3B Q4_K - Medium | 20.81 GiB | 34.66 B | ROCm | 99 | 2048 | 1 | 0 | tg32 @ d65536 | 35.40 ± 0.24 |
build: ab6120cde (8997)
@@ -0,0 +1,8 @@
ggml_cuda_init: found 1 ROCm devices (Total VRAM: 126976 MiB):
Device 0: Radeon 8060S Graphics, gfx1151 (0x1151), VMM: no, Wave Size: 32, VRAM: 126976 MiB
| model | size | params | backend | ngl | n_ubatch | fa | mmap | test | t/s |
| ------------------------------ | ---------: | ---------: | ---------- | --: | -------: | -: | ---: | --------------: | -------------------: |
| qwen35moe 35B.A3B Q4_K - Medium | 20.81 GiB | 34.66 B | ROCm | 99 | 2048 | 1 | 0 | pp2048 @ d65536 | 473.84 ± 2.27 |
| qwen35moe 35B.A3B Q4_K - Medium | 20.81 GiB | 34.66 B | ROCm | 99 | 2048 | 1 | 0 | tg32 @ d65536 | 36.70 ± 0.97 |
build: ab6120cde (8997)
@@ -0,0 +1,8 @@
ggml_cuda_init: found 1 ROCm devices (Total VRAM: 126976 MiB):
Device 0: Radeon 8060S Graphics, gfx1151 (0x1151), VMM: no, Wave Size: 32, VRAM: 126976 MiB
| model | size | params | backend | ngl | n_ubatch | fa | mmap | test | t/s |
| ------------------------------ | ---------: | ---------: | ---------- | --: | -------: | -: | ---: | --------------: | -------------------: |
| qwen35moe 35B.A3B Q4_K - Medium | 20.81 GiB | 34.66 B | ROCm | 99 | 2048 | 1 | 0 | pp2048 @ d65536 | 432.39 ± 0.18 |
| qwen35moe 35B.A3B Q4_K - Medium | 20.81 GiB | 34.66 B | ROCm | 99 | 2048 | 1 | 0 | tg32 @ d65536 | 38.08 ± 0.59 |
build: ab6120cde (8997)
@@ -0,0 +1,8 @@
ggml_vulkan: Found 1 Vulkan devices:
ggml_vulkan: 0 = Radeon 8060S Graphics (AMD open-source driver) | uma: 1 | fp16: 1 | bf16: 0 | warp size: 64 | shared memory: 32768 | int dot: 1 | matrix cores: KHR_coopmat
| model | size | params | backend | ngl | fa | mmap | test | t/s |
| ------------------------------ | ---------: | ---------: | ---------- | --: | -: | ---: | --------------: | -------------------: |
| qwen35moe 35B.A3B Q4_K - Medium | 20.81 GiB | 34.66 B | Vulkan | 99 | 1 | 0 | pp2048 @ d65536 | 183.42 ± 0.98 |
| qwen35moe 35B.A3B Q4_K - Medium | 20.81 GiB | 34.66 B | Vulkan | 99 | 1 | 0 | tg32 @ d65536 | 36.62 ± 0.05 |
build: ab6120cde (8997)
@@ -0,0 +1,8 @@
ggml_vulkan: Found 1 Vulkan devices:
ggml_vulkan: 0 = Radeon 8060S Graphics (RADV GFX1151) (radv) | uma: 1 | fp16: 1 | bf16: 0 | warp size: 64 | shared memory: 65536 | int dot: 1 | matrix cores: KHR_coopmat
| model | size | params | backend | ngl | fa | mmap | test | t/s |
| ------------------------------ | ---------: | ---------: | ---------- | --: | -: | ---: | --------------: | -------------------: |
| qwen35moe 35B.A3B Q4_K - Medium | 20.81 GiB | 34.66 B | Vulkan | 99 | 1 | 0 | pp2048 @ d65536 | 507.32 ± 1.33 |
| qwen35moe 35B.A3B Q4_K - Medium | 20.81 GiB | 34.66 B | Vulkan | 99 | 1 | 0 | tg32 @ d65536 | 42.75 ± 0.05 |
build: ab6120cde (8997)
@@ -0,0 +1,8 @@
ggml_cuda_init: found 1 ROCm devices (Total VRAM: 126976 MiB):
Device 0: Radeon 8060S Graphics, gfx1151 (0x1151), VMM: no, Wave Size: 32, VRAM: 126976 MiB
| model | size | params | backend | ngl | n_ubatch | fa | mmap | test | t/s |
| ------------------------------ | ---------: | ---------: | ---------- | --: | -------: | -: | ---: | --------------: | -------------------: |
| qwen35moe 35B.A3B Q8_0 | 35.80 GiB | 34.66 B | ROCm | 99 | 2048 | 1 | 0 | pp2048 @ d65536 | 414.31 ± 0.61 |
| qwen35moe 35B.A3B Q8_0 | 35.80 GiB | 34.66 B | ROCm | 99 | 2048 | 1 | 0 | tg32 @ d65536 | 32.30 ± 0.01 |
build: 7957de9dc (8645)
@@ -0,0 +1,8 @@
ggml_cuda_init: found 1 ROCm devices (Total VRAM: 126976 MiB):
Device 0: Radeon 8060S Graphics, gfx1151 (0x1151), VMM: no, Wave Size: 32, VRAM: 126976 MiB
| model | size | params | backend | ngl | n_ubatch | fa | mmap | test | t/s |
| ------------------------------ | ---------: | ---------: | ---------- | --: | -------: | -: | ---: | --------------: | -------------------: |
| qwen35moe 35B.A3B Q8_0 | 35.80 GiB | 34.66 B | ROCm | 99 | 2048 | 1 | 0 | pp2048 @ d65536 | 414.74 ± 1.74 |
| qwen35moe 35B.A3B Q8_0 | 35.80 GiB | 34.66 B | ROCm | 99 | 2048 | 1 | 0 | tg32 @ d65536 | 32.74 ± 0.02 |
build: ab6120cde (8997)
@@ -0,0 +1,8 @@
ggml_cuda_init: found 1 ROCm devices (Total VRAM: 126976 MiB):
Device 0: Radeon 8060S Graphics, gfx1151 (0x1151), VMM: no, Wave Size: 32, VRAM: 126976 MiB
| model | size | params | backend | ngl | n_ubatch | fa | mmap | test | t/s |
| ------------------------------ | ---------: | ---------: | ---------- | --: | -------: | -: | ---: | --------------: | -------------------: |
| qwen35moe 35B.A3B Q8_0 | 35.80 GiB | 34.66 B | ROCm | 99 | 2048 | 1 | 0 | pp2048 @ d65536 | 515.64 ± 0.13 |
| qwen35moe 35B.A3B Q8_0 | 35.80 GiB | 34.66 B | ROCm | 99 | 2048 | 1 | 0 | tg32 @ d65536 | 34.53 ± 0.03 |
build: ab6120cde (8997)
@@ -0,0 +1,8 @@
ggml_cuda_init: found 1 ROCm devices (Total VRAM: 126976 MiB):
Device 0: Radeon 8060S Graphics, gfx1151 (0x1151), VMM: no, Wave Size: 32, VRAM: 126976 MiB
| model | size | params | backend | ngl | n_ubatch | fa | mmap | test | t/s |
| ------------------------------ | ---------: | ---------: | ---------- | --: | -------: | -: | ---: | --------------: | -------------------: |
| qwen35moe 35B.A3B Q8_0 | 35.80 GiB | 34.66 B | ROCm | 99 | 2048 | 1 | 0 | pp2048 @ d65536 | 375.11 ± 2.14 |
| qwen35moe 35B.A3B Q8_0 | 35.80 GiB | 34.66 B | ROCm | 99 | 2048 | 1 | 0 | tg32 @ d65536 | 35.02 ± 0.03 |
build: ab6120cde (8997)
@@ -0,0 +1,8 @@
ggml_vulkan: Found 1 Vulkan devices:
ggml_vulkan: 0 = Radeon 8060S Graphics (AMD open-source driver) | uma: 1 | fp16: 1 | bf16: 0 | warp size: 64 | shared memory: 32768 | int dot: 1 | matrix cores: KHR_coopmat
| model | size | params | backend | ngl | fa | mmap | test | t/s |
| ------------------------------ | ---------: | ---------: | ---------- | --: | -: | ---: | --------------: | -------------------: |
| qwen35moe 35B.A3B Q8_0 | 35.80 GiB | 34.66 B | Vulkan | 99 | 1 | 0 | pp2048 @ d65536 | 175.02 ± 1.28 |
| qwen35moe 35B.A3B Q8_0 | 35.80 GiB | 34.66 B | Vulkan | 99 | 1 | 0 | tg32 @ d65536 | 31.29 ± 0.05 |
build: ab6120cde (8997)
@@ -0,0 +1,8 @@
ggml_vulkan: Found 1 Vulkan devices:
ggml_vulkan: 0 = Radeon 8060S Graphics (RADV GFX1151) (radv) | uma: 1 | fp16: 1 | bf16: 0 | warp size: 64 | shared memory: 65536 | int dot: 1 | matrix cores: KHR_coopmat
| model | size | params | backend | ngl | fa | mmap | test | t/s |
| ------------------------------ | ---------: | ---------: | ---------- | --: | -: | ---: | --------------: | -------------------: |
| qwen35moe 35B.A3B Q8_0 | 35.80 GiB | 34.66 B | Vulkan | 99 | 1 | 0 | pp2048 @ d65536 | 468.83 ± 1.17 |
| qwen35moe 35B.A3B Q8_0 | 35.80 GiB | 34.66 B | Vulkan | 99 | 1 | 0 | tg32 @ d65536 | 35.46 ± 0.04 |
build: ab6120cde (8997)
@@ -0,0 +1,8 @@
ggml_cuda_init: found 1 ROCm devices (Total VRAM: 126976 MiB):
Device 0: Radeon 8060S Graphics, gfx1151 (0x1151), VMM: no, Wave Size: 32, VRAM: 126976 MiB
| model | size | params | backend | ngl | n_ubatch | fa | mmap | test | t/s |
| ------------------------------ | ---------: | ---------: | ---------- | --: | -------: | -: | ---: | --------------: | -------------------: |
| gemma4 ?B BF16 | 47.02 GiB | 25.23 B | ROCm | 99 | 2048 | 1 | 0 | pp2048 @ d65536 | 408.04 ± 3.58 |
| gemma4 ?B BF16 | 47.02 GiB | 25.23 B | ROCm | 99 | 2048 | 1 | 0 | tg32 @ d65536 | 19.52 ± 0.01 |
build: 7957de9dc (8645)
@@ -0,0 +1,8 @@
ggml_cuda_init: found 1 ROCm devices (Total VRAM: 126976 MiB):
Device 0: Radeon 8060S Graphics, gfx1151 (0x1151), VMM: no, Wave Size: 32, VRAM: 126976 MiB
| model | size | params | backend | ngl | n_ubatch | fa | mmap | test | t/s |
| ------------------------------ | ---------: | ---------: | ---------- | --: | -------: | -: | ---: | --------------: | -------------------: |
| gemma4 26B.A4B BF16 | 47.02 GiB | 25.23 B | ROCm | 99 | 2048 | 1 | 0 | pp2048 @ d65536 | 403.29 ± 1.83 |
| gemma4 26B.A4B BF16 | 47.02 GiB | 25.23 B | ROCm | 99 | 2048 | 1 | 0 | tg32 @ d65536 | 19.35 ± 0.19 |
build: ab6120cde (8997)
@@ -0,0 +1,8 @@
ggml_cuda_init: found 1 ROCm devices (Total VRAM: 126976 MiB):
Device 0: Radeon 8060S Graphics, gfx1151 (0x1151), VMM: no, Wave Size: 32, VRAM: 126976 MiB
| model | size | params | backend | ngl | n_ubatch | fa | mmap | test | t/s |
| ------------------------------ | ---------: | ---------: | ---------- | --: | -------: | -: | ---: | --------------: | -------------------: |
| gemma4 26B.A4B BF16 | 47.02 GiB | 25.23 B | ROCm | 99 | 2048 | 1 | 0 | pp2048 @ d65536 | 441.14 ± 2.12 |
| gemma4 26B.A4B BF16 | 47.02 GiB | 25.23 B | ROCm | 99 | 2048 | 1 | 0 | tg32 @ d65536 | 17.76 ± 0.01 |
build: ab6120cde (8997)
@@ -0,0 +1,8 @@
ggml_cuda_init: found 1 ROCm devices (Total VRAM: 126976 MiB):
Device 0: Radeon 8060S Graphics, gfx1151 (0x1151), VMM: no, Wave Size: 32, VRAM: 126976 MiB
| model | size | params | backend | ngl | n_ubatch | fa | mmap | test | t/s |
| ------------------------------ | ---------: | ---------: | ---------- | --: | -------: | -: | ---: | --------------: | -------------------: |
| gemma4 26B.A4B BF16 | 47.02 GiB | 25.23 B | ROCm | 99 | 2048 | 1 | 0 | pp2048 @ d65536 | 345.70 ± 0.13 |
| gemma4 26B.A4B BF16 | 47.02 GiB | 25.23 B | ROCm | 99 | 2048 | 1 | 0 | tg32 @ d65536 | 19.16 ± 0.47 |
build: ab6120cde (8997)
@@ -0,0 +1,8 @@
ggml_vulkan: Found 1 Vulkan devices:
ggml_vulkan: 0 = Radeon 8060S Graphics (AMD open-source driver) | uma: 1 | fp16: 1 | bf16: 0 | warp size: 64 | shared memory: 32768 | int dot: 1 | matrix cores: KHR_coopmat
| model | size | params | backend | ngl | fa | mmap | test | t/s |
| ------------------------------ | ---------: | ---------: | ---------- | --: | -: | ---: | --------------: | -------------------: |
| gemma4 26B.A4B BF16 | 47.02 GiB | 25.23 B | Vulkan | 99 | 1 | 0 | pp2048 @ d65536 | 39.57 ± 0.04 |
| gemma4 26B.A4B BF16 | 47.02 GiB | 25.23 B | Vulkan | 99 | 1 | 0 | tg32 @ d65536 | 14.10 ± 0.02 |
build: ab6120cde (8997)
@@ -0,0 +1,8 @@
ggml_vulkan: Found 1 Vulkan devices:
ggml_vulkan: 0 = Radeon 8060S Graphics (RADV GFX1151) (radv) | uma: 1 | fp16: 1 | bf16: 0 | warp size: 64 | shared memory: 65536 | int dot: 1 | matrix cores: KHR_coopmat
| model | size | params | backend | ngl | fa | mmap | test | t/s |
| ------------------------------ | ---------: | ---------: | ---------- | --: | -: | ---: | --------------: | -------------------: |
| gemma4 26B.A4B BF16 | 47.02 GiB | 25.23 B | Vulkan | 99 | 1 | 0 | pp2048 @ d65536 | 281.32 ± 0.70 |
| gemma4 26B.A4B BF16 | 47.02 GiB | 25.23 B | Vulkan | 99 | 1 | 0 | tg32 @ d65536 | 13.15 ± 0.03 |
build: ab6120cde (8997)
@@ -0,0 +1,8 @@
ggml_cuda_init: found 1 ROCm devices (Total VRAM: 126976 MiB):
Device 0: Radeon 8060S Graphics, gfx1151 (0x1151), VMM: no, Wave Size: 32, VRAM: 126976 MiB
| model | size | params | backend | ngl | n_ubatch | fa | mmap | test | t/s |
| ------------------------------ | ---------: | ---------: | ---------- | --: | -------: | -: | ---: | --------------: | -------------------: |
| gemma4 ?B Q4_K - Medium | 15.90 GiB | 25.23 B | ROCm | 99 | 2048 | 1 | 0 | pp2048 @ d65536 | 446.29 ± 0.87 |
| gemma4 ?B Q4_K - Medium | 15.90 GiB | 25.23 B | ROCm | 99 | 2048 | 1 | 0 | tg32 @ d65536 | 35.16 ± 0.03 |
build: 7957de9dc (8645)
@@ -0,0 +1,8 @@
ggml_cuda_init: found 1 ROCm devices (Total VRAM: 126976 MiB):
Device 0: Radeon 8060S Graphics, gfx1151 (0x1151), VMM: no, Wave Size: 32, VRAM: 126976 MiB
| model | size | params | backend | ngl | n_ubatch | fa | mmap | test | t/s |
| ------------------------------ | ---------: | ---------: | ---------- | --: | -------: | -: | ---: | --------------: | -------------------: |
| gemma4 26B.A4B Q4_K - Medium | 15.90 GiB | 25.23 B | ROCm | 99 | 2048 | 1 | 0 | pp2048 @ d65536 | 453.92 ± 5.18 |
| gemma4 26B.A4B Q4_K - Medium | 15.90 GiB | 25.23 B | ROCm | 99 | 2048 | 1 | 0 | tg32 @ d65536 | 34.85 ± 0.16 |
build: ab6120cde (8997)
@@ -0,0 +1,8 @@
ggml_cuda_init: found 1 ROCm devices (Total VRAM: 126976 MiB):
Device 0: Radeon 8060S Graphics, gfx1151 (0x1151), VMM: no, Wave Size: 32, VRAM: 126976 MiB
| model | size | params | backend | ngl | n_ubatch | fa | mmap | test | t/s |
| ------------------------------ | ---------: | ---------: | ---------- | --: | -------: | -: | ---: | --------------: | -------------------: |
| gemma4 26B.A4B Q4_K - Medium | 15.90 GiB | 25.23 B | ROCm | 99 | 2048 | 1 | 0 | pp2048 @ d65536 | 469.64 ± 3.17 |
| gemma4 26B.A4B Q4_K - Medium | 15.90 GiB | 25.23 B | ROCm | 99 | 2048 | 1 | 0 | tg32 @ d65536 | 34.31 ± 0.25 |
build: ab6120cde (8997)
@@ -0,0 +1,8 @@
ggml_cuda_init: found 1 ROCm devices (Total VRAM: 126976 MiB):
Device 0: Radeon 8060S Graphics, gfx1151 (0x1151), VMM: no, Wave Size: 32, VRAM: 126976 MiB
| model | size | params | backend | ngl | n_ubatch | fa | mmap | test | t/s |
| ------------------------------ | ---------: | ---------: | ---------- | --: | -------: | -: | ---: | --------------: | -------------------: |
| gemma4 26B.A4B Q4_K - Medium | 15.90 GiB | 25.23 B | ROCm | 99 | 2048 | 1 | 0 | pp2048 @ d65536 | 374.22 ± 0.13 |
| gemma4 26B.A4B Q4_K - Medium | 15.90 GiB | 25.23 B | ROCm | 99 | 2048 | 1 | 0 | tg32 @ d65536 | 34.77 ± 0.32 |
build: ab6120cde (8997)
@@ -0,0 +1,8 @@
ggml_vulkan: Found 1 Vulkan devices:
ggml_vulkan: 0 = Radeon 8060S Graphics (AMD open-source driver) | uma: 1 | fp16: 1 | bf16: 0 | warp size: 64 | shared memory: 32768 | int dot: 1 | matrix cores: KHR_coopmat
| model | size | params | backend | ngl | fa | mmap | test | t/s |
| ------------------------------ | ---------: | ---------: | ---------- | --: | -: | ---: | --------------: | -------------------: |
| gemma4 26B.A4B Q4_K - Medium | 15.90 GiB | 25.23 B | Vulkan | 99 | 1 | 0 | pp2048 @ d65536 | 54.90 ± 0.12 |
| gemma4 26B.A4B Q4_K - Medium | 15.90 GiB | 25.23 B | Vulkan | 99 | 1 | 0 | tg32 @ d65536 | 29.52 ± 0.03 |
build: ab6120cde (8997)
@@ -0,0 +1,8 @@
ggml_vulkan: Found 1 Vulkan devices:
ggml_vulkan: 0 = Radeon 8060S Graphics (RADV GFX1151) (radv) | uma: 1 | fp16: 1 | bf16: 0 | warp size: 64 | shared memory: 65536 | int dot: 1 | matrix cores: KHR_coopmat
| model | size | params | backend | ngl | fa | mmap | test | t/s |
| ------------------------------ | ---------: | ---------: | ---------- | --: | -: | ---: | --------------: | -------------------: |
| gemma4 26B.A4B Q4_K - Medium | 15.90 GiB | 25.23 B | Vulkan | 99 | 1 | 0 | pp2048 @ d65536 | 445.00 ± 0.08 |
| gemma4 26B.A4B Q4_K - Medium | 15.90 GiB | 25.23 B | Vulkan | 99 | 1 | 0 | tg32 @ d65536 | 36.64 ± 0.06 |
build: ab6120cde (8997)
@@ -0,0 +1,8 @@
ggml_cuda_init: found 1 ROCm devices (Total VRAM: 126976 MiB):
Device 0: Radeon 8060S Graphics, gfx1151 (0x1151), VMM: no, Wave Size: 32, VRAM: 126976 MiB
| model | size | params | backend | ngl | n_ubatch | fa | mmap | test | t/s |
| ------------------------------ | ---------: | ---------: | ---------- | --: | -------: | -: | ---: | --------------: | -------------------: |
| gemma4 ?B Q8_0 | 25.94 GiB | 25.23 B | ROCm | 99 | 2048 | 1 | 0 | pp2048 @ d65536 | 436.86 ± 3.98 |
| gemma4 ?B Q8_0 | 25.94 GiB | 25.23 B | ROCm | 99 | 2048 | 1 | 0 | tg32 @ d65536 | 31.55 ± 0.72 |
build: 7957de9dc (8645)
@@ -0,0 +1,8 @@
ggml_cuda_init: found 1 ROCm devices (Total VRAM: 126976 MiB):
Device 0: Radeon 8060S Graphics, gfx1151 (0x1151), VMM: no, Wave Size: 32, VRAM: 126976 MiB
| model | size | params | backend | ngl | n_ubatch | fa | mmap | test | t/s |
| ------------------------------ | ---------: | ---------: | ---------- | --: | -------: | -: | ---: | --------------: | -------------------: |
| gemma4 26B.A4B Q8_0 | 25.94 GiB | 25.23 B | ROCm | 99 | 2048 | 1 | 0 | pp2048 @ d65536 | 452.10 ± 0.82 |
| gemma4 26B.A4B Q8_0 | 25.94 GiB | 25.23 B | ROCm | 99 | 2048 | 1 | 0 | tg32 @ d65536 | 29.87 ± 3.28 |
build: ab6120cde (8997)
@@ -0,0 +1,8 @@
ggml_cuda_init: found 1 ROCm devices (Total VRAM: 126976 MiB):
Device 0: Radeon 8060S Graphics, gfx1151 (0x1151), VMM: no, Wave Size: 32, VRAM: 126976 MiB
| model | size | params | backend | ngl | n_ubatch | fa | mmap | test | t/s |
| ------------------------------ | ---------: | ---------: | ---------- | --: | -------: | -: | ---: | --------------: | -------------------: |
| gemma4 26B.A4B Q8_0 | 25.94 GiB | 25.23 B | ROCm | 99 | 2048 | 1 | 0 | pp2048 @ d65536 | 451.48 ± 1.29 |
| gemma4 26B.A4B Q8_0 | 25.94 GiB | 25.23 B | ROCm | 99 | 2048 | 1 | 0 | tg32 @ d65536 | 30.76 ± 0.03 |
build: ab6120cde (8997)
@@ -0,0 +1,8 @@
ggml_cuda_init: found 1 ROCm devices (Total VRAM: 126976 MiB):
Device 0: Radeon 8060S Graphics, gfx1151 (0x1151), VMM: no, Wave Size: 32, VRAM: 126976 MiB
| model | size | params | backend | ngl | n_ubatch | fa | mmap | test | t/s |
| ------------------------------ | ---------: | ---------: | ---------- | --: | -------: | -: | ---: | --------------: | -------------------: |
| gemma4 26B.A4B Q8_0 | 25.94 GiB | 25.23 B | ROCm | 99 | 2048 | 1 | 0 | pp2048 @ d65536 | 363.83 ± 2.85 |
| gemma4 26B.A4B Q8_0 | 25.94 GiB | 25.23 B | ROCm | 99 | 2048 | 1 | 0 | tg32 @ d65536 | 31.68 ± 0.02 |
build: ab6120cde (8997)
@@ -0,0 +1,8 @@
ggml_vulkan: Found 1 Vulkan devices:
ggml_vulkan: 0 = Radeon 8060S Graphics (AMD open-source driver) | uma: 1 | fp16: 1 | bf16: 0 | warp size: 64 | shared memory: 32768 | int dot: 1 | matrix cores: KHR_coopmat
| model | size | params | backend | ngl | fa | mmap | test | t/s |
| ------------------------------ | ---------: | ---------: | ---------- | --: | -: | ---: | --------------: | -------------------: |
| gemma4 26B.A4B Q8_0 | 25.94 GiB | 25.23 B | Vulkan | 99 | 1 | 0 | pp2048 @ d65536 | 54.04 ± 0.08 |
| gemma4 26B.A4B Q8_0 | 25.94 GiB | 25.23 B | Vulkan | 99 | 1 | 0 | tg32 @ d65536 | 26.65 ± 0.03 |
build: ab6120cde (8997)
@@ -0,0 +1,8 @@
ggml_vulkan: Found 1 Vulkan devices:
ggml_vulkan: 0 = Radeon 8060S Graphics (RADV GFX1151) (radv) | uma: 1 | fp16: 1 | bf16: 0 | warp size: 64 | shared memory: 65536 | int dot: 1 | matrix cores: KHR_coopmat
| model | size | params | backend | ngl | fa | mmap | test | t/s |
| ------------------------------ | ---------: | ---------: | ---------- | --: | -: | ---: | --------------: | -------------------: |
| gemma4 26B.A4B Q8_0 | 25.94 GiB | 25.23 B | Vulkan | 99 | 1 | 0 | pp2048 @ d65536 | 415.94 ± 0.55 |
| gemma4 26B.A4B Q8_0 | 25.94 GiB | 25.23 B | Vulkan | 99 | 1 | 0 | tg32 @ d65536 | 31.99 ± 0.06 |
build: ab6120cde (8997)
@@ -0,0 +1,8 @@
ggml_cuda_init: found 1 ROCm devices (Total VRAM: 126976 MiB):
Device 0: Radeon 8060S Graphics, gfx1151 (0x1151), VMM: no, Wave Size: 32, VRAM: 126976 MiB
| model | size | params | backend | ngl | n_ubatch | fa | mmap | test | t/s |
| ------------------------------ | ---------: | ---------: | ---------- | --: | -------: | -: | ---: | --------------: | -------------------: |
| gemma4 ?B BF16 | 57.18 GiB | 30.70 B | ROCm | 99 | 2048 | 1 | 0 | pp2048 @ d65536 | 96.84 ± 0.57 |
| gemma4 ?B BF16 | 57.18 GiB | 30.70 B | ROCm | 99 | 2048 | 1 | 0 | tg32 @ d65536 | 3.07 ± 0.02 |
build: 7957de9dc (8645)
@@ -0,0 +1,8 @@
ggml_cuda_init: found 1 ROCm devices (Total VRAM: 126976 MiB):
Device 0: Radeon 8060S Graphics, gfx1151 (0x1151), VMM: no, Wave Size: 32, VRAM: 126976 MiB
| model | size | params | backend | ngl | n_ubatch | fa | mmap | test | t/s |
| ------------------------------ | ---------: | ---------: | ---------- | --: | -------: | -: | ---: | --------------: | -------------------: |
| gemma4 31B BF16 | 57.18 GiB | 30.70 B | ROCm | 99 | 2048 | 1 | 0 | pp2048 @ d65536 | 98.23 ± 0.26 |
| gemma4 31B BF16 | 57.18 GiB | 30.70 B | ROCm | 99 | 2048 | 1 | 0 | tg32 @ d65536 | 3.08 ± 0.01 |
build: ab6120cde (8997)
@@ -0,0 +1,8 @@
ggml_cuda_init: found 1 ROCm devices (Total VRAM: 126976 MiB):
Device 0: Radeon 8060S Graphics, gfx1151 (0x1151), VMM: no, Wave Size: 32, VRAM: 126976 MiB
| model | size | params | backend | ngl | n_ubatch | fa | mmap | test | t/s |
| ------------------------------ | ---------: | ---------: | ---------- | --: | -------: | -: | ---: | --------------: | -------------------: |
| gemma4 31B BF16 | 57.18 GiB | 30.70 B | ROCm | 99 | 2048 | 1 | 0 | pp2048 @ d65536 | 101.08 ± 0.13 |
| gemma4 31B BF16 | 57.18 GiB | 30.70 B | ROCm | 99 | 2048 | 1 | 0 | tg32 @ d65536 | 2.94 ± 0.02 |
build: ab6120cde (8997)
@@ -0,0 +1,8 @@
ggml_cuda_init: found 1 ROCm devices (Total VRAM: 126976 MiB):
Device 0: Radeon 8060S Graphics, gfx1151 (0x1151), VMM: no, Wave Size: 32, VRAM: 126976 MiB
| model | size | params | backend | ngl | n_ubatch | fa | mmap | test | t/s |
| ------------------------------ | ---------: | ---------: | ---------- | --: | -------: | -: | ---: | --------------: | -------------------: |
| gemma4 31B BF16 | 57.18 GiB | 30.70 B | ROCm | 99 | 2048 | 1 | 0 | pp2048 @ d65536 | 83.27 ± 0.13 |
| gemma4 31B BF16 | 57.18 GiB | 30.70 B | ROCm | 99 | 2048 | 1 | 0 | tg32 @ d65536 | 3.04 ± 0.00 |
build: ab6120cde (8997)
@@ -0,0 +1,23 @@
ggml_vulkan: Found 1 Vulkan devices:
ggml_vulkan: 0 = Radeon 8060S Graphics (AMD open-source driver) | uma: 1 | fp16: 1 | bf16: 0 | warp size: 64 | shared memory: 32768 | int dot: 1 | matrix cores: KHR_coopmat
/lib64/libggml-base.so.0(+0x4465) [0x7fa1d7fcc465]
/lib64/libggml-base.so.0(ggml_print_backtrace+0x1eb) [0x7fa1d7fcc83b]
/lib64/libggml-base.so.0(+0x16f19) [0x7fa1d7fdef19]
/lib64/libstdc++.so.6(+0x1ebfc) [0x7fa1d7d6fbfc]
/lib64/libstdc++.so.6(_ZSt10unexpectedv+0x0) [0x7fa1d7d59d3a]
/lib64/libstdc++.so.6(+0x1eea8) [0x7fa1d7d6fea8]
/lib64/libggml-vulkan.so.0(+0x1728d) [0x7fa1d80a128d]
/lib64/libggml-vulkan.so.0(+0x10a410) [0x7fa1d8194410]
/lib64/libggml-base.so.0(ggml_backend_sched_graph_compute_async+0x3b2) [0x7fa1d7fe8192]
/lib64/libllama.so.0(_ZN13llama_context13graph_computeEP11ggml_cgraphb+0xa0) [0x7fa1dbc56c70]
/lib64/libllama.so.0(_ZN13llama_context14process_ubatchERK12llama_ubatch14llm_graph_typeP22llama_memory_context_iR11ggml_status+0xe5) [0x7fa1dbc59255]
/lib64/libllama.so.0(_ZN13llama_context6decodeERK11llama_batch+0x35f) [0x7fa1dbc5f98f]
/lib64/libllama.so.0(llama_decode+0xe) [0x7fa1dbc6132e]
/usr/sbin/llama-bench() [0x40663b]
/usr/sbin/llama-bench() [0x4038b9]
/lib64/libc.so.6(+0x35b5) [0x7fa1d7a405b5]
/lib64/libc.so.6(__libc_start_main+0x88) [0x7fa1d7a40668]
/usr/sbin/llama-bench() [0x404e65]
terminate called after throwing an instance of 'vk::DeviceLostError'
what(): vk::Queue::submit: ErrorDeviceLost
✖ ! [vulkan_amdvlk] gemma-4-31B-it-BF16-00001-of-00002__fa1 __longctx65536 failed (exit 0)
@@ -0,0 +1,8 @@
ggml_vulkan: Found 1 Vulkan devices:
ggml_vulkan: 0 = Radeon 8060S Graphics (RADV GFX1151) (radv) | uma: 1 | fp16: 1 | bf16: 0 | warp size: 64 | shared memory: 65536 | int dot: 1 | matrix cores: KHR_coopmat
| model | size | params | backend | ngl | fa | mmap | test | t/s |
| ------------------------------ | ---------: | ---------: | ---------- | --: | -: | ---: | --------------: | -------------------: |
| gemma4 31B BF16 | 57.18 GiB | 30.70 B | Vulkan | 99 | 1 | 0 | pp2048 @ d65536 | 45.96 ± 1.09 |
| gemma4 31B BF16 | 57.18 GiB | 30.70 B | Vulkan | 99 | 1 | 0 | tg32 @ d65536 | 3.06 ± 0.00 |
build: ab6120cde (8997)
@@ -0,0 +1,8 @@
ggml_cuda_init: found 1 ROCm devices (Total VRAM: 126976 MiB):
Device 0: Radeon 8060S Graphics, gfx1151 (0x1151), VMM: no, Wave Size: 32, VRAM: 126976 MiB
| model | size | params | backend | ngl | n_ubatch | fa | mmap | test | t/s |
| ------------------------------ | ---------: | ---------: | ---------- | --: | -------: | -: | ---: | --------------: | -------------------: |
| gemma4 ?B Q4_K - Medium | 17.46 GiB | 30.70 B | ROCm | 99 | 2048 | 1 | 0 | pp2048 @ d65536 | 91.86 ± 0.33 |
| gemma4 ?B Q4_K - Medium | 17.46 GiB | 30.70 B | ROCm | 99 | 2048 | 1 | 0 | tg32 @ d65536 | 7.51 ± 0.00 |
build: 7957de9dc (8645)
@@ -0,0 +1,8 @@
ggml_cuda_init: found 1 ROCm devices (Total VRAM: 126976 MiB):
Device 0: Radeon 8060S Graphics, gfx1151 (0x1151), VMM: no, Wave Size: 32, VRAM: 126976 MiB
| model | size | params | backend | ngl | n_ubatch | fa | mmap | test | t/s |
| ------------------------------ | ---------: | ---------: | ---------- | --: | -------: | -: | ---: | --------------: | -------------------: |
| gemma4 31B Q4_K - Medium | 17.46 GiB | 30.70 B | ROCm | 99 | 2048 | 1 | 0 | pp2048 @ d65536 | 93.57 ± 0.20 |
| gemma4 31B Q4_K - Medium | 17.46 GiB | 30.70 B | ROCm | 99 | 2048 | 1 | 0 | tg32 @ d65536 | 7.52 ± 0.01 |
build: ab6120cde (8997)
@@ -0,0 +1,8 @@
ggml_cuda_init: found 1 ROCm devices (Total VRAM: 126976 MiB):
Device 0: Radeon 8060S Graphics, gfx1151 (0x1151), VMM: no, Wave Size: 32, VRAM: 126976 MiB
| model | size | params | backend | ngl | n_ubatch | fa | mmap | test | t/s |
| ------------------------------ | ---------: | ---------: | ---------- | --: | -------: | -: | ---: | --------------: | -------------------: |
| gemma4 31B Q4_K - Medium | 17.46 GiB | 30.70 B | ROCm | 99 | 2048 | 1 | 0 | pp2048 @ d65536 | 96.98 ± 0.37 |
| gemma4 31B Q4_K - Medium | 17.46 GiB | 30.70 B | ROCm | 99 | 2048 | 1 | 0 | tg32 @ d65536 | 7.26 ± 0.01 |
build: ab6120cde (8997)
@@ -0,0 +1,8 @@
ggml_cuda_init: found 1 ROCm devices (Total VRAM: 126976 MiB):
Device 0: Radeon 8060S Graphics, gfx1151 (0x1151), VMM: no, Wave Size: 32, VRAM: 126976 MiB
| model | size | params | backend | ngl | n_ubatch | fa | mmap | test | t/s |
| ------------------------------ | ---------: | ---------: | ---------- | --: | -------: | -: | ---: | --------------: | -------------------: |
| gemma4 31B Q4_K - Medium | 17.46 GiB | 30.70 B | ROCm | 99 | 2048 | 1 | 0 | pp2048 @ d65536 | 84.96 ± 0.07 |
| gemma4 31B Q4_K - Medium | 17.46 GiB | 30.70 B | ROCm | 99 | 2048 | 1 | 0 | tg32 @ d65536 | 7.56 ± 0.02 |
build: ab6120cde (8997)
@@ -0,0 +1,23 @@
ggml_vulkan: Found 1 Vulkan devices:
ggml_vulkan: 0 = Radeon 8060S Graphics (AMD open-source driver) | uma: 1 | fp16: 1 | bf16: 0 | warp size: 64 | shared memory: 32768 | int dot: 1 | matrix cores: KHR_coopmat
/lib64/libggml-base.so.0(+0x4465) [0x7fbb0ecdb465]
/lib64/libggml-base.so.0(ggml_print_backtrace+0x1eb) [0x7fbb0ecdb83b]
/lib64/libggml-base.so.0(+0x16f19) [0x7fbb0ecedf19]
/lib64/libstdc++.so.6(+0x1ebfc) [0x7fbb0ea7ebfc]
/lib64/libstdc++.so.6(_ZSt10unexpectedv+0x0) [0x7fbb0ea68d3a]
/lib64/libstdc++.so.6(+0x1eea8) [0x7fbb0ea7eea8]
/lib64/libggml-vulkan.so.0(+0x1728d) [0x7fbb0edb028d]
/lib64/libggml-vulkan.so.0(+0x10a410) [0x7fbb0eea3410]
/lib64/libggml-base.so.0(ggml_backend_sched_graph_compute_async+0x3b2) [0x7fbb0ecf7192]
/lib64/libllama.so.0(_ZN13llama_context13graph_computeEP11ggml_cgraphb+0xa0) [0x7fbb12965c70]
/lib64/libllama.so.0(_ZN13llama_context14process_ubatchERK12llama_ubatch14llm_graph_typeP22llama_memory_context_iR11ggml_status+0xe5) [0x7fbb12968255]
/lib64/libllama.so.0(_ZN13llama_context6decodeERK11llama_batch+0x35f) [0x7fbb1296e98f]
/lib64/libllama.so.0(llama_decode+0xe) [0x7fbb1297032e]
/usr/sbin/llama-bench() [0x40663b]
/usr/sbin/llama-bench() [0x403a5b]
/lib64/libc.so.6(+0x35b5) [0x7fbb0e74f5b5]
/lib64/libc.so.6(__libc_start_main+0x88) [0x7fbb0e74f668]
/usr/sbin/llama-bench() [0x404e65]
terminate called after throwing an instance of 'vk::DeviceLostError'
what(): vk::Queue::submit: ErrorDeviceLost
✖ ! [vulkan_amdvlk] gemma-4-31B-it-UD-Q4_K_XL__fa1 __longctx65536 failed (exit 0)
@@ -0,0 +1,8 @@
ggml_vulkan: Found 1 Vulkan devices:
ggml_vulkan: 0 = Radeon 8060S Graphics (RADV GFX1151) (radv) | uma: 1 | fp16: 1 | bf16: 0 | warp size: 64 | shared memory: 65536 | int dot: 1 | matrix cores: KHR_coopmat
| model | size | params | backend | ngl | fa | mmap | test | t/s |
| ------------------------------ | ---------: | ---------: | ---------- | --: | -: | ---: | --------------: | -------------------: |
| gemma4 31B Q4_K - Medium | 17.46 GiB | 30.70 B | Vulkan | 99 | 1 | 0 | pp2048 @ d65536 | 76.03 ± 1.52 |
| gemma4 31B Q4_K - Medium | 17.46 GiB | 30.70 B | Vulkan | 99 | 1 | 0 | tg32 @ d65536 | 7.78 ± 0.00 |
build: ab6120cde (8997)
@@ -0,0 +1,8 @@
ggml_cuda_init: found 1 ROCm devices (Total VRAM: 126976 MiB):
Device 0: Radeon 8060S Graphics, gfx1151 (0x1151), VMM: no, Wave Size: 32, VRAM: 126976 MiB
| model | size | params | backend | ngl | n_ubatch | fa | mmap | test | t/s |
| ------------------------------ | ---------: | ---------: | ---------- | --: | -------: | -: | ---: | --------------: | -------------------: |
| gemma4 ?B Q8_0 | 32.60 GiB | 30.70 B | ROCm | 99 | 2048 | 1 | 0 | pp2048 @ d65536 | 90.15 ± 0.22 |
| gemma4 ?B Q8_0 | 32.60 GiB | 30.70 B | ROCm | 99 | 2048 | 1 | 0 | tg32 @ d65536 | 5.00 ± 0.00 |
build: 7957de9dc (8645)
@@ -0,0 +1,8 @@
ggml_cuda_init: found 1 ROCm devices (Total VRAM: 126976 MiB):
Device 0: Radeon 8060S Graphics, gfx1151 (0x1151), VMM: no, Wave Size: 32, VRAM: 126976 MiB
| model | size | params | backend | ngl | n_ubatch | fa | mmap | test | t/s |
| ------------------------------ | ---------: | ---------: | ---------- | --: | -------: | -: | ---: | --------------: | -------------------: |
| gemma4 31B Q8_0 | 32.60 GiB | 30.70 B | ROCm | 99 | 2048 | 1 | 0 | pp2048 @ d65536 | 93.92 ± 0.29 |
| gemma4 31B Q8_0 | 32.60 GiB | 30.70 B | ROCm | 99 | 2048 | 1 | 0 | tg32 @ d65536 | 5.00 ± 0.00 |
build: ab6120cde (8997)
@@ -0,0 +1,8 @@
ggml_cuda_init: found 1 ROCm devices (Total VRAM: 126976 MiB):
Device 0: Radeon 8060S Graphics, gfx1151 (0x1151), VMM: no, Wave Size: 32, VRAM: 126976 MiB
| model | size | params | backend | ngl | n_ubatch | fa | mmap | test | t/s |
| ------------------------------ | ---------: | ---------: | ---------- | --: | -------: | -: | ---: | --------------: | -------------------: |
| gemma4 31B Q8_0 | 32.60 GiB | 30.70 B | ROCm | 99 | 2048 | 1 | 0 | pp2048 @ d65536 | 93.78 ± 0.36 |
| gemma4 31B Q8_0 | 32.60 GiB | 30.70 B | ROCm | 99 | 2048 | 1 | 0 | tg32 @ d65536 | 4.95 ± 0.00 |
build: ab6120cde (8997)
@@ -0,0 +1,8 @@
ggml_cuda_init: found 1 ROCm devices (Total VRAM: 126976 MiB):
Device 0: Radeon 8060S Graphics, gfx1151 (0x1151), VMM: no, Wave Size: 32, VRAM: 126976 MiB
| model | size | params | backend | ngl | n_ubatch | fa | mmap | test | t/s |
| ------------------------------ | ---------: | ---------: | ---------- | --: | -------: | -: | ---: | --------------: | -------------------: |
| gemma4 31B Q8_0 | 32.60 GiB | 30.70 B | ROCm | 99 | 2048 | 1 | 0 | pp2048 @ d65536 | 81.58 ± 0.14 |
| gemma4 31B Q8_0 | 32.60 GiB | 30.70 B | ROCm | 99 | 2048 | 1 | 0 | tg32 @ d65536 | 4.99 ± 0.05 |
build: ab6120cde (8997)

Some files were not shown because too many files have changed in this diff