updated benchmarks

Donato Capitella
2026-02-09 13:30:26 +00:00
parent 632130a2c3
commit 8ff812fbb5
204 changed files with 1645 additions and 1376 deletions
@@ -0,0 +1,8 @@
ggml_cuda_init: found 1 ROCm devices:
Device 0: Radeon 8060S Graphics, gfx1151 (0x1151), VMM: no, Wave Size: 32
| model | size | params | backend | ngl | n_ubatch | fa | mmap | test | t/s |
| ------------------------------ | ---------: | ---------: | ---------- | --: | -------: | -: | ---: | --------------: | -------------------: |
| glm4moe 355B.A32B Q4_K - Medium | 189.69 GiB | 356.79 B | ROCm,RPC | 99 | 2048 | 1 | 0 | pp2048 @ d32768 | 18.41 ± 0.00 |
| glm4moe 355B.A32B Q4_K - Medium | 189.69 GiB | 356.79 B | ROCm,RPC | 99 | 2048 | 1 | 0 | tg32 @ d32768 | 4.12 ± 0.00 |
build: 2656c0d26 (7693)
@@ -0,0 +1,8 @@
ggml_cuda_init: found 1 ROCm devices:
Device 0: Radeon 8060S Graphics, gfx1151 (0x1151), VMM: no, Wave Size: 32
| model | size | params | backend | ngl | fa | mmap | test | t/s |
| ------------------------------ | ---------: | ---------: | ---------- | --: | -: | ---: | --------------: | -------------------: |
| glm4moe 355B.A32B Q4_K - Medium | 189.69 GiB | 356.79 B | ROCm,RPC | 99 | 1 | 0 | pp512 | 101.82 ± 0.34 |
| glm4moe 355B.A32B Q4_K - Medium | 189.69 GiB | 356.79 B | ROCm,RPC | 99 | 1 | 0 | tg128 | 8.71 ± 0.00 |
build: 2656c0d26 (7693)
@@ -0,0 +1,8 @@
ggml_cuda_init: found 1 ROCm devices:
Device 0: Radeon 8060S Graphics, gfx1151 (0x1151), VMM: no, Wave Size: 32
| model | size | params | backend | ngl | n_ubatch | fa | mmap | test | t/s |
| ------------------------------ | ---------: | ---------: | ---------- | --: | -------: | -: | ---: | --------------: | -------------------: |
| glm4moe 355B.A32B Q4_K - Medium | 189.69 GiB | 356.79 B | ROCm,RPC | 99 | 2048 | 1 | 0 | pp2048 @ d32768 | 17.93 ± 0.00 |
| glm4moe 355B.A32B Q4_K - Medium | 189.69 GiB | 356.79 B | ROCm,RPC | 99 | 2048 | 1 | 0 | tg32 @ d32768 | 4.13 ± 0.00 |
build: 2656c0d26 (7693)
@@ -0,0 +1,8 @@
ggml_cuda_init: found 1 ROCm devices:
Device 0: Radeon 8060S Graphics, gfx1151 (0x1151), VMM: no, Wave Size: 32
| model | size | params | backend | ngl | fa | mmap | test | t/s |
| ------------------------------ | ---------: | ---------: | ---------- | --: | -: | ---: | --------------: | -------------------: |
| glm4moe 355B.A32B Q4_K - Medium | 189.69 GiB | 356.79 B | ROCm,RPC | 99 | 1 | 0 | pp512 | 95.55 ± 0.26 |
| glm4moe 355B.A32B Q4_K - Medium | 189.69 GiB | 356.79 B | ROCm,RPC | 99 | 1 | 0 | tg128 | 8.78 ± 0.00 |
build: 2656c0d26 (7693)
@@ -0,0 +1,8 @@
ggml_cuda_init: found 1 ROCm devices:
Device 0: Radeon 8060S Graphics, gfx1151 (0x1151), VMM: no, Wave Size: 32
| model | size | params | backend | ngl | n_ubatch | fa | mmap | test | t/s |
| ------------------------------ | ---------: | ---------: | ---------- | --: | -------: | -: | ---: | --------------: | -------------------: |
| glm4moe 355B.A32B Q4_K - Medium | 189.69 GiB | 356.79 B | ROCm,RPC | 99 | 2048 | 1 | 0 | pp2048 @ d32768 | 18.59 ± 0.00 |
| glm4moe 355B.A32B Q4_K - Medium | 189.69 GiB | 356.79 B | ROCm,RPC | 99 | 2048 | 1 | 0 | tg32 @ d32768 | 3.63 ± 0.00 |
build: 2656c0d26 (7693)
@@ -0,0 +1,8 @@
ggml_cuda_init: found 1 ROCm devices:
Device 0: Radeon 8060S Graphics, gfx1151 (0x1151), VMM: no, Wave Size: 32
| model | size | params | backend | ngl | fa | mmap | test | t/s |
| ------------------------------ | ---------: | ---------: | ---------- | --: | -: | ---: | --------------: | -------------------: |
| glm4moe 355B.A32B Q4_K - Medium | 189.69 GiB | 356.79 B | ROCm,RPC | 99 | 1 | 0 | pp512 | 103.11 ± 0.08 |
| glm4moe 355B.A32B Q4_K - Medium | 189.69 GiB | 356.79 B | ROCm,RPC | 99 | 1 | 0 | tg128 | 9.11 ± 0.03 |
build: 2656c0d26 (7693)
@@ -0,0 +1,8 @@
ggml_cuda_init: found 1 ROCm devices:
Device 0: Radeon 8060S Graphics, gfx1151 (0x1151), VMM: no, Wave Size: 32
| model | size | params | backend | ngl | n_ubatch | fa | mmap | test | t/s |
| ------------------------------ | ---------: | ---------: | ---------- | --: | -------: | -: | ---: | --------------: | -------------------: |
| glm4moe 355B.A32B Q4_K - Medium | 189.69 GiB | 356.79 B | ROCm,RPC | 99 | 2048 | 1 | 0 | pp2048 @ d32768 | 18.03 ± 0.00 |
| glm4moe 355B.A32B Q4_K - Medium | 189.69 GiB | 356.79 B | ROCm,RPC | 99 | 2048 | 1 | 0 | tg32 @ d32768 | 3.63 ± 0.00 |
build: 2656c0d26 (7693)
@@ -0,0 +1,8 @@
ggml_cuda_init: found 1 ROCm devices:
Device 0: Radeon 8060S Graphics, gfx1151 (0x1151), VMM: no, Wave Size: 32
| model | size | params | backend | ngl | fa | mmap | test | t/s |
| ------------------------------ | ---------: | ---------: | ---------- | --: | -: | ---: | --------------: | -------------------: |
| glm4moe 355B.A32B Q4_K - Medium | 189.69 GiB | 356.79 B | ROCm,RPC | 99 | 1 | 0 | pp512 | 87.98 ± 0.29 |
| glm4moe 355B.A32B Q4_K - Medium | 189.69 GiB | 356.79 B | ROCm,RPC | 99 | 1 | 0 | tg128 | 9.10 ± 0.02 |
build: 2656c0d26 (7693)
@@ -0,0 +1,19 @@
ggml_vulkan: Found 1 Vulkan devices:
ggml_vulkan: 0 = Radeon 8060S Graphics (AMD open-source driver) | uma: 1 | fp16: 1 | bf16: 0 | warp size: 64 | shared memory: 32768 | int dot: 1 | matrix cores: KHR_coopmat
| model | size | params | backend | ngl | fa | mmap | test | t/s |
| ------------------------------ | ---------: | ---------: | ---------- | --: | -: | ---: | --------------: | -------------------: |
/opt/llama.cpp/ggml/src/ggml-rpc/ggml-rpc.cpp:724: Remote RPC server crashed or returned malformed response
/lib64/libggml-base.so.0(+0x35a5) [0x7f92f39eb5a5]
/lib64/libggml-base.so.0(ggml_print_backtrace+0x1eb) [0x7f92f39eb96b]
/lib64/libggml-base.so.0(ggml_abort+0x11f) [0x7f92f39ebaef]
/lib64/libggml-rpc.so.0(+0x5b4a) [0x7f92f7010b4a]
/lib64/libggml-base.so.0(+0x171b2) [0x7f92f39ff1b2]
/lib64/libggml-base.so.0(+0x1749f) [0x7f92f39ff49f]
/lib64/libggml-base.so.0(ggml_backend_alloc_ctx_tensors_from_buft+0x19) [0x7f92f3a00509]
/lib64/libllama.so.0(_ZN11llama_model12load_tensorsER18llama_model_loader+0x3c61) [0x7f92f72603c1]
/lib64/libllama.so.0(+0x25568) [0x7f92f71b6568]
/lib64/libllama.so.0(llama_model_load_from_file+0xac) [0x7f92f71b73cc]
/usr/sbin/llama-bench() [0x4077b5]
/lib64/libc.so.6(+0x35b5) [0x7f92f33815b5]
/lib64/libc.so.6(__libc_start_main+0x88) [0x7f92f3381668]
/usr/sbin/llama-bench() [0x409cf5]
@@ -0,0 +1,19 @@
ggml_vulkan: Found 1 Vulkan devices:
ggml_vulkan: 0 = Radeon 8060S Graphics (AMD open-source driver) | uma: 1 | fp16: 1 | bf16: 0 | warp size: 64 | shared memory: 32768 | int dot: 1 | matrix cores: KHR_coopmat
| model | size | params | backend | ngl | fa | mmap | test | t/s |
| ------------------------------ | ---------: | ---------: | ---------- | --: | -: | ---: | --------------: | -------------------: |
/opt/llama.cpp/ggml/src/ggml-rpc/ggml-rpc.cpp:724: Remote RPC server crashed or returned malformed response
/lib64/libggml-base.so.0(+0x35a5) [0x7f4efadba5a5]
/lib64/libggml-base.so.0(ggml_print_backtrace+0x1eb) [0x7f4efadba96b]
/lib64/libggml-base.so.0(ggml_abort+0x11f) [0x7f4efadbaaef]
/lib64/libggml-rpc.so.0(+0x5b4a) [0x7f4efe3dfb4a]
/lib64/libggml-base.so.0(+0x171b2) [0x7f4efadce1b2]
/lib64/libggml-base.so.0(+0x1749f) [0x7f4efadce49f]
/lib64/libggml-base.so.0(ggml_backend_alloc_ctx_tensors_from_buft+0x19) [0x7f4efadcf509]
/lib64/libllama.so.0(_ZN11llama_model12load_tensorsER18llama_model_loader+0x3c61) [0x7f4efe62f3c1]
/lib64/libllama.so.0(+0x25568) [0x7f4efe585568]
/lib64/libllama.so.0(llama_model_load_from_file+0xac) [0x7f4efe5863cc]
/usr/sbin/llama-bench() [0x4077b5]
/lib64/libc.so.6(+0x35b5) [0x7f4efa7505b5]
/lib64/libc.so.6(__libc_start_main+0x88) [0x7f4efa750668]
/usr/sbin/llama-bench() [0x409cf5]
@@ -0,0 +1 @@
Error: unable to find user kyuz0: no matching entries in passwd file
@@ -0,0 +1 @@
Error: unable to find user kyuz0: no matching entries in passwd file
@@ -0,0 +1,8 @@
ggml_cuda_init: found 1 ROCm devices:
Device 0: Radeon 8060S Graphics, gfx1151 (0x1151), VMM: no, Wave Size: 32
| model | size | params | backend | ngl | n_ubatch | fa | mmap | test | t/s |
| ------------------------------ | ---------: | ---------: | ---------- | --: | -------: | -: | ---: | --------------: | -------------------: |
| minimax-m2 230B.A10B Q6_K | 180.94 GiB | 228.69 B | ROCm,RPC | 99 | 2048 | 1 | 0 | pp2048 @ d32768 | 59.80 ± 0.00 |
| minimax-m2 230B.A10B Q6_K | 180.94 GiB | 228.69 B | ROCm,RPC | 99 | 2048 | 1 | 0 | tg32 @ d32768 | 6.45 ± 0.00 |
build: 2656c0d26 (7693)
@@ -0,0 +1,8 @@
ggml_cuda_init: found 1 ROCm devices:
Device 0: Radeon 8060S Graphics, gfx1151 (0x1151), VMM: no, Wave Size: 32
| model | size | params | backend | ngl | fa | mmap | test | t/s |
| ------------------------------ | ---------: | ---------: | ---------- | --: | -: | ---: | --------------: | -------------------: |
| minimax-m2 230B.A10B Q6_K | 180.94 GiB | 228.69 B | ROCm,RPC | 99 | 1 | 0 | pp512 | 172.78 ± 2.43 |
| minimax-m2 230B.A10B Q6_K | 180.94 GiB | 228.69 B | ROCm,RPC | 99 | 1 | 0 | tg128 | 18.17 ± 0.05 |
build: 2656c0d26 (7693)
@@ -0,0 +1,8 @@
ggml_cuda_init: found 1 ROCm devices:
Device 0: Radeon 8060S Graphics, gfx1151 (0x1151), VMM: no, Wave Size: 32
| model | size | params | backend | ngl | n_ubatch | fa | mmap | test | t/s |
| ------------------------------ | ---------: | ---------: | ---------- | --: | -------: | -: | ---: | --------------: | -------------------: |
| minimax-m2 230B.A10B Q6_K | 180.94 GiB | 228.69 B | ROCm,RPC | 99 | 2048 | 1 | 0 | pp2048 @ d32768 | 59.95 ± 0.00 |
| minimax-m2 230B.A10B Q6_K | 180.94 GiB | 228.69 B | ROCm,RPC | 99 | 2048 | 1 | 0 | tg32 @ d32768 | 6.45 ± 0.00 |
build: 2656c0d26 (7693)
@@ -0,0 +1,8 @@
ggml_cuda_init: found 1 ROCm devices:
Device 0: Radeon 8060S Graphics, gfx1151 (0x1151), VMM: no, Wave Size: 32
| model | size | params | backend | ngl | fa | mmap | test | t/s |
| ------------------------------ | ---------: | ---------: | ---------- | --: | -: | ---: | --------------: | -------------------: |
| minimax-m2 230B.A10B Q6_K | 180.94 GiB | 228.69 B | ROCm,RPC | 99 | 1 | 0 | pp512 | 173.98 ± 1.76 |
| minimax-m2 230B.A10B Q6_K | 180.94 GiB | 228.69 B | ROCm,RPC | 99 | 1 | 0 | tg128 | 18.17 ± 0.04 |
build: 2656c0d26 (7693)
@@ -0,0 +1,8 @@
ggml_cuda_init: found 1 ROCm devices:
Device 0: Radeon 8060S Graphics, gfx1151 (0x1151), VMM: no, Wave Size: 32
| model | size | params | backend | ngl | n_ubatch | fa | mmap | test | t/s |
| ------------------------------ | ---------: | ---------: | ---------- | --: | -------: | -: | ---: | --------------: | -------------------: |
| minimax-m2 230B.A10B Q6_K | 180.94 GiB | 228.69 B | ROCm,RPC | 99 | 2048 | 1 | 0 | pp2048 @ d32768 | 60.12 ± 0.00 |
| minimax-m2 230B.A10B Q6_K | 180.94 GiB | 228.69 B | ROCm,RPC | 99 | 2048 | 1 | 0 | tg32 @ d32768 | 6.04 ± 0.00 |
build: 2656c0d26 (7693)
@@ -0,0 +1,8 @@
ggml_cuda_init: found 1 ROCm devices:
Device 0: Radeon 8060S Graphics, gfx1151 (0x1151), VMM: no, Wave Size: 32
| model | size | params | backend | ngl | fa | mmap | test | t/s |
| ------------------------------ | ---------: | ---------: | ---------- | --: | -: | ---: | --------------: | -------------------: |
| minimax-m2 230B.A10B Q6_K | 180.94 GiB | 228.69 B | ROCm,RPC | 99 | 1 | 0 | pp512 | 157.51 ± 1.13 |
| minimax-m2 230B.A10B Q6_K | 180.94 GiB | 228.69 B | ROCm,RPC | 99 | 1 | 0 | tg128 | 18.24 ± 0.10 |
build: 2656c0d26 (7693)
@@ -0,0 +1,8 @@
ggml_cuda_init: found 1 ROCm devices:
Device 0: Radeon 8060S Graphics, gfx1151 (0x1151), VMM: no, Wave Size: 32
| model | size | params | backend | ngl | n_ubatch | fa | mmap | test | t/s |
| ------------------------------ | ---------: | ---------: | ---------- | --: | -------: | -: | ---: | --------------: | -------------------: |
| minimax-m2 230B.A10B Q6_K | 180.94 GiB | 228.69 B | ROCm,RPC | 99 | 2048 | 1 | 0 | pp2048 @ d32768 | 60.47 ± 0.00 |
| minimax-m2 230B.A10B Q6_K | 180.94 GiB | 228.69 B | ROCm,RPC | 99 | 2048 | 1 | 0 | tg32 @ d32768 | 6.04 ± 0.00 |
build: 2656c0d26 (7693)
@@ -0,0 +1,8 @@
ggml_cuda_init: found 1 ROCm devices:
Device 0: Radeon 8060S Graphics, gfx1151 (0x1151), VMM: no, Wave Size: 32
| model | size | params | backend | ngl | fa | mmap | test | t/s |
| ------------------------------ | ---------: | ---------: | ---------- | --: | -: | ---: | --------------: | -------------------: |
| minimax-m2 230B.A10B Q6_K | 180.94 GiB | 228.69 B | ROCm,RPC | 99 | 1 | 0 | pp512 | 162.36 ± 1.16 |
| minimax-m2 230B.A10B Q6_K | 180.94 GiB | 228.69 B | ROCm,RPC | 99 | 1 | 0 | tg128 | 18.23 ± 0.08 |
build: 2656c0d26 (7693)
@@ -0,0 +1,3 @@
ggml_vulkan: Found 1 Vulkan devices:
ggml_vulkan: 0 = Radeon 8060S Graphics (AMD open-source driver) | uma: 1 | fp16: 1 | bf16: 0 | warp size: 64 | shared memory: 32768 | int dot: 1 | matrix cores: KHR_coopmat
Failed to connect to 10.0.0.1:50052
@@ -0,0 +1,19 @@
ggml_vulkan: Found 1 Vulkan devices:
ggml_vulkan: 0 = Radeon 8060S Graphics (AMD open-source driver) | uma: 1 | fp16: 1 | bf16: 0 | warp size: 64 | shared memory: 32768 | int dot: 1 | matrix cores: KHR_coopmat
| model | size | params | backend | ngl | fa | mmap | test | t/s |
| ------------------------------ | ---------: | ---------: | ---------- | --: | -: | ---: | --------------: | -------------------: |
/opt/llama.cpp/ggml/src/ggml-rpc/ggml-rpc.cpp:724: Remote RPC server crashed or returned malformed response
/lib64/libggml-base.so.0(+0x35a5) [0x7f7c046f25a5]
/lib64/libggml-base.so.0(ggml_print_backtrace+0x1eb) [0x7f7c046f296b]
/lib64/libggml-base.so.0(ggml_abort+0x11f) [0x7f7c046f2aef]
/lib64/libggml-rpc.so.0(+0x5b4a) [0x7f7c07d17b4a]
/lib64/libggml-base.so.0(+0x171b2) [0x7f7c047061b2]
/lib64/libggml-base.so.0(+0x1749f) [0x7f7c0470649f]
/lib64/libggml-base.so.0(ggml_backend_alloc_ctx_tensors_from_buft+0x19) [0x7f7c04707509]
/lib64/libllama.so.0(_ZN11llama_model12load_tensorsER18llama_model_loader+0x3c61) [0x7f7c07f673c1]
/lib64/libllama.so.0(+0x25568) [0x7f7c07ebd568]
/lib64/libllama.so.0(llama_model_load_from_file+0xac) [0x7f7c07ebe3cc]
/usr/sbin/llama-bench() [0x4077b5]
/lib64/libc.so.6(+0x35b5) [0x7f7c040885b5]
/lib64/libc.so.6(__libc_start_main+0x88) [0x7f7c04088668]
/usr/sbin/llama-bench() [0x409cf5]
@@ -0,0 +1,19 @@
ggml_vulkan: Found 1 Vulkan devices:
ggml_vulkan: 0 = Radeon 8060S Graphics (RADV GFX1151) (radv) | uma: 1 | fp16: 1 | bf16: 0 | warp size: 64 | shared memory: 65536 | int dot: 1 | matrix cores: KHR_coopmat
Failed to connect to 10.0.0.1:50052
radv/amdgpu: Failed to allocate a buffer:
radv/amdgpu: size : 990904320 bytes
radv/amdgpu: alignment : 262144 bytes
radv/amdgpu: domains : 4
radv/amdgpu: Failed to allocate a buffer:
radv/amdgpu: size : 990904320 bytes
radv/amdgpu: alignment : 262144 bytes
radv/amdgpu: domains : 4
radv/amdgpu: Failed to allocate a buffer:
radv/amdgpu: size : 990904320 bytes
radv/amdgpu: alignment : 262144 bytes
radv/amdgpu: domains : 4
radv/amdgpu: Failed to allocate a buffer:
radv/amdgpu: size : 990904320 bytes
radv/amdgpu: alignment : 262144 bytes
radv/amdgpu: domains : 4
@@ -0,0 +1,19 @@
ggml_vulkan: Found 1 Vulkan devices:
ggml_vulkan: 0 = Radeon 8060S Graphics (RADV GFX1151) (radv) | uma: 1 | fp16: 1 | bf16: 0 | warp size: 64 | shared memory: 65536 | int dot: 1 | matrix cores: KHR_coopmat
| model | size | params | backend | ngl | fa | mmap | test | t/s |
| ------------------------------ | ---------: | ---------: | ---------- | --: | -: | ---: | --------------: | -------------------: |
/opt/llama.cpp/ggml/src/ggml-rpc/ggml-rpc.cpp:724: Remote RPC server crashed or returned malformed response
/lib64/libggml-base.so.0(+0x35a5) [0x7fe6965fe5a5]
/lib64/libggml-base.so.0(ggml_print_backtrace+0x1eb) [0x7fe6965fe96b]
/lib64/libggml-base.so.0(ggml_abort+0x11f) [0x7fe6965feaef]
/lib64/libggml-rpc.so.0(+0x5b4a) [0x7fe699c23b4a]
/lib64/libggml-base.so.0(+0x171b2) [0x7fe6966121b2]
/lib64/libggml-base.so.0(+0x1749f) [0x7fe69661249f]
/lib64/libggml-base.so.0(ggml_backend_alloc_ctx_tensors_from_buft+0x19) [0x7fe696613509]
/lib64/libllama.so.0(_ZN11llama_model12load_tensorsER18llama_model_loader+0x3c61) [0x7fe699e733c1]
/lib64/libllama.so.0(+0x25568) [0x7fe699dc9568]
/lib64/libllama.so.0(llama_model_load_from_file+0xac) [0x7fe699dca3cc]
/usr/sbin/llama-bench() [0x4077b5]
/lib64/libc.so.6(+0x35b5) [0x7fe695f945b5]
/lib64/libc.so.6(__libc_start_main+0x88) [0x7fe695f94668]
/usr/sbin/llama-bench() [0x409cf5]
@@ -0,0 +1,8 @@
ggml_cuda_init: found 1 ROCm devices:
Device 0: Radeon 8060S Graphics, gfx1151 (0x1151), VMM: no, Wave Size: 32
| model | size | params | backend | ngl | n_ubatch | fa | test | t/s |
| ------------------------------ | ---------: | ---------: | ---------- | --: | -------: | -: | --------------: | -------------------: |
| glm4moe 355B.A32B Q4_K - Medium | 189.69 GiB | 356.79 B | ROCm,RPC | 99 | 2048 | 1 | pp2048 @ d32768 | 18.17 ± 0.00 |
| glm4moe 355B.A32B Q4_K - Medium | 189.69 GiB | 356.79 B | ROCm,RPC | 99 | 2048 | 1 | tg32 @ d32768 | 3.73 ± 0.00 |
build: e0c93af2a (7938)
@@ -0,0 +1,8 @@
ggml_cuda_init: found 1 ROCm devices:
Device 0: Radeon 8060S Graphics, gfx1151 (0x1151), VMM: no, Wave Size: 32
| model | size | params | backend | ngl | fa | test | t/s |
| ------------------------------ | ---------: | ---------: | ---------- | --: | -: | --------------: | -------------------: |
| glm4moe 355B.A32B Q4_K - Medium | 189.69 GiB | 356.79 B | ROCm,RPC | 99 | 1 | pp512 | 97.96 ± 0.29 |
| glm4moe 355B.A32B Q4_K - Medium | 189.69 GiB | 356.79 B | ROCm,RPC | 99 | 1 | tg128 | 9.09 ± 0.00 |
build: e0c93af2a (7938)
@@ -0,0 +1,8 @@
ggml_cuda_init: found 1 ROCm devices:
Device 0: Radeon 8060S Graphics, gfx1151 (0x1151), VMM: no, Wave Size: 32
| model | size | params | backend | ngl | n_ubatch | fa | test | t/s |
| ------------------------------ | ---------: | ---------: | ---------- | --: | -------: | -: | --------------: | -------------------: |
| glm4moe 355B.A32B Q4_K - Medium | 189.69 GiB | 356.79 B | ROCm,RPC | 99 | 2048 | 1 | pp2048 @ d32768 | 17.22 ± 0.00 |
| glm4moe 355B.A32B Q4_K - Medium | 189.69 GiB | 356.79 B | ROCm,RPC | 99 | 2048 | 1 | tg32 @ d32768 | 3.72 ± 0.00 |
build: e0c93af2a (7938)
@@ -0,0 +1,8 @@
ggml_cuda_init: found 1 ROCm devices:
Device 0: Radeon 8060S Graphics, gfx1151 (0x1151), VMM: no, Wave Size: 32
| model | size | params | backend | ngl | fa | test | t/s |
| ------------------------------ | ---------: | ---------: | ---------- | --: | -: | --------------: | -------------------: |
| glm4moe 355B.A32B Q4_K - Medium | 189.69 GiB | 356.79 B | ROCm,RPC | 99 | 1 | pp512 | 73.57 ± 0.23 |
| glm4moe 355B.A32B Q4_K - Medium | 189.69 GiB | 356.79 B | ROCm,RPC | 99 | 1 | tg128 | 9.02 ± 0.01 |
build: e0c93af2a (7938)
@@ -1,8 +1,8 @@
 ggml_cuda_init: found 1 ROCm devices:
 Device 0: Radeon 8060S Graphics, gfx1151 (0x1151), VMM: no, Wave Size: 32
-| model | size | params | backend | ngl | n_ubatch | fa | mmap | test | t/s |
-| ------------------------------ | ---------: | ---------: | ---------- | --: | -------: | -: | ---: | --------------: | -------------------: |
-| glm4moe 355B.A32B Q4_K - Medium | 189.69 GiB | 356.79 B | ROCm,RPC | 99 | 2048 | 1 | 0 | pp2048 @ d32768 | 18.41 ± 0.00 |
-| glm4moe 355B.A32B Q4_K - Medium | 189.69 GiB | 356.79 B | ROCm,RPC | 99 | 2048 | 1 | 0 | tg32 @ d32768 | 4.12 ± 0.00 |
-build: 2656c0d26 (7693)
+| model | size | params | backend | ngl | n_ubatch | fa | test | t/s |
+| ------------------------------ | ---------: | ---------: | ---------- | --: | -------: | -: | --------------: | -------------------: |
+| glm4moe 355B.A32B Q4_K - Medium | 189.69 GiB | 356.79 B | ROCm,RPC | 99 | 2048 | 1 | pp2048 @ d32768 | 18.79 ± 0.00 |
+| glm4moe 355B.A32B Q4_K - Medium | 189.69 GiB | 356.79 B | ROCm,RPC | 99 | 2048 | 1 | tg32 @ d32768 | 4.13 ± 0.00 |
+build: e0c93af2a (7938)
@@ -1,8 +1,8 @@
 ggml_cuda_init: found 1 ROCm devices:
 Device 0: Radeon 8060S Graphics, gfx1151 (0x1151), VMM: no, Wave Size: 32
-| model | size | params | backend | ngl | fa | mmap | test | t/s |
-| ------------------------------ | ---------: | ---------: | ---------- | --: | -: | ---: | --------------: | -------------------: |
-| glm4moe 355B.A32B Q4_K - Medium | 189.69 GiB | 356.79 B | ROCm,RPC | 99 | 1 | 0 | pp512 | 101.82 ± 0.34 |
-| glm4moe 355B.A32B Q4_K - Medium | 189.69 GiB | 356.79 B | ROCm,RPC | 99 | 1 | 0 | tg128 | 8.71 ± 0.00 |
-build: 2656c0d26 (7693)
+| model | size | params | backend | ngl | fa | test | t/s |
+| ------------------------------ | ---------: | ---------: | ---------- | --: | -: | --------------: | -------------------: |
+| glm4moe 355B.A32B Q4_K - Medium | 189.69 GiB | 356.79 B | ROCm,RPC | 99 | 1 | pp512 | 99.24 ± 0.14 |
+| glm4moe 355B.A32B Q4_K - Medium | 189.69 GiB | 356.79 B | ROCm,RPC | 99 | 1 | tg128 | 8.55 ± 0.00 |
+build: e0c93af2a (7938)
@@ -1,8 +1,8 @@
 ggml_cuda_init: found 1 ROCm devices:
 Device 0: Radeon 8060S Graphics, gfx1151 (0x1151), VMM: no, Wave Size: 32
-| model | size | params | backend | ngl | n_ubatch | fa | mmap | test | t/s |
-| ------------------------------ | ---------: | ---------: | ---------- | --: | -------: | -: | ---: | --------------: | -------------------: |
-| glm4moe 355B.A32B Q4_K - Medium | 189.69 GiB | 356.79 B | ROCm,RPC | 99 | 2048 | 1 | 0 | pp2048 @ d32768 | 17.93 ± 0.00 |
-| glm4moe 355B.A32B Q4_K - Medium | 189.69 GiB | 356.79 B | ROCm,RPC | 99 | 2048 | 1 | 0 | tg32 @ d32768 | 4.13 ± 0.00 |
-build: 2656c0d26 (7693)
+| model | size | params | backend | ngl | n_ubatch | fa | test | t/s |
+| ------------------------------ | ---------: | ---------: | ---------- | --: | -------: | -: | --------------: | -------------------: |
+| glm4moe 355B.A32B Q4_K - Medium | 189.69 GiB | 356.79 B | ROCm,RPC | 99 | 2048 | 1 | pp2048 @ d32768 | 18.80 ± 0.00 |
+| glm4moe 355B.A32B Q4_K - Medium | 189.69 GiB | 356.79 B | ROCm,RPC | 99 | 2048 | 1 | tg32 @ d32768 | 4.10 ± 0.00 |
+build: e0c93af2a (7938)
@@ -1,8 +1,8 @@
 ggml_cuda_init: found 1 ROCm devices:
 Device 0: Radeon 8060S Graphics, gfx1151 (0x1151), VMM: no, Wave Size: 32
-| model | size | params | backend | ngl | fa | mmap | test | t/s |
-| ------------------------------ | ---------: | ---------: | ---------- | --: | -: | ---: | --------------: | -------------------: |
-| glm4moe 355B.A32B Q4_K - Medium | 189.69 GiB | 356.79 B | ROCm,RPC | 99 | 1 | 0 | pp512 | 95.55 ± 0.26 |
-| glm4moe 355B.A32B Q4_K - Medium | 189.69 GiB | 356.79 B | ROCm,RPC | 99 | 1 | 0 | tg128 | 8.78 ± 0.00 |
-build: 2656c0d26 (7693)
+| model | size | params | backend | ngl | fa | test | t/s |
+| ------------------------------ | ---------: | ---------: | ---------- | --: | -: | --------------: | -------------------: |
+| glm4moe 355B.A32B Q4_K - Medium | 189.69 GiB | 356.79 B | ROCm,RPC | 99 | 1 | pp512 | 99.22 ± 0.25 |
+| glm4moe 355B.A32B Q4_K - Medium | 189.69 GiB | 356.79 B | ROCm,RPC | 99 | 1 | tg128 | 8.55 ± 0.00 |
+build: e0c93af2a (7938)
@@ -1,8 +1,8 @@
 ggml_cuda_init: found 1 ROCm devices:
 Device 0: Radeon 8060S Graphics, gfx1151 (0x1151), VMM: no, Wave Size: 32
-| model | size | params | backend | ngl | n_ubatch | fa | mmap | test | t/s |
-| ------------------------------ | ---------: | ---------: | ---------- | --: | -------: | -: | ---: | --------------: | -------------------: |
-| glm4moe 355B.A32B Q4_K - Medium | 189.69 GiB | 356.79 B | ROCm,RPC | 99 | 2048 | 1 | 0 | pp2048 @ d32768 | 18.59 ± 0.00 |
-| glm4moe 355B.A32B Q4_K - Medium | 189.69 GiB | 356.79 B | ROCm,RPC | 99 | 2048 | 1 | 0 | tg32 @ d32768 | 3.63 ± 0.00 |
-build: 2656c0d26 (7693)
+| model | size | params | backend | ngl | n_ubatch | fa | test | t/s |
+| ------------------------------ | ---------: | ---------: | ---------- | --: | -------: | -: | --------------: | -------------------: |
+| glm4moe 355B.A32B Q4_K - Medium | 189.69 GiB | 356.79 B | ROCm,RPC | 99 | 2048 | 1 | pp2048 @ d32768 | 18.35 ± 0.00 |
+| glm4moe 355B.A32B Q4_K - Medium | 189.69 GiB | 356.79 B | ROCm,RPC | 99 | 2048 | 1 | tg32 @ d32768 | 3.69 ± 0.00 |
+build: e0c93af2a (7938)
@@ -1,8 +1,8 @@
 ggml_cuda_init: found 1 ROCm devices:
 Device 0: Radeon 8060S Graphics, gfx1151 (0x1151), VMM: no, Wave Size: 32
-| model | size | params | backend | ngl | fa | mmap | test | t/s |
-| ------------------------------ | ---------: | ---------: | ---------- | --: | -: | ---: | --------------: | -------------------: |
-| glm4moe 355B.A32B Q4_K - Medium | 189.69 GiB | 356.79 B | ROCm,RPC | 99 | 1 | 0 | pp512 | 103.11 ± 0.08 |
-| glm4moe 355B.A32B Q4_K - Medium | 189.69 GiB | 356.79 B | ROCm,RPC | 99 | 1 | 0 | tg128 | 9.11 ± 0.03 |
-build: 2656c0d26 (7693)
+| model | size | params | backend | ngl | fa | test | t/s |
+| ------------------------------ | ---------: | ---------: | ---------- | --: | -: | --------------: | -------------------: |
+| glm4moe 355B.A32B Q4_K - Medium | 189.69 GiB | 356.79 B | ROCm,RPC | 99 | 1 | pp512 | 99.58 ± 0.65 |
+| glm4moe 355B.A32B Q4_K - Medium | 189.69 GiB | 356.79 B | ROCm,RPC | 99 | 1 | tg128 | 9.04 ± 0.01 |
+build: e0c93af2a (7938)
@@ -1,8 +1,8 @@
 ggml_cuda_init: found 1 ROCm devices:
 Device 0: Radeon 8060S Graphics, gfx1151 (0x1151), VMM: no, Wave Size: 32
-| model | size | params | backend | ngl | n_ubatch | fa | mmap | test | t/s |
-| ------------------------------ | ---------: | ---------: | ---------- | --: | -------: | -: | ---: | --------------: | -------------------: |
-| glm4moe 355B.A32B Q4_K - Medium | 189.69 GiB | 356.79 B | ROCm,RPC | 99 | 2048 | 1 | 0 | pp2048 @ d32768 | 18.03 ± 0.00 |
-| glm4moe 355B.A32B Q4_K - Medium | 189.69 GiB | 356.79 B | ROCm,RPC | 99 | 2048 | 1 | 0 | tg32 @ d32768 | 3.63 ± 0.00 |
-build: 2656c0d26 (7693)
+| model | size | params | backend | ngl | n_ubatch | fa | test | t/s |
+| ------------------------------ | ---------: | ---------: | ---------- | --: | -------: | -: | --------------: | -------------------: |
+| glm4moe 355B.A32B Q4_K - Medium | 189.69 GiB | 356.79 B | ROCm,RPC | 99 | 2048 | 1 | pp2048 @ d32768 | 17.17 ± 0.00 |
+| glm4moe 355B.A32B Q4_K - Medium | 189.69 GiB | 356.79 B | ROCm,RPC | 99 | 2048 | 1 | tg32 @ d32768 | 3.67 ± 0.00 |
+build: e0c93af2a (7938)
@@ -1,8 +1,8 @@
 ggml_cuda_init: found 1 ROCm devices:
 Device 0: Radeon 8060S Graphics, gfx1151 (0x1151), VMM: no, Wave Size: 32
-| model | size | params | backend | ngl | fa | mmap | test | t/s |
-| ------------------------------ | ---------: | ---------: | ---------- | --: | -: | ---: | --------------: | -------------------: |
-| glm4moe 355B.A32B Q4_K - Medium | 189.69 GiB | 356.79 B | ROCm,RPC | 99 | 1 | 0 | pp512 | 87.98 ± 0.29 |
-| glm4moe 355B.A32B Q4_K - Medium | 189.69 GiB | 356.79 B | ROCm,RPC | 99 | 1 | 0 | tg128 | 9.10 ± 0.02 |
-build: 2656c0d26 (7693)
+| model | size | params | backend | ngl | fa | test | t/s |
+| ------------------------------ | ---------: | ---------: | ---------- | --: | -: | --------------: | -------------------: |
+| glm4moe 355B.A32B Q4_K - Medium | 189.69 GiB | 356.79 B | ROCm,RPC | 99 | 1 | pp512 | 72.73 ± 0.53 |
+| glm4moe 355B.A32B Q4_K - Medium | 189.69 GiB | 356.79 B | ROCm,RPC | 99 | 1 | tg128 | 9.05 ± 0.00 |
+build: e0c93af2a (7938)
@@ -1,19 +1,3 @@
 ggml_vulkan: Found 1 Vulkan devices:
 ggml_vulkan: 0 = Radeon 8060S Graphics (AMD open-source driver) | uma: 1 | fp16: 1 | bf16: 0 | warp size: 64 | shared memory: 32768 | int dot: 1 | matrix cores: KHR_coopmat
-| model | size | params | backend | ngl | fa | mmap | test | t/s |
-| ------------------------------ | ---------: | ---------: | ---------- | --: | -: | ---: | --------------: | -------------------: |
-/opt/llama.cpp/ggml/src/ggml-rpc/ggml-rpc.cpp:724: Remote RPC server crashed or returned malformed response
-/lib64/libggml-base.so.0(+0x35a5) [0x7f92f39eb5a5]
-/lib64/libggml-base.so.0(ggml_print_backtrace+0x1eb) [0x7f92f39eb96b]
-/lib64/libggml-base.so.0(ggml_abort+0x11f) [0x7f92f39ebaef]
-/lib64/libggml-rpc.so.0(+0x5b4a) [0x7f92f7010b4a]
-/lib64/libggml-base.so.0(+0x171b2) [0x7f92f39ff1b2]
-/lib64/libggml-base.so.0(+0x1749f) [0x7f92f39ff49f]
-/lib64/libggml-base.so.0(ggml_backend_alloc_ctx_tensors_from_buft+0x19) [0x7f92f3a00509]
-/lib64/libllama.so.0(_ZN11llama_model12load_tensorsER18llama_model_loader+0x3c61) [0x7f92f72603c1]
-/lib64/libllama.so.0(+0x25568) [0x7f92f71b6568]
-/lib64/libllama.so.0(llama_model_load_from_file+0xac) [0x7f92f71b73cc]
-/usr/sbin/llama-bench() [0x4077b5]
-/lib64/libc.so.6(+0x35b5) [0x7f92f33815b5]
-/lib64/libc.so.6(__libc_start_main+0x88) [0x7f92f3381668]
-/usr/sbin/llama-bench() [0x409cf5]
+Failed to connect to 192.168.100.2:50052
@@ -1,19 +1,19 @@
 ggml_vulkan: Found 1 Vulkan devices:
 ggml_vulkan: 0 = Radeon 8060S Graphics (AMD open-source driver) | uma: 1 | fp16: 1 | bf16: 0 | warp size: 64 | shared memory: 32768 | int dot: 1 | matrix cores: KHR_coopmat
-| model | size | params | backend | ngl | fa | mmap | test | t/s |
-| ------------------------------ | ---------: | ---------: | ---------- | --: | -: | ---: | --------------: | -------------------: |
+| model | size | params | backend | ngl | fa | test | t/s |
+| ------------------------------ | ---------: | ---------: | ---------- | --: | -: | --------------: | -------------------: |
 /opt/llama.cpp/ggml/src/ggml-rpc/ggml-rpc.cpp:724: Remote RPC server crashed or returned malformed response
-/lib64/libggml-base.so.0(+0x35a5) [0x7f4efadba5a5]
-/lib64/libggml-base.so.0(ggml_print_backtrace+0x1eb) [0x7f4efadba96b]
-/lib64/libggml-base.so.0(ggml_abort+0x11f) [0x7f4efadbaaef]
-/lib64/libggml-rpc.so.0(+0x5b4a) [0x7f4efe3dfb4a]
-/lib64/libggml-base.so.0(+0x171b2) [0x7f4efadce1b2]
-/lib64/libggml-base.so.0(+0x1749f) [0x7f4efadce49f]
-/lib64/libggml-base.so.0(ggml_backend_alloc_ctx_tensors_from_buft+0x19) [0x7f4efadcf509]
-/lib64/libllama.so.0(_ZN11llama_model12load_tensorsER18llama_model_loader+0x3c61) [0x7f4efe62f3c1]
-/lib64/libllama.so.0(+0x25568) [0x7f4efe585568]
-/lib64/libllama.so.0(llama_model_load_from_file+0xac) [0x7f4efe5863cc]
-/usr/sbin/llama-bench() [0x4077b5]
-/lib64/libc.so.6(+0x35b5) [0x7f4efa7505b5]
-/lib64/libc.so.6(__libc_start_main+0x88) [0x7f4efa750668]
-/usr/sbin/llama-bench() [0x409cf5]
+/lib64/libggml-base.so.0(+0x35a5) [0x7f14eecbd5a5]
+/lib64/libggml-base.so.0(ggml_print_backtrace+0x1eb) [0x7f14eecbd96b]
+/lib64/libggml-base.so.0(ggml_abort+0x11f) [0x7f14eecbdaef]
+/lib64/libggml-rpc.so.0(+0x5b4a) [0x7f14f2311b4a]
+/lib64/libggml-base.so.0(+0x174f2) [0x7f14eecd14f2]
+/lib64/libggml-base.so.0(+0x177df) [0x7f14eecd17df]
+/lib64/libggml-base.so.0(ggml_backend_alloc_ctx_tensors_from_buft+0x19) [0x7f14eecd2849]
+/lib64/libllama.so.0(_ZN11llama_model12load_tensorsER18llama_model_loader+0x3c41) [0x7f14f257dbe1]
+/lib64/libllama.so.0(+0x279e8) [0x7f14f24c79e8]
+/lib64/libllama.so.0(llama_model_load_from_file+0xac) [0x7f14f24c884c]
+/usr/sbin/llama-bench() [0x407fbd]
+/lib64/libc.so.6(+0x35b5) [0x7f14ee1055b5]
+/lib64/libc.so.6(__libc_start_main+0x88) [0x7f14ee105668]
+/usr/sbin/llama-bench() [0x40a7b5]
@@ -1 +1 @@
-Error: unable to find user kyuz0: no matching entries in passwd file
+Error: failed to start container llama-vulkan-radv
@@ -1 +1 @@
Error: unable to find user kyuz0: no matching entries in passwd file Error: failed to start container llama-vulkan-radv
@@ -0,0 +1,8 @@
ggml_cuda_init: found 1 ROCm devices:
Device 0: Radeon 8060S Graphics, gfx1151 (0x1151), VMM: no, Wave Size: 32
| model | size | params | backend | ngl | n_ubatch | fa | test | t/s |
| ------------------------------ | ---------: | ---------: | ---------- | --: | -------: | -: | --------------: | -------------------: |
| minimax-m2 230B.A10B Q6_K | 180.94 GiB | 228.69 B | ROCm,RPC | 99 | 2048 | 1 | pp2048 @ d32768 | 58.89 ± 0.00 |
| minimax-m2 230B.A10B Q6_K | 180.94 GiB | 228.69 B | ROCm,RPC | 99 | 2048 | 1 | tg32 @ d32768 | 5.92 ± 0.00 |
build: e0c93af2a (7938)
@@ -0,0 +1,8 @@
ggml_cuda_init: found 1 ROCm devices:
Device 0: Radeon 8060S Graphics, gfx1151 (0x1151), VMM: no, Wave Size: 32
| model | size | params | backend | ngl | fa | test | t/s |
| ------------------------------ | ---------: | ---------: | ---------- | --: | -: | --------------: | -------------------: |
| minimax-m2 230B.A10B Q6_K | 180.94 GiB | 228.69 B | ROCm,RPC | 99 | 1 | pp512 | 158.25 ± 0.52 |
| minimax-m2 230B.A10B Q6_K | 180.94 GiB | 228.69 B | ROCm,RPC | 99 | 1 | tg128 | 19.04 ± 0.00 |
build: e0c93af2a (7938)
@@ -0,0 +1,8 @@
ggml_cuda_init: found 1 ROCm devices:
Device 0: Radeon 8060S Graphics, gfx1151 (0x1151), VMM: no, Wave Size: 32
| model | size | params | backend | ngl | n_ubatch | fa | test | t/s |
| ------------------------------ | ---------: | ---------: | ---------- | --: | -------: | -: | --------------: | -------------------: |
| minimax-m2 230B.A10B Q6_K | 180.94 GiB | 228.69 B | ROCm,RPC | 99 | 2048 | 1 | pp2048 @ d32768 | 59.95 ± 0.00 |
| minimax-m2 230B.A10B Q6_K | 180.94 GiB | 228.69 B | ROCm,RPC | 99 | 2048 | 1 | tg32 @ d32768 | 5.91 ± 0.00 |
build: e0c93af2a (7938)
@@ -0,0 +1,8 @@
ggml_cuda_init: found 1 ROCm devices:
Device 0: Radeon 8060S Graphics, gfx1151 (0x1151), VMM: no, Wave Size: 32
| model | size | params | backend | ngl | fa | test | t/s |
| ------------------------------ | ---------: | ---------: | ---------- | --: | -: | --------------: | -------------------: |
| minimax-m2 230B.A10B Q6_K | 180.94 GiB | 228.69 B | ROCm,RPC | 99 | 1 | pp512 | 159.79 ± 0.35 |
| minimax-m2 230B.A10B Q6_K | 180.94 GiB | 228.69 B | ROCm,RPC | 99 | 1 | tg128 | 19.05 ± 0.00 |
build: e0c93af2a (7938)
@@ -1,8 +1,8 @@
ggml_cuda_init: found 1 ROCm devices: ggml_cuda_init: found 1 ROCm devices:
Device 0: Radeon 8060S Graphics, gfx1151 (0x1151), VMM: no, Wave Size: 32 Device 0: Radeon 8060S Graphics, gfx1151 (0x1151), VMM: no, Wave Size: 32
| model | size | params | backend | ngl | n_ubatch | fa | mmap | test | t/s | | model | size | params | backend | ngl | n_ubatch | fa | test | t/s |
| ------------------------------ | ---------: | ---------: | ---------- | --: | -------: | -: | ---: | --------------: | -------------------: | | ------------------------------ | ---------: | ---------: | ---------- | --: | -------: | -: | --------------: | -------------------: |
| minimax-m2 230B.A10B Q6_K | 180.94 GiB | 228.69 B | ROCm,RPC | 99 | 2048 | 1 | 0 | pp2048 @ d32768 | 59.80 ± 0.00 | | minimax-m2 230B.A10B Q6_K | 180.94 GiB | 228.69 B | ROCm,RPC | 99 | 2048 | 1 | pp2048 @ d32768 | 64.41 ± 0.00 |
| minimax-m2 230B.A10B Q6_K | 180.94 GiB | 228.69 B | ROCm,RPC | 99 | 2048 | 1 | 0 | tg32 @ d32768 | 6.45 ± 0.00 | | minimax-m2 230B.A10B Q6_K | 180.94 GiB | 228.69 B | ROCm,RPC | 99 | 2048 | 1 | tg32 @ d32768 | 6.41 ± 0.00 |
build: 2656c0d26 (7693) build: e0c93af2a (7938)
@@ -1,8 +1,8 @@
ggml_cuda_init: found 1 ROCm devices: ggml_cuda_init: found 1 ROCm devices:
Device 0: Radeon 8060S Graphics, gfx1151 (0x1151), VMM: no, Wave Size: 32 Device 0: Radeon 8060S Graphics, gfx1151 (0x1151), VMM: no, Wave Size: 32
| model | size | params | backend | ngl | fa | mmap | test | t/s | | model | size | params | backend | ngl | fa | test | t/s |
| ------------------------------ | ---------: | ---------: | ---------- | --: | -: | ---: | --------------: | -------------------: | | ------------------------------ | ---------: | ---------: | ---------- | --: | -: | --------------: | -------------------: |
| minimax-m2 230B.A10B Q6_K | 180.94 GiB | 228.69 B | ROCm,RPC | 99 | 1 | 0 | pp512 | 172.78 ± 2.43 | | minimax-m2 230B.A10B Q6_K | 180.94 GiB | 228.69 B | ROCm,RPC | 99 | 1 | pp512 | 169.30 ± 0.95 |
| minimax-m2 230B.A10B Q6_K | 180.94 GiB | 228.69 B | ROCm,RPC | 99 | 1 | 0 | tg128 | 18.17 ± 0.05 | | minimax-m2 230B.A10B Q6_K | 180.94 GiB | 228.69 B | ROCm,RPC | 99 | 1 | tg128 | 18.93 ± 0.01 |
build: 2656c0d26 (7693) build: e0c93af2a (7938)
@@ -1,8 +1,8 @@
ggml_cuda_init: found 1 ROCm devices: ggml_cuda_init: found 1 ROCm devices:
Device 0: Radeon 8060S Graphics, gfx1151 (0x1151), VMM: no, Wave Size: 32 Device 0: Radeon 8060S Graphics, gfx1151 (0x1151), VMM: no, Wave Size: 32
| model | size | params | backend | ngl | n_ubatch | fa | mmap | test | t/s | | model | size | params | backend | ngl | n_ubatch | fa | test | t/s |
| ------------------------------ | ---------: | ---------: | ---------- | --: | -------: | -: | ---: | --------------: | -------------------: | | ------------------------------ | ---------: | ---------: | ---------- | --: | -------: | -: | --------------: | -------------------: |
| minimax-m2 230B.A10B Q6_K | 180.94 GiB | 228.69 B | ROCm,RPC | 99 | 2048 | 1 | 0 | pp2048 @ d32768 | 59.95 ± 0.00 | | minimax-m2 230B.A10B Q6_K | 180.94 GiB | 228.69 B | ROCm,RPC | 99 | 2048 | 1 | pp2048 @ d32768 | 65.69 ± 0.00 |
| minimax-m2 230B.A10B Q6_K | 180.94 GiB | 228.69 B | ROCm,RPC | 99 | 2048 | 1 | 0 | tg32 @ d32768 | 6.45 ± 0.00 | | minimax-m2 230B.A10B Q6_K | 180.94 GiB | 228.69 B | ROCm,RPC | 99 | 2048 | 1 | tg32 @ d32768 | 6.47 ± 0.00 |
build: 2656c0d26 (7693) build: e0c93af2a (7938)
@@ -1,8 +1,8 @@
ggml_cuda_init: found 1 ROCm devices: ggml_cuda_init: found 1 ROCm devices:
Device 0: Radeon 8060S Graphics, gfx1151 (0x1151), VMM: no, Wave Size: 32 Device 0: Radeon 8060S Graphics, gfx1151 (0x1151), VMM: no, Wave Size: 32
| model | size | params | backend | ngl | fa | mmap | test | t/s | | model | size | params | backend | ngl | fa | test | t/s |
| ------------------------------ | ---------: | ---------: | ---------- | --: | -: | ---: | --------------: | -------------------: | | ------------------------------ | ---------: | ---------: | ---------- | --: | -: | --------------: | -------------------: |
| minimax-m2 230B.A10B Q6_K | 180.94 GiB | 228.69 B | ROCm,RPC | 99 | 1 | 0 | pp512 | 173.98 ± 1.76 | | minimax-m2 230B.A10B Q6_K | 180.94 GiB | 228.69 B | ROCm,RPC | 99 | 1 | pp512 | 169.50 ± 1.04 |
| minimax-m2 230B.A10B Q6_K | 180.94 GiB | 228.69 B | ROCm,RPC | 99 | 1 | 0 | tg128 | 18.17 ± 0.04 | | minimax-m2 230B.A10B Q6_K | 180.94 GiB | 228.69 B | ROCm,RPC | 99 | 1 | tg128 | 18.89 ± 0.00 |
build: 2656c0d26 (7693) build: e0c93af2a (7938)
@@ -1,8 +1,8 @@
ggml_cuda_init: found 1 ROCm devices: ggml_cuda_init: found 1 ROCm devices:
Device 0: Radeon 8060S Graphics, gfx1151 (0x1151), VMM: no, Wave Size: 32 Device 0: Radeon 8060S Graphics, gfx1151 (0x1151), VMM: no, Wave Size: 32
| model | size | params | backend | ngl | n_ubatch | fa | mmap | test | t/s | | model | size | params | backend | ngl | n_ubatch | fa | test | t/s |
| ------------------------------ | ---------: | ---------: | ---------- | --: | -------: | -: | ---: | --------------: | -------------------: | | ------------------------------ | ---------: | ---------: | ---------- | --: | -------: | -: | --------------: | -------------------: |
| minimax-m2 230B.A10B Q6_K | 180.94 GiB | 228.69 B | ROCm,RPC | 99 | 2048 | 1 | 0 | pp2048 @ d32768 | 60.12 ± 0.00 | | minimax-m2 230B.A10B Q6_K | 180.94 GiB | 228.69 B | ROCm,RPC | 99 | 2048 | 1 | pp2048 @ d32768 | 60.88 ± 0.00 |
| minimax-m2 230B.A10B Q6_K | 180.94 GiB | 228.69 B | ROCm,RPC | 99 | 2048 | 1 | 0 | tg32 @ d32768 | 6.04 ± 0.00 | | minimax-m2 230B.A10B Q6_K | 180.94 GiB | 228.69 B | ROCm,RPC | 99 | 2048 | 1 | tg32 @ d32768 | 6.10 ± 0.00 |
build: 2656c0d26 (7693) build: e0c93af2a (7938)
@@ -1,8 +1,8 @@
ggml_cuda_init: found 1 ROCm devices: ggml_cuda_init: found 1 ROCm devices:
Device 0: Radeon 8060S Graphics, gfx1151 (0x1151), VMM: no, Wave Size: 32 Device 0: Radeon 8060S Graphics, gfx1151 (0x1151), VMM: no, Wave Size: 32
| model | size | params | backend | ngl | fa | mmap | test | t/s | | model | size | params | backend | ngl | fa | test | t/s |
| ------------------------------ | ---------: | ---------: | ---------- | --: | -: | ---: | --------------: | -------------------: | | ------------------------------ | ---------: | ---------: | ---------- | --: | -: | --------------: | -------------------: |
| minimax-m2 230B.A10B Q6_K | 180.94 GiB | 228.69 B | ROCm,RPC | 99 | 1 | 0 | pp512 | 157.51 ± 1.13 | | minimax-m2 230B.A10B Q6_K | 180.94 GiB | 228.69 B | ROCm,RPC | 99 | 1 | pp512 | 171.03 ± 0.56 |
| minimax-m2 230B.A10B Q6_K | 180.94 GiB | 228.69 B | ROCm,RPC | 99 | 1 | 0 | tg128 | 18.24 ± 0.10 | | minimax-m2 230B.A10B Q6_K | 180.94 GiB | 228.69 B | ROCm,RPC | 99 | 1 | tg128 | 18.98 ± 0.02 |
build: 2656c0d26 (7693) build: e0c93af2a (7938)
@@ -1,8 +1,8 @@
ggml_cuda_init: found 1 ROCm devices: ggml_cuda_init: found 1 ROCm devices:
Device 0: Radeon 8060S Graphics, gfx1151 (0x1151), VMM: no, Wave Size: 32 Device 0: Radeon 8060S Graphics, gfx1151 (0x1151), VMM: no, Wave Size: 32
| model | size | params | backend | ngl | n_ubatch | fa | mmap | test | t/s | | model | size | params | backend | ngl | n_ubatch | fa | test | t/s |
| ------------------------------ | ---------: | ---------: | ---------- | --: | -------: | -: | ---: | --------------: | -------------------: | | ------------------------------ | ---------: | ---------: | ---------- | --: | -------: | -: | --------------: | -------------------: |
| minimax-m2 230B.A10B Q6_K | 180.94 GiB | 228.69 B | ROCm,RPC | 99 | 2048 | 1 | 0 | pp2048 @ d32768 | 60.47 ± 0.00 | | minimax-m2 230B.A10B Q6_K | 180.94 GiB | 228.69 B | ROCm,RPC | 99 | 2048 | 1 | pp2048 @ d32768 | 60.78 ± 0.00 |
| minimax-m2 230B.A10B Q6_K | 180.94 GiB | 228.69 B | ROCm,RPC | 99 | 2048 | 1 | 0 | tg32 @ d32768 | 6.04 ± 0.00 | | minimax-m2 230B.A10B Q6_K | 180.94 GiB | 228.69 B | ROCm,RPC | 99 | 2048 | 1 | tg32 @ d32768 | 6.10 ± 0.00 |
build: 2656c0d26 (7693) build: e0c93af2a (7938)
@@ -1,8 +1,8 @@
ggml_cuda_init: found 1 ROCm devices: ggml_cuda_init: found 1 ROCm devices:
Device 0: Radeon 8060S Graphics, gfx1151 (0x1151), VMM: no, Wave Size: 32 Device 0: Radeon 8060S Graphics, gfx1151 (0x1151), VMM: no, Wave Size: 32
| model | size | params | backend | ngl | fa | mmap | test | t/s | | model | size | params | backend | ngl | fa | test | t/s |
| ------------------------------ | ---------: | ---------: | ---------- | --: | -: | ---: | --------------: | -------------------: | | ------------------------------ | ---------: | ---------: | ---------- | --: | -: | --------------: | -------------------: |
| minimax-m2 230B.A10B Q6_K | 180.94 GiB | 228.69 B | ROCm,RPC | 99 | 1 | 0 | pp512 | 162.36 ± 1.16 | | minimax-m2 230B.A10B Q6_K | 180.94 GiB | 228.69 B | ROCm,RPC | 99 | 1 | pp512 | 173.91 ± 0.29 |
| minimax-m2 230B.A10B Q6_K | 180.94 GiB | 228.69 B | ROCm,RPC | 99 | 1 | 0 | tg128 | 18.23 ± 0.08 | | minimax-m2 230B.A10B Q6_K | 180.94 GiB | 228.69 B | ROCm,RPC | 99 | 1 | tg128 | 19.05 ± 0.00 |
build: 2656c0d26 (7693) build: e0c93af2a (7938)
@@ -1,3 +1,8 @@
ggml_vulkan: Found 1 Vulkan devices: ggml_vulkan: Found 1 Vulkan devices:
ggml_vulkan: 0 = Radeon 8060S Graphics (AMD open-source driver) | uma: 1 | fp16: 1 | bf16: 0 | warp size: 64 | shared memory: 32768 | int dot: 1 | matrix cores: KHR_coopmat ggml_vulkan: 0 = Radeon 8060S Graphics (AMD open-source driver) | uma: 1 | fp16: 1 | bf16: 0 | warp size: 64 | shared memory: 32768 | int dot: 1 | matrix cores: KHR_coopmat
Failed to connect to 10.0.0.1:50052 Failed to connect to 192.168.100.2:50052
| model | size | params | backend | ngl | fa | test | t/s |
| ------------------------------ | ---------: | ---------: | ---------- | --: | -: | --------------: | -------------------: |
ggml_vulkan: Device memory allocation of size 990904320 failed.
ggml_vulkan: vk::Device::allocateMemory: ErrorOutOfHostMemory
main: error: failed to load model '/mnt/storage/MiniMax-M2-GGUF/UD-Q6_K_XL/MiniMax-M2-UD-Q6_K_XL-00001-of-00004.gguf'
@@ -1,19 +1,19 @@
ggml_vulkan: Found 1 Vulkan devices: ggml_vulkan: Found 1 Vulkan devices:
ggml_vulkan: 0 = Radeon 8060S Graphics (AMD open-source driver) | uma: 1 | fp16: 1 | bf16: 0 | warp size: 64 | shared memory: 32768 | int dot: 1 | matrix cores: KHR_coopmat ggml_vulkan: 0 = Radeon 8060S Graphics (AMD open-source driver) | uma: 1 | fp16: 1 | bf16: 0 | warp size: 64 | shared memory: 32768 | int dot: 1 | matrix cores: KHR_coopmat
| model | size | params | backend | ngl | fa | mmap | test | t/s | | model | size | params | backend | ngl | fa | test | t/s |
| ------------------------------ | ---------: | ---------: | ---------- | --: | -: | ---: | --------------: | -------------------: | | ------------------------------ | ---------: | ---------: | ---------- | --: | -: | --------------: | -------------------: |
/opt/llama.cpp/ggml/src/ggml-rpc/ggml-rpc.cpp:724: Remote RPC server crashed or returned malformed response /opt/llama.cpp/ggml/src/ggml-rpc/ggml-rpc.cpp:724: Remote RPC server crashed or returned malformed response
/lib64/libggml-base.so.0(+0x35a5) [0x7f7c046f25a5] /lib64/libggml-base.so.0(+0x35a5) [0x7f12de48e5a5]
/lib64/libggml-base.so.0(ggml_print_backtrace+0x1eb) [0x7f7c046f296b] /lib64/libggml-base.so.0(ggml_print_backtrace+0x1eb) [0x7f12de48e96b]
/lib64/libggml-base.so.0(ggml_abort+0x11f) [0x7f7c046f2aef] /lib64/libggml-base.so.0(ggml_abort+0x11f) [0x7f12de48eaef]
/lib64/libggml-rpc.so.0(+0x5b4a) [0x7f7c07d17b4a] /lib64/libggml-rpc.so.0(+0x5b4a) [0x7f12e1ae2b4a]
/lib64/libggml-base.so.0(+0x171b2) [0x7f7c047061b2] /lib64/libggml-base.so.0(+0x174f2) [0x7f12de4a24f2]
/lib64/libggml-base.so.0(+0x1749f) [0x7f7c0470649f] /lib64/libggml-base.so.0(+0x177df) [0x7f12de4a27df]
/lib64/libggml-base.so.0(ggml_backend_alloc_ctx_tensors_from_buft+0x19) [0x7f7c04707509] /lib64/libggml-base.so.0(ggml_backend_alloc_ctx_tensors_from_buft+0x19) [0x7f12de4a3849]
/lib64/libllama.so.0(_ZN11llama_model12load_tensorsER18llama_model_loader+0x3c61) [0x7f7c07f673c1] /lib64/libllama.so.0(_ZN11llama_model12load_tensorsER18llama_model_loader+0x3c41) [0x7f12e1d4ebe1]
/lib64/libllama.so.0(+0x25568) [0x7f7c07ebd568] /lib64/libllama.so.0(+0x279e8) [0x7f12e1c989e8]
/lib64/libllama.so.0(llama_model_load_from_file+0xac) [0x7f7c07ebe3cc] /lib64/libllama.so.0(llama_model_load_from_file+0xac) [0x7f12e1c9984c]
/usr/sbin/llama-bench() [0x4077b5] /usr/sbin/llama-bench() [0x407fbd]
/lib64/libc.so.6(+0x35b5) [0x7f7c040885b5] /lib64/libc.so.6(+0x35b5) [0x7f12dd8d65b5]
/lib64/libc.so.6(__libc_start_main+0x88) [0x7f7c04088668] /lib64/libc.so.6(__libc_start_main+0x88) [0x7f12dd8d6668]
/usr/sbin/llama-bench() [0x409cf5] /usr/sbin/llama-bench() [0x40a7b5]
@@ -1,10 +1,6 @@
ggml_vulkan: Found 1 Vulkan devices: ggml_vulkan: Found 1 Vulkan devices:
ggml_vulkan: 0 = Radeon 8060S Graphics (RADV GFX1151) (radv) | uma: 1 | fp16: 1 | bf16: 0 | warp size: 64 | shared memory: 65536 | int dot: 1 | matrix cores: KHR_coopmat ggml_vulkan: 0 = Radeon 8060S Graphics (RADV GFX1151) (radv) | uma: 1 | fp16: 1 | bf16: 0 | warp size: 64 | shared memory: 65536 | int dot: 1 | matrix cores: KHR_coopmat
Failed to connect to 10.0.0.1:50052 Failed to connect to 192.168.100.2:50052
radv/amdgpu: Failed to allocate a buffer:
radv/amdgpu: size : 990904320 bytes
radv/amdgpu: alignment : 262144 bytes
radv/amdgpu: domains : 4
radv/amdgpu: Failed to allocate a buffer: radv/amdgpu: Failed to allocate a buffer:
radv/amdgpu: size : 990904320 bytes radv/amdgpu: size : 990904320 bytes
radv/amdgpu: alignment : 262144 bytes radv/amdgpu: alignment : 262144 bytes
@@ -1,19 +1,19 @@
ggml_vulkan: Found 1 Vulkan devices: ggml_vulkan: Found 1 Vulkan devices:
ggml_vulkan: 0 = Radeon 8060S Graphics (RADV GFX1151) (radv) | uma: 1 | fp16: 1 | bf16: 0 | warp size: 64 | shared memory: 65536 | int dot: 1 | matrix cores: KHR_coopmat ggml_vulkan: 0 = Radeon 8060S Graphics (RADV GFX1151) (radv) | uma: 1 | fp16: 1 | bf16: 0 | warp size: 64 | shared memory: 65536 | int dot: 1 | matrix cores: KHR_coopmat
| model | size | params | backend | ngl | fa | mmap | test | t/s | | model | size | params | backend | ngl | fa | test | t/s |
| ------------------------------ | ---------: | ---------: | ---------- | --: | -: | ---: | --------------: | -------------------: | | ------------------------------ | ---------: | ---------: | ---------- | --: | -: | --------------: | -------------------: |
/opt/llama.cpp/ggml/src/ggml-rpc/ggml-rpc.cpp:724: Remote RPC server crashed or returned malformed response /opt/llama.cpp/ggml/src/ggml-rpc/ggml-rpc.cpp:724: Remote RPC server crashed or returned malformed response
/lib64/libggml-base.so.0(+0x35a5) [0x7fe6965fe5a5] /lib64/libggml-base.so.0(+0x35a5) [0x7f62e4d465a5]
/lib64/libggml-base.so.0(ggml_print_backtrace+0x1eb) [0x7fe6965fe96b] /lib64/libggml-base.so.0(ggml_print_backtrace+0x1eb) [0x7f62e4d4696b]
/lib64/libggml-base.so.0(ggml_abort+0x11f) [0x7fe6965feaef] /lib64/libggml-base.so.0(ggml_abort+0x11f) [0x7f62e4d46aef]
/lib64/libggml-rpc.so.0(+0x5b4a) [0x7fe699c23b4a] /lib64/libggml-rpc.so.0(+0x5b4a) [0x7f62e839ab4a]
/lib64/libggml-base.so.0(+0x171b2) [0x7fe6966121b2] /lib64/libggml-base.so.0(+0x174f2) [0x7f62e4d5a4f2]
/lib64/libggml-base.so.0(+0x1749f) [0x7fe69661249f] /lib64/libggml-base.so.0(+0x177df) [0x7f62e4d5a7df]
/lib64/libggml-base.so.0(ggml_backend_alloc_ctx_tensors_from_buft+0x19) [0x7fe696613509] /lib64/libggml-base.so.0(ggml_backend_alloc_ctx_tensors_from_buft+0x19) [0x7f62e4d5b849]
/lib64/libllama.so.0(_ZN11llama_model12load_tensorsER18llama_model_loader+0x3c61) [0x7fe699e733c1] /lib64/libllama.so.0(_ZN11llama_model12load_tensorsER18llama_model_loader+0x3c41) [0x7f62e8631be1]
/lib64/libllama.so.0(+0x25568) [0x7fe699dc9568] /lib64/libllama.so.0(+0x279e8) [0x7f62e857b9e8]
/lib64/libllama.so.0(llama_model_load_from_file+0xac) [0x7fe699dca3cc] /lib64/libllama.so.0(llama_model_load_from_file+0xac) [0x7f62e857c84c]
/usr/sbin/llama-bench() [0x4077b5] /usr/sbin/llama-bench() [0x407fbd]
/lib64/libc.so.6(+0x35b5) [0x7fe695f945b5] /lib64/libc.so.6(+0x35b5) [0x7f62e418e5b5]
/lib64/libc.so.6(__libc_start_main+0x88) [0x7fe695f94668] /lib64/libc.so.6(__libc_start_main+0x88) [0x7f62e418e668]
/usr/sbin/llama-bench() [0x409cf5] /usr/sbin/llama-bench() [0x40a7b5]

Some files were not shown because too many files have changed in this diff