From 4ec72fa8f49ae040c65cfaf2bb0c83f801bd93ee Mon Sep 17 00:00:00 2001
From: "S. Neuhaus"
Date: Thu, 30 Oct 2025 18:11:28 +0100
Subject: [PATCH] Fix command syntax for llama-cli usage

---
 README.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/README.md b/README.md
index 97769b1..b5ae284 100644
--- a/README.md
+++ b/README.md
@@ -145,7 +145,7 @@ Once inside, the following commands show how to run local LLMs:
 * `llama-cli --list-devices`
 *Lists available GPU devices for Llama.cpp.*

-* `llama-cli --no-mmap -ngl 999 -fa -m `
+* `llama-cli --no-mmap -ngl 999 -fa 1 -m `
 *Runs inference on the specified model, with all layers on GPU and flash attention enabled (replace \*\* with your model path).*

 ## 2.3 Downloading GGUF Models from HuggingFace