Qwen / GGUF

littlebird13 committed (verified) · Commit 8aa0010 · 1 Parent(s): 23920d3

Update README.md

Files changed (1): README.md (+7 −1)
README.md CHANGED

@@ -50,6 +50,8 @@ For more details, including benchmark evaluation, hardware requirements, and inf
 
 ## Usage
 
+📌 **Tip**: We recommend that developers customize the `instruct` according to their specific scenarios, tasks, and languages. Our tests have shown that in most retrieval scenarios, not using an `instruct` on the query side leads to a drop in retrieval performance of approximately 1% to 5%.
+
 ### llama.cpp
 Check out our [llama.cpp documentation](https://qwen.readthedocs.io/en/latest/run_locally/llama.cpp.html) for more usage guides.
 
@@ -59,7 +61,7 @@ In the following demonstration, we assume that you are running commands under th
 You can run Qwen3 Embedding with one command:
 
 ```shell
-./build/bin/llama-embedding -m model.gguf -p "<your context here>" --pooling last --verbose-prompt --embd-normalize -1
+./build/bin/llama-embedding -m model.gguf -p "<your context here><|endoftext|>" --pooling last --verbose-prompt --embd-normalize 2
 ```
 
 Or launch a server:
@@ -67,6 +69,10 @@ Or launch a server:
 ./build/bin/llama-server -m model.gguf --embedding --pooling last -ub 8192 --verbose-prompt
 ```
 
+📌 **Tip**: Qwen3 Embedding models expect the final input token to be `<|endoftext|>`, so you need to manually append this token to the end of your input context. In addition, when running `llama-server`, you need to normalize the output embeddings yourself, as `llama-server` currently does not support the `--embd-normalize` option.
+
 ## Evaluation
 
 ### MTEB (Multilingual)
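The first tip in the updated README recommends a task-specific instruction on the query side. A minimal sketch of wrapping a query with an instruction, assuming the `Instruct:`/`Query:` template used in Qwen3 Embedding's published usage examples (the exact task wording here is illustrative, not from this commit):

```python
def format_query(task: str, query: str) -> str:
    # Template assumed from Qwen3 Embedding usage examples: only queries get
    # an instruction; documents are embedded as plain text without one.
    return f"Instruct: {task}\nQuery: {query}"


# Illustrative retrieval task description:
task = "Given a web search query, retrieve relevant passages that answer the query"
print(format_query(task, "What is the capital of China?"))
```

The resulting string is what you would pass to `llama-embedding` via `-p` (with `<|endoftext|>` appended, per the second tip).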
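The second tip asks clients to append `<|endoftext|>` themselves and to normalize embeddings returned by `llama-server`, since the server lacks the `--embd-normalize` option that `llama-embedding` has. A minimal client-side sketch of both steps (the vector values are illustrative, not real model output):

```python
import math


def build_prompt(text: str) -> str:
    """Append the expected final token for last-token pooling."""
    return text + "<|endoftext|>"


def l2_normalize(vec: list[float]) -> list[float]:
    """Client-side equivalent of `--embd-normalize 2` (L2 normalization)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0.0 else vec


# Illustrative values only:
print(build_prompt("<your context here>"))
print(l2_normalize([3.0, 4.0]))  # [0.6, 0.8] — unit length
```

After normalization, cosine similarity between two embeddings reduces to a plain dot product, which is how the retrieval scores are typically computed.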