More stable startup command, not easy oom.

#31
by Piekey - opened

python -m vllm.entrypoints.openai.api_server --host 0.0.0.0 --port 12345 --max-model-len 24000 --trust-remote-code --tensor-parallel-size 8 --gpu-memory-utilization 0.85 --max-num-seqs 8 --dtype float16 --served-model-name deepseek-reasoner --model cognitivecomputations/DeepSeek-R1-AWQ

Sign up or log in to comment