Why does vLLM return nothing but a wall of exclamation marks (!!!!), no matter what I ask?
#18 opened 3 days ago by chaochaoli

Please release an AWQ version
#17 opened about 1 month ago by classdemo

A quick test using M1 Max (64G) and Word
#16 opened about 1 month ago by gptlocalhost

Awesome model! Can we get a version with a larger context window?
#15 opened about 1 month ago by seall0

It supports the Serbo-Croatian language very well!
#13 opened about 1 month ago by JLouisBiz

GPTQ or AWQ Quants
#12 opened about 1 month ago by guialfaro

Great job, thanks for this model.
#11 opened about 1 month ago by Dampfinchen

Recommended sampling parameters?
#10 opened about 1 month ago by AaronFeng753

Can we have some more popular benchmarks?
#8 opened about 2 months ago by rombodawg

The model is the best for coding.
#7 opened about 2 months ago by AekDevDev

Running on a single GPU fails with an insufficient-VRAM error, and running on multiple GPUs in one machine produces many other errors. My vLLM version is 0.8.4.
#6 opened about 2 months ago by hanson888

BitsAndBytes quantization inference error
#5 opened about 2 months ago by chengfy

Bug when using function calling with vllm==0.8.4
#4 opened about 2 months ago by waple

SimpleQA Scores Are WAY off
#3 opened about 2 months ago by phil111

Need an fp8 version for inference
#2 opened about 2 months ago by iwaitu

RuntimeError: CUDA error: device-side assert triggered
#1 opened about 2 months ago by DsnTgr