Why does vLLM return nothing but a wall of exclamation marks (!!!!), no matter what I ask?
#18 opened 3 days ago by chaochaoli

Please release an AWQ version
#17 opened about 1 month ago by classdemo

A quick test using M1 Max (64G) and Word
#16 opened about 1 month ago by gptlocalhost

Awesome model! Can we get a version with a larger context window?
#15 opened about 1 month ago by seall0

It supports the Serbo-Croatian language very well!
#13 opened about 1 month ago by JLouisBiz

GPTQ or AWQ Quants
#12 opened about 1 month ago by guialfaro

Great job, thanks for this model.
#11 opened about 1 month ago by Dampfinchen

Recommended sampling parameters?
#10 opened about 1 month ago by AaronFeng753

Can we have some more popular benchmarks?
#8 opened about 2 months ago by rombodawg

The model is the best for coding.
#7 opened about 2 months ago by AekDevDev

Running on a single GPU fails with an insufficient-VRAM error, and running on multiple GPUs in one machine produces many other errors. My vLLM version is 0.8.4.
#6 opened about 2 months ago by hanson888

BitsAndBytes quantization inference error
#5 opened about 2 months ago by chengfy

Bug when using function calling with vllm==0.8.4
#4 opened about 2 months ago by waple

SimpleQA Scores Are WAY off
#3 opened about 2 months ago by phil111

Need an fp8 version for inference
#2 opened about 2 months ago by iwaitu

RuntimeError: CUDA error: device-side assert triggered
#1 opened about 2 months ago by DsnTgr