Commit 5c3a31c (verified) by tsungyi · 1 Parent(s): 5cace25

Update README.md

Files changed (1): README.md +2 -0
@@ -174,6 +174,8 @@ We release text annotations for all embodied reasoning datasets and videos for R
  ## Inference:
  **Test Hardware:** H100, A100, GB200 <br>
+ > [!NOTE]
+ > We suggest using `fps=4` for the input video and `max_tokens=4096` to avoid truncated response.
  ```python
  from transformers import AutoProcessor
  from vllm import LLM, SamplingParams