Fully open Whisper-style speech foundation models developed by CMU WAVLab: https://www.wavlab.org/activities/2024/owsm/
Yifan Peng
pyf98
AI & ML interests
Multimodal LLMs, Speech-to-Speech, Speech Recognition
Recent Activity
liked a dataset 7 days ago: nvidia/Llama-Nemotron-Post-Training-Dataset
new activity 17 days ago: nvidia/Nemotron-H-8B-Reasoning-128K: Errors in HybridMambaAttentionDynamicCache
upvoted an article about 2 months ago: Gotchas in Tokenizer Behavior Every Developer Should Know