olmOCR • Collection • olmOCR is a document recognition pipeline for efficiently converting documents into plain text. • olmocr.allenai.org • 3 items • Updated 17 days ago • 114
Qwen2.5-VL • Collection • Vision-language model series based on Qwen2.5 • 11 items • Updated Apr 28 • 484
PhotoMaker: Customizing Realistic Human Photos via Stacked ID Embedding • Paper • 2312.04461 • Published Dec 7, 2023 • 62
Document Processing • Collection • Any model or dataset dealing with documentary-type objects (layout detection, VQA, OCR, etc.) • 9 items • Updated Nov 14, 2024 • 3
DataGemma Release • Collection • A series of pioneering open models that help ground LLMs in real-world data through Data Commons. • 2 items • Updated 7 days ago • 86
Evaluation Datasets • Collection • Collection of Romanian datasets used for evaluation • 8 items • Updated Oct 11, 2024 • 1
SFT Datasets • Collection • Collection of Romanian datasets used for supervised finetuning • 11 items • Updated Apr 22 • 1
MultiLegalPile Models • Collection • A 689GB Multilingual Legal Corpus • 33 items • Updated Oct 23, 2023 • 1