datasets
transformers
streamlit
pymupdf
torch