requests
beautifulsoup4
datasets
pandas
numpy
python-dotenv
gradio
bm25s[full]
lxml
PyMuPDF