langchain
openai
streamlit
pinecone-client
chromadb
unstructured
pdf2image
pytesseract
tiktoken
pymupdf
tabulate
sentence-transformers
altair<5