FlexRAG Retriever

This is a FlexRetriever created with the FlexRAG library (version 0.3.0).

Retriever Attributes

The enwiki_2019_atlas retriever is a FlexRetriever over the December 2019 English Wikipedia dump. It is intended for information retrieval: given a query, it returns the most relevant passages. The corpus was prepared by the Atlas project; the indexes (one sparse BM25 index and one dense Contriever index) were built with the FlexRAG library.

| Corpus Attribute | Value |
| --- | --- |
| Language | English |
| Domain | Wikipedia |
| Saved Fields | title, section, text |
| Size | 32.1M (28.4M text, 3.7M infobox) |
| Dump Date | Dec 2019 |
| Provider | Atlas |
| License | CC-BY-SA 3.0 |

| Index Attribute | Value |
| --- | --- |
| Index Name | bm25 |
| Index Type | Sparse |
| Index Method | Lucene |
| Indexed Fields | title, section, text (concat) |
| Preprocessing | LengthFilter(min_char=10, max_char=4096) |
| Provider | FlexRAG |
| License | CC-BY-SA 3.0 |

| Index Attribute | Value |
| --- | --- |
| Index Name | contriever |
| Index Type | Dense |
| Index Method | IVFPQ |
| Indexed Fields | title, section, text (concat) |
| Query Encoder | facebook/contriever-msmarco |
| Passage Encoder | facebook/contriever-msmarco |
| Preprocessing | LengthFilter(min_char=10, max_char=4096) |
| Provider | FlexRAG |
| License | CC-BY-SA 3.0 |
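
The attribute listings above say that both indexes are built over the concatenation of title, section, and text, after a LengthFilter(min_char=10, max_char=4096) step. The following is a minimal, library-free sketch of what such preprocessing could look like. It is not FlexRAG's implementation: the function names `concat_fields` and `length_filter`, the single-space separator, the inclusive bounds, and the choice to measure the concatenated string (rather than the text field alone) are all assumptions made for illustration.

```python
from typing import Dict, Iterable, Iterator, Sequence

# Bounds taken from the Preprocessing row: LengthFilter(min_char=10, max_char=4096).
MIN_CHAR = 10
MAX_CHAR = 4096


def concat_fields(
    doc: Dict[str, str],
    fields: Sequence[str] = ("title", "section", "text"),
) -> str:
    """Join the indexed fields into one string, skipping empty fields.

    The single-space separator is an assumption, not FlexRAG's documented format.
    """
    return " ".join(doc[f] for f in fields if doc.get(f))


def length_filter(
    docs: Iterable[Dict[str, str]],
    min_char: int = MIN_CHAR,
    max_char: int = MAX_CHAR,
) -> Iterator[Dict[str, str]]:
    """Yield only documents whose concatenated length lies in [min_char, max_char].

    Inclusive bounds and measuring the concatenated string are assumptions.
    """
    for doc in docs:
        if min_char <= len(concat_fields(doc)) <= max_char:
            yield doc


# Hypothetical sample passages, for illustration only.
sample_docs = [
    {"title": "A", "section": "", "text": "hi"},  # too short
    {"title": "Batman", "section": "Intro",
     "text": "Bruce Wayne is a fictional character."},  # kept
    {"title": "X", "section": "Y", "text": "z" * 5000},  # too long
]
kept = list(length_filter(sample_docs))
```

With the sample passages above, only the "Batman" entry falls inside the length window, so very short stubs and very long passages would never reach either index.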

Usage

Installation

You can install the FlexRAG library with pip:

pip install flexrag faiss-cpu

Loading the FlexRAG retriever

You can use this retriever for information retrieval tasks. Here is an example:

from flexrag.retriever import LocalRetriever


# Load the retriever from the HuggingFace Hub
retriever = LocalRetriever.load_from_hub("FlexRAG/enwiki_2019_atlas")


# You can retrieve relevant documents now
results = retriever.search("Who is Bruce Wayne?")

Running the RAG demo with the retriever

You can run the GUI application of the RAG assistant with this retriever. Here is an example:

python -m flexrag.entrypoints.run_interactive \
    assistant_type=modular \
    'modular_config.used_fields=[title,text]' \
    modular_config.retriever_type="FlexRAG/enwiki_2019_atlas" \
    modular_config.response_type=original \
    modular_config.generator_type=openai \
    modular_config.openai_config.model_name='gpt-4o-mini' \
    modular_config.openai_config.api_key=$OPENAI_KEY \
    modular_config.do_sample=False

License

Because the corpus is distributed under the CC-BY-SA 3.0 license, the retriever is released under the same license.
