FlexRAG Retriever
This is a FlexRetriever created with the FlexRAG library (version 0.3.0).
Retriever Attributes
The enwiki_2020_atlas retriever is a FlexRetriever that provides access to the English Wikipedia corpus from December 2020. It is designed for information retrieval tasks: users submit a query and receive the most relevant documents from the corpus.
The corpus of this retriever was created by the Atlas project and the index was built using the FlexRAG library.
Corpus Attribute | Value |
---|---|
Language | English |
Domain | Wikipedia |
Saved Fields | title, section, text |
Size | 33.1M (29.4M text, 3.8M infobox) |
Dump Date | Dec 2020 |
Provider | Atlas |
License | CC-BY-SA 3.0 |
Index Attribute | Value |
---|---|
Index Name | bm25 |
Index Type | Sparse |
Index Method | Lucene |
Indexed Fields | title, section, text (concat) |
Preprocessing | LengthFilter(min_char=10, max_char=4096) |
Provider | FlexRAG |
License | CC-BY-SA 3.0 |
Index Attribute | Value |
---|---|
Index Name | contriever |
Index Type | Dense |
Index Method | IVFPQ |
Indexed Fields | title, section, text (concat) |
Query Encoder | facebook/contriever-msmarco |
Passage Encoder | facebook/contriever-msmarco |
Preprocessing | LengthFilter(min_char=10, max_char=4096) |
Provider | FlexRAG |
License | CC-BY-SA 3.0 |
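To make the "Indexed Fields" and "Preprocessing" rows in the index tables concrete, here is a minimal sketch of what concatenating the fields and applying a character-length filter could look like. It is not FlexRAG's implementation: the `Passage` class and the `concat_fields` and `length_filter` helpers are hypothetical. The space separator, the inclusive bounds, and applying the filter to the concatenated string rather than to the `text` field alone are all assumptions.

```python
from dataclasses import dataclass


@dataclass
class Passage:
    """Hypothetical record mirroring the saved fields: title, section, text."""
    title: str
    section: str
    text: str


def concat_fields(p: Passage, sep: str = " ") -> str:
    """Join the non-empty indexed fields into one string (separator is an assumption)."""
    return sep.join(part for part in (p.title, p.section, p.text) if part)


def length_filter(passages, min_char=10, max_char=4096):
    """Keep passages whose concatenated length lies in [min_char, max_char] (inclusive, assumed)."""
    return [p for p in passages if min_char <= len(concat_fields(p)) <= max_char]


if __name__ == "__main__":
    passages = [
        Passage("A", "", "b"),                                    # too short
        Passage("Batman", "Origin", "Bruce Wayne is a fictional character."),
        Passage("X", "Y", "z" * 5000),                            # too long
    ]
    for p in length_filter(passages):
        print(concat_fields(p))
```

Under these assumptions only the middle passage survives, since very short stubs and overly long passages are dropped before indexing.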
Usage
Installation
You can install the FlexRAG library with pip:
pip install flexrag faiss-cpu
Loading the FlexRAG retriever
You can use this retriever for information retrieval tasks. Here is an example:
from flexrag.retriever import LocalRetriever
# Load the retriever from the HuggingFace Hub
retriever = LocalRetriever.load_from_hub("FlexRAG/enwiki_2020_atlas")
# You can retrieve relevant documents now
results = retriever.search("Who is Bruce Wayne?")
Running the RAG demo with the retriever
You can run the GUI application of the RAG assistant with this retriever. Here is an example:
python -m flexrag.entrypoints.run_interactive \
assistant_type=modular \
modular_config.used_fields=[title,text] \
modular_config.retriever_type="FlexRAG/enwiki_2020_atlas" \
modular_config.response_type=original \
modular_config.generator_type=openai \
modular_config.openai_config.model_name='gpt-4o-mini' \
modular_config.openai_config.api_key=$OPENAI_KEY \
modular_config.do_sample=False
License
Because the corpus is distributed under the CC-BY-SA 3.0 license, the retriever is released under the same license.
FlexRAG Related Links:
- Documentation
- GitHub Repository