nikhilkomakula committed
Commit 0a6f6d8 · 1 Parent(s): d11d425
.dockerignore ADDED
@@ -0,0 +1,2 @@
+ # python virtual environment
+ .lroc-venv
.gitignore ADDED
@@ -0,0 +1,95 @@
+ # Byte-compiled / optimized / DLL files
+ __pycache__/
+ *.py[cod]
+
+ # C extensions
+ *.so
+
+ # Distribution / packaging
+ .Python
+ env/
+ build/
+ develop-eggs/
+ dist/
+ downloads/
+ eggs/
+ .eggs/
+ lib/
+ lib64/
+ parts/
+ sdist/
+ var/
+ *.egg-info/
+ .installed.cfg
+ *.egg
+
+ # PyInstaller
+ # Usually these files are written by a python script from a template
+ # before PyInstaller builds the exe, so as to inject date/other infos into it.
+ *.manifest
+ *.spec
+
+ # Installer logs
+ pip-log.txt
+ pip-delete-this-directory.txt
+
+ # Unit test / coverage reports
+ htmlcov/
+ .tox/
+ .coverage
+ .coverage.*
+ .cache
+ nosetests.xml
+ coverage.xml
+ *.cover
+
+ # Translations
+ *.mo
+ *.pot
+
+ # Django stuff:
+ *.log
+
+ # Sphinx documentation
+ docs/_build/
+
+ # PyBuilder
+ target/
+
+ # DotEnv configuration
+ .env
+
+ # Database
+ *.db
+ *.rdb
+
+ # Pycharm
+ .idea
+
+ # VS Code
+ .vscode/
+
+ # Spyder
+ .spyproject/
+
+ # Jupyter NB Checkpoints
+ .ipynb_checkpoints/
+
+ # exclude data from source control by default
+ # /data/
+
+ # Mac OS-specific storage files
+ .DS_Store
+
+ # vim
+ *.swp
+ *.swo
+
+ # Mypy cache
+ .mypy_cache/
+
+ # Python virtual environment
+ .lroc-venv/
+
+ # references
+ /references/
Dockerfile ADDED
@@ -0,0 +1,13 @@
+ FROM python:3.11.5-slim
+
+ WORKDIR /app
+
+ COPY app.py requirements.txt /app/
+ COPY src /app/src
+ COPY indexes /app/indexes
+
+ RUN pip install --no-cache-dir --upgrade pip && \
+     pip install --no-cache-dir -r requirements.txt
+
+ # Use ENTRYPOINT to specify the command to run when the container starts
+ ENTRYPOINT ["python", "app.py"]
README.md CHANGED
@@ -1,2 +1,122 @@
- # llm-rag-op-chatbot
- OpenPages Chatbot to answer general questions about OpenPages features, solutions and triggers.
+ # OpenPages IntelliBot
+
+ Welcome to OpenPages IntelliBot, your intelligent and efficient chatbot powered by the state-of-the-art Retrieval-Augmented Generation (RAG) technique and a Large Language Model (LLM).
+
+ ## What is OpenPages IntelliBot?
+
+ OpenPages IntelliBot leverages cutting-edge AI technologies to provide you with instant and accurate responses about OpenPages, its features, the solutions / modules it offers, and its trigger framework. By combining the power of RAG and the Zephyr LLM, OpenPages IntelliBot ensures that you receive contextually relevant information.
+
+ ## How Does RAG Work?
+
+ ![RAG Diagram](images/RAG_workflow.png)
+
+ [Image Credit](https://huggingface.co/learn/cookbook/en/rag_evaluation)
+
+ #### Step 1: Data Collection
+
+ Gather all the data that is needed for your application. In the case of OpenPages IntelliBot, this includes the administrator's guide, the solutions or modules offerings, the user's guide, and the trigger developer guide.
+
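In this commit, Step 1 is implemented by `src/data/load_dataset.py`, which walks a data directory and loads every PDF with LangChain's `PyMuPDFLoader`. A condensed sketch of the idea (the `data/` path is an illustrative placeholder):

```python
# Load every PDF under a data directory into LangChain Document objects (one per page).
import os
from langchain_community.document_loaders import PyMuPDFLoader

documents = []
for root, _, files in os.walk("data/"):  # illustrative directory
    for file_name in files:
        if file_name.endswith(".pdf"):
            loader = PyMuPDFLoader(os.path.join(root, file_name))
            documents += loader.load()  # each Document carries page text plus metadata
```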
+ #### Step 2: Data Chunking
+
+ Data chunking is the process of breaking your data down into smaller, more manageable pieces. For instance, if you have a lengthy 100-page user manual, you might break it down into different sections, each potentially answering different customer questions.
+
+ This way, each chunk of data is focused on a specific topic. When a piece of information is retrieved from the source dataset, it is more likely to be directly applicable to the user’s query, since we avoid including irrelevant information from entire documents.
+
+ This also improves efficiency, since the system can quickly obtain the most relevant pieces of information instead of processing entire documents.
+
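Chunking in this commit is done by `chunk_documents` in `src/indexing/build_indexes.py`: the splitter measures length in tokens of the embedding model's tokenizer and keeps roughly 10% overlap between neighbouring chunks. A condensed sketch (the fixed `chunk_size` is illustrative; the commit derives it from the model's maximum sequence length):

```python
# Token-aware chunking so each chunk fits within the embedding model's input window.
from transformers import AutoTokenizer
from langchain.text_splitter import RecursiveCharacterTextSplitter

chunk_size = 512  # illustrative; bge-large-en-v1.5 accepts up to 512 tokens
text_splitter = RecursiveCharacterTextSplitter.from_huggingface_tokenizer(
    AutoTokenizer.from_pretrained("BAAI/bge-large-en-v1.5"),
    chunk_size=chunk_size,
    chunk_overlap=chunk_size // 10,
    separators=["\n\n", "\n", ".", ""],
)
chunks = text_splitter.split_documents(documents)  # `documents` from the Step 1 sketch
```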
+ #### Step 3: Document Embeddings
+
+ Now that the source data has been broken down into smaller parts, it needs to be converted into a vector representation. This involves transforming text data into embeddings, which are numeric representations that capture the semantic meaning behind text.
+
+ In simple words, document embeddings allow the system to understand user queries and match them with relevant information in the source dataset based on the meaning of the text, instead of a simple word-to-word comparison. This method ensures that the responses are relevant and aligned with the user’s query.
+
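In the commit, `load_embedding_model()` in `src/indexing/build_indexes.py` wraps `BAAI/bge-large-en-v1.5` behind LangChain's `HuggingFaceBgeEmbeddings` with normalized outputs. The same idea, sketched directly with `sentence-transformers` (the two example sentences are illustrative):

```python
# Encode text chunks into normalized vectors; similar meanings map to nearby vectors.
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("BAAI/bge-large-en-v1.5")
chunk_vectors = model.encode(
    ["FastMap imports data into OpenPages.", "Triggers run custom logic on events."],
    normalize_embeddings=True,  # unit-length vectors, so a dot product equals cosine similarity
)
print(chunk_vectors.shape)  # (2, 1024): one 1024-dimensional embedding per chunk
```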
+ #### Step 4: Data Retrieval
+
+ When a user query enters the system, it must also be converted into an embedding or vector representation. The same model must be used for both the document and query embeddings to ensure uniformity between the two.
+
+ Once the query is converted into an embedding, the system compares the query embedding with the document embeddings. It identifies and retrieves chunks whose embeddings are most similar to the query embedding, using measures such as cosine similarity.
+
+ These chunks are considered to be the most relevant to the user’s query.
+
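In this commit, retrieval is delegated to the Chroma vector store (`get_base_retriever` in `src/retrieval/retriever_chain.py` calls `as_retriever(search_type="mmr", search_kwargs={"k": 4})`). Underneath, the comparison boils down to cosine similarity between the query vector and each chunk vector; continuing the sketch from Step 3:

```python
# Rank chunks by cosine similarity to the query; with normalized vectors this is a dot product.
import numpy as np

query_vector = model.encode("What is FastMap?", normalize_embeddings=True)
scores = chunk_vectors @ query_vector    # one similarity score per chunk
top_k = np.argsort(scores)[::-1][:4]     # indices of the 4 most similar chunks
```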
+ #### Step 5: Response Generation
+
+ The retrieved text chunks, along with the initial user query, are fed into a language model, which uses this information to generate a coherent response to the user’s question through a chat interface.
+
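This is what `create_qa_chain` in `src/retrieval/retriever_chain.py` wires together with the LangChain expression language: the retriever fills `{context}`, the raw question passes through as `{query}`, and the formatted Zephyr-style prompt is sent to the Hugging Face endpoint. A condensed sketch, assuming `retriever` and `llm` were built with the commit's `get_base_retriever()` and `load_hf_llm()` helpers:

```python
# Compose retriever + prompt + LLM + output parser into one runnable QA chain.
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.runnables import RunnablePassthrough
from langchain_core.output_parsers import StrOutputParser

prompt = ChatPromptTemplate.from_template(
    "<|system|>\nAnswer using only the provided context.</s>\n"
    "{context}</s>\n<|user|>\n{query}</s>\n<|assistant|>\n"
)
qa_chain = (
    {"context": retriever, "query": RunnablePassthrough()}
    | prompt
    | llm
    | StrOutputParser()
)
print(qa_chain.invoke("What is FastMap?"))
```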
+ ## Sources Used for OpenPages IntelliBot:
+
+ OpenPages IntelliBot can answer questions related to:
+
+ - OpenPages Administration
+ - OpenPages Solutions or Modules
+ - OpenPages Trigger Development
+
+ ## How to Use OpenPages IntelliBot?
+
+ 1. Simply type your query or question into the chat interface.
+ 2. OpenPages IntelliBot will process your query using the RAG pipeline and provide you with a contextually relevant response.
+
+ ## Get Started to Run Locally:
+
+ **Step 1:** Download the Git repository
+
+ **Step 2:** Install dependencies
+
+ ```bash
+ pip install -r requirements.txt
+ ```
+
+ **Step 3:** Rename the `dotenv` file to `.env` and set `HUGGINGFACEHUB_API_TOKEN` to your API token.
+
+ **Step 4:** Run the application
+
+ ```bash
+ python app.py
+ ```
+
+ ## Build and Run Container Locally:
+
+ **Step 1:** Build the image (replace `<docker_id>` with your Docker ID)
+
+ ```bash
+ docker build --tag <docker_id>/llm-rag-op-chatbot .
+ ```
+
+ **Step 2:** Run the container (replace `<docker_id>` with your Docker ID and `<api_token>` with your Hugging Face API token)
+
+ ```bash
+ docker run -it -d --name llm-rag-op-chatbot -p 5555:5555 -e HUGGINGFACEHUB_API_TOKEN=<api_token> <docker_id>/llm-rag-op-chatbot:latest
+ ```
+
+ **Note 1:** List all containers
+
+ ```bash
+ docker ps -a
+ ```
+
+ **Note 2:** Review the logs
+
+ ```bash
+ docker logs -f llm-rag-op-chatbot
+ ```
+
+ ## Technologies Used:
+
+ * **PDF Parser:** PyMuPDFLoader
+ * **Vector Database:** ChromaDB
+ * **Orchestration Framework:** LangChain
+ * **Embedding Model:** BAAI/bge-large-en-v1.5
+ * **Large Language Model:** huggingfaceh4/zephyr-7b-beta
+
+ ## Contact Me:
+
+ For any inquiries or feedback, please contact me at [nikhil.komakula@outlook.com](mailto:nikhil.komakula@outlook.com).
+
+ ## License:
+
+ This project is licensed under the [MIT License](https://opensource.org/licenses/MIT) - see the [LICENSE](LICENSE) file for details.
+
+ ---
+
+ **Note:** OpenPages IntelliBot is for demonstration purposes only and may not provide accurate information in all scenarios. Always verify critical information from reliable sources.
app.py ADDED
@@ -0,0 +1,23 @@
+ # import libraries
+ import sys
+ from dotenv import find_dotenv, load_dotenv
+
+ # import functions
+ from src.test.eval_rag import evaluate_rag
+ from src.ui.chat_interface import create_chatinterface
+ from src.generation.generate_response import get_qa_chain, set_global_qa_chain, generate_response
+
+ # find .env automatically by walking up directories until it's found, then
+ # load up the .env entries as environment variables
+ load_dotenv(find_dotenv())
+
+ if __name__ == "__main__":
+
+     # get the qa chain
+     qa_chain = get_qa_chain()
+
+     if len(sys.argv) > 1:
+         evaluate_rag("qa_chain", qa_chain)
+     else:
+         set_global_qa_chain(qa_chain)
+         create_chatinterface(generate_response).launch(server_name="0.0.0.0", server_port=5555)
dotenv ADDED
@@ -0,0 +1 @@
+ HUGGINGFACEHUB_API_TOKEN="YOUR API KEY GOES HERE"
images/RAG_workflow.png ADDED
notebooks/.gitkeep ADDED
File without changes
notebooks/hf_llm_rag_1.ipynb ADDED
The diff for this file is too large to render. See raw diff
 
requirements.txt ADDED
@@ -0,0 +1,10 @@
+ # external requirements
+ python-dotenv==1.0.1
+ chromadb==0.4.24
+ langchain==0.1.11
+ langchain-community==0.0.27
+ pymupdf==1.23.26
+ sentence-transformers==2.5.1
+ tensorflow==2.16.1
+ gradio==4.21.0
+ pandas==2.2.1
src/__init__.py ADDED
File without changes
src/data/.gitkeep ADDED
File without changes
src/data/__init__.py ADDED
File without changes
src/data/load_dataset.py ADDED
@@ -0,0 +1,37 @@
+ # import libraries
+ import os
+ from langchain_community.document_loaders import PyMuPDFLoader
+
+ # constants
+ DATA_DIR = "../../data/"
+
+
+ # load data
+ def load_documents():
+     """
+     Loads documents into memory.
+
+     Raises:
+         e: Any exception while loading the documents.
+
+     Returns:
+         list: An array of documents.
+     """
+
+     documents = []
+     try:
+         for root, _, files in os.walk(DATA_DIR):
+             for file in files:
+                 if file.endswith(".pdf"):
+                     print(f"Reading File: {file}")
+
+                     # read PDF
+                     loader = PyMuPDFLoader(os.path.join(root, file))
+                     document = loader.load()
+
+                     # append to docs
+                     documents += document
+     except Exception as e:
+         print("Error while loading the data!", e)
+         raise e
+     return documents
src/generation/__init__.py ADDED
File without changes
src/generation/generate_response.py ADDED
@@ -0,0 +1,51 @@
+ # import libraries
+ from src.retrieval.retriever_chain import get_base_retriever, load_hf_llm, create_qa_chain
+
+ # constants
+ HF_MODEL = "huggingfaceh4/zephyr-7b-beta"  # "mistralai/Mistral-7B-Instruct-v0.2" # "google/gemma-7b"
+
+
+ # get the qa chain
+ def get_qa_chain():
+     """
+     Instantiates QA Chain.
+
+     Returns:
+         Runnable: Returns an instance of QA Chain.
+     """
+
+     # get retriever
+     retriever = get_base_retriever(k=4, search_type="mmr")
+
+     # instantiate llm
+     llm = load_hf_llm(repo_id=HF_MODEL, max_new_tokens=512, temperature=0.4)
+
+     # instantiate qa chain
+     qa_chain = create_qa_chain(retriever, llm)
+
+     return qa_chain
+
+
+ def set_global_qa_chain(local_qa_chain):
+     global global_qa_chain
+     global_qa_chain = local_qa_chain
+
+
+ # function to generate response
+ def generate_response(message, history):
+     """
+     Generates response based on the question being asked.
+
+     Args:
+         message (str): Question asked by the user.
+         history (dict): Chat history. NOT USED FOR NOW.
+
+     Returns:
+         str: Returns the generated response.
+     """
+
+     # invoke chain
+     response = global_qa_chain.invoke(message)
+     print(response)
+
+     return response
src/indexing/.gitkeep ADDED
File without changes
src/indexing/__init__.py ADDED
File without changes
src/indexing/build_indexes.py ADDED
@@ -0,0 +1,153 @@
+ # import libraries
+ import os
+ from typing import List, Optional
+ from transformers import AutoTokenizer
+ from langchain_community.vectorstores import Chroma
+ from sentence_transformers import SentenceTransformer
+ from langchain.text_splitter import RecursiveCharacterTextSplitter
+ from langchain_community.embeddings import HuggingFaceBgeEmbeddings
+ from langchain.docstore.document import Document
+
+ # import functions
+ from ..data.load_dataset import load_documents
+
+ # constants
+ INDEX_DIR = "indexes/"
+ EMBEDDING_MODEL = "BAAI/bge-large-en-v1.5"
+
+
+ # instantiate embedding model
+ def load_embedding_model():
+     """
+     Load the embedding model.
+
+     Returns:
+         HuggingFaceBgeEmbeddings: Returns the embedding model.
+     """
+
+     # check if GPU is available
+     import tensorflow as tf
+
+     device = "cuda" if tf.test.gpu_device_name() else "cpu"
+     print("device:", device)
+
+     hf_bge_embeddings = HuggingFaceBgeEmbeddings(
+         model_name=EMBEDDING_MODEL,
+         model_kwargs={"device": device},
+         encode_kwargs={
+             "normalize_embeddings": True
+         },  # set True to compute cosine similarity
+     )
+
+     # To get the value of the max sequence_length, we will query the underlying `SentenceTransformer` object used in the RecursiveCharacterTextSplitter.
+     print(
+         f"Model's maximum sequence length: {SentenceTransformer(EMBEDDING_MODEL).max_seq_length}"
+     )
+
+     return hf_bge_embeddings
+
+
+ # split documents
+ def chunk_documents(
+     chunk_size: int,
+     knowledge_base: List[Document],
+     tokenizer_name: Optional[str] = EMBEDDING_MODEL,
+ ) -> List[Document]:
+     """
+     Split documents into chunks of maximum size `chunk_size` tokens and return a list of documents.
+
+     Args:
+         chunk_size (int): Chunk size.
+         knowledge_base (List[Document]): Loaded documents.
+         tokenizer_name (Optional[str], optional): Embedding Model name. Defaults to EMBEDDING_MODEL.
+
+     Returns:
+         List[Document]: Returns chunked documents.
+     """
+
+     text_splitter = RecursiveCharacterTextSplitter.from_huggingface_tokenizer(
+         AutoTokenizer.from_pretrained(tokenizer_name),
+         chunk_size=chunk_size,
+         chunk_overlap=int(chunk_size / 10),
+         add_start_index=True,
+         strip_whitespace=True,
+         separators=["\n\n", "\n", ".", ""],
+     )
+
+     docs_processed = []
+     for doc in knowledge_base:
+         docs_processed += text_splitter.split_documents([doc])
+
+     # Remove duplicates
+     unique_texts = {}
+     docs_processed_unique = []
+     for doc in docs_processed:
+         if doc.page_content not in unique_texts:
+             unique_texts[doc.page_content] = True
+             docs_processed_unique.append(doc)
+
+     return docs_processed_unique
+
+
+ # generate indexes
+ def generate_indexes():
+     """
+     Generates indexes.
+
+     Returns:
+         ChromaCollection: Returns vector store.
+     """
+
+     # load documents
+     documents = load_documents()
+
+     # chunk documents to honor the context length
+     chunked_documents = chunk_documents(
+         SentenceTransformer(
+             EMBEDDING_MODEL
+         ).max_seq_length,  # We choose a chunk size adapted to our model
+         documents,
+         tokenizer_name=EMBEDDING_MODEL,
+     )
+
+     # save indexes to disk
+     vector_store = Chroma.from_documents(
+         documents=chunked_documents,
+         embedding=load_embedding_model(),
+         collection_metadata={"hnsw:space": "cosine"},
+         persist_directory=INDEX_DIR,
+     )
+
+     return vector_store
+
+
+ # load indexes from disk
+ def load_indexes():
+     """
+     Loads indexes into memory.
+
+     Returns:
+         ChromaCollection: Returns vector store.
+     """
+
+     vector_store = Chroma(
+         persist_directory=INDEX_DIR, embedding_function=load_embedding_model()
+     )
+     return vector_store
+
+
+ # retrieve vector store
+ def retrieve_indexes():
+     """
+     Retrieves indexes.
+
+     Returns:
+         ChromaCollection: Returns vector store.
+     """
+
+     if [f for f in os.listdir(INDEX_DIR) if not f.startswith(".")] == []:
+         print("Generating indexes...")
+         return generate_indexes()
+     else:
+         print("Loading existing indexes!")
+         return load_indexes()
src/retrieval/.gitkeep ADDED
File without changes
src/retrieval/__init__.py ADDED
File without changes
src/retrieval/retriever_chain.py ADDED
@@ -0,0 +1,99 @@
+ # import libraries
+ import os
+ from langchain_core.prompts import ChatPromptTemplate
+ from langchain_community.llms import HuggingFaceEndpoint
+ from langchain_core.runnables import RunnablePassthrough
+ from langchain_core.output_parsers import StrOutputParser
+
+ # import functions
+ from ..indexing.build_indexes import retrieve_indexes
+
+
+ # instantiate base retriever
+ def get_base_retriever(k=4, search_type="mmr"):
+     """
+     Instantiates base retriever.
+
+     Args:
+         k (int, optional): Top k results to retrieve. Defaults to 4.
+         search_type (str, optional): Search type (mmr or similarity). Defaults to 'mmr'.
+
+     Returns:
+         VectorStoreRetriever: Returns base retriever.
+     """
+
+     # get the vector store of indexes
+     vector_store = retrieve_indexes()
+
+     base_retriever = vector_store.as_retriever(
+         search_type=search_type, search_kwargs={"k": k}
+     )
+
+     return base_retriever
+
+
+ # define prompt template
+ def create_prompt_template():
+     """
+     Creates prompt template.
+
+     Returns:
+         PromptTemplate: Returns prompt template.
+     """
+     prompt_template = """
+     <|system|>
+     You are an AI assistant for question-answering tasks. Use the provided context to answer the question. If you don't know the answer, just say that you don't know. The generated answer should be relevant to the question being asked, short and concise. Do not be creative and do not make up the answer.</s>
+     {context}</s>
+     <|user|>
+     {query}</s>
+     <|assistant|>
+     """
+     chat_prompt_template = ChatPromptTemplate.from_template(prompt_template)
+     return chat_prompt_template
+
+
+ # define llm
+ def load_hf_llm(repo_id, max_new_tokens=512, temperature=0.2):
+     """
+     Loads Hugging Face Endpoint for inference.
+
+     Args:
+         repo_id (str): HuggingFace Model Repo ID.
+         max_new_tokens (int, optional): Maximum number of new tokens to generate. Defaults to 512.
+         temperature (float, optional): Temperature setting. Defaults to 0.2.
+
+     Returns:
+         HuggingFaceEndpoint: Returns HuggingFace Endpoint.
+     """
+
+     hf_llm = HuggingFaceEndpoint(
+         repo_id=repo_id,
+         max_new_tokens=max_new_tokens,
+         temperature=temperature,
+         do_sample=True,
+         repetition_penalty=1.1,
+         return_full_text=False,
+     )
+     return hf_llm
+
+
+ # define retrieval chain
+ def create_qa_chain(retriever, llm):
+     """
+     Instantiates qa chain.
+
+     Args:
+         retriever (VectorStoreRetriever): Vector store.
+         llm (HuggingFaceEndpoint): HuggingFace endpoint.
+
+     Returns:
+         Runnable: Returns qa chain.
+     """
+
+     qa_chain = (
+         {"context": retriever, "query": RunnablePassthrough()}
+         | create_prompt_template()
+         | llm
+         | StrOutputParser()
+     )
+     return qa_chain
src/test/__init__.py ADDED
File without changes
src/test/eval_questions.txt ADDED
@@ -0,0 +1,10 @@
+ What is FastMap?
+ What is a Role Template?
+ What is the purpose of Object Reset?
+ What is the purpose of Reporting Periods?
+ List the system variables used in Expressions.
+ Provide the steps to configure Watson Assistant in OpenPages?
+ What is the difference between PRE and POST position in Triggers?
+ What are the features of Operational Risk Management in OpenPages?
+ What are the different permissions that can be delegated to a user group administrator?
+ What are the different access controls available for non-participants for a standard stage within a workflow?
src/test/eval_rag.py ADDED
@@ -0,0 +1,66 @@
+ # import libraries
+ import os
+ import time
+ import datetime
+ import pandas as pd
+
+ # constants
+ EVAL_FILE_PATH = "./src/test/eval_questions.txt"
+ EVAL_RESULTS_FILE_NAME = "eval_results_{0}.csv"
+ EVAL_RESULTS_PATH = "./src/test"
+
+
+ # load eval questions
+ def load_eval_questions():
+     """
+     Loads eval questions into memory.
+
+     Returns:
+         list: Returns list of questions.
+     """
+
+     eval_questions = []
+     with open(EVAL_FILE_PATH, "r") as file:
+         for line in file:
+             # Remove the newline character
+             item = line.strip()
+             eval_questions.append(item)
+
+     return eval_questions
+
+
+ # evaluate rag chain
+ def evaluate_rag(chain_name, rag_chain):
+     """
+     Evaluates the rag pipeline based on eval questions.
+
+     Args:
+         chain_name (str): QA Chain name.
+         rag_chain (Runnable): QA Chain instance.
+     """
+
+     columns = ["Chain", "Question", "Response", "Time"]
+     df = pd.DataFrame(columns=columns)
+
+     eval_questions = load_eval_questions()
+
+     for question in eval_questions:
+
+         start_time = time.time()
+         answer = rag_chain.invoke(question)
+         end_time = time.time()
+
+         row = {
+             "Chain": chain_name,
+             "Question": question,
+             "Response": answer,
+             "Time": "{:.2f}".format(round(end_time - start_time, 2)),
+         }
+
+         df = pd.concat([df, pd.DataFrame.from_records([row])])
+
+     CSV = EVAL_RESULTS_FILE_NAME.format(
+         datetime.datetime.now().strftime("%Y%m%d%H%M%S")
+     )
+     print(os.path.join(EVAL_RESULTS_PATH, CSV))
+     df.to_csv(os.path.join(EVAL_RESULTS_PATH, CSV), index=False)
src/ui/__init__.py ADDED
File without changes
src/ui/chat_interface.py ADDED
@@ -0,0 +1,33 @@
+ # import libraries
+ import gradio as gr
+
+ # import functions
+ from src.test.eval_rag import load_eval_questions
+
+
+ # create chatbot interface
+ def create_chatinterface(generate_response):
+     """
+     Instantiates the gradio chat interface.
+
+     Args:
+         generate_response (callable): Function that generates the response.
+
+     Returns:
+         gr.ChatInterface: Returns the gradio ChatInterface instance.
+     """
+
+     chat_interface = gr.ChatInterface(
+         fn=generate_response,
+         textbox=gr.Textbox(
+             placeholder="Type your question here!", container=False, scale=7
+         ),
+         title="OpenPages IntelliBot",
+         description="Ask me about OpenPages (v9.0), its features, solutions / modules it offers and the trigger framework. Authored by Nikhil Komakula (nikhil.komakula@outlook.com).",
+         theme=gr.themes.Default(primary_hue="blue"),
+         examples=load_eval_questions(),
+         cache_examples=False,
+         concurrency_limit=None,
+     )
+
+     return chat_interface