"""Dataset and model configurations for building RAG prompts with Haystack."""

from typing import Dict, Optional, List
from dataclasses import dataclass

from haystack.dataclasses import ChatMessage


@dataclass
class DatasetConfig:
    name: str
    split: str = "train"
    content_field: str = "content"
    fields: Optional[Dict[str, str]] = None  # Maps document meta keys to dataset column names
    prompt_template: Optional[str] = None


# Default configurations for different datasets
DATASET_CONFIGS = {
    "awesome-chatgpt-prompts": DatasetConfig(
        name="fka/awesome-chatgpt-prompts",
        content_field="prompt",
        fields={
            "role": "act",
            "prompt": "prompt",
        },
        prompt_template="""
        Given the following context where each document represents a prompt for a specific role,
        please answer the question while considering both the role and the prompt content.

        Available Contexts:
        {% for document in documents %}
        {% if document.meta.role %}Role: {{ document.meta.role }}{% endif %}
        Content: {{ document.content }}
        ---
        {% endfor %}

        Question: {{question}}
        Answer:
        """,
    ),
"settings-dataset": DatasetConfig(
name="syntaxhacker/rag_pipeline",
content_field="context",
fields={
"question": "question",
"answer": "answer",
"context": "context"
},
prompt_template="""
Given the following context about software settings and configurations,
please answer the question accurately based on the provided information.
For each setting, provide a clear, step-by-step navigation path and include:
1. The exact location (Origin Type > Tab > Section > Setting name)
2. What the setting does
3. Available options/values
4. How to access and modify the setting
5. Reference screenshots (if available)
Format your answer as:
"To [accomplish task], follow these steps:
Location: [Origin Type] > [Tab] > [Section] > [Setting name]
Purpose: [describe what the setting does]
Options: [list available values/options]
How to set: [describe interaction method: toggle/select/input]
Visual Guide:
[Include reference image links if available]
For more details, you can refer to the screenshots above showing the exact location and interface."
Available Contexts:
{% for document in documents %}
Setting Info: {{ document.content }}
Reference Answer: {{ document.meta.answer }}
---
{% endfor %}
Question: {{question}}
Answer:
"""
),
"seven-wonders": DatasetConfig(
name="bilgeyucel/seven-wonders",
content_field="content",
fields={}, # No additional fields needed
prompt_template="""
Given the following information about the Seven Wonders, please answer the question.
Context:
{% for document in documents %}
{{ document.content }}
{% endfor %}
Question: {{question}}
Answer:
"""
),
"psychology-dataset": DatasetConfig(
name="jkhedri/psychology-dataset",
split="train",
content_field="question", # Assuming we want to use the question as the content
fields={
"response_j": "response_j", # Response from one model
"response_k": "response_k" # Response from another model
},
prompt_template="""
Given the following context where each document represents a psychological inquiry,
please answer the question based on the provided responses.
Available Contexts:
{% for document in documents %}
Question: {{ document.content }}
Response J: {{ document.meta.response_j }}
Response K: {{ document.meta.response_k }}
---
{% endfor %}
Question: {{question}}
Answer:
"""
),
"developer-portfolio": DatasetConfig(
name="syntaxhacker/developer-portfolio-rag",
split="train",
content_field="answer",
fields={
"question": "question",
"answer": "answer",
"context": "context"
},
prompt_template="""
You are a helpful assistant that provides direct answers based on the provided context. Format your answers using markdown, especially for lists.
---
Example 1:
Question: What is your current role?
Answer:
I am a Tech Lead at FleetEnable, where I lead the UI development for a logistics SaaS product focused on drayage and freight management.
---
Example 2:
Question: What are your primary responsibilities as a Tech Lead?
Answer:
My primary responsibilities include:
- Leading UI development.
- Collaborating with product and backend teams.
- Helping define technical strategies.
- Ensuring the delivery of high-quality features.
---
Context:
{% for document in documents %}
Question: {{ document.meta.question }}
Answer: {{ document.content }}
{% endfor %}
Question: {{question}}
Answer:
"""
),
}
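

# --- Illustrative helper: a sketch, not part of the original module ---
# Shows how a DatasetConfig would typically be applied when loading a dataset:
# the column named by `content_field` becomes the Document content, and each
# entry in `fields` copies a dataset column into the Document's meta. This
# assumes `fields` maps meta key -> dataset column, which matches how the
# templates reference `document.meta.role` for the "act" column. The name
# `row_to_document` is hypothetical and only for illustration; it assumes
# Haystack 2.x's `Document` dataclass.
def row_to_document(row: Dict[str, str], config: DatasetConfig):
    from haystack import Document  # local import to keep the sketch self-contained

    meta = {
        meta_key: row[column]
        for meta_key, column in (config.fields or {}).items()
        if column in row
    }
    return Document(content=row[config.content_field], meta=meta)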


# Default configuration for embedding and LLM models
MODEL_CONFIG = {
    "embedding_model": "sentence-transformers/all-MiniLM-L6-v2",
    "llm_model": "gemini-2.0-flash-exp",
}
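

# --- Usage sketch: illustrative only, assuming Haystack 2.x's PromptBuilder ---
# Looks up a DatasetConfig and renders its Jinja `prompt_template` with a
# question and a couple of Documents. The example data below is made up.
if __name__ == "__main__":
    from haystack import Document
    from haystack.components.builders import PromptBuilder

    config = DATASET_CONFIGS["seven-wonders"]
    docs = [
        Document(content="The Great Pyramid of Giza is the oldest of the Seven Wonders."),
        Document(content="The Hanging Gardens of Babylon may never have existed."),
    ]

    builder = PromptBuilder(template=config.prompt_template)
    prompt = builder.run(documents=docs, question="Which wonder is the oldest?")["prompt"]
    print(prompt)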