abdibrahem committed on
Commit
b65cd76
·
1 Parent(s): 2177417

Update the prompts
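
This commit trims the `input_variables` of `user_response_template` from six names to four, dropping `api_status` and `data_validation_status`. A minimal sketch of how one might check that a template's placeholders and its declared variables still agree, assuming the templates use Python `str.format`-style placeholders (as the doubled braces around the JSON example suggest). The helpers `template_fields` and `check_inputs` and the abridged `NEW_TEMPLATE` are hypothetical and not part of main.py.

```python
from string import Formatter


def template_fields(template: str) -> set[str]:
    """Return the named placeholders in a str.format-style template."""
    # Formatter.parse yields (literal, field_name, format_spec, conversion);
    # escaped braces such as {{ and }} come back as literal text only.
    return {name for _, name, _, _ in Formatter().parse(template) if name}


def check_inputs(template: str, input_variables: list[str]) -> tuple[set[str], set[str]]:
    """Return (placeholders not declared, declared names never used)."""
    fields = template_fields(template)
    declared = set(input_variables)
    return fields - declared, declared - fields


# Abridged stand-in for the revised response template, not the full prompt.
NEW_TEMPLATE = """
Respond ONLY in {detected_language}
User Query: {user_query}
User Sentiment: {sentiment_analysis}
{api_response}
"""

old_vars = ["user_query", "api_response", "detected_language",
            "sentiment_analysis", "api_status", "data_validation_status"]
new_vars = ["user_query", "api_response", "detected_language", "sentiment_analysis"]

missing_old, unused_old = check_inputs(NEW_TEMPLATE, old_vars)
missing_new, unused_new = check_inputs(NEW_TEMPLATE, new_vars)
```

With the old variable list, `unused_old` names the two variables the new template no longer references; with the new list both sets come back empty. Any caller that still passes the two dropped keys when invoking the chain would need the same review.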

Files changed (1)
  1. main.py +113 -126
main.py CHANGED
@@ -153,53 +153,62 @@ class HealthcareChatbot:
153
  def _initialize_parsers_and_chains(self):
154
  """Initialize all prompt templates and chains"""
155
  self.json_parser = JsonOutputParser(pydantic_object=EndpointRequest)
156
-
157
  # Intent classification prompt
158
  self.intent_classifier_template = PromptTemplate(
159
  template="""
160
- You are an intent classifier for a healthcare chatbot. Analyze the user's message and determine if it requires an API call or is conversational.
 
161
 
162
- === ANALYSIS CONTEXT ===
163
  User Message: {user_query}
164
- Language: {detected_language}
165
  Conversation History: {conversation_history}
166
 
167
- === AVAILABLE API ENDPOINTS ===
 
168
  {endpoints_documentation}
169
 
170
- === CLASSIFICATION TASK ===
171
- Determine if the user's message requires:
172
- 1. API_ACTION: Specific healthcare action (book appointment, view records, etc.)
173
- 2. CONVERSATION: General chat, greeting, questions not requiring backend data
174
 
175
  === RESPONSE FORMAT ===
176
- Respond with EXACTLY this JSON structure:
177
  {{
178
  "intent": "API_ACTION" or "CONVERSATION",
179
- "confidence": 0.95,
180
- "reasoning": "Brief explanation of classification decision",
181
  "requires_backend": true or false
182
  }}
183
 
184
  === CLASSIFICATION RULES ===
185
- Choose API_ACTION for:
186
- - Booking, canceling, or viewing appointments
187
- - Requesting medical records or test results
188
- - Hospital information queries (locations, hours, etc.)
189
- - Medication management requests
190
- - Specific patient data requests
191
-
192
- Choose CONVERSATION for:
193
- - Greetings and pleasantries
194
- - General health advice (not patient-specific)
195
- - Explanations of medical terms
196
- - Small talk or casual questions
197
- - Questions about the chatbot itself
198
-
199
- Classify the intent:""",
200
  input_variables=["user_query", "detected_language", "conversation_history", "endpoints_documentation"]
201
  )
202
 
203
  # API routing prompt (reuse existing router_prompt_template)
204
  self.router_prompt_template = PromptTemplate(
205
  template="""
@@ -276,7 +285,7 @@ class HealthcareChatbot:
276
  - User wants to "update medication" → Use medication update endpoint with patient_id
277
 
278
  Think step by step and be precise with your endpoint selection and parameter extraction.:""",
279
- input_variables=["endpoints_documentation", "user_query", "detected_language",
280
  "extracted_keywords", "sentiment_analysis", "conversation_history"]
281
  )
282
 
@@ -299,7 +308,7 @@ class HealthcareChatbot:
299
  Conversation History: {conversation_history}
300
 
301
  === LANGUAGE-SPECIFIC INSTRUCTIONS ===
302
-
303
  FOR ARABIC RESPONSES:
304
  - Use Modern Standard Arabic (الفصحى)
305
  - Be respectful and formal as appropriate in Arabic culture
@@ -327,120 +336,98 @@ class HealthcareChatbot:
327
  # API response formatting prompt (reuse existing user_response_template)
328
  self.user_response_template = PromptTemplate(
329
  template="""
330
- You are a professional healthcare assistant. Your PRIMARY responsibility is to provide ONLY verified information directly from the API response data.
331
 
332
- === CRITICAL DATA ACCURACY REQUIREMENTS ===
333
- - Use ONLY information that exists in the api_response data
334
- - Do NOT infer, assume, or generate any information not explicitly provided
335
- - If specific data is missing from api_response, explicitly state it's unavailable
336
- - Cross-reference ALL details against the raw API data before including in response
337
- - Maintain strict data integrity - accuracy over completeness
338
-
339
- === RESPONSE LANGUAGE ===
340
- - Respond EXCLUSIVELY in: {detected_language}
341
- - NO translations, explanations, or text in other languages
342
- - Use language-appropriate formatting and terminology
343
 
344
- === INPUT DATA ===
345
  User Query: {user_query}
346
  User Sentiment: {sentiment_analysis}
347
- API Response Data: {api_response}
348
- API Status: {api_status}
349
- Data Validation: {data_validation_status}
350
 
351
- === DATA VERIFICATION PROTOCOL ===
352
- Before including ANY information, verify:
353
- 1. Does this exact data exist in api_response?
354
- 2. Is this data complete and not truncated?
355
- 3. Are all referenced IDs/codes valid in the API response?
356
- 4. Is the data current and not cached/outdated?
357
 
358
- === RESPONSE STRUCTURE BY LANGUAGE ===
359
 
360
  FOR ARABIC RESPONSES:
361
- - Use Modern Standard Arabic only
362
- - Arabic numerals: ١، ٢، ٣، ٤، ٥، ٦، ٧، ٨، ٩
363
- - Time: "من الساعة ٨:٠٠ صباحاً حتى ٥:٠٠ مساءً"
364
- - Dates: "الخميس ٢٣ مايو ٢٠٢٥"
365
- - Hospital format: "مستشفى [Name] في [Address] - ساعات العمل: [Hours]"
366
- - Missing data: "هذه المعلومة غير متوفرة في النظام حالياً"
367
 
368
  FOR ENGLISH RESPONSES:
369
- - Professional, clear language
370
- - Time: "8:00 AM - 5:00 PM"
371
- - Dates: "Thursday, May 23, 2025"
372
- - Hospital format: "[Name] Hospital at [Address] - Operating hours: [Hours]"
373
- - Missing data: "This information is not available in our system"
374
-
375
- === API RESPONSE HANDLING ===
376
-
377
- IF API RESPONSE IS SUCCESSFUL:
378
- - Extract and present ONLY the verified data
379
- - Preserve exact names, addresses, times, and contact information
380
- - Convert technical formats to user-friendly presentation
381
- - Validate all extracted data against the original API response
382
-
383
- IF API RESPONSE IS INCOMPLETE/ERROR:
384
- - Clearly state what information is unavailable
385
- - Do NOT make up placeholder information
386
- - Suggest alternative actions if appropriate
387
- - Maintain transparency about data limitations
388
-
389
- === PROHIBITED ACTIONS ===
390
- ❌ Never fabricate or estimate missing information
391
- ❌ Never include system URLs, booking links, or technical IDs in user response
392
- ❌ Never mix languages or provide translations
393
- ❌ Never use bullet points or technical formatting
394
- ❌ Never add information not present in api_response
395
- ❌ Never assume data relationships not explicitly stated in API
396
- ❌ Never provide outdated or cached information as current
397
-
398
- === REQUIRED VALIDATION STEPS ===
399
- 1. Parse api_response structure completely
400
- 2. Identify all available data fields
401
- 3. Map user query requirements to available data
402
- 4. Flag any missing or incomplete information
403
- 5. Format response using ONLY verified data
404
- 6. Double-check all details against original API response
405
-
406
- === RESPONSE QUALITY CRITERIA ===
407
- ✓ Every piece of information is traceable to api_response
408
- ✓ Response directly answers user's specific question
409
- ✓ Language is natural and conversational
410
- ✓ No technical jargon or system codes visible to user
411
- ✓ Missing information is clearly acknowledged
412
- ✓ Response sounds like a knowledgeable human assistant
413
- ✓ Data accuracy is prioritized over response completeness
414
-
415
- === ERROR HANDLING ===
416
-
417
- IF API DATA IS MISSING:
418
- Arabic: "عذراً، المعلومات المطلوبة غير متوفرة حالياً في النظام. يرجى المحاولة مرة أخرى أو الاتصال بالدعم الفني."
419
- English: "I apologize, but the requested information is not currently available in our system. Please try again later or contact technical support."
420
-
421
- IF API DATA IS INCOMPLETE:
422
- Arabic: "المعلومات المتوفرة: [available_data]. باقي التفاصيل غير متوفرة حالياً."
423
- English: "Available information: [available_data]. Additional details are not currently available."
424
-
425
- === FINAL INSTRUCTIONS ===
426
- 1. Analyze the api_response data thoroughly
427
- 2. Extract ONLY verified, complete information
428
- 3. Format naturally in the requested language
429
- 4. Provide transparent communication about data limitations
430
- 5. Ensure every detail can be traced back to the API response
431
- 6. Stop immediately after delivering the requested information
432
- 7. Do NOT add any supplementary languages or translations
433
-
434
- Generate your response based solely on verified API data, maintaining the highest standards of accuracy and transparency.
435
  """,
436
- input_variables=["user_query", "api_response", "detected_language", "sentiment_analysis", "api_status", "data_validation_status"]
437
  )
 
438
  # Create chains
439
  self.intent_chain = LLMChain(llm=self.llm, prompt=self.intent_classifier_template)
440
  self.router_chain = LLMChain(llm=self.llm, prompt=self.router_prompt_template)
441
  self.conversation_chain = LLMChain(llm=self.llm, prompt=self.conversation_template)
442
  self.api_response_chain = LLMChain(llm=self.llm, prompt=self.user_response_template)
443
-
444
  def detect_language(self, text):
445
  """Detect language of the input text"""
446
  if self.language_classifier and len(text.strip()) > 3:
 
153
  def _initialize_parsers_and_chains(self):
154
  """Initialize all prompt templates and chains"""
155
  self.json_parser = JsonOutputParser(pydantic_object=EndpointRequest)
156
+
157
  # Intent classification prompt
158
  self.intent_classifier_template = PromptTemplate(
159
  template="""
160
+ You are a strict intent classifier for a healthcare chatbot. Your task is to determine whether a user query requires an API call or is just a conversational message.
161
+ This decision MUST be based on an accurate comparison between the user's message and the functionality described in the available API endpoints.
162
 
163
+ === USER INPUT CONTEXT ===
164
  User Message: {user_query}
165
+ Detected Language: {detected_language}
166
  Conversation History: {conversation_history}
167
 
168
+ === API ENDPOINTS DOCUMENTATION ===
169
+ You must compare the user query against ALL of the following API endpoints and their descriptions to decide if there is a functional match:
170
  {endpoints_documentation}
171
 
172
+ === DECISION INSTRUCTIONS ===
173
+ Step 1: Understand the intent and meaning of the user message.
174
+ Step 2: Carefully read all API endpoint descriptions.
175
+ Step 3: Check if the user query matches or requests a function explicitly supported by ANY of the endpoints.
176
+ - If there is a match, the intent is **API_ACTION**.
177
+ - If no endpoint matches the user query in purpose or meaning, the intent is **CONVERSATION**.
178
 
179
  === RESPONSE FORMAT ===
180
+ Respond with ONLY the following JSON structure:
181
  {{
182
  "intent": "API_ACTION" or "CONVERSATION",
183
+ "confidence": [0.0 to 1.0],
184
+ "reasoning": "Explain if and how the user query matched any endpoint description, or why it didn’t",
185
  "requires_backend": true or false
186
  }}
187
 
188
  === CLASSIFICATION RULES ===
189
+ Choose **API_ACTION** if the user request can be fulfilled using an existing endpoint. This includes:
190
+ - Scheduling, canceling, or managing appointments
191
+ - Requesting or viewing test results, prescriptions, or medical records
192
+ - Any interaction described in the API documentation
193
+
194
+ Choose **CONVERSATION** if the query:
195
+ - Is a greeting or casual message
196
+ - Asks about general health advice or medical definitions
197
+ - Mentions something unrelated to the provided API functionality
198
+ - Cannot be fulfilled by any described endpoint
199
+
200
+ === IMPORTANT NOTE ===
201
+ DO NOT guess. Only return "API_ACTION" if the user message CLEARLY maps to a specific endpoint. Otherwise, return "CONVERSATION".
202
+
203
+ === CLASSIFICATION START ===
204
+ Begin classification based on the above rules:
205
+ """,
206
  input_variables=["user_query", "detected_language", "conversation_history", "endpoints_documentation"]
207
  )
208
 
209
+
210
+
211
+
212
  # API routing prompt (reuse existing router_prompt_template)
213
  self.router_prompt_template = PromptTemplate(
214
  template="""
 
285
  - User wants to "update medication" → Use medication update endpoint with patient_id
286
 
287
  Think step by step and be precise with your endpoint selection and parameter extraction.:""",
288
+ input_variables=["endpoints_documentation", "user_query", "detected_language",
289
  "extracted_keywords", "sentiment_analysis", "conversation_history"]
290
  )
291
 
 
308
  Conversation History: {conversation_history}
309
 
310
  === LANGUAGE-SPECIFIC INSTRUCTIONS ===
311
+
312
  FOR ARABIC RESPONSES:
313
  - Use Modern Standard Arabic (الفصحى)
314
  - Be respectful and formal as appropriate in Arabic culture
 
336
  # API response formatting prompt (reuse existing user_response_template)
337
  self.user_response_template = PromptTemplate(
338
  template="""
339
+ You are a professional healthcare assistant. Generate clear, accurate responses using EXACT data from the system.
340
 
341
+ === STRICT REQUIREMENTS ===
342
+ - Respond ONLY in {detected_language}
343
+ - Use EXACT information from api_response - NO modifications
344
+ - Keep responses SHORT, SIMPLE, and DIRECT
345
+ - Use professional healthcare tone
346
+ NEVER mix languages or make up information
347
 
348
+ === ORIGINAL REQUEST ===
349
  User Query: {user_query}
350
  User Sentiment: {sentiment_analysis}
351
 
352
+ === SYSTEM DATA ===
353
+ {api_response}
354
 
355
+ === LANGUAGE-SPECIFIC FORMATTING ===
356
 
357
  FOR ARABIC RESPONSES:
358
+ - Use Modern Standard Arabic (الفصحى)
359
+ - Use Arabic numerals: ١، ٢، ٣، ٤، ٥، ٦، ٧، ٨، ٩، ١٠
360
+ - Time format: "من الساعة ٨:٠٠ صباحاً إلى ٥:٠٠ مساءً"
361
+ - Date format: "١٥ مايو ٢٠٢٥"
362
+ - Use proper Arabic medical terminology
363
+ - Keep sentences short and grammatically correct
364
+ - Example format for hospitals:
365
+ "مستشفى [الاسم] - العنوان: [العنوان الكامل] - أوقات العمل: من [الوقت] إلى [الوقت]"
366
 
367
  FOR ENGLISH RESPONSES:
368
+ - Use clear, professional language
369
+ - Time format: "8:00 AM to 5:00 PM"
370
+ - Date format: "May 15, 2025"
371
+ - Keep sentences concise and direct
372
+ - Example format for hospitals:
373
+ "[Hospital Name] - Address: [Full Address] - Hours: [Opening Time] to [Closing Time]"
374
+
375
+ === RESPONSE STRUCTURE ===
376
+ 1. Direct answer to the user's question
377
+ 2. Essential details only (names, addresses, hours, contact info)
378
+ 3. Brief helpful note if needed
379
+ 4. No unnecessary introductions or conclusions
380
+
381
+ === CRITICAL RULES ===
382
+ - Extract information EXACTLY as provided in api_response
383
+ - Do NOT include technical URLs, IDs, or system codes in the response
384
+ - Do NOT show raw links or booking URLs to users
385
+ - Present information in natural, conversational language
386
+ - Do NOT use bullet points or technical formatting
387
+ - Write as if you're speaking to the patient directly
388
+ - If data is missing, state "المعلومات غير متوفرة" (Arabic) or "Information not available" (English)
389
+ - Convert technical data into human-readable format
390
+ - NEVER add translations or explanations in other languages
391
+ - NEVER include "Translated response" or similar phrases
392
+ - END your response immediately after providing the requested information
393
+ - Do NOT add any English translation when responding in Arabic
394
+ - Do NOT add any Arabic translation when responding in English
395
+
396
+ === HUMAN-LIKE FORMATTING RULES ===
397
+ FOR ARABIC:
398
+ - Instead of "رابط الحجز: [URL]" → say "تم حجز موعدك بنجاح"
399
+ - Instead of "الأزمة: غير متوفرة" → omit or say "بدون أعراض محددة"
400
+ - Use natural sentences like "موعدك مع الدكتور [Name] يوم [Date] في تمام الساعة [Time]"
401
+ - Avoid technical terms and system language
402
+
403
+ FOR ENGLISH:
404
+ - Instead of "Booking URL: [link]" → say "Your appointment has been scheduled"
405
+ - Use natural sentences like "You have an appointment with Dr. [Name] on [Date] at [Time]"
406
+ - Avoid showing raw URLs, IDs, or technical data
407
+
408
+ === QUALITY CHECKS ===
409
+ Before responding, verify:
410
+ ✓ Response sounds natural and conversational
411
+ ✓ No technical URLs, IDs, or system codes are shown
412
+ ✓ Information is presented in human-friendly language
413
+ ✓ Grammar is correct in the target language
414
+ ✓ Response directly answers the user's question
415
+ ✓ No bullet points or technical formatting
416
+ ✓ Sounds like a helpful human assistant, not a system
417
+
418
+ Generate a response that is accurate, helpful, and professionally formatted.
419
+
420
+ === FINAL INSTRUCTION ===
421
+ Respond ONLY in the requested language. Do NOT provide translations, explanations, or additional text in any other language. Stop immediately after answering the user's question.
422
  """,
423
+ input_variables=["user_query", "api_response", "detected_language", "sentiment_analysis"]
424
  )
425
+
426
  # Create chains
427
  self.intent_chain = LLMChain(llm=self.llm, prompt=self.intent_classifier_template)
428
  self.router_chain = LLMChain(llm=self.llm, prompt=self.router_prompt_template)
429
  self.conversation_chain = LLMChain(llm=self.llm, prompt=self.conversation_template)
430
  self.api_response_chain = LLMChain(llm=self.llm, prompt=self.user_response_template)
 
431
  def detect_language(self, text):
432
  """Detect language of the input text"""
433
  if self.language_classifier and len(text.strip()) > 3:
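
The revised intent classifier prompt asks for a JSON-only reply with `intent`, `confidence`, `reasoning`, and `requires_backend`, and tells the model to return "CONVERSATION" unless the message clearly maps to an endpoint. A caller could mirror that default when the reply cannot be parsed. The following is a sketch under assumptions, using only the standard library: `IntentResult` and `parse_intent` are hypothetical names that do not appear in main.py, and the way the chain output reaches this function is not shown in the diff.

```python
import json
from dataclasses import dataclass


@dataclass
class IntentResult:
    intent: str
    confidence: float
    reasoning: str
    requires_backend: bool


def parse_intent(raw: str) -> IntentResult:
    """Parse the classifier reply; fall back to CONVERSATION on any problem."""
    fallback = IntentResult("CONVERSATION", 0.0, "unparseable classifier output", False)

    # Models sometimes wrap the JSON in extra text, so take the outermost braces.
    start, end = raw.find("{"), raw.rfind("}")
    if start == -1 or end <= start:
        return fallback
    try:
        data = json.loads(raw[start:end + 1])
    except json.JSONDecodeError:
        return fallback

    intent = data.get("intent")
    if intent not in ("API_ACTION", "CONVERSATION"):
        return fallback

    # The prompt specifies a 0.0 to 1.0 range; clamp anything outside it.
    try:
        confidence = float(data.get("confidence", 0.0))
    except (TypeError, ValueError):
        confidence = 0.0
    confidence = min(max(confidence, 0.0), 1.0)

    # Derive requires_backend from the intent so the two fields cannot disagree.
    return IntentResult(intent, confidence, str(data.get("reasoning", "")),
                        intent == "API_ACTION")
```

Deriving `requires_backend` from `intent` is a design choice of this sketch, not something the prompt mandates; a caller that trusts the model's own flag could read it from the parsed object instead.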