manu committed · Commit 40e26e1 · verified · 1 Parent(s): 98d3f8a

Update app.py

Files changed (1):
  1. app.py +9 -6
app.py CHANGED
@@ -272,15 +272,16 @@ SYSTEM1 = (
 """
 You are a PDF research agent with a single tool: visual_deepsearch_image_search(query: string, k: int).
 Act iteratively:
-1) Split the user question into 1–4 focused sub-queries. You can use the provided page images to help you ask relevant follow-up queries. Sub-queries should be asked as natural-language questions, not just keywords.
-2) For each sub-query, call visual_deepsearch_image_search (k=5 by default; increase to up to 10 if you need to go deep).
-3) You will receive the output of visual_deepsearch_image_search as a list of indices corresponding to page numbers. Print the page numbers out and stop generating. An external system will take over and convert the indices into images for you.
-4) Analyze the images received to find the information you were looking for. If you are confident that you have all the information needed for a complete response, stop early and provide a final answer. Otherwise, run new search calls using the tool to find additional missing information.
-5) Repeat the process for up to 5 rounds of iterations and 20 searches in total. If info is missing, keep searching with new keywords and queries.
+1) If you are given images, analyze them to find the information you were looking for. If you are confident that you have all the information needed for a complete response, provide a final answer. Most often, you should run new search calls using the tool to find additional missing information.
+2) To run new searches, split the query into 1–3 focused sub-queries. You can use any provided page images to help you ask relevant follow-up queries. Sub-queries should be asked as natural-language questions, not just keywords.
+3) For each sub-query, call visual_deepsearch_image_search (k=5 by default; increase to up to 10 if you need to go deep).
+4) You will receive the output of visual_deepsearch_image_search as a list of indices corresponding to page numbers. Print the page numbers out and stop generating. An external system will take over and convert the indices into images for you.
+5) Go back to step 1: analyze the images received to find the information you were looking for. If you are confident that you have all the information needed for a complete response, provide a final answer. Otherwise, run new search calls using the tool to find additional missing information.

 Workflow:
 • Use ONLY the provided images for grounding and cite as (p.<page>).
 • If an answer is not present, say “Not found in the provided pages.”
+• Never do more than three rounds of refinement. If you are past round 3, it's time to gather all information and produce the final answer if you haven't done so yet.

 Deliverable:
 • Return a clear, standalone Markdown answer in the user's language. Include concise tables for lists of dates/items when useful, and cite the page numbers used for each fact.
@@ -388,8 +389,10 @@ def stream_agent(question: str,
     parts: List[Dict[str, Any]] = []
     if round_idx == 1:
         parts.append({"type": "input_text", "text": question})
+    elif round_idx < 5:
+        parts.append({"type": "input_text", "text": f"Continue reasoning with the newly attached pages which are from round {round_idx}. Ground your answer in these images, or query for new pages with the search tool if you are in round 3 or less. Otherwise, write your final answer."})
     else:
-        parts.append({"type": "input_text", "text": "Continue reasoning with the newly attached pages. Remember you should probably further query the search tool."})
+        parts.append({"type": "input_text", "text": "Time to produce the final answer grounded in the pages. Do not use the tool to query for new pages."})

     parts += _build_image_parts_from_indices(attached_indices)
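The second hunk's round-gated prompt selection can be sketched in isolation. `build_text_part` below is a hypothetical standalone helper that mirrors the `round_idx` branching from the commit; it is not the actual `stream_agent` code, and the round-5 cutoff is taken from the `elif round_idx < 5` guard in the diff.

```python
from typing import Any, Dict

def build_text_part(question: str, round_idx: int) -> Dict[str, Any]:
    """Pick the text prompt for a given agent round (sketch of the diffed logic)."""
    # Round 1: send the user's question verbatim.
    if round_idx == 1:
        text = question
    # Rounds 2-4: keep reasoning over the newly attached pages;
    # further tool searches are allowed only through round 3.
    elif round_idx < 5:
        text = (
            f"Continue reasoning with the newly attached pages which are "
            f"from round {round_idx}. Ground your answer in these images, "
            f"or query for new pages with the search tool if you are in "
            f"round 3 or less. Otherwise, write your final answer."
        )
    # Round 5 and beyond: force a final answer, no more tool calls.
    else:
        text = (
            "Time to produce the final answer grounded in the pages. "
            "Do not use the tool to query for new pages."
        )
    return {"type": "input_text", "text": text}
```

This keeps the prompt-selection policy in one place, so the cap on search rounds stated in the system prompt and the per-round user message cannot drift apart.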