AiActivity committed on
Commit 6a5bbc8 · verified · 1 parent: 2c0c571

fix: more efficient

Files changed (1)
  1. app.py +255 -383
app.py CHANGED
@@ -13,15 +13,11 @@ from markdown import markdown
13
  # Set environment variables
14
  os.environ["TOKENIZERS_PARALLELISM"] = "false"
15
 
16
- # Global variable to track model loading success
17
- MODEL_LOADED = False
18
- MODEL_TYPE = "none"
19
-
20
  print("Loading model... Please wait...")
21
 
22
- # Try to load the models with better error handling
23
  try:
24
- # First try with Phi-2
25
  MODEL_ID = "microsoft/phi-2"
26
 
27
  tokenizer = AutoTokenizer.from_pretrained(
@@ -36,8 +32,6 @@ try:
36
  trust_remote_code=True
37
  )
38
 
39
- MODEL_LOADED = True
40
- MODEL_TYPE = "phi"
41
  print("Successfully loaded Phi-2 model")
42
  except Exception as e:
43
  print(f"Error loading Phi-2: {e}")
@@ -55,157 +49,20 @@ except Exception as e:
55
  device_map="auto"
56
  )
57
 
58
- MODEL_LOADED = True
59
- MODEL_TYPE = "t5"
60
  print("Successfully loaded fallback model")
61
  except Exception as e:
62
  print(f"Error loading fallback model: {e}")
63
- print("Will use hard-coded responses only")
64
-
65
- # Pre-written answers for common topics
66
- COMMON_ANSWERS = {
67
- "computer": """A computer is an electronic device that manipulates information, or data. It can store, retrieve, and process data [1]. Computers were originally designed to perform mathematical calculations, but they have evolved to handle a wide range of tasks including word processing, communication, multimedia playback, and more [2].
68
-
69
- Modern computers typically consist of several key components:
70
-
71
- - **Central Processing Unit (CPU)**: The "brain" of the computer that executes instructions
72
- - **Memory (RAM)**: Temporary storage used while running programs
73
- - **Storage Devices**: Permanent storage for data and programs (hard drives, SSDs)
74
- - **Input Devices**: Allow users to interact with the computer (keyboard, mouse)
75
- - **Output Devices**: Display or communicate information (monitor, speakers)
76
-
77
- Computers operate using binary code (0s and 1s) and follow instructions provided through software programs [3]. They can be categorized into different types including desktop computers, laptops, tablets, smartphones, servers, and supercomputers, each designed for specific use cases.
78
-
79
- The field of computer science involves the study of computers, their design, and their applications. As technology continues to advance, computers are becoming increasingly powerful, smaller, and more integrated into daily life.""",
80
-
81
- "artificial intelligence": """Artificial Intelligence (AI) refers to the simulation of human intelligence in machines that are programmed to think like humans and mimic their actions [1]. The term may also be applied to any machine that exhibits traits associated with a human mind such as learning and problem-solving.
82
-
83
- The core problems of artificial intelligence include programming computers for certain traits such as:
84
-
85
- - **Knowledge**: Having information and the ability to use it
86
- - **Reasoning**: Using knowledge to draw conclusions
87
- - **Problem solving**: Finding solutions to complex problems
88
- - **Perception**: Analyzing sensory inputs like visual, auditory, or textual information
89
- - **Learning**: Acquiring new knowledge and adapting to new situations
90
-
91
- AI can be categorized in different ways:
92
- - **Narrow AI (or Weak AI)**: Systems designed for a specific task
93
- - **General AI (or Strong AI)**: Systems with generalized human cognitive abilities
94
-
95
- Modern AI techniques include machine learning (particularly deep learning), which enables computers to learn from data without being explicitly programmed [2]. Applications of AI include virtual assistants, healthcare diagnostics, autonomous vehicles, facial recognition, recommendation systems, and much more [3].
96
-
97
- As AI technology continues to advance, it raises important ethical, social, and philosophical questions about the impact of these systems on society, privacy, employment, and the future of humanity.""",
98
 
99
- "quantum computing": """Quantum computing is a rapidly emerging technology that harnesses the laws of quantum mechanics to solve problems too complex for classical computers [1]. Unlike traditional computers that use bits (0s and 1s), quantum computers use quantum bits or "qubits" that can exist in multiple states simultaneously due to a property called superposition [2].
100
-
101
- Key concepts in quantum computing include:
102
-
103
- - **Superposition**: Qubits can represent multiple states at once, enabling parallel computation
104
- - **Entanglement**: Quantum particles become interconnected, with the state of one instantly influencing the other
105
- - **Quantum Interference**: Used to control quantum states and amplify correct answers
106
-
107
- These properties potentially allow quantum computers to solve certain problems exponentially faster than classical computers. Promising applications include [3]:
108
-
109
- - Cryptography and security (both breaking and creating new encryption methods)
110
- - Drug discovery and molecular modeling
111
- - Optimization problems in fields like logistics and finance
112
- - Machine learning and artificial intelligence
113
- - Climate modeling and materials science
114
-
115
- Currently, quantum computers are still in early development. Companies like IBM, Google, Microsoft, and D-Wave are building increasingly powerful quantum processors, though practical, error-corrected quantum computers that can outperform classical supercomputers for a wide range of problems are still being developed.
116
-
117
- Challenges in quantum computing include maintaining quantum coherence (qubits are fragile and easily disrupted by environmental noise), error correction, and scaling up to more qubits while maintaining control."""
118
- }
119
-
120
- # Pre-curated sources for common topics
121
- COMMON_SOURCES = {
122
- "computer": [
123
- {
124
- 'title': "Wikipedia - Computer",
125
- 'url': "https://en.wikipedia.org/wiki/Computer",
126
- 'snippet': "A computer is a digital electronic machine that can be programmed to carry out sequences of arithmetic or logical operations (computation) automatically."
127
- },
128
- {
129
- 'title': "Computer - Encyclopedia Britannica",
130
- 'url': "https://www.britannica.com/technology/computer",
131
- 'snippet': "Computer, device for processing, storing, and displaying information. Computer once meant a person who did computations, but now the term almost universally refers to automated electronic machinery."
132
- },
133
- {
134
- 'title': "How Computers Work - Khan Academy",
135
- 'url': "https://www.khanacademy.org/computing/computer-science/how-computers-work2",
136
- 'snippet': "Computers are everywhere and they're used for everything, but how do they actually work? In this course you'll learn how computers work from the bottom up."
137
- }
138
- ],
139
- "artificial intelligence": [
140
- {
141
- 'title': "Wikipedia - Artificial intelligence",
142
- 'url': "https://en.wikipedia.org/wiki/Artificial_intelligence",
143
- 'snippet': "Artificial intelligence (AI) is intelligence—perceiving, synthesizing, and inferring information—demonstrated by machines, as opposed to intelligence displayed by humans or by other animals."
144
- },
145
- {
146
- 'title': "What is Artificial Intelligence (AI)? - IBM",
147
- 'url': "https://www.ibm.com/topics/artificial-intelligence",
148
- 'snippet': "Artificial intelligence is the simulation of human intelligence processes by machines, especially computer systems. Specific applications of AI include expert systems, natural language processing, speech recognition and machine vision."
149
- },
150
- {
151
- 'title': "Artificial Intelligence - Stanford Encyclopedia of Philosophy",
152
- 'url': "https://plato.stanford.edu/entries/artificial-intelligence/",
153
- 'snippet': "Artificial intelligence (AI) is both the intelligence of machines and the branch of computer science which aims to create it. AI textbooks define the field as 'the study and design of intelligent agents.'"
154
- }
155
- ],
156
- "quantum computing": [
157
- {
158
- 'title': "Wikipedia - Quantum computing",
159
- 'url': "https://en.wikipedia.org/wiki/Quantum_computing",
160
- 'snippet': "Quantum computing is a type of computation whose operations can harness the phenomena of quantum mechanics, such as superposition, interference, and entanglement."
161
- },
162
- {
163
- 'title': "What is quantum computing? - IBM",
164
- 'url': "https://www.ibm.com/topics/quantum-computing",
165
- 'snippet': "Quantum computing harnesses the phenomena of quantum mechanics to deliver a huge leap forward in computation to solve certain problems."
166
- },
167
- {
168
- 'title': "Quantum Computing - MIT Technology Review",
169
- 'url': "https://www.technologyreview.com/topic/computing/quantum-computing/",
170
- 'snippet': "Quantum computers leverage the strange properties of quantum physics to theoretically solve certain types of problems that are effectively impossible for classical computers."
171
- }
172
- ]
173
- }
174
-
175
- # Related topics for common searches
176
- COMMON_TOPICS = {
177
- "computer": [
178
- "History of computers",
179
- "How do computers work?",
180
- "Types of computer systems"
181
- ],
182
- "artificial intelligence": [
183
- "Machine learning vs AI",
184
- "Future of artificial intelligence",
185
- "Ethical concerns in AI"
186
- ],
187
- "quantum computing": [
188
- "Quantum supremacy",
189
- "Quantum entanglement",
190
- "Quantum computing applications"
191
- ]
192
- }
193
-
194
- def search_web(query, max_results=3):
195
- """Search the web using Wikipedia API with reliable fallbacks"""
196
- # Check for common topics first
197
- query_lower = query.lower()
198
- for key in COMMON_SOURCES.keys():
199
- if key in query_lower:
200
- print(f"Using pre-curated sources for '{key}'")
201
- return COMMON_SOURCES[key]
202
-
203
- # If not a common topic, try Wikipedia API
204
  results = []
 
 
 
205
  try:
206
- # Try Wikipedia API
207
  wiki_url = f"https://en.wikipedia.org/w/api.php?action=opensearch&search={urllib.parse.quote(query)}&limit={max_results}&namespace=0&format=json"
208
- response = requests.get(wiki_url)
209
 
210
  if response.status_code == 200:
211
  data = response.json()
@@ -215,16 +72,14 @@ def search_web(query, max_results=3):
215
  for i in range(min(len(titles), len(urls))):
216
  # Get summary for each page
217
  page_url = f"https://en.wikipedia.org/w/api.php?action=query&prop=extracts&exintro&explaintext&titles={urllib.parse.quote(titles[i])}&format=json"
218
- page_response = requests.get(page_url)
219
 
220
  if page_response.status_code == 200:
221
  page_data = page_response.json()
222
- # Extract page ID
223
  try:
224
  page_id = next(iter(page_data['query']['pages'].keys()))
225
- if page_id != "-1": # Valid page
226
  extract = page_data['query']['pages'][page_id].get('extract', '')
227
- # Truncate to a reasonable snippet length
228
  snippet = extract[:200] + "..." if len(extract) > 200 else extract
229
 
230
  results.append({
@@ -232,236 +87,252 @@ def search_web(query, max_results=3):
232
  'url': urls[i],
233
  'snippet': snippet
234
  })
235
- except:
236
- pass
 
237
  except Exception as e:
238
- print(f"Wikipedia API error: {e}")
239
 
240
- # If we still don't have enough results, use fallback
241
  if len(results) < max_results:
242
- # Reliable fallback results
243
- fallback_results = [
244
- {
245
- 'title': f"Wikipedia - {query}",
246
- 'url': f"https://en.wikipedia.org/wiki/Special:Search?search={urllib.parse.quote(query)}",
247
- 'snippet': f"Information about {query} from the free encyclopedia Wikipedia."
248
- },
249
  {
250
- 'title': f"{query} - Overview",
251
  'url': f"https://www.google.com/search?q={urllib.parse.quote(query)}",
252
- 'snippet': f"Comprehensive information about {query} including definitions, applications, and history."
253
- },
254
- {
255
- 'title': f"Latest on {query}",
256
- 'url': f"https://news.google.com/search?q={urllib.parse.quote(query)}",
257
- 'snippet': f"Recent news and updates about {query}."
258
  }
259
  ]
260
-
261
- # Add fallback results until we have enough
262
- for result in fallback_results:
263
- if len(results) >= max_results:
264
- break
265
- results.append(result)
266
 
267
  return results[:max_results]
268
 
269
- def generate_response(prompt, max_new_tokens=256):
270
- """Generate response with comprehensive error handling and fallbacks"""
271
- # If model isn't loaded, return a generic response
272
- if not MODEL_LOADED:
273
- print("Model not loaded, using pre-written responses")
274
- # Check for common topics
275
- for key, answer in COMMON_ANSWERS.items():
276
- if key in prompt.lower():
277
- return answer
278
-
279
- # Generic response for unknown topics
280
- return f"Based on the search results, I can provide some information about your query. The topic appears to be related to {prompt.split(':')[-1].strip()}. For more detailed information, please check the sources provided below."
281
-
282
- # With loaded model, try to generate a response
283
  try:
284
- if MODEL_TYPE == "t5":
285
- # T5 models
286
- inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
287
 
288
  with torch.no_grad():
289
  outputs = model.generate(
290
  inputs.input_ids,
291
  max_new_tokens=max_new_tokens,
292
  temperature=0.7,
293
- num_beams=1,
294
  do_sample=True
295
  )
296
 
297
  response = tokenizer.decode(outputs[0], skip_special_tokens=True)
298
  return response
299
- else: # phi or other models
300
- # Format for Phi-2
301
- if MODEL_TYPE == "phi":
302
- phi_prompt = f"Instruct: {prompt}\nOutput:"
303
- else:
304
- phi_prompt = prompt
305
 
306
- # Tokenize input
307
- inputs = tokenizer(phi_prompt, return_tensors="pt").to(model.device)
 
 
 
 
 
 
308
 
309
- # Generate with efficient settings
310
  with torch.no_grad():
311
  outputs = model.generate(
312
  inputs.input_ids,
313
  max_new_tokens=max_new_tokens,
314
  temperature=0.7,
315
  top_p=0.9,
316
- num_beams=1,
317
  do_sample=True,
318
  pad_token_id=tokenizer.eos_token_id
319
  )
320
 
321
- # Decode response
322
  response = tokenizer.decode(outputs[0][inputs.input_ids.size(1):], skip_special_tokens=True).strip()
323
 
324
- # Check if response is empty and use fallback if needed
325
- if not response or len(response) < 20:
326
- # Try to find a pre-written answer
327
- query = prompt.split(':')[-1].strip()
328
- for key, answer in COMMON_ANSWERS.items():
329
- if key in query.lower():
330
- return answer
 
 
 
 
331
 
332
- # Generic fallback response
333
- return f"Based on the search results, I can provide information about {query}. The sources listed below contain more detailed information about this topic."
334
 
335
  return response
336
-
337
  except Exception as e:
338
  print(f"Error generating response: {e}")
339
-
340
- # Try to find a pre-written answer
341
- query = prompt.split(':')[-1].strip()
342
- for key, answer in COMMON_ANSWERS.items():
343
- if key in query.lower():
344
- return answer
345
-
346
- # Last resort fallback
347
- return f"Based on the search results, I can tell you that {query} is a topic with various aspects covered in the sources below. For more detailed information, please check those sources."
348
-
349
- # Answer cache for better performance
350
- answer_cache = {}
351
 
352
- def extract_citations(text, search_results):
353
  """Ensure citations are properly added to the text"""
354
- # If text is None or empty, return a fallback
355
  if not text or len(text.strip()) < 10:
356
- return "I couldn't generate a specific response for this query. Please check the sources below for information."
357
 
358
  # Add citations if not present
359
  if not re.search(r'\[\d+\]', text):
 
360
  for i, result in enumerate(search_results, 1):
361
- # Try to find snippet content in the answer
362
  key_phrases = result['snippet'].split('.')
363
  for phrase in key_phrases:
364
- if phrase and len(phrase) > 20 and phrase.strip() in text:
365
  text = text.replace(phrase, f"{phrase} [{i}]", 1)
366
 
367
- # If still no citations, add at least one at the end
368
  if not re.search(r'\[\d+\]', text):
369
- text += f" [1]"
370
 
371
  return text
372
 
373
- def generate_related_topics(query):
374
- """Generate related topics with reliable fallbacks"""
375
- # Check for common topics first
376
- query_lower = query.lower()
377
- for key, topics in COMMON_TOPICS.items():
378
- if key in query_lower:
379
- return topics
380
-
381
- # Generic topics for any query
382
- generic_topics = [
383
- f"History of {query}",
384
- f"Latest developments in {query}",
385
- f"How does {query} work?"
386
- ]
387
-
388
- return generic_topics
389
-
390
- def search_and_answer(query):
391
- """Main function to search and generate answer with robust fallbacks"""
392
  try:
393
- # Check cache first
394
- cache_key = query.lower().strip()
395
- if cache_key in answer_cache:
396
- return answer_cache[cache_key]
 
 
 
 
397
 
398
- # Step 1: Search the web
399
- search_results = search_web(query, max_results=3)
 
400
 
401
- # Step 2: Check for pre-written answers
402
- for key, answer in COMMON_ANSWERS.items():
403
- if key in cache_key:
404
- result = {
405
- "answer": answer,
406
- "sources": COMMON_SOURCES.get(key, search_results),
407
- "related_topics": COMMON_TOPICS.get(key, generate_related_topics(query))
408
- }
409
- answer_cache[cache_key] = result
410
- return result
411
 
412
- # Step 3: Create context for the model
413
  context = f"Query: {query}\n\nSearch Results:\n\n"
414
 
415
  for i, result in enumerate(search_results, 1):
416
  context += f"Source {i}:\n"
417
  context += f"Title: {result['title']}\n"
 
418
  context += f"Content: {result['snippet']}\n\n"
419
 
420
- # Step 4: Create prompt for the model
421
  prompt = f"""You are a helpful AI assistant that provides accurate and comprehensive answers based on search results.
422
 
423
  {context}
424
 
425
- Based on these search results, please provide a concise answer to the query: "{query}"
426
  Include citations like [1], [2], etc. to reference the sources.
427
  Be factual and accurate. If the search results don't contain enough information, acknowledge this limitation.
428
  Format your answer in clear paragraphs with bullet points where appropriate."""
429
 
430
- # Step 5: Generate answer with optimized settings
431
- answer = generate_response(prompt, max_new_tokens=256)
432
-
433
- # Step 6: Ensure we have an answer
434
- if not answer or len(answer.strip()) < 20:
435
- # Generic fallback answer
436
- answer = f"""Based on the search results, I can provide some information about {query}.
437
-
438
- The sources show that this topic has various aspects and applications. For more detailed information, please refer to the sources listed below [1][2].
439
-
440
- To learn more about specific aspects of {query}, you might want to explore the related topics listed below."""
441
 
442
- # Step 7: Ensure citations
443
- answer = extract_citations(answer, search_results)
444
 
445
- # Step 8: Generate related topics
446
- related_topics = generate_related_topics(query)
447
 
448
- # Store in cache for future use
449
- result = {
450
  "answer": answer,
451
  "sources": search_results,
452
  "related_topics": related_topics
453
  }
454
- answer_cache[cache_key] = result
455
-
456
- return result
457
 
458
  except Exception as e:
459
- print(f"Error in search_and_answer: {e}")
460
- # Comprehensive fallback
461
  return {
462
- "answer": f"I found some information about {query} that you might find useful. Please check the sources below for more details.",
463
- "sources": search_web(query),
464
- "related_topics": generate_related_topics(query)
465
  }
466
 
467
  def format_sources(sources):
@@ -489,10 +360,10 @@ def format_related(topics):
489
  if not topics:
490
  return ""
491
 
492
- # Use a more reliable approach for clicking topics
493
  html = "<div style='display: flex; flex-wrap: wrap; gap: 10px; margin-top: 15px;'>"
494
  for i, topic in enumerate(topics):
495
- # Each topic is a button with a unique ID that we'll handle with JavaScript
496
  html += f"""
497
  <div id="topic-{i}" style="background-color: #EFF6FF; padding: 10px 16px; border-radius: 100px;
498
  color: #2563EB; font-size: 14px; font-weight: 500; cursor: pointer; display: inline-block;
@@ -505,83 +376,95 @@ def format_related(topics):
505
  """
506
  html += "</div>"
507
 
508
- # Add JavaScript for clicking topics
509
  html += """
510
  <script>
511
- // Function to handle topic clicks with error handling
512
  function setupTopicClicks() {
513
- try {
514
- // Select all topic elements
515
- const topics = document.querySelectorAll('[id^="topic-"]');
516
-
517
- // Add click event listener to each topic
518
- topics.forEach(topic => {
519
- topic.addEventListener('click', function() {
520
- try {
521
- // Get the topic text
522
- const topicText = this.getAttribute('data-topic');
523
- console.log("Clicked topic:", topicText);
524
-
525
- // Set the search input value
526
- const inputElement = document.getElementById('query-input');
527
- if (inputElement) {
528
- inputElement.value = topicText;
529
-
530
- // Find and click the search button
531
- const searchButton = document.querySelector('button[data-testid="submit"]');
532
- if (searchButton) {
533
- searchButton.click();
534
- } else {
535
- // Alternative: try to find by aria-label
536
- const altButton = document.querySelector('button[aria-label="Submit"]');
537
- if (altButton) {
538
- altButton.click();
539
- } else {
540
- // Last resort: try all buttons
541
- const buttons = document.querySelectorAll('button');
542
- for (let btn of buttons) {
543
- if (btn.innerText.includes("Search")) {
544
- btn.click();
545
- break;
546
- }
547
- }
548
- }
549
- }
550
- }
551
- } catch (err) {
552
- console.error("Error handling topic click:", err);
553
  }
554
- });
555
  });
556
- } catch (error) {
557
- console.error("Error setting up topic clicks:", error);
558
- }
559
  }
560
 
561
- // Run immediately and also when DOM changes
562
  setupTopicClicks();
563
 
564
- // Set up a mutation observer to handle dynamically added topics
565
- try {
566
- const observer = new MutationObserver(function(mutations) {
567
- setupTopicClicks();
568
- });
569
-
570
- // Start observing the document body for changes
571
- observer.observe(document.body, {
572
- childList: true,
573
- subtree: true
574
  });
575
- } catch (error) {
576
- console.error("Error setting up observer:", error);
577
- }
578
  </script>
579
  """
580
 
581
  return html
582
 
583
  def search_interface(query):
584
- """Main function for the Gradio interface with reliable error handling"""
585
  if not query.strip():
586
  return (
587
  "Please enter a search query.",
@@ -593,14 +476,10 @@ def search_interface(query):
593
 
594
  try:
595
  # Show loading message while processing
596
- yield ("Searching...", "", "")
597
 
598
- # Perform search and answer generation
599
- result = search_and_answer(query)
600
-
601
- # Ensure we have a valid answer
602
- if not result["answer"] or len(result["answer"].strip()) < 10:
603
- result["answer"] = f"I found some information about '{query}'. Please check the sources below for more details."
604
 
605
  # Format answer with markdown
606
  answer_html = markdown(result["answer"])
@@ -625,8 +504,8 @@ def search_interface(query):
625
  # Return a fallback response
626
  yield (
627
  "I encountered an issue while processing your query. Please try again with a different search term.",
628
- format_sources(search_web(query)),
629
- format_related(generate_related_topics(query))
630
  )
631
 
632
  # Create the Gradio interface with modern UI
@@ -732,13 +611,6 @@ h3 {
732
  border-bottom: 1px solid currentColor;
733
  }
734
 
735
- /* Empty answer styling */
736
- .answer:empty::after {
737
- content: 'Enter a search query to see results';
738
- color: #9CA3AF;
739
- font-style: italic;
740
- }
741
-
742
  /* Loading state */
743
  .answer.loading {
744
  display: flex;
@@ -768,12 +640,12 @@ footer {
768
  """
769
 
770
  with gr.Blocks(css=css, theme=gr.themes.Default()) as demo:
771
- # Custom header with improved design
772
  gr.HTML("""
773
  <div class="header">
774
  <h1 style="color: #2563EB; font-size: 2.2rem; font-weight: 700; margin-bottom: 0.5rem;">🔍 AI Search System</h1>
775
  <p style="color: #64748B; font-size: 1.1rem; max-width: 600px; margin: 0 auto;">
776
- Get comprehensive answers with reliable sources for any question you have.
777
  </p>
778
  </div>
779
  """)
@@ -825,6 +697,6 @@ with gr.Blocks(css=css, theme=gr.themes.Default()) as demo:
825
  </footer>
826
  """)
827
 
828
- # Launch app with queue to prevent overloading
829
  demo.queue(max_size=10)
830
  demo.launch()
 
13
  # Set environment variables
14
  os.environ["TOKENIZERS_PARALLELISM"] = "false"
15
 
 
 
 
 
16
  print("Loading model... Please wait...")
17
 
18
+ # Load the model with proper error handling
19
  try:
20
+ # Try with Phi-2
21
  MODEL_ID = "microsoft/phi-2"
22
 
23
  tokenizer = AutoTokenizer.from_pretrained(
 
32
  trust_remote_code=True
33
  )
34
 
 
 
35
  print("Successfully loaded Phi-2 model")
36
  except Exception as e:
37
  print(f"Error loading Phi-2: {e}")
 
49
  device_map="auto"
50
  )
51
 
 
 
52
  print("Successfully loaded fallback model")
53
  except Exception as e:
54
  print(f"Error loading fallback model: {e}")
55
+ print("Operating in reduced functionality mode")
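For orientation (not part of the commit), a minimal standalone sketch of the two-tier loading pattern above; the fallback model ID (google/flan-t5-base) and the device arguments are assumptions here, since the actual values are elided from this diff:

from transformers import AutoModelForCausalLM, AutoModelForSeq2SeqLM, AutoTokenizer

try:
    # Primary model: Phi-2 causal LM, as in the commit
    MODEL_ID = "microsoft/phi-2"
    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID, trust_remote_code=True)
    model = AutoModelForCausalLM.from_pretrained(MODEL_ID, device_map="auto", trust_remote_code=True)
except Exception as exc:
    print(f"Primary model failed ({exc}); loading a small seq2seq fallback")
    MODEL_ID = "google/flan-t5-base"  # assumed fallback; the real ID is not shown in the diff
    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForSeq2SeqLM.from_pretrained(MODEL_ID, device_map="auto")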
56
 
57
+ def search_web(query, max_results=5):
58
+ """Perform real web searches using multiple search endpoints"""
59
  results = []
60
+
61
+ # Try multiple search methods for reliability
62
+ # Method 1: Wikipedia API
63
  try:
 
64
  wiki_url = f"https://en.wikipedia.org/w/api.php?action=opensearch&search={urllib.parse.quote(query)}&limit={max_results}&namespace=0&format=json"
65
+ response = requests.get(wiki_url, timeout=5)
66
 
67
  if response.status_code == 200:
68
  data = response.json()
 
72
  for i in range(min(len(titles), len(urls))):
73
  # Get summary for each page
74
  page_url = f"https://en.wikipedia.org/w/api.php?action=query&prop=extracts&exintro&explaintext&titles={urllib.parse.quote(titles[i])}&format=json"
75
+ page_response = requests.get(page_url, timeout=5)
76
 
77
  if page_response.status_code == 200:
78
  page_data = page_response.json()
 
79
  try:
80
  page_id = next(iter(page_data['query']['pages'].keys()))
81
+ if page_id != "-1":
82
  extract = page_data['query']['pages'][page_id].get('extract', '')
 
83
  snippet = extract[:200] + "..." if len(extract) > 200 else extract
84
 
85
  results.append({
 
87
  'url': urls[i],
88
  'snippet': snippet
89
  })
90
+ except Exception as e:
91
+ print(f"Error extracting wiki data: {e}")
92
+ continue
93
  except Exception as e:
94
+ print(f"Wikipedia search error: {e}")
95
 
96
+ # Method 2: Public Search API (SerpAPI demo)
97
  if len(results) < max_results:
98
+ try:
99
+ serpapi_url = f"https://serpapi.com/search.json?engine=google&q={urllib.parse.quote(query)}&api_key=demo"
100
+ response = requests.get(serpapi_url, timeout=5)
101
+
102
+ if response.status_code == 200:
103
+ data = response.json()
104
+ if "organic_results" in data:
105
+ for result in data["organic_results"][:max_results - len(results)]:
106
+ results.append({
107
+ 'title': result.get('title', ''),
108
+ 'url': result.get('link', ''),
109
+ 'snippet': result.get('snippet', '')
110
+ })
111
+ except Exception as e:
112
+ print(f"SerpAPI error: {e}")
113
+
114
+ # Method 3: Direct web scraping (as last resort)
115
+ if len(results) < max_results:
116
+ try:
117
+ headers = {
118
+ 'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/91.0.4472.124 Safari/537.36'
119
+ }
120
+ url = f"https://www.bing.com/search?q={urllib.parse.quote(query)}"
121
+ response = requests.get(url, headers=headers, timeout=10)
122
+
123
+ if response.status_code == 200:
124
+ soup = BeautifulSoup(response.text, 'html.parser')
125
+ search_results = soup.find_all('li', class_='b_algo')
126
+
127
+ for result in search_results[:max_results - len(results)]:
128
+ title_elem = result.find('h2')
129
+ if title_elem and title_elem.find('a'):
130
+ title = title_elem.text
131
+ url = title_elem.find('a')['href']
132
+
133
+ snippet_elem = result.find('div', class_='b_caption')
134
+ snippet = snippet_elem.find('p').text if snippet_elem and snippet_elem.find('p') else ""
135
+
136
+ results.append({
137
+ 'title': title,
138
+ 'url': url,
139
+ 'snippet': snippet
140
+ })
141
+ except Exception as e:
142
+ print(f"Web scraping error: {e}")
143
+
144
+ # If we still don't have results, create minimal placeholder results
145
+ # This ensures the UI doesn't break if all search methods fail
146
+ if not results:
147
+ results = [
148
  {
149
+ 'title': f"Search: {query}",
150
  'url': f"https://www.google.com/search?q={urllib.parse.quote(query)}",
151
+ 'snippet': "Search engine results for your query."
152
  }
153
  ]
154
 
155
  return results[:max_results]
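As a quick illustration (not from the commit) of the two Wikipedia endpoints search_web relies on: action=opensearch returns a JSON array of the form [query, titles, descriptions, urls], and action=query&prop=extracts returns intro text keyed by page ID:

import urllib.parse
import requests

q = "quantum computing"
open_url = f"https://en.wikipedia.org/w/api.php?action=opensearch&search={urllib.parse.quote(q)}&limit=3&namespace=0&format=json"
data = requests.get(open_url, timeout=5).json()
titles, urls = data[1], data[3]   # opensearch format: [term, titles, descriptions, urls]
print(list(zip(titles, urls)))

page_url = f"https://en.wikipedia.org/w/api.php?action=query&prop=extracts&exintro&explaintext&titles={urllib.parse.quote(titles[0])}&format=json"
pages = requests.get(page_url, timeout=5).json()["query"]["pages"]
page_id = next(iter(pages))       # "-1" would mean the title was not found
print(pages[page_id].get("extract", "")[:200])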
156
 
157
+ def generate_response(model, tokenizer, prompt, max_new_tokens=512):
158
+ """Generate response using the AI model with proper error handling"""
159
  try:
160
+ # For T5 models
161
+ if "t5" in MODEL_ID.lower():
162
+ inputs = tokenizer(prompt, return_tensors="pt", truncation=True, max_length=512).to(model.device)
163
 
164
  with torch.no_grad():
165
  outputs = model.generate(
166
  inputs.input_ids,
167
  max_new_tokens=max_new_tokens,
168
  temperature=0.7,
 
169
  do_sample=True
170
  )
171
 
172
  response = tokenizer.decode(outputs[0], skip_special_tokens=True)
173
  return response
174
 
175
+ # For Phi and other models
176
+ else:
177
+ if "phi" in MODEL_ID.lower():
178
+ formatted_prompt = f"Instruct: {prompt}\nOutput:"
179
+ else:
180
+ formatted_prompt = prompt
181
+
182
+ inputs = tokenizer(formatted_prompt, return_tensors="pt", truncation=True, max_length=512).to(model.device)
183
 
 
184
  with torch.no_grad():
185
  outputs = model.generate(
186
  inputs.input_ids,
187
  max_new_tokens=max_new_tokens,
188
  temperature=0.7,
189
  top_p=0.9,
 
190
  do_sample=True,
191
  pad_token_id=tokenizer.eos_token_id
192
  )
193
 
 
194
  response = tokenizer.decode(outputs[0][inputs.input_ids.size(1):], skip_special_tokens=True).strip()
195
 
196
+ # Check if response is empty or too short
197
+ if not response or len(response) < 10:
198
+ # Try again with different parameters
199
+ outputs = model.generate(
200
+ inputs.input_ids,
201
+ max_new_tokens=max_new_tokens,
202
+ num_beams=3, # Use beam search instead
203
+ temperature=1.0,
204
+ do_sample=False, # Deterministic generation
205
+ pad_token_id=tokenizer.eos_token_id
206
+ )
207
 
208
+ response = tokenizer.decode(outputs[0][inputs.input_ids.size(1):], skip_special_tokens=True).strip()
 
209
 
210
  return response
 
211
  except Exception as e:
212
  print(f"Error generating response: {e}")
213
+ # Return a simple error message
214
+ return "I encountered a technical issue while generating a response. Please try another query."
215
 
216
+ def ensure_citations(text, search_results):
217
  """Ensure citations are properly added to the text"""
218
+ # If text is too short, return a generic message
219
  if not text or len(text.strip()) < 10:
220
+ return "I couldn't generate a proper response for this query. Please try a different search term."
221
 
222
  # Add citations if not present
223
  if not re.search(r'\[\d+\]', text):
224
+ # Try to find snippets in the answer
225
  for i, result in enumerate(search_results, 1):
 
226
  key_phrases = result['snippet'].split('.')
227
  for phrase in key_phrases:
228
+ if phrase and len(phrase) > 15 and phrase.strip() in text:
229
  text = text.replace(phrase, f"{phrase} [{i}]", 1)
230
 
231
+ # If still no citations, add a generic one at the end
232
  if not re.search(r'\[\d+\]', text):
233
+ text += " [1]"
234
 
235
  return text
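A small sanity check (hypothetical inputs, not part of the commit) showing how ensure_citations appends a generic marker when the generated answer contains no [n] citations and no snippet phrase matches:

sources = [{
    "title": "Wikipedia - Computer",
    "url": "https://en.wikipedia.org/wiki/Computer",
    "snippet": "A computer is a programmable machine. It carries out sequences of arithmetic or logical operations.",
}]
print(ensure_citations("Computers execute stored instructions very quickly.", sources))
# -> "Computers execute stored instructions very quickly. [1]"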
236
 
237
+ def generate_related_topics(model, tokenizer, query, answer):
238
+ """Generate related topics based on the AI model"""
239
  try:
240
+ # Craft a prompt to generate related topics
241
+ related_prompt = f"""Based on the original search query "{query}" and the information in this answer:
242
+ "{answer[:300]}...", generate 3 related topics or questions that someone might want to explore next.
243
+ Each should be specific and directly related to the query but explore a different aspect.
244
+ Format as a simple list with 3 items only."""
245
+
246
+ # Use the model to generate topics
247
+ related_text = generate_response(model, tokenizer, related_prompt, max_new_tokens=200)
248
 
249
+ # Parse the generated text into individual topics
250
+ lines = related_text.split('\n')
251
+ topics = []
252
 
253
+ for line in lines:
254
+ # Clean up line by removing numbers, bullet points, etc.
255
+ clean_line = re.sub(r'^[\d\-\*\•\.\s]+', '', line.strip())
256
+ if clean_line and len(clean_line) > 5:
257
+ topics.append(clean_line)
258
+
259
+ # Ensure we have at least 3 topics
260
+ if len(topics) < 3:
261
+ # Add generic but relevant topics based on the query
262
+ base_topics = [
263
+ f"History of {query}",
264
+ f"Latest developments in {query}",
265
+ f"How does {query} work?",
266
+ f"Applications of {query}",
267
+ f"Future of {query}"
268
+ ]
269
+
270
+ # Add topics until we have at least 3
271
+ for topic in base_topics:
272
+ if len(topics) >= 3:
273
+ break
274
+ if topic not in topics:
275
+ topics.append(topic)
276
 
277
+ return topics[:3] # Return top 3 topics
278
+
279
+ except Exception as e:
280
+ print(f"Error generating related topics: {e}")
281
+ # Return generic topics as fallback
282
+ return [
283
+ f"More about {query}",
284
+ f"Latest developments in {query}",
285
+ f"Applications of {query}"
286
+ ]
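The list-marker cleanup in generate_related_topics can be checked in isolation; a brief sketch of what the regex strips from typical model output lines:

import re

for raw in ["1. Quantum error correction", "- Applications of qubits", "• How does entanglement work?"]:
    print(re.sub(r'^[\d\-\*\•\.\s]+', '', raw.strip()))
# prints: Quantum error correction / Applications of qubits / How does entanglement work?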
287
+
288
+ def process_query(query):
289
+ """Main function to process a query with real search and AI responses"""
290
+ try:
291
+ # Step 1: Search the web for real results
292
+ search_results = search_web(query, max_results=5)
293
+
294
+ # Step 2: Create context from search results
295
  context = f"Query: {query}\n\nSearch Results:\n\n"
296
 
297
  for i, result in enumerate(search_results, 1):
298
  context += f"Source {i}:\n"
299
  context += f"Title: {result['title']}\n"
300
+ context += f"URL: {result['url']}\n"
301
  context += f"Content: {result['snippet']}\n\n"
302
 
303
+ # Step 3: Create prompt for the AI model
304
  prompt = f"""You are a helpful AI assistant that provides accurate and comprehensive answers based on search results.
305
 
306
  {context}
307
 
308
+ Based on these search results, please provide a detailed answer to the query: "{query}"
309
  Include citations like [1], [2], etc. to reference the sources.
310
  Be factual and accurate. If the search results don't contain enough information, acknowledge this limitation.
311
  Format your answer in clear paragraphs with bullet points where appropriate."""
312
 
313
+ # Step 4: Generate answer using the AI model
314
+ answer = generate_response(model, tokenizer, prompt, max_new_tokens=512)
315
 
316
+ # Step 5: Ensure citations
317
+ answer = ensure_citations(answer, search_results)
318
 
319
+ # Step 6: Generate related topics using the AI model
320
+ related_topics = generate_related_topics(model, tokenizer, query, answer)
321
 
322
+ # Return the complete result
323
+ return {
324
  "answer": answer,
325
  "sources": search_results,
326
  "related_topics": related_topics
327
  }
 
 
 
328
 
329
  except Exception as e:
330
+ print(f"Error in process_query: {e}")
331
+ # Return a minimal result that won't break the UI
332
  return {
333
+ "answer": f"I encountered an error while processing your query about '{query}'. Please try again or try a different search term.",
334
+ "sources": search_web(query, max_results=2), # Try to get at least some sources
335
+ "related_topics": [f"More about {query}", f"Different aspects of {query}", f"Applications of {query}"]
336
  }
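A minimal sketch of exercising the new pipeline outside the Gradio UI (assumes the model and tokenizer loaded successfully above):

result = process_query("quantum computing")
print(result["answer"])
for i, src in enumerate(result["sources"], 1):
    print(f"[{i}] {src['title']} - {src['url']}")
print("Related:", result["related_topics"])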
337
 
338
  def format_sources(sources):
 
360
  if not topics:
361
  return ""
362
 
363
+ # Create HTML with unique IDs for each topic
364
  html = "<div style='display: flex; flex-wrap: wrap; gap: 10px; margin-top: 15px;'>"
365
  for i, topic in enumerate(topics):
366
+ # Each topic is a button with a unique ID
367
  html += f"""
368
  <div id="topic-{i}" style="background-color: #EFF6FF; padding: 10px 16px; border-radius: 100px;
369
  color: #2563EB; font-size: 14px; font-weight: 500; cursor: pointer; display: inline-block;
 
376
  """
377
  html += "</div>"
378
 
379
+ # Add JavaScript to handle topic clicks
380
  html += """
381
  <script>
382
+ // Set up event listeners for topic clicks
383
  function setupTopicClicks() {
384
+ // Find all topic elements
385
+ const topics = document.querySelectorAll('[id^="topic-"]');
386
+
387
+ // Add click listeners to each topic
388
+ topics.forEach(topic => {
389
+ topic.addEventListener('click', function() {
390
+ // Get the topic text
391
+ const topicText = this.getAttribute('data-topic');
392
+ console.log("Clicked topic:", topicText);
393
+
394
+ // Set input value to the topic text
395
+ const inputElement = document.getElementById('query-input');
396
+ if (inputElement) {
397
+ inputElement.value = topicText;
398
+
399
+ // Try multiple methods to trigger the search
400
+
401
+ // Method 1: Click the search button
402
+ const searchButton = document.querySelector('button[data-testid="submit"]');
403
+ if (searchButton) {
404
+ searchButton.click();
405
+ return;
406
  }
407
+
408
+ // Method 2: Try other button selectors
409
+ const altButton = document.querySelector('button[aria-label="Submit"]') ||
410
+ null; // ':contains' is not a valid CSS selector for querySelector; the text-content search below covers this case
411
+ if (altButton) {
412
+ altButton.click();
413
+ return;
414
+ }
415
+
416
+ // Method 3: Find button by text content
417
+ const buttons = Array.from(document.querySelectorAll('button'));
418
+ const searchBtn = buttons.find(btn =>
419
+ btn.textContent.includes('Search') ||
420
+ btn.innerHTML.includes('Search')
421
+ );
422
+
423
+ if (searchBtn) {
424
+ searchBtn.click();
425
+ return;
426
+ }
427
+
428
+ // Method 4: Trigger form submission directly
429
+ const form = inputElement.closest('form');
430
+ if (form) {
431
+ const event = new Event('submit', { bubbles: true });
432
+ form.dispatchEvent(event);
433
+ return;
434
+ }
435
+
436
+ console.log("Could not find a way to trigger search");
437
+ }
438
  });
439
+ });
 
 
440
  }
441
 
442
+ // Run the setup function
443
  setupTopicClicks();
444
 
445
+ // Set up an observer to handle dynamically loaded topics
446
+ const observer = new MutationObserver(function(mutations) {
447
+ mutations.forEach(function(mutation) {
448
+ if (mutation.addedNodes.length) {
449
+ setupTopicClicks();
450
+ }
 
 
 
 
451
  });
452
+ });
453
+
454
+ // Start observing the document
455
+ observer.observe(document.body, { childList: true, subtree: true });
456
+
457
+ // Text-match helper (named to avoid overriding the built-in Node.contains)
458
+ Element.prototype.containsText = function(text) {
459
+ return this.innerText.includes(text);
460
+ };
461
  </script>
462
  """
463
 
464
  return html
465
 
466
  def search_interface(query):
467
+ """Main function for the Gradio interface with progress updates"""
468
  if not query.strip():
469
  return (
470
  "Please enter a search query.",
 
476
 
477
  try:
478
  # Show loading message while processing
479
+ yield ("Searching and generating response...", "", "")
480
 
481
+ # Process the query
482
+ result = process_query(query)
 
 
 
 
483
 
484
  # Format answer with markdown
485
  answer_html = markdown(result["answer"])
 
504
  # Return a fallback response
505
  yield (
506
  "I encountered an issue while processing your query. Please try again with a different search term.",
507
+ "",
508
+ ""
509
  )
510
 
511
  # Create the Gradio interface with modern UI
 
611
  border-bottom: 1px solid currentColor;
612
  }
613
 
 
 
 
 
 
 
 
614
  /* Loading state */
615
  .answer.loading {
616
  display: flex;
 
640
  """
641
 
642
  with gr.Blocks(css=css, theme=gr.themes.Default()) as demo:
643
+ # Custom header with professional design
644
  gr.HTML("""
645
  <div class="header">
646
  <h1 style="color: #2563EB; font-size: 2.2rem; font-weight: 700; margin-bottom: 0.5rem;">🔍 AI Search System</h1>
647
  <p style="color: #64748B; font-size: 1.1rem; max-width: 600px; margin: 0 auto;">
648
+ Get comprehensive answers with real sources for any question.
649
  </p>
650
  </div>
651
  """)
 
697
  </footer>
698
  """)
699
 
700
+ # Launch app with queue for better performance
701
  demo.queue(max_size=10)
702
  demo.launch()
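Since search_interface yields a progress message before the final result, the UI relies on Gradio's support for generator callbacks. A self-contained sketch of that pattern with illustrative component names (the real Textbox/Button wiring is elided from this diff, and the real callback yields three outputs rather than one):

import time
import gradio as gr

def streamed_search(query):
    yield "Searching and generating response..."   # interim status shown first
    time.sleep(1)                                   # stand-in for search + generation
    yield f"Final answer for: {query}"

with gr.Blocks() as sketch:
    query_box = gr.Textbox(label="Query")
    answer_box = gr.Markdown()
    gr.Button("Search").click(streamed_search, inputs=query_box, outputs=answer_box)

sketch.queue(max_size=10)
sketch.launch()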