Merge pull request #8 from gperdrizet/dev
- assets/html.py +17 -2
- functions/helper_functions.py +11 -0
assets/html.py
CHANGED
@@ -3,9 +3,24 @@
 TITLE = (
     '''
     <center>
-    <h1>RSS feed
+    <h1>RSS feed finder/extractor</h1>
     </center>
     '''
 )

-DESCRIPTION =
+DESCRIPTION = (
+    '''
+    Functions to find and extract RSS feeds are complete-ish. No AI
+    yet, plan for tomorrow is to build two tools:
+
+    <ol>
+        <li>Human readable summaries of requested RSS feed</li>
+        <li>Simple RAG on requested RSS feed content</li>
+    </ol>
+
+    For now we just dump the extracted RSS content below. Try asking
+    for a feed by website name, website URL, or entering your favorite
+    feed URI directly. Suggestions: http://openai.com/news/rss.xml,
+    hackernews.com, Hugging Face, etc
+    '''
+)
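The new DESCRIPTION reads like interface copy for a small demo app. As a rough sketch only, here is one way these strings might be rendered with Gradio, assuming the assets/html.py module path shown above; the Blocks layout and component choices are assumptions, not part of this PR:

    import gradio as gr

    from assets.html import TITLE, DESCRIPTION

    # Hypothetical layout: show the HTML strings at the top of the page
    with gr.Blocks() as demo:
        gr.HTML(TITLE)          # centered <h1>RSS feed finder/extractor</h1>
        gr.HTML(DESCRIPTION)    # intro copy with the <ol> tool roadmap

    demo.launch()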
functions/helper_functions.py
CHANGED
@@ -209,6 +209,9 @@ def get_html(url: str) -> str:

             content = content.decode(encoding)

+        else:
+            content = None
+
     except HTTPError:
         content = None

@@ -227,6 +230,9 @@ def get_text(html: str) -> str:

     Returns:
         Cleaned text string'''
+
+    if html is None:
+        return None

     extractor = extractors.ArticleExtractor()

@@ -236,6 +242,11 @@ def get_text(html: str) -> str:
     except HTMLExtractionError:
         pass

+    except AttributeError:
+        pass
+
+    except TypeError:
+        pass

     return clean_html(html)

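Taken together, these helper changes make get_html and get_text fail soft: a bad fetch, decode, or extraction now propagates as None instead of raising, so callers can branch on the result. A minimal usage sketch under that assumption (the example URL is hypothetical):

    from functions.helper_functions import get_html, get_text

    url = 'https://hf.co/blog'   # hypothetical example URL
    html = get_html(url)         # None if the request or decode path fails
    text = get_text(html)        # short-circuits to None when html is None

    if text is None:
        print('Could not fetch or extract readable text')
    else:
        print(text[:500])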