---
title: Agentic HF Analyzer
emoji: 🌍
colorFrom: yellow
colorTo: green
sdk: gradio
sdk_version: 5.32.1
app_file: app.py
pinned: false
short_description: Recommends which Repos/Spaces to look at
---

Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference

# 🚀 HF Repo Analyzer

An AI-powered Hugging Face repository discovery and analysis tool that helps you find, evaluate, and explore the best repositories for your specific needs.

![HF Repo Analyzer](https://img.shields.io/badge/Powered%20by-Gradio-orange) ![Python](https://img.shields.io/badge/Python-3.8+-blue) ![Hugging Face](https://img.shields.io/badge/Hugging%20Face-Spaces-yellow)

## ✨ Features

- 🤖 **AI Assistant**: Intelligent conversation-based repository discovery
- 🔍 **Smart Search**: Auto-detection of repository IDs vs. keywords
- 📊 **Automated Analysis**: LLM-powered repository evaluation and ranking
- 🏆 **Top 3 Selection**: AI-curated list of the most relevant repositories
- 💬 **Repository Explorer**: Interactive chat with repository contents
- 🎯 **Requirements Extraction**: Automatic keyword extraction from conversations
- 📋 **Comprehensive Results**: Detailed analysis with strengths, weaknesses, and specialities

## 🚦 Quick Start

### Prerequisites

- Python 3.8+
- OpenAI API key (for LLM analysis)
- Hugging Face access (for repository downloads)

### Installation

1. **Clone the repository**
   ```bash
   git clone
   cd Agentic_HF_Analyzer
   ```
2. **Install dependencies**
   ```bash
   pip install -r requirements.txt
   ```
3. **Set up environment variables**
   ```bash
   export modal_api="your_openai_api_key"
   export base_url="your_openai_base_url"
   ```
4. **Run the application**
   ```bash
   python app.py
   ```
5. **Open your browser** to `http://localhost:7860`

## 📖 User Guide

### 🤖 Using the AI Assistant (Recommended)
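The two environment variables from the Quick Start (`modal_api` and `base_url`) are the only credentials the app needs. A minimal fail-fast loading sketch — the helper name `load_llm_config` is illustrative, not part of the codebase:

```python
import os

def load_llm_config() -> dict:
    """Read the LLM credentials the app expects; fail fast with a clear message.

    Illustrative helper: the variable names come from this README, but the
    function itself is not part of the codebase.
    """
    config = {
        "api_key": os.environ.get("modal_api"),
        "base_url": os.environ.get("base_url"),
    }
    missing = [name for name, value in
               [("modal_api", config["api_key"]), ("base_url", config["base_url"])]
               if not value]
    if missing:
        raise RuntimeError(f"Missing environment variables: {', '.join(missing)}")
    return config
```

Failing at startup with the missing variable named is friendlier than an opaque authentication error from the LLM endpoint later.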
1. **Start a Conversation**
   - Navigate to the "🤖 AI Assistant" tab
   - Describe your project: "I'm building a chatbot for customer service"
   - The AI will ask clarifying questions about your needs
2. **Automatic Discovery**
   - When the AI has enough information, it will automatically:
     - Extract relevant keywords from your conversation
     - Search for matching repositories
     - Analyze and rank them by relevance
3. **Review Results**
   - The interface automatically switches to "🔬 Analysis & Results"
   - View the top 3 most relevant repositories
   - Browse all analyzed repositories with detailed insights

### 📝 Using Smart Search (Direct Input)

1. **Repository IDs**
   ```
   microsoft/DialoGPT-medium
   openai/whisper
   huggingface/transformers
   ```
2. **Keywords**
   ```
   text generation
   image classification
   sentiment analysis
   ```
3. **Mixed Input**
   - The system automatically detects the input type
   - Repository IDs (containing `/`) are processed directly
   - Keywords trigger an automatic repository search

### 🔬 Analyzing Results

- **Top 3 Repositories**: AI-selected as most relevant based on your requirements
- **Detailed Analysis**: Strengths, weaknesses, specialities, and relevance ratings
- **Quick Actions**: Click repository names to visit or explore them
- **Repository Explorer**: Deep dive into individual repositories with AI chat

### 🔍 Repository Explorer

1. **Access Methods**:
   - Click "🔍 Open in Repo Explorer" from repository actions
   - Manually enter a repository ID in the Repo Explorer tab
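The Mixed Input behaviour described above can be sketched as a small splitter. This is a hypothetical helper for illustration (`split_input` is not in the codebase; the app's own detection, `is_repo_id_format`, is shown under Technical Architecture):

```python
import re

def split_input(text):
    """Split raw Smart Search input into repo IDs (contain '/') and keyword queries.

    Hypothetical helper: entries may be separated by newlines or commas,
    mirroring the input formats shown in the guide above.
    """
    items = [part.strip() for part in re.split(r"[\n,]+", text) if part.strip()]
    repo_ids = [item for item in items if "/" in item]
    keywords = [item for item in items if "/" not in item]
    return repo_ids, keywords
```

For example, `split_input("openai/whisper, text generation")` yields `(["openai/whisper"], ["text generation"])`.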
2. **Features**:
   - Automatic repository loading and analysis
   - Interactive chat about repository contents
   - File structure exploration
   - Code analysis and explanations

## 🛠️ Technical Architecture

### Core Components

```
app.py                  # Main Gradio interface and orchestration
├── analyzer.py         # Repository analysis and LLM processing
├── hf_utils.py         # Hugging Face API interactions
├── chatbot_page.py     # AI assistant conversation logic
└── repo_explorer.py    # Repository exploration interface
```

### Key Features Implementation

#### 🤖 AI Assistant

- **System Prompt**: Focused on requirements gathering, not recommendations
- **Auto-Extraction**: Detects when the conversation is ready for keyword extraction
- **Smart Processing**: Converts natural language into actionable search queries

#### 🔍 Smart Input Detection

```python
import re

def is_repo_id_format(text: str) -> bool:
    """Detect whether the input consists of repository IDs (with '/') rather than keywords."""
    lines = [line.strip() for line in re.split(r'[\n,]+', text) if line.strip()]
    slash_count = sum(1 for line in lines if '/' in line)
    # Treat the input as repo IDs when at least half of the entries contain a slash
    return slash_count >= len(lines) * 0.5
```

#### 🏆 LLM-Powered Repository Ranking

- **Model**: `Orion-zhen/Qwen2.5-Coder-7B-Instruct-AWQ`
- **Criteria**: Requirements matching, strengths, relevance rating, speciality alignment
- **Output**: JSON-formatted repository rankings

#### 📊 Analysis Pipeline

1. **Download**: Repository files (`.py`, `.md`, `.txt`)
2. **Combine**: Merge files into a single analyzable document
3. **Analyze**: LLM evaluation for strengths, weaknesses, specialities
4. **Rank**: Relevance scoring against the user's requirements
5. **Select**: Top 3 most relevant repositories

### Data Flow

```mermaid
graph TD
    A[User Input] --> B{Input Type?}
    B -->|Keywords| C[Repository Search]
    B -->|Repo IDs| D[Direct Processing]
    C --> E[Repository List]
    D --> E
    E --> F[Download & Analyze]
    F --> G[LLM Evaluation]
    G --> H[Ranking & Selection]
    H --> I[Results Display]
    I --> J[Repository Explorer]
```

### File Structure

```
📦 Agentic_HF_Analyzer/
├── 📄 app.py              # Main application
├── 📄 analyzer.py         # Repository analysis logic
├── 📄 hf_utils.py         # Hugging Face utilities
├── 📄 chatbot_page.py     # AI assistant functionality
├── 📄 repo_explorer.py    # Repository exploration
├── 📄 requirements.txt    # Python dependencies
├── 📄 README.md           # Documentation
├── 📄 repo_ids.csv        # Analysis results storage
└── 📁 repo_files/         # Temporary repository downloads
```

### Dependencies

```
gradio>=4.0.0            # Web interface framework
pandas>=1.5.0            # Data manipulation
regex>=2022.0.0          # Advanced regex operations
openai>=1.0.0            # LLM API access
huggingface_hub>=0.16.0  # HF repository access
requests>=2.28.0         # HTTP requests
```

### Environment Variables

| Variable | Description | Required |
|----------|-------------|----------|
| `modal_api` | OpenAI API key for LLM analysis | ✅ |
| `base_url` | OpenAI API base URL | ✅ |

### LLM Integration

#### Analysis Prompt Structure

```python
ANALYSIS_PROMPT = """
Analyze this repository and provide:
1. Strengths and capabilities
2. Potential weaknesses or limitations
3. Primary speciality/use case
4. Relevance rating for: {user_requirements}

Return valid JSON with: strength, weaknesses, speciality, relevance rating
"""
```

#### Repository Ranking System

- **Input**: User requirements + repository analysis data
- **Processing**: The LLM evaluates relevance and ranks the repositories
- **Output**: The top 3 most relevant repositories, in order

### UI Components

#### Modern Design Features

- **Gradient Backgrounds**: Linear gradients for visual appeal
- **Glassmorphism**: Backdrop blur effects for a modern look
- **Responsive Layout**: Adapts to different screen sizes
- **Interactive Elements**: Hover effects and smooth transitions
- **Modal System**: Repository action selection popups

#### Tab Organization

1. **🤖 AI Assistant**: Conversation-based discovery
2. **📝 Smart Search**: Direct input processing
3. **🔬 Analysis & Results**: Comprehensive analysis display
4. **🔍 Repo Explorer**: Interactive repository exploration

### Advanced Features

#### Auto-Navigation

- Automatic tab switching based on workflow state
- Smooth scrolling to the top on tab changes
- Progressive disclosure of information

#### Error Handling

- Graceful fallbacks for LLM failures
- CSV update retry mechanisms
- User-friendly error messages

#### Performance Optimizations

- Parallel processing for multiple repositories
- Progress tracking for long operations
- Efficient file caching and cleanup

## 🔧 Configuration

### Customizing Analysis

- Modify `CHATBOT_SYSTEM_PROMPT` for different assistant behavior
- Adjust repository search limits in `search_top_spaces()`
- Configure analysis criteria in `get_top_relevant_repos()`

### Adding File Types

```python
# In analyzer.py: pass extra extensions to include more file types in the analysis
download_filtered_space_files(
    repo_id,
    local_dir="repo_files",
    file_extensions=['.py', '.md', '.txt', '.js', '.ts']  # Add more as needed
)
```

## 🤝 Contributing

1. Fork the repository
2. Create a feature branch
3. Implement your changes
4. Add tests if applicable
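Both the analysis and ranking calls expect JSON back from the model, and real replies sometimes arrive wrapped in prose or code fences — which is where the graceful fallbacks mentioned under Error Handling come in. A defensive parsing sketch (the helper name is illustrative, not the codebase's actual implementation):

```python
import json
import re

def parse_llm_json(reply):
    """Extract the first JSON object from an LLM reply.

    Illustrative fallback parser: tolerates replies wrapped in prose or
    code fences by locating the outermost braces before decoding.
    """
    match = re.search(r"\{.*\}", reply, re.DOTALL)
    if match is None:
        raise ValueError("no JSON object found in LLM reply")
    return json.loads(match.group(0))
```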
5. Submit a pull request

## 📄 License

This project is licensed under the MIT License - see the LICENSE file for details.

## 🙏 Acknowledgments

- **Gradio**: For the amazing web interface framework
- **Hugging Face**: For the incredible repository ecosystem
- **OpenAI**: For powerful language model capabilities

---

Built with ❤️ for the open source community

🚀 Happy repository hunting! 🚀