--- title: Named Entity Recognition Tool emoji: 🌍 colorFrom: purple colorTo: pink sdk: gradio sdk_version: 5.27.0 app_file: app.py pinned: false tags: - tool --- # Advanced Named Entity Recognition (NER) Tool for smolagents This repository contains an enhanced Named Entity Recognition tool built for the `smolagents` library from Hugging Face. This tool allows you to: - Identify named entities (people, organizations, locations, dates, etc.) in text - Choose from multiple NER models for different languages and use cases - Configure different output formats and confidence thresholds - Use with smolagents for AI agents that can understand entities in text ## Installation ```bash pip install smolagents transformers torch gradio ``` For faster inference on GPU: ```bash pip install smolagents transformers torch gradio accelerate ``` ## Basic Usage ```python from ner_tool import NamedEntityRecognitionTool # Initialize the NER tool ner_tool = NamedEntityRecognitionTool() # Analyze text with default settings result = ner_tool("Apple Inc. is planning to open a new store in Paris, France next year.") print(result) # Analyze with custom settings detailed_result = ner_tool( text="Apple Inc. is planning to open a new store in Paris, France next year.", model="Babelscape/wikineural-multilingual-ner", # Different model aggregation="detailed", # More detailed output format min_score=0.7 # Lower confidence threshold ) print(detailed_result) ``` ## Available Models The tool includes several pre-configured models: | Model ID | Description | |----------|-------------| | dslim/bert-base-NER | Standard NER (English) - Default | | jean-baptiste/camembert-ner | French NER | | Davlan/bert-base-multilingual-cased-ner-hrl | Multilingual NER | | Babelscape/wikineural-multilingual-ner | WikiNeural Multilingual NER | | flair/ner-english-ontonotes-large | OntoNotes English (fine-grained) | | elastic/distilbert-base-cased-finetuned-conll03-english | CoNLL (fast) | ## Output Formats The tool supports three output formats: 1. **Simple** - A simple list of entities found with their types and confidence scores 2. **Grouped** - Entities grouped by their category (default) 3. **Detailed** - A detailed analysis including the original text with entity markers ## Using with an Agent ```python from smolagents import CodeAgent, InferenceClientModel from ner_tool import NamedEntityRecognitionTool # Initialize the NER tool ner_tool = NamedEntityRecognitionTool() # Create an agent model model = InferenceClientModel( model_id="mistralai/Mistral-7B-Instruct-v0.2", token="your_huggingface_token" ) # Create the agent with our NER tool agent = CodeAgent(tools=[ner_tool], model=model) # Run the agent result = agent.run( "Analyze this text and identify all entities: 'The European Union and United Kingdom finalized a trade deal on Tuesday.'" ) print(result) ``` ## Interactive Gradio Interface For an interactive experience, run the Gradio app: ```bash python gradio_app.py ``` This provides a web interface where you can: - Enter custom text or select from samples - Choose different NER models - Configure display formats and confidence thresholds - See immediate results ## Customization Options ### Entity Confidence Score - Use `min_score` parameter to filter entities by confidence - Range: 0.0 (include all) to 1.0 (only highest confidence) - Default: 0.8 ### Entity Types The tool can identify various entity types including: - People (PER, PERSON) - Organizations (ORG, ORGANIZATION) - Locations (LOC, LOCATION, GPE) - Dates and Times (DATE, TIME) - Money and Percentages (MONEY, PERCENT) - Products (PRODUCT) - Events (EVENT) - Works of Art (WORK_OF_ART) - Laws (LAW) - Languages (LANGUAGE) - Facilities (FAC) - Miscellaneous (MISC) The exact entity types available depend on the chosen model. ## Sharing Your Tool You can share your tool on the Hugging Face Hub: ```python ner_tool.push_to_hub("your-username/advanced-ner-tool", token="your_huggingface_token") ``` ## Limitations - First-time model loading may take some time - Some models may require significant memory (especially larger ones) - Entity recognition accuracy varies by model and language ## Contributing Contributions are welcome! Feel free to open an issue or submit a pull request. ## License MIT