---
title: Prompt-Engineered Persona Agent
emoji: 📈
colorFrom: blue
colorTo: red
sdk: gradio
sdk_version: 5.33.0
app_file: app.py
pinned: false
short_description: AI chatbot with a crafted personality (e.g., Wise Mentor)
---

Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference

# 🤖 Prompt-Engineered Persona Agent with Mini-RAG

This project is an agentic chatbot built with a quantized LLM (`Gemma 1B`) that behaves according to a customizable persona prompt. It features a lightweight Retrieval-Augmented Generation (RAG) system using **TF-IDF + FAISS**, and **dynamic context length estimation** to reduce inference time, which makes it well suited to CPU-only environments like Hugging Face Spaces.

---

## 🚀 Features

* ✅ **Customizable Persona** via system prompt
* ✅ **Mini-RAG** using TF-IDF + FAISS to retrieve relevant past conversation turns
* ✅ **Efficient memory**: only the most relevant chat history is used
* ✅ **Dynamic context length** estimation speeds up response time
* ✅ Gradio-powered UI
* ✅ Runs on free CPU hardware

---

## 🧠 How It Works

1. **User submits a query** along with a system persona prompt.
2. **Top-k similar past turns** are retrieved using FAISS over TF-IDF vectors.
3. Only **relevant chat history** is used to build the final prompt.
4. The LLM generates a response based on the combined system prompt, retrieved context, and current user message.
5. Context length (`n_ctx`) is dynamically estimated to minimize resource usage.

---

## 🧪 Example Personas

You can change the persona in the UI system prompt box:

* 📚 `"You are a wise academic advisor who offers up to 3 concise, practical suggestions."`
* 🧘 `"You are a calm mindfulness coach. Always reply gently and with encouragement."`
* 🕵️ `"You are an investigative assistant. Be logical, skeptical, and fact-focused."`

---

## 📦 Installation

**For local setup:**

```bash
git clone https://huggingface.co/spaces/YOUR_USERNAME/Prompt-Persona-Agent
cd Prompt-Persona-Agent
pip install -r requirements.txt
```

Set your Hugging Face token as an environment variable:

```bash
export HF_TOKEN=your_huggingface_token
```

Then run:

```bash
python app.py
```

---

## 📁 Files

* `app.py`: Main application with chat + RAG + dynamic context
* `requirements.txt`: All Python dependencies
* `README.md`: This file

---

## 🛠️ Tech Stack

* [Gradio](https://gradio.app/)
* [llama-cpp-python](https://github.com/abetlen/llama-cpp-python)
* [FAISS](https://github.com/facebookresearch/faiss)
* [scikit-learn (TF-IDF)](https://scikit-learn.org/)
* [Gemma 1B IT GGUF](https://huggingface.co/google/gemma-1.1-1b-it-gguf)

---

## 📌 Limitations

* Retrieval is basic TF-IDF + FAISS; it can be extended with semantic embedding models.
* Not all LLMs strictly follow the persona; prompt tuning helps but is not perfect.
* For longer-term memory, a database + summarizer would be better.

---

## 📤 Deploy to Hugging Face Spaces

> Uses only CPU, no paid GPU required.

Make sure your `HF_TOKEN` is set as a secret or environment variable in your Hugging Face Space.

---
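
The retrieval and context-sizing steps described under "How It Works" can be illustrated with a minimal sketch. This is not the code in `app.py`: to stay dependency-free it swaps scikit-learn and FAISS for a hand-rolled TF-IDF and a brute-force cosine search. All function names, the 4-characters-per-token heuristic, the rounding to multiples of 256, and the floor/ceiling values are assumptions chosen for illustration.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase and split on non-alphanumeric characters."""
    return re.findall(r"[a-z0-9]+", text.lower())


def tfidf_vectors(docs):
    """Return one sparse, L2-normalized TF-IDF dict per document (smoothed IDF)."""
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(docs)
    doc_freq = Counter(term for toks in tokenized for term in set(toks))
    vectors = []
    for toks in tokenized:
        counts = Counter(toks)
        vec = {t: c * (math.log((1 + n_docs) / (1 + doc_freq[t])) + 1.0)
               for t, c in counts.items()}
        norm = math.sqrt(sum(w * w for w in vec.values())) or 1.0
        vectors.append({t: w / norm for t, w in vec.items()})
    return vectors


def retrieve_top_k(history, query, k=2):
    """Brute-force stand-in for the FAISS search: top-k turns by cosine similarity."""
    if not history:
        return []
    # Assumption: IDF is fitted on history plus the query, for simplicity.
    *turn_vecs, query_vec = tfidf_vectors(history + [query])
    scores = [sum(w * tv.get(t, 0.0) for t, w in query_vec.items())
              for tv in turn_vecs]
    ranked = sorted(range(len(history)), key=lambda i: scores[i], reverse=True)
    # Drop turns that share no terms with the query.
    return [history[i] for i in ranked[:k] if scores[i] > 0.0]


def build_prompt(system_prompt, retrieved, user_message):
    """Combine persona, retrieved turns, and the current message (hypothetical layout)."""
    context = "\n".join(f"- {turn}" for turn in retrieved)
    return (f"{system_prompt}\n\nRelevant history:\n{context}\n\n"
            f"User: {user_message}\nAssistant:")


def estimate_n_ctx(prompt, max_new_tokens=256, chars_per_token=4,
                   floor=512, ceiling=4096):
    """Heuristic n_ctx: estimated prompt tokens plus reply budget, rounded up to 256."""
    needed = math.ceil(len(prompt) / chars_per_token) + max_new_tokens
    rounded = math.ceil(needed / 256) * 256
    return max(floor, min(ceiling, rounded))


history = [
    "I have a calculus exam next week.",
    "My cat knocked over a plant yesterday.",
    "Which calculus topics are hardest?",
]
query = "How should I prepare for the calculus exam?"
retrieved = retrieve_top_k(history, query, k=2)
prompt = build_prompt("You are a wise academic advisor.", retrieved, query)
n_ctx = estimate_n_ctx(prompt)
```

In this example the unrelated turn about the cat shares no terms with the query, so it is excluded from the prompt. The estimated `n_ctx` could then be passed to the model loader instead of a fixed maximum.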