---
title: Prompt-Engineered Persona Agent
emoji: 🤖
colorFrom: blue
colorTo: red
sdk: gradio
sdk_version: 5.33.0
app_file: app.py
pinned: false
short_description: AI chatbot with a crafted personality (e.g., Wise Mentor)
---
Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference
# 🤖 Prompt-Engineered Persona Agent with Mini-RAG
This project is an agentic chatbot built with a quantized LLM (Gemma 1B) that behaves according to a customizable persona prompt. It features a lightweight Retrieval-Augmented Generation (RAG) system using TF-IDF + FAISS and dynamic context length estimation to optimize inference time, making it well suited for CPU-only environments like Hugging Face Spaces.
## 🚀 Features
- ✅ Customizable persona via system prompt
- ✅ Mini-RAG using TF-IDF + FAISS to retrieve relevant past conversation turns
- ✅ Efficient memory: only the most relevant chat history is used
- ✅ Dynamic context length estimation speeds up response time
- ✅ Gradio-powered UI
- ✅ Runs on free CPU
## 🧠 How It Works
- User submits a query along with a system persona prompt.
- Top-k similar past turns are retrieved using FAISS over TF-IDF vectors.
- Only relevant chat history is used to build the final prompt.
- The LLM generates a response based on the combined system prompt, retrieved context, and current user message.
- Context length (`n_ctx`) is dynamically estimated to minimize resource usage (a minimal sketch of these steps follows below).
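The retrieval and context-sizing steps above can be sketched roughly as follows. This is a minimal illustration rather than the code in `app.py`: it assumes scikit-learn's `TfidfVectorizer` and `faiss-cpu`, and the function names, the 4-characters-per-token heuristic, and the 512–2048 bounds are illustrative assumptions.

```python
# Minimal sketch (not the actual app.py): TF-IDF + FAISS retrieval over past
# chat turns, plus a rough, character-count-based n_ctx estimate.
import faiss
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer

def retrieve_relevant_turns(history: list[str], query: str, k: int = 3) -> list[str]:
    """Return the k past turns most similar to the query (TF-IDF + FAISS)."""
    if not history:
        return []
    vectorizer = TfidfVectorizer()
    doc_vectors = vectorizer.fit_transform(history).toarray().astype("float32")
    faiss.normalize_L2(doc_vectors)                    # unit norm -> inner product = cosine
    index = faiss.IndexFlatIP(doc_vectors.shape[1])
    index.add(doc_vectors)

    query_vec = vectorizer.transform([query]).toarray().astype("float32")
    faiss.normalize_L2(query_vec)
    _, idx = index.search(query_vec, min(k, len(history)))
    return [history[i] for i in idx[0] if i != -1]

def estimate_n_ctx(prompt: str, max_new_tokens: int = 256) -> int:
    """Very rough context-size estimate: ~4 characters per token, plus headroom."""
    prompt_tokens = len(prompt) // 4 + 1
    return min(2048, max(512, prompt_tokens + max_new_tokens))

# Illustrative usage:
history = ["I want to improve my study habits.", "Thanks, that helped a lot!"]
context = retrieve_relevant_turns(history, "Any more study tips?", k=2)
prompt = "\n".join(context) + "\nUser: Any more study tips?"
print(context, estimate_n_ctx(prompt))
```

Keeping `n_ctx` close to what the prompt actually needs is what makes CPU-only inference tolerable: a smaller KV cache means less memory and faster prompt processing.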
## 🧪 Example Personas
You can change the persona in the UI system prompt box:
- 🎓 "You are a wise academic advisor who offers up to 3 concise, practical suggestions."
- 🧘 "You are a calm mindfulness coach. Always reply gently and with encouragement."
- 🕵️ "You are an investigative assistant. Be logical, skeptical, and fact-focused."
## 📦 Installation
For local setup:

```bash
git clone https://huggingface.co/spaces/YOUR_USERNAME/Prompt-Persona-Agent
cd Prompt-Persona-Agent
pip install -r requirements.txt
```
Create an environment variable:

```bash
export HF_TOKEN=your_huggingface_token
```
Then run:

```bash
python app.py
```
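`HF_TOKEN` is presumably used to authenticate downloads of the quantized model from the Hub. A typical pattern looks like the following; the repo id and filename are placeholders, and this is an assumption about how `app.py` uses the token, not a description of its actual code.

```python
# Typical HF_TOKEN usage pattern (assumed, not taken from app.py).
import os
from huggingface_hub import hf_hub_download

model_path = hf_hub_download(
    repo_id="YOUR_USERNAME/your-gemma-1b-gguf",   # placeholder repo id
    filename="model.gguf",                        # placeholder file name
    token=os.environ.get("HF_TOKEN"),
)
```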
## 📁 Files
- `app.py`: Main application with chat + RAG + dynamic context
- `requirements.txt`: All Python dependencies
- `README.md`: This file
## 🛠️ Tech Stack

- Quantized Gemma 1B LLM for generation
- TF-IDF + FAISS for mini-RAG retrieval
- Gradio for the chat UI
- Hugging Face Spaces (free CPU tier) for hosting
## 📉 Limitations
- Basic TF-IDF + FAISS retrieval; it can be extended with semantic embedding models (see the sketch below).
- Not all LLMs strictly follow the persona; prompt tuning helps but is not perfect.
- For longer-term memory, a database plus a summarizer would work better.
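One way the retrieval could be upgraded to semantic embeddings, as suggested above, is sketched here. It assumes the `sentence-transformers` package (an extra dependency not listed in this repo); the model name and function names are illustrative.

```python
# Sketch of the semantic-embedding upgrade (assumes sentence-transformers).
import faiss
from sentence_transformers import SentenceTransformer

def build_semantic_index(history: list[str]):
    """Embed past turns with a sentence encoder and index them in FAISS."""
    encoder = SentenceTransformer("all-MiniLM-L6-v2")
    embeddings = encoder.encode(history, normalize_embeddings=True)  # unit norm -> cosine
    index = faiss.IndexFlatIP(embeddings.shape[1])
    index.add(embeddings)
    return encoder, index

def semantic_search(encoder, index, history: list[str], query: str, k: int = 3) -> list[str]:
    """Return the k past turns semantically closest to the query."""
    query_emb = encoder.encode([query], normalize_embeddings=True)
    _, idx = index.search(query_emb, min(k, len(history)))
    return [history[i] for i in idx[0]]
```

Unlike TF-IDF, this matches on meaning rather than word overlap, at the cost of a heavier model and slower CPU inference.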
## 🤗 Deploy to Hugging Face Spaces
The app runs entirely on CPU, so no paid GPU is required. Make sure your `HF_TOKEN` is set as a secret or environment variable in your Hugging Face Space.