---
title: Prompt-Engineered Persona Agent
emoji: 🤖
colorFrom: blue
colorTo: red
sdk: gradio
sdk_version: 5.33.0
app_file: app.py
pinned: false
short_description: AI chatbot with a crafted personality (e.g., Wise Mentor)
---
Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference
# 🤖 Prompt-Engineered Persona Agent with Mini-RAG
This project is an agentic chatbot built with a quantized LLM (Gemma 1B) that behaves according to a customizable persona prompt. It features a lightweight Retrieval-Augmented Generation (RAG) system using TF-IDF + FAISS and dynamic context length estimation to optimize inference time, making it well suited for CPU-only environments like Hugging Face Spaces.
## 🚀 Features
- ✅ Customizable persona via system prompt
- ✅ Mini-RAG using TF-IDF + FAISS to retrieve relevant past conversation turns
- ✅ Efficient memory: only the most relevant chat history is used
- ✅ Dynamic context length estimation speeds up response time
- ✅ Gradio-powered UI
- ✅ Runs on free CPU
## 🧠 How It Works
- User submits a query along with a system persona prompt.
- Top-k similar past turns are retrieved using FAISS over TF-IDF vectors.
- Only relevant chat history is used to build the final prompt.
- The LLM generates a response based on the combined system prompt, retrieved context, and current user message.
- Context length (`n_ctx`) is dynamically estimated to minimize resource usage (a minimal sketch of these steps follows below).
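The retrieval and context-sizing steps above can be sketched roughly as follows. This is a minimal illustration rather than the code in `app.py`: it assumes scikit-learn's `TfidfVectorizer` and `faiss-cpu`, and the function names, the 4-characters-per-token heuristic, and the 512–2048 bounds are illustrative assumptions.

```python
# Minimal sketch (not the actual app.py): TF-IDF + FAISS retrieval over past
# chat turns, plus a rough, character-count-based n_ctx estimate.
import faiss
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer

def retrieve_relevant_turns(history: list[str], query: str, k: int = 3) -> list[str]:
    """Return the k past turns most similar to the query (TF-IDF + FAISS)."""
    if not history:
        return []
    vectorizer = TfidfVectorizer()
    doc_vectors = vectorizer.fit_transform(history).toarray().astype("float32")
    faiss.normalize_L2(doc_vectors)                    # unit norm -> inner product = cosine
    index = faiss.IndexFlatIP(doc_vectors.shape[1])
    index.add(doc_vectors)

    query_vec = vectorizer.transform([query]).toarray().astype("float32")
    faiss.normalize_L2(query_vec)
    _, idx = index.search(query_vec, min(k, len(history)))
    return [history[i] for i in idx[0] if i != -1]

def estimate_n_ctx(prompt: str, max_new_tokens: int = 256) -> int:
    """Very rough context-size estimate: ~4 characters per token, plus headroom."""
    prompt_tokens = len(prompt) // 4 + 1
    return min(2048, max(512, prompt_tokens + max_new_tokens))

# Illustrative usage:
history = ["I want to improve my study habits.", "Thanks, that helped a lot!"]
context = retrieve_relevant_turns(history, "Any more study tips?", k=2)
prompt = "\n".join(context) + "\nUser: Any more study tips?"
print(context, estimate_n_ctx(prompt))
```

Keeping `n_ctx` close to what the prompt actually needs is what makes CPU-only inference tolerable: a smaller KV cache means less memory and faster prompt processing.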
## 🧪 Example Personas
You can change the persona in the UI system prompt box:
- 🎓 "You are a wise academic advisor who offers up to 3 concise, practical suggestions."
- 🧘 "You are a calm mindfulness coach. Always reply gently and with encouragement."
- 🕵️ "You are an investigative assistant. Be logical, skeptical, and fact-focused."
## 📦 Installation
For local setup:

```bash
git clone https://huggingface.co/spaces/YOUR_USERNAME/Prompt-Persona-Agent
cd Prompt-Persona-Agent
pip install -r requirements.txt
```
Create an environment variable:

```bash
export HF_TOKEN=your_huggingface_token
```
Then run:

```bash
python app.py
```
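`HF_TOKEN` is presumably used to authenticate downloads of the quantized model from the Hub. A typical pattern looks like the following; the repo id and filename are placeholders, and this is an assumption about how `app.py` uses the token, not a description of its actual code.

```python
# Typical HF_TOKEN usage pattern (assumed, not taken from app.py).
import os
from huggingface_hub import hf_hub_download

model_path = hf_hub_download(
    repo_id="YOUR_USERNAME/your-gemma-1b-gguf",   # placeholder repo id
    filename="model.gguf",                        # placeholder file name
    token=os.environ.get("HF_TOKEN"),
)
```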
## 📁 Files
- `app.py`: Main application with chat + RAG + dynamic context
- `requirements.txt`: All Python dependencies
- `README.md`: This file
## 🛠️ Tech Stack

- Quantized Gemma 1B LLM for generation
- TF-IDF + FAISS for mini-RAG retrieval
- Gradio for the chat UI
- Hugging Face Spaces (free CPU tier) for hosting
## 📉 Limitations
- Basic TF-IDF + FAISS retrieval; it can be extended with semantic embedding models (see the sketch below).
- Not all LLMs strictly follow the persona; prompt tuning helps but is not perfect.
- For longer-term memory, a database plus a summarizer would work better.
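One way the retrieval could be upgraded to semantic embeddings, as suggested above, is sketched here. It assumes the `sentence-transformers` package (an extra dependency not listed in this repo); the model name and function names are illustrative.

```python
# Sketch of the semantic-embedding upgrade (assumes sentence-transformers).
import faiss
from sentence_transformers import SentenceTransformer

def build_semantic_index(history: list[str]):
    """Embed past turns with a sentence encoder and index them in FAISS."""
    encoder = SentenceTransformer("all-MiniLM-L6-v2")
    embeddings = encoder.encode(history, normalize_embeddings=True)  # unit norm -> cosine
    index = faiss.IndexFlatIP(embeddings.shape[1])
    index.add(embeddings)
    return encoder, index

def semantic_search(encoder, index, history: list[str], query: str, k: int = 3) -> list[str]:
    """Return the k past turns semantically closest to the query."""
    query_emb = encoder.encode([query], normalize_embeddings=True)
    _, idx = index.search(query_emb, min(k, len(history)))
    return [history[i] for i in idx[0]]
```

Unlike TF-IDF, this matches on meaning rather than word overlap, at the cost of a heavier model and slower CPU inference.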
## 🤗 Deploy to Hugging Face Spaces
The app runs entirely on CPU, so no paid GPU is required. Make sure your `HF_TOKEN` is set as a secret or environment variable in your Hugging Face Space.