---
title: Prompt-Engineered Persona Agent
emoji: 📈
colorFrom: blue
colorTo: red
sdk: gradio
sdk_version: 5.33.0
app_file: app.py
pinned: false
short_description: AI chatbot with a crafted personality (e.g., Wise Mentor)
---


# 🤖 Prompt-Engineered Persona Agent with Mini-RAG

This project is an agentic chatbot built on a quantized LLM (Gemma 1B) that behaves according to a customizable persona prompt. It includes a lightweight Retrieval-Augmented Generation (RAG) step that uses **TF-IDF + FAISS** to retrieve relevant earlier turns, plus **dynamic context length estimation** to reduce inference time. Together these make it a good fit for CPU-only environments such as Hugging Face Spaces.

---

## 🚀 Features

* ✅ **Customizable Persona** via system prompt
* ✅ **Mini-RAG** using TF-IDF + FAISS to retrieve relevant past conversation turns
* ✅ **Efficient memory**: only the most relevant chat history is added to the prompt
* ✅ **Dynamic context length** estimation speeds up response time
* ✅ Gradio-powered UI
* ✅ Runs on free CPU hardware (no GPU required)
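
To make the Mini-RAG idea concrete, here is a minimal, dependency-free sketch of TF-IDF retrieval over past turns. It is a stand-in, not the code in `app.py`: the project itself uses scikit-learn for TF-IDF and FAISS for the search, while this version computes the weights and cosine similarity in plain Python. The function names (`tfidf_vectors`, `top_k_turns`), the tokenizer, and the sample history are all assumptions made for illustration.

```python
import math
import re
from collections import Counter


def tfidf_vectors(docs):
    """Return one sparse, L2-normalized TF-IDF dict per document."""
    tokenized = [re.findall(r"[a-z0-9]+", doc.lower()) for doc in docs]
    n_docs = len(tokenized)
    # Document frequency: the number of documents each term appears in.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        tf = Counter(tokens)
        # Smoothed IDF, so terms found in every document keep a nonzero weight.
        vec = {t: c * (math.log((1 + n_docs) / (1 + df[t])) + 1) for t, c in tf.items()}
        norm = math.sqrt(sum(w * w for w in vec.values())) or 1.0
        vectors.append({t: w / norm for t, w in vec.items()})
    return vectors


def top_k_turns(history, query, k=2):
    """Return the k past turns most similar to the query (cosine similarity)."""
    if not history:
        return []
    # Vectorize history and query together so they share one vocabulary.
    *turn_vecs, query_vec = tfidf_vectors(history + [query])
    scores = [
        sum(w * vec.get(t, 0.0) for t, w in query_vec.items())
        for vec in turn_vecs
    ]
    ranked = sorted(range(len(history)), key=lambda i: scores[i], reverse=True)
    return [history[i] for i in ranked[:k]]


# Hypothetical chat history, for illustration only.
history = [
    "User: How do I prepare for my calculus exam?",
    "User: What should I cook for dinner tonight?",
    "User: Any tips for the calculus exam next week?",
]
relevant = top_k_turns(history, "I am nervous about the calculus exam", k=2)
```

Because the vectors are L2-normalized, the dot product equals cosine similarity. An exact inner-product index such as FAISS's `IndexFlatIP` would compute the same ranking over dense vectors.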

---

## 🧠 How It Works

1. **User submits a query** along with a system persona prompt.
2. **Top-k similar past turns** are retrieved using FAISS over TF-IDF vectors.
3. Only **relevant chat history** is used to build the final prompt.
4. The LLM generates a response based on the combined system prompt, retrieved context, and current user message.
5. Context length (`n_ctx`) is dynamically estimated to minimize resource usage.
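
Steps 3 to 5 can be sketched as follows. This is only an illustration under stated assumptions, not the logic in `app.py`: the helper names (`build_prompt`, `estimate_n_ctx`), the plain-text prompt layout, the four-characters-per-token heuristic, and the bounds and step size are all hypothetical. The real chat template depends on the model, and a tokenizer count would be more accurate than a character estimate.

```python
def build_prompt(system_prompt, retrieved_turns, user_message):
    """Combine the persona, retrieved history, and the new message into one prompt."""
    parts = [f"System: {system_prompt}"]
    if retrieved_turns:
        parts.append("Relevant earlier conversation:")
        parts.extend(retrieved_turns)
    parts.append(f"User: {user_message}")
    parts.append("Assistant:")
    return "\n".join(parts)


def estimate_n_ctx(prompt, max_new_tokens=256, chars_per_token=4,
                   step=256, min_ctx=512, max_ctx=4096):
    """Guess a context window that fits the prompt plus the reply.

    Assumes roughly `chars_per_token` characters per token (a heuristic).
    """
    prompt_tokens = -(-len(prompt) // chars_per_token)  # ceiling division
    needed = prompt_tokens + max_new_tokens
    rounded = -(-needed // step) * step  # round up to a multiple of `step`
    # Clamp to the assumed lower and upper bounds.
    return max(min_ctx, min(rounded, max_ctx))


prompt = build_prompt(
    "You are a calm mindfulness coach. Always reply gently and with encouragement.",
    ["User: I could not sleep before my exam."],  # hypothetical retrieved turn
    "How can I relax tonight?",
)
n_ctx = estimate_n_ctx(prompt)
```

A value like this could then be passed as the `n_ctx` argument when constructing the `Llama` object in llama-cpp-python, so short conversations do not pay for a large context window.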

---

## 🧪 Example Personas

Change the persona by editing the system prompt box in the UI. For example:

* 📚 `"You are a wise academic advisor who offers up to 3 concise, practical suggestions."`
* 🧘 `"You are a calm mindfulness coach. Always reply gently and with encouragement."`
* 🕵️ `"You are an investigative assistant. Be logical, skeptical, and fact-focused."`

---

## 📦 Installation

**For local setup:**

```bash
git clone https://huggingface.co/spaces/YOUR_USERNAME/Prompt-Persona-Agent
cd Prompt-Persona-Agent
pip install -r requirements.txt
```

Set your Hugging Face token as an environment variable:

```bash
export HF_TOKEN=your_huggingface_token
```

Then run:

```bash
python app.py
```

---

## 📁 Files

* `app.py`: Main application with chat + RAG + dynamic context
* `requirements.txt`: All Python dependencies
* `README.md`: This file

---

## 🛠️ Tech Stack

* [Gradio](https://gradio.app/)
* [llama-cpp-python](https://github.com/abetlen/llama-cpp-python)
* [FAISS](https://github.com/facebookresearch/faiss)
* [scikit-learn (TF-IDF)](https://scikit-learn.org/)
* [Gemma 1B IT GGUF](https://huggingface.co/google/gemma-1.1-1b-it-gguf)

---

## 📌 Limitations

* Retrieval is basic TF-IDF + FAISS; it could be extended with semantic embedding models.
* Not all LLMs follow a persona strictly; prompt tuning helps but is not perfect.
* Longer-term memory would be better served by a database plus a summarizer.

---

## 📤 Deploy to Hugging Face Spaces

> Uses only CPU, no paid GPU required.

Make sure your `HF_TOKEN` is set as a secret or environment variable in your Hugging Face Space.
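
As a hedged sketch of how an app could read that token at startup, the snippet below fails fast with a clear message when `HF_TOKEN` is missing. The helper name `get_hf_token` is hypothetical and is not taken from `app.py`; it relies only on the fact that Space secrets are exposed to the app as environment variables.

```python
import os


def get_hf_token(env=os.environ):
    """Return the Hugging Face token, or raise a clear error if it is missing."""
    token = env.get("HF_TOKEN", "").strip()
    if not token:
        raise RuntimeError(
            "HF_TOKEN is not set. Add it as a Space secret or export it locally."
        )
    return token
```

Calling `get_hf_token()` once before loading the model surfaces a missing secret immediately, rather than as a download error later.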

---