*Image created with ChatGPT-4o using a link to this model repository and a brief prompt.*
# FinPlan-1
FinPlan-1 is an LLM trained to assist with the creation of basic personal financial plans for individuals. It is built on the Fino1-8B model, which is itself a version of Llama-3.1-8B-Instruct that was CoT fine-tuned to improve its financial reasoning ability.
## Model Details
### Model Description: Introduction
According to Bankrate’s 2025 Emergency Savings Report, only 41% of Americans would be able to use their personal savings to pay for a $1,000 emergency expense, with the rest “financing it with a credit card they’d pay off over time, reducing their spending on other things, taking out a personal loan, borrowing from family or friends or other methods.”
The financial health of Americans depends on a number of factors, but one important component is basic financial literacy and having a financial plan. The financial planning component is one area where I think LLMs can be of assistance. This LLM is my attempt to further train and fine-tune a model already trained on financial reasoning tasks to assist individuals with two key aspects of financial planning:
- Assist with the creation of a budget spreadsheet, enabling individuals to keep track of their finances and understand where their money is going.
- Aid with planning for short-, medium- and long-term goals, including breaking those goals down into monthly savings targets and suggesting broad investment vehicles to fit each goal's timeframe.
While current LLMs can perform these tasks to an extent, they are often inconsistent in their response structure, can struggle with breaking down basic mathematics, and frequently go beyond the basic task at hand, recommending inappropriate savings and investment vehicles for individual savings goals. The Fino1-8B model is certainly well trained for corporate financial reasoning tasks, but its recommendations for savings and investment vehicles were often too aggressive for short-term goals, and it sometimes recommended long-term savings vehicles that carry tax penalties if not used appropriately. This model uses LoRA on a procedurally generated budgeting dataset, plus few-shot prompting with a separate dataset built around short-, medium- and long-term goals, to enhance the ability of Fino1-8B to accomplish these tasks.
The results of this training and prompting method are encouraging: the model consistently produces budget spreadsheets (through the generation of executable Python code) as well as fairly reliable savings plan assistance when few-shot prompting is used. These training methods do affect the model's performance on standard benchmarks like GSM8K and MMLU, resulting in drops on both compared with the base model; however, this loss in generalization is made up for by the model's improved ability to assist individuals with budgeting and fixed-term savings goals.
- Developed by: Timothy Austin Rodriguez
- Funded by: University of Virginia
- Training type: LoRA + few-shot prompting (3-shot)
- Language(s) (NLP): Python
- License: MIT
- Finetuned from model: Fino1-8B (itself fine-tuned from Llama-3.1-8B-Instruct)
## Training Data
This model is trained on a procedurally generated synthetic dataset of structured prompts and responses that teach the underlying Fino1-8B model to produce executable Python code that creates a budget spreadsheet and exports it to Microsoft Excel's .xlsx format. This dataset (attached to this repository as budget_dataset.csv) comprises 3,000 examples, divided into a train/validation split of 2,500 for training and 500 for validation. The code used to create and randomize this dataset, including the seeds (42 for randomization, 60 for creation), can be found in the ipynb files attached to this repository.
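The actual generator lives in the attached ipynb files; purely as an illustration of the procedural pattern (the value ranges below are hypothetical stand-ins, and the paired responses are omitted), an example row might be built like this:

```python
import random
import pandas as pd

random.seed(60)  # creation seed noted above

rows = []
for _ in range(3000):
    # Hypothetical value ranges; the real notebook defines its own
    income = random.randint(25000, 250000)
    rent = random.randint(800, 3500)
    car = random.randint(0, 900)
    food = random.randint(200, 1200)
    other = random.randint(50, 600)
    question = (
        f"I have an income of about {income} a year and my monthly expenses include "
        f"{rent} a month in rent and utilities, a {car} car payment, ${food} in food, "
        f"and about {other} a month in other expenses. Using python, can you create "
        f"for me a budget spreadsheet and export it to excel?"
    )
    rows.append({"question": question})

pd.DataFrame(rows).to_csv("budget_dataset.csv", index=False)
```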
While not used to train this model, a secondary dataset for improving the model's performance on short-, medium- and long-term goal planning was also developed, generated much like the first through random procedural generation of 3,000 example prompts and responses. The random seeds and train/validation split code can be found in the same ipynb file as the budget dataset. This dataset is called goals_dataset.csv. It was not used to train the final model due to poor performance encountered when leveraging LoRA for additional training: the model actually performed worse when prompted with an example from the validation dataset after training than before. A deeper exploration of why this occurred is warranted, and training/tuning methods beyond LoRA should be considered for future enhancement of this model.
## Training Method
The training/tuning method for this model is the Parameter-Efficient Fine-Tuning (PEFT) technique called Low-Rank Adaptation (LoRA). LoRA is well suited to tuning a model for domain-specific tasks such as creating personal financial plans. It is significantly more efficient than full fine-tuning, requiring less compute, and is much more memory-efficient because far fewer model weights are changed. In many cases LoRA yields results very similar to full fine-tuning without the heavy computational expense. This method was chosen given the time allocated for training this model, the limited compute available due to competing requests for GPU time on the University of Virginia's Rivanna High Performance Computing cluster, and the desire to approach full fine-tuning quality despite those constraints. The LoRA hyperparameter values were selected through experimentation and can be found in one of the ipynb files attached to this repository and in the summary below.
### Hyperparameters
- LORA_R = 64
- LORA_ALPHA = 64
- LORA_DROPOUT = 0.05
### Tuning/Training Settings
- learning_rate = 0.00001
- epochs = 5
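For reference, here is a minimal sketch of how these values plug into the peft and transformers APIs. The target modules are left at peft's defaults for Llama-family models, and the output path is a hypothetical placeholder; the exact configuration is in the attached ipynb files.

```python
from transformers import AutoModelForCausalLM, TrainingArguments
from peft import LoraConfig, TaskType, get_peft_model

base = AutoModelForCausalLM.from_pretrained("TheFinAI/Fino1-8B")

lora_config = LoraConfig(
    task_type=TaskType.CAUSAL_LM,
    r=64,               # LORA_R
    lora_alpha=64,      # LORA_ALPHA
    lora_dropout=0.05,  # LORA_DROPOUT
)
peft_model = get_peft_model(base, lora_config)
peft_model.print_trainable_parameters()  # sanity check: only adapter weights train

training_args = TrainingArguments(
    output_dir="finplan1-lora",  # hypothetical output path
    learning_rate=1e-5,          # 0.00001, as above
    num_train_epochs=5,
)
# Pass peft_model and training_args to a transformers Trainer
# along with the tokenized train/validation splits described above.
```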
In addition, this model makes use of few-shot prompting due to the aforementioned poor performance of LoRA when training on the goals dataset. Few-shot prompting was found to improve the model's ability to produce the desired response structure without degrading its performance, as was observed with LoRA regardless of the hyperparameters selected. Example code for implementing the appropriate few-shot prompting is available in one of the provided ipynb files in this repository.
## Evaluation
| Model | GSM8K | MMLU | Budget Example | Goals Example |
|---|---|---|---|---|
| Fino1-8B | 63.33 | 66.84 | Provides code, but not in the desired format; provides narrative (not desired) | Reasonable response, but not the desired structure |
| Llama-3.2-3B-Instruct | 50.00 | 61.11 | Provides code, but not in the desired format | Close to the desired format, but recommends an inappropriate savings vehicle for the long-term goal |
| Ministral-8B-Instruct-2410 | 66.66 | 64.50 | Provides code, but not in the desired format | Reasonable format, but recommends an inappropriate savings vehicle for the long-term goal |
| FinPlan-1 | 53.33 | 65.73 | Provides code in the desired format | Reasonable format, with reasonable savings/investment vehicles recommended |
The benchmarks chosen (GSM8K, MMLU, and the two synthetic dataset examples) were selected to provide a view of the model's performance both in terms of its generalization ability and its ability to perform the tasks it is trained to accomplish. As the model that FinPlan-1 is based on, Fino1-8B is a natural comparison for benchmarking. Llama-3.2-3B-Instruct is a newer, albeit smaller, member of the model family underlying Fino1-8B; given its rather decent performance on the financial planning tasks, it serves as a good comparison for FinPlan-1. Finally, Ministral-8B-Instruct-2410 is of comparable parameter count to FinPlan-1 and was originally considered as a potential base model for FinPlan-1, making it another good comparison.

Since the tasks this model is tuned to accomplish are non-standard and domain-specific, the benchmark for those tasks comes from the validation/hold-out split of the training dataset, and its evaluation is somewhat subjective. For each model, the Budget and Goals examples were presented in either a zero-shot prompt (budget) or a three-shot prompt (goals). Only the trained FinPlan-1 model produced the desired format for the Excel file in the budget task, while both Fino1-8B and FinPlan-1 performed well on the goals dataset. To measure generalizability and retention of reasoning skill, all four models were benchmarked on GSM8K (grade school mathematics reasoning) and MMLU (general reasoning). While the domain-specific LoRA tuning certainly degraded FinPlan-1's benchmark scores relative to its underlying model Fino1-8B, the drop is rather small for MMLU, and GSM8K performance remains above Llama-3.2-3B-Instruct.
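Since lm_eval already appears among the imports in the usage snippet below, the GSM8K/MMLU scores can in principle be reproduced with the lm-evaluation-harness Python API. A sketch, with the task versions, few-shot settings, and batch size as assumptions:

```python
import lm_eval

# Evaluate the published checkpoint on the two generalization benchmarks
results = lm_eval.simple_evaluate(
    model="hf",
    model_args="pretrained=ThinkTim21/FinPlan-1",
    tasks=["gsm8k", "mmlu"],
    batch_size=8,  # assumption; adjust to available GPU memory
)
print(results["results"])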
## Intended Usage
As described above, this model is intended to assist with the creation of simple financial plans for individuals: specifically, the creation of a budget spreadsheet for tracking expenses, and planning for short-, medium- and long-term savings goals. While the model can be prompted with a wide range of other tasks, using it for those purposes is not recommended, as it has been specifically fine-tuned for these two tasks and performance outside that scope may be diminished.
See below for the basic code required to load the model from Hugging Face using torch. Note the tokenizer is pulled from the Fino1-8B repository, as it was not changed from the base Fino1-8B model.
```python
import os
os.environ['HF_HOME'] = "your/directory/here"  # set your Hugging Face cache location

import torch
from transformers import AutoTokenizer, AutoModelForCausalLM, pipeline
from datasets import load_dataset, Dataset  # datasets is Hugging Face's dataset package
from peft import get_peft_model, LoraConfig, TaskType
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import PIL
import lm_eval

# Pull the tokenizer from the base Fino1-8B repository (unchanged by this fine-tune)
tokenizer = AutoTokenizer.from_pretrained("TheFinAI/Fino1-8B")
model = AutoModelForCausalLM.from_pretrained("ThinkTim21/FinPlan-1")

# Prepare the model and tokenizer
tokenizer.pad_token = tokenizer.eos_token  # set padding token to EOS token
model.config.pad_token_id = tokenizer.pad_token_id  # set the padding token for the model
budget = pd.read_csv("budget_dataset.csv")  # use the dataset attached to this repo
goals = pd.read_csv("goals_dataset.csv")    # use the dataset attached to this repo

# Build a prompt-only column ("Q: ...\n\nA: ") for generation-style prompting
budget['instruct_lora'] = budget.apply(
    lambda row: f"Q: {row['question']}\n\nA: ",
    axis=1
)
goals['instruct_lora'] = goals.apply(
    lambda row: f"Q: {row['question']}\n\nA: ",
    axis=1
)
# Shuffle and split the budget dataset, then tokenize the instruction column
budget = budget.sample(frac=1, random_state=42)  # randomly shuffle DF
train_budget = budget[:2500]
val_budget = budget[2500:]
train_budget = Dataset.from_pandas(train_budget)
val_budget = Dataset.from_pandas(val_budget)
train_budget = train_budget.map(lambda samples: tokenizer(samples['instruct']), batched=True)
val_budget = val_budget.map(lambda samples: tokenizer(samples['instruct']), batched=True)
# Same shuffle/split/tokenize for the goals dataset
goals = goals.sample(frac=1, random_state=42)  # randomly shuffle DF
train_goals = goals[:2500]
val_goals = goals[2500:]
train_goals = Dataset.from_pandas(train_goals)
val_goals = Dataset.from_pandas(val_goals)
train_goals = train_goals.map(lambda samples: tokenizer(samples['instruct']), batched=True)
val_goals = val_goals.map(lambda samples: tokenizer(samples['instruct']), batched=True)
# Greedy generation from a validation example in the Q/A prompt format
formatted_prompt = f"Q: {val_goals[0]['question']}\n\nA: "
inputs = tokenizer.encode(formatted_prompt, return_tensors="pt").to(model.device)
output = model.generate(inputs, max_new_tokens=800, pad_token_id=tokenizer.pad_token_id, do_sample=False)
generated_text = tokenizer.decode(output[0], skip_special_tokens=True)
print(generated_text)
```
## Prompt Format
The prompt format varies between the budget task and the goals task.
For the budget task, the following prompt method is recommended.
Q: I have an income of about 53255 a year and my monthly expenses include 2208 a month in rent and utilities, a 700 car payment, $300 in food, and about 205 a month in other expenses. Using python, can you create for me a budget spreadsheet and export it to excel?
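Assuming the `model` and `tokenizer` objects loaded in the snippet above, a minimal zero-shot sketch of running this prompt:

```python
budget_prompt = (
    "Q: I have an income of about 53255 a year and my monthly expenses include "
    "2208 a month in rent and utilities, a 700 car payment, $300 in food, and "
    "about 205 a month in other expenses. Using python, can you create for me "
    "a budget spreadsheet and export it to excel?\n\nA: "
)
inputs = tokenizer.encode(budget_prompt, return_tensors="pt").to(model.device)
output = model.generate(
    inputs,
    max_new_tokens=800,
    pad_token_id=tokenizer.pad_token_id,
    do_sample=False,  # greedy decoding for consistent spreadsheet code
)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```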
For the goals task, I recommend using few-shot prompting: use the goals_dataset.csv file as your base, then append your preferred prompt, in the following format, after the few-shot examples derived from the goals dataset.
Q: My short term goal is to save for a $3357 vacation in the next year, my medium term goal is to save for down payment for a new car, around 6867 in the next 2 or 3 years, and my long term goal is to save for a down payment for a house around 115061 in the next ten years, can you help me integrate these goals into my budget as well as where I should store these savings?
I recommend the following code to set up few-shot prompting for the goals task:

```python
test_goals = goals[2500:]  # held-out rows used as few-shot examples

pipe = pipeline(
    "text-generation",
    model=model,
    torch_dtype=torch.bfloat16,
    device_map="auto",
    tokenizer=tokenizer,
    max_new_tokens=750,
    do_sample=False,  # greedy decoding
)

def few_shot_goal(df3, pipe, n=1):
    # Concatenate n worked examples, then your own prompt, separated by blank lines
    examples = []
    for i in range(n):
        examples.append(df3['instruct'].iloc[i])
    examples.append('YOUR PROMPT HERE')  # use the "Q: ...\n\nA: " format shown above
    prompt = "\n\n".join(examples)
    text = pipe(prompt)
    print(text[0]['generated_text'])

few_shot_goal(test_goals, pipe, n=3)
```
## Example Response
For the budget task, you can generally expect a response like the following. It should consist almost exclusively of Python code that integrates your provided budget figures and creates a budget spreadsheet to assist with tracking your expenses.
A: Let's use the Python library pandas for data manipulation and export the spreadsheet using openpyxl. First, ensure you have both libraries installed:

```bash
pip install pandas openpyxl
```

Now, let's create a Python script to generate your budget spreadsheet:

```python
import pandas as pd
import openpyxl
# Define your income and expenses
annual_income = 220034
monthly_income = annual_income / 12
expenses = {
    "Rent & Utilities": 2379,
    "Car Payment": 302,
    "Food": 300,
    "Other Expenses": 351
}
total_expenses = sum(expenses.values())
net_savings = monthly_income - total_expenses
# Create a dictionary with your budget data
budget_data = {
    "Category": ["Monthly Income"] + list(expenses.keys()) + ["Total Expenses", "Net Savings"],
    "Amount ($)": [monthly_income] + list(expenses.values()) + [total_expenses, net_savings]
}
# Convert the dictionary into a DataFrame
budget_df = pd.DataFrame(budget_data)
# Save the DataFrame to an Excel file
file_name = "budget.xlsx"
budget_df.to_excel(file_name, index=False, engine='openpyxl')
print(f"Budget spreadsheet saved as {file_name}")
```

This script will create an Excel file named `budget.xlsx` in the same directory, containing your income and expenses breakdown. Make sure to run this in a Python environment where you have write access.
For the goals task, you can reasonably expect a response along the lines of the output below. It should walk you through breaking your short-, medium- and long-term goals down into monthly savings targets according to their timeframes, and suggest reasonable investment vehicles for them.
A: 1. First, let's figure out how much you can realistically save each month. We know you want to save for three things: a vacation costing $1741, a car down payment of $9356, and a house down payment of $83333.
2. To make it easier, let's add up those numbers: $1741 + $9356 + $83333 gives us a total savings goal of $98390.
3. Now, we need to think about how long you have to reach this goal. You've got a year for the vacation, 2-3 years for the car, and 10 years for the house. Let's take the 10 years as our benchmark since it's the longest time frame. That means you have 10 years to save $98390.
4. So, let's calculate your monthly savings. Divide $98390 by 10 years, which is 120 months, and you get about $816.75 per month.
5. Now, where should you store these savings? Considering you've got a long-term goal, it might be wise to explore low-risk options like high-yield savings accounts or certificates of deposit (CDs). These can help your money grow slowly over time while remaining easily accessible.
6. Finally, make sure to set up automatic transfers from your checking account to your chosen savings vehicle to make saving easier and less prone to being neglected.
Let's see how this plan looks: You'll aim to save about $816.75 monthly for 10 years. This should help you reach your combined savings goal without much hassle, and you can monitor your progress along the way. Oh, and don't forget to review and adjust as needed!
## Thinking
Alright, let's break this down step by step. First, I need to figure out how much I can realistically save each month to reach my savings goals. I have three main goals: saving for a $1741 vacation in a year, a $9356 car down payment in 2-3 years, and a $83333 house down payment in 10 years.
Okay, so let's add up these numbers: $1741 plus $9356 plus $83333 gives me a total savings goal of $98390. Now, I need to think about how long I have to reach this goal. Since 10 years is the longest period, I'll use that as my timeframe.
Next, I'll calculate my monthly savings. I divide $98390 by 10 years, which is 120 months, and get about $816.75 per month. This is my target savings amount each month.
Now, where should I store these savings? Given that my goal is long-term, I should look into low-risk options like high-yield savings accounts or certificates of deposit (CDs). These will help my money grow slowly over time while keeping it easily accessible.
Finally, to make saving easier, I'll set up automatic transfers from my checking account to my chosen savings vehicle. This way, saving won't be so hard to stick to, and I can monitor my progress along the way. It looks like saving $816.75 monthly for 10 years should help me reach my combined savings goal without much hassle.
## Bias, Risks, and Limitations
There are several risks and limitations of this model worth mentioning. First, in a handful of cases the model produced responses in which the math in the savings-goal answers was incorrect, sometimes failing to add numbers up correctly or making slight rounding errors when dividing long-term goals into monthly targets. It is well known that LLMs can struggle with mathematics, since their knowledge is language-based rather than numerical, but this is a real problem for a finance-focused LLM, so I strongly recommend double-checking the figures the model presents. The budget task sidesteps this issue by emitting Python code rather than doing arithmetic in text, but that safeguard is not implemented for the goals task. Further, this model's use on out-of-scope tasks should be limited, as the generalization benchmarks showed that, compared to its base model, it exhibits decreased reasoning ability outside its domain-specific tasks.
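As a concrete way to double-check the goals-task arithmetic, the monthly targets are simple divisions that can be recomputed outside the model. A quick sketch using the figures from the example response above:

```python
# Recompute each goal's monthly target independently of the model's arithmetic
goals = {
    "vacation": (1741, 12),             # (amount in $, months to save)
    "car down payment": (9356, 36),
    "house down payment": (83333, 120),
}
for name, (amount, months) in goals.items():
    print(f"{name}: ${amount / months:,.2f}/month for {months} months")

combined = sum(amount for amount, _ in goals.values())
print(f"combined goal: ${combined:,}")  # compare against the model's stated total
```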
To improve this model, I recommend that future trainers and tuners focus on adjusting it to default to producing Python code for all mathematics-based prompts. Sticking with Python for mathematics processing should allow the model to perform better on the goals task while retaining performance on the budgeting task: it creates two tasks with similar expected response formats, rather than attempting to train the same model for two very different tasks within the same domain, as was done here.
## Compute Infrastructure
Rivanna High Performance Computing Cluster (University of Virginia)
### Hardware
- 1 node
- 32 GB system RAM
- 1x NVIDIA A6000 GPU
- 2 hours of LoRA PEFT tuning
## Works Cited
Major credit to the TheFinAI team, who created the Fino1-8B model that this model is based on.
@article{qian2025fino1, title={Fino1: On the Transferability of Reasoning Enhanced LLMs to Finance}, author={Qian, Lingfei and Zhou, Weipeng and Wang, Yan and Peng, Xueqing and Huang, Jimin and Xie, Qianqian}, journal={arXiv preprint arXiv:2502.08127}, year={2025} }
## Model Card Authors
Timothy Austin Rodriguez
## Model Card Contact