DeepTolkien

This LLM is DeepSeek R1 fine-tuned using the LoRA method on text extracted from J.R.R. Tolkien's The Lord of the Rings.

Model Details

The model can be prompted with a story stub, for example "Frodo looked up and saw", and will generate a continuation in the style of Tolkien's writing. Have fun!

If you have played with DeepSeek R1, you have almost certainly noticed that the reasoning model sometimes gets caught in a loop. The same behavior appears here: for example, two characters will get stuck in a looping dialogue. I believe this is a property of DeepSeek R1 itself rather than of this LoRA, and better results may yet be achieved with a base model better suited to prose and storytelling. Still, I wanted to get an idea of how the new DeepSeek models perform, and this has been a fantastic learning experience. A sketch of generation settings that may reduce looping is included at the end of the Usage section.

Usage

Load the model:

from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftConfig, PeftModel

# Read the adapter config to find the base model, then load the base model in 8-bit
config = PeftConfig.from_pretrained("cwestbrook/lotrdata")
model = AutoModelForCausalLM.from_pretrained(config.base_model_name_or_path, return_dict=True, load_in_8bit=True, device_map='auto')
tokenizer = AutoTokenizer.from_pretrained(config.base_model_name_or_path)
# Apply the LoRA adapter on top of the base model
model = PeftModel.from_pretrained(model, "cwestbrook/lotrdata")
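
If 8-bit loading via bitsandbytes is not available in your environment, a plausible alternative (untested with this adapter) is to load the base model in half precision instead; everything else stays the same:

import torch

# Assumes a GPU with enough memory to hold the full fp16 base model
model = AutoModelForCausalLM.from_pretrained(
    config.base_model_name_or_path, return_dict=True,
    torch_dtype=torch.float16, device_map='auto'
)
model = PeftModel.from_pretrained(model, "cwestbrook/lotrdata")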

Run the model:

prompt = "Gandalf revealed his new iphone,"
inputs = tokenizer(prompt, return_tensors="pt").to('cuda')
tokens = model.generate(
    **inputs,
    max_new_tokens=100,
    temperature=1,
    eos_token_id=tokenizer.eos_token_id,
    early_stopping=True
)
predictions = tokenizer.batch_decode(tokens, skip_special_tokens=True)
print(predictions[0])
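
If generation falls into the looping dialogue described under Model Details, the standard transformers repetition controls are worth a try. The values below are a hedged sketch, not settings tuned or validated for this model:

tokens = model.generate(
    **inputs,
    max_new_tokens=200,
    do_sample=True,
    temperature=0.9,           # hypothetical value, adjust to taste
    repetition_penalty=1.2,    # values > 1.0 discourage re-emitting recent tokens
    no_repeat_ngram_size=4,    # blocks exact 4-gram repetitions outright
    eos_token_id=tokenizer.eos_token_id,
)
print(tokenizer.batch_decode(tokens, skip_special_tokens=True)[0])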


