LTX-Video 13B Generation Pipeline (Escoffier Style)

This repository provides a complete workflow for generating animated videos of an anime-style character in the distinctive Escoffier art style, using the Lightricks LTX-Video 13B model with a LoRA adapter.

🎨 Character Description

The Escoffier-style character features:

  • Blonde hair with striking blue eyes
  • Long flowing hair with a signature curled strand on top
  • Elegant white and purple dress with gold accents
  • Large magenta waist bow
  • White thigh-high stockings with intricate floral designs
  • White frilled hat adorned with pink ribbon
  • Graceful posture in magical, ethereal settings

🧰 Installation

System Dependencies

sudo apt-get update && sudo apt-get install ffmpeg git-lfs cbm

Python Dependencies

pip install -U diffusers transformers torch sentencepiece peft moviepy protobuf
pip install git+https://github.com/Lightricks/LTX-Video.git
pip install git+https://github.com/huggingface/diffusers.git
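
Before loading a 13B model, it can help to confirm that the packages above are actually visible to the current interpreter. The following is a minimal sketch using only the standard library; the helper names `module_available` and `check_environment` are hypothetical, and the list of system tools checked (`ffmpeg`, `git-lfs`) is a subset chosen for illustration.

```python
import importlib.util
import shutil

# Import names for the pip packages listed above; protobuf is imported as google.protobuf.
PYTHON_MODULES = ["diffusers", "transformers", "torch", "sentencepiece", "peft", "moviepy", "google.protobuf"]
SYSTEM_TOOLS = ["ffmpeg", "git-lfs"]


def module_available(name):
    """Return True if the module can be located on the import path."""
    try:
        return importlib.util.find_spec(name) is not None
    except (ImportError, ValueError):
        # Raised when a parent package (e.g. "google") is missing or malformed.
        return False


def check_environment():
    """Map each dependency name to True/False depending on whether it was found."""
    status = {name: module_available(name) for name in PYTHON_MODULES}
    status.update({tool: shutil.which(tool) is not None for tool in SYSTEM_TOOLS})
    return status


if __name__ == "__main__":
    for name, found in check_environment().items():
        print(f"{name:20s} {'ok' if found else 'MISSING'}")
```

Anything reported as MISSING can be reinstalled with the commands above before moving on.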

🚀 Generation Pipeline

Model Initialization with Escoffier LoRA

import torch
from diffusers import LTXConditionPipeline, LTXLatentUpsamplePipeline
from diffusers.utils import export_to_video

# Load base model and LoRA weights
pipe = LTXConditionPipeline.from_pretrained(
    "Lightricks/LTX-Video-0.9.7-dev",
    torch_dtype=torch.bfloat16
)
pipe.load_lora_weights("LTXV_13B_097_DEV_escoffier_im_lora/lora_weights_step_19000.safetensors")

# Load latent upscaler
pipe_upsample = LTXLatentUpsamplePipeline.from_pretrained(
    "Lightricks/ltxv-spatial-upscaler-0.9.7",
    vae=pipe.vae,
    torch_dtype=torch.bfloat16
)

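# Memory optimization: keep weights on the CPU and move each submodule to the GPU only while it runs (lower VRAM use, slower inference)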
pipe.enable_sequential_cpu_offload()
pipe_upsample.enable_sequential_cpu_offload()
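
The `load_lora_weights` call above points at a relative path, so it presumably expects the LoRA file to sit next to the script. A small guard, sketched here under that assumption, fails early with an absolute path in the message instead of after the base model has loaded. The helper name `require_weights` is hypothetical.

```python
from pathlib import Path

# Relative path used in the load_lora_weights call above.
LORA_PATH = Path("LTXV_13B_097_DEV_escoffier_im_lora") / "lora_weights_step_19000.safetensors"


def require_weights(path):
    """Return the path unchanged if the file exists, otherwise raise with an absolute hint."""
    path = Path(path)
    if not path.is_file():
        raise FileNotFoundError(f"LoRA weights not found: {path.resolve()}")
    return path
```

It could then be used as `pipe.load_lora_weights(str(require_weights(LORA_PATH)))`.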

📦 Complete Generation Function

def generate_escoffier_video(prompt, output_name):
    """Complete generation pipeline for 832x480 videos"""
    
    # Fixed resolution parameters
    expected_width, expected_height = 832, 480
    downscale_factor = 2/3
    num_frames = 121  # ~5 seconds at 24fps
    negative_prompt = "worst quality, inconsistent motion, blurry, jittery, distorted"
    
    # Resolution rounding helper
    def round_resolution(h, w):
        ratio = pipe.vae_spatial_compression_ratio
        return h - (h % ratio), w - (w % ratio)
    
    low_res_h, low_res_w = round_resolution(
        int(expected_height * downscale_factor),
        int(expected_width * downscale_factor)
    )
    
    # 1. Initial generation at low resolution
    latents = pipe(
        conditions=None,
        prompt=prompt,
        negative_prompt=negative_prompt,
        width=low_res_w,
        height=low_res_h,
        num_frames=num_frames,
        num_inference_steps=30,
        generator=torch.Generator().manual_seed(0),
        output_type="latent",
    ).frames

    # 2. Latent upscaling (2x)
    upscaled_latents = pipe_upsample(
        latents=latents,
        output_type="latent"
    ).frames

    # 3. Quality refinement pass
    video = pipe(
        prompt=prompt,
        negative_prompt=negative_prompt,
        width=low_res_w*2,  # 2x upscaled
        height=low_res_h*2,
        num_frames=num_frames,
        denoise_strength=0.4,  # with 10 steps, only about the last 4 denoising steps run
        num_inference_steps=10,
        latents=upscaled_latents,
        decode_timestep=0.05,
        image_cond_noise_scale=0.025,
        generator=torch.Generator().manual_seed(0),
        output_type="pil",
    ).frames[0]

    # 4. Final resize to target resolution
    video = [frame.resize((expected_width, expected_height)) for frame in video]
    export_to_video(video, f"{output_name}.mp4", fps=24)
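
To see which sizes the function actually works at, the resolution arithmetic can be reproduced without loading any model. This sketch assumes `pipe.vae_spatial_compression_ratio` is 32 and, as far as I understand LTX-Video, that frame counts should have the form 8k + 1. Both are assumptions, and the helper names are hypothetical.

```python
def plan_resolutions(width, height, downscale=2 / 3, ratio=32):
    """Return ((low_w, low_h), (up_w, up_h)) following the steps in the function above.

    ratio=32 is an assumed value for pipe.vae_spatial_compression_ratio.
    """
    low_w = int(width * downscale)
    low_h = int(height * downscale)
    # Round down to a multiple of the VAE spatial compression ratio.
    low_w -= low_w % ratio
    low_h -= low_h % ratio
    # The latent upsampler doubles both dimensions.
    return (low_w, low_h), (low_w * 2, low_h * 2)


def nearest_valid_num_frames(seconds, fps=24):
    """Closest frame count of the form 8*k + 1 (assumed LTX-Video constraint)."""
    k = max(1, round((seconds * fps - 1) / 8))
    return 8 * k + 1


low, up = plan_resolutions(832, 480)
print(low, up)
print(nearest_valid_num_frames(5))
```

Under these assumptions, the 832x480 target is generated at 544x320, refined at 1088x640, and then resized. The intermediate aspect ratio (1.70) differs slightly from the target (about 1.73), so the final resize stretches frames by roughly 2% horizontally.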

🎥 Example Generations (All 832x480)

🌌 Cosmic Fantasy Scene

generate_escoffier_video(
    prompt="In the style of Escoffier, This is a digital anime-style illustration of a blonde, blue-eyed female character with long, flowing hair and a large, curled strand on top. She wears a white and purple dress with gold accents, a large magenta bow on the waist, and white thigh-high stockings with intricate designs. The background features glowing, crystal-like structures and a dark blue, starry sky. Her expression is gentle, and she holds up the hem of her skirt with her right hand. The overall style is vibrant and dynamic, with a focus on her detailed, fantasy-inspired outfit and the magical, ethereal setting.",
    output_name="escoffier_cosmic_scene"
)

🌸 Mystical Garden Scene

generate_escoffier_video(
    prompt="In the style of Escoffier, This is a digital anime-style illustration of a blonde, blue-eyed female character with long, flowing hair and a large, curled strand on top. She wears a white and purple dress with gold accents, a large magenta bow on the waist, and white thigh-high stockings with intricate floral designs. She stands gracefully in a mystical garden filled with floating crystal butterflies and glowing lilies, reaching out to touch a shimmering orb.",
    output_name="escoffier_garden_scene"
)
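
Both example prompts share the same opening phrase and nearly the same character description, so new scenes can be composed from a shared prefix. This is a minimal sketch: the helper `build_prompt` is hypothetical, and treating "In the style of Escoffier," as the phrase the LoRA responds to is an assumption. The character text follows the garden example's wording.

```python
TRIGGER = "In the style of Escoffier,"

CHARACTER = (
    "This is a digital anime-style illustration of a blonde, blue-eyed female character "
    "with long, flowing hair and a large, curled strand on top. She wears a white and purple "
    "dress with gold accents, a large magenta bow on the waist, and white thigh-high stockings "
    "with intricate floral designs."
)


def build_prompt(scene):
    """Join the opening phrase, the shared character description, and a scene sentence."""
    return " ".join([TRIGGER, CHARACTER, scene.strip()])


garden = build_prompt(
    "She stands gracefully in a mystical garden filled with floating crystal butterflies "
    "and glowing lilies, reaching out to touch a shimmering orb."
)
```

The resulting string can be passed as `prompt` to `generate_escoffier_video` together with an `output_name`.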


📄 License

This pipeline is provided under the same license as the base LTX-Video model. Please refer to the original Lightricks repository for licensing details.


🤝 Acknowledgments

  • Lightricks – For developing and open-sourcing the LTX-Video model
  • Hugging Face – For hosting and community support
  • svjack – For adapting and fine-tuning the Escoffier LoRA weights

📬 Support

For issues or feature requests, please open an issue on GitHub.


✅ You now have a fully functional Escoffier-style anime video generation pipeline using LTX-Video 13B!
