Phantom Distillation & LoRA Tests

Evaluating step-distilled and CFG-distilled LoRA performance

Test Date: August 20, 2025

Test Overview

Previous tests identified ConditioningCombine as the best conditioning strategy. This battery evaluates distillation LoRAs designed to accelerate generation while maintaining quality.

Test Setup: All tests use the same seed (53), reference images, and prompt as the conditioning tests for direct comparison. The baseline uses ConditioningCombine with CFG=3.5 at 30 steps (our gold standard from previous testing).

LoRAs Tested:

LightX2V: Step and CFG distilled, designed for 4 steps at CFG=1.0
LightX2V Adaptive: Same as above but with adaptive rank scaling
CausVid V2: Modified autoregressive distillation for ~10-12 steps at CFG~2.0

🔍 Key Findings

CausVid is the winner: 12-step CausVid (Test 6) matches the baseline's overall score (8.5) and reaches 89% of its subject resemblance (8 vs. 9) at 2.5x the speed!
CFG=1.0 insufficient: Both LightX2V variants score 5.0 or lower overall at CFG=1.0, suggesting negative conditioning is essential
Surprising CFG=2.0 behavior: Without a LoRA, resemblance at CFG=2.0 (4) is lower than at CFG=1.0 (5)
LightX2V disappointing: Both variants underperform (overall 4.5 and 5.0) despite a 15x speedup - quality/speed tradeoff too severe
Sweet spot found: CausVid at 12 steps with CFG=2.0 offers the best practical balance

Test Results Summary

Test	Configuration	Steps	CFG	Subject (1-10)	Quality (1-10)	Speed	Overall
Test 01	Gold Standard (No LoRA)	30	3.5	9	8	2100s	8.5
Test 02	Baseline CFG=1.0	30	1.0	5	6	1050s	5.5
Test 03	Min CFG=2.0	30	2.0	4	7	2100s	5.5
Test 04	LightX2V Standard	4	1.0	5	4	140s	4.5
Test 05	LightX2V Adaptive	4	1.0	5	5	140s	5.0
Test 06	CausVid 12 steps	12	2.0	8	9	840s	8.5
Test 07	CausVid 8 steps	8	2.0	7	8	560s	7.5
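In every row, the Overall column equals the mean of the Subject and Quality scores, and the speedup figures quoted in the key findings follow from dividing the baseline time by each run's time. A minimal Python sketch of that arithmetic, using the values from the table. The `RESULTS` structure and the function names are illustrative only, and the averaging rule is inferred from the numbers rather than stated by the test procedure:

```python
# Rows copied from the summary table:
# (test, steps, cfg, subject score, quality score, seconds)
RESULTS = [
    ("Test 01", 30, 3.5, 9, 8, 2100),
    ("Test 02", 30, 1.0, 5, 6, 1050),
    ("Test 03", 30, 2.0, 4, 7, 2100),
    ("Test 04", 4, 1.0, 5, 4, 140),
    ("Test 05", 4, 1.0, 5, 5, 140),
    ("Test 06", 12, 2.0, 8, 9, 840),
    ("Test 07", 8, 2.0, 7, 8, 560),
]


def overall(subject, quality):
    """Assumed scoring rule: unweighted mean of the two 1-10 scores."""
    return (subject + quality) / 2


def speedup(seconds, baseline_seconds=2100):
    """How many times faster a run is than the 30-step gold standard."""
    return baseline_seconds / seconds


def summarize(results):
    """Return (test name, overall score, speedup) for each row."""
    return [
        (name, overall(subject, quality), speedup(seconds))
        for name, _steps, _cfg, subject, quality, seconds in results
    ]


if __name__ == "__main__":
    for name, score, factor in summarize(RESULTS):
        print(f"{name}: overall {score:.1f}, {factor:.2f}x baseline speed")
```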

Phase 1: Baseline Tests

Test #01: Gold Standard

Baseline
Configuration: No LoRA
Steps: 30
CFG: 3.5
Speed: 2100s (35 min)
Subject: 9/10
Quality: 8/10

Notes: Very good overall result. The eyes are a bit high-contrast, which may mean CFG should be slightly lower; otherwise the video is good.

Test #02: Pure CFG=1.0

Baseline
Configuration: No LoRA
Steps: 30
CFG: 1.0
Speed: 1050s (17.5 min)
Subject: 5/10
Quality: 6/10

Notes: Video quality is better than expected at CFG=1.0, but it is a little muted, and she walks a bit more like a zombie. Resemblance is significantly lower, especially in the face.
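The halved run time and the weaker resemblance at CFG=1.0 are both consistent with the usual formulation of classifier-free guidance, where the output is the negative-prompt prediction plus the scaled difference between the positive and negative predictions. The sketch below is a toy illustration with made-up three-element predictions, not the sampler's actual code. The assumption that the sampler skips the negative pass when CFG is exactly 1.0 is inferred from the timings (1050s vs. 2100s at 30 steps):

```python
def apply_cfg(cond, uncond, cfg):
    """Standard classifier-free guidance blend, applied element-wise."""
    return [u + cfg * (c - u) for c, u in zip(cond, uncond)]


def model_passes_per_step(cfg):
    """Assumption: the negative pass is skipped when cfg is exactly 1.0,
    because its contribution cancels out of the blend."""
    return 1 if cfg == 1.0 else 2


# Toy values standing in for the model's predictions (hypothetical).
cond = [0.8, -0.2, 0.5]    # prediction with the positive prompt
uncond = [0.1, 0.3, 0.5]   # prediction with the negative prompt

# At cfg=1.0 the blend reduces to the positive prediction alone,
# so the negative prompt has no influence on the result.
print(apply_cfg(cond, uncond, 1.0))
print(apply_cfg(cond, uncond, 3.5))
print(model_passes_per_step(1.0), model_passes_per_step(3.5))
```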

Test #03: Minimal CFG=2.0

Baseline
Configuration: No LoRA
Steps: 30
CFG: 2.0
Speed: 2100s (35 min)
Subject: 4/10
Quality: 7/10

Notes: Video quality is much better than at CFG=1.0, but the resemblance of the character and dress is a bit strange, as if the model grabbed unexpected parts of each and did not focus on important aspects like the face.

Phase 2: Pure Distillation Tests

Test #04: LightX2V Standard

LoRA
Configuration: LightX2V
Steps: 4
CFG: 1.0
Speed: 140s
Subject: 5/10
Quality: 4/10

Notes: Video quality suffers with this one. There is more noise than expected, which may be because the scheduler or shift is not right for this distillation LoRA. Resemblance is down too.

Test #05: LightX2V Adaptive

LoRA
Configuration: LightX2V Adaptive
Steps: 4
CFG: 1.0
Speed: 140s
Subject: 5/10
Quality: 5/10

Notes: Video quality is slightly better with this version than with the standard LightX2V LoRA (Test 04).

Test #06: CausVid 12 Steps

LoRA
Configuration: CausVid V2
Steps: 12
CFG: 2.0
Speed: 840s
Subject: 8/10
Quality: 9/10

Notes: Very good result. Resemblance is almost as good as the 30-step version, and quality is arguably better.

Test #07: CausVid 8 Steps

LoRA
Configuration: CausVid V2
Steps: 8
CFG: 2.0
Speed: 560s
Subject: 7/10
Quality: 8/10

Notes: Small loss of quality and resemblance compared with the 12-step CausVid run (Test 06).

Next Steps

Phase 3 Testing (In Progress): Hybrid approaches combining distillation with strategic CFG application. Initial tests include:

LightX2V for initial steps, then switching to CFG>1.0 for final steps
Reduced LoRA strength with higher CFG values
Staged distillation combining multiple LoRAs
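To get a rough feel for what the first hybrid approach might cost, the timings in the results table can be reduced to a per-step figure: about 35s per step at CFG=1.0 (1050s/30, 140s/4) and about 70s per step above it (2100s/30, 840s/12, 560s/8). The sketch below builds a per-step CFG schedule and estimates run time from those figures. The function names, the two-pass cost model, and the 8-step example are assumptions for illustration; the estimate is not a measured result:

```python
# Derived from the results table: runs at CFG=1.0 take about 35s per step.
SECONDS_PER_PASS = 35


def cfg_schedule(total_steps, switch_at, late_cfg):
    """CFG=1.0 for the first `switch_at` steps, then `late_cfg` afterwards."""
    if not 0 <= switch_at <= total_steps:
        raise ValueError("switch_at must lie between 0 and total_steps")
    return [1.0] * switch_at + [late_cfg] * (total_steps - switch_at)


def estimate_seconds(schedule):
    """Assumed cost model: steps with CFG above 1.0 need two model passes."""
    return sum(SECONDS_PER_PASS * (2 if cfg > 1.0 else 1) for cfg in schedule)


# Hypothetical hybrid run: 4 distilled steps at CFG=1.0, then 4 at CFG=2.0.
schedule = cfg_schedule(total_steps=8, switch_at=4, late_cfg=2.0)
print(schedule)
print(estimate_seconds(schedule), "seconds (estimate)")
```

The same cost model reproduces the measured 840s for 12 steps at CFG=2.0 and 140s for 4 steps at CFG=1.0, which is some reassurance that the per-step figures are reasonable. It says nothing about the quality such a hybrid run would produce.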

The goal is to find the best balance between generation speed and quality, targeting a 2-3x speedup while maintaining 80%+ of baseline quality.
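One way to make that target checkable is to compare each run's overall score and time against the baseline. The sketch below does this for the LoRA runs already measured. It assumes "quality" means the Overall score and treats 2x as a minimum speedup rather than enforcing the 3x upper end; both readings, and all names in the code, are illustrative assumptions:

```python
BASELINE_OVERALL = 8.5    # Test 01 overall score
BASELINE_SECONDS = 2100   # Test 01 run time

# (overall score, seconds) copied from the results summary.
RUNS = {
    "LightX2V Standard": (4.5, 140),
    "LightX2V Adaptive": (5.0, 140),
    "CausVid 12 steps": (8.5, 840),
    "CausVid 8 steps": (7.5, 560),
}


def meets_target(overall, seconds, min_speedup=2.0, min_quality=0.80):
    """True when a run is at least `min_speedup` times faster than the
    baseline and keeps at least `min_quality` of its overall score."""
    fast_enough = BASELINE_SECONDS / seconds >= min_speedup
    good_enough = overall / BASELINE_OVERALL >= min_quality
    return fast_enough and good_enough


passing = [name for name, (score, seconds) in RUNS.items()
           if meets_target(score, seconds)]
print(passing)
```

Under this reading both CausVid runs already meet the target, while the LightX2V runs are fast enough but fall below the quality threshold.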