Commit 565452f (verified) · committed by RedbeardNZ and ChuxiJ · 0 parents

Duplicate from ACE-Step/ACE-Step-v1-chinese-rap-LoRA

Co-authored-by: Junmin GONG <ChuxiJ@users.noreply.huggingface.co>

Files changed (4)
  1. .gitattributes +35 -0
  2. README.md +160 -0
  3. config.json +15 -0
  4. pytorch_lora_weights.safetensors +3 -0
.gitattributes ADDED
@@ -0,0 +1,35 @@
+ *.7z filter=lfs diff=lfs merge=lfs -text
+ *.arrow filter=lfs diff=lfs merge=lfs -text
+ *.bin filter=lfs diff=lfs merge=lfs -text
+ *.bz2 filter=lfs diff=lfs merge=lfs -text
+ *.ckpt filter=lfs diff=lfs merge=lfs -text
+ *.ftz filter=lfs diff=lfs merge=lfs -text
+ *.gz filter=lfs diff=lfs merge=lfs -text
+ *.h5 filter=lfs diff=lfs merge=lfs -text
+ *.joblib filter=lfs diff=lfs merge=lfs -text
+ *.lfs.* filter=lfs diff=lfs merge=lfs -text
+ *.mlmodel filter=lfs diff=lfs merge=lfs -text
+ *.model filter=lfs diff=lfs merge=lfs -text
+ *.msgpack filter=lfs diff=lfs merge=lfs -text
+ *.npy filter=lfs diff=lfs merge=lfs -text
+ *.npz filter=lfs diff=lfs merge=lfs -text
+ *.onnx filter=lfs diff=lfs merge=lfs -text
+ *.ot filter=lfs diff=lfs merge=lfs -text
+ *.parquet filter=lfs diff=lfs merge=lfs -text
+ *.pb filter=lfs diff=lfs merge=lfs -text
+ *.pickle filter=lfs diff=lfs merge=lfs -text
+ *.pkl filter=lfs diff=lfs merge=lfs -text
+ *.pt filter=lfs diff=lfs merge=lfs -text
+ *.pth filter=lfs diff=lfs merge=lfs -text
+ *.rar filter=lfs diff=lfs merge=lfs -text
+ *.safetensors filter=lfs diff=lfs merge=lfs -text
+ saved_model/**/* filter=lfs diff=lfs merge=lfs -text
+ *.tar.* filter=lfs diff=lfs merge=lfs -text
+ *.tar filter=lfs diff=lfs merge=lfs -text
+ *.tflite filter=lfs diff=lfs merge=lfs -text
+ *.tgz filter=lfs diff=lfs merge=lfs -text
+ *.wasm filter=lfs diff=lfs merge=lfs -text
+ *.xz filter=lfs diff=lfs merge=lfs -text
+ *.zip filter=lfs diff=lfs merge=lfs -text
+ *.zst filter=lfs diff=lfs merge=lfs -text
+ *tfevents* filter=lfs diff=lfs merge=lfs -text
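
To illustrate what these attributes mean for this commit, the sketch below checks which of the four changed files fall under an LFS pattern. It is a simplified approximation, not Git's own matcher: it applies Python's `fnmatchcase` to each file's base name, which should agree with Git for simple `*.ext` patterns but does not implement full gitattributes semantics (path-based patterns such as `saved_model/**/*` are left out). The helper name `is_lfs_tracked` and the shortened pattern list are illustrative assumptions.

```python
from fnmatch import fnmatchcase
from pathlib import PurePosixPath

# A subset of the basename patterns from the .gitattributes file in this commit.
LFS_PATTERNS = ["*.bin", "*.ckpt", "*.pt", "*.pth", "*.safetensors", "*.zip"]


def is_lfs_tracked(path: str, patterns=LFS_PATTERNS) -> bool:
    """Approximate check: does the file's base name match any LFS pattern?

    Hypothetical helper for illustration; real Git matching has more rules.
    """
    name = PurePosixPath(path).name
    return any(fnmatchcase(name, pattern) for pattern in patterns)


# The four files listed under "Files changed".
changed_files = [
    ".gitattributes",
    "README.md",
    "config.json",
    "pytorch_lora_weights.safetensors",
]

tracked = [f for f in changed_files if is_lfs_tracked(f)]
print(tracked)
```

Under these assumptions only the `.safetensors` weights file is routed through LFS, which is consistent with its diff showing a three-line pointer rather than binary content.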
README.md ADDED
@@ -0,0 +1,160 @@
+ ---
+ license: apache-2.0
+ tags:
+ - music
+ - text2music
+ pipeline_tag: text-to-audio
+ language:
+ - en
+ - zh
+ - de
+ - fr
+ - es
+ - it
+ - pt
+ - pl
+ - tr
+ - ru
+ - cs
+ - nl
+ - ar
+ - ja
+ - hu
+ - ko
+ - hi
+ library_name: diffusers
+ ---
+
+ # 🎤 Chinese Rap LoRA for ACE-Step (Rap Machine)
+
+ This is a hybrid rap voice model. We meticulously curated Chinese rap/hip-hop datasets for training, with rigorous data cleaning and recaptioning. The results demonstrate:
+
+ - Improved Chinese pronunciation accuracy
+ - Enhanced stylistic adherence to hip-hop and electronic genres
+ - Greater diversity in hip-hop vocal expressions
+
+ For audio examples, see: https://ace-step.github.io/#RapMachine
+
+ ## Usage Guide
+
+ 1. Generate higher-quality Chinese songs
+ 2. Create superior hip-hop tracks
+ 3. Blend with other genres to:
+    - Produce music with better vocal quality and detail
+    - Add experimental flavors (e.g., underground, street culture)
+ 4. Fine-tune using these parameters:
+
+ **Vocal Controls**
+ **`vocal_timbre`**
+ - Examples: Bright, dark, warm, cold, breathy, nasal, gritty, smooth, husky, metallic, whispery, resonant, airy, smoky, sultry, light, clear, high-pitched, raspy, powerful, ethereal, flute-like, hollow, velvety, shrill, hoarse, mellow, thin, thick, reedy, silvery, twangy.
+ - Describes inherent vocal qualities.
+
+ **`techniques`** (List)
+ - Rap styles: `mumble rap`, `chopper rap`, `melodic rap`, `lyrical rap`, `trap flow`, `double-time rap`
+ - Vocal FX: `auto-tune`, `reverb`, `delay`, `distortion`
+ - Delivery: `whispered`, `shouted`, `spoken word`, `narration`, `singing`
+ - Other: `ad-libs`, `call-and-response`, `harmonized`
+
+ ## Community Note
+
+ While a Chinese rap LoRA might seem niche for non-Chinese communities, we consistently demonstrate through such projects that ACE-Step, as a music generation foundation model, holds boundless potential. It doesn't just improve pronunciation in one language; it spawns new styles.
+
+ The universal human appreciation of music is a precious asset. Like abstract LEGO blocks, these elements will eventually combine in more organic ways. May our open-source contributions propel the evolution of musical history forward.
+
+ ---
+
+ # ACE-Step: A Step Towards Music Generation Foundation Model
+
+ ![ACE-Step Framework](https://github.com/ACE-Step/ACE-Step/raw/main/assets/ACE-Step_framework.png)
+
+ ## Model Description
+
+ ACE-Step is a novel open-source foundation model for music generation that overcomes key limitations of existing approaches through a holistic architectural design. It integrates diffusion-based generation with Sana's Deep Compression AutoEncoder (DCAE) and a lightweight linear transformer, achieving state-of-the-art performance in generation speed, musical coherence, and controllability.
+
+ **Key Features:**
+ - 15× faster than LLM-based baselines (20s for 4-minute music on A100)
+ - Superior musical coherence across melody, harmony, and rhythm
+ - Full-song generation with duration control, driven by natural-language descriptions
+
+ ## Uses
+
+ ### Direct Use
+ ACE-Step can be used for:
+ - Generating original music from text descriptions
+ - Music remixing and style transfer
+ - Editing song lyrics
+
+ ### Downstream Use
+ The model serves as a foundation for:
+ - Voice cloning applications
+ - Specialized music generation (rap, jazz, etc.)
+ - Music production tools
+ - Creative AI assistants
+
+ ### Out-of-Scope Use
+ The model should not be used for:
+ - Generating copyrighted content without permission
+ - Creating harmful or offensive content
+ - Misrepresenting AI-generated music as human-created
+
+ ## How to Get Started
+
+ See: https://github.com/ace-step/ACE-Step
+
+ ## Hardware Performance
+
+ | Device | 27 Steps | 60 Steps |
+ |---------------|----------|----------|
+ | NVIDIA A100 | 27.27x | 12.27x |
+ | RTX 4090 | 34.48x | 15.63x |
+ | RTX 3090 | 12.76x | 6.48x |
+ | M2 Max | 2.27x | 1.03x |
+
+ *RTF (Real-Time Factor) shown - higher values indicate faster generation*
+
+
+ ## Limitations
+
+ - Performance varies by language (top 10 languages perform best)
+ - Longer generations (>5 minutes) may lose structural coherence
+ - Rare instruments may not render perfectly
+ - Output Inconsistency: Highly sensitive to random seeds and input duration, leading to varied "gacha-style" results.
+ - Style-specific Weaknesses: Underperforms on certain genres (e.g., Chinese rap/zh_rap), with limited style adherence and a musicality ceiling
+ - Continuity Artifacts: Unnatural transitions in repainting/extend operations
+ - Vocal Quality: Coarse vocal synthesis lacking nuance
+ - Control Granularity: Needs finer-grained musical parameter control
+
+ ## Ethical Considerations
+
+ Users should:
+ - Verify originality of generated works
+ - Disclose AI involvement
+ - Respect cultural elements and copyrights
+ - Avoid harmful content generation
+
+
+ ## Model Details
+
+ **Developed by:** ACE Studio and StepFun
+ **Model type:** Diffusion-based music generation with transformer conditioning
+ **License:** Apache 2.0
+ **Resources:**
+ - [Project Page](https://ace-step.github.io/)
+ - [Demo Space](https://huggingface.co/spaces/ACE-Step/ACE-Step)
+ - [GitHub Repository](https://github.com/ACE-Step/ACE-Step)
+
+
+ ## Citation
+
+ ```bibtex
+ @misc{gong2025acestep,
+   title={ACE-Step: A Step Towards Music Generation Foundation Model},
+   author={Junmin Gong and Wenxiao Zhao and Sen Wang and Shengyuan Xu and Jing Guo},
+   howpublished={\url{https://github.com/ace-step/ACE-Step}},
+   year={2025},
+   note={GitHub repository}
+ }
+ ```
+
+ ## Acknowledgements
+ This project is co-led by ACE Studio and StepFun.
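
The RTF figures in the README's Hardware Performance table can be turned into rough wall-clock estimates. The sketch below is a minimal illustration, assuming RTF here means audio duration divided by generation time (which fits the README's note that higher values indicate faster generation). The function name `estimated_seconds` is illustrative and not part of ACE-Step.

```python
# RTF values copied from the README table: device -> (27 steps, 60 steps).
RTF = {
    "NVIDIA A100": (27.27, 12.27),
    "RTX 4090": (34.48, 15.63),
    "RTX 3090": (12.76, 6.48),
    "M2 Max": (2.27, 1.03),
}


def estimated_seconds(audio_seconds: float, rtf: float) -> float:
    """Estimated generation time, assuming RTF = audio duration / generation time."""
    if rtf <= 0:
        raise ValueError("rtf must be positive")
    return audio_seconds / rtf


four_minutes = 240.0  # length of the target track in seconds
for device, (rtf_27, rtf_60) in RTF.items():
    t27 = estimated_seconds(four_minutes, rtf_27)
    t60 = estimated_seconds(four_minutes, rtf_60)
    print(f"{device:12s} 27 steps: {t27:6.1f} s   60 steps: {t60:6.1f} s")
```

Under that assumption, the A100 at 60 steps works out to roughly 20 s for a 4-minute track, which lines up with the figure quoted under Key Features.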
config.json ADDED
@@ -0,0 +1,15 @@
+ {
+   "r": 256,
+   "lora_alpha": 32,
+   "target_modules": [
+     "speaker_embedder",
+     "linear_q",
+     "linear_k",
+     "linear_v",
+     "to_q",
+     "to_k",
+     "to_v",
+     "to_out.0"
+   ],
+   "use_rslora": true
+ }
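
For readers unfamiliar with the `use_rslora` flag: in rank-stabilized LoRA the adapter update is scaled by `lora_alpha / sqrt(r)` rather than the standard `lora_alpha / r`. The sketch below works out both values for the settings in this config. It is plain arithmetic for illustration, not the loader ACE-Step uses, and the function name `lora_scaling` is an assumption.

```python
import json
import math

# The scalar settings from config.json above (target_modules omitted for brevity).
CONFIG_JSON = '{"r": 256, "lora_alpha": 32, "use_rslora": true}'


def lora_scaling(r: int, lora_alpha: float, use_rslora: bool) -> float:
    """Scaling factor applied to the low-rank update B @ A."""
    if use_rslora:
        # Rank-stabilized LoRA: divide by sqrt(r) so the scale shrinks more slowly with rank.
        return lora_alpha / math.sqrt(r)
    # Standard LoRA scaling.
    return lora_alpha / r


cfg = json.loads(CONFIG_JSON)
rs_scale = lora_scaling(cfg["r"], cfg["lora_alpha"], cfg["use_rslora"])
std_scale = lora_scaling(cfg["r"], cfg["lora_alpha"], use_rslora=False)
print(rs_scale)   # 2.0
print(std_scale)  # 0.125
```

With a rank as high as 256, the two conventions differ by a factor of 16, so the flag materially changes how strongly the adapter is applied.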
pytorch_lora_weights.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:397db9b7dd49f46c3652ceb44f187d35d9dcfb21074cad3f210342ac6bdb666e
+ size 523816352
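
The three lines above are a Git LFS pointer, not the weights themselves: they record the hash algorithm, the digest, and the byte size of the real file. A minimal sketch of how a downloaded copy could be checked against this pointer follows; the helper names `parse_lfs_pointer` and `matches_pointer` are hypothetical, and the local file path is left to the caller.

```python
import hashlib
from pathlib import Path

# Pointer contents copied from the diff above.
POINTER_TEXT = """\
version https://git-lfs.github.com/spec/v1
oid sha256:397db9b7dd49f46c3652ceb44f187d35d9dcfb21074cad3f210342ac6bdb666e
size 523816352
"""


def parse_lfs_pointer(text: str) -> dict:
    """Split 'key value' lines of an LFS pointer into a small dict."""
    fields = dict(line.split(" ", 1) for line in text.splitlines() if line.strip())
    algo, digest = fields["oid"].split(":", 1)
    return {
        "version": fields["version"],
        "algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


def matches_pointer(path: Path, pointer: dict, chunk_size: int = 1 << 20) -> bool:
    """Return True if the file at `path` has the size and digest the pointer records."""
    if path.stat().st_size != pointer["size"]:
        return False  # cheap size check before hashing
    hasher = hashlib.new(pointer["algo"])
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            hasher.update(chunk)
    return hasher.hexdigest() == pointer["digest"]


pointer = parse_lfs_pointer(POINTER_TEXT)
print(pointer["algo"], pointer["size"])
```

A call such as `matches_pointer(Path("pytorch_lora_weights.safetensors"), pointer)` would then confirm that a local download is intact. The recorded size, 523816352 bytes, is roughly 500 MiB.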