<!DOCTYPE html>
<html lang="en">
<head>
    <meta charset="utf-8" />
    <meta name="viewport" content="width=device-width, initial-scale=1.0" />
    <title>Qwen-Image Edit - Advanced Image-to-Image Generation by Alibaba Cloud</title>
    <meta name="description" content="Qwen-Image Edit: Part of the Qwen (Tongyi Qianwen) model series by Alibaba Cloud. A powerful image-to-image generative model that transforms existing images with high-quality rendering, artistic style control, and exceptional detail." />
    <meta name="keywords" content="Qwen-Image Edit, Qwen, Tongyi Qianwen, Alibaba Cloud, Image-to-Image, AI Models, Image Editing, Image Generation, AI Art, Generative AI, Image Synthesis, Multimodal AI" />
    
    <!-- Open Graph / Social Media Meta Tags -->
    <meta property="og:title" content="Qwen-Image Edit - Advanced Image-to-Image Generation by Alibaba Cloud" />
    <meta property="og:description" content="Transform your existing images into stunning new creations with Qwen-Image Edit, part of the Tongyi Qianwen model series developed by Alibaba Cloud" />
    <meta property="og:type" content="website" />
    <meta property="og:url" content="https://wavespeed.ai/models/wavespeed-ai/qwen-image/edit" />
    <meta property="og:image" content="https://qianwen-res.oss-cn-beijing.aliyuncs.com/Qwen-Image/merge3.jpg" />
    
    <!-- Additional Meta Information -->
    <meta name="author" content="Alibaba Cloud Qwen Team" />
    <meta name="robots" content="index, follow" />
    <link rel="canonical" href="https://huggingface.co/Qwen/Qwen-Image-Edit" />
    
    <link rel="stylesheet" href="style.css" />
</head>
<body>
    <nav class="top-nav">
        <div class="nav-content">
            <div class="nav-logo">QWEN</div>
            <div class="nav-links">
                <a href="https://wavespeed.ai/" class="nav-link" target="_blank" rel="noopener noreferrer">Home</a>
                <a href="https://wavespeed.ai/models/wavespeed-ai/qwen-image" class="nav-link" target="_blank" rel="noopener noreferrer">Documentation</a>
                <a href="https://wavespeed.ai/models/wavespeed-ai/qwen-image" class="nav-link" target="_blank" rel="noopener noreferrer">Blog</a>
                <a href="https://wavespeed.ai/models/wavespeed-ai/qwen-image/edit" class="nav-button" target="_blank" rel="noopener noreferrer">Try on WaveSpeed →</a>
            </div>
        </div>
    </nav>

    <div class="container">
        <div class="content">
            <div class="logo-section">
                <h1>Qwen-Image Edit</h1>
                <p class="subtitle">By Alibaba Cloud Qwen Team</p>
            </div>
            
            <div class="announcement-section">
                <p class="announcement">Qwen-Image Edit is now open-source!</p>
                <div class="divider"></div>
                <p class="description">Advanced Image-to-Image Generative Model for Precise Editing</p>
            </div>
            
            <div class="hero-image">
                <img src="https://qianwen-res.oss-cn-beijing.aliyuncs.com/Qwen-Image/merge3.jpg" alt="Qwen-Image Examples" class="full-width-img">
            </div>

            <section class="intro-section">
                <h2>Introduction</h2>
                <p>We are thrilled to release Qwen-Image Edit, an image editing foundation model in the Qwen series that achieves significant advances in transforming existing images with precise control. Experiments show strong capabilities in image-to-image generation, with exceptional performance in maintaining original image structure while applying creative transformations. Qwen-Image Edit is now available on Hugging Face and can be used locally with the diffusers library.</p>
                <div class="benchmark-image">
                    <img src="https://qianwen-res.oss-cn-beijing.aliyuncs.com/Qwen-Image/bench.png" alt="Qwen-Image Benchmark" class="full-width-img">
                </div>
            </section>

            <div class="features-section">
                <div class="feature">
                    <h3>🚀 Multimodal AI Capabilities</h3>
                    <p>Part of the Qwen (Tongyi Qianwen) model series, offering powerful image-to-image generation with exceptional understanding of complex editing requirements</p>
                </div>
              
                <div class="feature">
                    <h3>🌟 Open Source Innovation</h3>
                    <p>Part of Alibaba's commitment to open-source AI development, allowing researchers and developers to build upon and extend its capabilities</p>
                </div>
                <div class="feature">
                    <h3>🔍 Comprehensive Model Family</h3>
                    <p>Works alongside other Qwen models for text, vision, and multimodal applications, providing a complete ecosystem for AI development</p>
                </div>
            </div>

            <section class="quickstart-section">
                <h2>Quick Start</h2>
                <p>Choose your preferred Qwen image model:</p>
                
                <h3>Option 1: Using Qwen-Image-Edit with diffusers</h3>
                <p>Install the latest version of diffusers from source:</p>
                <div class="code-block">
                    <pre><code>pip install git+https://github.com/huggingface/diffusers</code></pre>
                </div>
                <div class="code-block">
                    <pre><code>import os
from PIL import Image
import torch
from diffusers import QwenImageEditPipeline

pipeline = QwenImageEditPipeline.from_pretrained("Qwen/Qwen-Image-Edit")
print("pipeline loaded")
pipeline.to(torch.bfloat16)
pipeline.to("cuda")
pipeline.set_progress_bar_config(disable=None)

image = Image.open("./input.png").convert("RGB")
prompt = "Change the rabbit's color to purple, with a flash light background."

inputs = {
    "image": image,
    "prompt": prompt,
    "generator": torch.manual_seed(0),
    "true_cfg_scale": 4.0,
    "negative_prompt": " ",
    "num_inference_steps": 50,
}

with torch.inference_mode():
    output = pipeline(**inputs)

output_image = output.images[0]
output_image.save("output_image_edit.png")
print("image saved at", os.path.abspath("output_image_edit.png"))</code></pre>
                </div>
                
                <h3>Option 2: Using the latest Qwen VLo model</h3>
                <p>The new Qwen VLo model specializes in image-to-image generation with progressive editing features.</p>
                <div class="code-block">
                    <pre><code>pip install "dashscope>=1.20.7"</code></pre>
                </div>
                <div class="code-block">
                    <pre><code>import urllib.request
from http import HTTPStatus

import dashscope
from dashscope import ImageSynthesis

# Set your API key
dashscope.api_key = "YOUR_API_KEY"

# Image-to-image generation
response = ImageSynthesis.call(
    model='qwen-vlo',
    prompt='Transform this coffee shop into a futuristic cyber cafe with neon lights',
    negative_prompt='blurry, low quality',
    n=1,  # Number of images to generate
    size='1024*1024',  # Image size
    steps=50,  # Diffusion steps
    image='path/to/input_image.jpg'  # Input image for editing
)

# Save the generated image.
# Assumption: as with other ImageSynthesis models, each entry in
# response.output.results carries a download URL rather than raw bytes.
if response.status_code == HTTPStatus.OK:
    urllib.request.urlretrieve(response.output.results[0].url, 'qwen_vlo_result.png')
    print('Image saved successfully!')
else:
    print(f'Failed to generate image: {response.message}')</code></pre>
                </div>
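                <p>Rather than hard-coding the key as above, it can be read from the environment. The sketch below is one way to do that; <code>load_api_key</code> is a hypothetical helper, and <code>DASHSCOPE_API_KEY</code> is assumed to be the variable name your deployment uses.</p>
                <div class="code-block">
                    <pre><code>import os


def load_api_key(env_var: str = "DASHSCOPE_API_KEY") -> str:
    """Return the API key stored in the given environment variable."""
    # Strip surrounding whitespace so a stray newline does not corrupt the key.
    key = os.environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError(f"Set the {env_var} environment variable before calling the API.")
    return key


# Usage (with the dashscope package installed):
# dashscope.api_key = load_api_key()</code></pre>
                </div>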
            </section>

            <section class="showcase-section">
                <h2>Showcases</h2>
                
                <div class="showcase-item-full">
                    <div class="showcase-description-full">
                        <h3>Semantic Editing</h3>
                        <p>One of the highlights of Qwen-Image Edit is semantic editing: it can modify image content while preserving the original visual semantics. For example, when editing character images such as Qwen's mascot, Capybara, the model maintains character consistency even when most pixels in the image are changed. This enables effortless and diverse creation of original IP content, such as MBTI-themed emoji packs based on mascot characters.</p>
                    </div>
                    <div class="showcase-image-full">
                        <img src="https://qianwen-res.oss-cn-beijing.aliyuncs.com/Qwen-Image/s1.jpg" alt="Semantic Editing Example" class="showcase-img-full">
                    </div>
                </div>

                <div class="showcase-item-full">
                    <div class="showcase-description-full">
                        <h3>Novel View Synthesis</h3>
                        <p>Qwen-Image Edit excels at novel view synthesis, a key application in semantic editing. The model can rotate objects by various angles, including 90-degree and even full 180-degree rotations, allowing users to see different sides of objects. This capability is particularly valuable for product visualization, architectural rendering, and creative content production where multiple perspectives are needed.</p>
                    </div>
                    <div class="showcase-image-full">
                        <img src="https://qianwen-res.oss-cn-beijing.aliyuncs.com/Qwen-Image/s2.jpg" alt="Novel View Synthesis Example" class="showcase-img-full">
                    </div>
                </div>

                <div class="showcase-item-full">
                    <div class="showcase-description-full">
                        <h3>Appearance Editing</h3>
                        <p>Appearance editing is another powerful capability of Qwen-Image Edit. The model can keep certain regions of an image completely unchanged while adding, removing, or modifying specific elements. For example, it can insert signboards into scenes with corresponding reflections, remove fine details like hair strands, change the color of specific elements, or adjust backgrounds and clothing in portraits—all with exceptional attention to detail and realism.</p>
                    </div>
                    <div class="showcase-image-full">
                        <img src="https://qianwen-res.oss-cn-beijing.aliyuncs.com/Qwen-Image/s3.jpg" alt="Appearance Editing Example" class="showcase-img-full">
                    </div>
                </div>

                <div class="showcase-item-full">
                    <div class="showcase-description-full">
                        <h3>Text Editing Excellence</h3>
                        <p>A standout feature of Qwen-Image Edit is its accurate text editing capability, which stems from Qwen-Image's deep expertise in text rendering. The model excels at editing both English and Chinese text in images, enabling modifications to large headline text as well as precise adjustments to small and intricate text elements. This makes it particularly valuable for poster design, advertisement creation, and multilingual content production.</p>
                    </div>
                    <div class="showcase-image-full">
                        <img src="https://qianwen-res.oss-cn-beijing.aliyuncs.com/Qwen-Image/s4.jpg" alt="Text Editing Example" class="showcase-img-full">
                    </div>
                </div>
                
                <div class="showcase-item-full">
                    <div class="showcase-description-full">
                        <h3>Progressive Editing</h3>
                        <p>Qwen-Image Edit supports chained, step-by-step editing approaches that allow users to progressively refine and correct images. For example, when editing complex calligraphy artwork, users can draw bounding boxes to mark specific regions that need correction and instruct the model to fix these areas one by one. This iterative approach enables precise control over the editing process, ensuring the desired final result is achieved even for challenging edits.</p>
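                        <p>The chained workflow described above can be sketched as a small loop that feeds each edited image into the next instruction. This is a minimal illustration, not part of the Qwen-Image Edit API: <code>run_edit_chain</code> and <code>qwen_edit</code> are hypothetical names, and the adapter shown in the comments assumes the <code>pipeline</code> object from the Quick Start.</p>
                        <div class="code-block">
                            <pre><code>from typing import Callable, List, Sequence, TypeVar

ImageT = TypeVar("ImageT")


def run_edit_chain(
    edit_fn: Callable[[ImageT, str], ImageT],
    image: ImageT,
    prompts: Sequence[str],
) -> List[ImageT]:
    """Apply the prompts in order, feeding each result into the next edit.

    Returns the input image followed by every intermediate result.
    """
    history = [image]
    for prompt in prompts:
        # Each step edits the most recent image, not the original.
        history.append(edit_fn(history[-1], prompt))
    return history


# Hypothetical adapter around the Quick Start pipeline:
# def qwen_edit(image, prompt):
#     with torch.inference_mode():
#         return pipeline(image=image, prompt=prompt, negative_prompt=" ",
#                         true_cfg_scale=4.0, num_inference_steps=50).images[0]
#
# steps = run_edit_chain(qwen_edit, Image.open("./input.png").convert("RGB"),
#                        ["Fix the first marked character", "Fix the second marked character"])
# steps[-1].save("output_chain.png")</code></pre>
                        </div>
                        <p>Keeping every intermediate result makes it possible to inspect a step or restart from an earlier one when an edit goes wrong.</p>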
                    </div>
                    <div class="showcase-image-full">
                        <img src="https://qianwen-res.oss-cn-beijing.aliyuncs.com/Qwen-Image/bench.png" alt="Progressive Editing Example" class="showcase-img-full">
                    </div>
                </div>

                <div class="showcase-conclusion">
                    <p>Together, these features make Qwen-Image Edit not just a tool for basic image editing, but a comprehensive foundation model for intelligent visual transformation—where existing images become the canvas for sophisticated artistic and creative manipulation.</p>
                </div>
            </section>

            <div class="resource-links-section">
                <h2>Resources</h2>
                <div class="resource-links">
                    <a href="https://wavespeed.ai/models/wavespeed-ai/qwen-image/edit" target="_blank" rel="noopener noreferrer" class="resource-link">Qwen-Image Edit on WaveSpeed</a>
                    <a href="https://wavespeed.ai/models/wavespeed-ai/qwen-image" target="_blank" rel="noopener noreferrer" class="resource-link">Qwen-Image on WaveSpeed</a>
                    <a href="https://wavespeed.ai/" target="_blank" rel="noopener noreferrer" class="resource-link">WaveSpeed AI</a>
                    <a href="https://huggingface.co/Qwen/Qwen-Image-Edit" target="_blank" rel="noopener noreferrer" class="resource-link">Hugging Face</a>
                    <a href="https://modelscope.cn/models/qwen/Qwen-Image-Edit" target="_blank" rel="noopener noreferrer" class="resource-link">ModelScope</a>
                    <a href="https://chat.qwen.ai/" target="_blank" rel="noopener noreferrer" class="resource-link">Qwen Chat</a>
                </div>
            </div>
        </div>
    </div>
</body>
</html>