---
title: Pipeline Parallelism Schedule Visualizer
emoji: πŸ“Š
colorFrom: indigo
colorTo: blue
sdk: docker
app_file: app.py
pinned: false
suggested_hardware: cpu-basic
suggested_storage: small
header: default
---

# Pipeline Parallelism Emulation and Visualization

This project provides tools for emulating and visualizing pipeline parallelism strategies used in large language model training.

## Online Demo

**Try it online!** This tool is deployed and accessible on Hugging Face Spaces:

πŸ”— **[https://huggingface.co/spaces/Victarry/PP-schedule-visualizer](https://huggingface.co/spaces/Victarry/PP-schedule-visualizer)**

No installation required - just visit the link and start exploring pipeline parallelism scheduling strategies directly in your browser!

## Overview

Pipeline parallelism is a technique used to train large models by partitioning the model across multiple devices and processing data in a pipelined fashion. This project allows you to:

- Simulate different pipeline parallelism strategies (1F1B, Interleaved, Zero-Bubble, etc.)
- Visualize the execution schedule on multiple devices
- Compare the efficiency of different strategies
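
As a rough rule of thumb for why these strategies differ (see reference 2 below): with `p` pipeline stages and `m` microbatches, a synchronous 1F1B schedule spends about `p - 1` microbatch-slots per iteration filling and draining the pipeline, giving a bubble-to-compute ratio of roughly `(p - 1) / m`. The visualizations make this bubble, and the ways the more advanced schedules shrink or fill it, directly visible.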

## Features

- **Supported Pipeline Strategies**:
  - 1F1B (One-Forward-One-Backward)
  - Interleaved 1F1B
  - Zero-Bubble 1F1B (ZB-1P)
  - 1F1B with computation-communication overlap
  - Interleaved 1F1B with computation-communication overlap
  - DualPipe (Bidirectional pipeline parallelism with full forward-backward overlap)

- **Visualization**:
  - Interactive visualization dashboard using Plotly/Dash
  
- **Configuration**:
  - Configurable simulation parameters through Hydra
  - Customizable stage latency and communication costs

## Installation

This project uses [uv](https://github.com/astral-sh/uv) for dependency management.

Set up `uv` if it is not already installed on your machine:
```bash
# On macOS and Linux
curl -LsSf https://astral.sh/uv/install.sh | sh
```
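
Once `uv` is available, no separate dependency-installation step is needed: `uv run` (used in all the commands below) creates the project's virtual environment and installs the dependencies declared in `pyproject.toml` automatically on first use.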


## Running the Interactive Server

To visualize schedules interactively:

```bash
uv run app.py
```

This will start a Dash server (usually on `http://127.0.0.1:8050/`). Open this URL in your web browser.

You can then adjust parameters (the number of devices, stages, and batches, plus per-operation times) and switch between scheduling strategies to see the resulting pipeline visualization.

## Running from Command Line

### Running for 1F1B strategy:
```bash
uv run python main.py strategy=1f1b num_devices=4 num_stages=4 num_batches=8
```
![1f1b](assets/1f1b.png)

### Running for interleaved strategy:
```bash
uv run python main.py strategy=interleave num_devices=4 num_stages=8 num_batches=8
```
![interleave](assets/interleave_1f1b.png)

You can optionally set `microbatch_group_size_per_vp_stage` to control how many consecutive microbatches are scheduled per virtual pipeline stage.
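
For example, passed as a Hydra override like the other parameters (the value here is illustrative):
```bash
uv run python main.py strategy=interleave num_devices=4 num_stages=8 num_batches=8 microbatch_group_size_per_vp_stage=4
```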

### Running for ZB-1P strategy:
```bash
uv run python main.py strategy=zb1p num_devices=4 num_stages=4 num_batches=8
```
![zb1p](assets/zb1p.png)

### Running for DualPipe strategy:
```bash
uv run python main.py strategy=dualpipe num_devices=8 num_stages=8 num_batches=20
```
![dualpipe](assets/dualpipe.png)

### Running for DualPipe-V strategy:
```bash
uv run python main.py strategy=dualpipe_v num_devices=4 num_stages=8 num_batches=10
```
![dualpipe_v](assets/dualpipe_v.png)

### Running for 1F1B-batch-overlap strategy:
```bash
uv run python main.py strategy=1f1b_overlap num_devices=4 num_stages=4 num_batches=8
```
![1f1b_overlap](assets/1f1b_overlap.png)

### Running for 1F1B-interleave-overlap strategy:
```bash
uv run python main.py strategy=1f1b_interleave_overlap num_devices=4 num_stages=8 num_batches=8
```
![1f1b_interleave_overlap](assets/1f1b_interleave_overlap.png)


## Configuration

The default configuration is in `conf/config.yaml`. You can override any parameter on the command line or create configuration groups for different scenarios.

### Override Specific Parameters

You can override specific parameters at runtime:
```bash
uv run python main.py op_times.forward=0.5 op_times.backward=1.0 num_batches=6
```

Using DualPipe as an example, you can manually set different times for `forward`, `backward`, `backward_D`, `backward_W`, and `overlapped_forward_backward`:
```bash
uv run python main.py strategy=dualpipe num_devices=8 num_stages=8 num_batches=32 op_times.forward=1.0 op_times.backward=2.0 op_times.backward_D=1.0 op_times.backward_W=1.0 op_times.overlapped_forward_backward=2.5
```
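
In this example, `overlapped_forward_backward=2.5` is smaller than `forward + backward = 3.0`; the gap models the time saved when the schedule fuses a forward block with a backward block. Setting it to `3.0` would model an overlap that saves nothing.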


### Using Different Configuration Files

You can use different configuration files with Hydra in several ways:

#### Recommended Approach

1. Create multiple configuration files in the `conf` directory for different use cases (a sketch of such a file follows these steps):
   ```
   conf/
   β”œβ”€β”€ config.yaml     # Default configuration
   └── model_A.yaml    # Create your own config with stage-specific latency for performance projection
   ```

2. Run with your desired configuration using the `--config-name` flag:
   ```bash
   uv run python main.py --config-name=model_A
   ```
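
For illustration, a hypothetical `model_A.yaml` might look like the sketch below. The field names mirror the command-line overrides shown earlier in this README; `conf/config.yaml` remains the authoritative reference for the supported fields:

```yaml
# conf/model_A.yaml -- hypothetical sketch; field names mirror the CLI
# overrides above, so check conf/config.yaml for the authoritative schema.
strategy: dualpipe
num_devices: 8
num_stages: 8
num_batches: 32

# Per-operation latencies, e.g. measured on a real model for projection.
op_times:
  forward: 1.0
  backward: 2.0
  backward_D: 1.0   # input-gradient part of the backward pass
  backward_W: 1.0   # weight-gradient part of the backward pass
  overlapped_forward_backward: 2.5
```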


## Project Structure

```
PP-Emulation/
β”œβ”€β”€ conf/                   # Hydra configuration files
β”‚   └── config.yaml         # Default configuration
β”œβ”€β”€ src/                    # Source code
β”‚   β”œβ”€β”€ __init__.py         # Package initialization
β”‚   β”œβ”€β”€ execution_model.py  # Schedule execution models
β”‚   β”œβ”€β”€ strategies.py       # Pipeline parallelism strategies
β”‚   └── visualizer.py       # Visualization utilities
β”œβ”€β”€ main.py                 # Main entry point
β”œβ”€β”€ pyproject.toml          # Project metadata and dependencies
└── README.md               # This file
```

## References

1. _PipeDream: Fast and Efficient Pipeline Parallel DNN Training_. [arxiv](https://arxiv.org/abs/1806.03377)
2. _Efficient Large-Scale Language Model Training on GPU Clusters Using Megatron-LM_. [arxiv](https://arxiv.org/abs/2104.04473)
3. _Zero Bubble Pipeline Parallelism_. [arxiv](https://arxiv.org/abs/2401.10241)
4. _Communication-Computation Overlap in MoE Training with 1F1B Pipeline Parallelism_. [blog](https://zhuanlan.zhihu.com/p/28463368206)

## License

This project is licensed under the MIT License - see the LICENSE file for details.

## Contributing

Contributions are welcome! Please feel free to submit a Pull Request.