Nicolas Denier committed on
Commit 5ad4868 · 1 Parent(s): 0ae53cb

ready for submission

.gitignore CHANGED
@@ -15,3 +15,5 @@ logs/
15
 
16
  emissions.csv
17
  notebooks/test.ipynb
 
 
 
15
 
16
  emissions.csv
17
  notebooks/test.ipynb
18
+
19
+ .fuse_hidden*
README.md CHANGED
@@ -1,52 +1,107 @@
1
  ---
2
- title: Submission Template
3
- emoji: 🔥
4
- colorFrom: yellow
5
  colorTo: green
6
  sdk: docker
7
  pinned: false
8
  ---
9
 
10
 
11
- # Random Baseline Model for Climate Disinformation Classification
12
 
13
  ## Model Description
14
 
15
- This is a random baseline model for the Frugal AI Challenge 2024, specifically for the text classification task of identifying climate disinformation. The model serves as a performance floor, randomly assigning labels to text inputs without any learning.
16
 
17
  ### Intended Use
18
 
19
- - **Primary intended uses**: Baseline comparison for climate disinformation classification models
20
  - **Primary intended users**: Researchers and developers participating in the Frugal AI Challenge
21
  - **Out-of-scope use cases**: Not intended for production use or real-world classification tasks
22
 
23
- ## Training Data
 
24
 
25
- The model uses the QuotaClimat/frugalaichallenge-text-train dataset:
26
- - Size: ~6000 examples
27
- - Split: 80% train, 20% test
28
- - 8 categories of climate disinformation claims
 
29
 
30
  ### Labels
31
- 0. No relevant claim detected
32
- 1. Global warming is not happening
33
- 2. Not caused by humans
34
- 3. Not bad or beneficial
35
- 4. Solutions harmful/unnecessary
36
- 5. Science is unreliable
37
- 6. Proponents are biased
38
- 7. Fossil fuels are needed
 
39
 
40
  ## Performance
41
 
42
  ### Metrics
43
- - **Accuracy**: ~12.5% (random chance with 8 classes)
 
 
 
44
  - **Environmental Impact**:
45
  - Emissions tracked in gCO2eq
46
  - Energy consumption tracked in Wh
47
 
48
  ### Model Architecture
49
- The model implements a random choice between the 8 possible labels, serving as the simplest possible baseline.
50
 
51
  ## Environmental Impact
52
 
@@ -57,15 +112,24 @@ Environmental impact is tracked using CodeCarbon, measuring:
57
  This tracking helps establish a baseline for the environmental impact of model deployment and inference.
58
 
59
  ## Limitations
60
- - Makes completely random predictions
61
- - No learning or pattern recognition
62
- - No consideration of input text
63
- - Serves only as a baseline reference
64
- - Not suitable for any real-world applications
65
 
66
  ## Ethical Considerations
67
 
68
- - Dataset contains sensitive topics related to climate disinformation
69
- - Model makes random predictions and should not be used for actual classification
70
- - Environmental impact is tracked to promote awareness of AI's carbon footprint
1
  ---
2
+ title: ChainsawDetector
3
+ emoji: 🌳
4
+ colorFrom: lightgreen
5
  colorTo: green
6
  sdk: docker
7
  pinned: false
8
  ---
9
 
10
 
11
+ # ChainsawDetector
12
 
13
  ## Model Description
14
 
15
+ This model is proposed as a submission for the **Frugal AI Challenge 2024**, specifically for the **audio** binary classification task: detecting chainsaws amid environmental noise.
16
 
17
  ### Intended Use
18
 
19
+ - **Primary intended uses**: Non-commercial chainsaw detection from audio recordings
20
  - **Primary intended users**: Researchers and developers participating in the Frugal AI Challenge
21
  - **Out-of-scope use cases**: Not intended for production use or real-world classification tasks
22
 
23
+ ## Training
24
+ ### Data
25
 
26
+ - The model was mainly trained on the [rfcx/frugalai](https://huggingface.co/datasets/rfcx/frugalai) dataset:
27
+ - License: CC BY-NC 4.0
28
+ - Size: 35.3k audio samples of at most 3 seconds each
29
+ - 2 classes (chainsaw or environment)
30
+ - The validation set (15.1k samples) and the final test set are provided by the same source
31
+
32
+ To improve performance, additional datasets were explored:
33
+ - Diverse audio recordings fetched from [freesound](https://freesound.org/)
34
+ - Various open licenses (Creative Commons: Attribution, Non-commercial); see [`datasources/`](datasources/) for complete attributions.
35
+ - Chainsaw and environmental noise (forest, rain)
36
+ - After curation: 2425 chainsaw and 2646 environment samples of 3 seconds each
37
+
38
+ - [ESC-50](https://github.com/karolpiczak/ESC-50)
39
+ - License: CC BY-NC 3.0
40
+ - Initially an environmental sound dataset of 50 classes
41
+ - Only chainsaw (as class 0) and crickets, birds, and wind (as class 1) were kept
42
+ - Only 20 samples of 5 seconds for each class
43
+ - After mixing and cropping: 240 samples for class 0 and 240 for class 1 (a derisory amount, but it was interesting to process)
44
 
45
  ### Labels
46
+ 0. Chainsaw
47
+ 1. Environment
48
+
49
+ ### Preprocessing
50
+ 1. The initial raw audio arrays are first downsampled to 4 kHz.
51
+ Indeed, according to [[1]](https://ieeexplore.ieee.org/document/9909629), "chainsaw harmonics are visible only up to 1kHz". They can be higher, but in practice are "often masked by background noise", so it was decided to keep frequencies up to 2 kHz as model input. Since the Nyquist–Shannon sampling theorem requires a sampling rate of at least twice the highest frequency to be preserved, downsampling to 4 kHz is sufficient for the subsequent Fourier transform. A low-pass filter is applied before downsampling to avoid aliasing.
52
+ This has two major advantages: it reduces the amount of "useless" data to process (in the sense that it does not contain valuable information for identifying chainsaws), leading to faster processing and convergence; and it filters out a substantial part of the possible noise (many high-frequency bird songs are present in the recordings).
53
+
54
+ 2. The Short-Time Fourier Transform (STFT) is used to extract a spectrogram.
55
+ An `n_fft` of 1024, a window length of about 0.25 s, and a hop length close to 0.05 s lead to a spectrogram of size (513, 60) along the frequency and time dimensions, respectively, for the 4 kHz, 3 s inputs. These rather wide windows (compared to speech processing, for example) roughly summarize the information without producing too much data (remember, frugality). More generally, most decisions in this work favored simplicity while still allowing decent performance.
56
+
57
+ 3. Per-Channel Energy Normalization (PCEN) [[2]](https://ieeexplore.ieee.org/document/8514023) is applied.
58
+ It is described as "a computationally efficient frontend for robust detection and classification of acoustic events", which is exactly what is needed here. It should bring better performance than traditional MFCCs in this case.
59
+
60
+ 4. The spectrograms are split along the time dimension into 6 chunks of time-length 10
61
+ (which is about half a second with context).
62
+ These (513, 10)-sized chunks are fed sequentially to the model. The idea is that a signal of any length can be chunked along time and thus processed by the model. In this demonstration, only 3 s signals are used so that they can be processed in batches, but in a real-world application this architecture can process continuous, real-time audio recordings. A minimal sketch of the whole pipeline is given just below.
63
+
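A minimal sketch of this preprocessing pipeline, assuming `librosa` (the exact resampling filter, PCEN parameters, and scaling constant below are assumptions, not necessarily the submission's exact code):

```python
# Preprocessing sketch: downsample to 4 kHz -> STFT -> PCEN -> time chunks.
import numpy as np
import librosa

TARGET_SR = 4000     # keeps content up to 2 kHz (Nyquist)
N_FFT = 1024         # -> 513 frequency bins
WIN_LENGTH = 1000    # ~0.25 s at 4 kHz
HOP_LENGTH = 200     # ~0.05 s at 4 kHz
CHUNK_WIDTH = 10     # ~0.5 s of context per chunk

def preprocess(y: np.ndarray, sr: int) -> list[np.ndarray]:
    # Band-limited resampling low-pass filters implicitly, avoiding aliasing.
    y = librosa.resample(y, orig_sr=sr, target_sr=TARGET_SR)
    # Magnitude spectrogram: shape (513, ~60) for a 3 s input.
    mag = np.abs(librosa.stft(y, n_fft=N_FFT, win_length=WIN_LENGTH,
                              hop_length=HOP_LENGTH))
    # Per-Channel Energy Normalization (librosa defaults; input scaled as
    # suggested by the librosa documentation).
    pcen = librosa.pcen(mag * (2 ** 31), sr=TARGET_SR, hop_length=HOP_LENGTH)
    # Split along time into (513, 10) chunks, fed sequentially to the model.
    n_chunks = pcen.shape[1] // CHUNK_WIDTH
    return [pcen[:, i * CHUNK_WIDTH:(i + 1) * CHUNK_WIDTH]
            for i in range(n_chunks)]
```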
64
+ ### Augmentation
65
+ During training only, several data augmentation techniques were applied (sketched below):
66
+ - Small time shifts: the spectrograms are randomly zero-padded to the right.
67
+ - Random rectangular masks are applied to hide part of the input data.
68
+ - Moderate Gaussian noise is added so the model does not simply memorize each data point.
69
+
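A rough sketch of these augmentations (the shift range, mask sizes, and noise level are illustrative assumptions):

```python
# Augmentation sketch: time shift, rectangular masks, Gaussian noise.
import torch
import torch.nn.functional as F

def augment(spec: torch.Tensor, max_shift: int = 5, n_masks: int = 2,
            noise_std: float = 0.05) -> torch.Tensor:
    """spec: a (freq, time) PCEN spectrogram; returns an augmented copy."""
    freq, time = spec.shape
    # 1. Small time shift: random zero padding to the right, crop to length.
    shift = int(torch.randint(0, max_shift + 1, (1,)))
    spec = F.pad(spec, (0, shift))[:, shift:].clone()
    # 2. Random rectangular masks hiding parts of the input.
    for _ in range(n_masks):
        f0 = int(torch.randint(0, freq - 32, (1,)))
        t0 = int(torch.randint(0, max(time - 4, 1), (1,)))
        spec[f0:f0 + 32, t0:t0 + 4] = 0.0
    # 3. Moderate Gaussian noise so the model cannot memorize samples.
    return spec + noise_std * torch.randn_like(spec)
```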
70
+ ### Details
71
+ The final model was trained for 20 epochs with a OneCycle learning rate schedule peaking at 0.005.
72
+ A batch size of 16 was chosen to keep performance sufficient, as the model was trained entirely on a CPU (Intel® Core™ i5-1135G7)!
73
+ With AdamW as the optimizer, the first half of training ran in float32 with automatic mixed precision, and the last part fully in bfloat16.
74
+ The model converges quickly, although erratically due to data augmentation, then improves slowly from 90% accuracy until plateauing at 93%.
75
+ Training code can be found in the [training/](training/) directory (not used for inference, provided for information only); a hypothetical skeleton of the loop is sketched below.
76
 
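A hypothetical skeleton of that loop (the `model` and `train_set` objects, and the exact point where the precision switches, are assumptions):

```python
# Training-loop sketch: AdamW + OneCycle peaking at 5e-3, batch size 16, CPU.
import torch
from torch.utils.data import DataLoader

def train(model, train_set, epochs=20, batch_size=16, peak_lr=5e-3):
    loader = DataLoader(train_set, batch_size=batch_size, shuffle=True)
    opt = torch.optim.AdamW(model.parameters(), lr=peak_lr)
    sched = torch.optim.lr_scheduler.OneCycleLR(
        opt, max_lr=peak_lr, epochs=epochs, steps_per_epoch=len(loader))
    loss_fn = torch.nn.BCEWithLogitsLoss()
    for epoch in range(epochs):
        for chunks, labels in loader:
            # CPU autocast in bfloat16 stands in for the float32-AMP /
            # full-bfloat16 split described above.
            with torch.autocast("cpu", dtype=torch.bfloat16):
                loss = loss_fn(model(chunks), labels.float())
            opt.zero_grad()
            loss.backward()
            opt.step()
            sched.step()
```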
77
  ## Performance
78
 
79
  ### Metrics
80
+ - **Accuracy**: 93.02% on the test split
81
+ - **Precision**: 93.51%
82
+ - **Recall**: 89.97%
83
+ - **F-score**: 91.71%
84
  - **Environmental Impact**:
85
  - Emissions tracked in gCO2eq
86
  - Energy consumption tracked in Wh
87
+ - **Mistakes**
88
+ - False positives represent 38.35% of the mistakes
89
+ - False negatives represent 61.65% of the mistakes
90
+ The model tends to predict class 1 (environment) more often.
91
+ Possible explanations are:
92
+ - This class is slightly more represented in the training dataset
93
+ - It corresponds to the default class (the LSTM initial states are biased towards it)
94
+ - Technically, every audio sample contains environmental noise; chainsaw occurrences sit on top of it
95
+ Overall, this is not a bad thing, as false alarms can waste time in real-world situations (see the sanity check below).
96
 
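As a sanity check, the reported F-score is consistent with the precision and recall above: 2 × 0.9351 × 0.8997 / (0.9351 + 0.8997) ≈ 0.9171. The mistake breakdown can be computed along these lines (a hypothetical helper, treating class 0, chainsaw, as the positive class):

```python
from sklearn.metrics import confusion_matrix

def mistake_shares(y_true, y_pred):
    # With labels=[1, 0], ravel() yields (tn, fp, fn, tp) under the
    # convention class 0 = chainsaw (positive), class 1 = environment.
    tn, fp, fn, tp = confusion_matrix(y_true, y_pred, labels=[1, 0]).ravel()
    mistakes = fp + fn
    return fp / mistakes, fn / mistakes  # here: ~0.38 and ~0.62
```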
97
  ### Model Architecture
98
+ The model takes a sequence of chunks (as described above) as input and produces a single decision (0 or 1).
99
+ Three convolutional layers (with some max pooling) reduce each input chunk to a 2D tensor of 8 points across 16 channels, then a fourth one shrinks the channels, producing an 8-length vector. The first convolution is 2D, but the next ones are 1D, as the time axis is quickly reduced to a single dimension. The vector thus summarizes the frequencies into 8 values.
100
+ These values are passed to an LSTM that also receives an initial state of ones (environment by default).
101
+ Each chunk is then processed the same way: the same convolutional kernels, then the persistent LSTM updating its state.
102
+ At the end of the signal (here, after 6 chunks), a final dense layer takes the last LSTM state (8 values) and outputs a raw prediction. (ReLU activations are used in the hidden states, as well as after the convolutions, because they are cheap to compute.)
103
+ After a sigmoid activation and a simple threshold (1 if above 0.5, else 0), the final decision is produced.
104
+ In total, 1798 parameters are used. An illustrative sketch is given below.
105
 
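An illustrative PyTorch module following this description (kernel and pooling sizes below are assumptions chosen to reproduce the stated intermediate shapes; the actual 1798-parameter configuration may differ):

```python
import torch
import torch.nn as nn

class ChainsawDetector(nn.Module):
    def __init__(self):
        super().__init__()
        # The first convolution is 2D and collapses the chunk's time axis.
        self.conv2d = nn.Conv2d(1, 16, kernel_size=(7, 10))
        # Two 1D convolutions with max pooling reduce the frequency axis.
        self.conv1d_a = nn.Conv1d(16, 16, kernel_size=5)
        self.conv1d_b = nn.Conv1d(16, 16, kernel_size=5)
        self.pool = nn.MaxPool1d(4)
        self.to_eight = nn.AdaptiveMaxPool1d(8)   # -> 16 channels x 8 points
        # A fourth convolution shrinks the channels -> an 8-value vector.
        self.shrink = nn.Conv1d(16, 1, kernel_size=1)
        self.lstm = nn.LSTMCell(8, 8)
        self.head = nn.Linear(8, 1)

    def forward(self, chunks: torch.Tensor) -> torch.Tensor:
        # chunks: (batch, n_chunks, 513, 10); n_chunks = 6 for 3 s inputs.
        batch, n_chunks = chunks.shape[:2]
        # Initial LSTM state of ones: "environment by default".
        h = torch.ones(batch, 8, device=chunks.device)
        c = torch.ones(batch, 8, device=chunks.device)
        for t in range(n_chunks):
            x = torch.relu(self.conv2d(chunks[:, t].unsqueeze(1)))  # (B,16,507,1)
            x = x.squeeze(-1)
            x = self.pool(torch.relu(self.conv1d_a(x)))             # (B,16,125)
            x = self.pool(torch.relu(self.conv1d_b(x)))             # (B,16,30)
            x = torch.relu(self.shrink(self.to_eight(x))).squeeze(1)  # (B,8)
            h, c = self.lstm(x, (h, c))
        return self.head(h).squeeze(-1)  # raw logit

# Usage: sigmoid + 0.5 threshold turns the logit into the final decision.
logit = ChainsawDetector()(torch.rand(2, 6, 513, 10))
decision = (torch.sigmoid(logit) > 0.5).int()
```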
106
  ## Environmental Impact
107
 
 
112
  This tracking helps establish a baseline for the environmental impact of model deployment and inference.
113
 
114
  ## Limitations
115
+ - Not much time was spent on hyperparameter optimization; only the learning rate and a few layer configurations (kernel sizes, number of layers) were explored. The main reason is that HPO is expensive in time and compute, but there are certainly improvements to be found if it is considered in more detail.
116
+ - Trainable PCEN implementations exist and could be interesting, but they use more weights.
117
+ - More, and more diverse, data could help the model distinguish chainsaws from any other kind of noise possible in a wild forest (and there are apparently a lot).
118
+
 
119
 
120
  ## Ethical Considerations
121
 
122
+ - Environmental impact is tracked to promote awareness of AI's carbon footprint.
123
+ - Advice from [[3]](https://arxiv.org/pdf/2106.08962) was applied to reduce model size while keeping good performance.
124
+ - Illegal deforestation is bad. So is the legal kind, though.
125
+
126
+ ## References
127
+ - [1] N. Stefanakis, K. Psaroulakis, N. Simou and C. Astaras, "An Open-Access System for Long-Range Chainsaw Sound Detection", 2022 30th European Signal Processing Conference (EUSIPCO), Belgrade, Serbia, 2022, pp. 264-268, doi: 10.23919/EUSIPCO55093.2022.9909629.
128
+
129
+ - [2] V. Lostanlen et al., "Per-Channel Energy Normalization: Why and How", in IEEE Signal Processing Letters, vol. 26, no. 1, pp. 39-43, Jan. 2019, doi: 10.1109/LSP.2018.2878620.
130
+
131
+ - [3] G. Menghani, "Efficient Deep Learning: A Survey on Making Deep Learning Models Smaller, Faster, and Better", ACM Computing Surveys, vol. 55, no. 12, pp. 1–37, 2023, doi: 10.1145/3578938.
132
+
134
+
135
+
app.py CHANGED
@@ -1,6 +1,6 @@
1
  from fastapi import FastAPI
2
  from dotenv import load_dotenv
3
- from tasks import text, image, audio
4
 
5
  # Load environment variables
6
  load_dotenv()
@@ -10,18 +10,14 @@ app = FastAPI(
10
  description="API for the Frugal AI Challenge evaluation endpoints"
11
  )
12
 
13
- # Include all routers
14
- app.include_router(text.router)
15
- app.include_router(image.router)
16
  app.include_router(audio.router)
17
 
18
  @app.get("/")
19
  async def root():
20
  return {
21
- "message": "Welcome to the Frugal AI Challenge API",
22
  "endpoints": {
23
- "text": "/text - Text classification task",
24
- "image": "/image - Image classification task (coming soon)",
25
- "audio": "/audio - Audio classification task (coming soon)"
26
  }
27
  }
 
1
  from fastapi import FastAPI
2
  from dotenv import load_dotenv
3
+ from tasks import audio
4
 
5
  # Load environment variables
6
  load_dotenv()
 
10
  description="API for the Frugal AI Challenge evaluation endpoints"
11
  )
12
 
13
+ # Include the audio router
 
 
14
  app.include_router(audio.router)
15
 
16
  @app.get("/")
17
  async def root():
18
  return {
19
+ "message": "Frugal AI Challenge submission API",
20
  "endpoints": {
21
+ "audio": "/audio - Audio classification task"
 
 
22
  }
23
  }
notebooks/template-audio.ipynb DELETED
@@ -1,1351 +0,0 @@
1
- {
2
- "cells": [
3
- {
4
- "cell_type": "markdown",
5
- "metadata": {},
6
- "source": [
7
- "# Text task notebook template\n",
8
- "## Loading the necessary libraries"
9
- ]
10
- },
11
- {
12
- "cell_type": "code",
13
- "execution_count": 3,
14
- "metadata": {},
15
- "outputs": [
16
- {
17
- "name": "stderr",
18
- "output_type": "stream",
19
- "text": [
20
- "[codecarbon WARNING @ 19:48:07] Multiple instances of codecarbon are allowed to run at the same time.\n",
21
- "[codecarbon INFO @ 19:48:07] [setup] RAM Tracking...\n",
22
- "[codecarbon INFO @ 19:48:07] [setup] CPU Tracking...\n",
23
- "[codecarbon WARNING @ 19:48:09] We saw that you have a 13th Gen Intel(R) Core(TM) i7-1365U but we don't know it. Please contact us.\n",
24
- "[codecarbon WARNING @ 19:48:09] No CPU tracking mode found. Falling back on CPU constant mode. \n",
25
- " Windows OS detected: Please install Intel Power Gadget to measure CPU\n",
26
- "\n",
27
- "[codecarbon WARNING @ 19:48:11] We saw that you have a 13th Gen Intel(R) Core(TM) i7-1365U but we don't know it. Please contact us.\n",
28
- "[codecarbon INFO @ 19:48:11] CPU Model on constant consumption mode: 13th Gen Intel(R) Core(TM) i7-1365U\n",
29
- "[codecarbon WARNING @ 19:48:11] No CPU tracking mode found. Falling back on CPU constant mode.\n",
30
- "[codecarbon INFO @ 19:48:11] [setup] GPU Tracking...\n",
31
- "[codecarbon INFO @ 19:48:11] No GPU found.\n",
32
- "[codecarbon INFO @ 19:48:11] >>> Tracker's metadata:\n",
33
- "[codecarbon INFO @ 19:48:11] Platform system: Windows-11-10.0.22631-SP0\n",
34
- "[codecarbon INFO @ 19:48:11] Python version: 3.12.7\n",
35
- "[codecarbon INFO @ 19:48:11] CodeCarbon version: 3.0.0_rc0\n",
36
- "[codecarbon INFO @ 19:48:11] Available RAM : 31.347 GB\n",
37
- "[codecarbon INFO @ 19:48:11] CPU count: 12\n",
38
- "[codecarbon INFO @ 19:48:11] CPU model: 13th Gen Intel(R) Core(TM) i7-1365U\n",
39
- "[codecarbon INFO @ 19:48:11] GPU count: None\n",
40
- "[codecarbon INFO @ 19:48:11] GPU model: None\n",
41
- "[codecarbon INFO @ 19:48:11] Saving emissions data to file c:\\git\\submission-template\\notebooks\\emissions.csv\n"
42
- ]
43
- }
44
- ],
45
- "source": [
46
- "from fastapi import APIRouter\n",
47
- "from datetime import datetime\n",
48
- "from datasets import load_dataset\n",
49
- "from sklearn.metrics import accuracy_score\n",
50
- "import random\n",
51
- "\n",
52
- "import sys\n",
53
- "sys.path.append('../tasks')\n",
54
- "\n",
55
- "from utils.evaluation import AudioEvaluationRequest\n",
56
- "from utils.emissions import tracker, clean_emissions_data, get_space_info\n",
57
- "\n",
58
- "\n",
59
- "# Define the label mapping\n",
60
- "LABEL_MAPPING = {\n",
61
- " \"chainsaw\": 0,\n",
62
- " \"environment\": 1\n",
63
- "}"
64
- ]
65
- },
66
- {
67
- "cell_type": "markdown",
68
- "metadata": {},
69
- "source": [
70
- "## Loading the datasets and splitting them"
71
- ]
72
- },
73
- {
74
- "cell_type": "code",
75
- "execution_count": 4,
76
- "metadata": {},
77
- "outputs": [
78
- {
79
- "data": {
80
- "application/vnd.jupyter.widget-view+json": {
81
- "model_id": "668da7bf85434e098b95c3ec447d78fe",
82
- "version_major": 2,
83
- "version_minor": 0
84
- },
85
- "text/plain": [
86
- "README.md: 0%| | 0.00/5.18k [00:00<?, ?B/s]"
87
- ]
88
- },
89
- "metadata": {},
90
- "output_type": "display_data"
91
- },
92
- {
93
- "name": "stderr",
94
- "output_type": "stream",
95
- "text": [
96
- "c:\\Users\\theo.alvesdacosta\\AppData\\Local\\anaconda3\\Lib\\site-packages\\huggingface_hub\\file_download.py:139: UserWarning: `huggingface_hub` cache-system uses symlinks by default to efficiently store duplicated files but your machine does not support them in C:\\Users\\theo.alvesdacosta\\.cache\\huggingface\\hub\\datasets--QuotaClimat--frugalaichallenge-text-train. Caching files will still work but in a degraded version that might require more space on your disk. This warning can be disabled by setting the `HF_HUB_DISABLE_SYMLINKS_WARNING` environment variable. For more details, see https://huggingface.co/docs/huggingface_hub/how-to-cache#limitations.\n",
97
- "To support symlinks on Windows, you either need to activate Developer Mode or to run Python as an administrator. In order to activate developer mode, see this article: https://docs.microsoft.com/en-us/windows/apps/get-started/enable-your-device-for-development\n",
98
- " warnings.warn(message)\n"
99
- ]
100
- },
101
- {
102
- "data": {
103
- "application/vnd.jupyter.widget-view+json": {
104
- "model_id": "5b68d43359eb429395da8be7d4b15556",
105
- "version_major": 2,
106
- "version_minor": 0
107
- },
108
- "text/plain": [
109
- "train.parquet: 0%| | 0.00/1.21M [00:00<?, ?B/s]"
110
- ]
111
- },
112
- "metadata": {},
113
- "output_type": "display_data"
114
- },
115
- {
116
- "data": {
117
- "application/vnd.jupyter.widget-view+json": {
118
- "model_id": "140a304773914e9db8f698eabeb40298",
119
- "version_major": 2,
120
- "version_minor": 0
121
- },
122
- "text/plain": [
123
- "Generating train split: 0%| | 0/6091 [00:00<?, ? examples/s]"
124
- ]
125
- },
126
- "metadata": {},
127
- "output_type": "display_data"
128
- },
129
- {
130
- "data": {
131
- "application/vnd.jupyter.widget-view+json": {
132
- "model_id": "6d04e8ab1906400e8e0029949dc523a5",
133
- "version_major": 2,
134
- "version_minor": 0
135
- },
136
- "text/plain": [
137
- "Map: 0%| | 0/6091 [00:00<?, ? examples/s]"
138
- ]
139
- },
140
- "metadata": {},
141
- "output_type": "display_data"
142
- }
143
- ],
144
- "source": [
145
- "request = AudioEvaluationRequest()\n",
146
- "\n",
147
- "# Load and prepare the dataset\n",
148
- "dataset = load_dataset(request.dataset_name)\n",
149
- "\n",
150
- "# Split dataset\n",
151
- "train_test = dataset[\"train\"]\n",
152
- "test_dataset = dataset[\"test\"]"
153
- ]
154
- },
155
- {
156
- "cell_type": "markdown",
157
- "metadata": {},
158
- "source": [
159
- "## Random Baseline"
160
- ]
161
- },
162
- {
163
- "cell_type": "code",
164
- "execution_count": 5,
165
- "metadata": {},
166
- "outputs": [],
167
- "source": [
168
- "# Start tracking emissions\n",
169
- "tracker.start()\n",
170
- "tracker.start_task(\"inference\")"
171
- ]
172
- },
173
- {
174
- "cell_type": "code",
175
- "execution_count": 6,
176
- "metadata": {},
177
- "outputs": [
178
- {
179
- "data": {
180
- "text/plain": [
181
- "[1,\n",
182
- " 7,\n",
183
- " 6,\n",
184
- " 6,\n",
185
- " 2,\n",
186
- " 0,\n",
187
- " 1,\n",
188
- " 7,\n",
189
- " 3,\n",
190
- " 6,\n",
191
- " 6,\n",
192
- " 3,\n",
193
- " 6,\n",
194
- " 6,\n",
195
- " 5,\n",
196
- " 0,\n",
197
- " 2,\n",
198
- " 6,\n",
199
- " 2,\n",
200
- " 6,\n",
201
- " 5,\n",
202
- " 4,\n",
203
- " 1,\n",
204
- " 3,\n",
205
- " 6,\n",
206
- " 4,\n",
207
- " 2,\n",
208
- " 1,\n",
209
- " 4,\n",
210
- " 0,\n",
211
- " 3,\n",
212
- " 4,\n",
213
- " 1,\n",
214
- " 5,\n",
215
- " 5,\n",
216
- " 1,\n",
217
- " 2,\n",
218
- " 7,\n",
219
- " 6,\n",
220
- " 1,\n",
221
- " 3,\n",
222
- " 1,\n",
223
- " 7,\n",
224
- " 7,\n",
225
- " 0,\n",
226
- " 0,\n",
227
- " 3,\n",
228
- " 3,\n",
229
- " 3,\n",
230
- " 4,\n",
231
- " 1,\n",
232
- " 4,\n",
233
- " 4,\n",
234
- " 1,\n",
235
- " 4,\n",
236
- " 5,\n",
237
- " 6,\n",
238
- " 1,\n",
239
- " 2,\n",
240
- " 2,\n",
241
- " 2,\n",
242
- " 5,\n",
243
- " 2,\n",
244
- " 7,\n",
245
- " 2,\n",
246
- " 7,\n",
247
- " 7,\n",
248
- " 6,\n",
249
- " 4,\n",
250
- " 2,\n",
251
- " 0,\n",
252
- " 1,\n",
253
- " 6,\n",
254
- " 3,\n",
255
- " 2,\n",
256
- " 5,\n",
257
- " 5,\n",
258
- " 2,\n",
259
- " 0,\n",
260
- " 7,\n",
261
- " 0,\n",
262
- " 1,\n",
263
- " 5,\n",
264
- " 5,\n",
265
- " 7,\n",
266
- " 4,\n",
267
- " 6,\n",
268
- " 7,\n",
269
- " 1,\n",
270
- " 7,\n",
271
- " 1,\n",
272
- " 0,\n",
273
- " 3,\n",
274
- " 4,\n",
275
- " 2,\n",
276
- " 5,\n",
277
- " 3,\n",
278
- " 3,\n",
279
- " 3,\n",
280
- " 2,\n",
281
- " 2,\n",
282
- " 1,\n",
283
- " 0,\n",
284
- " 4,\n",
285
- " 5,\n",
286
- " 7,\n",
287
- " 0,\n",
288
- " 3,\n",
289
- " 1,\n",
290
- " 4,\n",
291
- " 6,\n",
292
- " 0,\n",
293
- " 7,\n",
294
- " 1,\n",
295
- " 1,\n",
296
- " 2,\n",
297
- " 2,\n",
298
- " 4,\n",
299
- " 0,\n",
300
- " 4,\n",
301
- " 3,\n",
302
- " 4,\n",
303
- " 4,\n",
304
- " 2,\n",
305
- " 2,\n",
306
- " 3,\n",
307
- " 3,\n",
308
- " 7,\n",
309
- " 4,\n",
310
- " 7,\n",
311
- " 6,\n",
312
- " 4,\n",
313
- " 5,\n",
314
- " 4,\n",
315
- " 3,\n",
316
- " 6,\n",
317
- " 0,\n",
318
- " 4,\n",
319
- " 0,\n",
320
- " 1,\n",
321
- " 3,\n",
322
- " 6,\n",
323
- " 7,\n",
324
- " 3,\n",
325
- " 3,\n",
326
- " 0,\n",
327
- " 1,\n",
328
- " 2,\n",
329
- " 4,\n",
330
- " 4,\n",
331
- " 3,\n",
332
- " 1,\n",
333
- " 2,\n",
334
- " 4,\n",
335
- " 3,\n",
336
- " 0,\n",
337
- " 5,\n",
338
- " 3,\n",
339
- " 6,\n",
340
- " 3,\n",
341
- " 6,\n",
342
- " 1,\n",
343
- " 3,\n",
344
- " 4,\n",
345
- " 5,\n",
346
- " 4,\n",
347
- " 0,\n",
348
- " 7,\n",
349
- " 3,\n",
350
- " 6,\n",
351
- " 7,\n",
352
- " 4,\n",
353
- " 4,\n",
354
- " 5,\n",
355
- " 3,\n",
356
- " 1,\n",
357
- " 7,\n",
358
- " 4,\n",
359
- " 1,\n",
360
- " 0,\n",
361
- " 3,\n",
362
- " 0,\n",
363
- " 5,\n",
364
- " 3,\n",
365
- " 6,\n",
366
- " 3,\n",
367
- " 0,\n",
368
- " 7,\n",
369
- " 2,\n",
370
- " 0,\n",
371
- " 4,\n",
372
- " 1,\n",
373
- " 2,\n",
374
- " 6,\n",
375
- " 3,\n",
376
- " 4,\n",
377
- " 4,\n",
378
- " 5,\n",
379
- " 1,\n",
380
- " 5,\n",
381
- " 4,\n",
382
- " 0,\n",
383
- " 1,\n",
384
- " 7,\n",
385
- " 3,\n",
386
- " 6,\n",
387
- " 0,\n",
388
- " 7,\n",
389
- " 4,\n",
390
- " 6,\n",
391
- " 3,\n",
392
- " 0,\n",
393
- " 0,\n",
394
- " 4,\n",
395
- " 6,\n",
396
- " 6,\n",
397
- " 4,\n",
398
- " 0,\n",
399
- " 5,\n",
400
- " 7,\n",
401
- " 5,\n",
402
- " 1,\n",
403
- " 3,\n",
404
- " 6,\n",
405
- " 2,\n",
406
- " 3,\n",
407
- " 2,\n",
408
- " 4,\n",
409
- " 5,\n",
410
- " 1,\n",
411
- " 5,\n",
412
- " 0,\n",
413
- " 3,\n",
414
- " 3,\n",
415
- " 0,\n",
416
- " 0,\n",
417
- " 6,\n",
418
- " 6,\n",
419
- " 2,\n",
420
- " 0,\n",
421
- " 7,\n",
422
- " 4,\n",
423
- " 5,\n",
424
- " 7,\n",
425
- " 1,\n",
426
- " 0,\n",
427
- " 4,\n",
428
- " 5,\n",
429
- " 1,\n",
430
- " 7,\n",
431
- " 0,\n",
432
- " 7,\n",
433
- " 2,\n",
434
- " 6,\n",
435
- " 1,\n",
436
- " 3,\n",
437
- " 5,\n",
438
- " 5,\n",
439
- " 6,\n",
440
- " 5,\n",
441
- " 4,\n",
442
- " 3,\n",
443
- " 7,\n",
444
- " 4,\n",
445
- " 3,\n",
446
- " 5,\n",
447
- " 5,\n",
448
- " 7,\n",
449
- " 2,\n",
450
- " 6,\n",
451
- " 1,\n",
452
- " 5,\n",
453
- " 0,\n",
454
- " 3,\n",
455
- " 4,\n",
456
- " 2,\n",
457
- " 3,\n",
458
- " 7,\n",
459
- " 0,\n",
460
- " 1,\n",
461
- " 7,\n",
462
- " 6,\n",
463
- " 7,\n",
464
- " 7,\n",
465
- " 5,\n",
466
- " 6,\n",
467
- " 3,\n",
468
- " 2,\n",
469
- " 3,\n",
470
- " 0,\n",
471
- " 4,\n",
472
- " 3,\n",
473
- " 5,\n",
474
- " 6,\n",
475
- " 0,\n",
476
- " 0,\n",
477
- " 6,\n",
478
- " 6,\n",
479
- " 1,\n",
480
- " 4,\n",
481
- " 0,\n",
482
- " 4,\n",
483
- " 2,\n",
484
- " 7,\n",
485
- " 5,\n",
486
- " 7,\n",
487
- " 6,\n",
488
- " 3,\n",
489
- " 5,\n",
490
- " 6,\n",
491
- " 0,\n",
492
- " 4,\n",
493
- " 5,\n",
494
- " 6,\n",
495
- " 1,\n",
496
- " 2,\n",
497
- " 1,\n",
498
- " 5,\n",
499
- " 3,\n",
500
- " 0,\n",
501
- " 3,\n",
502
- " 7,\n",
503
- " 1,\n",
504
- " 0,\n",
505
- " 7,\n",
506
- " 0,\n",
507
- " 1,\n",
508
- " 0,\n",
509
- " 4,\n",
510
- " 1,\n",
511
- " 1,\n",
512
- " 0,\n",
513
- " 7,\n",
514
- " 1,\n",
515
- " 0,\n",
516
- " 7,\n",
517
- " 6,\n",
518
- " 2,\n",
519
- " 3,\n",
520
- " 7,\n",
521
- " 4,\n",
522
- " 3,\n",
523
- " 4,\n",
524
- " 3,\n",
525
- " 3,\n",
526
- " 2,\n",
527
- " 5,\n",
528
- " 1,\n",
529
- " 5,\n",
530
- " 1,\n",
531
- " 7,\n",
532
- " 3,\n",
533
- " 2,\n",
534
- " 6,\n",
535
- " 4,\n",
536
- " 4,\n",
537
- " 1,\n",
538
- " 2,\n",
539
- " 6,\n",
540
- " 7,\n",
541
- " 2,\n",
542
- " 7,\n",
543
- " 1,\n",
544
- " 3,\n",
545
- " 5,\n",
546
- " 2,\n",
547
- " 6,\n",
548
- " 4,\n",
549
- " 6,\n",
550
- " 7,\n",
551
- " 0,\n",
552
- " 5,\n",
553
- " 1,\n",
554
- " 6,\n",
555
- " 5,\n",
556
- " 3,\n",
557
- " 6,\n",
558
- " 5,\n",
559
- " 4,\n",
560
- " 7,\n",
561
- " 6,\n",
562
- " 5,\n",
563
- " 4,\n",
564
- " 3,\n",
565
- " 0,\n",
566
- " 0,\n",
567
- " 1,\n",
568
- " 7,\n",
569
- " 7,\n",
570
- " 6,\n",
571
- " 1,\n",
572
- " 4,\n",
573
- " 5,\n",
574
- " 6,\n",
575
- " 1,\n",
576
- " 5,\n",
577
- " 1,\n",
578
- " 2,\n",
579
- " 6,\n",
580
- " 2,\n",
581
- " 6,\n",
582
- " 0,\n",
583
- " 2,\n",
584
- " 1,\n",
585
- " 5,\n",
586
- " 5,\n",
587
- " 1,\n",
588
- " 7,\n",
589
- " 0,\n",
590
- " 5,\n",
591
- " 5,\n",
592
- " 1,\n",
593
- " 7,\n",
594
- " 7,\n",
595
- " 2,\n",
596
- " 1,\n",
597
- " 0,\n",
598
- " 1,\n",
599
- " 0,\n",
600
- " 5,\n",
601
- " 4,\n",
602
- " 2,\n",
603
- " 7,\n",
604
- " 4,\n",
605
- " 3,\n",
606
- " 6,\n",
607
- " 7,\n",
608
- " 5,\n",
609
- " 1,\n",
610
- " 0,\n",
611
- " 7,\n",
612
- " 2,\n",
613
- " 1,\n",
614
- " 2,\n",
615
- " 3,\n",
616
- " 1,\n",
617
- " 0,\n",
618
- " 3,\n",
619
- " 2,\n",
620
- " 6,\n",
621
- " 0,\n",
622
- " 5,\n",
623
- " 4,\n",
624
- " 7,\n",
625
- " 1,\n",
626
- " 1,\n",
627
- " 0,\n",
628
- " 7,\n",
629
- " 0,\n",
630
- " 6,\n",
631
- " 7,\n",
632
- " 6,\n",
633
- " 1,\n",
634
- " 5,\n",
635
- " 5,\n",
636
- " 7,\n",
637
- " 6,\n",
638
- " 1,\n",
639
- " 7,\n",
640
- " 6,\n",
641
- " 5,\n",
642
- " 4,\n",
643
- " 1,\n",
644
- " 4,\n",
645
- " 7,\n",
646
- " 5,\n",
647
- " 4,\n",
648
- " 0,\n",
649
- " 0,\n",
650
- " 7,\n",
651
- " 0,\n",
652
- " 0,\n",
653
- " 3,\n",
654
- " 6,\n",
655
- " 2,\n",
656
- " 5,\n",
657
- " 3,\n",
658
- " 0,\n",
659
- " 3,\n",
660
- " 6,\n",
661
- " 5,\n",
662
- " 7,\n",
663
- " 2,\n",
664
- " 6,\n",
665
- " 7,\n",
666
- " 5,\n",
667
- " 2,\n",
668
- " 3,\n",
669
- " 6,\n",
670
- " 7,\n",
671
- " 7,\n",
672
- " 7,\n",
673
- " 6,\n",
674
- " 1,\n",
675
- " 7,\n",
676
- " 4,\n",
677
- " 2,\n",
678
- " 7,\n",
679
- " 5,\n",
680
- " 4,\n",
681
- " 1,\n",
682
- " 2,\n",
683
- " 3,\n",
684
- " 7,\n",
685
- " 0,\n",
686
- " 2,\n",
687
- " 7,\n",
688
- " 6,\n",
689
- " 1,\n",
690
- " 4,\n",
691
- " 0,\n",
692
- " 6,\n",
693
- " 3,\n",
694
- " 1,\n",
695
- " 0,\n",
696
- " 3,\n",
697
- " 4,\n",
698
- " 7,\n",
699
- " 7,\n",
700
- " 4,\n",
701
- " 2,\n",
702
- " 1,\n",
703
- " 0,\n",
704
- " 5,\n",
705
- " 1,\n",
706
- " 7,\n",
707
- " 4,\n",
708
- " 6,\n",
709
- " 7,\n",
710
- " 7,\n",
711
- " 3,\n",
712
- " 4,\n",
713
- " 3,\n",
714
- " 5,\n",
715
- " 4,\n",
716
- " 4,\n",
717
- " 5,\n",
718
- " 0,\n",
719
- " 1,\n",
720
- " 3,\n",
721
- " 7,\n",
722
- " 5,\n",
723
- " 4,\n",
724
- " 7,\n",
725
- " 3,\n",
726
- " 3,\n",
727
- " 3,\n",
728
- " 5,\n",
729
- " 3,\n",
730
- " 3,\n",
731
- " 4,\n",
732
- " 0,\n",
733
- " 1,\n",
734
- " 7,\n",
735
- " 4,\n",
736
- " 7,\n",
737
- " 7,\n",
738
- " 5,\n",
739
- " 0,\n",
740
- " 0,\n",
741
- " 5,\n",
742
- " 2,\n",
743
- " 6,\n",
744
- " 2,\n",
745
- " 6,\n",
746
- " 7,\n",
747
- " 6,\n",
748
- " 5,\n",
749
- " 7,\n",
750
- " 5,\n",
751
- " 7,\n",
752
- " 1,\n",
753
- " 6,\n",
754
- " 6,\n",
755
- " 0,\n",
756
- " 4,\n",
757
- " 7,\n",
758
- " 3,\n",
759
- " 0,\n",
760
- " 0,\n",
761
- " 2,\n",
762
- " 5,\n",
763
- " 2,\n",
764
- " 3,\n",
765
- " 7,\n",
766
- " 1,\n",
767
- " 0,\n",
768
- " 3,\n",
769
- " 0,\n",
770
- " 0,\n",
771
- " 3,\n",
772
- " 3,\n",
773
- " 7,\n",
774
- " 3,\n",
775
- " 0,\n",
776
- " 1,\n",
777
- " 1,\n",
778
- " 6,\n",
779
- " 0,\n",
780
- " 0,\n",
781
- " 5,\n",
782
- " 0,\n",
783
- " 3,\n",
784
- " 4,\n",
785
- " 6,\n",
786
- " 7,\n",
787
- " 4,\n",
788
- " 0,\n",
789
- " 4,\n",
790
- " 4,\n",
791
- " 5,\n",
792
- " 4,\n",
793
- " 4,\n",
794
- " 3,\n",
795
- " 6,\n",
796
- " 5,\n",
797
- " 2,\n",
798
- " 0,\n",
799
- " 6,\n",
800
- " 0,\n",
801
- " 6,\n",
802
- " 4,\n",
803
- " 3,\n",
804
- " 5,\n",
805
- " 7,\n",
806
- " 7,\n",
807
- " 5,\n",
808
- " 5,\n",
809
- " 1,\n",
810
- " 5,\n",
811
- " 2,\n",
812
- " 7,\n",
813
- " 7,\n",
814
- " 6,\n",
815
- " 6,\n",
816
- " 7,\n",
817
- " 6,\n",
818
- " 5,\n",
819
- " 2,\n",
820
- " 4,\n",
821
- " 0,\n",
822
- " 4,\n",
823
- " 4,\n",
824
- " 7,\n",
825
- " 5,\n",
826
- " 2,\n",
827
- " 7,\n",
828
- " 0,\n",
829
- " 6,\n",
830
- " 0,\n",
831
- " 2,\n",
832
- " 6,\n",
833
- " 6,\n",
834
- " 2,\n",
835
- " 3,\n",
836
- " 0,\n",
837
- " 5,\n",
838
- " 0,\n",
839
- " 5,\n",
840
- " 7,\n",
841
- " 2,\n",
842
- " 7,\n",
843
- " 4,\n",
844
- " 7,\n",
845
- " 4,\n",
846
- " 0,\n",
847
- " 7,\n",
848
- " 1,\n",
849
- " 4,\n",
850
- " 5,\n",
851
- " 0,\n",
852
- " 5,\n",
853
- " 5,\n",
854
- " 2,\n",
855
- " 0,\n",
856
- " 2,\n",
857
- " 5,\n",
858
- " 5,\n",
859
- " 6,\n",
860
- " 3,\n",
861
- " 4,\n",
862
- " 1,\n",
863
- " 7,\n",
864
- " 7,\n",
865
- " 2,\n",
866
- " 3,\n",
867
- " 2,\n",
868
- " 5,\n",
869
- " 0,\n",
870
- " 7,\n",
871
- " 2,\n",
872
- " 3,\n",
873
- " 7,\n",
874
- " 2,\n",
875
- " 4,\n",
876
- " 0,\n",
877
- " 5,\n",
878
- " 7,\n",
879
- " 3,\n",
880
- " 6,\n",
881
- " 7,\n",
882
- " 6,\n",
883
- " 4,\n",
884
- " 3,\n",
885
- " 6,\n",
886
- " 5,\n",
887
- " 4,\n",
888
- " 0,\n",
889
- " 3,\n",
890
- " 4,\n",
891
- " 3,\n",
892
- " 5,\n",
893
- " 2,\n",
894
- " 4,\n",
895
- " 0,\n",
896
- " 3,\n",
897
- " 6,\n",
898
- " 1,\n",
899
- " 3,\n",
900
- " 1,\n",
901
- " 4,\n",
902
- " 3,\n",
903
- " 3,\n",
904
- " 3,\n",
905
- " 0,\n",
906
- " 7,\n",
907
- " 6,\n",
908
- " 2,\n",
909
- " 4,\n",
910
- " 6,\n",
911
- " 5,\n",
912
- " 4,\n",
913
- " 1,\n",
914
- " 7,\n",
915
- " 6,\n",
916
- " 1,\n",
917
- " 4,\n",
918
- " 3,\n",
919
- " 0,\n",
920
- " 7,\n",
921
- " 3,\n",
922
- " 1,\n",
923
- " 2,\n",
924
- " 1,\n",
925
- " 6,\n",
926
- " 4,\n",
927
- " 7,\n",
928
- " 1,\n",
929
- " 7,\n",
930
- " 1,\n",
931
- " 5,\n",
932
- " 1,\n",
933
- " 6,\n",
934
- " 3,\n",
935
- " 0,\n",
936
- " 2,\n",
937
- " 6,\n",
938
- " 7,\n",
939
- " 7,\n",
940
- " 0,\n",
941
- " 1,\n",
942
- " 4,\n",
943
- " 0,\n",
944
- " 4,\n",
945
- " 5,\n",
946
- " 3,\n",
947
- " 6,\n",
948
- " 2,\n",
949
- " 3,\n",
950
- " 4,\n",
951
- " 1,\n",
952
- " 6,\n",
953
- " 2,\n",
954
- " 4,\n",
955
- " 4,\n",
956
- " 6,\n",
957
- " 4,\n",
958
- " 5,\n",
959
- " 7,\n",
960
- " 1,\n",
961
- " 7,\n",
962
- " 7,\n",
963
- " 4,\n",
964
- " 7,\n",
965
- " 4,\n",
966
- " 3,\n",
967
- " 3,\n",
968
- " 6,\n",
969
- " 1,\n",
970
- " 2,\n",
971
- " 0,\n",
972
- " 0,\n",
973
- " 0,\n",
974
- " 2,\n",
975
- " 5,\n",
976
- " 6,\n",
977
- " 5,\n",
978
- " 7,\n",
979
- " 5,\n",
980
- " 7,\n",
981
- " 1,\n",
982
- " 1,\n",
983
- " 2,\n",
984
- " 1,\n",
985
- " 6,\n",
986
- " 5,\n",
987
- " 7,\n",
988
- " 0,\n",
989
- " 0,\n",
990
- " 5,\n",
991
- " 5,\n",
992
- " 0,\n",
993
- " 3,\n",
994
- " 7,\n",
995
- " 5,\n",
996
- " 2,\n",
997
- " 5,\n",
998
- " 4,\n",
999
- " 2,\n",
1000
- " 3,\n",
1001
- " 6,\n",
1002
- " 2,\n",
1003
- " 3,\n",
1004
- " 6,\n",
1005
- " 0,\n",
1006
- " 0,\n",
1007
- " 2,\n",
1008
- " 6,\n",
1009
- " 0,\n",
1010
- " 1,\n",
1011
- " 3,\n",
1012
- " 3,\n",
1013
- " 6,\n",
1014
- " 4,\n",
1015
- " 6,\n",
1016
- " 4,\n",
1017
- " 6,\n",
1018
- " 0,\n",
1019
- " 0,\n",
1020
- " 2,\n",
1021
- " 3,\n",
1022
- " 6,\n",
1023
- " 2,\n",
1024
- " 2,\n",
1025
- " 6,\n",
1026
- " 6,\n",
1027
- " 2,\n",
1028
- " 4,\n",
1029
- " 3,\n",
1030
- " 3,\n",
1031
- " 6,\n",
1032
- " 7,\n",
1033
- " 7,\n",
1034
- " 1,\n",
1035
- " 1,\n",
1036
- " 7,\n",
1037
- " 7,\n",
1038
- " 6,\n",
1039
- " 1,\n",
1040
- " 7,\n",
1041
- " 0,\n",
1042
- " 0,\n",
1043
- " 2,\n",
1044
- " 4,\n",
1045
- " 2,\n",
1046
- " 2,\n",
1047
- " 3,\n",
1048
- " 0,\n",
1049
- " 1,\n",
1050
- " 4,\n",
1051
- " 0,\n",
1052
- " 4,\n",
1053
- " 6,\n",
1054
- " 5,\n",
1055
- " 3,\n",
1056
- " 2,\n",
1057
- " 3,\n",
1058
- " 2,\n",
1059
- " 3,\n",
1060
- " 6,\n",
1061
- " 2,\n",
1062
- " 1,\n",
1063
- " 4,\n",
1064
- " 7,\n",
1065
- " 6,\n",
1066
- " 4,\n",
1067
- " 5,\n",
1068
- " 6,\n",
1069
- " 7,\n",
1070
- " 7,\n",
1071
- " 2,\n",
1072
- " 0,\n",
1073
- " 5,\n",
1074
- " 5,\n",
1075
- " 0,\n",
1076
- " 3,\n",
1077
- " 6,\n",
1078
- " 6,\n",
1079
- " 5,\n",
1080
- " 4,\n",
1081
- " 4,\n",
1082
- " 7,\n",
1083
- " 0,\n",
1084
- " 5,\n",
1085
- " 1,\n",
1086
- " 7,\n",
1087
- " 0,\n",
1088
- " 3,\n",
1089
- " 1,\n",
1090
- " 7,\n",
1091
- " 0,\n",
1092
- " 1,\n",
1093
- " 4,\n",
1094
- " 7,\n",
1095
- " 5,\n",
1096
- " 0,\n",
1097
- " 4,\n",
1098
- " 0,\n",
1099
- " 0,\n",
1100
- " 1,\n",
1101
- " 0,\n",
1102
- " 6,\n",
1103
- " 4,\n",
1104
- " 0,\n",
1105
- " 5,\n",
1106
- " 4,\n",
1107
- " 6,\n",
1108
- " 6,\n",
1109
- " 7,\n",
1110
- " 2,\n",
1111
- " 6,\n",
1112
- " 2,\n",
1113
- " 6,\n",
1114
- " 0,\n",
1115
- " 3,\n",
1116
- " 2,\n",
1117
- " 2,\n",
1118
- " 1,\n",
1119
- " 5,\n",
1120
- " 4,\n",
1121
- " 7,\n",
1122
- " 6,\n",
1123
- " 6,\n",
1124
- " 2,\n",
1125
- " 5,\n",
1126
- " 5,\n",
1127
- " 5,\n",
1128
- " 0,\n",
1129
- " 3,\n",
1130
- " 5,\n",
1131
- " 4,\n",
1132
- " 5,\n",
1133
- " 7,\n",
1134
- " 5,\n",
1135
- " 0,\n",
1136
- " 5,\n",
1137
- " 0,\n",
1138
- " 0,\n",
1139
- " 2,\n",
1140
- " 0,\n",
1141
- " 2,\n",
1142
- " 1,\n",
1143
- " 0,\n",
1144
- " 2,\n",
1145
- " 4,\n",
1146
- " 3,\n",
1147
- " 4,\n",
1148
- " 1,\n",
1149
- " 7,\n",
1150
- " 2,\n",
1151
- " 1,\n",
1152
- " 0,\n",
1153
- " 3,\n",
1154
- " 0,\n",
1155
- " 3,\n",
1156
- " 1,\n",
1157
- " 1,\n",
1158
- " 0,\n",
1159
- " 5,\n",
1160
- " 3,\n",
1161
- " 1,\n",
1162
- " 2,\n",
1163
- " 5,\n",
1164
- " 6,\n",
1165
- " 7,\n",
1166
- " 6,\n",
1167
- " 7,\n",
1168
- " 0,\n",
1169
- " 2,\n",
1170
- " 6,\n",
1171
- " 3,\n",
1172
- " 1,\n",
1173
- " 5,\n",
1174
- " 4,\n",
1175
- " 2,\n",
1176
- " 4,\n",
1177
- " 6,\n",
1178
- " 5,\n",
1179
- " 2,\n",
1180
- " 7,\n",
1181
- " ...]"
1182
- ]
1183
- },
1184
- "execution_count": 6,
1185
- "metadata": {},
1186
- "output_type": "execute_result"
1187
- }
1188
- ],
1189
- "source": [
1190
- "\n",
1191
- "#--------------------------------------------------------------------------------------------\n",
1192
- "# YOUR MODEL INFERENCE CODE HERE\n",
1193
- "# Update the code below to replace the random baseline by your model inference within the inference pass where the energy consumption and emissions are tracked.\n",
1194
- "#-------------------------------------------------------------------------------------------- \n",
1195
- "\n",
1196
- "# Make random predictions (placeholder for actual model inference)\n",
1197
- "true_labels = test_dataset[\"label\"]\n",
1198
- "predictions = [random.randint(0, 1) for _ in range(len(true_labels))]\n",
1199
- "\n",
1200
- "predictions\n",
1201
- "\n",
1202
- "#--------------------------------------------------------------------------------------------\n",
1203
- "# YOUR MODEL INFERENCE STOPS HERE\n",
1204
- "#-------------------------------------------------------------------------------------------- "
1205
- ]
1206
- },
1207
- {
1208
- "cell_type": "code",
1209
- "execution_count": 8,
1210
- "metadata": {},
1211
- "outputs": [
1212
- {
1213
- "name": "stderr",
1214
- "output_type": "stream",
1215
- "text": [
1216
- "[codecarbon WARNING @ 19:53:32] Background scheduler didn't run for a long period (47s), results might be inaccurate\n",
1217
- "[codecarbon INFO @ 19:53:32] Energy consumed for RAM : 0.000156 kWh. RAM Power : 11.755242347717285 W\n",
1218
- "[codecarbon INFO @ 19:53:32] Delta energy consumed for CPU with constant : 0.000564 kWh, power : 42.5 W\n",
1219
- "[codecarbon INFO @ 19:53:32] Energy consumed for All CPU : 0.000564 kWh\n",
1220
- "[codecarbon INFO @ 19:53:32] 0.000720 kWh of electricity used since the beginning.\n"
1221
- ]
1222
- },
1223
- {
1224
- "data": {
1225
- "text/plain": [
1226
- "EmissionsData(timestamp='2025-01-21T19:53:32', project_name='codecarbon', run_id='908f2e7e-4bb2-4991-a0f6-56bf8d7eda21', experiment_id='5b0fa12a-3dd7-45bb-9766-cc326314d9f1', duration=47.736408500000834, emissions=4.032368007471064e-05, emissions_rate=8.444466886328872e-07, cpu_power=42.5, gpu_power=0.0, ram_power=11.755242347717285, cpu_energy=0.0005636615353475565, gpu_energy=0, ram_energy=0.00015590305493261682, energy_consumed=0.0007195645902801733, country_name='France', country_iso_code='FRA', region='île-de-france', cloud_provider='', cloud_region='', os='Windows-11-10.0.22631-SP0', python_version='3.12.7', codecarbon_version='3.0.0_rc0', cpu_count=12, cpu_model='13th Gen Intel(R) Core(TM) i7-1365U', gpu_count=None, gpu_model=None, longitude=2.3494, latitude=48.8558, ram_total_size=31.347312927246094, tracking_mode='machine', on_cloud='N', pue=1.0)"
1227
- ]
1228
- },
1229
- "execution_count": 8,
1230
- "metadata": {},
1231
- "output_type": "execute_result"
1232
- }
1233
- ],
1234
- "source": [
1235
- "# Stop tracking emissions\n",
1236
- "emissions_data = tracker.stop_task()\n",
1237
- "emissions_data"
1238
- ]
1239
- },
1240
- {
1241
- "cell_type": "code",
1242
- "execution_count": 9,
1243
- "metadata": {},
1244
- "outputs": [
1245
- {
1246
- "data": {
1247
- "text/plain": [
1248
- "0.10090237899917966"
1249
- ]
1250
- },
1251
- "execution_count": 9,
1252
- "metadata": {},
1253
- "output_type": "execute_result"
1254
- }
1255
- ],
1256
- "source": [
1257
- "# Calculate accuracy\n",
1258
- "accuracy = accuracy_score(true_labels, predictions)\n",
1259
- "accuracy"
1260
- ]
1261
- },
1262
- {
1263
- "cell_type": "code",
1264
- "execution_count": 10,
1265
- "metadata": {},
1266
- "outputs": [
1267
- {
1268
- "data": {
1269
- "text/plain": [
1270
- "{'submission_timestamp': '2025-01-21T19:53:46.639165',\n",
1271
- " 'accuracy': 0.10090237899917966,\n",
1272
- " 'energy_consumed_wh': 0.7195645902801733,\n",
1273
- " 'emissions_gco2eq': 0.040323680074710634,\n",
1274
- " 'emissions_data': {'run_id': '908f2e7e-4bb2-4991-a0f6-56bf8d7eda21',\n",
1275
- " 'duration': 47.736408500000834,\n",
1276
- " 'emissions': 4.032368007471064e-05,\n",
1277
- " 'emissions_rate': 8.444466886328872e-07,\n",
1278
- " 'cpu_power': 42.5,\n",
1279
- " 'gpu_power': 0.0,\n",
1280
- " 'ram_power': 11.755242347717285,\n",
1281
- " 'cpu_energy': 0.0005636615353475565,\n",
1282
- " 'gpu_energy': 0,\n",
1283
- " 'ram_energy': 0.00015590305493261682,\n",
1284
- " 'energy_consumed': 0.0007195645902801733,\n",
1285
- " 'country_name': 'France',\n",
1286
- " 'country_iso_code': 'FRA',\n",
1287
- " 'region': 'île-de-france',\n",
1288
- " 'cloud_provider': '',\n",
1289
- " 'cloud_region': '',\n",
1290
- " 'os': 'Windows-11-10.0.22631-SP0',\n",
1291
- " 'python_version': '3.12.7',\n",
1292
- " 'codecarbon_version': '3.0.0_rc0',\n",
1293
- " 'cpu_count': 12,\n",
1294
- " 'cpu_model': '13th Gen Intel(R) Core(TM) i7-1365U',\n",
1295
- " 'gpu_count': None,\n",
1296
- " 'gpu_model': None,\n",
1297
- " 'ram_total_size': 31.347312927246094,\n",
1298
- " 'tracking_mode': 'machine',\n",
1299
- " 'on_cloud': 'N',\n",
1300
- " 'pue': 1.0},\n",
1301
- " 'dataset_config': {'dataset_name': 'QuotaClimat/frugalaichallenge-text-train',\n",
1302
- " 'test_size': 0.2,\n",
1303
- " 'test_seed': 42}}"
1304
- ]
1305
- },
1306
- "execution_count": 10,
1307
- "metadata": {},
1308
- "output_type": "execute_result"
1309
- }
1310
- ],
1311
- "source": [
1312
- "# Prepare results dictionary\n",
1313
- "results = {\n",
1314
- " \"submission_timestamp\": datetime.now().isoformat(),\n",
1315
- " \"accuracy\": float(accuracy),\n",
1316
- " \"energy_consumed_wh\": emissions_data.energy_consumed * 1000,\n",
1317
- " \"emissions_gco2eq\": emissions_data.emissions * 1000,\n",
1318
- " \"emissions_data\": clean_emissions_data(emissions_data),\n",
1319
- " \"dataset_config\": {\n",
1320
- " \"dataset_name\": request.dataset_name,\n",
1321
- " \"test_size\": request.test_size,\n",
1322
- " \"test_seed\": request.test_seed\n",
1323
- " }\n",
1324
- "}\n",
1325
- "\n",
1326
- "results"
1327
- ]
1328
- }
1329
- ],
1330
- "metadata": {
1331
- "kernelspec": {
1332
- "display_name": "base",
1333
- "language": "python",
1334
- "name": "python3"
1335
- },
1336
- "language_info": {
1337
- "codemirror_mode": {
1338
- "name": "ipython",
1339
- "version": 3
1340
- },
1341
- "file_extension": ".py",
1342
- "mimetype": "text/x-python",
1343
- "name": "python",
1344
- "nbconvert_exporter": "python",
1345
- "pygments_lexer": "ipython3",
1346
- "version": "3.12.7"
1347
- }
1348
- },
1349
- "nbformat": 4,
1350
- "nbformat_minor": 2
1351
- }
notebooks/template-image.ipynb DELETED
@@ -1,416 +0,0 @@
1
- {
2
- "cells": [
3
- {
4
- "cell_type": "markdown",
5
- "metadata": {},
6
- "source": [
7
- "# Image task notebook template\n",
8
- "## Loading the necessary libraries"
9
- ]
10
- },
11
- {
12
- "cell_type": "code",
13
- "execution_count": 13,
14
- "metadata": {},
15
- "outputs": [],
16
- "source": [
17
- "from fastapi import APIRouter\n",
18
- "from datetime import datetime\n",
19
- "from datasets import load_dataset\n",
20
- "from sklearn.metrics import accuracy_score, precision_score, recall_score\n",
21
- "\n",
22
- "import random\n",
23
- "\n",
24
- "import sys\n",
25
- "sys.path.append('../')\n",
26
- "\n",
27
- "from tasks.utils.evaluation import ImageEvaluationRequest\n",
28
- "from tasks.utils.emissions import tracker, clean_emissions_data, get_space_info\n",
29
- "from tasks.image import parse_boxes,compute_iou,compute_max_iou"
30
- ]
31
- },
32
- {
33
- "cell_type": "markdown",
34
- "metadata": {},
35
- "source": [
36
- "## Loading the datasets and splitting them"
37
- ]
38
- },
39
- {
40
- "cell_type": "code",
41
- "execution_count": 4,
42
- "metadata": {},
43
- "outputs": [
44
- {
45
- "data": {
46
- "application/vnd.jupyter.widget-view+json": {
47
- "model_id": "4f62b23ca587477d9f37430e687bf951",
48
- "version_major": 2,
49
- "version_minor": 0
50
- },
51
- "text/plain": [
52
- "README.md: 0%| | 0.00/7.72k [00:00<?, ?B/s]"
53
- ]
54
- },
55
- "metadata": {},
56
- "output_type": "display_data"
57
- },
58
- {
59
- "name": "stderr",
60
- "output_type": "stream",
61
- "text": [
62
- "c:\\Users\\theo.alvesdacosta\\AppData\\Local\\anaconda3\\Lib\\site-packages\\huggingface_hub\\file_download.py:139: UserWarning: `huggingface_hub` cache-system uses symlinks by default to efficiently store duplicated files but your machine does not support them in C:\\Users\\theo.alvesdacosta\\.cache\\huggingface\\hub\\datasets--pyronear--pyro-sdis. Caching files will still work but in a degraded version that might require more space on your disk. This warning can be disabled by setting the `HF_HUB_DISABLE_SYMLINKS_WARNING` environment variable. For more details, see https://huggingface.co/docs/huggingface_hub/how-to-cache#limitations.\n",
63
- "To support symlinks on Windows, you either need to activate Developer Mode or to run Python as an administrator. In order to activate developer mode, see this article: https://docs.microsoft.com/en-us/windows/apps/get-started/enable-your-device-for-development\n",
64
- " warnings.warn(message)\n"
65
- ]
66
- },
67
- {
68
- "data": {
69
- "application/vnd.jupyter.widget-view+json": {
70
- "model_id": "70735dd748e343119b5a7cd966dcd0f0",
71
- "version_major": 2,
72
- "version_minor": 0
73
- },
74
- "text/plain": [
75
- "train-00000-of-00007.parquet: 0%| | 0.00/433M [00:00<?, ?B/s]"
76
- ]
77
- },
78
- "metadata": {},
79
- "output_type": "display_data"
80
- },
81
- {
82
- "data": {
83
- "application/vnd.jupyter.widget-view+json": {
84
- "model_id": "903c3227c24649f1a0424e039d74d303",
85
- "version_major": 2,
86
- "version_minor": 0
87
- },
88
- "text/plain": [
89
- "train-00001-of-00007.parquet: 0%| | 0.00/434M [00:00<?, ?B/s]"
90
- ]
91
- },
92
- "metadata": {},
93
- "output_type": "display_data"
94
- },
95
- {
96
- "data": {
97
- "application/vnd.jupyter.widget-view+json": {
98
- "model_id": "8795b7696f124715b9d52287d5cd4ee0",
99
- "version_major": 2,
100
- "version_minor": 0
101
- },
102
- "text/plain": [
103
- "train-00002-of-00007.parquet: 0%| | 0.00/432M [00:00<?, ?B/s]"
104
- ]
105
- },
106
- "metadata": {},
107
- "output_type": "display_data"
108
- },
109
- {
110
- "data": {
111
- "application/vnd.jupyter.widget-view+json": {
112
- "model_id": "4b6c1240bf024d61bf913584d13834f5",
113
- "version_major": 2,
114
- "version_minor": 0
115
- },
116
- "text/plain": [
117
- "train-00003-of-00007.parquet: 0%| | 0.00/428M [00:00<?, ?B/s]"
118
- ]
119
- },
120
- "metadata": {},
121
- "output_type": "display_data"
122
- },
123
- {
124
- "data": {
125
- "application/vnd.jupyter.widget-view+json": {
126
- "model_id": "cd5f8172a31f4fd79d489db96ede9c21",
127
- "version_major": 2,
128
- "version_minor": 0
129
- },
130
- "text/plain": [
131
- "train-00004-of-00007.parquet: 0%| | 0.00/431M [00:00<?, ?B/s]"
132
- ]
133
- },
134
- "metadata": {},
135
- "output_type": "display_data"
136
- },
137
- {
138
- "data": {
139
- "application/vnd.jupyter.widget-view+json": {
140
- "model_id": "416af82dba3a4ab7ad13190703c90757",
141
- "version_major": 2,
142
- "version_minor": 0
143
- },
144
- "text/plain": [
145
- "train-00005-of-00007.parquet: 0%| | 0.00/429M [00:00<?, ?B/s]"
146
- ]
147
- },
148
- "metadata": {},
149
- "output_type": "display_data"
150
- },
151
- {
152
- "data": {
153
- "application/vnd.jupyter.widget-view+json": {
154
- "model_id": "6819ad85508641a1a64bea34303446ac",
155
- "version_major": 2,
156
- "version_minor": 0
157
- },
158
- "text/plain": [
159
- "train-00006-of-00007.parquet: 0%| | 0.00/431M [00:00<?, ?B/s]"
160
- ]
161
- },
162
- "metadata": {},
163
- "output_type": "display_data"
164
- },
165
- {
166
- "data": {
167
- "application/vnd.jupyter.widget-view+json": {
168
- "model_id": "90a7f85c802b4330b502c8bbd3cca7f9",
169
- "version_major": 2,
170
- "version_minor": 0
171
- },
172
- "text/plain": [
173
- "val-00000-of-00001.parquet: 0%| | 0.00/407M [00:00<?, ?B/s]"
174
- ]
175
- },
176
- "metadata": {},
177
- "output_type": "display_data"
178
- },
179
- {
180
- "data": {
181
- "application/vnd.jupyter.widget-view+json": {
182
- "model_id": "b93f2f19aafb43e2b8db0fd7bb3ebd34",
183
- "version_major": 2,
184
- "version_minor": 0
185
- },
186
- "text/plain": [
187
- "Generating train split: 0%| | 0/29537 [00:00<?, ? examples/s]"
188
- ]
189
- },
190
- "metadata": {},
191
- "output_type": "display_data"
192
- },
193
- {
194
- "data": {
195
- "application/vnd.jupyter.widget-view+json": {
196
- "model_id": "c14c0f2cde184c959970dfccaa26b2d2",
197
- "version_major": 2,
198
- "version_minor": 0
199
- },
200
- "text/plain": [
201
- "Generating val split: 0%| | 0/4099 [00:00<?, ? examples/s]"
202
- ]
203
- },
204
- "metadata": {},
205
- "output_type": "display_data"
206
- }
207
- ],
208
- "source": [
209
- "request = ImageEvaluationRequest()\n",
210
- "\n",
211
- "# Load and prepare the dataset\n",
212
- "dataset = load_dataset(request.dataset_name)\n",
213
- "\n",
214
- "# Split dataset\n",
215
- "train_test = dataset[\"train\"]\n",
216
- "test_dataset = dataset[\"val\"]"
217
- ]
218
- },
219
- {
220
- "cell_type": "markdown",
221
- "metadata": {},
222
- "source": [
223
- "## Random Baseline"
224
- ]
225
- },
226
- {
227
- "cell_type": "code",
228
- "execution_count": 10,
229
- "metadata": {},
230
- "outputs": [],
231
- "source": [
232
- "# Start tracking emissions\n",
233
- "tracker.start()\n",
234
- "tracker.start_task(\"inference\")"
235
- ]
236
- },
237
- {
238
- "cell_type": "code",
239
- "execution_count": 11,
240
- "metadata": {},
241
- "outputs": [],
242
- "source": [
243
- "\n",
244
- "#--------------------------------------------------------------------------------------------\n",
245
- "# YOUR MODEL INFERENCE CODE HERE\n",
246
- "# Update the code below to replace the random baseline by your model inference within the inference pass where the energy consumption and emissions are tracked.\n",
247
- "#-------------------------------------------------------------------------------------------- \n",
248
- "\n",
249
- "# Make random predictions (placeholder for actual model inference)\n",
250
- "\n",
251
- "predictions = []\n",
252
- "true_labels = []\n",
253
- "pred_boxes = []\n",
254
- "true_boxes_list = [] # List of lists, each inner list contains boxes for one image\n",
255
- "\n",
256
- "for example in test_dataset:\n",
257
- " # Parse true annotation (YOLO format: class_id x_center y_center width height)\n",
258
- " annotation = example.get(\"annotations\", \"\").strip()\n",
259
- " has_smoke = len(annotation) > 0\n",
260
- " true_labels.append(int(has_smoke))\n",
261
- " \n",
262
- " # Make random classification prediction\n",
263
- " pred_has_smoke = random.random() > 0.5\n",
264
- " predictions.append(int(pred_has_smoke))\n",
265
- " \n",
266
- " # If there's a true box, parse it and make random box prediction\n",
267
- " if has_smoke:\n",
268
- " # Parse all true boxes from the annotation\n",
269
- " image_true_boxes = parse_boxes(annotation)\n",
270
- " true_boxes_list.append(image_true_boxes)\n",
271
- " \n",
272
- " # For baseline, make one random box prediction per image\n",
273
- " # In a real model, you might want to predict multiple boxes\n",
274
- " random_box = [\n",
275
- " random.random(), # x_center\n",
276
- " random.random(), # y_center\n",
277
- " random.random() * 0.5, # width (max 0.5)\n",
278
- " random.random() * 0.5 # height (max 0.5)\n",
279
- " ]\n",
280
- " pred_boxes.append(random_box)\n",
281
- "\n",
282
- "\n",
283
- "#--------------------------------------------------------------------------------------------\n",
284
- "# YOUR MODEL INFERENCE STOPS HERE\n",
285
- "#-------------------------------------------------------------------------------------------- "
286
- ]
287
- },
288
- {
289
- "cell_type": "code",
290
- "execution_count": null,
291
- "metadata": {},
292
- "outputs": [],
293
- "source": [
294
- "# Stop tracking emissions\n",
295
- "emissions_data = tracker.stop_task()"
296
- ]
297
- },
298
- {
299
- "cell_type": "code",
300
- "execution_count": 15,
301
- "metadata": {},
302
- "outputs": [],
303
- "source": [
304
- "import numpy as np\n",
305
- "\n",
306
- "# Calculate classification metrics\n",
307
- "classification_accuracy = accuracy_score(true_labels, predictions)\n",
308
- "classification_precision = precision_score(true_labels, predictions)\n",
309
- "classification_recall = recall_score(true_labels, predictions)\n",
310
- "\n",
311
- "# Calculate mean IoU for object detection (only for images with smoke)\n",
312
- "# For each image, we compute the max IoU between the predicted box and all true boxes\n",
313
- "ious = []\n",
314
- "for true_boxes, pred_box in zip(true_boxes_list, pred_boxes):\n",
315
- " max_iou = compute_max_iou(true_boxes, pred_box)\n",
316
- " ious.append(max_iou)\n",
317
- "\n",
318
- "mean_iou = float(np.mean(ious)) if ious else 0.0"
319
- ]
320
- },
- {
- "cell_type": "code",
- "execution_count": 18,
- "metadata": {},
- "outputs": [
- {
- "data": {
- "text/plain": [
- "{'submission_timestamp': '2025-01-22T15:57:37.288173',\n",
- " 'classification_accuracy': 0.5001692620176033,\n",
- " 'classification_precision': 0.8397129186602871,\n",
- " 'classification_recall': 0.4972677595628415,\n",
- " 'mean_iou': 0.002819781629108398,\n",
- " 'energy_consumed_wh': 0.779355299496116,\n",
- " 'emissions_gco2eq': 0.043674291628462855,\n",
- " 'emissions_data': {'run_id': '4e750cd5-60f0-444c-baee-b5f7b31f784b',\n",
- " 'duration': 51.72819679998793,\n",
- " 'emissions': 4.3674291628462856e-05,\n",
- " 'emissions_rate': 8.445163379568943e-07,\n",
- " 'cpu_power': 42.5,\n",
- " 'gpu_power': 0.0,\n",
- " 'ram_power': 11.755242347717285,\n",
- " 'cpu_energy': 0.0006104993474311617,\n",
- " 'gpu_energy': 0,\n",
- " 'ram_energy': 0.00016885595206495442,\n",
- " 'energy_consumed': 0.0007793552994961161,\n",
- " 'country_name': 'France',\n",
- " 'country_iso_code': 'FRA',\n",
- " 'region': 'île-de-france',\n",
- " 'cloud_provider': '',\n",
- " 'cloud_region': '',\n",
- " 'os': 'Windows-11-10.0.22631-SP0',\n",
- " 'python_version': '3.12.7',\n",
- " 'codecarbon_version': '3.0.0_rc0',\n",
- " 'cpu_count': 12,\n",
- " 'cpu_model': '13th Gen Intel(R) Core(TM) i7-1365U',\n",
- " 'gpu_count': None,\n",
- " 'gpu_model': None,\n",
- " 'ram_total_size': 31.347312927246094,\n",
- " 'tracking_mode': 'machine',\n",
- " 'on_cloud': 'N',\n",
- " 'pue': 1.0},\n",
- " 'dataset_config': {'dataset_name': 'pyronear/pyro-sdis',\n",
- " 'test_size': 0.2,\n",
- " 'test_seed': 42}}"
- ]
- },
- "execution_count": 18,
- "metadata": {},
- "output_type": "execute_result"
- }
- ],
- "source": [
- "\n",
- "# Prepare results dictionary\n",
- "results = {\n",
- "    \"submission_timestamp\": datetime.now().isoformat(),\n",
- "    \"classification_accuracy\": float(classification_accuracy),\n",
- "    \"classification_precision\": float(classification_precision),\n",
- "    \"classification_recall\": float(classification_recall),\n",
- "    \"mean_iou\": mean_iou,\n",
- "    \"energy_consumed_wh\": emissions_data.energy_consumed * 1000,\n",
- "    \"emissions_gco2eq\": emissions_data.emissions * 1000,\n",
- "    \"emissions_data\": clean_emissions_data(emissions_data),\n",
- "    \"dataset_config\": {\n",
- "        \"dataset_name\": request.dataset_name,\n",
- "        \"test_size\": request.test_size,\n",
- "        \"test_seed\": request.test_seed\n",
- "    }\n",
- "}\n",
- "results"
- ]
- }
- ],
- "metadata": {
- "kernelspec": {
- "display_name": "base",
- "language": "python",
- "name": "python3"
- },
- "language_info": {
- "codemirror_mode": {
- "name": "ipython",
- "version": 3
- },
- "file_extension": ".py",
- "mimetype": "text/x-python",
- "name": "python",
- "nbconvert_exporter": "python",
- "pygments_lexer": "ipython3",
- "version": "3.12.7"
- }
- },
- "nbformat": 4,
- "nbformat_minor": 2
- }
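One unit detail in the results cell above: CodeCarbon reports `energy_consumed` in kWh and `emissions` in kgCO2eq, so the `* 1000` factors convert them to the Wh and gCO2eq fields the challenge expects. Checking against the printed output:

```python
# CodeCarbon units: energy_consumed is in kWh, emissions in kgCO2eq.
energy_consumed_wh = 0.0007793552994961161 * 1000  # -> 0.779355... Wh, as in the results dict
emissions_gco2eq = 4.3674291628462856e-05 * 1000   # -> 0.043674... gCO2eq, as in the results dict
```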
notebooks/template-text.ipynb DELETED
@@ -1,1642 +0,0 @@
- {
- "cells": [
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "# Text task notebook template\n",
- "## Loading the necessary libraries"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 3,
- "metadata": {},
- "outputs": [
- {
- "name": "stderr",
- "output_type": "stream",
- "text": [
- "[codecarbon WARNING @ 19:48:07] Multiple instances of codecarbon are allowed to run at the same time.\n",
- "[codecarbon INFO @ 19:48:07] [setup] RAM Tracking...\n",
- "[codecarbon INFO @ 19:48:07] [setup] CPU Tracking...\n",
- "[codecarbon WARNING @ 19:48:09] We saw that you have a 13th Gen Intel(R) Core(TM) i7-1365U but we don't know it. Please contact us.\n",
- "[codecarbon WARNING @ 19:48:09] No CPU tracking mode found. Falling back on CPU constant mode. \n",
- " Windows OS detected: Please install Intel Power Gadget to measure CPU\n",
- "\n",
- "[codecarbon WARNING @ 19:48:11] We saw that you have a 13th Gen Intel(R) Core(TM) i7-1365U but we don't know it. Please contact us.\n",
- "[codecarbon INFO @ 19:48:11] CPU Model on constant consumption mode: 13th Gen Intel(R) Core(TM) i7-1365U\n",
- "[codecarbon WARNING @ 19:48:11] No CPU tracking mode found. Falling back on CPU constant mode.\n",
- "[codecarbon INFO @ 19:48:11] [setup] GPU Tracking...\n",
- "[codecarbon INFO @ 19:48:11] No GPU found.\n",
- "[codecarbon INFO @ 19:48:11] >>> Tracker's metadata:\n",
- "[codecarbon INFO @ 19:48:11] Platform system: Windows-11-10.0.22631-SP0\n",
- "[codecarbon INFO @ 19:48:11] Python version: 3.12.7\n",
- "[codecarbon INFO @ 19:48:11] CodeCarbon version: 3.0.0_rc0\n",
- "[codecarbon INFO @ 19:48:11] Available RAM : 31.347 GB\n",
- "[codecarbon INFO @ 19:48:11] CPU count: 12\n",
- "[codecarbon INFO @ 19:48:11] CPU model: 13th Gen Intel(R) Core(TM) i7-1365U\n",
- "[codecarbon INFO @ 19:48:11] GPU count: None\n",
- "[codecarbon INFO @ 19:48:11] GPU model: None\n",
- "[codecarbon INFO @ 19:48:11] Saving emissions data to file c:\\git\\submission-template\\notebooks\\emissions.csv\n"
- ]
- }
- ],
- "source": [
- "from fastapi import APIRouter\n",
- "from datetime import datetime\n",
- "from datasets import load_dataset\n",
- "from sklearn.metrics import accuracy_score\n",
- "import random\n",
- "\n",
- "import sys\n",
- "sys.path.append('../tasks')\n",
- "\n",
- "from utils.evaluation import TextEvaluationRequest\n",
- "from utils.emissions import tracker, clean_emissions_data, get_space_info\n",
- "\n",
- "\n",
- "# Define the label mapping\n",
- "LABEL_MAPPING = {\n",
- "    \"0_not_relevant\": 0,\n",
- "    \"1_not_happening\": 1,\n",
- "    \"2_not_human\": 2,\n",
- "    \"3_not_bad\": 3,\n",
- "    \"4_solutions_harmful_unnecessary\": 4,\n",
- "    \"5_science_unreliable\": 5,\n",
- "    \"6_proponents_biased\": 6,\n",
- "    \"7_fossil_fuels_needed\": 7\n",
- "}"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "## Loading the datasets and splitting them"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 4,
- "metadata": {},
- "outputs": [
- {
- "data": {
- "application/vnd.jupyter.widget-view+json": {
- "model_id": "668da7bf85434e098b95c3ec447d78fe",
- "version_major": 2,
- "version_minor": 0
- },
- "text/plain": [
- "README.md: 0%| | 0.00/5.18k [00:00<?, ?B/s]"
- ]
- },
- "metadata": {},
- "output_type": "display_data"
- },
- {
- "name": "stderr",
- "output_type": "stream",
- "text": [
- "c:\\Users\\theo.alvesdacosta\\AppData\\Local\\anaconda3\\Lib\\site-packages\\huggingface_hub\\file_download.py:139: UserWarning: `huggingface_hub` cache-system uses symlinks by default to efficiently store duplicated files but your machine does not support them in C:\\Users\\theo.alvesdacosta\\.cache\\huggingface\\hub\\datasets--QuotaClimat--frugalaichallenge-text-train. Caching files will still work but in a degraded version that might require more space on your disk. This warning can be disabled by setting the `HF_HUB_DISABLE_SYMLINKS_WARNING` environment variable. For more details, see https://huggingface.co/docs/huggingface_hub/how-to-cache#limitations.\n",
- "To support symlinks on Windows, you either need to activate Developer Mode or to run Python as an administrator. In order to activate developer mode, see this article: https://docs.microsoft.com/en-us/windows/apps/get-started/enable-your-device-for-development\n",
- " warnings.warn(message)\n"
- ]
- },
- {
- "data": {
- "application/vnd.jupyter.widget-view+json": {
- "model_id": "5b68d43359eb429395da8be7d4b15556",
- "version_major": 2,
- "version_minor": 0
- },
- "text/plain": [
- "train.parquet: 0%| | 0.00/1.21M [00:00<?, ?B/s]"
- ]
- },
- "metadata": {},
- "output_type": "display_data"
- },
- {
- "data": {
- "application/vnd.jupyter.widget-view+json": {
- "model_id": "140a304773914e9db8f698eabeb40298",
- "version_major": 2,
- "version_minor": 0
- },
- "text/plain": [
- "Generating train split: 0%| | 0/6091 [00:00<?, ? examples/s]"
- ]
- },
- "metadata": {},
- "output_type": "display_data"
- },
- {
- "data": {
- "application/vnd.jupyter.widget-view+json": {
- "model_id": "6d04e8ab1906400e8e0029949dc523a5",
- "version_major": 2,
- "version_minor": 0
- },
- "text/plain": [
- "Map: 0%| | 0/6091 [00:00<?, ? examples/s]"
- ]
- },
- "metadata": {},
- "output_type": "display_data"
- }
- ],
- "source": [
- "request = TextEvaluationRequest()\n",
- "\n",
- "# Load and prepare the dataset\n",
- "dataset = load_dataset(request.dataset_name)\n",
- "\n",
- "# Convert string labels to integers\n",
- "dataset = dataset.map(lambda x: {\"label\": LABEL_MAPPING[x[\"label\"]]})\n",
- "\n",
- "# Split dataset\n",
- "train_test = dataset[\"train\"]\n",
- "test_dataset = dataset[\"test\"]"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "## Random Baseline"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 5,
- "metadata": {},
- "outputs": [],
- "source": [
- "# Start tracking emissions\n",
- "tracker.start()\n",
- "tracker.start_task(\"inference\")"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 6,
- "metadata": {},
- "outputs": [
- {
- "data": {
- "text/plain": [
- "[1,\n",
- " 7,\n",
- " 6,\n",
- [... roughly a thousand more lines of random label predictions (0-7), one per test example; output truncated by Jupyter ...]
- " ...]"
- ]
- },
- "execution_count": 6,
- "metadata": {},
- "output_type": "execute_result"
- }
- ],
- "source": [
- "\n",
- "#--------------------------------------------------------------------------------------------\n",
- "# YOUR MODEL INFERENCE CODE HERE\n",
- "# Update the code below to replace the random baseline by your model inference within the inference pass where the energy consumption and emissions are tracked.\n",
- "#-------------------------------------------------------------------------------------------- \n",
- "\n",
- "# Make random predictions (placeholder for actual model inference)\n",
- "true_labels = test_dataset[\"label\"]\n",
- "predictions = [random.randint(0, 7) for _ in range(len(true_labels))]\n",
- "\n",
- "predictions\n",
- "\n",
- "#--------------------------------------------------------------------------------------------\n",
- "# YOUR MODEL INFERENCE STOPS HERE\n",
- "#-------------------------------------------------------------------------------------------- "
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 8,
- "metadata": {},
- "outputs": [
- {
- "name": "stderr",
- "output_type": "stream",
- "text": [
- "[codecarbon WARNING @ 19:53:32] Background scheduler didn't run for a long period (47s), results might be inaccurate\n",
- "[codecarbon INFO @ 19:53:32] Energy consumed for RAM : 0.000156 kWh. RAM Power : 11.755242347717285 W\n",
- "[codecarbon INFO @ 19:53:32] Delta energy consumed for CPU with constant : 0.000564 kWh, power : 42.5 W\n",
- "[codecarbon INFO @ 19:53:32] Energy consumed for All CPU : 0.000564 kWh\n",
- "[codecarbon INFO @ 19:53:32] 0.000720 kWh of electricity used since the beginning.\n"
- ]
- },
- {
- "data": {
- "text/plain": [
- "EmissionsData(timestamp='2025-01-21T19:53:32', project_name='codecarbon', run_id='908f2e7e-4bb2-4991-a0f6-56bf8d7eda21', experiment_id='5b0fa12a-3dd7-45bb-9766-cc326314d9f1', duration=47.736408500000834, emissions=4.032368007471064e-05, emissions_rate=8.444466886328872e-07, cpu_power=42.5, gpu_power=0.0, ram_power=11.755242347717285, cpu_energy=0.0005636615353475565, gpu_energy=0, ram_energy=0.00015590305493261682, energy_consumed=0.0007195645902801733, country_name='France', country_iso_code='FRA', region='île-de-france', cloud_provider='', cloud_region='', os='Windows-11-10.0.22631-SP0', python_version='3.12.7', codecarbon_version='3.0.0_rc0', cpu_count=12, cpu_model='13th Gen Intel(R) Core(TM) i7-1365U', gpu_count=None, gpu_model=None, longitude=2.3494, latitude=48.8558, ram_total_size=31.347312927246094, tracking_mode='machine', on_cloud='N', pue=1.0)"
- ]
- },
- "execution_count": 8,
- "metadata": {},
- "output_type": "execute_result"
- }
- ],
- "source": [
- "# Stop tracking emissions\n",
- "emissions_data = tracker.stop_task()\n",
- "emissions_data"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 9,
- "metadata": {},
- "outputs": [
- {
- "data": {
- "text/plain": [
- "0.10090237899917966"
- ]
- },
- "execution_count": 9,
- "metadata": {},
- "output_type": "execute_result"
- }
- ],
- "source": [
- "# Calculate accuracy\n",
- "accuracy = accuracy_score(true_labels, predictions)\n",
- "accuracy"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 10,
- "metadata": {},
- "outputs": [
- {
- "data": {
- "text/plain": [
- "{'submission_timestamp': '2025-01-21T19:53:46.639165',\n",
- " 'accuracy': 0.10090237899917966,\n",
- " 'energy_consumed_wh': 0.7195645902801733,\n",
- " 'emissions_gco2eq': 0.040323680074710634,\n",
- " 'emissions_data': {'run_id': '908f2e7e-4bb2-4991-a0f6-56bf8d7eda21',\n",
- " 'duration': 47.736408500000834,\n",
- " 'emissions': 4.032368007471064e-05,\n",
- " 'emissions_rate': 8.444466886328872e-07,\n",
- " 'cpu_power': 42.5,\n",
- " 'gpu_power': 0.0,\n",
- " 'ram_power': 11.755242347717285,\n",
- " 'cpu_energy': 0.0005636615353475565,\n",
- " 'gpu_energy': 0,\n",
- " 'ram_energy': 0.00015590305493261682,\n",
- " 'energy_consumed': 0.0007195645902801733,\n",
- " 'country_name': 'France',\n",
- " 'country_iso_code': 'FRA',\n",
- " 'region': 'île-de-france',\n",
- " 'cloud_provider': '',\n",
- " 'cloud_region': '',\n",
- " 'os': 'Windows-11-10.0.22631-SP0',\n",
- " 'python_version': '3.12.7',\n",
- " 'codecarbon_version': '3.0.0_rc0',\n",
- " 'cpu_count': 12,\n",
- " 'cpu_model': '13th Gen Intel(R) Core(TM) i7-1365U',\n",
- " 'gpu_count': None,\n",
- " 'gpu_model': None,\n",
- " 'ram_total_size': 31.347312927246094,\n",
- " 'tracking_mode': 'machine',\n",
- " 'on_cloud': 'N',\n",
- " 'pue': 1.0},\n",
- " 'dataset_config': {'dataset_name': 'QuotaClimat/frugalaichallenge-text-train',\n",
- " 'test_size': 0.2,\n",
- " 'test_seed': 42}}"
- ]
- },
- "execution_count": 10,
- "metadata": {},
- "output_type": "execute_result"
- }
- ],
- "source": [
- "# Prepare results dictionary\n",
- "results = {\n",
- "    \"submission_timestamp\": datetime.now().isoformat(),\n",
- "    \"accuracy\": float(accuracy),\n",
- "    \"energy_consumed_wh\": emissions_data.energy_consumed * 1000,\n",
- "    \"emissions_gco2eq\": emissions_data.emissions * 1000,\n",
- "    \"emissions_data\": clean_emissions_data(emissions_data),\n",
- "    \"dataset_config\": {\n",
- "        \"dataset_name\": request.dataset_name,\n",
- "        \"test_size\": request.test_size,\n",
- "        \"test_seed\": request.test_seed\n",
- "    }\n",
- "}\n",
- "\n",
- "results"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "## Development of the model"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 11,
- "metadata": {},
- "outputs": [
- {
- "data": {
- "application/vnd.jupyter.widget-view+json": {
- "model_id": "90f50ab19698484489f36976745efad3",
- "version_major": 2,
- "version_minor": 0
- },
- "text/plain": [
- "config.json: 0%| | 0.00/1.15k [00:00<?, ?B/s]"
- ]
- },
- "metadata": {},
- "output_type": "display_data"
- },
- {
- "name": "stderr",
- "output_type": "stream",
- "text": [
- "c:\\Users\\theo.alvesdacosta\\AppData\\Local\\anaconda3\\Lib\\site-packages\\huggingface_hub\\file_download.py:139: UserWarning: `huggingface_hub` cache-system uses symlinks by default to efficiently store duplicated files but your machine does not support them in C:\\Users\\theo.alvesdacosta\\.cache\\huggingface\\hub\\models--facebook--bart-large-mnli. Caching files will still work but in a degraded version that might require more space on your disk. This warning can be disabled by setting the `HF_HUB_DISABLE_SYMLINKS_WARNING` environment variable. For more details, see https://huggingface.co/docs/huggingface_hub/how-to-cache#limitations.\n",
- "To support symlinks on Windows, you either need to activate Developer Mode or to run Python as an administrator. In order to activate developer mode, see this article: https://docs.microsoft.com/en-us/windows/apps/get-started/enable-your-device-for-development\n",
- " warnings.warn(message)\n"
- ]
- },
- {
- "data": {
- "application/vnd.jupyter.widget-view+json": {
- "model_id": "6e3974d8ff284603821f7beca9bd353d",
- "version_major": 2,
- "version_minor": 0
- },
- "text/plain": [
- "model.safetensors: 0%| | 0.00/1.63G [00:00<?, ?B/s]"
- ]
- },
- "metadata": {},
- "output_type": "display_data"
- },
- {
- "data": {
- "application/vnd.jupyter.widget-view+json": {
- "model_id": "bc29cb379c644b00b1bdf61d5426d99d",
- "version_major": 2,
- "version_minor": 0
- },
- "text/plain": [
- "tokenizer_config.json: 0%| | 0.00/26.0 [00:00<?, ?B/s]"
- ]
- },
- "metadata": {},
- "output_type": "display_data"
- },
- {
- "data": {
- "application/vnd.jupyter.widget-view+json": {
- "model_id": "635503cf819747c9a83f22aa4f2f11db",
- "version_major": 2,
- "version_minor": 0
- },
- "text/plain": [
- "vocab.json: 0%| | 0.00/899k [00:00<?, ?B/s]"
- ]
- },
- "metadata": {},
- "output_type": "display_data"
- },
- {
- "data": {
- "application/vnd.jupyter.widget-view+json": {
- "model_id": "3a5f53e451e8483ca7c33f42245abd13",
- "version_major": 2,
- "version_minor": 0
- },
- "text/plain": [
- "merges.txt: 0%| | 0.00/456k [00:00<?, ?B/s]"
- ]
- },
- "metadata": {},
- "output_type": "display_data"
- },
- {
- "data": {
- "application/vnd.jupyter.widget-view+json": {
- "model_id": "84f922d1b68a4a0faa5e920d004efca0",
- "version_major": 2,
- "version_minor": 0
- },
- "text/plain": [
- "tokenizer.json: 0%| | 0.00/1.36M [00:00<?, ?B/s]"
- ]
- },
- "metadata": {},
- "output_type": "display_data"
- },
- {
- "name": "stderr",
- "output_type": "stream",
- "text": [
- "Device set to use cpu\n"
- ]
- }
- ],
- "source": [
- "from transformers import pipeline\n",
- "classifier = pipeline(\"zero-shot-classification\",\n",
- "                      model=\"facebook/bart-large-mnli\")\n"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 14,
- "metadata": {},
- "outputs": [],
- "source": [
- "sequence_to_classify = \"one day I will see the world\"\n",
- "\n",
- "candidate_labels = [\n",
- "    \"Not related to climate change disinformation\",\n",
- "    \"Climate change is not real and not happening\",\n",
- "    \"Climate change is not human-induced\",\n",
- "    \"Climate change impacts are not that bad\",\n",
- "    \"Climate change solutions are harmful and unnecessary\",\n",
- "    \"Climate change science is unreliable\",\n",
- "    \"Climate change proponents are biased\",\n",
- "    \"Fossil fuels are needed to address climate change\"\n",
- "]"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 15,
- "metadata": {},
- "outputs": [
- {
- "data": {
- "text/plain": [
- "{'sequence': 'one day I will see the world',\n",
- " 'labels': ['Fossil fuels are needed to address climate change',\n",
- " 'Climate change science is unreliable',\n",
- " 'Not related to climate change disinformation',\n",
- " 'Climate change proponents are biased',\n",
- " 'Climate change impacts are not that bad',\n",
- " 'Climate change solutions are harmful and unnecessary',\n",
- " 'Climate change is not human-induced',\n",
- " 'Climate change is not real and not happening'],\n",
- " 'scores': [0.16242119669914246,\n",
- " 0.15683825314044952,\n",
- " 0.1564282774925232,\n",
- " 0.14603719115257263,\n",
- " 0.12794046103954315,\n",
- " 0.10180754214525223,\n",
- " 0.0936085507273674,\n",
- " 0.0549185685813427]}"
- ]
- },
- "execution_count": 15,
- "metadata": {},
- "output_type": "execute_result"
- }
- ],
- "source": [
- "classifier(sequence_to_classify, candidate_labels)"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 26,
- "metadata": {},
- "outputs": [
- {
- "name": "stderr",
- "output_type": "stream",
- "text": [
- "[codecarbon WARNING @ 11:00:07] Already started tracking\n"
- ]
- },
- {
- "data": {
- "application/vnd.jupyter.widget-view+json": {
- "model_id": "5d66a13f76a4411d95b62d4a73012495",
- "version_major": 2,
- "version_minor": 0
- },
- "text/plain": [
- "0it [00:00, ?it/s]"
- ]
- },
- "metadata": {},
- "output_type": "display_data"
- },
- {
- "name": "stderr",
- "output_type": "stream",
- "text": [
- "[codecarbon WARNING @ 11:05:57] Background scheduler didn't run for a long period (349s), results might be inaccurate\n",
- "[codecarbon INFO @ 11:05:57] Energy consumed for RAM : 0.018069 kWh. RAM Power : 11.755242347717285 W\n",
- "[codecarbon INFO @ 11:05:57] Delta energy consumed for CPU with constant : 0.004122 kWh, power : 42.5 W\n",
- "[codecarbon INFO @ 11:05:57] Energy consumed for All CPU : 0.065327 kWh\n",
- "[codecarbon INFO @ 11:05:57] 0.083395 kWh of electricity used since the beginning.\n"
- ]
- },
- {
- "data": {
- "text/plain": [
- "EmissionsData(timestamp='2025-01-22T11:05:57', project_name='codecarbon', run_id='908f2e7e-4bb2-4991-a0f6-56bf8d7eda21', experiment_id='5b0fa12a-3dd7-45bb-9766-cc326314d9f1', duration=349.19709450000664, emissions=0.0002949120266226386, emissions_rate=8.445461750018632e-07, cpu_power=42.5, gpu_power=0.0, ram_power=11.755242347717285, cpu_energy=0.004122396676597424, gpu_energy=0, ram_energy=0.0011402244733631148, energy_consumed=0.005262621149960539, country_name='France', country_iso_code='FRA', region='île-de-france', cloud_provider='', cloud_region='', os='Windows-11-10.0.22631-SP0', python_version='3.12.7', codecarbon_version='3.0.0_rc0', cpu_count=12, cpu_model='13th Gen Intel(R) Core(TM) i7-1365U', gpu_count=None, gpu_model=None, longitude=2.3494, latitude=48.8558, ram_total_size=31.347312927246094, tracking_mode='machine', on_cloud='N', pue=1.0)"
- ]
- },
- "execution_count": 26,
- "metadata": {},
- "output_type": "execute_result"
- }
- ],
- "source": [
- "# Start tracking emissions\n",
- "tracker.start()\n",
- "tracker.start_task(\"inference\")\n",
- "\n",
- "from tqdm.auto import tqdm\n",
- "predictions = []\n",
- "\n",
- "\n",
- "\n",
- "# Option 1: Simple loop approach\n",
- "\n",
- "for i, text in tqdm(enumerate(test_dataset[\"quote\"])):\n",
- "\n",
- "    result = classifier(text, candidate_labels)\n",
- "\n",
- "    # Get index of highest scoring label\n",
- "\n",
- "    pred_label = candidate_labels.index(result[\"labels\"][0])\n",
- "\n",
- "    predictions.append(pred_label)\n",
- "    if i == 100:\n",
- "        break\n",
- "\n",
- "\n",
- "# Stop tracking emissions\n",
- "emissions_data = tracker.stop_task()\n",
- "emissions_data\n"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 28,
- "metadata": {},
- "outputs": [
- {
- "data": {
- "text/plain": [
- "0.4"
- ]
- },
- "execution_count": 28,
- "metadata": {},
- "output_type": "execute_result"
- }
- ],
- "source": [
- "# Calculate accuracy\n",
- "accuracy = accuracy_score(true_labels[:100], predictions[:100])\n",
- "accuracy"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "metadata": {},
- "outputs": [],
- "source": []
- }
- ],
- "metadata": {
- "kernelspec": {
- "display_name": "base",
- "language": "python",
- "name": "python3"
- },
- "language_info": {
- "codemirror_mode": {
- "name": "ipython",
- "version": 3
- },
- "file_extension": ".py",
- "mimetype": "text/x-python",
- "name": "python",
- "nbconvert_exporter": "python",
- "pygments_lexer": "ipython3",
- "version": "3.12.7"
- }
- },
- "nbformat": 4,
- "nbformat_minor": 2
- }
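The exploratory loop in the deleted notebook above classifies one quote at a time and breaks after 100 examples, which is why only `true_labels[:100]` are scored. For reference, the `transformers` zero-shot pipeline also accepts a list of sequences plus a `batch_size` argument, so the same experiment could cover the full test split; a minimal sketch reusing `test_dataset` and `candidate_labels` from the notebook (the batch size of 8 is an illustrative guess):

```python
from transformers import pipeline

classifier = pipeline("zero-shot-classification", model="facebook/bart-large-mnli")

# test_dataset and candidate_labels as defined in the notebook above.
texts = list(test_dataset["quote"])
results = classifier(texts, candidate_labels, batch_size=8)

# Map each top-scoring candidate label back to its integer class index.
predictions = [candidate_labels.index(r["labels"][0]) for r in results]
```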
requirements.txt CHANGED
@@ -7,4 +7,6 @@ pydantic>=1.10.0
 python-dotenv>=1.0.0
 gradio>=4.0.0
 requests>=2.31.0
-librosa==0.10.2.post1
+librosa==0.10.2.post1
+torch==2.5.1
+torchaudio==2.5.1
tasks/audio.py CHANGED
@@ -2,31 +2,30 @@ from fastapi import APIRouter
 from datetime import datetime
 from datasets import load_dataset
 from sklearn.metrics import accuracy_score
-import random
 import os
+import torch
 
 from .utils.evaluation import AudioEvaluationRequest
 from .utils.emissions import tracker, clean_emissions_data, get_space_info
+from .utils.preprocess import get_dataloader
+from .models.model import ChainsawDetector
 
 from dotenv import load_dotenv
 load_dotenv()
 
 router = APIRouter()
 
-DESCRIPTION = "Random Baseline"
+DESCRIPTION = "Chainsaw goes brrr ⇒ GPU goes brrr"
 ROUTE = "/audio"
 
 
-
-@router.post(ROUTE, tags=["Audio Task"],
-             description=DESCRIPTION)
+@router.post(ROUTE, tags=["Audio Task"], description=DESCRIPTION)
 async def evaluate_audio(request: AudioEvaluationRequest):
     """
     Evaluate audio classification for rainforest sound detection.
 
-    Current Model: Random Baseline
-    - Makes random predictions from the label space (0-1)
-    - Used as a baseline for comparison
+    Current Model: ChainsawDetector
+    - STFT -> PCEN -> split into small time chunks -> CNN+LSTM for each chunk -> dense -> prediction
     """
     # Get space info
     username, space_url = get_space_info()
@@ -38,11 +37,17 @@ async def evaluate_audio(request: AudioEvaluationRequest):
     }
     # Load and prepare the dataset
     # Because the dataset is gated, we need to use the HF_TOKEN environment variable to authenticate
-    dataset = load_dataset(request.dataset_name,token=os.getenv("HF_TOKEN"))
-
-    # Split dataset
-    train_test = dataset["train"]
-    test_dataset = dataset["test"]
+    batch_size = 16
+    device = "cuda" if torch.cuda.is_available() else "cpu"
+    split='test'
+    test_dataset = load_dataset(request.dataset_name, split=split, token=os.getenv("HF_TOKEN"))
+    dataloader = get_dataloader(test_dataset, device, batch_size=batch_size, shuffle=False)
+
+    # Load model
+    model = ChainsawDetector(batch_size).to(device, dtype=torch.bfloat16)
+    model = torch.compile(model)
+    model.load_state_dict(torch.load('models/final-bf16.pth', weights_only=True))
+    model.eval()
 
     # Start tracking emissions
     tracker.start()
@@ -53,9 +58,14 @@ async def evaluate_audio(request: AudioEvaluationRequest):
     # Update the code below to replace the random baseline by your model inference within the inference pass where the energy consumption and emissions are tracked.
     #--------------------------------------------------------------------------------------------
 
-    # Make random predictions (placeholder for actual model inference)
-    true_labels = test_dataset["label"]
-    predictions = [random.randint(0, 1) for _ in range(len(true_labels))]
+    predictions = []
+    with torch.no_grad():#, torch.amp.autocast(device_type=device):
+        for (X, y) in dataloader:
+            X = X.to(device, dtype=torch.bfloat16)
+            y = y.to(device, dtype=torch.bfloat16)
+
+            predictions.append(model(X))
+    predictions = torch.cat(predictions, dim=0)
 
     #--------------------------------------------------------------------------------------------
     # YOUR MODEL INFERENCE STOPS HERE
@@ -65,6 +75,7 @@ async def evaluate_audio(request: AudioEvaluationRequest):
     emissions_data = tracker.stop_task()
 
     # Calculate accuracy
+    true_labels = test_dataset["label"]
     accuracy = accuracy_score(true_labels, predictions)
 
     # Prepare results dictionary
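The `ChainsawDetector` class imported above lives in `tasks/models/model.py`, which is not part of this diff; the only description visible here is the docstring pipeline "STFT -> PCEN -> split into small time chunks -> CNN+LSTM for each chunk -> dense -> prediction". A minimal sketch of a model with that shape, where every layer size is an illustrative guess and `log1p` compression stands in for PCEN (torchaudio ships no built-in PCEN transform; the real model may compute it differently):

```python
import torch
import torch.nn as nn
import torchaudio


class ChainsawDetectorSketch(nn.Module):
    """Illustrative stand-in for the submitted model, not its actual code."""

    def __init__(self, n_fft=512, hop_length=256, n_chunks=8, hidden=64):
        super().__init__()
        # STFT magnitude spectrogram
        self.spec = torchaudio.transforms.Spectrogram(
            n_fft=n_fft, hop_length=hop_length, power=2.0
        )
        self.n_chunks = n_chunks
        # Small CNN applied independently to each time chunk
        self.cnn = nn.Sequential(
            nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2),
            nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d((4, 4)),
        )
        # LSTM over the sequence of per-chunk embeddings, then a dense head
        self.lstm = nn.LSTM(32 * 4 * 4, hidden, batch_first=True)
        self.head = nn.Linear(hidden, 1)

    def forward(self, x):                       # x: (batch, samples)
        spec = self.spec(x)                     # (batch, freq, time)
        spec = torch.log1p(spec)                # cheap stand-in for PCEN
        feats = []
        for chunk in spec.chunk(self.n_chunks, dim=-1):   # split along time
            f = self.cnn(chunk.unsqueeze(1))    # (batch, 32, 4, 4)
            feats.append(f.flatten(1))          # (batch, 512)
        seq = torch.stack(feats, dim=1)         # (batch, n_chunks, 512)
        out, _ = self.lstm(seq)
        logits = self.head(out[:, -1])          # last chunk's hidden state
        # Hard 0/1 labels so the output can feed accuracy_score directly
        return (torch.sigmoid(logits) > 0.5).squeeze(1).long()


if __name__ == "__main__":
    model = ChainsawDetectorSketch().eval()
    with torch.no_grad():
        preds = model(torch.randn(4, 72000))    # e.g. 4 three-second clips at 24 kHz
    print(preds.shape)                          # torch.Size([4])
```

Running the CNN per chunk and recurring only over the short sequence of chunk embeddings keeps the LSTM cheap, which fits the frugality goal; the actual chunking and layer choices are those of `tasks/models/model.py`.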
tasks/datasources/freesound_chainsaw.txt ADDED
@@ -0,0 +1,68 @@
+ chainsaw sawing nearby long various cuts.flac by kyles -- https://freesound.org/s/453249/ -- License: Creative Commons 0
+ Chainsaw First Start (On choke).wav by lonemonk -- https://freesound.org/s/185578/ -- License: Attribution 3.0
+ Chainsaw - Pull to Idle and 3 Revs.wav by Stefan021 -- https://freesound.org/s/431737/ -- License: Creative Commons 0
+ Chainsaw Noises 002.wav by yottasounds -- https://freesound.org/s/380161/ -- License: Creative Commons 0
+ Tree Chainsawed Drops.m4a by RutgerMuller -- https://freesound.org/s/535352/ -- License: Creative Commons 0
+ D209 Chainsaw in the Wood.WAV by billcutbill -- https://freesound.org/s/669382/ -- License: Attribution 4.0
+ chainsaw sawing working long various.flac by kyles -- https://freesound.org/s/637360/ -- License: Creative Commons 0
+ chainsaw.WAV by inchadney -- https://freesound.org/s/467419/ -- License: Attribution NonCommercial 4.0
+ CHAINSAW.wav by JFBSAUVE -- https://freesound.org/s/19898/ -- License: Attribution 4.0
+ chainsaw.m4a by Chilsville -- https://freesound.org/s/570263/ -- License: Attribution NonCommercial 3.0
+ chainsaw.wav by ShanayGroen -- https://freesound.org/s/365578/ -- License: Attribution NonCommercial 3.0
+ Chainsaw with Bobcat idling and tree bits falling 080320.wav by BoilingSand -- https://freesound.org/s/50668/ -- License: Attribution 3.0
+ Chainsaw in a forest by SPAudiobooks -- https://freesound.org/s/751913/ -- License: Creative Commons 0
+ Chainsaw Crosscutting 2.wav by Benboncan -- https://freesound.org/s/64395/ -- License: Attribution 4.0
+ chainsaw.WAV by stomachache -- https://freesound.org/s/47250/ -- License: Creative Commons 0
+ Exterior_Chainsaw_Idle.wav by DJillcom -- https://freesound.org/s/157610/ -- License: Creative Commons 0
+ Chainsawing.wav by Puniho -- https://freesound.org/s/165856/ -- License: Attribution 3.0
+ Chainsaws (distant) by micadoe -- https://freesound.org/s/170338/ -- License: Creative Commons 0
+ chainsaw.ogg by electrovoice664 -- https://freesound.org/s/75078/ -- License: Sampling+
+ chainsaw.wav by doobit -- https://freesound.org/s/65997/ -- License: Sampling+
+ chainsaw.wav by mr101986 -- https://freesound.org/s/94718/ -- License: Creative Commons 0
+ Chainsaw - Tree cases.WAV by Ohrwurm -- https://freesound.org/s/68391/ -- License: Creative Commons 0
+ Strimmers And Chainsaw 2.wav by Benboncan -- https://freesound.org/s/81908/ -- License: Attribution 4.0
+ chainsaw.wav by UncleSigmund -- https://freesound.org/s/116765/ -- License: Attribution 4.0
+ chainsaw and little tree.wav by Kyster -- https://freesound.org/s/118657/ -- License: Attribution 4.0
+ chainsaw vs chestnut tree.wav by Kyster -- https://freesound.org/s/118658/ -- License: Attribution 4.0
+ ChainsawCutting_Distant_4824.wav by pblzr -- https://freesound.org/s/512876/ -- License: Creative Commons 0
+ chainsaw felling tree by matt_beer -- https://freesound.org/s/515296/ -- License: Creative Commons 0
+ chainsaw by matt_beer -- https://freesound.org/s/515303/ -- License: Creative Commons 0
+ starting chainsaw 2 by matt_beer -- https://freesound.org/s/515306/ -- License: Creative Commons 0
+ starting chainsaw 1 by matt_beer -- https://freesound.org/s/515307/ -- License: Creative Commons 0
+ Chainsaw by AugustSandberg -- https://freesound.org/s/508846/ -- License: Creative Commons 0
+ Chainsaw cutting by AugustSandberg -- https://freesound.org/s/508847/ -- License: Creative Commons 0
+ 190155_Chainsaw.wav by GaelanW -- https://freesound.org/s/490073/ -- License: Attribution 3.0
+ Chainsaw Cut Slow by DrinkingWindGames -- https://freesound.org/s/463729/ -- License: Attribution 4.0
+ 04-1 Chainsaw.wav by domiscz -- https://freesound.org/s/461734/ -- License: Creative Commons 0
+ CHAINSAW - 1 by SamuelGremaud -- https://freesound.org/s/463207/ -- License: Creative Commons 0
+ Chainsaws - Wood Carving by Ev-Dawg -- https://freesound.org/s/360619/ -- License: Creative Commons 0
+ Chainsaw by Hard3eat -- https://freesound.org/s/351775/ -- License: Creative Commons 0
+ Chainsaw_4.WAV by ivolipa -- https://freesound.org/s/345992/ -- License: Creative Commons 0
+ Chainsaw_3.WAV by ivolipa -- https://freesound.org/s/345993/ -- License: Creative Commons 0
+ Chainsaw_2.WAV by ivolipa -- https://freesound.org/s/345994/ -- License: Creative Commons 0
+ Distant Chainsaw 2.mp3 by FunWithSound -- https://freesound.org/s/390741/ -- License: Creative Commons 0
+ Distant Chainsaw 1.mp3 by FunWithSound -- https://freesound.org/s/390742/ -- License: Creative Commons 0
+ chainsaw_start.wav by Jedo -- https://freesound.org/s/396463/ -- License: Creative Commons 0
+ Chainsaw_cutting trees.wav by Jedo -- https://freesound.org/s/396464/ -- License: Creative Commons 0
+ Chainsaw gasoline-powered.wav by aoristos -- https://freesound.org/s/235795/ -- License: Creative Commons 0
+ Chainsaw by SPAudiobooks -- https://freesound.org/s/751912/ -- License: Creative Commons 0
+ Chainsaw Rampage in the Forest by unfa -- https://freesound.org/s/165823/ -- License: Creative Commons 0
+ S7 SIERRA.mp3 by AHTepsilon -- https://freesound.org/s/531550/ -- License: Creative Commons 0
+ Chainsaw trimming trees with street noise by Morphic__ -- https://freesound.org/s/541277/ -- License: Attribution 4.0
+ chainsaw_Husqvarna_385XPG.wav by theTone -- https://freesound.org/s/77945/ -- License: Attribution 4.0
+ Chainsaw Stihl MSA 200 C by druki -- https://freesound.org/s/595765/ -- License: Creative Commons 0
+ Chainsaw Stihl MS 170 cutting wood by druki -- https://freesound.org/s/595777/ -- License: Creative Commons 0
+ Chainsaw_1.WAV by ivolipa -- https://freesound.org/s/345995/ -- License: Creative Commons 0
+ chainsaw.m4a by Chilsville -- https://freesound.org/s/570263/ -- License: Attribution NonCommercial 3.0
+ Distant Chainsaw 1.mp3 by FunWithSound -- https://freesound.org/s/390742/ -- License: Creative Commons 0
+ Chainsaw in a forest by SPAudiobooks -- https://freesound.org/s/751913/ -- License: Creative Commons 0
+ Chainsaw by SPAudiobooks -- https://freesound.org/s/751912/ -- License: Creative Commons 0
+ Chainsawing.wav by Puniho -- https://freesound.org/s/165856/ -- License: Attribution 3.0
+ D209 Chainsaw in the Wood.WAV by billcutbill -- https://freesound.org/s/669382/ -- License: Attribution 4.0
+ Chain saw 3.mp3 by 5ound5murf23 -- https://freesound.org/s/523432/ -- License: Creative Commons 0
+ Chain saw 1.mp3 by 5ound5murf23 -- https://freesound.org/s/523434/ -- License: Creative Commons 0
+ Chain saw 2.mp3 by 5ound5murf23 -- https://freesound.org/s/523433/ -- License: Creative Commons 0
+ Chainsaw Starting by kangaroovindaloo -- https://freesound.org/s/520207/ -- License: Creative Commons 0
+ distant chainsaw by matt_beer -- https://freesound.org/s/515301/ -- License: Creative Commons 0
+ distant chainsaw by matt_beer -- https://freesound.org/s/515302/ -- License: Creative Commons 0
+ Chainsaw sounds deep in the forest by etienne.leplumey -- https://freesound.org/s/553502/ -- License: Attribution 4.0
tasks/datasources/freesound_environment.txt ADDED
@@ -0,0 +1,37 @@
+ forest summer Roond 018 200619_0186.wav by klankbeeld -- https://freesound.org/s/529758/ -- License: Attribution 4.0
+ Kampina forest spring010 190322_1321.wav by klankbeeld -- https://freesound.org/s/564004/ -- License: Attribution 4.0
+ Jogger in the forest by Cinetony -- https://freesound.org/s/559956/ -- License: Creative Commons 0
+ Kampina forest spring010 190322_1321.wav by klankbeeld -- https://freesound.org/s/564004/ -- License: Attribution 4.0
+ forest car passby 04 200619_0186.wav by klankbeeld -- https://freesound.org/s/613735/ -- License: Attribution 4.0
+ summer forest NL EU 1207 PM 220617_0405 by klankbeeld -- https://freesound.org/s/725797/ -- License: Attribution 4.0
+ forest spring NL EU 1114AM 220617_0400.wav by klankbeeld -- https://freesound.org/s/650572/ -- License: Attribution 4.0
+ 20080528.forest.wind.serins.flac by dobroide -- https://freesound.org/s/54744/ -- License: Attribution 4.0
+ little windy forest.wav by Kyster -- https://freesound.org/s/99281/ -- License: Attribution 4.0
+ Border ForestFarmfield 808AM NL EU 220515_0345.wav by klankbeeld -- https://freesound.org/s/671325/ -- License: Attribution 4.0
+ Autumn Forest Wind by Akacie -- https://freesound.org/s/73719/ -- License: Attribution NonCommercial 4.0
+ forest in the Netherlands 320 PM 230328_572 by klankbeeld -- https://freesound.org/s/730223/ -- License: Attribution 4.0
+ Berlin Grunewald Forest 3 - bells in distance.wav by dbspin -- https://freesound.org/s/396664/ -- License: Creative Commons 0
+ grassland forest spring NL 1127 AM 240531_0730 by klankbeeld -- https://freesound.org/s/738008/ -- License: Attribution 4.0
+ AMBForst_Edge Of The Forest.Byrds.Distant Road_EM_(Eq,OOsprd,Voice accent).wav by newlocknew -- https://freesound.org/s/640001/ -- License: Attribution NonCommercial 4.0
+ pine forest Kampina NL 04 190908_0072.wav by klankbeeld -- https://freesound.org/s/485972/ -- License: Attribution 4.0
+ small-forest-stream-in-mountains-surround-sound-rear by CRAFTCREST.com -- https://freesound.org/s/204903/ -- License: Attribution 4.0
+ water_forest_stream_00_l.wav by teadrinker -- https://freesound.org/s/403049/ -- License: Creative Commons 0
+ park forest 1020AM 230223_0562 by klankbeeld -- https://freesound.org/s/690705/ -- License: Attribution 4.0
+ Footsteps in Forest - 01.mp3 by Gutek -- https://freesound.org/s/201885/ -- License: Creative Commons 0
+ Forest dry leaves walk. .wav by rempen -- https://freesound.org/s/274833/ -- License: Creative Commons 0
+ Tiny trickling forest creek (loopable) by Mjeno -- https://freesound.org/s/405138/ -- License: Creative Commons 0
+ Spring Forest Ambience 1 - Hyby Fælled, Denmark by mugwood -- https://freesound.org/s/682850/ -- License: Attribution 4.0
+ New England Forest Daytime Ambience by Bmisiewicz -- https://freesound.org/s/698307/ -- License: Attribution 4.0
+ forest_cicada_loop.flac by Nimlos -- https://freesound.org/s/422048/ -- License: Creative Commons 0
+ Forest.mp3 by JayHu -- https://freesound.org/s/506103/ -- License: Attribution 3.0
+ wind in the forest.WAV by inchadney -- https://freesound.org/s/260161/ -- License: Attribution NonCommercial 4.0
+ Walk through Alice Holt Forest by Peter Batchelor by sensingtheforest -- https://freesound.org/s/730996/ -- License: Attribution NonCommercial 4.0
+ Forest rain.m4a by FreqWincy -- https://freesound.org/s/707401/ -- License: Attribution NonCommercial 4.0
+ Rain on window (interior) by xkeril -- https://freesound.org/s/669486/ -- License: Creative Commons 0
+ Rain Slowly Passing SIDE ONLY_Edgewater_06192020.mp3 by speakwithanimals -- https://freesound.org/s/525044/ -- License: Creative Commons 0
+ Street Scene - After The Rain - Moderate Traffic & Wet Road by FSFA -- https://freesound.org/s/593122/ -- License: Attribution 3.0
+ Midnight city rain stereo.wav by itinerantmonk108 -- https://freesound.org/s/573202/ -- License: Creative Commons 0
+ Rain, sheet roof.wav by snarcle -- https://freesound.org/s/569370/ -- License: Attribution 4.0
+ 003 - Rain Outside B.wav by Trashcan_Studios -- https://freesound.org/s/575461/ -- License: Attribution 4.0
+ mostly rain.mp3 by soundman9826 -- https://freesound.org/s/193337/ -- License: Creative Commons 0
37
+
tasks/image.py DELETED
@@ -1,176 +0,0 @@
- from fastapi import APIRouter
- from datetime import datetime
- from datasets import load_dataset
- import numpy as np
- from sklearn.metrics import accuracy_score, precision_score, recall_score
- import random
- import os
- 
- from .utils.evaluation import ImageEvaluationRequest
- from .utils.emissions import tracker, clean_emissions_data, get_space_info
- 
- from dotenv import load_dotenv
- load_dotenv()
- 
- router = APIRouter()
- 
- DESCRIPTION = "Random Baseline"
- ROUTE = "/image"
- 
- def parse_boxes(annotation_string):
-     """Parse multiple boxes from a single annotation string.
-     Each box has 5 values: class_id, x_center, y_center, width, height"""
-     values = [float(x) for x in annotation_string.strip().split()]
-     boxes = []
-     # Each box has 5 values
-     for i in range(0, len(values), 5):
-         if i + 5 <= len(values):
-             # Skip class_id (first value) and take the next 4 values
-             box = values[i+1:i+5]
-             boxes.append(box)
-     return boxes
- 
- def compute_iou(box1, box2):
-     """Compute Intersection over Union (IoU) between two YOLO format boxes."""
-     # Convert YOLO format (x_center, y_center, width, height) to corners
-     def yolo_to_corners(box):
-         x_center, y_center, width, height = box
-         x1 = x_center - width/2
-         y1 = y_center - height/2
-         x2 = x_center + width/2
-         y2 = y_center + height/2
-         return np.array([x1, y1, x2, y2])
- 
-     box1_corners = yolo_to_corners(box1)
-     box2_corners = yolo_to_corners(box2)
- 
-     # Calculate intersection
-     x1 = max(box1_corners[0], box2_corners[0])
-     y1 = max(box1_corners[1], box2_corners[1])
-     x2 = min(box1_corners[2], box2_corners[2])
-     y2 = min(box1_corners[3], box2_corners[3])
- 
-     intersection = max(0, x2 - x1) * max(0, y2 - y1)
- 
-     # Calculate union
-     box1_area = (box1_corners[2] - box1_corners[0]) * (box1_corners[3] - box1_corners[1])
-     box2_area = (box2_corners[2] - box2_corners[0]) * (box2_corners[3] - box2_corners[1])
-     union = box1_area + box2_area - intersection
- 
-     return intersection / (union + 1e-6)
- 
- def compute_max_iou(true_boxes, pred_box):
-     """Compute maximum IoU between a predicted box and all true boxes"""
-     max_iou = 0
-     for true_box in true_boxes:
-         iou = compute_iou(true_box, pred_box)
-         max_iou = max(max_iou, iou)
-     return max_iou
- 
- @router.post(ROUTE, tags=["Image Task"],
-              description=DESCRIPTION)
- async def evaluate_image(request: ImageEvaluationRequest):
-     """
-     Evaluate image classification and object detection for forest fire smoke.
- 
-     Current Model: Random Baseline
-     - Makes random predictions for both classification and bounding boxes
-     - Used as a baseline for comparison
- 
-     Metrics:
-     - Classification accuracy: Whether an image contains smoke or not
-     - Object Detection accuracy: IoU (Intersection over Union) for smoke bounding boxes
-     """
-     # Get space info
-     username, space_url = get_space_info()
- 
-     # Load and prepare the dataset
-     dataset = load_dataset(request.dataset_name, token=os.getenv("HF_TOKEN"))
- 
-     # Split dataset
-     train_test = dataset["train"]
-     test_dataset = dataset["val"]
- 
-     # Start tracking emissions
-     tracker.start()
-     tracker.start_task("inference")
- 
-     #--------------------------------------------------------------------------------------------
-     # YOUR MODEL INFERENCE CODE HERE
-     # Update the code below to replace the random baseline with your model inference
-     #--------------------------------------------------------------------------------------------
- 
-     predictions = []
-     true_labels = []
-     pred_boxes = []
-     true_boxes_list = []  # List of lists, each inner list contains boxes for one image
- 
-     for example in test_dataset:
-         # Parse true annotation (YOLO format: class_id x_center y_center width height)
-         annotation = example.get("annotations", "").strip()
-         has_smoke = len(annotation) > 0
-         true_labels.append(int(has_smoke))
- 
-         # Make random classification prediction
-         pred_has_smoke = random.random() > 0.5
-         predictions.append(int(pred_has_smoke))
- 
-         # If there's a true box, parse it and make random box prediction
-         if has_smoke:
-             # Parse all true boxes from the annotation
-             image_true_boxes = parse_boxes(annotation)
-             true_boxes_list.append(image_true_boxes)
- 
-             # For baseline, make one random box prediction per image
-             # In a real model, you might want to predict multiple boxes
-             random_box = [
-                 random.random(),        # x_center
-                 random.random(),        # y_center
-                 random.random() * 0.5,  # width (max 0.5)
-                 random.random() * 0.5   # height (max 0.5)
-             ]
-             pred_boxes.append(random_box)
- 
-     #--------------------------------------------------------------------------------------------
-     # YOUR MODEL INFERENCE STOPS HERE
-     #--------------------------------------------------------------------------------------------
- 
-     # Stop tracking emissions
-     emissions_data = tracker.stop_task()
- 
-     # Calculate classification metrics
-     classification_accuracy = accuracy_score(true_labels, predictions)
-     classification_precision = precision_score(true_labels, predictions)
-     classification_recall = recall_score(true_labels, predictions)
- 
-     # Calculate mean IoU for object detection (only for images with smoke)
-     # For each image, we compute the max IoU between the predicted box and all true boxes
-     ious = []
-     for true_boxes, pred_box in zip(true_boxes_list, pred_boxes):
-         max_iou = compute_max_iou(true_boxes, pred_box)
-         ious.append(max_iou)
- 
-     mean_iou = float(np.mean(ious)) if ious else 0.0
- 
-     # Prepare results dictionary
-     results = {
-         "username": username,
-         "space_url": space_url,
-         "submission_timestamp": datetime.now().isoformat(),
-         "model_description": DESCRIPTION,
-         "classification_accuracy": float(classification_accuracy),
-         "classification_precision": float(classification_precision),
-         "classification_recall": float(classification_recall),
-         "mean_iou": mean_iou,
-         "energy_consumed_wh": emissions_data.energy_consumed * 1000,
-         "emissions_gco2eq": emissions_data.emissions * 1000,
-         "emissions_data": clean_emissions_data(emissions_data),
-         "api_route": ROUTE,
-         "dataset_config": {
-             "dataset_name": request.dataset_name,
-             "test_size": request.test_size,
-             "test_seed": request.test_seed
-         }
-     }
- 
-     return results
tasks/models/__init__.py ADDED
File without changes
tasks/models/final-bf16.pth ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:16405871852b620f3a0ebd30dd273d45b3eaf91d8b27604b9ec00511480c62df
+ size 10836
tasks/models/model.py ADDED
@@ -0,0 +1,74 @@
+ from torch import ones, split, bfloat16
+ from torch.nn.functional import relu, sigmoid
+ from torch.nn import Module, MaxPool1d, Conv1d, Conv2d, Linear, BatchNorm2d, LSTMCell
+ 
+ class ChunkCNN(Module):
+     def __init__(self):
+         super(ChunkCNN, self).__init__()
+         self.pool = MaxPool1d(kernel_size=2, stride=2)
+         self.conv1 = Conv2d(in_channels=1, out_channels=4, kernel_size=(15,10), stride=(4, 1), padding=(1, 0))
+         self.conv2 = Conv1d(in_channels=4, out_channels=8, kernel_size=9, stride=2, padding=0)
+         self.conv3 = Conv1d(in_channels=8, out_channels=16, kernel_size=2, stride=2, padding=1)
+         self.collapse = Conv1d(in_channels=16, out_channels=1, kernel_size=1, stride=1, padding=0)
+ 
+     def forward(self, x):
+         x = self.conv1(x).squeeze()
+         x = relu(x)
+         x = self.pool(x)
+ 
+         x = self.conv2(x)
+         x = relu(x)
+         x = self.pool(x)
+ 
+         x = self.conv3(x)
+         x = relu(x)
+ 
+         x = self.collapse(x).squeeze()
+         x = relu(x)
+ 
+         return x
+ 
+ 
+ class LastLayer(Module):
+     def __init__(self, inputsize):
+         super(LastLayer, self).__init__()
+         self.dense1 = Linear(inputsize, 3)
+         self.dense2 = Linear(3, 1)
+ 
+     def forward(self, x):
+         x = self.dense1(x)
+         x = relu(x)
+         x = self.dense2(x)
+         x = sigmoid(x).squeeze()
+         return x
+ 
+ 
+ class ChainsawDetector(Module):
+     def __init__(self, batch_size):
+         super(ChainsawDetector, self).__init__()
+         self.batch_size = batch_size
+         self.nb_lstm = 8
+         self.batchnorm = BatchNorm2d(1)
+         self.chunkcnn = ChunkCNN()
+         self.lstmcell = LSTMCell(self.nb_lstm, self.nb_lstm)
+         self.lastlayer = LastLayer(self.nb_lstm)
+         self.initstate = ones((batch_size, self.nb_lstm), dtype=bfloat16)  # default class is 1: environment
+ 
+     def forward(self, x):
+         hx = self.initstate.detach().clone()
+         cx = self.initstate.detach().clone()
+ 
+         x = x[:, None, :, :]
+         x = self.batchnorm(x)
+ 
+         for chunk in split(x, 10, dim=3):
+             xi = self.chunkcnn(chunk)
+             hx, cx = self.lstmcell(xi, (hx, cx))
+ 
+         x = self.lastlayer(hx)
+ 
+         # final decision
+         x = (x > 0.5).bfloat16()
+         return x
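A minimal shape check for ChainsawDetector (a sketch, not part of the commit; it assumes the spectrogram layout produced by tasks/utils/preprocess.py added later in this commit — 513 one-sided bins for n_fft=1024 and 60 PCEN frames for a 3 s clip at 4 kHz, consumed as six 10-frame chunks — and a PyTorch build with bfloat16 CPU kernels):

import torch
from tasks.models.model import ChainsawDetector

batch_size = 16
detector = ChainsawDetector(batch_size).to(dtype=torch.bfloat16)
detector.eval()

# dummy batch shaped like prepare_batch output: [batch, freq_bins, time_frames]
x = torch.randn(batch_size, 513, 60, dtype=torch.bfloat16)
with torch.no_grad():
    y = detector(x)   # hard 0/1 decisions after the sigmoid threshold
print(y.shape)        # torch.Size([16])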
tasks/text.py DELETED
@@ -1,92 +0,0 @@
- from fastapi import APIRouter
- from datetime import datetime
- from datasets import load_dataset
- from sklearn.metrics import accuracy_score
- import random
- 
- from .utils.evaluation import TextEvaluationRequest
- from .utils.emissions import tracker, clean_emissions_data, get_space_info
- 
- router = APIRouter()
- 
- DESCRIPTION = "Random Baseline"
- ROUTE = "/text"
- 
- @router.post(ROUTE, tags=["Text Task"],
-              description=DESCRIPTION)
- async def evaluate_text(request: TextEvaluationRequest):
-     """
-     Evaluate text classification for climate disinformation detection.
- 
-     Current Model: Random Baseline
-     - Makes random predictions from the label space (0-7)
-     - Used as a baseline for comparison
-     """
-     # Get space info
-     username, space_url = get_space_info()
- 
-     # Define the label mapping
-     LABEL_MAPPING = {
-         "0_not_relevant": 0,
-         "1_not_happening": 1,
-         "2_not_human": 2,
-         "3_not_bad": 3,
-         "4_solutions_harmful_unnecessary": 4,
-         "5_science_unreliable": 5,
-         "6_proponents_biased": 6,
-         "7_fossil_fuels_needed": 7
-     }
- 
-     # Load and prepare the dataset
-     dataset = load_dataset(request.dataset_name)
- 
-     # Convert string labels to integers
-     dataset = dataset.map(lambda x: {"label": LABEL_MAPPING[x["label"]]})
- 
-     # Split dataset
-     train_test = dataset["train"]
-     test_dataset = dataset["test"]
- 
-     # Start tracking emissions
-     tracker.start()
-     tracker.start_task("inference")
- 
-     #--------------------------------------------------------------------------------------------
-     # YOUR MODEL INFERENCE CODE HERE
-     # Update the code below to replace the random baseline by your model inference within the inference pass where the energy consumption and emissions are tracked.
-     #--------------------------------------------------------------------------------------------
- 
-     # Make random predictions (placeholder for actual model inference)
-     true_labels = test_dataset["label"]
-     predictions = [random.randint(0, 7) for _ in range(len(true_labels))]
- 
-     #--------------------------------------------------------------------------------------------
-     # YOUR MODEL INFERENCE STOPS HERE
-     #--------------------------------------------------------------------------------------------
- 
-     # Stop tracking emissions
-     emissions_data = tracker.stop_task()
- 
-     # Calculate accuracy
-     accuracy = accuracy_score(true_labels, predictions)
- 
-     # Prepare results dictionary
-     results = {
-         "username": username,
-         "space_url": space_url,
-         "submission_timestamp": datetime.now().isoformat(),
-         "model_description": DESCRIPTION,
-         "accuracy": float(accuracy),
-         "energy_consumed_wh": emissions_data.energy_consumed * 1000,
-         "emissions_gco2eq": emissions_data.emissions * 1000,
-         "emissions_data": clean_emissions_data(emissions_data),
-         "api_route": ROUTE,
-         "dataset_config": {
-             "dataset_name": request.dataset_name,
-             "test_size": request.test_size,
-             "test_seed": request.test_seed
-         }
-     }
- 
-     return results
tasks/utils/evaluation.py CHANGED
@@ -1,18 +1,9 @@
- from typing import Optional
  from pydantic import BaseModel, Field
  
  class BaseEvaluationRequest(BaseModel):
      test_size: float = Field(0.2, ge=0.0, le=1.0, description="Size of the test split (between 0 and 1)")
      test_seed: int = Field(42, ge=0, description="Random seed for reproducibility")
  
- class TextEvaluationRequest(BaseEvaluationRequest):
-     dataset_name: str = Field("QuotaClimat/frugalaichallenge-text-train",
-                               description="The name of the dataset on HuggingFace Hub")
- 
- class ImageEvaluationRequest(BaseEvaluationRequest):
-     dataset_name: str = Field("pyronear/pyro-sdis",
-                               description="The name of the dataset on HuggingFace Hub")
- 
  class AudioEvaluationRequest(BaseEvaluationRequest):
      dataset_name: str = Field("rfcx/frugalai",
                                description="The name of the dataset on HuggingFace Hub")
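With the text and image request models removed, only the audio task remains; a quick usage sketch with the field defaults defined above:

from tasks.utils.evaluation import AudioEvaluationRequest

req = AudioEvaluationRequest()  # every field has a default
print(req.dataset_name, req.test_size, req.test_seed)  # rfcx/frugalai 0.2 42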
tasks/utils/preprocess.py ADDED
@@ -0,0 +1,99 @@
+ from torch.utils.data import DataLoader
+ import librosa
+ from math import floor
+ import torch
+ from torch.nn.functional import pad
+ from torchaudio.transforms import Resample
+ #from random import randint
+ 
+ 
+ def get_dataloader(dataset, device, batch_size=16, shuffle=True):
+     return DataLoader(
+         dataset.with_format("torch", device=device),
+         batch_size=batch_size,
+         collate_fn=prepare_batch,
+         num_workers=4,
+         shuffle=shuffle,
+         drop_last=True,
+         persistent_workers=True,
+     )
+ 
+ def resample(x, sr, newsr):
+     transform = Resample(
+         orig_freq=sr,
+         new_freq=newsr,
+         resampling_method="sinc_interp_kaiser",
+         lowpass_filter_width=16,
+         rolloff=0.85,
+         beta=8.555504641634386,
+     )
+     return transform(x)
+ 
+ def fixlength(x, L):
+     x = x[:L]
+     x = pad(x, (0, L-len(x)))
+     return x
+ 
+ def preprocess(X, newsr, n_fft, win_length, hop_length, gain=0.8, bias=10, power=0.25):
+     X = torch.stft(X, n_fft, hop_length=hop_length, win_length=win_length, window=torch.hann_window(win_length), onesided=True, return_complex=True)
+     X = torch.abs(X)
+     X = torch.stack([torch.from_numpy(librosa.pcen(x.numpy(), sr=newsr, hop_length=hop_length, gain=gain, bias=bias, power=power))
+                      for x in X], 0)
+     X = X.to(torch.bfloat16)
+     return X
+ 
+ 
+ def prepare_batch(samples):
+     #maxlen=60
+     newsr = 4000
+     n_fft = 2**10  # power of 2
+     win_length = 2**10
+     hop_length = floor(0.0505*newsr)
+     labels = []
+     signals = []
+     for sample in samples:
+         labels.append(sample['label'])
+         sr = sample['audio']['sampling_rate']
+         x = sample['audio']['array']
+         if (sr > newsr and len(x)!=0):
+             x = resample(x, sr, newsr)
+         x = fixlength(x, 3*newsr)
+         signals.append(x)
+ 
+     signals = torch.stack(signals, 0)
+     batch = preprocess(signals, newsr, n_fft, win_length, hop_length)
+     labels = torch.tensor(labels, dtype=float)
+     return batch, labels
+ 
+ # def random_mask(sample):
+ #     # random rectangular mask
+ #     B, H, W = sample.shape
+ #     for b in range(B):
+ #         for _ in range(randint(3,12)):
+ #             w = randint(5, 15)
+ #             h = randint(10, 100)
+ #             x1 = randint(0, W-w)
+ #             y1 = randint(0, H-h)
+ #             sample[b, y1:y1+h, x1:x1+w] = 0
+ #     return sample
+ 
+ # def timeshift(sample):
+ #     padsize = randint(0, 6)
+ #     length = sample.size(2)
+ #     randpad = torch.zeros((sample.size(0), sample.size(1), padsize), dtype=torch.float32)
+ #     sample = torch.cat((randpad, sample), dim=2)
+ #     sample = sample[:,:,:length]
+ #     return sample
+ 
+ # def add_noise(sample):
+ #     #noise = np.random.normal(0, 0.05*sample.max(), sample.shape)
+ #     noise = 0.05*sample.max()*torch.randn(sample.shape, dtype=torch.float32)
+ #     sample = sample + noise
+ #     return sample
+ 
+ # def augment(sample):
+ #     sample = timeshift(sample)
+ #     sample = random_mask(sample)
+ #     sample = add_noise(sample)
+ #     return sample
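The STFT/PCEN settings above pin down the spectrogram shape: hop_length = floor(0.0505 * 4000) = 202, so a 3 s clip (12000 samples) yields 1 + floor(12000 / 202) = 60 centered frames, and the one-sided 1024-point FFT yields 513 bins. A standalone sketch of that check (my addition, not part of the commit):

import torch
from math import floor
from tasks.utils.preprocess import preprocess

newsr = 4000
hop_length = floor(0.0505 * newsr)  # 202
x = torch.randn(2, 3 * newsr)       # two 3 s mono clips at 4 kHz
S = preprocess(x, newsr, n_fft=2**10, win_length=2**10, hop_length=hop_length)
print(S.shape, S.dtype)             # torch.Size([2, 513, 60]) torch.bfloat16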
training/curate.ipynb ADDED
@@ -0,0 +1,202 @@
+ {
+  "cells": [
+   {
+    "cell_type": "code",
+    "execution_count": 1,
+    "metadata": {},
+    "outputs": [],
+    "source": [
+     "import librosa\n",
+     "import soundfile as sf\n",
+     "import numpy as np\n",
+     "from os import listdir\n",
+     "from os.path import isfile, join\n",
+     "from math import floor\n",
+     "import IPython.display as ipd"
+    ]
+   },
+   {
+    "cell_type": "markdown",
+    "metadata": {},
+    "source": [
+     "Data from [ESC-50](https://github.com/karolpiczak/ESC-50) \n",
+     "And [freesound.org](https://freesound.org)"
+    ]
+   },
+   {
+    "cell_type": "code",
+    "execution_count": 2,
+    "metadata": {},
+    "outputs": [],
+    "source": [
+     "from scipy.signal import butter, lfilter\n",
+     "\n",
+     "def apply_lowpass_filter(x, sr):\n",
+     "    order = 10\n",
+     "    cutoff = 2000\n",
+     "    b, a = butter(order, cutoff, fs=sr, btype='low', analog=False)\n",
+     "    y = lfilter(b, a, x)\n",
+     "    return y"
+    ]
+   },
+   {
+    "cell_type": "code",
+    "execution_count": 3,
+    "metadata": {},
+    "outputs": [],
+    "source": [
+     "def downsample(x, sr, newsr):\n",
+     "    return x[::floor(sr/newsr)]"
+    ]
+   },
+   {
+    "cell_type": "code",
+    "execution_count": 4,
+    "metadata": {},
+    "outputs": [],
+    "source": [
+     "def play(x, sr):\n",
+     "    ipd.display(ipd.Audio(data=x, rate=sr))"
+    ]
+   },
+   {
+    "cell_type": "code",
+    "execution_count": 11,
+    "metadata": {},
+    "outputs": [],
+    "source": [
+     "# crop a single file (if it ends with silence)\n",
+     "# file=\"751913__spaudiobooks__chainsaw-in-a-forest\"\n",
+     "# ext=\".wav\"\n",
+     "# data, sr = librosa.load(dirpath+file+ext)\n",
+     "# data = data[:sr*(60+31)]\n",
+     "# sf.write(dirpath+file+\".wav\", data, sr)"
+    ]
+   },
+   {
+    "cell_type": "code",
+    "execution_count": null,
+    "metadata": {},
+    "outputs": [],
+    "source": [
+     "# read, filter, downsample, chunk by 3s length, write wav\n",
+     "newsr=4000\n",
+     "c=2550\n",
+     "dirpath = \"../datasets/freesound/chainsaw/audio/long/\"\n",
+     "for file in listdir(dirpath):\n",
+     "    if isfile(join(dirpath, file)):\n",
+     "        print(file)\n",
+     "        data, sr = librosa.load(dirpath+file)\n",
+     "        #play(data, sr)\n",
+     "        data = apply_lowpass_filter(data, sr)\n",
+     "        data = downsample(data, sr, newsr)\n",
+     "        cutpoints = list(range(3*newsr,len(data),3*newsr))\n",
+     "        all_data = np.split(data, cutpoints)\n",
+     "        for d in all_data:\n",
+     "            if (len(d) > 1024):\n",
+     "                sf.write(dirpath+f'curated/{c}.wav', d, 4000)\n",
+     "                c+=1\n"
+    ]
+   },
+   {
+    "cell_type": "code",
+    "execution_count": 7,
+    "metadata": {},
+    "outputs": [],
+    "source": [
+     "# detect too short files\n",
+     "# dirpath = \"../datasets/freesound/environment/audio/curated/\"\n",
+     "# for file in listdir(dirpath):\n",
+     "#     if isfile(join(dirpath, file)):\n",
+     "#         data, sr = librosa.load(dirpath+file)\n",
+     "#         if (len(data)<=1024):\n",
+     "#             print(file, len(data))"
+    ]
+   },
+   {
+    "cell_type": "code",
+    "execution_count": null,
+    "metadata": {},
+    "outputs": [],
+    "source": [
+     "# ESC-50\n",
+     "# attenuate, mix with background\n",
+     "from random import randint, uniform\n",
+     "\n",
+     "c=0\n",
+     "newsr=4000\n",
+     "dirpath = \"../datasets/freesound/chainsaw/audio/\"\n",
+     "envdir = \"../datasets/freesound/environment/audio/\"\n",
+     "envfiles = [file for file in listdir(envdir) if isfile(join(envdir, file))]\n",
+     "for file in listdir(dirpath):\n",
+     "    if isfile(join(dirpath, file)):\n",
+     "        print(file)\n",
+     "        data, sr = librosa.load(dirpath+file)\n",
+     "        #play(data, sr)\n",
+     "        lastindexes=[]\n",
+     "        for i in range(3):\n",
+     "            index = randint(0, len(envfiles)-1)\n",
+     "            while (index in lastindexes):\n",
+     "                index = randint(0, len(envfiles)-1)\n",
+     "            lastindexes.append(index)\n",
+     "            addfile = envfiles[index]\n",
+     "            data2, sr2 = librosa.load(envdir+addfile)\n",
+     "            data1 = apply_lowpass_filter(data, sr)\n",
+     "            data2 = apply_lowpass_filter(data2, sr2)\n",
+     "            data1 = downsample(data1, sr, newsr)\n",
+     "            data2 = downsample(data2, sr2, newsr)\n",
+     "            attenuation = round(uniform(0.2, 0.5), 2)\n",
+     "            data1 = (data1 * attenuation + data2 *(1-attenuation))/2\n",
+     "            all_data = np.split(data1, [round(len(data1)/2)])\n",
+     "            for d in all_data:\n",
+     "                sf.write(dirpath+f'test/mix-{c}.wav', d, 4000)\n",
+     "                c+=1"
+    ]
+   },
+   {
+    "cell_type": "code",
+    "execution_count": 64,
+    "metadata": {},
+    "outputs": [],
+    "source": [
+     "# environment audio from ESC-50, filter, downsample and half the files (they are 5 sec long)\n",
+     "newsr=4000\n",
+     "c=2649\n",
+     "dirpath = \"../datasets/freesound/environment/audio/\"\n",
+     "for file in listdir(dirpath):\n",
+     "    if isfile(join(dirpath, file)):\n",
+     "        data, sr = librosa.load(dirpath+file)\n",
+     "        data = apply_lowpass_filter(data, sr)\n",
+     "        data = downsample(data, sr, newsr)\n",
+     "        all_data = np.split(data, [round(len(data)/2)])\n",
+     "        for d in all_data:\n",
+     "            # random time shift\n",
+     "            rand_zeros = np.zeros(randint(0, 1900))\n",
+     "            d = np.append(rand_zeros, d)\n",
+     "            sf.write(dirpath+f'curated/e-{c}.wav', d, 4000)\n",
+     "            c+=1"
+    ]
+   }
+  ],
+  "metadata": {
+   "kernelspec": {
+    "display_name": "audio-processing",
+    "language": "python",
+    "name": "python3"
+   },
+   "language_info": {
+    "codemirror_mode": {
+     "name": "ipython",
+     "version": 3
+    },
+    "file_extension": ".py",
+    "mimetype": "text/x-python",
+    "name": "python",
+    "nbconvert_exporter": "python",
+    "pygments_lexer": "ipython3",
+    "version": "3.11.5"
+   }
+  },
+  "nbformat": 4,
+  "nbformat_minor": 2
+ }
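The notebook's chunking step cuts each filtered, decimated recording into 3 s pieces (12000 samples at 4 kHz) and keeps any remainder longer than 1024 samples; a standalone sketch (my addition) of the np.split arithmetic:

import numpy as np

newsr = 4000
data = np.zeros(10 * newsr)  # a 10 s clip at 4 kHz
cutpoints = list(range(3 * newsr, len(data), 3 * newsr))  # [12000, 24000, 36000]
chunks = np.split(data, cutpoints)
print([len(c) / newsr for c in chunks])  # [3.0, 3.0, 3.0, 1.0] seconds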
training/dataset.py ADDED
@@ -0,0 +1,26 @@
+ from torch.utils.data import Dataset as TorchDataset
+ import pandas as pd
+ import torchaudio
+ import torch
+ 
+ class ChainsawDataset(TorchDataset):
+ 
+     def __init__(self):
+         self.path="../datasets/freesound/"
+         self.ds = pd.read_csv(self.path+"labels.csv")
+ 
+     def __getitem__(self, index):
+         file, label = self.ds.iloc[index]
+         x, sr = torchaudio.load(self.path+file)
+         x = x.squeeze()
+         return {
+             'audio': {
+                 'path': file,
+                 'array': x,
+                 'sampling_rate': torch.tensor(sr),
+             },
+             'label': torch.tensor(label)
+         }
+ 
+     def __len__(self):
+         return len(self.ds)
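A usage sketch for ChainsawDataset (my addition; it assumes ../datasets/freesound/labels.csv exists with exactly two columns, file and label, and that the referenced wav files are present):

from dataset import ChainsawDataset  # run from within training/

ds = ChainsawDataset()
sample = ds[0]  # one mono clip plus its binary label
print(len(ds), sample['audio']['array'].shape, sample['label'].item())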
training/train.py ADDED
@@ -0,0 +1,145 @@
+ import time
+ import torch
+ from torch import optim
+ from torch import nn
+ from torchmetrics.classification import BinaryAccuracy
+ from torch.optim.lr_scheduler import OneCycleLR
+ from torch.amp import autocast
+ import mlflow
+ from tqdm import tqdm
+ import model
+ import preprocess
+ import dataset
+ 
+ # MLflow server
+ mlflow.set_tracking_uri(uri="http://localhost:8080")
+ mlflow.set_experiment("Optimizations")
+ 
+ start_time = time.time()
+ 
+ # batch  best lr
+ # 8      1e-3
+ # 16     5e-3
+ # hyperparameters
+ hp = {
+     'batch_size': 16,
+     'learning_rate': 5e-3,
+     'num_epochs': 10,
+ }
+ 
+ device = 'cuda' if torch.cuda.is_available() else 'cpu'
+ 
+ preprocess.hflogin()
+ 
+ # Prepare datasets
+ custom_dataset = dataset.ChainsawDataset()
+ train_dataset = preprocess.get_dataset('train', device)
+ train_dataset = torch.utils.data.ConcatDataset([train_dataset, custom_dataset])
+ val_dataset = preprocess.get_dataset('test', device)
+ train_dataloader = preprocess.get_dataloader(train_dataset, batch_size=hp['batch_size'], shuffle=True)
+ val_dataloader = preprocess.get_dataloader(val_dataset, batch_size=hp['batch_size'], shuffle=False)
+ 
+ # Load model
+ model = model.ChainsawDetector(hp['batch_size']).to(device, dtype=torch.bfloat16)
+ model = torch.compile(model)
+ model.load_state_dict(torch.load('backups/final-bf16.pth', weights_only=True), strict=True)
+ 
+ hp['total_params'] = sum(p.numel() for p in model.parameters())
+ print(f"model ready, {hp['total_params']} parameters")
+ 
+ loss_fn = nn.BCELoss()
+ hp["loss_fn"] = 'BinaryCrossEntropyLoss'
+ optimizer = optim.AdamW(model.parameters(), lr=hp['learning_rate'])
+ 
+ total_iterations = len(train_dataset)
+ steps_per_epoch = total_iterations//hp['batch_size']
+ total_steps = total_iterations*hp['num_epochs']
+ print(f"batch_size = {hp['batch_size']}, num_epochs = {hp['num_epochs']}")
+ print(f'{total_iterations=}, {steps_per_epoch=}, {total_steps=}')
+ lrscheduler = OneCycleLR(optimizer, max_lr=hp['learning_rate'], steps_per_epoch=steps_per_epoch, epochs=hp['num_epochs'])
+ hp["optimizer"] = 'AdamW'
+ metric_fn = BinaryAccuracy(threshold=0.5)
+ 
+ def train(loader, model, loss_fn, metric_fn, optimizer, lrscheduler, epoch):
+     for batch_index, (data, targets) in enumerate(tqdm(loader)):
+         # Move data and targets to the device (GPU/CPU)
+         data = data.to(device, dtype=torch.bfloat16)
+         data = preprocess.augment(data)
+         targets = targets.to(device, dtype=torch.bfloat16)
+ 
+         optimizer.zero_grad()
+         # Forward pass: compute the model output
+         with autocast(device_type=device, dtype=torch.bfloat16):
+             predictions = model(data)
+             loss = loss_fn(predictions, targets)
+ 
+         # Backward pass: compute the gradients
+         loss.backward()
+ 
+         # Optimization step: update the model parameters
+         optimizer.step()
+         lrscheduler.step()
+ 
+         if batch_index % 100 == 0:
+             loss = loss.item()
+             accuracy = metric_fn(predictions, targets)
+             step = batch_index + epoch*steps_per_epoch
+             mlflow.log_metric("lr", lrscheduler.get_last_lr()[0], step=step)
+             mlflow.log_metric("train_loss", f"{loss:2f}", step=step)
+             mlflow.log_metric("train_accuracy", f"{accuracy:2f}", step=step)
+ 
+ def decide(x):
+     return 1 if x>=0.5 else 0
+ 
+ MAE = torch.nn.L1Loss()
+ 
+ def evaluate(loader, model, epoch, loss_fn=loss_fn):
+     num_correct = 0
+     num_samples = 0
+     num_batches = 0
+     loss = 0
+     confidence = 0
+     model.eval()
+ 
+     with torch.no_grad(), autocast(device_type=device):
+         for X, y in loader:
+             X = X.to(device, dtype=torch.bfloat16)
+             y = y.to(device, dtype=torch.bfloat16)
+ 
+             predictions = model(X)
+ 
+             decisions = predictions.detach().clone()
+             decisions.apply_(decide)
+ 
+             confidence += MAE(decisions, predictions)
+             loss += loss_fn(decisions, y).item()
+             num_correct += (decisions == y).sum()  # Count correct predictions
+             num_samples += decisions.size(0)       # Count total samples
+             num_batches += 1
+ 
+     # Calculate metrics
+     accuracy = float(num_correct) / float(num_samples) * 100
+     loss /= num_batches
+     confidence /= num_batches
+     confidence = 1-confidence
+     mlflow.log_metric("val_loss", f"{loss:2f}", step=epoch)
+     mlflow.log_metric("val_accuracy", f"{accuracy:2f}", step=epoch)
+     mlflow.log_metric("val_confidence", f"{confidence:2f}", step=epoch)
+     print(f"Got {num_correct}/{num_samples} with accuracy {accuracy:.2f}% and confidence {confidence:.2f}")
+     model.train()
+ 
+ 
+ with mlflow.start_run() as run:
+     mlflow.log_params(hp)
+     for epoch in range(0, hp['num_epochs']):
+         print(f"Epoch [{epoch+1}/{hp['num_epochs']}]")
+         train(train_dataloader, model, loss_fn, metric_fn, optimizer, lrscheduler, epoch)
+         evaluate(val_dataloader, model, epoch)
+ 
+ model.eval()
+ 
+ elapsed = time.time() - start_time
+ print(f"--- {elapsed:.2f} seconds ---")
+ 
+ torch.save(model.state_dict(), 'backups/name.pth')
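The confidence metric in evaluate() above is 1 - MAE between the hard decisions and the raw sigmoid outputs, i.e. how close the model's probabilities sit to 0 or 1. A small numeric check (my addition, not part of the commit):

import torch

preds = torch.tensor([0.9, 0.2, 0.55])     # sigmoid outputs
decisions = (preds >= 0.5).float()          # tensor([1., 0., 1.])
mae = torch.nn.L1Loss()(decisions, preds)   # mean(0.1, 0.2, 0.45) = 0.25
print(1 - mae)                              # confidence = 0.75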