Nicolas Denier committed on
Commit 5ad4868 · 1 Parent(s): 0ae53cb

ready for submission

.gitignore CHANGED
@@ -15,3 +15,5 @@ logs/
15
 
16
  emissions.csv
17
  notebooks/test.ipynb
 
 
 
15
 
16
  emissions.csv
17
  notebooks/test.ipynb
18
+
19
+ .fuse_hidden*
README.md CHANGED
@@ -1,52 +1,107 @@
1
  ---
2
- title: Submission Template
3
- emoji: 🔥
4
- colorFrom: yellow
5
  colorTo: green
6
  sdk: docker
7
  pinned: false
8
  ---
9
 
10
 
11
- # Random Baseline Model for Climate Disinformation Classification
12
 
13
  ## Model Description
14
 
15
- This is a random baseline model for the Frugal AI Challenge 2024, specifically for the text classification task of identifying climate disinformation. The model serves as a performance floor, randomly assigning labels to text inputs without any learning.
16
 
17
  ### Intended Use
18
 
19
- - **Primary intended uses**: Baseline comparison for climate disinformation classification models
20
  - **Primary intended users**: Researchers and developers participating in the Frugal AI Challenge
21
  - **Out-of-scope use cases**: Not intended for production use or real-world classification tasks
22
 
23
- ## Training Data
 
24
 
25
- The model uses the QuotaClimat/frugalaichallenge-text-train dataset:
26
- - Size: ~6000 examples
27
- - Split: 80% train, 20% test
28
- - 8 categories of climate disinformation claims
 
29
 
30
  ### Labels
31
- 0. No relevant claim detected
32
- 1. Global warming is not happening
33
- 2. Not caused by humans
34
- 3. Not bad or beneficial
35
- 4. Solutions harmful/unnecessary
36
- 5. Science is unreliable
37
- 6. Proponents are biased
38
- 7. Fossil fuels are needed
 
39
 
40
  ## Performance
41
 
42
  ### Metrics
43
- - **Accuracy**: ~12.5% (random chance with 8 classes)
 
 
 
44
  - **Environmental Impact**:
45
  - Emissions tracked in gCO2eq
46
  - Energy consumption tracked in Wh
47
 
48
  ### Model Architecture
49
- The model implements a random choice between the 8 possible labels, serving as the simplest possible baseline.
50
 
51
  ## Environmental Impact
52
 
@@ -57,15 +112,24 @@ Environmental impact is tracked using CodeCarbon, measuring:
57
  This tracking helps establish a baseline for the environmental impact of model deployment and inference.
58
 
59
  ## Limitations
60
- - Makes completely random predictions
61
- - No learning or pattern recognition
62
- - No consideration of input text
63
- - Serves only as a baseline reference
64
- - Not suitable for any real-world applications
65
 
66
  ## Ethical Considerations
67
 
68
- - Dataset contains sensitive topics related to climate disinformation
69
- - Model makes random predictions and should not be used for actual classification
70
- - Environmental impact is tracked to promote awareness of AI's carbon footprint
1
  ---
2
+ title: ChainsawDetector
3
+ emoji: 🌳
4
+ colorFrom: lightgreen
5
  colorTo: green
6
  sdk: docker
7
  pinned: false
8
  ---
9
 
10
 
11
+ # ChainsawDetector
12
 
13
  ## Model Description
14
 
15
+ This model is proposed as a submission for the **Frugal AI Challenge 2024**, specifically for the **audio** binary classification task: detecting chainsaws amid environmental noise.
16
 
17
  ### Intended Use
18
 
19
+ - **Primary intended uses**: Non-commercial chainsaw detection from audio recordings
20
  - **Primary intended users**: Researchers and developers participating in the Frugal AI Challenge
21
  - **Out-of-scope use cases**: Not intended for production use or real-world classification tasks
22
 
23
+ ## Training
24
+ ### Data
25
 
26
+ - The model was mainly trained on the [rfcx/frugalai](https://huggingface.co/datasets/rfcx/frugalai) dataset:
27
+ - License: CC BY-NC 4.0
28
+ - Size: 35.3k audio samples of at most 3 seconds each
29
+ - 2 classes (chainsaw or environment)
30
+ - The validation set (15.1k samples) and the final test set are provided by the same source
31
+
32
+ To improve performance, additional datasets were explored:
33
+ - Diverse audio recordings fetched from [freesound](https://freesound.org/)
34
+ - Various open licenses (Creative Commons: Attribution, Non-commercial); see [`datasources/`](datasources/) for complete attributions.
35
+ - Chainsaw and environmental noise (forest, rain)
36
+ - After curation: 2425 chainsaw and 2646 environment samples of 3 seconds each
37
+
38
+ - [ESC-50](https://github.com/karolpiczak/ESC-50)
39
+ - License: CC BY-NC 3.0
40
+ - Initially an environmental sound dataset of 50 classes
41
+ - Only chainsaw (as class 0) and crickets, birds, and wind (as class 1) were kept
42
+ - Only 20 samples of 5 seconds for each class
43
+ - After mixing and cropping: 240 samples for class 0 and 240 for class 1 (a derisory amount, but it was interesting to process)
44
 
45
  ### Labels
46
+ 0. Chainsaw
47
+ 1. Environment
48
+
49
+ ### Preprocessing
50
+ 1. The initial raw audio arrays are first downsampled to 4 kHz.
51
+ Indeed, according to [[1]](https://ieeexplore.ieee.org/document/9909629), "chainsaw harmonics are visible only up to 1kHz". They can be higher, but in practice are "often masked by background noise", so it was decided to keep frequencies up to 2 kHz as model input. Since the Nyquist–Shannon sampling theorem requires a sampling rate of at least twice the highest frequency to be preserved, downsampling to 4 kHz is sufficient for the subsequent Fourier transform. A low-pass filter is applied before downsampling to avoid aliasing.
52
+ This has two major advantages: it reduces the amount of "useless" data to process (in the sense that it does not contain valuable information for identifying chainsaws), leading to faster processing and convergence; and it filters out a substantial part of the possible noise (many high-frequency bird songs are present in the recordings).
53
+
54
+ 2. The Short-Time Fourier Transform (STFT) is used to extract a spectrogram.
55
+ An `n_fft` of 1024, a window length of about 0.25 s, and a hop length close to 0.05 s lead to a spectrogram of size (513, 60) along the frequency and time dimensions, respectively, for the 4 kHz, 3 s inputs. These rather wide windows (compared to speech processing, for example) roughly summarize the information without producing too much data (remember, frugality). More generally, most decisions in this work favored simplicity while still allowing decent performance.
56
+
57
+ 3. Per-Channel Energy Normalization (PCEN) [[2]](https://ieeexplore.ieee.org/document/8514023) is applied.
58
+ It is described as "a computationally efficient frontend for robust detection and classification of acoustic events", which is exactly what is needed here. It should bring better performance than traditional MFCCs in this case.
59
+
60
+ 4. The spectrograms are split along the time dimension into 6 chunks of time-length 10
61
+ (which is about half a second with context).
62
+ These (513, 10)-sized chunks are fed sequentially to the model. The idea is that a signal of any length can be chunked along time and thus processed by the model. In this demonstration, only 3 s signals are used so that they can be processed in batches, but in a real-world application this architecture can process continuous, real-time audio recordings. A minimal sketch of the whole pipeline is given just below.
63
+
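A minimal sketch of this preprocessing pipeline, assuming `librosa` (the exact resampling filter, PCEN parameters, and scaling constant below are assumptions, not necessarily the submission's exact code):

```python
# Preprocessing sketch: downsample to 4 kHz -> STFT -> PCEN -> time chunks.
import numpy as np
import librosa

TARGET_SR = 4000     # keeps content up to 2 kHz (Nyquist)
N_FFT = 1024         # -> 513 frequency bins
WIN_LENGTH = 1000    # ~0.25 s at 4 kHz
HOP_LENGTH = 200     # ~0.05 s at 4 kHz
CHUNK_WIDTH = 10     # ~0.5 s of context per chunk

def preprocess(y: np.ndarray, sr: int) -> list[np.ndarray]:
    # Band-limited resampling low-pass filters implicitly, avoiding aliasing.
    y = librosa.resample(y, orig_sr=sr, target_sr=TARGET_SR)
    # Magnitude spectrogram: shape (513, ~60) for a 3 s input.
    mag = np.abs(librosa.stft(y, n_fft=N_FFT, win_length=WIN_LENGTH,
                              hop_length=HOP_LENGTH))
    # Per-Channel Energy Normalization (librosa defaults; input scaled as
    # suggested by the librosa documentation).
    pcen = librosa.pcen(mag * (2 ** 31), sr=TARGET_SR, hop_length=HOP_LENGTH)
    # Split along time into (513, 10) chunks, fed sequentially to the model.
    n_chunks = pcen.shape[1] // CHUNK_WIDTH
    return [pcen[:, i * CHUNK_WIDTH:(i + 1) * CHUNK_WIDTH]
            for i in range(n_chunks)]
```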
64
+ ### Augmentation
65
+ During training only, several data augmentation techniques were applied (sketched below):
66
+ - Small time shifts: the spectrograms are randomly zero-padded to the right.
67
+ - Random rectangular masks are applied to hide part of the input data.
68
+ - Moderate Gaussian noise is added so the model does not simply memorize each data point.
69
+
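A rough sketch of these augmentations (the shift range, mask sizes, and noise level are illustrative assumptions):

```python
# Augmentation sketch: time shift, rectangular masks, Gaussian noise.
import torch
import torch.nn.functional as F

def augment(spec: torch.Tensor, max_shift: int = 5, n_masks: int = 2,
            noise_std: float = 0.05) -> torch.Tensor:
    """spec: a (freq, time) PCEN spectrogram; returns an augmented copy."""
    freq, time = spec.shape
    # 1. Small time shift: random zero padding to the right, crop to length.
    shift = int(torch.randint(0, max_shift + 1, (1,)))
    spec = F.pad(spec, (0, shift))[:, shift:].clone()
    # 2. Random rectangular masks hiding parts of the input.
    for _ in range(n_masks):
        f0 = int(torch.randint(0, freq - 32, (1,)))
        t0 = int(torch.randint(0, max(time - 4, 1), (1,)))
        spec[f0:f0 + 32, t0:t0 + 4] = 0.0
    # 3. Moderate Gaussian noise so the model cannot memorize samples.
    return spec + noise_std * torch.randn_like(spec)
```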
70
+ ### Details
71
+ The final model was trained for 20 epochs with a OneCycle learning rate schedule peaking at 0.005.
72
+ A batch size of 16 was chosen to keep performance sufficient, as the model was trained entirely on a CPU (Intel® Core™ i5-1135G7)!
73
+ With AdamW as the optimizer, the first half of training ran in float32 with automatic mixed precision, and the last part fully in bfloat16.
74
+ The model converges quickly, although erratically due to data augmentation, then improves slowly from 90% accuracy until plateauing at 93%.
75
+ Training code can be found in the [training/](training/) directory (not used for inference, provided for information only); a hypothetical skeleton of the loop is sketched below.
76
 
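A hypothetical skeleton of that loop (the `model` and `train_set` objects, and the exact point where the precision switches, are assumptions):

```python
# Training-loop sketch: AdamW + OneCycle peaking at 5e-3, batch size 16, CPU.
import torch
from torch.utils.data import DataLoader

def train(model, train_set, epochs=20, batch_size=16, peak_lr=5e-3):
    loader = DataLoader(train_set, batch_size=batch_size, shuffle=True)
    opt = torch.optim.AdamW(model.parameters(), lr=peak_lr)
    sched = torch.optim.lr_scheduler.OneCycleLR(
        opt, max_lr=peak_lr, epochs=epochs, steps_per_epoch=len(loader))
    loss_fn = torch.nn.BCEWithLogitsLoss()
    for epoch in range(epochs):
        for chunks, labels in loader:
            # CPU autocast in bfloat16 stands in for the float32-AMP /
            # full-bfloat16 split described above.
            with torch.autocast("cpu", dtype=torch.bfloat16):
                loss = loss_fn(model(chunks), labels.float())
            opt.zero_grad()
            loss.backward()
            opt.step()
            sched.step()
```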
77
  ## Performance
78
 
79
  ### Metrics
80
+ - **Accuracy**: 93.02% on the test split
81
+ - **Precision**: 93.51%
82
+ - **Recall**: 89.97%
83
+ - **F-score**: 91.71%
84
  - **Environmental Impact**:
85
  - Emissions tracked in gCO2eq
86
  - Energy consumption tracked in Wh
87
+ - **Mistakes**
88
+ - False positives represent 38.35% of the mistakes
89
+ - False negatives represent 61.65% of the mistakes
90
+ The model tends to predict class 1 (environment) more often.
91
+ Possible explanations are:
92
+ - This class is slightly more represented in the training dataset
93
+ - It corresponds to the default class (the LSTM initial states are biased towards it)
94
+ - Technically, every audio sample contains environmental noise; chainsaw occurrences sit on top of it
95
+ Overall, this is not a bad thing, as false alarms can waste time in real-world situations (see the sanity check below).
96
 
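As a sanity check, the reported F-score is consistent with the precision and recall above: 2 × 0.9351 × 0.8997 / (0.9351 + 0.8997) ≈ 0.9171. The mistake breakdown can be computed along these lines (a hypothetical helper, treating class 0, chainsaw, as the positive class):

```python
from sklearn.metrics import confusion_matrix

def mistake_shares(y_true, y_pred):
    # With labels=[1, 0], ravel() yields (tn, fp, fn, tp) under the
    # convention class 0 = chainsaw (positive), class 1 = environment.
    tn, fp, fn, tp = confusion_matrix(y_true, y_pred, labels=[1, 0]).ravel()
    mistakes = fp + fn
    return fp / mistakes, fn / mistakes  # here: ~0.38 and ~0.62
```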
97
  ### Model Architecture
98
+ The model takes a sequence of chunks (as described above) as input and produces a single decision (0 or 1).
99
+ Three convolutional layers (with some max pooling) reduce each input chunk to a 2D tensor of 8 points across 16 channels, then a fourth one shrinks the channels, producing an 8-length vector. The first convolution is 2D, but the next ones are 1D, as the time axis is quickly reduced to a single dimension. The vector thus summarizes the frequencies into 8 values.
100
+ These values are passed to an LSTM that also receives an initial state of ones (environment by default).
101
+ Each chunk is then processed the same way: the same convolutional kernels, then the persistent LSTM updating its state.
102
+ At the end of the signal (here, after 6 chunks), a final dense layer takes the last LSTM state (8 values) and outputs a raw prediction. (ReLU activations are used in the hidden states, as well as after the convolutions, because they are cheap to compute.)
103
+ After a sigmoid activation and a simple threshold (1 if above 0.5, else 0), the final decision is produced.
104
+ In total, 1798 parameters are used. An illustrative sketch is given below.
105
 
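An illustrative PyTorch module following this description (kernel and pooling sizes below are assumptions chosen to reproduce the stated intermediate shapes; the actual 1798-parameter configuration may differ):

```python
import torch
import torch.nn as nn

class ChainsawDetector(nn.Module):
    def __init__(self):
        super().__init__()
        # The first convolution is 2D and collapses the chunk's time axis.
        self.conv2d = nn.Conv2d(1, 16, kernel_size=(7, 10))
        # Two 1D convolutions with max pooling reduce the frequency axis.
        self.conv1d_a = nn.Conv1d(16, 16, kernel_size=5)
        self.conv1d_b = nn.Conv1d(16, 16, kernel_size=5)
        self.pool = nn.MaxPool1d(4)
        self.to_eight = nn.AdaptiveMaxPool1d(8)   # -> 16 channels x 8 points
        # A fourth convolution shrinks the channels -> an 8-value vector.
        self.shrink = nn.Conv1d(16, 1, kernel_size=1)
        self.lstm = nn.LSTMCell(8, 8)
        self.head = nn.Linear(8, 1)

    def forward(self, chunks: torch.Tensor) -> torch.Tensor:
        # chunks: (batch, n_chunks, 513, 10); n_chunks = 6 for 3 s inputs.
        batch, n_chunks = chunks.shape[:2]
        # Initial LSTM state of ones: "environment by default".
        h = torch.ones(batch, 8, device=chunks.device)
        c = torch.ones(batch, 8, device=chunks.device)
        for t in range(n_chunks):
            x = torch.relu(self.conv2d(chunks[:, t].unsqueeze(1)))  # (B,16,507,1)
            x = x.squeeze(-1)
            x = self.pool(torch.relu(self.conv1d_a(x)))             # (B,16,125)
            x = self.pool(torch.relu(self.conv1d_b(x)))             # (B,16,30)
            x = torch.relu(self.shrink(self.to_eight(x))).squeeze(1)  # (B,8)
            h, c = self.lstm(x, (h, c))
        return self.head(h).squeeze(-1)  # raw logit

# Usage: sigmoid + 0.5 threshold turns the logit into the final decision.
logit = ChainsawDetector()(torch.rand(2, 6, 513, 10))
decision = (torch.sigmoid(logit) > 0.5).int()
```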
106
  ## Environmental Impact
107
 
 
112
  This tracking helps establish a baseline for the environmental impact of model deployment and inference.
113
 
114
  ## Limitations
115
+ - Not much time was spent on hyperparameter optimization; only the learning rate and a few layer configurations (kernel sizes, number of layers) were explored. The main reason is that HPO is expensive in time and compute, but there are certainly improvements to be found if it is considered in more detail.
116
+ - Trainable PCEN implementations exist and could be interesting, but they use more weights.
117
+ - More, and more diverse, data could help the model distinguish chainsaws from any other kind of noise possible in a wild forest (and there are apparently a lot).
118
+
 
119
 
120
  ## Ethical Considerations
121
 
122
+ - Environmental impact is tracked to promote awareness of AI's carbon footprint.
123
+ - Advice from [[3]](https://arxiv.org/pdf/2106.08962) was applied to reduce model size while keeping good performance.
124
+ - Illegal deforestation is bad. So is the legal kind, though.
125
+
126
+ ## References
127
+ - [1] N. Stefanakis, K. Psaroulakis, N. Simou and C. Astaras, "An Open-Access System for Long-Range Chainsaw Sound Detection", 2022 30th European Signal Processing Conference (EUSIPCO), Belgrade, Serbia, 2022, pp. 264-268, doi: 10.23919/EUSIPCO55093.2022.9909629.
128
+
129
+ - [2] V. Lostanlen et al., "Per-Channel Energy Normalization: Why and How", in IEEE Signal Processing Letters, vol. 26, no. 1, pp. 39-43, Jan. 2019, doi: 10.1109/LSP.2018.2878620.
130
+
131
+ - [3] G. Menghani, "Efficient Deep Learning: A Survey on Making Deep Learning Models Smaller, Faster, and Better", ACM Computing Surveys, vol. 55, no. 12, pp. 1–37, 2023, doi: 10.1145/3578938.
132
+
134
+
135
+
app.py CHANGED
@@ -1,6 +1,6 @@
1
  from fastapi import FastAPI
2
  from dotenv import load_dotenv
3
- from tasks import text, image, audio
4
 
5
  # Load environment variables
6
  load_dotenv()
@@ -10,18 +10,14 @@ app = FastAPI(
10
  description="API for the Frugal AI Challenge evaluation endpoints"
11
  )
12
 
13
- # Include all routers
14
- app.include_router(text.router)
15
- app.include_router(image.router)
16
  app.include_router(audio.router)
17
 
18
  @app.get("/")
19
  async def root():
20
  return {
21
- "message": "Welcome to the Frugal AI Challenge API",
22
  "endpoints": {
23
- "text": "/text - Text classification task",
24
- "image": "/image - Image classification task (coming soon)",
25
- "audio": "/audio - Audio classification task (coming soon)"
26
  }
27
  }
 
1
  from fastapi import FastAPI
2
  from dotenv import load_dotenv
3
+ from tasks import audio
4
 
5
  # Load environment variables
6
  load_dotenv()
 
10
  description="API for the Frugal AI Challenge evaluation endpoints"
11
  )
12
 
13
+ # Include the audio router
 
 
14
  app.include_router(audio.router)
15
 
16
  @app.get("/")
17
  async def root():
18
  return {
19
+ "message": "Frugal AI Challenge submission API",
20
  "endpoints": {
21
+ "audio": "/audio - Audio classification task"
 
 
22
  }
23
  }
notebooks/template-audio.ipynb DELETED
@@ -1,1351 +0,0 @@
1
- {
2
- "cells": [
3
- {
4
- "cell_type": "markdown",
5
- "metadata": {},
6
- "source": [
7
- "# Text task notebook template\n",
8
- "## Loading the necessary libraries"
9
- ]
10
- },
11
- {
12
- "cell_type": "code",
13
- "execution_count": 3,
14
- "metadata": {},
15
- "outputs": [
16
- {
17
- "name": "stderr",
18
- "output_type": "stream",
19
- "text": [
20
- "[codecarbon WARNING @ 19:48:07] Multiple instances of codecarbon are allowed to run at the same time.\n",
21
- "[codecarbon INFO @ 19:48:07] [setup] RAM Tracking...\n",
22
- "[codecarbon INFO @ 19:48:07] [setup] CPU Tracking...\n",
23
- "[codecarbon WARNING @ 19:48:09] We saw that you have a 13th Gen Intel(R) Core(TM) i7-1365U but we don't know it. Please contact us.\n",
24
- "[codecarbon WARNING @ 19:48:09] No CPU tracking mode found. Falling back on CPU constant mode. \n",
25
- " Windows OS detected: Please install Intel Power Gadget to measure CPU\n",
26
- "\n",
27
- "[codecarbon WARNING @ 19:48:11] We saw that you have a 13th Gen Intel(R) Core(TM) i7-1365U but we don't know it. Please contact us.\n",
28
- "[codecarbon INFO @ 19:48:11] CPU Model on constant consumption mode: 13th Gen Intel(R) Core(TM) i7-1365U\n",
29
- "[codecarbon WARNING @ 19:48:11] No CPU tracking mode found. Falling back on CPU constant mode.\n",
30
- "[codecarbon INFO @ 19:48:11] [setup] GPU Tracking...\n",
31
- "[codecarbon INFO @ 19:48:11] No GPU found.\n",
32
- "[codecarbon INFO @ 19:48:11] >>> Tracker's metadata:\n",
33
- "[codecarbon INFO @ 19:48:11] Platform system: Windows-11-10.0.22631-SP0\n",
34
- "[codecarbon INFO @ 19:48:11] Python version: 3.12.7\n",
35
- "[codecarbon INFO @ 19:48:11] CodeCarbon version: 3.0.0_rc0\n",
36
- "[codecarbon INFO @ 19:48:11] Available RAM : 31.347 GB\n",
37
- "[codecarbon INFO @ 19:48:11] CPU count: 12\n",
38
- "[codecarbon INFO @ 19:48:11] CPU model: 13th Gen Intel(R) Core(TM) i7-1365U\n",
39
- "[codecarbon INFO @ 19:48:11] GPU count: None\n",
40
- "[codecarbon INFO @ 19:48:11] GPU model: None\n",
41
- "[codecarbon INFO @ 19:48:11] Saving emissions data to file c:\\git\\submission-template\\notebooks\\emissions.csv\n"
42
- ]
43
- }
44
- ],
45
- "source": [
46
- "from fastapi import APIRouter\n",
47
- "from datetime import datetime\n",
48
- "from datasets import load_dataset\n",
49
- "from sklearn.metrics import accuracy_score\n",
50
- "import random\n",
51
- "\n",
52
- "import sys\n",
53
- "sys.path.append('../tasks')\n",
54
- "\n",
55
- "from utils.evaluation import AudioEvaluationRequest\n",
56
- "from utils.emissions import tracker, clean_emissions_data, get_space_info\n",
57
- "\n",
58
- "\n",
59
- "# Define the label mapping\n",
60
- "LABEL_MAPPING = {\n",
61
- " \"chainsaw\": 0,\n",
62
- " \"environment\": 1\n",
63
- "}"
64
- ]
65
- },
66
- {
67
- "cell_type": "markdown",
68
- "metadata": {},
69
- "source": [
70
- "## Loading the datasets and splitting them"
71
- ]
72
- },
73
- {
74
- "cell_type": "code",
75
- "execution_count": 4,
76
- "metadata": {},
77
- "outputs": [
78
- {
79
- "data": {
80
- "application/vnd.jupyter.widget-view+json": {
81
- "model_id": "668da7bf85434e098b95c3ec447d78fe",
82
- "version_major": 2,
83
- "version_minor": 0
84
- },
85
- "text/plain": [
86
- "README.md: 0%| | 0.00/5.18k [00:00<?, ?B/s]"
87
- ]
88
- },
89
- "metadata": {},
90
- "output_type": "display_data"
91
- },
92
- {
93
- "name": "stderr",
94
- "output_type": "stream",
95
- "text": [
96
- "c:\\Users\\theo.alvesdacosta\\AppData\\Local\\anaconda3\\Lib\\site-packages\\huggingface_hub\\file_download.py:139: UserWarning: `huggingface_hub` cache-system uses symlinks by default to efficiently store duplicated files but your machine does not support them in C:\\Users\\theo.alvesdacosta\\.cache\\huggingface\\hub\\datasets--QuotaClimat--frugalaichallenge-text-train. Caching files will still work but in a degraded version that might require more space on your disk. This warning can be disabled by setting the `HF_HUB_DISABLE_SYMLINKS_WARNING` environment variable. For more details, see https://huggingface.co/docs/huggingface_hub/how-to-cache#limitations.\n",
97
- "To support symlinks on Windows, you either need to activate Developer Mode or to run Python as an administrator. In order to activate developer mode, see this article: https://docs.microsoft.com/en-us/windows/apps/get-started/enable-your-device-for-development\n",
98
- " warnings.warn(message)\n"
99
- ]
100
- },
101
- {
102
- "data": {
103
- "application/vnd.jupyter.widget-view+json": {
104
- "model_id": "5b68d43359eb429395da8be7d4b15556",
105
- "version_major": 2,
106
- "version_minor": 0
107
- },
108
- "text/plain": [
109
- "train.parquet: 0%| | 0.00/1.21M [00:00<?, ?B/s]"
110
- ]
111
- },
112
- "metadata": {},
113
- "output_type": "display_data"
114
- },
115
- {
116
- "data": {
117
- "application/vnd.jupyter.widget-view+json": {
118
- "model_id": "140a304773914e9db8f698eabeb40298",
119
- "version_major": 2,
120
- "version_minor": 0
121
- },
122
- "text/plain": [
123
- "Generating train split: 0%| | 0/6091 [00:00<?, ? examples/s]"
124
- ]
125
- },
126
- "metadata": {},
127
- "output_type": "display_data"
128
- },
129
- {
130
- "data": {
131
- "application/vnd.jupyter.widget-view+json": {
132
- "model_id": "6d04e8ab1906400e8e0029949dc523a5",
133
- "version_major": 2,
134
- "version_minor": 0
135
- },
136
- "text/plain": [
137
- "Map: 0%| | 0/6091 [00:00<?, ? examples/s]"
138
- ]
139
- },
140
- "metadata": {},
141
- "output_type": "display_data"
142
- }
143
- ],
144
- "source": [
145
- "request = AudioEvaluationRequest()\n",
146
- "\n",
147
- "# Load and prepare the dataset\n",
148
- "dataset = load_dataset(request.dataset_name)\n",
149
- "\n",
150
- "# Split dataset\n",
151
- "train_test = dataset[\"train\"]\n",
152
- "test_dataset = dataset[\"test\"]"
153
- ]
154
- },
155
- {
156
- "cell_type": "markdown",
157
- "metadata": {},
158
- "source": [
159
- "## Random Baseline"
160
- ]
161
- },
162
- {
163
- "cell_type": "code",
164
- "execution_count": 5,
165
- "metadata": {},
166
- "outputs": [],
167
- "source": [
168
- "# Start tracking emissions\n",
169
- "tracker.start()\n",
170
- "tracker.start_task(\"inference\")"
171
- ]
172
- },
173
- {
174
- "cell_type": "code",
175
- "execution_count": 6,
176
- "metadata": {},
177
- "outputs": [
178
- {
179
- "data": {
180
- "text/plain": [
181
- "[1,\n",
182
- " 7,\n",
183
- " 6,\n",
184
- " 6,\n",
185
- " 2,\n",
186
- " 0,\n",
187
- " 1,\n",
188
- " 7,\n",
189
- " 3,\n",
190
- " 6,\n",
191
- " 6,\n",
192
- " 3,\n",
193
- " 6,\n",
194
- " 6,\n",
195
- " 5,\n",
196
- " 0,\n",
197
- " 2,\n",
198
- " 6,\n",
199
- " 2,\n",
200
- " 6,\n",
201
- " 5,\n",
202
- " 4,\n",
203
- " 1,\n",
204
- " 3,\n",
205
- " 6,\n",
206
- " 4,\n",
207
- " 2,\n",
208
- " 1,\n",
209
- " 4,\n",
210
- " 0,\n",
211
- " 3,\n",
212
- " 4,\n",
213
- " 1,\n",
214
- " 5,\n",
215
- " 5,\n",
216
- " 1,\n",
217
- " 2,\n",
218
- " 7,\n",
219
- " 6,\n",
220
- " 1,\n",
221
- " 3,\n",
222
- " 1,\n",
223
- " 7,\n",
224
- " 7,\n",
225
- " 0,\n",
226
- " 0,\n",
227
- " 3,\n",
228
- " 3,\n",
229
- " 3,\n",
230
- " 4,\n",
231
- " 1,\n",
232
- " 4,\n",
233
- " 4,\n",
234
- " 1,\n",
235
- " 4,\n",
236
- " 5,\n",
237
- " 6,\n",
238
- " 1,\n",
239
- " 2,\n",
240
- " 2,\n",
241
- " 2,\n",
242
- " 5,\n",
243
- " 2,\n",
244
- " 7,\n",
245
- " 2,\n",
246
- " 7,\n",
247
- " 7,\n",
248
- " 6,\n",
249
- " 4,\n",
250
- " 2,\n",
251
- " 0,\n",
252
- " 1,\n",
253
- " 6,\n",
254
- " 3,\n",
255
- " 2,\n",
256
- " 5,\n",
257
- " 5,\n",
258
- " 2,\n",
259
- " 0,\n",
260
- " 7,\n",
261
- " 0,\n",
262
- " 1,\n",
263
- " 5,\n",
264
- " 5,\n",
265
- " 7,\n",
266
- " 4,\n",
267
- " 6,\n",
268
- " 7,\n",
269
- " 1,\n",
270
- " 7,\n",
271
- " 1,\n",
272
- " 0,\n",
273
- " 3,\n",
274
- " 4,\n",
275
- " 2,\n",
276
- " 5,\n",
277
- " 3,\n",
278
- " 3,\n",
279
- " 3,\n",
280
- " 2,\n",
281
- " 2,\n",
282
- " 1,\n",
283
- " 0,\n",
284
- " 4,\n",
285
- " 5,\n",
286
- " 7,\n",
287
- " 0,\n",
288
- " 3,\n",
289
- " 1,\n",
290
- " 4,\n",
291
- " 6,\n",
292
- " 0,\n",
293
- " 7,\n",
294
- " 1,\n",
295
- " 1,\n",
296
- " 2,\n",
297
- " 2,\n",
298
- " 4,\n",
299
- " 0,\n",
300
- " 4,\n",
301
- " 3,\n",
302
- " 4,\n",
303
- " 4,\n",
304
- " 2,\n",
305
- " 2,\n",
306
- " 3,\n",
307
- " 3,\n",
308
- " 7,\n",
309
- " 4,\n",
310
- " 7,\n",
311
- " 6,\n",
312
- " 4,\n",
313
- " 5,\n",
314
- " 4,\n",
315
- " 3,\n",
316
- " 6,\n",
317
- " 0,\n",
318
- " 4,\n",
319
- " 0,\n",
320
- " 1,\n",
321
- " 3,\n",
322
- " 6,\n",
323
- " 7,\n",
324
- " 3,\n",
325
- " 3,\n",
326
- " 0,\n",
327
- " 1,\n",
328
- " 2,\n",
329
- " 4,\n",
330
- " 4,\n",
331
- " 3,\n",
332
- " 1,\n",
333
- " 2,\n",
334
- " 4,\n",
335
- " 3,\n",
336
- " 0,\n",
337
- " 5,\n",
338
- " 3,\n",
339
- " 6,\n",
340
- " 3,\n",
341
- " 6,\n",
342
- " 1,\n",
343
- " 3,\n",
344
- " 4,\n",
345
- " 5,\n",
346
- " 4,\n",
347
- " 0,\n",
348
- " 7,\n",
349
- " 3,\n",
350
- " 6,\n",
351
- " 7,\n",
352
- " 4,\n",
353
- " 4,\n",
354
- " 5,\n",
355
- " 3,\n",
356
- " 1,\n",
357
- " 7,\n",
358
- " 4,\n",
359
- " 1,\n",
360
- " 0,\n",
361
- " 3,\n",
362
- " 0,\n",
363
- " 5,\n",
364
- " 3,\n",
365
- " 6,\n",
366
- " 3,\n",
367
- " 0,\n",
368
- " 7,\n",
369
- " 2,\n",
370
- " 0,\n",
371
- " 4,\n",
372
- " 1,\n",
373
- " 2,\n",
374
- " 6,\n",
375
- " 3,\n",
376
- " 4,\n",
377
- " 4,\n",
378
- " 5,\n",
379
- " 1,\n",
380
- " 5,\n",
381
- " 4,\n",
382
- " 0,\n",
383
- " 1,\n",
384
- " 7,\n",
385
- " 3,\n",
386
- " 6,\n",
387
- " 0,\n",
388
- " 7,\n",
389
- " 4,\n",
390
- " 6,\n",
391
- " 3,\n",
392
- " 0,\n",
393
- " 0,\n",
394
- " 4,\n",
395
- " 6,\n",
396
- " 6,\n",
397
- " 4,\n",
398
- " 0,\n",
399
- " 5,\n",
400
- " 7,\n",
401
- " 5,\n",
402
- " 1,\n",
403
- " 3,\n",
404
- " 6,\n",
405
- " 2,\n",
406
- " 3,\n",
407
- " 2,\n",
408
- " 4,\n",
409
- " 5,\n",
410
- " 1,\n",
411
- " 5,\n",
412
- " 0,\n",
413
- " 3,\n",
414
- " 3,\n",
415
- " 0,\n",
416
- " 0,\n",
417
- " 6,\n",
418
- " 6,\n",
419
- " 2,\n",
420
- " 0,\n",
421
- " 7,\n",
422
- " 4,\n",
423
- " 5,\n",
424
- " 7,\n",
425
- " 1,\n",
426
- " 0,\n",
427
- " 4,\n",
428
- " 5,\n",
429
- " 1,\n",
430
- " 7,\n",
431
- " 0,\n",
432
- " 7,\n",
433
- " 2,\n",
434
- " 6,\n",
435
- " 1,\n",
436
- " 3,\n",
437
- " 5,\n",
438
- " 5,\n",
439
- " 6,\n",
440
- " 5,\n",
441
- " 4,\n",
442
- " 3,\n",
443
- " 7,\n",
444
- " 4,\n",
445
- " 3,\n",
446
- " 5,\n",
447
- " 5,\n",
448
- " 7,\n",
449
- " 2,\n",
450
- " 6,\n",
451
- " 1,\n",
452
- " 5,\n",
453
- " 0,\n",
454
- " 3,\n",
455
- " 4,\n",
456
- " 2,\n",
457
- " 3,\n",
458
- " 7,\n",
459
- " 0,\n",
460
- " 1,\n",
461
- " 7,\n",
462
- " 6,\n",
463
- " 7,\n",
464
- " 7,\n",
465
- " 5,\n",
466
- " 6,\n",
467
- " 3,\n",
468
- " 2,\n",
469
- " 3,\n",
470
- " 0,\n",
471
- " 4,\n",
472
- " 3,\n",
473
- " 5,\n",
474
- " 6,\n",
475
- " 0,\n",
476
- " 0,\n",
477
- " 6,\n",
478
- " 6,\n",
479
- " 1,\n",
480
- " 4,\n",
481
- " 0,\n",
482
- " 4,\n",
483
- " 2,\n",
484
- " 7,\n",
485
- " 5,\n",
486
- " 7,\n",
487
- " 6,\n",
488
- " 3,\n",
489
- " 5,\n",
490
- " 6,\n",
491
- " 0,\n",
492
- " 4,\n",
493
- " 5,\n",
494
- " 6,\n",
495
- " 1,\n",
496
- " 2,\n",
497
- " 1,\n",
498
- " 5,\n",
499
- " 3,\n",
500
- " 0,\n",
501
- " 3,\n",
502
- " 7,\n",
503
- " 1,\n",
504
- " 0,\n",
505
- " 7,\n",
506
- " 0,\n",
507
- " 1,\n",
508
- " 0,\n",
509
- " 4,\n",
510
- " 1,\n",
511
- " 1,\n",
512
- " 0,\n",
513
- " 7,\n",
514
- " 1,\n",
515
- " 0,\n",
516
- " 7,\n",
517
- " 6,\n",
518
- " 2,\n",
519
- " 3,\n",
520
- " 7,\n",
521
- " 4,\n",
522
- " 3,\n",
523
- " 4,\n",
524
- " 3,\n",
525
- " 3,\n",
526
- " 2,\n",
527
- " 5,\n",
528
- " 1,\n",
529
- " 5,\n",
530
- " 1,\n",
531
- " 7,\n",
532
- " 3,\n",
533
- " 2,\n",
534
- " 6,\n",
535
- " 4,\n",
536
- " 4,\n",
537
- " 1,\n",
538
- " 2,\n",
539
- " 6,\n",
540
- " 7,\n",
541
- " 2,\n",
542
- " 7,\n",
543
- " 1,\n",
544
- " 3,\n",
545
- " 5,\n",
546
- " 2,\n",
547
- " 6,\n",
548
- " 4,\n",
549
- " 6,\n",
550
- " 7,\n",
551
- " 0,\n",
552
- " 5,\n",
553
- " 1,\n",
554
- " 6,\n",
555
- " 5,\n",
556
- " 3,\n",
557
- " 6,\n",
558
- " 5,\n",
559
- " 4,\n",
560
- " 7,\n",
561
- " 6,\n",
562
- " 5,\n",
563
- " 4,\n",
564
- " 3,\n",
565
- " 0,\n",
566
- " 0,\n",
567
- " 1,\n",
568
- " 7,\n",
569
- " 7,\n",
570
- " 6,\n",
571
- " 1,\n",
572
- " 4,\n",
573
- " 5,\n",
574
- " 6,\n",
575
- " 1,\n",
576
- " 5,\n",
577
- " 1,\n",
578
- " 2,\n",
579
- " 6,\n",
580
- " 2,\n",
581
- " 6,\n",
582
- " 0,\n",
583
- " 2,\n",
584
- " 1,\n",
585
- " 5,\n",
586
- " 5,\n",
587
- " 1,\n",
588
- " 7,\n",
589
- " 0,\n",
590
- " 5,\n",
591
- " 5,\n",
592
- " 1,\n",
593
- " 7,\n",
594
- " 7,\n",
595
- " 2,\n",
596
- " 1,\n",
597
- " 0,\n",
598
- " 1,\n",
599
- " 0,\n",
600
- " 5,\n",
601
- " 4,\n",
602
- " 2,\n",
603
- " 7,\n",
604
- " 4,\n",
605
- " 3,\n",
606
- " 6,\n",
607
- " 7,\n",
608
- " 5,\n",
609
- " 1,\n",
610
- " 0,\n",
611
- " 7,\n",
612
- " 2,\n",
613
- " 1,\n",
614
- " 2,\n",
615
- " 3,\n",
616
- " 1,\n",
617
- " 0,\n",
618
- " 3,\n",
619
- " 2,\n",
620
- " 6,\n",
621
- " 0,\n",
622
- " 5,\n",
623
- " 4,\n",
624
- " 7,\n",
625
- " 1,\n",
626
- " 1,\n",
627
- " 0,\n",
628
- " 7,\n",
629
- " 0,\n",
630
- " 6,\n",
631
- " 7,\n",
632
- " 6,\n",
633
- " 1,\n",
634
- " 5,\n",
635
- " 5,\n",
636
- " 7,\n",
637
- " 6,\n",
638
- " 1,\n",
639
- " 7,\n",
640
- " 6,\n",
641
- " 5,\n",
642
- " 4,\n",
643
- " 1,\n",
644
- " 4,\n",
645
- " 7,\n",
646
- " 5,\n",
647
- " 4,\n",
648
- " 0,\n",
649
- " 0,\n",
650
- " 7,\n",
651
- " 0,\n",
652
- " 0,\n",
653
- " 3,\n",
654
- " 6,\n",
655
- " 2,\n",
656
- " 5,\n",
657
- " 3,\n",
658
- " 0,\n",
659
- " 3,\n",
660
- " 6,\n",
661
- " 5,\n",
662
- " 7,\n",
663
- " 2,\n",
664
- " 6,\n",
665
- " 7,\n",
666
- " 5,\n",
667
- " 2,\n",
668
- " 3,\n",
669
- " 6,\n",
670
- " 7,\n",
671
- " 7,\n",
672
- " 7,\n",
673
- " 6,\n",
674
- " 1,\n",
675
- " 7,\n",
676
- " 4,\n",
677
- " 2,\n",
678
- " 7,\n",
679
- " 5,\n",
680
- " 4,\n",
681
- " 1,\n",
682
- " 2,\n",
683
- " 3,\n",
684
- " 7,\n",
685
- " 0,\n",
686
- " 2,\n",
687
- " 7,\n",
688
- " 6,\n",
689
- " 1,\n",
690
- " 4,\n",
691
- " 0,\n",
692
- " 6,\n",
693
- " 3,\n",
694
- " 1,\n",
695
- " 0,\n",
696
- " 3,\n",
697
- " 4,\n",
698
- " 7,\n",
699
- " 7,\n",
700
- " 4,\n",
701
- " 2,\n",
702
- " 1,\n",
703
- " 0,\n",
704
- " 5,\n",
705
- " 1,\n",
706
- " 7,\n",
707
- " 4,\n",
708
- " 6,\n",
709
- " 7,\n",
710
- " 7,\n",
711
- " 3,\n",
712
- " 4,\n",
713
- " 3,\n",
714
- " 5,\n",
715
- " 4,\n",
716
- " 4,\n",
717
- " 5,\n",
718
- " 0,\n",
719
- " 1,\n",
720
- " 3,\n",
721
- " 7,\n",
722
- " 5,\n",
723
- " 4,\n",
724
- " 7,\n",
725
- " 3,\n",
726
- " 3,\n",
727
- " 3,\n",
728
- " 5,\n",
729
- " 3,\n",
730
- " 3,\n",
731
- " 4,\n",
732
- " 0,\n",
733
- " 1,\n",
734
- " 7,\n",
735
- " 4,\n",
736
- " 7,\n",
737
- " 7,\n",
738
- " 5,\n",
739
- " 0,\n",
740
- " 0,\n",
741
- " 5,\n",
742
- " 2,\n",
743
- " 6,\n",
744
- " 2,\n",
745
- " 6,\n",
746
- " 7,\n",
747
- " 6,\n",
748
- " 5,\n",
749
- " 7,\n",
750
- " 5,\n",
751
- " 7,\n",
752
- " 1,\n",
753
- " 6,\n",
754
- " 6,\n",
755
- " 0,\n",
756
- " 4,\n",
757
- " 7,\n",
758
- " 3,\n",
759
- " 0,\n",
760
- " 0,\n",
761
- " 2,\n",
762
- " 5,\n",
763
- " 2,\n",
764
- " 3,\n",
765
- " 7,\n",
766
- " 1,\n",
767
- " 0,\n",
768
- " 3,\n",
769
- " 0,\n",
770
- " 0,\n",
771
- " 3,\n",
772
- " 3,\n",
773
- " 7,\n",
774
- " 3,\n",
775
- " 0,\n",
776
- " 1,\n",
777
- " 1,\n",
778
- " 6,\n",
779
- " 0,\n",
780
- " 0,\n",
781
- " 5,\n",
782
- " 0,\n",
783
- " 3,\n",
784
- " 4,\n",
785
- " 6,\n",
786
- " 7,\n",
787
- " 4,\n",
788
- " 0,\n",
789
- " 4,\n",
790
- " 4,\n",
791
- " 5,\n",
792
- " 4,\n",
793
- " 4,\n",
794
- " 3,\n",
795
- " 6,\n",
796
- " 5,\n",
797
- " 2,\n",
798
- " 0,\n",
799
- " 6,\n",
800
- " 0,\n",
801
- " 6,\n",
802
- " 4,\n",
803
- " 3,\n",
804
- " 5,\n",
805
- " 7,\n",
806
- " 7,\n",
807
- " 5,\n",
808
- " 5,\n",
809
- " 1,\n",
810
- " 5,\n",
811
- " 2,\n",
812
- " 7,\n",
813
- " 7,\n",
814
- " 6,\n",
815
- " 6,\n",
816
- " 7,\n",
817
- " 6,\n",
818
- " 5,\n",
819
- " 2,\n",
820
- " 4,\n",
821
- " 0,\n",
822
- " 4,\n",
823
- " 4,\n",
824
- " 7,\n",
825
- " 5,\n",
826
- " 2,\n",
827
- " 7,\n",
828
- " 0,\n",
829
- " 6,\n",
830
- " 0,\n",
831
- " 2,\n",
832
- " 6,\n",
833
- " 6,\n",
834
- " 2,\n",
835
- " 3,\n",
836
- " 0,\n",
837
- " 5,\n",
838
- " 0,\n",
839
- " 5,\n",
840
- " 7,\n",
841
- " 2,\n",
842
- " 7,\n",
843
- " 4,\n",
844
- " 7,\n",
845
- " 4,\n",
846
- " 0,\n",
847
- " 7,\n",
848
- " 1,\n",
849
- " 4,\n",
850
- " 5,\n",
851
- " 0,\n",
852
- " 5,\n",
853
- " 5,\n",
854
- " 2,\n",
855
- " 0,\n",
856
- " 2,\n",
857
- " 5,\n",
858
- " 5,\n",
859
- " 6,\n",
860
- " 3,\n",
861
- " 4,\n",
862
- " 1,\n",
863
- " 7,\n",
864
- " 7,\n",
865
- " 2,\n",
866
- " 3,\n",
867
- " 2,\n",
868
- " 5,\n",
869
- " 0,\n",
870
- " 7,\n",
871
- " 2,\n",
872
- " 3,\n",
873
- " 7,\n",
874
- " 2,\n",
875
- " 4,\n",
876
- " 0,\n",
877
- " 5,\n",
878
- " 7,\n",
879
- " 3,\n",
880
- " 6,\n",
881
- " 7,\n",
882
- " 6,\n",
883
- " 4,\n",
884
- " 3,\n",
885
- " 6,\n",
886
- " 5,\n",
887
- " 4,\n",
888
- " 0,\n",
889
- " 3,\n",
890
- " 4,\n",
891
- " 3,\n",
892
- " 5,\n",
893
- " 2,\n",
894
- " 4,\n",
895
- " 0,\n",
896
- " 3,\n",
897
- " 6,\n",
898
- " 1,\n",
899
- " 3,\n",
900
- " 1,\n",
901
- " 4,\n",
902
- " 3,\n",
903
- " 3,\n",
904
- " 3,\n",
905
- " 0,\n",
906
- " 7,\n",
907
- " 6,\n",
908
- " 2,\n",
909
- " 4,\n",
910
- " 6,\n",
911
- " 5,\n",
912
- " 4,\n",
913
- " 1,\n",
914
- " 7,\n",
915
- " 6,\n",
916
- " 1,\n",
917
- " 4,\n",
918
- " 3,\n",
919
- " 0,\n",
920
- " 7,\n",
921
- " 3,\n",
922
- " 1,\n",
923
- " 2,\n",
924
- " 1,\n",
925
- " 6,\n",
926
- " 4,\n",
927
- " 7,\n",
928
- " 1,\n",
929
- " 7,\n",
930
- " 1,\n",
931
- " 5,\n",
932
- " 1,\n",
933
- " 6,\n",
934
- " 3,\n",
935
- " 0,\n",
936
- " 2,\n",
937
- " 6,\n",
938
- " 7,\n",
939
- " 7,\n",
940
- " 0,\n",
941
- " 1,\n",
942
- " 4,\n",
943
- " 0,\n",
944
- " 4,\n",
945
- " 5,\n",
946
- " 3,\n",
947
- " 6,\n",
948
- " 2,\n",
949
- " 3,\n",
950
- " 4,\n",
951
- " 1,\n",
952
- " 6,\n",
953
- " 2,\n",
954
- " 4,\n",
955
- " 4,\n",
956
- " 6,\n",
957
- " 4,\n",
958
- " 5,\n",
959
- " 7,\n",
960
- " 1,\n",
961
- " 7,\n",
962
- " 7,\n",
963
- " 4,\n",
964
- " 7,\n",
965
- " 4,\n",
966
- " 3,\n",
967
- " 3,\n",
968
- " 6,\n",
969
- " 1,\n",
970
- " 2,\n",
971
- " 0,\n",
972
- " 0,\n",
973
- " 0,\n",
974
- " 2,\n",
975
- " 5,\n",
976
- " 6,\n",
977
- " 5,\n",
978
- " 7,\n",
979
- " 5,\n",
980
- " 7,\n",
981
- " 1,\n",
982
- " 1,\n",
983
- " 2,\n",
984
- " 1,\n",
985
- " 6,\n",
986
- " 5,\n",
987
- " 7,\n",
988
- " 0,\n",
989
- " 0,\n",
990
- " 5,\n",
991
- " 5,\n",
992
- " 0,\n",
993
- " 3,\n",
994
- " 7,\n",
995
- " 5,\n",
996
- " 2,\n",
997
- " 5,\n",
998
- " 4,\n",
999
- " 2,\n",
1000
- " 3,\n",
1001
- " 6,\n",
1002
- " 2,\n",
1003
- " 3,\n",
1004
- " 6,\n",
1005
- " 0,\n",
1006
- " 0,\n",
1007
- " 2,\n",
1008
- " 6,\n",
1009
- " 0,\n",
1010
- " 1,\n",
1011
- " 3,\n",
1012
- " 3,\n",
1013
- " 6,\n",
1014
- " 4,\n",
1015
- " 6,\n",
1016
- " 4,\n",
1017
- " 6,\n",
1018
- " 0,\n",
1019
- " 0,\n",
1020
- " 2,\n",
1021
- " 3,\n",
1022
- " 6,\n",
1023
- " 2,\n",
1024
- " 2,\n",
1025
- " 6,\n",
1026
- " 6,\n",
1027
- " 2,\n",
1028
- " 4,\n",
1029
- " 3,\n",
1030
- " 3,\n",
1031
- " 6,\n",
1032
- " 7,\n",
1033
- " 7,\n",
1034
- " 1,\n",
1035
- " 1,\n",
1036
- " 7,\n",
1037
- " 7,\n",
1038
- " 6,\n",
1039
- " 1,\n",
1040
- " 7,\n",
1041
- " 0,\n",
1042
- " 0,\n",
1043
- " 2,\n",
1044
- " 4,\n",
1045
- " 2,\n",
1046
- " 2,\n",
1047
- " 3,\n",
1048
- " 0,\n",
1049
- " 1,\n",
1050
- " 4,\n",
1051
- " 0,\n",
1052
- " 4,\n",
1053
- " 6,\n",
1054
- " 5,\n",
1055
- " 3,\n",
1056
- " 2,\n",
1057
- " 3,\n",
1058
- " 2,\n",
1059
- " 3,\n",
1060
- " 6,\n",
1061
- " 2,\n",
1062
- " 1,\n",
1063
- " 4,\n",
1064
- " 7,\n",
1065
- " 6,\n",
1066
- " 4,\n",
1067
- " 5,\n",
1068
- " 6,\n",
1069
- " 7,\n",
1070
- " 7,\n",
1071
- " 2,\n",
1072
- " 0,\n",
1073
- " 5,\n",
1074
- " 5,\n",
1075
- " 0,\n",
1076
- " 3,\n",
1077
- " 6,\n",
1078
- " 6,\n",
1079
- " 5,\n",
1080
- " 4,\n",
1081
- " 4,\n",
1082
- " 7,\n",
1083
- " 0,\n",
1084
- " 5,\n",
1085
- " 1,\n",
1086
- " 7,\n",
1087
- " 0,\n",
1088
- " 3,\n",
1089
- " 1,\n",
1090
- " 7,\n",
1091
- " 0,\n",
1092
- " 1,\n",
1093
- " 4,\n",
1094
- " 7,\n",
1095
- " 5,\n",
1096
- " 0,\n",
1097
- " 4,\n",
1098
- " 0,\n",
1099
- " 0,\n",
1100
- " 1,\n",
1101
- " 0,\n",
1102
- " 6,\n",
1103
- " 4,\n",
1104
- " 0,\n",
1105
- " 5,\n",
1106
- " 4,\n",
1107
- " 6,\n",
1108
- " 6,\n",
1109
- " 7,\n",
1110
- " 2,\n",
1111
- " 6,\n",
1112
- " 2,\n",
1113
- " 6,\n",
1114
- " 0,\n",
1115
- " 3,\n",
1116
- " 2,\n",
1117
- " 2,\n",
1118
- " 1,\n",
1119
- " 5,\n",
1120
- " 4,\n",
1121
- " 7,\n",
1122
- " 6,\n",
1123
- " 6,\n",
1124
- " 2,\n",
1125
- " 5,\n",
1126
- " 5,\n",
1127
- " 5,\n",
1128
- " 0,\n",
1129
- " 3,\n",
1130
- " 5,\n",
1131
- " 4,\n",
1132
- " 5,\n",
1133
- " 7,\n",
1134
- " 5,\n",
1135
- " 0,\n",
1136
- " 5,\n",
1137
- " 0,\n",
1138
- " 0,\n",
1139
- " 2,\n",
1140
- " 0,\n",
1141
- " 2,\n",
1142
- " 1,\n",
1143
- " 0,\n",
1144
- " 2,\n",
1145
- " 4,\n",
1146
- " 3,\n",
1147
- " 4,\n",
1148
- " 1,\n",
1149
- " 7,\n",
1150
- " 2,\n",
1151
- " 1,\n",
1152
- " 0,\n",
1153
- " 3,\n",
1154
- " 0,\n",
1155
- " 3,\n",
1156
- " 1,\n",
1157
- " 1,\n",
1158
- " 0,\n",
1159
- " 5,\n",
1160
- " 3,\n",
1161
- " 1,\n",
1162
- " 2,\n",
1163
- " 5,\n",
1164
- " 6,\n",
1165
- " 7,\n",
1166
- " 6,\n",
1167
- " 7,\n",
1168
- " 0,\n",
1169
- " 2,\n",
1170
- " 6,\n",
1171
- " 3,\n",
1172
- " 1,\n",
1173
- " 5,\n",
1174
- " 4,\n",
1175
- " 2,\n",
1176
- " 4,\n",
1177
- " 6,\n",
1178
- " 5,\n",
1179
- " 2,\n",
1180
- " 7,\n",
1181
- " ...]"
1182
- ]
1183
- },
1184
- "execution_count": 6,
1185
- "metadata": {},
1186
- "output_type": "execute_result"
1187
- }
1188
- ],
1189
- "source": [
1190
- "\n",
1191
- "#--------------------------------------------------------------------------------------------\n",
1192
- "# YOUR MODEL INFERENCE CODE HERE\n",
1193
- "# Update the code below to replace the random baseline by your model inference within the inference pass where the energy consumption and emissions are tracked.\n",
1194
- "#-------------------------------------------------------------------------------------------- \n",
1195
- "\n",
1196
- "# Make random predictions (placeholder for actual model inference)\n",
1197
- "true_labels = test_dataset[\"label\"]\n",
1198
- "predictions = [random.randint(0, 1) for _ in range(len(true_labels))]\n",
1199
- "\n",
1200
- "predictions\n",
1201
- "\n",
1202
- "#--------------------------------------------------------------------------------------------\n",
1203
- "# YOUR MODEL INFERENCE STOPS HERE\n",
1204
- "#-------------------------------------------------------------------------------------------- "
1205
- ]
1206
- },
1207
- {
1208
- "cell_type": "code",
1209
- "execution_count": 8,
1210
- "metadata": {},
1211
- "outputs": [
1212
- {
1213
- "name": "stderr",
1214
- "output_type": "stream",
1215
- "text": [
1216
- "[codecarbon WARNING @ 19:53:32] Background scheduler didn't run for a long period (47s), results might be inaccurate\n",
1217
- "[codecarbon INFO @ 19:53:32] Energy consumed for RAM : 0.000156 kWh. RAM Power : 11.755242347717285 W\n",
1218
- "[codecarbon INFO @ 19:53:32] Delta energy consumed for CPU with constant : 0.000564 kWh, power : 42.5 W\n",
1219
- "[codecarbon INFO @ 19:53:32] Energy consumed for All CPU : 0.000564 kWh\n",
1220
- "[codecarbon INFO @ 19:53:32] 0.000720 kWh of electricity used since the beginning.\n"
1221
- ]
1222
- },
1223
- {
1224
- "data": {
1225
- "text/plain": [
1226
- "EmissionsData(timestamp='2025-01-21T19:53:32', project_name='codecarbon', run_id='908f2e7e-4bb2-4991-a0f6-56bf8d7eda21', experiment_id='5b0fa12a-3dd7-45bb-9766-cc326314d9f1', duration=47.736408500000834, emissions=4.032368007471064e-05, emissions_rate=8.444466886328872e-07, cpu_power=42.5, gpu_power=0.0, ram_power=11.755242347717285, cpu_energy=0.0005636615353475565, gpu_energy=0, ram_energy=0.00015590305493261682, energy_consumed=0.0007195645902801733, country_name='France', country_iso_code='FRA', region='île-de-france', cloud_provider='', cloud_region='', os='Windows-11-10.0.22631-SP0', python_version='3.12.7', codecarbon_version='3.0.0_rc0', cpu_count=12, cpu_model='13th Gen Intel(R) Core(TM) i7-1365U', gpu_count=None, gpu_model=None, longitude=2.3494, latitude=48.8558, ram_total_size=31.347312927246094, tracking_mode='machine', on_cloud='N', pue=1.0)"
1227
- ]
1228
- },
1229
- "execution_count": 8,
1230
- "metadata": {},
1231
- "output_type": "execute_result"
1232
- }
1233
- ],
1234
- "source": [
1235
- "# Stop tracking emissions\n",
1236
- "emissions_data = tracker.stop_task()\n",
1237
- "emissions_data"
1238
- ]
1239
- },
1240
- {
1241
- "cell_type": "code",
1242
- "execution_count": 9,
1243
- "metadata": {},
1244
- "outputs": [
1245
- {
1246
- "data": {
1247
- "text/plain": [
1248
- "0.10090237899917966"
1249
- ]
1250
- },
1251
- "execution_count": 9,
1252
- "metadata": {},
1253
- "output_type": "execute_result"
1254
- }
1255
- ],
1256
- "source": [
1257
- "# Calculate accuracy\n",
1258
- "accuracy = accuracy_score(true_labels, predictions)\n",
1259
- "accuracy"
1260
- ]
1261
- },
1262
- {
1263
- "cell_type": "code",
1264
- "execution_count": 10,
1265
- "metadata": {},
1266
- "outputs": [
1267
- {
1268
- "data": {
1269
- "text/plain": [
1270
- "{'submission_timestamp': '2025-01-21T19:53:46.639165',\n",
1271
- " 'accuracy': 0.10090237899917966,\n",
1272
- " 'energy_consumed_wh': 0.7195645902801733,\n",
1273
- " 'emissions_gco2eq': 0.040323680074710634,\n",
1274
- " 'emissions_data': {'run_id': '908f2e7e-4bb2-4991-a0f6-56bf8d7eda21',\n",
1275
- " 'duration': 47.736408500000834,\n",
1276
- " 'emissions': 4.032368007471064e-05,\n",
1277
- " 'emissions_rate': 8.444466886328872e-07,\n",
1278
- " 'cpu_power': 42.5,\n",
1279
- " 'gpu_power': 0.0,\n",
1280
- " 'ram_power': 11.755242347717285,\n",
1281
- " 'cpu_energy': 0.0005636615353475565,\n",
1282
- " 'gpu_energy': 0,\n",
1283
- " 'ram_energy': 0.00015590305493261682,\n",
1284
- " 'energy_consumed': 0.0007195645902801733,\n",
1285
- " 'country_name': 'France',\n",
1286
- " 'country_iso_code': 'FRA',\n",
1287
- " 'region': 'île-de-france',\n",
1288
- " 'cloud_provider': '',\n",
1289
- " 'cloud_region': '',\n",
1290
- " 'os': 'Windows-11-10.0.22631-SP0',\n",
1291
- " 'python_version': '3.12.7',\n",
1292
- " 'codecarbon_version': '3.0.0_rc0',\n",
1293
- " 'cpu_count': 12,\n",
1294
- " 'cpu_model': '13th Gen Intel(R) Core(TM) i7-1365U',\n",
1295
- " 'gpu_count': None,\n",
1296
- " 'gpu_model': None,\n",
1297
- " 'ram_total_size': 31.347312927246094,\n",
1298
- " 'tracking_mode': 'machine',\n",
1299
- " 'on_cloud': 'N',\n",
1300
- " 'pue': 1.0},\n",
1301
- " 'dataset_config': {'dataset_name': 'QuotaClimat/frugalaichallenge-text-train',\n",
1302
- " 'test_size': 0.2,\n",
1303
- " 'test_seed': 42}}"
1304
- ]
1305
- },
1306
- "execution_count": 10,
1307
- "metadata": {},
1308
- "output_type": "execute_result"
1309
- }
1310
- ],
1311
- "source": [
1312
- "# Prepare results dictionary\n",
1313
- "results = {\n",
1314
- " \"submission_timestamp\": datetime.now().isoformat(),\n",
1315
- " \"accuracy\": float(accuracy),\n",
1316
- " \"energy_consumed_wh\": emissions_data.energy_consumed * 1000,\n",
1317
- " \"emissions_gco2eq\": emissions_data.emissions * 1000,\n",
1318
- " \"emissions_data\": clean_emissions_data(emissions_data),\n",
1319
- " \"dataset_config\": {\n",
1320
- " \"dataset_name\": request.dataset_name,\n",
1321
- " \"test_size\": request.test_size,\n",
1322
- " \"test_seed\": request.test_seed\n",
1323
- " }\n",
1324
- "}\n",
1325
- "\n",
1326
- "results"
1327
- ]
1328
- }
1329
- ],
1330
- "metadata": {
1331
- "kernelspec": {
1332
- "display_name": "base",
1333
- "language": "python",
1334
- "name": "python3"
1335
- },
1336
- "language_info": {
1337
- "codemirror_mode": {
1338
- "name": "ipython",
1339
- "version": 3
1340
- },
1341
- "file_extension": ".py",
1342
- "mimetype": "text/x-python",
1343
- "name": "python",
1344
- "nbconvert_exporter": "python",
1345
- "pygments_lexer": "ipython3",
1346
- "version": "3.12.7"
1347
- }
1348
- },
1349
- "nbformat": 4,
1350
- "nbformat_minor": 2
1351
- }
notebooks/template-image.ipynb DELETED
@@ -1,416 +0,0 @@
1
- {
2
- "cells": [
3
- {
4
- "cell_type": "markdown",
5
- "metadata": {},
6
- "source": [
7
- "# Image task notebook template\n",
8
- "## Loading the necessary libraries"
9
- ]
10
- },
11
- {
12
- "cell_type": "code",
13
- "execution_count": 13,
14
- "metadata": {},
15
- "outputs": [],
16
- "source": [
17
- "from fastapi import APIRouter\n",
18
- "from datetime import datetime\n",
19
- "from datasets import load_dataset\n",
20
- "from sklearn.metrics import accuracy_score, precision_score, recall_score\n",
21
- "\n",
22
- "import random\n",
23
- "\n",
24
- "import sys\n",
25
- "sys.path.append('../')\n",
26
- "\n",
27
- "from tasks.utils.evaluation import ImageEvaluationRequest\n",
28
- "from tasks.utils.emissions import tracker, clean_emissions_data, get_space_info\n",
29
- "from tasks.image import parse_boxes,compute_iou,compute_max_iou"
30
- ]
31
- },
32
- {
33
- "cell_type": "markdown",
34
- "metadata": {},
35
- "source": [
36
- "## Loading the datasets and splitting them"
37
- ]
38
- },
39
- {
40
- "cell_type": "code",
41
- "execution_count": 4,
42
- "metadata": {},
43
- "outputs": [
44
- {
45
- "data": {
46
- "application/vnd.jupyter.widget-view+json": {
47
- "model_id": "4f62b23ca587477d9f37430e687bf951",
48
- "version_major": 2,
49
- "version_minor": 0
50
- },
51
- "text/plain": [
52
- "README.md: 0%| | 0.00/7.72k [00:00<?, ?B/s]"
53
- ]
54
- },
55
- "metadata": {},
56
- "output_type": "display_data"
57
- },
58
- {
59
- "name": "stderr",
60
- "output_type": "stream",
61
- "text": [
62
- "c:\\Users\\theo.alvesdacosta\\AppData\\Local\\anaconda3\\Lib\\site-packages\\huggingface_hub\\file_download.py:139: UserWarning: `huggingface_hub` cache-system uses symlinks by default to efficiently store duplicated files but your machine does not support them in C:\\Users\\theo.alvesdacosta\\.cache\\huggingface\\hub\\datasets--pyronear--pyro-sdis. Caching files will still work but in a degraded version that might require more space on your disk. This warning can be disabled by setting the `HF_HUB_DISABLE_SYMLINKS_WARNING` environment variable. For more details, see https://huggingface.co/docs/huggingface_hub/how-to-cache#limitations.\n",
63
- "To support symlinks on Windows, you either need to activate Developer Mode or to run Python as an administrator. In order to activate developer mode, see this article: https://docs.microsoft.com/en-us/windows/apps/get-started/enable-your-device-for-development\n",
64
- " warnings.warn(message)\n"
65
- ]
66
- },
67
- {
68
- "data": {
69
- "application/vnd.jupyter.widget-view+json": {
70
- "model_id": "70735dd748e343119b5a7cd966dcd0f0",
71
- "version_major": 2,
72
- "version_minor": 0
73
- },
74
- "text/plain": [
75
- "train-00000-of-00007.parquet: 0%| | 0.00/433M [00:00<?, ?B/s]"
76
- ]
77
- },
78
- "metadata": {},
79
- "output_type": "display_data"
80
- },
81
- {
82
- "data": {
83
- "application/vnd.jupyter.widget-view+json": {
84
- "model_id": "903c3227c24649f1a0424e039d74d303",
85
- "version_major": 2,
86
- "version_minor": 0
87
- },
88
- "text/plain": [
89
- "train-00001-of-00007.parquet: 0%| | 0.00/434M [00:00<?, ?B/s]"
90
- ]
91
- },
92
- "metadata": {},
93
- "output_type": "display_data"
94
- },
95
- {
96
- "data": {
97
- "application/vnd.jupyter.widget-view+json": {
98
- "model_id": "8795b7696f124715b9d52287d5cd4ee0",
99
- "version_major": 2,
100
- "version_minor": 0
101
- },
102
- "text/plain": [
103
- "train-00002-of-00007.parquet: 0%| | 0.00/432M [00:00<?, ?B/s]"
104
- ]
105
- },
106
- "metadata": {},
107
- "output_type": "display_data"
108
- },
109
- {
110
- "data": {
111
- "application/vnd.jupyter.widget-view+json": {
112
- "model_id": "4b6c1240bf024d61bf913584d13834f5",
113
- "version_major": 2,
114
- "version_minor": 0
115
- },
116
- "text/plain": [
117
- "train-00003-of-00007.parquet: 0%| | 0.00/428M [00:00<?, ?B/s]"
118
- ]
119
- },
120
- "metadata": {},
121
- "output_type": "display_data"
122
- },
123
- {
124
- "data": {
125
- "application/vnd.jupyter.widget-view+json": {
126
- "model_id": "cd5f8172a31f4fd79d489db96ede9c21",
127
- "version_major": 2,
128
- "version_minor": 0
129
- },
130
- "text/plain": [
131
- "train-00004-of-00007.parquet: 0%| | 0.00/431M [00:00<?, ?B/s]"
132
- ]
133
- },
134
- "metadata": {},
135
- "output_type": "display_data"
136
- },
137
- {
138
- "data": {
139
- "application/vnd.jupyter.widget-view+json": {
140
- "model_id": "416af82dba3a4ab7ad13190703c90757",
141
- "version_major": 2,
142
- "version_minor": 0
143
- },
144
- "text/plain": [
145
- "train-00005-of-00007.parquet: 0%| | 0.00/429M [00:00<?, ?B/s]"
146
- ]
147
- },
148
- "metadata": {},
149
- "output_type": "display_data"
150
- },
151
- {
152
- "data": {
153
- "application/vnd.jupyter.widget-view+json": {
154
- "model_id": "6819ad85508641a1a64bea34303446ac",
155
- "version_major": 2,
156
- "version_minor": 0
157
- },
158
- "text/plain": [
159
- "train-00006-of-00007.parquet: 0%| | 0.00/431M [00:00<?, ?B/s]"
160
- ]
161
- },
162
- "metadata": {},
163
- "output_type": "display_data"
164
- },
165
- {
166
- "data": {
167
- "application/vnd.jupyter.widget-view+json": {
168
- "model_id": "90a7f85c802b4330b502c8bbd3cca7f9",
169
- "version_major": 2,
170
- "version_minor": 0
171
- },
172
- "text/plain": [
173
- "val-00000-of-00001.parquet: 0%| | 0.00/407M [00:00<?, ?B/s]"
174
- ]
175
- },
176
- "metadata": {},
177
- "output_type": "display_data"
178
- },
179
- {
180
- "data": {
181
- "application/vnd.jupyter.widget-view+json": {
182
- "model_id": "b93f2f19aafb43e2b8db0fd7bb3ebd34",
183
- "version_major": 2,
184
- "version_minor": 0
185
- },
186
- "text/plain": [
187
- "Generating train split: 0%| | 0/29537 [00:00<?, ? examples/s]"
188
- ]
189
- },
190
- "metadata": {},
191
- "output_type": "display_data"
192
- },
193
- {
194
- "data": {
195
- "application/vnd.jupyter.widget-view+json": {
196
- "model_id": "c14c0f2cde184c959970dfccaa26b2d2",
197
- "version_major": 2,
198
- "version_minor": 0
199
- },
200
- "text/plain": [
201
- "Generating val split: 0%| | 0/4099 [00:00<?, ? examples/s]"
202
- ]
203
- },
204
- "metadata": {},
205
- "output_type": "display_data"
206
- }
207
- ],
208
- "source": [
209
- "request = ImageEvaluationRequest()\n",
210
- "\n",
211
- "# Load and prepare the dataset\n",
212
- "dataset = load_dataset(request.dataset_name)\n",
213
- "\n",
214
- "# Split dataset\n",
215
- "train_test = dataset[\"train\"]\n",
216
- "test_dataset = dataset[\"val\"]"
217
- ]
218
- },
219
- {
220
- "cell_type": "markdown",
221
- "metadata": {},
222
- "source": [
223
- "## Random Baseline"
224
- ]
225
- },
226
- {
227
- "cell_type": "code",
228
- "execution_count": 10,
229
- "metadata": {},
230
- "outputs": [],
231
- "source": [
232
- "# Start tracking emissions\n",
233
- "tracker.start()\n",
234
- "tracker.start_task(\"inference\")"
235
- ]
236
- },
237
- {
238
- "cell_type": "code",
239
- "execution_count": 11,
240
- "metadata": {},
241
- "outputs": [],
242
- "source": [
243
- "\n",
244
- "#--------------------------------------------------------------------------------------------\n",
245
- "# YOUR MODEL INFERENCE CODE HERE\n",
246
- "# Update the code below to replace the random baseline by your model inference within the inference pass where the energy consumption and emissions are tracked.\n",
247
- "#-------------------------------------------------------------------------------------------- \n",
248
- "\n",
249
- "# Make random predictions (placeholder for actual model inference)\n",
250
- "\n",
251
- "predictions = []\n",
252
- "true_labels = []\n",
253
- "pred_boxes = []\n",
254
- "true_boxes_list = [] # List of lists, each inner list contains boxes for one image\n",
255
- "\n",
256
- "for example in test_dataset:\n",
257
- " # Parse true annotation (YOLO format: class_id x_center y_center width height)\n",
258
- " annotation = example.get(\"annotations\", \"\").strip()\n",
259
- " has_smoke = len(annotation) > 0\n",
260
- " true_labels.append(int(has_smoke))\n",
261
- " \n",
262
- " # Make random classification prediction\n",
263
- " pred_has_smoke = random.random() > 0.5\n",
264
- " predictions.append(int(pred_has_smoke))\n",
265
- " \n",
266
- " # If there's a true box, parse it and make random box prediction\n",
267
- " if has_smoke:\n",
268
- " # Parse all true boxes from the annotation\n",
269
- " image_true_boxes = parse_boxes(annotation)\n",
270
- " true_boxes_list.append(image_true_boxes)\n",
271
- " \n",
272
- " # For baseline, make one random box prediction per image\n",
273
- " # In a real model, you might want to predict multiple boxes\n",
274
- " random_box = [\n",
275
- " random.random(), # x_center\n",
276
- " random.random(), # y_center\n",
277
- " random.random() * 0.5, # width (max 0.5)\n",
278
- " random.random() * 0.5 # height (max 0.5)\n",
279
- " ]\n",
280
- " pred_boxes.append(random_box)\n",
281
- "\n",
282
- "\n",
283
- "#--------------------------------------------------------------------------------------------\n",
284
- "# YOUR MODEL INFERENCE STOPS HERE\n",
285
- "#-------------------------------------------------------------------------------------------- "
286
- ]
287
- },
288
- {
289
- "cell_type": "code",
290
- "execution_count": null,
291
- "metadata": {},
292
- "outputs": [],
293
- "source": [
294
- "# Stop tracking emissions\n",
295
- "emissions_data = tracker.stop_task()"
296
- ]
297
- },
298
- {
299
- "cell_type": "code",
300
- "execution_count": 15,
301
- "metadata": {},
302
- "outputs": [],
303
- "source": [
304
- "import numpy as np\n",
305
- "\n",
306
- "# Calculate classification metrics\n",
307
- "classification_accuracy = accuracy_score(true_labels, predictions)\n",
308
- "classification_precision = precision_score(true_labels, predictions)\n",
309
- "classification_recall = recall_score(true_labels, predictions)\n",
310
- "\n",
311
- "# Calculate mean IoU for object detection (only for images with smoke)\n",
312
- "# For each image, we compute the max IoU between the predicted box and all true boxes\n",
313
- "ious = []\n",
314
- "for true_boxes, pred_box in zip(true_boxes_list, pred_boxes):\n",
315
- " max_iou = compute_max_iou(true_boxes, pred_box)\n",
316
- " ious.append(max_iou)\n",
317
- "\n",
318
- "mean_iou = float(np.mean(ious)) if ious else 0.0"
319
- ]
320
- },
- {
- "cell_type": "code",
- "execution_count": 18,
- "metadata": {},
- "outputs": [
- {
- "data": {
- "text/plain": [
- "{'submission_timestamp': '2025-01-22T15:57:37.288173',\n",
- " 'classification_accuracy': 0.5001692620176033,\n",
- " 'classification_precision': 0.8397129186602871,\n",
- " 'classification_recall': 0.4972677595628415,\n",
- " 'mean_iou': 0.002819781629108398,\n",
- " 'energy_consumed_wh': 0.779355299496116,\n",
- " 'emissions_gco2eq': 0.043674291628462855,\n",
- " 'emissions_data': {'run_id': '4e750cd5-60f0-444c-baee-b5f7b31f784b',\n",
- " 'duration': 51.72819679998793,\n",
- " 'emissions': 4.3674291628462856e-05,\n",
- " 'emissions_rate': 8.445163379568943e-07,\n",
- " 'cpu_power': 42.5,\n",
- " 'gpu_power': 0.0,\n",
- " 'ram_power': 11.755242347717285,\n",
- " 'cpu_energy': 0.0006104993474311617,\n",
- " 'gpu_energy': 0,\n",
- " 'ram_energy': 0.00016885595206495442,\n",
- " 'energy_consumed': 0.0007793552994961161,\n",
- " 'country_name': 'France',\n",
- " 'country_iso_code': 'FRA',\n",
- " 'region': 'île-de-france',\n",
- " 'cloud_provider': '',\n",
- " 'cloud_region': '',\n",
- " 'os': 'Windows-11-10.0.22631-SP0',\n",
- " 'python_version': '3.12.7',\n",
- " 'codecarbon_version': '3.0.0_rc0',\n",
- " 'cpu_count': 12,\n",
- " 'cpu_model': '13th Gen Intel(R) Core(TM) i7-1365U',\n",
- " 'gpu_count': None,\n",
- " 'gpu_model': None,\n",
- " 'ram_total_size': 31.347312927246094,\n",
- " 'tracking_mode': 'machine',\n",
- " 'on_cloud': 'N',\n",
- " 'pue': 1.0},\n",
- " 'dataset_config': {'dataset_name': 'pyronear/pyro-sdis',\n",
- " 'test_size': 0.2,\n",
- " 'test_seed': 42}}"
- ]
- },
- "execution_count": 18,
- "metadata": {},
- "output_type": "execute_result"
- }
- ],
- "source": [
- "\n",
- "# Prepare results dictionary\n",
- "results = {\n",
- "    \"submission_timestamp\": datetime.now().isoformat(),\n",
- "    \"classification_accuracy\": float(classification_accuracy),\n",
- "    \"classification_precision\": float(classification_precision),\n",
- "    \"classification_recall\": float(classification_recall),\n",
- "    \"mean_iou\": mean_iou,\n",
- "    \"energy_consumed_wh\": emissions_data.energy_consumed * 1000,\n",
- "    \"emissions_gco2eq\": emissions_data.emissions * 1000,\n",
- "    \"emissions_data\": clean_emissions_data(emissions_data),\n",
- "    \"dataset_config\": {\n",
- "        \"dataset_name\": request.dataset_name,\n",
- "        \"test_size\": request.test_size,\n",
- "        \"test_seed\": request.test_seed\n",
- "    }\n",
- "}\n",
- "results"
- ]
- }
- ],
- "metadata": {
- "kernelspec": {
- "display_name": "base",
- "language": "python",
- "name": "python3"
- },
- "language_info": {
- "codemirror_mode": {
- "name": "ipython",
- "version": 3
- },
- "file_extension": ".py",
- "mimetype": "text/x-python",
- "name": "python",
- "nbconvert_exporter": "python",
- "pygments_lexer": "ipython3",
- "version": "3.12.7"
- }
- },
- "nbformat": 4,
- "nbformat_minor": 2
- }
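One unit detail in the results cell above: CodeCarbon reports `energy_consumed` in kWh and `emissions` in kgCO2eq, so the `* 1000` factors convert them to the Wh and gCO2eq fields the challenge expects. Checking against the printed output:

```python
# CodeCarbon units: energy_consumed is in kWh, emissions in kgCO2eq.
energy_consumed_wh = 0.0007793552994961161 * 1000  # -> 0.779355... Wh, as in the results dict
emissions_gco2eq = 4.3674291628462856e-05 * 1000   # -> 0.043674... gCO2eq, as in the results dict
```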
notebooks/template-text.ipynb DELETED
@@ -1,1642 +0,0 @@
- {
- "cells": [
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "# Text task notebook template\n",
- "## Loading the necessary libraries"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 3,
- "metadata": {},
- "outputs": [
- {
- "name": "stderr",
- "output_type": "stream",
- "text": [
- "[codecarbon WARNING @ 19:48:07] Multiple instances of codecarbon are allowed to run at the same time.\n",
- "[codecarbon INFO @ 19:48:07] [setup] RAM Tracking...\n",
- "[codecarbon INFO @ 19:48:07] [setup] CPU Tracking...\n",
- "[codecarbon WARNING @ 19:48:09] We saw that you have a 13th Gen Intel(R) Core(TM) i7-1365U but we don't know it. Please contact us.\n",
- "[codecarbon WARNING @ 19:48:09] No CPU tracking mode found. Falling back on CPU constant mode. \n",
- " Windows OS detected: Please install Intel Power Gadget to measure CPU\n",
- "\n",
- "[codecarbon WARNING @ 19:48:11] We saw that you have a 13th Gen Intel(R) Core(TM) i7-1365U but we don't know it. Please contact us.\n",
- "[codecarbon INFO @ 19:48:11] CPU Model on constant consumption mode: 13th Gen Intel(R) Core(TM) i7-1365U\n",
- "[codecarbon WARNING @ 19:48:11] No CPU tracking mode found. Falling back on CPU constant mode.\n",
- "[codecarbon INFO @ 19:48:11] [setup] GPU Tracking...\n",
- "[codecarbon INFO @ 19:48:11] No GPU found.\n",
- "[codecarbon INFO @ 19:48:11] >>> Tracker's metadata:\n",
- "[codecarbon INFO @ 19:48:11] Platform system: Windows-11-10.0.22631-SP0\n",
- "[codecarbon INFO @ 19:48:11] Python version: 3.12.7\n",
- "[codecarbon INFO @ 19:48:11] CodeCarbon version: 3.0.0_rc0\n",
- "[codecarbon INFO @ 19:48:11] Available RAM : 31.347 GB\n",
- "[codecarbon INFO @ 19:48:11] CPU count: 12\n",
- "[codecarbon INFO @ 19:48:11] CPU model: 13th Gen Intel(R) Core(TM) i7-1365U\n",
- "[codecarbon INFO @ 19:48:11] GPU count: None\n",
- "[codecarbon INFO @ 19:48:11] GPU model: None\n",
- "[codecarbon INFO @ 19:48:11] Saving emissions data to file c:\\git\\submission-template\\notebooks\\emissions.csv\n"
- ]
- }
- ],
- "source": [
- "from fastapi import APIRouter\n",
- "from datetime import datetime\n",
- "from datasets import load_dataset\n",
- "from sklearn.metrics import accuracy_score\n",
- "import random\n",
- "\n",
- "import sys\n",
- "sys.path.append('../tasks')\n",
- "\n",
- "from utils.evaluation import TextEvaluationRequest\n",
- "from utils.emissions import tracker, clean_emissions_data, get_space_info\n",
- "\n",
- "\n",
- "# Define the label mapping\n",
- "LABEL_MAPPING = {\n",
- "    \"0_not_relevant\": 0,\n",
- "    \"1_not_happening\": 1,\n",
- "    \"2_not_human\": 2,\n",
- "    \"3_not_bad\": 3,\n",
- "    \"4_solutions_harmful_unnecessary\": 4,\n",
- "    \"5_science_unreliable\": 5,\n",
- "    \"6_proponents_biased\": 6,\n",
- "    \"7_fossil_fuels_needed\": 7\n",
- "}"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "## Loading the datasets and splitting them"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 4,
- "metadata": {},
- "outputs": [
- {
- "data": {
- "application/vnd.jupyter.widget-view+json": {
- "model_id": "668da7bf85434e098b95c3ec447d78fe",
- "version_major": 2,
- "version_minor": 0
- },
- "text/plain": [
- "README.md: 0%| | 0.00/5.18k [00:00<?, ?B/s]"
- ]
- },
- "metadata": {},
- "output_type": "display_data"
- },
- {
- "name": "stderr",
- "output_type": "stream",
- "text": [
- "c:\\Users\\theo.alvesdacosta\\AppData\\Local\\anaconda3\\Lib\\site-packages\\huggingface_hub\\file_download.py:139: UserWarning: `huggingface_hub` cache-system uses symlinks by default to efficiently store duplicated files but your machine does not support them in C:\\Users\\theo.alvesdacosta\\.cache\\huggingface\\hub\\datasets--QuotaClimat--frugalaichallenge-text-train. Caching files will still work but in a degraded version that might require more space on your disk. This warning can be disabled by setting the `HF_HUB_DISABLE_SYMLINKS_WARNING` environment variable. For more details, see https://huggingface.co/docs/huggingface_hub/how-to-cache#limitations.\n",
- "To support symlinks on Windows, you either need to activate Developer Mode or to run Python as an administrator. In order to activate developer mode, see this article: https://docs.microsoft.com/en-us/windows/apps/get-started/enable-your-device-for-development\n",
- " warnings.warn(message)\n"
- ]
- },
- {
- "data": {
- "application/vnd.jupyter.widget-view+json": {
- "model_id": "5b68d43359eb429395da8be7d4b15556",
- "version_major": 2,
- "version_minor": 0
- },
- "text/plain": [
- "train.parquet: 0%| | 0.00/1.21M [00:00<?, ?B/s]"
- ]
- },
- "metadata": {},
- "output_type": "display_data"
- },
- {
- "data": {
- "application/vnd.jupyter.widget-view+json": {
- "model_id": "140a304773914e9db8f698eabeb40298",
- "version_major": 2,
- "version_minor": 0
- },
- "text/plain": [
- "Generating train split: 0%| | 0/6091 [00:00<?, ? examples/s]"
- ]
- },
- "metadata": {},
- "output_type": "display_data"
- },
- {
- "data": {
- "application/vnd.jupyter.widget-view+json": {
- "model_id": "6d04e8ab1906400e8e0029949dc523a5",
- "version_major": 2,
- "version_minor": 0
- },
- "text/plain": [
- "Map: 0%| | 0/6091 [00:00<?, ? examples/s]"
- ]
- },
- "metadata": {},
- "output_type": "display_data"
- }
- ],
- "source": [
- "request = TextEvaluationRequest()\n",
- "\n",
- "# Load and prepare the dataset\n",
- "dataset = load_dataset(request.dataset_name)\n",
- "\n",
- "# Convert string labels to integers\n",
- "dataset = dataset.map(lambda x: {\"label\": LABEL_MAPPING[x[\"label\"]]})\n",
- "\n",
- "# Split dataset\n",
- "train_test = dataset[\"train\"]\n",
- "test_dataset = dataset[\"test\"]"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "## Random Baseline"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 5,
- "metadata": {},
- "outputs": [],
- "source": [
- "# Start tracking emissions\n",
- "tracker.start()\n",
- "tracker.start_task(\"inference\")"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 6,
- "metadata": {},
- "outputs": [
- {
- "data": {
- "text/plain": [
- "[1,\n",
- " 7,\n",
- " 6,\n",
- [... roughly a thousand more lines of random label predictions (0-7), one per test example; output truncated by Jupyter ...]
- " ...]"
- ]
- },
- "execution_count": 6,
- "metadata": {},
- "output_type": "execute_result"
- }
- ],
- "source": [
- "\n",
- "#--------------------------------------------------------------------------------------------\n",
- "# YOUR MODEL INFERENCE CODE HERE\n",
- "# Update the code below to replace the random baseline by your model inference within the inference pass where the energy consumption and emissions are tracked.\n",
- "#-------------------------------------------------------------------------------------------- \n",
- "\n",
- "# Make random predictions (placeholder for actual model inference)\n",
- "true_labels = test_dataset[\"label\"]\n",
- "predictions = [random.randint(0, 7) for _ in range(len(true_labels))]\n",
- "\n",
- "predictions\n",
- "\n",
- "#--------------------------------------------------------------------------------------------\n",
- "# YOUR MODEL INFERENCE STOPS HERE\n",
- "#-------------------------------------------------------------------------------------------- "
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 8,
- "metadata": {},
- "outputs": [
- {
- "name": "stderr",
- "output_type": "stream",
- "text": [
- "[codecarbon WARNING @ 19:53:32] Background scheduler didn't run for a long period (47s), results might be inaccurate\n",
- "[codecarbon INFO @ 19:53:32] Energy consumed for RAM : 0.000156 kWh. RAM Power : 11.755242347717285 W\n",
- "[codecarbon INFO @ 19:53:32] Delta energy consumed for CPU with constant : 0.000564 kWh, power : 42.5 W\n",
- "[codecarbon INFO @ 19:53:32] Energy consumed for All CPU : 0.000564 kWh\n",
- "[codecarbon INFO @ 19:53:32] 0.000720 kWh of electricity used since the beginning.\n"
- ]
- },
- {
- "data": {
- "text/plain": [
- "EmissionsData(timestamp='2025-01-21T19:53:32', project_name='codecarbon', run_id='908f2e7e-4bb2-4991-a0f6-56bf8d7eda21', experiment_id='5b0fa12a-3dd7-45bb-9766-cc326314d9f1', duration=47.736408500000834, emissions=4.032368007471064e-05, emissions_rate=8.444466886328872e-07, cpu_power=42.5, gpu_power=0.0, ram_power=11.755242347717285, cpu_energy=0.0005636615353475565, gpu_energy=0, ram_energy=0.00015590305493261682, energy_consumed=0.0007195645902801733, country_name='France', country_iso_code='FRA', region='île-de-france', cloud_provider='', cloud_region='', os='Windows-11-10.0.22631-SP0', python_version='3.12.7', codecarbon_version='3.0.0_rc0', cpu_count=12, cpu_model='13th Gen Intel(R) Core(TM) i7-1365U', gpu_count=None, gpu_model=None, longitude=2.3494, latitude=48.8558, ram_total_size=31.347312927246094, tracking_mode='machine', on_cloud='N', pue=1.0)"
- ]
- },
- "execution_count": 8,
- "metadata": {},
- "output_type": "execute_result"
- }
- ],
- "source": [
- "# Stop tracking emissions\n",
- "emissions_data = tracker.stop_task()\n",
- "emissions_data"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 9,
- "metadata": {},
- "outputs": [
- {
- "data": {
- "text/plain": [
- "0.10090237899917966"
- ]
- },
- "execution_count": 9,
- "metadata": {},
- "output_type": "execute_result"
- }
- ],
- "source": [
- "# Calculate accuracy\n",
- "accuracy = accuracy_score(true_labels, predictions)\n",
- "accuracy"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 10,
- "metadata": {},
- "outputs": [
- {
- "data": {
- "text/plain": [
- "{'submission_timestamp': '2025-01-21T19:53:46.639165',\n",
- " 'accuracy': 0.10090237899917966,\n",
- " 'energy_consumed_wh': 0.7195645902801733,\n",
- " 'emissions_gco2eq': 0.040323680074710634,\n",
- " 'emissions_data': {'run_id': '908f2e7e-4bb2-4991-a0f6-56bf8d7eda21',\n",
- " 'duration': 47.736408500000834,\n",
- " 'emissions': 4.032368007471064e-05,\n",
- " 'emissions_rate': 8.444466886328872e-07,\n",
- " 'cpu_power': 42.5,\n",
- " 'gpu_power': 0.0,\n",
- " 'ram_power': 11.755242347717285,\n",
- " 'cpu_energy': 0.0005636615353475565,\n",
- " 'gpu_energy': 0,\n",
- " 'ram_energy': 0.00015590305493261682,\n",
- " 'energy_consumed': 0.0007195645902801733,\n",
- " 'country_name': 'France',\n",
- " 'country_iso_code': 'FRA',\n",
- " 'region': 'île-de-france',\n",
- " 'cloud_provider': '',\n",
- " 'cloud_region': '',\n",
- " 'os': 'Windows-11-10.0.22631-SP0',\n",
- " 'python_version': '3.12.7',\n",
- " 'codecarbon_version': '3.0.0_rc0',\n",
- " 'cpu_count': 12,\n",
- " 'cpu_model': '13th Gen Intel(R) Core(TM) i7-1365U',\n",
- " 'gpu_count': None,\n",
- " 'gpu_model': None,\n",
- " 'ram_total_size': 31.347312927246094,\n",
- " 'tracking_mode': 'machine',\n",
- " 'on_cloud': 'N',\n",
- " 'pue': 1.0},\n",
- " 'dataset_config': {'dataset_name': 'QuotaClimat/frugalaichallenge-text-train',\n",
- " 'test_size': 0.2,\n",
- " 'test_seed': 42}}"
- ]
- },
- "execution_count": 10,
- "metadata": {},
- "output_type": "execute_result"
- }
- ],
- "source": [
- "# Prepare results dictionary\n",
- "results = {\n",
- "    \"submission_timestamp\": datetime.now().isoformat(),\n",
- "    \"accuracy\": float(accuracy),\n",
- "    \"energy_consumed_wh\": emissions_data.energy_consumed * 1000,\n",
- "    \"emissions_gco2eq\": emissions_data.emissions * 1000,\n",
- "    \"emissions_data\": clean_emissions_data(emissions_data),\n",
- "    \"dataset_config\": {\n",
- "        \"dataset_name\": request.dataset_name,\n",
- "        \"test_size\": request.test_size,\n",
- "        \"test_seed\": request.test_seed\n",
- "    }\n",
- "}\n",
- "\n",
- "results"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "## Development of the model"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 11,
- "metadata": {},
- "outputs": [
- {
- "data": {
- "application/vnd.jupyter.widget-view+json": {
- "model_id": "90f50ab19698484489f36976745efad3",
- "version_major": 2,
- "version_minor": 0
- },
- "text/plain": [
- "config.json: 0%| | 0.00/1.15k [00:00<?, ?B/s]"
- ]
- },
- "metadata": {},
- "output_type": "display_data"
- },
- {
- "name": "stderr",
- "output_type": "stream",
- "text": [
- "c:\\Users\\theo.alvesdacosta\\AppData\\Local\\anaconda3\\Lib\\site-packages\\huggingface_hub\\file_download.py:139: UserWarning: `huggingface_hub` cache-system uses symlinks by default to efficiently store duplicated files but your machine does not support them in C:\\Users\\theo.alvesdacosta\\.cache\\huggingface\\hub\\models--facebook--bart-large-mnli. Caching files will still work but in a degraded version that might require more space on your disk. This warning can be disabled by setting the `HF_HUB_DISABLE_SYMLINKS_WARNING` environment variable. For more details, see https://huggingface.co/docs/huggingface_hub/how-to-cache#limitations.\n",
- "To support symlinks on Windows, you either need to activate Developer Mode or to run Python as an administrator. In order to activate developer mode, see this article: https://docs.microsoft.com/en-us/windows/apps/get-started/enable-your-device-for-development\n",
- " warnings.warn(message)\n"
- ]
- },
- {
- "data": {
- "application/vnd.jupyter.widget-view+json": {
- "model_id": "6e3974d8ff284603821f7beca9bd353d",
- "version_major": 2,
- "version_minor": 0
- },
- "text/plain": [
- "model.safetensors: 0%| | 0.00/1.63G [00:00<?, ?B/s]"
- ]
- },
- "metadata": {},
- "output_type": "display_data"
- },
- {
- "data": {
- "application/vnd.jupyter.widget-view+json": {
- "model_id": "bc29cb379c644b00b1bdf61d5426d99d",
- "version_major": 2,
- "version_minor": 0
- },
- "text/plain": [
- "tokenizer_config.json: 0%| | 0.00/26.0 [00:00<?, ?B/s]"
- ]
- },
- "metadata": {},
- "output_type": "display_data"
- },
- {
- "data": {
- "application/vnd.jupyter.widget-view+json": {
- "model_id": "635503cf819747c9a83f22aa4f2f11db",
- "version_major": 2,
- "version_minor": 0
- },
- "text/plain": [
- "vocab.json: 0%| | 0.00/899k [00:00<?, ?B/s]"
- ]
- },
- "metadata": {},
- "output_type": "display_data"
- },
- {
- "data": {
- "application/vnd.jupyter.widget-view+json": {
- "model_id": "3a5f53e451e8483ca7c33f42245abd13",
- "version_major": 2,
- "version_minor": 0
- },
- "text/plain": [
- "merges.txt: 0%| | 0.00/456k [00:00<?, ?B/s]"
- ]
- },
- "metadata": {},
- "output_type": "display_data"
- },
- {
- "data": {
- "application/vnd.jupyter.widget-view+json": {
- "model_id": "84f922d1b68a4a0faa5e920d004efca0",
- "version_major": 2,
- "version_minor": 0
- },
- "text/plain": [
- "tokenizer.json: 0%| | 0.00/1.36M [00:00<?, ?B/s]"
- ]
- },
- "metadata": {},
- "output_type": "display_data"
- },
- {
- "name": "stderr",
- "output_type": "stream",
- "text": [
- "Device set to use cpu\n"
- ]
- }
- ],
- "source": [
- "from transformers import pipeline\n",
- "classifier = pipeline(\"zero-shot-classification\",\n",
- "                      model=\"facebook/bart-large-mnli\")\n"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 14,
- "metadata": {},
- "outputs": [],
- "source": [
- "sequence_to_classify = \"one day I will see the world\"\n",
- "\n",
- "candidate_labels = [\n",
- "    \"Not related to climate change disinformation\",\n",
- "    \"Climate change is not real and not happening\",\n",
- "    \"Climate change is not human-induced\",\n",
- "    \"Climate change impacts are not that bad\",\n",
- "    \"Climate change solutions are harmful and unnecessary\",\n",
- "    \"Climate change science is unreliable\",\n",
- "    \"Climate change proponents are biased\",\n",
- "    \"Fossil fuels are needed to address climate change\"\n",
- "]"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 15,
- "metadata": {},
- "outputs": [
- {
- "data": {
- "text/plain": [
- "{'sequence': 'one day I will see the world',\n",
- " 'labels': ['Fossil fuels are needed to address climate change',\n",
- " 'Climate change science is unreliable',\n",
- " 'Not related to climate change disinformation',\n",
- " 'Climate change proponents are biased',\n",
- " 'Climate change impacts are not that bad',\n",
- " 'Climate change solutions are harmful and unnecessary',\n",
- " 'Climate change is not human-induced',\n",
- " 'Climate change is not real and not happening'],\n",
- " 'scores': [0.16242119669914246,\n",
- " 0.15683825314044952,\n",
- " 0.1564282774925232,\n",
- " 0.14603719115257263,\n",
- " 0.12794046103954315,\n",
- " 0.10180754214525223,\n",
- " 0.0936085507273674,\n",
- " 0.0549185685813427]}"
- ]
- },
- "execution_count": 15,
- "metadata": {},
- "output_type": "execute_result"
- }
- ],
- "source": [
- "classifier(sequence_to_classify, candidate_labels)"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 26,
- "metadata": {},
- "outputs": [
- {
- "name": "stderr",
- "output_type": "stream",
- "text": [
- "[codecarbon WARNING @ 11:00:07] Already started tracking\n"
- ]
- },
- {
- "data": {
- "application/vnd.jupyter.widget-view+json": {
- "model_id": "5d66a13f76a4411d95b62d4a73012495",
- "version_major": 2,
- "version_minor": 0
- },
- "text/plain": [
- "0it [00:00, ?it/s]"
- ]
- },
- "metadata": {},
- "output_type": "display_data"
- },
- {
- "name": "stderr",
- "output_type": "stream",
- "text": [
- "[codecarbon WARNING @ 11:05:57] Background scheduler didn't run for a long period (349s), results might be inaccurate\n",
- "[codecarbon INFO @ 11:05:57] Energy consumed for RAM : 0.018069 kWh. RAM Power : 11.755242347717285 W\n",
- "[codecarbon INFO @ 11:05:57] Delta energy consumed for CPU with constant : 0.004122 kWh, power : 42.5 W\n",
- "[codecarbon INFO @ 11:05:57] Energy consumed for All CPU : 0.065327 kWh\n",
- "[codecarbon INFO @ 11:05:57] 0.083395 kWh of electricity used since the beginning.\n"
- ]
- },
- {
- "data": {
- "text/plain": [
- "EmissionsData(timestamp='2025-01-22T11:05:57', project_name='codecarbon', run_id='908f2e7e-4bb2-4991-a0f6-56bf8d7eda21', experiment_id='5b0fa12a-3dd7-45bb-9766-cc326314d9f1', duration=349.19709450000664, emissions=0.0002949120266226386, emissions_rate=8.445461750018632e-07, cpu_power=42.5, gpu_power=0.0, ram_power=11.755242347717285, cpu_energy=0.004122396676597424, gpu_energy=0, ram_energy=0.0011402244733631148, energy_consumed=0.005262621149960539, country_name='France', country_iso_code='FRA', region='île-de-france', cloud_provider='', cloud_region='', os='Windows-11-10.0.22631-SP0', python_version='3.12.7', codecarbon_version='3.0.0_rc0', cpu_count=12, cpu_model='13th Gen Intel(R) Core(TM) i7-1365U', gpu_count=None, gpu_model=None, longitude=2.3494, latitude=48.8558, ram_total_size=31.347312927246094, tracking_mode='machine', on_cloud='N', pue=1.0)"
- ]
- },
- "execution_count": 26,
- "metadata": {},
- "output_type": "execute_result"
- }
- ],
- "source": [
- "# Start tracking emissions\n",
- "tracker.start()\n",
- "tracker.start_task(\"inference\")\n",
- "\n",
- "from tqdm.auto import tqdm\n",
- "predictions = []\n",
- "\n",
- "\n",
- "\n",
- "# Option 1: Simple loop approach\n",
- "\n",
- "for i, text in tqdm(enumerate(test_dataset[\"quote\"])):\n",
- "\n",
- "    result = classifier(text, candidate_labels)\n",
- "\n",
- "    # Get index of highest scoring label\n",
- "\n",
- "    pred_label = candidate_labels.index(result[\"labels\"][0])\n",
- "\n",
- "    predictions.append(pred_label)\n",
- "    if i == 100:\n",
- "        break\n",
- "\n",
- "\n",
- "# Stop tracking emissions\n",
- "emissions_data = tracker.stop_task()\n",
- "emissions_data\n"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 28,
- "metadata": {},
- "outputs": [
- {
- "data": {
- "text/plain": [
- "0.4"
- ]
- },
- "execution_count": 28,
- "metadata": {},
- "output_type": "execute_result"
- }
- ],
- "source": [
- "# Calculate accuracy\n",
- "accuracy = accuracy_score(true_labels[:100], predictions[:100])\n",
- "accuracy"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "metadata": {},
- "outputs": [],
- "source": []
- }
- ],
- "metadata": {
- "kernelspec": {
- "display_name": "base",
- "language": "python",
- "name": "python3"
- },
- "language_info": {
- "codemirror_mode": {
- "name": "ipython",
- "version": 3
- },
- "file_extension": ".py",
- "mimetype": "text/x-python",
- "name": "python",
- "nbconvert_exporter": "python",
- "pygments_lexer": "ipython3",
- "version": "3.12.7"
- }
- },
- "nbformat": 4,
- "nbformat_minor": 2
- }
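The exploratory loop in the deleted notebook above classifies one quote at a time and breaks after 100 examples, which is why only `true_labels[:100]` are scored. For reference, the `transformers` zero-shot pipeline also accepts a list of sequences plus a `batch_size` argument, so the same experiment could cover the full test split; a minimal sketch reusing `test_dataset` and `candidate_labels` from the notebook (the batch size of 8 is an illustrative guess):

```python
from transformers import pipeline

classifier = pipeline("zero-shot-classification", model="facebook/bart-large-mnli")

# test_dataset and candidate_labels as defined in the notebook above.
texts = list(test_dataset["quote"])
results = classifier(texts, candidate_labels, batch_size=8)

# Map each top-scoring candidate label back to its integer class index.
predictions = [candidate_labels.index(r["labels"][0]) for r in results]
```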
requirements.txt CHANGED
@@ -7,4 +7,6 @@ pydantic>=1.10.0
 python-dotenv>=1.0.0
 gradio>=4.0.0
 requests>=2.31.0
-librosa==0.10.2.post1
+librosa==0.10.2.post1
+torch==2.5.1
+torchaudio==2.5.1
tasks/audio.py CHANGED
@@ -2,31 +2,30 @@ from fastapi import APIRouter
 from datetime import datetime
 from datasets import load_dataset
 from sklearn.metrics import accuracy_score
-import random
 import os
+import torch
 
 from .utils.evaluation import AudioEvaluationRequest
 from .utils.emissions import tracker, clean_emissions_data, get_space_info
+from .utils.preprocess import get_dataloader
+from .models.model import ChainsawDetector
 
 from dotenv import load_dotenv
 load_dotenv()
 
 router = APIRouter()
 
-DESCRIPTION = "Random Baseline"
+DESCRIPTION = "Chainsaw goes brrr ⇒ GPU goes brrr"
 ROUTE = "/audio"
 
 
-
-@router.post(ROUTE, tags=["Audio Task"],
-             description=DESCRIPTION)
+@router.post(ROUTE, tags=["Audio Task"], description=DESCRIPTION)
 async def evaluate_audio(request: AudioEvaluationRequest):
     """
     Evaluate audio classification for rainforest sound detection.
 
-    Current Model: Random Baseline
-    - Makes random predictions from the label space (0-1)
-    - Used as a baseline for comparison
+    Current Model: ChainsawDetector
+    - STFT -> PCEN -> split into small time chunks -> CNN+LSTM for each chunk -> dense -> prediction
     """
     # Get space info
     username, space_url = get_space_info()
@@ -38,11 +37,17 @@ async def evaluate_audio(request: AudioEvaluationRequest):
     }
     # Load and prepare the dataset
     # Because the dataset is gated, we need to use the HF_TOKEN environment variable to authenticate
-    dataset = load_dataset(request.dataset_name,token=os.getenv("HF_TOKEN"))
-
-    # Split dataset
-    train_test = dataset["train"]
-    test_dataset = dataset["test"]
+    batch_size = 16
+    device = "cuda" if torch.cuda.is_available() else "cpu"
+    split='test'
+    test_dataset = load_dataset(request.dataset_name, split=split, token=os.getenv("HF_TOKEN"))
+    dataloader = get_dataloader(test_dataset, device, batch_size=batch_size, shuffle=False)
+
+    # Load model
+    model = ChainsawDetector(batch_size).to(device, dtype=torch.bfloat16)
+    model = torch.compile(model)
+    model.load_state_dict(torch.load('models/final-bf16.pth', weights_only=True))
+    model.eval()
 
     # Start tracking emissions
     tracker.start()
@@ -53,9 +58,14 @@ async def evaluate_audio(request: AudioEvaluationRequest):
     # Update the code below to replace the random baseline by your model inference within the inference pass where the energy consumption and emissions are tracked.
     #--------------------------------------------------------------------------------------------
 
-    # Make random predictions (placeholder for actual model inference)
-    true_labels = test_dataset["label"]
-    predictions = [random.randint(0, 1) for _ in range(len(true_labels))]
+    predictions = []
+    with torch.no_grad():#, torch.amp.autocast(device_type=device):
+        for (X, y) in dataloader:
+            X = X.to(device, dtype=torch.bfloat16)
+            y = y.to(device, dtype=torch.bfloat16)
+
+            predictions.append(model(X))
+    predictions = torch.cat(predictions, dim=0)
 
     #--------------------------------------------------------------------------------------------
     # YOUR MODEL INFERENCE STOPS HERE
@@ -65,6 +75,7 @@ async def evaluate_audio(request: AudioEvaluationRequest):
     emissions_data = tracker.stop_task()
 
     # Calculate accuracy
+    true_labels = test_dataset["label"]
     accuracy = accuracy_score(true_labels, predictions)
 
     # Prepare results dictionary
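The `ChainsawDetector` class imported above lives in `tasks/models/model.py`, which is not part of this diff; the only description visible here is the docstring pipeline "STFT -> PCEN -> split into small time chunks -> CNN+LSTM for each chunk -> dense -> prediction". A minimal sketch of a model with that shape, where every layer size is an illustrative guess and `log1p` compression stands in for PCEN (torchaudio ships no built-in PCEN transform; the real model may compute it differently):

```python
import torch
import torch.nn as nn
import torchaudio


class ChainsawDetectorSketch(nn.Module):
    """Illustrative stand-in for the submitted model, not its actual code."""

    def __init__(self, n_fft=512, hop_length=256, n_chunks=8, hidden=64):
        super().__init__()
        # STFT magnitude spectrogram
        self.spec = torchaudio.transforms.Spectrogram(
            n_fft=n_fft, hop_length=hop_length, power=2.0
        )
        self.n_chunks = n_chunks
        # Small CNN applied independently to each time chunk
        self.cnn = nn.Sequential(
            nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2),
            nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d((4, 4)),
        )
        # LSTM over the sequence of per-chunk embeddings, then a dense head
        self.lstm = nn.LSTM(32 * 4 * 4, hidden, batch_first=True)
        self.head = nn.Linear(hidden, 1)

    def forward(self, x):                       # x: (batch, samples)
        spec = self.spec(x)                     # (batch, freq, time)
        spec = torch.log1p(spec)                # cheap stand-in for PCEN
        feats = []
        for chunk in spec.chunk(self.n_chunks, dim=-1):   # split along time
            f = self.cnn(chunk.unsqueeze(1))    # (batch, 32, 4, 4)
            feats.append(f.flatten(1))          # (batch, 512)
        seq = torch.stack(feats, dim=1)         # (batch, n_chunks, 512)
        out, _ = self.lstm(seq)
        logits = self.head(out[:, -1])          # last chunk's hidden state
        # Hard 0/1 labels so the output can feed accuracy_score directly
        return (torch.sigmoid(logits) > 0.5).squeeze(1).long()


if __name__ == "__main__":
    model = ChainsawDetectorSketch().eval()
    with torch.no_grad():
        preds = model(torch.randn(4, 72000))    # e.g. 4 three-second clips at 24 kHz
    print(preds.shape)                          # torch.Size([4])
```

Running the CNN per chunk and recurring only over the short sequence of chunk embeddings keeps the LSTM cheap, which fits the frugality goal; the actual chunking and layer choices are those of `tasks/models/model.py`.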
tasks/datasources/freesound_chainsaw.txt ADDED
@@ -0,0 +1,68 @@
+ chainsaw sawing nearby long various cuts.flac by kyles -- https://freesound.org/s/453249/ -- License: Creative Commons 0
+ Chainsaw First Start (On choke).wav by lonemonk -- https://freesound.org/s/185578/ -- License: Attribution 3.0
+ Chainsaw - Pull to Idle and 3 Revs.wav by Stefan021 -- https://freesound.org/s/431737/ -- License: Creative Commons 0
+ Chainsaw Noises 002.wav by yottasounds -- https://freesound.org/s/380161/ -- License: Creative Commons 0
+ Tree Chainsawed Drops.m4a by RutgerMuller -- https://freesound.org/s/535352/ -- License: Creative Commons 0
+ D209 Chainsaw in the Wood.WAV by billcutbill -- https://freesound.org/s/669382/ -- License: Attribution 4.0
+ chainsaw sawing working long various.flac by kyles -- https://freesound.org/s/637360/ -- License: Creative Commons 0
+ chainsaw.WAV by inchadney -- https://freesound.org/s/467419/ -- License: Attribution NonCommercial 4.0
+ CHAINSAW.wav by JFBSAUVE -- https://freesound.org/s/19898/ -- License: Attribution 4.0
+ chainsaw.m4a by Chilsville -- https://freesound.org/s/570263/ -- License: Attribution NonCommercial 3.0
+ chainsaw.wav by ShanayGroen -- https://freesound.org/s/365578/ -- License: Attribution NonCommercial 3.0
+ Chainsaw with Bobcat idling and tree bits falling 080320.wav by BoilingSand -- https://freesound.org/s/50668/ -- License: Attribution 3.0
+ Chainsaw in a forest by SPAudiobooks -- https://freesound.org/s/751913/ -- License: Creative Commons 0
+ Chainsaw Crosscutting 2.wav by Benboncan -- https://freesound.org/s/64395/ -- License: Attribution 4.0
+ chainsaw.WAV by stomachache -- https://freesound.org/s/47250/ -- License: Creative Commons 0
+ Exterior_Chainsaw_Idle.wav by DJillcom -- https://freesound.org/s/157610/ -- License: Creative Commons 0
+ Chainsawing.wav by Puniho -- https://freesound.org/s/165856/ -- License: Attribution 3.0
+ Chainsaws (distant) by micadoe -- https://freesound.org/s/170338/ -- License: Creative Commons 0
+ chainsaw.ogg by electrovoice664 -- https://freesound.org/s/75078/ -- License: Sampling+
+ chainsaw.wav by doobit -- https://freesound.org/s/65997/ -- License: Sampling+
+ chainsaw.wav by mr101986 -- https://freesound.org/s/94718/ -- License: Creative Commons 0
+ Chainsaw - Tree cases.WAV by Ohrwurm -- https://freesound.org/s/68391/ -- License: Creative Commons 0
+ Strimmers And Chainsaw 2.wav by Benboncan -- https://freesound.org/s/81908/ -- License: Attribution 4.0
+ chainsaw.wav by UncleSigmund -- https://freesound.org/s/116765/ -- License: Attribution 4.0
+ chainsaw and little tree.wav by Kyster -- https://freesound.org/s/118657/ -- License: Attribution 4.0
+ chainsaw vs chestnut tree.wav by Kyster -- https://freesound.org/s/118658/ -- License: Attribution 4.0
+ ChainsawCutting_Distant_4824.wav by pblzr -- https://freesound.org/s/512876/ -- License: Creative Commons 0
+ chainsaw felling tree by matt_beer -- https://freesound.org/s/515296/ -- License: Creative Commons 0
+ chainsaw by matt_beer -- https://freesound.org/s/515303/ -- License: Creative Commons 0
+ starting chainsaw 2 by matt_beer -- https://freesound.org/s/515306/ -- License: Creative Commons 0
+ starting chainsaw 1 by matt_beer -- https://freesound.org/s/515307/ -- License: Creative Commons 0
+ Chainsaw by AugustSandberg -- https://freesound.org/s/508846/ -- License: Creative Commons 0
+ Chainsaw cutting by AugustSandberg -- https://freesound.org/s/508847/ -- License: Creative Commons 0
+ 190155_Chainsaw.wav by GaelanW -- https://freesound.org/s/490073/ -- License: Attribution 3.0
+ Chainsaw Cut Slow by DrinkingWindGames -- https://freesound.org/s/463729/ -- License: Attribution 4.0
+ 04-1 Chainsaw.wav by domiscz -- https://freesound.org/s/461734/ -- License: Creative Commons 0
+ CHAINSAW - 1 by SamuelGremaud -- https://freesound.org/s/463207/ -- License: Creative Commons 0
+ Chainsaws - Wood Carving by Ev-Dawg -- https://freesound.org/s/360619/ -- License: Creative Commons 0
+ Chainsaw by Hard3eat -- https://freesound.org/s/351775/ -- License: Creative Commons 0
+ Chainsaw_4.WAV by ivolipa -- https://freesound.org/s/345992/ -- License: Creative Commons 0
+ Chainsaw_3.WAV by ivolipa -- https://freesound.org/s/345993/ -- License: Creative Commons 0
+ Chainsaw_2.WAV by ivolipa -- https://freesound.org/s/345994/ -- License: Creative Commons 0
+ Distant Chainsaw 2.mp3 by FunWithSound -- https://freesound.org/s/390741/ -- License: Creative Commons 0
+ Distant Chainsaw 1.mp3 by FunWithSound -- https://freesound.org/s/390742/ -- License: Creative Commons 0
+ chainsaw_start.wav by Jedo -- https://freesound.org/s/396463/ -- License: Creative Commons 0
+ Chainsaw_cutting trees.wav by Jedo -- https://freesound.org/s/396464/ -- License: Creative Commons 0
+ Chainsaw gasoline-powered.wav by aoristos -- https://freesound.org/s/235795/ -- License: Creative Commons 0
+ Chainsaw by SPAudiobooks -- https://freesound.org/s/751912/ -- License: Creative Commons 0
+ Chainsaw Rampage in the Forest by unfa -- https://freesound.org/s/165823/ -- License: Creative Commons 0
+ S7 SIERRA.mp3 by AHTepsilon -- https://freesound.org/s/531550/ -- License: Creative Commons 0
+ Chainsaw trimming trees with street noise by Morphic__ -- https://freesound.org/s/541277/ -- License: Attribution 4.0
+ chainsaw_Husqvarna_385XPG.wav by theTone -- https://freesound.org/s/77945/ -- License: Attribution 4.0
+ Chainsaw Stihl MSA 200 C by druki -- https://freesound.org/s/595765/ -- License: Creative Commons 0
+ Chainsaw Stihl MS 170 cutting wood by druki -- https://freesound.org/s/595777/ -- License: Creative Commons 0
+ Chainsaw_1.WAV by ivolipa -- https://freesound.org/s/345995/ -- License: Creative Commons 0
+ chainsaw.m4a by Chilsville -- https://freesound.org/s/570263/ -- License: Attribution NonCommercial 3.0
+ Distant Chainsaw 1.mp3 by FunWithSound -- https://freesound.org/s/390742/ -- License: Creative Commons 0
+ Chainsaw in a forest by SPAudiobooks -- https://freesound.org/s/751913/ -- License: Creative Commons 0
+ Chainsaw by SPAudiobooks -- https://freesound.org/s/751912/ -- License: Creative Commons 0
+ Chainsawing.wav by Puniho -- https://freesound.org/s/165856/ -- License: Attribution 3.0
+ D209 Chainsaw in the Wood.WAV by billcutbill -- https://freesound.org/s/669382/ -- License: Attribution 4.0
+ Chain saw 3.mp3 by 5ound5murf23 -- https://freesound.org/s/523432/ -- License: Creative Commons 0
+ Chain saw 1.mp3 by 5ound5murf23 -- https://freesound.org/s/523434/ -- License: Creative Commons 0
+ Chain saw 2.mp3 by 5ound5murf23 -- https://freesound.org/s/523433/ -- License: Creative Commons 0
+ Chainsaw Starting by kangaroovindaloo -- https://freesound.org/s/520207/ -- License: Creative Commons 0
+ distant chainsaw by matt_beer -- https://freesound.org/s/515301/ -- License: Creative Commons 0
+ distant chainsaw by matt_beer -- https://freesound.org/s/515302/ -- License: Creative Commons 0
+ Chainsaw sounds deep in the forest by etienne.leplumey -- https://freesound.org/s/553502/ -- License: Attribution 4.0
tasks/datasources/freesound_environment.txt ADDED
@@ -0,0 +1,37 @@
+ forest summer Roond 018 200619_0186.wav by klankbeeld -- https://freesound.org/s/529758/ -- License: Attribution 4.0
+ Kampina forest spring010 190322_1321.wav by klankbeeld -- https://freesound.org/s/564004/ -- License: Attribution 4.0
+ Jogger in the forest by Cinetony -- https://freesound.org/s/559956/ -- License: Creative Commons 0
+ Kampina forest spring010 190322_1321.wav by klankbeeld -- https://freesound.org/s/564004/ -- License: Attribution 4.0
+ forest car passby 04 200619_0186.wav by klankbeeld -- https://freesound.org/s/613735/ -- License: Attribution 4.0
+ summer forest NL EU 1207 PM 220617_0405 by klankbeeld -- https://freesound.org/s/725797/ -- License: Attribution 4.0
+ forest spring NL EU 1114AM 220617_0400.wav by klankbeeld -- https://freesound.org/s/650572/ -- License: Attribution 4.0
+ 20080528.forest.wind.serins.flac by dobroide -- https://freesound.org/s/54744/ -- License: Attribution 4.0
+ little windy forest.wav by Kyster -- https://freesound.org/s/99281/ -- License: Attribution 4.0
+ Border ForestFarmfield 808AM NL EU 220515_0345.wav by klankbeeld -- https://freesound.org/s/671325/ -- License: Attribution 4.0
+ Autumn Forest Wind by Akacie -- https://freesound.org/s/73719/ -- License: Attribution NonCommercial 4.0
+ forest in the Netherlands 320 PM 230328_572 by klankbeeld -- https://freesound.org/s/730223/ -- License: Attribution 4.0
+ Berlin Grunewald Forest 3 - bells in distance.wav by dbspin -- https://freesound.org/s/396664/ -- License: Creative Commons 0
+ grassland forest spring NL 1127 AM 240531_0730 by klankbeeld -- https://freesound.org/s/738008/ -- License: Attribution 4.0
+ AMBForst_Edge Of The Forest.Byrds.Distant Road_EM_(Eq,OOsprd,Voice accent).wav by newlocknew -- https://freesound.org/s/640001/ -- License: Attribution NonCommercial 4.0
+ pine forest Kampina NL 04 190908_0072.wav by klankbeeld -- https://freesound.org/s/485972/ -- License: Attribution 4.0
+ small-forest-stream-in-mountains-surround-sound-rear by CRAFTCREST.com -- https://freesound.org/s/204903/ -- License: Attribution 4.0
+ water_forest_stream_00_l.wav by teadrinker -- https://freesound.org/s/403049/ -- License: Creative Commons 0
+ park forest 1020AM 230223_0562 by klankbeeld -- https://freesound.org/s/690705/ -- License: Attribution 4.0
+ Footsteps in Forest - 01.mp3 by Gutek -- https://freesound.org/s/201885/ -- License: Creative Commons 0
+ Forest dry leaves walk. .wav by rempen -- https://freesound.org/s/274833/ -- License: Creative Commons 0
+ Tiny trickling forest creek (loopable) by Mjeno -- https://freesound.org/s/405138/ -- License: Creative Commons 0
+ Spring Forest Ambience 1 - Hyby Fælled, Denmark by mugwood -- https://freesound.org/s/682850/ -- License: Attribution 4.0
+ New England Forest Daytime Ambience by Bmisiewicz -- https://freesound.org/s/698307/ -- License: Attribution 4.0
+ forest_cicada_loop.flac by Nimlos -- https://freesound.org/s/422048/ -- License: Creative Commons 0
+ Forest.mp3 by JayHu -- https://freesound.org/s/506103/ -- License: Attribution 3.0
+ wind in the forest.WAV by inchadney -- https://freesound.org/s/260161/ -- License: Attribution NonCommercial 4.0
+ Walk through Alice Holt Forest by Peter Batchelor by sensingtheforest -- https://freesound.org/s/730996/ -- License: Attribution NonCommercial 4.0
+ Forest rain.m4a by FreqWincy -- https://freesound.org/s/707401/ -- License: Attribution NonCommercial 4.0
+ Rain on window (interior) by xkeril -- https://freesound.org/s/669486/ -- License: Creative Commons 0
+ Rain Slowly Passing SIDE ONLY_Edgewater_06192020.mp3 by speakwithanimals -- https://freesound.org/s/525044/ -- License: Creative Commons 0
+ Street Scene - After The Rain - Moderate Traffic & Wet Road by FSFA -- https://freesound.org/s/593122/ -- License: Attribution 3.0
+ Midnight city rain stereo.wav by itinerantmonk108 -- https://freesound.org/s/573202/ -- License: Creative Commons 0
+ Rain, sheet roof.wav by snarcle -- https://freesound.org/s/569370/ -- License: Attribution 4.0
+ 003 - Rain Outside B.wav by Trashcan_Studios -- https://freesound.org/s/575461/ -- License: Attribution 4.0
+ mostly rain.mp3 by soundman9826 -- https://freesound.org/s/193337/ -- License: Creative Commons 0
37
+
tasks/image.py DELETED
@@ -1,176 +0,0 @@
- from fastapi import APIRouter
- from datetime import datetime
- from datasets import load_dataset
- import numpy as np
- from sklearn.metrics import accuracy_score, precision_score, recall_score
- import random
- import os
- 
- from .utils.evaluation import ImageEvaluationRequest
- from .utils.emissions import tracker, clean_emissions_data, get_space_info
- 
- from dotenv import load_dotenv
- load_dotenv()
- 
- router = APIRouter()
- 
- DESCRIPTION = "Random Baseline"
- ROUTE = "/image"
- 
- def parse_boxes(annotation_string):
-     """Parse multiple boxes from a single annotation string.
-     Each box has 5 values: class_id, x_center, y_center, width, height"""
-     values = [float(x) for x in annotation_string.strip().split()]
-     boxes = []
-     # Each box has 5 values
-     for i in range(0, len(values), 5):
-         if i + 5 <= len(values):
-             # Skip class_id (first value) and take the next 4 values
-             box = values[i+1:i+5]
-             boxes.append(box)
-     return boxes
- 
- def compute_iou(box1, box2):
-     """Compute Intersection over Union (IoU) between two YOLO format boxes."""
-     # Convert YOLO format (x_center, y_center, width, height) to corners
-     def yolo_to_corners(box):
-         x_center, y_center, width, height = box
-         x1 = x_center - width/2
-         y1 = y_center - height/2
-         x2 = x_center + width/2
-         y2 = y_center + height/2
-         return np.array([x1, y1, x2, y2])
- 
-     box1_corners = yolo_to_corners(box1)
-     box2_corners = yolo_to_corners(box2)
- 
-     # Calculate intersection
-     x1 = max(box1_corners[0], box2_corners[0])
-     y1 = max(box1_corners[1], box2_corners[1])
-     x2 = min(box1_corners[2], box2_corners[2])
-     y2 = min(box1_corners[3], box2_corners[3])
- 
-     intersection = max(0, x2 - x1) * max(0, y2 - y1)
- 
-     # Calculate union
-     box1_area = (box1_corners[2] - box1_corners[0]) * (box1_corners[3] - box1_corners[1])
-     box2_area = (box2_corners[2] - box2_corners[0]) * (box2_corners[3] - box2_corners[1])
-     union = box1_area + box2_area - intersection
- 
-     return intersection / (union + 1e-6)
- 
- def compute_max_iou(true_boxes, pred_box):
-     """Compute maximum IoU between a predicted box and all true boxes"""
-     max_iou = 0
-     for true_box in true_boxes:
-         iou = compute_iou(true_box, pred_box)
-         max_iou = max(max_iou, iou)
-     return max_iou
- 
- @router.post(ROUTE, tags=["Image Task"],
-              description=DESCRIPTION)
- async def evaluate_image(request: ImageEvaluationRequest):
-     """
-     Evaluate image classification and object detection for forest fire smoke.
- 
-     Current Model: Random Baseline
-     - Makes random predictions for both classification and bounding boxes
-     - Used as a baseline for comparison
- 
-     Metrics:
-     - Classification accuracy: Whether an image contains smoke or not
-     - Object Detection accuracy: IoU (Intersection over Union) for smoke bounding boxes
-     """
-     # Get space info
-     username, space_url = get_space_info()
- 
-     # Load and prepare the dataset
-     dataset = load_dataset(request.dataset_name, token=os.getenv("HF_TOKEN"))
- 
-     # Split dataset
-     train_test = dataset["train"]
-     test_dataset = dataset["val"]
- 
-     # Start tracking emissions
-     tracker.start()
-     tracker.start_task("inference")
- 
-     #--------------------------------------------------------------------------------------------
-     # YOUR MODEL INFERENCE CODE HERE
-     # Update the code below to replace the random baseline with your model inference
-     #--------------------------------------------------------------------------------------------
- 
-     predictions = []
-     true_labels = []
-     pred_boxes = []
-     true_boxes_list = []  # List of lists, each inner list contains boxes for one image
- 
-     for example in test_dataset:
-         # Parse true annotation (YOLO format: class_id x_center y_center width height)
-         annotation = example.get("annotations", "").strip()
-         has_smoke = len(annotation) > 0
-         true_labels.append(int(has_smoke))
- 
-         # Make random classification prediction
-         pred_has_smoke = random.random() > 0.5
-         predictions.append(int(pred_has_smoke))
- 
-         # If there's a true box, parse it and make random box prediction
-         if has_smoke:
-             # Parse all true boxes from the annotation
-             image_true_boxes = parse_boxes(annotation)
-             true_boxes_list.append(image_true_boxes)
- 
-             # For baseline, make one random box prediction per image
-             # In a real model, you might want to predict multiple boxes
-             random_box = [
-                 random.random(),        # x_center
-                 random.random(),        # y_center
-                 random.random() * 0.5,  # width (max 0.5)
-                 random.random() * 0.5   # height (max 0.5)
-             ]
-             pred_boxes.append(random_box)
- 
-     #--------------------------------------------------------------------------------------------
-     # YOUR MODEL INFERENCE STOPS HERE
-     #--------------------------------------------------------------------------------------------
- 
-     # Stop tracking emissions
-     emissions_data = tracker.stop_task()
- 
-     # Calculate classification metrics
-     classification_accuracy = accuracy_score(true_labels, predictions)
-     classification_precision = precision_score(true_labels, predictions)
-     classification_recall = recall_score(true_labels, predictions)
- 
-     # Calculate mean IoU for object detection (only for images with smoke)
-     # For each image, we compute the max IoU between the predicted box and all true boxes
-     ious = []
-     for true_boxes, pred_box in zip(true_boxes_list, pred_boxes):
-         max_iou = compute_max_iou(true_boxes, pred_box)
-         ious.append(max_iou)
- 
-     mean_iou = float(np.mean(ious)) if ious else 0.0
- 
-     # Prepare results dictionary
-     results = {
-         "username": username,
-         "space_url": space_url,
-         "submission_timestamp": datetime.now().isoformat(),
-         "model_description": DESCRIPTION,
-         "classification_accuracy": float(classification_accuracy),
-         "classification_precision": float(classification_precision),
-         "classification_recall": float(classification_recall),
-         "mean_iou": mean_iou,
-         "energy_consumed_wh": emissions_data.energy_consumed * 1000,
-         "emissions_gco2eq": emissions_data.emissions * 1000,
-         "emissions_data": clean_emissions_data(emissions_data),
-         "api_route": ROUTE,
-         "dataset_config": {
-             "dataset_name": request.dataset_name,
-             "test_size": request.test_size,
-             "test_seed": request.test_seed
-         }
-     }
- 
-     return results
tasks/models/__init__.py ADDED
File without changes
tasks/models/final-bf16.pth ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:16405871852b620f3a0ebd30dd273d45b3eaf91d8b27604b9ec00511480c62df
+ size 10836
tasks/models/model.py ADDED
@@ -0,0 +1,74 @@
+ from torch import ones, split, bfloat16
+ from torch.nn.functional import relu, sigmoid
+ from torch.nn import Module, MaxPool1d, Conv1d, Conv2d, Linear, BatchNorm2d, LSTMCell
+ 
+ class ChunkCNN(Module):
+     def __init__(self):
+         super(ChunkCNN, self).__init__()
+         self.pool = MaxPool1d(kernel_size=2, stride=2)
+         self.conv1 = Conv2d(in_channels=1, out_channels=4, kernel_size=(15,10), stride=(4, 1), padding=(1, 0))
+         self.conv2 = Conv1d(in_channels=4, out_channels=8, kernel_size=9, stride=2, padding=0)
+         self.conv3 = Conv1d(in_channels=8, out_channels=16, kernel_size=2, stride=2, padding=1)
+         self.collapse = Conv1d(in_channels=16, out_channels=1, kernel_size=1, stride=1, padding=0)
+ 
+     def forward(self, x):
+         x = self.conv1(x).squeeze()
+         x = relu(x)
+         x = self.pool(x)
+ 
+         x = self.conv2(x)
+         x = relu(x)
+         x = self.pool(x)
+ 
+         x = self.conv3(x)
+         x = relu(x)
+ 
+         x = self.collapse(x).squeeze()
+         x = relu(x)
+ 
+         return x
+ 
+ 
+ class LastLayer(Module):
+     def __init__(self, inputsize):
+         super(LastLayer, self).__init__()
+         self.dense1 = Linear(inputsize, 3)
+         self.dense2 = Linear(3, 1)
+ 
+     def forward(self, x):
+         x = self.dense1(x)
+         x = relu(x)
+         x = self.dense2(x)
+         x = sigmoid(x).squeeze()
+         return x
+ 
+ 
+ class ChainsawDetector(Module):
+     def __init__(self, batch_size):
+         super(ChainsawDetector, self).__init__()
+         self.batch_size = batch_size
+         self.nb_lstm = 8
+         self.batchnorm = BatchNorm2d(1)
+         self.chunkcnn = ChunkCNN()
+         self.lstmcell = LSTMCell(self.nb_lstm, self.nb_lstm)
+         self.lastlayer = LastLayer(self.nb_lstm)
+         self.initstate = ones((batch_size, self.nb_lstm), dtype=bfloat16)  # default class is 1: environment
+ 
+     def forward(self, x):
+         hx = self.initstate.detach().clone()
+         cx = self.initstate.detach().clone()
+ 
+         x = x[:, None, :, :]
+         x = self.batchnorm(x)
+ 
+         for chunk in split(x, 10, dim=3):
+             xi = self.chunkcnn(chunk)
+             hx, cx = self.lstmcell(xi, (hx, cx))
+ 
+         x = self.lastlayer(hx)
+ 
+         # final decision
+         x = (x > 0.5).bfloat16()
+         return x
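A minimal shape check for ChainsawDetector (a sketch, not part of the commit; it assumes the spectrogram layout produced by tasks/utils/preprocess.py added later in this commit — 513 one-sided bins for n_fft=1024 and 60 PCEN frames for a 3 s clip at 4 kHz, consumed as six 10-frame chunks — and a PyTorch build with bfloat16 CPU kernels):

import torch
from tasks.models.model import ChainsawDetector

batch_size = 16
detector = ChainsawDetector(batch_size).to(dtype=torch.bfloat16)
detector.eval()

# dummy batch shaped like prepare_batch output: [batch, freq_bins, time_frames]
x = torch.randn(batch_size, 513, 60, dtype=torch.bfloat16)
with torch.no_grad():
    y = detector(x)   # hard 0/1 decisions after the sigmoid threshold
print(y.shape)        # torch.Size([16])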
tasks/text.py DELETED
@@ -1,92 +0,0 @@
- from fastapi import APIRouter
- from datetime import datetime
- from datasets import load_dataset
- from sklearn.metrics import accuracy_score
- import random
- 
- from .utils.evaluation import TextEvaluationRequest
- from .utils.emissions import tracker, clean_emissions_data, get_space_info
- 
- router = APIRouter()
- 
- DESCRIPTION = "Random Baseline"
- ROUTE = "/text"
- 
- @router.post(ROUTE, tags=["Text Task"],
-              description=DESCRIPTION)
- async def evaluate_text(request: TextEvaluationRequest):
-     """
-     Evaluate text classification for climate disinformation detection.
- 
-     Current Model: Random Baseline
-     - Makes random predictions from the label space (0-7)
-     - Used as a baseline for comparison
-     """
-     # Get space info
-     username, space_url = get_space_info()
- 
-     # Define the label mapping
-     LABEL_MAPPING = {
-         "0_not_relevant": 0,
-         "1_not_happening": 1,
-         "2_not_human": 2,
-         "3_not_bad": 3,
-         "4_solutions_harmful_unnecessary": 4,
-         "5_science_unreliable": 5,
-         "6_proponents_biased": 6,
-         "7_fossil_fuels_needed": 7
-     }
- 
-     # Load and prepare the dataset
-     dataset = load_dataset(request.dataset_name)
- 
-     # Convert string labels to integers
-     dataset = dataset.map(lambda x: {"label": LABEL_MAPPING[x["label"]]})
- 
-     # Split dataset
-     train_test = dataset["train"]
-     test_dataset = dataset["test"]
- 
-     # Start tracking emissions
-     tracker.start()
-     tracker.start_task("inference")
- 
-     #--------------------------------------------------------------------------------------------
-     # YOUR MODEL INFERENCE CODE HERE
-     # Update the code below to replace the random baseline by your model inference within the inference pass where the energy consumption and emissions are tracked.
-     #--------------------------------------------------------------------------------------------
- 
-     # Make random predictions (placeholder for actual model inference)
-     true_labels = test_dataset["label"]
-     predictions = [random.randint(0, 7) for _ in range(len(true_labels))]
- 
-     #--------------------------------------------------------------------------------------------
-     # YOUR MODEL INFERENCE STOPS HERE
-     #--------------------------------------------------------------------------------------------
- 
-     # Stop tracking emissions
-     emissions_data = tracker.stop_task()
- 
-     # Calculate accuracy
-     accuracy = accuracy_score(true_labels, predictions)
- 
-     # Prepare results dictionary
-     results = {
-         "username": username,
-         "space_url": space_url,
-         "submission_timestamp": datetime.now().isoformat(),
-         "model_description": DESCRIPTION,
-         "accuracy": float(accuracy),
-         "energy_consumed_wh": emissions_data.energy_consumed * 1000,
-         "emissions_gco2eq": emissions_data.emissions * 1000,
-         "emissions_data": clean_emissions_data(emissions_data),
-         "api_route": ROUTE,
-         "dataset_config": {
-             "dataset_name": request.dataset_name,
-             "test_size": request.test_size,
-             "test_seed": request.test_seed
-         }
-     }
- 
-     return results
tasks/utils/evaluation.py CHANGED
@@ -1,18 +1,9 @@
- from typing import Optional
  from pydantic import BaseModel, Field
  
  class BaseEvaluationRequest(BaseModel):
      test_size: float = Field(0.2, ge=0.0, le=1.0, description="Size of the test split (between 0 and 1)")
      test_seed: int = Field(42, ge=0, description="Random seed for reproducibility")
  
- class TextEvaluationRequest(BaseEvaluationRequest):
-     dataset_name: str = Field("QuotaClimat/frugalaichallenge-text-train",
-                               description="The name of the dataset on HuggingFace Hub")
- 
- class ImageEvaluationRequest(BaseEvaluationRequest):
-     dataset_name: str = Field("pyronear/pyro-sdis",
-                               description="The name of the dataset on HuggingFace Hub")
- 
  class AudioEvaluationRequest(BaseEvaluationRequest):
      dataset_name: str = Field("rfcx/frugalai",
                                description="The name of the dataset on HuggingFace Hub")
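With the text and image request models removed, only the audio task remains; a quick usage sketch with the field defaults defined above:

from tasks.utils.evaluation import AudioEvaluationRequest

req = AudioEvaluationRequest()  # every field has a default
print(req.dataset_name, req.test_size, req.test_seed)  # rfcx/frugalai 0.2 42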
tasks/utils/preprocess.py ADDED
@@ -0,0 +1,99 @@
+ from torch.utils.data import DataLoader
+ import librosa
+ from math import floor
+ import torch
+ from torch.nn.functional import pad
+ from torchaudio.transforms import Resample
+ #from random import randint
+ 
+ 
+ def get_dataloader(dataset, device, batch_size=16, shuffle=True):
+     return DataLoader(
+         dataset.with_format("torch", device=device),
+         batch_size=batch_size,
+         collate_fn=prepare_batch,
+         num_workers=4,
+         shuffle=shuffle,
+         drop_last=True,
+         persistent_workers=True,
+     )
+ 
+ def resample(x, sr, newsr):
+     transform = Resample(
+         orig_freq=sr,
+         new_freq=newsr,
+         resampling_method="sinc_interp_kaiser",
+         lowpass_filter_width=16,
+         rolloff=0.85,
+         beta=8.555504641634386,
+     )
+     return transform(x)
+ 
+ def fixlength(x, L):
+     x = x[:L]
+     x = pad(x, (0, L-len(x)))
+     return x
+ 
+ def preprocess(X, newsr, n_fft, win_length, hop_length, gain=0.8, bias=10, power=0.25):
+     X = torch.stft(X, n_fft, hop_length=hop_length, win_length=win_length, window=torch.hann_window(win_length), onesided=True, return_complex=True)
+     X = torch.abs(X)
+     X = torch.stack([torch.from_numpy(librosa.pcen(x.numpy(), sr=newsr, hop_length=hop_length, gain=gain, bias=bias, power=power))
+                      for x in X], 0)
+     X = X.to(torch.bfloat16)
+     return X
+ 
+ 
+ def prepare_batch(samples):
+     #maxlen=60
+     newsr = 4000
+     n_fft = 2**10  # power of 2
+     win_length = 2**10
+     hop_length = floor(0.0505*newsr)
+     labels = []
+     signals = []
+     for sample in samples:
+         labels.append(sample['label'])
+         sr = sample['audio']['sampling_rate']
+         x = sample['audio']['array']
+         if (sr > newsr and len(x)!=0):
+             x = resample(x, sr, newsr)
+         x = fixlength(x, 3*newsr)
+         signals.append(x)
+ 
+     signals = torch.stack(signals, 0)
+     batch = preprocess(signals, newsr, n_fft, win_length, hop_length)
+     labels = torch.tensor(labels, dtype=float)
+     return batch, labels
+ 
+ # def random_mask(sample):
+ #     # random rectangular mask
+ #     B, H, W = sample.shape
+ #     for b in range(B):
+ #         for _ in range(randint(3,12)):
+ #             w = randint(5, 15)
+ #             h = randint(10, 100)
+ #             x1 = randint(0, W-w)
+ #             y1 = randint(0, H-h)
+ #             sample[b, y1:y1+h, x1:x1+w] = 0
+ #     return sample
+ 
+ # def timeshift(sample):
+ #     padsize = randint(0, 6)
+ #     length = sample.size(2)
+ #     randpad = torch.zeros((sample.size(0), sample.size(1), padsize), dtype=torch.float32)
+ #     sample = torch.cat((randpad, sample), dim=2)
+ #     sample = sample[:,:,:length]
+ #     return sample
+ 
+ # def add_noise(sample):
+ #     #noise = np.random.normal(0, 0.05*sample.max(), sample.shape)
+ #     noise = 0.05*sample.max()*torch.randn(sample.shape, dtype=torch.float32)
+ #     sample = sample + noise
+ #     return sample
+ 
+ # def augment(sample):
+ #     sample = timeshift(sample)
+ #     sample = random_mask(sample)
+ #     sample = add_noise(sample)
+ #     return sample
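The STFT/PCEN settings above pin down the spectrogram shape: hop_length = floor(0.0505 * 4000) = 202, so a 3 s clip (12000 samples) yields 1 + floor(12000 / 202) = 60 centered frames, and the one-sided 1024-point FFT yields 513 bins. A standalone sketch of that check (my addition, not part of the commit):

import torch
from math import floor
from tasks.utils.preprocess import preprocess

newsr = 4000
hop_length = floor(0.0505 * newsr)  # 202
x = torch.randn(2, 3 * newsr)       # two 3 s mono clips at 4 kHz
S = preprocess(x, newsr, n_fft=2**10, win_length=2**10, hop_length=hop_length)
print(S.shape, S.dtype)             # torch.Size([2, 513, 60]) torch.bfloat16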
training/curate.ipynb ADDED
@@ -0,0 +1,202 @@
+ {
+  "cells": [
+   {
+    "cell_type": "code",
+    "execution_count": 1,
+    "metadata": {},
+    "outputs": [],
+    "source": [
+     "import librosa\n",
+     "import soundfile as sf\n",
+     "import numpy as np\n",
+     "from os import listdir\n",
+     "from os.path import isfile, join\n",
+     "from math import floor\n",
+     "import IPython.display as ipd"
+    ]
+   },
+   {
+    "cell_type": "markdown",
+    "metadata": {},
+    "source": [
+     "Data from [ESC-50](https://github.com/karolpiczak/ESC-50) \n",
+     "And [freesound.org](https://freesound.org)"
+    ]
+   },
+   {
+    "cell_type": "code",
+    "execution_count": 2,
+    "metadata": {},
+    "outputs": [],
+    "source": [
+     "from scipy.signal import butter, lfilter\n",
+     "\n",
+     "def apply_lowpass_filter(x, sr):\n",
+     "    order = 10\n",
+     "    cutoff = 2000\n",
+     "    b, a = butter(order, cutoff, fs=sr, btype='low', analog=False)\n",
+     "    y = lfilter(b, a, x)\n",
+     "    return y"
+    ]
+   },
+   {
+    "cell_type": "code",
+    "execution_count": 3,
+    "metadata": {},
+    "outputs": [],
+    "source": [
+     "def downsample(x, sr, newsr):\n",
+     "    return x[::floor(sr/newsr)]"
+    ]
+   },
+   {
+    "cell_type": "code",
+    "execution_count": 4,
+    "metadata": {},
+    "outputs": [],
+    "source": [
+     "def play(x, sr):\n",
+     "    ipd.display(ipd.Audio(data=x, rate=sr))"
+    ]
+   },
+   {
+    "cell_type": "code",
+    "execution_count": 11,
+    "metadata": {},
+    "outputs": [],
+    "source": [
+     "# crop a single file (if it ends with silence)\n",
+     "# file=\"751913__spaudiobooks__chainsaw-in-a-forest\"\n",
+     "# ext=\".wav\"\n",
+     "# data, sr = librosa.load(dirpath+file+ext)\n",
+     "# data = data[:sr*(60+31)]\n",
+     "# sf.write(dirpath+file+\".wav\", data, sr)"
+    ]
+   },
+   {
+    "cell_type": "code",
+    "execution_count": null,
+    "metadata": {},
+    "outputs": [],
+    "source": [
+     "# read, filter, downsample, chunk by 3s length, write wav\n",
+     "newsr=4000\n",
+     "c=2550\n",
+     "dirpath = \"../datasets/freesound/chainsaw/audio/long/\"\n",
+     "for file in listdir(dirpath):\n",
+     "    if isfile(join(dirpath, file)):\n",
+     "        print(file)\n",
+     "        data, sr = librosa.load(dirpath+file)\n",
+     "        #play(data, sr)\n",
+     "        data = apply_lowpass_filter(data, sr)\n",
+     "        data = downsample(data, sr, newsr)\n",
+     "        cutpoints = list(range(3*newsr,len(data),3*newsr))\n",
+     "        all_data = np.split(data, cutpoints)\n",
+     "        for d in all_data:\n",
+     "            if (len(d) > 1024):\n",
+     "                sf.write(dirpath+f'curated/{c}.wav', d, 4000)\n",
+     "                c+=1\n"
+    ]
+   },
+   {
+    "cell_type": "code",
+    "execution_count": 7,
+    "metadata": {},
+    "outputs": [],
+    "source": [
+     "# detect too short files\n",
+     "# dirpath = \"../datasets/freesound/environment/audio/curated/\"\n",
+     "# for file in listdir(dirpath):\n",
+     "#     if isfile(join(dirpath, file)):\n",
+     "#         data, sr = librosa.load(dirpath+file)\n",
+     "#         if (len(data)<=1024):\n",
+     "#             print(file, len(data))"
+    ]
+   },
+   {
+    "cell_type": "code",
+    "execution_count": null,
+    "metadata": {},
+    "outputs": [],
+    "source": [
+     "# ESC-50\n",
+     "# attenuate, mix with background\n",
+     "from random import randint, uniform\n",
+     "\n",
+     "c=0\n",
+     "newsr=4000\n",
+     "dirpath = \"../datasets/freesound/chainsaw/audio/\"\n",
+     "envdir = \"../datasets/freesound/environment/audio/\"\n",
+     "envfiles = [file for file in listdir(envdir) if isfile(join(envdir, file))]\n",
+     "for file in listdir(dirpath):\n",
+     "    if isfile(join(dirpath, file)):\n",
+     "        print(file)\n",
+     "        data, sr = librosa.load(dirpath+file)\n",
+     "        #play(data, sr)\n",
+     "        lastindexes=[]\n",
+     "        for i in range(3):\n",
+     "            index = randint(0, len(envfiles)-1)\n",
+     "            while (index in lastindexes):\n",
+     "                index = randint(0, len(envfiles)-1)\n",
+     "            lastindexes.append(index)\n",
+     "            addfile = envfiles[index]\n",
+     "            data2, sr2 = librosa.load(envdir+addfile)\n",
+     "            data1 = apply_lowpass_filter(data, sr)\n",
+     "            data2 = apply_lowpass_filter(data2, sr2)\n",
+     "            data1 = downsample(data1, sr, newsr)\n",
+     "            data2 = downsample(data2, sr2, newsr)\n",
+     "            attenuation = round(uniform(0.2, 0.5), 2)\n",
+     "            data1 = (data1 * attenuation + data2 *(1-attenuation))/2\n",
+     "            all_data = np.split(data1, [round(len(data1)/2)])\n",
+     "            for d in all_data:\n",
+     "                sf.write(dirpath+f'test/mix-{c}.wav', d, 4000)\n",
+     "                c+=1"
+    ]
+   },
+   {
+    "cell_type": "code",
+    "execution_count": 64,
+    "metadata": {},
+    "outputs": [],
+    "source": [
+     "# environment audio from ESC-50, filter, downsample and half the files (they are 5 sec long)\n",
+     "newsr=4000\n",
+     "c=2649\n",
+     "dirpath = \"../datasets/freesound/environment/audio/\"\n",
+     "for file in listdir(dirpath):\n",
+     "    if isfile(join(dirpath, file)):\n",
+     "        data, sr = librosa.load(dirpath+file)\n",
+     "        data = apply_lowpass_filter(data, sr)\n",
+     "        data = downsample(data, sr, newsr)\n",
+     "        all_data = np.split(data, [round(len(data)/2)])\n",
+     "        for d in all_data:\n",
+     "            # random time shift\n",
+     "            rand_zeros = np.zeros(randint(0, 1900))\n",
+     "            d = np.append(rand_zeros, d)\n",
+     "            sf.write(dirpath+f'curated/e-{c}.wav', d, 4000)\n",
+     "            c+=1"
+    ]
+   }
+  ],
+  "metadata": {
+   "kernelspec": {
+    "display_name": "audio-processing",
+    "language": "python",
+    "name": "python3"
+   },
+   "language_info": {
+    "codemirror_mode": {
+     "name": "ipython",
+     "version": 3
+    },
+    "file_extension": ".py",
+    "mimetype": "text/x-python",
+    "name": "python",
+    "nbconvert_exporter": "python",
+    "pygments_lexer": "ipython3",
+    "version": "3.11.5"
+   }
+  },
+  "nbformat": 4,
+  "nbformat_minor": 2
+ }
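The notebook's chunking step cuts each filtered, decimated recording into 3 s pieces (12000 samples at 4 kHz) and keeps any remainder longer than 1024 samples; a standalone sketch (my addition) of the np.split arithmetic:

import numpy as np

newsr = 4000
data = np.zeros(10 * newsr)  # a 10 s clip at 4 kHz
cutpoints = list(range(3 * newsr, len(data), 3 * newsr))  # [12000, 24000, 36000]
chunks = np.split(data, cutpoints)
print([len(c) / newsr for c in chunks])  # [3.0, 3.0, 3.0, 1.0] seconds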
training/dataset.py ADDED
@@ -0,0 +1,26 @@
+ from torch.utils.data import Dataset as TorchDataset
+ import pandas as pd
+ import torchaudio
+ import torch
+ 
+ class ChainsawDataset(TorchDataset):
+ 
+     def __init__(self):
+         self.path="../datasets/freesound/"
+         self.ds = pd.read_csv(self.path+"labels.csv")
+ 
+     def __getitem__(self, index):
+         file, label = self.ds.iloc[index]
+         x, sr = torchaudio.load(self.path+file)
+         x = x.squeeze()
+         return {
+             'audio': {
+                 'path': file,
+                 'array': x,
+                 'sampling_rate': torch.tensor(sr),
+             },
+             'label': torch.tensor(label)
+         }
+ 
+     def __len__(self):
+         return len(self.ds)
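A usage sketch for ChainsawDataset (my addition; it assumes ../datasets/freesound/labels.csv exists with exactly two columns, file and label, and that the referenced wav files are present):

from dataset import ChainsawDataset  # run from within training/

ds = ChainsawDataset()
sample = ds[0]  # one mono clip plus its binary label
print(len(ds), sample['audio']['array'].shape, sample['label'].item())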
training/train.py ADDED
@@ -0,0 +1,145 @@
+ import time
+ import torch
+ from torch import optim
+ from torch import nn
+ from torchmetrics.classification import BinaryAccuracy
+ from torch.optim.lr_scheduler import OneCycleLR
+ from torch.amp import autocast
+ import mlflow
+ from tqdm import tqdm
+ import model
+ import preprocess
+ import dataset
+ 
+ # MLflow server
+ mlflow.set_tracking_uri(uri="http://localhost:8080")
+ mlflow.set_experiment("Optimizations")
+ 
+ start_time = time.time()
+ 
+ # batch  best lr
+ # 8      1e-3
+ # 16     5e-3
+ # hyperparameters
+ hp = {
+     'batch_size': 16,
+     'learning_rate': 5e-3,
+     'num_epochs': 10,
+ }
+ 
+ device = 'cuda' if torch.cuda.is_available() else 'cpu'
+ 
+ preprocess.hflogin()
+ 
+ # Prepare datasets
+ custom_dataset = dataset.ChainsawDataset()
+ train_dataset = preprocess.get_dataset('train', device)
+ train_dataset = torch.utils.data.ConcatDataset([train_dataset, custom_dataset])
+ val_dataset = preprocess.get_dataset('test', device)
+ train_dataloader = preprocess.get_dataloader(train_dataset, batch_size=hp['batch_size'], shuffle=True)
+ val_dataloader = preprocess.get_dataloader(val_dataset, batch_size=hp['batch_size'], shuffle=False)
+ 
+ # Load model
+ model = model.ChainsawDetector(hp['batch_size']).to(device, dtype=torch.bfloat16)
+ model = torch.compile(model)
+ model.load_state_dict(torch.load('backups/final-bf16.pth', weights_only=True), strict=True)
+ 
+ hp['total_params'] = sum(p.numel() for p in model.parameters())
+ print(f"model ready, {hp['total_params']} parameters")
+ 
+ loss_fn = nn.BCELoss()
+ hp["loss_fn"] = 'BinaryCrossEntropyLoss'
+ optimizer = optim.AdamW(model.parameters(), lr=hp['learning_rate'])
+ 
+ total_iterations = len(train_dataset)
+ steps_per_epoch = total_iterations//hp['batch_size']
+ total_steps = total_iterations*hp['num_epochs']
+ print(f"batch_size = {hp['batch_size']}, num_epochs = {hp['num_epochs']}")
+ print(f'{total_iterations=}, {steps_per_epoch=}, {total_steps=}')
+ lrscheduler = OneCycleLR(optimizer, max_lr=hp['learning_rate'], steps_per_epoch=steps_per_epoch, epochs=hp['num_epochs'])
+ hp["optimizer"] = 'AdamW'
+ metric_fn = BinaryAccuracy(threshold=0.5)
+ 
+ def train(loader, model, loss_fn, metric_fn, optimizer, lrscheduler, epoch):
+     for batch_index, (data, targets) in enumerate(tqdm(loader)):
+         # Move data and targets to the device (GPU/CPU)
+         data = data.to(device, dtype=torch.bfloat16)
+         data = preprocess.augment(data)
+         targets = targets.to(device, dtype=torch.bfloat16)
+ 
+         optimizer.zero_grad()
+         # Forward pass: compute the model output
+         with autocast(device_type=device, dtype=torch.bfloat16):
+             predictions = model(data)
+             loss = loss_fn(predictions, targets)
+ 
+         # Backward pass: compute the gradients
+         loss.backward()
+ 
+         # Optimization step: update the model parameters
+         optimizer.step()
+         lrscheduler.step()
+ 
+         if batch_index % 100 == 0:
+             loss = loss.item()
+             accuracy = metric_fn(predictions, targets)
+             step = batch_index + epoch*steps_per_epoch
+             mlflow.log_metric("lr", lrscheduler.get_last_lr()[0], step=step)
+             mlflow.log_metric("train_loss", f"{loss:2f}", step=step)
+             mlflow.log_metric("train_accuracy", f"{accuracy:2f}", step=step)
+ 
+ def decide(x):
+     return 1 if x>=0.5 else 0
+ 
+ MAE = torch.nn.L1Loss()
+ 
+ def evaluate(loader, model, epoch, loss_fn=loss_fn):
+     num_correct = 0
+     num_samples = 0
+     num_batches = 0
+     loss = 0
+     confidence = 0
+     model.eval()
+ 
+     with torch.no_grad(), autocast(device_type=device):
+         for X, y in loader:
+             X = X.to(device, dtype=torch.bfloat16)
+             y = y.to(device, dtype=torch.bfloat16)
+ 
+             predictions = model(X)
+ 
+             decisions = predictions.detach().clone()
+             decisions.apply_(decide)
+ 
+             confidence += MAE(decisions, predictions)
+             loss += loss_fn(decisions, y).item()
+             num_correct += (decisions == y).sum()  # Count correct predictions
+             num_samples += decisions.size(0)       # Count total samples
+             num_batches += 1
+ 
+     # Calculate metrics
+     accuracy = float(num_correct) / float(num_samples) * 100
+     loss /= num_batches
+     confidence /= num_batches
+     confidence = 1-confidence
+     mlflow.log_metric("val_loss", f"{loss:2f}", step=epoch)
+     mlflow.log_metric("val_accuracy", f"{accuracy:2f}", step=epoch)
+     mlflow.log_metric("val_confidence", f"{confidence:2f}", step=epoch)
+     print(f"Got {num_correct}/{num_samples} with accuracy {accuracy:.2f}% and confidence {confidence:.2f}")
+     model.train()
+ 
+ 
+ with mlflow.start_run() as run:
+     mlflow.log_params(hp)
+     for epoch in range(0, hp['num_epochs']):
+         print(f"Epoch [{epoch+1}/{hp['num_epochs']}]")
+         train(train_dataloader, model, loss_fn, metric_fn, optimizer, lrscheduler, epoch)
+         evaluate(val_dataloader, model, epoch)
+ 
+ model.eval()
+ 
+ elapsed = time.time() - start_time
+ print(f"--- {elapsed:.2f} seconds ---")
+ 
+ torch.save(model.state_dict(), 'backups/name.pth')
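The confidence metric in evaluate() above is 1 - MAE between the hard decisions and the raw sigmoid outputs, i.e. how close the model's probabilities sit to 0 or 1. A small numeric check (my addition, not part of the commit):

import torch

preds = torch.tensor([0.9, 0.2, 0.55])     # sigmoid outputs
decisions = (preds >= 0.5).float()          # tensor([1., 0., 1.])
mae = torch.nn.L1Loss()(decisions, preds)   # mean(0.1, 0.2, 0.45) = 0.25
print(1 - mae)                              # confidence = 0.75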