{ "cells": [ { "attachments": {}, "cell_type": "markdown", "metadata": {}, "source": [ "# Style Transfer with OpenVINO™\n", "\n", "This notebook demonstrates style transfer with OpenVINO, using the Style Transfer Models from [ONNX Model Repository](https://github.com/onnx/models). Specifically, [Fast Neural Style Transfer](https://github.com/onnx/models/tree/master/vision/style_transfer/fast_neural_style) model, which is designed to mix the content of an image with the style of another image.\n", "\n", "![style transfer](https://user-images.githubusercontent.com/109281183/208703143-049f712d-2777-437c-8172-597ef7d53fc3.gif)\n", "\n", "This notebook uses five pre-trained models, for the following styles: Mosaic, Rain Princess, Candy, Udnie and Pointilism. The models are from [ONNX Model Repository](https://github.com/onnx/models) and are based on the research paper [Perceptual Losses for Real-Time Style Transfer and Super-Resolution](https://arxiv.org/abs/1603.08155) along with [Instance Normalization](https://arxiv.org/abs/1607.08022). Final part of this notebook shows live inference results from a webcam. Additionally, you can also upload a video file.\n", "\n", "> **NOTE**: If you have a webcam on your computer, you can see live results streaming in the notebook. If you run the notebook on a server, the webcam will not work but you can run inference, using a video file.\n", "\n", "#### Table of contents:\n", "\n", "- [Preparation](#Preparation)\n", " - [Install requirements](#Install-requirements)\n", " - [Imports](#Imports)\n", "- [The Model](#The-Model)\n", " - [Download the Model](#Download-the-Model)\n", " - [Convert ONNX Model to OpenVINO IR Format](#Convert-ONNX-Model-to-OpenVINO-IR-Format)\n", " - [Load the Model](#Load-the-Model)\n", " - [Preprocess the image](#Preprocess-the-image)\n", " - [Helper function to postprocess the stylized image](#Helper-function-to-postprocess-the-stylized-image)\n", " - [Main Processing Function](#Main-Processing-Function)\n", " - [Run Style Transfer](#Run-Style-Transfer)\n", "- [References](#References)\n", "\n" ] }, { "attachments": {}, "cell_type": "markdown", "metadata": {}, "source": [ "## Preparation\n", "[back to top ⬆️](#Table-of-contents:)\n", "\n", "### Install requirements\n", "[back to top ⬆️](#Table-of-contents:)\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "%pip install -q \"openvino>=2023.1.0\"\n", "%pip install -q opencv-python requests tqdm\n", "\n", "# Fetch `notebook_utils` module\n", "import requests\n", "\n", "r = requests.get(\n", " url=\"https://raw.githubusercontent.com/openvinotoolkit/openvino_notebooks/latest/utils/notebook_utils.py\",\n", ")\n", "\n", "open(\"notebook_utils.py\", \"w\").write(r.text)" ] }, { "attachments": {}, "cell_type": "markdown", "metadata": {}, "source": [ "### Imports\n", "[back to top ⬆️](#Table-of-contents:)\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "pycharm": { "name": "#%%\n" }, "tags": [] }, "outputs": [], "source": [ "import collections\n", "import time\n", "\n", "import cv2\n", "import numpy as np\n", "from pathlib import Path\n", "import ipywidgets as widgets\n", "from IPython.display import display, clear_output, Image\n", "import openvino as ov\n", "\n", "import notebook_utils as utils" ] }, { "attachments": {}, "cell_type": "markdown", "metadata": {}, "source": [ "Select one of the styles below: Mosaic, Rain Princess, Candy, Udnie, and Pointilism to do the style transfer." 
] }, { "cell_type": "code", "execution_count": null, "metadata": { "tags": [] }, "outputs": [], "source": [ "# Option to select different styles using a dropdown\n", "style_dropdown = widgets.Dropdown(\n", " options=[\"MOSAIC\", \"RAIN-PRINCESS\", \"CANDY\", \"UDNIE\", \"POINTILISM\"],\n", " value=\"MOSAIC\", # Set the default value\n", " description=\"Select Style:\",\n", " disabled=False,\n", " style={\"description_width\": \"initial\"}, # Adjust the width as needed\n", ")\n", "\n", "\n", "# Function to handle changes in dropdown and print the selected style\n", "def print_style(change):\n", " if change[\"type\"] == \"change\" and change[\"name\"] == \"value\":\n", " print(f\"Selected style {change['new']}\")\n", "\n", "\n", "# Observe changes in the dropdown value\n", "style_dropdown.observe(print_style, names=\"value\")\n", "\n", "# Display the dropdown\n", "display(style_dropdown)" ] }, { "attachments": {}, "cell_type": "markdown", "metadata": {}, "source": [ "## The Model\n", "[back to top ⬆️](#Table-of-contents:)\n", "\n", "### Download the Model\n", "[back to top ⬆️](#Table-of-contents:)\n", "\n", "The style transfer model, selected in the previous step, will be downloaded to `model_path` if you have not already downloaded it. The models are provided by the ONNX Model Zoo in `.onnx` format, which means it could be used with OpenVINO directly. However, this notebook will also show how you can use the Conversion API to convert ONNX to OpenVINO Intermediate Representation (IR) with `FP16` precision." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "tags": [] }, "outputs": [], "source": [ "# Directory to download the model from ONNX model zoo\n", "base_model_dir = \"model\"\n", "base_url = \"https://github.com/onnx/models/raw/69d69010b7ed6ba9438c392943d2715026792d40/archive/vision/style_transfer/fast_neural_style/model\"\n", "\n", "# Selected ONNX model will be downloaded in the path\n", "model_path = Path(f\"{style_dropdown.value.lower()}-9.onnx\")\n", "\n", "style_url = f\"{base_url}/{model_path}\"\n", "utils.download_file(style_url, directory=base_model_dir)" ] }, { "attachments": {}, "cell_type": "markdown", "metadata": { "pycharm": { "name": "#%% md\n" } }, "source": [ "### Convert ONNX Model to OpenVINO IR Format\n", "[back to top ⬆️](#Table-of-contents:)\n", "\n", "In the next step, you will convert the ONNX model to OpenVINO IR format with `FP16` precision. While ONNX models are directly supported by OpenVINO runtime, it can be useful to convert them to IR format to take advantage of OpenVINO optimization tools and features. The `ov.convert_model` Python function of model conversion API can be used. The converted model is saved to the model directory. The function returns instance of OpenVINO Model class, which is ready to use in Python interface but can also be serialized to OpenVINO IR format for future execution. If the model has been already converted, you can skip this step." 
] }, { "cell_type": "code", "execution_count": null, "metadata": { "tags": [] }, "outputs": [], "source": [ "# Construct the command for model conversion API.\n", "\n", "ov_model = ov.convert_model(f\"model/{style_dropdown.value.lower()}-9.onnx\")\n", "ov.save_model(ov_model, f\"model/{style_dropdown.value.lower()}-9.xml\")" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "tags": [] }, "outputs": [], "source": [ "# Converted IR model path\n", "ir_path = Path(f\"model/{style_dropdown.value.lower()}-9.xml\")\n", "onnx_path = Path(f\"model/{model_path}\")" ] }, { "attachments": {}, "cell_type": "markdown", "metadata": {}, "source": [ "### Load the Model\n", "[back to top ⬆️](#Table-of-contents:)\n", "\n", "Both the ONNX model(s) and converted IR model(s) are stored in the `model` directory.\n", "\n", "Only a few lines of code are required to run the model. First, initialize OpenVINO Runtime. Then, read the network architecture and model weights from the `.bin` and `.xml` files to compile for the desired device. If you select `GPU` you may need to wait briefly for it to load, as the startup time is somewhat longer than `CPU`.\n", "\n", "To let OpenVINO automatically select the best device for inference just use `AUTO`. In most cases, the best device to use is `GPU` (better performance, but slightly longer startup time). You can select one from available devices using dropdown list below.\n", "\n", "OpenVINO Runtime can load ONNX models from [ONNX Model Repository](https://github.com/onnx/models) directly. In such cases, use ONNX path instead of IR model to load the model. It is recommended to load the OpenVINO Intermediate Representation (IR) model for the best results." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Initialize OpenVINO Runtime.\n", "core = ov.Core()\n", "\n", "# Read the network and corresponding weights from ONNX Model.\n", "# model = ie_core.read_model(model=onnx_path)\n", "\n", "# Read the network and corresponding weights from IR Model.\n", "model = core.read_model(model=ir_path)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import ipywidgets as widgets\n", "\n", "device = widgets.Dropdown(\n", " options=core.available_devices + [\"AUTO\"],\n", " value=\"AUTO\",\n", " description=\"Device:\",\n", " disabled=False,\n", ")\n", "\n", "\n", "# Compile the model for CPU (or change to GPU, etc. for other devices)\n", "# or let OpenVINO select the best available device with AUTO.\n", "device" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "pycharm": { "name": "#%%\n" }, "tags": [] }, "outputs": [], "source": [ "compiled_model = core.compile_model(model=model, device_name=device.value)\n", "\n", "# Get the input and output nodes.\n", "input_layer = compiled_model.input(0)\n", "output_layer = compiled_model.output(0)" ] }, { "attachments": {}, "cell_type": "markdown", "metadata": {}, "source": [ "Input and output layers have the names of the input node and output node respectively. For *fast-neural-style-mosaic-onnx*, there is 1 input and 1 output with the `(1, 3, 224, 224)` shape." 
] }, { "cell_type": "code", "execution_count": null, "metadata": { "pycharm": { "name": "#%%\n" }, "tags": [] }, "outputs": [], "source": [ "print(input_layer.any_name, output_layer.any_name)\n", "print(input_layer.shape)\n", "print(output_layer.shape)\n", "\n", "# Get the input size.\n", "N, C, H, W = list(input_layer.shape)" ] }, { "attachments": {}, "cell_type": "markdown", "metadata": {}, "source": [ "### Preprocess the image\n", "[back to top ⬆️](#Table-of-contents:)\n", "Preprocess the input image before running the model. Prepare the dimensions and channel order for the image to match the original image with the input tensor\n", "\n", "1. Preprocess a frame to convert from `unit8` to `float32`.\n", "2. Transpose the array to match with the network input size" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "tags": [] }, "outputs": [], "source": [ "# Preprocess the input image.\n", "def preprocess_images(frame, H, W):\n", " \"\"\"\n", " Preprocess input image to align with network size\n", "\n", " Parameters:\n", " :param frame: input frame\n", " :param H: height of the frame to style transfer model\n", " :param W: width of the frame to style transfer model\n", " :returns: resized and transposed frame\n", " \"\"\"\n", " image = np.array(frame).astype(\"float32\")\n", " image = cv2.cvtColor(image, cv2.COLOR_RGB2BGR)\n", " image = cv2.resize(src=image, dsize=(H, W), interpolation=cv2.INTER_AREA)\n", " image = np.transpose(image, [2, 0, 1])\n", " image = np.expand_dims(image, axis=0)\n", " return image" ] }, { "attachments": {}, "cell_type": "markdown", "metadata": {}, "source": [ "### Helper function to postprocess the stylized image\n", "[back to top ⬆️](#Table-of-contents:)\n", "\n", "The converted IR model outputs a NumPy `float32` array of the [(1, 3, 224, 224)](https://github.com/openvinotoolkit/open_model_zoo/blob/master/models/public/fast-neural-style-mosaic-onnx/README.md) shape ." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "tags": [] }, "outputs": [], "source": [ "# Postprocess the result\n", "def convert_result_to_image(frame, stylized_image) -> np.ndarray:\n", " \"\"\"\n", " Postprocess stylized image for visualization\n", "\n", " Parameters:\n", " :param frame: input frame\n", " :param stylized_image: stylized image with specific style applied\n", " :returns: resized stylized image for visualization\n", " \"\"\"\n", " h, w = frame.shape[:2]\n", " stylized_image = stylized_image.squeeze().transpose(1, 2, 0)\n", " stylized_image = cv2.resize(src=stylized_image, dsize=(w, h), interpolation=cv2.INTER_CUBIC)\n", " stylized_image = np.clip(stylized_image, 0, 255).astype(np.uint8)\n", " stylized_image = cv2.cvtColor(stylized_image, cv2.COLOR_BGR2RGB)\n", " return stylized_image" ] }, { "attachments": {}, "cell_type": "markdown", "metadata": {}, "source": [ "### Main Processing Function\n", "[back to top ⬆️](#Table-of-contents:)\n", "\n", "The style transfer function can be run in different operating modes, either using a webcam or a video file." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "tags": [] }, "outputs": [], "source": [ "def run_style_transfer(source=0, flip=False, use_popup=False, skip_first_frames=0):\n", " \"\"\"\n", " Main function to run the style inference:\n", " 1. Create a video player to play with target fps (utils.VideoPlayer).\n", " 2. Prepare a set of frames for style transfer.\n", " 3. Run AI inference for style transfer.\n", " 4. 
"    4. Visualize the results.\n", "\n", "    Parameters:\n", "        source: The webcam number (the primary webcam is 0), or the path to a video file.\n", "        flip: Whether VideoPlayer should flip the captured frames (useful for front-facing cameras).\n", "        use_popup: False for showing encoded frames over this notebook, True for creating a popup window.\n", "        skip_first_frames: Number of frames to skip at the beginning of the video.\n", "    \"\"\"\n", "    # Create a video player to play with target fps.\n", "    player = None\n", "    try:\n", "        player = utils.VideoPlayer(source=source, flip=flip, fps=30, skip_first_frames=skip_first_frames)\n", "        # Start video capturing.\n", "        player.start()\n", "        if use_popup:\n", "            title = \"Press ESC to Exit\"\n", "            cv2.namedWindow(winname=title, flags=cv2.WINDOW_GUI_NORMAL | cv2.WINDOW_AUTOSIZE)\n", "\n", "        processing_times = collections.deque()\n", "        while True:\n", "            # Grab the frame.\n", "            frame = player.next()\n", "            if frame is None:\n", "                print(\"Source ended\")\n", "                break\n", "            # If the largest frame dimension exceeds 720 pixels, downscale to improve performance.\n", "            scale = 720 / max(frame.shape)\n", "            if scale < 1:\n", "                frame = cv2.resize(\n", "                    src=frame,\n", "                    dsize=None,\n", "                    fx=scale,\n", "                    fy=scale,\n", "                    interpolation=cv2.INTER_AREA,\n", "                )\n", "            # Preprocess the input image.\n", "            image = preprocess_images(frame, H, W)\n", "\n", "            # Measure processing time for the input image.\n", "            start_time = time.time()\n", "            # Perform the inference step.\n", "            stylized_image = compiled_model([image])[output_layer]\n", "            stop_time = time.time()\n", "\n", "            # Postprocessing for the stylized image.\n", "            result_image = convert_result_to_image(frame, stylized_image)\n", "\n", "            processing_times.append(stop_time - start_time)\n", "            # Use processing times from the last 200 frames.\n", "            if len(processing_times) > 200:\n", "                processing_times.popleft()\n", "            processing_time_det = np.mean(processing_times) * 1000\n", "\n", "            # Visualize the results.\n", "            f_height, f_width = frame.shape[:2]\n", "            fps = 1000 / processing_time_det\n", "            cv2.putText(\n", "                result_image,\n", "                text=f\"Inference time: {processing_time_det:.1f}ms ({fps:.1f} FPS)\",\n", "                org=(20, 40),\n", "                fontFace=cv2.FONT_HERSHEY_COMPLEX,\n", "                fontScale=f_width / 1000,\n", "                color=(0, 0, 255),\n", "                thickness=1,\n", "                lineType=cv2.LINE_AA,\n", "            )\n", "\n", "            # Use this workaround if there is flickering.\n", "            if use_popup:\n", "                cv2.imshow(title, result_image)\n", "                key = cv2.waitKey(1)\n", "                # escape = 27\n", "                if key == 27:\n", "                    break\n", "            else:\n", "                # Encode the numpy array to jpg.\n", "                _, encoded_img = cv2.imencode(\".jpg\", result_image, params=[cv2.IMWRITE_JPEG_QUALITY, 90])\n", "                # Create an IPython image.\n", "                i = Image(data=encoded_img)\n", "                # Display the image in this notebook.\n", "                clear_output(wait=True)\n", "                display(i)\n", "    # ctrl-c\n", "    except KeyboardInterrupt:\n", "        print(\"Interrupted\")\n", "    # any different error\n", "    except RuntimeError as e:\n", "        print(e)\n", "    finally:\n", "        if player is not None:\n", "            # Stop capturing.\n", "            player.stop()\n", "        if use_popup:\n", "            cv2.destroyAllWindows()" ] }, { "attachments": {}, "cell_type": "markdown", "metadata": {}, "source": [ "### Run Style Transfer\n", "[back to top ⬆️](#Table-of-contents:)\n", "\n", "Now, try to apply the style transfer model using video from your webcam or a video file. By default, the primary webcam is set with `source=0`. If you have multiple webcams, each one will be assigned a consecutive number starting at 0. ",
"Set `flip=True` when using a front-facing camera. Some web browsers, especially Mozilla Firefox, may cause flickering. If you experience flickering, set `use_popup=True`.\n", "\n", "> **NOTE**: To use a webcam, you must run this Jupyter notebook on a computer with a webcam. If you run it on a server, you will not be able to access the webcam. However, you can still perform inference on a video file in the final step.\n", "\n", "If you do not have a webcam, you can still run this demo with a video file. Any [format supported by OpenCV](https://docs.opencv.org/4.5.1/dd/d43/tutorial_py_video_display.html) will work." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "tags": [] }, "outputs": [], "source": [ "USE_WEBCAM = False\n", "\n", "cam_id = 0\n", "video_file = \"https://storage.openvinotoolkit.org/repositories/openvino_notebooks/data/data/video/Coco%20Walking%20in%20Berkeley.mp4\"\n", "\n", "source = cam_id if USE_WEBCAM else video_file\n", "\n", "run_style_transfer(source=source, flip=isinstance(source, int), use_popup=False)" ] }, { "attachments": {}, "cell_type": "markdown", "metadata": { "pycharm": { "name": "#%% md\n" }, "tags": [] }, "source": [ "## References\n", "[back to top ⬆️](#Table-of-contents:)\n", "\n", "1. [ONNX Model Zoo](https://github.com/onnx/models)\n", "2. [Fast Neural Style Transfer](https://github.com/onnx/models/tree/main/vision/style_transfer/fast_neural_style)\n", "3. [Fast Neural Style Mosaic ONNX - Open Model Zoo](https://github.com/openvinotoolkit/open_model_zoo/blob/master/models/public/fast-neural-style-mosaic-onnx/README.md)" ] } ], "metadata": { "kernelspec": { "display_name": "Python 3 (ipykernel)", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.10.12" }, "openvino_notebooks": { "imageUrl": "https://github.com/openvinotoolkit/openvino_notebooks/blob/latest/notebooks/style-transfer-webcam/style-transfer.gif?raw=true", "tags": { "categories": [ "Live Demos" ], "libraries": [], "other": [], "tasks": [ "Style Transfer" ] } }, "vscode": { "interpreter": { "hash": "e0404472fd7b5b63117a9fa5c50283296e2708c2449c6090d2cdf8903f95897f" } }, "widgets": { "application/vnd.jupyter.widget-state+json": { "state": {}, "version_major": 2, "version_minor": 0 } } }, "nbformat": 4, "nbformat_minor": 4 }