Introduction
This site hosts the documentation for the Video2X project, a machine learning-based lossless video super-resolution and frame interpolation framework.
The project's homepage is located on GitHub at: https://github.com/k4yt3x/video2x.
If you have any questions or suggestions, or have found any issues in the documentation, please open an issue on GitHub.
🚧 Some pages are still under construction.
Building
Instructions for building the project.
Windows
Instructions for building this project on Windows.
1. Prerequisites
The following tools must be installed manually:
- Visual Studio 2022
- Workload: Desktop development with C++
- winget-cli
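If you prefer to script this step as well, Visual Studio can also be installed through winget; a minimal sketch, assuming the Community edition is acceptable (the --override arguments are passed through to the Visual Studio installer):

```powershell
# Sketch: install Visual Studio 2022 Community with the C++ workload
winget install -e --id=Microsoft.VisualStudio.2022.Community `
    --override "--add Microsoft.VisualStudio.Workload.NativeDesktop --passive --wait"
```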
2. Clone the Repository
# Install Git if not already installed
winget install -e --id=Git.Git
# Clone the repository
git clone --recurse-submodules https://github.com/k4yt3x/video2x.git
cd video2x
3. Install Dependencies
# Install CMake
winget install -e --id=Kitware.CMake
# Install Vulkan SDK
winget install -e --id=KhronosGroup.VulkanSDK
# Versions of manually installed dependencies
$ffmpegVersion = "7.1"
$ncnnVersion = "20240820"
# Download and extract FFmpeg
curl -Lo ffmpeg-shared.zip "https://github.com/GyanD/codexffmpeg/releases/download/$ffmpegVersion/ffmpeg-$ffmpegVersion-full_build-shared.zip"
Expand-Archive -Path ffmpeg-shared.zip -DestinationPath third_party
Rename-Item -Path "third_party/ffmpeg-$ffmpegVersion-full_build-shared" -NewName ffmpeg-shared
# Download and extract ncnn
curl -Lo ncnn-shared.zip "https://github.com/Tencent/ncnn/releases/download/$ncnnVersion/ncnn-$ncnnVersion-windows-vs2022-shared.zip"
Expand-Archive -Path ncnn-shared.zip -DestinationPath third_party
Rename-Item -Path "third_party/ncnn-$ncnnVersion-windows-vs2022-shared" -NewName ncnn-shared
4. Build the Project
cmake -S . -B build -DUSE_SYSTEM_NCNN=OFF -DUSE_SYSTEM_SPDLOG=OFF -DUSE_SYSTEM_BOOST=OFF `
-DCMAKE_BUILD_TYPE=Release -DCMAKE_INSTALL_PREFIX=build/libvideo2x-shared
cmake --build build --config Release --parallel --target install
The built binaries will be located in build/libvideo2x-shared.
Windows (Qt6)
Instructions for building the Qt6 GUI of this project on Windows.
1. Prerequisites
These dependencies must be installed before building the project. This tutorial assumes that Qt6 has been installed to the default location (C:\Qt).
- Visual Studio 2022
- Workload: Desktop development with C++
- winget-cli
- Qt6
- Component: Qt6 with MSVC 2022 64-bit
- Component: Qt Creator
2. Clone the Repository
# Install Git if not already installed
winget install -e --id=Git.Git
# Clone the repository
git clone --recurse-submodules https://github.com/k4yt3x/video2x-qt6.git
cd video2x-qt6
3. Install Dependencies
You need to have the libvideo2x shared library built before building the Qt6 GUI. Put the built binaries in third_party/libvideo2x-shared.
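If you built libvideo2x following the Windows section above, copying its install output might look like this (a sketch; it assumes the video2x repository is checked out next to this one):

```powershell
# Sketch: copy the libvideo2x build output into third_party (source path is an assumption)
New-Item -Path third_party -ItemType Directory -Force
Copy-Item -Recurse ..\video2x\build\libvideo2x-shared third_party\libvideo2x-shared
```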
# Versions of manually installed dependencies
$ffmpegVersion = "7.1"
# Download and extract FFmpeg
curl -Lo ffmpeg-shared.zip "https://github.com/GyanD/codexffmpeg/releases/download/$ffmpegVersion/ffmpeg-$ffmpegVersion-full_build-shared.zip"
Expand-Archive -Path ffmpeg-shared.zip -DestinationPath third_party
Rename-Item -Path "third_party/ffmpeg-$ffmpegVersion-full_build-shared" -NewName ffmpeg-shared
4. Build the Project
- Open the CMakeLists.txt file in Qt Creator as the project file.
- Click on the hammer icon at the bottom left of the window to build the project.
- Built binaries will be located in the build directory.
After the build finishes, you will need to copy the Qt6 DLLs and other dependencies to the build directory to run the application. Before you run the following commands, remove everything in the release directory except for video2x-qt6.exe and the .qm files; nothing else in that directory is required for running the application. Then, run the following command to copy the Qt6 runtime DLLs:
C:\Qt\6.8.0\msvc2022_64\bin\windeployqt.exe --release --compiler-runtime .\build\Desktop_Qt_6_8_0_MSVC2022_64bit-Release\video2x-qt6.exe
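For reference, the cleanup step described above can also be scripted; a sketch, assuming the same default Qt Creator release directory as in the command above:

```powershell
# Sketch: keep only the executable and the .qm translation files
$release = ".\build\Desktop_Qt_6_8_0_MSVC2022_64bit-Release"
Get-ChildItem $release -Exclude video2x-qt6.exe, *.qm | Remove-Item -Recurse -Force
```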
You will also need to copy the libvideo2x shared library to the build directory. Copy all files under third_party/libvideo2x-shared to the release directory except for include, libvideo2x.lib, and video2x.exe.
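A PowerShell sketch of that copy step, assuming the same release directory as above:

```powershell
# Sketch: copy the libvideo2x runtime files, skipping the items listed above
$release = ".\build\Desktop_Qt_6_8_0_MSVC2022_64bit-Release"
Get-ChildItem third_party\libvideo2x-shared -Exclude include, libvideo2x.lib, video2x.exe |
    Copy-Item -Destination $release -Recurse
```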
Now you should be able to run the application by double-clicking on video2x-qt6.exe.
Linux
Instructions for building this project on Linux.
Arch Linux
Arch users can build the latest version of the project from the AUR package video2x-git. The project's repository also contains another PKGBUILD example at packaging/arch/PKGBUILD.
# Build only
git clone https://aur.archlinux.org/video2x-git.git
cd video2x-git
makepkg -s
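Once makepkg finishes, the built package can be installed with pacman (the exact file name depends on the version built):

```bash
sudo pacman -U video2x-git-*.pkg.tar.zst
```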
To build manually from source, follow the instructions below.
# Install build and runtime dependencies
# See the PKGBUILD file for the list of up-to-date dependencies
pacman -Sy ffmpeg ncnn vulkan-driver opencv spdlog boost-libs
pacman -Sy git cmake make clang pkgconf vulkan-headers openmp boost
# Clone the repository
git clone --recurse-submodules https://github.com/k4yt3x/video2x.git
cd video2x
# Build the project
make build
The built binaries will be located in the build directory.
Ubuntu
Ubuntu users can use the Makefile to build the project automatically. The ubuntu2404 and ubuntu2204 targets are available for Ubuntu 24.04 and 22.04, respectively. make will automatically install the required dependencies, build the project, and package it into a .deb package file. It is recommended to perform the build in a container to ensure the environment's consistency and to avoid leaving extra build packages on your system.
# make needs to be installed manually
sudo apt-get update && sudo apt-get install make
# Clone the repository
git clone --recurse-submodules https://github.com/k4yt3x/video2x.git
cd video2x
# Build the project
make ubuntu2404
The built .deb package will be located in the current directory.
Installing
Instructions for installing this project.
Windows
You can download the latest version of the Windows build from the releases page. Here are the steps to download and install the pre-built binaries to %LOCALAPPDATA%\Programs.
$latestTag = (Invoke-RestMethod -Uri https://api.github.com/repos/k4yt3x/video2x/releases/latest).tag_name
curl -LO "https://github.com/k4yt3x/video2x/releases/download/$latestTag/video2x-windows-amd64.zip"
New-Item -Path "$env:LOCALAPPDATA\Programs\video2x" -ItemType Directory -Force
Expand-Archive -Path .\video2x-windows-amd64.zip -DestinationPath "$env:LOCALAPPDATA\Programs\video2x"
You can then add %LOCALAPPDATA%\Programs\video2x to your PATH environment variable to run video2x from the command line.
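For example, a PowerShell sketch that appends the directory to your user PATH persistently (open a new terminal afterward for the change to take effect):

```powershell
# Sketch: persistently add the install directory to the user PATH
$dir = "$env:LOCALAPPDATA\Programs\video2x"
$userPath = [Environment]::GetEnvironmentVariable("Path", "User")
if ($userPath -notlike "*$dir*") {
    [Environment]::SetEnvironmentVariable("Path", "$userPath;$dir", "User")
}
```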
Windows (Qt6)
You can download the installer for Video2X Qt6 from the releases page. The installer file's name is video2x-qt6-windows-amd64-installer.exe.
Download the installer, then double-click it to start the installation. The installer will guide you through the process; you can choose the installation directory and whether to create a desktop shortcut.
After the installation is complete, you can start Video2X Qt6 by double-clicking the desktop shortcut.
Linux
Instructions for installing this project on Linux systems.
Arch Linux
Arch users can install the project from the AUR.
yay -S video2x-git
Ubuntu
Ubuntu users can download the .deb packages from the releases page. Install the package with the APT package manager:
apt-get install ./video2x-linux-ubuntu2404-amd64.deb
Running
Instructions for running and using this project.
Desktop
TODO.
Command Line
Instructions for running Video2X from the command line.
This page does not cover all of the available options. For help with additional options, run Video2X with the --help argument.
Basics
Use the following command to upscale a video by 4x with RealESRGAN:
video2x -i input.mp4 -o output.mp4 -f realesrgan -r 4 -m realesr-animevideov3
Use the following command to upscale a video to 3840x2160 with libplacebo + Anime4K v4 Mode A+A:
video2x -i input.mp4 -o output.mp4 -f libplacebo -s anime4k-v4-a+a -w 3840 -h 2160
Advanced
It is possible to specify custom MPV-compatible GLSL shader files with the --shader, -s argument:
video2x -i input.mp4 -o output.mp4 -f libplacebo -s path/to/custom/shader.glsl -w 3840 -h 2160
List the available GPUs with --list-gpus, -l:
$ video2x --list-gpus
0. NVIDIA RTX A6000
Type: Discrete GPU
Vulkan API Version: 1.3.289
Driver Version: 565.228.64
Select which GPU to use with the --gpu, -g argument:
video2x -i input.mp4 -o output.mp4 -f realesrgan -r 4 -m realesr-animevideov3 -g 1
Specify arbitrary extra FFmpeg encoder options with the --extra-encoder-options, -e argument:
video2x -i input.mkv -o output.mkv -f realesrgan -m realesrgan-plus -r 4 -c libx264rgb -e crf=17 -e preset=veryslow -e tune=film
Container
Instructions for running the Video2X container.
Prerequisites
- Docker, Podman, or another OCI-compatible runtime
- A GPU that supports the Vulkan API
- Check the Vulkan Hardware Database to see if your GPU supports Vulkan
Upscaling a Video
This section documents how to upscale a video. Replace $TAG with an appropriate container tag. A list of available tags can be found here (e.g., 6.1.1).
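For instance, to pin all of the commands below to one release, set the variable in your shell first:

```bash
# e.g., use the 6.1.1 release for the examples below
TAG=6.1.1
```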
AMD GPUs
Make sure your host has the proper GPU and Vulkan libraries and drivers, then use the following command to launch the container:
docker run --gpus all -it --rm -v $PWD/data:/host ghcr.io/k4yt3x/video2x:$TAG -i standard-test.mp4 -o output.mp4 -f realesrgan -r 4 -m realesr-animevideov3
NVIDIA GPUs
In addition to installing the proper drivers on your host, nvidia-docker2 (NVIDIA Container Toolkit) must also be installed to use NVIDIA GPUs in containers. Below are instructions for installing it on some popular Linux distributions:
- Debian/Ubuntu
  - Follow the official guide to install nvidia-docker2
- Arch/Manjaro
  - Install nvidia-container-toolkit from the AUR, e.g., yay -S nvidia-container-toolkit
Once all the prerequisites are installed, you can launch the container:
docker run --gpus all -it --rm -v $PWD:/host ghcr.io/k4yt3x/video2x:$TAG -i standard-test.mp4 -o output.mp4 -f realesrgan -r 4 -m realesr-animevideov3
If the command above doesn't work, then depending on the version of your nvidia-docker and some other mysterious factors, you can also try setting no-cgroups = true in /etc/nvidia-container-runtime/config.toml and adding the NVIDIA devices into the container:
docker run --gpus all --device=/dev/nvidia0 --device=/dev/nvidiactl --runtime nvidia -it --rm -v $PWD:/host ghcr.io/k4yt3x/video2x:$TAG -i standard-test.mp4 -o output.mp4 -f realesrgan -r 4 -m realesr-animevideov3
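The relevant setting in /etc/nvidia-container-runtime/config.toml looks roughly like this (a sketch; leave the file's other settings untouched):

```toml
[nvidia-container-cli]
no-cgroups = true
```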
If you are still getting a vkEnumeratePhysicalDevices failed -3 error at this point, try adding the --privileged flag to give the container the same level of permissions as the host:
docker run --gpus all --privileged -it --rm -v $PWD:/host ghcr.io/k4yt3x/video2x:$TAG -i standard-test.mp4 -o output.mp4 -f realesrgan -r 4 -m realesr-animevideov3
Intel GPUs
Similar to NVIDIA GPUs, you can add --gpus all or --device /dev/dri to pass the GPU into the container. Adding --privileged might help with performance (thanks @NukeninDark).
docker run --gpus all --privileged -it --rm -v $PWD:/host ghcr.io/k4yt3x/video2x:$TAG -i standard-test.mp4 -o output.mp4 -f realesrgan -r 4 -m realesr-animevideov3
Developing
Development-related instructions and guidelines for this project.
Architecture
The basic working principles of Video2X and its historical architectures.
Video2X <=4.0.0 (Legacy)
Below is the earliest architecture of Video2X. It extracts all of the frames from the video using FFmpeg, processes all frames, and stores them into a folder before running FFmpeg again to convert all of the frames back into a video. The drawbacks of this approach are apparent:
- Storing all frames of the video on disk twice requires a huge amount of storage, often hundreds of gigabytes.
- A lot of disk I/O (reading from/writing to disks) operations occur, which is inefficient. Each step stores its processing results to disk, and the next step has to read them from disk again.
Video2X architecture before version 5.0.0
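A conceptual bash sketch of this pipeline; upscale-frames is a hypothetical stand-in for the image upscaler, and the paths and framerate are placeholders:

```bash
# 1. Write every frame to disk (hundreds of gigabytes for long videos)
mkdir -p frames upscaled
ffmpeg -i input.mp4 frames/%08d.png

# 2. Upscale each image on disk (hypothetical upscaler command)
upscale-frames --input frames --output upscaled

# 3. Read everything back from disk and reassemble, copying the audio
ffmpeg -framerate 24 -i upscaled/%08d.png -i input.mp4 \
    -map 0:v -map '1:a?' -c:a copy output.mp4
```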
Video2X 5.0.0 (Legacy)
Video2X 5.0.0's architecture was designed to address the inefficient disk I/O issues. This version uses frame serving and streamlines the process. All stages are started simultaneously, and frames are passed between stages through stdin/stdout pipes. However, this architecture also has several issues:
- At least two instances of FFmpeg will be started, three in the case of Anime4K.
- Passing frames through stdin/stdout is unstable. If frame sizes are incorrect, FFmpeg will hang waiting for the next frame.
- The frames entering and leaving each stage must be RGB24, even if they don't need to be. For instance, if the upscaler used is Anime4K, yuv420p is acceptable, but the frame is first converted by the decoder to RGB24, then converted back into YUV colorspace for libplacebo.
Video2X 5.x.x architecture
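A conceptual bash sketch of the frame-serving idea; upscaler is a hypothetical program, and the resolutions and framerate are placeholders. If any stage writes a frame of the wrong size, the downstream FFmpeg blocks on the pipe, which is the instability described above:

```bash
# Decoder -> upscaler -> encoder, connected through stdin/stdout pipes
ffmpeg -i input.mp4 -f rawvideo -pix_fmt rgb24 - \
    | upscaler --input-size 1920x1080 --output-size 3840x2160 \
    | ffmpeg -f rawvideo -pix_fmt rgb24 -video_size 3840x2160 -framerate 24 -i - output.mp4
```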
Video2X 6.0.0 (Current)
The newest version of Video2X's architecture addresses the issues of the previous design while improving efficiency.
- Frames are only decoded once and encoded once with FFmpeg's libavformat.
- Frames are passed as AVFrame structs. Their pixel formats are only converted when needed.
- Frames always stay in RAM, avoiding bottlenecks from disk I/O and pipes.
- Frames always stay in the hardware (GPU) unless they need to be downloaded to be processed by software (partially implemented).
Video2X 6.0.0 architecture
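To make the data flow concrete, below is a heavily simplified C sketch of such a decode-process-encode loop using FFmpeg's libavcodec/libavformat. It is not the actual libvideo2x code: process_frame and encode_and_write are hypothetical placeholders, and all context setup and error handling are omitted:

```c
#include <libavcodec/avcodec.h>
#include <libavformat/avformat.h>

// Hypothetical stages, not part of FFmpeg or libvideo2x:
AVFrame *process_frame(AVFrame *in);                      // upscale/interpolate one frame
void encode_and_write(AVCodecContext *enc, AVFrame *out); // encode and mux one frame

static void run_pipeline(AVFormatContext *fmt_ctx, AVCodecContext *dec_ctx,
                         AVCodecContext *enc_ctx, int video_stream_idx) {
    AVPacket *pkt = av_packet_alloc();
    AVFrame *frame = av_frame_alloc();
    while (av_read_frame(fmt_ctx, pkt) >= 0) {   // demux: read the input only once
        if (pkt->stream_index == video_stream_idx) {
            avcodec_send_packet(dec_ctx, pkt);   // decode once
            while (avcodec_receive_frame(dec_ctx, frame) == 0) {
                // Frames stay in RAM as AVFrames; pixel formats are converted
                // only if the processing stage requires it
                AVFrame *out = process_frame(frame);
                encode_and_write(enc_ctx, out);  // encode once
            }
        }
        av_packet_unref(pkt);
    }
    av_frame_free(&frame);
    av_packet_free(&pkt);
}
```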
libvideo2x
Instructions for using libvideo2x's C API in your own projects.
libvideo2x's API is still highly volatile. This document will be updated as the API stabilizes.
Other
History
Video2X has come a long way from its original concept to what it is today. It started as the simple idea that "waifu2x can upscale images, and a video is just a sequence of images". Then, a PoC was made that could barely upscale a single video with waifu2x-caffe and fixed settings. Now, Video2X has become a comprehensive and customizable video upscaling tool with a nice GUI and a community around it. This article documents in detail how Video2X's concept was born and what happened during its development.
Origin
The story started with me watching Bad Apple!!'s PV in early 2017. The original PV has a size of 512x384, which is quite small and thus quite blurry.
A screenshot of the original Bad Apple!! PV
Around the same time, I was introduced to this amazing project named waifu2x, which upscales (mostly anime) images using machine learning. This created a spark in my head: if images can be upscaled, aren't videos just a sequence of images? Then, I made a proof-of-concept by manually extracting all frames from the original PV using FFmpeg, putting them through waifu2x-caffe, and assembling the frames back into a video with FFmpeg. This was how the "4K BadApple!! waifu2x Lossless Upscaled" video was created.
Thumbnail of the "4K BadApple!! waifu2x Lossless Upscaled" video
After this experiment completed successfully, I started thinking about making an automation pipeline where this manual process would be streamlined and each of the steps handled automatically.
Proof-of-Concept
When I signed up for Hack the Valley II in late 2017, I didn't know what I was going to make during that hackathon. Our team sat down and thought about what to make for around an hour, but no one came up with anything interesting. All of a sudden, I remembered: "Hey, isn't there a PoC I wanted to make? How about making that our hackathon project?" I then temporarily named the project Video2X, following waifu2x's naming scheme. Video2X was then born.
I originally wanted to write Video2X for Linux, but it was too complicated to get the original nagadomi/waifu2x version of waifu2x running, so waifu2x-caffe, written for Windows, was used to save time. This is why the first version of Video2X only supported Windows and could only use waifu2x-caffe as its upscaling driver.
video2x.py file in the first version of Video2X
At the end of the hackathon, we managed to make a sample comparison video based on Spirited Away's official trailer. This video was then published on YouTube and is the same demo video showcased in Video2X's repository. The original link was https://www.youtube.com/watch?v=PG94iPoeoZk, but it has since been moved to another account under K4YT3X's name.
Upscale Comparison Demonstration
When we demoed this project, the judges didn't express much interest. We were, however, advised to pitch our project to Adobe. That didn't end up going anywhere either. Like most other hackathon projects, this one didn't win any awards and almost vanished after the hackathon was over.
Our team in Hack the Valley II. You can see Video2X's demo video on the computer screens. Image blurred for privacy.
Video2X 2.0
Roughly three months after the hackathon, I came back to this project and decided it was worth continuing. Although not many people at the hackathon found this project interesting or useful, I saw value in it. This was further reinforced by the stars the project's repository had received.
I continued working on enhancing Video2X and fixing bugs, and Video2X 2.0 was released. The original version of Video2X was made only as a proof-of-concept for the hackathon, and a lot of usability and convenience aspects were ignored in exchange for development speed. Version 2.0 addressed many of these issues and made Video2X usable for regular users. Video2X was thereby converted from a hackathon project into a personal open-source project.
Screenshot of Video2X 2.0