Introduction
This site hosts the documentation for the Video2X project, a machine learning-based lossless video super-resolution and frame interpolation framework.
The project's homepage is located on GitHub at: https://github.com/k4yt3x/video2x.
If you have any questions or suggestions, or have found any issues in the documentation, please open an issue on GitHub.
🚧 Some pages are still under construction.
Building
Instructions for building the project.
Windows
Instructions for building this project on Windows.
1. Prerequisites
The following tools must be installed manually:
- Visual Studio 2022
- Workload: Desktop development with C++
- winget-cli
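If you prefer to script this step as well, Visual Studio can also be installed through winget; a minimal sketch, assuming the Community edition is acceptable (the --override arguments are passed through to the Visual Studio installer):

```powershell
# Sketch: install Visual Studio 2022 Community with the C++ workload
winget install -e --id=Microsoft.VisualStudio.2022.Community `
    --override "--add Microsoft.VisualStudio.Workload.NativeDesktop --passive --wait"
```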
2. Clone the Repository
# Install Git if not already installed
winget install -e --id=Git.Git
# Clone the repository
git clone --recurse-submodules https://github.com/k4yt3x/video2x.git
cd video2x
3. Install Dependencies
# Install CMake
winget install -e --id=Kitware.CMake
# Install Vulkan SDK
winget install -e --id=KhronosGroup.VulkanSDK
# Versions of manually installed dependencies
$ffmpegVersion = "7.1"
$ncnnVersion = "20240820"
# Download and extract FFmpeg
curl -Lo ffmpeg-shared.zip "https://github.com/GyanD/codexffmpeg/releases/download/$ffmpegVersion/ffmpeg-$ffmpegVersion-full_build-shared.zip"
Expand-Archive -Path ffmpeg-shared.zip -DestinationPath third_party
Rename-Item -Path "third_party/ffmpeg-$ffmpegVersion-full_build-shared" -NewName ffmpeg-shared
# Download and extract ncnn
curl -Lo ncnn-shared.zip "https://github.com/Tencent/ncnn/releases/download/$ncnnVersion/ncnn-$ncnnVersion-windows-vs2022-shared.zip"
Expand-Archive -Path ncnn-shared.zip -DestinationPath third_party
Rename-Item -Path "third_party/ncnn-$ncnnVersion-windows-vs2022-shared" -NewName ncnn-shared
4. Build the Project
cmake -S . -B build -DUSE_SYSTEM_NCNN=OFF -DUSE_SYSTEM_SPDLOG=OFF -DUSE_SYSTEM_BOOST=OFF `
-DCMAKE_BUILD_TYPE=Release -DCMAKE_INSTALL_PREFIX=build/libvideo2x-shared
cmake --build build --config Release --parallel --target install
The built binaries will be located in build/libvideo2x-shared.
Windows (Qt6)
Instructions for building the Qt6 GUI of this project on Windows.
1. Prerequisites
These dependencies must be installed before building the project. This tutorial assumes that Qt6 has been installed to the default location (C:\Qt).
- Visual Studio 2022
- Workload: Desktop development with C++
- winget-cli
- Qt6
- Component: Qt6 with MSVC 2022 64-bit
- Component: Qt Creator
2. Clone the Repository
# Install Git if not already installed
winget install -e --id=Git.Git
# Clone the repository
git clone --recurse-submodules https://github.com/k4yt3x/video2x-qt6.git
cd video2x-qt6
3. Install Dependencies
You need to have the libvideo2x shared library built before building the Qt6 GUI. Put the built binaries in third_party/libvideo2x-shared.
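If you built libvideo2x following the Windows section above, copying its install output might look like this (a sketch; it assumes the video2x repository is checked out next to this one):

```powershell
# Sketch: copy the libvideo2x build output into third_party (source path is an assumption)
New-Item -Path third_party -ItemType Directory -Force
Copy-Item -Recurse ..\video2x\build\libvideo2x-shared third_party\libvideo2x-shared
```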
# Versions of manually installed dependencies
$ffmpegVersion = "7.1"
# Download and extract FFmpeg
curl -Lo ffmpeg-shared.zip "https://github.com/GyanD/codexffmpeg/releases/download/$ffmpegVersion/ffmpeg-$ffmpegVersion-full_build-shared.zip"
Expand-Archive -Path ffmpeg-shared.zip -DestinationPath third_party
Rename-Item -Path "third_party/ffmpeg-$ffmpegVersion-full_build-shared" -NewName ffmpeg-shared
4. Build the Project
- Open the CMakeLists.txt file in Qt Creator as the project file.
- Click on the hammer icon at the bottom left of the window to build the project.
- Built binaries will be located in the build directory.
After the build finishes, you will need to copy the Qt6 DLLs and other dependencies to the build directory to run the application. Before you run the following commands, remove everything in the release directory except for video2x-qt6.exe and the .qm files; nothing else in that directory is required for running the application. Then, run the following command to copy the Qt6 runtime DLLs:
C:\Qt\6.8.0\msvc2022_64\bin\windeployqt.exe --release --compiler-runtime .\build\Desktop_Qt_6_8_0_MSVC2022_64bit-Release\video2x-qt6.exe
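For reference, the cleanup step described above can also be scripted; a sketch, assuming the same default Qt Creator release directory as in the command above:

```powershell
# Sketch: keep only the executable and the .qm translation files
$release = ".\build\Desktop_Qt_6_8_0_MSVC2022_64bit-Release"
Get-ChildItem $release -Exclude video2x-qt6.exe, *.qm | Remove-Item -Recurse -Force
```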
You will also need to copy the libvideo2x shared library to the build directory. Copy all files under third_party/libvideo2x-shared to the release directory except for include, libvideo2x.lib, and video2x.exe.
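A PowerShell sketch of that copy step, assuming the same release directory as above:

```powershell
# Sketch: copy the libvideo2x runtime files, skipping the items listed above
$release = ".\build\Desktop_Qt_6_8_0_MSVC2022_64bit-Release"
Get-ChildItem third_party\libvideo2x-shared -Exclude include, libvideo2x.lib, video2x.exe |
    Copy-Item -Destination $release -Recurse
```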
Now you should be able to run the application by double-clicking on video2x-qt6.exe.
Linux
Instructions for building this project on Linux.
Arch Linux
Arch users can build the latest version of the project from the AUR package video2x-git. The project's repository also contains another PKGBUILD example at packaging/arch/PKGBUILD.
# Build only
git clone https://aur.archlinux.org/video2x-git.git
cd video2x-git
makepkg -s
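Once makepkg finishes, the built package can be installed with pacman (the exact file name depends on the version built):

```bash
sudo pacman -U video2x-git-*.pkg.tar.zst
```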
To build manually from source, follow the instructions below.
# Install build and runtime dependencies
# See the PKGBUILD file for the list of up-to-date dependencies
pacman -Sy ffmpeg ncnn vulkan-driver opencv spdlog boost-libs
pacman -Sy git cmake make clang pkgconf vulkan-headers openmp boost
# Clone the repository
git clone --recurse-submodules https://github.com/k4yt3x/video2x.git
cd video2x
# Build the project
make build
The built binaries will be located in the build directory.
Ubuntu
Ubuntu users can use the Makefile to build the project automatically. The ubuntu2404 and ubuntu2204 targets are available for Ubuntu 24.04 and 22.04, respectively. make will automatically install the required dependencies, build the project, and package it into a .deb package file. It is recommended to perform the build in a container to ensure the environment's consistency and to avoid leaving extra build packages on your system.
# make needs to be installed manually
sudo apt-get update && sudo apt-get install make
# Clone the repository
git clone --recurse-submodules https://github.com/k4yt3x/video2x.git
cd video2x
# Build the project
make ubuntu2404
The built .deb package will be located in the current directory.
Installing
Instructions for installing this project.
Windows
You can download the latest version of the Windows build from the releases page. Here are the steps to download and install the pre-built binaries to %LOCALAPPDATA%\Programs.
$latestTag = (Invoke-RestMethod -Uri https://api.github.com/repos/k4yt3x/video2x/releases/latest).tag_name
curl -LO "https://github.com/k4yt3x/video2x/releases/download/$latestTag/video2x-windows-amd64.zip"
New-Item -Path "$env:LOCALAPPDATA\Programs\video2x" -ItemType Directory -Force
Expand-Archive -Path .\video2x-windows-amd64.zip -DestinationPath "$env:LOCALAPPDATA\Programs\video2x"
You can then add %LOCALAPPDATA%\Programs\video2x to your PATH environment variable to run video2x from the command line.
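For example, a PowerShell sketch that appends the directory to your user PATH persistently (open a new terminal afterward for the change to take effect):

```powershell
# Sketch: persistently add the install directory to the user PATH
$dir = "$env:LOCALAPPDATA\Programs\video2x"
$userPath = [Environment]::GetEnvironmentVariable("Path", "User")
if ($userPath -notlike "*$dir*") {
    [Environment]::SetEnvironmentVariable("Path", "$userPath;$dir", "User")
}
```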
Windows (Qt6)
You can download the installer for Video2X Qt6 from the releases page. The installer file's name is video2x-qt6-windows-amd64-installer.exe.
Download the installer, then double-click it to start the installation. The installer will guide you through the process; you can choose the installation directory and whether to create a desktop shortcut.
After the installation is complete, you can start Video2X Qt6 by double-clicking the desktop shortcut.
Linux
Instructions for installing this project on Linux systems.
Arch Linux
Arch users can install the project from the AUR.
yay -S video2x-git
Ubuntu
Ubuntu users can download the .deb packages from the releases page. Install the package with the APT package manager:
apt-get install ./video2x-linux-ubuntu2404-amd64.deb
Running
Instructions for running and using this project.
Desktop
TODO.
Command Line
Instructions for running Video2X from the command line.
This page does not cover all of the available options. For help with additional options, run Video2X with the --help argument.
Basics
Use the following command to upscale a video by 4x with RealESRGAN:
video2x -i input.mp4 -o output.mp4 -f realesrgan -r 4 -m realesr-animevideov3
Use the following command to upscale a video to 3840x2160 with libplacebo + Anime4K v4 Mode A+A:
video2x -i input.mp4 -o output.mp4 -f libplacebo -s anime4k-v4-a+a -w 3840 -h 2160
Advanced
It is possible to specify custom MPV-compatible GLSL shader files with the --shader, -s argument:
video2x -i input.mp4 -o output.mp4 -f libplacebo -s path/to/custom/shader.glsl -w 3840 -h 2160
List the available GPUs with --list-gpus, -l:
$ video2x --list-gpus
0. NVIDIA RTX A6000
Type: Discrete GPU
Vulkan API Version: 1.3.289
Driver Version: 565.228.64
Select which GPU to use with the --gpu, -g argument:
video2x -i input.mp4 -o output.mp4 -f realesrgan -r 4 -m realesr-animevideov3 -g 1
Specify arbitrary extra FFmpeg encoder options with the --extra-encoder-options, -e argument:
video2x -i input.mkv -o output.mkv -f realesrgan -m realesrgan-plus -r 4 -c libx264rgb -e crf=17 -e preset=veryslow -e tune=film
Container
Instructions for running the Video2X container.
Prerequisites
- Docker, Podman, or another OCI-compatible runtime
- A GPU that supports the Vulkan API
- Check the Vulkan Hardware Database to see if your GPU supports Vulkan
Upscaling a Video
This section documents how to upscale a video. Replace $TAG with an appropriate container tag. A list of available tags can be found here (e.g., 6.1.1).
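For instance, to pin all of the commands below to one release, set the variable in your shell first:

```bash
# e.g., use the 6.1.1 release for the examples below
TAG=6.1.1
```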
AMD GPUs
Make sure your host has the proper GPU and Vulkan libraries and drivers, then use the following command to launch the container:
docker run --gpus all -it --rm -v $PWD/data:/host ghcr.io/k4yt3x/video2x:$TAG -i standard-test.mp4 -o output.mp4 -f realesrgan -r 4 -m realesr-animevideov3
NVIDIA GPUs
In addition to installing the proper drivers on your host, nvidia-docker2 (NVIDIA Container Toolkit) must also be installed to use NVIDIA GPUs in containers. Below are instructions for installing it on some popular Linux distributions:
- Debian/Ubuntu
  - Follow the official guide to install nvidia-docker2
- Arch/Manjaro
  - Install nvidia-container-toolkit from the AUR, e.g., yay -S nvidia-container-toolkit
Once all the prerequisites are installed, you can launch the container:
docker run --gpus all -it --rm -v $PWD:/host ghcr.io/k4yt3x/video2x:$TAG -i standard-test.mp4 -o output.mp4 -f realesrgan -r 4 -m realesr-animevideov3
If the command above doesn't work, then depending on the version of your nvidia-docker and some other mysterious factors, you can also try setting no-cgroups = true in /etc/nvidia-container-runtime/config.toml and adding the NVIDIA devices into the container:
docker run --gpus all --device=/dev/nvidia0 --device=/dev/nvidiactl --runtime nvidia -it --rm -v $PWD:/host ghcr.io/k4yt3x/video2x:$TAG -i standard-test.mp4 -o output.mp4 -f realesrgan -r 4 -m realesr-animevideov3
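The relevant setting in /etc/nvidia-container-runtime/config.toml looks roughly like this (a sketch; leave the file's other settings untouched):

```toml
[nvidia-container-cli]
no-cgroups = true
```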
If you are still getting a vkEnumeratePhysicalDevices failed -3 error at this point, try adding the --privileged flag to give the container the same level of permissions as the host:
docker run --gpus all --privileged -it --rm -v $PWD:/host ghcr.io/k4yt3x/video2x:$TAG -i standard-test.mp4 -o output.mp4 -f realesrgan -r 4 -m realesr-animevideov3
Intel GPUs
Similar to NVIDIA GPUs, you can add --gpus all or --device /dev/dri to pass the GPU into the container. Adding --privileged might help with performance (thanks @NukeninDark).
docker run --gpus all --privileged -it --rm -v $PWD:/host ghcr.io/k4yt3x/video2x:$TAG -i standard-test.mp4 -o output.mp4 -f realesrgan -r 4 -m realesr-animevideov3
Developing
Development-related instructions and guidelines for this project.
Architecture
The basic working principles of Video2X and its historical architectures.
Video2X <=4.0.0 (Legacy)
Below is the earliest architecture of Video2X. It extracts all of the frames from the video using FFmpeg, processes all frames, and stores them into a folder before running FFmpeg again to convert all of the frames back into a video. The drawbacks of this approach are apparent:
- Storing all frames of the video on disk twice requires a huge amount of storage, often hundreds of gigabytes.
- A lot of disk I/O (reading from/writing to disks) operations occur, which is inefficient. Each step stores its processing results to disk, and the next step has to read them from disk again.
Video2X architecture before version 5.0.0
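A conceptual bash sketch of this pipeline; upscale-frames is a hypothetical stand-in for the image upscaler, and the paths and framerate are placeholders:

```bash
# 1. Write every frame to disk (hundreds of gigabytes for long videos)
mkdir -p frames upscaled
ffmpeg -i input.mp4 frames/%08d.png

# 2. Upscale each image on disk (hypothetical upscaler command)
upscale-frames --input frames --output upscaled

# 3. Read everything back from disk and reassemble, copying the audio
ffmpeg -framerate 24 -i upscaled/%08d.png -i input.mp4 \
    -map 0:v -map '1:a?' -c:a copy output.mp4
```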
Video2X 5.0.0 (Legacy)
Video2X 5.0.0's architecture was designed to address the inefficient disk I/O issues. This version uses frame serving and streamlines the process. All stages are started simultaneously, and frames are passed between stages through stdin/stdout pipes. However, this architecture also has several issues:
- At least two instances of FFmpeg will be started, three in the case of Anime4K.
- Passing frames through stdin/stdout is unstable. If frame sizes are incorrect, FFmpeg will hang waiting for the next frame.
- The frames entering and leaving each stage must be RGB24, even if they don't need to be. For instance, if the upscaler used is Anime4K, yuv420p is acceptable, but the frame is first converted by the decoder to RGB24, then converted back into YUV colorspace for libplacebo.
Video2X 5.x.x architecture
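A conceptual bash sketch of the frame-serving idea; upscaler is a hypothetical program, and the resolutions and framerate are placeholders. If any stage writes a frame of the wrong size, the downstream FFmpeg blocks on the pipe, which is the instability described above:

```bash
# Decoder -> upscaler -> encoder, connected through stdin/stdout pipes
ffmpeg -i input.mp4 -f rawvideo -pix_fmt rgb24 - \
    | upscaler --input-size 1920x1080 --output-size 3840x2160 \
    | ffmpeg -f rawvideo -pix_fmt rgb24 -video_size 3840x2160 -framerate 24 -i - output.mp4
```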
Video2X 6.0.0 (Current)
The newest version of Video2X's architecture addresses the issues of the previous design while improving efficiency.
- Frames are only decoded once and encoded once with FFmpeg's libavformat.
- Frames are passed as AVFrame structs. Their pixel formats are only converted when needed.
- Frames always stay in RAM, avoiding bottlenecks from disk I/O and pipes.
- Frames always stay in the hardware (GPU) unless they need to be downloaded to be processed by software (partially implemented).
Video2X 6.0.0 architecture
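To make the data flow concrete, below is a heavily simplified C sketch of such a decode-process-encode loop using FFmpeg's libavcodec/libavformat. It is not the actual libvideo2x code: process_frame and encode_and_write are hypothetical placeholders, and all context setup and error handling are omitted:

```c
#include <libavcodec/avcodec.h>
#include <libavformat/avformat.h>

// Hypothetical stages, not part of FFmpeg or libvideo2x:
AVFrame *process_frame(AVFrame *in);                      // upscale/interpolate one frame
void encode_and_write(AVCodecContext *enc, AVFrame *out); // encode and mux one frame

static void run_pipeline(AVFormatContext *fmt_ctx, AVCodecContext *dec_ctx,
                         AVCodecContext *enc_ctx, int video_stream_idx) {
    AVPacket *pkt = av_packet_alloc();
    AVFrame *frame = av_frame_alloc();
    while (av_read_frame(fmt_ctx, pkt) >= 0) {   // demux: read the input only once
        if (pkt->stream_index == video_stream_idx) {
            avcodec_send_packet(dec_ctx, pkt);   // decode once
            while (avcodec_receive_frame(dec_ctx, frame) == 0) {
                // Frames stay in RAM as AVFrames; pixel formats are converted
                // only if the processing stage requires it
                AVFrame *out = process_frame(frame);
                encode_and_write(enc_ctx, out);  // encode once
            }
        }
        av_packet_unref(pkt);
    }
    av_frame_free(&frame);
    av_packet_free(&pkt);
}
```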
libvideo2x
Instructions for using libvideo2x's C API in your own projects.
libvideo2x's API is still highly volatile. This document will be updated as the API stabilizes.
Other
History
Video2X has come a long way from its original concept to what it is today. It started as the simple idea that "waifu2x can upscale images, and a video is just a sequence of images". Then, a PoC was made that could barely upscale a single video with waifu2x-caffe and fixed settings. Now, Video2X has become a comprehensive and customizable video upscaling tool with a nice GUI and a community around it. This article documents in detail how Video2X's concept was born and what happened during its development.
Origin
The story started with me watching Bad Apple!!'s PV in early 2017. The original PV has a size of 512x384, which is quite small and thus quite blurry.
A screenshot of the original Bad Apple!! PV
Around the same time, I was introduced to this amazing project named waifu2x, which upscales (mostly anime) images using machine learning. This created a spark in my head: if images can be upscaled, aren't videos just a sequence of images? Then, I made a proof-of-concept by manually extracting all frames from the original PV using FFmpeg, putting them through waifu2x-caffe, and assembling the frames back into a video with FFmpeg. This was how the "4K BadApple!! waifu2x Lossless Upscaled" video was created.
Thumbnail of the "4K BadApple!! waifu2x Lossless Upscaled" video
After this experiment completed successfully, I started thinking about making an automation pipeline where this manual process would be streamlined and each of the steps handled automatically.
Proof-of-Concept
When I signed up for Hack the Valley II in late 2017, I didn't know what I was going to make during that hackathon. Our team sat down and thought about what to make for around an hour, but no one came up with anything interesting. All of a sudden, I remembered: "Hey, isn't there a PoC I wanted to make? How about making that our hackathon project?" I then temporarily named the project Video2X, following waifu2x's naming scheme. Video2X was then born.
I originally wanted to write Video2X for Linux, but it was too complicated to get the original nagadomi/waifu2x version of waifu2x running, so waifu2x-caffe, written for Windows, was used to save time. This is why the first version of Video2X only supported Windows and could only use waifu2x-caffe as its upscaling driver.
video2x.py file in the first version of Video2X
At the end of the hackathon, we managed to make a sample comparison video based on Spirited Away's official trailer. This video was then published on YouTube and is the same demo video showcased in Video2X's repository. The original link was https://www.youtube.com/watch?v=PG94iPoeoZk, but it has since been moved to another account under K4YT3X's name.
Upscale Comparison Demonstration
When we demoed this project, the judges didn't express much interest. We were, however, advised to pitch our project to Adobe. That didn't end up going anywhere either. Like most other hackathon projects, this one didn't win any awards and almost vanished after the hackathon was over.
Our team in Hack the Valley II. You can see Video2X's demo video on the computer screens. Image blurred for privacy.
Video2X 2.0
Roughly three months after the hackathon, I came back to this project and decided it was worth continuing. Although not many people at the hackathon found this project interesting or useful, I saw value in it. This was further reinforced by the stars the project's repository had received.
I continued working on enhancing Video2X and fixing bugs, and Video2X 2.0 was released. The original version of Video2X was made only as a proof-of-concept for the hackathon, and a lot of usability and convenience aspects were ignored in exchange for development speed. Version 2.0 addressed many of these issues and made Video2X usable for regular users. Video2X was thereby converted from a hackathon project into a personal open-source project.
Screenshot of Video2X 2.0