All posts by alex@alexmogurenko.com

AMCDX Video Patcher v0.1.0

AMCDX Video Patcher v0.1.0 is officially released.
The main update is performance optimization: the decoder is now almost 10x faster and the encoder 5x faster, which allows decoding and encoding 4K content in realtime even on 4-core machines.

Also added timecode display and audio playback.

Bugs fixed:
1) Crash when opening a file with alpha (such files are not supported yet, but no longer crash)
2) Crash when opening a file whose width is not aligned to 16
3) Audio bit depth detection
4) Crash when trying to open a file while an edit is “In Progress”
5) An error message is now shown when attempting to open a MOV file without ProRes content

Features:
1) Performance optimization (multithreaded ProRes decoding and encoding, realtime playback of 4K footage)
2) Audio playback
3) Show current frame / timecode during playback
4) Show name/path of current file

What’s next?
Next will be file-to-file insert, e.g.:
1) Insert one or more frames from one ProRes file into another
2) Insert Rectangle from (e.g. Picture in Picture)

OSX Installer
Windows Installer

AMCDX Video Patcher v0.0.5

I finally found some time to release AMCDX Video Patcher v0.0.5.
The video above shows the newly added features.

Release notes:

Features:
1) Added “Save to PNG” functionality that allows you to save the selected part of a ProRes frame to a PNG image (16 bits per component)
2) Added “Insert from PNG” functionality that allows you to replace the selected rectangle with an image from a PNG file

Bugs:
1) Fixed various UI bugs
2) Fixed MOV reader crash
3) Fixed color matrix detection

Note: there are a lot of AVX2 optimizations, so the app will crash if the CPU does not support AVX2.
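A startup guard along these lines could turn that crash into a readable error. This is only a minimal sketch, not what the app actually does: it assumes GCC or Clang on x86 (it relies on the `__builtin_cpu_supports` builtin), and `cpu_has_avx2` and `require_avx2` are hypothetical helper names.

```c
#include <stdio.h>

/* Hypothetical helper: returns 1 if the CPU reports AVX2, 0 otherwise.
 * Uses the GCC/Clang x86 builtins; on other compilers or targets it
 * conservatively reports 0. */
static int cpu_has_avx2(void)
{
#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
    __builtin_cpu_init();
    return __builtin_cpu_supports("avx2") != 0;
#else
    return 0;
#endif
}

/* Hypothetical startup check: returns 0 when AVX2 is available,
 * otherwise prints a hint and returns -1. */
static int require_avx2(void)
{
    if (cpu_has_avx2())
        return 0;
    fprintf(stderr, "This build needs a CPU with AVX2 support.\n");
    return -1;
}
```

Calling `require_avx2()` before any AVX2 code path runs would let the app exit cleanly on older CPUs.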

Mac OS Installer

Windows Installer

AMCDX Video Patcher v0.0.3

Almost a week has passed since v0.0.1 was released, and as I said before, it was more of a proof of concept.
I got some feedback and bug reports, and I’m ready to release version 0.0.3.

What’s done?
1) Added support for ProRes in MXF files (OP1a and OP-Atom)
2) Significantly improved blur and decoder performance
3) Fixed YUV422 chrominance blur
4) Fixed support for MOVs > 4GB
5) Fixed various UI bugs
6) Added Windows installer
7) Notarized Mac OS installer

Mac OS Installer

Windows Installer

AMCDX Video Patcher

Today I decided to release the first version of the tool I’m working on. To be fair, it’s more of a proof of concept, but I’d still love to get some feedback, so I decided to share it.

So currently it can:

1) open and play back MOV files with ProRes-encoded video

2) decode a selected rectangle, blur it, encode it and write it back (so you can easily blur a face or anything else you don’t like in your final video)

It’s still in progress, and the current implementation is shared just to prove a couple of things:
1) we can easily edit files in-place
2) if we need to edit just a specific rectangle, we don’t need to decode and later re-encode the whole frame
3) obviously, instead of blur we can add any effect needed, or just replace part of the frame with a new one (for example, it’s possible to add a logo)
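To make point 2 concrete, here is a rough sketch of how a pixel rectangle could be mapped to the slices that actually need decoding. It is an illustration under stated assumptions, not the tool’s real code: 16x16 macroblocks, slices that are 8 macroblocks wide and 1 macroblock high, a progressive frame, and no narrower tail slices at the end of a row. `Rect`, `SliceRange`, `slices_for_rect` and `slice_count` are hypothetical names.

```c
#include <assert.h>

/* Assumptions for illustration only (see the text above). */
enum { MB_SIZE = 16, SLICE_MB_WIDTH = 8 };

typedef struct { int x, y, w, h; } Rect;   /* in pixels */
typedef struct { int first_col, last_col, first_row, last_row; } SliceRange;

/* Hypothetical helper: which slices does a pixel rectangle touch? */
static SliceRange slices_for_rect(Rect r)
{
    assert(r.x >= 0 && r.y >= 0 && r.w > 0 && r.h > 0);
    SliceRange s;
    s.first_col = (r.x / MB_SIZE) / SLICE_MB_WIDTH;
    s.last_col  = ((r.x + r.w - 1) / MB_SIZE) / SLICE_MB_WIDTH;
    s.first_row = r.y / MB_SIZE;               /* one slice row per macroblock row */
    s.last_row  = (r.y + r.h - 1) / MB_SIZE;
    return s;
}

/* Number of slices covered by the range. */
static int slice_count(SliceRange s)
{
    return (s.last_col - s.first_col + 1) * (s.last_row - s.first_row + 1);
}
```

Under these assumptions, a 200x120 rectangle at (100, 50) touches 3 slice columns and 8 slice rows, so 24 slices, while a whole UHD frame has 4050.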

I uploaded the OSX installer to GitHub and plan to upload the Windows version tomorrow:

https://github.com/da8eat/VideoEditorInstaller/raw/master/AMCDXVideoPatcher.pkg

P.S. It’s still experimental, so I would recommend working on a copy of any file you try to edit.

I added a short video which demonstrates how it can be used.

Some thoughts on performance optimization

I have a bad habit of trying to make my code highly optimized even when the unit is not fully finished, and I’m serious when I say it’s a bad habit, as it really slows down the development process and sometimes affects code quality…

For example, here is an interesting trick I added to my ProRes encoder/decoder:

When you encode or decode a DC coefficient you have to use one of 4 existing codebooks, and the codebook is chosen based on the value of the DC that was just coded, which obviously can be much bigger than 3. So what most of us would do here is just add a branch:

if (codebook > 3) {
    codebook = 3;
}

And in 999 of 1000 cases it will probably be the best solution. In my case, I have 3 * 30 DCs per slice (technically 3 * 32, but the first 2 DCs don’t need any branching to know the codebook), or 364500 DCs per UHD frame, which is kinda bad…
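For reference, the 364500 figure can be reproduced with a few lines of arithmetic. The sketch assumes 16x16 macroblocks, 8 macroblocks per slice, and 3 planes with 32 DCs each per slice, of which the first 2 need no clamp; `clamped_dcs_per_frame` is a hypothetical helper name.

```c
#include <assert.h>

/* Hypothetical helper reproducing the count above.
 * Assumes the frame divides evenly into 8-macroblock slices. */
static long clamped_dcs_per_frame(int width, int height)
{
    assert(width % 16 == 0 && height % 16 == 0 && (width / 16) % 8 == 0);
    long mbs    = (long)(width / 16) * (height / 16); /* 240 * 135 = 32400 for UHD */
    long slices = mbs / 8;                            /* 4050 for UHD */
    return slices * 3 * (32 - 2);                     /* 90 clamped DCs per slice */
}
```

For a 3840x2160 frame this gives 4050 slices times 90 DCs.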

As you can see, the max value of codebook is 3, i.e. (2^2 - 1), and this is a case where we can easily avoid branching, so I replaced that branch with this code:

codebook = (3 & (4 - !!(codebook & 0xfffc))) + ((codebook & 3) & (4 - !(codebook & 0xfffc)));

This avoids branching and in my particular case improves performance a bit (~0.3-0.5%), but to be fair this code would be hard to maintain, and I still doubt whether I need to push it.
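A cheap way to gain confidence in a trick like this is to compare it against the plain branch for every input it has to handle. The sketch below does that for all 16-bit values, which is the range the 0xfffc mask implies; for larger values the branchless form is not equivalent. The function names are made up for the example.

```c
#include <assert.h>

/* Straightforward version with a branch. */
static unsigned clamp_branch(unsigned codebook)
{
    if (codebook > 3)
        codebook = 3;
    return codebook;
}

/* Branchless version from the post. Only valid while codebook fits in
 * 16 bits, because the 0xfffc mask ignores any higher bits. */
static unsigned clamp_branchless(unsigned codebook)
{
    return (3 & (4 - !!(codebook & 0xfffc)))
         + ((codebook & 3) & (4 - !(codebook & 0xfffc)));
}

/* Returns 1 if both versions agree on every 16-bit input, 0 otherwise. */
static int clamp_versions_agree(void)
{
    for (unsigned v = 0; v <= 0xffff; ++v)
        if (clamp_branch(v) != clamp_branchless(v))
            return 0;
    return 1;
}
```

Compilers will often turn the branchy form into a conditional move on their own, so a gain like this is worth re-measuring per compiler and target.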

This is a really tricky example, which probably shouldn’t even be considered, especially when your unit is not 100% complete and there are surely a million other ways to optimize your code.

One more example

This one is more critical, and I faced it while doing my contract work.
I had to fix a performance regression that the company hit after upgrading from FFmpeg 3.x to FFmpeg 4.2, as they use FFmpeg to demux MOV files.

One of the developers found out that FFmpeg had finally added 12-bit decoding support and now reports all HQX and 4444 profiles as 12-bit, and indeed that commit caused the regression. It sounds weird if you consider the fact that they use the official libs from Apple for decoding…

So how is it possible? My first thought was that the decoder was still being used somewhere; how else could file open become 2x slower? So how do you open a file with the FFmpeg libs? Something like:

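AVFormatContext *fmt_ctx = NULL;                  /* fmt_ctx and path are placeholder names */
avformat_open_input(&fmt_ctx, path, NULL, NULL);
avformat_find_stream_info(fmt_ctx, NULL);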

What I found out is that avformat_find_stream_info reads the first frame from the file and decodes it, and does so single-threaded. How do you like that? To be fair, there is a reason behind it, as sometimes there is no way to get all the needed metadata without decoding the frame header (for example bit depth, pixel format and so on), but the problem is that we don’t need to decode the whole frame to get that metadata, we just need to decode the frame header… So I added an extra flag which forces the FFmpeg ProRes and DNx decoders to stop after the headers are decoded, something like:

// proresdec2.c, line 784
// AV_CODEC_STOP_AFTER_HEADER_DECODED is the extra flag I added, it is not an upstream FFmpeg flag
if (avctx->flags & AV_CODEC_STOP_AFTER_HEADER_DECODED) {
    return avpkt->size;
}

Believe it or not, instead of a 2x slowdown we achieved a 2x speedup, and file open now takes constant time regardless of resolution, whereas with the previous version, the higher the resolution, the slower file open would be…
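To illustrate why header-only parsing is constant time, here is a small standalone sketch that pulls width, height and frame type out of a ProRes frame without touching any slice data. The byte offsets are assumptions based on how FFmpeg’s open-source ProRes decoder reads the header, not on an official spec, and `ProresHeaderInfo`, `rb16` and `prores_peek_header` are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

typedef struct { int width, height, frame_type; } ProresHeaderInfo;

/* Read a big-endian 16-bit value. */
static unsigned rb16(const uint8_t *p)
{
    return ((unsigned)p[0] << 8) | p[1];
}

/* Hypothetical helper. Returns 0 on success, -1 if the buffer does not
 * look like a ProRes frame. Assumed layout: 4-byte frame size, "icpf",
 * then the frame header with width at +8, height at +10 and the frame
 * type in bits 2..3 of the byte at +12. */
static int prores_peek_header(const uint8_t *buf, size_t size,
                              ProresHeaderInfo *out)
{
    assert(buf && out);
    if (size < 28 || memcmp(buf + 4, "icpf", 4) != 0)
        return -1;
    const uint8_t *hdr = buf + 8;
    out->width      = (int)rb16(hdr + 8);
    out->height     = (int)rb16(hdr + 10);
    out->frame_type = (hdr[12] >> 2) & 3;   /* 0 = progressive */
    return 0;                               /* slice data is never read */
}
```

The work done here does not depend on the frame size, which matches the constant-time file open described above.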

Prores tools updates

I haven’t written lately as I was quite busy, and unfortunately I didn’t work much on the announced tools, as I’m still doing my main contract and literally have at most 2 hours a day for my project :(.

Nevertheless, as there was high interest I want to post some updates:

  1. I finally finished the ProRes decoder, which is needed for a lot of reasons (playback; transcode, as we need to stop decoding before doing IDCT; in-place edit, as we need to decode just some of the slices, not the whole frame)
  2. I finished the MOV demuxer, which is based on the project I started here (https://github.com/da8eat/qtfile_pp); I never committed the updates, though, but at least you can see the main idea
  3. I made huge progress on the MOV muxer (also based on the GitHub project mentioned above)
  4. I finished the basic UI (Qt/QML based)
    ProRes Tools UI
  5. I implemented GLSL shaders for the video renderer

Over the next couple of months I’m going to finish the player and the MOV muxer, and after that, finally, start feature integration.

The first one will be ProRes-to-ProRes transcode (for example, transcoding HQX to the Proxy profile).

The second one will be in-place ProRes editing (I plan to detect faces in the video and blur all detected faces in the input file without re-encoding the video or re-muxing the file).

FFMPEG + GPL

Just to make things clear

0) Yes, I agree I violated the GPL and I feel sorry, but I didn’t do it on purpose (and to be fair, I didn’t know much about licensing details until the week I was notified about my violation)

1) After Kieran left the comment about the violation, I made the repo private (to spend some time understanding all the details), and today I fully removed the repo

2) To be fair, it’s easy to see it wasn’t done on purpose, as I never posted updates even though the posted version had an interlaced coding bug. I was asked privately a couple of times to make a custom build with my encoder and I refused, as the only purpose I pursued was to show that the encoder exists and performs well

3) Why don’t I disclose the source code?

3.1) I was going to rework prores_ks or add one more encoder; I had an exact plan for how to do it and I started on it. I sent a couple of patches: the 1st was approved, the 2nd is still under review. I decided not to wait forever, and I’m not the one to ping every day/week to get it pushed (I believe that if the community needs something it will be pushed, and that’s easy to prove with my MXF OP1b research, where my changes were pushed even though I never sent that patch)

3.2) I was still interested in finishing my ProRes encoder, so I continued to work on it. And now that the encoder is done, I plan to make a product based on it. It will be free but probably closed-source.

3.3) Even if I want to make it part of FFmpeg one day, it won’t be easy to do, as my implementation is done in C++ and FFmpeg is C

3.4) So the build I shared (and later removed) is literally a bag of tricks.

3.5) So when Kieran/Martin/Carl or whoever says I should share prores_amcdx_encoder, I can easily do it, but what will you see there? It’s basically a skeleton copied from prores_anatoly, with the whole logic replaced by calls to functions from a private static library:

#include "libavutil/opt.h"
#include "avcodec.h"
#include "internal.h"
#include "profiles.h"
#include "prores_defs.hpp"

#define DEFAULT_SLICE_MB_WIDTH 8

static const AVProfile profiles[] = {
    { FF_PROFILE_PRORES_PROXY,    "apco"},
    { FF_PROFILE_PRORES_LT,       "apcs"},
    { FF_PROFILE_PRORES_STANDARD, "apcn"},
    { FF_PROFILE_PRORES_HQ,       "apch"},
    { FF_PROFILE_PRORES_4444,     "ap4h"},
    { FF_PROFILE_PRORES_XQ,       "ap4x"},
    { FF_PROFILE_UNKNOWN }
};

static const int valid_primaries[9]  = { AVCOL_PRI_RESERVED0, AVCOL_PRI_BT709, AVCOL_PRI_UNSPECIFIED, AVCOL_PRI_BT470BG,
                                         AVCOL_PRI_SMPTE170M, AVCOL_PRI_BT2020, AVCOL_PRI_SMPTE431, AVCOL_PRI_SMPTE432,INT_MAX };
static const int valid_trc[4]        = { AVCOL_TRC_RESERVED0, AVCOL_TRC_BT709, AVCOL_TRC_UNSPECIFIED, INT_MAX };
static const int valid_colorspace[5] = { AVCOL_SPC_BT709, AVCOL_SPC_UNSPECIFIED, AVCOL_SPC_SMPTE170M,
                                         AVCOL_SPC_BT2020_NCL, INT_MAX };

typedef struct {
    AVClass *class;
    void *encoder;
    int cs;
    int qual;
    int field_order;
    int planes;
    int target_size;
} ProresContext;

static int prores_encode_frame2(AVCodecContext *avctx, AVPacket *pkt,
                               const AVFrame *pict, int *got_packet)
{
    ProresContext *ctx = avctx->priv_data;
    int ret;
    int frame_size = amcdx_pr_encoder_encode(ctx->encoder, (void **)pict->data, (int *)pict->linesize, ctx->planes); //for the time being


    if ((ret = ff_alloc_packet2(avctx, pkt, frame_size, 0)) < 0)
        return ret;

    amcdx_pr_encoder_read(ctx->encoder, pkt->data, &pkt->size);


    pkt->flags |= AV_PKT_FLAG_KEY;

    *got_packet = 1;
    return 0;
}

static av_cold int prores_encode_init2(AVCodecContext *avctx)
{
    ProresContext* ctx = avctx->priv_data;

    avctx->bits_per_raw_sample = 10;

    if (avctx->width & 0x1) {
        av_log(avctx, AV_LOG_ERROR,
                "frame width needs to be multiple of 2\n");
        return AVERROR(EINVAL);
    }

    if (avctx->width > 65534 || avctx->height > 65535) {
        av_log(avctx, AV_LOG_ERROR, "The maximum dimensions are 65534x65535\n");
        return AVERROR(EINVAL);
    }

    switch (avctx->profile) {
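    case FF_PROFILE_UNKNOWN:
        avctx->profile = FF_PROFILE_PRORES_STANDARD; /* keeps the profiles[] lookup below in range */
        /* fall through */
    case FF_PROFILE_PRORES_STANDARD:
        ctx->qual = Quality_422;
        break;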
    case FF_PROFILE_PRORES_4444:
        ctx->qual = Quality_4444;
        break;
    case FF_PROFILE_PRORES_HQ:
        ctx->qual = Quality_422HQ;
        break;
    case FF_PROFILE_PRORES_LT:
        ctx->qual = Quality_422LT;
        break;
    case FF_PROFILE_PRORES_PROXY:
        ctx->qual = Quality_422Proxy;
        break;
    case FF_PROFILE_PRORES_XQ:
        ctx->qual = Quality_4444XQ;
        break;
    default:
        return -1;
        break;
    }

    switch (avctx->pix_fmt) {
    case AV_PIX_FMT_UYVY422:
        ctx->cs = ColorSpace_uyvy;
        ctx->planes = 1;
        break;
    case AV_PIX_FMT_YUV422P10:
        ctx->cs = ColorSpace_yuv10_422_planar;
        ctx->planes = 3;
        break;
    case AV_PIX_FMT_YUV422P12:
        ctx->cs = ColorSpace_yuv12_422_planar;
        ctx->planes = 3;
        break;
    case AV_PIX_FMT_YUV444P12:
        ctx->cs = ColorSpace_yuv12_444_planar;
        ctx->planes = 3;
        break;
    default:
        break;
    }

     //for the time being

    /* Store the mapped value in the private context: that is the field
       amcdx_pr_encoder_init() receives below, and avctx stays untouched. */
    switch (avctx->field_order)
    {
    case AV_FIELD_BT:
        ctx->field_order = FieldOrder_BottomFieldFirst;
        break;
    case AV_FIELD_TB:
        ctx->field_order = FieldOrder_TopFieldFirst;
        break;
    case AV_FIELD_PROGRESSIVE:
    default: //otherwise we think its progressive
        ctx->field_order = FieldOrder_Progressive;
        break;
    }

    ctx->encoder = amcdx_pr_encoder_create();

    if (ctx->target_size != 0) {
        amcdx_pr_encoder_set_frame_size(ctx->encoder, ctx->target_size);
    }

    avctx->codec_tag = MKTAG(profiles[avctx->profile].name[0], profiles[avctx->profile].name[1], profiles[avctx->profile].name[2], profiles[avctx->profile].name[3]);// AV_RL32((const uint8_t*)profiles[avctx->profile].name);

    return amcdx_pr_encoder_init(ctx->encoder, avctx->width, avctx->height, ctx->cs, ctx->qual, ctx->field_order) - 1;
}

static av_cold int prores_encode_close2(AVCodecContext *avctx)
{
    ProresContext* ctx = avctx->priv_data;
    amcdx_pr_encoder_destroy(ctx->encoder);

    return 0;
}

#define OFFSET(x) offsetof(ProresContext, x)
#define VE     AV_OPT_FLAG_VIDEO_PARAM | AV_OPT_FLAG_ENCODING_PARAM

static const AVOption options[] = {
    { "target_size", "force frame size", OFFSET(target_size), AV_OPT_TYPE_INT, { .i64 = 0 }, 0, INT_MAX, VE },
    { NULL }
};

static const AVClass proresamcdx_enc_class = {
    .class_name = "ProRes amcdx encoder",
    .item_name  = av_default_item_name,
    .option     = options,
    .version    = LIBAVUTIL_VERSION_INT,
};

AVCodec ff_prores_amcdx_encoder = {
    .name           = "prores_amcdx",
    .long_name      = NULL_IF_CONFIG_SMALL("Apple ProRes"),
    .type           = AVMEDIA_TYPE_VIDEO,
    .id             = AV_CODEC_ID_PRORES,
    .priv_data_size = sizeof(ProresContext),
    .init           = prores_encode_init2,
    .close          = prores_encode_close2,
    .encode2        = prores_encode_frame2,
    .pix_fmts       = (const enum AVPixelFormat[]){AV_PIX_FMT_UYVY422, AV_PIX_FMT_YUV422P10, AV_PIX_FMT_YUV422P12, AV_PIX_FMT_YUV444P12, AV_PIX_FMT_NONE},
    .capabilities   = AV_CODEC_CAP_FRAME_THREADS | AV_CODEC_CAP_INTRA_ONLY,
    .priv_class     = &proresamcdx_enc_class,
    .profiles       = NULL_IF_CONFIG_SMALL(ff_prores_profiles),
};