When clipping a video, mind the GOP and hope the I-frame is an IDR-frame

A picture is worth a thousand words, but if it is a B-frame it may be worth two hundred words.

Mind the GOP

One of our products, ViDeus Auditor, lets you clip and join videos, showing a preview before doing the actual clipping. To do that, we have to understand how an encoded video is composed. We usually work with H.264.

When encoding each video picture you can get an I-frame, a P-frame or a B-frame.

  • The I-frame is the easy one: all the information for decoding the picture is within the I-frame.
  • The P-frame needs previously decoded pictures in order to be decoded, so it uses information from old pictures.
  • The B-frame needs decoded pictures from the past and from the future, so it uses information from old and future pictures.

For example, in display order you can get something like this:

    I B B P B B P

The I-frame can be decoded on its own. The second frame (a B-frame) then needs information from previous frames (the I-frame, for example) and from future frames (like the P-frame).

Because B-frames may need information from later frames, the stream is rearranged for decoding so that by the time a B-frame is decoded, everything it needs has already arrived. So a sequence displayed as I B B P B B P is usually transmitted like this:

    I P B B P B B

As a result, the frames that were moved earlier have a decoding time-stamp (DTS) that is less than their presentation time-stamp (PTS).
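
The reordering described above can be sketched in a few lines of shell. This is only an illustration, not what an encoder really does: it assumes every B-frame references just the nearest I- or P-frame on each side, and the function name and the frame labels (type plus display position) are made up for the example.

```shell
#!/bin/sh
# Sketch: turn a list of frames in display order into decode order.
# Assumption: a B-frame only needs the nearest I/P frame before and after it.
to_decode_order() {
    pending=""   # B-frames waiting for their future reference frame
    out=""
    for frame in "$@"; do
        case "$frame" in
            B*) pending="$pending $frame" ;;       # hold it back for now
            *)  out="$out $frame$pending"          # send the reference first,
                pending="" ;;                      # then the B-frames that needed it
        esac
    done
    out="$out$pending"                             # trailing B-frames, if any
    echo "${out# }"
}

to_decode_order I1 B2 B3 P4 B5 B6 P7
# prints: I1 P4 B2 B3 P7 B5 B6
```

P4 is shown fourth but decoded second, which is why its DTS ends up earlier than its PTS.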

That series of frames can be a GOP, a Group of Pictures. A video is composed of a series of GOPs, each GOP starting with an I-frame. This would be three GOPs:

    I B B P B B P   I B B P B B P   I B B P B B P

Since the I-frame doesn’t need any other information for decoding, it is a good point for fast-clipping a video. Clipping a video in the middle of a GOP (at a frame that is not an I-frame) will most likely produce corrupt output for a while, until a new full GOP is decoded.


But clipping a video at a GOP start will not always result in a clean output.

The I-frame at the beginning will certainly decode fine; it doesn’t need anything else. However, the B-frames and P-frames that follow will probably need other frames in order to be decoded correctly. Sometimes those frames are within the same GOP, which is useful. But sometimes they are outside the GOP, which is bad for clipping: they reference pictures that come before the I-frame where we cut the video, resulting in corrupt output.

When frames in a GOP reference frames from another GOP, it’s called an Open GOP. If not, it is called a Closed GOP.

Hopefully, the video was encoded with IDR-frames. Those are a special case of I-frames: apart from being an I-frame, the IDR-frame ensures that the following frames will not reference any frame before the IDR.
In a GOP the IDR-frame replaces the I-frame. All IDR-frames are I-frames, but not all I-frames are IDR-frames.

So an IDR-frame is a good place for clipping: it can be decoded without any other information, and none of the following frames will require information from before it.
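
To see whether a stream actually contains IDR-frames, one option is to look at the NAL unit headers. The sketch below assumes a raw H.264 Annex B byte stream (units separated by 00 00 01 start codes), not an MP4 or MKV file, where the units are stored differently. In H.264 the low five bits of the byte after the start code are the nal_unit_type: 5 is a slice of an IDR picture and 1 is a slice of a non-IDR picture. The function name is made up for the example, and a non-IDR slice may still belong to an I-frame; telling those apart needs the slice header, which this does not parse.

```shell
#!/bin/sh
# Sketch: list the slice NAL units of a raw Annex B H.264 stream in order.
# Assumption: "$1" is an elementary stream, not a container file.
nal_slice_types() {
    # od prints every byte as an unsigned decimal number
    od -An -v -tu1 "$1" | awk '
        {
            for (i = 1; i <= NF; i++) {
                byte = $i + 0
                if (header) {                  # first byte after a start code
                    type = byte % 32           # low 5 bits = nal_unit_type
                    if (type == 5)      print "IDR slice"
                    else if (type == 1) print "non-IDR slice"
                    header = 0
                }
                # two or more zero bytes followed by 01 is a start code
                if (zeros >= 2 && byte == 1) header = 1
                zeros = (byte == 0) ? zeros + 1 : 0
            }
        }'
}

# usage: nal_slice_types stream.h264
```

A line saying "IDR slice" marks a place where the clip can start cleanly.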

Next time you want to clip a video, mind the GOP and find an IDR-frame.

Showing what the threads are doing, trying not to interrupt the process

Humans are curious, perhaps that is what makes us human[1], and you might be curious about what your program is doing.
Sometimes the program has several threads running, and sometimes you can’t completely stop it or kill it to see what it is doing.

One such situation is a service running on a client that has some problem: you suspect that one of its several threads is locked, or waiting for something that will never happen, while all the other threads look like they are running fine. So you don’t want to interrupt or kill the process for now.

Don't stop me now

What I do is use gdb with a file containing the commands I want to run, ending with the ‘q’ (quit) command so that gdb exits and the process can continue its execution. I usually write a file called ‘commands’ with this:

thread apply all bt
q

That runs ‘bt’ (backtrace) on all threads and then quits gdb with ‘q’. Printing the backtrace of every thread shows me (more or less) what the threads are executing.
Using the ‘commands’ file I run gdb like this:
gdb -p <pid> -x commands > /tmp/threads

where <pid> is the ID of the process.
Notice that I redirect the output to a file. Otherwise gdb pauses the output when the console is full and waits for me to press ‘return’, which keeps the process stopped for a while, something I don’t want to happen.
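
The same steps can be wrapped in a small script. This is a sketch with made-up function names, not a polished tool. It adds two things that are not in my original recipe: ‘set pagination off’ in the command file and gdb’s -batch option, which makes gdb exit once the command file has been processed. With either of them gdb should not wait for a key press, but the output is still redirected to a file to be on the safe side.

```shell
#!/bin/sh
# Hypothetical helper: write the gdb command file described above.
write_gdb_commands() {
    cat > "$1" <<'EOF'
set pagination off
thread apply all bt
q
EOF
}

# Hypothetical helper: attach to a process, dump all backtraces, detach.
# usage: dump_threads <pid> <output-file>
dump_threads() {
    cmds=$(mktemp) || return 1
    write_gdb_commands "$cmds"
    # -batch: run the commands from -x and exit without prompting
    gdb -p "$1" -batch -x "$cmds" > "$2" 2>&1
    status=$?
    rm -f "$cmds"
    return $status
}

# usage: dump_threads 1234 /tmp/threads
```

The process is only stopped while gdb is attached, so the shorter the command file, the shorter the pause.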

After looking at the threads, I can write another ‘commands’ file with other instructions for gdb, like printing some variable.

[1] […]the ability to ask questions is probably the central cognitive element that distinguishes human and animal cognitive abilities[…] (https://en.wikipedia.org/wiki/Great_ape_language#Limitations_of_ape_language)

Reading logs with ‘tac’

Logs are useful for debugging and for tracing what your code is doing and what it has done in the past.

I started using ‘tac’ to read logs, especially when I need to see what happened recently.
$ tac that_log.log | less

From man tac:
tac - concatenate and print files in reverse
So with ‘tac’ you get the last line first, which is nice if you don’t need to see lines from long ago.

print files in reverse

After using tac for a while, I feel it is better to read from end to beginning: you read the error first and then what happened before it. I think reading the error first makes you more alert, so you read the following lines more attentively.
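
Reading the error first and then what happened before it can also be done without a pager. The sketch below uses a made-up function name, and assumes GNU tac and a grep that supports -m and -A. Because the input is reversed, the "after" context that grep prints is really what was logged before the match.

```shell
#!/bin/sh
# Hypothetical helper: print the newest line matching a pattern, followed by
# up to N lines that were logged just before it (default 5).
# usage: newest_match_with_history <log-file> <pattern> [N]
newest_match_with_history() {
    # -m 1 stops at the first match, which is the newest one after tac
    tac "$1" | grep -m 1 -A "${3:-5}" -e "$2"
}

# usage: newest_match_with_history that_log.log ERROR 20
```

Since grep stops at the first match, only the tail of the log is ever searched.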

If you just use less to read the last lines, you have to press Shift-g and wait a moment to get there, and with large files less sometimes hangs for a while.
Another option is to use tail -n<lines> to get the last lines right away, but usually <lines> turns out not to be enough.