Reversing mxf op1a

Today I want to show technic how to reverse MXF files, thats what I do from time to time to repair some broken files, but also useful if you need to get some information.

Our user case: We have mxf op1a with XAVC stream inside, if open that file in Sony catalyst browser, we can see detailed information including Level and Profile, we need to understand which part of that file contains that information.

what we will need for that:
 1. libMXF any version (I use custom forked from v.1.0.0-rc1)
 2. MXDump from libMXF
 3. any Hex viewer (I use Far manager on win)

Before we start, best way to understand what part of mxf contains information about Level/Profile is to buy and read
 SMPTE 381-3 "Mapping AVC Streams into the MXF Generic Container" but will it help to improve your mxf reverse technics?

first of all I want to notice that you must have at least basic understanding of MXF (SMPTE 336M, 377M, 379M) that basic knowledge helps us understand object structure, ber, mxf list structure so on.

there are 2 known ways to find XAVC Profile/Level:
 1) read encoded XAVC frame and parse Sequence Parameter Set (SPS) to extract fields profile_idc and level_idc
 2) parse MXF metadata (in most cases and in our specific case parse descriptor)

1st way is more solid as for me except some weird cases like Avid AVCI (they exclude SPS and PPS). I will explain at the end why first method more solid.
 Nevertheless as I saw tons XAVC files that have SPS but Catalyst browser still do not detect/show Profile/Level we can make conclusion they use 2nd way to detect it.

so what we do first is makind dump of needed MXF file using MXFDump app from LibMXF:

MXFDump.exe -m "e:\xavc.mxf" >> "e:\xavc.txt"

Console output was redirected to file so It could be opened and analyzed. As we found Catalyst browser looks for profile and level in metadata and most probably its stream descriptor.

in result dump file CDCI descriptor was found, descriptor contains 1 field that MXFDump knows nothing about so marked it as Dark (check image1)
(image 1)
that dark field has structure that typical for MXF List where first 4 bytes - number of elements, second 4 bytes - element size and after that all elements with defined size.
 In current particular case we see list with 1 element 16 bytes long, so looks like its UID, there is one observation that confirms it, if we compare UID value with instance UID of CDCI descriptor we will find its almost identical except 4th byte which increased by 1, it makes me think we found UID generated using same UID generation algorithm and it was generated right after CDCI descriptor instance UID was generated. We still dont know what that UID means, but next line after CDCI descriptor saying "[ Omitted 1 dark KLV triplets ]" and it makes me think there is an object (unknown for MXFDump). Lets assume dark UID in CDCI descriptor refers to this object, so its time to open our hex viewer and find our unknown UID and check what preceded and follows it.
(image 2)
(image 2)
UID was found twice (highlighted on image2 by black and red rectangles) 1st obviously our CDCI field and second looks more interesting to us, we assumed its Instance UID of unknown to us object, lets check if its true now time to get back to out basic understanding of MXF, we know that MXF object looks like:
 1) 16 bytes long Key
 2) BER encoded length
 3) Set of values that represented like:
 3.1) 2 bytes unique value (Local Tag) which we can use to find key in Primer
 3.2) 2 bytes length of value
 3.3) Value

so if found UID is Instance UID of unknown object it should follow same rules which means found UID id corresponds to (3.3) and 2 word long values should precedeĀ  found UID ((3.1) and (3.2)) more than that we know our UID is 16 bytes long, so (3.2) should be equal to 16 or 0x00 0x10 in hex and if you check hex viewer (image2 highlighted by yellow rectangle) you`ll see yes (3.2) really equal to 16,
 and (3.1) equal to 0x3C 0x0A and if we are right we should find record in Primer with local Tag equal to 0x3CĀ  0x0A (image 3)
(image 3)
(image 3)
lets check dump again and yes we see there record with Local Tag equal to 3c.0a and UID corresponds to this tag is 06.0e.2b.
 if you grep libMXF for that UID you'll find:
MXF_ITEM_DEFINITION(InterchangeObject, InstanceUID,
                        MXF_LABEL (0x06,0x0e,0x2b,0x34,0x01,0x01,0x01,0x01,0x01,0x01,0x15,0x02,0x00,0x00,0x00,0x00),
Ok we confirmed that Dark item contains UID of referenced from CDCI descriptor object, lets name it SubDescriptor.
 Next assumption: Catalyst browser looks for Profile and Level in SubDescriptor referenced from CDCI descriptor and its easy to confirm, for current particular file I see in browser:
 "Profile and level" detected as High422@5.2 its not a problem to google that for profile High422 profile_idc should be equal to 122 or 0x7A in hex and level_idc for level 5.2 equal to 52 or 0x34 in hex, if we return back to our found sub descriptor (image 2 highlighted by orange rectangles) we can see both profile and level presented:
 Profile - 1 byte long value with Local Tag equal to 80 08
 Level - 1 byte long value with Local Tag equal to 80 0B

knowing local tags we can find unique UIDs for Level and Profile field in Primer (image 3)

so at the end we can conclude, Sony Catalyst browser reads Profile and Level from sub descriptor referenced from CDCI descriptor
P.S. yes it described on 15 pages of smpte 381-3, unfortunately I spent one day reversing sample files instead buying that specs

P.P.S. I promised to explain why parse SPS more solid comparing to read sub descriptor. I saw tons different MXF files with AVC stream wrapped (AVC, AVCI, XAVC) and tons different descriptors some have AVC sub descriptor, some have sub descriptor fields added directly to CDCI descriptor, some files have Mpeg Descriptor instead CDCI so on, and btw smpte 381-3 says all avc sub descriptor fields are optional, and only parsing SPS could guaranty you same result regardless descriptor structure (except Avid case as I mentioned before)