Let’s start with what MPEG is. MPEG stands for the Moving Picture Experts Group, an expert group under Joint Technical Committee 1 (JTC1). JTC1 is run jointly by ISO (International Organization for Standardization) and IEC (International Electrotechnical Commission) and is responsible for information technology. Within JTC1, subcommittee SC29 is responsible for “Coding of Audio, Picture, Multimedia and Hypermedia Information”. SC29 contains a number of working groups, including JPEG (Joint Photographic Experts Group) and WG11, the working group responsible for moving picture compression. MPEG can therefore be identified as ISO/IEC JTC1/SC29/WG11, established in 1988.
Think of MPEG simply as an organization. It mainly produces specifications and standards for video compression and transmission. So far it has published five main standards: MPEG-1, MPEG-2, MPEG-4, MPEG-7 and MPEG-21. The MPEG-TS encapsulation format is defined in the MPEG-2 standard.
The following is a brief introduction to the MPEG-2 standard. MPEG-2 is still widely used in Internet transmission protocols, and was used earlier in cable digital TV, terrestrial digital TV, satellite TV, DVB, DVD, and so on.
The MPEG-2 standard is currently divided into 10 parts, collectively referred to as the ISO/IEC 13818 international standard, titled “GENERIC CODING OF MOVING PICTURES AND ASSOCIATED AUDIO”.
The remaining four parts of the MPEG-2 standard are not listed because they are not widely used. As can be seen from the table above, the MPEG-TS encapsulation format is defined in 13818-1 (Systems). In 1990, the ATM Video Coding Expert Group cooperated with MPEG to publish 13818-1 of the ISO/IEC 13818 standard as ITU-T Rec. H.222.0 (Systems) and 13818-2 as ITU-T Rec. H.262 (Video). ITU-T Rec. H.222.0 and ITU-T Rec. H.262 are part of the ITU-T standards.
ISO/IEC 13818-1 and ITU-T Rec. H.222.0 standard documents (compressed archive), download address: Baidu network disk, extraction code: f0cv
In fact, there is cooperation between various organizations, and some standard documents are extended, derived, and extracted separately.
That covers the background of MPEG. Now let’s talk about the MPEG-TS encapsulation format. A file with the .ts extension uses the MPEG-TS encapsulation format; TS stands for Transport Stream.
There is also a second encapsulation format, MPEG-PS, where PS stands for Program Stream. Note that Program here means a broadcast programme such as a TV show, not a computer program. The PS encapsulation format is mainly used in environments where errors are unlikely, such as DVD discs.
A TS packet has a fixed length of 188 bytes, whereas PS packets are of variable length. The fixed packet structure gives the TS stream a strong ability to resist transmission errors.
What are transmission errors?
Bit errors arise because attenuation changes the signal voltage during transmission, which corrupts the signal. Noise, pulses caused by alternating current or lightning, transmission equipment failures and other factors can all cause bit errors (for example, a 1 is sent but a 0 is received, or vice versa). For various reasons, errors are unavoidable in digital signal transmission. – Baidu Encyclopedia
Put simply, a transmission error is a fairly low-level event. A digital signal is a binary stream such as 111000, and when an error occurs a 1 turns into a 0 (or the reverse) in transit. This is also why the TCP and UDP protocols we use every day carry a checksum field. If a bit error is detected, the lower layer drops the packet and never passes it up to the application layer.
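To make the checksum idea concrete, here is a minimal sketch of a 16-bit ones'-complement checksum in the style used by TCP and UDP. It is an illustration only: real stacks also cover a pseudo-header, and the helper names `internet_checksum` and `flip_bit` are made up for this example.

```python
def internet_checksum(data: bytes) -> int:
    """16-bit ones'-complement checksum (illustrative, no pseudo-header)."""
    if len(data) % 2:
        data += b"\x00"  # pad odd-length input with a zero byte
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]
    # Fold any carries back into the low 16 bits.
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def flip_bit(data: bytes, bit_index: int) -> bytes:
    """Return a copy of data with one bit inverted (bit 0 = MSB of byte 0)."""
    corrupted = bytearray(data)
    corrupted[bit_index // 8] ^= 0x80 >> (bit_index % 8)
    return bytes(corrupted)


original = bytes([0b11100011, 0b10001110])
damaged = flip_bit(original, 3)  # simulate a single transmission error

# The receiver recomputes the checksum and sees that it no longer matches.
print(internet_checksum(original) != internet_checksum(damaged))  # True
```

A single flipped bit always changes this sum, which is why the receiver can notice the damage and drop the packet.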
In fact, over UDP or TCP, the TS and PS encapsulation formats resist transmission errors equally well, because those protocols already deal with errors for you. But the TS and PS standards are not only used over UDP and TCP: video recorders, DVD players and other digital video equipment also use PS and TS.
Both TS and PS were mainly used in digital TV in the early days. China’s digital TV standard uses DVB, which stands for Digital Video Broadcasting.
When we used to watch TV, there were many channels, and the receiving antenna could switch between them. A channel contains many programs. Pay attention to this concept of a program (Program in English): the term appears often in MPEG-2 documents.
For example, under the CCTV channel, there are CCTV1-CCTV14 programs at the same time. Each program has an audio stream and a video stream. The structure diagram is as follows:
How are these channels, programs, and audio and video streams told apart in TS? This is where the TS encapsulation format gets complicated. Early TV sets (the receiving devices) had relatively poor performance and could not talk back to the base station; they could only passively receive the digital signal that the base station broadcast.
Internet TCP/UDP traffic is a digital signal, and digital TV base stations also broadcast digital signals: binary streams such as 001001. On the Internet, however, a TS usually carries just one video stream and one audio stream, so many of the TS fields are not needed. TS on the Internet is a relatively simple application scenario. A TS that carries a single program like this is called a Single Program Transport Stream (SPTS).
If readers need to do development such as set-top boxes, and want to deeply understand the knowledge of digital TV, they can read “Video Demystified: A Handbook for The Digital Engineer”.
That covers the general background. Now for some hands-on content. As usual, you first need software that can parse the TS format. The download resources are as follows:
1. Elecard Stream Analyzer, which supports TS, FLV, MP4 and many other formats, with a 30-day free trial. Download address: Baidu network disk, extraction code: sq5y
2. MPEG-2 TS packet analyser, whose interface is clear and concise. Download address: Baidu network disk, extraction code: hmxj
3. juren.ts, the TS stream file used in this article. Download address: Baidu network disk, extraction code: igs5
PS: I strongly recommend Elecard Stream Analyzer. It is very easy to use, and software from Elecard is generally worth trying.
Open the juren.ts file with stream analyzer, the screenshot is as follows:
Let’s start with the basic types. There are three kinds of data packets involved in a TS stream.
1. ES, Elementary Stream. You can think of an ES packet as one H.264-encoded video frame or one audio frame, although ES is not limited to audio and video frames.
2. PES, Packetized Elementary Stream. It wraps a layer around the ES data and adds information such as PTS and DTS.
3. TS, Transport Stream packet, fixed at 188 bytes. A large PES packet is split into multiple small blocks, each encapsulated in a TS packet for transmission.
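As a rough illustration of what the PES layer adds, the sketch below reads the PTS out of a PES header. It assumes a video or audio stream whose PES packet carries the optional header defined in ISO/IEC 13818-1, with a 33-bit PTS counted in 90 kHz ticks. The function names and the sample bytes are made up for this example and are not taken from juren.ts.

```python
def encode_timestamp(prefix: int, ts: int) -> bytes:
    """Pack a 33-bit timestamp into 5 bytes: 4-bit prefix plus three marker bits."""
    return bytes([
        (prefix << 4) | (((ts >> 30) & 0x07) << 1) | 1,
        (ts >> 22) & 0xFF,
        (((ts >> 15) & 0x7F) << 1) | 1,
        (ts >> 7) & 0xFF,
        ((ts & 0x7F) << 1) | 1,
    ])


def decode_timestamp(b: bytes) -> int:
    """Reverse of encode_timestamp: drop the prefix and marker bits."""
    return ((((b[0] >> 1) & 0x07) << 30) | (b[1] << 22)
            | ((b[2] >> 1) << 15) | (b[3] << 7) | (b[4] >> 1))


def parse_pes(pes: bytes):
    """Return (stream_id, pts or None, elementary stream bytes)."""
    if pes[0:3] != b"\x00\x00\x01":
        raise ValueError("missing PES start code prefix 00 00 01")
    stream_id = pes[3]
    pts_dts_flags = pes[7] >> 6        # top two bits of the second flags byte
    header_data_length = pes[8]        # size of the optional fields that follow
    pts = decode_timestamp(pes[9:14]) if pts_dts_flags & 0b10 else None
    return stream_id, pts, pes[9 + header_data_length:]


# Hand-built PES packet: video stream 0xE0, PTS only, PTS = 1.4 s * 90000.
es_data = b"\x00\x00\x01\x09\xf0"
pes = (b"\x00\x00\x01\xe0" + b"\x00\x00" + b"\x80\x80\x05"
       + encode_timestamp(0b0010, 126000) + es_data)

stream_id, pts, payload = parse_pes(pes)
print(hex(stream_id), pts, pts / 90000)  # 0xe0 126000 1.4
```

The ES data comes back unchanged; the PES layer only prepends the header.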
Let’s look at it from the top down and analyze the fields of the TS packet first. As shown below:
Note that Stream Analyzer does not list the fields in byte order; the fields shown in its interface are in a different order from the bytes in the file.
As can be seen from the above figure, Transport Packet is a TS packet, which starts with 0x47, which is convenient for synchronization in certain scenarios. The ISO/IEC 13818-1 standard document gives the syntax for parsing a TS packet, as follows:
The sync_byte in the above figure is 0x47, and transport_packet() is a function that parses TS packets. The syntax is as follows:
Pay attention to the word syntax. Many MPEG documents describe their formats as a syntax, which is really pseudocode written in a C-like notation. If you are familiar with C, the pseudocode in these documents is easy to follow.
As can be seen from the syntax in the above figure, there are the following fields at the beginning of the TS packet:
1. sync_byte, the synchronization field, fixed at 0x47. Position: bits 0 to 7. The value 0x47 can also appear elsewhere in a packet, for example inside the payload. It is not escaped there: because every TS packet is exactly 188 bytes long, a parser can tell a real sync byte from payload data by its position. For details, look at how real parsing code handles it.
2. transport_error_indicator, the transport error flag. In TCP/UDP scenarios this field should always be 0.
3. payload_unit_start_indicator, the start marker of the payload content. Because one PES packet is divided across multiple TS packets, a start marker is required.
4. transport_priority, the transmission priority. This field is rarely used in Internet scenarios, so you can ignore it.
5. PID, the packet identifier (the P does not stand for Program). The PID determines what kind of data is in the payload. The standard document describes it as follows:
6. transport_scrambling_control, the scrambling control field, generally used for paid programs and the like. In the past, satellite TV required you to insert a card and pay before you could watch some programs. Internet scenarios rarely use this field.
7. adaptation_field_control, which signals whether an adaptation field and a payload are present. This field has 4 values: 10 and 11 mean there is an adaptation field that needs to be parsed, and 01 and 11 mean the TS packet has a payload.
8. continuity_counter, a 4-bit counter that is incremented for each TS packet of the same PID that carries a payload and wraps around from 15 to 0, so a gap in the sequence lets a receiver detect lost packets. It does not seem to matter much in Internet scenarios, and I have not worked out what else to do with it, so I am leaving this open and will come back to it later.
9. data_byte, where the payload data begins. The syntax in the document reads the data in a loop of N iterations, where N is 184 minus the size of the adaptation_field. Just think of N as the size of the payload. A real implementation does not necessarily loop N times.
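The field list above maps onto a few shifts and masks. The following is a minimal sketch of a header parser, not the way FFmpeg or any particular tool does it. The function name `parse_ts_header` and the dictionary layout are choices made for this example, and the packet bytes at the bottom are hand-built.

```python
TS_PACKET_SIZE = 188
SYNC_BYTE = 0x47


def parse_ts_header(packet: bytes) -> dict:
    """Split one 188-byte TS packet into its header fields and payload."""
    if len(packet) != TS_PACKET_SIZE:
        raise ValueError("a TS packet must be exactly 188 bytes")
    if packet[0] != SYNC_BYTE:
        raise ValueError("sync_byte is not 0x47")

    afc = (packet[3] >> 4) & 0x03
    payload_offset = 4
    if afc & 0b10:
        # adaptation_field_length counts the bytes that follow it,
        # so the field occupies 1 + length bytes in total.
        payload_offset += 1 + packet[4]
    has_payload = bool(afc & 0b01)

    return {
        "transport_error_indicator": packet[1] >> 7,
        "payload_unit_start_indicator": (packet[1] >> 6) & 1,
        "transport_priority": (packet[1] >> 5) & 1,
        "pid": ((packet[1] & 0x1F) << 8) | packet[2],
        "transport_scrambling_control": packet[3] >> 6,
        "adaptation_field_control": afc,
        "continuity_counter": packet[3] & 0x0F,
        # N data_bytes: whatever is left of the 184 bytes after the adaptation field.
        "payload": packet[payload_offset:] if has_payload else b"",
    }


# Payload-only packet whose header bytes are 47 40 11 10.
plain = bytes([0x47, 0x40, 0x11, 0x10]) + bytes(184)
# Packet with a 7-byte adaptation field (after its length byte) and a payload.
with_af = bytes([0x47, 0x01, 0x00, 0x30, 0x07]) + bytes(7) + bytes(176)

print(parse_ts_header(plain)["pid"], len(parse_ts_header(plain)["payload"]))      # 17 184
print(parse_ts_header(with_af)["pid"], len(parse_ts_header(with_af)["payload"]))  # 256 176
```

Returning the payload as a slice avoids the byte-by-byte loop that the standard's pseudocode suggests.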
We now use Notepad++ to parse the fields above by hand, as shown below:
Byte 0 of the TS file must be 0x47. In Internet scenarios this sync_byte is really a redundant field kept for compatibility: if you designed a new encapsulation format purely for TCP/UDP, you would not need it. I originally assumed that an m3u8 TS live stream sent over TCP would leave 0x47 out, but the segments are ordinary .ts files, so the byte should still be there. I will capture some packets to confirm and then update this section.
In the ATSC standard, the 0x47 sync byte is never encoded and transmitted, but is instead transmitted with a specific, 2-level sync pulse where the receiver inserts the 0x47 sync byte.
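Because 0x47 is not escaped inside payloads, a reader that joins a stream at an arbitrary byte cannot trust the first 0x47 it sees. One common approach, sketched here as an assumption rather than as what any specific player does, is to accept an offset only when several bytes spaced 188 bytes apart are all 0x47. The function name and the `confirm` parameter are hypothetical.

```python
TS_PACKET_SIZE = 188
SYNC_BYTE = 0x47


def find_sync_offset(data: bytes, confirm: int = 3) -> int:
    """Return the first offset where `confirm` consecutive packets start with 0x47, or -1."""
    needed = TS_PACKET_SIZE * (confirm - 1) + 1  # bytes required to check all positions
    for offset in range(0, len(data) - needed + 1):
        if all(data[offset + i * TS_PACKET_SIZE] == SYNC_BYTE for i in range(confirm)):
            return offset
    return -1


# A packet whose payload happens to contain 0x47 at index 100.
packet = b"\x47" + bytes(99) + b"\x47" + bytes(87)
# Three junk bytes in front, two of which look like sync bytes.
stream = b"\x47\x00\x47" + packet * 3

print(find_sync_offset(stream))  # 3
```

The stray 0x47 values are rejected because nothing 188 bytes after them is also 0x47.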
Bytes 1 and 2 are 0x40 0x11. These two bytes hold four fields: transport_error_indicator, payload_unit_start_indicator, transport_priority and PID.
Convert 0x40 0x11 to binary: 0100 0000 0001 0001. Read from left to right, counting bits from 1 rather than 0. The first bit is 0, so transport_error_indicator is 0 and there is no transmission error. The second bit is 1; this is payload_unit_start_indicator, which means this packet starts a payload unit. The third bit is 0, so transport_priority is 0.
The remaining 13 bits, 0 0000 0001 0001, are the PID, which is 17 (0x11).
Then look at the third byte. Its value is 0x10, binary 0001 0000. It is made up of transport_scrambling_control (00), adaptation_field_control (01, meaning payload only) and continuity_counter (0000), as shown below:
Finally, look at the 4th byte: it is the first data_byte, with value 0x00. That completes the 8 fields of the TS packet header.
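The manual walk-through above can be checked in a few lines. This sketch deliberately works on bit strings rather than shifts and masks, so that it mirrors the left-to-right reading used in the text. The four header bytes are the ones quoted from juren.ts.

```python
header = bytes([0x47, 0x40, 0x11, 0x10])

# Bytes 1 and 2 as one 16-character bit string, most significant bit first.
bits = format(int.from_bytes(header[1:3], "big"), "016b")
print(bits)  # 0100000000010001

transport_error_indicator = int(bits[0], 2)
payload_unit_start_indicator = int(bits[1], 2)
transport_priority = int(bits[2], 2)
pid = int(bits[3:], 2)  # the remaining 13 bits
print(transport_error_indicator, payload_unit_start_indicator, transport_priority, pid)
# 0 1 0 17

# Byte 3 as an 8-character bit string.
third = format(header[3], "08b")
print(third)  # 00010000

transport_scrambling_control = int(third[0:2], 2)
adaptation_field_control = int(third[2:4], 2)
continuity_counter = int(third[4:8], 2)
print(transport_scrambling_control, adaptation_field_control, continuity_counter)
# 0 1 0
```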
There is a bslbf in Table 2-3. The full name of this bslbf is Bit string, left bit first. The full name of uimsbf is Unsigned integer, most significant bit first. These can be found in standard documents.
Then look at the content of the following bytes, as shown below:
The ff bytes in the above figure are padding that fills the packet up to 188 bytes. On the Internet this is really a waste of traffic, but the TS encapsulation format was not originally designed for the Internet.
How do we know that these bytes are a Service Description Table? From the PID in the header, which is 0x11. The parsing syntax of the Service Description Table is as follows:
The first TS packet in juren.ts is actually something of a custom table. With PID equal to 0x11 and table_id equal to 0x42, ISO/IEC 13818-1 itself only classifies it as user private; the Service Description Table is defined by DVB. To be frank, I studied it for a long time and still do not fully understand what this first TS packet is for. These syntax documents are best read alongside real TS parsing code, such as the TS part of FFmpeg.
Let’s take a look at the second TS package, as shown below:
As the figure above shows, PID and table_id are both 0, so this TS packet carries the PAT (Program Association Table). A reminder: TS packets encapsulate PES packets, but not only PES packets. A TS packet can also carry PSI data. PSI stands for Program Specific Information. Note that PSI is not itself a table but a general term: PAT, PMT, CAT and NIT are all PSI, as shown below:
That is only a brief look at PSI. The TS encapsulation format really is very complicated, with many tables, and readers need to study it alongside real code to understand it in depth.
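To give a feel for what reading a PSI table involves, here is a minimal sketch that extracts the program list from a PAT. It assumes the whole section fits in one TS payload, follows the program_association_section layout in ISO/IEC 13818-1, and does not verify the CRC_32. The function name and the sample bytes (including the zeroed CRC) are invented for this example and are not copied from juren.ts.

```python
def parse_pat(payload: bytes) -> dict:
    """Map program_number -> PID from a PAT section held in one TS payload."""
    pointer_field = payload[0]          # present because payload_unit_start_indicator is 1
    section = payload[1 + pointer_field:]
    if section[0] != 0x00:
        raise ValueError("table_id is not 0x00, so this is not a PAT")

    section_length = ((section[1] & 0x0F) << 8) | section[2]
    # section_length counts every byte after itself, including the 4-byte CRC_32.
    end_of_loop = 3 + section_length - 4

    programs = {}
    # The program loop starts after the 8 fixed bytes; each entry is 4 bytes.
    for i in range(8, end_of_loop, 4):
        program_number = (section[i] << 8) | section[i + 1]
        pid = ((section[i + 2] & 0x1F) << 8) | section[i + 3]
        programs[program_number] = pid  # program_number 0 would point at the NIT
    return programs


section = bytes([
    0x00,                    # table_id: PAT
    0xB0, 0x0D,              # flags + section_length = 13
    0x00, 0x01,              # transport_stream_id
    0xC1,                    # reserved, version 0, current_next_indicator 1
    0x00, 0x00,              # section_number, last_section_number
    0x00, 0x01, 0xF0, 0x00,  # program 1 -> PMT PID 0x1000
    0x00, 0x00, 0x00, 0x00,  # placeholder CRC_32 (not checked in this sketch)
])
payload = b"\x00" + section + b"\xff" * (184 - 1 - len(section))

print(parse_pat(payload))  # {1: 4096}
```

The PID found here is where the PMT for that program lives, and the PMT in turn lists the PIDs of the audio and video streams.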
Let’s find all the TS packets that make up the first video frame and use them as a thread for understanding the TS encapsulation format. Use Qt Creator to inspect the contents of the AVPacket, as shown below:
In FFmpeg’s AVPacket, the data pointer points to the actual H.264-encoded data, and the pos field gives the position in the file where the corresponding TS packet starts. At that position you can see the TS header bytes 0x47 0x41.
The first frame of video has a total size of 3768 bytes, the beginning is 00 00 01 09 f0 00 00 00 00 06 00 07 , and the end is ab 28 fd 45 f9 30 42 22 70 34 00 00 03 00 00 03 00 02 1e 00 .
Because 3768 bytes are so large, they must be placed in multiple TS packets. Let’s find out all these TS packets. Please see the figure below:
There are a few things to note in the picture above.
1. By my count there are about 20 TS packets, which together make up one frame of video data.
2. Among these TS packets, only the first has payload_unit_start_indicator equal to 1; in all the others it is 0.
3. The value of the continuity_counter field increases by one with each packet.
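The three observations above are enough to reassemble a frame. The sketch below builds a synthetic stream and joins the payloads of one PID from the packet with payload_unit_start_indicator set up to the next such packet, checking continuity_counter along the way. It is an illustration under simplifying assumptions: `packetize` and `collect_first_unit` are made-up helpers, the 3768 bytes are dummy data rather than the real frame, and the PES header that a real stream would put in front of the frame is left out.

```python
def packetize(pid: int, data: bytes, first_counter: int = 0) -> list:
    """Split data into TS packets, stuffing the last one via an adaptation field."""
    packets = []
    for n, pos in enumerate(range(0, len(data), 184)):
        chunk = data[pos:pos + 184]
        counter = (first_counter + n) & 0x0F
        byte1 = (0x40 if n == 0 else 0x00) | (pid >> 8)
        if len(chunk) == 184:
            afc, body = 0b01, chunk
        else:
            af_len = 183 - len(chunk)
            stuffing = b"\x00" + b"\xff" * (af_len - 1) if af_len else b""
            afc, body = 0b11, bytes([af_len]) + stuffing + chunk
        packets.append(bytes([0x47, byte1, pid & 0xFF, (afc << 4) | counter]) + body)
    return packets


def collect_first_unit(packets, pid: int):
    """Return (payload bytes, packet count) of the first complete unit on `pid`."""
    unit, count, expected = bytearray(), 0, None
    for pkt in packets:
        afc = (pkt[3] >> 4) & 0x03
        if (((pkt[1] & 0x1F) << 8) | pkt[2]) != pid or not afc & 0b01:
            continue
        start = bool(pkt[1] & 0x40)
        if start and count:
            break                      # the next unit begins here
        if not start and not count:
            continue                   # still waiting for the first start packet
        counter = pkt[3] & 0x0F
        if expected is not None and counter != expected:
            raise ValueError("continuity_counter jump: a packet was lost")
        expected = (counter + 1) & 0x0F
        unit += pkt[4 + (1 + pkt[4] if afc & 0b10 else 0):]
        count += 1
    return bytes(unit), count


frame = bytes(i % 251 for i in range(3768))   # dummy stand-in for the first frame
stream = packetize(0x100, frame) + packetize(0x100, bytes(10), first_counter=5)

unit, count = collect_first_unit(stream, 0x100)
print(unit == frame, count)  # True 21
```

With PID 0x100 the first packet starts with 0x47 0x41, the same two bytes seen at the pos offset earlier. The count of 21 also shows the 4-bit counter wrapping from 15 back to 0 within a single frame.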
That completes this article’s analysis of the MPEG-TS encapsulation format. I have only covered the simple parts. I recommend that readers read the ISO/IEC 13818-1 standard document, which is well written, and study it together with FFmpeg’s TS code to understand the format in depth.
Here is a tip for reading standards documents. Do not assume that English documents are hard to read; since DeepL came along, reading them has become very easy. I also never rely on a single document. I combine several Chinese translations or Chinese analysis articles to understand a standard, because for native Chinese speakers Chinese material is usually enough to understand it. Only when Chinese material is really lacking do I go back to the English articles, which is much slower. Even with DeepL, some passages still need careful thought.
Finally, let’s talk about expanding knowledge:
1. Both MPEG-PS and MPEG-TS are encapsulated based on PES, so they can be converted to each other.
5. MPEG Fundamentals and Protocol Analysis Guide – Tektronix
The text and pictures in this article are from InfoQ
This article is reprinted from https://www.techug.com/post/mpeg-ts-encapsulation-format.html