Internet Draft                                                     A. Li
draft-li-ulp-00.txt                                               F. Liu
July 4, 2000                                               J. Villasenor
Expires: January 2001                                Univ. of Calif., LA
                                                               J.H. Park
                                                               D.S. Park
                                                     Samsung Electronics

An RTP Payload Format for Generic FEC with Uneven Level Protection

STATUS OF THIS MEMO

This document is an Internet-Draft and is in full conformance with all provisions of Section 10 of RFC2026.

Internet-Drafts are working documents of the Internet Engineering Task Force (IETF), its areas, and its working groups. Note that other groups may also distribute working documents as Internet-Drafts.

Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as work in progress.

The list of current Internet-Drafts can be accessed at http://www.ietf.org/ietf/1id-abstracts.txt

The list of Internet-Draft Shadow Directories can be accessed at http://www.ietf.org/shadow.html.

1 Abstract

This document specifies a payload format for generic forward error correction to achieve uneven level protection (ULP) of media encapsulated in RTP. It is an extension to the forward error correction scheme specified in RFC 2733 [1], and it is also based on the exclusive-or (parity) operation. The payload format allows end systems to transmit using arbitrary protection lengths and levels, in addition to using arbitrary block lengths. It also allows for both the complete recovery of the critical payload and RTP header fields, and partial recovery when complete recovery is not possible because of the pattern of packet loss. This scheme is backward compatible with non-FEC capable hosts, and with hosts that are only capable of the FEC schemes specified in RFC 2733 [1], so that receivers which do not wish to implement ULP forward error correction can just ignore the extensions.
2 Introduction

Because of their real-time nature, many applications have stricter delay requirements than pure data transmission. As a result, retransmission of lost packets is generally not a valid option for such applications. A better way to attempt to recover information about a lost packet in this case is forward error correction (FEC), which has been used to compensate for packet loss in the Internet [2]. In particular, the error correction has to operate at the packet level, because any correction within the packet is useless if the whole packet is lost.

For many network connections, bandwidth is a very limited resource. However, many traditional FEC schemes are not designed for optimal usage of this limited resource. A more efficient approach is to provide different protection levels for parts of the data stream that differ in importance. These unequal error protection schemes make more efficient use of the bandwidth and provide better overall protection of the data stream against losses.

To support these mechanisms, protocol support is required. However, most unequal error protection schemes require knowledge of the importance level or class of each part of the data stream. As a result, most such schemes depend on the nature and structure of the media being protected, and are not generic.

For multimedia streams, some very useful knowledge about the stream is often available. In general, the more important parts of the data are at the beginning of the data packet. This is the common practice for most codecs, since the beginning of the packet is closer to the re-synchronization marker at the header and thus is more likely to be correctly decoded if the data is variable length coded. Also, almost all media formats have the frame headers at the beginning of the packet.

For video streams, most modern formats have optional data partitioning modes to improve error resilience, where the video macroblock header data, the motion vector data and the DCT coefficient data are separated into individual partitions. In ITU-T H.263 version 3 when the optional data partitioned syntax of Annex V is enabled, and in MPEG-4 Visual Simple Profile when the optional data partitioning mode is enabled, the video macroblock (MB) header and motion vector partitions (which are much more important to the quality of video reconstruction) are transmitted in the partition at the beginning of the video packet, while the residual DCT coefficient partitions (which are less important) are transmitted in the partition close to the end of the packet. Because the data is arranged in order from more important to less important, it helps to provide more protection to the beginning of the packet.

In the case of audio streams, many new audio codecs encode the bitstream into data of different importance classes and transmit them in order from more important to less important. Applying more protection to the beginning of the packet would be beneficial. Even for uniform-significance audio streams, special stretching techniques can be applied to the partially recovered audio data packets. Also, if audio redundancy coding is used, it makes sense to apply more protection to the original data, which is in the first half of the packet, and no protection to the redundant copies, which are in the trailing half of the packet.

So the application should benefit from an unequal error protection scheme with more emphasis on the beginning part of the packets.

This document defines a payload format for RTP [3] which allows for generic forward error correction with unequal error protection for real time media. The payload data is protected by one or more protection levels.
The lower protection levels provide more protection by using a smaller group size (compared to higher protection levels) to generate the FEC packet. The data that is closer to the beginning of the packet is protected by lower protection levels because this data is in general more important and carries more information than the data further behind.

In this context, generic means that the FEC protocol is (1) independent of the nature of the media being protected, be it audio, video, or otherwise, (2) flexible enough to support a wide variety of FEC mechanisms, (3) designed for adaptivity so that the FEC technique can be modified easily without out of band signaling, and (4) supportive of a number of different mechanisms for transporting the FEC packets.

3 Terminology

The following terms are used throughout this document:

Media Payload: A piece of raw, unprotected user data which is to be transmitted from the sender. The media payload is placed inside of an RTP packet.

Media Header: The RTP header for the packet containing the media payload.

Media Packet: The combination of a media payload and media header is called a media packet.

FEC Packet: The forward error correction algorithms at the transmitter take the media packets as an input. They output both the media packets that they are passed, and new packets called FEC packets. The FEC packets are formatted according to the rules specified in this document.

FEC Header: The header information contained in an FEC packet.

FEC Payload: The payload in an FEC packet.

Associated: An FEC packet is said to be "associated" with one or more media packets when those media packets are used to generate the FEC packet (by use of the exclusive or operation).

The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in RFC 2119 [4].

4 Basic Operation

The payload format described here is used whenever a participant in an RTP session would like to protect a media stream it is sending with uneven level protection (ULP) FEC. The ULP FEC supported by the format is based on simple exclusive or (xor) parities, as also used in RFC 2733 [1]. The sender takes the packets from the media stream that need to be protected, and determines the protection levels it wants for these packets and the length for each level. The data of each level is grouped in a way that is described below, so that each level obtains a different error resilience capability by adjusting the size of the group. An xor operation is applied across the payload to generate the FEC information for that level. The lower protection levels (which provide higher protection, or higher error resilience) are applied to the data that is closer to the beginning of the packet to ensure more protection there. Based on the procedures defined here, the result is an RTP packet containing ULP FEC information. This packet can be used at the receiver to recover any one of the packets used to generate the FEC packet, or to recover part of a packet, depending on which packets were lost. By using uneven error protection, this scheme can make more efficient use of the channel bandwidth, and provide more efficient error resilience for transmission over error prone channels.

The payload format contains information that allows the sender to tell the receiver exactly which media packets are protected by this ULP FEC packet, and the protection levels and lengths for each of them. Specifically, each ULP FEC packet contains a protection length and a bitmask, called the offset mask, for each protection level. If bit i in the mask m(k) (i.e., the mask for protection level k) is set to 1, data of length L(k) in the media packet with sequence number N + i is protected by this ULP FEC packet at level k. N is called the sequence number base, and is sent in the FEC packet as well.
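
The relationship between the sequence number base N, the offset mask and the protected media packets can be sketched as follows. This is only an illustration: the function names are hypothetical, the 24-bit mask width and the convention that the least significant bit corresponds to i=0 are taken from the ULP Layer Header section of this document, and the modulo-65536 wraparound reflects the 16-bit RTP sequence number.

```python
from typing import Iterable, List

MASK_BITS = 24  # width of the mask field (see the ULP Layer Header section)


def protected_sequence_numbers(sn_base: int, mask: int) -> List[int]:
    """Return the sequence numbers N + i for every bit i set in the mask."""
    return [(sn_base + i) & 0xFFFF for i in range(MASK_BITS) if (mask >> i) & 1]


def build_mask(sn_base: int, sequence_numbers: Iterable[int]) -> int:
    """Build an offset mask for the given media sequence numbers."""
    mask = 0
    for sn in sequence_numbers:
        offset = (sn - sn_base) & 0xFFFF  # distance from N, modulo 2**16
        if offset >= MASK_BITS:
            raise ValueError("packet lies outside the window covered by the mask")
        mask |= 1 << offset
    return mask


# Hypothetical example: N = 1000, packets 1000, 1001 and 1003 are protected.
mask = build_mask(1000, [1000, 1001, 1003])
print(bin(mask), protected_sequence_numbers(1000, mask))
```

A sender would compute one such mask per protection level; a receiver would run the reverse mapping to learn which media packets a given level of an FEC packet covers.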
The protection length, offset mask and payload type are sufficient to signal arbitrary parity based forward error correction schemes with little overhead. There is a set of rules, described below, on how the mask should be set for different protection levels. These rules ensure that if the data of protection level k for a packet is recoverable, all the data of protection levels lower than k is recoverable for that particular packet.

This document also describes procedures that allow the receiver to make use of the ULP FEC without having to know the details of specific codes. This allows the sender much flexibility; it can adapt the code in use based on network conditions, and be certain the receivers can still make use of the FEC for recovery.

At the receiver, the ULP FEC and original media are received. If no media packets are lost, the ULP FEC can be ignored. In the event of loss, the FEC packets can be combined with other media and FEC packets that have been received, resulting in recovery of the whole or part of the missing media packets.

RTP packets which contain data formatted according to this specification (i.e., ULP FEC packets) use dynamic RTP payload types.

5 RTP Media Packet Structure

The formatting of the media packets is unaffected by FEC. If the FEC is sent as a separate stream, the media packets are sent as if there was no FEC.

This leads to a very efficient encoding. When little (or no) FEC is used, there are mostly media packets being sent. This means that the overhead (present in FEC packets only) tracks the amount of FEC in use.

6 FEC Packet Structure

An FEC packet is constructed by placing a ULP FEC header and ULP FEC payload in the RTP payload, as shown in Figure 1:

+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                          RTP Header                           |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                          FEC Header                           |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                      ULP Layer 0 Header                       |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                      ULP Layer 0 Payload                      |
|                                                               |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                      ULP Layer 1 Header                       |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                      ULP Layer 1 Payload                      |
|                                                               |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                             Cont.                             |
|                                                               |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

Figure 1: ULP FEC Packet Structure

6.1 RTP Header of FEC Packets

The version field is set to 2. The padding bit is computed via the protection operation, defined below. The extension bit is also computed via the protection operation. The SSRC value will generally be the same as the SSRC value of the media stream it protects. It MAY be different if the FEC stream is being demultiplexed via the SSRC value. The CC value is computed via the protection operation. The CSRC list is never present, independent of the value of the CC field. The extension is never present, independent of the value of the X bit. The marker bit is computed via the protection operation.

The sequence number has the standard definition: it MUST be one higher than the sequence number in the previously transmitted FEC packet. The timestamp MUST be set to the value of the media RTP clock at the instant the FEC packet is transmitted. As a result, the TS value in FEC packets is monotonically increasing, independent of the FEC scheme.

The payload type for the FEC packet is determined through dynamic, out of band means.
According to RFC 1889 [3], RTP participants which cannot recognize a payload type must discard it. This provides backwards compatibility. The FEC mechanisms can then be used in a multicast group with mixed FEC-capable and FEC-incapable receivers.

6.2 FEC Header

This header is 12 bytes. The format of the header is shown in Figure 2, and consists of an SN base field, length recovery field, E field, PT recovery field, mask field and TS recovery field.

 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|            SN base            |        length recovery        |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|E| PT recovery |                     mask                      |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                          TS recovery                          |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

Figure 2: FEC Header Format

This is exactly the FEC header used in RFC 2733 [1]. Its usage is also exactly as specified in RFC 2733, except that the E bit MUST be set to one for this version.

6.3 ULP Layer Header

The ULP Layer Header is 2 bytes for ULP layer 0, and 5 bytes for ULP layer 1 and higher. The format of the header is shown in Figure 3 and Figure 4, and consists of a Protection Length field and a mask field (for layer 1 and higher headers).

 0                   1
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|       Protection Length       |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

Figure 3: ULP Layer Header Format (Level 0)

 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|       Protection Length       |             mask              |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| mask (cont.)  |
+-+-+-+-+-+-+-+-+

Figure 4: ULP Layer Header Format (Level 1 and higher)

The Protection Length field is 16 bits.
It indicates the protection length provided by the ULP FEC for the current protection level (i.e., the payload length for the current protection level after the header).

The mask field is 24 bits. If bit i in the mask is set to 1, then the media packet with sequence number N + i is associated with this ULP FEC packet at the current protection level, where N is the SN Base field in the ULP FEC packet header. The least significant bit corresponds to i=0, and the most significant to i=23.

The SN base field in the FEC header MUST be set to the minimum sequence number of those media packets protected by ULP FEC. This allows the ULP FEC operation to extend over any string of at most 24 packets.

The setting of the mask field shall follow these rules:

a. A packet can only be protected by each protection level once.

b. For a packet to be protected at level p, it must also be protected at level p-1.

c. Packets that are protected by one ULP FEC packet at level p-1 must also be protected within the same ULP FEC packet at level p. Note: protection level p might be in a different ULP FEC packet from protection level p-1.

d. Packets that are protected in a packet with protection level p (even if the packet is not protected at that level) must not be protected by levels equal to or lower than p in later FEC packets.

The payload of the FEC packet at each level is the protection operation applied to the concatenation of the CSRC list, RTP extension, media payload, and padding of the media packets associated with the FEC packet. The details are described in the next section on the protection operation.

7 Protection Operation

The protection operation involves copying the payload, padding with zeroes, and computing the xor across the resulting bit strings. In addition, for protection level 0, it also involves concatenating specific fields from the RTP header of the media packet before the payload data. The resulting bit string is used to generate the ULP FEC packet.
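
The core of the protection operation (copy, pad, xor) can be sketched as below. This is a simplified illustration under stated assumptions, not the normative procedure: it works on whole bytes rather than bit strings, it covers only payload data and omits the RTP header fields that level 0 places in front of the payload, and the function name, the sample payloads and the 6-byte protection length are all hypothetical.

```python
from typing import Iterable


def xor_parity(chunks: Iterable[bytes], protection_length: int) -> bytes:
    """XOR the first protection_length bytes of every chunk.

    Chunks shorter than protection_length are padded at the end with
    zero bytes before the xor is taken.
    """
    parity = bytearray(protection_length)
    for chunk in chunks:
        piece = chunk[:protection_length].ljust(protection_length, b"\x00")
        for index, byte in enumerate(piece):
            parity[index] ^= byte
    return bytes(parity)


# Three hypothetical media payloads of different lengths.
payloads = [b"HEADER-A tail", b"HEADER-B", b"HDR-C"]
PROTECTION_LENGTH = 6  # bytes covered by this level (illustrative value)

fec_payload = xor_parity(payloads, PROTECTION_LENGTH)

# Suppose payloads[1] is lost: xor the FEC payload with the survivors.
recovered = xor_parity([fec_payload, payloads[0], payloads[2]], PROTECTION_LENGTH)
assert recovered == payloads[1][:PROTECTION_LENGTH]
```

Because xor is its own inverse, the same routine serves for generating the FEC data and for recovering one missing member of the group. Only the first PROTECTION_LENGTH bytes of the lost payload come back, which is the partial recovery that uneven level protection is designed to provide.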

The following procedure MAY be followed for the protection operation. Other procedures MAY be followed, but the end result MUST be identical to the one described here.

7.1 Protection Level 0

For each media packet to be protected, a bit string is generated by concatenating the following fields together in the order specified:

   o Padding Bit (1 bit)

   o Extension Bit (1 bit)

   o CC bits (4 bits)

   o Marker bit (1 bit)

   o Payload Type (7 bits)

   o Timestamp (32 bits)

   o Unsigned network-ordered 16 bit representation of the sum of the lengths of the CSRC List, length of the padding, length of the extension, and length of the media packet (16 bits)

   o if CC is nonzero, the CSRC List (variable length)

   o if X is 1, the Header Extension (variable length)

   o the payload (variable length)

   o Padding, if present (variable length)

Note that the Padding Bit (first entry above) forms the most significant bit of the bit string.

If the lengths of the bit strings are not equal, each bit string that is shorter than the Protection Length 0 plus 62 bits (the total size of the fixed-length fields listed above) MUST be padded to that length. Any value for the pad may be used. The pad MUST be added at the end of the bit string.

The parity operation is then applied across the bit strings. The result is the bit string used to build the FEC packet. Call this the ULP FEC bit string (level 0). The first (most significant) bit in the FEC bit string is written into the Padding Bit of the FEC packet. The second bit in the FEC bit string is written into the Extension bit of the FEC packet. The next four bits of the FEC bit string are written into the CC field of the FEC packet. The next bit of the FEC bit string is written into the marker bit of the FEC packet. The next 7 bits of the FEC bit string are written into the PT recovery field in the FEC packet header. The next 32 bits of the FEC bit string are written into the TS recovery field in the packet header. The next 16 bits are written into the length recovery field in the FEC packet header.
This is exactly the same as in RFC 2733 [1]. The remaining bits (of length Protection Length 0) are set to be the payload of the ULP FEC packet.

7.2 Protection Level 1 and higher

The protected data of the corresponding packets is copied into the bit strings. If a packet ends before the Protection Length of the current level is reached, the string is padded to that length. Any value for the pad may be used. The pad MUST be added at the end of the bit string.

The parity operation is applied across the protected data of the corresponding packets. The generated ULP FEC bit string of that level is then appended to the payload of the ULP FEC packet.

8 Recovery Procedures

The FEC packets allow end systems to recover from the loss of media packets. All of the header fields of the missing packets, including CSRC lists, extensions, padding bits, marker and payload type, are recoverable. This section describes the procedure for performing this recovery.

Recovery requires two distinct operations. The first determines which packets (media and FEC) must be combined in order to recover a missing packet. Once this is done, the second step is to actually reconstruct the data. The second step MUST be performed as described below. The first step MAY be based on any algorithm chosen by the implementor. Different algorithms result in a tradeoff between complexity and the ability to recover missing packets if at all possible.

8.1 Reconstruction of Level 0

Let T be the list of packets (FEC and media) which can be combined to recover some media packet xi. The procedure is as follows:

   1. For the media packets in T, compute the bit string as described in the protection operation of the previous section.

   2. For the FEC packet in T, compute the bit string in the same fashion, except always set the CSRC list, extension, and padding to null. Read the Protection Length 0. Read a string of that length from the FEC packet.

   3. If any of the bit strings generated from the media packets are shorter than the bit string generated from the FEC packet, pad them to be the same length as the bit string generated from the FEC packet. The padding MUST be added at the end of the bit string, and MAY be of any value.

   4. Perform the exclusive or (parity) operation across the bit strings, resulting in a recovery bit string.

   5. Create a new packet with the standard 12 byte RTP header and no payload.

   6. Set the version of the new packet to 2.

   7. Set the Padding bit in the new packet to the first bit in the recovery bit string.

   8. Set the Extension bit in the new packet to the second bit in the recovery bit string.

   9. Set the CC field to the next four bits in the recovery bit string.

   10. Set the marker bit in the new packet to the next bit in the recovery bit string.

   11. Set the payload type in the new packet to the next 7 bits in the recovery bit string.

   12. Set the SN field in the new packet to xi.

   13. Set the TS field in the new packet to the next 32 bits in the recovery bit string.

   14. Take the next 16 bits of the recovery bit string. Whatever unsigned integer this represents (assuming network-order), take that many bytes from the recovery bit string and append them to the new packet. This represents the CSRC list, extension, payload, and padding.

   15. Set the SSRC of the new packet to the SSRC of the media stream it is protecting.

This procedure will recover both the header and payload of an RTP packet up to the Protection Length of level 0.

8.2 Reconstruction of Level 1 and higher

Let T be the list of packets (FEC and media) which can be combined to recover some media packet xi. The procedure is as follows:

   1. For the media packet in T, get the protection length of that level. Copy the data of that protection level (data of the length read following the level header) to the bit strings.

   2. If any of the bit strings generated from the media packets are shorter than the Protection Length of the current level, pad them to that length. The padding MUST be added at the end of the bit string, and MUST be of the same value as used in the process of generating the ULP FEC packets.

   3. Perform the exclusive or (parity) operation across the bit strings, resulting in a recovery bit string.

Because the data protected at a lower protection level is always recoverable if the data protected at a higher level is recoverable, this procedure (together with the procedure for the lower protection levels) will recover both the header and payload of an RTP packet up to the Protection Length of the current level.

9 Examples

9.1 An example that generates identical protection as in RFC 2733

9.2 An example that has only protection level 0

9.3 An example that has two protection levels (0 and 1)

10 Acknowledgments

This text is partially based on RFC 2733 on FEC by J. Rosenberg and H. Schulzrinne. The authors would also like to acknowledge the suggestions from many people, particularly Tao Tian, Matthieu Tisserand, and Stephen Wenger.

11 Author's Addresses

Adam H. Li
Electronic Engineering Department
University of California, Los Angeles
Los Angeles, CA 90095
USA
Phone: +1-310-825-5178
Fax  : +1-310-825-7928
EMail: adamli@icsl.ucla.edu

Fang Liu
Electronic Engineering Department
University of California, Los Angeles
Los Angeles, CA 90095
USA
Phone: +1-310-825-5178
Fax  : +1-310-825-7928
EMail: fanliu@icsl.ucla.edu

John D. Villasenor
Electronic Engineering Department
University of California, Los Angeles
Los Angeles, CA 90095
USA
Phone: +1-310-825-5178
Fax  : +1-310-825-7928
EMail: villa@icsl.ucla.edu

Jeong-Hoon Park
Samsung Electronics
Suwon, Kyoungi 442-742
Korea
Phone: +82-331-200-3674
Fax  : +82-331-200-3174
EMail: jhpark@mmrnd.sec.samsung.co.kr

Dong-Seek Park
Samsung Electronics
Suwon, Kyoungi 442-742
Korea
Phone: +82-331-200-3674
Fax  : +82-331-200-3174
EMail: dspark@mmrnd.sec.samsung.co.kr

12 Bibliography

[1] J. Rosenberg and H. Schulzrinne, "An RTP Payload Format for Generic Forward Error Correction," Request for Comments (Proposed Standard) 2733, Internet Engineering Task Force, Dec. 1999.

[2] C. Perkins and O. Hodson, "Options for repair of streaming media," Request for Comments (Informational) 2354, Internet Engineering Task Force, June 1998.

[3] H. Schulzrinne, S. Casner, R. Frederick, and V. Jacobson, "RTP: a transport protocol for real-time applications," Request for Comments (Proposed Standard) 1889, Internet Engineering Task Force, Jan. 1996.

[4] S. Bradner, "Key words for use in RFCs to indicate requirement levels," Request for Comments (Best Current Practice) 2119, Internet Engineering Task Force, Mar. 1997.