FRAME SYNCHRONOUS PACKET SWITCHING FOR HIGH-DEFINITION MULTIMEDIA INTERFACE (HDMI) VIDEO TRANSITIONS
An apparatus for use in a high-definition multimedia interface (HDMI) source device includes an HDMI interface for transmitting video data and metadata to a sink device. The apparatus is configured to encode the metadata in an auxiliary video information (AVI) information frame (InfoFrame). The apparatus is further configured to transmit the AVI InfoFrame during a frame synchronous transmission window (FSTW) of the video data, wherein the FSTW begins during a video blanking interval (VBI) of the video data, on a first video blank pixel that immediately follows a last active video pixel of a preceding video frame or video field and ends a predetermined number of video lines after a start of the VBI.
This application concerns sending and receiving units that employ a high-definition multimedia interface (HDMI) and, in particular, HDMI sending and receiving units that implement frame-synchronous transitions among high dynamic range (HDR) and standard dynamic range (SDR) video content.
BACKGROUND
The high-definition multimedia interface (HDMI) is a popular interface for transmitting high-speed baseband digital video and associated audio signals for presentation on an HDMI-capable device. Recently, high dynamic range (HDR) video display devices have become available, and video sources, such as digital versatile disc (DVD) players, television broadcasts, and on-line streaming services, now provide HDR content. HDR displays that receive HDR content provide higher brightness levels and may also provide darker black levels and improved color rendering as compared to standard dynamic range (SDR). SDR video refers to a dynamic range of between zero and 300 nits (cd/m²). Recently, display devices having dynamic ranges up to 10000 nits or greater have become available. These display devices are referred to as HDR displays. In order to accommodate these HDR displays and the corresponding HDR sources, video interfaces, including HDMI, have been adapted to transport both pixel data and SDR or HDR metadata over the interface.
Metadata for SDR video data is sent over the HDMI interface using auxiliary video information (AVI) information frames (InfoFrames). Currently, there are two types of HDR metadata, static HDR (S-HDR) metadata which is sent using DRange and Mastering (DRAM) InfoFrames, and dynamic HDR metadata which is sent using HDR Dynamic Metadata Extended (HDR DME) InfoFrames. S-HDR metadata is applied to an entire program while dynamic HDR metadata may change more frequently, typically over a sequence of several frames but could change frame to frame. The metadata in the DRAM InfoFrames and HDR DME InfoFrames augments the metadata in the AVI InfoFrames.
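The pairing of video formats and InfoFrame types described above can be summarized in a short Python sketch. It is illustrative only: the `VideoFormat` enumeration, the string labels, and the `infoframes_for` helper are hypothetical names chosen for this example and are not taken from the HDMI specification.

```python
from enum import Enum


class VideoFormat(Enum):
    """Video formats discussed in the text (names are illustrative)."""
    SDR = "SDR"
    S_HDR = "S-HDR"
    DYNAMIC_HDR = "dynamic HDR"


# InfoFrame types that carry metadata for each format. The DRAM and
# HDR DME InfoFrames augment the AVI InfoFrame, so AVI is always present.
INFOFRAMES_BY_FORMAT = {
    VideoFormat.SDR: ("AVI",),
    VideoFormat.S_HDR: ("AVI", "DRAM"),
    VideoFormat.DYNAMIC_HDR: ("AVI", "HDR DME"),
}


def infoframes_for(fmt):
    """Return the InfoFrame types used to signal metadata for fmt."""
    return INFOFRAMES_BY_FORMAT[fmt]
```

For example, `infoframes_for(VideoFormat.S_HDR)` yields the AVI and DRAM InfoFrame labels.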
A source processing an HDR signal may be coupled to a sink (e.g., display) configured to display only SDR video or SDR video and one or both of S-HDR video or dynamic HDR video. When the sink does not support dynamic HDR, the source may convert the dynamic HDR video data to S-HDR video data or SDR video data before sending the video data to the sink. When the sink does not support S-HDR video or dynamic HDR video, the source may convert both S-HDR video data and dynamic HDR video data to SDR video data before sending the video data to the sink. A sink that is capable of displaying dynamic HDR video receives the video data over the HDMI interface using the HDR DME InfoFrame in a frame-synchronous manner so that the metadata is applied to the frame occurring immediately after the metadata is received.
To implement the frame-synchronous switching of the dynamic HDR metadata carried in the HDR DME InfoFrame, HDMI 2.1 defines a frame accurate packet area (FAPA) in the vertical blanking area of the video signal and specifies that HDR DME InfoFrames are to be sent during the FAPA period. HDMI 2.1 also specifies that AVI InfoFrames and DRAM InfoFrames are to be sent in a frame-synchronous manner, but HDMI 2.1 does not require that these InfoFrames be sent during any particular period within a video frame. In summary, the transmission timing of the HDR DME InfoFrame is precisely specified: it is to be transmitted during the FAPA period. The AVI InfoFrame and the DRAM InfoFrame are required to be frame-synchronous, but no specific time period for their transmission is specified.
SUMMARY
Various examples are now described to introduce a selection of concepts in a simplified form that are further described below in the detailed description. The Summary is not intended to identify key or essential features of the claimed subject matter, nor is it intended to be used to limit the scope of the claimed subject matter.
According to one aspect of the present disclosure, an apparatus for use in a source device for transmitting and receiving data using a high definition media interface (HDMI), the apparatus comprises an HDMI interface for transmitting data to and receiving data from a sink device; a memory holding executable code; a processor, coupled to the memory and to the HDMI interface, the processor configured by the executable code to: receive video data and metadata for transmission to the sink device; encode the metadata in an auxiliary video information (AVI) information frame (InfoFrame); and transmit the AVI InfoFrame during a frame synchronous transmission window (FSTW) of the video data, wherein the FSTW begins during a video blanking interval (VBI) of the video data, on a first video blank pixel that immediately follows a last active video pixel of a preceding video frame or video field and ends a predetermined number of video lines after a start of the VBI.
Optionally, in the preceding aspect, a further implementation of the aspect includes, the received video data including standard dynamic range (SDR) video data and the metadata is metadata for the SDR video data.
Optionally, in any of the preceding aspects, a further implementation of the aspect includes, the received metadata including metadata for a static high dynamic range (S-HDR) video sequence wherein the processor is configured by the executable code to: encode the metadata for the S-HDR video sequence in the AVI InfoFrame and in a DRange and Mastering (DRAM) InfoFrame; and transmit the AVI InfoFrame and the DRAM InfoFrame during the FSTW.
Optionally, in any of the preceding aspects, a further implementation of the aspect includes, the received metadata including metadata for a dynamic high dynamic range (HDR) video sequence wherein the processor is configured by the executable code to: encode the metadata for the dynamic HDR video sequence in the AVI InfoFrame and in a HDR dynamic metadata extended (HDR DME) InfoFrame; and transmit the AVI InfoFrame and the HDR DME InfoFrame during the FSTW.
According to another aspect of the present disclosure, an apparatus for use in a sink device for receiving data using a high definition media interface (HDMI), the apparatus comprises: an HDMI interface for receiving data from a source device; a memory holding executable code; a processor, coupled to the memory and to the HDMI interface, the processor configured by the executable code to: receive a video sequence from the source device, the video sequence including a plurality of video fields or video frames, each video field or video frame including an active video interval and a vertical blanking interval (VBI); extract an auxiliary video information (AVI) information frame (InfoFrame) including metadata for the video sequence from a frame synchronous transmission window (FSTW) of the VBI of at least one of the fields or frames of the video sequence, wherein the FSTW begins during the VBI on a first video blank pixel that immediately follows a last active video pixel of a preceding video field or video frame and ends a predetermined number of video lines after a start of the VBI; extract the metadata from the AVI InfoFrame; and apply the extracted metadata to video data in the active video interval of the video field or video frame containing the FSTW.
Optionally, in any of the preceding aspects, a further implementation of the aspect includes the received video sequence having a static high dynamic range (S-HDR) video sequence wherein the processor is configured by the executable code to: extract a DRange and Mastering (DRAM) InfoFrame from the FSTW of the VBI of the at least one field or frame of the video sequence; extract further metadata from the DRAM InfoFrame; and apply the extracted metadata and the further metadata to the video data in the active video interval of the video field or video frame containing the FSTW.
Optionally, in any of the preceding aspects, a further implementation of the aspect includes the received video sequence having a high dynamic range (HDR) video sequence wherein the processor is configured by the executable code to: extract an HDR dynamic metadata extended (HDR DME) InfoFrame from the FSTW of the VBI of the at least one field or frame of the video sequence; extract further metadata from the HDR DME InfoFrame; and apply the extracted metadata and the further metadata to the video data in the active video interval of the video field or video frame containing the FSTW.
According to another aspect of the present disclosure, a method for transmitting data from a source device to a sink device uses a high definition media interface (HDMI) and comprises: receiving video data and metadata for transmission to the sink device; encoding the metadata in an auxiliary video information (AVI) InfoFrame; and transmitting the AVI InfoFrame during a frame synchronous transmission window (FSTW) of the video data, wherein the FSTW begins during a video blanking interval (VBI) of the video data, on a first video blank pixel that immediately follows a last active video pixel of a preceding video frame or video field and ends a predetermined number of video lines after a start of the VBI.
Optionally, in any of the preceding aspects, a further implementation of the aspect includes receiving standard dynamic range (SDR) video data and the metadata is metadata for the SDR video data.
Optionally, in any of the preceding aspects, in a further implementation of the aspect, receiving the video data and metadata includes receiving a static high dynamic range (S-HDR) video sequence and metadata for the S-HDR video sequence; encoding the metadata includes encoding the metadata for the S-HDR video sequence in the AVI InfoFrame and in a DRange and Mastering (DRAM) InfoFrame; and transmitting the AVI InfoFrame includes transmitting the AVI InfoFrame and the DRAM InfoFrame during the FSTW.
Optionally, in any of the preceding aspects, in a further implementation of the aspect, receiving the video data and metadata includes receiving a dynamic high dynamic range (HDR) video sequence and metadata for the dynamic HDR video sequence; encoding the metadata includes encoding the metadata for the dynamic HDR video sequence in the AVI InfoFrame and in a HDR dynamic metadata extended (HDR DME) InfoFrame; and transmitting the AVI InfoFrame includes transmitting the AVI InfoFrame and the HDR DME InfoFrame during the FSTW.
Optionally, in any of the preceding aspects, a further implementation of the aspect includes receiving a video sequence from the source device, the video sequence including a plurality of video fields or video frames, each video field or video frame including an active video interval and a vertical blanking interval (VBI); extracting an auxiliary video information (AVI) information frame (InfoFrame) including metadata for the video sequence from a frame synchronous transmission window (FSTW) of the VBI of at least one of the fields or frames of the video sequence, wherein the FSTW begins during the VBI on a first video blank pixel that immediately follows a last active video pixel of a preceding video field or video frame and ends a predetermined number of video lines after a start of the VBI; extracting the metadata from the AVI InfoFrame; and applying the extracted metadata to video data in the active video interval of the video field or video frame containing the FSTW.
Optionally, in any of the preceding aspects, a further implementation of the aspect includes extracting a DRange and Mastering (DRAM) InfoFrame from the FSTW of the VBI of the at least one field or frame of the video sequence; extracting further metadata from the DRAM InfoFrame; and applying the further metadata to the S-HDR video data in the active video interval of the video field or video frame containing the FSTW.
Optionally, in any of the preceding aspects, a further implementation of the aspect includes extracting an HDR dynamic metadata extended (HDR DME) InfoFrame from the FSTW of the VBI of the at least one field or frame of the video sequence; extracting further metadata from the HDR DME InfoFrame; and applying the further metadata to the dynamic HDR video data in the active video interval of the video field or video frame containing the FSTW.
According to another aspect of the present disclosure, a computer-readable medium includes program instructions for execution by a processor to configure the processor to transmit data from a source device to a sink device using a high definition media interface (HDMI), the program instructions configuring the processor to: receive video data and metadata for transmission to the sink device; encode the metadata in an auxiliary video information (AVI) information frame (InfoFrame); and configure the AVI InfoFrame for transmission during a frame synchronous transmission window (FSTW) of the video data, wherein the FSTW begins during a video blanking interval (VBI) of the video data, on a first video blank pixel that immediately follows a last active video pixel of a preceding video frame or video field and ends a predetermined number of video lines after a start of the VBI.
Optionally, in any of the preceding aspects, a further implementation of the aspect includes program instructions to configure the processor to: receive standard dynamic range (SDR) video data and metadata for the SDR video data; and encode the metadata for the SDR video data in the AVI InfoFrame.
Optionally, in any of the preceding aspects, a further implementation of the aspect includes program instructions to configure the processor to: receive, as the video data and metadata, a static high dynamic range (S-HDR) video sequence and metadata for the S-HDR video sequence; encode the metadata for the S-HDR video sequence in the AVI InfoFrame and in a DRange and Mastering (DRAM) InfoFrame; and configure the AVI InfoFrame and the DRAM InfoFrame for transmission during the FSTW.
Optionally, in any of the preceding aspects, a further implementation of the aspect includes program instructions to configure the processor to: receive, as the video data and metadata, a dynamic high dynamic range (HDR) video sequence and metadata for the dynamic HDR video sequence; encode the metadata for the dynamic HDR video sequence in the AVI InfoFrame and in a HDR dynamic metadata extended (HDR DME) InfoFrame; and configure the AVI InfoFrame and the HDR DME InfoFrame for transmission during the FSTW.
According to yet another aspect of the present disclosure, a computer-readable medium includes program instructions for execution by a processor to configure the processor in a sink device to receive data from a source device using a high definition media interface (HDMI), the program instructions configuring the processor to: receive a video sequence from the source device, the video sequence including a plurality of video fields or video frames, each video field or video frame including an active video interval and a vertical blanking interval (VBI); extract an auxiliary video information (AVI) information frame (InfoFrame) including metadata for the video sequence from a frame synchronous transmission window (FSTW) of the VBI of at least one of the fields or frames of the video sequence, wherein the FSTW begins during the VBI on a first video blank pixel that immediately follows a last active video pixel of a preceding video field or video frame and ends a predetermined number of video lines after a start of the VBI; extract the metadata from the AVI InfoFrame; and apply the extracted metadata to video data in the active video interval of the video field or video frame containing the FSTW.
Optionally, in any of the preceding aspects, a further implementation of the aspect includes program instructions to configure the processor to: receive, as the video sequence, a static high dynamic range (S-HDR) video sequence; extract a DRange and Mastering (DRAM) InfoFrame from the FSTW of the VBI of the at least one field or frame of the video sequence; extract further metadata from the DRAM InfoFrame; and apply the further metadata to the video data in the active video interval of the video field or video frame containing the FSTW.
In the following description, reference is made to the accompanying drawings that form a part hereof, and in which is shown by way of illustration specific embodiments which may be practiced. These embodiments are described in sufficient detail to enable those skilled in the art to practice the subject matter, and it is to be understood that other embodiments may be utilized and that structural, logical, and electrical changes may be made without departing from the scope of the present subject matter. The following description of example embodiments is, therefore, not to be taken in a limiting sense, and the scope of the present subject matter is defined by the appended claims.
The functions or algorithms described herein may be implemented in software in one embodiment. The software may consist of computer-executable instructions stored on computer-readable media or computer-readable storage device such as one or more non-transitory memories or other type of hardware based storage devices, either local or networked. Further, such functions correspond to modules, which may be software, hardware, firmware, or any combination thereof. Multiple functions may be performed in one or more modules as desired, and the embodiments described are merely examples. The software may be executed on processing circuitry that may include a single core microprocessor, multi-core microprocessor, digital signal processor, application specific integrated circuit (ASIC), field programmable gate array (FPGA), or other type of data processing circuitry operating on a computer system, such as a personal computer, server or other computer system, turning such computer system into a specifically programmed machine.
In many existing systems, video information originates from a single source such as a digital versatile disk (DVD) player or a television tuner. These sources typically provide video data with a uniform dynamic range and may provide either SDR data or S-HDR data. To display video data from these sources, the HDMI interface provides for S-HDR metadata signaling (e.g., AVI InfoFrames and DRAM InfoFrames) and SDR signaling (e.g., AVI InfoFrames).
S-HDR signaling works well when the video data changes between HDR and SDR infrequently (e.g., when an S-HDR disk is inserted in the DVD player). Increasingly, however, video data is provided in a streaming format in which disparate video segments are stitched together into a single stream. Some segments may be SDR segments while others are HDR segments. As described below with reference to
More recently, different types of HDR video data may be provided in a single scene or for a single frame. For example, in a relatively dark scene, the range of luminance values may be significantly less than the full range of the HDR signal. For example, a 10-bit luminance signal may have values bounded by 0-255, the range of an 8-bit video signal. In this instance, an opto-electric transfer function (OETF) and corresponding electro-optical transfer function (EOTF) may be applied so that the image data in the scene may be mapped into the 10-bit range of the luminance signal, reducing quantization distortion in the reproduced image. These signals are dynamic HDR signals that may use HDR DME InfoFrames to send the EOTF to the sink device.
Because the dynamic HDR video signals having HDR DME may change on a frame-by-frame basis, the HDR DME InfoFrames are processed with frame-synchronous timing to ensure proper display of the HDR video data. The embodiments described below also send AVI InfoFrames and DRAM InfoFrames in a frame-synchronous transmission window (FSTW). The FSTW, which has the same timing as FAPA with location start 0 (FAPA0), starts on the first video blank pixel that immediately follows the last active video pixel of a video frame/field and ends FAPA_end lines prior to the start of the next active region (as described in section 10.10.1.1 of the High-Definition Multimedia Interface Specification Version 2.1). Briefly, FAPA_end may be one-half the number of lines in the VBI or less, depending on the number of lines in the VBI. The FSTW is used by sink devices compatible with dynamic HDR video and has timing that corresponds to the FAPA. Sending the AVI InfoFrames and DRAM InfoFrames as well as HDR DME InfoFrames during the FSTW reduces image distortion that may occur on switching among SDR, S-HDR and dynamic HDR video formats. As used herein, FSTW is identical to FAPA0.
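The extent of the FSTW described above can be illustrated with a minimal Python sketch. It assumes, for simplicity, that the window is measured in whole video lines counted from the start of the VBI, so that a window ending FAPA_end lines before the next active region spans the number of VBI lines minus FAPA_end. The function names and the example FAPA_end value are hypothetical; the actual FAPA_end value is defined by the HDMI 2.1 specification and is not reproduced here.

```python
def fstw_length_lines(vbi_lines, fapa_end):
    """Number of VBI lines covered by the FSTW.

    Assumption: the window starts at the beginning of the VBI and ends
    fapa_end lines before the start of the next active region.
    """
    if not 0 <= fapa_end <= vbi_lines:
        raise ValueError("fapa_end must be between 0 and vbi_lines")
    return vbi_lines - fapa_end


def in_fstw(lines_since_vbi_start, vbi_lines, fapa_end):
    """Return True if a position, given in lines since the VBI began,
    falls inside the FSTW."""
    return 0 <= lines_since_vbi_start < fstw_length_lines(vbi_lines, fapa_end)


# Illustrative numbers only: a 45-line VBI with a hypothetical
# FAPA_end of 22 lines gives a 23-line window.
EXAMPLE_VBI_LINES = 45
EXAMPLE_FAPA_END = 22
```

With these illustrative numbers, an InfoFrame placed on line 0 or line 22 of the VBI falls inside the window, while one placed on line 23 falls outside it.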
The TMDS channels 120, 122, and 124 allow the source device to transmit video and audio data 154, 156 to the sink device at rates up to 6 gigabits per second (Gbps) using differential signals synchronized by the clock signal transmitted through the TMDS clock channel 126. The audio data 156 may be encoded in data islands, described below, that are transmitted in the vertical and horizontal blanking intervals of the transmitted video data 154.
The DDC 128 is a serial channel that includes a serial data (SDA) conductor (not separately shown) and a serial clock (SCL) conductor (not separately shown). The DDC 128 is used to send/receive control data between the sending unit 110 and the receiving unit 150. For example, the sending unit 110 may use the DDC 128 to read enhanced extended display identification data (E-EDID), such as a vendor-specific data block (VSDB) from the receiving unit 150. For this operation, the receiving unit 150 may include a read only memory (ROM) (not shown) that stores the E-EDID of the HDMI receiving unit 150.
The sending unit 110 uses the HPD line to sense that the sink device is coupled to the cable 140 and is powered on. Responsive to the HPD line having a positive DC bias potential, the sending unit 110 reads the E-EDID data via the DDC 128 to determine the capabilities of the receiving unit 150. The CEC channel 130 allows users to control devices connected by the HDMI cable 140 using a single remote control device (not shown). As described below, the E-EDID may include information about the HDR capabilities of the sink device, for example, whether the sink device supports S-HDR and/or dynamic HDR.
The processor 202 controls the operation of other components of the HDMI source device 200. The memory 204 holds data and instructions for the processor 202. The processor 202 may operate the display controller 206 to control a display panel (not shown) used to control the operation of the HDMI source device 200. The display controller 206 may also interface with an input device such as a touchscreen and/or keypad (not shown) to allow a user to input data for controlling the HDMI source device 200. The processor 202 may also control the network interface 208 to allow the source device 200 to access media content from a network (e.g., the Internet) via a browser or a video streaming application. As described above, this media content may be streaming video including SDR segments, S-HDR segments, and/or dynamic HDR segments. The communication interface 212 of the HDMI sending unit 210 is controlled by the processor 202 to communicate with the sink device (described below with reference to
In the example source device 200, compressed video and audio data from the DVD interface 220 and/or the network interface 208 are provided to the audio video decoder 214. The decoder 214 may include a motion picture experts group (MPEG) decoder such as an H.222/H.262 (MPEG2), H.264 advanced video coding (AVC), and/or H.265 high efficiency video coding (HEVC) decoder. The decoder 214 generates baseband video and audio data from the encoded data provided by the network interface 208, DVD interface 220, or provided directly to the AV decoder 214 as indicated in
When the encoded video stream includes high dynamic range video data, the audio/video decoder 214 extracts the HDR metadata (e.g., DRAM and/or HDR DME) from the encoded video data and provides it to the HDMI sending unit 210 to be included in data islands to be transmitted inside or outside of frame synchronous transmission windows (FSTWs) of the video data sent to the HDMI receiving unit. For video data provided directly to the audio video decoder 214, any associated HDR metadata may be provided to the metadata acquisition circuitry 218. This metadata may be provided to the InfoFrame processing circuitry 216 to be included in the data islands transmitted by the HDMI transmitter 211.
If the sink device 300 (
The processor 302 controls the operation of other components of the HDMI sink device 300. The memory 304 holds data and instructions for the processor 302. The processor 302 may operate the display controller 306 to control a display panel (not shown) used to control the operation of the HDMI sink device 300. The controller 306 may also interface with an input device such as a touchscreen and/or keypad (not shown) to allow a user to input data for controlling the HDMI sink device 300. The sink device 300 receives audio and video data via the TMDS channels 120, 122, 124 and 126 or FRL lanes 166, 168, 170 and 172, described above with reference to
The HDMI receiving unit 310 extracts audio data from the data islands in the horizontal and vertical blanking intervals of the video signal outside of the FSTW and provides the audio data to the audio processing circuitry 318. The audio data generated by the audio processing circuitry 318 and the video data generated by the video processing circuitry 316 are provided to a presentation device including a monitor (not shown) and a sound system (not shown).
Each of the memories 204 and 304 may include volatile memory and/or non-volatile memory. The non-volatile memory may include removable storage and non-removable storage. Computer storage includes random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM) and electrically erasable programmable read-only memory (EEPROM), flash memory or other memory technologies, compact disc read-only memory (CD ROM), Digital Versatile Disks (DVD) or other optical disk storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other medium capable of storing computer-readable instructions.
The various processing devices and circuits shown in
As described below with reference to
The communication interface 312 of the HDMI receiving unit 310 is controlled by the processor 302 to communicate with the source device 200 via the DDC/SDA/SCL channel of the HDMI interface. The processor 302 uses this interface to receive commands and data from, and to transmit commands and data to, the source device 200 via the communication interface 312. For example, the sink device 300 may provide to the source device 200 information (e.g., a vendor-specific data block (VSDB)) indicating the capabilities of the sink device 300. Similarly, the sink device 300 may obtain information about the source device 200 via the DDC/SDA/SCL channel of the HDMI interface.
In sink devices that support frame-synchronous processing, control information in the HDR DME is applied to the immediately following active video data so that dynamic HDR video data in the active video area 420 is properly displayed. Sink devices supporting frame-synchronous processing identify the HDR DME and copy the metadata to appropriate control registers and memory elements in the sink device 300. This may include, for example, copying EOTF data to implement a particular EOTF to be used for displaying the dynamic HDR video data or configuring the sink device 300 to handle the pixel depth (e.g., the number of bits in each pixel) or a particular color space configuration indicated by the HDR DME.
The example sink device 300 includes a vendor-specific data block (VSDB) (not shown), for example in the E-EDID, containing information on the capabilities of the sink device 300. The VSDB may indicate that the sink device 300 supports only SDR video data; SDR and S-HDR video data; or SDR, S-HDR, and dynamic HDR video data. As described above, when the sink device 300 does not support either dynamic HDR video data or S-HDR data, the source device may convert the dynamic HDR data to S-HDR data compatible with the AVI InfoFrames, and may convert the S-HDR data to SDR data compatible with the AVI InfoFrames before sending the converted video data to the sink device 300. The example embodiments send the AVI InfoFrames, DRAM InfoFrames, and HDR DME InfoFrames during the region of the vertical blanking interval beginning at the first blank pixel that immediately follows the last active video pixel of a video frame/field and ending FAPA_end lines prior to the start of the next active region. This region corresponds to the FSTW 414 described above.
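The capability-based conversion described above can be sketched in Python as follows. This is a sketch under stated assumptions rather than the described implementation: the format labels and the `select_output_format` helper are hypothetical, and the choice to prefer S-HDR over SDR when converting dynamic HDR content is an assumption, since the text permits conversion to either.

```python
SDR = "SDR"
S_HDR = "S-HDR"
DYNAMIC_HDR = "dynamic HDR"


def select_output_format(content_format, sink_supported):
    """Pick the format the source sends, given the formats the sink
    reports as supported (for example, via its VSDB).

    Assumption: dynamic HDR content is converted to S-HDR when the sink
    supports S-HDR, and to SDR otherwise.
    """
    if content_format in sink_supported:
        return content_format  # no conversion needed
    if content_format == DYNAMIC_HDR and S_HDR in sink_supported:
        return S_HDR
    return SDR  # every sink in the text supports at least SDR
```

For a sink that reports only SDR and S-HDR support, dynamic HDR content would be sent as S-HDR under this sketch, while a sink that reports only SDR support would receive SDR in every case.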
Metadata for the SDR and S-HDR video data is contained in AVI InfoFrames. Although
The metadata for the S-HDR video sequence 604 is contained in an AVI InfoFrame 614 and in DRAM InfoFrame 622, which are transmitted by the source device 200 at field/frame time T100 but do not become active until field/frame time T101. Similarly, at field/frame time T500, the source device 200 sends the second S-HDR metadata in AVI InfoFrame 618 and DRAM InfoFrame 624. The metadata in these two InfoFrames 618, 624 becomes active at field/frame time T501 and remains active until time T601, when the metadata in the AVI InfoFrame 620 for the third SDR sequence 610 becomes active.
In
As shown in
The metadata for the SDR video is contained in AVI InfoFrames. The AVI InfoFrame 716 containing metadata for the first SDR video sequence 702 is received in data islands during the non-FAPA area of the VBI or during the HBI of the field/frame starting at field/frame time T0. As shown in
At field/frame time T200, the sink receives the first dynamic HDR video sequence 706 and accompanying metadata including AVI InfoFrame 720 and HDR DME 734. The AVI InfoFrame 720 is received outside of the FAPA interval of the VBI while the HDR DME is received during the FAPA interval (FAPA0 or FAPA1) of the VBI. As shown in
At field/frame time T300, the sink receives the second SDR video sequence 708 and the AVI InfoFrame 722 containing the metadata for the second SDR sequence 708. Because the AVI InfoFrame 722 is received outside of the FAPA area of the VBI, it does not become active until field/frame time T301 and remains active until field/frame time T400.
At field/frame time T400, the sink receives the second dynamic HDR video sequence 710 and accompanying metadata including AVI InfoFrame 724 and HDR DME 738. As shown in
The sink receives the second S-HDR video sequence 712 and accompanying metadata at field/frame time T500. The S-HDR metadata includes AVI InfoFrame 726 and DRAM InfoFrame 740. Both of these frames are received outside of the FAPA area of the VBI and, thus, do not become active until field/frame time T501. The metadata for the second S-HDR video sequence 712 remains active between field/frame times T501 and T601.
At field/frame time T600, the sink receives the third SDR video sequence 714 and its accompanying metadata, AVI InfoFrame 728. Because the AVI InfoFrame 728 is received outside of the FAPA area, it does not become active until field/frame time T601.
The actual flow includes several instances of mismatch between the displayed video data and the dynamic range metadata used to process the video data. For example, the display begins with the displayed SDR video sequence 742, which ends at field/frame time T100 and is followed by a mismatch interval 744 between field/frame times T100 and T101. This mismatch occurs because the first S-HDR video sequence 704 is processed using the SDR metadata: the metadata in the AVI InfoFrame 718 and DRAM InfoFrame 730 for the S-HDR video sequence 704 is not transferred to the InfoFrame processing circuitry 314 (e.g., does not become active) until field/frame time T101. From field/frame time T101 to T200, the S-HDR video data 746 is properly displayed using the first S-HDR metadata. Even though the DRAM InfoFrame 730 metadata is active until field/frame time T201, there is no mismatch at the transition beginning at field/frame time T200 because the HDR DME 734 metadata overrides the DRAM InfoFrame 730 metadata. Because it is received during the FAPA0 interval, the first HDR DME 734 metadata is processed in a frame-synchronous manner and is transferred to the InfoFrame processing circuitry 314 so that the metadata may be passed to the video processing circuitry 316 in time to process the video data at field/frame time T200. The displayed dynamic HDR video sequence 748 continues to field/frame time T300, at which there is another mismatch 750. At field/frame time T300, the first HDR DME metadata 734 is no longer active; however, the second SDR metadata has not yet become active. The mismatch 750 occurs because the SDR video information in the field/frame starting at time T300 is processed using the AVI InfoFrame 720 metadata. Once the SDR metadata in AVI InfoFrame 722 becomes active at field/frame time T301, the system properly displays the SDR video data 752 until field/frame time T400.
At T400, again due to the frame-synchronous processing, the system properly displays the dynamic HDR video data 754 using the second HDR DME 738 metadata and AVI InfoFrame 724. A mismatch 756 occurs, however, in the field/frame starting at T500 because the second S-HDR metadata in DRAM InfoFrame 740 has not become active, so that the corresponding S-HDR video data is processed using the metadata in the AVI InfoFrame 724 for the second dynamic HDR video sequence. Once the metadata in the AVI InfoFrame 726 and the DRAM InfoFrame 740 becomes active at T501, the second S-HDR video data 758 is displayed properly. The actual flow continues at field/frame time T600 with another mismatch 760, when the third SDR video sequence 714 is processed using the second S-HDR metadata contained in the InfoFrames 726 and 740. The SDR video data 762 displays properly after field/frame time T601.
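The mismatch intervals described above can be reproduced with a small timing model. The following Python sketch is illustrative only and is not part of the disclosed apparatus: the names TRANSITIONS, kind_at, and mismatched_frames, and the label "D-HDR" for dynamic HDR, are assumptions introduced for this example. The model assumes that HDR DME metadata received in the FAPA interval is applied to the same field/frame, while AVI and DRAM InfoFrame metadata is applied one field/frame late unless frame-synchronous processing is used.

```python
# Field/frame times at which each sequence starts, following the seven
# sequences in the walk-through above. "D-HDR" is shorthand (assumed here)
# for a dynamic HDR sequence.
TRANSITIONS = [(0, "SDR"), (100, "S-HDR"), (200, "D-HDR"), (300, "SDR"),
               (400, "D-HDR"), (500, "S-HDR"), (600, "SDR")]


def kind_at(frame, delay=0):
    """Kind of the latest sequence that started at or before frame - delay."""
    current = TRANSITIONS[0][1]  # assume the first sequence's metadata is preloaded
    for start, kind in TRANSITIONS:
        if start <= frame - delay:
            current = kind
    return current


def mismatched_frames(frame_synchronous, last_frame=700):
    """List the field/frame times whose video is processed with the wrong metadata."""
    bad = []
    for frame in range(last_frame):
        content = kind_at(frame)
        if content == "D-HDR":
            # HDR DME arrives in the FAPA interval and overrides other metadata,
            # so it applies to the same field/frame in both modes.
            active = "D-HDR"
        else:
            # AVI/DRAM metadata: one field/frame late unless frame-synchronous.
            delay = 0 if frame_synchronous else 1
            active = kind_at(frame, delay)
        if active != content:
            bad.append(frame)
    return bad


legacy = mismatched_frames(frame_synchronous=False)
synchronous = mismatched_frames(frame_synchronous=True)
```

Under these assumptions, the delayed case yields mismatches only at the field/frame times T100, T300, T500, and T600, which correspond to the mismatch intervals 744, 750, 756, and 760, and the frame-synchronous case yields none.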
Although the examples in
The visual artifacts that occur on switching to SDR or S-HDR from dynamic HDR may be more noticeable than those which occur on switching between SDR and S-HDR because, due to the dynamic nature of dynamic HDR metadata, the changes may be less predictable, unlike legacy HDMI in which the changes are ‘static’ or ‘pseudo-static.’ The HDMI 2.1 Specification implements frame-accuracy for switching on HDR DME processing but not for switching off HDR DME processing. The visual artifacts experienced during the mismatch intervals may include reduced contrast, for mismatch interval 744, when S-HDR video is incorrectly interpreted as SDR video, or incorrect dimming with missing shadow details for mismatch 760, when SDR video is incorrectly interpreted as S-HDR video. The artifacts may also include incorrect color. The occurrence of these artifacts may be increased in systems operating according to the HDMI 2.1 standard due to the addition of dynamic HDR sequences, since the dynamic HDR sequences may be stitched with S-HDR or SDR in a linear stream before delivery, resulting in more frequent and more visible artifacts.
As shown in
In the example shown in
To minimize visual artifacts in sinks that do not support frame-accuracy, the source device 200 sends the data to the sink device 300 according to the legacy HDMI standards so that all video packets are accurately processed within a set amount of time, for example, one to four field/frame times after each video transition.
When, at block 904, the source device 200 determines that the sink device 300 can process dynamic HDR video sequences, the source device 200, at block 908, formats the video data so that all of the metadata in the AVI InfoFrames, DRAM InfoFrames, and HDR DME InfoFrames is sent during the FSTW.
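The source-side choice between legacy transmission and transmission during the FSTW can be sketched as follows. This Python example is a minimal sketch under stated assumptions, not the claimed implementation: VideoTiming, in_fstw, and select_transmission_window are hypothetical names, the parameter sink_supports_frame_synchronous stands in for the determination made at block 904, and the window length of two lines is an arbitrary illustrative value. The sketch also assumes that the start of the VBI coincides with the first blank pixel after the last active pixel, so that the FSTW spans a whole number of line periods from that pixel.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VideoTiming:
    active_lines: int    # active video lines per field/frame
    active_pixels: int   # active pixels per line
    total_pixels: int    # pixels per line, including horizontal blanking
    fstw_lines: int      # hypothetical "predetermined number of video lines"


def in_fstw(timing: VideoTiming, line: int, pixel: int) -> bool:
    """True if (line, pixel) falls inside the FSTW.

    Lines and pixels are counted from zero, starting at the first active
    pixel of the preceding field/frame (an assumption of this sketch).
    """
    offset = line * timing.total_pixels + pixel
    # First blank pixel immediately after the last active pixel.
    start = (timing.active_lines - 1) * timing.total_pixels + timing.active_pixels
    # The window closes a fixed number of line periods after it opens.
    end = start + timing.fstw_lines * timing.total_pixels
    return start <= offset < end


def select_transmission_window(sink_supports_frame_synchronous: bool) -> str:
    """Choose where the source places AVI, DRAM, and HDR DME metadata."""
    return "FSTW" if sink_supports_frame_synchronous else "legacy"


# 1920x1080 progressive timing with 2200 pixel periods per line.
timing = VideoTiming(active_lines=1080, active_pixels=1920,
                     total_pixels=2200, fstw_lines=2)
```

With this timing, the last active pixel (line 1079, pixel 1919) is outside the window, and the pixel that follows it is the first position at which an InfoFrame could be sent in the FSTW.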
The metadata describes how the video data sent during the active video interval is to be displayed. For example, the metadata may include: information on color remapping; a color volume transform to be applied; maximum, minimum, and average luminance values in a scene and target maximum, minimum, and average luminance values; data describing a transfer function (e.g., an EOTF) to be applied to the luminance data; and/or data specific to an application running on the source device. The content and format of the metadata are described in a standard issued by the Consumer Technology Association™, entitled A DTV Profile for Uncompressed High Speed Digital Interfaces CTA-861-G (November 2016).
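As an illustration of the kinds of metadata listed above, the following Python sketch groups them into a single record. It is an assumption made for explanatory purposes and does not reproduce the InfoFrame layouts of CTA-861-G: the class names, field names, the use of cd/m2 as the luminance unit, and the example values are all hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class LuminanceStats:
    maximum: float  # assumed unit: cd/m2 (nits)
    minimum: float
    average: float


@dataclass
class DynamicRangeMetadata:
    transfer_function: str                              # e.g., the name of an EOTF
    scene_luminance: Optional[LuminanceStats] = None    # measured in the scene
    target_luminance: Optional[LuminanceStats] = None   # intended for the display
    color_remapping: Optional[bytes] = None             # opaque payload in this sketch
    color_volume_transform: Optional[bytes] = None      # opaque payload in this sketch
    application_data: Optional[bytes] = None            # source-application specific


# Example record with illustrative values only.
example = DynamicRangeMetadata(
    transfer_function="PQ",
    scene_luminance=LuminanceStats(maximum=1000.0, minimum=0.005, average=120.0),
)
```

Fields that a given sequence does not use are left unset, which mirrors the "and/or" wording of the list above.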
With reference to
As described above, with reference to
As shown in
Although the examples described above concern metadata transitions related to the changing dynamic range of the video signals, it is contemplated that other metadata transitions in video or audio signals may be implemented as frame-synchronous transitions. For example, object-oriented audio and video data such as may be used in virtual-reality and augmented-reality applications may be transmitted through the HDMI interface. In this instance, frame-synchronous processing may be desirable to coordinate the video and audio data to motions and/or gestures of the user.
Claims
1. An apparatus for use in a source device for transmitting data using a high-definition media interface (HDMI), the apparatus comprising:
- an HDMI interface for transmitting data to and receiving data from a sink device;
- a memory holding executable code; and
- a processor, coupled to the memory and to the HDMI interface, the processor configured by the executable code to: receive video data and metadata for transmission to the sink device; encode the metadata in an auxiliary video information (AVI) information frame (InfoFrame); and transmit the AVI InfoFrame during a frame-synchronous transmission window (FSTW) of the video data, wherein the FSTW begins during a video blanking interval (VBI) of the video data, on a first video blank pixel that immediately follows a last active video pixel of a preceding video frame or video field and ends a predetermined number of video lines after a start of the VBI.
2. The apparatus of claim 1, wherein the received video data includes standard dynamic range (SDR) video data and the metadata is metadata for the SDR video data.
3. The apparatus of claim 1, wherein the received metadata includes metadata for a static high dynamic range (S-HDR) video sequence and the processor is configured by the executable code to:
- encode the metadata for the S-HDR video sequence in the AVI InfoFrame and in a DRange and Mastering (DRAM) InfoFrame; and
- transmit the AVI InfoFrame and the DRAM InfoFrame during the FSTW.
4. The apparatus of claim 1, wherein the received metadata includes metadata for a dynamic high dynamic range (HDR) video sequence and the processor is configured by the executable code to:
- encode the metadata for the dynamic HDR video sequence in the AVI InfoFrame and in a HDR dynamic metadata extended (HDR DME) InfoFrame; and
- transmit the AVI InfoFrame and the HDR DME InfoFrame during the FSTW.
5. An apparatus for use in a sink device for receiving data using a high-definition media interface (HDMI), the apparatus comprising:
- an HDMI interface for receiving data from a source device;
- a memory holding executable code; and
- a processor, coupled to the memory and to the HDMI interface, the processor configured by the executable code to: receive a video sequence from the source device, the video sequence including a plurality of video fields or video frames, each video field or video frame including an active video interval and a vertical blanking interval (VBI); extract an auxiliary video information (AVI) information frame (InfoFrame) including first metadata for the video sequence from a frame-synchronous transmission window (FSTW) of the VBI of at least one of the fields or frames of the video sequence, wherein the FSTW begins during the VBI on a first video blank pixel that immediately follows a last active video pixel of a preceding video field or video frame and ends a predetermined number of video lines after a start of the VBI; extract the first metadata from the AVI InfoFrame; and apply the extracted first metadata to video data in the active video interval of the video field or video frame containing the FSTW.
6. The apparatus of claim 5, wherein the received video sequence includes a static high dynamic range (S-HDR) video sequence and the processor is configured by the executable code to:
- extract a DRange and Mastering (DRAM) InfoFrame from the FSTW of the VBI of the at least one field or frame of the video sequence;
- extract second metadata from the DRAM InfoFrame; and
- apply the extracted first metadata and the second metadata to the video data in the active video interval of the video field or video frame containing the FSTW.
7. The apparatus of claim 5, wherein the received video sequence includes a dynamic high dynamic range (HDR) video sequence and the processor is configured by the executable code to:
- extract an HDR dynamic metadata extended (HDR DME) InfoFrame from the FSTW of the VBI of the at least one field or frame of the video sequence;
- extract second metadata from the HDR DME InfoFrame; and
- apply the extracted first metadata and the second metadata to the video data in the active video interval of the video field or video frame containing the FSTW.
8. A method for transmitting data from a source device to a sink device using a high-definition media interface (HDMI), the method comprising:
- receiving video data and metadata for transmission to the sink device;
- encoding the metadata in an auxiliary video information (AVI) InfoFrame; and
- transmitting the AVI InfoFrame during a frame-synchronous transmission window (FSTW) of the video data, wherein the FSTW begins during a video blanking interval (VBI) of the video data, on a first video blank pixel that immediately follows a last active video pixel of a preceding video frame or video field and ends a predetermined number of video lines after a start of the VBI.
9. The method of claim 8, wherein receiving the video data and metadata includes receiving standard dynamic range (SDR) video data and the metadata is metadata for the SDR video data.
10. The method of claim 8, wherein:
- receiving the video data and metadata includes receiving a static high dynamic range (S-HDR) video sequence and metadata for the S-HDR video sequence;
- encoding the metadata includes encoding the metadata for the S-HDR video sequence in the AVI InfoFrame and in a DRange and Mastering (DRAM) InfoFrame; and
- transmitting the AVI InfoFrame includes transmitting the AVI InfoFrame and the DRAM InfoFrame during the FSTW.
11. The method of claim 8, wherein:
- receiving the video data and metadata includes receiving a dynamic high dynamic range (HDR) video sequence and metadata for the dynamic HDR video sequence;
- encoding the metadata includes encoding the metadata for the dynamic HDR video sequence in the AVI InfoFrame and in a HDR dynamic metadata extended (HDR DME) InfoFrame; and
- transmitting the AVI InfoFrame includes transmitting the AVI InfoFrame and the HDR DME InfoFrame during the FSTW.
12. A method for receiving data from a source device using a high-definition media interface (HDMI), the method comprising:
- receiving a video sequence from the source device, the video sequence including a plurality of video fields or video frames, each video field or video frame including an active video interval and a vertical blanking interval (VBI);
- extracting an auxiliary video information (AVI) information frame (InfoFrame) including first metadata for the video sequence from a frame synchronous transmission window (FSTW) of the VBI of at least one of the fields or frames of the video sequence, wherein the FSTW begins during the VBI on a first video blank pixel that immediately follows a last active video pixel of a preceding video field or video frame and ends a predetermined number of video lines after a start of the VBI;
- extracting the first metadata from the AVI InfoFrame; and
- applying the extracted first metadata to video data in the active video interval of the video field or video frame containing the FSTW.
13. The method of claim 12, wherein:
- receiving the video sequence includes receiving a static high dynamic range (S-HDR) video sequence and the method further comprises: extracting a DRange and Mastering (DRAM) InfoFrame from the FSTW of the VBI of the at least one field or frame of the video sequence; extracting second metadata from the DRAM InfoFrame; and applying the second metadata to S-HDR video data in the active video interval of the video field or video frame containing the FSTW.
14. The method of claim 12, wherein:
- receiving the video sequence includes receiving a dynamic high dynamic range (HDR) video sequence and the method further comprises: extracting an HDR dynamic metadata extended (HDR DME) InfoFrame from the FSTW of the VBI of the at least one field or frame of the video sequence; extracting second metadata from the HDR DME InfoFrame; and applying the second metadata to dynamic HDR video data in the active video interval of the video field or video frame containing the FSTW.
15. A computer-readable medium including program instructions for execution by a processor to configure the processor to transmit data from a source device to a sink device using a high-definition media interface (HDMI), the program instructions configuring the processor to:
- receive video data and metadata for transmission to the sink device;
- encode the metadata in an auxiliary video information (AVI) information frame (InfoFrame); and
- configure the AVI InfoFrame for transmission during a frame synchronous transmission window (FSTW) of the video data, wherein the FSTW begins during a video blanking interval (VBI) of the video data, on a first video blank pixel that immediately follows a last active video pixel of a preceding video frame or video field and ends a predetermined number of video lines after a start of the VBI.
16. The computer-readable medium of claim 15, wherein the program instructions configure the processor to:
- receive standard dynamic range (SDR) video data and metadata for the SDR video data; and
- encode the metadata for the SDR video data in the AVI InfoFrame.
17. The computer-readable medium of claim 15, wherein the program instructions configure the processor to:
- receive, as the video data and metadata, a static high dynamic range (S-HDR) video sequence and metadata for the S-HDR video sequence;
- encode the metadata for the S-HDR video sequence in the AVI InfoFrame and in a DRange and Mastering (DRAM) InfoFrame; and
- configure the AVI InfoFrame and the DRAM InfoFrame for transmission during the FSTW.
18. The computer-readable medium of claim 15, wherein the program instructions configure the processor to:
- receive, as the video data and metadata, a dynamic high dynamic range (HDR) video sequence and metadata for the dynamic HDR video sequence;
- encode the metadata for the dynamic HDR video sequence in the AVI InfoFrame and in a HDR dynamic metadata extended (HDR DME) InfoFrame; and
- configure the AVI InfoFrame and the HDR DME InfoFrame for transmission during the FSTW.
19. A computer-readable medium including program instructions for execution by a processor to configure the processor in a sink device to receive data from a source device using a high-definition media interface (HDMI), the program instructions configuring the processor to:
- receive a video sequence from the source device, the video sequence including a plurality of video fields or video frames, each video field or video frame including an active video interval and a vertical blanking interval (VBI);
- extract an auxiliary video information (AVI) information frame (InfoFrame) including first metadata for the video sequence from a frame synchronous transmission window (FSTW) of the VBI of at least one of the fields or frames of the video sequence, wherein the FSTW begins during the VBI on a first video blank pixel that immediately follows a last active video pixel of a preceding video field or video frame and ends a predetermined number of video lines after a start of the VBI;
- extract the first metadata from the AVI InfoFrame; and
- apply the extracted first metadata to video data in the active video interval of the video field or video frame containing the FSTW.
20. The computer-readable medium of claim 19, wherein the program instructions configure the processor to:
- receive, as the video sequence, a static high dynamic range (S-HDR) video sequence; extract a DRange and Mastering (DRAM) InfoFrame from the FSTW of the VBI of the at least one field or frame of the video sequence; extract second metadata from the DRAM InfoFrame; and apply the second metadata to the video data in the active video interval of the video field or video frame containing the FSTW.
Type: Application
Filed: May 17, 2018
Publication Date: Nov 21, 2019
Inventors: Jiong Huang (San Jose, CA), Laurence A. Thompson (Morgan Hill, CA), Le Yuan (Shenzhen), Hua Long (Shenzhen), Yong Su (Shenzhen), Zhigui Wei (Shenzhen), Feng Wang (Shenzhen)
Application Number: 15/982,838