INTERPOLATION OF THREE-DIMENSIONAL VIDEO CONTENT

- BROADCOM CORPORATION

Techniques are described herein for interpolating three-dimensional video content. Three-dimensional video content is video content that includes portions representing respective frame sequences that provide respective perspective views of a given subject matter over the same period of time. For example, the three-dimensional video content may be analyzed to identify one or more interpolation opportunities. If an interpolation opportunity is identified, frame data that is associated with the interpolation opportunity may be replaced with an interpolation marker. In another example, a frame that is not directly represented by data in the three-dimensional video content may be identified. For instance, the frame may be represented by an interpolation marker or corrupted data. The interpolation marker or corrupted data may be replaced with an interpolated representation of the frame.

Description
CROSS-REFERENCE TO RELATED APPLICATIONS

This application claims the benefit of U.S. Provisional Application No. 61/291,818, filed Dec. 31, 2009, which is incorporated by reference herein in its entirety. This application also claims the benefit of U.S. Provisional Application No. 61/303,119, filed Feb. 10, 2010, which is incorporated by reference herein in its entirety.

This application is also related to the following U.S. Patent Applications, each of which also claims the benefit of U.S. Provisional Patent Application Nos. 61/291,818 and 61/303,119 and each of which is incorporated by reference herein:

U.S. patent application Ser. No. 12/845,409, filed on Jul. 28, 2010, and entitled “Display with Adaptable Parallax Barrier”;

U.S. patent application Ser. No. 12/845,440, filed on Jul. 28, 2010, and entitled “Adaptable Parallax Barrier Supporting Mixed 2D and Stereoscopic 3D Display Regions”;

U.S. patent application Ser. No. 12/845,461, filed on Jul. 28, 2010, and entitled “Display Supporting Multiple Simultaneous 3D Views”; and

U.S. patent application Ser. No. ______ (Attorney Docket No. A05.01330000), filed on even date herewith and entitled “Hierarchical Video Compression Supporting Selective Delivery of Two-Dimensional and Three-Dimensional Video Content.”

BACKGROUND OF THE INVENTION

1. Field of the Invention

The present invention relates to techniques for processing video images.

2. Background Art

Images may be transmitted for display in various forms. For instance, television (TV) is a widely used telecommunication medium for transmitting and displaying images in monochromatic (“black and white”) or color form. Conventionally, images are provided in analog form and are displayed by display devices in the form of two-dimensional images. More recently, images are being provided in digital form for display in two dimensions on display devices having improved resolution. Even more recently, images capable of being displayed in three dimensions are being provided.

Conventional displays may use a variety of techniques to achieve three-dimensional image viewing functionality. For example, various types of glasses have been developed that may be worn by users to view three-dimensional images displayed by a conventional display. Examples of such glasses include glasses that utilize color filters or polarized filters. In each case, the lenses of the glasses pass two-dimensional images of differing perspective to the user's left and right eyes. The images are combined in the visual center of the brain of the user to be perceived as a three-dimensional image. In another example, synchronized left eye, right eye LCD (liquid crystal display) shutter glasses may be used with conventional two-dimensional displays to create a three-dimensional viewing illusion. In still another example, LCD display glasses may be used to display three-dimensional images to a user. The lenses of the LCD display glasses include corresponding displays that provide images of differing perspective to the user's eyes, to be perceived by the user as three-dimensional.

Some displays are configured for viewing three-dimensional images without the user having to wear special glasses, such as by using techniques of autostereoscopy. For example, a display may include a parallax barrier that has a layer of material with a series of precision slits. The parallax barrier is placed proximal to a display so that a user's eyes each see a different set of pixels to create a sense of depth through parallax. Another type of display for viewing three-dimensional images is one that includes a lenticular lens. A lenticular lens includes an array of magnifying lenses configured so that when viewed from slightly different angles, different images are magnified. Displays are being developed that use lenticular lenses to enable autostereoscopic images to be generated.

Each technique for achieving three-dimensional image viewing functionality involves transmitting three-dimensional video content to a display device, so that the display device can display three-dimensional images that are represented by the three-dimensional video content to a user. A variety of issues may arise with respect to such transmission. For example, errors that occur during the transmission may cause frame data in the video content to become corrupted. In another example, a source of the video content and/or the channels through which the video content is transferred may become temporarily unable to handle a load that is imposed by the video content. In yet another example, the display device may be capable of processing frame data of a greater number of perspectives than the source is capable of providing.

BRIEF SUMMARY OF THE INVENTION

Methods, systems, and apparatuses are described for interpolating three-dimensional video content as shown in and/or described herein in connection with at least one of the figures, as set forth more completely in the claims.

BRIEF DESCRIPTION OF THE DRAWINGS/FIGURES

The accompanying drawings, which are incorporated herein and form part of the specification, illustrate embodiments of the present invention and, together with the description, further serve to explain the principles involved and to enable a person skilled in the relevant art(s) to make and use the disclosed technologies.

FIG. 1 is a block diagram of an exemplary system for generating three-dimensional video content that may be encoded in accordance with an embodiment.

FIG. 2 is a block diagram of an exemplary display system according to an embodiment.

FIG. 3 depicts an exemplary implementation of an encoding system shown in FIG. 2 in accordance with an embodiment.

FIGS. 4-9 show flowcharts of exemplary methods for encoding portions of three-dimensional video content for subsequent interpolation according to embodiments.

FIG. 10 depicts an exemplary implementation of a decoding system shown in FIG. 2 in accordance with an embodiment.

FIGS. 11-16 show flowcharts of exemplary methods for decoding portions of encoded three-dimensional video content using interpolation according to embodiments.

FIGS. 17-20 illustrate exemplary interpolation techniques according to embodiments.

FIG. 21 is a block diagram of an exemplary electronic device according to an embodiment.

The features and advantages of the disclosed technologies will become more apparent from the detailed description set forth below when taken in conjunction with the drawings, in which like reference characters identify corresponding elements throughout. In the drawings, like reference numbers generally indicate identical, functionally similar, and/or structurally similar elements. The drawing in which an element first appears is indicated by the leftmost digit(s) in the corresponding reference number.

DETAILED DESCRIPTION OF THE INVENTION

I. Introduction

The following detailed description refers to the accompanying drawings that illustrate exemplary embodiments of the present invention. However, the scope of the present invention is not limited to these embodiments, but is instead defined by the appended claims. Thus, embodiments beyond those shown in the accompanying drawings, such as modified versions of the illustrated embodiments, may nevertheless be encompassed by the present invention.

References in the specification to “one embodiment,” “an embodiment,” “an example embodiment,” or the like, indicate that the embodiment described may include a particular feature, structure, or characteristic, but every embodiment may not necessarily include the particular feature, structure, or characteristic. Moreover, such phrases are not necessarily referring to the same embodiment. Furthermore, when a particular feature, structure, or characteristic is described in connection with an embodiment, it is submitted that it is within the knowledge of one skilled in the relevant art(s) to implement such feature, structure, or characteristic in connection with other embodiments whether or not explicitly described.

Furthermore, it should be understood that spatial descriptions (e.g., “above,” “below,” “up,” “left,” “right,” “down,” “top,” “bottom,” “vertical,” “horizontal,” etc.) used herein are for purposes of illustration only, and that practical implementations of the structures described herein can be spatially arranged in any orientation or manner.

II. Example Embodiments

Example embodiments relate to interpolation of three-dimensional video content. Three-dimensional video content is video content that includes portions representing respective frame sequences that provide respective perspective views of a given subject matter over the same period of time. In accordance with some embodiments, an upstream device analyzes the three-dimensional video content to identify one or more interpolation opportunities. An interpolation opportunity occurs when a target perspective view that is associated with the three-dimensional video content is between reference perspective views that are associated with the three-dimensional video content. The target perspective view and the reference perspective views are perspective views of a common video event that are provided by respective sequences of frames (alternatively referred to herein as “images” or “pictures”) that are represented by respective portions of the three-dimensional video content.

For example, assume for illustrative purposes that three-dimensional video content includes portions PA, PB, and PC that represent respective perspective views VA, VB, and VC. Further assume that VB is between VA and VC. In accordance with this example, an interpolation opportunity is said to occur for providing an interpolated representation of PB based on PA and PC.

If an interpolation opportunity is identified, frame data that is associated with the interpolation opportunity may be replaced with an interpolation marker. In accordance with the example mentioned above, the upstream device may replace PB with the interpolation marker.

When a downstream device receives the three-dimensional video content, which includes the interpolation marker, the downstream device may replace the interpolation marker with an interpolated representation of the frame data that the interpolation marker replaced. For instance, the downstream device may interpolate between the portions of the three-dimensional video content that represent the sequences of frames that provide the reference perspective views to generate an interpolated representation (a.k.a. an interpolation) of the portion of the three-dimensional video content that represents the sequence of frames that provides the target perspective view. In accordance with the example mentioned above, the downstream device may interpolate between PA and PC to generate an interpolated representation of PB.
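
As a purely illustrative example of such interpolation, the following Python sketch estimates a frame of PB by per-pixel weighted averaging of temporally corresponding frames from PA and PC. The array shapes, the function name, and the simple averaging operation are assumptions made for illustration only and are not required by the embodiments described herein; an actual downstream device may use a more sophisticated view-synthesis technique.

    import numpy as np

    def interpolate_target_view(frame_a, frame_c, weight_a=0.5):
        # Per-pixel weighted average of temporally corresponding frames from the
        # reference views (PA and PC) to estimate the corresponding frame of PB.
        a = frame_a.astype(np.float32)
        c = frame_c.astype(np.float32)
        estimate = weight_a * a + (1.0 - weight_a) * c
        return np.clip(estimate, 0, 255).astype(np.uint8)

    # Hypothetical 1080p RGB frames taken from the two reference portions.
    frame_a = np.random.randint(0, 256, (1080, 1920, 3), dtype=np.uint8)
    frame_c = np.random.randint(0, 256, (1080, 1920, 3), dtype=np.uint8)
    frame_b_estimate = interpolate_target_view(frame_a, frame_c)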

In some embodiments, the downstream device identifies a frame that is not directly represented by data that is included in the three-dimensional video content. For example, the frame may be represented by an interpolation marker. However, in such embodiments, the downstream device may perform an interpolation operation with respect to portions of the three-dimensional video content even in the absence of an interpolation marker. For example, the data may be corrupted. In accordance with this example, the frame may be missing from the data, or a portion of the data that corresponds to the frame may include erroneous data. Accordingly, the interpolation need not necessarily be performed in response to an interpolation marker.

The embodiments described herein have a variety of benefits as compared to conventional techniques for processing video content. For example, the embodiments may increase the likelihood that a source of the video content and/or the channels through which the video content is transferred are capable of handling a load that is imposed by the video content. In another example, the embodiments may be capable of increasing the number of perspectives that are provided by the video content. In yet another example, the embodiments may be capable of correcting corrupted data that is included in the video content based on other data in the video content. For instance, the corrupted data may be corrected on the fly using one or more of the techniques described herein.

The following subsections describe a variety of example embodiments of the present invention. It will be apparent to persons skilled in the relevant art that various changes in form and detail can be made to the embodiments described herein without departing from the spirit and scope of the invention. Thus, the breadth and scope of the present invention should not be limited by any of the example embodiments described herein.

A. Example Display System and Method Embodiments

In accordance with embodiments described herein, three-dimensional video content is represented as a plurality of separate portions (a.k.a. digital video streams). Each portion represents a respective frame sequence that provides a respective perspective view of a video event. This is illustrated by FIG. 1, which is a diagram of an exemplary system 100 for generating three-dimensional video content that may be encoded in accordance with an embodiment. As shown in FIG. 1, system 100 includes a plurality of video cameras 102A-102N that are directed at and operate to record images of the same subject matter 104 from different perspectives over the same period of time. This results in the generation of three-dimensional video content 106, which includes N different portions 108A-108N that provide different perspective views of subject matter 104 over the same period of time.

Of course, techniques other than utilizing video cameras may be used to produce the different portions 108A-108N. For example, one or more of the portions 108A-108N may be created in a manual or automated fashion by digital animators using advanced graphics and animation tools. Additionally, at least one of the portions 108A-108N may be created by using a manual or automated interpolation process that creates a portion based on analysis of at least two of the other portions. For example, with reference to FIG. 1, if camera 102B were absent, a digital video stream corresponding to the perspective view of subject matter 104 provided by that camera could nevertheless be created by performing an interpolation process on the portions of the three-dimensional video content 106 produced by camera 102A and another of the cameras. Still other techniques not described herein may be used to produce one or more of the different digital video streams.

Display systems have been described that can display a single image of certain subject matter to provide a two-dimensional view thereof and that can also display two images of the same subject matter viewed from different perspectives in an integrated manner to provide a three-dimensional view thereof. Such two-dimensional (2D)/three-dimensional (3D) display systems can further display a multiple of two images (e.g., four images, eight images, etc.) of the same subject matter viewed from different perspectives in an integrated manner to simultaneously provide multiple three-dimensional views thereof, wherein the particular three-dimensional view perceived by a viewer is determined based at least in part on the position of the viewer. Examples of such 2D/3D display systems are described in the following commonly-owned, co-pending U.S. Patent Applications: U.S. patent application Ser. No. 12/845,409, filed on Jul. 28, 2010, and entitled “Display with Adaptable Parallax Barrier”; U.S. patent application Ser. No. 12/845,440, filed on Jul. 28, 2010, and entitled “Adaptable Parallax Barrier Supporting Mixed 2D and Stereoscopic 3D Display Regions”; and U.S. patent application Ser. No. 12/845,461, filed on Jul. 28, 2010, and entitled “Display Supporting Multiple Simultaneous 3D Views.” The entirety of each of these applications is incorporated by reference herein.

The portions 108A-108N produced by system 100 can be obtained and provided to a 2D/3D display system as described above in order to facilitate the presentation of a two-dimensional view of subject matter 104, a single three-dimensional view of subject matter 104, or multiple three-dimensional views of subject matter 104.

FIG. 2 is a block diagram of an exemplary display system 200 according to an embodiment. Generally speaking, display system 200 operates to transmit three-dimensional video content, such as three-dimensional video content 106 of FIG. 1, to a display device, so that the display device can display three-dimensional images that are represented by the three-dimensional video content to user(s). According to embodiments, display system 200 interpolates between portions of the three-dimensional video content that correspond to respective perspective views to provide frame data that corresponds to another perspective view. As shown in FIG. 2, display system 200 includes source(s) 202 and a display device 204. Source(s) 202 provide three-dimensional video content 206. Source(s) 202 can include any number of sources, including one, two, three, etc. Each source provides one or more portions of the three-dimensional video content 206. Examples of a source include but are not limited to a computer storage disc (e.g., a digital video disc (DVD) or a Blu-Ray® disc), local storage on a display device, a remote server (i.e., a server that is located remotely from the display device), a gaming system, a satellite, a cable headend, and a point-to-point system.

Some of the portions of the three-dimensional video content 206 may serve as reference portions, while others serve as supplemental portions, though the scope of the embodiments is not limited in this respect. For instance, the supplemental portions may be used to increase the number of perspective views that are included in the three-dimensional video content beyond the number of perspective views that are represented by the reference portions. The reference portions may include 2D data, 3D2 data, 3D4 data, 3D8 data, etc. Supplemental portions may include auto-interpolated 2D-3D2 (single stream) data, manually generated interpolation 3D2 data, A-I 3D4 (3 stream) data, M-G-I 3D4 (3 stream) data, etc.

As shown in FIG. 2, source(s) 202 includes an encoding system 208. Encoding system 208 encodes the three-dimensional video content 206 to provide encoded three-dimensional video content 210. For example, encoding system 208 may replace frame data in the three-dimensional video content 206 with an interpolation marker. The interpolation marker may indicate that interpolation is to be performed between portions of the three-dimensional video content in order to generate an interpolated representation of the frame data that is replaced with the interpolation marker. The interpolation marker may be accompanied by instructions for generating the interpolated representation. It will be recognized, however, that encoding system 208 need not necessarily replace frame data in the three-dimensional video content 206 with an interpolation marker. Regardless, encoding system 208 transmits the encoded three-dimensional video content 210 toward display device 204 via communication channels 212.
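
For illustration only, the following Python sketch shows one hypothetical way that encoding system 208 might represent the replacement of frame data with an interpolation marker and accompanying instructions. The record layout, the field names, and the INTERPOLATION_MARKER value are illustrative assumptions and do not correspond to any defined bitstream syntax.

    from dataclasses import dataclass
    from typing import Optional

    INTERPOLATION_MARKER = b"\x00INTERP"  # hypothetical marker value

    @dataclass
    class EncodedFrame:
        view_id: int          # which perspective view the frame belongs to
        frame_index: int      # position within that view's frame sequence
        payload: bytes        # encoded frame data, or the interpolation marker
        instructions: Optional[dict] = None  # optional interpolation instructions

    def replace_with_marker(frame, reference_views, weights=None):
        # Discard the encoded frame data and substitute the marker, together with
        # optional instructions naming the reference views and their weights.
        frame.payload = INTERPOLATION_MARKER
        frame.instructions = {"reference_views": list(reference_views),
                              "weights": weights}
        return frame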

It will be further recognized that source(s) 202 need not necessarily include encoding system 208. For example, source(s) 202 may store the encoded three-dimensional video content 210, rather than generating the encoded three-dimensional video content 210 based on the three-dimensional video content 206.

Communication channels 212 may include one or more local device pathways, point-to-point links, and/or pathways in a hybrid fiber coaxial (HFC) network, a wide-area network (e.g., the Internet), a local area network (LAN), another type of network, or a combination thereof. Communication channels 212 may support wired transmission media, wireless transmission media, or both, including satellite, terrestrial (e.g., fiber optic, copper, twisted pair, coaxial, or the like), radio, microwave, free-space optics, and/or any other form or method of transmission.

Display device 204 displays images to user(s) upon receipt of the encoded three-dimensional video content 210. Display device 204 may be implemented in various ways. For instance, display device 204 may be a television display (e.g., a liquid crystal display (LCD) television, a plasma television, etc.), a computer monitor, a projection system, or any other type of display device.

Display device 204 includes an interpolation-enabled decoding system 214, display circuitry 216, and a screen 218. Decoding system 214 decodes the encoded three-dimensional video content 210 to provide decoded three-dimensional video content 220. For instance, decoding system 214 may interpolate between portions of a decoded representation of the encoded three-dimensional video content 210 to generate one or more of the portions of the decoded three-dimensional video content 220. In one example, decoding system 214 may interpolate in response to detecting an interpolation marker in the encoded three-dimensional video content 210. In another example, decoding system 214 may interpolate in response to determining that a frame that is included in the decoded representation of the encoded three-dimensional video content 210 is not directly represented by data in the decoded representation. For instance, decoding system 214 may determine that the frame is replaced by an interpolation marker, that the frame is missing from the data, or that a portion of the data that corresponds to the frame includes erroneous data. Interpolation that is performed by decoding system 214 may be incorporated into a decoding process or may be performed after such a decoding process on raw data.
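
The following Python sketch illustrates, under assumed conventions, how decoding system 214 might decide that interpolation is needed for a given frame: the frame data is missing, has been replaced with a marker, or fails an integrity check. The marker value and the CRC-based check are hypothetical; a real decoder would rely on the error signaling of its actual bitstream and transport formats.

    import zlib

    INTERPOLATION_MARKER = b"\x00INTERP"  # hypothetical marker value (see the encoder sketch above)

    def checksum_ok(payload, expected_crc):
        # Illustrative integrity check; a real system would use the CRC/FEC of
        # its transport and container formats rather than a bare CRC-32.
        return zlib.crc32(payload) == expected_crc

    def needs_interpolation(payload, expected_crc=None):
        # A frame is not directly represented by usable data when its payload is
        # missing, was replaced with a marker, or appears to be corrupted.
        if payload is None:
            return True
        if payload == INTERPOLATION_MARKER:
            return True
        if expected_crc is not None and not checksum_ok(payload, expected_crc):
            return True
        return False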

In an embodiment, decoding system 214 maintains synchronization of the portions that are included in the decoded three-dimensional video content 220. For instance, such synchronization may be maintained during inter-reference frame periods, during screen reconfiguration, etc. If decoding system 214 is unable to maintain synchronization with respect to one or more portions of the decoded three-dimensional video content 220, decoding system 214 may perform interpolation to generate interpolated representations of those portion(s) until synchronization is re-established. Decoding system 214 may synchronize 3DN adjustments with reference frame occurrence, where N can be any positive integer greater than or equal to two. A 3DN adjustment may include the addition of frame data corresponding to a perspective view, for example. For each additional perspective that is represented by the decoded three-dimensional video content 220, N is incremented by one.

Display circuitry 216 directs display of one or more of the frame sequences that are represented by the decoded three-dimensional video content 220 toward screen 218, as indicated by arrow 222, for presentation to the user(s). It will be recognized that although display circuitry 216 is labeled as such, the functionality of display circuitry 216 may be implemented in hardware, software, firmware, or any combination thereof.

Screen 218 displays the frame sequence(s) that are received from display circuitry 216 to the user(s). Screen 218 may be any suitable type of screen, including but not limited to an LCD screen, a plasma screen, a light emitting diode (LED) screen (e.g., an OLED (organic LED) screen), etc.

It will be recognized that encoding system 208 may be external to source(s) 202. Moreover, decoding system 214 may be external to display device 204. For instance, encoding system 208 and decoding system 214 may be implemented in a common device, such as a transcoder that is coupled between source(s) 202 and display device 204.

It will be further recognized that feedback may be provided from communication channels 212 and/or display device 204 to any one or more of the source(s) 202. For example, display device 204 may provide feedback to indicate an error that occurs with respect to frame data that is included in encoded three-dimensional video content 210, one or more characteristics that are associated with display device 204, etc. Examples of such characteristics include but are not limited to a load that is associated with display device 204 and a number of perspective views that display device 204 is capable of processing. In another example, channels 212 may provide feedback to indicate an error that occurs with respect to frame data that is included in encoded three-dimensional video content 210, one or more characteristics (e.g., a load) that are associated with the channels 212, etc.

B. Example Encoding Embodiments

FIG. 3 depicts a block diagram of an encoding system 300, which is an exemplary implementation of encoding system 208 of FIG. 2, in accordance with an embodiment. As shown in FIG. 3, encoding system 300 includes input circuitry 302, processing circuitry 304, and output circuitry 306. Input circuitry 302 serves as an input interface for encoding system 300. Processing circuitry 304 receives a plurality of portions 310A-310N of three-dimensional video content 308 through input circuitry 302. Each of the portions 310A-310N represents a respective sequence of frames that provides a respective perspective view of a video event. Processing circuitry 304 encodes the portions 310A-310N to provide encoded portions 314A-314N.

Processing circuitry 304 analyzes at least some of the portions 310A-310N to identify one or more interpolation opportunities. An interpolation opportunity occurs when a target perspective view that is associated with the three-dimensional video content 308 is between reference perspective views that are associated with the three-dimensional video content 308. The target perspective view and the reference perspective views are provided by respective sequences of frames that are represented by respective portions of the three-dimensional video content 308. For each identified interpolation opportunity, processing circuitry 304 replaces frame data that is included in the corresponding portion of the three-dimensional video content 308 with an interpolation marker. For example, if processing circuitry 304 identifies an interpolation opportunity in each of first portion 310A and second portion 310B, processing circuitry 304 replaces frame data that is included in first portion 310A with an interpolation marker and replaces frame data that is included in second portion 310B with another interpolation marker.

Any one or more of the interpolation marker(s) may be accompanied by an interpolation instruction. For instance, a first interpolation instruction that corresponds to a first interpolation marker may specify which of the portions 310A-310N of the three-dimensional video content 308 are to be used for generating an interpolated representation of the frame data that the first interpolation marker replaces. A second interpolation instruction that corresponds to a second interpolation marker may specify which of the portions 310A-310N are to be used for generating an interpolated representation of the frame data that the second interpolation marker replaces, and so on.

Each interpolation marker may specify a type of interpolation to be performed to generate an interpolated representation of the frame data that the interpolation marker replaces. For instance, a first type of interpolation may assign a first weight to a first reference portion of the three-dimensional video content 308 and a second weight that is different from the first weight to a second reference portion of the three-dimensional video content 308 for generating an interpolated representation of frame data. A second type of interpolation may assign equal weights to the first and second reference portions of the three-dimensional video content 308. Other exemplary types of interpolation include but are not limited to linear interpolation, polynomial interpolation, and spline interpolation.

Output circuitry 306 serves as an output interface for encoding system 300. Processing circuitry 304 delivers encoded three-dimensional video content 312 that includes encoded portions 314A-314N through output circuitry 306.

Portions of three-dimensional video content, such as portions 310A-310N, may be encoded in any of a variety of ways. FIGS. 4-9 show flowcharts 400, 500, 600, 700, 800, and 900 of exemplary methods for encoding portions of three-dimensional video content for subsequent interpolation according to embodiments. Flowcharts 400, 500, 600, 700, 800, and 900 may be performed by encoding system 300 shown in FIG. 3, for example. However, the methods of flowcharts 400, 500, 600, 700, 800, and 900 are not limited to that embodiment. Further structural and operational embodiments will be apparent to persons skilled in the relevant art(s) based on the discussion regarding flowcharts 400, 500, 600, 700, 800, and 900. Flowcharts 400, 500, 600, 700, 800, and 900 are described as follows.

In all of FIGS. 4-9, the basic approach involves encoder processing of at least a first sequence of frames and a second sequence of frames, wherein the first sequence represents a first perspective view (e.g., a right eye view) while the second sequence represents a second perspective view (e.g., a left eye view). As an output of such encoder processing, many frames will be encoded based on the frame itself (no referencing to other frames), on internal referencing (referencing frames within the same sequence of frames), or on external referencing (referencing frames outside of the current frame's sequence of frames). In addition, whenever an interpolation opportunity presents itself, instead of sending encoded data for such a frame, such encoded data will either be (i) merely deleted (forcing a decoder to perform interpolation based on its determination that the encoded frame data is missing), or (ii) replaced with interpolation information. Such interpolation information may be nothing more than an indicator or marker (an “interpolation marker”) but may also contain interpolation instructions, data, and parameters.

In the encoder processing, a determination is made as to the hierarchical importance of the current frame under consideration, that is, the extent to which the current frame will be referenced by other frames. For example, if the current frame is a primary reference frame (e.g., an I-frame) that will be referenced by many other frames, applying interpolation may not be justifiable. If, on the other hand, the current frame will be referenced by no (or few) other frame(s), it may be a prime candidate for considering interpolation. In addition, the current frame is encoded to determine the size of the resultant encoded frame data. If the size is less than an established threshold, interpolation may not be applied. If, for example, the current frame offers a justifiable data savings and is not referenced by other frames, it is a prime candidate for considering interpolation.
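
A minimal sketch of this candidate selection, assuming the encoder already knows how many other frames reference the current frame and how large its encoded representation is, might look as follows. The threshold values are illustrative assumptions only.

    def is_interpolation_candidate(reference_count, encoded_size_bytes,
                                   max_references=0, min_size_bytes=4096):
        # A frame is a prime candidate when no (or few) other frames reference it
        # and when dropping its encoded data would yield a justifiable savings.
        if reference_count > max_references:
            return False   # heavily referenced (e.g., an I-frame): keep its data
        if encoded_size_bytes < min_size_bytes:
            return False   # savings too small to justify interpolation
        return True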

Once a candidate frame has been identified, the encoder processing involves applying at least one interpolation approach and, depending on the embodiment, may apply multiple interpolation approaches (along with variations of their underlying parameters). If only one approach is applied, a determination is made as to whether such interpolation can be used to yield a visually acceptable output. When multiple approaches are available, a selection is made therefrom of (i) a best match that is also determined to be visually acceptable, (ii) the first match that can be used to yield a visually acceptable output, or (iii) an acceptable match selected at least in part based on the ease of decoding and/or the size of the interpolation information. If best and/or acceptable interpolation information is identified that saves a justifiable amount of data, the encoder processing involves selecting to use the interpolation information (or nothing at all, to force default interpolation by a decoder) instead of the encoded frame data in subsequent storage and/or transmission.
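
For illustration, the following Python sketch selects among several candidate interpolation approaches by comparing each reconstruction against the original frame and keeping the lowest-error approach that is visually acceptable. The use of mean squared error and the acceptability threshold are assumptions; an encoder could instead weigh perceptual metrics, ease of decoding, or the size of the interpolation information, as described above.

    import numpy as np

    def mean_squared_error(a, b):
        return float(np.mean((a.astype(np.float32) - b.astype(np.float32)) ** 2))

    def select_interpolation(original, candidates, max_acceptable_mse=100.0):
        # 'candidates' maps an approach name to the frame that approach would
        # reconstruct.  Keep the lowest-error approach that is acceptable.
        best_name, best_err = None, None
        for name, reconstructed in candidates.items():
            err = mean_squared_error(original, reconstructed)
            if err <= max_acceptable_mse and (best_err is None or err < best_err):
                best_name, best_err = name, err
        return best_name  # None means no acceptable interpolation was found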

For example, consider a three-frame sequence in which the camera is fixed and only a relatively small object within the field of view moves relatively slowly, while the background remains practically unchanged. An interpolation opportunity might involve replacing the middle frame in the sequence with nothing at all, to force the decoder to interpolate between the first frame and the third frame of the sequence. Alternatively, a marker (an interpolation marker) might be used instead of the second frame's encoded data. Upon identification of such a marker, a decoder might either (i) substitute the first or the third frame data for the missing second frame data, which is likely not to be noticed by a viewer due to the relatively short frame period, (ii) create a substitute for the missing second frame data by creating an average of the first frame and the third frame (e.g., a 50/50 “weighted” addition), or (iii) otherwise create a substitute based on a weighted addition percentage or using some other interpolation approach that may utilize interpolation parameters, filters, and other data.

In another example, when a camera is panning, to interpolate a missing middle frame, the frames that precede and follow the missing frame might be stitched together and then cropped to produce a substitute for the missing middle frame data. Of course, in a panning scene, some objects, such as a moving car, may appear stationary in at least some areas within the field of view, so multiple interpolation approaches within a single frame may be applied.
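
As an illustration of this stitching approach for a horizontal pan, the following Python sketch assumes the camera pans to the right by a known number of pixels per frame interval and rebuilds the missing middle frame from columns of the preceding and following frames. The fixed, known pan amount is an assumption; a practical implementation would estimate the motion and would handle vertical motion, rotation, and independently moving regions separately.

    import numpy as np

    def stitch_and_crop(prev_frame, next_frame, pan_pixels):
        # Assume a rightward pan of 'pan_pixels' per frame interval, so the
        # missing middle frame shows the scene shifted by 'pan_pixels' relative
        # to the preceding frame.  Requires pan_pixels < width / 2.
        width = prev_frame.shape[1]
        left_part = prev_frame[:, pan_pixels:]   # scene content still visible
        right_part = next_frame[:, width - 2 * pan_pixels:width - pan_pixels]
        middle = np.concatenate([left_part, right_part], axis=1)  # width columns
        return middle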

Likewise, although the above two examples of interpolation opportunities were applied to a single camera view's frame sequence, interpolation with reference to other camera view frame sequences may also be performed. For example, an object moving in a frame of a first camera's frame sequence might have strong correlation with the same object a short time later captured in a frame of the second camera's frame sequence. Thus, if the correlating frame of the second camera's frame sequence is discarded or replaced, at least the frame in the first camera's frame sequence can be used by a decoder to recreate the missing data. In addition, a single frame (or frame portion) alone or along with other frames (or frame portions) from either or both camera sequences can be used by the decoder to recreate the substitute.

Thus, by sending no interpolation information (i.e., no replacement for deleted frame data), a decoder will conclude that interpolation is needed and respond by either repeating an adjacent frame (e.g., if the frames are substantially different) or creating a middling alternative based on both preceding and subsequent frame data using a single camera's frame sequence. If the interpolation information contains only a marker, the decoder will immediately do the same as above without having to indirectly reach the conclusion that interpolation is needed. The interpolation information may also contain further items that either direct or assist a decoder in performing a desired interpolation. That is, for example, the interpolation information may also contain interpolation instructions, frame reference identifiers (that identify a frame or frames on which a decoder can base its interpolation), interpolation parameters (weighting factors, interpolation approaches to be used, regional/area definitions, etc.), filters (to be applied in the interpolation process), and any accompanying data (e.g., texture maps, etc.) that may enhance the interpolation process.

For example, images captured by one camera might be very close to those captured a brief time later by another camera. Thus, instead of using merely adjacent reference frames for interpolation (such as in the three-frame sequence with a missing middle frame mentioned above), the encoder may choose to send interpolation information that identifies, for use in the interpolation process, one or more frames selected from other cameras' frame sequences and other, possibly non-adjacent, frames from within the same camera's frame sequence. The interpolation information may also include the various interpolation parameters mentioned above, interpolation approaches to be used, regional definitions in which such approaches and frames are used, filters, and data.
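
A hypothetical container for such interpolation information is sketched below in Python. The field names, types, and default values are illustrative assumptions intended only to show how frame reference identifiers, parameters, regional definitions, filters, and accompanying data could travel together with an interpolation marker.

    from dataclasses import dataclass, field
    from typing import List, Optional, Tuple

    @dataclass
    class FrameReference:
        view_id: int       # which camera/perspective view the reference comes from
        frame_index: int   # position of the referenced frame within that view

    @dataclass
    class InterpolationInfo:
        references: List[FrameReference]            # frames to interpolate from
        approach: str = "weighted_average"          # interpolation approach to use
        weights: Optional[List[float]] = None       # weighting factor per reference
        region: Optional[Tuple[int, int, int, int]] = None  # (x, y, w, h) region covered
        filters: List[str] = field(default_factory=list)    # filters applied during interpolation
        extra_data: Optional[bytes] = None          # accompanying data, e.g., a texture map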

A single encoder can perform all or any portion of the above in association with a full frame or sections thereof. For instance, a single frame can be broken down into regions and interpolation per region can be different from that of another region.
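
The per-region idea can be sketched as follows, again for illustration only: a default blend is applied to the whole frame, and hypothetical per-region weights override it where a different treatment has been deemed warranted.

    import numpy as np

    def interpolate_by_region(frame_a, frame_c, region_weights):
        # 'region_weights' maps an (x, y, w, h) rectangle to the weight applied
        # to frame_a within that rectangle; frame_c receives the remainder.
        out = 0.5 * frame_a.astype(np.float32) + 0.5 * frame_c.astype(np.float32)
        for (x, y, w, h), wa in region_weights.items():
            a = frame_a[y:y + h, x:x + w].astype(np.float32)
            c = frame_c[y:y + h, x:x + w].astype(np.float32)
            out[y:y + h, x:x + w] = wa * a + (1.0 - wa) * c
        return np.clip(out, 0, 255).astype(np.uint8)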

FIGS. 4-9 are flow charts that illustrate several of many approaches for carrying out at least a portion of such encoder interpolation processing. More specifically, as shown in FIG. 4, flowchart 400 begins at step 402. In step 402, both a first portion of three-dimensional video content and a second portion of the three-dimensional video content are received. The first portion corresponds to data that represents at least one frame from a first sequence of frames that provide a first perspective view. The second portion corresponds to data that represents at least one frame from a second sequence of frames that provide a second perspective view. Although not shown, a third portion that corresponds to data that represents at least one other frame from either the first or the second sequences of frames could also be gathered and considered in the interpolation process. Of course, many other portions from various other frames can also be gathered and used.

In the implementation example of FIG. 3, the processing circuitry 304 receives all portions, including both the first portion and the second portion of the three-dimensional video content through the input circuitry 302.

At step 404, the first portion and the second portion are encoded. The encoding involves at least in part analyzing the first portion and the second portion to identify an interpolation opportunity. In the implementation example of FIG. 3, the processing circuitry 304 encodes the first portion and the second portion.

At step 406, frame data is replaced with an interpolation marker. In the implementation example of FIG. 3, the processing circuitry 304 replaces the frame data with the interpolation marker.

At step 408, an encoded representation of the three-dimensional video content is delivered. In the implementation example of FIG. 3, the processing circuitry 304 delivers the encoded representation of the three-dimensional video content (e.g., encoded three-dimensional video content 312) through the output circuitry 306.

In some embodiments, one or more of the steps 402, 404, 406, and/or 408 of the flowchart 400 may not be performed. Moreover, other steps in addition to or in lieu of the steps 402, 404, 406, and/or 408 may be performed.

FIG. 5 shows a flowchart 500 that illustrates one of many possible implementations of step 404 of flowchart 400 in FIG. 4 in accordance with an embodiment of the present invention. As shown in FIG. 5, flowchart 500 includes step 502. In step 502, a current frame is compared with frames that neighbor the current frame to identify the interpolation opportunity. For example, the frames that neighbor the current frame may be included in respective portions of the three-dimensional video content that correspond to respective reference perspective views. In accordance with this example, the current frame may be included in a portion of the three-dimensional video content that corresponds to a perspective view that is between the reference perspective views. In an embodiment, the interpolation opportunity is identified in a first frame of the first portion while the neighboring frames include a second frame from the second portion. In the implementation example of FIG. 3, the processing circuitry 304 may compare the current frame with the frames that neighbor the current frame (neighbors within either or both of the current camera view's frame sequence and other camera views' frame sequences) to identify the interpolation opportunity.

In some embodiments, step 404 of flowchart 400 may be performed in response to any one or more of the steps shown in flowcharts 600, 700, 800, and/or 900 shown in FIGS. 6-9. As shown in FIG. 6, flowchart 600 includes step 602. In step 602, a determination is made that an accuracy of an estimate of the frame data is greater than a threshold accuracy. In the implementation example of FIG. 3, the processing circuitry 304 determines that the accuracy of the estimate is greater than the threshold accuracy. For example, the processing circuitry 304 may perform an interpolation operation with respect to the first portion and/or the second portion to generate the estimate of the frame data. In accordance with this example, processing circuitry 304 may compare the estimate to the frame data to determine the accuracy of the estimate. Processing circuitry 304 may compare the accuracy to the threshold accuracy to determine whether the accuracy is greater than the threshold accuracy. For instance, processing circuitry 304 may be configured to replace the frame data with an interpolation marker at step 406 if the accuracy of the estimate is greater than the threshold accuracy, but not if the accuracy is less than the threshold accuracy.
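
For illustration, the accuracy comparison of step 602 might be approximated as follows, using peak signal-to-noise ratio (PSNR) as the accuracy measure. The use of PSNR and the particular threshold value are assumptions; the embodiments do not prescribe a specific metric.

    import numpy as np

    def psnr(original, estimate):
        # Peak signal-to-noise ratio in dB between the original frame data and
        # the interpolated estimate; higher values indicate a more accurate estimate.
        mse = np.mean((original.astype(np.float32) - estimate.astype(np.float32)) ** 2)
        if mse == 0:
            return float("inf")
        return 10.0 * np.log10((255.0 ** 2) / mse)

    def should_insert_marker(original_frame, interpolated_estimate, threshold_db=38.0):
        # Replace the frame data with an interpolation marker only when the
        # estimate is accurate enough (illustrative 38 dB threshold).
        return psnr(original_frame, interpolated_estimate) > threshold_db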

As shown in FIG. 7, a determination is made that an error occurs with respect to the frame data. For instance, it may be desirable to avoid sending frame data with respect to which an error is known to have occurred.

As shown in FIG. 8, a determination is made that a source that generates the three-dimensional video content has at least one specified characteristic. For example, a load that is associated with the source may be greater than a threshold load. In another example, the source may not support a viewing format that is associated with the frame data.

As shown in FIG. 9, a determination is made that a communication channel via which the three-dimensional video content is to be transmitted has at least one specified characteristic. For instance, a load that is associated with the communication channel may be greater than a threshold load.

C. Example Decoding Embodiments

FIG. 10 depicts a block diagram of a decoding system 1000, which is an exemplary implementation of interpolation-enabled decoding system 214 of FIG. 2, in accordance with an embodiment. As shown in FIG. 10, decoding system 1000 includes input circuitry 1002, processing circuitry 1004, and output circuitry 1006. Input circuitry 1002 serves as an input interface for decoding system 1000. Processing circuitry 1004 receives a plurality of encoded portions 1010A-1010N of encoded three-dimensional video content 1008 through input circuitry 1002. Each of the encoded portions 1010A-1010N represents a respective sequence of frames that provides a respective perspective view of a video event. Processing circuitry 1004 decodes the encoded portions 1010A-1010N to provide decoded portions 1014A-1014M, which are included in decoded three-dimensional video content 1012. The decoded three-dimensional video content 1012 is also referred to as a decoded representation of the encoded three-dimensional video content 1008. It will be recognized that the number of encoded portions “N” need not necessarily be equal to the number of decoded portions “M”. For instance, processing circuitry 1004 may interpolate between any of the encoded portions 1010A-1010N to generate one or more of the decoded portions 1014A-1014M.

In some embodiments, processing circuitry 1004 responds to one or more interpolation markers by generating frame data to replace the respective interpolation marker(s). For instance, processing circuitry 1004 may respond to a first interpolation marker by generating first frame data to replace the first interpolation marker. Processing circuitry 1004 may respond to a second interpolation marker by generating second frame data to replace the second interpolation marker, and so on. The interpolation marker(s) are included in the encoded three-dimensional video content 1008. The instance(s) of frame data that replace the respective interpolation marker(s) are included in the decoded three-dimensional video content 1012.

Any one or more of the interpolation marker(s) may be accompanied by an interpolation instruction. For instance, processing circuitry 1004 may use a first subset of the encoded portions 1010A-1010N that is specified by a first interpolation instruction that corresponds to a first interpolation marker to generate first frame data to replace the first interpolation marker. Processing circuitry 1004 may use a second subset of the encoded portions 1010A-1010N that is specified by a second interpolation instruction that corresponds to a second interpolation marker to generate second frame data to replace the second interpolation marker, and so on. Each interpolation instruction (or the interpolation marker that it accompanies) may specify a type of interpolation to be performed to generate the frame data that the interpolation marker replaces.

In other embodiments, processing circuitry 1004 identifies one or more frames that are not directly represented by one or more respective encoded portions of the encoded three-dimensional video content 1008. For example, a frame is not directly represented if the frame is replaced with an interpolation marker in the encoded three-dimensional video content 1008. In another example, a frame is not directly represented if the frame is missing from the encoded three-dimensional video content 1008. In yet another example, a frame is not directly represented if the frame is represented by erroneous data in the encoded three-dimensional video content 1008. Missing frames and erroneous frame data may occur, for example, because of (i) defects in storage media or the storage process, and (ii) losses or unacceptable delays encountered in a less than perfect communication pathway. Another example resulting in a need for interpolation occurs when referenced frame data cannot be found or is itself erroneous (corrupted). That is, the current frame data is correct, but one or more other portions of frame data that the current frame references (portions directly associated with different frames) happen to be missing or contain erroneous data. In such a case, without an ability to decode the present, correct frame data, interpolation may be performed to generate the current frame as an alternative. Processing circuitry 1004 produces interpolation(s) of the respective frame(s) that are not directly represented.

Output circuitry 1006 serves as an output interface for decoding system 1000. Processing circuitry 1004 delivers the decoded three-dimensional video content 1012 through output circuitry 1006.

Portions of encoded three-dimensional video content, such as encoded portions 1010A-1010N, may be decoded in any of a variety of ways. FIGS. 11-16 show flowcharts 1100, 1200, 1300, 1400, 1500, and 1600 of exemplary methods for decoding portions of encoded three-dimensional video content using interpolation according to embodiments. Flowcharts 1100, 1200, 1300, 1400, 1500, and 1600 may be performed by decoding system 1000 shown in FIG. 10, for example. However, the methods of flowcharts 1100, 1200, 1300, 1400, 1500, and 1600 are not limited to that embodiment. Further structural and operational embodiments will be apparent to persons skilled in the relevant art(s) based on the discussion regarding flowcharts 1100, 1200, 1300, 1400, 1500, and 1600. Flowcharts 1100, 1200, 1300, 1400, 1500, and 1600 are described as follows.

In all of FIGS. 11-16, the basic approach involves decoder processing of at least a first sequence of frames and a second sequence of frames, wherein the first sequence represents a first perspective view (e.g., a right eye view) while the second sequence represents a second perspective view (e.g., a left eye view). The decoder receives many frames that are encoded based on the frame itself (no referencing to other frames), on internal referencing (referencing frames within the same sequence of frames), or on external referencing (referencing frames outside of the current frame's sequence of frames). In addition, for some frames, instead of receiving encoded data for such a frame, the decoder finds that such encoded data has either been (i) deleted or (ii) replaced with interpolation information. Upon determining that encoded data has been deleted or replaced with interpolation information, the decoder performs interpolation to generate a substitute for that frame data. Interpolation information may be nothing more than an indicator or marker (an “interpolation marker”) but may also contain interpolation instructions, data, and parameters. The decoder processing involves applying at least one interpolation approach and, depending on the embodiment, may apply multiple interpolation approaches (along with variations of their underlying parameters).

For example, consider a three-frame sequence in which the camera is fixed and only a relatively small object within the field of view moves relatively slowly, while the background remains practically unchanged. In accordance with this example, an encoder may replace the middle frame in the sequence with nothing at all. The decoder detects that the middle frame is missing and interpolates between the first frame and the third frame of the sequence. Alternatively, the encoder may use a marker (an interpolation marker) instead of the second frame's encoded data. Upon identification of such a marker, the decoder might either (i) substitute the first or the third frame data for the missing second frame data, which is likely not to be noticed by a viewer due to the relatively short frame period, (ii) create a substitute for the missing second frame data by creating an average of the first frame and the third frame (e.g., a 50/50 “weighted” addition), or (iii) otherwise create a substitute based on a weighted addition percentage or using some other interpolation approach that may utilize interpolation parameters, filters, and other data.

In another example, when a camera is panning, to interpolate a missing middle frame, the decoder may stitch together the frames that precede and follow the missing frame and then crop the stitched result to produce a substitute for the missing middle frame data. Of course, in a panning scene, some objects, such as a moving car, may appear stationary in at least some areas within the field of view, so multiple interpolation approaches within a single frame may be applied.

Likewise, although the above two interpolation examples were applied to a single camera view's frame sequence, interpolation with reference to other camera view frame sequences may also be performed. For example, an object moving in a frame of a first camera's frame sequence might have strong correlation with the same object a short time later captured in a frame of the second camera's frame sequence. Thus, if the correlating frame of the second camera's frame sequence is discarded or replaced, at least the frame in the first camera's frame sequence can be used by the decoder to recreate the missing data. In addition, a single frame (or frame portion) alone or along with other frames (or frame portions) from either or both camera sequences can be used by the decoder to recreate the substitute.

Thus, if frame data is missing and no interpolation information (i.e., no replacement for the missing frame data) is received by the decoder, the decoder will conclude that interpolation is needed and respond by either repeating an adjacent frame (e.g., if the frames are substantially different) or creating a middling alternative based on both preceding and subsequent frame data using a single camera's frame sequence. If the interpolation information contains only a marker, the decoder will immediately do the same as above without having to indirectly reach the conclusion that interpolation is needed. The interpolation information may also contain further items that either direct or assist the decoder in performing a desired interpolation. That is, for example, the interpolation information may also contain interpolation instructions, frame reference identifiers (that identify a frame or frames on which the decoder can base its interpolation), interpolation parameters (weighting factors, interpolation approaches to be used, regional/area definitions, etc.), filters (to be applied in the interpolation process), and any accompanying data (e.g., texture maps, etc.) that may enhance the interpolation process.

For example, images captured by one camera might be very close to those captured a brief time later by another camera. Thus, instead of using merely adjacent reference frames for interpolation (such as in the three-frame sequence with a missing middle frame mentioned above), an encoder may choose to send interpolation information that identifies, for use by the decoder, one or more frames selected from other cameras' frame sequences and other, possibly non-adjacent, frames from within the same camera's frame sequence. The interpolation information may also include the various interpolation parameters mentioned above, interpolation approaches to be used, regional definitions in which such approaches and frames are used, filters, and data.

A decoder can perform all or any portion of the above in association with a full frame or sections thereof. For instance, the decoder can perform interpolation operations on respective regions of a single frame, and interpolation per region can be different from that of another region.

FIGS. 11-16 are flow charts that illustrate several of many approaches for carrying out at least a portion of such decoder interpolation processing. More specifically, as shown in FIG. 11, flowchart 1100 begins at step 1102. In step 1102, both a first encoded portion of a first encoded sequence of frames that represent a first perspective view and a second encoded portion of a second encoded sequence of frames that represent a second perspective view are received. In the implementation example of FIG. 10, the processing circuitry 1004 receives the first encoded portion and the second encoded portion through the input circuitry 1002.

At step 1104, the first encoded portion and the second encoded portion are decoded. The decoding involves responding to an interpolation marker by generating frame data to replace the interpolation marker. In the implementation example of FIG. 10, the processing circuitry 1004 decodes the first encoded portion and the second encoded portion.

At step 1106, a decoded representation of the encoded three-dimensional video content is delivered. In the implementation example of FIG. 10, the processing circuitry 1004 delivers the decoded representation of the encoded three-dimensional video content (e.g., decoded three-dimensional video content 1012) through the output circuitry 1006.

In some example embodiments, one or more of steps 1102, 1104, and/or 1106 of flowchart 1100 may not be performed. Moreover, steps in addition to or in lieu of steps 1102, 1104, and/or 1106 may be performed.

Instead of performing step 1104 of flowchart 1100, the steps shown in flowchart 1200, flowchart 1300, or flowchart 1400 shown in respective FIGS. 12-14 may be performed. As shown in FIG. 12, flowchart 1200 begins at step 1202. In step 1202, a determination is made that a number of perspective views that a display is capable of processing is greater than a number of perspective views that is initially represented by the encoded three-dimensional video content. In the implementation example of FIG. 10, the processing circuitry 1004 determines that the number of perspective views that the display is capable of processing is greater than the number of perspective views that is initially represented by the encoded three-dimensional video content.

At step 1204, an interpolation request is provided to an encoder. The interpolation request requests inclusion of an interpolation marker in the encoded three-dimensional video content. In the implementation example of FIG. 10, the processing circuitry 1004 provides the interpolation request through the output circuitry 1006.

At step 1206, an interpolation is performed between a decoded version of the first encoded portion and a decoded version of the second encoded portion to generate frame data that corresponds to a third sequence of frames that represent a third perspective view to replace the interpolation marker. The third perspective view is not initially represented by the encoded three-dimensional video content. In the implementation example of FIG. 10, the processing circuitry 1004 interpolates between the decoded version of the first encoded portion and the decoded version of the second encoded portion to generate the frame data.
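
By way of non-limiting illustration, the Python sketch below mirrors the flow of steps 1202 through 1206. The helper names (request_interpolation_marker, blend_views), the equal-weight pixel averaging, and the view-count comparison are assumptions made for the example rather than an actual decoder interface.

import numpy as np
from typing import Optional

def request_interpolation_marker(encoder_address: str) -> None:
    """Stand-in for step 1204: ask the encoder to include an interpolation
    marker in the encoded three-dimensional video content."""
    print("interpolation request sent to encoder at " + encoder_address)

def blend_views(view_a: np.ndarray, view_b: np.ndarray) -> np.ndarray:
    """Step 1206 (sketch): synthesize a third perspective view lying between
    two decoded views by averaging them pixel-wise."""
    return ((view_a.astype(np.uint16) + view_b.astype(np.uint16)) // 2).astype(np.uint8)

def maybe_synthesize_extra_view(display_views: int,
                                content_views: int,
                                decoded_first: np.ndarray,
                                decoded_second: np.ndarray) -> Optional[np.ndarray]:
    # Step 1202: compare display capability with the views in the content.
    if display_views <= content_views:
        return None
    # Step 1204: request that a marker be included for the missing view.
    request_interpolation_marker("encoder.example")
    # Step 1206: interpolate between the two decoded portions.
    return blend_views(decoded_first, decoded_second)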

As shown in FIG. 13, flowchart 1300 begins at step 1302. In step 1302, an interpolation instruction is received from an upstream device. In the implementation example of FIG. 10, the processing circuitry 1004 receives the interpolation instructions through the input circuitry 1002.

At step 1304, the first encoded portion and the second encoded portion are decoded. The decoding involves responding to an interpolation marker by generating frame data to replace the interpolation marker in accordance with the interpolation instruction. In the implementation example of FIG. 10, the processing circuitry 1004 decodes the first encoded portion and the second encoded portion.

As shown in FIG. 14, flowchart 1400 begins at step 1402. In step 1402, the first encoded portion is decoded to provide a first decoded portion of a first decoded sequence of frames that represents the first perspective view. In the implementation example of FIG. 10, the processing circuitry 1004 decodes the first encoded portion.

At step 1404, the second encoded portion is decoded to provide decoded data that represents the second perspective view, the decoded data including an interpolation marker. In the implementation example of FIG. 10, the processing circuitry 1004 decodes the second encoded portion.

At step 1406, an interpolation is performed between the first decoded portion and a third decoded portion of a third decoded sequence of frames that represents a third perspective view to generate frame data to replace the interpolation marker in the decoded data. In the implementation example of FIG. 10, the processing circuitry 1004 interpolates between the first decoded portion and the third decoded portion to generate the frame data.

Instead of performing step 1406 of flowchart 1400, the steps shown in flowchart 1500 of FIG. 15 may be performed. As shown in FIG. 15, flowchart 1500 begins at step 1502. In step 1502, a weight indicator is received from an upstream device. The weight indicator specifies an extent to which the first decoded portion is to be weighed with respect to the third decoded portion. In the implementation example of FIG. 10, the processing circuitry 1004 receives the weight indicator from the upstream device through input circuitry 1002.

At step 1504, an interpolation is performed between the first decoded portion and a third decoded portion of a third decoded sequence of frames that represents a third perspective view to generate frame data to replace the interpolation marker in the decoded data based on the extent that is specified by the weight indicator. In the implementation example of FIG. 10, the processing circuitry 1004 interpolates between the first decoded portion and the third decoded portion to generate the frame data.
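
By way of non-limiting illustration, and assuming (purely for the example) that the weight indicator is conveyed as a fraction between 0 and 1, the weighted interpolation of step 1504 might be sketched in Python as follows.

import numpy as np

def weighted_view_blend(first_portion: np.ndarray,
                        third_portion: np.ndarray,
                        weight_indicator: float) -> np.ndarray:
    """Blend two decoded portions according to a received weight indicator.

    weight_indicator gives the extent to which the first decoded portion is
    weighed with respect to the third (0.0 = all third, 1.0 = all first).
    """
    w = float(np.clip(weight_indicator, 0.0, 1.0))
    blended = (w * first_portion.astype(np.float32) +
               (1.0 - w) * third_portion.astype(np.float32))
    return np.clip(blended, 0, 255).astype(np.uint8)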

As shown in FIG. 16, flowchart 1600 begins at step 1602. In step 1602, at least a portion of first encoded data is retrieved that relates to a first sequence of frames representing a first perspective view. In the implementation example of FIG. 10, the processing circuitry 1004 retrieves the at least one portion of the first encoded data.

At step 1604, at least a portion of second encoded data is retrieved that relates to a second sequence of frames representing a second perspective view. In the implementation example of FIG. 10, the processing circuitry 1004 retrieves the at least one portion of the second encoded data.

At step 1606, a first frame is identified within the first sequence of frames that is not directly represented by the first encoded data retrieved. For example, an interpolation marker that is associated with the first frame may be identified. In accordance with this example, the interpolation marker may be accompanied by interpolation instructions. In another example, the first frame includes a missing frame. In the implementation example of FIG. 10, the processing circuitry 1004 identifies the first frame.

At step 1608, an interpolation of the first frame is produced. For example, the interpolation may be based at least in part on the second encoded data. In another example, production of the interpolation of the first frame may be based on at least the portion of the first encoded data and at least a portion of third encoded data that relates to a third sequence of frames representing a third perspective view based on a weight indicator. In accordance with this example, the weight indicator specifies an extent to which at least the portion of the first encoded data is to be weighed with respect to at least the portion of the third encoded data. In the implementation example of FIG. 10, the processing circuitry 1004 produces the interpolation of the first frame.

FIGS. 17-20 illustrate exemplary interpolation techniques 1700, 1800, 1900, and 2000 according to embodiments. Each of the interpolation techniques 1700, 1800, 1900, and 2000 is described with reference to exemplary instances of video content. The instances of video content may be 2D video content, 3D2 video content, 3D4 video content, 3D8 video content, etc. It will be recognized that 2D video content represents one perspective view of a video event, 3D2 video content represents two perspective views of the video event, 3D4 video content represents four perspective views of the video event, 3D8 video content represents eight perspective views of the video event, and so on. The numbers of views represented by the instances of video content that are used to describe techniques 1700, 1800, 1900, and 2000 are provided for illustrative purposes and are not intended to be limiting. It will be recognized that techniques 1700, 1800, 1900, and 2000 are applicable to video content that represents any suitable number of views. In the following discussion, instances of video content are referred to simply as “content” for convenience.

Referring to FIG. 17, technique 1700 is directed to staging 3D8 content from 2D up according to an embodiment. Technique 1700 will be described with reference to original 3D8 content 1702, 2D content 1704, 3D2 content 1706, 3D4 content 1708, and 3D8 content 1710. The original content used to illustrate technique 1700 is 3D8 content, which includes eight video streams (labeled as 1-8) that represent respective views of a video event.

A single stream of the original 3D8 content 1702 may be used to provide 2D content. As shown in FIG. 17, stream 3 of the original 3D8 content 1702 is used to provide 2D content 1704 for illustrative purposes. Internal interframe compression referencing is used with respect to stream 3 of the original 3D8 content 1702 to generate stream 3 of the 2D content 1704. However, no other streams that are included in the original 3D8 content 1702 are referenced to generate stream 3 of the 2D content 1704.

Internal interframe compression referencing is a technique in which differences between frames (e.g., adjacent frames) that are included in a stream of video content are used to represent the frames in that stream. For example, a first frame in the stream may be designated as a reference frame with other frames in the stream being designated as dependent frames. In accordance with this example, the reference frame may be represented by data that is sufficient to independently define the reference frame, while the dependent frames may be represented by difference data. The difference data that represents each dependent frame may be combined with data that represents one or more of the other frames in the stream to generate data that is sufficient to independently define that dependent frame.
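
By way of non-limiting illustration, the following Python sketch shows a toy form of internal interframe compression referencing in which the reference frame is stored whole and each dependent frame is stored as a difference from the frame preceding it. Practical encoders use motion-compensated prediction and entropy coding; the array representation here is an assumption for the example.

import numpy as np
from typing import List

def encode_stream(frames: List[np.ndarray]) -> List[np.ndarray]:
    """Store the first frame whole (reference frame) and every other frame as
    the difference from the frame that precedes it (dependent frames)."""
    encoded = [frames[0].astype(np.int16)]
    for prev, cur in zip(frames, frames[1:]):
        encoded.append(cur.astype(np.int16) - prev.astype(np.int16))
    return encoded

def decode_stream(encoded: List[np.ndarray]) -> List[np.ndarray]:
    """Rebuild each dependent frame by adding its difference data to the
    previously reconstructed frame."""
    frames = [encoded[0].astype(np.uint8)]
    for diff in encoded[1:]:
        frames.append((frames[-1].astype(np.int16) + diff).astype(np.uint8))
    return frames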

Two streams of the original 3D8 content 1702 may be used to provide 3D2 content. As shown in FIG. 17, streams 3 and 7 of the original 3D8 content 1702 are used to provide 3D2 content 1706 for illustrative purposes. Internal interframe compression referencing is used with respect to stream 3 of the original 3D8 content 1702 to generate stream 3 of the 3D2 content 1706, as described above with reference to 2D content 1704. Stream 7 of the 3D2 content 1706 is generated using internal interframe compression referencing with respect to stream 7 of the original 3D8 content 1702 and stream 3 of the 2D content 1704 for referencing.

Four streams of the original 3D8 content 1702 may be used to provide 3D4 content. As shown in FIG. 17, streams 3 and 7 of the 3D4 content 1708 are generated as described above with reference to 3D2 content 1706. Streams 1 and 5 of the 3D4 content 1708 are generated using streams 1 and 5 of the original 3D8 content 1702 and streams 3 and 7 of the 3D2 content 1706 for referencing.

All eight streams of the original 3D8 content 1702 may be used to provide 3D8 content. As shown in FIG. 17, streams 1, 3, 5, and 7 of the 3D8 content 1710 are generated as described above with reference to 3D4 content 1708. Streams 2, 4, 6, and 8 of the 3D8 content 1710 are generated using any of a plurality of streams, which includes streams 1, 3, 5, and 7 of the 3D4 content 1708 and streams 2, 4, 6, and 8 of the original 3D8 content 1702, for referencing.
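
By way of non-limiting illustration, the referencing relationships described above for FIG. 17 can be restated as a simple table mapping each delivered stream to the other streams it may reference (in addition to internal interframe referencing within its own original stream). The literal Python dictionary below is an assumption for the example.

# Restatement of the FIG. 17 staging relationships described in the text.
# Keys are the delivered tiers; values map each stream in a tier to the other
# streams it may reference, beyond internal referencing of its own stream.
STAGING_REFERENCES = {
    "2D":  {3: []},
    "3D2": {3: [], 7: [3]},
    "3D4": {3: [], 7: [3], 1: [3, 7], 5: [3, 7]},
    "3D8": {3: [], 7: [3], 1: [3, 7], 5: [3, 7],
            2: [1, 3, 5, 7], 4: [1, 3, 5, 7],
            6: [1, 3, 5, 7], 8: [1, 3, 5, 7]},
}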

Referring to FIG. 18, technique 1800 is directed to a limited referencing configuration according to an embodiment. Technique 1800 will be described with reference to original 3D8 content 1802, 2D content 1804, 3D2 content 1806, 3D4 content 1808, and 3D8 content 1810. The original content used to illustrate technique 1800 is 3D8 content, which includes eight video streams (labeled as 1-8) that represent respective views of a video event. The 2D content 1804 and the 3D2 content 1806 are generated in the same manner as the 2D content 1704 and the 3D2 content 1706 described above with reference to FIG. 17. However, the manner in which the 3D4 content 1808 and the 3D8 content 1810 are generated differs from the manner in which the 3D4 content 1708 and the 3D8 content 1710 are generated.

As shown in FIG. 18, streams 3 and 7 of the 3D4 content 1808 are generated as described above with reference to 3D4 content 1708 of FIG. 17. However, stream 1 of the 3D4 content 1808 is generated using stream 1 of the original 3D8 content 1802 and streams 3 and 7 of the 3D2 content 1806 for referencing. Furthermore, stream 5 of the 3D4 content 1808 is generated using stream 5 of the original 3D8 content 1802 and streams 3 and 7 of the 3D2 content 1806 for referencing.

Streams 1, 3, 5, and 7 of the 3D8 content 1810 are generated as described above with reference to the 3D4 content 1808. Streams 2, 4, 6, and 8 of the 3D8 content 1810 are generated using internal interframe compression referencing and streams 1, 3, 5, and 7 of the 3D4 content 1808 for referencing.

Referring to FIG. 19, technique 1900 is directed to interpolation of lost frame data to maintain image stability according to an embodiment. Technique 1900 will be described with reference to original 3D8 content 1902, 2D content 1904, and 3D2 content 1906. The original content used to illustrate technique 1900 is 3D8 content, which includes eight video streams (labeled as 1-8) that represent respective views of a video event. The 2D content 1904 is generated in the same manner as the 2D content 1704 described above with reference to FIG. 17. However, the manner in which the 3D2 content 1906 is generated differs from the manner in which the 3D2 content 1706 is generated.

As shown in FIG. 19, stream 3 of the 3D2 content 1906 is generated as described above with reference to 3D2 content 1706 of FIG. 17. Moreover, stream 7 of the 3D2 content 1906 is generated using stream 3 of the 2D content 1904 for referencing. Stream 7 of the 3D2 content 1906 is generated further using internal interframe compression referencing if a previous frame and/or a future frame of stream 3 of the 2D content 1904 is similar to a current frame of stream 3 of the 2D content 1904.

Referring to FIG. 20, technique 2000 is directed to interpolation to provide a number of views that is greater than a number of views that are represented by received video content according to an embodiment. Technique 2000 will be described with reference to original 3D4 content 2002, 2D content 2004, 3D2 content 2006, 3D4 content 2008, and 3D8 content 2010. The original content used to illustrate technique 2000 is 3D4 content, which includes four video streams (labeled as 1-4) that represent respective views of a video event.

A single stream of the original 3D4 content 2002 may be used to provide 2D content. As shown in FIG. 20, stream 3 of the original 3D4 content 2002 is used to provide 2D content 2004 for illustrative purposes. Internal interframe compression referencing is used with respect to stream 3 of the original 3D4 content 2002 to generate stream 3 of the 2D content 2004. However, no other streams that are included in the original 3D4 content 2002 are referenced to generate stream 3 of the 2D content 2004.

Two streams of the original 3D4 content 2002 may be used to provide 3D2 content. As shown in FIG. 20, streams 1 and 3 of the original 3D4 content 2002 are used to provide 3D2 content 2006 for illustrative purposes. Stream 3 of the 3D2 content 2006 is generated as described above with reference to the 2D content 2004. Stream 1 of the 3D2 content 2006 is generated using internal interframe compression referencing with respect to stream 1 of the original 3D4 content 2002 and stream 3 of the 2D content 2004 for referencing.

All four streams of the original 3D4 content 2002 may be used to provide 3D4 content. As shown in FIG. 20, streams 1 and 3 of the 3D4 content 2008 are generated as described above with reference to 3D2 content 2006. Streams 2 and 4 of the 3D4 content 2008 are generated using streams 2 and 4 of the original 3D4 content 2002 and streams 1 and 3 of the 3D2 content 2006 for referencing.

All four streams of the original 3D4 content 2002 may be used to provide 3D8 content. As shown in FIG. 20, streams 1-4 of the 3D8 content 2010 are generated as described above with reference to 3D4 content 2008. Streams 5-8 of the 3D8 content 2010 are entirely interpolated using nearest neighbor streams.

Streams that are used by a decoder for purposes of interpolation must be available to the decoder. For example, when external interpolation referencing is used to encode data, only frame sequences (perspective views) that are allowed to be referenced for encoding purposes may be used for interpolation purposes. External interpolation referencing involves referencing frames to be used for interpolation that can be found in frame sequences outside of a current frame sequence (i.e., from a different perspective view).

In some embodiments, streams are encoded using hierarchical encoding techniques, as described in commonly-owned, co-pending U.S. patent application Ser. No. ______ (Atty. Docket No. A05.01330000), filed on even date herewith and entitled “Hierarchical Video Compression Supporting Selective Delivery of Two-Dimensional and Three-Dimensional Video Content,” the entirety of which is incorporated by reference herein. Such embodiments enable a subset of a total number of streams to be received and decoded at the decoder, wherein some of the streams received may be decoded by referencing some of the other streams received. In accordance with these embodiments, none of the received streams rely on non-received streams for decoding. Accordingly, in these embodiments, interpolation referencing is limited to received streams.

III. Exemplary Electronic Device Implementations

Embodiments may be implemented in hardware, software, firmware, or any combination thereof. For example, encoding system 208, decoding system 214, display circuitry 216, input circuitry 302, processing circuitry 304, output circuitry 306, input circuitry 1002, processing circuitry 1004, and/or output circuitry 1006 may be implemented as hardware logic/electrical circuitry. In another example, encoding system 208, decoding system 214, display circuitry 216, input circuitry 302, processing circuitry 304, output circuitry 306, input circuitry 1002, processing circuitry 1004, and/or output circuitry 1006 may be implemented as computer program code configured to be executed in one or more processors.

For instance, FIG. 21 shows a block diagram of an exemplary implementation of electronic device 2100 according to an embodiment. In embodiments, electronic device 2100 may include one or more of the elements shown in FIG. 21. As shown in the example of FIG. 21, electronic device 2100 may include one or more processors (also called central processing units, or CPUs), such as a processor 2104. Processor 2104 is connected to a communication infrastructure 2102, such as a communication bus. In some embodiments, processor 2104 can simultaneously operate multiple computing threads.

Electronic device 2100 also includes a primary or main memory 2106, such as random access memory (RAM). Main memory 2106 has stored therein control logic 2128A (computer software), and data.

Electronic device 2100 also includes one or more secondary storage devices 2110. Secondary storage devices 2110 include, for example, a hard disk drive 2112 and/or a removable storage device or drive 2114, as well as other types of storage devices, such as memory cards and memory sticks. For instance, electronic device 2100 may include an industry standard interface, such as a universal serial bus (USB) interface, for interfacing with devices such as a memory stick. Removable storage drive 2114 represents a floppy disk drive, a magnetic tape drive, a compact disk drive, an optical storage device, tape backup, etc.

Removable storage drive 2114 interacts with a removable storage unit 2116. Removable storage unit 2116 includes a computer useable or readable storage medium 2124 having stored therein computer software 2128B (control logic) and/or data. Removable storage unit 2116 represents a floppy disk, magnetic tape, compact disk, DVD, optical storage disk, or any other computer data storage device. Removable storage drive 2114 reads from and/or writes to removable storage unit 2116 in a well known manner.

Electronic device 2100 further includes a communication or network interface 2118. Communication interface 2118 enables the electronic device 2100 to communicate with remote devices. For example, communication interface 2118 allows electronic device 2100 to communicate over communication networks or mediums 2122 (representing a form of a computer useable or readable medium), such as LANs, WANs, the Internet, etc. Network interface 2118 may interface with remote sites or networks via wired or wireless connections.

Control logic 2128C may be transmitted to and from electronic device 2100 via the communication medium 2122.

Any apparatus or manufacture comprising a computer useable or readable medium having control logic (software) stored therein is referred to herein as a computer program product or program storage device. This includes, but is not limited to, electronic device 2100, main memory 2106, secondary storage devices 2110, and removable storage unit 2116. Such computer program products, having control logic stored therein that, when executed by one or more data processing devices, causes such data processing devices to operate as described herein, represent embodiments of the invention.

Devices in which embodiments may be implemented may include storage, such as storage drives, memory devices, and further types of computer-readable media. Examples of such computer-readable storage media include a hard disk, a removable magnetic disk, a removable optical disk, flash memory cards, digital video disks, random access memories (RAMs), read only memories (ROM), and the like. As used herein, the terms “computer program medium” and “computer-readable medium” are used to generally refer to the hard disk associated with a hard disk drive, a removable magnetic disk, a removable optical disk (e.g., CDROMs, DVDs, etc.), zip disks, tapes, magnetic storage devices, MEMS (micro-electromechanical systems) storage, nanotechnology-based storage devices, as well as other media such as flash memory cards, digital video discs, RAM devices, ROM devices, and the like. Such computer-readable storage media may store program modules that include computer program logic for encoding system 208, decoding system 214, display circuitry 216, input circuitry 302, processing circuitry 304, output circuitry 306, input circuitry 1002, processing circuitry 1004, output circuitry 1006, flowchart 400, flowchart 500, flowchart 600, flowchart 700, flowchart 800, flowchart 900, flowchart 1100, flowchart 1200, flowchart 1300, flowchart 1400, flowchart 1500, flowchart 1600 (including any one or more steps of flowcharts 400, 500, 600, 700, 800, 900, 1100, 1200, 1300, 1400, 1500, and 1600), and/or further embodiments of the present invention described herein. Embodiments of the invention are directed to computer program products comprising such logic (e.g., in the form of program code or software) stored on any computer useable medium. Such program code, when executed in one or more processors, causes a device to operate as described herein.

The invention can be put into practice using software, firmware, and/or hardware implementations other than those described herein. Any software, firmware, and hardware implementations suitable for performing the functions described herein can be used.

As described herein, electronic device 2100 may be implemented in association with a variety of types of display devices. For instance, electronic device 2100 may be one of a variety of types of media devices, such as a stand-alone display (e.g., a television display such as a flat panel display), a computer, a game console, a set top box, a digital video recorder (DVR), or another electronic device mentioned elsewhere herein. Media content that is delivered in two-dimensional or three-dimensional form according to embodiments described herein may be stored locally or received from remote locations. For instance, such media content may be locally stored for playback (replay TV, DVR), may be stored in removable memory (e.g., DVDs, memory sticks, etc.), or may be received on wireless and/or wired pathways through a network such as a home network, through Internet download streaming, through a cable network, a satellite network, a fiber network, etc. For instance, FIG. 21 shows a first media content 2130A that is stored in hard disk drive 2112, a second media content 2130B that is stored in storage medium 2124 of removable storage unit 2116, and a third media content 2130C that may be remotely stored and received over communication medium 2122 by communication interface 2118. Media content 2130 may be stored and/or received in these manners and/or in other ways.

IV. Conclusion

While various embodiments have been described above, it should be understood that they have been presented by way of example only, and not limitation. It will be understood by those skilled in the relevant art(s) that various changes in form and details may be made to the embodiments described herein without departing from the spirit and scope of the invention. Accordingly, the breadth and scope of the present invention should not be limited by any of the above-described exemplary embodiments, but should be defined only in accordance with the following claims and their equivalents.

Claims

1. An encoding system servicing three-dimensional video content, the three-dimensional video content having both a first portion representing a first sequence of frames that provide a first perspective view and a second portion representing a second sequence of frames that provide a second perspective view, the encoding system comprising:

processing circuitry;
input circuitry through which the processing circuitry receives both the first portion that represents the first sequence of frames that provide the first perspective view and the second portion that represents the second sequence of frames that provide the second perspective view;
the processing circuitry encodes the first portion and the second portion received, the encoding involving at least in part analyzing the first portion and the second portion to identify an interpolation opportunity, and, upon so identifying, the processing circuitry replaces frame data with an interpolation marker; and
output circuitry through which the processing circuitry delivers an encoded representation of the three-dimensional video content.

2. The encoding system of claim 1, wherein the processing circuitry compares a current frame with frames that neighbor the current frame to identify the interpolation opportunity.

3. The encoding system of claim 2, wherein the interpolation opportunity is identified in a first frame of the first portion while the neighboring frames include a second frame from the second portion.

4. The encoding system of claim 1, wherein the interpolation marker is accompanied by an interpolation instruction.

5. The encoding system of claim 1, wherein the processing circuitry determines that an accuracy of an estimate of the frame data is greater than a threshold accuracy; and

wherein the processing circuitry analyzes the first portion and the second portion to identify the interpolation opportunity in response to determination that the accuracy of the estimate is greater than the threshold accuracy.

6. The encoding system of claim 1, wherein the processing circuitry determines that an error occurs with respect to the frame data; and

wherein the processing circuitry analyzes the first portion and the second portion to identify the interpolation opportunity in response to determination that the error occurs.

7. The encoding system of claim 1, wherein the processing circuitry determines that a source that generates the three-dimensional video content has at least one specified characteristic; and

wherein the processing circuitry analyzes the first portion and the second portion to identify the interpolation opportunity in response to determination that the source has the at least one specified characteristic.

8. The encoding system of claim 1, wherein the processing circuitry determines that a communication channel via which the three-dimensional video content is to be transmitted has at least one specified characteristic; and

wherein the processing circuitry analyzes the first portion and the second portion to identify the interpolation opportunity in response to determination that the communication channel has the at least one specified characteristic.

9. The encoding system of claim 1, wherein the interpolation marker specifies a type of interpolation to be performed to generate the frame data.

10. A decoding system servicing encoded three-dimensional video content, the encoded three-dimensional video content having both a first encoded portion of a first encoded sequence of frames that represent a first perspective view and a second encoded portion of a second encoded sequence of frames that represent a second perspective view, the decoding system comprising:

processing circuitry;
input circuitry through which the processing circuitry receives both the first encoded portion of the first encoded sequence of frames that represent the first perspective view and the second encoded portion of the second encoded sequence of frames that represent the second perspective view;
the processing circuitry decodes the first encoded portion and the second encoded portion received, the decoding involving responding to an interpolation marker by generating frame data to replace the interpolation marker; and
output circuitry through which the processing circuitry delivers a decoded representation of the encoded three-dimensional video content.

11. The decoding system of claim 10, wherein the processing circuitry determines that a number of perspective views that a display is capable of processing is greater than a number of perspective views that is initially represented by the encoded three-dimensional video content;

wherein the processing circuitry provides an interpolation request to an encoder through the output circuitry, the interpolation request requesting inclusion of the interpolation marker in the encoded three-dimensional video content, in response to determination that the number of perspective views that the display is capable of processing is greater than the number of perspective views that is initially represented by the encoded three-dimensional video content; and
wherein the processing circuitry interpolates between a decoded version of the first encoded portion and a decoded version of the second encoded portion to generate the frame data that corresponds to a third sequence of frames that represent a third perspective view, the third perspective view not being initially represented by the encoded three-dimensional video content.

12. The decoding system of claim 10, wherein the processing circuitry receives an interpolation instruction from an upstream device through the input circuitry; and

wherein the processing circuitry generates the frame data in accordance with the interpolation instruction.

13. The decoding system of claim 10, wherein the processing circuitry decodes the first encoded portion to provide a first decoded portion of a first decoded sequence of frames that represents the first perspective view;

wherein the processing circuitry decodes the second encoded portion to provide decoded data that represents the second perspective view, the decoded data including the interpolation marker; and
wherein the processing circuitry interpolates between the first decoded portion and a third decoded portion of a third decoded sequence of frames that represents a third perspective view to generate the frame data to replace the interpolation marker in the decoded data.

14. The decoding system of claim 13, wherein the processing circuitry receives a weight indicator from an upstream device through the input circuitry, the weight indicator specifying an extent to which the first decoded portion is to be weighed with respect to the third decoded portion; and

wherein the processing circuitry generates the frame data based on the extent that is specified by the weight indicator.

15. A method used in decoding encoded three-dimensional video content, the encoded three-dimensional video content having both first encoded data relating to a first sequence of frames representing a first perspective view and second encoded data relating to a second sequence of frames representing a second perspective view, the method comprising:

retrieving at least a portion of the first encoded data that relates to the first sequence of frames representing the first perspective view;
retrieving at least a portion of the second encoded data that relates to the second sequence of frames representing the second perspective view;
identifying a first frame within the first sequence of frames not directly represented by the first encoded data retrieved; and
producing an interpolation of the first frame.

16. The method of claim 15, wherein the interpolation is based at least in part on the second encoded data.

17. The method of claim 15, wherein identifying the first frame comprises:

identifying an interpolation marker that is associated with the first frame.

18. The method of claim 17, wherein the interpolation marker is accompanied by interpolation instructions.

19. The method of claim 15, wherein the first frame comprises a missing frame.

20. The method of claim 15, wherein producing the interpolation of the first frame comprises:

producing the interpolation of the first frame based on at least the portion of the first encoded data and at least a portion of third encoded data that relates to a third sequence of frames representing a third perspective view based on a weight indicator, the weight indicator specifying an extent to which at least the portion of the first encoded data is to be weighed with respect to at least the portion of the third encoded data.
Patent History
Publication number: 20110157315
Type: Application
Filed: Dec 30, 2010
Publication Date: Jun 30, 2011
Applicant: BROADCOM CORPORATION (Irvine, CA)
Inventors: James D. Bennett (Hroznetin), Jeyhan Karaoguz (Irvine, CA)
Application Number: 12/982,248
Classifications
Current U.S. Class: Picture Signal Generator (348/46); Picture Signal Generators (epo) (348/E13.074)
International Classification: H04N 13/02 (20060101);