METHODS AND APPARATUS FOR MAKING ENVIRONMENTAL MEASUREMENTS AND/OR USING SUCH MEASUREMENTS IN 3D IMAGE RENDERING
Methods and apparatus for making and using environmental measurements are described. Environmental information captured using a variety of devices is processed and combined to generate an environmental model which is communicated to customer playback devices. A UV map which is used for applying, e.g., wrapping, images onto the environmental model is also provided to the playback devices. A playback device uses the environmental model and UV map to render images which are then displayed to a viewer as part of providing a 3D viewing experience. In some embodiments an updated environmental model is generated based on more recent environmental measurements, e.g., measurements performed during the event. The updated environmental model and/or difference information for updating the existing model, optionally along with updated UV map(s), is communicated to the playback devices for use in rendering and playback of subsequently received image content. By communicating updated environmental information, improved 3D simulations are achieved.
The present application claims the benefit of U.S. Provisional Application Ser. No. 62/126,701 filed Mar. 1, 2015, U.S. Provisional Application Ser. No. 62/126,709 filed Mar. 1, 2015, and U.S. Provisional Application Ser. No. 62/127,215 filed Mar. 2, 2015, each of which is hereby expressly incorporated by reference in its entirety.
FIELD
The present invention relates to methods and apparatus for capturing and using environmental information, e.g., measurements and images, to support various applications including the generation and/or display of stereoscopic images which can be used as part of providing a 3D viewing experience.
BACKGROUND
Accurate representation of a 3D environment often requires reliable models of the environment. Such models, when available, can be used during image playback so that objects captured in images of a scene appear to the viewer to be the correct size. Environmental maps can also be used in stitching together different pieces of an image and to facilitate alignment of images captured by different cameras.
While environment maps, when available, can facilitate much more realistic stereoscopic displays than when a simple spherical model of an environment is assumed, there are numerous difficulties associated with obtaining accurate environmental information during an event which may be filmed for later stereoscopic playback. For example, while LIDAR may be used to make environmental measurements of distances relative to a camera position prior to deployment of a stereoscopic camera to capture an event, the laser(s) used for LIDAR measurements may be a distraction or unsuitable for use during an actual event while people are trying to view a concert, game or other activity. In addition, the placement of the camera rig used to capture an event may preclude a LIDAR device being placed at the same location during the event.
Thus it should be appreciated that while LIDAR may be used to make accurate measurements of a stadium or other event location prior to an event, because of the use of laser light as well as the time associated with making LIDAR measurements of an area, LIDAR is not well suited for making measurements of an environment from a camera position during an ongoing event which is to be captured by one or more cameras placed and operated from that camera position.
While LIDAR can be used to make highly accurate distance measurements, for the above discussed reasons it is normally used when a stadium or other event area does not have an ongoing event. As a result, the LIDAR distance measurements normally reflect an empty stadium or event area without people present. In addition, since the LIDAR measurements are normally made before any modifications or display set ups for a particular event, the static environmental map provided by a LIDAR or other measurement system, while in many cases highly accurate with regard to the environment at the time of measurement, often does not accurately reflect the state and shape of an environment during an event such as a sports game, concert or fashion show.
In view of the above discussion it should be appreciated that there is a need for new and improved methods of making environmental measurements and, in particular, measuring the shape of an environment during an event and using the environmental information in simulating the 3D environment. While not necessary for all embodiments, it would be desirable if an environment could be accurately measured during an event with regard to a camera position from which stereoscopic or other images are captured for later playback as part of simulating the 3D environment of the event.
SUMMARY
Methods and apparatus for making and using environmental measurements are described. Environmental information captured using a variety of devices is processed and combined. In some embodiments different devices are used to capture environmental information at different times, rates and/or resolutions. At least some of the environmental information used to map the environment is captured during an event. Such information is combined, in some but not necessarily all embodiments, with environmental information that was captured prior to the event. Depending on the embodiment, a single environmental measurement technique may be used, but in many embodiments multiple environmental measurement techniques are used, with the environmental information, e.g., depth information relative to a camera position, being combined to generate a more reliable and timely environmental map than might be possible if a single source of environmental information were used to generate a depth map.
In various embodiments environmental information is obtained from one or more sources. In some embodiments, a static environmental map or model, such as one produced from LIDAR measurements before an event, is used. LIDAR is a detection system that works on the principle of radar, but uses light from a laser for distance measurement. From LIDAR measurements made from a location to be used as a camera position where a camera is placed for capturing images during the actual event, or from a model of the environment made based on another location but with information about the location of the camera position, a static map of an environment relative to a camera position is generated. The static map provides accurate distance information for the environment in many cases, assuming the environment is unoccupied or has not otherwise changed since the time the measurements used to make the static map were made. Since the static map normally corresponds to an empty environment, the distances indicated in the static depth map are often maximum distances, since objects such as persons, signs, props, etc., are often added to an environment for an event and it is rare that a structure shown in the static map is removed for an event. Thus, the static map can be, and sometimes is, used to provide maximum distance information and to provide information on the overall scale/size of the environment.
In addition to static model information, in some embodiments environmental measurements are made using information captured during an event. The capture of the environmental information during the event involves, in some embodiments, the use of one or more light field cameras which capture images from which depth information can be obtained using known techniques. In some embodiments, light field cameras which provide both images and depth maps generated from the images captured by the light field camera are used. The cameras may be, and sometimes are, mounted on or incorporated into a camera rig which also includes one or more pairs of stereoscopic cameras. Methods for generating depth information from light field cameras are used in some embodiments. For example, image data corresponding to an area or a point in the environment captured by sensor portions corresponding to different lenses of the light field camera's micro lens array can be processed to provide information on the distance to the point or area.
The light field camera has the advantage of being able to passively collect images during an event which can be used to provide distance information. A drawback of the use of a light field camera is that it normally has lower resolution than that of a regular camera due to the use of the lens array over the sensor which effectively lowers the resolution of the individual captured images.
In addition to the images of the light field camera or cameras, the images captured by other cameras including, e.g., stereoscopic camera pairs, can be processed and used to provide depth information. This is possible since the cameras of a stereoscopic pair are spaced apart by a known distance, and this information along with the captured images can be, and in some embodiments is, used to determine the distance from the camera to a point in the environment captured by the cameras in the stereoscopic camera pair. The depth information, in terms of the number of environmental points or locations for which depth can be estimated, may be as high or almost as high as the number of pixels of the image captured by the individual cameras of the stereoscopic pairs, since those cameras do not use a micro lens array over the sensor of the camera.
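The stereoscopic depth recovery described above can be pictured with a short sketch. The function below is an illustrative example, not taken from the application; it assumes rectified cameras and uses the standard triangulation relation (depth = focal length × baseline / disparity), and the default baseline and focal length are hypothetical placeholder values rather than calibrated parameters.

```python
# Illustrative sketch of estimating depth from a stereoscopic camera pair.
# The baseline and focal length defaults are hypothetical placeholder values;
# a real system would use calibrated camera parameters.

def depth_from_disparity(disparity_px, baseline_mm=63.0, focal_length_px=1400.0):
    """Estimate distance (in mm) to a scene point from its pixel disparity
    between the left and right images of a rectified stereo pair."""
    if disparity_px <= 0:
        raise ValueError("disparity must be positive for a finite depth")
    return focal_length_px * baseline_mm / disparity_px
```

Under these assumed parameters, for example, a point with a 10-pixel disparity would be estimated at 1400 × 63 / 10 = 8820 mm from the camera pair.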
While the output of the stereoscopic cameras can be, and in some embodiments is, processed to generate depth information, that depth information may in many cases be less reliable than the depth information obtained from the output of the light field cameras.
In some embodiments, the static model of the environment provides maximum distance information, while the depth information from the light field cameras provides more up to date depth information which normally indicates depths equal to or less than the depths indicated by the static model but which is more timely and which may vary during an event as environmental conditions change. Similarly, the depth information from the images captured by the stereo camera pair or pairs tends to be timely and available from images captured during an event.
In various embodiments the depth information from the different sources, e.g., the static model which may be based on LIDAR measurements prior to an event, depth information from the one or more light field cameras, and depth information generated from the stereoscopic images, is combined, e.g., reconciled. The reconciliation process may involve a variety of techniques or information weighting operations taking into consideration the advantages of different depth information sources and the availability of such information. For example, in one exemplary reconciliation process, LIDAR based depth information obtained from measurements of the environment prior to an event is used to determine maximum depths, e.g., distances, from a camera position and is used in the absence of additional depth information to model the environment. When depth information is available from a light field camera or array of light field cameras, that depth information is used to refine the environmental depth map so that it can reflect changes in the environment during an ongoing event. In some embodiments reconciling depth map information obtained from images captured by a light field camera includes refining the LIDAR based depth map to include shorter depths reflecting the presence of objects in the environment during an event. In some cases reconciling an environmental depth map that is based on light field depth measurements alone, or in combination with information from a static or LIDAR depth map, includes using depth information to further refine the change in depths between points where the depth is known from the output of the light field camera. In this way, the greater number of points of information available from the light field and/or stereoscopic images can be used to refine the depth map based on the output of the light field camera or camera array.
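One simple way to picture the reconciliation step is as a per-point clamp: the static (e.g., LIDAR-derived) map supplies maximum distances, and more recent light field estimates override them where available. The sketch below is a minimal illustration under that assumption, not the application's actual reconciliation or weighting logic.

```python
def reconcile_depths(static_depths, lightfield_depths):
    """Combine a static depth map (treated as per-point maximum distances)
    with more recent light field estimates. Inputs are parallel lists of
    distances in mm; None marks points with no light field estimate."""
    reconciled = []
    for max_depth, lf_depth in zip(static_depths, lightfield_depths):
        if lf_depth is None:
            reconciled.append(max_depth)             # fall back to static maximum
        else:
            reconciled.append(min(lf_depth, max_depth))  # clamp to static maximum
    return reconciled
```

A fuller implementation could weight each source by its reliability, but the clamp captures the core idea that measured depths during an event should not exceed the empty-environment maxima.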
Based on the depth information and/or depth map, a 3D model of the environment, sometimes referred to as an environmental mesh model, is generated in some embodiments. The 3D environmental model may be in the form of a grid map of the environment onto which images can be applied. In some embodiments the environmental model is generated based on environmental measurements, e.g., depth measurements, of the environment of interest performed using a light field camera, e.g., with the images captured by the light field cameras being used to obtain depth information. In some embodiments an environmental model is generated based on measurements of at least a portion of the environment made using a light field camera at a first time, e.g., prior to and/or at the start of an event. The environmental model is communicated to one or more customer devices, e.g., rendering and playback devices, for use in rendering and playback of image content. In some embodiments a UV map which is used to apply, e.g., wrap, images onto the 3D environmental model is also provided to the customer devices.
The application of images to such a model is sometimes called wrapping, since the application has the effect of applying the image, e.g., a 2D image, as if it were being wrapped onto the 3D environmental model. The customer playback devices use the environmental model and UV map to render image content which is then displayed to a viewer as part of providing the viewer a 3D viewing experience.
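As a rough sketch of what wrapping involves, the hypothetical functions below look up, for each mesh vertex, the texel its UV coordinate addresses. Real renderers interpolate UVs across triangles and sample with filtering; that is omitted here, and the nearest-texel convention and v = 0 orientation are assumptions for illustration.

```python
def sample_texture(uv, texture):
    """Map a normalized (u, v) coordinate to a texel of a 2D image,
    given as a list of rows (v = 0 is the top row in this sketch)."""
    height = len(texture)
    width = len(texture[0])
    u, v = uv
    x = min(int(u * width), width - 1)    # clamp to the last column at u = 1.0
    y = min(int(v * height), height - 1)  # clamp to the last row at v = 1.0
    return texture[y][x]

def wrap_image(vertex_uvs, texture):
    """Assign each mesh vertex the texel its UV coordinate points at."""
    return [sample_texture(uv, texture) for uv in vertex_uvs]
```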
Since the environment is dynamic and changes may occur while the event is ongoing as discussed above, in some embodiments updated environmental information is generated to accurately model the environmental changes during the event and provided to the customer devices. In some embodiments the updated environmental information is generated based on measurements of the portion of the environment made using the light field camera at a second time, e.g., after the first time period and during the event. In some embodiments the updated model information communicates a complete updated mesh model. In some embodiments the updated mesh model information includes information indicating changes to be made to the original environmental model to generate an updated model with the updated environmental model information providing new information for portions of the 3D environment which have changed between the first and second time periods.
The updated environmental model and/or difference information for updating the existing model, optionally along with updated UV map(s), is communicated to the playback devices for use in rendering and playback of subsequently received image content. By communicating updated environmental information, improved 3D simulations are achieved.
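The difference-information approach can be sketched as follows: the server side computes which vertices moved between the first-time and second-time models, and the playback device patches only those vertices. The function names, vertex representation, and tolerance below are illustrative assumptions, not the application's actual update format.

```python
def mesh_diff(old_vertices, new_vertices, tolerance=1e-6):
    """Return {vertex_index: new_position} for vertices that changed between
    the original environmental model and the updated one."""
    changes = {}
    for i, (old, new) in enumerate(zip(old_vertices, new_vertices)):
        if any(abs(a - b) > tolerance for a, b in zip(old, new)):
            changes[i] = new
    return changes

def apply_mesh_diff(vertices, changes):
    """Playback-side update: patch only the changed vertices."""
    updated = list(vertices)
    for index, position in changes.items():
        updated[index] = position
    return updated
```

Sending only the changed entries rather than a complete mesh keeps the update small when most of the environment is unchanged between the two measurement times.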
By using the depth map generation techniques described herein, relatively accurate depth maps of a dynamic environment, such as an ongoing concert, sporting event, play, etc. in which items in the environment may move or be changed during the event, can be generated. By communicating the updated depth information, e.g., in the form of a 3D model of the environment or updates to an environmental model, improved 3D simulations can be achieved which can in turn be used for an enhanced 3D playback and/or viewing experience. The improvement in 3D environmental simulation can be achieved over systems which use static depth maps, since the environmental model onto which images captured in the environment are applied will more accurately reflect the actual environment than in cases where the environmental model is static.
It should be appreciated that as changes to the environment in which images are captured by the stereoscopic and/or other camera occur, such changes can be readily and timely reflected in the model of the environment used by a playback device to display the captured images.
Various features relate to the field of panoramic stereoscopic imagery and more particularly to an apparatus suitable for capturing high-definition, high dynamic range, high frame rate, stereoscopic, 360-degree panoramic video using a minimal number of cameras in an apparatus of small size and at reasonable cost while satisfying weight and power requirements for a wide range of applications.
Stereoscopic, 360-degree panoramic video content is increasingly in demand for use in virtual reality displays. In order to produce stereoscopic, 360-degree panoramic video content with 4K or greater resolution, which is important for final image clarity, high dynamic range, which is important for recording low-light content, and high frame rates, which are important for recording detail in fast moving content (such as sports), an array of professional grade, large-sensor, cinematic cameras or other cameras of suitable quality is often needed.
In order for the camera array to be useful for capturing 360-degree, stereoscopic content for viewing in a stereoscopic virtual reality display, the camera array should acquire the content such that the results approximate what the viewer would have seen if his head were co-located with the camera. Specifically, the pairs of stereoscopic cameras should be configured such that their inter-axial separation is within an acceptable delta from the accepted human-model average of 63 mm. Additionally, the distance from the panoramic array's center point to the entrance pupil of a camera lens (aka nodal offset) should be configured such that it is within an acceptable delta from the accepted human-model average of 101 mm.
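A configuration check for these two constraints might look like the sketch below. The 63 mm and 101 mm targets come from the text above; the tolerance values and function names are hypothetical, since the application says only "within an acceptable delta" without specifying one.

```python
HUMAN_INTERAXIAL_MM = 63.0     # accepted human-model average inter-axial separation
HUMAN_NODAL_OFFSET_MM = 101.0  # accepted human-model average nodal offset

def within_delta(measured_mm, target_mm, delta_mm):
    """True if a measured rig dimension is within the allowed delta of its target."""
    return abs(measured_mm - target_mm) <= delta_mm

def check_rig_geometry(interaxial_mm, nodal_offset_mm,
                       interaxial_delta_mm=5.0, nodal_delta_mm=15.0):
    """Check both geometry constraints; the delta defaults are illustrative only."""
    return (within_delta(interaxial_mm, HUMAN_INTERAXIAL_MM, interaxial_delta_mm)
            and within_delta(nodal_offset_mm, HUMAN_NODAL_OFFSET_MM, nodal_delta_mm))
```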
In order for the camera array to be used to capture events and spectator sports where it should be compact and non-obtrusive, it should be constructed with a relatively small physical footprint, allowing it to be deployed in a wide variety of locations and shipped in a reasonably sized container when shipping is required.
The camera array should also be designed such that the minimum imaging distance of the array is small, e.g., as small as possible, which minimizes the “dead zone” where scene elements are not captured because they fall outside the fields of view of adjacent cameras.
It would be advantageous if the camera array could be calibrated for optical alignment by positioning calibration targets where the highest optical distortion is prone to occur (where lens angles of view intersect and the maximum distortion of the lenses occurs). To facilitate the most efficacious calibration target positioning, target locations should be, and in some embodiments are, determined formulaically from the rig design.
While in some embodiments three camera pairs are used such as in the
In other cases where camera cost is not an issue, more than two cameras can be mounted at each position in the rig with the rig holding up to 6 cameras as in the
In
Each camera viewing position includes one camera pair in the
The first camera pair 102 shown in
Second and third camera pairs 104, 106 are the same or similar to the first camera pair 102 but located at 120 and 240 degree camera mounting positions with respect to the front 0 degree position. The second camera pair 104 includes a left camera 105 and left lens assembly 122 and a right camera 107 and right camera lens assembly 122′. The third camera pair 106 includes a left camera 109 and left lens assembly 124 and a right camera 111 and right camera lens assembly 124′.
In
In one particular embodiment the footprint of the camera rig 100 is relatively small. Such a small size allows the camera rig to be placed in an audience, e.g., at a seating position where a fan or attendee might normally be located or positioned. Thus in some embodiments the camera rig is placed in an audience area, allowing a viewer to have a sense of being a member of the audience where such an effect is desired. The footprint in some embodiments corresponds to the size of the base to which the support structure, including in some embodiments a center support rod, is mounted or on which a support tower is located. As should be appreciated, the camera rig in some embodiments can rotate around the center point of the base, which corresponds to the center point between the 3 pairs of cameras. In other embodiments the cameras are fixed and do not rotate around the center of the camera array.
The camera rig 100 is capable of capturing relatively close as well as distant objects. In one particular embodiment the minimum imaging distance of the camera array is 649 mm, but other distances are possible and this distance is in no way critical.
The distance from the center of the camera assembly to the intersection point 151 of the views of the first and third camera pairs represents an exemplary calibration distance which can be used for calibrating images captured by the first and second camera pairs. In one particular exemplary embodiment, an optimal calibration distance, where lens angles of view intersect and the maximum distortion of the lenses occurs, is 743 mm. Note that target 115 may be placed at a known distance from the camera pairs, located at or slightly beyond the area of maximum distortion. The calibration target includes a known fixed calibration pattern. The calibration target can be, and is, used for calibrating the size of images captured by cameras of the camera pairs. Such calibration is possible since the size and position of the calibration target is known relative to the cameras capturing the image of the calibration target 115.
In the
Various features also relate to the fact that the camera support structure and camera configuration can, and in various embodiments do, maintain a nodal offset distance in a range from 75 mm to 350 mm. In one particular embodiment, a nodal offset distance of 315 mm is maintained. The support structure also maintains, in some embodiments, an overall area (aka footprint) in a range from 400 mm2 to 700 mm2. In one particular embodiment, an overall area (aka footprint) of 640 mm2 is maintained. The support structure also maintains a minimal imaging distance in a range from 400 mm to 700 mm. In one particular embodiment, a minimal imaging distance of 649 mm is maintained. In one particular embodiment the optimal calibration distance of the array is where lens angles of view intersect and the maximum distortion of the lenses occurs. In one particular exemplary embodiment this distance is 743 mm.
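The ranges above can be collected into a simple validation sketch. The numeric ranges are the ones stated in the text; the dictionary keys and function name are invented for illustration.

```python
# Ranges taken from the description above; the key names are hypothetical.
RIG_CONSTRAINTS = {
    "nodal_offset_mm": (75, 350),
    "footprint_mm2": (400, 700),
    "min_imaging_distance_mm": (400, 700),
}

def validate_rig(params):
    """Return a list of (name, value) pairs that fall outside their range."""
    violations = []
    for name, (low, high) in RIG_CONSTRAINTS.items():
        value = params[name]
        if not (low <= value <= high):
            violations.append((name, value))
    return violations
```

The particular embodiment described above (315 mm nodal offset, 640 mm2 footprint, 649 mm minimal imaging distance) falls within all three ranges and would pass this check.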
As discussed above, in various embodiments the camera array, e.g., rig, is populated with only 2 of the 6 total cameras which would normally be required for simultaneous 360-degree stereoscopic video, for the purpose of capturing the high-value, foreground 180-degree scene elements in real-time while static images of the lower-value, background 180-degree scene elements are captured manually.
In some embodiments the three pairs (six cameras) of cameras 702, 704 and 706 are mounted on the support structure 720 via the respective camera pair mounting plates 710, 712 and 714. The support structure 720 may be in the form of a slotted mounting plate 720. Slot 738 is exemplary of some of the slots in the plate 720. The slots reduce weight but also allow for adjustment of the position of the camera mounting plates 710, 712, 714 used to support camera pairs or in some cases a single camera.
The support structure 720 includes three different mounting positions for mounting the stereoscopic camera pairs 702, 704, 706, with each mounting position corresponding to a different direction offset 120 degrees from the direction of the adjacent mounting position. In the illustrated embodiment of
The first camera pair mounting plate 710 includes threaded screw holes 741, 741′, 741″ and 741′″ through which screws 740, 740′, 740″ and 740′″ can be inserted, respectively, through slots 738 and 738′ to secure the plate 710 to the support structure 720. The slots allow for adjustment of the position of the support plate 710.
The cameras 750, 750′ of the first camera pair are secured to individual corresponding camera mounting plates 703, 703′ using screws that pass through the bottom of the plates 703, 703′ and extend into threaded holes on the bottom of the cameras 750, 750′. Once secured to the individual mounting plates 703, 703′, the cameras 750, 750′ and mounting plates 703, 703′ can be secured to the camera pair mounting plate 710 using screws. Screws 725, 725′, 725″ (which is not fully visible) and 725′″ pass through corresponding slots 724 into threaded holes 745, 745′, 745″ and 745′″ of the camera pair mounting plate 710 to secure the camera plate 703 and camera 750 to the camera pair mounting plate 710. Similarly, screws 727, 727′ (which is not fully visible), 727″ and 727′″ pass through corresponding slots 726, 726′, 726″ and 726′″ into threaded holes 746, 746′, 746″ and 746′″ of the camera pair mounting plate 710 to secure the camera plate 703′ and camera 750′ to the camera pair mounting plate 710.
The support structure 720 has standoff rollers 732, 732′ mounted to reduce the risk that an object moving past the support structure will get caught on it as it moves nearby. This reduces the risk of damage to the support structure 720. Furthermore, by having a hollow area behind each roller, an impact to a roller is less likely to be transferred to the main portion of the support structure. That is, the void behind the rollers 732, 732′ allows for some deformation of the bar portion of the support structure on which the standoff roller 732′ is mounted without damage to the main portion of the support structure, including the slots used to secure the camera mounting plates.
In various embodiments the camera rig 400 includes a base 722 to which the support structure 720 is rotatably mounted, e.g., by a shaft or threaded rod extending through the center of the base into the support plate 720. Thus in various embodiments the camera assembly on the support structure 720 can be rotated 360 degrees around an axis that passes through the center of the base 722. In some embodiments the base 722 may be part of a tripod or another mounting device. The tripod includes legs formed by pairs of tubes (742, 742′), (742″ and 742′″) as well as an additional leg which is not visible in
The assembly 400 shown in
In
In the drawing 500 the camera pairs 702, 704, 706 can be seen mounted on the support structure 720, with at least the camera pair mounting plate 710 being visible in the illustrated drawing. In addition to the elements of camera rig 400 already discussed above with regard to
The simulated ears 730, 730′ are mounted on a support bar 510 which includes the microphones for capturing sound. The audio capture system 730, 732, 810 is supported by a movable arm 514 which can be moved via handle 515.
While
In other embodiments the camera rig 400 includes a single stereoscopic camera pair 702 and one camera mounted in each of the second and third positions normally used for a pair of stereoscopic cameras. In such an embodiment a single camera is mounted to the rig in place of the second camera pair 704 and another single camera is mounted to the camera rig in place of the camera pair 706. Thus, in such an embodiment, the second camera pair 704 may be thought of as being representative of a single camera and the camera pair 706 may be thought of as being illustrative of the additional single camera.
In some embodiments the camera rig 801 includes one light field camera (e.g., camera 802) and two other cameras (e.g., cameras 804, 806) forming a stereoscopic camera pair on each longer side of the camera rig 801. In some such embodiments there are four such longer sides (also referred to as the four side faces 830, 832, 834 and 836), with each longer side having one light field camera and one stereoscopic camera pair, e.g., light field camera 802 and stereoscopic camera pair 804, 806 on one longer side 836 to the left, while another light field camera 810 and stereoscopic camera pair 812, 814 on the other longer side 830 to the right can be seen in drawing 800. While the other two side faces are not fully shown in drawing 800, they are shown in more detail in
In addition to the stereoscopic camera pair and the light field camera on each of the four side faces 830, 832, 834 and 836, in some embodiments the camera rig 801 further includes a camera 825 facing in the upward vertical direction, e.g., towards the sky or another top ceiling surface in the case of a closed environment, on the top face 840 of the camera rig 801. In some such embodiments the camera 825 on the top face of the camera rig 801 is a light field camera. While not shown in drawing 800, in some other embodiments the top face 840 of the camera rig 801 also includes, in addition to the camera 825, another stereoscopic camera pair for capturing left and right eye images. While in normal circumstances the top hemisphere (also referred to as the sky portion) of a 360 degree environment, e.g., stadium, theater, concert hall etc., captured by the camera 825 may not include action and/or may remain static, in some cases it may be important or desirable to capture the sky portion at the same rate as the other environmental portions are being captured by the other cameras on the rig 801.
While one exemplary camera array arrangement is shown and discussed above with regard to camera rig 801, in some other implementations, instead of just a single light field camera (e.g., such as cameras 802 and 810) arranged on top of a pair of stereoscopic cameras (e.g., cameras 804, 806 and 812, 814) on the four faces 830, 832, 834, 836 of the camera rig 801, the camera rig 801 includes an array of light field cameras arranged with each stereoscopic camera pair. For example, in some embodiments there are 3 light field cameras arranged on top of the stereoscopic camera pair on each of the longer sides of the camera rig 801. In another embodiment there are 6 light field cameras arranged on top of the stereoscopic camera pair on each of the longer sides of the camera rig 801, e.g., with two rows of 3 light field cameras arranged on top of the stereoscopic camera pair. Some of such variations are discussed with regard to
In some embodiments the camera rig 801 may be mounted on a support structure such that it can be rotated around a vertical axis. In various embodiments the camera rig 801 may be deployed in an environment of interest, e.g., such as a stadium, auditorium, or another place where an event to be captured is taking place. In some embodiments the light field cameras of the camera rig 801 are used to capture images of the environment of interest, e.g., a 360 degree scene area of interest, and generate depth maps which can be used in simulating a 3D environment and displaying stereoscopic imaging content.
In drawing 900 various components of the cameras on two out of the four side faces 830, 832, 834, 836 of the camera rig 801 are shown. The lens assemblies 902, 904 and 906 correspond to cameras 802, 804 and 806 respectively of side face 836 of the camera rig 801. Lens assemblies 910, 912 and 914 correspond to cameras 810, 812 and 814 respectively of side face 830, while lens assembly 925 corresponds to camera 825 on the top face of the camera rig 801. Also shown in drawing 900 are three side support plates 808, 808′, and 808′″ which support the top and bottom cover plates 805 and 842 of the camera rig 801. The side support plates 808, 808′, and 808′″ are secured to the top cover 805 and bottom base cover 842 via the corresponding pairs of screws shown in the Figure. For example, the side support plate 808 is secured to the top and bottom cover plates 805, 842 via the screw pairs 951 and 956, the side support plate 808′ is secured to the top and bottom cover plates 805, 842 via the screw pairs 952 and 954, and the side support plate 808′″ is secured to the top and bottom cover plates 805, 842 via the screw pairs 950 and 958. The camera rig 801 in some embodiments includes a base support 960 secured to the bottom cover plate 842 via a plurality of screws. In some embodiments, via the base support 960, the camera rig may be mounted on a support structure such that it can be rotated around a vertical axis, e.g., an axis going through the center of the base support 960. The external support structure may be a tripod or another platform.
As can be seen in drawing 1000, the assemblies of cameras on each of the four side faces 830, 832, 834, 836 (small arrows pointing towards the faces) and the top face 840 of the camera rig 801 face in different directions. The cameras on the side faces 830, 832, 834, 836 of the camera rig 801 are pointed in the horizontal direction (e.g., perpendicular to the corresponding face) while the camera(s) on the top face 840 is pointed in the upward vertical direction. For example as shown in
While the stereoscopic cameras of the camera rigs 801 and 1101 are used to capture stereoscopic imaging content, e.g., during an event, the use of light field cameras allows for scanning the scene area of interest and generating depth maps of various portions of the scene area captured by the light field cameras (e.g., from the captured images corresponding to these portions of the scene of interest). In some embodiments the depth maps of various portions of the scene area may be combined to generate a composite depth map of the scene area. Such depth maps and/or the composite depth map may be, and in some embodiments are, provided to a playback device for use in displaying stereoscopic imaging content and simulating a 3D environment which can be experienced by the viewers.
The use of light field cameras in combination with the stereoscopic cameras allows for environmental measurements and generation of environmental depth maps in real time, e.g., during an event being shot, thus obviating the need for environmental measurements to be performed offline ahead of time, prior to the start of an event, e.g., a football game.
While the depth map generated from each image corresponds to a portion of the environment to be mapped, in some embodiments the depth maps generated from individual images are processed, e.g., stitched together, to form a composite map of the complete environment scanned using the light field cameras. Thus, by using the light field cameras, a relatively complete environmental map can be, and in some embodiments is, generated.
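The stitching of per-camera depth maps into a composite map can be sketched as follows. This is a minimal illustration rather than the actual method described above: it assumes, hypothetically, that each light field camera contributes a rectangular depth tile covering a known azimuth slice of a cylindrical panorama, and that overlapping measurements are simply averaged. The function name and tile representation are not from the text.

```python
import numpy as np

def stitch_depth_maps(tiles, pano_width):
    """Combine per-camera depth tiles into one panoramic depth map.

    `tiles` is a list of (start_col, depth_tile) pairs, where each
    depth_tile is an H x W array covering a slice of the panorama
    starting at column `start_col` (columns wrap around the 360 degree
    panorama).  Overlapping samples are averaged; columns no tile
    covers are left at 0.
    """
    height = tiles[0][1].shape[0]
    acc = np.zeros((height, pano_width))      # summed depth samples
    weight = np.zeros((height, pano_width))   # number of samples per pixel
    for start_col, tile in tiles:
        cols = (np.arange(tile.shape[1]) + start_col) % pano_width
        acc[:, cols] += tile
        weight[:, cols] += 1.0
    covered = weight > 0
    acc[covered] /= weight[covered]
    return acc
```

A production system would additionally align the tiles (the cameras are not perfectly co-located) and weight samples by confidence, but the accumulate-and-normalize structure would be similar.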
In the case of light field cameras, an array of micro-lenses captures enough information that one can refocus images after acquisition. It is also possible to shift, after image capture, one's viewpoint within the sub-apertures of the main lens, effectively obtaining multiple views. In the case of a light field camera, depth cues from both defocus and correspondence are available simultaneously in a single capture. This can be useful when attempting to fill in occluded information/scene portions not captured by the stereoscopic cameras.
The depth maps generated from the light field camera outputs will be current and are likely to accurately reflect changes in a stadium or other environment of interest for a particular event, e.g., a concert or game to be captured by a stereoscopic camera. In addition, by measuring the environment from the same location, or near the location, at which the stereoscopic cameras are mounted, the environmental map, at least in some embodiments, accurately reflects the environment as it is likely to be perceived from the perspective of the stereoscopic cameras that are used to capture the event.
In some embodiments images captured by the light field cameras can be processed and used to fill in for portions of the environment which are not captured by a stereoscopic camera pair, e.g., because the position and/or field of view of the stereoscopic camera pair may be slightly different from that of the light field camera and/or due to an obstruction of view from the stereoscopic cameras. For example, when the light field camera is facing rearward relative to the position of the stereoscopic pair it may capture a rear facing view not visible to a forward facing stereoscopic camera pair. In some embodiments output of the light field camera is provided to a playback device separately or along with image data captured by the stereoscopic camera pairs. The playback device can use all or portions of the images captured by the light field camera when a scene area not sufficiently captured by the stereoscopic camera pairs is to be displayed. In addition, a portion of an image captured by the light field camera may be used to fill in a portion of a stereoscopic image that was occluded from view from the position of the stereoscopic camera pair but which a user expects to be able to see when he or she shifts his or her head to the left or right relative to the default viewing position corresponding to the location of the stereoscopic camera pair. For example, if a user leans to the left or right in an attempt to peer around a column obstructing his/her view, in some embodiments content from one or more images captured by the light field camera will be used to provide the image content which was not visible to the stereoscopic camera pair but which is expected to be visible to the user from the shifted head position the user achieves during playback by leaning left or right.
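The fill-in of occluded scene portions described above can be sketched as a per-pixel substitution. Purely for illustration, this assumes the light field image has already been warped into the stereoscopic camera's viewpoint and that a boolean occlusion mask identifying the missing pixels is available; both assumptions and all names are hypothetical.

```python
import numpy as np

def fill_occlusions(stereo_img, occlusion_mask, lf_img):
    """Fill occluded pixels of a stereoscopic view from a light field view.

    `occlusion_mask` is True where the stereo pair had no visible data
    (e.g., behind a column).  Because this sketch assumes the light
    field image is already warped into the stereo camera's viewpoint,
    the fill reduces to a per-pixel substitution that leaves the
    original image untouched.
    """
    out = stereo_img.copy()
    out[occlusion_mask] = lf_img[occlusion_mask]
    return out
```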
Various exemplary camera rigs illustrated in
The processing system 1408 is configured to process imaging data received from the one or more light field cameras 1404 and one or more stereoscopic cameras included in the stereoscopic imaging system 1406, in accordance with the invention. The processing performed by the processing system 1408 includes generating a depth map of the environment of interest, generating 3D mesh models and UV maps, and communicating them to one or more playback devices in accordance with some features of the invention. The processing performed by the processing system 1408 further includes processing and encoding stereoscopic image data received from the stereoscopic imaging system 1406 and delivering it to one or more playback devices for use in rendering/playback of stereoscopic content generated from the stereoscopic cameras.
In some embodiments the processing system 1408 may include a server with the server responding to requests for content, e.g., depth map corresponding to environment of interest and/or 3D mesh model and/or imaging content. The playback devices may, and in some embodiments do, use such information to simulate a 3D environment and render 3D image content. In some but not all embodiments the imaging data, e.g., depth map corresponding to environment of interest and/or imaging content generated from images captured by the light field camera device of the imaging apparatus 1404, is communicated directly from the imaging apparatus 1404 to the customer playback devices over the communications network 1450.
The processing system 1408 is configured to stream, e.g., transmit, imaging data and/or information to one or more customer devices, e.g., over the communications network 1450. Via the network 1450, the processing system 1408 can send and/or exchange information with the devices located at the customer premises 1410, 1412 as represented in the figure by the link 1409 traversing the communications network 1450. The imaging data and/or information may be encoded prior to delivery to one or more playback devices.
Each customer premise 1410, 1412 may include a plurality of devices/players, which are used to decode and playback/display the imaging content, e.g., captured by stereoscopic cameras 1406 and/or other cameras deployed in the system 100. The imaging content is normally processed and communicated to the devices by the processing system 1408. The customer premise 1 1410 includes a decoding apparatus/playback device 1422 coupled to a display device 1420 while customer premise N 1412 includes a decoding apparatus/playback device 1426 coupled to a display device 1424. In some embodiments the display devices 1420, 1424 are head mounted stereoscopic display devices. In some embodiments the playback devices 1422, 1426 receive and use the depth map of the environment of interest and/or 3D mesh model and UV map received from the processing system 1408 in displaying stereoscopic imaging content generated from stereoscopic content captured by the stereoscopic cameras.
In various embodiments playback devices 1422, 1426 present the imaging content on the corresponding display devices 1420, 1424. The playback devices 1422, 1426 may be devices which are capable of decoding stereoscopic imaging content captured by the stereoscopic cameras, generating imaging content using the decoded content, and rendering the imaging content, e.g., 3D image content, on the display devices 1420, 1424. In various embodiments the playback devices 1422, 1426 receive the image data and depth maps and/or 3D mesh models from the processing system 1408 and use them to display 3D image content.
The method starts in step 1502, e.g., with the imaging system being powered on and initialized. The method proceeds from start step 1502 to steps 1504, 1506, 1508 which may be performed in parallel by different elements of the imaging system, e.g., one or more cameras and a processing system.
In step 1504 the processing system acquires a static environmental depth map corresponding to an environment of interest, e.g., by downloading it to the system and/or loading it onto the processing system from a storage medium including the environmental depth map. The environment of interest may be, e.g., a stadium, an auditorium, a field etc. where an event of interest takes place. In various embodiments the event is captured, e.g., recorded, by one or more camera devices including stereoscopic cameras and light field cameras. The static environmental depth map includes environmental measurements of the environment of interest that have been previously made, e.g., prior to the event, and thus are called static. Static environmental depth maps for various well-known environments of interest, e.g., known stadiums, auditoriums etc., where events occur are readily available; however, such environmental depth maps do not take into consideration dynamic changes to the environment that may occur during an event and/or other changes that may have occurred since the time when the environmental measurements were made. The static depth map of the environment of interest may be generated using various measurement techniques, e.g., using LIDAR and/or other methods. Operation proceeds from step 1504 to step 1510. While in various embodiments the processing system acquires the static depth map when available, in cases when the static depth map is not available operation still proceeds to step 1510.
In step 1510 it is checked if the static depth map is available, e.g., to the processing system. If the static depth map is available the operation proceeds from step 1510 to step 1512 otherwise the operation proceeds to step 1518. In step 1512 the processing system sets the current depth map (e.g., base environmental depth map to be used) to be the static depth map. In some embodiments when the system is initialized and depth maps from other sources are not available then the processing system initially sets the current depth map to be the static depth map. Operation proceeds from step 1512 to step 1518.
Referring to steps along the path corresponding to step 1506. In step 1506 stereoscopic image pairs of portions of the environment of interest, e.g., left and right eye images, are captured using one or more stereoscopic camera pair(s). In some embodiments the stereoscopic camera pair(s) capturing the images are mounted on the camera rigs implemented in accordance with various embodiments discussed above. Operation proceeds from step 1506 to step 1514. In step 1514 the captured stereoscopic image pairs are received at the processing system. Operation proceeds from step 1514 to step 1516. In step 1516 an environmental depth map (e.g., composite depth map of the environment of interest) is generated from the one or more stereoscopic image pairs. Operation proceeds from step 1516 to step 1518.
Returning to step 1518. In step 1518 the processing system determines if the environmental depth map generated from the one or more stereoscopic image pairs is available (for example in some cases when the stereoscopic camera pair(s) have not started capturing stereoscopic images and/or the environmental depth map has not yet been generated, the environmental depth map based on the stereoscopic images may not be available to the processing system). If in step 1518 it is determined that environmental depth map generated from the one or more stereoscopic image pairs is available the operation proceeds from step 1518 to step 1520 otherwise the operation proceeds to step 1530.
In step 1520 it is determined if a current depth map has already been set. If it is determined that the current depth map has not been set, the operation proceeds to step 1522 where the processing system sets the current depth map to be the environmental depth map generated from the one or more stereoscopic image pairs. Operation proceeds from step 1522 to step 1530. If in step 1520 it is determined that the current depth map has already been set (e.g., the static depth map may have been set as the current depth map), the operation proceeds to step 1524 where the processing system reconciles the environmental depth map generated from the one or more stereoscopic image pairs with the current depth map. After the reconciling operation completes, the reconciled environmental depth map is set as the current depth map. In various embodiments the reconciled depth map has more complete and enhanced depth information compared to either one of the two individual depth maps used for reconciliation. Operation proceeds from step 1524 to step 1530.
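The reconciling operation itself is not specified above. One plausible per-pixel rule, shown only as an illustrative sketch, keeps a measurement where only one of two aligned depth maps has data and averages where both do; the `no_data` convention and the function name are assumptions, not details from the text.

```python
import numpy as np

def reconcile_depth_maps(current, new, no_data=0.0):
    """One plausible reconciliation of two aligned environmental depth maps.

    Where only one map has a measurement (the other holds `no_data`),
    that measurement is kept; where both have measurements they are
    averaged, so the reconciled map has at least as much depth
    information as either input.
    """
    cur_ok = current != no_data
    new_ok = new != no_data
    out = np.where(new_ok, new, current)           # prefer new measurements
    both = cur_ok & new_ok
    out[both] = (current[both] + new[both]) / 2.0  # average where both measured
    return out
```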
Referring to steps along the path corresponding to step 1508. In step 1508 images of portions of the environment of interest are captured using one or more light field cameras. In some embodiments the one or more light field cameras capturing the images are mounted on the camera rigs implemented in accordance with various embodiments discussed above. Operation proceeds from step 1508 to step 1526. In step 1526 the images captured by the light field cameras are received at the processing system optionally along with depth maps of the portions of the environment of interest. Thus in some embodiments the one or more light field cameras generate depth maps of portions of the environment from the captured images and provide them to the processing system. In some other embodiments the images captured by the light field cameras are provided and the processing system generates depth maps of portions of the environment of interest. Operation proceeds from step 1526 to step 1528. In step 1528 an environmental depth map (e.g., composite depth map of the environment of interest) is generated from the one or more received images captured by the light field cameras and/or from the depth maps of portions of the environment of interest. Operation proceeds from step 1528 to step 1530.
Returning to step 1530. In step 1530 the processing system determines if the environmental depth map, generated from the images captured by the light field cameras or from the depth maps of one or more portions of the environment of interest, is available to the processing system. If in step 1530 it is determined that the environmental depth map is available the operation proceeds from step 1530 to step 1532, otherwise the operation proceeds to step 1542 via connecting node B 1540.
In step 1532 it is determined if a current depth map has already been set. If it is determined that the current depth map has not been set, the operation proceeds from step 1532 to step 1534 where the processing system sets the current depth map to be the environmental depth map generated from the one or more received images captured by the light field cameras and/or from the depth maps of portions of the environment of interest. Operation proceeds from step 1534 to step 1546 via connecting node A 1538. If in step 1532 it is determined that the current depth map has already been set (e.g., the static depth map and/or the environmental depth map generated from stereoscopic images and/or a reconciled depth map may have been set as the current depth map), the operation proceeds to step 1536 where the processing system reconciles the environmental depth map generated in step 1528 from the one or more received images captured by the light field cameras with the current depth map. After the reconciling operation completes, the reconciled environmental depth map is set as the current depth map. Operation proceeds from step 1536 to step 1546 via connecting node A 1538.
If in step 1530 it is determined that environmental depth map is not available the operation proceeds from step 1530 to step 1542 via connecting node B 1540. In step 1542 it is determined if a current depth map has already been set. If it is determined that the current depth map has not been set, the operation proceeds from step 1542 to step 1544 where the processing system sets the current depth map to a default depth map corresponding to a sphere since no other environmental depth map is available to the processing system. Operation proceeds from step 1544 to step 1546.
In step 1542, if it is determined that a current depth map has already been set (e.g., set to one of the generated/reconciled environmental depth maps, the static depth map, or the default sphere environmental depth map), the operation proceeds from step 1542 to step 1546.
Returning to step 1546. In step 1546 the processing system outputs the current depth map. The current environmental depth map may be, and in various embodiments is, provided to one or more customer rendering and playback devices, e.g., for use in displaying 3D imaging content. The environmental depth map may be generated multiple times during an event, e.g., a game and/or other performance, as things may change dynamically during the event in ways which impact the environment of interest; thus updating the environmental depth map to keep it current is useful if the system is to provide information and imaging content which can be used to provide a real life 3D experience to the viewers. It should be appreciated that the method discussed with regard to flowchart 1500 allows for generating an enhanced and improved environmental depth map based on depth information from multiple sources, e.g., static depth maps, depth maps generated using images captured by one or more stereoscopic camera pairs and/or depth maps generated using images captured by one or more light field cameras.
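The overall selection logic of flowchart 1500 — fold in whichever of the static, stereoscopic-derived, and light-field-derived depth maps are available, reconciling each available map against the current one, and fall back to a default sphere map when none is available — can be sketched as below. The function and its `reconcile` callback are hypothetical names for illustration only.

```python
def select_current_depth_map(static_map, stereo_map, lf_map,
                             reconcile, default_sphere_map):
    """Sketch of the depth-map selection logic of flowchart 1500.

    Maps that are unavailable are passed as None.  The first available
    map becomes the current map; each further available map is folded
    in via `reconcile`, mirroring steps 1510-1536.  If nothing is
    available, the default sphere map is used (step 1544).
    """
    current = None
    for candidate in (static_map, stereo_map, lf_map):
        if candidate is None:
            continue
        current = candidate if current is None else reconcile(current, candidate)
    return current if current is not None else default_sphere_map
```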
Operation proceeds from step 1554 to 1556. In step 1556 a first 3D mesh model is generated from the current environmental depth map. Operation proceeds from step 1556 to 1558. In step 1558 a first UV map to be used for wrapping frames (e.g., frames of images) onto the first 3D mesh model is generated. Operation proceeds from step 1558 to 1560 wherein the first 3D mesh model and the first UV map are communicated, e.g., transmitted, to a playback device.
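Generating a mesh model and its companion UV map from a depth map might, under simple assumptions, look like the following sketch: each sample of a full 360 degree panoramic depth map becomes one vertex placed at the measured distance, and the same grid position yields the normalized (u, v) coordinate used to wrap captured frames onto the mesh. Triangle index generation is omitted and all names are illustrative, not from the text.

```python
import math

def depth_map_to_mesh_and_uv(depth, rows, cols):
    """Build a simple 3D mesh and matching UV map from a depth map.

    `depth[r][c]` gives the measured distance at elevation row r and
    azimuth column c of a full 360 degree panorama.  Each grid point
    becomes one vertex at that distance; the UV map sends the same grid
    point to normalized (u, v) frame coordinates so captured frames can
    be wrapped onto the mesh.
    """
    vertices, uvs = [], []
    for r in range(rows):
        # elevation from +90 degrees (straight up) down to -90 degrees
        phi = math.pi / 2 - math.pi * r / max(rows - 1, 1)
        for c in range(cols):
            theta = 2 * math.pi * c / cols  # azimuth around the vertical axis
            d = depth[r][c]
            vertices.append((d * math.cos(phi) * math.cos(theta),
                             d * math.sin(phi),
                             d * math.cos(phi) * math.sin(theta)))
            uvs.append((c / cols, r / max(rows - 1, 1)))
    return vertices, uvs
```

With a constant depth map this degenerates to a sphere, matching the default sphere model used when no measurements are available; measured depths deform the sphere to fit the actual environment.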
Operation proceeds from step 1560 to step 1562. In step 1562 the processing system initializes a current 3D mesh model and UV map to the first 3D mesh model and the first UV map respectively, e.g., by setting the current 3D mesh model as the first 3D mesh model and current UV map as the first UV map. Operation proceeds from step 1562 to step 1564. In step 1564 the processing system receives current environmental depth map, e.g., a new environmental depth map.
Operation proceeds from step 1564 to step 1566 where it is determined whether the current environmental depth map reflects a significant environmental change from the environmental depth map used to generate the current 3D mesh model. In some embodiments, the system processing the depth information monitors the depth information to detect a significant change in the depth information, e.g., a change in depth information over a predetermined amount. In some embodiments detection of such a significant change triggers updating of the current mesh model and/or UV map. Thus if in step 1566 it is determined that a significant environmental change is detected between the current environmental depth map and the environmental depth map used to generate the current 3D mesh model, the operation proceeds to step 1568, otherwise the operation proceeds back to step 1564.
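The "change over a predetermined amount" test might be implemented, for example, as a threshold on the fraction of depth samples that moved. Both threshold values below are illustrative placeholders, not values from the text.

```python
import numpy as np

def significant_change(current_depth, model_depth,
                       threshold=0.5, min_fraction=0.01):
    """Decide whether a new depth map warrants regenerating the mesh model.

    Flags a significant change when more than `min_fraction` of the
    depth samples moved by more than `threshold` (in the depth map's
    units, e.g., meters).  Both values are hypothetical defaults.
    """
    moved = np.abs(current_depth - model_depth) > threshold
    return moved.mean() > min_fraction
```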
Following the determination that a significant environmental change is detected, in step 1568 the processing system generates an updated 3D mesh model from the new current environmental depth map. Operation proceeds from step 1568 to step 1570. In step 1570 an updated UV map to be used for wrapping frames onto the updated 3D mesh model is generated.
Operation proceeds from step 1570 to step 1574 via connecting node M 1572. In step 1574 3D mesh model difference information is generated. In various embodiments the 3D mesh model difference information includes information reflecting the difference between the new updated 3D mesh model and the currently used 3D mesh model, e.g., the first 3D mesh model. In some cases communicating the difference information to a playback device is more efficient than communicating the entire updated 3D mesh model. In such cases, by using the received difference information, the playback device can, and in various embodiments does, update its current 3D mesh model to generate an updated mesh model. While the 3D mesh model difference information is generated in some embodiments, e.g., where it is determined that it is more convenient and/or efficient to send difference information rather than the entire updated mesh model, step 1574 is optional and not necessarily performed in all embodiments. Operation proceeds from step 1574 to step 1576. In step 1576, which is also optional, UV map difference information is generated, where the UV map difference information reflects the difference between the new updated UV map and the currently used UV map, e.g., the first UV map.
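Mesh model difference information could, for instance, be encoded as a sparse list of changed vertices, assuming (hypothetically) that the updated mesh keeps the same topology, i.e., the same vertex count and triangle indices, as the current one; the playback device then applies the diff by overwriting those entries. The encoding below is purely illustrative.

```python
def mesh_difference(old_vertices, new_vertices, eps=1e-6):
    """Encode a mesh update as a sparse list of changed vertices.

    Assumes both meshes share the same topology, so the difference is
    just the set of vertex indices whose (x, y, z) position moved by
    more than `eps`, paired with the new positions.
    """
    diff = []
    for i, (old, new) in enumerate(zip(old_vertices, new_vertices)):
        if any(abs(o - n) > eps for o, n in zip(old, new)):
            diff.append((i, new))
    return diff

def apply_mesh_difference(vertices, diff):
    """Update a current mesh in place from received difference information."""
    for i, new in diff:
        vertices[i] = new
    return vertices
```

When only a small part of the environment changes, such a diff is far smaller than retransmitting every vertex, which is the efficiency argument made above.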
Operation proceeds from step 1576 to step 1578. In step 1578 the processing system communicates updated 3D mesh model information, e.g., the generated updated 3D mesh model or the mesh model difference information, to a playback device. Operation proceeds from step 1578 to step 1580. In step 1580 the processing system communicates updated UV map information, e.g., the generated updated UV map or the UV map difference information, to a playback device.
Operation proceeds from step 1580 to step 1582. In step 1582 the processing system sets the current 3D mesh model to be the updated 3D mesh model. Operation proceeds from step 1582 to step 1584. In step 1584 the processing system sets the current UV map to be the updated UV map. It should be appreciated that the updated mesh model and UV map are based on current depth measurements, making the new mesh model and/or UV map more accurate than the older mesh models and/or maps based on depth measurements taken at a different time. Operation proceeds from step 1584 back to 1564 via connecting node N 1585 and the process continues in the manner discussed above.
The display device 1602 may be, and in some embodiments is, a touch screen, used to display images, video, information regarding the configuration of the camera device, and/or status of data processing being performed on the camera device. In the case where the display device 1602 is a touch screen, the display device 1602 serves as an additional input device and/or as an alternative to the separate input device, e.g., buttons, 1604. The input device 1604 may be, and in some embodiments is, e.g., keypad, touch screen, or similar device that may be used for inputting information, data and/or instructions.
Via the I/O interface 1606 the camera device 1600 may be coupled to external devices and exchange information and signaling with such external devices. In some embodiments, via the I/O interface 1606, the camera 1600 may, and in some embodiments does, interface with the processing system 1408. In some such embodiments the processing system 1408 can be used to configure and/or control the camera 1600.
The network interface 1614 allows the camera device 1600 to be able to receive and/or communicate information to an external device over a communications network. In some embodiments via the network interface 1614 the camera 1600 communicates captured images and/or generated depth maps to other devices and/or systems over a communications network, e.g., internet and/or other network.
The optical chain 1610 includes a micro lens array 1624 and an image sensor 1626. The camera 1600 uses the micro lens array 1624 to capture light information of a scene of interest coming from more than one direction when an image capture operation is performed by the camera 1600.
The memory 1612 includes various modules and routines, which when executed by the processor 1608 control the operation of the camera 1600 in accordance with the invention. The memory 1612 includes control routines 1620 and data/information 1622. The processor 1608, e.g., a CPU, executes control routines and uses data/information 1622 to control the camera 1600 to operate in accordance with the invention and implement one or more steps of the method of flowchart 1500. In some embodiments the processor 1608 includes an on-chip depth map generation circuit 1607 which generates depth maps of various portions of the environment of interest from captured images corresponding to these portions of the environment of interest which are captured during the operation of the camera 1600 in accordance with the invention. In some other embodiments the camera 1600 provides captured images 1628 to the processing system 1408 which generates depth maps using the images captured by the light field camera 1600. The depth maps of various portions of the environment of interest generated by the camera 1600 are stored in the memory 1612 as depth maps 1630 while images corresponding to one or more portions of the environment of interest are stored as captured image(s) 1628. The captured images and depth maps are stored in memory 1612 for future use, e.g., additional processing, and/or transmission to another device. In various embodiments the depth maps 1630 generated by the camera 1600 and one or more captured images 1628 of portions of the environment of interest captured by the camera 1600 are provided to the processing system 1408, e.g., via interface 1606 and/or 1614, for further processing and actions in accordance with the features of the invention. In some embodiments the depth maps and/or captured images are provided, e.g., communicated by the camera 1600, to one or more customer devices.
The processing system 1700 may be, and in some embodiments is, used to perform composite environmental depth map generation operation, multi-rate encoding operation, storage, and transmission and/or content output in accordance with the features of the invention. The processing system 1700 may also include the ability to decode and display processed and/or encoded image data, e.g., to an operator.
The system 1700 includes a display 1702, input device 1704, input/output (I/O) interface 1706, a processor 1708, network interface 1710 and a memory 1712. The various components of the system 1700 are coupled together via bus 1709 which allows for data to be communicated between the components of the system 1700.
The memory 1712 includes various routines and modules which when executed by the processor 1708 control the system 1700 to implement the composite environmental depth map generation, environmental depth map reconciling, encoding, storage, and streaming/transmission and/or output operations in accordance with the invention.
The display device 1702 may be, and in some embodiments is, a touch screen, used to display images, video, information regarding the configuration of the processing system 1700, and/or indicate the status of the processing being performed on the processing device. In the case where the display device 1702 is a touch screen, the display device 1702 serves as an additional input device and/or as an alternative to the separate input device, e.g., buttons, 1704. The input device 1704 may be, and in some embodiments is, e.g., a keypad, touch screen, or similar device that may be used for inputting information, data and/or instructions.
Via the I/O interface 1706 the processing system 1700 may be coupled to external devices and exchange information and signaling with such external devices, e.g., such as the camera rig 801 and/or other camera rigs shown in the figures and/or the light field camera 1600. The I/O interface 1706 includes a transmitter and a receiver. In some embodiments, via the I/O interface 1706, the processing system 1700 receives images captured by various cameras, e.g., stereoscopic camera pairs and/or light field cameras (e.g., camera 1600), which may be part of a camera rig such as camera rig 801.
The network interface 1710 allows the processing system 1700 to be able to receive and/or communicate information to an external device over a communications network, e.g., such as communications network 105. The network interface 1710 includes a multiport broadcast transmitter 1740 and a receiver 1742. The multiport broadcast transmitter 1740 allows the processing system 1700 to broadcast multiple encoded stereoscopic data streams each supporting different bit rates to various customer devices. In some embodiments the processing system 1700 transmits different portions of a scene, e.g., 180 degree front portion, left rear portion, right rear portion etc., to customer devices via the multiport broadcast transmitter 1740. Furthermore, in some embodiments via the multiport broadcast transmitter 1740 the processing system 1700 also broadcasts a current environmental depth map to the one or more customer devices. While the multiport broadcast transmitter 1740 is used in the network interface 1710 in some embodiments, still in some other embodiments the processing system transmits, e.g., unicasts, the environmental depth map, 3D mesh model, UV map, and/or stereoscopic imaging content to individual customer devices.
The memory 1712 includes various modules and routines, which when executed by the processor 1708 control the operation of the system 1700 in accordance with the invention. The processor 1708, e.g., a CPU, executes control routines and uses data/information stored in memory 1712 to control the system 1700 to operate in accordance with the invention and implement one or more steps of the method of flowchart of
In some embodiments the modules are implemented as software modules. In other embodiments the modules are implemented outside the memory 1712 in hardware, e.g., as individual circuits with each module being implemented as a circuit for performing the function to which the module corresponds. In still other embodiments the modules are implemented using a combination of software and hardware. In the embodiments where one or more modules are implemented as software modules or routines, the modules and/or routines are executed by the processor 1708 to control the system 1700 to operate in accordance with the invention and implement one or more operations discussed with regard to flowcharts 1500 and/or 1550.
The control routines 1714 include device control routines and communications routines to control the operation of the processing system 1700. The encoder(s) 1716 may, and in some embodiments do, include a plurality of encoders configured to encode received image content, stereoscopic images of a scene and/or one or more scene portions in accordance with the features of the invention. In some embodiments encoder(s) include multiple encoders with each encoder being configured to encode a stereoscopic scene and/or partitioned scene portions to support a given bit rate stream. Thus in some embodiments each scene portion can be encoded using multiple encoders to support multiple different bit rate streams for each scene. An output of the encoder(s) 1716 is the encoded stereoscopic image data 1728 stored in the memory for streaming to customer devices, e.g., playback devices. The encoded content can be streamed to one or multiple different devices via the network interface 1710.
The composite depth map generation module 1717 is configured to generate a composite environmental depth map of the environment of interest from the images captured by various cameras, e.g., stereoscopic camera pairs and one or more light field cameras. Thus the composite depth map generation module 1717 generates the environmental depth map 1732 from stereoscopic image pairs and the environmental depth map 1734 from images captured by one or more light field cameras.
The depth map availability determination module 1718 is configured to determine whether a given depth map is available at a given time, e.g., whether a static depth map is available and/or whether an environmental depth map generated from images captured by light field cameras is available and/or whether an environmental depth map generated from images captured by stereoscopic camera pairs is available, at given times.
The current depth map determination module 1719 is configured to determine if a current depth map has been set. In various embodiments the current depth map determination module 1719 is further configured to set one of the environmental depth map or a reconciled depth map as the current depth map in accordance with the features of the invention. For example when a reconciled environmental depth map is available, e.g., having been generated by reconciling environmental depth maps generated from two or more sources, the current depth map determination module 1719 sets the reconciled environmental depth map as the current depth map.
The streaming controller 1720 is configured to control streaming of encoded content for delivering the encoded image content (e.g., at least a portion of encoded stereoscopic image data 1728) to one or more customer playback devices, e.g., over the communications network 105. In various embodiments the streaming controller 1720 is further configured to communicate, e.g., transmit, an environmental depth map that has been set as the current depth map to one or more customer playback devices, e.g., via the network interface 1710.
The image generation module 1721 is configured to generate a first image from at least one image captured by the light field camera, e.g., received images 1723, the generated first image including a portion of the environment of interest which is not included in at least some of the stereoscopic images (e.g., stereoscopic image content 1724) captured by the stereoscopic cameras. In some embodiments the streaming controller 1720 is further configured to transmit at least a portion of the generated first image to one or more customer playback devices, e.g., via the network interface 1710.
The depth map reconciliation module 1722 is configured to perform depth map reconciling operations in accordance with the invention, e.g., by implementing the functions corresponding to steps 1526 and 1536 of flowchart 1500. The 3D mesh model generation and update module 1740 is configured to generate a 3D mesh model from a current environmental depth map (e.g., reconciled depth map or environmental depth map that has been set as the current environmental depth map). The module 1740 is further configured to update the 3D mesh model when significant environmental changes have been detected in a current environmental depth map compared to the environmental depth map used to generate the current 3D mesh model. In some embodiments the generated 3D mesh model(s) 1744 may include one or more 3D mesh models generated by module 1740 and the most recently updated 3D mesh model in the 3D mesh model(s) 1744 is set as the current 3D mesh model 1748. The UV map generation and update module 1742 is configured to generate a UV map to be used in wrapping frames onto the generated 3D mesh model. The module 1742 is further configured to update the UV map. The generated UV map(s) 1746 may include one or more UV maps generated by module 1742 and the most recently updated UV map in the generated UV map(s) 1746 is set as the current UV map 1750. In some embodiments the modules are configured to perform the functions corresponding to various steps discussed in
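The reconciliation performed by module 1722 can be illustrated with a simple sketch. This is not the patented algorithm itself, merely one plausible way of merging per-position depth estimates from two sources, e.g., a map derived from stereoscopic image pairs and a map derived from a light field camera; the dictionary representation and the averaging rule are assumptions for illustration only.

```python
def reconcile_depth_maps(stereo_map, light_field_map):
    """Merge two depth maps given as dicts of grid position -> depth.

    Where both sources provide an estimate, average them; where only
    one source covers a position, fall back to that source's value.
    """
    reconciled = {}
    for pos in set(stereo_map) | set(light_field_map):
        s = stereo_map.get(pos)
        lf = light_field_map.get(pos)
        if s is not None and lf is not None:
            reconciled[pos] = (s + lf) / 2.0
        else:
            reconciled[pos] = s if s is not None else lf
    return reconciled
```

For example, a position covered by both sources at depths 2.0 and 4.0 reconciles to 3.0, while a position seen only by the light field camera keeps that camera's estimate.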
Received stereoscopic image data 1724 includes stereoscopic image pairs received from one or more stereoscopic cameras, e.g., such as those included in the rig 801. Encoded stereoscopic image data 1728 includes a plurality of sets of stereoscopic image data which have been encoded by the encoder(s) 1716 to support multiple different bit rate streams.
The static depth map 1730 is the acquired, e.g., downloaded, depth map of the environment of interest. The environmental depth map generated from images captured by stereoscopic camera pairs 1732 and the environmental depth map generated from images captured by one or more light field cameras 1734 are outputs of the composite environmental depth map generation module 1717. The reconciled environmental depth map(s) 1736 includes one or more environmental depth maps generated by the reconciliation module 1722 in accordance with the invention. The default depth map corresponding to a sphere 1738 is also stored in memory 1712 for use in the event when an environmental depth map is not available from other sources, e.g., when none of the static depth map 1730, environmental depth map 1732 and environmental depth map 1734 is available for use. Thus in some embodiments the reconciled environmental depth map(s) 1736 is set as the current environmental depth map and used in generating 3D mesh models.
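The fallback behavior described above, where the default sphere depth map is used only when no other source is available and a reconciled map takes priority when present, can be sketched as a simple priority selection. The function name and the priority ordering of the intermediate sources are assumptions for illustration; the source only fixes the two endpoints (reconciled map preferred, sphere as last resort).

```python
def select_current_depth_map(reconciled=None, light_field=None,
                             stereo=None, static=None,
                             default="sphere"):
    """Return the highest-priority depth map that is available.

    Each argument is a depth map or None when that source has not
    produced a map; the default sphere map is used only when none of
    the other sources is available.
    """
    for candidate in (reconciled, light_field, stereo, static):
        if candidate is not None:
            return candidate
    return default
```

Usage: if only the downloaded static map is available it is selected, but as soon as a reconciled map is produced it takes precedence.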
In some embodiments generation, transmission and updating of the 3D mesh model and UV map may be triggered by detection of significant changes to environmental depth information obtained from one or more depth measurement sources, e.g., the light field camera outputs and/or stereoscopic camera pair output. See for example
A complete new 3D model or model difference information may be, and in some embodiments is, transmitted to the playback device as updated model information. In addition to the generation and transmission of updated 3D model information, updated UV map information may be, and in some embodiments is, generated and transmitted to the playback device to be used when rendering images using the updated 3D model information. Mesh model and/or UV map updates are normally timed to coincide with scene changes and/or to align with group of picture (GOP) boundaries in a transmitted image stream. In this way, application of the new model and/or map will normally begin being applied in the playback device at a point where decoding of a current frame does not depend on a frame or image which was to be rendered using the older model or map since each GOP boundary normally coincides with the sending of intra-frame coded image data. Since the environmental changes will frequently coincide with scene changes such as the closing of a curtain, moving of a wall, etc., the scene change point is a convenient point to implement the new model and in many cases will coincide with the event that triggered the generation and transmission of the updated model information and/or updated UV map.
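The GOP-boundary timing described above can be sketched as deferring the switch-over: an update triggered mid-GOP does not take effect at the frame where the environmental change was detected, but at the next frame that begins a group of pictures (an intra-coded frame). The function name and the fixed-size GOP assumption are illustrative only.

```python
def next_gop_boundary(trigger_frame, gop_size):
    """First GOP-aligned frame number at or after the trigger frame.

    Assumes a fixed GOP size, i.e., frames 0, gop_size, 2*gop_size, ...
    each start a new group of pictures with an intra-coded frame.
    """
    return ((trigger_frame + gop_size - 1) // gop_size) * gop_size
```

For example, with a 30-frame GOP, a change detected at frame 37 would have its new model applied starting at frame 60, while a change detected exactly on a boundary takes effect immediately.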
In addition to receiving an updated mesh model, in many cases the playback device receives a corresponding UV map to be used to map images, e.g., frames, to the 3D space, e.g., onto a 3D mesh model defining the 3D environmental space. The frames may be, and sometimes are, generated from image data captured by one or more stereoscopic camera pairs mounted on a camera rig which also includes one or more light field cameras, e.g., Lytro cameras, used to capture depth information useful in updating a 3D map. While new or updated UV map information is often received when updated mesh model information is received, if the number of nodes in the 3D mesh model remains the same before and after an update, the UV map may not be updated at the same time as the 3D mesh model. UV map information may be transmitted and received as a complete new map or as difference information. Thus, in some embodiments UV map difference information is received and processed to generate an updated UV map. The updated UV map may be, and sometimes is, generated by applying the differences indicated in the updated UV map information to the previous UV map.
The method of flowchart 1800 begins in start step 1802 with a playback device such as a game console and display or head mounted display assembly being powered on and set to begin receiving, storing and processing 3D related image data, e.g., frames representing texture information produced from captured images, model information and/or UV maps to be used in rendering images. Operation proceeds from start step 1802 to step 1804 in which information communicating a first mesh model of a 3D environment, e.g., a stadium, theater, etc., generated based on measurements of at least a portion of the environment made using a light field camera at a first time is received and stored, e.g., in memory. The model may be, and sometimes is, in the form of a set of 3D coordinates (X, Y, Z) indicating distances to nodes from an origin corresponding to a user viewing position. The node coordinates define a mesh model. Thus in some embodiments the first mesh model information includes a first set of coordinate triples, each triple indicating a coordinate in X, Y, Z space of a node in the first mesh model.
The mesh model includes segments formed by the interconnection of the node points in an indicated or predetermined manner. For example, each node in all or a portion of the mesh may be coupled to the nearest 3 adjacent nodes for portions of the mesh model where 3 sided segments are used. In portions of the mesh model where four sided segments are used, each node may be known to interconnect with its four nearest neighbors. In addition to node location information, the model may, and in some embodiments does, include information about how nodes in the model are to be interconnected. In some embodiments information communicating the first mesh model of the 3D environment includes information defining a complete mesh model.
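A minimal sketch of the mesh representation described above: nodes as (X, Y, Z) coordinate triples measured from an origin at the viewer position, plus connectivity information listing which node indices form each segment, here three sided segments. All coordinate values and the dictionary layout are invented for illustration.

```python
# A toy mesh model: four nodes forming a small quad 5 units in front
# of the viewing origin, split into two three-sided segments.
mesh_model = {
    "nodes": [
        (0.0, 0.0, 5.0),  # node 0
        (1.0, 0.0, 5.0),  # node 1
        (0.0, 1.0, 5.0),  # node 2
        (1.0, 1.0, 5.0),  # node 3
    ],
    # Each segment lists the three node indices that are
    # interconnected to form one face of the mesh.
    "segments": [(0, 1, 2), (1, 3, 2)],
}
```

A model communicated this way carries both the node locations and the interconnection information the text mentions; when connectivity is predetermined (e.g., each node always joins its nearest neighbors), only the node list need be sent.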
Operation proceeds from step 1804 to step 1806 in which a first map, e.g., a first UV map, indicating how a 2D image, e.g., received frame, is to be wrapped onto the first 3D model is received. The first UV map usually includes one segment for each segment of the 3D model map with there being a one to one indicated or otherwise known correspondence between the first UV map segments and the first 3D model segments. The first UV map can be, and is, used as part of the image rendering process to apply, e.g., wrap, the content of 2D frames which correspond to what is sometimes referred to as UV space to the segments of the first 3D model. This mapping of the received textures in the form of frames corresponding to captured image data to the 3D environment represented by the segments of the 3D model allows received left and right eye frames corresponding to stereoscopic image pairs to be rendered into images which are to be viewed by the user's left and right eyes, respectively.
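The UV-to-frame relationship above can be sketched as follows: each mesh segment is paired with a triangle of normalized (u, v) coordinates identifying which patch of the decoded frame supplies its texture. Converting those normalized coordinates to pixel positions is the first step a renderer takes when wrapping the frame onto the model; the function names and coordinate convention (u, v in [0, 1], origin at pixel (0, 0)) are assumptions for illustration.

```python
def uv_to_pixel(uv, frame_width, frame_height):
    """Convert a normalized (u, v) coordinate, u and v in [0, 1],
    to a pixel position within a decoded frame of the given size."""
    u, v = uv
    return (int(u * (frame_width - 1)), int(v * (frame_height - 1)))

def segment_texture_corners(uv_triangle, frame_width, frame_height):
    """Pixel-space corners of the frame patch wrapped onto one
    three-sided 3D mesh segment."""
    return [uv_to_pixel(uv, frame_width, frame_height)
            for uv in uv_triangle]
```

With a one to one correspondence between UV map segments and mesh segments, segment i of the UV map directly yields the texture patch for segment i of the 3D model.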
The receipt of the first 3D model and first rendering map, e.g., a first UV map, can occur together or in any order and are shown as sequential operations in
Operation proceeds from step 1808 to step 1810 in which at least one image is rendered using the first mesh model. As part of the image rendering performed in step 1810, the first UV map is used to determine how to wrap an image included in the received image content on to the first mesh model to generate an image which can be displayed and viewed by a user. Each of the left and right eye images of a stereoscopic pair will be, in some embodiments, rendered individually and may be displayed on different portions of a display so that different images are viewed by the left and right eyes allowing for images to be perceived by the user as having a 3D effect. The rendered images are normally displayed to the user after rendering, e.g., via a display device which in some embodiments is a cell phone display mounted in a helmet which can be worn on a person's head, e.g., as a head mounted display device.
While multiple images may be rendered and displayed over time as part of step 1810, at some point during the event being captured and streamed for playback, a change in the environment may occur such as a curtain being lowered, a wall of a stage being moved, a dome on a stadium being opened or closed. Such events may, and in various embodiments will be, detected by environmental measurements being performed. In response to detecting a change in the environment, a new 3D mesh model and UV map may be generated by the system processing the captured images and/or environmental measurements.
In step 1814, updated mesh model information is received. The updated mesh model, in some cases, includes updated mesh model information, e.g., new node points, generated based on measurement of a portion of the environment. The measurements may correspond to the same portion of the environment to which the earlier measurements for the first mesh model correspond and/or the new measurements may at least include measurements of that portion of the environment. Such measurements may be, and sometimes are, based on environmental depth measurements relative to the camera rig position obtained using a light field camera, e.g., such as the ones illustrated in the preceding figures. In some embodiments the updated mesh model information includes at least some updated mesh model information generated based on measurements of at least the portion of said environment using said light field camera at a second time, e.g., a time period after the first time period.
The updated mesh model information received in step 1814 may be in the form of a complete updated mesh model or in the form of difference information indicating changes to be made to the first mesh model to form the updated mesh model. Thus in some embodiments updated mesh model information is difference information indicating a difference between said first mesh model and an updated mesh model. In optional step 1815 which is performed when model difference information is received, the playback device generates the updated mesh model from the first mesh model and the received difference information. For example, in step 1815 nodes not included in the updated mesh model may be deleted from the set of information representing the first mesh model and replaced with new nodes indicated by the mesh model update information that was received to thereby create the updated mesh model. Thus in some embodiments the updated mesh model information includes information indicating changes to be made to the first mesh model to generate an updated mesh model. In some embodiments the updated mesh model information provides new mesh information for portions of the 3D environment which have changed between the first and second time periods. In some embodiments the updated mesh model information includes at least one of: i) new sets of mesh coordinates for at least some nodes in the first mesh model information, the new coordinates being intended to replace coordinates of corresponding nodes in the first mesh model; or ii) a new set of coordinate triples to be used for at least a portion of the mesh model in place of a previous set of coordinate triples, the new set of coordinate triples including the same or a different number of coordinate triples than the previous set of coordinate triples to be replaced.
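The delete-and-replace behavior of optional step 1815 can be sketched as follows. Nodes are keyed by an identifier, the diff lists identifiers to remove and replacement or new coordinate triples, and applying the diff to the first model yields the updated model. The diff structure (keys "removed" and "updated") is invented for illustration; the source only specifies that deleted nodes are removed and new nodes are added.

```python
def apply_mesh_diff(first_model, diff):
    """Generate an updated mesh model from a first model and diff info.

    first_model: dict node_id -> (x, y, z) coordinate triple.
    diff: dict with optional keys "removed" (set of node ids to drop)
    and "updated" (node_id -> new coordinate triple, covering both
    replaced and newly added nodes).
    """
    updated = {nid: xyz for nid, xyz in first_model.items()
               if nid not in diff.get("removed", set())}
    updated.update(diff.get("updated", {}))
    return updated
```

This keeps the update message small: only the nodes that changed between the first and second time periods need be transmitted, while unchanged nodes carry over from the first model.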
In addition to receiving updated mesh model information the playback device may receive updated map information. This is shown in step 1816. The updated map information may be in the form of a complete new UV map to be used to map images to the updated mesh model or in the form of difference information which can be used in combination with the first map to generate an updated map. While an updated UV map need not be supplied with each 3D model update, UV map updates will normally occur at the same time as the model updates and will occur when a change in the number of nodes occurs resulting in a different number of segments in the 3D mesh model. Updated map information need not be provided if the number of segments and nodes in the 3D model remain unchanged but will in many cases be provided even if there is no change in the number of model segments given that the change in the environmental shape may merit a change in how captured images are mapped to the 3D mesh model being used.
If difference information is received rather than a complete UV map, the operation proceeds from step 1816 to step 1818. In step 1818, which is used in the case where map difference information is received in step 1816, an updated map is generated by applying the map difference information included in the received updated map information to the first map. In the case where a complete updated UV map is received in step 1816 there is no need to generate the updated map from difference information since the full updated map is received.
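Step 1818 can be sketched the same way as the mesh model diff: segments whose UV coordinates changed are listed in the difference information, and all other segments carry over their entries from the first map. The per-segment dictionary layout is an assumption for illustration.

```python
def apply_uv_map_diff(first_uv_map, uv_diff):
    """Generate an updated UV map from the first map and difference info.

    first_uv_map: dict segment_id -> ((u1, v1), (u2, v2), (u3, v3)).
    uv_diff: dict of segment ids to replacement UV triangles; segments
    not listed keep their entries from the first map.
    """
    updated = dict(first_uv_map)
    updated.update(uv_diff)
    return updated
```

When a complete new UV map is received instead, this step is skipped and the received map is used directly, as the text notes.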
In parallel with or after the receipt and/or generation of the updated 3D mesh model and/or updated UV map, additional image content is received in step 1820. The additional image content may, and sometimes does, correspond to, for example, a second portion of an event which follows a first event segment to which the first 3D model corresponded. Operation proceeds from step 1820 to step 1822. In step 1822 the additional image content is rendered. As part of the image rendering performed in step 1822, the updated 3D model is used to render at least some of the received additional image content as indicated in step 1824. The updated UV map will also be used as indicated by step 1826 when it is available. When no updated UV map has been received or generated, the image rendering in step 1822 will use the old, e.g., first UV map as part of the rendering process. Images rendered in step 1822 are output for display.
The updating of the 3D model and/or UV map may occur repeatedly during a presentation in response to environmental changes. This ongoing potential for repeated model and UV map updates is represented by arrow 1827 which returns processing to step 1814 where additional updated mesh model information may be received. With each return to step 1814, the current mesh model and UV map are treated as the first mesh model and first UV map for purposes of generating a new updated mesh model and/or UV map in the case where the update includes difference information.
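The repeated-update loop above amounts to folding each received difference onto the then-current model, with the result becoming the baseline for the next update. A minimal sketch, assuming node-override diffs only (a real implementation would also handle node removal and complete-model replacement):

```python
from functools import reduce

def apply_diff(model, diff):
    """Merge node overrides from one diff into the current model."""
    updated = dict(model)
    updated.update(diff)
    return updated

def current_model(initial_model, diffs):
    """Apply every received diff, in arrival order, to the initial
    model; each intermediate result serves as the baseline for the
    next diff, mirroring the return to step 1814."""
    return reduce(apply_diff, diffs, initial_model)
```

Because each diff is applied to the output of the previous one, updates must be applied in arrival order; dropping or reordering a diff would desynchronize the playback device's model from the one used to author subsequent diffs.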
The processing described with regard to
In some embodiments the playback device includes instructions which, when executed by a processor of the playback device, control the playback device to implement the steps shown in
The rendering and playback system 1900 in some embodiments includes and/or is coupled to a 3D head mounted display 1905. The system 1900 includes the ability to decode the received encoded image data and generate 3D image content for display to the customer. The playback system 1900 in some embodiments is located at a customer premise location such as a home or office but may be located at an image capture site as well. The playback system 1900 can perform signal reception, decoding, 3D mesh model updating, rendering, display and/or other operations in accordance with the invention.
The playback system 1900 includes a display 1902, a display device interface 1903, a user input interface device 1904, input/output (I/O) interface 1906, a processor 1908, network interface 1910 and a memory 1912. The various components of the playback system 1900 are coupled together via bus 1909 which allows for data to be communicated between the components of the system 1900.
While in some embodiments display 1902 is included as an optional element as illustrated using the dashed box, in some embodiments an external display device 1905, e.g., a head mounted stereoscopic display device, can be coupled to the playback system 1900 via the display device interface 1903. The head mounted display 1905 may be implemented using the OCULUS RIFT™ VR (virtual reality) headset. Other head mounted displays may also be used. The image content is presented on the display device of system 1900, e.g., with left and right eyes of a user being presented with different images in the case of stereoscopic content. By displaying different images to the left and right eyes on a single screen, e.g., on different portions of the single screen to different eyes, a single display can be used to display left and right eye images which will be perceived separately by the viewer's left and right eyes. While various embodiments contemplate a head mounted display to be used in system 1900, the methods and system can also be used with non-head mounted displays which can support 3D images.
The operator of the playback system 1900 may control one or more parameters and/or provide input via user input device 1904. The input device 1904 may be, and in some embodiments is, e.g., a keypad, touch screen, or similar device that may be used for inputting information, data and/or instructions.
Via the I/O interface 1906 the playback system 1900 may be coupled to external devices and exchange information and signaling with such external devices. In some embodiments via the I/O interface 1906 the playback system 1900 receives images captured by various cameras, e.g., stereoscopic camera pairs and/or light field cameras, and receives 3D mesh models and UV maps.
The memory 1912 includes various modules, e.g., routines, which when executed by the processor 1908 control the playback system 1900 to perform operations in accordance with the invention. The memory 1912 includes control routines 1914, a user input processing module 1916, a head position and/or viewing angle determination module 1918, a decoder module 1920, a stereoscopic image rendering module 1922 also referred to as a 3D image generation module, a 3D mesh model update module 1924, a UV map update module 1926, received 3D mesh model 1928, received UV map 1930, and data/information including received encoded image content 1932, decoded image content 1934, updated 3D mesh model information 1936, updated UV map information 1938, updated 3D mesh model 1940, updated UV map 1942 and generated stereoscopic content 1944.
The processor 1908, e.g., a CPU, executes routines 1914 and uses the various modules to control the system 1900 to operate in accordance with the invention. The processor 1908 is responsible for controlling the overall general operation of the system 1900. In various embodiments the processor 1908 is configured to perform functions that have been discussed as being performed by the rendering and playback system 1900.
The network interface 1910 includes a transmitter 1911 and a receiver 1913 which allows the playback system 1900 to be able to receive and/or communicate information to an external device over a communications network, e.g., such as communications network 1450. In some embodiments the playback system 1900 receives, e.g., via the interface 1910, image content 1932, 3D mesh model 1928, UV map 1930, updated mesh model information 1936, updated UV map information 1938 from the processing system 1700 over the communications network 1450. Thus in some embodiments the playback system 1900 receives, via the interface 1910, information communicating a first mesh model, e.g., the 3D mesh model 1928, of a 3D environment generated based on measurements of at least a portion of the environment made using a light field camera at a first time. The playback system 1900 in some embodiments further receives via the interface 1910, image content, e.g., frames of left and right eye image pairs.
The control routines 1914 include device control routines and communications routines to control the operation of the system 1900. The request generation module 1916 is configured to generate a request for content, e.g., upon user selection of an item for playback. The received information processing module 1917 is configured to process information, e.g., image content, audio data, environmental models, UV maps etc., received by the system 1900, e.g., via the receiver of interface 1906 and/or 1910, to recover communicated information that can be used by the system 1900, e.g., for rendering and playback. The head position and/or viewing angle determination module 1918 is configured to determine a current viewing angle and/or a current head position, e.g., orientation, of the user, e.g., orientation of the head mounted display, and in some embodiments report the determined position and/or viewing angle information to the processing system 1700.
The decoder module 1920 is configured to decode encoded image content 1932 received from the processing system 1700 or the camera rig 1402 to produce decoded image data 1934. The decoded image data 1934 may include decoded stereoscopic scene and/or decoded scene portions.
The 3D image renderer 1922 uses decoded image data to generate 3D image content in accordance with the features of the invention for display to the user on the display 1902 and/or the display device 1905. In some embodiments the 3D image renderer 1922 is configured to render, using a first 3D mesh model at least some of received image content. In some embodiments the 3D image renderer 1922 is further configured to use a first UV map to determine how to wrap an image included in received image content onto the first 3D mesh model.
The 3D mesh model update module 1924 is configured to update a received first 3D mesh model 1928 (e.g., initially received mesh model) using received updated mesh model information 1936 to generate an updated mesh model 1940. In some embodiments the received updated mesh model information 1936 includes mesh model difference information reflecting the changes with respect to a previous version of the 3D mesh model received by the playback device 1900. In some other embodiments the received updated mesh model information 1936 includes complete information for generating a full complete 3D mesh model which is then output as the updated mesh model 1940.
The UV map update module 1926 is configured to update a received first UV map 1930 (e.g., initially received UV map) using received updated UV map information 1938 to generate an updated UV map 1942. In some embodiments the received updated UV map information 1938 includes difference information reflecting the changes with respect to a previous version of the UV map received by the playback device 1900. In some other embodiments the received updated UV map information 1938 includes information for generating a full complete UV map which is then output as the updated UV map 1942.
In various embodiments when the 3D mesh model and/or UV map is updated in accordance with the invention, 3D image rendering module 1922 is further configured to render, using an updated mesh model, at least some of the image content, e.g., additional image content. In some such embodiments the 3D image rendering module 1922 is further configured to use the updated UV map to determine how to wrap an image included in the image content to be rendered onto the updated 3D mesh model. The generated stereoscopic image content 1944 is the output of the 3D image rendering module 1922.
In some embodiments some of the modules are implemented, e.g., as circuits, within the processor 1908 with other modules being implemented, e.g., as circuits, external to and coupled to the processor. Alternatively, rather than being implemented as circuits, all or some of the modules may be implemented in software and stored in the memory of the playback device 1900 with the modules controlling operation of the playback device 1900 to implement the functions corresponding to the modules when the modules are executed by a processor, e.g., processor 1908. In still other embodiments, various modules are implemented as a combination of hardware and software, e.g., with a circuit external to the processor 1908 providing input to the processor 1908 which then under software control operates to perform a portion of a module's function.
While shown in
While shown in the
As should be appreciated, the modules illustrated in
In one exemplary embodiment the processor 1908 is configured to control the playback device 1900 to: receive, e.g., via interface 1910, information communicating a first mesh model of a 3D environment generated based on measurements of at least a portion of said environment made using a light field camera at a first time; receive, e.g., via the interface 1910, image content; and render, using said first mesh model at least some of the received image content.
In some embodiments the processor is further configured to control the playback device to receive, e.g., via the interface 1910, updated mesh model information, said updated mesh model information including at least some updated mesh model information generated based on measurements of at least the portion of said environment using said light field camera at a second time. In some embodiments the updated mesh model information communicates a complete updated mesh model.
In some embodiments the processor is further configured to control the playback device to: receive additional image content; and render, using said updated mesh model information, at least some of the received additional image content.
In some embodiments the processor is further configured to control the playback device to: receive (e.g., via the interface 1910 or 1906), a first map mapping a 2D image space to said first mesh model; and use said first map to determine how to wrap an image included in said received image content onto said first mesh model as part of being configured to render, using said first mesh model, at least some of the received image content.
In some embodiments the processor is further configured to control the playback device to: receive (e.g., via the interface 1910 or 1906) updated map information corresponding to said updated mesh model information; and use said updated map information to determine how to wrap an additional image included in said received additional image content onto said updated mesh model as part of being configured to render, using said updated mesh model information, at least some of the received additional image content.
In some embodiments the updated map information includes map difference information. In some such embodiments the processor is further configured to control the playback device to: generate an updated map by applying said map difference information to said first map to generate an updated map; and use said updated map to determine how to wrap an additional image included in said received additional image content onto said updated mesh model as part of rendering, using said updated mesh model information, at least some of the received additional image content.
While steps are shown in an exemplary order it should be appreciated that in many cases the order of the steps may be altered without adversely affecting operation. Accordingly, unless the exemplary order of steps is required for proper operation, the order of steps is to be considered exemplary and not limiting.
While various embodiments have been discussed, it should be appreciated that not necessarily all embodiments include the same features and some of the described features are not necessary but can be desirable in some embodiments.
While various ranges and exemplary values are described the ranges and values are exemplary. In some embodiments the ranges of values are 20% larger than the ranges discussed above. In other embodiments the ranges are 20% smaller than the exemplary ranges discussed above. Similarly, particular values may be, and sometimes are, up to 20% larger than the values specified above while in other embodiments the values are up to 20% smaller than the values specified above. In still other embodiments other values are used.
The techniques of various embodiments may be implemented using software, hardware and/or a combination of software and hardware. Various embodiments are directed to apparatus, e.g., image data capture and processing systems. Various embodiments are also directed to methods, e.g., a method of image capture and/or processing image data. Various embodiments are also directed to a non-transitory machine, e.g., computer, readable medium, e.g., ROM, RAM, CDs, hard discs, etc., which include machine readable instructions for controlling a machine to implement one or more steps of a method.
Various features of the present invention are implemented using modules. Such modules may be, and in some embodiments are, implemented as software modules. In other embodiments the modules are implemented in hardware. In still other embodiments the modules are implemented using a combination of software and hardware. In some embodiments the modules are implemented as individual circuits with each module being implemented as a circuit for performing the function to which the module corresponds. A wide variety of embodiments are contemplated including some embodiments where different modules are implemented differently, e.g., some in hardware, some in software, and some using a combination of hardware and software. It should also be noted that routines and/or subroutines, or some of the steps performed by such routines, may be implemented in dedicated hardware as opposed to software executed on a general purpose processor. Such embodiments remain within the scope of the present invention. Many of the above described methods or method steps can be implemented using machine executable instructions, such as software, included in a machine readable medium such as a memory device, e.g., RAM, floppy disk, etc., to control a machine, e.g., a general purpose computer with or without additional hardware, to implement all or portions of the above described methods. Accordingly, among other things, the present invention is directed to a machine-readable medium including machine executable instructions for causing a machine, e.g., processor and associated hardware, to perform one or more of the steps of the above-described method(s).
Some embodiments are directed to a non-transitory computer readable medium embodying a set of software instructions, e.g., computer executable instructions, for controlling a computer or other device to encode and compress stereoscopic video. Other embodiments are directed to a computer readable medium embodying a set of software instructions, e.g., computer executable instructions, for controlling a computer or other device to decode and decompress video on the player end. While encoding and compression are mentioned as possible separate operations, it should be appreciated that encoding may be used to perform compression and thus encoding may, in some embodiments, include compression. Similarly, decoding may involve decompression.
In various embodiments a processor of a processing system is configured to control the processing system to perform the method steps performed by the exemplary described processing system. In various embodiments a processor of a playback device is configured to control the playback device to implement the steps, performed by a playback device, of one or more of the methods described in the present application.
Numerous additional variations on the methods and apparatus of the various embodiments described above will be apparent to those skilled in the art in view of the above description. Such variations are to be considered within the scope of the invention.
Claims
1. A method of operating a playback device, the method comprising:
- receiving information communicating a first mesh model of a 3D environment generated based on measurements of a portion of said environment made using a light field camera at a first time;
- receiving image content; and
- rendering, using said first mesh model, at least some of the received image content.
2. The method of claim 1, further comprising:
- receiving updated mesh model information, said updated mesh model information including at least some updated mesh model information generated based on measurements of said portion of said environment using said light field camera at a second time.
3. The method of claim 2, further comprising:
- receiving additional image content; and
- rendering, using said updated mesh model information, at least some of the received additional image content.
4. The method of claim 3, wherein said information communicating a first mesh model of the 3D environment includes information defining a complete mesh model.
5. The method of claim 4, wherein said updated mesh model information communicates a complete updated mesh model.
6. The method of claim 5, wherein said updated mesh model information provides new mesh information for portions of said 3D environment which have changed between said first and second time periods.
7. The method of claim 6, wherein said updated mesh model information is difference information indicating a difference between said first mesh model and an updated mesh model.
8. The method of claim 7, wherein said first mesh model information includes a first set of coordinate triples, each coordinate triple indicating a coordinate in X, Y, Z space of a node in the first mesh model.
9. The method of claim 8, wherein said updated mesh model information includes at least one of: i) new sets of mesh coordinates for at least some nodes in said first mesh model information, said new coordinates being intended to replace coordinates of corresponding nodes in said first mesh model; or ii) a new set of coordinate triples to be used for at least a portion of said first mesh model in place of a previous set of coordinate triples, said new set of coordinate triples including the same or a different number of coordinate triples than the previous set of coordinate triples to be replaced.
10. The method of claim 9, further comprising:
- receiving a first map mapping a 2D image space to said first mesh model; and
- wherein rendering, using said first mesh model, at least some of the received image content, includes using said first map to determine how to wrap an image included in said received image content onto said first mesh model.
11. The method of claim 10, further comprising:
- receiving updated map information corresponding to said updated mesh model information; and
- wherein rendering, using said updated mesh model information, at least some of the received additional image content, includes using said updated map information to determine how to wrap an additional image included in said received additional image content onto said updated mesh model.
12. The method of claim 11, wherein the updated map information includes map difference information, the method further comprising:
- generating an updated map by applying said map difference information to said first map to generate an updated map; and
- wherein rendering, using said updated mesh model information, at least some of the received additional image content, includes using said updated map to determine how to wrap an additional image included in said received additional image content onto said updated mesh model.
13. A computer readable medium including computer executable instructions which, when executed by a computer, control the computer to:
- receive information communicating a first mesh model of a 3D environment generated based on measurements of a portion of said environment made using a light field camera at a first time;
- receive image content; and
- render, using said first mesh model, at least some of the received image content.
14. A playback apparatus, comprising:
- a processor configured to control said playback apparatus to: receive information communicating a first mesh model of a 3D environment generated based on measurements of a portion of said environment made using a light field camera at a first time; receive image content; and render, using said first mesh model, at least some of the received image content.
15. The playback apparatus of claim 14, wherein the processor is further configured to control the playback apparatus to:
- receive updated mesh model information, said updated mesh model information including at least some updated mesh model information generated based on measurements of the portion of said environment using said light field camera at a second time.
16. The playback apparatus of claim 15, wherein the processor is further configured to control the playback apparatus to:
- receive additional image content; and
- render, using said updated mesh model information, at least some of the received additional image content.
17. The playback apparatus of claim 14, wherein the processor is further configured to control the playback apparatus to:
- receive a first map mapping a 2D image space to said first mesh model; and
- use said first map to determine how to wrap an image included in said received image content onto said first mesh model as part of being configured to render, using said first mesh model, at least some of the received image content.
18. The playback apparatus of claim 17, wherein the processor is further configured to control the playback apparatus to:
- receive updated map information corresponding to said updated mesh model information; and
- use said updated map information to determine how to wrap an additional image included in said received additional image content onto said updated mesh model as part of being configured to render, using said updated mesh model information, at least some of the received additional image content.
19. The playback apparatus of claim 18, wherein the updated map information includes map difference information; and
- wherein the processor is further configured to control the playback apparatus to: generate an updated map by applying said map difference information to said first map to generate an updated map; and use said updated map to determine how to wrap an additional image included in said received additional image content onto said updated mesh model as part of rendering, using said updated mesh model information, at least some of the received additional image content.
20. The playback apparatus of claim 16, wherein said information communicating a first mesh model of the 3D environment includes information defining a complete mesh model.
Type: Application
Filed: Mar 1, 2016
Publication Date: Sep 1, 2016
Inventors: David Cole (Laguna Beach, CA), Alan McKay Moss (Laguna Beach, CA)
Application Number: 15/057,210