Radiative Transfer Signalling For Immersive Video
An encoder may segment volumetric video data into one or more regions; determine at least one radiative transfer property of the one or more regions; indicate the at least one radiative transfer property of the one or more regions in a sub-stream; and include the sub-stream in a bitstream configured to describe the volumetric video data. A decoder may receive a bitstream describing volumetric video data; extract a sub-stream from the received bitstream; determine whether the extracted sub-stream comprises at least one radiative transfer property for a region of the volumetric video data; based on a determination that the extracted sub-stream comprises the at least one radiative transfer property for the region, determine a value for the at least one radiative transfer property for the region; and render the region based on the determined value for the at least one radiative transfer property.
This application claims priority under 35 U.S.C. § 119(e)(1) to U.S. Provisional Patent Application No. 63/125,086, filed Dec. 14, 2020, which is hereby incorporated by reference in its entirety.
TECHNICAL FIELD
The example and non-limiting embodiments relate generally to immersive video and specifically to signaling characteristics of immersive video for recreation of immersive video at a decoder side.
BACKGROUND
It is known, in video encoding, to signal radiative transfer attributes on a per-point basis.
SUMMARY
The following summary is merely intended to be illustrative. The summary is not intended to limit the scope of the claims.
In accordance with one aspect, a method comprising: segmenting volumetric video data into one or more regions; determining at least one radiative transfer property of the one or more regions; indicating the at least one radiative transfer property of the one or more regions in a sub-stream; and including the sub-stream in a bitstream configured to describe the volumetric video data.
In accordance with one aspect, an apparatus comprising: at least one processor; and at least one memory and computer program code, wherein the at least one memory and the computer program code are configured to, with the at least one processor, cause the apparatus to: segment volumetric video data into one or more regions; determine at least one radiative transfer property of the one or more regions; indicate the at least one radiative transfer property of the one or more regions in a sub-stream; and include the sub-stream in a bitstream configured to describe the volumetric video data.
In accordance with one aspect, an apparatus comprising means for performing: segmenting volumetric video data into one or more regions; determining at least one radiative transfer property of the one or more regions; indicating the at least one radiative transfer property of the one or more regions in a sub-stream; and including the sub-stream in a bitstream configured to describe the volumetric video data.
In accordance with one aspect, a non-transitory computer-readable medium comprising program instructions stored thereon which, when executed with at least one processor, cause the at least one processor to: segment volumetric video data into one or more regions; determine at least one radiative transfer property of the one or more regions; indicate the at least one radiative transfer property of the one or more regions in a sub-stream; and include the sub-stream in a bitstream configured to describe the volumetric video data.
In accordance with one aspect, a method comprising: receiving a bitstream describing volumetric video data; extracting a sub-stream from the received bitstream; determining whether the extracted sub-stream comprises at least one radiative transfer property for a region of the volumetric video data; based on a determination that the extracted sub-stream comprises the at least one radiative transfer property for the region, determining a value for the at least one radiative transfer property for the region; and rendering the region based on the determined value for the at least one radiative transfer property.
In accordance with one aspect, an apparatus comprising: at least one processor; and at least one memory and computer program code, wherein the at least one memory and the computer program code are configured to, with the at least one processor, cause the apparatus to: receive a bitstream describing volumetric video data; extract a sub-stream from the received bitstream; determine whether the extracted sub-stream comprises at least one radiative transfer property for a region of the volumetric video data; based on a determination that the extracted sub-stream comprises the at least one radiative transfer property for the region, determine a value for the at least one radiative transfer property for the region; and render the region based on the determined value for the at least one radiative transfer property.
In accordance with one aspect, an apparatus comprising means for performing: receiving a bitstream describing volumetric video data; extracting a sub-stream from the received bitstream; determining whether the extracted sub-stream comprises at least one radiative transfer property for a region of the volumetric video data; based on a determination that the extracted sub-stream comprises the at least one radiative transfer property for the region, determining a value for the at least one radiative transfer property for the region; and rendering the region based on the determined value for the at least one radiative transfer property.
In accordance with one aspect, a non-transitory computer-readable medium comprising program instructions stored thereon which, when executed with at least one processor, cause the at least one processor to: receive a bitstream describing volumetric video data; extract a sub-stream from the received bitstream; determine whether the extracted sub-stream comprises at least one radiative transfer property for a region of the volumetric video data; based on a determination that the extracted sub-stream comprises the at least one radiative transfer property for the region, determine a value for the at least one radiative transfer property for the region; and render the region based on the determined value for the at least one radiative transfer property.
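The encoder-side and decoder-side aspects above can be sketched end to end. The sketch below is purely illustrative: the region identifiers, the "reflectance" property name, the renderer default, and the JSON container are all assumptions made for readability; an actual V3C sub-stream is a binary syntax structure, not JSON.

```python
import json

def encode(regions):
    """Hypothetical encoder step: indicate one radiative transfer property
    per region in a dedicated sub-stream, then include that sub-stream in
    the bitstream describing the volumetric video data."""
    sub_stream = {
        region_id: {"reflectance": props["reflectance"]}
        for region_id, props in regions.items()
        if "reflectance" in props  # only regions with the property are signalled
    }
    return json.dumps({"rt_sub_stream": sub_stream})

def decode(bitstream, region_id):
    """Hypothetical decoder step: extract the sub-stream, determine whether
    it comprises the property for the region, and determine a value for it
    (falling back to an assumed renderer default when absent)."""
    sub = json.loads(bitstream).get("rt_sub_stream", {})
    entry = sub.get(region_id)
    if entry is not None and "reflectance" in entry:
        return entry["reflectance"]
    return 1.0  # assumed default for regions without the signalled property

bitstream = encode({"r0": {"reflectance": 0.35}, "r1": {}})
```

Signalling the property once per region, rather than once per point, is what keeps such a sub-stream small relative to the per-point signalling mentioned in the Background.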
The foregoing aspects and other features are explained in the following description, taken in connection with the accompanying drawings, wherein:
The following abbreviations that may be found in the specification and/or the drawing figures are defined as follows:
- 3GPP third generation partnership project
- 4G fourth generation
- 5G fifth generation
- 5GC 5G core network
- 6DoF six degrees of freedom
- AFOC atlas frame order count
- ALU arithmetic logic unit
- AR augmented reality
- ASPS atlas sequence parameter set
- BDTF bidirectional optical transfer function
- CDMA code division multiple access
- CGI computer-generated imagery
- CPU central processing unit
- CSG constructive solid geometry
- DSP digital signal processor
- eNB (or eNodeB) evolved Node B (e.g., an LTE base station)
- E-UTRA evolved universal terrestrial radio access, i.e., the LTE radio access technology
- FDMA frequency division multiple access
- FLOPS floating point operations per second
- gNB (or gNodeB) base station for 5G/NR, i.e., a node providing NR user plane and control plane protocol terminations towards the UE, and connected via the NG interface to the 5GC
- GPU graphical processing unit
- GSM global system for mobile communication
- HMD head mounted display
- IEEE Institute of Electrical and Electronics Engineers
- IMD integrated messaging device
- IMS instant messaging service
- IoT Internet-of-Things
- IRAP intra random access point
- LTE long term evolution
- LUT look-up table
- MIV MPEG immersive video
- MMS multimedia messaging service
- MPEG Moving Picture Experts Group
- MPEG-I Moving Picture Experts Group—immersive codec family
- MR mixed reality
- NAL network abstraction layer
- NR new radio
- PDA personal digital assistant
- PCC point cloud compression
- POC picture order count
- RBSP raw byte sequence payload
- SEI supplemental enhancement information
- SMS short message service
- SPS sequence parameter set
- TCP-IP transmission control protocol-internet protocol
- TDMA time division multiple access
- TDP thermal design power
- TM test model
- TMC2 test model category 2
- UE user equipment (e.g., a wireless, typically mobile device)
- UICC universal integrated circuit card
- UMTS universal mobile telecommunications service
- V3C visual volumetric video-based coding
- V-PCC video-based point cloud compression
- VPS V3C parameter set
- VR virtual reality
- WLAN wireless local area network
The following describes suitable apparatus and possible mechanisms for practicing example embodiments of the present disclosure. Accordingly, reference is first made to
The electronic device 50 may for example be a mobile terminal or user equipment of a wireless communication system. Alternatively, the electronic device may be a computer or part of a computer that is not mobile. It should be appreciated that embodiments of the invention may be implemented within any electronic device or apparatus which may process data. The electronic device 50 may comprise a device that can access a network and/or cloud through a wired or wireless connection. The electronic device 50 may comprise one or more processors or controllers 56, one or more memories 58, and one or more transceivers 52 interconnected through one or more buses. The one or more processors 56 may comprise a central processing unit (CPU) and/or a graphical processing unit (GPU). Each of the one or more transceivers 52 includes a receiver and a transmitter. The one or more buses may be address, data, or control buses, and may include any interconnection mechanism, such as a series of lines on a motherboard or integrated circuit, fiber optics or other optical communication equipment, and the like. The one or more transceivers may be connected to one or more antennas 44. The one or more memories 58 may include computer program code. The one or more memories 58 and the computer program code may be configured to, with the one or more processors 56, cause the electronic device 50 to perform one or more of the operations as described herein.
The electronic device 50 may connect to a node of a network. The network node may comprise one or more processors, one or more memories, and one or more transceivers interconnected through one or more buses. Each of the one or more transceivers includes a receiver and a transmitter. The one or more buses may be address, data, or control buses, and may include any interconnection mechanism, such as a series of lines on a motherboard or integrated circuit, fiber optics or other optical communication equipment, and the like. The one or more transceivers may be connected to one or more antennas. The one or more memories may include computer program code. The one or more memories and the computer program code may be configured to, with the one or more processors, cause the network node to perform one or more of the operations as described herein.
The electronic device 50 may comprise a microphone 36 or any suitable audio input which may be a digital or analogue signal input. The electronic device 50 may further comprise an audio output device which in embodiments of the invention may be any one of: an earpiece 38, speaker, or an analogue audio or digital audio output connection. The electronic device 50 may also comprise a battery (or in other embodiments of the invention the device may be powered by any suitable mobile energy device such as solar cell, fuel cell or clockwork generator). The electronic device 50 may further comprise a camera 42 capable of recording or capturing images and/or video. The electronic device 50 may further comprise a display 32. The electronic device 50 may further comprise an infrared port for short range line of sight communication to other devices. In other embodiments the apparatus 50 may further comprise any suitable short-range communication solution such as for example a Bluetooth™ wireless connection or a USB/firewire wired connection.
It should be understood that an electronic device 50 configured to perform example embodiments of the present disclosure may have fewer and/or additional components, which may correspond to the processes the electronic device 50 is configured to perform. For example, an apparatus configured to encode a video might not comprise a speaker or audio transducer and may comprise a microphone, while an apparatus configured to render the decoded video might not comprise a microphone and may comprise a speaker or audio transducer.
Referring now to
The electronic device 50 may further comprise a card reader 48 and a smart card 46, for example a UICC and UICC reader for providing user information and being suitable for providing authentication information for authentication and authorization of the user at a network. The electronic device 50 may further comprise an input device 34, such as a keypad, one or more input buttons, or a touch screen input device, for providing information to the controller 56.
The electronic device 50 may comprise radio interface circuitry 52 connected to the controller and suitable for generating wireless communication signals for example for communication with a cellular communications network, a wireless communications system or a wireless local area network. The apparatus 50 may further comprise an antenna 44 connected to the radio interface circuitry 52 for transmitting radio frequency signals generated at the radio interface circuitry 52 to other apparatus(es) and/or for receiving radio frequency signals from other apparatus(es).
The electronic device 50 may comprise a microphone 36, camera 42, and/or other sensors capable of recording or detecting audio signals, image/video signals, and/or other information about the local/virtual environment, which are then passed to the codec 54 and/or the controller 56 for processing. The electronic device 50 may receive the audio/image/video signals and/or information about the local/virtual environment for processing from another device prior to transmission and/or storage. The electronic device 50 may also receive either wirelessly or by a wired connection the audio/image/video signals and/or information about the local/virtual environment for encoding/decoding. The structural elements of electronic device 50 described above represent examples of means for performing a corresponding function.
The memory 58 may be of any type suitable to the local technical environment and may be implemented using any suitable data storage technology, such as semiconductor-based memory devices, flash memory, magnetic memory devices and systems, optical memory devices and systems, fixed memory and removable memory. The memory 58 may be a non-transitory memory. The memory 58 may be means for performing storage functions. The controller 56 may be or comprise one or more processors, which may be of any type suitable to the local technical environment, and may include one or more of general-purpose computers, special purpose computers, microprocessors, digital signal processors (DSPs) and processors based on a multi-core processor architecture, as non-limiting examples. The controller 56 may be means for performing functions.
With respect to
The system 10 may include both wired and wireless communication devices and/or electronic devices suitable for implementing embodiments of the invention.
For example, the non-limiting example system shown in
The example communication devices shown in the system 10 may include, but are not limited to, an apparatus 15, a combination of a personal digital assistant (PDA) and a mobile telephone 14, a PDA 16, an integrated messaging device (IMD) 18, a desktop computer 20, a notebook computer 22, and a head-mounted display (HMD) 17. The electronic device 50 may comprise any of those example communication devices. In an example embodiment of the present disclosure, more than one of these devices, or a plurality of one or more of these devices, may perform the disclosed process(es). These devices may connect to the internet 28 through a wireless connection 2.
The embodiments may also be implemented in a set-top box; i.e. a digital TV receiver, which may or may not have a display or wireless capabilities, in tablets or (laptop) personal computers (PC), which have hardware and/or software to process neural network data, in various operating systems, and in chipsets, processors, DSPs and/or embedded systems offering hardware/software based coding. The embodiments may also be implemented in cellular telephones such as smart phones, tablets, personal digital assistants (PDAs) having wireless communication capabilities, portable computers having wireless communication capabilities, image capture devices such as digital cameras having wireless communication capabilities, gaming devices having wireless communication capabilities, music storage and playback appliances having wireless communication capabilities, Internet appliances permitting wireless Internet access and browsing, tablets with wireless communication capabilities, as well as portable units or terminals that incorporate combinations of such functions.
Some or further apparatus may send and receive calls and messages and communicate with service providers through a wireless connection 25 to a base station 24, which may be, for example, an eNB, gNB, etc. The base station 24 may be connected to a network server 26 that allows communication between the mobile telephone network 11 and the internet 28. The system may include additional communication devices and communication devices of various types.
The communication devices may communicate using various transmission technologies including, but not limited to, code division multiple access (CDMA), global system for mobile communications (GSM), universal mobile telecommunications system (UMTS), time division multiple access (TDMA), frequency division multiple access (FDMA), transmission control protocol-internet protocol (TCP-IP), short messaging service (SMS), multimedia messaging service (MMS), email, instant messaging service (IMS), Bluetooth, IEEE 802.11, 3GPP Narrowband IoT, and any similar wireless communication technology. A communications device involved in implementing various embodiments of the present invention may communicate using various media including, but not limited to, radio, infrared, laser, cable connections, and any suitable connection.
In telecommunications and data networks, a channel may refer either to a physical channel or to a logical channel. A physical channel may refer to a physical transmission medium such as a wire, whereas a logical channel may refer to a logical connection over a multiplexed medium, capable of conveying several logical channels. A channel may be used for conveying an information signal, for example a bitstream, which may be a MPEG-I bitstream, from one or several senders (or transmitters) to one or several receivers.
Features as described herein generally relate to enablement of virtual reality (VR), augmented reality (AR), and/or mixed reality (MR). It should be understood that example embodiments described with regard to one of VR, AR, or MR may be implemented with respect to any of these technology areas. Virtual reality (VR) is an area of technology in which video content may be provided, e.g. streamed, to a VR display system. The VR display system may be provided with a live or stored feed from a video content source, the feed representing a VR space or world for immersive output through the display system. A virtual space or virtual world is any computer-generated version of a space, including but not limited to a captured real-world space, in which a user can be immersed through a display system such as a VR headset. A VR headset may be configured to provide VR video and audio content to the user, e.g. through the use of a pair of video screens and headphones incorporated within the headset. Augmented reality (AR) is similar to VR in that video content may be provided, as above, which may be overlaid over or combined with aspects of a real-world environment in which the AR content is being consumed. A user of AR content may therefore experience a version of the real-world environment that is “augmented” with additional virtual features, such as virtual visual and/or audio objects. A device may provide AR video and audio content overlaid over a visible or recorded version of the real-world visual and audio elements.
Features as described herein may relate to methods of encoding, decoding, and/or rendering AR/VR/MR content, including but not limited to volumetric/immersive video data. The encoding, decoding, and/or rendering of the content may take place at a single device or at two or more separate devices. For example, the encoding of the content may take place at a user equipment, a server, or another electronic device capable of performing the processes herein described. The encoded content may then be transmitted to another device, which may then store, decode, and/or render the content. Transmission of the encoded content may, for example, occur over a network connection, such as an LTE, 5G, and/or NR network. As another example, the encoding of the content may take place at a server. The encoded content may then be stored on a suitable file server, which may then be transmitted to another device, which may then store, decode, and/or render the content.
Features as described herein may relate to volumetric video data. Volumetric video data may represent a three-dimensional scene or object and may be used as input for AR, VR, and MR applications. Because volumetric video describes a 3D scene (or object), such data can be viewed from any viewpoint. Therefore, volumetric video is an important format for AR, VR, or MR applications, especially for providing six degrees of freedom (6DoF) viewing capabilities. Such data may describe geometry (shape, size, position in 3D-space, etc.) and respective attributes (e.g. color, opacity, reflectance, etc.), plus any possible temporal changes of the geometry and attributes at given time instances. Temporal information about the scene may be included in the form of individual capture instances, i.e. “frames” in 2D video, or other means, e.g. position of an object as a function of time.
Volumetric video may be generated from 3D models, i.e. computer-generated imagery (CGI); captured from real-world scenes using a variety of capture solutions, e.g. multi-camera, laser scan, a combination of video and dedicated depth sensors, etc.; or generated from a combination of generated data and real-world data. Increasing computational resources and advances in 3D data acquisition devices have enabled reconstruction of highly detailed volumetric video representations of natural scenes. Representation of the 3D data depends on how the 3D data is used. Infrared, lasers, time-of-flight, and structured light are all examples of devices that can be used to construct 3D video data. Typical representation formats for such volumetric data are triangle meshes, point clouds, voxels, etc. Dense voxel arrays have been used to represent volumetric medical data. In 3D graphics, polygonal meshes are extensively used. Point clouds are well suited for applications such as capturing real-world 3D scenes where the topology is not necessarily a 2D manifold. Another way to represent 3D data is to code the 3D data as a set of texture and depth maps, as is the case in multi-view plus depth. Closely related to the techniques used in multi-view plus depth is the use of elevation maps and multi-level surface maps.
In dense point clouds or voxel arrays, the reconstructed 3D scene may contain tens or even hundreds of millions of points. If such representations are to be stored or interchanged between entities, then efficient compression becomes essential. Standard volumetric video representation formats, such as point clouds, meshes, voxels, etc. suffer from poor temporal compression performance. Identifying correspondences for motion-compensation in 3D-space is an ill-defined problem, as both geometry and respective attributes may change. For example, successive temporal “frames” do not necessarily have the same number of meshes, points, or voxels. Therefore, compression of dynamic 3D scenes may be inefficient. 2D-video based approaches for compressing volumetric data, i.e. multiview+depth, have much better compression efficiency, but rarely cover the full scene. Therefore, they may provide only limited 6DoF capabilities.
Instead of the above-mentioned 2D approach, a 3D scene, represented as meshes, points, and/or voxels, may be projected onto one or more geometries. These geometries may be “unfolded” into 2D planes (i.e. two planes per geometry: one for texture, one for depth), which are then encoded using standard 2D video compression technologies. Relevant projection geometry information is transmitted alongside the encoded video files to the decoder. The decoder decodes the video and performs the inverse projection to regenerate the 3D scene in any desired representation format (which might not necessarily be the starting format).
Projecting volumetric models onto 2D planes allows for using standard 2D video coding tools with highly efficient temporal compression. Thus, coding efficiency may be greatly increased. Using geometry-projections instead of prior-art 2D-video based approaches, i.e. multiview+depth, may provide better coverage of a 3D scene or object. Thus, 6DoF capabilities may be improved. Using several geometries for individual objects may further improve the coverage of a scene. Furthermore, standard video encoding hardware may be utilized for real-time compression/decompression of the projected planes. The projection and reverse projection steps are of low complexity.
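As an illustration of the projection step described above, the following sketch orthographically projects points onto a single axis-aligned plane, producing one depth plane and one texture plane. The function name, grid size, and single-channel attribute are assumptions made for the example; a real encoder produces patch-based projections onto several planes.

```python
import numpy as np

def project_to_z_plane(points, attributes, size):
    """Project integer points (x, y, z) onto the +Z plane: (x, y) becomes
    the 2D pixel position, z becomes the depth sample, and the attribute
    fills the texture plane at the same pixel."""
    depth = np.full((size, size), -1, dtype=np.int32)   # -1 marks unoccupied pixels
    texture = np.zeros((size, size), dtype=np.uint8)
    for (x, y, z), a in zip(points, attributes):
        if z > depth[y, x]:        # keep the sample with the larger z (nearest the +Z viewer)
            depth[y, x] = z
            texture[y, x] = a
    return depth, texture

pts = [(1, 2, 5), (1, 2, 3), (0, 0, 7)]
attrs = [200, 100, 50]
depth, tex = project_to_z_plane(pts, attrs, size=4)
```

The resulting depth and texture planes are ordinary 2D images, which is exactly what allows a standard 2D video codec to compress them; the decoder inverts the projection using the depth plane.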
Referring now to
The packed patches/occupancy map may be compressed at 335, resulting in an occupancy sub-stream sent to the multiplexer 360. Image padding may be applied to the one or more geometry images at 345, and the padded geometry images may be compressed at 355, resulting in a geometry sub-stream sent to the multiplexer 360. The image padding may be based on an occupancy map reconstructed from the compressed patches, at 345. Smoothing of the attribute image may be based on a geometry image reconstructed from the compressed geometry image and an occupancy map reconstructed from the compressed patches/occupancy map, at 325. In an example, the reconstructed geometry information may be smoothed outside the encoding loop as a post processing step. Additional smoothing parameters that were used for the smoothing process may be transferred as a supplemental information for the decoding process. The generation of the attribute image may be based on the smoothed geometry and an occupancy map reconstructed from the compressed patches/occupancy map, at 320. Image padding may be applied to the one or more attribute images at 340, and the padded attribute images may be compressed at 350, resulting in an attribute sub-stream sent to the multiplexer 360. The image padding may be based on an occupancy map reconstructed from the compressed patches/occupancy map, at 340. The sequence of the generated patches may be compressed at 315, resulting in a patch sub-stream sent to the multiplexer 360. This patch sub-stream may be considered as comprising compressed auxiliary information.
The multiplexer 360 may multiplex the patch sub-stream, the attribute sub-stream, the geometry sub-stream, and the occupancy sub-stream to produce a compressed bitstream that may be transmitted to a decoder, for example a decoder implementing the decompression process illustrated at
Referring now to
The attributes of the point cloud may be reconstructed, at 470, based on the decoded attribute video stream and reconstructed information for smoothed geometry and, if present, occupancy map and auxiliary information. After the attribute reconstruction stage, an additional attribute smoothing method may be used for point cloud refinement, at 490. The attribute transfer and smoothing may be based, at least partially, on auxiliary information and/or reconstructed geometry/attributes.
Referring now to
Referring now to
The patch information may be generated for each point cloud frame unless the information is considered static. In the example of
Referring now to
Referring now to
At 820, a normal may be estimated for each point. The tangent plane and its corresponding normal may be defined for each point based on the point's m nearest neighbours within a predefined search distance. At 830, initial segmentation, a K-D tree may be used to separate the data and find neighbours in the vicinity of a point p_i, and the barycenter c=(1/m)Σ_{i=1..m} p_i of those neighbours may be used to define the respective tangent plane.
The normal may be estimated from the eigen decomposition of the covariance matrix of the neighbourhood, M=Σ_{i=1..m}(p_i−c)(p_i−c)^T: the normal is the eigenvector of M associated with the smallest eigenvalue.
Based on this information, each point may be associated with a corresponding plane of a point cloud bounding box. Each plane may be defined by a corresponding normal {right arrow over (n)}p_idx with one of the following values:
(1.0, 0.0, 0.0),
(0.0, 1.0, 0.0),
(0.0, 0.0, 1.0),
(−1.0, 0.0, 0.0),
(0.0, −1.0, 0.0),
(0.0, 0.0, −1.0).
More precisely, each point may be associated with the plane that has the closest normal (i.e., the plane that maximizes the dot product of the point normal {right arrow over (n)}p and the plane normal {right arrow over (n)}p_idx).
The sign of the normal may be defined depending on the point's position in relation to the “center”.
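The normal estimation and plane association described above can be sketched as follows, using the eigen decomposition of the neighbourhood covariance matrix and the dot-product test against the six axis-aligned plane normals. The function names are invented for this sketch, and a real implementation would also handle normal-sign orientation and neighbour search.

```python
import numpy as np

# The six candidate projection-plane normals, in index order 0..5.
AXIS_NORMALS = np.array([
    ( 1.0, 0.0, 0.0), (0.0,  1.0, 0.0), (0.0, 0.0,  1.0),
    (-1.0, 0.0, 0.0), (0.0, -1.0, 0.0), (0.0, 0.0, -1.0),
])

def estimate_normal(neighbours):
    """PCA normal: the eigenvector of the neighbourhood covariance matrix
    associated with the smallest eigenvalue."""
    p = np.asarray(neighbours, dtype=float)
    c = p.mean(axis=0)                       # barycenter of the neighbourhood
    cov = (p - c).T @ (p - c)                # covariance matrix M
    eigvals, eigvecs = np.linalg.eigh(cov)   # eigenvalues in ascending order
    return eigvecs[:, 0]                     # eigenvector of the smallest eigenvalue

def projection_plane(normal):
    """Associate the point with the plane whose normal maximizes the dot
    product with the point normal."""
    return int(np.argmax(AXIS_NORMALS @ normal))

# Neighbours lying (almost) in the z=0 plane give a normal along +/-Z.
n = estimate_normal([(0, 0, 0), (2, 0, 0), (0, 2, 0), (2, 2, 0)])
```

Because `numpy.linalg.eigh` returns eigenvalues in ascending order, the first eigenvector is always the surface normal estimate.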
The initial clustering may then be refined by iteratively updating the clustered index associated with each point based on the point's normal and the cluster indices of the point's nearest neighbors, at 840 (i.e. refine segmentation).
At the following step, segment patches 850, the points may be clustered based on the closeness of the normals and the distance between points in Euclidean space. Final patches, 860, may be created from the clusters by grouping similar clusters. By adding a weight to each plane when the initial segmentation process, 830, decides the projection plane, the patches may be refined in order to increase the size of the patch in the front or back. The weight values may be calculated in the first frame per GOF. The weight may be determined according to the ratio of projected points when projecting all points to the three planes (XY, YZ, ZX).
The refine segmentation process, 840, may provide a minimum number of connected components (patches) for a given number of points in the point cloud frame 810.
Referring now to
Referring now to
The V-PCC and NAL unit sample stream format classes may be redesigned to avoid this two-pass approach by calculating the size precision at each instance of the sample stream unit syntax structure. Referring now to
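The single-pass idea can be sketched as follows, under the assumption (hypothetical, not the normative syntax) that each unit carries its own one-byte size precision followed by the unit size coded in that many bytes. Because the precision is computed per unit, nothing needs to be known about later units when a unit is written.

```python
def pack_unit(payload: bytes) -> bytes:
    """Prefix a unit with a per-unit size precision and its size."""
    size = len(payload)
    precision = max(1, (size.bit_length() + 7) // 8)  # bytes needed to code the size
    return bytes([precision]) + size.to_bytes(precision, "big") + payload

def unpack_units(stream: bytes):
    """Recover the ordered list of unit payloads from a packed stream."""
    units, pos = [], 0
    while pos < len(stream):
        precision = stream[pos]
        size = int.from_bytes(stream[pos + 1 : pos + 1 + precision], "big")
        pos += 1 + precision
        units.append(stream[pos : pos + size])
        pos += size
    return units

stream = pack_unit(b"abc") + pack_unit(b"x" * 300)
```

A stream-wide precision chosen once (as in a two-pass design) saves the per-unit precision byte at the cost of scanning all units first; the sketch trades that byte for single-pass writing.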
An atlas may be considered auxiliary patch information. For each patch, some or all of the following metadata may be encoded/decoded: Index of the projection plane (Index 0 for the plane (1.0, 0.0, 0.0); Index 1 for the plane (0.0, 1.0, 0.0); Index 2 for the plane (0.0, 0.0, 1.0); Index 3 for the plane (−1.0, 0.0, 0.0); Index 4 for the plane (0.0, −1.0, 0.0); Index 5 for the plane (0.0, 0.0, −1.0)); 2D bounding box (u0, v0, u1, v1); and/or 3D location (x0, y0, z0) of the patch represented in terms of depth δ0, tangential shift s0, and/or bi-tangential shift r0.
According to the chosen projection planes, (δ0, s0, r0) may be computed as follows: Index 0, δ0=x0, s0=z0 and r0=y0; Index 3, δ0=x0, s0=z0 and r0=y0; Index 1, δ0=y0, s0=z0 and r0=x0; Index 4, δ0=y0, s0=z0 and r0=x0; Index 2, δ0=z0, s0=x0 and r0=y0; Index 5, δ0=z0, s0=x0 and r0=y0. An addition to the index list to define the normal axis may be used for the additional 45-degree projection planes: Index 6 for the plane
Index 7 for the plane
Index 8 for the plane
Index 9 for the plane
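The index-to-axis mapping given above for the six axis-aligned projection planes can be written out as a small helper (the function name is hypothetical; the additional 45-degree planes, Indexes 6 to 9, are not handled here):

```python
def patch_3d_location(index, x0, y0, z0):
    """Map a projection-plane index and 3D patch location (x0, y0, z0) to
    (depth d0, tangential shift s0, bi-tangential shift r0)."""
    if index in (0, 3):      # planes (1,0,0) and (-1,0,0): normal along X
        return x0, z0, y0    # d0 = x0, s0 = z0, r0 = y0
    if index in (1, 4):      # planes (0,1,0) and (0,-1,0): normal along Y
        return y0, z0, x0    # d0 = y0, s0 = z0, r0 = x0
    if index in (2, 5):      # planes (0,0,1) and (0,0,-1): normal along Z
        return z0, x0, y0    # d0 = z0, s0 = x0, r0 = y0
    raise ValueError("only the six axis-aligned planes are handled here")
```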
The mapping information providing, for each T×T block, its associated patch index may be represented as follows: for each T×T block, let L be the ordered list of the indexes of the patches whose 2D bounding box contains that block. The order in the list may be the same as the order used to encode the 2D bounding boxes. L may be the list of candidate patches. The empty space between patches may be considered a patch and assigned the special index 0, which may be added to the candidate patches list of all the blocks. I may be the index of the patch to which the current T×T block belongs.
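The candidate-list construction described above can be sketched as follows (the function name and data layout are assumptions; block coordinates are in units of T):

```python
def candidate_patches(blocks, patches, T):
    """For each T x T block, build the ordered candidate patch list L.

    `patches` is a list of 2D bounding boxes (u0, v0, u1, v1) in the same
    order used to encode them; index 0 is reserved for empty space and is
    a candidate for every block.
    """
    result = {}
    for bx, by in blocks:
        x, y = bx * T, by * T
        L = [0]  # the special empty-space patch
        for idx, (u0, v0, u1, v1) in enumerate(patches, start=1):
            # the patch is a candidate if its bounding box contains the block
            if u0 <= x and x + T <= u1 and v0 <= y and y + T <= v1:
                L.append(idx)
        result[(bx, by)] = L
    return result
```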
Table 1 gives an example of patch data unit syntax:
Referring now to
At 1200, patch information data may be read. In an example, the input from the patch information data may be patch_mode, p, frmIdx, and/or refFrmIdx, 1210. At 1220, if patch_mode is SKIP, the arithmetic, spud_patch_index, may be decoded (1222), the refIdx may equal the value of [refFrmIdx][spud_patch_index] (1224), and the patch may be reconstructed (1270) according to one or more of the illustrated parameters (1226) (e.g. Patch2dShiftU[p]=pdu_2d_shift_u[refIdx]; Patch2dShiftV[p]=pdu_2d_shift_v[refIdx]; Patch2dSizeU[p]=Patch2dSizeU[refIdx]; Patch2dSizeV[p]=Patch2dSizeV[refIdx]; Patch3dShiftT[p]=Patch3dShiftT[refIdx]; Patch3dShiftBT[p]=Patch3dShiftBT[refIdx]; Patch3dShiftN[p]=Patch3dShiftN[refIdx]; PatchNormalAxis[p]=PatchNormalAxis[refIdx]; Orientation[p]=Orientation[refIdx]; PatchLoD[p]=PatchLod[refIdx]).
Else, at 1230, if patch_mode is INTRA, refIdx=[frmIdx] [p−1] (1232), one or more of the illustrated arithmetic (e.g. u0(pdu_2d_shift_u); u1(pdu_2d_shift_v); size_u0(pdu_2d_size_u); size_v0(pdu_2d_size_v); u1(pdu_3d_shift_tangent_axis); v1(pdu_3d_shift_bitangent_axis); d1(pdu_3d_shift_normal_axis); n(pdu_normal_axis); swap(pdu_orientation_swap_flag); LoD(pdu_lod)) may be decoded (1234), and the patch may be reconstructed (1270) according to one or more of the illustrated parameters (1236) (e.g. Patch2dShiftU[p]=pdu_2d_shift_u[p]; Patch2dShiftV[p]=pdu_2d_shift_v[p]; Patch2dSizeU[p]=pdu_2d_size_u[p]; Patch2dSizeV[p]=pdu_2d_size_v[p]; Patch3dShiftT[p]=pdu_3d_shift_tan[p]; Patch3dShiftBT[p]=pdu_3d_shift_bitan[p]; Patch3dShiftN[p]=pdu_shift_norm[p]; PatchNormalAxis[p]=pdu_norm_axis[p]; Orientation[p]=pdu_orientation_swap_flag[p]; PatchLoD[p]=pdu_lod[p]).
Else, at 1240, if patch_mode is INTER, the arithmetic, dpdu_patch_index, may be decoded (1242), the refIdx may be equal to [refFrmIdx][dpdu_patch_index] (1244), one or more of the illustrated arithmetic (e.g. d_u0(pdu_2d_shift_u); d_u1(pdu_2d_shift_v); d_size_u0(pdu_2d_delta_size_u); d_size_v0(pdu_2d_delta_size_v); d_u1(pdu_3d_shift_tangent_axis); d_v1(pdu_3d_shift_bitangent_axis); d_d1(pdu_3d_shift_normal_axis)) may be decoded (1246), and the patch may be reconstructed (1270) according to one or more of the illustrated parameters (1248) (e.g. Patch2dShiftU[p]=pdu_2d_shift_u[p]+Patch2dShiftU[refIdx]; Patch2dShiftV[p]=pdu_2d_shift_v[p]+Patch2dShiftV[refIdx]; Patch2dSizeU[p]=pdu_2d_delta_size_u[p]+Patch2dSizeU[refIdx]; Patch2dSizeV[p]=pdu_2d_delta_size_v[p]+Patch2dSizeV[refIdx]; Patch3dShiftT[p]=pdu_3d_shift_tan[p]+Patch3dShiftT[refIdx]; Patch3dShiftBT[p]=pdu_3d_shift_bitan[p]+Patch3dShiftBT[refIdx]; Patch3dShiftN[p]=pdu_shift_norm[p]+Patch3dShiftN[refIdx]; PatchNormalAxis[p]=PatchNormalAxis[refIdx]; Orientation[p]=Orientation[refIdx]; PatchLoD[p]=PatchLoD[refIdx]).
Else, at 1250, if patch_mode is PCM, refIdx may be equal to [frmIdx][p−1] (1252), one or more of the illustrated arithmetic (e.g. separate_video_flag(ppdu_patch . . . ); u0(ppdu_2d_shift_u); u1(ppdu_2d_shift_v); d_size_u0(ppdu_2d_delta_size_u); d_size_v0(ppdu_2d_delta_size_v); PCM points (ppdu_pcm_points)) may be decoded (1254), and the patch may be reconstructed (1270) according to the illustrated parameters (1256) (e.g. Patch2dShiftU[p]=pdu_2d_shift_u[p]; Patch2dShiftV[p]=pdu_2d_shift_v[p]; Patch2dSizeU[p]=pdu_2d_delta_size_u[p]+Patch2dSizeU[refIdx]; Patch2dSizeV[p]=pdu_2d_delta_size_v[p]+Patch2dSizeV[refIdx]; PatchPcmPoints[p]=ppdu_pcm_points[p]).
Else, at 1260, if patch_mode is LAST, the reconstruction process for patch_frame_data_unit may be finished, 1280.
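The per-patch mode dispatch above can be summarized in a simplified sketch (the full parameter lists are omitted; the function and parameter names are hypothetical, and `decoded`/`ref` stand in for the arithmetic-decoded syntax elements and the reference patch parameters respectively):

```python
def reconstruct_patch(patch_mode, decoded, ref):
    """Simplified per-patch reconstruction dispatch.

    SKIP copies every parameter from the reference patch, INTRA uses
    absolute decoded values, and INTER applies decoded deltas on top of
    the reference patch parameters.
    """
    if patch_mode == "SKIP":
        return dict(ref)                    # copy everything from refIdx
    if patch_mode == "INTRA":
        return dict(decoded)                # absolute values, no prediction
    if patch_mode == "INTER":
        out = dict(ref)
        for key, delta in decoded.items():  # decoded values are deltas
            out[key] = out.get(key, 0) + delta
        return out
    if patch_mode == "LAST":
        return None                         # end of patch_frame_data_unit
    raise ValueError(f"unhandled patch_mode {patch_mode!r}")
```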
Features as described herein may relate to radiative transfer. Radiative transfer is the transfer of energy as electromagnetic radiation. The propagation of radiation through a medium may be affected by absorption, emission, and/or scattering process(es). Radiative transfer may be described mathematically. In an example, such as
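For reference, a standard differential form of the radiative transfer equation along a ray parameterized by path length $s$ (the textbook form, not taken from the present disclosure) is:

```latex
\frac{dI(s,\omega)}{ds}
  = \underbrace{-\,(\kappa_a + \kappa_s)\, I(s,\omega)}_{\text{absorption and out-scattering}}
  \; + \; \underbrace{j(s,\omega)}_{\text{emission}}
  \; + \; \underbrace{\kappa_s \int_{4\pi} p(\omega',\omega)\, I(s,\omega')\, \mathrm{d}\omega'}_{\text{in-scattering}}
```

where $I$ is the radiance in direction $\omega$, $\kappa_a$ and $\kappa_s$ are the absorption and scattering coefficients of the medium, $j$ is the emission term, and $p$ is the phase function describing how light scatters from direction $\omega'$ into $\omega$.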
In projection-based 3D data compression, such as MPEG Visual Volumetric Video-based Coding (V3C), 3D data is projected on 2D patches, video encoded, and reconstructed into 3D space at the decoder side. This may be similar to V-PCC, as illustrated in
In an example embodiment, a set of new syntax elements to carry radiative transfer information on a per-patch level in V3C, to represent non-Lambertian surface characteristics, may be introduced.
Example embodiments of the present disclosure may relate to immersive video scenarios where an immersive volumetric scene is represented by a Visual Volumetric Video-based Coding (V3C) bitstream or similar representation. While V3C bitstreams are discussed herein, this should not be considered as limiting the scope of example embodiments; it should be understood that example embodiments of the present disclosure may be applicable to other representations of volumetric video data, including but not limited to representations in which original 3D data is represented as video-coded 2D projections with accompanying metadata. In an example, a decoder may decode the 2D video stream(s) and recreate the 3D scenery by remapping the 2D video information into 3D space.
In an example, certain parts of 3D scenery (or a 3D model) may have non-Lambertian characteristics, such as certain levels of opacity, reflection, refraction, etc. Such characteristics may be used on the rendering/displaying device to recreate a more immersive viewing experience.
Use cases for example embodiments of the present disclosure may include, but are not limited to: transparent objects, e.g. a windowpane, with respective levels of absorption opacity; reflective surfaces, e.g. a mirror, with respective level of reflectance; refractive materials, e.g. tinted glass, with respective levels of scattering opacity; diffuse reflection (“albedo”), e.g. skin reflection; bidirectional optical transfer function (BDTF); and/or a combination of the foregoing.
For the current level of content targeted in V3C MIV & V-PCC, absorption opacity and reflectance may be the two most important use cases. Therefore, in the present disclosure, these use cases are presented in more detail. However, the example embodiments may be extended to cover any other radiative transfer use case or form of radiative transfer signaling.
In an example embodiment, a model or scenery may include distinct areas with some sort of non-Lambertian reflection, e.g. a windowpane or a mirror. In an example embodiment, information required for adequately rendering this distinct area may be carried in the Patch data unit of all patches representing the area. Table 2 gives an example of patch data unit syntax according to an example embodiment of the present disclosure:
In the example of Table 2, pdu_rts_enabled_flag[tileID][patchIdx] may indicate whether radiative transfer signaling parameters are present for a given patch, patchIdx, of a tile, tileID. A flag value of "1" may specify that the radiative transfer signaling parameters are present for the current patch patchIdx of the current atlas tile, with tile ID equal to tileID. A flag value of "0" may indicate that no radiative transfer signaling parameters are present for the current patch. These values are non-limiting; other values or additional values may be used. If pdu_rts_enabled_flag[tileID][patchIdx] is not present in the patch data unit syntax, the value of the flag may be inferred to be equal to 0, i.e. no radiative transfer signaling parameters are present for the current patch. Where no radiative transfer signaling parameters are present for a patch, the patch may be rendered according to auxiliary information without radiative transfer rendering applied.
If the flag indicates that radiative transfer signaling parameters are present, the radiative transfer signaling parameters may include one or more of a transparency/opacity parameter, a reflectance parameter, and/or an absorption/scattering parameter.
In the example of Table 2, pdu_rts_opc[tileID][patchIdx] may specify a value for the opacity for the patch with index patchIdx of the current atlas tile, with tile ID equal to tileID. The value of pdu_rts_opc[tileID][patchIdx] may be in a range of 0 to 255, inclusive, where "0" may indicate a fully transparent patch, and "255" may indicate a fully opaque patch. These values are non-limiting; other values or additional/fewer values may be used. The upper limit of the range may be defined by the number of bits allocated for signalling, for example 255 for 8-bit signalling.
In the example of Table 2, pdu_rts_ref[tileID][patchIdx] may specify a value for the reflectance for the patch with index patchIdx of the current atlas tile, with tile ID equal to tileID. The value of pdu_rts_ref[tileID][patchIdx] may be in a range of 0 to 255, inclusive, where "0" may indicate a fully absorbing patch, and "255" may indicate a fully reflective patch. These values are non-limiting; other values or additional/fewer values may be used.
In the example of Table 2, pdu_rts_sca[tileID][patchIdx] may specify a value for the scattering for the patch with index patchIdx of the current atlas tile, with tile ID equal to tileID. The value of pdu_rts_sca[tileID][patchIdx] may be in a range of 0 to 255, inclusive, where "0" may indicate a fully absorbing patch, and "255" may indicate a fully scattering patch. These values are non-limiting; other values or additional/fewer values may be used.
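The conditional presence of the Table 2 fields can be sketched as a small parser; the bit-reader API (`read_flag()`/`read_uint(n)`) and the container class are assumptions, not part of the V3C specification:

```python
from dataclasses import dataclass

@dataclass
class RtsParams:
    """Hypothetical container for the per-patch radiative transfer fields."""
    opacity: int      # pdu_rts_opc: 0 = fully transparent, max = fully opaque
    reflectance: int  # pdu_rts_ref: 0 = fully absorbing,   max = fully reflective
    scattering: int   # pdu_rts_sca: 0 = fully absorbing,   max = fully scattering

def parse_rts(reader, bits=8):
    """Read the optional radiative transfer fields for one patch.

    If pdu_rts_enabled_flag is 0 (or absent), no RTS fields follow and
    the patch is rendered without radiative transfer applied.
    """
    if not reader.read_flag():        # pdu_rts_enabled_flag
        return None
    return RtsParams(
        opacity=reader.read_uint(bits),
        reflectance=reader.read_uint(bits),
        scattering=reader.read_uint(bits),
    )
```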
In an example embodiment, a decoder linked to a rendering/playback device may, upon receiving the radiative transfer information, decide on how to render the respective 3D reconstruction.
In an example embodiment, the decoder may receive opacity information for a patch (i.e. pdu_rts_opc). The decoder may inform the rendering unit to display the respective 3D reconstruction with the indicated level of opacity.
In an example embodiment, the decoder may receive reflectance information for a patch (i.e. pdu_rts_ref). The decoder may inform the rendering unit to display the respective 3D reconstruction with the indicated level of surface reflectance.
In an example embodiment, the decoder may receive scattering information for a patch (i.e. pdu_rts_sca). The decoder may inform the rendering unit to display the respective 3D reconstruction with the indicated level of light scattering.
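A renderer might map the decoded integer values to normalized material parameters before display; the normalization below is an assumption about renderer behavior, not something specified by the disclosure:

```python
def rts_to_render_params(opc, ref, sca, bits=8):
    """Normalize decoded RTS values to [0, 1] for a renderer's material model."""
    maxv = (1 << bits) - 1            # e.g. 255 for 8-bit signalling
    return {
        "alpha": opc / maxv,          # 0.0 = fully transparent, 1.0 = fully opaque
        "reflectance": ref / maxv,    # 0.0 = fully absorbing,  1.0 = fully reflective
        "scattering": sca / maxv,     # 0.0 = fully absorbing,  1.0 = fully scattering
    }
```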
It should be noted that the variable names used for opacity information, reflectance information, and scattering information are not intended to be limiting; other variable names may be used for example embodiments of the present disclosure.
A technical effect of example embodiments of the present disclosure may be to provide efficient radiative transfer signaling. Another technical effect of example embodiments of the present disclosure may be to enable a more realistic/immersive viewing experience.
In accordance with one aspect, an example method may be provided comprising: segmenting volumetric video data into one or more regions; determining at least one radiative transfer property of the one or more regions; indicating the at least one radiative transfer property of the one or more regions in a sub-stream; and including the sub-stream in a bitstream configured to describe the volumetric video data.
The example method may further comprise: determining whether to include an indication of the at least one radiative transfer property of the one or more regions based, at least partially, on the determined at least one radiative transfer property of the one or more regions, wherein the indicating of the at least one radiative transfer property of the one or more regions in the sub-stream may be based on a determination to include the indication of the at least one radiative transfer property of the one or more regions.
The example method may further comprise: including a flag in the sub-stream indicating whether the at least one radiative transfer property of the one or more regions may be indicated in the sub-stream.
The at least one radiative transfer property of the one or more regions may comprise at least one of: opacity information of the one or more regions; reflectance information of the one or more regions; or scattering information of the one or more regions.
The example method may further comprise: indicating an identifier of the one or more regions in the sub-stream; and indicating an identifier of an atlas tile of the one or more regions in the sub-stream.
The indicating of the at least one radiative transfer property of the one or more regions in the sub-stream may comprise indicating a value of the at least one radiative transfer property, wherein the value is in a range of zero to a maximum defined by the allocated bits for signalling.
The sub-stream may comprise one of: an attribute sub-stream, or a patch sub-stream.
The including of the sub-stream in the bitstream describing the volumetric video data may comprise multiplexing the sub-stream with at least one other sub-stream associated with the volumetric video data.
In accordance with one example embodiment, an apparatus may comprise: at least one processor; and at least one memory including computer program code; the at least one memory and the computer program code configured to, with the at least one processor, cause the apparatus to: segment volumetric video data into one or more regions; determine at least one radiative transfer property of the one or more regions; indicate the at least one radiative transfer property of the one or more regions in a sub-stream; and include the sub-stream in a bitstream configured to describe the volumetric video data.
The example apparatus may be further configured to: determine whether to include an indication of the at least one radiative transfer property of the one or more regions based, at least partially, on the determined at least one radiative transfer property of the one or more regions, wherein indicating the at least one radiative transfer property of the one or more regions in the sub-stream may be based on a determination to include the indication of the at least one radiative transfer property of the one or more regions.
The example apparatus may be further configured to: include a flag in the sub-stream indicating whether the at least one radiative transfer property of the one or more regions may be indicated in the sub-stream.
The at least one radiative transfer property of the one or more regions may comprise at least one of: opacity information of the one or more regions; reflectance information of the one or more regions; or scattering information of the one or more regions.
The example apparatus may be further configured to: indicate an identifier of the one or more regions in the sub-stream; and indicate an identifier of an atlas tile of the one or more regions in the sub-stream.
Indicating the at least one radiative transfer property of the one or more regions in the sub-stream may comprise indicating a value of the at least one radiative transfer property, wherein the value is in a range of zero to a maximum defined by the allocated bits for signalling.
The sub-stream may comprise one of: an attribute sub-stream, or a patch sub-stream.
Including the sub-stream in the bitstream describing the volumetric video data may comprise multiplexing the sub-stream with at least one other sub-stream associated with the volumetric video data.
In accordance with one example embodiment, an apparatus may comprise: processing circuitry; memory circuitry including computer program code, the memory circuitry and the computer program code configured to, with the processing circuitry, enable the apparatus to: segment volumetric video data into one or more regions; determine at least one radiative transfer property of the one or more regions; indicate the at least one radiative transfer property of the one or more regions in a sub-stream; and include the sub-stream in a bitstream configured to describe the volumetric video data.
In accordance with one example embodiment, an apparatus may comprise: circuitry configured to perform: segment volumetric video data into one or more regions; determine at least one radiative transfer property of the one or more regions; indicate the at least one radiative transfer property of the one or more regions in a sub-stream; and include the sub-stream in a bitstream configured to describe the volumetric video data.
As used in this application, the term "circuitry" may refer to one or more or all of the following: (a) hardware-only circuit implementations (such as implementations in only analog and/or digital circuitry) and (b) combinations of hardware circuits and software, such as (as applicable): (i) a combination of analog and/or digital hardware circuit(s) with software/firmware and (ii) any portions of hardware processor(s) with software (including digital signal processor(s)), software, and memory(ies) that work together to cause an apparatus, such as a mobile phone or server, to perform various functions and (c) hardware circuit(s) and/or processor(s), such as a microprocessor(s) or a portion of a microprocessor(s), that requires software (e.g., firmware) for operation, but the software may not be present when it is not needed for operation. This definition of circuitry applies to all uses of this term in this application, including in any claims. As a further example, as used in this application, the term circuitry also covers an implementation of merely a hardware circuit or processor (or multiple processors) or portion of a hardware circuit or processor and its (or their) accompanying software and/or firmware. The term circuitry also covers, for example and if applicable to the particular claim element, a baseband integrated circuit or processor integrated circuit for a mobile device or a similar integrated circuit in a server, a cellular network device, or other computing or network device.
The example apparatus may be further configured to: determine whether to include an indication of the at least one radiative transfer property of the one or more regions based, at least partially, on the determined at least one radiative transfer property of the one or more regions, wherein indicating the at least one radiative transfer property of the one or more regions in the sub-stream may be based on a determination to include the indication of the at least one radiative transfer property of the one or more regions.
The example apparatus may be further configured to: include a flag in the sub-stream indicating whether the at least one radiative transfer property of the one or more regions may be indicated in the sub-stream.
The at least one radiative transfer property of the one or more regions may comprise at least one of: opacity information of the one or more regions; reflectance information of the one or more regions; or scattering information of the one or more regions.
The example apparatus may be further configured to: indicate an identifier of the one or more regions in the sub-stream; and indicate an identifier of an atlas tile of the one or more regions in the sub-stream.
Indicating the at least one radiative transfer property of the one or more regions in the sub-stream may comprise indicating a value of the at least one radiative transfer property, wherein the value is in a range of zero to a maximum defined by the allocated bits for signalling.
The sub-stream may comprise one of: an attribute sub-stream, or a patch sub-stream.
Including the sub-stream in the bitstream describing the volumetric video data may comprise multiplexing the sub-stream with at least one other sub-stream associated with the volumetric video data.
In accordance with one example embodiment, an apparatus may comprise means for performing: segmenting volumetric video data into one or more regions; determining at least one radiative transfer property of the one or more regions; indicating the at least one radiative transfer property of the one or more regions in a sub-stream; and including the sub-stream in a bitstream configured to describe the volumetric video data.
The means may be further configured to perform: determining whether to include an indication of the at least one radiative transfer property of the one or more regions based, at least partially, on the determined at least one radiative transfer property of the one or more regions, wherein the indicating of the at least one radiative transfer property of the one or more regions in the sub-stream may be based on a determination to include the indication of the at least one radiative transfer property of the one or more regions.
The means may be further configured to perform: including a flag in the sub-stream indicating whether the at least one radiative transfer property of the one or more regions is indicated in the sub-stream.
The at least one radiative transfer property of the one or more regions may comprise at least one of: opacity information of the one or more regions; reflectance information of the one or more regions; or scattering information of the one or more regions.
The means may be further configured to perform: indicating an identifier of the one or more regions in the sub-stream; and indicating an identifier of an atlas tile of the one or more regions in the sub-stream.
The means configured to perform indicating of the at least one radiative transfer property of the one or more regions in the sub-stream may comprise means configured to perform indicating a value of the at least one radiative transfer property, wherein the value may be in a range of zero to a maximum defined by the allocated bits for signalling.
The sub-stream may comprise one of: an attribute sub-stream, or a patch sub-stream.
The means configured to perform including of the sub-stream in the bitstream describing the volumetric video data may comprise means configured to perform multiplexing the sub-stream with at least one other sub-stream associated with the volumetric video data.
In accordance with one example embodiment, a non-transitory computer-readable medium comprising program instructions stored thereon which, when executed with at least one processor, cause the at least one processor to: segment volumetric video data into one or more regions; determine at least one radiative transfer property of a region of the one or more regions; indicate the at least one radiative transfer property of the one or more regions in a sub-stream; and include the sub-stream in a bitstream configured to describe the volumetric video data.
In accordance with another example embodiment, a non-transitory program storage device readable by a machine may be provided, tangibly embodying a program of instructions executable by the machine for performing operations, the operations comprising: segment volumetric video data into one or more regions; determine at least one radiative transfer property of the one or more regions; indicate the at least one radiative transfer property of the one or more regions in a sub-stream; and include the sub-stream in a bitstream configured to describe the volumetric video data.
In accordance with one aspect, an example method may be provided comprising: receiving a bitstream describing volumetric video data; extracting a sub-stream from the received bitstream; determining whether the extracted sub-stream comprises at least one radiative transfer property for a region of the volumetric video data; based on a determination that the extracted sub-stream comprises the at least one radiative transfer property for the region, determining a value for the at least one radiative transfer property for the region; and rendering the region based on the determined value for the at least one radiative transfer property.
The determining of whether the extracted sub-stream comprises the at least one radiative transfer property for the region of the volumetric video data may comprise detecting a flag in the extracted sub-stream, wherein the flag may be configured to indicate that the at least one radiative transfer property is included in the extracted sub-stream.
The at least one radiative transfer property of the region may comprise at least one of: opacity information of the region; reflectance information of the region; or scattering information of the region.
The example method may further comprise: determining an identifier of the region; and determining an identifier of an atlas tile associated with the region, wherein the rendering of the region may be based, at least partially, on the identifier of the region and the identifier of the atlas tile.
The value for the at least one radiative transfer property for the region may comprise a value in a range of zero to a maximum defined by the allocated bits for signalling.
The sub-stream may comprise one of: an attribute sub-stream, or a patch sub-stream.
The extracting of the sub-stream from the received bitstream may comprise demultiplexing the bitstream.
The rendering of the region based on the determined value for the at least one radiative transfer property may comprise rendering a plurality of points associated with the region based on the determined value for the at least one radiative transfer property.
In accordance with one example embodiment, an apparatus may comprise: at least one processor; and at least one memory including computer program code; the at least one memory and the computer program code configured to, with the at least one processor, cause the apparatus to: receive a bitstream describing volumetric video data; extract a sub-stream from the received bitstream; determine whether the extracted sub-stream comprises at least one radiative transfer property for a region of the volumetric video data; based on a determination that the extracted sub-stream comprises the at least one radiative transfer property for the region, determine a value for the at least one radiative transfer property for the region; and render the region based on the determined value for the at least one radiative transfer property.
Determining whether the extracted sub-stream comprises the at least one radiative transfer property for the region of the volumetric video data may comprise detecting a flag in the extracted sub-stream, wherein the flag may be configured to indicate that the at least one radiative transfer property is included in the extracted sub-stream.
The at least one radiative transfer property of the region may comprise at least one of: opacity information of the region; reflectance information of the region; or scattering information of the region.
The example apparatus may be further configured to: determine an identifier of the region; and determine an identifier of an atlas tile associated with the region, wherein rendering the region may be based, at least partially, on the identifier of the region and the identifier of the atlas tile.
The value for the at least one radiative transfer property for the region may comprise a value in a range of zero to a maximum defined by the allocated bits for signalling.
The sub-stream may comprise one of: an attribute sub-stream, or a patch sub-stream.
Extracting the sub-stream from the received bitstream may comprise demultiplexing the bitstream.
Rendering the region based on the determined value for the at least one radiative transfer property may comprise rendering a plurality of points associated with the region based on the determined value for the at least one radiative transfer property.
In accordance with one example embodiment, an apparatus may comprise: processing circuitry; memory circuitry including computer program code, the memory circuitry and the computer program code configured to, with the processing circuitry, enable the apparatus to: receive a bitstream describing volumetric video data; extract a sub-stream from the received bitstream; determine whether the extracted sub-stream comprises at least one radiative transfer property for a region of the volumetric video data; based on a determination that the extracted sub-stream comprises the at least one radiative transfer property for the region, determine a value for the at least one radiative transfer property for the region; and render the region based on the determined value for the at least one radiative transfer property.
In accordance with one example embodiment, an apparatus may comprise: circuitry configured to perform: receive a bitstream describing volumetric video data; extract a sub-stream from the received bitstream; determine whether the extracted sub-stream comprises at least one radiative transfer property for a region of the volumetric video data; based on a determination that the extracted sub-stream comprises the at least one radiative transfer property for the region, determine a value for the at least one radiative transfer property for the region; and render the region based on the determined value for the at least one radiative transfer property.
Determining whether the extracted sub-stream comprises the at least one radiative transfer property for the region of the volumetric video data may comprise detecting a flag in the extracted sub-stream, wherein the flag may be configured to indicate that the at least one radiative transfer property is included in the extracted sub-stream.
The at least one radiative transfer property of the region may comprise at least one of: opacity information of the region; reflectance information of the region; or scattering information of the region.
The example apparatus may be further configured to: determine an identifier of the region; and determine an identifier of an atlas tile associated with the region, wherein rendering the region may be based, at least partially, on the identifier of the region and the identifier of the atlas tile.
The value for the at least one radiative transfer property for the region may comprise a value in a range of zero to a maximum defined by the allocated bits for signalling.
The sub-stream may comprise one of: an attribute sub-stream, or a patch sub-stream.
Extracting the sub-stream from the received bitstream may comprise demultiplexing the bitstream.
Rendering the region based on the determined value for the at least one radiative transfer property may comprise rendering a plurality of points associated with the region based on the determined value for the at least one radiative transfer property.
In accordance with one example embodiment, an apparatus may comprise means for performing: receiving a bitstream describing volumetric video data; extracting a sub-stream from the received bitstream; determining whether the extracted sub-stream comprises at least one radiative transfer property for a region of the volumetric video data; based on a determination that the extracted sub-stream comprises the at least one radiative transfer property for the region, determining a value for the at least one radiative transfer property for the region; and rendering the region based on the determined value for the at least one radiative transfer property.
The means configured to perform determining of whether the extracted sub-stream comprises the at least one radiative transfer property for the region of the volumetric video data may comprise means configured to perform detecting a flag in the extracted sub-stream, wherein the flag may be configured to indicate that the at least one radiative transfer property is included in the extracted sub-stream.
The at least one radiative transfer property of the region may comprise at least one of: opacity information of the region; reflectance information of the region; or scattering information of the region.
The means may be further configured to perform: determining an identifier of the region; and determining an identifier of an atlas tile associated with the region, wherein the rendering of the region may be based, at least partially, on the identifier of the region and the identifier of the atlas tile.
The value for the at least one radiative transfer property for the region may comprise a value in a range of zero to a maximum defined by the allocated bits for signalling.
The sub-stream may comprise one of: an attribute sub-stream, or a patch sub-stream.
The means configured to perform extracting of the sub-stream from the received bitstream may comprise means configured to perform demultiplexing the bitstream.
The means configured to perform rendering of the region based on the determined value for the at least one radiative transfer property may comprise means configured to perform rendering a plurality of points associated with the region based on the determined value for the at least one radiative transfer property.
In accordance with one example embodiment, a non-transitory computer-readable medium comprising program instructions stored thereon which, when executed with at least one processor, cause the at least one processor to: receive a bitstream describing volumetric video data; extract a sub-stream from the received bitstream; determine whether the extracted sub-stream comprises at least one radiative transfer property for a region of the volumetric video data; based on a determination that the extracted sub-stream comprises the at least one radiative transfer property for the region, determine a value for the at least one radiative transfer property for the region; and render the region based on the determined value for the at least one radiative transfer property.
In accordance with another example embodiment, a non-transitory program storage device readable by a machine may be provided, tangibly embodying a program of instructions executable by the machine for performing operations, the operations comprising: receiving a bitstream describing volumetric video data; extracting a sub-stream from the received bitstream; determining whether the extracted sub-stream comprises at least one radiative transfer property for a region of the volumetric video data; based on a determination that the extracted sub-stream comprises the at least one radiative transfer property for the region, determining a value for the at least one radiative transfer property for the region; and rendering the region based on the determined value for the at least one radiative transfer property.
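The encoder-side counterpart (segment into regions, determine a radiative transfer property per region, indicate it in a sub-stream, and include the sub-stream in the bitstream) can be sketched as below. The container layout and every field name are assumptions made for illustration only and do not reflect actual V3C bitstream syntax:

```python
def build_bitstream(regions):
    """Encoder-side sketch: indicate one radiative transfer property
    (opacity) per region in a sub-stream, then place the sub-stream in
    an illustrative bitstream container. Each input region is a dict
    with 'region_id', 'atlas_tile_id', and a normalized 'opacity'."""
    num_bits = 8  # allocated bits for signalling the property value
    max_value = (1 << num_bits) - 1
    sub_stream = {
        "rt_properties_present_flag": True,
        "rt_value_bit_depth": num_bits,
        "regions": [
            {
                "region_id": r["region_id"],
                "atlas_tile_id": r["atlas_tile_id"],
                "opacity_code": round(r["opacity"] * max_value),
            }
            for r in regions
        ],
    }
    # "Including the sub-stream in the bitstream" corresponds to
    # multiplexing it with the other sub-streams of the container.
    return {"sub_streams": [sub_stream]}
```

Signalling the property once per region, rather than once per point as in the known per-point approach, is what keeps the overhead of this scheme small.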
It should be understood that the foregoing description is only illustrative. Various alternatives and modifications can be devised by those skilled in the art. For example, features recited in the various dependent claims could be combined with each other in any suitable combination(s). In addition, features from different embodiments described above could be selectively combined into a new embodiment. Accordingly, the description is intended to embrace all such alternatives, modifications, and variances which fall within the scope of the appended claims.
Claims
1. An apparatus comprising:
- at least one processor; and
- at least one non-transitory memory and computer program code, wherein the at least one memory and the computer program code are configured to, with the at least one processor, cause the apparatus to: segment volumetric video data into one or more regions; determine at least one radiative transfer property of the one or more regions; indicate the at least one radiative transfer property of the one or more regions in a sub-stream; and include the sub-stream in a bitstream configured to describe the volumetric video data.
2. The apparatus of claim 1, wherein the at least one memory and the computer program code are further configured to, with the at least one processor, cause the apparatus to:
- determine whether to include an indication of the at least one radiative transfer property of the one or more regions based, at least partially, on the determined at least one radiative transfer property of the one or more regions,
wherein indicating the at least one radiative transfer property of the one or more regions in the sub-stream is based on a determination to include the indication of the at least one radiative transfer property of the one or more regions.
3. The apparatus of claim 1, wherein the at least one memory and the computer program code are further configured to, with the at least one processor, cause the apparatus to:
- include a flag in the sub-stream indicating whether the at least one radiative transfer property of the one or more regions is indicated in the sub-stream.
4. The apparatus of claim 1, wherein the at least one radiative transfer property of the one or more regions comprises at least one of:
- opacity information of the one or more regions;
- reflectance information of the one or more regions; or
- scattering information of the one or more regions.
5. The apparatus of claim 1, wherein the at least one memory and the computer program code are further configured to, with the at least one processor, cause the apparatus to:
- indicate an identifier of the one or more regions in the sub-stream; and
- indicate an identifier of an atlas tile of the one or more regions in the sub-stream.
6. The apparatus of claim 1, wherein indicating the at least one radiative transfer property of the one or more regions in the sub-stream comprises indicating a value of the at least one radiative transfer property, wherein the value is in a range of zero to a maximum defined by the allocated bits for signalling.
7. The apparatus of claim 1, wherein the sub-stream comprises one of:
- an attribute sub-stream, or
- a patch sub-stream.
8. The apparatus of claim 1, wherein including the sub-stream in the bitstream describing the volumetric video data comprises multiplexing the sub-stream with at least one other sub-stream associated with the volumetric video data.
9. A method comprising:
- segmenting volumetric video data into one or more regions;
- determining at least one radiative transfer property of the one or more regions;
- indicating the at least one radiative transfer property of the one or more regions in a sub-stream; and
- including the sub-stream in a bitstream configured to describe the volumetric video data.
10. A non-transitory computer-readable medium comprising program instructions stored thereon which, when executed with at least one processor, cause the at least one processor to:
- segment volumetric video data into one or more regions;
- determine at least one radiative transfer property of the one or more regions;
- indicate the at least one radiative transfer property of the one or more regions in a sub-stream; and
- include the sub-stream in a bitstream configured to describe the volumetric video data.
11. An apparatus comprising:
- at least one processor; and
- at least one non-transitory memory and computer program code, wherein the at least one memory and the computer program code are configured to, with the at least one processor, cause the apparatus to: receive a bitstream describing volumetric video data; extract a sub-stream from the received bitstream; determine whether the extracted sub-stream comprises at least one radiative transfer property for a region of the volumetric video data; based on a determination that the extracted sub-stream comprises the at least one radiative transfer property for the region, determine a value for the at least one radiative transfer property for the region; and render the region based on the determined value for the at least one radiative transfer property.
12. The apparatus of claim 11, wherein determining whether the extracted sub-stream comprises the at least one radiative transfer property for the region of the volumetric video data comprises detecting a flag in the extracted sub-stream, wherein the flag is configured to indicate that the at least one radiative transfer property is included in the extracted sub-stream.
13. The apparatus of claim 11, wherein the at least one radiative transfer property of the region comprises at least one of:
- opacity information of the region;
- reflectance information of the region; or
- scattering information of the region.
14. The apparatus of claim 11, wherein the at least one memory and the computer program code are further configured to, with the at least one processor, cause the apparatus to:
- determine an identifier of the region; and
- determine an identifier of an atlas tile associated with the region,
wherein rendering the region is based, at least partially, on the identifier of the region and the identifier of the atlas tile.
15. The apparatus of claim 11, wherein the value for the at least one radiative transfer property for the region comprises a value in a range of zero to a maximum defined by the allocated bits for signalling.
16. The apparatus of claim 11, wherein the sub-stream comprises one of:
- an attribute sub-stream, or
- a patch sub-stream.
17. The apparatus of claim 11, wherein extracting the sub-stream from the received bitstream comprises demultiplexing the bitstream.
18. The apparatus of claim 11, wherein rendering the region based on the determined value for the at least one radiative transfer property comprises rendering a plurality of points associated with the region based on the determined value for the at least one radiative transfer property.
19. A method comprising:
- receiving a bitstream describing volumetric video data;
- extracting a sub-stream from the received bitstream;
- determining whether the extracted sub-stream comprises at least one radiative transfer property for a region of the volumetric video data;
- based on a determination that the extracted sub-stream comprises the at least one radiative transfer property for the region, determining a value for the at least one radiative transfer property for the region; and
- rendering the region based on the determined value for the at least one radiative transfer property.
20. A non-transitory computer-readable medium comprising program instructions stored thereon which, when executed with at least one processor, cause the at least one processor to:
- receive a bitstream describing volumetric video data;
- extract a sub-stream from the received bitstream;
- determine whether the extracted sub-stream comprises at least one radiative transfer property for a region of the volumetric video data;
- based on a determination that the extracted sub-stream comprises the at least one radiative transfer property for the region, determine a value for the at least one radiative transfer property for the region; and
- render the region based on the determined value for the at least one radiative transfer property.
Type: Application
Filed: Dec 7, 2021
Publication Date: Jun 16, 2022
Inventor: Sebastian Schwarz (Unterhaching)
Application Number: 17/544,217