BEZIER VOLUME REPRESENTATION OF POINT CLOUD ATTRIBUTES

The systems and methods discussed herein implement a volumetric approach to point cloud representation, compression, decompression, communication, or any suitable combination thereof. The volumetric approach can be used for both geometry and attribute compression and decompression, and both geometry and attributes can be represented by volumetric functions. To create a compressed representation of the geometry or attributes of a point cloud, a suitable set of volumetric functions are transformed, quantized, and entropy-coded. When decoded, the volumetric functions are sufficient to reconstruct the corresponding geometry or attributes of the point cloud.

Description
RELATED APPLICATIONS

This application is a continuation of U.S. patent application Ser. No. 16/250,704, filed Jan. 17, 2019, which application claims the priority benefit of U.S. Provisional Patent Application No. 62/619,516, filed Jan. 19, 2018, which are incorporated herein by reference in their entireties.

TECHNICAL FIELD

The subject matter disclosed herein generally relates to the technical field of special-purpose machines that facilitate computer graphics, including software-configured computerized variants of such special-purpose machines and improvements to such variants, and to the technologies by which such special-purpose machines become improved compared to other special-purpose machines that facilitate computer graphics. Specifically, the present disclosure addresses systems and methods to facilitate, among other things, representation of point cloud attributes with a data structure based on a Bezier volume.

BACKGROUND

A machine (e.g., a computer graphics processing machine) may be configured to generate, store, access, send, receive, compress, decompress, modify, render, display, or any suitable combination thereof, a data structure that represents computer graphics or some portion thereof. In computer graphics, a point cloud is a set of points (also known as locations or positions) in Euclidean space, with a vector of one or more attributes associated with each point (e.g., a color vector, such as a color triple, or a motion vector). Point clouds may be static or dynamic. A dynamic point cloud can be considered to be a sequence of static point clouds, each in its own frame. Point clouds have applications in robotics, tele-operations, virtual and augmented reality, cultural heritage preservation, geographic information systems, and so forth.

Point clouds can represent volumetric media, which may be popularly known as holograms. In general, a hologram is an object or scene whose representation permits rendering arbitrary points of view, such that the object or scene appears to occupy space due to stereo or motion parallax. A point cloud can form all or part of a data structure that represents a hologram, because each point, with one or more corresponding color attributes, can represent the color of light rays that pass through that point.

Volumetric media have emerged as the first significant new modality for immersive communication since the introduction of audio for audio recordings and the introduction of video for motion pictures. Like audio and video, volumetric media may be used in three major communication scenarios: on-demand consumption of pre-recorded content, broadcast of live or pre-recorded content, and interactive communication, such as telephony or conferencing. Since point clouds can be very large, especially for complex objects or scenes, efficient storage and transmission may be important; hence data compression may be important.

Two issues in point cloud compression are geometry compression and attribute compression. Geometry compression, which may be called shape compression, involves compressing the point locations; attribute compression involves compressing the attribute values, given the point locations. Some approaches to point cloud compression compress both geometry and attributes. For example, the attributes of each point may be compressed and decompressed under the assumption that the point locations have already been compressed and decompressed, and that the decompressed point locations are available to the encoder and decoder as supplemental information when the attribute compression is done. Some other approaches to point cloud compression compress the color attributes of each point by truncating the color components (e.g., RGB values) to a few bits each. More sophisticated compression can result in an order of magnitude reduction in bit rate. Other approaches use Graph Transforms, a Region Adaptive Hierarchical Transform (RAHT), or a Karhunen-Loève transform based on a Gaussian Process model. In such approaches, the attributes are viewed as a signal ƒ1, . . . , ƒN defined on a finite set of N points x1, . . . , xN in space, and the transforms are viewed as discrete transforms on these N points, where N is the number of points in the point cloud.

BRIEF DESCRIPTION OF THE DRAWINGS

Some embodiments are illustrated by way of example and not limitation in the figures of the accompanying drawings.

FIG. 1 is a diagram illustrating central B-spline basis functions of order p, according to some example embodiments.

FIG. 2 is a block diagram illustrating two different analysis-synthesis structures, according to some example embodiments.

FIG. 3 is a diagram illustrating a Haar Transform butterfly structure for depth d=3, according to some example embodiments.

FIG. 4 is a diagram illustrating a RAHT tree structure for depth d=3, according to some example embodiments.

FIG. 5 is a diagram illustrating a Haar Transform tree structure for depth d=3, according to some example embodiments.

FIG. 6 is a flowchart illustrating operations in a method of encoding point cloud attributes, according to some example embodiments.

FIG. 7 is a flowchart illustrating operations in a method of decoding previously encoded point cloud attributes, according to some example embodiments.

FIG. 8 is a block diagram illustrating components of a machine, according to some example embodiments, able to read instructions from a machine-readable medium and perform any one or more of the methodologies discussed herein.

DETAILED DESCRIPTION

Example methods (e.g., algorithms) facilitate representation of attributes of a point cloud, which may accordingly facilitate generation, compression, decompression, or other processing of a data structure that represents the attributes of the point cloud. Example systems (e.g., special-purpose machines configured by special-purpose software) are configured to perform such example methods. Examples merely typify possible variations. Unless explicitly stated otherwise, structures (e.g., structural components, such as modules) are optional and may be combined or subdivided, and operations (e.g., in a procedure, algorithm, or other function) may vary in sequence or be combined or subdivided. In the following description, for purposes of explanation, numerous specific details are set forth to provide a thorough understanding of various example embodiments. It will be evident to one skilled in the art, however, that the present subject matter may be practiced without these specific details.

The systems and methods described herein represent the attributes of a point cloud volumetrically within a data structure, and such a data structure may be generated or stored in a computer-readable medium, such as a memory or storage drive, as well as communicated (e.g., transmitted or received) or otherwise processed (e.g., modified, compressed, decompressed, rendered, or displayed). A volumetric representation of the attributes is a function ƒ(x) defined over all points x in space, not just the finite set of points x1, . . . , xN in the point cloud. Once the set of points x1, . . . , xN is transmitted, and the volumetric function ƒ is transmitted, the value of the attributes on each point can be reconstructed as ƒ̂(x1), . . . , ƒ̂(xN), where ƒ̂ is the recovered version of ƒ. The systems and methods described herein represent the volumetric function ƒ as a sequence of wavelet coefficients or other coefficients (e.g., within a data structure that specifies such a sequence). The coefficients are compressed using quantization, entropy coding, or both, and are accordingly decompressed using entropy decoding, inverse quantization, or both, to reconstruct an approximation of the function. Representing the volumetric function ƒ in a B-spline basis of order p=1 provides results similar to representing the values ƒ1, . . . , ƒN using RAHT. Furthermore, the systems and methods described herein may represent the volumetric function ƒ in a B-spline basis of order p=2. This basis causes the reconstructed function ƒ̂ to be continuous, which may be helpful for certain applications.

The systems and methods discussed herein implement a volumetric approach to point cloud representation, compression, decompression, communication, or any suitable combination thereof. The volumetric approach can be used for both geometry and attribute compression and decompression, and both geometry and attributes can be represented by volumetric functions. As used herein, the term “volumetric function” refers to a scalar-valued or vector-valued function defined on a volume of space (e.g., in contrast to a function defined on an image plane or on a finite set of points). To create a compressed representation of the geometry or attributes of a point cloud, a suitable set of volumetric functions are transformed, quantized, and entropy-coded. When decoded, the volumetric functions are sufficient to reconstruct the corresponding geometry or attributes of the point cloud.

A scalar volumetric function is used to represent the geometry of a point cloud, and the geometry of the point cloud can be reconstructed by obtaining the level set of the decoded scalar volumetric function. Certain example embodiments use a signed distance function or an occupancy probability as the scalar volumetric function for geometry.

A vector-valued volumetric function is used to represent attributes of the point cloud, and the systems and methods described herein generally use a vector-valued volumetric function having the same dimension as the attributes. The attributes can accordingly be reconstructed by obtaining the values of the decoded vector-valued volumetric function at the points of the decoded geometry. Certain example embodiments determine the vector-valued volumetric function by solving a linear regression. The parameters of the function may be B-spline wavelet coefficients, similar to the parameters of linked Bezier volumes. Such functions exist in a Hilbert space in which distance and orthogonality are induced by an inner product defined by a novel counting measure supported by the decoded point locations.

Measure

Let Ω be a set, and let σ(Ω) be a sigma algebra of subsets of Ω. A measure is a function μ:σ(Ω)→R that assigns a real number to each set in σ(Ω).

It is helpful to focus on the case where Ω=R3. Suppose a finite set of points in R3, say B={x1, . . . , xN}. For each set M∈σ(R3), define μ(M)=|M∩B| to be the number of such points in M. This can be called a counting measure.

Let ƒ:R3→R be a real-valued function on R3. The integral of ƒ over a set M∈σ(R3) with respect to measure μ is denoted ∫Mƒ(x)dμ(x). When μ is a counting measure, the integral is equal to

$$\int_M f(\mathbf{x})\,d\mu(\mathbf{x}) = \sum_{\mathbf{x}_n \in M} f(\mathbf{x}_n). \tag{1}$$

Hilbert Space

A Hilbert space is a complete normed vector space equipped with an inner product that induces the norm. Consider the Hilbert space F of real-valued functions ƒ:R3→R equipped with inner product

$$\langle f, g \rangle = \int f(\mathbf{x})\,g(\mathbf{x})\,d\mu(\mathbf{x}) = \sum_{n} f(\mathbf{x}_n)\,g(\mathbf{x}_n), \tag{2}$$

where ƒ, g∈F, and ∥ƒ∥=√<ƒ, ƒ> is the induced norm.

Note that ∥ƒ∥ depends on the value of ƒ only at the points x1, . . . , xN. Hence, strictly speaking, ∥ƒ∥ is only a pseudo-norm, as there are non-zero functions ƒ:R3→R such that ∥ƒ∥=0. However, ∥ƒ∥ can be called a norm with the usual understanding that ∥ƒ∥=0 implies ƒ=0 almost everywhere (a.e.) with respect to measure μ. Alternatively, one can consider the space of equivalence classes of functions, where two functions ƒ and g are deemed equivalent if ƒ=g a.e. Denoting by ƒ̃ the equivalence class of functions equivalent to ƒ, one can show that the set of equivalence classes F̃={ƒ̃|ƒ:R3→R} is a vector space with a proper norm ∥ƒ̃∥ induced by the inner product <ƒ̃,g̃>, which in turn is induced by the inner product <ƒ,g> between representatives. The vector space F̃ is isomorphic to RN, and is hence a Hilbert space. Thus, take F to be a Hilbert space with the usual “almost everywhere” understanding or with the equivalence class understanding.
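By way of a non-limiting illustration, the inner product of equation (2) and its induced pseudo-norm may be sketched as follows. The three-point cloud and the functions `f` and `bump` are hypothetical values chosen only for this example; `bump` is a non-zero function that vanishes at every point of the cloud, so its pseudo-norm is zero.

```python
import math

def inner(f, g, points):
    """Inner product <f, g> under the counting measure: a sum over the cloud's points."""
    return sum(f(x) * g(x) for x in points)

def norm(f, points):
    """Pseudo-norm induced by the inner product."""
    return math.sqrt(inner(f, f, points))

# Hypothetical three-point cloud in R^3 (not taken from the disclosure).
points = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 2.0, 0.0)]

def f(x):
    # Takes the values 0, 1, 2 on the three points above.
    return x[0] + x[1]

def bump(x):
    # Non-zero away from the cloud, but zero at every point of the cloud.
    return x[0] * (x[0] - 1.0) * x[1] * (x[1] - 2.0) + x[2]
```

Because `bump` is zero on the cloud, `f` and `f + bump` belong to the same equivalence class under the “almost everywhere” understanding described above.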

With the inner product and norm so determined, other properties of the Hilbert space follow. Specifically, a vector g∈F is orthogonal to a vector ƒ∈F iff <ƒ,g>=0. A vector g∈F is orthogonal to a subspace F0⊆F iff g is orthogonal to all ƒ∈F0. A subspace G0⊆F is the orthogonal complement to a subspace F0⊆F iff, for all g∈G0, g is orthogonal to F0. A point ƒ0*∈F0 is the projection of a point ƒ∈F onto the subspace F0⊆F iff it minimizes ∥ƒ−ƒ0∥ over ƒ0∈F0. The projection ƒ0* of ƒ onto F0, denoted ƒºF0, exists and is unique almost everywhere with respect to the measure μ. A necessary and sufficient condition for ƒ0* to be the projection of ƒ onto F0 is that the approximation error (ƒ−ƒ0*) is orthogonal to F0.

Bezier Volumes

A Bézier curve of degree m is a function on the unit interval b:[0,1]→R specified as a linear combination of Bernstein polynomials, namely,

$$b(x) = \sum_{i=0}^{m} B_i\, b_{m,i}(x), \tag{3}$$

where B0, . . . , Bm are the coefficients of the linear combination, and

$$b_{m,i}(x) = \binom{m}{i} x^{i} (1-x)^{m-i}, \tag{4}$$

i=0, . . . , m, are the mth order Bernstein polynomials, which are polynomials of degree m defined on the unit interval [0,1].

Analogously, a Bézier volume (BV) of degree m is a function on the unit cube b:[0,1]3→R specified as a linear combination of products of Bernstein polynomials, namely,

$$b(x,y,z) = \sum_{i=0}^{m}\sum_{j=0}^{m}\sum_{k=0}^{m} B_{ijk}\, b_{m,i}(x)\, b_{m,j}(y)\, b_{m,k}(z), \tag{5}$$

where Bijk, i, j, k∈{0, . . . , m}, are the coefficients of the linear combination.

A function b(x,y,z) is tri-polynomial of degree m if it is a polynomial of degree m in each of its coordinates when its other coordinates have any fixed value. Thus, a BV is tri-polynomial of degree m over the unit cube.
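By way of a non-limiting illustration, the Bernstein polynomials of equation (4) and the Bézier volume of equation (5) may be evaluated as sketched below. The function names and the two coefficient arrays `B_linear` and `B_ones` are hypothetical and serve only as examples.

```python
from math import comb

def bernstein(m, i, x):
    """Bernstein polynomial b_{m,i}(x) of equation (4) on the unit interval."""
    return comb(m, i) * x**i * (1.0 - x)**(m - i)

def bezier_volume(B, x, y, z):
    """Evaluate equation (5); B is an (m+1)^3 nested list of coefficients B[i][j][k]."""
    m = len(B) - 1
    idx = range(m + 1)
    return sum(B[i][j][k] * bernstein(m, i, x) * bernstein(m, j, y) * bernstein(m, k, z)
               for i in idx for j in idx for k in idx)

# Hypothetical degree-1 example: B[i][j][k] = i reproduces the function b(x, y, z) = x.
B_linear = [[[float(i) for k in range(2)] for j in range(2)] for i in range(2)]

# Hypothetical degree-2 example with every coefficient equal to one; because the
# Bernstein polynomials sum to one, the resulting volume is constant.
B_ones = [[[1.0] * 3 for _ in range(3)] for _ in range(3)]
```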

Cardinal B-Splines

A cardinal B-spline function of order p is a function on the real line ƒ:R→R specified as a linear combination of B-spline basis functions of order p, namely

$$f(x) = \sum_{n\in\mathbb{Z}} F_n\, \phi^{(p)}(x-n), \tag{6}$$

where Fn, n∈Z, are the coefficients of the linear combination, and ϕ(p)(x−n) is the B-spline basis function of order p at integer shift n. The B-spline basis function ϕ(p)(x) can be defined for p=1 as

$$\phi^{(1)}(x) = \begin{cases} 1 & x\in[0,1] \\ 0 & \text{otherwise} \end{cases} \tag{7}$$

and recursively for p>1 as


$$\phi^{(p)}(x) = \int \phi^{(1)}(t)\, \phi^{(p-1)}(x-t)\, dt \tag{8}$$

for all x. From this definition, it can be seen that ϕ(p)(x) is the p-fold convolution of ϕ(1)(x) with itself, and that the support of ϕ(p)(x) is an interval of length p, as shown in FIG. 1.

An alternative recursive definition for p>1 is

$$\phi^{(p)}(x) = \frac{1}{p-1}\left[ x\, \phi^{(p-1)}(x) + (p-x)\, \phi^{(p-1)}(x-1) \right] \tag{9}$$

for x∈[0, p] and ϕ(p)(x)=0 otherwise. From this second definition, it follows that ϕ(p)(x) is a polynomial of degree p−1 on all integer shifts of the unit interval, [n,n+1], n∈Z, which may be called blocks, and that ƒ(x) is piecewise polynomial of degree p−1 over each block. It also follows that ƒ(x) is C^(p−2) continuous, meaning that ƒ(x) and all of its derivatives up to its (p−2)th derivative are continuous, even at the breakpoints between blocks, which may be called knots.
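By way of a non-limiting illustration, the B-spline basis function may be evaluated with the second recursive definition, read with the order p−1 basis function on the right-hand side. In this sketch the order-1 function is taken on the half-open interval [0,1), which is an assumption made so that adjacent shifts do not double-count at the knots; the function names are hypothetical.

```python
def bspline(p, x):
    """Cardinal B-spline basis function of order p, supported on [0, p]."""
    if p == 1:
        # Half-open interval (an assumption of this sketch) to avoid
        # counting a knot twice when shifts are summed.
        return 1.0 if 0.0 <= x < 1.0 else 0.0
    if x < 0.0 or x > p:
        return 0.0
    # Recursion: order p is built from two shifted copies of order p - 1.
    return (x * bspline(p - 1, x) + (p - x) * bspline(p - 1, x - 1.0)) / (p - 1)

def shifted_sum(p, x):
    """Sum of all integer shifts of the order-p basis function that can cover x in [0, 1)."""
    return sum(bspline(p, x + n) for n in range(p))
```

For p=2 the function is the familiar piecewise-linear "hat" on [0,2], and for any order the integer shifts sum to one at every point, which `shifted_sum` checks numerically.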

Analogously, a cardinal B-spline volume of order p is a volumetric function ƒ:R3→R specified as a linear combination of vector integer shifts of a product of B-spline basis functions of order p, namely

$$f(\mathbf{x}) = \sum_{\mathbf{n}\in\mathbb{Z}^3} F_{\mathbf{n}}\, \phi^{(p)}(\mathbf{x}-\mathbf{n}), \tag{10}$$

where Fn is the coefficient of the linear combination at vector integer shift n∈Z3, and ϕ(p)(x)=ϕ(p)(x, y, z)=ϕ(p)(x)ϕ(p)(y)ϕ(p)(z) is the product of B-spline basis functions ϕ(p)(x), ϕ(p)(y), and ϕ(p)(z).

It can be seen that ƒ(x) is tri-polynomial of degree p−1 over each shifted unit cube [0,1]3+n, or block. Further, it can be seen that ƒ(x) is C^(p−2) continuous between blocks. Thus, ƒ(x) can be considered to be a collection of Bézier volumes of degree p−1 linked together such that the overall function is C^(p−2) continuous.

Approximation

Let ϕ0,0(x)=ϕ(p)(x−n0) be the central cardinal B-spline basis function, that is, the cardinal B-spline basis function centered on the origin 0 (if p is even) or on the unit cube [0,1]3 (if p is odd). Let ϕ0,n(x)=ϕ0,0(x−n) be this central cardinal B-spline basis function shifted by integer vector n.

For notational simplicity, one can suppress the dependence of ϕ0,n on p. Specific values of p are discussed below.

Let F be the Hilbert space of all functions ƒ:R3→R under the inner product equation (2), as defined above. Define the subspace F0⊆F as

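$$\mathcal{F}_0 = \Big\{ f_0\in\mathcal{F} \;\Big|\; \exists\{F_{\mathbf{n}}\}\ \text{s.t.}\ f_0(\mathbf{x}) = \sum_{\mathbf{n}\in\mathbb{Z}^3} F_{\mathbf{n}}\, \phi_{0,\mathbf{n}}(\mathbf{x}) \Big\}. \tag{11}$$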

This is the subspace of all functions that are tri-polynomial of degree p−1 over the blocks {[0,1]3+n|n∈Z3}.

Any function ƒ∈F can be approximated by a function ƒ0*∈F0, where ƒ0* is the projection of ƒ onto F0, denoted ƒ0*=ƒºF0. Let {Fn*} be coefficients such that ƒ0*(x)=Σn∈Z3 Fn*ϕ0,n(x). Under the inner product equation (2), the squared error between ƒ and ƒ0*,

$$\lVert f - f_0^* \rVert^2 = \sum_{i=1}^{N} \big( f(\mathbf{x}_i) - f_0^*(\mathbf{x}_i) \big)^2, \tag{12}$$

depends on the values of ƒ0* only at the points x1, . . . , xN, which in turn depend on any particular coefficient Fn* only if ϕ0,n(xi)≠0 for some xi, i=1 . . . , N. Let


N0={n|∃i∈{1, . . . ,N}s.t. ϕ0,n(xi)≠0}  (13)

be the set of vector integer shifts n such that ϕ0,n(xi)≠0 for some xi. This set is finite because ϕ0,0 has bounded support, and any of its shifts far away from the points x1, . . . , xN will not include any such point in its support. Let {ni} denote the finite set of shifts in N0.

For any n∉N0, assign Fn*=0, and for any n∈N0, solve for Fn* by noting that the approximation error (ƒ−ƒ0*) must be orthogonal to ϕ0,n for all n∈Z3. In particular, for all ni∈N0,

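$$0 = \langle \phi_{0,\mathbf{n}_i},\, f - f_0^* \rangle = \langle \phi_{0,\mathbf{n}_i}, f \rangle - \langle \phi_{0,\mathbf{n}_i}, f_0^* \rangle, \quad\text{or} \tag{14}$$

$$\langle \phi_{0,\mathbf{n}_i}, f \rangle = \langle \phi_{0,\mathbf{n}_i}, f_0^* \rangle = \sum_{\mathbf{n}_j\in\mathcal{N}_0} \langle \phi_{0,\mathbf{n}_i}, \phi_{0,\mathbf{n}_j} \rangle\, F^*_{\mathbf{n}_j}. \tag{15}$$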

In vector form,


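$$\Phi_0^T f = \Phi_0^T \Phi_0 F^* \tag{16}$$

where $\Phi_0^T f$ is shorthand for the $|\mathcal{N}_0|\times 1$ vector $[\langle \phi_{0,\mathbf{n}_i}, f\rangle]$, $\Phi_0^T\Phi_0$ is shorthand for the $|\mathcal{N}_0|\times|\mathcal{N}_0|$ matrix $[\langle \phi_{0,\mathbf{n}_i}, \phi_{0,\mathbf{n}_j}\rangle]$, and $F^*$ is the $|\mathcal{N}_0|\times 1$ vector $[F^*_{\mathbf{n}_i}]$. If $\Phi_0^T\Phi_0$ is invertible, then one may solve for $F^*$ explicitly as


$$F^* = (\Phi_0^T \Phi_0)^{-1} \Phi_0^T f. \tag{17}$$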
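By way of a non-limiting illustration, the normal equations of equations (16) and (17) may be assembled and solved as sketched below for order p=2, using a central tri-linear "hat" basis function centered on each integer lattice point. The nine-point cloud, the attribute function `attribute`, and the small Gaussian-elimination helper `solve` are all hypothetical choices for this example and are not taken from the disclosure.

```python
import itertools
import math

def hat(t):
    """Central order-2 B-spline in one dimension (support [-1, 1])."""
    return max(0.0, 1.0 - abs(t))

def phi(n, x):
    """Central basis function shifted by integer vector n, evaluated at point x."""
    return hat(x[0] - n[0]) * hat(x[1] - n[1]) * hat(x[2] - n[2])

def occupied_shifts(points):
    """The set N0 of equation (13): shifts whose basis function is non-zero on some point."""
    shifts = set()
    for x in points:
        base = [math.floor(c) for c in x]
        for d in itertools.product((0, 1), repeat=3):
            n = tuple(b + e for b, e in zip(base, d))
            if phi(n, x) != 0.0:
                shifts.add(n)
    return sorted(shifts)

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for c in range(n):
        piv = max(range(c, n), key=lambda r: abs(M[r][c]))
        if abs(M[piv][c]) < 1e-12:
            raise ValueError("Gram matrix is singular for this point set")
        M[c], M[piv] = M[piv], M[c]
        for r in range(c + 1, n):
            factor = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= factor * M[c][k]
    x = [0.0] * n
    for r in reversed(range(n)):
        x[r] = (M[r][n] - sum(M[r][k] * x[k] for k in range(r + 1, n))) / M[r][r]
    return x

def fit(points, values):
    """Return (shifts, F*) where F* solves the normal equations (16)."""
    shifts = occupied_shifts(points)
    gram = [[sum(phi(ni, x) * phi(nj, x) for x in points) for nj in shifts] for ni in shifts]
    rhs = [sum(phi(ni, x) * v for x, v in zip(points, values)) for ni in shifts]
    return shifts, solve(gram, rhs)

def attribute(x):
    # Hypothetical tri-linear attribute, exactly representable in the basis.
    return 1.0 + 2.0 * x[0] + 3.0 * x[1] * x[2]

# Hypothetical cloud: the eight corners of the unit cube plus its center.
points = [tuple(map(float, c)) for c in itertools.product((0, 1), repeat=3)]
points.append((0.5, 0.5, 0.5))
values = [attribute(x) for x in points]
shifts, coeffs = fit(points, values)
```

Because the hypothetical attribute is itself tri-linear, the fitted coefficient at each shift equals the attribute value at that lattice point, and the approximation error at every point of the cloud is zero.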

Multiresolution Approximation

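To obtain approximations at different resolutions, the cardinal B-spline basis functions can be scaled by a factor of $2^{-\ell}$, where $\ell$ is the scale or level of detail or simply level. To be specific, define


$$\phi_{\ell,\mathbf{n}}(\mathbf{x}) = \phi_{0,\mathbf{n}}(2^{\ell}\mathbf{x}) \tag{18}$$

as the cardinal B-spline basis function at level $\ell$ and shift $\mathbf{n}$. Define the subspace $\mathcal{F}_\ell\subseteq\mathcal{F}$ as

$$\mathcal{F}_\ell = \Big\{ f_\ell\in\mathcal{F} \;\Big|\; \exists\{F_{\ell,\mathbf{n}}\}\ \text{s.t.}\ f_\ell(\mathbf{x}) = \sum_{\mathbf{n}\in\mathbb{Z}^3} F_{\ell,\mathbf{n}}\, \phi_{\ell,\mathbf{n}}(\mathbf{x}) \Big\}. \tag{19}$$

This is the subspace of all functions that are tri-polynomial of degree p−1 over the blocks at level $\ell$, $\{2^{-\ell}([0,1]^3+\mathbf{n}) \mid \mathbf{n}\in\mathbb{Z}^3\}$. Since the blocks at level $\ell$ are refined by the blocks at level $\ell+1$, it is clear that if a function $f_\ell$ is tri-polynomial over the blocks at level $\ell$, i.e., $f_\ell\in\mathcal{F}_\ell$, then it is also tri-polynomial over the blocks at level $\ell+1$, i.e., $f_\ell\in\mathcal{F}_{\ell+1}$. Hence $\mathcal{F}_\ell\subseteq\mathcal{F}_{\ell+1}$, and


$$\mathcal{F}_0\subseteq\mathcal{F}_1\subseteq\cdots\subseteq\mathcal{F}_\ell\subseteq\mathcal{F}_{\ell+1}\subseteq\cdots\subseteq\mathcal{F} \tag{20}$$

is a nested sequence of subspaces whose resolution increases with $\ell$.

Let $f_\ell^* = f\circ\mathcal{F}_\ell$ and $f_{\ell+1}^* = f\circ\mathcal{F}_{\ell+1}$ be the projections of $f$ onto $\mathcal{F}_\ell$ and $\mathcal{F}_{\ell+1}$ respectively. Then, by the Pythagorean theorem, for all $f_\ell\in\mathcal{F}_\ell\subseteq\mathcal{F}_{\ell+1}$,


$$\lVert f - f_\ell \rVert^2 = \lVert f - f_{\ell+1}^* \rVert^2 + \lVert f_{\ell+1}^* - f_\ell \rVert^2. \tag{21}$$

Then since $f_\ell^* = f\circ\mathcal{F}_\ell$ minimizes $\lVert f - f_\ell\rVert^2$ over all $f_\ell\in\mathcal{F}_\ell$, by equation (21) $f_\ell^*$ must also minimize $\lVert f_{\ell+1}^* - f_\ell\rVert^2$ over all $f_\ell\in\mathcal{F}_\ell$, and hence $f_\ell^* = f_{\ell+1}^*\circ\mathcal{F}_\ell$. That is, projecting $f$ onto $\mathcal{F}_\ell$ can be done in two steps, by first projecting onto $\mathcal{F}_{\ell+1}$ (i.e., $f_{\ell+1}^* = f\circ\mathcal{F}_{\ell+1}$) and then onto $\mathcal{F}_\ell$ (i.e., $f_\ell^* = f_{\ell+1}^*\circ\mathcal{F}_\ell$). Alternatively, $f\circ\mathcal{F}_\ell = f\circ\mathcal{F}_{\ell+1}\circ\mathcal{F}_\ell$.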

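Paralleling the development described above, let


$$\mathcal{N}_\ell = \{\mathbf{n} \mid \exists i\in\{1,\dots,N\}\ \text{s.t.}\ \phi_{\ell,\mathbf{n}}(\mathbf{x}_i)\neq 0\} \tag{22}$$

be the finite set of vector integer shifts $\mathbf{n}$ such that $\phi_{\ell,\mathbf{n}}(\mathbf{x}_i)\neq 0$ for some $\mathbf{x}_i$. Then, for all $\mathbf{n}_i\in\mathcal{N}_\ell$,

$$0 = \langle \phi_{\ell,\mathbf{n}_i},\, f - f_\ell^* \rangle = \langle \phi_{\ell,\mathbf{n}_i}, f \rangle - \langle \phi_{\ell,\mathbf{n}_i}, f_\ell^* \rangle, \quad\text{or} \tag{23}$$

$$\langle \phi_{\ell,\mathbf{n}_i}, f \rangle = \langle \phi_{\ell,\mathbf{n}_i}, f_\ell^* \rangle = \sum_{\mathbf{n}_j\in\mathcal{N}_\ell} \langle \phi_{\ell,\mathbf{n}_i}, \phi_{\ell,\mathbf{n}_j} \rangle\, F^*_{\ell,\mathbf{n}_j}, \tag{24}$$

where

$$f_\ell^* = \sum_{\mathbf{n}_j\in\mathcal{N}_\ell} F^*_{\ell,\mathbf{n}_j}\, \phi^{(p)}_{\ell,\mathbf{n}_j}.$$

In vector form, equation (24) can be expressed


$$\Phi_\ell^T f = \Phi_\ell^T \Phi_\ell F_\ell^* \tag{25}$$

where $\Phi_\ell^T f$ is shorthand for the $|\mathcal{N}_\ell|\times 1$ vector $[\langle \phi_{\ell,\mathbf{n}_i}, f\rangle]$, $\Phi_\ell^T\Phi_\ell$ is shorthand for the $|\mathcal{N}_\ell|\times|\mathcal{N}_\ell|$ matrix $[\langle \phi_{\ell,\mathbf{n}_i}, \phi_{\ell,\mathbf{n}_j}\rangle]$, and $F_\ell^*$ is the $|\mathcal{N}_\ell|\times 1$ vector $[F^*_{\ell,\mathbf{n}_j}]$. If $\Phi_\ell^T\Phi_\ell$ is invertible, then one may solve for $F_\ell^*$ explicitly as


$$F_\ell^* = (\Phi_\ell^T \Phi_\ell)^{-1} \Phi_\ell^T f. \tag{26}$$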

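In turn, one may compute $\Phi_\ell^T f$ and $\Phi_\ell^T\Phi_\ell$ recursively from $\Phi_{\ell+1}^T f$ and $\Phi_{\ell+1}^T\Phi_{\ell+1}$ respectively, as follows.

Since $\phi_{\ell,\mathbf{n}}\in\mathcal{F}_\ell\subseteq\mathcal{F}_{\ell+1}$, there exist coefficients $\{a_{\mathbf{k}}\}$ not depending on $\ell$ such that

$$\phi_{\ell,\mathbf{n}} = \sum_{\mathbf{k}\in\mathbb{Z}^3} a_{\mathbf{k}}\, \phi_{\ell+1,\mathbf{n}+\mathbf{k}} = \sum_{\mathbf{k}\in\mathbb{Z}^3} a_{\mathbf{k}-\mathbf{n}}\, \phi_{\ell+1,\mathbf{k}}. \tag{27}$$

Equation (27) is known as the two-scale equation. From this equation, it follows that

$$\langle \phi_{\ell,\mathbf{n}_i}, f \rangle = \sum_{\mathbf{n}_j\in\mathcal{N}_{\ell+1}} a_{\mathbf{n}_j-\mathbf{n}_i}\, \langle \phi_{\ell+1,\mathbf{n}_j}, f \rangle \quad\text{and} \tag{28}$$

$$\langle \phi_{\ell,\mathbf{n}_i}, \phi_{\ell,\mathbf{n}_j} \rangle = \sum_{\mathbf{n}_k\in\mathcal{N}_{\ell+1}} \sum_{\mathbf{n}_l\in\mathcal{N}_{\ell+1}} a_{\mathbf{n}_k-\mathbf{n}_i}\, \langle \phi_{\ell+1,\mathbf{n}_k}, \phi_{\ell+1,\mathbf{n}_l} \rangle\, a_{\mathbf{n}_l-\mathbf{n}_j}. \tag{29}$$

In vector form,


$$\Phi_\ell^T f = A_{\ell+1}\, \Phi_{\ell+1}^T f \tag{30}$$


and


$$\Phi_\ell^T \Phi_\ell = A_{\ell+1}\, \Phi_{\ell+1}^T \Phi_{\ell+1}\, A_{\ell+1}^T \tag{31}$$

where

$$A_{\ell+1} = [\,a_{\mathbf{n}_j-\mathbf{n}_i}\,].$$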

Wavelets

Let $\mathcal{G}_\ell$ be the orthogonal complement of $\mathcal{F}_\ell$ in $\mathcal{F}_{\ell+1}$, i.e.,


$$\mathcal{F}_{\ell+1} = \mathcal{F}_\ell \oplus \mathcal{G}_\ell. \tag{32}$$

Applying this recursively,


$$\mathcal{F} = \mathcal{F}_0 \oplus \mathcal{G}_0 \oplus \mathcal{G}_1 \oplus \cdots \oplus \mathcal{G}_\ell \oplus \cdots, \tag{33}$$

so that any function $f\in\mathcal{F}$ can be written as the sum of orthogonal functions,


$$f = f_0 + g_0 + g_1 + \cdots + g_\ell + \cdots. \tag{34}$$

The coefficients of $f_0$ in the basis for $\mathcal{F}_0$ are low pass coefficients, while the coefficients of $g_\ell$ in the basis for $\mathcal{G}_\ell$ are high pass or wavelet coefficients. The function $f$ can be efficiently communicated by performing quantization, entropy coding, or both, on its low pass and wavelet coefficients. This is efficient, because most of the energy in $f$ is in its low pass coefficients and its low-level wavelet coefficients.

To compute the low pass coefficients, equation (17) can be used, while to compute the wavelet coefficients, it helps to first establish a basis for each $\mathcal{G}_\ell$.

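First, some definitions: For each $\ell$, let $\Phi_\ell = [\phi_{\ell,\mathbf{n}_j}]$ be a row vector containing the functions $\phi_{\ell,\mathbf{n}_j}$, $\mathbf{n}_j\in\mathcal{N}_\ell$, and let $F_\ell$ be a column vector of $|\mathcal{N}_\ell|$ coefficients. Let $f_\ell = \Phi_\ell F_\ell$ denote a function in $\mathcal{F}_\ell$. Let $\Phi_\ell^T f = [\langle \phi_{\ell,\mathbf{n}_i}, f\rangle]$ denote the column vector of inner products of the functions $\phi_{\ell,\mathbf{n}_i}$, $\mathbf{n}_i\in\mathcal{N}_\ell$, with function $f$. Similarly, let $\Phi_\ell^T[f_1,\dots,f_n] = [\langle \phi_{\ell,\mathbf{n}_i}, f_j\rangle]$ denote the matrix of inner products of the functions $\phi_{\ell,\mathbf{n}_i}$, $\mathbf{n}_i\in\mathcal{N}_\ell$, with the functions $f_j$, $j=1,\dots,n$.

Consider now the subspace $\mathcal{G}_\ell\subseteq\mathcal{F}_{\ell+1}$ defined by


$$\mathcal{G}_\ell = \{\Phi_{\ell+1}F_{\ell+1} \mid \Phi_\ell^T\Phi_{\ell+1}F_{\ell+1} = 0\}. \tag{35}$$

It can be seen that $\mathcal{G}_\ell$ is a subspace of $\mathcal{F}_{\ell+1}$ and is orthogonal to $\mathcal{F}_\ell = \{\Phi_\ell F_\ell\}$, and hence is the orthogonal complement of $\mathcal{F}_\ell$ in $\mathcal{F}_{\ell+1}$. If $N_\ell$ is the dimension of $\mathcal{F}_\ell$, and $N_{\ell+1}$ is the dimension of $\mathcal{F}_{\ell+1}$, then $N_{\ell+1}-N_\ell$ is the dimension of $\mathcal{G}_\ell$. For the moment, assume $N_\ell = |\mathcal{N}_\ell|$. When $\ell$ is large, the dimension of $\mathcal{F}_\ell$ may be lower than $|\mathcal{N}_\ell|$, due to the finite number of points x1, . . . , xN.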

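One way to construct an explicit basis for $\mathcal{G}_\ell$ is as follows. Partition the $N_\ell\times N_{\ell+1}$ matrix $\Phi_\ell^T\Phi_{\ell+1}$ into an $N_\ell\times N_\ell$ matrix $R'$ and an $N_\ell\times(N_{\ell+1}-N_\ell)$ matrix $R''$, as $\Phi_\ell^T\Phi_{\ell+1} = [R'\ R'']$. Similarly partition the $N_{\ell+1}$ dimensional vector $F_{\ell+1}$ into an $N_\ell$ dimensional vector $F'$ and an $N_{\ell+1}-N_\ell$ dimensional vector $F''$, as

$$F_{\ell+1} = \begin{bmatrix} F' \\ F'' \end{bmatrix}. \tag{36}$$

Then, for all $F_{\ell+1}$ satisfying $\Phi_\ell^T\Phi_{\ell+1}F_{\ell+1} = 0$ in equation (35),

$$\begin{bmatrix} R' & R'' \end{bmatrix} \begin{bmatrix} F' \\ F'' \end{bmatrix} = 0 \tag{37}$$

$$(R')^{-1} \begin{bmatrix} R' & R'' \end{bmatrix} \begin{bmatrix} F' \\ F'' \end{bmatrix} = 0 \tag{38}$$

$$\begin{bmatrix} I' & (R')^{-1}R'' \end{bmatrix} \begin{bmatrix} F' \\ F'' \end{bmatrix} = 0 \tag{39}$$

where $I'$ is the $N_\ell\times N_\ell$ identity matrix. Hence, $F' = -(R')^{-1}R''F''$, and

$$\mathcal{G}_\ell = \left\{ \Phi_{\ell+1} \begin{bmatrix} -(R')^{-1}R'' \\ I'' \end{bmatrix} F'' \;\middle|\; F''\in\mathbb{R}^{N_{\ell+1}-N_\ell} \right\}, \tag{40}$$

where $I''$ is the $(N_{\ell+1}-N_\ell)\times(N_{\ell+1}-N_\ell)$ identity matrix. Thus, the $(N_{\ell+1}-N_\ell)$ functions in the row vector

$$\Psi_\ell = \Phi_{\ell+1} \begin{bmatrix} -(R')^{-1}R'' \\ I'' \end{bmatrix} \tag{41}$$

form an explicit basis for $\mathcal{G}_\ell$.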

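Now that a basis is established for $\mathcal{G}_\ell$, any function $f_{\ell+1} = \Phi_{\ell+1}F_{\ell+1}\in\mathcal{F}_{\ell+1}$ can be decomposed as the sum of functions $f_\ell = \Phi_\ell F_\ell\in\mathcal{F}_\ell$, and $g_\ell = \Psi_\ell G_\ell\in\mathcal{G}_\ell$, specifically,

$$\Phi_{\ell+1}F_{\ell+1} = \begin{bmatrix} \Phi_\ell & \Psi_\ell \end{bmatrix} \begin{bmatrix} F_\ell \\ G_\ell \end{bmatrix}. \tag{42}$$

From equation (42), one can obtain two formulas for an analysis filter bank, which formulas produce coefficients $F_\ell$ and $G_\ell$ from coefficients $F_{\ell+1}$, and two formulas for a synthesis filter bank, which formulas produce coefficients $F_{\ell+1}$ from coefficients $F_\ell$ and $G_\ell$.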

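For the first two formulas, take the inner product of equation (42) with the functions in $\Phi_{\ell+1}$,

$$\Phi_{\ell+1}^T\Phi_{\ell+1}F_{\ell+1} = \Phi_{\ell+1}^T \begin{bmatrix} \Phi_\ell & \Psi_\ell \end{bmatrix} \begin{bmatrix} F_\ell \\ G_\ell \end{bmatrix}, \tag{43}$$

yielding

$$F_{\ell+1} = \left[ \Phi_{\ell+1}^T\Phi_{\ell+1} \right]^{-1} \Phi_{\ell+1}^T \begin{bmatrix} \Phi_\ell & \Psi_\ell \end{bmatrix} \begin{bmatrix} F_\ell \\ G_\ell \end{bmatrix} \tag{44}$$

for the synthesis and

$$\left[ \Phi_{\ell+1}^T \begin{bmatrix} \Phi_\ell & \Psi_\ell \end{bmatrix} \right]^{-1} \Phi_{\ell+1}^T\Phi_{\ell+1}F_{\ell+1} = \begin{bmatrix} F_\ell \\ G_\ell \end{bmatrix} \tag{45}$$

for the analysis.

For the second two formulas, take the inner product of equation (42) with the $N_\ell$ functions in $\Phi_\ell$ and the $N_{\ell+1}-N_\ell$ functions in $\Psi_\ell$,

$$\begin{bmatrix} \Phi_\ell & \Psi_\ell \end{bmatrix}^T \Phi_{\ell+1}F_{\ell+1} = \begin{bmatrix} \Phi_\ell^T\Phi_\ell & 0 \\ 0 & \Psi_\ell^T\Psi_\ell \end{bmatrix} \begin{bmatrix} F_\ell \\ G_\ell \end{bmatrix}, \tag{46}$$

yielding

$$F_{\ell+1} = \left[ \begin{bmatrix} \Phi_\ell & \Psi_\ell \end{bmatrix}^T \Phi_{\ell+1} \right]^{-1} \begin{bmatrix} \Phi_\ell^T\Phi_\ell & 0 \\ 0 & \Psi_\ell^T\Psi_\ell \end{bmatrix} \begin{bmatrix} F_\ell \\ G_\ell \end{bmatrix} \tag{47}$$

for the synthesis and

$$\begin{bmatrix} \Phi_\ell^T\Phi_\ell & 0 \\ 0 & \Psi_\ell^T\Psi_\ell \end{bmatrix}^{-1} \begin{bmatrix} \Phi_\ell & \Psi_\ell \end{bmatrix}^T \Phi_{\ell+1}F_{\ell+1} = \begin{bmatrix} F_\ell \\ G_\ell \end{bmatrix} \tag{48}$$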

for the analysis. These two structures are shown in FIG. 2. The analysis portion of one structure can be matched with the synthesis portion of the other structure, and vice versa. These structures may be nested, recursively decomposing the low pass coefficients.

Attribute Coding Using Region Adaptive Hierarchical Transforms

The systems and methods described herein represent and may compress or decompress the real-valued attributes of a point cloud using volumetric functions. It is assumed that an encoder is given a set of point locations x1, . . . , xN and a set of corresponding attributes ƒ1, . . . , ƒN. It is also assumed that the point locations can be communicated to the decoder without loss. A technical problem of attribute compression is to reproduce approximate attributes, ƒ̂1, . . . , ƒ̂N, at the decoder, subject to a constraint on the number of bits communicated, given the point locations as supplemental information (e.g., side information). For clarity and brevity, the discussion herein focuses on attributes that are scalar. Vector attributes can be accordingly treated component-wise.

At an example embodiment of the encoder, a volumetric B-spline of order p is fit to the values ƒ1, . . . , ƒN at locations x1, . . . , xN, and its wavelet coefficients are quantized and entropy coded. At an example embodiment of the decoder, the wavelet coefficients are entropy decoded and dequantized, and the volumetric B-spline is reconstructed. Finally, the reconstructed volumetric B-spline is sampled at the locations x1, . . . , xN, and the corresponding values ƒ̂1, . . . , ƒ̂N are used as reproductions of the attributes.
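By way of a non-limiting illustration, the quantization and dequantization steps of such an encoder and decoder may be sketched with a uniform scalar quantizer. The uniform quantizer, the step size, and the function names are assumptions of this sketch; the disclosure does not mandate a particular quantizer, and entropy coding of the resulting integer indices is omitted.

```python
import math

def quantize(coeffs, step):
    """Map each coefficient to the index of the nearest multiple of `step`."""
    return [math.floor(c / step + 0.5) for c in coeffs]

def dequantize(indices, step):
    """Reconstruct approximate coefficients from integer indices."""
    return [q * step for q in indices]

# Hypothetical wavelet coefficients and step size.
coeffs = [0.26, -1.4, 3.0]
step = 0.5
indices = quantize(coeffs, step)
recovered = dequantize(indices, step)
```

With this quantizer, each recovered coefficient differs from the original by at most half the step size, so a larger step trades reconstruction accuracy for fewer distinct indices to entropy code.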

The next three sections focus on volumetric B-splines of orders 1, 2, and higher orders.

Constant B-Splines

Volumetric B-splines of order p=1, or constant B-splines, are presently discussed. It can be shown that constant B-splines are equivalent to the RAHT.

RAHT is a generalization of the Haar Transform. The Haar Transform of a sequence of $2^d$ coefficients $f_0,\dots,f_{2^d-1}$ can be described as a series of orthonormal butterfly transforms,

$$\begin{bmatrix} \bar F_{\ell,n} \\ \bar G_{\ell,n} \end{bmatrix} = \begin{bmatrix} \frac{1}{\sqrt{2}} & \frac{1}{\sqrt{2}} \\ -\frac{1}{\sqrt{2}} & \frac{1}{\sqrt{2}} \end{bmatrix} \begin{bmatrix} \bar F_{\ell+1,2n} \\ \bar F_{\ell+1,2n+1} \end{bmatrix}, \tag{49}$$

for $\ell=d-1,\dots,0$ and $n=0,\dots,2^{\ell}-1$, beginning with $\bar F_{d,n}=f_n$, $n=0,\dots,2^d-1$. The butterfly structure for d=3 is shown in FIG. 3. This can equally well be regarded as a full binary tree, as shown in FIG. 4, in which the signal samples $f_0,\dots,f_{2^d-1}$ are located at the leaves of the tree, the high pass coefficients are located at the intermediate nodes of the tree, and the DC coefficient is located at the root of the tree.
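By way of a non-limiting illustration, the orthonormal Haar butterflies described above, and their inverse, may be sketched as follows for a signal whose length is a power of two. The function names and the example signal are hypothetical.

```python
import math

def haar_forward(signal):
    """Return (dc, details); details[l] holds the high pass coefficients at level l."""
    s = 1.0 / math.sqrt(2.0)
    low = list(signal)
    details = []
    while len(low) > 1:
        pairs = [(low[2 * n], low[2 * n + 1]) for n in range(len(low) // 2)]
        # High pass output of the butterfly: (-F_{2n} + F_{2n+1}) / sqrt(2).
        details.insert(0, [s * (b - a) for a, b in pairs])
        # Low pass output of the butterfly: (F_{2n} + F_{2n+1}) / sqrt(2).
        low = [s * (a + b) for a, b in pairs]
    return low[0], details

def haar_inverse(dc, details):
    """Invert haar_forward by applying the transposed butterfly at each level."""
    s = 1.0 / math.sqrt(2.0)
    low = [dc]
    for level in details:
        nxt = []
        for F, G in zip(low, level):
            nxt += [s * (F - G), s * (F + G)]
        low = nxt
    return low

# Hypothetical signal of length 2^3.
signal = [4.0, 2.0, 5.0, 5.0, 1.0, 3.0, 0.0, 8.0]
dc, details = haar_forward(signal)
restored = haar_inverse(dc, details)
```

Because each butterfly is orthonormal, the transform preserves the energy of the signal, and the DC coefficient equals the sum of the samples divided by the square root of the signal length.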

The RAHT is a generalization of the Haar Transform in that the tree is not necessarily full, in that the internal nodes of the tree may be either binary or unary, and the butterfly transform at each internal binary node of the tree is a Givens rotation,

$$\begin{bmatrix} \bar F_{\ell,n} \\ \bar G_{\ell,n} \end{bmatrix} = \begin{bmatrix} a & b \\ -b & a \end{bmatrix} \begin{bmatrix} \bar F_{\ell+1,2n} \\ \bar F_{\ell+1,2n+1} \end{bmatrix}, \tag{50}$$

where

$$a = \frac{\sqrt{w_{\ell+1,2n}}}{\sqrt{w_{\ell+1,2n}+w_{\ell+1,2n+1}}}, \tag{51}$$

$$b = \frac{\sqrt{w_{\ell+1,2n+1}}}{\sqrt{w_{\ell+1,2n}+w_{\ell+1,2n+1}}}, \tag{52}$$

$$w_{\ell,n} = w_{\ell+1,2n} + w_{\ell+1,2n+1} \tag{53}$$

for $\ell=0,\dots,d-1$, and $w_{d,n}$ equals 1 for all n in the signal and equals 0 otherwise. $w_{\ell,n}$ is called the weight of node n at level $\ell$ and is equal to the total of the weights of all the leaves descended from node n in level $\ell$. The tree for RAHT is shown in FIG. 5.
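By way of a non-limiting illustration, a single weighted butterfly of this kind may be sketched as follows, assuming that a and b are the square roots of the two child weights divided by the square root of their sum and that the parent weight is the sum of the child weights. The function name and the numeric example are hypothetical.

```python
import math

def raht_butterfly(F0, w0, F1, w1):
    """Apply the Givens rotation to two sibling coefficients with weights w0 and w1.

    Returns (low pass coefficient, high pass coefficient, parent weight).
    """
    a = math.sqrt(w0) / math.sqrt(w0 + w1)
    b = math.sqrt(w1) / math.sqrt(w0 + w1)
    return a * F0 + b * F1, -b * F0 + a * F1, w0 + w1

# Hypothetical siblings: three points with mean attribute 2 and one point with
# attribute 6, each entering as sqrt(weight) times the mean.
F0, w0 = math.sqrt(3.0) * 2.0, 3
F1, w1 = 6.0, 1
low, high, w = raht_butterfly(F0, w0, F1, w1)
```

In this example the low pass output equals the square root of the parent weight times the mean attribute of all four points, and with equal weights the rotation reduces to the Haar butterfly.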

To apply RAHT to a signal ƒ1, . . . , ƒN ∈R defined on point locations x1, . . . , xN∈R3, first scale the point locations so that the set X={x1, . . . , xN} fits within the unit cube, X⊂[0,1)3, and choose d sufficiently large so that each point location x=(x, y, z)∈X can be represented uniquely with d bits of precision, as

$$x = \sum_{b=1}^{d} x_b 2^{-b}, \quad y = \sum_{b=1}^{d} y_b 2^{-b}, \quad\text{and}\quad z = \sum_{b=1}^{d} z_b 2^{-b},$$

or in more conventional notation,


x=x1 . . . xd,  (54)


y=y1 . . . yd,  (55)


z=z1 . . . zd.  (56)

The Morton code of point location x=(x, y, z) is defined as the interleaving of its coordinates' bits,


Morton(x)=z1y1x1. . . zdydxd.  (57)
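By way of a non-limiting illustration, the bit interleaving of equation (57) may be sketched as follows for coordinates in [0,1) that are exactly representable with d bits. The function name and the example point are hypothetical.

```python
def morton_code(x, y, z, d):
    """Interleave the d-bit binary fractions of x, y, z as z1 y1 x1 ... zd yd xd."""
    # Convert each coordinate in [0, 1) to a d-bit integer.
    xi, yi, zi = (int(c * (1 << d)) for c in (x, y, z))
    code = 0
    for b in range(d - 1, -1, -1):  # most significant bit first
        code = (code << 3) | (((zi >> b) & 1) << 2) | (((yi >> b) & 1) << 1) | ((xi >> b) & 1)
    return code

# Hypothetical point with d = 2: x = .10, y = .01, z = .11 in binary.
code = morton_code(0.5, 0.25, 0.75, 2)
```

For this point the interleaved bits are z1 y1 x1 = 101 followed by z2 y2 x2 = 110.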

Now the RAHT tree of depth 3d can be constructed with N leaves, with ƒi at leaf i, such that the path from the root of the tree to leaf i is given by the Morton code of xi. Thus, each node in the tree, if it is at level ℓ, is associated with the common length-ℓ prefix shared by the Morton codes of the node's descendants. Two nodes at level ℓ+1 are siblings if their length-(ℓ+1) Morton prefixes are adjacent in Morton order, also known as Z-scan order, and share the same length-ℓ prefix. The Givens rotation is applied to such siblings.
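By way of a non-limiting illustration, a forward transform over such a tree may be sketched as follows, with Morton codes given as integers, nodes at each level keyed by their Morton prefix, and a node with a single child passed through unchanged. The function name, the dictionary-based tree, and the three-point example are assumptions of this sketch.

```python
import math

def raht_forward(codes, values, depth):
    """Forward transform for distinct Morton codes of `depth` bits.

    Returns (dc, weight, highpass), where highpass lists (level, prefix, G).
    """
    # Leaves: each occupied code carries its attribute and weight 1.
    nodes = {c: (float(v), 1) for c, v in zip(codes, values)}
    highpass = []
    for level in range(depth - 1, -1, -1):
        parents = {}
        for prefix in sorted({c >> 1 for c in nodes}):
            left = nodes.get(2 * prefix)
            right = nodes.get(2 * prefix + 1)
            if left is not None and right is not None:
                (F0, w0), (F1, w1) = left, right
                a = math.sqrt(w0 / (w0 + w1))
                b = math.sqrt(w1 / (w0 + w1))
                parents[prefix] = (a * F0 + b * F1, w0 + w1)
                highpass.append((level, prefix, -b * F0 + a * F1))
            else:
                # Unary node: the single child is passed up unchanged.
                parents[prefix] = left if left is not None else right
        nodes = parents
    dc, weight = nodes[0]
    return dc, weight, highpass

# Hypothetical cloud of three points with 3-bit Morton codes.
codes = [0b000, 0b001, 0b110]
values = [1.0, 3.0, 5.0]
dc, weight, highpass = raht_forward(codes, values, 3)
```

In this example the first two points are siblings at the deepest level, the third point is carried up through unary nodes, and the root combines the two branches with weights 2 and 1.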

As described above, RAHT is a transform on a signal defined on a discrete set of points, but it also has an interpretation as a transform of a signal defined on a volume. To see that, note that the nodes at level $\ell$ of the RAHT tree partition not only the points x1, . . . , xN, but also the volume [0,1)3. For example, consider a node at a level $\ell=3\ell'$ whose Morton prefix is $z_1y_1x_1\dots z_{\ell'}y_{\ell'}x_{\ell'}$. This node corresponds to the set of all points x∈[0,1)3 that have this same Morton prefix, namely the cube

$$B_{\ell,(n_x,n_y,n_z)} = \big[2^{-\ell'}n_x,\ 2^{-\ell'}(n_x+1)\big) \times \big[2^{-\ell'}n_y,\ 2^{-\ell'}(n_y+1)\big) \times \big[2^{-\ell'}n_z,\ 2^{-\ell'}(n_z+1)\big), \tag{58}$$

where

$$n_x = \sum_{b=1}^{\ell'} x_b 2^{\ell'-b}, \quad n_y = \sum_{b=1}^{\ell'} y_b 2^{\ell'-b}, \quad\text{and}\quad n_z = \sum_{b=1}^{\ell'} z_b 2^{\ell'-b}.$$

This is a $2^{-\ell'}\times 2^{-\ell'}\times 2^{-\ell'}$ cube at vector integer shift $\mathbf{n}=(n_x,n_y,n_z)$, where $n_x,n_y,n_z\in\{0,\dots,2^{\ell'}-1\}$. For nodes at levels that are not a multiple of three, the notation may seem a little clumsy, but is similar and straightforward. Specifically, a node at level $\ell=3\ell'+1$ corresponds to a $2^{-\ell'}\times 2^{-\ell'}\times 2^{-(\ell'+1)}$ cuboid $B_{\ell,\mathbf{n}}$ at vector integer shift $\mathbf{n}=(n_x,n_y,n_z)$, where $n_x,n_y\in\{0,\dots,2^{\ell'}-1\}$ and $n_z\in\{0,\dots,2^{\ell'+1}-1\}$, and a node at level $\ell=3\ell'+2$ corresponds to a $2^{-\ell'}\times 2^{-(\ell'+1)}\times 2^{-(\ell'+1)}$ cuboid at vector integer shift $\mathbf{n}=(n_x,n_y,n_z)$, where $n_x\in\{0,\dots,2^{\ell'}-1\}$ and $n_y,n_z\in\{0,\dots,2^{\ell'+1}-1\}$. At any level, these cuboids are called blocks. A block $B_{\ell,\mathbf{n}}$ is said to be occupied if it contains a point, that is, if $B_{\ell,\mathbf{n}}\cap X\neq\emptyset$. Each node in the RAHT tree at level $\ell$ corresponds to an occupied block at level $\ell$.

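Let $\phi_{\ell,\mathbf{n}}(\mathbf{x})$ be the indicator function for block $B_{\ell,\mathbf{n}}$, namely

$$\phi_{\ell,\mathbf{n}}(\mathbf{x}) = \begin{cases} 1 & \mathbf{x}\in B_{\ell,\mathbf{n}} \\ 0 & \text{otherwise}, \end{cases} \tag{59}$$

and, as in equation (22), let $\mathcal{N}_\ell$ be the set of vector integer shifts $\mathbf{n}$ such that $\phi_{\ell,\mathbf{n}}(\mathbf{x}_i)\neq 0$ for some i=1, . . . , N, that is, let $\mathcal{N}_\ell$ be the set of vector integer shifts $\mathbf{n}$ such that $B_{\ell,\mathbf{n}}$ is occupied. Then, as in equation (19), let $\mathcal{F}_\ell$ be the subspace of functions that are linear combinations of $\phi_{\ell,\mathbf{n}}$ for $\mathbf{n}\in\mathcal{N}_\ell$,

$$\mathcal{F}_\ell = \Big\{ f_\ell\in\mathcal{F} \;\Big|\; \exists\{F_{\ell,\mathbf{n}}\}\ \text{s.t.}\ f_\ell = \sum_{\mathbf{n}\in\mathcal{N}_\ell} F_{\ell,\mathbf{n}}\, \phi_{\ell,\mathbf{n}} \Big\}. \tag{60}$$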

Let ƒ:[0,1)3→R be any function that agrees with ƒi on xi, i.e., ƒ(xi)=ƒi, i=1, . . . , N, and let

f * = Σ n N F , n * φ , n

be the projection of ƒ onto . Again, let ƒ denote the ||×1 vector [<, ƒ>], let denote the ||×|| matrix [<,>], and let denote the ||×1 vector []. If is invertible, then


=(ƒ.  (61)

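From the definition of $\phi_{\ell,n}$ in equation (59) and the inner product equation (2), it can be seen that

$$\langle \phi_{\ell,n}, f \rangle = \sum_{x_i \in B_{\ell,n}} f(x_i)$$

and $\langle \phi_{\ell,n}, \phi_{\ell,n'} \rangle$ equals $w_{\ell,n}$ if $n = n'$ and equals 0 otherwise, where $w_{\ell,n} = \mu(B_{\ell,n})$ is the number of points in the set $B_{\ell,n}$. Hence $F_{\ell,n}^*$ is the average value of the attributes of the points in $B_{\ell,n}$, namely

$$F_{\ell,n}^* = \frac{\langle \phi_{\ell,n}, f \rangle}{w_{\ell,n}} = \frac{1}{w_{\ell,n}} \sum_{x_i \in B_{\ell,n}} f(x_i), \tag{62}$$

for $n \in \mathcal{N}_\ell$.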
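Equation (62) can be illustrated with a short Python sketch. It is a minimal example under simplifying assumptions: the level is a multiple of three, so every block is a cube of side $2^{-\text{bits}}$, and the attribute is a scalar. The function name `block_averages` and its parameters are hypothetical.

```python
from collections import defaultdict


def block_averages(points, values, bits):
    """Compute F*_{l,n} = (1 / w_{l,n}) * sum of f(x_i) over x_i in B_{l,n}
    for cubic blocks of side 2**-bits (level l = 3 * bits).

    Returns (averages, weights), both keyed by the shift n = (nx, ny, nz).
    """
    sums = defaultdict(float)
    weights = defaultdict(int)
    scale = 2 ** bits
    for (x, y, z), f in zip(points, values):
        n = (int(x * scale), int(y * scale), int(z * scale))
        sums[n] += f          # <phi_{l,n}, f>
        weights[n] += 1       # w_{l,n} = number of points in the block
    averages = {n: sums[n] / weights[n] for n in sums}
    return averages, dict(weights)
```

Only occupied blocks appear in the returned dictionaries, so the keys play the role of the set of occupied shifts.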

To express the two-scale equation for $\phi_{\ell,n}$ succinctly regardless of whether $\ell$ is a multiple of three or not, the following notation is used. If $\ell, n$ are the level and shift of a block, then “$\ell+1, 2n$” and “$\ell+1, 2n+1$” mean the level and shifts of its two subblocks. To be pedantic, “$\ell+1, 2n$” and “$\ell+1, 2n+1$” mean $\ell+1, (n_x, n_y, 2n_z)$ and $\ell+1, (n_x, n_y, 2n_z+1)$ if $\ell = 3\ell'$ (i.e., $\ell$ is a multiple of 3); they mean $\ell+1, (n_x, 2n_y, n_z)$ and $\ell+1, (n_x, 2n_y+1, n_z)$ if $\ell = 3\ell'+1$ (i.e., $\ell \equiv 1 \bmod 3$); and they mean $\ell+1, (2n_x, n_y, n_z)$ and $\ell+1, (2n_x+1, n_y, n_z)$ if $\ell = 3\ell'+2$ (i.e., $\ell \equiv 2 \bmod 3$).

Now the two-scale equation for $\phi_{\ell,n}$ can be readily expressed as

$$\phi_{\ell,n} = \phi_{\ell+1,2n} + \phi_{\ell+1,2n+1}, \tag{63}$$

so that combining equations (62) and (63),

\begin{align}
F_{\ell,n}^* &= \frac{\langle \phi_{\ell+1,2n}, f \rangle}{w_{\ell,n}} + \frac{\langle \phi_{\ell+1,2n+1}, f \rangle}{w_{\ell,n}} \tag{64} \\
&= \frac{w_{\ell+1,2n}}{w_{\ell,n}}\, F_{\ell+1,2n}^* + \frac{w_{\ell+1,2n+1}}{w_{\ell,n}}\, F_{\ell+1,2n+1}^* \tag{65} \\
&= \frac{w_0}{w_0+w_1}\, F_{\ell+1,2n}^* + \frac{w_1}{w_0+w_1}\, F_{\ell+1,2n+1}^*, \tag{66}
\end{align}

where, to be more concise, one can abbreviate $w_0 = w_{\ell+1,2n}$ and $w_1 = w_{\ell+1,2n+1}$. Both $w_0$ and $w_1$ will be non-zero when $\ell, n$ correspond to a node in the tree with two children. For such $\ell, n$, define the function

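$$\psi_{\ell,n} = -\frac{\phi_{\ell+1,2n}}{w_0} + \frac{\phi_{\ell+1,2n+1}}{w_1}, \tag{67}$$

and define

$$G_{\ell,n}^* = \frac{w_0 w_1}{w_0+w_1}\, \langle \psi_{\ell,n}, f \rangle. \tag{68}$$

Then combining equations (68), (67), and (62),

\begin{align}
G_{\ell,n}^* &= \frac{w_0 w_1}{w_0+w_1} \left( -\frac{\langle \phi_{\ell+1,2n}, f \rangle}{w_0} + \frac{\langle \phi_{\ell+1,2n+1}, f \rangle}{w_1} \right) \tag{69} \\
&= \frac{w_0 w_1}{w_0+w_1} \left( -F_{\ell+1,2n}^* + F_{\ell+1,2n+1}^* \right), \tag{70}
\end{align}

which is the scaled difference between the average values of the attributes of the points in the two sub-blocks $B_{\ell+1,2n}$ and $B_{\ell+1,2n+1}$ of $B_{\ell,n}$. When $f$ is smooth, $G_{\ell,n}^*$ will be close to zero. Putting equations (66) and (70) in matrix form,

$$\begin{bmatrix} F_{\ell,n}^* \\ G_{\ell,n}^* \end{bmatrix} = \begin{bmatrix} \dfrac{w_0}{w_0+w_1} & \dfrac{w_1}{w_0+w_1} \\[2ex] -\dfrac{w_0 w_1}{w_0+w_1} & \dfrac{w_0 w_1}{w_0+w_1} \end{bmatrix} \begin{bmatrix} F_{\ell+1,2n}^* \\ F_{\ell+1,2n+1}^* \end{bmatrix}. \tag{71}$$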

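Both $\phi_{\ell,n}$ and $\psi_{\ell,n}$ have support only on $B_{\ell,n}$, and hence they are both orthogonal to both $\phi_{\ell,n'}$ and $\psi_{\ell,n'}$ for vector integer shifts $n' \neq n$. Yet it can also be seen that $\phi_{\ell,n}$ and $\psi_{\ell,n}$ are orthogonal to each other, since $\langle \phi_{\ell,n}, \psi_{\ell,n} \rangle = -1 + 1 = 0$. Thus, the orthogonal complement of $\mathcal{F}_\ell$ in $\mathcal{F}_{\ell+1}$ is

$$\mathcal{G}_\ell = \Big\{ g \;\Big|\; g = \sum_{n \in \mathcal{N}_\ell^b} G_{\ell,n}\, \psi_{\ell,n} \Big\}, \tag{72}$$

where the sum is over only those vector integer shifts $\mathcal{N}_\ell^b \subseteq \mathcal{N}_\ell$ for which $\ell, n$ correspond to a node in the tree with two children. However, as defined, the orthogonal basis functions $\phi_{\ell,n}$ and $\psi_{\ell,n}$ are not normalized. Define their normalized versions as

\begin{align}
\bar{\phi}_{\ell,n} &= \frac{\phi_{\ell,n}}{\|\phi_{\ell,n}\|} = \frac{\phi_{\ell,n}}{\sqrt{w_0+w_1}}, \tag{73} \\
\bar{\psi}_{\ell,n} &= \frac{\psi_{\ell,n}}{\|\psi_{\ell,n}\|} = \frac{\psi_{\ell,n}}{\sqrt{\dfrac{1}{w_0}+\dfrac{1}{w_1}}} = \sqrt{\frac{w_0 w_1}{w_0+w_1}}\; \psi_{\ell,n}. \tag{74}
\end{align}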

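Then

$$f_\ell = \sum_{n \in \mathcal{N}_\ell} F_{\ell,n}\, \phi_{\ell,n} = \sum_{n \in \mathcal{N}_\ell} \bar{F}_{\ell,n}\, \bar{\phi}_{\ell,n} \quad \text{and} \quad g_\ell = \sum_{n \in \mathcal{N}_\ell^b} G_{\ell,n}\, \psi_{\ell,n} = \sum_{n \in \mathcal{N}_\ell^b} \bar{G}_{\ell,n}\, \bar{\psi}_{\ell,n},$$

where

\begin{align}
\bar{F}_{\ell,n} &= \sqrt{w_0+w_1}\; F_{\ell,n}, \tag{75} \\
\bar{G}_{\ell,n} &= \sqrt{\frac{w_0+w_1}{w_0 w_1}}\; G_{\ell,n}. \tag{76}
\end{align}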

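Rewriting equation (71),

$$\begin{bmatrix} \dfrac{1}{\sqrt{w_0+w_1}}\, \bar{F}_{\ell,n}^* \\[2ex] \sqrt{\dfrac{w_0 w_1}{w_0+w_1}}\; \bar{G}_{\ell,n}^* \end{bmatrix} = \begin{bmatrix} \dfrac{w_0}{w_0+w_1} & \dfrac{w_1}{w_0+w_1} \\[2ex] -\dfrac{w_0 w_1}{w_0+w_1} & \dfrac{w_0 w_1}{w_0+w_1} \end{bmatrix} \begin{bmatrix} \dfrac{\bar{F}_{\ell+1,2n}^*}{\sqrt{w_0}} \\[2ex] \dfrac{\bar{F}_{\ell+1,2n+1}^*}{\sqrt{w_1}} \end{bmatrix}, \tag{77}$$

or

$$\begin{bmatrix} \bar{F}_{\ell,n}^* \\ \bar{G}_{\ell,n}^* \end{bmatrix} = \begin{bmatrix} \dfrac{\sqrt{w_0}}{\sqrt{w_0+w_1}} & \dfrac{\sqrt{w_1}}{\sqrt{w_0+w_1}} \\[2ex] -\dfrac{\sqrt{w_1}}{\sqrt{w_0+w_1}} & \dfrac{\sqrt{w_0}}{\sqrt{w_0+w_1}} \end{bmatrix} \begin{bmatrix} \bar{F}_{\ell+1,2n}^* \\ \bar{F}_{\ell+1,2n+1}^* \end{bmatrix}. \tag{78}$$

This is identical to equation (50) with

\begin{align}
a &= \frac{\sqrt{w_0}}{\sqrt{w_0+w_1}} = \frac{\sqrt{w_{\ell+1,2n}}}{\sqrt{w_{\ell+1,2n}+w_{\ell+1,2n+1}}}, \tag{79} \\
b &= \frac{\sqrt{w_1}}{\sqrt{w_0+w_1}} = \frac{\sqrt{w_{\ell+1,2n+1}}}{\sqrt{w_{\ell+1,2n}+w_{\ell+1,2n+1}}}. \tag{80}
\end{align}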

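The asterisks are reminders that if RAHT is applied to the values $f_1, \ldots, f_N$ of a volumetric function $f(x)$ at point locations $x_1, \ldots, x_N \in \mathbb{R}^3$, the resulting coefficients $\bar{F}_{\ell,n}^*$, $n \in \mathcal{N}_\ell$, are optimal in that they represent the projection

$$f_\ell^* = \sum_{n \in \mathcal{N}_\ell} \bar{F}_{\ell,n}^*\, \bar{\phi}_{\ell,n}$$

of $f$ onto the subspace $\mathcal{F}_\ell$.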

Thus, RAHT has a volumetric interpretation.
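The normalized butterfly of equation (78), with $a$ and $b$ as in equations (79) and (80), can be sketched as follows. This is an illustrative sketch for scalar coefficients only; the function names `raht_butterfly` and `raht_inverse` are hypothetical.

```python
import math


def raht_butterfly(f0, f1, w0, w1):
    """Apply the 2x2 orthonormal transform of equation (78).

    f0, f1: normalized coefficients of the two sub-blocks.
    w0, w1: numbers of points in the two sub-blocks (both non-zero).
    Returns (low, high): the normalized parent coefficient and the
    normalized difference coefficient.
    """
    a = math.sqrt(w0 / (w0 + w1))
    b = math.sqrt(w1 / (w0 + w1))
    return a * f0 + b * f1, -b * f0 + a * f1


def raht_inverse(low, high, w0, w1):
    """Invert raht_butterfly; the matrix is orthonormal, so its inverse
    is its transpose."""
    a = math.sqrt(w0 / (w0 + w1))
    b = math.sqrt(w1 / (w0 + w1))
    return a * low - b * high, b * low + a * high
```

For example, if the two sub-blocks hold 3 points with average 2 and 1 point with average 6, their normalized coefficients are $\sqrt{3}\cdot 2$ and $6$ by equation (75). The low-pass output is then $\sqrt{4}\cdot 3 = 6$, which is the normalized form of the parent average $3$ from equation (66).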

Tri-Linear B-Splines

Volumetric B-splines of order p=2, or tri-linear B-splines, are presently discussed. These splines are continuous, unlike constant B-splines. Tri-linear B-splines reduce blocking artifacts, and perhaps helpfully, do not develop rips, tears, or holes when the surface or motion representation is quantized, due to their guaranteed continuity.

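Begin with the central cardinal B-spline of order $p = 2$,

$$\phi^{(2)}(x-1) = \begin{cases} 1+x & x \in [-1, 0] \\ 1-x & x \in [0, 1] \\ 0 & \text{otherwise,} \end{cases} \tag{81}$$

and define the volumetric version

$$\phi_{0,0}(\mathbf{x}) = \phi^{(2)}(x-1)\, \phi^{(2)}(y-1)\, \phi^{(2)}(z-1), \tag{82}$$

where $\mathbf{x} = (x, y, z)$. Then $\phi_{\ell,n}(\mathbf{x}) = \phi_{0,0}(2^\ell \mathbf{x} - n)$ is the tri-linear B-spline basis function at level $\ell$ with vector integer shift $n$. As in equation (19), let

$$\mathcal{F}_\ell = \Big\{ f \in \mathcal{F} \;\Big|\; \exists \{F_{\ell,n}\} \text{ s.t. } f(\mathbf{x}) = \sum_{n \in \mathcal{C}_\ell} F_{\ell,n}\, \phi_{\ell,n}(\mathbf{x}) \Big\} \tag{83}$$

be the subspace of all functions in the Hilbert space $\mathcal{F}$ that are tri-linear over all blocks $B_{\ell,n}$, $n \in \mathbb{Z}^3$, at level $\ell$. It suffices to take the sum over vector integer shifts $n \in \mathcal{C}_\ell$, where $\mathcal{C}_\ell$ is the collection of corners of the occupied blocks $B_{\ell,n'}$, $n' \in \mathcal{N}_\ell$. This is because $\phi_{\ell,n}(x_i) = 0$ for all vector integer shifts $n \notin \mathcal{C}_\ell$.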
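As an illustration, the tri-linear basis function and the evaluation of a function in this subspace can be sketched in Python. The sketch assumes the dyadic scaling $\phi_{\ell,n}(\mathbf{x}) = \phi_{0,0}(2^\ell \mathbf{x} - n)$ with cubic blocks at each level; the names `hat`, `trilinear_basis`, and `evaluate` are hypothetical.

```python
def hat(t):
    """Central cardinal B-spline of order 2: 1 - |t| on [-1, 1], else 0."""
    return max(0.0, 1.0 - abs(t))


def trilinear_basis(level, n, point):
    """Evaluate phi_{l,n}(x) = phi_{0,0}(2^l x - n) at point = (x, y, z)."""
    s = 2 ** level
    return (hat(s * point[0] - n[0])
            * hat(s * point[1] - n[1])
            * hat(s * point[2] - n[2]))


def evaluate(level, coeffs, point):
    """Evaluate f(x) = sum_n F_{l,n} phi_{l,n}(x) for a dict {n: F_{l,n}}
    of corner coefficients."""
    return sum(F * trilinear_basis(level, n, point) for n, F in coeffs.items())
```

Each basis function equals 1 at its own corner and 0 at every other corner of the same level, and at any point only the eight corners of the enclosing block contribute, with weights that sum to 1. A function built this way is therefore continuous across block boundaries.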

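As in equation (26), the projection $f_\ell^*$ of $f \in \mathcal{F}$ onto $\mathcal{F}_\ell$ is given by $F_\ell^* = (\Phi_\ell^T \Phi_\ell)^{-1} \Phi_\ell^T f$, where $\Phi_\ell^T f$ is the $|\mathcal{C}_\ell| \times 1$ vector $[\langle \phi_{\ell,n}, f \rangle]$, $\Phi_\ell^T \Phi_\ell$ is the $|\mathcal{C}_\ell| \times |\mathcal{C}_\ell|$ matrix $[\langle \phi_{\ell,n}, \phi_{\ell,n'} \rangle]$, and $F_\ell^*$ is the $|\mathcal{C}_\ell| \times 1$ vector $[F_{\ell,n}^*]$. However, the matrix $\Phi_\ell^T \Phi_\ell$ in the case of tri-linear splines, unlike the case of constant splines, is block tri-diagonal, rather than diagonal, and hence is more difficult to invert. Nevertheless, $\Phi_\ell^T \Phi_\ell$ is sparse, and thus inversion by iterative methods is feasible.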
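One possible iterative method is the conjugate gradient method, sketched below in Python. The choice of conjugate gradient is an assumption made for illustration; the description above does not name a specific solver. The sketch also assumes the point-supported inner product $\langle f, g \rangle = \sum_i f(x_i)\, g(x_i)$, stores the matrix densely for brevity (a practical implementation would exploit sparsity), and uses hypothetical names throughout.

```python
def hat(t):
    """Linear B-spline: 1 - |t| on [-1, 1], else 0."""
    return max(0.0, 1.0 - abs(t))


def phi(level, n, x):
    """Tri-linear basis function phi_{l,n}(x) = phi_{0,0}(2^l x - n)."""
    s = 2 ** level
    return hat(s * x[0] - n[0]) * hat(s * x[1] - n[1]) * hat(s * x[2] - n[2])


def normal_equations(level, corners, points, values):
    """Build the matrix [<phi_n, phi_n'>] and the vector [<phi_n, f>]
    under the assumed inner product <f, g> = sum_i f(x_i) g(x_i)."""
    rows = [[phi(level, n, x) for n in corners] for x in points]
    m = len(corners)
    gram = [[sum(r[a] * r[b] for r in rows) for b in range(m)] for a in range(m)]
    rhs = [sum(r[a] * f for r, f in zip(rows, values)) for a in range(m)]
    return gram, rhs


def conjugate_gradient(A, b, tol=1e-20, max_iter=500):
    """Solve A x = b for symmetric positive definite A.
    Stops when the squared residual norm drops to tol or below."""
    x = [0.0] * len(b)
    r = list(b)                      # residual b - A x for x = 0
    p = list(r)                      # search direction
    rr = sum(v * v for v in r)
    for _ in range(max_iter):
        if rr <= tol:
            break
        Ap = [sum(a * v for a, v in zip(row, p)) for row in A]
        pAp = sum(u * v for u, v in zip(p, Ap))
        if pAp <= 0.0:               # guard against breakdown
            break
        alpha = rr / pAp
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, Ap)]
        rr_new = sum(v * v for v in r)
        p = [ri + (rr_new / rr) * pi for ri, pi in zip(r, p)]
        rr = rr_new
    return x
```

When the sampled attribute is itself tri-linear at the chosen level and the matrix is invertible, the solution reproduces the attribute at the corners, which gives a convenient check of the sketch.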

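The elements of $\Phi_\ell^T \Phi_\ell$ and $\Phi_\ell^T f$ do not have to be computed directly from their definitions at every level. Rather, they can be computed directly from their definitions in the case of $\ell = d+1$, and then computed recursively for $\ell \le d$, using the two-scale equation (27). For the tri-linear B-spline, the coefficients in the two-scale equation are

$$a_k = \begin{cases} 2^{-\|k\|_1} & k \in \{-1, 0, 1\}^3 \\ 0 & \text{otherwise,} \end{cases} \tag{84}$$

where $\|k\|_1 = |k_x| + |k_y| + |k_z|$ is the 1-norm of $k = (k_x, k_y, k_z)$. Thus, as in equations (30) and (31), for $\ell \le d$,

$$\Phi_\ell^T f = A_{\ell+1}\, \Phi_{\ell+1}^T f \tag{85}$$

and

$$\Phi_\ell^T \Phi_\ell = A_{\ell+1}\, \Phi_{\ell+1}^T \Phi_{\ell+1}\, A_{\ell+1}^T, \tag{86}$$

where

$$A_{\ell+1} = [a_{n_j - n_i}]$$

is the matrix of two-scale coefficients, with rows indexed by the level-$\ell$ corners $n_i$ and columns indexed by the level-$(\ell+1)$ corners $n_j$, both expressed on the level-$(\ell+1)$ grid. For $\ell = d+1$, one may use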


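$$\Phi_{d+1}^T f = [f_i] \tag{87}$$

and

$$\Phi_{d+1}^T \Phi_{d+1} = I_N, \tag{88}$$

where $I_N$ is the $N \times N$ identity matrix. This is possible because the point locations $x_i$ can be taken to be the centers of the occupied voxels, or blocks $B_{d,n}$ at level $d$. This means that they are at the corners $\mathcal{C}_{d+1}$ of blocks $B_{d+1,n}$ at level $d+1$, for which the tri-linear functions $\phi_{d+1,n}$, $n \in \mathcal{C}_{d+1}$, do not overlap at the point locations: each function is 1 at its own corner and 0 at every other corner.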
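The two-scale coefficients of equation (84) and the recursion of equation (85) can be checked numerically with the following Python sketch. It assumes that shifts are indexed per level, so that the level-$(\ell+1)$ neighbors of a level-$\ell$ corner $n$ are $2n + k$ for $k \in \{-1, 0, 1\}^3$, and it assumes the point-supported inner product $\langle \phi, f \rangle = \sum_i \phi(x_i) f_i$. All function names are hypothetical.

```python
from itertools import product


def hat(t):
    """Linear B-spline: 1 - |t| on [-1, 1], else 0."""
    return max(0.0, 1.0 - abs(t))


def phi(level, n, x):
    """Tri-linear basis function phi_{l,n}(x) = phi_{0,0}(2^l x - n)."""
    s = 2 ** level
    return hat(s * x[0] - n[0]) * hat(s * x[1] - n[1]) * hat(s * x[2] - n[2])


def two_scale_coeff(k):
    """a_k = 2^(-||k||_1) for k in {-1, 0, 1}^3, and 0 otherwise."""
    if all(c in (-1, 0, 1) for c in k):
        return 2.0 ** -(abs(k[0]) + abs(k[1]) + abs(k[2]))
    return 0.0


def inner_direct(level, n, points, values):
    """<phi_{l,n}, f> computed from its definition."""
    return sum(phi(level, n, x) * f for x, f in zip(points, values))


def inner_recursive(level, n, points, values):
    """<phi_{l,n}, f> computed from the level-(l+1) inner products,
    i.e. one row of the recursion in equation (85)."""
    total = 0.0
    for k in product((-1, 0, 1), repeat=3):
        child = (2 * n[0] + k[0], 2 * n[1] + k[1], 2 * n[2] + k[2])
        total += two_scale_coeff(k) * inner_direct(level + 1, child, points, values)
    return total
```

In a full implementation the level-$(\ell+1)$ inner products would be stored and reused rather than recomputed, so that only the finest level touches the points directly.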

Consider now the orthogonal complement $\mathcal{G}_\ell$ of $\mathcal{F}_\ell$ in $\mathcal{F}_{\ell+1}$. As in the case of constant B-splines, in the case of tri-linear B-splines the basis functions of $\mathcal{G}_\ell$ depend locally on the point locations $\{x_i\}$ and hence are not shifts of each other, as they would be in the case of Lebesgue measure. In the case of constant B-splines, it is possible to explicitly define basis functions orthogonal to each other and to $\mathcal{F}_\ell$. Unfortunately, in the case of tri-linear B-splines, this is more challenging to do. One approach is to follow the procedure discussed above with respect to wavelets. There may be drawbacks of this approach, however. First, the basis functions computed for the subspace $\mathcal{G}_\ell$, though they are orthogonal to the basis functions for the subspace $\mathcal{F}_\ell$, are not in general orthogonal to each other. Thus, quantizing them independently of each other may magnify the quantization error. Second, the procedure for finding the basis functions, which involves a large matrix inverse, would be followed at the decoder as well as the encoder, which may be undesirable for lightweight decoders.

In contrast, the approach implemented by the systems and methods discussed herein, which is an approximate approach, is to prune the octree back to a collection of blocks of varying sizes, such that the per-point approximation error within each block is below a threshold (e.g., a predetermined threshold value for per-point approximation error), and to keep the coefficients at the corners of these blocks. If two (or more) such coefficients $F_{\ell,n}$ and $F_{\ell',n'}$ with $\ell < \ell'$ occupy the same position in space, i.e., $2^{-\ell'} n' = 2^{-\ell} n$, then the coefficients are combined in some way. In some example embodiments, the coefficients are averaged. In other example embodiments, the coefficients are optimized by a least squares method. In further example embodiments, the coefficient at the lower level is discarded. The kept coefficients $F_{\ell,n}$, each located at a different location $2^{-\ell} n$, are themselves a point cloud, which can be compressed with RAHT using a constant B-spline.
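The averaging embodiment can be sketched in Python as follows. The sketch assumes that a coefficient at level $\ell$ with shift $n$ sits at position $2^{-\ell} n$, and it uses exact rational arithmetic so that coincident positions from different levels compare equal. The function name `merge_coincident` and the input layout are hypothetical.

```python
from collections import defaultdict
from fractions import Fraction


def merge_coincident(coeffs):
    """Average coefficients that occupy the same position in space.

    coeffs: iterable of (level, (nx, ny, nz), value) triples.
    Returns a dict mapping each distinct position, a tuple of Fractions
    equal to 2**-level * n, to the mean of the values found there.
    """
    groups = defaultdict(list)
    for level, n, value in coeffs:
        position = tuple(Fraction(c, 2 ** level) for c in n)
        groups[position].append(value)
    return {pos: sum(vals) / len(vals) for pos, vals in groups.items()}
```

The resulting dictionary is itself a small point cloud of positions with one attribute each, which is the form in which the kept coefficients could be handed to a RAHT encoder.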

Higher Order B-Splines

Volumetric B-splines of order $p \ge 3$ are possible, and offer higher order continuity properties. However, they are more complex to compute, and at each level $\ell$, they require significantly more coefficients per occupied block. Specifically, a function in the level-$\ell$ subspace $\mathcal{F}_\ell$ requires $p^3$ coefficients per occupied block. Although some of these coefficients are shared between blocks, in some example embodiments, the extra smoothness gained from the higher order may not be worth the added bitrate or computational complexity.

FIG. 6 is a flowchart illustrating operations in a method 600 of encoding attributes of a point cloud, according to some example embodiments. Some or all of the operations in the method 600 may be performed by an encoder machine (e.g., by one or more processors of an encoder device or other machine configured to compress data, such as data that specifies point cloud attributes). As shown, the method 600 includes one or more of operations 610, 620, 630, and 640.

In operation 610, the encoder machine accesses (e.g., reads, receives, or retrieves) a point cloud that specifies a finite set of points in three-dimensional space. The point cloud may be accessed by accessing data that describes, defines, or otherwise specifies the geometric information of the point cloud, the attribute information of the point cloud, or both.

In operation 620, the encoder machine determines (e.g., fits) a volumetric function (e.g., a volumetric B-spline function) based on values of a signal defined on the finite set of points of the point cloud accessed in operation 610.

In operation 630, the encoder machine generates a data structure that represents the volumetric function determined in operation 620. The volumetric function represents the signal defined on the finite set of points of the point cloud accessed in operation 610.

In operation 640, the encoder machine provides (e.g., communicates or causes communication of) the data structure generated in operation 630 to a decoder machine. The decoder machine may be configured to recover values of the signal within a predetermined approximation threshold by evaluating the volumetric function on the finite set of points.

As shown in FIG. 6, operation 630 may include one or more of operations 632, 634, and 636. According to some example embodiments, in operation 632, as part of generating the data structure that represents the volumetric function, the encoder machine determines coefficients of basis functions that correspond to the signal defined on the finite set of points.

In some other example embodiments, however, the volumetric function that represents the signal is a first volumetric function that represents geometry information of the point cloud, and the point cloud further specifies a corresponding set of attributes (e.g., color) for each point among the finite set of points. Accordingly, the encoder machine may perform data compression on the sets of attributes that correspond to the finite set of points by determining a second volumetric function based on the sets of attributes. Thus, in operation 632, as part of generating the data structure that represents the second volumetric function, the encoder machine may determine coefficients of the second volumetric function.

In operation 634, the encoder machine quantizes the determined coefficients of the volumetric function (e.g., the first volumetric function, the second volumetric function, or both).

In operation 636, the encoder machine performs entropy coding on the determined coefficients (e.g., the quantized determined coefficients) discussed above with respect to operation 632, operation 634, or both.

Accordingly, in various example embodiments, performance of operation 640 includes providing entropy coded and quantized coefficients of the second volumetric function within the generated data structure to the decoder machine. This may be done by providing the generated data structure within a bit stream to the decoder machine.

FIG. 7 is a flowchart illustrating operations in a method 700 of decoding previously encoded attributes of a point cloud, according to some example embodiments. Some or all of the operations in the method 700 may be performed by a decoder machine (e.g., by one or more processors of a decoder device or other machine configured to decompress data, such as previously compressed data that specifies point cloud attributes). As shown, the method 700 includes one or more of operations 710, 720, and 730.

In operation 710, the decoder machine accesses (e.g., reads, receives, or retrieves) a data structure that represents a volumetric function (e.g., a volumetric B-spline function). The volumetric function represents a signal on a finite set of points specified by a point cloud in three-dimensional space. In some example embodiments, the accessing of the data structure includes receiving the data structure within a bit stream generated by an encoder machine. In addition, according to certain example embodiments, the accessed data structure includes coefficients of basis functions that correspond to the signal defined on the finite set of points specified by the point cloud.

In operation 720, the decoder machine recovers the point cloud, within a predetermined approximation threshold, by evaluating the volumetric function represented by the data structure accessed in operation 710. The result is data that describes, defines, or otherwise specifies (e.g., within the approximation threshold) the geometric information of the point cloud, the attribute information of the point cloud, or both.

In some example embodiments, the volumetric function that represents the signal is a first volumetric function that represents geometry information of the point cloud, and the point cloud further specifies a corresponding set of attributes (e.g., color) for each point among the finite set of points. Accordingly, the decoder machine may recover (e.g., within the approximation threshold) the sets of attributes by evaluating a second volumetric function defined by the accessed data structure, where the second volumetric function represents the sets of attributes that correspond to the finite set of points. Thus, as part of operation 720, prior to the evaluating of the second volumetric function, the decoder machine may determine (e.g., fit) the second volumetric function by performing one or more of operations 722, 724, and 726, any one or more of which may be included in operation 720.

In operation 722, the decoder machine performs entropy decoding on entropy coded and quantized coefficients of the second volumetric function.

In operation 724, the decoder machine determines coefficients of the second volumetric function by inverse quantizing the entropy decoded but still quantized coefficients of the second volumetric function.

In operation 726, the decoder machine determines values of the signal that is defined on the finite set of points specified by the point cloud.
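The quantization and entropy coding of operations 634 and 636, together with the matching entropy decoding and inverse quantization of operations 722 and 724, can be sketched as a round trip in Python. This is only an illustrative sketch: it assumes a uniform scalar quantizer with a hypothetical step size `step`, and it uses the general-purpose `zlib` compressor as a stand-in for an entropy coder, which is not necessarily the coder used in any example embodiment. The function names are hypothetical.

```python
import struct
import zlib


def encode_coefficients(coeffs, step):
    """Quantize coefficients to integer indices with a uniform step, then
    compress the indices. zlib is a stand-in for an entropy coder."""
    indices = [round(c / step) for c in coeffs]
    payload = struct.pack(f"<{len(indices)}i", *indices)   # 32-bit ints
    return zlib.compress(payload)


def decode_coefficients(blob, step):
    """Decompress the indices, then inverse quantize them."""
    payload = zlib.decompress(blob)
    indices = struct.unpack(f"<{len(payload) // 4}i", payload)
    return [q * step for q in indices]
```

With this quantizer, each recovered coefficient differs from the original by at most half the step size, which is one way the step size could be tied to an approximation threshold.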

In operation 730, the decoder machine causes (e.g., triggers, controls, or otherwise initiates) a rendering of the recovered point cloud, a display of the recovered point cloud, or both, based on results of operation 720 (e.g., based on the results of evaluating the volumetric function represented by the data structure accessed in operation 710).

According to various example embodiments, one or more of the methodologies described herein may facilitate representation of attributes of a point cloud. Moreover, one or more of the methodologies described herein may facilitate generating, storing, accessing, transmitting, receiving, compressing, decompressing, modifying, rendering, displaying, or any suitable combination thereof, a data structure that represents attributes of a point cloud. Hence, one or more of the systems and methods described herein may facilitate generating, storing, accessing, transmitting, receiving, compressing, decompressing, modifying, rendering, displaying, or any suitable combination thereof, computer graphics that include one or more point clouds, compared to capabilities of pre-existing systems and methods.

When these effects are considered in aggregate, one or more of the methodologies described herein may obviate a need for certain efforts or resources that otherwise would be involved in data operations on one or more representations of attributes of point clouds. Efforts expended by a user in using point clouds for computer graphics may be reduced by use of (e.g., reliance upon) a special-purpose machine that implements one or more of the methodologies described herein. Computing resources used by one or more systems or machines may similarly be reduced (e.g., compared to systems or machines that lack the structures discussed herein or are otherwise unable to perform the functions discussed herein). Examples of such computing resources include processor cycles, network traffic, computational capacity, main memory usage, graphics rendering capacity, graphics memory usage, data storage capacity, power consumption, and cooling capacity.

FIG. 8 is a block diagram illustrating components of a machine 800, according to some example embodiments, able to read instructions 824 from a machine-readable medium 822 (e.g., a non-transitory machine-readable medium, a machine-readable storage medium, a computer-readable storage medium, or any suitable combination thereof) and perform any one or more of the methodologies discussed herein, in whole or in part. Specifically, FIG. 8 shows the machine 800 in the example form of a computer system (e.g., a computer) within which the instructions 824 (e.g., software, a program, an application, an applet, an app, or other executable code) for causing the machine 800 to perform any one or more of the methodologies discussed herein may be executed, in whole or in part.

In alternative embodiments, the machine 800 operates as a standalone device or may be communicatively coupled (e.g., networked) to other machines. In a networked deployment, the machine 800 may operate in the capacity of a server machine or a client machine in a server-client network environment, or as a peer machine in a distributed (e.g., peer-to-peer) network environment. The machine 800 may be a server computer, a client computer, a personal computer (PC), a tablet computer, a laptop computer, a netbook, a cellular telephone, a smart phone, a set-top box (STB), a personal digital assistant (PDA), a web appliance, a network router, a network switch, a network bridge, or any machine capable of executing the instructions 824, sequentially or otherwise, that specify actions to be taken by that machine. Further, while only a single machine is illustrated, the term “machine” shall also be taken to include any collection of machines that individually or jointly execute the instructions 824 to perform all or part of any one or more of the methodologies discussed herein.

The machine 800 includes a processor 802 (e.g., one or more central processing units (CPUs), one or more graphics processing units (GPUs), one or more digital signal processors (DSPs), one or more application specific integrated circuits (ASICs), one or more radio-frequency integrated circuits (RFICs), or any suitable combination thereof), a main memory 804, and a static memory 806, which are configured to communicate with each other via a bus 808. The processor 802 contains solid-state digital microcircuits (e.g., electronic, optical, or both) that are configurable, temporarily or permanently, by some or all of the instructions 824 such that the processor 802 is configurable to perform any one or more of the methodologies described herein, in whole or in part. For example, a set of one or more microcircuits of the processor 802 may be configurable to execute one or more modules (e.g., software modules) described herein. In some example embodiments, the processor 802 is a multicore CPU (e.g., a dual-core CPU, a quad-core CPU, an 8-core CPU, or a 128-core CPU) within which each of multiple cores behaves as a separate processor that is able to perform any one or more of the methodologies discussed herein, in whole or in part. Although the beneficial effects described herein may be provided by the machine 800 with at least the processor 802, these same beneficial effects may be provided by a different kind of machine that contains no processors (e.g., a purely mechanical system, a purely hydraulic system, or a hybrid mechanical-hydraulic system), if such a processor-less machine is configured to perform one or more of the methodologies described herein.

The machine 800 may further include a graphics display 810 (e.g., a plasma display panel (PDP), a light emitting diode (LED) display, a liquid crystal display (LCD), a projector, a cathode ray tube (CRT), or any other display capable of displaying graphics or video). The machine 800 may also include an alphanumeric input device 812 (e.g., a keyboard or keypad), a pointer input device 814 (e.g., a mouse, a touchpad, a touchscreen, a trackball, a joystick, a stylus, a motion sensor, an eye tracking device, a data glove, or other pointing instrument), a data storage 816, an audio generation device 818 (e.g., a sound card, an amplifier, a speaker, a headphone jack, or any suitable combination thereof), and a network interface device 820.

The data storage 816 (e.g., a data storage device) includes the machine-readable medium 822 (e.g., a tangible and non-transitory machine-readable storage medium) on which are stored the instructions 824 embodying any one or more of the methodologies or functions described herein. The instructions 824 may also reside, completely or at least partially, within the main memory 804, within the static memory 806, within the processor 802 (e.g., within the processor's cache memory), or any suitable combination thereof, before or during execution thereof by the machine 800. Accordingly, the main memory 804, the static memory 806, and the processor 802 may be considered machine-readable media (e.g., tangible and non-transitory machine-readable media). The instructions 824 may be transmitted or received over a network 890 via the network interface device 820. For example, the network interface device 820 may communicate the instructions 824 using any one or more transfer protocols (e.g., hypertext transfer protocol (HTTP)).

In some example embodiments, the machine 800 may be a portable computing device (e.g., a smart phone, a tablet computer, or a wearable device), and may have one or more additional input components 830 (e.g., sensors or gauges). Examples of such input components 830 include an image input component (e.g., one or more cameras), an audio input component (e.g., one or more microphones), a direction input component (e.g., a compass), a location input component (e.g., a global positioning system (GPS) receiver), an orientation component (e.g., a gyroscope), a motion detection component (e.g., one or more accelerometers), an altitude detection component (e.g., an altimeter), a temperature input component (e.g., a thermometer), and a gas detection component (e.g., a gas sensor). Input data gathered by any one or more of these input components may be accessible and available for use by any of the modules described herein (e.g., with suitable privacy notifications and protections, such as opt-in consent or opt-out consent, implemented in accordance with user preference, applicable regulations, or any suitable combination thereof).

As used herein, the term “memory” refers to a machine-readable medium able to store data temporarily or permanently and may be taken to include, but not be limited to, random-access memory (RAM), read-only memory (ROM), buffer memory, flash memory, and cache memory. While the machine-readable medium 822 is shown in an example embodiment to be a single medium, the term “machine-readable medium” should be taken to include a single medium or multiple media (e.g., a centralized or distributed database, or associated caches and servers) able to store instructions. The term “machine-readable medium” shall also be taken to include any medium, or combination of multiple media, that is capable of carrying (e.g., storing or communicating) the instructions 824 for execution by the machine 800, such that the instructions 824, when executed by one or more processors of the machine 800 (e.g., processor 802), cause the machine 800 to perform any one or more of the methodologies described herein, in whole or in part. Accordingly, a “machine-readable medium” refers to a single storage apparatus or device, as well as cloud-based storage systems or storage networks that include multiple storage apparatus or devices. The term “machine-readable medium” shall accordingly be taken to include, but not be limited to, one or more tangible and non-transitory data repositories (e.g., data volumes) in the example form of a solid-state memory chip, an optical disc, a magnetic disc, or any suitable combination thereof.

A “non-transitory” machine-readable medium, as used herein, specifically excludes propagating signals per se. According to various example embodiments, the instructions 824 for execution by the machine 800 can be communicated via a carrier medium (e.g., a machine-readable carrier medium). Examples of such a carrier medium include a non-transient carrier medium (e.g., a non-transitory machine-readable storage medium, such as a solid-state memory that is physically movable from one place to another place) and a transient carrier medium (e.g., a carrier wave or other propagating signal that communicates the instructions 824).

Certain example embodiments are described herein as including modules. Modules may constitute software modules (e.g., code stored or otherwise embodied in a machine-readable medium or in a transmission medium), hardware modules, or any suitable combination thereof. A “hardware module” is a tangible (e.g., non-transitory) physical component (e.g., a set of one or more processors) capable of performing certain operations and may be configured or arranged in a certain physical manner. In various example embodiments, one or more computer systems or one or more hardware modules thereof may be configured by software (e.g., an application or portion thereof) as a hardware module that operates to perform operations described herein for that module.

In some example embodiments, a hardware module may be implemented mechanically, electronically, hydraulically, or any suitable combination thereof. For example, a hardware module may include dedicated circuitry or logic that is permanently configured to perform certain operations. A hardware module may be or include a special-purpose processor, such as a field programmable gate array (FPGA) or an ASIC. A hardware module may also include programmable logic or circuitry that is temporarily configured by software to perform certain operations. As an example, a hardware module may include software encompassed within a CPU or other programmable processor. It will be appreciated that the decision to implement a hardware module mechanically, hydraulically, in dedicated and permanently configured circuitry, or in temporarily configured circuitry (e.g., configured by software) may be driven by cost and time considerations.

Accordingly, the phrase “hardware module” should be understood to encompass a tangible entity that may be physically constructed, permanently configured (e.g., hardwired), or temporarily configured (e.g., programmed) to operate in a certain manner or to perform certain operations described herein. Furthermore, as used herein, the phrase “hardware-implemented module” refers to a hardware module. Considering example embodiments in which hardware modules are temporarily configured (e.g., programmed), each of the hardware modules need not be configured or instantiated at any one instance in time. For example, where a hardware module includes a CPU configured by software to become a special-purpose processor, the CPU may be configured as respectively different special-purpose processors (e.g., each included in a different hardware module) at different times. Software (e.g., a software module) may accordingly configure one or more processors, for example, to become or otherwise constitute a particular hardware module at one instance of time and to become or otherwise constitute a different hardware module at a different instance of time.

Hardware modules can provide information to, and receive information from, other hardware modules. Accordingly, the described hardware modules may be regarded as being communicatively coupled. Where multiple hardware modules exist contemporaneously, communications may be achieved through signal transmission (e.g., over circuits and buses) between or among two or more of the hardware modules. In embodiments in which multiple hardware modules are configured or instantiated at different times, communications between such hardware modules may be achieved, for example, through the storage and retrieval of information in memory structures to which the multiple hardware modules have access. For example, one hardware module may perform an operation and store the output of that operation in a memory (e.g., a memory device) to which it is communicatively coupled. A further hardware module may then, at a later time, access the memory to retrieve and process the stored output. Hardware modules may also initiate communications with input or output devices, and can operate on a resource (e.g., a collection of information from a computing resource).

The various operations of example methods described herein may be performed, at least partially, by one or more processors that are temporarily configured (e.g., by software) or permanently configured to perform the relevant operations. Whether temporarily or permanently configured, such processors may constitute processor-implemented modules that operate to perform one or more operations or functions described herein. As used herein, “processor-implemented module” refers to a hardware module in which the hardware includes one or more processors. Accordingly, the operations described herein may be at least partially processor-implemented, hardware-implemented, or both, since a processor is an example of hardware, and at least some operations within any one or more of the methods discussed herein may be performed by one or more processor-implemented modules, hardware-implemented modules, or any suitable combination thereof.

Moreover, such one or more processors may perform operations in a “cloud computing” environment or as a service (e.g., within a “software as a service” (SaaS) implementation). For example, at least some operations within any one or more of the methods discussed herein may be performed by a group of computers (e.g., as examples of machines that include processors), with these operations being accessible via a network (e.g., the Internet) and via one or more appropriate interfaces (e.g., an application program interface (API)). The performance of certain operations may be distributed among the one or more processors, whether residing only within a single machine or deployed across a number of machines. In some example embodiments, the one or more processors or hardware modules (e.g., processor-implemented modules) may be located in a single geographic location (e.g., within a home environment, an office environment, or a server farm). In other example embodiments, the one or more processors or hardware modules may be distributed across a number of geographic locations.

Throughout this specification, plural instances may implement components, operations, or structures described as a single instance. Although individual operations of one or more methods are illustrated and described as separate operations, one or more of the individual operations may be performed concurrently, and nothing requires that the operations be performed in the order illustrated. Structures and their functionality presented as separate components and functions in example configurations may be implemented as a combined structure or component with combined functions. Similarly, structures and functionality presented as a single component may be implemented as separate components and functions. These and other variations, modifications, additions, and improvements fall within the scope of the subject matter herein.

Some portions of the subject matter discussed herein may be presented in terms of algorithms or symbolic representations of operations on data stored as bits or binary digital signals within a memory (e.g., a computer memory or other machine memory). Such algorithms or symbolic representations are examples of techniques used by those of ordinary skill in the data processing arts to convey the substance of their work to others skilled in the art. As used herein, an “algorithm” is a self-consistent sequence of operations or similar processing leading to a desired result. In this context, algorithms and operations involve physical manipulation of physical quantities. Typically, but not necessarily, such quantities may take the form of electrical, magnetic, or optical signals capable of being stored, accessed, transferred, combined, compared, or otherwise manipulated by a machine. It is convenient at times, principally for reasons of common usage, to refer to such signals using words such as “data,” “content,” “bits,” “values,” “elements,” “symbols,” “characters,” “terms,” “numbers,” “numerals,” or the like. These words, however, are merely convenient labels and are to be associated with appropriate physical quantities.

Unless specifically stated otherwise, discussions herein using words such as “accessing,” “processing,” “detecting,” “computing,” “calculating,” “determining,” “generating,” “presenting,” “displaying,” or the like refer to actions or processes performable by a machine (e.g., a computer) that manipulates or transforms data represented as physical (e.g., electronic, magnetic, or optical) quantities within one or more memories (e.g., volatile memory, non-volatile memory, or any suitable combination thereof), registers, or other machine components that receive, store, transmit, or display information. Furthermore, unless specifically stated otherwise, the terms “a” or “an” are herein used, as is common in patent documents, to include one or more than one instance. Finally, as used herein, the conjunction “or” refers to a non-exclusive “or,” unless specifically stated otherwise.

The following enumerated embodiments describe various example embodiments of methods, machine-readable media, and systems (e.g., machines, devices, or other apparatus) discussed herein.

A first embodiment provides a method comprising:

accessing, by one or more processors of an encoder machine, a point cloud that specifies a finite set of points in three-dimensional space; and
generating, by one or more processors of the encoder machine, a data structure that represents a volumetric function, the volumetric function representing a signal defined on the finite set of points of the accessed point cloud.

A second embodiment provides a method according to the first embodiment, further comprising:

determining the volumetric function based on values of the signal defined on the finite set of points of the accessed point cloud.

A third embodiment provides a method according to the first embodiment or the second embodiment, further comprising:

providing the generated data structure to a decoder machine configured to recover values of the signal within a predetermined approximation threshold by evaluating the volumetric function on the finite set of points.
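As an illustration only, the decoder-side recovery described in the third embodiment can be sketched for the simplest Bezier volume, a trilinear one on the unit cube. The helper names (bernstein1, evaluate_volume, within_threshold), the single-cube setup, and the sample values are assumptions made for this sketch and are not taken from the embodiments.

```python
def bernstein1(i, t):
    """Degree-1 Bernstein polynomial: B_0(t) = 1 - t, B_1(t) = t."""
    return t if i == 1 else 1.0 - t


def evaluate_volume(coeffs, point):
    """Evaluate a trilinear Bezier volume on the unit cube at one point.

    coeffs[i][j][k] is the (hypothetical) control value at corner (i, j, k).
    """
    x, y, z = point
    total = 0.0
    for i in (0, 1):
        for j in (0, 1):
            for k in (0, 1):
                weight = bernstein1(i, x) * bernstein1(j, y) * bernstein1(k, z)
                total += coeffs[i][j][k] * weight
    return total


def within_threshold(coeffs, points, values, threshold):
    """Check that every recovered value is within threshold of the original."""
    return all(
        abs(evaluate_volume(coeffs, p) - v) <= threshold
        for p, v in zip(points, values)
    )


# Control values sampled from f(x, y, z) = x + 2y + 4z at the cube corners.
coeffs = [[[0.0, 4.0], [2.0, 6.0]], [[1.0, 5.0], [3.0, 7.0]]]
points = [(0.5, 0.5, 0.5), (1.0, 0.0, 0.0)]
values = [3.4, 1.0]  # illustrative signal values at the points

ok_loose = within_threshold(coeffs, points, values, 0.25)
ok_tight = within_threshold(coeffs, points, values, 0.05)
```

In this sketch the approximation threshold is a plain absolute-error bound per point; other error measures could be substituted without changing the structure.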

A fourth embodiment provides a method according to any of the first through third embodiments, wherein:

the generating the data structure that represents the volumetric function includes determining coefficients of basis functions that correspond to the signal defined on the finite set of points.
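A minimal sketch of determining basis-function coefficients follows. For brevity it assumes one non-overlapping indicator (piecewise-constant) basis function per cubic block rather than a Bezier basis; under that assumption the least-squares coefficient of each block is simply the mean of the signal values of the points inside it. The function name fit_block_coefficients and the block_size parameter are hypothetical.

```python
from collections import defaultdict


def fit_block_coefficients(points, values, block_size):
    """Least-squares fit with one indicator basis function per cubic block.

    Because the indicator bases do not overlap, the best coefficient for a
    block is the mean of the signal values of the points that fall in it.
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for (x, y, z), v in zip(points, values):
        # Integer block index along each axis (hypothetical addressing scheme).
        key = (int(x // block_size), int(y // block_size), int(z // block_size))
        sums[key] += v
        counts[key] += 1
    return {key: sums[key] / counts[key] for key in sums}


points = [(0.1, 0.2, 0.3), (0.4, 0.1, 0.2), (1.5, 0.2, 0.1)]
values = [10.0, 20.0, 7.0]  # illustrative signal values, one per point
coeffs = fit_block_coefficients(points, values, block_size=1.0)
```

A higher-order basis would replace the per-block mean with the solution of a small linear system, but the input (points and signal values) and the output (one set of coefficients per block) would have the same shape.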

A fifth embodiment provides a method according to any of the first, second, or fourth embodiments, wherein:

the volumetric function that represents the signal is a first volumetric function that represents geometry information of the point cloud;
the point cloud specifies a corresponding set of attributes for each point among the finite set of points; and
the method further comprises:
performing data compression on the sets of attributes that correspond to the finite set of points by determining a second volumetric function based on the sets of attributes.

A sixth embodiment provides a method according to the fifth embodiment, wherein:

the generating of the data structure includes:
determining coefficients of the second volumetric function;
quantizing the determined coefficients of the second volumetric function; and
performing entropy coding on the quantized coefficients of the second volumetric function.
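The quantizing and entropy coding operations of the sixth embodiment might be sketched as follows. The uniform step size, the 32-bit packing, and the use of zlib are assumptions of this sketch: zlib (DEFLATE, which combines LZ77 with Huffman coding) merely stands in for whatever entropy coder an actual implementation would use.

```python
import struct
import zlib


def quantize(coeffs, step):
    """Uniform scalar quantization: map each coefficient to an integer index."""
    return [round(c / step) for c in coeffs]


def entropy_code(indices):
    """Pack indices as little-endian signed 32-bit integers, then DEFLATE them.

    zlib is only a stand-in entropy coder chosen because it is in the
    standard library; it is not the coder described in the source.
    """
    raw = struct.pack("<%di" % len(indices), *indices)
    return zlib.compress(raw, 9)


# Hypothetical coefficients of a second (attribute) volumetric function.
coeffs = [0.52, 0.49, 0.51, -0.02, 0.0, 0.01, 0.0, 0.0]
step = 0.05  # assumed quantization step size
indices = quantize(coeffs, step)
payload = entropy_code(indices)  # bytes that could be placed in a bit stream
```

The step size controls the trade-off: a larger step yields more repeated indices (and typically a smaller payload) at the cost of a larger reconstruction error.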

A seventh embodiment provides a method according to the sixth embodiment, further comprising:

providing the entropy coded and quantized coefficients of the second volumetric function within the generated data structure to a decoder machine by providing the generated data structure within a bit stream to the decoder machine.

An eighth embodiment provides a method comprising:

accessing, by one or more processors of a decoder machine, a data structure that represents a volumetric function, the volumetric function representing a signal defined on a finite set of points specified by a point cloud in three-dimensional space;
recovering, by one or more processors of the decoder machine, the point cloud within a predetermined approximation threshold by evaluating the volumetric function represented by the accessed data structure; and
causing, by one or more processors of the decoder machine, rendering and display of the recovered point cloud based on results of the evaluating of the volumetric function represented by the accessed data structure.

A ninth embodiment provides a method according to the eighth embodiment, wherein:

in the recovering of the point cloud, the evaluating of the volumetric function determines values of the signal defined on the finite set of points specified by the point cloud.

A tenth embodiment provides a method according to the eighth embodiment or the ninth embodiment, wherein:

the accessing of the data structure includes receiving the data structure within a bit stream generated by an encoder machine.

An eleventh embodiment provides a method according to any of the eighth through tenth embodiments, wherein:

the accessed data structure includes coefficients of basis functions that correspond to the signal defined on the finite set of points specified by the point cloud.

A twelfth embodiment provides a method according to any of the eighth through eleventh embodiments, wherein:

the volumetric function that represents the signal is a first volumetric function that represents geometry information of the point cloud;
the point cloud specifies a corresponding set of attributes for each point among the finite set of points; and
the method further comprises:
evaluating a second volumetric function defined by the accessed data structure, the second volumetric function representing the sets of attributes that correspond to the finite set of points.

A thirteenth embodiment provides a method according to the twelfth embodiment, further comprising:

prior to the evaluating of the second volumetric function, determining the second volumetric function by:
performing entropy decoding on entropy coded and quantized coefficients of the second volumetric function; and
determining the coefficients of the second volumetric function by inverse quantizing the entropy decoded but still quantized coefficients of the second volumetric function.
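The two decoder-side operations of the thirteenth embodiment can be illustrated with a small sketch. It assumes a hypothetical payload layout (little-endian signed 32-bit indices compressed with zlib) and a uniform quantization step known to both sides; none of these details come from the embodiments themselves.

```python
import struct
import zlib


def entropy_decode(payload):
    """Undo the stand-in entropy coder: inflate, then unpack 32-bit indices."""
    raw = zlib.decompress(payload)
    count = len(raw) // 4
    return list(struct.unpack("<%di" % count, raw))


def inverse_quantize(indices, step):
    """Map integer indices back to approximate coefficient values."""
    return [q * step for q in indices]


# Build a payload the way a matching (hypothetical) encoder would.
step = 0.05
sent_indices = [10, 10, 10, 0, 0, 0, 0, 0]
payload = zlib.compress(struct.pack("<8i", *sent_indices))

indices = entropy_decode(payload)         # entropy decoded, still quantized
coeffs = inverse_quantize(indices, step)  # coefficients ready for evaluation
```

The recovered coefficients differ from the encoder's original values by at most half a step per coefficient, which is the loss introduced by the uniform quantizer assumed here.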

A fourteenth embodiment provides an encoder system comprising:

one or more processors; and
a memory storing instructions that, when executed by at least one processor among the one or more processors, cause the system to perform operations comprising:
accessing a point cloud that specifies a finite set of points in three-dimensional space; and
generating a data structure that represents a volumetric function, the volumetric function representing a signal defined on the finite set of points of the accessed point cloud.

A fifteenth embodiment provides a decoder system comprising:

one or more processors; and
a memory storing instructions that, when executed by at least one processor among the one or more processors, cause the system to perform operations comprising:
accessing a data structure that represents a volumetric function, the volumetric function representing a signal defined on a finite set of points specified by a point cloud in three-dimensional space;
recovering the point cloud within a predetermined approximation threshold by evaluating the volumetric function represented by the accessed data structure; and
causing rendering and display of the recovered point cloud based on results of the evaluating of the volumetric function represented by the accessed data structure.

A sixteenth embodiment provides a machine-readable medium (e.g., a non-transitory machine-readable storage medium) comprising instructions that, when executed by one or more processors of an encoder machine, cause the encoder machine to perform operations comprising:

accessing a point cloud that specifies a finite set of points in three-dimensional space; and
generating a data structure that represents a volumetric function, the volumetric function representing a signal defined on the finite set of points of the accessed point cloud.

A seventeenth embodiment provides a machine-readable medium (e.g., a non-transitory machine-readable storage medium) comprising instructions that, when executed by one or more processors of a decoder machine, cause the decoder machine to perform operations comprising:

accessing a data structure that represents a volumetric function, the volumetric function representing a signal defined on a finite set of points specified by a point cloud in three-dimensional space;
recovering the point cloud within a predetermined approximation threshold by evaluating the volumetric function represented by the accessed data structure; and
causing rendering and display of the recovered point cloud based on results of the evaluating of the volumetric function represented by the accessed data structure.

An eighteenth embodiment provides a method comprising operations substantially as herein described and illustrated.

A nineteenth embodiment provides a machine-readable medium (e.g., a non-transitory machine-readable storage medium) comprising instructions that, when executed by one or more processors of a machine, cause the machine to perform operations substantially as herein described and illustrated.

A twentieth embodiment provides a system comprising:

one or more processors; and
a memory storing instructions that, when executed by at least one processor among the one or more processors, cause the system to perform operations substantially as herein described and illustrated.

A twenty-first embodiment provides a carrier medium carrying machine-readable instructions for controlling a machine to carry out the method of any one of the first through thirteenth embodiments.

Claims

1. A method comprising:

accessing, by one or more processors, a point cloud that specifies a set of three-dimensional (3D) points; and
generating, by the one or more processors, a data structure that indicates a volumetric function that corresponds to a signal defined on the set of 3D points specified by the accessed point cloud.

2. The method of claim 1, further comprising:

determining the volumetric function that corresponds to the signal based on values of the signal.

3. The method of claim 1, further comprising:

providing the generated data structure that indicates the volumetric function to a device configured to recover the point cloud by evaluating the volumetric function indicated by the data structure, the evaluating of the volumetric function including determining at least some values of the signal defined on the set of 3D points specified by the point cloud.

4. The method of claim 1, wherein:

the generating of the data structure that encodes the volumetric function includes determining coefficients of basis functions that correspond to the signal defined on the set of 3D points specified by the point cloud.

5. The method of claim 1, wherein:

the volumetric function that represents the signal is a first volumetric function that represents geometry of the point cloud that specifies the set of 3D points;
the point cloud specifies a corresponding set of attributes for each 3D point among the set of 3D points; and
the method further comprises:
performing data compression on the sets of attributes that correspond to the set of 3D points by determining a second volumetric function based on the sets of attributes.

6. The method of claim 5, wherein:

the generating of the data structure includes:
determining coefficients of the second volumetric function;
quantizing the determined coefficients of the second volumetric function; and
performing entropy coding on the quantized coefficients of the second volumetric function.

7. The method of claim 6, further comprising:

providing the entropy coded and quantized coefficients of the second volumetric function within the generated data structure to a device.

8. A method comprising:

accessing, by one or more processors, a data structure that indicates a volumetric function that corresponds to a signal defined on a set of three-dimensional (3D) points specified by a point cloud;
recovering, by the one or more processors, the point cloud by evaluating the volumetric function indicated by the accessed data structure; and
causing, by the one or more processors, a rendering of the recovered point cloud based on the evaluating of the volumetric function indicated by the accessed data structure.

9. The method of claim 8, wherein:

in the recovering of the point cloud, the evaluating of the volumetric function includes determining at least some values of the signal defined on the set of 3D points specified by the point cloud.

10. The method of claim 8, wherein:

the accessing of the data structure includes receiving the data structure via a network from a device that generated the data structure.

11. The method of claim 8, wherein:

the accessed data structure includes coefficients of basis functions that correspond to the signal defined on the set of 3D points specified by the point cloud.

12. The method of claim 8, wherein:

the volumetric function that represents the signal is a first volumetric function that represents geometry of the point cloud that specifies the set of 3D points;
the point cloud specifies a corresponding set of attributes for each 3D point among the set of 3D points; and
the method further comprises:
evaluating a second volumetric function indicated by the accessed data structure, the second volumetric function determined based on the sets of attributes that correspond to the set of 3D points.

13. The method of claim 12, further comprising:

prior to the evaluating of the second volumetric function, determining the second volumetric function by performing operations comprising:
performing entropy decoding on entropy coded and quantized coefficients of the second volumetric function; and
determining the coefficients of the second volumetric function by inverse quantizing the entropy decoded but still quantized coefficients of the second volumetric function.

14. A system comprising:

one or more processors; and
a memory storing instructions that, when executed by at least one processor among the one or more processors, cause the system to perform operations comprising:
accessing a point cloud that specifies a set of three-dimensional (3D) points; and
generating a data structure that indicates a volumetric function that corresponds to a signal defined on the set of 3D points specified by the accessed point cloud.

15. The system of claim 14, wherein the operations further comprise:

providing the generated data structure that indicates the volumetric function to a device configured to recover the point cloud by evaluating the volumetric function indicated by the data structure, the evaluating of the volumetric function including determining at least some values of the signal defined on the set of 3D points specified by the point cloud.

16. A system comprising:

one or more processors; and
a memory storing instructions that, when executed by at least one processor among the one or more processors, cause the system to perform operations comprising:
accessing a data structure that indicates a volumetric function that corresponds to a signal defined on a set of three-dimensional (3D) points specified by a point cloud;
recovering the point cloud by evaluating the volumetric function indicated by the accessed data structure; and
causing a rendering of the recovered point cloud based on the evaluating of the volumetric function indicated by the accessed data structure.

17. A non-transitory machine-readable storage medium comprising instructions that, when executed by one or more processors of a machine, cause the machine to perform operations comprising:

accessing a point cloud that specifies a set of three-dimensional (3D) points; and
generating a data structure that indicates a volumetric function that corresponds to a signal defined on the set of 3D points specified by the accessed point cloud.

18. The non-transitory machine-readable storage medium of claim 17, wherein the operations further comprise:

providing the generated data structure that indicates the volumetric function to a device configured to recover the point cloud by evaluating the volumetric function indicated by the data structure, the evaluating of the volumetric function including determining at least some values of the signal defined on the set of 3D points specified by the point cloud.

19. A non-transitory machine-readable storage medium comprising instructions that, when executed by one or more processors of a machine, cause the machine to perform operations comprising:

accessing a data structure that indicates a volumetric function that corresponds to a signal defined on a set of three-dimensional (3D) points specified by a point cloud;
recovering the point cloud by evaluating the volumetric function indicated by the accessed data structure; and
causing a rendering of the recovered point cloud based on the evaluating of the volumetric function indicated by the accessed data structure.

20. The non-transitory machine-readable storage medium of claim 19, wherein:

in the recovering of the point cloud, the evaluating of the volumetric function includes determining at least some values of the signal defined on the set of 3D points specified by the point cloud.

Patent History
Publication number: 20210034696
Type: Application
Filed: Oct 21, 2020
Publication Date: Feb 4, 2021
Inventors: Philip A. Chou (Hermosa Beach, CA), Maxim Koroteev (Karori), Maja Krivokuca (Rennes), Robert James William Higgs (Porirua), Charles Loop (Hermosa Beach, CA)
Application Number: 17/076,578
Classifications
International Classification: G06F 17/15 (20060101); G06F 17/16 (20060101); H03M 7/30 (20060101); G06F 17/17 (20060101); G06T 15/00 (20060101)