Patents by Inventor Miika Samuli Aittala
Miika Samuli Aittala has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Publication number: 20240135630
Abstract: A method and system for performing novel image synthesis using generative networks are provided. An encoder-based model is trained to infer a 3D representation of an input image. A feature image is then generated using volume rendering techniques in accordance with the 3D representation. The feature image is concatenated with a noisy image and processed by a denoiser network to predict an output image from a novel viewpoint that is consistent with the input image. The denoiser network can be a modified Noise Conditional Score Network (NCSN). In some embodiments, multiple input images or keyframes can be provided as input, and a different 3D representation is generated for each input image. The feature image is then generated, during volume rendering, by sampling each of the 3D representations and applying a mean-pooling operation to produce an aggregate feature image.
Type: Application
Filed: October 11, 2023
Publication date: April 25, 2024
Inventors: Koki Nagano, Eric Ryan Wong Chan, Tero Tapani Karras, Shalini De Mello, Miika Samuli Aittala, Matthew Aaron Wong Chan
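The multi-keyframe path in the abstract above can be sketched in a few lines: per-view feature images are mean-pooled into one aggregate, which is then concatenated with a noisy image to form the denoiser's input. This is an illustrative numpy sketch with made-up shapes, not the patented implementation; the denoiser network itself is omitted.

```python
import numpy as np

def aggregate_feature_images(feature_images):
    """Mean-pool the per-view feature images (one per input keyframe,
    each sampled from that keyframe's 3D representation) into a single
    aggregate feature image."""
    stacked = np.stack(feature_images, axis=0)   # (K, H, W, C)
    return stacked.mean(axis=0)                  # (H, W, C)

def denoiser_input(aggregate, noisy_image):
    """Concatenate the aggregate feature image with a noisy image along
    the channel axis to form the denoiser network's input."""
    return np.concatenate([aggregate, noisy_image], axis=-1)

# Two hypothetical keyframes producing 4x4 feature images with 8 channels.
feats = [np.random.rand(4, 4, 8) for _ in range(2)]
agg = aggregate_feature_images(feats)
x = denoiser_input(agg, np.random.rand(4, 4, 3))
print(agg.shape, x.shape)  # (4, 4, 8) (4, 4, 11)
```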
-
Patent number: 11790598
Abstract: A three-dimensional (3D) density volume of an object is constructed from tomography images (e.g., x-ray images) of the object. The tomography images are projection images that capture all structures of an object (e.g., a human body) between a beam source and an imaging sensor. The beam effectively integrates along a path through the object, producing a tomography image at the imaging sensor, where each pixel represents attenuation. A 3D reconstruction pipeline includes a first neural network model, a fixed-function backprojection unit, and a second neural network model. Given information about the capture environment, the tomography images are processed by the reconstruction pipeline to produce a reconstructed 3D density volume of the object. In contrast with a set of 2D slices, the entire 3D density volume is reconstructed, so two-dimensional (2D) density images may be produced by slicing through any portion of the 3D density volume at any angle.
Type: Grant
Filed: July 1, 2021
Date of Patent: October 17, 2023
Assignee: NVIDIA Corporation
Inventors: Onni August Kosomaa, Jaakko T. Lehtinen, Samuli Matias Laine, Tero Tapani Karras, Miika Samuli Aittala
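The projection/backprojection relationship described above can be illustrated with a toy parallel-beam model: projection integrates attenuation along rays, and the fixed-function backprojection smears each measured pixel back along its ray. The parallel-beam geometry and the omission of the two neural network models are simplifying assumptions for this sketch.

```python
import numpy as np

def project(volume, axis=0):
    """Simulate the beam: integrate attenuation along a straight path
    through the volume (parallel-beam geometry along one axis), so each
    output pixel represents accumulated attenuation."""
    return volume.sum(axis=axis)

def backproject(image, depth, axis=0):
    """Fixed-function backprojection: smear each pixel's measured
    attenuation uniformly back along the ray it was integrated over.
    In the patented pipeline this unit sits between two neural networks."""
    vol = np.broadcast_to(image / depth, (depth,) + image.shape)
    return np.moveaxis(vol, 0, axis).copy()

volume = np.random.rand(8, 16, 16)     # toy 3D density volume
tomo = project(volume)                 # (16, 16) projection image
coarse = backproject(tomo, depth=8)    # (8, 16, 16) smeared estimate
print(coarse.shape)  # (8, 16, 16)
```

Re-projecting the backprojected volume recovers the original tomography image exactly, which is why backprojection is a sensible fixed-function core for a learned reconstruction pipeline.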
-
Patent number: 11775829
Abstract: A latent code defined in an input space is processed by a mapping neural network to produce an intermediate latent code defined in an intermediate latent space. The intermediate latent code may be used as an appearance vector that is processed by a synthesis neural network to generate an image. The appearance vector is a compressed encoding of data, such as video frames including a person's face, audio, and other data. Captured images may be converted into appearance vectors at a local device and transmitted to a remote device using much less bandwidth than transmitting the captured images. A synthesis neural network at the remote device reconstructs the images for display.
Type: Grant
Filed: December 12, 2022
Date of Patent: October 3, 2023
Assignee: NVIDIA Corporation
Inventors: Tero Tapani Karras, Samuli Matias Laine, David Patrick Luebke, Jaakko T. Lehtinen, Miika Samuli Aittala, Timo Oskari Aila, Ming-Yu Liu, Arun Mohanray Mallya, Ting-Chun Wang
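The mapping step and the bandwidth argument in the abstract above can be sketched together: a small MLP stands in for the mapping network that turns a latent code into an appearance vector, and simple arithmetic shows why shipping the vector beats shipping frames. All sizes (512-dim latents, two layers, 256x256 frames) are illustrative assumptions, not figures from the patent.

```python
import numpy as np

rng = np.random.default_rng(0)

def mapping_network(z, layers):
    """Toy mapping network: a small MLP that transforms a latent code z
    from the input space into an intermediate latent code w (the
    appearance vector)."""
    h = z
    for W, b in layers:
        a = h @ W + b
        h = np.maximum(0.2 * a, a)  # leaky ReLU
    return h

# Hypothetical sizes: 512-dim latents, two fully connected layers.
layers = [(rng.standard_normal((512, 512)) * 0.01, np.zeros(512))
          for _ in range(2)]
w = mapping_network(rng.standard_normal(512), layers)

# Bandwidth sketch: send the appearance vector instead of each frame.
frame_bytes = 256 * 256 * 3      # raw 256x256 RGB frame
vector_bytes = 512 * 4           # 512-dim float32 appearance vector
print(w.shape, frame_bytes // vector_bytes)  # (512,) 96
```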
-
Publication number: 20230110206
Abstract: A latent code defined in an input space is processed by a mapping neural network to produce an intermediate latent code defined in an intermediate latent space. The intermediate latent code may be used as an appearance vector that is processed by a synthesis neural network to generate an image. The appearance vector is a compressed encoding of data, such as video frames including a person's face, audio, and other data. Captured images may be converted into appearance vectors at a local device and transmitted to a remote device using much less bandwidth than transmitting the captured images. A synthesis neural network at the remote device reconstructs the images for display.
Type: Application
Filed: December 12, 2022
Publication date: April 13, 2023
Inventors: Tero Tapani Karras, Samuli Matias Laine, David Patrick Luebke, Jaakko T. Lehtinen, Miika Samuli Aittala, Timo Oskari Aila, Ming-Yu Liu, Arun Mohanray Mallya, Ting-Chun Wang
-
Patent number: 11625613
Abstract: A latent code defined in an input space is processed by a mapping neural network to produce an intermediate latent code defined in an intermediate latent space. The intermediate latent code may be used as an appearance vector that is processed by a synthesis neural network to generate an image. The appearance vector is a compressed encoding of data, such as video frames including a person's face, audio, and other data. Captured images may be converted into appearance vectors at a local device and transmitted to a remote device using much less bandwidth than transmitting the captured images. A synthesis neural network at the remote device reconstructs the images for display.
Type: Grant
Filed: January 7, 2021
Date of Patent: April 11, 2023
Assignee: NVIDIA Corporation
Inventors: Tero Tapani Karras, Samuli Matias Laine, David Patrick Luebke, Jaakko T. Lehtinen, Miika Samuli Aittala, Timo Oskari Aila, Ming-Yu Liu, Arun Mohanray Mallya, Ting-Chun Wang
-
Patent number: 11620521
Abstract: A style-based generative network architecture enables scale-specific control of synthesized output data, such as images. During training, the style-based generative neural network (generator neural network) includes a mapping network and a synthesis network. During prediction, the mapping network may be omitted, replicated, or evaluated several times. The synthesis network may be used to generate highly varied, high-quality output data with a wide variety of attributes. For example, when used to generate images of people's faces, the attributes that may vary include age, ethnicity, camera viewpoint, pose, face shape, eyeglasses, colors (eyes, hair, etc.), hair style, lighting, and background. Depending on the task, generated output data may include images, audio, video, three-dimensional (3D) objects, text, etc.
Type: Grant
Filed: January 28, 2021
Date of Patent: April 4, 2023
Assignee: NVIDIA Corporation
Inventors: Tero Tapani Karras, Samuli Matias Laine, Jaakko T. Lehtinen, Miika Samuli Aittala, Janne Johannes Hellsten, Timo Oskari Aila
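The scale-specific control described above comes from feeding styles derived from the intermediate latent into the synthesis network at each resolution. The AdaIN-like modulation below is one published variant of that idea, sketched with hypothetical shapes and a random affine transform; it is not a reproduction of the claimed architecture.

```python
import numpy as np

rng = np.random.default_rng(0)

def style_modulate(features, w, affine_W, affine_b, eps=1e-8):
    """Normalize a feature map per channel, then scale and shift it with
    styles derived from the intermediate latent w via a learned affine
    transform. Feeding different w at different resolutions yields
    scale-specific control: coarse layers steer pose and face shape,
    fine layers steer colors and hair style."""
    style = w @ affine_W + affine_b        # (2C,): per-channel scale, bias
    C = features.shape[-1]
    scale, bias = style[:C], style[C:]
    mu = features.mean(axis=(0, 1), keepdims=True)
    sd = features.std(axis=(0, 1), keepdims=True)
    return (features - mu) / (sd + eps) * (1.0 + scale) + bias

w = rng.standard_normal(16)                 # intermediate latent code
feats = rng.standard_normal((8, 8, 4))      # one 8x8 layer, 4 channels
out = style_modulate(feats, w, rng.standard_normal((16, 8)) * 0.1,
                     np.zeros(8))
print(out.shape)  # (8, 8, 4)
```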
-
Patent number: 11610122
Abstract: A latent code defined in an input space is processed by a mapping neural network to produce an intermediate latent code defined in an intermediate latent space. The intermediate latent code may be used as an appearance vector that is processed by a synthesis neural network to generate an image. The appearance vector is a compressed encoding of data, such as video frames including a person's face, audio, and other data. Captured images may be converted into appearance vectors at a local device and transmitted to a remote device using much less bandwidth than transmitting the captured images. A synthesis neural network at the remote device reconstructs the images for display.
Type: Grant
Filed: January 7, 2021
Date of Patent: March 21, 2023
Assignee: NVIDIA Corporation
Inventors: Tero Tapani Karras, Samuli Matias Laine, David Patrick Luebke, Jaakko T. Lehtinen, Miika Samuli Aittala, Timo Oskari Aila, Ming-Yu Liu, Arun Mohanray Mallya, Ting-Chun Wang
-
Patent number: 11610435
Abstract: A latent code defined in an input space is processed by a mapping neural network to produce an intermediate latent code defined in an intermediate latent space. The intermediate latent code may be used as an appearance vector that is processed by a synthesis neural network to generate an image. The appearance vector is a compressed encoding of data, such as video frames including a person's face, audio, and other data. Captured images may be converted into appearance vectors at a local device and transmitted to a remote device using much less bandwidth than transmitting the captured images. A synthesis neural network at the remote device reconstructs the images for display.
Type: Grant
Filed: October 13, 2020
Date of Patent: March 21, 2023
Assignee: NVIDIA Corporation
Inventors: Tero Tapani Karras, Samuli Matias Laine, David Patrick Luebke, Jaakko T. Lehtinen, Miika Samuli Aittala, Timo Oskari Aila, Ming-Yu Liu, Arun Mohanray Mallya, Ting-Chun Wang
-
Patent number: 11605001
Abstract: A style-based generative network architecture enables scale-specific control of synthesized output data, such as images. During training, the style-based generative neural network (generator neural network) includes a mapping network and a synthesis network. During prediction, the mapping network may be omitted, replicated, or evaluated several times. The synthesis network may be used to generate highly varied, high-quality output data with a wide variety of attributes. For example, when used to generate images of people's faces, the attributes that may vary include age, ethnicity, camera viewpoint, pose, face shape, eyeglasses, colors (eyes, hair, etc.), hair style, lighting, and background. Depending on the task, generated output data may include images, audio, video, three-dimensional (3D) objects, text, etc.
Type: Grant
Filed: January 28, 2021
Date of Patent: March 14, 2023
Assignee: NVIDIA Corporation
Inventors: Tero Tapani Karras, Samuli Matias Laine, Jaakko T. Lehtinen, Miika Samuli Aittala, Janne Johannes Hellsten, Timo Oskari Aila
-
Patent number: 11580395
Abstract: A latent code defined in an input space is processed by a mapping neural network to produce an intermediate latent code defined in an intermediate latent space. The intermediate latent code may be used as an appearance vector that is processed by a synthesis neural network to generate an image. The appearance vector is a compressed encoding of data, such as video frames including a person's face, audio, and other data. Captured images may be converted into appearance vectors at a local device and transmitted to a remote device using much less bandwidth than transmitting the captured images. A synthesis neural network at the remote device reconstructs the images for display.
Type: Grant
Filed: October 13, 2020
Date of Patent: February 14, 2023
Assignee: NVIDIA Corporation
Inventors: Tero Tapani Karras, Samuli Matias Laine, David Patrick Luebke, Jaakko T. Lehtinen, Miika Samuli Aittala, Timo Oskari Aila, Ming-Yu Liu, Arun Mohanray Mallya, Ting-Chun Wang
-
Publication number: 20220405980
Abstract: Systems and methods are disclosed for fused processing of a continuous mathematical operator. Fusing continuous mathematical operations, such as pointwise non-linear functions, without storing intermediate results to memory improves performance when memory bus bandwidth is limited. In an embodiment, a continuous mathematical operation including at least two of convolution, upsampling, a pointwise non-linear function, and downsampling is executed to process input data and generate alias-free output data. In an embodiment, the input data is spatially tiled for processing in parallel, such that the intermediate results generated while processing each tile may be stored in a shared memory within the processor. Storing the intermediate data in the shared memory improves performance compared with storing the intermediate data to the external memory and loading it back from the external memory.
Type: Application
Filed: December 27, 2021
Publication date: December 22, 2022
Inventors: Tero Tapani Karras, Miika Samuli Aittala, Samuli Matias Laine, Erik Andreas Härkönen, Janne Johannes Hellsten, Jaakko T. Lehtinen, Timo Oskari Aila
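The tiling idea in the abstract above can be demonstrated with a minimal fused operation: a pointwise nonlinearity followed by a 2x average downsample, processed tile by tile so the intermediate never leaves the tile's local storage. The specific ops and tile size are illustrative assumptions; a real kernel would use GPU shared memory and handle filter halos.

```python
import numpy as np

def fused_tile_op(x, tile=4):
    """Fused pointwise nonlinearity + 2x average downsampling, executed
    tile by tile so the intermediate (post-ReLU) data stays in fast
    shared memory rather than round-tripping through external memory.
    A pointwise op followed by an aligned downsample needs no halo,
    so the tiles are fully independent."""
    H, W = x.shape
    out = np.empty((H // 2, W // 2))
    for i in range(0, H, tile):
        for j in range(0, W, tile):
            t = np.maximum(x[i:i + tile, j:j + tile], 0.0)  # kept "on chip"
            t = t.reshape(tile // 2, 2, tile // 2, 2).mean(axis=(1, 3))
            out[i // 2:(i + tile) // 2, j // 2:(j + tile) // 2] = t
    return out

x = np.random.default_rng(0).standard_normal((8, 8))
reference = np.maximum(x, 0.0).reshape(4, 2, 4, 2).mean(axis=(1, 3))
print(np.allclose(fused_tile_op(x), reference))  # True
```

The tiled result is bit-identical to the unfused reference, which is the point: fusion changes where intermediates live, not what is computed.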
-
Publication number: 20220405880
Abstract: Systems and methods are disclosed that improve the output quality of any neural network, particularly an image generative neural network. In the real world, details of different scale tend to transform hierarchically. For example, moving a person's head causes the nose to move, which in turn moves the skin pores on the nose. Conventional generative neural networks do not synthesize images in a natural hierarchical manner: the coarse features seem to mainly control the presence of finer features, but not the precise positions of the finer features. Instead, much of the fine detail appears to be fixed to pixel coordinates, which is a manifestation of aliasing. Aliasing breaks the illusion of a solid and coherent object moving in space. A generative neural network with reduced aliasing provides an architecture that exhibits a more natural transformation hierarchy, where the exact sub-pixel position of each feature is inherited from underlying coarse features.
Type: Application
Filed: December 27, 2021
Publication date: December 22, 2022
Inventors: Tero Tapani Karras, Miika Samuli Aittala, Samuli Matias Laine, Erik Andreas Härkönen, Janne Johannes Hellsten, Jaakko T. Lehtinen, Timo Oskari Aila
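One standard way to reduce the aliasing described above is to apply the pointwise nonlinearity in a higher-resolution domain and low-pass filter before returning to the original sampling rate, so the new high frequencies introduced by the nonlinearity are attenuated rather than folded onto pixel positions. The 1D sketch below uses a nearest-neighbor upsample and a small box filter as crude stand-ins; the actual design relies on carefully designed windowed sinc filters.

```python
import numpy as np

def alias_reduced_relu(x):
    """Sketch of an alias-reduced pointwise nonlinearity: 2x upsample,
    apply ReLU at the higher sampling rate, low-pass filter, then
    downsample back to the original rate. The filters here are crude
    illustrative choices, not the ones used in practice."""
    up = np.repeat(x, 2)                                   # naive 2x upsample
    act = np.maximum(up, 0.0)                              # nonlinearity
    lp = np.convolve(act, [0.25, 0.5, 0.25], mode="same")  # crude low-pass
    return lp[::2]                                         # 2x downsample

signal = np.sin(np.linspace(0.0, 6.0, 32))
y = alias_reduced_relu(signal)
print(y.shape)  # (32,)
```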
-
Publication number: 20220189100
Abstract: A three-dimensional (3D) density volume of an object is constructed from tomography images (e.g., x-ray images) of the object. The tomography images are projection images that capture all structures of an object (e.g., a human body) between a beam source and an imaging sensor. The beam effectively integrates along a path through the object, producing a tomography image at the imaging sensor, where each pixel represents attenuation. A 3D reconstruction pipeline includes a first neural network model, a fixed-function backprojection unit, and a second neural network model. Given information about the capture environment, the tomography images are processed by the reconstruction pipeline to produce a reconstructed 3D density volume of the object. In contrast with a set of 2D slices, the entire 3D density volume is reconstructed, so two-dimensional (2D) density images may be produced by slicing through any portion of the 3D density volume at any angle.
Type: Application
Filed: July 1, 2021
Publication date: June 16, 2022
Inventors: Onni August Kosomaa, Jaakko T. Lehtinen, Samuli Matias Laine, Tero Tapani Karras, Miika Samuli Aittala
-
Publication number: 20220189011
Abstract: A three-dimensional (3D) density volume of an object is constructed from tomography images (e.g., x-ray images) of the object. The tomography images are projection images that capture all structures of an object (e.g., a human body) between a beam source and an imaging sensor. The beam effectively integrates along a path through the object, producing a tomography image at the imaging sensor, where each pixel represents attenuation. A 3D reconstruction pipeline includes a first neural network model, a fixed-function backprojection unit, and a second neural network model. Given information about the capture environment, the tomography images are processed by the reconstruction pipeline to produce a reconstructed 3D density volume of the object. In contrast with a set of 2D slices, the entire 3D density volume is reconstructed, so two-dimensional (2D) density images may be produced by slicing through any portion of the 3D density volume at any angle.
Type: Application
Filed: July 1, 2021
Publication date: June 16, 2022
Inventors: Onni August Kosomaa, Jaakko T. Lehtinen, Samuli Matias Laine, Tero Tapani Karras, Miika Samuli Aittala
-
Publication number: 20210383241
Abstract: Embodiments of the present disclosure relate to a technique for training neural networks, such as a generative adversarial neural network (GAN), using a limited amount of data. Training GANs with too little example data typically leads to discriminator overfitting, causing training to diverge and produce poor results. An adaptive discriminator augmentation mechanism is used that significantly stabilizes training with limited data, providing the ability to train high-quality GANs. An augmentation operator is applied to the distribution of inputs to a discriminator used to train a generator, representing a transformation that is invertible to ensure there is no leakage of the augmentations into the images generated by the generator. Reducing the amount of training data needed to achieve convergence has the potential to considerably help many applications and may increase the use of generative models in fields such as medicine.
Type: Application
Filed: March 24, 2021
Publication date: December 9, 2021
Inventors: Tero Tapani Karras, Miika Samuli Aittala, Janne Johannes Hellsten, Samuli Matias Laine, Jaakko T. Lehtinen, Timo Oskari Aila
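The mechanism above has two moving parts: an invertible augmentation applied probabilistically to every discriminator input, and a feedback loop that tunes the augmentation probability from an overfitting heuristic. The sketch below uses a horizontal flip as the augmentation and made-up target/step values; the function names and the heuristic are illustrative, not the claimed method.

```python
import numpy as np

rng = np.random.default_rng(1)

def augment(images, p):
    """Apply an invertible augmentation (horizontal flip here) to each
    image independently with probability p. Both real and generated
    images are augmented before the discriminator sees them; keeping
    the transform invertible and p < 1 prevents the augmentations
    from leaking into the generator's outputs."""
    out = images.copy()
    for k in range(len(out)):
        if rng.random() < p:
            out[k] = out[k][:, ::-1]
    return out

def adjust_p(p, overfit_signal, target=0.6, step=0.01):
    """Adaptive control loop: raise the augmentation probability when an
    overfitting heuristic (e.g., how confidently the discriminator
    classifies real images) exceeds a target, lower it otherwise.
    The target and step values are illustrative."""
    direction = step if overfit_signal > target else -step
    return float(np.clip(p + direction, 0.0, 1.0))

batch = np.arange(2 * 4 * 4, dtype=float).reshape(2, 4, 4)
aug = augment(batch, p=0.5)
print(aug.shape, adjust_p(0.5, 0.9), adjust_p(0.5, 0.1))
```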
-
Publication number: 20210329306
Abstract: Apparatuses, systems, and techniques to perform compression of video data using neural networks to facilitate video streaming, such as video conferencing. In at least one embodiment, a sender transmits to a receiver a key frame from video data and one or more keypoints identified by a neural network from said video data, and the receiver reconstructs the video data using said key frame and the one or more received keypoints.
Type: Application
Filed: October 13, 2020
Publication date: October 21, 2021
Inventors: Ming-Yu Liu, Ting-Chun Wang, Arun Mohanray Mallya, Tero Tapani Karras, Samuli Matias Laine, David Patrick Luebke, Jaakko Lehtinen, Miika Samuli Aittala, Timo Oskari Aila
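A back-of-the-envelope calculation shows why transmitting one key frame plus per-frame keypoints compresses so well. All numbers below (720p frames, 10 keypoints, float32 coordinates, 10 seconds at 30 fps) are illustrative assumptions, not figures from the filing.

```python
# Sender: one key frame up front, then only keypoints per frame.
# Receiver: reconstructs each frame from the key frame + keypoints.
frame_bytes = 1280 * 720 * 3        # one raw 720p RGB frame
keypoint_bytes = 10 * 2 * 4         # 10 (x, y) keypoints as float32
frames = 300                        # ten seconds at 30 fps

raw_stream = frames * frame_bytes                   # send every frame raw
keypoint_stream = frame_bytes + frames * keypoint_bytes
print(raw_stream // keypoint_stream)  # 297
```

Even under these rough assumptions the keypoint stream is a few hundred times smaller than the raw stream, which is what makes the scheme attractive for video conferencing.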
-
Publication number: 20210150369
Abstract: A style-based generative network architecture enables scale-specific control of synthesized output data, such as images. During training, the style-based generative neural network (generator neural network) includes a mapping network and a synthesis network. During prediction, the mapping network may be omitted, replicated, or evaluated several times. The synthesis network may be used to generate highly varied, high-quality output data with a wide variety of attributes. For example, when used to generate images of people's faces, the attributes that may vary include age, ethnicity, camera viewpoint, pose, face shape, eyeglasses, colors (eyes, hair, etc.), hair style, lighting, and background. Depending on the task, generated output data may include images, audio, video, three-dimensional (3D) objects, text, etc.
Type: Application
Filed: January 28, 2021
Publication date: May 20, 2021
Inventors: Tero Tapani Karras, Samuli Matias Laine, Jaakko T. Lehtinen, Miika Samuli Aittala, Janne Johannes Hellsten, Timo Oskari Aila
-
Publication number: 20210150354
Abstract: A latent code defined in an input space is processed by a mapping neural network to produce an intermediate latent code defined in an intermediate latent space. The intermediate latent code may be used as an appearance vector that is processed by a synthesis neural network to generate an image. The appearance vector is a compressed encoding of data, such as video frames including a person's face, audio, and other data. Captured images may be converted into appearance vectors at a local device and transmitted to a remote device using much less bandwidth than transmitting the captured images. A synthesis neural network at the remote device reconstructs the images for display.
Type: Application
Filed: January 7, 2021
Publication date: May 20, 2021
Inventors: Tero Tapani Karras, Samuli Matias Laine, David Patrick Luebke, Jaakko T. Lehtinen, Miika Samuli Aittala, Timo Oskari Aila, Ming-Yu Liu, Arun Mohanray Mallya, Ting-Chun Wang
-
Publication number: 20210150357
Abstract: A style-based generative network architecture enables scale-specific control of synthesized output data, such as images. During training, the style-based generative neural network (generator neural network) includes a mapping network and a synthesis network. During prediction, the mapping network may be omitted, replicated, or evaluated several times. The synthesis network may be used to generate highly varied, high-quality output data with a wide variety of attributes. For example, when used to generate images of people's faces, the attributes that may vary include age, ethnicity, camera viewpoint, pose, face shape, eyeglasses, colors (eyes, hair, etc.), hair style, lighting, and background. Depending on the task, generated output data may include images, audio, video, three-dimensional (3D) objects, text, etc.
Type: Application
Filed: January 28, 2021
Publication date: May 20, 2021
Inventors: Tero Tapani Karras, Samuli Matias Laine, Jaakko T. Lehtinen, Miika Samuli Aittala, Janne Johannes Hellsten, Timo Oskari Aila
-
Publication number: 20210150187
Abstract: A latent code defined in an input space is processed by a mapping neural network to produce an intermediate latent code defined in an intermediate latent space. The intermediate latent code may be used as an appearance vector that is processed by a synthesis neural network to generate an image. The appearance vector is a compressed encoding of data, such as video frames including a person's face, audio, and other data. Captured images may be converted into appearance vectors at a local device and transmitted to a remote device using much less bandwidth than transmitting the captured images. A synthesis neural network at the remote device reconstructs the images for display.
Type: Application
Filed: January 7, 2021
Publication date: May 20, 2021
Inventors: Tero Tapani Karras, Samuli Matias Laine, David Patrick Luebke, Jaakko T. Lehtinen, Miika Samuli Aittala, Timo Oskari Aila, Ming-Yu Liu, Arun Mohanray Mallya, Ting-Chun Wang