Patents by Inventor Vijay Vasudevan

Vijay Vasudevan has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

Vector-Quantized Image Modeling

Publication number: 20240112088

Abstract: Systems and methods are provided for vector-quantized image modeling using vision transformers and improved codebook handling. In particular, the present disclosure provides a Vector-quantized Image Modeling (VIM) approach that involves pretraining a machine learning model (e.g., Transformer model) to predict rasterized image tokens autoregressively. The discrete image tokens can be encoded from a learned Vision-Transformer-based VQGAN (example implementations of which can be referred to as ViT-VQGAN). The present disclosure proposes multiple improvements over vanilla VQGAN from architecture to codebook learning, yielding better efficiency and reconstruction fidelity. The improved ViT-VQGAN further improves vector-quantized image modeling tasks, including unconditional image generation, conditioned image generation (e.g., class-conditioned image generation), and unsupervised representation learning.

Type: Application

Filed: November 27, 2023

Publication date: April 4, 2024

Inventors: Jiahui Yu, Xin Li, Han Zhang, Vijay Vasudevan, Alexander Yeong-Shiuh Ku, Jason Michael Baldridge, Yuanzhong Xu, Jing Yu Koh, Thang Minh Luong, Gunjan Baid, Zirui Wang, Yonghui Wu
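
The abstract above describes encoding an image into discrete tokens and predicting them in raster order. As a rough illustration only, the sketch below maps feature vectors to their nearest codebook entry and flattens the resulting token grid row by row. The function names, the toy codebook, and the squared-L2 distance are assumptions for illustration and are not taken from the filing.

```python
from typing import List, Sequence


def quantize(vectors: Sequence[Sequence[float]],
             codebook: Sequence[Sequence[float]]) -> List[int]:
    """Return, for each vector, the index of the nearest codebook entry.

    Nearest is measured by squared L2 distance (an assumption; the
    abstract does not name a distance).
    """
    def sq_dist(a: Sequence[float], b: Sequence[float]) -> float:
        return sum((x - y) ** 2 for x, y in zip(a, b))

    return [min(range(len(codebook)), key=lambda k: sq_dist(v, codebook[k]))
            for v in vectors]


def rasterize(token_grid: List[List[int]]) -> List[int]:
    """Flatten a 2D grid of token ids row by row (raster order)."""
    return [token for row in token_grid for token in row]


# Hypothetical 3-entry codebook and four 2-D patch features.
codebook = [[0.0, 0.0], [1.0, 1.0], [0.0, 1.0]]
patches = [[0.1, -0.1], [0.9, 1.2], [0.2, 0.8], [0.6, 0.7]]

tokens = quantize(patches, codebook)   # one token id per patch
grid = [tokens[0:2], tokens[2:4]]      # arrange as a 2x2 token grid
sequence = rasterize(grid)             # sequence an autoregressive model would predict
```

An autoregressive model would then be trained to predict each element of `sequence` from the elements before it; that model is outside the scope of this sketch.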

Processing perspective view range images using neural networks

Patent number: 11941875

Abstract: Methods, computer systems, and apparatus, including computer programs encoded on computer storage media, for processing a perspective view range image generated from sensor measurements of an environment. The perspective view range image includes a plurality of pixels arranged in a two-dimensional grid and including, for each pixel, (i) features of one or more sensor measurements at a location in the environment corresponding to the pixel and (ii) geometry information comprising range features characterizing a range of the location in the environment corresponding to the pixel relative to the one or more sensors. The system processes the perspective view range image using a first neural network to generate an output feature representation. The first neural network comprises a first perspective point-set aggregation layer comprising a geometry-dependent kernel.

Type: Grant

Filed: July 27, 2021

Date of Patent: March 26, 2024

Assignee: Waymo LLC

Inventors: Yuning Chai, Pei Sun, Jiquan Ngiam, Weiyue Wang, Vijay Vasudevan, Benjamin James Caine, Xiao Zhang, Dragomir Anguelov

Neural architecture search with factorized hierarchical search space

Patent number: 11928574

Abstract: The present disclosure is directed to an automated neural architecture search approach for designing new neural network architectures such as, for example, resource-constrained mobile CNN models. In particular, the present disclosure provides systems and methods to perform neural architecture search using a novel factorized hierarchical search space that permits layer diversity throughout the network, thereby striking the right balance between flexibility and search space size. The resulting neural architectures are able to be run relatively faster and using relatively fewer computing resources (e.g., less processing power, less memory usage, less power consumption, etc.), all while remaining competitive with or even exceeding the performance (e.g., accuracy) of current state-of-the-art mobile-optimized models.

Type: Grant

Filed: January 13, 2023

Date of Patent: March 12, 2024

Assignee: Google LLC

Inventors: Mingxing Tan, Quoc Le, Bo Chen, Vijay Vasudevan, Ruoming Pang

CONTRASTIVE CAPTIONING NEURAL NETWORKS

Publication number: 20230351149

Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for processing multi-modal inputs using contrastive captioning neural networks.

Type: Application

Filed: April 28, 2023

Publication date: November 2, 2023

Inventors: Jiahui Yu, Zirui Wang, Vijay Vasudevan, Ho Man Yeung, Seyed Mojtaba Seyedhosseini Tarzjani, Yonghui Wu

Streaming object detection within sensor data

Patent number: 11774596

Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for processing data generated by a sensing system that rotationally senses an environment. In one aspect, a method comprises partitioning a predetermined period of time into a plurality of sub-periods, wherein the predetermined period of time is a period of time for which data generated by the sensing system constitutes a complete rotational sensing of the environment; for each sub-period: receiving current data generated by the sensing system during the sub-period and characterizing a respective partial scene of the environment; processing the current data using an object detection neural network to generate a current object detection output that is specific to the respective partial scene of the environment.

Type: Grant

Filed: September 1, 2022

Date of Patent: October 3, 2023

Assignee: Google LLC

Inventors: Jonathon Shlens, Vijay Vasudevan, Jiquan Ngiam, Wei Han, Zhifeng Chen, Brandon Chauloon Yang, Benjamin James Caine, Zhengdong Zhang, Christoph Sprunk, Ouais Alsharif, Junhua Mao, Chen Wu
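
The partitioning step in the abstract above can be illustrated with a small sketch: one full rotation period is split into equal sub-periods, and a detector is run on whatever data arrived in each sub-period. Everything named here is hypothetical: the `detect` callable stands in for the object detection neural network, and equal-length sub-periods and timestamps measured from the start of the rotation are assumptions.

```python
from typing import Callable, List, Tuple, TypeVar

Sample = Tuple[float, float]   # (seconds since rotation start, measurement)
T = TypeVar("T")


def partition_period(period: float, num_sub_periods: int) -> List[Tuple[float, float]]:
    """Split [0, period) into equal half-open sub-periods [start, end)."""
    step = period / num_sub_periods
    return [(i * step, (i + 1) * step) for i in range(num_sub_periods)]


def stream_detections(samples: List[Sample],
                      period: float,
                      num_sub_periods: int,
                      detect: Callable[[List[Sample]], T]) -> List[T]:
    """Run `detect` once per sub-period on the samples from that sub-period."""
    outputs = []
    for start, end in partition_period(period, num_sub_periods):
        current = [s for s in samples if start <= s[0] < end]
        outputs.append(detect(current))   # output specific to this partial scene
    return outputs


# Toy usage: the "detector" just counts samples in each partial scene.
samples = [(0.1, 4.2), (0.3, 3.9), (0.35, 7.0), (0.9, 1.5)]
counts = stream_detections(samples, period=1.0, num_sub_periods=4, detect=len)
```

The point of the structure is that an output is available after each sub-period rather than only after the full rotation completes.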

NEURAL ARCHITECTURE SEARCH FOR CONVOLUTIONAL NEURAL NETWORKS

Publication number: 20230252327

Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for determining neural network architectures. One of the methods includes generating, using a controller neural network having controller parameters and in accordance with current values of the controller parameters, a batch of output sequences. The method includes, for each output sequence in the batch: generating an instance of a child convolutional neural network (CNN) that includes multiple instances of a first convolutional cell having an architecture defined by the output sequence; training the instance of the child CNN to perform an image processing task; and evaluating a performance of the trained instance of the child CNN on the task to determine a performance metric for the trained instance of the child CNN; and using the performance metrics for the trained instances of the child CNN to adjust current values of the controller parameters of the controller neural network.

Type: Application

Filed: April 20, 2023

Publication date: August 10, 2023

Inventors: Vijay Vasudevan, Barret Zoph, Jonathon Shlens, Quoc V. Le
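
The sample, evaluate, and update loop in the abstract above can be sketched with a toy stand-in. In this sketch the "controller" is a table of probabilities rather than a neural network, `evaluate` is a hypothetical function that replaces training and scoring a child CNN, and the baseline-subtracted update is a simple heuristic chosen for illustration. None of these details come from the filing.

```python
import random
from typing import Callable, List


def search(num_choices: int,
           seq_len: int,
           evaluate: Callable[[List[int]], float],
           steps: int = 50,
           batch_size: int = 8,
           lr: float = 0.1,
           seed: int = 0) -> List[List[float]]:
    """Toy architecture search; returns per-position choice probabilities."""
    rng = random.Random(seed)
    # Controller parameters: one probability row per output-sequence position.
    probs = [[1.0 / num_choices] * num_choices for _ in range(seq_len)]
    for _ in range(steps):
        # 1. Generate a batch of output sequences from the current controller.
        batch = [[rng.choices(range(num_choices), weights=probs[pos])[0]
                  for pos in range(seq_len)]
                 for _ in range(batch_size)]
        # 2. Score each sequence (stand-in for training and evaluating a child CNN).
        metrics = [evaluate(seq) for seq in batch]
        baseline = sum(metrics) / len(metrics)
        # 3. Shift probability toward choices that scored above the batch average.
        for seq, metric in zip(batch, metrics):
            advantage = metric - baseline
            for pos, choice in enumerate(seq):
                probs[pos][choice] = max(1e-6, probs[pos][choice] + lr * advantage)
        # Renormalize each row so it remains a probability distribution.
        probs = [[p / sum(row) for p in row] for row in probs]
    return probs


def toy_metric(seq: List[int]) -> float:
    # Hypothetical performance metric: fraction of positions that picked choice 1.
    return sum(1 for choice in seq if choice == 1) / len(seq)


controller = search(num_choices=3, seq_len=4, evaluate=toy_metric)
best = [row.index(max(row)) for row in controller]   # most likely choice per position
```

In a real system each call to `evaluate` would be the expensive part, since it corresponds to training a child network; the loop structure is what the sketch is meant to show.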

Neural Architecture Search with Factorized Hierarchical Search Space

Publication number: 20230244904

Abstract: The present disclosure is directed to an automated neural architecture search approach for designing new neural network architectures such as, for example, resource-constrained mobile CNN models. In particular, the present disclosure provides systems and methods to perform neural architecture search using a novel factorized hierarchical search space that permits layer diversity throughout the network, thereby striking the right balance between flexibility and search space size. The resulting neural architectures are able to be run relatively faster and using relatively fewer computing resources (e.g., less processing power, less memory usage, less power consumption, etc.), all while remaining competitive with or even exceeding the performance (e.g., accuracy) of current state-of-the-art mobile-optimized models.

Type: Application

Filed: January 13, 2023

Publication date: August 3, 2023

Inventors: Mingxing Tan, Quoc Le, Bo Chen, Vijay Vasudevan, Ruoming Pang

Processing point clouds using dynamic voxelization

Patent number: 11670038

Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for processing point cloud data using dynamic voxelization. When deployed within an on-board system of a vehicle, processing the point cloud data using dynamic voxelization can be used to make autonomous driving decisions for the vehicle with enhanced accuracy, for example by combining representations of point cloud data characterizing a scene from multiple views of the scene.

Type: Grant

Filed: November 1, 2021

Date of Patent: June 6, 2023

Assignee: Waymo LLC

Inventors: Yin Zhou, Pei Sun, Yu Zhang, Dragomir Anguelov, Jiyang Gao, Yu Ouyang, Zijian Guo, Jiquan Ngiam, Vijay Vasudevan

Neural architecture search for convolutional neural networks

Patent number: 11651259

Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for determining neural network architectures. One of the methods includes generating, using a controller neural network having controller parameters and in accordance with current values of the controller parameters, a batch of output sequences. The method includes, for each output sequence in the batch: generating an instance of a child convolutional neural network (CNN) that includes multiple instances of a first convolutional cell having an architecture defined by the output sequence; training the instance of the child CNN to perform an image processing task; and evaluating a performance of the trained instance of the child CNN on the task to determine a performance metric for the trained instance of the child CNN; and using the performance metrics for the trained instances of the child CNN to adjust current values of the controller parameters of the controller neural network.

Type: Grant

Filed: November 5, 2019

Date of Patent: May 16, 2023

Assignee: Google LLC

Inventors: Vijay Vasudevan, Barret Zoph, Jonathon Shlens, Quoc V. Le

STREAMING OBJECT DETECTION WITHIN SENSOR DATA

Publication number: 20220415042

Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for processing data generated by a sensing system that rotationally senses an environment. In one aspect, a method comprises partitioning a predetermined period of time into a plurality of sub-periods, wherein the predetermined period of time is a period of time for which data generated by the sensing system constitutes a complete rotational sensing of the environment; for each sub-period: receiving current data generated by the sensing system during the sub-period and characterizing a respective partial scene of the environment; processing the current data using an object detection neural network to generate a current object detection output that is specific to the respective partial scene of the environment.

Type: Application

Filed: September 1, 2022

Publication date: December 29, 2022

Inventors: Jonathon Shlens, Vijay Vasudevan, Jiquan Ngiam, Wei Han, Zhifeng Chen, Brandon Chauloon Yang, Benjamin James Caine, Zhengdong Zhang, Christoph Sprunk, Ouais Alsharif, Junhua Mao, Chen Wu

Neural architecture search with factorized hierarchical search space

Patent number: 11531861

Abstract: The present disclosure is directed to an automated neural architecture search approach for designing new neural network architectures such as, for example, resource-constrained mobile CNN models. In particular, the present disclosure provides systems and methods to perform neural architecture search using a novel factorized hierarchical search space that permits layer diversity throughout the network, thereby striking the right balance between flexibility and search space size. The resulting neural architectures are able to be run relatively faster and using relatively fewer computing resources (e.g., less processing power, less memory usage, less power consumption, etc.), all while remaining competitive with or even exceeding the performance (e.g., accuracy) of current state-of-the-art mobile-optimized models.

Type: Grant

Filed: January 28, 2019

Date of Patent: December 20, 2022

Assignee: Google LLC

Inventors: Mingxing Tan, Quoc Le, Bo Chen, Vijay Vasudevan, Ruoming Pang

MACHINE LEARNING MODELS FOR BEHAVIOR UNDERSTANDING

Publication number: 20220383076

Abstract: A method for performing one or more tasks, wherein each of the one or more tasks includes predicting behavior of one or more agents in an environment, the method comprising: obtaining a three-dimensional (3D) input tensor representing behaviors of the one or more agents in the environment across a plurality of time steps; generating an encoded representation of the 3D input tensor by processing the 3D input tensor using an encoder neural network, wherein the 3D input tensor comprises a plurality of observed cells and a plurality of masked cells; and processing the encoded representation of the 3D input tensor using a decoder neural network to generate a 4D output tensor.

Type: Application

Filed: May 31, 2022

Publication date: December 1, 2022

Inventors: Jonathon Shlens, Vijay Vasudevan, Jiquan Ngiam, Benjamin James Caine, Zhengdong Zhang, Zhifeng Chen, Hao-Tien Chiang, David Joseph Weiss, Jeffrey Ling, Ashish Venugopal

Streaming object detection within sensor data

Patent number: 11508147

Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for processing data generated by a sensing system that rotationally senses an environment. In one aspect, a method comprises partitioning a predetermined period of time into a plurality of sub-periods, wherein the predetermined period of time is a period of time for which data generated by the sensing system constitutes a complete rotational sensing of the environment; for each sub-period: receiving current data generated by the sensing system during the sub-period and characterizing a respective partial scene of the environment; processing the current data using an object detection neural network to generate a current object detection output that is specific to the respective partial scene of the environment.

Type: Grant

Filed: March 6, 2020

Date of Patent: November 22, 2022

Assignee: Google LLC

Inventors: Jonathon Shlens, Vijay Vasudevan, Jiquan Ngiam, Wei Han, Zhifeng Chen, Brandon Chauloon Yang, Benjamin James Caine, Zhengdong Zhang, Christoph Sprunk, Ouais Alsharif, Junhua Mao, Chen Wu

Object detection in point clouds

Patent number: 11450120

Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for processing point cloud data representing a sensor measurement of a scene captured by one or more sensors to generate an object detection output that identifies locations of one or more objects in the scene. When deployed within an on-board system of a vehicle, the object detection output that is generated can be used to make autonomous driving decisions for the vehicle with enhanced accuracy.

Type: Grant

Filed: July 8, 2020

Date of Patent: September 20, 2022

Assignee: Waymo LLC

Inventors: Jonathon Shlens, Patrick An Phu Nguyen, Benjamin James Caine, Jiquan Ngiam, Wei Han, Brandon Chauloon Yang, Yuning Chai, Pei Sun, Yin Zhou, Xi Yi, Ouais Alsharif, Zhifeng Chen, Vijay Vasudevan

THREE-DIMENSIONAL OBJECT DETECTION USING PSEUDO-LABELS

Publication number: 20220180193

Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for training a neural network to perform 3D object detection. One of the methods includes training a student neural network to perform 3D object detection using pseudo-labels generated by a teacher neural network.

Type: Application

Filed: December 9, 2021

Publication date: June 9, 2022

Inventors: Benjamin James Caine, Rebecca Dawn Roelofs, Jonathon Shlens, Zhifeng Chen, Jiquan Ngiam, Vijay Vasudevan

Neural Architecture Search with Factorized Hierarchical Search Space

Publication number: 20220101090

Abstract: The present disclosure is directed to an automated neural architecture search approach for designing new neural network architectures such as, for example, resource-constrained mobile CNN models. In particular, the present disclosure provides systems and methods to perform neural architecture search using a novel factorized hierarchical search space that permits layer diversity throughout the network, thereby striking the right balance between flexibility and search space size. The resulting neural architectures are able to be run relatively faster and using relatively fewer computing resources (e.g., less processing power, less memory usage, less power consumption, etc.), all while remaining competitive with or even exceeding the performance (e.g., accuracy) of current state-of-the-art mobile-optimized models.

Type: Application

Filed: October 6, 2021

Publication date: March 31, 2022

Inventors: Mingxing Tan, Quoc V. Le, Bo Chen, Vijay Vasudevan, Ruoming Pang

PROCESSING POINT CLOUDS USING DYNAMIC VOXELIZATION

Publication number: 20220058858

Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for processing point cloud data using dynamic voxelization. When deployed within an on-board system of a vehicle, processing the point cloud data using dynamic voxelization can be used to make autonomous driving decisions for the vehicle with enhanced accuracy, for example by combining representations of point cloud data characterizing a scene from multiple views of the scene.

Type: Application

Filed: November 1, 2021

Publication date: February 24, 2022

Inventors: Yin Zhou, Pei Sun, Yu Zhang, Dragomir Anguelov, Jiyang Gao, Yu Ouyang, Zijian Guo, Jiquan Ngiam, Vijay Vasudevan

PROCESSING PERSPECTIVE VIEW RANGE IMAGES USING NEURAL NETWORKS

Publication number: 20220044068

Abstract: Methods, computer systems, and apparatus, including computer programs encoded on computer storage media, for processing a perspective view range image generated from sensor measurements of an environment. The perspective view range image includes a plurality of pixels arranged in a two-dimensional grid and including, for each pixel, (i) features of one or more sensor measurements at a location in the environment corresponding to the pixel and (ii) geometry information comprising range features characterizing a range of the location in the environment corresponding to the pixel relative to the one or more sensors. The system processes the perspective view range image using a first neural network to generate an output feature representation. The first neural network comprises a first perspective point-set aggregation layer comprising a geometry-dependent kernel.

Type: Application

Filed: July 27, 2021

Publication date: February 10, 2022

Inventors: Yuning Chai, Pei Sun, Jiquan Ngiam, Weiyue Wang, Vijay Vasudevan, Benjamin James Caine, Xiao Zhang, Dragomir Anguelov

STREAM-BASED ACCELERATOR PROCESSING OF COMPUTATIONAL GRAPHS

Publication number: 20220027202

Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for receiving, by a computational graph system, a request to process a computational graph; obtaining data representing a subgraph of the computational graph, the computational graph comprising a plurality of nodes and directed edges, wherein each node represents a respective operation, wherein each directed edge connects a respective first node to a respective second node, the subgraph assigned to a first device by a placer in the computational graph system; determining that the first device comprises a hardware accelerator having a plurality of streams; in response to determining, generating instructions that when executed by the first device cause the first device to: assign the operation represented by each node in the subgraph to a respective stream; and perform the operations represented by the nodes in the subgraph in accordance with the assignment.

Type: Application

Filed: October 12, 2021

Publication date: January 27, 2022

Inventors: Paul Ronald Barham, Vijay Vasudevan

MODIFYING COMPUTATIONAL GRAPHS

Publication number: 20220019896

Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for modifying a computational graph to include send and receive nodes. Communication between unique devices performing operations of different subgraphs of the computational graph can be handled efficiently by inserting send and receive nodes into each subgraph. When executed, the operations that these send and receive nodes represent may enable pairs of unique devices to conduct communication with each other in a self-sufficient manner. This shifts the burden of coordinating communication away from the backend, which affords the system that processes this computational graph representation the opportunity to perform one or more other processes while devices are executing subgraphs.

Type: Application

Filed: August 3, 2021

Publication date: January 20, 2022

Inventors: Vijay Vasudevan, Jeffrey Adgate Dean, Sanjay Ghemawat
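
The graph rewrite described in the abstract above can be illustrated on a toy graph: wherever an edge connects operations placed on different devices, a send node is added on the producer's device and a receive node on the consumer's device. The edge-list representation, the node naming scheme, and the function name below are hypothetical and serve only to show the idea.

```python
from typing import Dict, List, Tuple

Edge = Tuple[str, str]   # (producer node, consumer node)


def insert_send_recv(edges: List[Edge],
                     placement: Dict[str, str]) -> Tuple[List[Edge], Dict[str, str]]:
    """Replace each cross-device edge with producer -> send -> recv -> consumer."""
    new_edges: List[Edge] = []
    new_placement = dict(placement)
    for src, dst in edges:
        if placement[src] == placement[dst]:
            new_edges.append((src, dst))      # same device: keep the edge as is
            continue
        send = f"send:{src}->{dst}"
        recv = f"recv:{src}->{dst}"
        new_placement[send] = placement[src]  # send runs on the producer's device
        new_placement[recv] = placement[dst]  # recv runs on the consumer's device
        new_edges.extend([(src, send), (send, recv), (recv, dst)])
    return new_edges, new_placement


# Toy graph: a -> b stays on the CPU, b -> c crosses from CPU to GPU.
edges = [("a", "b"), ("b", "c")]
placement = {"a": "cpu", "b": "cpu", "c": "gpu"}
rewritten_edges, rewritten_placement = insert_send_recv(edges, placement)
```

After the rewrite, the only edge that crosses devices is the one between a send node and its paired receive node, so each pair of devices can exchange data without a central coordinator stepping in.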
