Patents by Inventor Vijay Vasudevan
Vijay Vasudevan has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Publication number: 20240112088
Abstract: Systems and methods are provided for vector-quantized image modeling using vision transformers and improved codebook handling. In particular, the present disclosure provides a Vector-quantized Image Modeling (VIM) approach that involves pretraining a machine learning model (e.g., Transformer model) to predict rasterized image tokens autoregressively. The discrete image tokens can be encoded from a learned Vision-Transformer-based VQGAN (example implementations of which can be referred to as ViT-VQGAN). The present disclosure proposes multiple improvements over vanilla VQGAN from architecture to codebook learning, yielding better efficiency and reconstruction fidelity. The improved ViT-VQGAN further improves vector-quantized image modeling tasks, including unconditional image generation, conditioned image generation (e.g., class-conditioned image generation), and unsupervised representation learning.
Type: Application
Filed: November 27, 2023
Publication date: April 4, 2024
Inventors: Jiahui Yu, Xin Li, Han Zhang, Vijay Vasudevan, Alexander Yeong-Shiuh Ku, Jason Michael Baldridge, Yuanzhong Xu, Jing Yu Koh, Thang Minh Luong, Gunjan Baid, Zirui Wang, Yonghui Wu
-
Patent number: 11941875
Abstract: Methods, computer systems, and apparatus, including computer programs encoded on computer storage media, for processing a perspective view range image generated from sensor measurements of an environment. The perspective view range image includes a plurality of pixels arranged in a two-dimensional grid and including, for each pixel, (i) features of one or more sensor measurements at a location in the environment corresponding to the pixel and (ii) geometry information comprising range features characterizing a range of the location in the environment corresponding to the pixel relative to the one or more sensors. The system processes the perspective view range image using a first neural network to generate an output feature representation. The first neural network comprises a first perspective point-set aggregation layer comprising a geometry-dependent kernel.
Type: Grant
Filed: July 27, 2021
Date of Patent: March 26, 2024
Assignee: Waymo LLC
Inventors: Yuning Chai, Pei Sun, Jiquan Ngiam, Weiyue Wang, Vijay Vasudevan, Benjamin James Caine, Xiao Zhang, Dragomir Anguelov
-
Patent number: 11928574
Abstract: The present disclosure is directed to an automated neural architecture search approach for designing new neural network architectures such as, for example, resource-constrained mobile CNN models. In particular, the present disclosure provides systems and methods to perform neural architecture search using a novel factorized hierarchical search space that permits layer diversity throughout the network, thereby striking the right balance between flexibility and search space size. The resulting neural architectures are able to be run relatively faster and using relatively fewer computing resources (e.g., less processing power, less memory usage, less power consumption, etc.), all while remaining competitive with or even exceeding the performance (e.g., accuracy) of current state-of-the-art mobile-optimized models.
Type: Grant
Filed: January 13, 2023
Date of Patent: March 12, 2024
Assignee: GOOGLE LLC
Inventors: Mingxing Tan, Quoc Le, Bo Chen, Vijay Vasudevan, Ruoming Pang
-
Publication number: 20230351149
Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for processing multi-modal inputs using contrastive captioning neural networks.
Type: Application
Filed: April 28, 2023
Publication date: November 2, 2023
Inventors: Jiahui Yu, Zirui Wang, Vijay Vasudevan, Ho Man Yeung, Seyed Mojtaba Seyedhosseini Tarzjani, Yonghui Wu
-
Patent number: 11774596
Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for processing data generated by a sensing system that rotationally senses an environment. In one aspect, a method comprises partitioning a predetermined period of time into a plurality of sub-periods, wherein the predetermined period of time is a period of time for which data generated by the sensing system constitutes a complete rotational sensing of the environment; for each sub-period: receiving current data generated by the sensing system during the sub-period and characterizing a respective partial scene of the environment; processing the current data using an object detection neural network to generate a current object detection output that is specific to the respective partial scene of the environment.
Type: Grant
Filed: September 1, 2022
Date of Patent: October 3, 2023
Assignee: Google LLC
Inventors: Jonathon Shlens, Vijay Vasudevan, Jiquan Ngiam, Wei Han, Zhifeng Chen, Brandon Chauloon Yang, Benjamin James Caine, Zhengdong Zhang, Christoph Sprunk, Ouais Alsharif, Junhua Mao, Chen Wu
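Illustrative sketch, not part of the patent record: the abstract above describes splitting one complete rotational sweep into sub-periods and running detection on each partial scene as its data arrives. The Python below shows that control flow under stated assumptions: equal-length sub-periods, points tagged with a timestamp relative to the start of the sweep, and a `detector` callable standing in for the object detection neural network. The names `partition_period` and `detect_per_sub_period`, and the point layout, are hypothetical.

```python
from typing import Callable, List, Sequence, Tuple

# Hypothetical point layout: (seconds since sweep start, azimuth in degrees).
Point = Tuple[float, float]


def partition_period(period: float, num_sub_periods: int) -> List[Tuple[float, float]]:
    """Split [0, period) into contiguous sub-periods (assumed equal length)."""
    step = period / num_sub_periods
    return [(i * step, (i + 1) * step) for i in range(num_sub_periods)]


def detect_per_sub_period(
    points: Sequence[Point],
    period: float,
    num_sub_periods: int,
    detector: Callable[[List[Point]], object],
) -> List[object]:
    """Run the detector once per sub-period, on that sub-period's data only."""
    outputs = []
    for start, end in partition_period(period, num_sub_periods):
        # Data received during this sub-period characterizes a partial scene.
        current = [p for p in points if start <= p[0] < end]
        outputs.append(detector(current))
    return outputs


# Usage: a 1-second sweep cut into 4 sub-periods; `len` is a stand-in detector
# that just counts the points in each partial scene.
points = [(0.10, 36.0), (0.30, 108.0), (0.35, 126.0), (0.90, 324.0)]
outputs = detect_per_sub_period(points, period=1.0, num_sub_periods=4, detector=len)
```

In this sketch each detector call sees only its own slice of the sweep, so an output is available before the full rotation completes. How a real system would batch or stream the sensor data is not specified here.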
-
Publication number: 20230252327
Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for determining neural network architectures. One of the methods includes generating, using a controller neural network having controller parameters and in accordance with current values of the controller parameters, a batch of output sequences. The method includes, for each output sequence in the batch: generating an instance of a child convolutional neural network (CNN) that includes multiple instances of a first convolutional cell having an architecture defined by the output sequence; training the instance of the child CNN to perform an image processing task; and evaluating a performance of the trained instance of the child CNN on the task to determine a performance metric for the trained instance of the child CNN; and using the performance metrics for the trained instances of the child CNN to adjust current values of the controller parameters of the controller neural network.
Type: Application
Filed: April 20, 2023
Publication date: August 10, 2023
Inventors: Vijay Vasudevan, Barret Zoph, Jonathon Shlens, Quoc V. Le
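Illustrative sketch, not part of the patent record: the abstract above describes a loop in which a controller emits a batch of output sequences, each sequence defines a child network that is trained and scored, and the scores are used to adjust the controller. The toy Python below mirrors that loop only in shape. Everything concrete in it is an assumption: the controller is a plain table of logits rather than a neural network, `train_and_evaluate_child` is a made-up stand-in that trains nothing, the op vocabulary is invented, and the update rule is a REINFORCE-style policy gradient with a mean baseline, which the abstract does not name.

```python
import math
import random

OPS = ["conv3x3", "conv5x5", "maxpool", "identity"]  # hypothetical op vocabulary
SEQ_LEN = 4  # hypothetical number of decisions that define one cell


def softmax(logits):
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def sample_sequence(params, rng):
    """Sample one output sequence (one op index per position) from the controller."""
    return [rng.choices(range(len(OPS)), weights=softmax(row))[0] for row in params]


def train_and_evaluate_child(sequence):
    """Stand-in for building, training, and evaluating a child CNN.

    A real system would train a network whose cell is defined by `sequence`
    and return a measured metric. This toy returns a made-up score instead.
    """
    return sum(1.0 for op in sequence if OPS[op] == "conv3x3") / len(sequence)


def controller_step(params, rng, batch_size=8, lr=0.5):
    """One iteration: sample a batch, score each child, adjust controller params."""
    batch = [sample_sequence(params, rng) for _ in range(batch_size)]
    metrics = [train_and_evaluate_child(seq) for seq in batch]
    baseline = sum(metrics) / len(metrics)
    for seq, metric in zip(batch, metrics):
        advantage = metric - baseline
        for pos, choice in enumerate(seq):
            probs = softmax(params[pos])
            for k in range(len(OPS)):
                # Gradient of log-softmax with respect to logit k.
                grad = (1.0 if k == choice else 0.0) - probs[k]
                params[pos][k] += lr * advantage * grad / batch_size
    return baseline


rng = random.Random(0)
params = [[0.0] * len(OPS) for _ in range(SEQ_LEN)]
history = [controller_step(params, rng) for _ in range(200)]
```

`history` records the mean stand-in score of each sampled batch, which is a convenient thing to plot when experimenting with the learning rate or batch size.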
-
Publication number: 20230244904
Abstract: The present disclosure is directed to an automated neural architecture search approach for designing new neural network architectures such as, for example, resource-constrained mobile CNN models. In particular, the present disclosure provides systems and methods to perform neural architecture search using a novel factorized hierarchical search space that permits layer diversity throughout the network, thereby striking the right balance between flexibility and search space size. The resulting neural architectures are able to be run relatively faster and using relatively fewer computing resources (e.g., less processing power, less memory usage, less power consumption, etc.), all while remaining competitive with or even exceeding the performance (e.g., accuracy) of current state-of-the-art mobile-optimized models.
Type: Application
Filed: January 13, 2023
Publication date: August 3, 2023
Inventors: Mingxing Tan, Quoc Le, Bo Chen, Vijay Vasudevan, Ruoming Pang
-
Patent number: 11670038
Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for processing point cloud data using dynamic voxelization. When deployed within an on-board system of a vehicle, processing the point cloud data using dynamic voxelization can be used to make autonomous driving decisions for the vehicle with enhanced accuracy, for example by combining representations of point cloud data characterizing a scene from multiple views of the scene.
Type: Grant
Filed: November 1, 2021
Date of Patent: June 6, 2023
Assignee: Waymo LLC
Inventors: Yin Zhou, Pei Sun, Yu Zhang, Dragomir Anguelov, Jiyang Gao, Yu Ouyang, Zijian Guo, Jiquan Ngiam, Vijay Vasudevan
-
Patent number: 11651259
Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for determining neural network architectures. One of the methods includes generating, using a controller neural network having controller parameters and in accordance with current values of the controller parameters, a batch of output sequences. The method includes, for each output sequence in the batch: generating an instance of a child convolutional neural network (CNN) that includes multiple instances of a first convolutional cell having an architecture defined by the output sequence; training the instance of the child CNN to perform an image processing task; and evaluating a performance of the trained instance of the child CNN on the task to determine a performance metric for the trained instance of the child CNN; and using the performance metrics for the trained instances of the child CNN to adjust current values of the controller parameters of the controller neural network.
Type: Grant
Filed: November 5, 2019
Date of Patent: May 16, 2023
Assignee: Google LLC
Inventors: Vijay Vasudevan, Barret Zoph, Jonathon Shlens, Quoc V. Le
-
Publication number: 20220415042
Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for processing data generated by a sensing system that rotationally senses an environment. In one aspect, a method comprises partitioning a predetermined period of time into a plurality of sub-periods, wherein the predetermined period of time is a period of time for which data generated by the sensing system constitutes a complete rotational sensing of the environment; for each sub-period: receiving current data generated by the sensing system during the sub-period and characterizing a respective partial scene of the environment; processing the current data using an object detection neural network to generate a current object detection output that is specific to the respective partial scene of the environment.
Type: Application
Filed: September 1, 2022
Publication date: December 29, 2022
Inventors: Jonathon Shlens, Vijay Vasudevan, Jiquan Ngiam, Wei Han, Zhifeng Chen, Brandon Chauloon Yang, Benjamin James Caine, Zhengdong Zhang, Christoph Sprunk, Ouais Alsharif, Junhua Mao, Chen Wu
-
Patent number: 11531861
Abstract: The present disclosure is directed to an automated neural architecture search approach for designing new neural network architectures such as, for example, resource-constrained mobile CNN models. In particular, the present disclosure provides systems and methods to perform neural architecture search using a novel factorized hierarchical search space that permits layer diversity throughout the network, thereby striking the right balance between flexibility and search space size. The resulting neural architectures are able to be run relatively faster and using relatively fewer computing resources (e.g., less processing power, less memory usage, less power consumption, etc.), all while remaining competitive with or even exceeding the performance (e.g., accuracy) of current state-of-the-art mobile-optimized models.
Type: Grant
Filed: January 28, 2019
Date of Patent: December 20, 2022
Assignee: GOOGLE LLC
Inventors: Mingxing Tan, Quoc Le, Bo Chen, Vijay Vasudevan, Ruoming Pang
-
Publication number: 20220383076
Abstract: A method for performing one or more tasks, wherein each of the one or more tasks includes predicting behavior of one or more agents in an environment, the method comprising: obtaining a three-dimensional (3D) input tensor representing behaviors of the one or more agents in the environment across a plurality of time steps; generating an encoded representation of the 3D input tensor by processing the 3D input tensor using an encoder neural network, wherein 3D input tensor comprises a plurality of observed cells and a plurality of masked cells; and processing the encoded representation of the 3D input tensor using a decoder neural network to generate a 4D output tensor.
Type: Application
Filed: May 31, 2022
Publication date: December 1, 2022
Inventors: Jonathon Shlens, Vijay Vasudevan, Jiquan Ngiam, Benjamin James Caine, Zhengdong Zhang, Zhifeng Chen, Hao-Tien Chiang, David Joseph Weiss, Jeffrey Ling, Ashish Venugopal
-
Patent number: 11508147
Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for processing data generated by a sensing system that rotationally senses an environment. In one aspect, a method comprises partitioning a predetermined period of time into a plurality of sub-periods, wherein the predetermined period of time is a period of time for which data generated by the sensing system constitutes a complete rotational sensing of the environment; for each sub-period: receiving current data generated by the sensing system during the sub-period and characterizing a respective partial scene of the environment; processing the current data using an object detection neural network to generate a current object detection output that is specific to the respective partial scene of the environment.
Type: Grant
Filed: March 6, 2020
Date of Patent: November 22, 2022
Assignee: Google LLC
Inventors: Jonathon Shlens, Vijay Vasudevan, Jiquan Ngiam, Wei Han, Zhifeng Chen, Brandon Chauloon Yang, Benjamin James Caine, Zhengdong Zhang, Christoph Sprunk, Ouais Alsharif, Junhua Mao, Chen Wu
-
Patent number: 11450120
Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for processing point cloud data representing a sensor measurement of a scene captured by one or more sensors to generate an object detection output that identifies locations of one or more objects in the scene. When deployed within an on-board system of a vehicle, the object detection output that is generated can be used to make autonomous driving decisions for the vehicle with enhanced accuracy.
Type: Grant
Filed: July 8, 2020
Date of Patent: September 20, 2022
Assignee: Waymo LLC
Inventors: Jonathon Shlens, Patrick An Phu Nguyen, Benjamin James Caine, Jiquan Ngiam, Wei Han, Brandon Chauloon Yang, Yuning Chai, Pei Sun, Yin Zhou, Xi Yi, Ouais Alsharif, Zhifeng Chen, Vijay Vasudevan
-
Publication number: 20220180193
Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for training a neural network to perform 3D object detection. One of the methods includes training a student neural network to perform 3D object detection using pseudo-labels generated by a teacher neural network.
Type: Application
Filed: December 9, 2021
Publication date: June 9, 2022
Inventors: Benjamin James Caine, Rebecca Dawn Roelofs, Jonathon Shlens, Zhifeng Chen, Jiquan Ngiam, Vijay Vasudevan
-
Publication number: 20220101090
Abstract: The present disclosure is directed to an automated neural architecture search approach for designing new neural network architectures such as, for example, resource-constrained mobile CNN models. In particular, the present disclosure provides systems and methods to perform neural architecture search using a novel factorized hierarchical search space that permits layer diversity throughout the network, thereby striking the right balance between flexibility and search space size. The resulting neural architectures are able to be run relatively faster and using relatively fewer computing resources (e.g., less processing power, less memory usage, less power consumption, etc.), all while remaining competitive with or even exceeding the performance (e.g., accuracy) of current state-of-the-art mobile-optimized models.
Type: Application
Filed: October 6, 2021
Publication date: March 31, 2022
Inventors: Mingxing Tan, Quoc V. Le, Bo Chen, Vijay Vasudevan, Ruoming Pang
-
Publication number: 20220058858
Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for processing point cloud data using dynamic voxelization. When deployed within an on-board system of a vehicle, processing the point cloud data using dynamic voxelization can be used to make autonomous driving decisions for the vehicle with enhanced accuracy, for example by combining representations of point cloud data characterizing a scene from multiple views of the scene.
Type: Application
Filed: November 1, 2021
Publication date: February 24, 2022
Inventors: Yin Zhou, Pei Sun, Yu Zhang, Dragomir Anguelov, Jiyang Gao, Yu Ouyang, Zijian Guo, Jiquan Ngiam, Vijay Vasudevan
-
Publication number: 20220044068
Abstract: Methods, computer systems, and apparatus, including computer programs encoded on computer storage media, for processing a perspective view range image generated from sensor measurements of an environment. The perspective view range image includes a plurality of pixels arranged in a two-dimensional grid and including, for each pixel, (i) features of one or more sensor measurements at a location in the environment corresponding to the pixel and (ii) geometry information comprising range features characterizing a range of the location in the environment corresponding to the pixel relative to the one or more sensors. The system processes the perspective view range image using a first neural network to generate an output feature representation. The first neural network comprises a first perspective point-set aggregation layer comprising a geometry-dependent kernel.
Type: Application
Filed: July 27, 2021
Publication date: February 10, 2022
Inventors: Yuning Chai, Pei Sun, Jiquan Ngiam, Weiyue Wang, Vijay Vasudevan, Benjamin James Caine, Xiao Zhang, Dragomir Anguelov
-
Publication number: 20220027202
Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for receiving, by a computational graph system, a request to process a computational graph; obtaining data representing a subgraph of the computational graph, the computational graph comprising a plurality of nodes and directed edges, wherein each node represents a respective operation, wherein each directed edge connects a respective first node to a respective second node, the subgraph assigned to a first device by a placer in the computational graph system; determining that the first device comprises a hardware accelerator having a plurality of streams; in response to determining, generating instructions that when executed by the first device cause the first device to: assign the operation represented by each node in the subgraph to a respective stream; and perform the operations represented by the nodes in the subgraph in accordance with the assignment.
Type: Application
Filed: October 12, 2021
Publication date: January 27, 2022
Inventors: Paul Ronald Barham, Vijay Vasudevan
-
Publication number: 20220019896
Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for modifying a computational graph to include send and receive nodes. Communication between unique devices performing operations of different subgraphs of the computational graph can be handled efficiently by inserting send and receive nodes into each subgraph. When executed, the operations that these send and receive nodes represent may enable pairs of unique devices to conduct communication with each other in a self-sufficient manner. This shifts the burden of coordinating communication away from the backend, which affords the system that processes this computational graph representation the opportunity to perform one or more other processes while devices are executing subgraphs.
Type: Application
Filed: August 3, 2021
Publication date: January 20, 2022
Inventors: Vijay Vasudevan, Jeffrey Adgate Dean, Sanjay Ghemawat
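Illustrative sketch, not part of the patent record: the abstract above describes inserting send and receive nodes into per-device subgraphs so that devices can exchange data directly. The Python below shows one simple way to do such a rewrite on a toy graph. The edge-list representation, the `insert_send_recv` function, and the `send[...]` / `recv[...]` naming scheme are all hypothetical, chosen only to make the idea concrete.

```python
from collections import defaultdict


def insert_send_recv(edges, placement):
    """Split a placed graph into per-device subgraphs with send/recv nodes.

    edges: list of (src, dst) node-name pairs, each a directed edge.
    placement: dict mapping node name -> device name.
    Returns a dict mapping device name -> list of (src, dst) edges that are
    entirely local to that device.
    """
    subgraphs = defaultdict(list)
    for src, dst in edges:
        src_dev, dst_dev = placement[src], placement[dst]
        if src_dev == dst_dev:
            # Same device: keep the edge unchanged.
            subgraphs[src_dev].append((src, dst))
            continue
        # Cross-device edge: cut it, ending in a send node on the producer's
        # device and starting from a matching recv node on the consumer's device.
        channel = f"{src}:{src_dev}->{dst_dev}"
        subgraphs[src_dev].append((src, f"send[{channel}]"))
        subgraphs[dst_dev].append((f"recv[{channel}]", dst))
    return dict(subgraphs)


# Usage: nodes a and b run on one device, c on another.
edges = [("a", "b"), ("b", "c"), ("a", "c")]
placement = {"a": "cpu:0", "b": "cpu:0", "c": "gpu:0"}
subgraphs = insert_send_recv(edges, placement)
```

After the rewrite no edge crosses a device boundary, so each subgraph can be handed to its device as a unit, with the paired send and recv names identifying the two ends of each transfer.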