Patents by Inventor Vijay Vasudevan

Vijay Vasudevan has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Publication number: 20240112088
    Abstract: Systems and methods are provided for vector-quantized image modeling using vision transformers and improved codebook handling. In particular, the present disclosure provides a Vector-quantized Image Modeling (VIM) approach that involves pretraining a machine learning model (e.g., Transformer model) to predict rasterized image tokens autoregressively. The discrete image tokens can be encoded from a learned Vision-Transformer-based VQGAN (example implementations of which can be referred to as ViT-VQGAN). The present disclosure proposes multiple improvements over vanilla VQGAN from architecture to codebook learning, yielding better efficiency and reconstruction fidelity. The improved ViT-VQGAN further improves vector-quantized image modeling tasks, including unconditional image generation, conditioned image generation (e.g., class-conditioned image generation), and unsupervised representation learning.
    Type: Application
    Filed: November 27, 2023
    Publication date: April 4, 2024
    Inventors: Jiahui Yu, Xin Li, Han Zhang, Vijay Vasudevan, Alexander Yeong-Shiuh Ku, Jason Michael Baldridge, Yuanzhong Xu, Jing Yu Koh, Thang Minh Luong, Gunjan Baid, Zirui Wang, Yonghui Wu
  • Patent number: 11941875
    Abstract: Methods, computer systems, and apparatus, including computer programs encoded on computer storage media, for processing a perspective view range image generated from sensor measurements of an environment. The perspective view range image includes a plurality of pixels arranged in a two-dimensional grid and including, for each pixel, (i) features of one or more sensor measurements at a location in the environment corresponding to the pixel and (ii) geometry information comprising range features characterizing a range of the location in the environment corresponding to the pixel relative to the one or more sensors. The system processes the perspective view range image using a first neural network to generate an output feature representation. The first neural network comprises a first perspective point-set aggregation layer comprising a geometry-dependent kernel.
    Type: Grant
    Filed: July 27, 2021
    Date of Patent: March 26, 2024
    Assignee: Waymo LLC
    Inventors: Yuning Chai, Pei Sun, Jiquan Ngiam, Weiyue Wang, Vijay Vasudevan, Benjamin James Caine, Xiao Zhang, Dragomir Anguelov
  • Patent number: 11928574
    Abstract: The present disclosure is directed to an automated neural architecture search approach for designing new neural network architectures such as, for example, resource-constrained mobile CNN models. In particular, the present disclosure provides systems and methods to perform neural architecture search using a novel factorized hierarchical search space that permits layer diversity throughout the network, thereby striking the right balance between flexibility and search space size. The resulting neural architectures are able to be run relatively faster and using relatively fewer computing resources (e.g., less processing power, less memory usage, less power consumption, etc.), all while remaining competitive with or even exceeding the performance (e.g., accuracy) of current state-of-the-art mobile-optimized models.
    Type: Grant
    Filed: January 13, 2023
    Date of Patent: March 12, 2024
    Assignee: Google LLC
    Inventors: Mingxing Tan, Quoc Le, Bo Chen, Vijay Vasudevan, Ruoming Pang
  • Publication number: 20230351149
    Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for processing multi-modal inputs using contrastive captioning neural networks.
    Type: Application
    Filed: April 28, 2023
    Publication date: November 2, 2023
    Inventors: Jiahui Yu, Zirui Wang, Vijay Vasudevan, Ho Man Yeung, Seyed Mojtaba Seyedhosseini Tarzjani, Yonghui Wu
  • Patent number: 11774596
    Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for processing data generated by a sensing system that rotationally senses an environment. In one aspect, a method comprises partitioning a predetermined period of time into a plurality of sub-periods, wherein the predetermined period of time is a period of time for which data generated by the sensing system constitutes a complete rotational sensing of the environment; for each sub-period: receiving current data generated by the sensing system during the sub-period and characterizing a respective partial scene of the environment; processing the current data using an object detection neural network to generate a current object detection output that is specific to the respective partial scene of the environment.
    Type: Grant
    Filed: September 1, 2022
    Date of Patent: October 3, 2023
    Assignee: Google LLC
    Inventors: Jonathon Shlens, Vijay Vasudevan, Jiquan Ngiam, Wei Han, Zhifeng Chen, Brandon Chauloon Yang, Benjamin James Caine, Zhengdong Zhang, Christoph Sprunk, Ouais Alsharif, Junhua Mao, Chen Wu
  • Publication number: 20230252327
    Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for determining neural network architectures. One of the methods includes generating, using a controller neural network having controller parameters and in accordance with current values of the controller parameters, a batch of output sequences. The method includes, for each output sequence in the batch: generating an instance of a child convolutional neural network (CNN) that includes multiple instances of a first convolutional cell having an architecture defined by the output sequence; training the instance of the child CNN to perform an image processing task; and evaluating a performance of the trained instance of the child CNN on the task to determine a performance metric for the trained instance of the child CNN; and using the performance metrics for the trained instances of the child CNN to adjust current values of the controller parameters of the controller neural network.
    Type: Application
    Filed: April 20, 2023
    Publication date: August 10, 2023
    Inventors: Vijay Vasudevan, Barret Zoph, Jonathon Shlens, Quoc V. Le
  • Publication number: 20230244904
    Abstract: The present disclosure is directed to an automated neural architecture search approach for designing new neural network architectures such as, for example, resource-constrained mobile CNN models. In particular, the present disclosure provides systems and methods to perform neural architecture search using a novel factorized hierarchical search space that permits layer diversity throughout the network, thereby striking the right balance between flexibility and search space size. The resulting neural architectures are able to be run relatively faster and using relatively fewer computing resources (e.g., less processing power, less memory usage, less power consumption, etc.), all while remaining competitive with or even exceeding the performance (e.g., accuracy) of current state-of-the-art mobile-optimized models.
    Type: Application
    Filed: January 13, 2023
    Publication date: August 3, 2023
    Inventors: Mingxing Tan, Quoc Le, Bo Chen, Vijay Vasudevan, Ruoming Pang
  • Patent number: 11670038
    Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for processing point cloud data using dynamic voxelization. When deployed within an on-board system of a vehicle, processing the point cloud data using dynamic voxelization can be used to make autonomous driving decisions for the vehicle with enhanced accuracy, for example by combining representations of point cloud data characterizing a scene from multiple views of the scene.
    Type: Grant
    Filed: November 1, 2021
    Date of Patent: June 6, 2023
    Assignee: Waymo LLC
    Inventors: Yin Zhou, Pei Sun, Yu Zhang, Dragomir Anguelov, Jiyang Gao, Yu Ouyang, Zijian Guo, Jiquan Ngiam, Vijay Vasudevan
  • Patent number: 11651259
    Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for determining neural network architectures. One of the methods includes generating, using a controller neural network having controller parameters and in accordance with current values of the controller parameters, a batch of output sequences. The method includes, for each output sequence in the batch: generating an instance of a child convolutional neural network (CNN) that includes multiple instances of a first convolutional cell having an architecture defined by the output sequence; training the instance of the child CNN to perform an image processing task; and evaluating a performance of the trained instance of the child CNN on the task to determine a performance metric for the trained instance of the child CNN; and using the performance metrics for the trained instances of the child CNN to adjust current values of the controller parameters of the controller neural network.
    Type: Grant
    Filed: November 5, 2019
    Date of Patent: May 16, 2023
    Assignee: Google LLC
    Inventors: Vijay Vasudevan, Barret Zoph, Jonathon Shlens, Quoc V. Le
  • Publication number: 20220415042
    Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for processing data generated by a sensing system that rotationally senses an environment. In one aspect, a method comprises partitioning a predetermined period of time into a plurality of sub-periods, wherein the predetermined period of time is a period of time for which data generated by the sensing system constitutes a complete rotational sensing of the environment; for each sub-period: receiving current data generated by the sensing system during the sub-period and characterizing a respective partial scene of the environment; processing the current data using an object detection neural network to generate a current object detection output that is specific to the respective partial scene of the environment.
    Type: Application
    Filed: September 1, 2022
    Publication date: December 29, 2022
    Inventors: Jonathon Shlens, Vijay Vasudevan, Jiquan Ngiam, Wei Han, Zhifeng Chen, Brandon Chauloon Yang, Benjamin James Caine, Zhengdong Zhang, Christoph Sprunk, Ouais Alsharif, Junhua Mao, Chen Wu
  • Patent number: 11531861
    Abstract: The present disclosure is directed to an automated neural architecture search approach for designing new neural network architectures such as, for example, resource-constrained mobile CNN models. In particular, the present disclosure provides systems and methods to perform neural architecture search using a novel factorized hierarchical search space that permits layer diversity throughout the network, thereby striking the right balance between flexibility and search space size. The resulting neural architectures are able to be run relatively faster and using relatively fewer computing resources (e.g., less processing power, less memory usage, less power consumption, etc.), all while remaining competitive with or even exceeding the performance (e.g., accuracy) of current state-of-the-art mobile-optimized models.
    Type: Grant
    Filed: January 28, 2019
    Date of Patent: December 20, 2022
    Assignee: Google LLC
    Inventors: Mingxing Tan, Quoc Le, Bo Chen, Vijay Vasudevan, Ruoming Pang
  • Publication number: 20220383076
    Abstract: A method for performing one or more tasks, wherein each of the one or more tasks includes predicting behavior of one or more agents in an environment, the method comprising: obtaining a three-dimensional (3D) input tensor representing behaviors of the one or more agents in the environment across a plurality of time steps; generating an encoded representation of the 3D input tensor by processing the 3D input tensor using an encoder neural network, wherein the 3D input tensor comprises a plurality of observed cells and a plurality of masked cells; and processing the encoded representation of the 3D input tensor using a decoder neural network to generate a 4D output tensor.
    Type: Application
    Filed: May 31, 2022
    Publication date: December 1, 2022
    Inventors: Jonathon Shlens, Vijay Vasudevan, Jiquan Ngiam, Benjamin James Caine, Zhengdong Zhang, Zhifeng Chen, Hao-Tien Chiang, David Joseph Weiss, Jeffrey Ling, Ashish Venugopal
  • Patent number: 11508147
    Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for processing data generated by a sensing system that rotationally senses an environment. In one aspect, a method comprises partitioning a predetermined period of time into a plurality of sub-periods, wherein the predetermined period of time is a period of time for which data generated by the sensing system constitutes a complete rotational sensing of the environment; for each sub-period: receiving current data generated by the sensing system during the sub-period and characterizing a respective partial scene of the environment; processing the current data using an object detection neural network to generate a current object detection output that is specific to the respective partial scene of the environment.
    Type: Grant
    Filed: March 6, 2020
    Date of Patent: November 22, 2022
    Assignee: Google LLC
    Inventors: Jonathon Shlens, Vijay Vasudevan, Jiquan Ngiam, Wei Han, Zhifeng Chen, Brandon Chauloon Yang, Benjamin James Caine, Zhengdong Zhang, Christoph Sprunk, Ouais Alsharif, Junhua Mao, Chen Wu
  • Patent number: 11450120
    Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for processing point cloud data representing a sensor measurement of a scene captured by one or more sensors to generate an object detection output that identifies locations of one or more objects in the scene. When deployed within an on-board system of a vehicle, the object detection output that is generated can be used to make autonomous driving decisions for the vehicle with enhanced accuracy.
    Type: Grant
    Filed: July 8, 2020
    Date of Patent: September 20, 2022
    Assignee: Waymo LLC
    Inventors: Jonathon Shlens, Patrick An Phu Nguyen, Benjamin James Caine, Jiquan Ngiam, Wei Han, Brandon Chauloon Yang, Yuning Chai, Pei Sun, Yin Zhou, Xi Yi, Ouais Alsharif, Zhifeng Chen, Vijay Vasudevan
  • Publication number: 20220180193
    Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for training a neural network to perform 3D object detection. One of the methods includes training a student neural network to perform 3D object detection using pseudo-labels generated by a teacher neural network.
    Type: Application
    Filed: December 9, 2021
    Publication date: June 9, 2022
    Inventors: Benjamin James Caine, Rebecca Dawn Roelofs, Jonathon Shlens, Zhifeng Chen, Jiquan Ngiam, Vijay Vasudevan
  • Publication number: 20220101090
    Abstract: The present disclosure is directed to an automated neural architecture search approach for designing new neural network architectures such as, for example, resource-constrained mobile CNN models. In particular, the present disclosure provides systems and methods to perform neural architecture search using a novel factorized hierarchical search space that permits layer diversity throughout the network, thereby striking the right balance between flexibility and search space size. The resulting neural architectures are able to be run relatively faster and using relatively fewer computing resources (e.g., less processing power, less memory usage, less power consumption, etc.), all while remaining competitive with or even exceeding the performance (e.g., accuracy) of current state-of-the-art mobile-optimized models.
    Type: Application
    Filed: October 6, 2021
    Publication date: March 31, 2022
    Inventors: Mingxing Tan, Quoc V. Le, Bo Chen, Vijay Vasudevan, Ruoming Pang
  • Publication number: 20220058858
    Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for processing point cloud data using dynamic voxelization. When deployed within an on-board system of a vehicle, processing the point cloud data using dynamic voxelization can be used to make autonomous driving decisions for the vehicle with enhanced accuracy, for example by combining representations of point cloud data characterizing a scene from multiple views of the scene.
    Type: Application
    Filed: November 1, 2021
    Publication date: February 24, 2022
    Inventors: Yin Zhou, Pei Sun, Yu Zhang, Dragomir Anguelov, Jiyang Gao, Yu Ouyang, Zijian Guo, Jiquan Ngiam, Vijay Vasudevan
  • Publication number: 20220044068
    Abstract: Methods, computer systems, and apparatus, including computer programs encoded on computer storage media, for processing a perspective view range image generated from sensor measurements of an environment. The perspective view range image includes a plurality of pixels arranged in a two-dimensional grid and including, for each pixel, (i) features of one or more sensor measurements at a location in the environment corresponding to the pixel and (ii) geometry information comprising range features characterizing a range of the location in the environment corresponding to the pixel relative to the one or more sensors. The system processes the perspective view range image using a first neural network to generate an output feature representation. The first neural network comprises a first perspective point-set aggregation layer comprising a geometry-dependent kernel.
    Type: Application
    Filed: July 27, 2021
    Publication date: February 10, 2022
    Inventors: Yuning Chai, Pei Sun, Jiquan Ngiam, Weiyue Wang, Vijay Vasudevan, Benjamin James Caine, Xiao Zhang, Dragomir Anguelov
  • Publication number: 20220027202
    Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for receiving, by a computational graph system, a request to process a computational graph; obtaining data representing a subgraph of the computational graph, the computational graph comprising a plurality of nodes and directed edges, wherein each node represents a respective operation, wherein each directed edge connects a respective first node to a respective second node, the subgraph assigned to a first device by a placer in the computational graph system; determining that the first device comprises a hardware accelerator having a plurality of streams; in response to determining, generating instructions that when executed by the first device cause the first device to: assign the operation represented by each node in the subgraph to a respective stream; and perform the operations represented by the nodes in the subgraph in accordance with the assignment.
    Type: Application
    Filed: October 12, 2021
    Publication date: January 27, 2022
    Inventors: Paul Ronald Barham, Vijay Vasudevan
  • Publication number: 20220019896
    Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for modifying a computational graph to include send and receive nodes. Communication between unique devices performing operations of different subgraphs of the computational graph can be handled efficiently by inserting send and receive nodes into each subgraph. When executed, the operations that these send and receive nodes represent may enable pairs of unique devices to conduct communication with each other in a self-sufficient manner. This shifts the burden of coordinating communication away from the backend, which affords the system that processes this computational graph representation the opportunity to perform one or more other processes while devices are executing subgraphs.
    Type: Application
    Filed: August 3, 2021
    Publication date: January 20, 2022
    Inventors: Vijay Vasudevan, Jeffrey Adgate Dean, Sanjay Ghemawat
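
The abstract of publication 20220019896 above describes inserting send and receive nodes into each subgraph so that devices can exchange data without a central coordinator. The following is a minimal illustrative sketch of that general idea, not the patented implementation: the function name `insert_send_recv`, the edge-list graph representation, the `send:`/`recv:` naming scheme, and the device names are all assumptions made for this example.

```python
from collections import defaultdict


def insert_send_recv(edges, placement):
    """Partition a graph by device, replacing cross-device edges with send/recv pairs.

    edges: list of (src, dst) node-name pairs (hypothetical representation).
    placement: dict mapping each node name to a device name.
    Returns a dict mapping device name -> list of edges in that device's subgraph.
    """
    subgraphs = defaultdict(list)
    for src, dst in edges:
        src_dev, dst_dev = placement[src], placement[dst]
        if src_dev == dst_dev:
            # Both endpoints live on the same device: keep the edge unchanged.
            subgraphs[src_dev].append((src, dst))
        else:
            # Cross-device edge: the producer's subgraph gets a send node,
            # the consumer's subgraph gets a matching receive node.
            send = f"send:{src}->{dst_dev}"
            recv = f"recv:{src}<-{src_dev}"
            subgraphs[src_dev].append((src, send))
            subgraphs[dst_dev].append((recv, dst))
    return dict(subgraphs)


# Hypothetical three-node graph placed on two devices.
edges = [("a", "b"), ("b", "c"), ("a", "c")]
placement = {"a": "cpu0", "b": "cpu0", "c": "gpu0"}
subgraphs = insert_send_recv(edges, placement)

for device, device_edges in subgraphs.items():
    print(device, device_edges)
```

In this sketch each device's subgraph is self-contained: every edge that originally crossed a device boundary now ends in a send node on the producing device and starts from a receive node on the consuming device, so each pair of devices could in principle communicate directly.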