Patents by Inventor Varun RAVI KUMAR
Varun RAVI KUMAR has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Publication number: 20250239061
Abstract: Aspects presented herein may enable a UE to distinguish features captured by different sensors or different types of sensors. The UE extracts a set of features from each sensor of multiple sensors. The UE maps a vector to each feature in the set of features extracted from each sensor, where the vector is related to positioning information and/or a set of intrinsic parameters associated with each sensor of the multiple sensors. The UE concatenates sets of features from the multiple sensors with their corresponding embedded vectors. The UE trains a machine learning (ML) model to identify relationships between different sensors in the multiple sensors based on the concatenated sets of features and the corresponding embedded vectors, or outputs the concatenated sets of features and the corresponding embedded vectors for training of the ML model to identify those relationships.
Type: Application
Filed: January 22, 2024
Publication date: July 24, 2025
Inventors: Varun RAVI KUMAR, Meysam SADEGHIGOOGHARI, Senthil Kumar YOGAMANI
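A minimal sketch (not the claimed method) of the concatenation step the abstract describes: each sensor's feature rows get that sensor's pose/intrinsics vector appended before everything is stacked for training. All shapes and names are illustrative assumptions.

```python
import torch

def concat_sensor_features(features_per_sensor, params_per_sensor):
    """features_per_sensor: list of (N_i, D) tensors, one per sensor.
    params_per_sensor: list of (P,) vectors encoding pose/intrinsics."""
    fused = []
    for feats, params in zip(features_per_sensor, params_per_sensor):
        # Broadcast the sensor's parameter vector to every feature row,
        # then concatenate along the feature dimension.
        embed = params.unsqueeze(0).expand(feats.shape[0], -1)
        fused.append(torch.cat([feats, embed], dim=-1))
    # Stack all sensors' rows into one (sum(N_i), D + P) training input.
    return torch.cat(fused, dim=0)

cam = torch.randn(100, 64)                 # e.g. camera features
lidar = torch.randn(80, 64)                # e.g. lidar features
cam_p, lidar_p = torch.randn(8), torch.randn(8)
out = concat_sensor_features([cam, lidar], [cam_p, lidar_p])
print(out.shape)  # torch.Size([180, 72])
```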
-
Publication number: 20250232451
Abstract: Systems and techniques are described herein for detecting objects. For instance, a method for detecting objects is provided. The method may include obtaining image data representative of a scene and point-cloud data representative of the scene; processing the image data and the point-cloud data using a machine-learning model, wherein the machine-learning model is trained using at least one loss function to detect moving objects represented by image data and point-cloud data, the at least one loss function being based on odometry data and at least one of training image-data features or training point-cloud-data features; and obtaining, from the machine-learning model, indications of one or more objects that are moving in the scene.
Type: Application
Filed: January 11, 2024
Publication date: July 17, 2025
Inventors: Ming-Yuan YU, Varun RAVI KUMAR, Senthil Kumar YOGAMANI
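A hedged sketch of one plausible odometry-based loss of the kind the abstract mentions: features warped by ego-motion should match the next frame in static regions, so large residuals flag motion. The residual form, mask regularizer, and weights are assumptions, not the patent's loss.

```python
import torch

def odometry_consistency_loss(feats_t, feats_t1_warped, moving_mask):
    """feats_t, feats_t1_warped: (B, C, H, W) features, the second already
    warped into frame t using odometry. moving_mask: (B, 1, H, W) in [0, 1]."""
    residual = (feats_t - feats_t1_warped).abs().mean(dim=1, keepdim=True)
    # Static pixels (mask ~ 0) should have low residual; moving pixels
    # (mask ~ 1) are excused from the consistency term.
    static_term = ((1.0 - moving_mask) * residual).mean()
    # Regularize the mask so the model cannot mark everything as moving.
    sparsity_term = moving_mask.mean()
    return static_term + 0.1 * sparsity_term
```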
-
Patent number: 12354334
Abstract: Systems and techniques are described herein for training an object-detection model. For instance, a method for training an object-detection model is provided. The method may include obtaining a light detection and ranging (LIDAR) capture; obtaining a first LIDAR-based representation of an object as captured from a first distance; obtaining a second LIDAR-based representation of the object as captured from a second distance; augmenting the LIDAR capture using the first LIDAR-based representation of the object and the second LIDAR-based representation of the object to generate an augmented LIDAR capture; and training a machine-learning object-detection model using the augmented LIDAR capture.
Type: Grant
Filed: March 17, 2023
Date of Patent: July 8, 2025
Assignee: QUALCOMM Incorporated
Inventors: Venkatraman Narayanan, Varun Ravi Kumar, Senthil Kumar Yogamani
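A toy sketch of distance-aware point-cloud augmentation in the spirit of the abstract: paste the near-captured (dense) or far-captured (sparse) object crop depending on the placement range. The 20 m cutoff and array layout are illustrative assumptions.

```python
import numpy as np

def augment_lidar(scene_pts, obj_near, obj_far, place_xyz, near_limit=20.0):
    """Paste a near- or far-captured object crop into the scene at
    place_xyz, picking the crop whose capture range matches placement."""
    dist = np.linalg.norm(place_xyz[:2])          # ground-plane range
    obj = obj_near if dist < near_limit else obj_far
    shifted = obj - obj.mean(axis=0) + place_xyz  # recenter at target spot
    return np.concatenate([scene_pts, shifted], axis=0)

scene = np.random.randn(1000, 3) * 30
near = np.random.randn(500, 3)    # dense crop captured up close
far = np.random.randn(60, 3)      # sparse crop captured far away
aug = augment_lidar(scene, near, far, np.array([5.0, 2.0, 0.0]))
```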
-
Publication number: 20250191392
Abstract: Generating dense semantic labels for objects in a camera image may be accomplished by constructing an image graph where nodes of the image graph represent pixels of a camera image; performing a first diffusion of labels on the image graph using sparse labels from a point cloud sensor to generate propagated labels; applying inpainting to one or more regions of the camera image to generate inpainted labels; performing a second diffusion of labels on the image graph to update the propagated labels; and fusing the propagated labels and the inpainted labels.
Type: Application
Filed: December 7, 2023
Publication date: June 12, 2025
Inventors: Varun Ravi Kumar, Balaji Shankar Balachandran, Senthil Kumar Yogamani
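A simplified sketch of label diffusion on a pixel graph, assuming a 4-connected grid with color-similarity edge weights (border wrap-around is ignored for brevity). This is one common formulation, not necessarily the patented one.

```python
import numpy as np

def diffuse_labels(img, labels, known, sigma=10.0, steps=50):
    """img: (H, W, 3) float image; labels: (H, W, K) one-hot seeds;
    known: (H, W) bool mask of pixels seeded by the point-cloud sensor."""
    lab = labels.astype(float).copy()
    for _ in range(steps):
        acc = np.zeros_like(lab)
        wsum = np.zeros(img.shape[:2])
        for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nb = np.roll(lab, (dy, dx), axis=(0, 1))
            diff = img - np.roll(img, (dy, dx), axis=(0, 1))
            # Similar-color neighbors get higher propagation weight.
            w = np.exp(-(diff ** 2).sum(-1) / (2 * sigma ** 2))
            acc += w[..., None] * nb
            wsum += w
        lab = acc / np.maximum(wsum[..., None], 1e-8)
        lab[known] = labels[known]  # clamp the sparse seeds each step
    return lab
```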
-
Publication number: 20250178624
Abstract: A device includes memory configured to store scene data from one or more scene sensors associated with a vehicle. The device also includes one or more processors configured to obtain, via a first machine-learning model of a contextual encoder system, a first embedding based on data representing speech that includes one or more commands for operation of the vehicle. The one or more processors are configured to obtain, via a second machine-learning model of the contextual encoder system, a second embedding based on the scene data and based on state data of the first machine-learning model. The one or more processors are configured to generate one or more vehicle control signals for the vehicle based on the first embedding and the second embedding.
Type: Application
Filed: December 1, 2023
Publication date: June 5, 2025
Inventors: Venkatraman NARAYANAN, Varun RAVI KUMAR, Senthil Kumar YOGAMANI
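An illustrative sketch of the two-embedding structure: a speech encoder, a scene encoder conditioned on the speech encoder's state, and a control head over both embeddings. Layer types, dimensions, and the three-signal output are assumptions for demonstration only.

```python
import torch
import torch.nn as nn

class ContextualController(nn.Module):
    def __init__(self, speech_dim=256, scene_dim=512, embed_dim=128):
        super().__init__()
        self.speech_enc = nn.Linear(speech_dim, embed_dim)
        # The scene encoder consumes scene features plus speech-encoder
        # state, mirroring the abstract's conditioning of model 2 on model 1.
        self.scene_enc = nn.Linear(scene_dim + embed_dim, embed_dim)
        self.control_head = nn.Linear(2 * embed_dim, 3)  # e.g. steer/brake/throttle

    def forward(self, speech_feats, scene_feats):
        e1 = self.speech_enc(speech_feats)
        e2 = self.scene_enc(torch.cat([scene_feats, e1], dim=-1))
        return self.control_head(torch.cat([e1, e2], dim=-1))

ctrl = ContextualController()
signals = ctrl(torch.randn(1, 256), torch.randn(1, 512))
```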
-
Patent number: 12311968
Abstract: This disclosure provides systems, methods, and devices for vehicle driving assistance systems that support image processing. In a first aspect, a method of image processing includes receiving, by a processor, image data from a camera image sensor; receiving, by the processor, point cloud data from a light detection and ranging (LiDAR) sensor; generating, by the processor and using a first machine learning model, fused image data that combines the image data and the point cloud data; and determining, by the processor and using a second machine learning model, whether the fused image data satisfies a criterion based on whether a population risk function of the first machine learning model exceeds a threshold. Other aspects and features are also claimed and described.
Type: Grant
Filed: June 5, 2023
Date of Patent: May 27, 2025
Assignee: QUALCOMM Incorporated
Inventors: Sweta Priyadarshi, Shivansh Rao, Varun Ravi Kumar, Senthil Kumar Yogamani
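A very small sketch of the gating idea: accept the fusion model's output only while an estimated risk stays under a threshold. Using a recent-batch mean loss as a plug-in estimate of the population risk is an assumption, not the patent's estimator.

```python
import torch

def fused_data_ok(per_sample_losses, threshold=0.5):
    """per_sample_losses: (B,) losses of the fusion model on recent data,
    used here as a plug-in estimate of its population risk."""
    estimated_risk = per_sample_losses.mean()
    return bool(estimated_risk <= threshold)

ok = fused_data_ok(torch.rand(32))  # True if the estimate is under 0.5
```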
-
Publication number: 20250166216
Abstract: An example device for detecting objects through processing of media data, such as image data and point cloud data, includes a processing system configured to form voxel representations of a real-world three-dimensional (3D) space using images and point clouds captured for the 3D space at consecutive time steps, extract image and/or point cloud features for voxels in voxel representations of the 3D space, determine correspondences between the voxels at consecutive time steps according to similarities between the extracted features, and determine positions of objects in the 3D space using the correspondences between the voxels. For example, the processing system may triangulate positions of a moving object from the positions of the corresponding voxels at the time steps. In this manner, the processing system may generate an accurate bird's eye view (BEV) representation of the real-world 3D space.
Type: Application
Filed: November 21, 2023
Publication date: May 22, 2025
Inventors: Varun Ravi Kumar, Senthil Kumar Yogamani, Venkatraman Narayanan
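A hedged sketch of matching voxels across two time steps by feature similarity, using cosine similarity with a mutual-nearest-neighbor check. The similarity measure and the mutuality test are common choices assumed here, not details from the abstract.

```python
import torch
import torch.nn.functional as F

def voxel_correspondences(feats_t0, feats_t1):
    """feats_t0: (N, D), feats_t1: (M, D) voxel features at two steps.
    Returns (K, 2) index pairs that are mutual nearest neighbors."""
    sim = F.normalize(feats_t0, dim=1) @ F.normalize(feats_t1, dim=1).T
    fwd = sim.argmax(dim=1)              # best t1 voxel for each t0 voxel
    bwd = sim.argmax(dim=0)              # best t0 voxel for each t1 voxel
    i = torch.arange(feats_t0.shape[0])
    mutual = bwd[fwd] == i               # keep only mutual matches
    return torch.stack([i[mutual], fwd[mutual]], dim=1)
```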
-
Publication number: 20250166354
Abstract: A method of image processing includes receiving a set of images from a sensor, dynamically determining respective cell resolutions of respective cells in a bird's-eye-view (BEV) grid based on image content in the set of images, wherein at least two of the cells have different cell resolutions, and generating BEV image content based on the respective cell resolutions of the respective cells.
Type: Application
Filed: November 16, 2023
Publication date: May 22, 2025
Inventors: Varun Ravi Kumar, Kiran Bangalore Ravi, Senthil Kumar Yogamani
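A minimal sketch of content-driven cell resolution: cells with busier content get finer subdivision. The per-cell score, thresholds, and subdivision levels are illustrative assumptions standing in for whatever content measure the method actually uses.

```python
import numpy as np

def cell_resolutions(content_score, levels=(1, 2, 4)):
    """content_score: (G, G) per-cell score in [0, 1]. Returns (G, G)
    subdivision factors, so a cell's side becomes base_size / factor."""
    bins = np.digitize(content_score, [0.33, 0.66])  # bucket into 0, 1, 2
    return np.asarray(levels)[bins]

score = np.random.rand(8, 8)
print(cell_resolutions(score))  # quiet cells stay 1, busy cells become 4
```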
-
Publication number: 20250156997
Abstract: An apparatus for processing image data includes a memory for storing the image data, wherein the image data comprises a first set of image data collected by a first camera comprising a first field of view (FOV) and a second set of image data collected by a second camera comprising a second FOV; and processing circuitry in communication with the memory. The processing circuitry is configured to: apply an encoder to extract, from the first set of image data, a first set of perspective view features; apply the encoder to extract, from the second set of image data, a second set of perspective view features; and project the first set of perspective view features and the second set of perspective view features onto a grid to generate a set of bird's eye view (BEV) features.
Type: Application
Filed: November 9, 2023
Publication date: May 15, 2025
Inventors: Varun Ravi Kumar, Kiran Bangalore Ravi, Senthil Kumar Yogamani
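A hedged sketch of the projection step, assuming each perspective-view feature already carries a predicted ground-plane position (the lifting itself is out of scope here). Features are scatter-summed into BEV cells; grid size and extent are assumptions.

```python
import torch

def splat_to_bev(feats, xy, grid=128, extent=50.0):
    """feats: (N, C) perspective-view features from any camera;
    xy: (N, 2) metric positions. Returns a (C, grid, grid) BEV map."""
    cells = ((xy + extent) / (2 * extent) * grid).long().clamp(0, grid - 1)
    idx = cells[:, 1] * grid + cells[:, 0]       # flatten row-major
    bev = torch.zeros(feats.shape[1], grid * grid)
    bev.index_add_(1, idx, feats.T)              # scatter-sum per cell
    return bev.view(-1, grid, grid)

cam1 = splat_to_bev(torch.randn(200, 32), torch.rand(200, 2) * 40 - 20)
```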
-
Publication number: 20250157204
Abstract: This disclosure provides systems, methods, and devices for vehicle driving assistance systems that support image processing. In a first aspect, a method of image processing includes receiving image data from an image sensor; receiving ranging data from a ranging sensor; embedding first spatial features of the image data with first temporal information associated with the image data; embedding second spatial features of the ranging data with second temporal information associated with the ranging data; determining first bird's-eye-view (BEV) features based on the first spatial features embedded with first temporal information; determining second BEV features based on the second spatial features embedded with second temporal information; and determining, based on the first and second BEV features, a feature set for processing by a transformer network. The feature set includes at least a portion of both the first and second BEV features. Other aspects and features are also claimed and described.
Type: Application
Filed: November 14, 2023
Publication date: May 15, 2025
Inventors: Venkatraman Narayanan, Varun Ravi Kumar, Senthil Kumar Yogamani
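An illustrative sketch of embedding spatial features with temporal information via a sinusoidal time code, then concatenating both modalities' BEV tokens for a transformer. The sinusoidal form and every dimension here are assumptions, not the claimed embedding.

```python
import torch

def add_time_embedding(feats, t):
    """feats: (B, N, D) spatial features, D even; t: (B,) frame timestamps.
    Adds a sinusoidal time code to every token."""
    B, N, D = feats.shape
    freqs = torch.exp(torch.arange(0, D, 2).float() * (-4.0 / D))
    angles = t[:, None, None] * freqs            # (B, 1, D/2)
    code = torch.zeros(B, 1, D)
    code[..., 0::2] = angles.sin()
    code[..., 1::2] = angles.cos()
    return feats + code                          # broadcast over all tokens

ts = torch.tensor([0.0, 1.0])
cam_bev = add_time_embedding(torch.randn(2, 64, 128), ts)
radar_bev = add_time_embedding(torch.randn(2, 64, 128), ts)
tokens = torch.cat([cam_bev, radar_bev], dim=1)  # feature set for the transformer
```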
-
Publication number: 20250139882
Abstract: In some aspects of the disclosure, an apparatus includes a processing system that includes one or more processors and one or more memories coupled to the one or more processors. The processing system is configured to receive sensor data associated with a scene and to generate a cylindrical representation associated with the scene. The processing system is further configured to modify the cylindrical representation based on detecting a feature of the cylindrical representation being included in a first region of the cylindrical representation. Modifying the cylindrical representation includes relocating the feature from the first region to a second region that is different than the first region. The processing system is further configured to perform, based on the modified cylindrical representation, one or more three-dimensional (3D) perception operations associated with the scene.
Type: Application
Filed: October 31, 2023
Publication date: May 1, 2025
Inventors: Behnaz Rezaei, Varun Ravi Kumar, Senthil Kumar Yogamani
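A speculative sketch of one way such a relocation could work, assuming the "first region" is the azimuth wrap-around seam of a cylindrical range view: rolling the image moves a seam-straddling feature into a contiguous region before 3D perception runs. This reading of the abstract is an assumption.

```python
import numpy as np

def relocate_seam_feature(cyl, feature_cols, width_margin=32):
    """cyl: (H, W, C) cylindrical image; feature_cols: column indices
    covered by a detected feature. Rolls the image if it touches the seam."""
    H, W, _ = cyl.shape
    touches_seam = (feature_cols.min() < width_margin and
                    feature_cols.max() > W - width_margin)
    if not touches_seam:
        return cyl
    shift = W // 2                       # move the seam to the far side
    return np.roll(cyl, shift, axis=1)   # feature is now contiguous mid-image
```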
-
Publication number: 20250131742
Abstract: Aspects presented herein may improve the accuracy and reliability of object detections performed by multiple object detection models. In one aspect, a UE detects (1) a set of polylines from at least one of a set of bird's eye view (BEV) features or a set of perspective view (PV) features associated with a set of images and (2) a set of three-dimensional (3D) objects in the set of BEV features. The UE associates the set of polylines with the set of 3D objects. The UE updates the set of polylines based on a set of nearby 3D objects or updates the set of 3D objects based on a set of nearby polylines. The UE outputs an indication of the updated set of polylines or the updated set of 3D objects.
Type: Application
Filed: October 23, 2023
Publication date: April 24, 2025
Inventors: Varun RAVI KUMAR, Senthil Kumar YOGAMANI, Heesoo MYEONG
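A minimal sketch of one direction of this association: refining each 3D box center toward its nearest polyline point. The 2 m distance gate and the blend weight are illustrative assumptions, and the patented update could differ entirely.

```python
import numpy as np

def refine_boxes_with_polylines(centers, polyline_pts, gate=2.0, w=0.3):
    """centers: (B, 2) BEV box centers; polyline_pts: (P, 2) sampled lane
    points. Pulls each center toward its nearest lane point if close."""
    d = np.linalg.norm(centers[:, None] - polyline_pts[None], axis=-1)
    nearest = d.argmin(axis=1)
    close = d[np.arange(len(centers)), nearest] < gate  # within the gate
    refined = centers.copy()
    refined[close] = ((1 - w) * centers[close]
                      + w * polyline_pts[nearest[close]])
    return refined
```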
-
Publication number: 20250094535
Abstract: According to aspects described herein, a device can extract first features from frames of first sensor data and second features from frames of second sensor data (captured after the first sensor data). The device can obtain first weighted features based on the first features and second weighted features based on the second features. The device can aggregate the first weighted features to determine a first feature vector and the second weighted features to determine a second feature vector. The device can obtain a first transformed feature vector (based on transforming the first feature vector into a coordinate space) and a second transformed feature vector (based on transforming the second feature vector into the coordinate space). The device can aggregate first transformed weighted features (based on the first transformed feature vector) and second transformed weighted features (based on the second transformed feature vector) to determine a fused feature vector.
Type: Application
Filed: September 18, 2023
Publication date: March 20, 2025
Inventors: Shivansh RAO, Sweta PRIYADARSHI, Varun RAVI KUMAR, Senthil Kumar YOGAMANI, Arunkumar NEHRUR RAVI, Vasudev BHASKARAN
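A hedged sketch of the weight-pool-transform-fuse flow: per-frame features are softly weighted, pooled into one vector per clip, mapped into a shared coordinate space, and averaged. The learned scorer, the linear map, and the averaging fusion are all assumptions.

```python
import torch
import torch.nn as nn

class TemporalFuser(nn.Module):
    def __init__(self, dim=128):
        super().__init__()
        self.score = nn.Linear(dim, 1)        # learns per-frame weights
        self.to_common = nn.Linear(dim, dim)  # maps into the shared space

    def pool(self, frames):                   # frames: (T, N, D)
        w = self.score(frames).softmax(dim=1)
        return (w * frames).sum(dim=1)        # (T, D): one vector per clip

    def forward(self, early_frames, late_frames):
        v1 = self.to_common(self.pool(early_frames))
        v2 = self.to_common(self.pool(late_frames))
        return 0.5 * (v1 + v2)                # fused feature vector

fused = TemporalFuser()(torch.randn(4, 10, 128), torch.randn(4, 10, 128))
```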
-
Publication number: 20250095168
Abstract: Systems and techniques are described herein for processing data. For instance, a method for processing data is provided. The method may include obtaining source features generated based on first sensor data captured using a first set of sensors; obtaining source semantic attributes related to the source features; obtaining target features generated based on second sensor data captured using a second set of sensors; obtaining map information; obtaining location information of a device comprising the second set of sensors; obtaining target semantic attributes from the map information based on the location information; aligning the target features with a set of the source features, based on the source semantic attributes and the target semantic attributes, to generate aligned target features; and processing the aligned target features to generate an output.
Type: Application
Filed: September 15, 2023
Publication date: March 20, 2025
Inventors: Julia KABALAR, Kiran BANGALORE RAVI, Nirnai ACH, Mireille Lucette Laure GREGOIRE, Varun RAVI KUMAR, Senthil Kumar YOGAMANI
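An illustrative sketch of semantic-gated alignment: each target feature is matched only against source features that share its semantic attribute, then blended toward the best match. The cosine criterion, class-id encoding, and blend factor are assumptions.

```python
import torch
import torch.nn.functional as F

def align_targets(src_feats, src_sem, tgt_feats, tgt_sem, alpha=0.5):
    """src_feats: (S, D), tgt_feats: (T, D); *_sem: (S,), (T,) class ids."""
    sim = F.normalize(tgt_feats, dim=1) @ F.normalize(src_feats, dim=1).T
    same_class = tgt_sem[:, None] == src_sem[None, :]
    sim = sim.masked_fill(~same_class, float('-inf'))  # gate by semantics
    best = sim.argmax(dim=1)                # best same-class source match
    has_match = same_class.any(dim=1)       # rows with no match stay as-is
    out = tgt_feats.clone()
    out[has_match] = ((1 - alpha) * tgt_feats[has_match]
                      + alpha * src_feats[best[has_match]])
    return out
```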
-
Publication number: 20250095173
Abstract: An example device for training a neural network includes a memory configured to store a neural network model for the neural network; and a processing system comprising one or more processors implemented in circuitry, the processing system being configured to: extract image features from an image of an area, the image features representing objects in the area; extract point cloud features from a point cloud representation of the area, the point cloud features representing the objects in the area; add Gaussian noise to a ground truth depth map for the area to generate a noisy ground truth depth map, the ground truth depth map representing accurate positions of the objects in the area; and train the neural network using the image features, the point cloud features, and the noisy ground truth depth map to generate a depth map.
Type: Application
Filed: September 14, 2023
Publication date: March 20, 2025
Inventors: Savitha Srinivasan, Varun Ravi Kumar, Senthil Kumar Yogamani
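A minimal sketch of the noisy-supervision step: perturb the ground-truth depth map with Gaussian noise before using it as a training target. The depth-relative noise scale and the positivity clamp are assumed details.

```python
import torch

def noisy_depth_target(gt_depth, sigma=0.05):
    """gt_depth: (B, 1, H, W) metric depth. Returns a jittered copy,
    clamped so depths stay positive."""
    noise = torch.randn_like(gt_depth) * sigma * gt_depth  # relative noise
    return (gt_depth + noise).clamp(min=1e-3)

target = noisy_depth_target(torch.rand(1, 1, 64, 64) * 50)
```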
-
Publication number: 20250095354
Abstract: An apparatus includes a memory and processing circuitry in communication with the memory. The processing circuitry is configured to process a joint graph representation using a graph neural network (GNN) to form an enhanced graph representation. The joint graph representation includes first features from a voxelized point cloud, and second features from a plurality of camera images. The enhanced graph representation includes enhanced first features and enhanced second features. The processing circuitry is further configured to perform a diffusion process on the enhanced first features and the enhanced second features of the enhanced graph representation to form a denoised graph representation having denoised first features and denoised second features, and fuse the denoised first features and the denoised second features of the denoised graph representation using a graph attention network (GAT) to form a fused point cloud having fused features.
Type: Application
Filed: September 14, 2023
Publication date: March 20, 2025
Inventors: Varun Ravi Kumar, Debasmit Das, Senthil Kumar Yogamani
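A toy sketch of one message-passing step standing in for the GNN stage: voxel and image nodes share one graph, and each node averages its neighbors' features via a normalized adjacency. The dense adjacency and mean aggregation are simplifications of the GNN/GAT pipeline, not its implementation.

```python
import torch

def gnn_step(node_feats, adj):
    """node_feats: (N, D); adj: (N, N) 0/1 adjacency with self-loops."""
    deg = adj.sum(dim=1, keepdim=True).clamp(min=1)
    return (adj @ node_feats) / deg       # mean aggregation over neighbors

n = 6
adj = (torch.rand(n, n) > 0.5).float()
adj = ((adj + adj.T) > 0).float() + torch.eye(n)  # symmetrize, self-loops
adj = adj.clamp(max=1)
enhanced = gnn_step(torch.randn(n, 16), adj)      # "enhanced" node features
```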
-
Publication number: 20250085413
Abstract: This disclosure provides systems, methods, and devices for vehicle driving assistance systems that support image processing. In a first aspect, a method of image processing includes receiving image BEV features and receiving first radio detection and ranging (RADAR) BEV features. The first RADAR BEV features are determined based on first RADAR data associated with a first data type. First normalized RADAR BEV features are determined, which includes rescaling the first RADAR BEV features using a first attention mechanism based on the image BEV features and the first RADAR BEV features. Fused data is determined that combines the first normalized RADAR BEV features and the image BEV features. Other aspects and features are also claimed and described.
Type: Application
Filed: September 7, 2023
Publication date: March 13, 2025
Inventors: Senthil Kumar Yogamani, Varun Ravi Kumar
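An illustrative sketch of the cross-modal rescaling: an attention gate computed from the concatenated image and RADAR BEV features rescales the RADAR branch before fusion. The 1x1-conv gate and channel sizes are assumptions about the attention mechanism.

```python
import torch
import torch.nn as nn

class RadarRescale(nn.Module):
    def __init__(self, c_img=64, c_rad=64):
        super().__init__()
        self.attn = nn.Sequential(
            nn.Conv2d(c_img + c_rad, c_rad, kernel_size=1),
            nn.Sigmoid(),                     # per-channel, per-cell gate
        )

    def forward(self, img_bev, rad_bev):      # both (B, C, H, W)
        gate = self.attn(torch.cat([img_bev, rad_bev], dim=1))
        rad_norm = gate * rad_bev             # rescaled RADAR BEV features
        return torch.cat([img_bev, rad_norm], dim=1)  # fused data

fused = RadarRescale()(torch.randn(1, 64, 32, 32), torch.randn(1, 64, 32, 32))
```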
-
Publication number: 20250085407
Abstract: A method includes receiving a plurality of images, wherein a first image of the plurality of images comprises a range image and a second image comprises a camera image, and filtering the first image to generate a filtered first image. The method also includes generating a plurality of depth estimates based on the second image and generating an attention map by combining the filtered first image and the plurality of depth estimates. Additionally, the method includes generating a consistency score indicative of a consistency of depth estimates between the first image and the second image based on the attention map, modulating one or more features extracted from the second image based on the consistency score using a gating mechanism to generate one or more modulated features, and generating a classification of one or more soiled regions in the second image based on the modulated one or more features.
Type: Application
Filed: September 11, 2023
Publication date: March 13, 2025
Inventors: Varun Ravi Kumar, Senthil Kumar Yogamani, Shivansh Rao
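A hedged sketch of the gating step: a consistency score between LiDAR range and camera depth modulates the image features, so regions where the modalities disagree (possibly soiled) are attenuated before classification. The exponential score and its scale are assumptions.

```python
import torch

def gate_features(img_feats, range_depth, cam_depth, tau=0.2):
    """img_feats: (B, C, H, W); range_depth, cam_depth: (B, 1, H, W)."""
    rel_err = (range_depth - cam_depth).abs() / range_depth.clamp(min=1e-3)
    consistency = torch.exp(-rel_err / tau)   # 1 = agree, ~0 = disagree
    return img_feats * consistency            # broadcast gate over channels
```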
-
Publication number: 20250086978
Abstract: An apparatus includes a memory for storing image data and position data, wherein the image data comprises a set of two-dimensional (2D) camera images, and wherein the position data comprises a set of three-dimensional (3D) point cloud frames. The apparatus also includes processing circuitry in communication with the memory, wherein the processing circuitry is configured to convert the set of 2D camera images into a first 3D representation of a 3D environment corresponding to the image data and the position data, wherein the set of 3D point cloud frames comprises a second 3D representation of the 3D environment. The processing circuitry is also configured to generate, based on the first 3D representation and the second 3D representation, a set of bird's eye view (BEV) feature kernels in a continuous space; and generate, based on the set of BEV feature kernels, an output.
Type: Application
Filed: September 13, 2023
Publication date: March 13, 2025
Inventors: Kiran Bangalore Ravi, Varun Ravi Kumar, Senthil Kumar Yogamani
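A minimal sketch of a continuous-space BEV kernel: each feature carries a metric (x, y) and contributes a Gaussian kernel that can be queried at any point, rather than being binned into fixed cells. The Gaussian form and bandwidth are assumed stand-ins for the patented kernels.

```python
import torch

def query_bev(query_xy, feat_xy, feats, bandwidth=1.0):
    """query_xy: (Q, 2) query points; feat_xy: (N, 2); feats: (N, D).
    Returns (Q, D) kernel-weighted features at the query points."""
    d2 = ((query_xy[:, None] - feat_xy[None]) ** 2).sum(-1)   # (Q, N)
    w = torch.exp(-d2 / (2 * bandwidth ** 2))                 # Gaussian kernel
    w = w / w.sum(dim=1, keepdim=True).clamp(min=1e-8)        # normalize
    return w @ feats

out = query_bev(torch.rand(16, 2) * 10, torch.rand(50, 2) * 10,
                torch.randn(50, 32))
```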
-
Publication number: 20250086977
Abstract: This disclosure provides systems, methods, and devices for processing and aligning sensor data features for navigation. In a first aspect, a method is provided that includes determining, based on received sensor data, a first set of features for an area surrounding a vehicle. A second set of features for the area surrounding the vehicle may be determined based on an occupancy map for the area surrounding the vehicle. A third set of features may be determined that align the first set of features with the second set of features. The third set of features may align each of at least a subset of the second set of features with at least one corresponding feature from the first set of features.
Type: Application
Filed: September 8, 2023
Publication date: March 13, 2025
Inventors: Venkatraman Narayanan, Varun Ravi Kumar, Senthil Kumar Yogamani
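A hedged sketch of producing a third, aligned feature set: each occupancy-map feature is paired with its most similar live-sensor feature and the pair is averaged. The cosine matching criterion and the averaging are assumptions about how the alignment could be realized.

```python
import torch
import torch.nn.functional as F

def align_feature_sets(sensor_feats, occ_feats):
    """sensor_feats: (N, D) from live sensors; occ_feats: (M, D) from the
    occupancy map. Returns (M, D) aligned (third-set) features."""
    sim = F.normalize(occ_feats, dim=1) @ F.normalize(sensor_feats, dim=1).T
    match = sim.argmax(dim=1)                 # best sensor feature per cell
    return 0.5 * (occ_feats + sensor_feats[match])

third_set = align_feature_sets(torch.randn(120, 64), torch.randn(80, 64))
```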