Patents by Inventor Yuzhuo Ren

Yuzhuo Ren has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 11954862
    Abstract: A neural network system leverages dual attention, specifically both spatial attention and channel attention, to jointly estimate the heart rate and respiratory rate of a subject by processing images of the subject. A motion neural network receives images of the subject and estimates heart and breath rates using both spatial and channel domain attention masks to focus processing on particular feature data. An appearance neural network computes a spatial attention mask from the images of the subject, which may indicate that features associated with the subject's face (as opposed to the subject's hair or shoulders) should be used to accurately estimate the heart and/or breath rate. Channel-wise domain attention is learned during training and recalibrates channel-wise feature responses to select the most informative features for processing. The learned channel attention mask can be used for different subjects during deployment.
    Type: Grant
    Filed: September 20, 2021
    Date of Patent: April 9, 2024
    Assignee: NVIDIA Corporation
    Inventors: Yuzhuo Ren, Niranjan Avadhanam, Rajath Bellipady Shetty
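
A minimal PyTorch sketch of the dual-attention idea described in this abstract: a spatial mask from an appearance branch and a learned channel-attention mask recalibrate motion features before a joint heart/breath-rate head. The module names, layer sizes, and the squeeze-and-excitation-style channel attention are illustrative assumptions, not the patented architecture.

```python
import torch
import torch.nn as nn

class ChannelAttention(nn.Module):
    """Squeeze-and-excitation-style channel recalibration, learned in training."""
    def __init__(self, channels, reduction=4):
        super().__init__()
        self.fc = nn.Sequential(
            nn.Linear(channels, channels // reduction), nn.ReLU(),
            nn.Linear(channels // reduction, channels), nn.Sigmoid())

    def forward(self, x):                      # x: (N, C, H, W)
        w = self.fc(x.mean(dim=(2, 3)))        # global average pool -> (N, C)
        return x * w[:, :, None, None]         # reweight informative channels

class DualAttentionRateEstimator(nn.Module):
    def __init__(self, channels=32):
        super().__init__()
        self.motion = nn.Conv2d(3, channels, 3, padding=1)   # motion features
        self.appearance = nn.Sequential(                     # spatial-mask branch
            nn.Conv2d(3, 1, 3, padding=1), nn.Sigmoid())
        self.channel_att = ChannelAttention(channels)
        self.head = nn.Linear(channels, 2)   # joint heart- and breath-rate output

    def forward(self, frames):
        feat = torch.relu(self.motion(frames))
        feat = feat * self.appearance(frames)  # focus on face, not hair/shoulders
        feat = self.channel_att(feat)          # channel-wise domain attention
        return self.head(feat.mean(dim=(2, 3)))

rates = DualAttentionRateEstimator()(torch.randn(1, 3, 64, 64))  # shape (1, 2)
```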
  • Publication number: 20240112472
    Abstract: In various examples, color statistic(s) from ground projections are used to harmonize color between reference and target frames representing an environment. The reference and target frames may be projected onto a representation of the ground (e.g., a ground plane) of the environment, an overlapping region between the projections may be identified, and the portion of each projection that lands in the overlapping region may be taken as a corresponding ground projection. Color statistics (e.g., mean, variance, standard deviation, kurtosis, skew, correlation(s) between color channels) may be computed from the ground projections (or a portion thereof, such as a majority cluster) and used to modify the colors of the target frame to have updated color statistics that match those from the ground projection of the reference frame, thereby harmonizing color across the reference and target frames.
    Type: Application
    Filed: October 4, 2022
    Publication date: April 4, 2024
    Inventors: Yuzhuo Ren, Dawid Stanislaw Pajak, Niranjan Avadhanam, Guangli Dai
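
As a rough illustration of the statistics-matching step this abstract describes, the sketch below shifts and scales each color channel of a target frame so that the mean and standard deviation of its ground projection match the reference's. The function names, and the choice of mean/std rather than the other listed statistics, are assumptions for the example.

```python
import numpy as np

def harmonize_to_reference(target, ref_ground, tgt_ground):
    """Match each color channel's mean/std between the two ground projections,
    then apply that same correction to the whole target frame."""
    out = target.astype(np.float64)
    for c in range(target.shape[-1]):
        r_mean, r_std = ref_ground[..., c].mean(), ref_ground[..., c].std()
        t_mean, t_std = tgt_ground[..., c].mean(), tgt_ground[..., c].std()
        out[..., c] = (out[..., c] - t_mean) * (r_std / max(t_std, 1e-6)) + r_mean
    return np.clip(out, 0, 255).astype(np.uint8)
```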
  • Publication number: 20240112376
    Abstract: In various examples, color harmonization is applied to images of an environment in a reference light space. For example, different cameras on an ego-object may use independent capturing algorithms to generate processed images of the environment representing a common time slice using different capture configuration parameters. The processed images may be transformed into deprocessed images by inverting one or more stages of image processing to transform the processed images into a reference light space of linear light, and color harmonization may be applied to the deprocessed images in the reference light space. After applying color harmonization, corresponding image processing may be reapplied to the harmonized images using corresponding capture configuration parameters, the resulting processed harmonized images may be stitched into a stitched image, and a visualization of the stitched image may be presented (e.g., on a monitor visible to an occupant or operator of the ego-object).
    Type: Application
    Filed: October 4, 2022
    Publication date: April 4, 2024
    Inventors: Yuzhuo Ren, Dawid Stanislaw Pajak, Niranjan Avadhanam
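
A toy sketch of harmonizing in a reference light space: invert a gamma stage to reach linear light, match exposure there, then reapply each camera's own gamma. Real capture pipelines invert more stages (white balance, tone mapping, and so on); the single per-camera gamma parameter here is a simplifying assumption.

```python
import numpy as np

def to_linear(img, gamma):      # invert one image-processing stage (gamma)
    return (img / 255.0) ** gamma

def from_linear(lin, gamma):    # reapply that camera's own processing
    return np.clip((lin ** (1.0 / gamma)) * 255.0, 0, 255).astype(np.uint8)

def harmonize_pair(img_a, img_b, gamma_a, gamma_b):
    lin_a, lin_b = to_linear(img_a, gamma_a), to_linear(img_b, gamma_b)
    gain = lin_a.mean() / max(lin_b.mean(), 1e-9)  # match exposure in linear light
    return from_linear(lin_a, gamma_a), from_linear(lin_b * gain, gamma_b)
```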
  • Patent number: 11948315
    Abstract: In various examples, two or more cameras in an automotive surround view system generate two or more input images to be stitched, or combined, into a single stitched image. In an embodiment, to improve the quality of a stitched image, a feedback module calculates two or more scores representing errors between the stitched image and one or more input images. If a computed score indicates structural errors in the stitched image, the feedback module calculates and applies one or more geometric transforms to the one or more input images. If a computed score indicates color errors in the stitched image, the feedback module calculates and applies one or more photometric transforms to the one or more input images.
    Type: Grant
    Filed: December 31, 2020
    Date of Patent: April 2, 2024
    Assignee: NVIDIA Corporation
    Inventors: Yuzhuo Ren, Niranjan Avadhanam
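
The feedback loop might look roughly like the sketch below: score the stitched result against a source image, then apply a photometric or geometric correction depending on which score indicates an error. The error metrics and the corrections (a channel-mean offset, a one-pixel shift placeholder) are toy stand-ins, not the patented transforms.

```python
import numpy as np

def color_error(stitched, source):
    return float(np.abs(stitched.mean((0, 1)) - source.mean((0, 1))).mean())

def structural_error(stitched, source):
    gs, go = np.gradient(stitched.mean(2)), np.gradient(source.mean(2))
    return float(np.abs(gs[0] - go[0]).mean() + np.abs(gs[1] - go[1]).mean())

def feedback_step(stitched, source, color_tol=2.0, struct_tol=5.0):
    corrected = source.astype(np.float64)
    if color_error(stitched, source) > color_tol:        # photometric transform
        corrected += stitched.mean((0, 1)) - source.mean((0, 1))
    if structural_error(stitched, source) > struct_tol:  # geometric transform
        corrected = np.roll(corrected, 1, axis=1)        # placeholder re-alignment
    return np.clip(corrected, 0, 255).astype(np.uint8)
```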
  • Publication number: 20240104879
    Abstract: In various examples, calibration techniques for interior depth sensors and image sensors for in-cabin monitoring systems and applications are provided. An intermediary coordinate system may be generated using calibration targets distributed within an interior space to reference 3D positions of features detected by both depth-perception and optical image sensors. Rotation-translation transforms may be determined to compute a first transform (H1) between the depth-perception sensor's 3D coordinate system and the 3D intermediary coordinate system, and a second transform (H2) between the optical image sensor's 2D coordinate system and the intermediary coordinate system. A third transform (H3) between the depth-perception sensor's 3D coordinate system and the optical image sensor's 2D coordinate system can be computed as a function of H1 and H2. The calibration targets may comprise a structural substrate that includes one or more fiducial point markers and one or more motion targets.
    Type: Application
    Filed: September 26, 2022
    Publication date: March 28, 2024
    Inventors: Hairong Jiang, Yuzhuo Ren, Nitin Bharadwaj, Chun-Wei Chen, Varsha Chandrashekhar Hedau
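
Numerically, computing H3 as a function of H1 and H2 could look like the sketch below, where H1 is a 4x4 rigid transform from the depth sensor's frame to the intermediary frame and H2 is a 3x4 projection from the intermediary frame to image pixels. The direction conventions, intrinsics, and translation values are assumptions for illustration.

```python
import numpy as np

# H1: depth-sensor 3D -> intermediary 3D (4x4 rotation-translation)
H1 = np.eye(4)
H1[:3, 3] = [0.10, 0.00, -0.05]           # example sensor offset (meters)

# H2: intermediary 3D -> optical image 2D (3x4 projection = intrinsics @ extrinsics)
K = np.array([[800.0, 0.0, 320.0],
              [0.0, 800.0, 240.0],
              [0.0, 0.0, 1.0]])
H2 = K @ np.eye(4)[:3]                    # identity extrinsics in this toy case

# H3: depth-sensor 3D -> image 2D, composed from H1 and H2
H3 = H2 @ H1

pt = np.array([0.2, 0.1, 1.5, 1.0])       # homogeneous 3D point, depth frame
u, v, w = H3 @ pt
print(u / w, v / w)                       # its pixel coordinates
```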
  • Publication number: 20240104941
    Abstract: In various examples, sensor parameter calibration techniques for in-cabin monitoring systems and applications are presented. An occupant monitoring system (OMS) is an example of a system that may be used within a vehicle or machine cabin to perform real-time assessments of driver and occupant presence, gaze, alertness, and/or other conditions. In some embodiments, a calibration parameter for an interior image sensor is determined so that the coordinates of features detected in 2D captured images may be referenced to an in-cabin 3D coordinate system. In some embodiments, a processing unit may detect fiducial points using an image of an interior space captured by a sensor, determine a 2D image coordinate for a fiducial point using the image, determine a 3D coordinate for the fiducial point, determine a calibration parameter comprising a rotation-translation transform from the 2D image coordinate and the 3D coordinate, and configure an operation based on the calibration parameter.
    Type: Application
    Filed: September 26, 2022
    Publication date: March 28, 2024
    Inventors: Yuzhuo Ren, Hairong Jiang, Niranjan Avadhanam, Varsha Chandrashekhar Hedau
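
Determining a rotation-translation calibration parameter from fiducial correspondences is a classic perspective-n-point problem; a sketch using OpenCV's solvePnP follows. The cabin coordinates, pixel detections, and intrinsics are fabricated for the example.

```python
import numpy as np
import cv2

# Hypothetical fiducials: known cabin 3D coordinates (meters, on a z = 0 plane)
# and their detected 2D pixel locations in the interior camera image.
object_pts = np.array([[0.0, 0.0, 0.0], [0.5, 0.0, 0.0], [0.5, 0.3, 0.0],
                       [0.0, 0.3, 0.0], [0.25, 0.15, 0.0], [0.1, 0.1, 0.0]])
image_pts = np.array([[320.0, 240.0], [545.0, 240.0], [545.0, 375.0],
                      [320.0, 375.0], [432.5, 307.5], [365.0, 285.0]])
K = np.array([[900.0, 0.0, 320.0], [0.0, 900.0, 240.0], [0.0, 0.0, 1.0]])

# The calibration parameter: a rotation-translation transform taking cabin
# 3D coordinates into the camera frame.
ok, rvec, tvec = cv2.solvePnP(object_pts, image_pts, K, None)
R, _ = cv2.Rodrigues(rvec)
print(ok, R.round(3), tvec.ravel().round(3))
```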
  • Publication number: 20240087341
    Abstract: State information can be determined for a subject in a way that is robust to different inputs or conditions. For drowsiness, facial landmarks can be determined from captured image data and used to determine a set of blink parameters. These parameters can be used, such as with a temporal network, to estimate a state (e.g., drowsiness) of the subject. To improve robustness, an eye state determination network can determine eye state from the image data without reliance on intermediate landmarks; this eye state can be used, such as with another temporal network, to estimate the state of the subject. A weighted combination of these values can be used to determine an overall state of the subject. To improve accuracy, individual behavior patterns and context information can be utilized to account for variations in the data due to subject variation or current context rather than changes in state.
    Type: Application
    Filed: November 21, 2023
    Publication date: March 14, 2024
    Inventors: Yuzhuo Ren, Niranjan Avadhanam
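
One common blink parameter is the standard eye aspect ratio computed from six eye landmarks, and the weighted combination of the two pathways could be as simple as the sketch below; the 0.6 weight is an arbitrary assumption.

```python
import numpy as np

def eye_aspect_ratio(eye):
    """Standard blink feature from six eye landmarks: ratio of the two
    vertical gaps to the horizontal width (drops toward 0 as the eye closes)."""
    v1 = np.linalg.norm(eye[1] - eye[5])
    v2 = np.linalg.norm(eye[2] - eye[4])
    h = np.linalg.norm(eye[0] - eye[3])
    return (v1 + v2) / (2.0 * h)

def fuse_drowsiness(landmark_score, eye_state_score, w_landmark=0.6):
    """Weighted combination of the two pathways' drowsiness estimates."""
    return w_landmark * landmark_score + (1.0 - w_landmark) * eye_state_score
```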
  • Publication number: 20240062067
    Abstract: Apparatuses, systems, and techniques are described to determine locations of objects using images including digital representations of those objects. In at least one embodiment, a gaze of one or more occupants of a vehicle is determined independently of a location of one or more sensors used to detect those occupants.
    Type: Application
    Filed: October 30, 2023
    Publication date: February 22, 2024
    Inventors: Feng Hu, Niranjan Avadhanam, Yuzhuo Ren, Sujay Yadawadkar, Sakthivel Sivaraman, Hairong Jiang, Siyue Wu
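
One way to make a gaze estimate independent of sensor placement is to rotate it from the camera frame into a common vehicle frame using the camera's extrinsics, as sketched below; the rotation matrix is an arbitrary example, and this is only one plausible reading of the abstract.

```python
import numpy as np

def gaze_in_vehicle_frame(gaze_cam, R_cam_to_vehicle):
    """Rotate a unit gaze direction from camera coordinates into a common
    vehicle frame, so downstream logic is independent of camera mounting."""
    g = R_cam_to_vehicle @ gaze_cam
    return g / np.linalg.norm(g)

R = np.array([[0.0, 0.0, 1.0],      # example extrinsic rotation only
              [-1.0, 0.0, 0.0],
              [0.0, -1.0, 0.0]])
print(gaze_in_vehicle_frame(np.array([0.0, 0.0, 1.0]), R))
```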
  • Patent number: 11830259
    Abstract: State information can be determined for a subject in a way that is robust to different inputs or conditions. For drowsiness, facial landmarks can be determined from captured image data and used to determine a set of blink parameters. These parameters can be used, such as with a temporal network, to estimate a state (e.g., drowsiness) of the subject. To improve robustness, an eye state determination network can determine eye state from the image data without reliance on intermediate landmarks; this eye state can be used, such as with another temporal network, to estimate the state of the subject. A weighted combination of these values can be used to determine an overall state of the subject. To improve accuracy, individual behavior patterns and context information can be utilized to account for variations in the data due to subject variation or current context rather than changes in state.
    Type: Grant
    Filed: August 24, 2021
    Date of Patent: November 28, 2023
    Assignee: NVIDIA Corporation
    Inventors: Yuzhuo Ren, Niranjan Avadhanam
  • Publication number: 20230351807
    Abstract: A machine learning model (MLM) may be trained and evaluated. Attribute-based performance metrics may be analyzed to identify attributes for which the MLM is performing below a threshold when each are present in a sample. A generative neural network (GNN) may be used to generate samples including compositions of the attributes, and the samples may be used to augment the data used to train the MLM. This may be repeated until one or more criteria are satisfied. In various examples, a temporal sequence of data items, such as frames of a video, may be generated which may form samples of the data set. Sets of attribute values may be determined based on one or more temporal scenarios to be represented in the data set, and one or more GNNs may be used to generate the sequence to depict information corresponding to the attribute values.
    Type: Application
    Filed: May 2, 2022
    Publication date: November 2, 2023
    Inventors: Yuzhuo Ren, Weili Nie, Arash Vahdat, Animashree Anandkumar, Nishant Puri, Niranjan Avadhanam
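
The train/evaluate/generate loop might be skeletonized as below. The per-attribute metric and the sample generator are stubs standing in for the trained MLM's evaluation and for the generative neural network; the attribute names and thresholds are invented for the example.

```python
import random

def attribute_metrics(model, val_set):
    """Stub: accuracy per attribute composition. A real system would slice
    the validation set by the attribute values present in each sample."""
    return {("night", "glasses"): random.uniform(0.5, 1.0),
            ("day", "no_glasses"): random.uniform(0.5, 1.0)}

def generate_samples(attributes, n):
    """Stub standing in for the generative neural network (GNN)."""
    return [{"attrs": attributes, "image": None} for _ in range(n)]

def augment_until_satisfied(model, train_set, val_set, threshold=0.8, rounds=5):
    for _ in range(rounds):
        # (train `model` on `train_set` here)
        weak = [attrs for attrs, acc in attribute_metrics(model, val_set).items()
                if acc < threshold]
        if not weak:
            break                            # stopping criteria satisfied
        for attrs in weak:                   # synthesize the hard compositions
            train_set.extend(generate_samples(attrs, n=100))
    return model, train_set
```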
  • Patent number: 11803759
    Abstract: Apparatuses, systems, and techniques are described to determine locations of objects using images including digital representations of those objects. In at least one embodiment, a gaze of one or more occupants of a vehicle is determined independently of a location of one or more sensors used to detect those occupants.
    Type: Grant
    Filed: October 11, 2021
    Date of Patent: October 31, 2023
    Assignee: NVIDIA Corporation
    Inventors: Feng Hu, Niranjan Avadhanam, Yuzhuo Ren, Sujay Yadawadkar, Sakthivel Sivaraman, Hairong Jiang, Siyue Wu
  • Publication number: 20230319218
    Abstract: In various examples, a state machine is used to select between a default seam placement or dynamic seam placement that avoids salient regions, and to enable and disable dynamic seam placement based on speed of ego-motion, direction of ego-motion, proximity to salient objects, active viewport, driver gaze, and/or other factors. Images representing overlapping views of an environment may be aligned to create an aligned composite image or surface (e.g., a panorama, a 360° image, bowl shaped surface) with overlapping regions of image data, and a default or dynamic seam placement may be selected based on driving scenario (e.g., driving direction, speed, proximity to nearby objects). As such, seams may be positioned in the overlapping regions of image data, and the image data may be blended at the seams to create a stitched image or surface (e.g., a stitched panorama, stitched 360° image, stitched textured surface).
    Type: Application
    Filed: February 23, 2023
    Publication date: October 5, 2023
    Inventors: Yuzhuo Ren, Nuri Murat Arar, Orazio Gallo, Jan Kautz, Niranjan Avadhanam, Hang Su
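
The selection logic could be a small state machine like the sketch below; the specific thresholds and inputs are invented stand-ins for the factors the abstract lists (speed, direction of ego-motion, proximity, active viewport).

```python
from enum import Enum

class SeamMode(Enum):
    DEFAULT = 0     # fixed seam placement
    DYNAMIC = 1     # seams re-routed around salient regions

def next_mode(speed_mps, reversing, nearest_object_m, viewport_active):
    """Toy transition rule: enable dynamic seams for slow or close maneuvering
    while the surround view is actually being watched."""
    if viewport_active and (reversing or speed_mps < 5.0 or nearest_object_m < 3.0):
        return SeamMode.DYNAMIC
    return SeamMode.DEFAULT     # e.g., highway speeds: cheap fixed seams

print(next_mode(speed_mps=2.0, reversing=True,
                nearest_object_m=1.5, viewport_active=True))
```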
  • Publication number: 20230316458
    Abstract: In various examples, dynamic seam placement is used to position seams in regions of overlapping image data to avoid crossing salient objects or regions. Objects may be detected from image frames representing overlapping views of an environment surrounding an ego-object such as a vehicle. The images may be aligned to create an aligned composite image or surface (e.g., a panorama, a 360° image, bowl shaped surface) with regions of overlapping image data, and a representation of the detected objects and/or salient regions (e.g., a saliency mask) may be generated and projected onto the aligned composite image or surface. Seams may be positioned in the overlapping regions to avoid or minimize crossing salient pixels represented in the projected masks, and the image data may be blended at the seams to create a stitched image or surface (e.g., a stitched panorama, stitched 360° image, stitched textured surface).
    Type: Application
    Filed: February 23, 2023
    Publication date: October 5, 2023
    Inventors: Yuzhuo Ren, Kenneth Turkowski, Nuri Murat Arar, Orazio Gallo, Jan Kautz, Niranjan Avadhanam, Hang Su
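
Positioning a seam to avoid salient pixels can be framed as a minimum-cost path through the overlap region's saliency mask; a dynamic-programming sketch in the style of seam carving is below (the cost design is an assumption, not the patented method).

```python
import numpy as np

def min_cost_seam(saliency):
    """Vertical seam (one x per row) through an overlap region that crosses
    as little saliency as possible, via dynamic programming."""
    h, w = saliency.shape
    cost = saliency.astype(np.float64).copy()
    for y in range(1, h):                     # accumulate cheapest path costs
        for x in range(w):
            lo, hi = max(0, x - 1), min(w, x + 2)
            cost[y, x] += cost[y - 1, lo:hi].min()
    seam = [int(np.argmin(cost[-1]))]         # backtrack from the cheapest end
    for y in range(h - 2, -1, -1):
        lo, hi = max(0, seam[-1] - 1), min(w, seam[-1] + 2)
        seam.append(lo + int(np.argmin(cost[y, lo:hi])))
    return seam[::-1]

mask = np.zeros((6, 8))
mask[2:4, 3:5] = 1.0                          # a salient object to route around
print(min_cost_seam(mask))                    # seam stays clear of columns 3-4
```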
  • Patent number: 11657535
    Abstract: Systems and methods for automatic camera calibration without using a robotic actuator or similar hardware. An electronic display screen projects an image of a simulated three-dimensional calibration pattern, such as a checkerboard, oriented in a particular pose. The camera captures an image of the calibration pattern displayed on the screen, and this image, together with the transform of the simulated three-dimensional calibration pattern, is used to calibrate the camera. Multiple pictures of different poses are used to determine the set of poses that produces the lowest reprojection error. To aid in selecting different poses, i.e., spatial positions and orientations of the simulated three-dimensional calibration pattern, poses may be selected from only the portion of the camera's field of view that is expected to be used in typical operation.
    Type: Grant
    Filed: October 15, 2019
    Date of Patent: May 23, 2023
    Assignee: NVIDIA Corporation
    Inventors: Feng Hu, Yuzhuo Ren, Niranjan Avadhanam, Ankit Pashiney
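
The pose-selection loop pairs each candidate set of simulated pattern poses with its calibration result and keeps the set with the lowest reprojection error; OpenCV's calibrateCamera conveniently returns that RMS error, as sketched below (the data plumbing around it is hypothetical).

```python
import cv2

def calibrate_and_score(object_pts_per_pose, image_pts_per_pose, image_size):
    """Calibrate from one candidate set of simulated checkerboard poses and
    return OpenCV's RMS reprojection error along with the intrinsics."""
    rms, K, dist, _, _ = cv2.calibrateCamera(
        object_pts_per_pose, image_pts_per_pose, image_size, None, None)
    return rms, K, dist

# candidate_pose_sets would pair each pose set's known 3D pattern corners
# (float32, Nx3) with the corners detected in the captured screen images
# (float32, Nx2); keep whichever set scores lowest:
# best = min((calibrate_and_score(o, i, (1280, 720))
#             for o, i in candidate_pose_sets), key=lambda r: r[0])
```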
  • Publication number: 20230091371
    Abstract: A neural network system leverages dual attention, specifically both spatial attention and channel attention, to jointly estimate the heart rate and respiratory rate of a subject by processing images of the subject. A motion neural network receives images of the subject and estimates heart and breath rates using both spatial and channel domain attention masks to focus processing on particular feature data. An appearance neural network computes a spatial attention mask from the images of the subject, which may indicate that features associated with the subject's face (as opposed to the subject's hair or shoulders) should be used to accurately estimate the heart and/or breath rate. Channel-wise domain attention is learned during training and recalibrates channel-wise feature responses to select the most informative features for processing. The learned channel attention mask can be used for different subjects during deployment.
    Type: Application
    Filed: September 20, 2021
    Publication date: March 23, 2023
    Inventors: Yuzhuo Ren, Niranjan Avadhanam, Rajath Bellipady Shetty
  • Publication number: 20230065491
    Abstract: State information can be determined for a subject in a way that is robust to different inputs or conditions. For drowsiness, facial landmarks can be determined from captured image data and used to determine a set of blink parameters. These parameters can be used, such as with a temporal network, to estimate a state (e.g., drowsiness) of the subject. To improve robustness, an eye state determination network can determine eye state from the image data without reliance on intermediate landmarks; this eye state can be used, such as with another temporal network, to estimate the state of the subject. A weighted combination of these values can be used to determine an overall state of the subject. To improve accuracy, individual behavior patterns and context information can be utilized to account for variations in the data due to subject variation or current context rather than changes in state.
    Type: Application
    Filed: August 24, 2021
    Publication date: March 2, 2023
    Inventors: Yuzhuo Ren, Niranjan Avadhanam
  • Publication number: 20230064049
    Abstract: Interactions with virtual systems may be difficult when users inadvertently fail to provide sufficient information to proceed with their requests. Certain types of inputs, such as auditory inputs, may lack sufficient information to properly provide a response to the user. Additional information, such as image data, may enable user gestures or poses to supplement the auditory inputs to enable response generation without requesting additional information from users.
    Type: Application
    Filed: August 31, 2021
    Publication date: March 2, 2023
    Inventors: Sakthivel Sivaraman, Nishant Puri, Yuzhuo Ren, Atousa Torabi, Shubhadeep Das, Niranjan Avadhanam, Sumit Kumar Bhattacharya, Jason Roche
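
At its simplest, the fusion step fills a missing slot in the spoken request from a detected gesture rather than re-prompting the user; a toy sketch with invented slot names follows.

```python
def resolve_request(speech_slots, pointed_object):
    """If the spoken request omits its referent ("open that"), fill it from a
    detected pointing gesture instead of asking the user for more information."""
    if speech_slots.get("target") is None and pointed_object is not None:
        return {**speech_slots, "target": pointed_object}
    return speech_slots

print(resolve_request({"intent": "open", "target": None}, "sunroof"))
```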
  • Publication number: 20230065399
    Abstract: State information can be determined for a subject in a way that is robust to different inputs or conditions. For drowsiness, facial landmarks can be determined from captured image data and used to determine a set of blink parameters. These parameters can be used, such as with a temporal network, to estimate a state (e.g., drowsiness) of the subject. To improve robustness, an eye state determination network can determine eye state from the image data without reliance on intermediate landmarks; this eye state can be used, such as with another temporal network, to estimate the state of the subject. A weighted combination of these values can be used to determine an overall state of the subject. To improve accuracy, individual behavior patterns and context information can be utilized to account for variations in the data due to subject variation or current context rather than changes in state.
    Type: Application
    Filed: August 24, 2021
    Publication date: March 2, 2023
    Inventors: Yuzhuo Ren, Niranjan Avadhanam
  • Publication number: 20230024474
    Abstract: Stitching of multiple images into a composite representation can be performed using a set of stitching parameters determined based, at least in part, upon a subjective stitching quality assessment value. A stitched image can be compared against its constituent images to obtain one or more objective quality metrics. These objective quality metrics can be fed, as input, to a trained classifier, which can infer a subjective quality assessment metric for the stitched (or otherwise composited) image. This subjective quality assessment metric can be used to adjust one or more compositing parameter values in order to provide at least a minimum subjective quality assessment value for composited images.
    Type: Application
    Filed: July 20, 2021
    Publication date: January 26, 2023
    Inventors: Yuzhuo Ren, Niranjan Avadhanam
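
The adjustment loop might look like the sketch below: objective metrics for a candidate stitch are fed to a trained classifier (scikit-learn-style here) that infers the subjective score, and a compositing parameter is adjusted until the score clears a minimum. The hooks `stitch_fn` and `metrics_fn` are hypothetical.

```python
def predict_subjective_score(objective_metrics, classifier):
    """Infer a subjective quality value from objective stitching metrics
    using any scikit-learn-style trained classifier."""
    return float(classifier.predict([objective_metrics])[0])

def tune_blend_width(stitch_fn, metrics_fn, classifier, min_score=0.7):
    """Toy search over one compositing parameter: widen the blend region
    until the inferred subjective score clears the minimum."""
    for blend_width in (4, 8, 16, 32, 64):
        stitched = stitch_fn(blend_width)
        if predict_subjective_score(metrics_fn(stitched), classifier) >= min_score:
            return blend_width, stitched
    return None, None
```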
  • Publication number: 20220207756
    Abstract: In various examples, two or more cameras in an automotive surround view system generate two or more input images to be stitched, or combined, into a single stitched image. In an embodiment, to improve the quality of a stitched image, a feedback module calculates two or more scores representing errors between the stitched image and one or more input images. If a computed score indicates structural errors in the stitched image, the feedback module calculates and applies one or more geometric transforms to the one or more input images. If a computed score indicates color errors in the stitched image, the feedback module calculates and applies one or more photometric transforms to the one or more input images.
    Type: Application
    Filed: December 31, 2020
    Publication date: June 30, 2022
    Inventors: Yuzhuo Ren, Niranjan Avadhanam