Patents by Inventor Bingbing Zhuang

Bingbing Zhuang has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 11987236
    Abstract: A method provided for 3D object localization predicts pairs of 2D bounding boxes. Each pair corresponds to a detected object in each of two consecutive input monocular images. The method generates, for each detected object, a relative motion estimation specifying a relative motion between the two images. The method constructs an object cost volume by aggregating temporal features from the two images, using the pairs of 2D bounding boxes and the relative motion estimation to predict a range of object depth candidates, a confidence score for each depth candidate, and an object depth selected from the candidates. The method updates the relative motion estimation based on the object cost volume and the object depth to provide a refined object motion and a refined object depth. The method reconstructs a 3D bounding box for each detected object based on the refined object motion and refined object depth.
    Type: Grant
    Filed: August 23, 2021
    Date of Patent: May 21, 2024
    Assignee: NEC Corporation
    Inventors: Pan Ji, Buyu Liu, Bingbing Zhuang, Manmohan Chandraker, Xiangyu Chen
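The depth-from-cost-volume step in this abstract can be illustrated with a minimal sketch: one matching cost per depth candidate is converted into confidence scores by a softmax, and a soft-argmin yields the object depth. The scalar per-candidate costs and the function name are hypothetical simplifications of the aggregated temporal features, not the patented pipeline.

```python
import math

def object_depth_from_cost_volume(matching_costs, depth_candidates):
    """Soft-argmin over a (simplified) object cost volume.

    Lower matching cost means a better depth hypothesis, so confidences
    are a softmax over negated costs, and the object depth is the
    confidence-weighted mean of the candidates.
    """
    exps = [math.exp(-c) for c in matching_costs]
    total = sum(exps)
    confidences = [e / total for e in exps]
    depth = sum(w * d for w, d in zip(confidences, depth_candidates))
    return depth, confidences

# Symmetric costs around the middle candidate put the depth at 5.0.
depth, conf = object_depth_from_cost_volume([2.0, 0.1, 2.0], [4.0, 5.0, 6.0])
```

In the full method, the refined motion and depth then feed the 3D bounding box reconstruction; this sketch covers only the candidate-scoring step.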
  • Publication number: 20240153250
    Abstract: Methods and systems for training a model include training a size estimation model to generate an estimated object size using a training dataset with differing levels of annotation. Two-dimensional object detection is performed on a training image to identify an object. The training image is cropped around the object. A category-level shape reconstruction is generated using a neural radiance field model. A normalized coordinate model is trained using the training image and ground truth information from the category-level shape reconstruction.
    Type: Application
    Filed: November 1, 2023
    Publication date: May 9, 2024
    Inventors: Bingbing Zhuang, Samuel Schulter, Buyu Liu, Zhixiang Min
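The "normalized coordinate" ground truth mentioned here can be sketched as mapping points of the reconstructed shape into a canonical unit cube, the kind of target a normalized-coordinate model regresses. The point-list representation of the NeRF-derived shape is an illustrative assumption, not the patent's implementation.

```python
def normalized_object_coordinates(points):
    """Map 3D points (x, y, z) into the unit cube [0, 1]^3.

    Each axis is shifted and scaled independently by the shape's
    bounding box; a degenerate (zero-extent) axis is left unscaled.
    """
    mins = [min(p[i] for p in points) for i in range(3)]
    maxs = [max(p[i] for p in points) for i in range(3)]
    spans = [(maxs[i] - mins[i]) or 1.0 for i in range(3)]
    return [tuple((p[i] - mins[i]) / spans[i] for i in range(3))
            for p in points]

coords = normalized_object_coordinates([(0, 0, 0), (2, 4, 8), (1, 2, 4)])
```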
  • Publication number: 20240153251
    Abstract: Methods and systems for training a model include performing two-dimensional object detection on a training image to identify an object. The training image is cropped around the object. A category-level shape reconstruction is generated using a neural radiance field model. A normalized coordinate model is trained using the training image and ground truth information from the category-level shape reconstruction.
    Type: Application
    Filed: November 1, 2023
    Publication date: May 9, 2024
    Inventors: Bingbing Zhuang, Samuel Schulter, Buyu Liu, Zhixiang Min
  • Publication number: 20240071105
    Abstract: Methods and systems for training a model include pre-training a backbone model with a pre-training decoder, using an unlabeled dataset with multiple distinct sensor data modalities that derive from different sensor types. The backbone model is fine-tuned with an output decoder after pre-training, using a labeled dataset with the multiple modalities.
    Type: Application
    Filed: August 22, 2023
    Publication date: February 29, 2024
    Inventors: Samuel Schulter, Bingbing Zhuang, Vijay Kumar Baikampady Gopalkrishna, Sparsh Garg, Zhixing Zhang
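The pre-train-then-fine-tune recipe described here can be sketched with a toy one-parameter model: a "backbone" weight is first trained with a reconstruction objective on unlabeled samples pooled across modalities, then fine-tuned on labeled pairs. Every name and the 1-D setup are hypothetical stand-ins for the backbone and the two decoders.

```python
def pretrain_then_finetune(unlabeled, labeled, steps=200, lr=0.1):
    """Two-phase training of a scalar weight w (toy backbone).

    Phase 1: self-supervised reconstruction (target = the input
    itself), which needs no labels and can pool all modalities.
    Phase 2: supervised fine-tuning on (input, label) pairs.
    """
    w = 0.0
    for _ in range(steps):                  # pre-training phase
        for x in unlabeled:
            grad = 2 * (w * x - x) * x      # d/dw of (w*x - x)**2
            w -= lr * grad / len(unlabeled)
    for _ in range(steps):                  # fine-tuning phase
        for x, y in labeled:
            grad = 2 * (w * x - y) * x      # d/dw of (w*x - y)**2
            w -= lr * grad / len(labeled)
    return w
```

With labels y = 2x, the fine-tuned weight converges to 2 regardless of the pre-trained value; pre-training here just supplies a sensible initialization.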
  • Publication number: 20240037186
    Abstract: Video methods and systems include extracting features of a first modality and a second modality from a labeled first training dataset in a first domain and an unlabeled second training dataset in a second domain. A video analysis model is trained using contrastive learning on the extracted features, including optimization of a loss function that includes a cross-domain regularization part and a cross-modality regularization part.
    Type: Application
    Filed: October 11, 2023
    Publication date: February 1, 2024
    Inventors: Yi-Hsuan Tsai, Xiang Yu, Bingbing Zhuang, Manmohan Chandraker, Donghyun Kim
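The loss described, a contrastive objective with cross-domain and cross-modality regularization parts, can be sketched with a single InfoNCE-style term applied to two different feature pairings. The feature vectors, pairing scheme, and temperature are illustrative assumptions.

```python
import math

def contrastive_loss(anchor, positive, negatives, temperature=0.1):
    """InfoNCE-style term: pull the positive toward the anchor and
    push the negatives away, with a numerically safe log-sum-exp."""
    def dot(a, b):
        return sum(x * y for x, y in zip(a, b))
    logits = [dot(anchor, positive) / temperature]
    logits += [dot(anchor, n) / temperature for n in negatives]
    m = max(logits)
    log_denom = m + math.log(sum(math.exp(l - m) for l in logits))
    return -(logits[0] - log_denom)

def total_loss(feat_src, feat_tgt, feat_other_modality, negatives):
    """Cross-domain term (same modality, different domains) plus
    cross-modality term (same clip, different modalities)."""
    return (contrastive_loss(feat_src, feat_tgt, negatives)
            + contrastive_loss(feat_src, feat_other_modality, negatives))
```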
  • Publication number: 20240037187
    Abstract: Video methods and systems include extracting features of a first modality and a second modality from a labeled first training dataset in a first domain and an unlabeled second training dataset in a second domain. A video analysis model is trained using contrastive learning on the extracted features, including optimization of a loss function that includes a cross-domain regularization part and a cross-modality regularization part.
    Type: Application
    Filed: October 11, 2023
    Publication date: February 1, 2024
    Inventors: Yi-Hsuan Tsai, Xiang Yu, Bingbing Zhuang, Manmohan Chandraker, Donghyun Kim
  • Publication number: 20240037188
    Abstract: Video methods and systems include extracting features of a first modality and a second modality from a labeled first training dataset in a first domain and an unlabeled second training dataset in a second domain. A video analysis model is trained using contrastive learning on the extracted features, including optimization of a loss function that includes a cross-domain regularization part and a cross-modality regularization part.
    Type: Application
    Filed: October 11, 2023
    Publication date: February 1, 2024
    Inventors: Yi-Hsuan Tsai, Xiang Yu, Bingbing Zhuang, Manmohan Chandraker, Donghyun Kim
  • Patent number: 11694311
    Abstract: A computer-implemented method executed by at least one processor for applying rolling shutter (RS)-aware spatially varying differential homography fields for simultaneous RS distortion removal and image stitching is presented. The method includes inputting two consecutive frames including RS distortions from a video stream, performing keypoint detection and matching to extract correspondences between the two consecutive frames, feeding the correspondences between the two consecutive frames into an RS-aware differential homography estimation component to filter out outlier correspondences, sending inlier correspondences to an RS-aware spatially varying differential homography field estimation component to compute an RS-aware spatially varying differential homography field, and using the RS-aware spatially varying differential homography field in an RS stitching and correction component to produce stitched images with removal of the RS distortions.
    Type: Grant
    Filed: February 23, 2021
    Date of Patent: July 4, 2023
    Inventors: Bingbing Zhuang, Quoc-Huy Tran
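The rolling-shutter geometry behind this abstract can be illustrated in its simplest form: each image row is exposed at a slightly different time, so a point on row y is displaced by the camera motion accumulated over that row's time offset, and correction applies a per-row (spatially varying) inverse displacement. The constant image-space velocity is an illustrative assumption; the patent's spatially varying homography field generalizes this per-row translation to full homographies.

```python
def rs_row_correction(points, velocity, readout_time, image_height):
    """Undo rolling-shutter displacement with a per-row translation.

    Row y is captured (y / image_height) * readout_time after row 0,
    so its content has drifted by velocity * that offset.
    """
    corrected = []
    for x, y in points:
        t = (y / image_height) * readout_time   # row capture-time offset
        corrected.append((x - velocity[0] * t, y - velocity[1] * t))
    return corrected
```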
  • Publication number: 20230154104
    Abstract: A method for achieving high-fidelity novel view synthesis and 3D reconstruction for large-scale scenes is presented. The method includes obtaining images from a video stream received from a plurality of video image capturing devices, grouping the images into different image clusters representing a large-scale 3D scene, training a neural radiance field (NeRF) and an uncertainty multilayer perceptron (MLP) for each of the image clusters to generate a plurality of NeRFs and a plurality of uncertainty MLPs for the large-scale 3D scene, applying a rendering loss and an entropy loss to the plurality of NeRFs, performing uncertainty-based fusion on the plurality of NeRFs to define a fused NeRF, jointly fine-tuning the plurality of NeRFs and the plurality of uncertainty MLPs, and, during inference, applying the fused NeRF for novel view synthesis of the large-scale 3D scene.
    Type: Application
    Filed: October 11, 2022
    Publication date: May 18, 2023
    Inventors: Bingbing Zhuang, Samuel Schulter, Yi-Hsuan Tsai, Buyu Liu, Nanbo Li
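The uncertainty-based fusion step can be sketched for a single ray: each cluster's NeRF predicts a value plus an uncertainty from its MLP, and the fused prediction weights each NeRF by inverse variance so confident NeRFs dominate. The scalar stand-in for a rendered RGB sample is an illustrative simplification.

```python
def uncertainty_fused_render(predictions, uncertainties):
    """Inverse-variance weighted fusion of per-NeRF predictions."""
    weights = [1.0 / (u * u) for u in uncertainties]
    total = sum(weights)
    return sum(w * p for w, p in zip(weights, predictions)) / total
```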
  • Publication number: 20230081913
    Abstract: Systems and methods are provided for multi-modal test-time adaptation. The method includes inputting a digital image into a pre-trained Camera Intra-modal Pseudo-label Generator, and inputting a point cloud set into a pre-trained Lidar Intra-modal Pseudo-label Generator. The method further includes applying a fast 2-dimensional (2D) model, and a slow 2D model, to the inputted digital image to apply pseudo-labels, and applying a fast 3-dimensional (3D) model, and a slow 3D model, to the inputted point cloud set to apply pseudo-labels. The method further includes fusing pseudo-label predictions from the fast models and the slow models through an Inter-modal Pseudo-label Refinement module to obtain robust pseudo-labels, and measuring a prediction consistency for the pseudo-labels.
    Type: Application
    Filed: September 6, 2022
    Publication date: March 16, 2023
    Inventors: Yi-Hsuan Tsai, Bingbing Zhuang, Samuel Schulter, Buyu Liu, Sparsh Garg, Ramin Moslemi, Inkyu Shin
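The inter-modal refinement and consistency check can be sketched at the level of one sample's class probabilities: average the camera-branch and lidar-branch predictions into a fused pseudo-label, and flag the sample as consistent only when the two modalities agree on the argmax. The two-class probability lists are illustrative, and the fast/slow model pairs with their update rules are omitted.

```python
def fuse_pseudo_labels(probs_2d, probs_3d):
    """Fuse 2D (camera) and 3D (lidar) class probabilities.

    Returns the fused pseudo-label and whether the two modalities'
    own argmax predictions agree (a simple consistency measure).
    """
    fused = [(a + b) / 2 for a, b in zip(probs_2d, probs_3d)]
    label = max(range(len(fused)), key=fused.__getitem__)
    agree = (max(range(len(probs_2d)), key=probs_2d.__getitem__)
             == max(range(len(probs_3d)), key=probs_3d.__getitem__))
    return label, agree
```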
  • Patent number: 11599974
    Abstract: A method for jointly removing rolling shutter (RS) distortions and blur artifacts in a single input RS and blurred image is presented. The method includes generating a plurality of RS blurred images from a camera, synthesizing RS blurred images from a set of global shutter (GS) sharp images, corresponding GS sharp depth maps, and synthesized RS camera motions by employing a structure-and-motion-aware RS distortion and blur rendering module to generate training data to train a single-view joint RS correction and deblurring convolutional neural network (CNN), and predicting an RS rectified and deblurred image from the single input RS and blurred image by employing the single-view joint RS correction and deblurring CNN.
    Type: Grant
    Filed: November 5, 2020
    Date of Patent: March 7, 2023
    Inventors: Quoc-Huy Tran, Bingbing Zhuang, Pan Ji, Manmohan Chandraker
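The training-data synthesis idea, rendering blurred images from sharp ones and a known motion, can be sketched in one dimension: a blurred signal is the average of the sharp signal shifted along the camera motion over the exposure. The integer per-step shift and circular boundary are illustrative simplifications of the structure-and-motion-aware rendering module.

```python
def synthesize_motion_blur(sharp, shift_per_step, steps):
    """Average `steps` shifted copies of a sharp 1-D signal.

    Each step shifts by `shift_per_step` samples (wrapping at the
    border), mimicking blur accumulated over the exposure time.
    """
    n = len(sharp)
    blurred = [0.0] * n
    for s in range(steps):
        offset = s * shift_per_step
        for i in range(n):
            blurred[i] += sharp[(i - offset) % n]
    return [b / steps for b in blurred]
```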
  • Patent number: 11455813
    Abstract: Systems and methods are provided for producing a road layout model. The method includes capturing digital images having a perspective view, converting each of the digital images into top-down images, and conveying a top-down image of time t to a neural network that performs a feature transform to form a feature map of time t. The method also includes transferring the feature map of the top-down image of time t to a feature transform module to warp the feature map to a time t+1, and conveying a top-down image of time t+1 to form a feature map of time t+1. The method also includes combining the warped feature map of time t with the feature map of time t+1 to form a combined feature map, transferring the combined feature map to a long short-term memory (LSTM) module to generate the road layout model, and displaying the road layout model.
    Type: Grant
    Filed: November 12, 2020
    Date of Patent: September 27, 2022
    Inventors: Buyu Liu, Bingbing Zhuang, Samuel Schulter, Manmohan Chandraker
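The temporal step of this pipeline can be sketched on a tiny top-down feature map: warp the time-t map by the ego-motion into frame t+1 (here an integer row shift with wrap-around, a strong simplification of the feature transform module), then combine it with the time-t+1 map before the LSTM (omitted).

```python
def warp_and_combine(feat_t, feat_t1, row_shift):
    """Warp the time-t feature map by `row_shift` rows, then average
    it element-wise with the time-t+1 feature map."""
    h = len(feat_t)
    warped = [feat_t[(r - row_shift) % h] for r in range(h)]
    return [[(a + b) / 2 for a, b in zip(w_row, row)]
            for w_row, row in zip(warped, feat_t1)]
```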
  • Publication number: 20220148220
    Abstract: A computer-implemented method for fusing geometrical and Convolutional Neural Network (CNN) relative camera pose is provided. The method includes receiving two images having different camera poses. The method further includes inputting the two images into a geometric solver branch to return, as a first solution, an estimated camera pose and an associated pose uncertainty value determined from a Jacobian of a reprojection error function. The method also includes inputting the two images into a CNN branch to return, as a second solution, a predicted camera pose and an associated pose uncertainty value. The method additionally includes fusing, by a processor device, the first solution and the second solution in a probabilistic manner using Bayes' rule to obtain a fused pose.
    Type: Application
    Filed: November 5, 2021
    Publication date: May 12, 2022
    Inventors: Bingbing Zhuang, Manmohan Chandraker
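The probabilistic fusion via Bayes' rule has a closed form when each branch's estimate is modeled as a Gaussian: the posterior mean is the inverse-variance-weighted average, and the posterior variance shrinks below both inputs. The scalar pose is an illustrative stand-in for a full pose with covariance.

```python
def fuse_gaussian_poses(pose_geo, var_geo, pose_cnn, var_cnn):
    """Product of two Gaussian pose beliefs (geometric solver branch
    and CNN branch), per Bayes' rule."""
    w_geo, w_cnn = 1.0 / var_geo, 1.0 / var_cnn
    fused_pose = (w_geo * pose_geo + w_cnn * pose_cnn) / (w_geo + w_cnn)
    fused_var = 1.0 / (w_geo + w_cnn)
    return fused_pose, fused_var
```

With equal variances the fused pose is the midpoint; as one branch's uncertainty grows, the result leans toward the other branch.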
  • Publication number: 20220147761
    Abstract: Video methods and systems include extracting features of a first modality and a second modality from a labeled first training dataset in a first domain and an unlabeled second training dataset in a second domain. A video analysis model is trained using contrastive learning on the extracted features, including optimization of a loss function that includes a cross-domain regularization part and a cross-modality regularization part.
    Type: Application
    Filed: November 8, 2021
    Publication date: May 12, 2022
    Inventors: Yi-Hsuan Tsai, Xiang Yu, Bingbing Zhuang, Manmohan Chandraker, Donghyun Kim
  • Publication number: 20220147746
    Abstract: A computer-implemented method for road layout prediction is provided. The method includes segmenting, by a first processor-based element, an RGB image to output pixel-level semantic segmentation results for the RGB image in a perspective view for both visible and occluded pixels in the perspective view based on contextual clues. The method further includes learning, by a second processor-based element, a mapping from the pixel-level semantic segmentation results for the RGB image in the perspective view to a top view of the RGB image using a road plane assumption. The method also includes generating, by a third processor-based element, an occlusion-aware parametric road layout prediction for road layout related attributes in the top view.
    Type: Application
    Filed: November 8, 2021
    Publication date: May 12, 2022
    Inventors: Buyu Liu, Bingbing Zhuang, Manmohan Chandraker
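The road plane assumption behind the perspective-to-top-view mapping has simple geometry: for a pixel below the horizon, a flat ground plane gives its forward distance as Z = f * h / (v - v_horizon). The pinhole model with zero camera tilt is an illustrative assumption.

```python
def pixel_row_to_ground_distance(row, horizon_row, focal_px, cam_height):
    """Forward distance (same units as cam_height) of the ground point
    seen at image `row`, under a flat-road, zero-tilt pinhole model."""
    dv = row - horizon_row
    if dv <= 0:
        raise ValueError("pixel at or above the horizon never hits the road")
    return focal_px * cam_height / dv
```

Rows just below the horizon map to far-away road, which is why top-view resolution degrades with distance.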
  • Publication number: 20220111869
    Abstract: Methods and systems for determining a path include detecting objects within a perspective image that shows a scene. Depth is predicted within the perspective image. Semantic segmentation is performed on the perspective image. An attention map is generated using the detected objects and the predicted depth. A refined top-down view of the scene is generated using the predicted depth and the semantic segmentation. A parametric top-down representation of the scene is determined using a relational graph model. A path through the scene is determined using the parametric top-down representation.
    Type: Application
    Filed: October 6, 2021
    Publication date: April 14, 2022
    Inventors: Buyu Liu, Pan Ji, Bingbing Zhuang, Manmohan Chandraker, Uday Kusupati
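The final step, determining a path through the scene, can be sketched once the parametric top-down representation is rasterized into an occupancy grid: a breadth-first search then returns a shortest free path. The grid encoding (1 = blocked) and 4-connectivity are illustrative assumptions, not the patent's planner.

```python
from collections import deque

def plan_path(grid, start, goal):
    """Shortest 4-connected path on an occupancy grid via BFS.

    Returns a list of (row, col) cells from start to goal, or None
    if the goal is unreachable.
    """
    rows, cols = len(grid), len(grid[0])
    parent = {start: None}
    frontier = deque([start])
    while frontier:
        cell = frontier.popleft()
        if cell == goal:
            path = []
            while cell is not None:      # walk parents back to start
                path.append(cell)
                cell = parent[cell]
            return path[::-1]
        r, c = cell
        for nxt in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
            nr, nc = nxt
            if (0 <= nr < rows and 0 <= nc < cols
                    and grid[nr][nc] == 0 and nxt not in parent):
                parent[nxt] = cell
                frontier.append(nxt)
    return None
```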
  • Publication number: 20220063605
    Abstract: A method provided for 3D object localization predicts pairs of 2D bounding boxes. Each pair corresponds to a detected object in each of two consecutive input monocular images. The method generates, for each detected object, a relative motion estimation specifying a relative motion between the two images. The method constructs an object cost volume by aggregating temporal features from the two images, using the pairs of 2D bounding boxes and the relative motion estimation to predict a range of object depth candidates, a confidence score for each depth candidate, and an object depth selected from the candidates. The method updates the relative motion estimation based on the object cost volume and the object depth to provide a refined object motion and a refined object depth. The method reconstructs a 3D bounding box for each detected object based on the refined object motion and refined object depth.
    Type: Application
    Filed: August 23, 2021
    Publication date: March 3, 2022
    Inventors: Pan Ji, Buyu Liu, Bingbing Zhuang, Manmohan Chandraker, Xiangyu Chen
  • Patent number: 11222409
    Abstract: A method for correcting blur effects is presented. The method includes generating a plurality of images from a camera, synthesizing blurred images from sharp image counterparts to generate training data to train a structure-and-motion-aware convolutional neural network (CNN), and predicting a camera motion and a depth map from a single blurred image by employing the structure-and-motion-aware CNN to remove blurring from the single blurred image.
    Type: Grant
    Filed: May 6, 2020
    Date of Patent: January 11, 2022
    Inventors: Quoc-Huy Tran, Bingbing Zhuang, Pan Ji, Manmohan Chandraker
  • Patent number: 11132586
    Abstract: A method for correcting rolling shutter (RS) effects is presented. The method includes generating a plurality of images from a camera, synthesizing RS images from global shutter (GS) counterparts to generate training data to train a structure-and-motion-aware convolutional neural network (CNN), and predicting an RS camera motion and an RS depth map from a single RS image by employing the structure-and-motion-aware CNN to remove RS distortions from the single RS image.
    Type: Grant
    Filed: October 4, 2019
    Date of Patent: September 28, 2021
    Inventors: Quoc-Huy Tran, Bingbing Zhuang, Pan Ji, Manmohan Chandraker
  • Publication number: 20210279843
    Abstract: A computer-implemented method executed by at least one processor for applying rolling shutter (RS)-aware spatially varying differential homography fields for simultaneous RS distortion removal and image stitching is presented. The method includes inputting two consecutive frames including RS distortions from a video stream, performing keypoint detection and matching to extract correspondences between the two consecutive frames, feeding the correspondences between the two consecutive frames into an RS-aware differential homography estimation component to filter out outlier correspondences, sending inlier correspondences to an RS-aware spatially varying differential homography field estimation component to compute an RS-aware spatially varying differential homography field, and using the RS-aware spatially varying differential homography field in an RS stitching and correction component to produce stitched images with removal of the RS distortions.
    Type: Application
    Filed: February 23, 2021
    Publication date: September 9, 2021
    Inventors: Bingbing Zhuang, Quoc-Huy Tran