  • Publication number: 20210150410
    Abstract: Systems and methods for predicting instance geometry are provided. A method includes obtaining an input image depicting at least one object. The method includes determining an instance mask for the object by inputting the input image into a machine-learned instance segmentation model. The method includes determining an initial polygon with a number of initial vertices outlining the border of the object within the input image. The method includes obtaining a feature embedding for one or more pixels of the input image and determining a vertex embedding including a feature embedding for each pixel corresponding to an initial vertex of the initial polygon. The method includes determining a vertex offset for each initial vertex of the initial polygon based on the vertex embedding and applying the vertex offset to the initial polygon to obtain one or more enhanced polygons.
    Type: Application
    Filed: August 31, 2020
    Publication date: May 20, 2021
    Inventors: Justin Liang, Namdar Homayounfar, Wei-Chiu Ma, Yuwen Xiong, Raquel Urtasun
  • Publication number: 20210150722
    Abstract: Disclosed herein are methods and systems for performing instance segmentation that can provide improved estimation of object boundaries. Implementations can include a machine-learned segmentation model trained to estimate an initial object boundary based on a truncated signed distance function (TSDF) generated by the model. The model can also generate outputs for optimizing the TSDF over a series of iterations to produce a final TSDF that can be used to determine the segmentation mask.
    Type: Application
    Filed: September 10, 2020
    Publication date: May 20, 2021
    Inventors: Namdar Homayounfar, Yuwen Xiong, Justin Liang, Wei-Chiu Ma, Raquel Urtasun
  • Publication number: 20200160537
    Abstract: Systems, methods, tangible non-transitory computer-readable media, and devices associated with motion flow estimation are provided. For example, scene data including representations of an environment over a first set of time intervals can be accessed. Extracted visual cues can be generated based on the representations and machine-learned feature extraction models. At least one of the machine-learned feature extraction models can be configured to generate a portion of the extracted visual cues based on a first set of the representations of the environment from a first perspective and a second set of the representations of the environment from a second perspective. The extracted visual cues can be encoded using energy functions. Three-dimensional motion estimates of object instances at time intervals subsequent to the first set of time intervals can be determined based on the energy functions and machine-learned inference models.
    Type: Application
    Filed: August 5, 2019
    Publication date: May 21, 2020
    Inventors: Raquel Urtasun, Wei-Chiu Ma, Shenlong Wang, Yuwen Xiong, Rui Hu
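
The abstract of publication 20210150410 describes gathering a feature embedding at each initial polygon vertex and then shifting each vertex by a predicted offset. The following is a minimal Python sketch of those two steps only, not the method claimed in the application. The function names `gather_vertex_embedding` and `apply_vertex_offsets`, the nested-list feature map layout, and the nearest-pixel lookup are all illustrative assumptions; the machine-learned model that would predict the offsets is omitted, so the offsets are supplied by hand.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float]  # (x, y) in pixel coordinates


def gather_vertex_embedding(
    feature_map: Sequence[Sequence[Sequence[float]]],
    polygon: Sequence[Point],
) -> List[Sequence[float]]:
    """Return the feature vector of the pixel nearest to each vertex.

    Assumption: feature_map[row][col] holds that pixel's feature vector.
    Vertices outside the image are clamped to the nearest border pixel.
    """
    height = len(feature_map)
    width = len(feature_map[0])
    embedding = []
    for x, y in polygon:
        col = min(max(int(round(x)), 0), width - 1)
        row = min(max(int(round(y)), 0), height - 1)
        embedding.append(feature_map[row][col])
    return embedding


def apply_vertex_offsets(
    polygon: Sequence[Point],
    offsets: Sequence[Point],
) -> List[Point]:
    """Shift each vertex by its (dx, dy) offset to get the adjusted polygon."""
    if len(polygon) != len(offsets):
        raise ValueError("expected exactly one offset per vertex")
    return [(x + dx, y + dy) for (x, y), (dx, dy) in zip(polygon, offsets)]


if __name__ == "__main__":
    # Hypothetical 2x2 image with a 1-dimensional feature per pixel.
    features = [[[1.0], [2.0]], [[3.0], [4.0]]]
    initial = [(0.0, 0.0), (1.2, 0.4), (1.0, 1.0)]
    print(gather_vertex_embedding(features, initial))
    # Hand-written offsets stand in for the model's predictions.
    print(apply_vertex_offsets(initial, [(0.5, 0.0), (0.0, 0.5), (-0.5, 0.0)]))
```

In a real system the offsets would come from a model that consumes the vertex embedding; the sketch keeps the two steps separate so that either could be replaced independently.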