Patents by Inventor Stefano Soatto

Stefano Soatto has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 10878234
    Abstract: Techniques for automated form understanding via layout-agnostic identification of keys and corresponding values are described. An embedding generator creates embeddings of pixels from an image including a representation of a form. The generated embeddings are similar for pixels within a same key-value unit, and far apart for pixels not in a same key-value unit. A weighted bipartite graph is constructed including a first set of nodes corresponding to keys of the form and a second set of nodes corresponding to values of the form. Weights for the edges are determined based on an analysis of distances between ones of the embeddings. The graph is partitioned according to a scheme to identify pairings between the first set of nodes and the second set of nodes that produces a minimum overall edge weight. The pairings indicate keys and values that are associated within the form.
    Type: Grant
    Filed: November 20, 2018
    Date of Patent: December 29, 2020
    Assignee: Amazon Technologies, Inc.
    Inventors: Amit Adam, Oron Anschel, Hadar Averbuch Elor, Shai Mazor, Gal Sabina Star, Or Perel, Wendy Tse, Andrea Olgiati, Rahul Bhotika, Stefano Soatto
  • Patent number: 10872236
    Abstract: Techniques for layout-agnostic clustering-based classification of document keys and values are described. A key-value differentiation unit generates feature vectors corresponding to text elements of a form represented within an electronic image using a machine learning (ML) model. The ML model was trained utilizing a loss function that separates keys from values. The feature vectors are clustered into at least two clusters, and a cluster is determined to include either keys of the form or values of the form via identifying neighbors between feature vectors of the cluster(s) with labeled feature vectors.
    Type: Grant
    Filed: September 28, 2018
    Date of Patent: December 22, 2020
    Assignee: Amazon Technologies, Inc.
    Inventors: Hadar Averbuch Elor, Oron Anschel, Or Perel, Amit Adam, Shai Mazor, Rahul Bhotika, Stefano Soatto
  • Patent number: 10839245
    Abstract: A structured document analyzer that associates keys and values in structured documents based on key, value, and key-value container bounding boxes. A trained machine learning model analyzes images of structured documents to determine bounding boxes for keys, values, and key-value containers in the images with confidence scores for the classifications. For each image, duplicate bounding boxes are removed, and then a set of key-value containers are selected and sorted based on the confidence scores. For each key-value container, a best key and value are determined for the container based on overlap of the key and value bounding boxes with the container bounding box and the confidence scores. Optical character recognition may be performed on the image to determine text for the keys and values.
    Type: Grant
    Filed: March 25, 2019
    Date of Patent: November 17, 2020
    Assignee: Amazon Technologies, Inc.
    Inventors: Guneet Singh Dhillon, Vijay Mahadevan, Yuting Zhang, Meng Wang, Gangadhar Payyavula, Viet Cuong Nguyen, Rahul Bhotika, Stefano Soatto
  • Patent number: 10762644
    Abstract: Techniques for multiple object tracking in video are described in which the outputs of neural networks are combined within a Bayesian framework. A motion model is applied to a probability distribution representing the estimated current state of a target object being tracked to predict the state of the target object in the next frame. A state of an object can include one or more features, such as the location of the object in the frame, a velocity and/or acceleration of the object across frames, a classification of the object, etc. The prediction of the state of the target object in the next frame is adjusted by a score based on the combined outputs of neural networks that process the next frame.
    Type: Grant
    Filed: December 13, 2018
    Date of Patent: September 1, 2020
    Assignee: Amazon Technologies, Inc.
    Inventors: Vijay Mahadevan, Stefano Soatto
  • Publication number: 20190236399
    Abstract: A new method for improving the robustness of visual-inertial integration systems (VINS) based on derivation of optimal discriminants for outlier rejection, and the consequent approximations, that are both conceptually and empirically superior to other outlier detection schemes used in this context. It should be appreciated that VINS is central to a number of application areas including augmented reality (AR), virtual reality (VR), robotics, autonomous vehicles, autonomous flying robots, and so forth and their related hardware including mobile phones, such as for use in indoor localization (in GPS-denied areas), and the like.
    Type: Application
    Filed: August 9, 2018
    Publication date: August 1, 2019
    Applicant: The Regents of the University of California
    Inventors: Stefano Soatto, Konstantine Tsotsos
  • Publication number: 20170243084
    Abstract: A variation of scale-invariant feature transform (SIFT) based on pooling gradient orientations across different domain sizes, in addition to spatial locations. The resulting descriptor is called DSP-SIFT, and it outperforms other methods in wide-baseline matching benchmarks, including those based on convolutional neural networks, despite having the same dimension as SIFT and requiring no training. Problems of local representation of imaging data are also addressed as computation of minimal sufficient statistics that are invariant to nuisance variability induced by viewpoint and illumination. A sampling-based and a point-estimate-based approximation of such representations are described.
    Type: Application
    Filed: November 7, 2016
    Publication date: August 24, 2017
    Applicant: The Regents of the University of California
    Inventors: Stefano Soatto, Jingming Dong
  • Patent number: 9456768
    Abstract: A disposable volume spirometry apparatus and non-contact measurement system is described. The spirometry system includes an expandable disposable volume spirometry apparatus, a remote non-contact sensor, memory, and a processor. The remote non-contact sensor captures images associated with the expandable disposable volume spirometry apparatus. The memory stores the captured images. The processor is operatively coupled to the memory and determines a volume for the expandable disposable volume spirometry apparatus by analyzing the captured images.
    Type: Grant
    Filed: August 18, 2011
    Date of Patent: October 4, 2016
    Inventors: Stefano Soatto, Giuseppe Torresin
  • Patent number: 9418317
    Abstract: We describe an end-to-end visual recognition system, where “end-to-end” refers to the ability of the system of performing all aspects of the system, from the construction of “maps” of scenes, or “models” of objects from training data, to the determination of the class, identity, location and other inferred parameters from test data. Our visual recognition system is capable of operating on a mobile hand-held device, such as a mobile phone, tablet or other portable device equipped with sensing and computing power. Our system employs a video based feature descriptor, and we characterize its invariance and discriminative properties. Feature selection and tracking are performed in real-time, and used to train a template-based classifier during a capture phase prompted by the user. During normal operation, the system scores objects in the field of view based on their ranking.
    Type: Grant
    Filed: April 4, 2014
    Date of Patent: August 16, 2016
    Assignee: The Regents of the University of California
    Inventors: Stefano Soatto, Taehee Lee
  • Publication number: 20160140729
    Abstract: A new method for improving the robustness of visual-inertial integration systems (VINS) based on derivation of optimal discriminants for outlier rejection, and the consequent approximations, that are both conceptually and empirically superior to other outlier detection schemes used in this context. It should be appreciated that VINS is central to a number of application areas including augmented reality (AR), virtual reality (VR), robotics, autonomous vehicles, autonomous flying robots, and so forth and their related hardware including mobile phones, such as for use in indoor localization (in GPS-denied areas), and the like.
    Type: Application
    Filed: November 4, 2015
    Publication date: May 19, 2016
    Applicant: The Regents of the University of California
    Inventors: Stefano Soatto, Konstantine Tsotsos
  • Publication number: 20140301635
    Abstract: We describe an end-to-end visual recognition system, where “end-to-end” refers to the ability of the system of performing all aspects of the system, from the construction of “maps” of scenes, or “models” of objects from training data, to the determination of the class, identity, location and other inferred parameters from test data. Our visual recognition system is capable of operating on a mobile hand-held device, such as a mobile phone, tablet or other portable device equipped with sensing and computing power. Our system employs a video based feature descriptor, and we characterize its invariance and discriminative properties. Feature selection and tracking are performed in real-time, and used to train a template-based classifier during a capture phase prompted by the user. During normal operation, the system scores objects in the field of view based on their ranking.
    Type: Application
    Filed: April 4, 2014
    Publication date: October 9, 2014
    Applicant: The Regents of the University of California
    Inventors: Stefano Soatto, Taehee Lee
  • Patent number: 8717437
    Abstract: We describe an end-to-end visual recognition system, where “end-to-end” refers to the ability of the system of performing all aspects of the system, from the construction of “maps” of scenes, or “models” of objects from training data, to the determination of the class, identity, location and other inferred parameters from test data. Our visual recognition system is capable of operating on a mobile hand-held device, such as a mobile phone, tablet or other portable device equipped with sensing and computing power. Our system employs a video based feature descriptor, and we characterize its invariance and discriminative properties. Feature selection and tracking are performed in real-time, and used to train a template-based classifier during a capture phase prompted by the user. During normal operation, the system scores objects in the field of view based on their ranking.
    Type: Grant
    Filed: January 7, 2013
    Date of Patent: May 6, 2014
    Assignee: The Regents of the University of California
    Inventors: Stefano Soatto, Taehee Lee
  • Publication number: 20120046568
    Abstract: A disposable volume spirometry apparatus and non-contact measurement system is described. The spirometry system includes an expandable disposable volume spirometry apparatus, a remote non-contact sensor, memory, and a processor. The remote non-contact sensor captures images associated with the expandable disposable volume spirometry apparatus. The memory stores the captured images. The processor is operatively coupled to the memory and determines a volume for the expandable disposable volume spirometry apparatus by analyzing the captured images.
    Type: Application
    Filed: August 18, 2011
    Publication date: February 23, 2012
    Inventors: Stefano Soatto, Giuseppe Torresin
  • Patent number: 6944327
    Abstract: The present invention provides a method and apparatus for designing and visualizing the shape of eyeglass lenses and of the front rims of eyeglass frames, and for allowing the customer to modify the design by changing the shape and style interactively to suit his/her preference and perceived character. The present invention comprises an interface for a lens grinding machine that allows the retailer to manufacture the selected frame at the store for certain specified styles, or allows the retailer to transmit the shape and style data to a manufacturer that can implement the selected design and deliver it directly to the customer. Another embodiment of the present invention is to provide a method for designing, visualizing, modifying the shape and style of eyeglass lenses and frames remotely through the Internet or other communication channel, and to transmit the design to a manufacturer, which enables the customer to select and purchase eyeglasses directly from manufacturers.
    Type: Grant
    Filed: November 3, 2000
    Date of Patent: September 13, 2005
    Inventor: Stefano Soatto
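
Illustrative sketch for patent 10878234: the abstract describes building a weighted bipartite graph between keys and values, with edge weights derived from embedding distances, and choosing the pairing with minimum overall edge weight. The Python below is only a rough sketch of that idea, not the patented method. The function name, the use of Euclidean distance, the toy 2-D embeddings, and the brute-force search are all assumptions made for illustration, and it assumes there are at least as many values as keys.

```python
from itertools import permutations
from math import dist


def pair_keys_with_values(key_embeddings, value_embeddings):
    """Return (pairs, total_weight) for the minimum-weight key-value pairing.

    Hypothetical helper: each key is paired with a distinct value so that the
    summed embedding distance is as small as possible. Brute force, so only
    suitable for a handful of keys. Returns (None, inf) if there are fewer
    values than keys.
    """
    n_keys = len(key_embeddings)
    # Edge weight (assumed): Euclidean distance between key and value embeddings.
    weights = [[dist(k, v) for v in value_embeddings] for k in key_embeddings]

    best_pairs, best_total = None, float("inf")
    # Try every way of assigning a distinct value index to each key index.
    for assignment in permutations(range(len(value_embeddings)), n_keys):
        total = sum(weights[i][j] for i, j in enumerate(assignment))
        if total < best_total:
            best_pairs, best_total = list(enumerate(assignment)), total
    return best_pairs, best_total


# Toy embeddings (made up): key 0 sits near value 1, key 1 sits near value 0.
keys = [(0.0, 0.0), (5.0, 5.0)]
values = [(5.0, 4.0), (0.0, 1.0)]
pairs, total = pair_keys_with_values(keys, values)
print(pairs, total)
```

For forms with more than a few fields, a polynomial-time assignment solver such as SciPy's `scipy.optimize.linear_sum_assignment` would be a more practical choice than enumerating permutations.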