Patents by Inventor Stefano Soatto

Stefano Soatto has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 11860977
    Abstract: Techniques for performing visual clustering with a hierarchical graph neural network framework including a joint linkage prediction and density estimation graph model are described. Embodiments herein recurrently run the joint linkage prediction and density estimation graph model to generate intermediate clusters in multiple iterations (e.g., until convergence) to obtain a final clustering result. In certain embodiments, for each iteration, the input graph contains nodes that are merged from nodes assigned to intermediate clusters from the previous iteration. By using a small and fixed bandwidth k in each iteration, embodiments herein alleviate the sensitivity to the k selection for different clustering applications. Certain embodiments herein remove the tuning of a different k (e.g., k-bandwidth) for k-nearest neighbor graph construction over different clustering applications.
    Type: Grant
    Filed: May 4, 2021
    Date of Patent: January 2, 2024
    Assignee: Amazon Technologies, Inc.
    Inventors: Yifan Xing, Tianjun Xiao, Tong He, Yongxin Wang, Yuanjun Xiong, Wei Xia, David Paul Wipf, Zheng Zhang, Stefano Soatto
  • Patent number: 11429813
    Abstract: This disclosure describes automatically selecting and training one or more models for image recognition based upon training and testing (validation) data provided by a user. A service provider network includes a recognition service that may use models to process images and videos to recognize objects in the images and videos, features on the objects in the images and videos, and/or locate objects in the images and videos. The service provider network also includes a model selection and training service that may select one or more modeling techniques based on the objectives of the user and/or the amount of data provided by the user. Based on the selected modeling technique, the model selection and training service selects and trains one or more models for use by the recognition service to process images and videos using the training data. The trained model may be tested and validated using the testing data.
    Type: Grant
    Filed: November 27, 2019
    Date of Patent: August 30, 2022
    Assignee: Amazon Technologies, Inc.
    Inventors: Avinash Aghoram Ravichandran, Rahul Bhotika, Stefano Soatto, Pietro Perona, Hao Yang
  • Patent number: 11257006
    Abstract: Techniques for auto-generation of annotated real-world training data are described. An electronic document is analyzed to determine text represented in the document and corresponding locations of the text. A representation of the electronic document is modified to include markers and printed. The printed document is photographed in real-world environments, and the markers within the digital photographs are analyzed to allow for the depiction of the document within the photographs to be rectified. The text and location data are used to annotate the rectified images.
    Type: Grant
    Filed: November 20, 2018
    Date of Patent: February 22, 2022
    Assignee: Amazon Technologies, Inc.
    Inventors: Oron Anschel, Amit Adam, Shahar Tsiper, Hadar Averbuch Elor, Shai Mazor, Rahul Bhotika, Stefano Soatto
  • Patent number: 11216697
    Abstract: Techniques for building a backward compatible and backfill-free image search system are described. According to some embodiments, a backwards compatible training system trains a new embedding model to be backward compatible with the face embeddings (e.g., floating-point vectors) generated by a previous embedding model. In one embodiment, backwards compatible training uses a classifier of the previous embedding model as a form of constraint in the training of the new embedding model.
    Type: Grant
    Filed: March 11, 2020
    Date of Patent: January 4, 2022
    Assignee: Amazon Technologies, Inc.
    Inventors: Yantao Shen, Yuanjun Xiong, Siqi Deng, Wei Xia, Shuo Yang, Yifan Xing, Wei Li, Stefano Soatto
  • Patent number: 10970530
    Abstract: Techniques for grammar-based automated generation of annotated synthetic form training data for machine learning are described. A training data generation engine utilizes a defined grammar to construct a layout for a form, select key-value units to place within the layout, and select attribute variants for the key-value units. The form is rendered and stored at a storage location, where it can be provided along with other similarly-generated forms to be used as training data for a machine learning model.
    Type: Grant
    Filed: November 13, 2018
    Date of Patent: April 6, 2021
    Assignee: Amazon Technologies, Inc.
    Inventors: Amit Adam, Oron Anschel, Or Perel, Gal Sabina Star, Omri Ben-Eliezer, Hadar Averbuch Elor, Shai Mazor, Wendy Tse, Andrea Olgiati, Rahul Bhotika, Stefano Soatto
  • Patent number: 10963754
    Abstract: Techniques for training an embedding using a limited training set are described. In some examples, the embedding is trained by generating a plurality of vectors from a random sample of the limited set of training data classes using a layer of the particular machine learning classification model, randomly selecting samples from the plurality of vectors into a set of samples, computing at least one distance for each sampled class from a center parameter for the class using the set of samples, generating a discrete probability distribution over the classes for a query point based on distances to a center parameter for each of the classes in the embedding space, calculating a loss value for the modified prototypical network, the calculation of the loss value being for a fixed geometry of the embedding space and including a measure of the difference between distributions, and back propagating.
    Type: Grant
    Filed: September 27, 2018
    Date of Patent: March 30, 2021
    Assignee: Amazon Technologies, Inc.
    Inventors: Avinash Aghoram Ravichandran, Paulo Ricardo dos Santos Mendonca, Rahul Bhotika, Stefano Soatto
  • Patent number: 10878234
    Abstract: Techniques for automated form understanding via layout-agnostic identification of keys and corresponding values are described. An embedding generator creates embeddings of pixels from an image including a representation of a form. The generated embeddings are similar for pixels within a same key-value unit, and far apart for pixels not in a same key-value unit. A weighted bipartite graph is constructed including a first set of nodes corresponding to keys of the form and a second set of nodes corresponding to values of the form. Weights for the edges are determined based on an analysis of distances between ones of the embeddings. The graph is partitioned according to a scheme to identify pairings between the first set of nodes and the second set of nodes that produces a minimum overall edge weight. The pairings indicate keys and values that are associated within the form.
    Type: Grant
    Filed: November 20, 2018
    Date of Patent: December 29, 2020
    Assignee: Amazon Technologies, Inc.
    Inventors: Amit Adam, Oron Anschel, Hadar Averbuch Elor, Shai Mazor, Gal Sabina Star, Or Perel, Wendy Tse, Andrea Olgiati, Rahul Bhotika, Stefano Soatto
  • Patent number: 10872236
    Abstract: Techniques for layout-agnostic clustering-based classification of document keys and values are described. A key-value differentiation unit generates feature vectors corresponding to text elements of a form represented within an electronic image using a machine learning (ML) model. The ML model was trained utilizing a loss function that separates keys from values. The feature vectors are clustered into at least two clusters, and a cluster is determined to include either keys of the form or values of the form via identifying neighbors between feature vectors of the cluster(s) with labeled feature vectors.
    Type: Grant
    Filed: September 28, 2018
    Date of Patent: December 22, 2020
    Assignee: Amazon Technologies, Inc.
    Inventors: Hadar Averbuch Elor, Oron Anschel, Or Perel, Amit Adam, Shai Mazor, Rahul Bhotika, Stefano Soatto
  • Patent number: 10839245
    Abstract: A structured document analyzer that associates keys and values in structured documents based on key, value, and key-value container bounding boxes. A trained machine learning model analyzes images of structured documents to determine bounding boxes for keys, values, and key-value containers in the images with confidence scores for the classifications. For each image, duplicate bounding boxes are removed, and then a set of key-value containers are selected and sorted based on the confidence scores. For each key-value container, a best key and value are determined for the container based on overlap of the key and value bounding boxes with the container bounding box and the confidence scores. Optical character recognition may be performed on the image to determine text for the keys and values.
    Type: Grant
    Filed: March 25, 2019
    Date of Patent: November 17, 2020
    Assignee: Amazon Technologies, Inc.
    Inventors: Guneet Singh Dhillon, Vijay Mahadevan, Yuting Zhang, Meng Wang, Gangadhar Payyavula, Viet Cuong Nguyen, Rahul Bhotika, Stefano Soatto
  • Patent number: 10762644
    Abstract: Techniques for multiple object tracking in video are described in which the outputs of neural networks are combined within a Bayesian framework. A motion model is applied to a probability distribution representing the estimated current state of a target object being tracked to predict the state of the target object in the next frame. A state of an object can include one or more features, such as the location of the object in the frame, a velocity and/or acceleration of the object across frames, a classification of the object, etc. The prediction of the state of the target object in the next frame is adjusted by a score based on the combined outputs of neural networks that process the next frame.
    Type: Grant
    Filed: December 13, 2018
    Date of Patent: September 1, 2020
    Assignee: Amazon Technologies, Inc.
    Inventors: Vijay Mahadevan, Stefano Soatto
  • Publication number: 20190236399
    Abstract: A new method for improving the robustness of visual-inertial integration systems (VINS) based on derivation of optimal discriminants for outlier rejection, and the consequent approximations, that are both conceptually and empirically superior to other outlier detection schemes used in this context. It should be appreciated that VINS is central to a number of application areas including augmented reality (AR), virtual reality (VR), robotics, autonomous vehicles, autonomous flying robots, and so forth and their related hardware including mobile phones, such as for use in indoor localization (in GPS-denied areas), and the like.
    Type: Application
    Filed: August 9, 2018
    Publication date: August 1, 2019
    Applicant: THE REGENTS OF THE UNIVERSITY OF CALIFORNIA
    Inventors: Stefano Soatto, Konstantine Tsotsos
  • Publication number: 20170243084
    Abstract: A variation of scale-invariant feature transform (SIFT) based on pooling gradient orientations across different domain sizes, in addition to spatial locations. The resulting descriptor is called DSP-SIFT, and it outperforms other methods in wide-baseline matching benchmarks, including those based on convolutional neural networks, despite having the same dimension of SIFT and requiring no training. Problems of local representation of imaging data are also addressed as computation of minimal sufficient statistics that are invariant to nuisance variability induced by viewpoint and illumination. A sampling-based and a point-estimate based approximation of such representations are described.
    Type: Application
    Filed: November 7, 2016
    Publication date: August 24, 2017
    Applicant: THE REGENTS OF THE UNIVERSITY OF CALIFORNIA
    Inventors: Stefano Soatto, Jingming Dong
  • Patent number: 9456768
    Abstract: A disposable volume spirometry apparatus and non-contact measurement system is described. The spirometry system includes an expandable disposable volume spirometry apparatus, a remote non-contact sensor, memory, and a processor. The remote non-contact sensor captures images associated with the expandable disposable volume spirometry apparatus. The memory stores the captured images. The processor is operatively coupled to the memory and determines a volume for the expandable disposable volume spirometry apparatus by analyzing the captured images.
    Type: Grant
    Filed: August 18, 2011
    Date of Patent: October 4, 2016
    Inventors: Stefano Soatto, Giuseppe Torresin
  • Patent number: 9418317
    Abstract: We describe an end-to-end visual recognition system, where “end-to-end” refers to the ability of the system of performing all aspects of the system, from the construction of “maps” of scenes, or “models” of objects from training data, to the determination of the class, identity, location and other inferred parameters from test data. Our visual recognition system is capable of operating on a mobile hand-held device, such as a mobile phone, tablet or other portable device equipped with sensing and computing power. Our system employs a video based feature descriptor, and we characterize its invariance and discriminative properties. Feature selection and tracking are performed in real-time, and used to train a template-based classifier during a capture phase prompted by the user. During normal operation, the system scores objects in the field of view based on their ranking.
    Type: Grant
    Filed: April 4, 2014
    Date of Patent: August 16, 2016
    Assignee: THE REGENTS OF THE UNIVERSITY OF CALIFORNIA
    Inventors: Stefano Soatto, Taehee Lee
  • Publication number: 20160140729
    Abstract: A new method for improving the robustness of visual-inertial integration systems (VINS) based on derivation of optimal discriminants for outlier rejection, and the consequent approximations, that are both conceptually and empirically superior to other outlier detection schemes used in this context. It should be appreciated that VINS is central to a number of application areas including augmented reality (AR), virtual reality (VR), robotics, autonomous vehicles, autonomous flying robots, and so forth and their related hardware including mobile phones, such as for use in indoor localization (in GPS-denied areas), and the like.
    Type: Application
    Filed: November 4, 2015
    Publication date: May 19, 2016
    Applicant: THE REGENTS OF THE UNIVERSITY OF CALIFORNIA
    Inventors: Stefano Soatto, Konstantine Tsotsos
  • Publication number: 20140301635
    Abstract: We describe an end-to-end visual recognition system, where “end-to-end” refers to the ability of the system of performing all aspects of the system, from the construction of “maps” of scenes, or “models” of objects from training data, to the determination of the class, identity, location and other inferred parameters from test data. Our visual recognition system is capable of operating on a mobile hand-held device, such as a mobile phone, tablet or other portable device equipped with sensing and computing power. Our system employs a video based feature descriptor, and we characterize its invariance and discriminative properties. Feature selection and tracking are performed in real-time, and used to train a template-based classifier during a capture phase prompted by the user. During normal operation, the system scores objects in the field of view based on their ranking.
    Type: Application
    Filed: April 4, 2014
    Publication date: October 9, 2014
    Applicant: THE REGENTS OF THE UNIVERSITY OF CALIFORNIA
    Inventors: Stefano Soatto, Taehee Lee
  • Patent number: 8717437
    Abstract: We describe an end-to-end visual recognition system, where “end-to-end” refers to the ability of the system of performing all aspects of the system, from the construction of “maps” of scenes, or “models” of objects from training data, to the determination of the class, identity, location and other inferred parameters from test data. Our visual recognition system is capable of operating on a mobile hand-held device, such as a mobile phone, tablet or other portable device equipped with sensing and computing power. Our system employs a video based feature descriptor, and we characterize its invariance and discriminative properties. Feature selection and tracking are performed in real-time, and used to train a template-based classifier during a capture phase prompted by the user. During normal operation, the system scores objects in the field of view based on their ranking.
    Type: Grant
    Filed: January 7, 2013
    Date of Patent: May 6, 2014
    Assignee: The Regents of the University of California
    Inventors: Stefano Soatto, Taehee Lee
  • Publication number: 20120046568
    Abstract: A disposable volume spirometry apparatus and non-contact measurement system is described. The spirometry system includes an expandable disposable volume spirometry apparatus, a remote non-contact sensor, memory, and a processor. The remote non-contact sensor captures images associated with the expandable disposable volume spirometry apparatus. The memory stores the captured images. The processor is operatively coupled to the memory and determines a volume for the expandable disposable volume spirometry apparatus by analyzing the captured images.
    Type: Application
    Filed: August 18, 2011
    Publication date: February 23, 2012
    Inventors: Stefano Soatto, Giuseppe Torresin
  • Patent number: 6944327
    Abstract: The present invention provides a method and apparatus for designing and visualizing the shape of eyeglass lenses and of the front rims of eyeglass frames, and for allowing the customer to modify the design by changing the shape and style interactively to suit his/her preference and perceived character. The present invention comprises an interface for a lens grinding machine that allows the retailer to manufacture the selected frame at the store for certain specified styles, or allows the retailer to transmit the shape and style data to a manufacturer that can implement the selected design and deliver it directly to the customer. Another embodiment of the present invention is to provide a method for designing, visualizing, modifying the shape and style of eyeglass lenses and frames remotely through the Internet or other communication channel, and to transmit the design to a manufacturer, which enables the customer to select and purchase eyeglasses directly from manufacturers.
    Type: Grant
    Filed: November 3, 2000
    Date of Patent: September 13, 2005
    Inventor: Stefano Soatto
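
Illustrative sketch for patent 10878234 above: the abstract describes building a weighted bipartite graph between keys and values, with edge weights derived from embedding distances, and then choosing the pairings that produce a minimum overall edge weight. The Python below is only a minimal sketch of that idea under stated assumptions, not the patented method. The function name `pair_keys_to_values`, the use of one embedding vector per key and per value, the Euclidean distance as the edge weight, and the brute-force search over permutations are all assumptions made for illustration; the abstract does not specify them, and a practical system would likely use a polynomial-time assignment solver instead.

```python
from itertools import permutations
from math import dist


def pair_keys_to_values(key_embeddings, value_embeddings):
    """Return (pairs, total_weight) for the minimum-weight key/value pairing.

    Hypothetical helper: each key and each value is represented by a single
    embedding vector, and the sketch assumes equally many keys and values.
    """
    n = len(key_embeddings)
    if n != len(value_embeddings):
        raise ValueError("this sketch assumes equal numbers of keys and values")

    # Edge weight (assumption): Euclidean distance between the two embeddings.
    weights = [[dist(k, v) for v in value_embeddings] for k in key_embeddings]

    # Brute force over all one-to-one pairings; fine only for tiny forms.
    best_total, best_perm = float("inf"), None
    for perm in permutations(range(n)):
        total = sum(weights[i][perm[i]] for i in range(n))
        if total < best_total:
            best_total, best_perm = total, perm

    pairs = [(i, best_perm[i]) for i in range(n)]
    return pairs, best_total


# Made-up 2-D embeddings: each key lies close to exactly one value.
keys = [(0.0, 0.0), (5.0, 5.0), (10.0, 0.0)]
values = [(5.2, 4.9), (9.8, 0.1), (0.1, 0.2)]
pairs, total = pair_keys_to_values(keys, values)
print(pairs)  # [(0, 2), (1, 0), (2, 1)]
```

Each tuple is (key index, value index). The brute-force search grows factorially with the number of keys, so it is suitable only for demonstrating what "minimum overall edge weight" means on a handful of nodes.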