Patents by Inventor Stefano Soatto
Stefano Soatto has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Patent number: 11860977
Abstract: Techniques for performing visual clustering with a hierarchical graph neural network framework including a joint linkage prediction and density estimation graph model are described. Embodiments herein recurrently run the joint linkage prediction and density estimation graph model to generate intermediate clusters in multiple iterations (e.g., until convergence) to obtain a final clustering result. In certain embodiments, for each iteration, the input graph contains nodes that are merged from nodes assigned to intermediate clusters from the previous iteration. By using a small and fixed bandwidth k in each iteration, embodiments herein alleviate the sensitivity to the k selection for different clustering applications. Certain embodiments herein remove the tuning of a different k (e.g., k-bandwidth) for k-nearest neighbor graph construction over different clustering applications.
Type: Grant
Filed: May 4, 2021
Date of Patent: January 2, 2024
Assignee: Amazon Technologies, Inc.
Inventors: Yifan Xing, Tianjun Xiao, Tong He, Yongxin Wang, Yuanjun Xiong, Wei Xia, David Paul Wipf, Zheng Zhang, Stefano Soatto
-
Patent number: 11429813
Abstract: This disclosure describes automatically selecting and training one or more models for image recognition based upon training and testing (validation) data provided by a user. A service provider network includes a recognition service that may use models to process images and videos to recognize objects in the images and videos, features on the objects in the images and videos, and/or locate objects in the images and videos. The service provider network also includes a model selection and training service that may select one or more modeling techniques based on the objectives of the user and/or the amount of data provided by the user. Based on the selected modeling technique, the model selection and training service selects and trains one or more models for use by the recognition service to process images and videos using the training data. The trained model may be tested and validated using the testing data.
Type: Grant
Filed: November 27, 2019
Date of Patent: August 30, 2022
Assignee: Amazon Technologies, Inc.
Inventors: Avinash Aghoram Ravichandran, Rahul Bhotika, Stefano Soatto, Pietro Perona, Hao Yang
-
Patent number: 11257006
Abstract: Techniques for auto-generation of annotated real-world training data are described. An electronic document is analyzed to determine text represented in the document and corresponding locations of the text. A representation of the electronic document is modified to include markers and printed. The printed document is photographed in real-world environments, and the markers within the digital photographs are analyzed to allow for the depiction of the document within the photographs to be rectified. The text and location data are used to annotate the rectified images.
Type: Grant
Filed: November 20, 2018
Date of Patent: February 22, 2022
Assignee: Amazon Technologies, Inc.
Inventors: Oron Anschel, Amit Adam, Shahar Tsiper, Hadar Averbuch Elor, Shai Mazor, Rahul Bhotika, Stefano Soatto
-
Patent number: 11216697
Abstract: Techniques for building a backward compatible and backfill-free image search system are described. According to some embodiments, a backwards compatible training system trains a new embedding model to be backward compatible with the face embeddings (e.g., floating-point vectors) generated by a previous embedding model. In one embodiment, backwards compatible training uses a classifier of the previous embedding model as a form of constraint in the training of the new embedding model.
Type: Grant
Filed: March 11, 2020
Date of Patent: January 4, 2022
Assignee: Amazon Technologies, Inc.
Inventors: Yantao Shen, Yuanjun Xiong, Siqi Deng, Wei Xia, Shuo Yang, Yifan Xing, Wei Li, Stefano Soatto
-
Patent number: 10970530
Abstract: Techniques for grammar-based automated generation of annotated synthetic form training data for machine learning are described. A training data generation engine utilizes a defined grammar to construct a layout for a form, select key-value units to place within the layout, and select attribute variants for the key-value units. The form is rendered and stored at a storage location, where it can be provided along with other similarly-generated forms to be used as training data for a machine learning model.
Type: Grant
Filed: November 13, 2018
Date of Patent: April 6, 2021
Assignee: Amazon Technologies, Inc.
Inventors: Amit Adam, Oron Anschel, Or Perel, Gal Sabina Star, Omri Ben-Eliezer, Hadar Averbuch Elor, Shai Mazor, Wendy Tse, Andrea Olgiati, Rahul Bhotika, Stefano Soatto
-
Patent number: 10963754
Abstract: Techniques for training an embedding using a limited training set are described. In some examples, the embedding is trained by generating a plurality of vectors from a random sample of the limited set of training data classes using a layer of the particular machine learning classification model, randomly selecting samples from the plurality of vectors into a set of samples, computing at least one distance for each sampled class from a center parameter for the class using the set of samples, generating a discrete probability distribution over the classes for a query point based on distances to a center parameter for each of the classes in the embedding space, calculating a loss value for the modified prototypical network, the calculation of the loss value being for a fixed geometry of the embedding space and including a measure of the difference between distributions, and back propagating.
Type: Grant
Filed: September 27, 2018
Date of Patent: March 30, 2021
Assignee: Amazon Technologies, Inc.
Inventors: Avinash Aghoram Ravichandran, Paulo Ricardo dos Santos Mendonca, Rahul Bhotika, Stefano Soatto
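Illustrative note on the abstract above: the step of turning distances to per-class centers into a discrete probability distribution for a query point can be sketched as follows. This is not taken from the patent. The squared Euclidean distance, the softmax over negative distances, the cross-entropy loss, and every name (`class_distribution`, `cross_entropy`, `centers`) are assumptions chosen for illustration.

```python
import math

def class_distribution(query, centers):
    """Softmax over negative squared distances from a query to each class center.

    Assumption: squared Euclidean distance and a plain softmax; the abstract
    does not specify either choice.
    """
    dists = [sum((q - c) ** 2 for q, c in zip(query, center)) for center in centers]
    # Shift by the smallest distance so the exponentials stay numerically stable.
    smallest = min(dists)
    weights = [math.exp(-(d - smallest)) for d in dists]
    total = sum(weights)
    return [w / total for w in weights]

def cross_entropy(probs, true_class):
    """One possible measure of the difference between the predicted
    distribution and a one-hot target (an assumption, not the patent's loss)."""
    return -math.log(probs[true_class])

# Hypothetical 2-D embedding space with three class centers.
centers = [(0.0, 0.0), (3.0, 0.0), (0.0, 3.0)]
probs = class_distribution((0.5, 0.2), centers)
loss = cross_entropy(probs, 0)
```

In a training loop this loss would be back-propagated through the embedding layer; that part is omitted here because it depends on the framework used.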
-
Patent number: 10878234
Abstract: Techniques for automated form understanding via layout-agnostic identification of keys and corresponding values are described. An embedding generator creates embeddings of pixels from an image including a representation of a form. The generated embeddings are similar for pixels within a same key-value unit, and far apart for pixels not in a same key-value unit. A weighted bipartite graph is constructed including a first set of nodes corresponding to keys of the form and a second set of nodes corresponding to values of the form. Weights for the edges are determined based on an analysis of distances between ones of the embeddings. The graph is partitioned according to a scheme to identify pairings between the first set of nodes and the second set of nodes that produces a minimum overall edge weight. The pairings indicate keys and values that are associated within the form.
Type: Grant
Filed: November 20, 2018
Date of Patent: December 29, 2020
Assignee: Amazon Technologies, Inc.
Inventors: Amit Adam, Oron Anschel, Hadar Averbuch Elor, Shai Mazor, Gal Sabina Star, Or Perel, Wendy Tse, Andrea Olgiati, Rahul Bhotika, Stefano Soatto
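Illustrative note on the abstract above: pairing keys with values so that the overall edge weight is minimal is an assignment problem on a weighted bipartite graph. The sketch below is not the patented scheme. It assumes one embedding vector per key and per value (the abstract works with pixel embeddings), equal numbers of keys and values, Euclidean distance as the edge weight, and a brute-force search that is only practical for small forms. All names are hypothetical.

```python
import math
from itertools import permutations

def pair_keys_to_values(key_embs, value_embs):
    """Return (pairs, total_weight) for the minimum-weight one-to-one pairing.

    Brute force over all permutations: fine for a handful of nodes, too slow
    for large forms, where a dedicated assignment solver would be used instead.
    """
    n = len(key_embs)
    # Edge weight between key i and value j is the distance of their embeddings.
    weights = [[math.dist(k, v) for v in value_embs] for k in key_embs]
    best_perm, best_total = None, math.inf
    for perm in permutations(range(n)):
        total = sum(weights[i][perm[i]] for i in range(n))
        if total < best_total:
            best_perm, best_total = perm, total
    return list(enumerate(best_perm)), best_total

# Hypothetical embeddings: each key lies close to exactly one value.
keys = [(0.0, 0.0), (5.0, 5.0), (9.0, 0.0)]
values = [(5.2, 4.9), (8.8, 0.3), (0.1, 0.2)]
pairs, total = pair_keys_to_values(keys, values)
```

Each tuple in `pairs` is `(key_index, value_index)`.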
-
Patent number: 10872236
Abstract: Techniques for layout-agnostic clustering-based classification of document keys and values are described. A key-value differentiation unit generates feature vectors corresponding to text elements of a form represented within an electronic image using a machine learning (ML) model. The ML model was trained utilizing a loss function that separates keys from values. The feature vectors are clustered into at least two clusters, and a cluster is determined to include either keys of the form or values of the form via identifying neighbors between feature vectors of the cluster(s) with labeled feature vectors.
Type: Grant
Filed: September 28, 2018
Date of Patent: December 22, 2020
Assignee: Amazon Technologies, Inc.
Inventors: Hadar Averbuch Elor, Oron Anschel, Or Perel, Amit Adam, Shai Mazor, Rahul Bhotika, Stefano Soatto
-
Patent number: 10839245
Abstract: A structured document analyzer that associates keys and values in structured documents based on key, value, and key-value container bounding boxes. A trained machine learning model analyzes images of structured documents to determine bounding boxes for keys, values, and key-value containers in the images with confidence scores for the classifications. For each image, duplicate bounding boxes are removed, and then a set of key-value containers are selected and sorted based on the confidence scores. For each key-value container, a best key and value are determined for the container based on overlap of the key and value bounding boxes with the container bounding box and the confidence scores. Optical character recognition may be performed on the image to determine text for the keys and values.
Type: Grant
Filed: March 25, 2019
Date of Patent: November 17, 2020
Assignee: Amazon Technologies, Inc.
Inventors: Guneet Singh Dhillon, Vijay Mahadevan, Yuting Zhang, Meng Wang, Gangadhar Payyavula, Viet Cuong Nguyen, Rahul Bhotika, Stefano Soatto
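Illustrative note on the abstract above: choosing a best key (or value) for a container from bounding-box overlap and confidence can be sketched as below. The scoring rule, overlap fraction multiplied by confidence, is an assumption made for illustration; the abstract does not say how the two quantities are combined. Boxes are assumed to be `(x1, y1, x2, y2)` tuples, and all names are hypothetical.

```python
def overlap_fraction(box, container):
    """Fraction of the box's area that lies inside the container."""
    inter_w = max(0.0, min(box[2], container[2]) - max(box[0], container[0]))
    inter_h = max(0.0, min(box[3], container[3]) - max(box[1], container[1]))
    area = (box[2] - box[0]) * (box[3] - box[1])
    return (inter_w * inter_h) / area if area > 0 else 0.0

def best_candidate(candidates, container):
    """Pick the (box, confidence) candidate with the highest combined score.

    Assumed score: overlap fraction times confidence. Returns None when no
    candidate overlaps the container at all.
    """
    scored = [(overlap_fraction(box, container) * conf, (box, conf))
              for box, conf in candidates]
    scored = [item for item in scored if item[0] > 0]
    return max(scored, key=lambda item: item[0])[1] if scored else None

# Hypothetical detections: one key inside the container, one straddling its
# right edge, and one far below it.
container = (0.0, 0.0, 100.0, 20.0)
key_candidates = [
    ((2.0, 2.0, 40.0, 18.0), 0.90),
    ((90.0, 2.0, 130.0, 18.0), 0.95),
    ((0.0, 50.0, 40.0, 70.0), 0.99),
]
best_key = best_candidate(key_candidates, container)
```

The same function would be called a second time with the value candidates to complete the key-value pair for the container.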
-
Patent number: 10762644
Abstract: Techniques for multiple object tracking in video are described in which the outputs of neural networks are combined within a Bayesian framework. A motion model is applied to a probability distribution representing the estimated current state of a target object being tracked to predict the state of the target object in the next frame. A state of an object can include one or more features, such as the location of the object in the frame, a velocity and/or acceleration of the object across frames, a classification of the object, etc. The prediction of the state of the target object in the next frame is adjusted by a score based on the combined outputs of neural networks that process the next frame.
Type: Grant
Filed: December 13, 2018
Date of Patent: September 1, 2020
Assignee: Amazon Technologies, Inc.
Inventors: Vijay Mahadevan, Stefano Soatto
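Illustrative note on the abstract above: the predict-then-adjust cycle it describes has the shape of a recursive Bayes filter. The sketch below is a toy version and not the patented method. It assumes the state is only a position on a one-dimensional grid of cells, the motion model is a table of displacement probabilities, and the per-cell `scores` stand in for the combined neural-network outputs. All names are hypothetical.

```python
def predict(belief, motion):
    """Apply a motion model to the current belief over grid cells.

    `motion` maps a displacement (in cells) to its probability.
    """
    n = len(belief)
    predicted = [0.0] * n
    for i, p in enumerate(belief):
        for step, q in motion.items():
            j = min(max(i + step, 0), n - 1)  # clamp at the frame borders
            predicted[j] += p * q
    return predicted

def update(predicted, scores):
    """Weight the prediction by per-cell scores and renormalize."""
    posterior = [p * s for p, s in zip(predicted, scores)]
    total = sum(posterior)
    if total == 0:
        raise ValueError("scores rule out every predicted state")
    return [p / total for p in posterior]

# The object is known to be in cell 1 and most likely moves one cell right.
belief = [0.0, 1.0, 0.0, 0.0, 0.0]
motion = {0: 0.2, 1: 0.8}
predicted = predict(belief, motion)

# Hypothetical detector scores for the next frame, one per cell.
scores = [0.1, 0.1, 0.9, 0.3, 0.1]
posterior = update(predicted, scores)
```

Tracking several objects would repeat this cycle per target; how detections are associated with targets is outside this sketch.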
-
Publication number: 20190236399
Abstract: A new method for improving the robustness of visual-inertial integration systems (VINS) based on derivation of optimal discriminants for outlier rejection, and the consequent approximations, that are both conceptually and empirically superior to other outlier detection schemes used in this context. It should be appreciated that VINS is central to a number of application areas including augmented reality (AR), virtual reality (VR), robotics, autonomous vehicles, autonomous flying robots, and so forth and their related hardware including mobile phones, such as for use in indoor localization (in GPS-denied areas), and the like.
Type: Application
Filed: August 9, 2018
Publication date: August 1, 2019
Applicant: THE REGENTS OF THE UNIVERSITY OF CALIFORNIA
Inventors: Stefano Soatto, Konstantine Tsotsos
-
Publication number: 20170243084
Abstract: A variation of scale-invariant feature transform (SIFT) based on pooling gradient orientations across different domain sizes, in addition to spatial locations. The resulting descriptor is called DSP-SIFT, and it outperforms other methods in wide-baseline matching benchmarks, including those based on convolutional neural networks, despite having the same dimension of SIFT and requiring no training. Problems of local representation of imaging data are also addressed as computation of minimal sufficient statistics that are invariant to nuisance variability induced by viewpoint and illumination. A sampling-based and a point-estimate based approximation of such representations are described.
Type: Application
Filed: November 7, 2016
Publication date: August 24, 2017
Applicant: THE REGENTS OF THE UNIVERSITY OF CALIFORNIA
Inventors: Stefano Soatto, Jingming Dong
-
Patent number: 9456768
Abstract: A disposable volume spirometry apparatus and non-contact measurement system is described. The spirometry system includes an expandable disposable volume spirometry apparatus, a remote non-contact sensor, memory, and a processor. The remote non-contact sensor captures images associated with the expandable disposable volume spirometry apparatus. The memory stores the captured images. The processor is operatively coupled to the memory and determines a volume for the expandable disposable volume spirometry apparatus by analyzing the captured images.
Type: Grant
Filed: August 18, 2011
Date of Patent: October 4, 2016
Inventors: Stefano Soatto, Giuseppe Torresin
-
Patent number: 9418317
Abstract: We describe an end-to-end visual recognition system, where “end-to-end” refers to the ability of the system of performing all aspects of the system, from the construction of “maps” of scenes, or “models” of objects from training data, to the determination of the class, identity, location and other inferred parameters from test data. Our visual recognition system is capable of operating on a mobile hand-held device, such as a mobile phone, tablet or other portable device equipped with sensing and computing power. Our system employs a video based feature descriptor, and we characterize its invariance and discriminative properties. Feature selection and tracking are performed in real-time, and used to train a template-based classifier during a capture phase prompted by the user. During normal operation, the system scores objects in the field of view based on their ranking.
Type: Grant
Filed: April 4, 2014
Date of Patent: August 16, 2016
Assignee: THE REGENTS OF THE UNIVERSITY OF CALIFORNIA
Inventors: Stefano Soatto, Taehee Lee
-
Publication number: 20160140729
Abstract: A new method for improving the robustness of visual-inertial integration systems (VINS) based on derivation of optimal discriminants for outlier rejection, and the consequent approximations, that are both conceptually and empirically superior to other outlier detection schemes used in this context. It should be appreciated that VINS is central to a number of application areas including augmented reality (AR), virtual reality (VR), robotics, autonomous vehicles, autonomous flying robots, and so forth and their related hardware including mobile phones, such as for use in indoor localization (in GPS-denied areas), and the like.
Type: Application
Filed: November 4, 2015
Publication date: May 19, 2016
Applicant: THE REGENTS OF THE UNIVERSITY OF CALIFORNIA
Inventors: Stefano Soatto, Konstantine Tsotsos
-
Publication number: 20140301635
Abstract: We describe an end-to-end visual recognition system, where “end-to-end” refers to the ability of the system of performing all aspects of the system, from the construction of “maps” of scenes, or “models” of objects from training data, to the determination of the class, identity, location and other inferred parameters from test data. Our visual recognition system is capable of operating on a mobile hand-held device, such as a mobile phone, tablet or other portable device equipped with sensing and computing power. Our system employs a video based feature descriptor, and we characterize its invariance and discriminative properties. Feature selection and tracking are performed in real-time, and used to train a template-based classifier during a capture phase prompted by the user. During normal operation, the system scores objects in the field of view based on their ranking.
Type: Application
Filed: April 4, 2014
Publication date: October 9, 2014
Applicant: THE REGENTS OF THE UNIVERSITY OF CALIFORNIA
Inventors: Stefano Soatto, Taehee Lee
-
Patent number: 8717437
Abstract: We describe an end-to-end visual recognition system, where “end-to-end” refers to the ability of the system of performing all aspects of the system, from the construction of “maps” of scenes, or “models” of objects from training data, to the determination of the class, identity, location and other inferred parameters from test data. Our visual recognition system is capable of operating on a mobile hand-held device, such as a mobile phone, tablet or other portable device equipped with sensing and computing power. Our system employs a video based feature descriptor, and we characterize its invariance and discriminative properties. Feature selection and tracking are performed in real-time, and used to train a template-based classifier during a capture phase prompted by the user. During normal operation, the system scores objects in the field of view based on their ranking.
Type: Grant
Filed: January 7, 2013
Date of Patent: May 6, 2014
Assignee: The Regents of the University of California
Inventors: Stefano Soatto, Taehee Lee
-
Publication number: 20120046568
Abstract: A disposable volume spirometry apparatus and non-contact measurement system is described. The spirometry system includes an expandable disposable volume spirometry apparatus, a remote non-contact sensor, memory, and a processor. The remote non-contact sensor captures images associated with the expandable disposable volume spirometry apparatus. The memory stores the captured images. The processor is operatively coupled to the memory and determines a volume for the expandable disposable volume spirometry apparatus by analyzing the captured images.
Type: Application
Filed: August 18, 2011
Publication date: February 23, 2012
Inventors: Stefano Soatto, Giuseppe Torresin
-
Patent number: 6944327
Abstract: The present invention provides a method and apparatus for designing and visualizing the shape of eyeglass lenses and of the front rims of eyeglass frames, and for allowing the customer to modify the design by changing the shape and style interactively to suit his/her preference and perceived character. The present invention comprises an interface for a lens grinding machine that allows the retailer to manufacture the selected frame at the store for certain specified styles, or allows the retailer to transmit the shape and style data to a manufacturer that can implement the selected design and deliver it directly to the customer. Another embodiment of the present invention is to provide a method for designing, visualizing, modifying the shape and style of eyeglass lenses and frames remotely through the Internet or other communication channel, and to transmit the design to a manufacturer, which enables the customer to select and purchase eyeglasses directly from manufacturers.
Type: Grant
Filed: November 3, 2000
Date of Patent: September 13, 2005
Inventor: Stefano Soatto