Patents by Inventor Michail Raptis

Michail Raptis has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

UNIFIED SCENE TEXT DETECTION AND LAYOUT ANALYSIS

Publication number: 20240062560

Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for jointly performing text detection and layout analysis. In one aspect, a method comprises processing the image and a set of object queries to generate an encoded representation of the image and an encoded representation of the set of object queries; processing the encoded representation of the image and the encoded representation of the set of object queries to generate a set of text detection masks; processing the encoded representation of the set of object queries to generate layout relevance measures; processing the encoded representation of the set of object queries to generate textness scores for the text detection masks; generating a text detection output that defines respective areas of the image that include text items; and generating a layout analysis output that defines clusters of respective areas of the image identified by the text detection masks.

Type: Application

Filed: September 1, 2022

Publication date: February 22, 2024

Inventors: Shangbang Long, Siyang Qin, Dmitry Panteleev, Alessandro Bissacco, Yasuhisa Fujii, Michail Raptis

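The abstract above describes keeping detection masks according to their textness scores and grouping the kept areas into layout clusters. The following is only a minimal sketch of that idea, not the claimed method: the function name `cluster_text_items`, the two thresholds, the pairwise form of the layout relevance measures, and the use of union-find to form clusters are all assumptions made for illustration.

```python
def cluster_text_items(textness, relevance, text_thresh=0.5, link_thresh=0.5):
    """Filter masks by textness, then group the survivors into layout clusters.

    textness:  list of scores, one per candidate mask (hypothetical input).
    relevance: square matrix of pairwise layout relevance (assumed form).
    Both thresholds are illustrative values, not taken from the patent.
    """
    # Keep only the masks that look like text.
    kept = [i for i, score in enumerate(textness) if score >= text_thresh]

    # Union-find over the kept masks (an assumed clustering strategy).
    parent = {i: i for i in kept}

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    # Link two masks when their layout relevance is high enough.
    for pos, a in enumerate(kept):
        for b in kept[pos + 1:]:
            if relevance[a][b] >= link_thresh:
                parent[find(a)] = find(b)

    clusters = {}
    for i in kept:
        clusters.setdefault(find(i), []).append(i)
    return kept, sorted(clusters.values())
```

Under these assumptions, `kept` plays the role of the text detection output and the returned clusters play the role of the layout analysis output, with each cluster listing the indices of masks that belong to the same layout group.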
Methods and systems for action recognition using poselet keyframes

Patent number: 10223580

Abstract: Methods and systems for video action recognition using poselet keyframes are disclosed. An action recognition model may be implemented to spatially and temporally model discriminative action components as a set of discriminative keyframes. One method of action recognition may include the operations of selecting a plurality of poselets that are components of an action, encoding each of a plurality of video frames as a summary of the detection confidence of each of the plurality of poselets for the video frame, and encoding correlations between poselets in the encoded video frames.

Type: Grant

Filed: September 12, 2013

Date of Patent: March 5, 2019

Assignee: Disney Enterprises, Inc.

Inventors: Michail Raptis, Leonid Sigal

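The encoding steps named in the abstract above (a per-frame summary of poselet detection confidences, plus correlations between poselets) can be illustrated with a small sketch. It is hedged throughout: the max-pooling summary, the pairwise-product correlation term, and the names `encode_frame` and `pairwise_terms` are assumptions chosen for illustration, not details confirmed by the patent.

```python
from itertools import combinations


def encode_frame(detections, num_poselets):
    """Summarize one frame as the best confidence seen for each poselet.

    detections: list of (poselet_id, confidence) pairs (hypothetical input).
    Max-pooling is an assumed choice of summary.
    """
    summary = [0.0] * num_poselets
    for poselet_id, confidence in detections:
        summary[poselet_id] = max(summary[poselet_id], confidence)
    return summary


def pairwise_terms(summary):
    """Encode co-occurrence of poselets as products of their confidences.

    The product is an assumed stand-in for the correlation encoding.
    """
    return [summary[i] * summary[j]
            for i, j in combinations(range(len(summary)), 2)]
```

For example, `encode_frame([(0, 0.2), (0, 0.9), (2, 0.5)], 3)` keeps the stronger of the two detections of poselet 0 and leaves poselet 1 at zero, so every pairwise term involving poselet 1 is zero as well.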
Eye gaze driven spatio-temporal action localization

Patent number: 9514363

Abstract: The disclosure provides an approach for detecting and localizing action in video. In one embodiment, an action detection application receives training video sequences and associated eye gaze fixation data collected from a sample of human viewers. Using the training video sequences and eye gaze data, the action detection application learns a model which includes a latent regions potential term that measures the compatibility of latent spatio-temporal regions with the model, as well as a context potential term that accounts for contextual information that is not directly produced by the appearance and motion of the actor. The action detection application may train this model in, e.g., the latent structural SVM framework by minimizing a cost function which encodes the cost of an incorrect action label prediction and a mislocalization of the eye gaze. During training and thereafter, inferences using the model may be made using an efficient dynamic programming algorithm.

Type: Grant

Filed: April 8, 2014

Date of Patent: December 6, 2016

Assignee: Disney Enterprises, Inc.

Inventors: Leonid Sigal, Nataliya Shapovalova, Michail Raptis

Multi-level framework for object detection

Patent number: 9477908

Abstract: The disclosure provides an approach for detecting objects in images. An object detection application receives a set of training images with object annotations. Given these training images, the object detection application generates semantic labeling for object detections, where the labeling includes lower-level subcategories and higher-level visual composites. In one embodiment, the object detection application identifies subcategories using an exemplar support vector machine (SVM) based clustering approach. Identified subcategories are used to initialize mixture components in mixture models which the object detection application trains in a latent SVM framework, thereby learning a number of subcategory classifiers that produce, for any given image, a set of candidate windows and associated subcategory labels.

Type: Grant

Filed: April 10, 2014

Date of Patent: October 25, 2016

Assignee: Disney Enterprises, Inc.

Inventors: Leonid Sigal, Michail Raptis, Tian Lan

MULTI-LEVEL FRAMEWORK FOR OBJECT DETECTION

Publication number: 20150294192

Abstract: The disclosure provides an approach for detecting objects in images. An object detection application receives a set of training images with object annotations. Given these training images, the object detection application generates semantic labeling for object detections, where the labeling includes lower-level subcategories and higher-level visual composites. In one embodiment, the object detection application identifies subcategories using an exemplar support vector machine (SVM) based clustering approach. Identified subcategories are used to initialize mixture components in mixture models which the object detection application trains in a latent SVM framework, thereby learning a number of subcategory classifiers that produce, for any given image, a set of candidate windows and associated subcategory labels.

Type: Application

Filed: April 10, 2014

Publication date: October 15, 2015

Applicant: DISNEY ENTERPRISES, INC.

Inventors: Tian LAN, Michail RAPTIS, Leonid SIGAL

EYE GAZE DRIVEN SPATIO-TEMPORAL ACTION LOCALIZATION

Publication number: 20150286853

Abstract: The disclosure provides an approach for detecting and localizing action in video. In one embodiment, an action detection application receives training video sequences and associated eye gaze fixation data collected from a sample of human viewers. Using the training video sequences and eye gaze data, the action detection application learns a model which includes a latent regions potential term that measures the compatibility of latent spatio-temporal regions with the model, as well as a context potential term that accounts for contextual information that is not directly produced by the appearance and motion of the actor. The action detection application may train this model in, e.g., the latent structural SVM framework by minimizing a cost function which encodes the cost of an incorrect action label prediction and a mislocalization of the eye gaze. During training and thereafter, inferences using the model may be made using an efficient dynamic programming algorithm.

Type: Application

Filed: April 8, 2014

Publication date: October 8, 2015

Applicant: DISNEY ENTERPRISES, INC.

Inventors: Nataliya SHAPOVALOVA, Leonid SIGAL, Michail RAPTIS

METHODS AND SYSTEMS FOR ACTION RECOGNITION USING POSELET KEYFRAMES

Publication number: 20140294360

Abstract: Methods and systems for video action recognition using poselet keyframes are disclosed. An action recognition model may be implemented to spatially and temporally model discriminative action components as a set of discriminative keyframes. One method of action recognition may include the operations of selecting a plurality of poselets that are components of an action, encoding each of a plurality of video frames as a summary of the detection confidence of each of the plurality of poselets for the video frame, and encoding correlations between poselets in the encoded video frames.

Type: Application

Filed: September 12, 2013

Publication date: October 2, 2014

Applicant: Disney Enterprises, Inc.

Inventors: MICHAIL RAPTIS, LEONID SIGAL

Motion recognition

Patent number: 8761437

Abstract: Human body motion is represented by a skeletal model derived from image data of a user. Skeletal model data may be used to perform motion recognition and/or similarity analysis of body motion. An example method of motion recognition includes receiving skeletal motion data representative of a user data motion feature from a capture device relating to a position of a user within a scene. A cross-correlation of the received skeletal motion data relative to a plurality of prototype motion features from a prototype motion feature database is determined. Likelihoods that the skeletal motion data corresponds to each of the plurality of prototype motion features are ranked. The likelihoods are determined using the cross-correlation. A classifying operation is performed on a subset of the plurality of prototype motion features. The subset of the plurality of prototype motion features is chosen because its members have the relatively highest likelihoods of corresponding to the skeletal motion data.

Type: Grant

Filed: February 18, 2011

Date of Patent: June 24, 2014

Assignee: Microsoft Corporation

Inventors: Darko Kirovski, Michail Raptis

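The ranking step in the abstract above (cross-correlate the observed skeletal motion against each prototype, rank the resulting likelihoods, and pass only the top few to a classifier) can be sketched roughly as follows. This is an illustration under stated assumptions, not the patented implementation: motion features are reduced to 1-D sequences, the peak of a normalized cross-correlation stands in for the likelihood, and the names `normalized_cross_correlation`, `top_candidates`, and the parameter `k` are hypothetical. The final classifying operation is left out.

```python
from math import sqrt


def normalized_cross_correlation(a, b):
    """Peak normalized cross-correlation of two 1-D sequences over all lags."""
    def normalize(x):
        mean = sum(x) / len(x)
        centered = [v - mean for v in x]
        # Fall back to 1.0 so a constant sequence does not divide by zero.
        scale = sqrt(sum(v * v for v in centered)) or 1.0
        return [v / scale for v in centered]

    a, b = normalize(a), normalize(b)
    best = -1.0
    for lag in range(-(len(b) - 1), len(a)):
        total = 0.0
        for i, value in enumerate(b):
            j = i + lag
            if 0 <= j < len(a):
                total += a[j] * value
        best = max(best, total)
    return best


def top_candidates(motion, prototypes, k=2):
    """Rank prototypes by correlation score and keep the k best (k is assumed)."""
    scores = {name: normalized_cross_correlation(motion, proto)
              for name, proto in prototypes.items()}
    ranked = sorted(scores, key=scores.get, reverse=True)
    return ranked[:k], scores
```

In this sketch, only the prototypes returned by `top_candidates` would be handed to the subsequent classifier, which mirrors the abstract's point that classification runs on the subset with the highest likelihoods rather than on the whole prototype database.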
MOTION RECOGNITION

Publication number: 20120214594

Abstract: Human body motion is represented by a skeletal model derived from image data of a user. Skeletal model data may be used to perform motion recognition and/or similarity analysis of body motion. An example method of motion recognition includes receiving skeletal motion data representative of a user data motion feature from a capture device relating to a position of a user within a scene. A cross-correlation of the received skeletal motion data relative to a plurality of prototype motion features from a prototype motion feature database is determined. Likelihoods that the skeletal motion data corresponds to each of the plurality of prototype motion features are ranked. The likelihoods are determined using the cross-correlation. A classifying operation is performed on a subset of the plurality of prototype motion features. The subset of the plurality of prototype motion features is chosen because its members have the relatively highest likelihoods of corresponding to the skeletal motion data.

Type: Application

Filed: February 18, 2011

Publication date: August 23, 2012

Applicant: Microsoft Corporation

Inventors: Darko Kirovski, Michail Raptis