Patents by Inventor Rogerio Schmidt Feris

Rogerio Schmidt Feris has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

TRANSLATING TEXT USING GENERATED VISUAL REPRESENTATIONS AND ARTIFICIAL INTELLIGENCE

Publication number: 20240127005

Abstract: Methods, systems, and computer program products for translating text using generated visual representations and artificial intelligence are provided herein. A computer-implemented method includes generating a tokenized form of at least a portion of input text in a first language; generating at least one visual representation of at least a portion of the input text using a first set of artificial intelligence techniques; generating a tokenized form of at least a portion of the at least one visual representation; and generating an output including a translated version of the input text into at least a second language by processing, using a second set of artificial intelligence techniques, at least a portion of the tokenized form of the at least a portion of the input text and at least a portion of the tokenized form of the at least a portion of the at least one visual representation.

Type: Application

Filed: September 28, 2022

Publication date: April 18, 2024

Inventors: Rameswar Panda, Yi Li, Richard Chen, Rogerio Schmidt Feris, Yoon Hyung Kim, David Cox

Dynamic multi-resolution processing for video classification

Patent number: 11954910

Abstract: Methods, apparatus, and systems for multi-resolution processing for video classification. A plurality of video frames of a video are obtained and a resolution for classifying each video frame of the plurality of video frames is determined by analyzing each video frame using a policy network. Based on the determined resolution, each video frame having a determined resolution is rescaled and each rescaled video frame is routed to a classifier of a backbone network that corresponds to the determined resolution. Each rescaled video frame is classified using the corresponding classifier of the backbone network to obtain a plurality of classifications and the classifications are averaged to determine an action classification of the video.
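
The flow described in this abstract (choose a resolution per frame, rescale, route to the matching classifier, average the results) can be illustrated with a minimal Python sketch. Everything concrete here is an assumption made for illustration and not taken from the patent: the variance-based `choose_resolution` rule stands in for the policy network, the brightness-based classifiers stand in for the backbone network, and `RESOLUTIONS` holds hypothetical side lengths.

```python
from statistics import fmean, pvariance

# Hypothetical candidate resolutions (side length in pixels), high to low.
RESOLUTIONS = (4, 2)


def choose_resolution(frame, threshold=0.05):
    """Toy stand-in for the policy network (assumption):
    frames with more pixel variance get the higher resolution."""
    pixels = [p for row in frame for p in row]
    return RESOLUTIONS[0] if pvariance(pixels) > threshold else RESOLUTIONS[1]


def rescale(frame, size):
    """Nearest-neighbour rescale of a 2D frame to size x size."""
    h, w = len(frame), len(frame[0])
    return [[frame[r * h // size][c * w // size] for c in range(size)]
            for r in range(size)]


def make_classifier(size):
    """Toy classifier for one resolution (assumption):
    returns scores for two classes based on mean brightness."""
    def classify(frame):
        assert len(frame) == size, "frame must match the classifier resolution"
        brightness = fmean(p for row in frame for p in row)
        return [1.0 - brightness, brightness]
    return classify


# One classifier per supported resolution.
CLASSIFIERS = {size: make_classifier(size) for size in RESOLUTIONS}


def classify_video(frames):
    """Route each frame by its chosen resolution, then average the scores."""
    per_frame = []
    for frame in frames:
        size = choose_resolution(frame)
        per_frame.append(CLASSIFIERS[size](rescale(frame, size)))
    averaged = [fmean(scores) for scores in zip(*per_frame)]
    return averaged.index(max(averaged)), averaged


flat_bright = [[0.9] * 4 for _ in range(4)]                       # low variance
checker = [[(r + c) % 2 for c in range(4)] for r in range(4)]     # high variance
label, averaged = classify_video([flat_bright, checker])
```

In this toy run the flat frame is routed to the low-resolution classifier and the checkerboard frame to the high-resolution one; the video-level label is the class with the highest averaged score.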

Type: Grant

Filed: December 26, 2020

Date of Patent: April 9, 2024

Assignees: International Business Machines Corporation, Massachusetts Institute of Technology, MA

Inventors: Rameswar Panda, Yue Meng, Chung-Ching Lin, Rogerio Schmidt Feris, Aude Jeanne Oliva

DYNAMIC NETWORK QUANTIZATION FOR EFFICIENT VIDEO INFERENCE

Publication number: 20230215174

Abstract: A recognition network is trained for a selected video frame at a desired highest precision using back-propagation and a policy network is trained using back-propagation from the trained recognition network. The recognition network is trained at a lower precision specified by a policy recommended for the selected video frame by the trained policy network. A frame of a given video is inputted to the trained policy network for determination of a precision policy for processing the frame. Video inferencing is performed utilizing the trained policy network and the trained recognition network based on the precision policy.

Type: Application

Filed: December 31, 2021

Publication date: July 6, 2023

Inventors: Rameswar Panda, Ximeng Sun, Richard Chen, Rogerio Schmidt Feris, Ekaterina Saenko

INTERPRETABILITY-AWARE REDUNDANCY REDUCTION FOR VISION TRANSFORMERS

Publication number: 20230196710

Abstract: A sequence of patch tokens representing an image can be received. A network can be trained to learn informative patch tokens and uninformative patch tokens in the sequence of patch tokens, in learning to recognize an object in the image. The sequence of patch tokens can be reduced by removing the uninformative patch tokens from the sequence of patch tokens. The reduced sequence of patch tokens can be input to an attention-based deep learning neural network. The attention-based deep learning neural network can be fine-tuned to recognize the object in the image using the reduced sequence of patch tokens.
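
The token-reduction step in this abstract can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the per-token informativeness scores are taken as given (in the abstract they would come from a trained network), and the function name `reduce_tokens` and the fixed `threshold` are hypothetical.

```python
def reduce_tokens(tokens, scores, threshold=0.5):
    """Drop patch tokens whose informativeness score is below the threshold.

    tokens: sequence of patch tokens (any objects).
    scores: one score per token, assumed to come from a learned scorer.
    Returns the kept tokens and their original positions, in order.
    """
    if len(tokens) != len(scores):
        raise ValueError("tokens and scores must have the same length")
    kept = [(i, t) for i, (t, s) in enumerate(zip(tokens, scores)) if s >= threshold]
    indices = [i for i, _ in kept]
    reduced = [t for _, t in kept]
    return reduced, indices


# Hypothetical patch labels and scores, for illustration only.
patches = ["sky", "cat_face", "grass", "cat_ear"]
informativeness = [0.1, 0.9, 0.2, 0.8]
reduced, indices = reduce_tokens(patches, informativeness)
```

The reduced sequence is what would then be passed to the attention-based network; keeping the original indices is a design choice that lets positional information be recovered later.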

Type: Application

Filed: December 22, 2021

Publication date: June 22, 2023

Inventors: Bowen Pan, Rameswar Panda, Rogerio Schmidt Feris, Aude Jeanne Oliva

TEMPORAL CONTRASTIVE LEARNING FOR SEMI-SUPERVISED VIDEO ACTION RECOGNITION

Publication number: 20230138254

Abstract: A base pathway of a computerized two-pathway video action recognition model is trained using a plurality of labeled video samples. The base pathway is trained using a plurality of unlabeled video samples at a first framerate. An auxiliary pathway of the computerized two-pathway video action recognition model is trained using a plurality of the unlabeled video samples at a second framerate, the second framerate being slower than the first framerate, wherein the training of the base pathway and the training of the auxiliary pathway result in a trained computerized two-pathway video action recognition model. A candidate video is categorized using the trained computerized two-pathway video action recognition model and the categorized candidate video is stored in a computer-accessible video database system for information retrieval.

Type: Application

Filed: October 29, 2021

Publication date: May 4, 2023

Inventors: Rameswar Panda, Rogerio Schmidt Feris, Abir Das

ADAPTIVE REDUNDANCY REDUCTION FOR EFFICIENT VIDEO UNDERSTANDING

Publication number: 20230082448

Abstract: For each convolution layer of a plurality of convolution layers of a convolutional neural network (CNN), apply an input-dependent policy network to determine: a first fraction of input feature maps to the given layer for which first corresponding output feature maps are to be fully computed by the layer; and a second fraction of input feature maps to the layer for which second corresponding output feature maps are not to be fully computed, but to be reconstructed from the first corresponding output feature maps. Fully compute the first corresponding output feature maps and reconstruct the second corresponding output feature maps. For a final one of the convolution layers of the plurality of convolution layers of the neural network, input the first corresponding output feature maps and the second corresponding output feature maps to an output layer to obtain an inference result.

Type: Application

Filed: September 15, 2021

Publication date: March 16, 2023

Inventors: Bowen Pan, Rameswar Panda, Camilo Luciano Fosco, Rogerio Schmidt Feris, Aude Jeanne Oliva

ADAPTIVE SELECTION OF DATA MODALITIES FOR EFFICIENT VIDEO RECOGNITION

Publication number: 20220292285

Abstract: One embodiment of the invention provides a method for video recognition. The method comprises receiving an input video comprising a sequence of video segments over a plurality of data modalities. The method further comprises, for a video segment of the sequence, selecting one or more data modalities based on data representing the video segment. Each data modality selected is optimal for video recognition of the video segment. The method further comprises, for each data modality selected, providing at least one data input representing the video segment over the data modality selected to a machine learning model corresponding to the data modality selected, and generating a first type of prediction representative of the video segment via the machine learning model. The method further comprises determining a second type of prediction representative of the entire input video by aggregating all first type of predictions generated.

Type: Application

Filed: March 11, 2021

Publication date: September 15, 2022

Inventors: Rameswar Panda, Richard Chen, Quanfu Fan, Rogerio Schmidt Feris

DYNAMIC MULTI-RESOLUTION PROCESSING FOR VIDEO CLASSIFICATION

Publication number: 20220215198

Abstract: Methods, apparatus, and systems for multi-resolution processing for video classification. A plurality of video frames of a video are obtained and a resolution for classifying each video frame of the plurality of video frames is determined by analyzing each video frame using a policy network. Based on the determined resolution, each video frame having a determined resolution is rescaled and each rescaled video frame is routed to a classifier of a backbone network that corresponds to the determined resolution. Each rescaled video frame is classified using the corresponding classifier of the backbone network to obtain a plurality of classifications and the classifications are averaged to determine an action classification of the video.

Type: Application

Filed: December 26, 2020

Publication date: July 7, 2022

Inventors: Rameswar Panda, Yue Meng, Chung-Ching Lin, Rogerio Schmidt Feris, Aude Jeanne Oliva

System and method for augmenting few-shot object classification with semantic information from multiple sources

Patent number: 11263488

Abstract: Embodiments may provide learning and recognition of classifications using only one or a few examples of items. For example, in an embodiment, a method of computer vision processing may be implemented in a computer comprising a processor, memory accessible by the processor, and computer program instructions stored in the memory and executable by the processor, the method may comprise training a neural network system implemented in the computer system to classify images into a plurality of classes using one or a few training images for each class and a plurality of associated semantic information, wherein the plurality of associated semantic information is from a plurality of sources and comprises at least some of class/object labels, textual description, or attributes, and wherein the neural network is trained by modulating the training images by sequentially applying the plurality of associated semantic information and classifying query images using the trained neural network system.

Type: Grant

Filed: April 13, 2020

Date of Patent: March 1, 2022

Assignee: International Business Machines Corporation

Inventors: Eliyahu Schwartz, Leonid Karlinsky, Rogerio Schmidt Feris

SYSTEM AND METHOD FOR AUGMENTING FEW-SHOT OBJECT CLASSIFICATION WITH SEMANTIC INFORMATION FROM MULTIPLE SOURCES

Publication number: 20210319263

Abstract: Embodiments may provide learning and recognition of classifications using only one or a few examples of items. For example, in an embodiment, a method of computer vision processing may be implemented in a computer comprising a processor, memory accessible by the processor, and computer program instructions stored in the memory and executable by the processor, the method may comprise training a neural network system implemented in the computer system to classify images into a plurality of classes using one or a few training images for each class and a plurality of associated semantic information, wherein the plurality of associated semantic information is from a plurality of sources and comprises at least some of class/object labels, textual description, or attributes, and wherein the neural network is trained by modulating the training images by sequentially applying the plurality of associated semantic information and classifying query images using the trained neural network system.

Type: Application

Filed: April 13, 2020

Publication date: October 14, 2021

Inventors: Eliyahu Schwartz, Leonid Karlinsky, Rogerio Schmidt Feris

LEARNING DATA-AUGMENTATION FROM UNLABELED MEDIA

Publication number: 20200242507

Abstract: A computing system is configured to learn data-augmentations from unlabeled media. The system includes an extracting unit and an embedding unit. The extracting unit is configured to receive media data that includes moving images of an object and audio generated by the object. The extracting unit extracts an image frame of the object among the moving images and extracts an audio segment from the audio. The embedding unit is configured to generate first embeddings of the image frame and second embeddings of the audio segment, and to concatenate the first and second embeddings together to generate concatenated embeddings. The computing system labels the media data based at least in part on the concatenated embeddings.
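
The embedding and concatenation step in this abstract can be illustrated with a small Python sketch. All specifics are assumptions for illustration: the two toy embedding functions (simple intensity and amplitude statistics), the `prototypes` dictionary, and the nearest-prototype labeling rule are hypothetical and are not described in the abstract, which states only that labeling is based at least in part on the concatenated embeddings.

```python
from math import dist
from statistics import fmean


def embed_image(frame):
    """Toy image embedding (assumption): mean and max pixel intensity."""
    pixels = [p for row in frame for p in row]
    return [fmean(pixels), max(pixels)]


def embed_audio(segment):
    """Toy audio embedding (assumption): mean and peak absolute amplitude."""
    magnitudes = [abs(s) for s in segment]
    return [fmean(magnitudes), max(magnitudes)]


def concatenated_embedding(frame, segment):
    """Concatenate the image embedding and the audio embedding."""
    return embed_image(frame) + embed_audio(segment)


def label_media(frame, segment, prototypes):
    """Hypothetical labeling rule: pick the label of the nearest prototype."""
    embedding = concatenated_embedding(frame, segment)
    return min(prototypes, key=lambda name: dist(embedding, prototypes[name]))


# Hypothetical prototypes in the 4-dimensional concatenated space.
prototypes = {
    "dog": [0.8, 1.0, 0.6, 0.9],
    "clock": [0.2, 0.3, 0.1, 0.1],
}
frame = [[0.7, 0.9], [0.8, 1.0]]   # one extracted image frame
segment = [0.5, -0.9, 0.4]         # one extracted audio segment
embedding = concatenated_embedding(frame, segment)
label = label_media(frame, segment, prototypes)
```

Because the two embeddings are simply joined end to end, the result has as many dimensions as both inputs combined, and any labeling rule applied afterwards sees image and audio evidence together.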

Type: Application

Filed: January 25, 2019

Publication date: July 30, 2020

Inventors: Chuang Gan, Quanfu Fan, Sijia Liu, Rogerio Schmidt Feris

System and method for pose-aware feature learning

Patent number: 10679047

Abstract: An object recognition system includes a parameter generator for generating a parameter that indicates whether an object depicted in an image pair is the same or different object, and a pose difference for the image pair and a parameter refiner for refining the parameter, and an applicator for applying the refined parameter to object recognition.

Type: Grant

Filed: December 29, 2017

Date of Patent: June 9, 2020

Assignee: International Business Machines Corporation

Inventors: Rogerio Schmidt Feris, Minkyong Kim, Clifford A. Pickover

Control device for controlling a rigidity of an orthosis and method of controlling a rigidity of an orthosis

Patent number: 10390986

Abstract: A control device for controlling a rigidity of an orthosis, includes a sensing circuit for sensing a falling motion, a signal generating circuit which generates a sensing signal based on the sensing of the falling motion, and a rigidity control mechanism which controls a rigidity of the orthosis based on the sensing signal.

Type: Grant

Filed: November 30, 2015

Date of Patent: August 27, 2019

Assignee: International Business Machines Corporation

Inventors: Rogerio Schmidt Feris, Minkyong Kim, Clifford A. Pickover

Dynamic management system, method, and recording medium for cognitive drone-swarms

Patent number: 10163355

Abstract: A method, system, and recording medium including a drone and pattern recruiting device configured to recruit a plurality of drones based on a mission, a flocking goal device configured to arrange the plurality of drones in the drone-swarm in a pattern to satisfy the mission, and a changing device configured to adaptively change the pattern of the drone-swarm based on a condition of the mission indicating a needed change and to cause the drone and pattern recruiting device to recruit an additional drone for the needed change.

Type: Grant

Filed: January 30, 2017

Date of Patent: December 25, 2018

Assignee: International Business Machines Corporation

Inventors: Thomas David Erickson, Rogerio Schmidt Feris, Clifford A. Pickover

SYSTEM AND METHOD FOR POSE-AWARE FEATURE LEARNING

Publication number: 20180173944

Abstract: An object recognition system includes a parameter generator for generating a parameter that indicates whether an object depicted in an image pair is the same or different object, and a pose difference for the image pair and a parameter refiner for refining the parameter, and an applicator for applying the refined parameter to object recognition.

Type: Application

Filed: December 29, 2017

Publication date: June 21, 2018

Inventors: Rogerio Schmidt Feris, Minkyong Kim, Clifford A. Pickover

System and method for pose-aware feature learning

Patent number: 9953217

Abstract: A pose-aware feature learning system includes an object tracker which tracks an object on a subject in a plurality of video frames, a pose estimator which estimates a pose of the subject in a track of the plurality of video frames, an image pair generator which extracts a plurality of image pairs from the track of the plurality of video frames, and labels the plurality of image pairs with the estimated pose and as depicting the same or different object, and a neural network trainer which trains a neural network based on the labeled plurality of image pairs, to predict whether an image pair depicts the same or different object and a pose difference for the image pair.

Type: Grant

Filed: November 30, 2015

Date of Patent: April 24, 2018

Assignee: International Business Machines Corporation

Inventors: Rogerio Schmidt Feris, Minkyong Kim, Clifford A. Pickover

Semantic parsing of objects in video

Patent number: 9679201

Abstract: Methods, systems, and computer program products for parsing objects are provided herein. A method includes producing a plurality of versions of an image of an object derived from an input, wherein each version comprises one of multiple resolutions of said image of said object; computing an appearance probability at each of a plurality of regions on the one or more lowest resolution versions of said plurality of versions of said image for at least one attribute for said object; determining a configuration of the at least one attribute in the one or more lowest resolution versions based on at least the appearance probability in each of the plurality of regions; and outputting said configuration.

Type: Grant

Filed: January 18, 2016

Date of Patent: June 13, 2017

Assignee: International Business Machines Corporation

Inventors: Lisa Marie Brown, Rogerio Schmidt Feris, Arun Hampapur, Daniel Andre Vaquero

SYSTEM AND METHOD FOR POSE-AWARE FEATURE LEARNING

Publication number: 20170154212

Abstract: A pose-aware feature learning system includes an object tracker which tracks an object on a subject in a plurality of video frames, a pose estimator which estimates a pose of the subject in a track of the plurality of video frames, an image pair generator which extracts a plurality of image pairs from the track of the plurality of video frames, and labels the plurality of image pairs with the estimated pose and as depicting the same or different object, and a neural network trainer which trains a neural network based on the labeled plurality of image pairs, to predict whether an image pair depicts the same or different object and a pose difference for the image pair.

Type: Application

Filed: November 30, 2015

Publication date: June 1, 2017

Inventors: Rogerio Schmidt Feris, Minkyong Kim, Clifford A. Pickover

CONTROL DEVICE FOR CONTROLLING A RIGIDITY OF AN ORTHOSIS AND METHOD OF CONTROLLING A RIGIDITY OF AN ORTHOSIS

Publication number: 20170151081

Abstract: A control device for controlling a rigidity of an orthosis, includes a sensing circuit for sensing a falling motion, a signal generating circuit which generates a sensing signal based on the sensing of the falling motion, and a rigidity control mechanism which controls a rigidity of the orthosis based on the sensing signal.

Type: Application

Filed: November 30, 2015

Publication date: June 1, 2017

Inventors: Rogerio Schmidt Feris, Minkyong Kim, Clifford A. Pickover

Video object classification

Patent number: 9659238

Abstract: A system comprises an input component, a feature extractor, an object classifier, an adaptation component and a calibration tool. The input component is configured to receive one or more images, and the feature extractor is configured to extract features for one or more objects in the one or more images, the extracted features comprising at least one view-independent feature. The object classifier is configured to classify the one or more objects based at least in part on the extracted features and one or more object classification parameters, and the adaptation component is configured to adjust the classification of at least one of the objects based on one or more contextual parameters. The calibration tool is configured to adjust one or more of the object classification parameters based on likelihoods for characteristics associated with one or more object classes.

Type: Grant

Filed: April 15, 2014

Date of Patent: May 23, 2017

Assignee: International Business Machines Corporation

Inventors: Lisa Marie Brown, Longbin Chen, Rogerio Schmidt Feris, Arun Hampapur, Yun Zhai
