Patents by Inventor Udi Barzelay

Udi Barzelay has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

Font attribute detection

Patent number: 12260659

Abstract: Described are techniques for font attribute detection. The techniques include receiving a document having different font attributes amongst a plurality of words respectively comprised of at least one character. The techniques further include generating a dense image document from the document by setting the plurality of words to a predefined size, removing blank spaces from the document, and altering an order of characters relative to the document. The techniques further include determining characteristics of the characters in the dense image document and aggregating the characteristics for at least one word. The techniques further include annotating the at least one word with a font attribute based on the aggregated characteristics.

Type: Grant

Filed: April 26, 2022

Date of Patent: March 25, 2025

Assignee: International Business Machines Corporation

Inventors: Ophir Azulai, Daniel Nechemia Rotman, Udi Barzelay
Text detection algorithm for separating words detected as one text bounding box

Patent number: 12249168

Abstract: A method, computer system, and a computer program product for text detection is provided. The present invention may include training a text detection model. The present invention may include performing text detection on an inputted image using the trained text detection model. The present invention may include determining whether at least one of a plurality of bounding boxes generated using the inputted image has an aspect ratio above a threshold. The present invention may include based upon determining that at least one of the plurality of bounding boxes generated using the inputted image has the aspect ratio above the threshold, upscaling any text within the at least one bounding box and performing text detection on a new image using the trained text detection model. The present invention may include outputting an output image.

Type: Grant

Filed: January 31, 2022

Date of Patent: March 11, 2025

Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: Ophir Azulai, Udi Barzelay, Oshri Pesah Naparstek
Automatic creation of difficult annotated data leveraging cues

Patent number: 12045717

Abstract: A system and method for generating hard training data from easy training data. Training data including visual data with synthetic semantic implants (“VSSI”) having at least one cue is received. An annotator identifies at least one cue in the VSSI and annotates the VSSI to indicate the cue to create a modified training data set. A data scrambler removes at least one cue from the VSSI to create the tagged training data, which can then be used to train a classifier to identify transitions between segments when the cues are not present.

Type: Grant

Filed: December 9, 2020

Date of Patent: July 23, 2024

Assignee: International Business Machines Corporation

Inventors: Daniel Nechemia Rotman, Yevgeny Yaroker, Udi Barzelay, Joseph Shtok
Synthesizing hard-negative text training data

Patent number: 11948382

Abstract: A method for synthesizing negative training data associated with training models to detect text within documents and images. The method includes one or more computer processors receiving a set of dictates associated with generating one or more negative training datasets for training a set of models to classify a plurality of features found within a data source. The method further includes identifying a set of rules related to generating negative training data to detect text based on the received set of dictates. The method further includes compiling one or more arrays of elements of hard-negative training data into a negative training data dataset based on the identified set of rules and one or more dictates. The method further includes determining metadata corresponding an array of elements of hard-negative training data.

Type: Grant

Filed: December 18, 2020

Date of Patent: April 2, 2024

Assignee: International Business Machines Corporation

Inventors: Ophir Azulai, Udi Barzelay
Object detector trained via self-supervised training on raw and unlabeled videos

Patent number: 11842278

Abstract: An example system includes a processor to receive an image containing an object to be detected. The processor is to detect the object in the image via a binary object detector trained via a self-supervised training on raw and unlabeled videos.

Type: Grant

Filed: January 26, 2023

Date of Patent: December 12, 2023

Assignee: International Business Machines Corporation

Inventors: Elad Amrani, Tal Hakim, Rami Ben-Ari, Udi Barzelay
FONT ATTRIBUTE DETECTION

Publication number: 20230343124

Abstract: Described are techniques for font attribute detection. The techniques include receiving a document having different font attributes amongst a plurality of words respectively comprised of at least one character. The techniques further include generating a dense image document from the document by setting the plurality of words to a predefined size, removing blank spaces from the document, and altering an order of characters relative to the document. The techniques further include determining characteristics of the characters in the dense image document and aggregating the characteristics for at least one word. The techniques further include annotating the at least one word with a font attribute based on the aggregated characteristics.

Type: Application

Filed: April 26, 2022

Publication date: October 26, 2023

Inventors: Ophir Azulai, Daniel Nechemia Rotman, Udi Barzelay
Document segmentation for optical character recognition

Patent number: 11776287

Abstract: An approach to identifying text within an image may be presented. The approach can receive an image. The approach can classify an image on a pixel-by-pixel basis whether the pixel is text. The approach can generate bounding boxes around groups of pixels that are classified as text. The approach can mask sections of an image that where pixels are not classified as text. The approach may be used as a pre-processing technique for optical character recognition in documents, scanned images, or still images.

Type: Grant

Filed: April 27, 2021

Date of Patent: October 3, 2023

Assignee: International Business Machines Corporation

Inventors: Udi Barzelay, Ophir Azulai, Inbar Shapira
Techniques for detecting text

Patent number: 11741732

Abstract: In some examples, a system for detecting text in an image includes a memory device to store a text detection model trained using images of up-scaled text, and a processor configured to perform text detection on an image to generate original bounding boxes that identify potential text in the image. The processor is also configured to generate a secondary image that includes up-scaled portions of the image associated with bounding boxes below a threshold size, and perform text detection on the secondary image to generate secondary bounding boxes that identify potential text in the secondary image. The processor is also configured to compare the original bounding boxes with the secondary bounding boxes to identify original bounding boxes that are false positives, and generate an image file that includes the original bounding boxes, wherein those original bounding boxes that are identified as false positives are removed.

Type: Grant

Filed: December 22, 2021

Date of Patent: August 29, 2023

Assignee: International Business Machines Corporation

Inventors: Ophir Azulai, Udi Barzelay, Oshri Pesah Naparstek
TEXT DETECTION ALGORITHM FOR SEPARATING WORDS DETECTED AS ONE TEXT BOUNDING BOX

Publication number: 20230245481

Abstract: A method, computer system, and a computer program product for text detection is provided. The present invention may include training a text detection model. The present invention may include performing text detection on an inputted image using the trained text detection model. The present invention may include determining whether at least one of a plurality of bounding boxes generated using the inputted image has an aspect ratio above a threshold. The present invention may include based upon determining that at least one of the plurality of bounding boxes generated using the inputted image has the aspect ratio above the threshold, upscaling any text within the at least one bounding box and performing text detection on a new image using the trained text detection model. The present invention may include outputting an output image.

Type: Application

Filed: January 31, 2022

Publication date: August 3, 2023

Inventors: Ophir Azulai, Udi Barzelay, Oshri Pesah Naparstek
Techniques for Detecting Text

Publication number: 20230196807

Abstract: In some examples, a system for detecting text in an image includes a memory device to store a text detection model trained using images of up-scaled text, and a processor configured to perform text detection on an image to generate original bounding boxes that identify potential text in the image. The processor is also configured to generate a secondary image that includes up-scaled portions of the image associated with bounding boxes below a threshold size, and perform text detection on the secondary image to generate secondary bounding boxes that identify potential text in the secondary image. The processor is also configured to compare the original bounding boxes with the secondary bounding boxes to identify original bounding boxes that are false positives, and generate an image file that includes the original bounding boxes, wherein those original bounding boxes that are identified as false positives are removed.

Type: Application

Filed: December 22, 2021

Publication date: June 22, 2023

Inventors: Ophir AZULAI, Udi BARZELAY, Oshri Pesah NAPARSTEK
OBJECT DETECTOR TRAINED VIA SELF-SUPERVISED TRAINING ON RAW AND UNLABELED VIDEOS

Publication number: 20230169344

Abstract: An example system includes a processor to receive an image containing an object to be detected. The processor is to detect the object in the image via a binary object detector trained via a self-supervised training on raw and unlabeled videos.

Type: Application

Filed: January 26, 2023

Publication date: June 1, 2023

Inventors: Elad AMRANI, Tal HAKIM, Rami BEN-ARI, Udi BARZELAY
Training an object detector using raw and unlabeled videos and extracted speech

Patent number: 11636385

Abstract: An example system includes a processor to receive raw and unlabeled videos. The processor is to extract speech from the raw and unlabeled videos. The processor is to extract positive frames and negative frames from the raw and unlabeled videos based on the extracted speech for each object to be detected. The processor is to extract region proposals from the positive frames and negative frames. The processor is to extract features based on the extracted region proposals. The processor is to cluster the region proposals and assign a potential score to each cluster. The processor is to train a binary object detector to detect objects based on positive samples randomly selected based on the potential score.

Type: Grant

Filed: November 4, 2019

Date of Patent: April 25, 2023

Assignee: International Business Machines Corporation

Inventors: Elad Amrani, Udi Barzelay, Rami Ben-Ari, Tal Hakim
DOCUMENT SEGMENTATION FOR OPTICAL CHARACTER RECOGNITION

Publication number: 20220343103

Abstract: An approach to identifying text within an image may be presented. The approach can receive an image. The approach can classify an image on a pixel-by-pixel basis whether the pixel is text. The approach can generate bounding boxes around groups of pixels that are classified as text. The approach can mask sections of an image that where pixels are not classified as text. The approach may be used as a pre-processing technique for optical character recognition in documents, scanned images, or still images.

Type: Application

Filed: April 27, 2021

Publication date: October 27, 2022

Inventors: Udi Barzelay, Ophir Azulai, Inbar Shapira
ACTION RECOGNITION USING LIMITED DATA

Publication number: 20220318555

Abstract: Approaches presented herein enable action recognition. More specifically, a plurality of video segments having one or more action representations is received. One or more sub-action representations in the plurality of video segments are learned. An embedding in a space of a distance metric learning (DML) network for each of the one or more sub-action representations is determined. A set of respective trajectory distances between each of the one or more sub-action representations and one or more class representatives in the space of the DML network based on the embedding is computed, and the one or more action representations based on the set of respective trajectory distances are classified.

Type: Application

Filed: March 31, 2021

Publication date: October 6, 2022

Inventors: Rami Ben-Ari, Ophir Azulai, Udi Barzelay, Mor Shpigel Nacson
Deterministic learning video scene detection

Patent number: 11450111

Abstract: A video scene detection machine learning model is provided. A computer device receives feature vectors corresponding to audio and video components of a video. The computing device provides the feature vectors as input to a trained neural network. The computing device receives from the trained neural network, a plurality of output feature vectors that correspond to shots of the video. The computing device applies optimal sequence grouping to the output feature vectors. The computing device further trains the trained neural network based, at least in part, on the applied optimal sequence grouping.

Type: Grant

Filed: August 27, 2020

Date of Patent: September 20, 2022

Assignee: International Business Machines Corporation

Inventors: Daniel Nechemia Rotman, Rami Ben-Ari, Udi Barzelay
Classifier training using noisy samples

Patent number: 11416757

Abstract: An example system includes a processor to receive input data comprising noisy positive data and clean negative data. The processor is to cluster the input data. The processor is to compute a potential score for each cluster of the clustered input data. The processor is to iteratively refine cluster quality of the clusters using the potential scores of the clusters as weights. The processor is to train a classifier by sampling the negative dataset uniformly and the positive set in a non-uniform manner based on the potential score.

Type: Grant

Filed: November 4, 2019

Date of Patent: August 16, 2022

Assignee: International Business Machines Corporation

Inventors: Elad Amrani, Udi Barzelay, Rami Ben-Ari, Tal Hakim
SYNTHESIZING HARD-NEGATIVE TEXT TRAINING DATA

Publication number: 20220198186

Abstract: A method for synthesizing negative training data associated with training models to detect text within documents and images. The method includes one or more computer processors receiving a set of dictates associated with generating one or more negative training datasets for training a set of models to classify a plurality of features found within a data source. The method further includes identifying a set of rules related to generating negative training data to detect text based on the received set of dictates. The method further includes compiling one or more arrays of elements of hard-negative training data into a negative training data dataset based on the identified set of rules and one or more dictates. The method further includes determining metadata corresponding an array of elements of hard-negative training data.

Type: Application

Filed: December 18, 2020

Publication date: June 23, 2022

Inventors: Ophir Azulai, Udi Barzelay
AUTOMATIC CREATION OF DIFFICULT ANNOTATED DATA LEVERAGING CUES

Publication number: 20220180182

Abstract: A system and method for generating hard training data from easy training data. Training data including visual data with synthetic semantic implants (“VSSI”) having at least one cue is received. An annotator identifies at least one cue in the VSSI and annotates the VSSI to indicate the cue to create a modified training data set. A data scrambler removes at least one cue from the VSSI to create the tagged training data, which can then be used to train a classifier to identify transitions between segments when the cues are not present.

Type: Application

Filed: December 9, 2020

Publication date: June 9, 2022

Inventors: Daniel Nechemia Rotman, Yevgeny Yaroker, Udi Barzelay, Joseph Shtok
DETERMINISTIC LEARNING VIDEO SCENE DETECTION

Publication number: 20220067386

Abstract: A video scene detection machine learning model is provided. A computer device receives feature vectors corresponding to audio and video components of a video. The computing device provides the feature vectors as input to a trained neural network. The computing device receives from the trained neural network, a plurality of output feature vectors that correspond to shots of the video. The computing device applies optimal sequence grouping to the output feature vectors. The computing device further trains the trained neural network based, at least in part, on the applied optimal sequence grouping.

Type: Application

Filed: August 27, 2020

Publication date: March 3, 2022

Inventors: Daniel Nechemia Rotman, Rami Ben-Ari, Udi Barzelay
VISUAL QUESTION ANSWERING USING MODEL TRAINED ON UNLABELED VIDEOS

Publication number: 20220067546

Abstract: An example system includes a processor to learn a shared embedding space on unlabeled videos using speech visual correspondence. The processor can learn a number of additional embeddings including a question plus video embedding and an answer embedding using the shared embedding space to generate a trained visual question answering model. The processor can execute a visual question answering based on the trained visual question answering model.

Type: Application

Filed: August 31, 2020

Publication date: March 3, 2022

Inventors: Elad Amrani, Rami Ben-Ari, Daniel Nechemia Rotman, Udi Barzelay

1 2 next