Patents by Inventor Udi Barzelay

Udi Barzelay has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 11948382
    Abstract: A method for synthesizing negative training data associated with training models to detect text within documents and images. The method includes one or more computer processors receiving a set of dictates associated with generating one or more negative training datasets for training a set of models to classify a plurality of features found within a data source. The method further includes identifying a set of rules related to generating negative training data to detect text based on the received set of dictates. The method further includes compiling one or more arrays of elements of hard-negative training data into a negative training dataset based on the identified set of rules and one or more dictates. The method further includes determining metadata corresponding to an array of elements of hard-negative training data.
    Type: Grant
    Filed: December 18, 2020
    Date of Patent: April 2, 2024
    Assignee: International Business Machines Corporation
    Inventors: Ophir Azulai, Udi Barzelay
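The compilation step above (rules applied to seed text to produce hard-negative elements, plus per-element metadata) can be sketched roughly as follows. The rule table, rule names, and the `synthesize_hard_negatives` helper are illustrative assumptions, not the patented method:

```python
import random

# Hypothetical rule table: each rule turns a real word into a text-like
# distractor that a detector should learn to reject (a hard negative).
RULES = {
    "shuffle": lambda word, rng: "".join(rng.sample(word, len(word))),
    "strip_vowels": lambda word, rng: "".join(c for c in word if c not in "aeiouAEIOU"),
}

def synthesize_hard_negatives(seed_words, rule_names, rng=None):
    """Compile an array of hard-negative elements with per-element metadata."""
    rng = rng or random.Random(0)
    dataset = []
    for word in seed_words:
        for name in rule_names:
            dataset.append({
                "text": RULES[name](word, rng),
                "rule": name,     # metadata: which rule produced the element
                "source": word,   # metadata: the seed word it was derived from
            })
    return dataset

negatives = synthesize_hard_negatives(["invoice", "total"], ["shuffle", "strip_vowels"])
```

The metadata kept alongside each element is what lets later stages trace a hard negative back to the rule and seed that generated it.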
  • Patent number: 11842278
    Abstract: An example system includes a processor to receive an image containing an object to be detected. The processor is to detect the object in the image via a binary object detector trained via self-supervised training on raw, unlabeled videos.
    Type: Grant
    Filed: January 26, 2023
    Date of Patent: December 12, 2023
    Assignee: International Business Machines Corporation
    Inventors: Elad Amrani, Tal Hakim, Rami Ben-Ari, Udi Barzelay
  • Publication number: 20230343124
    Abstract: Described are techniques for font attribute detection. The techniques include receiving a document having different font attributes amongst a plurality of words, each comprising at least one character. The techniques further include generating a dense image document from the document by setting the plurality of words to a predefined size, removing blank spaces from the document, and altering an order of characters relative to the document. The techniques further include determining characteristics of the characters in the dense image document and aggregating the characteristics for at least one word. The techniques further include annotating the at least one word with a font attribute based on the aggregated characteristics.
    Type: Application
    Filed: April 26, 2022
    Publication date: October 26, 2023
    Inventors: Ophir Azulai, Daniel Nechemia Rotman, Udi Barzelay
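The final aggregation step (character-level attribute predictions rolled up into a word-level annotation) might look like the following majority-vote sketch; the function name and the dictionary input format are hypothetical:

```python
from collections import Counter

def aggregate_word_attributes(char_predictions):
    """Roll per-character font-attribute predictions up to word-level
    annotations by majority vote across each word's characters."""
    return {word: Counter(attrs).most_common(1)[0][0]
            for word, attrs in char_predictions.items()}

# Hypothetical character-level model output for two words.
preds = {
    "Total": ["bold", "bold", "regular", "bold", "bold"],
    "due": ["italic", "italic", "italic"],
}
annotations = aggregate_word_attributes(preds)
```

Majority voting makes the word annotation robust to a few misclassified characters, such as the single "regular" prediction above.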
  • Patent number: 11776287
    Abstract: An approach to identifying text within an image may be presented. The approach can receive an image. The approach can classify, on a pixel-by-pixel basis, whether each pixel of the image is text. The approach can generate bounding boxes around groups of pixels that are classified as text. The approach can mask sections of the image where pixels are not classified as text. The approach may be used as a pre-processing technique for optical character recognition in documents, scanned images, or still images.
    Type: Grant
    Filed: April 27, 2021
    Date of Patent: October 3, 2023
    Assignee: International Business Machines Corporation
    Inventors: Udi Barzelay, Ophir Azulai, Inbar Shapira
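A minimal sketch of the grouping and masking steps, assuming a binary text/non-text mask has already been produced by the pixel classifier. The connected-component approach and both function names are illustrative, not taken from the patent:

```python
def text_bounding_boxes(mask):
    """Group text pixels (1s) into bounding boxes via 4-connected flood fill."""
    h, w = len(mask), len(mask[0])
    seen = [[False] * w for _ in range(h)]
    boxes = []
    for y in range(h):
        for x in range(w):
            if mask[y][x] and not seen[y][x]:
                stack = [(y, x)]
                seen[y][x] = True
                x0 = x1 = x
                y0 = y1 = y
                while stack:
                    cy, cx = stack.pop()
                    x0, x1 = min(x0, cx), max(x1, cx)
                    y0, y1 = min(y0, cy), max(y1, cy)
                    for ny, nx in ((cy - 1, cx), (cy + 1, cx), (cy, cx - 1), (cy, cx + 1)):
                        if 0 <= ny < h and 0 <= nx < w and mask[ny][nx] and not seen[ny][nx]:
                            seen[ny][nx] = True
                            stack.append((ny, nx))
                boxes.append((x0, y0, x1, y1))  # (x_min, y_min, x_max, y_max)
    return boxes

def mask_non_text(image, boxes, fill=0):
    """Blank every pixel that falls outside all text bounding boxes."""
    out = [[fill] * len(row) for row in image]
    for x0, y0, x1, y1 in boxes:
        for y in range(y0, y1 + 1):
            for x in range(x0, x1 + 1):
                out[y][x] = image[y][x]
    return out
```

The masked output can then be handed to an OCR engine, which only sees regions the classifier believed contain text.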
  • Patent number: 11741732
    Abstract: In some examples, a system for detecting text in an image includes a memory device to store a text detection model trained using images of up-scaled text, and a processor configured to perform text detection on an image to generate original bounding boxes that identify potential text in the image. The processor is also configured to generate a secondary image that includes up-scaled portions of the image associated with bounding boxes below a threshold size, and perform text detection on the secondary image to generate secondary bounding boxes that identify potential text in the secondary image. The processor is also configured to compare the original bounding boxes with the secondary bounding boxes to identify original bounding boxes that are false positives, and generate an image file that includes the original bounding boxes, wherein those original bounding boxes that are identified as false positives are removed.
    Type: Grant
    Filed: December 22, 2021
    Date of Patent: August 29, 2023
    Assignee: International Business Machines Corporation
    Inventors: Ophir Azulai, Udi Barzelay, Oshri Pesah Naparstek
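The comparison step (small original boxes kept only when confirmed by re-detection on the up-scaled image) can be sketched as below, assuming secondary boxes have been mapped back to original-image coordinates. The IoU matching criterion and the thresholds are illustrative assumptions:

```python
def iou(a, b):
    """Intersection-over-union of two (x0, y0, x1, y1) boxes."""
    ix0, iy0 = max(a[0], b[0]), max(a[1], b[1])
    ix1, iy1 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix1 - ix0) * max(0, iy1 - iy0)
    area = lambda r: (r[2] - r[0]) * (r[3] - r[1])
    union = area(a) + area(b) - inter
    return inter / union if union else 0.0

def filter_false_positives(original, secondary, min_size=20, match_iou=0.5):
    """Keep small original boxes only if re-detection at higher scale
    produced an overlapping secondary box; large boxes pass through."""
    kept = []
    for box in original:
        small = (box[2] - box[0]) < min_size and (box[3] - box[1]) < min_size
        confirmed = any(iou(box, s) >= match_iou for s in secondary)
        if not small or confirmed:
            kept.append(box)
    return kept
```

Boxes below the size threshold that the up-scaled pass fails to reproduce are treated as false positives and dropped from the output image file.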
  • Publication number: 20230245481
    Abstract: A method, computer system, and a computer program product for text detection is provided. The present invention may include training a text detection model. The present invention may include performing text detection on an inputted image using the trained text detection model. The present invention may include determining whether at least one of a plurality of bounding boxes generated using the inputted image has an aspect ratio above a threshold. The present invention may include based upon determining that at least one of the plurality of bounding boxes generated using the inputted image has the aspect ratio above the threshold, upscaling any text within the at least one bounding box and performing text detection on a new image using the trained text detection model. The present invention may include outputting an output image.
    Type: Application
    Filed: January 31, 2022
    Publication date: August 3, 2023
    Inventors: Ophir Azulai, Udi Barzelay, Oshri Pesah Naparstek
  • Publication number: 20230196807
    Abstract: In some examples, a system for detecting text in an image includes a memory device to store a text detection model trained using images of up-scaled text, and a processor configured to perform text detection on an image to generate original bounding boxes that identify potential text in the image. The processor is also configured to generate a secondary image that includes up-scaled portions of the image associated with bounding boxes below a threshold size, and perform text detection on the secondary image to generate secondary bounding boxes that identify potential text in the secondary image. The processor is also configured to compare the original bounding boxes with the secondary bounding boxes to identify original bounding boxes that are false positives, and generate an image file that includes the original bounding boxes, wherein those original bounding boxes that are identified as false positives are removed.
    Type: Application
    Filed: December 22, 2021
    Publication date: June 22, 2023
    Inventors: Ophir Azulai, Udi Barzelay, Oshri Pesah Naparstek
  • Publication number: 20230169344
    Abstract: An example system includes a processor to receive an image containing an object to be detected. The processor is to detect the object in the image via a binary object detector trained via self-supervised training on raw, unlabeled videos.
    Type: Application
    Filed: January 26, 2023
    Publication date: June 1, 2023
    Inventors: Elad Amrani, Tal Hakim, Rami Ben-Ari, Udi Barzelay
  • Patent number: 11636385
    Abstract: An example system includes a processor to receive raw and unlabeled videos. The processor is to extract speech from the raw and unlabeled videos. The processor is to extract positive frames and negative frames from the raw and unlabeled videos based on the extracted speech for each object to be detected. The processor is to extract region proposals from the positive frames and negative frames. The processor is to extract features based on the extracted region proposals. The processor is to cluster the region proposals and assign a potential score to each cluster. The processor is to train a binary object detector to detect objects based on positive samples randomly selected based on the potential score.
    Type: Grant
    Filed: November 4, 2019
    Date of Patent: April 25, 2023
    Assignee: International Business Machines Corporation
    Inventors: Elad Amrani, Udi Barzelay, Rami Ben-Ari, Tal Hakim
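The clustering, potential-score, and score-weighted sampling stages might be sketched as follows. The purity-based score and the sampling helper are simplified stand-ins for the trained pipeline, not the patented formulation:

```python
import random

def potential_scores(clusters):
    """Score each cluster of region proposals; here a simple purity proxy:
    the fraction of proposals that came from speech-derived positive frames."""
    return [sum(p["positive"] for p in c) / len(c) for c in clusters]

def sample_training_positives(clusters, scores, k, rng=None):
    """Randomly draw k training samples, favoring clusters whose
    potential score is higher."""
    rng = rng or random.Random(0)
    picks = rng.choices(range(len(clusters)), weights=scores, k=k)
    return [rng.choice(clusters[i]) for i in picks]

clusters = [[{"positive": 1}, {"positive": 1}], [{"positive": 0}, {"positive": 1}]]
scores = potential_scores(clusters)
samples = sample_training_positives(clusters, scores, 5)
```

Weighting the random selection by cluster score is what lets the binary detector train on likely-correct positives without any manual labels.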
  • Publication number: 20220343103
    Abstract: An approach to identifying text within an image may be presented. The approach can receive an image. The approach can classify, on a pixel-by-pixel basis, whether each pixel of the image is text. The approach can generate bounding boxes around groups of pixels that are classified as text. The approach can mask sections of the image where pixels are not classified as text. The approach may be used as a pre-processing technique for optical character recognition in documents, scanned images, or still images.
    Type: Application
    Filed: April 27, 2021
    Publication date: October 27, 2022
    Inventors: Udi Barzelay, Ophir Azulai, Inbar Shapira
  • Publication number: 20220318555
    Abstract: Approaches presented herein enable action recognition. More specifically, a plurality of video segments having one or more action representations is received. One or more sub-action representations in the plurality of video segments are learned. An embedding in a space of a distance metric learning (DML) network for each of the one or more sub-action representations is determined. A set of respective trajectory distances between each of the one or more sub-action representations and one or more class representatives in the space of the DML network is computed based on the embedding, and the one or more action representations are classified based on the set of respective trajectory distances.
    Type: Application
    Filed: March 31, 2021
    Publication date: October 6, 2022
    Inventors: Rami Ben-Ari, Ophir Azulai, Udi Barzelay, Mor Shpigel Nacson
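The trajectory-distance classification over sub-action embeddings could look roughly like this; the Euclidean stepwise distance and the representative-trajectory format are assumptions made for illustration:

```python
import math

def euclid(a, b):
    """Euclidean distance between two embedding vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def trajectory_distance(sub_action_embeddings, class_representatives):
    """Sum of stepwise distances between a sub-action trajectory and a
    class's representative trajectory in the DML embedding space."""
    return sum(euclid(e, r) for e, r in zip(sub_action_embeddings, class_representatives))

def classify(sub_action_embeddings, classes):
    """Assign the action to the class whose representative trajectory
    is nearest under the trajectory distance."""
    return min(classes, key=lambda c: trajectory_distance(sub_action_embeddings, classes[c]))
```

Comparing whole trajectories of sub-action embeddings, rather than a single pooled vector, is what lets the ordering of sub-actions influence the classification.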
  • Patent number: 11450111
    Abstract: A video scene detection machine learning model is provided. A computing device receives feature vectors corresponding to audio and video components of a video. The computing device provides the feature vectors as input to a trained neural network. The computing device receives, from the trained neural network, a plurality of output feature vectors that correspond to shots of the video. The computing device applies optimal sequence grouping to the output feature vectors. The computing device further trains the trained neural network based, at least in part, on the applied optimal sequence grouping.
    Type: Grant
    Filed: August 27, 2020
    Date of Patent: September 20, 2022
    Assignee: International Business Machines Corporation
    Inventors: Daniel Nechemia Rotman, Rami Ben-Ari, Udi Barzelay
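Optimal sequence grouping partitions the ordered shot features into contiguous scenes. A simple dynamic-programming sketch over scalar features follows; the real model operates on learned feature vectors, so the scalar cost function here is an illustrative simplification:

```python
def optimal_sequence_grouping(features, k):
    """Split an ordered list of shot features into k contiguous scenes,
    minimizing total within-scene squared deviation (1-D DP sketch)."""
    n = len(features)

    def cost(i, j):
        # Cost of grouping features[i:j] into a single scene.
        seg = features[i:j]
        mean = sum(seg) / len(seg)
        return sum((f - mean) ** 2 for f in seg)

    INF = float("inf")
    dp = [[INF] * (k + 1) for _ in range(n + 1)]
    cut = [[0] * (k + 1) for _ in range(n + 1)]
    dp[0][0] = 0.0
    for j in range(1, k + 1):
        for end in range(1, n + 1):
            for start in range(j - 1, end):
                c = dp[start][j - 1] + cost(start, end)
                if c < dp[end][j]:
                    dp[end][j], cut[end][j] = c, start
    # Recover the (start, end) boundaries of each scene.
    bounds, end = [], n
    for j in range(k, 0, -1):
        bounds.append((cut[end][j], end))
        end = cut[end][j]
    return bounds[::-1]
```

Because the grouping respects temporal order, scene boundaries always fall between consecutive shots, unlike generic clustering.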
  • Patent number: 11416757
    Abstract: An example system includes a processor to receive input data comprising noisy positive data and clean negative data. The processor is to cluster the input data. The processor is to compute a potential score for each cluster of the clustered input data. The processor is to iteratively refine cluster quality of the clusters using the potential scores of the clusters as weights. The processor is to train a classifier by sampling the negative dataset uniformly and the positive set in a non-uniform manner based on the potential score.
    Type: Grant
    Filed: November 4, 2019
    Date of Patent: August 16, 2022
    Assignee: International Business Machines Corporation
    Inventors: Elad Amrani, Udi Barzelay, Rami Ben-Ari, Tal Hakim
  • Publication number: 20220198186
    Abstract: A method for synthesizing negative training data associated with training models to detect text within documents and images. The method includes one or more computer processors receiving a set of dictates associated with generating one or more negative training datasets for training a set of models to classify a plurality of features found within a data source. The method further includes identifying a set of rules related to generating negative training data to detect text based on the received set of dictates. The method further includes compiling one or more arrays of elements of hard-negative training data into a negative training dataset based on the identified set of rules and one or more dictates. The method further includes determining metadata corresponding to an array of elements of hard-negative training data.
    Type: Application
    Filed: December 18, 2020
    Publication date: June 23, 2022
    Inventors: Ophir Azulai, Udi Barzelay
  • Publication number: 20220180182
    Abstract: A system and method for generating hard training data from easy training data. Training data including visual data with synthetic semantic implants (“VSSI”) having at least one cue is received. An annotator identifies at least one cue in the VSSI and annotates the VSSI to indicate the cue to create a modified training data set. A data scrambler removes at least one cue from the VSSI to create the tagged training data, which can then be used to train a classifier to identify transitions between segments when the cues are not present.
    Type: Application
    Filed: December 9, 2020
    Publication date: June 9, 2022
    Inventors: Daniel Nechemia Rotman, Yevgeny Yaroker, Udi Barzelay, Joseph Shtok
  • Publication number: 20220067386
    Abstract: A video scene detection machine learning model is provided. A computing device receives feature vectors corresponding to audio and video components of a video. The computing device provides the feature vectors as input to a trained neural network. The computing device receives, from the trained neural network, a plurality of output feature vectors that correspond to shots of the video. The computing device applies optimal sequence grouping to the output feature vectors. The computing device further trains the trained neural network based, at least in part, on the applied optimal sequence grouping.
    Type: Application
    Filed: August 27, 2020
    Publication date: March 3, 2022
    Inventors: Daniel Nechemia Rotman, Rami Ben-Ari, Udi Barzelay
  • Publication number: 20220067546
    Abstract: An example system includes a processor to learn a shared embedding space on unlabeled videos using speech visual correspondence. The processor can learn a number of additional embeddings including a question plus video embedding and an answer embedding using the shared embedding space to generate a trained visual question answering model. The processor can execute a visual question answering based on the trained visual question answering model.
    Type: Application
    Filed: August 31, 2020
    Publication date: March 3, 2022
    Inventors: Elad Amrani, Rami Ben-Ari, Daniel Nechemia Rotman, Udi Barzelay
  • Publication number: 20220044105
    Abstract: An example system includes a processor to receive unannotated multimodal data. The processor can estimate the probability that an associated pair of different modalities in the unannotated multimodal data is correctly associated, using a multimodal similarity function and a local density estimation. The processor can also train a multimodal representation learning model on the unannotated multimodal data using the estimated probability as a weight for the associated pair in a loss function.
    Type: Application
    Filed: August 4, 2020
    Publication date: February 10, 2022
    Inventors: Elad Amrani, Rami Ben-Ari, Daniel Nechemia Rotman, Udi Barzelay
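The probability estimate (pair similarity tempered by a local density estimate) might be sketched as follows. The softmax form and the k-nearest-neighbor density proxy are illustrative choices, not the patented formulation:

```python
import math

def correspondence_weights(similarities, k=2):
    """For each (e.g. video, caption) pair with a scalar similarity, estimate
    the probability the pair is correctly associated: its similarity
    softmax-normalized against a local density, taken here as the mean of
    its k highest neighboring similarities."""
    weights = []
    for i, s in enumerate(similarities):
        others = sorted((x for j, x in enumerate(similarities) if j != i), reverse=True)
        density = sum(others[:k]) / k
        weights.append(math.exp(s) / (math.exp(s) + math.exp(density)))
    return weights

weights = correspondence_weights([0.9, 0.1, 0.1])
```

Each weight would then scale that pair's term in the representation-learning loss, so noisily associated pairs contribute less to training.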
  • Patent number: 11164005
    Abstract: Embodiments may provide techniques for identifying images that reduce resource utilization through reduced sampling of video frames for visual recognition. For example, in an embodiment, a method of visual recognition processing may be implemented in a computer system comprising a processor, memory accessible by the processor, and computer program instructions stored in the memory and executable by the processor, the method comprising: coarsely segmenting video frames of a video stream into a plurality of clusters based on scenes of the video stream; sampling a plurality of video frames from each cluster; determining a quality of each cluster; and re-clustering the video frames of the video stream to improve the quality of at least some of the clusters.
    Type: Grant
    Filed: April 12, 2020
    Date of Patent: November 2, 2021
    Assignee: International Business Machines Corporation
    Inventors: Yevgeny Burshtein, Daniel Nechemia Rotman, Dror Porat, Udi Barzelay
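The quality check and re-clustering loop might be sketched like this, with a scalar-feature spread standing in for the cluster quality measure and a naive split standing in for the re-clustering step; both are assumptions for illustration:

```python
def cluster_quality(cluster):
    """Quality proxy: negative mean absolute spread of the sampled
    frames' (scalar) features; tighter clusters score closer to zero."""
    mean = sum(cluster) / len(cluster)
    return -sum(abs(f - mean) for f in cluster) / len(cluster)

def refine_clusters(clusters, min_quality=-1.0):
    """Split any low-quality cluster in half (a stand-in for proper
    re-clustering); clusters that are already tight are kept as-is."""
    refined = []
    for c in clusters:
        if len(c) > 1 and cluster_quality(c) < min_quality:
            mid = len(c) // 2
            refined.extend([c[:mid], c[mid:]])
        else:
            refined.append(c)
    return refined
```

After refinement, a few frames sampled from each tight cluster can represent the whole scene, which is where the resource savings come from.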
  • Patent number: 11157744
    Abstract: Automated detection and approximation of objects in a video, including: (a) sampling a provided digital video, to obtain a set of sampled frames; (b) applying an object detection algorithm to the sampled frames, to detect objects appearing in the sampled frames; (c) based on the detections in the sampled frames, applying an object approximation algorithm to each sequence of frames that lie between the sampled frames, to approximately detect objects appearing in each of the sequences; (d) applying a trained regression model to each of the sequences, to estimate a quality of the approximate detection of objects in the respective sequence; (e) applying the object detection algorithm to one or more frames in those of the sequences whose quality of the approximate detection is below a threshold, to detect objects appearing in those frames.
    Type: Grant
    Filed: January 15, 2020
    Date of Patent: October 26, 2021
    Assignee: International Business Machines Corporation
    Inventors: Udi Barzelay, Tal Hakim, Daniel Nechemia Rotman, Dror Porat
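Step (c), approximating detections for the frames between sampled frames, can be illustrated with simple linear interpolation of bounding boxes; the actual approximation algorithm and the regression-based quality estimate of steps (d)-(e) are not specified here:

```python
def interpolate_boxes(box_a, box_b, steps):
    """Linearly approximate object bounding boxes for the `steps` frames
    lying between two sampled detections: box_a at the first sampled
    frame and box_b at the next sampled frame."""
    approx = []
    for t in range(1, steps + 1):
        a = t / (steps + 1)  # fractional position between the two samples
        approx.append(tuple(round((1 - a) * p + a * q, 2)
                            for p, q in zip(box_a, box_b)))
    return approx
```

Sequences where this cheap approximation scores poorly under the regression model would fall back to running the full object detector.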