Patents by Inventor Oron NIR

Oron NIR has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 11954893
    Abstract: The technology described herein is directed to systems, methods, and software for indexing video. In an implementation, a method comprises identifying one or more regions of interest around target content in a frame of the video. Further, the method includes identifying, in a portion of the frame outside a region of interest, potentially empty regions adjacent to the region of interest. The method continues with identifying at least one empty region of the potentially empty regions that satisfies one or more criteria and classifying at least the one empty region as a negative sample of the target content. In some implementations, the negative sample of the target content is included in a set of negative samples of the target content with which to train a machine learning model employed to identify instances of the target content.
    Type: Grant
    Filed: June 17, 2022
    Date of Patent: April 9, 2024
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Oron Nir, Maria Zontak, Tucker Cunningham Burns, Apar Singhal, Lei Zhang, Irit Ofer, Avner Levi, Haim Sabo, Ika Bar-Menachem, Eylon Ami, Ella Ben Tov, Anika Zaman
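The abstract above describes mining negative training samples from "empty" regions adjacent to a region of interest. The sketch below is a minimal, illustrative interpretation of that idea, not the patented method itself: the function name, the four fixed same-size candidate regions, and the pixel-variance "emptiness" criterion are all assumptions chosen for clarity.

```python
from statistics import pvariance

def find_negative_samples(frame, roi, variance_threshold=50.0):
    """Find low-variance ("empty") regions adjacent to a region of
    interest and return them as candidate negative samples.

    frame: 2D list of pixel intensities; roi: (x, y, w, h) bounding box.
    """
    x, y, w, h = roi
    height, width = len(frame), len(frame[0])
    # Candidate regions of the same size as the ROI, on each of its sides.
    candidates = [(x - w, y, w, h), (x + w, y, w, h),
                  (x, y - h, w, h), (x, y + h, w, h)]
    negatives = []
    for cx, cy, cw, ch in candidates:
        # Skip candidates that fall outside the frame.
        if cx < 0 or cy < 0 or cx + cw > width or cy + ch > height:
            continue
        pixels = [frame[r][c] for r in range(cy, cy + ch)
                  for c in range(cx, cx + cw)]
        # Criterion: low pixel variance suggests the region is "empty".
        if pvariance(pixels) < variance_threshold:
            negatives.append((cx, cy, cw, ch))
    return negatives
```

A real indexer would use richer emptiness criteria and variable region geometry; this only shows the sampling pattern the abstract outlines.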
  • Patent number: 11900961
    Abstract: Examples of the present disclosure describe systems and methods for multichannel audio speech classification. In examples, an audio signal comprising multiple audio channels is received at a processing device. Each of the audio channels in the audio signal is transcoded to a predefined audio format. For each of the transcoded audio channels, an average power value is calculated for one or more data windows in the audio signal. A correlation value is calculated between the average power value for each audio channel and the combined average power value of the other audio channels in the audio signal. Each of the correlation values (or an aggregated correlation value for the audio channels) is then compared against a threshold value to determine whether the audio signal is to be classified as a speech-based communication. Based on the classification, an action associated with the audio signal may be performed.
    Type: Grant
    Filed: May 31, 2022
    Date of Patent: February 13, 2024
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Oron Nir, Inbal Sagiv, Maayan Yedidia, Fardau Van Neerden, Itai Norman
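The multichannel speech-classification pipeline above (per-channel windowed power, correlation of each channel against the combined power of the others, threshold comparison) can be sketched as follows. The window size, the 0.5 threshold, and the use of Pearson correlation over mean aggregation are illustrative assumptions; the transcoding step is omitted.

```python
from statistics import mean

def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    num = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    den = (sum((x - mx) ** 2 for x in xs)
           * sum((y - my) ** 2 for y in ys)) ** 0.5
    return num / den if den else 0.0

def window_powers(channel, window_size):
    """Average power (mean squared amplitude) per data window."""
    return [mean(s * s for s in channel[i:i + window_size])
            for i in range(0, len(channel) - window_size + 1, window_size)]

def is_speech(channels, window_size=4, threshold=0.5):
    """Classify a multichannel signal as speech-based communication if
    each channel's power envelope correlates with the combined envelope
    of the other channels."""
    powers = [window_powers(ch, window_size) for ch in channels]
    corrs = []
    for i, p in enumerate(powers):
        others = [powers[j] for j in range(len(powers)) if j != i]
        combined = [mean(vals) for vals in zip(*others)]
        corrs.append(pearson(p, combined))
    # Speech-like audio (e.g. a mono call duplicated across channels)
    # shows highly correlated per-channel power envelopes.
    return mean(corrs) >= threshold
```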
  • Publication number: 20240004915
    Abstract: A method for automatically classifying terms of a first ontology into categories of a classification scheme defined with respect to a second ontology includes generating, for each term in the first ontology and each term in the second ontology, an embedding encoding the term and a description of the term. The method further includes adding the generated embeddings to a transformer model and computing, for each pair of the embeddings consisting of a first term from the first ontology and a second term from the second ontology, a similarity metric quantifying a similarity of the first term and the second term. The method still further provides for determining a matching scheme based on the similarity metric computed with respect to each pair of the embeddings, where the matching scheme associates terms of the first ontology with one or more relevant categories of the classification scheme defined with respect to the second ontology.
    Type: Application
    Filed: June 29, 2022
    Publication date: January 4, 2024
    Inventors: Oron NIR, Inbal SAGIV, Fardau VAN NEERDEN
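The matching step described above (embed each term with its description, score cross-ontology pairs by similarity, associate each source term with its best category) can be sketched with a toy setup. The patent uses transformer embeddings; the bag-of-words `embed` below is only a self-contained stand-in, and all data in the example is invented.

```python
import math
from collections import Counter

def embed(term, description):
    # Toy stand-in for a transformer embedding: a bag-of-words vector
    # over the term and its description.
    return Counter((term + " " + description).lower().split())

def cosine(a, b):
    """Cosine similarity of two sparse word-count vectors."""
    num = sum(a[w] * b[w] for w in set(a) & set(b))
    den = (math.sqrt(sum(v * v for v in a.values()))
           * math.sqrt(sum(v * v for v in b.values())))
    return num / den if den else 0.0

def match_terms(first_ontology, second_ontology):
    """Map each term of the first ontology (dicts of term -> description)
    to its most similar category in the second."""
    first = {t: embed(t, d) for t, d in first_ontology.items()}
    second = {t: embed(t, d) for t, d in second_ontology.items()}
    return {t: max(second, key=lambda c: cosine(e, second[c]))
            for t, e in first.items()}
```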
  • Publication number: 20240005094
    Abstract: A system for ontology matching performs operations to refine a natural language processing (NLP) model that encodes terms of a first hierarchical ontology and of a second hierarchical ontology as embeddings in a vector space in which spatial proximity between the embeddings is correlated with similarity between the associated terms. The operations to refine the NLP model include performing at least a first round of triplet loss training to decrease separation between select pairs of the embeddings sampled from the different ontologies that satisfy a first hierarchical relation while increasing separation between other pairs of the embeddings that do not satisfy the first hierarchical relation. The system then determines, from the refined NLP model, a stable matching scheme that matches each term in the first hierarchical ontology with a corresponding term of the second hierarchical ontology.
    Type: Application
    Filed: June 29, 2022
    Publication date: January 4, 2024
    Inventors: Oron NIR, Inbal SAGIV, Fardau VAN NEERDEN
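The triplet loss training mentioned in this abstract decreases anchor-to-positive separation while increasing anchor-to-negative separation. The standard formulation that behaves this way is sketched below; the margin value and Euclidean distance choice are conventional defaults, not details taken from the patent.

```python
def triplet_loss(anchor, positive, negative, margin=1.0):
    """Standard triplet loss over embedding vectors: zero when the
    anchor is closer to the positive (a hierarchically related term)
    than to the negative by at least `margin`, positive otherwise."""
    def dist(u, v):
        # Euclidean distance between two embedding vectors.
        return sum((a - b) ** 2 for a, b in zip(u, v)) ** 0.5
    return max(dist(anchor, positive) - dist(anchor, negative) + margin, 0.0)
```

Minimizing this loss over sampled cross-ontology pairs pulls related terms together in the vector space, which is what lets the refined model support the stable matching step the abstract describes.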
  • Publication number: 20230419663
    Abstract: Examples of the present disclosure describe systems and methods for video genre classification. In one example implementation, video content is received. A plurality of sliding windows of the video content is sampled. The plurality of sliding windows comprises audio data and video data. The audio data is analyzed to identify a set of audio features. The video data is analyzed to identify a set of video features. The set of audio features and the set of video features are provided to a classifier. The classifier is configured to detect a genre for the video content using the set of audio features and the set of video features. The video content is indexed based on the genre.
    Type: Application
    Filed: June 27, 2022
    Publication date: December 28, 2023
    Applicant: Microsoft Technology Licensing, LLC
    Inventors: Oron NIR, Mattan SERRY, Yonit HOFFMAN, Michael BEN-HAYM, Zvi FIGOV, Eliyahu STRUGO, Avi NEEMAN
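The sliding-window sampling step this abstract starts from can be sketched in a few lines. The window size and stride are illustrative parameters; in the patented pipeline each window would carry both audio and video data and feed a feature extractor and genre classifier.

```python
def sliding_windows(frames, window_size, stride):
    """Sample overlapping fixed-size windows from a sequence of frames
    (or audio samples), stepping by `stride` between window starts."""
    return [frames[i:i + window_size]
            for i in range(0, len(frames) - window_size + 1, stride)]
```

For example, `sliding_windows(list(range(10)), 4, 2)` yields four windows, each overlapping its neighbor by two frames.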
  • Publication number: 20230386505
    Abstract: Examples of the present disclosure describe systems and methods for multichannel audio speech classification. In examples, an audio signal comprising multiple audio channels is received at a processing device. Each of the audio channels in the audio signal is transcoded to a predefined audio format. For each of the transcoded audio channels, an average power value is calculated for one or more data windows in the audio signal. A correlation value is calculated between the average power value for each audio channel and the combined average power value of the other audio channels in the audio signal. Each of the correlation values (or an aggregated correlation value for the audio channels) is then compared against a threshold value to determine whether the audio signal is to be classified as a speech-based communication. Based on the classification, an action associated with the audio signal may be performed.
    Type: Application
    Filed: May 31, 2022
    Publication date: November 30, 2023
    Applicant: Microsoft Technology Licensing, LLC
    Inventors: Oron NIR, Inbal SAGIV, Maayan YEDIDIA, Fardau VAN NEERDEN, Itai NORMAN
  • Patent number: 11823453
    Abstract: The technology described herein is directed to a media indexer framework including a character recognition engine that automatically detects and groups instances (or occurrences) of characters in a multi-frame animated media file. More specifically, the character recognition engine automatically detects and groups the instances (or occurrences) of the characters in the multi-frame animated media file such that each group contains images associated with a single character. The character groups are then labeled and used to train an image classification model. Once trained, the image classification model can be applied to subsequent multi-frame animated media files to automatically classify the animated characters included therein.
    Type: Grant
    Filed: February 1, 2022
    Date of Patent: November 21, 2023
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Oron Nir, Maria Zontak, Tucker Cunningham Burns, Apar Singhal, Lei Zhang, Irit Ofer, Avner Levi, Haim Sabo, Ika Bar-Menachem, Eylon Ami, Ella Ben Tov
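The grouping step above (cluster character detections so each group holds images of a single character) can be illustrated with a simple greedy scheme over detection embeddings. The greedy first-fit strategy and the distance threshold are assumptions for the sketch; the patented engine's actual grouping method is not specified in the abstract.

```python
def group_detections(embeddings, threshold=0.5):
    """Greedily group character detections: each detection joins the
    first group whose representative embedding is within `threshold`
    Euclidean distance; otherwise it starts a new group."""
    def dist(u, v):
        return sum((a - b) ** 2 for a, b in zip(u, v)) ** 0.5
    groups = []  # list of (representative embedding, member indices)
    for idx, emb in enumerate(embeddings):
        for rep, members in groups:
            if dist(rep, emb) < threshold:
                members.append(idx)
                break
        else:
            groups.append((emb, [idx]))
    return [members for _, members in groups]
```

Each returned group could then be labeled with a character name and used as training data for the image classification model the abstract describes.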
  • Patent number: 11768961
    Abstract: Methods for speaker role determination and scrubbing identifying information are performed by systems and devices. In speaker role determination, data from an audio or text file is divided into respective portions related to speaking parties. Characteristics classifying the portions of the data for speaking party roles are identified in the portions to generate data sets from the portions corresponding to the speaking party roles and to assign speaking party roles for the data sets. For scrubbing identifying information in data, audio data for speaking parties is processed using speech recognition to generate a text-based representation. Text associated with identifying information is determined based on a set of key words/phrases, and a portion of the text-based representation that includes a part of the text is identified. A segment of audio data that corresponds to the identified portion is replaced with different audio data, and the portion is replaced with different text.
    Type: Grant
    Filed: October 28, 2021
    Date of Patent: September 26, 2023
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Yun-Cheng Ju, Ashwarya Poddar, Royi Ronen, Oron Nir, Ami Turgman, Andreas Stolcke, Edan Hauon
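The scrubbing half of this abstract (find text associated with identifying information via key words/phrases, replace the matched portion, and note which spans to overwrite in the audio) can be sketched as below. The `KEY_PHRASES` patterns and the `[REDACTED]` placeholder are invented examples; a real system would map the character spans back to audio timestamps and replace those segments too.

```python
import re

# Illustrative key word/phrase patterns for identifying information.
KEY_PHRASES = [r"my name is \w+", r"\b\d{3}-\d{2}-\d{4}\b"]

def scrub_transcript(text, replacement="[REDACTED]"):
    """Replace text matching identifying-information patterns, returning
    the scrubbed transcript plus the original character spans replaced
    (which a caller could map back to audio segments)."""
    spans = []
    for pattern in KEY_PHRASES:
        for m in re.finditer(pattern, text, flags=re.IGNORECASE):
            spans.append(m.span())
    scrubbed = text
    # Replace from the end backward so earlier spans stay valid.
    for start, end in sorted(spans, reverse=True):
        scrubbed = scrubbed[:start] + replacement + scrubbed[end:]
    return scrubbed, sorted(spans)
```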
  • Patent number: 11501546
    Abstract: In various embodiments, methods and systems for implementing a media management system, for video data processing and adaptation data generation, are provided. At a high level, a video data processing engine relies on different types of video data properties and additional auxiliary data resources to perform video optical character recognition operations for recognizing characters in video data. In operation, video data is accessed to identify recognized characters. A video OCR operation to perform on the video data for character recognition is determined from video character processing and video auxiliary data processing. Video auxiliary data processing includes processing an auxiliary reference object; the auxiliary reference object is an indirect reference object that is a derived input element used as a factor in determining the recognized characters. The video data is processed based on the video OCR operation and based on processing the video data, at least one recognized character is communicated.
    Type: Grant
    Filed: July 27, 2020
    Date of Patent: November 15, 2022
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Royi Ronen, Ika Bar-Menachem, Ohad Jassin, Avner Levi, Olivier Nano, Oron Nir, Mor Geva Pipek, Ori Ziv
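One way to read the "auxiliary reference object" idea above is that external reference data (for example, a vocabulary derived from the video's metadata) acts as a factor in deciding the recognized characters. The sketch below shows that general idea only: correcting noisy OCR tokens against an auxiliary vocabulary using fuzzy matching. The function, the vocabulary source, and the cutoff are all assumptions, not the patent's mechanism.

```python
from difflib import get_close_matches

def refine_ocr(raw_tokens, auxiliary_vocabulary, cutoff=0.6):
    """Snap noisy OCR tokens to their closest entry in an auxiliary
    reference vocabulary; tokens with no close match pass through."""
    refined = []
    for token in raw_tokens:
        matches = get_close_matches(token, auxiliary_vocabulary,
                                    n=1, cutoff=cutoff)
        refined.append(matches[0] if matches else token)
    return refined
```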
  • Publication number: 20220318574
    Abstract: The technology described herein is directed to systems, methods, and software for indexing video. In an implementation, a method comprises identifying one or more regions of interest around target content in a frame of the video. Further, the method includes identifying, in a portion of the frame outside a region of interest, potentially empty regions adjacent to the region of interest. The method continues with identifying at least one empty region of the potentially empty regions that satisfies one or more criteria and classifying at least the one empty region as a negative sample of the target content. In some implementations, the negative sample of the target content is included in a set of negative samples of the target content with which to train a machine learning model employed to identify instances of the target content.
    Type: Application
    Filed: June 17, 2022
    Publication date: October 6, 2022
    Inventors: Oron NIR, Maria ZONTAK, Tucker Cunningham BURNS, Apar SINGHAL, Lei ZHANG, Irit OFER, Avner LEVI, Haim SABO, Ika BAR-MENACHEM, Eylon AMI, Ella BEN TOV, Anika ZAMAN
  • Patent number: 11366989
    Abstract: The technology described herein is directed to systems, methods, and software for indexing video. In an implementation, a method comprises identifying one or more regions of interest around target content in a frame of the video. Further, the method includes identifying, in a portion of the frame outside a region of interest, potentially empty regions adjacent to the region of interest. The method continues with identifying at least one empty region of the potentially empty regions that satisfies one or more criteria and classifying at least the one empty region as a negative sample of the target content. In some implementations, the negative sample of the target content is included in a set of negative samples of the target content with which to train a machine learning model employed to identify instances of the target content.
    Type: Grant
    Filed: March 26, 2020
    Date of Patent: June 21, 2022
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Oron Nir, Maria Zontak, Tucker Cunningham Burns, Apar Singhal, Lei Zhang, Irit Ofer, Avner Levi, Haim Sabo, Ika Bar-Menachem, Eylon Ami, Ella Ben Tov, Anika Zaman
  • Publication number: 20220157057
    Abstract: The technology described herein is directed to a media indexer framework including a character recognition engine that automatically detects and groups instances (or occurrences) of characters in a multi-frame animated media file. More specifically, the character recognition engine automatically detects and groups the instances (or occurrences) of the characters in the multi-frame animated media file such that each group contains images associated with a single character. The character groups are then labeled and used to train an image classification model. Once trained, the image classification model can be applied to subsequent multi-frame animated media files to automatically classify the animated characters included therein.
    Type: Application
    Filed: February 1, 2022
    Publication date: May 19, 2022
    Inventors: Oron NIR, Maria ZONTAK, Tucker Cunningham BURNS, Apar SINGHAL, Lei ZHANG, Irit OFER, Avner LEVI, Haim SABO, Ika BAR-MENACHEM, Eylon AMI, Ella BEN TOV
  • Patent number: 11270121
    Abstract: The technology described herein is directed to a media indexer framework including a character recognition engine that automatically detects and groups instances (or occurrences) of characters in a multi-frame animated media file. More specifically, the character recognition engine automatically detects and groups the instances (or occurrences) of the characters in the multi-frame animated media file such that each group contains images associated with a single character. The character groups are then labeled and used to train an image classification model. Once trained, the image classification model can be applied to subsequent multi-frame animated media files to automatically classify the animated characters included therein.
    Type: Grant
    Filed: March 26, 2020
    Date of Patent: March 8, 2022
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Oron Nir, Maria Zontak, Tucker Cunningham Burns, Apar Singhal, Lei Zhang, Irit Ofer, Avner Levi, Haim Sabo, Ika Bar-Menachem, Eylon Ami, Ella Ben Tov
  • Publication number: 20220050922
    Abstract: Methods for speaker role determination and scrubbing identifying information are performed by systems and devices. In speaker role determination, data from an audio or text file is divided into respective portions related to speaking parties. Characteristics classifying the portions of the data for speaking party roles are identified in the portions to generate data sets from the portions corresponding to the speaking party roles and to assign speaking party roles for the data sets. For scrubbing identifying information in data, audio data for speaking parties is processed using speech recognition to generate a text-based representation. Text associated with identifying information is determined based on a set of key words/phrases, and a portion of the text-based representation that includes a part of the text is identified. A segment of audio data that corresponds to the identified portion is replaced with different audio data, and the portion is replaced with different text.
    Type: Application
    Filed: October 28, 2021
    Publication date: February 17, 2022
    Inventors: Yun-Cheng Ju, Ashwarya Poddar, Royi Ronen, Oron Nir, Ami Turgman, Andreas Stolcke, Edan Hauon
  • Patent number: 11182504
    Abstract: Methods for speaker role determination and scrubbing identifying information are performed by systems and devices. In speaker role determination, data from an audio or text file is divided into respective portions related to speaking parties. Characteristics classifying the portions of the data for speaking party roles are identified in the portions to generate data sets from the portions corresponding to the speaking party roles and to assign speaking party roles for the data sets. For scrubbing identifying information in data, audio data for speaking parties is processed using speech recognition to generate a text-based representation. Text associated with identifying information is determined based on a set of key words/phrases, and a portion of the text-based representation that includes a part of the text is identified. A segment of audio data that corresponds to the identified portion is replaced with different audio data, and the portion is replaced with different text.
    Type: Grant
    Filed: April 29, 2019
    Date of Patent: November 23, 2021
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Yun-Cheng Ju, Ashwarya Poddar, Royi Ronen, Oron Nir, Ami Turgman, Andreas Stolcke, Edan Hauon
  • Patent number: 11062706
    Abstract: Methods for speaker role determination and scrubbing identifying information are performed by systems and devices. In speaker role determination, data from an audio or text file is divided into respective portions related to speaking parties. Characteristics classifying the portions of the data for speaking party roles are identified in the portions to generate data sets from the portions corresponding to the speaking party roles and to assign speaking party roles for the data sets. For scrubbing identifying information in data, audio data for speaking parties is processed using speech recognition to generate a text-based representation. Text associated with identifying information is determined based on a set of key words/phrases, and a portion of the text-based representation that includes a part of the text is identified. A segment of audio data that corresponds to the identified portion is replaced with different audio data, and the portion is replaced with different text.
    Type: Grant
    Filed: April 29, 2019
    Date of Patent: July 13, 2021
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Yun-Cheng Ju, Ashwarya Poddar, Royi Ronen, Oron Nir, Ami Turgman, Andreas Stolcke, Edan Hauon
  • Publication number: 20210174146
    Abstract: Aspects of the technology described herein improve an object recognition system by specifying a type of picture that would improve the accuracy of the object recognition system if used to retrain it. The technology described herein can take the form of an improvement model that improves an object recognition model by suggesting the types of training images that would improve the object recognition model's performance. For example, the improvement model could suggest that a picture of a person smiling be used to retrain the object recognition system. Once trained, the improvement model can be used to estimate a performance score for an image recognition model given the characteristics of a set of training images.
    Type: Application
    Filed: January 25, 2021
    Publication date: June 10, 2021
    Inventors: Oron NIR, Royi RONEN, Ohad JASSIN, Milan M. GADA, Mor Geva PIPEK
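The suggestion mechanism above can be illustrated with a toy model: estimate performance as a function of how many training images exhibit each characteristic, then suggest the characteristic whose next example yields the largest estimated gain. The diminishing-returns formula, the weights, and the characteristic names below are entirely invented for the sketch; the patented improvement model is a trained model, not this closed-form estimate.

```python
def estimated_performance(counts, weights):
    """Toy performance estimate: each characteristic contributes its
    weight scaled by n/(n+1), a simple diminishing-returns curve over
    the number n of training images with that characteristic."""
    return sum(w * counts.get(c, 0) / (counts.get(c, 0) + 1)
               for c, w in weights.items())

def suggest_training_image(counts, weights):
    """Suggest which type of picture to add to the training set next:
    the characteristic with the largest marginal performance gain."""
    def gain(c):
        n = counts.get(c, 0)
        return weights[c] * ((n + 1) / (n + 2) - n / (n + 1))
    return max(weights, key=gain)
```

Under this toy model, an underrepresented characteristic (few existing examples) is suggested first, which matches the abstract's example of recommending a specific kind of picture for retraining.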
  • Publication number: 20210081699
    Abstract: In various embodiments, methods and systems for implementing a media management system, for video data processing and adaptation data generation, are provided. At a high level, a video data processing engine relies on different types of video data properties and additional auxiliary data resources to perform video optical character recognition operations for recognizing characters in video data. In operation, video data is accessed to identify recognized characters. A video OCR operation to perform on the video data for character recognition is determined from video character processing and video auxiliary data processing. Video auxiliary data processing includes processing an auxiliary reference object; the auxiliary reference object is an indirect reference object that is a derived input element used as a factor in determining the recognized characters. The video data is processed based on the video OCR operation and based on processing the video data, at least one recognized character is communicated.
    Type: Application
    Filed: July 27, 2020
    Publication date: March 18, 2021
    Inventors: Royi RONEN, Ika BAR-MENACHEM, Ohad JASSIN, Avner LEVI, Olivier NANO, Oron NIR, Mor Geva PIPEK, Ori ZIV
  • Patent number: 10936630
    Abstract: Systems and methods are disclosed for inferring topics from a file containing both audio and video, for example a multimodal or multimedia file, in order to facilitate video indexing. A set of entities is extracted from the file and linked to produce a graph, and reference information is also obtained for the set of entities. Entities may be drawn, for example, from Wikipedia categories, or other large ontological data sources. Analysis of the graph, using unsupervised learning, permits determining clusters in the graph. Extracting features from the clusters, possibly using supervised learning, provides for selection of topic identifiers. The topic identifiers are then used for indexing the file.
    Type: Grant
    Filed: September 13, 2018
    Date of Patent: March 2, 2021
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Royi Ronen, Oron Nir, Chin-Yew Lin, Ohad Jassin, Daniel Nurieli, Eylon Ami, Avner Levi
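The clustering step above (link extracted entities into a graph, then find clusters without supervision) can be sketched with connected components as a simple stand-in for the unsupervised learning the abstract mentions. The entity names in the example are invented; real entities would come from large ontological sources such as Wikipedia categories.

```python
from collections import defaultdict

def entity_clusters(edges):
    """Cluster entities by connected components of the entity link
    graph. `edges` is a list of (entity_a, entity_b) links."""
    graph = defaultdict(set)
    for a, b in edges:
        graph[a].add(b)
        graph[b].add(a)
    seen, clusters = set(), []
    for node in graph:
        if node in seen:
            continue
        # Depth-first search to collect one connected component.
        stack, comp = [node], set()
        while stack:
            n = stack.pop()
            if n in comp:
                continue
            comp.add(n)
            stack.extend(graph[n] - comp)
        seen |= comp
        clusters.append(comp)
    return clusters
```

Each cluster would then have features extracted from it and a topic identifier selected (e.g. a shared category label), which is what the file is ultimately indexed by.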
  • Publication number: 20210056313
    Abstract: The technology described herein is directed to a media indexer framework including a character recognition engine that automatically detects and groups instances (or occurrences) of characters in a multi-frame animated media file. More specifically, the character recognition engine automatically detects and groups the instances (or occurrences) of the characters in the multi-frame animated media file such that each group contains images associated with a single character. The character groups are then labeled and used to train an image classification model. Once trained, the image classification model can be applied to subsequent multi-frame animated media files to automatically classify the animated characters included therein.
    Type: Application
    Filed: March 26, 2020
    Publication date: February 25, 2021
    Inventors: Oron Nir, Maria Zontak, Tucker Cunningham Burns, Apar Singhal, Lei Zhang, Irit Ofer, Avner Levi, Haim Sabo, Ika Bar-Menachem, Eylon Ami, Ella Ben Tov