Patents by Inventor Oron NIR

Oron NIR has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Publication number: 20260133997
    Abstract: Systems and methods are provided for implementing large-scale density-based clustering functionalities. In examples, a system selects, for a dataset (which may be sampled at 100% or less), an upper bound value and a lower bound value of a neighborhood radius parameter of a density-based clustering algorithm. The system identifies, using a modified ternary search algorithm, an optimal neighborhood radius parameter value, based on the upper and lower bound values, and outputs the optimal neighborhood radius parameter value and/or a corresponding optimal number of clusters within the dataset. The modified ternary search algorithm leverages the near-unimodality of the neighborhood radius parameter, while selection of the upper bound value leverages a characteristic in which the neighborhood radius parameter value increases as the sampling rate decreases, and selection of the lower bound value uses ternary search that takes the number of clusters as a parameter instead of the neighborhood radius parameter.
    Type: Application
    Filed: October 3, 2025
    Publication date: May 14, 2026
    Applicant: Microsoft Technology Licensing, LLC
    Inventors: Inbal SAGIV, David DYCKMAN, Oron NIR
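The abstract above relies on the neighborhood radius behaving in a near-unimodal way. As a rough illustration only, the following is a minimal sketch of a plain (unmodified) ternary search for the maximizer of a unimodal function on an interval. The `toy_score` function, the bounds, and the tolerance are hypothetical stand-ins; the filing's modifications and its bound-selection steps are not reproduced here.

```python
from typing import Callable


def ternary_search_max(score: Callable[[float], float],
                       lo: float, hi: float,
                       tol: float = 1e-4) -> float:
    """Return an approximate maximizer of a unimodal `score` on [lo, hi]."""
    while hi - lo > tol:
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if score(m1) < score(m2):
            lo = m1  # the maximum cannot lie in [lo, m1)
        else:
            hi = m2  # the maximum cannot lie in (m2, hi]
    return (lo + hi) / 2.0


def toy_score(eps: float) -> float:
    """Hypothetical clustering-quality score that peaks at eps = 0.7."""
    return -(eps - 0.7) ** 2


# Assumed search bounds; a real system would derive them from the data.
best_eps = ternary_search_max(toy_score, lo=0.1, hi=2.0)
```

Each iteration discards one third of the interval, so the number of score evaluations grows only logarithmically with the ratio of the initial interval width to the tolerance.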
  • Publication number: 20260039920
    Abstract: A vision language model (“VLM”) generates text captions from video content. Innovations in controlling the complexity of captioning that uses a VLM are described. For example, a training tool updates a training set so that text captions are more concise, then fine-tunes a VLM using the updated training set. Or, as another example, a generative artificial intelligence model such as a VLM dynamically adjusts the probability of an end-of-sentence (“EOS”) token so that the probability of the EOS token increases in successive iterations of output token generation, which tends to make generated text captions more concise. Or, as another example, a captioning tool identifies and ranks representative units (such as keyframes) of video, then selectively applies captioning (using a VLM) to representative units of the video based on ranking information. Together or individually, the innovations can improve the computational efficiency and accuracy of captioning that uses a VLM.
    Type: Application
    Filed: July 30, 2024
    Publication date: February 5, 2026
    Applicant: Microsoft Technology Licensing, LLC
    Inventors: Oron NIR, Tal SHOHAM
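One of the ideas in the abstract above is raising the probability of the end-of-sentence token as generation proceeds. A minimal sketch of that idea follows, assuming a linear bias added to the EOS logit at each step. The linear ramp, the `ramp` value, and the three-token toy vocabulary are illustrative assumptions, not details taken from the filing.

```python
import math
from typing import List


def softmax(logits: List[float]) -> List[float]:
    """Numerically stable softmax over a list of logits."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def boosted_probs(logits: List[float], eos_id: int, step: int,
                  ramp: float = 0.5) -> List[float]:
    """Add a bias that grows linearly with `step` to the EOS logit."""
    adjusted = list(logits)
    adjusted[eos_id] += ramp * step  # assumed schedule, for illustration
    return softmax(adjusted)


# Toy vocabulary of three tokens; index 2 plays the role of EOS.
logits = [2.0, 1.0, 0.0]
eos_by_step = [boosted_probs(logits, eos_id=2, step=t)[2] for t in range(6)]
```

With fixed logits, the values in `eos_by_step` rise at every step, so a sampler becomes steadily more likely to stop, which tends to shorten the generated caption.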
  • Publication number: 20260037730
    Abstract: A vision language model (“VLM”) generates text captions from video content. Innovations in controlling the complexity of captioning that uses a VLM are described. For example, a training tool updates a training set so that text captions are more concise, then fine-tunes a VLM using the updated training set. Or, as another example, a generative artificial intelligence model such as a VLM dynamically adjusts the probability of an end-of-sentence (“EOS”) token so that the probability of the EOS token increases in successive iterations of output token generation, which tends to make generated text captions more concise. Or, as another example, a captioning tool identifies and ranks representative units (such as keyframes) of video, then selectively applies captioning (using a VLM) to representative units of the video based on ranking information. Together or individually, the innovations can improve the computational efficiency and accuracy of captioning that uses a VLM.
    Type: Application
    Filed: July 30, 2024
    Publication date: February 5, 2026
    Applicant: Microsoft Technology Licensing, LLC
    Inventors: Oron NIR, Tal SHOHAM
  • Patent number: 12455903
    Abstract: Systems and methods are provided for implementing large-scale density-based clustering functionalities. In examples, a system selects, for a dataset (which may be sampled at 100% or less), an upper bound value and a lower bound value of a neighborhood radius parameter of a density-based clustering algorithm. The system identifies, using a modified ternary search algorithm, an optimal neighborhood radius parameter value, based on the upper and lower bound values, and outputs the optimal neighborhood radius parameter value and/or a corresponding optimal number of clusters within the dataset. The modified ternary search algorithm leverages the near-unimodality of the neighborhood radius parameter, while selection of the upper bound value leverages a characteristic in which the neighborhood radius parameter value increases as the sampling rate decreases, and selection of the lower bound value uses ternary search that takes the number of clusters as a parameter instead of the neighborhood radius parameter.
    Type: Grant
    Filed: May 31, 2024
    Date of Patent: October 28, 2025
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Inbal Sagiv, David Dyckman, Oron Nir
  • Publication number: 20250246176
    Abstract: Examples of the present disclosure describe a video scene describer. The video scene describer receives video content data as input and provides audio description (AD) data as output. The video scene describer utilizes one or more components using artificial intelligence (AI) and/or algorithms to analyze and describe the video content data. For example, the video scene describer may include a video indexer component to identify and describe particular aspects of the video content data and generate video insights data based on the analysis. The video indexer provides video insights data to a large language model (LLM) component. The video scene describer may additionally include a visual-language model system, which includes a visual encoder, a relation aggregator, a transformer encoder, and/or transformer. The LLM component synthesizes the video insights data and video embedding data, along with any prompt (e.g., a request or question) or dialogue context, to provide the AD data.
    Type: Application
    Filed: January 31, 2024
    Publication date: July 31, 2025
    Applicant: Microsoft Technology Licensing, LLC
    Inventors: Oron NIR, Shemer Shmuel STEINLAUF, Eliyahu STRUGO
  • Patent number: 12346374
    Abstract: A video indexing system generates descriptive metadata for a video including identifiers for each of multiple detections that each correspond to a select one of multiple subjects that appear in the video. These detections are used to create relational graph data for the video, where the relational graph data includes nodes corresponding to each of the multiple subjects that appear in the video. A knowledge graph is queried with unique identifiers corresponding to the multiple subjects of the video to retrieve implicit relational data for each of the multiple subjects, and a merged relational graph is created by merging the implicit relational data retrieved from the knowledge graph with the relational graph data created for the video. A search engine uses the merged relational graph to identify video content relevant to a user query that is based on an implicit relation. Search results identifying the relevant content are presented on a user device.
    Type: Grant
    Filed: December 16, 2022
    Date of Patent: July 1, 2025
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Oron Nir, Ika Bar-Menachem, Inbal Sagiv
  • Publication number: 20250156642
    Abstract: Systems and methods for semantic temporal segmentation based on topic recognition are disclosed. Text classification and segmentation may be used to index media content for subsequent searches. The method may include using a text classification model to semantically analyze sentences in a text file (such as a transcript) to determine topics with which sentences are associated. The output of the text classification model may be provided to a text segmentation model to enable more-accurate identification of text segments (e.g., paragraphs) within the text file. In some examples, the output of a text segmentation model is provided to a text classification model to enable the text classification model to perform more-accurate classification based on text segments rather than (or in addition to) performing classification on single sentences. The classification of the text segments may be used to assign labels to the text segments to enable subsequent searches based on the labels.
    Type: Application
    Filed: November 13, 2023
    Publication date: May 15, 2025
    Applicant: Microsoft Technology Licensing, LLC
    Inventors: Mattan SERRY, Oron NIR
  • Patent number: 12242803
    Abstract: An ontology matching system performs operations to refine a natural language processing (NLP) model that encodes terms of a first hierarchical ontology and of a second hierarchical ontology as embeddings in a latent space. The operations include performing at least a first round of triplet loss training to decrease separation between select pairs of the embeddings sampled from the different ontologies that satisfy a first hierarchical relation while increasing separation between other pairs of the embeddings that do not satisfy the first hierarchical relation. The system then determines, from the refined NLP model, a stable matching scheme that matches each term in the first hierarchical ontology with a corresponding term of the second hierarchical ontology. Responsive to receiving terms of the first hierarchical ontology from an application, the system uses the stable matching scheme to map each of the terms to corresponding terms of the second hierarchical ontology.
    Type: Grant
    Filed: June 29, 2022
    Date of Patent: March 4, 2025
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Oron Nir, Inbal Sagiv, Fardau Van Neerden
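The abstract above ends with a stable matching between the terms of two ontologies. As an illustration of what a stable matching is, here is a minimal sketch of the classic Gale-Shapley procedure applied to a similarity matrix. The choice of Gale-Shapley, the square matrix, and the similarity values are assumptions made for the example; the abstract does not say which procedure the system uses.

```python
from typing import Dict, List


def stable_match(sim: List[List[float]]) -> Dict[int, int]:
    """Gale-Shapley: rows propose to columns; higher similarity is preferred.

    Assumes a square matrix, i.e. the same number of terms on both sides.
    """
    n = len(sim)
    # Each row's columns, ordered from most to least similar.
    prefs = [sorted(range(n), key=lambda j, i=i: -sim[i][j]) for i in range(n)]
    next_choice = [0] * n        # position in prefs[i] of the next proposal
    holder: Dict[int, int] = {}  # column -> row currently matched to it
    free = list(range(n))
    while free:
        i = free.pop()
        j = prefs[i][next_choice[i]]
        next_choice[i] += 1
        if j not in holder:
            holder[j] = i
        elif sim[i][j] > sim[holder[j]][j]:
            free.append(holder[j])  # the displaced row proposes again later
            holder[j] = i
        else:
            free.append(i)          # rejected; try the next preference
    return {row: col for col, row in holder.items()}


# Hypothetical similarities: rows are terms of the first ontology,
# columns are terms of the second.
sim = [
    [0.90, 0.80, 0.10],
    [0.85, 0.20, 0.30],
    [0.40, 0.70, 0.60],
]
matching = stable_match(sim)
```

In the result, no row and column both prefer each other over their assigned partners, which is the property that makes the matching stable.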
  • Publication number: 20250054337
    Abstract: Aspects of the technology described herein improve an object recognition system by specifying a type of picture that would improve the accuracy of the object recognition system if used to retrain the object recognition system. The technology described herein can take the form of an improvement model that improves an object recognition model by suggesting the types of training images that would improve the object recognition model's performance. For example, the improvement model could suggest that a picture of a person smiling be used to retrain the object recognition system. Once trained, the improvement model can be used to estimate a performance score for an image recognition model given the characteristics of a set of training images. The improvement model can then select a feature of an image that, if added to the training set, would cause a meaningful increase in the recognition system's performance.
    Type: Application
    Filed: October 21, 2024
    Publication date: February 13, 2025
    Inventors: Oron NIR, Royi RONEN, Ohad JASSIN, Milan M. GADA, Mor Geva PIPEK
  • Patent number: 12222974
    Abstract: A method for automatically classifying terms of a first ontology into categories of a classification scheme defined with respect to a second ontology includes generating, for each term in the first ontology and each term in the second ontology, an embedding encoding the term and a description of the term. The method further includes adding the generated embeddings to a transformer model and computing, for each pair of the embeddings consisting of a first term from the first ontology and a second term from the second ontology, a similarity metric quantifying a similarity of the first term and the second term. The method still further provides for determining a matching scheme based on the similarity metric computed with respect to each pair of the embeddings, where the matching scheme associates terms of the first ontology with one or more relevant categories of the classification scheme defined with respect to the second ontology.
    Type: Grant
    Filed: June 29, 2022
    Date of Patent: February 11, 2025
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Oron Nir, Inbal Sagiv, Fardau Van Neerden
  • Publication number: 20240419944
    Abstract: Sampling operations enable a computer vision tool to regulate downstream tasks. The sampling operations can indicate which frames of a video sequence should be processed by different downstream tasks. For example, a computer vision tool receives encoded data for a given frame and uses the encoded data to determine inputs for machine learning models in different channels. The computer vision tool provides the inputs to the machine learning models, respectively, and fuses results from the machine learning models. In this way, the computer vision tool determines a set of event indicators for the given frame. Based at least in part on the event indicator(s) for the given frame, the computer vision tool regulates downstream tasks for the given frame (e.g., selectively performing or skipping downstream tasks for the given frame, or otherwise adjusting how and when downstream tasks are performed for the given frame).
    Type: Application
    Filed: June 13, 2023
    Publication date: December 19, 2024
    Applicant: Microsoft Technology Licensing, LLC
    Inventors: Oron NIR, Fardau VAN NEERDEN, Inbal SAGIV
  • Patent number: 12169984
    Abstract: Aspects of the technology described herein improve an object recognition system by specifying a type of picture that would improve the accuracy of the object recognition system if used to retrain the object recognition system. The technology described herein can take the form of an improvement model that improves an object recognition model by suggesting the types of training images that would improve the object recognition model's performance. For example, the improvement model could suggest that a picture of a person smiling be used to retrain the object recognition system. Once trained, the improvement model can be used to estimate a performance score for an image recognition model given the characteristics of a set of training images. The improvement model can then select a feature of an image that, if added to the training set, would cause a meaningful increase in the recognition system's performance.
    Type: Grant
    Filed: January 25, 2021
    Date of Patent: December 17, 2024
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Oron Nir, Royi Ronen, Ohad Jassin, Milan M. Gada, Mor Geva Pipek
  • Publication number: 20240370661
    Abstract: Multimedia content is summarized with the use of summary prompts that are created with audio and visual insights obtained from the multimedia content. An aggregated timeline temporally aligns the audio and visual insights. The aggregated timeline is segmented into coherent segments that each include a unique combination of audio and visual insights. These segments are grouped into chunks, based on prompt size constraints, and are used with identified summarization styles to create the summary prompts. The summary prompts are provided to summarization models to obtain summaries having content and summarization styles based on the summary prompts.
    Type: Application
    Filed: June 9, 2023
    Publication date: November 7, 2024
    Inventors: Tom HIRSHBERG, Yonit HOFFMAN, Zvi FIGOV, Maayan YEDIDIA DOTAN, Oron NIR
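The abstract above groups timeline segments into chunks under a prompt size constraint. A minimal sketch of one way to do that is shown below, using greedy packing and a character count as a stand-in for prompt size. The greedy strategy, the `max_chars` budget, and the sample segment texts are all assumptions for illustration; a real system would more likely count model tokens.

```python
from typing import List


def chunk_segments(segments: List[str], max_chars: int) -> List[List[str]]:
    """Greedily pack consecutive segments into chunks of at most max_chars.

    A single segment longer than max_chars is placed in a chunk of its own.
    """
    chunks: List[List[str]] = []
    current: List[str] = []
    used = 0
    for seg in segments:
        # Start a new chunk when adding this segment would exceed the budget.
        if current and used + len(seg) > max_chars:
            chunks.append(current)
            current, used = [], 0
        current.append(seg)
        used += len(seg)
    if current:
        chunks.append(current)
    return chunks


# Hypothetical segment descriptions taken from an aggregated timeline.
segments = ["intro scene", "speaker A talks", "music", "speaker B replies"]
chunks = chunk_segments(segments, max_chars=30)
```

Packing only consecutive segments keeps each chunk temporally contiguous, so a summary prompt built from a chunk describes one stretch of the timeline.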
  • Publication number: 20240312477
    Abstract: Examples of the present disclosure describe systems and methods for multichannel audio speech classification. In examples, an audio signal comprising multiple audio channels is received at a processing device. Each of the audio channels in the audio signal is transcoded to a predefined audio format. For each of the transcoded audio channels, an average power value is calculated for one or more data windows in the audio signal. A correlation value is calculated between the average power value for each audio channel and the combined average power value of the other audio channels in the audio signal. Each of the correlation values (or an aggregated correlation value for the audio channels) is then compared against a threshold value to determine whether the audio signal is to be classified as a speech-based communication. Based on the classification, an action associated with the audio signal may be performed.
    Type: Application
    Filed: December 27, 2023
    Publication date: September 19, 2024
    Applicant: Microsoft Technology Licensing, LLC
    Inventors: Oron NIR, Inbal SAGIV, Maayan YEDIDIA, Fardau VAN NEERDEN, Itai NORMAN
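The steps in the abstract above (per-window average power, then a correlation between each channel and the other channels combined) can be sketched as follows. This is a minimal illustration under stated assumptions: power is taken as mean squared amplitude, the other channels are combined by summing their window powers, and Pearson correlation is used. The abstract does not specify these choices, nor which side of the threshold indicates speech, so the sketch stops at computing the correlation values.

```python
import math
from typing import List


def window_power(samples: List[float], window: int) -> List[float]:
    """Average power (mean squared amplitude) of each full window."""
    return [
        sum(x * x for x in samples[i:i + window]) / window
        for i in range(0, len(samples) - window + 1, window)
    ]


def pearson(a: List[float], b: List[float]) -> float:
    """Pearson correlation; returns 0.0 when either input is constant."""
    mean_a, mean_b = sum(a) / len(a), sum(b) / len(b)
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 0.0
    return cov / math.sqrt(var_a * var_b)


def channel_correlations(channels: List[List[float]],
                         window: int) -> List[float]:
    """Correlate each channel's power with the summed power of the others."""
    powers = [window_power(ch, window) for ch in channels]
    result = []
    for k, own in enumerate(powers):
        others = [sum(p[w] for j, p in enumerate(powers) if j != k)
                  for w in range(len(own))]
        result.append(pearson(own, others))
    return result


# Toy two-channel signal in which the channels are active in alternation.
ch0 = [1.0, 1.0, 0.0, 0.0, 1.0, 1.0, 0.0, 0.0]
ch1 = [0.0, 0.0, 1.0, 1.0, 0.0, 0.0, 1.0, 1.0]
corrs = channel_correlations([ch0, ch1], window=2)
```

In this toy signal the two channels never carry energy at the same time, so each channel's power is strongly negatively correlated with the other's. The resulting values would then be compared against a threshold, as the abstract describes.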
  • Publication number: 20240202240
    Abstract: A video indexing system generates descriptive metadata for a video including identifiers for each of multiple detections that each correspond to a select one of multiple subjects that appear in the video. These detections are used to create relational graph data for the video, where the relational graph data includes nodes corresponding to each of the multiple subjects that appear in the video. A knowledge graph is queried with unique identifiers corresponding to the multiple subjects of the video to retrieve implicit relational data for each of the multiple subjects, and a merged relational graph is created by merging the implicit relational data retrieved from the knowledge graph with the relational graph data created for the video. A search engine uses the merged relational graph to identify video content relevant to a user query that is based on an implicit relation. Search results identifying the relevant content are presented on a user device.
    Type: Application
    Filed: December 16, 2022
    Publication date: June 20, 2024
    Inventors: Oron NIR, Ika BAR-MENACHEM, Inbal SAGIV
  • Patent number: 11954893
    Abstract: The technology described herein is directed to systems, methods, and software for indexing video. In an implementation, a method comprises identifying one or more regions of interest around target content in a frame of the video. Further, the method includes identifying, in a portion of the frame outside a region of interest, potentially empty regions adjacent to the region of interest. The method continues with identifying at least one empty region of the potentially empty regions that satisfies one or more criteria and classifying the at least one empty region as a negative sample of the target content. In some implementations, the negative sample of the target content is included in a set of negative samples of the target content with which to train a machine learning model employed to identify instances of the target content.
    Type: Grant
    Filed: June 17, 2022
    Date of Patent: April 9, 2024
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Oron Nir, Maria Zontak, Tucker Cunningham Burns, Apar Singhal, Lei Zhang, Irit Ofer, Avner Levi, Haim Sabo, Ika Bar-Menachem, Eylon Ami, Ella Ben Tov, Anika Zaman
  • Patent number: 11900961
    Abstract: Examples of the present disclosure describe systems and methods for multichannel audio speech classification. In examples, an audio signal comprising multiple audio channels is received at a processing device. Each of the audio channels in the audio signal is transcoded to a predefined audio format. For each of the transcoded audio channels, an average power value is calculated for one or more data windows in the audio signal. A correlation value is calculated between the average power value for each audio channel and the combined average power value of the other audio channels in the audio signal. Each of the correlation values (or an aggregated correlation value for the audio channels) is then compared against a threshold value to determine whether the audio signal is to be classified as a speech-based communication. Based on the classification, an action associated with the audio signal may be performed.
    Type: Grant
    Filed: May 31, 2022
    Date of Patent: February 13, 2024
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Oron Nir, Inbal Sagiv, Maayan Yedidia, Fardau Van Neerden, Itai Norman
  • Publication number: 20240004915
    Abstract: A method for automatically classifying terms of a first ontology into categories of a classification scheme defined with respect to a second ontology includes generating, for each term in the first ontology and each term in the second ontology, an embedding encoding the term and a description of the term. The method further includes adding the generated embeddings to a transformer model and computing, for each pair of the embeddings consisting of a first term from the first ontology and a second term from the second ontology, a similarity metric quantifying a similarity of the first term and the second term. The method still further provides for determining a matching scheme based on the similarity metric computed with respect to each pair of the embeddings, where the matching scheme associates terms of the first ontology with one or more relevant categories of the classification scheme defined with respect to the second ontology.
    Type: Application
    Filed: June 29, 2022
    Publication date: January 4, 2024
    Inventors: Oron NIR, Inbal SAGIV, Fardau VAN NEERDEN
  • Publication number: 20240005094
    Abstract: A system for ontology matching performs operations to refine a natural language processing (NLP) model that encodes terms of a first hierarchical ontology and of a second hierarchical ontology as embeddings in a vector space in which spatial proximity between the embeddings is correlated with similarity between the associated terms. The operations to refine the NLP model include performing at least a first round of triplet loss training to decrease separation between select pairs of the embeddings sampled from the different ontologies that satisfy a first hierarchical relation while increasing separation between other pairs of the embeddings that do not satisfy the first hierarchical relation. The system then determines, from the refined NLP model, a stable matching scheme that matches each term in the first hierarchical ontology with a corresponding term of the second hierarchical ontology.
    Type: Application
    Filed: June 29, 2022
    Publication date: January 4, 2024
    Inventors: Oron NIR, Inbal SAGIV, Fardau VAN NEERDEN
  • Publication number: 20230419663
    Abstract: Examples of the present disclosure describe systems and methods for video genre classification. In one example implementation, video content is received. A plurality of sliding windows of the video content is sampled. The plurality of sliding windows comprises audio data and video data. The audio data is analyzed to identify a set of audio features. The video data is analyzed to identify a set of video features. The set of audio features and the set of video features are provided to a classifier. The classifier is configured to detect a genre for the video content using the set of audio features and the set of video features. The video content is indexed based on the genre.
    Type: Application
    Filed: June 27, 2022
    Publication date: December 28, 2023
    Applicant: Microsoft Technology Licensing, LLC
    Inventors: Oron NIR, Mattan SERRY, Yonit HOFFMAN, Michael BEN-HAYM, Zvi FIGOV, Eliyahu STRUGO, Avi NEEMAN