Patents by Inventor Yonathan WEILL

Yonathan WEILL has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

Context-based prompt generation for automated translations between natural language and query language

Patent number: 12524403

Abstract: A disclosed method facilitates translation of natural language queries into query language statements usable to retrieve data from or write data to a particular database. The method includes obtaining a pool of shots. Each shot in the pool includes a natural language query component and a corresponding database translation component. The method further provides for vectorizing the natural language query component for each of the shots into a common vector space; receiving a natural language query from a user interface; vectorizing the natural language query within the common vector space; identifying a subset of vectorized natural language query components that satisfy a similarity metric when compared to the vectorized natural language query; and generating an LLM prompt that includes shots from the pool corresponding to the subset of the vectorized natural language query.
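The shot-selection step in this abstract can be illustrated with a minimal sketch. It is not the patented implementation: the bag-of-words `vectorize` function stands in for whatever embedding model is actually used, top-k cosine similarity is only one possible reading of "satisfy a similarity metric", and the names `POOL`, `vectorize`, `cosine`, and `build_prompt` are hypothetical.

```python
import math
from collections import Counter

def vectorize(text):
    """Toy bag-of-words vector; an assumed stand-in for a learned embedding."""
    return Counter(text.lower().split())

def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0

def build_prompt(pool, user_query, k=2):
    """Pick the k shots whose natural language part is closest to the query,
    then lay them out as a few-shot prompt ending with the open question."""
    q = vectorize(user_query)
    ranked = sorted(pool, key=lambda shot: cosine(vectorize(shot[0]), q), reverse=True)
    parts = [f"Question: {nl}\nQuery: {sql}" for nl, sql in ranked[:k]]
    parts.append(f"Question: {user_query}\nQuery:")
    return "\n\n".join(parts)

# Hypothetical pool of (natural language, database query) shots.
POOL = [
    ("list all customers in France", "SELECT * FROM customers WHERE country = 'France'"),
    ("count orders placed in 2023", "SELECT COUNT(*) FROM orders WHERE year = 2023"),
    ("show the average price of products", "SELECT AVG(price) FROM products"),
]
```

Calling `build_prompt(POOL, "list all customers in Germany", k=1)` selects only the customers shot, since it shares the most terms with the incoming query.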

Type: Grant

Filed: November 14, 2023

Date of Patent: January 13, 2026

Assignee: Microsoft Technology Licensing, LLC

Inventors: Oren Barkan, Yonathan Weill, Noam Koenigstein

Model with usage data compensation

Patent number: 12423381

Abstract: A method of training a machine learning model is provided. The method includes receiving labeled training data in the machine learning model, the received labeled training data including content data for items accessible to a user and input usage data representing recorded interaction between the user and the items, wherein the received content data for each item includes data representing intrinsic attributes of the item. The method further includes selecting a set of the input usage data that excludes input usage data for a proper subset of the items and training the machine learning model based on both the content data and the selected set of input usage data of the received labeled training data for the items.
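The selection step described here, keeping content data for every item while excluding usage data for a proper subset of items, might look like the sketch below. The abstract does not say how the subset is chosen or how the model handles the missing usage data; random selection, the `drop_fraction` parameter, the `None` placeholder, and the function name `mask_usage` are all assumptions made for illustration.

```python
import random

def mask_usage(items, drop_fraction=0.5, seed=0):
    """Return copies of the training items with usage data removed for a
    randomly chosen proper subset, leaving content data untouched.

    Each item is a dict with "content" and "usage" keys (assumed layout).
    """
    rng = random.Random(seed)
    # Drop at least one item's usage data, but never all of them,
    # so the excluded items form a proper subset.
    n_drop = min(len(items) - 1, max(1, int(len(items) * drop_fraction)))
    dropped = set(rng.sample(range(len(items)), n_drop))
    return [
        {"content": item["content"], "usage": None if i in dropped else item["usage"]}
        for i, item in enumerate(items)
    ]
```

A model trained on the masked items sees some examples with content data only, which is one plausible way to reduce its reliance on usage data.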

Type: Grant

Filed: December 6, 2021

Date of Patent: September 23, 2025

Assignee: Microsoft Technology Licensing, LLC

Inventors: Oren Barkan, Roy Hirsch, Ori Katz, Avi Caciularu, Yonathan Weill, Noam Koenigstein, Nir Nice

EVALUATION AND OPTIMIZATION OF NATURAL LANGUAGE TO DATABASE QUERY TRANSLATION

Publication number: 20250245220

Abstract: Natural language to database query translation optimization is disclosed. Due to the flexibility of natural language and database query languages, translations may differ in form, yet still be valid. A natural language query is translated to a first database query in a database query language using a query translator. A first and second database query result are generated by performing a first and second database query on a database. The second database query differs from the first database query in form (e.g., order of the query parameters), but is functionally similar to the first database query, such that an ideal query translation would produce equivalent database query results (e.g., same retrieved information, even if in a different order). If the first database query result matches the second database query result, training reinforcement is provided, but if not, a training adjustment is provided for the query translator.
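The result-comparison idea in this abstract can be sketched with Python's built-in `sqlite3` module. This only illustrates checking whether two differently written queries return the same rows regardless of order; the table, the data, and the function names `results_match` and `make_demo_db` are hypothetical, and the training reinforcement or adjustment that would follow the comparison is not shown.

```python
import sqlite3
from collections import Counter

def results_match(conn, query_a, query_b):
    """Run both queries and compare their results as multisets of rows,
    so that a different row order does not count as a mismatch."""
    rows_a = conn.execute(query_a).fetchall()
    rows_b = conn.execute(query_b).fetchall()
    return Counter(rows_a) == Counter(rows_b)

def make_demo_db():
    """Build a small in-memory database used only for this example."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE orders (id INTEGER, country TEXT, total REAL)")
    conn.executemany(
        "INSERT INTO orders VALUES (?, ?, ?)",
        [(1, "FR", 10.0), (2, "DE", 25.0), (3, "FR", 40.0)],
    )
    return conn
```

In an evaluation loop, a `True` result would correspond to the reinforcement case and a `False` result to the adjustment case described above.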

Type: Application

Filed: January 26, 2024

Publication date: July 31, 2025

Inventors: Yonathan WEILL, Oren BARKAN, Noam KOENIGSTEIN

LANGUAGE MODEL HALLUCINATION DETECTION

Publication number: 20250238629

Abstract: In some embodiments, a language model forward traversal with a few-shot learning forward prompt yields a primary answer from a primary question. Then at least one backward traversal yields at least one candidate question using backward prompt(s) with answer-question pairs derived from the forward prompt's question-answer pairs. Each backward prompt also includes the primary answer but not the primary question. Each backward traversal is through one or more language models, not necessarily including the forward traversal's language model. Sometimes backward traversals vary model temperature, top-p, or top-k. A vector distance calculated between at least some candidate question vectors and a primary question vector indicates whether the primary answer includes hallucination content, and in some cases how much. Some embodiments withhold hallucinated answers from user interfaces and device control interfaces. Some embodiments also loop to obtain an answer with less hallucination content.
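The final scoring step, measuring the vector distance between the primary question and the candidate questions recovered by backward traversal, can be sketched as follows. The language model calls are omitted, so the candidate questions are passed in directly. The bag-of-words `embed` function, the mean cosine distance, the `0.5` threshold, and all function names are assumptions for illustration, not details taken from the application.

```python
import math
from collections import Counter

def embed(text):
    """Toy bag-of-words vector; an assumed stand-in for a real text embedding."""
    return Counter(text.lower().split())

def cosine_distance(a, b):
    """One minus cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return 1.0 - (dot / norm if norm else 0.0)

def hallucination_score(primary_question, candidate_questions):
    """Mean distance from the primary question to the candidate questions."""
    p = embed(primary_question)
    distances = [cosine_distance(p, embed(c)) for c in candidate_questions]
    return sum(distances) / len(distances)

def is_hallucination(primary_question, candidate_questions, threshold=0.5):
    """Flag the answer when the recovered questions drift far from the original."""
    return hallucination_score(primary_question, candidate_questions) > threshold
```

The intuition is that a faithful answer should lead back to a question close to the one originally asked, while a hallucinated answer tends to lead somewhere else.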

Type: Application

Filed: January 21, 2024

Publication date: July 24, 2025

Inventors: Itzik MALKIEL, Yakir YEHUDA, Oren BARKAN, Yonathan WEILL, Noam KOENIGSTEIN

CONTEXT-BASED PROMPT GENERATION FOR AUTOMATED TRANSLATIONS BETWEEN NATURAL LANGUAGE AND QUERY LANGUAGE

Publication number: 20250156413

Abstract: A disclosed method facilitates translation of natural language queries into query language statements usable to retrieve data from or write data to a particular database. The method includes obtaining a pool of shots. Each shot in the pool includes a natural language query component and a corresponding database translation component. The method further provides for vectorizing the natural language query component for each of the shots into a common vector space; receiving a natural language query from a user interface; vectorizing the natural language query within the common vector space; identifying a subset of vectorized natural language query components that satisfy a similarity metric when compared to the vectorized natural language query; and generating an LLM prompt that includes shots from the pool corresponding to the subset of the vectorized natural language query.

Type: Application

Filed: November 14, 2023

Publication date: May 15, 2025

Inventors: Oren BARKAN, Yonathan WEILL, Noam KOENIGSTEIN

Representation learning with side information

Patent number: 12223274

Abstract: A relational similarity determination engine receives as input a dataset including a set of entities and co-occurrence data that defines co-occurrence relations for pairs of the entities. The relational similarity determination engine also receives as input side information defining explicit relations between the entities. The relational similarity determination engine jointly models the co-occurrence relations and the explicit relations for the entities to compute a similarity metric for each different pair of entities within the dataset. Based on the computed similarity metrics, the relational similarity determination engine identifies a most similar replacement entity from the dataset for each of the entities within the dataset. For a select entity received as an input, the relational similarity determination engine outputs the identified most similar replacement entity.

Type: Grant

Filed: October 29, 2021

Date of Patent: February 11, 2025

Assignee: Microsoft Technology Licensing, LLC

Inventors: Oren Barkan, Avi Caciularu, Idan Rejwan, Yonathan Weill, Noam Koenigstein, Ori Katz, Itzik Malkiel, Nir Nice

Systems and methods for semantic search via focused summarizations

Patent number: 11836175

Abstract: Semantic search techniques via focused summarizations are described. For example, a search query is received for a text-based content item in a data set comprising a plurality of text-based content items. A first feature vector representative of the search query is obtained. A respective semantic similarity score is determined between the first feature vector and each of a plurality of second feature vectors. Each of the second feature vectors is representative of a machine-generated summarization of a respective text-based content item. The machine-generated summarization comprises a plurality of multi-word fragments that are selected from the respective text-based content item via a transformer-based machine learning model. A search result is provided responsive to the search query.

Type: Grant

Filed: June 29, 2022

Date of Patent: December 5, 2023

Assignee: Microsoft Technology Licensing, LLC

Inventors: Itzik Malkiel, Noam Koenigstein, Oren Barkan, Jonathan Ephrath, Yonathan Weill, Nir Nice

DEEP ANGULAR SIMILARITY LEARNING

Publication number: 20230376835

Abstract: A comparison engine performs item similarity comparisons. A source item and one or more candidate items are input into a triplet-trained machine learning model trained using training data including triplets of anchor elements, positive elements, and negative elements. Each triplet corresponds to an item included in the training data. The anchor elements and the positive elements are included in the corresponding item. The negative element is included in a different item in the training data. A similarity score between the source item and each of the one or more candidate items is generated from the triplet-trained machine learning model.
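The triplet construction described here, with the anchor and positive drawn from the same item and the negative drawn from a different item, can be sketched as below. The abstract does not state the loss function. The angular distance and margin-based triplet loss are assumptions suggested by the title, and the names `build_triplets`, `angular_distance`, and `triplet_loss` are hypothetical.

```python
import math
import random

def build_triplets(items, seed=0):
    """Build one (anchor, positive, negative) triplet per item.

    `items` maps an item id to a list of at least two elements
    (for example, paragraphs of a document). At least two items are required.
    """
    rng = random.Random(seed)
    ids = list(items)
    triplets = []
    for item_id in ids:
        # Anchor and positive come from the same item.
        anchor, positive = rng.sample(items[item_id], 2)
        # The negative comes from a different item.
        other_id = rng.choice([i for i in ids if i != item_id])
        negative = rng.choice(items[other_id])
        triplets.append((anchor, positive, negative))
    return triplets

def angular_distance(u, v):
    """Angle between two non-zero vectors, scaled to the range [0, 1]."""
    dot = sum(x * y for x, y in zip(u, v))
    norm = math.hypot(*u) * math.hypot(*v)
    cos = max(-1.0, min(1.0, dot / norm))
    return math.acos(cos) / math.pi

def triplet_loss(anchor, positive, negative, margin=0.1):
    """Zero once the positive is closer to the anchor than the negative by the margin."""
    return max(
        0.0,
        angular_distance(anchor, positive) - angular_distance(anchor, negative) + margin,
    )
```

In a trained model, the vectors passed to `triplet_loss` would be learned embeddings of the sampled elements rather than hand-written tuples.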

Type: Application

Filed: May 20, 2022

Publication date: November 23, 2023

Inventors: Itzik MALKIEL, Noam KOENIGSTEIN, Yonathan WEILL, Oren BARKAN, Jonathan EPHRATH, Nir NICE

MODEL WITH USAGE DATA COMPENSATION

Publication number: 20230177111

Abstract: A method of training a machine learning model is provided. The method includes receiving labeled training data in the machine learning model, the received labeled training data including content data for items accessible to a user and input usage data representing recorded interaction between the user and the items, wherein the received content data for each item includes data representing intrinsic attributes of the item. The method further includes selecting a set of the input usage data that excludes input usage data for a proper subset of the items and training the machine learning model based on both the content data and the selected set of input usage data of the received labeled training data for the items.

Type: Application

Filed: December 6, 2021

Publication date: June 8, 2023

Inventors: Oren BARKAN, Roy HIRSCH, Ori KATZ, Avi CACIULARU, Yonathan WEILL, Noam KOENIGSTEIN, Nir NICE

REPRESENTATION LEARNING WITH SIDE INFORMATION

Publication number: 20230137718

Abstract: A relational similarity determination engine receives as input a dataset including a set of entities and co-occurrence data that defines co-occurrence relations for pairs of the entities. The relational similarity determination engine also receives as input side information defining explicit relations between the entities. The relational similarity determination engine jointly models the co-occurrence relations and the explicit relations for the entities to compute a similarity metric for each different pair of entities within the dataset. Based on the computed similarity metrics, the relational similarity determination engine identifies a most similar replacement entity from the dataset for each of the entities within the dataset. For a select entity received as an input, the relational similarity determination engine outputs the identified most similar replacement entity.

Type: Application

Filed: October 29, 2021

Publication date: May 4, 2023

Inventors: Oren BARKAN, Avi CACIULARU, Idan REJWAN, Yonathan WEILL, Noam KOENIGSTEIN, Ori KATZ, Itzik MALKIEL, Nir NICE