Patents by Inventor Hila Yehuda

Hila Yehuda has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

Iterative vectoring for constructing data driven machine learning models

Patent number: 12248855

Abstract: Embodiments described herein are directed to generating a machine learning (ML) model. A plurality of vectors are accessed, each vector of the plurality of vectors including a first set of features associated with a corresponding data item. A second set of features is identified by expanding the first set of features. An ML model is trained using vectors including the expanded set of features, and it is determined that an accuracy of the ML model trained using the vectors increased. A third set of features is identified by determining a measure of importance for different subsets of features in the second set and replacing subsets having a low measure of importance with new features. An ML model is trained using vectors that include the third set, and it is determined that an accuracy of the model increased due to the replacing.

Type: Grant

Filed: October 27, 2022

Date of Patent: March 11, 2025

Assignee: MICROSOFT TECHNOLOGY LICENSING, LLC

Inventors: Oren Elisha, Ami Luttwak, Hila Yehuda, Adar Kahana, Maya Bechler-Speicher
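
The abstract of this entry describes a loop: expand a feature set, keep the expansion if accuracy rises, then swap low-importance features for new ones and keep the swap if accuracy rises again. The following is a minimal sketch of that idea, not the patented implementation. The nearest-centroid classifier, the ablation-based importance score, the training-set accuracy check, and every name in the code (`iterate_features`, `noise`, `signal`, `strong`) are assumptions made for illustration, and the sketch replaces one feature at a time rather than subsets of features.

```python
from statistics import mean

def train_centroids(rows, labels, features):
    """Fit a nearest-centroid classifier restricted to `features` (an assumed stand-in model)."""
    centroids = {}
    for label in set(labels):
        members = [r for r, y in zip(rows, labels) if y == label]
        centroids[label] = [mean(r[f] for r in members) for f in features]
    return centroids

def predict(centroids, row, features):
    """Return the label whose centroid is closest; ties go to the alphabetically first label."""
    vec = [row[f] for f in features]
    def sq_dist(label):
        return sum((a - b) ** 2 for a, b in zip(vec, centroids[label]))
    return min(sorted(centroids), key=sq_dist)

def accuracy(rows, labels, features):
    """Training-set accuracy; a real pipeline would score on held-out data."""
    model = train_centroids(rows, labels, features)
    hits = sum(predict(model, r, features) == y for r, y in zip(rows, labels))
    return hits / len(rows)

def importance(rows, labels, features, feature):
    """Ablation importance: accuracy lost when `feature` is dropped."""
    rest = [f for f in features if f != feature]
    return accuracy(rows, labels, features) - accuracy(rows, labels, rest)

def iterate_features(rows, labels, first, expansion, replacements):
    current = list(first)
    # Expand the first set; keep the expanded (second) set only if accuracy increased.
    second = current + [f for f in expansion if f not in current]
    if accuracy(rows, labels, second) > accuracy(rows, labels, current):
        current = second
    # Try each new feature in place of the least important one; keep the swap
    # (a third set) only if accuracy increased.
    for new in replacements:
        weakest = min(current, key=lambda f: importance(rows, labels, current, f))
        third = [new if f == weakest else f for f in current]
        if accuracy(rows, labels, third) > accuracy(rows, labels, current):
            current = third
    return current

# Hypothetical toy data: "noise" carries no class information, "signal" some, "strong" a lot.
rows = [
    {"noise": 1.0, "signal": 0.0, "strong": 0.0},
    {"noise": 3.0, "signal": 2.5, "strong": 0.0},
    {"noise": 1.0, "signal": 2.0, "strong": 10.0},
    {"noise": 3.0, "signal": 3.0, "strong": 10.0},
]
labels = ["a", "a", "b", "b"]
selected = iterate_features(rows, labels, ["noise"], ["signal"], ["strong"])
print(selected)
```

On this toy data the expansion with `signal` is accepted, and `noise` is then identified as the least important feature and swapped for `strong`. Gating each step on a measured accuracy gain is what keeps the loop from accumulating features that do not help.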

SYSTEM AND METHOD FOR IMPROVING MACHINE LEARNING MODELS BY DETECTING AND REMOVING INACCURATE TRAINING DATA

Publication number: 20240202591

Abstract: Methods, systems and computer program products are described to improve machine learning (ML) model-based classification of data items by identifying and removing inaccurate training data. Inaccurate training samples may be identified, for example, based on excessive variance in vector space between a training sample and a mean of category training samples, and based on a variance between an assigned category and a predicted category for a training sample. Suspect or erroneous samples may be selectively removed based on, for example, vector space variance and/or prediction confidence level. As a result, ML model accuracy may be improved by training on a more accurate revised training set. ML model accuracy may (e.g., also) be improved, for example, by identifying and removing suspect categories with excessive (e.g., weighted) vector space variance. Suspect categories may be retained or revised. Users may (e.g., also) specify a prediction confidence level and/or coverage (e.g., to control accuracy).

Type: Application

Filed: January 30, 2024

Publication date: June 20, 2024

Inventors: Oren Elisha, Ami Luttwak, Hila Yehuda, Adar Kahana, Maya Bechler-Speicher
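
The abstract of this entry names two signals for spotting inaccurate training samples: a sample lying unusually far from the mean of its assigned category in vector space, and a mismatch between the assigned category and a predicted one. A rough sketch of how the two signals could be combined follows. The z-score on distance, the nearest-mean predictor, the `z_threshold` default, and the `cat`/`dog` data are all assumptions for illustration and are not taken from the patent.

```python
from math import dist
from statistics import mean, pstdev

def category_means(vectors, labels):
    """Mean vector of the training samples assigned to each category."""
    means = {}
    for label in set(labels):
        members = [v for v, y in zip(vectors, labels) if y == label]
        means[label] = [mean(col) for col in zip(*members)]
    return means

def find_suspects(vectors, labels, z_threshold=1.5):
    """Return indices of samples that look mislabeled.

    A sample is flagged when (a) its distance to its own category mean is
    unusually large compared with its category peers, and (b) the nearest
    category mean belongs to a different category than the assigned one.
    """
    means = category_means(vectors, labels)
    own = [dist(v, means[y]) for v, y in zip(vectors, labels)]
    suspects = []
    for i, (v, y) in enumerate(zip(vectors, labels)):
        peers = [d for d, other in zip(own, labels) if other == y]
        spread = pstdev(peers) or 1.0  # avoid dividing by zero for uniform categories
        too_far = (own[i] - mean(peers)) / spread > z_threshold
        predicted = min(sorted(means), key=lambda c: dist(v, means[c]))
        if too_far and predicted != y:
            suspects.append(i)
    return suspects

# Hypothetical data: index 5 sits in the "dog" cluster but is labeled "cat".
vectors = [(0, 0), (0, 1), (1, 0), (1, 1), (0.5, 0.5), (9.5, 9.5),
           (10, 10), (9, 10), (10, 9), (9, 9)]
labels = ["cat"] * 6 + ["dog"] * 4
suspects = find_suspects(vectors, labels)
print(suspects)

# Revised training set with the suspect samples removed.
clean = [(v, y) for i, (v, y) in enumerate(zip(vectors, labels)) if i not in suspects]
```

Requiring both signals before removing a sample is a conservative choice: a distant sample that the predictor still assigns to its own category is kept, which limits how much legitimate but unusual data gets discarded.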

System and method for improving machine learning models by detecting and removing inaccurate training data

Patent number: 11928567

Abstract: Methods, systems and computer program products are described to improve machine learning (ML) model-based classification of data items by identifying and removing inaccurate training data. Inaccurate training samples may be identified, for example, based on excessive variance in vector space between a training sample and a mean of category training samples, and based on a variance between an assigned category and a predicted category for a training sample. Suspect or erroneous samples may be selectively removed based on, for example, vector space variance and/or prediction confidence level. As a result, ML model accuracy may be improved by training on a more accurate revised training set. ML model accuracy may (e.g., also) be improved, for example, by identifying and removing suspect categories with excessive (e.g., weighted) vector space variance. Suspect categories may be retained or revised. Users may (e.g., also) specify a prediction confidence level and/or coverage (e.g., to control accuracy).

Type: Grant

Filed: March 17, 2023

Date of Patent: March 12, 2024

Assignee: MICROSOFT TECHNOLOGY LICENSING, LLC

Inventors: Oren Elisha, Ami Luttwak, Hila Yehuda, Adar Kahana, Maya Bechler-Speicher

SYSTEM AND METHOD FOR IMPROVING MACHINE LEARNING MODELS BY DETECTING AND REMOVING INACCURATE TRAINING DATA

Publication number: 20230229973

Abstract: Methods, systems and computer program products are described to improve machine learning (ML) model-based classification of data items by identifying and removing inaccurate training data. Inaccurate training samples may be identified, for example, based on excessive variance in vector space between a training sample and a mean of category training samples, and based on a variance between an assigned category and a predicted category for a training sample. Suspect or erroneous samples may be selectively removed based on, for example, vector space variance and/or prediction confidence level. As a result, ML model accuracy may be improved by training on a more accurate revised training set. ML model accuracy may (e.g., also) be improved, for example, by identifying and removing suspect categories with excessive (e.g., weighted) vector space variance. Suspect categories may be retained or revised. Users may (e.g., also) specify a prediction confidence level and/or coverage (e.g., to control accuracy).

Type: Application

Filed: March 17, 2023

Publication date: July 20, 2023

Inventors: Oren Elisha, Ami Luttwak, Hila Yehuda, Adar Kahana, Maya Bechler-Speicher

System and method for improving machine learning models based on confusion error evaluation

Patent number: 11636387

Abstract: Embodiments described herein are directed to improving machine learning (ML) model-based techniques for automatically labeling data items based on identifying and resolving labels that are problematic. An ML model may be trained to predict labels for any given data item. The ML model may be validated to determine a confusion metric with respect to each distinct pair of labels predicted by the ML model. Each confusion metric indicates how a particular label is being mistaken for another particular label. The confusion metrics are analyzed to determine whether any of the ML model-generated labels are problematic (e.g., a label conflicts with another label, a label that is rarely predicted, a label that is incorrectly predicted, etc.). Steps for resolving the problematic labels are implemented, and the ML model is retrained based on the resolution steps. By doing so, the ML model generates a more accurate label for a data item.

Type: Grant

Filed: January 27, 2020

Date of Patent: April 25, 2023

Assignee: MICROSOFT TECHNOLOGY LICENSING, LLC

Inventors: Oren Elisha, Ami Luttwak, Hila Yehuda, Adar Kahana, Maya Bechler-Speicher
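
The abstract of this entry describes computing a confusion metric for each distinct pair of labels and using those metrics to find problematic labels. One simple way to picture such a metric is the fraction of items with a given true label that the model predicted as another specific label. The sketch below takes that reading as an assumption; the function names, the `threshold` default, and the `bug`/`feature`/`question` labels are hypothetical, and the resolution and retraining steps from the abstract are not shown.

```python
from collections import Counter

def confusion_metrics(true_labels, predicted_labels):
    """Map each ordered pair (true, predicted) of distinct labels to the
    fraction of items carrying the true label that were predicted as the other."""
    totals = Counter(true_labels)
    mistakes = Counter(
        (t, p) for t, p in zip(true_labels, predicted_labels) if t != p
    )
    return {pair: count / totals[pair[0]] for pair, count in mistakes.items()}

def problematic_pairs(metrics, threshold=0.3):
    """Label pairs whose confusion rate meets or exceeds an assumed threshold."""
    return sorted(pair for pair, rate in metrics.items() if rate >= threshold)

# Hypothetical validation results for a three-label model.
true_labels = ["bug", "bug", "bug", "feature", "feature",
               "question", "question", "question", "question"]
predicted = ["bug", "feature", "feature", "feature", "bug",
             "question", "question", "question", "bug"]

metrics = confusion_metrics(true_labels, predicted)
flagged = problematic_pairs(metrics)
print(flagged)
```

Here `bug` and `feature` are confused in both directions above the threshold, which in the abstract's terms would make them candidates for a resolution step before the model is retrained.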

System and method for improving machine learning models by detecting and removing inaccurate training data

Patent number: 11636389

Abstract: Methods, systems and computer program products are described to improve machine learning (ML) model-based classification of data items by identifying and removing inaccurate training data. Inaccurate training samples may be identified, for example, based on excessive variance in vector space between a training sample and a mean of category training samples, and based on a variance between an assigned category and a predicted category for a training sample. Suspect or erroneous samples may be selectively removed based on, for example, vector space variance and/or prediction confidence level. As a result, ML model accuracy may be improved by training on a more accurate revised training set. ML model accuracy may (e.g., also) be improved, for example, by identifying and removing suspect categories with excessive (e.g., weighted) vector space variance. Suspect categories may be retained or revised. Users may (e.g., also) specify a prediction confidence level and/or coverage (e.g., to control accuracy).

Type: Grant

Filed: February 19, 2020

Date of Patent: April 25, 2023

Assignee: MICROSOFT TECHNOLOGY LICENSING, LLC

Inventors: Oren Elisha, Ami Luttwak, Hila Yehuda, Adar Kahana, Maya Bechler-Speicher

ITERATIVE VECTORING FOR CONSTRUCTING DATA DRIVEN MACHINE LEARNING MODELS

Publication number: 20230095553

Abstract: Embodiments described herein are directed to generating a machine learning (ML) model. A plurality of vectors are accessed, each vector of the plurality of vectors including a first set of features associated with a corresponding data item. A second set of features is identified by expanding the first set of features. An ML model is trained using vectors including the expanded set of features, and it is determined that an accuracy of the ML model trained using the vectors increased. A third set of features is identified by determining a measure of importance for different subsets of features in the second set and replacing subsets having a low measure of importance with new features. An ML model is trained using vectors that include the third set, and it is determined that an accuracy of the model increased due to the replacing.

Type: Application

Filed: October 27, 2022

Publication date: March 30, 2023

Inventors: Oren Elisha, Ami Luttwak, Hila Yehuda, Adar Kahana, Maya Bechler-Speicher

Iterative vectoring for constructing data driven machine learning models

Patent number: 11514364

Abstract: Embodiments described herein are directed to generating a machine learning (ML) model. A plurality of vectors are accessed, each vector of the plurality of vectors including a first set of features associated with a corresponding data item. A second set of features is identified by expanding the first set of features. An ML model is trained using vectors including the expanded set of features, and it is determined that an accuracy of the ML model trained using the vectors increased. A third set of features is identified by determining a measure of importance for different subsets of features in the second set and replacing subsets having a low measure of importance with new features. An ML model is trained using vectors that include the third set, and it is determined that an accuracy of the model increased due to the replacing.

Type: Grant

Filed: February 19, 2020

Date of Patent: November 29, 2022

Assignee: MICROSOFT TECHNOLOGY LICENSING, LLC

Inventors: Oren Elisha, Ami Luttwak, Hila Yehuda, Adar Kahana, Maya Bechler-Speicher

ITERATIVE VECTORING FOR CONSTRUCTING DATA DRIVEN MACHINE LEARNING MODELS

Publication number: 20210256419

Abstract: Embodiments described herein are directed to generating a machine learning (ML) model. A plurality of vectors are accessed, each vector of the plurality of vectors including a first set of features associated with a corresponding data item. A second set of features is identified by expanding the first set of features. An ML model is trained using vectors including the expanded set of features, and it is determined that an accuracy of the ML model trained using the vectors increased. A third set of features is identified by determining a measure of importance for different subsets of features in the second set and replacing subsets having a low measure of importance with new features. An ML model is trained using vectors that include the third set, and it is determined that an accuracy of the model increased due to the replacing.

Type: Application

Filed: February 19, 2020

Publication date: August 19, 2021

Inventors: Oren Elisha, Ami Luttwak, Hila Yehuda, Adar Kahana, Maya Bechler-Speicher

SYSTEM AND METHOD FOR IMPROVING MACHINE LEARNING MODELS BY DETECTING AND REMOVING INACCURATE TRAINING DATA

Publication number: 20210256420

Abstract: Methods, systems and computer program products are described to improve machine learning (ML) model-based classification of data items by identifying and removing inaccurate training data. Inaccurate training samples may be identified, for example, based on excessive variance in vector space between a training sample and a mean of category training samples, and based on a variance between an assigned category and a predicted category for a training sample. Suspect or erroneous samples may be selectively removed based on, for example, vector space variance and/or prediction confidence level. As a result, ML model accuracy may be improved by training on a more accurate revised training set. ML model accuracy may (e.g., also) be improved, for example, by identifying and removing suspect categories with excessive (e.g., weighted) vector space variance. Suspect categories may be retained or revised. Users may (e.g., also) specify a prediction confidence level and/or coverage (e.g., to control accuracy).

Type: Application

Filed: February 19, 2020

Publication date: August 19, 2021

Inventors: Oren Elisha, Ami Luttwak, Hila Yehuda, Adar Kahana, Maya Bechler-Speicher

SYSTEM AND METHOD FOR IMPROVING MACHINE LEARNING MODELS BASED ON CONFUSION ERROR EVALUATION

Publication number: 20210232966

Abstract: Embodiments described herein are directed to improving machine learning (ML) model-based techniques for automatically labeling data items based on identifying and resolving labels that are problematic. An ML model may be trained to predict labels for any given data item. The ML model may be validated to determine a confusion metric with respect to each distinct pair of labels predicted by the ML model. Each confusion metric indicates how a particular label is being mistaken for another particular label. The confusion metrics are analyzed to determine whether any of the ML model-generated labels are problematic (e.g., a label conflicts with another label, a label that is rarely predicted, a label that is incorrectly predicted, etc.). Steps for resolving the problematic labels are implemented, and the ML model is retrained based on the resolution steps. By doing so, the ML model generates a more accurate label for a data item.

Type: Application

Filed: January 27, 2020

Publication date: July 29, 2021

Inventors: Oren Elisha, Ami Luttwak, Hila Yehuda, Adar Kahana, Maya Bechler-Speicher