Patents by Inventor Hila Yehuda
Hila Yehuda has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Patent number: 12248855
Abstract: Embodiments described herein are directed to generating a machine learning (ML) model. A plurality of vectors are accessed, each vector of the plurality of vectors including a first set of features associated with a corresponding data item. A second set of features is identified by expanding the first set of features. A ML model is trained using vectors including the expanded set of features, and it is determined that an accuracy of the ML model trained using the vectors increased. A third set of features is identified by determining a measure of importance for different subsets of features in the second set and replacing subsets having a low measure of importance with new features. A ML model is trained using vectors that include the third set, and it is determined that an accuracy of the model increased due to the replacing.
Type: Grant
Filed: October 27, 2022
Date of Patent: March 11, 2025
Assignee: MICROSOFT TECHNOLOGY LICENSING, LLC
Inventors: Oren Elisha, Ami Luttwak, Hila Yehuda, Adar Kahana, Maya Bechler-Speicher
-
Publication number: 20240202591
Abstract: Methods, systems and computer program products are described to improve machine learning (ML) model-based classification of data items by identifying and removing inaccurate training data. Inaccurate training samples may be identified, for example, based on excessive variance in vector space between a training sample and a mean of category training samples, and based on a variance between an assigned category and a predicted category for a training sample. Suspect or erroneous samples may be selectively removed based on, for example, vector space variance and/or prediction confidence level. As a result, ML model accuracy may be improved by training on a more accurate revised training set. ML model accuracy may (e.g., also) be improved, for example, by identifying and removing suspect categories with excessive (e.g., weighted) vector space variance. Suspect categories may be retained or revised. Users may (e.g., also) specify a prediction confidence level and/or coverage (e.g., to control accuracy).
Type: Application
Filed: January 30, 2024
Publication date: June 20, 2024
Inventors: Oren ELISHA, Ami LUTTWAK, Hila YEHUDA, Adar KAHANA, Maya BECHLER-SPEICHER
-
Patent number: 11928567
Abstract: Methods, systems and computer program products are described to improve machine learning (ML) model-based classification of data items by identifying and removing inaccurate training data. Inaccurate training samples may be identified, for example, based on excessive variance in vector space between a training sample and a mean of category training samples, and based on a variance between an assigned category and a predicted category for a training sample. Suspect or erroneous samples may be selectively removed based on, for example, vector space variance and/or prediction confidence level. As a result, ML model accuracy may be improved by training on a more accurate revised training set. ML model accuracy may (e.g., also) be improved, for example, by identifying and removing suspect categories with excessive (e.g., weighted) vector space variance. Suspect categories may be retained or revised. Users may (e.g., also) specify a prediction confidence level and/or coverage (e.g., to control accuracy).
Type: Grant
Filed: March 17, 2023
Date of Patent: March 12, 2024
Assignee: MICROSOFT TECHNOLOGY LICENSING, LLC
Inventors: Oren Elisha, Ami Luttwak, Hila Yehuda, Adar Kahana, Maya Bechler-Speicher
-
Publication number: 20230229973
Abstract: Methods, systems and computer program products are described to improve machine learning (ML) model-based classification of data items by identifying and removing inaccurate training data. Inaccurate training samples may be identified, for example, based on excessive variance in vector space between a training sample and a mean of category training samples, and based on a variance between an assigned category and a predicted category for a training sample. Suspect or erroneous samples may be selectively removed based on, for example, vector space variance and/or prediction confidence level. As a result, ML model accuracy may be improved by training on a more accurate revised training set. ML model accuracy may (e.g., also) be improved, for example, by identifying and removing suspect categories with excessive (e.g., weighted) vector space variance. Suspect categories may be retained or revised. Users may (e.g., also) specify a prediction confidence level and/or coverage (e.g., to control accuracy).
Type: Application
Filed: March 17, 2023
Publication date: July 20, 2023
Inventors: Oren Elisha, Ami Luttwak, Hila Yehuda, Adar Kahana, Maya Bechler-Speicher
-
Patent number: 11636387
Abstract: Embodiments described herein are directed to improving machine learning (ML) model-based techniques for automatically labeling data items based on identifying and resolving labels that are problematic. An ML model may be trained to predict labels for any given data item. The ML model may be validated to determine a confusion metric with respect to each distinct pair of labels predicted by the ML model. Each confusion metric indicates how a particular label is being mistaken for another particular label. The confusion metrics are analyzed to determine whether any of the ML model-generated labels are problematic (e.g., a label conflicts with another label, a label that is rarely predicted, a label that is incorrectly predicted, etc.). Steps for resolving the problematic labels are implemented, and the ML model is retrained based on the resolution steps. By doing so, the ML model generates a more accurate label for a data item.
Type: Grant
Filed: January 27, 2020
Date of Patent: April 25, 2023
Assignee: MICROSOFT TECHNOLOGY LICENSING, LLC
Inventors: Oren Elisha, Ami Luttwak, Hila Yehuda, Adar Kahana, Maya Bechler Speicher
-
Patent number: 11636389
Abstract: Methods, systems and computer program products are described to improve machine learning (ML) model-based classification of data items by identifying and removing inaccurate training data. Inaccurate training samples may be identified, for example, based on excessive variance in vector space between a training sample and a mean of category training samples, and based on a variance between an assigned category and a predicted category for a training sample. Suspect or erroneous samples may be selectively removed based on, for example, vector space variance and/or prediction confidence level. As a result, ML model accuracy may be improved by training on a more accurate revised training set. ML model accuracy may (e.g., also) be improved, for example, by identifying and removing suspect categories with excessive (e.g., weighted) vector space variance. Suspect categories may be retained or revised. Users may (e.g., also) specify a prediction confidence level and/or coverage (e.g., to control accuracy).
Type: Grant
Filed: February 19, 2020
Date of Patent: April 25, 2023
Assignee: MICROSOFT TECHNOLOGY LICENSING, LLC
Inventors: Oren Elisha, Ami Luttwak, Hila Yehuda, Adar Kahana, Maya Bechler-Speicher
-
Publication number: 20230095553
Abstract: Embodiments described herein are directed to generating a machine learning (ML) model. A plurality of vectors are accessed, each vector of the plurality of vectors including a first set of features associated with a corresponding data item. A second set of features is identified by expanding the first set of features. A ML model is trained using vectors including the expanded set of features, and it is determined that an accuracy of the ML model trained using the vectors increased. A third set of features is identified by determining a measure of importance for different subsets of features in the second set and replacing subsets having a low measure of importance with new features. A ML model is trained using vectors that include the third set, and it is determined that an accuracy of the model increased due to the replacing.
Type: Application
Filed: October 27, 2022
Publication date: March 30, 2023
Inventors: Oren ELISHA, Ami LUTTWAK, Hila YEHUDA, Adar KAHANA, Maya BECHLER-SPEICHER
-
Patent number: 11514364
Abstract: Embodiments described herein are directed to generating a machine learning (ML) model. A plurality of vectors are accessed, each vector of the plurality of vectors including a first set of features associated with a corresponding data item. A second set of features is identified by expanding the first set of features. A ML model is trained using vectors including the expanded set of features, and it is determined that an accuracy of the ML model trained using the vectors increased. A third set of features is identified by determining a measure of importance for different subsets of features in the second set and replacing subsets having a low measure of importance with new features. A ML model is trained using vectors that include the third set, and it is determined that an accuracy of the model increased due to the replacing.
Type: Grant
Filed: February 19, 2020
Date of Patent: November 29, 2022
Assignee: MICROSOFT TECHNOLOGY LICENSING, LLC
Inventors: Oren Elisha, Ami Luttwak, Hila Yehuda, Adar Kahana, Maya Bechler-Speicher
-
Publication number: 20210256419
Abstract: Embodiments described herein are directed to generating a machine learning (ML) model. A plurality of vectors are accessed, each vector of the plurality of vectors including a first set of features associated with a corresponding data item. A second set of features is identified by expanding the first set of features. A ML model is trained using vectors including the expanded set of features, and it is determined that an accuracy of the ML model trained using the vectors increased. A third set of features is identified by determining a measure of importance for different subsets of features in the second set and replacing subsets having a low measure of importance with new features. A ML model is trained using vectors that include the third set, and it is determined that an accuracy of the model increased due to the replacing.
Type: Application
Filed: February 19, 2020
Publication date: August 19, 2021
Inventors: Oren Elisha, Ami Luttwak, Hila Yehuda, Adar Kahana, Maya Bechler-Speicher
-
Publication number: 20210256420
Abstract: Methods, systems and computer program products are described to improve machine learning (ML) model-based classification of data items by identifying and removing inaccurate training data. Inaccurate training samples may be identified, for example, based on excessive variance in vector space between a training sample and a mean of category training samples, and based on a variance between an assigned category and a predicted category for a training sample. Suspect or erroneous samples may be selectively removed based on, for example, vector space variance and/or prediction confidence level. As a result, ML model accuracy may be improved by training on a more accurate revised training set. ML model accuracy may (e.g., also) be improved, for example, by identifying and removing suspect categories with excessive (e.g., weighted) vector space variance. Suspect categories may be retained or revised. Users may (e.g., also) specify a prediction confidence level and/or coverage (e.g., to control accuracy).
Type: Application
Filed: February 19, 2020
Publication date: August 19, 2021
Inventors: Oren Elisha, Ami Luttwak, Hila Yehuda, Adar Kahana, Maya Bechler-Speicher
-
Publication number: 20210232966
Abstract: Embodiments described herein are directed to improving machine learning (ML) model-based techniques for automatically labeling data items based on identifying and resolving labels that are problematic. An ML model may be trained to predict labels for any given data item. The ML model may be validated to determine a confusion metric with respect to each distinct pair of labels predicted by the ML model. Each confusion metric indicates how a particular label is being mistaken for another particular label. The confusion metrics are analyzed to determine whether any of the ML model-generated labels are problematic (e.g., a label conflicts with another label, a label that is rarely predicted, a label that is incorrectly predicted, etc.). Steps for resolving the problematic labels are implemented, and the ML model is retrained based on the resolution steps. By doing so, the ML model generates a more accurate label for a data item.
Type: Application
Filed: January 27, 2020
Publication date: July 29, 2021
Inventors: Oren Elisha, Ami Luttwak, Hila Yehuda, Adar Kahana, Maya Bechler Speicher
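
Illustrative note: the label-confusion abstracts in this listing (Patent number 11636387 and Publication number 20210232966) describe computing a confusion metric for each distinct pair of labels and using it to flag problematic labels. The abstracts do not give a formula, so the following is only a minimal sketch under stated assumptions: the metric is taken to be the fraction of items with true label a that were predicted as label b, and the function names (pairwise_confusion, problematic_labels) and both thresholds are hypothetical, not taken from the patents.

```python
from collections import Counter
from itertools import permutations


def pairwise_confusion(true_labels, predicted_labels):
    """Map each ordered label pair (a, b) to the fraction of items whose
    true label is a but whose predicted label is b.

    Assumption: this ratio stands in for the "confusion metric"; the
    patent abstract does not specify how the metric is computed.
    """
    totals = Counter(true_labels)
    mistakes = Counter(
        (t, p) for t, p in zip(true_labels, predicted_labels) if t != p
    )
    labels = sorted(set(true_labels) | set(predicted_labels))
    return {
        (a, b): (mistakes[(a, b)] / totals[a] if totals[a] else 0.0)
        for a, b in permutations(labels, 2)
    }


def problematic_labels(true_labels, predicted_labels,
                       confusion_threshold=0.3, min_predictions=1):
    """Return (conflicting_pairs, rarely_predicted_labels).

    Both thresholds are hypothetical tuning knobs for illustration.
    """
    confusion = pairwise_confusion(true_labels, predicted_labels)
    # Pairs where one label is frequently mistaken for another.
    conflicting = sorted(
        pair for pair, rate in confusion.items() if rate >= confusion_threshold
    )
    # Labels that exist in the validation data but are (almost) never predicted.
    predicted_counts = Counter(predicted_labels)
    rarely_predicted = sorted(
        label for label in set(true_labels)
        if predicted_counts[label] < min_predictions
    )
    return conflicting, rarely_predicted


if __name__ == "__main__":
    true = ["bug", "bug", "bug", "bug", "feature", "feature", "question", "question"]
    pred = ["bug", "bug", "feature", "feature", "feature", "feature", "bug", "feature"]
    pairs, rare = problematic_labels(true, pred)
    print("conflicting pairs:", pairs)
    print("rarely predicted:", rare)
```

In a workflow like the one the abstract outlines, the flagged pairs and rarely predicted labels would then drive a resolution step (for example merging or redefining labels) before the model is retrained; that step is outside the scope of this sketch.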