Patents by Inventor Hila Yehuda

Hila Yehuda has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 11928567
    Abstract: Methods, systems and computer program products are described to improve machine learning (ML) model-based classification of data items by identifying and removing inaccurate training data. Inaccurate training samples may be identified, for example, based on excessive variance in vector space between a training sample and a mean of category training samples, and based on a variance between an assigned category and a predicted category for a training sample. Suspect or erroneous samples may be selectively removed based on, for example, vector space variance and/or prediction confidence level. As a result, ML model accuracy may be improved by training on a more accurate revised training set. ML model accuracy may (e.g., also) be improved, for example, by identifying and removing suspect categories with excessive (e.g., weighted) vector space variance. Suspect categories may be retained or revised. Users may (e.g., also) specify a prediction confidence level and/or coverage (e.g., to control accuracy).
    Type: Grant
    Filed: March 17, 2023
    Date of Patent: March 12, 2024
    Assignee: MICROSOFT TECHNOLOGY LICENSING, LLC
    Inventors: Oren Elisha, Ami Luttwak, Hila Yehuda, Adar Kahana, Maya Bechler-Speicher
  • Publication number: 20230229973
    Abstract: Methods, systems and computer program products are described to improve machine learning (ML) model-based classification of data items by identifying and removing inaccurate training data. Inaccurate training samples may be identified, for example, based on excessive variance in vector space between a training sample and a mean of category training samples, and based on a variance between an assigned category and a predicted category for a training sample. Suspect or erroneous samples may be selectively removed based on, for example, vector space variance and/or prediction confidence level. As a result, ML model accuracy may be improved by training on a more accurate revised training set. ML model accuracy may (e.g., also) be improved, for example, by identifying and removing suspect categories with excessive (e.g., weighted) vector space variance. Suspect categories may be retained or revised. Users may (e.g., also) specify a prediction confidence level and/or coverage (e.g., to control accuracy).
    Type: Application
    Filed: March 17, 2023
    Publication date: July 20, 2023
    Inventors: Oren Elisha, Ami Luttwak, Hila Yehuda, Adar Kahana, Maya Bechler-Speicher
  • Patent number: 11636389
    Abstract: Methods, systems and computer program products are described to improve machine learning (ML) model-based classification of data items by identifying and removing inaccurate training data. Inaccurate training samples may be identified, for example, based on excessive variance in vector space between a training sample and a mean of category training samples, and based on a variance between an assigned category and a predicted category for a training sample. Suspect or erroneous samples may be selectively removed based on, for example, vector space variance and/or prediction confidence level. As a result, ML model accuracy may be improved by training on a more accurate revised training set. ML model accuracy may (e.g., also) be improved, for example, by identifying and removing suspect categories with excessive (e.g., weighted) vector space variance. Suspect categories may be retained or revised. Users may (e.g., also) specify a prediction confidence level and/or coverage (e.g., to control accuracy).
    Type: Grant
    Filed: February 19, 2020
    Date of Patent: April 25, 2023
    Assignee: MICROSOFT TECHNOLOGY LICENSING, LLC
    Inventors: Oren Elisha, Ami Luttwak, Hila Yehuda, Adar Kahana, Maya Bechler-Speicher
  • Patent number: 11636387
    Abstract: Embodiments described herein are directed to improving machine learning (ML) model-based techniques for automatically labeling data items based on identifying and resolving labels that are problematic. An ML model may be trained to predict labels for any given data item. The ML model may be validated to determine a confusion metric with respect to each distinct pair of labels predicted by the ML model. Each confusion metric indicates how a particular label is being mistaken for another particular label. The confusion metrics are analyzed to determine whether any of the ML model-generated labels are problematic (e.g., a label conflicts with another label, a label that is rarely predicted, a label that is incorrectly predicted, etc.). Steps for resolving the problematic labels are implemented, and the ML model is retrained based on the resolution steps. By doing so, the ML model generates a more accurate label for a data item.
    Type: Grant
    Filed: January 27, 2020
    Date of Patent: April 25, 2023
    Assignee: MICROSOFT TECHNOLOGY LICENSING, LLC
    Inventors: Oren Elisha, Ami Luttwak, Hila Yehuda, Adar Kahana, Maya Bechler-Speicher
  • Publication number: 20230095553
    Abstract: Embodiments described herein are directed to generating a machine learning (ML) model. A plurality of vectors are accessed, each vector of the plurality of vectors including a first set of features associated with a corresponding data item. A second set of features is identified by expanding the first set of features. A ML model is trained using vectors including the expanded set of features, and it is determined that an accuracy of the ML model trained using the vectors increased. A third set of features is identified by determining a measure of importance for different subsets of features in the second set and replacing subsets having a low measure of importance with new features. A ML model is trained using vectors that include the third set, and it is determined that an accuracy of the model increased due to the replacing.
    Type: Application
    Filed: October 27, 2022
    Publication date: March 30, 2023
    Inventors: Oren Elisha, Ami Luttwak, Hila Yehuda, Adar Kahana, Maya Bechler-Speicher
  • Patent number: 11514364
    Abstract: Embodiments described herein are directed to generating a machine learning (ML) model. A plurality of vectors are accessed, each vector of the plurality of vectors including a first set of features associated with a corresponding data item. A second set of features is identified by expanding the first set of features. A ML model is trained using vectors including the expanded set of features, and it is determined that an accuracy of the ML model trained using the vectors increased. A third set of features is identified by determining a measure of importance for different subsets of features in the second set and replacing subsets having a low measure of importance with new features. A ML model is trained using vectors that include the third set, and it is determined that an accuracy of the model increased due to the replacing.
    Type: Grant
    Filed: February 19, 2020
    Date of Patent: November 29, 2022
    Assignee: MICROSOFT TECHNOLOGY LICENSING, LLC
    Inventors: Oren Elisha, Ami Luttwak, Hila Yehuda, Adar Kahana, Maya Bechler-Speicher
  • Publication number: 20210256420
    Abstract: Methods, systems and computer program products are described to improve machine learning (ML) model-based classification of data items by identifying and removing inaccurate training data. Inaccurate training samples may be identified, for example, based on excessive variance in vector space between a training sample and a mean of category training samples, and based on a variance between an assigned category and a predicted category for a training sample. Suspect or erroneous samples may be selectively removed based on, for example, vector space variance and/or prediction confidence level. As a result, ML model accuracy may be improved by training on a more accurate revised training set. ML model accuracy may (e.g., also) be improved, for example, by identifying and removing suspect categories with excessive (e.g., weighted) vector space variance. Suspect categories may be retained or revised. Users may (e.g., also) specify a prediction confidence level and/or coverage (e.g., to control accuracy).
    Type: Application
    Filed: February 19, 2020
    Publication date: August 19, 2021
    Inventors: Oren Elisha, Ami Luttwak, Hila Yehuda, Adar Kahana, Maya Bechler-Speicher
  • Publication number: 20210256419
    Abstract: Embodiments described herein are directed to generating a machine learning (ML) model. A plurality of vectors are accessed, each vector of the plurality of vectors including a first set of features associated with a corresponding data item. A second set of features is identified by expanding the first set of features. A ML model is trained using vectors including the expanded set of features, and it is determined that an accuracy of the ML model trained using the vectors increased. A third set of features is identified by determining a measure of importance for different subsets of features in the second set and replacing subsets having a low measure of importance with new features. A ML model is trained using vectors that include the third set, and it is determined that an accuracy of the model increased due to the replacing.
    Type: Application
    Filed: February 19, 2020
    Publication date: August 19, 2021
    Inventors: Oren Elisha, Ami Luttwak, Hila Yehuda, Adar Kahana, Maya Bechler-Speicher
  • Publication number: 20210232966
    Abstract: Embodiments described herein are directed to improving machine learning (ML) model-based techniques for automatically labeling data items based on identifying and resolving labels that are problematic. An ML model may be trained to predict labels for any given data item. The ML model may be validated to determine a confusion metric with respect to each distinct pair of labels predicted by the ML model. Each confusion metric indicates how a particular label is being mistaken for another particular label. The confusion metrics are analyzed to determine whether any of the ML model-generated labels are problematic (e.g., a label conflicts with another label, a label that is rarely predicted, a label that is incorrectly predicted, etc.). Steps for resolving the problematic labels are implemented, and the ML model is retrained based on the resolution steps. By doing so, the ML model generates a more accurate label for a data item.
    Type: Application
    Filed: January 27, 2020
    Publication date: July 29, 2021
    Inventors: Oren Elisha, Ami Luttwak, Hila Yehuda, Adar Kahana, Maya Bechler-Speicher
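
Illustrative sketch (training-data cleaning, patent 11636389 and its related filings): the abstract describes flagging training samples that lie unusually far from the mean of their category in vector space, and samples whose assigned category differs from the predicted one. The Python below is a minimal sketch of that idea under stated assumptions, not the patented method. The function names, the Euclidean distance, the "mean plus k standard deviations" cutoff, and the nearest-centroid stand-in for the ML model are all assumptions chosen for illustration.

```python
import math
from collections import defaultdict
from statistics import mean, pstdev


def centroid(vectors):
    """Component-wise mean of a list of equal-length vectors."""
    return [mean(column) for column in zip(*vectors)]


def find_suspect_samples(samples, labels, k=2.0):
    """Return two sets of sample indices: (far_from_mean, label_mismatch).

    far_from_mean:  distance to the sample's own category mean exceeds
                    mean + k * std of distances in that category (assumed rule).
    label_mismatch: a nearest-centroid classifier (an assumed stand-in for
                    the trained ML model) predicts a different category.
    """
    by_label = defaultdict(list)
    for index, label in enumerate(labels):
        by_label[label].append(index)

    centroids = {
        label: centroid([samples[i] for i in indices])
        for label, indices in by_label.items()
    }

    far_from_mean = set()
    for label, indices in by_label.items():
        distances = [math.dist(samples[i], centroids[label]) for i in indices]
        cutoff = mean(distances) + k * pstdev(distances)
        far_from_mean.update(i for i, d in zip(indices, distances) if d > cutoff)

    label_mismatch = set()
    for index, vector in enumerate(samples):
        predicted = min(centroids, key=lambda lab: math.dist(vector, centroids[lab]))
        if predicted != labels[index]:
            label_mismatch.add(index)

    return far_from_mean, label_mismatch


# Toy data: sample 5 sits in the "b" cluster but carries the label "a".
samples = [
    (0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (0.1, 0.1), (-0.1, 0.0), (5.0, 5.0),
    (5.0, 5.0), (5.1, 5.0), (5.0, 5.1), (4.9, 5.0), (5.0, 4.9),
]
labels = ["a"] * 6 + ["b"] * 5

far, mismatch = find_suspect_samples(samples, labels)
print(sorted(far), sorted(mismatch))  # [5] [5]
```

A sample flagged by both checks is a natural candidate for removal before retraining; the abstract also mentions prediction confidence as a removal criterion, which this sketch does not model.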
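
Illustrative sketch (label confusion metrics, patent 11636387 and its related application): the abstract describes validating a model to obtain a confusion metric for each distinct pair of labels, then analyzing those metrics to find problematic labels. The Python below is a minimal sketch of one way such a metric could be computed, not the method claimed in the patent. The metric definition (share of items with true label A that were predicted as B), the 0.3 threshold, and the function names are assumptions for illustration.

```python
from collections import Counter


def confusion_rates(true_labels, predicted_labels):
    """Map each ordered pair (true, predicted), true != predicted, to the
    fraction of items with that true label that received that prediction."""
    pair_counts = Counter(zip(true_labels, predicted_labels))
    true_counts = Counter(true_labels)
    return {
        (true, predicted): count / true_counts[true]
        for (true, predicted), count in pair_counts.items()
        if true != predicted
    }


def problematic_pairs(rates, threshold=0.3):
    """Label pairs whose confusion rate meets the (assumed) threshold."""
    return sorted(pair for pair, rate in rates.items() if rate >= threshold)


# Toy validation results: "bug" and "defect" are frequently mistaken for each other.
true_labels = ["bug", "bug", "bug", "bug", "defect", "defect",
               "feature", "feature", "feature", "feature"]
predicted_labels = ["bug", "defect", "defect", "bug", "bug", "defect",
                    "feature", "feature", "feature", "bug"]

rates = confusion_rates(true_labels, predicted_labels)
print(problematic_pairs(rates))  # [('bug', 'defect'), ('defect', 'bug')]
```

Two labels that confuse each other in both directions, as "bug" and "defect" do here, are the kind of conflict the abstract lists as problematic; merging them and retraining would be one possible resolution step.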
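
Illustrative sketch (feature-set refinement, patent 11514364 and its related applications): the abstract describes expanding an initial feature set, keeping the expansion when accuracy increases, then replacing low-importance features with new ones and again checking that accuracy increased. The Python below sketches only that control flow under stated assumptions; it is not the patented procedure. The `score` and `importance` callables are hypothetical stand-ins for training and evaluating a real model, importance is measured per feature rather than per subset of features, and the 0.1 cutoff is arbitrary.

```python
def refine_feature_set(base, expansions, replacements, score, importance, low=0.1):
    """Expand the feature set, then swap out weak features, keeping each
    change only if the (caller-supplied) accuracy score increases."""
    best = list(base)
    best_score = score(best)

    # Step 1: expand the first feature set into a second one.
    expanded = best + [f for f in expansions if f not in best]
    expanded_score = score(expanded)
    if expanded_score > best_score:
        best, best_score = expanded, expanded_score

    # Step 2: replace low-importance features with new candidates.
    weak = [f for f in best if importance(f, best) < low]
    if weak:
        swapped = [f for f in best if f not in weak] + replacements[:len(weak)]
        swapped_score = score(swapped)
        if swapped_score > best_score:
            best, best_score = swapped, swapped_score

    return best, best_score


# Toy stand-ins: each feature contributes a fixed amount to "accuracy".
contribution = {"len": 0.3, "words": 0.2, "len*words": 0.05, "caps": 0.15}


def toy_score(features):
    return sum(contribution[f] for f in features)


def toy_importance(feature, features):
    return contribution[feature]


best, best_score = refine_feature_set(
    base=["len", "words"],
    expansions=["len*words"],
    replacements=["caps"],
    score=toy_score,
    importance=toy_importance,
)
print(best)  # ['len', 'words', 'caps']
```

In practice `score` would train a model on vectors restricted to the given features and return validation accuracy, which makes each call expensive; the accept-only-if-better checks mirror the "it is determined that an accuracy ... increased" wording in the abstract.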