Patents Assigned to NeuralStudio SEZC
  • Publication number: 20200401939
    Abstract: Historical data used to train machine learning algorithms can have thousands of records with hundreds of fields, and inevitably includes faulty data that affects the accuracy and utility of a primary model machine learning algorithm. To improve dataset integrity it is segregated into a clean dataset having no invalid data values and a faulty dataset having the invalid data values. The clean dataset is used to produce a secondary model machine learning algorithm trained to generate from plural complete data records a replacement value for a single invalid data value in a data record, and a tertiary model machine learning clustering algorithm trained to generate from plural complete data records replacement values for multiple invalid data values. Substituting the replacement data values for invalid data values in the faulty dataset creates augmented training data which is combined with clean data to train a more accurate and useful primary model.
    Type: Application
    Filed: June 7, 2020
    Publication date: December 24, 2020
    Applicant: NeuralStudio SEZC
    Inventor: Jack Copper