Patents Assigned to DataRobot, Inc.
  • Publication number: 20230186174
    Abstract: Segmenting data and forecasting by a combination of models trained on segmented data is provided. A system compares, with a first model, values of timestamps corresponding to data points to determine a time series dependency between the data points. The system generates, with the first model and based on the time series dependency, a first cluster with first data points and a second cluster with second data points. The system allocates, by a controller, a second model to the first cluster, and a third model to the second cluster. The system trains the second model based on the time series dependency and the first data points. The system trains the third model based on the time series dependency and the second data points. The system generates a fourth model based on a combination of the second trained model and the third trained model.
    Type: Application
    Filed: December 9, 2022
    Publication date: June 15, 2023
    Applicant: DataRobot, Inc.
    Inventors: Matt Nitzken, David McGarry, Roman Midianyi, Anatolli Stehni
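The flow in this abstract (cluster the series, train one model per cluster, then combine the trained models) can be sketched roughly as follows. This is a minimal illustration, not the method claimed in the application: clustering by mean level stands in for the timestamp-based dependency analysis, the per-cluster "model" is just a pooled mean, and every function name and data value is hypothetical.

```python
from statistics import mean

def cluster_by_level(series, threshold):
    """Split series into two clusters by mean level (assumed stand-in for the first model)."""
    low = {k: v for k, v in series.items() if mean(v) < threshold}
    high = {k: v for k, v in series.items() if mean(v) >= threshold}
    return low, high

def train_mean_model(cluster):
    """Train a trivial per-cluster model that forecasts the pooled mean of its cluster."""
    pooled = [x for values in cluster.values() for x in values]
    level = mean(pooled)
    return lambda series_id: level

def combine(models_by_series):
    """Build the combined model: route each series id to the model of its cluster."""
    def forecast(series_id):
        return models_by_series[series_id](series_id)
    return forecast

# Hypothetical data: two low-volume series and one high-volume series.
series = {"store_a": [10, 12, 11], "store_b": [9, 11, 10], "store_c": [100, 110, 105]}
low, high = cluster_by_level(series, threshold=50)
low_model, high_model = train_mean_model(low), train_mean_model(high)
routing = {**{k: low_model for k in low}, **{k: high_model for k in high}}
combined = combine(routing)
print(combined("store_a"), combined("store_c"))
```

Routing by series id keeps each cluster's model independent, so a different model type could be swapped in per cluster without touching the combined forecaster.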
  • Publication number: 20230186116
    Abstract: Aspects of this technical solution can identify, by a second machine learning model receiving as input first features, second features having respective impact metrics that satisfy an impact threshold, the impact threshold indicating that the second features modify various forecast data points, cause a graphical user interface to present the forecast including one or more of the first features having respective first visual properties corresponding to identifiers of respective ones of the first features, cause the graphical user interface to present the forecast including the second features having a second visual property corresponding to an indication that the second features satisfy the impact threshold, and cause the graphical user interface to modify the forecast including the second features to include an explanation portion including metrics of the second features, the metrics corresponding to respective time points of a time dependency relationship.
    Type: Application
    Filed: December 9, 2022
    Publication date: June 15, 2023
    Applicant: DataRobot, Inc.
    Inventors: Ina Ko, Borys Kupar, Yulia Bezhula, Kyrylo Kniazev
  • Publication number: 20230091610
    Abstract: This disclosure relates generally to using machine learning models to generate current time-series features using machine learning and validate time-series machine learning model output. At least one aspect is directed to a system with one or more processors, coupled to memory, to segment a time series range into a first segment for an instance of time, the segment associated with a value for a target feature and a timestamp for the value, segment the time series range into an input segment associated with a plurality of input features and a segment timestamp less than or equal to the timestamp, generate a model trained with input comprising values for the target feature and timestamps for the values less than or equal to the segment timestamp, and transform at least one of the input features based at least on the model.
    Type: Application
    Filed: September 12, 2022
    Publication date: March 23, 2023
    Applicant: DataRobot, Inc.
    Inventors: Anastasiia Tamazlykar, Igor Iaroshenkno, Mark Steadman, Jilian Schwiep, Peter Michael Simon, Zachary Deane-Mayer, Brett Rowley, Jing Qiang Goh
  • Publication number: 20230083891
    Abstract: Disclosed herein are methods and systems to generate and revise a workflow that utilizes machine learning model nodes and other analytical nodes to analyze data and generate a decision via allowing a user to interact with input elements of a graphical user interface. The methods and systems use a processor to provide, for rendering by a user device, a graphical user interface comprising at least a first graphical indicator corresponding to a computer model node within workflow code and a second graphical indicator corresponding to a decision node within the workflow code, the computer model node visually connected with the decision node; and in response to receiving, via a user interacting with the graphical user interface, an additional node corresponding to at least one analytical protocol, revise the workflow code, by adding the analytical protocol before an execution of the decision node.
    Type: Application
    Filed: September 12, 2022
    Publication date: March 16, 2023
    Applicant: DataRobot, Inc.
    Inventors: Jeremy Achin, Ina Ko, Stephen James Millet, Daniel Thomas Trost, Igor Veksler
  • Publication number: 20230065870
    Abstract: This disclosure relates generally to artificial intelligence structured to generate models based on multimodal input. At least one aspect is directed to a system. The system can include a data processing system comprising memory and one or more processors to generate, by a first model trained using machine learning with input including one or more first features each associated with data structures having a plurality of distinct data types, one or more second features compatible with one of the distinct data types, generate, by a second model trained with input including the second features, a plurality of cluster classifications each compatible with one or more of the distinct data types, and cause a user interface to present one or more of the data structures rendered according to a spatial structure based on the second features and the cluster classifications.
    Type: Application
    Filed: August 30, 2022
    Publication date: March 2, 2023
    Applicant: DataRobot, Inc.
    Inventors: Ivan Pyzow, David Michael McGarry, Mikhail Yakubovskiy, Ee Kin Chin, Mykyta Yarmak, Yuliia Bezuhla, Zachary Albert Mayer
  • Publication number: 20230067026
    Abstract: Automated data analytics techniques for non-tabular data sets may include methods and systems for (1) automatically developing models that perform tasks in the domains of computer vision, audio processing, speech processing, text processing, or natural language processing; (2) automatically developing models that analyze heterogeneous data sets containing image data and non-image data, and/or heterogeneous data sets containing tabular data and non-tabular data; (3) determining the importance of an image feature with respect to a modeling task; (4) explaining the value of a modeling target based at least in part on an image feature; and (5) detecting drift in image data. In some cases, multi-stage models may be developed, wherein a pre-trained feature extraction model extracts low-, mid-, high-, and/or highest-level features of non-tabular data, and a data analytics model uses those features (or features derived therefrom) to perform a data analytics task.
    Type: Application
    Filed: February 17, 2021
    Publication date: March 2, 2023
    Applicant: DataRobot, Inc.
    Inventors: Yurii Huts, Chin Ee Kin, Anton Kasyanov, Zachary Albert Mayer, Xavier Conort, Hon Nian Chua, Sabari Shanmugam, Atanas Mitkov Atanasov, Ivan Richard Pyzow
  • Publication number: 20230051833
    Abstract: Systems and methods of epidemiological modeling using machine learning are provided, and can include receiving values for an occurrence of the infectious disease during a first time period, generating, from a model trained by a machine learning system, predictions for the occurrence of the infectious disease over a second time period, performing, by a simulator using the predictions, one or more simulations of the occurrence of the infectious disease in one or more geographic regions during one or more time periods subsequent to the second time period, and providing, to a user interface, a first simulation of the one or more simulations performed by the simulator for a first geographic region of the one or more geographic regions during a time period of the one or more time periods.
    Type: Application
    Filed: July 28, 2022
    Publication date: February 16, 2023
    Applicant: DataRobot, Inc.
    Inventors: Jeremy Achin, Michael Schmidt, Mackenzie Heiser, Jona Sassenhagen, Oleg Baranovskiy, Jared Shamwell, Hon Nian Chua, Joao Paulo Gomes, Maxence Jeunesse, Yung Siang Liau, Julian Wergieluk, Jay Cameron Schuren, Mark Steadman, Mohak Saxena, Samuel Clark, Noa Flaherty, Jarred Bultema, Nathan Robert Cameron, Amanda Schierz, Vinay Venkata Wunnava, Xavier Conort, Gregory Michaelson, Anton Suslov, Madeleine Mott, Sergey Yurgenson, Christopher James Monsour, Matthew Joseph Nitzken, Patrick Allen Farrell, Jared Bowns, Dustin Burke, Ievgenii Baliuk, Rishabh Raman
  • Publication number: 20230004486
    Abstract: The system can identify data stored in repositories that indicate changes in the version of the application relative to a prior version of the application tested or deployed before receipt of the request to test the performance of the version of the application. The system can determine, based on the data and using machine learning with historical data associated with applications tested or deployed to test performance of the version, and without execution of the tests, a score for each of a plurality of tests configured to test performance of the version of the application. The system can select, based on the scores, a subset of the tests to execute, and provide an indication of the selected subset of the tests to cause execution of the subset of the tests to evaluate performance of the version of the application prior to deployment of the version of the application.
    Type: Application
    Filed: July 1, 2022
    Publication date: January 5, 2023
    Applicant: DataRobot, Inc.
    Inventors: Borys Drozhak, Ievgenii Baliuk, Dustin Burke
  • Publication number: 20230004796
    Abstract: Systems and methods are described for developing and using neural network models. An example method of training a neural network includes: oscillating a learning rate while performing a preliminary training of a neural network; determining, based on the preliminary training, a number of training epochs to perform for a subsequent training session; and training the neural network using the determined number of training epochs. The systems and methods can be used to build neural network models that efficiently and accurately handle heterogeneous data.
    Type: Application
    Filed: May 13, 2022
    Publication date: January 5, 2023
    Applicant: DataRobot, Inc.
    Inventors: Zachary Albert Mayer, Jason McGhee, Jesse Bannon, Joshua Matthew Weiner
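One common way to oscillate a learning rate is a triangular cyclical schedule, sketched below. The abstract does not say which oscillation shape is used or how the epoch count is derived, so both `triangular_lr` and `choose_epochs` (which picks the epoch with the lowest validation loss) are assumptions for illustration, as are all parameter names and values.

```python
def triangular_lr(step, base_lr, max_lr, half_cycle):
    """Learning rate that rises linearly from base_lr to max_lr and back, repeating."""
    position = step % (2 * half_cycle)
    distance_from_peak = abs(position - half_cycle)
    fraction = 1.0 - distance_from_peak / half_cycle
    return base_lr + (max_lr - base_lr) * fraction

def choose_epochs(validation_losses):
    """Assumed rule: train for as many epochs as it took to reach the lowest validation loss."""
    best_index = min(range(len(validation_losses)), key=validation_losses.__getitem__)
    return best_index + 1

# Learning rates for the first full cycle (steps 0 through 8).
schedule = [triangular_lr(s, base_lr=0.001, max_lr=0.01, half_cycle=4) for s in range(9)]

# Hypothetical validation losses recorded during the preliminary training run.
epochs = choose_epochs([0.90, 0.55, 0.40, 0.42, 0.47])
print(schedule, epochs)
```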
  • Patent number: 11514369
    Abstract: Systems and methods are described for interpreting machine learning model predictions. An example method includes: providing a machine learning model configured to receive a plurality of features as input and provide a prediction as output, wherein the plurality of features includes an engineered feature including a combination of two or more parent features; calculating a Shapley value for each feature in the plurality of features; and allocating a respective portion of the Shapley value for the engineered feature to each of the two or more parent features.
    Type: Grant
    Filed: June 11, 2021
    Date of Patent: November 29, 2022
    Assignee: DataRobot, Inc.
    Inventors: Mark Benjamin Romanowsky, Jared Bowns, Thomas Whitehead, Thomas Stearns, Xavier Conort, Anastasiia Tamazlykar, Mohak Saxena
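The idea of passing an engineered feature's Shapley value back to its parent features can be sketched as follows. The exact computation enumerates every feature ordering, which is only practical for a handful of features, and the equal split among parents is an assumed allocation rule; the abstract does not specify one. The feature names and the additive value function are hypothetical.

```python
from itertools import permutations
from math import factorial

def shapley_values(features, value_fn):
    """Exact Shapley values: average marginal contribution over all feature orderings."""
    totals = {f: 0.0 for f in features}
    for order in permutations(features):
        present = frozenset()
        for f in order:
            with_f = present | {f}
            totals[f] += value_fn(with_f) - value_fn(present)
            present = with_f
    n_orders = factorial(len(features))
    return {f: total / n_orders for f, total in totals.items()}

def allocate_to_parents(values, engineered, parents):
    """Split the engineered feature's value equally among its parents (assumed rule)."""
    result = {f: v for f, v in values.items() if f != engineered}
    share = values[engineered] / len(parents)
    for p in parents:
        result[p] = result.get(p, 0.0) + share
    return result

# Toy additive value function: each present feature contributes a fixed amount.
contribution = {"age": 1.0, "income": 2.0, "age_x_income": 3.0}

def value_fn(subset):
    return sum(contribution[f] for f in subset)

values = shapley_values(list(contribution), value_fn)
allocated = allocate_to_parents(values, "age_x_income", ["age", "income"])
print(allocated)
```

Because the toy value function is additive, each feature's Shapley value equals its own contribution, which makes the reallocation easy to check by hand.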
  • Publication number: 20220358528
    Abstract: An apparatus has a memory with processor-executable instructions and a processor operatively coupled to the memory. The apparatus receives datasets including time series data points that are descriptive of a feature of a given entity. The processor determines a time series characteristic based on the data content, and selects, based on the determined characteristic, a set of entrant forecasting models from a pool of forecasting models stored in the memory. Next, the processor trains each entrant forecasting model with the time series data points to produce a set of trained entrant forecasting models. The processor executes each trained entrant forecasting model to generate a set of forecasted values indicating estimations of the feature of the given entity. Thereafter the processor selects at least one forecasting model from the set of trained entrant forecasting models based on computed accuracy evaluations performed over the set of forecasted values.
    Type: Application
    Filed: February 14, 2022
    Publication date: November 10, 2022
    Applicant: DataRobot, Inc.
    Inventors: John Bledsoe, Jeff Gabriel, Jason Montgomery, Ryan Sevey, Matt Steinpreis, Craig Vermeer, Ryan West
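The train, forecast, and select loop described in this abstract resembles a holdout competition between candidate forecasters. The sketch below assumes three simple baseline models and mean absolute error as the accuracy measure; it omits the step that preselects entrants from a time series characteristic, and none of the names come from the application.

```python
from statistics import mean

def naive_last(history):
    """Forecast the last observed value for every future step."""
    return lambda horizon: [history[-1]] * horizon

def historical_mean(history):
    """Forecast the mean of the training history for every future step."""
    level = mean(history)
    return lambda horizon: [level] * horizon

def drift(history):
    """Extend the straight line between the first and last observations."""
    slope = (history[-1] - history[0]) / (len(history) - 1)
    return lambda horizon: [history[-1] + slope * (h + 1) for h in range(horizon)]

def mean_absolute_error(actual, predicted):
    return mean(abs(a - p) for a, p in zip(actual, predicted))

def select_best(series, entrants, holdout):
    """Train each entrant on the head of the series and score it on the held-out tail."""
    train, test = series[:-holdout], series[-holdout:]
    scores = {}
    for name, fit in entrants.items():
        model = fit(train)
        scores[name] = mean_absolute_error(test, model(len(test)))
    best = min(scores, key=scores.get)
    return best, scores

series = [10, 12, 14, 16, 18, 20, 22, 24]  # hypothetical, perfectly linear
entrants = {"naive_last": naive_last, "historical_mean": historical_mean, "drift": drift}
best, scores = select_best(series, entrants, holdout=2)
print(best, scores)
```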
  • Publication number: 20220335030
    Abstract: Cache optimization for data preparation includes: generating a data traversal program that represents a result of a set of sequenced data preparation operations performed on one or more sets of data, wherein the data traversal program indicates how to assemble one or more affected columns in the one or more sets of data to derive the result; in response to receiving a specification of the set of sequenced operations to be performed on the one or more sets of data, accessing the data traversal program that represents the result or a stored copy of the data traversal program that represents the result; assembling the one or more affected columns in the one or more sets of data according to the data traversal program to re-generate the result; and outputting the result.
    Type: Application
    Filed: July 1, 2022
    Publication date: October 20, 2022
    Applicant: DataRobot, Inc.
    Inventors: Dave Brewster, Victor Tso
  • Patent number: 11461304
    Abstract: Signature-based cache optimization for data preparation includes: performing a first set of sequenced data preparation operations on one or more sets of data to generate a plurality of transformation results; caching one or more of the plurality of transformation results and one or more corresponding operation signatures, a cached operation signature being derived based at least in part on a subset of sequenced operations that generated a corresponding result; receiving a specification of a second set of sequenced operations; determining an operation signature associated with the second set of sequenced operations; identifying a cached result among the cached results based at least in part on the determined operation signature; and outputting the cached result.
    Type: Grant
    Filed: March 10, 2020
    Date of Patent: October 4, 2022
    Assignee: DataRobot, Inc.
    Inventors: Dave Brewster, Victor Tze-Yeuan Tso
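A signature-keyed cache for sequenced operations might look like the sketch below. It assumes the signature is a SHA-256 hash of the dataset identifier plus the ordered operation names, and that a request reuses the longest cached prefix of its sequence; the patent may derive signatures differently. The operation table and the `PrepCache` class are hypothetical.

```python
import hashlib
import json

# Hypothetical data preparation operations over a list of string rows.
OPERATIONS = {
    "strip": lambda rows: [r.strip() for r in rows],
    "upper": lambda rows: [r.upper() for r in rows],
    "dedupe": lambda rows: list(dict.fromkeys(rows)),
}

def signature(dataset_id, op_names):
    """Derive a signature from the dataset id and the ordered operation names."""
    payload = json.dumps([dataset_id, list(op_names)])
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()

class PrepCache:
    def __init__(self):
        self.results = {}
        self.hits = 0

    def run(self, dataset_id, rows, op_names):
        # Look for the longest prefix of the requested sequence that is already cached.
        start, current = 0, rows
        for n in range(len(op_names), 0, -1):
            key = signature(dataset_id, op_names[:n])
            if key in self.results:
                start, current = n, self.results[key]
                self.hits += 1
                break
        # Apply the remaining operations, caching each intermediate result.
        for n in range(start, len(op_names)):
            current = OPERATIONS[op_names[n]](current)
            self.results[signature(dataset_id, op_names[:n + 1])] = current
        return current

cache = PrepCache()
rows = [" a ", "b", " a "]
first = cache.run("ds1", rows, ["strip", "upper"])
second = cache.run("ds1", rows, ["strip", "upper", "dedupe"])  # reuses the cached prefix
print(first, second, cache.hits)
```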
  • Publication number: 20220284183
    Abstract: A step editor for data preparation can instruct a user interface to present a first plurality of operations to be applied in a sequential order to one or more sets of data, receive user inputs including at least one indication to mute at least one operation of the first plurality of operations to prevent the processors from performing the at least one operation, generate a second plurality of operations, the second plurality of operations to be applied in a sequential order to the sets of data and comprising the first plurality of operations excluding the operation muted by the user inputs, obtain a cached data traversal program associated with the second plurality of operations and comprising a representation of a result of transforming the sets of data, and instruct the user interface to present output based at least in part on execution of the cached data traversal program.
    Type: Application
    Filed: March 25, 2022
    Publication date: September 8, 2022
    Applicant: DataRobot, Inc.
    Inventors: Nenshad Bardoliwalla, Michael Matthews, Ian Timourian, Jing Chen, Lilia Gutnik, Whitman Kwok, Dave Brewster, Victor Tze-Yeuan Tso
  • Publication number: 20220237516
    Abstract: Data modeling systems and methods are described. A data modeling method may include receiving user input specifying a structure of at least a portion of a data model and a complexity value associated with the structure; (a) generating one or more data models; (b) determining complexity scores for the respective data models; (c) for each of the data models: determining whether to select the respective data model for evaluation based, at least in part, on the complexity score of the respective data model, and if the respective data model is selected for evaluation, evaluating an accuracy of the respective data model for one or more data sets; and repeating steps (a)-(c) until one or more specified termination criteria are satisfied, wherein a first of the generated data models includes the specified structure, and wherein the complexity score for the first data model is determined based, at least in part, on the complexity value associated with the structure.
    Type: Application
    Filed: April 6, 2022
    Publication date: July 28, 2022
    Applicant: DataRobot, Inc.
    Inventors: Michael Schmidt, Dylan Sherry, Hongmin Fan
  • Patent number: 11386075
    Abstract: Methods for detection of anomalous data samples from a plurality of data samples are provided. In some embodiments, an anomaly detection procedure that includes a plurality of tasks is executed to identify the anomalous data samples from the plurality of data samples.
    Type: Grant
    Filed: November 6, 2020
    Date of Patent: July 12, 2022
    Assignee: DataRobot, Inc.
    Inventors: Amanda Claire Schierz, Jeremy Achin, Zachary Albert Mayer
  • Publication number: 20220199266
    Abstract: Systems and methods of epidemiological modeling using machine learning are provided, and can include receiving values for an occurrence of the infectious disease during a first time period, generating, from a model trained by a machine learning system, predictions for the occurrence of the infectious disease over a second time period, performing, by a simulator using the predictions, one or more simulations of the occurrence of the infectious disease in one or more geographic regions during one or more time periods subsequent to the second time period, and providing, to a user interface, a first simulation of the one or more simulations performed by the simulator for a first geographic region of the one or more geographic regions during a time period of the one or more time periods.
    Type: Application
    Filed: December 9, 2021
    Publication date: June 23, 2022
    Applicant: DataRobot, Inc.
    Inventors: Jeremy Achin, Earl Jared Shamwell, Michael Schmidt, Mackenzie Heiser, Patrick Farrell, Matt Nitzken, Jared Bowns, Nathan Cameron, Adam Beairsto, Jay Schuren, Mohak Saxena
  • Patent number: 11361246
    Abstract: Various systems and methods provide an intuitive user interface that enables automatic specification of queries and constraints for analysis by an ML component. Various implementations provide methodologies for automatically formulating machine learning (“ML”) and optimization queries. The automatic generation of ML and/or optimization queries can be configured to use examples to facilitate formulation of ML and optimization queries. One example method includes accepting input data specifying variables and data values associated with the variables. Within the input data, any unspecified data records are identified, and a relationship between the variables specified in the input data and a variable associated with the at least one unspecified data record is automatically determined. The relationship can be automatically determined based on training data contained within the input data. Once a relationship is established, an ML problem can be automatically generated.
    Type: Grant
    Filed: September 17, 2018
    Date of Patent: June 14, 2022
    Assignee: DataRobot, Inc.
    Inventor: Michael Schmidt
  • Patent number: 11334795
    Abstract: Systems and methods are described for developing and using neural network models. An example method of training a neural network includes: oscillating a learning rate while performing a preliminary training of a neural network; determining, based on the preliminary training, a number of training epochs to perform for a subsequent training session; and training the neural network using the determined number of training epochs. The systems and methods can be used to build neural network models that efficiently and accurately handle heterogeneous data.
    Type: Grant
    Filed: March 11, 2021
    Date of Patent: May 17, 2022
    Assignee: DataRobot, Inc.
    Inventors: Zachary Albert Mayer, Jason McGhee, Jesse Bannon, Joshua Matthew Weiner
  • Publication number: 20220076164
    Abstract: Training computer models by generating time-aware training datasets is provided. A system receives a secondary dataset to be combined with a primary dataset for generation of a training dataset. The primary dataset includes a plurality of data records where at least one data record corresponds to a time-of-prediction value corresponding to a timestamp at which at least one data record was used to generate a prediction. The secondary dataset includes a plurality of features where at least one feature corresponds to a timestamp value. The system selects a feature within the secondary dataset with a timestamp that precedes or matches a time-of-prediction value for a corresponding data record within the primary dataset. The system generates the training dataset that includes the primary dataset and the selected feature. The system trains a model using the generated training dataset.
    Type: Application
    Filed: September 8, 2021
    Publication date: March 10, 2022
    Applicant: DataRobot, Inc.
    Inventors: Xavier Conort, Hon Nian Chua, Yung Siang Liau, Harry Dinh
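The selection rule in this abstract (use only secondary values whose timestamp precedes or matches the time of prediction) is essentially an as-of join. The sketch below assumes numeric timestamps and list-of-dict records; the field names `prediction_time`, `timestamp`, `value`, and `secondary_value` are hypothetical.

```python
from bisect import bisect_right

def build_training_rows(primary, secondary):
    """Attach to each primary record the latest secondary value whose timestamp
    is at or before the record's time of prediction (None when there is none)."""
    ordered = sorted(secondary, key=lambda item: item["timestamp"])
    timestamps = [item["timestamp"] for item in ordered]
    rows = []
    for record in primary:
        # bisect_right counts the secondary entries with timestamp <= prediction_time.
        idx = bisect_right(timestamps, record["prediction_time"])
        value = ordered[idx - 1]["value"] if idx > 0 else None
        rows.append({**record, "secondary_value": value})
    return rows

primary = [
    {"id": 1, "prediction_time": 5},
    {"id": 2, "prediction_time": 10},
    {"id": 3, "prediction_time": 1},
]
secondary = [
    {"timestamp": 2, "value": "a"},
    {"timestamp": 10, "value": "b"},
    {"timestamp": 12, "value": "c"},
]
training_rows = build_training_rows(primary, secondary)
print(training_rows)
```

Excluding values stamped after the time of prediction keeps information from the future out of the training set, which is the leakage the time-aware join is meant to prevent.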