Patents Assigned to DataRobot, Inc.
-
Publication number: 20230206610
Abstract: Disclosed herein are methods and systems for visualizing machine learning model performance. One method comprises receiving a request to provide a visual representation of a machine learning technique executed on a set of images to generate a first attribute and a second attribute for each image; executing the machine learning model to receive the first and the second attribute for each image; mapping the first attribute to a visual distinctiveness protocol; identifying a distance for each image, the distance representing a difference between the second attribute predicted by the model for each pair of respective images within the set of images; and providing for display at least a subset of the set of images arranged in accordance with their respective distance and having a visual attribute corresponding to the mapped first attribute for each image.
Type: Application
Filed: December 29, 2022
Publication date: June 29, 2023
Applicant: DataRobot, Inc.
Inventors: Ivan Pyzow, Pavlo Kochubei, Yehor Kolchyba, Sylvain Ferrandiz, Anton Kasyanov
-
Publication number: 20230196101
Abstract: An automated machine learning (“ML”) method may include training a first machine learning model using a first machine learning algorithm and a training data set; validating the first machine learning model using a validation data set, wherein validating the first machine learning model comprises generating an error data set; training a second machine learning model to predict a suitability of the first machine learning model for analyzing an inference data set, wherein the second machine learning model is trained using a second machine learning algorithm and the error data set; and triggering a remedial action associated with the first or second machine learning model in response to a predicted suitability of the first machine learning model for analyzing the inference data set not satisfying a suitability threshold.
Type: Application
Filed: November 16, 2022
Publication date: June 22, 2023
Applicant: DataRobot, Inc.
Inventors: Sindhu Ghanta, Drew Roselli, Nisha Talagala, Vinay Sridhar, Swaminathan Sundararaman, Lior Amar, Lior Khermosh, Bharath Ramsundar, Sriram Subramanian
-
Publication number: 20230186174
Abstract: Segmenting data and forecasting by a combination of models trained on segmented data is provided. A system compares, with a first model, values of timestamps corresponding to data points to determine a time series dependency between the data points. The system generates, with the first model and based on the time series dependency, a first cluster with first data points and a second cluster with second data points. The system allocates, by a controller, a second model to the first cluster, and a third model to the second cluster. The system trains the second model based on the time series dependency and the first data points. The system trains the third model based on the time series dependency and the second data points. The system generates a fourth model based on a combination of the second trained model and the third trained model.
Type: Application
Filed: December 9, 2022
Publication date: June 15, 2023
Applicant: DataRobot, Inc.
Inventors: Matt Nitzken, David McGarry, Roman Midianyi, Anatolli Stehni
-
Publication number: 20230186116
Abstract: Aspects of this technical solution can identify, by a second machine learning model receiving as input first features, second features having respective impact metrics that satisfy an impact threshold, the impact threshold indicating that the second features modify various forecast data points, cause a graphical user interface to present the forecast including one or more of the first features having respective first visual properties corresponding to identifiers of respective ones of the first features, cause the graphical user interface to present the forecast including the second features having a second visual property corresponding to an indication that the second features satisfy the impact threshold, and cause the graphical user interface to modify the forecast including the second features to include an explanation portion including metrics of the second features, the metrics corresponding to respective time points of a time dependency relationship.
Type: Application
Filed: December 9, 2022
Publication date: June 15, 2023
Applicant: DataRobot, Inc.
Inventors: Ina Ko, Borys Kupar, Yulia Bezhula, Kyrylo Kniazev
-
Publication number: 20230186175
Abstract: Comparing a challenger model with a primary model is provided herein. In an embodiment, a system comprises one or more processors, coupled to memory, configured to determine, based on a comparison of a first model that is deployed as a primary model with a second model that is acting as a challenger model, that the second model performs better than the first model based on at least one performance metric; determine, based on a comparison of a characteristic of the first model with a characteristic of the second model, to skip a validation process for the second model; and establish the second model as the primary model in the deployment to replace the first model in the deployment.
Type: Application
Filed: December 9, 2022
Publication date: June 15, 2023
Applicant: DataRobot, Inc.
Inventors: Bohdan Usatov, Chris Li, Evan Chang, Tristan Spauding, Christopher Cozzi
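The decision flow in this abstract can be illustrated with a minimal Python sketch. Everything named here is hypothetical and not taken from the application: the `ModelInfo` record, the use of accuracy as the performance metric, the choice of an input feature schema as the compared "characteristic", and the fallback branch that validates before promoting.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class ModelInfo:
    """Hypothetical summary of a deployed or challenger model."""
    name: str
    accuracy: float                  # assumed performance metric (higher is better)
    feature_schema: Tuple[str, ...]  # assumed "characteristic" used for comparison


def decide_promotion(primary: ModelInfo, challenger: ModelInfo) -> str:
    """Return the action to take for a challenger model.

    Assumption: validation is skipped only when the challenger's feature
    schema matches the primary's; the abstract does not name the characteristic.
    """
    if challenger.accuracy <= primary.accuracy:
        return "keep_primary"
    if challenger.feature_schema == primary.feature_schema:
        return "promote_without_validation"
    # Assumed fallback, not stated in the abstract.
    return "validate_then_promote"


primary = ModelInfo("primary", 0.90, ("age", "income"))
challenger = ModelInfo("challenger", 0.93, ("age", "income"))
decision = decide_promotion(primary, challenger)
```

With the two example models above, the challenger scores higher and shares the primary's schema, so the sketch chooses promotion without a separate validation step.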
-
Publication number: 20230091610
Abstract: This disclosure relates generally to using machine learning models to generate current time-series features using machine learning and validate time-series machine learning model output. At least one aspect is directed to a system with one or more processors, coupled to memory, to segment a time series range into a first segment for an instance of time, the segment associated with a value for a target feature and a timestamp for the value, segment the time series range into an input segment associated with a plurality of input features and a segment timestamp less than or equal to the timestamp, generate a model trained with input comprising values for the target feature and timestamps for the values less than or equal to the segment timestamp, and transform at least one of the input features based at least on the model.
Type: Application
Filed: September 12, 2022
Publication date: March 23, 2023
Applicant: DataRobot, Inc.
Inventors: Anastasiia Tamazlykar, Igor Iaroshenkno, Mark Steadman, Jilian Schwiep, Peter Michael Simon, Zachary Deane-Mayer, Brett Rowley, Jing Qiang Goh
-
Publication number: 20230083891
Abstract: Disclosed herein are methods and systems to generate and revise a workflow that utilizes machine learning model nodes and other analytical nodes to analyze data and generate a decision via allowing a user to interact with input elements of a graphical user interface. The methods and systems use a processor to provide, for rendering by a user device, a graphical user interface comprising at least a first graphical indicator corresponding to a computer model node within workflow code and a second graphical indicator corresponding to a decision node within the workflow code, the computer model node visually connected with the decision node; and in response to receiving, via a user interacting with the graphical user interface, an additional node corresponding to at least one analytical protocol, revise the workflow code, by adding the analytical protocol before an execution of the decision node.
Type: Application
Filed: September 12, 2022
Publication date: March 16, 2023
Applicant: DataRobot, Inc.
Inventors: Jeremy Achin, Ina Ko, Stephen James Millet, Daniel Thomas Trost, Igor Veksler
-
Publication number: 20230065870
Abstract: This disclosure relates generally to artificial intelligence structured to generate models based on multimodal input. At least one aspect is directed to a system. The system can include a data processing system comprising memory and one or more processors to generate, by a first model trained using machine learning with input including one or more first features each associated with data structures having a plurality of distinct data types, one or more second features compatible with one of the distinct data types, generate, by a second model trained with input including the second features, a plurality of cluster classifications each compatible with one or more of the distinct data types, and cause a user interface to present one or more of the data structures rendered according to a spatial structure based on the second features and the cluster classifications.
Type: Application
Filed: August 30, 2022
Publication date: March 2, 2023
Applicant: DataRobot, Inc.
Inventors: Ivan Pyzow, David Michael McGarry, Mikhail Yakubovskiy, Ee Kin Chin, Mykyta Yarmak, Yuliia Bezuhla, Zachary Albert Mayer
-
Publication number: 20230067026
Abstract: Automated data analytics techniques for non-tabular data sets may include methods and systems for (1) automatically developing models that perform tasks in the domains of computer vision, audio processing, speech processing, text processing, or natural language processing; (2) automatically developing models that analyze heterogeneous data sets containing image data and non-image data, and/or heterogeneous data sets containing tabular data and non-tabular data; (3) determining the importance of an image feature with respect to a modeling task; (4) explaining the value of a modeling target based at least in part on an image feature; and (5) detecting drift in image data. In some cases, multi-stage models may be developed, wherein a pre-trained feature extraction model extracts low-, mid-, high-, and/or highest-level features of non-tabular data, and a data analytics model uses those features (or features derived therefrom) to perform a data analytics task.
Type: Application
Filed: February 17, 2021
Publication date: March 2, 2023
Applicant: DataRobot, Inc.
Inventors: Yurii Huts, Chin Ee Kin, Anton Kasyanov, Zachary Albert Mayer, Xavier Conort, Hon Nian Chua, Sabari Shanmugam, Atanas Mitkov Atanasov, Ivan Richard Pyzow
-
Publication number: 20230051833
Abstract: Systems and methods of epidemiological modeling using machine learning are provided, and can include receiving values for an occurrence of the infectious disease during a first time period, generating, from a model trained by a machine learning system, predictions for the occurrence of the infectious disease over a second time period, performing, by a simulator using the predictions, one or more simulations of the occurrence of the infectious disease in one or more geographic regions during one or more time periods subsequent to the second time period, and providing, to a user interface, a first simulation of the one or more simulations performed by the simulator for a first geographic region of the one or more geographic regions during a time period of the one or more time periods.
Type: Application
Filed: July 28, 2022
Publication date: February 16, 2023
Applicant: DataRobot, Inc.
Inventors: Jeremy Achin, Michael Schmidt, Mackenzie Heiser, Jona Sassenhagen, Oleg Baranovskiy, Jared Shamwell, Hon Nian Chua, Joao Paulo Gomes, Maxence Jeunesse, Yung Siang Liau, Julian Wergieluk, Jay Cameron Schuren, Mark Steadman, Mohak Saxena, Samuel Clark, Noa Flaherty, Jarred Bultema, Nathan Robert Cameron, Amanda Schierz, Vinay Venkata Wunnava, Xavier Conort, Gregory Michaelson, Anton Suslov, Madeleine Mott, Sergey Yurgenson, Christopher James Monsour, Matthew Joseph Nitzken, Patrick Allen Farrell, Jared Bowns, Dustin Burke, Ievgenii Baliuk, Rishabh Raman
-
Publication number: 20230004486
Abstract: The system can identify data stored in repositories that indicate changes in the version of the application relative to a prior version of the application tested or deployed before receipt of the request to test the performance of the version of the application. The system can determine, based on the data and using machine learning with historical data associated with applications tested or deployed to test performance of the version, and without execution of the tests, a score for each of a plurality of tests configured to test performance of the version of the application. The system can select, based on the scores, a subset of the tests to execute, and provide an indication of the selected subset of the tests to cause execution of the subset of the tests to evaluate performance of the version of the application prior to deployment of the version of the application.
Type: Application
Filed: July 1, 2022
Publication date: January 5, 2023
Applicant: DataRobot, Inc.
Inventors: Borys Drozhak, Ievgenii Baliuk, Dustin Burke
-
Publication number: 20230004796
Abstract: Systems and methods are described for developing and using neural network models. An example method of training a neural network includes: oscillating a learning rate while performing a preliminary training of a neural network; determining, based on the preliminary training, a number of training epochs to perform for a subsequent training session; and training the neural network using the determined number of training epochs. The systems and methods can be used to build neural network models that efficiently and accurately handle heterogeneous data.
Type: Application
Filed: May 13, 2022
Publication date: January 5, 2023
Applicant: DataRobot, Inc.
Inventors: Zachary Albert Mayer, Jason McGhee, Jesse Bannon, Joshua Matthew Weiner
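As an illustration of what "oscillating a learning rate" can look like, the sketch below uses a triangular cyclical schedule. This is a commonly used schedule chosen here as an assumption; the abstract does not say which oscillation the application uses, and the function name and parameters (`base_lr`, `max_lr`, `half_cycle`) are hypothetical. The epoch-count selection step from the abstract is not shown.

```python
def triangular_lr(step: int, base_lr: float, max_lr: float, half_cycle: int) -> float:
    """Learning rate at a given step of a triangular cyclical schedule.

    The rate climbs linearly from base_lr to max_lr over `half_cycle` steps,
    then descends back to base_lr over the next `half_cycle` steps, and repeats.
    """
    cycle_pos = step % (2 * half_cycle)
    # distance is 1.0 at the start/end of a cycle and 0.0 at its peak
    distance = abs(cycle_pos - half_cycle) / half_cycle
    return base_lr + (max_lr - base_lr) * (1.0 - distance)


# Two full cycles of eight steps each (half_cycle = 4)
schedule = [triangular_lr(s, base_lr=0.001, max_lr=0.01, half_cycle=4) for s in range(16)]
```

A training loop would query this function once per step and pass the value to its optimizer.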
-
Patent number: 11514369
Abstract: Systems and methods are described for interpreting machine learning model predictions. An example method includes: providing a machine learning model configured to receive a plurality of features as input and provide a prediction as output, wherein the plurality of features includes an engineered feature including a combination of two or more parent features; calculating a Shapley value for each feature in the plurality of features; and allocating a respective portion of the Shapley value for the engineered feature to each of the two or more parent features.
Type: Grant
Filed: June 11, 2021
Date of Patent: November 29, 2022
Assignee: DataRobot, Inc.
Inventors: Mark Benjamin Romanowsky, Jared Bowns, Thomas Whitehead, Thomas Stearns, Xavier Conort, Anastasiia Tamazlykar, Mohak Saxena
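The allocation step described in this abstract can be sketched in Python as follows. The abstract does not state how the portions are determined, so this sketch assumes an equal split among the parent features; the function name, the feature names, and the Shapley values are all made up for illustration, and the Shapley values themselves are taken as given rather than computed.

```python
from typing import Dict, List


def allocate_to_parents(
    shapley: Dict[str, float],
    parents: Dict[str, List[str]],
) -> Dict[str, float]:
    """Fold each engineered feature's Shapley value into its parent features.

    Assumption: an engineered feature's value is split equally among its
    parents. Features with no entry in `parents` keep their own value.
    """
    allocated: Dict[str, float] = {}
    for feature, value in shapley.items():
        targets = parents.get(feature, [feature])
        share = value / len(targets)
        for target in targets:
            allocated[target] = allocated.get(target, 0.0) + share
    return allocated


# Illustrative values only
shapley_values = {"age": 0.10, "income": 0.30, "age_x_income": 0.20}
engineered = {"age_x_income": ["age", "income"]}
result = allocate_to_parents(shapley_values, engineered)
```

Because each engineered value is only redistributed, the total attribution across features is unchanged, which keeps the result consistent with the additive property of Shapley values.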
-
Publication number: 20220358528
Abstract: An apparatus has a memory with processor-executable instructions and a processor operatively coupled to the memory. The apparatus receives datasets including time series data points that are descriptive of a feature of a given entity. The processor determines a time series characteristic based on the data content, and selects, based on the determined characteristic, a set of entrant forecasting models from a pool of forecasting models stored in the memory. Next, the processor trains each entrant forecasting model with the time series data points to produce a set of trained entrant forecasting models. The processor executes each trained entrant forecasting model to generate a set of forecasted values indicating estimations of the feature of the given entity. Thereafter the processor selects at least one forecasting model from the set of trained entrant forecasting models based on computed accuracy evaluations performed over the set of forecasted values.
Type: Application
Filed: February 14, 2022
Publication date: November 10, 2022
Applicant: DataRobot, Inc.
Inventors: John Bledsoe, Jeff Gabriel, Jason Montgomery, Ryan Sevey, Matt Steinpreis, Craig Vermeer, Ryan West
-
Publication number: 20220335030
Abstract: Cache optimization for data preparation includes: generating a data traversal program that represents a result of a set of sequenced data preparation operations performed on one or more sets of data, wherein the data traversal program indicates how to assemble one or more affected columns in the one or more sets of data to derive the result; in response to receiving a specification of the set of sequenced operations to be performed on the one or more sets of data, accessing the data traversal program that represents the result or a stored copy of the data traversal program that represents the result; assembling the one or more affected columns in the one or more sets of data according to the data traversal program to re-generate the result; and outputting the result.
Type: Application
Filed: July 1, 2022
Publication date: October 20, 2022
Applicant: DataRobot, Inc.
Inventors: Dave Brewster, Victor Tso
-
Patent number: 11461304
Abstract: Signature-based cache optimization for data preparation includes: performing a first set of sequenced data preparation operations on one or more sets of data to generate a plurality of transformation results; caching one or more of the plurality of transformation results and one or more corresponding operation signatures, a cached operation signature being derived based at least in part on a subset of sequenced operations that generated a corresponding result; receiving a specification of a second set of sequenced operations; determining an operation signature associated with the second set of sequenced operations; identifying a cached result among the cached results based at least in part on the determined operation signature; and outputting the cached result.
Type: Grant
Filed: March 10, 2020
Date of Patent: October 4, 2022
Assignee: DataRobot, Inc.
Inventors: Dave Brewster, Victor Tze-Yeuan Tso
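The idea of keying cached results by an operation signature can be sketched in Python as follows. This is a simplified illustration, not the patented implementation: the signature here is assumed to be a SHA-256 hash of the JSON-serialized operation prefix, the cache is tied to a single fixed data set, and the `SignatureCache` class, `apply_op` helper, and the two toy operations (`filter_min`, `scale`) are hypothetical.

```python
import hashlib
import json
from typing import Any, Callable, Dict, List, Tuple

Operation = Tuple[str, Dict[str, Any]]  # (operation name, parameters)


def operation_signature(ops: List[Operation]) -> str:
    """Derive a signature from an ordered sequence of operations."""
    payload = json.dumps(ops, sort_keys=True)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


class SignatureCache:
    """Caches the result of every operation prefix applied to one data set."""

    def __init__(self, data: Any, apply: Callable[[Operation, Any], Any]) -> None:
        self._data = data
        self._apply = apply
        self._results: Dict[str, Any] = {}
        self.hits = 0

    def run(self, ops: List[Operation]) -> Any:
        current, start = self._data, 0
        # Reuse the longest prefix of `ops` whose signature is already cached.
        for end in range(len(ops), 0, -1):
            signature = operation_signature(ops[:end])
            if signature in self._results:
                current, start = self._results[signature], end
                self.hits += 1
                break
        # Apply the remaining operations, caching each intermediate result.
        for i in range(start, len(ops)):
            current = self._apply(ops[i], current)
            self._results[operation_signature(ops[: i + 1])] = current
        return current


def apply_op(op: Operation, rows: List[int]) -> List[int]:
    """Toy operations used only for this example."""
    name, params = op
    if name == "filter_min":
        return [r for r in rows if r >= params["value"]]
    if name == "scale":
        return [r * params["factor"] for r in rows]
    raise ValueError(f"unknown operation: {name}")


cache = SignatureCache([1, 2, 3, 4], apply_op)
steps = [("filter_min", {"value": 2}), ("scale", {"factor": 10})]
first = cache.run(steps)   # computed from scratch
second = cache.run(steps)  # served entirely from the cache
```

Caching every prefix, rather than only the final result, lets a second sequence that shares its first steps with an earlier one skip those steps.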
-
Publication number: 20220284183
Abstract: A step editor for data preparation can instruct a user interface to present a first plurality of operations to be applied in a sequential order to one or more sets of data, receive user inputs including at least one indication to mute at least one operation of the first plurality of operations to prevent the processors from performing the at least one operation, generate a second plurality of operations, the second plurality of operations to be applied in a sequential order to the sets of data and comprising the first plurality of operations excluding the operation muted by the user inputs, obtain a cached data traversal program associated with the second plurality of operations and comprising a representation of a result of transforming the sets of data, and instruct the user interface to present output based at least in part on execution of the cached data traversal program.
Type: Application
Filed: March 25, 2022
Publication date: September 8, 2022
Applicant: DataRobot, Inc.
Inventors: Nenshad Bardoliwalla, Michael Matthews, Ian Timourian, Jing Chen, Lilia Gutnik, Whitman Kwok, Dave Brewster, Victor Tze-Yeuan Tso
-
Publication number: 20220237516
Abstract: Data modeling systems and methods are described. A data modeling method may include receiving user input specifying a structure of at least a portion of a data model and a complexity value associated with the structure; (a) generating one or more data models; (b) determining complexity scores for the respective data models; (c) for each of the data models: determining whether to select the respective data model for evaluation based, at least in part, on the complexity score of the respective data model, and if the respective data model is selected for evaluation, evaluating an accuracy of the respective data model for one or more data sets; and repeating steps (a)-(c) until one or more specified termination criteria are satisfied, wherein a first of the generated data models includes the specified structure, and wherein the complexity score for the first data model is determined based, at least in part, on the complexity value associated with the structure.
Type: Application
Filed: April 6, 2022
Publication date: July 28, 2022
Applicant: DataRobot, Inc.
Inventors: Michael Schmidt, Dylan Sherry, Hongmin Fan
-
Patent number: 11386075
Abstract: Methods for detection of anomalous data samples from a plurality of data samples are provided. In some embodiments, an anomaly detection procedure that includes a plurality of tasks is executed to identify the anomalous data samples from the plurality of data samples.
Type: Grant
Filed: November 6, 2020
Date of Patent: July 12, 2022
Assignee: DataRobot, Inc.
Inventors: Amanda Claire Schierz, Jeremy Achin, Zachary Albert Mayer
-
Publication number: 20220199266
Abstract: Systems and methods of epidemiological modeling using machine learning are provided, and can include receiving values for an occurrence of the infectious disease during a first time period, generating, from a model trained by a machine learning system, predictions for the occurrence of the infectious disease over a second time period, performing, by a simulator using the predictions, one or more simulations of the occurrence of the infectious disease in one or more geographic regions during one or more time periods subsequent to the second time period, and providing, to a user interface, a first simulation of the one or more simulations performed by the simulator for a first geographic region of the one or more geographic regions during a time period of the one or more time periods.
Type: Application
Filed: December 9, 2021
Publication date: June 23, 2022
Applicant: DataRobot, Inc.
Inventors: Jeremy Achin, Earl Jared Shamwell, Michael Schmidt, Mackenzie Heiser, Patrick Farrell, Matt Nitzken, Jared Bowns, Nathan Cameron, Adam Beairsto, Jay Schuren, Mohak Saxena