Patents Assigned to DataRobot, Inc.
-
Publication number: 20240394595
Abstract: The subject matter of this disclosure relates to systems and methods for monitoring and managing machine learning models and related data. Histogram structures can be used to aggregate streams of numerical data for storage and metric calculations. Drift in such data can be identified and monitored over time. When significant drift is detected and/or when model accuracy has deteriorated, models can be automatically refreshed with updated training data and/or replaced with one or more other models. A model controller is used to automate model monitoring and management activities across multiple prediction environments where models are deployed and prediction jobs are executed.
Type: Application
Filed: February 20, 2024
Publication date: November 28, 2024
Applicant: DataRobot, Inc.
Inventors: Amanda Schierz, Drew Roselli, Dulcardo Arteaga, Christopher Cozzi, Samuel Clark, John Bledsoe, Mykola Novik, Amar Mudrankit, Lior Amar, Evan Chang, Scott Oglesby, Tristan Robert Spaulding
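
Illustrative sketch for publication 20240394595: the abstract describes aggregating numeric streams into histograms and watching them for drift. The Python below is one possible reading, not the method claimed in the application. The fixed bin edges, the `FixedHistogram` class, and the use of the population stability index (PSI) as the drift score are all assumptions introduced here for illustration.

```python
import math
from bisect import bisect_right


class FixedHistogram:
    """Counts streamed values into fixed bins (bin edges are an assumption)."""

    def __init__(self, edges):
        self.edges = list(edges)
        # One underflow bin, len(edges) - 1 interior bins, one overflow bin.
        self.counts = [0] * (len(self.edges) + 1)

    def add(self, value):
        # bisect_right maps a value to the index of the bin that contains it.
        self.counts[bisect_right(self.edges, value)] += 1

    def merge(self, other):
        # Histograms with identical edges can be combined by adding counts,
        # which is what makes them convenient for aggregating streams.
        if self.edges != other.edges:
            raise ValueError("histograms must share bin edges")
        merged = FixedHistogram(self.edges)
        merged.counts = [a + b for a, b in zip(self.counts, other.counts)]
        return merged


def psi(baseline, current, eps=1e-6):
    """Population stability index between two histograms (assumed drift score)."""
    b_total = sum(baseline.counts)
    c_total = sum(current.counts)
    if b_total == 0 or c_total == 0:
        raise ValueError("both histograms need at least one value")
    score = 0.0
    for b, c in zip(baseline.counts, current.counts):
        p = max(b / b_total, eps)  # floor avoids log(0) for empty bins
        q = max(c / c_total, eps)
        score += (q - p) * math.log(q / p)
    return score
```

A score of zero means the two binned distributions match; a monitoring job could compare the score against a threshold of its choosing before triggering a refresh.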
-
Publication number: 20240338554
Abstract: A method for generating a description of a data set may include generating a data dictionary that associates fields of the data set with descriptions of the fields based on analysis data and context data. The analysis data may indicate results of analysis of the data set including values of the fields and a value of an outcome variable. The context data may characterize a use case of the data set. The method may include generating a summary of the analysis data based on the analysis data and the dictionary; generating a description of a relationship between the outcome variable and a subset of the fields based on the context data, the summary of the analysis data, and the dictionary; generating a potential explanation for the relationship based on the context data and the description of the relationship; and outputting data based on the potential explanation.
Type: Application
Filed: March 15, 2024
Publication date: October 10, 2024
Applicant: DataRobot, Inc.
Inventors: Michael Schmidt, Marcus Braun, Ina Ko, Alex Conway
-
Patent number: 12050762
Abstract: Disclosed herein are methods and systems to generate and revise a workflow that utilizes machine learning model nodes and other analytical nodes to analyze data and generate a decision via allowing a user to interact with input elements of a graphical user interface. The methods and systems use a processor to provide, for rendering by a user device, a graphical user interface comprising at least a first graphical indicator corresponding to a computer model node within workflow code and a second graphical indicator corresponding to a decision node within the workflow code, the computer model node visually connected with the decision node; and in response to receiving, via a user interacting with the graphical user interface, an additional node corresponding to at least one analytical protocol, revise the workflow code, by adding the analytical protocol before an execution of the decision node.
Type: Grant
Filed: September 12, 2022
Date of Patent: July 30, 2024
Assignee: DataRobot, Inc.
Inventors: Jeremy Achin, Ina Ko, Stephen James Millet, Daniel Thomas Trost, Igor Veksler
-
Publication number: 20240193481
Abstract: Identifying, visualizing, and mitigating machine learning model bias is provided. A system receives a feature of a plurality of features used by a model to generate output. The feature includes a plurality of categories, and the output comprises a plurality of types. The system identifies a metric used to evaluate a performance of the model and a threshold for the metric. The system determines a value for the metric for a category of the plurality of categories of the feature by comparing a first number of values of a first type of the plurality of types output by the model for the category with a second number of values of the first type output by the model for a second category. The system generates a notification indicating the performance of the model responsive to a comparison of the value for the metric with the threshold for the metric.
Type: Application
Filed: November 10, 2023
Publication date: June 13, 2024
Applicant: DataRobot, Inc.
Inventors: Natalie Bucklin, Scott Lindeman, Jett Oristaglio, Edward Kwartler, Haniyeh Mahmoudian
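
Illustrative sketch for publication 20240193481: the abstract compares how often a model emits one output type for one category of a feature versus another category, then checks the result against a threshold. The Python below is a rough interpretation only. The function names, the use of rates rather than raw counts, the choice of a reference category, and the 0.8 default threshold are assumptions made for this example and do not come from the application.

```python
from collections import Counter


def favorable_rate_ratios(records, favorable, reference):
    """Ratio of each category's favorable-output rate to the reference category's.

    records: iterable of (category, output_type) pairs.
    favorable: the output type being counted (the "first type").
    reference: the category every other category is compared with.
    """
    totals = Counter()
    hits = Counter()
    for category, output in records:
        totals[category] += 1
        if output == favorable:
            hits[category] += 1
    if totals[reference] == 0 or hits[reference] == 0:
        raise ValueError("reference category needs at least one favorable output")
    ref_rate = hits[reference] / totals[reference]
    return {c: (hits[c] / totals[c]) / ref_rate for c in totals}


def flag_categories(ratios, threshold=0.8):
    """Categories whose ratio falls below the threshold (assumed default 0.8)."""
    return sorted(c for c, r in ratios.items() if r < threshold)
```

For example, if category "A" receives the favorable output 4 times out of 5 and category "B" 2 times out of 5, the ratio for "B" relative to "A" is 0.5 and "B" would be flagged for a notification.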
-
Publication number: 20240086725
Abstract: Aspects of this technical solution can segment a first time period for the first series into a second time period bounded by a first time stamp and a second time stamp later than the first time stamp, and into a third time period bounded by a third time stamp later than the second timestamp and a fourth time stamp later than the third time stamp, determine a metric for the third time period and based on first data points of a training data set for the first series and having time stamps bounded by the first time stamp and the second time stamp within the second time period, generate data points within the third time period based on the first metric and generate data points corresponding to a performance of a second series subsequent to the prediction time stamp.
Type: Application
Filed: August 31, 2023
Publication date: March 14, 2024
Applicant: DataRobot, Inc.
Inventors: Jonas Marius Vilkas, Mykhailo Poliakov, Iryna Kovalchuk
-
Publication number: 20240086736
Abstract: A system can include a data processing system that can include memory and one or more processors to generate, by a first model trained using machine learning and compatible with first data having a first type and second data having a second type, a first metric based on the first data and indicating a first fault probability in a second model, generate, by the first model, a second metric based on the second data and indicating a second fault probability in a third model, determine, based on the first metric and the second metric, that an aggregate model that includes the second model and the third model satisfies a heuristic indicating a third fault probability in the aggregate model, and instruct, in response to a determination that the aggregate model satisfies the heuristic, a user interface to present an indication that the aggregate model satisfies the heuristic.
Type: Application
Filed: November 17, 2023
Publication date: March 14, 2024
Applicant: DataRobot, Inc.
Inventors: Edward Kwartler, Jett Oristaglio, Sarah Khatry, Haniyeh Mahmoudian, Scott Lindeman, Oleksandr Bagan, Vlad Vovk, Wesley Hedrick, Kent Borg, Alex Shoop, Nikita Striuk, Gianni Saporiti, Alisa Zosimova, Oleksandr Pikovets, Anton Bogatyrov
-
Publication number: 20240086775
Abstract: Presented herein are methods and systems for generating and executing applications that provide insights to a model's operation without requiring the user to have knowledge of coding, computer programming, or artificial intelligence machine-learning methodologies. An exemplary method includes deploying a model using input data to generate a predicted dataset; presenting indications for a plurality of applications associated with the deployed model including an application configured to generate new scenarios and another application configured to optimize at least one feature; presenting a plurality of features analyzed by the model; and in response to receiving a selection of a feature of the plurality of features and a new value for the feature, executing the first application to generate a second predicted dataset using the new value.
Type: Application
Filed: November 10, 2023
Publication date: March 14, 2024
Applicant: DataRobot, Inc.
Inventors: Jeremy Achin, Ina Ko, Borys Kupar, Tristan Spaulding, Yulia Bezuhla, Brett Rowley, Colleen Wilhide
-
Publication number: 20240078163
Abstract: A system to deploy virtual sensors to a machine learning project and translate data of the machine learning project is provided. The system can deploy, for a machine learning project, a plurality of virtual sensors at a first location of a plurality of locations to detect metadata of a data source of the machine learning project, at a second location of the plurality of locations to detect deployment information of a model trained for the machine learning project, and at a third location of the plurality of locations to detect learning session information for creation of the model. The system can collect, via the plurality of virtual sensors, data for the machine learning project. The system can translate, for render on a computing system, the data collected via the plurality of virtual sensors into a plurality of graphics.
Type: Application
Filed: November 10, 2023
Publication date: March 7, 2024
Applicant: DataRobot, Inc.
Inventors: Jeremy Achin, Michael Schmidt, Dmitry Zahanych, Alexander Jason Conway, Benjamin Taylor, Michael William Gilday, Uros Perisic, Andrii Chulovskyi, Romain Briot, Sully Matthew Sullenberger
-
Publication number: 20240078093
Abstract: Customizing an automated machine learning system is provided. The system receives a request to establish computer-executable operations for use with machine learning on a data set. The system provides, for display via a graphical user interface on the client device, an indication of a set of computer-executable operations generated automatically for machine learning on the data set by the system responsive to the request. The system receives, from the client device via the graphical user interface, an indication to modify the set of computer-executable operations. The system establishes compatibility of the set of computer-executable operations responsive to the modification. The system constructs, responsive to establishment of the compatibility, the set of computer-executable operations for use with machine learning.
Type: Application
Filed: November 10, 2023
Publication date: March 7, 2024
Applicant: DataRobot, Inc.
Inventors: Sylvain Ferrandiz, Zachary Mayer, Jason Jay McGhee, Joshua David Preuss, Mikhail Yakubovskiy
-
Patent number: 11922329
Abstract: A predictive modeling method may include obtaining a fitted, first-order predictive model configured to predict values of output variables based on values of first input variables; and performing a second-order modeling procedure on the fitted, first-order model, which may include: generating input data including observations including observed values of second input variables and predicted values of the output variables; generating training data and testing data from the input data; generating a fitted second-order model of the fitted first-order model by fitting a second-order model to the training data; and testing the fitted, second-order model of the first-order model on the testing data. Each observation of the input data may be generated by (1) obtaining observed values of the second input variables, and (2) applying the first-order predictive model to corresponding observed values of the first input variables to generate the predicted values of the output variables.
Type: Grant
Filed: December 20, 2019
Date of Patent: March 5, 2024
Assignee: DataRobot, Inc.
Inventors: Jeremy Achin, Thomas DeGodoy, Timothy Owen, Xavier Conort, Sergey Yurgenson, Mark L. Steadman, Glen Koundry, Hon Nian Chua
-
Publication number: 20240064074
Abstract: Aspects of this technical solution can generate, according to a lag time window based at least in part on a first plurality of features, a second data set via aggregation of compatible fields in the first data set, the first plurality of features corresponding to a first data set, augment the first plurality of features extracted from the first data set with a second plurality of features extracted from a third data set, the third data set corresponding to a join of the first data set and the second data set, update, via machine learning and according to a rate corresponding to the data set, a model with the third plurality of features, and instruct a user interface to present at least one performance of the model with the third plurality of features, according to the rate.
Type: Application
Filed: August 15, 2023
Publication date: February 22, 2024
Applicant: DataRobot, Inc.
Inventors: Rishabh Raman, Peter Simon, Oleg Zarakhani
-
Publication number: 20240028828
Abstract: Aspects of this technical solution can identify a plurality of n-grams at a plurality of locations in a first data set comprising text, generate, via a model trained with machine learning, a first prediction for the first data set, generate, via the model, a second prediction for a second data set that lacks the first n-gram at a first location of the plurality of locations, generate, by comparing a first prediction for the first data set with a second prediction for the second data set, an impact of the first n-gram at the first location, and cause a user interface to present at least a portion of the first data set with a visual indication corresponding to the impact, the visual indication applied to a portion of the user interface corresponding to the first n-gram and positioned in the user interface at the first location.
Type: Application
Filed: July 24, 2023
Publication date: January 25, 2024
Applicant: DataRobot, Inc.
Inventors: Anton Kasyanov, Jonathan Chang, Mykyta Yarmak, Ee Kin Chin
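
Illustrative sketch for publication 20240028828: the abstract scores an n-gram at a given location by comparing the model's prediction on the full text with its prediction on a copy of the text that lacks that n-gram. The Python below shows that comparison in its simplest form. The helper names, whitespace tokenization, and the `toy_predict` scoring function are hypothetical stand-ins chosen for this example; the application does not specify them.

```python
def ngram_spans(tokens, n):
    """All (start_index, n-gram) pairs in a token list."""
    return [(i, tuple(tokens[i:i + n])) for i in range(len(tokens) - n + 1)]


def ngram_impacts(tokens, n, predict):
    """Impact of each n-gram: prediction with it minus prediction without it."""
    baseline = predict(tokens)
    impacts = []
    for start, gram in ngram_spans(tokens, n):
        # Build the second data set: the same text with this one n-gram removed.
        reduced = tokens[:start] + tokens[start + n:]
        impacts.append((start, gram, baseline - predict(reduced)))
    return impacts


def toy_predict(tokens):
    """Hypothetical stand-in for a trained model: sums fixed word weights."""
    weights = {"great": 2.0, "terrible": -2.0}
    return sum(weights.get(t, 0.0) for t in tokens)
```

Because each impact is tied to a start index, a user interface could shade the text at that position in proportion to the impact value.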
-
Publication number: 20240028959
Abstract: Re-binning and smoothing an indicator table is provided. A system can identify a table generated by a model trained with machine learning, the table including bins for ranges of values of a feature and coefficients that indicate a level of a target for the bins. The system can receive, via a graphical user interface from a client device, a request to modify bins of the table. The system can establish, responsive to the request, a spline to fit the table based at least in part on a cost function weighted based on a number of entries of the feature for the ranges of values of the feature. The system can generate, via the spline established based at least in part on the cost function, a second table including second bins and second coefficients. The system can generate data to cause the graphical user interface to include a graphic representation of the second table.
Type: Application
Filed: July 19, 2023
Publication date: January 25, 2024
Applicant: DataRobot, Inc.
Inventors: Glen Koundry, Mikhail Yakubovskiy
-
Publication number: 20230394361
Abstract: Machine learning model searching using meta data is provided. A system receives, via a graphical user interface from a client device, a request to search for one or more blueprints including one or more models to add to a project. The system can identify, based on a selection, a list of features with which to execute the requested search. The system can provide a blueprint including a model selected from projects established via input from client devices different from the client device, the projects including blueprints, the blueprints including models trained by machine learning. The system can train, via machine learning, the model of the blueprint to determine the target and add the blueprint including the trained model to the project. The system can generate data causing the graphical user interface to display an indication of the blueprint including the trained model.
Type: Application
Filed: May 31, 2023
Publication date: December 7, 2023
Applicant: DataRobot, Inc.
Inventors: Ho Nian Chua, Michael Schmidt, Zachary Meyer, Senbong Gee, Mark Steadman, Alex Conway, Lingjun Kang
-
Publication number: 20230316137
Abstract: Automated spatial feature engineering techniques may include (1) automatically deriving new features (e.g., spatial lags) based on spatial relationships between or among observations, (2) using parameter optimization techniques to optimize parameters of the spatial feature engineering process (e.g., parameters relating to the size of spatial neighborhoods and/or to the orders of spatial lags), (3) automatically deriving new spatial features representing geometric properties and/or spatial statistics associated with individual spatial observations, (4) determining the feature importance of location features, and/or (5) automatically partitioning spatial datasets such that spatial leakage is reduced, which generally leads to the development of more accurate spatial models. Such techniques may involve joint treatment of distinct location coordinate features as a single location feature for purposes of determining feature importance.
Type: Application
Filed: January 17, 2023
Publication date: October 5, 2023
Applicant: DataRobot, Inc.
Inventors: David Blumstein, Lingjun Kang, Andrey Mukomolov, Joseph O’Halloran, Eric Reyes, Rohit Sharma, Kevin Stofan, Pavel Tyslacki
-
Publication number: 20230297043
Abstract: A system to generate scenarios by modifying values of machine learning features is provided. The system can present a first indication in a first coordinate space of a first performance generated by a model trained with a plurality of features using machine learning. The system can present a second indication in a second coordinate space of a first performance of the first feature. The system can receive a modification to a value in the second coordinate space of the first feature. The system can determine a second performance of the model using machine learning based on a first derived feature to output derived data points in the time period. The system can present in the first coordinate space, a third indication of the second performance of the model overlaid with the first indication of the first performance of the model.
Type: Application
Filed: March 15, 2022
Publication date: September 21, 2023
Applicant: DataRobot, Inc.
Inventors: Ina Ko, Borys Kupar, Yulia Bezhula
-
Patent number: 11748653
Abstract: Apparatuses, systems, program products, and methods are disclosed for machine learning abstraction. An apparatus includes an objective module configured to receive an objective to be analyzed using machine learning. An apparatus includes a grouping module configured to select a logical grouping of one or more machine learning pipelines to analyze a received objective. An apparatus includes an adjustment module configured to dynamically adjust one or more machine learning settings for a logical grouping of one or more machine learning pipelines based on feedback generated in response to analyzing a received objective.
Type: Grant
Filed: June 5, 2018
Date of Patent: September 5, 2023
Assignee: DataRobot, Inc.
Inventors: Nisha Talagala, Vinay Sridhar, Swaminathan Sundararaman, Sindhu Ghanta, Lior Amar, Lior Khermosh, Bharath Ramsundar, Sriram Subramanian, Drew Roselli
-
Publication number: 20230206610
Abstract: Disclosed herein are methods and systems for visualizing machine learning model performance. One method comprises receiving a request to provide a visual representation of a machine learning technique executed on a set of images to generate a first attribute and a second attribute for each image; executing the machine learning model to receive the first and the second attribute for each image; mapping the first attribute to a visual distinctiveness protocol; identifying a distance for each image, the distance representing a difference between the second attribute predicted by the model for each pair of respective images within the set of images; and providing for display at least a subset of the set of images arranged in accordance with their respective distance and having a visual attribute corresponding to the mapped first attribute for each image.
Type: Application
Filed: December 29, 2022
Publication date: June 29, 2023
Applicant: DataRobot, Inc.
Inventors: Ivan Pyzow, Pavlo Kochubei, Yehor Kolchyba, Sylvain Ferrandiz, Anton Kasyanov
-
Publication number: 20230196101
Abstract: An automated machine learning (“ML”) method may include training a first machine learning model using a first machine learning algorithm and a training data set; validating the first machine learning model using a validation data set, wherein validating the first machine learning model comprises generating an error data set; training a second machine learning model to predict a suitability of the first machine learning model for analyzing an inference data set, wherein the second machine learning model is trained using a second machine learning algorithm and the error data set; and triggering a remedial action associated with the first or second machine learning model in response to a predicted suitability of the first machine learning model for analyzing the inference data set not satisfying a suitability threshold.
Type: Application
Filed: November 16, 2022
Publication date: June 22, 2023
Applicant: DataRobot, Inc.
Inventors: Sindhu Ghanta, Drew Roselli, Nisha Talagala, Vinay Sridhar, Swaminathan Sundararaman, Lior Amar, Lior Khermosh, Bharath Ramsundar, Sriram Subramanian
-
Publication number: 20230186175
Abstract: Comparing a challenger model with a primary model is provided herein. In an embodiment, a system comprises one or more processors, coupled to memory, configured to determine, based on a comparison of a first model that is deployed as a primary model with a second model that is acting as a challenger model, that the second model performs better than the first model based on at least one performance metric; determine, based on a comparison of a characteristic of the first model with a characteristic of the second model, to skip a validation process for the second model; and establish the second model as the primary model in the deployment to replace the first model in the deployment.
Type: Application
Filed: December 9, 2022
Publication date: June 15, 2023
Applicant: DataRobot, Inc.
Inventors: Bohdan Usatov, Chris Li, Evan Chang, Tristan Spauding, Christopher Cozzi
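
Illustrative sketch for publication 20230186175: the abstract describes promoting a challenger model over the primary model when it performs better, and skipping validation when a comparison of model characteristics allows it. The Python below sketches that decision flow under stated assumptions. The `ModelInfo` fields, the "higher metric is better" convention, and the choice of feature names and target type as the compared characteristics are all invented for this example and are not taken from the application.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelInfo:
    """Hypothetical summary of a deployed or challenger model."""
    name: str
    metric: float          # assumed: higher is better
    feature_names: tuple   # assumed characteristic compared between models
    target_type: str       # assumed characteristic compared between models


def decide_promotion(primary, challenger):
    """Return one of three actions for the deployment."""
    # Step 1: the challenger must outperform the primary on the metric.
    if challenger.metric <= primary.metric:
        return "keep_primary"
    # Step 2: if the compared characteristics match, validation is skipped.
    same_shape = (
        challenger.feature_names == primary.feature_names
        and challenger.target_type == primary.target_type
    )
    return "promote_without_validation" if same_shape else "validate_then_promote"
```

A deployment controller could call `decide_promotion` after each evaluation window and act on the returned label.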