Patents Assigned to BigML, Inc.
-
Publication number: 20170286839
Abstract: Systems and methods of selecting machine learning models/algorithms for a candidate dataset are disclosed. A computer system may access historical data of a set of algorithms applied to a set of benchmark datasets; select a first algorithm of the set of algorithms; apply the first algorithm to an input dataset to create a model of the input dataset; evaluate and store results of the applying; and add the first algorithm to a set of tried algorithms. The computer system may select a next algorithm of the algorithm set via submodular optimization based on the historical data and the set of tried algorithms; apply the next algorithm to the input dataset; capture a next result based on the applying; add the next result to update the set of tried algorithms; and repeat the submodular optimization. The procedure may continue until a termination condition is reached.
Type: Application
Filed: April 3, 2017
Publication date: October 5, 2017
Applicant: BigML, Inc.
Inventor: Charles Parker
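
The selection loop in this abstract can be illustrated with a minimal sketch. The abstract does not state which submodular objective is used, so the facility-location objective below (the best historical score among tried algorithms on each benchmark dataset, summed) is an assumption, as are the `HISTORY` scores and all function names. The sketch also ranks on historical data only; the disclosed procedure additionally feeds results on the input dataset back into the selection.

```python
from typing import Dict, List, Optional, Set

# Hypothetical historical scores: algorithm -> score on each benchmark dataset.
HISTORY: Dict[str, List[float]] = {
    "tree": [0.90, 0.60, 0.70],
    "logistic": [0.85, 0.65, 0.60],
    "knn": [0.50, 0.95, 0.55],
}


def coverage(tried: Set[str], history: Dict[str, List[float]]) -> float:
    """Assumed objective: sum over benchmarks of the best score among tried algorithms."""
    if not tried:
        return 0.0
    n_datasets = len(next(iter(history.values())))
    return sum(max(history[a][d] for a in tried) for d in range(n_datasets))


def next_algorithm(tried: Set[str], history: Dict[str, List[float]]) -> Optional[str]:
    """Greedy step: pick the untried algorithm with the largest marginal gain."""
    candidates = [a for a in history if a not in tried]
    if not candidates:
        return None
    base = coverage(tried, history)
    return max(candidates, key=lambda a: coverage(tried | {a}, history) - base)


def selection_order(history: Dict[str, List[float]], budget: int) -> List[str]:
    """Repeat the greedy step until the budget (the termination condition here) is hit."""
    tried: Set[str] = set()
    order: List[str] = []
    while len(order) < budget:
        choice = next_algorithm(tried, history)
        if choice is None:
            break
        tried.add(choice)
        order.append(choice)
    return order


print(selection_order(HISTORY, budget=3))
```

With these made-up scores the greedy order is tree, then knn, then logistic: knn is tried second because it covers the benchmark on which tree did poorly, even though logistic has the higher average score.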
-
Publication number: 20170140302
Abstract: A system and method enables users to selectively expose and optionally monetize their data resources, for example on a web site. Data assets such as datasets and models can be exposed by the proprietor on a public gallery for use by others. Fees may be charged, for example, per new model, or per prediction using a model. Users may selectively expose public datasets or public models while keeping their raw data private.
Type: Application
Filed: January 26, 2017
Publication date: May 18, 2017
Applicant: BigML, Inc.
Inventors: Francisco J. MARTIN, Oscar ROVIRA, Jos VERWOERD, Poul PETERSEN, Charles PARKER, Jose Antonio ORTEGA, Beatriz GARCIA, J. Justin DONALDSON, Antonio BLASCO, Adam ASHENFELTER
-
Publication number: 20170090980
Abstract: We describe a high-level computational framework especially well suited to parallel operations on large datasets. In a system in accordance with this framework, there is at least one, and generally several, instances of an architecture deployment as further described. We use the term “architecture deployment” herein to mean a cooperating group of processes together with the hardware on which the processes are executed. This is not to imply a one-to-one association of any process to particular hardware. To the contrary, as detailed below, an architecture deployment may dynamically spawn another deployment as appropriate, including provisioning needed hardware. The active architecture deployments together form a system that dynamically processes jobs requested by a user-customer, in accordance with customer's monetary budget and other criteria, in a robust and automatically scalable environment.
Type: Application
Filed: December 7, 2016
Publication date: March 30, 2017
Applicant: BigML, Inc.
Inventors: Francisco J. Martin, Adam Ashenfelter, J. Justin Donaldson, Jos Verwoerd, Jose Antonio Ortega, Charles Parker
-
Patent number: 9576246
Abstract: A system and method enables users to selectively expose and optionally monetize their data resources, for example on a web site. Data assets such as datasets and models can be exposed by the proprietor on a public gallery for use by others. Fees may be charged, for example, per new model, or per prediction using a model. Users may selectively expose public datasets or public models while keeping their raw data private.
Type: Grant
Filed: September 12, 2013
Date of Patent: February 21, 2017
Assignee: BIGML, INC.
Inventors: Francisco J. Martin, Oscar Rovira, Jos Verwoerd, Poul Petersen, Charles Parker, Jose Antonio Ortega, Beatriz Garcia, J. Justin Donaldson, Antonio Blasco, Adam Ashenfelter
-
Publication number: 20170032026
Abstract: Systems and processes are disclosed for advanced text analysis in the field of big data analytics and visualization: Users can now factor text into their predictive models, alongside regression, time/date and categorical information. This is ideal for building models where text content may play a prominent role (e.g., social media or customer service logs). Multiple data types, including text fields, may be combined together in datasets and models, and may be presented in various interactive visualization displays.
Type: Application
Filed: October 12, 2016
Publication date: February 2, 2017
Applicant: BigML, Inc.
Inventors: Charles Parker, Adam Ashenfelter
-
Patent number: 9558036
Abstract: We describe a high-level computational framework especially well suited to parallel operations on large datasets. In a system in accordance with this framework, there is at least one, and generally several, instances of an architecture deployment as further described. We use the term “architecture deployment” herein to mean a cooperating group of processes together with the hardware on which the processes are executed. This is not to imply a one-to-one association of any process to particular hardware. To the contrary, as detailed below, an architecture deployment may dynamically spawn another deployment as appropriate, including provisioning needed hardware. The active architecture deployments together form a system that dynamically processes jobs requested by a user-customer, in accordance with customer's monetary budget and other criteria, in a robust and automatically scalable environment.
Type: Grant
Filed: May 29, 2015
Date of Patent: January 31, 2017
Assignee: BigML, Inc.
Inventors: Francisco J. Martin, Adam Ashenfelter, J. Justin Donaldson, Jos Verwoerd, Jose Antonio Ortega, Charles Parker
-
Patent number: 9501540
Abstract: Systems and processes are disclosed for advanced text analysis in the field of big data analytics and visualization: Users can now factor text into their predictive models, alongside regression, time/date and categorical information. This is ideal for building models where text content may play a prominent role (e.g., social media or customer service logs). Multiple data types, including text fields, may be combined together in datasets and models, and may be presented in various interactive visualization displays.
Type: Grant
Filed: September 25, 2014
Date of Patent: November 22, 2016
Assignee: BIGML, INC.
Inventors: Charles Parker, Adam Ashenfelter
-
Publication number: 20160292578
Abstract: The present disclosure pertains to a system and method for predictive modeling of data clusters. The system and method include creating a dataset from a data source comprising data points, identifying a number of clusters based at least in part on a similarity metric between the data points, generating a model for each of the number of clusters based at least in part on identifying the number of clusters, visually displaying the number of clusters, receiving an indication of selection of a particular cluster, and replacing the visual display of the identified number of clusters with a visual display of the model corresponding to the particular cluster in response to receiving an indication of selection of a model icon.
Type: Application
Filed: April 1, 2016
Publication date: October 6, 2016
Applicant: BigML, Inc.
Inventor: Adam Ashenfelter
-
Patent number: 9269054
Abstract: Systems and methods are disclosed for building and using decision trees, preferably in a scalable and distributed manner. Our system can be used to create and use classification trees, regression trees, or a combination of regression trees called a gradient boosted regression tree (GBRT). Our system leverages approximate histograms in new ways to process large datasets, or data streams, while limiting inter-process communication bandwidth requirements. Further, in some embodiments, a scalable network of computers or processors is utilized for fast computation of decision trees. Preferably, the network comprises a tree structure of processors, comprising a master node and a plurality of worker nodes or “workers,” again arranged to limit necessary communications.
Type: Grant
Filed: November 9, 2012
Date of Patent: February 23, 2016
Assignee: BigML, Inc.
Inventors: Francisco J. Martin, Adam Ashenfelter, J. Justin Donaldson, Jos Verwoerd, Jose Antonio Ortega, Charles Parker
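
To make the role of approximate histograms concrete, here is a minimal sketch of one well-known variant: a fixed-size streaming histogram that merges its two closest bins whenever it grows past a limit. The abstract does not say which histogram scheme the patent uses, so this choice, and the names `build` and `merge`, are assumptions for illustration only. The point it shows is that each worker can summarize its share of the data in a handful of (mean, count) bins and send only those bins to the master.

```python
from typing import Iterable, List, Tuple

Bin = Tuple[float, int]  # (mean of the values in the bin, number of values)


def _reduce(bins: List[Bin], max_bins: int) -> List[Bin]:
    """Merge the two closest adjacent bins until at most max_bins remain."""
    bins = sorted(bins)
    while len(bins) > max_bins:
        # Index of the adjacent pair with the smallest gap between means.
        i = min(range(len(bins) - 1), key=lambda k: bins[k + 1][0] - bins[k][0])
        (m1, c1), (m2, c2) = bins[i], bins[i + 1]
        merged = ((m1 * c1 + m2 * c2) / (c1 + c2), c1 + c2)
        bins[i:i + 2] = [merged]
    return bins


def build(values: Iterable[float], max_bins: int) -> List[Bin]:
    """Summarize a stream of values in a histogram of bounded size."""
    bins: List[Bin] = []
    for v in values:
        bins = _reduce(bins + [(float(v), 1)], max_bins)
    return bins


def merge(a: List[Bin], b: List[Bin], max_bins: int) -> List[Bin]:
    """Combine two workers' histograms without access to the raw values."""
    return _reduce(a + b, max_bins)


worker_1 = build([1, 2], max_bins=2)
worker_2 = build([100, 101], max_bins=2)
print(merge(worker_1, worker_2, max_bins=2))
```

The message size depends on `max_bins` rather than on the number of rows a worker has seen, which is what keeps communication bounded in this kind of design.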
-
Patent number: 9098326
Abstract: We describe a high-level computational framework especially well suited to parallel operations on large datasets. In a system in accordance with this framework, there is at least one, and generally several, instances of an architecture deployment as further described. We use the term “architecture deployment” herein to mean a cooperating group of processes together with the hardware on which the processes are executed. This is not to imply a one-to-one association of any process to particular hardware. To the contrary, as detailed below, an architecture deployment may dynamically spawn another deployment as appropriate, including provisioning needed hardware. The active architecture deployments together form a system that dynamically processes jobs requested by a user-customer, in accordance with customer's monetary budget and other criteria, in a robust and automatically scalable environment.
Type: Grant
Filed: November 9, 2012
Date of Patent: August 4, 2015
Assignee: BigML, Inc.
Inventors: Francisco J. Martin, Adam Ashenfelter, J. Justin Donaldson, Jos Verwoerd, Jose Antonio Ortega, Charles Parker
-
Publication number: 20150081685
Abstract: A system and method generates and displays an interactive space-filling graphical representation of a model based at least in part on a dataset having data items. The space-filling graphical representation may have a plurality of segments arranged to realize a type of visualization and sized in proportion to a number of data items represented by the segment to convey particular information about the dataset.
Type: Application
Filed: September 24, 2014
Publication date: March 19, 2015
Applicant: BigML, Inc.
Inventors: Adam Ashenfelter, David Gerster, Oscar Rovira
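
As a small illustration of sizing segments in proportion to the data items they represent, the sketch below lays out one ring of a circular chart. The abstract does not name a specific layout, so the circular arrangement, the function name `segment_angles`, and the sample counts are assumptions.

```python
from typing import Dict, Tuple


def segment_angles(counts: Dict[str, int]) -> Dict[str, Tuple[float, float]]:
    """Map each segment to (start, end) angles in degrees, proportional to its count."""
    total = sum(counts.values())
    if total <= 0:
        raise ValueError("at least one data item is required")
    angles: Dict[str, Tuple[float, float]] = {}
    start = 0.0
    for name, count in counts.items():
        sweep = 360.0 * count / total  # share of the full circle
        angles[name] = (start, start + sweep)
        start += sweep
    return angles


# Hypothetical counts of data items reaching two branches of a model.
print(segment_angles({"left branch": 1, "right branch": 3}))
```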
-
Publication number: 20130117280
Abstract: A decision tree model is generated from sample data. A visualization system may automatically prune the decision tree model based on characteristics of nodes or branches in the decision tree or based on artifacts associated with model generation. For example, only nodes or questions in the decision tree receiving a largest amount of the sample data may be displayed in the decision tree. The nodes also may be displayed in a manner to more readily identify associated fields or metrics. For example, the nodes may be displayed in different colors and the colors may be associated with different node questions or answers.
Type: Application
Filed: November 2, 2012
Publication date: May 9, 2013
Applicant: BigML, Inc.
Inventor: BigML, Inc.
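
The example in this abstract, showing only the nodes that receive the largest amount of sample data, can be sketched as a best-first walk from the root. This is one possible reading, not the disclosed implementation; the `Node` class, the `limit` parameter, and the sample tree are hypothetical.

```python
import heapq
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    question: str
    count: int  # number of sample rows that reached this node
    children: List["Node"] = field(default_factory=list)


def visible_nodes(root: Node, limit: int) -> List[str]:
    """Return the questions of up to `limit` nodes, largest sample counts first."""
    shown: List[str] = []
    # Max-heap via negated counts; the integer tie-breaker keeps Node objects
    # from ever being compared with each other.
    heap = [(-root.count, 0, root)]
    tie = 1
    while heap and len(shown) < limit:
        _, _, node = heapq.heappop(heap)
        shown.append(node.question)
        for child in node.children:
            heapq.heappush(heap, (-child.count, tie, child))
            tie += 1
    return shown


tree = Node("age > 30?", 100, [
    Node("income > 50k?", 70, [Node("owns home?", 50), Node("student?", 20)]),
    Node("region = west?", 30),
])
print(visible_nodes(tree, limit=3))
```

Because a child is only considered after its parent has been shown, the displayed nodes always form a connected tree from the root.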