Patents Assigned to BigML, Inc.
  • Publication number: 20170286839
    Abstract: Systems and methods of selecting machine learning models/algorithms for a candidate dataset are disclosed. A computer system may access historical data of a set of algorithms applied to a set of benchmark datasets; select a first algorithm of the set of algorithms; apply the first algorithm to an input dataset to create a model of the input dataset; evaluate and store results of the applying; and add the first algorithm to a set of tried algorithms. The computer system may select a next algorithm of the algorithm set via submodular optimization based on the historical data and the set of tried algorithms; apply the next algorithm to the input dataset; capture a next result based on the applying; add the next result to update the set of tried algorithms; and repeat the submodular optimization. The procedure may continue until a termination condition is reached.
    Type: Application
    Filed: April 3, 2017
    Publication date: October 5, 2017
    Applicant: BigML, Inc.
    Inventor: Charles Parker
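The selection loop in this abstract can be illustrated with a greedy step over a submodular objective. The sketch below is only one plausible reading, not the method claimed in the application: it assumes a facility-location objective (the best non-negative historical score per benchmark dataset, summed), and the names `coverage`, `next_algorithm`, `select_and_run`, and the `budget` termination condition are all hypothetical. Unlike the abstract, it also picks the first algorithm greedily rather than by a separate rule.

```python
from typing import Callable, Dict, List, Optional, Set

# Assumed shape of the historical data: algorithm name -> score on each benchmark dataset.
History = Dict[str, List[float]]


def coverage(history: History, chosen: Set[str]) -> float:
    """Facility-location objective (an assumption): sum of the best score per benchmark."""
    if not chosen:
        return 0.0
    n_benchmarks = len(next(iter(history.values())))
    return sum(max(history[a][d] for a in chosen) for d in range(n_benchmarks))


def next_algorithm(history: History, tried: Set[str]) -> Optional[str]:
    """Greedy step: pick the untried algorithm with the largest marginal gain."""
    candidates = [a for a in history if a not in tried]
    if not candidates:
        return None
    base = coverage(history, tried)
    return max(candidates, key=lambda a: coverage(history, tried | {a}) - base)


def select_and_run(history: History,
                   evaluate: Callable[[str], float],
                   budget: int) -> Dict[str, float]:
    """Apply algorithms one at a time until the (hypothetical) budget is spent."""
    tried: Set[str] = set()
    results: Dict[str, float] = {}
    while len(tried) < budget:
        algo = next_algorithm(history, tried)
        if algo is None:
            break
        results[algo] = evaluate(algo)  # apply to the input dataset and store the result
        tried.add(algo)
    return results
```

Because the objective rewards algorithms that do well where the already-tried ones did poorly, the greedy step tends to favor complementary algorithms over near-duplicates of earlier picks.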
  • Publication number: 20170140302
    Abstract: A system and method enables users to selectively expose and optionally monetize their data resources, for example on a web site. Data assets such as datasets and models can be exposed by the proprietor on a public gallery for use by others. Fees may be charged, for example, per new model, or per prediction using a model. Users may selectively expose public datasets or public models while keeping their raw data private.
    Type: Application
    Filed: January 26, 2017
    Publication date: May 18, 2017
    Applicant: BigML, Inc.
    Inventors: Francisco J. Martin, Oscar Rovira, Jos Verwoerd, Poul Petersen, Charles Parker, Jose Antonio Ortega, Beatriz Garcia, J. Justin Donaldson, Antonio Blasco, Adam Ashenfelter
  • Publication number: 20170090980
    Abstract: We describe a high-level computational framework especially well suited to parallel operations on large datasets. In a system in accordance with this framework, there is at least one, and generally several, instances of an architecture deployment as further described. We use the term “architecture deployment” herein to mean a cooperating group of processes together with the hardware on which the processes are executed. This is not to imply a one-to-one association of any process to particular hardware. To the contrary, as detailed below, an architecture deployment may dynamically spawn another deployment as appropriate, including provisioning needed hardware. The active architecture deployments together form a system that dynamically processes jobs requested by a user-customer, in accordance with customer's monetary budget and other criteria, in a robust and automatically scalable environment.
    Type: Application
    Filed: December 7, 2016
    Publication date: March 30, 2017
    Applicant: BigML, Inc.
    Inventors: Francisco J. Martin, Adam Ashenfelter, J. Justin Donaldson, Jos Verwoerd, Jose Antonio Ortega, Charles Parker
  • Patent number: 9576246
    Abstract: A system and method enables users to selectively expose and optionally monetize their data resources, for example on a web site. Data assets such as datasets and models can be exposed by the proprietor on a public gallery for use by others. Fees may be charged, for example, per new model, or per prediction using a model. Users may selectively expose public datasets or public models while keeping their raw data private.
    Type: Grant
    Filed: September 12, 2013
    Date of Patent: February 21, 2017
    Assignee: BigML, Inc.
    Inventors: Francisco J. Martin, Oscar Rovira, Jos Verwoerd, Poul Petersen, Charles Parker, Jose Antonio Ortega, Beatriz Garcia, J. Justin Donaldson, Antonio Blasco, Adam Ashenfelter
  • Publication number: 20170032026
    Abstract: Systems and processes are disclosed for advanced text analysis in the field of big data analytics and visualization: Users can now factor text into their predictive models, alongside regression, time/date and categorical information. This is ideal for building models where text content may play a prominent role (e.g., social media or customer service logs). Multiple data types, including text fields, may be combined together in datasets and models, and may be presented in various interactive visualization displays.
    Type: Application
    Filed: October 12, 2016
    Publication date: February 2, 2017
    Applicant: BigML, Inc.
    Inventors: Charles Parker, Adam Ashenfelter
  • Patent number: 9558036
    Abstract: We describe a high-level computational framework especially well suited to parallel operations on large datasets. In a system in accordance with this framework, there is at least one, and generally several, instances of an architecture deployment as further described. We use the term “architecture deployment” herein to mean a cooperating group of processes together with the hardware on which the processes are executed. This is not to imply a one-to-one association of any process to particular hardware. To the contrary, as detailed below, an architecture deployment may dynamically spawn another deployment as appropriate, including provisioning needed hardware. The active architecture deployments together form a system that dynamically processes jobs requested by a user-customer, in accordance with customer's monetary budget and other criteria, in a robust and automatically scalable environment.
    Type: Grant
    Filed: May 29, 2015
    Date of Patent: January 31, 2017
    Assignee: BigML, Inc.
    Inventors: Francisco J. Martin, Adam Ashenfelter, J. Justin Donaldson, Jos Verwoerd, Jose Antonio Ortega, Charles Parker
  • Patent number: 9501540
    Abstract: Systems and processes are disclosed for advanced text analysis in the field of big data analytics and visualization: Users can now factor text into their predictive models, alongside regression, time/date and categorical information. This is ideal for building models where text content may play a prominent role (e.g., social media or customer service logs). Multiple data types, including text fields, may be combined together in datasets and models, and may be presented in various interactive visualization displays.
    Type: Grant
    Filed: September 25, 2014
    Date of Patent: November 22, 2016
    Assignee: BigML, Inc.
    Inventors: Charles Parker, Adam Ashenfelter
  • Publication number: 20160292578
    Abstract: The present disclosure pertains to a system and method for predictive modeling of data clusters. The system and method include creating a dataset from a data source comprising data points, identifying a number of clusters based at least in part on a similarity metric between the data points, generating a model for each of the number of clusters based at least in part on identifying the number of clusters, visually displaying the number of clusters, receiving an indication of selection of a particular cluster, and replacing the visual display of the identified number of clusters with a visual display of the model corresponding to the particular cluster in response to receiving an indication of selection of a model icon.
    Type: Application
    Filed: April 1, 2016
    Publication date: October 6, 2016
    Applicant: BigML, Inc.
    Inventor: Adam Ashenfelter
  • Patent number: 9269054
    Abstract: Systems and methods are disclosed for building and using decision trees, preferably in a scalable and distributed manner. Our system can be used to create and use classification trees, regression trees, or a combination of regression trees called a gradient boosted regression tree (GBRT). Our system leverages approximate histograms in new ways to process large datasets, or data streams, while limiting inter-process communication bandwidth requirements. Further, in some embodiments, a scalable network of computers or processors is utilized for fast computation of decision trees. Preferably, the network comprises a tree structure of processors, comprising a master node and a plurality of worker nodes or “workers,” again arranged to limit necessary communications.
    Type: Grant
    Filed: November 9, 2012
    Date of Patent: February 23, 2016
    Assignee: BigML, Inc.
    Inventors: Francisco J. Martin, Adam Ashenfelter, J. Justin Donaldson, Jos Verwoerd, Jose Antonio Ortega, Charles Parker
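The idea of workers sending compact histograms to a master, rather than raw rows, can be sketched as follows. This is a simplified illustration under stated assumptions, not the patented method: it uses fixed equal-width bins over a known range `[lo, hi]` instead of the approximate streaming histograms the abstract refers to, it handles a single numeric feature with a regression target, and `Histogram`, `build`, `merge`, and `best_split` are hypothetical names.

```python
from dataclasses import dataclass
from typing import List, Optional, Sequence


@dataclass
class Histogram:
    lo: float
    hi: float
    counts: List[int]    # number of rows falling in each bin
    sums: List[float]    # sum of target values in each bin


def build(xs: Sequence[float], ys: Sequence[float],
          lo: float, hi: float, bins: int) -> Histogram:
    """Worker side: summarize local rows (assumes every x lies in [lo, hi])."""
    h = Histogram(lo, hi, [0] * bins, [0.0] * bins)
    width = (hi - lo) / bins
    for x, y in zip(xs, ys):
        i = min(int((x - lo) / width), bins - 1)
        h.counts[i] += 1
        h.sums[i] += y
    return h


def merge(a: Histogram, b: Histogram) -> Histogram:
    """Master side: histograms with identical bins combine by addition."""
    return Histogram(a.lo, a.hi,
                     [p + q for p, q in zip(a.counts, b.counts)],
                     [p + q for p, q in zip(a.sums, b.sums)])


def best_split(h: Histogram) -> Optional[float]:
    """Pick the bin boundary that maximizes squared-error reduction."""
    total_n, total_s = sum(h.counts), sum(h.sums)
    width = (h.hi - h.lo) / len(h.counts)
    best_threshold, best_score = None, float("-inf")
    n_left, s_left = 0, 0.0
    for i in range(len(h.counts) - 1):
        n_left += h.counts[i]
        s_left += h.sums[i]
        n_right = total_n - n_left
        if n_left == 0 or n_right == 0:
            continue
        # Maximizing this is equivalent to minimizing the summed squared error.
        score = s_left ** 2 / n_left + (total_s - s_left) ** 2 / n_right
        if score > best_score:
            best_threshold, best_score = h.lo + (i + 1) * width, score
    return best_threshold
```

Each worker transmits only `2 * bins` numbers per feature regardless of how many rows it holds, which is the communication saving the abstract alludes to.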
  • Patent number: 9098326
    Abstract: We describe a high-level computational framework especially well suited to parallel operations on large datasets. In a system in accordance with this framework, there is at least one, and generally several, instances of an architecture deployment as further described. We use the term “architecture deployment” herein to mean a cooperating group of processes together with the hardware on which the processes are executed. This is not to imply a one-to-one association of any process to particular hardware. To the contrary, as detailed below, an architecture deployment may dynamically spawn another deployment as appropriate, including provisioning needed hardware. The active architecture deployments together form a system that dynamically processes jobs requested by a user-customer, in accordance with customer's monetary budget and other criteria, in a robust and automatically scalable environment.
    Type: Grant
    Filed: November 9, 2012
    Date of Patent: August 4, 2015
    Assignee: BigML, Inc.
    Inventors: Francisco J. Martin, Adam Ashenfelter, J. Justin Donaldson, Jos Verwoerd, Jose Antonio Ortega, Charles Parker
  • Publication number: 20150081685
    Abstract: A system and method generates and displays an interactive space-filling graphical representation of a model based at least in part on a dataset having data items. The space-filling graphical representation may have a plurality of segments arranged to realize a type of visualization and sized in proportion to a number of data items represented by the segment to convey particular information about the dataset.
    Type: Application
    Filed: September 24, 2014
    Publication date: March 19, 2015
    Applicant: BigML, Inc.
    Inventors: Adam Ashenfelter, David Gerster, Oscar Rovira
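Sizing segments in proportion to the number of data items they represent can be shown with a very small layout routine. This is a minimal sketch, assuming a one-level "slice" layout in which segments are laid side by side in a rectangle; the application covers space-filling visualizations more generally, and `slice_layout` is a hypothetical name.

```python
from typing import Dict, Tuple

Rect = Tuple[float, float, float, float]  # (x, y, width, height)


def slice_layout(counts: Dict[str, int], width: float, height: float) -> Dict[str, Rect]:
    """Split a width x height rectangle into vertical slices sized by item count."""
    total = sum(counts.values())
    if total <= 0:
        raise ValueError("at least one data item is required")
    rects: Dict[str, Rect] = {}
    x = 0.0
    for name, n in counts.items():
        w = width * n / total  # segment width proportional to its share of the items
        rects[name] = (x, 0.0, w, height)
        x += w
    return rects
```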
  • Publication number: 20130117280
    Abstract: A decision tree model is generated from sample data. A visualization system may automatically prune the decision tree model based on characteristics of nodes or branches in the decision tree or based on artifacts associated with model generation. For example, only nodes or questions in the decision tree receiving a largest amount of the sample data may be displayed in the decision tree. The nodes also may be displayed in a manner to more readily identify associated fields or metrics. For example, the nodes may be displayed in different colors and the colors may be associated with different node questions or answers.
    Type: Application
    Filed: November 2, 2012
    Publication date: May 9, 2013
    Applicant: BigML, Inc.
    Inventor: BigML, Inc.
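Displaying only the nodes that receive the largest amount of sample data can be sketched as a best-first walk from the root. This is one possible reading under stated assumptions, not the application's method: each node carries the number of samples that reached it, the display shows at most `limit` nodes, and `Node` and `visible_nodes` are hypothetical names.

```python
import heapq
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    name: str
    count: int                      # number of samples that reached this node
    children: List["Node"] = field(default_factory=list)


def visible_nodes(root: Node, limit: int) -> List[str]:
    """Return up to `limit` node names, always expanding the heaviest frontier node."""
    # Heap entries are (-count, tie_breaker, node); the unique tie breaker
    # ensures Node objects themselves are never compared.
    frontier = [(-root.count, 0, root)]
    tie = 1
    shown: List[str] = []
    while frontier and len(shown) < limit:
        _, _, node = heapq.heappop(frontier)
        shown.append(node.name)
        for child in node.children:
            heapq.heappush(frontier, (-child.count, tie, child))
            tie += 1
    return shown
```

Expanding from the root keeps the displayed portion connected: a child can appear only after its parent has been shown.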