Patents by Inventor Matei Zaharia
Matei Zaharia has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Publication number: 20230177072
Abstract: The present application discloses a method, system, and computer system for managing a plurality of features and storing lineage information pertaining to the features. The method includes obtaining one or more datasets, determining a first feature, wherein the first feature is determined based at least in part on the one or more datasets, and storing the first feature in a feature store. The first feature is stored in association with a dataset indication of the one or more datasets from which the first feature is determined. The feature store comprises a plurality of features.
Type: Application
Filed: January 31, 2023
Publication date: June 8, 2023
Inventors: Mani Parkhe, Clemens Mewald, Matei Zaharia, Avesh Singh
-
Publication number: 20230141556
Abstract: A system for executing a streaming query includes an interface and a processor. The interface is configured to receive a logical query plan. The processor is configured to determine a physical query plan based at least in part on the logical query plan. The physical query plan comprises an ordered set of operators. Each operator of the ordered set of operators comprises an operator input mode and an operator output mode. The processor is further configured to execute the physical query plan using the operator input mode and the operator output mode for each operator of the query.
Type: Application
Filed: October 28, 2022
Publication date: May 11, 2023
Inventors: Michael Paul Armbrust, Tathagata Das, Shi Xin, Matei Zaharia
-
Patent number: 11514045
Abstract: A system for executing a streaming query includes an interface and a processor. The interface is configured to receive a logical query plan. The processor is configured to determine a physical query plan based at least in part on the logical query plan. The physical query plan comprises an ordered set of operators. Each operator of the ordered set of operators comprises an operator input mode and an operator output mode. The processor is further configured to execute the physical query plan using the operator input mode and the operator output mode for each operator of the query.
Type: Grant
Filed: December 19, 2019
Date of Patent: November 29, 2022
Assignee: Databricks Inc.
Inventors: Michael Paul Armbrust, Tathagata Das, Shi Xin, Matei Zaharia
-
Publication number: 20220374532
Abstract: The present application discloses a method, system, and computer system for providing access to information stored on system for data storage. The method includes receiving a data request from a user, determining data corresponding to the data request, determining whether the user has requisite permissions to access the data, and in response to determining that the user has requisite permissions to access the data: determining a manner by which to provide access to the data, wherein the data comprises a filtered subset of stored data, and generating a token based at least in part on the user and the manner by which access to the data is to be provided.
Type: Application
Filed: October 29, 2021
Publication date: November 24, 2022
Inventors: Matei Zaharia, David Lewis, Cheng Lian, Yuchen Huo, Ali Ghodsi
-
Publication number: 20220374457
Abstract: The present application discloses a method, system, and computer system for managing a plurality of features and storing lineage information pertaining to the features. The method includes obtaining one or more datasets, determining a first feature, wherein the first feature is determined based at least in part on the one or more datasets, and storing the first feature in a feature store. The first feature is stored in association with a dataset indication of the one or more datasets from which the first feature is determined. The feature store comprises a plurality of features.
Type: Application
Filed: October 29, 2021
Publication date: November 24, 2022
Inventors: Mani Parkhe, Clemens Mewald, Matei Zaharia, Avesh Singh
-
Publication number: 20200257689
Abstract: A system for executing a streaming query includes an interface and a processor. The interface is configured to receive a logical query plan. The processor is configured to determine a physical query plan based at least in part on the logical query plan. The physical query plan comprises an ordered set of operators. Each operator of the ordered set of operators comprises an operator input mode and an operator output mode. The processor is further configured to execute the physical query plan using the operator input mode and the operator output mode for each operator of the query.
Type: Application
Filed: December 19, 2019
Publication date: August 13, 2020
Inventors: Michael Paul Armbrust, Tathagata Das, Shi Xin, Matei Zaharia
-
Patent number: 10558664
Abstract: A system for executing a streaming query includes an interface and a processor. The interface is configured to receive a logical query plan. The processor is configured to determine a physical query plan based at least in part on the logical query plan. The physical query plan comprises an ordered set of operators. Each operator of the ordered set of operators comprises an operator input mode and an operator output mode. The processor is further configured to execute the physical query plan using the operator input mode and the operator output mode for each operator of the query.
Type: Grant
Filed: April 28, 2017
Date of Patent: February 11, 2020
Assignee: Databricks Inc.
Inventors: Michael Armbrust, Tathagata Das, Shi Xin, Matei Zaharia
-
Patent number: 10474501
Abstract: A system for cluster resource allocation includes an interface and a processor. The interface is configured to receive a process and input data. The processor is configured to determine an estimate for resources required for the process to process the input data; determine existing available resources in a cluster for running the process; determine whether the existing available resources are sufficient for running the process; in the event it is determined that the existing available resources are not sufficient for running the process, indicate to add new resources; determine an allocated share of resources in the cluster for running the process; and cause execution of the process using the share of resources.
Type: Grant
Filed: April 28, 2017
Date of Patent: November 12, 2019
Assignee: Databricks Inc.
Inventors: Ali Ghodsi, Srinath Shankar, Sameer Paranjpye, Shi Xin, Matei Zaharia
-
Patent number: 10361928
Abstract: A system for cluster management comprises a status monitor and an instance replacement manager. The status monitor is for monitoring status of an instance of a set of instances on a cluster provider. The instance replacement manager is for determining a replacement strategy for the instance in the event the instance does not respond. The replacement strategy for the instance is based at least in part on a management criteria for on-demand instances and spot instances on the cluster provider.
Type: Grant
Filed: August 21, 2017
Date of Patent: July 23, 2019
Assignee: Databricks Inc.
Inventors: Ali Ghodsi, Ion Stoica, Matei Zaharia
-
Publication number: 20180314732
Abstract: A system for executing a streaming query includes an interface and a processor. The interface is configured to receive a logical query plan. The processor is configured to determine a physical query plan based at least in part on the logical query plan. The physical query plan comprises an ordered set of operators. Each operator of the ordered set of operators comprises an operator input mode and an operator output mode. The processor is further configured to execute the physical query plan using the operator input mode and the operator output mode for each operator of the query.
Type: Application
Filed: April 28, 2017
Publication date: November 1, 2018
Inventors: Michael Armbrust, Tathagata Das, Shi Xin, Matei Zaharia
-
Publication number: 20180314556
Abstract: A system for cluster resource allocation includes an interface and a processor. The interface is configured to receive a process and input data. The processor is configured to determine an estimate for resources required for the process to process the input data; determine existing available resources in a cluster for running the process; determine whether the existing available resources are sufficient for running the process; in the event it is determined that the existing available resources are not sufficient for running the process, indicate to add new resources; determine an allocated share of resources in the cluster for running the process; and cause execution of the process using the share of resources.
Type: Application
Filed: April 28, 2017
Publication date: November 1, 2018
Inventors: Ali Ghodsi, Srinath Shankar, Sameer Paranjpye, Shi Xin, Matei Zaharia
-
Patent number: 10095735
Abstract: A system for exploring data in a database comprises a query parser, a parameter manager, a query submitter, and a result formatter. The query parser is to receive a base query and determine an input parameter from the base query. The parameter manager is to provide a first request for a value for the input parameter; receive the value for the input parameter; and provide a second request for the value for the input parameter. The query submitter is to determine a first query using the base query and the value for the input parameter; and provide an indication to execute the first query. The result formatter is to receive a result associated with the indication to execute the first query.
Type: Grant
Filed: August 11, 2017
Date of Patent: October 9, 2018
Assignee: Databricks Inc.
Inventors: Ali Ghodsi, Ion Stoica, Matei Zaharia
-
Publication number: 20180048536
Abstract: A system for cluster management comprises a status monitor and an instance replacement manager. The status monitor is for monitoring status of an instance of a set of instances on a cluster provider. The instance replacement manager is for determining a replacement strategy for the instance in the event the instance does not respond. The replacement strategy for the instance is based at least in part on a management criteria for on-demand instances and spot instances on the cluster provider.
Type: Application
Filed: August 21, 2017
Publication date: February 15, 2018
Inventors: Ali Ghodsi, Ion Stoica, Matei Zaharia
-
Publication number: 20180046668
Abstract: A system for exploring data in a database comprises a query parser, a parameter manager, a query submitter, and a result formatter. The query parser is to receive a base query and determine an input parameter from the base query. The parameter manager is to provide a first request for a value for the input parameter; receive the value for the input parameter; and provide a second request for the value for the input parameter. The query submitter is to determine a first query using the base query and the value for the input parameter; and provide an indication to execute the first query. The result formatter is to receive a result associated with the indication to execute the first query.
Type: Application
Filed: August 11, 2017
Publication date: February 15, 2018
Inventors: Ali Ghodsi, Ion Stoica, Matei Zaharia
-
Patent number: 9769032
Abstract: A system for cluster management comprises a status monitor and an instance replacement manager. The status monitor is for monitoring status of an instance of a set of instances on a cluster provider. The instance replacement manager is for determining a replacement strategy for the instance in the event the instance does not respond. The replacement strategy for the instance is based at least in part on a management criteria for on-demand instances and spot instances on the cluster provider.
Type: Grant
Filed: March 20, 2015
Date of Patent: September 19, 2017
Assignee: Databricks Inc.
Inventors: Ali Ghodsi, Ion Stoica, Matei Zaharia
-
Patent number: 9760602
Abstract: A system for exploring data in a database comprises a query parser, a parameter manager, a query submitter, and a result formatter. The query parser is to receive a base query and determine an input parameter from the base query. The parameter manager is to provide a first request for a value for the input parameter; receive the value for the input parameter; and provide a second request for the value for the input parameter. The query submitter is to determine a first query using the base query and the value for the input parameter; and provide an indication to execute the first query. The result formatter is to receive a result associated with the indication to execute the first query.
Type: Grant
Filed: February 13, 2015
Date of Patent: September 12, 2017
Assignee: Databricks Inc.
Inventors: Ali Ghodsi, Ion Stoica, Matei Zaharia