Patents by Inventor MOSTAFA MOKHTAR

MOSTAFA MOKHTAR has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

Adaptive approach to lazy materialization in database scans using pushed filters

Patent number: 12124450

Abstract: Disclosed herein is a method for determining whether to apply a lazy materialization technique to a query run. A data processing service receives a request to perform a query identifying a filter column and a non-filter column in a columnar database. The data processing service accesses a first task of contiguous rows in the filter column from a cloud-based object storage. The data processing service applies a filter defined by the query to the first task. The data processing service generates filter results for the first task that may include a percentage of the first task discarded and a run-time. The data processing service determines, based on the filter results for the first task, a likelihood value that indicates a likelihood of gaining a performance benefit by applying the lazy materialization technique to a second task of the query.

Type: Grant

Filed: January 27, 2023

Date of Patent: October 22, 2024

Assignee: Databricks, Inc.

Inventors: Shoumik Palkar, Alexander Behm, Mostafa Mokhtar, Sriram Krishnamurthy
Distinct value estimation for query planning

Patent number: 12105712

Abstract: The problem of distinct value estimation has many applications, but is particularly important in the field of database technology where such information is utilized by query planners to generate and optimize query plans. Introduced is a novel technique for estimating the number of distinct values in a given dataset without scanning all of the values in the dataset. In an example embodiment, the introduced technique includes gathering multiple intermediate probabilistic estimates based on varying samples of the dataset, 2) plotting the multiple intermediate probabilistic estimates against indications of sample size, 3) fitting a function to the plotted data points, and 4) determining an overall distinct value estimate by extrapolating the objective function to an estimated or known total number of values in the dataset.

Type: Grant

Filed: April 24, 2023

Date of Patent: October 1, 2024

Assignee: CLOUDERA, INC.

Inventors: Alexander Behm, Mostafa Mokhtar
Adaptive Approach to Lazy Materialization in Database Scans using Pushed Filters

Publication number: 20240256543

Abstract: Disclosed herein is a method for determining whether to apply a lazy materialization technique to a query run. A data processing service receives a request to perform a query identifying a filter column and a non-filter column in a columnar database. The data processing service accesses a first task of contiguous rows in the filter column from a cloud-based object storage. The data processing service applies a filter defined by the query to the first task. The data processing service generates filter results for the first task that may include a percentage of the first task discarded and a run-time. The data processing service determines, based on the filter results for the first task, a likelihood value that indicates a likelihood of gaining a performance benefit by applying the lazy materialization technique to a second task of the query.

Type: Application

Filed: January 27, 2023

Publication date: August 1, 2024

Inventors: Shoumik Palkar, Alexander Behm, Mostafa Mokhtar, Sriram Krishnamurthy
STATIC APPROACH TO LAZY MATERIALIZATION IN DATABASE SCANS USING PUSHED FILTERS

Publication number: 20240256539

Abstract: Disclosed herein is a method for determining whether to apply a lazy materialization technique to a query run. The method includes receiving a request to perform a new query in a columnar database containing a plurality of columns. A step in the method includes accessing a set of data in a column of the plurality of columns based on the query. The method includes generating an input to a machine-learned model comprising characteristics of the set of data in the column. From the machine-learned model, the method includes generating a likelihood value indicative of whether a filter of a first portion of the set of data in the column has greater efficiency than a download followed by a filter of the set of data in the column. The method further includes comparing the likelihood value to a threshold value. Based on the comparison, the method includes filtering the first portion of the set of data before downloading the set of data if the likelihood value is equal to or above the threshold value.

Type: Application

Filed: January 27, 2023

Publication date: August 1, 2024

Inventors: Shoumik Palkar, Alexander Behm, Mostafa Mokhtar, Sriram Krishnamurthy
DISTINCT VALUE ESTIMATION FOR QUERY PLANNING

Publication number: 20230350894

Abstract: The problem of distinct value estimation has many applications, but is particularly important in the field of database technology where such information is utilized by query planners to generate and optimize query plans. Introduced is a novel technique for estimating the number of distinct values in a given dataset without scanning all of the values in the dataset. In an example embodiment, the introduced technique includes gathering multiple intermediate probabilistic estimates based on varying samples of the dataset, 2) plotting the multiple intermediate probabilistic estimates against indications of sample size, 3) fitting a function to the plotted data points, and 4) determining an overall distinct value estimate by extrapolating the objective function to an estimated or known total number of values in the dataset.

Type: Application

Filed: April 24, 2023

Publication date: November 2, 2023

Inventors: Alexander Behm, Mostafa Mokhtar
Distinct value estimation for query planning

Patent number: 11663213

Abstract: The problem of distinct value estimation has many applications, but is particularly important in the field of database technology where such information is utilized by query planners to generate and optimize query plans. Introduced is a novel technique for estimating the number of distinct values in a given dataset without scanning all of the values in the dataset. In an example embodiment, the introduced technique includes gathering multiple intermediate probabilistic estimates based on varying samples of the dataset, 2) plotting the multiple intermediate probabilistic estimates against indications of sample size, 3) fitting a function to the plotted data points, and 4) determining an overall distinct value estimate by extrapolating the objective function to an estimated or known total number of values in the dataset.

Type: Grant

Filed: November 25, 2020

Date of Patent: May 30, 2023

Assignee: Cloudera, Inc.

Inventors: Alexander Behm, Mostafa Mokhtar
DISTINCT VALUE ESTIMATION FOR QUERY PLANNING

Publication number: 20210149904

Abstract: The problem of distinct value estimation has many applications, but is particularly important in the field of database technology where such information is utilized by query planners to generate and optimize query plans. Introduced is a novel technique for estimating the number of distinct values in a given dataset without scanning all of the values in the dataset. In an example embodiment, the introduced technique includes gathering multiple intermediate probabilistic estimates based on varying samples of the dataset, 2) plotting the multiple intermediate probabilistic estimates against indications of sample size, 3) fitting a function to the plotted data points, and 4) determining an overall distinct value estimate by extrapolating the objective function to an estimated or known total number of values in the dataset.

Type: Application

Filed: November 25, 2020

Publication date: May 20, 2021

Inventors: Alexander Behm, Mostafa Mokhtar
Distinct value estimation for query planning

Patent number: 10853368

Abstract: The problem of distinct value estimation has many applications, but is particularly important in the field of database technology where such information is utilized by query planners to generate and optimize query plans. Introduced is a novel technique for estimating the number of distinct values in a given dataset without scanning all of the values in the dataset. In an example embodiment, the introduced technique includes gathering multiple intermediate probabilistic estimates based on varying samples of the dataset, 2) plotting the multiple intermediate probabilistic estimates against indications of sample size, 3) fitting a function to the plotted data points, and 4) determining an overall distinct value estimate by extrapolating the objective function to an estimated or known total number of values in the dataset.

Type: Grant

Filed: April 2, 2018

Date of Patent: December 1, 2020

Assignee: Cloudera, Inc.

Inventors: Alexander Behm, Mostafa Mokhtar
DISTINCT VALUE ESTIMATION FOR QUERY PLANNING

Publication number: 20190303479

Abstract: The problem of distinct value estimation has many applications, but is particularly important in the field of database technology where such information is utilized by query planners to generate and optimize query plans. Introduced is a novel technique for estimating the number of distinct values in a given dataset without scanning all of the values in the dataset. In an example embodiment, the introduced technique includes gathering multiple intermediate probabilistic estimates based on varying samples of the dataset, 2) plotting the multiple intermediate probabilistic estimates against indications of sample size, 3) fitting a function to the plotted data points, and 4) determining an overall distinct value estimate by extrapolating the objective function to an estimated or known total number of values in the dataset.

Type: Application

Filed: April 2, 2018

Publication date: October 3, 2019

Inventors: ALEXANDER BEHM, MOSTAFA MOKHTAR