Patents by Inventor Alexander Behm

Alexander Behm has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

LIFO based spilling for grouping aggregation

Patent number: 11481398

Abstract: A system for spilling comprises an interface and a processor. The interface is configured to receive an indication to perform a GROUP BY operation, wherein the indication comprises an input table and a grouping column. The processor is configured to: for each input table entry of the input table, determine a key, wherein the key is based at least in part on the input table entry and the grouping column; add the key to a grouping hash table, wherein adding the key to the grouping hash table comprises last-in, first-out (LIFO) spilling when necessary; create an output table based at least in part on the grouping hash table; and provide the output table.
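
The abstract does not say how a spill victim is chosen beyond "last-in, first-out", so the following is only a minimal sketch of one possible reading, not the patented method. It assumes a COUNT aggregate, a hypothetical `max_groups` memory budget, and a plain list standing in for spill storage; the function name `group_by_count` is also an illustration. When the grouping hash table is full, the most recently inserted group is spilled, and spilled partial counts are merged back when the output table is created.

```python
from collections import defaultdict


def group_by_count(rows, grouping_column, max_groups):
    """GROUP BY ... COUNT(*) with a bounded hash table and LIFO spilling.

    max_groups is a hypothetical memory budget: the maximum number of
    groups held in the in-memory hash table at any time.
    """
    if max_groups < 1:
        raise ValueError("max_groups must be at least 1")

    table = {}    # grouping hash table: key -> partial count
    spilled = []  # stand-in for spill storage: (key, partial count) pairs

    for row in rows:
        key = row[grouping_column]
        if key in table:
            table[key] += 1
            continue
        if len(table) >= max_groups:
            # dict.popitem() removes the most recently inserted entry,
            # which gives last-in, first-out spilling.
            spilled.append(table.popitem())
        table[key] = 1

    # Build the output table by merging spilled and in-memory partial counts.
    totals = defaultdict(int)
    for key, count in spilled:
        totals[key] += count
    for key, count in table.items():
        totals[key] += count
    return dict(totals)
```

For example, `group_by_count([{"c": "a"}, {"c": "b"}, {"c": "c"}, {"c": "b"}, {"c": "a"}], "c", max_groups=2)` spills twice and still returns a count of 2 for "a", 2 for "b", and 1 for "c".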

Type: Grant

Filed: December 9, 2020

Date of Patent: October 25, 2022

Assignee: Databricks Inc.

Inventors: Alexander Behm, Ankur Dave, Ryan Deng, Shoumik Palkar

INTEGRATED NATIVE VECTORIZED ENGINE FOR COMPUTATION

Publication number: 20220100761

Abstract: A system comprises an interface, a processor, and a memory. The interface is configured to receive a query. The processor is configured to: determine a set of nodes for the query; determine whether a node of the set of nodes comprises a first engine node type or a second engine node type, wherein determining whether the node of the set of nodes comprises the first engine node type or the second engine node type is based at least in part on determining whether the node is able to be executed in a second engine; and generate a plan based at least in part on the set of nodes. The memory is coupled to the processor and is configured to provide the processor with instructions.
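
The node classification step described in the abstract can be sketched roughly as follows. This is an illustration under stated assumptions, not the claimed system: the `PlanNode` class, the `SECOND_ENGINE_SUPPORTED` operator set, and the "first"/"second" labels are all hypothetical, and a real planner would test far more than the operator name when deciding whether a node can execute in the second engine.

```python
from dataclasses import dataclass

# Hypothetical set of operators the second engine is able to execute.
SECOND_ENGINE_SUPPORTED = frozenset({"scan", "filter", "project", "aggregate"})


@dataclass
class PlanNode:
    operator: str
    engine: str  # "first" or "second"


def generate_plan(operators, supported=SECOND_ENGINE_SUPPORTED):
    """Tag each query node with an engine type and return the plan.

    A node gets the second engine node type only if the second engine
    can execute it; otherwise it keeps the first engine node type.
    """
    plan = []
    for operator in operators:
        engine = "second" if operator in supported else "first"
        plan.append(PlanNode(operator, engine))
    return plan
```

With these assumptions, a query made of `["scan", "udf", "aggregate"]` yields a plan whose nodes are tagged second, first, and second, because the hypothetical `udf` operator is not in the supported set.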

Type: Application

Filed: April 22, 2021

Publication date: March 31, 2022

Inventors: Shi Xin, Alexander Behm, Shoumik Palkar, Herman Rudolf Petrus Catharina van Hövell tot Westerflier

DISTINCT VALUE ESTIMATION FOR QUERY PLANNING

Publication number: 20210149904

Abstract: The problem of distinct value estimation has many applications, but is particularly important in the field of database technology where such information is utilized by query planners to generate and optimize query plans. Introduced is a novel technique for estimating the number of distinct values in a given dataset without scanning all of the values in the dataset. In an example embodiment, the introduced technique includes 1) gathering multiple intermediate probabilistic estimates based on varying samples of the dataset, 2) plotting the multiple intermediate probabilistic estimates against indications of sample size, 3) fitting a function to the plotted data points, and 4) determining an overall distinct value estimate by extrapolating the objective function to an estimated or known total number of values in the dataset.
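
The four steps in the abstract can be sketched as below. Several choices here are assumptions made for illustration and are not taken from the patent: the intermediate estimates are exact distinct counts on each sample rather than probabilistic estimates, the fitted function is a power law d = a * n^b (fitted as a straight line in log-log space), and the names `estimate_distinct` and `sample_sizes` are hypothetical. Each sample size must be between 1 and the number of values.

```python
import math
import random


def estimate_distinct(values, sample_sizes, seed=0):
    """Extrapolate a distinct-value count from samples of increasing size."""
    rng = random.Random(seed)
    total = len(values)

    # Steps 1 and 2: one (log sample size, log distinct count) point per sample.
    xs, ys = [], []
    for n in sample_sizes:
        sample = rng.sample(values, n)
        xs.append(math.log(n))
        ys.append(math.log(len(set(sample))))

    # Step 3: least-squares line through the points in log-log space,
    # which corresponds to a power law in the original space.
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("need at least two different sample sizes")
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sxx
    intercept = mean_y - slope * mean_x

    # Step 4: extrapolate the fitted function to the total number of values.
    return math.exp(intercept + slope * math.log(total))
```

A power law is only one candidate for the fitted function. It fits a column whose values are all distinct well, but it tends to overestimate columns whose distinct count levels off as the sample grows, so a saturating function may be a better choice for such data.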

Type: Application

Filed: November 25, 2020

Publication date: May 20, 2021

Inventors: Alexander Behm, Mostafa Mokhtar

Distinct value estimation for query planning

Patent number: 10853368

Abstract: The problem of distinct value estimation has many applications, but is particularly important in the field of database technology where such information is utilized by query planners to generate and optimize query plans. Introduced is a novel technique for estimating the number of distinct values in a given dataset without scanning all of the values in the dataset. In an example embodiment, the introduced technique includes 1) gathering multiple intermediate probabilistic estimates based on varying samples of the dataset, 2) plotting the multiple intermediate probabilistic estimates against indications of sample size, 3) fitting a function to the plotted data points, and 4) determining an overall distinct value estimate by extrapolating the objective function to an estimated or known total number of values in the dataset.

Type: Grant

Filed: April 2, 2018

Date of Patent: December 1, 2020

Assignee: Cloudera, Inc.

Inventors: Alexander Behm, Mostafa Mokhtar

Method and apparatus for improving performance of approximate string queries using variable length high-quality grams

Patent number: 7996369

Abstract: A computer process, called VGRAM, improves the performance of these string search algorithms in computers by using a carefully chosen dictionary of variable-length grams based on their frequencies in the string collection. A dynamic programming algorithm for computing a tight lower bound on the number of common grams shared by two similar strings in order to improve query performance is disclosed. A method for automatically computing a dictionary of high-quality grams for a workload of queries. Improvement on query performance is achieved by these techniques by a cost-based quantitative approach to deciding good grams for approximate string queries. An approach for answering approximate queries efficiently based on discarding gram lists, and another is based on combining correlated lists. An indexing structure is reduced to a given amount of space, while retaining efficient query processing by using algorithms in a computer based on discarding gram lists and combining correlated lists.
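
For background, the gram-based filtering that the abstract builds on can be sketched with fixed-length grams. The code below is not VGRAM: it uses ordinary q-grams of a single length and the standard count filter, which says that two strings within edit distance k share at least max(|s1|, |s2|) - q + 1 - k*q grams, because each edit operation can destroy at most q grams. VGRAM, per the abstract, replaces the fixed-length grams with a dictionary of variable-length grams and computes a tighter lower bound with dynamic programming. The function names are hypothetical.

```python
from collections import Counter


def qgrams(s, q):
    """Return the multiset of length-q substrings of s (no padding)."""
    return Counter(s[i:i + q] for i in range(len(s) - q + 1))


def common_gram_lower_bound(s1, s2, q, k):
    """Minimum number of shared q-grams if edit distance(s1, s2) <= k."""
    return max(len(s1), len(s2)) - q + 1 - k * q


def may_match(s1, s2, q, k):
    """Count filter: False means the pair cannot be within distance k.

    True only means the pair survives the filter and still has to be
    verified with a real edit distance computation.
    """
    shared = sum((qgrams(s1, q) & qgrams(s2, q)).values())
    return shared >= common_gram_lower_bound(s1, s2, q, k)
```

With q = 2 and k = 1, the pair "abcd" and "abxd" shares one gram ("ab") and meets the bound of 1, so it survives the filter, while "abcd" and "wxyz" shares none and is pruned without computing an edit distance.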

Type: Grant

Filed: December 14, 2008

Date of Patent: August 9, 2011

Assignee: The Regents of the University of California

Inventors: Chen Li, Bin Wang, Xiaochun Yang, Alexander Behm, Shengyue Ji, Jiaheng Lu

Method and Apparatus for Improving Performance of Approximate String Queries Using Variable Length High-Quality Grams

Publication number: 20100125594

Abstract: A computer process, called VGRAM, improves the performance of these string search algorithms in computers by using a carefully chosen dictionary of variable-length grams based on their frequencies in the string collection. A dynamic programming algorithm for computing a tight lower bound on the number of common grams shared by two similar strings in order to improve query performance is disclosed. A method for automatically computing a dictionary of high-quality grams for a workload of queries. Improvement on query performance is achieved by these techniques by a cost-based quantitative approach to deciding good grams for approximate string queries. An approach for answering approximate queries efficiently based on discarding gram lists, and another is based on combining correlated lists. An indexing structure is reduced to a given amount of space, while retaining efficient query processing by using algorithms in a computer based on discarding gram lists and combining correlated lists.

Type: Application

Filed: December 14, 2008

Publication date: May 20, 2010

Applicant: The Regents of the University of California

Inventors: Chen Li, Bin Wang, Xiaochun Yang, Alexander Behm, Shengyue Ji, Jiaheng Lu