Patents by Inventor Sumit Ganguly

Sumit Ganguly has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

Distributed set-expression cardinality estimation

Patent number: 7873689

Abstract: A method and system for answering set-expression cardinality queries while lowering data communication costs by utilizing a coordinator site to provide global knowledge of the distribution of certain frequently occurring stream elements to significantly reduce the transmission of element state information to the central site and, optionally, capturing the semantics of the input set expression in a Boolean logic formula and using models of the formula to determine whether an element state change at a remote site can affect the set expression result.

Type: Grant

Filed: December 30, 2004

Date of Patent: January 18, 2011

Assignee: Alcatel-Lucent USA Inc.

Inventors: Abhinandan Sujit Das, Sumit Ganguly, Minos N. Garofalakis, Rajeev Rastogi
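
The optional Boolean-formula step described in the abstract above can be illustrated with a small sketch. It assumes a set expression such as (A ∩ B) − C is encoded as a formula over per-stream membership bits, and it checks by brute force whether flipping one stream's bit can change the formula's value. The name can_affect, the dictionary encoding, and the exhaustive search are illustrative assumptions, not the procedure claimed in the patent.

```python
from itertools import product

def can_affect(formula, streams, changed, known=None):
    """Return True if flipping membership in `changed` can alter the formula
    for at least one assignment of the streams whose state is not known."""
    known = dict(known or {})
    free = [s for s in streams if s != changed and s not in known]
    for bits in product([False, True], repeat=len(free)):
        state = {**known, **dict(zip(free, bits))}
        before = formula({**state, changed: False})
        after = formula({**state, changed: True})
        if before != after:
            return True
    return False

# Hypothetical set expression (A ∩ B) − C written as a Boolean formula.
def expression(m):
    return m["A"] and m["B"] and not m["C"]

STREAMS = ["A", "B", "C"]
print(can_affect(expression, STREAMS, "A"))                     # True
print(can_affect(expression, STREAMS, "A", known={"C": True}))  # False
```

In the second call the element is already known to be in C, so no change in A can alter the result, and under this reading the remote site would have no reason to transmit the state change.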

Method for distinct count estimation over joins of continuous update stream

Patent number: 7668856

Abstract: The invention provides methods and systems for summarizing multiple continuous update streams such that an approximate answer to a query over one or more of the continuous update streams (such as a query requiring a join operation followed by a duplicate elimination step) may be rapidly provided. The systems and methods use multiple (parallel) Join Distinct (JD) Sketch data structures corresponding to hash buckets of at least one initial attribute.

Type: Grant

Filed: September 30, 2004

Date of Patent: February 23, 2010

Assignee: Alcatel-Lucent USA Inc.

Inventors: Sumit Ganguly, Minos N. Garofalakis, Amit Kumar, Rajeev Rastogi

Streaming algorithms for robust, real-time detection of DDoS attacks

Patent number: 7669241

Abstract: A distinct-count estimate is obtained in a guaranteed small footprint using a two-level hash, distinct-count sketch. A first hash fills the first-level hash buckets with an exponentially decreasing number of data-elements. These are then uniformly hashed to an array of second-level hash tables, and have an associated total-element counter and bit-location counters. These counters are used to identify singletons and so provide a distinct-sample and a distinct-count. An estimate of the total distinct-count is obtained by dividing the distinct-count by the probability of mapping a data-element to that bucket. An estimate of the total distinct-source frequencies of destination addresses can be found in a similar fashion. By further associating the distinct-count sketch with a list of singletons, a total singleton count and a heap containing the destination addresses ordered by their distinct-source frequencies, a tracking distinct-count sketch may be formed that has considerably improved query time.

Type: Grant

Filed: September 30, 2004

Date of Patent: February 23, 2010

Assignee: Alcatel-Lucent USA Inc.

Inventors: Sumit Ganguly, Minos Garofalakis, Rajeev Rastogi, Krishan Sabnani
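
The counters named in the abstract above can be sketched as follows. This is a minimal illustration, assuming integer element identifiers and non-negative net counts: one second-level bucket keeps a total-element counter plus one counter per bit location, and it holds a singleton exactly when every bit counter is either zero or equal to the total. The names CountSignature and first_level are hypothetical, and the full two-level structure, the heap, and the estimator itself are omitted.

```python
class CountSignature:
    """One second-level bucket: a total counter plus one counter per bit."""

    def __init__(self, bits=32):
        self.bits = bits
        self.total = 0
        self.bit_counts = [0] * bits

    def update(self, value, delta=1):
        """Apply an insertion (delta=+1) or a deletion (delta=-1)."""
        self.total += delta
        for i in range(self.bits):
            if (value >> i) & 1:
                self.bit_counts[i] += delta

    def singleton(self):
        """Return the only distinct value in the bucket, or None."""
        if self.total <= 0:
            return None
        value = 0
        for i, count in enumerate(self.bit_counts):
            if count == self.total:
                value |= 1 << i
            elif count != 0:
                return None  # two distinct values disagree on bit i
        return value


def first_level(hashed, levels=32):
    """Index of the lowest set bit of a hash value. Level l is chosen with
    probability 2**-(l + 1), so bucket occupancy decays exponentially."""
    if hashed == 0:
        return levels - 1
    return min((hashed & -hashed).bit_length() - 1, levels - 1)


bucket = CountSignature()
bucket.update(42)
bucket.update(42)
print(bucket.singleton())   # 42
bucket.update(7)
print(bucket.singleton())   # None
bucket.update(7, delta=-1)
print(bucket.singleton())   # 42
```

Because the counters are additive, a deletion simply reverses an insertion, which is what lets a bucket return to singleton status after a colliding element is removed.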

Tracking set-expression cardinalities over continuous update streams

Patent number: 7596544

Abstract: A method of estimating set-expression cardinalities over data streams with guaranteed small maintenance time per data-element update. The method only examines each data element once and uses a limited amount of memory. The time-efficient stream synopsis extends 2-level hash-sketches by randomly, but uniformly, pre-hashing data-elements prior to logarithmically hashing them to a first-level hash-table. This generates a set of independent 2-level hash-sketches. The set-union cardinality can be estimated by determining the smallest hash-bucket index j at which only a predetermined fraction of the b hash-buckets has a non-empty union |A ∪ B|. Once a set-union cardinality is estimated, general set-expression cardinalities may be estimated by counting witness elements for the set-expression, i.e., those first-level hash-buckets that are both a singleton for the set-expression and a set-union singleton.

Type: Grant

Filed: December 29, 2004

Date of Patent: September 29, 2009

Assignee: Alcatel-Lucent USA Inc.

Inventors: Sumit Ganguly, Minos Garofalakis, Rajeev Rastogi

Processing data-stream join aggregates using skimmed sketches

Patent number: 7483907

Abstract: A method of estimating an aggregate of a join over data-streams in real-time using skimmed sketches, that only examines each data element once and has a worst case space requirement of O(n^2/J), where J is the size of the join and n is the number of data elements. The skimmed sketch is an atomic sketch, formed as the inner product of the data-stream frequency vector and a random binary variable, from which the frequency values that exceed a predetermined threshold have been skimmed off and placed in a dense frequency vector. The join size is estimated as the sum of the sub-joins of skimmed sketches and dense frequency vectors. The atomic sketches may be arranged in a hash structure so that processing a data element only requires updating a single sketch per hash table. This keeps the per-element overhead logarithmic in the domain and stream sizes.

Type: Grant

Filed: December 29, 2004

Date of Patent: January 27, 2009

Assignee: Alcatel-Lucent USA Inc.

Inventors: Sumit Ganguly, Minos Garofalakis, Rajeev Rastogi
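
The skimming idea in the abstract above can be sketched roughly as follows. This is a simplified illustration under stated assumptions: frequencies are skimmed from exact counts rather than recovered from the sketch, a keyed hash stands in for the random +1/-1 variables, and plain averaging replaces any more careful combination of estimates. The function names sign, atomic_sketch, skim, and estimate_join are hypothetical.

```python
import hashlib
from collections import Counter

def sign(seed, value):
    """Pseudo-random +1/-1 per (seed, value); an assumed stand-in for the
    random binary variables mentioned in the abstract."""
    digest = hashlib.blake2b(f"{seed}:{value}".encode(), digest_size=1).digest()
    return 1 if digest[0] & 1 else -1

def atomic_sketch(freq, seed):
    """Inner product of a frequency vector with the +1/-1 variables."""
    return sum(count * sign(seed, value) for value, count in freq.items())

def skim(freq, threshold):
    """Split frequencies into a dense part (>= threshold) and the residue."""
    dense = Counter({v: c for v, c in freq.items() if c >= threshold})
    sparse = Counter({v: c for v, c in freq.items() if c < threshold})
    return dense, sparse

def estimate_join(f, g, threshold, seeds):
    """Sum of sub-joins: dense-dense is computed exactly, the other three
    are estimated from products of atomic sketches averaged over seeds."""
    f_dense, f_sparse = skim(f, threshold)
    g_dense, g_sparse = skim(g, threshold)
    dense_dense = sum(c * g_dense[v] for v, c in f_dense.items())

    def sketch_join(a, b):
        products = [atomic_sketch(a, s) * atomic_sketch(b, s) for s in seeds]
        return sum(products) / len(products)

    return (dense_dense
            + sketch_join(f_dense, g_sparse)
            + sketch_join(f_sparse, g_dense)
            + sketch_join(f_sparse, g_sparse))

# Hypothetical streams summarized by their value frequencies.
f = Counter({1: 100, 2: 1, 3: 2})
g = Counter({1: 50, 3: 4, 9: 1})
print(estimate_join(f, g, threshold=10, seeds=range(64)))
```

The exact join size for these two vectors is 100 * 50 + 2 * 4 = 5008. The dense value contributes its 5000 exactly, so the sketch error comes only from the three estimated sub-joins, which is the motivation for skimming high frequencies out of the sketch.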

Distributed set-expression cardinality estimation

Publication number: 20060149744

Abstract: A method and system for answering set-expression cardinality queries while lowering data communication costs by utilizing a coordinator site to provide global knowledge of the distribution of certain frequently occurring stream elements to significantly reduce the transmission of element state information to the central site and, optionally, capturing the semantics of the input set expression in a Boolean logic formula and using models of the formula to determine whether an element state change at a remote site can affect the set expression result.

Type: Application

Filed: December 30, 2004

Publication date: July 6, 2006

Inventors: Abhinandan Das, Sumit Ganguly, Minos Garofalakis, Rajeev Rastogi

Tracking set-expression cardinalities over continuous update streams

Publication number: 20060143218

Abstract: A method of estimating set-expression cardinalities over data streams with guaranteed small maintenance time per data-element update. The method only examines each data element once and uses a limited amount of memory. The time-efficient stream synopsis extends 2-level hash-sketches by randomly, but uniformly, pre-hashing data-elements prior to logarithmically hashing them to a first-level hash-table. This generates a set of independent 2-level hash-sketches. The set-union cardinality can be estimated by determining the smallest hash-bucket index j at which only a predetermined fraction of the b hash-buckets has a non-empty union |A ∪ B|. Once a set-union cardinality is estimated, general set-expression cardinalities may be estimated by counting witness elements for the set-expression, i.e., those first-level hash-buckets that are both a singleton for the set-expression and a set-union singleton.

Type: Application

Filed: December 29, 2004

Publication date: June 29, 2006

Applicant: Lucent Technologies, Inc.

Inventors: Sumit Ganguly, Minos Garofalakis, Rajeev Rastogi

Processing data-stream join aggregates using skimmed sketches

Publication number: 20060143170

Abstract: A method of estimating an aggregate of a join over data-streams in real-time using skimmed sketches, that only examines each data element once and has a worst case space requirement of O(n^2/J), where J is the size of the join and n is the number of data elements. The skimmed sketch is an atomic sketch, formed as the inner product of the data-stream frequency vector and a random binary variable, from which the frequency values that exceed a predetermined threshold have been skimmed off and placed in a dense frequency vector. The join size is estimated as the sum of the sub-joins of skimmed sketches and dense frequency vectors. The atomic sketches may be arranged in a hash structure so that processing a data element only requires updating a single sketch per hash table. This keeps the per-element overhead logarithmic in the domain and stream sizes.

Type: Application

Filed: December 29, 2004

Publication date: June 29, 2006

Applicant: Lucent Technologies, Inc.

Inventors: Sumit Ganguly, Minos Garofalakis, Rajeev Rastogi

Method for distinct count estimation over joins of continuous update stream

Publication number: 20060085592

Abstract: The invention provides methods and systems for summarizing multiple continuous update streams using corresponding multiple (parallel) JD Sketch data structures such that, for example, an approximate answer to a query requiring a join operation followed by a duplicate elimination step may be rapidly provided.

Type: Application

Filed: September 30, 2004

Publication date: April 20, 2006

Inventors: Sumit Ganguly, Minos Garofalakis, Amit Kumar, Rajeev Rastogi

Streaming algorithms for robust, real-time detection of DDoS attacks

Publication number: 20060075489

Abstract: A distinct-count estimate is obtained in a guaranteed small footprint using a two-level hash, distinct-count sketch. A first hash fills the first-level hash buckets with an exponentially decreasing number of data-elements. These are then uniformly hashed to an array of second-level hash tables, and have an associated total-element counter and bit-location counters. These counters are used to identify singletons and so provide a distinct-sample and a distinct-count. An estimate of the total distinct-count is obtained by dividing the distinct-count by the probability of mapping a data-element to that bucket. An estimate of the total distinct-source frequencies of destination addresses can be found in a similar fashion. By further associating the distinct-count sketch with a list of singletons, a total singleton count and a heap containing the destination addresses ordered by their distinct-source frequencies, a tracking distinct-count sketch may be formed that has considerably improved query time.

Type: Application

Filed: September 30, 2004

Publication date: April 6, 2006

Applicant: Lucent Technologies, Inc.

Inventors: Sumit Ganguly, Minos Garofalakis, Rajeev Rastogi, Krishan Sabnani

Method for skew resistant join size estimation

Patent number: 5721896

Abstract: A method of estimating the query size of two databases T and R is disclosed. The method uses a threshold value to categorize the databases as dense or sparse. A dense-dense procedure is then applied to the two databases to produce a dense-dense estimate (A_d). A sparse-any procedure that suppresses the dense data items coming from database T is performed which produces a first sparse-any estimate (A_s1). A second sparse-any estimate (A_s2) is then produced by suppressing the dense data items from database R. Ultimately a query size estimate is produced by combining the dense-dense estimate, the first sparse-any estimate and the second sparse-any estimate.

Type: Grant

Filed: May 13, 1996

Date of Patent: February 24, 1998

Assignee: Lucent Technologies Inc.

Inventors: Sumit Ganguly, Phillip B. Gibbons, Yossi Matias, Abraham Silberschatz