Patents by Inventor Sumit Ganguly

Sumit Ganguly has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 7873689
    Abstract: A method and system for answering set-expression cardinality queries while lowering data communication costs by utilizing a coordinator site to provide global knowledge of the distribution of certain frequently occurring stream elements to significantly reduce the transmission of element state information to the central site and, optionally, capturing the semantics of the input set expression in a Boolean logic formula and using models of the formula to determine whether an element state change at a remote site can affect the set expression result.
    Type: Grant
    Filed: December 30, 2004
    Date of Patent: January 18, 2011
    Assignee: Alcatel-Lucent USA Inc.
    Inventors: Abhinandan Sujit Das, Sumit Ganguly, Minos N. Garofalakis, Rajeev Rastogi
  • Patent number: 7668856
    Abstract: The invention provides methods and systems for summarizing multiple continuous update streams such that an approximate answer to a query over one or more of the continuous update streams (such as a query requiring a join operation followed by a duplicate elimination step) may be rapidly provided. The systems and methods use multiple (parallel) Join Distinct (JD) Sketch data structures corresponding to hash buckets of at least one initial attribute.
    Type: Grant
    Filed: September 30, 2004
    Date of Patent: February 23, 2010
    Assignee: Alcatel-Lucent USA Inc.
    Inventors: Sumit Ganguly, Minos N. Garofalakis, Amit Kumar, Rajeev Rastogi
  • Patent number: 7669241
    Abstract: A distinct-count estimate is obtained in a guaranteed small footprint using a two level hash, distinct count sketch. A first hash fills the first-level hash buckets with an exponentially decreasing number of data-elements. These are then uniformly hashed to an array of second-level-hash tables, and have an associated total-element counter and bit-location counters. These counters are used to identify singletons and so provide a distinct-sample and a distinct-count. An estimate of the total distinct-count is obtained by dividing the distinct-count by the probability of mapping a data-element to that bucket. An estimate of the total distinct-source frequencies of destination address can be found in a similar fashion. By further associating the distinct-count sketch with a list of singletons, a total singleton count and a heap containing the destination addresses ordered by their distinct-source frequencies, a tracking distinct-count sketch may be formed that has considerably improved query time.
    Type: Grant
    Filed: September 30, 2004
    Date of Patent: February 23, 2010
    Assignee: Alcatel-Lucent USA Inc.
    Inventors: Sumit Ganguly, Minos Garofalakis, Rajeev Rastogi, Krishan Sabnani
  • Patent number: 7596544
    Abstract: A method of estimating set-expression cardinalities over data streams with guaranteed small maintenance time per data-element update. The method only examines each data element once and uses a limited amount of memory. The time-efficient stream synopsis extends 2-level hash-sketches by randomly, but uniformly, pre-hashing data-elements prior to logarithmically hashing them to a first-level hash-table. This generates a set of independent 2-level hash-sketches. The set-union cardinality can be estimated by determining the smallest hash-bucket index j at which only a predetermined fraction of the b hash-buckets has a non-empty union |A ∪ B|. Once a set-union cardinality is estimated, general set-expression cardinalities may be estimated by counting witness elements for the set-expression, i.e., those first-level hash-buckets that are both a singleton for the set-expression and a set-union singleton.
    Type: Grant
    Filed: December 29, 2004
    Date of Patent: September 29, 2009
    Assignee: Alcatel-Lucent USA Inc.
    Inventors: Sumit Ganguly, Minos Garofalakis, Rajeev Rastogi
  • Patent number: 7483907
    Abstract: A method of estimating an aggregate of a join over data-streams in real-time using skimmed sketches, that only examines each data element once and has a worst case space requirement of O(n^2/J), where J is the size of the join and n is the number of data elements. The skimmed sketch is an atomic sketch, formed as the inner product of the data-stream frequency vector and a random binary variable, from which the frequency values that exceed a predetermined threshold have been skimmed off and placed in a dense frequency vector. The join size is estimated as the sum of the sub-joins of skimmed sketches and dense frequency vectors. The atomic sketches may be arranged in a hash structure so that processing a data element only requires updating a single sketch per hash table. This keeps the per-element overhead logarithmic in the domain and stream sizes.
    Type: Grant
    Filed: December 29, 2004
    Date of Patent: January 27, 2009
    Assignee: Alcatel-Lucent USA Inc.
    Inventors: Sumit Ganguly, Minos Garofalakis, Rajeev Rastogi
  • Publication number: 20060149744
    Abstract: A method and system for answering set-expression cardinality queries while lowering data communication costs by utilizing a coordinator site to provide global knowledge of the distribution of certain frequently occurring stream elements to significantly reduce the transmission of element state information to the central site and, optionally, capturing the semantics of the input set expression in a Boolean logic formula and using models of the formula to determine whether an element state change at a remote site can affect the set expression result.
    Type: Application
    Filed: December 30, 2004
    Publication date: July 6, 2006
    Inventors: Abhinandan Das, Sumit Ganguly, Minos Garofalakis, Rajeev Rastogi
  • Publication number: 20060143218
    Abstract: A method of estimating set-expression cardinalities over data streams with guaranteed small maintenance time per data-element update. The method only examines each data element once and uses a limited amount of memory. The time-efficient stream synopsis extends 2-level hash-sketches by randomly, but uniformly, pre-hashing data-elements prior to logarithmically hashing them to a first-level hash-table. This generates a set of independent 2-level hash-sketches. The set-union cardinality can be estimated by determining the smallest hash-bucket index j at which only a predetermined fraction of the b hash-buckets has a non-empty union |A ∪ B|. Once a set-union cardinality is estimated, general set-expression cardinalities may be estimated by counting witness elements for the set-expression, i.e., those first-level hash-buckets that are both a singleton for the set-expression and a set-union singleton.
    Type: Application
    Filed: December 29, 2004
    Publication date: June 29, 2006
    Applicant: Lucent Technologies, Inc.
    Inventors: Sumit Ganguly, Minos Garofalakis, Rajeev Rastogi
  • Publication number: 20060143170
    Abstract: A method of estimating an aggregate of a join over data-streams in real-time using skimmed sketches, that only examines each data element once and has a worst case space requirement of O(n^2/J), where J is the size of the join and n is the number of data elements. The skimmed sketch is an atomic sketch, formed as the inner product of the data-stream frequency vector and a random binary variable, from which the frequency values that exceed a predetermined threshold have been skimmed off and placed in a dense frequency vector. The join size is estimated as the sum of the sub-joins of skimmed sketches and dense frequency vectors. The atomic sketches may be arranged in a hash structure so that processing a data element only requires updating a single sketch per hash table. This keeps the per-element overhead logarithmic in the domain and stream sizes.
    Type: Application
    Filed: December 29, 2004
    Publication date: June 29, 2006
    Applicant: Lucent Technologies, Inc.
    Inventors: Sumit Ganguly, Minos Garofalakis, Rajeev Rastogi
  • Publication number: 20060085592
    Abstract: The invention provides methods and systems for summarizing multiple continuous update streams using corresponding multiple (parallel) JD Sketch data structures such that, for example, an approximate answer to a query requiring a join operation followed by a duplicate elimination step may be rapidly provided.
    Type: Application
    Filed: September 30, 2004
    Publication date: April 20, 2006
    Inventors: Sumit Ganguly, Minos Garofalakis, Amit Kumar, Rajeev Rastogi
  • Publication number: 20060075489
    Abstract: A distinct-count estimate is obtained in a guaranteed small footprint using a two level hash, distinct count sketch. A first hash fills the first-level hash buckets with an exponentially decreasing number of data-elements. These are then uniformly hashed to an array of second-level-hash tables, and have an associated total-element counter and bit-location counters. These counters are used to identify singletons and so provide a distinct-sample and a distinct-count. An estimate of the total distinct-count is obtained by dividing the distinct-count by the probability of mapping a data-element to that bucket. An estimate of the total distinct-source frequencies of destination address can be found in a similar fashion. By further associating the distinct-count sketch with a list of singletons, a total singleton count and a heap containing the destination addresses ordered by their distinct-source frequencies, a tracking distinct-count sketch may be formed that has considerably improved query time.
    Type: Application
    Filed: September 30, 2004
    Publication date: April 6, 2006
    Applicant: Lucent Technologies, Inc.
    Inventors: Sumit Ganguly, Minos Garofalakis, Rajeev Rastogi, Krishan Sabnani
  • Patent number: 5721896
    Abstract: A method of estimating the query size of two databases T and R is disclosed. The method uses a threshold value to categorize the databases as dense or sparse. A dense-dense procedure is then applied to the two databases to produce a dense-dense estimate (A_d). A sparse-any procedure that suppresses the dense data items coming from database T is performed which produces a first sparse-any estimate (A_s1). A second sparse-any estimate (A_s2) is then produced by suppressing the dense data items from database R. Ultimately a query size estimate is produced by combining the dense-dense estimate, the first sparse-any estimate and the second sparse-any estimate.
    Type: Grant
    Filed: May 13, 1996
    Date of Patent: February 24, 1998
    Assignee: Lucent Technologies Inc.
    Inventors: Sumit Ganguly, Phillip B. Gibbons, Yossi Matias, Abraham Silberschatz
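
Illustrative sketch for patent 5721896 (not taken from the patent text): the abstract combines a dense-dense estimate with two sparse-any estimates. The Python below only shows why three such parts can add up to the full join size. It counts frequencies exactly rather than estimating them, and the function name `join_size_parts`, the frequency-count reading of "dense", and the rule used to keep the two sparse-any parts from overlapping are all assumptions made for illustration.

```python
from collections import Counter


def join_size_parts(t_values, r_values, threshold):
    """Split the exact equi-join size of T and R into three disjoint parts.

    A value is treated as dense in a relation when it occurs at least
    `threshold` times there (an assumed reading of the abstract).
    """
    f_t = Counter(t_values)
    f_r = Counter(r_values)
    dense_t = {v for v, c in f_t.items() if c >= threshold}
    dense_r = {v for v, c in f_r.items() if c >= threshold}

    # Dense-dense part: values that are dense in both T and R.
    a_d = sum(f_t[v] * f_r[v] for v in dense_t & dense_r)

    # First sparse-any part: dense values of T are suppressed, so only
    # values sparse in T contribute, whatever their frequency in R.
    a_s1 = sum(c * f_r[v] for v, c in f_t.items() if v not in dense_t)

    # Second sparse-any part: dense values of R are suppressed. Values
    # already covered by a_s1 are skipped (assumption) to avoid double counting.
    a_s2 = sum(c * f_t[v] for v, c in f_r.items()
               if v not in dense_r and v in dense_t)

    return a_d, a_s1, a_s2


T = ["a"] * 5 + ["b"] * 2 + ["c"] + ["e"] * 4
R = ["a"] * 4 + ["b"] * 6 + ["c"] + ["d"] * 3 + ["e"]
a_d, a_s1, a_s2 = join_size_parts(T, R, threshold=3)
print(a_d, a_s1, a_s2, a_d + a_s1 + a_s2)
```

In a sampling-based setting each of the three parts would be replaced by an estimate. The point of the sketch is only that the parts are disjoint and together cover every joining value.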