Query Cost Estimation Patents (Class 707/719)
  • Publication number: 20120078880
    Abstract: Systems, methods and articles of manufacture for accelerating database queries containing bitmap-based conditions are described herein. An embodiment includes determining a bitmap, where the bitmap represents a set of rows that have satisfied one or more conjunctive conditions which preceded a conjunct that is a disjunction in a query expression and restricting evaluation of a disjunct within the disjunction to the set of rows represented by the bitmap. Another embodiment includes determining a satisfaction bitmap, where the bitmap represents the result of one or more preceding disjuncts in a disjunction within a query expression and restricting scope of evaluation of a disjunct to a set of rows that are not within the determined satisfaction bitmap. In this way, embodiments of the present invention enable the acceleration of queries containing disjunctions of conditions on a database table, as well as reduce the temporary resources consumed for such queries.
    Type: Application
    Filed: September 28, 2010
    Publication date: March 29, 2012
    Applicant: Sybase, Inc.
    Inventors: Steven A. Kirk, David E. Walrath
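
The two bitmap ideas in the abstract above can be sketched together. This is a minimal illustration, not the applicant's implementation: the in-memory row list, the use of a Python integer as a bitset, and the name `eval_disjunction` are all assumptions made for the example.

```python
def eval_disjunction(rows, candidate_bitmap, disjuncts):
    """Evaluate (d1 OR d2 OR ...) only over rows flagged in candidate_bitmap.

    rows: list of dict records (hypothetical in-memory table).
    candidate_bitmap: int used as a bitset; bit i set means row i passed
        the conjunctive conditions that preceded the disjunction.
    disjuncts: list of predicates, each taking a row and returning bool.
    Returns (satisfied_bitmap, evaluations_performed).
    """
    satisfied = 0
    evaluations = 0
    for pred in disjuncts:
        # Rows still worth checking: candidates that no earlier
        # disjunct has already satisfied.
        remaining = candidate_bitmap & ~satisfied
        i = 0
        while remaining:
            if remaining & 1:
                evaluations += 1
                if pred(rows[i]):
                    satisfied |= 1 << i
            remaining >>= 1
            i += 1
    return satisfied, evaluations


rows = [
    {"a": 1, "b": 5},
    {"a": 2, "b": 7},
    {"a": 3, "b": 5},
    {"a": 4, "b": 9},
]
# Rows 0, 1 and 3 survived the preceding conjuncts.
result = eval_disjunction(
    rows, 0b1011, [lambda r: r["a"] == 1, lambda r: r["b"] == 7]
)
```

Row 2 is never examined because it is outside the candidate bitmap, and row 0 is skipped by the second disjunct because the first one already satisfied it.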
  • Patent number: 8145621
    Abstract: A system, method, and computer program product are provided for generating a graphical representation of a query optimization process. The method comprises the steps of parsing a search space log, presenting one or more evaluated access plans on an axis of a timeline, identifying a best access plan on the timeline, and outputting a graphical representation of the timeline. An additional system, method, and computer program product are provided for recording a query optimization process of a query optimizer.
    Type: Grant
    Filed: December 19, 2008
    Date of Patent: March 27, 2012
    Assignee: iAnywhere Solutions, Inc.
    Inventors: Anisoara Nica, Daniel Scott Brotherston, David William Hillis
  • Patent number: 8145626
    Abstract: In one embodiment, the present invention includes a method comprising receiving a data filter for filtering a collection of data, wherein the collection of data is configured as a star schema including a fact table and dimension tables. The data filter is applied against the dimension tables to generate modified dimension tables. The modified dimension tables are applied against the fact table to produce a modified fact table. The data filter is then applied against the modified fact table to generate a second modified fact table, which is the output of the process.
    Type: Grant
    Filed: December 31, 2008
    Date of Patent: March 27, 2012
    Assignee: SAP AG
    Inventors: Peter John, Thomas Zurek
  • Publication number: 20120072414
    Abstract: Querying data stores in a federation of data stores. A first search filter is accessed. The first search filter is constructed with one or more nested logical AND, OR, or NOT operands. The first search filter is normalized to a normalized search filter that is logically equivalent to the first search filter. The normalized search filter includes 3 or 4 levels. All first level operands are logically ORed. All second level operands are logically ANDed. All third level operands are at least one of parameters or logical NOTs. Any fourth level operands are parameters. The normalized search filter is used to search a plurality of data stores in a federation of data stores for information by searching different data stores for at least two or more of the top level operands.
    Type: Application
    Filed: September 16, 2010
    Publication date: March 22, 2012
    Applicant: Microsoft Corporation
    Inventor: Xin He
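
The normalized form described above (top-level ORs over ANDs over parameters or NOTs of parameters) is a disjunctive normal form. The sketch below shows one conventional way to reach it; the tuple-based filter representation and the function names are assumptions for illustration, and the abstract does not say the application uses this procedure.

```python
from itertools import product


def push_not(f, negate=False):
    """Push NOT down to the parameters using De Morgan's laws."""
    if isinstance(f, str):
        return ("not", f) if negate else f
    op, *args = f
    if op == "not":
        return push_not(args[0], not negate)
    if negate:
        op = "or" if op == "and" else "and"
    return (op, *[push_not(a, negate) for a in args])


def to_dnf(f):
    """Return a list of AND-groups; the groups are implicitly ORed."""
    if isinstance(f, str) or f[0] == "not":
        return [[f]]
    op, *args = f
    parts = [to_dnf(a) for a in args]
    if op == "or":
        return [group for part in parts for group in part]
    # op == "and": pick one group from each operand and merge them.
    return [sum(combo, []) for combo in product(*parts)]


def normalize(f):
    return to_dnf(push_not(f))


# a AND (b OR NOT (c AND d))
groups = normalize(("and", "a", ("or", "b", ("not", ("and", "c", "d")))))
```

Each inner list is one top-level operand, so a federated search could hand different inner lists to different data stores and union the results.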
  • Patent number: 8140518
    Abstract: The present invention provides a method and system for optimizing search result rankings through use of a game interface. The method and system includes providing a game interface to at least two users, the game interface comprising at least one search query and at least two search result sets. The method and system further includes detecting the selection of one of the two search result sets by the users based on competition criteria and updating ranking data in response to the selection of one of the two search results. The method and system further includes selecting ranking data associated with a given query, determining an optimum ranking based on aggregating the selected ranking data, and storing the optimum ranking.
    Type: Grant
    Filed: January 8, 2010
    Date of Patent: March 20, 2012
    Assignee: Yahoo! Inc.
    Inventors: Ali Dasdan, Santanu Kolay, Chris Drome
  • Patent number: 8135702
    Abstract: A method and system for eliminating unnecessary statistics collections for query optimization in a database stored on a computer. Statistics are unnecessary when a re-generated query execution plan that does not use the statistics is equivalent to an original query execution plan that uses the statistics. To determine this, an original query execution plan is created for each query in a specified workload using the statistics in the database. A search is performed of the statistics in order to enumerate one or more candidate sets of statistics to be eliminated. One or more of the candidate sets of statistics are removed from consideration prior to creating the re-generated query execution plan for each query in the specified workload.
    Type: Grant
    Filed: October 27, 2008
    Date of Patent: March 13, 2012
    Assignee: Teradata US, Inc.
    Inventors: Louis M. Burger, Frank Roderic Vandervort
  • Patent number: 8135703
    Abstract: An apparatus and method for a multi-partition query governor in a partitioned computer database system. In preferred embodiments a query governor uses data of a query governor file that is associated with multiple partitions to determine how the query governor manages access to the database across multiple partitions. Also, in preferred embodiments, the query governor in a local partition that receives a query request communicates with a query governor in a target partition to accumulate the total resource demands of the query on the local and target partitions. In preferred embodiments, a query governor estimates whether resources to execute a query will exceed a threshold over all or a combination of database partitions.
    Type: Grant
    Filed: September 28, 2010
    Date of Patent: March 13, 2012
    Assignee: International Business Machines Corporation
    Inventors: Eric Lawrence Barsness, Robert Joseph Bestgen, John Matthew Santosuosso
  • Publication number: 20120054175
    Abstract: Techniques are described for managing query execution by estimating and monitoring query execution time. Embodiments of the invention may generally receive a query to be executed and calculate an initial estimated execution time for the received query. If the initial estimated execution time does not exceed a threshold amount of time, embodiments of the invention may submit the query for execution. Once execution of the query has begun, embodiments of the invention may calculate an updated estimated execution time for the executing query, and if the updated estimated execution time exceeds the threshold amount of time, may halt the execution of the query.
    Type: Application
    Filed: August 30, 2010
    Publication date: March 1, 2012
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Eric L. Barsness, John M. Santosuosso
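
The estimate-then-monitor flow in the abstract above might look roughly like the following. The simulated per-unit costs and the extrapolation rule (average observed cost per unit times total units) are assumptions chosen to keep the example deterministic; the application does not specify how the updated estimate is calculated.

```python
def governed_run(initial_estimate, unit_costs, threshold):
    """Simulate a query governed by estimated execution time.

    unit_costs: simulated time taken by each unit of work (hypothetical).
    Returns (status, units_completed).
    """
    if initial_estimate > threshold:
        return ("rejected", 0)
    elapsed = 0.0
    total = len(unit_costs)
    for done, cost in enumerate(unit_costs, start=1):
        elapsed += cost
        # Updated estimate: extrapolate the observed average cost per unit.
        updated = elapsed / done * total
        if updated > threshold:
            return ("halted", done)
    return ("completed", total)


rejected = governed_run(12, [1, 1, 1], threshold=10)
completed = governed_run(5, [1, 1, 1, 1], threshold=10)
halted = governed_run(5, [1, 1, 8, 1], threshold=10)
```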
  • Patent number: 8122150
    Abstract: A system, method, and computer readable medium for optimizing throughput of a stream processing system are disclosed. The method comprises analyzing a set of input streams and creating, based on the analyzing, an input profile for at least one input stream in the set of input streams. The input profile comprises at least a set of processing requirements associated with the input stream. The method also comprises generating a search space, based on an initial configuration, comprising a plurality of configurations associated with the input stream. A configuration in the plurality of configurations is identified that increases throughput more than the other configurations in the plurality of configurations based on at least one of the input profile and system resources.
    Type: Grant
    Filed: February 16, 2009
    Date of Patent: February 21, 2012
    Assignee: International Business Machines Corporation
    Inventors: Christian A. Lang, George Andrei Mihaila, Themis Palpanas, Ioana Stanoi
  • Patent number: 8122010
    Abstract: Methods, systems, and computer program products for dynamically adjusting computer resources, as appropriate, in response to predictions of query runtimes as well as for rendering costs of the computer resources actually utilized, which costs are consistent with consumer demands.
    Type: Grant
    Filed: October 31, 2007
    Date of Patent: February 21, 2012
    Assignee: International Business Machines Corporation
    Inventors: Eric Lawrence Barsness, Mahdad Majd, Randy William Ruhlow, John Matthew Santosuosso
  • Publication number: 20120036120
    Abstract: Searching stored content is disclosed. A first mapping is created from an object to one or more stored relational database tables. A second mapping is created from the object to an indexer schema. One or both of the following is done: 1) using the first mapping to translate a search request expressed in an abstract query language to a first query language associated with the relational database; and 2) using the second mapping to translate the search request to a second query language associated with the indexer schema.
    Type: Application
    Filed: August 23, 2011
    Publication date: February 9, 2012
    Applicant: EMC CORPORATION
    Inventors: Marc Brette, Frédéric Ciminera, Bruno Marquié
  • Patent number: 8112414
    Abstract: Disclosed is an apparatus and system for reducing locking in materialized query tables (MQT) for distributive functions. The apparatus includes an insert module that inserts into an MQT table a child record when a new record is inserted into a base table associated with the MQT. The child record includes values associated with the insert operation. Also included is a delete module that inserts into the MQT a child record that includes measure values that are the negative of the measure values in the base table row that is the subject of the delete operation. An update module inserts two child rows into the MQT, one negating the affected record and the other adding the values of the update operation. Each inserted child row includes a unique identifier that relates the inserted row to a parent row. An execution module generates responses using the values indicated by the cumulative records in a family.
    Type: Grant
    Filed: August 28, 2008
    Date of Patent: February 7, 2012
    Assignee: International Business Machines Corporation
    Inventors: James P. Bates, Jonathan Sloan, Calisto P. Zuzarte
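
The child-record scheme above works because a distributive function such as SUM can be recomputed from signed deltas. A minimal sketch, assuming a single numeric measure and a hypothetical `AppendOnlySummary` class (neither is taken from the patent):

```python
class AppendOnlySummary:
    """Summary table that only ever appends child rows (no in-place update)."""

    def __init__(self):
        self.child_rows = []  # (parent_key, signed measure value)

    def on_insert(self, key, value):
        self.child_rows.append((key, value))

    def on_delete(self, key, value):
        # Negate the measure of the deleted base-table row.
        self.child_rows.append((key, -value))

    def on_update(self, key, old_value, new_value):
        # Two child rows: one negating the old value, one adding the new.
        self.child_rows.append((key, -old_value))
        self.child_rows.append((key, new_value))

    def total(self, key):
        # Answer queries from the cumulative records of the family.
        return sum(v for k, v in self.child_rows if k == key)


mqt = AppendOnlySummary()
mqt.on_insert("east", 10)
mqt.on_insert("east", 5)
mqt.on_update("east", 5, 7)
mqt.on_delete("east", 10)
```

Because every change is an append, writers never contend for a lock on the parent aggregate row.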
  • Patent number: 8112415
    Abstract: Two methods and computer-readable medium for obtaining information using field group unpacking functions. The first method obtains information using field group unpacking functions by identifying an optimized unpacking function from field group unpacking functions, and an optimized unpacking function is used to unpack a field associated with the data stream. The second method obtains information using field group unpacking functions by identifying an optimized unpacking function from the field group unpacking functions. Then, a prefilter is applied and associated with the optimized unpacking functions and used to unpack a field associated with the data stream. The computer-readable medium obtains field group unpacking functions for execution by a computing device using field group unpacking functions that identify an optimized unpacking function from the field group unpacking functions, and use an optimized unpacking function to unpack a field associated with the data stream.
    Type: Grant
    Filed: June 4, 2008
    Date of Patent: February 7, 2012
    Assignee: AT&T Intellectual Property I, LP
    Inventors: Theodore Johnson, Lukasz Golab, Oliver Spatscheck
  • Publication number: 20120030193
    Abstract: A method and system for connecting users via a matrix/mesh/grid/graph of intermediate relationship routing (FIG. 3, element 50), comprising creating a collective business incentive awareness through a multi-layered (per service) charge system defined by the users themselves, while utilizing and harnessing the power of P2P (Peer-to-Peer)/Client-Server, collaboration tools, and 3D/2D GUI (3D bubble/bird's-eye view) navigation interfaces for effectively presenting a large volume of information to users. The system allows users (100, 110) to focus on specific groups or individuals in a secure, private and anonymous environment.
    Type: Application
    Filed: April 14, 2005
    Publication date: February 2, 2012
    Inventors: Sagi Richberg, Sergey Gribov
  • Patent number: 8108382
    Abstract: Optimizing the execution of a query in a multi-database system includes identifying a region within a table, the table being referenced in the query. The region is stored on a data-storage device on a first of the system databases in the multi-database system. The region is stored on a data-storage device on a second of the system databases in the multi-database system, the second system database being a different system database than the first system database. A first access plan for the query is developed, the first access plan comprising accessing the version of the region stored on the first system database. A second access plan for the query is developed, the second access plan comprising accessing the version of the region stored on the second system database. A selection is made between the first access plan and the second access plan to execute the query. The query is executed using the selected access plan to produce a result. The result is stored.
    Type: Grant
    Filed: December 29, 2008
    Date of Patent: January 31, 2012
    Assignee: Teradata US, Inc.
    Inventors: Douglas Brown, John Mark Morris
  • Patent number: 8099411
    Abstract: A system, method, and computer-readable medium that facilitate workload management in a computer system are provided. A workload's system resource consumption is adjusted against a target consumption level thereby facilitating maintenance of the consumption to the target consumption within an averaging interval by dynamically controlling workload concurrency levels. System resource consumption is compensated during periods of over or under-consumption by adjusting workload consumption to a larger averaging interval. Further, mechanisms for limiting, or banding, dynamic concurrency adjustments to disallow workload starvation or unconstrained usage at any time are provided.
    Type: Grant
    Filed: December 15, 2008
    Date of Patent: January 17, 2012
    Assignee: Teradata US, Inc.
    Inventors: Anita Richards, Douglas Brown
  • Publication number: 20120005188
    Abstract: Techniques for automatically recommending parallel execution of a SQL statement. In one set of embodiments, a first determination can be made regarding whether a SQL statement can be executed in parallel. Further, a second determination can be made regarding whether executing the SQL statement in parallel is faster than executing the statement in serial by a predetermined factor. If the first determination and second determination are positive (i.e., the statement can be executed in parallel and parallel execution is faster by the predetermined factor), a recommendation can be provided indicating that the SQL statement should be executed in parallel. In some embodiments, the recommendation can include a report specifying the degree of performance improvement gained from parallel execution, additional system resources consumed by parallel execution, and other statistics pertaining to the recommended parallel execution plan.
    Type: Application
    Filed: June 30, 2010
    Publication date: January 5, 2012
    Applicant: Oracle International Corporation
    Inventors: Hailing Yu, Peter Belknap, Thierry Cruanes, Benoit Dageville, Karl Dias, Khaled Yagoub
  • Publication number: 20110320436
    Abstract: When each file of a number of files is accessed, at least a number of times each file has been accessed is kept track of. Each file is stored on a storage of a number of storages. Periodically, at least one file is moved among the number of storages, based at least on the number of times each file has been accessed. As such, the at least one file is moved from being stored on a first storage to being stored on a second storage, to optimize subsequent access time of the at least one file. The storages are physically distinct storage devices. At least one of the storage devices has different storage characteristics as compared to one or more other of the storage devices.
    Type: Application
    Filed: March 10, 2009
    Publication date: December 29, 2011
    Inventor: Mark K Hokanson
  • Publication number: 20110307472
    Abstract: In a method for storing data in a relational database system using a processor, a collection of values is assigned to a structure dictionary, wherein each of the values represents the value of a row for an attribute and has a unique ordinal number within the collection, and wherein the structure dictionary contains structures defined based on at least one of interaction with a user of the system via an interface, automatic detection of structures occurring in data, and predetermined information about structures relevant to data content that is stored in the system. For each structure in the structure dictionary, a structure match list is formed from ordinal numbers of values matching the structure, and a structure sub-collection from values matching the structure, using the processor.
    Type: Application
    Filed: June 14, 2011
    Publication date: December 15, 2011
    Applicant: INFOBRIGHT, INC.
    Inventors: Dominik Slezak, Graham Toppin, Marcin Kowalski, Arkadiusz Wojna
  • Patent number: 8078611
    Abstract: Systems, methods, and other embodiments associated with providing query modes for translation-enabled XML documents are described. One method embodiment includes receiving an XPath query to an XML document that may store a translation for a data element. The method embodiment may also include automatically selecting a query mode for the XPath query. The method embodiment may also include querying the XML document using the XPath query and the selected query mode. The query mode may control, at least in part, the operation of an XML database logic.
    Type: Grant
    Filed: January 3, 2007
    Date of Patent: December 13, 2011
    Assignee: Oracle International Corporation
    Inventors: Nipun Agarwal, Sanket Malde, Bhushan Khaladkar, Ravi Murthy, Sivasankaran Chandrasekar
  • Publication number: 20110295840
    Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for determining a generalized edit distance for queries. In one aspect, a method includes selecting query pairs of consecutive queries, each query pair being a first query and a second query consecutively submitted as separate queries, each first and second query including at least one term. For each query pair, the method includes selecting term pairs from the query pair, each term pair being a first term in the first query and a second term in the second query; and determining a co-occurrence value for each term pair. The method also includes determining transition costs based on the co-occurrence values for term pairs, each transition cost indicative of a cost of transitioning from a first term in a first query to a second term in a second query consecutive to the first query.
    Type: Application
    Filed: May 18, 2011
    Publication date: December 1, 2011
    Applicant: GOOGLE INC.
    Inventors: Massimiliano Ciaramita, Amac Herdagdelen, Daniel Mahler
  • Publication number: 20110295839
    Abstract: In a method, system, and computer-readable medium having instructions for optimizing a query in a database system, a database statistic is generated for a number of related records for one or more entities for at least one tenant and a related record is a record with a relationship to a shared record in a database table for an entity from the one or more entities, a first cost is calculated for accessing the number of related records for at least one tenant, a second cost is calculated for accessing a number of related records accessible to a user, a comparison of the first cost to the second cost is performed to determine a data access path for retrieving accessible related records, and the data access path for retrieving accessible related records is determined based upon the comparison.
    Type: Application
    Filed: September 17, 2010
    Publication date: December 1, 2011
    Applicant: SALESFORCE.COM, INC.
    Inventors: Jesse Collins, Jaikumar Bathija
  • Patent number: 8065297
    Abstract: One exemplary method embodiment pre-processes customer requests that are maintained in a dataset to create a matrix between products and the customer requests. Each of the customer requests comprises at least a customer identification, a textual request, and a product identification related to the textual request. After such pre-processing of the dataset, the method can respond to queries of the dataset using the matrix.
    Type: Grant
    Filed: June 9, 2008
    Date of Patent: November 22, 2011
    Assignee: Xerox Corporation
    Inventors: Wei Peng, Tong Sun, Shriram Revankar
  • Publication number: 20110282864
    Abstract: In accordance with embodiments, there are provided mechanisms and methods for query optimization in a database system. These mechanisms and methods for query optimization in a database system can enable embodiments to optimize OR expression filters referencing different logical tables. The ability of embodiments to optimize OR expression filters referencing different logical tables can enable optimization that is dynamic and specific to the particular tenant for whom the query is run and improve the performance and efficiency of the database system in response to query requests.
    Type: Application
    Filed: January 26, 2011
    Publication date: November 17, 2011
    Applicant: Salesforce.com Inc.
    Inventors: Jesse Collins, Jaikumar Bathija
  • Patent number: 8055693
    Abstract: A set of words is converted to a corresponding set of particles, wherein the words and the particles are unique within each set. For each word, all possible partitionings of the word into particles are determined, and a cost is determined for each possible partitioning. The particles of the possible partitioning associated with a minimal cost are added to the set of particles.
    Type: Grant
    Filed: June 30, 2009
    Date of Patent: November 8, 2011
    Assignee: Mitsubishi Electric Research Laboratories, Inc.
    Inventors: Tony Ezzat, Evandro Gouvea
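
The enumerate-and-pick-cheapest step above can be sketched as follows. The patent abstract does not define the cost, so the cost function here (characters of new distinct particles plus the number of particles) is purely an assumption for illustration, as are the function names.

```python
def partitionings(word):
    """Yield every way to split word into contiguous non-empty particles."""
    if not word:
        yield []
        return
    for i in range(1, len(word) + 1):
        for rest in partitionings(word[i:]):
            yield [word[:i]] + rest


def example_cost(partitioning, particles):
    # Hypothetical cost: characters of distinct particles not yet in the
    # set, plus one per particle used.
    new_chars = sum(len(p) for p in set(partitioning) if p not in particles)
    return new_chars + len(partitioning)


def build_particle_set(words, cost=example_cost):
    particles = set()
    for word in words:
        best = min(partitionings(word), key=lambda p: cost(p, particles))
        particles.update(best)
    return particles


particle_set = build_particle_set(["abab", "ab"])
```

A word of length n has 2^(n-1) partitionings, so exhaustive enumeration as written is only practical for short words.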
  • Publication number: 20110270822
    Abstract: A query governor intelligently sets tailored thresholds for a query accessing a computer database. The query governor preferably generates a tailored threshold for each query sent to the database for execution. The tailored threshold for the query is preferably compared to an estimated query execution time to determine whether to execute the query. The query governor uses one or more factors applied to a standard threshold to generate the tailored threshold. The factors preferably include user factors and query factors. These factors are dynamically adjusted by the query governor in an intelligent way to increase optimal use of the database. Other factors may include factors such as job priority factor, resource factor and an application factor.
    Type: Application
    Filed: April 30, 2010
    Publication date: November 3, 2011
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: James L. Denton, Brian R. Muras
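
One plausible reading of "factors applied to a standard threshold" is multiplication. That reading, the factor names, and the numbers below are assumptions for illustration only.

```python
def tailored_threshold(standard_threshold, factors):
    """Scale the standard threshold by each factor (assumed multiplicative)."""
    threshold = standard_threshold
    for value in factors.values():
        threshold *= value
    return threshold


def should_execute(estimated_time, standard_threshold, factors):
    return estimated_time <= tailored_threshold(standard_threshold, factors)


factors = {"user": 1.5, "query": 0.5}  # hypothetical factor values
limit = tailored_threshold(60.0, factors)
```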
  • Patent number: 8051070
    Abstract: An acquiring unit acquires from each of a plurality of database servers processing capacity information for a query received from a client. A generating unit generates a first code indicating a first processing and a second code indicating a second processing for the query. A first transmitting unit transmits the first code to the database servers. An executing unit executes the second processing by using first result data of XML data acquired by executing the first processing from the database servers. A second transmitting unit transmits second result data of XML data acquired by executing the second processing to the client.
    Type: Grant
    Filed: September 18, 2008
    Date of Patent: November 1, 2011
    Assignee: Kabushiki Kaisha Toshiba
    Inventor: Masakazu Hattori
  • Patent number: 8051058
    Abstract: A system for estimating cardinalities for a plurality of columns in a database system is disclosed. The system includes obtaining statistics collected for the plurality of columns. A first portion of the statistics indicates at least one relationship between at least a portion of the plurality of columns, while a second portion of the statistics includes single column statistics. The system also includes utilizing the first portion and the second portion of the statistics to estimate the cardinality for the plurality of columns.
    Type: Grant
    Filed: September 12, 2008
    Date of Patent: November 1, 2011
    Assignee: International Business Machines Corporation
    Inventors: Vincent Corvinelli, Yuri Deigin, John F Hornibrook, Tam Minh Dai Tran
  • Patent number: 8051069
    Abstract: A method and system are disclosed for operating a high speed data stream management system which runs a query plan including a set of queries on a data feed in the form of a stream of tuples. A predicate prefilter is placed outside the query plan upstream of the set of queries, and includes predicates selected from those used by the queries. Predicates are selected for inclusion in the prefilter based on a cost heuristic, and predicates are combined into composites using a rectangle mapping heuristic. The prefilter evaluates the presence of individual and composite predicates in the tuples and returns a bit vector for each tuple with bits representing the presence or absence of predicates in the tuple. A bit signature is assigned to each query to represent the predicates related to that query, and a query is invoked when the tuple bit vector and the query bit signature are compatible.
    Type: Grant
    Filed: January 2, 2008
    Date of Patent: November 1, 2011
    Assignee: AT&T Intellectual Property I, LP
    Inventors: Theodore Johnson, Lukasz Golab, Oliver Spatscheck
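
The bit-vector matching in the abstract above can be illustrated as follows. The sketch assumes that "compatible" means every predicate bit in the query's signature is set in the tuple's bit vector; the patent may define compatibility differently, and the cost and rectangle-mapping heuristics are not modeled. All names and sample predicates are hypothetical.

```python
def tuple_bit_vector(tup, predicates):
    """Evaluate each prefilter predicate once and pack the results into bits."""
    bits = 0
    for i, pred in enumerate(predicates):
        if pred(tup):
            bits |= 1 << i
    return bits


def queries_to_invoke(tup, predicates, signatures):
    """Return names of queries whose signature bits are all set for tup."""
    bits = tuple_bit_vector(tup, predicates)
    return [name for name, sig in signatures.items() if (bits & sig) == sig]


predicates = [
    lambda t: t["port"] == 80,      # bit 0
    lambda t: t["proto"] == "tcp",  # bit 1
    lambda t: t["size"] > 1000,     # bit 2
]
signatures = {"web": 0b011, "big": 0b100, "big_web": 0b111}
matched = queries_to_invoke(
    {"port": 80, "proto": "tcp", "size": 500}, predicates, signatures
)
```

Each shared predicate is evaluated once per tuple, no matter how many queries use it.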
  • Patent number: 8041186
    Abstract: Some embodiments provide a method for processing metadata associated with digital video in a multi-state video computer readable medium. The method specifies a set of rules for propagating the metadata between different states in the video computer readable medium. It then propagates the metadata between the states based on the specified set of rules.
    Type: Grant
    Filed: December 9, 2003
    Date of Patent: October 18, 2011
    Assignee: Apple Inc.
    Inventor: David Robert Black
  • Patent number: 8036076
    Abstract: Provided is a computer system including: a computer running as a DB server; a storage system including a plurality of disk drives for storing data; and a management module, in which: at least one of the plurality of disk drives stores data of a DB schema written by the computer; the management module specifies the DB schema to be accessed based on a received query, transmits, to the storage system, an instruction to copy at least a portion of the data of the specified DB schema from the disk drive to a memory, and transmits, to the storage system, an instruction to control an rpm of the disk drive that stores the data of the specified DB schema; and the storage system controls the rpm of the disk drive based on the instruction. Accordingly, power consumption of the storage system can be reduced even if installed disks increase in number.
    Type: Grant
    Filed: January 17, 2008
    Date of Patent: October 11, 2011
    Assignee: Hitachi, Ltd.
    Inventors: Hideomi Idei, Kazuhiko Mogi, Norifumi Nishikawa
  • Patent number: 8032522
    Abstract: Parameterized queries are optimized by a transformational optimizer. The optimizer produces a dynamic plan that embeds multiple plan options that may be selected to execute a particular query. Parameter distribution improves query execution efficiency and performance by exploring a sample parameter space representative of the parameter values actually used. The dynamic plans can be simplified while maintaining an acceptable level of optimality by reducing the number of plan options. The reduction is achieved by eliminating switch unions to alternatives that are close in cost. Both approaches of parameter space exploration and dynamic plan generation are deeply integrated into the query optimizer.
    Type: Grant
    Filed: August 25, 2006
    Date of Patent: October 4, 2011
    Assignee: Microsoft Corporation
    Inventors: Jonathan D. Goldstein, Per-Ake Larson, Jingren Zhou
  • Patent number: 8024325
    Abstract: Techniques for estimating the cost of processing a database statement that includes one or more path expressions are provided. One aspect of cost is I/O cost, or the cost of reading data from persistent storage into memory according to a particular streaming operator. Binary-encoded XML data is stored in association with a synopsis that summarizes the binary-encoded XML data. The synopsis includes skip length information for one or more elements and indicates, for each such element, how large (e.g., in bytes) the element is in storage. The skip length information of a particular element thus indicates how much data may be skipped during I/O if the particular element does not match the path expression that is input to the streaming operator. The skip length information of one or more elements is used to estimate the cost of processing the database statement.
    Type: Grant
    Filed: June 25, 2008
    Date of Patent: September 20, 2011
    Assignee: Oracle International Corporation
    Inventors: Ning Zhang, Sam Idicula, Nipun Agarwal
  • Patent number: 8019751
    Abstract: The cost of running a query (having a query range) on a multidimensional database may be estimated using a process that factors in criteria beyond merely the number of affected records. First, a materialized view of the database may be represented as a container of tuples, sorted by key. Then keys may be stepped through, each key representing a mapping of a combination of tuples from the container. At each step, the process may request the next smallest key in the query range greater than or equal to the key of the current step, which results in the tuple in the database whose key is the smallest, greater than or equal to the requested key, and determine if the resulting tuple is in the query range. The cost of the query may then be estimated as the number of tuples upon which the range check was performed.
    Type: Grant
    Filed: June 23, 2008
    Date of Patent: September 13, 2011
    Assignee: Oracle International Corporation
    Inventors: Jonathan M. Baccash, Igor Nazarenko, Uri Rodny, Ambuj Shatdal
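
The key-stepping estimate above can be sketched for a simple case. The example assumes two-dimensional integer keys sorted lexicographically and a rectangular query range given by inclusive corners `lo` and `hi`; these simplifications and all function names are illustrative and are not taken from the patent.

```python
import bisect


def in_range(key, lo, hi):
    return lo[0] <= key[0] <= hi[0] and lo[1] <= key[1] <= hi[1]


def next_in_range(key, lo, hi):
    """Smallest key inside the box [lo, hi] that is >= key, or None."""
    a, b = key
    if a < lo[0]:
        return (lo[0], lo[1])
    if a > hi[0]:
        return None
    if b < lo[1]:
        return (a, lo[1])
    if b <= hi[1]:
        return (a, b)
    # b is past the box for this first coordinate; move to the next one.
    return (a + 1, lo[1]) if a + 1 <= hi[0] else None


def estimate_cost(sorted_keys, lo, hi):
    """Count the tuples on which a range check is performed."""
    checks = 0
    probe = next_in_range(lo, lo, hi)
    while probe is not None:
        # Tuple whose key is the smallest one >= the requested key.
        pos = bisect.bisect_left(sorted_keys, probe)
        if pos == len(sorted_keys):
            break
        key = sorted_keys[pos]
        checks += 1
        if in_range(key, lo, hi):
            probe = next_in_range((key[0], key[1] + 1), lo, hi)
        else:
            probe = next_in_range(key, lo, hi)
    return checks


keys = [(1, 1), (1, 5), (2, 2), (2, 3), (2, 9), (3, 1), (3, 2), (4, 2)]
cost = estimate_cost(keys, lo=(2, 2), hi=(3, 3))
```

For this data the walk touches (2, 2), (2, 3), (3, 2) and (4, 2); the out-of-range tuples (2, 9) and (3, 1) are skipped rather than scanned, so the estimate reflects the work a skip-style scan would actually do.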
  • Patent number: 8015202
    Abstract: Embodiments of the invention provide techniques for aggregating database queries for energy efficiency. In one embodiment, queries received by a DBMS are aggregated and staged according to hard-disk drives required for query execution. Each group of queries accessing a given drive may be dispatched for execution together. Further, the queries received by a DBMS may be matched to patterns of previously received queries. The matching patterns may be used to predict other queries which are likely to be received by the DBMS. The received queries may be staged to be dispatched with the predicted queries. By aggregating queries to be executed, access to each hard-disk drive may be optimized, thus reducing the overall energy consumption required for executing the queries.
    Type: Grant
    Filed: June 19, 2008
    Date of Patent: September 6, 2011
    Assignee: International Business Machines Corporation
    Inventors: Robert Joseph Bestgen, Wei Hu, Shantan Kethireddy, Andrew Peter Passe, Ulrich Thiemann
  • Patent number: 8015176
    Abstract: A method and system for cleansing anomalies from sequence-based data at query time. Sequence-based data such as RFID data is loaded into a database. One or more cleansing rules are received at a cleansing rules engine. The cleansing rules engine converts the cleansing rule(s) to a template that includes logic to compensate for anomalies in the sequence-based data. A query to retrieve the sequence-based data is received by a query rewrite engine. The query rewrite engine rewrites the query by applying the template logic. The rewritten query is executed at query time. The result of the rewritten query execution is identical to the result of executing the original query on a data set generated by applying the cleansing rule to all of the sequence-based data.
    Type: Grant
    Filed: May 21, 2008
    Date of Patent: September 6, 2011
    Assignee: International Business Machines Corporation
    Inventors: Latha Sankar Colby, Sangeeta T. Doraiswamy, Jun Rao, Hetal Thakkar
  • Patent number: 8015180
    Abstract: Systems, methodologies, media, and other embodiments associated with supporting queries with hard time constraints are described. One exemplary system embodiment includes logic for accepting a query having a hard time constraint. The example system may also include logic for selectively rewriting the query having the hard time constraint into a query having a row limitation or a sample percentage limitation. In one example, the row limitation or sample percentage limitation is computed by repetitively comparing an estimated query execution time to the hard time constraint. The example system may also include logic for establishing a timer(s) associated with the rewritten query.
    Type: Grant
    Filed: May 18, 2007
    Date of Patent: September 6, 2011
    Assignee: Oracle International Corp.
    Inventors: Ying Hu, Seema Sundara, Jagannathan Srinivasan
  • Patent number: 8001113
    Abstract: In one implementation, a method is provided for increasing relevance of database search results. The method includes receiving a subject query string and determining a trained edit distance between the subject query string and a candidate string using trained cost factors derived from a training set of labeled query transformations. A trained cost factor includes a conditional probability for mutations in labeled non-relevant query transformations and a conditional probability for mutations in labeled relevant query transformations. The candidate string is evaluated for selection based on the trained edit distance. In some implementations, the cost factors may take into account the context of a mutation. As such, in some implementations multi-dimensional matrices are utilized which include the trained cost factors.
    Type: Grant
    Filed: April 22, 2010
    Date of Patent: August 16, 2011
    Assignee: Yahoo! Inc.
    Inventor: John M. Carnahan
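    Illustration (not part of the patent record): a minimal sketch of an edit distance with per-mutation cost factors. The abstract does not say how the trained cost factors are computed from the conditional probabilities, so the cost table here is simply a hypothetical input; weighted_edit_distance and its parameters are assumed names.

```python
def weighted_edit_distance(source, target, sub_cost, indel_cost=1.0):
    """Edit distance where substituting a with b costs sub_cost[(a, b)].

    Substitutions missing from `sub_cost` cost 1.0; insertions and
    deletions cost `indel_cost`. The table stands in for trained factors.
    """
    # prev[j] is the distance between the processed source prefix and target[:j].
    prev = [j * indel_cost for j in range(len(target) + 1)]
    for i, s in enumerate(source, 1):
        cur = [i * indel_cost]
        for j, t in enumerate(target, 1):
            if s == t:
                substitute = prev[j - 1]
            else:
                substitute = prev[j - 1] + sub_cost.get((s, t), 1.0)
            delete = prev[j] + indel_cost
            insert = cur[j - 1] + indel_cost
            cur.append(min(substitute, delete, insert))
        prev = cur
    return prev[-1]

# Hypothetical table: replacing "a" with "o" is cheap, i.e. likely still relevant.
costs = {("a", "o"): 0.25}
cheap = weighted_edit_distance("cat", "cot", costs)
plain = weighted_edit_distance("cat", "cot", {})
```

    With the table above, "cot" scores closer to "cat" than it does under uniform costs, which is the effect a trained cost factor is meant to have on candidate selection.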
  • Patent number: 7991769
    Abstract: An improved system and method is provided for searching a collection of objects that may be located in hierarchies of auxiliary information for retrieval of response objects. A framework to perform a generalization search in hierarchies may be used to generalize a search by moving up to a higher level in a hierarchy of taxonomies or to specialize a search by moving down to a lower level in the hierarchy of taxonomies. Once the system decides to enumerate response objects at a particular level of generalization, a budgeted generalization search may be used for enumerating a set of response objects within a budgeted cost.
    Type: Grant
    Filed: July 7, 2006
    Date of Patent: August 2, 2011
    Assignee: Yahoo! Inc.
    Inventors: Marcus Felipe Fontoura, Vanja Josifovski, Christopher Olston, Shanmugasundaram Ravikumar, Andrew Tomkins
  • Publication number: 20110184935
    Abstract: A networked data processor maintains a database of information concerning potential evidence from litigants in legal proceedings supporting stipulated discovery agreements, and requiring meet-and-confer sessions prior to seeking court supervision. Individual or adverse parties can participate. Sources include data custodians, file repositories, electronic data, witnesses, etc. Managers and employees are polled to populate a database defining the litigant's organization and evidence. The evidence may be sequestered, copied and processed, e.g., filtered for confidentiality or privilege, analyzed as to format, and queried to assess the volume of data that would be responsive under alternative discovery specifications. Cost and time are assessed under alternative specifications and reports are provided for use in negotiating a discovery plan. A stipulated discovery plan may result, or if not, a meet-and-confer session is electronically managed in a multi-user teleconference.
    Type: Application
    Filed: January 21, 2011
    Publication date: July 28, 2011
    Applicant: 26F, LLC
    Inventor: Michael MARLIN
  • Patent number: 7987178
    Abstract: A method and system for automatically determining optimization frequencies of queries having one or more parameter markers. Execution plans for a query are generated and each plan is associated with one or more bind value sets. An optimization frequency is selected based on differences between pairs of execution costs where one execution cost of a pair is a cost of executing the query with a bind value set via a first execution plan and the other execution cost of the pair is a cost of optimally executing the query with the bind value set via a second execution plan. The differences are based on maximum selectivity or cardinality distances associated with the bind value sets. If none of the differences exceeds a predefined value, the query is optimized once. If at least one of the differences exceeds the predefined value, the query is reoptimized each time the query is executed.
    Type: Grant
    Filed: May 22, 2008
    Date of Patent: July 26, 2011
    Assignee: International Business Machines Corporation
    Inventors: Fabian Hueske, Volker Gerhard Markl
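    Illustration (not part of the patent record): a minimal sketch of the threshold rule stated in the abstract above. How the execution costs are derived from selectivity or cardinality distances is not modeled; the function name, the return labels, and the sample costs are hypothetical.

```python
def choose_optimization_frequency(cost_pairs, threshold):
    """Pick an optimization frequency from per-bind-value-set cost pairs.

    Each pair is (cost via the first plan, cost of optimal execution via
    the second plan) for one bind value set.
    """
    for first_plan_cost, optimal_cost in cost_pairs:
        if first_plan_cost - optimal_cost > threshold:
            # At least one bind value set is served badly by plan reuse.
            return "reoptimize_each_execution"
    return "optimize_once"

stable = choose_optimization_frequency([(10.0, 9.5), (12.0, 12.0)], threshold=1.0)
skewed = choose_optimization_frequency([(10.0, 9.5), (40.0, 12.0)], threshold=1.0)
```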
  • Patent number: 7987181
    Abstract: A system and method for directing query traffic. In one embodiment, the system may include a plurality of query servers, each configured to evaluate queries, and a query traffic director. The query traffic director may be configured to receive a given query formulated in a query language for evaluation, to parse the given query, to identify a dataset targeted by the given query dependent upon parsing the given query, and to convey the given query to a particular query server dependent upon the identified dataset.
    Type: Grant
    Filed: June 16, 2004
    Date of Patent: July 26, 2011
    Assignee: Symantec Operating Corporation
    Inventors: Dhrubajyoti Borthakur, Nur Premo
  • Patent number: 7987177
    Abstract: The task of estimating the number of distinct values (DVs) in a large dataset arises in a wide variety of settings in computer science and elsewhere. The present invention provides synopses for DV estimation in the setting of a partitioned dataset, as well as corresponding DV estimators that exploit these synopses. Whenever an output compound data partition is created via a multiset operation on a pair of (possibly compound) input partitions, the synopsis for the output partition can be obtained by combining the synopses of the input partitions. If the input partitions are compound partitions, it is not necessary to access the synopses for all the base partitions that were used to construct the input partitions. Superior (in certain cases near-optimal) accuracy in DV estimates is maintained, especially when the synopsis size is small. The synopses can be created in parallel, and can also handle deletions of individual partition elements.
    Type: Grant
    Filed: January 30, 2008
    Date of Patent: July 26, 2011
    Assignee: International Business Machines Corporation
    Inventors: Kevin Scott Beyer, Rainer Gemulla, Peter Jay Haas, Berthold Reinwald, John Sismanis
  • Patent number: 7984004
    Abstract: Described herein is a system that facilitates assigning indications of usefulness to query suggestions. The system includes a query suggestion generator component that receives a query and generates a query suggestion based at least in part upon the received query. A model component outputs an indication of usefulness with respect to the query suggestion, wherein the model component is a machine-learned model of user behavior with respect to query suggestions.
    Type: Grant
    Filed: January 17, 2008
    Date of Patent: July 19, 2011
    Inventors: Galen Andrew, Sooho Park, Robert L. Rounthwaite, Silviu-Petru Cucerzan, Jamie Paul Buckley, Joanna Chan
  • Patent number: 7984039
    Abstract: A method and system are provided for merging results in distributed information retrieval. A search manager is in communication with a plurality of components, wherein a component is a search engine working on a document collection and returning results in the form of a list of documents in response to a search query. The search manager submits a query to the plurality of components; receives results from each component in the form of a list of documents; estimates the success of a component in handling the query to generate a merit score for a component per query; applies the merit score to the results for the component; and merges results from the plurality of components by ranking in order of the applied merit score.
    Type: Grant
    Filed: July 14, 2005
    Date of Patent: July 19, 2011
    Assignee: International Business Machines Corporation
    Inventors: David Carmel, Adam Darlow, Shai Fine, Elad Yom-Tov
  • Patent number: 7984044
    Abstract: In document search for searching a document by use of a query formula composed of a Boolean formula of keywords, a plurality of query formulas arriving at about the same time from a plurality of users are efficiently processed. A system or a program for searching documents includes: a query formula controller for sorting a plurality of query formulas into a plurality of query formula sets based on predicted search speeds of the respective query formulas; and a search unit for performing searches for the plurality of sorted query formula sets sequentially from the set having the fastest predicted search speed, and for, in each search processing, merging the query formulas in the corresponding query formula set into a formula and thereby searching from the merged formula.
    Type: Grant
    Filed: December 23, 2008
    Date of Patent: July 19, 2011
    Assignee: Hitachi, Ltd.
    Inventors: Makoto Iwayama, Osamu Imaichi, Tomohiro Yasuda
  • Publication number: 20110173183
    Abstract: The present invention provides a method and system for optimizing search result rankings through use of a game interface. The method and system includes providing a game interface to at least two users, the game interface comprising at least one search query and at least two search result sets. The method and system further includes detecting the selection of one of the two search result sets by the users based on competition criteria and updating ranking data in response to the selection of one of the two search results. The method and system further includes selecting ranking data associated with a given query, determining an optimum ranking based on aggregating the selected ranking data, and storing the optimum ranking.
    Type: Application
    Filed: January 8, 2010
    Publication date: July 14, 2011
    Inventors: Ali Dasdan, Santanu Kolay, Chris Drome
  • Patent number: 7979421
    Abstract: Methods and apparatus, including computer systems and program products, for executing a query on a subset of data, for example, to facilitate a fast search with a very large result set. In one general aspect, a method of executing a query includes receiving a query for execution on data in the data repository; generating an estimate of a number of results of the query; defining a subset of data in the data repository; determining whether to execute the query on the subset of the data; executing the query on the subset of the data to generate a partial set of results if the query is to be executed on the subset of the data, otherwise executing the query on the data repository to generate a complete set of results; and providing query results.
    Type: Grant
    Filed: December 19, 2007
    Date of Patent: July 12, 2011
    Assignee: SAP AG
    Inventors: Guenter Radestock, Oliver M. Steinau
  • Patent number: 7974967
    Abstract: A system may include a routines repository that is configured to store and maintain hardware libraries, software libraries and metadata, a hybrid query engine that may be configured to receive a query, parse the query, compute a query execution plan and output the query execution plan using metadata and operators from the hardware libraries and/or the software libraries, and a routines management module that may be configured to provide the metadata and the operators from the routines repository to the hybrid query engine. The system may include an execution engine module that may be configured to receive the query execution plan, the execution engine module including a reconfigurable hardware execution engine having a reconfigurable fabric, where the reconfigurable hardware execution engine may be configured to process the query execution plan.
    Type: Grant
    Filed: April 15, 2008
    Date of Patent: July 5, 2011
    Assignee: SAP AG
    Inventor: Bernd Scheuermann
  • Publication number: 20110161311
    Abstract: Disclosed are methods and apparatus for clustering and presenting search suggestions. A segment of text is obtained via a search query section of a user interface, the segment of text being a portion of a search query. A set of suggestions is obtained, each suggestion in the set of suggestions being a suggested search query relating to the segment of text. Two or more groups of suggestions are generated, each of the two or more groups of suggestions including a different subset of the set of suggestions. The two or more groups of suggestions are provided such that each of the two or more groups of suggestions is displayed in a separate partition of a search assistance segment of the user interface.
    Type: Application
    Filed: December 28, 2009
    Publication date: June 30, 2011
    Applicant: YAHOO! INC.
    Inventors: Gilad Mishne, Alpa Jain