Patents Examined by SyLing Yen
  • Patent number: 11487762
    Abstract: Techniques and solutions are described for partitioning data among different types of computer-readable storage media, such as between RAM and disk-based storage. A measured workload can be used to estimate data access for one or more possible partition arrangements. The partition arrangements can be automatically enumerated. Scores for the partition arrangements can be calculated, where a score can indicate how efficiently a partition arrangement places frequently accessed data into storage specified for frequently accessed data and places infrequently accessed data into storage specified for infrequently accessed data.
    Type: Grant
    Filed: July 14, 2020
    Date of Patent: November 1, 2022
    Assignee: SAP SE
    Inventors: Norman May, Alexander Boehm, Guido Moerkotte, Michael Brendle, Mahammad Valiyev, Nick Weber, Robert Schulze, Michael Grossniklaus
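The enumerate-and-score idea in the abstract above can be illustrated with a minimal sketch. It is not the patented method: the scoring formula (share of measured accesses served from the hot tier), the brute-force enumeration, and the names `score`, `best_arrangement`, `access_counts`, `sizes`, and `hot_capacity` are all assumptions made for illustration.

```python
from itertools import combinations


def score(hot_set, access_counts):
    """Assumed score: fraction of measured accesses that hit the hot tier."""
    total = sum(access_counts.values())
    if total == 0:
        return 0.0
    return sum(access_counts[p] for p in hot_set) / total


def best_arrangement(access_counts, sizes, hot_capacity):
    """Enumerate every set of partitions that fits in the hot tier (e.g. RAM)
    and return the best-scoring set together with its score."""
    names = sorted(access_counts)
    best, best_score = frozenset(), 0.0
    for r in range(len(names) + 1):
        for combo in combinations(names, r):
            if sum(sizes[p] for p in combo) > hot_capacity:
                continue  # arrangement does not fit in the hot tier
            s = score(combo, access_counts)
            if s > best_score:
                best, best_score = frozenset(combo), s
    return best, best_score
```

Exhaustive enumeration is exponential in the number of partitions, so this only suits small examples; a real system would presumably prune the search space.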
  • Patent number: 11475012
    Abstract: A non-transitory computer-readable medium is provided. The medium comprises a set of instructions, which, when executed by a processing system associated with a database or data warehouse, causes the processing system to retrieve data from a data source in accordance with a mapping between a first set of partitions and a second set of partitions, the first set of partitions being associated with the data source and the second set of partitions being associated with the database. The set of instructions, when executed by the processing system, further causes the processing system to load the retrieved data into the database. Retrieving the data and loading the retrieved data comprise a single logical unit of work. A database system and a method executed by a processing system associated with a database are also provided.
    Type: Grant
    Filed: September 25, 2017
    Date of Patent: October 18, 2022
    Assignee: SingleStore, Inc.
    Inventors: Joseph Victor, Francis Williams, Carl Sverre, Steven Camina, Hurshal Patel
  • Patent number: 11468073
    Abstract: Techniques are provided for gathering statistics in a database system. The techniques involve gathering some statistics using an “on-the-fly” technique, some statistics through a “high-frequency” technique, and yet other statistics using a “prediction” technique. The technique used to gather each statistic is based, at least in part, on the overhead required to gather the statistic. For example, low-overhead statistics may be gathered “on-the-fly” using the same process that is performing the operation that affects the statistic, while statistics whose gathering incurs greater overhead may be gathered in the background, while the database is live, using the high-frequency technique. The prediction technique may be used for relatively-high overhead statistics that can be predicted based on historical data and the current value of predictor statistics.
    Type: Grant
    Filed: August 6, 2019
    Date of Patent: October 11, 2022
    Assignee: Oracle International Corporation
    Inventors: Mohamed Zait, Yuying Zhang, Hong Su, Jiakun Li
  • Patent number: 11468074
    Abstract: Systems and methods are disclosed for an approximate string searching technique to search for match results that have character differences with the search string. A cost is computed to measure the amount of character differences, and a match is recognized if the cost is below a threshold. The match is determined based on an inferred state machine, whose states are iteratively generated in computer memory for successive characters in the input text. States are added to represent modifications to the string needed to account for character differences and track the costs of the modifications. States are removed when their costs become excessive. Advantageously, the search process never generates the full state machine in memory, retaining only a selected set of best states to continue with the approximate match process. The technique thus enables a practicable implementation of approximate searching that can tolerate an arbitrary number of character deviations.
    Type: Grant
    Filed: December 31, 2019
    Date of Patent: October 11, 2022
    Assignee: Rapid7, Inc.
    Inventors: Viliam Holub, Eoin Shanley, Trevor Parsons
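As a rough illustration of the state-tracking idea in the abstract above, the sketch below keeps one cost per pattern position, advances those states one text character at a time, and drops states whose cost exceeds the threshold. It uses a standard edit-distance formulation with unit costs, which is an assumption; the function name `approx_find` and the `max_states` cap are hypothetical and are not taken from the patent.

```python
def approx_find(pattern, text, max_cost, max_states=64):
    """Return end offsets in `text` where `pattern` matches with
    edit cost <= max_cost (unit cost per insert, delete, substitute)."""
    m = len(pattern)

    def close(states):
        # Skipping a pattern character (deletion) costs 1 and consumes no text.
        for pos in range(m):
            if pos in states:
                c = states[pos] + 1
                if c <= max_cost and c < states.get(pos + 1, max_cost + 1):
                    states[pos + 1] = c
        return states

    def prune(states):
        # Keep only the cheapest states; this cap is a heuristic and can
        # miss matches if it is set lower than len(pattern) + 1.
        if len(states) > max_states:
            states = dict(sorted(states.items(), key=lambda kv: kv[1])[:max_states])
        return states

    states = close({0: 0})
    hits = []
    for i, ch in enumerate(text):
        nxt = {0: 0}  # a match may start at any text position

        def relax(pos, cost):
            # States over the threshold are never stored.
            if cost <= max_cost and cost < nxt.get(pos, max_cost + 1):
                nxt[pos] = cost

        for pos, cost in states.items():
            relax(pos, cost + 1)  # extra character in the text (insertion)
            if pos < m:
                relax(pos + 1, cost + (pattern[pos] != ch))  # match or substitution
        states = prune(close(nxt))
        if m in states:
            hits.append(i + 1)
    return hits
```

For example, `approx_find("hello", "say helo world", 1)` reports a match ending just after `helo`, since one missing character is within the threshold.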
  • Patent number: 11455364
    Abstract: A machine learning clustering process is trained. Web pages of a website are clustered. User flow data associated with a first browsing session at the website is obtained. The user flow data includes a plurality of web page identifiers (e.g., URLs). A web page record for each of the web page identifiers is generated. Each web page record includes words of the corresponding web page identifier. Clusters of web page identifiers previously output from the trained machine learning clustering process are received. For each of the web page records, a cluster of web page identifiers is identified by mapping the web page record to one of the clusters of web page identifiers using the machine learning clustering process. A directed graph representative of the first browsing session is constructed. One or more nodes of the directed graph are the identified clusters of web page identifiers.
    Type: Grant
    Filed: June 23, 2020
    Date of Patent: September 27, 2022
    Assignee: International Business Machines Corporation
    Inventors: Andrey Finkelshtein, Noga Agmon, Eitan Menahem, Yehonatan Bitton
  • Patent number: 11455309
    Abstract: Disclosed is a computer-implemented method to adjust partition keys. The method includes identifying a target table that is a target of a query, the target table including a set of initial partitions. The method also includes determining a set of common queries, wherein each of the common queries is configured to retrieve data from the target table. The method further includes identifying a plurality of core ranges. The method includes merging the core ranges into a new set of partitions. The method further includes setting, in response to the merging, updated partition keys. Further aspects of the present disclosure are directed to systems and computer program products containing functionality consistent with the method described above.
    Type: Grant
    Filed: July 13, 2020
    Date of Patent: September 27, 2022
    Assignee: International Business Machines Corporation
    Inventors: Hong Mei Zhang, Shuo Li, Xiaobo Wang, ShengYan Sun
  • Patent number: 11449482
    Abstract: Aspects define a dynamic threshold filter data structure that includes a pairing of an override log level value to a key value; in response to an incoming processing request, identify a user identification value that is linked to the request, wherein the user identification value is associated to a default logging level within a thread context map for logging data associated with executing processes in satisfaction of the processing request, and wherein the default logging level is different from the override log level; and in response to determining that the user identification value matches the key value, log data associated with executing processes in satisfaction of the processing request to the override log level.
    Type: Grant
    Filed: June 14, 2019
    Date of Patent: September 20, 2022
    Assignee: ADP, Inc.
    Inventor: Stephen Dale Garvey
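The override mechanism in the abstract above can be sketched with Python's standard `logging` module. This is an illustration only: a `contextvars.ContextVar` stands in for the thread context map, and the class name `DynamicThresholdFilter`, the variable `current_user`, and the `overrides` mapping are assumptions rather than details from the patent.

```python
import contextvars
import logging

# Assumed stand-in for the thread context map: the user ID tied to the
# request currently being processed.
current_user = contextvars.ContextVar("current_user", default=None)


class DynamicThresholdFilter(logging.Filter):
    """Let records through at an override level when the current user ID
    matches a configured key, and at the default level otherwise."""

    def __init__(self, overrides, default_level=logging.WARNING):
        super().__init__()
        self.overrides = dict(overrides)  # key value -> override log level
        self.default_level = default_level

    def filter(self, record):
        threshold = self.overrides.get(current_user.get(), self.default_level)
        return record.levelno >= threshold
```

Attaching the filter to a handler whose logger is set to `DEBUG` lets a single user's requests be traced verbosely without raising log volume for everyone else.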
  • Patent number: 11442934
    Abstract: A method, a system, and a computer program product for executing a query. A query associated with a calculation scenario defining a data flow model that includes one or more calculation nodes is received. Each calculation node corresponds to an operation being performed on one or more database tables stored at a database. The calculation nodes include one or more nodes specifying a window function operation. The window function operation includes one or more first attributes and one or more second attributes. A calculation engine executes the calculation scenario by performing, using at least one of the first and second attributes, the window function operation on the database tables stored at the database. Based on the execution of the calculation scenario, a result data set is generated and provided by the database server to the application server.
    Type: Grant
    Filed: March 27, 2020
    Date of Patent: September 13, 2022
    Assignee: SAP SE
    Inventors: Michael Ludwig, Johannes Merx, Matthias Vigelius, Christoph Weyerhaeuser
  • Patent number: 11429572
    Abstract: One or more processors store rules for performing rules-based cleaning operations on a plurality of datasets, wherein each rule comprises one or more functions to be executed against a dataset during the rules-based cleaning operations, the one or more functions each having one or more associated conditions and actions, wherein the one or more actions are performed on the dataset responsive to the one or more associated conditions being satisfied. The one or more processors further apply the rules to each of the plurality of datasets to perform the rules-based cleaning operations. To apply the rules to a given dataset, the one or more processors identify an ordered list of the one or more functions to be executed with respect to the given dataset during the rules-based cleaning operations and determine, for each of the one or more functions, whether the given dataset satisfies one or more conditions associated with a respective function of the one or more functions.
    Type: Grant
    Filed: June 21, 2019
    Date of Patent: August 30, 2022
    Assignee: Palantir Technologies, Inc.
    Inventor: Shelby Vanhooser
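The condition-and-action structure described in the abstract above might look like the following minimal sketch. The dictionary layout of a rule, the function name `apply_rules`, and the two example functions in `RULES` are hypothetical and chosen only to show the control flow.

```python
def apply_rules(dataset, rules):
    """Walk each rule's ordered functions; run a function's action only
    when its condition holds for the dataset. Returns the cleaned dataset
    and the names of the functions that fired."""
    applied = []
    for rule in rules:
        for fn in rule["functions"]:
            if fn["condition"](dataset):
                dataset = fn["action"](dataset)
                applied.append(fn["name"])
    return dataset, applied


# Hypothetical rule for a dataset of strings: trim whitespace, then drop empties.
RULES = [
    {
        "functions": [
            {
                "name": "strip",
                "condition": lambda rows: any(r != r.strip() for r in rows),
                "action": lambda rows: [r.strip() for r in rows],
            },
            {
                "name": "drop_empty",
                "condition": lambda rows: "" in rows,
                "action": lambda rows: [r for r in rows if r],
            },
        ]
    }
]
```

Because the functions run in list order, `drop_empty` sees the output of `strip`, so a whitespace-only row is removed in a single pass.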
  • Patent number: 11409771
    Abstract: Methods, systems, and computer-readable media for splitting partitions across database clusters in a time-series database are disclosed. A time-series database determines that a heat metric for a first tile has exceeded a threshold. The first tile represents spatial boundaries and temporal boundaries of time-series data, and a lease for the first tile is assigned to a storage node. Based (at least in part) on the heat metric, a temporal split of the first tile is performed to generate an intermediate tile representing the spatial boundaries and a later portion of the temporal boundaries. A spatial split of the intermediate tile is performed to generate second and third tiles representing two portions of the spatial boundaries and the later portion of the temporal boundaries. The storage node stores elements of the time-series data within these new boundaries to the second and third tiles.
    Type: Grant
    Filed: March 26, 2020
    Date of Patent: August 9, 2022
    Assignee: Amazon Technologies, Inc.
    Inventors: Dumanshu Goyal, Zhong Ren, Nirmesh Khandelwal
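The two-step split in the abstract above (temporal first, then spatial) can be sketched as follows. The `Tile` fields, the half-open integer ranges, the midpoint spatial split, and the choice to keep the earlier time range as its own tile are all assumptions for illustration; the patent text here does not specify them.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tile:
    key_lo: int   # spatial range [key_lo, key_hi)
    key_hi: int
    t_start: int  # temporal range [t_start, t_end)
    t_end: int


def split_hot_tile(tile, heat, threshold, now):
    """If the tile is too hot, cut it at `now` (temporal split), then halve
    the later portion along the key range (spatial split)."""
    if heat <= threshold:
        return [tile]
    if not (tile.t_start < now < tile.t_end) or tile.key_hi - tile.key_lo < 2:
        return [tile]  # nothing sensible to split in this toy model
    earlier = Tile(tile.key_lo, tile.key_hi, tile.t_start, now)
    intermediate = Tile(tile.key_lo, tile.key_hi, now, tile.t_end)
    mid = (intermediate.key_lo + intermediate.key_hi) // 2
    second = Tile(intermediate.key_lo, mid, now, tile.t_end)
    third = Tile(mid, intermediate.key_hi, now, tile.t_end)
    return [earlier, second, third]
```

Splitting in time first means data already written under the old boundaries does not have to move; only writes after `now` are routed to the two new tiles.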
  • Patent number: 11409813
    Abstract: A method and apparatus for mining a general tag, a server and a medium are disclosed. The method can comprise: matching a tag seed rule containing a tag placeholder and an attribute of the tag placeholder with historical search information to determine a matching tag; combining the existing tag seed rule and the matching tag to construct a new search sequence set; and performing a generalization process on search sequences included in the new search sequence set to obtain a new tag seed rule, and returning to perform the operation of matching the new tag seed rule with the historical search information to determine a new tag until the tag and the tag seed rule satisfy a convergence condition. A more comprehensive and profound tag can be mined, and the entire flow of mining the tag need not depend on a vertical website.
    Type: Grant
    Filed: December 7, 2018
    Date of Patent: August 9, 2022
    Assignee: Beijing Baidu Netcom Science and Technology Co., Ltd.
    Inventors: Xinwei Feng, Xuping Cao, Yilin Zhang, Ying Li
  • Patent number: 11392589
    Abstract: A method includes generating vertical-specific (VS) records from data sources. Each VS record includes a vertical identifier and a set of VS data fields. The method further comprises generating, for each VS record, an entity partial (EP) record that includes EP data fields populated from the VS data fields. The EP data fields include an entity ID that indicates an entity for the EP record, a source data field that identifies a data source, and an EP searchable data field including data that is descriptive of the entity. The method further comprises generating a search record for each entity ID by combining data from EP records. The data from the EP records is combined based on the source data included in the EP records. Each search record includes a search record searchable data field that includes data from one or more of the EP searchable data fields.
    Type: Grant
    Filed: June 12, 2020
    Date of Patent: July 19, 2022
    Assignee: Branch Metrics, Inc.
    Inventors: Eric Glover, Jonas Bauer, Rishi Khaitan, Matthew Dale, Dmitri Gaskin, Charles Gilliam, Pavan Achanta, Zachary Joel Rivest, Nicholas Chen
  • Patent number: 11392566
    Abstract: Techniques described herein propose a new RIDDecode operator in a QEP that uses ROWID lookup and fetch, instead of dictionary decoding, to retrieve decoded values, in order to reduce memory pressure and speed up processing.
    Type: Grant
    Filed: November 27, 2019
    Date of Patent: July 19, 2022
    Assignee: Oracle International Corporation
    Inventors: Pit Fender, Benjamin Schlegel, Matthias Brantner
  • Patent number: 11386082
    Abstract: Disclosed herein are system, method, and computer program product embodiments for providing paged and compressed storage for column data, while maintaining existing access mechanisms for the data. In order to reduce an in-memory footprint for column data, columns may be stored in pageable format using page chains, and only those pages of the column data needed to resolve a query will be placed in memory, and evicted from memory when no longer needed. In order to further reduce the footprint for these columns, compression can be applied, and the compressed column data stored in the same pageable format using page chains. The compressed data includes a plurality of vectors, each of which is converted into pages and stored on the page chain with the others so that they can be efficiently retrieved during database retrieval operations.
    Type: Grant
    Filed: June 5, 2020
    Date of Patent: July 12, 2022
    Assignee: SAP SE
    Inventors: Mehul Wagle, Colin Florendo, Pushkar Khadilkar, Robert Schulze, Reza Sherkat, Amit Pathak
  • Patent number: 11379435
    Abstract: Disclosed is a system for automated document generation. The system comprises a database arrangement comprising a plurality of structured data records and a processing arrangement communicably coupled to the database arrangement. The processing arrangement receives a user input from a user relating to at least: a type of the document to be generated, information to be included in the document to be generated. Moreover, the processing arrangement analyzes user input relating to the type of document to be generated to determine related structured data records to be retrieved from the database arrangement, retrieves the related structured data records from the database arrangement, analyzes the related structured data records to determine attributes for the document to be generated and uses the determined attributes and the user input relating to the information to be included in the document to be generated to generate the document.
    Type: Grant
    Filed: August 16, 2019
    Date of Patent: July 5, 2022
    Inventors: Pradip Kumar Seth, Shailendra Raj Mehta
  • Patent number: 11379478
    Abstract: An approach is provided for optimizing a join operation that includes receiving, by one or more processors of a computer system, a join request associated with a fact table and a plurality of related dimension tables; identifying, by the one or more processors of the computer system, a join relationship from the fact table and the plurality of related dimension tables; matching, by the one or more processors of the computer system, different tables of the fact table and the plurality of related dimension tables; filtering, by the one or more processors of the computer system using data parallelism, the fact table and the plurality of related dimension tables, wherein the filtering occurs prior to performing the join request; and performing, by the one or more processors of the computer system, the join operation pursuant to the join request.
    Type: Grant
    Filed: April 2, 2020
    Date of Patent: July 5, 2022
    Assignee: International Business Machines Corporation
    Inventors: ShengYan Sun, Peng Hui Jiang, Shuo Li, Xiaobo Wang
  • Patent number: 11367129
    Abstract: Methods and apparatus related to generating representations of information are described. The information may include menu information for merchants such as restaurants. Referring to menus, methods may include receiving potential information for a first menu, and receiving indications of associations of the information with the first menu and/or any number of additional menus. Information and/or associations may later be updated by a desired set of users.
    Type: Grant
    Filed: October 31, 2019
    Date of Patent: June 21, 2022
    Inventor: Howard W. Lutnick
  • Patent number: 11360983
    Abstract: A system and method for performing a hash bucketing process on data in motion are presented. The method includes applying a first hash function on an input dataset to map the input dataset to a bucket, wherein the first hash function results in a first hash value; applying a second hash function on the first hash value to map the input dataset to a record in the bucket; generating metadata based on the input dataset, wherein the metadata at least points to the original location of the input dataset; and storing the generated metadata in the record in the bucket.
    Type: Grant
    Filed: September 20, 2019
    Date of Patent: June 14, 2022
    Assignee: HITACHI VANTARA LLC
    Inventors: Alex Mylnikov, Rohit Mahajan
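The two-level hashing in the abstract above can be sketched as below. The use of salted SHA-256 for both hash functions, the bucket and slot counts, and the names `BucketStore`, `locate`, and `put` are assumptions for illustration and are not taken from the patent.

```python
import hashlib


def _h(data: bytes, salt: bytes) -> int:
    """64-bit integer hash derived from SHA-256 (an assumed choice)."""
    return int.from_bytes(hashlib.sha256(salt + data).digest()[:8], "big")


class BucketStore:
    def __init__(self, n_buckets=16, slots_per_bucket=64):
        self.n_buckets = n_buckets
        self.slots = slots_per_bucket
        self.buckets = [dict() for _ in range(n_buckets)]

    def locate(self, data: bytes):
        first = _h(data, b"bucket")            # first hash: picks the bucket
        bucket = first % self.n_buckets
        # Second hash is applied to the first hash value, not to the raw data.
        slot = _h(first.to_bytes(8, "big"), b"record") % self.slots
        return bucket, slot

    def put(self, data: bytes, location: str):
        """Store only metadata that points back to where the data lives."""
        bucket, slot = self.locate(data)
        meta = {"location": location, "size": len(data)}
        self.buckets[bucket].setdefault(slot, []).append(meta)
        return bucket, slot
```

Only metadata is stored in the record, so the payload itself can stay wherever it originally resided.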
  • Patent number: 11360953
    Abstract: A system and method for data entries deduplication are provided. The method includes indexing an input data set, wherein the input data set is in a tabular format and the indexing includes providing a unique Row identifier (RowID), wherein rows are the data entries; computing attribute similarity for each column across each pair of rows; computing, for each pair of rows, row-to-row similarity as a weighted sum of attribute similarities; clustering pairs of rows based on their row-to-row similarities; and providing an output data set including at least the clustered pairs of rows.
    Type: Grant
    Filed: July 24, 2020
    Date of Patent: June 14, 2022
    Assignee: HITACHI VANTARA LLC
    Inventors: Rohit Mahajan, Winnie Cheng
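The pipeline in the abstract above (index rows, compare attributes, take a weighted sum, cluster) can be sketched as follows. The similarity measure (`difflib.SequenceMatcher`), the fixed threshold, the union-find clustering, and the names `attr_sim`, `row_sim`, and `dedupe` are assumptions made for illustration, not details from the patent.

```python
from difflib import SequenceMatcher
from itertools import combinations


def attr_sim(a, b):
    """Assumed attribute similarity: case-insensitive string ratio in [0, 1]."""
    return SequenceMatcher(None, str(a).lower(), str(b).lower()).ratio()


def row_sim(r1, r2, weights):
    """Row-to-row similarity as a weighted sum of attribute similarities."""
    return sum(w * attr_sim(r1[c], r2[c]) for c, w in weights.items())


def dedupe(rows, weights, threshold=0.85):
    """Return clusters of RowIDs whose rows are likely duplicates."""
    indexed = dict(enumerate(rows))  # RowID -> row
    parent = {i: i for i in indexed}

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    for i, j in combinations(indexed, 2):
        if row_sim(indexed[i], indexed[j], weights) >= threshold:
            parent[find(j)] = find(i)  # merge the two clusters

    clusters = {}
    for i in indexed:
        clusters.setdefault(find(i), []).append(i)
    return sorted(clusters.values())
```

Comparing every pair of rows is quadratic in the number of rows; larger inputs would usually need a blocking step first, which this sketch omits.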
  • Patent number: 11354372
    Abstract: Methods and systems for providing a user with content relevant to a location of interest to the user, when the user is determined to be at or near the location, are presented. The user's interest in the location may be determined based on queries about the location received from the user prior to the user arriving at the location. The queries received from the user about the location are used to build a location recommendation model, which generates personalized content relevant to the location and to one or more interest verticals identified for the user. The location recommendation model is built using a location recommendation engine that collects data about the user, the queried location, one or more associations between the user, the queried location, and/or one or more other users, as well as various other information related to the user's interests and the queried location.
    Type: Grant
    Filed: September 24, 2019
    Date of Patent: June 7, 2022
    Assignee: GOOGLE LLC
    Inventors: Jignashu Parikh, Subhadip Sarkar