Patents by Inventor Guogen Zhang

Guogen Zhang has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Publication number: 20170097972
    Abstract: System and method embodiments are provided for using different storage formats for a primary database and its replicas in a database managed replication (DMR) system. As such, the advantages of both formats can be combined with suitable design complexity and implementation. In an embodiment, data is arranged in a sequence of rows and stored in a first storage format at the primary database. The data arranged in the sequence of rows is also stored in a second storage format at the replica database. The sequence of rows is determined according to the first storage format or the second storage format. The first storage format is a row store (RS) and the second storage format is a column store (CS), or vice versa. In an embodiment, the sequence of rows is determined to improve compression efficiency at the CS.
    Type: Application
    Filed: October 1, 2015
    Publication date: April 6, 2017
    Inventors: Qingqing Zhou, Yang Sun, Guogen Zhang
  • Publication number: 20170091269
    Abstract: A method includes receiving, by a database system, a query statement and forming a runtime plan tree in accordance with the query statement. The method also includes traversing the runtime plan tree, including determining whether a function node of the runtime plan tree is qualified for just-in-time (JIT) compilation. Additionally, the method includes, upon determining that the function node is qualified for JIT compilation, producing a string key in accordance with a function of the function node and determining whether a compiled object corresponding to the string key is stored in a compiled object cache.
    Type: Application
    Filed: September 24, 2015
    Publication date: March 30, 2017
    Inventors: Cheng Zhu, Yonghua Ding, Guogen Zhang
  • Publication number: 20170031988
    Abstract: A method includes dividing a dataset into partitions by hashing a specified key, selecting a set of distributed file system nodes as a primary node group for storage of the partitions, and causing a primary copy of the partitions to be stored on the primary node group by a distributed storage system file server such that the location of each partition is known by hashing of the specified key.
    Type: Application
    Filed: July 30, 2015
    Publication date: February 2, 2017
    Inventors: Jason Yang Sun, Guogen Zhang, Le Cai
  • Publication number: 20170031765
    Abstract: An apparatus and method are provided for utilizing different data storage types to store primary and replicated database directories. Included is a first data storage of a first data storage type including a direct-access storage type. The first data storage is configured to store a primary database directory. Also included is a second data storage of a second data storage type including a share type. The second data storage is configured to store a replicated database directory that replicates at least a portion of the primary database directory.
    Type: Application
    Filed: July 28, 2015
    Publication date: February 2, 2017
    Inventors: Bai Yang, Guogen Zhang
  • Publication number: 20170010968
    Abstract: The present technology relates to managing data caching in processing nodes of a massively parallel processing (MPP) database system. A directory is maintained that includes a list and a storage location of the data pages in the MPP database system. Memory usage is monitored in processing nodes by exchanging memory usage information with each other. Each of the processing nodes manages a list and a corresponding amount of available memory in each of the processing nodes based on the memory usage information. Data pages are read from a memory of the processing nodes in response to receiving a request to fetch the data pages, and a remote memory manager is queried for available memory in each of the processing nodes in response to receiving the request. The data pages are distributed to the memory of the processing nodes having sufficient space available for storage during data processing.
    Type: Application
    Filed: July 8, 2015
    Publication date: January 12, 2017
    Inventors: Huaizhi Li, Qingqing Zhou, Guogen Zhang
  • Patent number: 9535952
    Abstract: A method, apparatus, and article of manufacture for optimizing a query in a computer system. Grouping operations are optimized during execution of the query in the computer system by: (1) translating the grouping operations into a plurality of levels, wherein each of the levels is comprised of one or more grouping sets with the same number of grouping expressions; (2) deriving the grouping sets on a level-by-level basis, wherein the grouping sets in a base level are obtained from the database and the grouping sets in a next one of the levels are derived by selecting as an input a smallest one of the grouping sets in a previous one of the levels with which it has a derivation relationship; and (3) combining the derived grouping sets into an output for the query.
    Type: Grant
    Filed: April 11, 2012
    Date of Patent: January 3, 2017
    Assignee: International Business Machines Corporation
    Inventors: Guogen Zhang, Fen-Ling Lin, Jung-Hsin Hu, Yao-Ching S. Chen, Yun Wang, Glenn M. Yuki
  • Publication number: 20160378824
    Abstract: A system and method for parallelizing hash-based operators in symmetric multiprocessing (SMP) databases is provided. In an embodiment, a method in a device for performing hash-based database operations includes receiving at the device a database query; creating a plurality of execution workers to process the query; and building by the execution workers a hash table from a database table, the database table comprising one of a plurality of partitions and a plurality of scan units, the hash table shared by the execution workers, each execution worker scanning a corresponding partition and adding entries to the hash table if the database table is partitioned, each execution worker scanning an unprocessed scan unit and adding entries to the hash table according to the scan unit if the database table comprises scan units, and the workers performing the scanning and the adding in a parallel manner.
    Type: Application
    Filed: June 24, 2015
    Publication date: December 29, 2016
    Inventors: Huaizhi Li, Guogen Zhang, Jason Yang Sun
  • Publication number: 20160364484
    Abstract: Data messages having different priorities may be stored in different communication buffers of a network node. The data messages may then be forwarded from the communication buffers to working buffers as space becomes available in the working buffers. After being forwarded to the working buffers, the data messages may be available to be processed by upper-layer operations of the network node. Priorities may be assigned to the data messages based on a priority level of a query associated with the data messages, a priority level of an upper-layer operation assigned to process the data messages, or combinations thereof.
    Type: Application
    Filed: June 10, 2015
    Publication date: December 15, 2016
    Inventors: Yu Dong, Qingqing Zhou, Guogen Zhang
  • Publication number: 20160306810
    Abstract: System and method for storing statistical data of records stored in a distributed file system. In one aspect, a statistical data block is allocated in a memory of a data node for storing statistical data of records stored in a storage disk of the data node. Each data block of the plurality of data blocks in the data node has a respective entry in the statistical data block, which is collocated with data blocks on the data node. Statistical data of records stored in the distributed file system are collected and written to the statistical data block in the memory of the data node.
    Type: Application
    Filed: April 15, 2015
    Publication date: October 20, 2016
    Inventors: Demai Ni, Guogen Zhang, Qingqing Zhou, Jason Yang Sun
  • Publication number: 20160306847
    Abstract: Embodiments are provided herein for using parameterized Intermediate Representation (IR) for just-in-time (JIT) compilation in database query execution engines. In an embodiment, a method supporting query JIT compilation and execution in a database management system includes identifying a central processing unit (CPU) intensive function in a query, and identifying, in the CPU intensive function, one or more parameters. The one or more parameters represent variables with values changeable at different query instances. The CPU intensive function is compiled to a parameterized IR including the one or more parameters. The parameterized IR of the CPU intensive function is saved in a catalog of parameterized IRs.
    Type: Application
    Filed: April 15, 2015
    Publication date: October 20, 2016
    Inventors: Yonghua Ding, Guogen Zhang, Cheng Zhu
  • Patent number: 9436732
    Abstract: System and method embodiments are provided for adaptive vector size selection for vectorized query execution. The adaptive vector size selection is implemented in two stages. In a query planning stage, a suitable vector size is estimated for a query by a query planner. The planning stage includes analyzing a query plan tree, segmenting the tree into different segments, and assigning an initial vector size to each segment of the query execution plan. In a subsequent query execution stage, an execution engine monitors hardware performance indicators, and adjusts the vector size according to the monitored hardware performance indicators. Adjusting the vector size includes trying different vector sizes and observing related processor counters to increase or decrease the vector size, wherein the vector size is increased to improve hardware performance according to the processor counters, and wherein the vector size is decreased when the processor counters indicate a decrease in hardware performance.
    Type: Grant
    Filed: March 13, 2013
    Date of Patent: September 6, 2016
    Assignee: Futurewei Technologies, Inc.
    Inventors: Qingqing Zhou, Guogen Zhang
  • Patent number: 9430582
    Abstract: A system and method are provided for query processing, comprising: creating an index of a database and ordering a set of index candidates from the index into a list based on a set of heuristic rules. A query defining a query path is then reduced into a list of single path expressions. Each index candidate is matched against the list of single path expressions according to the ordering of the index candidates. The matched candidate nodes are also verified to ensure that they satisfy the query path.
    Type: Grant
    Filed: January 26, 2015
    Date of Patent: August 30, 2016
    Assignee: International Business Machines Corporation
    Inventors: Mengchu Cai, Ruiping Li, Guogen Zhang
  • Patent number: 9430274
    Abstract: System and method embodiments are provided for consistent read in record-based multi-version concurrency control (MVCC) in database (DB) management systems. In an embodiment, a method in a record-based MVCC DB management system for a snapshot consistent read includes copying a system commit transaction identifier (TxID) and a current log record sequence number (LSN) from a transaction log at a start of a reader without backfilling of a commit LSN of a transaction to records that are changed and without copying an entire transaction table by the reader; and determining whether a record is visible according to a record TxID, the commit TxID and a current LSN, wherein a transaction table is consulted only when the record TxID is equal to or larger than a commit TxID at a transaction start.
    Type: Grant
    Filed: March 28, 2014
    Date of Patent: August 30, 2016
    Assignee: Futurewei Technologies, Inc.
    Inventor: Guogen Zhang
  • Publication number: 20160246842
    Abstract: A method for adaptively generating a query execution plan for a parallel database distributed among a cluster of data nodes includes receiving memory usage data from multiple data nodes including network devices, calculating a representative memory load corresponding to the data nodes based on the memory usage data, categorizing a memory mode corresponding to the data nodes based on the calculated representative memory load, calculating an available work memory corresponding to the data nodes based on the memory mode, and generating the query execution plan for the data nodes based on the available work memory, wherein the memory usage data is based on monitored individual memory loads associated with the data nodes and the query execution plan corresponds to the currently available work memory.
    Type: Application
    Filed: February 25, 2015
    Publication date: August 25, 2016
    Inventors: Huaizhi Li, Guogen Zhang
  • Publication number: 20160092488
    Abstract: Presented systems and methods can facilitate efficient and effective information storage management. A system may include a plurality of nodes, shared storage, and a centralized lock manager. A storage management method can include: receiving an access request to information; performing a lock resolution process; and performing an access operation (e.g., read, information update, etc.). The information can be associated with a shared storage component. The lock resolution process can include participating in a lock management process that manages a physical lock (P-lock), wherein the lock management process utilizes transaction information associated with an implicit lock process and proceeds without communication overhead associated with explicit requests for a logical lock.
    Type: Application
    Filed: September 26, 2014
    Publication date: March 31, 2016
    Inventors: Jason Yang Sun, Guogen Zhang
  • Patent number: 9286300
    Abstract: At least a portion of data from a first processing system is archived onto a second processing system based on partitions of the data. A query received at the first processing system is processed at the second processing system to retrieve archived data satisfying the received query in response to determining at the first processing system that the received query encompasses archived data. Embodiments of the present invention further include methods, systems, and computer program products for archiving and accessing data in substantially the same manner described above.
    Type: Grant
    Filed: May 2, 2013
    Date of Patent: March 15, 2016
    Assignee: International Business Machines Corporation
    Inventors: Oliver Draese, Namik Hrle, Claus Kempfert, Oliver Koeth, Ruiping Li, Robert S. Muse, Knut Stolze, Guogen Zhang
  • Publication number: 20150293966
    Abstract: In one embodiment, a method of performing point-in-time recovery (PITR) in a massively parallel processing (MPP) database includes receiving, by a data node from a coordinator, a PITR recovery request and reading a log record of the MPP database. The method also includes determining a type of the log record and updating a transaction table when the type of the log record is an abort transaction or a commit transaction.
    Type: Application
    Filed: April 10, 2014
    Publication date: October 15, 2015
    Applicant: Futurewei Technologies, Inc.
    Inventors: Le Cai, Guogen Zhang
  • Publication number: 20150278270
    Abstract: System and method embodiments are provided for multi-version support in indexes in a database. The embodiments enable substantially optimized multi-version support in indexes and avoid backfill of the commit log sequence number (LSN) for a transaction identifier (TxID). In an embodiment, a method in a data processing system for managing a database includes determining with the data processing system whether a record is deleted according to a delete indicator in an index leaf page record corresponding to the record; and determining with the data processing system, when the record is not deleted, whether the record is visible according to a new record indicator in the index leaf page record and according to a comparison of a system commit TxID at the transaction start with a record commit TxID obtained from the index leaf page record.
    Type: Application
    Filed: March 28, 2014
    Publication date: October 1, 2015
    Applicant: Futurewei Technologies, Inc.
    Inventor: Guogen Zhang
  • Publication number: 20150278281
    Abstract: System and method embodiments are provided for consistent read in record-based multi-version concurrency control (MVCC) in database (DB) management systems. In an embodiment, a method in a record-based MVCC DB management system for a snapshot consistent read includes copying a system commit transaction identifier (TxID) and a current log record sequence number (LSN) from a transaction log at a start of a reader without backfilling of a commit LSN of a transaction to records that are changed and without copying an entire transaction table by the reader; and determining whether a record is visible according to a record TxID, the commit TxID and a current LSN, wherein a transaction table is consulted only when the record TxID is equal to or larger than a commit TxID at a transaction start.
    Type: Application
    Filed: March 28, 2014
    Publication date: October 1, 2015
    Applicant: Futurewei Technologies, Inc.
    Inventor: Guogen Zhang
  • Publication number: 20150199449
    Abstract: A system and method are provided for query processing, comprising: creating an index of a database and ordering a set of index candidates from the index into a list based on a set of heuristic rules. A query defining a query path is then reduced into a list of single path expressions. Each index candidate is matched against the list of single path expressions according to the ordering of the index candidates. The matched candidate nodes are also verified to ensure that they satisfy the query path.
    Type: Application
    Filed: January 26, 2015
    Publication date: July 16, 2015
    Inventors: Mengchu Cai, Ruiping Li, Guogen Zhang
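
Illustrative sketch for publication 20170031988 (hash-partitioned storage). The abstract describes dividing a dataset into partitions by hashing a specified key so that the location of each partition can be found by hashing that key again. The following is a minimal Python sketch of that general idea only, not the method claimed in the application. The function names (partition_of, node_of, place), the use of SHA-256, and the modulo mapping from partitions to nodes are all assumptions made for illustration.

```python
import hashlib


def partition_of(key, num_partitions):
    """Map a key to a partition number with a stable hash.

    SHA-256 is an assumption; any hash that is stable across
    processes would serve the same purpose.
    """
    digest = hashlib.sha256(str(key).encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_partitions


def node_of(partition, primary_nodes):
    """Map a partition to a node in the primary node group.

    The round-robin (modulo) assignment is hypothetical.
    """
    return primary_nodes[partition % len(primary_nodes)]


def place(records, key_field, num_partitions, primary_nodes):
    """Store the primary copy of each record on the node that owns its partition."""
    placement = {node: [] for node in primary_nodes}
    for record in records:
        partition = partition_of(record[key_field], num_partitions)
        placement[node_of(partition, primary_nodes)].append((partition, record))
    return placement
```

Under these assumptions, a reader that knows only the key, the partition count, and the node group can recompute node_of(partition_of(key, n), nodes) to find where a record lives, without consulting a separate location directory.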