Patents by Inventor Guogen Zhang

Guogen Zhang has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

Systems and methods to optimize multi-version support in indexes

Patent number: 9645844

Abstract: System and method embodiments are provided for multi-version support in indexes in a database. The embodiments enable substantially optimized multi-version support in index and avoid backfill of commit log sequence number (LSN) for a transaction identifier (TxID). In an embodiment, a method in a data processing system for managing a database includes determining with the data processing system whether a record is deleted according to a delete indicator in an index leaf page record corresponding to the record; and determining with the data processing system, when the record is not deleted, whether the record is visible according to a new record indicator in the index leaf page record and according to a comparison of a system commit TxID at the transaction start with a record commit TxID obtained from the index leaf page record.

Type: Grant

Filed: March 28, 2014

Date of Patent: May 9, 2017

Assignee: Futurewei Technologies, Inc.

Inventor: Guogen Zhang
Apparatus and Method for Managing Storage of a Primary Database and a Replica Database

Publication number: 20170097972

Abstract: System and method embodiments are provided for using different storage formats for a primary database and its replicas in a database managed replication (DMR) system. As such, the advantages of both formats can be combined with suitable design complexity and implementation. In an embodiment, data is arranged in a sequence of rows and stored in a first storage format at the primary database. The data arranged in the sequence of rows is also stored in a second storage format at the replica database. The sequence of rows is determined according to the first storage format or the second storage format. The first storage format is a row store (RS) and the second storage format is a column store (CS), or vice versa. In an embodiment, the sequence of rows is determined to improve compression efficiency at the CS.

Type: Application

Filed: October 1, 2015

Publication date: April 6, 2017

Inventors: Qingqing Zhou, Yang Sun, Guogen Zhang
System and Method for Database Query

Publication number: 20170091269

Abstract: A method includes receiving, by a database system, a query statement and forming a runtime plan tree in accordance with the query statement. The method also includes traversing the runtime plan tree, including determining whether a function node of the runtime plan tree is qualified for just-in-time (JIT) compilation. Additionally, the method includes, upon determining that the function node is qualified for JIT compilation, producing a string key in accordance with a function of the function node and determining whether a compiled object corresponding to the string key is stored in a compiled object cache.

Type: Application

Filed: September 24, 2015

Publication date: March 30, 2017

Inventors: Cheng Zhu, Yonghua Ding, Guogen Zhang
DATA PLACEMENT CONTROL FOR DISTRIBUTED COMPUTING ENVIRONMENT

Publication number: 20170031988

Abstract: A method includes dividing a dataset into partitions by hashing a specified key, selecting a set of distributed file system nodes as a primary node group for storage of the partitions, and causing a primary copy of the partitions to be stored on the primary node group by a distributed storage system file server such that the location of each partition is known by hashing of the specified key.

Type: Application

Filed: July 30, 2015

Publication date: February 2, 2017

Inventors: Jason Yang Sun, Guogen Zhang, Le Cai
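
The idea in the abstract of publication 20170031988, that a partition's location can be recomputed from the key alone, can be sketched in a few lines of Python. The node names, the partition count, the SHA-256 hash, and the round-robin mapping of partitions to nodes are assumptions for illustration and are not taken from the filing.

```python
import hashlib


def partition_of(key: str, num_partitions: int) -> int:
    """Map a key to a partition with a stable hash (not Python's salted hash())."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_partitions


def node_of(partition: int, primary_group: list[str]) -> str:
    """Assumed policy: spread partitions over the primary node group round-robin."""
    return primary_group[partition % len(primary_group)]


def locate(key: str, num_partitions: int, primary_group: list[str]) -> str:
    """Find the node holding a key's primary copy without a directory lookup."""
    return node_of(partition_of(key, num_partitions), primary_group)


primary_group = ["node-a", "node-b", "node-c"]  # hypothetical node names
placement = {k: locate(k, 8, primary_group) for k in ["cust-1", "cust-2", "cust-3"]}
```

Because `locate` is a pure function of the key, any client can compute where a partition lives without asking a central catalog.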
APPARATUS AND METHOD FOR UTILIZING DIFFERENT DATA STORAGE TYPES TO STORE PRIMARY AND REPLICATED DATABASE DIRECTORIES

Publication number: 20170031765

Abstract: An apparatus and method are provided for utilizing different data storage types to store primary and replicated database directories. Included is a first data storage of a first data storage type including a direct-access storage type. The first data storage is configured to store a primary database directory. Also included is a second data storage of a second data storage type including a share type. The second data storage is configured to store a replicated database directory that replicates at least a portion of the primary database directory.

Type: Application

Filed: July 28, 2015

Publication date: February 2, 2017

Inventors: Bai Yang, Guogen Zhang
SYSTEM AND METHOD FOR DATA CACHING IN PROCESSING NODES OF A MASSIVELY PARALLEL PROCESSING (MPP) DATABASE SYSTEM

Publication number: 20170010968

Abstract: The present technology relates to managing data caching in processing nodes of a massively parallel processing (MPP) database system. A directory is maintained that includes a list and a storage location of the data pages in the MPP database system. Memory usage is monitored in processing nodes by exchanging memory usage information with each other. Each of the processing nodes manages a list and a corresponding amount of available memory in each of the processing nodes based on the memory usage information. Data pages are read from a memory of the processing nodes in response to receiving a request to fetch the data pages, and a remote memory manager is queried for available memory in each of the processing nodes in response to receiving the request. The data pages are distributed to the memory of the processing nodes having sufficient space available for storage during data processing.

Type: Application

Filed: July 8, 2015

Publication date: January 12, 2017

Inventors: Huaizhi Li, Qingqing Zhou, Guogen Zhang
Dynamic selection of optimal grouping sequence at runtime for grouping sets, rollup and cube operations in SQL query processing

Patent number: 9535952

Abstract: A method, apparatus, and article of manufacture for optimizing a query in a computer system. Grouping operations are optimized during execution of the query in the computer system by: (1) translating the grouping operations into a plurality of levels, wherein each of the levels is comprised of one or more grouping sets with the same number of grouping expressions; (2) deriving the grouping sets on a level-by-level basis, wherein the grouping sets in a base level are obtained from the database and the grouping sets in a next one of the levels are derived by selecting as an input a smallest one of the grouping sets in a previous one of the levels with which it has a derivation relationship; and (3) combining the derived grouping sets into an output for the query.

Type: Grant

Filed: April 11, 2012

Date of Patent: January 3, 2017

Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: Guogen Zhang, Fen-Ling Lin, Jung-Hsin Hu, Yao-Ching S. Chen, Yun Wang, Glenn M. Yuki
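
The level-by-level derivation in the abstract of patent 9535952 can be illustrated with a small Python sketch for a `CUBE` with a single `SUM` measure. This is a simplified reading, not the patented method: the function names, the in-memory dictionaries, and the use of the number of result groups as the measure of "smallest" are assumptions.

```python
from collections import defaultdict
from itertools import combinations


def regroup(parent_cols, parent_result, child_cols):
    """Derive a coarser grouping set by re-aggregating a finer one."""
    positions = [parent_cols.index(c) for c in child_cols]
    out = defaultdict(int)
    for key, total in parent_result.items():
        out[tuple(key[p] for p in positions)] += total
    return dict(out)


def cube(rows, cols, measure):
    """Compute SUM(measure) for every grouping set of CUBE(cols), level by level."""
    base = tuple(cols)
    # Base level: the only grouping set computed from the raw rows.
    base_result = defaultdict(int)
    for row in rows:
        base_result[tuple(row[c] for c in base)] += row[measure]
    results = {base: dict(base_result)}
    previous = [base]
    # Each later level has one fewer grouping expression than the one before it.
    for size in range(len(cols) - 1, -1, -1):
        current = list(combinations(cols, size))
        for child in current:
            # Candidate inputs: previous-level sets containing every child column.
            parents = [p for p in previous if set(child) <= set(p)]
            smallest = min(parents, key=lambda p: len(results[p]))
            results[child] = regroup(smallest, results[smallest], child)
        previous = current
    return results


rows = [
    {"region": "east", "product": "a", "amount": 10},
    {"region": "east", "product": "b", "amount": 5},
    {"region": "west", "product": "a", "amount": 7},
    {"region": "north", "product": "a", "amount": 1},
]
results = cube(rows, ["region", "product"], "amount")
```

In this data the grand total is derived from the two-group `product` result rather than the three-group `region` result, which is the point of picking the smallest eligible input at run time.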
Systems and Methods for Parallelizing Hash-based Operators in SMP Databases

Publication number: 20160378824

Abstract: A system and method for parallelizing hash-based operators in symmetric multiprocessing (SMP) databases is provided. In an embodiment, a method in a device for performing hash-based database operations includes receiving at the device a database query; creating a plurality of execution workers to process the query; and building by the execution workers a hash table from a database table, the database table comprising one of a plurality of partitions and a plurality of scan units, the hash table shared by the execution workers, each execution worker scanning a corresponding partition and adding entries to the hash table if the database table is partitioned, each execution worker scanning an unprocessed scan unit and adding entries to the hash table according to the scan unit if the database table comprises scan units, and the workers performing the scanning and the adding in a parallel manner.

Type: Application

Filed: June 24, 2015

Publication date: December 29, 2016

Inventors: Huaizhi Li, Guogen Zhang, Jason Yang Sun
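
The scan-unit variant described in the abstract of publication 20160378824 can be sketched with Python threads. This is only an illustration of workers sharing one hash table: the function name, the single coarse lock, and the tuple row layout are assumptions, and a production engine would likely use finer-grained synchronization.

```python
import queue
import threading
from collections import defaultdict


def build_shared_hash_table(scan_units, key_index, num_workers=4):
    """Build one hash table shared by all workers from a list of scan units."""
    table = defaultdict(list)
    lock = threading.Lock()  # assumed: one coarse lock guards the shared table
    pending = queue.Queue()
    for unit in scan_units:
        pending.put(unit)

    def worker():
        while True:
            try:
                unit = pending.get_nowait()  # claim an unprocessed scan unit
            except queue.Empty:
                return  # nothing left to scan
            for row in unit:
                with lock:
                    table[row[key_index]].append(row)

    threads = [threading.Thread(target=worker) for _ in range(num_workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return dict(table)


scan_units = [[(1, "a"), (2, "b")], [(1, "c")], [(3, "d")]]
hash_table = build_shared_hash_table(scan_units, key_index=0)
```

The order of rows within a bucket depends on thread scheduling, so callers should not rely on it.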
Query Plan and Operation-Aware Communication Buffer Management

Publication number: 20160364484

Abstract: Data messages having different priorities may be stored in different communication buffers of a network node. The data messages may then be forwarded from the communication buffers to working buffers as space becomes available in the working buffers. After being forwarded to the working buffers, the data messages may be available to be processed by upper-layer operations of the network node. Priorities may be assigned to the data messages based on a priority level of a query associated with the data messages, a priority level of an upper-layer operation assigned to process the data messages, or combinations thereof.

Type: Application

Filed: June 10, 2015

Publication date: December 15, 2016

Inventors: Yu Dong, Qingqing Zhou, Guogen Zhang
BIG DATA STATISTICS AT DATA-BLOCK LEVEL

Publication number: 20160306810

Abstract: System and method for storing statistical data of records stored in a distributed file system. In one aspect, a statistical data block is allocated in a memory of a data node for storing statistical data of records stored in a storage disk of the data node. Each data block of the plurality of data blocks in the data node has a respective entry in the statistical data block, which is collocated with data blocks on the data node. Statistical data of records stored in the distributed file system are collected and written to the statistical data block in the memory of the data node.

Type: Application

Filed: April 15, 2015

Publication date: October 20, 2016

Inventors: Demai NI, Guogen ZHANG, Qingqing ZHOU, Jason Yang SUN
Apparatus and Method for Using Parameterized Intermediate Representation for Just-In-Time Compilation in Database Query Execution Engine

Publication number: 20160306847

Abstract: Embodiments are provided herein for using parameterized Intermediate Representation (IR) for just-in-time (JIT) compilation in database query execution engines. In an embodiment, a method supporting query JIT compilation and execution in a database management system includes identifying a central processing unit (CPU) intensive function in a query, and identifying, in the CPU intensive function, one or more parameters. The one or more parameters represent variables with values changeable at different query instances. The CPU intensive function is compiled to a parameterized IR including the one or more parameters. The parameterized IR of the CPU intensive function is saved in a catalog of parameterized IRs.

Type: Application

Filed: April 15, 2015

Publication date: October 20, 2016

Inventors: Yonghua Ding, Guogen Zhang, Cheng Zhu
System and method for adaptive vector size selection for vectorized query execution

Patent number: 9436732

Abstract: System and method embodiments are provided for adaptive vector size selection for vectorized query execution. The adaptive vector size selection is implemented in two stages. In a query planning stage, a suitable vector size is estimated for a query by a query planner. The planning stage includes analyzing a query plan tree, segmenting the tree into different segments, and assigning an initial vector size to each segment of the query execution plan. In a subsequent query execution stage, an execution engine monitors hardware performance indicators, and adjusts the vector size according to the monitored hardware performance indicators. Adjusting the vector size includes trying different vector sizes and observing related processor counters to increase or decrease the vector size, wherein the vector size is increased to improve hardware performance according to the processor counters, and wherein the vector size is decreased when the processor counters indicate a decrease in hardware performance.

Type: Grant

Filed: March 13, 2013

Date of Patent: September 6, 2016

Assignee: FUTUREWEI TECHNOLOGIES, INC.

Inventors: Qingqing Zhou, Guogen Zhang
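
The execution-stage adjustment described in the abstract of patent 9436732 can be illustrated as a simple hill climb in Python. Everything here is an assumption for illustration: the doubling and halving steps, the size bounds, and in particular `simulated_cost`, which stands in for real processor counters that this sketch does not read.

```python
def tune_vector_size(initial, measure_cost, min_size=64, max_size=65536):
    """Try neighbouring vector sizes and keep moving while the cost improves."""
    size = initial
    best = measure_cost(size)
    while True:
        candidates = [s for s in (size // 2, size * 2) if min_size <= s <= max_size]
        costs = {s: measure_cost(s) for s in candidates}
        if not costs:
            return size
        nxt = min(costs, key=costs.get)
        if costs[nxt] >= best:
            return size  # neither neighbour is better, so stop here
        size, best = nxt, costs[nxt]


def simulated_cost(size):
    """Hypothetical cost: per-vector overhead plus a penalty past 1024 rows."""
    overhead = 1000 / size
    cache_penalty = max(0, size - 1024) / 256
    return overhead + cache_penalty


tuned_from_small = tune_vector_size(256, simulated_cost)
tuned_from_large = tune_vector_size(8192, simulated_cost)
```

With this made-up cost model both starting points settle on the same size, growing from a small initial estimate and shrinking from a large one.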
Efficient method of using XML value indexes without exact path information to filter XML documents for more specific XPath queries

Patent number: 9430582

Abstract: A system and method are provided for query processing, comprising: creating an index of a database and ordering a set of index candidates from the index into a list based on a set of heuristic rules. A query defining a query path is then reduced into a list of single path expressions. Each index candidate is matched against the list of single path expressions according to the ordering of the index candidates. The matched candidate nodes are also verified to ensure that they satisfy the query path.

Type: Grant

Filed: January 26, 2015

Date of Patent: August 30, 2016

Assignee: International Business Machines Corporation

Inventors: Mengchu Cai, Ruiping Li, Guogen Zhang
Efficient methods and systems for consistent read in record-based multi-version concurrency control

Patent number: 9430274

Abstract: System and method embodiments are provided for consistent read in a record-based multi-version concurrency control (MVCC) in database (DB) management systems. In an embodiment, a method in a record-based multi-version concurrency control (MVCC) database (DB) management system for a snapshot consistent read includes copying a system commit transaction identifier (TxID) and a current log record sequence number (LSN) from a transaction log at a start of a reader without backfilling of a commit LSN of a transaction to records that are changed and without copying an entire transaction table by the reader; and determining whether a record is visible according to a record TxID, the commit TxID and a current LSN, wherein a transaction table is consulted only when the record TxID is equal to or larger than a commit TxID at a transaction start.

Type: Grant

Filed: March 28, 2014

Date of Patent: August 30, 2016

Assignee: Futurewei Technologies, Inc.

Inventor: Guogen Zhang
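
One plausible reading of the visibility check in the abstract of patent 9430274 is sketched below in Python. The abstract does not spell out the exact rule, so the following are assumptions: every transaction with a TxID below the snapshot's commit TxID had already committed when the reader started, and a later transaction is visible only if the transaction table shows a commit LSN at or before the snapshot LSN. The class and function names are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Snapshot:
    commit_txid: int  # assumed: all TxIDs below this were committed at reader start
    lsn: int          # log position copied at reader start


class TransactionTable:
    """Hypothetical table mapping TxID to commit LSN (None while still active)."""

    def __init__(self) -> None:
        self._commit_lsn: dict[int, Optional[int]] = {}
        self.lookups = 0  # counts how often the table is consulted

    def begin(self, txid: int) -> None:
        self._commit_lsn[txid] = None

    def commit(self, txid: int, lsn: int) -> None:
        self._commit_lsn[txid] = lsn

    def commit_lsn(self, txid: int) -> Optional[int]:
        self.lookups += 1
        return self._commit_lsn.get(txid)


def is_visible(record_txid: int, snap: Snapshot, table: TransactionTable) -> bool:
    # Fast path: older transactions need no transaction-table lookup.
    if record_txid < snap.commit_txid:
        return True
    # Slow path: consult the table only for TxIDs at or above the commit TxID.
    commit_lsn = table.commit_lsn(record_txid)
    return commit_lsn is not None and commit_lsn <= snap.lsn


table = TransactionTable()
table.begin(12)
table.commit(12, 480)
table.begin(13)
snap = Snapshot(commit_txid=12, lsn=500)
old_visible = is_visible(7, snap, table)
lookups_after_old = table.lookups
committed_visible = is_visible(12, snap, table)
active_visible = is_visible(13, snap, table)
```

The lookup counter shows the intended saving: the record written by transaction 7 is resolved from the snapshot alone, and only the two newer records touch the transaction table.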
QUERY OPTIMIZATION ADAPTIVE TO SYSTEM MEMORY LOAD FOR PARALLEL DATABASE SYSTEMS

Publication number: 20160246842

Abstract: A method for adaptively generating a query execution plan for a parallel database distributed among a cluster of data nodes includes receiving memory usage data from multiple data nodes including network devices, calculating a representative memory load corresponding to the data nodes based on the memory usage data, categorizing a memory mode corresponding to the data nodes based on the calculated representative memory load, calculating an available work memory corresponding to the data nodes based on the memory mode, and generating the query execution plan for the data nodes based on the available work memory, wherein the memory usage data is based on monitored individual memory loads associated with the data nodes and the query execution plan corresponds to the currently available work memory.

Type: Application

Filed: February 25, 2015

Publication date: August 25, 2016

Inventors: Huaizhi LI, Guogen ZHANG
CONCURRENCY CONTROL IN A SHARED STORAGE ARCHITECTURE SUPPORTING ON-PAGE IMPLICIT LOCKS

Publication number: 20160092488

Abstract: Presented systems and methods can facilitate efficient and effective information storage management. A system may include a plurality of nodes, shared storage and a centralized lock manager. A storage management method can include: receiving an access request to information, performing a lock resolution process; and performing an access operation (e.g., read, information update, etc.). The information can be associated with a shared storage component. The lock resolution process can include participating in a lock management process that manages a physical lock (P-lock), wherein the lock management process utilizes transaction information associated with an implicit lock process and proceeds without communication overhead associated with explicit requests for a logical lock.

Type: Application

Filed: September 26, 2014

Publication date: March 31, 2016

Inventors: Jason Yang SUN, Guogen ZHANG
Archiving data in database management systems

Patent number: 9286300

Abstract: At least a portion of data from a first processing system is archived onto a second processing system based on partitions of the data. A query received at the first processing system is processed at the second processing system to retrieve archived data satisfying the received query in response to determining at the first processing system that the received query encompasses archived data. Embodiments of the present invention further include methods, systems, and computer program products for archiving and accessing data in substantially the same manner described above.

Type: Grant

Filed: May 2, 2013

Date of Patent: March 15, 2016

Assignee: International Business Machines Corporation

Inventors: Oliver Draese, Namik Hrle, Claus Kempfert, Oliver Koeth, Ruiping Li, Robert S. Muse, Knut Stolze, Guogen Zhang
System and Method for Massively Parallel Processing Database

Publication number: 20150293966

Abstract: In one embodiment, a method of performing point-in-time recovery (PITR) in a massively parallel processing (MPP) database includes receiving, by a data node from a coordinator, a PITR recovery request and reading a log record of the MPP database. The method also includes determining a type of the log record and updating a transaction table when the type of the log record is an abort transaction or a commit transaction.

Type: Application

Filed: April 10, 2014

Publication date: October 15, 2015

Applicant: Futurewei Technologies, Inc.

Inventors: Le Cai, Guogen Zhang
Efficient Methods and Systems for Consistent Read in Record-Based Multi-Version Concurrency Control

Publication number: 20150278281

Abstract: System and method embodiments are provided for consistent read in a record-based multi-version concurrency control (MVCC) in database (DB) management systems. In an embodiment, a method in a record-based multi-version concurrency control (MVCC) database (DB) management system for a snapshot consistent read includes copying a system commit transaction identifier (TxID) and a current log record sequence number (LSN) from a transaction log at a start of a reader without backfilling of a commit LSN of a transaction to records that are changed and without copying an entire transaction table by the reader; and determining whether a record is visible according to a record TxID, the commit TxID and a current LSN, wherein a transaction table is consulted only when the record TxID is equal to or larger than a commit TxID at a transaction start.

Type: Application

Filed: March 28, 2014

Publication date: October 1, 2015

Applicant: FUTUREWEI TECHNOLOGIES, INC.

Inventor: Guogen Zhang
Systems and Methods to Optimize Multi-version Support in Indexes

Publication number: 20150278270

Abstract: System and method embodiments are provided for multi-version support in indexes in a database. The embodiments enable substantially optimized multi-version support in index and avoid backfill of commit log sequence number (LSN) for a transaction identifier (TxID). In an embodiment, a method in a data processing system for managing a database includes determining with the data processing system whether a record is deleted according to a delete indicator in an index leaf page record corresponding to the record; and determining with the data processing system, when the record is not deleted, whether the record is visible according to a new record indicator in the index leaf page record and according to a comparison of a system commit TxID at the transaction start with a record commit TxID obtained from the index leaf page record.

Type: Application

Filed: March 28, 2014

Publication date: October 1, 2015

Applicant: FUTUREWEI TECHNOLOGIES, INC.

Inventor: Guogen Zhang
