Patents by Inventor Wangyuan Zhang

Wangyuan Zhang has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

Replicating Big Data

Publication number: 20240370460

Abstract: A method includes identifying a first table including data. The first table has associated metadata, an associated replication state, an associated replication log file including replication logs logging mutations of the first table, and an associated replication configuration file including a first association that associates the first table with a replication family. The method includes inserting a second association in the replication configuration file that associates a second table having a non-loadable state with the replication family. The association of the second table with the replication family causes persistence of any replication logs in the replication log file that correspond to any mutations of the first table during the existence of the second table. The method further includes generating a third table from the first table, the metadata associated with the first table, and the associated replication state of the first table.

Type: Application

Filed: July 19, 2024

Publication date: November 7, 2024

Applicant: Google LLC

Inventors: Wangyuan Zhang, Li Moore
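The mechanism in this abstract can be illustrated with a small in-memory sketch. It is not the patented implementation: the names `Table`, `ReplicationFamily`, `begin_copy`, and `finish_copy` are hypothetical, the configuration file and log file are modeled as a plain dict and list, and the catch-up step in `finish_copy` is an assumption that the abstract does not spell out. The point shown is that a non-loadable placeholder registered in the family pins the log, so mutations made during the copy are not trimmed.

```python
from dataclasses import dataclass, field


@dataclass
class Table:
    name: str
    rows: dict = field(default_factory=dict)
    metadata: dict = field(default_factory=dict)
    applied_seq: int = 0   # replication state: last log entry reflected in rows
    loadable: bool = True


@dataclass
class ReplicationFamily:
    members: dict = field(default_factory=dict)  # stand-in for the configuration file
    log: list = field(default_factory=list)      # stand-in for the log file: (seq, key, value)

    def mutate(self, table, key, value):
        """Apply a mutation to a table and record it in the replication log."""
        table.applied_seq += 1
        table.rows[key] = value
        self.log.append((table.applied_seq, key, value))

    def trim_log(self):
        """Drop log entries that every family member has already applied."""
        floor = min(t.applied_seq for t in self.members.values())
        self.log = [entry for entry in self.log if entry[0] > floor]


def begin_copy(family, source, new_name):
    """Register a non-loadable placeholder, then snapshot the source table."""
    placeholder = Table(new_name + ".placeholder",
                        applied_seq=source.applied_seq, loadable=False)
    family.members[placeholder.name] = placeholder
    copy = Table(new_name, dict(source.rows), dict(source.metadata),
                 source.applied_seq)
    return placeholder, copy


def finish_copy(family, placeholder, copy):
    """Assumed catch-up step: replay retained logs, then swap out the placeholder."""
    for seq, key, value in family.log:
        if seq > copy.applied_seq:
            copy.rows[key] = value
            copy.applied_seq = seq
    family.members[copy.name] = copy
    del family.members[placeholder.name]
```

In this sketch, calling `trim_log()` while the placeholder exists keeps every entry newer than the snapshot, because the placeholder's `applied_seq` never advances.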

Replicating big data

Patent number: 12050622

Abstract: A method includes identifying a first table including data. The first table has associated metadata, an associated replication state, an associated replication log file including replication logs logging mutations of the first table, and an associated replication configuration file including a first association that associates the first table with a replication family. The method includes inserting a second association in the replication configuration file that associates a second table having a non-loadable state with the replication family. The association of the second table with the replication family causes persistence of any replication logs in the replication log file that correspond to any mutations of the first table during the existence of the second table. The method further includes generating a third table from the first table, the metadata associated with the first table, and the associated replication state of the first table.

Type: Grant

Filed: April 25, 2020

Date of Patent: July 30, 2024

Assignee: Google LLC

Inventors: Wangyuan Zhang, Li Moore

Packing objects by predicted lifespans in cloud storage

Patent number: 11954024

Abstract: A method includes receiving data objects, determining a predicted lifespan of each data object, and instantiating multiple shard files. Each shard file has an associated predicted lifespan range. The method also includes writing each data object into a corresponding shard file having the associated predicted lifespan range that includes the predicted lifespan of the respective data object and storing the shard files in a distributed system. The method also includes determining whether any stored shard files satisfy a compaction criteria based on a number of deleted data objects in each corresponding stored shard file. For each stored shard file satisfying the compaction criteria, the method also includes compacting the stored shard file by rewriting the remaining data objects of the stored shard file into a new shard file.

Type: Grant

Filed: January 24, 2022

Date of Patent: April 9, 2024

Assignee: Google LLC

Inventors: Wangyuan Zhang, Sandeep Singhal, Sangho Yoon, Guangda Lai, Arash Baratloo, Zhifan Zhang, Gael Hatchue Njouyep, Pramod Gaud
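The packing and compaction steps in this abstract can be sketched as follows. This is a toy model rather than the patented method: `Shard`, `pick_shard`, `pack`, `needs_compaction`, and `compact` are hypothetical names, the lifespan predictor is supplied by the caller because the abstract does not say how it is computed, and the 50% deleted-fraction threshold is an assumed compaction criterion.

```python
from dataclasses import dataclass, field


@dataclass
class Shard:
    """A shard file for objects whose predicted lifespan lies in [low, high)."""
    low: float
    high: float
    objects: dict = field(default_factory=dict)  # object id -> payload
    deleted: set = field(default_factory=set)    # ids marked as deleted

    def live_objects(self):
        return {k: v for k, v in self.objects.items() if k not in self.deleted}


def pick_shard(shards, predicted_lifespan):
    """Return the shard whose lifespan range includes the prediction."""
    for shard in shards:
        if shard.low <= predicted_lifespan < shard.high:
            return shard
    raise ValueError(f"no shard covers lifespan {predicted_lifespan}")


def pack(shards, items, predict):
    """Write each (object id, payload) pair into the matching shard."""
    for object_id, payload in items:
        pick_shard(shards, predict(payload)).objects[object_id] = payload


def needs_compaction(shard, threshold=0.5):
    """Assumed criterion: the fraction of deleted objects reaches the threshold."""
    if not shard.objects:
        return False
    return len(shard.deleted) / len(shard.objects) >= threshold


def compact(shard):
    """Rewrite the remaining live objects into a new shard with the same range."""
    return Shard(shard.low, shard.high, objects=shard.live_objects())


# Usage: three shards covering short, medium, and long predicted lifespans (days).
shards = [Shard(0, 7), Shard(7, 30), Shard(30, float("inf"))]
items = [("a", {"days": 2}), ("b", {"days": 3}), ("c", {"days": 90})]
pack(shards, items, predict=lambda payload: payload["days"])
shards[0].deleted.add("a")
if needs_compaction(shards[0]):
    shards[0] = compact(shards[0])
```

Grouping objects that are expected to expire together means a shard tends to empty out all at once, so compaction has fewer live objects to rewrite.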

Access pattern driven data placement in cloud storage

Patent number: 11588891

Abstract: A system and method for storing data in a distributed network having a plurality of datacenters distributed over a plurality of geographic regions. The method may involve receiving data, including metadata, uploaded to a first datacenter of the distributed network, receiving access information about previous data that was previously stored in the plurality of datacenters of the distributed network, predicting one or more of the plurality of geographic regions from which the uploaded data will be accessed based on the metadata and the access information, and instructing the uploaded data to be transferred from the first datacenter to one or more second datacenters located at each of the one or more predicted geographic regions.

Type: Grant

Filed: November 4, 2019

Date of Patent: February 21, 2023

Assignee: Google LLC

Inventors: Wangyuan Zhang, Vivienne Zhang, Pramod Gaud, Sangho Yoon, Xudong Shi, Kaifeng Yao
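The placement flow in this abstract can be sketched under simplifying assumptions. The abstract does not describe the prediction model, so the sketch swaps in a plain frequency count over past accesses that share an `owner` metadata field. The function names, the `owner` key, and the datacenter identifiers are all hypothetical.

```python
from collections import Counter


def predict_regions(metadata, access_log, top_k=1):
    """Predict the regions most likely to read the new object.

    access_log is a list of (metadata of a previously stored object,
    region it was accessed from) pairs. Stand-in predictor: count the
    regions that accessed earlier objects with the same owner.
    """
    counts = Counter(
        region
        for past_metadata, region in access_log
        if past_metadata.get("owner") == metadata.get("owner")
    )
    return [region for region, _ in counts.most_common(top_k)]


def plan_transfers(object_id, source_dc, metadata, access_log,
                   datacenters_by_region, top_k=1):
    """Return (object id, source, target) transfer instructions."""
    transfers = []
    for region in predict_regions(metadata, access_log, top_k):
        target_dc = datacenters_by_region[region]
        if target_dc != source_dc:  # data already at the source needs no move
            transfers.append((object_id, source_dc, target_dc))
    return transfers


# Usage: an object uploaded in the US whose owner is mostly read from Europe.
access_log = [
    ({"owner": "alice"}, "eu"),
    ({"owner": "alice"}, "eu"),
    ({"owner": "alice"}, "us"),
    ({"owner": "bob"}, "asia"),
]
datacenters_by_region = {"eu": "dc-eu-1", "us": "dc-us-1", "asia": "dc-asia-1"}
plan = plan_transfers("obj1", "dc-us-1", {"owner": "alice"},
                      access_log, datacenters_by_region)
```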

Packing Objects by Predicted Lifespans in Cloud Storage

Publication number: 20220147448

Abstract: A method includes receiving data objects, determining a predicted lifespan of each data object, and instantiating multiple shard files. Each shard file has an associated predicted lifespan range. The method also includes writing each data object into a corresponding shard file having the associated predicted lifespan range that includes the predicted lifespan of the respective data object and storing the shard files in a distributed system. The method also includes determining whether any stored shard files satisfy a compaction criteria based on a number of deleted data objects in each corresponding stored shard file. For each stored shard file satisfying the compaction criteria, the method also includes compacting the stored shard file by rewriting the remaining data objects of the stored shard file into a new shard file.

Type: Application

Filed: January 24, 2022

Publication date: May 12, 2022

Applicant: Google LLC

Inventors: Wangyuan Zhang, Sandeep Singhal, Sangho Yoon, Guangda Lai, Arash Baratloo, Zhifan Zhang, Gael Hatchue Njouyep, Pramod Gaud

Packing objects by predicted lifespans in cloud storage

Patent number: 11263128

Abstract: A method includes receiving data objects, determining a predicted lifespan of each data object, and instantiating multiple shard files. Each shard file has an associated predicted lifespan range. The method also includes writing each data object into a corresponding shard file having the associated predicted lifespan range that includes the predicted lifespan of the respective data object and storing the shard files in a distributed system. The method also includes determining whether any stored shard files satisfy a compaction criteria based on a number of deleted data objects in each corresponding stored shard file. For each stored shard file satisfying the compaction criteria, the method also includes compacting the stored shard file by rewriting the remaining data objects of the stored shard file into a new shard file.

Type: Grant

Filed: October 27, 2017

Date of Patent: March 1, 2022

Assignee: Google LLC

Inventors: Wangyuan Zhang, Sandeep Singhal, Sangho Yoon, Guangda Lai, Arash Baratloo, Zhifan Zhang, Gael Hatchue Njouyep, Pramod Gaud

Access Pattern Driven Data Placement in Cloud Storage

Publication number: 20210136150

Abstract: A system and method for storing data in a distributed network having a plurality of datacenters distributed over a plurality of geographic regions. The method may involve receiving data, including metadata, uploaded to a first datacenter of the distributed network, receiving access information about previous data that was previously stored in the plurality of datacenters of the distributed network, predicting one or more of the plurality of geographic regions from which the uploaded data will be accessed based on the metadata and the access information, and instructing the uploaded data to be transferred from the first datacenter to one or more second datacenters located at each of the one or more predicted geographic regions.

Type: Application

Filed: November 4, 2019

Publication date: May 6, 2021

Applicant: Google LLC

Inventors: Wangyuan Zhang, Vivienne Zhang, Pramod Gaud, Sangho Yoon, Xudong Shi, Kaifeng Yao

Replicating Big Data

Publication number: 20200265068

Abstract: A method includes identifying a first table including data. The first table has associated metadata, an associated replication state, an associated replication log file including replication logs logging mutations of the first table, and an associated replication configuration file including a first association that associates the first table with a replication family. The method includes inserting a second association in the replication configuration file that associates a second table having a non-loadable state with the replication family. The association of the second table with the replication family causes persistence of any replication logs in the replication log file that correspond to any mutations of the first table during the existence of the second table. The method further includes generating a third table from the first table, the metadata associated with the first table, and the associated replication state of the first table.

Type: Application

Filed: April 25, 2020

Publication date: August 20, 2020

Applicant: Google LLC

Inventors: Wangyuan Zhang, Li Moore

System and method of replicating data in a distributed system

Patent number: 10650024

Abstract: A method includes identifying a first table including data. The first table has associated metadata, an associated replication state, an associated replication log file including replication logs logging mutations of the first table, and an associated replication configuration file including a first association that associates the first table with a replication family. The method includes inserting a second association in the replication configuration file that associates a second table having a non-loadable state with the replication family. The association of the second table with the replication family causes persistence of any replication logs in the replication log file that correspond to any mutations of the first table during the existence of the second table. The method further includes generating a third table from the first table, the metadata associated with the first table, and the associated replication state of the first table.

Type: Grant

Filed: July 30, 2015

Date of Patent: May 12, 2020

Assignee: Google LLC

Inventors: Wangyuan Zhang, Li Moore

Lock state synchronization for non-disruptive persistent operation

Patent number: 10530855

Abstract: Techniques for synchronization between data structures for original locks and mirror lock data structures are disclosed herein. The mirror lock data structures are being maintained during various scenarios including volume move and aggregate relocation, in order to preserve the non-disruptive persistent operation on storage initiated by clients. According to one embodiment, a storage node determines a plurality of data container locks to be synchronized to a partner node of the storage node and transfers metadata that indicates states of variables that represent the plurality of data container locks to the partner node in a batch. When a client initiates a data access operation that causes an attempt to modify a data container lock of the plurality of data container locks, the storage node sends a retry code to a client that prompts the client to retry the data access operation after a predetermined time period.

Type: Grant

Filed: February 26, 2016

Date of Patent: January 7, 2020

Assignee: NETAPP, INC.

Inventors: Omprakaash C. Thoppai, William Zumach, Wangyuan Zhang, Vinay Sridhar, Robert Wyckoff Hyer, Jr.

Packing Objects by Predicted Lifespans in Cloud Storage

Publication number: 20190129844

Abstract: A method includes receiving data objects, determining a predicted lifespan of each data object, and instantiating multiple shard files. Each shard file has an associated predicted lifespan range. The method also includes writing each data object into a corresponding shard file having the associated predicted lifespan range that includes the predicted lifespan of the respective data object and storing the shard files in a distributed system. The method also includes determining whether any stored shard files satisfy a compaction criteria based on a number of deleted data objects in each corresponding stored shard file. For each stored shard file satisfying the compaction criteria, the method also includes compacting the stored shard file by rewriting the remaining data objects of the stored shard file into a new shard file.

Type: Application

Filed: October 27, 2017

Publication date: May 2, 2019

Applicant: Google LLC

Inventors: Wangyuan Zhang, Sandeep Singhal, Sangho Yoon, Guangda Lai, Arash Baratloo, Zhifan Zhang, Gael Hatchue Njouyep, Pramod Gaud

Replicating Big Data

Publication number: 20170032012

Abstract: A method includes identifying a first table including data. The first table has associated metadata, an associated replication state, an associated replication log file including replication logs logging mutations of the first table, and an associated replication configuration file including a first association that associates the first table with a replication family. The method includes inserting a second association in the replication configuration file that associates a second table having a non-loadable state with the replication family. The association of the second table with the replication family causes persistence of any replication logs in the replication log file that correspond to any mutations of the first table during the existence of the second table. The method further includes generating a third table from the first table, the metadata associated with the first table, and the associated replication state of the first table.

Type: Application

Filed: July 30, 2015

Publication date: February 2, 2017

Applicant: Google Inc.

Inventors: Wangyuan Zhang, Li Moore

Lock State Synchronization for Non-Disruptive Persistent Operation

Publication number: 20160182630

Abstract: Techniques for synchronization between data structures for original locks and mirror lock data structures are disclosed herein. The mirror lock data structures are being maintained during various scenarios including volume move and aggregate relocation, in order to preserve the non-disruptive persistent operation on storage initiated by clients. According to one embodiment, a storage node determines a plurality of data container locks to be synchronized to a partner node of the storage node and transfers metadata that indicates states of variables that represent the plurality of data container locks to the partner node in a batch. When a client initiates a data access operation that causes an attempt to modify a data container lock of the plurality of data container locks, the storage node sends a retry code to a client that prompts the client to retry the data access operation after a predetermined time period.

Type: Application

Filed: February 26, 2016

Publication date: June 23, 2016

Inventors: Omprakaash C. Thoppai, William Zumach, Wangyuan Zhang, Vinay Sridhar, Robert Wyckoff Hyer

Lock state synchronization for non-disruptive persistent operation

Patent number: 9280396

Abstract: Techniques for synchronization between data structures for original locks and mirror lock data structures are disclosed herein. The mirror lock data structures are being maintained during various scenarios including volume move and aggregate relocation, in order to preserve the non-disruptive persistent operation on storage initiated by clients. According to one embodiment, a storage node determines a plurality of data container locks to be synchronized to a partner node of the storage node and transfers metadata that indicates states of variables that represent the plurality of data container locks to the partner node in a batch. When a client initiates a data access operation that causes an attempt to modify a data container lock of the plurality of data container locks, the storage node sends a retry code to a client that prompts the client to retry the data access operation after a predetermined time period.

Type: Grant

Filed: November 1, 2012

Date of Patent: March 8, 2016

Assignee: NetApp, Inc.

Inventors: Omprakaash C. Thoppai, William Zumach, Wangyuan Zhang, Vinay Sridhar, Robert Wyckoff Hyer, Jr.
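The batch transfer and retry behavior in this abstract can be illustrated with a small single-process sketch. It is an assumption-laden model, not the patented design: `StorageNode`, `PartnerNode`, the `RETRY` and `OK` codes, and the 0.05 second retry delay are hypothetical, and lock state is reduced to a plain dict per lock.

```python
RETRY = "RETRY"
OK = "OK"


class PartnerNode:
    """Holds the mirror lock data structures."""

    def __init__(self):
        self.mirror_locks = {}

    def receive_batch(self, batch):
        self.mirror_locks.update(batch)


class StorageNode:
    """Owns the original locks and syncs them to a partner in batches."""

    def __init__(self, retry_after_s=0.05):
        self.locks = {}        # lock id -> state variables
        self.syncing = set()   # lock ids currently being transferred
        self.retry_after_s = retry_after_s

    def begin_sync(self, lock_ids):
        """Select the locks to synchronize and snapshot their state as one batch."""
        self.syncing = set(lock_ids)
        return {lock_id: dict(self.locks[lock_id]) for lock_id in lock_ids}

    def end_sync(self):
        self.syncing.clear()

    def modify_lock(self, lock_id, **changes):
        """Apply a client's lock change, or ask the client to retry later."""
        if lock_id in self.syncing:
            return RETRY, self.retry_after_s
        self.locks.setdefault(lock_id, {}).update(changes)
        return OK, 0.0


# Usage: a client request that arrives mid-sync is told to retry.
node, partner = StorageNode(), PartnerNode()
node.modify_lock("file1", owner="client-a")
batch = node.begin_sync(["file1"])
during_sync = node.modify_lock("file1", owner="client-b")
partner.receive_batch(batch)
node.end_sync()
after_sync = node.modify_lock("file1", owner="client-b")
```

Returning a retry code instead of blocking keeps the snapshot consistent with what the partner receives, at the cost of a short client-side delay.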