Patents by Inventor Wenguang Wang

Wenguang Wang has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

Scale out chunk store to multiple nodes to allow concurrent deduplication

Patent number: 11055265

Abstract: The present disclosure provides techniques for scaling out deduplication of files among a plurality of nodes. The techniques include designating a master component for the coordination of deduplication. The master component divides files to be deduplicated among several slave nodes, and provides to each slave node a set of unique identifiers that are to be assigned to chunks during the deduplication process. The techniques herein preserve integrity of the deduplication process that has been scaled out among several nodes. The scaled out deduplication process deduplicates files faster by allowing several deduplication modules to work in parallel to deduplicate files.

Type: Grant

Filed: August 27, 2019

Date of Patent: July 6, 2021

Assignee: VMware, Inc.

Inventors: Wenguang Wang, Junlong Gao, Marcos K. Aguilera, Richard P. Spillane, Christos Karamanolis, Maxime Austruy
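
The coordination step in the abstract above (a master component dividing files among nodes and giving each node its own set of unique chunk identifiers) can be illustrated with a minimal sketch. It is not the patented implementation: the function name `plan_scale_out`, the round-robin file split, and the fixed-size contiguous ID ranges are all assumptions made for illustration, and "worker" stands in for the abstract's slave nodes.

```python
from typing import Dict, List, Tuple


def plan_scale_out(
    files: List[str],
    workers: List[str],
    ids_per_worker: int,
    first_id: int = 0,
) -> Dict[str, Tuple[List[str], range]]:
    """Hypothetical master-side planning step.

    Splits the files round-robin across workers and reserves a disjoint,
    contiguous range of chunk IDs for each worker, so that no two workers
    can assign the same ID while deduplicating in parallel.
    """
    if not workers:
        raise ValueError("at least one worker is required")
    plan: Dict[str, Tuple[List[str], range]] = {}
    for i, worker in enumerate(workers):
        share = files[i::len(workers)]           # every len(workers)-th file
        start = first_id + i * ids_per_worker    # ranges never overlap
        plan[worker] = (share, range(start, start + ids_per_worker))
    return plan


if __name__ == "__main__":
    for worker, (share, ids) in plan_scale_out(
        ["a", "b", "c", "d", "e"], ["n1", "n2"], ids_per_worker=100
    ).items():
        print(worker, share, ids)
```

Handing out ID ranges up front is one simple way to keep identifiers unique without per-chunk coordination; the abstract only states that each node receives a set of unique identifiers, not how that set is chosen.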
SYSTEM AND METHOD OF A HIGHLY CONCURRENT CACHE REPLACEMENT ALGORITHM

Publication number: 20210141728

Abstract: Disclosed are a method and system for managing multi-threaded concurrent access to a cache data structure. The cache data structure includes a hash table and three queues. The hash table includes a list of elements for each hash bucket with each hash bucket containing a mutex object and elements in each of the queues containing lock objects. Multiple threads can each lock a different hash bucket to have access to the list, and multiple threads can each lock a different element in the queues. The locks permit highly concurrent access to the cache data structure without conflict. Also, atomic operations are used to obtain pointers to elements in the queues so that a thread can safely advance each pointer. Race conditions that are encountered with locking an element in the queues or entering an element into the hash table are detected, and the operation encountering the race condition is retried.

Type: Application

Filed: November 11, 2019

Publication date: May 13, 2021

Inventors: Wenguang WANG, Mounesh BADIGER, Abhay Kumar JAIN, Junlong GAO, Zhaohui GUO, Richard P. SPILLANE
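
One ingredient of the abstract above, a hash table in which each bucket carries its own mutex so that threads touching different buckets do not block each other, can be sketched as follows. This covers only the per-bucket locking; the three queues, the atomic pointer operations, and the retry-on-race logic from the abstract are not modeled. The class name `StripedTable` and its methods are hypothetical.

```python
import threading
from typing import Any, Hashable, List, Tuple


class StripedTable:
    """Hash table with one lock per bucket (illustrative sketch only)."""

    def __init__(self, num_buckets: int = 16) -> None:
        self._buckets: List[List[Tuple[Hashable, Any]]] = [
            [] for _ in range(num_buckets)
        ]
        self._locks = [threading.Lock() for _ in range(num_buckets)]

    def _index(self, key: Hashable) -> int:
        return hash(key) % len(self._buckets)

    def put(self, key: Hashable, value: Any) -> None:
        i = self._index(key)
        with self._locks[i]:                 # only this bucket is locked
            bucket = self._buckets[i]
            for j, (existing, _) in enumerate(bucket):
                if existing == key:
                    bucket[j] = (key, value)  # overwrite existing entry
                    return
            bucket.append((key, value))

    def get(self, key: Hashable, default: Any = None) -> Any:
        i = self._index(key)
        with self._locks[i]:
            for existing, value in self._buckets[i]:
                if existing == key:
                    return value
        return default
```

Two threads calling `put` on keys that hash to different buckets acquire different locks and proceed concurrently; contention arises only when they land in the same bucket.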
STORAGE OF KEY-VALUE ENTRIES IN A DISTRIBUTED STORAGE SYSTEM

Publication number: 20210117443

Abstract: A distributed storage system, such as a distributed storage system in a virtualized computing environment, stores data in storage nodes as immutable key-value entries. A coordinator storage node creates a key-value entry and attempts to store the key-value entry in the coordinator storage node and in neighbor storage nodes. If the storage of the key-value entry in the coordinator storage node and in the neighbor storage node is successful, the coordinator storage node pushes the key-value entry to other storage nodes in the distributed storage system for storage as replicas.

Type: Application

Filed: October 21, 2019

Publication date: April 22, 2021

Applicant: VMware, Inc.

Inventors: Haoran ZHENG, Wenguang WANG, Tao XIE, Yizheng CHEN
Mirrored write ahead logs for data storage system

Patent number: 10977143

Abstract: A data storage system and a method for managing transaction requests to the data storage system utilize an active write ahead log and a standby write ahead log to apply the transaction requests to a storage data structure stored in a storage system of the data storage system.

Type: Grant

Filed: December 15, 2017

Date of Patent: April 13, 2021

Assignee: VMware, Inc.

Inventors: Abhishek Gupta, Richard P. Spillane, Kapil Chowksey, Rob Johnson, Wenguang Wang
UNBALANCED STORAGE RESOURCE USAGE CONFIGURATION FOR DISTRIBUTED STORAGE SYSTEMS

Publication number: 20210103410

Abstract: Example methods are provided for unbalanced storage resource usage configuration for a distributed storage system in a virtualized computing environment. An example method may include obtaining usage data associated with multiple storage resources forming the distributed storage system. The multiple storage resources are supported by the multiple hosts. Based on the usage data, the method may further include determining a higher usage set and a lower usage set of one or more storage resources from the multiple storage resources and configuring the multiple hosts to use the multiple storage resources in an unbalanced manner by using the higher usage set of one or more storage resources at a higher usage level compared to the lower usage set of one or more storage resources.

Type: Application

Filed: November 30, 2020

Publication date: April 8, 2021

Applicant: VMware, Inc.

Inventors: ZONGLIANG LI, WENGUANG WANG, CHRISTIAN DICKMANN, MANSI SHAH, TAO XIE, YE ZHANG
CONTAINER RUNTIME IMAGE MANAGEMENT ACROSS THE CLOUD

Publication number: 20210075855

Abstract: Examples disclosed herein relate to propagating changes made on a file system volume of a primary cluster of nodes to the same file system volume also being managed by a secondary cluster of nodes. An application is executed on both clusters, and data changes on the primary cluster are mirrored to the secondary cluster using an exo-clone file. The exo-clone file includes the differences between two or more snapshots of the volume on the primary cluster, along with identifiers of the change blocks and (optionally) state information thereof. Just these changes, identifiers, and state information are packaged in the exo-clone file and then exported to the secondary cluster, which in turn makes the changes to its version of the volume. Exporting just the changes to the data blocks and the corresponding block identifiers drastically reduces the information needed to be exchanged and processed to keep the two volumes consistent.

Type: Application

Filed: October 2, 2020

Publication date: March 11, 2021

Inventors: Richard Spillane, Yunshan Luke Lu, Wenguang Wang, Maxime Austruy, Christos Karamanolis, Rawlinson Rivera
SMALL IN-MEMORY CACHE TO SPEED UP CHUNK STORE OPERATION FOR DEDUPLICATION

Publication number: 20210064581

Abstract: The present disclosure provides techniques for deduplicating files. The techniques include creating a cache or subset of a large data structure. The large data structure organizes information by random hash values. The random hash values result in a random organization of information within the data structure, with the information spanning a large number of storage blocks within a storage system. The cache, however, is within memory and is small relative to the data structure. The cache is created so as to contain information that is likely to be needed during deduplication of a file. Having needed information within memory rather than in storage results in faster read and write operations to that information, improving the performance of a computing system.

Type: Application

Filed: August 27, 2019

Publication date: March 4, 2021

Inventors: Wenguang WANG, Junlong GAO, Marcos K. AGUILERA, Richard P. SPILLANE, Christos KARAMANOLIS, Maxime AUSTRUY
PROBABILISTIC ALGORITHM TO CHECK WHETHER A FILE IS UNIQUE FOR DEDUPLICATION

Publication number: 20210064579

Abstract: Disclosed techniques include deduplication. Techniques include determining whether a file is unique, and depending on whether the file is unique, deduplicating only part of the file or the entire file. The techniques include processing the first chunk of a file to determine whether the hash of the chunk is already within a chunk hash table, and if not, then a percentage of chunks of the file is similarly processed. If any of the hashes of chunks are already in the chunk hash table, then at least some of the file has been previously deduplicated, and the file is not unique within the storage system. If none of the processed chunks have a hash that is already in the chunk hash table, then the file is considered to be unique within the chunk store and only a partial percentage of the file's chunks are deduplicated. Not all of a unique file's chunks are deduplicated.

Type: Application

Filed: August 27, 2019

Publication date: March 4, 2021

Inventors: Wenguang WANG, Junlong GAO, Marcos K. AGUILERA, Richard P. SPILLANE, Christos KARAMANOLIS, Maxime AUSTRUY
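
The uniqueness check in the abstract above (test the first chunk, then a percentage of the remaining chunks, against a chunk hash table) can be sketched as below. This is an illustration under assumptions, not the patented method: SHA-256 as the chunk hash, a Python `set` as the chunk hash table, random sampling as the way the percentage is chosen, and the names `chunk_hash`, `looks_unique`, and `sample_fraction` are all hypothetical.

```python
import hashlib
import random
from typing import List, Optional, Set


def chunk_hash(chunk: bytes) -> str:
    """Hash one chunk (SHA-256 is an assumption; the abstract names no hash)."""
    return hashlib.sha256(chunk).hexdigest()


def looks_unique(
    chunks: List[bytes],
    chunk_hash_table: Set[str],
    sample_fraction: float = 0.1,
    rng: Optional[random.Random] = None,
) -> bool:
    """Return False as soon as any probed chunk is already known.

    Probes the first chunk, then a random sample of the remaining chunks.
    A True result is probabilistic: unprobed chunks may still be duplicates.
    """
    if not chunks:
        return True
    if chunk_hash(chunks[0]) in chunk_hash_table:
        return False
    rest = chunks[1:]
    if not rest:
        return True
    rng = rng or random.Random(0)
    sample_size = min(len(rest), max(1, round(len(rest) * sample_fraction)))
    for chunk in rng.sample(rest, sample_size):
        if chunk_hash(chunk) in chunk_hash_table:
            return False
    return True
```

A caller would then deduplicate the whole file when `looks_unique` returns `False`, and only part of it when it returns `True`, matching the two outcomes the abstract describes.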
EFFICIENT GARBAGE COLLECTION OF VARIABLE SIZE CHUNKING DEDUPLICATION

Publication number: 20210064522

Abstract: The present disclosure provides techniques for deallocating previously allocated storage blocks. The techniques include obtaining a list of chunk IDs to analyze, choosing a chunk ID, and determining the storage blocks spanned by the chunk corresponding to the chosen chunk ID. The technique further includes determining whether any file references any storage blocks spanned by the chunk. The determining may be performed by comparing an internal reference count to a total reference count, where the internal reference count is the number of references to the storage block by a chunk ID data structure. If no files reference any of the storage blocks spanned by the chunk, then all the storage blocks of the chunk can be deallocated.

Type: Application

Filed: August 27, 2019

Publication date: March 4, 2021

Inventors: Wenguang WANG, Junlong GAO, Marcos K. AGUILERA, Richard P. SPILLANE, Christos KARAMANOLIS, Maxime AUSTRUY
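
The reference-count comparison in the abstract above can be sketched as follows: a storage block has no file references when its total reference count equals its internal count (the references held by the chunk ID data structure). The dictionaries standing in for the on-disk structures and the function name `blocks_to_free` are assumptions for illustration only.

```python
from typing import Dict, Iterable, List, Set


def blocks_to_free(
    chunk_ids: Iterable[int],
    chunk_blocks: Dict[int, List[int]],
    total_refs: Dict[int, int],
    internal_refs: Dict[int, int],
) -> Set[int]:
    """Return the storage blocks that can be deallocated.

    chunk_blocks maps a chunk ID to the storage blocks the chunk spans.
    A chunk's blocks are freed only if, for every block it spans, the
    total reference count equals the internal reference count, i.e. the
    only remaining references come from the chunk ID data structure.
    """
    freed: Set[int] = set()
    for chunk_id in chunk_ids:
        blocks = chunk_blocks[chunk_id]
        if all(total_refs[b] == internal_refs[b] for b in blocks):
            freed.update(blocks)
    return freed
```

In this sketch a block shared with a chunk that some file still uses has `total_refs` greater than `internal_refs`, so neither chunk's blocks are released.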
ORGANIZE CHUNK STORE TO PRESERVE LOCALITY OF HASH VALUES AND REFERENCE COUNTS FOR DEDUPLICATION

Publication number: 20210064582

Abstract: The present disclosure provides techniques for deduplicating files. The techniques include creating a data structure that organizes metadata about chunks of files, the organization of the metadata preserving order and locality of the chunks within files. The organization of the metadata within storage blocks of storage devices matches the order of chunks within files. Upon a read or write operation to metadata, the preservation of locality of metadata results in the likely fetching, from storage into a memory cache, of metadata of subsequent and contiguous chunks. The preserved locality results in faster subsequent read and write operations of metadata, because the read and write operations are likely to be executed from memory rather than from storage.

Type: Application

Filed: August 27, 2019

Publication date: March 4, 2021

Inventors: Wenguang WANG, Junlong GAO, Marcos K. AGUILERA, Richard P. SPILLANE, Christos KARAMANOLIS, Maxime AUSTRUY
FAST ALGORITHM TO FIND FILE SYSTEM DIFFERENCE FOR DEDUPLICATION

Publication number: 20210064580

Abstract: The disclosure provides techniques for deduplicating files. The techniques include, upon creating or modifying a file, placing a logical timestamp of the current logical time within a queue associated with the directory of the file. The techniques further include placing the logical timestamp within a queue of each parent directory of the directory of the file. To determine a set of files for deduplication, the techniques disclosed herein identify files that have been modified within a logical time range. The set of files modified within a logical time range is identified by traversing directories of a storage system, the directories being organized within a tree structure. If a directory's queue does not contain a timestamp that is within the logical time range, then all child directories can be skipped over for further processing, such that no files within the child directories end up being within the set of files for deduplication.

Type: Application

Filed: August 27, 2019

Publication date: March 4, 2021

Inventors: Junlong GAO, Wenguang WANG, Marcos K. AGUILERA, Richard P. SPILLANE, Christos KARAMANOLIS, Maxime AUSTRUY
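
The traversal in the abstract above can be sketched with a small in-memory directory tree: every file modification records a logical timestamp in the file's directory and in each ancestor, so a later scan can skip any subtree whose queue has no timestamp in the requested range. The `Directory` class, its unbounded list used as the queue, and the function `modified_files` are hypothetical names for illustration, not the patented data structures.

```python
from typing import Dict, List, Optional


class Directory:
    """Directory node holding a queue of logical timestamps (sketch)."""

    def __init__(self, name: str = "", parent: Optional["Directory"] = None):
        self.parent = parent
        self.path = "" if parent is None else f"{parent.path}/{name}"
        self.children: List["Directory"] = []
        self.files: Dict[str, int] = {}   # file name -> last logical time
        self.stamps: List[int] = []       # timestamp queue for this subtree

    def mkdir(self, name: str) -> "Directory":
        child = Directory(name, parent=self)
        self.children.append(child)
        return child

    def touch(self, file_name: str, logical_time: int) -> None:
        """Record a create/modify and propagate the timestamp upward."""
        self.files[file_name] = logical_time
        node: Optional["Directory"] = self
        while node is not None:
            node.stamps.append(logical_time)
            node = node.parent


def modified_files(root: Directory, lo: int, hi: int) -> List[str]:
    """Collect files modified in [lo, hi], pruning untouched subtrees."""
    if not any(lo <= t <= hi for t in root.stamps):
        return []                          # skip this directory and below
    found = [
        f"{root.path}/{name}"
        for name, t in root.files.items()
        if lo <= t <= hi
    ]
    for child in root.children:
        found.extend(modified_files(child, lo, hi))
    return found
```

The pruning check is what makes the scan cheap: a directory whose queue holds no timestamp in the range cannot contain a qualifying file anywhere beneath it, because every modification was also recorded in all of its ancestors.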
SCALE OUT CHUNK STORE TO MULTIPLE NODES TO ALLOW CONCURRENT DEDUPLICATION

Publication number: 20210064589

Abstract: The present disclosure provides techniques for scaling out deduplication of files among a plurality of nodes. The techniques include designating a master component for the coordination of deduplication. The master component divides files to be deduplicated among several slave nodes, and provides to each slave node a set of unique identifiers that are to be assigned to chunks during the deduplication process. The techniques herein preserve integrity of the deduplication process that has been scaled out among several nodes. The scaled out deduplication process deduplicates files faster by allowing several deduplication modules to work in parallel to deduplicate files.

Type: Application

Filed: August 27, 2019

Publication date: March 4, 2021

Inventors: Wenguang WANG, Junlong GAO, Marcos K. AGUILERA, Richard P. SPILLANE, Christos KARAMANOLIS, Maxime AUSTRUY
Systems and methods for performing scalable Log-Structured Merge (LSM) tree compaction using sharding

Patent number: 10909102

Abstract: Certain aspects provide systems and methods of compacting data within a log-structured merge tree (LSM tree) using sharding. In certain aspects, a method includes determining a size of the LSM tree, determining a compaction time for a compaction of the LSM tree based on the size, determining a number of compaction entities for performing the compaction in parallel based on the compaction time, determining a number of shards based on the number of compaction entities, and determining a key range associated with the LSM tree. The method further comprises dividing the key range by the number of shards into a number of sub key ranges, wherein each of the number of sub key ranges corresponds to a shard of the number of shards and assigning the number of shards to the number of compaction entities for compaction.

Type: Grant

Filed: December 6, 2018

Date of Patent: February 2, 2021

Assignee: VMware, Inc.

Inventors: Wenguang Wang, Richard P. Spillane, Junlong Gao, Robert T. Johnson, Christos Karamanolis, Maxime Austruy
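
The planning chain in the abstract above (tree size, then compaction time, then number of compaction entities, then shards, then sub key ranges) can be sketched as follows. The abstract does not give formulas, so everything numeric here is an assumption: integer keys in a half-open range, a constant-throughput time estimate, a target duration, and the helper names `plan_entities`, `split_key_range`, and `assign_shards`.

```python
import math
from typing import Dict, List, Tuple

KeyRange = Tuple[int, int]  # half-open [start, end)


def plan_entities(
    tree_size_bytes: int, throughput_bytes_per_s: float, target_seconds: float
) -> int:
    """Assumed model: time = size / throughput; add entities to hit the target."""
    estimated_seconds = tree_size_bytes / throughput_bytes_per_s
    return max(1, math.ceil(estimated_seconds / target_seconds))


def split_key_range(lo: int, hi: int, num_shards: int) -> List[KeyRange]:
    """Divide [lo, hi) into num_shards contiguous, non-overlapping sub-ranges."""
    if num_shards <= 0 or hi <= lo:
        raise ValueError("need a non-empty range and at least one shard")
    span = hi - lo
    bounds = [lo + span * i // num_shards for i in range(num_shards + 1)]
    return [(bounds[i], bounds[i + 1]) for i in range(num_shards)]


def assign_shards(
    shards: List[KeyRange], num_entities: int
) -> Dict[int, List[KeyRange]]:
    """Assign shards to compaction entities round-robin."""
    assignment: Dict[int, List[KeyRange]] = {e: [] for e in range(num_entities)}
    for i, shard in enumerate(shards):
        assignment[i % num_entities].append(shard)
    return assignment


if __name__ == "__main__":
    entities = plan_entities(1000, throughput_bytes_per_s=10, target_seconds=25)
    shards = split_key_range(0, 100, num_shards=entities)
    print(assign_shards(shards, entities))
```

Because the sub-ranges do not overlap, each entity can compact its shards independently of the others, which is what allows the compaction to run in parallel.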
ZERO COPY METHOD THAT CAN SPAN MULTIPLE ADDRESS SPACES FOR DATA PATH APPLICATIONS

Publication number: 20210011855

Abstract: A system and method for transferring data between a user space buffer in the address space of a user space process running on a virtual machine and a storage system are described. The user space buffer is represented as a file with a file descriptor. In the method, a file system proxy receives a request for I/O read or write from the user space process without copying data to be transferred. The file system proxy then sends the request to a file system server without copying data to be transferred. The file system server then requests that the storage system perform the requested I/O directly between the storage system and the user space buffer, the only transfer of data being between the storage system and the user space buffer.

Type: Application

Filed: August 27, 2019

Publication date: January 14, 2021

Inventors: KAMAL JEET CHARAN, Adrian Drzewiecki, Mounesh Badiger, Pushpesh Sharma, Wenguang Wang, Maxime Austruy, Richard P. Spillane
Distributed, scalable key-value store

Patent number: 10891264

Abstract: Techniques for implementing a distributed, scalable key-value store (DSKVS) across a plurality of nodes are provided. In one embodiment, each node in the plurality of nodes can store: (1) a hash table in a nonvolatile storage of the node, where the hash table is configured to hold a partition of a total set of key-value data maintained by the DSKVS; (2) a logical log in the nonvolatile storage, where the logical log is configured to hold transaction log records corresponding to key-value update operations performed on the node; and (3) a cache in a volatile memory of the node, where the cache is configured to hold key-value data that has been recently updated on the node via one or more of the key-value update operations.

Type: Grant

Filed: April 30, 2015

Date of Patent: January 12, 2021

Assignee: VMWARE, INC.

Inventors: Wenguang Wang, Radu Berinde
FAILURE ANALYSIS SYSTEM FOR A DISTRIBUTED STORAGE SYSTEM

Publication number: 20200409810

Abstract: A failure analysis system identifies a root cause of a failure (or other health issue) in a virtualized computing environment and provides a recommendation for remediation. The failure analysis system uses a model-based reasoning (MBR) approach that involves building a model describing the relationships/dependencies of elements in the various layers of the virtualized computing environment, and the model is used by an inference engine to generate facts and rules for reasoning to identify an element in the virtualized computing environment that is causing the failure. Then, the failure analysis system uses a decision tree analysis (DTA) approach to perform a deep diagnosis of the element, by traversing a decision tree that was generated by combining the rules for reasoning provided by the MBR approach, in conjunction with examining data collected by health monitors. The result of the DTA approach is then used to generate the recommendation for remediation.

Type: Application

Filed: August 14, 2019

Publication date: December 31, 2020

Applicant: VMware, Inc.

Inventors: YU WU, YANG YANG, XIANG YU, WENGUANG WANG, JIN FENG
Unbalanced storage resource usage configuration for distributed storage systems

Patent number: 10866762

Abstract: Example methods are provided for unbalanced storage resource usage configuration for a distributed storage system in a virtualized computing environment. The method may comprise: obtaining usage data associated with multiple storage resources forming the distributed storage system; and based on the usage data, determining a higher usage set and a lower usage set from the multiple storage resources. The method may also comprise configuring the multiple hosts to use the multiple storage resources in an unbalanced manner by using the higher usage set at a higher usage level compared to the lower usage set.

Type: Grant

Filed: July 25, 2018

Date of Patent: December 15, 2020

Assignee: VMWARE, INC.

Inventors: Zongliang Li, Wenguang Wang, Christian Dickmann, Mansi Shah, Tao Xie, Ye Zhang
Management of applications across nodes using exo-clones

Patent number: 10812582

Abstract: Examples disclosed herein relate to propagating changes made on a file system volume of a primary cluster of nodes to the same file system volume also being managed by a secondary cluster of nodes. An application is executed on both clusters, and data changes on the primary cluster are mirrored to the secondary cluster using an exo-clone file. The exo-clone file includes the differences between two or more snapshots of the volume on the primary cluster, along with identifiers of the change blocks and (optionally) state information thereof. Just these changes, identifiers, and state information are packaged in the exo-clone file and then exported to the secondary cluster, which in turn makes the changes to its version of the volume. Exporting just the changes to the data blocks and the corresponding block identifiers drastically reduces the information needed to be exchanged and processed to keep the two volumes consistent.

Type: Grant

Filed: June 23, 2016

Date of Patent: October 20, 2020

Assignee: VMware, Inc.

Inventors: Richard Spillane, Yunshan Luke Lu, Wenguang Wang, Maxime Austruy, Christos Karamanolis, Rawlinson Rivera
Method of rebuilding real world storage environment

Patent number: 10789139

Abstract: A method for replicating a first virtual storage system of a customer includes receiving periodically collected configuration data, workload data, service failure data, and management workflow data on the first virtual storage system, creating a first multi-dimensional array of observed variables based on periodically collected data, applying dimensionality reduction to the first multi-dimensional array to determine an artificial variable having a largest variance, determining a smaller, second multi-dimensional array that represents the first multi-dimensional array based on the artificial variable, and building a second virtual storage system to replicate the first virtual storage system based on the second multi-dimensional array.

Type: Grant

Filed: April 12, 2018

Date of Patent: September 29, 2020

Assignee: VMWARE, INC.

Inventors: Yu Wu, Wenguang Wang, Sifan Liu, Jin Feng
Multiple data storage management with reduced latency

Patent number: 10776045

Abstract: System and method for managing multiple data storages using a file system of a computer system utilize a primary data storage to cache objects of logical object containers stored in a secondary data storage in caching-tier volumes. When an access request for an object stored in the secondary data storage is received at the file system and the object is not currently cached in the primary data storage, a caching-tier volume in the primary data storage is created that corresponds to a logical object container in the secondary data storage that includes the requested object. The caching-tier volume is used to cache the object as an inflated file so that the inflated file is available at the primary data storage in the caching-tier volume for a subsequent access request for the object stored in the secondary data storage.

Type: Grant

Filed: August 1, 2017

Date of Patent: September 15, 2020

Assignee: VMware, Inc.

Inventors: Richard P. Spillane, Wenguang Wang, Abhishek Gupta, Maxime Austruy, Christos Karamanolis
