Patents by Inventor Wenguang Wang

Wenguang Wang has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
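For several of the entries below, short illustrative code sketches follow at the end of this listing.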

  • Publication number: 20210311651
    Abstract: Techniques for supporting large segments when issuing writes to an erasure coded storage object in a distributed storage system are provided. In one set of embodiments, a node of the system can receive a write request for updating a logical data block of the storage object, write data/metadata for the block to a record in a data log of a metadata object of the storage object (where the metadata object is stored on a performance storage tier), and determine whether the data log has accumulated a threshold number of records. If so, the node can further allocate an in-memory bank, place the data from the data log records into free slots of the bank, allocate a segment in a capacity object of the storage object for holding contents of the bank (where the capacity object is stored on a capacity storage tier), and write the bank contents via a full stripe write to the allocated segment.
    Type: Application
    Filed: April 7, 2020
    Publication date: October 7, 2021
    Inventors: Wenguang Wang, Vamsi Gunturu
  • Publication number: 20210311652
    Abstract: Techniques for supporting large segments when issuing writes to an erasure coded storage object in a distributed storage system are provided. In one set of embodiments, a node of the system can pre-allocate a segment of space in a capacity object of the storage object, receive a write request for updating a logical data block of the storage object, write data/metadata for the block to a record in a data log of a metadata object of the storage object, place the block in an in-memory bank, and determine whether the in-memory bank has become full. If so, the node can compute/fill-in one or more parity blocks for each stripe of the storage object in the in-memory bank and write, based on a next sub-segment pointer pointing to a free sub-segment of the pre-allocated segment, the contents of the in-memory bank via a full stripe write to the free sub-segment.
    Type: Application
    Filed: April 7, 2020
    Publication date: October 7, 2021
    Inventors: Wenguang Wang, Vamsi Gunturu
  • Publication number: 20210311653
    Abstract: Techniques for issuing efficient writes to an erasure coded storage object in a distributed storage system are provided. In one set of embodiments, a node of the system can receive a write request for updating a logical data block of the storage object, write data/metadata for the block to a record in a data log of a metadata object of the storage object (where the metadata object is stored on a performance storage tier), place the block data in a free slot of an in-memory bank, and determine whether the in-memory bank has become full. If the in-memory bank is full, the node can further allocate a segment in a capacity object of the storage object for holding contents of the in-memory bank (where the capacity object is stored on a capacity storage tier), and write the in-memory bank contents via a full stripe write to the allocated segment.
    Type: Application
    Filed: April 7, 2020
    Publication date: October 7, 2021
    Inventors: Wenguang Wang, Vamsi Gunturu, Eric Knauft, Pascal Renauld
  • Publication number: 20210311919
    Abstract: Techniques for reducing data log recovery time and metadata write amplification when checkpointing a data log of a storage object in a distributed storage system are provided. In one set of embodiments, a node of the system can determine whether the data log has reached a first threshold size, where the data log comprises a plurality of data log records, and where each data log record includes data and metadata for a write request directed to the storage object. If the data log has reached the first threshold size, the node can copy, from each of the plurality of data log records, the metadata for the write request to a corresponding metadata log entry in a metadata log of the storage object. The node can then truncate the data log by removing the plurality of data log records.
    Type: Application
    Filed: April 7, 2020
    Publication date: October 7, 2021
    Inventors: Wenguang Wang, Vamsi Gunturu, Eric Knauft
  • Publication number: 20210294495
    Abstract: A method for generating one or more hashes for one or more data blocks is provided. The method receives a data block to write on at least one physical disk of a set of physical disks associated with a set of host machines. The method then calculates a hash for the received data block and writes a first entry to a data log in a cache disk, the first entry comprising a first header and data indicative of the received block, the first header comprising the hash. The method further writes the data to the at least one physical disk as part of data blocks of a stripe, and stores the hash in a summary block on the at least one physical disk. The summary block is associated with the data blocks of the stripe stored on the at least one physical disk.
    Type: Application
    Filed: March 23, 2020
    Publication date: September 23, 2021
    Inventors: Wenguang WANG, Vamsi GUNTURU
  • Publication number: 20210294499
    Abstract: A method for performing write operations on a set of one or more physical disks of a set of one or more host machines is provided. The method receives a data block to write on at least one physical disk in the set of physical disks and generates a first set of one or more compressed sectors based on the received data block. The method writes (i) a first entry having a first header and the first set of compressed sectors to a data log that is maintained in a cache, and (ii) the first set of compressed sectors to a bank in memory. The method further determines if a size of data including compressed sectors in the bank satisfies a threshold, and when the size of data in the bank satisfies the threshold, writes the data to the at least one physical disk in the set of physical disks.
    Type: Application
    Filed: March 23, 2020
    Publication date: September 23, 2021
    Inventors: Wenguang WANG, Vamsi GUNTURU
  • Publication number: 20210294502
    Abstract: A method for encrypting data in one or more data blocks is provided. The method receives a first data block to be written to a physical storage that includes one or more physical disks. The method applies a first random tweak to data indicative of the first data block to generate a first encrypted data block, and writes the first encrypted data block and the first random tweak to a first physical block of the physical storage. The method receives a second data block to be written to the physical storage. The method then applies a second random tweak, different than the first random tweak, to data indicative of the second data block to generate a second encrypted data block, and writes the second encrypted data block and the second random tweak to a second physical block of the physical storage.
    Type: Application
    Filed: March 23, 2020
    Publication date: September 23, 2021
    Inventors: Wenguang WANG, Eric KNAUFT, Vamsi GUNTURU, Pascal RENAULD
  • Patent number: 11093464
    Abstract: Solutions are disclosed for managing blocks in a multi-writer log-structured file system. Solutions include selecting candidate segments in a storage medium; reading blocks of the candidate segments; determining whether any blocks are duplicates; updating a reference count for the duplicate blocks; identifying unique blocks; writing at least a portion of the unique blocks to a log; determining whether the log has accumulated a full segment of data; based at least on determining that the log has accumulated a full segment of data, writing the full segment to the storage medium; updating a segment usage table (SUT) to mark the candidate segments as free; and updating the SUT to mark a segment of the storage medium as no longer free. Some examples identify a window start time and stop time, because older segments have been deduped and younger segments may be volatile. Some examples adjust the window to improve performance.
    Type: Grant
    Filed: April 24, 2020
    Date of Patent: August 17, 2021
    Assignee: VMware, Inc.
    Inventors: Wenguang Wang, Vamsi Gunturu
  • Patent number: 11093403
    Abstract: The disclosure provides a technique for reducing cache misses to a cache of a computer system. The technique includes deallocating memory pages of the cache from one process and allocating those memory pages to another process based on cache misses of each process during a given time period. Repeating the technique causes the total number of cache misses to gradually decrease to an optimum or near-optimum level. The repetition of the technique leads to a dynamic and flexible apportionment of cache memory pages to processes running within the computer system.
    Type: Grant
    Filed: December 4, 2018
    Date of Patent: August 17, 2021
    Assignee: VMware, Inc.
    Inventors: Wenguang Wang, Abhishek Srivastava, Ashish Kaila, Julien Freche
  • Patent number: 11093471
    Abstract: Embodiments herein are directed towards systems and methods for performing range lookups in Bε-trees. One example method involves receiving a request to return key-value pairs within a range of keys from the Bε-tree. The Bε-tree includes a plurality of nodes, each node being associated with a buffer that stores key-value pairs. The method further involves determining a fractional size of the range of keys. The method further involves, for each level of the Bε-tree, obtaining from within one or more buffers of one or more nodes of the level, a set of key-value pairs within the range of keys up to a size equal to the fractional size and transferring the set of key-value pairs to a result data structure. The method further involves sorting and merging all key-value pairs in the result data structure and returning the result data structure in response to the request.
    Type: Grant
    Filed: June 5, 2018
    Date of Patent: August 17, 2021
    Assignee: VMware, Inc.
    Inventors: Abhishek Gupta, Richard P. Spillane, Robert T. Johnson, Wenguang Wang, Kapil Chowksey, Jorge Guerra Delgado, Sandeep Rangaswamy, Srinath Premachandran
  • Patent number: 11093450
    Abstract: A Bε-tree associated with a file system on a storage volume includes a hierarchy of nodes. Each node includes a buffer portion to store key-value pairs as messages in the buffer. Each node can be characterized by having a maximum allowable size that is periodically updated at run time. The buffers in the nodes of the Bε-tree are therefore characterized by having a maximum allowed size that can vary over time.
    Type: Grant
    Filed: September 27, 2017
    Date of Patent: August 17, 2021
    Assignee: VMware, Inc.
    Inventors: Wenguang Wang, Abhishek Gupta, Richard P Spillane, Kapil Chowksey, Robert T Johnson
  • Patent number: 11093472
    Abstract: The disclosure herein describes providing and accessing data on an object storage platform using a log-structured merge (LSM) tree file system. The LSM tree file system on the object storage platform includes sorted data tables, each sorted data table including a payload portion and an index portion. Data is written to the LSM tree file system in at least one new sorted data table. Data is read by identifying a data location of the data based on index portions of the sorted data tables and reading the data from a sorted data table associated with the identified data location. The use of the LSM tree file system on the object storage platform provides an efficient means for interacting with the data stored thereon.
    Type: Grant
    Filed: December 7, 2018
    Date of Patent: August 17, 2021
    Assignee: VMware, Inc.
    Inventors: Richard P. Spillane, Wenguang Wang, Junlong Gao, Robert T. Johnson, Christos Karamanolis, Maxime Austruy
  • Patent number: 11088896
    Abstract: A data communication channel between a client and a service is preserved through a failure of the service by maintaining a request log and an in-flight request queue in a protected memory region that preserves the contents of the request log and the in-flight request queue even when the service encounters a failure. The method of restarting the data communication channel includes, upon the service being restarted following the failure of the service, determining whether the request log contains requests and, if so, copying the requests from the request log into the in-flight request queue and then removing the copied requests from the request log. The requests in the in-flight request queue, which include any that were in the in-flight request queue at the time of the failure of the service and any that were copied from the request log, are then processed.
    Type: Grant
    Filed: July 17, 2017
    Date of Patent: August 10, 2021
    Assignee: VMware, Inc.
    Inventors: Wenguang Wang, Xiaoyun Gong
  • Patent number: 11086779
    Abstract: Disclosed are a method and system for managing multi-threaded concurrent access to a cache data structure. The cache data structure includes a hash table and three queues. The hash table includes a list of elements for each hash bucket with each hash bucket containing a mutex object and elements in each of the queues containing lock objects. Multiple threads can each lock a different hash bucket to have access to the list, and multiple threads can each lock a different element in the queues. The locks permit highly concurrent access to the cache data structure without conflict. Also, atomic operations are used to obtain pointers to elements in the queues so that a thread can safely advance each pointer. Race conditions that are encountered with locking an element in the queues or entering an element into the hash table are detected, and the operation encountering the race condition is retried.
    Type: Grant
    Filed: November 11, 2019
    Date of Patent: August 10, 2021
    Assignee: VMware, Inc.
    Inventors: Wenguang Wang, Mounesh Badiger, Abhay Kumar Jain, Junlong Gao, Zhaohui Guo, Richard P. Spillane
  • Patent number: 11080189
    Abstract: The present disclosure provides techniques for managing a cache of a computer system using a cache management data structure. The cache management data structure includes a cold queue, a ghost queue, and a hot queue. The techniques herein improve the functioning of the computer because management of the cache management data structure can be performed in parallel with multiple cores or multiple processors, because a sequential scan will only pollute (i.e., add unimportant memory pages to) the cold queue and, to an extent, the ghost queue, but not the hot queue, and also because the cache management data structure has lower memory requirements and lower CPU overhead on a cache hit than some prior art algorithms.
    Type: Grant
    Filed: January 24, 2019
    Date of Patent: August 3, 2021
    Assignee: VMware, Inc.
    Inventors: Wenguang Wang, Christoph Klee, Adrian Drzewiecki, Christos Karamanolis, Richard P. Spillane, Maxime Austruy
  • Patent number: 11074225
    Abstract: The disclosure herein describes synchronizing cached index copies at a first site with indexes of a log-structured merge (LSM) tree file system on an object storage platform at a second site. An indication that the LSM tree file system has been compacted based on a compaction process is received. A cached metadata catalog for the parent catalog version included in that indication is accessed at the first site. A set of cached index copies is identified at the first site based on the metadata of the cached metadata catalog. The compaction process is applied to the identified set of cached index copies and a compacted set of cached index copies is generated at the first site, whereby the compacted set of cached index copies is synchronized with a respective set of indexes of the plurality of sorted data tables of the LSM tree file system at the second site.
    Type: Grant
    Filed: December 21, 2018
    Date of Patent: July 27, 2021
    Assignee: VMware, Inc.
    Inventors: Wenguang Wang, Richard P. Spillane, Junlong Gao, Robert T. Johnson, Christos Karamanolis, Maxime Austruy
  • Patent number: 11061594
    Abstract: A method for encrypting data in one or more data blocks is provided. The method generates a fixed random tweak. The method receives first and second data blocks to write on at least one physical disk of a set of physical disks associated with a set of host machines. The method applies the fixed random tweak to data indicative of the first data block and data indicative of the second data block to generate, respectively, first and second encrypted data blocks. The method writes first and second entries to a data log in a cache, the first entry comprising a first header and the first encrypted data block and the second entry comprising a second header and the second encrypted data block. The method then writes the first and second encrypted data blocks to the at least one physical disk.
    Type: Grant
    Filed: March 23, 2020
    Date of Patent: July 13, 2021
    Assignee: VMware, Inc.
    Inventors: Wenguang Wang, Vamsi Gunturu
  • Patent number: 11055265
    Abstract: The present disclosure provides techniques for scaling out deduplication of files among a plurality of nodes. The techniques include designating a master component for the coordination of deduplication. The master component divides files to be deduplicated among several slave nodes, and provides to each slave node a set of unique identifiers that are to be assigned to chunks during the deduplication process. The techniques herein preserve integrity of the deduplication process that has been scaled out among several nodes. The scaled out deduplication process deduplicates files faster by allowing several deduplication modules to work in parallel to deduplicate files.
    Type: Grant
    Filed: August 27, 2019
    Date of Patent: July 6, 2021
    Assignee: VMware, Inc.
    Inventors: Wenguang Wang, Junlong Gao, Marcos K. Aguilera, Richard P. Spillane, Christos Karamanolis, Maxime Austruy
  • Publication number: 20210141728
    Abstract: Disclosed are a method and system for managing multi-threaded concurrent access to a cache data structure. The cache data structure includes a hash table and three queues. The hash table includes a list of elements for each hash bucket with each hash bucket containing a mutex object and elements in each of the queues containing lock objects. Multiple threads can each lock a different hash bucket to have access to the list, and multiple threads can each lock a different element in the queues. The locks permit highly concurrent access to the cache data structure without conflict. Also, atomic operations are used to obtain pointers to elements in the queues so that a thread can safely advance each pointer. Race conditions that are encountered with locking an element in the queues or entering an element into the hash table are detected, and the operation encountering the race condition is retried.
    Type: Application
    Filed: November 11, 2019
    Publication date: May 13, 2021
    Inventors: Wenguang WANG, Mounesh BADIGER, Abhay Kumar JAIN, Junlong GAO, Zhaohui GUO, Richard P. SPILLANE
  • Publication number: 20210117443
    Abstract: A distributed storage system, such as a distributed storage system in a virtualized computing environment, stores data in storage nodes as immutable key-value entries. A coordinator storage node creates a key-value entry and attempts to store the key-value entry in the coordinator storage node and in neighbor storage nodes. If the storage of the key-value entry in the coordinator storage node and in the neighbor storage nodes is successful, the coordinator storage node pushes the key-value entry to other storage nodes in the distributed storage system for storage as replicas.
    Type: Application
    Filed: October 21, 2019
    Publication date: April 22, 2021
    Applicant: VMware, Inc.
    Inventors: Haoran ZHENG, Wenguang WANG, Tao XIE, Yizheng CHEN
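
Illustrative code sketches

Publications 20210311651, 20210311652, and 20210311653 all revolve around staging small writes in a data log and an in-memory bank, then flushing the bank to a segment of the capacity object as a single full stripe write. The following is a minimal Python sketch of that flow; the block size, bank size, and the MetadataObject/CapacityObject/Node stand-ins are illustrative assumptions, not the patented implementation.

```python
BLOCK_SIZE = 4096      # assumed logical block size
BANK_SLOTS = 8         # assumed number of data blocks per full stripe

class MetadataObject:
    """Stand-in for the metadata object kept on the performance storage tier."""
    def __init__(self):
        self.data_log = []           # records of (lba, data, metadata)

class CapacityObject:
    """Stand-in for the erasure-coded capacity object on the capacity tier."""
    def __init__(self):
        self.segments = {}           # segment id -> list of banked blocks
        self._next_segment = 0

    def allocate_segment(self):
        seg = self._next_segment
        self._next_segment += 1
        return seg

    def full_stripe_write(self, seg, blocks):
        # A single large write covering the whole stripe avoids the
        # read-modify-write of parity that small erasure-coded writes need.
        self.segments[seg] = list(blocks)

class Node:
    def __init__(self):
        self.meta = MetadataObject()
        self.cap = CapacityObject()
        self.bank = []               # in-memory bank of pending blocks

    def write_block(self, lba, data):
        # 1. Record data and metadata in the data log of the metadata object.
        self.meta.data_log.append((lba, data, {"lba": lba}))
        # 2. Place the block data in a free slot of the in-memory bank.
        self.bank.append((lba, data))
        # 3. Once the bank is full, allocate a segment in the capacity object
        #    and write the bank contents as one full stripe write.
        if len(self.bank) == BANK_SLOTS:
            seg = self.cap.allocate_segment()
            self.cap.full_stripe_write(seg, self.bank)
            self.bank = []

node = Node()
for i in range(20):
    node.write_block(i, b"x" * BLOCK_SIZE)
print(len(node.cap.segments), "segments written;", len(node.bank), "blocks still banked")
```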
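Publication 20210311919 checkpoints the data log once it reaches a threshold size by copying only the metadata portion of each record into a metadata log and then truncating the data log. A minimal sketch of that idea, where the record layout and the threshold value are assumptions:

```python
DATA_LOG_THRESHOLD = 4   # assumed first threshold size, in records

data_log = []            # each record carries data and metadata for one write
metadata_log = []        # receives only the metadata of checkpointed records

def append_write(lba, data):
    data_log.append({"lba": lba, "data": data,
                     "meta": {"lba": lba, "len": len(data)}})
    maybe_checkpoint()

def maybe_checkpoint():
    # Once the data log reaches the threshold, copy each record's metadata
    # into the metadata log, then truncate the data log by removing the records.
    if len(data_log) >= DATA_LOG_THRESHOLD:
        for record in data_log:
            metadata_log.append(record["meta"])
        data_log.clear()

for i in range(10):
    append_write(i, b"payload")
print(len(metadata_log), "metadata entries;", len(data_log), "records left in the data log")
```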
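Publication 20210294495 hashes each incoming block, carries the hash in the data-log entry header, and also stores the hash in a summary block associated with the stripe. The sketch below uses SHA-256 purely for illustration; the in-memory stand-ins for the data log, stripe, and summary block are assumptions.

```python
import hashlib

data_log = []        # data log on the cache disk: (header, data) entries
stripe_blocks = []   # data blocks of the current stripe on the physical disk
summary_block = []   # hashes of the stripe's data blocks

def write_block(data: bytes):
    digest = hashlib.sha256(data).hexdigest()
    # The first entry goes to the data log; its header carries the hash.
    data_log.append(({"hash": digest}, data))
    # The data is then written to the physical disk as part of a stripe...
    stripe_blocks.append(data)
    # ...and the hash is stored in the summary block associated with that
    # stripe, so it can be consulted later without rereading the data.
    summary_block.append(digest)

write_block(b"hello")
write_block(b"world")
assert summary_block[0] == hashlib.sha256(b"hello").hexdigest()
print(summary_block)
```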
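Publication 20210294502 (and, with a fixed rather than per-block tweak, patent 11061594) stores a tweak next to each encrypted block so that identical plaintext blocks never produce identical ciphertext. The sketch below uses AES-XTS from the pyca/cryptography package as one concrete tweakable cipher; the choice of XTS, the key length, and the in-memory block map are illustrative assumptions, not the patented design.

```python
import os
from cryptography.hazmat.primitives.ciphers import Cipher, algorithms, modes

KEY = os.urandom(64)          # AES-256-XTS uses a double-length, 512-bit key
physical_blocks = {}          # physical block address -> (ciphertext, tweak)

def write_encrypted(pba: int, data: bytes):
    tweak = os.urandom(16)    # a fresh random tweak for every block written
    encryptor = Cipher(algorithms.AES(KEY), modes.XTS(tweak)).encryptor()
    ciphertext = encryptor.update(data) + encryptor.finalize()
    # The tweak is persisted next to the encrypted block so it can be
    # supplied again at decryption time.
    physical_blocks[pba] = (ciphertext, tweak)

def read_decrypted(pba: int) -> bytes:
    ciphertext, tweak = physical_blocks[pba]
    decryptor = Cipher(algorithms.AES(KEY), modes.XTS(tweak)).decryptor()
    return decryptor.update(ciphertext) + decryptor.finalize()

write_encrypted(0, b"A" * 4096)
write_encrypted(1, b"A" * 4096)   # identical plaintext, different tweak
assert physical_blocks[0][0] != physical_blocks[1][0]
assert read_decrypted(0) == b"A" * 4096
```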
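Patent 11093464 combines segment cleaning with deduplication: candidate segments are read, duplicate blocks only get their reference counts bumped, unique live blocks are staged in a log and written out as a new full segment, and a segment usage table (SUT) is updated to free the cleaned segments and claim the new one. A simplified sketch, where the hash-based duplicate detection, segment size, and SUT layout are assumptions:

```python
import hashlib

SEGMENT_BLOCKS = 4   # assumed number of blocks per segment

segments = {0: [b"a", b"b", b"a", b"c"],      # the storage medium
            1: [b"b", b"d", b"e", b"f"]}
sut = {0: "in_use", 1: "in_use"}              # segment usage table (SUT)
ref_counts = {}                               # block hash -> reference count
log = []                                      # unique blocks staged for rewrite

def flush_full_segment():
    global log
    new_seg = max(segments) + 1
    segments[new_seg] = log                   # write the full segment out
    sut[new_seg] = "in_use"                   # mark the new segment as no longer free
    log = []

def clean_segments(candidates):
    for seg in candidates:
        for block in segments[seg]:           # read blocks of the candidate segment
            digest = hashlib.sha256(block).digest()
            if digest in ref_counts:
                ref_counts[digest] += 1       # duplicate: bump its reference count
            else:
                ref_counts[digest] = 1        # unique: stage it in the log
                log.append(block)
                if len(log) == SEGMENT_BLOCKS:
                    flush_full_segment()
        sut[seg] = "free"                     # candidate segment is reclaimable

clean_segments([0, 1])
print(sut)   # segments 0 and 1 freed, segment 2 newly in use
```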
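Patent 11093403 rebalances cache pages between processes based on their miss counts: each period, pages are deallocated from the process missing less and allocated to the process missing more, which over repeated periods drives total misses down. A toy sketch, where the two-process setup and the page-move granularity are assumptions:

```python
def rebalance(allocations, misses, pages_to_move):
    """Each period, deallocate pages from the process with fewer cache misses
    and allocate them to the process with more misses."""
    donor = min(misses, key=misses.get)
    receiver = max(misses, key=misses.get)
    if donor != receiver and allocations[donor] >= pages_to_move:
        allocations[donor] -= pages_to_move
        allocations[receiver] += pages_to_move
    return allocations

allocations = {"proc_a": 100, "proc_b": 100}
# Miss counts observed for each process over three successive periods.
for period_misses in [{"proc_a": 50, "proc_b": 10},
                      {"proc_a": 40, "proc_b": 12},
                      {"proc_a": 30, "proc_b": 15}]:
    allocations = rebalance(allocations, period_misses, pages_to_move=5)
print(allocations)   # {'proc_a': 115, 'proc_b': 85}
```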
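Patent 11093472 keeps the LSM tree file system's data in sorted data tables, each with a payload portion and an index portion, and serves reads by locating data through the index portions. The sketch below shows that read/write split with plain Python lists; the newest-first search order and the table layout are assumptions for illustration.

```python
import bisect

class SortedDataTable:
    """A sorted data table with a payload portion (the values) and an index
    portion (the sorted keys) kept as separate structures."""
    def __init__(self, pairs):
        pairs = sorted(pairs)
        self.index = [k for k, _ in pairs]       # index portion
        self.payload = [v for _, v in pairs]     # payload portion

    def lookup(self, key):
        i = bisect.bisect_left(self.index, key)
        if i < len(self.index) and self.index[i] == key:
            return self.payload[i]
        return None

tables = []                                      # newest sorted data table last

def write(pairs):
    # A write lands in the file system as at least one new sorted data table.
    tables.append(SortedDataTable(pairs))

def read(key):
    # A read locates the data via the tables' index portions, newest first,
    # and then fetches it from the matching table's payload portion.
    for table in reversed(tables):
        value = table.lookup(key)
        if value is not None:
            return value
    return None

write([("a", 1), ("b", 2)])
write([("b", 20), ("c", 3)])
print(read("b"), read("a"), read("z"))           # 20 1 None
```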
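Patent 11088896 keeps both a request log and an in-flight request queue in a memory region that survives a service crash; on restart, any logged requests are copied into the in-flight queue and removed from the log, and everything in the in-flight queue is then processed. A minimal sketch of that restart path, with the protected memory region abstracted away as ordinary Python lists:

```python
# In the patent both structures live in a protected memory region that survives
# a service failure; ordinary lists stand in for that persistence here.
request_log = []       # requests logged but not yet moved to the in-flight queue
inflight_queue = []    # requests that were in flight when the service failed

def process(request):
    print("processing", request)

def recover_after_restart():
    # If the request log contains requests, copy them into the in-flight queue
    # and remove them from the log.
    while request_log:
        inflight_queue.append(request_log.pop(0))
    # Then process everything in-flight: requests that were already queued at
    # the time of the failure plus those just copied from the request log.
    while inflight_queue:
        process(inflight_queue.pop(0))

# State left over from before the simulated service failure:
inflight_queue.append("write-41")
request_log.append("write-42")
recover_after_restart()
```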
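Patent 11086779 (published as 20210141728) guards each hash bucket of the cache's hash table with its own mutex, so threads touching different buckets never contend. The sketch below shows only that per-bucket locking idea; the three queues, the per-element locks, and the atomic pointer advances of the full design are not modeled, and the bucket count is an assumption.

```python
import threading

NUM_BUCKETS = 16

class BucketLockedTable:
    """Hash table in which every bucket carries its own mutex, so threads that
    hash to different buckets never contend with one another."""

    def __init__(self):
        self.buckets = [[] for _ in range(NUM_BUCKETS)]       # lists of (key, value)
        self.locks = [threading.Lock() for _ in range(NUM_BUCKETS)]

    def _bucket(self, key):
        return hash(key) % NUM_BUCKETS

    def put(self, key, value):
        i = self._bucket(key)
        with self.locks[i]:                   # lock only this bucket's list
            for n, (k, _) in enumerate(self.buckets[i]):
                if k == key:
                    self.buckets[i][n] = (key, value)
                    return
            self.buckets[i].append((key, value))

    def get(self, key):
        i = self._bucket(key)
        with self.locks[i]:
            for k, v in self.buckets[i]:
                if k == key:
                    return v
            return None

table = BucketLockedTable()
threads = [threading.Thread(target=table.put, args=(n, n * n)) for n in range(100)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(table.get(7))   # 49
```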
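Patent 11080189 splits the cache management structure into cold, ghost, and hot queues, with the stated property that a sequential scan only pollutes the cold queue (and, to an extent, the ghost queue) but never the hot queue. The abstract does not spell out the admission policy, so the sketch below is one common 2Q-style reading that has that property, not the patented algorithm: new pages enter cold, evicted cold keys are remembered in ghost, and only a page re-referenced while in ghost is promoted to hot.

```python
from collections import OrderedDict

COLD_CAP, GHOST_CAP, HOT_CAP = 4, 4, 4
cold, ghost, hot = OrderedDict(), OrderedDict(), OrderedDict()

def access(page, data=None):
    if page in hot:                      # hot hit: just refresh recency
        hot.move_to_end(page)
        return hot[page]
    if page in cold:                     # cold hit: data is cached, stays cold
        return cold[page]
    if page in ghost:                    # ghost hit: page re-referenced, promote it
        ghost.pop(page)
        hot[page] = data
        if len(hot) > HOT_CAP:
            hot.popitem(last=False)
        return data
    cold[page] = data                    # brand-new page: admit to the cold queue
    if len(cold) > COLD_CAP:
        evicted, _ = cold.popitem(last=False)
        ghost[evicted] = None            # the ghost queue remembers keys, not data
        if len(ghost) > GHOST_CAP:
            ghost.popitem(last=False)
    return data

for p in range(10):                      # a one-pass sequential scan...
    access(p, f"page-{p}")
print("hot:", list(hot))                 # ...never reaches the hot queue
print("cold:", list(cold), "ghost:", list(ghost))
```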
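Patent 11055265 scales deduplication out by having a master divide the files among several nodes and hand each node a disjoint set of unique identifiers to assign to chunks, so the nodes can deduplicate in parallel without colliding on identifiers. A small sketch of that coordination, where the range size, the round-robin file split, and the whitespace "chunking" are assumptions:

```python
IDS_PER_NODE = 1000   # assumed size of each node's private identifier range

def assign_work(files, num_nodes):
    """Master: divide the files among the nodes and give every node its own
    disjoint range of unique chunk identifiers."""
    return [{"files": files[n::num_nodes],
             "id_range": range(n * IDS_PER_NODE, (n + 1) * IDS_PER_NODE)}
            for n in range(num_nodes)]

def dedup_on_node(assignment):
    """Worker: chunk its files and assign new identifiers only from its own
    range, so no two nodes can ever hand out the same identifier."""
    ids = iter(assignment["id_range"])
    chunk_ids = {}
    for f in assignment["files"]:
        for chunk in f.split():          # toy "chunking": split on whitespace
            if chunk not in chunk_ids:
                chunk_ids[chunk] = next(ids)
    return chunk_ids

files = ["a b c", "b c d", "d e f", "a e f"]
for assignment in assign_work(files, num_nodes=2):
    print(dedup_on_node(assignment))
```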
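Publication 20210117443 has a coordinator node create an immutable key-value entry, store it on itself and on its neighbor nodes, and only then push it to the remaining nodes as replicas. A compact sketch of that flow; the node count, the neighbor selection, and the all-or-nothing success check are assumptions.

```python
class StorageNode:
    def __init__(self, name):
        self.name = name
        self.entries = {}          # immutable key-value entries held by this node

    def store(self, key, value):
        if key in self.entries:    # entries are immutable and never overwritten
            return self.entries[key] == value
        self.entries[key] = value
        return True

def coordinator_put(coordinator, neighbors, others, key, value):
    # 1. The coordinator stores the entry locally and on its neighbor nodes.
    ok = coordinator.store(key, value) and all(n.store(key, value) for n in neighbors)
    if not ok:
        return False
    # 2. Only if that succeeded does it push the entry to the remaining nodes
    #    for storage as replicas.
    for node in others:
        node.store(key, value)
    return True

nodes = [StorageNode(f"node{i}") for i in range(5)]
coordinator, neighbors, others = nodes[0], nodes[1:3], nodes[3:]
coordinator_put(coordinator, neighbors, others, "k1", "v1")
print([sorted(n.entries) for n in nodes])
```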