Patents by Inventor Junlong Gao

Junlong Gao has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

Writing data to an LSM tree file structure using consistent cache staging

Patent number: 11620261

Abstract: The disclosure herein describes writing data to a log-structured merge (LSM) tree file system on an object storage platform. Write data instructions indicating data for writing to the LSM tree file system are received. Based on the received instructions, the data is written to a first data cache configured as a live data cache. Based on an instruction to transfer data in the live data cache to the LSM tree file system, the first data cache is converted to a stable cache. A second data cache configured as a live data cache is then generated based on cloning the first data cache. The data in the first data cache is then written to the LSM tree file system. Use of a stable cache and a cloned live data cache enables the stable cache to write data to the file system in parallel with the live data cache handling write data instructions.

Type: Grant

Filed: December 7, 2018

Date of Patent: April 4, 2023

Assignee: VMware, Inc.

Inventors: Wenguang Wang, Richard P. Spillane, Junlong Gao, Robert T. Johnson, Christos Karamanolis, Maxime Austruy
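
The staging flow in the abstract above can be illustrated with a minimal Python sketch. It is not the patented implementation: the class `StagedWriter`, its method names, and the dictionary standing in for the LSM tree file system are all hypothetical, chosen only to show a live cache being frozen as a stable cache while a clone keeps accepting writes.

```python
import copy
import threading


class StagedWriter:
    """Hypothetical sketch: writes go to a live cache; on flush the live
    cache becomes the stable cache and a clone becomes the new live cache."""

    def __init__(self):
        self.live = {}       # live data cache: receives incoming writes
        self.stable = None   # stable cache: frozen contents being flushed
        self.tree = {}       # stand-in for the LSM tree file system (assumption)
        self._lock = threading.Lock()

    def write(self, key, value):
        with self._lock:
            self.live[key] = value

    def begin_flush(self):
        # Convert the live cache to a stable cache, then clone it so that
        # later writes land in a separate live cache.
        with self._lock:
            self.stable = self.live
            self.live = copy.deepcopy(self.stable)
        return self.stable

    def finish_flush(self):
        # Write the stable cache to the tree; the live cache is not touched.
        if self.stable is None:
            return
        self.tree.update(self.stable)
        self.stable = None
```

Because `begin_flush` swaps in a clone, a `write` issued between `begin_flush` and `finish_flush` changes only the live cache, so the contents being flushed stay consistent.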

SCALABLE SEGMENT CLEANING FOR A LOG-STRUCTURED FILE SYSTEM

Publication number: 20230067709

Abstract: Scalable segment cleaning for log-structured file systems (LFSs) includes determining counts of segment cleaners and virtual nodes, with each virtual node being associated with a plurality of objects. Each virtual node is assigned to a selected segment cleaner. Based at least on the assignments, segment cleaning of the objects is performed, for each virtual node, by the assigned segment cleaner. A portion, less than all, of the virtual nodes are reassigned to a newly selected segment cleaner based on a change of the count of the segment cleaners and/or a change of the count of the virtual nodes. Based at least on the reassignments, segment cleaning of the objects is performed, for each reassigned virtual node, by the reassigned segment cleaner. In some examples, the objects comprise virtual machine disks (VMDKs) and the segment cleaning uses a segment usage table (SUT) to track segment usage and identify segment cleaning candidates.

Type: Application

Filed: October 20, 2022

Publication date: March 2, 2023

Inventors: Wenguang Wang, Junlong Gao, Vamsi Gunturu
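
The abstract above does not say how virtual nodes are mapped to segment cleaners. One well-known scheme with the stated property (only some virtual nodes move when the cleaner count changes) is rendezvous hashing, sketched below in Python. Using it here is an assumption, and the names `assign`, `_score`, and the `cleaner-N` labels are hypothetical.

```python
import hashlib


def _score(vnode: int, cleaner: str) -> int:
    # Deterministic pseudo-random weight for a (virtual node, cleaner) pair.
    digest = hashlib.sha256(f"{vnode}:{cleaner}".encode()).digest()
    return int.from_bytes(digest[:8], "big")


def assign(vnodes, cleaners):
    """Map each virtual node to the cleaner with the highest score."""
    return {v: max(cleaners, key=lambda c: _score(v, c)) for v in vnodes}


# Assign 64 virtual nodes to three cleaners, then add a fourth cleaner.
before = assign(range(64), ["cleaner-0", "cleaner-1", "cleaner-2"])
after = assign(range(64), ["cleaner-0", "cleaner-1", "cleaner-2", "cleaner-3"])

# Virtual nodes whose cleaner changed after the fourth cleaner was added.
moved = [v for v in before if before[v] != after[v]]
```

With this scheme a virtual node changes owner only when the newly added cleaner outscores its current one, so every entry in `moved` ends up on `cleaner-3` and the rest keep their assignment.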

Using Data Mirroring Across Multiple Regions to Reduce the Likelihood of Losing Objects Maintained in Cloud Object Storage

Publication number: 20230020366

Abstract: Techniques for using data mirroring across regions to reduce the likelihood of losing objects in a cloud object storage platform are provided. In one set of embodiments, a computer system can upload first and second copies of a data object to first and second regions of the cloud object storage platform respectively, where the first and second copies are identical. The computer system can then attempt to read the first copy of the data object from the first region. If the read attempt fails, the computer system can retrieve the second copy of the data object from the second region.

Type: Application

Filed: September 22, 2022

Publication date: January 19, 2023

Inventors: Wenguang Wang, Vamsi Gunturu, Junlong Gao
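
The read path described in the abstract above (try the first region, fall back to the second) can be sketched as follows. `RegionStore` is a hypothetical in-memory stand-in for one region of an object store, not a real cloud SDK, and a missing key is used to model a failed read.

```python
class RegionStore:
    """Hypothetical in-memory stand-in for one region of an object store."""

    def __init__(self, name):
        self.name = name
        self.objects = {}

    def put(self, key, data):
        self.objects[key] = data

    def get(self, key):
        if key not in self.objects:
            raise KeyError(f"{key} not found in {self.name}")
        return self.objects[key]


def upload_mirrored(key, data, first, second):
    # Upload identical copies of the object to both regions.
    first.put(key, data)
    second.put(key, data)


def read_with_fallback(key, first, second):
    # Attempt the read in the first region; on failure use the mirror.
    try:
        return first.get(key)
    except KeyError:
        return second.get(key)
```

A real deployment would also have to decide which failures justify a cross-region read (timeouts, checksum mismatches, missing objects); the sketch treats only a missing object as a failure.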

Using erasure coding in a single region to reduce the likelihood of losing objects maintained in cloud object storage

Patent number: 11556423

Abstract: Techniques for using erasure coding in a single region to reduce the likelihood of losing objects in a cloud object storage platform are provided. In one set of embodiments, a computer system can upload a plurality of data objects to a region of a cloud object storage platform, where the plurality of data objects includes modifications to a data set. The computer system can further compute a parity object based on the plurality of data objects, where the parity object encodes parity information for the plurality of data objects. The computer system can then upload the parity object to the same region where the plurality of data objects was uploaded.

Type: Grant

Filed: May 22, 2020

Date of Patent: January 17, 2023

Assignee: VMware, Inc.

Inventors: Wenguang Wang, Vamsi Gunturu, Junlong Gao
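
The abstract above does not name a specific erasure code. As a minimal illustration of a parity object, the Python sketch below assumes the simplest case: a single XOR parity over equally sized data objects, which can rebuild exactly one lost object. The function names are hypothetical.

```python
def xor_parity(objects):
    """Compute one XOR parity object over equally sized data objects."""
    size = len(objects[0])
    if any(len(obj) != size for obj in objects):
        raise ValueError("this sketch assumes equally sized objects")
    parity = bytearray(size)
    for obj in objects:
        for i, byte in enumerate(obj):
            parity[i] ^= byte
    return bytes(parity)


def recover_missing(survivors, parity):
    """Rebuild the single missing object from the survivors and the parity."""
    # XOR-ing the parity with every surviving object cancels them out,
    # leaving the bytes of the missing object.
    return xor_parity(list(survivors) + [parity])
```

For example, with three data objects and their parity stored in the same region, losing any one data object still leaves enough information to reconstruct it from the other two plus the parity object.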

Using erasure coding across multiple regions to reduce the likelihood of losing objects maintained in cloud object storage

Patent number: 11544147

Abstract: Techniques for using erasure coding across multiple regions to reduce the likelihood of losing objects in a cloud object storage platform are provided. In one set of embodiments, a computer system can upload each of a plurality of data objects to each of a plurality of regions of the cloud object storage platform. The computer system can further compute a parity object based on the plurality of data objects, where the parity object encodes parity information for the plurality of data objects. The computer system can then upload the parity object to another region of the cloud object storage platform different from the plurality of regions.

Type: Grant

Filed: May 22, 2020

Date of Patent: January 3, 2023

Assignee: VMware, Inc.

Inventors: Wenguang Wang, Junlong Gao, Vamsi Gunturu

EFFICIENT REPLICATION OF FILE CLONES

Publication number: 20220414064

Abstract: A method for managing replication of cloned files is provided. Embodiments include determining, at a source system, that a first file has been cloned to create a second file. Embodiments include sending, from the source system to a replica system, an address of the first extent and an indication that a status of the first extent has changed from non-cloned to cloned. Embodiments include changing, at the replica system, a status of a second extent associated with a replica of the first file on the replica system from non-cloned to cloned and creating a mapping of the address of the first extent to an address of the second extent on the replica system. Embodiments include creating, at the replica system, a replica of the second file comprising a reference to the address of the second extent on the replica system.

Type: Application

Filed: June 24, 2021

Publication date: December 29, 2022

Inventors: Abhay Kumar JAIN, Sriram PATIL, Junlong GAO, Wenguang WANG

Supporting deduplication in file storage using file chunk hashes

Patent number: 11500819

Abstract: The present disclosure is related to methods, systems, and machine-readable media for supporting deduplication in file storage using file chunk hashes. A hash of a chunk of a log segment can be received from a software defined data center. A chunk identifier can be associated with the hash in a hash map that stores associations between sequentially-allocated chunk identifiers and hashes. The chunk identifier can be associated with a logical address corresponding to the chunk of the log segment in a logical map that stores associations between the sequentially-allocated chunk identifiers and logical addresses. A search of the hash map can be performed to determine if the chunk is a duplicate, and the chunk can be deduplicated responsive to a determination that the chunk is a duplicate.

Type: Grant

Filed: September 22, 2020

Date of Patent: November 15, 2022

Assignee: VMware, Inc.

Inventors: Wenguang Wang, Vamsi Gunturu, Junlong Gao, Maxime Austruy, Petr Vandrovec, Ilya Languev, Ilia Sokolinski, Satish Pudi
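
The two maps named in the abstract above can be sketched with plain dictionaries. This is an illustrative assumption, not the disclosed data layout: the class `ChunkIndex`, the direction of each mapping, and the use of a hex digest string for the received hash are all hypothetical.

```python
from itertools import count


class ChunkIndex:
    """Hypothetical sketch of a hash map and a logical map keyed by
    sequentially allocated chunk identifiers."""

    def __init__(self):
        self._next_id = count(1)   # sequentially allocated chunk identifiers
        self.hash_map = {}         # chunk hash -> chunk identifier
        self.logical_map = {}      # logical address -> chunk identifier

    def ingest(self, logical_address, chunk_hash):
        """Record a received chunk hash; return True if it is a duplicate."""
        chunk_id = self.hash_map.get(chunk_hash)
        duplicate = chunk_id is not None
        if not duplicate:
            # First time this hash is seen: allocate the next identifier.
            chunk_id = next(self._next_id)
            self.hash_map[chunk_hash] = chunk_id
        # Duplicates reuse the existing identifier instead of storing data again.
        self.logical_map[logical_address] = chunk_id
        return duplicate
```

When two logical addresses carry the same hash, both entries in `logical_map` point at one chunk identifier, which is what allows the second copy of the data to be dropped.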

Scalable segment cleaning for a log-structured file system

Patent number: 11494110

Abstract: Scalable segment cleaning for log-structured file systems (LFSs) includes determining counts of segment cleaners and virtual nodes, with each virtual node being associated with a plurality of objects. Each virtual node is assigned to a selected segment cleaner. Based at least on the assignments, segment cleaning of the objects is performed, for each virtual node, by the assigned segment cleaner. A portion, less than all, of the virtual nodes are reassigned to a newly selected segment cleaner based on a change of the count of the segment cleaners and/or a change of the count of the virtual nodes. Based at least on the reassignments, segment cleaning of the objects is performed, for each reassigned virtual node, by the reassigned segment cleaner. In some examples, the objects comprise virtual machine disks (VMDKs) and the segment cleaning uses a segment usage table (SUT) to track segment usage and identify segment cleaning candidates.

Type: Grant

Filed: August 21, 2020

Date of Patent: November 8, 2022

Assignee: VMware, Inc.

Inventors: Wenguang Wang, Junlong Gao, Vamsi Gunturu

Using data mirroring across multiple regions to reduce the likelihood of losing objects maintained in cloud object storage

Patent number: 11481319

Abstract: Techniques for using data mirroring across regions to reduce the likelihood of losing objects in a cloud object storage platform are provided. In one set of embodiments, a computer system can upload first and second copies of a data object to first and second regions of the cloud object storage platform respectively, where the first and second copies are identical. The computer system can then attempt to read the first copy of the data object from the first region. If the read attempt fails, the computer system can retrieve the second copy of the data object from the second region.

Type: Grant

Filed: May 22, 2020

Date of Patent: October 25, 2022

Assignee: VMware, Inc.

Inventors: Wenguang Wang, Vamsi Gunturu, Junlong Gao

Efficient garbage collection of variable size chunking deduplication

Patent number: 11461229

Abstract: The present disclosure provides techniques for deallocating previously allocated storage blocks. The techniques include obtaining a list of chunk IDs to analyze, choosing a chunk ID, and determining the storage blocks spanned by the chunk corresponding to the chosen chunk ID. The techniques further include determining whether any file references any storage blocks spanned by the chunk. The determining may be performed by comparing an internal reference count to a total reference count, where the internal reference count is the number of references to the storage block by a chunk ID data structure. If no files reference any of the storage blocks spanned by the chunk, then all the storage blocks of the chunk can be deallocated.

Type: Grant

Filed: August 27, 2019

Date of Patent: October 4, 2022

Assignee: VMware, Inc.

Inventors: Wenguang Wang, Junlong Gao, Marcos K. Aguilera, Richard P. Spillane, Christos Karamanolis, Maxime Austruy
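
The reference-count comparison in the abstract above can be sketched in Python. The sketch assumes that any reference beyond the internal ones must come from a file; that reading, along with the names `BlockRefs`, `referenced_by_file`, and `collect`, is hypothetical.

```python
from dataclasses import dataclass


@dataclass
class BlockRefs:
    total: int      # all references to the storage block
    internal: int   # references held by the chunk ID data structure


def referenced_by_file(block: BlockRefs) -> bool:
    # Assumption: references beyond the internal ones come from files.
    return block.total > block.internal


def collect(chunk_blocks, refs):
    """Return the chunk IDs whose storage blocks can all be deallocated.

    chunk_blocks maps chunk ID -> list of block numbers spanned by the chunk.
    refs maps block number -> BlockRefs.
    """
    reclaimable = []
    for chunk_id, blocks in chunk_blocks.items():
        if not any(referenced_by_file(refs[b]) for b in blocks):
            reclaimable.append(chunk_id)
    return reclaimable
```

A chunk is reclaimed only when none of the blocks it spans is referenced by a file, so a single shared block that a file still uses keeps the whole chunk alive.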

Shrinking segment cleaning algorithm in an object storage

Patent number: 11435935

Abstract: A method for cleaning an object storage having a plurality of segments is provided. Each segment includes an identifier through which the segment is accessed. The method identifies a first segment in the plurality of segments. The first segment includes a first identifier and a first size. The method determines that a utilization ratio for the first segment is below a threshold. As a result, the method generates a second segment from the first segment, such that the second segment includes a second identifier that is the same as the first identifier and a second size that is smaller than the first size. The method then writes the second segment to the object storage.

Type: Grant

Filed: November 20, 2020

Date of Patent: September 6, 2022

Assignee: VMware, Inc.

Inventors: Wenguang Wang, Petr Vandrovec, Hardik Singh Negi, Junlong Gao, Vamsi Gunturu
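
The shrinking step in the abstract above (rewrite a sparse segment under the same identifier at a smaller size) can be sketched as follows. The `Segment` class, the use of `None` to mark dead blocks, the dictionary standing in for the object storage, and the default threshold of 0.5 are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    segment_id: str
    blocks: list   # one entry per block; None marks a dead block

    @property
    def size(self):
        return len(self.blocks)

    @property
    def utilization(self):
        live = sum(b is not None for b in self.blocks)
        return live / self.size if self.size else 0.0


def shrink_if_sparse(store, segment_id, threshold=0.5):
    """Rewrite a sparse segment in place; return True if it was shrunk."""
    seg = store[segment_id]
    if seg.utilization >= threshold:
        return False
    live = [b for b in seg.blocks if b is not None]
    # The second segment keeps the same identifier but has a smaller size.
    store[segment_id] = Segment(segment_id, live)
    return True
```

Keeping the identifier unchanged means the segment can still be reached under the same key after it is rewritten; the sketch does not model how block positions inside the shrunken segment are tracked.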

Log-structured formats for managing archived storage of objects

Patent number: 11436102

Abstract: Solutions for managing archived storage include receiving, at a first node, a snapshot comprising object data (e.g., a virtual machine disk snapshot) from a second node (e.g., a software defined data center), and storing the snapshot in a tiered structure that includes a data tier and a metadata tier. Snapshots may be used for fail-over operations and/or backups, to support disaster recovery. The data tier comprises a log-structured file system (LFS), and the metadata tier comprises a content addressable storage (CAS) identifying addresses within the LFS. The metadata tier also comprises a logical layer indicating content in the CAS. Segment cleaning of the data tier is performed using a segment usage table (SUT). Some examples include performing a fail-over operation from the second node to a third node using at least the stored snapshot for workload recovery. In some examples, the CAS comprises a log-structured merge-tree (LSM-tree).

Type: Grant

Filed: August 20, 2020

Date of Patent: September 6, 2022

Assignee: VMware, Inc.

Inventors: Vamsi Gunturu, Wenguang Wang, Junlong Gao, Ilia Langouev, Petr Vandrovec, Maxime Austruy, Ilia Sokolinski, Satish Pudi

Isolation of concurrent read and write transactions on the same file

Patent number: 11403261

Abstract: The disclosure provides for isolation of concurrent read and write transactions on the same file, thereby enabling higher file system throughput relative to serial-only transactions. Race conditions and lock contentions in multi-writer scenarios are avoided in file stat (metadata) updates by the use of an aggregator to merge updates of committed transactions to maintain file stat truth, and an upgrade lock that enforces atomicity of file stat access, even while still permitting multiple processes to concurrently read from and/or write to the file data. The disclosure is applicable to generic file systems, whether native or virtualized, and may be used, for example, to speed access to database files that require prolonged input/output (I/O) transaction time periods.

Type: Grant

Filed: December 7, 2018

Date of Patent: August 2, 2022

Assignee: VMware, Inc.

Inventors: Wenguang Wang, Richard P. Spillane, Junlong Gao, Fengshuang Li

System and method for reducing read amplification of archival storage using proactive consolidation

Patent number: 11397706

Abstract: System and method for managing snapshots of storage objects in a storage system use a consolidation operation to reduce read amplification for stored snapshots of a storage object that are stored in log segments in the storage system according to a log-structured file system as storage service objects. The consolidation operation involves identifying target log segments among the log segments that include live blocks that are associated with the latest snapshot of the storage object and determining the number of the live blocks included in each of the target log segments. Based on the number of the live blocks in each of the target log segments, candidate consolidation log segments are determined from the target log segments. The live blocks in the candidate consolidation log segments are then consolidated to new log segments, which are uploaded to the storage system as new storage service objects.

Type: Grant

Filed: December 22, 2020

Date of Patent: July 26, 2022

Assignee: VMware, Inc.

Inventors: Wenguang Wang, Hardik Singh Negi, Junlong Gao, Vamsi Gunturu
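
The consolidation operation in the abstract above counts live blocks per log segment and then picks candidates, but the abstract does not state the selection rule. The Python sketch below assumes one plausible rule: segments holding at most `max_live` live blocks of the latest snapshot are candidates. The function names, the snapshot layout, and the rule itself are hypothetical.

```python
from collections import Counter


def pick_candidates(latest_snapshot, max_live=2):
    """latest_snapshot maps logical block -> (segment id, offset)."""
    # Count the live blocks of the latest snapshot in each target segment.
    live_per_segment = Counter(seg for seg, _ in latest_snapshot.values())
    return sorted(s for s, n in live_per_segment.items() if n <= max_live)


def consolidate(latest_snapshot, candidates, new_segment_id):
    """Repoint live blocks from candidate segments into one new segment."""
    remapped = dict(latest_snapshot)
    offset = 0
    for block, (seg, _) in sorted(latest_snapshot.items()):
        if seg in candidates:
            remapped[block] = (new_segment_id, offset)
            offset += 1
    return remapped
```

After consolidation, reading the latest snapshot touches fewer segments, which is the read amplification the abstract aims to reduce; uploading the new segment as a storage service object is outside this sketch.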

Supporting deduplication in object storage using subset hashes

Patent number: 11385817

Abstract: The present disclosure is related to methods, systems, and machine-readable media for supporting deduplication in object storage using subset hashes. A plurality of hashes of a plurality of blocks of a plurality of log segments can be received from a software defined data center, wherein each block corresponds to a respective logical address. Each of the plurality of logical addresses can be associated with a respective sequentially-allocated chunk identifier in a logical map. A subset hash comprising a hash of a subset of the plurality of blocks can be determined that corresponds to a contiguous range of the plurality of logical addresses. A search of a hash map for the subset hash can be performed to determine if the subset hash is a duplicate. The subset of the plurality of blocks can be deduplicated responsive to a determination that the subset hash is a duplicate.

Type: Grant

Filed: September 22, 2020

Date of Patent: July 12, 2022

Assignee: VMware, Inc.

Inventors: Wenguang Wang, Vamsi Gunturu, Junlong Gao, Ilya Languev, Petr Vandrovec, Maxime Austruy, Ilia Sokolinski, Satish Pudi

Organize chunk store to preserve locality of hash values and reference counts for deduplication

Patent number: 11372813

Abstract: The present disclosure provides techniques for deduplicating files. The techniques include creating a data structure that organizes metadata about chunks of files, the organization of the metadata preserving order and locality of the chunks within files. The organization of the metadata within storage blocks of storage devices matches the order of chunks within files. Upon a read or write operation on metadata, the preserved locality makes it likely that metadata of subsequent and contiguous chunks is fetched from storage into a memory cache. The preserved locality results in faster subsequent read and write operations on metadata, because those operations are likely to be executed from memory rather than from storage.

Type: Grant

Filed: August 27, 2019

Date of Patent: June 28, 2022

Assignee: VMware, Inc.

Inventors: Wenguang Wang, Junlong Gao, Marcos K. Aguilera, Richard P. Spillane, Christos Karamanolis, Maxime Austruy

SYSTEM AND METHOD FOR REDUCING READ AMPLIFICATION OF ARCHIVAL STORAGE USING PROACTIVE CONSOLIDATION

Publication number: 20220197861

Abstract: System and method for managing snapshots of storage objects in a storage system use a consolidation operation to reduce read amplification for stored snapshots of a storage object that are stored in log segments in the storage system according to a log-structured file system as storage service objects. The consolidation operation involves identifying target log segments among the log segments that include live blocks that are associated with the latest snapshot of the storage object and determining the number of the live blocks included in each of the target log segments. Based on the number of the live blocks in each of the target log segments, candidate consolidation log segments are determined from the target log segments. The live blocks in the candidate consolidation log segments are then consolidated to new log segments, which are uploaded to the storage system as new storage service objects.

Type: Application

Filed: December 22, 2020

Publication date: June 23, 2022

Inventors: Wenguang WANG, Hardik Singh NEGI, Junlong GAO, Vamsi GUNTURU

SHRINKING SEGMENT CLEANING ALGORITHM IN AN OBJECT STORAGE

Publication number: 20220164125

Abstract: A method for cleaning an object storage having a plurality of segments is provided. Each segment includes an identifier through which the segment is accessed. The method identifies a first segment in the plurality of segments. The first segment includes a first identifier and a first size. The method determines that a utilization ratio for the first segment is below a threshold. As a result, the method generates a second segment from the first segment, such that the second segment includes a second identifier that is the same as the first identifier and a second size that is smaller than the first size. The method then writes the second segment to the object storage.

Type: Application

Filed: November 20, 2020

Publication date: May 26, 2022

Inventors: Wenguang WANG, Petr VANDROVEC, Hardik Singh NEGI, Junlong GAO, Vamsi GUNTURU

SCALABLE I/O OPERATIONS ON A LOG-STRUCTURED MERGE (LSM) TREE

Publication number: 20220156231

Abstract: A method for managing data associated with objects stored in a cloud storage is provided. The method receives, at a first compute node, first data associated with an object stored in the cloud storage, the first compute node being one of a plurality of compute nodes that store data associated with different objects as storage objects in a log-structured merging (LSM) tree data structure. The method then assigns a first unique name to a first storage object associated with the first data, the first unique name comprising a combination of at least an identifier identifying the first compute node and a first incremental local value. The method stores the first storage object in a first level (L0) of the LSM tree data structure.

Type: Application

Filed: November 13, 2020

Publication date: May 19, 2022

Inventors: Wenguang WANG, Junlong GAO, Vamsi GUNTURU
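
The naming rule in the abstract above (a compute node identifier combined with an incremental local value) can be sketched in Python. The `ComputeNode` class, the `node-a`/`node-b` identifiers, the zero-padded name format, and the dictionary standing in for level L0 are hypothetical.

```python
from itertools import count


class ComputeNode:
    """Hypothetical compute node that names its own storage objects."""

    def __init__(self, node_id: str):
        self.node_id = node_id
        self._local = count(1)   # incremental local value, private to this node

    def next_object_name(self) -> str:
        # Node identifier plus local counter: unique without coordination.
        return f"{self.node_id}-{next(self._local):08d}"


l0 = {}  # stand-in for level L0 of the LSM tree (assumption)
a, b = ComputeNode("node-a"), ComputeNode("node-b")
for node, payload in [(a, b"x"), (b, b"y"), (a, b"z")]:
    l0[node.next_object_name()] = payload
```

Because each name embeds the node identifier, two nodes can write to L0 at the same time without agreeing on a shared counter, provided the node identifiers themselves are distinct.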

Distributed object storage supporting difference-level snapshots

Patent number: 11314440

Abstract: Techniques for increasing the efficiency of storing data objects in the object storage of a software-defined data center (SDDC) are provided. The techniques include the efficient storage of data, while enabling snapshots of each update of the data. The snapshots of the data may be efficiently recovered via the techniques. Difference-level mappings for each snapshot are encoded in compact self-balancing data trees included in the object's metadata. The metadata mappings include mappings between various address spaces employed by the SDDC, as well as the address spaces employed by data stores that store the data on physical media. Because the metadata is efficiently structured, the metadata for an object may be cached for quick lookups during data access and/or snapshot recovery. The techniques also provide low-latency recovery and/or system rollback in the event of any failure in the SDDC.

Type: Grant

Filed: October 16, 2020

Date of Patent: April 26, 2022

Assignee: VMware, Inc.

Inventors: Wenguang Wang, Vamsidhar Gunturu, Junlong Gao, Ilya Languev, Petr Vandrovec, Maxime Austruy, Ilia Sokolinski, Satish Pudi
