Patents by Inventor Xianbo Zhang

Xianbo Zhang has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 9928210
    Abstract: The present disclosure provides for defragmenting deduplicated data, such as one or more backup image files, stored in a deduplicated data store. A defragmentation module can be implemented on a deduplication server to reduce fragmentation of backup images and improve processing time for restoring a backup image. A defragmentation module can be configured to defragment a backup image file by migrating portions of data of the backup image file that are stored in various containers at non-contiguous locations throughout the deduplicated data store. A defragmentation module can contiguously write the portions to one or more containers, which are stored at one or more new locations in the deduplicated data store. A defragmentation module can be configured to evaluate whether portions of a backup image file meet criteria for defragmentation. A defragmentation module can also be configured to update location information about the portions that are migrated to the new container(s).
    Type: Grant
    Filed: April 30, 2012
    Date of Patent: March 27, 2018
    Assignee: Veritas Technologies LLC
    Inventors: Xianbo Zhang, Benjamin Potvien, Thomas Hartnett, Weibao Wu, Satyajit Gorhe Parlikar
  • Patent number: 9886446
    Abstract: A system and method for creating an inverted index is disclosed. The inverted index is created from indexing information received by a deduplication server. This indexing information is collected by a deduplication client during a backup operation and includes a list of keywords and a plurality of values. Once the indexing information is received, the index is constructed and includes a list of keywords. Each of the keywords is mapped to a value, each value represents a section of a document, and each section of the document includes at least a portion of a keyword.
    Type: Grant
    Filed: March 15, 2011
    Date of Patent: February 6, 2018
    Assignee: Veritas Technologies LLC
    Inventors: Danzhou Liu, Xianbo Zhang, Weibao Wu
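
The keyword-to-section mapping described in the abstract for patent 9886446 can be illustrated with a minimal sketch. This is not the patented implementation; the function name, the `(keyword, section_id)` input format, and the sample identifiers are all assumptions made for illustration.

```python
from collections import defaultdict


def build_inverted_index(indexing_info):
    """Map each keyword to the document sections that contain it.

    indexing_info is an iterable of (keyword, section_id) pairs, a
    hypothetical stand-in for the keyword list and values that a
    deduplication client would send to the server.
    """
    index = defaultdict(set)
    for keyword, section_id in indexing_info:
        index[keyword].add(section_id)
    # Sort the section ids so the result is deterministic.
    return {kw: sorted(sections) for kw, sections in index.items()}


# Hypothetical indexing information collected during a backup operation.
info = [("budget", "doc1:s0"), ("report", "doc1:s1"), ("budget", "doc2:s3")]
index = build_inverted_index(info)
print(index["budget"])
```

A lookup then returns only the sections that mention the keyword, rather than whole documents.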
  • Patent number: 9830231
    Abstract: A system and method for caching fingerprints in a client cache is provided. A data object that comprises a set of data segments and describes a backup process is identified. Thereafter, a request referencing the data object is made to a deduplication server to request that a task identifier be added to the data object. If the deduplication server is able to successfully add the task identifier to the data object, then an active identifier is added to each data segment from the set of data segments in a cache that is within a client system.
    Type: Grant
    Filed: October 27, 2014
    Date of Patent: November 28, 2017
    Assignee: Veritas Technologies LLC
    Inventors: Xianbo Zhang, Thomas Hartnett, Weibao Wu
  • Publication number: 20170286512
    Abstract: Disclosed herein are systems, methods, and processes to perform replication between heterogeneous storage systems. Information associated with a backup stream is recorded during a backup operation by a source server and includes instructions. The instructions include an include instruction to include existing data and a write instruction to write new data during a replication operation. A request to perform the replication operation is received. In response to the request, the information is sent to a target server as part of performing the replication operation.
    Type: Application
    Filed: March 31, 2016
    Publication date: October 5, 2017
    Inventors: Xianbo Zhang, Weibao Wu, Timothy Stevens, Shuangmin Zhang
  • Patent number: 9710386
    Abstract: A computer-implemented method for prefetching subsequent data segments may include (1) identifying a storage system that receives sequential read requests from a sequential-access computing job and random-access read requests from a random-access computing job, (2) observing a plurality of requests to read a plurality of data segments stored by the storage system, (3) determining that the plurality of data segments are stored contiguously by the storage system and that the plurality of requests originate from the sequential-access computing job, and (4) prefetching a subsequent data segment that is directly subsequent to the plurality of data segments in response to determining that the plurality of requests originate from the sequential-access computing job. Various other methods, systems, and computer-readable media are also disclosed.
    Type: Grant
    Filed: August 7, 2013
    Date of Patent: July 18, 2017
    Assignee: Veritas Technologies
    Inventors: Xianbo Zhang, Gaurav Makin, Steve Vranyes, Sinh Nguyen, Smitha Cauligi
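
The prefetch decision in the abstract for patent 9710386 can be sketched roughly as follows. The sketch assumes that each request carries a job identifier and a segment number, and that the set of sequential-access jobs is already known; `next_prefetch` and `min_run` are hypothetical names, not taken from the patent.

```python
def next_prefetch(requests, sequential_jobs, min_run=3):
    """Return the segment number to prefetch, or None.

    requests: list of (job_id, segment_number) pairs in arrival order.
    sequential_jobs: set of job ids assumed to read sequentially.
    min_run: how many recent requests must form a contiguous run.
    """
    if len(requests) < min_run:
        return None
    recent = requests[-min_run:]
    jobs = {job for job, _ in recent}
    # All recent requests must come from one sequential-access job.
    if len(jobs) != 1 or not jobs <= sequential_jobs:
        return None
    segments = [seg for _, seg in recent]
    # The requested segments must be stored contiguously.
    contiguous = all(b == a + 1 for a, b in zip(segments, segments[1:]))
    return segments[-1] + 1 if contiguous else None


backup_reads = [("backup", 10), ("backup", 11), ("backup", 12)]
query_reads = [("query", 10), ("query", 11), ("query", 12)]
print(next_prefetch(backup_reads, {"backup"}))
print(next_prefetch(query_reads, {"backup"}))
```

Requests from the random-access job never trigger a prefetch here, even when they happen to hit contiguous segments.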
  • Publication number: 20170147446
    Abstract: A computer-implemented method for taking snapshots in a deduplicated virtual file system may include (1) maintaining a deduplicated virtual file system that stores, at an original location within a non-virtual file system, at least one configuration file storing metadata for a target file and an extent map for the target file, the extent map defining how to construct the target file from deduplicated data segments in a deduplicated storage system, (2) receiving a request to take a snapshot of the target file corresponding to the configuration file, (3) copying the configuration file storing metadata for the target file and the extent map for the target file into a snapshot location within the non-virtual file system, and (4) transmitting a file reference request to the deduplicated storage system to add a file reference within the deduplicated storage system. Various other methods, systems, and computer-readable media are also disclosed.
    Type: Application
    Filed: November 25, 2015
    Publication date: May 25, 2017
    Inventors: Xianbo Zhang, Haigang Wang, Shuangmin Zhang, Jeffrey Van Voorst, Weibao Wu, Sameer Kulkarni, Nilesh Joshi, Kai Li, Yun Yang, Scott Brons
  • Patent number: 9626253
    Abstract: A method for data container group management in a deduplication system is provided. The method includes arranging a plurality of data container groups according to a plurality of file systems. A subset of the plurality of data container groups correspond to each of the plurality of file systems, each of the plurality of data container groups having a reference database, a plurality of data containers, and a data container group identifier (ID). The method includes performing a first backup process for a first client-policy pair with deduplication via a first one of the plurality of data container groups and performing a second backup process for a second client-policy pair with deduplication via a second one of the plurality of data container groups.
    Type: Grant
    Filed: June 26, 2014
    Date of Patent: April 18, 2017
    Assignee: Veritas Technologies LLC
    Inventors: Xianbo Zhang, Haibin She, Haigang Wang
  • Patent number: 9619479
    Abstract: A method to partition a deduplication pool is provided. The method includes determining that an amount of data in a plurality of data containers of the deduplication pool has reached a data capacity threshold and comparing each data container of the plurality of data containers with at least one other of the plurality of data containers as to amount of shared data. The method includes grouping, based on results of the comparing, the plurality of data containers into a plurality of groups of data containers, with data sharing from each of the plurality of groups of data containers to each other of the plurality of groups of data containers less than a data sharing threshold and data sharing inside each of the plurality of groups of data containers greater than the data sharing threshold.
    Type: Grant
    Filed: June 26, 2014
    Date of Patent: April 11, 2017
    Assignee: Veritas Technologies LLC
    Inventors: Xianbo Zhang, Haibin She, Haigang Wang
  • Patent number: 9575680
    Abstract: A method for deduplication rehydration is described. In one embodiment, a request to restore a backup image is received. The backup image is stored in a deduplication system. The backup image includes a plurality of data segments. The method includes determining locality information for at least one of the plurality of data segments. The locality information includes information regarding a location of the at least one data segment in relation to each other data segment of the plurality of data segments in the backup image. The method includes obtaining an identifier of each data container storing the plurality of data segments of the backup image, determining a degree to which the plurality of data segments of the backup image are processed by prefetching, and prefetching one or more of the plurality of target data segments from a data container based at least in part on a predetermined effectiveness threshold.
    Type: Grant
    Filed: August 22, 2014
    Date of Patent: February 21, 2017
    Assignee: Veritas Technologies LLC
    Inventors: Lei Hu Zhang, Xianbo Zhang
  • Patent number: 9495379
    Abstract: The present disclosure provides for implementing a two-level fingerprint caching scheme for a client cache and a server cache. The client cache hit ratio can be improved by pre-populating the client cache with fingerprints that are relevant to the client. Relevant fingerprints include fingerprints used during a recent time period (e.g., fingerprints of segments that are included in the last full backup image and any following incremental backup images created for the client after the last full backup image), and thus are referred to as fingerprints with good temporal locality. Relevant fingerprints also include fingerprints associated with a storage container that has good spatial locality, and thus are referred to as fingerprints with good spatial locality. A pre-set threshold established for the client cache (e.g., threshold Tc) is used to determine whether a storage container (and thus fingerprints associated with the storage container) has good spatial locality.
    Type: Grant
    Filed: October 8, 2012
    Date of Patent: November 15, 2016
    Assignee: Veritas Technologies LLC
    Inventors: Xianbo Zhang, Haibin She, Chao Lei, Xiaobing Song, Shuai Cheng
  • Patent number: 9477677
    Abstract: A computer-implemented method for parallel content-defined data chunking may include (1) identifying a data stream to be chunked, (2) splitting the data stream into a plurality of data sub-streams by alternatingly dividing consecutive bytes of the data stream among the plurality of data sub-streams, and (3) chunking, in parallel, each data sub-stream within the plurality of data sub-streams into a plurality of data segments using a content-defined chunking algorithm. Various other methods, systems, and computer-readable media are also disclosed.
    Type: Grant
    Filed: May 7, 2013
    Date of Patent: October 25, 2016
    Assignee: Veritas Technologies LLC
    Inventors: Wenxin Wang, Xianbo Zhang, Dongxu Sun
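
The three steps in the abstract for patent 9477677 can be sketched as below. This is an illustration under stated assumptions: byte `i` is assigned to sub-stream `i mod n`, the chunker uses a toy shift-and-add hash rather than any algorithm named in the patent, and a thread pool stands in for whatever parallel mechanism a real system would use. All function names and parameters are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def split_stream(data, n):
    """Deal consecutive bytes round-robin into n sub-streams."""
    return [data[i::n] for i in range(n)]


def chunk(data, mask=0x0F, min_size=4):
    """Toy content-defined chunker: cut where the hash's low bits are zero."""
    chunks, start, h = [], 0, 0
    for i, byte in enumerate(data):
        h = ((h << 1) + byte) & 0xFFFFFFFF
        if i + 1 - start >= min_size and (h & mask) == 0:
            chunks.append(data[start:i + 1])
            start, h = i + 1, 0
    if start < len(data):
        chunks.append(data[start:])  # trailing partial chunk
    return chunks


def parallel_chunk(data, n=2):
    """Chunk each sub-stream concurrently; returns one chunk list per sub-stream."""
    subs = split_stream(data, n)
    with ThreadPoolExecutor(max_workers=n) as pool:
        return list(pool.map(chunk, subs))


data = bytes(range(256)) * 4
result = parallel_chunk(data, n=4)
print([len(chunks) for chunks in result])
```

Because chunk boundaries depend only on the bytes within each sub-stream, the sub-streams can be processed independently of one another.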
  • Patent number: 9442807
    Abstract: In some embodiments, a method of maintaining a reference list for data deduplication is provided. The method includes discarding a newly arriving data segment in response to finding a fingerprint of the newly arriving data segment matches an existing fingerprint in a plurality of fingerprints on a fingerprint-to-file reference list. The method includes adding, in the fingerprint-to-file reference list, to a list for the existing fingerprint, a source for the newly arriving data segment, in response to the fingerprint-to-file reference list indicating the existing fingerprint does not correspond to a hot data segment and setting an indication in the fingerprint-to-file reference list that the existing fingerprint corresponds to the hot data segment in response to the list for the existing fingerprint meeting or exceeding a predetermined number of entries. Other embodiments are included.
    Type: Grant
    Filed: July 3, 2013
    Date of Patent: September 13, 2016
    Assignee: Veritas Technologies, LLC
    Inventors: Xianbo Zhang, Haigang Wang, Haibin She, Wim Goedertier
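
The reference-list behavior in the abstract for patent 9442807 can be sketched as follows. The class name, the `hot_threshold` parameter, and the in-memory dictionary are assumptions for illustration; the patent does not specify this data layout.

```python
class ReferenceList:
    """Toy fingerprint-to-file reference list with a 'hot' flag."""

    def __init__(self, hot_threshold=3):
        self.hot_threshold = hot_threshold  # hypothetical predetermined entry count
        self.sources = {}                   # fingerprint -> list of source files
        self.hot = set()                    # fingerprints marked as hot

    def add_segment(self, fingerprint, source):
        """Return True if the segment is new and its data must be stored."""
        if fingerprint not in self.sources:
            self.sources[fingerprint] = [source]
            return True
        # Duplicate fingerprint: the arriving segment data is discarded.
        if fingerprint not in self.hot:
            self.sources[fingerprint].append(source)
            if len(self.sources[fingerprint]) >= self.hot_threshold:
                self.hot.add(fingerprint)
        return False


refs = ReferenceList(hot_threshold=3)
stored = [refs.add_segment("f1", name) for name in ("a", "b", "c", "d")]
print(stored, refs.sources["f1"], "f1" in refs.hot)
```

Once a fingerprint is marked hot, further sources are no longer appended, so the list for a widely shared segment stops growing.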
  • Publication number: 20160179631
    Abstract: Techniques for data backup and restoration are disclosed. In one embodiment, the techniques may be realized as a method including generating a first backup representing a database at a first time; after the first backup, generating a plurality of journal entries, each journal entry representing a change to the database made after the first time; and restoring the database from the first backup and the plurality of journal entries, the restored database including the changes represented by the entries.
    Type: Application
    Filed: December 19, 2014
    Publication date: June 23, 2016
    Applicant: Symantec Corporation
    Inventors: Dongxu Sun, Cheng Hai Zhu, Cheng Shan, Haibin She, Xianbo Zhang
  • Patent number: 9367559
    Abstract: A method for data locality control in a deduplication system is provided. The method includes forming a fingerprint cache from a backup image corresponding to a first backup operation. The method includes removing one or more fingerprints from inclusion in the fingerprint cache, in response to the one or more fingerprints having a data segment locality, in a container, less than a threshold of data segment locality. The container has one or more data segments corresponding to the one or more fingerprints. The method includes applying the fingerprint cache, with the one or more fingerprints removed from inclusion therein, to a second backup operation, wherein at least one method operation is executed through a processor.
    Type: Grant
    Filed: December 2, 2013
    Date of Patent: June 14, 2016
    Assignee: Veritas Technologies LLC
    Inventors: Xianbo Zhang, Haibin She, Xiaobing Song
  • Patent number: 9336224
    Abstract: A computer-implemented method for providing increased scalability in deduplication storage systems may include (1) identifying a database that stores a plurality of reference objects, (2) determining that at least one size-related characteristic of the database has reached a predetermined threshold, (3) partitioning the database into a plurality of sub-databases capable of being updated independent of one another, (4) identifying a request to perform an update operation that updates one or more reference objects stored within at least one sub-database, and then (5) performing the update operation on less than all of the sub-databases to avoid processing costs associated with performing the update operation on all of the sub-databases. Various other systems, methods, and computer-readable media are also disclosed.
    Type: Grant
    Filed: December 23, 2014
    Date of Patent: May 10, 2016
    Assignee: Veritas Technologies, LLC
    Inventors: Xianbo Zhang, Fanglu Guo, Weibao Wu
  • Patent number: 9298707
    Abstract: Systems and methods for providing efficient storage and retrieval of data are disclosed. A two-level segment labeling mechanism may be employed to ensure that unique data segments from particular backup data sets are stored together in a storage container. The two-level segment labeling may facilitate preservation of the relative positions of segments within the backup stream during compaction operations. Also, backup data restoration performance may be improved by use of multiple read threads that are localized to particular storage containers.
    Type: Grant
    Filed: September 30, 2011
    Date of Patent: March 29, 2016
    Assignee: Veritas US IP Holdings LLC
    Inventors: Xianbo Zhang, Thomas D. Hartnett
  • Patent number: 9244936
    Abstract: A computer-implemented method for enabling deduplication of attachment files within a database is described. A database file comprising data blocks of an attachment file positioned intermittently among data blocks of the database file is inspected. A first map may be generated from the inspection of the database file and the attachment file. The data blocks of the database file and the data blocks of the attachment file are identified according to the first map. The data blocks of the database file are written to a database data file. The data blocks of the attachment file are written to an attachment data file. The attachment data file is deduplicated with at least one other data file.
    Type: Grant
    Filed: October 28, 2010
    Date of Patent: January 26, 2016
    Assignee: Symantec Corporation
    Inventors: Richard Jones, Patrick Ou, Kirk Searls, Weibao Wu, Xianbo Zhang
  • Patent number: 9122635
    Abstract: The present disclosure provides for efficiently creating a full backup image of a client device by efficiently communicating backup data to a backup server using a change tracking log, or track log. A present full backup image can be created using a track log that is associated with a previous full backup image. The client device can determine whether files, which were included in the previous full backup image, have or have not changed using the track log. The client device can transmit changed file data to the backup server for inclusion in the present full backup image. The client device can also transmit metadata identifying unchanged file data to the backup server. The backup server can use the metadata to extract a copy of the unchanged file data from the previous full backup image for inclusion in the present full backup image.
    Type: Grant
    Filed: April 16, 2014
    Date of Patent: September 1, 2015
    Assignee: Symantec Corporation
    Inventors: Shuangmin Zhang, Xianbo Zhang, Weibao Wu, Jim R. Lamb, Yun Yang, Satyajit Ashok GorheParlikar
  • Publication number: 20150112950
    Abstract: A computer-implemented method for providing increased scalability in deduplication storage systems may include (1) identifying a database that stores a plurality of reference objects, (2) determining that at least one size-related characteristic of the database has reached a predetermined threshold, (3) partitioning the database into a plurality of sub-databases capable of being updated independent of one another, (4) identifying a request to perform an update operation that updates one or more reference objects stored within at least one sub-database, and then (5) performing the update operation on less than all of the sub-databases to avoid processing costs associated with performing the update operation on all of the sub-databases. Various other systems, methods, and computer-readable media are also disclosed.
    Type: Application
    Filed: December 23, 2014
    Publication date: April 23, 2015
    Inventors: Xianbo Zhang, Fanglu Guo, Weibao Wu
  • Patent number: 8983952
    Abstract: A system and method for partitioning a data stream into a plurality of segments of varying sizes. A data stream manager partitions a received data stream into segments which are then conveyed to a deduplication engine for processing. The data stream received by the data stream manager includes metadata corresponding to the data stream. Based upon the metadata, which may include an indication as to a type of data included in the data stream, the data stream is partitioned into segments for further processing. A size of a segment used for partitioning given data is based at least in part on a type of data being partitioned. The variable segment sizes may be chosen to balance between maximizing the deduplication ratio and minimizing both the segment count and the size of the fingerprint index.
    Type: Grant
    Filed: July 29, 2010
    Date of Patent: March 17, 2015
    Assignee: Symantec Corporation
    Inventors: Xianbo Zhang, Emery Wang, David Teater, James P. Ohr
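
The type-dependent segment sizing in the abstract for patent 8983952 can be sketched as below. The size table, the `data_type` metadata key, and the choice of a single fixed size per data type are all assumptions for illustration; the patent only states that segment size depends at least in part on the type of data.

```python
# Hypothetical segment sizes in bytes, keyed by the data type named in the metadata.
SEGMENT_SIZE_BY_TYPE = {
    "database": 8 * 1024,   # smaller segments may find more duplicates
    "video": 128 * 1024,    # larger segments keep the fingerprint index small
}
DEFAULT_SEGMENT_SIZE = 32 * 1024


def partition(stream, metadata):
    """Split a byte stream into segments sized according to its data type."""
    size = SEGMENT_SIZE_BY_TYPE.get(metadata.get("data_type"), DEFAULT_SEGMENT_SIZE)
    return [stream[i:i + size] for i in range(0, len(stream), size)]


payload = b"x" * (64 * 1024)
print(len(partition(payload, {"data_type": "database"})))
print(len(partition(payload, {"data_type": "video"})))
```

The same payload yields many small segments or one large segment depending on the declared type, which is the trade-off between deduplication ratio and segment count that the abstract describes.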