Patents by Inventor Guilherme Menezes

Guilherme Menezes has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Publication number: 20220365852
    Abstract: Methods and systems for backing up and restoring files that have multiple hard links using master file references and index node-based mappings are described. In some cases, file fetching and restoration may be performed by a storage appliance using master file references in which a master file is identified for each multi-link file that is backed-up on the storage appliance and then referenced by one or more hard links to the multi-link file. In other cases, file fetching and restoration may be performed by a storage appliance using index node-based mappings for multi-link files that provide mappings between index node identifiers (e.g., inode numbers) for the multi-link files on a primary system and hard link paths for storing the file contents of the multi-link files on a storage appliance used for backing up the primary system.
    Type: Application
    Filed: July 20, 2022
    Publication date: November 17, 2022
    Inventors: Looi Chow Lee, Ziqi Liu, Guilherme Menezes
  • Patent number: 11474912
    Abstract: Methods and systems for backing up and restoring files that have multiple hard links using master file references and index node-based mappings are described. In some cases, file fetching and restoration may be performed by a storage appliance using master file references in which a master file is identified for each multi-link file that is backed-up on the storage appliance and then referenced by one or more hard links to the multi-link file. In other cases, file fetching and restoration may be performed by a storage appliance using index node-based mappings for multi-link files that provide mappings between index node identifiers (e.g., inode numbers) for the multi-link files on a primary system and hard link paths for storing the file contents of the multi-link files on a storage appliance used for backing up the primary system.
    Type: Grant
    Filed: January 31, 2019
    Date of Patent: October 18, 2022
    Assignee: Rubrik, Inc.
    Inventors: Looi Chow Lee, Ziqi Liu, Guilherme Menezes
  • Patent number: 11269817
    Abstract: In one example, a method includes measuring an amount of physical storage space used, or expected to be used, by a portion of a dataset S of segments, and measuring the amount of physical storage space includes receiving information that identifies an ad-hoc group of size ‘n’ of files F1 . . . Fn that makes up a subset of the dataset S, determining a number of unique segments in the dataset S, identifying a respective unique segment set UF1 . . . UFN for each of the ‘n’ files in the ad-hoc group of files, performing a set union operation on the unique segment sets UF1 . . . UFN, and determining a sum of sizes of the unique segment sets UF1 . . . UFN, where the sum is the amount of physical storage space used or expected to be used by the ad-hoc group of size ‘n’ of files F1 . . . Fn.
    Type: Grant
    Filed: April 10, 2019
    Date of Patent: March 8, 2022
    Assignee: EMC IP HOLDING COMPANY LLC
    Inventors: Guilherme Menezes, Fabiano Botelho, Abdullah Reza
  • Patent number: 11151030
    Abstract: A first set of garbage collection (GC) features and non-GC features associated with a storage system are received, the first set of features being associated with a predetermined start date and a time window. A learning equation is generated having a plurality of vectors of GC features and a plurality of vectors of non-GC features. For a current iteration representing a current GC process, it is determined whether a first prior GC process was started within the time window. An entry of vectors of the non-GC features of the learning equation is populated based on corresponding feature values of the first set of non-GC features, in response to determining that the first prior GC process was started within the time window. A predetermined regression algorithm is applied to the learning equation to generate a GC duration predictive model to predict a GC duration of a subsequent GC process.
    Type: Grant
    Filed: August 31, 2016
    Date of Patent: October 19, 2021
    Assignee: EMC IP HOLDING COMPANY LLC
    Inventors: Fabiano C. Botelho, Mark Chamness, Dmitry Serdyuk, Guilherme Menezes
  • Patent number: 10838923
    Abstract: Identifying files that do not deduplicate well in a storage system with deduplication facilitates optimizing storage capacity by moving the identified files to less expensive storage without deduplication. Any set of files can be examined to remove files that are identified as files that do not deduplicate well. The process of identification includes arranging the files in a predefined order and using bitmap representations of the unique segments in the files to determine a count of different segments in neighboring next files compared to the previous files, and removing from deduplication any next files that exceed a difference threshold. The bitmap representations of the files allow the identification processes to be performed efficiently for large datasets. Any over-identification of files is minimized by repeating the identification processes on the set of files after arranging them in the reverse order.
    Type: Grant
    Filed: December 18, 2015
    Date of Patent: November 17, 2020
    Assignee: EMC IP HOLDING COMPANY LLC
    Inventors: Guilherme Menezes, Abdullah Reza
  • Publication number: 20200250049
    Abstract: Methods and systems for backing up and restoring files that have multiple hard links using master file references and index node-based mappings are described. In some cases, file fetching and restoration may be performed by a storage appliance using master file references in which a master file is identified for each multi-link file that is backed-up on the storage appliance and then referenced by one or more hard links to the multi-link file. In other cases, file fetching and restoration may be performed by a storage appliance using index node-based mappings for multi-link files that provide mappings between index node identifiers (e.g., inode numbers) for the multi-link files on a primary system and hard link paths for storing the file contents of the multi-link files on a storage appliance used for backing up the primary system.
    Type: Application
    Filed: January 31, 2019
    Publication date: August 6, 2020
    Applicant: RUBRIK, INC.
    Inventors: Looi Chow Lee, Ziqi Liu, Guilherme Menezes
  • Patent number: 10459648
    Abstract: File measurements are computed and stored in persistent memory of a deduplicated storage system as files are written or on demand, where the file measurements are used to estimate storage requirements for storing a subset of files. The file measurements are accumulated into an initial measurement at a first point in time and a final measurement at a second point in time to obtain an estimate of any change in a quantity of unique segments required to store the subset of files in the deduplicated storage system between the first and second points in time. Future storage requirements can be estimated based on a computed rate of change in the amount of storage required to store the subset of files between the first and second points in time.
    Type: Grant
    Filed: December 14, 2015
    Date of Patent: October 29, 2019
    Assignee: EMC IP Holding Company LLC
    Inventors: Guilherme Menezes, Abdullah Reza
  • Patent number: 10430383
    Abstract: In one example, a method for processing data includes receiving information that identifies an ad hoc group of size ‘n’ of files F1 . . . Fn, each file F including a respective file sequence S that includes K data segments. Next, each file sequence S is sampled to obtain a sequence SS of data segments from the file sequence S, and a non-random sampling of data segments is sampled from each sequence SS to obtain a set SSU of the sequence SS. The data segments of each set SSU are then sampled to obtain a sample subset SSUS of the set SSU, and a compression ratio is determined for each data segment in each sample subset SSUS. Finally, an average data compression RF1 . . . Fn is estimated and output for the files F in the group of size ‘n’, based on the compression ratios.
    Type: Grant
    Filed: September 30, 2015
    Date of Patent: October 1, 2019
    Assignee: EMC IP HOLDING COMPANY LLC
    Inventors: Guilherme Menezes, Teng Xu, Abdullah Reza
  • Publication number: 20190236054
    Abstract: In one example, a method includes measuring an amount of physical storage space used, or expected to be used, by a portion of a dataset S of segments, and measuring the amount of physical storage space includes receiving information that identifies an ad-hoc group of size ‘n’ of files F1 . . . Fn that makes up a subset of the dataset S, determining a number of unique segments in the dataset S, identifying a respective unique segment set UF1 . . . UFN for each of the ‘n’ files in the ad-hoc group of files, performing a set union operation on the unique segment sets UF1 . . . UFN, and determining a sum of sizes of the unique segment sets UF1 . . . UFN, where the sum is the amount of physical storage space used or expected to be used by the ad-hoc group of size ‘n’ of files F1 . . . Fn.
    Type: Application
    Filed: April 10, 2019
    Publication date: August 1, 2019
    Inventors: Guilherme Menezes, Fabiano Botelho, Abdullah Reza
  • Patent number: 10303662
    Abstract: In one example, a method for processing data includes receiving information that identifies an ad-hoc group of size ‘n’ of files F1 . . . Fn, each file F including a respective segment set S, and then sampling a representation of each unique segment in the segment set S to obtain a sampled unique segment count for each file F. A unique segment count is then obtained for each file F by applying a sampling ratio R to each sampled unique segment count, and an average segment size for each file F is determined. Next, a physical space measurement is generated for each file F based on the average segment size and the unique segment count, and then a total physical space measurement p is generated based on the individual physical space measurements for each file F.
    Type: Grant
    Filed: September 30, 2015
    Date of Patent: May 28, 2019
    Assignee: EMC IP HOLDING COMPANY LLC
    Inventors: Guilherme Menezes, Fabiano Botelho, Abdullah Reza
  • Patent number: 10303797
    Abstract: Clustering files in deduplication systems is based on an estimate of similarity between files in a file system. The estimates of similarity are based on how much content the files share, where the estimate of how much content is shared is based on an estimate of segments shared. The estimate of segments shared is based on segment offsets found in the files' bitmap vectors of segment offsets. The found segment offsets are used to generate a cluster definition approximating an optimal data structure for clustering files that share content. The approximated optimal data structure defines clusters hierarchically arranged based on the offset numbers of the found segment offsets.
    Type: Grant
    Filed: December 18, 2015
    Date of Patent: May 28, 2019
    Assignee: EMC IP HOLDING COMPANY LLC
    Inventors: Guilherme Menezes, Abdullah Reza
  • Patent number: 9460389
    Abstract: Mechanisms for predicting a GC duration are described herein. In one embodiment, the mechanisms include receiving a first set of features determined based on current operating status and prior garbage collection (GC) statistics of a first storage system. In one embodiment, the mechanisms include predicting a GC duration of a first GC process being performed at the first storage system by applying a predictive model on the first set of features, wherein the predictive model was generated based on a second set of features received periodically from a plurality of storage systems.
    Type: Grant
    Filed: May 31, 2013
    Date of Patent: October 4, 2016
    Assignee: EMC Corporation
    Inventors: Fabiano C. Botelho, Mark Chamness, Dmitry Serdyuk, Guilherme Menezes
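
The hard-link abstracts above (patent 11474912 and its related applications) describe mapping inode numbers on a primary system to hard link paths and identifying a master file for each multi-link file. The Python sketch below is only a loose illustration of that general idea, not the patented method: the function names `group_hard_links` and `plan_backup` are hypothetical, and picking the lexicographically first path as the "master" is an assumption made for the example.

```python
import os
import stat
from collections import defaultdict


def group_hard_links(root):
    """Map (device, inode) -> sorted paths under `root` that share that inode.

    Only regular files whose link count is greater than one are included.
    """
    groups = defaultdict(list)
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            st = os.lstat(path)  # lstat: do not follow symbolic links
            if stat.S_ISREG(st.st_mode) and st.st_nlink > 1:
                groups[(st.st_dev, st.st_ino)].append(path)
    return {key: sorted(paths) for key, paths in groups.items()}


def plan_backup(root):
    """Return {master_path: [other_link_paths]} for each multi-link file.

    Assumption for this sketch: the first path in sorted order acts as the
    master copy whose contents would be stored once; the remaining paths
    would be recorded as references to it.
    """
    plan = {}
    for paths in group_hard_links(root).values():
        master, *links = paths
        plan[master] = links
    return plan
```

Calling `plan_backup` on a directory yields one entry per multi-link file, so a backup tool built along these lines could copy each master's contents once and recreate the remaining paths with `os.link` at restore time.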