Patents by Inventor Istvan Gonczi

Istvan Gonczi has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 11960458
    Abstract: A technique for performing data deduplication operates at sub-block granularity by searching a deduplication database for a match between a candidate sub-block of a candidate block and a target sub-block of a previously-stored target block. When a match is found, the technique identifies a duplicate range shared between the candidate block and the target block and effects persistent storage of the duplicate range by configuring mapping metadata of the candidate block so that it points to the duplicate range in the target block.
    Type: Grant
    Filed: March 17, 2021
    Date of Patent: April 16, 2024
    Assignee: EMC IP Holding Company LLC
    Inventors: Philippe Armangau, Sorin Faibish, Istvan Gonczi, Ivan Bassov, Vamsi K. Vankamamidi
  • Patent number: 11847333
    Abstract: A method, computer program product, and computer system for identifying duplicate sectors in a block of a plurality of blocks. The duplicate sectors in the block may be zeroed out. A data reduction operation may be performed on the block after the duplicate sectors are zeroed out.
    Type: Grant
    Filed: July 31, 2019
    Date of Patent: December 19, 2023
    Assignee: EMC IP Holding Company LLC
    Inventors: Istvan Gonczi, Sorin Faibish, Ivan Basov
  • Patent number: 11593312
    Abstract: A method performed by a block-storage server, of storing data is described. The method includes (1) receiving, from a remote file server, data blocks to be written to persistent block storage managed by the block-storage server; (2) receiving, from the remote file server, metadata describing files to which the data blocks belong in a set of filesystems managed by the remote file server; and (3) selectively applying data reduction when storing the data blocks in the persistent block storage based, at least in part, on the received metadata. An apparatus, system, and computer program product for performing a similar method are also provided.
    Type: Grant
    Filed: July 31, 2019
    Date of Patent: February 28, 2023
    Assignee: EMC IP Holding Company LLC
    Inventors: Sorin Faibish, Philippe Armangau, Ivan Bassov, Istvan Gonczi
  • Patent number: 11513739
    Abstract: A method performed by a block-storage server, of storing data is described. The method includes (1) receiving, from a remote file server, data blocks to be written to persistent block storage managed by the block-storage server; (2) receiving, from the remote file server, metadata describing a placement of the data blocks in a filesystem managed by the remote file server; and (3) organizing the data blocks within the persistent block storage based, at least in part, on the received metadata. An apparatus, system, and computer program product for performing a similar method are also provided.
    Type: Grant
    Filed: July 31, 2019
    Date of Patent: November 29, 2022
    Assignee: EMC IP Holding Company LLC
    Inventors: Sorin Faibish, Ivan Bassov, Istvan Gonczi, Philippe Armangau
  • Patent number: 11500540
    Abstract: A technique for managing data storage includes generating entropy of blocks on a per-block basis and selectively performing inline compression on blocks based at least in part on their entropy. Entropy of a block provides a rough measure of the block's compressibility. Thus, using per-block entropy enables a storage system to steer compression decisions, e.g., whether to compress and/or how much to compress, flexibly and with high granularity, striking a balance between throughput and storage efficiency.
    Type: Grant
    Filed: October 28, 2020
    Date of Patent: November 15, 2022
    Assignee: EMC IP Holding Company LLC
    Inventors: Sorin Faibish, Ivan Bassov, Istvan Gonczi, Philippe Armangau, Vamsi K. Vankamamidi
  • Patent number: 11422975
    Abstract: A technique for performing data reduction applies deduplication principles when performing data compression, providing a form of enhanced compression. The technique obtains a chunk of data that contains multiple extents and applies deduplication actions to identify duplicate extents within the chunk. The technique marks duplicate extents in metadata. Such duplicate extents need not be compressed using conventional data compression, saving computational resources and considerable time.
    Type: Grant
    Filed: July 31, 2019
    Date of Patent: August 23, 2022
    Assignee: EMC IP Holding Company LLC
    Inventors: Sorin Faibish, Ivan Bassov, Istvan Gonczi, Philippe Armangau
  • Patent number: 11372579
    Abstract: Techniques for generating data sets may include: receiving an initial buffer that achieves a compression ratio responsive to compression processing using a compression algorithm, the initial buffer including first content located at a first position in the initial buffer and including second content located at a second position in the initial buffer; and generating a data set of buffers using the initial buffer. The data set may be expected to achieve a specified deduplication ratio responsive to deduplication processing and to achieve the compression ratio responsive to compression processing using the compression algorithm. Generating the data set may include generating a first plurality of buffers where each buffer of the first plurality is not a duplicate of another buffer in the first plurality, and generating a second plurality of duplicate buffers. Each duplicate buffer may be a duplicate of a buffer in the first plurality of buffers.
    Type: Grant
    Filed: October 22, 2020
    Date of Patent: June 28, 2022
    Assignee: EMC IP Holding Company LLC
    Inventors: Ivan Bassov, Istvan Gonczi, Sorin Faibish
  • Patent number: 11360954
    Abstract: A method, computer program product, and computing system for receiving a candidate data portion; calculating a distance-preserving hash for the candidate data portion; and performing an entropy analysis on the distance-preserving hash to generate a hash entropy for the candidate data portion.
    Type: Grant
    Filed: August 3, 2020
    Date of Patent: June 14, 2022
    Assignee: EMC IP Holding Company LLC
    Inventors: Sorin Faibish, Philip Shilane, Ivan Basov, Istvan Gonczi, Philippe Armangau, Vamsi Vankamamidi
  • Patent number: 11347423
    Abstract: A method, computer program product, and computer system for identifying a plurality of blocks. At least one heuristic associated with at least a portion of the plurality of blocks may be determined. It may be determined whether at least the portion of the plurality of blocks is a candidate for deduplication based upon, at least in part, the at least one heuristic. At least the portion of the plurality of blocks may be deduplicated based upon, at least in part, the at least one heuristic.
    Type: Grant
    Filed: July 29, 2019
    Date of Patent: May 31, 2022
    Assignee: EMC IP Holding Company LLC
    Inventors: Ivan Basov, Sorin Faibish, Istvan Gonczi
  • Publication number: 20220129162
    Abstract: A technique for managing data storage includes generating entropy of blocks on a per-block basis and selectively performing inline compression on blocks based at least in part on their entropy. Entropy of a block provides a rough measure of the block's compressibility. Thus, using per-block entropy enables a storage system to steer compression decisions, e.g., whether to compress and/or how much to compress, flexibly and with high granularity, striking a balance between throughput and storage efficiency.
    Type: Application
    Filed: October 28, 2020
    Publication date: April 28, 2022
    Inventors: Sorin Faibish, Ivan Bassov, Istvan Gonczi, Philippe Armangau, Vamsi K. Vankamamidi
  • Publication number: 20220129190
    Abstract: Techniques for generating data sets may include: receiving an initial buffer that achieves a compression ratio responsive to compression processing using a compression algorithm, the initial buffer including first content located at a first position in the initial buffer and including second content located at a second position in the initial buffer; and generating a data set of buffers using the initial buffer. The data set may be expected to achieve a specified deduplication ratio responsive to deduplication processing and to achieve the compression ratio responsive to compression processing using the compression algorithm. Generating the data set may include generating a first plurality of buffers where each buffer of the first plurality is not a duplicate of another buffer in the first plurality, and generating a second plurality of duplicate buffers. Each duplicate buffer may be a duplicate of a buffer in the first plurality of buffers.
    Type: Application
    Filed: October 22, 2020
    Publication date: April 28, 2022
    Applicant: EMC IP Holding Company LLC
    Inventors: Ivan Bassov, Istvan Gonczi, Sorin Faibish
  • Patent number: 11308036
    Abstract: Techniques for processing data may include: receiving a plurality of data chunks for a data set; performing data deduplication processing for the plurality of data chunks; determining, in accordance with one or more criteria, whether a frequency distribution of a frequency histogram of digest byte frequencies is sufficiently uniform; and responsive to determining that the frequency distribution of the frequency histogram is not sufficiently uniform, performing processing to update data deduplication settings for the data set. Updating the data deduplication settings may include using a stronger hash algorithm and/or a larger size digest when generating subsequent digests. The data deduplication processing may include: determining, using a current hash algorithm, a plurality of digests for the plurality of data chunks of the data set; and updating the frequency histogram of digest byte frequencies for the data set in accordance with the plurality of digests.
    Type: Grant
    Filed: April 11, 2019
    Date of Patent: April 19, 2022
    Assignee: EMC IP Holding Company LLC
    Inventors: Istvan Gonczi, Ivan Bassov, Sorin Faibish
  • Patent number: 11221991
    Abstract: Techniques for data processing may include: receiving a data chunk of a data set; determining, in accordance with criteria including a compressibility ratio for the data set and a cost ratio of compression computation cost and entropy computation cost, whether to activate or deactivate entropy computation for the data set, wherein the compressibility ratio is a ratio of a number of compressible data chunks of the data set and a number of uncompressible data chunks of the data set; and responsive to determining to activate entropy computation for the data set, performing first processing comprising: determining an entropy value for the data chunk; and determining, in accordance with the entropy value for the data chunk, whether to compress the data chunk.
    Type: Grant
    Filed: October 30, 2018
    Date of Patent: January 11, 2022
    Assignee: EMC IP Holding Company LLC
    Inventors: Ivan Bassov, Philippe Armangau, Sorin Faibish, Istvan Gonczi
  • Patent number: 11163449
    Abstract: A method of accepting writes in a multilayered storage system is provided. The method includes (a) monitoring a rate of flushing of data from a first data storage component to a second data storage component; (b) setting an intake rate for the first data storage component based on the monitored flushing rate; and (c) throttling writes to the first data storage component based on the set intake rate. An apparatus, system, and computer program product for performing a similar method are also provided.
    Type: Grant
    Filed: October 17, 2019
    Date of Patent: November 2, 2021
    Assignee: EMC IP Holding Company LLC
    Inventors: Sorin Faibish, Istvan Gonczi, Ivan Bassov
  • Patent number: 11157188
    Abstract: Techniques for processing data may include: receiving a candidate data block; computing a distance using a distance function, wherein the distance is an entropy-based distance and denotes a measurement of similarity between the candidate data block and a target data block; and determining, using the distance, whether to perform data deduplication of the candidate data block with respect to the target data block to identify at least one sub-block of the candidate data block that is a duplicate of at least one sub-block of the target data block. If the distance is less than a threshold, it may be expected to have a matching sub-block between the candidate and target data blocks. The distance may be a difference between entropy values for the candidate and target data blocks. The first entropy value may be used to determine whether to compress or perform partial deduplication for the candidate data block.
    Type: Grant
    Filed: April 24, 2019
    Date of Patent: October 26, 2021
    Assignee: EMC IP Holding Company LLC
    Inventors: Ivan Bassov, Sorin Faibish, Istvan Gonczi, Philippe Armangau
  • Patent number: 11138154
    Abstract: A method, computer program product, and computing system for performing an entropy analysis on each of a plurality of candidate data chunks associated with a potential candidate to generate a plurality of candidate data chunk entropies; performing an entropy analysis on each of a plurality of target data chunks associated with a potential target to generate a plurality of target data chunk entropies; identifying a candidate data chunk entropy limit, chosen from the plurality of candidate data chunk entropies, and a target data chunk entropy limit, chosen from the plurality of target data chunk entropies; and comparing a specific candidate data chunk associated with the candidate data chunk entropy limit to a specific target data chunk associated with the target data chunk entropy limit to determine if the specific candidate data chunk and the specific target data chunk are identical.
    Type: Grant
    Filed: May 3, 2019
    Date of Patent: October 5, 2021
    Assignee: EMC IP Holding Company LLC
    Inventors: Sorin Faibish, Philip Shilane, Ivan Basov, Istvan Gonczi, Vamsi Vankamamidi
  • Publication number: 20210286783
    Abstract: A technique for performing data deduplication operates at sub-block granularity by searching a deduplication database for a match between a candidate sub-block of a candidate block and a target sub-block of a previously-stored target block. When a match is found, the technique identifies a duplicate range shared between the candidate block and the target block and effects persistent storage of the duplicate range by configuring mapping metadata of the candidate block so that it points to the duplicate range in the target block.
    Type: Application
    Filed: March 17, 2021
    Publication date: September 16, 2021
    Inventors: Philippe Armangau, Sorin Faibish, Istvan Gonczi, Ivan Bassov, Vamsi K. Vankamamidi
  • Patent number: 11112987
    Abstract: Techniques for processing data may include: receiving a candidate block; performing partial deduplication processing of the candidate block; receiving a second candidate block subsequent to performing partial deduplication processing for the candidate block; and performing first processing to determine whether to perform promotion processing for the entry. The partial deduplication processing may include: partially deduplicating at least one sub-block of the candidate block; and creating an entry in a deduplication database for the candidate block, wherein the entry includes a digest of the candidate block and the entry denotes a potential target block having the digest, and wherein the entry includes a counter that tracks a number of missed full block deduplications between the potential target block and subsequently processed candidate blocks. The promotion processing promotes the potential target block, having the first digest of the entry, to a new target block.
    Type: Grant
    Filed: April 24, 2019
    Date of Patent: September 7, 2021
    Assignee: EMC IP Holding Company LLC
    Inventors: Istvan Gonczi, Philippe Armangau, Sorin Faibish, Ivan Bassov
  • Patent number: 11112985
    Abstract: Techniques for processing data may include: receiving a candidate data block; computing a distance using a distance function, wherein the distance denotes a measurement of similarity between the candidate data block and a target data block; and determining, using the distance, whether to perform data deduplication of the candidate data block with respect to the target data block to identify at least one sub-block of the candidate data block that is a duplicate of at least one sub-block of the target data block. The distance may be computed using a bit-wise logical exclusive-or operation of the contents of the candidate data block and the target data block. The distance may be computed using a bit-wise logical exclusive-or operation of digests computed for the candidate and target data blocks using a distance preserving hash function. The target and candidate block may be similar if the distance is less than a threshold.
    Type: Grant
    Filed: April 24, 2019
    Date of Patent: September 7, 2021
    Assignee: EMC IP Holding Company LLC
    Inventors: Ivan Bassov, Philippe Armangau, Sorin Faibish, Istvan Gonczi
  • Patent number: 10990565
    Abstract: A method, computer program product, and computing system for processing a data portion to divide the data portion into a plurality of data chunks; performing an entropy analysis on each of the plurality of data chunks to generate a plurality of data chunk entropies; and determining an average data chunk entropy from the plurality of data chunk entropies.
    Type: Grant
    Filed: May 3, 2019
    Date of Patent: April 27, 2021
    Assignee: EMC IP Holding Company LLC
    Inventors: Sorin Faibish, Philip Shilane, Ivan Basov, Istvan Gonczi, Philippe Armangau, Vamsi Vankamamidi
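
Several of the abstracts above (for example, patent 11500540) describe using per-block entropy as a rough measure of compressibility to decide whether to compress a block. The following is a minimal illustrative sketch of that general idea, not the patented implementation: it assumes Shannon entropy over byte frequencies, uses `zlib` as a stand-in compressor, and the names `shannon_entropy`, `store_block`, and the cutoff `ENTROPY_THRESHOLD` are hypothetical choices made for this example only.

```python
import math
import zlib
from collections import Counter
from typing import Tuple


def shannon_entropy(block: bytes) -> float:
    """Return the Shannon entropy of a block in bits per byte (0.0 to 8.0)."""
    if not block:
        return 0.0
    total = len(block)
    counts = Counter(block)  # byte value -> number of occurrences
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


# Hypothetical cutoff (an assumption, not a value from any patent): blocks
# above this many bits per byte are treated as unlikely to compress well.
ENTROPY_THRESHOLD = 7.0


def store_block(block: bytes, threshold: float = ENTROPY_THRESHOLD) -> Tuple[bool, bytes]:
    """Compress the block only when its entropy is at or below the threshold.

    Returns (was_compressed, payload).
    """
    if shannon_entropy(block) > threshold:
        return False, block  # skip compression for high-entropy data
    return True, zlib.compress(block)


if __name__ == "__main__":
    low = b"\x00" * 4096            # a single repeated byte: minimal entropy
    high = bytes(range(256)) * 16   # uniform byte histogram: maximal entropy
    for name, blk in (("low", low), ("high", high)):
        compressed, payload = store_block(blk)
        print(name, round(shannon_entropy(blk), 3), compressed, len(payload))
```

A byte-frequency estimate like this is cheap to compute but ignores ordering, so a block with a uniform histogram can still be highly repetitive; the abstracts accordingly call entropy only a "rough measure" of compressibility.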