Patents by Inventor Istvan Gonczi

Istvan Gonczi has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 10990310
    Abstract: Techniques for data processing may include: determining one or more sub-blocks of a target block that match one or more sub-blocks of a candidate block; creating a shared sub-block mapping (SSM) structure having a plurality of entries, wherein each of the plurality of entries corresponds to a different one of the sub-blocks in the candidate block and wherein a value stored in said each entry, corresponding to one of the sub-blocks of the candidate block, identifies a sub-block of the target block matching said one sub-block of the candidate block; and storing the candidate block as a deduplicated block sharing at least one sub-block with the target block. The SSM structure may be stored as a metadata structure of the candidate block to identify deduplicated sub-blocks of the candidate block and to identify sub-blocks of the target block providing content for the deduplicated sub-blocks of the candidate block.
    Type: Grant
    Filed: April 24, 2019
    Date of Patent: April 27, 2021
    Assignee: EMC IP Holding Company LLC
    Inventors: Istvan Gonczi, Ivan Bassov, Sorin Faibish, Philippe Armangau
  • Publication number: 20210117799
    Abstract: A method of monitoring storage performance of a remote data storage apparatus (DSA) is provided. The method includes (a) receiving performance metrics of the DSA and a first set of behavioral estimates generated by a first neural network (NN) running on the DSA operating on the performance metrics; (b) operating a second NN on the computing device with the received performance metrics as inputs, the second NN configured to produce a second set of behavioral estimates as outputs in response to the performance metrics, the second NN running at a higher level of precision than the first NN; and (c) sending to the remote DSA updated parameters of an updated version of the first NN based at least in part on the performance metrics and the first and second sets of behavioral estimates. Apparatuses, systems, and computer program products for performing similar methods are also provided.
    Type: Application
    Filed: October 17, 2019
    Publication date: April 22, 2021
    Inventors: Sorin Faibish, Istvan Gonczi, Ivan Bassov
  • Publication number: 20210117099
    Abstract: A method of accepting writes in a multilayered storage system is provided. The method includes (a) monitoring a rate of flushing of data from a first data storage component to a second data storage component; (b) setting an intake rate for the first data storage component based on the monitored flushing rate; and (c) throttling writes to the first data storage component based on the set intake rate. An apparatus, system, and computer program product for performing a similar method are also provided.
    Type: Application
    Filed: October 17, 2019
    Publication date: April 22, 2021
    Inventors: Sorin Faibish, Istvan Gonczi, Ivan Bassov
  • Patent number: 10963436
    Abstract: A technique for performing data deduplication operates at sub-block granularity by searching a deduplication database for a match between a candidate sub-block of a candidate block and a target sub-block of a previously-stored target block. When a match is found, the technique identifies a duplicate range shared between the candidate block and the target block and effects persistent storage of the duplicate range by configuring mapping metadata of the candidate block so that it points to the duplicate range in the target block.
    Type: Grant
    Filed: October 31, 2018
    Date of Patent: March 30, 2021
    Assignee: EMC IP Holding Company LLC
    Inventors: Philippe Armangau, Sorin Faibish, Istvan Gonczi, Ivan Bassov, Vamsi K. Vankamamidi
  • Patent number: 10963437
    Abstract: A method, computer program product, and computing system for identifying a potential deduplication candidate and a related deduplication target; executing a comparison operation with respect to the potential deduplication candidate and the related deduplication target to generate a comparison result; and determining a level of similarity between the potential deduplication candidate and the related deduplication target by processing the comparison result.
    Type: Grant
    Filed: May 3, 2019
    Date of Patent: March 30, 2021
    Assignee: EMC IP Holding Company LLC
    Inventors: Istvan Gonczi, Ivan Basov, Sorin Faibish, Philippe Armangau, Anton Kucherov
  • Patent number: 10956370
    Abstract: Techniques for processing a data set may comprise: performing first processing that forms a first compression unit, wherein the first compression unit includes data chunks including a first data chunk having a first entropy value less than an entropy threshold, the first processing including: receiving a second data chunk; determining, in accordance with criteria, whether to add the second data chunk to the first compression unit; and responsive to determining to add the second data chunk to the first compression unit, adding the second data chunk to the first compression unit; and compressing the first compression unit as a single compressible unit. The second chunk may be added if its entropy value is less than the entropy threshold and if entropy values of the first and second chunks are similar. The second chunk may be added if the resulting compression unit provides sufficient storage/compression benefit.
    Type: Grant
    Filed: September 9, 2019
    Date of Patent: March 23, 2021
    Assignee: EMC IP Holding Company LLC
    Inventors: Ivan Bassov, Sorin Faibish, Istvan Gonczi
  • Patent number: 10936228
    Abstract: In response to a cache flush event indicating that host data accumulated in a cache of a storage processor of a data storage system is to be flushed to a lower deck file system, an aggregation set of blocks is formed within the cache, and a digest calculation group is selected from within the aggregation set. Hardware vector processing logic is caused to simultaneously calculate crypto-digests from the blocks in the digest calculation group. If one of the resulting crypto-digests matches a previously generated crypto-digest, deduplication is performed that i) causes the lower deck file system to indicate the block of data from which the previously generated crypto-digest was generated and ii) discards the block that corresponds to the matching crypto-digest. Objects required by a digest generation component may be allocated in a just in time manner to avoid having to manage a pool of pre-allocated objects.
    Type: Grant
    Filed: June 24, 2019
    Date of Patent: March 2, 2021
    Assignee: EMC IP Holding Company LLC
    Inventors: Istvan Gonczi, Ivan Bassov, Philippe Armangau
  • Publication number: 20210034248
    Abstract: A method, performed by a block-storage server, of storing data is described. The method includes (1) receiving, from a remote file server, data blocks to be written to persistent block storage managed by the block-storage server; (2) receiving, from the remote file server, metadata describing a placement of the data blocks in a filesystem managed by the remote file server; and (3) organizing the data blocks within the persistent block storage based, at least in part, on the received metadata. An apparatus, system, and computer program product for performing a similar method are also provided.
    Type: Application
    Filed: July 31, 2019
    Publication date: February 4, 2021
    Inventors: Sorin Faibish, Ivan Bassov, Istvan Gonczi, Philippe Armangau
  • Publication number: 20210034579
    Abstract: A method, computer program product, and computer system for identifying, at a block level of a file, a duplicate block of a plurality of blocks within the file. Granularity of a block size used for deduplication of the file at the block level may be adjusted. A type of deduplication may be adjusted for the file. Deduplication of the file at the block level within the file may be executed based upon, at least in part, the granularity of the block size used for deduplication of the file at the block level.
    Type: Application
    Filed: August 1, 2019
    Publication date: February 4, 2021
    Inventors: Ivan Basov, Sorin Faibish, Istvan Gonczi, Philippe Armangau
  • Publication number: 20210034575
    Abstract: A technique for performing data reduction applies deduplication principles when performing data compression, providing a form of enhanced compression. The technique obtains a chunk of data that contains multiple extents and applies deduplication actions to identify duplicate extents within the chunk. The technique marks duplicate extents in metadata. Such duplicate extents need not be compressed using conventional data compression, saving computational resources and considerable time.
    Type: Application
    Filed: July 31, 2019
    Publication date: February 4, 2021
    Inventors: Sorin Faibish, Ivan Bassov, Istvan Gonczi, Philippe Armangau
  • Publication number: 20210034262
    Abstract: A method, computer program product, and computer system for identifying a plurality of blocks. At least one heuristic associated with at least a portion of the plurality of blocks may be determined. It may be determined whether at least the portion of the plurality of blocks is a candidate for deduplication based upon, at least in part, the at least one heuristic. At least the portion of the plurality of blocks may be deduplicated based upon, at least in part, the at least one heuristic.
    Type: Application
    Filed: July 29, 2019
    Publication date: February 4, 2021
    Inventors: Ivan Basov, Sorin Faibish, Istvan Gonczi
  • Publication number: 20210034263
    Abstract: A method, computer program product, and computer system for identifying duplicate sectors in a block of a plurality of blocks. The duplicate sectors in the block may be zeroed out. A data reduction operation may be performed on the block after the duplicate sectors are zeroed out.
    Type: Application
    Filed: July 31, 2019
    Publication date: February 4, 2021
    Inventors: Istvan Gonczi, Sorin Faibish, Ivan Basov
  • Publication number: 20210034577
    Abstract: A method, performed by a block-storage server, of storing data is described. The method includes (1) receiving, from a remote file server, data blocks to be written to persistent block storage managed by the block-storage server; (2) receiving, from the remote file server, metadata describing files to which the data blocks belong in a set of filesystems managed by the remote file server; and (3) selectively applying data reduction when storing the data blocks in the persistent block storage based, at least in part, on the received metadata. An apparatus, system, and computer program product for performing a similar method are also provided.
    Type: Application
    Filed: July 31, 2019
    Publication date: February 4, 2021
    Inventors: Sorin Faibish, Philippe Armangau, Ivan Bassov, Istvan Gonczi
  • Publication number: 20210034576
    Abstract: A method, computer program product, and computer system for identifying whether one of a first phase associated with a data operation is occurring with a file and a second phase associated with the data operation is occurring with the file. Resources available for data reduction operations may be increased when the first phase is occurring with the file. The resources available for data reduction operations may be decreased when the second phase is occurring with the file.
    Type: Application
    Filed: August 1, 2019
    Publication date: February 4, 2021
    Inventors: Istvan Gonczi, Ivan Basov, Sorin Faibish, Philippe Armangau
  • Publication number: 20210011894
    Abstract: A method, computer program product, and computing system for receiving a candidate data portion; calculating a distance-preserving hash for the candidate data portion; and performing an entropy analysis on the distance-preserving hash to generate a hash entropy for the candidate data portion.
    Type: Application
    Filed: August 3, 2020
    Publication date: January 14, 2021
    Inventors: Sorin Faibish, Philip Shilane, Ivan Basov, Istvan Gonczi, Philippe Armangau, Vamsi Vankamamidi
  • Patent number: 10852975
    Abstract: Techniques for data processing may include: receiving a data chunk and an associated digest; and performing data deduplication processing for the data chunk comprising: determining whether there is an existing entry in a deduplication digest cache for the digest; and responsive to determining there is no existing entry in the deduplication digest cache, performing processing including: determining whether there is an existing entry in a mapping structure for the digest, the mapping structure mapping digests to associated pages of related entries in a deduplication data store; and responsive to determining there is an existing entry in the mapping structure, performing second processing including: obtaining, from the existing entry, a page mapped to the digest; and loading the page of entries from the deduplication data store into the deduplication digest cache. At least some entries of the page may be prefetched and loaded into the deduplication digest cache prior to use.
    Type: Grant
    Filed: January 24, 2019
    Date of Patent: December 1, 2020
    Assignee: EMC IP Holding Company LLC
    Inventors: Ivan Bassov, Istvan Gonczi
  • Patent number: 10838643
    Abstract: A technique for managing cache in a storage system that supports data deduplication renders each of a set of data blocks as multiple sub-blocks and loads a cache-resident digest database on a per-block basis, selectively creating new digest entries in the database for all sub-blocks in a block, but only for blocks that contain no duplicate sub-blocks. Sub-blocks of blocks containing duplicates are excluded. By limiting digest entries to sub-blocks of blocks that contain no duplicates, the storage system limits the size of the digest database, and thus of the cache, while also biasing the contents of the digest database toward entries that are likely to produce deduplication matches in the future.
    Type: Grant
    Filed: October 31, 2018
    Date of Patent: November 17, 2020
    Assignee: EMC IP Holding Company LLC
    Inventors: Sorin Faibish, Philippe Armangau, Istvan Gonczi, Ivan Bassov, Vamsi K. Vankamamidi
  • Publication number: 20200349116
    Abstract: A method, computer program product, and computing system for encoding a candidate data portion to generate an encoded candidate data portion; identifying one or more portion similarities between the encoded candidate data portion and an encoded target data portion to position the one or more portion similarities with respect to the encoded target data portion, thus generating one or more portion similarity measurements; identifying one or more portion differences between the encoded candidate data portion and the encoded target data portion to generate one or more portion difference measurements; and combining the one or more portion similarity measurements and the one or more portion difference measurements to generate a candidate similarity measurement for the candidate data portion.
    Type: Application
    Filed: May 3, 2019
    Publication date: November 5, 2020
    Inventors: Sorin Faibish, Philip Shilane, Ivan Basov, Istvan Gonczi, Philippe Armangau, Vamsi Vankamamidi
  • Publication number: 20200349118
    Abstract: A method, computer program product, and computing system for performing an entropy analysis on each of a plurality of candidate data chunks associated with a potential candidate to generate a plurality of candidate data chunk entropies; performing an entropy analysis on each of a plurality of target data chunks associated with a potential target to generate a plurality of target data chunk entropies; identifying a candidate data chunk entropy limit, chosen from the plurality of candidate data chunk entropies, and a target data chunk entropy limit, chosen from the plurality of target data chunk entropies; and comparing a specific candidate data chunk associated with the candidate data chunk entropy limit to a specific target data chunk associated with the target data chunk entropy limit to determine if the specific candidate data chunk and the specific target data chunk are identical.
    Type: Application
    Filed: May 3, 2019
    Publication date: November 5, 2020
    Inventors: Sorin Faibish, Philip Shilane, Ivan Basov, Istvan Gonczi, Vamsi Vankamamidi
  • Publication number: 20200349117
    Abstract: A method, computer program product, and computing system for processing a data portion to divide the data portion into a plurality of data chunks; performing an entropy analysis on each of the plurality of data chunks to generate a plurality of data chunk entropies; and determining an average data chunk entropy from the plurality of data chunk entropies.
    Type: Application
    Filed: May 3, 2019
    Publication date: November 5, 2020
    Inventors: Sorin Faibish, Philip Shilane, Ivan Basov, Istvan Gonczi, Philippe Armangau, Vamsi Vankamamidi
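
The abstract of publication 20200349117 above describes three steps: dividing a data portion into chunks, running an entropy analysis on each chunk, and averaging the resulting chunk entropies. The following is a minimal Python sketch of that flow, not the method as claimed. It assumes fixed-size chunks and Shannon entropy over byte frequencies, neither of which is stated in the abstract; the function names and the 512-byte default chunk size are hypothetical and chosen only for illustration.

```python
import math
from collections import Counter


def shannon_entropy(chunk: bytes) -> float:
    """Shannon entropy of a byte string in bits per byte (0.0 to 8.0).

    Assumption: the abstract does not name an entropy measure; Shannon
    entropy over byte frequencies is used here as a common choice.
    """
    if not chunk:
        return 0.0
    total = len(chunk)
    counts = Counter(chunk)
    # Sum p * log2(1/p) over the byte values that actually occur.
    return sum((n / total) * math.log2(total / n) for n in counts.values())


def split_into_chunks(data: bytes, chunk_size: int) -> list[bytes]:
    """Divide a data portion into fixed-size chunks (last one may be short)."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    return [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)]


def average_chunk_entropy(data: bytes, chunk_size: int = 512) -> float:
    """Average of the per-chunk entropies of a data portion."""
    chunks = split_into_chunks(data, chunk_size)
    if not chunks:
        return 0.0
    entropies = [shannon_entropy(chunk) for chunk in chunks]
    return sum(entropies) / len(entropies)


if __name__ == "__main__":
    # One all-zero chunk (entropy 0) and one chunk holding every byte
    # value exactly once (entropy 8), so the average is 4 bits per byte.
    portion = bytes(256) + bytes(range(256))
    print(average_chunk_entropy(portion, chunk_size=256))
```

Averaging per-chunk values, rather than computing one entropy over the whole portion, keeps a single highly random region from hiding the fact that the rest of the data is repetitive.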