Patents by Inventor Ivan Bassov

Ivan Bassov has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 11157188
    Abstract: Techniques for processing data may include: receiving a candidate data block; computing a distance using a distance function, wherein the distance is an entropy-based distance and denotes a measurement of similarity between the candidate data block and a target data block; and determining, using the distance, whether to perform data deduplication of the candidate data block with respect to the target data block to identify at least one sub-block of the candidate data block that is a duplicate of at least one sub-block of the target data block. If the distance is less than a threshold, it may be expected to have a matching sub-block between the candidate and target data blocks. The distance may be a difference between entropy values for the candidate and target data blocks. The first entropy value may be used to determine whether to compress or perform partial deduplication for the candidate data block.
    Type: Grant
    Filed: April 24, 2019
    Date of Patent: October 26, 2021
    Assignee: EMC IP Holding Company LLC
    Inventors: Ivan Bassov, Sorin Faibish, Istvan Gonczi, Philippe Armangau
  • Publication number: 20210286783
    Abstract: A technique for performing data deduplication operates at sub-block granularity by searching a deduplication database for a match between a candidate sub-block of a candidate block and a target sub-block of a previously-stored target block. When a match is found, the technique identifies a duplicate range shared between the candidate block and the target block and effects persistent storage of the duplicate range by configuring mapping metadata of the candidate block so that it points to the duplicate range in the target block.
    Type: Application
    Filed: March 17, 2021
    Publication date: September 16, 2021
    Inventors: Philippe Armangau, Sorin Faibish, Istvan Gonczi, Ivan Bassov, Vamsi K. Vankamamidi
  • Patent number: 11112987
    Abstract: Techniques for processing data may include: receiving a candidate block; performing partial deduplication processing of the candidate block; receiving a second candidate block subsequent to performing partial deduplication processing for the candidate block; and performing first processing to determine whether to perform promotion processing for the entry. The partial deduplication processing may include: partially deduplicating at least one sub-block of the candidate block; and creating an entry in a deduplication database for the candidate block, wherein the entry includes a digest of the candidate block and the entry denotes a potential target block having the digest, and wherein the entry includes a counter that tracks a number of missed full block deduplications between the potential target block and subsequently processed candidate blocks. The promotion processing promotes the potential target block, having the first digest of the entry, to a new target block.
    Type: Grant
    Filed: April 24, 2019
    Date of Patent: September 7, 2021
    Assignee: EMC IP Holding Company LLC
    Inventors: Istvan Gonczi, Philippe Armangau, Sorin Faibish, Ivan Bassov
  • Patent number: 11112985
    Abstract: Techniques for processing data may include: receiving a candidate data block; computing a distance using a distance function, wherein the distance denotes a measurement of similarity between the candidate data block and a target data block; and determining, using the distance, whether to perform data deduplication of the candidate data block with respect to the target data block to identify at least one sub-block of the candidate data block that is a duplicate of at least one sub-block of the target data block. The distance may be computed using a bit-wise logical exclusive-or operation of the contents of the candidate data block and the target data block. The distance may be computed using a bit-wise logical exclusive-or operation of digests computed for the candidate and target data blocks using a distance preserving hash function. The target and candidate block may be similar if the distance is less than a threshold.
    Type: Grant
    Filed: April 24, 2019
    Date of Patent: September 7, 2021
    Assignee: EMC IP Holding Company LLC
    Inventors: Ivan Bassov, Philippe Armangau, Sorin Faibish, Istvan Gonczi
  • Patent number: 11068405
    Abstract: A storage processor in a data storage system includes a compression selection component that selects a data compression component to be used to compress host I/O data that is flushed from a persistent cache of the storage processor based on a current fullness level of the persistent cache. The compression selection component selects compression components implementing compression algorithms having relatively lower compression ratios for relatively higher current fullness levels of the persistent cache, and selects compression components implementing compression algorithms having relatively higher compression ratios for relatively lower current fullness levels of the persistent cache.
    Type: Grant
    Filed: April 19, 2018
    Date of Patent: July 20, 2021
    Assignee: EMC IP Holding Company LLC
    Inventors: Philippe Armangau, Ivan Bassov, Monica Chaudhary, Christopher A. Seibel
  • Patent number: 10990310
    Abstract: Techniques for data processing may include: determining one or more sub-blocks of a target block that match one or more sub-blocks of a candidate block; creating a shared sub-block mapping (SSM) structure having a plurality of entries, wherein each of the plurality of entries corresponds to a different one of the sub-blocks in the candidate block and wherein a value stored in said each entry, corresponding to one of the sub-blocks of the candidate block, identifies a sub-block of the target block matching said one sub-block of the candidate block; and storing the candidate block as a deduplicated block sharing at least one sub-block with the target block. The SSM structure may be stored as a metadata structure of the candidate block to identify deduplicated sub-blocks of the candidate block and to identify sub-blocks of the target block providing content for the deduplicated sub-blocks of the candidate block.
    Type: Grant
    Filed: April 24, 2019
    Date of Patent: April 27, 2021
    Assignee: EMC IP Holding Company LLC
    Inventors: Istvan Gonczi, Ivan Bassov, Sorin Faibish, Philippe Armangau
  • Publication number: 20210117799
    Abstract: A method of monitoring storage performance of a remote data storage apparatus (DSA) is provided. The method includes (a) receiving performance metrics of the DSA and a first set of behavioral estimates generated by a first neural network (NN) running on the DSA operating on the performance metrics; (b) operating a second NN on the computing device with the received performance metrics as inputs, the second NN configured to produce a second set of behavioral estimates as outputs in response to the performance metrics, the second NN running at a higher level of precision than the first NN; and (c) sending to the remote DSA updated parameters of an updated version of the first NN based at least in part on the performance metrics and the first and second sets of behavioral estimates. Apparatuses, systems, and computer program products for performing similar methods are also provided.
    Type: Application
    Filed: October 17, 2019
    Publication date: April 22, 2021
    Inventors: Sorin Faibish, Istvan Gonczi, Ivan Bassov
  • Publication number: 20210117099
    Abstract: A method of accepting writes in a multilayered storage system is provided. The method includes (a) monitoring a rate of flushing of data from a first data storage component to a second data storage component; (b) setting an intake rate for the first data storage component based on the monitored flushing rate; and (c) throttling writes to the first data storage component based on the set intake rate. An apparatus, system, and computer program product for performing a similar method are also provided.
    Type: Application
    Filed: October 17, 2019
    Publication date: April 22, 2021
    Inventors: Sorin Faibish, Istvan Gonczi, Ivan Bassov
  • Patent number: 10963436
    Abstract: A technique for performing data deduplication operates at sub-block granularity by searching a deduplication database for a match between a candidate sub-block of a candidate block and a target sub-block of a previously-stored target block. When a match is found, the technique identifies a duplicate range shared between the candidate block and the target block and effects persistent storage of the duplicate range by configuring mapping metadata of the candidate block so that it points to the duplicate range in the target block.
    Type: Grant
    Filed: October 31, 2018
    Date of Patent: March 30, 2021
    Assignee: EMC IP Holding Company LLC
    Inventors: Philippe Armangau, Sorin Faibish, Istvan Gonczi, Ivan Bassov, Vamsi K. Vankamamidi
  • Patent number: 10956370
    Abstract: Techniques for processing a data set may comprise: performing first processing that forms a first compression unit, wherein the first compression unit includes data chunks including a first data chunk having a first entropy value less than an entropy threshold, the first processing including: receiving a second data chunk; determining, in accordance with criteria, whether to add the second data chunk to the first compression unit; and responsive to determining to add the second data chunk to the first compression unit, adding the second data chunk to the first compression unit; and compressing the first compression unit as a single compressible unit. The second chunk may be added if its entropy value is less than the entropy threshold and if entropy values of the first and second chunks are similar. The second chunk may be added if the resulting compression unit provides sufficient storage/compression benefit.
    Type: Grant
    Filed: September 9, 2019
    Date of Patent: March 23, 2021
    Assignee: EMC IP Holding Company LLC
    Inventors: Ivan Bassov, Sorin Faibish, Istvan Gonczi
  • Patent number: 10936228
    Abstract: In response to a cache flush event indicating that host data accumulated in a cache of a storage processor of a data storage system is to be flushed to a lower deck file system, an aggregation set of blocks is formed within the cache, and a digest calculation group is selected from within the aggregation set. Hardware vector processing logic is caused to simultaneously calculate crypto-digests from the blocks in the digest calculation group. If one of the resulting crypto-digests matches a previously generated crypto-digest, deduplication is performed that i) causes the lower deck file system to indicate the block of data from which the previously generated crypto-digest was generated and ii) discards the block that corresponds to the matching crypto-digest. Objects required by a digest generation component may be allocated in a just-in-time manner to avoid having to manage a pool of pre-allocated objects.
    Type: Grant
    Filed: June 24, 2019
    Date of Patent: March 2, 2021
    Assignee: EMC IP Holding Company LLC
    Inventors: Istvan Gonczi, Ivan Bassov, Philippe Armangau
  • Patent number: 10922027
    Abstract: Techniques are disclosed for use in managing data storage in storage systems. For example, in one embodiment, a method is disclosed comprising receiving a request to store data of a data object in a storage system. The method also comprises determining that at least a portion of the data is to be stored in an uncompressed format in the storage system in response to receiving the request. The method also comprises storing at least a portion of the data in the uncompressed format in an allocation unit of a segment in the storage system such that the stored data in the uncompressed format emulates stored data in a compressed format based on said determination.
    Type: Grant
    Filed: November 2, 2018
    Date of Patent: February 16, 2021
    Assignee: EMC IP Holding Company LLC
    Inventors: Philippe Armangau, Ivan Bassov, John Didier, Ajay Karri
  • Publication number: 20210034248
    Abstract: A method of storing data, performed by a block-storage server, is described. The method includes (1) receiving, from a remote file server, data blocks to be written to persistent block storage managed by the block-storage server; (2) receiving, from the remote file server, metadata describing a placement of the data blocks in a filesystem managed by the remote file server; and (3) organizing the data blocks within the persistent block storage based, at least in part, on the received metadata. An apparatus, system, and computer program product for performing a similar method are also provided.
    Type: Application
    Filed: July 31, 2019
    Publication date: February 4, 2021
    Inventors: Sorin Faibish, Ivan Bassov, Istvan Gonczi, Philippe Armangau
  • Publication number: 20210034575
    Abstract: A technique for performing data reduction applies deduplication principles when performing data compression, providing a form of enhanced compression. The technique obtains a chunk of data that contains multiple extents and applies deduplication actions to identify duplicate extents within the chunk. The technique marks duplicate extents in metadata. Such duplicate extents need not be compressed using conventional data compression, saving computational resources and considerable time.
    Type: Application
    Filed: July 31, 2019
    Publication date: February 4, 2021
    Inventors: Sorin Faibish, Ivan Bassov, Istvan Gonczi, Philippe Armangau
  • Publication number: 20210034577
    Abstract: A method of storing data, performed by a block-storage server, is described. The method includes (1) receiving, from a remote file server, data blocks to be written to persistent block storage managed by the block-storage server; (2) receiving, from the remote file server, metadata describing files to which the data blocks belong in a set of filesystems managed by the remote file server; and (3) selectively applying data reduction when storing the data blocks in the persistent block storage based, at least in part, on the received metadata. An apparatus, system, and computer program product for performing a similar method are also provided.
    Type: Application
    Filed: July 31, 2019
    Publication date: February 4, 2021
    Inventors: Sorin Faibish, Philippe Armangau, Ivan Bassov, Istvan Gonczi
  • Patent number: 10852975
    Abstract: Techniques for data processing may include: receiving a data chunk and an associated digest; and performing data deduplication processing for the data chunk comprising: determining whether there is an existing entry in a deduplication digest cache for the digest; and responsive to determining there is no existing entry in the deduplication digest cache, performing processing including: determining whether there is an existing entry in a mapping structure for the digest, the mapping structure mapping digests to associated pages of related entries in a deduplication data store; and responsive to determining there is an existing entry in the mapping structure, performing second processing including: obtaining, from the existing entry, a page mapped to the digest; and loading the page of entries from the deduplication data store into the deduplication digest cache. At least some entries of the page may be prefetched and loaded into the deduplication digest cache prior to use.
    Type: Grant
    Filed: January 24, 2019
    Date of Patent: December 1, 2020
    Assignee: EMC IP Holding Company LLC
    Inventors: Ivan Bassov, Istvan Gonczi
  • Patent number: 10853325
    Abstract: Techniques for determining data reduction options may include: receiving first data reduction information regarding compression and deduplication of chunks of a data set; determining, in accordance with the first data reduction information for the data set, first settings denoting whether compression and deduplication are enabled or disabled for the data set; receiving, during a first time period when the first settings are effective, writes directed to the data set; receiving second data reduction information regarding compression and deduplication of chunks of the data set modified by writes during the first time period; and determining, in accordance with the second plurality of data reduction statistics for the data set, second settings denoting whether compression and deduplication are enabled or disabled for the data set.
    Type: Grant
    Filed: October 30, 2018
    Date of Patent: December 1, 2020
    Assignee: EMC IP Holding Company LLC
    Inventors: Sorin Faibish, Ronald A. Miller, II, James M. Pedone, Jr., Ivan Bassov
  • Patent number: 10838643
    Abstract: A technique for managing cache in a storage system that supports data deduplication renders each of a set of data blocks as multiple sub-blocks and loads a cache-resident digest database on a per-block basis, selectively creating new digest entries in the database for all sub-blocks in a block, but only for blocks that contain no duplicate sub-blocks. Sub-blocks of blocks containing duplicates are excluded. By limiting digest entries to sub-blocks of blocks that contain no duplicates, the storage system limits the size of the digest database, and thus of the cache, while also biasing the contents of the digest database toward entries that are likely to produce deduplication matches in the future.
    Type: Grant
    Filed: October 31, 2018
    Date of Patent: November 17, 2020
    Assignee: EMC IP Holding Company LLC
    Inventors: Sorin Faibish, Philippe Armangau, Istvan Gonczi, Ivan Bassov, Vamsi K. Vankamamidi
  • Patent number: 10838634
    Abstract: A technique for managing storage space in a data storage system generates liability values on a per-family basis, with each family including files in the file system that are related to one another by snapping. Each family thus groups together files in the file system that generally share at least some blocks among one another based on snapshot activities. Distinct files that do not share blocks based on snapping are provided in separate families. The technique further generates worst-case storage liability of a version family by differentiating between writable data objects and read-only data objects, thus allowing administrators to provide spare storage and/or prepare for increases in storage requirements as writable data objects grow and differentiate.
    Type: Grant
    Filed: December 30, 2016
    Date of Patent: November 17, 2020
    Assignee: EMC IP Holding Company LLC
    Inventors: Ivan Bassov, Walter C. Forrester, Michal Marko, Ahsan Rashid, Karl M. Owen
  • Publication number: 20200341667
    Abstract: Techniques for processing data may include: receiving a candidate data block; computing a distance using a distance function, wherein the distance is an entropy-based distance and denotes a measurement of similarity between the candidate data block and a target data block; and determining, using the distance, whether to perform data deduplication of the candidate data block with respect to the target data block to identify at least one sub-block of the candidate data block that is a duplicate of at least one sub-block of the target data block. If the distance is less than a threshold, it may be expected to have a matching sub-block between the candidate and target data blocks. The distance may be a difference between entropy values for the candidate and target data blocks. The first entropy value may be used to determine whether to compress or perform partial deduplication for the candidate data block.
    Type: Application
    Filed: April 24, 2019
    Publication date: October 29, 2020
    Applicant: EMC IP Holding Company LLC
    Inventors: Ivan Bassov, Sorin Faibish, Istvan Gonczi, Philippe Armangau
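
Two entries in this listing (patent 11157188 and publication 20200341667) describe an entropy-based distance: the difference between the entropy values of a candidate block and a target block, compared against a threshold to decide whether sub-block deduplication is worth attempting. The following is a minimal Python sketch of that idea, not the patented implementation. It assumes Shannon entropy over byte frequencies; the function names and the default threshold of 0.5 bits are hypothetical and chosen only for illustration.

```python
import math
from collections import Counter


def shannon_entropy(block: bytes) -> float:
    """Shannon entropy of the byte distribution, in bits per byte (0.0 to 8.0)."""
    if not block:
        return 0.0
    n = len(block)
    counts = Counter(block)  # byte value -> number of occurrences
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def entropy_distance(candidate: bytes, target: bytes) -> float:
    """Distance taken as the absolute difference between the two entropy values."""
    return abs(shannon_entropy(candidate) - shannon_entropy(target))


def should_try_partial_dedup(candidate: bytes, target: bytes,
                             threshold: float = 0.5) -> bool:
    """Hypothetical policy: attempt sub-block matching only for a small distance."""
    return entropy_distance(candidate, target) < threshold


if __name__ == "__main__":
    zeros = bytes(4096)                 # one repeated byte value: minimal entropy
    uniform = bytes(range(256)) * 16    # every byte value equally often: maximal entropy
    print(shannon_entropy(zeros), shannon_entropy(uniform))
    print(should_try_partial_dedup(zeros, uniform))
```

A small distance is only a hint: two blocks with unrelated contents can have identical entropy, which is consistent with the abstract saying only that a matching sub-block "may be expected" below the threshold. In this sketch the entropy check would act as a cheap filter ahead of an actual sub-block comparison.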