Patents by Inventor Sorin Faibish

Sorin Faibish has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Publication number: 20210286783
    Abstract: A technique for performing data deduplication operates at sub-block granularity by searching a deduplication database for a match between a candidate sub-block of a candidate block and a target sub-block of a previously-stored target block. When a match is found, the technique identifies a duplicate range shared between the candidate block and the target block and effects persistent storage of the duplicate range by configuring mapping metadata of the candidate block so that it points to the duplicate range in the target block.
    Type: Application
    Filed: March 17, 2021
    Publication date: September 16, 2021
    Inventors: Philippe Armangau, Sorin Faibish, Istvan Gonczi, Ivan Bassov, Vamsi K. Vankamamidi
  • Patent number: 11112985
    Abstract: Techniques for processing data may include: receiving a candidate data block; computing a distance using a distance function, wherein the distance denotes a measurement of similarity between the candidate data block and a target data block; and determining, using the distance, whether to perform data deduplication of the candidate data block with respect to the target data block to identify at least one sub-block of the candidate data block that is a duplicate of at least one sub-block of the target data block. The distance may be computed using a bit-wise logical exclusive-or operation of the contents of the candidate data block and the target data block. The distance may be computed using a bit-wise logical exclusive-or operation of digests computed for the candidate and target data blocks using a distance preserving hash function. The target and candidate block may be similar if the distance is less than a threshold.
    Type: Grant
    Filed: April 24, 2019
    Date of Patent: September 7, 2021
    Assignee: EMC IP Holding Company LLC
    Inventors: Ivan Bassov, Philippe Armangau, Sorin Faibish, Istvan Gonczi
  • Patent number: 11112987
    Abstract: Techniques for processing data may include: receiving a candidate block; performing partial deduplication processing of the candidate block; receiving a second candidate block subsequent to performing partial deduplication processing for the candidate block; and performing first processing to determine whether to perform promotion processing for the entry. The partial deduplication processing may include: partially deduplicating at least one sub-block of the candidate block; and creating an entry in a deduplication database for the candidate block, wherein the entry includes a digest of the candidate block and the entry denotes a potential target block having the digest, and wherein the entry includes a counter that tracks a number of missed full block deduplications between the potential target block and subsequently processed candidate blocks. The promotion processing promotes the potential target block, having the first digest of the entry, to a new target block.
    Type: Grant
    Filed: April 24, 2019
    Date of Patent: September 7, 2021
    Assignee: EMC IP Holding Company LLC
    Inventors: Istvan Gonczi, Philippe Armangau, Sorin Faibish, Ivan Bassov
  • Patent number: 11093468
    Abstract: A computer-executable method, system, and computer program product for managing metadata in a distributed data storage system, wherein the distributed data storage system includes a first burst buffer having a key-value store enabled to store metadata, the computer-executable method, system, and computer program product comprising receiving, from a compute node, metadata related to data stored within the distributed data storage system, indexing the metadata at the first burst buffer, and processing the metadata in the first burst buffer.
    Type: Grant
    Filed: March 31, 2014
    Date of Patent: August 17, 2021
    Assignee: EMC IP Holding Company LLC
    Inventors: John M. Bent, Sorin Faibish, Zhenhua Zhang, Xuezhao Liu, Jingwang Zhang
  • Patent number: 11080196
    Abstract: Techniques are provided for pattern-aware prefetching using a parallel log-structured file system. At least a portion of one or more files is accessed by detecting at least one pattern in a non-sequential access of the one or more files; and obtaining at least a portion of the one or more files based on the detected at least one pattern. The obtaining step comprises, for example, a prefetching or pre-allocation of the at least the portion of the one or more files. A prefetch cache can store the portion of the one or more obtained files. The cached portion of the one or more files can be provided from the prefetch cache to an application requesting the at least a portion of the one or more files.
    Type: Grant
    Filed: December 17, 2019
    Date of Patent: August 3, 2021
    Assignees: EMC IP Holding Company LLC, Triad National Security, LLC
    Inventors: John M. Bent, Sorin Faibish, Gary Grider, Aaron Torres, Jun He
  • Publication number: 20210132814
    Abstract: A method and system for sharing data reduction metadata with storage systems. Specifically, the disclosed method and system entail communicating, to a storage system, information known to host devices from which data (submitted to-be-written to the storage system) may originate. This a priori reduction-pertinent information, which may include the potential to improve storage system efficiency and/or performance at least with respect to data reduction processing of the data submitted to-be-written, had previously been considered incommunicable to the storage system. The disclosed method and system, however, lift this previous limitation and enable communication of any storage system performance-improving information, applicable to the data submitted to-be-written, to the storage system.
    Type: Application
    Filed: November 1, 2019
    Publication date: May 6, 2021
    Inventors: Jeremy O'Hare, Alexandre Lemay, Matthew Fredette, Sorin Faibish
  • Patent number: 10997126
    Abstract: Methods and apparatus are provided for reorganizing dynamically loadable namespaces (DLNs). In one exemplary embodiment, a method comprises the steps of, in response to a job submitted by an application, obtaining a DLN portion of a global single namespace of a file system, wherein the DLN is associated with the job and is maintained in a capacity tier of object storage of a storage system; obtaining one or more reordering directives from a user; rearranging one or more files in the DLN into a new directory hierarchy based on the one or more reordering directives to generate a reordered DLN; and dynamically loading the reordered DLN, including the metadata only for the reordered DLN, from the capacity tier of object storage into a performance tier of storage of the storage system for processing by the application. The reordered DLN is merged into the DLN following one or more modifications to the reordered DLN.
    Type: Grant
    Filed: December 8, 2015
    Date of Patent: May 4, 2021
    Assignee: EMC IP Holding Company LLC
    Inventors: John M. Bent, Sorin Faibish, Patrick S. Combes, Eriks S. Paegle, James M. Pedone
  • Publication number: 20210125053
    Abstract: Continuous learning may include receiving a first neural network trained using a first training data set to predict outputs; determining whether the first neural network has a successful prediction rate greater than a prediction threshold; and responsive to determining the first neural network does not have a successful prediction rate greater than the prediction threshold, performing processing.
    Type: Application
    Filed: October 25, 2019
    Publication date: April 29, 2021
    Applicant: EMC IP Holding Company LLC
    Inventor: Sorin Faibish
  • Patent number: 10990310
    Abstract: Techniques for data processing may include: determining one or more sub-blocks of a target block that match one or more sub-blocks of a candidate block; creating a shared sub-block mapping (SSM) structure having a plurality of entries, wherein each of the plurality of entries corresponds to a different one of the sub-blocks in the candidate block and wherein a value stored in said each entry, corresponding to one of the sub-blocks of the candidate block, identifies a sub-block of the target block matching said one sub-block of the candidate block; and storing the candidate block as a deduplicated block sharing at least one sub-block with the target block. The SSM structure may be stored as a metadata structure of the candidate block to identify deduplicated sub-blocks of the candidate block and to identify sub-blocks of the target block providing content for the deduplicated sub-blocks of the candidate block.
    Type: Grant
    Filed: April 24, 2019
    Date of Patent: April 27, 2021
    Assignee: EMC IP Holding Company LLC
    Inventors: Istvan Gonczi, Ivan Bassov, Sorin Faibish, Philippe Armangau
  • Patent number: 10990565
    Abstract: A method, computer program product, and computing system for processing a data portion to divide the data portion into a plurality of data chunks; performing an entropy analysis on each of the plurality of data chunks to generate a plurality of data chunk entropies; and determining an average data chunk entropy from the plurality of data chunk entropies.
    Type: Grant
    Filed: May 3, 2019
    Date of Patent: April 27, 2021
    Assignee: EMC IP Holding Company, LLC
    Inventors: Sorin Faibish, Philip Shilane, Ivan Basov, Istvan Gonczi, Philippe Armangau, Vamsi Vankamamidi
  • Publication number: 20210117099
    Abstract: A method of accepting writes in a multilayered storage system is provided. The method includes (a) monitoring a rate of flushing of data from a first data storage component to a second data storage component; (b) setting an intake rate for the first data storage component based on the monitored flushing rate; and (c) throttling writes to the first data storage component based on the set intake rate. An apparatus, system, and computer program product for performing a similar method are also provided.
    Type: Application
    Filed: October 17, 2019
    Publication date: April 22, 2021
    Inventors: Sorin Faibish, Istvan Gonczi, Ivan Bassov
  • Publication number: 20210117799
    Abstract: A method of monitoring storage performance of a remote data storage apparatus (DSA) is provided. The method includes (a) receiving performance metrics of the DSA and a first set of behavioral estimates generated by a first neural network (NN) running on the DSA operating on the performance metrics; (b) operating a second NN on the computing device with the received performance metrics as inputs, the second NN configured to produce a second set of behavioral estimates as outputs in response to the performance metrics, the second NN running at a higher level of precision than the first NN; and (c) sending to the remote DSA updated parameters of an updated version of the first NN based at least in part on the performance metrics and the first and second sets of behavioral estimates. Apparatuses, systems, and computer program products for performing similar methods are also provided.
    Type: Application
    Filed: October 17, 2019
    Publication date: April 22, 2021
    Inventors: Sorin Faibish, Istvan Gonczi, Ivan Bassov
  • Patent number: 10963436
    Abstract: A technique for performing data deduplication operates at sub-block granularity by searching a deduplication database for a match between a candidate sub-block of a candidate block and a target sub-block of a previously-stored target block. When a match is found, the technique identifies a duplicate range shared between the candidate block and the target block and effects persistent storage of the duplicate range by configuring mapping metadata of the candidate block so that it points to the duplicate range in the target block.
    Type: Grant
    Filed: October 31, 2018
    Date of Patent: March 30, 2021
    Assignee: EMC IP Holding Company LLC
    Inventors: Philippe Armangau, Sorin Faibish, Istvan Gonczi, Ivan Bassov, Vamsi K. Vankamamidi
  • Patent number: 10963437
    Abstract: A method, computer program product, and computing system for identifying a potential deduplication candidate and a related deduplication target; executing a comparison operation with respect to the potential deduplication candidate and the related deduplication target to generate a comparison result; and determining a level of similarity between the potential deduplication candidate and the related deduplication target by processing the comparison result.
    Type: Grant
    Filed: May 3, 2019
    Date of Patent: March 30, 2021
    Assignee: EMC IP Holding Company, LLC
    Inventors: Istvan Gonczi, Ivan Basov, Sorin Faibish, Philippe Armangau, Anton Kucherov
  • Patent number: 10956370
    Abstract: Techniques for data processing a data set may comprise: performing first processing that forms a first compression unit, wherein the first compression unit includes data chunks including a first data chunk having a first entropy value less than an entropy threshold, the first processing including: receiving a second data chunk; determining, in accordance with criteria, whether to add the second data chunk to the first compression unit; and responsive to determining to add the second data chunk to the first compression unit, adding the second data chunk to the first compression unit; and compressing the first compression unit as a single compressible unit. The second chunk may be added if its entropy value is less than the entropy threshold and if entropy values of the first and second chunks are similar. The second chunk may be added if the resulting compression unit provides sufficient storage/compression benefit.
    Type: Grant
    Filed: September 9, 2019
    Date of Patent: March 23, 2021
    Assignee: EMC IP Holding Company LLC
    Inventors: Ivan Bassov, Sorin Faibish, Istvan Gonczi
  • Publication number: 20210058459
    Abstract: A technique for transferring data over a network leverages a standard NAS (Network Attached Storage) protocol to augment its inherent file-copying ability with fingerprint matching, enabling the NAS protocol to limit its data copying over the network to unique data segments while avoiding copying of redundant data segments.
    Type: Application
    Filed: August 22, 2019
    Publication date: February 25, 2021
    Inventors: Sorin Faibish, Philip Shilane
  • Patent number: 10921987
    Abstract: A method of performing deduplication includes (1) receiving a write command that specifies a set of data, the set of data including multiple blocks of data, (2) hashing a subset of the set of data, yielding a representative digest of the set of data, and (3) performing deduplication on the set of data based at least in part on matching the representative digest to a digest already stored in a database which relates digests to locations of data from which the digests were produced. An apparatus, system, and computer program product for performing a similar method are also provided.
    Type: Grant
    Filed: July 31, 2019
    Date of Patent: February 16, 2021
    Assignee: EMC IP Holding Company LLC
    Inventors: Philippe Armangau, John P. Didier, Sorin Faibish
  • Publication number: 20210034249
    Abstract: A method of performing deduplication includes (1) receiving a write command that specifies a set of data, the set of data including multiple blocks of data, (2) hashing a subset of the set of data, yielding a representative digest of the set of data, and (3) performing deduplication on the set of data based at least in part on matching the representative digest to a digest already stored in a database which relates digests to locations of data from which the digests were produced. An apparatus, system, and computer program product for performing a similar method are also provided.
    Type: Application
    Filed: July 31, 2019
    Publication date: February 4, 2021
    Inventors: Philippe Armangau, John P. Didier, Sorin Faibish
  • Publication number: 20210034579
    Abstract: A method, computer program product, and computer system for identifying, at a block level of a file, a duplicate block of a plurality of blocks within the file. Granularity of a block size used for deduplication of the file at the block level may be adjusted. A type of deduplication may be adjusted for the file. Deduplication of the file at the block level within the file may be executed based upon, at least in part, the granularity of the block size used for deduplication of the file at the block level.
    Type: Application
    Filed: August 1, 2019
    Publication date: February 4, 2021
    Inventors: Ivan Basov, Sorin Faibish, Istvan Gonczi, Philippe Armangau
  • Publication number: 20210034263
    Abstract: A method, computer program product, and computer system for identifying duplicate sectors in a block of a plurality of blocks. The duplicate sectors in the block may be zeroed out. A data reduction operation may be performed on the block after the duplicate sectors are zeroed out.
    Type: Application
    Filed: July 31, 2019
    Publication date: February 4, 2021
    Inventors: Istvan Gonczi, Sorin Faibish, Ivan Basov
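
The abstract of publication 20210034263 above describes three steps: identify duplicate sectors in a block, zero them out, then run a data reduction operation on the block. The following is a minimal Python sketch of that idea, not the method claimed in the application. The 512-byte sector size, the function names, the dup_map bookkeeping (which records where each zeroed sector's content came from so the block can be rebuilt), and the use of zlib as the data reduction step are all assumptions made for illustration; the abstract specifies none of them.

```python
import zlib

SECTOR_SIZE = 512  # assumed sector size; the abstract does not give one


def zero_duplicate_sectors(block: bytes, sector_size: int = SECTOR_SIZE):
    """Zero out every sector that repeats an earlier sector of the same block.

    Returns (reduced_block, dup_map). dup_map is hypothetical metadata that
    maps the index of each zeroed sector to the index of the first sector
    holding the same content.
    """
    if len(block) % sector_size != 0:
        raise ValueError("block length must be a multiple of sector_size")
    first_seen = {}  # sector content -> index of its first occurrence
    dup_map = {}
    out = bytearray(block)
    for i in range(len(block) // sector_size):
        start = i * sector_size
        sector = block[start:start + sector_size]
        if sector in first_seen:
            dup_map[i] = first_seen[sector]
            out[start:start + sector_size] = bytes(sector_size)  # zero fill
        else:
            first_seen[sector] = i
    return bytes(out), dup_map


def reduce_block(block: bytes, sector_size: int = SECTOR_SIZE):
    """Zero duplicate sectors, then compress (zlib stands in for data reduction)."""
    reduced, dup_map = zero_duplicate_sectors(block, sector_size)
    return zlib.compress(reduced), dup_map


def restore_block(compressed: bytes, dup_map: dict, sector_size: int = SECTOR_SIZE) -> bytes:
    """Decompress and copy each first-occurrence sector back over its zeroed duplicates."""
    out = bytearray(zlib.decompress(compressed))
    for dup, src in dup_map.items():
        out[dup * sector_size:(dup + 1) * sector_size] = \
            out[src * sector_size:(src + 1) * sector_size]
    return bytes(out)
```

Calling reduce_block on a block and passing its two results to restore_block returns the original bytes. The sketch keeps dup_map separately because zeroing a sector discards its content; how a real storage system would track that relationship is not stated in the abstract.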