Patents by Inventor Sorin Faibish

Sorin Faibish has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 11210230
    Abstract: Techniques are provided for inline deduplication based on a number of physical blocks having common fingerprints among multiple entries of a buffer cache. One method comprises storing input/output operations in a first cache comprising a plurality of entries each corresponding to a physical storage entity comprising a plurality of physical blocks. A given entry is maintained in the first cache based on a first number of physical blocks of the given entry having a duplicate fingerprint with at least one physical block of another entry in the first cache. A second number can be determined of the physical blocks of each entry having a fingerprint in a second cache, and a first ratio is determined for two entries in the first cache using the second number and the first number. A comparison of the first ratios can be performed to sort and possibly evict entries in the first cache based on the comparison.
    Type: Grant
    Filed: April 30, 2020
    Date of Patent: December 28, 2021
    Assignee: EMC IP Holding Company LLC
    Inventors: Sorin Faibish, Philip Shilane, Philippe Armangau
  • Publication number: 20210342271
    Abstract: Techniques are provided for inline deduplication based on a number of physical blocks having common fingerprints among multiple entries of a buffer cache. One method comprises storing input/output operations in a first cache comprising a plurality of entries each corresponding to a physical storage entity comprising a plurality of physical blocks. A given entry is maintained in the first cache based on a first number of physical blocks of the given entry having a duplicate fingerprint with at least one physical block of another entry in the first cache. A second number is determined of the physical blocks of each entry having a fingerprint in a second cache, and a first ratio is determined for two entries in the first cache using the second number and the first number. A comparison of the first ratios can be performed to sort and possibly evict entries in the first cache based on the comparison.
    Type: Application
    Filed: April 30, 2020
    Publication date: November 4, 2021
    Inventors: Sorin Faibish, Philip Shilane, Philippe Armangau
  • Patent number: 11163449
    Abstract: A method of accepting writes in a multilayered storage system is provided. The method includes (a) monitoring a rate of flushing of data from a first data storage component to a second data storage component; (b) setting an intake rate for the first data storage component based on the monitored flushing rate; and (c) throttling writes to the first data storage component based on the set intake rate. An apparatus, system, and computer program product for performing a similar method are also provided.
    Type: Grant
    Filed: October 17, 2019
    Date of Patent: November 2, 2021
    Assignee: EMC IP Holding Company LLC
    Inventors: Sorin Faibish, Istvan Gonczi, Ivan Bassov
  • Patent number: 11157188
    Abstract: Techniques for processing data may include: receiving a candidate data block; computing a distance using a distance function, wherein the distance is an entropy-based distance and denotes a measurement of similarity between the candidate data block and a target data block; and determining, using the distance, whether to perform data deduplication of the candidate data block with respect to the target data block to identify at least one sub-block of the candidate data block that is a duplicate of at least one sub-block of the target data block. If the distance is less than a threshold, it may be expected to have a matching sub-block between the candidate and target data blocks. The distance may be a difference between entropy values for the candidate and target data blocks. The first entropy value may be used to determine whether to compress or perform partial deduplication for the candidate data block.
    Type: Grant
    Filed: April 24, 2019
    Date of Patent: October 26, 2021
    Assignee: EMC IP Holding Company LLC
    Inventors: Ivan Bassov, Sorin Faibish, Istvan Gonczi, Philippe Armangau
  • Patent number: 11153385
    Abstract: A technique for transferring data over a network leverages a standard NAS (Network Attached Storage) protocol to augment its inherent file-copying ability with fingerprint matching, enabling the NAS protocol to limit its data copying over the network to unique data segments while avoiding copying of redundant data segments.
    Type: Grant
    Filed: August 22, 2019
    Date of Patent: October 19, 2021
    Assignee: EMC IP Holding Company LLC
    Inventors: Sorin Faibish, Philip Shilane
  • Patent number: 11144206
    Abstract: A method and system for sharing data reduction metadata with storage systems. Specifically, the disclosed method and system entail communicating, to a storage system, information known to host devices from which data (submitted to-be-written to the storage system) may originate. This a priori reduction-pertinent information, which may include the potential to improve storage system efficiency and/or performance at least with respect to data reduction processing of the data submitted to-be-written, had previously been considered incommunicable to the storage system. The disclosed method and system, however, lift this previous limitation and enable communication of any storage system performance-improving information, applicable to the data submitted to-be-written, to the storage system.
    Type: Grant
    Filed: November 1, 2019
    Date of Patent: October 12, 2021
    Assignee: EMC IP Holding Company LLC
    Inventors: Jeremy O'Hare, Alexandre Lemay, Matthew Fredette, Sorin Faibish
  • Patent number: 11138154
    Abstract: A method, computer program product, and computing system for performing an entropy analysis on each of a plurality of candidate data chunks associated with a potential candidate to generate a plurality of candidate data chunk entropies; performing an entropy analysis on each of a plurality of target data chunks associated with a potential target to generate a plurality of target data chunk entropies; identifying a candidate data chunk entropy limit, chosen from the plurality of candidate data chunk entropies, and a target data chunk entropy limit, chosen from the plurality of target data chunk entropies; and comparing a specific candidate data chunk associated with the candidate data chunk entropy limit to a specific target data chunk associated with the target data chunk entropy limit to determine if the specific candidate data chunk and the specific target data chunk are identical.
    Type: Grant
    Filed: May 3, 2019
    Date of Patent: October 5, 2021
    Assignee: EMC IP Holding Company, LLC
    Inventors: Sorin Faibish, Philip Shilane, Ivan Basov, Istvan Gonczi, Vamsi Vankamamidi
  • Patent number: 11132334
    Abstract: Methods and apparatus are provided for filtering dynamically loadable namespaces (DLNs). An exemplary method comprises, in response to a job submitted by an application, obtaining a DLN portion of a global single namespace of a file system, wherein the DLN is associated with the job and is maintained in a capacity tier of a storage system; obtaining filtering directives from a user; reducing the DLN using a filtering mechanism on a directory tree associated with the DLN, based on the filtering directives, by removing files in the directory tree of the DLN that do not satisfy requirements of the filtering directives to generate a filtered DLN; and dynamically loading the filtered DLN, including reduced metadata for the filtered DLN relative to the DLN, from the capacity tier into a performance tier of the storage system for processing by the application.
    Type: Grant
    Filed: September 21, 2018
    Date of Patent: September 28, 2021
    Assignee: EMC IP Holding Company LLC
    Inventors: John M. Bent, Sorin Faibish, Patrick S. Combes, Eriks S. Paegle, James M. Pedone, Jr.
  • Publication number: 20210286783
    Abstract: A technique for performing data deduplication operates at sub-block granularity by searching a deduplication database for a match between a candidate sub-block of a candidate block and a target sub-block of a previously-stored target block. When a match is found, the technique identifies a duplicate range shared between the candidate block and the target block and effects persistent storage of the duplicate range by configuring mapping metadata of the candidate block so that it points to the duplicate range in the target block.
    Type: Application
    Filed: March 17, 2021
    Publication date: September 16, 2021
    Inventors: Philippe Armangau, Sorin Faibish, Istvan Gonczi, Ivan Bassov, Vamsi K. Vankamamidi
  • Patent number: 11112987
    Abstract: Techniques for processing data may include: receiving a candidate block; performing partial deduplication processing of the candidate block; receiving a second candidate block subsequent to performing partial deduplication processing for the candidate block; and performing first processing to determine whether to perform promotion processing for the entry. The partial deduplication processing may include: partially deduplicating at least one sub-block of the candidate block; and creating an entry in a deduplication database for the candidate block, wherein the entry includes a digest of the candidate block and the entry denotes a potential target block having the digest, and wherein the entry includes a counter that tracks a number of missed full block deduplications between the potential target block and subsequently processed candidate blocks. The promotion processing promotes the potential target block, having the first digest of the entry, to a new target block.
    Type: Grant
    Filed: April 24, 2019
    Date of Patent: September 7, 2021
    Assignee: EMC IP Holding Company LLC
    Inventors: Istvan Gonczi, Philippe Armangau, Sorin Faibish, Ivan Bassov
  • Patent number: 11112985
    Abstract: Techniques for processing data may include: receiving a candidate data block; computing a distance using a distance function, wherein the distance denotes a measurement of similarity between the candidate data block and a target data block; and determining, using the distance, whether to perform data deduplication of the candidate data block with respect to the target data block to identify at least one sub-block of the candidate data block that is a duplicate of at least one sub-block of the target data block. The distance may be computed using a bit-wise logical exclusive-or operation of the contents of the candidate data block and the target data block. The distance may be computed using a bit-wise logical exclusive-or operation of digests computed for the candidate and target data blocks using a distance preserving hash function. The target and candidate block may be similar if the distance is less than a threshold.
    Type: Grant
    Filed: April 24, 2019
    Date of Patent: September 7, 2021
    Assignee: EMC IP Holding Company LLC
    Inventors: Ivan Bassov, Philippe Armangau, Sorin Faibish, Istvan Gonczi
  • Patent number: 11093468
    Abstract: A computer-executable method, system, and computer program product for managing metadata in a distributed data storage system, wherein the distributed data storage system includes a first burst buffer having a key-value store enabled to store metadata, the computer-executable method, system, and computer program product comprising receiving, from a compute node, metadata related to data stored within the distributed data storage system, indexing the metadata at the first burst buffer, and processing the metadata in the first burst buffer.
    Type: Grant
    Filed: March 31, 2014
    Date of Patent: August 17, 2021
    Assignee: EMC IP Holding Company LLC
    Inventors: John M. Bent, Sorin Faibish, Zhenhua Zhang, Xuezhao Liu, Jingwang Zhang
  • Patent number: 11080196
    Abstract: Techniques are provided for pattern-aware prefetching using a parallel log-structured file system. At least a portion of one or more files is accessed by detecting at least one pattern in a non-sequential access of the one or more files; and obtaining at least a portion of the one or more files based on the detected at least one pattern. The obtaining step comprises, for example, a prefetching or pre-allocation of the at least the portion of the one or more files. A prefetch cache can store the portion of the one or more obtained files. The cached portion of the one or more files can be provided from the prefetch cache to an application requesting the at least a portion of the one or more files.
    Type: Grant
    Filed: December 17, 2019
    Date of Patent: August 3, 2021
    Assignees: EMC IP Holding Company LLC, Triad National Security, LLC
    Inventors: John M. Bent, Sorin Faibish, Gary Grider, Aaron Torres, Jun He
  • Publication number: 20210132814
    Abstract: A method and system for sharing data reduction metadata with storage systems. Specifically, the disclosed method and system entail communicating, to a storage system, information known to host devices from which data (submitted to-be-written to the storage system) may originate. This a priori reduction-pertinent information, which may include the potential to improve storage system efficiency and/or performance at least with respect to data reduction processing of the data submitted to-be-written, had previously been considered incommunicable to the storage system. The disclosed method and system, however, lift this previous limitation and enable communication of any storage system performance-improving information, applicable to the data submitted to-be-written, to the storage system.
    Type: Application
    Filed: November 1, 2019
    Publication date: May 6, 2021
    Inventors: Jeremy O'Hare, Alexandre Lemay, Matthew Fredette, Sorin Faibish
  • Patent number: 10997126
    Abstract: Methods and apparatus are provided for reorganizing dynamically loadable namespaces (DLNs). In one exemplary embodiment, a method comprises the steps of, in response to a job submitted by an application, obtaining a DLN portion of a global single namespace of a file system, wherein the DLN is associated with the job and is maintained in a capacity tier of object storage of a storage system; obtaining one or more reordering directives from a user; rearranging one or more files in the DLN into a new directory hierarchy based on the one or more reordering directives to generate a reordered DLN; and dynamically loading the reordered DLN, including the metadata only for the reordered DLN, from the capacity tier of object storage into a performance tier of storage of the storage system for processing by the application. The reordered DLN is merged into the DLN following one or more modifications to the reordered DLN.
    Type: Grant
    Filed: December 8, 2015
    Date of Patent: May 4, 2021
    Assignee: EMC IP Holding Company LLC
    Inventors: John M. Bent, Sorin Faibish, Patrick S. Combes, Eriks S. Paegle, James M. Pedone
  • Publication number: 20210125053
    Abstract: Continuous learning may include receiving a first neural network trained using a first training data set to predict outputs; determining whether the first neural network has a successful prediction rate greater than a prediction threshold; and responsive to determining the first neural network does not have a successful prediction rate greater than the prediction threshold, performing processing.
    Type: Application
    Filed: October 25, 2019
    Publication date: April 29, 2021
    Applicant: EMC IP Holding Company LLC
    Inventor: Sorin Faibish
  • Patent number: 10990565
    Abstract: A method, computer program product, and computing system for processing a data portion to divide the data portion into a plurality of data chunks; performing an entropy analysis on each of the plurality of data chunks to generate a plurality of data chunk entropies; and determining an average data chunk entropy from the plurality of data chunk entropies.
    Type: Grant
    Filed: May 3, 2019
    Date of Patent: April 27, 2021
    Assignee: EMC IP Holding Company, LLC
    Inventors: Sorin Faibish, Philip Shilane, Ivan Basov, Istvan Gonczi, Philippe Armangau, Vamsi Vankamamidi
  • Patent number: 10990310
    Abstract: Techniques for data processing may include: determining one or more sub-blocks of a target block that match one or more sub-blocks of a candidate block; creating a shared sub-block mapping (SSM) structure having a plurality of entries, wherein each of the plurality of entries corresponds to a different one of the sub-blocks in the candidate block and wherein a value stored in said each entry, corresponding to one of the sub-blocks of the candidate block, identifies a sub-block of the target block matching said one sub-block of the candidate block; and storing the candidate block as a deduplicated block sharing at least one sub-block with the target block. The SSM structure may be stored as a metadata structure of the candidate block to identify deduplicated sub-blocks of the candidate block and to identify sub-blocks of the target block providing content for the deduplicated sub-blocks of the candidate block.
    Type: Grant
    Filed: April 24, 2019
    Date of Patent: April 27, 2021
    Assignee: EMC IP Holding Company LLC
    Inventors: Istvan Gonczi, Ivan Bassov, Sorin Faibish, Philippe Armangau
  • Publication number: 20210117799
    Abstract: A method of monitoring storage performance of a remote data storage apparatus (DSA) is provided. The method includes (a) receiving performance metrics of the DSA and a first set of behavioral estimates generated by a first neural network (NN) running on the DSA operating on the performance metrics; (b) operating a second NN on the computing device with the received performance metrics as inputs, the second NN configured to produce a second set of behavioral estimates as outputs in response to the performance metrics, the second NN running at a higher level of precision than the first NN; and (c) sending to the remote DSA updated parameters of an updated version of the first NN based at least in part on the performance metrics and the first and second sets of behavioral estimates. Apparatuses, systems, and computer program products for performing similar methods are also provided.
    Type: Application
    Filed: October 17, 2019
    Publication date: April 22, 2021
    Inventors: Sorin Faibish, Istvan Gonczi, Ivan Bassov
  • Publication number: 20210117099
    Abstract: A method of accepting writes in a multilayered storage system is provided. The method includes (a) monitoring a rate of flushing of data from a first data storage component to a second data storage component; (b) setting an intake rate for the first data storage component based on the monitored flushing rate; and (c) throttling writes to the first data storage component based on the set intake rate. An apparatus, system, and computer program product for performing a similar method are also provided.
    Type: Application
    Filed: October 17, 2019
    Publication date: April 22, 2021
    Inventors: Sorin Faibish, Istvan Gonczi, Ivan Bassov
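
The three steps named in the write-throttling abstract above (monitor the flush rate, set an intake rate from it, throttle writes to that rate) can be illustrated with a minimal sketch. This is not the patented implementation: the class name `FlushAwareThrottle`, the exponentially weighted average used to smooth flush samples, the `headroom` multiplier, and the token-bucket admission check are all assumptions chosen for illustration, and timestamps are passed in explicitly so the behavior is easy to follow.

```python
class FlushAwareThrottle:
    """Hypothetical write throttle whose intake rate tracks an observed flush rate."""

    def __init__(self, initial_rate, smoothing=0.5, headroom=1.0):
        self.flush_rate = float(initial_rate)  # smoothed flush rate, bytes per second
        self.smoothing = smoothing             # weight given to each new sample (assumed)
        self.headroom = headroom               # intake rate = headroom * flush rate (assumed)
        self.intake_rate = self.headroom * self.flush_rate
        self.tokens = 0.0                      # bytes that may currently be admitted
        self.last_refill = 0.0                 # timestamp of the previous admit() call

    def record_flush(self, flushed_bytes, elapsed_seconds):
        """Steps (a) and (b): monitor the flush rate, then derive the intake rate."""
        if elapsed_seconds <= 0:
            raise ValueError("elapsed_seconds must be positive")
        sample = flushed_bytes / elapsed_seconds
        # Exponentially weighted moving average of the flush rate.
        self.flush_rate = (self.smoothing * sample
                           + (1.0 - self.smoothing) * self.flush_rate)
        self.intake_rate = self.headroom * self.flush_rate

    def admit(self, nbytes, now):
        """Step (c): admit a write only if the intake budget covers it."""
        elapsed = max(0.0, now - self.last_refill)
        self.last_refill = now
        # Refill the budget at the intake rate, capped at one second's worth.
        self.tokens = min(self.intake_rate,
                          self.tokens + elapsed * self.intake_rate)
        if nbytes <= self.tokens:
            self.tokens -= nbytes
            return True
        return False


throttle = FlushAwareThrottle(initial_rate=100)
throttle.record_flush(flushed_bytes=300, elapsed_seconds=1.0)
first = throttle.admit(150, now=1.0)   # fits within the refilled budget
second = throttle.admit(100, now=1.0)  # budget exhausted, write is held back
third = throttle.admit(100, now=1.5)   # half a second of refill allows it
```

In a running system the caller would likely take `now` from a monotonic clock such as `time.monotonic()` and retry or queue any write for which `admit` returns `False`.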