Patents by Inventor Sorin Faibish
Sorin Faibish has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Patent number: 11960458
Abstract: A technique for performing data deduplication operates at sub-block granularity by searching a deduplication database for a match between a candidate sub-block of a candidate block and a target sub-block of a previously-stored target block. When a match is found, the technique identifies a duplicate range shared between the candidate block and the target block and effects persistent storage of the duplicate range by configuring mapping metadata of the candidate block so that it points to the duplicate range in the target block.
Type: Grant
Filed: March 17, 2021
Date of Patent: April 16, 2024
Assignee: EMC IP Holding Company LLC
Inventors: Philippe Armangau, Sorin Faibish, Istvan Gonczi, Ivan Bassov, Vamsi K. Vankamamidi
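The sub-block matching idea in this abstract might be sketched roughly as follows. This is only an illustration under stated assumptions, not the patented method: the 512-byte sub-block size, the in-memory dictionary standing in for the deduplication database, and the names `index_block` and `find_duplicate_range` are all hypothetical.

```python
import hashlib

SUB = 512  # assumed sub-block size in bytes (not taken from the patent)


def index_block(db, block_id, block, sub=SUB):
    """Record a digest for every sub-block of a stored (target) block."""
    for off in range(0, len(block), sub):
        digest = hashlib.sha256(block[off:off + sub]).digest()
        db.setdefault(digest, (block_id, off))


def find_duplicate_range(db, blocks, candidate, sub=SUB):
    """Return (target_id, target_offset, candidate_offset, length) or None.

    A sub-block digest hit is used as a seed; the match is then extended
    byte by byte in both directions to find the shared duplicate range.
    """
    for off in range(0, len(candidate), sub):
        piece = candidate[off:off + sub]
        hit = db.get(hashlib.sha256(piece).digest())
        if hit is None:
            continue
        target_id, t_off = hit
        target = blocks[target_id]
        if target[t_off:t_off + sub] != piece:
            continue  # guard against a digest collision
        start_c, start_t = off, t_off
        while start_c > 0 and start_t > 0 and candidate[start_c - 1] == target[start_t - 1]:
            start_c -= 1
            start_t -= 1
        end_c, end_t = off + len(piece), t_off + len(piece)
        while end_c < len(candidate) and end_t < len(target) and candidate[end_c] == target[end_t]:
            end_c += 1
            end_t += 1
        return target_id, start_t, start_c, end_c - start_c
    return None
```

In a real system the returned tuple would drive an update of the candidate block's mapping metadata so that it points at the range in the target block; here it is simply returned to the caller.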
-
Patent number: 11847558
Abstract: A method is used in analyzing a storage system using a machine learning system. Data gathered from information associated with operations performed in a storage system is analyzed. The storage system is comprised of a plurality of components. A bitmap image is created based on the gathered data, where at least one of the plurality of components is represented in the bitmap image. The machine learning system is trained using the bitmap image, where the bitmap image is organized to depict the plurality of components of the storage system.
Type: Grant
Filed: May 4, 2018
Date of Patent: December 19, 2023
Assignee: EMC IP Holding Company LLC
Inventors: Sorin Faibish, Philippe Armangau, James M. Pedone, Jr.
-
Patent number: 11847333
Abstract: A method, computer program product, and computer system for identifying duplicate sectors in a block of a plurality of blocks. The duplicate sectors in the block may be zeroed out. A data reduction operation may be performed on the block after the duplicate sectors are zeroed out.
Type: Grant
Filed: July 31, 2019
Date of Patent: December 19, 2023
Assignee: EMC IP Holding Company, LLC
Inventors: Istvan Gonczi, Sorin Faibish, Ivan Basov
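As a rough illustration of the zero-out step described in this abstract, the sketch below overwrites repeated sectors inside one block with zeros and records where each one first appeared, so that a later data reduction pass (for example compression) sees the simplified block. The 512-byte sector size, the function name `zero_duplicate_sectors`, and the side map of duplicates are assumptions made for the example and are not taken from the patent.

```python
SECTOR = 512  # assumed sector size in bytes


def zero_duplicate_sectors(block: bytes, sector_size: int = SECTOR):
    """Zero every sector that repeats an earlier sector of the same block.

    Returns the modified block and a map from each zeroed sector index to
    the index of the first sector holding the same content.
    """
    first_seen = {}   # sector content -> index of first occurrence
    dup_map = {}      # zeroed sector index -> index of original sector
    out = bytearray(block)
    for start in range(0, len(block), sector_size):
        sector = block[start:start + sector_size]
        idx = start // sector_size
        first = first_seen.setdefault(sector, idx)
        if first != idx:
            out[start:start + len(sector)] = bytes(len(sector))
            dup_map[idx] = first
    return bytes(out), dup_map
```

The map is what would let a reader reconstruct the original block; how such information is actually persisted is not stated in the abstract.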
-
Patent number: 11687433
Abstract: Techniques for detecting state changes in a system may include receiving a first neural network that is trained to detect when the system transitions into a first resulting state, wherein the system transitions into at least a first intermediate state prior to transitioning into the final resulting state; training the first neural network using a first plurality of inputs denoting the system in the first intermediate state; obtaining a plurality of sets of internal state information of the first neural network, each set of the plurality of sets denoting an internal state of the first neural network at a different point in time after the first neural network has processed at least a portion of the first plurality of inputs; and training a second neural network, using the plurality of sets of internal state information, to detect the first intermediate state.
Type: Grant
Filed: April 30, 2019
Date of Patent: June 27, 2023
Assignee: EMC IP Holding Company LLC
Inventors: Sorin Faibish, James M. Pedone, Jr., Philippe Armangau
-
Patent number: 11599280
Abstract: A method, system, and computer program product for improving data reduction using aggregate machine learning systems comprising receiving, by an aggregating machine learning system from one or more machine learning systems associated with a set of one or more storage arrays, a first set of output parameters indicative of performance metrics for the set of the one or more storage arrays, aggregating, by the aggregating machine learning system, the first set of output parameters, resulting in a second set of output parameters, and sending, from the aggregating machine learning system, at least one member of the second set of output parameters as an input to at least one of the one or more machine learning systems associated with the set of the one or more storage arrays.
Type: Grant
Filed: May 30, 2019
Date of Patent: March 7, 2023
Assignee: EMC IP Holding Company LLC
Inventors: Sorin Faibish, James M. Pedone, Jr., Philippe Armangau
-
Patent number: 11593312
Abstract: A method performed by a block-storage server, of storing data is described. The method includes (1) receiving, from a remote file server, data blocks to be written to persistent block storage managed by the block-storage server; (2) receiving, from the remote file server, metadata describing files to which the data blocks belong in a set of filesystems managed by the remote file server; and (3) selectively applying data reduction when storing the data blocks in the persistent block storage based, at least in part, on the received metadata. An apparatus, system, and computer program product for performing a similar method are also provided.
Type: Grant
Filed: July 31, 2019
Date of Patent: February 28, 2023
Assignee: EMC IP Holding Company LLC
Inventors: Sorin Faibish, Philippe Armangau, Ivan Bassov, Istvan Gonczi
-
Patent number: 11520509
Abstract: A method, computer program product, and computer system for identifying a plurality of blocks. At least one heuristic associated with at least a portion of the plurality of blocks may be determined. It may be determined whether to compress at least the portion of the plurality of blocks based upon, at least in part, the at least one heuristic. At least the portion of the plurality of blocks may be compressed based upon, at least in part, the at least one heuristic.
Type: Grant
Filed: July 31, 2019
Date of Patent: December 6, 2022
Assignee: EMC IP Holding Company, LLC
Inventors: Sorin Faibish, Ivan Basov
-
Patent number: 11513739
Abstract: A method performed by a block-storage server, of storing data is described. The method includes (1) receiving, from a remote file server, data blocks to be written to persistent block storage managed by the block-storage server; (2) receiving, from the remote file server, metadata describing a placement of the data blocks in a filesystem managed by the remote file server; and (3) organizing the data blocks within the persistent block storage based, at least in part, on the received metadata. An apparatus, system, and computer program product for performing a similar method are also provided.
Type: Grant
Filed: July 31, 2019
Date of Patent: November 29, 2022
Assignee: EMC IP Holding Company LLC
Inventors: Sorin Faibish, Ivan Bassov, Istvan Gonczi, Philippe Armangau
-
Patent number: 11514295
Abstract: Continuous learning may include receiving a first neural network trained using a first training data set to predict outputs; determining whether the first neural network has a successful prediction rate greater than a prediction threshold; and responsive to determining the first neural network does not have a successful prediction rate greater than the prediction threshold, performing processing.
Type: Grant
Filed: October 25, 2019
Date of Patent: November 29, 2022
Assignee: EMC IP Holding Company LLC
Inventor: Sorin Faibish
-
Patent number: 11500540
Abstract: A technique for managing data storage includes generating entropy of blocks on a per-block basis and selectively performing inline compression on blocks based at least in part on their entropy. Entropy of a block provides a rough measure of the block's compressibility. Thus, using per-block entropy enables a storage system to steer compression decisions, e.g., whether to compress and/or how much to compress, flexibly and with high granularity, striking a balance between throughput and storage efficiency.
Type: Grant
Filed: October 28, 2020
Date of Patent: November 15, 2022
Assignee: EMC IP Holding Company LLC
Inventors: Sorin Faibish, Ivan Bassov, Istvan Gonczi, Philippe Armangau, Vamsi K. Vankamamidi
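One way to picture entropy-steered compression is the minimal sketch below. It computes the Shannon entropy of a block in bits per byte and uses it to pick between storing the block raw, compressing it lightly, or compressing it hard. The thresholds (7.5 and 5.0 bits per byte), the choice of zlib, and the names `block_entropy` and `store_block` are assumptions for illustration; the abstract does not specify any of them.

```python
import math
import zlib
from collections import Counter


def block_entropy(block: bytes) -> float:
    """Shannon entropy of the byte distribution, in bits per byte (0.0 to 8.0)."""
    if not block:
        return 0.0
    n = len(block)
    return -sum(c / n * math.log2(c / n) for c in Counter(block).values())


def store_block(block: bytes, skip_threshold: float = 7.5, light_threshold: float = 5.0):
    """Decide whether, and how hard, to compress a block based on its entropy.

    Thresholds are illustrative only. Returns (encoding_label, payload).
    """
    h = block_entropy(block)
    if h >= skip_threshold:
        return "raw", block  # near-random data: compression is unlikely to pay off
    level = 1 if h >= light_threshold else 9  # spend more CPU where the gain is likely larger
    return f"zlib-{level}", zlib.compress(block, level)
```

Byte-level entropy is only a rough proxy: a block with a flat byte histogram can still contain long repeats that a compressor would exploit, which is consistent with the abstract calling entropy a "rough measure" of compressibility.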
-
Patent number: 11422975
Abstract: A technique for performing data reduction applies deduplication principles when performing data compression, providing a form of enhanced compression. The technique obtains a chunk of data that contains multiple extents and applies deduplication actions to identify duplicate extents within the chunk. The technique marks duplicate extents in metadata. Such duplicate extents need not be compressed using conventional data compression, saving computational resources and considerable time.
Type: Grant
Filed: July 31, 2019
Date of Patent: August 23, 2022
Assignee: EMC IP Holding Company LLC
Inventors: Sorin Faibish, Ivan Bassov, Istvan Gonczi, Philippe Armangau
-
Patent number: 11372579
Abstract: Techniques for generating data sets may include: receiving an initial buffer that achieves a compression ratio responsive to compression processing using a compression algorithm, the initial buffer including first content located at a first position in the initial buffer and including second content located at a second position in the initial buffer; and generating a data set of buffers using the initial buffer. The data set may be expected to achieve a specified deduplication ratio responsive to deduplication processing and to achieve the compression ratio responsive to compression processing using the compression algorithm. Generating the data set may include generating a first plurality of buffers where each buffer of the first plurality is not a duplicate of another buffer in the first plurality, and generating a second plurality of duplicate buffers. Each duplicate buffer may be a duplicate of a buffer in the first plurality of buffers.
Type: Grant
Filed: October 22, 2020
Date of Patent: June 28, 2022
Assignee: EMC IP Holding Company LLC
Inventors: Ivan Bassov, Istvan Gonczi, Sorin Faibish
-
Patent number: 11360954
Abstract: A method, computer program product, and computing system for receiving a candidate data portion; calculating a distance-preserving hash for the candidate data portion; and performing an entropy analysis on the distance-preserving hash to generate a hash entropy for the candidate data portion.
Type: Grant
Filed: August 3, 2020
Date of Patent: June 14, 2022
Assignee: EMC IP HOLDING COMPANY, LLC
Inventors: Sorin Faibish, Philip Shilane, Ivan Basov, Istvan Gonczi, Philippe Armangau, Vamsi Vankamamidi
-
Patent number: 11347423
Abstract: A method, computer program product, and computer system for identifying a plurality of blocks. At least one heuristic associated with at least a portion of the plurality of blocks may be determined. It may be determined whether at least the portion of the plurality of blocks is a candidate for deduplication based upon, at least in part, the at least one heuristic. At least the portion of the plurality of blocks may be deduplicated based upon, at least in part, the at least one heuristic.
Type: Grant
Filed: July 29, 2019
Date of Patent: May 31, 2022
Assignee: EMC IP HOLDING COMPANY, LLC
Inventors: Ivan Basov, Sorin Faibish, Istvan Gonczi
-
Publication number: 20220129190
Abstract: Techniques for generating data sets may include: receiving an initial buffer that achieves a compression ratio responsive to compression processing using a compression algorithm, the initial buffer including first content located at a first position in the initial buffer and including second content located at a second position in the initial buffer; and generating a data set of buffers using the initial buffer. The data set may be expected to achieve a specified deduplication ratio responsive to deduplication processing and to achieve the compression ratio responsive to compression processing using the compression algorithm. Generating the data set may include generating a first plurality of buffers where each buffer of the first plurality is not a duplicate of another buffer in the first plurality, and generating a second plurality of duplicate buffers. Each duplicate buffer may be a duplicate of a buffer in the first plurality of buffers.
Type: Application
Filed: October 22, 2020
Publication date: April 28, 2022
Applicant: EMC IP Holding Company LLC
Inventors: Ivan Bassov, Istvan Gonczi, Sorin Faibish
-
Publication number: 20220129162
Abstract: A technique for managing data storage includes generating entropy of blocks on a per-block basis and selectively performing inline compression on blocks based at least in part on their entropy. Entropy of a block provides a rough measure of the block's compressibility. Thus, using per-block entropy enables a storage system to steer compression decisions, e.g., whether to compress and/or how much to compress, flexibly and with high granularity, striking a balance between throughput and storage efficiency.
Type: Application
Filed: October 28, 2020
Publication date: April 28, 2022
Inventors: Sorin Faibish, Ivan Bassov, Istvan Gonczi, Philippe Armangau, Vamsi K. Vankamamidi
-
Patent number: 11314432
Abstract: A method is used in managing data reduction in storage systems using machine learning. A value representing a data reduction assessment for a first data block in a storage system is calculated using a hash of the data block. The value is used to train a machine learning system to assess data reduction associated with a second data block in the storage system without performing the data reduction on the second data block, where assessing data reduction associated with the second data block indicates a probability as to whether the second data block can be reduced.
Type: Grant
Filed: March 6, 2020
Date of Patent: April 26, 2022
Assignee: EMC IP Holding Company LLC
Inventors: Sorin Faibish, Rustem Rafikov, Ivan Bassov
-
Patent number: 11308036
Abstract: Techniques for processing data may include: receiving a plurality of data chunks for a data set; performing data deduplication processing for the plurality of data chunks; determining, in accordance with one or more criteria, whether a frequency distribution of a frequency histogram of digest byte frequencies is sufficiently uniform; and responsive to determining that the frequency distribution of the frequency histogram is not sufficiently uniform, performing processing to update data deduplication settings for the data set. Updating the data deduplication settings may include using a stronger hash algorithm and/or a larger size digest when generating subsequent digests. The data deduplication processing may include: determining, using a current hash algorithm, a plurality of digests for the plurality of data chunks of the data set; and updating the frequency histogram of digest byte frequencies for the data set in accordance with the plurality of digests.
Type: Grant
Filed: April 11, 2019
Date of Patent: April 19, 2022
Assignee: EMC IP Holding Company LLC
Inventors: Istvan Gonczi, Ivan Bassov, Sorin Faibish
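The histogram check described in this abstract could look something like the sketch below, which counts how often each byte value occurs across the digests of a data set and then applies a simple uniformity test. The specific criterion (every bin within a relative tolerance of the expected count), the default tolerance of 0.5, the use of SHA-256 as the "current" hash, and both function names are assumptions for the example; the abstract leaves the actual criteria open.

```python
import hashlib


def digest_byte_histogram(chunks, algo: str = "sha256"):
    """Count occurrences of each byte value (0..255) across all chunk digests."""
    hist = [0] * 256
    for chunk in chunks:
        for b in hashlib.new(algo, chunk).digest():
            hist[b] += 1
    return hist


def is_sufficiently_uniform(hist, tolerance: float = 0.5) -> bool:
    """Hypothetical criterion: every bin lies within tolerance * expected of the mean."""
    total = sum(hist)
    if total == 0:
        return True  # nothing observed yet, so nothing to flag
    expected = total / len(hist)
    return all(abs(count - expected) <= tolerance * expected for count in hist)
```

A caller following the abstract would react to a `False` result by switching subsequent digests to a stronger algorithm or a larger digest size, for instance moving from `"sha256"` to `"sha512"` in this sketch.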
-
Patent number: 11232075
Abstract: Techniques for data processing may include: receiving a data chunk; determining a metric value denoting a degree of compressibility of the data chunk; selecting, in accordance with the metric value denoting the compressibility of the data chunk, a first size of a plurality of sizes, wherein each of the plurality of sizes denotes a different size of an amount of storage used for storing a value of said each size; and performing the data deduplication processing for the data chunk, wherein the data deduplication processing includes using a first hash value for the data chunk to determine whether the data chunk is a duplicate of another data chunk of a hash table, wherein the first hash value is stored in a storage location of the first size.
Type: Grant
Filed: October 25, 2018
Date of Patent: January 25, 2022
Assignee: EMC IP Holding Company LLC
Inventors: Ivan Bassov, Sorin Faibish, Rustem Rafikov
-
Patent number: 11221991
Abstract: Techniques for data processing may include: receiving a data chunk of the data set; determining, in accordance with criteria including a compressibility ratio for the data set and a cost ratio of compression computation cost and entropy computation cost, whether to activate or deactivate entropy computation for the data set, wherein the compressibility ratio is a ratio of a number of compressible data chunks of the data set and a number of uncompressible data chunks of the data set; and responsive to determining to activate entropy computation for the data set, performing first processing comprising: determining an entropy value for the data chunk; and determining, in accordance with the entropy value for the data chunk, whether to compress the data chunk.
Type: Grant
Filed: October 30, 2018
Date of Patent: January 11, 2022
Assignee: EMC IP Holding Company LLC
Inventors: Ivan Bassov, Philippe Armangau, Sorin Faibish, Istvan Gonczi