Patents by Inventor Lior Aronovich

Lior Aronovich has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Publication number: 20170147627
    Abstract: Embodiments for combining input data matches in data deduplication of input data by a processor. Input data matches are calculated using a plurality of deduplication processes referencing a plurality of repository data segments for the input data. A combined list of output data matches is calculated.
    Type: Application
    Filed: November 25, 2015
    Publication date: May 25, 2017
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventor: Lior ARONOVICH
  • Publication number: 20170147444
    Abstract: Embodiments for processing tracked blocks in a data storage implemented with data deduplication by a processor. Input snapshot data is partitioned into changed tracked blocks. The changed tracked blocks are grouped into enclosing similarity units. Similarity units that contain at least one input changed tracked block are processed for deduplication.
    Type: Application
    Filed: November 25, 2015
    Publication date: May 25, 2017
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventor: Lior ARONOVICH
  • Publication number: 20170132244
    Abstract: For reducing activation of similarity search in a data deduplication system using a processor device in a computing environment, input data is partitioned into data chunks. A determination is made as to whether to apply a similarity search process for an input data chunk based on deduplication results of a previous input data chunk in the input data.
    Type: Application
    Filed: January 26, 2017
    Publication date: May 11, 2017
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventor: Lior ARONOVICH
  • Patent number: 9646043
    Abstract: Embodiments for combining input data matches in data deduplication of input data by a processor. Input data matches are calculated using a plurality of deduplication processes referencing a plurality of repository data segments for the input data. A combined list of output data matches is calculated.
    Type: Grant
    Filed: November 25, 2015
    Date of Patent: May 9, 2017
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventor: Lior Aronovich
  • Publication number: 20170109385
    Abstract: Computer implemented methods for concurrent processing of operations on a tree-based data structure include: receiving input at a storage system managing a storage device in which the tree-based data structure is stored, the input identifying a set of heterogeneous operations to be applied to the tree-based data structure; determining one or more nodes of the tree-based data structure to which one or more of the set of heterogeneous operations are to be applied; determining one or more groups of the set of heterogeneous operations according to the one or more nodes to which the set of heterogeneous operations are to be applied; and applying, for each of the one or more groups, the set of heterogeneous operations according to a predefined order. Systems and methods for accomplishing the same are also disclosed.
    Type: Application
    Filed: October 20, 2015
    Publication date: April 20, 2017
    Inventors: Lior Aronovich, Kien K. Huynh
  • Publication number: 20170111442
    Abstract: Systems and methods include: receiving input at a storage system managing a storage device in which a tree-based data structure is stored, the input identifying a set of heterogeneous operations to be applied to the tree-based data structure; determining one or more nodes of the tree-based data structure to which one or more of the set of heterogeneous operations are to be applied; determining one or more groups of the set of heterogeneous operations, the determining being based at least in part on the one or more nodes to which the heterogeneous operations are to be applied; isolating processing of each node from processing of other nodes; and processing each of the one or more nodes to which one or more of the set of heterogeneous operations are to be applied with one of the groups of the set of heterogeneous operations.
    Type: Application
    Filed: October 20, 2015
    Publication date: April 20, 2017
    Inventors: Lior Aronovich, Kien K. Huynh, Gregory T. Kishi
  • Publication number: 20170109352
    Abstract: Computer implemented methods for concurrent processing of operations on a tree-based data structure include: receiving input at a storage system managing a storage device in which the tree-based data structure is stored, the input identifying a set of heterogeneous operations to be applied to the tree-based data structure; determining one or more nodes of the tree-based data structure to which one or more of the set of heterogeneous operations are to be applied; and performing one or more of the set of heterogeneous operations concurrently and in bulk. Systems and methods for accomplishing the same are also disclosed.
    Type: Application
    Filed: October 20, 2015
    Publication date: April 20, 2017
    Inventors: Lior Aronovich, Kien K. Huynh
  • Patent number: 9600515
    Abstract: For efficient calculation of both similarity search values and boundaries of digest blocks in data deduplication, input data is partitioned into chunks, and for each chunk a set of rolling hash values is calculated. A single linear scan of the rolling hash values is used to produce both similarity search values and boundaries of the digest blocks of the chunk. The rolling hash values are used to contribute to the calculation of the similarity search values and to the calculation of the boundaries of the digest blocks.
    Type: Grant
    Filed: December 16, 2015
    Date of Patent: March 21, 2017
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Shay H. Akirav, Lior Aronovich, Shira Ben-Dor, Michael Hirsch, Ofer Leneman
  • Patent number: 9594924
    Abstract: Various embodiments are provided for managing a global cache coherency in a distributed shared caching for a clustered file system (CFS). The CFS manages access permissions to an entire space of data segments by using the DSM module. In response to receiving a request to access one of the data segments, a calculation operation is performed for obtaining most recent contents of one of the data segments. The calculation operation performs one of providing the most recent contents via communication with a remote DSM module which obtains the one of the data segments from an associated external cache memory, instructing by the DSM module to read from storage the one of the data segments, and determining that any existing contents of the one of the data segments in the local external cache are the most recent contents.
    Type: Grant
    Filed: January 19, 2016
    Date of Patent: March 14, 2017
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Lior Aronovich, Yair Toaff, Gil Paz, Ron Asher
  • Patent number: 9594766
    Abstract: For conditional activation of similarity search in a data deduplication system using a processor device in a computing environment, input data is partitioned into data chunks. A determination is made as to whether to apply the similarity search process for an input data chunk based on deduplication results of a previous input data chunk in the input data.
    Type: Grant
    Filed: July 15, 2013
    Date of Patent: March 14, 2017
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventor: Lior Aronovich
  • Patent number: 9575983
    Abstract: Input backup data is deduplicated with data of a synthetic backup previously constructed by a deduplication storage. A synthetic backup is constructed by processing metadata instructions provided by a backup application. Deduplication digests are calculated based on the data of the synthetic backup and the deduplication digests are stored in a digests index. When new backup data is processed, deduplication digests of the new data are calculated and searched in the digests index. A data segment of the synthetic backup is partitioned into fixed sized sub-segments. The calculated digests of the sub-segments are aggregated to produce the deduplication digest, and the deduplication digest is formed for the synthetic backup.
    Type: Grant
    Filed: April 21, 2015
    Date of Patent: February 21, 2017
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Lior Aronovich, Michael Hirsch, Yair Toaff
  • Patent number: 9547662
    Abstract: For digest retrieval based on similarity search in deduplication processing in a data deduplication system using a processor device in a computing environment, input data is partitioned into fixed sized data chunks. Similarity elements and digest block boundaries and digest values are calculated for each of the fixed sized data chunks. Matching similarity elements are searched for in a search structure containing the similarity elements for each of the fixed sized data chunks in a repository of data. Positions of similar data are located in the repository. The positions of the similar data are used to locate and load into the memory stored digest values and corresponding stored digest block boundaries of the similar data in the repository. The digest values and the corresponding digest block boundaries of the input data are matched with the stored digest values and the corresponding stored digest block boundaries to find data matches.
    Type: Grant
    Filed: March 15, 2013
    Date of Patent: January 17, 2017
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Shay H. Akirav, Lior Aronovich, Shira Ben-Dor, Michael Hirsch, Ofer Leneman
  • Patent number: 9536104
    Abstract: Various embodiments are provided for managing a global cache coherency in a distributed shared caching for a clustered file system (CFS). The CFS manages access permissions to an entire space of data segments by using the DSM module. In response to receiving a request to access one of the data segments, a calculation operation is performed for obtaining most recent contents of one of the data segments. Most recent contents are determined if ownership of the one of the data segments is possessed by a remote DSM module and the request to access one of the data segments is for shared permission and exists in the local external cache. The most recent contents are transported within the response if the response is in a remote external cache and has a valid permission for the one of the data segments; otherwise, the one of the data segments is read.
    Type: Grant
    Filed: December 3, 2015
    Date of Patent: January 3, 2017
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Lior Aronovich, Yair Toaff, Gil Paz, Ron Asher
  • Publication number: 20160371347
    Abstract: Various embodiments for managing data in a data storage having data deduplication. For a back reference data structure incorporating reference information for at least one user data segment to a storage block, using a plurality of hash functions to convert between a plurality of form types of user data segment identifications (IDs) representative of the at least one user data segment.
    Type: Application
    Filed: June 18, 2015
    Publication date: December 22, 2016
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventor: Lior ARONOVICH
  • Publication number: 20160371294
    Abstract: Various embodiments for managing data in a data storage having data deduplication. In response to a portion of the data storage determined to be inaccessible, an identifier of a user data segment is queried by examining a corresponding back reference data structure, the back reference data structure implemented as an approximation of a relationship between the user data segment and a particular storage block in the data storage. If the outcome of the query is negative, the user data segment is determined not to be associated with the particular storage block. If the outcome of the query is positive, the user data segment warrants further examination to determine if the user data segment is associated with the particular storage block.
    Type: Application
    Filed: June 18, 2015
    Publication date: December 22, 2016
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Lior ARONOVICH, Amir KREDI
  • Publication number: 20160371308
    Abstract: Various embodiments for managing data in a data storage having data deduplication. A back reference data structure is configured for user data segments as a mechanism to identify an affected storage block to which information in the back reference data structure refers. The back reference data structure is initialized such that a resolution of the back reference data structure diminishes as a number of the user data segments referencing the affected storage block increases.
    Type: Application
    Filed: June 18, 2015
    Publication date: December 22, 2016
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Shay H. AKIRAV, Lior ARONOVICH, Yariv BACHAR, Shira BEN-DOR, Rafael BUCHBINDER, Amir KREDI
  • Publication number: 20160371295
    Abstract: Various embodiments for managing data in a data storage having data deduplication. For a back reference data structure incorporating reference information for at least one user data segment to a storage block, a user data segment identification (ID) representative of the at least one user data segment is removed from the back reference data structure.
    Type: Application
    Filed: June 18, 2015
    Publication date: December 22, 2016
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Lior ARONOVICH, Amir KREDI
  • Publication number: 20160342482
    Abstract: A computer-implemented method, according to one embodiment, includes, for each repository data chunk in repository data that comprises a plurality of the repository data chunks, generating a corresponding set of repository distinguishing characteristics (RDCs). Each set of RDCs is generated by: applying a hash function to the respective input data chunk or repository data chunk to generate a plurality of hashes, each hash comprising a hash value and a hash position within the data chunk, applying a first function to the plurality of generated hashes to identify a first subset of hashes distributed across the data chunk, applying a second function to the hash positions of the hashes of the first subset to identify a second subset of the plurality of generated hashes, and defining the second subset of hashes as the set of RDCs.
    Type: Application
    Filed: August 1, 2016
    Publication date: November 24, 2016
    Inventors: Lior Aronovich, Ron Asher, Eitan Bachmat, Haim Bitner, Michael Hirsch, Shmuel T. Klein
  • Publication number: 20160335285
    Abstract: A computer program product for searching a repository of binary uninterpreted data, according to one embodiment, includes a computer readable storage medium having program instructions executable by a computer to cause the computer to perform a method comprising: analyzing, by the computer, segments of each of the repository and input data to determine a repository segment that is similar to an input segment, the analyzing including searching an index of representation values of the repository data for matching representation values of the input in a time independent of a size of the repository and linear in a size of the input data; and analyzing, by the computer, the similar repository segment with respect to the input segment to determine their common data sections while utilizing at least some of the matching representation values for data alignment, in a time linear in a size of the input segment.
    Type: Application
    Filed: July 25, 2016
    Publication date: November 17, 2016
    Inventors: Lior Aronovich, Ron Asher, Eitan Bachmat, Haim Bitner, Michael Hirsch, Shmuel T. Klein
  • Patent number: 9483483
    Abstract: Applying a content defined minimum size bound on blocks produced by content defined segmentation of data by calculating the size of the interval of data between a newly found candidate segmenting position and a last candidate segmenting position of same or higher hierarchy level, and then discarding the newly found candidate segmenting position if a size of an interval of data is lower than the minimum size bound, or retaining the newly found candidate segmenting position if the size of the interval of data is not lower than the minimum size bound or if there is no last candidate segmenting position of a same or higher hierarchy level as the newly found candidate segmenting position. When a last candidate segmenting position of a same or higher hierarchy level becomes available, the evaluation is reiterated to converge edge segmenting positions of the outputs of consecutive calculation units.
    Type: Grant
    Filed: January 15, 2016
    Date of Patent: November 1, 2016
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventor: Lior Aronovich
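
The minimum size bound described in the abstract of patent 9483483 can be illustrated with a minimal Python sketch. This is not the patented method: it uses a single hierarchy level, omits the reiteration step between consecutive calculation units, and measures each interval against the last retained position. The function names, the polynomial rolling hash, and the constants WINDOW, MASK, and MIN_SIZE are all assumptions chosen for illustration.

```python
WINDOW = 16            # bytes covered by the rolling hash (illustrative value)
MASK = (1 << 6) - 1    # a position is a candidate when the low 6 hash bits are zero
MIN_SIZE = 32          # minimum allowed distance between retained positions

BASE = 257
MOD = (1 << 61) - 1


def candidate_positions(data, window=WINDOW, mask=MASK):
    """Yield candidate segmenting positions found by a simple rolling hash.

    A position p means "cut after byte p - 1". This hash is a stand-in;
    the source does not say which hash is used.
    """
    top = pow(BASE, window - 1, MOD)  # weight of the byte leaving the window
    h = 0
    for i, byte in enumerate(data):
        if i >= window:
            # Remove the contribution of the byte that slides out of the window.
            h = (h - data[i - window] * top) % MOD
        h = (h * BASE + byte) % MOD
        if i + 1 >= window and (h & mask) == 0:
            yield i + 1


def apply_min_size(candidates, min_size=MIN_SIZE):
    """Discard candidates closer than min_size to the last retained one.

    A candidate with no earlier retained position is always kept, mirroring
    the "no last candidate segmenting position" case in the abstract.
    """
    kept = []
    last = None
    for pos in candidates:
        if last is None or pos - last >= min_size:
            kept.append(pos)
            last = pos
        # Otherwise the interval is below the bound and the candidate is dropped.
    return kept
```

For example, apply_min_size([10, 20, 50, 60, 100], 32) returns [10, 50, 100]: positions 20 and 60 are dropped because each lies only 10 bytes after the previously retained position.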