Patents by Inventor Yair Toaff

Yair Toaff has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Publication number: 20230409222
    Abstract: A computer-implemented method for indexing a data item in a data storage system includes: dividing the data item into one or more large blocks; dividing each large block into one or more small blocks; calculating a strong hash value for each of the small blocks and storing a list of strong hash values with a pointer to a location of the large block; from the list of strong hash values calculated for each large block, selecting one or more representative hash values for the large block; and compiling a sparse index including an entry for each large block. Each entry is based on the representative hash values and a pointer to the list of strong hash values for each large block.
    Type: Application
    Filed: September 5, 2023
    Publication date: December 21, 2023
    Inventors: Ovad Somech, Assaf Natanzon, Idan Zach, Aviv Kuvent, Yair Toaff, Elizabeth Firman, David Spinadel
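
As a rough illustration of the indexing steps in the abstract above, the following minimal sketch splits an item into large and small blocks, hashes the small blocks, and builds a sparse index. The block sizes, the use of SHA-256 as the strong hash, the rule of choosing the smallest hash values as representatives, and all names (`index_item`, `hash_lists`, `sparse_index`) are assumptions made for this example, not details taken from the application.

```python
import hashlib

SMALL = 4    # small-block size in bytes (illustrative assumption)
LARGE = 16   # large-block size in bytes (illustrative assumption)
REPS = 2     # representative hashes kept per large block (assumption)


def strong_hash(data: bytes) -> str:
    """SHA-256 stands in for the 'strong hash' here; this is an assumption."""
    return hashlib.sha256(data).hexdigest()


def index_item(item: bytes):
    """Return (hash_lists, sparse_index) for one data item."""
    hash_lists = []    # one record per large block: location + strong hashes
    sparse_index = {}  # representative hash -> position of record in hash_lists
    for offset in range(0, len(item), LARGE):
        large = item[offset:offset + LARGE]
        # Strong hash of every small block inside this large block.
        hashes = [strong_hash(large[i:i + SMALL])
                  for i in range(0, len(large), SMALL)]
        hash_lists.append({"location": offset, "hashes": hashes})
        # Assumed selection rule: the REPS smallest hash values represent the block.
        for rep in sorted(set(hashes))[:REPS]:
            sparse_index.setdefault(rep, len(hash_lists) - 1)
    return hash_lists, sparse_index
```

A lookup would hash the small blocks of incoming data, probe `sparse_index` with its representatives, and only then load the full hash list of the candidate large block.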
  • Patent number: 11656991
    Abstract: An information processing device comprises: a memory comprising a cache for storing information related to an object from a plurality of objects, and a summary structure configured to store a summary for the object; a volume configured to store a merge file including the plurality of objects, and a set of dump-files, each dump-file being associated with a specific cache-dump operation of the cache; and a processor configured to assign, to the cache, a first identifier; perform a cache-dump operation based on generating a dump-file associated with the first identifier and storing the information related to the object from the cache to the generated dump-file; and assign, to the cache, a second identifier, wherein the second identifier is larger than the first identifier.
    Type: Grant
    Filed: January 3, 2022
    Date of Patent: May 23, 2023
    Assignee: Huawei Technologies Co., Ltd.
    Inventors: Aviv Kuvent, Yair Toaff
  • Patent number: 11507539
    Abstract: An apparatus stores received data blocks as deduplicated data blocks. The apparatus is configured to: maintain a plurality of containers, where a reference to a container is unique within the apparatus and each container includes one or more data segments and segment metadata for each data segment, the segment metadata including a segment identifier and a segment reference, where the segment identifier is unique within the container and the segment reference is unique within the apparatus; and maintain a plurality of deduplicated data blocks storing received data blocks, where each deduplicated data block includes a plurality of identified container references, where a container reference identifier is unique within the deduplicated data block, and an ordered list of one or more segment indicators.
    Type: Grant
    Filed: February 25, 2020
    Date of Patent: November 22, 2022
    Assignee: Huawei Technologies Co., Ltd.
    Inventors: Michael Hirsch, Yehonatan David, Yair Toaff
  • Publication number: 20220121575
    Abstract: An information processing device comprises: a memory comprising a cache for storing information related to an object from a plurality of objects, and a summary structure configured to store a summary for the object; a volume configured to store a merge file including the plurality of objects, and a set of dump-files, each dump-file being associated with a specific cache-dump operation of the cache; and a processor configured to assign, to the cache, a first identifier; perform a cache-dump operation based on generating a dump-file associated with the first identifier and storing the information related to the object from the cache to the generated dump-file; and assign, to the cache, a second identifier, wherein the second identifier is larger than the first identifier.
    Type: Application
    Filed: January 3, 2022
    Publication date: April 21, 2022
    Inventors: Aviv KUVENT, Yair TOAFF
  • Publication number: 20200192760
    Abstract: The present disclosure relates to an apparatus for storing a received data block as one or more deduplicated data blocks. The apparatus includes a repository storing one or more containers, each container storing one or more data segments and segment metadata for each data segment. The apparatus further includes a database storing a plurality of deduplicated data blocks, each deduplicated data block containing a plurality of references to the data segments of the received data block and to the containers storing said data segments. The apparatus is configured to maintain, in the repository, a plurality of block backup files, each block backup file storing a copy of one or more deduplicated data blocks. The apparatus is configured to associate a deduplicated data block in the database with the block backup file in which a copy of the deduplicated data block is stored.
    Type: Application
    Filed: February 25, 2020
    Publication date: June 18, 2020
    Inventors: Yair TOAFF, Wei LI, Michael HIRSCH, Yehonatan DAVID
  • Publication number: 20200192871
    Abstract: An apparatus stores received data blocks as deduplicated data blocks. The apparatus is configured to: maintain a plurality of containers, where a reference to a container is unique within the apparatus and each container includes one or more data segments and segment metadata for each data segment, the segment metadata including a segment identifier and a segment reference, where the segment identifier is unique within the container and the segment reference is unique within the apparatus; and maintain a plurality of deduplicated data blocks storing received data blocks, where each deduplicated data block includes a plurality of identified container references, where a container reference identifier is unique within the deduplicated data block, and an ordered list of one or more segment indicators.
    Type: Application
    Filed: February 25, 2020
    Publication date: June 18, 2020
    Inventors: Michael HIRSCH, Yehonatan DAVID, Yair TOAFF
  • Patent number: 10621142
    Abstract: Input backup data is deduplicated with data of a synthetic backup previously constructed by a deduplication storage system. A synthetic backup is constructed by processing metadata instructions provided by a backup application. Deduplication digests are calculated based on the data of the synthetic backup and the deduplication digests are stored in a digests index. When new backup data is processed, deduplication digests of the new data are calculated and searched in the digests index. Matching digests of previously constructed synthetic backups are located in the digests index. Each of the located matching digest references stored data are included in the synthetic backup, and the stored data is similar to the input backup data. Data matches are found in the input backup data and data in the synthetic backup.
    Type: Grant
    Filed: November 29, 2017
    Date of Patent: April 14, 2020
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Lior Aronovich, Michael Hirsch, Yair Toaff
  • Patent number: 10585857
    Abstract: Input backup data is deduplicated with data of a synthetic backup previously constructed by a deduplication storage system. A synthetic backup is constructed by processing metadata instructions provided by a backup application. Deduplication digests are calculated based on the data of the synthetic backup and the deduplication digests are stored in a digests index. When new backup data is processed, deduplication digests of the new data are calculated and searched in the digests index. Matching digests of previously constructed synthetic backups are located in the digests index. Each of the located matching digest references stored data are included in the synthetic backup, and the stored data is similar to the input backup data. Data matches are found in the input backup data and data in the synthetic backup.
    Type: Grant
    Filed: November 17, 2017
    Date of Patent: March 10, 2020
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Lior Aronovich, Michael Hirsch, Yair Toaff
  • Patent number: 10489160
    Abstract: A system for compressing an input data stream to create a compressed output data stream is provided. The system comprises a memory storing a hash table comprising hash entries each comprising a hash value of an associated subset of following data items of an input data stream and a pointer to a memory location of the associated subset. A processor coupled to the memory executes operations while instructing an SIMD engine to execute concurrently one or more of the operations for consecutive subsets: calculate the hash value for each subset, search the hash table for a match of each calculated hash value and update the hash table according to the match result. The processor then updates the compressed output data stream according to the match result and a comparison result depending on the match result and operations for the plurality of associated subsets to create the compressed output data stream.
    Type: Grant
    Filed: January 11, 2019
    Date of Patent: November 26, 2019
    Assignee: Huawei Technologies Co., Ltd.
    Inventors: Michael Hirsch, Yehonatan David, Yair Toaff
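
The hash-table lookup described in the abstract above resembles the match finder of an LZ-style compressor. The sketch below is a sequential, non-SIMD approximation: it hashes each 4-byte subset, looks it up in a table of earlier positions, verifies the candidate by comparing bytes, and emits either a literal or a copy token. The token format, the 4-byte subset length, the use of CRC-32 as the hash, and the function names are assumptions for illustration only.

```python
import zlib

MIN_MATCH = 4  # length of the hashed subset (illustrative assumption)


def compress(data: bytes):
    """Return a list of ("lit", byte) and ("copy", distance, length) tokens."""
    table = {}  # hash of a 4-byte subset -> most recent position seen
    out = []
    i = 0
    while i < len(data):
        match_len = 0
        if i + MIN_MATCH <= len(data):
            key = zlib.crc32(data[i:i + MIN_MATCH])
            cand = table.get(key)
            table[key] = i  # update the table with the current position
            if cand is not None:
                # Compare actual bytes: a hash match alone is not proof in general.
                while (i + match_len < len(data)
                       and data[cand + match_len] == data[i + match_len]):
                    match_len += 1
        if match_len >= MIN_MATCH:
            out.append(("copy", i - cand, match_len))
            i += match_len
        else:
            out.append(("lit", data[i]))
            i += 1
    return out


def decompress(tokens) -> bytes:
    """Invert compress(); copies may overlap their own output."""
    out = bytearray()
    for tok in tokens:
        if tok[0] == "lit":
            out.append(tok[1])
        else:
            _, dist, length = tok
            for _ in range(length):
                out.append(out[-dist])
    return bytes(out)
```

The patent's point is that the hashing, lookup, and table update for several consecutive subsets can run concurrently on a SIMD engine; this sketch only shows the per-subset logic one position at a time.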
  • Patent number: 10459961
    Abstract: A system for segmenting an input data stream using vector processing, comprising a processor adapted to repeat the following steps throughout an input data stream to create a segmented data stream consisting of a plurality of segments: apply a rolling sequence over a sequence of consecutive data items of an input data stream, the rolling sequence includes a subset of consecutive data items of the sequence, calculate concurrently a plurality of partial hash values each by one of a plurality of processing pipelines of the processor, each for a respective one of a plurality of partial rolling sequences each including evenly spaced data items of the subset, determine compliance of each of the plurality of partial hash values with one or more respective partial segmentation criteria and designate the sequence as a variable size segment when at least some of the partial hash values comply with the respective partial segmentation criteria.
    Type: Grant
    Filed: August 2, 2017
    Date of Patent: October 29, 2019
    Assignee: Huawei Technologies Co., Ltd.
    Inventors: Yehonatan David, Yair Toaff, Michael Hirsch
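
One way to picture the partial hashes in the abstract above is to interleave the bytes of a sliding window across several lanes and hash each lane separately. The sketch below does this sequentially in plain Python. The window size, lane count, polynomial hash, the low-bit cut criterion, and the choice to require every lane to comply (the abstract only says "at least some") are all assumptions for illustration. It also recomputes each window from scratch rather than rolling the hash incrementally.

```python
WINDOW = 8  # bytes in the rolling window (illustrative assumption)
LANES = 4   # number of interleaved partial sequences (assumption)
MASK = 0x1  # assumed criterion: the low bit of each partial hash must be zero


def partial_hashes(window: bytes) -> list[int]:
    """Hash each lane; lane k holds the evenly spaced bytes window[k::LANES]."""
    hashes = []
    for lane in range(LANES):
        h = 0
        for b in window[lane::LANES]:
            h = (h * 31 + b) & 0xFFFF
        hashes.append(h)
    return hashes


def segment(data: bytes) -> list[bytes]:
    """Cut data into variable-size segments at content-defined boundaries."""
    segments, start = [], 0
    for end in range(WINDOW, len(data) + 1):
        if end - start < WINDOW:
            continue  # keep every cut segment at least one window long
        window = data[end - WINDOW:end]
        if all(h & MASK == 0 for h in partial_hashes(window)):
            segments.append(data[start:end])
            start = end
    if start < len(data):
        segments.append(data[start:])  # trailing remainder
    return segments
```

Because each lane depends only on its own evenly spaced bytes, the lanes are independent of one another, which is what would let separate processing pipelines evaluate them concurrently.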
  • Patent number: 10437817
    Abstract: A system for segmenting an input data stream, comprising a processor adapted to split an input data stream to a plurality of data sub-streams such that each of the plurality of data sub-streams has an overlapping portion with a consecutive data sub-stream of the plurality of data sub-streams, create concurrently a plurality of segmented data sub-streams by concurrently segmenting the plurality of data sub-streams each in one of a plurality of processing pipelines of the processor and join the plurality of segmented data sub-streams to create a segmented data stream by synchronizing a sequencing of each of the plurality of segmented data sub-streams according to one or more overlapping segments in the overlapping portion of each two consecutive data sub-streams of the plurality of data sub-streams.
    Type: Grant
    Filed: August 27, 2018
    Date of Patent: October 8, 2019
    Assignee: Huawei Technologies Co., Ltd.
    Inventors: Michael Hirsch, Yair Toaff, Yehonatan David
  • Publication number: 20190146801
    Abstract: A system for compressing an input data stream to create a compressed output data stream, comprising a memory for storing a hash table comprising hash entries each comprising a hash value of an associated subset of following data items of an input data stream and a pointer to a memory location of the associated subset. A processor coupled to the memory executes the following operations while instructing a SIMD engine to execute concurrently one or more of the operations for consecutive subsets: calculate the hash value for each subset, search the hash table for a match of each calculated hash value and update the hash table according to the match result. The processor then updates the compressed output data stream according to the match result and a comparison result depending on the match result and operations for the plurality of associated subsets to create the compressed output data stream.
    Type: Application
    Filed: January 11, 2019
    Publication date: May 16, 2019
    Inventors: Michael HIRSCH, Yehonatan DAVID, Yair TOAFF
  • Patent number: 10229131
    Abstract: For producing digest block segmentations based on reference segmentations in a data deduplication system using a processor device in a computing environment, digests are calculated for an input data chunk. Data matches and data mismatches are produced based on matching input digests with reference digests. Secondary digest block segmentations are obtained from similar reference intervals for each of the data mismatches and applied to the input data.
    Type: Grant
    Filed: July 15, 2013
    Date of Patent: March 12, 2019
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Shay H. Akirav, Lior Aronovich, Michael Hirsch, Yair Toaff
  • Publication number: 20180365284
    Abstract: A system for segmenting an input data stream, comprising a processor adapted to split an input data stream to a plurality of data sub-streams such that each of the plurality of data sub-streams has an overlapping portion with a consecutive data sub-stream of the plurality of data sub-streams, create concurrently a plurality of segmented data sub-streams by concurrently segmenting the plurality of data sub-streams each in one of a plurality of processing pipelines of the processor and join the plurality of segmented data sub-streams to create a segmented data stream by synchronizing a sequencing of each of the plurality of segmented data sub-streams according to one or more overlapping segments in the overlapping portion of each two consecutive data sub-streams of the plurality of data sub-streams.
    Type: Application
    Filed: August 27, 2018
    Publication date: December 20, 2018
    Inventors: Michael HIRSCH, Yair TOAFF, Yehonatan DAVID
  • Publication number: 20180095986
    Abstract: Input backup data is deduplicated with data of a synthetic backup previously constructed by a deduplication storage system. A synthetic backup is constructed by processing metadata instructions provided by a backup application. Deduplication digests are calculated based on the data of the synthetic backup and the deduplication digests are stored in a digests index. When new backup data is processed, deduplication digests of the new data are calculated and searched in the digests index. Matching digests of previously constructed synthetic backups are located in the digests index. Each of the located matching digest references stored data are included in the synthetic backup, and the stored data is similar to the input backup data. Data matches are found in the input backup data and data in the synthetic backup.
    Type: Application
    Filed: November 17, 2017
    Publication date: April 5, 2018
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Lior ARONOVICH, Michael HIRSCH, Yair TOAFF
  • Publication number: 20180081898
    Abstract: Input backup data is deduplicated with data of a synthetic backup previously constructed by a deduplication storage system. A synthetic backup is constructed by processing metadata instructions provided by a backup application. Deduplication digests are calculated based on the data of the synthetic backup and the deduplication digests are stored in a digests index. When new backup data is processed, deduplication digests of the new data are calculated and searched in the digests index. Matching digests of previously constructed synthetic backups are located in the digests index. Each of the located matching digest references stored data are included in the synthetic backup, and the stored data is similar to the input backup data. Data matches are found in the input backup data and data in the synthetic backup.
    Type: Application
    Filed: November 29, 2017
    Publication date: March 22, 2018
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Lior ARONOVICH, Michael HIRSCH, Yair TOAFF
  • Patent number: 9858286
    Abstract: Input backup data is deduplicated with data of a synthetic backup previously constructed by a deduplication storage system. A synthetic backup is constructed by processing metadata instructions provided by a backup application. Deduplication digests are calculated based on the data of the synthetic backup and the deduplication digests are stored in a digests index. When new backup data is processed, deduplication digests of the new data are calculated and searched in the digests index. Matching digests of previously constructed synthetic backups are located in the digests index. Each of the located matching digest references stored data are included in the synthetic backup, and the stored data is similar to the input backup data. Data matches are found in the input backup data and data in the synthetic backup.
    Type: Grant
    Filed: March 13, 2013
    Date of Patent: January 2, 2018
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Lior Aronovich, Michael Hirsch, Yair Toaff
  • Patent number: 9852145
    Abstract: A deduplication storage system and a backup application create a synthetic backup. Metadata instructions are provided to the deduplication storage system. Each of the metadata instructions specifies the data segment of an originating backup and a designated location of the data segment in the synthetic backup. A set of metadata instructions is transformed into a transformed set of metadata instructions.
    Type: Grant
    Filed: March 13, 2013
    Date of Patent: December 26, 2017
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Lior Aronovich, Michael Hirsch, Yair Toaff
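
The metadata instructions in the abstract above can be thought of as "copy this range of an earlier backup to this position" records. The sketch below applies such records to build a synthetic backup in memory. The `Instruction` fields, the `build_synthetic` function, and the byte-copying approach are hypothetical. A deduplicating system would more likely record references to stored data than copy bytes, and the instruction transformation mentioned in the abstract is not modeled here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Instruction:
    """One hypothetical metadata instruction."""
    backup_id: str   # which originating backup to read from
    src_offset: int  # start of the data segment in that backup
    length: int      # length of the data segment
    dst_offset: int  # designated location in the synthetic backup


def build_synthetic(backups: dict[str, bytes],
                    instructions: list[Instruction],
                    size: int) -> bytes:
    """Assemble a synthetic backup of `size` bytes from the instructions."""
    out = bytearray(size)
    for ins in instructions:
        chunk = backups[ins.backup_id][ins.src_offset:ins.src_offset + ins.length]
        if len(chunk) != ins.length:
            raise ValueError("source range lies outside the originating backup")
        if ins.dst_offset + ins.length > size:
            raise ValueError("destination range lies outside the synthetic backup")
        out[ins.dst_offset:ins.dst_offset + ins.length] = chunk
    return bytes(out)
```

For example, combining the first and last thirds of a full backup with a changed middle third from an incremental backup takes three instructions and never re-reads the client's data.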
  • Publication number: 20170344559
    Abstract: A system for segmenting an input data stream using vector processing, comprising a processor adapted to repeat the following steps throughout an input data stream to create a segmented data stream consisting a plurality of segments: apply a rolling sequence over a sequence of consecutive data items of an input data stream, the rolling sequence includes a subset of consecutive data items of the sequence, calculate concurrently a plurality of partial hash values each by one of a plurality of processing pipelines of the processor, each for a respective one of a plurality of partial rolling sequences each including evenly spaced data items of the subset, determine compliance of each of the plurality of partial hash values with one or more respective partial segmentation criteria and designate the sequence as a variable size segment when at least some of the partial hash values comply with the respective partial segmentation criteria.
    Type: Application
    Filed: August 2, 2017
    Publication date: November 30, 2017
    Applicant: HUAWEI TECHNOLOGIES CO., LTD.
    Inventors: Yehonatan DAVID, Yair TOAFF, Michael HIRSCH
  • Patent number: 9747055
    Abstract: Exemplary method, system, and computer program product embodiments for scalable data deduplication working with small data chunk in a computing environment are provided. In one embodiment, by way of example only, for each small data chunk, a signature is generated based on a combination of a representation of characters used in selecting data to be deduplicated. A c-spectrum of the small data chunk being a sequence of representations of different characters ordered by a frequency of occurrence in the small data chunk, and an f-spectrum of the small data chunk being a corresponding sequence of frequencies of the different characters in the small data chunk.
    Type: Grant
    Filed: June 8, 2015
    Date of Patent: August 29, 2017
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Lior Aronovich, Ron Asher, Michael Hirsch, Shmuel T. Klein, Ehud Meiri, Yair Toaff
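
The c-spectrum and f-spectrum defined in the abstract above can be computed with a simple frequency count. In this minimal sketch, "characters" are taken to be byte values and ties in frequency are broken by byte value. Both choices, and the function name `spectra`, are assumptions for illustration. How the patent combines the spectra into a signature is not shown.

```python
from collections import Counter


def spectra(chunk: bytes):
    """Return (c_spectrum, f_spectrum) for a small data chunk.

    c_spectrum: distinct byte values ordered by descending frequency.
    f_spectrum: the corresponding frequencies, in the same order.
    """
    counts = Counter(chunk)
    # Assumed tie-break: equal frequencies are ordered by ascending byte value.
    ordered = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    c_spectrum = [byte for byte, _ in ordered]
    f_spectrum = [freq for _, freq in ordered]
    return c_spectrum, f_spectrum


c, f = spectra(b"abracadabra")
print(bytes(c).decode(), f)  # prints "abrcd [5, 2, 2, 1, 1]"
```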