Patents by Inventor Yair Toaff
Yair Toaff has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Publication number: 20230409222
Abstract: A computer-implemented method for indexing a data item in a data storage system includes: dividing the data item into one or more large blocks; dividing each large block into one or more small blocks; calculating a strong hash value for each of the small blocks and storing a list of strong hash values with a pointer to a location of the large block; from the list of strong hash values calculated for each large block, selecting one or more representative hash values for the large block; and compiling a sparse index including an entry for each large block. Each entry is based on the representative hash values and a pointer to the list of strong hash values for each large block.
Type: Application
Filed: September 5, 2023
Publication date: December 21, 2023
Inventors: Ovad Somech, Assaf Natanzon, Idan Zach, Aviv Kuvent, Yair Toaff, Elizabeth Firman, David Spinadel
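The indexing steps in this abstract can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the method claimed in the application: the block sizes `LARGE` and `SMALL`, the choice of SHA-256 as the strong hash, and the rule of taking the smallest hash values as representatives are all hypothetical.

```python
import hashlib

LARGE = 8  # hypothetical large-block size in bytes (tiny, for illustration)
SMALL = 2  # hypothetical small-block size in bytes


def build_sparse_index(data: bytes, reps_per_block: int = 1):
    """Return (sparse_index, hash_lists) for `data`."""
    hash_lists = []    # one entry per large block: (offset, [strong hashes])
    sparse_index = {}  # representative hash -> position in hash_lists
    for offset in range(0, len(data), LARGE):
        large = data[offset:offset + LARGE]
        # One strong hash per small block of this large block.
        strong = [hashlib.sha256(large[i:i + SMALL]).hexdigest()
                  for i in range(0, len(large), SMALL)]
        hash_lists.append((offset, strong))
        # Assumption: the smallest hash values act as representatives.
        for rep in sorted(set(strong))[:reps_per_block]:
            sparse_index.setdefault(rep, len(hash_lists) - 1)
    return sparse_index, hash_lists
```

Because only the representatives are kept in `sparse_index`, the index stays much smaller than the full hash lists, while a hit still leads to the complete list for that large block.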
-
Patent number: 11656991
Abstract: An information processing device comprises: a memory comprising a cache for storing information related to an object from a plurality of objects, and a summary structure configured to store a summary for the object; a volume configured to store a merge file including the plurality of objects, and a set of dump-files, each dump-file being associated with a specific cache-dump operation of the cache; and a processor configured to assign, to the cache, a first identifier; perform a cache-dump operation based on generating a dump-file associated with the first identifier and storing the information related to the object from the cache to the generated dump-file; and assign, to the cache, a second identifier, wherein the second identifier is larger than the first identifier.
Type: Grant
Filed: January 3, 2022
Date of Patent: May 23, 2023
Assignee: Huawei Technologies Co., Ltd.
Inventors: Aviv Kuvent, Yair Toaff
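A minimal sketch of the identifier scheme described here, assuming a directory stands in for the volume and JSON files stand in for dump-files. The class name `DumpingCache`, the file naming, and the choice to empty the cache after a dump are hypothetical; the summary structure and the merge file are left out.

```python
import json
import os


class DumpingCache:
    def __init__(self, volume_dir: str, first_id: int = 1):
        self.volume_dir = volume_dir
        self.cache_id = first_id  # identifier currently assigned to the cache
        self.entries = {}         # object key -> information about the object

    def put(self, key: str, info: dict) -> None:
        self.entries[key] = info

    def dump(self) -> str:
        """Write the cache to a dump-file tagged with the current identifier."""
        path = os.path.join(self.volume_dir, f"dump-{self.cache_id:08d}.json")
        with open(path, "w") as f:
            json.dump(self.entries, f)
        self.entries = {}   # assumption: the cache starts empty after a dump
        self.cache_id += 1  # the next identifier is larger than the previous one
        return path
```

Since identifiers only grow, sorting the dump-files by name recovers the order in which the cache-dump operations happened.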
-
Patent number: 11507539
Abstract: An apparatus stores received data blocks as deduplicated data blocks. The apparatus is configured to: maintain a plurality of containers, where a reference to a container is unique within the apparatus and each container includes one or more data segments and segment metadata for each data segment, the segment metadata including a segment identifier and a segment reference, where the segment identifier is unique within the container and the segment reference is unique within the apparatus; and maintain a plurality of deduplicated data blocks storing received data blocks, where each deduplicated data block includes a plurality of identified container references, where a container reference identifier is unique within the deduplicated data block, and an ordered list of one or more segment indicators.
Type: Grant
Filed: February 25, 2020
Date of Patent: November 22, 2022
Assignee: Huawei Technologies Co., Ltd.
Inventors: Michael Hirsch, Yehonatan David, Yair Toaff
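One way to picture the two levels of references is the sketch below. It is a simplification under assumptions: a segment indicator is modeled as a pair (container reference identifier, segment identifier), the container reference identifier is simply a position in the block's own list, and the apparatus-wide segment reference is omitted. None of these encodings is taken from the patent.

```python
from dataclasses import dataclass, field


@dataclass
class Container:
    # segment identifier (unique within this container) -> segment bytes
    segments: dict = field(default_factory=dict)


@dataclass
class DedupBlock:
    # Position in this list is the container reference identifier,
    # which only needs to be unique within this block.
    container_refs: list
    # Ordered (container reference identifier, segment identifier) pairs.
    indicators: list


def restore(block: DedupBlock, containers: dict) -> bytes:
    """Rebuild the received data block from its segment indicators."""
    return b"".join(
        containers[block.container_refs[ref_id]].segments[seg_id]
        for ref_id, seg_id in block.indicators
    )
```

Keeping the container references local to the block means each indicator can use a small index instead of repeating an apparatus-wide reference.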
-
Publication number: 20220121575
Abstract: An information processing device comprises: a memory comprising a cache for storing information related to an object from a plurality of objects, and a summary structure configured to store a summary for the object; a volume configured to store a merge file including the plurality of objects, and a set of dump-files, each dump-file being associated with a specific cache-dump operation of the cache; and a processor configured to assign, to the cache, a first identifier; perform a cache-dump operation based on generating a dump-file associated with the first identifier and storing the information related to the object from the cache to the generated dump-file; and assign, to the cache, a second identifier, wherein the second identifier is larger than the first identifier.
Type: Application
Filed: January 3, 2022
Publication date: April 21, 2022
Inventors: Aviv Kuvent, Yair Toaff
-
Publication number: 20200192760
Abstract: The present disclosure relates to an apparatus for storing a received data block as one or more deduplicated data blocks. The apparatus includes a repository storing one or more containers, each container storing one or more data segments and segment metadata for each data segment. The apparatus further includes a database storing a plurality of deduplicated data blocks, each deduplicated data block containing a plurality of references to the data segments of the received data block and to the containers storing said data segments. The apparatus is configured to maintain, in the repository, a plurality of block backup files, each block backup file storing a copy of one or more deduplicated data blocks. The apparatus is configured to associate a deduplicated data block in the database with the block backup file in which a copy of the deduplicated data block is stored.
Type: Application
Filed: February 25, 2020
Publication date: June 18, 2020
Inventors: Yair Toaff, Wei Li, Michael Hirsch, Yehonatan David
-
Publication number: 20200192871
Abstract: An apparatus stores received data blocks as deduplicated data blocks. The apparatus is configured to: maintain a plurality of containers, where a reference to a container is unique within the apparatus and each container includes one or more data segments and segment metadata for each data segment, the segment metadata including a segment identifier and a segment reference, where the segment identifier is unique within the container and the segment reference is unique within the apparatus; and maintain a plurality of deduplicated data blocks storing received data blocks, where each deduplicated data block includes a plurality of identified container references, where a container reference identifier is unique within the deduplicated data block, and an ordered list of one or more segment indicators.
Type: Application
Filed: February 25, 2020
Publication date: June 18, 2020
Inventors: Michael Hirsch, Yehonatan David, Yair Toaff
-
Patent number: 10621142
Abstract: Input backup data is deduplicated with data of a synthetic backup previously constructed by a deduplication storage system. A synthetic backup is constructed by processing metadata instructions provided by a backup application. Deduplication digests are calculated based on the data of the synthetic backup and the deduplication digests are stored in a digests index. When new backup data is processed, deduplication digests of the new data are calculated and searched in the digests index. Matching digests of previously constructed synthetic backups are located in the digests index. Each of the located matching digest references stored data are included in the synthetic backup, and the stored data is similar to the input backup data. Data matches are found in the input backup data and data in the synthetic backup.
Type: Grant
Filed: November 29, 2017
Date of Patent: April 14, 2020
Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
Inventors: Lior Aronovich, Michael Hirsch, Yair Toaff
-
Patent number: 10585857
Abstract: Input backup data is deduplicated with data of a synthetic backup previously constructed by a deduplication storage system. A synthetic backup is constructed by processing metadata instructions provided by a backup application. Deduplication digests are calculated based on the data of the synthetic backup and the deduplication digests are stored in a digests index. When new backup data is processed, deduplication digests of the new data are calculated and searched in the digests index. Matching digests of previously constructed synthetic backups are located in the digests index. Each of the located matching digest references stored data are included in the synthetic backup, and the stored data is similar to the input backup data. Data matches are found in the input backup data and data in the synthetic backup.
Type: Grant
Filed: November 17, 2017
Date of Patent: March 10, 2020
Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
Inventors: Lior Aronovich, Michael Hirsch, Yair Toaff
-
Patent number: 10489160
Abstract: A system for compressing an input data stream to create a compressed output data stream is provided. The system comprises a memory storing a hash table comprising hash entries each comprising a hash value of an associated subset of following data items of an input data stream and a pointer to a memory location of the associated subset. A processor coupled to the memory executes operations while instructing an SIMD engine to execute concurrently one or more of the operations for consecutive subsets: calculate the hash value for each subset, search the hash table for a match of each calculated hash value and update the hash table according to the match result. The processor then updates the compressed output data stream according to the match result and a comparison result depending on the match result and operations for the plurality of associated subsets to create the compressed output data stream.
Type: Grant
Filed: January 11, 2019
Date of Patent: November 26, 2019
Assignee: Huawei Technologies Co., Ltd.
Inventors: Michael Hirsch, Yehonatan David, Yair Toaff
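The hash-table loop in this abstract (hash the next few data items, look the hash up, update the table, then emit either a literal or a back-reference) resembles a generic LZ77-style compressor. The scalar sketch below shows that loop only; it does not model the SIMD engine, and the 4-byte subset length, the use of `zlib.crc32` as the hash, and the token format are assumptions made for illustration.

```python
import zlib

MIN_MATCH = 4  # hypothetical subset length in bytes


def compress(data: bytes) -> list:
    table = {}  # hash of the next MIN_MATCH bytes -> most recent position
    out = []    # tokens: ("lit", byte) or ("copy", distance, length)
    i = 0
    while i < len(data):
        cand = None
        key = data[i:i + MIN_MATCH]
        if len(key) == MIN_MATCH:
            h = zlib.crc32(key)
            cand = table.get(h)  # search the table for a match
            table[h] = i         # update the table with the new position
        length = 0
        if cand is not None:
            # Compare the actual bytes, since equal hashes do not prove a match.
            while i + length < len(data) and data[cand + length] == data[i + length]:
                length += 1
        if length >= MIN_MATCH:
            out.append(("copy", i - cand, length))
            i += length
        else:
            out.append(("lit", data[i]))
            i += 1
    return out


def decompress(tokens: list) -> bytes:
    buf = bytearray()
    for tok in tokens:
        if tok[0] == "lit":
            buf.append(tok[1])
        else:
            _, dist, length = tok
            for _ in range(length):
                buf.append(buf[-dist])  # byte-wise copy allows overlapping matches
    return bytes(buf)
```

In the patented system the hash, search, and update steps for several consecutive subsets are issued to a SIMD engine together; here they run one position at a time.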
-
Patent number: 10459961
Abstract: A system for segmenting an input data stream using vector processing, comprising a processor adapted to repeat the following steps throughout an input data stream to create a segmented data stream consisting a plurality of segments: apply a rolling sequence over a sequence of consecutive data items of an input data stream, the rolling sequence includes a subset of consecutive data items of the sequence, calculate concurrently a plurality of partial hash values each by one of a plurality of processing pipelines of the processor, each for a respective one of a plurality of partial rolling sequences each including evenly spaced data items of the subset, determine compliance of each of the plurality of partial hash values with one or more respective partial segmentation criteria and designate the sequence as a variable size segment when at least some of the partial hash values comply with the respective partial segmentation criteria.
Type: Grant
Filed: August 2, 2017
Date of Patent: October 29, 2019
Assignee: Huawei Technologies Co., Ltd.
Inventors: Yehonatan David, Yair Toaff, Michael Hirsch
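The idea of splitting the rolling window into evenly spaced partial sequences can be illustrated with a scalar Python sketch. Everything concrete here is an assumption: the window size, the number of lanes, a plain byte sum standing in for each partial hash, a bit-mask test as the partial segmentation criterion, the requirement that all lanes comply, and the `max_size` cap. The lanes are also computed one after another rather than in parallel pipelines.

```python
def segment(data: bytes, window: int = 16, lanes: int = 4,
            mask: int = 0x3, max_size: int = 64) -> list:
    segments, start = [], 0
    for end in range(1, len(data) + 1):
        size = end - start
        cut = size >= max_size  # hypothetical cap so segments stay bounded
        if not cut and size >= window:
            win = data[end - window:end]
            # Lane j sees items j, j + lanes, j + 2 * lanes, ... of the window.
            partial = [sum(win[j::lanes]) for j in range(lanes)]
            # Assumed criterion: every partial hash has its low bits clear.
            cut = all((p & mask) == 0 for p in partial)
        if cut:
            segments.append(data[start:end])
            start = end
    if start < len(data):
        segments.append(data[start:])  # trailing bytes form the last segment
    return segments
```

Because each lane touches a disjoint, evenly spaced slice of the window, the lane computations are independent of one another, which is what makes them candidates for separate vector pipelines.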
-
Patent number: 10437817
Abstract: A system for segmenting an input data stream, comprising a processor adapted to split an input data stream to a plurality of data sub-streams such that each of the plurality of data sub-streams has an overlapping portion with a consecutive data sub-stream of the plurality of data sub-streams, create concurrently a plurality of segmented data sub-streams by concurrently segmenting the plurality of data sub-streams each in one of a plurality of processing pipelines of the processor and join the plurality of segmented data sub-streams to create a segmented data stream by synchronizing a sequencing of each of the plurality of segmented data sub-streams according to one or more overlapping segments in the overlapping portion of each two consecutive data sub-streams of the plurality of data sub-streams.
Type: Grant
Filed: August 27, 2018
Date of Patent: October 8, 2019
Assignee: Huawei Technologies Co., Ltd.
Inventors: Michael Hirsch, Yair Toaff, Yehonatan David
-
Publication number: 20190146801
Abstract: A system for compressing an input data stream to create a compressed output data stream, comprising a memory for storing a hash table comprising hash entries each comprising a hash value of an associated subset of following data items of an input data stream and a pointer to a memory location of the associated subset. A processor coupled to the memory executes the following operations while instructing a SIMD engine to execute concurrently one or more of the operations for consecutive subsets: calculate the hash value for each subset, search the hash table for a match of each calculated hash value and update the hash table according to the match result. The processor then updates the compressed output data stream according to the match result and a comparison result depending on the match result and operations for the plurality of associated subsets to create the compressed output data stream.
Type: Application
Filed: January 11, 2019
Publication date: May 16, 2019
Inventors: Michael Hirsch, Yehonatan David, Yair Toaff
-
Patent number: 10229131
Abstract: For producing digest block segmentations based on reference segmentations in a data deduplication system using a processor device in a computing environment, digests are calculated for an input data chunk. Data matches and data mismatches are produced based on matching input digests with reference digests. Secondary digest block segmentations are obtained from similar reference intervals for each of the data mismatches and applied to the input data.
Type: Grant
Filed: July 15, 2013
Date of Patent: March 12, 2019
Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
Inventors: Shay H. Akirav, Lior Aronovich, Michael Hirsch, Yair Toaff
-
Publication number: 20180365284
Abstract: A system for segmenting an input data stream, comprising a processor adapted to split an input data stream to a plurality of data sub-streams such that each of the plurality of data sub-streams has an overlapping portion with a consecutive data sub-stream of the plurality of data sub-streams, create concurrently a plurality of segmented data sub-streams by concurrently segmenting the plurality of data sub-streams each in one of a plurality of processing pipelines of the processor and join the plurality of segmented data sub-streams to create a segmented data stream by synchronizing a sequencing of each of the plurality of segmented data sub-streams according to one or more overlapping segments in the overlapping portion of each two consecutive data sub-streams of the plurality of data sub-streams.
Type: Application
Filed: August 27, 2018
Publication date: December 20, 2018
Inventors: Michael Hirsch, Yair Toaff, Yehonatan David
-
Publication number: 20180095986
Abstract: Input backup data is deduplicated with data of a synthetic backup previously constructed by a deduplication storage system. A synthetic backup is constructed by processing metadata instructions provided by a backup application. Deduplication digests are calculated based on the data of the synthetic backup and the deduplication digests are stored in a digests index. When new backup data is processed, deduplication digests of the new data are calculated and searched in the digests index. Matching digests of previously constructed synthetic backups are located in the digests index. Each of the located matching digest references stored data are included in the synthetic backup, and the stored data is similar to the input backup data. Data matches are found in the input backup data and data in the synthetic backup.
Type: Application
Filed: November 17, 2017
Publication date: April 5, 2018
Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
Inventors: Lior Aronovich, Michael Hirsch, Yair Toaff
-
Publication number: 20180081898
Abstract: Input backup data is deduplicated with data of a synthetic backup previously constructed by a deduplication storage system. A synthetic backup is constructed by processing metadata instructions provided by a backup application. Deduplication digests are calculated based on the data of the synthetic backup and the deduplication digests are stored in a digests index. When new backup data is processed, deduplication digests of the new data are calculated and searched in the digests index. Matching digests of previously constructed synthetic backups are located in the digests index. Each of the located matching digest references stored data are included in the synthetic backup, and the stored data is similar to the input backup data. Data matches are found in the input backup data and data in the synthetic backup.
Type: Application
Filed: November 29, 2017
Publication date: March 22, 2018
Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
Inventors: Lior Aronovich, Michael Hirsch, Yair Toaff
-
Patent number: 9858286
Abstract: Input backup data is deduplicated with data of a synthetic backup previously constructed by a deduplication storage system. A synthetic backup is constructed by processing metadata instructions provided by a backup application. Deduplication digests are calculated based on the data of the synthetic backup and the deduplication digests are stored in a digests index. When new backup data is processed, deduplication digests of the new data are calculated and searched in the digests index. Matching digests of previously constructed synthetic backups are located in the digests index. Each of the located matching digest references stored data are included in the synthetic backup, and the stored data is similar to the input backup data. Data matches are found in the input backup data and data in the synthetic backup.
Type: Grant
Filed: March 13, 2013
Date of Patent: January 2, 2018
Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
Inventors: Lior Aronovich, Michael Hirsch, Yair Toaff
-
Patent number: 9852145
Abstract: A deduplication storage system and a backup application create a synthetic backup. Metadata instructions are provided to the deduplication storage system. Each of the metadata instructions specifies the data segment of an originating backup and a designated location of the data segment in the synthetic backup. A set of metadata instructions is transformed into a transformed set of metadata instructions.
Type: Grant
Filed: March 13, 2013
Date of Patent: December 26, 2017
Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
Inventors: Lior Aronovich, Michael Hirsch, Yair Toaff
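To make the notion of a metadata instruction concrete, the sketch below applies a list of instructions to build a synthetic backup. The `Instruction` fields and the byte-copying approach are assumptions for illustration; a deduplication storage system would more plausibly record references to already stored data than copy bytes, and the transformation step mentioned in the abstract is not modeled.

```python
from typing import NamedTuple


class Instruction(NamedTuple):
    source: str      # name of the originating backup
    src_offset: int  # where the data segment starts in that backup
    length: int      # size of the data segment in bytes
    dst_offset: int  # designated location in the synthetic backup


def build_synthetic(backups: dict, instructions: list) -> bytes:
    """Assemble a synthetic backup; assumes every segment lies inside its source."""
    size = max((i.dst_offset + i.length for i in instructions), default=0)
    out = bytearray(size)
    for ins in instructions:
        segment = backups[ins.source][ins.src_offset:ins.src_offset + ins.length]
        out[ins.dst_offset:ins.dst_offset + ins.length] = segment
    return bytes(out)
```

For example, taking the first half of a full backup and appending an incremental backup yields a new full image without reading data from the client again.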
-
Publication number: 20170344559
Abstract: A system for segmenting an input data stream using vector processing, comprising a processor adapted to repeat the following steps throughout an input data stream to create a segmented data stream consisting a plurality of segments: apply a rolling sequence over a sequence of consecutive data items of an input data stream, the rolling sequence includes a subset of consecutive data items of the sequence, calculate concurrently a plurality of partial hash values each by one of a plurality of processing pipelines of the processor, each for a respective one of a plurality of partial rolling sequences each including evenly spaced data items of the subset, determine compliance of each of the plurality of partial hash values with one or more respective partial segmentation criteria and designate the sequence as a variable size segment when at least some of the partial hash values comply with the respective partial segmentation criteria.
Type: Application
Filed: August 2, 2017
Publication date: November 30, 2017
Applicant: Huawei Technologies Co., Ltd.
Inventors: Yehonatan David, Yair Toaff, Michael Hirsch
-
Patent number: 9747055
Abstract: Exemplary method, system, and computer program product embodiments for scalable data deduplication working with small data chunk in a computing environment are provided. In one embodiment, by way of example only, for each small data chunk, a signature is generated based on a combination of a representation of characters used in selecting data to be deduplicated. A c-spectrum of the small data chunk being a sequence of representations of different characters ordered by a frequency of occurrence in the small data chunk, and an f-spectrum of the small data chunk being a corresponding sequence of frequencies of the different characters in the small data chunk.
Type: Grant
Filed: June 8, 2015
Date of Patent: August 29, 2017
Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
Inventors: Lior Aronovich, Ron Asher, Michael Hirsch, Shmuel T. Klein, Ehud Meiri, Yair Toaff
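The two spectra defined in this abstract are straightforward to compute. The sketch below follows the definitions as worded there, with one assumption: ties between equally frequent characters are broken by byte value, which the abstract does not specify. How the spectra are combined into a signature is not shown.

```python
from collections import Counter


def spectra(chunk: bytes):
    """Return (c_spectrum, f_spectrum) for a small data chunk."""
    counts = Counter(chunk)
    # Most frequent characters first; ties broken by byte value (assumption).
    ordered = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    c_spectrum = [char for char, _ in ordered]  # the distinct characters
    f_spectrum = [freq for _, freq in ordered]  # their frequencies, same order
    return c_spectrum, f_spectrum
```

For the chunk `b"banana"` this gives a c-spectrum of `[97, 110, 98]` (the bytes for `a`, `n`, `b`) and an f-spectrum of `[3, 2, 1]`.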