Patents by Inventor Yair Toaff

Yair Toaff has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

SYSTEM AND METHOD FOR INDEXING A DATA ITEM IN A DATA STORAGE SYSTEM

Publication number: 20230409222

Abstract: A computer-implemented method for indexing a data item in a data storage system includes: dividing the data item into one or more large blocks; dividing each large block into one or more small blocks; calculating a strong hash value for each of the small blocks and storing a list of strong hash values with a pointer to a location of the large block; from the list of strong hash values calculated for each large block, selecting one or more representative hash values for the large block; and compiling a sparse index including an entry for each large block. Each entry is based on the representative hash values and a pointer to the list of strong hash values for each large block.

Type: Application

Filed: September 5, 2023

Publication date: December 21, 2023

Inventors: Ovad Somech, Assaf Natanzon, Idan Zach, Aviv Kuvent, Yair Toaff, Elizabeth Firman, David Spinadel
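
The indexing steps in the abstract above can be illustrated with a minimal sketch. It is not the claimed implementation: the block sizes, the use of SHA-256 as the strong hash, the rule for picking representative hashes (the numerically smallest ones), and the name `index_item` are all assumptions made for illustration.

```python
import hashlib

LARGE_BLOCK = 4096     # hypothetical large-block size in bytes
SMALL_BLOCK = 512      # hypothetical small-block size in bytes
REPRESENTATIVES = 2    # representative hashes kept per large block (assumed)

def index_item(data: bytes):
    """Return (hash_lists, sparse_index) for one data item."""
    hash_lists = []    # one entry per large block: (offset, [strong hashes])
    sparse_index = {}  # representative hash -> position in hash_lists
    for offset in range(0, len(data), LARGE_BLOCK):
        large = data[offset:offset + LARGE_BLOCK]
        # Strong hash for every small block inside this large block.
        hashes = [hashlib.sha256(large[i:i + SMALL_BLOCK]).hexdigest()
                  for i in range(0, len(large), SMALL_BLOCK)]
        hash_lists.append((offset, hashes))
        # Assumed selection rule: keep the smallest distinct hash values.
        for rep in sorted(set(hashes))[:REPRESENTATIVES]:
            sparse_index.setdefault(rep, len(hash_lists) - 1)
    return hash_lists, sparse_index
```

Only the representative hashes go into the sparse index, so the index stays much smaller than the full list of strong hashes while still pointing to it.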
Device and method for maintaining summary consistency in caches

Patent number: 11656991

Abstract: An information processing device comprises: a memory comprising a cache for storing information related to an object from a plurality of objects, and a summary structure configured to store a summary for the object; a volume configured to store a merge file including the plurality of objects, and a set of dump-files, each dump-file being associated with a specific cache-dump operation of the cache; and a processor configured to assign, to the cache, a first identifier; perform a cache-dump operation based on generating a dump-file associated with the first identifier and storing the information related to the object from the cache to the generated dump-file; and assign, to the cache, a second identifier, wherein the second identifier is larger than the first identifier.

Type: Grant

Filed: January 3, 2022

Date of Patent: May 23, 2023

Assignee: Huawei Technologies Co., Ltd.

Inventors: Aviv Kuvent, Yair Toaff
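
The identifier handling described in the abstract above can be sketched roughly as follows. The class name `DumpingCache`, the in-memory dictionary standing in for dump-files on the volume, and the choice to empty the cache after a dump are assumptions; the summary structure and the merge file are not modeled.

```python
class DumpingCache:
    """Toy cache whose identifier grows with every cache-dump operation."""

    def __init__(self):
        self.identifier = 1    # first identifier assigned to the cache
        self.entries = {}      # object key -> cached information
        self.dump_files = {}   # identifier -> dumped contents (stand-in for files)

    def put(self, key, info):
        self.entries[key] = info

    def dump(self):
        # Generate a dump-file associated with the current identifier.
        dumped_id = self.identifier
        self.dump_files[dumped_id] = dict(self.entries)
        self.entries.clear()   # assumption: the cache is emptied after a dump
        # Assign a second, larger identifier to the cache.
        self.identifier = dumped_id + 1
        return dumped_id
```

Because identifiers only increase, any dump-file can be ordered relative to the current state of the cache by comparing the two numbers.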
Apparatus and method for storing received data blocks as deduplicated data blocks

Patent number: 11507539

Abstract: An apparatus stores received data blocks as deduplicated data blocks. The apparatus is configured to: maintain a plurality of containers, where a reference to a container is unique within the apparatus and each container includes one or more data segments and segment metadata for each data segment, the segment metadata including a segment identifier and a segment reference, where the segment identifier is unique within the container and the segment reference is unique within the apparatus; and maintain a plurality of deduplicated data blocks storing received data blocks, where each deduplicated data block includes a plurality of identified container references, where a container reference identifier is unique within the deduplicated data block, and an ordered list of one or more segment indicators.

Type: Grant

Filed: February 25, 2020

Date of Patent: November 22, 2022

Assignee: Huawei Technologies Co., Ltd.

Inventors: Michael Hirsch, Yehonatan David, Yair Toaff
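
One way to picture the layout described in the abstract above is the following sketch. The class names, the string container references, and the `restore` helper are hypothetical, and the apparatus-wide segment reference kept in the segment metadata is omitted for brevity.

```python
from dataclasses import dataclass, field

@dataclass
class Container:
    # Segment identifier (unique within this container) -> segment bytes.
    segments: dict = field(default_factory=dict)

@dataclass
class DedupBlock:
    # Position in this list acts as the block-local container reference identifier.
    container_refs: list
    # Ordered (local container identifier, segment identifier) pairs.
    indicators: list

def restore(block: DedupBlock, containers: dict) -> bytes:
    """Rebuild the received data block from its segment indicators."""
    return b"".join(
        containers[block.container_refs[local_id]].segments[seg_id]
        for local_id, seg_id in block.indicators
    )
```

The block stores only small local identifiers, so a segment that occurs several times in the data costs one indicator per occurrence rather than a copy of the bytes.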
DEVICE AND METHOD FOR MAINTAINING SUMMARY CONSISTENCY IN CACHES

Publication number: 20220121575

Abstract: An information processing device comprises: a memory comprising a cache for storing information related to an object from a plurality of objects, and a summary structure configured to store a summary for the object; a volume configured to store a merge file including the plurality of objects, and a set of dump-files, each dump-file being associated with a specific cache-dump operation of the cache; and a processor configured to assign, to the cache, a first identifier; perform a cache-dump operation based on generating a dump-file associated with the first identifier and storing the information related to the object from the cache to the generated dump-file; and assign, to the cache, a second identifier, wherein the second identifier is larger than the first identifier.

Type: Application

Filed: January 3, 2022

Publication date: April 21, 2022

Inventors: Aviv KUVENT, Yair TOAFF
APPARATUS AND METHOD FOR DEDUPLICATING DATA

Publication number: 20200192760

Abstract: The present disclosure relates to an apparatus for storing a received data block as one or more deduplicated data blocks. The apparatus includes a repository storing one or more containers, each container storing one or more data segments and segment metadata for each data segment. The apparatus further includes a database storing a plurality of deduplicated data blocks, each deduplicated data block containing a plurality of references to the data segments of the received data block and to the containers storing said data segments. The apparatus is configured to maintain, in the repository, a plurality of block backup files, each block backup file storing a copy of one or more deduplicated data blocks. The apparatus is configured to associate a deduplicated data block in the database with the block backup file in which a copy of the deduplicated data block is stored.

Type: Application

Filed: February 25, 2020

Publication date: June 18, 2020

Inventors: Yair TOAFF, Wei LI, Michael HIRSCH, Yehonatan DAVID
APPARATUS AND METHOD FOR STORING RECEIVED DATA BLOCKS AS DEDUPLICATED DATA BLOCKS

Publication number: 20200192871

Abstract: An apparatus stores received data blocks as deduplicated data blocks. The apparatus is configured to: maintain a plurality of containers, where a reference to a container is unique within the apparatus and each container includes one or more data segments and segment metadata for each data segment, the segment metadata including a segment identifier and a segment reference, where the segment identifier is unique within the container and the segment reference is unique within the apparatus; and maintain a plurality of deduplicated data blocks storing received data blocks, where each deduplicated data block includes a plurality of identified container references, where a container reference identifier is unique within the deduplicated data block, and an ordered list of one or more segment indicators.

Type: Application

Filed: February 25, 2020

Publication date: June 18, 2020

Inventors: Michael HIRSCH, Yehonatan DAVID, Yair TOAFF
Deduplicating input backup data with data of a synthetic backup previously constructed by a deduplication storage system

Patent number: 10621142

Abstract: Input backup data is deduplicated with data of a synthetic backup previously constructed by a deduplication storage system. A synthetic backup is constructed by processing metadata instructions provided by a backup application. Deduplication digests are calculated based on the data of the synthetic backup and the deduplication digests are stored in a digests index. When new backup data is processed, deduplication digests of the new data are calculated and searched in the digests index. Matching digests of previously constructed synthetic backups are located in the digests index. Each of the located matching digest references stored data are included in the synthetic backup, and the stored data is similar to the input backup data. Data matches are found in the input backup data and data in the synthetic backup.

Type: Grant

Filed: November 29, 2017

Date of Patent: April 14, 2020

Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: Lior Aronovich, Michael Hirsch, Yair Toaff
Creation of synthetic backups within deduplication storage system by a backup application

Patent number: 10585857

Abstract: Input backup data is deduplicated with data of a synthetic backup previously constructed by a deduplication storage system. A synthetic backup is constructed by processing metadata instructions provided by a backup application. Deduplication digests are calculated based on the data of the synthetic backup and the deduplication digests are stored in a digests index. When new backup data is processed, deduplication digests of the new data are calculated and searched in the digests index. Matching digests of previously constructed synthetic backups are located in the digests index. Each of the located matching digest references stored data are included in the synthetic backup, and the stored data is similar to the input backup data. Data matches are found in the input backup data and data in the synthetic backup.

Type: Grant

Filed: November 17, 2017

Date of Patent: March 10, 2020

Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: Lior Aronovich, Michael Hirsch, Yair Toaff
General purpose data compression using SIMD engine

Patent number: 10489160

Abstract: A system for compressing an input data stream to create a compressed output data stream is provided. The system comprises a memory storing a hash table comprising hash entries each comprising a hash value of an associated subset of following data items of an input data stream and a pointer to a memory location of the associated subset. A processor coupled to the memory executes operations while instructing an SIMD engine to execute concurrently one or more of the operations for consecutive subsets: calculate the hash value for each subset, search the hash table for a match of each calculated hash value and update the hash table according to the match result. The processor then updates the compressed output data stream according to the match result and a comparison result depending on the match result and operations for the plurality of associated subsets to create the compressed output data stream.

Type: Grant

Filed: January 11, 2019

Date of Patent: November 26, 2019

Assignee: Huawei Technologies Co., Ltd.

Inventors: Michael Hirsch, Yehonatan David, Yair Toaff
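
The hash-table matching loop in the abstract above resembles a classic dictionary compressor. The sketch below is scalar Python and does not show the SIMD concurrency that the patent is about; the 4-byte subset size, the 12-bit CRC-32 hash, and the token format are assumptions chosen for illustration.

```python
import zlib

MIN_MATCH = 4  # assumed subset length in bytes

def compress(data: bytes):
    table = {}   # hash value -> position of the most recent subset with that hash
    tokens = []
    i = 0
    while i < len(data):
        key = data[i:i + MIN_MATCH]
        cand = None
        if len(key) == MIN_MATCH:
            h = zlib.crc32(key) & 0xFFF   # calculate the hash value
            cand = table.get(h)           # search the hash table
            table[h] = i                  # update the hash table
        length = 0
        if cand is not None:
            # Compare the actual bytes, since different subsets may share a hash.
            while i + length < len(data) and data[cand + length] == data[i + length]:
                length += 1
        if length >= MIN_MATCH:
            tokens.append(("match", i - cand, length))
            i += length
        else:
            tokens.append(("literal", data[i]))
            i += 1
    return tokens

def decompress(tokens) -> bytes:
    buf = bytearray()
    for token in tokens:
        if token[0] == "literal":
            buf.append(token[1])
        else:
            _, distance, length = token
            for _ in range(length):
                buf.append(buf[-distance])  # byte-wise copy handles overlaps
    return bytes(buf)
```

The byte comparison after a table hit corresponds to the "comparison result" in the abstract: a hash match alone is not proof that the subsets are equal.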
Vector processing for segmentation hash values calculation

Patent number: 10459961

Abstract: A system for segmenting an input data stream using vector processing, comprising a processor adapted to repeat the following steps throughout an input data stream to create a segmented data stream consisting of a plurality of segments: apply a rolling sequence over a sequence of consecutive data items of an input data stream, the rolling sequence includes a subset of consecutive data items of the sequence, calculate concurrently a plurality of partial hash values each by one of a plurality of processing pipelines of the processor, each for a respective one of a plurality of partial rolling sequences each including evenly spaced data items of the subset, determine compliance of each of the plurality of partial hash values with one or more respective partial segmentation criteria and designate the sequence as a variable size segment when at least some of the partial hash values comply with the respective partial segmentation criteria.

Type: Grant

Filed: August 2, 2017

Date of Patent: October 29, 2019

Assignee: Huawei Technologies Co., Ltd.

Inventors: Yehonatan David, Yair Toaff, Michael Hirsch
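
For readers unfamiliar with hash-based segmentation, the sketch below shows the general idea with a single rolling hash in scalar Python. It does not reproduce the patented scheme, which splits the window into partial rolling sequences evaluated concurrently by several pipelines; the window size, the polynomial hash, and the boundary mask are assumptions.

```python
WINDOW = 16      # assumed rolling-window length in bytes
MASK = 0x3F      # assumed criterion: cut when the low 6 bits of the hash are zero
BASE = 257
MOD = 1 << 32

def segment(data: bytes):
    """Split data into variable-size segments at content-defined boundaries."""
    segments = []
    start = 0
    h = 0
    top = pow(BASE, WINDOW - 1, MOD)  # weight of the byte leaving the window
    for i, b in enumerate(data):
        if i - start >= WINDOW:
            # Slide the window: drop the oldest byte before adding the new one.
            h = (h - data[i - WINDOW] * top) % MOD
        h = (h * BASE + b) % MOD
        if i - start + 1 >= WINDOW and (h & MASK) == 0:
            segments.append(data[start:i + 1])
            start = i + 1
            h = 0
    if start < len(data):
        segments.append(data[start:])  # trailing segment without a boundary
    return segments
```

Boundaries depend only on the bytes inside the window, which is why this family of techniques tends to find the same cut points in similar data.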
Concurrent segmentation using vector processing

Patent number: 10437817

Abstract: A system for segmenting an input data stream, comprising a processor adapted to split an input data stream to a plurality of data sub-streams such that each of the plurality of data sub-streams has an overlapping portion with a consecutive data sub-stream of the plurality of data sub-streams, create concurrently a plurality of segmented data sub-streams by concurrently segmenting the plurality of data sub-streams each in one of a plurality of processing pipelines of the processor and join the plurality of segmented data sub-streams to create a segmented data stream by synchronizing a sequencing of each of the plurality of segmented data sub-streams according to one or more overlapping segments in the overlapping portion of each two consecutive data sub-streams of the plurality of data sub-streams.

Type: Grant

Filed: August 27, 2018

Date of Patent: October 8, 2019

Assignee: Huawei Technologies Co., Ltd.

Inventors: Michael Hirsch, Yair Toaff, Yehonatan David
GENERAL PURPOSE DATA COMPRESSION USING SIMD ENGINE

Publication number: 20190146801

Abstract: A system for compressing an input data stream to create a compressed output data stream, comprising a memory for storing a hash table comprising hash entries each comprising a hash value of an associated subset of following data items of an input data stream and a pointer to a memory location of the associated subset. A processor coupled to the memory executes the following operations while instructing a SIMD engine to execute concurrently one or more of the operations for consecutive subsets: calculate the hash value for each subset, search the hash table for a match of each calculated hash value and update the hash table according to the match result. The processor then updates the compressed output data stream according to the match result and a comparison result depending on the match result and operations for the plurality of associated subsets to create the compressed output data stream.

Type: Application

Filed: January 11, 2019

Publication date: May 16, 2019

Inventors: Michael HIRSCH, Yehonatan DAVID, Yair TOAFF
Digest block segmentation based on reference segmentation in a data deduplication system

Patent number: 10229131

Abstract: For producing digest block segmentations based on reference segmentations in a data deduplication system using a processor device in a computing environment, digests are calculated for an input data chunk. Data matches and data mismatches are produced based on matching input digests with reference digests. Secondary digest block segmentations are obtained from similar reference intervals for each of the data mismatches and applied to the input data.

Type: Grant

Filed: July 15, 2013

Date of Patent: March 12, 2019

Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: Shay H. Akirav, Lior Aronovich, Michael Hirsch, Yair Toaff
CONCURRENT SEGMENTATION USING VECTOR PROCESSING

Publication number: 20180365284

Abstract: A system for segmenting an input data stream, comprising a processor adapted to split an input data stream to a plurality of data sub-streams such that each of the plurality of data sub-streams has an overlapping portion with a consecutive data sub-stream of the plurality of data sub-streams, create concurrently a plurality of segmented data sub-streams by concurrently segmenting the plurality of data sub-streams each in one of a plurality of processing pipelines of the processor and join the plurality of segmented data sub-streams to create a segmented data stream by synchronizing a sequencing of each of the plurality of segmented data sub-streams according to one or more overlapping segments in the overlapping portion of each two consecutive data sub-streams of the plurality of data sub-streams.

Type: Application

Filed: August 27, 2018

Publication date: December 20, 2018

Inventors: Michael HIRSCH, Yair TOAFF, Yehonatan DAVID
CREATION OF SYNTHETIC BACKUPS WITHIN DEDUPLICATION STORAGE SYSTEM BY A BACKUP APPLICATION

Publication number: 20180095986

Abstract: Input backup data is deduplicated with data of a synthetic backup previously constructed by a deduplication storage system. A synthetic backup is constructed by processing metadata instructions provided by a backup application. Deduplication digests are calculated based on the data of the synthetic backup and the deduplication digests are stored in a digests index. When new backup data is processed, deduplication digests of the new data are calculated and searched in the digests index. Matching digests of previously constructed synthetic backups are located in the digests index. Each of the located matching digest references stored data are included in the synthetic backup, and the stored data is similar to the input backup data. Data matches are found in the input backup data and data in the synthetic backup.

Type: Application

Filed: November 17, 2017

Publication date: April 5, 2018

Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: Lior ARONOVICH, Michael HIRSCH, Yair TOAFF
DEDUPLICATING INPUT BACKUP DATA WITH DATA OF A SYNTHETIC BACKUP PREVIOUSLY CONSTRUCTED BY A DEDUPLICATION STORAGE SYSTEM

Publication number: 20180081898

Abstract: Input backup data is deduplicated with data of a synthetic backup previously constructed by a deduplication storage system. A synthetic backup is constructed by processing metadata instructions provided by a backup application. Deduplication digests are calculated based on the data of the synthetic backup and the deduplication digests are stored in a digests index. When new backup data is processed, deduplication digests of the new data are calculated and searched in the digests index. Matching digests of previously constructed synthetic backups are located in the digests index. Each of the located matching digest references stored data are included in the synthetic backup, and the stored data is similar to the input backup data. Data matches are found in the input backup data and data in the synthetic backup.

Type: Application

Filed: November 29, 2017

Publication date: March 22, 2018

Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: Lior ARONOVICH, Michael HIRSCH, Yair TOAFF
Deduplicating input backup data with data of a synthetic backup previously constructed by a deduplication storage system

Patent number: 9858286

Abstract: Input backup data is deduplicated with data of a synthetic backup previously constructed by a deduplication storage system. A synthetic backup is constructed by processing metadata instructions provided by a backup application. Deduplication digests are calculated based on the data of the synthetic backup and the deduplication digests are stored in a digests index. When new backup data is processed, deduplication digests of the new data are calculated and searched in the digests index. Matching digests of previously constructed synthetic backups are located in the digests index. Each of the located matching digest references stored data are included in the synthetic backup, and the stored data is similar to the input backup data. Data matches are found in the input backup data and data in the synthetic backup.

Type: Grant

Filed: March 13, 2013

Date of Patent: January 2, 2018

Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: Lior Aronovich, Michael Hirsch, Yair Toaff
Creation of synthetic backups within deduplication storage system by a backup application

Patent number: 9852145

Abstract: A deduplication storage system and a backup application create a synthetic backup. Metadata instructions are provided to the deduplication storage system. Each of the metadata instructions specifies the data segment of an originating backup and a designated location of the data segment in the synthetic backup. A set of metadata instructions is transformed into a transformed set of metadata instructions.

Type: Grant

Filed: March 13, 2013

Date of Patent: December 26, 2017

Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: Lior Aronovich, Michael Hirsch, Yair Toaff
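
The metadata instructions mentioned in the abstract above can be pictured with a small sketch. The tuple layout `(backup name, source offset, length, destination offset)` and the function `build_synthetic` are hypothetical, and the copy is done on raw bytes only for clarity; a deduplication storage system would more plausibly work on references than on copied data.

```python
def build_synthetic(backups: dict, instructions: list, size: int) -> bytes:
    """Assemble a synthetic backup from segments of originating backups."""
    synthetic = bytearray(size)
    for name, src, length, dst in instructions:
        chunk = backups[name][src:src + length]
        # Reject instructions that point outside the source or the target.
        if len(chunk) != length or dst < 0 or dst + length > size:
            raise ValueError("instruction out of range")
        # Place the data segment at its designated location.
        synthetic[dst:dst + length] = chunk
    return bytes(synthetic)
```

For example, a full backup and a later incremental backup can be combined into a new full image by listing which ranges come from which source.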
VECTOR PROCESSING FOR SEGMENTATION HASH VALUES CALCULATION

Publication number: 20170344559

Abstract: A system for segmenting an input data stream using vector processing, comprising a processor adapted to repeat the following steps throughout an input data stream to create a segmented data stream consisting of a plurality of segments: apply a rolling sequence over a sequence of consecutive data items of an input data stream, the rolling sequence includes a subset of consecutive data items of the sequence, calculate concurrently a plurality of partial hash values each by one of a plurality of processing pipelines of the processor, each for a respective one of a plurality of partial rolling sequences each including evenly spaced data items of the subset, determine compliance of each of the plurality of partial hash values with one or more respective partial segmentation criteria and designate the sequence as a variable size segment when at least some of the partial hash values comply with the respective partial segmentation criteria.

Type: Application

Filed: August 2, 2017

Publication date: November 30, 2017

Applicant: HUAWEI TECHNOLOGIES CO., LTD.

Inventors: Yehonatan DAVID, Yair TOAFF, Michael HIRSCH
Scalable deduplication system with small blocks

Patent number: 9747055

Abstract: Exemplary method, system, and computer program product embodiments for scalable data deduplication working with small data chunks in a computing environment are provided. In one embodiment, by way of example only, for each small data chunk, a signature is generated based on a combination of a representation of characters used in selecting data to be deduplicated. A c-spectrum of the small data chunk is a sequence of representations of different characters ordered by a frequency of occurrence in the small data chunk, and an f-spectrum of the small data chunk is a corresponding sequence of frequencies of the different characters in the small data chunk.

Type: Grant

Filed: June 8, 2015

Date of Patent: August 29, 2017

Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: Lior Aronovich, Ron Asher, Michael Hirsch, Shmuel T. Klein, Ehud Meiri, Yair Toaff
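
The c-spectrum and f-spectrum defined in the abstract above can be computed in a few lines. This sketch treats each byte as a character, orders by descending frequency, and breaks ties by byte value; the ordering direction and tie rule are assumptions, and how the spectra are combined into a signature is not shown.

```python
from collections import Counter

def spectra(chunk: bytes):
    """Return (c_spectrum, f_spectrum) for a small data chunk."""
    counts = Counter(chunk)
    # Most frequent characters first; ties broken by byte value (assumed).
    ordered = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    c_spectrum = [char for char, _ in ordered]
    f_spectrum = [freq for _, freq in ordered]
    return c_spectrum, f_spectrum
```

For the chunk `b"aabbbc"`, the call `spectra(b"aabbbc")` returns `([98, 97, 99], [3, 2, 1])`: the byte for `b` occurs three times, `a` twice, and `c` once.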
