Patents by Inventor Marian Dvorsky
Marian Dvorsky has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Patent number: 11966377
Abstract: A method includes receiving a request to perform a shuffle operation on a data stream; receiving at least a portion of the data stream including a plurality of records, each including a key; storing each of the plurality of records in a persistent storage location assigned to a key range corresponding to keys included in the plurality of records; receiving a request from a consumer for a subset of the plurality of records including a range of keys; and upon receiving the request from the consumer, providing the subset of the plurality of records including the range of keys from the one or more persistent storage locations.
Type: Grant
Filed: March 3, 2022
Date of Patent: April 23, 2024
Assignee: Google LLC
Inventors: Alexander Gourkov Balikov, Marian Dvorsky, Yonggang Zhao
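The flow described in this abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the class name `KeyRangeShuffle`, the JSON-lines file layout, and the use of string keys with sorted range boundaries are all assumptions made for illustration. Records are appended to one file per key range, and a consumer later asks for a half-open range of keys.

```python
import bisect
import json
import os
import tempfile


class KeyRangeShuffle:
    """Toy shuffle (hypothetical): each key range owns one append-only file."""

    def __init__(self, boundaries, directory=None):
        # Boundaries [b0, b1, ...] split keys into the ranges
        # (-inf, b0), [b0, b1), ..., [b_last, +inf).
        self.boundaries = sorted(boundaries)
        self.directory = directory or tempfile.mkdtemp()

    def _path(self, range_index):
        return os.path.join(self.directory, f"range-{range_index}.jsonl")

    def write(self, key, value):
        # Store the record in the file assigned to the key's range.
        index = bisect.bisect_right(self.boundaries, key)
        with open(self._path(index), "a", encoding="utf-8") as f:
            f.write(json.dumps([key, value]) + "\n")

    def read(self, start, end):
        """Return records with start <= key < end, sorted by key."""
        first = bisect.bisect_right(self.boundaries, start)
        last = bisect.bisect_left(self.boundaries, end)
        records = []
        # Only the files whose ranges overlap [start, end) are opened.
        for index in range(first, last + 1):
            path = self._path(index)
            if not os.path.exists(path):
                continue
            with open(path, encoding="utf-8") as f:
                for line in f:
                    key, value = json.loads(line)
                    if start <= key < end:
                        records.append((key, value))
        return sorted(records, key=lambda record: record[0])
```

Because records are stored by key range, a consumer request only touches the files whose ranges overlap the requested keys, and it can be served after the producers have finished writing.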
-
Publication number: 20220261392
Abstract: A method includes receiving a request to perform a shuffle operation on a data stream; receiving at least a portion of the data stream including a plurality of records, each including a key; storing each of the plurality of records in a persistent storage location assigned to a key range corresponding to keys included in the plurality of records; receiving a request from a consumer for a subset of the plurality of records including a range of keys; and upon receiving the request from the consumer, providing the subset of the plurality of records including the range of keys from the one or more persistent storage locations.
Type: Application
Filed: March 3, 2022
Publication date: August 18, 2022
Inventors: Alexander Gourkov Balikov, Marian Dvorsky, Yonggang Zhao
-
Patent number: 11269847
Abstract: A method includes receiving a request to perform a shuffle operation on a data stream; receiving at least a portion of the data stream including a plurality of records, each including a key; storing each of the plurality of records in a persistent storage location assigned to a key range corresponding to keys included in the plurality of records; receiving a request from a consumer for a subset of the plurality of records including a range of keys; and upon receiving the request from the consumer, providing the subset of the plurality of records including the range of keys from the one or more persistent storage locations.
Type: Grant
Filed: December 5, 2019
Date of Patent: March 8, 2022
Assignee: Google LLC
Inventors: Alexander Gourkov Balikov, Marian Dvorsky, Yonggang Zhao
-
Publication number: 20200110737
Abstract: A method includes receiving a request to perform a shuffle operation on a data stream; receiving at least a portion of the data stream including a plurality of records, each including a key; storing each of the plurality of records in a persistent storage location assigned to a key range corresponding to keys included in the plurality of records; receiving a request from a consumer for a subset of the plurality of records including a range of keys; and upon receiving the request from the consumer, providing the subset of the plurality of records including the range of keys from the one or more persistent storage locations.
Type: Application
Filed: December 5, 2019
Publication date: April 9, 2020
Inventors: Alexander Gourkov Balikov, Marian Dvorsky, Yonggang Zhao
-
Patent number: 10515065
Abstract: A method includes receiving a request to perform a shuffle operation on a data stream; receiving at least a portion of the data stream including a plurality of records, each including a key; storing each of the plurality of records in a persistent storage location assigned to a key range corresponding to keys included in the plurality of records; receiving a request from a consumer for a subset of the plurality of records including a range of keys; and upon receiving the request from the consumer, providing the subset of the plurality of records including the range of keys from the one or more persistent storage locations.
Type: Grant
Filed: March 5, 2018
Date of Patent: December 24, 2019
Assignee: Google LLC
Inventors: Alexander Gourkov Balikov, Marian Dvorsky, Yonggang Zhao
-
Publication number: 20180196840
Abstract: A method includes receiving a request to perform a shuffle operation on a data stream; receiving at least a portion of the data stream including a plurality of records, each including a key; storing each of the plurality of records in a persistent storage location assigned to a key range corresponding to keys included in the plurality of records; receiving a request from a consumer for a subset of the plurality of records including a range of keys; and upon receiving the request from the consumer, providing the subset of the plurality of records including the range of keys from the one or more persistent storage locations.
Type: Application
Filed: March 5, 2018
Publication date: July 12, 2018
Inventors: Alexander Gourkov Balikov, Marian Dvorsky, Yonggang Zhao
-
Patent number: 9934262
Abstract: A method includes receiving a request to perform a shuffle operation on a data stream, the request including a set of initial key ranges; generating a shuffler configuration that assigns a shuffler from a set of shufflers to each of the initial key ranges; initiating the set of shufflers to perform the shuffle operation on the data stream; analyzing metadata statistics to determine whether a shuffler configuration update event occurs, the metadata statistics produced by the set of shufflers during the shuffle operation and indicating load statistics for each shuffler in the set of shufflers; and upon occurrence of the shuffler configuration update event and during the shuffle operation, altering the shuffler configuration based at least in part on the metadata statistics to produce an assignment of shufflers to key ranges that is different from the assignment of shufflers to the initial key ranges.
Type: Grant
Filed: September 19, 2016
Date of Patent: April 3, 2018
Assignee: Google LLC
Inventors: Alexander Gourkov Balikov, Marian Dvorsky, Yonggang Zhao
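As a rough illustration of the reconfiguration step in this abstract, the sketch below detects a load imbalance from per-range record counts and produces a new assignment of shufflers to key ranges. Everything here is an assumption for illustration: the function names, the imbalance threshold, and the greedy "largest range to least-loaded shuffler" policy are not taken from the patent.

```python
def shuffler_loads(config, records_by_range, shufflers):
    """Sum the records handled per shuffler (the 'load statistics')."""
    loads = {shuffler: 0 for shuffler in shufflers}
    for key_range, shuffler in config.items():
        loads[shuffler] += records_by_range.get(key_range, 0)
    return loads


def update_event(loads, ratio=1.5):
    """Hypothetical trigger: the busiest shuffler exceeds ratio * mean load."""
    mean = sum(loads.values()) / len(loads)
    return mean > 0 and max(loads.values()) > ratio * mean


def rebalance(config, records_by_range, shufflers):
    """Greedy reassignment: heaviest ranges first, each to the least-loaded shuffler."""
    loads = {shuffler: 0 for shuffler in shufflers}
    new_config = {}
    by_load = sorted(
        config, key=lambda r: records_by_range.get(r, 0), reverse=True
    )
    for key_range in by_load:
        target = min(shufflers, key=lambda s: loads[s])
        new_config[key_range] = target
        loads[target] += records_by_range.get(key_range, 0)
    return new_config
```

In this sketch the key ranges themselves stay fixed and only their owners change; a fuller design could also split or merge ranges when a single range is hot.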
-
Patent number: 9928263
Abstract: A method includes receiving a request to perform a shuffle operation on a data stream; receiving at least a portion of the data stream including a plurality of records, each including a key; storing each of the plurality of records in a persistent storage location assigned to a key range corresponding to keys included in the plurality of records; receiving a request from a consumer for a subset of the plurality of records including a range of keys; and upon receiving the request from the consumer, providing the subset of the plurality of records including the range of keys from the one or more persistent storage locations.
Type: Grant
Filed: October 3, 2013
Date of Patent: March 27, 2018
Assignee: Google LLC
Inventors: Alexander Gourkov Balikov, Marian Dvorsky, Yonggang Zhao
-
Patent number: 9886325
Abstract: A large-scale data processing system and method including a plurality of processes, wherein a master process assigns input data blocks to respective map processes and partitions of intermediate data are assigned to respective reduce processes. In each of the plurality of map processes an application-independent map program retrieves a sequence of input data blocks assigned thereto by the master process and applies an application-specific map function to each input data block in the sequence to produce the intermediate data and stores the intermediate data in high speed memory of the interconnected processors. Each of the plurality of reduce processes receives a respective partition of the intermediate data from the high speed memory of the interconnected processors while the map processes continue to process input data blocks, and an application-specific reduce function is applied to the respective partition of the intermediate data to produce output values.
Type: Grant
Filed: July 18, 2016
Date of Patent: February 6, 2018
Assignee: GOOGLE LLC
Inventors: Grzegorz Malewicz, Marian Dvorsky, Christopher B. Colohan, Derek P. Thomson, Joshua Louis Levenberg
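The split between an application-independent driver and application-specific map and reduce functions can be sketched in a few lines of Python. This is a simplified, single-process illustration under stated assumptions: the names `run_mapreduce`, `word_count_map`, and `word_count_reduce` are hypothetical, intermediate data is held in ordinary in-memory dictionaries, and the phases run sequentially, so the overlap of map and reduce work described in the abstract is not modeled.

```python
import zlib
from collections import defaultdict


def run_mapreduce(blocks, map_fn, reduce_fn, num_partitions):
    """Application-independent driver: apply map_fn, partition, apply reduce_fn."""
    # One in-memory partition of intermediate data per reduce task.
    partitions = [defaultdict(list) for _ in range(num_partitions)]
    for block in blocks:
        for key, value in map_fn(block):
            # crc32 gives a partition choice that is stable across runs.
            index = zlib.crc32(key.encode("utf-8")) % num_partitions
            partitions[index][key].append(value)
    output = {}
    for partition in partitions:
        for key, values in partition.items():
            output[key] = reduce_fn(key, values)
    return output


def word_count_map(block):
    """Application-specific map function: emit (word, 1) for each word."""
    for word in block.split():
        yield word, 1


def word_count_reduce(key, values):
    """Application-specific reduce function: sum the counts for one word."""
    return sum(values)
```

Swapping in a different pair of map and reduce functions changes the computation without touching the driver, which is the separation the abstract describes.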
-
Patent number: 9798831
Abstract: A computer-implemented method for processing input data in a mapreduce framework includes: receiving, in the mapreduce framework, a data processing request for input data; initiating, based on the data processing request, a map operation on the input data by multiple mappers in the mapreduce framework, each of the mappers using an aggregator to partially aggregate the input data into one or more intermediate key/value pairs; initiating a reduce operation on the intermediate key/value pairs by multiple reducers in the mapreduce framework, wherein, without sorting the intermediate key/value pairs, those of the intermediate key/value pairs with a common key are handled by a same one of the reducers, each of the reducers using the aggregator to aggregate the intermediate key/value pairs into one or more output values; and providing the output values in response to the data processing request.
Type: Grant
Filed: April 1, 2011
Date of Patent: October 24, 2017
Assignee: Google Inc.
Inventors: Biswapesh Chattopadhyay, Liang Lin, Weiran Liu, Marián Dvorský
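One way to picture the sort-free aggregation in this abstract is with hash tables on both sides: each mapper pre-aggregates its split, and a hash of the key routes every partial result for that key to the same reducer. The Python sketch below is an illustration only. The function names are hypothetical, the aggregator is assumed to be associative and commutative (such as addition), and hash partitioning is an assumed routing scheme, not one confirmed by the abstract.

```python
import operator
import zlib


def map_with_aggregator(records, aggregate):
    """Mapper side: partially aggregate one input split, keyed by hash table."""
    partial = {}
    for key, value in records:
        partial[key] = aggregate(partial[key], value) if key in partial else value
    return partial


def reducer_for(key, num_reducers):
    """Route a key to a reducer; equal keys always land on the same reducer."""
    return zlib.crc32(key.encode("utf-8")) % num_reducers


def run(splits, aggregate, num_reducers):
    """Run mappers then reducers; no step sorts the intermediate pairs."""
    reducer_state = [dict() for _ in range(num_reducers)]
    for split in splits:
        for key, value in map_with_aggregator(split, aggregate).items():
            state = reducer_state[reducer_for(key, num_reducers)]
            state[key] = aggregate(state[key], value) if key in state else value
    output = {}
    for state in reducer_state:
        output.update(state)
    return output


# Usage: sum values per key across two input splits.
totals = run(
    [[("a", 1), ("b", 2), ("a", 3)], [("b", 4), ("c", 5)]],
    operator.add,
    num_reducers=3,
)
```

Because the same aggregator runs in the mappers and the reducers, the reducers receive at most one pair per key per mapper, and a hash lookup replaces the usual sort-and-group step.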
-
Publication number: 20170090993
Abstract: A large-scale data processing system and method including a plurality of processes, wherein a master process assigns input data blocks to respective map processes and partitions of intermediate data are assigned to respective reduce processes. In each of the plurality of map processes an application-independent map program retrieves a sequence of input data blocks assigned thereto by the master process and applies an application-specific map function to each input data block in the sequence to produce the intermediate data and stores the intermediate data in high speed memory of the interconnected processors. Each of the plurality of reduce processes receives a respective partition of the intermediate data from the high speed memory of the interconnected processors while the map processes continue to process input data blocks, and an application-specific reduce function is applied to the respective partition of the intermediate data to produce output values.
Type: Application
Filed: July 18, 2016
Publication date: March 30, 2017
Inventors: Grzegorz Malewicz, Marian Dvorsky, Christopher B. Colohan, Derek P. Thomson, Joshua Louis Levenberg
-
Publication number: 20170003936
Abstract: A method includes receiving a request to perform a shuffle operation on a data stream, the request including a set of initial key ranges; generating a shuffler configuration that assigns a shuffler from a set of shufflers to each of the initial key ranges; initiating the set of shufflers to perform the shuffle operation on the data stream; analyzing metadata statistics to determine whether a shuffler configuration update event occurs, the metadata statistics produced by the set of shufflers during the shuffle operation and indicating load statistics for each shuffler in the set of shufflers; and upon occurrence of the shuffler configuration update event and during the shuffle operation, altering the shuffler configuration based at least in part on the metadata statistics to produce an assignment of shufflers to key ranges that is different from the assignment of shufflers to the initial key ranges.
Type: Application
Filed: September 19, 2016
Publication date: January 5, 2017
Inventors: Alexander Gourkov Balikov, Marian Dvorsky, Yonggang Zhao
-
Patent number: 9483509
Abstract: A method includes receiving a request to perform a shuffle operation on a data stream, the request including a set of initial key ranges; generating a shuffler configuration that assigns a shuffler from a set of shufflers to each of the initial key ranges; initiating the set of shufflers to perform the shuffle operation on the data stream; analyzing metadata statistics to determine whether a shuffler configuration update event occurs, the metadata statistics produced by the set of shufflers during the shuffle operation and indicating load statistics for each shuffler in the set of shufflers; and upon occurrence of the shuffler configuration update event and during the shuffle operation, altering the shuffler configuration based at least in part on the metadata statistics to produce an assignment of shufflers to key ranges that is different from the assignment of shufflers to the initial key ranges.
Type: Grant
Filed: October 2, 2013
Date of Patent: November 1, 2016
Assignee: Google Inc.
Inventors: Alexander Gourkov Balikov, Marian Dvorsky, Yonggang Zhao
-
Patent number: 9396036
Abstract: A large-scale data processing system and method including a plurality of processes, wherein a master process assigns input data blocks to respective map processes and partitions of intermediate data are assigned to respective reduce processes. In each of the plurality of map processes an application-independent map program retrieves a sequence of input data blocks assigned thereto by the master process and applies an application-specific map function to each input data block in the sequence to produce the intermediate data and stores the intermediate data in high speed memory of the interconnected processors. Each of the plurality of reduce processes receives a respective partition of the intermediate data from the high speed memory of the interconnected processors while the map processes continue to process input data blocks, and an application-specific reduce function is applied to the respective partition of the intermediate data to produce output values.
Type: Grant
Filed: June 1, 2015
Date of Patent: July 19, 2016
Assignee: GOOGLE INC.
Inventors: Grzegorz Malewicz, Marian Dvorsky, Christopher B. Colohan, Derek P. Thomson, Joshua Louis Levenberg
-
Patent number: 9298760
Abstract: A method for shard assignment in a large-scale data processing job is provided. Datasets are divided into a plurality of shards and the shards are indexed and aggregated into one or more groups. A worker process is initially assigned an indexed shard from a group. The initial assignment can be assigned based on a simple algorithm. The worker's subsequent shard assignment is based on the index of the initially assigned shard.
Type: Grant
Filed: August 3, 2012
Date of Patent: March 29, 2016
Assignee: Google Inc.
Inventors: Xiaozhou Li, Yonggang Zhao, Marian Dvorsky, Ovidiu Gheorghioiu
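The abstract does not say which "simple algorithm" is used or how later assignments follow from the first one. Purely as an illustration, the Python sketch below assumes a modulo rule for the initial shard and a cyclic walk through the group for subsequent shards; both choices and the function names are hypothetical.

```python
def initial_shard(worker_id, group_size):
    """Assumed 'simple algorithm': pick the starting shard by modulo."""
    return worker_id % group_size


def assignment_order(worker_id, group_size):
    """Yield shard indexes for a worker, starting at its initial shard.

    Subsequent shards are derived from the index of the initial shard by
    walking the group cyclically (an assumed policy, not the patent's).
    """
    start = initial_shard(worker_id, group_size)
    for offset in range(group_size):
        yield (start + offset) % group_size
```

For example, with a group of four shards, worker 5 would start at shard 1 and then try shards 2, 3, and 0, skipping any that another worker has already completed.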
-
Publication number: 20150324237
Abstract: A large-scale data processing system and method including a plurality of processes, wherein a master process assigns input data blocks to respective map processes and partitions of intermediate data are assigned to respective reduce processes. In each of the plurality of map processes an application-independent map program retrieves a sequence of input data blocks assigned thereto by the master process and applies an application-specific map function to each input data block in the sequence to produce the intermediate data and stores the intermediate data in high speed memory of the interconnected processors. Each of the plurality of reduce processes receives a respective partition of the intermediate data from the high speed memory of the interconnected processors while the map processes continue to process input data blocks, and an application-specific reduce function is applied to the respective partition of the intermediate data to produce output values.
Type: Application
Filed: June 1, 2015
Publication date: November 12, 2015
Inventors: Grzegorz Malewicz, Marian Dvorsky, Christopher B. Colohan, Derek P. Thomson, Joshua Louis Levenberg
-
Patent number: 9047141
Abstract: A large-scale data processing system and method including a plurality of processes, wherein a master process assigns input data blocks to respective map processes and partitions of intermediate data are assigned to respective reduce processes. In each of the plurality of map processes an application-independent map program retrieves a sequence of input data blocks assigned thereto by the master process and applies an application-specific map function to each input data block in the sequence to produce the intermediate data and stores the intermediate data in high speed memory of the interconnected processors. Each of the plurality of reduce processes receives a respective partition of the intermediate data from the high speed memory of the interconnected processors while the map processes continue to process input data blocks, and an application-specific reduce function is applied to the respective partition of the intermediate data to produce output values.
Type: Grant
Filed: August 12, 2013
Date of Patent: June 2, 2015
Assignee: GOOGLE INC.
Inventors: Grzegorz Malewicz, Marian Dvorsky, Christopher B. Colohan, Derek P. Thomson, Joshua Louis Levenberg
-
Publication number: 20150100592
Abstract: A method includes receiving a request to perform a shuffle operation on a data stream; receiving at least a portion of the data stream including a plurality of records, each including a key; storing each of the plurality of records in a persistent storage location assigned to a key range corresponding to keys included in the plurality of records; receiving a request from a consumer for a subset of the plurality of records including a range of keys; and upon receiving the request from the consumer, providing the subset of the plurality of records including the range of keys from the one or more persistent storage locations.
Type: Application
Filed: October 3, 2013
Publication date: April 9, 2015
Applicant: Google Inc.
Inventors: Alexander Gourkov Balikov, Marian Dvorsky, Yonggang Zhao
-
Publication number: 20150095351
Abstract: A method includes receiving a request to perform a shuffle operation on a data stream, the request including a set of initial key ranges; generating a shuffler configuration that assigns a shuffler from a set of shufflers to each of the initial key ranges; initiating the set of shufflers to perform the shuffle operation on the data stream; analyzing metadata statistics to determine whether a shuffler configuration update event occurs, the metadata statistics produced by the set of shufflers during the shuffle operation and indicating load statistics for each shuffler in the set of shufflers; and upon occurrence of the shuffler configuration update event and during the shuffle operation, altering the shuffler configuration based at least in part on the metadata statistics to produce an assignment of shufflers to key ranges that is different from the assignment of shufflers to the initial key ranges.
Type: Application
Filed: October 2, 2013
Publication date: April 2, 2015
Applicant: Google Inc.
Inventors: Alexander Gourkov Balikov, Marian Dvorsky, Yonggang Zhao
-
Publication number: 20130332931
Abstract: A large-scale data processing system and method including a plurality of processes, wherein a master process assigns input data blocks to respective map processes and partitions of intermediate data are assigned to respective reduce processes. In each of the plurality of map processes an application-independent map program retrieves a sequence of input data blocks assigned thereto by the master process and applies an application-specific map function to each input data block in the sequence to produce the intermediate data and stores the intermediate data in high speed memory of the interconnected processors. Each of the plurality of reduce processes receives a respective partition of the intermediate data from the high speed memory of the interconnected processors while the map processes continue to process input data blocks, and an application-specific reduce function is applied to the respective partition of the intermediate data to produce output values.
Type: Application
Filed: August 12, 2013
Publication date: December 12, 2013
Applicant: Google Inc.
Inventors: Grzegorz Malewicz, Marian Dvorsky, Christopher B. Colohan, Derek P. Thomson, Joshua Louis Levenberg