Patents by Inventor Sean Quinlan

Sean Quinlan has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

Distributing data on distributed storage systems

Patent number: 10678647

Abstract: A method of distributing data in a distributed storage system includes receiving a file, dividing the received file into chunks, and determining a distribution of the chunks among storage devices of the distributed storage system based on a maintenance hierarchy of the distributed storage system. The maintenance hierarchy includes maintenance levels, and each maintenance level includes one or more maintenance units. Each maintenance unit has an active state and an inactive state. Moreover, each storage device is associated with a maintenance unit. The determining of the distribution of the chunks includes identifying a random selection of the storage devices matching a number of chunks of the file and being capable of maintaining accessibility of the file when one or more maintenance units are in an inactive state. The method also includes distributing the chunks to storage devices of the distributed storage system according to the determined distribution.

Type: Grant

Filed: April 24, 2019

Date of Patent: June 9, 2020

Assignee: Google LLC

Inventors: Robert Cypher, Sean Quinlan, Steven Robert Schirripa
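
Illustrative note: the abstract above describes a random selection of storage devices that keeps a file accessible while maintenance units are inactive. The following is a minimal sketch of that idea, not the patented method. It assumes a flat mapping from devices to maintenance units and a simple rule that no unit may hold more than a fixed number of a file's chunks. The names `pick_devices`, `device_unit`, and `max_per_unit` are hypothetical.

```python
import random
from collections import Counter

def pick_devices(device_unit, num_chunks, max_per_unit, rng=random, attempts=1000):
    """Randomly pick num_chunks distinct devices, retrying until no single
    maintenance unit holds more than max_per_unit of them (assumed rule)."""
    devices = sorted(device_unit)
    if num_chunks > len(devices):
        raise ValueError("fewer devices than chunks")
    for _ in range(attempts):
        choice = rng.sample(devices, num_chunks)
        per_unit = Counter(device_unit[d] for d in choice)
        if all(n <= max_per_unit for n in per_unit.values()):
            return choice
    raise RuntimeError("no placement satisfied the constraint")

# Hypothetical cluster: six devices spread over three racks.
device_unit = {"d0": "rackA", "d1": "rackA", "d2": "rackB",
               "d3": "rackB", "d4": "rackC", "d5": "rackC"}
placement = pick_devices(device_unit, 3, 1, rng=random.Random(0))
```

With at most one chunk per rack, taking any single rack offline makes at most one chunk of the file unavailable.
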
Distributing Data on Distributed Storage Systems

Publication number: 20190250992

Abstract: A method of distributing data in a distributed storage system includes receiving a file, dividing the received file into chunks, and determining a distribution of the chunks among storage devices of the distributed storage system based on a maintenance hierarchy of the distributed storage system. The maintenance hierarchy includes maintenance levels, and each maintenance level includes one or more maintenance units. Each maintenance unit has an active state and an inactive state. Moreover, each storage device is associated with a maintenance unit. The determining of the distribution of the chunks includes identifying a random selection of the storage devices matching a number of chunks of the file and being capable of maintaining accessibility of the file when one or more maintenance units are in an inactive state. The method also includes distributing the chunks to storage devices of the distributed storage system according to the determined distribution.

Type: Application

Filed: April 24, 2019

Publication date: August 15, 2019

Applicant: Google LLC

Inventors: Robert Cypher, Sean Quinlan, Steven Robert Schirripa
Distributing data on distributed storage systems

Patent number: 10318384

Abstract: A method of distributing data in a distributed storage system includes receiving a file, dividing the received file into chunks, and determining a distribution of the chunks among storage devices of the distributed storage system based on a maintenance hierarchy of the distributed storage system. The maintenance hierarchy includes maintenance levels, and each maintenance level includes one or more maintenance units. Each maintenance unit has an active state and an inactive state. Moreover, each storage device is associated with a maintenance unit. The determining of the distribution of the chunks includes identifying a random selection of the storage devices matching a number of chunks of the file and being capable of maintaining accessibility of the file when one or more maintenance units are in an inactive state. The method also includes distributing the chunks to storage devices of the distributed storage system according to the determined distribution.

Type: Grant

Filed: June 13, 2016

Date of Patent: June 11, 2019

Assignee: Google LLC

Inventors: Robert Cypher, Sean Quinlan, Steven Robert Schirripa
Quota-based resource scheduling

Patent number: 10257111

Abstract: The present disclosure relates to dynamically scheduling resource requests in a distributed system based on usage quotas. One example method includes identifying usage information for a distributed system including atoms, each atom representing a distinct item used by users of the distributed system; determining that a usage quota associated with the distributed system has been exceeded based on the usage information, the usage quota representing an upper limit for a particular type of usage of the distributed system; receiving a first request for a particular atom requiring invocation of the particular type of usage represented by the usage quota; determining that a second request for a different type of usage of the particular atom is waiting to be processed; and processing the second request for the particular atom before processing the first request.

Type: Grant

Filed: August 29, 2017

Date of Patent: April 9, 2019

Assignee: Google LLC

Inventors: Lawrence E. Greenfield, Sean Quinlan, Priyanka Gupta
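
Illustrative note: the abstract above describes processing a waiting request of a different usage type ahead of a request whose usage quota has been exceeded. The following is a minimal sketch of that ordering, assuming requests are dictionaries with a `usage` field held in one queue per atom. The function name `next_request` and the fallback of serving the queue head when every request is over quota are assumptions, not taken from the patent.

```python
from collections import deque

def next_request(queue, over_quota):
    """Return the first queued request whose usage type is not over quota.
    Assumed fallback: if all are over quota, serve the head of the queue."""
    for i, req in enumerate(queue):
        if req["usage"] not in over_quota:
            del queue[i]  # safe: we return before iterating further
            return req
    return queue.popleft() if queue else None

# Hypothetical queue for one atom; the write quota has been exceeded.
queue = deque([{"id": 1, "usage": "write"}, {"id": 2, "usage": "read"}])
first = next_request(queue, {"write"})
second = next_request(queue, {"write"})
third = next_request(queue, {"write"})
```
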
Ensuring globally consistent transactions

Patent number: 10042881

Abstract: The present technology proposes techniques for ensuring globally consistent transactions. This technology may allow distributed systems to ensure the causal order of read and write transactions across different partitions of a distributed database. By assigning causally generated timestamps to the transactions based on one or more globally coherent time services, the timestamps can be used to preserve and represent the causal order of the transactions in the distributed system. In this regard, certain transactions may wait for a period of time after choosing a timestamp in order to delay the start of any second transaction that might depend on it. The wait may ensure that the effects of the first transaction are not made visible until its timestamp is guaranteed to be in the past. This may ensure that a consistent snapshot of the distributed database can be determined for any past timestamp.

Type: Grant

Filed: November 22, 2016

Date of Patent: August 7, 2018

Assignee: Google LLC

Inventors: Wilson Cheng-Yi Hsieh, Alexander Lloyd, Peter Hochschild, Michael James Boyer Epstein, Sean Quinlan
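
Illustrative note: the abstract above describes waiting after a timestamp is chosen so that a transaction's effects become visible only once that timestamp is guaranteed to be in the past. The following is a minimal sketch under stated assumptions: a hypothetical `TimeService` reports an (earliest, latest) interval around the true time with a fixed uncertainty, and `commit` picks the latest bound as the timestamp. None of these names come from the patent.

```python
import time

class TimeService:
    """Hypothetical clock that reports bounds on the true time."""
    def __init__(self, uncertainty, clock=time.monotonic):
        self.uncertainty = uncertainty
        self.clock = clock

    def now(self):
        t = self.clock()
        return t - self.uncertainty, t + self.uncertainty  # (earliest, latest)

def commit(service, apply_write, sleep=time.sleep):
    """Choose a timestamp, wait until it is surely past, then apply the write."""
    _, timestamp = service.now()  # latest bound: not earlier than true time
    while True:
        earliest, _ = service.now()
        if earliest > timestamp:  # timestamp is now guaranteed to be in the past
            break
        sleep(max(timestamp - earliest, 1e-3))
    apply_write(timestamp)  # effects become visible only after the wait
    return timestamp
```

In this sketch the wait lasts roughly twice the clock uncertainty, so a smaller uncertainty means a shorter delay before the write becomes visible.
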
System and Method For Analyzing Data Records

Publication number: 20180052890

Abstract: Systems and methods for analyzing input data records are provided in which a master process initiates a plurality of concurrent first processes each of which comprises, for each data record in at least a subset of a plurality of input data records, creating a parsed representation of the data record and independently applying a procedural language query to the parsed representation to extract one or more values. A respective emit operator is applied to at least one of the extracted one or more values thereby adding corresponding information to a respective intermediate data structure. The respective emit operator implements one of a predefined set of statistical information processing functions. The master process also initiates a plurality of second processes each of which aggregates information from a corresponding subset of intermediate data structures to produce aggregated data that is, in turn, combined to produce output data.

Type: Application

Filed: October 31, 2017

Publication date: February 22, 2018

Inventors: Robert C. Pike, Sean Quinlan, Sean M. Dorward, Jeffrey Dean, Sanjay Ghemawat
System and method for analyzing data records

Patent number: 9830357

Abstract: A method processes data records. The method partitions the data records into groups and assigns each group to a respective process of a first plurality of processes, which execute in parallel. For each group, the assigned process extracts information from the data records, applies a script with information processing commands applied sequentially to produce intermediate values, stores the intermediate values in a respective intermediate data structure, and updates the status of the group to indicate completion. When the predefined threshold percentage of the data records are completed, the process assigns each group to a respective second process as a backup. When each of the groups has been completed by at least one process (either the original or the backup), the method executes a second plurality of processes to aggregate intermediate values from the intermediate data structures to produce output data. The aggregation includes intermediate values only once for each group.

Type: Grant

Filed: August 2, 2016

Date of Patent: November 28, 2017

Assignee: Google Inc.

Inventors: Robert C. Pike, Sean Quinlan, Sean M. Dorward, Jeffrey Dean, Sanjay Ghemawat
Prioritizing data reconstruction in distributed storage systems

Patent number: 9823980

Abstract: A method of prioritizing data for recovery in a distributed storage system includes, for each stripe of a file having chunks, determining whether the stripe comprises high-availability chunks or low-availability chunks and determining an effective redundancy value for each stripe. The effective redundancy value is based on the chunks and any system domains associated with the corresponding stripe. The distributed storage system has a system hierarchy including system domains. Chunks of a stripe associated with a system domain in an active state are accessible, whereas chunks of a stripe associated with a system domain in an inactive state are inaccessible. The method also includes reconstructing substantially immediately inaccessible, high-availability chunks having an effective redundancy value less than a threshold effective redundancy value and reconstructing the inaccessible low-availability and other inaccessible high-availability chunks, after a threshold period of time.

Type: Grant

Filed: November 22, 2016

Date of Patent: November 21, 2017

Assignee: Google Inc.

Inventors: Steven Robert Schirripa, Christian Eric Schrock, Robert Cypher, Sean Quinlan
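
Illustrative note: the abstract above describes rebuilding inaccessible high-availability chunks right away when their effective redundancy falls below a threshold, and deferring the rest. The following is a minimal sketch of that split. The abstract does not spell out how effective redundancy is computed, so this sketch assumes it is the number of accessible chunks minus the number needed to read the stripe. The function `plan_reconstruction` and all field names are hypothetical.

```python
def plan_reconstruction(stripes, threshold):
    """Split stripes with inaccessible chunks into immediate and delayed work."""
    immediate, delayed = [], []
    for s in stripes:
        if s["inaccessible"] == 0:
            continue  # nothing to rebuild
        # Assumed definition: spare accessible chunks beyond the minimum needed.
        redundancy = s["accessible"] - s["needed"]
        if s["high_availability"] and redundancy < threshold:
            immediate.append(s["name"])
        else:
            delayed.append(s["name"])  # rebuilt after a threshold period of time
    return immediate, delayed

stripes = [
    {"name": "s1", "high_availability": True, "accessible": 6, "needed": 6, "inaccessible": 3},
    {"name": "s2", "high_availability": True, "accessible": 8, "needed": 6, "inaccessible": 1},
    {"name": "s3", "high_availability": False, "accessible": 6, "needed": 6, "inaccessible": 3},
    {"name": "s4", "high_availability": True, "accessible": 9, "needed": 6, "inaccessible": 0},
]
immediate, delayed = plan_reconstruction(stripes, threshold=1)
```
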
Quota-based resource scheduling

Patent number: 9781054

Abstract: The present disclosure relates to dynamically scheduling resource requests in a distributed system based on usage quotas. One example method includes identifying usage information for a distributed system including atoms, each atom representing a distinct item used by users of the distributed system; determining that a usage quota associated with the distributed system has been exceeded based on the usage information, the usage quota representing an upper limit for a particular type of usage of the distributed system; receiving a first request for a particular atom requiring invocation of the particular type of usage represented by the usage quota; determining that a second request for a different type of usage of the particular atom is waiting to be processed; and processing the second request for the particular atom before processing the first request.

Type: Grant

Filed: July 27, 2015

Date of Patent: October 3, 2017

Assignee: Google Inc.

Inventors: Lawrence E. Greenfield, Sean Quinlan, Priyanka Gupta
Storing and moving data in a distributed storage system

Patent number: 9774676

Abstract: A system, computer-readable storage medium storing at least one program, and a computer-implemented method for identifying a storage group in a distributed storage system into which data is to be stored is presented. A data structure including information relating to storage groups in a distributed storage system is maintained, where a respective entry in the data structure for a respective storage group includes placement metrics for the respective storage group. A request to identify a storage group into which data is to be stored is received from a computer system. The data structure is used to determine an identifier for a storage group whose placement metrics satisfy a selection criterion. The identifier for the storage group whose placement metrics satisfy the selection criterion is returned to the computer system.

Type: Grant

Filed: May 21, 2013

Date of Patent: September 26, 2017

Assignee: Google Inc.

Inventors: Jeffrey Adgate Dean, Sanjay Ghemawat, Yasushi Saito, Andrew Fikes, Christopher Jorgen Taylor, Sean Quinlan, Michal Piotr Szymaniak, Sebastian Kanthak, Wilson Cheng-Yi Hsieh, Alexander Lloyd, Michael James Boyer Epstein
Efficient data reads from distributed storage systems

Patent number: 9747155

Abstract: A method of distributing data in a distributed storage system includes receiving a file and dividing the received file into chunks. The chunks are data-chunks and non-data chunks. The method further includes grouping chunks into a group and determining a distribution of the chunks of the group among storage devices of the distributed storage system based on a maintenance hierarchy of the distributed storage system. The maintenance hierarchy includes hierarchical maintenance levels and maintenance domains. Each maintenance domain has an active state or an inactive state; and each storage device is associated with at least one maintenance domain. The method also includes distributing the chunks of the group to the storage devices based on the determined distribution. The chunks of the group are distributed across multiple maintenance domains to maintain an ability to reconstruct chunks of the group when a maintenance domain is in the inactive state.

Type: Grant

Filed: November 3, 2016

Date of Patent: August 29, 2017

Assignee: Google Inc.

Inventors: Robert Cypher, Sean Quinlan, Steven Robert Schirripa, Lidor Carmi, Christian Eric Schrock
Prioritizing Data Reconstruction in Distributed Storage Systems

Publication number: 20170075741

Abstract: A method of prioritizing data for recovery in a distributed storage system includes, for each stripe of a file having chunks, determining whether the stripe comprises high-availability chunks or low-availability chunks and determining an effective redundancy value for each stripe. The effective redundancy value is based on the chunks and any system domains associated with the corresponding stripe. The distributed storage system has a system hierarchy including system domains. Chunks of a stripe associated with a system domain in an active state are accessible, whereas chunks of a stripe associated with a system domain in an inactive state are inaccessible. The method also includes reconstructing substantially immediately inaccessible, high-availability chunks having an effective redundancy value less than a threshold effective redundancy value and reconstructing the inaccessible low-availability and other inaccessible high-availability chunks, after a threshold period of time.

Type: Application

Filed: November 22, 2016

Publication date: March 16, 2017

Applicant: Google Inc.

Inventors: Steven Robert Schirripa, Christian Eric Schrock, Robert Cypher, Sean Quinlan
Efficient Data Reads From Distributed Storage Systems

Publication number: 20170075753

Abstract: A method of distributing data in a distributed storage system includes receiving a file and dividing the received file into chunks. The chunks are data-chunks and non-data chunks. The method further includes grouping chunks into a group and determining a distribution of the chunks of the group among storage devices of the distributed storage system based on a maintenance hierarchy of the distributed storage system. The maintenance hierarchy includes hierarchical maintenance levels and maintenance domains. Each maintenance domain has an active state or an inactive state; and each storage device is associated with at least one maintenance domain. The method also includes distributing the chunks of the group to the storage devices based on the determined distribution. The chunks of the group are distributed across multiple maintenance domains to maintain an ability to reconstruct chunks of the group when a maintenance domain is in the inactive state.

Type: Application

Filed: November 3, 2016

Publication date: March 16, 2017

Applicant: Google Inc.

Inventors: Robert Cypher, Sean Quinlan, Steven Robert Schirripa, Lidor Carmi, Christian Eric Schrock
Ensuring globally consistent transactions

Patent number: 9569253

Abstract: The present technology proposes techniques for ensuring globally consistent transactions. This technology may allow distributed systems to ensure the causal order of read and write transactions across different partitions of a distributed database. By assigning causally generated timestamps to the transactions based on one or more globally coherent time services, the timestamps can be used to preserve and represent the causal order of the transactions in the distributed system. In this regard, certain transactions may wait for a period of time after choosing a timestamp in order to delay the start of any second transaction that might depend on it. The wait may ensure that the effects of the first transaction are not made visible until its timestamp is guaranteed to be in the past. This may ensure that a consistent snapshot of the distributed database can be determined for any past timestamp.

Type: Grant

Filed: May 30, 2013

Date of Patent: February 14, 2017

Assignee: Google Inc.

Inventors: Wilson Cheng-Yi Hsieh, Alexander Lloyd, Peter Hochschild, Michael James Boyer Epstein, Sean Quinlan
Prioritizing data reconstruction in distributed storage systems

Patent number: 9535790

Abstract: A method of prioritizing data for recovery in a distributed storage system includes, for each stripe of a file having chunks, determining whether the stripe comprises high-availability chunks or low-availability chunks and determining an effective redundancy value for each stripe. The effective redundancy value is based on the chunks and any system domains associated with the corresponding stripe. The distributed storage system has a system hierarchy including system domains. Chunks of a stripe associated with a system domain in an active state are accessible, whereas chunks of a stripe associated with a system domain in an inactive state are inaccessible. The method also includes reconstructing substantially immediately inaccessible, high-availability chunks having an effective redundancy value less than a threshold effective redundancy value and reconstructing the inaccessible low-availability and other inaccessible high-availability chunks, after a threshold period of time.

Type: Grant

Filed: February 26, 2016

Date of Patent: January 3, 2017

Assignee: Google Inc.

Inventors: Steven Robert Schirripa, Christian Eric Schrock, Robert Cypher, Sean Quinlan
Efficient data reads from distributed storage systems

Patent number: 9514015

Abstract: A method of distributing data in a distributed storage system includes receiving a file into non-transitory memory and dividing the received file into chunks. The chunks are data-chunks and non-data chunks. The method also includes grouping one or more of the data chunks and one or more of the non-data chunks in a group. One or more chunks of the group is capable of being reconstructed from other chunks of the group. The method also includes distributing the chunks of the group to storage devices of the distributed storage system based on a hierarchy of the distributed storage system. The hierarchy includes maintenance domains having active and inactive states, each storage device associated with a maintenance domain, the chunks of a group are distributed across multiple maintenance domains to maintain the ability to reconstruct chunks of the group when a maintenance domain is in an inactive state.

Type: Grant

Filed: March 24, 2016

Date of Patent: December 6, 2016

Assignee: Google Inc.

Inventors: Robert Cypher, Sean Quinlan, Steven Robert Schirripa, Lidor Carmi, Christian Eric Schrock
System and Method For Analyzing Data Records

Publication number: 20160342657

Abstract: A method processes data records. The method partitions the data records into groups and assigns each group to a respective process of a first plurality of processes, which execute in parallel. For each group, the assigned process extracts information from the data records, applies a script with information processing commands applied sequentially to produce intermediate values, stores the intermediate values in a respective intermediate data structure, and updates the status of the group to indicate completion. When the predefined threshold percentage of the data records are completed, the process assigns each group to a respective second process as a backup. When each of the groups has been completed by at least one process (either the original or the backup), the method executes a second plurality of processes to aggregate intermediate values from the intermediate data structures to produce output data. The aggregation includes intermediate values only once for each group.

Type: Application

Filed: August 2, 2016

Publication date: November 24, 2016

Inventors: Robert C. Pike, Sean Quinlan, Sean M. Dorward, Jeffrey Dean, Sanjay Ghemawat
Distributing Data on Distributed Storage Systems

Publication number: 20160299815

Abstract: A method of distributing data in a distributed storage system includes receiving a file, dividing the received file into chunks, and determining a distribution of the chunks among storage devices of the distributed storage system based on a maintenance hierarchy of the distributed storage system. The maintenance hierarchy includes maintenance levels, and each maintenance level includes one or more maintenance units. Each maintenance unit has an active state and an inactive state. Moreover, each storage device is associated with a maintenance unit. The determining of the distribution of the chunks includes identifying a random selection of the storage devices matching a number of chunks of the file and being capable of maintaining accessibility of the file when one or more maintenance units are in an inactive state. The method also includes distributing the chunks to storage devices of the distributed storage system according to the determined distribution.

Type: Application

Filed: June 13, 2016

Publication date: October 13, 2016

Applicant: Google Inc.

Inventors: Robert Cypher, Sean Quinlan, Steven Robert Schirripa
System and method for analyzing data records

Patent number: 9405808

Abstract: A method and system for analyzing data records includes allocating groups of records to respective processes of a first plurality of processes executing in parallel. In each respective process of the first plurality of processes, for each record in the group of records allocated to the respective process, a query is applied to the record so as to produce zero or more values. Zero or more emit operators are applied to each of the zero or more produced values so as to add corresponding information to an intermediate data structure. Information from a plurality of the intermediate data structures is aggregated to produce output data.

Type: Grant

Filed: February 28, 2012

Date of Patent: August 2, 2016

Assignee: Google Inc.

Inventors: Robert C. Pike, Sean Quinlan, Sean M. Dorward, Jeffrey Dean, Sanjay Ghemawat
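
Illustrative note: the abstract above describes applying a query to each record of a group, passing the produced values to emit operators that fill an intermediate data structure, and then aggregating the intermediate structures. The following is a minimal sketch of that flow, run sequentially rather than as parallel processes. It assumes a single summing emit operator and dictionary tables. The names `emit_sum`, `analyze_group`, and `aggregate` are hypothetical.

```python
def emit_sum(table, key, value):
    """Hypothetical emit operator: add value to the running total for key."""
    table[key] = table.get(key, 0) + value

def analyze_group(records, query, emit):
    """Apply query to each record; emit each produced (key, value) pair."""
    table = {}
    for record in records:
        for key, value in query(record):  # a query may yield zero or more values
            emit(table, key, value)
    return table

def aggregate(tables, emit):
    """Merge the per-group intermediate tables into the output data."""
    output = {}
    for table in tables:
        for key, value in table.items():
            emit(output, key, value)
    return output

# Hypothetical word-count query over two groups of records.
groups = [["a b", "a"], ["b b"]]
word_query = lambda record: [(word, 1) for word in record.split()]
tables = [analyze_group(g, word_query, emit_sum) for g in groups]
result = aggregate(tables, emit_sum)
```

Reusing the same operator for both stages works here because summation is associative and commutative; other operators may need a separate merge step.
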
Efficient Data Reads From Distributed Storage Systems

Publication number: 20160203066

Abstract: A method of distributing data in a distributed storage system includes receiving a file into non-transitory memory and dividing the received file into chunks. The chunks are data-chunks and non-data chunks. The method also includes grouping one or more of the data chunks and one or more of the non-data chunks in a group. One or more chunks of the group is capable of being reconstructed from other chunks of the group. The method also includes distributing the chunks of the group to storage devices of the distributed storage system based on a hierarchy of the distributed storage system. The hierarchy includes maintenance domains having active and inactive states, each storage device associated with a maintenance domain, the chunks of a group are distributed across multiple maintenance domains to maintain the ability to reconstruct chunks of the group when a maintenance domain is in an inactive state.

Type: Application

Filed: March 24, 2016

Publication date: July 14, 2016

Applicant: Google Inc.

Inventors: Robert Cypher, Sean Quinlan, Steven Robert Schirripa, Lidor Carmi, Christian Eric Schrock
