Patents by Inventor George Steven McPherson

George Steven McPherson has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 10713272
    Abstract: Dynamic generation of data catalogs may be implemented for accessing data sets in different storage locations. Data sets may be accessed in order to extract portions of data. Structure recognition techniques may be applied to the extracted data in order to determine structural information for the data sets. The structural information may then be stored as part of a data catalog for the data sets. Requests to access the data catalog from different clients may be received and the requested structural data supplied so that the clients may access different data sets utilizing the supplied structural data. Data catalogs may be updated as changes to data sets are made.
    Type: Grant
    Filed: June 30, 2016
    Date of Patent: July 14, 2020
    Assignee: Amazon Technologies, Inc.
    Inventors: Andrew Edward Caldwell, Anurag Windlass Gupta, Mehul Shah, Prajakta Damle, George Steven McPherson
  • Publication number: 20200218701
    Abstract: Methods and apparatus for providing consistent data storage in distributed computing systems. A consistent distributed computing file system (consistent DCFS) may be backed by an object storage service that only guarantees eventual consistency, and may leverage a data storage service (e.g., a database service) to store and maintain a file system/directory structure (a consistent DCFS directory) for the consistent DCFS that may be accessed by compute nodes for file/directory information relevant to the data objects in the consistent DCFS, rather than relying on the information maintained by the object storage service. The compute nodes may reference the consistent DCFS directory to, for example, store and retrieve strongly consistent metadata referencing data objects in the consistent DCFS. The compute nodes may, for example, retrieve metadata from the consistent DCFS directory to determine whether the object storage service is presenting all of the data that it is supposed to have.
    Type: Application
    Filed: March 13, 2020
    Publication date: July 9, 2020
    Applicant: Amazon Technologies, Inc.
    Inventors: Bogdan Eduard Ghidireac, Peter Sirota, Robert Frederick Leidle, Mathew Alan Mills, George Steven McPherson, Xing Wu, Jonathan Andrew Fritz
  • Publication number: 20200159742
    Abstract: History for data objects may be maintained to detect data events. An indication of an Extract, Transform, Load (ETL) process applied to one or more source data objects to generate one or more transformed data objects may be received. History for the source data objects may be updated to include the transformed data objects and the ETL process that generated the transformed data objects. An evaluation of the update may be performed to determine whether an event associated with the data lineage is triggered. If the event is triggered, a notification of the event may be sent to one or more subscribers for the event.
    Type: Application
    Filed: January 24, 2020
    Publication date: May 21, 2020
    Applicant: Amazon Technologies, Inc.
    Inventors: George Steven McPherson, Mehul A. Shah, Prajakta Datta Damle, Gopinath Duddi, Anurag Windlass Gupta
  • Patent number: 10621210
    Abstract: Recognizing unknown data objects may be implemented for data objects stored in a data store. Data objects that are identified as unknown may be accessed to retrieve a portion of the data object. Different representations of the data object may be generated for recognizing different data schemas. An analysis of the representations may be performed to identify a data schema for the unknown data object. The data schema may be stored in a metadata store for the unknown data object.
    Type: Grant
    Filed: December 20, 2016
    Date of Patent: April 14, 2020
    Assignee: Amazon Technologies, Inc.
    Inventors: Mehul A. Shah, George Steven McPherson, Prajakta Datta Damle, Gopinath Duddi, Anurag Windlass Gupta
  • Patent number: 10599621
    Abstract: A system and method for improving the speed of generating a list of previously-uncounted items stored with a computing resource service provider. The system and method involve obtaining a set of keys from a data store, wherein each key of the set of keys corresponds to an item in a group of items, wherein a quantity of items in the group is uncounted. The system and method further include generating a first sub-listing of keys based at least in part on a first key range of the set of keys by executing a first thread, generating a second sub-listing of keys based at least in part on a second key range of the set of keys by executing a second thread, combining the first sub-listing of keys with the second sub-listing of keys to produce a list of keys, and providing the list of keys.
    Type: Grant
    Filed: February 2, 2015
    Date of Patent: March 24, 2020
    Assignee: Amazon Technologies, Inc.
    Inventors: Xing Wu, George Steven McPherson, Robert Frederick Leidle, Jonathan Andrew Fritz
  • Patent number: 10592475
    Abstract: Methods and apparatus for providing consistent data storage in distributed computing systems. A consistent distributed computing file system (consistent DCFS) may be backed by an object storage service that only guarantees eventual consistency, and may leverage a data storage service (e.g., a database service) to store and maintain a file system/directory structure (a consistent DCFS directory) for the consistent DCFS that may be accessed by compute nodes for file/directory information relevant to the data objects in the consistent DCFS, rather than relying on the information maintained by the object storage service. The compute nodes may reference the consistent DCFS directory to, for example, store and retrieve strongly consistent metadata referencing data objects in the consistent DCFS. The compute nodes may, for example, retrieve metadata from the consistent DCFS directory to determine whether the object storage service is presenting all of the data that it is supposed to have.
    Type: Grant
    Filed: March 21, 2014
    Date of Patent: March 17, 2020
    Assignee: Amazon Technologies, Inc.
    Inventors: Bogdan Eduard Ghidireac, Peter Sirota, Robert Frederick Leidle, Mathew Alan Mills, George Steven McPherson, Xing Wu, Jonathan Andrew Fritz
  • Patent number: 10545979
    Abstract: History for data objects may be maintained to detect data events. An indication of an Extract, Transform, Load (ETL) process applied to one or more source data objects to generate one or more transformed data objects may be received. History for the source data objects may be updated to include the transformed data objects and the ETL process that generated the transformed data objects. An evaluation of the update may be performed to determine whether an event associated with the data lineage is triggered. If the event is triggered, a notification of the event may be sent to one or more subscribers for the event.
    Type: Grant
    Filed: December 20, 2016
    Date of Patent: January 28, 2020
    Assignee: Amazon Technologies, Inc.
    Inventors: George Steven McPherson, Mehul A. Shah, Prajakta Datta Damle, Gopinath Duddi, Anurag Windlass Gupta
  • Patent number: 10374915
    Abstract: A distributed services platform may generate a plurality of log files containing metric values. Metric values may be provided to a first level of a topology of aggregation modules. The first level of aggregation modules may provide output to a second level of the topology. Subsequent levels of the topology may perform greater amounts of aggregation while providing stream-based access to the aggregated values. State information for the aggregation topology may be saved subsequent to each cycle of values through the topology.
    Type: Grant
    Filed: June 29, 2015
    Date of Patent: August 6, 2019
    Assignee: Amazon Technologies, Inc.
    Inventors: Xiqiang Zhi, George Steven McPherson
  • Patent number: 10338958
    Abstract: An indication of an input data stream comprising data records, stored at a stream management service, that are to be batched for a computation at a batch-oriented data processing service is received. A set of data records of the input data stream are identified, based on respective sequence numbers associated with the records, for a particular iteration of the computation. Metadata associated with the particular iteration, comprising identification information associated with the set of records on which the computation is performed during the particular iteration, is saved in a repository.
    Type: Grant
    Filed: January 27, 2014
    Date of Patent: July 2, 2019
    Assignee: Amazon Technologies, Inc.
    Inventors: Ankit Kamboj, Peter Sirota, George Steven McPherson, Vageesh Kumar, Sumit Kumar
  • Publication number: 20180173774
    Abstract: History for data objects may be maintained to detect data events. An indication of an Extract, Transform, Load (ETL) process applied to one or more source data objects to generate one or more transformed data objects may be received. History for the source data objects may be updated to include the transformed data objects and the ETL process that generated the transformed data objects. An evaluation of the update may be performed to determine whether an event associated with the data lineage is triggered. If the event is triggered, a notification of the event may be sent to one or more subscribers for the event.
    Type: Application
    Filed: December 20, 2016
    Publication date: June 21, 2018
    Applicant: Amazon Technologies, Inc.
    Inventors: George Steven McPherson, Mehul A. Shah, Prajakta Datta Damle, Gopinath Duddi, Anurag Windlass Gupta
  • Publication number: 20180150528
    Abstract: Data transformation workflows may be generated to transform data objects. A source data schema for a data object and a target data format or target data schema for a data object may be identified. A comparison of the source data schema and the target data format or schema may be made to determine what transformations can be performed to transform the data object into the target data format or schema. Code to execute the transformation operations may then be generated. The code may be stored for subsequent modification or execution.
    Type: Application
    Filed: December 20, 2016
    Publication date: May 31, 2018
    Applicant: Amazon Technologies, Inc.
    Inventors: Mehul A. Shah, George Steven McPherson, Prajakta Datta Damle, Gopinath Duddi, Anurag Windlass Gupta, Benjamin Albert Sowell, Bohou Li
  • Publication number: 20180150548
    Abstract: Recognizing unknown data objects may be implemented for data objects stored in a data store. Data objects that are identified as unknown may be accessed to retrieve a portion of the data object. Different representations of the data object may be generated for recognizing different data schemas. An analysis of the representations may be performed to identify a data schema for the unknown data object. The data schema may be stored in a metadata store for the unknown data object.
    Type: Application
    Filed: December 20, 2016
    Publication date: May 31, 2018
    Applicant: Amazon Technologies, Inc.
    Inventors: Mehul A. Shah, George Steven McPherson, Prajakta Datta Damle, Gopinath Duddi, Anurag Windlass Gupta
  • Publication number: 20180150529
    Abstract: Extract, Transform, Load (ETL) processing may be initiated by detected events. A trigger event may be associated with an ETL process that applies one or more transformations to a source data object. The trigger event may be detected for the ETL process and evaluated with respect to one or more execution conditions for the ETL process. If the execution conditions for the ETL process are satisfied, then the ETL process may be executed. At least some of the source data object may be obtained, the one or more transformations of the ETL process may be applied, and one or more transformed data objects may be stored.
    Type: Application
    Filed: December 20, 2016
    Publication date: May 31, 2018
    Applicant: Amazon Technologies, Inc.
    Inventors: George Steven McPherson, Mehul A. Shah, Prajakta Datta Damle, Gopinath Duddi, Anurag Windlass Gupta
  • Patent number: 9703594
    Abstract: A system adapted to process long-running processes is disclosed. A request to upload data is received at a server. The server divides the data into multiple parts and launches a separate process to upload each of the divided parts. The server records for each process the processing time or duration that the particular process used to upload its corresponding data item. The server maintains an average processing duration that is calculated from the processing durations of the completed processes. The server identifies that one process is continuing to run and compares a processing duration for the particular process to a threshold derived from the average processing duration. If the processing duration for the particular process exceeds the threshold, the server initiates a new process to upload the same data item. When one of either the new process or the still running process has completed processing, the server terminates the other process.
    Type: Grant
    Filed: March 2, 2015
    Date of Patent: July 11, 2017
    Assignee: Amazon Technologies, Inc.
    Inventors: Ankit Kamboj, Xing Wu, George Steven McPherson, Jian Fang, Dag Stockstad, Abhishek Rajnikant Sinha
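
The abstract of patent 9703594 describes re-launching an upload that runs much longer than the average of the uploads already completed, then stopping whichever attempt loses. The following is only a rough illustrative sketch of that general idea, not the patented implementation. It assumes threads in place of separate processes, a simulated upload (the hypothetical `upload_part`, which just waits), and a threshold of `factor` times the average duration; all names and parameters here are invented for illustration.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor


def upload_part(part, delay, cancel):
    """Hypothetical stand-in for a real upload: waits `delay` seconds unless cancelled."""
    # Event.wait returns True if the event was set (cancelled) before the timeout.
    if cancel.wait(timeout=delay):
        return None
    return part


def upload_all(parts, delays, retry_delay=0.01, factor=3.0, poll=0.01):
    """Upload every part; re-launch an attempt that exceeds factor * average duration."""
    results = {}
    durations = []      # durations of completed uploads, used for the running average
    speculated = set()  # indices of parts that received a second attempt
    with ThreadPoolExecutor(max_workers=2 * len(parts)) as pool:
        # attempts maps part index -> list of (future, cancel_event, start_time)
        attempts = {}
        for i, part in enumerate(parts):
            cancel = threading.Event()
            fut = pool.submit(upload_part, part, delays[i], cancel)
            attempts[i] = [(fut, cancel, time.monotonic())]
        while attempts:
            now = time.monotonic()
            for i in list(attempts):
                finished = [a for a in attempts[i]
                            if a[0].done() and a[0].result() is not None]
                if finished:
                    fut, _, start = finished[0]
                    results[i] = fut.result()
                    durations.append(now - start)
                    # Signal every attempt for this part to stop; the loser exits early.
                    for _, cancel, _ in attempts[i]:
                        cancel.set()
                    del attempts[i]
                    continue
                if durations and i not in speculated:
                    threshold = factor * (sum(durations) / len(durations))
                    if now - attempts[i][0][2] > threshold:
                        # The original attempt looks stuck: start a second one for the same part.
                        cancel = threading.Event()
                        fut = pool.submit(upload_part, parts[i], retry_delay, cancel)
                        attempts[i].append((fut, cancel, now))
                        speculated.add(i)
            time.sleep(poll)
    return [results[i] for i in range(len(parts))], speculated


if __name__ == "__main__":
    # The third part is simulated as a straggler (5 s versus 0.05 s for the others).
    uploaded, retried = upload_all(["a", "b", "c"], [0.05, 0.05, 5.0])
    print(uploaded, retried)
```

In this sketch the slow third part is the only one expected to get a second attempt, and the run finishes well before the 5-second straggler would have completed on its own. A real system would also need to handle upload failures and to terminate processes rather than signal threads.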