Patents by Inventor Christopher A. Olston

Christopher A. Olston has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 10789544
    Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for batching inputs to machine learning models. One of the methods includes receiving a stream of requests, each request identifying a respective input for processing by a first machine learning model; adding the respective input from each request to a first queue of inputs for processing by the first machine learning model; determining, at a first time, that a count of inputs in the first queue as of the first time equals or exceeds a maximum batch size and, in response: generating a first batched input from the inputs in the queue as of the first time so that a count of inputs in the first batched input equals the maximum batch size, and providing the first batched input for processing by the first machine learning model.
    Type: Grant
    Filed: April 5, 2016
    Date of Patent: September 29, 2020
    Inventors: Noah Fiedel, Christopher Olston, Jeremiah Harmsen
  • Patent number: 10417439
    Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for generating a catalog for multiple datasets, the method comprising accessing multiple extant data sets, the extant data sets including data sets that are independently generated and structurally dissimilar; organizing the data sets into collections, each data set in each collection belonging to the collection based on collection data associated with the data set; for each collection of data sets: determining, from a subset of the data sets that belong to the collection, metadata that describe the data sets that belong to the collection, wherein the metadata does not include the collection data, and attributing, to other data sets in the collection, the metadata determined from the subset of data sets; and generating, from the collections of data sets and the determined metadata, a catalog for the multiple datasets.
    Type: Grant
    Filed: April 6, 2017
    Date of Patent: September 17, 2019
    Assignee: Google LLC
    Inventors: Philip Korn, Steven Euijong Whang, Natalya Fridman Noy, Sudip Roy, Neoklis Polyzotis, Alon Yitzchak Halevy, Christopher Olston
  • Publication number: 20170293671
    Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for generating a catalog for multiple datasets, the method comprising accessing multiple extant data sets, the extant data sets including data sets that are independently generated and structurally dissimilar; organizing the data sets into collections, each data set in each collection belonging to the collection based on collection data associated with the data set; for each collection of data sets: determining, from a subset of the data sets that belong to the collection, metadata that describe the data sets that belong to the collection, wherein the metadata does not include the collection data, and attributing, to other data sets in the collection, the metadata determined from the subset of data sets; and generating, from the collections of data sets and the determined metadata, a catalog for the multiple datasets.
    Type: Application
    Filed: April 6, 2017
    Publication date: October 12, 2017
    Inventors: Philip Korn, Steven Euijong Whang, Natalya Fridman Noy, Sudip Roy, Neoklis Polyzotis, Alon Yitzchak Halevy, Christopher Olston
  • Publication number: 20170286864
    Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for batching inputs to machine learning models. One of the methods includes receiving a stream of requests, each request identifying a respective input for processing by a first machine learning model; adding the respective input from each request to a first queue of inputs for processing by the first machine learning model; determining, at a first time, that a count of inputs in the first queue as of the first time equals or exceeds a maximum batch size and, in response: generating a first batched input from the inputs in the queue as of the first time so that a count of inputs in the first batched input equals the maximum batch size, and providing the first batched input for processing by the first machine learning model.
    Type: Application
    Filed: April 5, 2016
    Publication date: October 5, 2017
    Inventors: Noah Fiedel, Christopher Olston, Jeremiah Harmsen
  • Patent number: 8965842
    Abstract: A method and system are given for providing a virtual environment spanning a desktop and a cloud. In one example, the method includes receiving a query template over a data set that resides in the cloud, optimizing the query template to segment the query template into an offline phase and an online phase, executing the offline phase on the cloud to build one or more indexes, and sending the one or more indexes to the desktop.
    Type: Grant
    Filed: July 15, 2014
    Date of Patent: February 24, 2015
    Assignee: Yahoo! Inc.
    Inventor: Christopher Olston
  • Patent number: 8949834
    Abstract: Disclosed are methods and apparatus for scheduling an asynchronous workflow having a plurality of processing paths. In one embodiment, one or more predefined constraint metrics that constrain temporal asynchrony for one or more portions of the workflow may be received or provided. Input data is periodically received or intermediate or output data is generated for one or more of the processing paths of the workflow, via one or more operators, based on a scheduler process. One or more of the processing paths for generating the intermediate or output data are dynamically selected based on received input data or generated intermediate or output data and the one or more constraint metrics. The selected one or more processing paths of the workflow are then executed so that each selected processing path generates intermediate or output data for the workflow.
    Type: Grant
    Filed: April 7, 2010
    Date of Patent: February 3, 2015
    Assignee: Yahoo! Inc.
    Inventor: Christopher A. Olston
  • Publication number: 20140324822
    Abstract: A method and system are given for providing a virtual environment spanning a desktop and a cloud. In one example, the method includes receiving a query template over a data set that resides in the cloud, optimizing the query template to segment the query template into an offline phase and an online phase, executing the offline phase on the cloud to build one or more indexes, and sending the one or more indexes to the desktop.
    Type: Application
    Filed: July 15, 2014
    Publication date: October 30, 2014
    Inventor: Christopher Olston
  • Patent number: 8838527
    Abstract: A method and system are given for providing a virtual environment spanning a desktop and a cloud. In one example, the method includes receiving a query template over a data set that resides in the cloud, optimizing the query template to segment the query template into an offline phase and an online phase, executing the offline phase on the cloud to build one or more indexes, and sending the one or more indexes to the desktop.
    Type: Grant
    Filed: November 6, 2008
    Date of Patent: September 16, 2014
    Assignee: Yahoo! Inc.
    Inventor: Christopher Olston
  • Patent number: 8745183
    Abstract: An improved system and method is provided for adaptively refreshing a web page. A base version of the web page may be partitioned into a collection of fragments. Then the collection of fragments may be compared with the corresponding fragments of a recent version of the web page to determine a divergence measurement of the difference between the base version and the recent version of the web page. The divergence measurement may be recorded in a change profile representing a change history of the web page that includes a sequence of numeric pairs indicating a time offset and a divergence measurement of the difference between a version of the web page at the time offset and a base version of the web page. The refresh period for the web page may be adjusted by applying an adaptive refresh policy using the divergence measurements recorded in the change profile.
    Type: Grant
    Filed: October 26, 2006
    Date of Patent: June 3, 2014
    Assignee: Yahoo! Inc.
    Inventor: Christopher Olston
  • Patent number: 8732160
    Abstract: A method and a system are provided for exploring a large textual data set via interactive aggregation. In one example, the method includes receiving the large textual data set and an original query template, building an index for the query template, wherein the building the index comprises ordering the index a particular way to optimize query time, receiving one or more bindings for the original query template, computing an answer to the original query template using the index and the one or more bindings, and anticipating one or more future queries that a user may submit and that are related to the original query template.
    Type: Grant
    Filed: November 11, 2008
    Date of Patent: May 20, 2014
    Assignee: Yahoo! Inc.
    Inventor: Christopher Olston
  • Patent number: 8312011
    Abstract: The present invention relates to methods, systems, and computer readable media comprising instructions for identifying needy queries for which additional responsive content is needed. The method of the present invention comprises receiving a query comprising one or more terms and retrieving one or more content items identified as responsive to the query, the one or more content items ranked according to one or more ranking techniques. A score is generated for the one or more ranked content items identified as responsive to the query. A determination is thereafter made as to whether the query is needy based upon a comparison of the one or more scores associated with the one or more content items identified as responsive to the query and a needy query score threshold.
    Type: Grant
    Filed: May 16, 2011
    Date of Patent: November 13, 2012
    Assignee: Yahoo! Inc.
    Inventors: Christopher Olston, Sandeep Pandey
  • Publication number: 20110252427
    Abstract: Disclosed are methods and apparatus for scheduling an asynchronous workflow having a plurality of processing paths. In one embodiment, one or more predefined constraint metrics that constrain temporal asynchrony for one or more portions of the workflow may be received or provided. Input data is periodically received or intermediate or output data is generated for one or more of the processing paths of the workflow, via one or more operators, based on a scheduler process. One or more of the processing paths for generating the intermediate or output data are dynamically selected based on received input data or generated intermediate or output data and the one or more constraint metrics. The selected one or more processing paths of the workflow are then executed so that each selected processing path generates intermediate or output data for the workflow.
    Type: Application
    Filed: April 7, 2010
    Publication date: October 13, 2011
    Applicant: Yahoo! Inc.
    Inventor: Christopher A. Olston
  • Publication number: 20110218991
    Abstract: The present invention relates to methods, systems, and computer readable media comprising instructions for identifying needy queries for which additional responsive content is needed. The method of the present invention comprises receiving a query comprising one or more terms and retrieving one or more content items identified as responsive to the query, the one or more content items ranked according to one or more ranking techniques. A score is generated for the one or more ranked content items identified as responsive to the query. A determination is thereafter made as to whether the query is needy based upon a comparison of the one or more scores associated with the one or more content items identified as responsive to the query and a needy query score threshold.
    Type: Application
    Filed: May 16, 2011
    Publication date: September 8, 2011
    Applicant: Yahoo! Inc.
    Inventors: Christopher Olston, Sandeep Pandey
  • Patent number: 7991769
    Abstract: An improved system and method is provided for searching a collection of objects that may be located in hierarchies of auxiliary information for retrieval of response objects. A framework to perform a generalization search in hierarchies may be used to generalize a search by moving up to a higher level in a hierarchy of taxonomies or to specialize a search by moving down to a lower level in the hierarchy of taxonomies. Once the system may decide to enumerate response objects at a particular level of generalization, a budgeted generalization search may be used for enumerating a set of response objects within a budgeted cost.
    Type: Grant
    Filed: July 7, 2006
    Date of Patent: August 2, 2011
    Assignee: Yahoo! Inc.
    Inventors: Marcus Felipe Fontoura, Vanja Josifovski, Christopher Olston, Shanmugasundaram Ravikumar, Andrew Tomkins
  • Patent number: 7970760
    Abstract: Methods, systems, and computer readable media comprising instructions for identifying needy queries for which additional responsive content is needed. A method comprises receiving a query comprising one or more terms and retrieving one or more content items identified as responsive to the query, the one or more content items ranked according to one or more ranking techniques. A score is generated for the one or more ranked content items identified as responsive to the query. A determination is thereafter made as to whether the query is needy based upon a comparison of the one or more scores associated with the one or more content items identified as responsive to the query and a needy query score threshold.
    Type: Grant
    Filed: March 11, 2008
    Date of Patent: June 28, 2011
    Assignee: Yahoo! Inc.
    Inventors: Christopher Olston, Sandeep Pandey
  • Patent number: 7921416
    Abstract: The present invention, in an example embodiment, provides a special-purpose formal language and translator for the parallel processing of large databases in a distributed system. The special-purpose language has features of both a declarative programming language and a procedural programming language and supports the co-grouping of tables, each with an arbitrary alignment function, and the specification of procedural operations to be performed on the resulting co-groups. The language's translator translates a program in the language into optimized structured calls to an application programming interface for implementations of functionality related to the parallel processing of tasks over a distributed system. In an example embodiment, the application programming interface includes interfaces for MapReduce functionality, whose implementations are supplemented by the embodiment.
    Type: Grant
    Filed: October 20, 2006
    Date of Patent: April 5, 2011
    Assignee: Yahoo! Inc.
    Inventors: Marcus Felipe Fontoura, Vanja Josifovski, Shanmugasundaram Ravikumar, Christopher Olston, Benjamin Clay Reed, Andrew Tomkins
  • Patent number: 7899807
    Abstract: An improved system and method for crawl ordering of a web crawler by impact upon search results of a search engine is provided. Content-independent features of uncrawled web pages may be obtained, and the impact of uncrawled web pages may be estimated for queries of a workload using the content-independent features. The impact of uncrawled web pages may be estimated for queries by computing an expected impact score for uncrawled web pages that match needy queries. Query sketches may be created for a subset of the queries by computing an expected impact score for crawled web pages and uncrawled web pages matching the queries. Web pages may then be selected to fetch using a combined query-based estimate and query-independent estimate of the impact of fetching the web pages on search query results.
    Type: Grant
    Filed: December 20, 2007
    Date of Patent: March 1, 2011
    Assignee: Yahoo! Inc.
    Inventors: Christopher Olston, Sandeep Pandey
  • Patent number: 7805447
    Abstract: Computer-implemented methods, modules and clients relate to expanded, pruned sample table for testing database queries against a base table. The expanded, pruned sample table is formed from the base table by a process of initial sampling, synthesis, and pruning.
    Type: Grant
    Filed: January 16, 2008
    Date of Patent: September 28, 2010
    Assignee: Yahoo! Inc.
    Inventors: Christopher Olston, Utkarsh Srivastava
  • Publication number: 20100121847
    Abstract: A method and a system are provided for exploring a large textual data set via interactive aggregation. In one example, the method includes receiving the large textual data set and an original query template, building an index for the query template, wherein the building the index comprises ordering the index a particular way to optimize query time, receiving one or more bindings for the original query template, computing an answer to the original query template using the index and the one or more bindings, and anticipating one or more future queries that a user may submit and that are related to the original query template.
    Type: Application
    Filed: November 11, 2008
    Publication date: May 13, 2010
    Inventor: Christopher Olston
  • Publication number: 20100114867
    Abstract: A method and system are given for providing a virtual environment spanning a desktop and a cloud. In one example, the method includes receiving a query template over a data set that resides in the cloud, optimizing the query template to segment the query template into an offline phase and an online phase, executing the offline phase on the cloud to build one or more indexes, and sending the one or more indexes to the desktop.
    Type: Application
    Filed: November 6, 2008
    Publication date: May 6, 2010
    Inventor: Christopher Olston
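
Illustrative sketch: the abstract for patent 10789544 (and its application, 20170286864) describes queueing inputs from a stream of requests and, once the queue holds at least a maximum batch size of inputs, forming a batch of exactly that size for the model. The Python below is a minimal sketch of that one step under stated assumptions, not the patented implementation. The names BatchingQueue, add_request, and double_all are hypothetical, the "model" is a stand-in function, and the sketch covers only the size-triggered case named in the abstract.

```python
from collections import deque
from typing import Any, Callable, Deque, List


class BatchingQueue:
    """Hypothetical helper: queues inputs and emits batches of a fixed maximum size."""

    def __init__(self, max_batch_size: int, model: Callable[[List[Any]], Any]):
        if max_batch_size < 1:
            raise ValueError("max_batch_size must be at least 1")
        self.max_batch_size = max_batch_size
        self.model = model  # stand-in for "the first machine learning model"
        self.queue: Deque[Any] = deque()

    def add_request(self, model_input: Any) -> List[Any]:
        """Queue one input, then process every full batch currently available.

        Returns the model outputs for the batches formed by this call
        (an empty list if the queue is still below the maximum batch size).
        """
        self.queue.append(model_input)
        outputs = []
        # Size trigger: the count of queued inputs equals or exceeds the maximum.
        while len(self.queue) >= self.max_batch_size:
            # Take exactly max_batch_size inputs, oldest first.
            batch = [self.queue.popleft() for _ in range(self.max_batch_size)]
            outputs.append(self.model(batch))
        return outputs


def double_all(batch: List[int]) -> List[int]:
    """Assumed toy model: doubles each input in the batch."""
    return [2 * x for x in batch]


if __name__ == "__main__":
    batcher = BatchingQueue(max_batch_size=3, model=double_all)
    for value in [1, 2, 3, 4]:
        print(value, batcher.add_request(value))
```

In this sketch the third request fills the queue to the maximum batch size, so one batch is sent to the stand-in model, and the fourth input waits in the queue for a later batch.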