Patents by Inventor Samuel Rash

Samuel Rash has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 10645040
    Abstract: Techniques for consistent writes in a split message store are described. In one embodiment, an apparatus may comprise a client front-end component of a messaging system operative to receive a message, the message comprising message metadata and a message body; and store the message in a message queue; and the message queue operative to initiate a storing of the message metadata in a metadata store; delay a storing of the message body in a message store until a metadata storage success indication is received from the metadata store; receive the metadata storage success indication from the metadata store; and store the message body in the message store in response to receiving the metadata storage success indication from the metadata store. Other embodiments are described and claimed.
    Type: Grant
    Filed: December 29, 2017
    Date of Patent: May 5, 2020
    Assignee: Facebook, Inc.
    Inventors: Rajesh Nishtala, Jason Curtis Jenks, Zardosht Kasheff, Samuel Rash
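    The metadata-first write ordering described in this abstract can be sketched in a few lines. This is an illustrative toy, not the patented implementation: the class, the dict-backed stores, and the method names are all hypothetical stand-ins for the metadata store and message store.

    ```python
    class SplitMessageStore:
        """Toy sketch of a consistent write to a split message store:
        the body write is delayed until the metadata write succeeds."""

        def __init__(self):
            self.metadata_store = {}   # stands in for the metadata store
            self.message_store = {}    # stands in for the message body store

        def store_metadata(self, msg_id, metadata):
            self.metadata_store[msg_id] = metadata
            return True  # metadata storage success indication

        def enqueue(self, msg_id, metadata, body):
            # Initiate the metadata write first; the body is not stored
            # until a success indication is received.
            ok = self.store_metadata(msg_id, metadata)
            if not ok:
                raise RuntimeError("metadata write failed; body write skipped")
            # Success indication received: now persist the message body.
            self.message_store[msg_id] = body


    store = SplitMessageStore()
    store.enqueue("m1", {"sender": "alice"}, b"hello")
    print(store.message_store["m1"])  # → b'hello'
    ```

    Ordering the writes this way means a message body can never exist without its metadata, so readers of the message store see a consistent view.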
  • Patent number: 10581957
    Abstract: Techniques for facilitating and accelerating log data processing are disclosed herein. The front-end clusters generate a large amount of log data in real time and transfer the log data to an aggregating cluster. When the aggregating cluster is not available, the front-end clusters write the log data to local filers and send the data when the aggregating cluster recovers. The aggregating cluster is designed to aggregate incoming log data streams from different front-end servers and clusters. The aggregating cluster further sends the aggregated log data stream to centralized NFS filers or a data warehouse cluster. The local filers and the aggregating cluster stage the log data for access by applications, so that the applications do not wait until the data reach the centralized NFS filers or data warehouse cluster.
    Type: Grant
    Filed: February 10, 2017
    Date of Patent: March 3, 2020
    Assignee: Facebook, Inc.
    Inventors: Samuel Rash, Dhruba Borthakur, Zheng Shao, Guanghao Shen
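    The local-filer fallback in this abstract can be sketched as follows. The classes and attribute names below are hypothetical illustrations, not the patented system: a front-end cluster stages log lines locally while the aggregator is down and drains them on recovery.

    ```python
    class AggregatingCluster:
        """Toy aggregator that collects incoming log lines."""
        def __init__(self):
            self.available = True
            self.aggregated = []

        def receive(self, log_line):
            self.aggregated.append(log_line)


    class FrontEndCluster:
        """Toy front-end: sends to the aggregator when it is available,
        otherwise stages log lines on a local filer until it recovers."""
        def __init__(self, aggregator):
            self.aggregator = aggregator
            self.local_filer = []  # stands in for local filer storage

        def send(self, log_line):
            if self.aggregator.available:
                self.flush_local()                 # drain staged data first
                self.aggregator.receive(log_line)
            else:
                self.local_filer.append(log_line)  # stage locally

        def flush_local(self):
            while self.local_filer:
                self.aggregator.receive(self.local_filer.pop(0))


    agg = AggregatingCluster()
    fe = FrontEndCluster(agg)
    fe.send("req 1")
    agg.available = False
    fe.send("req 2")          # aggregator down: staged on the local filer
    agg.available = True
    fe.send("req 3")          # recovery: staged line drained, then new line
    print(agg.aggregated)     # → ['req 1', 'req 2', 'req 3']
    ```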
  • Publication number: 20190207882
    Abstract: Techniques for consistent writes in a split message store are described. In one embodiment, an apparatus may comprise a client front-end component of a messaging system operative to receive a message, the message comprising message metadata and a message body; and store the message in a message queue; and the message queue operative to initiate a storing of the message metadata in a metadata store; delay a storing of the message body in a message store until a metadata storage success indication is received from the metadata store; receive the metadata storage success indication from the metadata store; and store the message body in the message store in response to receiving the metadata storage success indication from the metadata store. Other embodiments are described and claimed.
    Type: Application
    Filed: December 29, 2017
    Publication date: July 4, 2019
    Inventors: Rajesh Nishtala, Jason Curtis Jenks, Zardosht Kasheff, Samuel Rash
  • Patent number: 10223431
    Abstract: Techniques for facilitating and accelerating log data processing by splitting data streams are disclosed herein. The front-end clusters generate a large amount of log data in real time and transfer the log data to an aggregating cluster. The aggregating cluster is designed to aggregate incoming log data streams from different front-end servers and clusters. The aggregating cluster further splits the log data into a plurality of data streams so that the data streams are sent to a receiving application in parallel. In one embodiment, the log data are randomly split to ensure the log data are evenly distributed in the split data streams. In another embodiment, the application that receives the split data streams determines how to split the log data.
    Type: Grant
    Filed: January 31, 2013
    Date of Patent: March 5, 2019
    Assignee: Facebook, Inc.
    Inventors: Samuel Rash, Dhruba Borthakur, Zheng Shao, Eric Hwang
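    The random-split embodiment in this abstract can be sketched with a short function. The function name and list-based streams are hypothetical; the point is only that assigning each line to a random output stream evenly distributes the load on average.

    ```python
    import random

    def split_stream(log_lines, n_streams, seed=None):
        """Toy sketch of random stream splitting: each log line is
        assigned to one of n_streams output streams at random."""
        rng = random.Random(seed)
        streams = [[] for _ in range(n_streams)]
        for line in log_lines:
            streams[rng.randrange(n_streams)].append(line)
        return streams


    streams = split_stream([f"line {i}" for i in range(1000)], n_streams=4, seed=0)
    print([len(s) for s in streams])  # roughly 250 lines per stream
    ```

    A downstream application can then consume the four streams in parallel; the alternative embodiment simply moves the choice of partitioning key from the aggregator to the receiving application.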
  • Patent number: 9734205
    Abstract: Disclosed here are methods, systems, paradigms and structures for predicting queries, creating tables to store data for the predicted queries, and selecting a particular table to obtain the data from in response to a query. The methods include determining various combinations of a finite set of columns users may query on, based on (i) a list of columns users are interested in obtaining data for, and (ii) cardinality information of a column or combinations of columns in the list of columns. The methods further include creating various tables based on the determined combinations of the columns using a meta query language. A query is responded to by selecting the table that has the least number of rows, among the tables that satisfy the query parameters. The methods include selecting the table that has the longest sequence of columns matching a portion of the query parameters.
    Type: Grant
    Filed: April 18, 2013
    Date of Patent: August 15, 2017
    Assignee: Facebook, Inc.
    Inventors: Samuel Rash, Timothy Williamson, Martin Traverso
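    The least-rows selection rule from this abstract can be sketched as follows. The table names, column tuples, and row counts are invented for illustration; among the pre-built tables whose columns cover the query, the smallest one is chosen.

    ```python
    def select_table(tables, query_columns):
        """Toy sketch of table selection: among pre-built tables whose
        column set covers the query, pick the one with the fewest rows.
        `tables` maps a table name to (columns, row_count)."""
        candidates = [
            (name, row_count)
            for name, (columns, row_count) in tables.items()
            if set(query_columns) <= set(columns)
        ]
        if not candidates:
            return None
        return min(candidates, key=lambda t: t[1])[0]


    # Hypothetical pre-aggregated tables at different cardinalities.
    tables = {
        "by_country":        (("country",), 200),
        "by_country_gender": (("country", "gender"), 400),
        "all_columns":       (("country", "gender", "age"), 10_000),
    }
    print(select_table(tables, ["country", "gender"]))  # → by_country_gender
    ```

    The abstract's second rule (preferring the table with the longest sequence of columns matching the query parameters) would replace the `min` key with a prefix-length comparison; the coverage test stays the same.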
  • Publication number: 20170155707
    Abstract: Techniques for facilitating and accelerating log data processing are disclosed herein. The front-end clusters generate a large amount of log data in real time and transfer the log data to an aggregating cluster. When the aggregating cluster is not available, the front-end clusters write the log data to local filers and send the data when the aggregating cluster recovers. The aggregating cluster is designed to aggregate incoming log data streams from different front-end servers and clusters. The aggregating cluster further sends the aggregated log data stream to centralized NFS filers or a data warehouse cluster. The local filers and the aggregating cluster stage the log data for access by applications, so that the applications do not wait until the data reach the centralized NFS filers or data warehouse cluster.
    Type: Application
    Filed: February 10, 2017
    Publication date: June 1, 2017
    Inventors: Samuel Rash, Dhruba Borthakur, Zheng Shao, Guanghao Shen
  • Patent number: 9609050
    Abstract: Techniques for facilitating and accelerating log data processing are disclosed herein. The front-end clusters generate a large amount of log data in real time and transfer the log data to an aggregating cluster. When the aggregating cluster is not available, the front-end clusters write the log data to local filers and send the data when the aggregating cluster recovers. The aggregating cluster is designed to aggregate incoming log data streams from different front-end servers and clusters. The aggregating cluster further sends the aggregated log data stream to centralized NFS filers or a data warehouse cluster. The local filers and the aggregating cluster stage the log data for access by applications, so that the applications do not wait until the data reach the centralized NFS filers or data warehouse cluster.
    Type: Grant
    Filed: January 31, 2013
    Date of Patent: March 28, 2017
    Assignee: Facebook, Inc.
    Inventors: Samuel Rash, Dhruba Borthakur, Zheng Shao, Guanghao Shen
  • Patent number: 9507718
    Abstract: Disclosed are methods, systems, paradigms and structures for managing cache memory in computer systems. Certain caching techniques anticipate queries and cache the data that may be required by the anticipated queries. The queries are predicted based on previously executed queries. The features of the previously executed queries are extracted and correlated to identify a usage pattern of the features. The prediction model predicts queries based on the identified usage pattern of the features. The disclosed method includes purging data from the cache based on predefined eviction policies that are influenced by the predicted queries. The disclosed method supports caching time series data. The disclosed system includes a storage unit that stores previously executed queries and features of the queries.
    Type: Grant
    Filed: April 16, 2013
    Date of Patent: November 29, 2016
    Assignee: Facebook, Inc.
    Inventors: Samuel Rash, Timothy Williamson
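    The idea of an eviction policy influenced by predicted queries can be sketched in miniature. Everything below is hypothetical: the "prediction model" is reduced to query frequency, standing in for the feature-extraction-and-correlation step the abstract describes.

    ```python
    from collections import Counter

    class PredictiveCache:
        """Toy sketch: queries are predicted from the frequency of
        previously executed queries, and eviction prefers entries
        that no predicted query is expected to need."""

        def __init__(self, capacity):
            self.capacity = capacity
            self.history = Counter()   # stands in for stored query features
            self.cache = {}

        def predicted(self, k=1):
            # Trivial prediction model: the k most frequent past queries.
            return {q for q, _ in self.history.most_common(k)}

        def put(self, query, result):
            if len(self.cache) >= self.capacity and query not in self.cache:
                keep = self.predicted()
                # Evict an entry that is not expected to be queried again;
                # fall back to the oldest entry if every entry is predicted.
                victim = next((q for q in self.cache if q not in keep),
                              next(iter(self.cache)))
                del self.cache[victim]
            self.cache[query] = result
            self.history[query] += 1


    c = PredictiveCache(capacity=2)
    c.put("daily_users", [1, 2])
    c.put("daily_users", [1, 2])   # "daily_users" is now the hot query
    c.put("revenue", [3])
    c.put("one_off", [4])          # evicts "revenue", keeps "daily_users"
    print(sorted(c.cache))         # → ['daily_users', 'one_off']
    ```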
  • Patent number: 9471436
    Abstract: A method and system for failure recovery in a storage system are disclosed. In the storage system, user data streams (e.g., log data) are collected by a scribeh system. The scribeh system may include a plurality of Calligraphus servers, HDFS and Zookeeper. The Calligraphus servers may shard the user data streams based on keys (e.g., category and bucket pairs) and stream the user data streams to Puma nodes. Sharded user data streams may be aggregated according to the keys in memory of a specific Puma node. Periodically, aggregated user data streams cached in memory of the specific Puma node, together with an incremental checkpoint, are persisted to HBase. When a specific process on the specific Puma node fails, Ptail retrieves the incremental checkpoint from HBase and then restores the specific process by requesting user data streams processed by the specific process from the scribeh system according to the incremental checkpoint.
    Type: Grant
    Filed: April 23, 2013
    Date of Patent: October 18, 2016
    Inventors: Samuel Rash, Dhrubajyoti Borthakur, Prakash Khemani, Zheng Shao
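    The checkpoint-based recovery in this abstract can be sketched as follows. This is a toy under stated assumptions: a plain dict stands in for HBase, a list for the upstream stream, and the "incremental checkpoint" is reduced to a stream offset persisted together with the aggregated counts.

    ```python
    class PumaNode:
        """Toy sketch of checkpointed stream aggregation: counts are
        periodically persisted with an incremental checkpoint (the stream
        offset), so a restarted process replays only what came after it."""

        def __init__(self, hbase):
            self.hbase = hbase      # stands in for HBase
            self.counts = {}
            self.offset = 0         # position in the upstream data stream

        def consume(self, stream):
            for key in stream[self.offset:]:
                self.counts[key] = self.counts.get(key, 0) + 1
                self.offset += 1

        def persist(self):
            # Counts and the incremental checkpoint are persisted together.
            self.hbase["counts"] = dict(self.counts)
            self.hbase["checkpoint"] = self.offset

        @classmethod
        def recover(cls, hbase, stream):
            # Ptail-style recovery: restore state from the store, then
            # re-request only the stream portion after the checkpoint.
            node = cls(hbase)
            node.counts = dict(hbase.get("counts", {}))
            node.offset = hbase.get("checkpoint", 0)
            node.consume(stream)
            return node


    hbase = {}
    stream = ["a", "b", "a"]
    node = PumaNode(hbase)
    node.consume(stream[:2])       # process the first two records
    node.persist()                 # checkpoint, then the process "fails"
    restored = PumaNode.recover(hbase, stream)
    print(restored.counts)         # → {'a': 2, 'b': 1}
    ```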
  • Patent number: 9141723
    Abstract: Disclosed are methods, systems, paradigms and structures for caching data associated with a sliding window in computer systems. A sliding window can include a time window that progresses with time, and the data can include time series data. As time progresses, the sliding window changes, bringing in new data. The cache is updated with new data as and when the sliding window moves. The sliding window data is cached at various granularity levels. The method includes storing a first portion of the data at a first granularity level and a second portion at a second granularity level. The data is cached at various granularity levels in order to effectively use the cache, considering at least cache updating criteria such as (i) the number of times a storage unit is queried to retrieve the data for updating the cache, and (ii) the day/date/time at which the storage unit is queried.
    Type: Grant
    Filed: March 14, 2013
    Date of Patent: September 22, 2015
    Assignee: Facebook, Inc.
    Inventors: Samuel Rash, Timothy Williamson, Martin Traverso
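    Caching the two portions of the window at different granularities can be sketched with a small bucketing function. The per-minute/hourly split and the function name are illustrative assumptions, not the patented scheme.

    ```python
    def bucket_series(points, recent_cutoff):
        """Toy sketch of multi-granularity caching: points in the recent
        part of the sliding window keep full per-minute granularity,
        older points are rolled up to hourly sums.
        `points` is a list of (minute_timestamp, value) pairs."""
        fine = {}      # per-minute granularity for recent data
        coarse = {}    # hourly granularity for older data
        for minute, value in points:
            if minute >= recent_cutoff:
                fine[minute] = fine.get(minute, 0) + value
            else:
                hour = minute // 60
                coarse[hour] = coarse.get(hour, 0) + value
        return fine, coarse


    points = [(0, 1), (30, 2), (59, 3), (120, 4), (121, 5)]
    fine, coarse = bucket_series(points, recent_cutoff=120)
    print(fine)    # → {120: 4, 121: 5}
    print(coarse)  # → {0: 6}
    ```

    As the window slides forward, newly expired fine-grained minutes are folded into their hourly bucket, which keeps the cache small while still answering coarse queries over old data without touching the storage unit.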
  • Publication number: 20140317140
    Abstract: Disclosed here are methods, systems, paradigms and structures for predicting queries, creating tables to store data for the predicted queries, and selecting a particular table to obtain the data from in response to a query. The methods include determining various combinations of a finite set of columns users may query on, based on (i) a list of columns users are interested in obtaining data for, and (ii) cardinality information of a column or combinations of columns in the list of columns. The methods further includes creating various tables based on the determined combinations of the columns using a meta query language. A query is responded to by selecting a table that has least number of rows, among the tables that satisfy query parameters. The methods include selecting a table that has a longest sequence of columns matching with a portion of the query parameters.
    Type: Application
    Filed: April 18, 2013
    Publication date: October 23, 2014
    Inventors: Samuel Rash, Timothy Williamson, Martin Traverso
  • Publication number: 20140317448
    Abstract: A method and system for failure recovery in a storage system are disclosed. In the storage system, user data streams (e.g., log data) are collected by a scribeh system. The scribeh system may include a plurality of Calligraphus servers, HDFS and Zookeeper. The Calligraphus servers may shard the user data streams based on keys (e.g., category and bucket pairs) and stream the user data streams to Puma nodes. Sharded user data streams may be aggregated according to the keys in memory of a specific Puma node. Periodically, aggregated user data streams cached in memory of the specific Puma node, together with an incremental checkpoint, are persisted to HBase. When a specific process on the specific Puma node fails, Ptail retrieves the incremental checkpoint from HBase and then restores the specific process by requesting user data streams processed by the specific process from the scribeh system according to the incremental checkpoint.
    Type: Application
    Filed: April 23, 2013
    Publication date: October 23, 2014
    Applicant: Facebook, Inc.
    Inventors: Samuel Rash, Dhrubajyoti Borthakur, Prakash Khemani, Zheng Shao
  • Publication number: 20140310470
    Abstract: Disclosed are methods, systems, paradigms and structures for managing cache memory in computer systems. Certain caching techniques anticipate queries and cache the data that may be required by the anticipated queries. The queries are predicted based on previously executed queries. The features of the previously executed queries are extracted and correlated to identify a usage pattern of the features. The prediction model predicts queries based on the identified usage pattern of the features. The disclosed method includes purging data from the cache based on predefined eviction policies that are influenced by the predicted queries. The disclosed method supports caching time series data. The disclosed system includes a storage unit that stores previously executed queries and features of the queries.
    Type: Application
    Filed: April 16, 2013
    Publication date: October 16, 2014
    Inventors: Samuel Rash, Timothy Williamson
  • Publication number: 20140280126
    Abstract: Disclosed are methods, systems, paradigms and structures for caching data associated with a sliding window in computer systems. A sliding window can include a time window that progresses with time, and the data can include time series data. As time progresses, the sliding window changes, bringing in new data. The cache is updated with new data as and when the sliding window moves. The sliding window data is cached at various granularity levels. The method includes storing a first portion of the data at a first granularity level and a second portion at a second granularity level. The data is cached at various granularity levels in order to effectively use the cache, considering at least cache updating criteria such as (i) the number of times a storage unit is queried to retrieve the data for updating the cache, and (ii) the day/date/time at which the storage unit is queried.
    Type: Application
    Filed: March 14, 2013
    Publication date: September 18, 2014
    Applicant: Facebook, Inc.
    Inventors: Samuel Rash, Timothy Williamson, Martin Traverso
  • Publication number: 20140215007
    Abstract: Techniques for facilitating and accelerating log data processing are disclosed herein. The front-end clusters generate a large amount of log data in real time and transfer the log data to an aggregating cluster. When the aggregating cluster is not available, the front-end clusters write the log data to local filers and send the data when the aggregating cluster recovers. The aggregating cluster is designed to aggregate incoming log data streams from different front-end servers and clusters. The aggregating cluster further sends the aggregated log data stream to centralized NFS filers or a data warehouse cluster. The local filers and the aggregating cluster stage the log data for access by applications, so that the applications do not wait until the data reach the centralized NFS filers or data warehouse cluster.
    Type: Application
    Filed: January 31, 2013
    Publication date: July 31, 2014
    Inventors: Samuel Rash, Dhruba Borthakur, Zheng Shao, Guanghao Shen
  • Publication number: 20140214752
    Abstract: Techniques for facilitating and accelerating log data processing by splitting data streams are disclosed herein. The front-end clusters generate a large amount of log data in real time and transfer the log data to an aggregating cluster. The aggregating cluster is designed to aggregate incoming log data streams from different front-end servers and clusters. The aggregating cluster further splits the log data into a plurality of data streams so that the data streams are sent to a receiving application in parallel. In one embodiment, the log data are randomly split to ensure the log data are evenly distributed in the split data streams. In another embodiment, the application that receives the split data streams determines how to split the log data.
    Type: Application
    Filed: January 31, 2013
    Publication date: July 31, 2014
    Inventors: Samuel Rash, Dhrubajyoti Borthakur, Zheng Shao, Eric Hwang