Patents by Inventor Samuel Rash
Samuel Rash has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Patent number: 10645040
Abstract: Techniques for consistent writes in a split message store are described. In one embodiment, an apparatus may comprise a client front-end component of a messaging system operative to receive a message, the message comprising message metadata and a message body; and store the message in a message queue; and the message queue operative to initiate a storing of the message metadata in a metadata store; delay a storing of the message body in a message store until a metadata storage success indication is received from the metadata store; receive the metadata storage success indication from the metadata store; and store the message body in the message store in response to receiving the metadata storage success indication from the metadata store. Other embodiments are described and claimed.
Type: Grant
Filed: December 29, 2017
Date of Patent: May 5, 2020
Assignee: Facebook, Inc.
Inventors: Rajesh Nishtala, Jason Curtis Jenks, Zardosht Kasheff, Samuel Rash
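The write ordering described in this abstract can be illustrated with a minimal sketch. It is not the patented implementation: InMemoryStore, put, and process_message are hypothetical names, and the stores are plain in-memory dictionaries that return True as a stand-in for the "success indication".

```python
class InMemoryStore:
    """Hypothetical stand-in for a metadata store or a message store."""

    def __init__(self, fail=False):
        self.data = {}
        self.fail = fail  # when True, every write reports failure

    def put(self, key, value):
        """Return True as a success indication, False on failure."""
        if self.fail:
            return False
        self.data[key] = value
        return True


def process_message(message_id, metadata, body, metadata_store, message_store):
    """Store metadata first; store the body only after metadata succeeds."""
    # Initiate the metadata write and wait for its success indication.
    if not metadata_store.put(message_id, metadata):
        return False  # the body write is never attempted
    # The success indication arrived, so the body may now be stored.
    return message_store.put(message_id, body)
```

With this ordering, a failed metadata write never leaves a message body behind without matching metadata.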
-
Patent number: 10581957
Abstract: Techniques for facilitating and accelerating log data processing are disclosed herein. The front-end clusters generate a large amount of log data in real time and transfer the log data to an aggregating cluster. When the aggregating cluster is not available, the front-clusters write the log data to local filers and send the data when the aggregating cluster recovers. The aggregating cluster is designed to aggregate incoming log data streams from different front-end servers and clusters. The aggregating cluster further sends the aggregated log data stream to centralized NFS filers or a data warehouse cluster. The local filers and the aggregating cluster stage the log data for access by applications, so that the applications do not wait until the data reach the centralized NFS filers or data warehouse cluster.
Type: Grant
Filed: February 10, 2017
Date of Patent: March 3, 2020
Assignee: Facebook, Inc.
Inventors: Samuel Rash, Dhruba Borthakur, Zheng Shao, Guanghao Shen
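The fallback behavior in this abstract (write locally while the aggregating cluster is down, send once it recovers) might look roughly like the sketch below. All names are hypothetical: FlakyAggregator simulates an aggregating cluster with an availability flag, and a Python list stands in for the local filer.

```python
class FlakyAggregator:
    """Hypothetical aggregating cluster that can be switched on and off."""

    def __init__(self):
        self.available = True
        self.received = []

    def send(self, record):
        if not self.available:
            raise ConnectionError("aggregating cluster unavailable")
        self.received.append(record)


class FrontEnd:
    """Hypothetical front-end server that buffers locally during outages."""

    def __init__(self, aggregator):
        self.aggregator = aggregator
        self.local_buffer = []  # stands in for a local filer

    def flush(self):
        # Send buffered records in order; stop at the first failure.
        while self.local_buffer:
            self.aggregator.send(self.local_buffer[0])
            self.local_buffer.pop(0)

    def log(self, record):
        try:
            self.flush()  # drain older records first to preserve order
            self.aggregator.send(record)
        except ConnectionError:
            self.local_buffer.append(record)
```

Draining the buffer before sending a new record is a design choice of this sketch, made so that records arrive in the order they were logged.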
-
Publication number: 20190207882
Abstract: Techniques for consistent writes in a split message store are described. In one embodiment, an apparatus may comprise a client front-end component of a messaging system operative to receive a message, the message comprising message metadata and a message body; and store the message in a message queue; and the message queue operative to initiate a storing of the message metadata in a metadata store; delay a storing of the message body in a message store until a metadata storage success indication is received from the metadata store; receive the metadata storage success indication from the metadata store; and store the message body in the message store in response to receiving the metadata storage success indication from the metadata store. Other embodiments are described and claimed.
Type: Application
Filed: December 29, 2017
Publication date: July 4, 2019
Inventors: Rajesh Nishtala, Jason Curtis Jenks, Zardosht Kasheff, Samuel Rash
-
Patent number: 10223431
Abstract: Techniques for facilitating and accelerating log data processing by splitting data streams are disclosed herein. The front-end clusters generate large amount of log data in real time and transfer the log data to an aggregating cluster. The aggregating cluster is designed to aggregate incoming log data streams from different front-end servers and clusters. The aggregating cluster further splits the log data into a plurality of data streams so that the data streams are sent to a receiving application in parallel. In one embodiment, the log data are randomly split to ensure the log data are evenly distributed in the split data streams. In another embodiment, the application that receives the split data streams determines how to split the log data.
Type: Grant
Filed: January 31, 2013
Date of Patent: March 5, 2019
Assignee: Facebook, Inc.
Inventors: Samuel Rash, Dhruba Borthakur, Zheng Shao, Eric Hwang
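The random-split embodiment in this abstract can be illustrated with a short sketch. The function name split_randomly and its parameters are hypothetical, and uniform random assignment is only one way to read "randomly split"; the patent may use a different scheme.

```python
import random


def split_randomly(log_entries, num_streams, seed=None):
    """Assign each log entry to one of num_streams streams at random."""
    rng = random.Random(seed)  # a seed makes the split reproducible
    streams = [[] for _ in range(num_streams)]
    for entry in log_entries:
        # Each entry lands in exactly one stream, chosen uniformly.
        streams[rng.randrange(num_streams)].append(entry)
    return streams
```

Because each entry is placed independently and uniformly, the streams are expected to be close to equal in size for large inputs, which is the even distribution the abstract mentions.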
-
Patent number: 9734205
Abstract: Disclosed here are methods, systems, paradigms and structures for predicting queries, creating tables to store data for the predicted queries, and selecting a particular table to obtain the data from in response to a query. The methods include determining various combinations of a finite set of columns users may query on, based on (i) a list of columns users are interested in obtaining data for, and (ii) cardinality information of a column or combinations of columns in the list of columns. The methods further includes creating various tables based on the determined combinations of the columns using a meta query language. A query is responded to by selecting a table that has least number of rows, among the tables that satisfy query parameters. The methods include selecting a table that has a longest sequence of columns matching with a portion of the query parameters.
Type: Grant
Filed: April 18, 2013
Date of Patent: August 15, 2017
Assignee: Facebook, Inc.
Inventors: Samuel Rash, Timothy Williamson, Martin Traverso
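The table-selection rule in this abstract (among the tables that satisfy the query parameters, pick the one with the fewest rows) can be sketched as follows. Table and select_table are hypothetical names, and "satisfies the query" is assumed here to mean that the table contains every queried column.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Table:
    """Hypothetical description of a pre-built table."""
    name: str
    columns: frozenset
    row_count: int


def select_table(tables, query_columns):
    """Return the smallest table that contains every queried column."""
    needed = frozenset(query_columns)
    # Assumption: a table satisfies the query if it has all needed columns.
    candidates = [t for t in tables if needed <= t.columns]
    if not candidates:
        return None
    return min(candidates, key=lambda t: t.row_count)
```

For example, a query on "country" alone would be answered from a small per-country table rather than a larger country-and-age table, even though both contain the column.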
-
Publication number: 20170155707
Abstract: Techniques for facilitating and accelerating log data processing are disclosed herein. The front-end clusters generate a large amount of log data in real time and transfer the log data to an aggregating cluster. When the aggregating cluster is not available, the front-clusters write the log data to local filers and send the data when the aggregating cluster recovers. The aggregating cluster is designed to aggregate incoming log data streams from different front-end servers and clusters. The aggregating cluster further sends the aggregated log data stream to centralized NFS filers or a data warehouse cluster. The local filers and the aggregating cluster stage the log data for access by applications, so that the applications do not wait until the data reach the centralized NFS filers or data warehouse cluster.
Type: Application
Filed: February 10, 2017
Publication date: June 1, 2017
Inventors: Samuel Rash, Dhruba Borthakur, Zheng Shao, Guanghao Shen
-
Patent number: 9609050
Abstract: Techniques for facilitating and accelerating log data processing are disclosed herein. The front-end clusters generate a large amount of log data in real time and transfer the log data to an aggregating cluster. When the aggregating cluster is not available, the front-clusters write the log data to local filers and send the data when the aggregating cluster recovers. The aggregating cluster is designed to aggregate incoming log data streams from different front-end servers and clusters. The aggregating cluster further sends the aggregated log data stream to centralized NFS filers or a data warehouse cluster. The local filers and the aggregating cluster stage the log data for access by applications, so that the applications do not wait until the data reach the centralized NFS filers or data warehouse cluster.
Type: Grant
Filed: January 31, 2013
Date of Patent: March 28, 2017
Assignee: Facebook, Inc.
Inventors: Samuel Rash, Dhruba Borthakur, Zheng Shao, Guanghao Shen
-
Patent number: 9507718
Abstract: Disclosed are methods, systems, paradigms and structures for managing cache memory in computer systems. Certain caching techniques anticipate queries and caches the data that may be required by the anticipated queries. The queries are predicted based on previously executed queries. The features of the previously executed queries are extracted and correlated to identify a usage pattern of the features. The prediction model predicts queries based on the identified usage pattern of the features. The disclosed method includes purging data from the cache based on predefined eviction policies that are influenced by the predicted queries. The disclosed method supports caching time series data. The disclosed system includes a storage unit that stores previously executed queries and features of the queries.
Type: Grant
Filed: April 16, 2013
Date of Patent: November 29, 2016
Assignee: Facebook, Inc.
Inventors: Samuel Rash, Timothy Williamson
-
Patent number: 9471436
Abstract: A method and system on failure recovery in a storage system are disclosed. In the storage system, user data streams (e.g., log data) are collected by a scribeh system. The scribeh system may include a plurality of Calligraphus servers, HDFS and Zookeeper. The Calligraphus servers may shard the user data streams based on keys (e.g., category and bucket pairs) and stream the user data streams to Puma nodes. Sharded user data streams may be aggregated according to the keys in memory of a specific Puma node. Periodically, aggregated user data streams cached in memory of the specific Puma node, together with a Incremental checkpoint, are persisted to HBase. When a specific process on the specific Puma node fails, Ptail retrieves the Incremental checkpoint from HBase and then restores the specific process by requesting user data streams processed by the specific process from the scribeh system according to the Incremental checkpoint.
Type: Grant
Filed: April 23, 2013
Date of Patent: October 18, 2016
Inventors: Samuel Rash, Dhrubajyoti Borthakur, Prakash Khemani, Zheng Shao
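The checkpoint-and-replay idea in this abstract can be illustrated with a toy sketch. It does not use Puma, Ptail, HBase, or scribeh: a dictionary stands in for the durable store, a list for the replayable stream, and the Aggregator class and its methods are hypothetical names.

```python
class Aggregator:
    """Hypothetical in-memory aggregator with checkpoint-based recovery."""

    def __init__(self, durable):
        # 'durable' is a dict standing in for a persistent store.
        self.durable = durable
        saved = durable.get("snapshot", {"counts": {}, "offset": 0})
        self.counts = dict(saved["counts"])  # per-key aggregates
        self.offset = saved["offset"]        # position reached in the stream

    def consume(self, stream, limit=None):
        """Aggregate entries from the current offset, at most 'limit' of them."""
        end = len(stream) if limit is None else min(len(stream), self.offset + limit)
        for key in stream[self.offset:end]:
            self.counts[key] = self.counts.get(key, 0) + 1
        self.offset = end

    def checkpoint(self):
        """Persist the aggregates together with the stream offset."""
        self.durable["snapshot"] = {"counts": dict(self.counts), "offset": self.offset}
```

A replacement process built from the same durable store starts at the last checkpointed offset and re-reads the stream from there, so work done after the checkpoint is redone rather than lost or double counted.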
-
Patent number: 9141723
Abstract: Disclosed are methods, systems, paradigms and structures for caching data associated with a sliding window in computer systems. A sliding window can include a time window that progresses with time, and the data can include time series data. As time progresses, the sliding window changes bringing in new data. The cache is updated with new data as and when the sliding window moves. The sliding window data is cached at various granularity levels. The method includes storing a first portion of the data at a first granularity level and a second portion at a second granularity level. The data is cached at various granularity levels in order to effectively use the cache considering at least cache updating criteria such as (i) number of times a storage unit is queried to retrieve the data for updating the cache, (ii) the day/date/time at which the storage unit is queried.
Type: Grant
Filed: March 14, 2013
Date of Patent: September 22, 2015
Assignee: Facebook, Inc.
Inventors: Samuel Rash, Timothy Williamson, Martin Traverso
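One way to picture caching a sliding window at two granularity levels is to cover the window with whole-day buckets where possible and hourly buckets at the edges. The sketch below is only an illustration under that assumption: plan_buckets is a hypothetical name, time is measured in whole hours, and the choice of "day" and "hour" as the two levels is not taken from the patent.

```python
def plan_buckets(start_hour, end_hour, hours_per_day=24):
    """Cover [start_hour, end_hour) with day buckets plus hourly edge buckets."""
    buckets = []
    h = start_hour
    while h < end_hour:
        # Use a coarse day bucket when a full, aligned day fits in the window.
        if h % hours_per_day == 0 and h + hours_per_day <= end_hour:
            buckets.append(("day", h // hours_per_day))
            h += hours_per_day
        else:
            # Otherwise fall back to a fine-grained hourly bucket.
            buckets.append(("hour", h))
            h += 1
    return buckets
```

As the window slides forward by one hour, only the hourly buckets at the edges change, while the day buckets in the middle can be reused from the cache.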
-
Publication number: 20140317140
Abstract: Disclosed here are methods, systems, paradigms and structures for predicting queries, creating tables to store data for the predicted queries, and selecting a particular table to obtain the data from in response to a query. The methods include determining various combinations of a finite set of columns users may query on, based on (i) a list of columns users are interested in obtaining data for, and (ii) cardinality information of a column or combinations of columns in the list of columns. The methods further includes creating various tables based on the determined combinations of the columns using a meta query language. A query is responded to by selecting a table that has least number of rows, among the tables that satisfy query parameters. The methods include selecting a table that has a longest sequence of columns matching with a portion of the query parameters.
Type: Application
Filed: April 18, 2013
Publication date: October 23, 2014
Inventors: Samuel Rash, Timothy Williamson, Martin Traverso
-
Publication number: 20140317448
Abstract: A method and system on failure recovery in a storage system are disclosed. In the storage system, user data streams (e.g., log data) are collected by a scribeh system. The scribeh system may include a plurality of Calligraphus servers, HDFS and Zookeeper. The Calligraphus servers may shard the user data streams based on keys (e.g., category and bucket pairs) and stream the user data streams to Puma nodes. Sharded user data streams may be aggregated according to the keys in memory of a specific Puma node. Periodically, aggregated user data streams cached in memory of the specific Puma node, together with a Incremental checkpoint, are persisted to HBase. When a specific process on the specific Puma node fails, Ptail retrieves the Incremental checkpoint from HBase and then restores the specific process by requesting user data streams processed by the specific process from the scribeh system according to the Incremental checkpoint.
Type: Application
Filed: April 23, 2013
Publication date: October 23, 2014
Applicant: Facebook, Inc.
Inventors: Samuel Rash, Dhrubajyoti Borthakur, Prakash Khemani, Zheng Shao
-
Publication number: 20140310470
Abstract: Disclosed are methods, systems, paradigms and structures for managing cache memory in computer systems. Certain caching techniques anticipate queries and caches the data that may be required by the anticipated queries. The queries are predicted based on previously executed queries. The features of the previously executed queries are extracted and correlated to identify a usage pattern of the features. The prediction model predicts queries based on the identified usage pattern of the features. The disclosed method includes purging data from the cache based on predefined eviction policies that are influenced by the predicted queries. The disclosed method supports caching time series data. The disclosed system includes a storage unit that stores previously executed queries and features of the queries.
Type: Application
Filed: April 16, 2013
Publication date: October 16, 2014
Inventors: Samuel Rash, Timothy Williamson
-
Publication number: 20140280126
Abstract: Disclosed are methods, systems, paradigms and structures for caching data associated with a sliding window in computer systems. A sliding window can include a time window that progresses with time, and the data can include time series data. As time progresses, the sliding window changes bringing in new data. The cache is updated with new data as and when the sliding window moves. The sliding window data is cached at various granularity levels. The method includes storing a first portion of the data at a first granularity level and a second portion at a second granularity level. The data is cached at various granularity levels in order to effectively use the cache considering at least cache updating criteria such as (i) number of times a storage unit is queried to retrieve the data for updating the cache, (ii) the day/date/time at which the storage unit is queried.
Type: Application
Filed: March 14, 2013
Publication date: September 18, 2014
Applicant: Facebook, Inc.
Inventors: Samuel Rash, Timothy Williamson, Martin Traverso
-
Publication number: 20140215007
Abstract: Techniques for facilitating and accelerating log data processing are disclosed herein. The front-end clusters generate a large amount of log data in real time and transfer the log data to an aggregating cluster. When the aggregating cluster is not available, the front-clusters write the log data to local filers and send the data when the aggregating cluster recovers. The aggregating cluster is designed to aggregate incoming log data streams from different front-end servers and clusters. The aggregating cluster further sends the aggregated log data stream to centralized NFS filers or a data warehouse cluster. The local filers and the aggregating cluster stage the log data for access by applications, so that the applications do not wait until the data reach the centralized NFS filers or data warehouse cluster.
Type: Application
Filed: January 31, 2013
Publication date: July 31, 2014
Inventors: Samuel Rash, Dhruba Borthakur, Zheng Shao, Guanghao Shen
-
Publication number: 20140214752
Abstract: Techniques for facilitating and accelerating log data processing by splitting data streams are disclosed herein. The front-end clusters generate large amount of log data in real time and transfer the log data to an aggregating cluster. The aggregating cluster is designed to aggregate incoming log data streams from different front-end servers and clusters. The aggregating cluster further splits the log data into a plurality of data streams so that the data streams are sent to a receiving application in parallel. In one embodiment, the log data are randomly split to ensure the log data are evenly distributed in the split data streams. In another embodiment, the application that receives the split data streams determines how to split the log data.
Type: Application
Filed: January 31, 2013
Publication date: July 31, 2014
Inventors: Samuel Rash, Dhrubajyoti Borthakur, Zheng Shao, Eric Hwang