Patents by Inventor Jonathan Ming-Cyn Hsieh

Jonathan Ming-Cyn Hsieh has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Publication number: 20150127608
    Abstract: Scalable architectures, systems, and services are provided herein for creating manifest-based snapshots in distributed computing environments. In some embodiments, responsive to receiving a request to create a snapshot of a data object, a master node identifies multiple slave nodes on which the data object is stored in the cloud-computing platform and creates a snapshot manifest representing the snapshot of the data object. The snapshot manifest comprises a file including a listing of multiple file names in the snapshot manifest and reference information for locating the multiple files in the distributed database system. The snapshot can be created without disrupting I/O operations, e.g., in an online mode by various region servers as directed by the master node. Additionally, a log roll approach to creating the snapshot is also disclosed in which log files are marked. The replaying of log entries can reduce the probability of causal inconsistency in the snapshot.
    Type: Application
    Filed: October 29, 2014
    Publication date: May 7, 2015
    Inventors: Jonathan Ming-Cyn Hsieh, Matteo Bertozzi
  • Publication number: 20150019557
    Abstract: Systems and methods of dynamically processing an event using an extensible data model are disclosed. One embodiment includes, specifying attributes of the event in a data model; the data model being extensible to add properties to the event as the dataset is streamed from the source to the sink.
    Type: Application
    Filed: August 18, 2014
    Publication date: January 15, 2015
    Inventors: Jonathan Ming-Cyn Hsieh, Henry Noel Robinson
  • Patent number: 8874526
    Abstract: Systems and methods of dynamically processing an event using an extensible data model are disclosed. One embodiment includes, specifying attributes of the event in a data model; the data model being extensible to add properties to the event as the dataset is streamed from the source to the sink.
    Type: Grant
    Filed: September 8, 2010
    Date of Patent: October 28, 2014
    Assignee: Cloudera, Inc.
    Inventors: Jonathan Ming-Cyn Hsieh, Henry Noel Robinson
  • Patent number: 8812457
    Abstract: Systems and methods of dynamically processing an event using an extensible data model are disclosed. One embodiment includes, specifying attributes of the event in a data model; the data model being extensible to add properties to the event as the dataset is streamed from the source to the sink.
    Type: Grant
    Filed: September 8, 2010
    Date of Patent: August 19, 2014
    Assignee: Cloudera, Inc.
    Inventors: Jonathan Ming-Cyn Hsieh, Henry Noel Robinson
  • Publication number: 20130282668
    Abstract: Systems and methods for checking for region consistency and table integrity problems and automatically repairing a corrupted HBase cluster are disclosed. The methods and systems operate in a diagnostic mode and a diagnostic and repair mode. The methods include fixing table integrity problems, such as backwards table regions, table region holes, table region overlap, and the like to restore the table integrity invariant. Once the table integrity has been restored, each row key resolves to exactly one region. The methods further include fixing region inconsistencies, such as bad region assignment, no region present in the meta table, region information not in the Hadoop Distributed File System (HDFS), and the like to restore the region consistency invariant. The information in the HDFS is taken as ground truth, and any meta table or assignment entries that are inconsistent with the HDFS are deemed wrong and removed.
    Type: Application
    Filed: March 15, 2013
    Publication date: October 24, 2013
    Applicant: CLOUDERA, INC.
    Inventor: Jonathan Ming-Cyn Hsieh
  • Publication number: 20110246460
    Abstract: Systems and methods of facilitating collecting and aggregating datasets that are machine or user-generated for analysis are disclosed. One embodiment includes, collecting a dataset on a machine on which the dataset is received or generated, wherein, the dataset is collected from a data source on the machine, aggregating the dataset collected from the data source at a receiving location, performing analytics on the dataset upon collection or aggregation, and/or writing the dataset aggregated at the receiving location to a storage location.
    Type: Application
    Filed: September 8, 2010
    Publication date: October 6, 2011
    Applicant: Cloudera, Inc.
    Inventors: Jonathan Ming-Cyn Hsieh, Henry Noel Robinson
  • Publication number: 20110246528
    Abstract: Systems and methods of dynamically processing an event using an extensible data model are disclosed. One embodiment includes, specifying attributes of the event in a data model; the data model being extensible to add properties to the event as the dataset is streamed from the source to the sink.
    Type: Application
    Filed: September 8, 2010
    Publication date: October 6, 2011
    Applicant: Cloudera, Inc.
    Inventors: Jonathan Ming-Cyn Hsieh, Henry Noel Robinson
  • Publication number: 20110246826
    Abstract: Systems and methods of collecting and aggregating log data with fault tolerance are disclosed. One embodiment includes, one or more machines that generate log data, the one or more machines each associated with an agent node to collect the log data, wherein, the agent node generates a batch comprising multiple messages from the log data and assigns a tag to the batch. In one embodiment, the agent node further computes a checksum for the batch of multiple messages. The system may further include a collector device, the collector device being associated with a collector tier having a collector node to which the agent node sends the log data; wherein, the collector node determines the checksum for the batch of multiple messages received from the agent node.
    Type: Application
    Filed: September 8, 2010
    Publication date: October 6, 2011
    Applicant: Cloudera, Inc.
    Inventors: Jonathan Ming-Cyn Hsieh, Henry Noel Robinson
  • Publication number: 20110246816
    Abstract: Methods for configuring a system to collect and aggregate datasets are disclosed. One embodiment includes, identifying a data source in the system from which the dataset is to be collected, configuring a machine in the system that generates the dataset to be collected, to send the dataset to the data source, identifying an arrival location where the dataset that is collected is to be aggregated or written, and/or configuring an agent node by specifying a source for the agent node as the data source in the system and specifying a sink for the agent node as the arrival location.
    Type: Application
    Filed: September 8, 2010
    Publication date: October 6, 2011
    Applicant: Cloudera, Inc.
    Inventors: Jonathan Ming-Cyn Hsieh, Henry Noel Robinson
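
Illustrative note on publication 20130282668: the abstract names three table integrity problems (backwards regions, region holes, region overlap) and states that, once integrity is restored, each row key resolves to exactly one region. The sketch below shows one way such a check might look in diagnostic mode only. It is not the patented method and does not use any HBase API. The function name check_table_integrity, the (start_key, end_key) tuple layout, the end-exclusive ranges, and the use of an empty string for an unbounded key are all assumptions made for illustration.

```python
from typing import List, Tuple

# Hypothetical region representation: (start_key, end_key), end exclusive.
# Assumption: an empty string means "unbounded" on that side.
Region = Tuple[str, str]


def check_table_integrity(regions: List[Region]) -> List[str]:
    """Report backwards regions, holes, and overlaps in a table's key space."""
    problems: List[str] = []
    valid: List[Region] = []

    # A region whose start is not strictly before its (bounded) end is
    # reported and excluded from the coverage scan.
    for start, end in regions:
        if end != "" and start >= end:
            problems.append(f"backwards region [{start!r}, {end!r})")
        else:
            valid.append((start, end))

    valid.sort(key=lambda r: r[0])

    cursor = ""          # first key not yet covered by any region
    open_ended = False   # True once some region extends to the end of the key space

    for start, end in valid:
        if open_ended or start < cursor:
            problems.append(f"overlap starting at {start!r}")
        elif start > cursor:
            problems.append(f"hole [{cursor!r}, {start!r})")
        if end == "":
            open_ended = True
        elif not open_ended:
            cursor = max(cursor, end)

    if not open_ended:
        problems.append(f"hole [{cursor!r}, end of key space)")

    return problems


if __name__ == "__main__":
    healthy = [("", "b"), ("b", "d"), ("d", "")]
    damaged = [("", "b"), ("c", "f"), ("e", ""), ("z", "y")]
    print(check_table_integrity(healthy))
    for problem in check_table_integrity(damaged):
        print(problem)
```

An empty result corresponds to the invariant in the abstract: the sorted regions tile the key space with no gaps and no double coverage, so every row key falls in exactly one region. The repair mode described in the abstract is not sketched here.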