Patents by Inventor Steve Yu Zhang

Steve Yu Zhang has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

Generation of a data model for searching machine data

Patent number: 8983994

Abstract: Embodiments include generating data models that may give semantic meaning for unstructured or structured data that may include data generated and/or received by search engines, including a time series engine. A method includes generating a data model for data stored in a repository. Generating the data model includes generating an initial query string, executing the initial query string on the data, generating an initial result set based on the initial query string being executed on the data, determining one or more candidate fields from one or more results of the initial result set, generating a candidate data model based on the one or more candidate fields, iteratively modifying the candidate data model until the candidate data model models the data, and using the candidate data model as the data model.

Type: Grant

Filed: October 30, 2013

Date of Patent: March 17, 2015

Assignee: Splunk Inc.

Inventors: Alice Emily Neels, Archana Sulochana Ganapathi, Marc Vincent Robichaud, Stephen Phillip Sorkin, Steve Yu Zhang
Report Acceleration Using Intermediate Summaries

Publication number: 20150058353

Abstract: A method and system for managing searches of a data set that is partitioned based on a plurality of events. A structure of a search query may be analyzed to determine if logical computational actions performed on the data set are reducible. Data in each partition is analyzed to determine if at least a portion of the data in the partition is reducible. In response to a subsequent or reoccurring search request, intermediate summaries of reducible data and reducible search computations may be aggregated for each partition. Next, a search result may be generated based on at least one of the aggregated intermediate summaries, the aggregated reducible search computations, and a query of ad hoc non-reducible data arranged in at least one of the plurality of partitions for the data set.

Type: Application

Filed: October 31, 2014

Publication date: February 26, 2015

Inventors: Stephen P. Sorkin, Steve Yu Zhang, Ledion Bitincka
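
The idea in the abstract above, reusing per-partition intermediate summaries for the reducible part of a search, can be illustrated with a minimal Python sketch. It is not taken from the patent: the Summary class, the "bytes" field, and the cache dictionary are hypothetical, and an average stands in for any computation that reduces to mergeable partial state. Partitions without a cached summary are scanned directly, which loosely corresponds to the ad hoc query mentioned in the abstract.

```python
from dataclasses import dataclass


@dataclass
class Summary:
    """Mergeable partial state for an average: a count and a running sum."""
    count: int = 0
    total: float = 0.0

    def add(self, value):
        self.count += 1
        self.total += value

    def merge(self, other):
        # Merging two summaries gives the summary of the combined data.
        return Summary(self.count + other.count, self.total + other.total)


def summarize(events):
    """Build the intermediate summary for one partition (hypothetical 'bytes' field)."""
    summary = Summary()
    for event in events:
        summary.add(event["bytes"])
    return summary


def accelerated_average(partitions, cache):
    """Average 'bytes' over all partitions, reusing cached summaries where present.

    partitions: dict mapping partition id -> list of event dicts
    cache:      dict mapping partition id -> Summary built earlier
    """
    combined = Summary()
    for pid, events in partitions.items():
        summary = cache.get(pid)
        if summary is None:
            # No summary yet for this partition: fall back to scanning its events.
            summary = summarize(events)
        combined = combined.merge(summary)
    return combined.total / combined.count if combined.count else None


partitions = {
    "p1": [{"bytes": 10}, {"bytes": 20}],
    "p2": [{"bytes": 30}],
}
cache = {"p1": summarize(partitions["p1"])}  # only p1 was summarized in advance
average = accelerated_average(partitions, cache)
```

The sketch works because count and sum can be merged across partitions without revisiting events; a computation lacking such mergeable state would be non-reducible in the abstract's sense.
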
Interactive Display of Aggregated Search Result Information

Publication number: 20150058326

Abstract: A method, system, and processor-readable storage medium are directed towards generating a report derived from data, such as event data, stored on a plurality of distributed nodes. In one embodiment the analysis is generated using a “divide and conquer” algorithm, such that each distributed node analyzes locally stored event data while an aggregating node combines these analysis results to generate the report. In one embodiment, each distributed node also transmits a list of event data references associated with the analysis result to the aggregating node. The aggregating node may then generate a global ordered list of data references based on the list of event data references received from each distributed node. Subsequently, in response to a user selection of a range of global event data, the report may dynamically retrieve event data from one or more distributed nodes for display according to the global order.

Type: Application

Filed: November 1, 2014

Publication date: February 26, 2015

Inventors: Steve Yu Zhang, Stephen P. Sorkin
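
The "divide and conquer" flow described in the abstract above can be sketched in Python under some loud assumptions: the Node class, the "time" and "status" fields, and ordering by timestamp are all hypothetical choices made for illustration, not details from the application. Each node returns a local analysis plus a sorted list of event references; the aggregating node merges both, and events are fetched from their home nodes only when a range is selected.

```python
import heapq
from collections import Counter


class Node:
    """A distributed node holding its own events (hypothetical structure)."""

    def __init__(self, node_id, events):
        self.node_id = node_id
        self.events = events  # list of dicts with "time" and "status" keys

    def analyze(self):
        # Local analysis result: count of events per status value.
        counts = Counter(event["status"] for event in self.events)
        # References (time, node id, local index), sorted so they can be merged.
        refs = sorted(
            (event["time"], self.node_id, i) for i, event in enumerate(self.events)
        )
        return counts, refs

    def fetch(self, index):
        return self.events[index]


def aggregate(nodes):
    """Combine per-node results into one report and a globally ordered reference list."""
    report = Counter()
    ref_lists = []
    for node in nodes:
        counts, refs = node.analyze()
        report.update(counts)
        ref_lists.append(refs)
    # heapq.merge performs a k-way merge of the already sorted per-node lists.
    return report, list(heapq.merge(*ref_lists))


def fetch_range(nodes, global_refs, start, stop):
    """Retrieve the events for a selected slice of the global order."""
    by_id = {node.node_id: node for node in nodes}
    return [by_id[node_id].fetch(i) for _, node_id, i in global_refs[start:stop]]


nodes = [
    Node("a", [{"time": 1, "status": 200}, {"time": 4, "status": 500}]),
    Node("b", [{"time": 2, "status": 200}, {"time": 3, "status": 404}]),
]
report, global_refs = aggregate(nodes)
selected = fetch_range(nodes, global_refs, 1, 3)
```

Only references travel to the aggregating node up front; the event bodies stay on their nodes until a range is requested, which is what keeps the display step cheap in this sketch.
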
Interactive Display of Search Result Information

Publication number: 20150058325

Abstract: A method, system, and processor-readable storage medium are directed towards generating a report derived from data, such as event data, stored on a plurality of distributed nodes. In one embodiment the analysis is generated using a “divide and conquer” algorithm, such that each distributed node analyzes locally stored event data while an aggregating node combines these analysis results to generate the report. In one embodiment, each distributed node also transmits a list of event data references associated with the analysis result to the aggregating node. The aggregating node may then generate a global ordered list of data references based on the list of event data references received from each distributed node. Subsequently, in response to a user selection of a range of global event data, the report may dynamically retrieve event data from one or more distributed nodes for display according to the global order.

Type: Application

Filed: October 31, 2014

Publication date: February 26, 2015

Inventors: Steve Yu Zhang, Stephen P. Sorkin
EVENT FIELD DISTRIBUTED SEARCH DISPLAY

Publication number: 20150058375

Abstract: A method, system, and processor-readable storage medium are directed towards generating a report derived from data, such as event data, stored on a plurality of distributed nodes. In one embodiment the analysis is generated using a “divide and conquer” algorithm, such that each distributed node analyzes locally stored event data while an aggregating node combines these analysis results to generate the report. In one embodiment, each distributed node also transmits a list of event data references associated with the analysis result to the aggregating node. The aggregating node may then generate a global ordered list of data references based on the list of event data references received from each distributed node. Subsequently, in response to a user selection of a range of global event data, the report may dynamically retrieve event data from one or more distributed nodes for display according to the global order.

Type: Application

Filed: October 31, 2014

Publication date: February 26, 2015

Inventors: Steve Yu Zhang, Stephen P. Sorkin
Scalable Interactive Display Of Distributed Data

Publication number: 20140317111

Abstract: A method, system, and processor-readable storage medium are directed towards generating a report derived from data, such as event data, stored on a plurality of distributed nodes. In one embodiment the analysis is generated using a “divide and conquer” algorithm, such that each distributed node analyzes locally stored event data while an aggregating node combines these analysis results to generate the report. In one embodiment, each distributed node also transmits a list of event data references associated with the analysis result to the aggregating node. The aggregating node may then generate a global ordered list of data references based on the list of event data references received from each distributed node. Subsequently, in response to a user selection of a range of global event data, the report may dynamically retrieve event data from one or more distributed nodes for display according to the global order.

Type: Application

Filed: May 1, 2014

Publication date: October 23, 2014

Applicant: Splunk Inc.

Inventors: Steve Yu Zhang, Stephen Phillip Sorkin
Approximate Order Statistics Of Real Numbers In Generic Data

Publication number: 20140229516

Abstract: A method, system, and processor-readable storage medium are directed towards calculating approximate order statistics on a collection of real numbers. In one embodiment, the collection of real numbers is processed to create a digest comprising a hierarchy of buckets. Each bucket is assigned a real number N having P digits of precision and ordinality O. The hierarchy is defined by grouping buckets into levels, where each level contains all buckets of a given ordinality. Each individual bucket in the hierarchy defines a range of numbers: all numbers that, after being truncated to that bucket's P digits of precision, are equal to that bucket's N. Each bucket additionally maintains a count of how many numbers have fallen within that bucket's range. Approximate order statistics may then be calculated by traversing the hierarchy and performing an operation on some or all of the ranges and counts associated with each bucket.

Type: Application

Filed: April 18, 2014

Publication date: August 14, 2014

Applicant: Splunk Inc.

Inventor: Steve Yu Zhang
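
A heavily simplified Python sketch of the bucket idea in the abstract above follows. It keeps a single level of buckets at one fixed precision and omits the ordinality hierarchy entirely, so it should be read as an illustration of truncation-based bucketing rather than the claimed method. The Digest class, the truncate helper, and the choice of significant digits as the meaning of "P digits of precision" are assumptions. Inputs are assumed to be finite numbers.

```python
from collections import Counter
from decimal import Decimal, ROUND_DOWN


def truncate(value, digits):
    """Truncate a finite number toward zero to `digits` significant digits."""
    d = Decimal(str(value))
    if d == 0:
        return Decimal(0)
    # adjusted() is the exponent of the most significant digit.
    exponent = d.adjusted() - digits + 1
    return d.quantize(Decimal(1).scaleb(exponent), rounding=ROUND_DOWN)


class Digest:
    """Single-level digest: one counted bucket per truncated value N."""

    def __init__(self, digits=2):
        self.digits = digits
        self.buckets = Counter()  # maps bucket value N -> count
        self.total = 0

    def add(self, value):
        self.buckets[truncate(value, self.digits)] += 1
        self.total += 1

    def quantile(self, q):
        """Approximate the q-quantile (0 <= q <= 1) by the N of the covering bucket."""
        if self.total == 0:
            raise ValueError("empty digest")
        target = q * (self.total - 1)  # zero-based rank being looked up
        seen = 0
        for n in sorted(self.buckets):
            seen += self.buckets[n]
            if seen > target:
                return float(n)
        return float(max(self.buckets))


digest = Digest(digits=2)
for value in (1234, 1239, 5678):
    digest.add(value)
median = digest.quantile(0.5)
```

Here 1234 and 1239 both truncate to 1200 and share a bucket, so the digest stores two buckets for three values and answers the median query from counts alone. The answer is the bucket's lower bound, so the error is bounded by the bucket width.
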
SUPPLEMENTING A HIGH PERFORMANCE ANALYTICS STORE WITH EVALUATION OF INDIVIDUAL EVENTS TO RESPOND TO AN EVENT QUERY

Publication number: 20140214888

Abstract: Embodiments are directed towards the transparent summarization of events. Queries directed towards summarizing and reporting on event records may be received at a search head. Search heads may be associated with one or more indexers containing event records. The search head may forward the query to the indexers that can resolve the query for concurrent execution. If a query is a collection query, indexers may generate summarization information based on event records located on the indexers. Event record fields included in the summarization information may be determined based on terms included in the collection query. If a query is a stats query, each indexer may generate a partial result set from previously generated summarization information, returning the partial result sets to the search head. Collection queries may be saved and scheduled to run and periodically update the summarization information.

Type: Application

Filed: January 31, 2014

Publication date: July 31, 2014

Inventors: David Ryan MARQUARDT, Stephen Phillip SORKIN, Steve Yu ZHANG
METADATA TRACKING FOR A PIPELINED SEARCH LANGUAGE (DATA MODELING FOR FIELDS)

Publication number: 20140214807

Abstract: Embodiments are directed towards determining and tracking metadata for the generation of visualizations of requested data. A user may request data by providing a query that may be employed to search for the requested data. The query may include a plurality of commands, which may be employed in a pipeline to perform the search and to generate a table of the requested data. In some embodiments, each command may be executed to perform an action on a set of data. The execution of a command may generate one or more columns to append and/or insert into the table of requested data. Metadata for each generated column may be determined based on the actions performed by executing the commands. The table of requested data and the column metadata may be employed to generate and display a visualization of at least a portion of the requested data to a user.

Type: Application

Filed: October 31, 2013

Publication date: July 31, 2014

Applicant: Splunk Inc.

Inventors: Archana Sulochana Ganapathi, Alice Emily Neels, Marc Vincent Robichaud, Stephen Phillip Sorkin, Steve Yu Zhang
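
The metadata tracking described in the abstract above can be pictured with a small Python sketch. Everything named here is hypothetical: the Command record, the "kind" label, and the pick_axes helper are illustrative inventions, and real search commands do far more than add one computed column. The point is only that each pipeline stage records metadata for the columns it generates, and a later visualization step reads that metadata instead of inspecting the data.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Command:
    """One pipeline stage that adds a single computed column (hypothetical)."""
    name: str        # command name, recorded as the column's origin
    column: str      # name of the column this command generates
    func: Callable   # computes the new value from a row dict
    kind: str        # metadata label for the column, e.g. "number" or "category"


def run_pipeline(rows, commands):
    """Run commands in order, returning the result table and per-column metadata."""
    table = [dict(row) for row in rows]  # copy so the input rows stay untouched
    metadata = {}
    for command in commands:
        for row in table:
            row[command.column] = command.func(row)
        # Record where the column came from and how it should be interpreted.
        metadata[command.column] = {"command": command.name, "kind": command.kind}
    return table, metadata


def pick_axes(metadata):
    """Choose chart axes from column metadata alone: first category, first number."""
    x = next((c for c, m in metadata.items() if m["kind"] == "category"), None)
    y = next((c for c, m in metadata.items() if m["kind"] == "number"), None)
    return x, y


rows = [{"host": "web-1", "bytes": 2048}, {"host": "web-2", "bytes": 512}]
commands = [
    Command("eval", "kb", lambda r: r["bytes"] / 1024, "number"),
    Command("eval", "tier", lambda r: r["host"].split("-")[0], "category"),
]
table, metadata = run_pipeline(rows, commands)
axes = pick_axes(metadata)
```
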
Data model for machine data for semantic search

Patent number: 8788525

Abstract: Embodiments are directed towards generating data models that may give semantic meaning for unstructured data or structured data that may include data generated and/or received by search engines, including a time series engine. Data models also may be generated to provide semantic meaning to structured data. A data model may be composed of hierarchical data model objects analogous to an object-oriented programming class hierarchy. Users may employ a data modeling application to produce reports using search objects that may be part of, or associated with, the data model. The data modeling application may employ the search object and the data model to generate a query string for searching a data repository to produce a result set. A data modeling application may map the result set data to data model objects that may be used to generate reports.

Type: Grant

Filed: September 7, 2012

Date of Patent: July 22, 2014

Assignee: Splunk Inc.

Inventors: Alice Emily Neels, Archana Sulochana Ganapathi, Marc Vincent Robichaud, Stephen Phillip Sorkin, Steve Yu Zhang
Data model for machine data for semantic search

Patent number: 8788526

Abstract: Embodiments are directed towards generating data models that may give semantic meaning for unstructured data or structured data that may include data generated and/or received by search engines, including a time series engine. Data models also may be generated to provide semantic meaning to structured data. A data model may be composed of hierarchical data model objects analogous to an object-oriented programming class hierarchy. Users may employ a data modeling application to produce reports using search objects that may be part of, or associated with, the data model. The data modeling application may employ the search object and the data model to generate a query string for searching a data repository to produce a result set. A data modeling application may map the result set data to data model objects that may be used to generate reports.

Type: Grant

Filed: October 26, 2012

Date of Patent: July 22, 2014

Assignee: Splunk Inc.

Inventors: Alice Emily Neels, Archana Sulochana Ganapathi, Marc Vincent Robichaud, Stephen Phillip Sorkin, Steve Yu Zhang
Approximate order statistics of real numbers in generic data

Patent number: 8756262

Abstract: A method, system, and processor-readable storage medium are directed towards calculating approximate order statistics on a collection of real numbers. In one embodiment, the collection of real numbers is processed to create a digest comprising a hierarchy of buckets. Each bucket is assigned a real number N having P digits of precision and ordinality O. The hierarchy is defined by grouping buckets into levels, where each level contains all buckets of a given ordinality. Each individual bucket in the hierarchy defines a range of numbers: all numbers that, after being truncated to that bucket's P digits of precision, are equal to that bucket's N. Each bucket additionally maintains a count of how many numbers have fallen within that bucket's range. Approximate order statistics may then be calculated by traversing the hierarchy and performing an operation on some or all of the ranges and counts associated with each bucket.

Type: Grant

Filed: March 1, 2011

Date of Patent: June 17, 2014

Assignee: Splunk Inc.

Inventor: Steve Yu Zhang
Scalable interactive display of distributed data

Patent number: 8751529

Abstract: A method, system, and processor-readable storage medium are directed towards generating a report derived from data, such as event data, stored on a plurality of distributed nodes. In one embodiment the analysis is generated using a “divide and conquer” algorithm, such that each distributed node analyzes locally stored event data while an aggregating node combines these analysis results to generate the report. In one embodiment, each distributed node also transmits a list of event data references associated with the analysis result to the aggregating node. The aggregating node may then generate a global ordered list of data references based on the list of event data references received from each distributed node. Subsequently, in response to a user selection of a range of global event data, the report may dynamically retrieve event data from one or more distributed nodes for display according to the global order.

Type: Grant

Filed: October 25, 2012

Date of Patent: June 10, 2014

Assignee: Splunk Inc.

Inventors: Steve Yu Zhang, Stephen Phillip Sorkin
Approximate order statistics of real numbers in generic data

Patent number: 8745109

Abstract: A method, system, and processor-readable storage medium are directed towards calculating approximate order statistics on a collection of real numbers. In one embodiment, the collection of real numbers is processed to create a digest comprising a hierarchy of buckets. Each bucket is assigned a real number N having P digits of precision and ordinality O. The hierarchy is defined by grouping buckets into levels, where each level contains all buckets of a given ordinality. Each individual bucket in the hierarchy defines a range of numbers: all numbers that, after being truncated to that bucket's P digits of precision, are equal to that bucket's N. Each bucket additionally maintains a count of how many numbers have fallen within that bucket's range. Approximate order statistics may then be calculated by traversing the hierarchy and performing an operation on some or all of the ranges and counts associated with each bucket.

Type: Grant

Filed: October 25, 2012

Date of Patent: June 3, 2014

Assignee: Splunk Inc.

Inventor: Steve Yu Zhang
REPORT ACCELERATION USING INTERMEDIATE RESULTS IN A DISTRIBUTED INDEXER SYSTEM FOR SEARCHING EVENTS

Publication number: 20140149423

Abstract: A method and system for managing searches of a data set that is partitioned based on a plurality of events. A structure of a search query may be analyzed to determine if logical computational actions performed on the data set are reducible. Data in each partition is analyzed to determine if at least a portion of the data in the partition is reducible. In response to a subsequent or reoccurring search request, intermediate summaries of reducible data and reducible search computations may be aggregated for each partition. Next, a search result may be generated based on at least one of the aggregated intermediate summaries, the aggregated reducible search computations, and a query of ad hoc non-reducible data arranged in at least one of the plurality of partitions for the data set.

Type: Application

Filed: January 30, 2014

Publication date: May 29, 2014

Applicant: SPLUNK INC.

Inventors: Stephen Phillip SORKIN, Steve Yu ZHANG, Ledion BITINCKA
SCALABLE INTERACTIVE DISPLAY OF DISTRIBUTED DATA

Publication number: 20140136529

Abstract: A method, system, and processor-readable storage medium are directed towards generating a report derived from data, such as event data, stored on a plurality of distributed nodes. In one embodiment the analysis is generated using a “divide and conquer” algorithm, such that each distributed node analyzes locally stored event data while an aggregating node combines these analysis results to generate the report. In one embodiment, each distributed node also transmits a list of event data references associated with the analysis result to the aggregating node. The aggregating node may then generate a global ordered list of data references based on the list of event data references received from each distributed node. Subsequently, in response to a user selection of a range of global event data, the report may dynamically retrieve event data from one or more distributed nodes for display according to the global order.

Type: Application

Filed: January 17, 2014

Publication date: May 15, 2014

Applicant: Splunk Inc.

Inventors: Steve Yu ZHANG, Stephen Phillip SORKIN
Report acceleration using intermediate summaries of events

Patent number: 8682886

Abstract: A method and system for managing searches of a data set that is partitioned based on a plurality of events. A structure of a search query may be analyzed to determine if logical computational actions performed on the data set are reducible. Data in each partition is analyzed to determine if at least a portion of the data in the partition is reducible. In response to a subsequent or reoccurring search request, intermediate summaries of reducible data and reducible search computations may be aggregated for each partition. Next, a search result may be generated based on at least one of the aggregated intermediate summaries, the aggregated reducible search computations, and a query of ad hoc non-reducible data arranged in at least one of the plurality of partitions for the data set.

Type: Grant

Filed: October 30, 2012

Date of Patent: March 25, 2014

Assignee: Splunk Inc.

Inventors: Stephen Phillip Sorkin, Steve Yu Zhang, Ledion Bitincka
Distributed high performance analytics store

Patent number: 8682925

Abstract: Embodiments are directed towards the transparent summarization of events. Queries directed towards summarizing and reporting on event records may be received at a search head. Search heads may be associated with one or more indexers containing event records. The search head may forward the query to the indexers that can resolve the query for concurrent execution. If a query is a collection query, indexers may generate summarization information based on event records located on the indexers. Event record fields included in the summarization information may be determined based on terms included in the collection query. If a query is a stats query, each indexer may generate a partial result set from previously generated summarization information, returning the partial result sets to the search head. Collection queries may be saved and scheduled to run and periodically update the summarization information.

Type: Grant

Filed: January 31, 2013

Date of Patent: March 25, 2014

Assignee: Splunk Inc.

Inventors: David Ryan Marquardt, Stephen Phillip Sorkin, Steve Yu Zhang
DATA MODEL FOR MACHINE DATA FOR SEMANTIC SEARCH

Publication number: 20140074817

Abstract: Embodiments are directed towards generating data models that may give semantic meaning for unstructured data or structured data that may include data generated and/or received by search engines, including a time series engine. Data models also may be generated to provide semantic meaning to structured data. A data model may be composed of hierarchical data model objects analogous to an object-oriented programming class hierarchy. Users may employ a data modeling application to produce reports using search objects that may be part of, or associated with, the data model. The data modeling application may employ the search object and the data model to generate a query string for searching a data repository to produce a result set. A data modeling application may map the result set data to data model objects that may be used to generate reports.

Type: Application

Filed: October 26, 2012

Publication date: March 13, 2014

Applicant: Splunk Inc.

Inventors: Alice Emily Neels, Archana Sulochana Ganapathi, Marc Vincent Robichaud, Stephen Phillip Sorkin, Steve Yu Zhang
GENERATION OF A DATA MODEL FOR SEARCHING MACHINE DATA

Publication number: 20140074889

Abstract: Embodiments include generating data models that may give semantic meaning for unstructured or structured data that may include data generated and/or received by search engines, including a time series engine. A method includes generating a data model for data stored in a repository. Generating the data model includes generating an initial query string, executing the initial query string on the data, generating an initial result set based on the initial query string being executed on the data, determining one or more candidate fields from one or more results of the initial result set, generating a candidate data model based on the one or more candidate fields, iteratively modifying the candidate data model until the candidate data model models the data, and using the candidate data model as the data model.

Type: Application

Filed: October 30, 2013

Publication date: March 13, 2014

Applicant: Splunk Inc.

Inventors: Alice Emily Neels, Archana Sulochana Ganapathi, Marc Vincent Robichaud, Stephen Phillip Sorkin, Steve Yu Zhang
