Patents by Inventor Steve Yu Zhang

Steve Yu Zhang has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

Determining and providing quantity of unique values existing for a field

Patent number: 10339149

Abstract: A method, system, and processor-readable storage medium are directed towards generating a report derived from data, such as event data, stored on a plurality of distributed nodes. In one embodiment the analysis is generated using a “divide and conquer” algorithm, such that each distributed node analyzes locally stored event data while an aggregating node combines these analysis results to generate the report. In one embodiment, each distributed node also transmits a list of event data references associated with the analysis result to the aggregating node. The aggregating node may then generate a global ordered list of data references based on the list of event data references received from each distributed node. Subsequently, in response to a user selection of a range of global event data, the report may dynamically retrieve event data from one or more distributed nodes for display according to the global order.

Type: Grant

Filed: July 20, 2018

Date of Patent: July 2, 2019

Assignee: SPLUNK Inc.

Inventors: Steve Yu Zhang, Stephen Phillip Sorkin
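
The "divide and conquer" reporting described in the abstract above can be illustrated roughly as follows. This is only a minimal sketch and not the patented implementation: the function names (`analyze_node`, `aggregate`, `fetch_range`), the event layout (dictionaries with a `time` key), and the reference format `(time, node_id, offset)` are all assumptions made for illustration.

```python
import heapq
from collections import Counter

def analyze_node(node_id, events, field):
    """Local step on one node: count values of `field` and list
    time-ordered references (time, node_id, offset) to local events."""
    counts = Counter(e[field] for e in events if field in e)
    refs = sorted((e["time"], node_id, i) for i, e in enumerate(events))
    return counts, refs

def aggregate(partials):
    """Aggregating step: combine per-node counts and merge the
    per-node reference lists into one globally ordered list."""
    total = Counter()
    for counts, _ in partials:
        total.update(counts)
    global_refs = list(heapq.merge(*(refs for _, refs in partials)))
    return total, global_refs

def fetch_range(global_refs, nodes, start, stop):
    """Retrieve only the events in the selected slice of the global
    order, going back to whichever node holds each one."""
    return [nodes[node_id][i] for _, node_id, i in global_refs[start:stop]]
```

In this sketch the aggregating step keeps only references, so event bodies are pulled from the nodes only for the range a user selects.
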
Displaying drill-down event information using event identifiers

Patent number: 10318535

Abstract: A method, system, and processor-readable storage medium are directed towards generating a report derived from data, such as event data, stored on a plurality of distributed nodes. In one embodiment the analysis is generated using a “divide and conquer” algorithm, such that each distributed node analyzes locally stored event data while an aggregating node combines these analysis results to generate the report. In one embodiment, each distributed node also transmits a list of event data references associated with the analysis result to the aggregating node. The aggregating node may then generate a global ordered list of data references based on the list of event data references received from each distributed node. Subsequently, in response to a user selection of a range of global event data, the report may dynamically retrieve event data from one or more distributed nodes for display according to the global order.

Type: Grant

Filed: January 25, 2016

Date of Patent: June 11, 2019

Assignee: SPLUNK INC.

Inventors: Steve Yu Zhang, Stephen Phillip Sorkin
EFFICIENT STORAGE OF APPROXIMATE ORDER STATISTICS OF REAL NUMBERS

Publication number: 20190163721

Abstract: A method, system, and processor-readable storage medium are directed towards calculating approximate order statistics on a collection of real numbers. In one embodiment, the collection of real numbers is processed to create a digest comprising a hierarchy of buckets. Each bucket is assigned a real number N having P digits of precision and ordinality O. The hierarchy is defined by grouping buckets into levels, where each level contains all buckets of a given ordinality. Each individual bucket in the hierarchy defines a range of numbers: all numbers that, after being truncated to that bucket's P digits of precision, are equal to that bucket's N. Each bucket additionally maintains a count of how many numbers have fallen within that bucket's range.

Type: Application

Filed: January 31, 2019

Publication date: May 30, 2019

Inventor: Steve Yu Zhang
INTERACTIVE DISPLAY OF AGGREGATED SEARCH RESULT INFORMATION

Publication number: 20190155811

Abstract: A method, system, and processor-readable storage medium are directed towards generating a report derived from data, such as event data, stored on a plurality of distributed nodes. In one embodiment the analysis is generated using a “divide and conquer” algorithm, such that each distributed node analyzes locally stored event data while an aggregating node combines these analysis results to generate the report. In one embodiment, each distributed node also transmits a list of event data references associated with the analysis result to the aggregating node. The aggregating node may then generate a global ordered list of data references based on the list of event data references received from each distributed node. Subsequently, in response to a user selection of a range of global event data, the report may dynamically retrieve event data from one or more distributed nodes for display according to the global order.

Type: Application

Filed: November 16, 2018

Publication date: May 23, 2019

Inventors: Steve Yu Zhang, Stephen P. Sorkin
Report acceleration using intermediate summaries

Patent number: 10255310

Abstract: A method and system for managing searches of a data set that is partitioned based on a plurality of events. A structure of a search query may be analyzed to determine if logical computational actions performed on the data set are reducible. Data in each partition is analyzed to determine if at least a portion of the data in the partition is reducible. In response to a subsequent or reoccurring search request, intermediate summaries of reducible data and reducible search computations may be aggregated for each partition. Next, a search result may be generated based on at least one of the aggregated intermediate summaries, the aggregated reducible search computations, and a query of ad hoc non-reducible data arranged in at least one of the plurality of partitions for the data set.

Type: Grant

Filed: October 31, 2014

Date of Patent: April 9, 2019

Assignee: SPLUNK INC.

Inventors: Stephen P. Sorkin, Steve Yu Zhang, Ledion Bitincka
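
As a rough illustration of the intermediate-summary idea in the abstract above, the sketch below caches a small mergeable summary per partition and combines the cached summaries when a report is requested again. It is an assumption-laden example rather than the patented method: the `Summary` and `AcceleratedReport` classes, the `bytes` field, and the choice of an average as the reducible computation are all hypothetical.

```python
from dataclasses import dataclass

@dataclass
class Summary:
    """Intermediate summary for one partition (enough to rebuild a mean)."""
    count: int = 0
    total: float = 0.0

    def merge(self, other):
        return Summary(self.count + other.count, self.total + other.total)

def summarize(partition):
    """Scan one partition's events once and reduce them to a Summary."""
    values = [e["bytes"] for e in partition]
    return Summary(len(values), float(sum(values)))

class AcceleratedReport:
    def __init__(self):
        self.cache = {}  # partition id -> Summary

    def average(self, partitions):
        """Average of `bytes` across partitions, reusing cached summaries."""
        merged = Summary()
        for pid, events in partitions.items():
            if pid not in self.cache:
                self.cache[pid] = summarize(events)
            merged = merged.merge(self.cache[pid])
        return merged.total / merged.count if merged.count else None
```

An average is reducible in this sense because it can be rebuilt from per-partition counts and sums; a computation that needs every raw event at once would not fit this pattern.
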
DATA MODEL SELECTION AND APPLICATION BASED ON DATA SOURCES

Publication number: 20190095062

Abstract: Embodiments include generating data models that may give semantic meaning for unstructured or structured data that may include data generated and/or received by search engines, including a time series engine. A method includes generating a data model for data stored in a repository. Generating the data model includes generating an initial query string, executing the initial query string on the data, generating an initial result set based on the initial query string being executed on the data, determining one or more candidate fields from one or more results of the initial result set, generating a candidate data model based on the one or more candidate fields, iteratively modifying the candidate data model until the candidate data model models the data, and using the candidate data model as the data model.

Type: Application

Filed: November 29, 2018

Publication date: March 28, 2019

Inventors: Alice Emily Neels, Archana Sulochana Ganapathi, Marc Vincent Robichaud, Stephen Phillip Sorkin, Steve Yu Zhang
Efficient calculation and organization of approximate order statistics of real numbers

Patent number: 10235345

Abstract: A method, system, and processor-readable storage medium are directed towards calculating approximate order statistics on a collection of real numbers. In one embodiment, the collection of real numbers is processed to create a digest comprising a hierarchy of buckets. Each bucket is assigned a real number N having P digits of precision and ordinality O. The hierarchy is defined by grouping buckets into levels, where each level contains all buckets of a given ordinality. Each individual bucket in the hierarchy defines a range of numbers: all numbers that, after being truncated to that bucket's P digits of precision, are equal to that bucket's N. Each bucket additionally maintains a count of how many numbers have fallen within that bucket's range. Approximate order statistics may then be calculated by traversing the hierarchy and performing an operation on some or all of the ranges and counts associated with each bucket.

Type: Grant

Filed: March 31, 2017

Date of Patent: March 19, 2019

Assignee: Splunk Inc.

Inventor: Steve Yu Zhang
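
The bucket scheme in the abstract above can be sketched in a much simplified form. The example below is an illustration under stated assumptions, not the patented digest: it keeps a single level of buckets instead of a hierarchy, interprets "P digits of precision" as P significant decimal digits, and uses hypothetical names (`truncate`, `Digest`, `quantile`).

```python
from collections import Counter
from decimal import Decimal, ROUND_DOWN

def truncate(x, p):
    """Truncate x toward zero to p significant decimal digits."""
    d = Decimal(str(x))
    if d == 0:
        return Decimal(0)
    exp = d.adjusted() - p + 1  # exponent of the last digit kept
    return d.quantize(Decimal(1).scaleb(exp), rounding=ROUND_DOWN)

class Digest:
    """Single-level digest: one counted bucket per truncated value N."""

    def __init__(self, precision=2):
        self.precision = precision
        self.counts = Counter()  # bucket value N -> count of numbers seen
        self.total = 0

    def add(self, x):
        self.counts[truncate(x, self.precision)] += 1
        self.total += 1

    def quantile(self, q):
        """Approximate q-quantile: walk buckets in order, summing counts."""
        if self.total == 0:
            raise ValueError("digest is empty")
        rank = q * self.total
        seen = 0
        for n in sorted(self.counts):
            seen += self.counts[n]
            if seen >= rank:
                return float(n)
        return float(max(self.counts))
```

With one digit of precision, the integers 1 through 100 collapse into 19 buckets in this sketch, and the answer for any quantile is the truncated value N of the bucket containing that rank.
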
PERIODIC GENERATION OF INTERMEDIATE SUMMARIES TO FACILITATE REPORT ACCELERATION

Publication number: 20190042610

Abstract: A method and system for managing searches of a data set that is partitioned based on a plurality of events. A structure of a search query may be analyzed to determine if logical computational actions performed on the data set are reducible. Data in each partition is analyzed to determine if at least a portion of the data in the partition is reducible. In response to a subsequent or reoccurring search request, intermediate summaries of reducible data and reducible search computations may be aggregated for each partition. Next, a search result may be generated based on at least one of the aggregated intermediate summaries, the aggregated reducible search computations, and a query of ad hoc non-reducible data arranged in at least one of the plurality of partitions for the data set.

Type: Application

Filed: October 9, 2018

Publication date: February 7, 2019

Inventors: Stephen Phillip Sorkin, Steve Yu Zhang, Ledion Bitincka
Data model selection and application based on data sources

Patent number: 10169405

Abstract: Embodiments include generating data models that may give semantic meaning for unstructured or structured data that may include data generated and/or received by search engines, including a time series engine. A method includes generating a data model for data stored in a repository. Generating the data model includes generating an initial query string, executing the initial query string on the data, generating an initial result set based on the initial query string being executed on the data, determining one or more candidate fields from one or more results of the initial result set, generating a candidate data model based on the one or more candidate fields, iteratively modifying the candidate data model until the candidate data model models the data, and using the candidate data model as the data model.

Type: Grant

Filed: January 31, 2017

Date of Patent: January 1, 2019

Assignee: SPLUNK INC.

Inventors: Alice Emily Neels, Archana Sulochana Ganapathi, Marc Vincent Robichaud, Stephen Phillip Sorkin, Steve Yu Zhang
Interactive display of aggregated search result information

Patent number: 10162863

Abstract: A method, system, and processor-readable storage medium are directed towards generating a report derived from data, such as event data, stored on a plurality of distributed nodes. In one embodiment the analysis is generated using a “divide and conquer” algorithm, such that each distributed node analyzes locally stored event data while an aggregating node combines these analysis results to generate the report. In one embodiment, each distributed node also transmits a list of event data references associated with the analysis result to the aggregating node. The aggregating node may then generate a global ordered list of data references based on the list of event data references received from each distributed node. Subsequently, in response to a user selection of a range of global event data, the report may dynamically retrieve event data from one or more distributed nodes for display according to the global order.

Type: Grant

Filed: November 1, 2014

Date of Patent: December 25, 2018

Assignee: SPLUNK INC.

Inventors: Steve Yu Zhang, Stephen P. Sorkin
Determining and providing quantity of unique values existing for a field

Publication number: 20180329915

Abstract: A method, system, and processor-readable storage medium are directed towards generating a report derived from data, such as event data, stored on a plurality of distributed nodes. In one embodiment the analysis is generated using a “divide and conquer” algorithm, such that each distributed node analyzes locally stored event data while an aggregating node combines these analysis results to generate the report. In one embodiment, each distributed node also transmits a list of event data references associated with the analysis result to the aggregating node. The aggregating node may then generate a global ordered list of data references based on the list of event data references received from each distributed node. Subsequently, in response to a user selection of a range of global event data, the report may dynamically retrieve event data from one or more distributed nodes for display according to the global order.

Type: Application

Filed: July 20, 2018

Publication date: November 15, 2018

Inventors: Steve Yu Zhang, Stephen Phillip Sorkin
GENERATING AND STORING SUMMARIZATION TABLES FOR SETS OF SEARCHABLE EVENTS

Publication number: 20180246918

Abstract: Embodiments are directed towards the transparent summarization of events. Queries directed towards summarizing and reporting on event records may be received at a search head. Search heads may be associated with one or more indexers containing event records. The search head may forward the query to the indexers that can resolve the query for concurrent execution. If a query is a collection query, indexers may generate summarization information based on event records located on the indexers. Event record fields included in the summarization information may be determined based on terms included in the collection query. If a query is a stats query, each indexer may generate a partial result set from previously generated summarization information, returning the partial result sets to the search head. Collection queries may be saved and scheduled to run and periodically update the summarization information.

Type: Application

Filed: April 30, 2018

Publication date: August 30, 2018

Inventors: David Ryan Marquardt, Stephen Phillip Sorkin, Steve Yu Zhang
Extracting unique field values from event fields

Patent number: 10061821

Abstract: A method, system, and processor-readable storage medium are directed towards generating a report derived from data, such as event data, stored on a plurality of distributed nodes. In one embodiment the analysis is generated using a “divide and conquer” algorithm, such that each distributed node analyzes locally stored event data while an aggregating node combines these analysis results to generate the report. In one embodiment, each distributed node also transmits a list of event data references associated with the analysis result to the aggregating node. The aggregating node may then generate a global ordered list of data references based on the list of event data references received from each distributed node. Subsequently, in response to a user selection of a range of global event data, the report may dynamically retrieve event data from one or more distributed nodes for display according to the global order.

Type: Grant

Filed: July 31, 2016

Date of Patent: August 28, 2018

Assignee: Splunk Inc.

Inventors: Steve Yu Zhang, Stephen Phillip Sorkin
USING AN INVERTED INDEX IN A PIPELINED SEARCH QUERY TO DETERMINE A SET OF EVENT DATA THAT IS FURTHER LIMITED BY FILTERING AND/OR PROCESSING OF SUBSEQUENT QUERY PIPESTAGES

Publication number: 20180218037

Abstract: Embodiments of the present disclosure provide techniques for using an inverted index in a pipelined search query. A field searchable data store is provided that comprises a plurality of event records, each event record comprising a time-stamped portion of raw machine data. Responsive to the receipt of an incoming search query, the search engine accesses an inverted index, wherein each entry in the inverted index comprises at least one field name, a corresponding at least one field value and a reference value associated with each field name and value pair that identifies a location in the data store where an associated event record is stored. Once the inverted index is accessed, it can be used to filter out a subset of the plurality of event records, wherein the subset comprises one or more event records with corresponding reference values in the inverted index.

Type: Application

Filed: January 31, 2017

Publication date: August 2, 2018

Inventors: David Ryan Marquardt, Karthikeyan Sabhanatarajan, Steve Yu Zhang
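
The inverted-index filtering described in the abstract above might look roughly like the following. This is a minimal sketch under assumptions, not the disclosed system: the event layout (a dictionary with a `fields` mapping), the use of list positions as reference values, and the function names `build_inverted_index` and `search` are all hypothetical.

```python
from collections import defaultdict

def build_inverted_index(events):
    """Map each (field name, field value) pair to the set of reference
    values (here, list positions) of the events that contain it."""
    index = defaultdict(set)
    for ref, event in enumerate(events):
        for name, value in event["fields"].items():
            index[(name, value)].add(ref)
    return index

def search(events, index, **criteria):
    """Use the index to select matching references, then fetch only
    those event records from the store."""
    refs = None
    for pair in criteria.items():
        matches = index.get(pair, set())
        refs = matches if refs is None else refs & matches
    if refs is None:  # no criteria given: every event qualifies
        refs = set(range(len(events)))
    return [events[r] for r in sorted(refs)]
```

Later stages of a pipelined query would then operate on the returned subset rather than on every stored event.
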
Generating and storing summarization tables for sets of searchable events

Patent number: 9990386

Abstract: Embodiments are directed towards the transparent summarization of events. Queries directed towards summarizing and reporting on event records may be received at a search head. Search heads may be associated with one or more indexers containing event records. The search head may forward the query to the indexers that can resolve the query for concurrent execution. If a query is a collection query, indexers may generate summarization information based on event records located on the indexers. Event record fields included in the summarization information may be determined based on terms included in the collection query. If a query is a stats query, each indexer may generate a partial result set from previously generated summarization information, returning the partial result sets to the search head. Collection queries may be saved and scheduled to run and periodically update the summarization information.

Type: Grant

Filed: August 1, 2015

Date of Patent: June 5, 2018

Assignee: SPLUNK INC.

Inventors: David Ryan Marquardt, Stephen Phillip Sorkin, Steve Yu Zhang
REAL-TIME SEARCH TECHNIQUES

Publication number: 20180089289

Abstract: The disclosed embodiments include a method performed by a data intake and query system. The method includes receiving a real-time search query including search criteria, and receiving a stream of metrics, where each metric includes a measured value taken of a computing device. The method further includes filtering the metrics to obtain filtered metrics satisfying the search criteria, creating an in-memory summarization data structure based on the filtered metrics, communicating the summarization data to a search head, and providing search results including the summarization data, where the summarization data or data indicative of the summarization data is displayed on a display of a display device.

Type: Application

Filed: October 31, 2016

Publication date: March 29, 2018

Inventors: Steve Yu Zhang, Ledion Bitincka, Vishal Patel, David E. Simmen
METRICS STORE SYSTEM

Publication number: 20180089290

Abstract: The disclosed embodiments include a method performed by a data intake and query system. The method includes ingesting each metric including at least one key value and a measured value taken of a computing resource, and storing each metric in an index of a metrics store, where the index defines at least one dimension populated with the at least one key value and a measure populated with the measured value. The method further includes cataloging metadata in a metrics catalog, where the metadata is related to the metrics stored in the metrics store, performing an analysis of metrics data included in the metrics store and/or the metrics catalog to obtain results, and causing display of the results or an indication of the results on a display device.

Type: Application

Filed: October 31, 2016

Publication date: March 29, 2018

Inventors: Thomas Allan Haggie, Clint Sharp, Alexander Douglas James, David Ryan Marquardt, Hailun Yan, Christopher Pride, Vishal Patel, Amrittpal Singh Bath, Pratiksha Shah, Murugan Kandaswamy, Steve Yu Zhang, Ledion Bitincka, David E. Simmen, Marc Andre Chene, Esguerra Ma Kharisma, Igor Stojanovski
GENERATING AND STORING SUMMARIZATION TABLES FOR SEARCHABLE EVENTS

Publication number: 20180004785

Abstract: Embodiments are directed towards the transparent summarization of events. Queries directed towards summarizing and reporting on event records may be received at a search head. Search heads may be associated with one or more indexers containing event records. The search head may forward the query to the indexers that can resolve the query for concurrent execution. If a query is a collection query, indexers may generate summarization information based on event records located on the indexers. Event record fields included in the summarization information may be determined based on terms included in the collection query. If a query is a stats query, each indexer may generate a partial result set from previously generated summarization information, returning the partial result sets to the search head. Collection queries may be saved and scheduled to run and periodically update the summarization information.

Type: Application

Filed: September 15, 2017

Publication date: January 4, 2018

Inventors: David Ryan Marquardt, Stephen Phillip Sorkin, Steve Yu Zhang
Associating metadata with results produced by applying a pipelined search command to machine data in timestamped events

Patent number: 9817862

Abstract: Embodiments are directed towards determining and tracking metadata for the generation of visualizations of requested data. A user may request data by providing a query that may be employed to search for the requested data. The query may include a plurality of commands, which may be employed in a pipeline to perform the search and to generate a table of the requested data. In some embodiments, each command may be executed to perform an action on a set of data. The execution of a command may generate one or more columns to append and/or insert into the table of requested data. Metadata for each generated column may be determined based on the actions performed by executing the commands. The table of requested data and the column metadata may be employed to generate and display a visualization of at least a portion of the requested data to a user.

Type: Grant

Filed: August 24, 2015

Date of Patent: November 14, 2017

Assignee: Splunk Inc.

Inventors: Archana Sulochana Ganapathi, Alice Emily Neels, Marc Vincent Robichaud, Stephen Phillip Sorkin, Steve Yu Zhang
Generating and storing summarization tables for searchable events

Patent number: 9817854

Abstract: Embodiments are directed towards the transparent summarization of events. Queries directed towards summarizing and reporting on event records may be received at a search head. Search heads may be associated with one or more indexers containing event records. The search head may forward the query to the indexers that can resolve the query for concurrent execution. If a query is a collection query, indexers may generate summarization information based on event records located on the indexers. Event record fields included in the summarization information may be determined based on terms included in the collection query. If a query is a stats query, each indexer may generate a partial result set from previously generated summarization information, returning the partial result sets to the search head. Collection queries may be saved and scheduled to run and periodically update the summarization information.

Type: Grant

Filed: January 26, 2016

Date of Patent: November 14, 2017

Assignee: SPLUNK INC.

Inventors: David Ryan Marquardt, Stephen Phillip Sorkin, Steve Yu Zhang
