Patents by Inventor Steve Yu Zhang

Steve Yu Zhang has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

Real-time search techniques

Patent number: 11055300

Abstract: The disclosed embodiments include a method performed by a data intake and query system. The method includes receiving a real-time search query including search criteria, and receiving a stream of metrics, where each metric includes a measured value taken of a computing device. The method further includes filtering the metrics to obtain filtered metrics satisfying the search criteria, creating an in-memory summarization data structure based on the filtered metrics, communicating the summarization data to a search head, and providing search results including the summarization data, where the summarization data or data indicative of the summarization data is displayed on a display of a display device.

Type: Grant

Filed: October 31, 2016

Date of Patent: July 6, 2021

Assignee: SPLUNK INC.

Inventors: Steve Yu Zhang, Ledion Bitincka, Vishal Patel, David E. Simmen
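The filter-then-summarize flow in the abstract for patent 11055300 can be sketched roughly as follows. This is a minimal illustration, not the patented method: the function name `summarize_stream`, the metric dictionary keys (`name`, `value`, `host`), and the choice of count/sum/min/max as the summary are all assumptions made for the example.

```python
def summarize_stream(metrics, criteria):
    """Filter a stream of metrics and keep a small in-memory summary per metric name.

    `metrics` is any iterable of dicts; `criteria` is a predicate standing in
    for the search criteria of the real-time query (both hypothetical).
    """
    summary = {}
    for m in metrics:
        if not criteria(m):
            continue  # metric does not satisfy the search criteria
        value = m["value"]
        s = summary.setdefault(
            m["name"], {"count": 0, "sum": 0.0, "min": value, "max": value}
        )
        s["count"] += 1
        s["sum"] += value
        s["min"] = min(s["min"], value)
        s["max"] = max(s["max"], value)
    return summary


stream = [
    {"name": "cpu", "value": 10.0, "host": "a"},
    {"name": "cpu", "value": 30.0, "host": "a"},
    {"name": "cpu", "value": 99.0, "host": "b"},
]
print(summarize_stream(stream, lambda m: m["host"] == "a"))
```

Only the compact summary, not the raw metrics, would need to be sent on to whatever component presents the results.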
IDENTIFYING RELATED FIELD SETS BASED ON RELATED SOURCE TYPES

Publication number: 20210200755

Abstract: Embodiments of the present invention are directed to identifying related data, in particular, data associated with different source types. In embodiments, a first source type related to a second source type associated with a search query is identified. Field set pairs are identified from a first data set associated with the first source type and a second data set associated with the second source type. Each field set pair can include one field set associated with the first source type and another field set associated with the second source type. For each field set pair, an extent of similarity is determined between the corresponding field sets. Based on the extent of similarities between the corresponding field sets, at least one pair of related field sets is identified. An indication of the at least one pair of related field sets is provided, for example, for presentation to a user.

Type: Application

Filed: March 15, 2021

Publication date: July 1, 2021

Inventors: Kristal Lyn Curtis, Archana Sulochana Ganapathi, Adam Oliner, Steve Yu Zhang
Report acceleration using intermediate results in a distributed indexer system

Patent number: 11030173

Abstract: A method and system for managing searches of a data set that is partitioned based on a plurality of events. A structure of a search query may be analyzed to determine if logical computational actions performed on the data set are reducible. Data in each partition is analyzed to determine if at least a portion of the data in the partition is reducible. In response to a subsequent or reoccurring search request, intermediate summaries of reducible data and reducible search computations may be aggregated for each partition. Next, a search result may be generated based on at least one of the aggregated intermediate summaries, the aggregated reducible search computations, and a query of ad hoc non-reducible data arranged in at least one of the plurality of partitions for the data set.

Type: Grant

Filed: July 2, 2020

Date of Patent: June 8, 2021

Assignee: Splunk, Inc.

Inventors: Ledion Bitincka, Stephen Phillip Sorkin, Steve Yu Zhang
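The idea in the abstract for patent 11030173, combining precomputed per-partition summaries with an ad hoc scan of data that has no summary, might look like the sketch below. It assumes the simplest reducible computation (a count); the names `accelerated_count`, `partitions`, and `summaries` are hypothetical and do not come from the patent.

```python
def accelerated_count(partitions, summaries, predicate):
    """Count matching events, reusing intermediate summaries where they exist.

    `partitions` maps a partition id to its list of events; `summaries` maps a
    partition id to a previously computed count for the same predicate.
    Partitions without a summary are scanned ad hoc.
    """
    total = 0
    for pid, events in partitions.items():
        if pid in summaries:
            total += summaries[pid]  # reuse the intermediate summary
        else:
            total += sum(1 for e in events if predicate(e))  # ad hoc scan
    return total


partitions = {
    "p1": [{"status": 500}, {"status": 200}],
    "p2": [{"status": 500}],
    "p3": [{"status": 500}, {"status": 500}],  # not yet summarized
}
summaries = {"p1": 1, "p2": 1}
print(accelerated_count(partitions, summaries, lambda e: e["status"] == 500))
```

A count is reducible because per-partition counts can simply be added; a computation such as an exact median would not combine this way.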
Interactive display of search result information

Patent number: 11003675

Abstract: A method, system, and processor-readable storage medium are directed towards generating a report derived from data, such as event data, stored on a plurality of distributed nodes. In one embodiment the analysis is generated using a “divide and conquer” algorithm, such that each distributed node analyzes locally stored event data while an aggregating node combines these analysis results to generate the report. In one embodiment, each distributed node also transmits a list of event data references associated with the analysis result to the aggregating node. The aggregating node may then generate a global ordered list of data references based on the list of event data references received from each distributed node. Subsequently, in response to a user selection of a range of global event data, the report may dynamically retrieve event data from one or more distributed nodes for display according to the global order.

Type: Grant

Filed: June 27, 2019

Date of Patent: May 11, 2021

Assignee: SPLUNK INC.

Inventors: Steve Yu Zhang, Stephen Phillip Sorkin
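The "divide and conquer" reporting in the abstract for patent 11003675 can be illustrated with a small sketch. It assumes each event carries a `time` field, that an event reference is a `(time, node_id, local_offset)` tuple, and that the global order is ascending time; these choices and the function names are hypothetical.

```python
import heapq


def analyze_node(node_id, events, predicate):
    """Run the analysis locally; return a count plus sorted event references."""
    refs = sorted(
        (e["time"], node_id, i) for i, e in enumerate(events) if predicate(e)
    )
    return len(refs), refs


def aggregate(partials):
    """Combine per-node results into a total and a globally ordered reference list."""
    total = sum(count for count, _ in partials)
    ordered = list(heapq.merge(*(refs for _, refs in partials)))
    return total, ordered


def fetch_range(nodes, ordered, start, stop):
    """Retrieve the events for a selected slice of the global order."""
    return [nodes[node_id][i] for _, node_id, i in ordered[start:stop]]


nodes = {
    "a": [{"time": 1, "msg": "boot"}, {"time": 4, "msg": "halt"}],
    "b": [{"time": 2, "msg": "login"}, {"time": 3, "msg": "logout"}],
}
partials = [analyze_node(n, evs, lambda e: True) for n, evs in nodes.items()]
total, ordered = aggregate(partials)
print(total, fetch_range(nodes, ordered, 1, 3))
```

The aggregating step holds only references, so full event data is fetched from the nodes only for the range a user actually selects.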
Identifying similar field sets using related source types

Patent number: 10949420

Abstract: Embodiments of the present invention are directed to identifying related data, in particular, data associated with different source types. In embodiments, a first source type related to a second source type associated with a search query is identified. Field set pairs are identified from a first data set associated with the first source type and a second data set associated with the second source type. Each field set pair can include one field set associated with the first source type and another field set associated with the second source type. For each field set pair, an extent of similarity is determined between the corresponding field sets. Based on the extent of similarities between the corresponding field sets, at least one pair of related field sets is identified. An indication of the at least one pair of related field sets is provided, for example, for presentation to a user.

Type: Grant

Filed: July 31, 2018

Date of Patent: March 16, 2021

Assignee: SPLUNK Inc.

Inventors: Kristal Lyn Curtis, Archana Sulochana Ganapathi, Adam Oliner, Steve Yu Zhang
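The pairwise comparison in the abstract for patent 10949420 could be sketched as below. The abstract does not name a similarity measure or say how a field set is represented, so this example assumes each field set is the set of values observed for a field and uses Jaccard similarity with an arbitrary threshold; all names are hypothetical.

```python
from itertools import product


def jaccard(a, b):
    """Jaccard similarity of two collections, treated as sets."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def related_field_sets(first, second, threshold=0.5):
    """Score every (field in first source type, field in second) pair.

    `first` and `second` map field names to the values seen for that field.
    Pairs scoring at or above `threshold` are returned, best first.
    """
    pairs = []
    for (f1, v1), (f2, v2) in product(first.items(), second.items()):
        score = jaccard(v1, v2)
        if score >= threshold:
            pairs.append((f1, f2, score))
    return sorted(pairs, key=lambda p: p[2], reverse=True)


web = {"clientip": {"10.0.0.1", "10.0.0.2", "10.0.0.3"}, "status": {"200", "404"}}
firewall = {"src": {"10.0.0.1", "10.0.0.2", "10.0.0.4"}, "action": {"allow", "deny"}}
print(related_field_sets(web, firewall))
```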
Providing interactive search results from a distributed search system

Patent number: 10860592

Abstract: A method, system, and processor-readable storage medium are directed towards generating a report derived from data, such as event data, stored on a plurality of distributed nodes. In one embodiment the analysis is generated using a “divide and conquer” algorithm, such that each distributed node analyzes locally stored event data while an aggregating node combines these analysis results to generate the report. In one embodiment, each distributed node also transmits a list of event data references associated with the analysis result to the aggregating node. The aggregating node may then generate a global ordered list of data references based on the list of event data references received from each distributed node. Subsequently, in response to a user selection of a range of global event data, the report may dynamically retrieve event data from one or more distributed nodes for display according to the global order.

Type: Grant

Filed: April 29, 2019

Date of Patent: December 8, 2020

Assignee: SPLUNK INC.

Inventors: Steve Yu Zhang, Stephen Phillip Sorkin
Server-side interactive search results

Patent number: 10860591

Abstract: A method, system, and processor-readable storage medium are directed towards generating a report derived from data, such as event data, stored on a plurality of distributed nodes. In one embodiment the analysis is generated using a “divide and conquer” algorithm, such that each distributed node analyzes locally stored event data while an aggregating node combines these analysis results to generate the report. In one embodiment, each distributed node also transmits a list of event data references associated with the analysis result to the aggregating node. The aggregating node may then generate a global ordered list of data references based on the list of event data references received from each distributed node. Subsequently, in response to a user selection of a range of global event data, the report may dynamically retrieve event data from one or more distributed nodes for display according to the global order.

Type: Grant

Filed: November 16, 2018

Date of Patent: December 8, 2020

Assignee: SPLUNK INC.

Inventors: Steve Yu Zhang, Stephen P. Sorkin
Periodic generation of intermediate summaries to facilitate report acceleration

Patent number: 10719493

Abstract: A method and system for managing searches of a data set that is partitioned based on a plurality of events. A structure of a search query may be analyzed to determine if logical computational actions performed on the data set are reducible. Data in each partition is analyzed to determine if at least a portion of the data in the partition is reducible. In response to a subsequent or reoccurring search request, intermediate summaries of reducible data and reducible search computations may be aggregated for each partition. Next, a search result may be generated based on at least one of the aggregated intermediate summaries, the aggregated reducible search computations, and a query of ad hoc non-reducible data arranged in at least one of the plurality of partitions for the data set.

Type: Grant

Filed: October 9, 2018

Date of Patent: July 21, 2020

Assignee: SPLUNK INC.

Inventors: Stephen Phillip Sorkin, Steve Yu Zhang, Ledion Bitincka
Query handling using summarization tables

Patent number: 10685001

Abstract: Embodiments are directed towards the transparent summarization of events. Queries directed towards summarizing and reporting on event records may be received at a search head. Search heads may be associated with one or more indexers containing event records. The search head may forward the query to the indexers that can resolve the query for concurrent execution. If a query is a collection query, indexers may generate summarization information based on event records located on the indexers. Event record fields included in the summarization information may be determined based on terms included in the collection query. If a query is a stats query, each indexer may generate a partial result set from previously generated summarization information, returning the partial result sets to the search head. Collection queries may be saved and scheduled to run and periodically update the summarization information.

Type: Grant

Filed: April 30, 2018

Date of Patent: June 16, 2020

Assignee: Splunk Inc.

Inventors: David Ryan Marquardt, Stephen Phillip Sorkin, Steve Yu Zhang
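The split between collection queries and stats queries in the abstract for patent 10685001 can be sketched as follows. The example assumes the summarization information is a per-indexer tally of the values of one field and that the stats query is a count by value; the function names `collect` and `stats_count` are hypothetical.

```python
from collections import Counter


def collect(indexer_events, field):
    """Collection query: summarize one field from the events held on one indexer."""
    return Counter(e[field] for e in indexer_events if field in e)


def stats_count(partial_results):
    """Stats query: merge the per-indexer partial results at the search head."""
    total = Counter()
    for partial in partial_results:
        total.update(partial)
    return total


indexer_1 = [{"status": "200"}, {"status": "500"}, {"status": "200"}]
indexer_2 = [{"status": "500"}, {"host": "x"}]

# Each indexer builds its summarization information independently.
summaries = [collect(indexer_1, "status"), collect(indexer_2, "status")]
print(stats_count(summaries))
```

Because each summary depends only on local events, the collection step can run on all indexers concurrently and be refreshed on a schedule.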
PROVIDING SIMILAR FIELD SETS BASED ON RELATED SOURCE TYPES

Publication number: 20200042651

Abstract: Embodiments of the present invention are directed to identifying and providing related data field sets. In one embodiment, a first portion of a graphical user interface (GUI) configured to receive a search query is displayed. The GUI enables user interaction to specify a source type in association with the search query. In accordance with a first source type specified in the search query, a first field set associated with the first source type is identified as related to a second field set associated with a second source type. A second portion of the GUI is displayed that includes a relationship indication that indicates the first field set associated with the first source type is related to the second field set associated with a second source type. Further, a third portion of the GUI is displayed that includes an explanation or recommendation associated with the relationship indication.

Type: Application

Filed: July 31, 2018

Publication date: February 6, 2020

Inventors: Kristal Lyn Curtis, Archana Sulochana Ganapathi, Adam Oliner, Steve Yu Zhang
IDENTIFYING SIMILAR FIELD SETS USING RELATED SOURCE TYPES

Publication number: 20200042626

Abstract: Embodiments of the present invention are directed to identifying related data, in particular, data associated with different source types. In embodiments, a first source type related to a second source type associated with a search query is identified. Field set pairs are identified from a first data set associated with the first source type and a second data set associated with the second source type. Each field set pair can include one field set associated with the first source type and another field set associated with the second source type. For each field set pair, an extent of similarity is determined between the corresponding field sets. Based on the extent of similarities between the corresponding field sets, at least one pair of related field sets is identified. An indication of the at least one pair of related field sets is provided, for example, for presentation to a user.

Type: Application

Filed: July 31, 2018

Publication date: February 6, 2020

Inventors: Kristal Lyn Curtis, Archana Sulochana Ganapathi, Adam Oliner, Steve Yu Zhang
PIPELINED SEARCH QUERY, LEVERAGING REFERENCE VALUES OF AN INVERTED INDEX TO DETERMINE A SET OF EVENT DATA AND PERFORMING FURTHER QUERIES ON THE EVENT DATA

Publication number: 20200034363

Abstract: Embodiments of the present disclosure provide techniques for using an inverted index in a pipelined search query. A field searchable data store is provided that comprises a plurality of event records, each event record comprising a time-stamped portion of raw machine data. Responsive to the receipt of an incoming search query, the search engine accesses an inverted index, wherein each entry in the inverted index comprises at least one field name, a corresponding at least one field value and a reference value associated with each field name and value pair that identifies a location in the data store where an associated event record is stored. Once the inverted index is accessed, it can be used to identify and search a subset of the plurality of event records, wherein the subset comprises one or more event records with corresponding reference values in the inverted index.

Type: Application

Filed: October 2, 2019

Publication date: January 30, 2020

Inventors: David Ryan Marquardt, Karthikeyan Sabhanatarajan, Steve Yu Zhang
Efficient storage of approximate order statistics of real numbers

Patent number: 10496731

Abstract: A method, system, and processor-readable storage medium are directed towards calculating approximate order statistics on a collection of real numbers. In one embodiment, the collection of real numbers is processed to create a digest comprising a hierarchy of buckets. Each bucket is assigned a real number N having P digits of precision and ordinality O. The hierarchy is defined by grouping buckets into levels, where each level contains all buckets of a given ordinality. Each individual bucket in the hierarchy defines a range of numbers: all numbers that, after being truncated to that bucket's P digits of precision, are equal to that bucket's N. Each bucket additionally maintains a count of how many numbers have fallen within that bucket's range. Approximate order statistics may then be calculated by traversing the hierarchy and performing an operation on some or all of the ranges and counts associated with each bucket.

Type: Grant

Filed: January 31, 2019

Date of Patent: December 3, 2019

Assignee: Splunk Inc.

Inventor: Steve Yu Zhang
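The bucket idea in the abstract for patent 10496731 can be illustrated with a deliberately simplified sketch. It keeps a single level of buckets, each keyed by a number truncated to P significant digits and holding a count, and it omits the ordinality-based hierarchy entirely. Interpreting "P digits of precision" as significant digits is an assumption, as are the function names.

```python
from collections import Counter
from decimal import Decimal, ROUND_DOWN


def bucket_key(x, p):
    """Truncate x toward zero to p significant digits; the result is the bucket's N."""
    d = Decimal(repr(x))
    if d == 0:
        return Decimal(0)
    # adjusted() is the exponent of the most significant digit.
    quantum = Decimal(1).scaleb(d.adjusted() - p + 1)
    return d.quantize(quantum, rounding=ROUND_DOWN)


def build_digest(values, p):
    """Map each bucket's N to the count of values that fell in its range."""
    return Counter(bucket_key(v, p) for v in values)


def approx_quantile(digest, q):
    """Walk buckets in order and return the N of the bucket containing quantile q."""
    target = q * sum(digest.values())
    seen = 0
    for key in sorted(digest):
        seen += digest[key]
        if seen >= target:
            return key
    raise ValueError("empty digest")


digest = build_digest(range(1, 101), p=1)
print(len(digest), approx_quantile(digest, 0.5))
```

With one digit of precision, the values 1 through 100 collapse into 19 buckets, and the approximate median comes back as the lower edge of the bucket that holds the true median.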
Using an inverted index in a pipelined search query to determine a set of event data that is further limited by filtering and/or processing of subsequent query pipestages

Patent number: 10474674

Abstract: Embodiments of the present disclosure provide techniques for using an inverted index in a pipelined search query. A field searchable data store is provided that comprises a plurality of event records, each event record comprising a time-stamped portion of raw machine data. Responsive to the receipt of an incoming search query, the search engine accesses an inverted index, wherein each entry in the inverted index comprises at least one field name, a corresponding at least one field value and a reference value associated with each field name and value pair that identifies a location in the data store where an associated event record is stored. Once the inverted index is accessed, it can be used to filter out a subset of the plurality of event records, wherein the subset comprises one or more event records with corresponding reference values in the inverted index.

Type: Grant

Filed: January 31, 2017

Date of Patent: November 12, 2019

Assignee: SPLUNK INC.

Inventors: David Ryan Marquardt, Karthikeyan Sabhanatarajan, Steve Yu Zhang
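The inverted-index lookup in the abstract for patent 10474674 might be sketched like this. The example assumes the data store is a Python list, a reference value is a list position, and later pipeline stages are plain predicates; the event layout and function names are hypothetical.

```python
from collections import defaultdict


def build_inverted_index(events):
    """Map each (field name, field value) pair to the references of matching events."""
    index = defaultdict(list)
    for ref, event in enumerate(events):
        for field, value in event["fields"].items():
            index[(field, value)].append(ref)
    return index


def pipelined_search(events, index, field, value, stages=()):
    """Use the index to pick a subset of events, then run later stages on it."""
    subset = [events[ref] for ref in index.get((field, value), [])]
    for stage in stages:
        subset = [e for e in subset if stage(e)]
    return subset


events = [
    {"_time": 100, "_raw": "GET /a 200", "fields": {"status": "200", "host": "web1"}},
    {"_time": 101, "_raw": "GET /b 500", "fields": {"status": "500", "host": "web1"}},
    {"_time": 102, "_raw": "GET /c 500", "fields": {"status": "500", "host": "web2"}},
]
index = build_inverted_index(events)
hits = pipelined_search(
    events, index, "status", "500",
    stages=[lambda e: e["fields"]["host"] == "web2"],
)
print([e["_raw"] for e in hits])
```

Only the events referenced by the index entry are read from the store, so later stages never touch records that the first stage already excluded.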
PARALLELIZATION OF COLLECTION QUERIES

Publication number: 20190332590

Abstract: Embodiments are directed towards the parallelization of collection queries. A method of parallelizing collection queries comprises providing a field searchable data store comprising a plurality of field searchable time stamped event records. The method further comprises receiving, at a search head, a collection query that references a field name that identifies portions of one or more event records to be summarized. Further, the method comprises determining if the collection query can be concurrently executed on a first plurality of indexers, wherein the search head is configured to communicate with the first plurality of indexers, and wherein each indexer of the first plurality of indexers comprises one or more field searchable time stamped event records. Responsive to an affirmative determination, the method also comprises determining a second plurality of indexers relevant to the collection query and executing the collection query to generate a respective summarization table at each indexer.

Type: Application

Filed: June 25, 2019

Publication date: October 31, 2019

Inventors: David Ryan Marquardt, Stephen Phillip Sorkin, Steve Yu Zhang
INTERACTIVE DISPLAY OF SEARCH RESULT INFORMATION

Publication number: 20190317943

Abstract: A method, system, and processor-readable storage medium are directed towards generating a report derived from data, such as event data, stored on a plurality of distributed nodes. In one embodiment the analysis is generated using a “divide and conquer” algorithm, such that each distributed node analyzes locally stored event data while an aggregating node combines these analysis results to generate the report. In one embodiment, each distributed node also transmits a list of event data references associated with the analysis result to the aggregating node. The aggregating node may then generate a global ordered list of data references based on the list of event data references received from each distributed node. Subsequently, in response to a user selection of a range of global event data, the report may dynamically retrieve event data from one or more distributed nodes for display according to the global order.

Type: Application

Filed: June 27, 2019

Publication date: October 17, 2019

Inventors: Steve Yu Zhang, Stephen Phillip Sorkin
Collection query driven generation of summarization information for raw machine data

Patent number: 10387396

Abstract: Embodiments are directed towards the transparent summarization of events. Queries directed towards summarizing and reporting on event records may be received at a search head. Search heads may be associated with one or more indexers containing event records. The search head may forward the query to the indexers that can resolve the query for concurrent execution. If a query is a collection query, indexers may generate summarization information based on event records located on the indexers. Event record fields included in the summarization information may be determined based on terms included in the collection query. If a query is a stats query, each indexer may generate a partial result set from previously generated summarization information, returning the partial result sets to the search head. Collection queries may be saved and scheduled to run and periodically update the summarization information.

Type: Grant

Filed: September 15, 2017

Date of Patent: August 20, 2019

Assignee: SPLUNK INC.

Inventors: David Ryan Marquardt, Stephen Phillip Sorkin, Steve Yu Zhang
Providing Interactive Search Results From a Distributed Search System

Publication number: 20190251092

Abstract: A method, system, and processor-readable storage medium are directed towards generating a report derived from data, such as event data, stored on a plurality of distributed nodes. In one embodiment the analysis is generated using a “divide and conquer” algorithm, such that each distributed node analyzes locally stored event data while an aggregating node combines these analysis results to generate the report. In one embodiment, each distributed node also transmits a list of event data references associated with the analysis result to the aggregating node. The aggregating node may then generate a global ordered list of data references based on the list of event data references received from each distributed node. Subsequently, in response to a user selection of a range of global event data, the report may dynamically retrieve event data from one or more distributed nodes for display according to the global order.

Type: Application

Filed: April 29, 2019

Publication date: August 15, 2019

Inventors: Steve Yu Zhang, Stephen Phillip Sorkin
DETERMINING INDICATIONS OF UNIQUE VALUES FOR FIELDS

Publication number: 20190251091

Abstract: A method, system, and processor-readable storage medium are directed towards generating a report derived from data, such as event data, stored on a plurality of distributed nodes. In one embodiment the analysis is generated using a “divide and conquer” algorithm, such that each distributed node analyzes locally stored event data while an aggregating node combines these analysis results to generate the report. In one embodiment, each distributed node also transmits a list of event data references associated with the analysis result to the aggregating node. The aggregating node may then generate a global ordered list of data references based on the list of event data references received from each distributed node. Subsequently, in response to a user selection of a range of global event data, the report may dynamically retrieve event data from one or more distributed nodes for display according to the global order.

Type: Application

Filed: April 26, 2019

Publication date: August 15, 2019

Inventors: Steve Yu Zhang, Stephen Phillip Sorkin
Interactive display of search result information

Patent number: 10380122

Abstract: A method, system, and processor-readable storage medium are directed towards generating a report derived from data, such as event data, stored on a plurality of distributed nodes. In one embodiment the analysis is generated using a “divide and conquer” algorithm, such that each distributed node analyzes locally stored event data while an aggregating node combines these analysis results to generate the report. In one embodiment, each distributed node also transmits a list of event data references associated with the analysis result to the aggregating node. The aggregating node may then generate a global ordered list of data references based on the list of event data references received from each distributed node. Subsequently, in response to a user selection of a range of global event data, the report may dynamically retrieve event data from one or more distributed nodes for display according to the global order.

Type: Grant

Filed: October 31, 2014

Date of Patent: August 13, 2019

Assignee: SPLUNK INC.

Inventors: Steve Yu Zhang, Stephen Phillip Sorkin
