Patents by Inventor Steve Yu Zhang

Steve Yu Zhang has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

RETRIEVING DATA IDENTIFIERS FROM QUEUE FOR SEARCH OF EXTERNAL DATA SYSTEM

Publication number: 20250103604

Abstract: A computing device can receive a query that identifies a set of data to be processed and determine that a portion of the set of data resides in an external data system. The query system can request data identifiers associated with data objects of the set of data from the external data system and communicate the data identifiers to a data queue. The computing device can instruct one or more search nodes to retrieve the identifiers from the data queue. The search nodes can use the data identifiers to retrieve data objects from the external data system and process the data objects according to instructions received from the computing device. The search nodes can provide results of the processing to the computing device.

Type: Application

Filed: June 20, 2024

Publication date: March 27, 2025

Inventors: Alexandros Batsakis, Nitilaksha Satyaveera Halakatti, Ningxuan He, Prem Kumar Jayaraj, Manuel Gregorio Martinez, Balaji Rao, Jianming Zhang, Steve Yu Zhang
Retrieving data identifiers from queue for search of external data system

Patent number: 12093272

Abstract: A computing device can receive a query that identifies a set of data to be processed and determine that a portion of the set of data resides in an external data system. The query system can request data identifiers associated with data objects of the set of data from the external data system and communicate the data identifiers to a data queue. The computing device can instruct one or more search nodes to retrieve the identifiers from the data queue. The search nodes can use the data identifiers to retrieve data objects from the external data system and process the data objects according to instructions received from the computing device. The search nodes can provide results of the processing to the computing device.

Type: Grant

Filed: April 29, 2022

Date of Patent: September 17, 2024

Assignee: Splunk Inc.

Inventors: Alexandros Batsakis, Nitilaksha Satyaveera Halakatti, Ningxuan He, Prem Kumar Jayaraj, Manuel Gregorio Martinez, Balaji Rao, Jianming Zhang, Steve Yu Zhang
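
Illustrative note: the abstract above describes placing data identifiers on a queue for search nodes to pull and process. The following is only a minimal single-process sketch of that general idea, not the patented implementation. The names external_store, enqueue_identifiers, and search_node are hypothetical, and the external data system is assumed to be a plain in-memory dictionary.

```python
from queue import Queue, Empty

# Hypothetical stand-in for an external data system: identifier -> data object.
external_store = {
    "obj-1": "error disk full",
    "obj-2": "ok",
    "obj-3": "error timeout",
}

def enqueue_identifiers(store, q):
    """Put the identifier of every data object onto the shared queue."""
    for object_id in store:
        q.put(object_id)

def search_node(store, q, term):
    """Pull identifiers until the queue is empty, fetch each object,
    and keep the identifiers whose object contains `term`."""
    matches = []
    while True:
        try:
            object_id = q.get_nowait()
        except Empty:
            return matches
        if term in store[object_id]:
            matches.append(object_id)

q = Queue()
enqueue_identifiers(external_store, q)
results = search_node(external_store, q, "error")
```

In a real deployment several search nodes would drain the same queue concurrently; a single consumer is used here to keep the sketch short.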
Federated data enrichment objects

Patent number: 12072939

Abstract: A data intake and query system can generate local data enrichment objects and receive federated data enrichment objects from another data intake and query system. In response to receiving a query, the data intake and query system can determine whether the query is a subquery of a federated query. If the query is a subquery, the data intake and query system can use the federated data enrichment objects to execute the query.

Type: Grant

Filed: January 31, 2022

Date of Patent: August 27, 2024

Assignee: Splunk Inc.

Inventors: Alexandros Batsakis, Nir Frenkel, Nitilaksha Halakatti, Balaji Rao, Anish Shrigondekar, Ruochen Zhang, Steve Yu Zhang
Generating a query response by combining partial results from separate partitions of event records

Patent number: 12066995

Abstract: Embodiments are directed towards a method for generating a query response, which comprises creating two or more partitions of event records from raw data stored in a data store, wherein each event record in the two or more partitions of event records includes a portion of the raw data and is associated with a time stamp derived from the raw data. The method also comprises generating a summarization table for each partition of the two or more partitions that: (a) identifies a field value comprising a value that corresponds to an associated field extracted from a respective event record; and (b) for the field value, includes a posting value to the respective event record within a respective partition. The method further comprises generating partial results for a received query using summarization tables in the partitions and generating a response to the query by combining the partial results.

Type: Grant

Filed: September 23, 2021

Date of Patent: August 20, 2024

Assignee: SPLUNK INC.

Inventors: David Ryan Marquardt, Stephen Phillip Sorkin, Steve Yu Zhang
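
Illustrative note: the abstract above describes per-partition summarization tables whose partial results are combined into one response. A minimal sketch of that general pattern follows, assuming events are dictionaries of already extracted fields and that the query is a simple count. build_summary and partial_count are hypothetical names, not terms from the patent.

```python
from collections import defaultdict

def build_summary(partition, field):
    """Map each value of `field` to the positions (postings) of the
    event records in this partition that hold that value."""
    table = defaultdict(list)
    for pos, event in enumerate(partition):
        if field in event:
            table[event[field]].append(pos)
    return table

def partial_count(table, value):
    """Partial result for one partition: number of postings for `value`."""
    return len(table.get(value, []))

# Two partitions of event records (fields already extracted).
partitions = [
    [{"status": "200"}, {"status": "404"}],
    [{"status": "404"}, {"status": "404"}, {"status": "500"}],
]

summaries = [build_summary(p, "status") for p in partitions]
partials = [partial_count(t, "404") for t in summaries]
total = sum(partials)  # combine the partial results into the response
```

The count is answered from the tables alone, so the event records themselves are not rescanned at query time.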
Intelligent search-time determination and usage of fields extracted at index-time

Patent number: 12038926

Abstract: A computer-implemented method of determining indexed fields at query time comprises mapping data from a first source type to indexed fields in batch form using a wildcard specifier. The method also comprises receiving a query to execute on a data set comprising data from the first source type and data from a second source type. Further, the method comprises transforming the query to execute on the data from the first source type separately from the data from the second source type. Additionally, the method comprises executing the query to operate on the data from the first source type using information associated with the indexed fields and to separately operate on the data from the second source type.

Type: Grant

Filed: January 29, 2021

Date of Patent: July 16, 2024

Assignee: SPLUNK INC.

Inventors: Jay A. Pathak, Steve Yu Zhang
Pipelined search query, leveraging reference values of an inverted index to access a set of event data and performing further queries on associated raw data

Patent number: 11977544

Abstract: Embodiments of the present disclosure provide techniques for using an inverted index in a pipelined search query. A field searchable data store is provided that comprises a plurality of event records, each event record comprising a time-stamped portion of raw machine data. Responsive to the receipt of an incoming search query, the search engine accesses an inverted index, wherein each entry in the inverted index comprises at least one field name, a corresponding at least one field value and a reference value associated with each field name and value pair that identifies a location in the data store where an associated event record is stored. Once the inverted index is accessed, it can be used to identify and search a subset of the plurality of event records, wherein the subset comprises one or more event records with corresponding reference values in the inverted index.

Type: Grant

Filed: July 28, 2022

Date of Patent: May 7, 2024

Assignee: SPLUNK INC.

Inventors: David Ryan Marquardt, Karthikeyan Sabhanatarajan, Steve Yu Zhang
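
Illustrative note: the abstract above describes an inverted index whose entries pair a field name and value with reference values locating event records, followed by a further search over the referenced subset. The sketch below shows that general idea only. It assumes the data store is a Python list and that a reference value is a list position; build_inverted_index and pipelined_search are hypothetical names.

```python
def build_inverted_index(events):
    """Map each (field name, field value) pair to the reference values
    (here: list positions) of the event records that contain it."""
    index = {}
    for ref, event in enumerate(events):
        for name, value in event["fields"].items():
            index.setdefault((name, value), []).append(ref)
    return index

def pipelined_search(events, index, name, value, raw_term):
    """Stage 1: use the index to pick the subset of events for name=value.
    Stage 2: search only that subset's raw data for `raw_term`."""
    refs = index.get((name, value), [])
    return [events[r]["raw"] for r in refs if raw_term in events[r]["raw"]]

events = [
    {"raw": "2024-01-01T00:00:00 GET /a 200", "fields": {"status": "200"}},
    {"raw": "2024-01-01T00:00:01 GET /b 404", "fields": {"status": "404"}},
    {"raw": "2024-01-01T00:00:02 POST /a 200", "fields": {"status": "200"}},
]

index = build_inverted_index(events)
hits = pipelined_search(events, index, "status", "200", "POST")
```

Only the two events referenced under ("status", "200") are examined in the second stage; the 404 event is never touched.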
Generating search results based on intermediate summaries

Patent number: 11914562

Abstract: A method and system for managing searches of a data set that is partitioned based on a plurality of events. A structure of a search query may be analyzed to determine if logical computational actions performed on the data set are reducible. Data in each partition is analyzed to determine if at least a portion of the data in the partition is reducible. In response to a subsequent or reoccurring search request, intermediate summaries of reducible data and reducible search computations may be aggregated for each partition. Next, a search result may be generated based on at least one of the aggregated intermediate summaries, the aggregated reducible search computations, and a query of ad hoc non-reducible data arranged in at least one of the plurality of partitions for the data set.

Type: Grant

Filed: February 8, 2023

Date of Patent: February 27, 2024

Assignee: SPLUNK INC.

Inventors: Ledion Bitincka, Stephen Phillip Sorkin, Steve Yu Zhang
Data model selection and application based on data sources

Patent number: 11893010

Abstract: Embodiments include generating data models that may give semantic meaning for unstructured or structured data that may include data generated and/or received by search engines, including a time series engine. A method includes generating a data model for data stored in a repository. Generating the data model includes generating an initial query string, executing the initial query string on the data, generating an initial result set based on the initial query string being executed on the data, determining one or more candidate fields from one or more results of the initial result set, generating a candidate data model based on the one or more candidate fields, iteratively modifying the candidate data model until the candidate data model models the data, and using the candidate data model as the data model.

Type: Grant

Filed: May 2, 2022

Date of Patent: February 6, 2024

Assignee: SPLUNK INC.

Inventors: Alice Emily Neels, Archana Sulochana Ganapathi, Marc Vincent Robichaud, Stephen Phillip Sorkin, Steve Yu Zhang
Tracking event records across multiple search sessions

Patent number: 11860881

Abstract: A method, system, and processor-readable storage medium are directed towards generating a report derived from data, such as event data, stored on a plurality of distributed nodes. In one embodiment the analysis is generated using a “divide and conquer” algorithm, such that each distributed node analyzes locally stored event data while an aggregating node combines these analysis results to generate the report. In one embodiment, each distributed node also transmits a list of event data references associated with the analysis result to the aggregating node. The aggregating node may then generate a global ordered list of data references based on the list of event data references received from each distributed node. Subsequently, in response to a user selection of a range of global event data, the report may dynamically retrieve event data from one or more distributed nodes for display according to the global order.

Type: Grant

Filed: November 15, 2021

Date of Patent: January 2, 2024

Assignee: Splunk Inc.

Inventors: Steve Yu Zhang, Stephen Phillip Sorkin
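
Illustrative note: the abstract above describes distributed nodes that each analyze local event data and return event references, which an aggregating node merges into a global ordered list. The sketch below illustrates that divide-and-conquer pattern under stated assumptions: the analysis is a simple count, the global order is by timestamp, and a reference is a (timestamp, node id, local offset) tuple. node_analyze and fetch are hypothetical names.

```python
import heapq

def node_analyze(node_id, events):
    """Run on each distributed node: return a local analysis result (a count)
    and a timestamp-sorted list of references to the local events."""
    count = len(events)
    refs = sorted((ts, node_id, i) for i, (ts, _) in enumerate(events))
    return count, refs

# Locally stored events per node, as (timestamp, payload) pairs.
nodes = {
    "n1": [(3, "c"), (1, "a")],
    "n2": [(2, "b"), (4, "d")],
}

# Aggregating node: combine the analysis results and merge the reference lists.
results = [node_analyze(nid, evs) for nid, evs in nodes.items()]
total = sum(count for count, _ in results)
global_refs = list(heapq.merge(*(refs for _, refs in results)))

def fetch(refs):
    """Retrieve the event payloads for a selected range of global references."""
    return [nodes[nid][i][1] for _, nid, i in refs]

page = fetch(global_refs[1:3])  # user selects the 2nd and 3rd events globally
```

Because the aggregator holds only references, event payloads are pulled from the owning nodes only when a range is actually selected.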
Identifying related field sets based on related source types

Patent number: 11841853

Abstract: Embodiments of the present invention are directed to identifying related data, in particular, data associated with different source types. In embodiments, a first source type related to a second source type associated with a search query is identified. Field set pairs are identified from a first data set associated with the first source type and a second data set associated with the second source type. Each field set pair can include one field set associated with the first source type and another field set associated with the second source type. For each field set pair, an extent of similarity is determined between the corresponding field sets. Based on the extent of similarities between the corresponding field sets, at least one pair of related field sets is identified. An indication of the at least one pair of related field sets is provided, for example, for presentation to a user.

Type: Grant

Filed: March 15, 2021

Date of Patent: December 12, 2023

Assignee: Splunk Inc.

Inventors: Kristal Lyn Curtis, Archana Sulochana Ganapathi, Adam Oliner, Steve Yu Zhang
Storing indexed fields per source type as metadata at the bucket level to facilitate search-time field learning

Patent number: 11836146

Abstract: A computer-implemented method of determining indexed fields at query time comprises indexing time-stamped events ingested from a plurality of source types. The time-stamped searchable events comprise portions of raw data. The method also comprises generating an index containing each keyword in the time-stamped searchable events and an associated location reference of a respective event in which the keyword appears. Further, the method comprises generating a fields metadata file identifying indexed fields in the time-stamped searchable events for each source type. The fields metadata file comprises reference values for accessing indexed fields associated with each source type from the index. The method also comprises accessing the fields metadata file to identify the indexed fields associated with each source type prior to executing a query.

Type: Grant

Filed: January 29, 2021

Date of Patent: December 5, 2023

Assignee: SPLUNK INC.

Inventors: Jay A. Pathak, Steve Yu Zhang
Query acceleration using intermediate summaries

Patent number: 11604779

Abstract: A method and system for managing searches of a data set that is partitioned based on a plurality of events. A structure of a search query may be analyzed to determine if logical computational actions performed on the data set are reducible. Data in each partition is analyzed to determine if at least a portion of the data in the partition is reducible. In response to a subsequent or reoccurring search request, intermediate summaries of reducible data and reducible search computations may be aggregated for each partition. Next, a search result may be generated based on at least one of the aggregated intermediate summaries, the aggregated reducible search computations, and a query of ad hoc non-reducible data arranged in at least one of the plurality of partitions for the data set.

Type: Grant

Filed: May 10, 2021

Date of Patent: March 14, 2023

Assignee: SPLUNK INC.

Inventors: Ledion Bitincka, Stephen Phillip Sorkin, Steve Yu Zhang
PIPELINED SEARCH QUERY, LEVERAGING REFERENCE VALUES OF AN INVERTED INDEX TO ACCESS A SET OF EVENT DATA AND PERFORMING FURTHER QUERIES ON ASSOCIATED RAW DATA

Publication number: 20220365932

Abstract: Embodiments of the present disclosure provide techniques for using an inverted index in a pipelined search query. A field searchable data store is provided that comprises a plurality of event records, each event record comprising a time-stamped portion of raw machine data. Responsive to the receipt of an incoming search query, the search engine accesses an inverted index, wherein each entry in the inverted index comprises at least one field name, a corresponding at least one field value and a reference value associated with each field name and value pair that identifies a location in the data store where an associated event record is stored. Once the inverted index is accessed, it can be used to identify and search a subset of the plurality of event records, wherein the subset comprises one or more event records with corresponding reference values in the inverted index.

Type: Application

Filed: July 28, 2022

Publication date: November 17, 2022

Inventors: David Ryan Marquardt, Karthikeyan Sabhanatarajan, Steve Yu Zhang
Pipelined search query, leveraging reference values of an inverted index to determine a set of event data and performing further queries on the event data

Patent number: 11436222

Abstract: Embodiments of the present disclosure provide techniques for using an inverted index in a pipelined search query. A field searchable data store is provided that comprises a plurality of event records, each event record comprising a time-stamped portion of raw machine data. Responsive to the receipt of an incoming search query, the search engine accesses an inverted index, wherein each entry in the inverted index comprises at least one field name, a corresponding at least one field value and a reference value associated with each field name and value pair that identifies a location in the data store where an associated event record is stored. Once the inverted index is accessed, it can be used to identify and search a subset of the plurality of event records, wherein the subset comprises one or more event records with corresponding reference values in the inverted index.

Type: Grant

Filed: October 2, 2019

Date of Patent: September 6, 2022

Assignee: SPLUNK INC.

Inventors: David Ryan Marquardt, Karthikeyan Sabhanatarajan, Steve Yu Zhang
Data model selection and application based on data sources

Patent number: 11321311

Abstract: Embodiments include generating data models that may give semantic meaning for unstructured or structured data that may include data generated and/or received by search engines, including a time series engine. A method includes generating a data model for data stored in a repository. Generating the data model includes generating an initial query string, executing the initial query string on the data, generating an initial result set based on the initial query string being executed on the data, determining one or more candidate fields from one or more results of the initial result set, generating a candidate data model based on the one or more candidate fields, iteratively modifying the candidate data model until the candidate data model models the data, and using the candidate data model as the data model.

Type: Grant

Filed: November 29, 2018

Date of Patent: May 3, 2022

Assignee: SPLUNK INC.

Inventors: Alice Emily Neels, Archana Sulochana Ganapathi, Marc Vincent Robichaud, Stephen Phillip Sorkin, Steve Yu Zhang
GENERATING A QUERY RESPONSE BY COMBINING PARTIAL RESULTS FROM SEPARATE PARTITIONS OF EVENT RECORDS

Publication number: 20220012221

Abstract: Embodiments are directed towards a method for generating a query response, which comprises creating two or more partitions of event records from raw data stored in a data store, wherein each event record in the two or more partitions of event records includes a portion of the raw data and is associated with a time stamp derived from the raw data. The method also comprises generating a summarization table for each partition of the two or more partitions that: (a) identifies a field value comprising a value that corresponds to an associated field extracted from a respective event record; and (b) for the field value, includes a posting value to the respective event record within a respective partition. The method further comprises generating partial results for a received query using summarization tables in the partitions and generating a response to the query by combining the partial results.

Type: Application

Filed: September 23, 2021

Publication date: January 13, 2022

Inventors: David Ryan Marquardt, Stephen Phillip Sorkin, Steve Yu Zhang
Metrics store system

Patent number: 11188550

Abstract: The disclosed embodiments include a method performed by a data intake and query system. The method includes ingesting each metric including at least one key value and a measured value taken of a computing resource, and storing each metric in an index of a metrics store, where the index defines at least one dimension populated with the at least one key value and a measure populated with the measured value. The method further includes cataloging metadata in a metrics catalog, where the metadata is related to the metrics stored in the metrics store, performing an analysis of metrics data included in the metrics store and/or the metrics catalog to obtain results, and causing display of the results or an indication of the results on a display device.

Type: Grant

Filed: October 31, 2016

Date of Patent: November 30, 2021

Assignee: SPLUNK INC.

Inventors: Thomas Allan Haggie, Clint Sharp, Alexander Douglas James, David Ryan Marquardt, Hailun Yan, Christopher Pride, Vishal Patel, Amrittpal Singh Bath, Pratiksha Shah, Murugan Kandaswamy, Steve Yu Zhang, Ledion Bitincka, David E. Simmen, Marc Andre Chene, Esguerra Ma Kharisma, Igor Stojanovski
Determining indications of unique values for fields

Patent number: 11176146

Abstract: A method, system, and processor-readable storage medium are directed towards generating a report derived from data, such as event data, stored on a plurality of distributed nodes. In one embodiment the analysis is generated using a “divide and conquer” algorithm, such that each distributed node analyzes locally stored event data while an aggregating node combines these analysis results to generate the report. In one embodiment, each distributed node also transmits a list of event data references associated with the analysis result to the aggregating node. The aggregating node may then generate a global ordered list of data references based on the list of event data references received from each distributed node. Subsequently, in response to a user selection of a range of global event data, the report may dynamically retrieve event data from one or more distributed nodes for display according to the global order.

Type: Grant

Filed: April 26, 2019

Date of Patent: November 16, 2021

Assignee: Splunk Inc.

Inventors: Steve Yu Zhang, Stephen Phillip Sorkin
Parallelization of collection queries

Patent number: 11163738

Abstract: Embodiments are directed towards the parallelization of collection queries. A method of parallelizing collection queries comprises providing a field searchable data store comprising a plurality of field searchable time stamped event records. The method further comprises receiving, at a search head, a collection query that references a field name that identifies portions of one or more event records to be summarized. Further, the method comprises determining if the collection query can be concurrently executed on a first plurality of indexers, wherein the search head is configured to communicate with the first plurality of indexers, and wherein each indexer of the first plurality of indexers comprises one or more field searchable time stamped event records. Responsive to an affirmative determination, the method also comprises determining a second plurality of indexers relevant to the collection query and executing the collection query to generate a respective summarization table at each indexer.

Type: Grant

Filed: June 25, 2019

Date of Patent: November 2, 2021

Assignee: Splunk Inc.

Inventors: David Ryan Marquardt, Stephen Phillip Sorkin, Steve Yu Zhang
Providing similar field sets based on related source types

Patent number: 11100172

Abstract: Embodiments of the present invention are directed to identifying and providing related data field sets. In one embodiment, a first portion of a graphical user interface (GUI) configured to receive a search query is displayed. The GUI enables user interaction to specify a source type in association with the search query. In accordance with a first source type specified in the search query, a first field set associated with the first source type is identified as related to a second field set associated with a second source type. A second portion of the GUI is displayed that includes a relationship indication that indicates the first field set associated with the first source type is related to the second field set associated with a second source type. Further, a third portion of the GUI is displayed that includes an explanation or recommendation associated with the relationship indication.

Type: Grant

Filed: July 31, 2018

Date of Patent: August 24, 2021

Assignee: SPLUNK Inc.

Inventors: Kristal Lyn Curtis, Archana Sulochana Ganapathi, Adam Oliner, Steve Yu Zhang
