Patents by Inventor Steve Yu Zhang
Steve Yu Zhang has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Patent number: 11977544Abstract: Embodiments of the present disclosure provide techniques for using an inverted index in a pipelined search query. A field searchable data store is provided that comprises a plurality of event records, each event record comprising a time-stamped portion of raw machine data. Responsive to the reciept of an incoming search query, the search engine accesses an inverted index, wherein each entry in the inverted index comprises at least one field name, a corresponding at least one field value and a reference value associated with each field name and value pair that identifies a location in the data store where an associated event record is stored. Once the inverted index is accessed, it can be used to identify and search a subset of the plurality of event records, wherein the subset comprises one or more event records with corresponding reference values in the inverted index.Type: GrantFiled: July 28, 2022Date of Patent: May 7, 2024Assignee: SPLUNK INC.Inventors: David Ryan Marquardt, Karthikeyan Sabhanatarajan, Steve Yu Zhang
-
Patent number: 11914562Abstract: A method and system for managing searches of a data set that is partitioned based on a plurality of events. A structure of a search query may be analyzed to determine if logical computational actions performed on the data set is reducible. Data in each partition is analyzed to determine if at least a portion of the data in the partition is reducible. In response to a subsequent or reoccurring search request, intermediate summaries of reducible data and reducible search computations may be aggregated for each partition. Next, a search result may be generated based on at least one of the aggregated intermediate summaries, the aggregated reducible search computations, and a query of adhoc non-reducible data arranged in at least one of the plurality of partitions for the data set.Type: GrantFiled: February 8, 2023Date of Patent: February 27, 2024Assignee: SPLUNK INC.Inventors: Ledion Bitincka, Stephen Phillip Sorkin, Steve Yu Zhang
-
Patent number: 11893010Abstract: Embodiments include generating data models that may give semantic meaning for unstructured or structured data that may include data generated and/or received by search engines, including a time series engine. A method includes generating a data model for data stored in a repository. Generating the data model includes generating an initial query string, executing the initial query string on the data, generating an initial result set based on the initial query string being executed on the data, determining one or more candidate fields from one or results of the initial result set, generating a candidate data model based on the one or more candidate fields, iteratively modifying the candidate data model until the candidate data model models the data, and using the candidate data model as the data model.Type: GrantFiled: May 2, 2022Date of Patent: February 6, 2024Assignee: SPLUNK INC.Inventors: Alice Emily Neels, Archana Sulochana Ganapathi, Marc Vincent Robichaud, Stephen Phillip Sorkin, Steve Yu Zhang
-
Patent number: 11860881Abstract: A method, system, and processor-readable storage medium are directed towards generating a report derived from data, such as event data, stored on a plurality of distributed nodes. In one embodiment the analysis is generated using a “divide and conquer” algorithm, such that each distributed node analyzes locally stored event data while an aggregating node combines these analysis results to generate the report. In one embodiment, each distributed node also transmits a list of event data references associated with the analysis result to the aggregating node. The aggregating node may then generate a global ordered list of data references based on the list of event data references received from each distributed node. Subsequently, in response to a user selection of a range of global event data, the report may dynamically retrieve event data from one or more distributed nodes for display according to the global order.Type: GrantFiled: November 15, 2021Date of Patent: January 2, 2024Assignee: Splunk Inc.Inventors: Steve Yu Zhang, Stephen Phillip Sorkin
-
Patent number: 11841853Abstract: Embodiments of the present invention are directed to identifying related data, in particular, data associated with different source types. In embodiments, a first source type related to a second source type associated with a search query is identified. Field set pairs are identified from a first data set associated with the first source type and a second data set associated with the second source type. Each field set pair can include one field set associated with the first source type and another field set associated with the second source type. For each field set pair, an extent of similarity is determined between the corresponding field sets. Based on the extent of similarities between the corresponding field sets, at least one pair of related field sets is identified. An indication of the at least one pair of related field sets is provided, for example, for presentation to a user.Type: GrantFiled: March 15, 2021Date of Patent: December 12, 2023Assignee: Splunk Inc.Inventors: Kristal Lyn Curtis, Archana Sulochana Ganapathi, Adam Oliner, Steve Yu Zhang
-
Patent number: 11836146Abstract: A computer-implemented method of determining indexed fields at query time comprises indexing time-stamped events ingested from a plurality of source types. The time-stamped searchable events compare portions of raw data. The method also comprises generating an index containing each keyword in the time-stamped searchable events and an associated location reference of a respective event in which the keyword appears. Further, the method comprises generating a fields metadata file identifying indexed fields in the time-stamped searchable events for each source type. The fields metadata file comprises reference values for accessing indexed fields associated with each source type from the index. The method also comprises accessing the fields metadata file to identify the indexed fields associated with each source type prior to executing a query.Type: GrantFiled: January 29, 2021Date of Patent: December 5, 2023Assignee: SPLUNK INC.Inventors: Jay A. Pathak, Steve Yu Zhang
-
Patent number: 11604779Abstract: A method and system for managing searches of a data set that is partitioned based on a plurality of events. A structure of a search query may be analyzed to determine if logical computational actions performed on the data set is reducible. Data in each partition is analyzed to determine if at least a portion of the data in the partition is reducible. In response to a subsequent or reoccurring search request, intermediate summaries of reducible data and reducible search computations may be aggregated for each partition. Next, a search result may be generated based on at least one of the aggregated intermediate summaries, the aggregated reducible search computations, and a query of adhoc non-reducible data arranged in at least one of the plurality of partitions for the data set.Type: GrantFiled: May 10, 2021Date of Patent: March 14, 2023Assignee: SPLUNK INC.Inventors: Ledion Bitincka, Stephen Phillip Sorkin, Steve Yu Zhang
-
Publication number: 20220365932Abstract: Embodiments of the present disclosure provide techniques for using an inverted index in a pipelined search query. A field searchable data store is provided that comprises a plurality of event records, each event record comprising a time-stamped portion of raw machine data. Responsive to the reciept of an incoming search query, the search engine accesses an inverted index, wherein each entry in the inverted index comprises at least one field name, a corresponding at least one field value and a reference value associated with each field name and value pair that identifies a location in the data store where an associated event record is stored. Once the inverted index is accessed, it can be used to identify and search a subset of the plurality of event records, wherein the subset comprises one or more event records with corresponding reference values in the inverted index.Type: ApplicationFiled: July 28, 2022Publication date: November 17, 2022Inventors: David Ryan Marquardt, Karthikeyan Sabhanatarajan, Steve Yu Zhang
-
Patent number: 11436222Abstract: Embodiments of the present disclosure provide techniques for using an inverted index in a pipelined search query. A field searchable data store is provided that comprises a plurality of event records, each event record comprising a time-stamped portion of raw machine data. Responsive to the receipt of an incoming search query, the search engine accesses an inverted index, wherein each entry in the inverted index comprises at least one field name, a corresponding at least one field value and a reference value associated with each field name and value pair that identifies a location in the data store where an associated event record is stored. Once the inverted index is accessed, it can be used to identify and search a subset of the plurality of event records, wherein the subset comprises one or more event records with corresponding reference values in the inverted index.Type: GrantFiled: October 2, 2019Date of Patent: September 6, 2022Assignee: SPLUNK INC.Inventors: David Ryan Marquardt, Karthikeyan Sabhanatarajan, Steve Yu Zhang
-
Patent number: 11321311Abstract: Embodiments include generating data models that may give semantic meaning for unstructured or structured data that may include data generated and/or received by search engines, including a time series engine. A method includes generating a data model for data stored in a repository. Generating the data model includes generating an initial query string, executing the initial query string on the data, generating an initial result set based on the initial query string being executed on the data, determining one or more candidate fields from one or results of the initial result set, generating a candidate data model based on the one or more candidate fields, iteratively modifying the candidate data model until the candidate data model models the data, and using the candidate data model as the data model.Type: GrantFiled: November 29, 2018Date of Patent: May 3, 2022Assignee: SPLUNK INC.Inventors: Alice Emily Neels, Archana Sulochana Ganapathi, Marc Vincent Robichaud, Stephen Phillip Sorkin, Steve Yu Zhang
-
Publication number: 20220012221Abstract: Embodiments are directed are towards a method for generating a query response, which comprises creating two or more partitions of event records from raw data stored in a data store, wherein each event record in the two or more partitions of event records includes a portion of the raw data and is associated with a time stamp derived from the raw data. The method also comprises generating a summarization table for each partition of the two or more partitions that: (a) identifies a field value comprising a value that corresponds to an associated field extracted from a respective event record; and (b) for the field value, includes a posting value to the respective event record within a respective partition. The method further comprises generating partial results for a received query using summarization tables in the partitions and generating a response to the query by combining the partial results.Type: ApplicationFiled: September 23, 2021Publication date: January 13, 2022Inventors: David Ryan Marquardt, Stephen Phillip Sorkin, Steve Yu Zhang
-
Patent number: 11188550Abstract: The disclosed embodiments include a method performed by a data intake and query system. The method includes ingesting each metric including at least one key value and a measured value taken of a computing resource, and storing each metric in an index of a metrics store, where the index defines at least one dimension populated with the at least one key value and a measure populated with the measured value. The method further includes cataloging metadata in a metrics catalog, where the metadata is related to the metrics stored in the metrics store, performing an analysis of metrics data included in the metrics store and/or the metrics catalog to obtain results, and causing display of the results or an indication of the results on a display device.Type: GrantFiled: October 31, 2016Date of Patent: November 30, 2021Assignee: SPLUNK INC.Inventors: Thomas Allan Haggie, Clint Sharp, Alexander Douglas James, David Ryan Marquardt, Hailun Yan, Christopher Pride, Vishal Patel, Amrittpal Singh Bath, Pratiksha Shah, Murugan Kandaswamy, Steve Yu Zhang, Ledion Bitincka, David E. Simmen, Marc Andre Chene, Esguerra Ma Kharisma, Igor Stojanovski
-
Patent number: 11176146Abstract: A method, system, and processor-readable storage medium are directed towards generating a report derived from data, such as event data, stored on a plurality of distributed nodes. In one embodiment the analysis is generated using a “divide and conquer” algorithm, such that each distributed node analyzes locally stored event data while an aggregating node combines these analysis results to generate the report. In one embodiment, each distributed node also transmits a list of event data references associated with the analysis result to the aggregating node. The aggregating node may then generate a global ordered list of data references based on the list of event data references received from each distributed node. Subsequently, in response to a user selection of a range of global event data, the report may dynamically retrieve event data from one or more distributed nodes for display according to the global order.Type: GrantFiled: April 26, 2019Date of Patent: November 16, 2021Assignee: Splunk Inc.Inventors: Steve Yu Zhang, Stephen Phillip Sorkin
-
Patent number: 11163738Abstract: Embodiments are directed are towards the parallelization of collection queries. A method of parallelizing collection queries comprises providing a field searchable data store comprising a plurality of field searchable time stamped event records. The method further comprises receiving, at a search head, a collection query that references a field name that identifies portions of one or more event records to be summarized. Further, the method comprises determining if the collection query can be concurrently executed on a first plurality of indexers, wherein the search head is configured to communicate with the first plurality of indexers, and wherein each indexer of the first plurality of indexers comprises one or more field searchable time stamped event records. Responsive to an affirmative determination, the method also comprises determining a second plurality of indexers relevant to the collection query and executing the collection query to generate a respective summarization table at each indexer.Type: GrantFiled: June 25, 2019Date of Patent: November 2, 2021Assignee: Splunk Inc.Inventors: David Ryan Marquardt, Stephen Phillip Sorkin, Steve Yu Zhang
-
Patent number: 11100172Abstract: Embodiments of the present invention are directed to identifying and providing related data field sets. In one embodiment, a first portion of a graphical user interface (GUI) configured to receive a search query is displayed. The GUI enables user interaction to specify a source type in association with the search query. In accordance with a first source type specified in the search query, a first field set associated with the first source type is identified as related to a second field set associated with a second source type. A second portion of the GUI is displayed that includes a relationship indication that indicates the first field set associated with the first source type is related to the second field set associated with a second source type. Further, a third portion of the GUI is displayed that includes an explanation or recommendation associated with the relationship indication.Type: GrantFiled: July 31, 2018Date of Patent: August 24, 2021Assignee: SPLUNK Inc.Inventors: Kristal Lyn Curtis, Archana Sulochana Ganapathi, Adam Oliner, Steve Yu Zhang
-
Patent number: 11055300Abstract: The disclosed embodiments include a method performed by a data intake and query system. The method includes receiving a real-time search query including search criteria, and receiving a stream of metrics, where each metric includes a measured value taken of a computing device. The method further includes filtering the metrics to obtain filtered metrics satisfying the search criteria, creating an in-memory summarization data structure based on the filtered metrics, communicating the summarization data to a search head, and providing search results including the summarization data, where the summarization data or data indicative of the summarization data is displayed on a display of a display device.Type: GrantFiled: October 31, 2016Date of Patent: July 6, 2021Assignee: SPLUNK INC.Inventors: Steve Yu Zhang, Ledion Bitincka, Vishal Patel, David E. Simmen
-
Publication number: 20210200755Abstract: Embodiments of the present invention are directed to identifying related data, in particular, data associated with different source types. In embodiments, a first source type related to a second source type associated with a search query is identified. Field set pairs are identified from a first data set associated with the first source type and a second data set associated with the second source type. Each field set pair can include one field set associated with the first source type and another field set associated with the second source type. For each field set pair, an extent of similarity is determined between the corresponding field sets. Based on the extent of similarities between the corresponding field sets, at least one pair of related field sets is identified. An indication of the at least one pair of related field sets is provided, for example, for presentation to a user.Type: ApplicationFiled: March 15, 2021Publication date: July 1, 2021Inventors: Kristal Lyn Curtis, Archana Sulochana Ganapathi, Adam Oliner, Steve Yu Zhang
-
Patent number: 11030173Abstract: A method and system for managing searches of a data set that is partitioned based on a plurality of events. A structure of a search query may be analyzed to determine if logical computational actions performed on the data set is reducible. Data in each partition is analyzed to determine if at least a portion of the data in the partition is reducible. In response to a subsequent or reoccurring search request, intermediate summaries of reducible data and reducible search computations may be aggregated for each partition. Next, a search result may be generated based on at least one of the aggregated intermediate summaries, the aggregated reducible search computations, and a query of adhoc non-reducible data arranged in at least one of the plurality of partitions for the data set.Type: GrantFiled: July 2, 2020Date of Patent: June 8, 2021Assignee: Splunk, Inc.Inventors: Ledion Bitincka, Stephen Phillip Sorkin, Steve Yu Zhang
-
Patent number: 11003675Abstract: A method, system, and processor-readable storage medium are directed towards generating a report derived from data, such as event data, stored on a plurality of distributed nodes. In one embodiment the analysis is generated using a “divide and conquer” algorithm, such that each distributed node analyzes locally stored event data while an aggregating node combines these analysis results to generate the report. In one embodiment, each distributed node also transmits a list of event data references associated with the analysis result to the aggregating node. The aggregating node may then generate a global ordered list of data references based on the list of event data references received from each distributed node. Subsequently, in response to a user selection of a range of global event data, the report may dynamically retrieve event data from one or more distributed nodes for display according to the global order.Type: GrantFiled: June 27, 2019Date of Patent: May 11, 2021Assignee: SPLUNK INC.Inventors: Steve Yu Zhang, Stephen Phillip Sorkin
-
Patent number: 10949420Abstract: Embodiments of the present invention are directed to identifying related data, in particular, data associated with different source types. In embodiments, a first source type related to a second source type associated with a search query is identified. Field set pairs are identified from a first data set associated with the first source type and a second data set associated with the second source type. Each field set pair can include one field set associated with the first source type and another field set associated with the second source type. For each field set pair, an extent of similarity is determined between the corresponding field sets. Based on the extent of similarities between the corresponding field sets, at least one pair of related field sets is identified. An indication of the at least one pair of related field sets is provided, for example, for presentation to a user.Type: GrantFiled: July 31, 2018Date of Patent: March 16, 2021Assignee: SPLUNK Inc.Inventors: Kristal Lyn Curtis, Archana Sulochana Ganapathi, Adam Oliner, Steve Yu Zhang