Patents by Inventor Steve Yu Zhang

Steve Yu Zhang has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 11977544
    Abstract: Embodiments of the present disclosure provide techniques for using an inverted index in a pipelined search query. A field searchable data store is provided that comprises a plurality of event records, each event record comprising a time-stamped portion of raw machine data. Responsive to the receipt of an incoming search query, the search engine accesses an inverted index, wherein each entry in the inverted index comprises at least one field name, a corresponding at least one field value and a reference value associated with each field name and value pair that identifies a location in the data store where an associated event record is stored. Once the inverted index is accessed, it can be used to identify and search a subset of the plurality of event records, wherein the subset comprises one or more event records with corresponding reference values in the inverted index.
    Type: Grant
    Filed: July 28, 2022
    Date of Patent: May 7, 2024
    Assignee: SPLUNK INC.
    Inventors: David Ryan Marquardt, Karthikeyan Sabhanatarajan, Steve Yu Zhang
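The inverted-index lookup described in this abstract can be illustrated with a minimal sketch. It is not the patented implementation: the `EVENTS` store, the integer reference values, and the function names `build_inverted_index` and `search` are all hypothetical, chosen only to show how (field name, field value) entries can point back to stored event records.

```python
from collections import defaultdict

# Hypothetical data store: reference value -> event record.
# Each record holds a timestamp, a portion of raw machine data, and extracted fields.
EVENTS = {
    0: {"time": "2024-05-07T10:00:00", "raw": "status=200 host=web1",
        "fields": {"status": "200", "host": "web1"}},
    1: {"time": "2024-05-07T10:00:05", "raw": "status=500 host=web2",
        "fields": {"status": "500", "host": "web2"}},
    2: {"time": "2024-05-07T10:00:09", "raw": "status=200 host=web2",
        "fields": {"status": "200", "host": "web2"}},
}


def build_inverted_index(events):
    """Map each (field name, field value) pair to the references of matching events."""
    index = defaultdict(list)
    for ref, event in events.items():
        for name, value in event["fields"].items():
            index[(name, value)].append(ref)
    return index


def search(events, index, name, value):
    """Use the index to fetch only the subset of events with the given field value."""
    return [events[ref] for ref in index.get((name, value), [])]


INDEX = build_inverted_index(EVENTS)
matches = search(EVENTS, INDEX, "status", "200")
```

Only the events referenced by the `("status", "200")` entry are read from the store; the remaining records are never scanned.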
  • Patent number: 11914562
    Abstract: A method and system for managing searches of a data set that is partitioned based on a plurality of events. A structure of a search query may be analyzed to determine if logical computational actions performed on the data set are reducible. Data in each partition is analyzed to determine if at least a portion of the data in the partition is reducible. In response to a subsequent or reoccurring search request, intermediate summaries of reducible data and reducible search computations may be aggregated for each partition. Next, a search result may be generated based on at least one of the aggregated intermediate summaries, the aggregated reducible search computations, and a query of ad hoc non-reducible data arranged in at least one of the plurality of partitions for the data set.
    Type: Grant
    Filed: February 8, 2023
    Date of Patent: February 27, 2024
    Assignee: SPLUNK INC.
    Inventors: Ledion Bitincka, Stephen Phillip Sorkin, Steve Yu Zhang
  • Patent number: 11893010
    Abstract: Embodiments include generating data models that may give semantic meaning for unstructured or structured data that may include data generated and/or received by search engines, including a time series engine. A method includes generating a data model for data stored in a repository. Generating the data model includes generating an initial query string, executing the initial query string on the data, generating an initial result set based on the initial query string being executed on the data, determining one or more candidate fields from one or more results of the initial result set, generating a candidate data model based on the one or more candidate fields, iteratively modifying the candidate data model until the candidate data model models the data, and using the candidate data model as the data model.
    Type: Grant
    Filed: May 2, 2022
    Date of Patent: February 6, 2024
    Assignee: SPLUNK INC.
    Inventors: Alice Emily Neels, Archana Sulochana Ganapathi, Marc Vincent Robichaud, Stephen Phillip Sorkin, Steve Yu Zhang
  • Patent number: 11860881
    Abstract: A method, system, and processor-readable storage medium are directed towards generating a report derived from data, such as event data, stored on a plurality of distributed nodes. In one embodiment the analysis is generated using a “divide and conquer” algorithm, such that each distributed node analyzes locally stored event data while an aggregating node combines these analysis results to generate the report. In one embodiment, each distributed node also transmits a list of event data references associated with the analysis result to the aggregating node. The aggregating node may then generate a global ordered list of data references based on the list of event data references received from each distributed node. Subsequently, in response to a user selection of a range of global event data, the report may dynamically retrieve event data from one or more distributed nodes for display according to the global order.
    Type: Grant
    Filed: November 15, 2021
    Date of Patent: January 2, 2024
    Assignee: Splunk Inc.
    Inventors: Steve Yu Zhang, Stephen Phillip Sorkin
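The "divide and conquer" flow in this abstract can be sketched as follows. This is an illustration under stated assumptions, not the patented method: the `NODES` mapping, the `(timestamp, label)` event shape, ordering by timestamp, and the helper names `analyze_locally`, `aggregate`, and `fetch_range` are all hypothetical.

```python
import heapq
from collections import Counter

# Hypothetical distributed nodes, each holding local events as (timestamp, label).
NODES = {
    "node_a": [(3, "error"), (1, "ok")],
    "node_b": [(2, "error"), (4, "ok")],
}


def analyze_locally(node_name, events):
    """Run on each node: a local analysis result plus a sorted list of event references."""
    counts = Counter(label for _, label in events)
    refs = sorted((ts, node_name, i) for i, (ts, _) in enumerate(events))
    return counts, refs


def aggregate(partials):
    """Run on the aggregating node: combine results and merge references into one global order."""
    total = Counter()
    ref_lists = []
    for counts, refs in partials:
        total.update(counts)
        ref_lists.append(refs)
    # Each per-node list is already sorted, so a k-way merge yields the global order.
    return total, list(heapq.merge(*ref_lists))


def fetch_range(global_order, start, stop):
    """Retrieve event data for a selected slice of the global order from the owning nodes."""
    return [NODES[node][i] for _, node, i in global_order[start:stop]]


partials = [analyze_locally(name, events) for name, events in NODES.items()]
report, global_order = aggregate(partials)
selected = fetch_range(global_order, 1, 3)
```

The aggregating node holds only references, so event data is pulled from a node when a user selects a range that includes it.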
  • Patent number: 11841853
    Abstract: Embodiments of the present invention are directed to identifying related data, in particular, data associated with different source types. In embodiments, a first source type related to a second source type associated with a search query is identified. Field set pairs are identified from a first data set associated with the first source type and a second data set associated with the second source type. Each field set pair can include one field set associated with the first source type and another field set associated with the second source type. For each field set pair, an extent of similarity is determined between the corresponding field sets. Based on the extent of similarities between the corresponding field sets, at least one pair of related field sets is identified. An indication of the at least one pair of related field sets is provided, for example, for presentation to a user.
    Type: Grant
    Filed: March 15, 2021
    Date of Patent: December 12, 2023
    Assignee: Splunk Inc.
    Inventors: Kristal Lyn Curtis, Archana Sulochana Ganapathi, Adam Oliner, Steve Yu Zhang
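A minimal sketch of comparing field set pairs across two source types follows. The abstract does not say how the "extent of similarity" is computed; Jaccard similarity over observed field values is an assumption made here for illustration, as are the sample data, the `threshold` parameter, and the function names.

```python
from itertools import product


def jaccard(a, b):
    """Jaccard similarity of two value sets (assumed measure, not taken from the patent)."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def related_field_sets(first, second, threshold=0.5):
    """Score every field pair across two source types and keep those at or above the threshold."""
    pairs = []
    for (name_a, vals_a), (name_b, vals_b) in product(first.items(), second.items()):
        score = jaccard(vals_a, vals_b)
        if score >= threshold:
            pairs.append((name_a, name_b, score))
    return sorted(pairs, key=lambda p: p[2], reverse=True)


# Hypothetical field values observed for two source types.
ACCESS_LOG = {
    "clientip": {"10.0.0.1", "10.0.0.2", "10.0.0.3"},
    "status": {"200", "500"},
}
FIREWALL_LOG = {
    "src": {"10.0.0.1", "10.0.0.2", "10.0.0.4"},
    "action": {"allow", "deny"},
}

related = related_field_sets(ACCESS_LOG, FIREWALL_LOG)
```

Here `clientip` and `src` share two of four distinct values, so they are reported as a related pair that could be presented to a user.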
  • Patent number: 11836146
    Abstract: A computer-implemented method of determining indexed fields at query time comprises indexing time-stamped events ingested from a plurality of source types. The time-stamped searchable events comprise portions of raw data. The method also comprises generating an index containing each keyword in the time-stamped searchable events and an associated location reference of a respective event in which the keyword appears. Further, the method comprises generating a fields metadata file identifying indexed fields in the time-stamped searchable events for each source type. The fields metadata file comprises reference values for accessing indexed fields associated with each source type from the index. The method also comprises accessing the fields metadata file to identify the indexed fields associated with each source type prior to executing a query.
    Type: Grant
    Filed: January 29, 2021
    Date of Patent: December 5, 2023
    Assignee: SPLUNK INC.
    Inventors: Jay A. Pathak, Steve Yu Zhang
  • Patent number: 11604779
    Abstract: A method and system for managing searches of a data set that is partitioned based on a plurality of events. A structure of a search query may be analyzed to determine if logical computational actions performed on the data set are reducible. Data in each partition is analyzed to determine if at least a portion of the data in the partition is reducible. In response to a subsequent or reoccurring search request, intermediate summaries of reducible data and reducible search computations may be aggregated for each partition. Next, a search result may be generated based on at least one of the aggregated intermediate summaries, the aggregated reducible search computations, and a query of ad hoc non-reducible data arranged in at least one of the plurality of partitions for the data set.
    Type: Grant
    Filed: May 10, 2021
    Date of Patent: March 14, 2023
    Assignee: SPLUNK INC.
    Inventors: Ledion Bitincka, Stephen Phillip Sorkin, Steve Yu Zhang
  • Publication number: 20220365932
    Abstract: Embodiments of the present disclosure provide techniques for using an inverted index in a pipelined search query. A field searchable data store is provided that comprises a plurality of event records, each event record comprising a time-stamped portion of raw machine data. Responsive to the receipt of an incoming search query, the search engine accesses an inverted index, wherein each entry in the inverted index comprises at least one field name, a corresponding at least one field value and a reference value associated with each field name and value pair that identifies a location in the data store where an associated event record is stored. Once the inverted index is accessed, it can be used to identify and search a subset of the plurality of event records, wherein the subset comprises one or more event records with corresponding reference values in the inverted index.
    Type: Application
    Filed: July 28, 2022
    Publication date: November 17, 2022
    Inventors: David Ryan Marquardt, Karthikeyan Sabhanatarajan, Steve Yu Zhang
  • Patent number: 11436222
    Abstract: Embodiments of the present disclosure provide techniques for using an inverted index in a pipelined search query. A field searchable data store is provided that comprises a plurality of event records, each event record comprising a time-stamped portion of raw machine data. Responsive to the receipt of an incoming search query, the search engine accesses an inverted index, wherein each entry in the inverted index comprises at least one field name, a corresponding at least one field value and a reference value associated with each field name and value pair that identifies a location in the data store where an associated event record is stored. Once the inverted index is accessed, it can be used to identify and search a subset of the plurality of event records, wherein the subset comprises one or more event records with corresponding reference values in the inverted index.
    Type: Grant
    Filed: October 2, 2019
    Date of Patent: September 6, 2022
    Assignee: SPLUNK INC.
    Inventors: David Ryan Marquardt, Karthikeyan Sabhanatarajan, Steve Yu Zhang
  • Patent number: 11321311
    Abstract: Embodiments include generating data models that may give semantic meaning for unstructured or structured data that may include data generated and/or received by search engines, including a time series engine. A method includes generating a data model for data stored in a repository. Generating the data model includes generating an initial query string, executing the initial query string on the data, generating an initial result set based on the initial query string being executed on the data, determining one or more candidate fields from one or more results of the initial result set, generating a candidate data model based on the one or more candidate fields, iteratively modifying the candidate data model until the candidate data model models the data, and using the candidate data model as the data model.
    Type: Grant
    Filed: November 29, 2018
    Date of Patent: May 3, 2022
    Assignee: SPLUNK INC.
    Inventors: Alice Emily Neels, Archana Sulochana Ganapathi, Marc Vincent Robichaud, Stephen Phillip Sorkin, Steve Yu Zhang
  • Publication number: 20220012221
    Abstract: Embodiments are directed towards a method for generating a query response, which comprises creating two or more partitions of event records from raw data stored in a data store, wherein each event record in the two or more partitions of event records includes a portion of the raw data and is associated with a time stamp derived from the raw data. The method also comprises generating a summarization table for each partition of the two or more partitions that: (a) identifies a field value comprising a value that corresponds to an associated field extracted from a respective event record; and (b) for the field value, includes a posting value to the respective event record within a respective partition. The method further comprises generating partial results for a received query using summarization tables in the partitions and generating a response to the query by combining the partial results.
    Type: Application
    Filed: September 23, 2021
    Publication date: January 13, 2022
    Inventors: David Ryan Marquardt, Stephen Phillip Sorkin, Steve Yu Zhang
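The per-partition summarization tables in this abstract can be sketched as below. This is a simplified illustration, not the patented design: the `key=value` field extraction, the use of list positions as posting values, and the names `extract_fields`, `build_summary`, and `count_by_value` are hypothetical.

```python
from collections import Counter, defaultdict

# Hypothetical partitions of time-stamped event records.
PARTITIONS = [
    [{"time": 1, "raw": "status=200"}, {"time": 2, "raw": "status=500"}],
    [{"time": 3, "raw": "status=200"}],
]


def extract_fields(raw):
    """Assumed extraction rule: whitespace-separated key=value pairs."""
    return dict(pair.split("=", 1) for pair in raw.split() if "=" in pair)


def build_summary(partition):
    """Summarization table: (field, value) -> posting values (positions within the partition)."""
    table = defaultdict(list)
    for pos, event in enumerate(partition):
        for field, value in extract_fields(event["raw"]).items():
            table[(field, value)].append(pos)
    return table


def count_by_value(summaries, field):
    """Compute a partial result per partition from its table, then combine the partials."""
    total = Counter()
    for table in summaries:
        partial = Counter({v: len(posts) for (f, v), posts in table.items() if f == field})
        total.update(partial)
    return total


SUMMARIES = [build_summary(p) for p in PARTITIONS]
counts = count_by_value(SUMMARIES, "status")
```

The query is answered from the tables alone; the raw events are consulted only when the tables are built.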
  • Patent number: 11188550
    Abstract: The disclosed embodiments include a method performed by a data intake and query system. The method includes ingesting each metric including at least one key value and a measured value taken of a computing resource, and storing each metric in an index of a metrics store, where the index defines at least one dimension populated with the at least one key value and a measure populated with the measured value. The method further includes cataloging metadata in a metrics catalog, where the metadata is related to the metrics stored in the metrics store, performing an analysis of metrics data included in the metrics store and/or the metrics catalog to obtain results, and causing display of the results or an indication of the results on a display device.
    Type: Grant
    Filed: October 31, 2016
    Date of Patent: November 30, 2021
    Assignee: SPLUNK INC.
    Inventors: Thomas Allan Haggie, Clint Sharp, Alexander Douglas James, David Ryan Marquardt, Hailun Yan, Christopher Pride, Vishal Patel, Amrittpal Singh Bath, Pratiksha Shah, Murugan Kandaswamy, Steve Yu Zhang, Ledion Bitincka, David E. Simmen, Marc Andre Chene, Esguerra Ma Kharisma, Igor Stojanovski
  • Patent number: 11176146
    Abstract: A method, system, and processor-readable storage medium are directed towards generating a report derived from data, such as event data, stored on a plurality of distributed nodes. In one embodiment the analysis is generated using a “divide and conquer” algorithm, such that each distributed node analyzes locally stored event data while an aggregating node combines these analysis results to generate the report. In one embodiment, each distributed node also transmits a list of event data references associated with the analysis result to the aggregating node. The aggregating node may then generate a global ordered list of data references based on the list of event data references received from each distributed node. Subsequently, in response to a user selection of a range of global event data, the report may dynamically retrieve event data from one or more distributed nodes for display according to the global order.
    Type: Grant
    Filed: April 26, 2019
    Date of Patent: November 16, 2021
    Assignee: Splunk Inc.
    Inventors: Steve Yu Zhang, Stephen Phillip Sorkin
  • Patent number: 11163738
    Abstract: Embodiments are directed towards the parallelization of collection queries. A method of parallelizing collection queries comprises providing a field searchable data store comprising a plurality of field searchable time stamped event records. The method further comprises receiving, at a search head, a collection query that references a field name that identifies portions of one or more event records to be summarized. Further, the method comprises determining if the collection query can be concurrently executed on a first plurality of indexers, wherein the search head is configured to communicate with the first plurality of indexers, and wherein each indexer of the first plurality of indexers comprises one or more field searchable time stamped event records. Responsive to an affirmative determination, the method also comprises determining a second plurality of indexers relevant to the collection query and executing the collection query to generate a respective summarization table at each indexer.
    Type: Grant
    Filed: June 25, 2019
    Date of Patent: November 2, 2021
    Assignee: Splunk Inc.
    Inventors: David Ryan Marquardt, Stephen Phillip Sorkin, Steve Yu Zhang
  • Patent number: 11100172
    Abstract: Embodiments of the present invention are directed to identifying and providing related data field sets. In one embodiment, a first portion of a graphical user interface (GUI) configured to receive a search query is displayed. The GUI enables user interaction to specify a source type in association with the search query. In accordance with a first source type specified in the search query, a first field set associated with the first source type is identified as related to a second field set associated with a second source type. A second portion of the GUI is displayed that includes a relationship indication that indicates the first field set associated with the first source type is related to the second field set associated with a second source type. Further, a third portion of the GUI is displayed that includes an explanation or recommendation associated with the relationship indication.
    Type: Grant
    Filed: July 31, 2018
    Date of Patent: August 24, 2021
    Assignee: SPLUNK Inc.
    Inventors: Kristal Lyn Curtis, Archana Sulochana Ganapathi, Adam Oliner, Steve Yu Zhang
  • Patent number: 11055300
    Abstract: The disclosed embodiments include a method performed by a data intake and query system. The method includes receiving a real-time search query including search criteria, and receiving a stream of metrics, where each metric includes a measured value taken of a computing device. The method further includes filtering the metrics to obtain filtered metrics satisfying the search criteria, creating an in-memory summarization data structure based on the filtered metrics, communicating the summarization data to a search head, and providing search results including the summarization data, where the summarization data or data indicative of the summarization data is displayed on a display of a display device.
    Type: Grant
    Filed: October 31, 2016
    Date of Patent: July 6, 2021
    Assignee: SPLUNK INC.
    Inventors: Steve Yu Zhang, Ledion Bitincka, Vishal Patel, David E. Simmen
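The filter-then-summarize step for a metrics stream, as described in this abstract, might look like the sketch below. Everything specific here is an assumption: the metric dictionary shape (`name`, `dims`, `value`), equality matching on dimensions as the search criteria, and a count/total summary are illustrative choices, not details from the patent.

```python
from dataclasses import dataclass


@dataclass
class Summary:
    """In-memory summarization structure holding a running count and total."""
    count: int = 0
    total: float = 0.0

    def add(self, value):
        self.count += 1
        self.total += value

    @property
    def mean(self):
        return self.total / self.count if self.count else 0.0


def summarize(stream, criteria):
    """Filter incoming metrics by dimension criteria and fold matches into per-metric summaries."""
    summaries = {}
    for metric in stream:
        if all(metric["dims"].get(k) == v for k, v in criteria.items()):
            summaries.setdefault(metric["name"], Summary()).add(metric["value"])
    return summaries


# Hypothetical stream of metrics, each with a measured value taken of a computing device.
STREAM = [
    {"name": "cpu.usage", "dims": {"host": "web1"}, "value": 40.0},
    {"name": "cpu.usage", "dims": {"host": "web2"}, "value": 90.0},
    {"name": "cpu.usage", "dims": {"host": "web1"}, "value": 60.0},
]

result = summarize(iter(STREAM), {"host": "web1"})
```

Because the summary is updated one metric at a time, the stream never needs to be stored in full before results are communicated onward.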
  • Publication number: 20210200755
    Abstract: Embodiments of the present invention are directed to identifying related data, in particular, data associated with different source types. In embodiments, a first source type related to a second source type associated with a search query is identified. Field set pairs are identified from a first data set associated with the first source type and a second data set associated with the second source type. Each field set pair can include one field set associated with the first source type and another field set associated with the second source type. For each field set pair, an extent of similarity is determined between the corresponding field sets. Based on the extent of similarities between the corresponding field sets, at least one pair of related field sets is identified. An indication of the at least one pair of related field sets is provided, for example, for presentation to a user.
    Type: Application
    Filed: March 15, 2021
    Publication date: July 1, 2021
    Inventors: Kristal Lyn Curtis, Archana Sulochana Ganapathi, Adam Oliner, Steve Yu Zhang
  • Patent number: 11030173
    Abstract: A method and system for managing searches of a data set that is partitioned based on a plurality of events. A structure of a search query may be analyzed to determine if logical computational actions performed on the data set are reducible. Data in each partition is analyzed to determine if at least a portion of the data in the partition is reducible. In response to a subsequent or reoccurring search request, intermediate summaries of reducible data and reducible search computations may be aggregated for each partition. Next, a search result may be generated based on at least one of the aggregated intermediate summaries, the aggregated reducible search computations, and a query of ad hoc non-reducible data arranged in at least one of the plurality of partitions for the data set.
    Type: Grant
    Filed: July 2, 2020
    Date of Patent: June 8, 2021
    Assignee: Splunk, Inc.
    Inventors: Ledion Bitincka, Stephen Phillip Sorkin, Steve Yu Zhang
  • Patent number: 11003675
    Abstract: A method, system, and processor-readable storage medium are directed towards generating a report derived from data, such as event data, stored on a plurality of distributed nodes. In one embodiment the analysis is generated using a “divide and conquer” algorithm, such that each distributed node analyzes locally stored event data while an aggregating node combines these analysis results to generate the report. In one embodiment, each distributed node also transmits a list of event data references associated with the analysis result to the aggregating node. The aggregating node may then generate a global ordered list of data references based on the list of event data references received from each distributed node. Subsequently, in response to a user selection of a range of global event data, the report may dynamically retrieve event data from one or more distributed nodes for display according to the global order.
    Type: Grant
    Filed: June 27, 2019
    Date of Patent: May 11, 2021
    Assignee: SPLUNK INC.
    Inventors: Steve Yu Zhang, Stephen Phillip Sorkin
  • Patent number: 10949420
    Abstract: Embodiments of the present invention are directed to identifying related data, in particular, data associated with different source types. In embodiments, a first source type related to a second source type associated with a search query is identified. Field set pairs are identified from a first data set associated with the first source type and a second data set associated with the second source type. Each field set pair can include one field set associated with the first source type and another field set associated with the second source type. For each field set pair, an extent of similarity is determined between the corresponding field sets. Based on the extent of similarities between the corresponding field sets, at least one pair of related field sets is identified. An indication of the at least one pair of related field sets is provided, for example, for presentation to a user.
    Type: Grant
    Filed: July 31, 2018
    Date of Patent: March 16, 2021
    Assignee: SPLUNK Inc.
    Inventors: Kristal Lyn Curtis, Archana Sulochana Ganapathi, Adam Oliner, Steve Yu Zhang