Patents by Inventor Sourav Pal

Sourav Pal has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Publication number: 20190138640
    Abstract: Systems and methods are disclosed for executing a query that includes an indication to process data managed by an external data system. The system identifies the external data system that manages the data to be processed, and obtains search configuration data from the external data system. The system uses the search configuration data to generate a subquery for the external data system. The system also generates instructions for one or more worker nodes to receive and process results of the subquery from the external data system.
    Type: Application
    Filed: July 31, 2018
    Publication date: May 9, 2019
    Inventors: Sourav Pal, Arindam Bhattacharjee
  • Publication number: 20190138641
    Abstract: Systems and methods are disclosed for executing a query that includes an indication to process data managed by an external data system. The system identifies the external data system that manages the data to be processed and generates a subquery for the external data system. The system determines a data ingest estimate and uses the data ingest estimate to generate instructions for one or more worker nodes to receive and process results of the subquery from the external data system.
    Type: Application
    Filed: July 31, 2018
    Publication date: May 9, 2019
    Inventors: Sourav Pal, Arindam Bhattacharjee
  • Publication number: 20190138639
    Abstract: Systems and methods are disclosed for receiving, at a first data intake and query system, a query that includes an indication to process data managed by another data intake and query system. The first data intake and query system identifies a second data intake and query system that manages the data to be processed and generates a subquery for execution by the second data intake and query system, generates instructions for one or more worker nodes to receive and process results of the subquery from the second data intake and query system, and instructs the worker nodes to provide results of the processing to the first data intake and query system.
    Type: Application
    Filed: July 31, 2018
    Publication date: May 9, 2019
    Inventors: Sourav Pal, Arindam Bhattacharjee
  • Publication number: 20190138642
    Abstract: Systems and methods are disclosed for receiving and executing a query received from a data intake and query system and providing results to a first group of worker nodes in a distributed execution environment. The query identifies a set of data to be processed and a manner of processing the set of data. Based on the query, the system defines a query processing scheme, and generates instructions for a second group of worker nodes to obtain the set of data from one or more dataset sources and to process the set of data. The system communicates results of the query to the first group of worker nodes.
    Type: Application
    Filed: July 31, 2018
    Publication date: May 9, 2019
    Inventors: Sourav Pal, Arindam Bhattacharjee
  • Publication number: 20190095494
    Abstract: Systems and methods are disclosed for processing and executing queries against one or more datasets. As part of processing the query, the system determines whether the query is susceptible to a significantly imbalanced partition. In the event the query is susceptible to an imbalanced partition, the system monitors the query and determines whether to perform a multi-partitioning determination to avoid a significantly imbalanced partition.
    Type: Application
    Filed: September 25, 2017
    Publication date: March 28, 2019
    Inventors: Arindam Bhattacharjee, Sourav Pal, Christopher Pride
  • Publication number: 20190095491
    Abstract: Systems and methods are disclosed for generating a distributed execution model with untrusted commands. The system can receive a query, and process the query to identify the untrusted commands. The system can use data associated with the untrusted command to identify one or more files associated with the untrusted command. Based on the files, the system can generate a data structure and include one or more identifiers associated with the data structure in the distributed execution model. The system can distribute the distributed execution model to one or more nodes in a distributed computing environment for execution.
    Type: Application
    Filed: September 25, 2017
    Publication date: March 28, 2019
    Inventors: Arindam Bhattacharjee, Sourav Pal, Alexander Douglas James
  • Publication number: 20190095493
    Abstract: In an environment where multiple datasets are to be combined, systems and methods are disclosed for allocating a group of data entries from at least one dataset into multiple partitions. For a particular partition, the subgroup in the partition can be combined with data entries from the other dataset. In some cases, groups of data entries from each dataset are assigned to different partitions. For a particular partition, a subgroup is duplicated, some of the data entries of the subgroup are reassigned to other partitions, the subgroup is reformed to include data entries from other partitions, and the reformed subgroup is combined with the subgroup from the other dataset(s).
    Type: Application
    Filed: September 25, 2017
    Publication date: March 28, 2019
    Inventors: Arindam Bhattacharjee, Sourav Pal, Christopher Pride
  • Publication number: 20190095488
    Abstract: Systems and methods are disclosed for executing a distributed execution model with untrusted commands. The distributed execution model can be distributed to multiple nodes in a distributed computing environment. At least one node can process the distributed execution model to identify an untrusted command. The node can use data associated with the untrusted command to identify one or more files associated with the untrusted command. Based on the files, the node can generate a data structure, and execute at least a portion of the data structure.
    Type: Application
    Filed: September 25, 2017
    Publication date: March 28, 2019
    Inventors: Arindam Bhattacharjee, Sourav Pal, Alexander Douglas James
  • Publication number: 20190068702
    Abstract: Processing of search responses returned by search peers is disclosed. An example method may include transmitting, by a computer system, a search request to a plurality of search peers of a data aggregation and analysis system; receiving a plurality of data packets from the plurality of search peers; parsing one or more data packets of the plurality of data packets, to produce a response to the search request; and splitting the response into two or more fields based on at least one of: a defined set of bit positions or a defined separator.
    Type: Application
    Filed: October 30, 2018
    Publication date: February 28, 2019
    Inventors: Sourav Pal, Christopher Madden Pride
  • Patent number: 10142412
    Abstract: Multi-thread processing of search responses is disclosed. An example method may include transmitting, by a computer system, a search request to a plurality of search peers of a data aggregation and analysis system; receiving a plurality of data packets from the plurality of search peers; parsing, by a first processing thread of the computer system, one or more data packets of the plurality of data packets, to produce a partial response to the search request; and processing, by a second processing thread of the computer system, the partial response to produce a memory data structure representing an aggregated response to the search request.
    Type: Grant
    Filed: March 6, 2018
    Date of Patent: November 27, 2018
    Assignee: Splunk Inc.
    Inventors: Sourav Pal, Christopher Madden Pride
  • Publication number: 20180218045
    Abstract: The disclosed embodiments include a method performed by a data intake and query system. The method includes receiving a search query by a search head, defining a search process for applying the search query to indexers, delegating a first portion of the search process to indexers and a second portion of the search process to intermediary node(s) communicatively coupled to the search head and the indexers. The first portion can define a search scope for obtaining partial search results of the indexers and the second portion can define operations for combining the partial search results by the intermediary node(s) to produce a combination of the partial search results. The search head then receives the combination of the partial search results, and outputs final search results for the search query, where the final search results are based on the combination of the partial search results.
    Type: Application
    Filed: January 30, 2017
    Publication date: August 2, 2018
    Inventors: Sourav Pal, Ashish Mathew, Xiaowei Wang, Christopher Pride
  • Publication number: 20180198858
    Abstract: Multi-thread processing of search responses is disclosed. An example method may include transmitting, by a computer system, a search request to a plurality of search peers of a data aggregation and analysis system; receiving a plurality of data packets from the plurality of search peers; parsing, by a first processing thread of the computer system, one or more data packets of the plurality of data packets, to produce a partial response to the search request; and processing, by a second processing thread of the computer system, the partial response to produce a memory data structure representing an aggregated response to the search request.
    Type: Application
    Filed: March 6, 2018
    Publication date: July 12, 2018
    Inventors: Sourav Pal, Christopher Madden Pride
  • Patent number: 9942318
    Abstract: Asynchronous processing of messages that are received from multiple servers is disclosed. An example method may include transmitting, by a computer system, a search request to a plurality of search peers of a data aggregation and analysis system. The method may further include receiving a plurality of sub-application layer protocol packets from the plurality of search peers. The method may further include parsing, by a first processing thread of the computer system, one or more sub-application layer protocol packets of the plurality of sub-application layer protocol packets, to produce an application layer message representing a partial response to the search request. The method may further include processing, by a second processing thread of the computer system, the application layer message to produce a memory data structure representing an aggregated response to the search request.
    Type: Grant
    Filed: October 26, 2016
    Date of Patent: April 10, 2018
    Assignee: Splunk Inc.
    Inventors: Sourav Pal, Christopher Madden Pride
  • Publication number: 20180089269
    Abstract: Systems and methods are disclosed for processing queries against one or more dataset sources. The system tracks query-resource usage data and node resource utilization data. The query-resource usage data can indicate resources used to execute queries. The node resource utilization data can indicate current utilization of nodes in the system. Upon receipt of a query that identifies a set of data to be processed and a manner of processing the set of data, the system can use the query-resource usage data and the node resource utilization data to define a query processing scheme. The query can then be executed using the query processing scheme. In some cases, the query coordinator can dynamically allocate partitions operating on worker nodes to execute the query.
    Type: Application
    Filed: July 31, 2017
    Publication date: March 29, 2018
    Inventors: Sourav Pal, Arindam Bhattacharjee, Christopher Pride
  • Publication number: 20180089306
    Abstract: Systems and methods are disclosed for a data index and query system that utilizes a query acceleration data store. An example method includes receiving a query identifying a set of data to be processed and a manner of processing the set of data. A query processing scheme for obtaining and processing the set of data is defined. First partial results of the query stored in a data store are identified, with the first partial results corresponding to a first portion of the set of data. One or more partitions are dynamically allocated to obtain a second portion of the set of data from different data sources. The second portion of the set of data is processed to obtain second partial results. The first partial results and second partial results are combined. The query is executed based on the query processing scheme.
    Type: Application
    Filed: July 31, 2017
    Publication date: March 29, 2018
    Inventors: Sourav Pal, Arindam Bhattacharjee, Asha Andrade
  • Publication number: 20180089312
    Abstract: Systems and methods are disclosed for processing and executing queries against one or more dataset sources, where the queries identify a set of data to be processed and a manner of processing the set of data. To query the dataset sources, a query coordinator generates a query processing scheme that includes a dynamic allocation of multiple layers of partitions. The query is then executed based on the query processing scheme.
    Type: Application
    Filed: July 31, 2017
    Publication date: March 29, 2018
    Inventors: Sourav Pal, Arindam Bhattacharjee, Kishore Reddy Ramasayam, Alexander Douglas James
  • Publication number: 20180089262
    Abstract: Systems and methods are disclosed for processing queries against a common storage utilizing dynamically allocated partitions operating on one or more worker nodes. The common storage can include one or more data stores, which collectively contain a data set divided across multiple buckets of data. To query the common storage, a query coordinator can retrieve metadata regarding the multiple buckets, in order to determine a subset of buckets that are potentially relevant to a query. The query coordinator can then dynamically allocate partitions operating on worker nodes to retrieve and intake individual buckets of the subset into a phased search process. The dynamic allocation can be selected to maximize parallelization of the buckets across partitions, thus increasing a speed at which the common storage can be searched.
    Type: Application
    Filed: July 31, 2017
    Publication date: March 29, 2018
    Inventors: Arindam Bhattacharjee, Sourav Pal, Ramkumar Chandrasekharan
  • Publication number: 20180089324
    Abstract: Systems and methods are disclosed for utilizing an ingested data buffer operating according to a publish-subscribe messaging model as an intake mechanism for a query system. Data from various sources can be placed into the data buffer according to different topics. Indexers can subscribe to these topics in order to ingest the data into the system for long-term storage and later search. In addition, worker nodes may directly subscribe to the topics to enable continuous or streaming searching of the data, without delays that may be caused by ingestion of the data at an indexer. When a request for a streaming search is received, a query coordinator can determine a number of message queues on the data buffer that contain potentially relevant messages. The query coordinator can then dynamically allocate partitions operating on worker nodes to retrieve and intake messages from the message queues into a phased search process.
    Type: Application
    Filed: July 31, 2017
    Publication date: March 29, 2018
    Inventors: Sourav Pal, Arindam Bhattacharjee, Alexander Douglas James
  • Publication number: 20180089259
    Abstract: Systems and methods are disclosed for processing queries against an external data source utilizing dynamically allocated partitions operating on one or more worker nodes. The external data source can include data that has not been processed by the system. To query the external data source, a query coordinator can generate a subquery for the external data source based on determined functionality of the data source. The subquery can identify data in the external data source for processing and a manner for processing the data. In addition, the query coordinator can dynamically allocate partitions operating on worker nodes to retrieve and intake results of the subquery. In some cases, the number of partitions allocated can be based on a number of partitions supported by the external data source.
    Type: Application
    Filed: July 31, 2017
    Publication date: March 29, 2018
    Inventors: Alexander Douglas James, Sourav Pal, Arindam Bhattacharjee, Christopher Pride
  • Publication number: 20180089278
    Abstract: Systems and methods are disclosed for processing queries against one or more dataset sources utilizing dynamically allocated partitions operating on one or more worker nodes. The results of the processing are stored in a dataset destination. The queries can identify data in the one or more dataset sources for processing and a manner for processing the data. In addition, the queries can identify the dataset destination for storing results of the query. To process the query, a query coordinator can dynamically allocate partitions operating on worker nodes to retrieve data for processing, process the data, and communicate the data to the dataset sources. In addition, the query coordinator can dynamically allocate partitions based on an identification of the dataset destination.
    Type: Application
    Filed: July 31, 2017
    Publication date: March 29, 2018
    Inventors: Arindam Bhattacharjee, Sourav Pal, Alexander Douglas James, Christopher Pride