Patents by Inventor Arindam Bhattacharjee

Arindam Bhattacharjee has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

Cursored searches in a data fabric service system

Patent number: 10585951

Abstract: The disclosed embodiments include techniques to obtain ordered search results based on partial search results from across multiple diverse internal and/or external data sources. The ordering of the search results may be with respect to a parameter associated with the partial search results. An example of a parameter includes time. As such, the disclosed technique can provide a time-ordered search result based on partial search results obtained from across multiple internal and/or external data sources. Moreover, the disclosed technique can provide time-ordered search results regardless of whether the partial search results obtained from the diverse data sources are timestamped.

Type: Grant

Filed: October 31, 2016

Date of Patent: March 10, 2020

Assignee: Splunk Inc.

Inventors: Arindam Bhattacharjee, Sourav Pal, Christopher Pride
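
Illustrative sketch (not taken from the patent): the abstract above describes combining partial search results from several data sources into one time-ordered result. One common way to do this, assuming each source already returns its partial results sorted by timestamp, is a lazy k-way merge. The `Result` type, its fields, and `merge_time_ordered` are hypothetical names chosen for this example, and the sketch does not attempt the case where partial results lack timestamps.

```python
import heapq
from typing import Iterable, Iterator, NamedTuple


class Result(NamedTuple):
    timestamp: float  # hypothetical ordering parameter (e.g. epoch seconds)
    source: str       # which data source produced this partial result
    payload: str


def merge_time_ordered(*partials: Iterable[Result]) -> Iterator[Result]:
    """Lazily merge partial result streams into one time-ordered stream.

    Assumption: every input stream is already sorted by timestamp.
    """
    return heapq.merge(*partials, key=lambda r: r.timestamp)


# Two hypothetical sources, each internally time-ordered.
internal = [Result(1.0, "internal", "a"), Result(4.0, "internal", "d")]
external = [Result(2.0, "external", "b"), Result(3.0, "external", "c")]

merged = list(merge_time_ordered(internal, external))
```

Because `heapq.merge` pulls from each stream only as needed, the merged stream can be consumed incrementally rather than after all sources have finished.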

SEARCH SERVICE SYSTEM MONITORING

Publication number: 20200065340

Abstract: The disclosed embodiments also include monitoring and metering services of the data fabric service (DFS) system. Specifically, these services can include techniques for monitoring and metering metrics of the DFS system. The metrics are standards for measuring use or misuse of the DFS system. Examples of the metrics include data or components of the DFS system. For example, a metric can include data stored or communicated by the DFS system or components of the DFS system that are used or reserved for exclusive use by customers. The metrics can be measured with respect to time or computing resources (e.g., CPU utilization, memory usage) of the DFS system. For example, a DFS service can include metering the usage of particular worker nodes by a customer over a threshold period of time.

Type: Application

Filed: November 5, 2019

Publication date: February 27, 2020

Inventors: James Alasdair Robert Hodge, Sourav Pal, Arindam Bhattacharjee, Mustafa Ahamed

ADDRESSING MEMORY LIMITS FOR PARTITION TRACKING AMONG WORKER NODES

Publication number: 20200065303

Abstract: Systems and methods are described for distributed processing of a query in a first query language utilizing a query execution engine intended for single-device execution. While distributed processing provides numerous benefits over single-device processing, distributed query execution engines can be significantly more difficult to develop than single-device engines. Embodiments of this disclosure enable the use of a single-device engine to support distributed processing, by dividing a query into multiple stages, each of which can be executed by multiple, concurrent executions of a single-device engine. Between stages, data can be shuffled between executions of the engine, such that individual executions of the engine are provided with a complete set of records needed to implement an individual stage. Because single-device engines can be significantly less difficult to develop, use of the techniques described herein can enable a distributed system to rapidly support multiple query languages.

Type: Application

Filed: October 18, 2019

Publication date: February 27, 2020

Inventors: Arindam Bhattacharjee, Sourav Pal, Srinivas Bobba

SUPPORTING ADDITIONAL QUERY LANGUAGES THROUGH DISTRIBUTED EXECUTION OF QUERY ENGINES

Publication number: 20200050612

Abstract: Systems and methods are described for distributed processing of a query in a first query language utilizing a query execution engine intended for single-device execution. While distributed processing provides numerous benefits over single-device processing, distributed query execution engines can be significantly more difficult to develop than single-device engines. Embodiments of this disclosure enable the use of a single-device engine to support distributed processing, by dividing a query into multiple stages, each of which can be executed by multiple, concurrent executions of a single-device engine. Between stages, data can be shuffled between executions of the engine, such that individual executions of the engine are provided with a complete set of records needed to implement an individual stage. Because single-device engines can be significantly less difficult to develop, use of the techniques described herein can enable a distributed system to rapidly support multiple query languages.

Type: Application

Filed: October 18, 2019

Publication date: February 13, 2020

Inventors: Arindam Bhattacharjee, Sourav Pal, Timothy Tully
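
Illustrative sketch (not taken from the application): the abstract above describes splitting a query into stages, running each stage as several executions of a single-device engine, and shuffling data between stages so that each execution sees the complete set of records it needs. The following is a minimal two-stage example under stated assumptions: `single_device_sum` is a hypothetical stand-in for a single-device engine, records are hypothetical `(key, value)` pairs, and the executions run sequentially here rather than concurrently.

```python
import zlib
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

Record = Tuple[str, int]  # hypothetical record shape: (key, value)


def single_device_sum(records: Iterable[Record]) -> List[Record]:
    """Stand-in for a single-device engine: sum values per key."""
    totals: Dict[str, int] = defaultdict(int)
    for key, value in records:
        totals[key] += value
    return sorted(totals.items())


def shuffle(outputs: Iterable[List[Record]], num_partitions: int) -> List[List[Record]]:
    """Route all records that share a key to the same partition."""
    partitions: List[List[Record]] = [[] for _ in range(num_partitions)]
    for output in outputs:
        for key, value in output:
            index = zlib.crc32(key.encode("utf-8")) % num_partitions
            partitions[index].append((key, value))
    return partitions


def run_two_stage(splits: List[List[Record]], num_partitions: int = 2) -> Dict[str, int]:
    # Stage 1: one engine execution per input split.
    stage1 = [single_device_sum(split) for split in splits]
    # Shuffle: co-locate every partial sum for a given key.
    partitions = shuffle(stage1, num_partitions)
    # Stage 2: each execution now holds all records for its keys.
    stage2 = [single_device_sum(part) for part in partitions]
    return dict(pair for part in stage2 for pair in part)


splits = [[("a", 1), ("b", 2)], [("a", 3), ("c", 4)]]
totals = run_two_stage(splits)
```

The engine itself never needs to know it is part of a distributed job; only the shuffle step is distribution-aware, which is the property the abstract highlights.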

QUERY EXECUTION AT A REMOTE HETEROGENEOUS DATA STORE OF A DATA FABRIC SERVICE

Publication number: 20200050586

Abstract: Systems and methods are described for executing a query of raw machine data that is stored at a remote data store that may store heterogeneous data. The system can determine the directories or file types that may store event data and may instruct one or more worker nodes to access files that may store events based on the determined directories or file types. Further, the system may exclude files at the remote data store that may not be identified as potentially storing events, enabling a query that implicates a heterogeneous data store to be efficiently executed.

Type: Application

Filed: October 18, 2019

Publication date: February 13, 2020

Inventors: Sourav Pal, Arindam Bhattacharjee, Timothy Tully
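
Illustrative sketch (not taken from the application): the abstract above describes restricting a query to files whose directory or file type suggests they may hold event data, and excluding everything else. A minimal version of such a filter is shown below. The function name, the example paths, and the use of file suffixes as a proxy for "file type" are all assumptions made for this example.

```python
from pathlib import PurePosixPath
from typing import Iterable, List, Set


def select_candidate_files(
    paths: Iterable[str],
    event_dirs: Set[str],
    event_suffixes: Set[str],
) -> List[str]:
    """Keep files located under an event directory or having an event suffix."""
    selected: List[str] = []
    for path in paths:
        p = PurePosixPath(path)
        # True if any ancestor directory of the file is a known event directory.
        in_event_dir = any(str(parent) in event_dirs for parent in p.parents)
        has_event_suffix = p.suffix in event_suffixes
        if in_event_dir or has_event_suffix:
            selected.append(path)
    return selected


# Hypothetical contents of a heterogeneous remote store.
paths = ["/logs/app/a.log", "/images/cat.png", "/misc/events.json", "/logs/notes"]
candidates = select_candidate_files(paths, {"/logs"}, {".log", ".json"})
```

Only the selected paths would then be handed to worker nodes; files such as `/images/cat.png` are never read.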

REASSIGNING PROCESSING TASKS TO AN EXTERNAL STORAGE SYSTEM

Publication number: 20200050607

Abstract: Systems and methods are described for reducing execution time of a query that references external data systems. The system can determine that an external data system is capable of processing one or more map or reduce phases of a map-reduce operation. When it is determined that the external data system can process a map or reduce phase, associated operations may be reassigned from the system to the external data system, reducing the processing resources used by the system to respond to the query and, in some cases, speeding up performance of the query.

Type: Application

Filed: October 18, 2019

Publication date: February 13, 2020

Inventors: Sourav Pal, Arindam Bhattacharjee, Wayne Patterson

SEARCH FUNCTIONALITY OF A DATA INTAKE AND QUERY SYSTEM

Publication number: 20200004794

Abstract: Disclosed is a technique that can be performed in a distributed computer network. The technique can include a data index and query system that receives a search query and defines a search scheme for applying the search query on distributed data storage systems including an internal data storage system of the data index and query system and an external data storage system. The internal data storage system stores data as time-indexed events including respective segments of raw machine data. The data index and query system can transfer a portion of the search scheme to a search service, which can return search results obtained by application of the search scheme to the distributed data storage systems including the internal data storage system and the external data storage system. Lastly, the search results or data indicative of the search results can be output on a display device to the user.

Type: Application

Filed: September 13, 2019

Publication date: January 2, 2020

Inventors: Sourav Pal, Christopher Madden Pride, Arindam Bhattacharjee, Xiaowei Wang, James Alasdair Robert Hodge, Mustafa Ahamed

Data fabric services

Patent number: 10474723

Abstract: The disclosed embodiments also include monitoring and metering services of the data fabric service (DFS) system. Specifically, these services can include techniques for monitoring and metering metrics of the DFS system. The metrics are standards for measuring use or misuse of the DFS system. Examples of the metrics include data or components of the DFS system. For example, a metric can include data stored or communicated by the DFS system or components of the DFS system that are used or reserved for exclusive use by customers. The metrics can be measured with respect to time or computing resources (e.g., CPU utilization, memory usage) of the DFS system. For example, a DFS service can include metering the usage of particular worker nodes by a customer over a threshold period of time.

Type: Grant

Filed: October 31, 2016

Date of Patent: November 12, 2019

Assignee: SPLUNK INC.

Inventors: James Alasdair Robert Hodge, Sourav Pal, Arindam Bhattacharjee, Mustafa Ahamed

BUCKET DATA DISTRIBUTION FOR EXPORTING DATA TO WORKER NODES

Publication number: 20190310977

Abstract: Systems and methods are described for exporting bucket data from one or more buckets to one or more worker nodes. The system can identify bucket data, from different buckets stored in a data intake and query system, that is to be processed by one or more worker nodes. The system can allocate one or more execution resources, such as a processing pipeline, to process and export the bucket data from the buckets. The system can assign bucket data corresponding to individual buckets to the execution resource based on a bucket distribution policy. The indexer can export the bucket data to the worker nodes for further processing based on the bucket data-execution resource assignment.

Type: Application

Filed: April 29, 2019

Publication date: October 10, 2019

Inventors: Sourav Pal, Arindam Bhattacharjee, Asha Andrade, Nikhil Roy

ASSIGNING PROCESSING TASKS IN A DATA INTAKE AND QUERY SYSTEM

Publication number: 20190272271

Abstract: Systems and methods are described for assigning a processing task from one component of a data intake and query system to a different component of the data intake and query system. As part of processing a query, the system can determine that a particular processing task is to be executed by a particular component of the data intake and query system. Based on the characteristics of the component that is to execute the processing task, the system can assign the task or a supplemental task to one or more other components of the data intake and query system.

Type: Application

Filed: April 29, 2019

Publication date: September 5, 2019

Inventors: Arindam Bhattacharjee, Sourav Pal, Srinivas Bobba

Determining Records Generated by a Processing Task of a Query

Publication number: 20190258635

Abstract: Systems and methods are described for determining a quantity of records generated by a processing task of a query executed in a data intake and query system. The system receives a query and identifies a processing task of the query and a quantity of records to be processed according to the query. The system determines the number of records generated by the processing task based on the number of records to be processed and a record generation estimate. The system can allocate compute resources or determine a query execution time for at least a portion of the query based on the determined quantity of records generated.

Type: Application

Filed: April 29, 2019

Publication date: August 22, 2019

Inventors: Sourav Pal, Arindam Bhattacharjee, Asha Andrade

QUERY SCHEDULING BASED ON A QUERY-RESOURCE ALLOCATION AND RESOURCE AVAILABILITY

Publication number: 20190258631

Abstract: Systems and methods are described for scheduling a query for execution. The system receives and parses a query to identify one or more portions of the query. The system determines a resource allocation for each portion of the query, and determines an availability of compute resources for the different portions of the query. Based on the resource allocation and the availability of compute resources, the system schedules the query.

Type: Application

Filed: April 29, 2019

Publication date: August 22, 2019

Inventors: Sourav Pal, Arindam Bhattacharjee, Nikhil Roy
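
Illustrative sketch (not taken from the application): the abstract above describes scheduling a query from two inputs, the resource allocation determined for each portion of the query and the compute resources currently available. One simple policy consistent with that description is first-come admission: a query is scheduled only if the combined allocation of its portions fits in what remains. The `QueryPortion` type, the "slot" unit, and the admission policy are assumptions made for this example.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class QueryPortion:
    name: str
    required_slots: int  # hypothetical unit of compute for this portion


def schedule(queries: Dict[str, List[QueryPortion]], available_slots: int) -> List[str]:
    """Admit queries in submission order while their total allocation fits."""
    admitted: List[str] = []
    for query_id, portions in queries.items():
        needed = sum(p.required_slots for p in portions)
        if needed <= available_slots:
            admitted.append(query_id)
            available_slots -= needed  # reserve the slots for this query
    return admitted


queries = {
    "q1": [QueryPortion("scan", 2), QueryPortion("reduce", 1)],
    "q2": [QueryPortion("scan", 4)],
    "q3": [QueryPortion("scan", 1)],
}
admitted = schedule(queries, available_slots=5)
```

Here `q2` is skipped because only two slots remain after `q1` is admitted, while the smaller `q3` still fits.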

PARTITIONING AND REDUCING RECORDS AT INGEST OF A WORKER NODE

Publication number: 20190258637

Abstract: Systems and methods are described for partitioning and reducing records at ingest of a worker node. The worker node receives chunks of data from one or more indexers of a data intake and query system based on the execution of a query by the data intake and query system. The worker node assigns records to different record groups based on the content of the records. The system also assigns the record to a partition of a group of partitions. Record data of the records in a particular partition is combined. The system processes the partitions based on the query.

Type: Application

Filed: April 29, 2019

Publication date: August 22, 2019

Inventors: Arindam Bhattacharjee, Sourav Pal, Wayne Patterson, Srinivas Bobba

Determining a Record Generation Estimate of a Processing Task

Publication number: 20190258632

Abstract: Systems and methods are described for determining a record generation estimate related to a particular processing task. The system obtains a sample set of data that includes multiple records. The system applies a processing task, such as a transform or regular expression rule, to the sample set of data and determines how many records are generated by the processing task. Based on the number of records generated, the system determines a record generation estimate. The system can use the record generation estimate to allocate compute resources or determine a query execution time for at least a portion of a query.

Type: Application

Filed: April 29, 2019

Publication date: August 22, 2019

Inventors: Sourav Pal, Arindam Bhattacharjee, Asha Andrade
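
Illustrative sketch (not taken from the application): the abstract above describes applying a processing task to a sample of records, counting how many records it generates, and deriving an estimate from that count. A minimal reading of this is a ratio of output records to input records, which can then be scaled to the full input size. The task `split_on_commas`, the ratio-based definition of the estimate, and the function names are assumptions made for this example.

```python
import re
from typing import Callable, List, Sequence


def split_on_commas(record: str) -> List[str]:
    """Hypothetical regex-based task: emit one record per comma-separated field."""
    return re.split(r",", record)


def record_generation_estimate(
    sample: Sequence[str],
    task: Callable[[str], List[str]],
) -> float:
    """Average number of output records the task generates per input record."""
    if not sample:
        raise ValueError("sample must contain at least one record")
    generated = sum(len(task(record)) for record in sample)
    return generated / len(sample)


def estimate_output_records(input_count: int, estimate: float) -> int:
    """Scale the per-record estimate to the full number of input records."""
    return round(input_count * estimate)


sample = ["a,b", "c", "d,e,f"]  # 3 input records that expand to 6 output records
estimate = record_generation_estimate(sample, split_on_commas)
projected = estimate_output_records(1000, estimate)
```

A projected record count of this kind is what a planner could feed into resource allocation or an execution-time estimate before running the task on the full data set.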

RECORD EXPANSION AND REDUCTION BASED ON A PROCESSING TASK IN A DATA INTAKE AND QUERY SYSTEM

Publication number: 20190258636

Abstract: Systems and methods are described for processing records associated with a query that identifies an association between two data fields. The system can obtain a chunk of data that includes multiple records based on a query received by a data intake and query system. At least one record can include multiple sub-records that share a field value for at least one field. The system can generate a record from each sub-record and assign the generated records to one or more groups of partitions. The system can combine record data of generated records assigned to one partition of a group of partitions and then combine record data across the group of partitions. The system can process the results of the combination of records across the group of partitions based on the query.

Type: Application

Filed: April 29, 2019

Publication date: August 22, 2019

Inventors: Arindam Bhattacharjee, Sourav Pal, Wayne Patterson
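
Illustrative sketch (not taken from the application): the abstract above describes expanding a record that contains several sub-records into individual records, assigning them to partitions, combining record data within each partition, and then combining across partitions. The example below counts `(user, host)` pairs as a stand-in for a query that associates two fields. The record layout, the round-robin partition assignment, and the use of counting as the combine step are assumptions made for this example.

```python
from collections import Counter
from typing import Dict, List, Tuple

Pair = Tuple[str, str]  # hypothetical (user, host) association


def expand(record: Dict[str, object]) -> List[Pair]:
    """Generate one (user, host) record per sub-record sharing the same user."""
    user = str(record["user"])
    return [(user, str(host)) for host in record["hosts"]]  # type: ignore[union-attr]


def count_pairs(records: List[Dict[str, object]], num_partitions: int = 2) -> Counter:
    # Assign expanded records to partitions round-robin.
    partitions: List[List[Pair]] = [[] for _ in range(num_partitions)]
    position = 0
    for record in records:
        for pair in expand(record):
            partitions[position % num_partitions].append(pair)
            position += 1
    # Combine within each partition first...
    per_partition = [Counter(part) for part in partitions]
    # ...then combine the partial results across the group of partitions.
    total: Counter = Counter()
    for partial in per_partition:
        total.update(partial)
    return total


records = [
    {"user": "u1", "hosts": ["h1", "h2"]},
    {"user": "u1", "hosts": ["h1"]},
]
pair_counts = count_pairs(records)
```

Combining inside each partition before combining across partitions keeps the final step small, since it only has to merge one partial result per partition.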

Data fabric service system architecture

Patent number: 10353965

Abstract: Disclosed is a technique that can be performed in a distributed computer network. The technique can include a data index and query system that receives a search query and defines a search scheme for applying the search query on distributed data storage systems including an internal data storage system of the data index and query system and an external data storage system. The internal data storage system stores data as time-indexed events including respective segments of raw machine data. The data index and query system can transfer a portion of the search scheme to a search service, which can return search results obtained by application of the search scheme to the distributed data storage systems including the internal data storage system and the external data storage system. Lastly, the search results or data indicative of the search results can be output on a display device to the user.

Type: Grant

Filed: September 26, 2016

Date of Patent: July 16, 2019

Assignee: SPLUNK INC.

Inventors: Sourav Pal, Christopher Pride, Arindam Bhattacharjee, Xiaowei Wang, James Alasdair Robert Hodge, Mustafa Ahamed

DATA INTAKE AND QUERY SYSTEM SEARCH FUNCTIONALITY IN A DATA FABRIC SERVICE SYSTEM

Publication number: 20190171676

Abstract: Disclosed is a technique that can be performed in a distributed computer network. The technique can include a data index and query system that receives a search query and defines a search scheme for applying the search query on distributed data storage systems including an internal data storage system of the data intake and query system and an external data storage system communicatively coupled to the data intake and query system over a network. The data index and query system communicates at least a portion of the search scheme to a search service for application on behalf of the data intake and query system, receives from the search service a search result of the search query obtained by application of the search scheme to the distributed data storage systems, and causes the search result or data indicative thereof to be displayed on a display device.

Type: Application

Filed: January 31, 2019

Publication date: June 6, 2019

Inventors: Sourav Pal, Christopher Pride, Arindam Bhattacharjee, Xiaowei Wang, James Alasdair Robert Hodge, Mustafa Ahamed

SEARCH SERVICE FOR A DATA FABRIC SYSTEM

Publication number: 20190171677

Abstract: Disclosed is a technique that can be performed in a distributed network. The technique can include a search service system that receives an indication of at least a portion of a search scheme to cause worker nodes to obtain search results from distributed data storage systems. The search scheme is defined by a data intake and query system. The search service system defines a search process based on the at least a portion of the search scheme and executes the search process to cause the worker nodes to obtain search results from the distributed data storage systems. The search service system receives a combination of search results based on the search results obtained by the worker nodes from the distributed data storage systems, and causes an output based on the combination of search results obtained by the data intake and query system in accordance with the search scheme.

Type: Application

Filed: January 31, 2019

Publication date: June 6, 2019

Inventors: Sourav Pal, Christopher Pride, Arindam Bhattacharjee, Xiaowei Wang, James Alasdair Robert Hodge, Mustafa Ahamed

SEARCH FUNCTIONALITY OF WORKER NODES IN A DATA FABRIC SERVICE SYSTEM

Publication number: 20190171678

Abstract: Disclosed is a technique that can be performed in a distributed computer network. The technique can include a worker node that receives search instructions defined by a search service based on at least a portion of a search scheme defined by a data intake and query system, to cause the worker node to obtain search results from distributed data storage systems communicatively coupled to the worker node over a network. The distributed data storage systems include an external data storage system and/or an internal data storage system of the data intake and query system. The worker node obtains the search results by searching the distributed data storage systems in accordance with the search instructions, and communicates, over the network to the search service, a combination of search results based on the search results to cause an output by the data intake and query system in accordance with the search scheme.

Type: Application

Filed: January 31, 2019

Publication date: June 6, 2019

Inventors: Sourav Pal, Christopher Pride, Arindam Bhattacharjee, Xiaowei Wang, James Alasdair Robert Hodge, Mustafa Ahamed

TIMELINER FOR A DATA FABRIC SERVICE SYSTEM

Publication number: 20190163840

Abstract: The disclosed embodiments include techniques for organizing and presenting search results obtained from within a big data ecosystem via a data intake and query system. In particular, a data intake and query system may cause output of the search results or data indicative of the search results on a display device.

Type: Application

Filed: October 31, 2016

Publication date: May 30, 2019

Inventors: Sourav Pal, Arindam Bhattacharjee, Christopher Pride
