Patents by Inventor Srihari Cadambi
Srihari Cadambi has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Patent number: 10162550
Abstract: A graph storage and processing system is provided. The system includes a scalable, distributed, fault-tolerant, in-memory graph storage device for storing base graph data representative of graphs. The system further includes a real-time, in memory graph storage device for storing update graph data representative of graph updates for the graphs with respect to a time threshold. The system also includes an in-memory graph sampler for sampling the base graph data to generate sampled portions of the graphs and for storing the sampled portions of the graph. The system additionally includes a query manager for providing a query interface between applications and the system and for forming graph data representative of a complete graph from at least the base graph data and the update graph data, if any. The system also includes a graph computer for processing the sampled portions using batch-type computations to generate approximate results for graph-based queries.
Type: Grant
Filed: August 20, 2015
Date of Patent: December 25, 2018
Assignee: NEC Corporation
Inventors: Kunal Rao, Giuseppe Coviello, Srimat Chakradhar, Souvik Bhattacherjee, Srihari Cadambi
-
Patent number: 10019190
Abstract: A method for detecting abnormal changes in real-time in dynamic graphs. The method includes extracting, by a graph sampler, an active sampled graph from an underlying base graph. The method further includes merging, by a graph merger, the active sampled graph with graph updates within a predetermined recent time period to generate a merged graph. The method also includes computing, by a graph diameter computer, a diameter of the merged graph. The method additionally includes determining, by a graph diameter change determination device, whether a graph diameter change exists. The method further includes generating, by an alarm generator, a user-perceptible alarm responsive to the graph diameter change.
Type: Grant
Filed: August 20, 2015
Date of Patent: July 10, 2018
Assignee: NEC Corporation
Inventors: Kunal Rao, Giuseppe Coviello, Srimat Chakradhar, Souvik Bhattacherjee, Srihari Cadambi
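The steps named in this abstract (merge a sampled graph with recent updates, compute the diameter, flag a change) can be illustrated with a minimal sketch. This is not the patented implementation: the function names, the edge-list representation, the exact all-pairs breadth-first search, and the `tolerance` parameter are all assumptions made for illustration.

```python
from collections import deque

def merge_graphs(sampled_edges, update_edges):
    """Build an undirected adjacency map from sampled edges plus recent updates."""
    adj = {}
    for u, v in list(sampled_edges) + list(update_edges):
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj

def eccentricity(adj, source):
    """Longest shortest-path distance from source, found by breadth-first search."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for nxt in adj[node]:
            if nxt not in dist:
                dist[nxt] = dist[node] + 1
                queue.append(nxt)
    return max(dist.values())

def diameter(adj):
    """Largest eccentricity over all vertices (per component if disconnected)."""
    return max((eccentricity(adj, n) for n in adj), default=0)

def diameter_alarm(sampled_edges, update_edges, previous_diameter, tolerance=0):
    """Return (current diameter, whether it moved by more than tolerance)."""
    merged = merge_graphs(sampled_edges, update_edges)
    current = diameter(merged)
    return current, abs(current - previous_diameter) > tolerance
```

For example, a path 1-2-3-4 has diameter 3; adding the update edge (1, 4) closes it into a cycle with diameter 2, so `diameter_alarm` would report a change against a previous diameter of 3.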
-
Patent number: 9965209
Abstract: A method in a graph storage and processing system is provided. The method includes storing, in a scalable, distributed, fault-tolerant, in-memory graph storage device, base graph data representative of graphs, and storing, in a real-time, in memory graph storage device, update graph data representative of graph updates for the graphs with respect to a time threshold. The method further includes sampling the base graph data to generate sampled portions of the graphs and storing the sampled portions, by an in-memory graph sampler. The method additionally includes providing, by a query manager, a query interface between applications and the system. The method also includes forming, by the query manager, graph data representative of a complete graph from at least the base graph data and the update graph data, if any. The method includes processing, by a graph computer, the sampled portions using batch-type computations to generate approximate results for graph-based queries.
Type: Grant
Filed: August 20, 2015
Date of Patent: May 8, 2018
Assignee: NEC Corporation
Inventors: Kunal Rao, Giuseppe Coviello, Srimat Chakradhar, Souvik Bhattacherjee, Srihari Cadambi
-
Patent number: 9720597
Abstract: Systems and methods for swapping out and in pinned memory regions between main memory and a separate storage location in a system, including establishing an offload buffer in an interposing library; swapping out pinned memory regions by transferring offload buffer data from a coprocessor memory to a host processor memory, unregistering and unmapping a memory region employed by the offload buffer from the interposing library, wherein the interposing library is pre-loaded on the coprocessor, and collects and stores information employed during the swapping out. The pinned memory regions are swapped in by mapping and re-registering the files to the memory region employed by the offload buffer, and transferring data of the offload buffer data from the host memory back to the re-registered memory region.
Type: Grant
Filed: January 23, 2015
Date of Patent: August 1, 2017
Assignee: NEC Corporation
Inventors: Cheng-Hong Li, Giuseppe Coviello, Kunal Rao, Murugan Sankaradas, Srihari Cadambi, Srimat Chakradhar, Rajat Phull
-
Patent number: 9367357
Abstract: Methods and systems for scheduling jobs to manycore nodes in a cluster include selecting a job to run according to the job's wait time and the job's expected execution time; sending job requirements to all nodes in a cluster, where each node includes a manycore processor; determining at each node whether said node has sufficient resources to ever satisfy the job requirements and, if no node has sufficient resources, deleting the job; creating a list of nodes that have sufficient free resources at a present time to satisfy the job requirements; and assigning the job to a node, based on a difference between an expected execution time and associated confidence value for each node and a hypothetical fastest execution time and associated hypothetical maximum confidence value.
Type: Grant
Filed: April 24, 2014
Date of Patent: June 14, 2016
Assignee: NEC Corporation
Inventors: Srihari Cadambi, Kunal Rao, Srimat Chakradhar, Rajat Phull, Giuseppe Coviello, Murugan Sankaradass, Cheng-Hong Li
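The final assignment step in this abstract compares each node's (expected execution time, confidence) pair against a hypothetical ideal pair. One way to picture that comparison is the sketch below. The abstract does not say how the "difference" is computed, so the normalized Euclidean distance, the function name `pick_node`, and the dictionary input format are assumptions.

```python
import math

def pick_node(candidates, fastest_time, max_confidence=1.0):
    """Choose the node whose estimate lies closest to the hypothetical ideal.

    candidates: dict mapping node name -> (expected_time, confidence).
    fastest_time: hypothetical fastest execution time (must be > 0).
    The distance measure is an assumption, not taken from the patent.
    """
    def distance(estimate):
        expected_time, confidence = estimate
        # Scale the time gap by the ideal time so both terms are unitless.
        time_gap = (expected_time - fastest_time) / fastest_time
        confidence_gap = max_confidence - confidence
        return math.hypot(time_gap, confidence_gap)

    return min(candidates, key=lambda node: distance(candidates[node]))
```

With candidates `{"a": (10.0, 0.2), "b": (11.0, 0.95)}` and a fastest time of 10.0, node "b" is preferred: it is slightly slower but far more confident in its estimate.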
-
Publication number: 20160110409
Abstract: A method in a graph storage and processing system is provided. The method includes storing, in a scalable, distributed, fault-tolerant, in-memory graph storage device, base graph data representative of graphs, and storing, in a real-time, in memory graph storage device, update graph data representative of graph updates for the graphs with respect to a time threshold. The method further includes sampling the base graph data to generate sampled portions of the graphs and storing the sampled portions, by an in-memory graph sampler. The method additionally includes providing, by a query manager, a query interface between applications and the system. The method also includes forming, by the query manager, graph data representative of a complete graph from at least the base graph data and the update graph data, if any. The method includes processing, by a graph computer, the sampled portions using batch-type computations to generate approximate results for graph-based queries.
Type: Application
Filed: August 20, 2015
Publication date: April 21, 2016
Inventors: Kunal Rao, Giuseppe Coviello, Srimat Chakradhar, Souvik Bhattacherjee, Srihari Cadambi
-
Publication number: 20160110134
Abstract: A graph storage and processing system is provided. The system includes a scalable, distributed, fault-tolerant, in-memory graph storage device for storing base graph data representative of graphs. The system further includes a real-time, in memory graph storage device for storing update graph data representative of graph updates for the graphs with respect to a time threshold. The system also includes an in-memory graph sampler for sampling the base graph data to generate sampled portions of the graphs and for storing the sampled portions of the graph. The system additionally includes a query manager for providing a query interface between applications and the system and for forming graph data representative of a complete graph from at least the base graph data and the update graph data, if any. The system also includes a graph computer for processing the sampled portions using batch-type computations to generate approximate results for graph-based queries.
Type: Application
Filed: August 20, 2015
Publication date: April 21, 2016
Inventors: Kunal Rao, Giuseppe Coviello, Srimat Chakradhar, Souvik Bhattacherjee, Srihari Cadambi
-
Publication number: 20160110404
Abstract: A method is provided for detecting abnormal changes in real-time in dynamic graphs. The method includes extracting, by a graph sampler, an active sampled graph from an underlying base graph. The method further includes merging, by a graph merger, the active sampled graph with graph updates within a predetermined recent time period to generate a merged graph. The method also includes computing, by a graph diameter computer, a diameter of the merged graph. The method additionally includes determining, by a graph diameter change determination device, whether a graph diameter change exists. The method further includes generating, by an alarm generator, a user-perceptible alarm responsive to the graph diameter change.
Type: Application
Filed: August 20, 2015
Publication date: April 21, 2016
Inventors: Kunal Rao, Giuseppe Coviello, Srimat Chakradhar, Souvik Bhattacherjee, Srihari Cadambi
-
Method for simultaneous scheduling of processes and offloading computation on many-core coprocessors
Patent number: 9152467
Abstract: A method is disclosed to manage a multi-processor system with one or more manycore devices, by managing real-time bag-of-tasks applications for a cluster, wherein each task runs on a single server node, and uses the offload programming model, and wherein each task has a deadline and three specific resource requirements: total processing time, a certain number of manycore devices and peak memory on each device; when a new task arrives, querying each node scheduler to determine which node can best accept the task and each node scheduler responds with an estimated completion time and a confidence level, wherein the node schedulers use an urgency-based heuristic to schedule each task and its offloads; responding to an accept/reject query phase, wherein the cluster scheduler sends the task requirements to each node and queries if the node can accept the task with an estimated completion time and confidence level; and scheduling tasks and offloads using an aging and urgency-based heuristic, wherein the aging guarante...
Type: Grant
Filed: April 6, 2013
Date of Patent: October 6, 2015
Assignee: NEC Laboratories America, Inc.
Inventors: Srihari Cadambi, Kunal Rao, Srimat T. Chakradhar, Rajat Phull, Giuseppe Coviello, Murugan Sankaradass, Cheng-Hong Li
-
Patent number: 9135741
Abstract: Systems and methods are disclosed that share coprocessor resources between two or more applications in a computing cluster using a job selector to receive jobs from a job queue; a node selector coupled to the job selector; an off line profiler with an interference prediction model; a coprocessor dynamic interference detection module; and a coprocessor interference response module.
Type: Grant
Filed: October 6, 2012
Date of Patent: September 15, 2015
Assignee: NEC Laboratories America, Inc.
Inventors: Cheng-Hong Li, Srihari Cadambi, Srimat T Chakradhar, Rajat Phull
-
Publication number: 20150212733
Abstract: Systems and methods for swapping out and in pinned memory regions between main memory and a separate storage location in a system, including establishing an offload buffer in an interposing library; swapping out pinned memory regions by transferring offload buffer data from a coprocessor memory to a host processor memory, unregistering and unmapping a memory region employed by the offload buffer from the interposing library, wherein the interposing library is pre-loaded on the coprocessor, and collects and stores information employed during the swapping out. The pinned memory regions are swapped in by mapping and re-registering the files to the memory region employed by the offload buffer, and transferring data of the offload buffer data from the host memory back to the re-registered memory region.
Type: Application
Filed: January 23, 2015
Publication date: July 30, 2015
Inventors: Cheng-Hong Li, Giuseppe Coviello, Kunal Rao, Murugan Sankaradas, Srihari Cadambi, Srimat Chakradhar, Rajat Phull
-
Patent number: 9086925
Abstract: A runtime method is disclosed that dynamically sets up core containers and thread-to-core affinity for processes running on manycore coprocessors. The method is completely transparent to user applications and incurs low runtime overhead. The method is implemented within a user-space middleware that also performs scheduling and resource management for both offload and native applications using the manycore coprocessors.
Type: Grant
Filed: April 6, 2013
Date of Patent: July 21, 2015
Assignee: NEC Laboratories America, Inc.
Inventors: Cheng-Hong Li, Kunal Rao, Srihari Cadambi, Rajat Phull, Giuseppe Coviello, Murugan Sankaradas, Srimat Chakradhar
-
Patent number: 9038088
Abstract: Methods and systems for managing data loads on a cluster of processors that implement an iterative procedure through parallel processing of data for the procedure are disclosed. One method includes monitoring, for at least one iteration of the procedure, completion times of a plurality of different processing phases that are undergone by each of the processors in a given iteration. The method further includes determining whether a load imbalance factor threshold is exceeded in the given iteration based on the completion times for the given iteration. In addition, the data is repartitioned by reassigning the data to the processors based on predicted dependencies between assigned data units of the data and completion times of a plurality of the processors for at least two of the phases. Further, the parallel processing is implemented on the cluster of processors in accordance with the reassignment.
Type: Grant
Filed: March 1, 2012
Date of Patent: May 19, 2015
Assignee: NEC Laboratories America, Inc.
Inventors: Rajat Phull, Srihari Cadambi, Nishkam Ravi, Srimat Chakradhar
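The threshold check on per-processor phase completion times can be sketched as follows. The abstract does not define the load imbalance factor, so the ratio used here (slowest processor's total time over the mean total time), the default threshold of 1.2, and the function names are assumptions. The repartitioning step itself, which depends on predicted data dependencies, is not shown.

```python
def imbalance_factor(phase_times):
    """Ratio of the slowest processor's total time to the mean total time.

    phase_times: one list per processor, holding that processor's
    completion time for each phase of a single iteration.
    """
    totals = [sum(times) for times in phase_times]
    mean_total = sum(totals) / len(totals)
    return max(totals) / mean_total

def needs_repartition(phase_times, threshold=1.2):
    """True when the assumed imbalance factor exceeds the threshold."""
    return imbalance_factor(phase_times) > threshold
```

A perfectly balanced iteration gives a factor of 1.0; three processors with total times of 3, 3, and 9 give a factor of 1.8, which would trigger repartitioning under the assumed threshold.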
-
Publication number: 20150113542
Abstract: A method is provided for controlling a compute cluster having a plurality of nodes. Each of the plurality of nodes has a respective computing device with a main server and one or more coprocessor-based hardware accelerators. The method includes receiving a plurality of jobs for scheduling. The method further includes scheduling the plurality of jobs across the plurality of nodes responsive to a knapsack-based sharing-aware schedule generated by a knapsack-based sharing-aware scheduler. The knapsack-based sharing-aware schedule is generated to co-locate together on a same computing device certain ones of the plurality of jobs that are mutually compatible based on a set of requirements whose fulfillment is determined using a knapsack-based sharing-aware technique that uses memory as a knapsack capacity and minimizes makespan while adhering to coprocessor memory and thread resource constraints.
Type: Application
Filed: October 3, 2014
Publication date: April 23, 2015
Inventors: Srihari Cadambi, Giuseppe Coviello, Srimat Chakradhar
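To illustrate what "memory as a knapsack capacity" can mean, here is a textbook 0/1 knapsack that picks a set of jobs to co-locate on one coprocessor without exceeding its memory. It is a generic sketch, not the scheduler described in the application: the per-job `value` score, the function name, and the tuple format are hypothetical, and the thread constraint and makespan objective from the abstract are left out.

```python
def pick_colocated_jobs(jobs, memory_capacity):
    """Select jobs that fit in memory_capacity with the highest total value.

    jobs: list of (name, memory_units, value) with integer memory_units.
    Returns (total_value, [job names]). The value score is a hypothetical
    stand-in for whatever benefit the real scheduler optimizes.
    """
    # best[c] holds the best (value, chosen names) using at most c memory units.
    best = [(0, [])] * (memory_capacity + 1)
    for name, memory, value in jobs:
        # Walk capacities downward so each job is used at most once.
        for cap in range(memory_capacity, memory - 1, -1):
            candidate = best[cap - memory][0] + value
            if candidate > best[cap][0]:
                best[cap] = (candidate, best[cap - memory][1] + [name])
    return best[memory_capacity]
```

With jobs `[("a", 4, 5), ("b", 3, 4), ("c", 2, 3)]` and a capacity of 5, the sketch co-locates "b" and "c" (total value 7) rather than running "a" alone.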
-
Patent number: 8990827
Abstract: Systems and methods for managing a processor and one or more co-processors for a database application whose queries have been processed into an intermediate form (IR) containing kernels of the database application that have been fused and split; dynamically scheduling such kernels on CUDA streams and further dynamically dispatching kernels to GPU devices by estimating execution time in order to achieve high performance.
Type: Grant
Filed: October 6, 2012
Date of Patent: March 24, 2015
Assignee: NEC Laboratories America, Inc.
Inventors: Haicheng Wu, Srihari Cadambi, Srimat T Chakradhar
-
Patent number: 8984519
Abstract: A system and method for scheduling client-server applications onto heterogeneous clusters includes storing at least one client request of at least one application in a pending request list on a computer readable storage medium. A priority metric is computed for each application, where the computed priority metric is applied to each client request belonging to that application. The priority metric is determined based on estimated performance of the client request and load on the pending request list. The at least one client request of the at least one application is scheduled based on the priority metric onto one or more heterogeneous resources.
Type: Grant
Filed: October 13, 2011
Date of Patent: March 17, 2015
Assignee: NEC Laboratories America, Inc.
Inventors: Srihari Cadambi, Srimat Chakradhar, M. Mustafa Rafique
-
Publication number: 20150066988
Abstract: Systems and methods for sorting data, including chunking unsorted data such that each chunk is of a size that fits within a last level cache of the system. One or more threads are instantiated in each physical core of the system, chunks assigned physical cores are distributed evenly across the threads on the physical cores. Subchunks in the physical cores are sorted using vector intrinsics, the subchunks being data assigned to the threads in the physical cores, and the subchunks are merged to generate sorted large chunks. A binary tree, which includes leaf nodes that correspond to the sorted large chunks, is built, leaf nodes are assigned to threads, and tree nodes are assigned to a circular buffer, wherein the circular buffer is lock and synchronization free. The large chunks are sorted to generate sorted data as output.
Type: Application
Filed: August 29, 2014
Publication date: March 5, 2015
Inventors: Srihari Cadambi, Srimat Chakradhar, Yuan Yuan
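The overall shape of this approach (sort cache-sized chunks, then merge them pairwise up a binary tree) can be shown with a single-threaded sketch. It leaves out everything that makes the described method fast: there are no threads, no vector intrinsics, and no lock-free circular buffer. The function names and the `chunk_size` parameter are assumptions.

```python
import heapq

def merge_tree(chunks):
    """Merge sorted chunks pairwise, level by level, like a binary merge tree."""
    while len(chunks) > 1:
        next_level = []
        for i in range(0, len(chunks), 2):
            # Merge two neighbours; an odd chunk at the end passes through.
            next_level.append(list(heapq.merge(*chunks[i:i + 2])))
        chunks = next_level
    return chunks[0] if chunks else []

def chunked_sort(data, chunk_size):
    """Sort fixed-size chunks independently, then merge them up the tree."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    chunks = [sorted(data[i:i + chunk_size])
              for i in range(0, len(data), chunk_size)]
    return merge_tree(chunks)
```

In a real system `chunk_size` would presumably be chosen so that one chunk fits in the last level cache, as the abstract describes; here it is just a number passed by the caller.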
-
Patent number: 8874943
Abstract: Low-power systems and methods are disclosed for executing an application software on a general purpose processor and a plurality of accelerators with a runtime controller. The runtime controller splits a workload across the processor and the accelerators to minimize energy. The system includes building one or more performance models in an application-agnostic manner; and monitoring system performance in real-time and adjusting the workload splitting to minimize energy while conforming to a target quality of service (QoS).
Type: Grant
Filed: April 4, 2011
Date of Patent: October 28, 2014
Assignee: NEC Laboratories America, Inc.
Inventors: Abhinandan Majumdar, Srihari Cadambi, Srimat T Chakradhar
-
Publication number: 20140237477
Abstract: Methods and systems for scheduling jobs to manycore nodes in a cluster include selecting a job to run according to the job's wait time and the job's expected execution time; sending job requirements to all nodes in a cluster, where each node includes a manycore processor; determining at each node whether said node has sufficient resources to ever satisfy the job requirements and, if no node has sufficient resources, deleting the job; creating a list of nodes that have sufficient free resources at a present time to satisfy the job requirements; and assigning the job to a node, based on a difference between an expected execution time and associated confidence value for each node and a hypothetical fastest execution time and associated hypothetical maximum confidence value.
Type: Application
Filed: April 24, 2014
Publication date: August 21, 2014
Applicant: NEC Laboratories America, Inc.
Inventors: Srihari Cadambi, Kunal Rao, Srimat Chakradhar, Rajat Phull, Giuseppe Coviello, Murugan Sankaradass, Cheng-Hong Li
-
Publication number: 20140208072
Abstract: A method is disclosed to manage a multi-processor system with one or more multiple-core coprocessors by intercepting coprocessor offload infrastructure application program interface (API) calls; scheduling user processes to run on one of the coprocessors; scheduling offloads within user processes to run on one of the coprocessors; and affinitizing offloads to predetermined cores within one of the coprocessors by selecting and allocating cores to an offload, and obtaining a thread-to-core mapping from a user.
Type: Application
Filed: April 6, 2013
Publication date: July 24, 2014
Applicant: NEC Laboratories America, Inc.
Inventors: Srihari Cadambi, Kunal Rao, Srimat T. Chakradhar, Rajat Phull, Giuseppe Coviello, Murugan Sankaradass, Cheng-Hong Li