Patents by Inventor Srihari Cadambi
Srihari Cadambi has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Publication number: 20140208331
Abstract: A runtime method is disclosed that dynamically sets up core containers and thread-to-core affinity for processes running on manycore coprocessors. The method is completely transparent to user applications and incurs low runtime overhead. The method is implemented within a user-space middleware that also performs scheduling and resource management for both offload and native applications using the manycore coprocessors.
Type: Application
Filed: April 6, 2013
Publication date: July 24, 2014
Applicant: NEC Laboratories America, Inc.
Inventors: Cheng-Hong Li, Srihari Cadambi, Kunal Rao, Srimat T. Chakradhar, Rajat Phull, Giuseppe Coviello, Murugan Sankaradass
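As a rough illustration of the idea of core containers and thread-to-core affinity, the sketch below gives each process a disjoint set of cores and then maps its threads onto those cores. The function names, the contiguous allocation policy, and the round-robin pinning are all assumptions made for illustration; the abstract does not say how the middleware chooses containers.

```python
def assign_core_containers(num_cores, requests):
    """Give each process a disjoint, contiguous set of cores (hypothetical policy).

    requests maps a process name to the number of cores it asks for.
    Returns {name: [core ids]}; raises ValueError on over-subscription.
    """
    if sum(requests.values()) > num_cores:
        raise ValueError("not enough cores for all requests")
    containers, next_core = {}, 0
    for name, wanted in requests.items():
        containers[name] = list(range(next_core, next_core + wanted))
        next_core += wanted
    return containers


def pin_threads(cores, num_threads):
    """Map thread index -> core id, round-robin inside one container."""
    return {t: cores[t % len(cores)] for t in range(num_threads)}


# Example: an 8-core device shared by one offload and one native process.
containers = assign_core_containers(8, {"offload_a": 4, "native_b": 2})
affinity = pin_threads(containers["native_b"], 4)
```

On Linux, a mapping like `affinity` could then be applied with a call such as `os.sched_setaffinity`; that step is left out so the sketch stays platform independent.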
-
METHOD FOR SIMULTANEOUS SCHEDULING OF PROCESSES AND OFFLOADING COMPUTATION ON MANY-CORE COPROCESSORS
Publication number: 20140208327
Abstract: A method is disclosed to manage a multi-processor system with one or more manycore devices, by managing real-time bag-of-tasks applications for a cluster, wherein each task runs on a single server node and uses the offload programming model, and wherein each task has a deadline and three specific resource requirements: total processing time, a certain number of manycore devices, and peak memory on each device; when a new task arrives, querying each node scheduler to determine which node can best accept the task, and each node scheduler responds with an estimated completion time and a confidence level, wherein the node schedulers use an urgency-based heuristic to schedule each task and its offloads; responding to an accept/reject query phase, wherein the cluster scheduler sends the task requirements to each node and queries if the node can accept the task with an estimated completion time and confidence level; and scheduling tasks and offloads using an aging and urgency-based heuristic, wherein the aging guarante…
Type: Application
Filed: April 6, 2013
Publication date: July 24, 2014
Applicant: NEC Laboratories America, Inc.
Inventors: Srihari Cadambi, Kunal Rao, Srimat T. Chakradhar, Rajat Phull, Giuseppe Coviello, Murugan Sankaradass, Cheng-Hong Li
-
Patent number: 8583896
Abstract: Systems and methods for massively parallel processing on an accelerator that includes a plurality of processing cores. Each processing core includes multiple processing chains configured to perform parallel computations, each of which includes a plurality of interconnected processing elements. The cores further include multiple smart memory blocks configured to store and process data, each memory block accepting the output of one of the plurality of processing chains. The cores communicate with at least one off-chip memory bank.
Type: Grant
Filed: July 26, 2010
Date of Patent: November 12, 2013
Assignee: NEC Laboratories America, Inc.
Inventors: Srihari Cadambi, Abhinandan Majumdar, Michela Becchi, Srimat Chakradhar, Hans Peter Graf
-
Publication number: 20130191612
Abstract: Systems and methods are disclosed that share coprocessor resources between two or more applications in a computing cluster using a job selector to receive jobs from a job queue; a node selector coupled to the job selector; an offline profiler with an interference prediction model; a coprocessor dynamic interference detection module; and a coprocessor interference response module.
Type: Application
Filed: October 6, 2012
Publication date: July 25, 2013
Applicant: NEC LABORATORIES AMERICA, INC.
Inventors: Cheng-Hong Li, Srihari Cadambi, Srimat T. Chakradhar, Rajat Phull
-
Patent number: 8442927
Abstract: A coprocessor and method for processing convolutional neural networks includes a configurable input switch coupled to an input. A plurality of convolver elements are enabled in accordance with the input switch. An output switch is configured to receive outputs from the set of convolver elements to provide data to output branches. A controller is configured to provide control signals to the input switch and the output switch such that the set of convolver elements are rendered active and a number of output branches are selected for a given cycle in accordance with the control signals.
Type: Grant
Filed: February 1, 2010
Date of Patent: May 14, 2013
Assignee: NEC Laboratories America, Inc.
Inventors: Srimat Chakradhar, Murugan Sankaradas, Venkata S. Jakkula, Srihari Cadambi
-
Publication number: 20130091507
Abstract: Systems and methods for managing a processor and one or more co-processors for a database application whose queries have been processed into an intermediate form (IR) containing kernels of the database application that have been fused and split; dynamically scheduling such kernels on CUDA streams and further dynamically dispatching kernels to GPU devices by estimating execution time in order to achieve high performance.
Type: Application
Filed: October 6, 2012
Publication date: April 11, 2013
Applicant: NEC Laboratories America, Inc.
Inventors: Haicheng Wu, Srihari Cadambi, Srimat T. Chakradhar
-
Patent number: 8375392
Abstract: Systems and methods for data-aware scheduling of applications on a heterogeneous platform having at least one central processing unit (CPU) and at least one accelerator. Such systems and methods include a function call handling module configured to intercept, analyze, and schedule library calls on a processing element. The function call handling module further includes a function call interception module configured to intercept function calls to predefined libraries, a function call analysis module configured to analyze argument size and location, and a function call redirection module configured to schedule library calls and data transfers. The systems and methods also use a memory unification module, configured to keep data coherent between memories associated with the at least one CPU and the at least one accelerator based on the output of the function call redirection module.
Type: Grant
Filed: August 20, 2010
Date of Patent: February 12, 2013
Assignee: NEC Laboratories America, Inc.
Inventors: Michela Becchi, Surendra Byna, Srihari Cadambi, Srimat Chakradhar
-
Patent number: 8359281
Abstract: A method and system for training an apparatus to recognize a pattern includes providing the apparatus with a host processor executing steps of a machine learning process; providing the apparatus with an accelerator including at least two processors; inputting training pattern data into the host processor; determining coefficient changes in the machine learning process with the host processor using the training pattern data; transferring the training data to the accelerator; determining kernel dot-products with the at least two processors of the accelerator using the training data; and transferring the dot-products back to the host processor.
Type: Grant
Filed: June 4, 2009
Date of Patent: January 22, 2013
Assignee: NEC Laboratories America, Inc.
Inventors: Srihari Cadambi, Igor Durdanovic, Venkata Jakkula, Eric Cosatto, Murugan Sankaradass, Hans Peter Graf, Srimat T. Chakradhar
-
Publication number: 20120233486
Abstract: Methods and systems for managing data loads on a cluster of processors that implement an iterative procedure through parallel processing of data for the procedure are disclosed. One method includes monitoring, for at least one iteration of the procedure, completion times of a plurality of different processing phases that are undergone by each of the processors in a given iteration. The method further includes determining whether a load imbalance factor threshold is exceeded in the given iteration based on the completion times for the given iteration. In addition, the data is repartitioned by reassigning the data to the processors based on predicted dependencies between assigned data units of the data and completion times of a plurality of the processors for at least two of the phases. Further, the parallel processing is implemented on the cluster of processors in accordance with the reassignment.
Type: Application
Filed: March 1, 2012
Publication date: September 13, 2012
Applicant: NEC Laboratories America, Inc.
Inventors: Rajat Phull, Srihari Cadambi, Nishkam Ravi, Srimat Chakradhar
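The threshold check described in this abstract can be sketched as follows. The abstract does not define the load imbalance factor, so the definition used here (slowest processor's completion time divided by the mean) and the threshold value of 1.2 are assumptions; the repartitioning step based on predicted dependencies is not modelled.

```python
from statistics import mean


def imbalance_factor(completion_times):
    """Slowest processor's time divided by the mean time (assumed definition)."""
    return max(completion_times) / mean(completion_times)


def needs_repartition(phase_times, threshold=1.2):
    """phase_times maps a phase name to per-processor completion times.

    Returns True if any phase exceeds the (hypothetical) threshold.
    """
    return any(imbalance_factor(t) > threshold for t in phase_times.values())


# One iteration on four processors: the fourth lags in the compute phase.
iteration = {
    "compute": [10.0, 10.0, 10.0, 18.0],
    "exchange": [2.0, 2.0, 2.0, 2.0],
}
trigger = needs_repartition(iteration)
```

With a ratio-to-mean definition, a perfectly balanced phase scores exactly 1.0, which makes the threshold easy to interpret as a tolerated slowdown.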
-
Publication number: 20120124591
Abstract: A system and method for scheduling client-server applications onto heterogeneous clusters includes storing at least one client request of at least one application in a pending request list on a computer readable storage medium. A priority metric is computed for each application, where the computed priority metric is applied to each client request belonging to that application. The priority metric is determined based on estimated performance of the client request and load on the pending request list. The at least one client request of the at least one application is scheduled based on the priority metric onto one or more heterogeneous resources.
Type: Application
Filed: October 13, 2011
Publication date: May 17, 2012
Applicant: NEC Laboratories America, Inc.
Inventors: Srihari Cadambi, Srimat Chakradhar, M. Mustafa Rafique
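A minimal sketch of a per-application priority metric is shown below. The abstract states only that the metric depends on estimated performance and on the load in the pending request list; the specific formula (pending request count times estimated seconds per request) and all names are hypothetical, and the choice of which heterogeneous resource runs each request is left out.

```python
from collections import Counter


def application_priorities(pending, est_seconds):
    """pending: list of (app, request id); est_seconds: {app: seconds per request}.

    Priority here is the application's estimated backlog (an assumed formula).
    """
    load = Counter(app for app, _ in pending)
    return {app: load[app] * est_seconds[app] for app in load}


def schedule_order(pending, est_seconds):
    """Order requests by their application's priority, highest first."""
    prio = application_priorities(pending, est_seconds)
    # sorted() is stable, so requests of one application keep arrival order.
    return sorted(pending, key=lambda req: -prio[req[0]])


pending = [("search", 1), ("ocr", 2), ("search", 3), ("ocr", 4), ("ocr", 5)]
est = {"search": 0.5, "ocr": 2.0}
order = schedule_order(pending, est)
```

Every request inherits the priority of its application, which mirrors the abstract's statement that the metric is computed per application and applied to each of its requests.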
-
Publication number: 20120079298
Abstract: Low-power systems and methods are disclosed for executing application software on a general purpose processor and a plurality of accelerators with a runtime controller. The runtime controller splits a workload across the processor and the accelerators to minimize energy. The system includes building one or more performance models in an application-agnostic manner; and monitoring system performance in real time and adjusting the workload splitting to minimize energy while conforming to a target quality of service (QoS).
Type: Application
Filed: April 4, 2011
Publication date: March 29, 2012
Applicant: NEC LABORATORIES AMERICA, INC.
Inventors: Abhinandan Majumdar, Srihari Cadambi, Srimat Chakradhar
-
Patent number: 8131532
Abstract: A system and method are disclosed for formal verification of software programs that advantageously bound the ranges of values that a variable in the software can take during runtime.
Type: Grant
Filed: June 3, 2006
Date of Patent: March 6, 2012
Assignee: NEC Laboratories America, Inc.
Inventors: Srihari Cadambi, Aleksandr Zaks, Franjo Ivancic, Ilya Shlyakhter, Zijiang Yang, Malay Ganai, Aarti Gupta, Pranav Ashar
-
Publication number: 20110173155
Abstract: Systems and methods for data-aware scheduling of applications on a heterogeneous platform having at least one central processing unit (CPU) and at least one accelerator. Such systems and methods include a function call handling module configured to intercept, analyze, and schedule library calls on a processing element. The function call handling module further includes a function call interception module configured to intercept function calls to predefined libraries, a function call analysis module configured to analyze argument size and location, and a function call redirection module configured to schedule library calls and data transfers. The systems and methods also use a memory unification module, configured to keep data coherent between memories associated with the at least one CPU and the at least one accelerator based on the output of the function call redirection module.
Type: Application
Filed: August 20, 2010
Publication date: July 14, 2011
Applicant: NEC Laboratories America, Inc.
Inventors: Michela Becchi, Surendra Byna, Srihari Cadambi, Srimat Chakradhar
-
Publication number: 20110119467
Abstract: Systems and methods for massively parallel processing on an accelerator that includes a plurality of processing cores. Each processing core includes multiple processing chains configured to perform parallel computations, each of which includes a plurality of interconnected processing elements. The cores further include multiple smart memory blocks configured to store and process data, each memory block accepting the output of one of the plurality of processing chains. The cores communicate with at least one off-chip memory bank.
Type: Application
Filed: July 26, 2010
Publication date: May 19, 2011
Applicant: NEC Laboratories America, Inc.
Inventors: Srihari Cadambi, Abhinandan Majumdar, Michela Becchi, Srimat Chakradhar, Hans Peter Graf
-
Publication number: 20110029471
Abstract: A coprocessor and method for processing convolutional neural networks includes a configurable input switch coupled to an input. A plurality of convolver elements are enabled in accordance with the input switch. An output switch is configured to receive outputs from the set of convolver elements to provide data to output branches. A controller is configured to provide control signals to the input switch and the output switch such that the set of convolver elements are rendered active and a number of output branches are selected for a given cycle in accordance with the control signals.
Type: Application
Filed: February 1, 2010
Publication date: February 3, 2011
Applicant: NEC Laboratories America, Inc.
Inventors: Srimat Chakradhar, Murugan Sankaradas, Venkata S. Jakkula, Srihari Cadambi
-
Patent number: 7653670
Abstract: An architecture and method for data storage and retrieval which also addresses a number of key problems associated with systems employing hash-based techniques. The architecture and method employ a collision-free hashing scheme called a Bloomier filter while eliminating its false positives in a storage-efficient way. Wildcard support is added through the use of a scheme called prefix collapsing, while introducing only a small additional storage requirement and reduced hardware complexity. Real update traces indicate that both fast and incremental updates are provided, features generally not available in prior-art collision-free hashing schemes.
Type: Grant
Filed: November 28, 2006
Date of Patent: January 26, 2010
Assignee: NEC Laboratories America, Inc.
Inventors: Jahangir Hasan, Srihari Cadambi, Srimat T. Chakradhar
-
Publication number: 20090304268
Abstract: A method and system for training an apparatus to recognize a pattern includes providing the apparatus with a host processor executing steps of a machine learning process; providing the apparatus with an accelerator including at least two processors; inputting training pattern data into the host processor; determining coefficient changes in the machine learning process with the host processor using the training pattern data; transferring the training data to the accelerator; determining kernel dot-products with the at least two processors of the accelerator using the training data; and transferring the dot-products back to the host processor.
Type: Application
Filed: June 4, 2009
Publication date: December 10, 2009
Applicant: NEC LABORATORIES AMERICA, INC.
Inventors: Srihari Cadambi, Igor Durdanovic, Venkata Jakkula, Eric Cosatto, Murugan Sankaradass, Hans Peter Graf, Srimat T. Chakradhar
-
Patent number: 7592935
Abstract: An information retrieval architecture for performing a multi-dimensional search for a lookup value associated with a set of input values, the set of input values organized into one or more fields, the information retrieval architecture including a plurality of classification modules, each classification module storing the lookup values, each lookup value being associated with a set of input values; and a preprocessing module which receives a set of input values and selectively limits search of the plurality of classification modules to a subset of the classification modules.
Type: Grant
Filed: March 9, 2006
Date of Patent: September 22, 2009
Assignee: NEC Laboratories America, Inc.
Inventors: Srihari Cadambi, Srimat T. Chakradhar
-
Patent number: 7398278
Abstract: A method of storing addresses in a database comprising generating original prefixes representing the addresses to be stored in the database. The prefix lengths are determined dynamically. Reduced prefixes are generated by collapsing or expanding the original prefixes such that a number of reduced prefixes is less than a number of original prefixes. The reduced prefixes are stored in the database.
Type: Grant
Filed: May 20, 2005
Date of Patent: July 8, 2008
Assignees: NEC Electronics Corporation, NEC Laboratories America, Inc.
Inventors: Srihari Cadambi, Srimat Chakradhar, Hirohiko Shibata
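To make the collapsing step concrete, the sketch below truncates bit-string prefixes to a small set of allowed lengths so that several original prefixes merge into one reduced prefix. The allowed lengths, the function names, and the truncation rule are assumptions for illustration; the abstract does not specify how lengths are chosen, and the expanding case and any bookkeeping needed to recover the original match are omitted.

```python
def collapse_prefix(prefix, allowed_lengths):
    """Truncate a bit-string prefix to the longest allowed length <= len(prefix)."""
    fits = [n for n in allowed_lengths if n <= len(prefix)]
    if not fits:
        raise ValueError(f"no allowed length fits prefix {prefix!r}")
    return prefix[:max(fits)]


def reduce_prefixes(prefixes, allowed_lengths):
    """Collapse every prefix and merge duplicates into a sorted list."""
    return sorted({collapse_prefix(p, allowed_lengths) for p in prefixes})


# Five original prefixes of lengths 3, 4 and 5, collapsed to lengths 2 and 4.
original = ["1010", "1011", "10110", "110", "1101"]
reduced = reduce_prefixes(original, allowed_lengths=[2, 4])
```

In this example `10110` collapses onto the existing `1011`, so the reduced set holds four prefixes instead of five.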
-
Publication number: 20080034427
Abstract: A method includes reducing a deterministic finite automaton (DFA) representative of an expression to provide a smaller DFA, and subjecting information that matches the smaller DFA to a non-deterministic finite automaton (NFA) representative of the expression, thereby reducing the memory required for pattern matching of the information.
Type: Application
Filed: July 30, 2007
Publication date: February 7, 2008
Applicant: NEC Laboratories America, Inc.
Inventors: Srihari Cadambi, Srimat T. Chakradhar, Michela Becchi