Patents by Inventor Srimat Chakradhar

Srimat Chakradhar has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 8762794
    Abstract: Methods and systems for cross-layer forgiveness exploitation include executing one or more applications using a processing platform that includes a first reliable processing core and at least one additional processing core having a lower reliability than the first processing core, modifying application execution according to one or more best-effort techniques to improve performance, and controlling parameters associated with the processing platform and the best-effort layer that control performance and error rate such that performance is maximized in a region of low hardware-software interference.
    Type: Grant
    Filed: November 18, 2011
    Date of Patent: June 24, 2014
    Assignee: NEC Laboratories America, Inc.
    Inventors: Srimat Chakradhar, Hyungmin Cho, Anand Raghunathan
  • Publication number: 20140053131
    Abstract: Methods and systems for asynchronous offload to many-core coprocessors include splitting a loop in an input source code into a sampling sub-part, a many integrated core (MIC) sub-part, and a central processing unit (CPU) sub-part; executing the sampling sub-part with a processor to determine loop characteristics including memory- and processor-operations executed by the loop; identifying optimal split boundaries based on the loop characteristics such that the MIC sub-part will complete in a same amount of time when executed on a MIC processor as the CPU sub-part will take when executed on a CPU; and modifying the input source code to split the loop at the identified boundaries, such that the MIC sub-part is executed on a MIC processor and the CPU sub-part is concurrently executed on a CPU.
    Type: Application
    Filed: July 12, 2013
    Publication date: February 20, 2014
    Inventors: Nishkam Ravi, Yi Yang, Srimat Chakradhar
  • Patent number: 8643656
    Abstract: A method includes configuring a shared library, stored in a memory, to be loaded into applications to intercept graphics processing unit (GPU) computation requests for different types of workload kernels corresponding to the applications. The method further includes generating a power prediction and a performance prediction for at least one candidate kernel combination for execution on a GPU responsive to the GPU computation requests. The at least one candidate kernel combination pertains to at least two of the workload kernels. The method also includes rendering a decision of whether to execute the at least one candidate kernel combination or to execute the at least two of the workload kernels pertaining thereto separately, based on the power prediction and the performance prediction.
    Type: Grant
    Filed: September 8, 2011
    Date of Patent: February 4, 2014
    Assignee: NEC Laboratories America, Inc.
    Inventors: Dong Li, Surendra Byna, Srimat Chakradhar
  • Patent number: 8583896
    Abstract: Systems and methods for massively parallel processing on an accelerator that includes a plurality of processing cores. Each processing core includes multiple processing chains configured to perform parallel computations, each of which includes a plurality of interconnected processing elements. The cores further include multiple smart memory blocks configured to store and process data, each memory block accepting the output of one of the plurality of processing chains. The cores communicate with at least one off-chip memory bank.
    Type: Grant
    Filed: July 26, 2010
    Date of Patent: November 12, 2013
    Assignee: NEC Laboratories America, Inc.
    Inventors: Srihari Cadambi, Abhinandan Majumdar, Michela Becchi, Srimat Chakradhar, Hans Peter Graf
  • Patent number: 8442927
    Abstract: A coprocessor and method for processing convolutional neural networks includes a configurable input switch coupled to an input. A plurality of convolver elements are enabled in accordance with the input switch. An output switch is configured to receive outputs from the set of convolver elements to provide data to output branches. A controller is configured to provide control signals to the input switch and the output switch such that the set of convolver elements are rendered active and a number of output branches are selected for a given cycle in accordance with the control signals.
    Type: Grant
    Filed: February 1, 2010
    Date of Patent: May 14, 2013
    Assignee: NEC Laboratories America, Inc.
    Inventors: Srimat Chakradhar, Murugan Sankaradas, Venkata S. Jakkula, Srihari Cadambi
  • Publication number: 20130055224
    Abstract: A system and method for compiling includes parsing code of an application stored in a computer readable storage medium to identify one or more parallelizable code portions. At least one parallelizable code portion is optimized by transforming offload construct code portions to provide an optimized application.
    Type: Application
    Filed: August 24, 2012
    Publication date: February 28, 2013
    Applicant: NEC Laboratories America, Inc.
    Inventors: Nishkam Ravi, Tao Bao, Ozcan Ozturk, Srimat Chakradhar
  • Publication number: 20130055225
    Abstract: A system and method for compiling includes, for a parallelizable code portion of an application stored on a computer readable storage medium, determining one or more variables that are to be transferred to and/or from a coprocessor if the parallelizable code portion were to be offloaded. A start location and an end location are determined for at least one of the one or more variables as a size in memory. The parallelizable code portion is transformed by inserting an offload construct around the parallelizable code portion and passing the one or more variables and the size as arguments of the offload construct such that the parallelizable code portion is offloaded to a coprocessor at runtime.
    Type: Application
    Filed: August 24, 2012
    Publication date: February 28, 2013
    Applicant: NEC Laboratories America, Inc.
    Inventors: Nishkam Ravi, Tao Bao, Ozcan Ozturk, Srimat Chakradhar
  • Patent number: 8375392
    Abstract: Systems and methods for data-aware scheduling of applications on a heterogeneous platform having at least one central processing unit (CPU) and at least one accelerator. Such systems and methods include a function call handling module configured to intercept, analyze, and schedule library calls on a processing element. The function call handling module further includes a function call interception module configured to intercept function calls to predefined libraries, a function call analysis module configured to analyze argument size and location, and a function call redirection module configured to schedule library calls and data transfers. The systems and methods also use a memory unification module, configured to keep data coherent between memories associated with the at least one CPU and the at least one accelerator based on the output of the function call redirection module.
    Type: Grant
    Filed: August 20, 2010
    Date of Patent: February 12, 2013
    Assignee: NEC Laboratories America, Inc.
    Inventors: Michela Becchi, Surendra Byna, Srihari Cadambi, Srimat Chakradhar
  • Patent number: 8286172
    Abstract: Implementations of the present principles include best-effort computing systems and methods. In accordance with various exemplary aspects of the present principles, application computation requests directed to a processing platform may be intercepted and classified as either guaranteed computations or best-effort computations. Best-effort computations may be dropped to improve processing performance while minimally affecting the end result of application computations. In addition, interdependencies between best-effort computations may be relaxed to improve parallelism and processing speed while maintaining accuracy of computation results.
    Type: Grant
    Filed: March 6, 2009
    Date of Patent: October 9, 2012
    Assignee: NEC Laboratories America, Inc.
    Inventors: Srimat Chakradhar, Anand Raghunathan, Jiayuan Meng
  • Publication number: 20120233486
    Abstract: Methods and systems for managing data loads on a cluster of processors that implement an iterative procedure through parallel processing of data for the procedure are disclosed. One method includes monitoring, for at least one iteration of the procedure, completion times of a plurality of different processing phases that are undergone by each of the processors in a given iteration. The method further includes determining whether a load imbalance factor threshold is exceeded in the given iteration based on the completion times for the given iteration. In addition, the data is repartitioned by reassigning the data to the processors based on predicted dependencies between assigned data units of the data and completion times of a plurality of the processors for at least two of the phases. Further, the parallel processing is implemented on the cluster of processors in accordance with the reassignment.
    Type: Application
    Filed: March 1, 2012
    Publication date: September 13, 2012
    Applicant: NEC Laboratories America, Inc.
    Inventors: Rajat Phull, Srihari Cadambi, Nishkam Ravi, Srimat Chakradhar
  • Publication number: 20120131389
    Abstract: Methods and systems for cross-layer forgiveness exploitation include executing one or more applications using a processing platform that includes a first reliable processing core and at least one additional processing core having a lower reliability than the first processing core, modifying application execution according to one or more best-effort techniques to improve performance, and controlling parameters associated with the processing platform and the best-effort layer that control performance and error rate such that performance is maximized in a region of low hardware-software interference.
    Type: Application
    Filed: November 18, 2011
    Publication date: May 24, 2012
    Applicant: NEC Laboratories America, Inc.
    Inventors: Srimat Chakradhar, Hyungmin Cho, Anand Raghunathan
  • Publication number: 20120124591
    Abstract: A system and method for scheduling client-server applications onto heterogeneous clusters includes storing at least one client request of at least one application in a pending request list on a computer readable storage medium. A priority metric is computed for each application, where the computed priority metric is applied to each client request belonging to that application. The priority metric is determined based on estimated performance of the client request and load on the pending request list. The at least one client request of the at least one application is scheduled based on the priority metric onto one or more heterogeneous resources.
    Type: Application
    Filed: October 13, 2011
    Publication date: May 17, 2012
    Applicant: NEC Laboratories America, Inc.
    Inventors: Srihari Cadambi, Srimat Chakradhar, M. Mustafa Rafique
  • Publication number: 20120084747
    Abstract: Methods and systems for iterative convergence include performing at least one global iteration. Each global iteration includes partitioning input data into multiple input data partitions according to an input data partitioning function, partitioning a model into multiple model partitions according to a model partitioning function, performing at least one local iteration using a processor to compute sub-problems formed from a model partition and an input data partition to produce multiple locally updated models, and combining the locally updated models from the at least one local iteration according to a model merging function to produce a merged model.
    Type: Application
    Filed: September 19, 2011
    Publication date: April 5, 2012
    Applicant: NEC Laboratories America, Inc.
    Inventors: Srimat Chakradhar, Reza Farivar, Anand Raghunathan
  • Publication number: 20120081373
    Abstract: A method includes configuring a shared library, stored in a memory, to be loaded into applications to intercept graphics processing unit (GPU) computation requests for different types of workload kernels corresponding to the applications. The method further includes generating a power prediction and a performance prediction for at least one candidate kernel combination for execution on a GPU responsive to the GPU computation requests. The at least one candidate kernel combination pertains to at least two of the workload kernels. The method also includes rendering a decision of whether to execute the at least one candidate kernel combination or to execute the at least two of the workload kernels pertaining thereto separately, based on the power prediction and the performance prediction.
    Type: Application
    Filed: September 8, 2011
    Publication date: April 5, 2012
    Applicant: NEC Laboratories America, Inc.
    Inventors: Dong Li, Surendra Byna, Srimat Chakradhar
  • Publication number: 20120079298
    Abstract: Low-power systems and methods are disclosed for executing an application software on a general purpose processor and a plurality of accelerators with a runtime controller. The runtime controller splits a workload across the processor and the accelerators to minimize energy. The system includes building one or more performance models in an application-agnostic manner; and monitoring system performance in real-time and adjusting the workload splitting to minimize energy while conforming to a target quality of service (QoS).
    Type: Application
    Filed: April 4, 2011
    Publication date: March 29, 2012
    Applicant: NEC Laboratories America, Inc.
    Inventors: Abhinandan Majumdar, Srihari Cadambi, Srimat Chakradhar
  • Publication number: 20110173155
    Abstract: Systems and methods for data-aware scheduling of applications on a heterogeneous platform having at least one central processing unit (CPU) and at least one accelerator. Such systems and methods include a function call handling module configured to intercept, analyze, and schedule library calls on a processing element. The function call handling module further includes a function call interception module configured to intercept function calls to predefined libraries, a function call analysis module configured to analyze argument size and location, and a function call redirection module configured to schedule library calls and data transfers. The systems and methods also use a memory unification module, configured to keep data coherent between memories associated with the at least one CPU and the at least one accelerator based on the output of the function call redirection module.
    Type: Application
    Filed: August 20, 2010
    Publication date: July 14, 2011
    Applicant: NEC Laboratories America, Inc.
    Inventors: Michela Becchi, Surendra Byna, Srihari Cadambi, Srimat Chakradhar
  • Publication number: 20110119467
    Abstract: Systems and methods for massively parallel processing on an accelerator that includes a plurality of processing cores. Each processing core includes multiple processing chains configured to perform parallel computations, each of which includes a plurality of interconnected processing elements. The cores further include multiple smart memory blocks configured to store and process data, each memory block accepting the output of one of the plurality of processing chains. The cores communicate with at least one off-chip memory bank.
    Type: Application
    Filed: July 26, 2010
    Publication date: May 19, 2011
    Applicant: NEC Laboratories America, Inc.
    Inventors: Srihari Cadambi, Abhinandan Majumdar, Michela Becchi, Srimat Chakradhar, Hans Peter Graf
  • Patent number: 7921206
    Abstract: A computer implemented technique framework, prototype tool and associated methods that provide a high degree of visibility and control over the in-field execution of software in a minimally intrusive manner wherein developer-defined correctness tests and validation logic are embedded into the sensor node itself, making in-field software testing autonomous without necessitating continuous developer participation.
    Type: Grant
    Filed: April 18, 2008
    Date of Patent: April 5, 2011
    Assignee: NEC Laboratories America, Inc.
    Inventors: Kiran Nagaraja, Vijay Raghunathan, Florin Sultan, Srimat Chakradhar, Nupur Kothari
  • Publication number: 20110029471
    Abstract: A coprocessor and method for processing convolutional neural networks includes a configurable input switch coupled to an input. A plurality of convolver elements are enabled in accordance with the input switch. An output switch is configured to receive outputs from the set of convolver elements to provide data to output branches. A controller is configured to provide control signals to the input switch and the output switch such that the set of convolver elements are rendered active and a number of output branches are selected for a given cycle in accordance with the control signals.
    Type: Application
    Filed: February 1, 2010
    Publication date: February 3, 2011
    Applicant: NEC Laboratories America, Inc.
    Inventors: Srimat Chakradhar, Murugan Sankaradas, Venkata S. Jakkula, Srihari Cadambi
  • Publication number: 20100088492
    Abstract: Implementations of the present principles include best-effort computing systems and methods. In accordance with various exemplary aspects of the present principles, application computation requests directed to a processing platform may be intercepted and classified as either guaranteed computations or best-effort computations. Best-effort computations may be dropped to improve processing performance while minimally affecting the end result of application computations. In addition, interdependencies between best-effort computations may be relaxed to improve parallelism and processing speed while maintaining accuracy of computation results.
    Type: Application
    Filed: March 6, 2009
    Publication date: April 8, 2010
    Applicant: NEC Laboratories America, Inc.
    Inventors: Srimat Chakradhar, Anand Raghunathan, Jiayuan Meng
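
The best-effort computing abstracts in this listing describe classifying computation requests as guaranteed or best-effort and dropping some of the best-effort ones to gain performance. The Python sketch below illustrates that general idea only; it is not taken from the patents. The `Computation` class, the `execute` function, and the fixed `keep_every` drop policy are all hypothetical names and choices made for this illustration.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Computation:
    """One unit of work plus its classification (hypothetical structure)."""
    run: Callable[[], float]
    guaranteed: bool  # True: must execute; False: best-effort, may be dropped


def execute(computations: List[Computation], keep_every: int = 2) -> List[float]:
    """Run every guaranteed computation, but only every `keep_every`-th
    best-effort computation. The fixed-stride drop policy is an assumption
    chosen for simplicity, not a policy described in the abstracts."""
    results: List[float] = []
    seen_best_effort = 0
    for c in computations:
        if c.guaranteed:
            results.append(c.run())
        else:
            if seen_best_effort % keep_every == 0:
                results.append(c.run())
            seen_best_effort += 1
    return results


# Usage: estimate the mean of 100 samples while skipping half of the
# best-effort work. The first 10 samples are treated as guaranteed.
samples = [float(i) for i in range(100)]
work = [
    Computation(run=lambda x=x: x, guaranteed=(i < 10))
    for i, x in enumerate(samples)
]
kept = execute(work, keep_every=2)
approx_mean = sum(kept) / len(kept)
exact_mean = sum(samples) / len(samples)
```

Here only 55 of the 100 computations run, and the approximate mean differs from the exact one, which is the performance-versus-accuracy trade the abstract refers to. A real system would presumably choose which computations to drop far more carefully than a fixed stride.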