Patents by Inventor Sunil K. Shukla

Sunil K. Shukla has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Publication number: 20170123794
    Abstract: An apparatus and method for supporting simultaneous multiple iterations (SMI) and iteration level commits (ILC) in a course grained reconfigurable architecture (CGRA). The apparatus includes: Hardware structures that connect all of multiple processing engines (PEs) to a load-store unit (LSU) configured to keep track of which compiled program code iterations have completed, which ones are in flight and which are yet to begin, and a control unit including hardware structures that are used to maintain synchronization and initiate and terminate loops within the PEs. The processing elements, LSU and control unit are configured to commit instructions, and save and restore context at loop iteration boundaries. In doing so, the apparatus tracks and buffers state of in-flight iterations, and detects conditions that prevents an iteration from completion.
    Type: Application
    Filed: November 4, 2015
    Publication date: May 4, 2017
    Inventors: Chia-yu Chen, Kailash Gopalakrishnan, Jinwook Oh, Lee M. Saltzman, Sunil K. Shukla, Vijayalakshmi Srinivasan
  • Publication number: 20170123795
    Abstract: An apparatus and method for supporting simultaneous multiple iterations (SMI) in a course grained reconfigurable architecture (CGRA). In support of SMI, the apparatus includes: Hardware structures that connect all of multiple processing engines (PEs) to a load-store unit (LSU) configured to keep track of which compiled program code iterations have completed, which ones are in flight and which are yet to begin, and a control unit including hardware structures that are used to maintain synchronization and initiate and terminate loops within the PEs. SMI permits execution of the next instruction within any iteration (in flight). If instructions from multiple iterations are ready for execution (and are pre-decoded), then the hardware selects the lowest iteration number ready for execution. If in a particular clock cycle, a loop iteration with a lower iteration number is stalled (i.e.
    Type: Application
    Filed: November 4, 2015
    Publication date: May 4, 2017
    Inventors: Chia-yu Chen, Kailash Gopalakrishnan, Jinwook Oh, Sunil K. Shukla, Vijayalakshmi Srinivasan
  • Patent number: 9632928
    Abstract: Embodiments of the invention provide a method and system for dynamic memory management implemented in hardware. In an embodiment, the method comprises storing objects in a plurality of heaps, and operating a hardware garbage collector to free heap space. The hardware garbage collector traverses the heaps and marks selected objects, uses the marks to identify a plurality of the objects, and frees the identified objects. In an embodiment, the method comprises storing objects in a heap, each of at least some of the objects including a multitude of pointers; and operating a hardware garbage collector to free heap space. The hardware garbage collector traverses the heap, using the pointers of some of the objects to identify others of the objects; processes the objects to mark selected objects; and uses the marks to identify a group of the objects, and frees the identified objects.
    Type: Grant
    Filed: April 28, 2016
    Date of Patent: April 25, 2017
    Assignee: International Business Machines Corporation
    Inventors: David F. Bacon, Perry S. Cheng, Sunil K. Shukla
  • Publication number: 20160239414
    Abstract: Embodiments of the invention provide a method and system for dynamic memory management implemented in hardware. In an embodiment, the method comprises storing objects in a plurality of heaps, and operating a hardware garbage collector to free heap space. The hardware garbage collector traverses the heaps and marks selected objects, uses the marks to identify a plurality of the objects, and frees the identified objects. In an embodiment, the method comprises storing objects in a heap, each of at least some of the objects including a multitude of pointers; and operating a hardware garbage collector to free heap space. The hardware garbage collector traverses the heap, using the pointers of some of the objects to identify others of the objects; processes the objects to mark selected objects; and uses the marks to identify a group of the objects, and frees the identified objects.
    Type: Application
    Filed: April 28, 2016
    Publication date: August 18, 2016
    Inventors: David F. Bacon, Perry S. Cheng, Sunil K. Shukla
  • Patent number: 9418187
    Abstract: As described herein, a tool records a log (or trace) of all sources of non-determinism in the system. In most of the cases, it's enough to log all transitions and the exact timestamps at all the entry and exit points of the system. By using this information it is possible to recreate a cycle accurate execution of the hardware system in simulation. Unlike CHIPSCOPE and SIGNALTAP which let you monitor a small number of signals in the design, the tool provides visibility into the whole system.
    Type: Grant
    Filed: November 13, 2015
    Date of Patent: August 16, 2016
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Daniel Foisy, Sunil K. Shukla
  • Patent number: 9355030
    Abstract: Embodiments of the invention provide a method and system for dynamic memory management implemented in hardware. In an embodiment, the method comprises storing objects in a plurality of heaps, and operating a hardware garbage collector to free heap space. The hardware garbage collector traverses the heaps and marks selected objects, uses the marks to identify a plurality of the objects, and frees the identified objects. In an embodiment, the method comprises storing objects in a heap, each of at least some of the objects including a multitude of pointers; and operating a hardware garbage collector to free heap space. The hardware garbage collector traverses the heap, using the pointers of some of the objects to identify others of the objects; processes the objects to mark selected objects; and uses the marks to identify a group of the objects, and frees the identified objects.
    Type: Grant
    Filed: June 6, 2014
    Date of Patent: May 31, 2016
    Assignee: International Business Machines Corporation
    Inventors: David F. Bacon, Perry S. Cheng, Sunil K. Shukla
  • Patent number: 9329843
    Abstract: A communication stack for software-hardware co-execution on heterogeneous computing systems with processors and reconfigurable logic, in one aspect, may comprise a crossbar operable to connect hardware user code and functioning as a platform independent communication layer. A physical interface interfaces to the reconfigurable logic. A physical interface bridge is connected to the cross and the physical interface. The physical interface bridge connects the crossbar and the physical interface via a platform specific translation layer specific to the reconfigurable logic. The crossbar, the physical interface, and the physical interface bridge may be instantiated in response to the hardware user code being generated, the crossbar instantiated with associated parameters comprising one or more routes and associated data widths. The hardware user code is assigned a unique virtual route in the crossbar.
    Type: Grant
    Filed: June 10, 2013
    Date of Patent: May 3, 2016
    Assignee: International Business Machines Corporation
    Inventors: Perry S. Cheng, Rodric Rabbah, Sunil K. Shukla
  • Patent number: 9323506
    Abstract: A communication stack for software-hardware co-execution on heterogeneous computing systems with processors and reconfigurable logic, in one aspect, may comprise a crossbar operable to connect hardware user code and functioning as a platform independent communication layer. A physical interface interfaces to the reconfigurable logic. A physical interface bridge is connected to the cross and the physical interface. The physical interface bridge connects the crossbar and the physical interface via a platform specific translation layer specific to the reconfigurable logic. The crossbar, the physical interface, and the physical interface bridge may be instantiated in response to the hardware user code being generated, the crossbar instantiated with associated parameters comprising one or more routes and associated data widths. The hardware user code is assigned a unique virtual route in the crossbar.
    Type: Grant
    Filed: August 5, 2013
    Date of Patent: April 26, 2016
    Assignee: International Business Machines Corporation
    Inventors: Perry S. Cheng, Rodric Rabbah, Sunil K. Shukla
  • Publication number: 20160070835
    Abstract: As described herein, a tool records a log (or trace) of all sources of non-determinism in the system. In most of the cases, it's enough to log all transitions and the exact timestamps at all the entry and exit points of the system. By using this information it is possible to recreate a cycle accurate execution of the hardware system in simulation. Unlike CHIPSCOPE and SIGNALTAP which let you monitor a small number of signals in the design, the tool provides visibility into the whole system.
    Type: Application
    Filed: November 13, 2015
    Publication date: March 10, 2016
    Inventors: Daniel FOISY, Sunil K. SHUKLA
  • Patent number: 9217774
    Abstract: As described herein, a tool records a log (or trace) of all sources of non-determinism in the system. In most of the cases, it's enough to log all transitions and the exact timestamps at all the entry and exit points of the system. By using this information it is possible to recreate a cycle accurate execution of the hardware system in simulation. Unlike CHIPSCOPE and SIGNALTAP which let you monitor a small number of signals in the design, the tool provides visibility into the whole system.
    Type: Grant
    Filed: August 29, 2014
    Date of Patent: December 22, 2015
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Daniel Foisy, Sunil K. Shukla
  • Publication number: 20150356007
    Abstract: Embodiments of the invention provide a method and system for dynamic memory management implemented in hardware. In an embodiment, the method comprises storing objects in a plurality of heaps, and operating a hardware garbage collector to free heap space. The hardware garbage collector traverses the heaps and marks selected objects, uses the marks to identify a plurality of the objects, and frees the identified objects. In an embodiment, the method comprises storing objects in a heap, each of at least some of the objects including a multitude of pointers; and operating a hardware garbage collector to free heap space. The hardware garbage collector traverses the heap, using the pointers of some of the objects to identify others of the objects; processes the objects to mark selected objects; and uses the marks to identify a group of the objects, and frees the identified objects.
    Type: Application
    Filed: June 6, 2014
    Publication date: December 10, 2015
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: David F. Bacon, Perry S. Cheng, Sunil K. Shukla
  • Publication number: 20150128100
    Abstract: As described herein, a tool records a log (or trace) of all sources of non-determinism in the system. In most of the cases, it's enough to log all transitions and the exact timestamps at all the entry and exit points of the system. By using this information it is possible to recreate a cycle accurate execution of the hardware system in simulation. Unlike CHIPSCOPE and SIGNALTAP which let you monitor a small number of signals in the design, the tool provides visibility into the whole system.
    Type: Application
    Filed: August 29, 2014
    Publication date: May 7, 2015
    Inventors: Daniel Foisy, Sunil K. Shukla
  • Patent number: 8856491
    Abstract: A computing device is provided and includes a memory module, a sweep engine, a root snapshot module, and a trace engine. The memory module has a memory implemented as at least one hardware circuit. The memory module uses a dual-ported memory configuration. The sweep engine includes a stack pointer. The sweep engine is configured to send a garbage collection signal if the stack pointer falls below a specified level. The sweep engine is in communication with the memory module to reclaim memory. The root snapshot engine is configured to take a snapshot of roots from at least one mutator if the garbage collection signal is received from the sweep engine. The trace engine receives roots from the root snapshot engine and is in communication with the memory module to receive data.
    Type: Grant
    Filed: May 23, 2012
    Date of Patent: October 7, 2014
    Assignee: International Business Machines Corporation
    Inventors: David F. Bacon, Perry S. Cheng, Sunil K. Shukla
  • Publication number: 20140208299
    Abstract: A communication stack for software-hardware co-execution on heterogeneous computing systems with processors and reconfigurable logic, in one aspect, may comprise a crossbar operable to connect hardware user code and functioning as a platform independent communication layer. A physical interface interfaces to the reconfigurable logic. A physical interface bridge is connected to the cross and the physical interface. The physical interface bridge connects the crossbar and the physical interface via a platform specific translation layer specific to the reconfigurable logic. The crossbar, the physical interface, and the physical interface bridge may be instantiated in response to the hardware user code being generated, the crossbar instantiated with associated parameters comprising one or more routes and associated data widths. The hardware user code is assigned a unique virtual route in the crossbar.
    Type: Application
    Filed: June 10, 2013
    Publication date: July 24, 2014
    Inventors: Perry S. Cheng, Rodric Rabbah, Sunil K. Shukla
  • Publication number: 20140208300
    Abstract: A communication stack for software-hardware co-execution on heterogeneous computing systems with processors and reconfigurable logic, in one aspect, may comprise a crossbar operable to connect hardware user code and functioning as a platform independent communication layer. A physical interface interfaces to the reconfigurable logic. A physical interface bridge is connected to the cross and the physical interface. The physical interface bridge connects the crossbar and the physical interface via a platform specific translation layer specific to the reconfigurable logic. The crossbar, the physical interface, and the physical interface bridge may be instantiated in response to the hardware user code being generated, the crossbar instantiated with associated parameters comprising one or more routes and associated data widths. The hardware user code is assigned a unique virtual route in the crossbar.
    Type: Application
    Filed: August 5, 2013
    Publication date: July 24, 2014
    Applicant: International Business Machines Corporation
    Inventors: Perry S. Cheng, Rodric Rabbah, Sunil K. Shukla
  • Publication number: 20130346930
    Abstract: Searching for desired clock frequency for integrated circuit-based design may receive timing result of a hardware synthesis job executed based on a code specifying hardware design. One or more different timing constraints specifying respective one or more different clock frequencies than used in the hardware synthesis job may be automatically generated without modifying the code. One or more instances of the hardware synthesis job to run with the respective one or more different timing constraints may be automatically spawned. The automatic generation and spawning may repeat until a termination criterion is met, and/or a desired successful timing constraint is identified for the hardware design from the different timing constraints based on whether the one or more instances of the hardware synthesis job met their respective timing constraints.
    Type: Application
    Filed: August 30, 2013
    Publication date: December 26, 2013
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Perry S. Cheng, Rodric Rabbah, Sunil K. Shukla
  • Publication number: 20130318290
    Abstract: A computing device is provided and includes a memory module, a sweep engine, a root snapshot module, and a trace engine. The memory module has a memory implemented as at least one hardware circuit. The memory module uses a dual-ported memory configuration. The sweep engine includes a stack pointer. The sweep engine is configured to send a garbage collection signal if the stack pointer falls below a specified level. The sweep engine is in communication with the memory module to reclaim memory. The root snapshot engine is configured to take a snapshot of roots from at least one mutator if the garbage collection signal is received from the sweep engine. The trace engine receives roots from the root snapshot engine and is in communication with the memory module to receive data.
    Type: Application
    Filed: May 23, 2012
    Publication date: November 28, 2013
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: David F. Bacon, Perry S. Cheng, Sunil K. Shukla
  • Publication number: 20130318315
    Abstract: A method of garbage collection in a computing device is provided. The method includes providing a memory module having a memory implemented as at least one hardware circuit. The memory module uses a dual-ported memory configuration. The method includes triggering a garbage collection signal by a sweep engine of the computing device. The sweep engine is in communication with a memory module to reclaim memory. The method includes receiving the garbage collection signal by a root snapshot engine of the computing device. The method includes taking a snapshot of roots from at least one mutator by the root snapshot engine if the garbage collection signal is received. The method includes receiving roots from the root snapshot engine by a trace engine of the computing device. The trace engine is in communication with the memory module to receive data.
    Type: Application
    Filed: June 19, 2012
    Publication date: November 28, 2013
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: David F. Bacon, Perry S. Cheng, Sunil K. Shukla
  • Patent number: 8566768
    Abstract: Searching for desired clock frequency for integrated circuit-based design may receive timing result of a hardware synthesis job executed based on a code specifying hardware design. One or more different timing constraints specifying respective one or more different clock frequencies than used in the hardware synthesis job may be automatically generated without modifying the code. One or more instances of the hardware synthesis job to run with the respective one or more different timing constraints may be automatically spawned. The automatic generation and spawning may repeat until a termination criterion is met, and/or a desired successful timing constraint is identified for the hardware design from the different timing constraints based on whether the one or more instances of the hardware synthesis job met their respective timing constraints.
    Type: Grant
    Filed: April 6, 2012
    Date of Patent: October 22, 2013
    Assignee: International Business Machines Corporation
    Inventors: Sunil K. Shukla, Perry S. Cheng, Rodric Rabbah
  • Publication number: 20130268907
    Abstract: Searching for desired clock frequency for integrated circuit-based design may receive timing result of a hardware synthesis job executed based on a code specifying hardware design. One or more different timing constraints specifying respective one or more different clock frequencies than used in the hardware synthesis job may be automatically generated without modifying the code. One or more instances of the hardware synthesis job to run with the respective one or more different timing constraints may be automatically spawned. The automatic generation and spawning may repeat until a termination criterion is met, and/or a desired successful timing constraint is identified for the hardware design from the different timing constraints based on whether the one or more instances of the hardware synthesis job met their respective timing constraints.
    Type: Application
    Filed: April 6, 2012
    Publication date: October 10, 2013
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Sunil K. Shukla, Perry S. Cheng, Rodric Rabbah