Patents by Inventor Vijayalakshmi Srinivasan

Vijayalakshmi Srinivasan has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Publication number: 20150206566
    Abstract: Systems and methods to manage memory on a spin transfer torque magnetoresistive random-access memory (STT-MRAM) are provided. A particular method may include determining a performance characteristic using relationship information that relates a bit error rate to at least one of a programming pulse width, a temperature, a history-based predictive performance parameter, a coding scheme, and a voltage level also associated with a memory. The performance characteristic is stored and used to manage a write operation associated with the memory.
    Type: Application
    Filed: January 21, 2014
    Publication date: July 23, 2015
    Applicant: International Business Machines Corporation
    Inventors: Pradip Bose, Alper Buyuktosunoglu, Xiaochen Guo, Hillery C. Hunter, Jude A. Rivers, Vijayalakshmi Srinivasan
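The abstract above describes managing STT-MRAM writes from a stored relationship between bit error rate and write parameters. As a minimal, hypothetical sketch (not the patented mechanism), the Python snippet below picks a programming pulse width from an invented BER-versus-pulse-width table so that a caller-supplied error-rate target is met; the table values, threshold, and function name are assumptions made for illustration.

```python
# Hypothetical sketch: choose an STT-MRAM write pulse width from an invented
# table relating pulse width (ns) to expected bit error rate (BER).
BER_BY_PULSE_NS = {
    5.0: 1e-3,   # short pulse: fast but error-prone
    10.0: 1e-5,
    20.0: 1e-7,  # long pulse: slow but reliable
}

def select_pulse_width(target_ber: float) -> float:
    """Return the shortest pulse width whose expected BER meets the target."""
    for width in sorted(BER_BY_PULSE_NS):
        if BER_BY_PULSE_NS[width] <= target_ber:
            return width
    return max(BER_BY_PULSE_NS)  # fall back to the most reliable setting

# A write that can tolerate ECC-correctable errors may use a faster pulse.
print(select_pulse_width(1e-4))  # -> 10.0
print(select_pulse_width(1e-8))  # -> 20.0
```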
  • Publication number: 20150206568
    Abstract: Systems and methods to manage memory on a spin transfer torque magnetoresistive random-access memory (STT-MRAM) are provided. A particular method may include determining a performance characteristic using relationship information that relates a bit error rate to at least one of a programming pulse width, a temperature, a history-based predictive performance parameter, a coding scheme, and a voltage level also associated with a memory. The performance characteristic is stored and used to manage a write operation associated with the memory.
    Type: Application
    Filed: June 30, 2014
    Publication date: July 23, 2015
    Inventors: Pradip Bose, Alper Buyuktosunoglu, Xiaochen Guo, Hillery C. Hunter, Jude A. Rivers, Vijayalakshmi Srinivasan
  • Patent number: 9069545
    Abstract: Systems and methods are disclosed that allow atomic updates to global data to be at least partially eliminated to reduce synchronization overhead in parallel computing. A compiler analyzes the data to be processed to selectively permit unsynchronized data transfer for at least one type of data. A programmer may provide a hint to expressly identify the types of data that are candidates for unsynchronized data transfer. In one embodiment, the synchronization overhead is reducible by generating an application program that selectively substitutes code for unsynchronized data transfer in place of a subset of the code for synchronized data transfer. In another embodiment, the synchronization overhead is reducible by a combination of software and hardware, using relaxation data registers and decoders that collectively convert a subset of commands for synchronized data transfer into commands for unsynchronized data transfer.
    Type: Grant
    Filed: July 18, 2011
    Date of Patent: June 30, 2015
    Assignee: International Business Machines Corporation
    Inventors: Lakshminarayanan Renganarayana, Vijayalakshmi Srinivasan
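The abstract above is about eliminating some atomic updates to global data. The sketch below only illustrates the general idea in software, assuming a commutative reduction where per-element locking can be replaced by a single merge; it is not the compiler transformation or hardware claimed in the patent.

```python
# Illustrative only: trade per-update synchronization for a single merge
# when the data tolerates relaxed, unsynchronized accumulation.
import threading

counter = 0
lock = threading.Lock()

def add_synchronized(values):
    """Every update pays the synchronization overhead."""
    global counter
    for v in values:
        with lock:
            counter += v

def add_relaxed(values):
    """Data marked (e.g., by a programmer hint) as safe for relaxed handling:
    accumulate privately, then synchronize once."""
    global counter
    local = sum(values)
    with lock:
        counter += local

add_synchronized(range(1000))
add_relaxed(range(1000))
print(counter)  # 999000 total from both paths
```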
  • Publication number: 20150074356
    Abstract: A processor and a method implemented by the processor to obtain computation results are described. The processor includes a unified reuse table embedded in a processor pipeline, the unified reuse table including a plurality of entries, each entry of the plurality of entries corresponding with a computation instruction or a set of computation instructions. The processor also includes a functional unit to perform a computation based on a corresponding instruction.
    Type: Application
    Filed: October 15, 2013
    Publication date: March 12, 2015
    Applicant: International Business Machines Corporation
    Inventors: Pradip Bose, Alper Buyuktosunoglu, Xiaochen Guo, Hillery C. Hunter, Jude A. Rivers, Vijayalakshmi Srinivasan
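The abstract above (and the related filing below) describes a reuse table whose entries map a computation instruction and its operands to a stored result. As a rough software analogue only, the sketch below memoizes results keyed by (operation, operands) with a small fixed capacity; the class name, capacity, and eviction policy are assumptions for illustration, not the hardware design.

```python
# Rough software analogue of a computation reuse table: entries are keyed by
# (operation, operands); a hit returns the stored result instead of recomputing.
from typing import Callable, Dict, Tuple

class ReuseTable:
    def __init__(self, capacity: int = 8):
        self.capacity = capacity
        self.entries: Dict[Tuple, float] = {}

    def compute(self, op_name: str, fn: Callable, *operands) -> float:
        key = (op_name, operands)
        if key in self.entries:                 # reuse hit
            return self.entries[key]
        result = fn(*operands)                  # miss: execute the functional unit
        if len(self.entries) >= self.capacity:
            self.entries.pop(next(iter(self.entries)))  # crude FIFO eviction
        self.entries[key] = result
        return result

table = ReuseTable()
print(table.compute("div", lambda a, b: a / b, 355, 113))  # computed
print(table.compute("div", lambda a, b: a / b, 355, 113))  # reused
```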
  • Publication number: 20150074381
    Abstract: A processor and a method implemented by the processor to obtain computation results are described. The processor includes a unified reuse table embedded in a processor pipeline, the unified reuse table including a plurality of entries, each entry of the plurality of entries corresponding with a computation instruction or a set of computation instructions. The processor also includes a functional unit to perform a computation based on a corresponding instruction.
    Type: Application
    Filed: September 6, 2013
    Publication date: March 12, 2015
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Pradip Bose, Alper Buyuktosunoglu, Xiaochen Guo, Hillery C. Hunter, Jude A. Rivers, Vijayalakshmi Srinivasan
  • Publication number: 20150058569
    Abstract: Embodiments relate to a non-data inclusive coherent (NIC) directory for a symmetric multiprocessor (SMP) of a computer. An aspect includes determining a first eviction entry of a highest-level cache in a multilevel caching structure of the first processor node of the SMP. Another aspect includes determining that the NIC directory is not full. Another aspect includes determining that the first eviction entry of the highest-level cache is owned by a lower-level cache in the multilevel caching structure. Another aspect includes, based on the NIC directory not being full and based on the first eviction entry of the highest-level cache being owned by the lower-level cache, installing an address of the first eviction entry of the highest-level cache in a first new entry in the NIC directory. Another aspect includes invalidating the first eviction entry in the highest-level cache.
    Type: Application
    Filed: September 30, 2014
    Publication date: February 26, 2015
    Inventors: Timothy C. Bronson, Garrett M. Drapala, Rebecca M. Gott, Pak-Kin Mak, Vijayalakshmi Srinivasan, Craig R. Walters
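The NIC-directory abstract above walks through a specific eviction case. The sketch below restates that decision in Python, assuming an invented function name and a simple set-based directory; it mirrors only the described condition (directory not full, line owned by a lower-level cache), not the actual hardware.

```python
# Sketch of the described eviction decision: when the highest-level cache
# evicts a line still owned by a lower-level cache, record its address in the
# NIC directory (if there is room) so coherence information is retained
# without keeping the data, then invalidate the highest-level entry.

def handle_highest_level_eviction(address, nic_directory, nic_capacity,
                                  owned_by_lower_level):
    if owned_by_lower_level and len(nic_directory) < nic_capacity:
        nic_directory.add(address)   # keep coherence info, not data
    return "invalidated"             # the highest-level entry is invalidated either way

nic = set()
handle_highest_level_eviction(0x1F40, nic, nic_capacity=4, owned_by_lower_level=True)
print(0x1F40 in nic)  # True: the evicted line's address is now tracked
```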
  • Publication number: 20140258621
    Abstract: Embodiments relate to a non-data inclusive coherent (NIC) directory for a symmetric multiprocessor (SMP) of a computer. An aspect includes determining a first eviction entry of a highest-level cache in a multilevel caching structure of the first processor node of the SMP. Another aspect includes determining that the NIC directory is not full. Another aspect includes determining that the first eviction entry of the highest-level cache is owned by a lower-level cache in the multilevel caching structure. Another aspect includes, based on the NIC directory not being full and based on the first eviction entry of the highest-level cache being owned by the lower-level cache, installing an address of the first eviction entry of the highest-level cache in a first new entry in the NIC directory. Another aspect includes invalidating the first eviction entry in the highest-level cache.
    Type: Application
    Filed: March 5, 2013
    Publication date: September 11, 2014
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Timothy C. Bronson, Garrett M. Drapala, Rebecca M. Gott, Pak-Kin Mak, Vijayalakshmi Srinivasan, Craig R. Walters
  • Patent number: 8826095
    Abstract: A hardened store-in cache system includes a store-in cache having lines of a first linesize stored with checkbits, wherein the checkbits include byte-parity bits, and an ancillary store-only cache (ASOC) that holds a copy of most recently stored-to lines of the store-in cache. The ASOC includes fewer lines than the store-in cache, each line of the ASOC having the first linesize stored with the checkbits.
    Type: Grant
    Filed: March 4, 2011
    Date of Patent: September 2, 2014
    Assignee: International Business Machines Corporation
    Inventors: Philip George Emma, Wing K. Luk, Thomas R. Puzak, Vijayalakshmi Srinivasan
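The patent above pairs a store-in cache with a small ancillary store-only cache (ASOC) of recently stored-to lines. The sketch below models that pairing in Python so a parity error in the main cache can be repaired from the ASOC copy; the sizes, replacement policy, and class name are assumptions for illustration.

```python
# Illustrative model: a store-in cache shadowed by a small "ancillary
# store-only cache" (ASOC) holding copies of recently stored-to lines.
from collections import OrderedDict

class HardenedCache:
    def __init__(self, asoc_lines: int = 4):
        self.cache = {}                    # line address -> data
        self.asoc = OrderedDict()          # most recently stored-to lines
        self.asoc_lines = asoc_lines

    def store(self, addr, data):
        self.cache[addr] = data
        self.asoc[addr] = data             # shadow copy of the stored line
        self.asoc.move_to_end(addr)
        if len(self.asoc) > self.asoc_lines:
            self.asoc.popitem(last=False)  # drop the least recently stored line

    def load(self, addr, parity_error=False):
        if parity_error and addr in self.asoc:
            self.cache[addr] = self.asoc[addr]  # repair from the ASOC copy
        return self.cache[addr]

hc = HardenedCache()
hc.store(0x80, b"payload")
print(hc.load(0x80, parity_error=True))  # recovered from the ASOC copy
```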
  • Publication number: 20140019689
    Abstract: A scheme referred to as a “Region-based cache restoration prefetcher” (RECAP) is employed for cache preloading on a partition or a context switch. The RECAP exploits spatial locality to provide a bandwidth-efficient prefetcher to reduce the “cold” cache effect caused by multiprogrammed virtualization. The RECAP groups cache blocks into coarse-grain regions of memory, and predicts which regions contain useful blocks that should be prefetched the next time the current virtual machine executes. Based on these predictions, and using a simple compression technique that also exploits spatial locality, the RECAP provides a robust prefetcher that improves performance without excessive bandwidth overhead or slowdown.
    Type: Application
    Filed: July 10, 2012
    Publication date: January 16, 2014
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Harold W. Cain, III, Vijayalakshmi Srinivasan, Jason Zebchuk
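The RECAP abstract above groups blocks into coarse regions and predicts which blocks to prefetch on the next context switch. The sketch below shows one plausible way to record region bitmaps and expand them into prefetch addresses; the region size, block size, and encoding are assumptions, not the published design.

```python
# Sketch: record accessed cache blocks as per-region bitmaps, then expand the
# bitmaps into block addresses to prefetch the next time the VM runs.
BLOCK = 64            # bytes per cache block (assumed)
REGION_BLOCKS = 32    # blocks per region -> 2 KB regions (assumed)

def record_access(regions, addr):
    region = addr // (BLOCK * REGION_BLOCKS)
    offset = (addr // BLOCK) % REGION_BLOCKS
    regions[region] = regions.get(region, 0) | (1 << offset)  # compact bitmap

def blocks_to_prefetch(regions):
    for region, bitmap in regions.items():
        base = region * BLOCK * REGION_BLOCKS
        for offset in range(REGION_BLOCKS):
            if bitmap & (1 << offset):
                yield base + offset * BLOCK

history = {}
for a in (0x1000, 0x1040, 0x2000):
    record_access(history, a)
print([hex(a) for a in blocks_to_prefetch(history)])  # ['0x1000', '0x1040', '0x2000']
```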
  • Patent number: 8627010
    Abstract: An apparatus and computer program product for improving performance of a parallel computing system. A first hardware local cache controller associated with a first local cache memory device of a first processor detects an occurrence of a false sharing of a first cache line by a second processor running the program code and allows the false sharing of the first cache line by the second processor. The false sharing of the first cache line occurs upon updating a first portion of the first cache line in the first local cache memory device by the first hardware local cache controller and a subsequent update of a second portion of the first cache line in a second local cache memory device by a second hardware local cache controller.
    Type: Grant
    Filed: September 5, 2012
    Date of Patent: January 7, 2014
    Assignee: International Business Machines Corporation
    Inventors: Alexandre E. Eichenberger, Alan G. Gara, Martin Ohmacht, Vijayalakshmi Srinivasan
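The false-sharing patents in this listing (this entry, 8516197, and publication 20120331232) describe letting two processors keep writing disjoint portions of the same cache line. The sketch below models only the end result, with per-byte written flags and a merge step; the flag scheme and merge function are assumptions for illustration, not the claimed cache-controller hardware.

```python
# Illustrative model: per-byte "written" flags let two local copies of a
# falsely shared cache line merge their disjoint updates.
LINE_SIZE = 16

class LocalLine:
    def __init__(self):
        self.data = bytearray(LINE_SIZE)
        self.written = [False] * LINE_SIZE   # bytes this cache actually wrote

    def store(self, offset, value):
        self.data[offset] = value
        self.written[offset] = True

def merge(shared, *local_lines):
    for line in local_lines:
        for i, wrote in enumerate(line.written):
            if wrote:
                shared[i] = line.data[i]     # only written bytes propagate
    return shared

a, b = LocalLine(), LocalLine()
a.store(0, 0x11)    # processor 1 updates the first portion of the line
b.store(8, 0x22)    # processor 2 updates a different portion (false sharing)
print(merge(bytearray(LINE_SIZE), a, b).hex())
```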
  • Patent number: 8589762
    Abstract: Multi-bit stuck-at fault error recovery can be enabled by an adaptive multi-bit error correction method, in which the overhead of error correction hardware is reduced without affecting the lifetime of the memory device. Error correction logic hardware is decoupled from memory blocks. An error correction logic block is partitioned such that error correction logic entries support different numbers of error correction capabilities based on the probability of occurrence of different numbers of errors in different memory blocks. Faulty memory blocks are mapped to appropriate error correction logic entries. The mapping can be one-to-one or many-to-one depending on the embodiment. The adaptive partitioning of the error correction logic entries can be configured to match the projected statistical distribution of errors in logic blocks, and can reduce the total error correction logic overhead, provide sufficient error correction, and/or extend the lifetime of the memory device.
    Type: Grant
    Filed: July 5, 2011
    Date of Patent: November 19, 2013
    Assignee: International Business Machines Corporation
    Inventors: Jude A. Rivers, Vijayalakshmi Srinivasan
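The abstract above partitions error-correction entries by capability and maps faulty blocks onto them. The sketch below illustrates that mapping with an invented pool (many single-bit entries, few multi-bit entries); the pool sizes and function name are assumptions, not the patented partitioning.

```python
# Illustrative mapping of faulty memory blocks onto a partitioned pool of
# error-correction entries: most entries correct one bit, few correct more.
ecc_pool = {1: 12, 2: 3, 3: 1}      # correctable bits -> available entries (assumed)

def assign_entry(stuck_at_faults):
    """Map a faulty block to the cheapest entry that can correct it."""
    for capability in sorted(ecc_pool):
        if capability >= stuck_at_faults and ecc_pool[capability] > 0:
            ecc_pool[capability] -= 1
            return capability
    return None                      # beyond provisioned capability: retire the block

print(assign_entry(1))   # -> 1 (the common case uses a cheap entry)
print(assign_entry(2))   # -> 2
print(assign_entry(4))   # -> None
```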
  • Patent number: 8521999
    Abstract: A method comprising receiving a branch instruction, decoding a branch address and the branch instruction, executing a branch action associated with the branch address, determining whether a branch associated with the branch action was taken, and saving an identifier of the branch instruction and an indicator that the branch action was taken in a prefetch history table responsive to determining that the branch associated with the branch action was taken.
    Type: Grant
    Filed: March 11, 2010
    Date of Patent: August 27, 2013
    Assignee: International Business Machines Corporation
    Inventors: Philip G. Emma, Allan M. Hartstein, Brian R. Prasky, Thomas R. Puzak, Vijayalakshmi Srinivasan
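The abstract above records taken branches in a prefetch history table keyed by an identifier of the branch instruction. The sketch below is a bare-bones dictionary version of that bookkeeping; the function names and lookup behavior are assumptions for illustration.

```python
# Bare-bones sketch: remember which branch instructions were taken so later
# lookups (e.g., to drive prefetching) can consult the history.
prefetch_history = {}    # branch instruction identifier -> taken indicator

def record_branch(branch_id, taken):
    if taken:                                # the abstract saves taken branches
        prefetch_history[branch_id] = True

def was_taken(branch_id):
    return prefetch_history.get(branch_id, False)

record_branch(0x4004F0, taken=True)
record_branch(0x400510, taken=False)
print(was_taken(0x4004F0), was_taken(0x400510))  # True False
```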
  • Patent number: 8516197
    Abstract: An apparatus, method and computer program product for improving performance of a parallel computing system. A first hardware local cache controller associated with a first local cache memory device of a first processor detects an occurrence of a false sharing of a first cache line by a second processor running the program code and allows the false sharing of the first cache line by the second processor. The false sharing of the first cache line occurs upon updating a first portion of the first cache line in the first local cache memory device by the first hardware local cache controller and a subsequent update of a second portion of the first cache line in a second local cache memory device by a second hardware local cache controller.
    Type: Grant
    Filed: February 11, 2011
    Date of Patent: August 20, 2013
    Assignee: International Business Machines Corporation
    Inventors: Alexandre E. Eichenberger, Alan G. Gara, Martin Ohmacht, Vijayalakshmi Srinivasan
  • Publication number: 20130205118
    Abstract: A computer system for instruction execution includes a processor having a pipeline. The system is configured to perform a method including fetching, in the pipeline, a plurality of instructions, wherein the plurality of instructions includes a plurality of branch instructions; assigning a branch uncertainty to each of the plurality of branch instructions; assigning, to each of the plurality of instructions, an instruction uncertainty that is a summation of the branch uncertainties of older unresolved branches; and balancing the instructions in the pipeline based on a current summation of instruction uncertainty.
    Type: Application
    Filed: February 6, 2012
    Publication date: August 8, 2013
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Alper Buyuktosunoglu, Brian R. Prasky, Vijayalakshmi Srinivasan
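The abstract above sums the uncertainties of older unresolved branches to score each instruction and balances the pipeline on that sum. The sketch below shows the arithmetic and one plausible balancing rule (fetch from the thread with the lowest total uncertainty); the numbers and the rule itself are assumptions for illustration.

```python
# Sketch of the uncertainty arithmetic: each instruction's uncertainty is the
# running sum of the branch uncertainties of older, still-unresolved branches.
def instruction_uncertainties(branch_uncertainty_per_instruction):
    """Input: one value per in-flight instruction (0.0 for non-branches)."""
    total, result = 0.0, []
    for u in branch_uncertainty_per_instruction:
        result.append(total)     # uncertainty inherited from older branches
        total += u
    return result

def pick_thread(total_uncertainty_by_thread):
    # One possible balancing rule: favor the least uncertain thread.
    return min(total_uncertainty_by_thread, key=total_uncertainty_by_thread.get)

print(instruction_uncertainties([0.1, 0.0, 0.5, 0.0]))  # [0.0, 0.1, 0.1, 0.6]
print(pick_thread({"T0": 1.7, "T1": 0.4}))              # 'T1'
```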
  • Patent number: 8417917
    Abstract: A mechanism is provided for improving the performance and efficiency of multi-core processors. A system controller in a data processing system determines an operational function for each primary processor core in a set of primary processor cores in a primary processor core logic layer and for each secondary processor core in a set of secondary processor cores in a secondary processor core logic layer, thereby forming a set of determined operational functions. The system controller then generates an initial configuration, based on the set of determined operational functions, for initializing the set of primary processor cores and the set of secondary processor cores in the three-dimensional processor core architecture. The initial configuration indicates how at least one primary processor core of the set of primary processor cores collaborates with at least one secondary processor core of the set of secondary processor cores.
    Type: Grant
    Filed: September 30, 2009
    Date of Patent: April 9, 2013
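The abstract above has a system controller assign an operational function to every core in the two stacked layers and derive an initial configuration pairing primaries with secondaries. The sketch below shows only the flavor of such a pairing step, with invented function labels and a made-up pairing rule; it is not the architecture described in the patent.

```python
# Hypothetical pairing step: match each primary core with a secondary core
# in the stacked layer based on its determined operational function.
primary_functions = {"P0": "compute", "P1": "compute", "P2": "io"}
secondary_functions = {"S0": "accelerator", "S1": "accelerator", "S2": "spare"}

def initial_configuration(primaries, secondaries):
    accels = [s for s, f in secondaries.items() if f == "accelerator"]
    spares = [s for s, f in secondaries.items() if f == "spare"]
    config = {}
    for core, function in primaries.items():
        pool = accels if function == "compute" else spares  # made-up rule
        config[core] = pool.pop(0) if pool else None
    return config

print(initial_configuration(primary_functions, secondary_functions))
# -> {'P0': 'S0', 'P1': 'S1', 'P2': 'S2'}
```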
  • Publication number: 20130024662
    Abstract: Systems and methods are disclosed that allow atomic updates to global data to be at least partially eliminated to reduce synchronization overhead in parallel computing. A compiler analyzes the data to be processed to selectively permit unsynchronized data transfer for at least one type of data. A programmer may provide a hint to expressly identify the types of data that are candidates for unsynchronized data transfer. In one embodiment, the synchronization overhead is reducible by generating an application program that selectively substitutes code for unsynchronized data transfer in place of a subset of the code for synchronized data transfer. In another embodiment, the synchronization overhead is reducible by a combination of software and hardware, using relaxation data registers and decoders that collectively convert a subset of commands for synchronized data transfer into commands for unsynchronized data transfer.
    Type: Application
    Filed: July 18, 2011
    Publication date: January 24, 2013
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Lakshminarayanan Renganarayana, Vijayalakshmi Srinivasan
  • Publication number: 20130013977
    Abstract: Multi-bit stuck-at fault error recovery can be enabled by an adaptive multi-bit error correction method, in which the overhead of error correction hardware is reduced without affecting the lifetime of the memory device. Error correction logic hardware is decoupled from memory blocks. An error correction logic block is partitioned such that error correction logic entries support different numbers of error correction capabilities based on the probability of occurrence of different numbers of errors in different memory blocks. Faulty memory blocks are mapped to appropriate error correction logic entries. The mapping can be one-to-one or many-to-one depending on the embodiment. The adaptive partitioning of the error correction logic entries can be configured to match the projected statistical distribution of errors in logic blocks, and can reduce the total error correction logic overhead, provide sufficient error correction, and/or extend the lifetime of the memory device.
    Type: Application
    Filed: July 5, 2011
    Publication date: January 10, 2013
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Jude A. Rivers, Vijayalakshmi Srinivasan
  • Publication number: 20130007423
    Abstract: Systems and methods for predicting out-of-order instruction-level parallelism (ILP) of threads being executed in a multi-threaded processor and prioritizing scheduling thereof are described herein. One aspect provides for tracking completion of instructions using a global completion table having a head segment and a tail segment; storing prediction values for each instruction in a prediction table indexed via instruction identifiers associated with each instruction, a prediction value being configured to indicate an instruction is predicted to issue from one of: the head segment and the tail segment; and predicting threads with more instructions issuing from the tail segment have a higher degree of out-of-order instruction-level parallelism. Other embodiments and aspects are also described herein.
    Type: Application
    Filed: June 29, 2011
    Publication date: January 3, 2013
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Ioana Monica Burcea, Alper Buyuktosunoglu, Brian Robert Prasky, Vijayalakshmi Srinivasan
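The abstract above predicts out-of-order ILP from whether instructions issue out of the head or tail segment of a global completion table. The sketch below keeps per-thread head/tail issue counts and ranks threads by tail share; the counters and ranking rule are assumptions for illustration.

```python
# Sketch: count per-thread issues from the head vs. tail segment of a global
# completion table; a higher tail share predicts more out-of-order ILP.
from collections import defaultdict

issue_counts = defaultdict(lambda: {"head": 0, "tail": 0})

def record_issue(thread, from_tail):
    issue_counts[thread]["tail" if from_tail else "head"] += 1

def tail_share(thread):
    c = issue_counts[thread]
    total = c["head"] + c["tail"]
    return c["tail"] / total if total else 0.0

record_issue("T0", from_tail=True)
record_issue("T0", from_tail=True)
record_issue("T1", from_tail=False)
print(sorted(issue_counts, key=tail_share, reverse=True))  # ['T0', 'T1']
```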
  • Publication number: 20120331232
    Abstract: An apparatus and computer program product for improving performance of a parallel computing system. A first hardware local cache controller associated with a first local cache memory device of a first processor detects an occurrence of a false sharing of a first cache line by a second processor running the program code and allows the false sharing of the first cache line by the second processor. The false sharing of the first cache line occurs upon updating a first portion of the first cache line in the first local cache memory device by the first hardware local cache controller and a subsequent update of a second portion of the first cache line in a second local cache memory device by a second hardware local cache controller.
    Type: Application
    Filed: September 5, 2012
    Publication date: December 27, 2012
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Alexandre E. Eichenberger, Alan Gara, Martin Ohmacht, Vijayalakshmi Srinivasan
  • Publication number: 20120284463
    Abstract: In a decode stage of hardware processor pipeline, one particular instruction of a plurality of instructions is decoded. It is determined that the particular instruction requires a memory access. Responsive to such determination, it is predicted whether the memory access will result in a cache miss. The predicting in turn includes accessing one of a plurality of entries in a pattern history table stored as a hardware table in the decode stage. The accessing is based, at least in part, upon at least a most recent entry in a global history buffer. The pattern history table stores a plurality of predictions. The global history buffer stores actual results of previous memory accesses as one of cache hits and cache misses.
    Type: Application
    Filed: May 2, 2011
    Publication date: November 8, 2012
    Applicant: International Business Machines Corporation
    Inventors: Vijayalakshmi Srinivasan, Brian R. Prasky
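The final abstract predicts cache misses at decode using a pattern history table indexed by a global history of recent hit/miss outcomes. The sketch below is a classic two-bit-counter version of that structure, assuming an invented history length and indexing scheme rather than the ones in the patent.

```python
# Sketch: predict at decode whether a memory access will miss, using a pattern
# history table of 2-bit saturating counters indexed by recent hit/miss history.
HISTORY_BITS = 4                         # assumed history length
pht = [1] * (1 << HISTORY_BITS)          # counters start weakly predicting "hit"
global_history = 0                       # 1 bit per recent access: 1 = miss

def predict_miss():
    return pht[global_history] >= 2      # upper half of the counter -> predict miss

def update(actual_miss):
    global global_history
    idx = global_history
    pht[idx] = min(3, pht[idx] + 1) if actual_miss else max(0, pht[idx] - 1)
    global_history = ((global_history << 1) | int(actual_miss)) & ((1 << HISTORY_BITS) - 1)

for outcome in (True, True, False, True):   # observed miss/hit stream
    print(predict_miss(), end=" ")
    update(outcome)
```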