Patents by Inventor Marc A. Gollub

Marc A. Gollub has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 8990641
    Abstract: In a data processing system, a selection is made, based at least on addresses of previously detected errors in a memory subsystem, between at least a first timing and a second timing of data transmission with respect to completion of error detection processing on a target memory block of the memory access request. In response to receipt of the memory access request and selection of the first timing, data from the target memory block is transmitted to a requestor prior to completion of error detection processing on the target memory block. In response to receipt of the memory access request and selection of the second timing, data from the target memory block is transmitted to the requestor after and in response to completion of error detection processing on the target memory block.
    Type: Grant
    Filed: November 16, 2012
    Date of Patent: March 24, 2015
    Assignee: International Business Machines Corporation
    Inventors: Marc A. Gollub, Benjiman L. Goodman, Sujatha Kashyap, Eric E. Retter, Yuen C. Tschang
  • Patent number: 8862953
    Abstract: A method includes directing an access of a memory location of a memory device to an error correction code (ECC) decoder in response to receiving a test activation request indicating the memory location. The method also includes writing a test pattern to the memory location and reading a value from the memory location. The method further includes determining whether a fault is detected at the memory location based on a comparison of the test pattern and the value.
    Type: Grant
    Filed: January 4, 2013
    Date of Patent: October 14, 2014
    Assignee: International Business Machines Corporation
    Inventors: Marc A. Gollub, Girisankar Paulraj, Diyanesh B. Vidyapoornachary, Kenneth L. Wright
  • Publication number: 20140281681
    Abstract: According to one embodiment, a method for error correction in a memory module having ranks is provided where each rank has memory devices. The method includes determining a first mark condition for a first rank of the memory module, the first mark condition based on one or more uncorrectable error occurring in a first memory device in the first rank, placing a first mark in the first memory device, determining a second mark condition for the first rank, the second mark condition based on one or more uncorrectable error occurring in a second memory device in the first rank, placing a second mark in a third memory device in a second rank of the memory module and configuring the first memory device to respond to commands directed to the second rank, wherein configuring the first memory device is based on placing of the first mark and the second mark.
    Type: Application
    Filed: March 13, 2013
    Publication date: September 18, 2014
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Edgar R. Cordero, Marc A. Gollub, Girisankar Paulraj, Diyanesh B. Vidyapoornachary
  • Publication number: 20140195852
    Abstract: A method includes reading, at a memory controller, data from a first dynamic random-access memory (DRAM) die layer of a DRAM stack. The method also includes writing the data to a second DRAM die layer of the DRAM stack. The method further includes sending a request to a test engine to test the first DRAM die layer after writing the data to the second DRAM die layer.
    Type: Application
    Filed: January 9, 2013
    Publication date: July 10, 2014
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Marc A. Gollub, Girisankar Paulraj, Diyanesh B. Vidyapoornachary, Kenneth L. Wright
  • Publication number: 20140195867
    Abstract: A method includes directing an access of a memory location of a memory device to an error correction code (ECC) decoder in response to receiving a test activation request indicating the memory location. The method also includes writing a test pattern to the memory location and reading a value from the memory location. The method further includes determining whether a fault is detected at the memory location based on a comparison of the test pattern and the value.
    Type: Application
    Filed: January 4, 2013
    Publication date: July 10, 2014
    Applicant: International Business Machines Corporation
    Inventors: Marc A. Gollub, Girisankar Paulraj, Diyanesh B. Vidyapoornachary, Kenneth L. Wright
  • Publication number: 20140143612
    Abstract: In a data processing system, a selection is made, based at least on addresses of previously detected errors in a memory subsystem, between at least a first timing and a second timing of data transmission with respect to completion of error detection processing on a target memory block of the memory access request. In response to receipt of the memory access request and selection of the first timing, data from the target memory block is transmitted to a requestor prior to completion of error detection processing on the target memory block. In response to receipt of the memory access request and selection of the second timing, data from the target memory block is transmitted to the requestor after and in response to completion of error detection processing on the target memory block.
    Type: Application
    Filed: November 16, 2012
    Publication date: May 22, 2014
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: MARC A. GOLLUB, BENJIMAN L. GOODMAN, SUJATHA KASHYAP, ERIC E. RETTER, YUEN C. TSCHANG
  • Patent number: 8689080
    Abstract: In some example embodiments, a method includes performing a memory scrub of a memory across a scrub cycle of multiple scrub cycles. The method includes identifying correctable errors of symbols in the memory that are a result of accesses from a section of the memory in response to the memory scrub. The method also includes performing an analysis across the multiple scrub cycles, wherein the performing of the analysis comprises determining whether at least two symbols across the multiple scrub cycles have at least one correctable error. The method includes responsive to determining that at least two symbols across the multiple scrub cycles have at least one correctable error, executing at least one repair of the memory that includes the section of memory.
    Type: Grant
    Filed: August 21, 2012
    Date of Patent: April 1, 2014
    Assignee: International Business Machines Corporation
    Inventors: Jay W. Carman, Marc A. Gollub, Anshuman Khandual, Jyotindra Patel
  • Patent number: 8650437
    Abstract: A method and apparatus for controlling marking store updates in a central electronic complex with a plurality of core processors and eDRAM cache and interconnect bus to a service processor for loading memory controller firmware to dual-channel DDR3 memory controllers with an internal marking store. Loaded firmware of the memory controllers is responsible for tracking of ECC errors using a ECC decoder control whereby said marking store is written by a slow ECC decoder, and read by a fast ECC decoder for every read operation of said memory controllers to provide a blocking mechanism for notifying marking store firmware when the marking store has been updated and which guarantees that marking store firmware cannot write to the marking store until the marking store firmware has seen updates without causing the marking store hardware to time out.
    Type: Grant
    Filed: June 29, 2010
    Date of Patent: February 11, 2014
    Assignee: International Business Machines Corporation
    Inventors: Richard E. Fry, Marc A. Gollub, Luis A. Lastras-Montano, Eric E. Retter, Kenneth L. Wright
  • Patent number: 8640006
    Abstract: An apparatus includes a processor, a memory, and an error module operable on the processor. The error module is configured to perform a memory scrub of the memory across a scrub cycle of multiple scrub cycles. The error module is configured to identify correctable errors of symbols in the memory that are a result of accesses from a section of the memory in response to the memory scrub. The error module is configured to perform an analysis across the multiple scrub cycles, wherein the analysis comprises a determination whether at least two symbols across the multiple scrub cycles have at least one correctable error. The error module is configured to responsive to a determination that at least two symbols across the multiple scrub cycles have at least one correctable error, execute at least one repair of the memory that includes the section of memory.
    Type: Grant
    Filed: June 29, 2011
    Date of Patent: January 28, 2014
    Assignee: International Business Machines Corporation
    Inventors: Jay W. Carman, Marc A. Gollub, Anshuman Khandual, Jyotindra Patel
  • Patent number: 8484521
    Abstract: Mechanisms are provided in which firmware verifies he entire system's memory scrub coverage through some additional memory controller (MC) registers/attentions and builds up a processor runtime diagnostic (PRD) scrub coverage table during every scrub cycle. Firmware may go through the scrub coverage table rank-by-rank on a periodic basis to determine whether any ranks had not been covered by hardware scrubbing. Firmware may initiate a targeted scrub and diagnostic for all of the ranks that did not have adequate scrub coverage. If for some reason the system still has some memory ranks that have not been covered by the initial hardware scrub and the targeted scrub, then the firmware may perform some course of action for fault isolation.
    Type: Grant
    Filed: April 7, 2011
    Date of Patent: July 9, 2013
    Assignee: International Business Machines Corporation
    Inventors: Jay W. Carman, Marc A. Gollub, Anshuman Khandual
  • Patent number: 8352806
    Abstract: A system to improve memory failure management may include memory, and an error control decoder to determine failures in the memory. The system may also include an agent that may monitor failures in the memory. The system may further include a table where the error control decoder may record the failures, and where the agent can read and write to.
    Type: Grant
    Filed: January 31, 2008
    Date of Patent: January 8, 2013
    Assignee: International Business Machines Corporation
    Inventors: Marc A. Gollub, Luis A. Lastras-Montano, Piyush C. Patel, Eric E. Retter, Barry M. Trager, Shmuel Winograd, Kenneth L. Wright
  • Publication number: 20130007542
    Abstract: In some example embodiments, a method includes performing a memory scrub of a memory across a scrub cycle of multiple scrub cycles. The method includes identifying correctable errors of symbols in the memory that are a result of accesses from a section of the memory in response to the memory scrub. The method also includes performing an analysis across the multiple scrub cycles, wherein the performing of the analysis comprises determining whether at least two symbols across the multiple scrub cycles have at least one correctable error. The method includes responsive to determining that at least two symbols across the multiple scrub cycles have at least one correctable error, executing at least one repair of the memory that includes the section of memory.
    Type: Application
    Filed: August 21, 2012
    Publication date: January 3, 2013
    Applicant: International Business Machines Corporation
    Inventors: Jay W. Carman, Marc A. Gollub, Anshuman Khandual, Jyotindra Patel
  • Publication number: 20130007541
    Abstract: An apparatus includes a processor, a memory, and an error module operable on the processor. The error module is configured to perform a memory scrub of the memory across a scrub cycle of multiple scrub cycles. The error module is configured to identify correctable errors of symbols in the memory that are a result of accesses from a section of the memory in response to the memory scrub. The error module is configured to perform an analysis across the multiple scrub cycles, wherein the analysis comprises a determination whether at least two symbols across the multiple scrub cycles have at least one correctable error. The error module is configured to responsive to a determination that at least two symbols across the multiple scrub cycles have at least one correctable error, execute at least one repair of the memory that includes the section of memory.
    Type: Application
    Filed: June 29, 2011
    Publication date: January 3, 2013
    Applicant: International Business Machines Corporation
    Inventors: Jay W. Carman, Marc A. Gollub, Anshuman Khandual, Jyotindra Patel
  • Publication number: 20120260139
    Abstract: Mechanisms are provided in which firmware verifies he entire system's memory scrub coverage through some additional memory controller (MC) registers/attentions and builds up a processor runtime diagnostic (PRD) scrub coverage table during every scrub cycle. Firmware may go through the scrub coverage table rank-by-rank on a periodic basis to determine whether any ranks had not been covered by hardware scrubbing. Firmware may initiate a targeted scrub and diagnostic for all of the ranks that did not have adequate scrub coverage. If for some reason the system still has some memory ranks that have not been covered by the initial hardware scrub and the targeted scrub, then the firmware may perform some course of action for fault isolation.
    Type: Application
    Filed: April 7, 2011
    Publication date: October 11, 2012
    Applicant: International Business Machines Corporation
    Inventors: Jay W. Carman, Marc A. Gollub, Anshuman Khandual
  • Patent number: 8195981
    Abstract: Embodiments of the invention provide an interrupt handler configured to distinguish between critical and non-critical unrecoverable memory errors, yielding different actions for each. Doing so may allow a system to recover from certain memory errors without having to terminate a running process. In addition, when an operating system critical task experiences an unrecoverable error, such a task may be acting on behalf of a non-critical process (e.g., when swapping out a virtual memory page). When this occurs, an interrupt handler may respond to a memory error with the same response that would result had the process itself performed the memory operation. Further, firmware may be configured to perform diagnostics to identify potential memory errors and alert the operating system before a memory region state change occurs, such that the memory error would become critical.
    Type: Grant
    Filed: June 3, 2008
    Date of Patent: June 5, 2012
    Assignee: International Business Machines Corporation
    Inventors: Marc A. Gollub, Zane C. Shelley, Alwood P. Williams, III
  • Patent number: 8103900
    Abstract: A method and circuit for implementing enhanced memory reliability using memory scrub operations to determine a frequency of intermittent correctable errors, and a design structure on which the subject circuit resides are provided. A memory scrub for intermittent performs at least two reads before moving to a next memory scrub address. A number of intermittent errors is tracked where an intermittent error is identified, responsive to identifying one failing read and one passing read of the at least two reads.
    Type: Grant
    Filed: July 28, 2009
    Date of Patent: January 24, 2012
    Assignee: International Business Machines Corporation
    Inventors: Richard E. Fry, Marc A. Gollub, Eric E. Retter, Kenneth L. Wright
  • Publication number: 20110320911
    Abstract: A method and apparatus for controlling marking store updates in a central electronic complex with a plurality of core processors and eDRAM cache and interconnect bus to a service processor for loading memory controller firmware to dual-channel DDR3 memory controllers with an internal marking store. Loaded firmware of the memory controllers is responsible for tracking of ECC errors using a ECC decoder control whereby said marking store is written by a slow ECC decoder, and read by a fast ECC decoder for every read operation of said memory controllers to provide a blocking mechanism for notifying marking store firmware when the marking store has been updated and which guarantees that marking store firmware cannot write to the marking store until the marking store firmware has seen updates without causing the marking store hardware to time out.
    Type: Application
    Filed: June 29, 2010
    Publication date: December 29, 2011
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Richard E. Fry, Marc A. Gollub, Luis A. Lastras-Montano, Eric E. Retter, Kenneth L. Wright
  • Patent number: 7953914
    Abstract: Embodiments of the invention provide an interrupt handler configured to distinguish between critical and non-critical unrecoverable memory errors, yielding different actions for each. Doing so may allow a system to recover from certain memory errors without having to terminate a running process. In addition, when an operating system critical task experiences an unrecoverable error, such a task may be acting on behalf of a non-critical process (e.g., when swapping out a virtual memory page). When this occurs, an interrupt handler may respond to a memory error with the same response that would result had the process itself performed the memory operation. Further, firmware may be configured to perform diagnostics to identify potential memory errors and alert the operating system before a memory region state change occurs, such that the memory error would become critical.
    Type: Grant
    Filed: June 3, 2008
    Date of Patent: May 31, 2011
    Assignee: International Business Machines Corporation
    Inventors: Marc A. Gollub, Zane C. Shelley, Alwood P. Williams, III
  • Patent number: 7900100
    Abstract: A system, method and program product for utilizing error correction code (ECC) logic to detect multi-bit errors. In one embodiment, a first test pattern and a second test pattern are applied to a set of hardware bit positions. The first and second patterns are multiple logic level patterns and the second test pattern is the logical complement of the first test pattern. The first and second test patterns are utilized by the ECC logic to detect correctable errors having n or fewer bits. One or more bit positions of a first correctable error occurring responsive to applying the first test pattern are determined and one or more bit positions of a second correctable error occurring responsive to applying the second test pattern are determined. The determined bit positions of the first and second correctable errors are processed to identify a multiple-bit error within the set of hardware bit positions.
    Type: Grant
    Filed: February 21, 2007
    Date of Patent: March 1, 2011
    Assignee: International Business Machines Corporation
    Inventor: Marc A. Gollub
  • Patent number: 7895477
    Abstract: Embodiments of the invention provide an interrupt handler configured to distinguish between critical and non-critical unrecoverable memory errors, yielding different actions for each. Doing so may allow a system to recover from certain memory errors without having to terminate a running process. In addition, when an operating system critical task experiences an unrecoverable error, such a task may be acting on behalf of a non-critical process (e.g., when swapping out a virtual memory page). When this occurs, an interrupt handler may respond to a memory error with the same response that would result had the process itself performed the memory operation. Further, firmware may be configured to perform diagnostics to identify potential memory errors and alert the operating system before a memory region state change occurs, such that the memory error would become critical.
    Type: Grant
    Filed: June 3, 2008
    Date of Patent: February 22, 2011
    Assignee: International Business Machines Corporation
    Inventors: Marc A. Gollub, Zane C. Shelley, Alwood P. Williams, III