Patents by Inventor Douglas Craig Bossen

Douglas Craig Bossen has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

Apparatus and method of repairing a processor array for a failure detected at runtime

Patent number: 6851071

Abstract: An apparatus and method of repairing a processor array for a failure detected at runtime in a system supporting persistent component deallocation are provided. The apparatus and method of the present invention allow redundant array bits to be used for recoverable faults detected in arrays during run time, instead of only at system boot, while still maintaining the dynamic and persistent processor deallocation features of the computing system. With the apparatus and method of the present invention, a failure of a cache array is detected and a determination is made as to whether a repairable failure threshold is exceeded during runtime. If this threshold is exceeded, a determination is made as to whether cache array redundancy may be applied to correct the failure, i.e. a bit error. If so, the cache array redundancy is applied without marking the processor as unavailable.

Type: Grant

Filed: October 11, 2001

Date of Patent: February 1, 2005

Assignee: International Business Machines Corporation

Inventors: Douglas Craig Bossen, Daniel James Henderson, Raymond Leslie Hicks, Alongkorn Kitamorn, David Otto Lewis, Thomas Alan Liebsch
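
The runtime decision described in the abstract for patent 6851071 can be illustrated with a minimal sketch. The class name, the threshold value, the spare-bit counter, the counter reset after a repair, and the fallback to deallocation when no redundancy remains are all assumptions made for illustration; they are not taken from the patent.

```python
from dataclasses import dataclass


@dataclass
class CacheArray:
    recoverable_errors: int = 0   # recoverable faults seen at runtime
    spare_bits_left: int = 2      # hypothetical count of unused redundant bits


def handle_recoverable_error(array: CacheArray, threshold: int = 3) -> str:
    """Return the action taken for one recoverable cache-array fault."""
    array.recoverable_errors += 1
    if array.recoverable_errors <= threshold:
        return "log_only"                 # repairable-failure threshold not exceeded
    if array.spare_bits_left > 0:
        array.spare_bits_left -= 1        # apply array redundancy at runtime
        array.recoverable_errors = 0      # assumption: counting restarts after a repair
        return "repair_applied"           # processor is not marked unavailable
    return "deallocate_processor"         # assumption: fall back to deallocation
```

With `threshold=2` and a single spare bit, the third fault triggers a repair and a later third fault, with no spares left, leads to deallocation.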

Method, apparatus, and computer program product for deconfiguring a processor

Patent number: 6789048

Abstract: According to a method form of the invention, in a computer system having a processing load distributed among a number of processors in the system, test computations are performed at intervals by floating point logic of a processor responsive to stored test instructions. Responsive to the test computations indicating an erroneous result by one of the processors, information is passed by a firmware process and entered into an operating system error log. Responsive to the information, an operating system deconfiguration service is notified of the error log entry, and the service deconfigures the indicated processor while the system is still running.

Type: Grant

Filed: April 4, 2002

Date of Patent: September 7, 2004

Assignee: International Business Machines Corporation

Inventors: Richard Louis Arndt, Douglas Marvin Benignus, Douglas Craig Bossen, Daniel James Henderson, Alongkorn Kitamorn
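
As a rough illustration of the flow in the abstract for patent 6789048 (periodic floating-point test, error log entry, deconfiguration), the sketch below models each processor as a callable that multiplies two numbers. The function names, the log entry layout, and the single test vector are hypothetical.

```python
def run_fp_self_test(processors, error_log, expected=0.375):
    """Run one test interval; return the ids of processors that failed.

    `processors` maps a CPU id to a callable standing in for its
    floating-point multiply logic (an assumption for this sketch).
    """
    failed = []
    for cpu_id, fp_multiply in processors.items():
        # Stored test computation: 0.75 * 0.5 is exactly 0.375 in binary floating point.
        if fp_multiply(0.75, 0.5) != expected:
            error_log.append({"cpu": cpu_id, "error": "fp_self_test"})
            failed.append(cpu_id)
    return failed


def deconfigure_from_log(error_log, online):
    """Remove every CPU named in the error log from the set of online CPUs."""
    for entry in error_log:
        online.discard(entry["cpu"])
    return online
```

In a real system the test would run on each physical processor at intervals and the deconfiguration service would be notified by the operating system; here both steps are plain function calls.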

Method and system for end-to-end problem determination and fault isolation for storage area networks

Patent number: 6636981

Abstract: A method and system for problem determination and fault isolation in a storage area network (SAN) is provided. A complex configuration of multi-vendor host systems, FC switches, and storage peripherals is connected in a SAN via a communications architecture (CA). A communications architecture element (CAE) is a network-connected device that has successfully registered with a communications architecture manager (CAM) on a host computer via a network service protocol, and the CAM contains problem determination (PD) functionality for the SAN and maintains a SAN PD information table (SPDIT). The CA comprises all network-connected elements capable of communicating information stored in the SPDIT. A SAN topology map and the SPDIT are used to create a SAN diagnostic table (SDT). A failing component in a particular device may generate errors that cause devices along the same network connection path to generate errors.

Type: Grant

Filed: January 6, 2000

Date of Patent: October 21, 2003

Assignee: International Business Machines Corporation

Inventors: Barry Stanley Barnett, Douglas Craig Bossen

Method, apparatus, and computer program product for deconfiguring a processor

Publication number: 20030191607

Abstract: According to a method form of the invention, in a computer system having a processing load distributed among a number of processors in the system, test computations are performed at intervals by floating point logic of a processor responsive to stored test instructions. Responsive to the test computations indicating an erroneous result by one of the processors, information is passed by a firmware process and entered into an operating system error log. Responsive to the information, an operating system deconfiguration service is notified of the error log entry, and the service deconfigures the indicated processor while the system is still running.

Type: Application

Filed: April 4, 2002

Publication date: October 9, 2003

Applicant: International Business Machines Corporation

Inventors: Richard Louis Arndt, Douglas Marvin Benignus, Douglas Craig Bossen, Daniel James Henderson, Alongkorn Kitamorn

Apparatus and method of repairing a processor array for a failure detected at runtime

Publication number: 20030074598

Abstract: An apparatus and method of repairing a processor array for a failure detected at runtime in a system supporting persistent component deallocation are provided. The apparatus and method of the present invention allow redundant array bits to be used for recoverable faults detected in arrays during run time, instead of only at system boot, while still maintaining the dynamic and persistent processor deallocation features of the computing system. With the apparatus and method of the present invention, a failure of a cache array is detected and a determination is made as to whether a repairable failure threshold is exceeded during runtime. If this threshold is exceeded, a determination is made as to whether cache array redundancy may be applied to correct the failure, i.e. a bit error. If so, the cache array redundancy is applied without marking the processor as unavailable.

Type: Application

Filed: October 11, 2001

Publication date: April 17, 2003

Applicant: International Business Machines Corporation

Inventors: Douglas Craig Bossen, Daniel James Henderson, Raymond Leslie Hicks, Alongkorn Kitamorn, David Otto Lewis, Thomas Alan Liebsch

Method and apparatus for run-time deconfiguration of a processor in a symmetrical multi-processing system

Patent number: 6516429

Abstract: A method and apparatus in a multiprocessor data processing system for managing a plurality of processors. Monitoring for recoverable errors in a set of processors is performed. Responsive to detecting a recoverable error for a processor in the set of processors, a determination is made as to whether the recoverable error indicates a trend towards an unrecoverable error. Responsive to a determination that the recoverable error indicates a trend towards an unrecoverable error, actions are initiated to stop the processor.

Type: Grant

Filed: November 4, 1999

Date of Patent: February 4, 2003

Assignee: International Business Machines Corporation

Inventors: Douglas Craig Bossen, Alongkorn Kitamorn, Charles Andrew McLaughlin, John Thomas O'Quin, II
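
The abstract for patent 6516429 does not say how a "trend towards an unrecoverable error" is determined. One plausible reading, shown here purely as an assumption, is a count of recoverable errors per processor inside a sliding time window. The class name and both parameters are hypothetical.

```python
from collections import deque


class RecoverableErrorMonitor:
    """Flags a CPU once it logs too many recoverable errors in a time window."""

    def __init__(self, max_errors=3, window_seconds=60.0):
        self.max_errors = max_errors
        self.window = window_seconds
        self.timestamps = {}   # cpu id -> deque of recent error times

    def record(self, cpu_id, now):
        """Record one recoverable error; return True if the CPU should be stopped."""
        times = self.timestamps.setdefault(cpu_id, deque())
        times.append(now)
        # Forget errors that are older than the window.
        while times and now - times[0] > self.window:
            times.popleft()
        return len(times) > self.max_errors
```

A caller would invoke `record` from its recoverable-error handler and start the stop sequence for the processor when it returns `True`.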

Recovery mechanism for L1 data cache parity errors

Patent number: 6332181

Abstract: A method of handling a cache error (such as a parity error), which allows a software recovery, by reporting the error using an unrelated system resource, such as an interrupt service, and particularly a data storage interrupt. The parity error can be reported by generating a data storage interrupt and using the data storage interrupt status register (DSISR) to indicate that the data storage interrupt is a result of the parity error. The context of the processor can be fully synchronized while handling the parity error.

Type: Grant

Filed: May 4, 1998

Date of Patent: December 18, 2001

Assignee: International Business Machines Corporation

Inventors: Douglas Craig Bossen, Kevin Arthur Chiarot, Namratha Rajasekharaiah Jaisimha, Avijit Saha

Method and system for boot-time deconfiguration of a memory in a processing system

Patent number: 6243823

Abstract: A method and system for deconfiguring software in a processing system is disclosed. In one aspect, a processing system comprises a central processing unit (CPU), and a memory coupled to the CPU. The memory includes a memory array and a memory controller for capturing information concerning the status of the memory array. The processing system includes a service processor for gathering and analyzing status information from the memory controller. The processing system also includes a nonvolatile device coupled to the CPU and the service processor. The nonvolatile device includes a deconfiguration area. The deconfiguration area stores information concerning the status of the memory array from the service processor. The deconfiguration area also provides information for deconfiguring at least a portion of the memory array during a boot time of the processing system. Accordingly, through the present invention, memory errors are detected during normal computer operations by error detection logic.

Type: Grant

Filed: October 2, 1998

Date of Patent: June 5, 2001

Assignee: International Business Machines Corporation

Inventors: Douglas Craig Bossen, Alongkorn Kitamorn, Charles Andrew McLaughlin
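
The role of the deconfiguration area in the abstract for patent 6243823 can be pictured with a small sketch: a service processor records status in a nonvolatile table, and the next boot skips the marked parts of the memory array. The dictionary layout, the bank names, and the error limit are assumptions for illustration only.

```python
def record_status(deconfiguration_area, bank, error_count, limit=10):
    """Service-processor side: mark a memory bank once its error count exceeds a limit."""
    if error_count > limit:
        deconfiguration_area[bank] = "deconfigure"


def banks_to_configure(installed_banks, deconfiguration_area):
    """Boot side: return the banks that are not marked for deconfiguration."""
    return [
        bank for bank in installed_banks
        if deconfiguration_area.get(bank) != "deconfigure"
    ]
```

Here `deconfiguration_area` is an ordinary dictionary standing in for the nonvolatile device; persistence across reboots is outside the scope of the sketch.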

Method and system for boot-time deconfiguration of a processor in a symmetrical multi-processing system

Patent number: 6233680

Abstract: A method and system for deconfiguring a CPU in a processing system is disclosed. In one aspect, a processing system is disclosed that comprises a central processing unit (CPU), a memory coupled to the CPU, and an error status register for capturing information concerning the status of the CPU. The processing system includes a service processor for gathering and analyzing status information from the CPU error register. The processing system also includes a nonvolatile device coupled to the service processor. The nonvolatile device includes a deconfiguration area. The deconfiguration area stores information concerning the status of the CPU from the service processor. The deconfiguration area also provides information for deconfiguring a CPU during a boot time of the processing system. Accordingly, through the present invention, CPU errors are detected during normal computer operations by error detection logic.

Type: Grant

Filed: October 2, 1998

Date of Patent: May 15, 2001

Assignee: International Business Machines Corporation

Inventors: Douglas Craig Bossen, Alongkorn Kitamorn, Charles Andrew McLaughlin

Enhanced error handling for I/O load/store operations to a PCI device via bad parity or zero byte enables

Patent number: 6223299

Abstract: Device select lines from each I/O device are brought into a PCI host bridge individually so that the device number of a failing device may be logged in an error register when an error is seen on the PCI bus. Until the error register is reset, subsequent load and store operations are delayed until the device number of the subject device may be checked against the error register. If the subject device is a previously failing device, the load/store operation to that device is prevented from completing, either by forcing bad parity or zeroing all byte enables. By forcing bad parity or zero byte enables, the I/O device will respond to the load or store request by activating its device select line, but will not accept store data. Operations to devices which are not logged in the error register are permitted to proceed normally, as are all load/store operations when the error register is clear.

Type: Grant

Filed: May 4, 1998

Date of Patent: April 24, 2001

Assignee: International Business Machines Corporation

Inventors: Douglas Craig Bossen, Charles Andrew McLaughlin, Danny Marvin Neal, James Otto Nicholson, Steven Mark Thurber
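
The gating logic in the abstract for patent 6223299 can be modeled in a few lines. This is a software sketch of behavior the patent describes in hardware; the class and method names are hypothetical, and modeling the error register as a set of device numbers is an assumption.

```python
class HostBridge:
    """Toy model of a PCI host bridge with an error register."""

    def __init__(self):
        self.error_register = set()   # device numbers logged as failing

    def log_bus_error(self, device):
        """Log the device number seen failing on the bus."""
        self.error_register.add(device)

    def reset_error_register(self):
        self.error_register.clear()

    def issue(self, device, op):
        """Return how a load/store `op` to `device` is driven onto the bus."""
        if device in self.error_register:
            # Stands in for forcing bad parity or zeroing all byte enables.
            return "blocked"
        return "normal"
```

Operations to devices that are not logged proceed normally, and every operation proceeds normally once the register is reset.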

Time-lag duplexing techniques

Patent number: 6199171

Abstract: A method and implementing system are provided for handling detected faults in a processor to improve reliability of a computer system. An exemplary fault-tolerant on-line transactional (OLT) computer system is illustrated which includes first and second OLT processors connected to an I/O processor through a system bus. Transaction results are stored in local processor buffers and at predetermined batch intervals, the stored transactions are compared. The matched transaction results are flushed to data store while unmatched transactions are re-executed. If the same errors do not occur during a re-execution, the errors are determined to be transient and the transaction results are flushed to storage.

Type: Grant

Filed: June 26, 1998

Date of Patent: March 6, 2001

Assignee: International Business Machines Corporation

Inventors: Douglas Craig Bossen, Arun Chandra
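
The batch comparison in the abstract for patent 6199171 might look roughly like the following. The two processors are modeled as plain callables, the data store as a dictionary, and a single re-execution per mismatch is assumed; none of these names or choices come from the patent.

```python
def commit_batch(batch, run_a, run_b, store):
    """Compare two processors' results for a batch of transactions.

    Matching results are flushed to `store`. A mismatch is re-executed once;
    if the results then agree, the error is treated as transient and flushed.
    Returns the transactions that still disagree after re-execution.
    """
    unresolved = []
    for txn in batch:
        first, second = run_a(txn), run_b(txn)
        if first != second:
            first, second = run_a(txn), run_b(txn)   # re-execute the mismatch
        if first == second:
            store[txn] = first
        else:
            unresolved.append(txn)
    return unresolved
```

What happens to transactions that remain unresolved is not covered by the abstract, so the sketch simply returns them to the caller.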

Method for writing single width bar codes on semiconductor wafers

Patent number: 6179207

Abstract: A single width bar code exhibiting inherent self clocking characteristics is provided so as to be particularly useful in the identification of semiconductor wafers in very large scale integrated circuit manufacturing processes. The codes described herein are robust, reliable and highly readable even in the face of relatively high variations in scanning speed. The codes are also desirably dense in terms of character representations per linear centimeter, an important consideration in semiconductor manufacturing wherein space on the chips and the wafer is at a premium. Additionally, a preferred embodiment of the present invention exhibits a minimum number for the maximum number of spaces between adjacent bars in code symbol sequences.

Type: Grant

Filed: June 18, 1996

Date of Patent: January 30, 2001

Assignee: International Business Machines Corporation

Inventors: Douglas Craig Bossen, Chin-Long Chen, Fredrick Hayes Dill, Douglas Seymore Goodman, Mu-Yue Hsiao, Paul Vincent McCann, James Michael Mulligan, Ricky Allen Rand

Cache error retry technique

Patent number: 6108753

Abstract: A method and apparatus is provided for enhanced error correction processing through a retry mechanism. When an L1 cache instruction line error is detected, either by a parity error detection process or by an ECC (error correcting code) or other process, the disclosed methodology will schedule an automatic retry of the event that caused the line error without re-booting the entire system. Thereafter, if the error remains present after a predetermined number of retries to load the requested data from L1 cache, then a second level of corrective action is undertaken. The second level corrective action includes accessing an alternate memory location, such as the L2 cache for example. If the state of the requested cache line is exclusive or shared, then an artificial L1 miss is generated for use in enabling an L2 access for the requested cache line.

Type: Grant

Filed: March 31, 1998

Date of Patent: August 22, 2000

Assignee: International Business Machines Corporation

Inventors: Douglas Craig Bossen, Manratha Rajasekharaiah Jaisimha, Avijit Saha, Shih-Hsiung Stephen Tung
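
The two-level recovery in the abstract for patent 6108753 (retry the L1 access, then fall back to L2) can be sketched as below. The callables `read_l1` and `read_l2` and the retry count are hypothetical, and the sketch ignores the cache-line coherency state that the abstract mentions.

```python
def load_line(address, read_l1, read_l2, max_retries=3):
    """Load a cache line, retrying L1 on error before falling back to L2.

    `read_l1(address)` returns (data, ok), where ok is False on a parity or
    ECC error; `read_l2(address)` returns data. Both are assumed interfaces.
    """
    for _ in range(1 + max_retries):      # first attempt plus the retries
        data, ok = read_l1(address)
        if ok:
            return data, "l1"
    # Second-level action: treat the access as an L1 miss and go to L2.
    return read_l2(address), "l2"
```

The second element of the returned tuple records which level supplied the data, which is convenient for logging how often the fallback path is taken.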

Method and system for fault-handling to improve reliability of a data-processing system

Patent number: 6058491

Abstract: A method and system for handling detected faults in a processor to improve reliability of a computer system is disclosed. A fault-tolerant computer system is provided which includes a first processor, a second processor, and a comparator. Coupled to a system bus, a first processor is utilized to produce a first output. The second processor, also coupled to the system bus, is utilized to produce a second output. During the operation of the computer system, the second processor operates at the same clock speed as the first processor and lags behind the first processor. The comparator is utilized to compare the first and second output such that an operation will be retried if the first output is not the same as the second output.

Type: Grant

Filed: September 15, 1997

Date of Patent: May 2, 2000

Assignee: International Business Machines Corporation

Inventors: Douglas Craig Bossen, Arun Chandra

Run time error probe in a network computing environment

Patent number: 5978936

Abstract: A first set of test instructions is provided for a first node in a computer network. A corresponding second set is provided for a second node in the network. The test instruction sets are partitioned into modules. The nodes process their respective sets of test instructions independently to generate test results for each module on each node, except when a synchronizing event occurs. Each node stores its test results for each test module. Since the test modules have an ordered processing sequence, each node's test results for corresponding test modules can be compared asynchronously on an ongoing basis.

Type: Grant

Filed: November 19, 1997

Date of Patent: November 2, 1999

Assignee: International Business Machines Corporation

Inventors: Arun Chandra, Douglas Craig Bossen, Nandakumar Nityananda Tendolkar
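
The asynchronous comparison in the abstract for patent 5978936 can be pictured as comparing whatever module results both nodes have stored so far. The dictionary representation and the function name are assumptions; dictionaries keep insertion order in Python 3.7 and later, which stands in for the ordered processing sequence of the modules.

```python
def compare_modules(results_a, results_b):
    """Return the test modules whose stored results differ between two nodes.

    Each argument maps a module name to that node's stored result. Modules
    the second node has not finished yet are skipped, so the comparison can
    run at any time without synchronizing the nodes.
    """
    mismatched = []
    for module, result in results_a.items():
        if module in results_b and results_b[module] != result:
            mismatched.append(module)
    return mismatched
```

A non-empty return value would indicate that one of the two nodes produced an erroneous result for the listed modules.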

Dual error correction code

Patent number: 5956351

Abstract: A method of detecting errors in a data stream being transmitted in a computer system, e.g., from a memory array to a memory controller, by determining whether the encoding was performed using a first encoding method or a second encoding method, and thereafter decoding the data stream using a logic circuit based on a single parity-check matrix. The entire parity-check matrix is used to decode the data stream if the first encoding method was used, and a subset of the parity-check matrix is used to decode the data stream if the second encoding method was used. Encoding according to the first method allows correction of all single-symbol errors and detection of all double-symbol errors in the data stream, and encoding according to the second method allows correction of all single-bit errors and detection of all double-bit errors in the data stream. The subset matrix may be permuted if the second encoding method was used, to create a permuted matrix further allowing detection of single-symbol errors.

Type: Grant

Filed: April 7, 1997

Date of Patent: September 21, 1999

Assignee: International Business Machines Corporation

Inventors: Douglas Craig Bossen, Chin-Long Chen

Fault tolerant computer memory systems and components employing dual level error correction and detection with disablement feature

Patent number: 5682394

Abstract: In a memory system comprising a plurality of memory units each of which possesses unit-level error correction capabilities and each of which is tied to a system level error correction function, memory reliability is enhanced by providing a mechanism for disabling the unit-level error correction capability, for example, in response to the occurrence of an uncorrectable error in one of the memory units. This counter-intuitive approach which disables an error correction function nonetheless enhances overall memory system reliability since it enables the employment of the complement/recomplement algorithm which depends upon the presence of reproducible errors for proper operation. Thus, chip level error correction systems, which are increasingly desirable at high packaging densities, are employed in a way which does not interfere with system level error correction methods.

Type: Grant

Filed: February 2, 1993

Date of Patent: October 28, 1997

Assignee: International Business Machines Corporation

Inventors: Robert Martin Blake, Douglas Craig Bossen, Chin-Long Chen, John Atkinson Fifield, Howard Leo Kalter
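
The abstract for patent 5682394 names the complement/recomplement algorithm without describing it. The sketch below follows the commonly described form of that technique (read, write back the complement, read again, complement the result) on a simulated word with stuck-at bits. The `StuckCell` class and the 8-bit width are assumptions, and the sketch leaves out the system-level error correction that would normally check the result.

```python
class StuckCell:
    """A memory word in which some bit positions are stuck at fixed values."""

    def __init__(self, width, stuck_mask, stuck_values):
        self.mask = (1 << width) - 1
        self.stuck_mask = stuck_mask                    # 1 = bit position is stuck
        self.stuck_values = stuck_values & stuck_mask   # value each stuck bit reads as
        self.bits = 0

    def write(self, value):
        self.bits = value & self.mask

    def read(self):
        # Stuck positions ignore what was written and return their stuck value.
        return (self.bits & ~self.stuck_mask) | self.stuck_values


def complement_recomplement(cell):
    """Read, write the complement, read again, and complement the result."""
    first = cell.read()
    cell.write(~first & cell.mask)
    second = cell.read()
    return ~second & cell.mask


cell = StuckCell(width=8, stuck_mask=0b00000101, stuck_values=0b00000101)
cell.write(0b10110010)                  # bits 0 and 2 of the data disagree with the stuck value
damaged = cell.read()                   # two reproducible bit errors
recovered = complement_recomplement(cell)
```

In this example both stuck bits were in error on the first read, so the procedure returns the original word. A stuck bit that happened to match the data would instead be flipped by the procedure, which is why the result is still passed through error correction in practice. The technique relies on the fault reproducing on every read, which matches the abstract's point that unit-level correction must be disabled for it to work.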