Patents by Inventor Douglas Craig Bossen
Douglas Craig Bossen has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Patent number: 6851071
Abstract: An apparatus and method of repairing a processor array for a failure detected at runtime in a system supporting persistent component deallocation are provided. The apparatus and method of the present invention allow redundant array bits to be used for recoverable faults detected in arrays during run time, instead of only at system boot, while still maintaining the dynamic and persistent processor deallocation features of the computing system. With the apparatus and method of the present invention, a failure of a cache array is detected and a determination is made as to whether a repairable failure threshold is exceeded during runtime. If this threshold is exceeded, a determination is made as to whether cache array redundancy may be applied to correct the failure, i.e. a bit error. If so, the cache array redundancy is applied without marking the processor as unavailable.
Type: Grant
Filed: October 11, 2001
Date of Patent: February 1, 2005
Assignee: International Business Machines Corporation
Inventors: Douglas Craig Bossen, Daniel James Henderson, Raymond Leslie Hicks, Alongkorn Kitamorn, David Otto Lewis, Thomas Alan Liebsch
-
Patent number: 6789048
Abstract: According to a method form of the invention, in a computer system having a processing load distributed among a number of processors in the system, test computations are performed at intervals by floating point logic of a processor responsive to stored test instructions. Responsive to the test computations indicating an erroneous result by one of the processors, information is passed by a firmware process and entered into an operating system error log. Responsive to the information, an operating system deconfiguration service is notified of the error log entry, and the service deconfigures the indicated processor, while the system is still running.
Type: Grant
Filed: April 4, 2002
Date of Patent: September 7, 2004
Assignee: International Business Machines Corporation
Inventors: Richard Louis Arndt, Douglas Marvin Benignus, Douglas Craig Bossen, Daniel James Henderson, Alongkorn Kitamorn
-
Method and system for end-to-end problem determination and fault isolation for storage area networks
Patent number: 6636981
Abstract: A method and system for problem determination and fault isolation in a storage area network (SAN) is provided. A complex configuration of multi-vendor host systems, FC switches, and storage peripherals are connected in a SAN via a communications architecture (CA). A communications architecture element (CAE) is a network-connected device that has successfully registered with a communications architecture manager (CAM) on a host computer via a network service protocol, and the CAM contains problem determination (PD) functionality for the SAN and maintains a SAN PD information table (SPDIT). The CA comprises all network-connected elements capable of communicating information stored in the SPDIT. A SAN topology map and the SPDIT are used to create a SAN diagnostic table (SDT). A failing component in a particular device may generate errors that cause devices along the same network connection path to generate errors.
Type: Grant
Filed: January 6, 2000
Date of Patent: October 21, 2003
Assignee: International Business Machines Corporation
Inventors: Barry Stanley Barnett, Douglas Craig Bossen
-
Publication number: 20030191607
Abstract: According to a method form of the invention, in a computer system having a processing load distributed among a number of processors in the system, test computations are performed at intervals by floating point logic of a processor responsive to stored test instructions. Responsive to the test computations indicating an erroneous result by one of the processors, information is passed by a firmware process and entered into an operating system error log. Responsive to the information, an operating system deconfiguration service is notified of the error log entry, and the service deconfigures the indicated processor, while the system is still running.
Type: Application
Filed: April 4, 2002
Publication date: October 9, 2003
Applicant: International Business Machines Corporation
Inventors: Richard Louis Arndt, Douglas Marvin Benignus, Douglas Craig Bossen, Daniel James Henderson, Alongkorn Kitamorn
-
Publication number: 20030074598
Abstract: An apparatus and method of repairing a processor array for a failure detected at runtime in a system supporting persistent component deallocation are provided. The apparatus and method of the present invention allow redundant array bits to be used for recoverable faults detected in arrays during run time, instead of only at system boot, while still maintaining the dynamic and persistent processor deallocation features of the computing system. With the apparatus and method of the present invention, a failure of a cache array is detected and a determination is made as to whether a repairable failure threshold is exceeded during runtime. If this threshold is exceeded, a determination is made as to whether cache array redundancy may be applied to correct the failure, i.e. a bit error. If so, the cache array redundancy is applied without marking the processor as unavailable.
Type: Application
Filed: October 11, 2001
Publication date: April 17, 2003
Applicant: International Business Machines Corporation
Inventors: Douglas Craig Bossen, Daniel James Henderson, Raymond Leslie Hicks, Alongkorn Kitamorn, David Otto Lewis, Thomas Alan Liebsch
-
Patent number: 6516429
Abstract: A method and apparatus in a multiprocessor data processing system for managing a plurality of processors. Monitoring for recoverable errors in a set of processors is performed. Responsive to detecting a recoverable error for a processor in the set of processors, a determination is made as to whether the recoverable error indicates a trend towards an unrecoverable error. Responsive to a determination that the recoverable error indicates a trend towards an unrecoverable error, actions are initiated to stop the processor.
Type: Grant
Filed: November 4, 1999
Date of Patent: February 4, 2003
Assignee: International Business Machines Corporation
Inventors: Douglas Craig Bossen, Alongkorn Kitamorn, Charles Andrew McLaughlin, John Thomas O'Quin, II
-
Patent number: 6332181
Abstract: A method of handling a cache error (such as a parity error), which allows a software recovery, by reporting the error using an unrelated system resource, such as an interrupt service, and particularly a data storage interrupt. The parity error can be reported by generating a data storage interrupt and using the data storage interrupt status register (DSISR) to indicate that the data storage interrupt is a result of the parity error. The context of the processor can be fully synchronized while handling the parity error.
Type: Grant
Filed: May 4, 1998
Date of Patent: December 18, 2001
Assignee: International Business Machines Corporation
Inventors: Douglas Craig Bossen, Kevin Arthur Chiarot, Namratha Rajasekharaiah Jaisimha, Avijit Saha
-
Patent number: 6243823
Abstract: A method and system for deconfiguring software in a processing system is disclosed. In one aspect, a processing system comprises a central processing unit (CPU), and a memory coupled to the CPU. The memory includes a memory array and a memory controller for capturing information concerning the status of the memory array. The processing system includes a service processor for gathering and analyzing status information from the memory controller. The processing system also includes a nonvolatile device coupled to the CPU and the service processor. The nonvolatile device includes a deconfiguration area. The deconfiguration area stores information concerning the status of the memory array from the service processor. The deconfiguration area also provides information for deconfiguring at least a portion of the memory array during a boot time of the processing system. Accordingly, through the present invention, memory errors are detected during normal computer operations by error detection logic.
Type: Grant
Filed: October 2, 1998
Date of Patent: June 5, 2001
Assignee: International Business Machines Corporation
Inventors: Douglas Craig Bossen, Alongkorn Kitamorn, Charles Andrew McLaughlin
-
Patent number: 6233680
Abstract: A method and system for deconfiguring a CPU in a processing system is disclosed. In one aspect, a processing system is disclosed that comprises a central processing unit (CPU), and a memory coupled to the CPU. The CPU includes an error status register for capturing information concerning the status of the CPU. The processing system includes a service processor for gathering and analyzing status information from the CPU error register. The processing system also includes a nonvolatile device coupled to the service processor. The nonvolatile device includes a deconfiguration area. The deconfiguration area stores information concerning the status of the CPU from the service processor. The deconfiguration area also provides information for deconfiguring a CPU during a boot time of the processing system. Accordingly, through the present invention, CPU errors are detected during normal computer operations by error detection logic.
Type: Grant
Filed: October 2, 1998
Date of Patent: May 15, 2001
Assignee: International Business Machines Corporation
Inventors: Douglas Craig Bossen, Alongkorn Kitamorn, Charles Andrew McLaughlin
-
Patent number: 6223299
Abstract: Device select lines from each I/O device are brought into a PCI host bridge individually so that the device number of a failing device may be logged in an error register when an error is seen on the PCI bus. Until the error register is reset, subsequent load and store operations are delayed until the device number of the subject device may be checked against the error register. If the subject device is a previously failing device, the load/store operation to that device is prevented from completing, either by forcing bad parity or zeroing all byte enables. By forcing bad parity or zero byte enables, the I/O device will respond to the load or store request by activating its device select line, but will not accept store data. Operations to devices which are not logged in the error register are permitted to proceed normally, as are all load/store operations when the error register is clear.
Type: Grant
Filed: May 4, 1998
Date of Patent: April 24, 2001
Assignee: International Business Machines Corporation
Inventors: Douglas Craig Bossen, Charles Andrew McLaughlin, Danny Marvin Neal, James Otto Nicholson, Steven Mark Thurber
-
Patent number: 6199171
Abstract: A method and implementing system are provided for handling detected faults in a processor to improve reliability of a computer system. An exemplary fault-tolerant on-line transactional (OLT) computer system is illustrated which includes first and second OLT processors connected to an I/O processor through a system bus. Transaction results are stored in local processor buffers and at predetermined batch intervals, the stored transactions are compared. The matched transaction results are flushed to data store while unmatched transactions are re-executed. If the same errors do not occur during a re-execution, the errors are determined to be transient and the transaction results are flushed to storage.
Type: Grant
Filed: June 26, 1998
Date of Patent: March 6, 2001
Assignee: International Business Machines Corporation
Inventors: Douglas Craig Bossen, Arun Chandra
-
Patent number: 6179207
Abstract: A single width bar code exhibiting inherent self clocking characteristics is provided so as to be particularly useful in the identification of semiconductor wafers in very large scale integrated circuit manufacturing processes. The codes described herein are robust, reliable and highly readable even in the face of relatively high variations in scanning speed. The codes are also desirably dense in terms of character representations per linear centimeter, an important consideration in semiconductor manufacturing wherein space on the chips and the wafer is at a premium. Additionally, a preferred embodiment of the present invention exhibits a minimum number for the maximum number of spaces between adjacent bars in code symbol sequences.
Type: Grant
Filed: June 18, 1996
Date of Patent: January 30, 2001
Assignee: International Business Machines Corporation
Inventors: Douglas Craig Bossen, Chin-Long Chen, Fredrick Hayes Dill, Douglas Seymore Goodman, Mu-Yue Hsiao, Paul Vincent McCann, James Michael Mulligan, Ricky Allen Rand
-
Patent number: 6108753
Abstract: A method and apparatus is provided for enhanced error correction processing through a retry mechanism. When an L1 cache instruction line error is detected, either by a parity error detection process or by an ECC (error correcting code) or other process, the disclosed methodology will schedule an automatic retry of the event that caused the line error without re-booting the entire system. Thereafter, if the error remains present after a predetermined number of retries to load the requested data from L1 cache, then a second level of corrective action is undertaken. The second level corrective action includes accessing an alternate memory location, such as the L2 cache for example. If the state of the requested cache line is exclusive or shared, then an artificial L1 miss is generated for use in enabling an L2 access for the requested cache line.
Type: Grant
Filed: March 31, 1998
Date of Patent: August 22, 2000
Assignee: International Business Machines Corporation
Inventors: Douglas Craig Bossen, Manratha Rajasekharaiah Jaisimha, Avijit Saha, Shih-Hsiung Stephen Tung
-
Patent number: 6058491
Abstract: A method and system for handling detected faults in a processor to improve reliability of a computer system is disclosed. A fault-tolerant computer system is provided which includes a first processor, a second processor, and a comparator. Coupled to a system bus, a first processor is utilized to produce a first output. The second processor, also coupled to the system bus, is utilized to produce a second output. During the operation of the computer system, the second processor operates at the same clock speed as the first processor and lags behind the first processor. The comparator is utilized to compare the first and second output such that an operation will be retried if the first output is not the same as the second output.
Type: Grant
Filed: September 15, 1997
Date of Patent: May 2, 2000
Assignee: International Business Machines Corporation
Inventors: Douglas Craig Bossen, Arun Chandra
-
Patent number: 5978936
Abstract: A first set of test instructions is provided for a first node in a computer network. A corresponding second set is provided for a second node in the network. The test instruction sets are partitioned into modules. The nodes process their respective sets of test instructions independently to generate test results for each module on each node, except when a synchronizing event occurs. Each node stores its test results for each test module. Since the test modules have an ordered processing sequence, each node's test results for corresponding test modules can be compared asynchronously on an ongoing basis.
Type: Grant
Filed: November 19, 1997
Date of Patent: November 2, 1999
Assignee: International Business Machines Corporation
Inventors: Arun Chandra, Douglas Craig Bossen, Nandakumar Nityananda Tendolkar
-
Patent number: 5956351
Abstract: A method of detecting errors in a data stream being transmitted in a computer system, e.g., from a memory array to a memory controller, by determining whether the encoding was performed using a first encoding method or a second encoding method, and thereafter decoding the data stream using a logic circuit based on a single parity-check matrix. The entire parity-check matrix is used to decode the data stream if the first encoding method was used, and a subset of the parity-check matrix is used to decode the data stream if the second encoding method was used. Encoding according to the first method allows correction of all single-symbol errors and detection of all double-symbol errors in the data stream, and encoding according to the second method allows correction of all single-bit errors and detection of all double-bit errors in the data stream. The subset matrix may be permuted if the second encoding method was used, to create a permuted matrix further allowing detection of single-symbol errors.
Type: Grant
Filed: April 7, 1997
Date of Patent: September 21, 1999
Assignee: International Business Machines Corporation
Inventors: Douglas Craig Bossen, Chin-Long Chen
-
Patent number: 5682394
Abstract: In a memory system comprising a plurality of memory units each of which possesses unit-level error correction capabilities and each of which is tied to a system level error correction function, memory reliability is enhanced by providing a mechanism for disabling the unit-level error correction capability, for example, in response to the occurrence of an uncorrectable error in one of the memory units. This counter-intuitive approach which disables an error correction function nonetheless enhances overall memory system reliability since it enables the employment of the complement/recomplement algorithm which depends upon the presence of reproducible errors for proper operation. Thus, chip level error correction systems, which are increasingly desirable at high packaging densities, are employed in a way which does not interfere with system level error correction methods.
Type: Grant
Filed: February 2, 1993
Date of Patent: October 28, 1997
Assignee: International Business Machines Corporation
Inventors: Robert Martin Blake, Douglas Craig Bossen, Chin-Long Chen, John Atkinson Fifield, Howard Leo Kalter