Abstract: False positive error warnings associated with hot insertion or removal of a device with an SAS link are filtered by comparing the timing of error warnings with the timing of hot insertion or removal of the device. An SCSI Enclosure Processor monitors physical device presence events through a side band bus, such as an I2C bus interfaced with physical devices. Upon detection of an error associated with the SAS link, an error filter module retrieves time stamped physical device presence events from the SCSI Enclosure Processor, compares the time stamp of the physical device presence event with the time of the error, and suppresses the warning if the time stamp falls within a predetermined time of the error.
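The time-stamp comparison described in this abstract can be sketched as follows. This is a minimal illustration, not the patented implementation: the `PresenceEvent` record, the `should_suppress` function, and the 2-second default window are all hypothetical names and values chosen for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PresenceEvent:
    """Hypothetical record of a hot insertion or removal seen on the side band bus."""
    device_id: str
    timestamp: float  # seconds on an arbitrary shared clock
    kind: str         # "insert" or "remove"


def should_suppress(error_time, device_id, presence_events, window_s=2.0):
    """Return True when the link error lies within window_s seconds of a
    presence event for the same device (assumed to be a false positive)."""
    return any(
        ev.device_id == device_id and abs(ev.timestamp - error_time) <= window_s
        for ev in presence_events
    )


events = [PresenceEvent("disk3", 100.0, "remove")]
print(should_suppress(100.5, "disk3", events))  # error right after removal
print(should_suppress(110.0, "disk3", events))  # error long after removal
```

The window is symmetric here, so an error logged slightly before the presence event is also suppressed; whether that is desirable depends on how the two clocks are aligned.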
Abstract: A storage system that may include one or more memory sections, one or more switches, and a management system. The memory sections include memory devices and a section controller capable of detecting faults with the memory section and transmitting messages to the management system regarding detected faults. The storage system may include a management system capable of receiving fault messages from the section controllers and removing from service the faulty memory sections. Additionally, the management system may determine routing algorithms for the one or more switches.
Type:
Grant
Filed:
February 26, 2007
Date of Patent:
June 2, 2009
Assignee:
Ring Technology Enterprises, LLC
Inventors:
Melvin James Bullen, Steven Louis Dodd, William Thomas Lynch, David James Herbison
Abstract: A method and system for handling errors and exceptions in an ERP environment are disclosed. According to one aspect of the present invention, a condition or event causes a script-engine associated with a particular ERP server to generate an error message. The error message is communicated to a centralized controller-node. The centralized controller-node analyzes the message and determines the best course of action for handling an error or exception related to the error message. Based on the controller node's analysis, the controller node communicates a response message, thereby enabling the process that caused the error to continue without terminating abnormally.
Abstract: A data transmission control system includes a server configured to store an application program having a data area in which transmission control information containing a threshold value used to detect abnormal data transmission is stored and to allow the application program to be downloaded in response to a request, and a mobile device that downloads the application program from the server. The mobile device comprises a management table recording the transmission control information extracted from the downloaded application program; a counter configured to count a number of messages or traffic volume per unit time; a detector configured to compare the counted value with the threshold value contained in the transmission control information to detect abnormal data transmission; and a transmission regulating unit configured to restrict data transmission upon detection of the abnormal data transmission.
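The counter, detector, and regulating unit in this abstract could be modeled roughly as below. The class name `TransmissionRegulator`, the fixed one-second window, and the choice to lift the restriction at the start of each new window are assumptions made for the sketch; the abstract does not specify them.

```python
class TransmissionRegulator:
    """Counts messages per unit time and restricts sending above a threshold."""

    def __init__(self, threshold, window_s=1.0):
        self.threshold = threshold  # would come from the application's data area
        self.window_s = window_s    # length of the counting window (assumed)
        self.window_start = None
        self.count = 0
        self.restricted = False

    def allow(self, now):
        """Record one outgoing message at time `now`; return True if it may be sent."""
        if self.window_start is None or now - self.window_start >= self.window_s:
            # Start a new counting window; assumption: restriction is lifted here.
            self.window_start = now
            self.count = 0
            self.restricted = False
        self.count += 1
        if self.count > self.threshold:
            self.restricted = True  # abnormal transmission detected
        return not self.restricted


regulator = TransmissionRegulator(threshold=3)
print([regulator.allow(t) for t in (0.0, 0.1, 0.2, 0.3)])
```

With a threshold of 3, the fourth message inside the same window is refused, and sending resumes once a new window begins.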
Abstract: Singleton services can be automatically migrated from one application server to another in a cluster using a lease table and a migration master in case of a failure of the first application server.
Abstract: A method for processing a connection-mode IP connection is provided. Generic methods known from the prior art are problematic in that, in the case of connection-mode IP communication connections between two control devices, the connection-mode data are lost in one of the control devices after a non-availability situation and thus the messages received cannot be associated with a connection. The method provides that the messages emitted by the sending control device are made available to the upper layer of the connection-mode communication connection, which uses the same as a criterion for the preferential recovery of intact connections.
Abstract: Configurable error handling apparatus and methods to operate the same are disclosed. An example apparatus comprises a processor core in a semiconductor package, a hardware functional block in the semiconductor package, an error handler in the semiconductor package, wherein the error handler is configurable to route error data from the hardware functional block to at least one of a first error log or a second error log and to route error signals from the hardware functional block to at least one of an operating system or firmware, and wherein the processor core configures the error handler and the hardware functional block.
Type:
Grant
Filed:
February 13, 2006
Date of Patent:
May 12, 2009
Assignee:
Intel Corporation
Inventors:
Suresh Marisetty, Baskaran Ganesan, Gautam Bhagwandas Doshi, Murugasamy Nachimuthu, Koichi Yamada, Jose A. Vargas, Jim Crossland, Stan J. Domen
Abstract: A method for recovery in a two-node data processing system is provided wherein each node is a primary server for a first nonvolatile storage device and for which there is provided shared access to a second nonvolatile storage device for which the other node is a primary server and wherein each node also includes a direct connection to the shared nonvolatile storage device for which the other node is the primary server. Upon notification of failure, the method operates by first confirming continued access by each node to the nonvolatile storage device for which it is the primary server and then by attempting to access the shared nonvolatile storage device via the direct connection and by waiting for a time sufficient for the same process to be carried out by the other node. If access to the shared nonvolatile storage device is successful, the node takes control of both nonvolatile storage devices. If the access is not successful a comparison of node numbers is carried out to decide the issue of control.
Type:
Grant
Filed:
July 2, 2007
Date of Patent:
May 12, 2009
Assignee:
International Business Machines Corporation
Abstract: If an error statistic of one of the disk drives exceeds a predetermined threshold, that disk drive is determined to be a suspect disk drive. A recovery mode is then set. While the recovery mode is being set and no access is made from a host 16, the address range of the suspect disk drive is specified. At the same time, processing is started in which the data of the suspect disk drive is copied sequentially to a spare disk drive 34 to recover the data. The data of the suspect disk drive is copied to the spare disk drive 34 to recover the data when the address range of the suspect disk drive does not correspond to the write failure address range of a management table 48. The data of a normal disk drive is copied to the spare disk drive 34 to recover the data when the address range of the suspect disk drive corresponds to the write failure address range of the management table 48.
Abstract: A corrective action method or subsystem for providing corrective actions in a computing domain shared among multiple customers, wherein different domain resources are shared by different customers, and each customer's corrective action preferences are accommodated differently according to a repository of customer preferences. A database may be queried when a fault event or out-of-limits condition is detected for a given shared resource to determine which customers share the resource, determine each affected customer's response preferences, and to perform corrective actions according to those response preferences. For example, three customers may share a particular hard drive in a shared computing system. One customer may prefer to receive an email notice when the drive is nearly full, another may prefer to receive additional allocation of disk space elsewhere, and the third may prefer to receive a written report of space utilization.
Type:
Grant
Filed:
April 17, 2003
Date of Patent:
May 5, 2009
Assignee:
International Business Machines Corporation
Inventors:
Rhonda L. Childress, Mark Anthony Laney, Reid Douglas Minyen, Neil Raymond Pennell
Abstract: A heterogeneous and cost effective software based business continuity framework for integrated applications in a computing environment. In one example embodiment, this is accomplished by consolidating application status at each site. The framework then detects and notifies any DR conditions as a function of the consolidated site application status so that either a manual or an automatic system startup and shutdown can then be performed to maintain the business continuity.
Type:
Grant
Filed:
June 16, 2006
Date of Patent:
April 14, 2009
Assignee:
Hewlett-Packard Development Company, L.P.
Abstract: A method and system for distributing a plurality of data processing units (116, 118, 120, 122, 124, 126, 128, 130 and 132) in a communication network (100) is provided. The communication network comprises a plurality of manager data processing units (104, 106 and 108) for serving the plurality of data processing units. The method includes detecting (304) a failure at a manager data processing unit of the plurality of manager data processing units. Further, the method includes distributing (306) the plurality of data processing units to remaining manager data processing units of the plurality of manager data processing units.
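One way to picture the redistribution step is the sketch below. The abstract does not say how units are divided among the remaining managers, so the least-loaded policy, the `redistribute` function, and the initial assignment are all assumptions; only the reference numerals are taken from the abstract.

```python
def redistribute(assignments, failed_manager):
    """Reassign the failed manager's units to the remaining managers.

    assignments maps a manager id to the list of unit ids it serves.
    Each orphaned unit goes to the currently least-loaded remaining
    manager (an assumed policy), with ties broken by lowest manager id.
    """
    remaining = {m: list(units) for m, units in assignments.items()
                 if m != failed_manager}
    if not remaining:
        raise ValueError("no remaining manager can take over")
    for unit in assignments.get(failed_manager, []):
        target = min(remaining, key=lambda m: (len(remaining[m]), m))
        remaining[target].append(unit)
    return remaining


# Hypothetical starting assignment using the abstract's reference numerals.
before = {104: [116, 118, 120], 106: [122, 124, 126], 108: [128, 130, 132]}
after = redistribute(before, failed_manager=106)
print(after)
```

The input mapping is left unmodified, which makes it easy to compare the assignment before and after the failure.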
Abstract: Aspects of the invention for testing and debugging an embedded device under test may include the step of loading an instruction into a parameterized shift register of a BIST module coupled to each one of a plurality of embedded memory modules comprising the embedded device under test. An identity of the loaded instruction may be determined subsequent to loading the instruction into the parameterized shift register. A plurality of test signals may be generated which correspond to the determined identity of the loaded instruction. In this regard, each of the generated plurality of test signals may control the execution of the testing and debugging of a corresponding one of each of the plurality of embedded memory modules that make up the embedded device under test.
Type:
Grant
Filed:
October 11, 2002
Date of Patent:
April 14, 2009
Assignee:
Broadcom Corporation
Inventors:
Zeynep M. Toros, Esin Terzioglu, Gil Winograd
Abstract: A method of correlating a plurality of event logs surrounding abnormal program termination of a plurality of networked computers, includes continuously generating event records that include operating system events, information technology (IT) infrastructure events and program application events, transmitting and storing the event records to a monitoring database, generating and transmitting an abnormal program termination event record when a computer experiences abnormal program termination to the monitoring database, and synchronizing the stored event records and the abnormal program termination event record of the computer based on receiving the abnormal program termination event record at the monitoring database, and with respect to the abnormal program termination event record.
Type:
Grant
Filed:
May 28, 2008
Date of Patent:
March 24, 2009
Assignee:
International Business Machines Corporation
Inventors:
Lutz Werner Denefleh, Burghard Bruno Eisele, Jens Michael Hopf, Rudolf Michalak
Abstract: The intelligent distributed file system enables the storing of file data among a plurality of smart storage units which are accessed as a single file system. The intelligent distributed file system utilizes a metadata data structure to track and manage detailed information about each file, including, for example, the device and block locations of the file's data blocks, to permit different levels of replication and/or redundancy within a single file system, to facilitate the change of redundancy parameters, to provide high-level protection for metadata, to replicate and move data in real-time, and to permit the creation of virtual hot spares among the smart storage units without the need to idle any single smart storage unit in the intelligent distributed file system.
Type:
Grant
Filed:
August 11, 2006
Date of Patent:
March 24, 2009
Assignee:
Isilon Systems Inc.
Inventors:
Sujal M. Patel, Paul A. Mikesell, Darren P. Schack, Aaron J. Passey
Abstract: A system comprises a non volatile memory and a plurality of processors. The non volatile memory stores an error handling routine. Each processor of the plurality of processors accesses the error handling routine on detecting an error and, on certain errors, signals the remaining processors to enter a rendezvous state. In the rendezvous state, a single processor takes over and performs error handling.
Type:
Grant
Filed:
July 28, 2003
Date of Patent:
March 10, 2009
Assignee:
Intel Corporation
Inventors:
Suresh Marisetty, George Thangadurai, Mani Ayyar
Abstract: The present invention includes combining mass storage devices having different probable times to failure into a single data storage system, so that most of the mass storage devices don't crash or become inoperable at the same time, such as when the system is exposed to a most probable failure-causing setting or circumstances for the mass storage devices. Embodiments also include using a tiered system of functional testing, to identify “premium” mass storage devices for use in a premium data storage system, as compared to “standard” devices. The premium mass storage devices may not include reworked devices or components, and may have to pass a more stringent functional test than a test for a “standard” device. Also contemplated is use of a data storage system or server having standard or premium mass storage devices, and/or groups of mass storage devices with different probable times to failure.
Abstract: The invention concerns a method for transmitting digital messages through output terminals (22) of a monitoring circuit (18) incorporated in a microprocessor (12) during execution of a series of instructions, the digital messages representing characteristic data stored by the monitoring circuit upon detecting a specific event in the execution of the series of instructions, one of said data corresponding to an identifier of said specific event, said method comprising the following steps: comparing the data of the last two detected specific events having a common identifier, if the compared data are identical, incrementing a repeat counter associated with said specific event; and if the compared data are different, transmitting a digital message representing the data of the last detected specific event, and furthermore, if the content of the repeat counter associated with said specific event is other than zero, transmitting a digital message indicating a repeat of the specific event.
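The repeat-counter logic in this abstract can be sketched as a small encoder. The class `EventMessageEncoder`, the tuple message format, the reset of the counter after a repeat message, and the choice to emit the repeat message before the new data message are assumptions for illustration; the abstract only states which messages are transmitted.

```python
class EventMessageEncoder:
    """Suppresses identical consecutive events per identifier and reports repeats."""

    def __init__(self):
        self.last_data = {}  # identifier -> data of the last detected event
        self.repeats = {}    # identifier -> number of suppressed repeats

    def on_event(self, identifier, data):
        """Return the list of messages to transmit for this detected event."""
        if identifier in self.last_data and self.last_data[identifier] == data:
            # Identical to the previous event with this identifier: count it only.
            self.repeats[identifier] = self.repeats.get(identifier, 0) + 1
            return []
        messages = []
        count = self.repeats.get(identifier, 0)
        if count != 0:
            # Assumed ordering: report pending repeats before the new data.
            messages.append(("repeat", identifier, count))
            self.repeats[identifier] = 0
        messages.append(("data", identifier, data))
        self.last_data[identifier] = data
        return messages


encoder = EventMessageEncoder()
sent = []
for value in ("A", "A", "A", "B"):
    sent += encoder.on_event("jump", value)
print(sent)
```

Four detected events produce three messages here: the first data message, one repeat message covering the two suppressed duplicates, and the data message for the changed event.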
Abstract: A processing system includes a direct current to direct current (DC-DC) converter for generating a supply voltage when coupled to a battery. A memory module stores a plurality of operational instructions. A processing module receives power from the DC-DC converter and executes the plurality of operational instructions. A power monitor circuit monitors the power source and powers down the power source when a first error condition is detected in the power source.
Abstract: A system that detects the onset of hard disk drive failure. During operation, the system measures vibrations from the hard disk drive to produce one or more vibration signals. Next, the system generates a vibration signature for the hard disk drive from the measured vibration signals. The system then determines if the vibration signature indicates the onset of hard disk failure by comparing the vibration signature with a reference vibration signature for the hard disk drive. If so, the system generates a warning or takes a remedial action.
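A rough sketch of comparing a measured vibration signature against a reference is given below. The abstract does not define how the signature is computed, so the use of low-order DFT magnitudes, the relative Euclidean distance test, the 25% tolerance, and both function names are assumptions chosen to keep the example self-contained.

```python
import cmath
import math


def vibration_signature(samples, bins=8):
    """Return magnitudes of DFT bins 1..bins as a simple spectral signature (assumed form)."""
    n = len(samples)
    signature = []
    for k in range(1, bins + 1):
        acc = sum(x * cmath.exp(-2j * math.pi * k * i / n)
                  for i, x in enumerate(samples))
        signature.append(abs(acc) / n)
    return signature


def indicates_failure_onset(signature, reference, tolerance=0.25):
    """Flag the drive when the signature deviates from the reference by more
    than `tolerance` times the reference norm (an assumed criterion)."""
    distance = math.sqrt(sum((s - r) ** 2 for s, r in zip(signature, reference)))
    norm = math.sqrt(sum(r * r for r in reference))
    return distance > tolerance * norm


n = 64
healthy = [math.sin(2 * math.pi * 2 * i / n) for i in range(n)]
degraded = [h + 0.8 * math.sin(2 * math.pi * 5 * i / n) for i, h in enumerate(healthy)]
reference = vibration_signature(healthy)
print(indicates_failure_onset(vibration_signature(healthy), reference))
print(indicates_failure_onset(vibration_signature(degraded), reference))
```

A True result would correspond to the point where the described system generates a warning or takes a remedial action.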