Prepared Backup Processor (e.g., Initializing Cold Backup) Or Updating Backup Processor (e.g., By Checkpoint Message) Patents (Class 714/13)
-
Patent number: 8108719Abstract: An information processing device comprises a plurality of processing units on which OSs and execution environments operate, and shared peripheral devices shared by the plurality of processing units. The information processing device is provided with a failure concealing device for concealing a failure which has occurred in a processing unit. The failure concealing device determines a substitutional processing unit that will act as a substitute for a failed processing unit so that the OS and execution environment which have operated on the failed processing unit will operate on the substitutional processing unit, switches the OS and execution environment which have operated on the failed processing unit so that they will operate on the substitutional processing unit, and switches a shared resource used by the failed processing unit such that it is available to the substitutional processing unit.Type: GrantFiled: September 13, 2007Date of Patent: January 31, 2012Assignee: NEC CorporationInventors: Hiroaki Inoue, Masamichi Takagi, Masayuki Mizuno
-
Patent number: 8108715Abstract: A computer-implemented method for resolving split-brain scenarios in computer clusters may include (1) identifying a plurality of nodes within a computer cluster that are configured to collectively perform at least one task, (2) receiving, from a node within the computer cluster, a failure notification that identifies a link-based communication failure experienced by the node that prevents the nodes within the computer cluster from collectively performing the task, and, upon receiving the failure notification, (3) immediately prompting each node within the computer cluster to participate in an arbitration event in order to identify a subset of the nodes that is to assume responsibility for performing the task subsequent to the link-based communication failure. Various other methods, systems, and computer-readable media are also disclosed.Type: GrantFiled: July 2, 2010Date of Patent: January 31, 2012Assignee: Symantec CorporationInventor: Sandeep Agarwal
-
Patent number: 8103906Abstract: An automated and scalable system for total real-time redundancy of a plurality of client-server systems, wherein, data is replicated through a network connection and operationally located on a virtual machine that substitutes for a failed client-server system, wherein the virtual machine is activated and installed on the cloud computing environment. Monitoring applications are installed on both the client-server systems and the cloud computing environment. System components are identified, a network connection is initiated, a heartbeat is established, data replication is automated, system failure is detected, failover is initiated, and subsequent client-server restoration is automated.Type: GrantFiled: October 1, 2010Date of Patent: January 24, 2012Inventors: Massoud Alibakhsh, Shahram Famorzadeh
-
Patent number: 8099625Abstract: Method and apparatus for self-checking and self-correcting memory states of a programmable resource is described. Programmable resource of an integrated circuit has a first core and a second core instantiated therein. A first internal configuration port and a second internal configuration port of the integrated circuit are respectively connected to the first core and the second core. The second core is coupled to the first core for monitoring operation of the first core with the second core, and the second core is configured to obtain control responsive to a failure of the first core or the first internal configuration port for a self-correcting mode.Type: GrantFiled: April 3, 2009Date of Patent: January 17, 2012Assignee: XILINX, Inc.Inventors: Chen Wei Tseng, Weiguang Lu, Matthew P. Baker
-
Patent number: 8090982Abstract: A multiprocessor system is disclosed. The multiprocessor system includes plural processor cores to which control to be performed is allocated. The multiprocessor system includes a monitoring processor which detects an abnormal operation that has occurred in a specific processor core to which control having a higher priority order than control to be allocated to processor cores other than the specific processor core is allocated. When the monitoring processor detects the abnormal operation in the specific processor core, the monitoring processor allocates the control having the higher priority order to one of the processor cores other than the specific processor core.Type: GrantFiled: June 11, 2008Date of Patent: January 3, 2012Assignee: Toyota Jidosha Kabushiki KaishaInventors: Takashi Inoue, Takeshi Inoguchi
-
Patent number: 8086898Abstract: A redundant I/O module includes: a control I/O module that communicates with a controller and that comprises a first IOM setting information holding section for storing IOM setting information downloaded from a high-level apparatus, and a standby I/O module that communicates with the controller and that comprises a second IOM setting information holding section for storing IOM setting information downloaded from the high-level apparatus, wherein the controller includes: an IOM status generation section that detects a status of replacement of the standby I/O module; an IOM setting information acquisition section that makes an access to the first IOM setting information holding section of the control I/O module; an IOM setting information generation section that generates IOM setting information about the standby I/O module; and an IOM download section that downloads the generated IOM setting information into the second IOM setting information holding section of the standby I/O module.Type: GrantFiled: January 25, 2010Date of Patent: December 27, 2011Assignee: Yokogawa Electric CorporationInventor: Noriko Kase
-
Patent number: 8078904Abstract: Provided is a method of managing a computer system including a plurality of storage systems and a plurality of management appliances for managing the plurality of storage systems. A first management appliance and a second management appliance hold an identifier of a first storage system and management data obtained from the first storage system. The method includes the steps of: selecting a third management appliance from the plurality of management appliances when a failure occurs in the first management appliance; transmitting the identifier held in the second management appliance from the second management appliance to the selected third management appliance; and holding the identifier transmitted from the second management appliance in the selected third management appliance. Thus, after a failover caused by an abnormality of a maintenance/management appliance, it is possible to prevent a single point of failure from arising and reducing the reliability of the maintenance/management appliance.Type: GrantFiled: October 29, 2010Date of Patent: December 13, 2011Assignee: Hitachi, Ltd.Inventors: Takahiro Fujita, Hirokazu Ikeda, Nobuyuki Osaki
-
Patent number: 8078907Abstract: A cpu-set type multiprocessor system allows a cpu of a cpu-set that has a hardware exception to disable itself and notify the system. The system assigns processes of the cpu-set that includes the problem cpu to another cpu-set. The disabling of the problem cpu and transfer of the related processes to another cpu-set allows the system to failsoft so that other cpu-sets of the multiprocessor system can continue to run and a recovery of the processes being run on the problem cpu-set occurs.Type: GrantFiled: January 19, 2006Date of Patent: December 13, 2011Assignee: Silicon Graphics, Inc.Inventors: Patrick John Donlin, Samuel Edward Watters
-
Patent number: 8074109Abstract: Techniques are described of using votes of third-party components to select a master processor from a plurality of redundant processors. A master processor and a standby processor maintain communications with one another. If communication between the master processor and the standby processor fails, the processors may poll a set of registered voters to determine which of the processors is to be the master processor. In this way, the processors may determine which of the processors is to be master without the use of a shared indicator to specify which of the processors is to be the master processor.Type: GrantFiled: November 14, 2006Date of Patent: December 6, 2011Assignee: Unisys CorporationInventor: James Roffe
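The polling step in this abstract can be pictured with a minimal sketch. It assumes each registered voter is a callable that returns the name of the processor it votes for; the function `elect_master`, the voter callables, and the strict-majority rule are hypothetical illustrations and are not taken from the patent.

```python
from collections import Counter
from typing import Callable, Iterable, Optional

# Hypothetical: a voter is any callable returning the processor it votes for,
# or None if it cannot cast a vote.
Voter = Callable[[], Optional[str]]


def elect_master(voters: Iterable[Voter]) -> Optional[str]:
    """Return the processor holding a strict majority of cast votes, else None."""
    votes = [v for v in (voter() for voter in voters) if v is not None]
    if not votes:
        return None
    winner, count = Counter(votes).most_common(1)[0]
    # Require a strict majority so that a tie never produces two masters.
    return winner if count * 2 > len(votes) else None
```

Under this assumed rule, a tie leaves the master undecided rather than risking two processors both claiming the role.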
-
Patent number: 8074094Abstract: A mechanism for synchronizing states of components in a first routing engine to corresponding components in a second routing engine is provided. In order to reduce the amount of data required to synchronize the state of the components and the time and resources required to perform the synchronization, the state-related information transmitted from the first routing engine to the second routing engine is limited to information used to build states of a subset of the components associated with the first routing engine. That subset of components is limited to those components that receive stimuli (e.g., data streams or data packets) from sources external to the routing engine. Other components on the second routing engine synchronize state by receiving information from those components on the second routing engine that received the external stimuli information.Type: GrantFiled: August 20, 2007Date of Patent: December 6, 2011Assignee: Cisco Technology, Inc.Inventors: Jeffrey David Haag, Gary Lee Harris, Samuel G. Henderson, Richard Foltak
-
Patent number: 8074111Abstract: A method for responding to a failure of a hardware locus at a communication installation having a plurality of control apparatuses for controlling a plurality of processes distributed among a plurality of hardware loci, the hardware loci including at least one spare hardware locus, includes the steps of: (a) Shifting control of a failed process from an initial control apparatus to an alternate control apparatus located at an alternate hardware locus other than the failed hardware locus. The failed process is a respective process controlled by the initial control apparatus located at the failed hardware locus. (b) Relocating the respective control apparatuses located at the failed hardware locus to a spare hardware locus. (c) Shifting control of the failed process from the alternate control apparatus to the initial control apparatus relocated at the spare hardware locus.Type: GrantFiled: September 18, 2006Date of Patent: December 6, 2011Assignee: Nortel Networks, Ltd.Inventor: Sandeep Mehta
-
Patent number: 8074099Abstract: In a computer system including server apparatuses such as an active server and a standby server connected to a storage apparatus, when the active server fails, a management server changes over connection to the storage apparatus from the active server to standby server to thereby hand over operation to the standby server. The management server refers to a fail-over strategy table in which apparatus information of the server apparatuses is associated with fail-over methods to select fail-over strategy in consideration of apparatus information of the active and standby servers.Type: GrantFiled: August 19, 2009Date of Patent: December 6, 2011Assignee: Hitachi, Ltd.Inventors: Shigeki Arata, Takashi Tameshige
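A table lookup of this kind might look like the sketch below. The table contents, the model names, the method names, and the function `select_strategy` are all hypothetical placeholders; the abstract only says that apparatus information is associated with fail-over methods.

```python
from typing import Dict, Tuple

# Hypothetical table: (active server model, standby server model) -> fail-over method.
STRATEGY_TABLE: Dict[Tuple[str, str], str] = {
    ("model-x", "model-x"): "switch_storage_path",
    ("model-x", "model-y"): "copy_boot_disk",
}


def select_strategy(active_model: str, standby_model: str,
                    table: Dict[Tuple[str, str], str] = STRATEGY_TABLE) -> str:
    """Look up the fail-over method for this pair of server models."""
    key = (active_model, standby_model)
    if key not in table:
        # No entry means the pair has no known fail-over method.
        raise ValueError(f"no fail-over strategy for {active_model!r} -> {standby_model!r}")
    return table[key]
```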
-
Patent number: 8069145Abstract: A method and apparatus for obtaining data of a cache node in a tree-structured cluster is described. In one embodiment, a query for data in the cache node of the tree-structured cluster is received. A determination of whether the data is stored in the queried cache node is made. An inquiry of other cache nodes in the cluster for the data is performed. An instance of the data from a cache node storing the data is replicated to the cache node receiving the query.Type: GrantFiled: August 30, 2007Date of Patent: November 29, 2011Assignee: Red Hat, Inc.Inventors: Manik Ram Surtani, Brian Edward Stansberry
-
Patent number: 8069368Abstract: When a primary server executing a task fails in a computer system where a plurality of servers are connected to an external disk device via a network and the servers boot an operating system from the external disk device, task processing is taken over from the primary server to a server that is not executing a task in accordance with the following method. The method for taking over a task includes the steps of detecting that the primary server fails; searching the computer system for a server that has the same hardware configuration as that of the primary server and that is not running a task; enabling the server, searched for as a result of the search, to access the external disk device; and booting the server from the external disk device.Type: GrantFiled: May 4, 2009Date of Patent: November 29, 2011Assignee: Hitachi, Ltd.Inventors: Keisuke Hatasaki, Takao Nakajima
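The search step described above can be sketched as a simple filter. The `Server` record, its `hardware` tuple, and `find_takeover_server` are hypothetical names chosen for illustration; the abstract does not say how hardware configurations are represented or compared.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class Server:
    name: str
    hardware: tuple      # hypothetical, e.g. (CPU model, memory in GB, NIC count)
    running_task: bool


def find_takeover_server(failed: Server, servers: Iterable[Server]) -> Optional[Server]:
    """Return the first idle server whose hardware matches the failed one."""
    for candidate in servers:
        if (candidate.name != failed.name
                and not candidate.running_task
                and candidate.hardware == failed.hardware):
            return candidate
    return None  # no suitable server; the task cannot be taken over
```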
-
Patent number: 8069366Abstract: A cluster system comprises a plurality of nodes that provides data-access service to a shared storage, each node having at least one failover partner node for taking over services of a node if the node fails. Each node may produce write logs for the shared storage and periodically send write logs at predetermined time intervals to a global device which stores write logs from each node. The global device may detect failure of a node by monitoring time intervals of when write logs are received from each node. Upon detection of a node failure, the global device may provide the write logs of the failed node to one or more partner nodes for performing the write logs on the shared storage. Write logs may be transmitted only between nodes and the global device to reduce data exchanges between nodes and conserve I/O resources of the nodes.Type: GrantFiled: April 29, 2009Date of Patent: November 29, 2011Assignee: NetApp, Inc.Inventor: Thomas Rudolf Wenzel
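Failure detection by watching write-log arrival intervals can be illustrated with a small sketch. The class `WriteLogMonitor`, its `timeout` parameter, and the use of caller-supplied timestamps are assumptions made for the example and do not come from the patent.

```python
from typing import Dict, List


class WriteLogMonitor:
    """Hypothetical global device that tracks when each node last sent a write log."""

    def __init__(self, timeout: float) -> None:
        self.timeout = timeout                  # longest tolerated gap, in seconds
        self.last_seen: Dict[str, float] = {}   # node name -> time of last write log

    def record_write_log(self, node: str, now: float) -> None:
        """Note that a write log arrived from `node` at time `now`."""
        self.last_seen[node] = now

    def failed_nodes(self, now: float) -> List[str]:
        """Return nodes whose last write log is older than the timeout."""
        return sorted(n for n, t in self.last_seen.items() if now - t > self.timeout)
```

Passing the current time in explicitly keeps the sketch deterministic; a real monitor would more likely read a monotonic clock.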
-
Patent number: 8065559Abstract: The present invention provides methods and systems for performing load balancing via a plurality of virtual servers upon a failover using metrics from a backup virtual server. The methods and systems described herein provide for an appliance detecting that a first virtual server of a plurality of virtual servers having one or more backup virtual servers load balanced by an appliance is not available, identifying that at least a first backup virtual server of the one or more backup virtual servers of the first virtual server is available, maintaining a status of the first virtual server as available in response to the identification, obtaining one or more metrics from the first backup virtual server of the one or more backup virtual servers, and determining the load across the plurality of virtual servers using the metrics obtained from the first backup virtual server associated with the first virtual server.Type: GrantFiled: May 29, 2008Date of Patent: November 22, 2011Assignee: Citrix Systems, Inc.Inventors: Sandeep Kamath, Josephine Suganthi, Sergey Verzunov, Murali Raja, Anil Shetty
-
Patent number: 8060779Abstract: Provided are a method, system, and article of manufacture for using virtual copies in a failover and failback environment. Updates are copied from a primary first storage at the primary site to a secondary first storage at the secondary site during system operations. A second storage is maintained at at least one of the primary and secondary sites. A failover is performed from the primary site to the secondary site after a failure at the primary site. The at least one second storage is used after recovery of the primary site to synchronize the secondary site to the primary site. Only updates made to the secondary site during the failover are copied to or from the at least one second storage in response to the recovery at the primary site.Type: GrantFiled: April 12, 2010Date of Patent: November 15, 2011Assignee: International Business Machines CorporationInventors: Brent Cameron Beardsley, Gregory Edward McBride, Robert Francis Bartfai
-
Patent number: 8054986Abstract: An acoustic communication device includes a computer device having an acoustic transmitter and/or an acoustic receiver. A signal processing module processes sound signals such that the transmitter and/or receiver are employed to permit acoustic communication between computer devices using sound signals.Type: GrantFiled: August 18, 2009Date of Patent: November 8, 2011Assignee: International Business Machines CorporationInventors: Dimitri Kanevsky, Ran D. Zilca, Alexander Zlatsin
-
Patent number: 8055940Abstract: A system and method detects communication error among multiple nodes in a concurrent computing environment. One or more barrier synchronization points/checkpoints or regions are used to check for a communication mismatch. The barrier synchronization point(s)/checkpoint(s) can be placed anywhere in the concurrent computing program. Once a node reaches a barrier synchronization point/checkpoint, it is not allowed to communicate with another node regarding data that is needed to execute the concurrent computing program, even if the other node has not reached the barrier synchronization point/checkpoint. Regions can also, or alternatively, be used to detect a communication mismatch instead of barrier synchronization points/checkpoints. A concurrent program on each node is separated into one or more regions. Two nodes communicate with each other when their regions are compatible. If their regions are not compatible, a communication mismatch occurs.Type: GrantFiled: July 17, 2007Date of Patent: November 8, 2011Assignee: The MathWorks, Inc.Inventors: Edric Ellis, Jocelyn Luke Martin
-
Patent number: 8056076Abstract: A method and system for acquiring a quiescing set of information associated with a virtual machine. A virtual machine is cloned. The cloned virtual machine has an associated persistent storage device. The state of the persistent storage device is transformed into a quiesced state of the cloned virtual machine by utilizing a shut-down process. The shut-down process is executed on the cloned virtual machine to quiesce the cloned virtual machine and the quiescing set of information of the cloned virtual machine is automatically reduced to information stored on the persistent storage device.Type: GrantFiled: June 29, 2006Date of Patent: November 8, 2011Assignee: VMware, Inc.Inventors: Greg Hutchins, Christian Czezatke, Satyam B. Vaghani, Mallik Mahalingam, Shaw Chuang, Bich Le
-
Patent number: 8054849Abstract: A system and method for managing video content streams are disclosed. The method includes receiving a plurality of multicast video streams at a server and buffering each video stream within a memory at the server. The method also includes generating a multicast video output at the server and a unicast video output at the server.Type: GrantFiled: May 27, 2005Date of Patent: November 8, 2011Assignee: AT&T Intellectual Property I, L.P.Inventor: Dinesh Nadarajah
-
Patent number: 8051326Abstract: System and method for completeness of transmission control protocol (TCP) high availability (HA) are disclosed. The system includes an active processor, having an application and a TCP, and a standby processor, having another application and another TCP; wherein communications among the active application, the active TCP, the standby application and the standby TCP quickly and efficiently enable the system to switch over seamlessly from the active processor to the standby processor for transmission of incoming TCP data streams and outgoing TCP data streams if the active processor fails.Type: GrantFiled: October 15, 2007Date of Patent: November 1, 2011Assignee: FutureWei Technologies, Inc.Inventor: Huaimo Chen
-
Patent number: 8051322Abstract: In a communication system using an IP tunnel for communication between application processing apparatuses (hereinafter, processing apparatuses), an application can be moved to an arbitrary processing apparatus, update of tunnel tables included in the respective processing apparatuses is quickly performed, and a buffer for waiting for packets during the table update is made small. A redundancy managing apparatus manages a correspondence between a virtual IP address (VIP) of an application in a communication system and an IP address (RIP) of a processing apparatus to execute the application. The processing apparatus notifies the VIP of the communication partner application of the application to the redundancy managing apparatus.Type: GrantFiled: May 5, 2009Date of Patent: November 1, 2011Assignee: Hitachi Kokusai Electric Inc.Inventors: Norihisa Matsumoto, Nodoka Mimura, Hidetoshi Nishiyama, Makoto Watanabe
-
Patent number: 8051220Abstract: A process control system is provided having a plurality of I/O devices in communication using a bus. A primary redundant I/O device and a secondary redundant I/O device are coupled to the bus, where the secondary redundant I/O device is programmed to detect a primary redundant I/O device fault. The secondary redundant I/O device, upon detecting the primary redundant I/O device fault, publishes a primary redundant I/O device fault message on the bus. The controller may deactivate the primary redundant I/O device and activate the secondary redundant I/O device responsive to the primary redundant I/O device fault message.Type: GrantFiled: December 23, 2009Date of Patent: November 1, 2011Assignee: Fisher-Rosemount Systems, Inc.Inventors: Michael D. Apel, Steven L. Dienstbier
-
Patent number: 8041986Abstract: A fail-over method is proposed for having a backup server take over a task that is performed on an active server, even when the active server and the backup server have different hardware configurations. The method for making a backup server take over a task when a fault occurs on an active server comprises steps of acquiring configuration information on the hardware in the active server and the backup server, acquiring information relating the hardware in the backup server with the hardware in the active server, selecting a backup server to take over the task that is executed on the active server where the fault occurred, creating logical partitions on the selected backup server, and taking over the task executed on the active server logical partitions, in the logical partitions created on the selected backup server.Type: GrantFiled: March 26, 2010Date of Patent: October 18, 2011Assignee: Hitachi, Ltd.Inventors: Keisuke Hatasaki, Masayoshi Kitamura, Yoshifumi Takamoto
-
Information-processing equipment and system therefor with switching control for switchover operation
Patent number: 8032786Abstract: In cases where the system which performs service provision includes plural kinds of OS, the plural kinds of OS are operated simultaneously on one standby server provided with the virtual control unit. When a failure or similar event in the operation system server necessitates a system switchover from the operation system server to the standby server, the virtual control unit of the standby server distinguishes the operation system server in which the failure has occurred, and hands over the processing to the switching control unit on a suitable OS on the standby server.Type: GrantFiled: April 24, 2008Date of Patent: October 4, 2011Assignee: Hitachi, Ltd.Inventor: Yukihiro Shimmura
-
Patent number: 8032781Abstract: A system and method for allowing more rapid takeover of a failed filer by a clustered takeover partner filer in the presence of a coredump procedure (e.g. a transfer of the failed filer's working memory) is provided. To save time, the coredump is allowed to occur contemporaneously with the takeover of the failed filer's regular, active file service disks by the partner so that the takeover need not await completion of the coredump to begin. This is accomplished, briefly stated, by the following techniques. The coredump is written to a single disk that is not involved in regular file service, so that takeover of regular file services can proceed without interference from coredump. A reliable means for both filers in a cluster to identify the coredump disk is provided, which removes takeover dependence upon unreliable communications mechanisms.Type: GrantFiled: October 7, 2010Date of Patent: October 4, 2011Assignee: NetApp, Inc.Inventors: Susan M. Coatney, John Lloyd, Jeffrey S. Kimmel, Brian Parkison, David Brittain Bolen
-
Patent number: 8028193Abstract: Failover of blade servers in a data center including powering off a failing blade server by a system management server through a blade server management module (‘BSMM’) managing the failing blade server, the failing blade server characterized by a machine type, one or more network addresses, and one or more storage addresses, the addresses being virtual addresses; identifying, by the system management server from a pool of standby blade servers, a replacement blade server, the replacement blade server managed by a BSMM; assigning, by the system management server through the BSMM managing the replacement blade server, the one or more network addresses and the one or more storage addresses of the failing blade server to the replacement blade server, including enabling in the replacement blade server the assigned addresses; and powering on the replacement blade server by the system management server through the BSMM managing the replacement blade server.Type: GrantFiled: December 13, 2007Date of Patent: September 27, 2011Assignee: International Business Machines CorporationInventors: Gregory W. Dake, Eric R. Kern, Andrew B. McNeill, Jr., Martin J. Tross, Theodore B. Vojnovich, Ben-Ami Yassour
-
Patent number: 8028192Abstract: A method, system and computer-readable medium for providing rapid failback of a computer system is described. The method, which operates during failback of a secondary computer to a primary computer, accesses a map to determine a location of a latest version of data corresponding to a read request, where the location may be within either a primary data storage or a secondary data storage. The system comprises a primary computer coupled to a primary data storage and a secondary computer coupled to a secondary data storage. The primary computer maintains a write log and the secondary computer maintains a map. The computer-readable medium contains instructions, which, when executed by a processor, perform the steps embodied by the method.Type: GrantFiled: April 28, 2006Date of Patent: September 27, 2011Assignee: Symantec Operating CorporationInventors: Anand Kekre, Angshuman Bezbaruah, Ankur Panchbudhe
-
Patent number: 8024605Abstract: A method for maintaining the ability of a parent server process to communicate with one or more client processes is disclosed. In the method, a first child server process is configured to monitor for failure of the parent server process and to respond to failure of the parent server process by: i) continuing any communication with the client processes that would have been performed by the parent server process had it not failed; and ii) initiating a second child server process which is configured to monitor for failure of the first child server process and to respond to such a failure in the same manner as the first child server process responds to failure of the parent server process.Type: GrantFiled: August 22, 2005Date of Patent: September 20, 2011Assignee: Oracle International CorporationInventor: Ramasubramanian Saravanakumar
-
Patent number: 8006131Abstract: In particular embodiments, method and system for detecting a failure of a primary ad-splicer, conveying failure information for the failed primary ad-splicer to a redundant ad-splicer, dynamically forwarding one or more pre-spliced packets intended for the failed primary ad-splicer to the redundant ad-splicer, receiving one or more post-spliced packets from the redundant ad-splicer, and transmitting the post-spliced packets towards one or more target receivers are provided.Type: GrantFiled: October 29, 2008Date of Patent: August 23, 2011Assignee: Cisco Technology, Inc.Inventors: Rajiv Asati, Anil Thomas, Toerless Eckert
-
Patent number: 8006130Abstract: Techniques for generating a system model for use by an availability management framework (AMF) are described. Inputs are received, processed and mapped into outputs which are further processed into a configuration file in an Information Model Management (IMM) Service eXtensible Markup Language (XML) format which can be used as a system model by an AMF.Type: GrantFiled: September 30, 2008Date of Patent: August 23, 2011Assignee: Telefonaktiebolaget L M Ericsson (Publ)Inventors: Ali Kanso, Maria Toeroe
-
Patent number: 8006124Abstract: Provided are a large-scale cluster monitoring system and a method for automatically building/restoring the same, which can automatically build a large-scale monitoring system and can automatically build a monitoring environment when a failure occurs in nodes. The large-scale cluster monitoring system includes a CM server, a DB server, GM nodes, NA nodes, and a DB agent. The CM server manages nodes in a large-scale cluster system. The DB server stores monitoring information that is state information of nodes in groups. The GM nodes respectively collect the monitoring information that is the state information of the nodes in the corresponding groups to store the collected monitoring information in the DB server. The NA nodes access the CM server to obtain GM node information and respectively collect the state information of the nodes in the corresponding groups to transfer the collected state information to the corresponding GM nodes.Type: GrantFiled: August 5, 2008Date of Patent: August 23, 2011Assignee: Electronics and Telecommunications Research InstituteInventors: Choon-Seo Park, Song-Woo Sok, Chang-Soo Kim, Yoo-Hyun Park, Yong-Ju Lee, Jin-Hwan Jeong, Hag-Young Kim
-
Patent number: 8001418Abstract: Systems, devices, software, hardware and networks adapted and arranged for monitoring and correcting faults in networked media player systems that include electronic displays are provided. After detection or notification of a fault in at least one networked media player in a network of at least two, or N, media players operationally connected to electronic displays, the invention provides an alternate source of signal to the affected display. In some preferred embodiments, the invention utilizes at least one additional, or N+1, media player as a backup to substitute for the failed media player. Reconfiguration of the faulted media player by means of the N+1 backup networked media player advantageously increases the reliability and efficiency of ongoing maintenance of digital visual systems operating in commercial and other environments.Type: GrantFiled: March 6, 2009Date of Patent: August 16, 2011Assignee: EK3 Technologies, Inc.Inventors: Dennis Michaelson, Joseph Hishon
-
Patent number: 8001431Abstract: A control apparatus controls a device to which the control apparatus is connected. The control apparatus includes a storing unit and a linking unit. The storing unit stores an error message that contains information on a failed component in a storage device upon receiving the error message from the device. The linking unit stores the error message and information on a replacement component, which has been installed in the device in place of the failed component, in the storage device in association with each other upon receiving the information on the replacement component.Type: GrantFiled: August 25, 2010Date of Patent: August 16, 2011Assignee: Fujitsu LimitedInventor: Katsuhiko Konno
-
Patent number: 7996715Abstract: A new multi nodal computer system comprising a number of nodes on which chips of different types reside. The new multi nodal computer system is characterized in that there is one clock chip per node, each clock chip controlling only the chips residing on that node, said chips being appropriate for sending a check stop request to the associated clock chip in case of a malfunction. A new check stop handling method is characterized in that depending on the source of the check stop request the clock chip that received the check stop request initiates a system check stop, a node check stop, or a chip check stop.Type: GrantFiled: November 5, 2008Date of Patent: August 9, 2011Assignee: International Business Machines CorporationInventors: Karin Rebmann, Dietmar Schmunkamp, Tobias Webel, Thomas E. Gilbert, Timothy G. McNamara, Patrick J. Meaney
-
Patent number: 7991822Abstract: Local versions of attributes of a storage object are maintained at a plurality of nodes, wherein a first attribute designates a first node of the plurality of nodes as an owner node for the storage object, and wherein a second attribute includes information to resolve validity of ownership of the storage object among the plurality of nodes. The owner node communicates changes to be made to the local versions of the attributes at other nodes of the plurality of nodes. A second node of the plurality of nodes requests ownership of the storage object. The first attribute is updated to designate the second node of the plurality of nodes as the owner node, in response to determining from the second attribute that the validity of ownership of the storage object allows the second node to inherit ownership of the storage object once the first node surrenders ownership of the storage object.Type: GrantFiled: August 29, 2007Date of Patent: August 2, 2011Assignee: International Business Machines CorporationInventors: Thomas William Bish, Mark Albert Reid, Joseph M. Swingler, Michael Wayne Young
-
Patent number: 7992032Abstract: Even when a large number of guest OSs exist, a failover method meeting high availability needed by the guest OSs is provided for each guest OS. In the event of a physical or logical change of a system, or change of operation states, a smooth failover method can be realized by preventing the consumption of resource amounts due to excessive failover methods, and the occurrence of system downtime due to an inadequate failover method. In a server virtualization environment, in a cluster configuration having a failover method due to hot standby and cold standby, by selecting a failover method meeting high availability requirements specifying performance during failover of applications on the guest OSs, a suitable cluster configuration is realized. Failure monitoring is realized by quantitative heartbeat.Type: GrantFiled: August 3, 2007Date of Patent: August 2, 2011Assignee: Hitachi, Ltd.Inventors: Tsunehiko Baba, Toshiomi Moriki, Yuji Tsushima
-
Patent number: 7991850Abstract: The present invention provides a method and apparatus to restore the operating system of a personal internet communicator (PIC) to a “known good” operational state in the event of a catastrophic failure. In an embodiment of the invention, the hard drive of the personal internet communicator is organized in three partitions: 1) a partition for the operating system and related files; 2) a user data partition; and 3) a “restore” partition. The restore partition is hidden by modifying the type of partition that can be detected by the user or any operating system. Upon a catastrophic failure, the system can be returned to an operational state by performing a sector-by-sector restoration to copy an image of the operating system and related system files back to the operating system partition. In various embodiments of the invention, the PIC system state is continuously monitored by a “registry sniffing” routine that maintains a file containing data corresponding to the system state of the PIC.Type: GrantFiled: July 28, 2005Date of Patent: August 2, 2011Assignee: Advanced Micro Devices, Inc.Inventors: Jeffrey M. Lavin, Stephen Paul
-
Patent number: 7992038Abstract: An architecture for protecting against failure in a switched storage network using virtualization.Type: GrantFiled: July 1, 2010Date of Patent: August 2, 2011Assignee: EMC CorporationInventors: Bradford B. Glade, David W. Harvey, John Kemeny, Lee W. VanTine, Matthew D. Waxman
-
Patent number: 7987228Abstract: The invention relates to communications, particularly but not exclusively broadband communications. One facet of the present invention relates to provisioning of services in a communications network and finds particular, but not exclusive, application in a broadband network environment or other environment where services are provisioned. The provisioning of services will now be discussed in more detail.Type: GrantFiled: July 3, 2002Date of Patent: July 26, 2011Assignee: Accenture Global Services LimitedInventors: Jean Christophe McKeown, Henri Chabrier
-
Patent number: 7987386Abstract: A method is provided in which checkpointing operations are carried out in data processing systems running multiple processes which employ shared memory in a manner which preserves data coherence and integrity but which places no timing restrictions or constraints which require coordination of checkpointing operations. Data structures within local process memory and within shared memory provide the checkpoint operation with application level information concerning shared memory resources specific to at least two processes being checkpointed. Methods are provided for establishing, restoring and releasing shared memory regions that are accessed by multiple cooperating processes.Type: GrantFiled: April 2, 2008Date of Patent: July 26, 2011Assignee: International Business Machines CorporationInventors: Bin Jia, Ellick C. Law, Richard R. Treumann
-
Patent number: 7979739Abstract: Systems and methods for managing a redundant management module are provided. In this regard, a representative system, among others, includes first and second management modules that are configured to manage a computing device; and a programmable logic device that is configured to: instruct the first management module to manage the computing device responsive to detecting that the first management module is ready to manage the computing device, and instruct the second management module to manage the computing device responsive to detecting that the first management module failed to manage the computing device.Type: GrantFiled: August 21, 2008Date of Patent: July 12, 2011Assignee: Hewlett-Packard Development Company, L.P.Inventors: Kum Cheong Adam Chan, Chee Cheng Jeffrey Liang, Boon Siang Choo, Dale Shidla
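The selection rule in this abstract can be illustrated with a minimal Python sketch. The function name, its boolean inputs, and the string return values are assumptions made for the example; the abstract describes a programmable logic device, not software, and does not specify how readiness or failure is detected.

```python
def select_active_module(first_ready, first_failed):
    """Return which management module should manage the device.

    Hypothetical model of the rule in the abstract:
    - the second module takes over once the first is detected as failed;
    - otherwise the first module manages as soon as it reports ready;
    - until then, no module is instructed to manage.
    """
    if first_failed:
        return "second"
    if first_ready:
        return "first"
    return None
```

Checking the failure flag first is a design choice of this sketch: it ensures a module that once reported ready but later failed is not selected again.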
-
Patent number: 7979546Abstract: Database management systems, methods, and program products that exploit time dependent sequential database management system processes to ensure presentation of the same data or view to one or a plurality of users through sequencing asynchronous database management operations such as recovery and replication. Sequencing is accomplished through the use of entries in sequential logs, including transaction logs, recovery logs, and other data recovery tools and applications. Uses include managing data migration and data replication.Type: GrantFiled: April 15, 2008Date of Patent: July 12, 2011Assignee: International Business Machines CorporationInventors: Elizabeth B. Hamel, Bruce G. Lindsay
-
Patent number: 7975175Abstract: Embodiments of a system that adjusts a checkpointing frequency in a distributed computing system that executes multiple jobs are described. During operation, the system receives signals associated with the operation of the computing nodes. Then, the system determines risk metrics for the computing nodes using a pattern-recognition technique to identify anomalous signals in the received signals. Next, the system adjusts a checkpointing frequency of a given checkpoint for a given computing node based on a comparison of a risk metric associated with the given computing node and a threshold, thereby implementing holistic fault tolerance, in which prediction and prevention of potential faults occurs across the distributed computing system.Type: GrantFiled: July 9, 2008Date of Patent: July 5, 2011Assignee: Oracle America, Inc.Inventors: Lawrence G. Votta, Keith A. Whisnant, Kenny C. Gross
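The threshold comparison described in this abstract can be sketched in a few lines of Python. The function name, the halving/doubling policy, and the interval bounds are assumptions made for illustration; the abstract states only that the checkpointing frequency is adjusted based on comparing a node's risk metric with a threshold, and it does not say how the risk metric is computed.

```python
def adjust_checkpoint_interval(interval_s, risk, threshold,
                               min_interval_s=60.0, max_interval_s=3600.0):
    """Return a new checkpoint interval (in seconds) for one computing node.

    Hypothetical policy: checkpoint twice as often when the node's risk
    metric exceeds the threshold, otherwise checkpoint half as often.
    The result is clamped to [min_interval_s, max_interval_s].
    """
    if risk > threshold:
        new_interval = interval_s / 2.0   # higher risk: checkpoint more often
    else:
        new_interval = interval_s * 2.0   # lower risk: checkpoint less often
    return max(min_interval_s, min(max_interval_s, new_interval))
```

For example, a node checkpointing every 600 seconds whose risk metric rises above the threshold would move to a 300-second interval under this policy.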
-
Patent number: 7975174Abstract: One aspect of the present invention provides a system for failover comprising at least one client selectively connectable to one of at least two interconnected servers via a network connection. In a normal state, one of the servers is designated a primary server when connected to the client and a remainder of the servers are designated as backup servers when not connected to the client. The at least one client is configured to send messages to the primary server. The servers are configured to process the messages using at least one service that is identical in each of the servers. The services are unaware of whether a server respective to the service is operating as the primary server or the backup server. The servers are further configured to maintain a library, or the like, that indicates whether a server is the primary server or a server is the backup server. The services within each server are to make external calls via its respective library.Type: GrantFiled: April 9, 2010Date of Patent: July 5, 2011Assignee: TSX Inc.Inventors: Tudor Morosan, Gregory A. Allen, Viktor Pavlenko, Benson Sze-Kit Lam
-
Publication number: 20110161630Abstract: An apparatus and method is described herein for replacing faulty core components. General purpose hardware is provided to replace core pipeline components, such as execution units. In the embodiment of execution unit replacement, a proxy unit is provided, such that mapping logic is able to map instruction/operations, which correspond to faulty execution units, to the proxy unit. As a result, the proxy unit is able to receive the operations, send them to general purpose hardware for execution, and subsequently write-back the execution results to a register file; it essentially replaces the defective execution unit allowing a processor with defective units to be sold or continue operation.Type: ApplicationFiled: December 28, 2009Publication date: June 30, 2011Inventors: Steven E. Raasch, Michael D. Powell, Shubhendu S. Mukherjee, Arijit Biswas
-
Publication number: 20110161729Abstract: Techniques for transparently replacing a processor, that receives interrupts in a partitioned computing device, with a replacement processor, are disclosed. In at least some embodiments, methods are discussed for directing the interrupts to an unchangeable identifier mapped to the processor's identifier and replacing the processor with the replacement processor. An intermediary, such as an I/O APIC, is used for storing the unchangeable identifier. The mapping may use logical mode delivery, physical mode delivery, or interrupt mapping.Type: ApplicationFiled: March 9, 2011Publication date: June 30, 2011Applicant: Microsoft CorporationInventors: Andrew J. Ritz, Ellsworth D. Walker, Yimin Deng, Christopher Ahna
-
Patent number: 7971094Abstract: A failover module generates a user interface to enable an administrative user to define a failover plan for a primary site. The failover plan includes user-specified information for use by multiple operations of a failover process for failing over a server system from the primary site to a failover site. The failover plan can be stored as a data object on a computer system at the failover site. In the event of a serious failure at the primary site, the failover process can be invoked and carried out on the failover site with little or no human intervention, based on the failover plan, to cause the server system to be failed over to the failover site, thereby substantially reducing downtime of the server system and its data.Type: GrantFiled: March 3, 2009Date of Patent: June 28, 2011Assignee: NetApp, Inc.Inventors: Paul M. Benn, Balamurali Palaiah
-
Patent number: 7970844Abstract: A method of “stateful failover” is provided that allows email gateway systems in a cluster to deliver email messages that have been accepted for delivery by a member of the cluster that has failed without delivering the messages. The method involves creating a backup copy of the messages that have been accepted for delivery by one email gateway system in the stateful failover cluster on one or more other email gateway systems in the stateful failover cluster. Upon detecting the failure of the email gateway system that accepted the message, another member of the stateful failover cluster that has access to the backup copy of the message queue takes responsibility for the delivery of the messages on the mirrored queue.Type: GrantFiled: August 26, 2009Date of Patent: June 28, 2011Assignee: WatchGuard Technologies, Inc.Inventors: Robert Osborne, Bill Simpson, Rod Gilchrist
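The mirrored-queue idea in this abstract can be sketched as follows. The `Gateway` class and the `accept` and `take_over` helpers are hypothetical names introduced for the example, and the in-memory lists stand in for whatever storage and failure detection a real gateway cluster would use; none of this is taken from the patent itself.

```python
class Gateway:
    """Hypothetical email gateway holding its own queue and peer mirrors."""

    def __init__(self, name):
        self.name = name
        self.queue = []    # messages this gateway is responsible for delivering
        self.mirrors = {}  # peer name -> backup copy of that peer's queue


def accept(gateway, peers, message):
    """Accept a message and mirror it to the other cluster members."""
    gateway.queue.append(message)
    for peer in peers:
        peer.mirrors.setdefault(gateway.name, []).append(message)


def take_over(failed, survivor):
    """Move the failed gateway's mirrored messages onto the survivor's queue.

    Returns the messages the survivor is now responsible for delivering.
    """
    pending = survivor.mirrors.pop(failed.name, [])
    survivor.queue.extend(pending)
    return pending
```

Removing the mirror entry during takeover means a second call for the same failed gateway returns an empty list, so the sketch does not queue the same messages twice.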