At Operating System Level (EPO) Patents (Class 714/E11.132)
  • Publication number: 20130275801
    Abstract: A computer program product for performing error recovery is configured to perform a method that includes creating, by a processor, a recovery checkpoint. The processor is dynamically switched into a non-recoverable processing mode of operation based on creating the software recovery checkpoint. The non-recoverable processing mode of operation is a mode in which a subset of hardware error recovery resources are powered-down or re-purposed for instruction processing. It is determined, during the non-recoverable processing mode of operation, that a new software recovery checkpoint is required. Based on the determining that a new software recovery checkpoint is required, the processor is dynamically switched into a recoverable processing mode of operation. The recoverable processing mode of operation is a mode in which hardware error recovery resources, including at least one of the hardware error recovery resources in the subset, are purposed for hardware error recovery operations.
    Type: Application
    Filed: April 16, 2012
    Publication date: October 17, 2013
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Fadi Y. Busaba, Steven R. Carlough, Christopher A. Krygowski, Brian R. Prasky, Chung-Lung K. Shum
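The mode switching described in the abstract above can be pictured as a two-state machine. The following is only a minimal sketch under stated assumptions: the class `Processor`, the method names, and the rule that a checkpoint is taken only while in recoverable mode are illustrative and do not come from the application itself.

```python
from enum import Enum


class Mode(Enum):
    RECOVERABLE = "recoverable"
    NON_RECOVERABLE = "non-recoverable"


class Processor:
    """Hypothetical model of the two processing modes (not the patented design)."""

    def __init__(self):
        self.mode = Mode.RECOVERABLE
        self.checkpoints = 0
        self.history = [self.mode]  # record of every mode entered

    def _switch(self, mode):
        self.mode = mode
        self.history.append(mode)

    def create_checkpoint(self):
        # Assumption: a checkpoint is only taken while recovery hardware is active.
        if self.mode is not Mode.RECOVERABLE:
            raise RuntimeError("checkpoints are only taken in recoverable mode")
        self.checkpoints += 1
        # After the checkpoint exists, recovery resources can be released.
        self._switch(Mode.NON_RECOVERABLE)

    def checkpoint_required(self):
        # A new checkpoint is needed, so return to recoverable mode first.
        if self.mode is Mode.NON_RECOVERABLE:
            self._switch(Mode.RECOVERABLE)


cpu = Processor()
cpu.create_checkpoint()    # recoverable -> non-recoverable
cpu.checkpoint_required()  # non-recoverable -> recoverable
cpu.create_checkpoint()    # second checkpoint, back to non-recoverable
```

The `history` list exists only to make the sequence of transitions visible; it has no counterpart in the abstract.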
  • Publication number: 20130145204
    Abstract: A system for applying a recovery mechanism to a network of medical diagnostics instruments is provided herein. The system includes the following: a plurality of medical diagnostics instruments, each associated with a network connected component; a plurality of communication modules, each associated with a corresponding one of the plurality of network connected components, wherein each one of the plurality of communication modules is arranged to report on malfunctioning components that are network connected with the corresponding component, and a recovery module, configured to: (i) obtain reports from the communication modules; (ii) re-establish the malfunctioning components; and (iii) notify all communication modules of the re-establishment of the malfunctioning components, wherein the communication modules are further configured to re-establish connection between the corresponding components and the re-established components.
    Type: Application
    Filed: December 6, 2011
    Publication date: June 6, 2013
    Applicant: BIO-RAD LABORATORIES
    Inventors: Shlomo Gabel, Zvlya Tamir
  • Publication number: 20130111260
    Abstract: A recover to cloud (R2C) service replicates a customer production environment to virtual data centers (VDCs) operated in a cloud service provider environment. Customers provision both a disaster recovery VDC and a test VDC. At A Time of Disaster (ATOD), the disaster VDC is made available to the customer through the cloud. The disaster VDC is allocated a first set of resources dedicated to the specific customer and to disaster recovery. The test VDC, brought on line at A Time of Test (ATOT), is allocated resources from a second set of resources arranged in a shared pool, separate from the first set. Provisioning of the test VDC does not disturb critical resource assignments needed in the event of a disaster.
    Type: Application
    Filed: October 27, 2011
    Publication date: May 2, 2013
    Applicant: SunGard Availability Services LP
    Inventors: CHANDRA REDDY, Enyou Li
  • Publication number: 20130103974
    Abstract: Managing firmware in a computing system storing a plurality of different firmware images for the same firmware includes: calculating, for each firmware image in dependence upon a plurality of predefined factors, a preference score; responsive to a failure of a particular firmware image, selecting a firmware image having a highest preference score; and failing over to the selected firmware image.
    Type: Application
    Filed: October 25, 2011
    Publication date: April 25, 2013
    Applicant: International Business Machines Corporation
    Inventors: Fred A. Bower, III, Michael H. Nolterieke, William G. Pagan, Paul B. Tippett
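The score-then-fail-over idea in the abstract above might look roughly like the following. This is a sketch under stated assumptions: the abstract does not name the predefined factors, so `age_days`, `past_failures`, and `verified`, the weights, and the exclusion of the failed image from the candidates are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class FirmwareImage:
    name: str
    age_days: int       # hypothetical factor: days since the image was installed
    past_failures: int  # hypothetical factor: recorded failures of this image
    verified: bool      # hypothetical factor: passed an integrity check


def preference_score(image: FirmwareImage) -> float:
    """Combine the illustrative factors into one score (higher is better)."""
    score = 100.0
    score -= 10.0 * image.past_failures
    score -= 0.1 * image.age_days
    if not image.verified:
        score -= 50.0
    return score


def fail_over(images, failed_name):
    """Pick the highest-scoring image, skipping the one that just failed (assumption)."""
    candidates = [img for img in images if img.name != failed_name]
    if not candidates:
        raise RuntimeError("no alternative firmware image available")
    return max(candidates, key=preference_score)


images = [
    FirmwareImage("primary", age_days=10, past_failures=0, verified=True),
    FirmwareImage("backup-a", age_days=200, past_failures=1, verified=True),
    FirmwareImage("backup-b", age_days=30, past_failures=0, verified=False),
]
selected = fail_over(images, failed_name="primary")
print(selected.name)  # backup-a scores 70.0, backup-b scores 47.0
```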
  • Publication number: 20130091335
    Abstract: A computer-implemented method, computer program product and data processing system provide checkpoint-based high availability for an application in a virtualized environment with reduced network demands. An application executes on a primary host machine comprising a first virtual machine. A virtualization module receives a designation from the application of a portion of the memory of the first virtual machine as purgeable memory, wherein the purgeable memory can be reconstructed by the application when the purgeable memory is unavailable. Changes are tracked to a processor state and to a remaining portion that is not purgeable memory and the changes are periodically forwarded at checkpoints to a secondary host machine. In response to an occurrence of a failure condition on the first virtual machine, the secondary host machine is signaled to continue execution of the application by using the forwarded changes to the remaining portion of the memory and by reconstructing the purgeable memory.
    Type: Application
    Filed: October 5, 2011
    Publication date: April 11, 2013
    Applicant: IBM CORPORATION
    Inventors: James Mulcahy, Geraint North
  • Publication number: 20130061086
    Abstract: To provide a fault-tolerant system requiring only one new server when the number of jobs to be processed concurrently exceeds the number of jobs processable by the current servers and requiring no standby servers. Servers 1 and 2 each run a hypervisor to establish multiple virtual machines. The hypervisors assign primary and secondary to the virtual machines in the manner that any of the servers has one or more primary virtual machines and one or more secondary virtual machines, and assign different processing to the virtual machines on the same server. When any of the servers is determined to have failed, the server including the secondary virtual machine paired with the primary virtual machine on the failed server promotes the secondary virtual machine to the primary.
    Type: Application
    Filed: March 7, 2012
    Publication date: March 7, 2013
    Applicant: NEC Corporation
    Inventor: Kiyoshi BABA
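The promotion step in the abstract above can be illustrated with a small placement table. This is a sketch only: the dictionary layout, the job names, and the function `promote_on_failure` are assumptions made for illustration, and leaving the secondary slot empty after a failure is a simplification.

```python
def promote_on_failure(placement, failed_server):
    """Return a new placement after `failed_server` is lost.

    `placement` maps a job name to {"primary": server, "secondary": server}.
    """
    updated = {}
    for job, roles in placement.items():
        if roles["primary"] == failed_server:
            # The surviving server promotes its secondary VM to primary.
            updated[job] = {"primary": roles["secondary"], "secondary": None}
        elif roles["secondary"] == failed_server:
            # The primary keeps running but no longer has a partner.
            updated[job] = {"primary": roles["primary"], "secondary": None}
        else:
            updated[job] = dict(roles)
    return updated


# Each server hosts one primary and one secondary VM, as in the abstract.
placement = {
    "job-a": {"primary": "server1", "secondary": "server2"},
    "job-b": {"primary": "server2", "secondary": "server1"},
}
after = promote_on_failure(placement, "server1")
```

With `server1` lost, both jobs end up with `server2` as primary and no secondary until a replacement server is added.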
  • Publication number: 20130047024
    Abstract: Provided are techniques for configuring a primary shared Ethernet adapter (SEA) and a backup SEA into a failover (F/O) protocol; providing a user interface (UI) for enabling a user to request a SEA load sharing protocol; in response to a user request for a SEA load sharing protocol, verifying that criteria for load sharing are satisfied; setting, by the UI, a load sharing mode, comprising: requesting, by the backup SEA to the primary SEA, implementation of the SEA load sharing protocol; responsive to the requesting by the backup SEA, the primary SEA transmits an acknowledgment to the backup SEA and transitions into a sharing state; and responsive to the acknowledgment from the primary SEA, the backup SEA transitions to the sharing state.
    Type: Application
    Filed: August 15, 2011
    Publication date: February 21, 2013
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Shaival J. Chokshi, Xiaohan Qin, Rakesh Sharma, Patrick T. Vo
  • Publication number: 20120297236
    Abstract: In one embodiment, a method attempts, by a computing device, to determine a placement of a set of virtual machines on available hosts upon failure of a host. The placement considers the set of virtual machines as being not powered on any of the available hosts. The method further determines, by the computing device, a placed list of virtual machines in the set of virtual machines as a recommendation to power on to the available hosts. The determination of the placed list of virtual machines is used to determine a power off list of virtual machines in the set of virtual machines to power off, wherein virtual machines in the power off list of virtual machines are currently powered on available hosts but were considered to be powered off to determine the placement.
    Type: Application
    Filed: May 17, 2011
    Publication date: November 22, 2012
    Applicant: VMWARE, INC.
    Inventors: Elisha ZISKIND, Guoqiang SHU
  • Patent number: 8271754
    Abstract: A system, method and computer-usable medium are disclosed for recovering data from a memory storage device. The operating system (OS) of an IPS comprising a source memory storage device, further comprising stored data, is monitored to detect a defective operating state. If a defective operating state of the OS is detected, then operation of the IPS is terminated, followed by the initiation of IPS boot operations to recover data from the source memory storage device. The OS is bypassed, and initial boot operations are performed from a management controller or from the BIOS of the IPS. Additional boot operations are performed, and once the IPS has been brought to an operative state, a data recovery module is used to transfer data from the source memory storage device to a target storage device.
    Type: Grant
    Filed: October 5, 2009
    Date of Patent: September 18, 2012
    Assignee: Advanced Micro Devices, Inc.
    Inventors: Terry L. Cole, Paul W. Vancil
  • Publication number: 20120144259
    Abstract: By utilizing Reed-Solomon erasure decoding algorithms and techniques, the system is able to perform error detection for the case where the number of bytes received in error exceeds a correcting capability of a decoder. The error detection can be used, for example, to determine whether a codeword is decodable, and whether the retransmission of data is necessary. The retransmission can be accomplished by assembling a message that is sent to another modem requesting retransmission of one or more portions of data, such as one or more codewords.
    Type: Application
    Filed: June 3, 2010
    Publication date: June 7, 2012
    Applicant: AWARE, INC.
    Inventors: Joshua Grossman, John Greszczuk, Marcos C. Tzannes
  • Publication number: 20120137163
    Abstract: A multi-core system 1 according to the present invention includes a plurality of OSs: OS[1] 110 and OS[2] 120 set as a main system and a standby system for a sound reproducing function. The standby-system OS[1] 110 sets a timer 17 according to a DMA transfer completion interruption request to detect a failure of the main-system OS[2] 120 according to detection of timeout by the timer 17. Upon detection of a failure of the main-system OS[2] 120, the standby-system OS[1] 110 is switched as the main-system OS[2] 120 to operate a device driver 114 on a side of the standby-system OS[1] 110, thereby continuously executing audio mixing processing of audio data and DMA transfer request processing.
    Type: Application
    Filed: April 2, 2010
    Publication date: May 31, 2012
    Inventor: Kentaro Sasagawa
  • Publication number: 20120084602
    Abstract: A fault restoration technique for use in a virtual environment is provided. The fault restoration technique includes monitoring fault state values of a plurality of domains, detecting a faulty domain, if any, from the plurality of domains, and restoring the faulty domain by reloading the OS of the faulty domain.
    Type: Application
    Filed: June 8, 2011
    Publication date: April 5, 2012
    Inventors: Sung-Min Lee, Sang-Bum Suh
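The monitor, detect, and reload cycle in the abstract above could be sketched as follows. The numeric fault state values, the threshold `FAULT_THRESHOLD`, and the `reload_os` callback are hypothetical; the abstract does not say how a fault state value is encoded or how a domain is judged faulty.

```python
FAULT_THRESHOLD = 3  # hypothetical: values at or above this mark a domain as faulty


def find_faulty(fault_states, threshold=FAULT_THRESHOLD):
    """Return the domains whose monitored fault state value reaches the threshold."""
    return [domain for domain, value in fault_states.items() if value >= threshold]


def restore(fault_states, reload_os):
    """Reload the OS of every faulty domain and reset its fault state value."""
    restored = []
    for domain in find_faulty(fault_states):
        reload_os(domain)         # stand-in for the real OS reload
        fault_states[domain] = 0  # assumption: a reloaded domain starts clean
        restored.append(domain)
    return restored


reloaded = []
states = {"dom0": 0, "dom1": 5, "dom2": 1}
restored = restore(states, reloaded.append)
```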
  • Publication number: 20120011397
    Abstract: A computer apparatus includes a managing unit realizing virtual computers including device driver virtual computers and user virtual computers, the user virtual computers communicating with various devices via the device driver virtual computers. Error detection information is received from one of the virtual computers upon detection of error in one of the device drivers used for communication with one of the devices in one of the virtual computers. One or more types of the virtual computers and the contents of recovery process corresponding to the type of device driver and the type of error indicated in the received error detection information are acquired from error recovery control information. A recovery instruction is transmitted to one or more of the virtual computers identified by the one or more acquired types of virtual computers in order to cause the one or more identified virtual computers to perform the acquired contents of the recovery process.
    Type: Application
    Filed: April 26, 2011
    Publication date: January 12, 2012
    Applicant: FUJITSU LIMITED
    Inventors: Takeo Murakami, Masahide Noda
  • Publication number: 20100017643
    Abstract: Provided is a failover method for a cluster system for realizing smooth failover of the guest OS's, even when there are many guest OS's, while reducing consumption of computer resources of a server. Smooth failover is realized by preventing competition during failover even when the number of guest OS's is increased. In a cluster configuration in which a slave/master cluster program is operated in a guest OS/host OS, the master cluster program (510) collects and transmits heartbeats of the slave cluster program, thereby realizing failure monitoring through a fixed amount of heartbeats without depending on the number of guest OS's. Further, when the master cluster program monitors failures of the slave cluster program of its own computer to find a normal operation of the guest OS, the amount of communication through heartbeats is reduced by eliminating the necessity of communication to a standby system slave cluster program.
    Type: Application
    Filed: September 23, 2009
    Publication date: January 21, 2010
    Inventors: Tsunehiko Baba, Yuji Tsushima, Toshiomi Moriki
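The heartbeat aggregation in the abstract above can be illustrated by folding per-guest status into a single message per host. This is a minimal sketch, assuming a message layout (`host`, `failed_guests`) and a function name that are not taken from the application.

```python
def aggregate_heartbeats(host, guest_alive):
    """Master cluster program: fold per-guest status into one heartbeat message.

    `guest_alive` maps a guest OS name to True (running normally) or False.
    """
    failed = sorted(guest for guest, ok in guest_alive.items() if not ok)
    return {"host": host, "failed_guests": failed}


# One message is sent per host regardless of how many guests it runs.
message = aggregate_heartbeats(
    "host1", {"guest1": True, "guest2": False, "guest3": True}
)
```

Because only one message leaves each host, the monitoring traffic stays the same as guests are added, which is the property the abstract emphasizes.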
  • Publication number: 20100011243
    Abstract: Methods, systems, and media for enabling a software application to recover from a fault condition, and for protecting a software application from a fault condition, are provided. In some embodiments, methods include detecting a fault condition during execution of the software application, restoring execution of the software application to a previous point of execution, the previous point of execution occurring during execution of a first subroutine in the software application, and forcing the first subroutine to forego further execution and return to a caller of the first subroutine.
    Type: Application
    Filed: April 17, 2007
    Publication date: January 14, 2010
    Applicant: THE TRUSTEES OF COLUMBIA UNIVERSITY
    Inventors: Michael E. Locasto, Angelos D. Keromytis, Salvatore J. Stolfo, Angelos Stavrou, Gabriela Cretu, Stylianos Sidiroglou, Jason Nieh, Oren Laadan
  • Publication number: 20090024820
    Abstract: A method and module for performing a crash dump in a data processing apparatus in which memory for running the crash dump routine is allocated at the time of the crash. The method comprises running a first routine to identify memory locations of data for use by a second routine; allocating memory for performing the second routine from a memory range that does not contain the identified memory locations; and running the second routine using the allocated memory, wherein the first routine comprises a dummy crash dump routine and the second routine comprises a crash dump routine. The dummy crash dump may use smaller data sizes and does not perform any input or output to non-volatile storage of the data to be dumped. When a memory range that is safe to be reused has been identified, the data stored therein can be dumped and then memory for performing the actual crash dump routine can be allocated from the memory range and be reused for performing the actual crash dump routine.
    Type: Application
    Filed: July 16, 2008
    Publication date: January 22, 2009
    Applicant: Hewlett-Packard Development Company, L.P.
    Inventor: Basker Ponnuswamy
  • Publication number: 20080104447
    Abstract: A diagnostic system and method for repairing computing devices comprises a diagnostic application running on a same computing system having a failed operating system (O/S). The diagnostic application is provided with access to the file system of the failed O/S image. The diagnostic software application collects relevant configuration information from the file system of the failed O/S image, and transports this information to a proxy system running the same operating system as the computing device being diagnosed. The proxy system utilizes the collected data to diagnose the subject failed O/S system. Once the proxy makes a determination it synthesizes repair information comprising new or modified files and instructions to be transported back to the diagnostic software system to apply. A network connection is provided between the computer running the diagnostic application and the proxy system that enables data to be easily transported between the two systems without human intervention.
    Type: Application
    Filed: January 7, 2008
    Publication date: May 1, 2008
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Bulent Abali, Robert Saccone
  • Publication number: 20080104441
    Abstract: A method of kernel panic recovery, comprising detecting a kernel panic of a first kernel, retrieving at least some of a state of at least one thread running on the first kernel, and restoring the state of the at least one process on a second kernel.
    Type: Application
    Filed: October 17, 2007
    Publication date: May 1, 2008
    Applicant: HEWLETT-PACKARD DEVELOPMENT COMPANY, L.P.
    Inventors: Pramod Sathyanarayana RAO, Lal Samuel VARGHESE