Synchronization Maintenance Of Processors Patents (Class 714/12)
  • Patent number: 7643602
    Abstract: A method is provided for estimating a frequency offset value. This method includes: receiving a signal from the transmitting device at the receiving device, the received signal having a transmitter frequency (510); generating a local signal at the receiving device, the local signal having a starting frequency (520); comparing a received signal phase and a local signal phase to determine an adjusted error signal representing a phase difference between the received signal and the local signal (530); adjusting a current frequency of the local signal from the starting frequency to the transmitting frequency over a time period (540); integrating the adjusted error signal over the time period to generate an integrated error signal (550); and filtering the integrated error signal to generate a frequency difference estimate indicative of the frequency difference between the transmitter frequency and the starting frequency (560).
    Type: Grant
    Filed: September 30, 2005
    Date of Patent: January 5, 2010
    Assignee: Freescale Semiconductor, Inc.
    Inventors: Timothy R. Miller, John W. McCorkle
  • Patent number: 7634679
    Abstract: A method and system for performing a failover process on a production server. A source server and disaster recovery server are assigned indicators when operating in failover mode. A version match is performed to validate that the Exchange server applications and the storage area network vendor resources are compatible for the source and disaster recovery servers. Thereafter, multiple mailbox stores are dismounted on the source mailbox server. In turn, multiple databases and multiple transaction log drives are dismounted. After this step, drives are mounted in the disaster recovery location that are mirrors of the source server database and transaction log drives. After mounting, an Exchange System Attendant resource is created or Exchange and user attributes are updated in Active Directory. Finally, Exchange services are started and mailbox stores are mounted on the disaster recovery server.
    Type: Grant
    Filed: November 30, 2005
    Date of Patent: December 15, 2009
    Assignee: Microsoft Corporation
    Inventor: Daniel Quintiliano
  • Patent number: 7624302
    Abstract: According to one embodiment, a method comprises detecting loss of lockstep (LOL) for a processor in a multi-processor system. The method further comprises determining that the processor for which the LOL is detected is assigned the role of boot processor, and switching the role of boot processor to a spare processor without shutting down the system's operating system. In another embodiment, a method comprises system firmware determining that an LOL is detected for a lockstep pair of processors that are assigned the role of boot processor in a system. The method further comprises determining one of the lockstep pair of processors that is not the cause of the LOL, and copying the state of the determined one of the lockstep pair of processors that is not the cause of the LOL to a spare processor. The method further comprises switching the role of boot processor to the spare processor.
    Type: Grant
    Filed: October 25, 2004
    Date of Patent: November 24, 2009
    Assignee: Hewlett-Packard Development Company, L.P.
    Inventors: Scott L. Michaelis, Anurupa Rajkumari, William B. McHardy
  • Patent number: 7620845
    Abstract: A distributed system using a quorum redundancy method in which a redundancy process is executed by at least Q processing elements of N processing elements communicable with each other, each of N processing elements includes a resynchronization determining unit for determining that an execution state of the processing element itself can be resynchronized with a latest execution state in the distributed system in the case where the processing element can communicate with at least F+1 elements (F=N−Q) already synchronized of the N processing elements at the time of rebooting the processing element, and a resynchronizing unit for resynchronizing the execution state of the processing element itself to the latest one of the execution states of the at least F+1 processing elements in accordance with the result of determination by the resynchronizing unit.
    Type: Grant
    Filed: March 3, 2005
    Date of Patent: November 17, 2009
    Assignees: Kabushiki Kaisha Toshiba, Toshiba Solutions Corporation
    Inventor: Kotaro Endo
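    Illustration: the resynchronization condition in the abstract above (at least F+1 reachable synchronized elements, with F=N−Q) can be sketched as follows. This is a minimal sketch, not the patented implementation; the function names and the use of a sequence number to order execution states are assumptions made for the example.

```python
def can_resynchronize(n: int, q: int, reachable_synced: int) -> bool:
    """Return True if a rebooting element reaches at least F+1 synchronized elements.

    n is the total number of processing elements, q the quorum size,
    and F = n - q the number of elements that may be missing.
    """
    f = n - q
    return reachable_synced >= f + 1


def latest_state(states):
    """Pick the most recent state from (sequence_number, payload) pairs.

    The sequence number is a hypothetical way of ordering execution states.
    """
    return max(states, key=lambda state: state[0])
```

    With N=5 and Q=3, F is 2, so a rebooting element needs to reach three synchronized peers before it adopts the newest of their states.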
  • Patent number: 7616725
    Abstract: A signal delay structure and method of reducing skew between clock and data signals in a high-speed serial communications interface includes making a global adjustment to the clock signal in the time domain to compensate for a component of the skew that is common between the clock and all data signals. This can include skew caused by the variation in frequency of the input clock from a nominal value, misalignment between the phase of the clock and data generated at the source of the two signals. The global adjustment is made through a delay component that is common to all of the clock signal lines for which skew with data signals is to be compensated. A second level adjustment is made that compensates for the component of the skew that is common to the clock and a subset of the data signals.
    Type: Grant
    Filed: May 27, 2003
    Date of Patent: November 10, 2009
    Assignee: Broadcom Corporation
    Inventors: Jun Cao, Guangming Yin
  • Patent number: 7617412
    Abstract: A dual-processing unit with single clock source CPUs safety I/O module having a safety timer crosscheck diagnostic to enable each CPU to verify the accuracy of the clock source of the other CPU. The diagnostic works by having the first CPU act as a controlling CPU and the second CPU act as a monitoring CPU. Both CPUs are synchronized to begin one cycle of their respective safety functions at the same time. As part of the diagnostic, the controlling CPU is set to be interrupted after a pre-determined time period while the monitoring CPU is set to be interrupted slightly after that. When the controlling CPU is interrupted after the pre-determined time has passed as determined by that CPU's clock source, it sends a signal to the monitoring CPU which then verifies that the perceived time is within an expected range. To verify that the clock source of the monitoring CPU is accurate, the first CPU swaps roles to become the monitoring CPU while the second CPU becomes the controlling CPU.
    Type: Grant
    Filed: October 25, 2006
    Date of Patent: November 10, 2009
    Assignee: Rockwell Automation Technologies, Inc.
    Inventors: Norman S. Shelvik, Daniel M. Gass
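    Illustration: the timer crosscheck described above can be modeled in a few lines. This is a rough sketch under stated assumptions: each clock is reduced to a hypothetical rate factor (1.0 means nominal speed), and the period and tolerance values are invented for the example.

```python
def perceived_elapsed(controller_rate, monitor_rate, period_s):
    """Time the monitoring CPU measures until the controlling CPU signals."""
    # The controller's timer fires when its own clock reads period_s,
    # which happens after period_s / controller_rate of true time.
    true_elapsed = period_s / controller_rate
    # The monitor measures that same interval with its own clock.
    return true_elapsed * monitor_rate


def within_range(perceived_s, period_s, tolerance_s):
    """Check that the perceived time falls inside the expected window."""
    return abs(perceived_s - period_s) <= tolerance_s


def crosscheck(rate_a, rate_b, period_s=1.0, tolerance_s=0.05):
    """Run the check twice, swapping the controlling and monitoring roles."""
    a_controls = within_range(
        perceived_elapsed(rate_a, rate_b, period_s), period_s, tolerance_s)
    b_controls = within_range(
        perceived_elapsed(rate_b, rate_a, period_s), period_s, tolerance_s)
    return a_controls, b_controls
```

    In this simplified model a check only reveals that the two clocks disagree, so a single fast clock makes both directions fail.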
  • Patent number: 7613961
    Abstract: One embodiment disclosed relates to a method of compiling a program to be executed on a target central processing unit (CPU). The method includes opportunistically scheduling diagnostic testing of CPU registers. The method may include use of a predetermined level of aggressiveness for the scheduling of the register diagnostic testing. The scheduled diagnostic testing may include writing known data to a register, reading data from the register, and comparing the known data with the data that was read. If the comparison indicates a difference, then a jump may occur to a fault handler routine.
    Type: Grant
    Filed: October 14, 2003
    Date of Patent: November 3, 2009
    Assignee: Hewlett-Packard Development Company, L.P.
    Inventors: Andrew Harvey Barr, Ken Gary Pomaranski, Dale John Shidla
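    Illustration: the write, read, and compare step of the register diagnostic above can be simulated as below. This sketch models only the test itself, not the compile-time scheduling the abstract describes; `FakeRegister`, `check_register`, and the test pattern are hypothetical names and values chosen for the example.

```python
class FakeRegister:
    """Simulated register; bits in stuck_at_zero_mask can never be set."""

    def __init__(self, stuck_at_zero_mask=0):
        self._value = 0
        self._mask = stuck_at_zero_mask

    def write(self, value):
        # Faulty bits are forced to zero on every write.
        self._value = value & ~self._mask

    def read(self):
        return self._value


def check_register(register, on_fault, pattern=0xA5A5A5A5):
    """Write known data, read it back, and call on_fault on a mismatch."""
    register.write(pattern)
    observed = register.read()
    if observed != pattern:
        on_fault(pattern, observed)
        return False
    return True
```

    Passing a handler as `on_fault` stands in for the jump to a fault handler routine mentioned in the abstract.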
  • Patent number: 7613948
    Abstract: A fault-tolerant computer uses multiple commercial processors operating synchronously, i.e., in lock-step. In an exemplary embodiment, redundancy logic isolates the outputs of the processors from other computer components, so that the other components see only majority vote outputs of the processors. Processor resynchronization, initiated at predetermined time, milestones, and/or in response to processor faults, protects the computer from single event upsets. During resynchronization, processor state data is flushed and an instance of these data in accordance with processor majority vote is stored. Processor caches are flushed to update computer memory with more recent data stored in the caches. The caches are invalidated and disabled, and snooping is disabled. A controller is notified that snooping has been disabled. In response to the notification, the controller performs a hardware reset of the processors. The processors are loaded with the stored state data, and snooping and caches are enabled.
    Type: Grant
    Filed: February 19, 2008
    Date of Patent: November 3, 2009
    Assignee: Maxwell Technologies, Inc.
    Inventors: Robert A. Hillman, Mark Steven Conrad
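    Illustration: the majority vote over processor outputs mentioned above can be sketched as follows. This is a minimal software model, assuming outputs are comparable values; the abstract describes redundancy logic in hardware, and the function name is an assumption.

```python
from collections import Counter


def majority_vote(outputs):
    """Return the strict-majority value and the indices that disagree with it."""
    value, count = Counter(outputs).most_common(1)[0]
    if count * 2 <= len(outputs):
        raise ValueError("no strict majority among processor outputs")
    dissenters = [i for i, out in enumerate(outputs) if out != value]
    return value, dissenters
```

    A non-empty list of dissenters is the kind of event that could trigger the resynchronization described in the abstract.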
  • Patent number: 7610510
    Abstract: Method and apparatus for transactional fault tolerance in a client-server system is described. In one example, output data generated by execution of a service on a primary server during a current epoch between a first checkpoint and a second checkpoint is buffered. A copy of an execution context of the primary server is established on a secondary server in response to the second checkpoint. The output data as buffered is released from the primary server in response to establishment of the copy of the execution context on the secondary server.
    Type: Grant
    Filed: February 16, 2007
    Date of Patent: October 27, 2009
    Assignee: Symantec Corporation
    Inventors: Anurag Agarwal, Dharmesh Shah, Nagaraj Kalmala, Neelakandan Panchaksharam, Rajeev Bharadhwaj, Sameer Lokray, Srikanth Sm, Thomas Bean
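    Illustration: the buffering rule above (hold output produced during an epoch, release it only after the secondary holds a copy of the execution context) can be sketched as below. The class and the `replicate` callback are hypothetical; how the context is actually copied is outside this sketch.

```python
class EpochBuffer:
    """Holds output generated during the current epoch until it is safe to release."""

    def __init__(self):
        self._pending = []
        self.released = []

    def emit(self, data):
        # Output produced by the service is buffered, not sent.
        self._pending.append(data)

    def checkpoint(self, replicate):
        """Release buffered output only if replicate() reports success.

        replicate is a callable that returns True once the secondary
        server holds a copy of the primary's execution context.
        """
        if not replicate():
            return False
        self.released.extend(self._pending)
        self._pending.clear()
        return True
```

    Clients therefore never see output that the secondary could not reproduce after a failover.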
  • Patent number: 7606342
    Abstract: The tracking of the phase of a received signal having a known preamble is accomplished by the steps of: initializing a phase-locked loop in accordance with estimated phase parameters, which are generated during an estimation interval by processing samples of the known preamble; delaying the preamble; generating phase error parameters by processing samples of the delayed preamble; and training the phase locked loop by tracking the phase-tracked signal in accordance with the tracking error parameters during a training interval after the estimation interval. The timing of the sampling is likewise trained in a closed timing loop in accordance with timing error parameters generated during the training interval after the timing loop has been initialized by estimated timing parameters generated during the estimation interval. The duration of the delay of the preamble is one-half the duration of the estimation interval.
    Type: Grant
    Filed: April 5, 2006
    Date of Patent: October 20, 2009
    Assignee: L-3 Communications Titan Corporation
    Inventors: John Robert Wiss, Omer F. Acikel
  • Publication number: 20090259885
    Abstract: Systems and methods for redundancy management in fault tolerant computing are provided. The systems and methods generally relate to enabling the use of non-custom, off-the-shelf components and tools to provide redundant fault tolerant computing. The various embodiments described herein, generally speaking, use a decrementer register in a general purpose processor for synchronizing identical operations across redundant general purpose processors, execute redundancy management services in the kernels of commercial off-the-shelf real-time operating systems (RTOS) running on the general purpose processors, and use soft coded tables to schedule operations and assign redundancy management parameters across the general purpose processors.
    Type: Application
    Filed: April 14, 2008
    Publication date: October 15, 2009
    Applicant: The Charles Stark Draper Laboratory, Inc.
    Inventors: Brendan O'Connell, Joseph Kochocki
  • Patent number: 7596738
    Abstract: One embodiment of the present invention provides a system that determines the cause of a correctable memory error. First, the system detects a correctable error during an access to a memory location in a main memory by a first processor, wherein the correctable error is detected by error detection and correction circuitry. Next, the system reads tag bits for a cache line associated with the memory location, wherein the tag bits contain address information for the cache line, as well as state information indicating a coherency protocol state for the cache line. The system then tests the memory location by causing the first processor to perform read and write operations to the memory location to produce test results. Finally, the system uses the test results and the tag bits to determine the cause of the correctable error, if possible.
    Type: Grant
    Filed: November 17, 2004
    Date of Patent: September 29, 2009
    Assignee: Sun Microsystems, Inc.
    Inventors: Stephen A. Chessin, Tarik P. Soydan, Louis Y. Tsien
  • Patent number: 7583774
    Abstract: A clock synchronizer, for generating a local clock signal synchronized to a received clock signal, is described and claimed, along with a corresponding clock synchronization method. The clock synchronizer incorporates a reference oscillator providing a reference signal, and a synthesizer circuit arranged to synthesize a local clock signal from the reference signal. The synthesizer circuit comprises a phase-locked-loop circuit, including a phase detector receiving the reference signal, and a controllable divider arranged in a feedback path from a controlled oscillator to the phase detector, the divider being controllable to set a frequency division value N along the path to determine a ratio of the local clock frequency to the reference frequency. The clock synchronizer also incorporates a clock comparison circuit adapted to generate a digital signal indicative of an asynchronism between the local and remote clock signals. A control link is arranged to link the clock comparison circuit to the divider.
    Type: Grant
    Filed: November 15, 2004
    Date of Patent: September 1, 2009
    Assignee: Wolfson Microelectronics plc
    Inventor: Paul Lesso
  • Patent number: 7584388
    Abstract: An error notification method notifies errors generated in first and second processor systems to each processor within the first and second processor systems, in a computer system that includes the first processor system operable in a normal mode and the second processor system operable together with the first processor system in a mirror mode. The error notification method generates an error interrupt signal that indicates each error by a corresponding one of a plurality of error levels, reduces the error level of a corresponding error interrupt signal when the error within the first processor system is avoided in the mirror mode, and notifies the error to each processor within the first and second processor systems using the error interrupt signal.
    Type: Grant
    Filed: July 18, 2005
    Date of Patent: September 1, 2009
    Assignee: Fujitsu Limited
    Inventor: Jin Takahashi
  • Patent number: 7562244
    Abstract: In a method for data signal transfer across different clock-domains, including synchronization of a data signal with a current clock-domain where said data signal is processed, the processing of said data signal is started before the synchronization of said data signal is completed in said current clock-domain.
    Type: Grant
    Filed: May 4, 2004
    Date of Patent: July 14, 2009
    Assignee: Koninklijke Philips Electronics N.V.
    Inventor: Paul Wielage
  • Patent number: 7555674
    Abstract: A computer replication and/or recovery system and process comprising a recovery machine which rebuilds operating system (“OS”) disks for damaged computers from their backup images. The recovery processes are performed within the recovery machine, which is a separate machine from both the damaged and replacement machines. Rebuilt OS disks are then adapted to the replacement computers with different hardware. The recovery method of the present invention is a network independent solution.
    Type: Grant
    Filed: September 9, 2004
    Date of Patent: June 30, 2009
    Inventor: Chuan Wang
  • Patent number: 7552359
    Abstract: A computer system includes a plurality of systems configured to be connected to each other by links and to operate synchronously with each other. Each of said plurality of systems includes a fault tolerant controller, a CPU, a baseboard management controller and a plurality of hardware modules. The CPU is connected with the fault tolerant controller. The baseboard management controller is connected with the fault tolerant controller. The plurality of hardware modules is connected with the fault tolerant controller. When receiving a trouble which occurs in any of the plurality of systems, the fault tolerant controller outputs an interrupt regarding the trouble to at least one of the CPU and the baseboard management controller predetermined correspondingly to the trouble.
    Type: Grant
    Filed: December 20, 2005
    Date of Patent: June 23, 2009
    Assignee: NEC Corporation
    Inventor: Yasushi Takemori
  • Patent number: 7549082
    Abstract: A method and system of bringing processors to the same computational point. At least some of the illustrative embodiments are computer systems comprising a first processor executing a program, a second processor executing a duplicate copy of the program (but at different computational points in the program), and a shared main memory coupled to the first and second processors. When the processors each receive duplicate copies of an interrupt request, the processors are configured to bring their respective programs to the same computational points prior to servicing the interrupt request.
    Type: Grant
    Filed: February 3, 2006
    Date of Patent: June 16, 2009
    Assignee: Hewlett-Packard Development Company, L.P.
    Inventors: Dale E. Southgate, Mihai Damian, Peter A. Reynolds, William F. Bruckert, James S. Klecka
  • Patent number: 7549078
    Abstract: Providing redundancy between an active component and a standby component in a network router comprises maintaining a first route input information base associated with the active component, synchronizing with the first route information base a second route input information base associated with the standby component, generating a route output information base using the second route input information base, and comparing the generated route output information base, in the event of switchover of the standby component to an active mode, to a synchronized route output information base associated with the standby component which synchronized route output information base reflects routes known to have been shared with one or more peers by the active component prior to the switchover, and sharing and/or withdrawing routes as necessary to reflect any differences between the generated route output information base and the synchronized route output information base.
    Type: Grant
    Filed: January 31, 2006
    Date of Patent: June 16, 2009
    Assignee: Alcatel Lucent
    Inventors: Kendall Harvey, Paul Kwok
  • Patent number: 7546354
    Abstract: The present invention provides a scalable, highly available distributed network data storage system that efficiently and reliably provides network clients and application servers with access to large data stores, such as NAS units, and manages client and server requests for data from the data stores, thereby comprising a distributed storage manager. A storage manager constructed in accordance with the invention can receive and process network requests for data at a large, aggregated network data store, such as a collection of NAS units, and can manage data traffic between the network clients and NAS units.
    Type: Grant
    Filed: July 1, 2002
    Date of Patent: June 9, 2009
    Assignee: EMC Corporation
    Inventors: Chenggong Charles Fan, Srinivas M. Aji, Jehoshua Bruck
  • Patent number: 7543180
    Abstract: One embodiment of the present invention provides a system that enhances throughput and fault-tolerance in a parallel-processing system. During operation, the system first receives a task. Next, the system partitions N computing nodes into M set-aside nodes and N-M primary computing nodes, wherein M≥1. The system then processes the task in parallel across the N-M primary computing nodes. While doing so, the system proactively monitors the health of each of the N-M primary computing nodes. If the system detects a node in the N-M primary computing nodes to be at risk of failure, the system copies the portion of the task associated with the at-risk node to a subset of the M set-aside nodes. The system then processes the portion of the task in parallel across the subset of the M set-aside nodes while the N-M primary computing nodes continue executing.
    Type: Grant
    Filed: March 8, 2006
    Date of Patent: June 2, 2009
    Assignee: Sun Microsystems, Inc.
    Inventors: Kenny C. Gross, Alan Paul Wood
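    Illustration: the node partitioning and the copy of an at-risk node's work described above can be sketched as follows. This is a simplified model; the function names, the choice of the last M nodes as set-aside nodes, and the use of a single set-aside node per at-risk node are assumptions for the example.

```python
def partition(nodes, m):
    """Split nodes into (primary, set_aside), with at least one set-aside node."""
    if not 1 <= m < len(nodes):
        raise ValueError("m must satisfy 1 <= m < number of nodes")
    return nodes[:-m], nodes[-m:]


def shadow_at_risk(assignments, at_risk_node, set_aside):
    """Copy the at-risk node's portion of the task to a set-aside node.

    The original assignment is kept, since the primary nodes continue executing.
    """
    shadowed = dict(assignments)
    shadowed[set_aside[0]] = assignments[at_risk_node]
    return shadowed
```

    Because the work is copied rather than moved, the result survives whichever of the two nodes finishes first.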
  • Patent number: 7539897
    Abstract: The present invention has been made to realize access processing performed in accordance with synchronous/asynchronous state between processors in a fault tolerant system. In two systems that constitute a fault tolerant system, a router assigns, to an access packet transmitted from a CPU to an IO device, tag information including ID codes of access source and destination and information indicating whether the access packet is synchronous access. An access comparison section has buffers that retain the packets from the CPU on a system basis, a tag check section that determines whether each packet is synchronous packet access based on the tag information assigned to the packets retained in the buffers, and a comparison section that outputs the packet from one system to an IO IF and discards the packet from other system in the case where the packet is synchronous access.
    Type: Grant
    Filed: December 20, 2005
    Date of Patent: May 26, 2009
    Assignee: NEC Corporation
    Inventor: Fumitoshi Mizutani
  • Patent number: 7539899
    Abstract: A computer cloning system and process comprising a cloning machine which modifies a survived or reproduced operating system (“OS”) devices to adapt to new replacement hardware. The cloning processes are performed within the cloning machine, which is a separate machine from both the damaged and replacement machines. The modified OS device is then adapted to the replacement computer with different hardware. The cloning method of the present invention is a network independent solution. This invention may also be used for computer system hardware and/or software upgrades, computer testing, new computer installation, and system migration.
    Type: Grant
    Filed: December 5, 2006
    Date of Patent: May 26, 2009
    Inventor: Chuan Wang
  • Patent number: 7519856
    Abstract: There is provided a fault tolerant system capable of adequately performing error processing, synchronization processing, and resynchronization processing for realizing a fault tolerant function in accordance with the system state. The fault tolerant system comprises at least two systems including: a CPU subsystem; an IO subsystem connected to the CPU subsystem; an FT controller to be connected between the CPU subsystem and IO subsystem; and crosslinks connecting own system and other system through the FT controller. The CPU subsystem operates at the same timing with a CPU subsystem of other system in lock-step. The FT controller manages a plurality of system operations, according to which both systems perform error processing, duplication processing, and resynchronization processing for fault tolerant, by associating a plurality of states corresponding to the system operations with predetermined event signals.
    Type: Grant
    Filed: December 20, 2005
    Date of Patent: April 14, 2009
    Assignee: NEC Corporation
    Inventor: Fumitoshi Mizutani
  • Patent number: 7516359
    Abstract: According to one embodiment, a method comprises detecting a loss of lockstep (LOL) for a processor module. The method further comprises determining a type of LOL that is detected, and, based at least in part on the determined type of LOL, determining a responsive action to take for the LOL. According to one embodiment, a method comprises detecting a loss of lockstep (LOL) for a processor module. The method further comprises using information identifying at least one of type of the detected LOL and source of the detected LOL to determine a responsive action to take for the LOL.
    Type: Grant
    Filed: October 25, 2004
    Date of Patent: April 7, 2009
    Assignee: Hewlett-Packard Development Company, L.P.
    Inventors: Scott L. Michaelis, Anurupa Rajkumari, William B. McHardy
  • Patent number: 7516360
    Abstract: The present invention provides a system and method for the execution of jobs in a distributed computing architecture that uses worker clients which are characterized by a checkpointing mechanism component for generating checkpointing information being assigned to at least one worker client, at least one failover system being assigned to the worker client, a component (failover system selection component) for automatically assigning at least one existing or newly created failover system to the failure system being assigned to a worker client in the case said worker clients fails, wherein the assigned failover system provides all function components in order to take over the execution of the job when said assigned worker client fails, wherein the assigned failover system further includes at least a failover monitor component for detecting failover situations of said assigned worker client.
    Type: Grant
    Filed: September 9, 2004
    Date of Patent: April 7, 2009
    Assignee: International Business Machines Corporation
    Inventors: Utz Bacher, Oliver Benke, Boas Betzler, Thomas Lumpp, Eberhard Pasch
  • Publication number: 20090089613
    Abstract: A redundancy system that can perform synchronization even if a failure occurs to an application. According to the redundancy system of the present invention, a synchronization data memory area, a management bit map table having a flag created for each segment of the synchronization data memory area, and a management memory area for storing the starting address of the segment are set in each device. In the service application process, a service is performed using one or more segments, a flag corresponding to the segment is set, and synchronization information is written to the management memory each time the segment is written or overwritten. In the read process, each flag in the management bit map table is checked, and if a flag being set exists, the synchronization data is read from the segment corresponding to the synchronization information stored in the management memory, and the flag is reset.
    Type: Application
    Filed: November 26, 2008
    Publication date: April 2, 2009
    Applicant: Oki Electric Industry Co., Ltd.
    Inventor: Tomotake Koike
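    Illustration: the flag-per-segment scheme above can be sketched as below. This is a reduced model in which a list index stands in for the segment starting address kept in the management memory area; the class and method names are hypothetical.

```python
class SyncArea:
    """Synchronization data area with one dirty flag per segment."""

    def __init__(self, segment_count):
        self.segments = [None] * segment_count
        self.flags = [False] * segment_count  # management bit map table

    def write(self, index, data):
        # Service application process: store the data and set the flag.
        self.segments[index] = data
        self.flags[index] = True

    def drain(self):
        """Read process: collect every flagged segment and reset its flag."""
        out = []
        for index, dirty in enumerate(self.flags):
            if dirty:
                out.append((index, self.segments[index]))
                self.flags[index] = False
        return out
```

    Since flags are reset only by the read process, data written before an application failure can still be picked up on the next pass.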
  • Patent number: 7512837
    Abstract: A method for recovering lost cache capacity in a multi core chip having at least one defective core including identifying the cores contained in the chip that are viable cores and identifying at least one core contained in the chip that is defective. The method also includes identifying the cache memory local to the defective core and determining a redistribution of the cache resources local to the at least one defective core among the viable cores. The method also features dividing the cache memory local to the at least one defective core according to the redistribution determination and determining the address information associated with the cache memory local to the at least one defective core. The method also features providing the address information associated with the cache memory associated with the defective core to at least one of the viable cores, facilitating the supplementation of the cache memory local to the viable cores with the cache memory associated with the defective core.
    Type: Grant
    Filed: April 4, 2008
    Date of Patent: March 31, 2009
    Assignee: International Business Machines Corporation
    Inventors: Diane Flemming, Ghadier R. Gholami, Octavian F. Herescu, William A. Maron, Mysore M. Srinivas
  • Patent number: 7509544
    Abstract: A data repair and synchronization method of dual flash ROM is provided, which includes a first flash ROM and a second flash ROM that store the same system data, wherein one of the first flash ROM and the second flash ROM is used to perform a data repair on the other flash ROM with damaged data and perform a data synchronization between the two flash ROMs, thereby ensuring that once the data in one flash ROM is damaged during the system operation, the complete system data stored in the other flash ROM is used to recover the damaged operating system and the files in the system. Meanwhile, through performing the data synchronization periodically, important configuration files in the system stored in the two flash ROMs are kept to be updated and completed.
    Type: Grant
    Filed: February 26, 2007
    Date of Patent: March 24, 2009
    Assignee: Inventec Corporation
    Inventors: Nan Zhang, Yan-Peng Yang, Tom Chen, Win-Harn Liu
  • Patent number: 7509375
    Abstract: The present invention relates to a global management system for a multimodule, multiprocessor machine (PK). The system is characterized in that it comprises an independent module (SM) dedicated to the global management of a plurality of first modules (M1 through Mn), the independent module (SM) being connected to each management tool (BUMP) for each of the first modules (M1 through Mn) by a first specific link supporting a given communication protocol that makes it possible to manage each of the first modules at the startup of the machine, during the running of the machine, and after the machine stops running, the independent module (SM) being connected to each of the first modules via a second link, and the independent module also being globally connected to the multimodule machine (PK) via a physical link of a local area network (LAN) linked to at least two of the first modules (M2 and M3).
    Type: Grant
    Filed: May 2, 2007
    Date of Patent: March 24, 2009
    Assignee: Bull SAS
    Inventors: Caudrelier Christian, Olivares Lorenzo, Reix Tony
  • Patent number: 7502973
    Abstract: A method and device for monitoring a distributed system made up of a plurality of users that are connected by one bus system are provided, in which distributed system at least a number of the users are provided as monitoring users. The process data of at least one monitored user are filed in data areas of memory units of the bus system, to which the monitoring users have access, and the process data are evaluated by the monitoring users.
    Type: Grant
    Filed: May 4, 2004
    Date of Patent: March 10, 2009
    Assignee: Robert Bosch GmbH
    Inventors: Dietmar Baumann, Dirk Hofmann, Herbert Vollert, Willi Nagel, Andreas Henke, Bertram Foitzik, Bernd Goetzelmann
  • Patent number: 7502958
    Abstract: According to at least one embodiment, a method comprises detecting loss of lockstep for a pair of processors. The method further comprises triggering, by firmware, an operating system to idle the processors, and recovering, by the firmware, lockstep between the pair of processors. After lockstep is recovered between the pair of processors, the method further comprises triggering, by the firmware, the operating system to recognize the processors as being available for receiving instructions.
    Type: Grant
    Filed: October 25, 2004
    Date of Patent: March 10, 2009
    Assignee: Hewlett-Packard Development Company, L.P.
    Inventors: Scott L. Michaelis, Anurupa Rajkumari, William B. McHardy
  • Patent number: 7502959
    Abstract: A system comprises a non-volatile memory and a plurality of processors. The non-volatile memory stores an error handling routine. Each processor of the plurality of processors accesses the error handling routine on detecting an error and, on certain errors, signals the remaining processors to enter a rendezvous state. In the rendezvous state, a single processor takes over and performs error handling.
    Type: Grant
    Filed: July 28, 2003
    Date of Patent: March 10, 2009
    Assignee: Intel Corporation
    Inventors: Suresh Marisetty, George Thangadurai, Mani Ayyar
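    The rendezvous described in this abstract can be sketched with threads standing in for processors. This is only an illustration, not the patented mechanism: the function name `run_rendezvous`, the use of `threading.Barrier`, and the rule that the barrier's index-0 thread becomes the single handler are all assumptions, since the abstract does not say how the handling processor is chosen.

```python
import threading


def run_rendezvous(num_cpus, failing_cpu):
    """Simulate CPUs entering a rendezvous state; return the IDs that handled the error."""
    if not 0 <= failing_cpu < num_cpus:
        raise ValueError("failing_cpu must identify one of the simulated CPUs")

    rendezvous_requested = threading.Event()  # hypothetical "enter rendezvous" signal
    barrier = threading.Barrier(num_cpus)     # all CPUs meet here
    handled_by = []

    def cpu(cpu_id):
        if cpu_id == failing_cpu:
            # The CPU that detects the error signals the remaining CPUs.
            rendezvous_requested.set()
        rendezvous_requested.wait()
        # wait() hands each thread a unique index in range(num_cpus).
        arrival = barrier.wait()
        if arrival == 0:
            # Assumption: the CPU that draws index 0 performs error handling alone.
            handled_by.append(cpu_id)
        # The other CPUs stay parked here until the handler has finished.
        barrier.wait()

    threads = [threading.Thread(target=cpu, args=(i,)) for i in range(num_cpus)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return handled_by
```

    Calling `run_rendezvous(4, 2)` returns a one-element list; which CPU ID it holds depends on thread scheduling.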
  • Patent number: 7502956
    Abstract: An information processing apparatus includes a plurality of computing units. At least one of the computing units includes a recording unit that records a status of an error occurrence in each of the computing units. Each of the computing units includes an error notifying unit that, when an error occurs in that computing unit, notifies at least one of the computing units that includes the recording unit of the error occurrence.
    Type: Grant
    Filed: November 10, 2004
    Date of Patent: March 10, 2009
    Assignee: Fujitsu Limited
    Inventors: Jin Takahashi, Seishi Okada
  • Patent number: 7502954
    Abstract: A data storage system includes a disk drive array including a plurality of disk drives; a first storage processor for controlling the operation of the data storage system; a second storage processor for controlling the operation of the data storage system; a first arbiter for controlling communication of data from the first storage processor and the second storage processor to a first group of disk drives of the disk drive array; and a second arbiter for controlling communication of data from the first storage processor and the second storage processor to a second group of disk drives of the disk drive array. Selected data is redundantly stored on disk drives in the first group of disk drives and the second group of disk drives, such that, upon failure of the first arbiter, the selected data is available to the first storage processor and the second storage processor through the second arbiter.
    Type: Grant
    Filed: May 26, 2004
    Date of Patent: March 10, 2009
    Assignee: EMC Corporation
    Inventors: Stephen E. Strickland, Timothy Dorr, John V. Burroughs, Michael A. Faulkner, Steven D. Sardella
  • Publication number: 20090063898
    Abstract: Recovery circuits react to errors in a processor core by waiting for an error-free completion of any pending store-conditional instruction or a cache-inhibited load before ceasing to checkpoint or backup progress of a processor core. Recovery circuits remove the processor core from the logical configuration of the symmetric multiprocessor system, potentially reducing propagation of errors to other parts of the system. The processor core is reset and the checkpointed values may be restored to registers of the processor core. The processor core is not only allowed to resume execution just prior to the instructions that failed to execute correctly the first time, but is also allowed to operate in a reduced execution mode for a preprogrammed number of instruction groups. If the preprogrammed number of instruction groups execute without error, the processor core is allowed to resume normal execution.
    Type: Application
    Filed: November 13, 2008
    Publication date: March 5, 2009
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Susan Elizabeth Eisen, Hung Qui Le, Michael James Mack, Dung Quoc Nguyen, Jose Angel Paredes, Scott Barnett Swaney
  • Patent number: 7499865
    Abstract: Environment asset inventories of computing environment assets are arranged into computing environments, and at least one collector interface is disposed between the environments to detect movement of an asset and to produce asset movement reports. Upon receipt of a movement report, one or more backup copies of an environment asset inventory are accessed and compared to a modified environment asset inventory. A history of each asset in each inventory is maintained. A history report regarding the life of an asset from first introduction into an environment throughout movements between environments is produced, including any applicable patches and upgrades configured into an environment. A discrepancy report is generated including assets, locations, status, and revision level indicators.
    Type: Grant
    Filed: December 17, 2004
    Date of Patent: March 3, 2009
    Assignee: International Business Machines Corporation
    Inventors: Vijay Kumar Aggarwal, Craig Lawton, Christopher Andrew Peters, Puthukode G. Ramachandran, Lorin Evan Ullmann, John Whitfield
  • Patent number: 7500139
    Abstract: A fault-tolerant computer has a pair of duplex systems having respective CPU subsystems that are operable identically in lock-step synchronism. Each of the duplex systems has a CPU, a main storage unit, a CPU bus controller, and a DMA controller. The CPU and the main storage unit are included in each of the CPU subsystems. The CPU bus controller continuously operates the CPU of its own system even if it detects an asynchronous operation while the CPU subsystems are operating in synchronism with each other. Even if the asynchronous operation is detected, the DMA controller holds a DMA transfer process for transferring data stored in the main storage unit of its own system or the other system to the main storage unit of the other system or its own system after the asynchronous operation is detected until a certain time is reached.
    Type: Grant
    Filed: December 19, 2005
    Date of Patent: March 3, 2009
    Assignee: NEC Corporation
    Inventor: Fumitoshi Mizutani
  • Patent number: 7496786
    Abstract: A system is provided for rapidly synchronizing two or more processing elements in a fault-tolerant computing system. Embodiments of this system allow for the rapid synchronization of two processing elements through partial copies of the contents of memory associated with each processing element.
    Type: Grant
    Filed: January 10, 2006
    Date of Patent: February 24, 2009
    Assignee: Stratus Technologies Bermuda Ltd.
    Inventors: Simon Graham, Dan Lussier, Tim Wegner, Jeffrey Somers, Steven Haid, John W. Edwards, Jr.
  • Patent number: 7493515
    Abstract: Assigning a processor to a logical partition in a computer supporting multiple logical partitions includes assigning priorities to partitions, detecting a checkstop of a failing processor of a partition, retrieving the failing processor's state, replacing by a hypervisor the failing processor with a replacement processor from a partition having a priority lower than the priority of the partition of the failing processor, and assigning the retrieved state of the failing processor as the state of the replacement processor.
    Type: Grant
    Filed: September 30, 2005
    Date of Patent: February 17, 2009
    Assignee: International Business Machines Corporation
    Inventors: William J. Armstrong, Naresh Nayar, Gary R. Ricard
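    The replacement step in this abstract can be sketched as a small data model. The class names, the convention that a larger number means higher priority, and the choice of the lowest-priority eligible partition as donor are assumptions made for illustration; the abstract only requires that the donor partition have a lower priority than the partition of the failing processor.

```python
from dataclasses import dataclass, field


@dataclass
class Processor:
    cpu_id: int
    state: dict = field(default_factory=dict)


@dataclass
class Partition:
    name: str
    priority: int  # assumption: larger value means higher priority
    processors: list = field(default_factory=list)


def replace_failing(partitions, owner, failing):
    """Swap `failing` in `owner` for a processor taken from a lower-priority partition.

    Returns the replacement processor, or None if no lower-priority
    partition has a processor to give up.
    """
    donors = [p for p in partitions
              if p.priority < owner.priority and p.processors]
    if not donors:
        return None
    # Assumption: take from the lowest-priority donor first.
    donor = min(donors, key=lambda p: p.priority)
    replacement = donor.processors.pop()
    # Assign the retrieved state of the failing processor to the replacement.
    replacement.state = dict(failing.state)
    slot = owner.processors.index(failing)
    owner.processors[slot] = replacement
    return replacement
```

    For example, with a priority-10 partition whose only processor checkstops and a priority-1 partition holding one processor, `replace_failing` moves the priority-1 processor into the priority-10 partition and gives it the failed processor's saved state.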
  • Patent number: 7478272
    Abstract: Replacing a failing physical processor in a computer supporting multiple logical partitions, where the logical partitions include dedicated partitions and shared processor partitions, the dedicated partitions are supported by virtual processors having assigned physical processors, and the shared processor partitions are supported by pools of virtual processors. The pools of virtual processors have assigned physical processors.
    Type: Grant
    Filed: September 30, 2005
    Date of Patent: January 13, 2009
    Assignee: International Business Machines Corporation
    Inventors: William J. Armstrong, Naresh Nayar, Gary R. Ricard
  • Patent number: 7478274
    Abstract: A duplex system has duplicated processor devices. Each of the processor devices has a first copying section which writes data written in a memory of the processor device, into a same address of a memory of the other processor device, a second copying section which divides all data in the memory of the processor device to sequentially write all data into the memory of the other processor device periodically, an error detecting section which checks the data written in the memory of the processor device, and an error check register which sets an error bit when the error detecting section detects an error. After the first copying section and the second copying section write data into a memory of the standby side processor device, the control side processor device checks an error bit of an error check register of the standby side processor device.
    Type: Grant
    Filed: March 30, 2006
    Date of Patent: January 13, 2009
    Assignee: Yokogawa Electric Corporation
    Inventors: Jun Nishida, Toshio Hatano
  • Patent number: 7474581
    Abstract: Rank numbers specified by a second counter are refreshed in sequence by using a count value of a first counter which is initialized by a synchronous reset signal and counts timing for performing refresh, and the rank numbers specified by a refresh rank control unit are continuously refreshed in sequence in the case where the synchronous reset signal is active.
    Type: Grant
    Filed: January 26, 2007
    Date of Patent: January 6, 2009
    Assignee: NEC Corporation
    Inventor: Yukihiro Tanaka
  • Patent number: 7475284
    Abstract: A redundancy system that can perform synchronization even if a failure occurs to an application. According to the redundancy system of the present invention, a synchronization data memory area, a management bit map table having a flag created for each segment of the synchronization data memory area, and a management memory area for storing the starting address of the segment are set in each device. In the service application process, a service is performed using one or more segments, a flag corresponding to the segment is set, and synchronization information is written to the management memory each time the segment is written or overwritten. In the read process, each flag in the management bit map table is checked, and if a flag being set exists, the synchronization data is read from the segment corresponding to the synchronization information stored in the management memory, and the flag is reset.
    Type: Grant
    Filed: March 17, 2006
    Date of Patent: January 6, 2009
    Assignee: Oki Electric Industry Co., Ltd.
    Inventor: Tomotake Koike
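    The flag-and-segment bookkeeping in this abstract can be sketched as follows. The class `SyncArea`, its method names, and the fixed segment size are hypothetical; the sketch only mirrors the three structures the abstract names (synchronization data memory area, management bit map table, management memory area) and the set-on-write, reset-on-read behavior of the flags.

```python
class SyncArea:
    """Toy model of a synchronization data area managed by per-segment flags."""

    def __init__(self, num_segments, segment_size):
        self.segment_size = segment_size
        # Synchronization data memory area, divided into equal segments.
        self.data = bytearray(num_segments * segment_size)
        # Management bit map table: one flag per segment.
        self.flags = [False] * num_segments
        # Management memory area: starting address of each written segment.
        self.start_addr = [None] * num_segments

    def write(self, seg, payload):
        """Service side: write or overwrite a segment and mark it for synchronization."""
        if len(payload) > self.segment_size:
            raise ValueError("payload does not fit in one segment")
        start = seg * self.segment_size
        self.data[start:start + len(payload)] = payload
        self.flags[seg] = True
        self.start_addr[seg] = start

    def collect(self):
        """Read side: return data of every flagged segment and reset its flag."""
        pending = []
        for seg, flagged in enumerate(self.flags):
            if flagged:
                start = self.start_addr[seg]
                chunk = bytes(self.data[start:start + self.segment_size])
                pending.append((seg, chunk))
                self.flags[seg] = False
        return pending
```

    Because the read side depends only on the flags and the stored addresses, it can pick up pending segments without any cooperation from the code that wrote them.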
  • Patent number: 7472034
    Abstract: A system and method for test generation for system level verification using parallel algorithms are provided. The present invention generates test patterns for system level tests by exploiting the scalability of parallel algorithms while allowing for data set coloring and expected result checking. Based on the characteristics of the system being tested an iterative parallel algorithm is selected from a plurality of possible parallel algorithms. The selected parallel algorithm is then separated into separate program statements for execution by a plurality of processors. A serial version of the selected algorithm is executed to generate a set of expected results. The devised parallel version of the selected algorithm is then run to generate a set of test result data which is compared to the set of expected results. If the two sets of data match, it is determined that the system is operating correctly.
    Type: Grant
    Filed: June 5, 2007
    Date of Patent: December 30, 2008
    Assignee: International Business Machines Corporation
    Inventors: Sanjay Gupta, Steven L. Roberts, Christopher J. Spandikow
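    The serial-versus-parallel comparison in this abstract can be illustrated with a deliberately simple workload. The sum-of-squares computation, the function names, and the use of a thread pool are assumptions chosen for brevity; the abstract does not name a specific algorithm, and it selects one based on the system under test.

```python
from concurrent.futures import ThreadPoolExecutor


def serial_sum_of_squares(values):
    """Serial version: produces the expected result."""
    return sum(v * v for v in values)


def parallel_sum_of_squares(values, workers):
    """Parallel version: split the data, compute partial sums, then combine."""
    # Strided slices give each worker a disjoint share of the input.
    chunks = [values[i::workers] for i in range(workers)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = list(pool.map(serial_sum_of_squares, chunks))
    return sum(partials)


def system_passes(values, workers=4):
    """Compare the parallel test result against the serial expected result."""
    expected = serial_sum_of_squares(values)
    actual = parallel_sum_of_squares(values, workers)
    return expected == actual
```

    A mismatch between the two results would point to a fault somewhere in the parallel path, which is the condition the described test is meant to expose.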
  • Patent number: 7467327
    Abstract: A method and system of aligning the execution points of duplicate copies of a user program by exchanging information about instructions executed. At least some of the exemplary embodiments may be a method comprising operating duplicate copies of a user program in a first and second processor, allowing at least one of the user programs to execute until retired instruction counter values in each processor are substantially the same, and then executing a number of instructions of each user program. Of the instructions executed, at least some of the instructions are decoded and the inputs of each decoded instruction determined (the decoding occurring substantially simultaneously with the executing in each processor).
    Type: Grant
    Filed: January 25, 2005
    Date of Patent: December 16, 2008
    Assignee: Hewlett-Packard Development Company, L.P.
    Inventors: Paul Del Vigna, Jr., Robert L. Jardine
  • Patent number: 7467326
    Abstract: A fault-tolerant or self-correcting computer system is disclosed. The computer system is provided with various sets of protections against failures that may be caused by space radiation, for example. Improved reliability of the system is achieved by scrubbing of the components on a regular schedule, rather than waiting for an error to be detected. Thus, errors that may go undetected for an extended period are not allowed to propagate and further damage the system. Three or more processors are provided to operate in parallel, and a controller is provided to receive signals from the processors and, using voting logic, determine a majority signal value. In this manner, the controller can detect an error when a signal from one of the processors differs from the majority signal. The system is also provided with a scrubbing module for resynchronizing the processors after a predetermined milestone has been reached.
    Type: Grant
    Filed: April 17, 2003
    Date of Patent: December 16, 2008
    Assignee: Maxwell Technologies, Inc.
    Inventors: Robert Allen Hillman, Mark Steven Conrad
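    The voting step in this abstract can be sketched in a few lines. The function name `majority_vote` and its return shape are assumptions for illustration; the sketch shows only how a majority value exposes a dissenting processor, not the controller hardware or the scrubbing module.

```python
from collections import Counter


def majority_vote(outputs):
    """Return (majority_value, dissenting_indices) for three or more processor outputs."""
    if len(outputs) < 3:
        raise ValueError("voting needs at least three processors")
    value, count = Counter(outputs).most_common(1)[0]
    if count <= len(outputs) // 2:
        # No value is held by more than half of the processors.
        raise RuntimeError("no majority value")
    # Any processor whose output differs from the majority is flagged as in error.
    dissenters = [i for i, out in enumerate(outputs) if out != value]
    return value, dissenters
```

    With outputs `[5, 5, 7]`, the call returns `(5, [2])`: the majority value is 5 and the processor at index 2 is the one that disagrees.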
  • Patent number: 7461291
    Abstract: A method of providing arbitration for redundant controllers is provided, which includes: providing logic for automatically determining which controller of redundant controllers is active controller, wherein outputs of the redundant controllers are electrically hardwired together and provided as input to a device; and providing first and second hardware arbitration components for first and second controllers of the redundant controllers, each hardware arbitration component ensuring that outputs of the respective controller are enabled only when the associated controller is active controller. The first and second hardware arbitration components are separate hardware components which communicate and cooperate as a distributed hardware interlock mechanism that ensures outputs of only one controller are enabled at a time.
    Type: Grant
    Filed: June 19, 2007
    Date of Patent: December 2, 2008
    Assignee: International Business Machines Corporation
    Inventors: Gary D. Anderson, Gerald J. Fahr, Raymond J. Harrington
  • Patent number: 7460989
    Abstract: A method is provided, wherein a virtual internal master clock is used in connection with a RISC CPU. The RISC CPU comprises a number of concurrently operating function units, wherein each unit runs according to its own clocks, including multiple-stage totally unsynchronized clocks, in order to process a stream of instructions. The method includes the steps of generating a virtual model master clock having a clock cycle, and initializing each of the function units at the beginning of respectively corresponding processing cycles. The method further includes operating each function unit during a respectively corresponding processing cycle to carry out a task with respect to one of the instructions, in order to produce a result. Respective results are all evaluated in synchronization, by means of the master clock. This enables the instruction processing operation to be modeled using a sequential computer language, such as C or C++.
    Type: Grant
    Filed: October 14, 2004
    Date of Patent: December 2, 2008
    Assignee: International Business Machines Corporation
    Inventor: Oliver Keren Ban
  • Patent number: 7454533
    Abstract: A host computer is connected to logical disks via controllers, and accesses the logical disks. When maintenance work is performed on a first controller, information indicating that the first controller is undergoing maintenance work is stored in the other controllers. When an access is made to a logical disk designating a path including the first controller, a redundancy driver of the host computer can discriminate the cause for an access failure based on the information stored in the other controllers even if the access fails.
    Type: Grant
    Filed: February 4, 2005
    Date of Patent: November 18, 2008
    Assignee: NEC Corporation
    Inventor: Kenichi Miki