Forward Recovery (e.g., Redoing Committed Action) Patents (Class 714/16)
-
Publication number: 20140019803
Abstract: A computer system includes a simultaneous multi-threading processor and memory in operable communication with the processor. The processor is configured to perform a method including running multiple threads simultaneously, detecting a hardware error in one or more hardware structures of the processing circuit, and identifying one or more victim threads of the multiple threads. The processor is further configured to identify a plurality of hardware structures associated with execution of the one or more victim threads, isolate the one or more victim threads from the rest of the multiple threads by preventing access to the plurality of hardware structures by the multiple threads, flush the one or more victim threads by resetting hardware states of the plurality of hardware structures, and restore the one or more victim threads by restoring the plurality of hardware structures to a known safe state.
Type: Application
Filed: July 13, 2012
Publication date: January 16, 2014
Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
Inventors: Fadi Y. Busaba, Steven R. Carlough, Christopher A. Krygowski, Brian R. Prasky, Chung-Lung K. Shum
-
Publication number: 20130339788
Abstract: A dual redundant process controller is provided. The controller comprises a first processor, memory, and instance of a process control application stored in the first memory. The controller further comprises a second processor, memory, and instance of the process control application stored in the second memory.
Type: Application
Filed: July 18, 2013
Publication date: December 19, 2013
Applicant: Invensys Systems, Inc.
Inventors: Alan A. Gale, Andrew L. Kling, Mark E. Timperley, Lawrence T. Bass, John J. Lavallee, George W. Cranshaw, Alan M. Foskett
-
Publication number: 20130311825
Abstract: A communication system, method, and components are described. Specifically, the method described herein provides the ability for an application sequence of a communication session to be reconstructed during the communication session and even though SIP standards dictate that the reconstruction of the application sequence should be denied and the session should be terminated.
Type: Application
Filed: May 21, 2012
Publication date: November 21, 2013
Applicant: AVAYA INC.
Inventors: Gordon R. Brunson, Mehmet C. Balasaygun, Harsh V. Mendiratta
-
Patent number: 8589730
Abstract: Systems and methods are provided for handling errors during device bootup from a non-volatile memory (“NVM”). A NVM interface of an electronic device can be configured to detect errors and maintain an error log in volatile memory while the device is being booted up. Once device bootup has completed, a NVM driver of the electronic device can be configured to correct the detected errors using the error log. For example, the electronic device can move data to more reliable blocks and/or retire blocks that are close to failure, thereby improving overall device reliability.
Type: Grant
Filed: August 31, 2010
Date of Patent: November 19, 2013
Assignee: Apple Inc.
Inventors: Matthew Byom, Kenneth Herman, Nir J. Wakrat, Daniel J. Post
-
Patent number: 8578144
Abstract: Checkpoint snapshots of segments of system memory are taken while an operating system is booting in a computer system. The segments of system memory are stored in non-volatile memory as hibernation files. In response to detecting a request for a system reboot of the OS, an affected hibernation file, which corresponds to an affected segment of system memory that will change during the system reboot of the OS, is identified. A restoration of the system memory via a wake-up from hibernation is then initiated. The wake-up from hibernation proceeds until the affected hibernation file is reached, such that initial steps in the system reboot are bypassed. Thereafter, subsequent steps, which are after the bypassed initial steps in the system reboot, are executed.
Type: Grant
Filed: August 4, 2010
Date of Patent: November 5, 2013
Assignee: International Business Machines Corporation
Inventors: Fred A. Bower, III, Michael H. Nolterieke, William G. Pagan
-
Publication number: 20130290779
Abstract: Aspects of the subject matter described herein relate to auditing operations. In aspects, operations may be audited synchronously and/or asynchronously to one or more audit targets. When auditing synchronously, audit records may be written synchronously to an audit target. When auditing asynchronously, a buffer may be used to store audit records until the audit records are flushed to an audit target. If an error occurs in auditing, a policy may be evaluated to determine how to respond. One exemplary response includes failing an operation that triggered a subsequent audit record. Furthermore, if a buffer was unable to be copied to an audit target, the contents of the buffer may be preserved and one or more retries may be attempted to copy the buffer to the audit target.
Type: Application
Filed: April 30, 2012
Publication date: October 31, 2013
Applicant: MICROSOFT CORPORATION
Inventors: Zubair Ahmed Mughal, Jack S. Richins, Jerome R. Halmans
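Illustrative sketch (not taken from the application): the buffer-preservation and retry behavior described in this abstract might look roughly like the following in Python. The function name, the `write_to_target` callback, the `max_retries` parameter, and the use of `OSError` to signal a failed copy are all assumptions made for the example.

```python
def flush_with_retries(buffer, write_to_target, max_retries=3):
    """Copy buffered audit records to the target, preserving them on failure.

    All names here are hypothetical; the abstract does not define an API.
    Returns True once the copy succeeds, False if every attempt failed.
    """
    for _ in range(1 + max_retries):      # one initial attempt plus retries
        try:
            write_to_target(list(buffer))  # pass a copy; keep the original
        except OSError:
            continue                       # buffer contents stay intact
        buffer.clear()                     # flushed: safe to drop the records
        return True
    return False                           # caller's policy decides what to do next
```

A `False` return leaves the buffer untouched, so a policy layer could fail the triggering operation or try again later.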
-
Patent number: 8572331
Abstract: A method is disclosed for reliably updating a data group in a read-before-write data replication environment. The method reliably updates the data group by receiving an updated data group sent from a first storage medium to a second storage medium, comparing the updated data group with a previous data group previously existing on the second storage medium and writing the updated data group to the second storage medium. The read-before-write and differencing method disclosed maintain reliability by storing multiple copies of changes made to the second storage medium during and after the write process.
Type: Grant
Filed: October 30, 2008
Date of Patent: October 29, 2013
Assignee: International Business Machines Corporation
Inventors: Henry Esmond Butterworth, Kenneth Fairclough Day, III, Philip Matthew Doatmas, John Jay Wolfgang, Vitaly Zautner, Aviad Zlotnick
-
Publication number: 20130283097
Abstract: Methods, systems, and programming for distributing tasks to a network of machines are disclosed. A plurality of tasks is received, each task having an associated priority level. Each of the plurality of tasks is assigned to a priority line of a plurality of priority lines based on the associated priority level of each of the plurality of tasks. A distribution strategy is determined for the plurality of tasks based on an analysis of at least one worker machine. A group of tasks is scheduled from the plurality of priority lines to a gateway line based on the distribution strategy. Tasks are pushed from the gateway line to the at least one worker machine to process the tasks. The progress of tasks processed by worker machines is monitored and results of tasks are fetched and delivered to users of user devices.
Type: Application
Filed: April 23, 2012
Publication date: October 24, 2013
Applicant: Yahoo! Inc.
Inventors: Zhongqian Chen, Xiaobing Han, Hui Wu, Hang Su, Shenghong Zhu
-
Patent number: 8566642
Abstract: A storage controller changes a block size to carry out a shredding process. A data shredder uses a large block size BSZ1 set by a block size setting part to write shredding data in a storage area of a disk drive and erase data stored therein. An error arising during the writing operation of the shredding data is detected by an error detecting part. When the error is detected, the block size setting part sets the block size smaller by one stage than the initial block size to the data shredder. Every time the error arises, the block size used in the shredding process is diminished. Thus, the number of times of writings of the shredding data is reduced as much as possible to improve a processing speed and erase the data of a wide range as much as possible.
Type: Grant
Filed: January 10, 2011
Date of Patent: October 22, 2013
Assignee: Hitachi, Ltd.
Inventor: Mao Ohara
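Illustrative sketch (not taken from the patent): one way to picture the staged block-size reduction is the Python loop below. The function name, the `block_sizes` list of stages, the `write_block` callback, and the choice to skip a region that still fails at the smallest stage are assumptions for the example; the abstract does not say what happens at the smallest size.

```python
def shred(total, block_sizes, write_block):
    """Overwrite `total` bytes, stepping down one block-size stage per write error.

    Hypothetical helper: `block_sizes` is ordered largest first, and
    `write_block(offset, size)` raises OSError when a write fails.
    Returns (number of successful writes, list of skipped (offset, size) regions).
    """
    stage, offset, writes, skipped = 0, 0, 0, []
    while offset < total:
        size = min(block_sizes[stage], total - offset)
        try:
            write_block(offset, size)
            writes += 1
            offset += size
        except OSError:
            if stage + 1 < len(block_sizes):
                stage += 1                      # retry the same offset with a smaller block
            else:
                skipped.append((offset, size))  # assumption: give up on this region
                offset += size
    return writes, skipped
```

Starting with the largest block keeps the write count low on healthy media, while the smaller stages limit how much data goes unerased around a bad region.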
-
Patent number: 8566641
Abstract: Among other aspects disclosed are a method and system for processing a batch of input data in a fault tolerant manner. The method includes reading a batch of input data including a plurality of records from one or more data sources and passing the batch through a dataflow graph. The dataflow graph includes two or more nodes representing components connected by links representing flows of data between the components. At least one but fewer than all of the components includes a checkpoint process for an action performed for each of multiple units of work associated with one or more of the records. The checkpoint process includes opening a checkpoint buffer stored in non-volatile memory at the start of processing for the batch.
Type: Grant
Filed: June 14, 2012
Date of Patent: October 22, 2013
Assignee: Ab Initio Technology LLC
Inventors: Bryan Phil Douros, Matthew Darcy Atterbury, Tim Wakeling
-
Publication number: 20130275805
Abstract: An embodiment of the invention is directed to a method associated with a node comprising a hypervisor and guest VMs, each guest VM being managed by the hypervisor and disposed to run applications, the node being joined with other nodes to form an HA cluster. The method includes establishing an internal bidirectional communication channel between each guest VM and the hypervisor, and further includes sending commands and responses thereto through the channel, wherein respective commands manage a specified application running on the given guest VM. The messages are selectively monitored, to detect a failure condition associated with the specified application running on the given guest VM. Responsive to detecting a failure condition, action is taken to correct the failure condition, wherein the action includes sending at least one command through the internal channel from the hypervisor to the given guest VM.
Type: Application
Filed: August 20, 2012
Publication date: October 17, 2013
Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
Inventors: Richard E. Harper, Marcel Mittelstaedt, Markus Mueller, Lisa F. Spainhower
-
Publication number: 20130275806
Abstract: A method for performing error recovery that includes creating, by a processor, a recovery checkpoint. The processor is dynamically switched into a non-recoverable processing mode of operation based on creating the software recovery checkpoint. The non-recoverable processing mode of operation is a mode in which a subset of hardware error recovery resources are powered-down or re-purposed for instruction processing. It is determined, during the non-recoverable processing mode of operation, that a new software recovery checkpoint is required. Based on the determining that a new software recovery checkpoint is required, the processor is dynamically switched into a recoverable processing mode of operation. The recoverable processing mode of operation is a mode in which hardware error recovery resources, including at least one of the hardware error recovery resources in the subset, are purposed for hardware error recovery operations.
Type: Application
Filed: March 5, 2013
Publication date: October 17, 2013
Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
Inventors: Fadi Y. Busaba, Steven R. Carlough, Christopher A. Krygowski, Brian R. Prasky, Chung-Lung K. Shum
-
Publication number: 20130275807
Abstract: The installation of multiple applications by an installer is executed in a mode that does not display an error message in a display device. Upon an installation performed by the installer ending, the result of the installation performed by the installer is determined. As a result of the determination, an installer that failed at the installation is caused to re-execute the installation of the application whose installation failed in a mode that displays an error message in the display device. As a result of the re-execution, an error message is displayed in the display device by the installer that failed at the installation.
Type: Application
Filed: June 7, 2013
Publication date: October 17, 2013
Inventor: Yousuke SUGAI
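Illustrative sketch (not taken from the application): the two-pass flow in this abstract, a quiet first pass followed by a verbose re-run of only the failed installers, could be outlined in Python as follows. The `installers` mapping and the `show_errors` keyword are hypothetical names chosen for the example.

```python
def install_all(installers):
    """Run every installer quietly, then re-run the failures with errors shown.

    `installers` maps an application name to a callable that accepts a
    `show_errors` keyword and returns True on success (an assumed interface).
    Returns the names of the applications whose first installation failed.
    """
    # First pass: no error messages are displayed.
    failed = [name for name, run in installers.items()
              if not run(show_errors=False)]
    # Second pass: only the failed installers run again, this time verbosely.
    for name in failed:
        installers[name](show_errors=True)
    return failed
```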
-
Publication number: 20130262925
Abstract: Techniques for rescheduling a failed backup job are described in various implementations. A method that implements the techniques may include identifying a failed instance of a backup job, and determining an estimated amount of time to complete a rescheduled execution of the failed instance. The method may also include determining an available window of time in a backup schedule that equals or exceeds the estimated amount of time to complete the rescheduled execution, and rescheduling the failed instance for execution during the available window of time.
Type: Application
Filed: March 28, 2012
Publication date: October 3, 2013
Inventors: Hari Dhanalakoti, Sreekanth Gopisetty
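Illustrative sketch (not taken from the application): the window-selection step could be approximated in Python as below. The representation of free windows as `(start, end)` pairs in minutes, the first-fit choice, and the function name are assumptions for the example.

```python
def find_window(free_windows, estimate):
    """Pick the first free window long enough for the rescheduled backup.

    `free_windows` is a list of (start, end) pairs and `estimate` is the
    expected run time in the same units (hypothetical representation).
    Returns the (start, end) slot to schedule, or None if nothing fits.
    """
    for start, end in free_windows:
        if end - start >= estimate:        # window equals or exceeds the estimate
            return (start, start + estimate)
    return None
```

For example, with free windows `[(0, 30), (60, 180)]` and a 90-minute estimate, the first window is too short and the job is placed at `(60, 150)`.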
-
Patent number: 8549384
Abstract: Apparatus having corresponding methods and computer-readable media comprise an encoder configured to provide encoded data according to an error correction code; a flash memory interface configured to write the encoded data to a location in flash memory, and to read the encoded data from the location in the flash memory; a decoder configured to decode the encoded data read from the location in the flash memory, and to indicate a number of resulting decode errors; and a retirement module configured to retire the location responsive to a number of resulting decode errors reaching an error threshold T.
Type: Grant
Filed: June 24, 2010
Date of Patent: October 1, 2013
Assignee: Marvell International Ltd.
Inventors: ChengKuo Huang, Sui-Hung Fred Au, Xueshi Yang, Lau Nguyen
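Illustrative sketch (not taken from the patent): the retirement decision can be reduced to a threshold comparison, shown here in Python. The class and method names are hypothetical, and the encoder, decoder, and flash interface are left out; only the "retire at threshold T" rule is modeled.

```python
class RetirementModule:
    """Tracks flash locations retired because decode errors reached threshold T."""

    def __init__(self, threshold):
        self.threshold = threshold   # error threshold T from the abstract
        self.retired = set()

    def report(self, location, decode_errors):
        """Record a decode result; return True if the location is now retired."""
        if decode_errors >= self.threshold:   # "reaching" T is treated as >=
            self.retired.add(location)
        return location in self.retired
```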
-
Publication number: 20130246845
Abstract: Systems and methods are provided for supporting transaction recovery based on a strict ordering of two-phase commit calls. At least one resource manager in a mid-tier transactional environment can be designated as the “determiner resource,” in order to support eliminating mid-tier transaction logs (TLOG) in processing a two-phase transaction. A transaction manager can prepare all other resource managers in the mid-tier transactional system before the determiner resource. Furthermore, the transaction manager can rely on the list of outstanding transactions to be committed that is provided by the determiner resource for recovering the transaction. The transaction manager can commit an in-doubt transaction returned from a resource manager that matches the list of in-doubt transactions returned from the determiner resource. Otherwise, the transaction manager can roll back the in-doubt transaction.
Type: Application
Filed: March 14, 2013
Publication date: September 19, 2013
Applicant: ORACLE INTERNATIONAL CORPORATION
Inventor: Paul Parkinson
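Illustrative sketch (not taken from the application): the recovery rule in the last two sentences of this abstract amounts to a membership test against the determiner's list. The Python function below assumes transactions are identified by plain string IDs; the function and parameter names are hypothetical.

```python
def resolve_in_doubt(determiner_xids, resource_xids):
    """Decide the outcome of each in-doubt transaction reported by a resource manager.

    A transaction is committed only if the determiner resource also lists it
    as in doubt; otherwise it is rolled back. Names are illustrative.
    """
    determiner = set(determiner_xids)
    return {xid: ("commit" if xid in determiner else "rollback")
            for xid in resource_xids}
```

Because the determiner is prepared last, its in-doubt list can stand in for a separate transaction log: a transaction missing from that list never completed the prepare phase everywhere.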
-
Patent number: 8533529
Abstract: A system and method can support a compensation work. The system includes one or more compensation functions that use a process state to realize the compensation work associated with a forward work, wherein the compensation work is executed at a different runtime from an execution time of the forward work, and wherein the process state contains data needed to execute the compensation work. The system also includes a process execution engine that can dynamically manage process state to make the state available to the compensation functions. The process state is retrieved based on a closure data structure that holds an expression and an environment of variable bindings, in which the expression is to be evaluated.
Type: Grant
Filed: September 9, 2011
Date of Patent: September 10, 2013
Assignee: Oracle International Corporation
Inventor: Alexandre de Castro Alves
-
Patent number: 8516302
Abstract: A communication system enabling wireless transmission of messages via packets; and a method of operating the system provides for improved accuracy in the transmission of a message, particularly for overcoming signal distortion associated with the phase changes and varying multipath found in transmissions from the locomotive of a moving train. The maximum benefit of forward-error correction (FEC) with Reed-Solomon (RS) coding is applied for a message payload that is significantly shorter than the fixed length of a packet, with lesser coding being performed with longer payloads.
Type: Grant
Filed: April 16, 2010
Date of Patent: August 20, 2013
Assignee: General Electric Company
Inventors: Thomas Clayton Mayo, Kenneth Roy Tuttle, Richard Alan Place
-
Publication number: 20130212432
Abstract: Systems, apparatus, methods, and articles of manufacture provide for facilitating upload of one or more electronic files from a user device to a remote server. In some embodiments, a background upload process manages connectivity of the user device to the remote server and staging file uploads in a disconnected mode for automatic processing when connectivity is restored.
Type: Application
Filed: February 8, 2013
Publication date: August 15, 2013
Applicant: THE TRAVELERS INDEMNITY COMPANY
Inventor: The Travelers Indemnity Company
-
Publication number: 20130198564
Abstract: Technologies are generally presented for a migration system and a method for moving data and applications from a cloud or non-cloud network to a cloud network employing a Parameterized Dynamic Model (PDM) having one or more multi-dimensional parameters. In some examples, the PDM parameters may represent the Service Level Agreement (SLA) requirements that a target cloud may need to satisfy for a successful cloud migration. The PDM may include a Model Execution Code (MEC) module configured to execute the PDM acting upon the PDM parameter in a cloud environment following the sequencing defined in the PDM as a sequencing parameter. The PDM-MEC based migration system may also include fault-tolerance and error recovery during the migration while the MEC code is executed.
Type: Application
Filed: March 15, 2012
Publication date: August 1, 2013
Applicant: Empire Technology Development, LLC
Inventor: Seth Hasit
-
Patent number: 8489921
Abstract: A distributed system for creating a checkpoint for a plurality of processes running on the distributed system. The distributed system includes a plurality of compute nodes with an operating system executing on each compute node. A checkpoint library resides at the user level on each of the compute nodes, and the checkpoint library is transparent to the operating system residing on the same compute node and to the other compute nodes. Each checkpoint library uses a windowed messaging logging protocol for checkpointing of the distributed system. Processes participating in a distributed computation on the distributed system may be migrated from one compute node to another compute node in the distributed system by re-mapping of hardware addresses using the checkpoint library.
Type: Grant
Filed: April 7, 2009
Date of Patent: July 16, 2013
Assignee: Open Invention Network, LLC
Inventors: Srinidhi Varadarajan, Joseph Ruscio
-
Patent number: 8478720
Abstract: The present invention concerns a file repair method for recovering a file, in a system for distributing content to more than one receiver, comprising, at a first receiver, the steps of receiving a set of files in a push multicast from a transmitter, receiving an identifier of a second receiver that owns a missing file that is not comprised in the received set of files; and recovering the missing file from the second receiver in a pull mode using a peer-to-peer mechanism. Another object of the invention is a method for file recovery in a server and in a peer device.
Type: Grant
Filed: August 28, 2007
Date of Patent: July 2, 2013
Assignee: Thomson Licensing
Inventors: Eric Gautier, Rémi Houdaille, Willem Lubbers
-
Patent number: 8479044
Abstract: A computer implemented method, apparatus, and computer program product for determining a state associated with a transaction for use with a transactional processing system comprising a transaction coordinator and a plurality of grouped and inter-connected resource managers, the method comprising the steps of: in response to a communications failure between the transaction coordinator and a first resource manager causing a transaction to have an in doubt state, connecting to a second resource manager; in response to the connecting step, sending by the transaction coordinator to the second resource manager, a query requesting data associated with the in doubt transaction; obtaining at the first resource manager, by the second resource manager, a shared lock to data associated with the in doubt transaction; and in response to the obtaining step, collating, by the second resource manager, data associated with the in doubt transaction associated with the first resource manager.
Type: Grant
Filed: July 22, 2010
Date of Patent: July 2, 2013
Assignee: International Business Machines Corporation
Inventors: Paul S. Dennis, Stephen J. Hobson, Pete Siddall, Jamie P. Squibb, Phillip G. Willoughby
-
Publication number: 20130166950
Abstract: A data processing device 10 receives a process request from an external interface 20 of a client terminal etc., carries out the transaction request with respect to a message according to the process request and passes to an API which forms the interface of various types of program carried out in the server 30. The data processing device 10 includes a process (n) which carries out transaction processing with respect to a message with a trade category [n], and a backup process (n) which carries out transaction processing with respect to a message in the case where the transaction processing carried out by the process (n) fails. In addition, the data processing device 10 includes an error process part 123 which isolates the cause of a failure according to a result of a process by the backup process (n).
Type: Application
Filed: December 20, 2012
Publication date: June 27, 2013
Applicant: THE BANK OF TOKYO - MITSUBISHI UFJ, LTD.
Inventor: The Bank of Tokyo - Mitsubishi UFJ, Ltd.
-
Patent number: 8473783
Abstract: Fault tolerance is provided in a distributed system. The complexity of replicas and rollback requests are avoided; instead, a local failure in a component of a distributed system is tolerated. The local failure is tolerated by storing state related to a requested operation on the component, persisting that stored state in a data store, such as a relational database, asynchronously processing the operation request, and if a failure occurs, restarting the component using the stored state from the data store.
Type: Grant
Filed: November 9, 2010
Date of Patent: June 25, 2013
Assignee: International Business Machines Corporation
Inventors: Henrique Andrade, Kirsten W. Hildrum, Michael J. E. Spicer, Chitra Venkatramani, Rohit S. Wagle
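Illustrative sketch (not taken from the patent): the persist-then-restart pattern described here can be shown with Python's standard `sqlite3` module standing in for the relational data store. The `Component` class, its method names, and the `pending` table layout are assumptions for the example, and the asynchronous processing step is left out.

```python
import json
import sqlite3


class Component:
    """Persists the state of each requested operation so a restart can resume it."""

    def __init__(self, db):
        self.db = db
        db.execute("CREATE TABLE IF NOT EXISTS pending "
                   "(id TEXT PRIMARY KEY, state TEXT)")

    def request(self, op_id, state):
        # Store the operation's state before any processing begins.
        self.db.execute("INSERT OR REPLACE INTO pending VALUES (?, ?)",
                        (op_id, json.dumps(state)))
        self.db.commit()

    def complete(self, op_id):
        # Processing finished: the stored state is no longer needed.
        self.db.execute("DELETE FROM pending WHERE id = ?", (op_id,))
        self.db.commit()

    def recover(self):
        # After a failure, a restarted component reloads unfinished operations.
        rows = self.db.execute("SELECT id, state FROM pending").fetchall()
        return {op_id: json.loads(state) for op_id, state in rows}
```

A restarted instance that opens the same database sees exactly the operations that were requested but never completed, which is what lets the design avoid replicas and rollback requests.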
-
Patent number: 8473782
Abstract: A method of controlling a mobile terminal, the method including performing, via a controller on the mobile terminal, data synchronizations with at least one external device, displaying, via a display on the mobile terminal, a list of data synchronization history corresponding to the performed data synchronizations, determining, via the controller, whether or not a particular data synchronization from the list includes an error, undoing, via the controller, the particular data synchronization to a state prior to the particular data synchronization, if it is determined that the particular data synchronization includes the error, and re-performing, via the controller, the particular data synchronization using data corresponding to the undone particular data synchronization.
Type: Grant
Filed: November 3, 2010
Date of Patent: June 25, 2013
Assignee: LG Electronics Inc.
Inventor: Sangjoo Park
-
Patent number: 8458284
Abstract: A system for transferring a live application from a source machine to a target machine includes a memory capture component that monitors and captures memory segments associated with one or more memories, one or more sets of these memory segments comprising one or more applications, the memory segments changing while the live application is in execution. A frequency ranking component organizes the memory segments in an order determined by memory segment change frequency. A link identification component identifies one or more connecting links to one or more sets of peer machines, each set of machines connecting said source machine to said target machine, the link identifier further determining the bandwidth associated with each connecting link. A routing component preferentially routes one or more of the memory segments over said connecting links based on said order.
Type: Grant
Filed: June 12, 2009
Date of Patent: June 4, 2013
Assignee: International Business Machines Corporation
Inventors: Hai Huang, Yaoping Ruan, Sambit Sahu, Anees A. Shaikh, Kunwadee Sripanidkulchai, Sai Zeng
-
Patent number: 8448022
Abstract: After execution by a thread of an instruction for invoking a function containing unreliable code, a pointer is stored to thread local storage for the thread in one or more reserved registers. The thread local storage is a portion of memory of the computing device associated with the thread. In the thread local storage, a stack pointer is stored to a position in a call stack associated with the instruction for invoking the function. The function is called, thereby causing the function to execute. In response to a fault occurring in the function, the pointer to thread local storage is used to retrieve the stack pointer in the thread local storage. The position in the call stack is used in a recovery process for the fault.
Type: Grant
Filed: October 26, 2010
Date of Patent: May 21, 2013
Assignee: VMware, Inc.
Inventor: Micah Elizabeth Scott
-
Patent number: 8442962
Abstract: A computer-implemented method, a computer-readable medium and a system are provided. A transaction master for each of a plurality of transactions of a database is provided. Each transaction master is configured to communicate with at least one transaction slave to manage execution of a transaction in the plurality of transactions. Each transaction master is configured to perform generating a transaction token to specify data to be visible for a transaction on the database, the transaction token including a transaction identifier for identifying whether the transaction is a committed transaction or an uncommitted transaction, receiving a request to commit the transaction, and initiating, based on the request, a two-phase commit operation to commit the transaction.
Type: Grant
Filed: December 28, 2010
Date of Patent: May 14, 2013
Assignee: SAP AG
Inventors: Juchang Lee, Michael Muehle
-
Patent number: 8429452
Abstract: Also provided are techniques for failover when a network adapter fails, wherein the network adapter is connected to a miniport driver that is connected to a filter driver. With the miniport driver, it is determined that at least one of the network adapter and a data path through the network adapter has failed. With the miniport driver, the filter driver is notified that at least one of the network adapter and the data path through the network adapter has failed.
Type: Grant
Filed: June 23, 2011
Date of Patent: April 23, 2013
Assignee: Intel Corporation
Inventors: Alexander Belyakov, Mikhail Sennikovsky, Alexey Drozdov
-
Patent number: 8424016
Abstract: Briefly, techniques to manage interrupts and swaps of threads operating in critical region. In an embodiment, a thread is to be interrupted during a first critical region with an interrupt routine. The thread may be set to restart at a beginning of the first critical region in response to an indication that the thread is working in a critical region. Other embodiments are also claimed and disclosed.
Type: Grant
Filed: March 29, 2011
Date of Patent: April 16, 2013
Assignee: Intel Corporation
Inventor: Joseph S. Cavallo
-
Patent number: 8417733
Abstract: Embodiments of the present invention provide techniques, including systems, methods, and computer readable medium, for dynamic atomic bitsets. A dynamic atomic bitset is a data structure that provides a bitset that can grow or shrink in size as required. The dynamic atomic bitset is non-blocking, wait-free, and thread-safe.
Type: Grant
Filed: March 4, 2010
Date of Patent: April 9, 2013
Assignee: Oracle International Corporation
Inventor: Nathan Reynolds
-
Publication number: 20130086419
Abstract: A transactional system can utilize the distributed storage and high availability (HA) capability provided by a clustered database to support easy and feasible disaster recovery. The transactional middleware machine environment comprises one or more transactional application servers associated with a transaction. The one or more transactional application servers operate to persist transactional log information associated with the transaction in a database that connects with said one or more transactional application servers at a local site. The database at the local site operates to replicate the persisted transactional log information to a remote database at a remote site. The remote database allows a different transactional application server at the remote site to recover the persisted transactional log information and complete the transaction, when a disaster disables the local site.
Type: Application
Filed: March 7, 2012
Publication date: April 4, 2013
Applicant: ORACLE INTERNATIONAL CORPORATION
Inventors: Todd Little, Xiangdong Li, Xianzheng Lv
-
Patent number: 8402310
Abstract: In one embodiment, the present invention includes a method for determining a vulnerability level for an instruction executed in a processor, and re-executing the instruction if the vulnerability level is above a threshold. The vulnerability level may correspond to a soft error likelihood for the instruction while the instruction is in the processor. Other embodiments are described and claimed.
Type: Grant
Filed: October 28, 2011
Date of Patent: March 19, 2013
Assignee: Intel Corporation
Inventors: Xavier Vera, Oguz Ergin, Osman Unsal, Jaume Abella, Antonio González
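Illustrative sketch (not taken from the patent): the selective re-execution rule can be modeled in software as a threshold check per instruction. In the Python below, `run` and `vulnerability` are hypothetical callbacks, and real hardware would compute the vulnerability level from how the instruction moves through the pipeline rather than from a lookup.

```python
def execute_with_reexecution(instructions, run, vulnerability, threshold):
    """Run each instruction once, and a second time if it looks vulnerable.

    `run(ins)` executes an instruction; `vulnerability(ins)` returns its
    estimated soft-error likelihood (both are assumed interfaces).
    Returns the list of instructions in the order they were executed.
    """
    trace = []
    for ins in instructions:
        run(ins)
        trace.append(ins)
        if vulnerability(ins) > threshold:   # strictly above the threshold
            run(ins)                         # re-execute to mask a possible soft error
            trace.append(ins)
    return trace
```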
-
Publication number: 20130061090
Abstract: A partial rebooting recovery apparatus is provided. The partial rebooting recovery apparatus may store a system state of a predetermined booting point in time, may receive a failure signal of a system, may call a failure recovery processing function, may recover the system to the system state of the predetermined booting point in time, based on the failure signal, and may reboot the system from a point in time at which the system is recovered.
Type: Application
Filed: September 5, 2012
Publication date: March 7, 2013
Applicant: ELECTRONICS AND TELECOMMUNICATIONS RESEARCH INSTITUTE
Inventor: Kwang Yong LEE
-
Patent number: 8386440
Abstract: The subject invention pertains to data store corruption recovery. More specifically, the invention concerns systems and methods for identifying corrupt data in a manner that prevents de-committing or removal of valid or consistent transactions from a database. This can be accomplished at least in part by logging the identities of data items that a transaction reads. Furthermore, the subject invention provides for employment of a multi-version (or transaction-time) database to significantly reduce any down time or database unavailability caused by a corrupt transaction and associated corrupt data items. Accordingly, no backups need to be installed and only updates by the original corrupt transaction and transactions that read corrupt data need to be de-committed or removed.
Type: Grant
Filed: May 10, 2005
Date of Patent: February 26, 2013
Assignee: Microsoft Corporation
Inventors: David B. Lomet, Roger S. Barga
-
Patent number: 8381031
Abstract: On a typical motherboard the processor and memory are separated by a printed circuit data bus that traverses the motherboard. Throughput, or data transfer rate, on the data bus is much lower than the rate at which a modern processor can operate. The difference between the data bus throughput and the processor speed significantly limits the effective processing speed of the computer when the processor is required to process large amounts of data stored in the memory. The processor is forced to wait for data to be transferred to or from the memory, leaving the processor under-utilized. The delays are compounded in a distributed computing system including a number of computers operating in parallel. The present disclosure describes systems, methods, and apparatus that tend to alleviate delays so that memory access bottlenecks are not compounded within distributed computing systems.
Type: Grant
Filed: August 6, 2010
Date of Patent: February 19, 2013
Assignee: Advanced Processor Architectures, LLC
Inventors: Louis Edmund Chall, John Bradley Serson, Philip Arnold Roberts, Cecil Eugene Hutchins
-
Patent number: 8381028Abstract: A computer usable program product for accelerating recovery in an MPI environment is provided in the illustrative embodiments. A first portion of a distributed application executes using a first processor and a second portion using a second processor in a distributed computing environment. After a failure of operation of the first portion, the first portion is restored to a checkpoint. A first part of the first portion is distributed to a third processor and a second part to a fourth processor. A computation of the first portion is performed using the first and the second parts in parallel. A first message is computed in the first portion and sent to the second portion, the message having been initially computed after a time of the checkpoint. A second message is replayed from the second portion without computing the second message in the second portion.Type: GrantFiled: April 16, 2012Date of Patent: February 19, 2013Assignee: International Business Machines CorporationInventor: Elmootazbellah Nabil Elnozahy
-
Patent number: 8381030Abstract: The present disclosure involves systems, software, and computer implemented methods for retrying business methods at an application server after thrown exceptions. One process includes operations for invoking a business method of an enterprise bean hosted in an enterprise bean container. The operations further include determining whether retry conditions are satisfied after an exception is thrown during execution of the business method. The business method is invoked again based on a predefined retry policy when the retry conditions are satisfied.Type: GrantFiled: December 23, 2009Date of Patent: February 19, 2013Assignee: SAP AGInventors: Peter Matov, Krasimir Topchiyski, Bistra Yakimova, Vladimir Pavlov
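As a rough illustration of retrying a method under a predefined policy, the following sketch uses plain Python rather than an enterprise bean container. `RetryPolicy` and `invoke_with_retry` are hypothetical names, and the retry conditions (an attempt limit and a set of retryable exception types) are assumptions made for the example; the patent's actual conditions are not given in the abstract.

```python
import time
from dataclasses import dataclass


@dataclass
class RetryPolicy:
    """Hypothetical retry policy: attempt limit, delay, and retryable exception types."""
    max_attempts: int = 3
    delay_seconds: float = 0.0
    retry_on: tuple = (Exception,)


def invoke_with_retry(method, policy, *args, **kwargs):
    """Invoke `method`, re-invoking it while the retry conditions are satisfied."""
    attempt = 0
    while True:
        attempt += 1
        try:
            return method(*args, **kwargs)
        except Exception as exc:
            retryable = isinstance(exc, policy.retry_on)
            # Give up when the exception is not retryable or attempts are used up.
            if not retryable or attempt >= policy.max_attempts:
                raise
            time.sleep(policy.delay_seconds)
```

A caller would wrap the business method, for example `invoke_with_retry(place_order, RetryPolicy(max_attempts=5, retry_on=(ConnectionError,)), order)`, where `place_order` and `order` are placeholders.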
-
Patent number: 8375247Abstract: Embodiments include a computer processor-error controller, a computerized device, a device, an apparatus, and a method. A computer processor-error controller includes a monitoring circuit operable to detect a computational error corresponding to an execution of a second instruction by a processor operable to execute a sequence of program instructions that includes a first instruction that is fetched before the second instruction. The computer processor-error controller includes an error recovery circuit operable to restore an execution of the sequence of program instructions to the first instruction in response to the detected computational error.Type: GrantFiled: February 28, 2006Date of Patent: February 12, 2013Assignee: The Invention Science Fund I, LLCInventors: Bran Ferren, W. Daniel Hillis, William Henry Mangione-Smith, Nathan P. Myhrvold, Clarence T. Tegreene, Lowell L. Wood, Jr.
-
Patent number: 8370693Abstract: A system and method communicates commands from a command originator to receiving devices, yet the receiving devices do not confirm receipt of the command. The most current command (e.g. the one with the highest sequence number) is rebroadcast by the command originator and the receiving devices, tending to rebroadcast more frequently upon detection of an event indicating that the most current command was not received by at least one other device, and less frequently upon detection of an event indicating that the most current command was provided with sufficient duplication that if another device could receive it, the device likely did receive it, subject to a maximum and minimum rate.Type: GrantFiled: February 6, 2012Date of Patent: February 5, 2013Assignee: Cisco Technology, Inc.Inventors: Alec Woo, David E. Culler
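The rate adaptation described here can be sketched as an interval that shrinks when a device appears to have missed the current command and grows when enough duplicates have been heard. The class name, the doubling rule, and the default bounds below are assumptions chosen for illustration; the abstract only states that the rate varies between a maximum and a minimum.

```python
class RebroadcastScheduler:
    """Tracks the most current command and the interval between rebroadcasts."""

    def __init__(self, min_interval=1.0, max_interval=64.0):
        self.min_interval = min_interval  # shortest gap, i.e. maximum rate
        self.max_interval = max_interval  # longest gap, i.e. minimum rate
        self.interval = min_interval
        self.current_seq = 0

    def on_command(self, seq):
        """Adopt a command only if its sequence number is newer than the current one."""
        if seq > self.current_seq:
            self.current_seq = seq
            self.interval = self.min_interval  # spread new commands quickly
            return True
        return False

    def on_missed_command_detected(self):
        """Some other device seems to lack the current command: rebroadcast sooner."""
        self.interval = self.min_interval

    def on_sufficient_duplicates(self):
        """Enough copies were heard: back off (assumed doubling), up to the cap."""
        self.interval = min(self.interval * 2, self.max_interval)
```

Because receivers never acknowledge commands, the only feedback in this sketch is what each device overhears, which is why both events are driven by observation rather than by replies.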
-
Patent number: 8365015Abstract: The present disclosure provides memory level error correction methods and apparatus. A memory controller is intermediate the memory devices, such as DRAM chips or memory modules, and a processor, such as a graphics processor or a main processor. The memory controller can provide error correction. In an example, the memory controller includes a buffer to store instructions and data for execution by the controller and a replay buffer to store the instructions such that operations can be replayed to a prior state before the error. An error detector receives data read from the memory devices and, if no error is detected, outputs the data. If an error is detected, the error detector signals the memory controller to replay the instructions stored in the replay buffer.Type: GrantFiled: August 9, 2010Date of Patent: January 29, 2013Assignee: Nvidia CorporationInventors: Shu-Yi Yu, Shane Keil, John Edmondson
-
Patent number: 8365016Abstract: In one embodiment, the present invention includes a method for selecting a first transaction execution mode to begin a first transaction in an unbounded transactional memory (UTM) system having a plurality of transaction execution modes. These transaction execution modes include hardware modes to execute within a cache memory of a processor, a hardware assisted mode to execute using transactional hardware of the processor and a software buffer, and a software transactional memory (STM) mode to execute without the transactional hardware. The first transaction execution mode can be selected to be a highest performant of the hardware modes if no pending transaction is executing in the STM mode, otherwise a lower performant mode can be selected. Other embodiments are described and claimed.Type: GrantFiled: November 30, 2011Date of Patent: January 29, 2013Assignee: Intel CorporationInventors: Jan Gray, Martin Taillefer, Yossi Levanoni, Ali-Reza Adl-Tabatabai, Dave Detlefs, Vinod Grover, Mike Magruder, Matt Tolton, Bratin Saha, Gad Sheaffer, Vadim Bassin
-
Publication number: 20130024727Abstract: Method for automatically reloading software characterized in that it comprises: a step of detecting corruption (E101) of at least one part of a software package of an on-board programmable device (10-1, 10-2, 10-n); and, in response to signaling, a step of reloading (E103) a non-corrupt version of the said at least one corrupt part of the software in order to replace the said at least one corrupt part of the software.Type: ApplicationFiled: July 11, 2012Publication date: January 24, 2013Applicant: AIRBUS OPERATIONS (S.A.S.)Inventors: Anne Frayssignes, Nicolas Caule
-
Patent number: 8359367Abstract: A system, method and computer program product for supporting system initiated checkpoints in parallel computing systems. The system and method generates selective control signals to perform checkpointing of system related data in presence of messaging activity associated with a user application running at the node. The checkpointing is initiated by the system such that checkpoint data of a plurality of network nodes may be obtained even in the presence of user applications running on highly parallel computers that include ongoing user messaging activity.Type: GrantFiled: March 25, 2010Date of Patent: January 22, 2013Assignee: International Business Machines CorporationInventors: Dong Chen, Philip Heidelberger
-
Patent number: 8352786Abstract: A compressed replay buffer in a first electronic unit of an electronic system holds commands in a table. As commands are transmitted from the first electronic unit to a second electronic unit, the command, along with associated data, command type, and the like are stored in a row in the table. No rows in the table contain “dead cycles” to indicate that no command was sent on a particular cycle on a bus over which the commands were transmitted. The second electronic unit may request that the first electronic unit replay some number of commands. In response, the first electronic unit uses commands in the compressed replay buffer, along with required timings stored on the first electronic unit, to replay the number of commands requested.Type: GrantFiled: July 20, 2010Date of Patent: January 8, 2013Assignee: International Business Machines CorporationInventors: Herman L. Blackmon, Ryan S. Haraden, Joseph A. Kirscht, Elizabeth A. McGlone
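One way to picture a replay buffer without "dead cycle" rows is a table that stores only the commands, with bus timing rebuilt at replay time from rules held by the sender. The sketch below assumes a hypothetical per-command-type gap table (`REQUIRED_GAP_CYCLES`) as the "required timings"; the command names and cycle counts are invented and do not come from the patent.

```python
from collections import deque

# Hypothetical timing rules kept on the sending unit: cycles that must
# elapse after a command of this type before the next command is sent.
REQUIRED_GAP_CYCLES = {"READ": 4, "WRITE": 6, "REFRESH": 10}


class CompressedReplayBuffer:
    """Stores one row per transmitted command and no rows for idle bus cycles."""

    def __init__(self, capacity=8):
        self.rows = deque(maxlen=capacity)  # oldest rows drop off automatically

    def record(self, command_type, data=None):
        """Add a row when a command is transmitted."""
        self.rows.append((command_type, data))

    def replay(self, count):
        """Return (cycle, command_type, data) for the last `count` commands.

        Idle cycles are regenerated from the timing rules instead of being stored.
        """
        rows = list(self.rows)[-count:] if count > 0 else []
        schedule = []
        cycle = 0
        for command_type, data in rows:
            schedule.append((cycle, command_type, data))
            cycle += REQUIRED_GAP_CYCLES[command_type]
        return schedule


buf = CompressedReplayBuffer()
buf.record("READ")
buf.record("WRITE", 0xAB)
buf.record("READ")
print(buf.replay(2))
```

Storing only real commands keeps the table small; the trade-off in this sketch is that the replayed timing is the minimum legal spacing, not necessarily the spacing of the original transmission.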
-
Publication number: 20130007518Abstract: Described are embodiments directed at persistent handles that are used to retain state across network failures and server failovers. Persistent handles are requested by a client after a session has been established with a file server. The request for the persistent handle includes a handle identifier generated by the client. The server uses the handle identifier to associate with state information. When there is a network failure or a server failover, and a reconnection to the client, the handle identifier is used to identify replayed requests that if replayed would create an inconsistent state on the server. The replayed requests are then appropriately handled.Type: ApplicationFiled: June 30, 2011Publication date: January 3, 2013Applicant: Microsoft CorporationInventors: Mathew George, David M. Kruse, James T. Pinkerton, Roopesh C. Battepati, Tom Jolly, Paul R. Swan, Mingdong Shang, Daniel Edward Lovinger
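The role of the client-generated handle identifier can be sketched as a lookup key that lets the server recognize a replayed request instead of applying it twice. The `FileServer` class, its `create` method, and the state it keeps are hypothetical simplifications for illustration and do not reflect any real file-server protocol API.

```python
import uuid


class FileServer:
    """Toy server that associates open-file state with client-chosen handle ids."""

    def __init__(self):
        self.handles = {}    # handle_id -> state; assumed to survive failover
        self.open_count = 0  # number of opens actually performed

    def create(self, handle_id, path):
        """Open `path`, or return the existing state if this request is a replay."""
        existing = self.handles.get(handle_id)
        if existing is not None:
            # Replayed request after reconnect: do not open the file a second time.
            return existing
        self.open_count += 1
        state = {"path": path, "open_number": self.open_count}
        self.handles[handle_id] = state
        return state


server = FileServer()
handle_id = str(uuid.uuid4())                      # generated by the client
first = server.create(handle_id, "/share/report.txt")
replayed = server.create(handle_id, "/share/report.txt")  # resent after a failure
print(first is replayed, server.open_count)
```

In this sketch the second call returns the state created by the first, so the server ends up with a single open rather than an inconsistent duplicate.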
-
Publication number: 20120324282Abstract: A method for event management in asynchronous work processing including timing at least one step in an asynchronous work process, wherein the at least one step is performed by an application and the at least one step has an expected time of completion; determining an error preventing step completion in response to the expected time of completion expiring; correcting the error; and re-performing the at least one step.Type: ApplicationFiled: June 14, 2011Publication date: December 20, 2012Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATIONInventors: Khalid A. Asad, David S. Cruley, John DiClemente, Paul Ilechko, David J. Mulley
-
Publication number: 20120297248Abstract: A memory device recognizes that data corruption is present in a block. In response, rather than skip the block and continue write operations into a different uncorrupted block, the memory device continues to write data into the corrupted block. The memory device may write data on the basis of logical groups. The logical groups may be smaller than a block and larger than a page, but other sizes are also possible. In response to write corruption in the block (e.g., from power loss during a write operation), the memory device may skip certain parts of the block and continue writing into the block. For example, the memory device may skip the remainder of the page range in which the logical group was going to be written when data corruption occurred, and instead write that logical group into the block from the start of the next logical group unit, the next available page, or any other boundary.Type: ApplicationFiled: May 17, 2011Publication date: November 22, 2012Inventor: Alan David Bennett
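The boundary arithmetic for one of the options mentioned, resuming at the start of the next logical group unit, can be sketched as follows. The function name and the assumption that logical groups occupy fixed, aligned page ranges within a block are illustrative only.

```python
def next_write_page(corrupt_page, pages_per_group, pages_per_block):
    """Return the first page of the next logical group unit in the same block.

    Pages from `corrupt_page` to the end of its group range are skipped.
    Returns None when no whole group fits in the rest of the block.
    """
    next_start = (corrupt_page // pages_per_group + 1) * pages_per_group
    if next_start + pages_per_group > pages_per_block:
        return None
    return next_start


# Corruption at page 5 with 4-page groups in a 16-page block:
# pages 5..7 are skipped and the logical group is rewritten starting at page 8.
print(next_write_page(5, 4, 16))
```

The other resume points named in the abstract (the next available page or any other boundary) would use a different rounding rule but the same overall structure.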
-
Patent number: 8312053Abstract: Embodiments of the present invention provide techniques, including systems, methods, and computer readable medium, for dynamic atomic arrays. A dynamic atomic array is a data structure that provides an array that can grow or shrink in size as required. The dynamic atomic array is non-blocking, wait-free, and thread-safe. The dynamic atomic array may be used to provide arrays of any primitive data type as well as complex types, such as objects.Type: GrantFiled: September 11, 2009Date of Patent: November 13, 2012Assignee: Oracle International CorporationInventor: Nathan Reynolds