Isolate Or Remove Failed Node Without Replacement (e.g., Bypassing, Re-routing, Etc.) Patents (Class 714/4.2)
-
Publication number: 20150082078
Abstract: A controller area network (CAN) includes a plurality of CAN elements comprising a communication bus and a plurality of controllers. A method for monitoring includes periodically determining vectors wherein each vector includes inactive ones of the controllers detected during a filtering window. Contents of the periodically determined vectors are time-filtered to determine a fault record vector. A fault on the CAN is isolated by comparing the fault record vector and a fault signature vector determined based upon a network topology for the CAN.
Type: Application
Filed: July 16, 2014
Publication date: March 19, 2015
Inventor: SHENGBING JIANG
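As a rough illustration of the monitoring idea in publication 20150082078, the sketch below time-filters per-window sets of inactive controllers into a fault record and compares it with topology-derived fault signatures. The filtering rule (a controller must be inactive in at least `min_hits` windows), the exact-match comparison, and all names and sample data are assumptions for illustration; the abstract does not specify them.

```python
from collections import Counter

def fault_record(window_vectors, min_hits):
    """Time-filter per-window inactive sets into one fault record.

    Assumed rule: keep a controller only if it was inactive in at
    least `min_hits` of the supplied filtering windows.
    """
    counts = Counter(c for vec in window_vectors for c in set(vec))
    return frozenset(c for c, n in counts.items() if n >= min_hits)

def isolate(record, signatures):
    """Return the names of faults whose signature equals the record.

    `signatures` maps a hypothetical fault name to the set of
    controllers expected to go silent for that fault.
    """
    return sorted(name for name, sig in signatures.items()
                  if frozenset(sig) == record)

# Hypothetical data: three filtering windows and two fault signatures.
windows = [{"C2", "C3"}, {"C2", "C3"}, {"C3"}]
signatures = {
    "open link between C1 and C2": {"C2", "C3"},
    "controller C3 fault": {"C3"},
}
record = fault_record(windows, min_hits=2)
candidates = isolate(record, signatures)
```

With this sample data, `C2` and `C3` both pass the filter, so only the link fault matches. A real system might tolerate partial matches rather than require equality.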
-
Patent number: 8984331
Abstract: Systems and methods are provided for detecting an anomaly in a computer that is part of a population of networked computers. Snapshots are received from a plurality of computers within the population of computers, where individual snapshots include a state of assets and runtime processes of a respective computer. An asset normalization model is generated from the snapshots and serves as a baseline model for detecting an anomaly in the state of assets and runtime processes of a respective computer. A snapshot from at least one of the computers is compared to the asset normalization model in order to determine whether an anomaly is present in a state of static assets and runtime processes of the at least one of the computers.
Type: Grant
Filed: September 6, 2012
Date of Patent: March 17, 2015
Assignee: Triumfant, Inc.
Inventor: Mitchell N. Quinn
-
Publication number: 20150067385
Abstract: An information processing system includes a plurality of nodes and a shared memory connected to the plurality of nodes. Each of the nodes includes a plurality of functional circuits, a control device, and a register configured to store a plurality of interrupt factors that occur in the plurality of functional circuits. The control device in one node among the plurality of nodes receives the interrupt factor in each register of a plurality of other nodes in response to an occurrence of the interrupt factor, extracts an interrupt factor to be detected as a failure among the received interrupt factors, specifies a fail node according to an extraction result, and, after suppressing access to the shared memory by the fail node, controls to separate the fail node from the information processing system on the basis of log information received from the plurality of other nodes.
Type: Application
Filed: July 24, 2014
Publication date: March 5, 2015
Inventor: Kazuhiro YUUKI
-
Patent number: 8972772
Abstract: Systems and methods are disclosed herein for a replicated duplex computer system. The system includes a triplet of network elements, which each maintain a clock signal, and a monitor at each network element for monitoring incoming clock signals. Each network element interfaces with a fault containment region (FCR). The system provides the ability to transition to a duplex system if one of the fault containment regions fails. The three network elements are able to send their clock signals to the other network elements and receive their own clock signal and clock signals from the other elements. The monitors are configured to detect discrepancies in the clock signals of the network elements. If a monitor determines that an FCR has failed, each network element is reconfigured so that the FTPP system operates in a duplex mode without the faulty FCR by replacing the clock signal from the faulty element with its own clock signal.
Type: Grant
Filed: February 24, 2011
Date of Patent: March 3, 2015
Assignee: The Charles Stark Draper Laboratory, Inc.
Inventors: Samuel Beilin, David Crane, M. Jay Prizant, Eric T. Antelman, Jeffrey Zinchuk, Roger Racine, Neil Brock, Adam J. Elbirt
-
Patent number: 8959386
Abstract: A network, in particular an Ethernet network, contains as network elements at least two network components that are interconnected by a network transmission line. Accordingly, at least one expansion unit having two external ports is disposed in the network line for extending the scope thereof, wherein the expansion unit forwards a failure of the network transmission line at one of the ports thereof to a port of the next subsequent network element.
Type: Grant
Filed: May 23, 2011
Date of Patent: February 17, 2015
Assignee: Siemens Aktiengesellschaft
Inventors: Ralf Beyer, Harald Karl, Michael Wilding
-
Patent number: 8954787
Abstract: A maintenance free storage container includes a plurality of storage servers, wherein the maintenance free storage container allows for multiple storage servers of the plurality of storage servers to be in a failure mode without replacement. The maintenance free storage container further includes a container controller operable to manage failure mode information of the plurality of storage servers, manage mapping of a plurality of virtual storage servers to at least some of the plurality of storage servers based on the failure mode information, communicate storage server access requests with a device external to the maintenance free storage container using addressing of the plurality of virtual storage servers, and communicate the storage server access requests within the maintenance free storage container using addressing of the plurality of storage servers.
Type: Grant
Filed: April 18, 2012
Date of Patent: February 10, 2015
Assignee: Cleversafe, Inc.
Inventors: S. Christopher Gladwin, Jason K. Resch, Gary W. Grube, Timothy W. Markison
-
Patent number: 8954786
Abstract: A method, system, and medium are disclosed for performing transparent failover in a cluster server system. The cluster includes a plurality of servers. In servicing a client request, a primary server replicates session data for the client into memory space of one or more backup servers. The primary server sends a response to the client, wherein the response includes an indication of the one or more backup servers. When the client sends a subsequent request, it includes an indication of the backup servers. If the primary server is unavailable, the cluster determines a recovery server from among the backup servers indicated by the request. The chosen recovery server would then service the request.
Type: Grant
Filed: July 28, 2011
Date of Patent: February 10, 2015
Assignee: Oracle International Corporation
Inventors: Rajiv P. Mordani, Mahesh Kannan, Kshitiz Saxena, Shreedhar Ganapathy
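The failover selection described in patent 8954786 might look roughly like the sketch below, in which a request carries the backup servers named in the previous response and the cluster falls back to the first one still available. The function name, the "first live backup wins" policy, and the server names are assumptions made for illustration; the abstract only says a recovery server is chosen from the indicated backups.

```python
def choose_server(primary, backups_in_request, alive):
    """Pick the server that should handle a client request.

    `backups_in_request` is the backup list echoed by the client;
    `alive` is the set of servers currently reachable. Returns None
    when neither the primary nor any listed backup is available.
    """
    if primary in alive:
        return primary
    # Assumed policy: take the first listed backup that is alive,
    # since it already holds the replicated session data.
    for backup in backups_in_request:
        if backup in alive:
            return backup
    return None

# Hypothetical cluster state: the primary "s1" has gone down.
recovery = choose_server("s1", ["s2", "s3"], alive={"s3", "s4"})
```

Here `s2` is also down, so the request goes to `s3`, the only listed backup still running.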
-
Patent number: 8949659
Abstract: Scheduling workloads based on detected hardware errors is provided. In response to determining that a hardware error is detected, it is determined whether the hardware error is a cache error. In response to determining that the hardware error is a cache error, it is determined whether execution of a workload on a processor is changing contents of a cache associated with the cache error more than a threshold value. In response to determining that the execution of the workload on the processor is changing the contents of the cache associated with the cache error more than the threshold value, it is determined whether the cache associated with the cache error is private to a core in the processor. In response to determining that the cache associated with the cache error is private to a core, the execution of the workload is scheduled on a different core of the processor.
Type: Grant
Filed: October 18, 2012
Date of Patent: February 3, 2015
Assignee: International Business Machines Corporation
Inventor: Venkatesh Sainath
-
Patent number: 8949658
Abstract: In order to protect against various load balancing failures, the host selection algorithm on the load balancer can be modified to take into account data available about the state of the entire service and each host server in the cluster. The state can include a number of metrics, including the sampled response time taken by the selected host service. The load balancer can use the state information in order to detect anomalies among the host services. For example, the load balancer can determine that the sampled response time of one host service has deviated by more than a standard deviation limit (or other predetermined threshold) from the sampled response times of the other host services in the cluster. If such an anomaly is detected, the load balancer can take various remedial actions, such as disabling the routing of incoming requests to the potentially faulty host service.
Type: Grant
Filed: March 2, 2012
Date of Patent: February 3, 2015
Assignee: Amazon Technologies, Inc.
Inventors: Brian A. Scanlan, Chris Higgins
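One possible reading of the anomaly check in patent 8949658 is sketched below: each host's sampled response time is compared with the mean of the other hosts, and the host is flagged when the gap exceeds a multiple of their standard deviation. The multiplier `limit`, the use of the population standard deviation, and the host names and timings are assumptions, not details taken from the patent.

```python
from statistics import mean, pstdev

def anomalous_hosts(samples, limit=3.0):
    """Return hosts whose response time deviates from their peers.

    `samples` maps host name to a sampled response time in seconds.
    A host is flagged when its sample differs from the mean of the
    other hosts by more than `limit` standard deviations of those
    other hosts (an assumed form of the "standard deviation limit").
    """
    flagged = []
    for host, rt in samples.items():
        others = [v for h, v in samples.items() if h != host]
        if len(others) < 2:
            continue  # too few peers to estimate a spread
        mu, sigma = mean(others), pstdev(others)
        if abs(rt - mu) > limit * sigma:
            flagged.append(host)
    return sorted(flagged)

# Hypothetical samples: host "d" answers far more slowly than its peers.
samples = {"a": 0.10, "b": 0.11, "c": 0.09, "d": 0.90}
faulty = anomalous_hosts(samples)
routable = sorted(h for h in samples if h not in faulty)
```

Excluding the host under test from the baseline keeps one slow host from inflating the spread and hiding itself. The remedial action shown, dropping the host from the routable set, is one of the options the abstract mentions.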
-
Publication number: 20150026507
Abstract: It is intended to shorten the time required for a path recalculation and a path switching upon occurrence of a failure. A path generation unit of a transport control server (TCS) S-1 generates the normal path information in accordance with the topology information of a network and the resource information which are set. Also, the path generation unit generates in advance the backup path information for occurrence of the failure based on the prediction topology information and the prediction resource information which have been modified in accordance with a predicted failure position. The path generation unit stores the generated backup path information in a data storage unit. A path information notification unit of the TCS (S-1) notifies nodes N of the generated normal path information. A failure information acquisition unit of the TCS (S-1) detects the occurrence of the failure.
Type: Application
Filed: October 3, 2014
Publication date: January 22, 2015
Inventor: Daisuke MATSUBARA
-
Patent number: 8914663
Abstract: Techniques for rescheduling a failed backup job are described in various implementations. A method that implements the techniques may include identifying a failed instance of a backup job, and determining an estimated amount of time to complete a rescheduled execution of the failed instance. The method may also include determining an available window of time in a backup schedule that equals or exceeds the estimated amount of time to complete the rescheduled execution, and rescheduling the failed instance for execution during the available window of time.
Type: Grant
Filed: March 28, 2012
Date of Patent: December 16, 2014
Assignee: Hewlett-Packard Development Company, L.P.
Inventors: Hari Dhanalakoti, Sreekanth Gopisetty
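The window search described in patent 8914663 can be sketched as follows, assuming the backup schedule exposes its free windows as `(start, end)` pairs in minutes and that the earliest sufficiently long window is preferred. Both the data layout and the "earliest fit" policy are assumptions for illustration.

```python
def reschedule(estimate, free_windows):
    """Place a failed backup job into the earliest window that fits.

    `estimate` is the estimated run time in minutes; `free_windows`
    is a list of (start, end) minute offsets. Returns the (start, end)
    slot for the rescheduled job, or None if no window is long enough.
    """
    for start, end in sorted(free_windows):
        # The window must equal or exceed the estimated run time.
        if end - start >= estimate:
            return (start, start + estimate)
    return None

# Hypothetical schedule: a 45-minute job and three free windows.
slot = reschedule(45, [(200, 400), (0, 30), (60, 120)])
```

The first window is too short, so the job lands at the start of the second one.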
-
Patent number: 8913484
Abstract: A method and system for managing a backup service gateway (SGW) associated with a primary SGW, comprising periodically receiving from the primary SGW at least a portion of corresponding UE session state information, the received portion of session state information being sufficient to enable the secondary SGW to indicate to an inquiring management entity that all user sessions associated with a group of mobile devices supported by the primary SGW are in a live state; and in response to a failure of the primary SGW, assuming management of IP addresses and paths associated with the primary SGW and causing each UE supported by the failed primary SGW to reauthorize itself to the network.
Type: Grant
Filed: March 18, 2012
Date of Patent: December 16, 2014
Assignee: Alcatel Lucent
Inventors: Vachaspati P. Kompella, Satyam Sinha, Praveen Vasant Muley, Sathyender Nelakonda
-
Patent number: 8909980
Abstract: Described are techniques for coordinating processing to redirect requests. First and second storage processors of a data storage system are provided. Requests are directed to a first set of physical target ports of the first storage processor. The second storage processor is unavailable and has a second set of virtual target ports. Each virtual port of the second set is virtualized by a physical port of the first set. First processing is performed to redirect requests from the second set of virtual ports to a third set of physical ports of the second storage processor. First processing includes booting the second storage processor, directing requests to the third set of physical ports of the second storage processor rather than second set of virtual ports, and queuing requests received at the third set of physical ports until completion of pending requests previously received at the second set of virtual ports.
Type: Grant
Filed: June 29, 2012
Date of Patent: December 9, 2014
Assignee: EMC Corporation
Inventors: Daniel B. Lewis, Anoop George Ninan, Shuyu Lee, Matthew Long, Dilesh Naik
-
Patent number: 8892937
Abstract: The control device detects a failed node in which a failure has occurred from a plurality of computation nodes included in a plurality of computation units included in the parallel computer. The control device chooses execution nodes for executing the program from the computation nodes of the parallel computer except the detected failed nodes based on the number of computation nodes needed to execute the program. The control device selects paths to connect the computation nodes from a plurality of links each connecting two computation units adjacent to each other through a plurality of paths configured to connect computation nodes included in two computation units adjacent to each other in a one-to-one manner included in the links connecting two computation units adjacent to each other in the plurality of computation units including the chosen execution nodes except the path connected to the detected failed node.
Type: Grant
Filed: January 18, 2012
Date of Patent: November 18, 2014
Assignee: Fujitsu Limited
Inventor: Hidetoshi Iwashita
-
Patent number: 8868967
Abstract: The present invention discloses a method for connection-error handling of service in an Automatically Switched Optical Network (ASON), to resolve the technical problems that the conventional method for connection-error handling cannot realize rapid automatic configuration and the efficiency is low and other problems. Through automatically completing the configuration of the connection-error handling information of the start node and the end node by the control plane, the present invention overcomes the defect that manual setting is error prone; it is rapid and simple to implement the interaction of the error handling information between the start node and the end node by protocol exchange.
Type: Grant
Filed: March 23, 2010
Date of Patent: October 21, 2014
Assignee: ZTE Corporation
Inventor: Jianhong Wu
-
Patent number: 8867334
Abstract: In one embodiment, a list of border node next hop options is maintained in a memory. The list of border node next hop options includes one or more border nodes that may be utilized to reach one or more prefixes. An index value is associated with each border node of the list of border node next hop options. A list of labels is also maintained in the memory. The index value of each border node is associated with a corresponding label for a path to reach that border node. When a change to the one or more border nodes is detected, the list of border node next hop options is updated to remove a border node. However, a label for the path to reach the border node is maintained in the list of labels for at least a period of time.
Type: Grant
Filed: December 28, 2011
Date of Patent: October 21, 2014
Assignee: Cisco Technology, Inc.
Inventors: Pranav Dharwadkar, Yuri Tsier, Clarence Filsfils, John Bettink, Pradosh Mohapatra
-
Publication number: 20140310555
Abstract: The disclosed embodiments disclose techniques for performing physical domain error isolation and recovery in a multi-domain system, where the multi-domain system includes two or more processor chips and one or more switch chips that provide connectivity and cache-coherency support for the processor chips, and the processor chips are divided into two or more distinct domains. During operation, one of the switch chips determines a fault in the multi-domain system. The switch chip determines an originating domain that is associated with the fault, and then signals the fault and an identifier for the originating domain to its internal units, some of which perform clearing operations that clear out all traffic for the originating domain without affecting the other domains of the multi-domain system.
Type: Application
Filed: April 12, 2013
Publication date: October 16, 2014
Applicant: Oracle International Corporation
Inventors: Jurgen M. Schulz, Vishak Chandrasekhar, Wayne F. Seltzer, Brian J. McGee
-
Patent number: 8862812
Abstract: A first storage system includes a first storage unit to provide storage volumes, a first storage controller, and a first memory to store a first control program to process an input/output request received by the first storage system. A second storage system includes a second storage unit to provide storage volumes, a second storage controller, and a second memory to store a second control program to process an input/output request received by the second storage system. Each of the first and second storage systems is configured to present the storage volumes of the other storage system to the host computer, so that the host computer can access the storage volumes of each of the first and second storage systems via one of the first and second storage systems if the host computer is unable to access the storage volumes via the other storage system.
Type: Grant
Filed: January 19, 2011
Date of Patent: October 14, 2014
Assignee: Hitachi, Ltd.
Inventors: Yasuyuki Mimatsu, Shoji Kodama
-
Publication number: 20140304544
Abstract: Each node device has a sensor data saving information list storage section for storing a sensor data saving information list that indicates a proper node device for saving each of sensor data among node devices according to an attribute of the sensor data. A sensor data arrangement section transfers each of the sensor data saved in sensor data storage sections of the node devices to the proper node device for saving the sensor data based on the sensor data saving information list.
Type: Application
Filed: December 9, 2011
Publication date: October 9, 2014
Applicant: OMRON CORPORATION
Inventor: Hideki Takenaka
-
Patent number: 8856584
Abstract: The time required for a path recalculation and a path switching upon occurrence of a failure is shortened. A path generation unit of a transport control server (TCS) S-1 generates the normal path information in accordance with the topology information of a network and the resource information which are set. Also, the path generation unit generates in advance the backup path information for occurrence of the failure based on the prediction topology information and the prediction resource information which have been modified in accordance with a predicted failure position. A path information notification unit of the TCS (S-1) notifies nodes N of the generated normal path information. A failure information acquisition unit of the TCS (S-1) detects the occurrence of the failure. If the occurrence of the failure is detected, the path information notification unit notifies the nodes N of the backup path information that is stored.
Type: Grant
Filed: July 31, 2009
Date of Patent: October 7, 2014
Assignee: Hitachi, Ltd.
Inventor: Daisuke Matsubara
-
Patent number: 8856585
Abstract: Various exemplary embodiments relate to a method and related network node including one or more of the following: detecting, by a resource allocation device, a failure of server hardware; identifying a first agent device that is configured to utilize the server hardware; and taking at least one action to effect a reconfiguration of the first agent device in response to the server hardware failure. Various embodiments additionally include one or more of the following: identifying a second agent device that is configured to utilize the server hardware; and taking at least one action to effect a reconfiguration of the second agent device in response to the server hardware failure. Various embodiments additionally include one or more of the following: receiving, by the resource allocation device from a second agent device, an indication of the failure of server hardware, wherein the second agent device is different from the first agent device.
Type: Grant
Filed: August 1, 2011
Date of Patent: October 7, 2014
Assignee: Alcatel Lucent
Inventors: Eric J. Bauer, Randee S. Adams
-
Patent number: 8850262
Abstract: An approach to detecting processor failure in a multi-processor environment is disclosed. The approach may include having each CPU in the system responsible for monitoring another CPU in the system. CPU_n reads a timestamp_(n+1), created by the CPU_(n+1) that CPU_n is monitoring, from a shared memory location. CPU_n reads its own timestamp_n and compares the two timestamps to calculate a delta value. If the delta value is above a threshold, CPU_n determines that CPU_(n+1) has failed and initiates error handling for the CPUs in the system. One CPU may be designated a master CPU, and be responsible for beginning the error handling process. In such embodiments, CPU_n may initiate error handling by notifying the master CPU that CPU_(n+1) has failed. If CPU_(n+1) is the master CPU, CPU_n may take additional steps to initiate error handling, and may broadcast a non-critical interrupt to all CPUs, triggering error handling.
Type: Grant
Filed: October 12, 2010
Date of Patent: September 30, 2014
Assignee: International Business Machines Corporation
Inventors: Charles S. Cardinell, Roger G. Hathorn, Bernhard Laubli, Timothy J. Van Patten
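The neighbor-monitoring scheme of patent 8850262 could be sketched as below, with a plain list standing in for the shared memory that holds each CPU's latest timestamp. The wrap-around from the last CPU back to CPU 0, the choice of CPU 0 as master, and the returned action names are all assumptions added for illustration.

```python
def check_neighbor(n, timestamps, threshold, master=0):
    """Let CPU n check the CPU it monitors (n + 1).

    `timestamps[i]` is the latest timestamp written by CPU i to the
    shared location. Returns None if the neighbor looks healthy, or
    an (action, failed_cpu) tuple describing how error handling is
    started. Assumption: the last CPU monitors CPU 0.
    """
    neighbor = (n + 1) % len(timestamps)
    delta = timestamps[n] - timestamps[neighbor]
    if delta <= threshold:
        return None
    if neighbor == master:
        # The master itself is stale, so it cannot be notified;
        # fall back to interrupting every CPU.
        return ("broadcast_interrupt", neighbor)
    return ("notify_master", neighbor)

# Hypothetical shared-memory contents: CPU 2 stopped updating at t=40.
stamps = [100, 100, 40, 99]
verdict = check_neighbor(1, stamps, threshold=10)
```

In this example CPU 1 sees a delta of 60 against CPU 2 and reports it to the master. Had the stale entry belonged to CPU 0, the broadcast path would be taken instead.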
-
Publication number: 20140245059
Abstract: Aspects of a method and system for hybrid redundancy for electronic networks are provided. A first line card may comprise a first instance of a network layer circuit, a first instance of a physical layer circuit, and an interface to a data bus (e.g., an Ethernet bus) for communicating with a second line card. In response to detecting a failure of the first instance of the network layer circuit, the first instance of the physical layer circuit may switch from processing of a signal received via the first instance of the network layer circuit to processing of a signal received via the interface. The system may comprise a second line card. The second line card may comprise a second instance of the network layer circuit. The second instance of the network layer circuit may be coupled to the data bus.
Type: Application
Filed: February 24, 2014
Publication date: August 28, 2014
Applicant: MaxLinear, Inc.
Inventor: Curtis Ling
-
Patent number: 8819477
Abstract: Disclosed are various embodiments that facilitate error handling in a network page generation environment. A request for a network page is obtained from a client. The network page is associated with a network site hosted by a hosting provider on behalf of a customer. Page generation code supplied by the customer is executed by a framework in response to the request. The page generation code is configured to generate at least a portion of the network page. A customized error network page is sent to the client in response to determining that an error has occurred in the framework that executes the page generation code.
Type: Grant
Filed: February 1, 2012
Date of Patent: August 26, 2014
Assignee: Amazon Technologies, Inc.
Inventors: Prashant J. Thakare, Andrew S. Huntwork, Jeremy Boynes, Pravi Garg, Shashank Shekhar
-
Patent number: 8812898
Abstract: A system and method are provided for ensuring reliable data transfers by automatically recovering from un-correctable errors detected in data traversing throughout a system and being retrieved from an unreliable intermediate data buffer between a first memory and a secondary slower memory. Additionally, measures to compensate for the use of unreliable or error-prone components and interconnects, such as, for example, SRAM memory as a temporary buffer are provided. Further, measures to detect and correct errors, whatever the type, injected or occurring at any stage throughout traversal of the system are provided.
Type: Grant
Filed: September 27, 2012
Date of Patent: August 19, 2014
Assignee: Cadence Design Systems, Inc.
Inventors: Manas Lahon, Sandeep Brahmadathan
-
Patent number: 8780695
Abstract: A unit and a system for protection switching of line cards in a telecommunication system are described. A protection unit is connectable between communication lines and a line interface unit. The protection unit can be interconnected with other protection units to form a protection switching system. One protection unit in the protection switching system is connectable to a stand-by line card. The protection switching system is configured so that when protection switching is needed, the line signal is re-directed between the communication line for a failed line card and the stand-by line card via electrical connection elements.
Type: Grant
Filed: October 23, 2008
Date of Patent: July 15, 2014
Assignee: Telefonaktiebolaget L M Ericsson (publ)
Inventor: Renato Grosso
-
Patent number: 8773236
Abstract: Systems and methods for local management units in a photovoltaic energy system. In one embodiment, a method implemented in a computer system includes: attempting to communicate on a first active channel with a master management unit from a local management unit that controls a solar module; if communication with the master management unit on the first active channel has not been established, attempting to communicate on a second active channel with the master management unit.
Type: Grant
Filed: September 30, 2010
Date of Patent: July 8, 2014
Assignee: Tigo Energy, Inc.
Inventors: Maxym Makhota, Daniel Eizips, Shmuel Arditi, Ron Hadar
-
Patent number: 8762801
Abstract: A system includes a first device, a first storage element, a comparator and a second device. The first device is configured to test memory cells in an array of memory cells to detect defective memory cells. The defective memory cells include a first memory cell and a second memory cell. The first storage element is configured to store a first address of the first memory cell. The comparator is configured to compare a second address of the second memory cell to the first address.
Type: Grant
Filed: April 15, 2013
Date of Patent: June 24, 2014
Assignee: Marvell International Ltd.
Inventors: Winston Lee, Albert Wu, Chorng-Lii Liou
-
Publication number: 20140157041
Abstract: The present invention relates to a distributed avionics system (100, 500) having a plurality of computer nodes arranged to execute a plurality of partitions/applications (P1, P2, P3, P4, P5, P6). The distributed avionics system comprises reconfiguration means (332) arranged to reconfigure the distributed avionics system upon detection of failure in at least one of the computer nodes. Each partition/application is associated with a partition/application availability level. The reconfiguration means are arranged to reconfigure the distributed avionics system based on the partition/application availability levels of the partitions/applications (P1, P2, P3, P4, P5, P6). The present invention further relates to a method for back-up handling in a distributed avionics system having a plurality of computer nodes (A, B, C).
Type: Application
Filed: May 17, 2011
Publication date: June 5, 2014
Applicant: SAAB AB
Inventors: Torkel Danielsson, Jan Håkegård, Anders Gripsborn
-
Patent number: 8743680
Abstract: According to one aspect of the present disclosure, a method and technique for hierarchical network failure handling in a clustered node environment is disclosed. The method includes: detecting a network failure by a node in a cluster, the cluster having plural nodes arranged in a hierarchy, wherein the network failure is associated with a subordinate node in the hierarchy to the detecting node; communicating the network failure from the detecting node to a superior node in the hierarchy; determining whether the network failure affects nodes higher than the detecting node in the hierarchy; and responsive to determining that the network failure does not affect nodes higher than the detecting node in the hierarchy, the detecting node initiating a protocol to expel the subordinate node from the cluster.
Type: Grant
Filed: August 12, 2011
Date of Patent: June 3, 2014
Assignee: International Business Machines Corporation
Inventors: William B. Brown, David J. Craft, Robert K. Gjertsen
-
Publication number: 20140149782
Abstract: A method and apparatus for facilitating process restart in an IS-IS router that includes an active router processor (RP) module for supporting an active IS-IS process instance as well as one or more dormant instances of the active IS-IS process. Routing database information maintained by the active IS-IS process is synchronized to one or more corresponding databases associated with the dormant instances. Responsive to a control signal, one of the dormant instances may be activated as the new active IS-IS process instance on the active RP module, wherein the contents of the database corresponding to the newly activated instance are used for continuing to maintain routing functionality.
Type: Application
Filed: March 15, 2013
Publication date: May 29, 2014
Applicant: TELEFONAKTIEBOLAGET L M ERICSSON (PUBL)
Inventor: TELEFONAKTIEBOLAGET L M ERICSSON (PUBL)
-
Publication number: 20140149783
Abstract: Multiple computers in a cluster maintain respective sets of identifiers of neighbor computers in the cluster for each of multiple named resources. A combination of the respective sets of identifiers define a respective tree formed by the respective sets of identifiers for a respective named resource in the set of named resources. Upon origination and detection of a request at a given computer in the cluster, a given computer forwards the request from the given computer over a network to successive computers in the hierarchical tree leading to the computers relevant in handling the request based on use of identifiers of neighbor computers. Thus, a combination of identifiers of neighbor computers identify potential paths to related computers in the tree.
Type: Application
Filed: January 30, 2014
Publication date: May 29, 2014
Inventor: Ivan I. Georgiev
-
Publication number: 20140143589
Abstract: Disclosed herein is a method of managing the path of an OSEK network. The method of managing the path of an OSEK network includes step S1 at which a message is transferred along nodes of the OSEK network; step S2 at which a failed node at which a network failure has occurred is detected while the message is being transferred at step S1; step S3 at which the failed node of step S2 is eliminated from the overall network; and step S4 at which the message is transferred from a source node that has transferred the message to the failed node of step S2 to a target node to which the failed node will transfer the message by connecting the source node with the target node.
Type: Application
Filed: May 4, 2012
Publication date: May 22, 2014
Applicant: Research & Business Foundation SUNGKYUNKWAN UNIVERSITY
Inventors: Jae Wook Jeon, Sung Suk Jung, Ho Young Jeong, Jin Ho Kim
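Steps S3 and S4 of publication 20140143589 amount to splicing the failed node out of the message path. A minimal sketch follows, assuming the path is a logical ring stored as a successor map (node name to next node); that representation and the node names are assumptions, and failure detection (step S2) is taken as already done.

```python
def bypass(successor, failed):
    """Remove `failed` from a logical ring and reconnect its neighbors.

    `successor` maps each node to the node it forwards messages to.
    The source node (whose successor was `failed`) is connected
    directly to the target node (the successor of `failed`).
    Returns a new map; the input is left unchanged.
    """
    ring = dict(successor)
    target = ring.pop(failed)          # S3: eliminate the failed node
    for node, nxt in ring.items():
        if nxt == failed:
            ring[node] = target        # S4: connect source to target
    return ring

# Hypothetical three-node ring in which node "B" has failed.
original = {"A": "B", "B": "C", "C": "A"}
repaired = bypass(original, "B")
```

After the call, `A` forwards straight to `C`, so the message keeps circulating without `B`.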
-
Patent number: 8719618
Abstract: A technique is provided for a cache. A cache controller accesses a set in a congruence class and determines that the set contains corrupted data based on an error being found. The cache controller determines that a delete parameter for taking the set offline is met and determines that a number of currently offline sets in the congruence class is higher than an allowable offline number threshold. The cache controller determines not to take the set in which the error was found offline based on determining that the number of currently offline sets in the congruence class is higher than the allowable offline number threshold.
Type: Grant
Filed: June 13, 2012
Date of Patent: May 6, 2014
Assignee: International Business Machines Corporation
Inventors: Ekaterina M. Ambroladze, Michael A. Blake, Timothy C. Bronson, Hieu T. Huynh
-
Patent number: 8713355
Abstract: A system that incorporates teachings of the present disclosure may include, for example, an edge device having a controller to receive a Session Initiation Protocol (SIP) message from a user endpoint device (UE) requesting communication services, forward the SIP message to a network element of a Server Office, receive from the network element a first error message indicating communication services at the Server Office are unavailable, replace the first error message with a second error message, the second error message indicating a temporary unavailability of communication services, and transmit the second error message to the UE. Additional embodiments are disclosed.
Type: Grant
Filed: June 7, 2013
Date of Patent: April 29, 2014
Assignee: AT&T Intellectual Property I, LP
Inventors: Chaoxin Qiu, Robert F. Dailey, Satish Parolkar
-
Patent number: 8711684Abstract: A method and apparatus for detecting an intermittent path to a storage system comprising accessing path statistics comprising indicia of path state of a path to a storage system, determining whether the path state has changed during a predefined period and, if the path state has changed at least a predefined number of times during the predefined period, identifying the path as intermittent. Once a path is deemed intermittent, the path is aged until either the path is no longer intermittent or the path is deemed dead.Type: GrantFiled: July 9, 2007Date of Patent: April 29, 2014Assignee: Symantec CorporationInventors: Ameya Prakash Usgaonkar, Hari Krishna Vemuri, Siddhartha Nandi
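The intermittency test in the abstract above amounts to counting path state changes inside a time window. A minimal sketch follows, assuming state-change timestamps are available as plain numbers; the function name and the "stable" label are hypothetical, and the aging step is not modeled.

```python
def classify_path(state_change_times, now, period, min_changes):
    """Label a path "intermittent" if its state changed at least
    `min_changes` times during the last `period` time units."""
    recent = [t for t in state_change_times if now - period <= t <= now]
    return "intermittent" if len(recent) >= min_changes else "stable"


print(classify_path([50, 55, 58], now=60, period=15, min_changes=3))
print(classify_path([1, 2, 3, 50], now=60, period=15, min_changes=2))
```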
-
Patent number: 8706105Abstract: A method, system and machine-readable storage medium for providing fault tolerance in a distributed mobile architecture (dMA) system. The method includes receiving a message or failing to receive the message within a predetermined time relating to a first dMA gateway (dMAG) at a second dMAG. It is determined whether the first dMAG is not operational or is otherwise offline based on the received message or the failure to receive the message. One or more dMA nodes associated with the first dMAG are notified in order to request connections to an external system via the second dMAG. The external system is also notified to request connections to one or more dMA nodes associated with the first dMAG via the second dMAG.Type: GrantFiled: June 27, 2008Date of Patent: April 22, 2014Assignee: Lemko CorporationInventor: Shaowei Pan
-
Publication number: 20140108854Abstract: High availability of a virtual machine is ensured even when all of the virtual machine's IO paths fail. In such a case, the virtual machine is migrated to a host that is sharing the same storage system as the current host in which the virtual machine is being executed and has at least one functioning IO path to the shared storage system. After execution control of the virtual machine is transferred to the new host, IO operations from the virtual machine are issued over the new IO path.Type: ApplicationFiled: October 11, 2012Publication date: April 17, 2014Applicant: VMware, Inc.Inventors: Jinto ANTONY, Sudhish Panamthanath Thankappan, Jidhin Malathusseril Thomas
-
Publication number: 20140095922Abstract: An initial SIP message is sent to establish a first SIP communication session from a first SIP device. The initial SIP message is sent via a first of a plurality of session managers to a second SIP device. After receiving the initial SIP message at the second SIP device and before ending the first SIP communication session, either the first or second SIP device sends a second SIP message. The second SIP message is sent to the first of the plurality of session managers. Either the first or second SIP device detects that a response SIP message to the sent second SIP message was not received within a defined time period. In response to detecting that the SIP response message was not received within the defined time period, either the first or second SIP device resends the second SIP message to a second one of the plurality of session managers.Type: ApplicationFiled: September 28, 2012Publication date: April 3, 2014Applicant: Avaya Inc.Inventors: Stephen Andrew Baker, Harsh V. Mendiratta, Kevin Sean Cripps, Ryan Scott Wallach
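The resend behavior in the abstract above can be sketched as a timeout-driven failover loop. This is an illustration under stated assumptions: `send` is a hypothetical callable that returns a response, or `None` when nothing arrives within the time limit, and no real SIP library is used. The abstract describes resending to a second session manager; iterating over further managers is a generalization added here.

```python
def send_with_failover(message, session_managers, send, timeout_s):
    """Try each session manager in order until one responds in time.

    Returns (manager, response), or (None, None) if every attempt timed out.
    """
    for manager in session_managers:
        response = send(manager, message, timeout_s)
        if response is not None:
            return manager, response
    return None, None


# Hypothetical transport for demonstration: "sm1" never answers.
def fake_send(manager, message, timeout_s):
    return None if manager == "sm1" else f"200 OK from {manager}"


print(send_with_failover("re-INVITE", ["sm1", "sm2"], fake_send, timeout_s=2.0))
```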
-
Publication number: 20140095923Abstract: Embodiments of the invention relate to faulty recovery mechanisms for a two-dimensional (2-D) network on a processor array. One embodiment comprises a processor array including multiple processor core circuits, and a redundant routing system for routing packets between the core circuits. The redundant routing system comprises multiple switches, wherein each switch corresponds to one or more core circuits of the processor array. The redundant routing system further comprises multiple data paths interconnecting the switches, and a controller for selecting one or more data paths. Each selected data path is used to bypass at least one component failure of the processor array to facilitate full operation of the processor array.Type: ApplicationFiled: September 28, 2012Publication date: April 3, 2014Applicant: International Business Machines CorporationInventors: Rodrigo Alvarez-Icaza Rivera, John V. Arthur, John E. Barth, JR., Andrew S. Cassidy, Subramanian Iyer, Paul A. Merolla, Dharmendra S. Modha
-
Patent number: 8677180Abstract: A system and a method for failover control comprising: maintaining a primary device table entry (DTE) in a first table activated for a first adapter in communication with a first processor node having a first root complex via a first switch assembly and maintaining a secondary DTE in standby for a second adapter in communication with a second processor node having a second root complex via a second switch assembly; maintaining a primary DTE in a second table activated for the second adapter and maintaining a secondary DTE in standby for the first adapter; and upon a failover, updating the secondary DTE in the first table as an active entry for the second adapter and forming a path to enable traffic to route from the second adapter through the second switch assembly over to the first switch assembly and up to the first root complex of the first processor node.Type: GrantFiled: June 23, 2010Date of Patent: March 18, 2014Assignee: International Business Machines CorporationInventors: Gerd K. Bayer, David F. Craddock, Thomas A. Gregg, Michael Jung, Andreas Kohler, Elke G. Nass, Oliver G. Schlag, Peter K. Szwed
-
Patent number: 8671306Abstract: A messaging system may operate on multiple processor partitions in several configurations to provide queuing and topic subscription services on a large scale. A queue service may receive messages from multiple transmitting services and distribute the messages to a single service. A topic subscription service may receive messages from multiple transmitting services, but distribute the messages to multiple recipients, often with a filter applied to each recipient where the filter defines which messages may be transmitted by the recipient. Large queues or topic subscriptions may be divided across multiple processor partitions with separate sets of recipients for each partition in some cases, or with duplicate sets of recipients in other cases.Type: GrantFiled: December 21, 2010Date of Patent: March 11, 2014Assignee: Microsoft CorporationInventors: Kartik Paramasivam, Murali Krishnaprasad, Jayu Katti, Pramod Gurunath, Affan Arshad Dar
-
Patent number: 8661130Abstract: Server management data describes observed operating condition of a pool of spare servers. Based on a demand forecast of a specific target system, a dynamic allocation period is determined as a period during which the target system needs additional server resources to handle an expected demand. Based on the dynamic allocation period and server management data, a set of allocation candidates are nominated from the spare server pool, by eliminating therefrom spare servers which are likely to fail during the dynamic allocation period. An appropriate allocation candidate is then selected for allocation to the target system, such that the selected candidate will satisfy a specified requirement during its allocation period.Type: GrantFiled: March 10, 2009Date of Patent: February 25, 2014Assignee: Fujitsu LimitedInventors: Masataka Sonoda, Satoshi Tsuchiya, Kunimasa Koike, Atsuji Sekiguchi
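The nominate-then-select flow in the abstract above can be sketched as two filters. All names and fields below are hypothetical: the abstract does not say how failure likelihood is represented, so a predicted time to failure stands in for it, and picking the smallest sufficient server is an assumed tie-break.

```python
from dataclasses import dataclass


@dataclass
class SpareServer:
    name: str
    hours_until_predicted_failure: float  # assumed form of the failure forecast
    capacity: int                         # assumed form of the "requirement"


def nominate_candidates(spares, allocation_hours):
    """Eliminate spares predicted to fail during the allocation period."""
    return [s for s in spares
            if s.hours_until_predicted_failure > allocation_hours]


def select_candidate(candidates, required_capacity):
    """Pick the smallest candidate that still meets the requirement."""
    eligible = [s for s in candidates if s.capacity >= required_capacity]
    return min(eligible, key=lambda s: s.capacity, default=None)


pool = [
    SpareServer("s1", hours_until_predicted_failure=5, capacity=16),
    SpareServer("s2", hours_until_predicted_failure=200, capacity=8),
    SpareServer("s3", hours_until_predicted_failure=300, capacity=32),
]
candidates = nominate_candidates(pool, allocation_hours=48)
print(select_candidate(candidates, required_capacity=8))
```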
-
Publication number: 20140053014Abstract: Embodiments relate to a computer for transmitting data in a network. The computer includes at least one data transmission port configured to be connected to at least one storage device via a plurality of paths of a network. The computer further includes a processor configured to detect recurring intermittent errors in one or more paths of the plurality of paths and to disable access to the one or more paths based on detecting the recurring intermittent errors.Type: ApplicationFiled: March 8, 2013Publication date: February 20, 2014Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATIONInventors: Ian A. MacQuarrie, James A. O'Connor, Limei Shaw, Thomas Walter, Thomas V. Weaver, Shawn T. Wright
-
Patent number: 8656212Abstract: Various embodiments herein include at least one of systems, methods, and software to detect and reduce messages from network entity management clients that are not utilized by a network management system. Once identified, the network management system may send a command to the network entity management clients to no longer send particular message types to the network management system. The network management system may also, or alternatively, be configured to take no action when such messages are subsequently received.Type: GrantFiled: February 17, 2012Date of Patent: February 18, 2014Assignee: CA, Inc.Inventors: Timothy J. Pirozzi, Jerome S. Simms, Jonathan Caron
-
Publication number: 20140040659Abstract: Embodiments herein relate to selection of one of first and second links between first and second network devices. The first link is to transmit the traffic between the first and second network devices directly and the second link is to transmit the traffic between the first and second network devices through a network appliance.Type: ApplicationFiled: July 31, 2012Publication date: February 6, 2014Inventor: Gary Michael Wassermann
-
Publication number: 20140025985Abstract: A connection node is included in a connecting part of a plurality of rings in a ring network. The connection node includes a failure detecting unit, an optical-signal processing unit, an ODU switch, and an optical-signal processing unit. The failure detecting unit detects failure in the connecting part. The optical-signal processing unit receives data transmitted from another node on a ring to which the connection node belongs. Upon detection of the failure, the ODU switch determines whether to pass the data or return the data in reverse direction from the connection node depending on a destination to transfer the received data, and sets a transmission path of the data based on a result of the determination. The optical-signal processing unit transfers the data in accordance with the set transmission path.Type: ApplicationFiled: May 7, 2013Publication date: January 23, 2014Applicant: FUJITSU LIMITEDInventor: Yuji Tochio
-
Patent number: 8627137Abstract: In one embodiment, a network device may detect a data plane critical fault condition, while a corresponding control plane is not experiencing a critical fault condition. In response to a network device based critical fault condition, the network device may activate and advertise an increased and expensive usable metric for each network interface of the network device. On the other hand, in response to an interface based critical fault condition, the network device may activate and advertise an increased and expensive usable metric for one or more particular network interfaces of the interface based critical fault, and signal, over the control plane to a corresponding network device at an opposing end of each particular network interface of the interface based critical fault, a request to activate and advertise an increased and expensive usable metric at the opposing end of each particular network interface.Type: GrantFiled: September 16, 2010Date of Patent: January 7, 2014Assignee: Cisco Technology, Inc.Inventors: Nikunj R. Vaidya, Pradosh Mohapatra, Arun Satyanarayana, Pankaj Bhagra, Clarence Filsfils
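The two fault responses in the abstract above differ only in which interfaces get the raised metric and whether the far end is asked to do the same. A minimal sketch follows; the constant `EXPENSIVE_METRIC`, the function name, and the dictionary representation are all hypothetical and not tied to any routing protocol.

```python
EXPENSIVE_METRIC = 65000  # hypothetical "costly but still usable" value


def react_to_data_plane_fault(metrics, faulty_interfaces=None):
    """Return (advertised_metrics, interfaces_whose_peers_get_a_request).

    `metrics` maps interface name to its current metric.
    `faulty_interfaces=None` models a device-based fault: every interface
    is raised. Otherwise only the listed interfaces are raised, and a
    request is signaled to the device at the opposing end of each one.
    """
    device_wide = faulty_interfaces is None
    affected = list(metrics) if device_wide else list(faulty_interfaces)
    advertised = dict(metrics)
    for name in affected:
        advertised[name] = max(metrics[name], EXPENSIVE_METRIC)
    peer_requests = [] if device_wide else affected
    return advertised, peer_requests


print(react_to_data_plane_fault({"ge0": 10, "ge1": 20}, ["ge1"]))
```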
-
Patent number: 8621263Abstract: A quorum service within a cluster infrastructure layer of a cluster environment comprising a plurality of nodes automatically triggers at least one automated fencing operation integrated within the quorum service, to reliably maintain a node usability state of each node of the plurality of nodes indicating an availability of each node to control and access at least one shared resource of the cluster. The quorum service reports the node usability state of each node as a cluster health status to at least one distributed application within an application layer of the cluster environment, to provide a reliable cluster health status of the plurality of nodes to the at least one distributed application for a failover of said at least one shared resource from control by a failed node from among the plurality of nodes to another node from among the plurality of nodes.Type: GrantFiled: December 18, 2012Date of Patent: December 31, 2013Assignee: International Business Machines CorporationInventors: Myung Bae, Robert K. Gardner
-
Patent number: 8621262Abstract: A semiconductor integrated circuit includes a first master circuit; a second master circuit; a first slave circuit that is assigned to the first master circuit and determines that an access request signal was sent from the first master circuit when its identification information is a first value; a first bus coupled to the first master circuit, the second master circuit, and the first slave circuit; a bus controller configured to transmit the access request signal to the first slave circuit via the first bus; and a system controller that directs the bus controller to substitute the first value for a second value in the identification information of the access request signal received from the second master circuit when the first master circuit is in a deactivated state.Type: GrantFiled: September 14, 2012Date of Patent: December 31, 2013Assignee: Renesas Electronics CorporationInventors: Shigeyuki Ueno, Hiroyuki Nakajima