Abstract: A fault-tolerant system including a calculation unit and an output synthesizer is provided. The calculation unit receives a first environmental parameter and input data, wherein the calculation unit further includes a first calculation circuit and a second calculation circuit. The first calculation circuit is arranged to perform a calculation on the input data in response to the first environmental parameter to generate a first calculation result. The second calculation circuit is different from the first calculation circuit, and is arranged to perform the calculation on the input data in response to the first environmental parameter to generate a second calculation result. The output synthesizer selects a first set of bits and a second set of bits from the first and the second calculation results, respectively, according to a control signal, and synthesizes the first set of bits and the second set of bits in sequence to generate an adjusted calculation result.
Type:
Grant
Filed:
October 15, 2013
Date of Patent:
December 6, 2016
Assignee:
Industrial Technology Research Institute
Inventors:
Yung-Chang Chang, Hsing-Chuang Liu, Chih-Jen Yang
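The bit-selection step described in the abstract above can be illustrated with a minimal Python sketch. It assumes the control signal is a split point: the high-order bits come from the first calculation result and the low-order bits from the second. The abstract does not specify the encoding of the control signal, so the function name `synthesize`, the `split` parameter, and the 8-bit default width are hypothetical.

```python
def synthesize(first_result, second_result, split, width=8):
    """Combine the high bits of first_result with the low `split` bits of second_result.

    Assumption: `split` plays the role of the control signal; the real
    encoding is not given in the abstract.
    """
    if not 0 <= split <= width:
        raise ValueError("split must be between 0 and width")
    full_mask = (1 << width) - 1
    low_mask = (1 << split) - 1        # bits taken from the second result
    high_mask = full_mask & ~low_mask  # bits taken from the first result
    return (first_result & high_mask) | (second_result & low_mask)
```

For example, `synthesize(0b10110000, 0b00001101, 4)` keeps the upper nibble of the first result and the lower nibble of the second.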
Abstract: A computer processor includes a decoder for decoding machine instructions and an execution unit for executing those instructions. The decoder and the execution unit are capable of decoding and executing vector instructions that include one or more format conversion indicators. For instance, the processor may be capable of executing a vector-load-convert-and-write (VLoadConWr) instruction that provides for loading data from memory to a vector register. The VLoadConWr instruction may include a format conversion indicator to indicate that the data from memory should be converted from a first format to a second format before the data is loaded into the vector register. Other embodiments are described and claimed.
Type:
Grant
Filed:
March 15, 2013
Date of Patent:
November 15, 2016
Assignee:
Intel Corporation
Inventors:
Eric Sprangle, Robert D. Cavin, Anwar Rohillah, Douglas M. Carmean
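As a rough software analogy for a load that converts formats on the way into a vector register, the sketch below reads packed values from a byte buffer and widens them to Python floats. It is not the VLoadConWr instruction itself: the indicator names in `SOURCE_FORMATS` and the function `vload_convert` are hypothetical, and the destination format is assumed to be floating point.

```python
import struct

# Hypothetical format conversion indicators, mapped to struct codes
# that describe how the data is laid out in memory.
SOURCE_FORMATS = {"f16": "e", "u8": "B", "i16": "h"}

def vload_convert(memory, offset, lanes, indicator):
    """Load `lanes` elements starting at `offset`, converting each to float."""
    code = SOURCE_FORMATS[indicator]
    size = struct.calcsize("<" + code)
    raw = memory[offset:offset + lanes * size]
    values = struct.unpack(f"<{lanes}{code}", raw)  # little-endian source data
    return [float(v) for v in values]               # the simulated vector register
```

Calling `vload_convert(buf, 0, 4, "f16")` on a buffer of four half-precision values returns the same four values as full-precision floats.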
Abstract: Some embodiments provide a system for allocating resources in a compute farm. During operation, the system can receive resource-requirement information for a project. Next, the system can receive a request to execute a new job in the compute farm. In response to determining that no job slots are available for executing the new job, and that the project associated with the new job has not used up its allocated job slots, the system may execute the new job by suspending or re-queuing a job that is currently executing, and allocating the freed-up job slot to the new job. If the system receives a resource-intensive job, the system may create dummy jobs, and schedule the dummy jobs on the same computer system as the resource-intensive job to prevent the queuing system from scheduling multiple resource-intensive jobs on the same computer system.
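The slot-reclaiming behavior in the abstract above might look roughly like the following sketch. The abstract does not say which running job is suspended, so the sketch assumes the victim is the first job found from a project that is running more jobs than its allocation. The function name `submit_job` and the data layout are hypothetical.

```python
def submit_job(running, allocations, total_slots, new_job):
    """Place new_job in the farm; return the suspended job, or None.

    running: list of (job_id, project) tuples, modified in place.
    allocations: dict mapping project -> number of allocated job slots.
    """
    _, project = new_job
    if len(running) < total_slots:          # a free slot exists
        running.append(new_job)
        return None
    used = sum(1 for _, p in running if p == project)
    if used >= allocations.get(project, 0):
        raise RuntimeError("project has used up its allocated job slots")
    counts = {}
    for _, p in running:
        counts[p] = counts.get(p, 0) + 1
    # Assumption: suspend a job from a project that exceeds its allocation.
    for victim in running:
        if counts[victim[1]] > allocations.get(victim[1], 0):
            running.remove(victim)
            running.append(new_job)         # the freed slot goes to the new job
            return victim
    raise RuntimeError("no job eligible for suspension")
```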
Abstract: A method and apparatus for creating and executing a packet of chained instructions in a processor. A first instruction specifies a first operand is to be accessed from a memory and delivered through a first path in a first network to a first output. A second instruction specifies the first operand is to be received from the first output, to operate on the first operand, and to generate a result delivered to a second output. The second instruction does not identify a source device for the first operand and a destination device for the result. A third instruction specifies the first result is to be received from the second output and delivered through a first path in a second network for storage in the memory. The first, second, and third instructions are paired together as a packet of chained instructions for execution by a processor.
Abstract: Systems and methods are provided to track cluster nodes and provide high availability in a computing system. A computer system includes hosts, a cluster manager, and a cluster database. The cluster database includes entries corresponding to the hosts which identify the physical location of a corresponding host. The cluster manager uses the data to select at least two hosts and assign the selected hosts to a service group for executing an application. The cluster manager selects hosts via an algorithm that determines which hosts are least likely to share a single point of failure. The data includes a hierarchical group of location attributes describing two or more of a host's country, state, city, building, room, enclosure, and radio frequency identifier (RFID). The location-based algorithm identifies a group of selected hosts whose smallest shared location attribute is highest in the hierarchical group. The system updates the data whenever a physical location of a host changes.
Type:
Grant
Filed:
March 19, 2009
Date of Patent:
September 27, 2016
Assignee:
Veritas Technologies LLC
Inventors:
Sandeep Agarwal, Chio Fai Aglaia Kong, Karthik Ramamurthy
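The location-based selection in the abstract above can be sketched for the two-host case: among all pairs, prefer the pair that agrees on the fewest leading levels of the location hierarchy. The level names follow the abstract (RFID is omitted for brevity), while `shared_depth`, `pick_pair`, and the host records are hypothetical, and tie-breaking is an arbitrary choice of this sketch.

```python
from itertools import combinations

LEVELS = ("country", "state", "city", "building", "room", "enclosure")

def shared_depth(a, b):
    """Count the leading hierarchy levels on which two hosts agree."""
    depth = 0
    for level in LEVELS:
        if a.get(level) is None or a.get(level) != b.get(level):
            break
        depth += 1
    return depth

def pick_pair(hosts):
    """hosts: dict of name -> location dict. Return the least co-located pair."""
    return min(
        combinations(sorted(hosts), 2),
        key=lambda pair: shared_depth(hosts[pair[0]], hosts[pair[1]]),
    )
```

A pair that shares only a country has depth 1 and is preferred over a pair that shares a building (depth 4), since the latter is more likely to share a single point of failure.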
Abstract: In an array processing section, using data strings entered from input ports, a plurality of data processor elements execute predetermined operations while transferring data to each other, and output data strings of results of the operations from a plurality of output ports. A first data string converter converts data strings stored in a plurality of data storages of a data storage group into a placement suitable for the operations in the array processing section, and enters the converted data strings into the input ports of the array processing section. A second data string converter converts the data strings output from output ports of the array processing section into a placement to be stored in the plurality of data storages of the data storage group.
Abstract: Methods, apparatus, instructions and logic provide SIMD vector sub-byte decompression functionality. Embodiments include shuffling a first and second byte into the least significant portion of a first vector element, and a third and fourth byte into the most significant portion. Processing continues shuffling a fifth and sixth byte into the least significant portion of a second vector element, and a seventh and eighth byte into the most significant portion. Then by shifting the first vector element by a first shift count and the second vector element by a second shift count, sub-byte elements are aligned to the least significant bits of their respective bytes. Processors then shuffle a byte from each of the shifted vector elements' least significant portions into byte positions of a destination vector element, and from each of the shifted vector elements' most significant portions into byte positions of another destination vector element.
Type:
Grant
Filed:
July 31, 2013
Date of Patent:
August 2, 2016
Assignee:
Intel Corporation
Inventors:
Tal Uliel, Elmoustapha Ould-Ahmed-Vall, Thomas Willhalm, Robert Valentine
Abstract: An electronic processor is provided for use with a memory (2530) having selectable memory areas. The processor includes a memory area selection circuit (MMU) operable to select one of the selectable memory areas at a time, and an instruction fetch circuit (2520, 2550) operable to fetch a target instruction at an address from the selected one of the selectable memory areas.
Abstract: Systems and methods for integrating multiple best effort hardware transactional support mechanisms, such as Read Set Monitoring (RSM) and Best Effort Hardware Transactional Memory (BEHTM), in a single transactional memory implementation are described. The best effort mechanisms may be integrated such that the overhead associated with support of multiple mechanisms may be reduced and/or the performance of the resulting transactional memory implementations may be improved over those that include any one of the mechanisms, or an un-integrated collection of multiple such mechanisms. Two or more of the mechanisms may be employed concurrently or serially in a single attempt to execute a transaction, without aborting or retrying the transaction. State maintained or used by a first mechanism may be shared with or transferred to another mechanism for use in execution of the transaction. This transfer may be performed automatically by the integrated mechanisms (e.g., without user, programmer, or software intervention).
Abstract: A circuit arrangement and method selectively bypass an instruction buffer for selected instructions so that bypassed instructions can be dispatched without having to first pass through the instruction buffer. Thus, for example, in the case that an instruction buffer is partially or completely flushed as a result of an instruction redirect (e.g., due to a branch mispredict), instructions can be forwarded to subsequent stages in an instruction unit and/or to one or more execution units without the latency associated with passing through the instruction buffer.
Type:
Grant
Filed:
June 28, 2010
Date of Patent:
May 31, 2016
Assignee:
International Business Machines Corporation
Inventors:
Eric O. Mejdrich, Paul E. Schardt, Robert A. Shearer, Matthew R. Tubbs
Abstract: A buffer may be configured to store a plurality of items, and to be accessed by one or more activities of an instance of a process model. A scheduler may be configured to schedule execution of each of a plurality of activities of the process model, and to determine an activation of an activity of the plurality of activities. The scheduler may include an activity manager configured to access an activity profile of the activity upon the determining of the activation, the activity profile including buffer access characteristics according to which the activity is designed to access the buffer. A process execution unit may be configured to execute the activity and may include a buffer access manager configured to access the buffer according to the buffer access characteristics of the activity profile, and to thereby facilitate an exchange of at least one item between the buffer and the activity.
Abstract: Aspects capable of dynamically and flexibly supporting a plurality of locales upon provisioning are provided. An associated management server includes a storage table configured to store a plurality of logical device operations, a plurality of locales, and a plurality of workflows. The management server further includes a provisioning circuit configured to dynamically determine, for a required logical device operation among the plurality of logical device operations, a resource server among all resource servers connected to the management server by way of provisioning. Each of the resource servers is associated with a different one of the plurality of locales in advance of the provisioning. The management server further includes a calling circuit configured to search the storage table using a locale among the plurality of locales that is associated with the dynamically determined resource server to select a workflow from the plurality of workflows for the required logical device operation.
Type:
Grant
Filed:
May 27, 2014
Date of Patent:
April 19, 2016
Assignee:
International Business Machines Corporation
Abstract: A processor may generate a result vector when executing a RunningShiftForDivide1P or RunningShiftForDivide2P instruction. For example, upon executing a RunningShiftForDivide1P/2P instruction, the processor may receive a first input vector and a second input vector. The processor then may record a base value from an element at a key element position in the first input vector. Next, when generating the result vector, for each active element in the result vector to the right of the key element position, the processor may generate a shifted base value using shift values from the second input vector. The processor then may correct the shifted base value when a predetermined condition is met. Next, the processor may set the element of the result vector equal to the shifted base value.
Abstract: A method and apparatus for handling low power and high performance loads is herein described. Software, such as a compiler, is utilized to identify producer loads, consumer reuse loads, and consumer forwarded loads. Based on the identification by software, hardware is able to direct performance of the load directly to a load value buffer, a store buffer, or a data cache. As a result, accesses to cache are reduced, through direct loading from load and store buffers, without sacrificing load performance.
Type:
Grant
Filed:
December 30, 2007
Date of Patent:
April 12, 2016
Assignee:
Intel Corporation
Inventors:
Tingting Sha, Chris Wilkerson, Herbert Hum, Alaa R. Alameldeen
Abstract: A computing method includes specifying a virtual computer system including at least one virtual or physical compute node, which produces data packets having respective source attributes. At least one Virtual Input-Output Connection (VIOC) that is uniquely associated with the values of the source attributes is defined. A policy specifying an operation to be performed with regard to the VIOC is defined. The virtual computer system is implemented on a physical computer system, which includes at least one physical packet switching element. The physical packet switching element is configured to identify the data packets whose source attributes have the values that are associated with the VIOC and to perform the operation on the identified data packets, so as to enforce the policy on the VIOC.
Abstract: A semiconductor processor is described. The semiconductor processor includes logic circuitry to perform a logical reduction instruction. The logic circuitry has swizzle circuitry to swizzle a vector's elements so as to form a swizzle vector. The logic circuitry also has vector logic circuitry to perform a vector logic operation on said vector and said swizzle vector.
Type:
Grant
Filed:
September 24, 2010
Date of Patent:
September 22, 2015
Assignee:
Intel Corporation
Inventors:
Jeff Wiedemeier, Sridhar Samudrala, Roger Golliver
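One way to picture a swizzle-based logical reduction is the sketch below: the vector is repeatedly combined with a rearranged copy of itself until every element holds the reduction of all inputs. The rotation pattern used as the swizzle, the power-of-two length restriction, and the name `logical_reduce` are assumptions of this sketch; the abstract does not specify the swizzle pattern.

```python
from operator import or_

def logical_reduce(vector, op=or_):
    """Reduce a vector with an associative, commutative logic op (OR by default)."""
    n = len(vector)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a nonzero power of two")
    current = list(vector)
    stride = n // 2
    while stride >= 1:
        # Swizzle: rotate the elements by `stride` positions (assumed pattern).
        swizzled = current[stride:] + current[:stride]
        # Vector logic operation on the vector and its swizzled copy.
        current = [op(x, y) for x, y in zip(current, swizzled)]
        stride //= 2
    return current[0]
```

With four elements the loop runs twice, so the reduction takes two vector operations instead of three scalar ones.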
Abstract: A system that supports multi-threaded transactions includes one or more processors configured to speculatively execute a first thread in a first transaction and speculatively execute a second thread concurrently with the first thread in a comparable transaction with respect to the first transaction. It further includes one or more memories coupled to the one or more processors, configured to provide the one or more processors with data storage. An uncommitted value written by the first thread while executing in the first transaction is visible to the second thread executing in the comparable transaction.
Type:
Grant
Filed:
March 2, 2009
Date of Patent:
September 8, 2015
Assignee:
Parakinetics Inc.
Inventors:
David I. August, Neil Vachharajani, Matthew J. Bridges
Abstract: A processing core in a multi-processing core system is configured to execute a sequence of instructions as an atomic memory transaction. Executing each instruction in the sequence comprises validating that the instruction meets a set of one or more atomicity criteria, including that executing the instruction does not require accessing shared memory. Executing the atomic memory transaction may comprise storing memory data from a source cache line into a target register, reading or modifying the memory data stored in the target register as part of executing the sequence, and storing a value from the target register to the source cache line.
Abstract: A method and a computing device. A first computing device and a second computing device are connected, wherein the first computing device includes a first virtual machine monitor that hosts a first virtual machine. A boot image is provided on the first computing device, wherein the boot image includes a second virtual machine monitor that is adapted to host the first virtual machine. The second computing device is triggered to boot the boot image from the first computing device. A storage network is established between the first computing device and the second computing device, wherein the storage network includes storage space of the first computing device. Lastly, the first virtual machine is migrated from the first computing device to the second computing device, wherein the first virtual machine is executed by the second computing device but still located on the first computing device.
Type:
Grant
Filed:
December 19, 2008
Date of Patent:
April 21, 2015
Assignee:
International Business Machines Corporation
Inventors:
Marco Hoehle, Christian Kirsch, Andreas Schmengler, Stephan Schwarzer
Abstract: A method for managing the usage of hardware resources by application programs within a computer system is disclosed. A use cost value is set for a device within a computer system. A number of tickets associated with a process is held. Upon execution of the process, the use cost value is compared to the number of tickets held by the process. The process is permitted to use the device based on the result of the comparison.
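The comparison step in the abstract above reduces to a small check, sketched here. The abstract says only that access is permitted based on the result of the comparison, so the rule used below (a process needs at least as many tickets as the use cost) is an assumption, as are the name `may_use_device` and the example costs.

```python
# Hypothetical use cost values assigned to devices.
DEVICE_COSTS = {"gpu": 5, "disk": 1}

def may_use_device(tickets_held, device, costs=DEVICE_COSTS):
    """Return True if the process holds enough tickets for the device.

    Assumption: access is granted when tickets_held >= the use cost value.
    """
    return tickets_held >= costs[device]
```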