Processing Control For Data Transfer Patents (Class 712/225)
-
Patent number: 8443174
Abstract: Provided is a processor and method of performing speculative load instructions of the processor in which a load instruction is performed only in the case where the load instruction substantially accesses a memory. A load instruction for canceling operations is performed in other cases except the above case, so that problems occurring by accessing an input/output (I/O) mapped memory area and the like at the time of performing speculative load instructions can be prevented using only a software-like method, thereby improving the performance of a processor.
Type: Grant
Filed: August 14, 2007
Date of Patent: May 14, 2013
Assignee: Samsung Electronics Co., Ltd.
Inventors: Hong-Seok Kim, Hee Seok Kim, Jeongwook Kim, Suk Jin Kim
-
Publication number: 20130117547
Abstract: An apparatus includes an instruction decoder, first and second source registers and a circuit coupled to the decoder to receive packed data from the source registers and to unpack the packed data responsive to an unpack instruction received by the decoder. A first packed data element and a third packed data element are received from the first source register. A second packed data element and a fourth packed data element are received from the second source register. The circuit copies the packed data elements into a destination register resulting with the second packed data element adjacent to the first packed data element, the third packed data element adjacent to the second packed data element, and the fourth packed data element adjacent to the third packed data element.
Type: Application
Filed: December 29, 2012
Publication date: May 9, 2013
Inventors: Alexander Peleg, Yaakov Yaari, Millind Mittal, Larry M. Mennemeier, Benny Eitan
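The element ordering described in this abstract can be illustrated with a minimal Python sketch. It models registers as plain lists; the function name `unpack_interleave` and the list representation are assumptions made for illustration and are not taken from the application.

```python
def unpack_interleave(src1, src2):
    """Interleave the elements of two source registers (modelled as lists)."""
    if len(src1) != len(src2):
        raise ValueError("source registers must hold the same number of elements")
    dest = []
    for a, b in zip(src1, src2):
        dest.append(a)  # element taken from the first source register
        dest.append(b)  # element taken from the second source register
    return dest


# First and third elements come from src1, second and fourth from src2.
print(unpack_interleave(["e1", "e3"], ["e2", "e4"]))
```

In the destination list each element sits next to its successor, matching the adjacency the abstract describes.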
-
Publication number: 20130117546
Abstract: A Load/Store Disjoint instruction, when executed by a CPU, accesses operands from two disjoint memory locations and sets condition code indicators to indicate whether or not the two operands appeared to be accessed atomically by means of block-concurrent interlocked fetch with no intervening stores to the operands from other CPUs. In a Load Pair Disjoint form of the instruction, the accesses are loads and the disjoint data is stored in general registers.
Type: Application
Filed: December 26, 2012
Publication date: May 9, 2013
Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
Inventor: INTERNATIONAL BUSINESS MACHINES CORPORATION
-
Publication number: 20130111192
Abstract: A remote control device receives an instruction to transmit and determines whether or not to include an acknowledgement request in the instruction based on statistics regarding receipt of acknowledgements associated with previously transmitted instructions. If so, the remote control device includes the request before transmitting. The remote control may determine whether or not to include the request in a variety of different ways in a variety of different implementations. In some implementations, the remote control may classify instructions into two or more different classifications and may treat instructions of different classifications differently. In other implementations, the remote control may treat the same instruction differently depending on the number of requested acknowledgements successfully received during a time period. In various other implementations, the remote control may perform various combinations of these approaches.
Type: Application
Filed: October 31, 2011
Publication date: May 2, 2013
Applicant: EchoStar Technologies L.L.C.
Inventor: William R. Reams
-
Patent number: 8433884
Abstract: A multiprocessor executes a plurality of threads without decreasing execution efficiency. The multiprocessor includes a first processor allocating a different register file to each of a predetermined number of threads to be executed from among plural threads, and executing the predetermined number of threads in parallel; and a second processor performing processing according to a processing request made by the first processor. The first processor has areas allocated to the plurality of threads in one-to-one correspondence, makes the processing request to the second processor according to an instruction included in one of the predetermined number of threads, upon receiving a request for writing a value resulting from the processing from the second processor, judges whether the one thread is being executed, and when judging negatively, performs control such that the obtained value is written into one of the areas allocated to the one thread.
Type: Grant
Filed: June 16, 2009
Date of Patent: April 30, 2013
Assignee: Panasonic Corporation
Inventor: Hiroyuki Morishita
-
Patent number: 8412917
Abstract: Disclosed are methods and systems for dynamically determining data-transfer paths. The data-transfer paths are dynamically determined in response to an instruction that facilitates data transfer among execution lanes in an integrated-circuit processing device operable to execute operations in parallel. In addition, embodiments include an integrated-circuit processing device operable to execute operations in parallel, including the capability of providing confirmation information to potential source lanes, the confirmation information indicating whether the potential source lanes may send data to requested destination lanes during a data-transfer interval.
Type: Grant
Filed: September 20, 2011
Date of Patent: April 2, 2013
Assignee: Calos Fund Limited Liability Company
Inventors: Brucek Khailany, William James Dally, Ujval J. Kapasi, Jim Jian Lin
-
Patent number: 8411103
Abstract: One embodiment of the invention sets forth a CROP configured to perform both color raster operations and atomic transactions. Upon receiving an atomic transaction, the distribution unit within the CROP transmits a read request to the L2 cache for retrieving the destination operand. The distribution unit also transmits the source operands and the operation code to the latency buffer for storage until the destination operand is retrieved from the L2 cache. The processing pipeline transmits the operation code, the source and destination operands and an atomic flag to the blend unit for processing. The blend unit performs the atomic transaction on the source and destination operands based on the operation code and returns the result of the atomic transaction to the processing pipeline for storage in the internal cache. The processing pipeline writes the result of the atomic transaction to the L2 cache for storage at the memory location associated with the atomic transaction.
Type: Grant
Filed: September 29, 2009
Date of Patent: April 2, 2013
Assignee: Nvidia Corporation
Inventors: Narayan Kulshrestha, Adam Paul Dreyer, Chad D. Walker, Rui M. Bastos
-
Publication number: 20130080747
Abstract: The present invention relates to a processor including: an instruction cache configured to store at least some of first instructions stored in an external memory and second instructions each including a plurality of micro instructions; a micro cache configured to store third instructions corresponding to the plurality of micro instructions included in the second instructions; and a core configured to read out the first and second instructions from the instruction cache and perform calculation, in which the core performs calculation by the first instructions from the instruction cache under a normal mode, and when the process enters a micro instruction mode, the core performs calculation by the third instructions corresponding to the plurality of micro instructions provided from the micro cache.
Type: Application
Filed: September 10, 2012
Publication date: March 28, 2013
Applicant: ELECTRONICS AND TELECOMMUNICATIONS RESEARCH INSTITUTE
Inventor: Young-Su KWON
-
Patent number: 8401015
Abstract: The present invention relates to a method for representing a partition of n w-bit intervals associated to d-bit data in a data communications network, said method comprising the steps of: providing in a memory (102), a datagram forwarding data structure (10) provided for indicating where to forward a datagram in said network, which data structure (10) is in the form of a tree comprising at least one leaf (11) and possibly a number of nodes (13) including partial nodes, said data structure (10) having a height (h), corresponding to a number of memory accesses required for looking up a largest stored non-negative integer smaller than or equal to a query key, step 201, reducing worst storage cost by using a technique for reduction of worst case storage cost that are selectable from: partial block tree compaction, virtual blocks, bit push pulling, block aggregation or split block trees, and variations thereof, step 202, updating the layered data structure partially including by using a technique for scheduling ma...
Type: Grant
Filed: October 19, 2007
Date of Patent: March 19, 2013
Assignee: Oricane AB
Inventor: Mikael Sundström
-
Patent number: 8402255
Abstract: A processor that is configured to perform parallel operations in a computer system where one or more memory hazards may be present is described. An instruction fetch unit within the processor is configured to fetch instructions for detecting one or more critical memory hazards between memory addresses if memory operations are performed in parallel on multiple addresses corresponding to at least a partial vector of addresses. Note that critical memory hazards include memory hazards that lead to different results when the memory addresses are processed in parallel than when the memory addresses are processed sequentially. Furthermore, an execution unit within the processor is configured to execute the instructions for detecting the one or more critical memory hazards.
Type: Grant
Filed: September 1, 2011
Date of Patent: March 19, 2013
Assignee: Apple Inc.
Inventors: Jeffry E. Gonion, Keith E. Diefendorff
-
Publication number: 20130067206
Abstract: Endpoint-based parallel data processing in a parallel active messaging interface (‘PAMI’) of a parallel computer, the PAMI composed of data communications endpoints, each endpoint including a specification of data communications parameters for a thread of execution on a compute node, including specifications of a client, a context, and a task, the compute nodes coupled for data communications through the PAMI, including establishing a data communications geometry, the geometry specifying, for tasks representing processes of execution of the parallel application, a set of endpoints that are used in collective operations of the PAMI including a plurality of endpoints for one of the tasks; receiving in endpoints of the geometry an instruction for a collective operation; and executing the instruction for a collective operation through the endpoints in dependence upon the geometry, including dividing data communications operations among the plurality of endpoints for one of the tasks.
Type: Application
Filed: November 9, 2012
Publication date: March 14, 2013
Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
Inventor: International Business Machines Corporation
-
Publication number: 20130067205
Abstract: An apparatus includes a processor and a memory coupled to the processor. The memory stores an instruction packet (e.g., a VLIW instruction packet) including a first predicate independent instruction and a second predicate independent instruction. Each of the predicate independent instructions has the same destination.
Type: Application
Filed: September 9, 2011
Publication date: March 14, 2013
Applicant: QUALCOMM Incorporated
Inventors: Erich J. Plondke, Lucian Codrescu, Mao Zeng, Charles J. Tabony, Suresh K. Venkumahanti
-
Patent number: 8386751
Abstract: One embodiment of the present invention includes a heterogeneous, high-performance, scalable processor having at least one W-type sub-processor capable of processing W bits in parallel, W being an integer value, at least one N-type sub-processor capable of processing N bits in parallel, N being an integer value smaller than W by a factor of two. The processor further includes a shared bus coupling the at least one W-type sub-processor and at least one N-type sub-processor and shared memory coupled to the at least one W-type sub-processor and the at least one N-type sub-processor, wherein the W-type sub-processor rearranges memory to accommodate execution of applications allowing for fast operations.
Type: Grant
Filed: May 18, 2010
Date of Patent: February 26, 2013
Assignee: Icelero LLC
Inventors: Amit Ramchandran, John Reid Hauser, Jr.
-
Publication number: 20130046954
Abstract: Disclosed is an architecture, system and method for performing multi-thread DFA descents on a single input stream. An executer performs DFA transitions from a plurality of threads each starting at a different point in an input stream. A plurality of executers may operate in parallel to each other and a plurality of thread contexts operate concurrently within each executer to maintain the context of each thread which is state transitioning. A scheduler in each executer arbitrates instructions for the thread into an at least one pipeline where the instructions are executed. Tokens may be output from each of the plurality of executers to a token processor which sorts and filters the tokens into dispatch order.
Type: Application
Filed: January 18, 2012
Publication date: February 21, 2013
Inventors: Michael Ruehle, Umesh Ramkrishnarao Kasture, Vinay Janardan Naik, Nayan Amrutlal Suthar, Robert J. McMillen
-
Publication number: 20130036276
Abstract: Systems and methods for providing additional instructions for supporting efficient memory corruption detection in a processor. A physical memory may be a DRAM with a spare bank of memory reserved for a hardware failover mechanism. Version numbers associated with data structures allocated in the memory may be generated so that version numbers of adjacent data structures are different. A processor determines that a fetched instruction is a memory access instruction corresponding to a first data structure within the memory. For instructions that are not a version update instruction, the processor compares the first version number and second version number stored in a location in the memory indicated by the generated address and flags an error if there is a mismatch. For version update instructions, the processor performs a memory access operation on the second version number with no comparison check.
Type: Application
Filed: August 2, 2011
Publication date: February 7, 2013
Inventors: Zoran Radovic, Darryl J. Gove, Graham Ricketson Murphy
-
Patent number: 8370609
Abstract: This invention includes a circuit for tracking memory operations with trace-based execution. Each trace includes a sequence of operations that includes zero or more of the memory operations. The memory operations being executed form a set of active memory operations that have a predefined program order among them and corresponding ordering constraints. At least some of the active memory operations access the memory in an execution order that is different from the program order. Checkpoint entries are associated with each trace. Each entry refers to a checkpoint location. Memory operation ordering entries correspond to each one of the active memory operations. Violations of the ordering constraints result in overwriting the checkpoint locations associated with the selected trace as well as the checkpoint locations associated with traces that are younger than the selected trace.
Type: Grant
Filed: February 13, 2008
Date of Patent: February 5, 2013
Assignee: Oracle America, Inc.
Inventors: John Gregory Favor, Paul G. Chan, Graham Ricketson Murphy, Joseph Byron Rowlands
-
Patent number: 8370844
Abstract: Embodiments of the invention provide a mechanism for process migration on a massively parallel computer system. In particular, embodiments of the invention may be used to update process state data for a migrated compute node, such as MPI (or other communication library) state data, across a full collection of compute nodes present in a given parallel system executing a parallel task. Migrating a process from one compute node to another may be useful to address a variety of sub-optimal operating conditions. For example, one or more processes may be migrated to cure network congestion resulting from a poorly mapped task or when a compute node is predicted to experience a hardware failure.
Type: Grant
Filed: September 12, 2007
Date of Patent: February 5, 2013
Assignee: International Business Machines Corporation
Inventors: Charles Jens Archer, David L. Darrington, Patrick Joseph McCarthy, Amanda Peters, Albert Sidelnik
-
Patent number: 8364937
Abstract: A method includes providing a data processor having an instruction pipeline, where the instruction pipeline has a plurality of instruction pipeline stages, and where the plurality of instruction pipeline stages includes a first instruction pipeline stage and a second instruction pipeline stage. The method further includes providing a load instruction and a data-dependent instruction to the instruction pipeline. Based on an operating mode, such as ECC mode or parity mode, the data-dependent instruction may execute in either the first or the second instruction pipeline stage. Further, the execution of the data-dependent instruction may depend on whether the most recently executed instruction was misaligned.
Type: Grant
Filed: April 13, 2012
Date of Patent: January 29, 2013
Assignee: Rambus Inc.
Inventors: William C. Moyer, Jeffrey W. Scott
-
Patent number: 8364803
Abstract: The present invention relates to a method for routing in a data communications network, said method comprising the steps of: providing in a memory (102), a datagram forwarding data structure (10) provided for indicating where to forward a datagram in said network, which data structure (10) is in the form of a tree comprising at least one leaf (11) and possibly a number of nodes (13) including partial nodes, said data structure (10) having a height (h), corresponding to a number of memory accesses required for looking up a largest stored non-negative integer smaller than or equal to a query key, step 201, reducing worst storage cost by using a technique for reduction of worst case storage cost that are selectable from: partial block tree compaction, virtual blocks, bit push pulling, block aggregation or split block trees, and variations thereof, step 202, updating the layered data structure partially including by using a technique for scheduling maintenance work that are selectable from: vertical segmentation...
Type: Grant
Filed: October 19, 2007
Date of Patent: January 29, 2013
Assignee: Oricane AB
Inventor: Mikael Sundström
-
Publication number: 20130024673
Abstract: A technology capable of reducing load on both system processing and filter operation and improving power consumption and performance is provided. In a digital signal processor, a program memory, a program counter, and a control logic circuit are provided, and a bit field of each instruction includes instruction stop flag information and bit field information. Also, the control logic circuit carries out the control in such a manner that the instruction whose instruction stop flag information is cleared is executed as is to proceed to the next instruction processing, execution of the instruction whose instruction stop flag information is set is stopped if an execution resumption trigger condition corresponding to the bit field information is not satisfied, and the instruction whose instruction stop flag information is set is executed if the execution resumption trigger condition corresponding to bit field information is satisfied, to proceed to the next instruction processing.
Type: Application
Filed: July 21, 2012
Publication date: January 24, 2013
Inventor: Takanaga YAMAZAKI
-
Patent number: 8352712
Abstract: A method and processor chip design for enabling a processor core to continue sending store operations speculatively to the store queue after the core receives indication that the store queue is full. The processor core is configured with speculative store logic that enables the processor core to continue issuing store operations while the store queue full signal is asserted. A copy of the speculatively issued store operation is placed within a speculative store buffer. The core waits for a signal from the store queue indicating the store operation was accepted into the store queue. When the speculatively-issued store operation is accepted within the store queue, the copy is discarded from the buffer. However, when the store operation is rejected, the speculative store logic re-issues the store operation ahead of normal store operations.
Type: Grant
Filed: May 6, 2004
Date of Patent: January 8, 2013
Assignee: International Business Machines Corporation
Inventors: Robert H. Bell, Jr., Thomas Michael Capasso, Guy Lynn Guthrie, Hugh Shen, Jeffrey Adam Stuecheli
-
Publication number: 20120331276
Abstract: A method of executing an instruction set to select a set of registers, includes reading a first instruction of the instruction set; interpreting a first operand of the first instruction to represent a first register S to be selected; interpreting a second operand of the first instruction to represent a number N of registers to be selected; and selecting N consecutive registers starting at the first register S to form the set of registers.
Type: Application
Filed: December 20, 2011
Publication date: December 27, 2012
Applicant: CAMBRIDGE SILICON RADIO LIMITED
Inventors: Peter Smith, David Richard Hargreaves
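The register-selection rule in this abstract (start register S, count N) can be sketched in a few lines of Python. The function name, the 16-entry register file, and the decision to reject ranges that run past the last register are all assumptions for illustration; the abstract does not say how out-of-range selections are handled.

```python
def select_registers(start, count, num_registers=16):
    """Return the indices of `count` consecutive registers beginning at `start`."""
    if count < 1:
        raise ValueError("count must be at least 1")
    if start < 0 or start + count > num_registers:
        # Assumed behaviour: no wrap-around past the last register.
        raise ValueError("selection runs outside the register file")
    return list(range(start, start + count))


# S = 4, N = 3 selects registers 4, 5 and 6.
print(select_registers(4, 3))
```

Encoding a range as (S, N) rather than as a bit mask keeps the operand small, at the cost of only being able to name contiguous sets.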
-
Publication number: 20120317402
Abstract: A facility is provided to enable operator message commands from multiple, distinct sources to be provided to a coupling facility of a computing environment for processing. These commands are used, for instance, to perform actions on the coupling facility, and may be received from consoles coupled to the coupling facility, as well as logical partitions or other systems coupled thereto. Responsive to performing the commands, responses are returned to the initiators of the commands.
Type: Application
Filed: June 10, 2011
Publication date: December 13, 2012
Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
Inventors: David A. Elko, Steven N. Goss, Thomas C. Shaw
-
Patent number: 8332829
Abstract: Within a data processing system, one or more register files are assigned to respective states of a graph for each of a plurality of clock cycles. A plurality of edges are inserted to form connections between the states of the graph, with respective weights being assigned to each of the edges. A best route through the graph is then determined based, at least in part, on the weights assigned to the edges.
Type: Grant
Filed: August 15, 2008
Date of Patent: December 11, 2012
Assignee: Calos Fund Limited Liability Company
Inventor: Peter Mattson
-
Publication number: 20120311306
Abstract: Parallelism of processing can be improved while existing software resources are utilized substantially as they are. A data processing apparatus includes a plurality of processing units configured to process packets each including data and extended identification information added to the data, the extended identification information including identification information for identifying the data and instruction information indicating one or more processing instructions to the data, each processing unit in the plurality of processing units including: an input/output unit configured to obtain, in the packets, only a packet whose address information indicates said each processing unit in the plurality of processing units, the address information determined in accordance with the extended identification information; and an operation unit configured to execute the processing instruction in the packet obtained by the input/output unit.
Type: Application
Filed: June 1, 2012
Publication date: December 6, 2012
Applicant: MUSH-A CO., LTD.
Inventor: Mitsuru Mushano
-
Patent number: 8327121
Abstract: A microprocessor includes an N-way cache and a logic block that selectively enables and disables the N-way cache for at least one clock cycle if a first register load instruction and a second register load instruction, following the first register load instruction, are detected as pointing to the same index line in which the requested data is stored. The logic block further provides a disabling signal to the N-way cache for at least one clock cycle if the first and second instructions are detected as pointing to the same cache way.
Type: Grant
Filed: August 20, 2008
Date of Patent: December 4, 2012
Assignee: MIPS Technologies, Inc.
Inventors: Ajit Karthik Mylavarapu, Sanjai Balakrishnan Athi
-
Publication number: 20120297172
Abstract: Tools and techniques are described for multi-threaded processing for opening and saving documents. These tools may provide load processes for reading documents from storage devices, and for loading the documents into applications. These tools may spawn a load process thread for executing a given load process on a first processing unit, and an application thread may execute a given application on a second processing unit. A first pipeline may be created for executing the load process thread, with the first pipeline performing tasks associated with loading the document into the application. A second pipeline may be created for executing the application process thread, with the second pipeline performing tasks associated with operating on the documents. The tasks in the first pipeline are configured to pass tokens as input to the tasks in the second pipeline.
Type: Application
Filed: August 8, 2012
Publication date: November 22, 2012
Applicant: MICROSOFT CORPORATION
Inventors: Uladzislau Sudzilouski, Igor Zaika
-
Publication number: 20120272047
Abstract: Method, apparatus, and program means for shuffling data. The method of one embodiment comprises receiving a first operand having a set of L data elements and a second operand having a set of L control elements. For each control element, data from a first operand data element designated by the individual control element is shuffled to an associated resultant data element position if its flush to zero field is not set and a zero is placed into the associated resultant data element position if its flush to zero field is set.
Type: Application
Filed: July 2, 2012
Publication date: October 25, 2012
Inventors: William W. Macy, JR., Eric L. Debes, Patrice L. Roussel, Huy V. Nguyen
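A shuffle with a per-element flush-to-zero field, as this abstract describes, might look like the following Python sketch. The bit layout is an assumption for illustration: the top bit (0x80) of each control element is treated as the flush-to-zero field and the low bits as the source index. The abstract itself does not specify the encoding.

```python
FLUSH_TO_ZERO = 0x80  # assumed position of the flush-to-zero field
INDEX_MASK = 0x7F     # assumed: remaining bits select the source element


def shuffle(data, controls):
    """Build a result with one element per control element."""
    if len(data) != len(controls):
        raise ValueError("expected L data elements and L control elements")
    result = []
    for control in controls:
        if control & FLUSH_TO_ZERO:
            result.append(0)  # field set: place a zero in this position
        else:
            # Field clear: copy the designated source element.
            result.append(data[(control & INDEX_MASK) % len(data)])
    return result


# Position 1 is zeroed; the other positions pick data[3], data[0], data[1].
print(shuffle([10, 20, 30, 40], [3, 0x80, 0, 1]))
```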
-
Publication number: 20120265971
Abstract: A mapper unit of an out-of-order processor assigns a particular counter currently in a counter free pool to count a number of mappings of logical registers to a particular physical register from among multiple physical registers, responsive to an execution of an instruction by the mapper unit mapping at least one logical register to the particular physical register. The number of counters is less than the number of physical registers. The mapper unit, responsive to the counted number of mappings of logical registers to the particular physical register decremented to less than a minimum value, returns the particular counter to the counter free pool.
Type: Application
Filed: April 15, 2011
Publication date: October 18, 2012
Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
Inventors: GREGORY W. ALEXANDER, BRIAN D. BARRICK, JOHN W. WARD, III
-
Patent number: 8289335
Abstract: A programmable processor and method for improving the performance of processors by expanding at least two source operands, or a source and a result operand, to a width greater than the width of either the general purpose register or the data path width. The present invention provides operands which are substantially larger than the data path width of the processor by using the contents of a general purpose register to specify a memory address at which a plurality of data path widths of data can be read or written, as well as the size and shape of the operand. In addition, several instructions and apparatus for implementing these instructions are described which obtain performance advantages if the operands are not limited to the width and accessible number of general purpose registers.
Type: Grant
Filed: February 3, 2006
Date of Patent: October 16, 2012
Assignee: MicroUnity Systems Engineering, Inc.
Inventors: Craig Hansen, John Moussouris, Alexia Massalin
-
Patent number: 8291200
Abstract: A comparison circuit can reduce the amount of power consumed when searching a load queue or a store queue of a microprocessor. Some embodiments of the comparison circuit use a comparison unit that performs an initial comparison of addresses using a subset of the address bits. If the initial comparison results in a match, a second comparison unit can be enabled to compare another subset of the address bits.
Type: Grant
Filed: August 4, 2009
Date of Patent: October 16, 2012
Assignee: STMicroelectronics (Beijing) R&D Co., Ltd.
Inventors: Kai-Feng Wang, Hong-Xia Sun, Yong-Qiang Wu
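The two-stage comparison in this abstract can be modelled in software to show why it saves work. The sketch below is only an illustration: the choice of the low 8 bits for the initial comparison, the function names, and the counter of second-stage activations are assumptions, not details from the patent.

```python
def two_stage_match(addr_a, addr_b, low_bits=8):
    """Return (matched, second_stage_used) for two addresses."""
    mask = (1 << low_bits) - 1
    if (addr_a & mask) != (addr_b & mask):
        return False, False  # initial mismatch: second comparator stays off
    # Initial match: enable the second comparison on the remaining bits.
    return (addr_a >> low_bits) == (addr_b >> low_bits), True


def search_queue(queue, addr, low_bits=8):
    """Return matching queue indices and how often the second stage ran."""
    hits = []
    second_stage_uses = 0
    for index, entry in enumerate(queue):
        matched, used = two_stage_match(entry, addr, low_bits)
        second_stage_uses += used
        if matched:
            hits.append(index)
    return hits, second_stage_uses


print(search_queue([0x1234, 0x5634, 0x1200, 0x1234], 0x1234))
```

In this model the second stage runs only for entries that pass the initial comparison, which stands in for the power saved by leaving the second comparison unit disabled.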
-
Publication number: 20120260074
Abstract: A microprocessor performs an architectural instruction that instructs it to perform an operation on first and second source operands to generate a result and to write the result to a destination register only if its architectural condition flags satisfy a condition specified in the architectural instruction. A hardware instruction translator translates the instruction into first and second microinstructions. To execute the first microinstruction, an execution pipeline performs the operation on the source operands to generate the result. To execute the second microinstruction, it writes the destination register with the result generated by the first microinstruction if the architectural condition flags satisfy the condition, and writes the destination register with the current value of the destination register if the architectural condition flags do not satisfy the condition.
Type: Application
Filed: December 21, 2011
Publication date: October 11, 2012
Applicant: VIA TECHNOLOGIES, INC.
Inventors: G. Glenn Henry, Gerard M. Col, Rodney E. Hooker, Terry Parks
-
Publication number: 20120260071
Abstract: An architectural instruction instructs a microprocessor to perform an operation on first and second source operands to generate a result and to write the result to a destination register only if architectural condition flags satisfy a condition specified in the architectural instruction. A hardware instruction translator translates the architectural instruction into first and second microinstructions. To execute the first microinstruction, an execution pipeline performs the operation on the source operands to generate the result, determines whether the architectural condition flags satisfy the condition, and updates a non-architectural indicator to indicate whether the architectural condition flags satisfy the condition.
Type: Application
Filed: December 21, 2011
Publication date: October 11, 2012
Applicant: VIA TECHNOLOGIES, INC.
Inventors: G. Glenn Henry, Gerard M. Col, Rodney E. Hooker, Terry Parks
-
Publication number: 20120260075
Abstract: A microprocessor includes a hardware instruction translator that translates an architectural instruction into first and second microinstructions. To execute the first microinstruction, an execution pipeline performs the shift operation on the first source operand to generate the first result and a carry flag value and updates a non-architectural carry flag with the generated carry flag value. To execute the second microinstruction, it performs the second operation on the first result and the second operand to generate the second result and new condition flag values based on the second result. If the architectural condition flags satisfy the condition, it updates the architectural carry flag with the non-architectural carry flag value and updates at least one of the other architectural condition flags with the corresponding generated new condition flag values; otherwise, it updates the architectural condition flags with the current value of the architectural condition flags.
Type: Application
Filed: December 21, 2011
Publication date: October 11, 2012
Applicant: VIA TECHNOLOGIES, INC.
Inventors: G. Glenn Henry, Gerard M. Col, Rodney E. Hooker, Terry Parks
-
Publication number: 20120254596Abstract: A system and method for controlling messaging between a first processor and a second processor is disclosed. The second processor controls one or more peripheral devices on behalf of a plurality of predetermined tasks being executed by the first processor. The system includes a message control module that receives an input message intended for the second processor from the first processor and maintains a message history based on the received input message and previously received input messages. The message history indicates which peripheral devices of the system are to be on and which tasks of the plurality of tasks requested the peripheral devices to be on. The message control module is further configured to generate an output message that includes output instructions for the second processor based on the message history and an output duration based on the message history. The second processor executes the output instructions.Type: ApplicationFiled: March 31, 2011Publication date: October 4, 2012Applicants: DENSO CORPORATION, DENSO INTERNATIONAL AMERICA, INC.Inventors: Wan-ping Yang, Koji Shinoda, Hiroaki Shibata
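The message history described above can be pictured as a map from each peripheral to the set of tasks that currently want it on. The following is a minimal sketch of that idea, not the claimed implementation: the class name `MessageControl`, the `receive` method, and the `(peripheral, "on"/"off")` output format are assumptions, and the output duration mentioned in the abstract is left out.

```python
class MessageControl:
    """Tracks which tasks requested which peripherals (hypothetical model)."""

    def __init__(self):
        # Message history: peripheral name -> set of tasks that asked for it to be on.
        self.history = {}

    def receive(self, task, peripheral, on):
        """Record an input message and return an output instruction, or None.

        An output is produced only when the peripheral's overall state changes,
        i.e. when the first task requests it or the last requester releases it.
        """
        requesters = self.history.setdefault(peripheral, set())
        was_on = bool(requesters)
        if on:
            requesters.add(task)
        else:
            requesters.discard(task)
        is_on = bool(requesters)
        if is_on != was_on:
            return (peripheral, "on" if is_on else "off")
        return None
```

In this model a peripheral stays on for as long as at least one task still needs it, so the second processor is only told to switch it off after the last requesting task releases it.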
-
Patent number: 8281075Abstract: A technique for triggering a system bus write command with user code includes identifying a specific store-type instruction in a user instruction sequence. The specific store-type instruction is converted into a specific request-type command, which is configured to include core permission controls (that are stored in core configuration registers of a processor core by a trusted kernel) and user created data (stored in a cache memory). Slave devices are configured through register space (that is only accessible by the trusted kernel) with respective slave permission controls. The specific request-type command is then transmitted from the cache memory, via a system bus. In this case, the slave devices that receive the specific request-type command process the specific request-type command when the core permission controls are the same as the respective slave permission controls. The trusted kernel may be included in a hypervisor or an operating system.Type: GrantFiled: April 14, 2009Date of Patent: October 2, 2012Assignee: International Business Machines CorporationInventors: Lakshminarayana Baba Arimilli, Brian Mitchell Bass, David Wayne Cummings, Bernard Charles Drerup, Guy Lynn Guthrie, Ronald Nick Kalla, Hugh Shen, Michael Steven Siegel, William John Starke, Derek Edward Williams
-
Patent number: 8269784Abstract: A programmable processor and method for improving the performance of processors by expanding at least two source operands, or a source and a result operand, to a width greater than the width of either the general purpose register or the data path width. The present invention provides operands which are substantially larger than the data path width of the processor by using the contents of a general purpose register to specify a memory address at which a plurality of data path widths of data can be read or written, as well as the size and shape of the operand. In addition, several instructions and apparatus for implementing these instructions are described which obtain performance advantages if the operands are not limited to the width and accessible number of general purpose registers.Type: GrantFiled: January 19, 2012Date of Patent: September 18, 2012Assignee: MicroUnity Systems Engineering, Inc.Inventors: Craig Hansen, John Moussouris, Alexia Massalin
-
Patent number: 8271766Abstract: An information processing device including registers (105) for holding data and an operation device (102) for executing arithmetic and logic operations on input/output data held in the register. The information processing device can issue an inter-register copy instruction for instructing data held in one register to be copied to another register. The information processing device further includes a copy information holding device (113) for reserving for execution of a data copy operation by the inter-register copy instruction from a control unit (108) so as to execute the actual copy operation simultaneously with the succeeding instruction to hide the execution time of the copy operation. Thus, in the inter-register copy instruction execution phase, a reservation for a data copy operation is stored in the copy information holding device so that the execution phase is completed without performing the actual data copy operation.Type: GrantFiled: May 18, 2006Date of Patent: September 18, 2012Assignee: NEC CorporationInventor: Noritaka Hoshi
-
Publication number: 20120226868Abstract: Devices and methods for providing deterministic execution of multithreaded applications are provided. In some embodiments, each thread is provided access to an isolated memory region, such as a private cache. In some embodiments, more than one private cache are synchronized via a modified MOESI coherence protocol. The modified coherence protocol may be configured to refrain from synchronizing the isolated memory regions until the end of an execution quantum. The execution quantum may end when all threads experience a quantum end event such as reaching a threshold instruction count, overflowing the isolated memory region, and/or attempting to access a lock released by a different thread in the same quantum.Type: ApplicationFiled: March 1, 2012Publication date: September 6, 2012Applicant: University of Washington through its Center for CommercializationInventors: Luis Henrique Ceze, Thomas Bergan, Joseph Devietti, Daniel Joseph Grossman, Jacob Eric Nelson
-
Patent number: 8255631Abstract: A method, processor, and data processing system for implementing a framework for priority-based scheduling and throttling of prefetching operations. A prefetch engine (PE) assigns a priority to a first prefetch stream, indicating a relative priority for scheduling prefetch operations of the first prefetch stream. The PE monitors activity within the data processing system and dynamically updates the priority of the first prefetch stream based on the activity (or lack thereof). Low priority streams may be discarded. The PE also schedules prefetching in a priority-based scheduling sequence that corresponds to the priority currently assigned to the scheduled active streams. When there are no prefetches within a prefetch queue, the PE triggers the active streams to provide prefetches for issuing. The PE determines when to throttle prefetching, based on the current usage level of resources relevant to completing the prefetch.Type: GrantFiled: February 1, 2008Date of Patent: August 28, 2012Assignee: International Business Machines CorporationInventors: Lei Chen, Lixin Zhang
-
Patent number: 8255095Abstract: A modular avionics system includes several cabinets arranged at various locations in an aircraft and interconnected in a network. The cabinets are used for controlling or processing signals from and to sensors, actuators and other systems of the aircraft. The system includes parallel processors, for example transputers. The cabinets comprise at least two core processor modules (CPM1, CPM2) and at least two input/output modules (IOM1, IOM2). The input/output modules (IOM1, IOM2) serve as interfaces to the systems to be controlled, and serve for the control and intermediate storage of the data flowing into and out of the cabinet. Each core processor module (CPM1, CPM2) communicates independently with each IOM module and CPM module by way of links; and in each core processor a number of independent system programs work under the control of an operating system.Type: GrantFiled: November 16, 2006Date of Patent: August 28, 2012Assignee: Airbus Operations GmbHInventor: Heinz Girlich
-
Patent number: 8255673Abstract: Apparatus for processing data is provided comprising processing circuitry and monitoring circuitry for monitoring write transactions and performing transaction authorizations of certain transactions in dependence upon associated memory addresses. The processing circuitry is configured to enable execution of a write instruction corresponding to a write transaction to be monitored to continue to completion while the monitoring circuitry is performing monitoring of the write transactions and the monitoring circuitry is arranged to cause storage of write transaction data in an intermediate storage element for those transactions for which an authorization is required. Storage of write transaction data in an intermediate storage element enables the write transaction to be reissued in dependence upon the result of the transaction authorization although the corresponding write instruction has already completed.Type: GrantFiled: April 25, 2008Date of Patent: August 28, 2012Assignee: ARM LimitedInventors: Daniel Kershaw, Daren Croxford
-
Patent number: 8255672Abstract: A processor includes: a plurality of registers; an instruction readout circuit configured to read out an instruction from a memory; an instruction generation circuit configured to generate instructions for saving data into a predetermined storage area, for the respective registers, if the instruction read out by the instruction readout circuit is an instruction causing the data stored in each of the plurality of registers to be saved; and an instruction execution circuit configured to execute the instruction read out from the memory and the instructions generated by the instruction generation circuit.Type: GrantFiled: May 28, 2008Date of Patent: August 28, 2012Assignees: Semiconductor Components Industries, LLC, Sanyo Semiconductor Co., Ltd.Inventors: Iwao Honda, Shinya Kishida
-
Patent number: 8254401Abstract: The device comprises a memory (3) for storing several user share parameters and several amounts capable of advancing. A decision means (6) allocates a chosen service slice of the resource to a user selected as possessing the least advanced amount. It subsequently advances his amount according to a chosen increment. A memory link means (5) defines user queues of “FIFO” type, such that the user having the least advanced amount in a queue appears at the head of this queue. According to the invention, the memory (3) stores a limited number of values of increments. The memory link means (5) associates one of these values of increments with each user and allocates an increment value to each queue.Type: GrantFiled: July 1, 2010Date of Patent: August 28, 2012Assignee: StreamcoreInventor: Rémi Despres
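The benefit of limiting the number of increment values is that each value can own a plain FIFO queue: a served user's amount grows by the queue's shared increment, so re-appending the user at the tail keeps that queue ordered by amount, and only the queue heads need to be compared. The sketch below illustrates this under assumptions that are not in the abstract: the class and method names are hypothetical, every user starts with amount 0 and is registered before scheduling begins, and a smaller increment is taken to mean a larger share.

```python
from collections import deque


class FairShareScheduler:
    """One FIFO queue per allowed increment value (illustrative model)."""

    def __init__(self, increments):
        # Limited set of increment values, each with its own FIFO queue.
        self.queues = {inc: deque() for inc in increments}

    def add_user(self, name, increment):
        """Register a user with amount 0; assumed to happen before scheduling starts."""
        self.queues[increment].append([0, name])

    def next_slice(self):
        """Serve the user with the least advanced amount and advance that amount."""
        heads = [(q[0][0], inc) for inc, q in self.queues.items() if q]
        if not heads:
            return None
        # Only queue heads are compared; ties go to the smaller increment.
        _, inc = min(heads)
        queue = self.queues[inc]
        entry = queue.popleft()
        entry[0] += inc          # advance the amount by the queue's increment
        queue.append(entry)      # tail insertion keeps the queue sorted
        return entry[1]
```

For example, with user `a` on increment 1 and user `b` on increment 2, six consecutive slices go to `a, b, a, a, b, a`, so `a` receives twice as many slices as `b`.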
-
Patent number: 8250231Abstract: A method to reduce buffer capacity in a processor includes giving the data packets admittance to the processor through at least one interface, storing the data packets in at least one input buffer, and using a packet rate shaper outside of a processing pipeline to control flow of the data packets to the pipeline before the data packets enter the pipeline. First and second data packets are given admittance to the pipeline in dependence on cost information per packet that is dependent upon an expected time period of residence of the first data packet in the pipeline. Cost information dependent upon an expected time period of residence of the second data packet in the pipeline differs from said cost information dependent upon the expected time period of residence of the first data packet in the pipeline.Type: GrantFiled: December 20, 2005Date of Patent: August 21, 2012Assignee: Marvell International Ltd.Inventors: Thomas Bodén, Jakob Carlström
-
Publication number: 20120210102Abstract: A first hardware thread executes a software program instruction, which instructs the first hardware thread to initiate a second hardware thread. As such, the first hardware thread identifies one or more register values accessible by the first hardware thread. Next, the first hardware thread copies the identified register values to one or more registers accessible by the second hardware thread. In turn, the second hardware thread accesses the copied register values included in the accessible registers and executes software code accordingly.Type: ApplicationFiled: April 21, 2012Publication date: August 16, 2012Applicant: International Business Machines CorporationInventors: Giles Roger Frazier, Ronald P. Hall
-
Publication number: 20120204015Abstract: A computer-program product may have instructions that, when executed, cause a processor to perform operations including managing execution of application functions that access data in a shared buffer; determining if a first instruction that is stored at a first memory location causes, upon execution, data to be read from or written to the shared buffer; and when it is determined that the first instruction causes data to be read from or written to the shared buffer, 1) identify one or more replacement instructions to execute in place of the first instruction; 2) store the one or more replacement instructions; and 3) replace the first instruction at the first memory location with a second instruction that, when executed, causes the stored one or more replacement instructions to be executed.Type: ApplicationFiled: April 13, 2012Publication date: August 9, 2012Applicant: APPLE INC.Inventors: Ronnie G. Misra, Joshua H. Shaffer
-
Publication number: 20120198214Abstract: One embodiment sets forth a technique for N-way memory barrier operation coalescing. When a first memory barrier is received for a first thread group, execution of subsequent memory operations for the first thread group is suspended until the first memory barrier is executed. Subsequent memory barriers for different thread groups may be coalesced with the first memory barrier to produce a coalesced memory barrier that represents memory barrier operations for multiple thread groups. When the coalesced memory barrier is being processed, execution of subsequent memory operations for the different thread groups is also suspended. However, memory operations for other thread groups that are not affected by the coalesced memory barrier may be executed.Type: ApplicationFiled: April 6, 2012Publication date: August 2, 2012Inventors: Shirish GADRE, Charles McCARVER, Anjana RAJENDRAN, Omkar PARANJAPE, Steven James HEINRICH
-
Publication number: 20120198213Abstract: A packet handler for a packet processing system includes a plurality of parallel action machines, each of the plurality of parallel action machines being configured to perform a respective packet processing function; and a plurality of action machine input registers, wherein each of the plurality of parallel action machines is associated with one or more of the plurality of action machine input registers, and wherein an action machine of the plurality of parallel action machines is automatically triggered to perform its respective packet processing function in the event that data sufficient to perform the action machine's respective packet processing function is written into the action machine's one or more respective action machine input registers.Type: ApplicationFiled: January 31, 2011Publication date: August 2, 2012Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATIONInventors: Francois Abel, Jean Calvignac, Christoph Hagleitner, Fabrice Verplanken
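The write-triggered behavior can be modeled in a few lines: an action machine fires as soon as every one of its input registers has been written, with no separate start command. This is a rough software analogy only, assuming "sufficient data" means all inputs are present; the `ActionMachine` class, its `write` method, and the checksum action in the usage note are hypothetical.

```python
class ActionMachine:
    """Software stand-in for one action machine and its input registers."""

    def __init__(self, input_names, action):
        # None marks an input register that has not been written yet.
        self.inputs = dict.fromkeys(input_names)
        self.action = action

    def write(self, register, value):
        """Write one input register; run the action once all inputs are present."""
        if register not in self.inputs:
            raise KeyError(register)
        self.inputs[register] = value
        if all(v is not None for v in self.inputs.values()):
            result = self.action(**self.inputs)
            # Clear the registers so the machine can be triggered again.
            self.inputs = dict.fromkeys(self.inputs)
            return result
        return None
```

A machine built as `ActionMachine(["payload", "length"], lambda payload, length: sum(payload[:length]) & 0xFF)` returns nothing after the payload write alone and produces its checksum as soon as the length is written.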
-
Patent number: RE43825Abstract: A system and method forward data between processing elements. A first processing element includes an address register that stores a first memory address. A forwarding storage element is coupled to the first processing element. A second processing element, coupled to the forwarding storage element, transmits a second memory address to the forwarding storage element. The forwarding storage element transmits the second memory address to the first processing element, and the first processing element compares the second memory address with the first memory address.Type: GrantFiled: November 19, 2007Date of Patent: November 20, 2012Assignee: The United States of America as Represented by the Secretary of the NavyInventors: Joel Zvi Apisdorf, Sam Brandon Sandbote, Michael Daniel Poole