Processing Control For Data Transfer Patents (Class 712/225)
  • Publication number: 20120198213
    Abstract: A packet handler for a packet processing system includes a plurality of parallel action machines, each of the plurality of parallel action machines being configured to perform a respective packet processing function; and a plurality of action machine input registers, wherein each of the plurality of parallel action machines is associated with one or more of the plurality of action machine input registers, and wherein an action machine of the plurality of parallel action machines is automatically triggered to perform its respective packet processing function in the event that data sufficient to perform the action machine's respective packet processing function is written into the action machine's one or more respective action machine input registers.
    Type: Application
    Filed: January 31, 2011
    Publication date: August 2, 2012
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Francois Abel, Jean Calvignac, Christoph Hagleitner, Fabrice Verplanken
  • Patent number: 8234483
    Abstract: A computing and communication chip architecture is provided wherein the interfaces of processor access to the memory chips are implemented as a high-speed packet switched serial interface as part of each chip. In one embodiment, the interface is accomplished through a gigabit Ethernet interface provided by a protocol processor integrated as part of the chip. The protocol processor encapsulates the memory address and control information (such as read, write, and number of successive bytes) as an Ethernet packet for communication among the processor and memory chips that are located on the same motherboard, or even on different circuit cards. In one embodiment, the communication overhead of the Ethernet protocol is further reduced by using an enhanced Ethernet protocol with shortened data frames within a constrained neighborhood, and/or by utilizing a bit stream switch where direct connection paths can be established between elements that comprise the computing or communication architecture.
    Type: Grant
    Filed: October 25, 2010
    Date of Patent: July 31, 2012
    Assignee: Psimast, Inc
    Inventor: Viswa Nath Sharma
  • Patent number: 8234653
    Abstract: A data processing architecture includes multiple processors connected in series between a load balancer and reorder logic. The load balancer is configured to receive data and distribute the data across the processors. Appropriate ones of the processors are configured to process the data. The reorder logic is configured to receive the data processed by the processors, reorder the data, and output the reordered data.
    Type: Grant
    Filed: May 30, 2008
    Date of Patent: July 31, 2012
    Assignee: Juniper Networks, Inc.
    Inventors: John C Carney, Michael E Lipman
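    Illustrative sketch: a minimal C model of the dataflow described in 8234653 above, in which each unit of data is tagged with a sequence number, distributed across processors, and put back into its original order by the reorder logic. The round-robin policy and the struct layout are assumptions made for illustration, not taken from the patent.

      /* Sketch of 8234653 (assumptions: round-robin distribution, trivial
       * per-processor work); the point is the sequence-number reorder. */
      #include <stdio.h>

      #define NUM_PROCS 4
      #define NUM_ITEMS 8

      typedef struct { int seq; int payload; } item_t;

      int main(void) {
          item_t in[NUM_ITEMS], out[NUM_ITEMS];

          /* Load balancer: tag each item and distribute it round-robin. */
          for (int i = 0; i < NUM_ITEMS; i++) {
              in[i].seq = i;
              in[i].payload = i * 10;
              int proc = i % NUM_PROCS;      /* hypothetical distribution policy */
              in[i].payload += proc;         /* stand-in for the per-processor work */
          }

          /* Reorder logic: results may arrive out of order, so place each one
           * back into its slot by sequence number (the reverse loop simulates
           * out-of-order arrival). */
          for (int i = NUM_ITEMS - 1; i >= 0; i--)
              out[in[i].seq] = in[i];

          for (int i = 0; i < NUM_ITEMS; i++)
              printf("seq=%d payload=%d\n", out[i].seq, out[i].payload);
          return 0;
      }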
  • Patent number: 8230410
    Abstract: An enhanced mechanism for parallel execution of computer programs utilizes a bidding model to allocate additional registers and execution units for stretches of code identified as opportunities for microparallelization. A microparallel processor architecture apparatus permits software (e.g. compiler) to implement short-term parallel execution of stretches of code identified as such before execution. In one embodiment, an additional paired unit, if available, is allocated for execution of an identified stretch of code. Each additional paired unit includes an execution unit and a half set of registers. This apparatus is available for compilers or assembler language coders to use and allows software to unlock parallel execution capabilities that are present in existing computer programs but heretofore were executed sequentially for lack of a suitable apparatus.
    Type: Grant
    Filed: October 26, 2009
    Date of Patent: July 24, 2012
    Assignee: International Business Machines Corporation
    Inventor: Larry W. Loen
  • Patent number: 8230179
    Abstract: Administering non-cacheable memory load instructions in a computing environment where cacheable data is produced and consumed in a coherent manner without harming performance of a producer, the environment including a hierarchy of computer memory that includes one or more caches backed by main memory, the caches controlled by a cache controller, at least one of the caches configured as a write-back cache. Embodiments of the present invention include receiving, by the cache controller, a non-cacheable memory load instruction for data stored at a memory address, the data treated by the producer as cacheable; determining by the cache controller from a cache directory whether the data is cached; if the data is cached, returning the data in the memory address from the write-back cache without affecting the write-back cache's state; and if the data is not cached, returning the data from main memory without affecting the write-back cache's state.
    Type: Grant
    Filed: May 15, 2008
    Date of Patent: July 24, 2012
    Assignee: International Business Machines Corporation
    Inventors: Jon K. Kriegel, Jamie R. Kuesel
  • Publication number: 20120185679
    Abstract: Endpoint-based parallel data processing with non-blocking collective instructions in a parallel active messaging interface (‘PAMI’) of a parallel computer, the PAMI composed of data communications endpoints, each endpoint including a specification of data communications parameters for a thread of execution on a compute node, including specifications of a client, a context, and a task, the compute nodes coupled for data communications through the PAMI, including establishing by the parallel application a data communications geometry, the geometry specifying a set of endpoints that are used in collective operations of the PAMI, including associating with the geometry a list of collective algorithms valid for use with the endpoints of the geometry; registering in each endpoint in the geometry a dispatch callback function for a collective operation; and executing without blocking, through a single one of the endpoints in the geometry, an instruction for the collective operation.
    Type: Application
    Filed: January 17, 2011
    Publication date: July 19, 2012
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Charles J. Archer, Michael A. Blocksome, Bob R. Cernohous, Joseph D. Ratterman, Brian E. Smith
  • Patent number: 8225012
    Abstract: A method may include distributing ranges of addresses in a memory among a first set of functions in a first pipeline. The first set of the functions in the first pipeline may operate on data using the ranges of addresses. Different ranges of addresses in the memory may be redistributed among a second set of functions in a second pipeline without waiting for the first set of functions to be flushed of data.
    Type: Grant
    Filed: September 3, 2009
    Date of Patent: July 17, 2012
    Assignee: Intel Corporation
    Inventor: Thomas A. Piazza
  • Patent number: 8219788
    Abstract: A virtual core management system including a first physical core having a first utilization constraint, a second physical core having a second utilization constraint, and a virtual core including a collection of logical states associated with execution of a program. The virtual core management system further includes a utilization indicator configured to measure a utilization of the first physical core with respect to the first utilization constraint and measure a utilization of the second physical core with respect to the second utilization constraint, and a virtual core management component configured to map the virtual core to one of the first physical core and the second physical core based on at least one of the utilization of the first physical core and the utilization of the second physical core.
    Type: Grant
    Filed: October 31, 2007
    Date of Patent: July 10, 2012
    Assignee: Oracle America, Inc.
    Inventors: Yu Qing Cheng, Peter N. Glaskowsky, Carlos Puchol, Seungyoon Peter Song
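    Illustrative sketch: a toy C version of the mapping decision in 8219788 above, choosing the physical core with the most headroom relative to its own utilization constraint. The headroom heuristic is an assumption; the patent only requires that the mapping consider each core's utilization against its constraint.

      /* Sketch of 8219788: map a virtual core to one of two physical cores
       * based on measured utilization versus each core's constraint.
       * The "most headroom" rule is an assumed policy. */
      #include <stdio.h>

      typedef struct { double utilization; double constraint; } phys_core_t;

      static int map_virtual_core(const phys_core_t *cores, int n) {
          int best = 0;
          double best_headroom = cores[0].constraint - cores[0].utilization;
          for (int i = 1; i < n; i++) {
              double headroom = cores[i].constraint - cores[i].utilization;
              if (headroom > best_headroom) { best_headroom = headroom; best = i; }
          }
          return best;
      }

      int main(void) {
          phys_core_t cores[2] = { { 0.80, 0.90 },    /* core 0: 10% headroom */
                                   { 0.40, 0.70 } };  /* core 1: 30% headroom */
          printf("virtual core -> physical core %d\n", map_virtual_core(cores, 2));
          return 0;
      }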
  • Patent number: 8219787
    Abstract: In one embodiment, a processor comprises a retire unit and a load/store unit coupled thereto. The retire unit is configured to retire a first store memory operation responsive to the first store memory operation having been processed at least to a pipeline stage at which exceptions are reported for the first store memory operation. The load/store unit comprises a queue having a first entry assigned to the first store memory operation. The load/store unit is configured to retain the first store memory operation in the first entry subsequent to retirement of the first store memory operation if the first store memory operation is not complete. The queue may have multiple entries, and more than one store may be retained in the queue after being retired by the retire unit.
    Type: Grant
    Filed: May 9, 2011
    Date of Patent: July 10, 2012
    Assignee: Apple Inc.
    Inventors: Wei-Han Lien, Po-Yung Chang
  • Patent number: 8214626
    Abstract: Method, apparatus, and program means for shuffling data. The method of one embodiment comprises receiving a first operand having a set of L data elements and a second operand having a set of L control elements. For each control element, data from a first operand data element designated by the individual control element is shuffled to an associated resultant data element position if its flush to zero field is not set, and a zero is placed into the associated resultant data element position if its flush to zero field is set.
    Type: Grant
    Filed: March 31, 2009
    Date of Patent: July 3, 2012
    Assignee: Intel Corporation
    Inventors: William W. Macy, Jr., Eric L. Debes, Patrice L. Roussel, Huy V. Nguyen
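    Illustrative sketch: a scalar C model of the shuffle semantics in 8214626 above, assuming byte-sized elements and bit 7 of each control element as the flush-to-zero field (both details are assumptions made for illustration).

      /* Sketch of 8214626: per-element shuffle controlled by a second operand.
       * If the flush-to-zero bit is set the result element is zeroed; otherwise
       * the low bits of the control element select the source element to copy. */
      #include <stdint.h>
      #include <stdio.h>

      #define L 16   /* number of data and control elements */

      static void shuffle(const uint8_t src[L], const uint8_t ctl[L], uint8_t dst[L]) {
          for (int i = 0; i < L; i++)
              dst[i] = (ctl[i] & 0x80) ? 0 : src[ctl[i] & (L - 1)];
      }

      int main(void) {
          uint8_t src[L], ctl[L], dst[L];
          for (int i = 0; i < L; i++) {
              src[i] = (uint8_t)(i + 1);
              ctl[i] = (uint8_t)(L - 1 - i);   /* reverse the source elements */
          }
          ctl[0] = 0x80;                       /* flush result element 0 to zero */
          shuffle(src, ctl, dst);
          for (int i = 0; i < L; i++) printf("%d ", dst[i]);
          printf("\n");
          return 0;
      }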
  • Patent number: 8209523
    Abstract: A data moving processor includes a code memory coupled to a code fetch circuit and a decode circuit coupled to the code fetch circuit. An address stack is coupled to the decode circuit and configured to store address data. A general purpose stack is coupled to the decode circuit and configured to store other data. The data moving processor uses data from the general purpose stack to perform calculations. The data moving processor uses address data from the address stack to identify source and destination memory locations. The address data may be used to drive an address line of a memory during a read or write operation. The address stack and general purpose stack are separately controlled using bytecode.
    Type: Grant
    Filed: January 22, 2009
    Date of Patent: June 26, 2012
    Assignee: Intel Mobile Communications GmbH
    Inventors: Ulf Nordqvist, Jinan Lin, Xiaoning Nie, Stefan Maier, Siegmar Koeppe
  • Patent number: 8200949
    Abstract: A multi-threaded processor system, method, and computer program product capable of utilizing a register file cache are provided for simultaneously processing a plurality of threads. A processor capable of simultaneously processing a plurality of threads is provided. The processor includes a register file and a register file cache in communication with the register file.
    Type: Grant
    Filed: December 9, 2008
    Date of Patent: June 12, 2012
    Assignee: NVIDIA Corporation
    Inventors: David Tarjan, Kevin Skadron
  • Patent number: 8200950
    Abstract: A pipeline operation processor comprises a pipeline processing unit and an instruction insertion controller which inserts an instruction when access to an operation memory is requested, and corrects control information by reference to the control information of the stages. When a control program is in execution, on receiving an access request instruction requesting access to the operation memory, the instruction insertion controller inserts a NOP instruction from the instruction decoding unit in place of the access request instruction. The access request instruction is executed while the pipeline processing unit executes no operation, and subsequently the pipeline processing is continued.
    Type: Grant
    Filed: June 4, 2009
    Date of Patent: June 12, 2012
    Assignee: Kabushiki Kaisha Toshiba
    Inventor: Motohiko Okabe
  • Patent number: 8200941
    Abstract: A method includes, in a processor, loading/moving a first portion of bits of a source into a first portion of a destination register and duplicating that first portion of bits in a subsequent portion of the destination register.
    Type: Grant
    Filed: April 15, 2011
    Date of Patent: June 12, 2012
    Assignee: Intel Corporation
    Inventor: Patrice Roussel
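    Illustrative sketch: the load-and-duplicate pattern in 8200941 above, modeled in C with the destination treated as two 64-bit lanes (the register width and lane split are assumptions for illustration).

      /* Sketch of 8200941: copy the low half of the source into the low half of
       * the destination and duplicate it into the subsequent (high) half. */
      #include <stdint.h>
      #include <stdio.h>

      static void load_dup(const uint64_t src[2], uint64_t dst[2]) {
          dst[0] = src[0];   /* first portion of the source -> first portion of dst */
          dst[1] = src[0];   /* ...duplicated into the subsequent portion */
      }

      int main(void) {
          uint64_t src[2] = { 0x1122334455667788ULL, 0 };
          uint64_t dst[2];
          load_dup(src, dst);
          printf("%016llx %016llx\n", (unsigned long long)dst[0],
                                      (unsigned long long)dst[1]);
          return 0;
      }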
  • Patent number: 8195924
    Abstract: A method and system for early instruction text based operand store compare avoidance in a processor are provided. The system includes a processor pipeline for processing instruction text in an instruction stream, where the instruction text includes operand address information. The system also includes delay logic to monitor the instruction stream. The delay logic performs a method that includes detecting a load instruction following a store instruction in the instruction stream and comparing the operand address information of the store instruction with that of the load instruction. The method also includes delaying the load instruction in the processor pipeline in response to detecting a common field value between the operand address information of the store instruction and the load instruction.
    Type: Grant
    Filed: March 17, 2011
    Date of Patent: June 5, 2012
    Assignee: International Business Machines Corporation
    Inventors: Khary J. Alexander, Fadi Y. Busaba, Bruce C. Giamei, David S. Hutton, Chung-Lung K. Shum
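    Illustrative sketch: the delay decision in 8195924 above reduced to a C predicate that compares fields of the store's operand address text with those of a following load and delays the load on a match. The choice of base-register and displacement fields is an assumption.

      /* Sketch of 8195924: instruction-text based operand store compare check.
       * A common field value between the store and the following load triggers
       * a delay of the load in the pipeline. */
      #include <stdbool.h>
      #include <stdio.h>

      typedef struct { unsigned base_reg; int displacement; } operand_text_t;

      static bool should_delay_load(operand_text_t st, operand_text_t ld) {
          return st.base_reg == ld.base_reg && st.displacement == ld.displacement;
      }

      int main(void) {
          operand_text_t st = { .base_reg = 5, .displacement = 16 };
          operand_text_t ld = { .base_reg = 5, .displacement = 16 };
          printf("delay load: %s\n", should_delay_load(st, ld) ? "yes" : "no");
          return 0;
      }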
  • Patent number: 8190867
    Abstract: A processor comprises a register file and a decoder to decode an instruction that specifies a first source register holding a first set of packed signed 16-bit integers and a second source register holding a second set of packed signed 16-bit integers. A functional unit generates a result to be stored in a specified destination. The result includes a third set of packed 8-bit integers, with an integer for each integer in the first set and an integer for each integer in the second set. The integers corresponding to the first set are next to one another in the result, as are the integers corresponding to the second set. The highest-order integer of the result corresponds to the highest-order integer of the first set, and the lowest-order integer of the result corresponds to the lowest-order integer of the second set.
    Type: Grant
    Filed: May 16, 2011
    Date of Patent: May 29, 2012
    Assignee: Intel Corporation
    Inventors: Alexander Peleg, Yaakov Yaari, Millind Mittal, Larry M. Mennemeier, Benny Eitan
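    Illustrative sketch: a scalar C model of the pack operation in 8190867 above, with four 16-bit elements per source and signed saturation when narrowing. The saturation policy is an assumption; the abstract does not say how out-of-range values are handled.

      /* Sketch of 8190867: pack two vectors of signed 16-bit integers into one
       * vector of 8-bit integers. Elements from the first source land in the
       * high-order half of the result, elements from the second source in the
       * low-order half, matching the ordering described in the abstract. */
      #include <stdint.h>
      #include <stdio.h>

      static int8_t sat8(int16_t v) {          /* signed saturation (an assumption) */
          if (v > 127)  return 127;
          if (v < -128) return -128;
          return (int8_t)v;
      }

      static void pack(const int16_t a[4], const int16_t b[4], int8_t r[8]) {
          for (int i = 0; i < 4; i++) {
              r[i]     = sat8(b[i]);   /* low half  <- second source */
              r[i + 4] = sat8(a[i]);   /* high half <- first source  */
          }
      }

      int main(void) {
          int16_t a[4] = { 1000, -2000, 3, -4 };
          int16_t b[4] = { 5, 60, -700, 8 };
          int8_t  r[8];
          pack(a, b, r);
          for (int i = 0; i < 8; i++) printf("%d ", r[i]);
          printf("\n");
          return 0;
      }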
  • Publication number: 20120124335
    Abstract: Details of a highly cost effective and efficient implementation of a manifold array (ManArray) architecture and instruction syntax for use therewith are described herein. Various aspects of this approach include the regularity of the syntax, the relative ease with which the instruction set can be represented in database form, the ready ability with which tools can be created, the ready generation of self-checking codes and parameterized test cases. Parameterizations can be fairly easily mapped and system maintenance is significantly simplified.
    Type: Application
    Filed: January 5, 2012
    Publication date: May 17, 2012
    Applicant: ALTERA CORPORATION
    Inventors: Gerald G. Pechanek, David Carl Strube, Edwin Franklin Barry, Charles W. Kurak, JR., Carl Donald Busboom, Dale Edward Schneider, Nikos P. Pitsianis, Grayson Morris, Edward A. Wolff, Patrick R. Marchand, Ricardo E. Rodriguez, Marco C. Jacobs
  • Patent number: 8181003
    Abstract: Improved instruction set and core design, control and communication for programmable microprocessors are disclosed, involving a strategy for replacing centralized program sequencing in present-day and prior art processors with a novel distributed program sequencing wherein each functional unit has its own instruction fetch and decode block, and each functional unit has its own local memory for program storage; and wherein computational hardware execution units and memory units are flexibly pipelined as programmable embedded processors with reconfigurable pipeline stages of different order in response to varying application instruction sequences that establish different configurations and switching interconnections of the hardware units.
    Type: Grant
    Filed: May 29, 2008
    Date of Patent: May 15, 2012
    Assignee: Axis Semiconductor, Inc.
    Inventors: Xiaolin Wang, Qian Wu, Benjamin Marshall, Fugui Wang, Gregory Pitarys, Ke Ning
  • Publication number: 20120117420
    Abstract: A method of implementing a mask load or mask store instruction by a processor is provided. The method may include receiving the mask load or mask store instruction, a location of a memory operand and a location of corresponding mask bits associated with the memory operand, breaking the received memory operand into a plurality of sub-operands and executing the mask load or mask store instruction on each of the plurality of sub-operands using a fastpath operation or using microcode, wherein the respective mask load or mask store instruction loads or stores each of the plurality of sub-operands based upon the corresponding mask bits.
    Type: Application
    Filed: November 5, 2010
    Publication date: May 10, 2012
    Applicant: ADVANCED MICRO DEVICES, INC.
    Inventors: Kelvin GOVEAS, Edward MCLELLAN, Steven BEIGELMACHER, David KROESCHE, Michael CLARK
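    Illustrative sketch: the mask-load behavior in 20120117420 above modeled in C at byte granularity, with unselected destination bytes zeroed. Both the sub-operand size and the zeroing policy are assumptions made for illustration.

      /* Sketch of 20120117420: break a wide memory operand into sub-operands and
       * load each one only if its corresponding mask bit is set. */
      #include <stddef.h>
      #include <stdint.h>
      #include <stdio.h>

      static void mask_load(uint8_t *dst, const uint8_t *src, uint32_t mask, size_t n) {
          for (size_t i = 0; i < n; i++)
              dst[i] = ((mask >> i) & 1u) ? src[i] : 0;
      }

      int main(void) {
          uint8_t src[8] = { 1, 2, 3, 4, 5, 6, 7, 8 };
          uint8_t dst[8] = { 0 };
          mask_load(dst, src, 0xA5 /* 10100101b */, 8);
          for (int i = 0; i < 8; i++) printf("%d ", dst[i]);
          printf("\n");
          return 0;
      }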
  • Publication number: 20120110309
    Abstract: Methods, systems, and computer readable media for improved transfer of processing data outputs to memory are disclosed. According to an embodiment, a method for transferring outputs of a plurality of threads concurrently executing in one or more processing units to a memory includes: forming, based upon one or more of the outputs, a combined memory export instruction comprising one or more data elements and one or more control elements; and sending the combined memory export instruction to the memory. The combined memory export instruction can be sent to memory in a single clock cycle. Another method includes: forming, based upon outputs from two or more of the threads, a memory export instruction comprising two or more data elements; embedding at least one address representative of the two or more of the outputs in a second memory instruction; and sending the memory export instruction and the second memory instruction to the memory.
    Type: Application
    Filed: October 29, 2010
    Publication date: May 3, 2012
    Applicants: Advanced Micro Devices, Inc., ATI Technologies ULC
    Inventors: Laurent Lefebvre, Michael Mantor, Robert Hankinson
  • Patent number: 8171267
    Abstract: A method and apparatus for migrating a task in a multi-processor system. The method includes examining whether a second process has been allocated to a second processor, the second process having the same instruction to execute as a first process but different data to process in response to the instruction, the instruction being the one that executes the task; selecting a method of migrating the first process or a method of migrating a thread included in the first process based on the examining; and migrating the task from a first processor to the second processor using the selected method. Therefore, the cost and power required for task migration can be minimized. Consequently, power consumption can be kept low in a low-power environment, such as an embedded system, which, in turn, optimizes the performance of the multi-processor system and prevents physical damage to the circuit of the multi-processor system.
    Type: Grant
    Filed: June 30, 2008
    Date of Patent: May 1, 2012
    Assignee: Samsung Electronics Co., Ltd.
    Inventor: Seung-won Lee
  • Patent number: 8171266
    Abstract: A method for look-ahead load pre-fetching that reduces the effects of instruction stalls caused by high latency instructions. Look-ahead load pre-fetching is accomplished by searching an instruction stream for load memory instructions while the instruction stream is stalled waiting for completion of a previous instruction in the instruction stream. A pre-fetch operation is issued for each load memory instruction found. The pre-fetch operations cause data for the corresponding load memory instructions to be copied to a cache, thereby avoiding long latencies in the subsequent execution of the load memory instructions.
    Type: Grant
    Filed: August 2, 2001
    Date of Patent: May 1, 2012
    Assignee: Hewlett-Packard Development Company, L.P.
    Inventors: Alan H. Karp, Rajiv Gupta
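    Illustrative sketch: the look-ahead idea in 8171266 above as a small C model that, while the stream is stalled, scans a window of upcoming instructions and issues a prefetch for every load it finds. The fixed window size and the printf stand-in for a prefetch operation are assumptions.

      /* Sketch of 8171266: look-ahead load pre-fetching during a stall. */
      #include <stdio.h>

      typedef enum { OP_ADD, OP_LOAD, OP_STORE } opcode_t;
      typedef struct { opcode_t op; unsigned addr; } insn_t;

      static void lookahead_prefetch(const insn_t *stream, int stall_pc, int window) {
          for (int pc = stall_pc + 1; pc <= stall_pc + window; pc++)
              if (stream[pc].op == OP_LOAD)
                  printf("prefetch 0x%x into cache\n", stream[pc].addr);
      }

      int main(void) {
          insn_t stream[6] = { { OP_LOAD, 0x100 }, { OP_ADD, 0 }, { OP_LOAD, 0x200 },
                               { OP_STORE, 0x300 }, { OP_LOAD, 0x240 }, { OP_ADD, 0 } };
          lookahead_prefetch(stream, 0 /* stalled on stream[0] */, 5);
          return 0;
      }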
  • Patent number: 8171259
    Abstract: A dynamic reconfigurable circuit includes multiple clusters, each including a group of reconfigurable processing elements. The dynamic reconfigurable circuit is capable of dynamically changing the configuration of the clusters according to a context including a description of the processing of the processing elements and of the connections between the processing elements. A first cluster among the clusters includes a signal generating circuit that, when an instruction to change the context is received, generates a report signal indicative of the instruction to change the context; a signal adding circuit that adds the report signal generated by the signal generating circuit to output data that is to be transmitted from the first cluster to a second cluster; and a data clearing circuit that, when output data to which a report signal generated by the second cluster is added is received, performs a clearing process of clearing the received output data.
    Type: Grant
    Filed: February 27, 2009
    Date of Patent: May 1, 2012
    Assignee: Fujitsu Semiconductor Limited
    Inventors: Takashi Hanai, Shinichi Sutou
  • Patent number: 8171263
    Abstract: A parallel data processing apparatus using a SIMD array of processing elements is disclosed. The apparatus makes use of a register in order to control issuance of instructions to the processing elements in the array.
    Type: Grant
    Filed: June 29, 2007
    Date of Patent: May 1, 2012
    Assignee: Rambus Inc.
    Inventors: Dave Stuttard, Dave Williams, Eamon O'Dea, Gordon Faulds, John Rhoades, Ken Cameron, Phil Atkin, Paul Winser, Russell David, Ray McConnell, Tim Day, Trey Greer
  • Publication number: 20120096245
    Abstract: A computing device includes a receiving unit that receives control information indicating an instruction to be executed on a process that is distributed or an instruction contained in the process that is distributed, from a control information creating device that transmits the control information to each computing device on a network. The computing device further includes a processor configured to suspend execution of an instruction when the instruction to be executed on the process occurs or the instruction contained in the process that is distributed is executed, and execute the suspended instruction when the suspended instruction is associated with the instruction indicated by the control information that is received by the receiving unit.
    Type: Application
    Filed: December 29, 2011
    Publication date: April 19, 2012
    Applicant: Fujitsu Limited
    Inventor: Yuta HIGUCHI
  • Patent number: 8161272
    Abstract: The memory unit is compatible with a plurality of operation modes. The plurality of operation modes include the normal mode allowing access and the standby mode consuming a lower power than the normal mode. The branch detection section detects a branch instruction from an instruction fetched from the memory unit by the CPU. The mode control section changes an operation mode of the memory unit according to a detection result by the branch detection section.
    Type: Grant
    Filed: December 23, 2008
    Date of Patent: April 17, 2012
    Assignee: Renesas Electronics Corporation
    Inventor: Kiminari Yamazoe
  • Patent number: 8161273
    Abstract: Embodiments of the present invention provide a system that allocates registers in a processor. The system starts by commencing a transaction, wherein commencing the transaction involves preserving a pre-transactional state of registers in a first register file. The system then allocates one or more registers for temporary use during the transaction. Upon finishing using each allocated register during the transaction, the system executes an instruction that restores the allocated register to the pre-transactional state.
    Type: Grant
    Filed: February 26, 2008
    Date of Patent: April 17, 2012
    Assignee: Oracle America, Inc.
    Inventor: Paul Caprioli
  • Patent number: 8161271
    Abstract: Embodiments of the invention provide logic within the store data path between a processor and a memory array. The logic may be configured to misalign vector data as it is stored to memory. By misaligning vector data as it is stored to memory, memory bandwidth may be maximized while processing bandwidth required to store vector data misaligned is minimized. Furthermore, embodiments of the invention provide logic within the load data path which allows vector data which is stored misaligned to be aligned as it is loaded into a vector register. By aligning misaligned vector data as it is loaded into a vector register, memory bandwidth may be maximized while processing bandwidth required to align misaligned vector data may be minimized.
    Type: Grant
    Filed: July 11, 2007
    Date of Patent: April 17, 2012
    Assignee: International Business Machines Corporation
    Inventors: David Arnold Luick, Eric Oliver Mejdrich, Adam James Muff
  • Patent number: 8161270
    Abstract: A programmable processor configured to perform one or more packet modifications through execution of one or more commands. A pipelined processor core comprises a first stage configured to selectively shift and mask data in each of a plurality of categories in response to one or more decoded commands, and combine the selectively shifted and masked data in each of the categories. The pipelined processor core further comprises a second stage configured to selectively perform one or more operations on the combined data from the first stage and other data responsive to the one or more decoded commands. In one implementation, the processor is implemented as an application specific integrated circuit (ASIC).
    Type: Grant
    Filed: March 30, 2004
    Date of Patent: April 17, 2012
    Assignee: Extreme Networks, Inc.
    Inventors: David K. Parker, Erik R. Swenson, Christopher J. Young
  • Patent number: 8161480
    Abstract: Methods, apparatus, and products are disclosed for performing an allreduce operation using shared memory that include: receiving, by at least one of a plurality of processing cores on a compute node, an instruction to perform an allreduce operation; establishing, by the core that received the instruction, a job status object for specifying a plurality of shared memory allreduce work units, the plurality of shared memory allreduce work units together performing the allreduce operation on the compute node; determining, by an available core on the compute node, a next shared memory allreduce work unit in the job status object; and performing, by that available core on the compute node, that next shared memory allreduce work unit.
    Type: Grant
    Filed: May 29, 2007
    Date of Patent: April 17, 2012
    Assignee: International Business Machines Corporation
    Inventors: Charles J. Archer, Gabor Dozsa, Joseph D. Ratterman, Brian E. Smith
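    Illustrative sketch: a pthreads model of the shared-memory allreduce in 8161480 above. A job status object holds a cursor over the list of work units, and each available core claims and performs the next unit until none remain. The unit granularity and the locking scheme are assumptions.

      /* Sketch of 8161480: cores on a compute node cooperate on an allreduce by
       * pulling shared-memory work units from a job status object.
       * Build with -lpthread. */
      #include <pthread.h>
      #include <stdio.h>

      #define CORES 4
      #define UNITS 8

      static struct {                 /* job status object (simplified) */
          pthread_mutex_t lock;
          int next_unit;
          long sum;                   /* the reduction result */
      } job = { PTHREAD_MUTEX_INITIALIZER, 0, 0 };

      static long contributions[UNITS] = { 1, 2, 3, 4, 5, 6, 7, 8 };

      static void *core(void *arg) {
          (void)arg;
          for (;;) {
              pthread_mutex_lock(&job.lock);
              if (job.next_unit >= UNITS) { pthread_mutex_unlock(&job.lock); break; }
              int unit = job.next_unit++;          /* claim the next work unit */
              pthread_mutex_unlock(&job.lock);

              long partial = contributions[unit];  /* perform the work unit */

              pthread_mutex_lock(&job.lock);
              job.sum += partial;                  /* fold the partial result in */
              pthread_mutex_unlock(&job.lock);
          }
          return NULL;
      }

      int main(void) {
          pthread_t t[CORES];
          for (int i = 0; i < CORES; i++) pthread_create(&t[i], NULL, core, NULL);
          for (int i = 0; i < CORES; i++) pthread_join(t[i], NULL);
          printf("allreduce sum = %ld\n", job.sum);   /* prints 36 */
          return 0;
      }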
  • Patent number: 8156314
    Abstract: A system and method are described that manage incremental state updates in such a way that multiple threads within a processor can each operate, in effect, on their own set of state data. The system and method are applicable to any processor in which multiple threads require access to sets of state information which differ from one another by a relatively small number of state changes.
    Type: Grant
    Filed: October 25, 2007
    Date of Patent: April 10, 2012
    Assignee: Advanced Micro Devices, Inc.
    Inventors: Mark M. Leather, Brian D. Emberling
  • Patent number: 8156261
    Abstract: A variety of advantageous mechanisms for improved data transfer control within a data processing system are described. A DMA controller is described which is implemented as a multiprocessing transfer engine supporting multiple transfer controllers which may work independently or in cooperation to carry out data transfers, with each transfer controller acting as an autonomous processor, fetching and dispatching DMA instructions to multiple execution units. In particular, mechanisms for initiating and controlling the sequence of data transfers are provided, as are processes for autonomously fetching DMA instructions which are decoded sequentially but executed in parallel.
    Type: Grant
    Filed: March 1, 2011
    Date of Patent: April 10, 2012
    Assignee: Altera Corporation
    Inventors: Edwin Franklin Barry, Edward A. Wolff
  • Patent number: 8151091
    Abstract: A data processing system and method are disclosed. The system comprises an instruction-fetch stage where an instruction is fetched and a specific instruction is input into the decode stage; a decode stage where said specific instruction indicates that the contents of a register in a register file are used as an index, and then the register file pointed to by said index is accessed based on said index; and an execution stage where the access result of said decode stage is received and computations are implemented according to the access result of the decode stage.
    Type: Grant
    Filed: May 21, 2009
    Date of Patent: April 3, 2012
    Assignee: International Business Machines Corporation
    Inventors: Xiao Tao Chang, Qiang Liui
  • Patent number: 8131982
    Abstract: A method for branch prediction, the method comprising, receiving a load instruction including a first data location in a first memory area, retrieving data including a branch address and a target address from the first data location, and saving the data in a branch prediction memory, or receiving an unload instruction including the first data location in the first memory area, retrieving data including a branch address and a target address from the branch prediction memory, and saving the data in the first data location.
    Type: Grant
    Filed: June 13, 2008
    Date of Patent: March 6, 2012
    Assignee: International Business Machines Corporation
    Inventors: Philip G. Emma, Allan M. Hartstein, Keith N. Langston, Brian R. Prasky, Thomas R. Puzak, Charles F. Webb
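    Illustrative sketch: a C model of the load/unload pair in 8131982 above. The load moves a saved branch-address/target-address pair from ordinary memory into a branch prediction memory; the unload writes it back out. Table sizes and slot addressing are assumptions.

      /* Sketch of 8131982: software-directed fill and spill of a branch
       * prediction memory. */
      #include <stdio.h>

      typedef struct { unsigned branch_addr; unsigned target_addr; } bp_entry_t;

      static bp_entry_t branch_prediction_memory[16];    /* simple predictor model */
      static bp_entry_t main_memory[256];                 /* first memory area */

      static void bp_load(unsigned loc, unsigned slot)   { branch_prediction_memory[slot] = main_memory[loc]; }
      static void bp_unload(unsigned loc, unsigned slot) { main_memory[loc] = branch_prediction_memory[slot]; }

      int main(void) {
          main_memory[7] = (bp_entry_t){ .branch_addr = 0x400, .target_addr = 0x480 };
          bp_load(7, 0);                     /* install the pair in the predictor */
          printf("predictor[0]: branch 0x%x -> target 0x%x\n",
                 branch_prediction_memory[0].branch_addr,
                 branch_prediction_memory[0].target_addr);
          bp_unload(7, 0);                   /* save it back to the data location */
          return 0;
      }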
  • Patent number: 8112612
    Abstract: A processing system comprising processors and dynamically configurable communication elements coupled together in an interspersed arrangement. The processors each comprise at least one arithmetic logic unit, an instruction processing unit, and a plurality of processor ports. The dynamically configurable communication elements each comprise a plurality of communication ports, a first memory, and a routing engine. For each of the processors, the plurality of processor ports is configured for coupling to a first subset of the plurality of dynamically configurable communication elements. For each of the dynamically configurable communication elements, the plurality of communication ports comprises a first subset of communication ports configured for coupling to a subset of the plurality of processors and a second subset of communication ports configured for coupling to a second subset of the plurality of dynamically configurable communication elements.
    Type: Grant
    Filed: May 17, 2010
    Date of Patent: February 7, 2012
    Assignee: Coherent Logix, Incorporated
    Inventors: Michael B. Doerr, William H. Hallidy, David A. Gibson, Craig M. Chase
  • Publication number: 20120030452
    Abstract: The present disclosure includes methods, devices, modules, and systems for modifying commands. One device embodiment includes a memory controller including a channel, wherein the channel includes a command queue configured to hold commands, and circuitry configured to modify at least a number of commands in the queue and execute the modified commands.
    Type: Application
    Filed: October 11, 2011
    Publication date: February 2, 2012
    Applicant: MICRON TECHNOLOGY, INC.
    Inventor: Mehdi Asnaashari
  • Patent number: 8108661
    Abstract: Provided are a data processing apparatus and a method of controlling the data processing apparatus. The data processing apparatus may select a single stream processor from a plurality of stream processors based on stream processor status information, and input data into the selected stream processor. The stream processor status information may include first status information of a processor core and second status information of at least one internal memory.
    Type: Grant
    Filed: April 2, 2009
    Date of Patent: January 31, 2012
    Assignee: Samsung Electronics Co., Ltd.
    Inventors: Won Jong Lee, Chan Min Park, Shi Hwa Lee
  • Patent number: 8108838
    Abstract: A method for adaptive runtime reconfiguration of a co-processor instruction set, in a computer system with at least a main processor communicatively connected to at least one reconfigurable co-processor, includes the steps of configuring the co-processor to implement an instruction set comprising one or more co-processor instructions, issuing a co-processor instruction to the co-processor, and determining whether the instruction is implemented in the co-processor. For an instruction not implemented in the co-processor instruction set, a stall signal is raised to delay the main processor, it is determined whether there is enough space in the co-processor for the non-implemented instruction, and if there is enough space for said instruction, the instruction set of the co-processor is reconfigured by adding the non-implemented instruction to the co-processor instruction set. The stall signal is then cleared and the instruction is executed.
    Type: Grant
    Filed: May 15, 2008
    Date of Patent: January 31, 2012
    Assignee: International Business Machines Corporation
    Inventors: Sameh W. Asaad, Richard Gerard Hofmann
  • Patent number: 8108660
    Abstract: Each of the processors has a barrier write register and a barrier read register. Each barrier write register is wired to each barrier read register by a dedicated wiring block. For example, a 1-bit barrier write register of a processor is connected, via the wiring block, to a first bit of each 8-bit barrier read register contained in the processors, and a 1-bit barrier write register of another processor is connected, via a wiring block, to a second bit of each 8-bit barrier read register contained in the processors. A processor writes information to its own barrier write register, thereby notifying the other processors of synchronization stand-by, and reads its own barrier read register, thereby recognizing whether the other processors are in synchronization stand-by or not. Therefore, a special dedicated instruction is not required for barrier synchronization processing, and the processing can be performed at high speed.
    Type: Grant
    Filed: January 22, 2009
    Date of Patent: January 31, 2012
    Assignees: Renesas Electronics Corporation, Waseda University
    Inventors: Hironori Kasahara, Keiji Kimura, Masayuki Ito, Tatsuya Kamei, Toshihiro Hattori
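    Illustrative sketch: a C model of the barrier register wiring in 8108660 above, in which each processor's 1-bit barrier write register feeds one bit position of every processor's 8-bit barrier read register. The bit numbering is an assumption.

      /* Sketch of 8108660: barrier stand-by signalling through dedicated
       * write/read registers instead of a special synchronization instruction. */
      #include <stdio.h>

      #define NPROC 8

      static unsigned char barrier_write[NPROC];   /* one 1-bit register per CPU */

      static unsigned char barrier_read(void) {    /* what every CPU reads back */
          unsigned char r = 0;
          for (int i = 0; i < NPROC; i++)
              r |= (unsigned char)((barrier_write[i] & 1u) << i);
          return r;
      }

      int main(void) {
          barrier_write[3] = 1;   /* processor 3 signals synchronization stand-by */
          barrier_write[5] = 1;   /* processor 5 signals synchronization stand-by */
          /* Any processor sees who is waiting by reading its own read register. */
          printf("barrier read register = 0x%02x\n", barrier_read());   /* 0x28 */
          return 0;
      }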
  • Patent number: 8108659
    Abstract: Thread synchronization techniques are used to control access to a memory resource (e.g., a counter) that is shared among multiple threads. Each thread has a unique identifier and threads are assigned to instances of the shared resource so that at least one instance is shared by two or more threads. Each thread assigned to a particular instance of the shared resource has a unique ordering index. A thread is allowed to access its assigned instance of the resource at a point in the program code determined by its ordering index. The threads are advantageously synchronized (explicitly or implicitly) so that no more than one thread attempts to access the same instance of the resource at a given time.
    Type: Grant
    Filed: September 19, 2007
    Date of Patent: January 31, 2012
    Assignee: NVIDIA Corporation
    Inventor: Scott M. Le Grand
  • Patent number: 8108658
    Abstract: A data processing circuit comprises a register file (14) having read ports and write ports. A plurality of functional units (21a-c) is coupled to receive operand data from a same combination of read ports. Each functional unit is coupled to a respective one of the write ports for writing a respective result. An instruction issue slot has outputs (11) for supplying register selection information to said combination of read ports and to the respective ones of the write ports. The output of the issue slot also supplies an operation code. The functional units (21a-c) in the plurality are arranged to respond to at least one value of the operation code by each executing a respective operation using the same operands from said same combination, each functional unit producing a respective result at a respective one of the write ports.
    Type: Grant
    Filed: September 21, 2005
    Date of Patent: January 31, 2012
    Assignee: Koninklijke Philips Electronics N.V.
    Inventor: Antonius Adrianus Maria Van Wel
  • Publication number: 20120023313
    Abstract: An electronic circuit (4000) includes a bias value generator circuit (3900) operable to supply a varying bias value in a programmable range, and an instruction circuit (3625, 4010) responsive to a first instruction to program the range of said bias value generator circuit (3900) and further responsive to a second instruction having an operand to repeatedly issue said second instruction with said operand varied in an operand value range determined as a function of the varying bias value.
    Type: Application
    Filed: September 28, 2011
    Publication date: January 26, 2012
    Applicant: TEXAS INSTRUMENTS INCORPORATED
    Inventors: Kenichi TASHIRO, Hiroyuki MIZUNO, Yuji UMEMOTO
  • Patent number: 8103859
    Abstract: According to an aspect of the embodiment, when data on a cache RAM is rewritten in a storage processing of one thread, a determination unit searches a fetch port which holds a request of another thread and checks whether a request exists whose processing is completed, whose instruction is a load type instruction, and whose target address corresponds to a target address in the storage processing. When the corresponding request is detected, the determination unit sets a re-execution request flag in all the entries of the fetch port from the entry after the one which holds the oldest request to the entry which holds the detected request. When the processing of the oldest request is executed, a re-execution request unit transfers a re-execution request of an instruction to an instruction control unit for each request held in an entry in which the re-execution request flag is set.
    Type: Grant
    Filed: December 17, 2009
    Date of Patent: January 24, 2012
    Assignee: Fujitsu Limited
    Inventor: Naohiro Kiyota
  • Patent number: 8103853
    Abstract: A chip having an intelligent fabric may include a soft application processor, a reconfigurable hardware intelligent processor, a partitioned memory storage, and an interface to an external reconfigurable communication processor. The reconfigurable hardware intelligent processor may be configured to implement a distributed reconfigurable processor, and to provide cognitive control for at least one of allocation, reallocation, and performance monitoring.
    Type: Grant
    Filed: March 5, 2008
    Date of Patent: January 24, 2012
    Assignee: The Boeing Company
    Inventors: Tirumale K. Ramesh, John L. Meier
  • Patent number: 8098655
    Abstract: A system includes a queue that stores P data units, each data unit including multiple bytes. The system further includes a control unit that shifts, byte by byte, Q data units from the queue during a first system clock cycle, where Q<P, and sends, during the first system clock cycle, the Q data units to a processing device configured to process a maximum of Q data units per system clock cycle.
    Type: Grant
    Filed: July 21, 2009
    Date of Patent: January 17, 2012
    Assignee: Juniper Networks, Inc.
    Inventor: Brian Gaudet
  • Publication number: 20120011349
    Abstract: Disclosed are methods and systems for dynamically determining data-transfer paths. The data-transfer paths are determined in response to an instruction that facilitates data transfer among execution lanes in an integrated-circuit processing device operable to execute operations in parallel.
    Type: Application
    Filed: September 20, 2011
    Publication date: January 12, 2012
    Applicant: Calos Fund Limited Liability Company
    Inventors: Brucek Khailany, William James Dally, Ujval J. Kapasi, Jim Jian Lin, Raghunath Rao, DeForest Tovey, Mark Rygh, Jung-Ho Ahn
  • Patent number: 8095743
    Abstract: Access to a memory area by a first processor that executes a first processor program and a second processor that executes a second processor program is granted to one of the first processor and the second processor at a time. Access to the memory area by the first processor and the second processor is cyclically and uniquely allocated (e.g., t?[(ad mod m)=o]) between the first and the second processor by the first and second processor programs.
    Type: Grant
    Filed: March 29, 2010
    Date of Patent: January 10, 2012
    Assignee: Trident Microsystems (Far East) Ltd.
    Inventors: Matthias Vierthaler, Carsten Noeske
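    Illustrative sketch: one possible reading of the cyclic allocation rule in 8095743 above, granting the memory area to the first processor whenever (t mod m) = o and to the second processor otherwise. The exact rule and notation in the patent may differ; the parameters here are illustrative only.

      /* Sketch of 8095743: in each time slot exactly one of the two processors
       * is granted access to the shared memory area (assumed rule). */
      #include <stdbool.h>
      #include <stdio.h>

      static bool first_processor_has_access(unsigned t, unsigned m, unsigned o) {
          return (t % m) == o;   /* m = period, o = offset (assumed meaning) */
      }

      int main(void) {
          const unsigned m = 4, o = 1;
          for (unsigned t = 0; t < 8; t++)
              printf("slot %u -> %s\n", t,
                     first_processor_has_access(t, m, o) ? "processor 1" : "processor 2");
          return 0;
      }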
  • Publication number: 20120005530
    Abstract: Transactional memory implementations may be extended to include special transaction communicator objects through which concurrent transactions can communicate. Changes by a first transaction to a communicator may be visible to concurrent transactions before the first transaction commits. Although isolation of transactions may be compromised by such communication, the effects of this compromise may be limited by tracking dependencies among transactions, and preventing any transaction from committing unless every transaction whose changes it has observed also commits. For example, mutually dependent or cyclically dependent transactions may commit or abort together. Transactions that do not communicate with each other may remain isolated. The system may provide a communicator-isolating transaction that ensures isolation even for accesses to communicators, which may be implemented using nesting transactions. True (e.g., read-after-write) dependencies, ordering (e.g.
    Type: Application
    Filed: June 30, 2010
    Publication date: January 5, 2012
    Inventors: Virendra J. Marathe, Victor M. Luchangco
  • Patent number: 8090913
    Abstract: A system has a first plurality of cores in a first coherency group. Each core transfers data in packets. The cores are directly coupled serially to form a serial path. The data packets are transferred along the serial path. The serial path is coupled at one end to a packet switch. The packet switch is coupled to a memory. The first plurality of cores and the packet switch are on an integrated circuit. The memory may or may not be on the integrated circuit. In another aspect a second plurality of cores in a second coherency group is coupled to the packet switch. The cores of the first and second pluralities may be reconfigured to form or become part of coherency groups different from the first and second coherency groups.
    Type: Grant
    Filed: December 20, 2010
    Date of Patent: January 3, 2012
    Assignee: Freescale Semiconductor, Inc.
    Inventors: Perry H. Pelley, III, George P. Hoekstra, Lucio F. Pessoa
  • Patent number: 8090933
    Abstract: The present invention relates to a method for the unification of PER branch and PER store operations within the same dataflow. The method comprises determining a PER range, the PER range comprising a storage area defined by a designated storage starting area and a designated storage ending area, wherein the storage starting area is designated by the value of the contents of a first control register and the storage ending area is designated by the value of the contents of a second control register. The method also comprises retrieving register field content values that are stored at a plurality of registers, wherein the retrieved content values comprise a length field content value, and setting the length field content value to zero for a PER branch instruction, thereby enabling a PER branch instruction to be performed similarly to a PER storage instruction.
    Type: Grant
    Filed: February 12, 2008
    Date of Patent: January 3, 2012
    Assignee: International Business Machines Corporation
    Inventors: Fadi Y. Busaba, Bruce C. Giamei