Processing Control For Data Transfer Patents (Class 712/225)

SYSTEM AND METHOD FOR PROCESSOR WITH PREDICTIVE MEMORY RETRIEVAL ASSIST

Publication number: 20100115221

Abstract: A system and method are described for a memory management processor which, using a table of reference addresses embedded in the object code, can open the appropriate memory pages to expedite the retrieval of information from memory referenced by instructions in the execution pipeline. A suitable compiler parses the source code and collects references to branch addresses, calls to other routines, or data references, and creates reference tables listing the addresses for these references at the beginning of each routine. These tables are received by the memory management processor as the instructions of the routine are beginning to be loaded into the execution pipeline, so that the memory management processor can begin opening memory pages where the referenced information is stored. Opening the memory pages where the referenced information is located before the instructions reach the instruction processor helps lessen memory latency delays which can greatly impede processing performance.

Type: Application

Filed: January 11, 2010

Publication date: May 6, 2010

Inventor: Dean A. Klein
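
Illustration for publication 20100115221: the abstract describes a reference table, placed at the start of each routine, that lets a memory management processor open pages early. The following is a minimal sketch of the page-selection step only. The 4 KiB page size, the function name `pages_to_open`, and the sample addresses are assumptions for illustration and do not come from the application.

```python
PAGE_SIZE = 4096  # assumed page size in bytes; the abstract does not give one


def pages_to_open(reference_table, page_size=PAGE_SIZE):
    """Return the distinct page numbers referenced by a routine's table.

    Pages are listed in first-seen order, so the earliest references in
    the table are opened first.
    """
    pages = []
    for address in reference_table:
        page = address // page_size
        if page not in pages:
            pages.append(page)
    return pages


# Hypothetical table of branch, call, and data addresses for one routine.
table = [0x1000, 0x1FF8, 0x8004, 0x2000]
print(pages_to_open(table))
```

Two of the four sample addresses fall on the same page, so only three pages would need to be opened before the routine's instructions reach the instruction processor.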
MANAGING AN OUT-OF-ORDER ASYNCHRONOUS HETEROGENEOUS REMOTE DIRECT MEMORY ACCESS (RDMA) MESSAGE QUEUE

Publication number: 20100106948

Abstract: A system and method operable to manage a message queue is provided. This management may involve out-of-order asynchronous heterogeneous remote direct memory access (RDMA) to the message queue. This system includes a pair of processing devices, a primary processing device and an additional processing device, a memory in storage location and a data bus coupled to the processing devices. The processing devices cooperate to process queue data within a shared message queue wherein when an individual processing device successfully accesses queue data the queue data is locked for the exclusive use of the processing device. When the processing device acquires the queue data, the queue data is locked and the queue data acquired by the acquiring processing device includes the queue data for both the primary processing device and additional processing device such that the processing device has all queue data necessary to process the data and return processed queue data.

Type: Application

Filed: October 24, 2008

Publication date: April 29, 2010

Inventors: Gregory Howard Bellows, Jason N. Dale
Method and apparatus for migrating data

Patent number: 7707151

Abstract: One aspect is directed to a method for performing data migration from a first volume to a second volume while allowing a write operation to be performed on the first volume during the act of migrating. Another aspect is a method and apparatus that stores, in a persistent manner, state information indicating a portion of the first volume successfully copied to the second volume. Another aspect is a method and apparatus for migrating data from a first volume to a second volume, and resuming, after an interruption of the migration, copying data from the first volume to the second volume without starting from the beginning of the data. Another aspect is a method and apparatus for migrating data from a first volume to a second volume, receiving an access request directed to the first volume from an application that stores data on the first volume, and redirecting the access request to the second volume without having to reconfigure the application that accesses data on the first volume.

Type: Grant

Filed: January 29, 2003

Date of Patent: April 27, 2010

Assignee: EMC Corporation

Inventors: Steven M. Blumenau, Stephen J. Todd
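
Illustration for patent 7707151: the abstract mentions persistent state that records how much of the first volume has been copied, so that an interrupted migration can resume without starting over. The sketch below models only that checkpoint-and-resume idea. The volumes are in-memory byte buffers, the state is a small JSON file, and the names `migrate`, `state_path`, and `fail_after` are hypothetical; none of this is taken from the patent.

```python
import json
import os
import tempfile


def migrate(src, dst, state_path, chunk=4, fail_after=None):
    """Copy src into dst chunk by chunk, persisting the copied offset.

    fail_after simulates an interruption after that many chunks.
    Returns the number of bytes known to be copied.
    """
    offset = 0
    if os.path.exists(state_path):
        # Resume from the last offset that was durably recorded.
        with open(state_path) as f:
            offset = json.load(f)["copied"]
    chunks_done = 0
    while offset < len(src):
        if fail_after is not None and chunks_done == fail_after:
            raise RuntimeError("simulated interruption")
        end = min(offset + chunk, len(src))
        dst[offset:end] = src[offset:end]
        offset = end
        # Record progress only after the chunk has been copied.
        with open(state_path, "w") as f:
            json.dump({"copied": offset}, f)
        chunks_done += 1
    return offset


source = bytes(range(10))
target = bytearray(len(source))
state_file = os.path.join(tempfile.mkdtemp(), "migration_state.json")
try:
    migrate(source, target, state_file, fail_after=1)
except RuntimeError:
    pass  # first attempt is interrupted after one chunk
migrate(source, target, state_file)  # second attempt resumes at offset 4
```

The sketch does not model writes arriving at the first volume during migration or the redirection of access requests, which the abstract lists as separate aspects.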
Accessing data in inaccessible memory while emulating memory access instruction by executing translated instructions including call to transfer data to accessible memory

Patent number: 7707392

Abstract: An information processing system includes a first processor that accesses a first memory, a second processor that accesses a second memory, and a data transfer unit for executing data transfer between the first memory and the second memory. The first processor executes functions of translating an instruction out of instructions included in the program except a memory access instruction into an instruction for the second processor and translating the memory access instruction into an instruction sequence containing a call instruction of the program to transfer the access data on the first memory to the second memory via a data transfer unit.

Type: Grant

Filed: March 13, 2008

Date of Patent: April 27, 2010

Assignee: Kabushiki Kaisha Toshiba

Inventors: Seiji Maeda, Hidenori Matsuzaki, Yusuke Shirota, Kazuya Kitsunai
Computer memory architecture for hybrid serial and parallel computing systems

Patent number: 7707388

Abstract: In one embodiment, a serial processor is configured to execute software instructions in a software program in serial. A serial memory is configured to store data for use by the serial processor in executing the software instructions in serial. A plurality of parallel processors are configured to execute software instructions in the software program in parallel. A plurality of partitioned memory modules are provided and configured to store data for use by the plurality of parallel processors in executing software instructions in parallel. Accordingly, a processor/memory structure is provided that allows serial programs to use quick local serial memories and parallel programs to use partitioned parallel memories. The system may switch between a serial mode and a parallel mode. The system may incorporate pre-fetching commands of several varieties.

Type: Grant

Filed: November 29, 2006

Date of Patent: April 27, 2010

Assignee: XMTT Inc.

Inventor: Uzi Vishkin
Method and apparatus for downloading program by using hand shaking in digital signal processing

Patent number: 7698538

Abstract: A method and apparatus are provided for downloading a program by using hand-shaking in a digital signal processor (DSP), in which the program stored at an external memory is downloaded to an internal memory by using the hand-shaking in an asynchronous system having a dual CPU, wherein current operation of the digital signal processor is temporarily held to shorten a downloading time.

Type: Grant

Filed: January 9, 2003

Date of Patent: April 13, 2010

Assignee: Samsung Electronics Co., Ltd.

Inventor: Seong-Ho Yoon
Ultra low power ASIP architecture

Patent number: 7694084

Abstract: A microcomputer architecture comprises a microprocessor unit and a first memory unit, the microprocessor unit comprising a functional unit and at least one data register, the functional unit and the at least one data register being linked to a data bus internal to the microprocessor unit. The data register is a wide register comprising a plurality of second memory units which are capable to each contain one word. The wide register is adapted so that the second memory units are simultaneously accessible by the first memory unit, and so that at least part of the second memory units are separately accessible by the functional unit.

Type: Grant

Filed: March 10, 2006

Date of Patent: April 6, 2010

Assignee: IMEC

Inventors: Praveen Raghavan, Francky Catthoor
Memory access control in a multiprocessor system

Patent number: 7689779

Abstract: Access to a memory area by a first processor that executes a first processor program and a second processor that executes a second processor program is granted to one of the first processor and the second processor at a time. Access to the memory area by the first processor and the second processor are cyclically uniquely allocated (e.g., t?[(ad mod m)=o]) between the first and the second processor by the first and second processor programs.

Type: Grant

Filed: August 14, 2006

Date of Patent: March 30, 2010

Assignee: Micronas GmbH

Inventors: Matthias Vierthaler, Carsten Noeske
Method and system to indicate an exception-triggering page within a microprocessor

Patent number: 7689806

Abstract: A method and system to indicate which page within a software-managed page table triggers an exception within a microprocessor, such as, for example, a digital signal processor, wherein a software-managed translation lookaside buffer (TLB) module receives a virtual address produced by an instruction within a Very Long Instruction Word (VLIW) packet, such as, for example, a fetch instruction, and further compares the virtual address to each stored TLB entry. If a match exists, then the TLB module outputs a corresponding mapped physical address for the instruction. Otherwise, if the VLIW packet spans two pages, where a first page is present as a TLB entry within the TLB module and the second page is missing from the stored TLB entries, an indication bit within a data field of a control register is set to identify the TLB miss exception to a software management unit.

Type: Grant

Filed: July 14, 2006

Date of Patent: March 30, 2010

Assignee: Q

Inventors: Lucian Codrescu, Erich Plondke, Muhammad Ahmed, Vijaya Kumar Janjanam
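
Illustration for patent 7689806: the abstract describes a TLB lookup in which a packet that spans two pages can hit on the first page and miss on the second, with an indication bit recording that case. The sketch below is one possible reading of that behavior. The dictionary-based TLB, the 4 KiB page size, and the function name `translate_packet` are assumptions, not details from the patent.

```python
PAGE_SIZE = 4096  # assumed page size in bytes


def translate_packet(tlb, vaddr, length, page_size=PAGE_SIZE):
    """Translate the start address of a packet of `length` bytes.

    tlb maps virtual page numbers to physical page numbers.
    Returns (physical_address, second_page_miss). physical_address is
    None on any miss; second_page_miss models the indication bit and is
    True only when the first page hits and the second page misses.
    """
    first_page = vaddr // page_size
    last_page = (vaddr + length - 1) // page_size
    if first_page not in tlb:
        return None, False
    if last_page != first_page and last_page not in tlb:
        return None, True
    return tlb[first_page] * page_size + vaddr % page_size, False


tlb = {1: 7}  # hypothetical single entry: virtual page 1 -> physical page 7
print(translate_packet(tlb, 4096 + 16, 16))    # packet inside page 1
print(translate_packet(tlb, 4096 + 4088, 16))  # packet crosses into page 2
```

With the flag set, exception-handling software could tell that the missing translation is for the page after the one containing the packet's start address.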
EMBEDDED-DRAM DSP ARCHITECTURE HAVING IMPROVED INSTRUCTION SET

Publication number: 20100070742

Abstract: An embedded-DRAM processor architecture includes a DRAM array, a set of register files, a set of functional units, and a data assembly unit. The data assembly unit includes a set of row-address registers and is responsive to commands to activate and deactivate DRAM rows and to control the movement of data throughout the system. A pipelined data assembly approach allows the functional units to perform register-to-register operations, and allows the data assembly unit to perform all load/store operations using wide data busses. Data masking and switching hardware allows individual data words or groups of words to be transferred between the registers and memory. Other aspects of the disclosure include a memory and logic structure and an associated method to extract data blocks from memory to accelerate, for example, operations related to image compression and decompression.

Type: Application

Filed: November 20, 2009

Publication date: March 18, 2010

Applicant: Micron Technology, Inc.

Inventor: Eric M. Dowling
Stream processor and information processing apparatus

Patent number: 7680962

Abstract: An array type processor comprises a data path unit to execute processing, and a state management unit to control the state of the data path unit in accordance with a command that specifies processing on the data. An input DMA circuit reads from a memory information and data to be processed including a command corresponding to the data. The input DMA circuit first transfers the command to the state management unit, and then transfers the data to be processed to the data path unit.

Type: Grant

Filed: December 21, 2005

Date of Patent: March 16, 2010

Assignee: NEC Electronics Corporation

Inventors: Kenichiro Anjo, Katsumi Togawa, Ryoko Sasaki, Taro Fujii, Masato Motomura
Method and apparatus for processing data in a processing unit being a thread in a multithreading environment

Patent number: 7680964

Abstract: A method for improving timing behavior of a processing unit in a multithreading environment is disclosed, wherein the processing unit generates data frames for an output unit by combining data from a plurality of input units, and the processed data are buffered in an output buffer between the processing unit and the output unit. The method comprises sending from the output unit to the processing unit a value corresponding to the filling of the output buffer, calculating a timer value, setting a timer with the timer value, wherein the timer calls the processing unit thread after the specified time. The timer value depends on the value corresponding to the averaged filling of the output buffer. As a result, the average filling of the output buffer is lower compared to conventional thread management, and thus the system is more flexible and reacts quicker.

Type: Grant

Filed: May 26, 2005

Date of Patent: March 16, 2010

Assignee: Thomson Licensing

Inventor: Jürgen Schmidt
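
Illustration for patent 7680964: the abstract says the timer value that schedules the processing thread depends on the averaged filling of the output buffer, but it does not give the formula. The sketch below assumes a simple linear mapping from average fill to delay. The function name `next_timer_ms` and the 1 ms and 20 ms bounds are hypothetical.

```python
def next_timer_ms(fill_history, capacity, min_ms=1.0, max_ms=20.0):
    """Pick the delay before the processing thread is called again.

    fill_history holds recent buffer fill levels reported by the output
    unit. A fuller buffer on average yields a longer delay; an emptier
    one yields a shorter delay. The linear mapping is an assumption.
    """
    average_fill = sum(fill_history) / len(fill_history)
    ratio = min(max(average_fill / capacity, 0.0), 1.0)
    return min_ms + ratio * (max_ms - min_ms)


# Hypothetical fill reports for an 8-frame output buffer.
print(next_timer_ms([4, 4], capacity=8))
```

Averaging several reports, rather than using the latest one alone, keeps a single unusually full or empty reading from swinging the timer.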
Method and apparatus for providing large register address space while maximizing cycletime performance for a multi-threaded register file set

Patent number: 7681018

Abstract: A parallel hardware-based multithreaded processor is described. The processor includes a general purpose processor that coordinates system functions and a plurality of microengines that support multiple hardware threads or contexts. The processor also includes a memory control system that has a first memory controller that sorts memory references based on whether the memory references are directed to an even bank or an odd bank of memory and a second memory controller that optimizes memory references based upon whether the memory references are read references or write references. Instructions for switching and branching based on executing contexts are also disclosed.

Type: Grant

Filed: January 12, 2001

Date of Patent: March 16, 2010

Assignee: Intel Corporation

Inventors: Gilbert Wolrich, Matthew J. Adiletta, William Wheeler
Method and system for function acceleration using custom instructions

Patent number: 7676661

Abstract: A fast linked multiprocessor network including a plurality of processing modules implemented on a field programmable gate array and a plurality of configurable uni-directional links coupled among at least two of the plurality of processing modules provide a streaming communication channel between at least two of the plurality of processing modules. Such configuration provides a function accelerator that can feed at least one processor with data values using one custom instruction to put data values on at least one uni-directional serial link and that can extract data values from at least one processor using one custom instruction to get data values from the at least one uni-directional serial link.

Type: Grant

Filed: October 5, 2004

Date of Patent: March 9, 2010

Assignee: Xilinx, Inc.

Inventors: Sundararajarao Mohan, Satish R. Ganesan, Goran Bilski
Packet processor with wide register set architecture

Patent number: 7676646

Abstract: A Wide Register Set (WRS) is used in a packet processor to increase performance for certain packet processing operations. The registers in the WRS have wider bit lengths than the main registers used for primary packet processing operations. A wide logic unit is configured to conduct logic operations on the wide register set and in one implementation includes hardware primitives specifically configured for packet scheduling operations. A special interlocking mechanism is additionally used to coordinate accesses among multiple processors or threads to the same wide register address locations. The WRS produces a scheduling engine that is much cheaper than previous hardware solutions with higher performance than previous software solutions. The WRS provides a small, compact, flexible, and scalable scheduling sub-system and can tolerate long memory latencies by using cheaper memory while sharing memory with other uses.

Type: Grant

Filed: March 2, 2005

Date of Patent: March 9, 2010

Assignee: Cisco Technology, Inc.

Inventor: Earl T. Cohen
Circuit for monitoring a microprocessor and analysis tool and inputs/outputs thereof

Patent number: 7673121

Abstract: A method for the transmission of digital messages by the output terminals of a monitoring circuit which is integrated into a microprocessor, the digital messages being representative of first specific events which are dependent on the execution of a series of instructions by the microprocessor.

Type: Grant

Filed: November 14, 2002

Date of Patent: March 2, 2010

Assignee: STMicroelectronics S.A.

Inventors: Catherine Robert, Xavier Robert, Jehan-Philippe Barbiero
Crossbar switch, information processor, and transfer method

Patent number: 7672305

Abstract: Port input sections generate, when the head flits of a packet are stored in the first and second registers, first and second mediation request signals destined for a desired request destination, and further generate a first notification signal used to notify the presence or absence of the first mediation request signal destined for any request destination. Upon reception of a mediation result signal, the port input sections output the flit from the first register and sequentially forward flits to be stored in the first register and the second register, and the port output sections sequentially output the flit outputted from the first register of any one of the port input sections to the node.

Type: Grant

Filed: October 2, 2006

Date of Patent: March 2, 2010

Assignee: NEC Corporation

Inventor: Yoshihisa Yamada
DATA CACHE RECEIVE FLOP BYPASS

Publication number: 20100049953

Abstract: A microprocessor includes an N-way cache and a logic block that selectively enables and disables the N-way cache for at least one clock cycle if a first register load instruction and a second register load instruction, following the first register load instruction, are detected as pointing to the same index line in which the requested data is stored. The logic block further provides a disabling signal to the N-way cache for at least one clock cycle if the first and second instructions are detected as pointing to the same cache way.

Type: Application

Filed: August 20, 2008

Publication date: February 25, 2010

Applicant: MIPS Technologies, Inc.

Inventors: Ajit Karthik Mylavarapu, Sanjai Balakrishnan Athi
Instruction-parallel processor with zero-performance-overhead operand copy

Patent number: 7669041

Abstract: A processor having a zero-overhead operand copy capability. The processor includes multiple execution units to execute instructions in parallel and multiple register files each associated with one or more of the execution units. The processor further includes circuitry to select either an instruction execution result from a first one of the execution units or content of a register within a first one of the register files associated with the first one of the execution units to be stored within a register within a second one of the register files.

Type: Grant

Filed: April 30, 2007

Date of Patent: February 23, 2010

Assignee: Stream Processors, Inc.

Inventors: Brucek Khailany, Ujval J. Kapasi
Method and apparatus for executing a long transaction

Patent number: 7669040

Abstract: A system that executes a long transaction in a system with limited transactional hardware resources. During operation, the system executes the long transaction in a non-transactional mode, which does not use transactional hardware resources. The system defers stores generated during the long transaction so that the stores are not committed to the architectural state of a processor until the transaction is successfully completed. If the long transaction successfully completes, the system commits the long transaction, which involves performing multiple hardware transactions to commit the deferred stores to the architectural state of the processor.

Type: Grant

Filed: December 15, 2006

Date of Patent: February 23, 2010

Assignee: Sun Microsystems, Inc.

Inventor: David Dice
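
Illustration for patent 7669040: the abstract describes deferring the stores of a long transaction and committing them afterwards through multiple hardware transactions. The sketch below models the deferral with a software store buffer and models each hardware transaction as one small batch of updates. The names `run_long_transaction`, `load`, `store`, and `batch_size` are hypothetical, and the sketch does not address how the real system keeps the multi-batch commit atomic with respect to other threads.

```python
def run_long_transaction(memory, body, batch_size=2):
    """Run body with deferred stores, then commit them in batches.

    memory is a dict of address -> value. body receives load and store
    callables. Returns the number of commit batches used.
    """
    deferred = {}

    def load(address):
        # A load must observe the transaction's own deferred stores.
        return deferred.get(address, memory.get(address, 0))

    def store(address, value):
        deferred[address] = value

    body(load, store)  # an exception here leaves memory untouched

    items = list(deferred.items())
    batches = 0
    for start in range(0, len(items), batch_size):
        # Each batch stands in for one bounded hardware transaction.
        memory.update(items[start:start + batch_size])
        batches += 1
    return batches


def example_body(load, store):
    store(1, load(0) + 1)
    store(2, 5)
    store(3, load(1))


memory = {0: 1}
print(run_long_transaction(memory, example_body), memory)
```

Because nothing reaches `memory` until `body` returns, a transaction that fails partway through has no visible effect, which matches the abstract's statement that stores are not committed until the transaction completes successfully.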
Register allocation method and system for program compiling

Patent number: 7660970

Abstract: Disclosed is a data processing system and method. The data processing method determines the number of static registers and the number of rotating registers for assigning a register to a variable contained in a certain program, assigns the register to the variable based on the number of the static registers and the number of the rotating registers, and compiles the program. Further, the method stores in the special register a value corresponding to the number of the rotating registers in the compiling operation, and obtains a physical address from a logical address of the register based on the value. Accordingly, the present invention provides an aspect of efficiently using register files by dynamically controlling the number of rotating registers and the number of static registers for a software pipelined loop, and has an effect capable of reducing the generations of spill/fill codes unnecessary during program execution to a minimum.

Type: Grant

Filed: August 21, 2006

Date of Patent: February 9, 2010

Assignee: Samsung Electronics Co., Ltd.

Inventors: Suk-jin Kim, Jeong-wook Kim, Hong-seok Kim, Soo-jung Ryu
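
Illustration for patent 7660970: the abstract says a physical register address is obtained from a logical address based on a stored value that corresponds to the number of rotating registers. The patent's exact mapping is not given in the abstract, so the sketch below assumes a common scheme in which static registers map directly and rotating registers are offset by a rotation base, modulo the size of the rotating region. All names are hypothetical.

```python
def physical_register(logical, num_static, num_rotating, rotation_base):
    """Map a logical register number to a physical one.

    Registers 0 .. num_static-1 are static and map to themselves.
    The next num_rotating registers rotate: their index within the
    rotating region is shifted by rotation_base and wrapped.
    """
    if logical < 0:
        raise ValueError("logical register must be non-negative")
    if logical < num_static:
        return logical
    index = logical - num_static
    if index >= num_rotating:
        raise ValueError("logical register out of range")
    return num_static + (index + rotation_base) % num_rotating


# Hypothetical split: 8 static registers and 4 rotating registers.
for logical in (3, 8, 11):
    print(logical, "->", physical_register(logical, 8, 4, rotation_base=1))
```

Changing `num_static` and `num_rotating` per loop, as the abstract suggests the compiler does, changes where the wrap-around boundary falls without changing the mapping logic.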
Result data forwarding in parallel vector data processor based on scalar operation issue order

Patent number: 7660967

Abstract: A computer processor is responsive to successive processing instructions in an issue order to process regular vectors to generate a result vector without use of a cache. At least two architectural registers having input-vector capability are selectively coupled to memory to receive corresponding vector-elements of two vectors and transfer the vector-elements to a selected functional unit. At least one architectural register having output capability is selectively coupled to an output, which in turn is coupled to transfer result vector-elements to the memory. The functional unit performs a function on the vector-elements to generate a respective result-element. The result-elements are transferred to a selected architectural register for processing as operands in performance of further functions by a functional unit, or are transferred to the output for transfer to memory. In either case, the order of the result vector-elements is restored to the issue order of the successive processing instructions.

Type: Grant

Filed: January 30, 2008

Date of Patent: February 9, 2010

Assignee: Efficient Memory Technology

Inventor: Maurice L. Hutson
Graphics processing on a processor core

Patent number: 7656409

Abstract: In a many core system, receiving a call to a graphics driver; translating the call into a command executable on a core of the many core system; and executing the translated call on the core.

Type: Grant

Filed: December 23, 2005

Date of Patent: February 2, 2010

Assignee: Intel Corporation

Inventors: Lyle Cool, Yasser Rasheed
PROGRAMMABLE SIGNAL PROCESSING CIRCUIT AND METHOD OF DEMODULATING

Publication number: 20100017453

Abstract: A programmable signal processing circuit has an instruction processing circuit (23, 24, 26), which has an instruction set that comprises a demapping instruction. The instruction processing circuit (23, 24, 26) has an operand input (30a) for receiving a complex number operand of the demapping instruction from a register file (22) and a result output (34) for writing a demapping result of the demapping instruction to the register file (22). The instruction processing circuit (23, 24, 26) determines at least four bit metrics in response to the demapping instruction, each indicating a relative position of the complex number relative to a respective border line in a complex plane. The instruction processing circuit (23, 24, 26) writes a combination of the at least four bit metrics together to the result output (34) in the demapping result.

Type: Application

Filed: December 13, 2005

Publication date: January 21, 2010

Applicant: KONINKLIJKE PHILIPS ELECTRONICS, N.V.

Inventors: Ingolf Held, Marcus M.G. Quax, Paulus W.F. Gruijters
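
Illustration for publication 20100017453: the abstract describes an instruction that turns one complex operand into at least four bit metrics, each expressing the operand's position relative to a border line in the complex plane. The abstract does not name a constellation, so the sketch below assumes Gray-mapped 16-QAM with levels of plus or minus 1 and 3, for which the border lines are Re = 0, |Re| = 2, Im = 0, and |Im| = 2. The function name `demap_16qam` is hypothetical.

```python
def demap_16qam(z):
    """Return four signed bit metrics for one received 16-QAM symbol.

    Each metric is the signed distance of z from one decision border:
    the sign gives the hard bit decision and the magnitude gives its
    reliability. Constellation levels of +/-1 and +/-3 are assumed.
    """
    re, im = z.real, z.imag
    return (
        re,               # border line Re = 0
        2.0 - abs(re),    # border lines Re = +/-2
        im,               # border line Im = 0
        2.0 - abs(im),    # border lines Im = +/-2
    )


# A symbol received exactly on the constellation point 3 - 1j.
print(demap_16qam(complex(3, -1)))
```

Returning all four metrics from one call mirrors the abstract's point that a single instruction writes the combined metrics to the result output.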
Communication between processor core partitions with exclusive read or write to descriptor queues for shared memory space

Patent number: 7650488

Abstract: In an embodiment, a method is provided that may include providing a first address space exclusively and coherently accessible by a first processor core partition in a platform. A second address space may be provided in this embodiment that is exclusively and coherently accessible by a second processor core partition in the platform. Also in this embodiment, a third address space in the platform may be provided that is accessible, at least in part, by both the first and second processor core partitions and may be to permit communication between the first and second processor core partitions of at least one packet and at least one descriptor associated with the at least one packet. The at least one descriptor may indicate, at least in part, one or more locations in the third address space to store, at least in part, the at least one packet. Of course, many alternatives, modifications, and variations are possible without departing from this embodiment.

Type: Grant

Filed: June 18, 2008

Date of Patent: January 19, 2010

Assignee: Intel Corporation

Inventors: Annie Foong, Bryan E. Veal, Arun Raghunath
Method and apparatus for implementing atomicity of memory operations in dynamic multi-streaming processors

Patent number: 7650605

Abstract: A multi-streaming processor has a plurality of streams for streaming one or more instruction threads, a set of functional resources for processing instructions from streams, and a lock mechanism for locking selected memory locations shared by streams of the processor, the hardware-lock mechanism operating to set a lock when an atomic memory sequence is started and to clear a lock when an atomic memory sequence is completed. In preferred embodiments the lock mechanism comprises one or more storage locations associated with each stream of the processor, each storage location enabled to store a memory address, a lock bit, and a stall bit. Methods for practicing the invention using the apparatus are also taught.

Type: Grant

Filed: February 20, 2007

Date of Patent: January 19, 2010

Assignee: MIPS Technologies, Inc.

Inventors: Stephen Melvin, Mario D. Nemirovsky
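
Illustration for patent 7650605: the abstract describes per-stream storage locations that each hold a memory address, a lock bit, and a stall bit, with the lock set when an atomic sequence starts and cleared when it completes. The sketch below is a software model of one way such a table could behave. The class names, the one-entry-per-stream layout, and the rule for releasing stalled streams are assumptions rather than details from the patent.

```python
from dataclasses import dataclass


@dataclass
class LockEntry:
    address: int = 0
    lock: bool = False   # this stream holds the lock on address
    stall: bool = False  # this stream is waiting for address


class LockTable:
    def __init__(self, num_streams):
        # One entry per stream, assuming one atomic sequence at a time.
        self.entries = [LockEntry() for _ in range(num_streams)]

    def begin_atomic(self, stream, address):
        """Start an atomic sequence; return False if the stream must stall."""
        for other, entry in enumerate(self.entries):
            if other != stream and entry.lock and entry.address == address:
                self.entries[stream] = LockEntry(address, lock=False, stall=True)
                return False
        self.entries[stream] = LockEntry(address, lock=True, stall=False)
        return True

    def end_atomic(self, stream):
        """Finish an atomic sequence and release streams stalled on it."""
        address = self.entries[stream].address
        self.entries[stream].lock = False
        for entry in self.entries:
            if entry.stall and entry.address == address:
                entry.stall = False


table = LockTable(num_streams=2)
print(table.begin_atomic(0, 0x40))  # stream 0 takes the lock
print(table.begin_atomic(1, 0x40))  # stream 1 must stall
table.end_atomic(0)
print(table.begin_atomic(1, 0x40))  # stream 1 can now proceed
```

Streams touching different addresses never conflict in this model, since a stall is raised only when another stream's entry holds a lock on the same address.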
STORAGE DEVICE, CONTROLLING METHOD FOR STORAGE DEVICE, AND CONTROL PROGRAM

Publication number: 20100005257

Abstract: A storage device includes a first storage unit that stores data read from a recording medium based on an instruction received from a processing device, and transmits the data stored in the first storage unit to the processing device. The storage device also includes a second storage unit that stores the instruction received from the processing device; a counter that counts the number of pieces of data stored in the first storage unit; and a control unit that transmits the data stored in the first storage unit to the processing device based on a count value of the counter and, when the data read upon the instruction is stored in the first storage unit, writes identification information indicating that storing data has been completed in the second storage unit and, based on the identification information, transmits the data stored in the first storage unit to the processing device.

Type: Application

Filed: June 8, 2009

Publication date: January 7, 2010

Applicant: FUJITSU LIMITED

Inventors: Masaaki Tamura, Gen Ohshima
BRANCH TRACE METHODOLOGY

Publication number: 20100005316

Abstract: Method, system, and computer program product embodiments for performing a branch trace operation on a computer system of an end user are provided. An encrypted mapping macro is provided to the end user to be made operational on the computer system. A trace program is provided to the end user. The end user executes the trace program on the computer system as a diagnostic tool. The trace program is adapted for decrypting the encrypted mapping macro, determining a storage offset location of a branch instruction, checking the storage offset location for an identifying constant, cross referencing the identifying constant with an entry in the decrypted mapping macro to identify a branch triggering bit and diagnostic information associated with the branch instruction, and returning the branch triggering bit and diagnostic information, the branch triggering bit and diagnostic information provided to a coder.

Type: Application

Filed: July 7, 2008

Publication date: January 7, 2010

Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: David Bruce LeGENDRE, David Charles REED, Max Douglas SMITH
Data processor

Publication number: 20100005279

Abstract: The data processor executes an instruction having a direction for write to a reference register of other instruction flow and an instruction having a direction for reference register invalidation. The data processor is arranged as a data processor having typical functions as an integrated whole of processors (CPU1 and CPU2) which execute simple instruction flows. When executing the instruction having a direction for write to a reference register of other instruction flow, the processor confirms whether a write register is invalid. The processor waits for the register to be made invalid, if the register is not invalid, and performs write if the register is invalid. After having executed the instruction having a direction for reference register invalidation, the processor invalidates the register to which a reference has been made. When the reference register is invalid, execution of the referring instruction is suspended until it is made valid.

Type: Application

Filed: September 14, 2009

Publication date: January 7, 2010

Inventor: Fumio Arakawa
System and Method to Perform Fast Rotation Operations

Publication number: 20090327667

Abstract: Systems and methods to perform fast rotation operations are disclosed. In a particular embodiment, a method includes executing a single instruction. The method includes receiving first data indicating a first coordinate and a second coordinate, receiving a first control value that indicates a first rotation value selected from a set of ninety degree multiples, and writing output data corresponding to the first data rotated by the first rotation value.

Type: Application

Filed: June 26, 2008

Publication date: December 31, 2009

Applicant: QUALCOMM INCORPORATED

Inventors: Shankar Krithivasan, Erich James Plondke, Lucian Codrescu, Mao Zeng, Remi Jonathan Gurski
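
Illustration for publication 20090327667: the abstract describes rotating a pair of coordinates by a rotation value chosen from multiples of ninety degrees. Such rotations need no multiplication, only swaps and sign changes, which the sketch below shows. The counterclockwise direction and the function name `rotate90` are assumptions; the abstract does not state the direction or the encoding of the control value.

```python
def rotate90(x, y, k):
    """Rotate the point (x, y) counterclockwise by k * 90 degrees.

    Only swaps and negations are used, so the result is exact for
    integer coordinates.
    """
    k %= 4
    if k == 0:
        return (x, y)
    if k == 1:
        return (-y, x)
    if k == 2:
        return (-x, -y)
    return (y, -x)


for k in range(4):
    print(k * 90, rotate90(1, 2, k))
```

Treating the first coordinate as the real part and the second as the imaginary part, the same four cases correspond to multiplying a complex number by 1, j, -1, and -j.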
NETWORK TASK OFFLOAD APPARATUS AND METHOD THEREOF

Publication number: 20090327693

Abstract: A network task offload apparatus includes an offload circuit and a buffer scheduler. The offload circuit performs corresponding network task processing on a plurality of packets in parallel according to an offload command. The buffer scheduler includes a buffer control unit and a plurality of buffer units. The plurality of buffer units are controlled by the buffer control unit and are scheduled to store the processed packets.

Type: Application

Filed: June 24, 2009

Publication date: December 31, 2009

Inventors: Li-Han Liang, Tao-Chun Wang, Kuo-Nan Yang, Shieh-Hsing Kuo
Multi-Threaded Processes For Opening And Saving Documents

Publication number: 20090327668

Abstract: Tools and techniques are described for multi-threaded processing for opening and saving documents. These tools may provide load processes for reading documents from storage devices, and for loading the documents into applications. These tools may spawn a load process thread for executing a given load process on a first processing unit, and an application thread may execute a given application on a second processing unit. A first pipeline may be created for executing the load process thread, with the first pipeline performing tasks associated with loading the document into the application. A second pipeline may be created for executing the application process thread, with the second pipeline performing tasks associated with operating on the documents. The tasks in the first pipeline are configured to pass tokens as input to the tasks in the second pipeline.

Type: Application

Filed: June 27, 2008

Publication date: December 31, 2009

Applicant: MICROSOFT CORPORATION

Inventors: Uladzislau Sudzilouski, Igor Zaika
METHOD AND SYSTEM FOR HARDWARE-BASED SECURITY OF OBJECT REFERENCES

Publication number: 20090327666

Abstract: A method for managing data, including obtaining a first instruction for moving a first data item from a first source to a first destination, determining a data type of the first data item, determining a data type supported by the first destination, comparing the data type of the first data item with the data type supported by the first destination to test a validity of the first instruction, and moving the first data item from the first source to the first destination based on the validity of the first instruction.

Type: Application

Filed: June 25, 2008

Publication date: December 31, 2009

Applicant: SUN MICROSYSTEMS, INC.

Inventors: Mario I. Wolczko, Gregory M. Wright, Matthew L. Seidl
Method and apparatus for forwarding store data to loads in a pipelined processor

Patent number: 7640414

Abstract: Methods, systems, and computer program products for forwarding store data to loads in a pipelined processor are provided. In one implementation, a processor is provided that includes a decoder operable to decode an instruction, and a plurality of execution units operable to respectively execute a decoded instruction from the decoder. The plurality of execution units include a load/store execution unit operable to execute decoded load instructions and decoded store instructions and generate corresponding load memory operations and store memory operations. The store queue is operable to buffer one or more store memory operations prior to the one or more memory operations being completed, and the store queue is operable to forward store data of the one or more store memory operations buffered in the store queue to a load memory operation on a byte-by-byte basis.
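Illustrative sketch (not part of the patent text): byte-by-byte forwarding, as described in the abstract above, means a load can take some of its bytes from buffered stores and the rest from memory. The function name, the list-of-tuples store queue, and the oldest-to-youngest ordering are assumptions for illustration only.

```python
def load_with_forwarding(memory: bytearray, store_queue: list, addr: int, size: int) -> bytes:
    """Return `size` bytes starting at `addr`.

    store_queue is a list of (store_addr, data_bytes) pairs, ordered oldest
    to youngest, holding stores that have not yet reached memory. Each byte
    of the load comes from the youngest buffered store covering that byte,
    or from memory when no buffered store covers it.
    """
    result = bytearray(memory[addr:addr + size])
    # Walk oldest to youngest so that younger stores overwrite older ones.
    for store_addr, data in store_queue:
        for i, byte in enumerate(data):
            offset = store_addr + i - addr
            if 0 <= offset < size:
                result[offset] = byte
    return bytes(result)
```

With memory holding the bytes 0 to 15 and two buffered stores, `(4, b"\xaa\xbb")` followed by `(5, b"\xcc")`, a 4-byte load at address 3 mixes one byte from each store with two bytes from memory.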

Type: Grant

Filed: November 16, 2006

Date of Patent: December 29, 2009

Assignee: International Business Machines Corporation

Inventors: Jason Alan Cox, Kevin Chih Kang Lin, Eric Francis Robinson
SINGLE-CYCLE LOW POWER CPU ARCHITECTURE

Publication number: 20090319760

Abstract: An architecture for implementing an instruction pipeline within a CPU comprises an arithmetic logic unit (ALU), an address arithmetic unit (AAU), a program counter (PC), a read-only memory (ROM) coupled to the program counter, to an instruction register, and to an instruction decoder coupled to the arithmetic logic unit. A random access memory (RAM) is coupled to the instruction decoder, to the arithmetic logic unit, and to a RAM address register.

Type: Application

Filed: August 27, 2009

Publication date: December 24, 2009

Inventors: Benjamin F. Froemming, Emil Lambrache
SYSTEM AND METHOD FOR PROCESSING LOW DENSITY PARITY CHECK CODES USING A DETERMINISTIC CACHING APPARATUS

Publication number: 20090313459

Abstract: A system, method and article of manufacture are disclosed for processing Low Density Parity Check (LDPC) codes. The system comprises a multitude of processing units for processing the codes; and a processor chip including an on-chip, multi-port data cache for temporarily storing the LDPC codes. This data cache includes a plurality of input ports for receiving the LDPC codes from some of the processing units, and a plurality of output ports for sending the LDPC codes to others of the processing units. An off-chip, external memory stores the LDPC codes and transmits the LDPC codes to and receives the LDPC codes from at least some of the processing units. A sequence processor controls the transmission of the LDPC codes between the processor units and the on-chip data cache so that the LDPC codes are processed by the processing units according to a given sequence.

Type: Application

Filed: June 13, 2008

Publication date: December 17, 2009

Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventor: Thomas A. Horvath
Systems and methods for reordering processor instructions

Patent number: 7634635

Abstract: Systems and methods for reordering processor instructions. In accordance with a first embodiment of the present invention, a microprocessor comprises circuitry to process an instruction extension, wherein the instruction extension is transparent to the programming model of the microprocessor. The instruction extension may comprise a field for indicating an offset from a memory structure pointer. The microprocessor includes circuitry for adding the offset to the memory structure pointer to indicate a specific element of the memory structure. The specific element of the memory structure comprises address information corresponding to speculative data.

Type: Grant

Filed: April 7, 2006

Date of Patent: December 15, 2009

Inventors: Brian Holscher, Guillermo Rozas, James Van Zoeren, David Dunn
Avoiding live-lock in a processor that supports speculative execution

Patent number: 7634639

Abstract: One embodiment of the present invention provides a system which avoids a live-lock state in a processor that supports speculative-execution. The system starts by issuing instructions for execution in program order during execution of a program in a normal-execution mode. Upon encountering a launch condition during the execution of an instruction (a “launch instruction”) which causes the processor to enter a speculative-execution mode, the system checks status indicators associated with a forward progress buffer. If the status indicators indicate that the forward progress buffer contains data for the launch instruction, the system resumes normal-execution mode. Upon resumption of normal-execution mode, the system retrieves the data from a data field contained in the forward progress buffer and executes the launch instruction using the retrieved data as input data for the launch instruction. The system next deasserts the status indicators.

Type: Grant

Filed: August 23, 2005

Date of Patent: December 15, 2009

Assignee: Sun Microsystems, Inc.

Inventors: Shailender Chaudhry, Paul Caprioli, Sherman H. Yip, Guarav Garg, Ketaki Rao
Performing An Allreduce Operation On A Plurality Of Compute Nodes Of A Parallel Computer

Publication number: 20090307467

Abstract: Methods, apparatus, and products are disclosed for performing an allreduce operation on a plurality of compute nodes of a parallel computer. Each compute node includes at least two processing cores. Each processing core has contribution data for the allreduce operation. Performing an allreduce operation on a plurality of compute nodes of a parallel computer includes: establishing one or more logical rings among the compute nodes, each logical ring including at least one processing core from each compute node; performing, for each logical ring, a global allreduce operation using the contribution data for the processing cores included in that logical ring, yielding a global allreduce result for each processing core included in that logical ring; and performing, for each compute node, a local allreduce operation using the global allreduce results for each processing core on that compute node.
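Illustrative sketch (not part of the patent text): the ordering in the abstract above is a global allreduce per logical ring, followed by a local allreduce on each node. The code below models only the values each step produces, not the message passing. It assumes summation as the reduction, the same number of cores on every node, and that ring j consists of core j from every node; none of these details come from the application.

```python
def ring_then_local_allreduce(contributions):
    """contributions[n][c] is the contribution of core c on node n.

    Returns a same-shaped nested list holding the final result per core.
    """
    nodes = len(contributions)
    cores = len(contributions[0])
    # Global step: ring j combines core j of every node, so every core in
    # ring j ends up holding the same ring result.
    ring_results = [
        sum(contributions[n][j] for n in range(nodes)) for j in range(cores)
    ]
    # Local step: each node combines the ring results held by its own cores.
    total = sum(ring_results)
    return [[total] * cores for _ in range(nodes)]
```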

Type: Application

Filed: May 21, 2008

Publication date: December 10, 2009

Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventor: Ahmad Faraj
Program controlled embedded-DRAM-DSP having improved instruction set architecture

Patent number: 7631170

Abstract: An efficient embedded-DRAM processor architecture and associated methods. In one exemplary embodiment, the architecture includes a DRAM array, a set of register files, a set of functional units, and a data assembly unit. The data assembly unit includes a set of row-address registers and is responsive to commands to activate and deactivate DRAM rows and to control the movement of data throughout the system. A pipelined data assembly approach allows the functional units to perform register-to-register operations and allows the data assembly unit to perform all load/store operations using wide data busses. Data masking and switching hardware allows individual data words or groups of words to be transferred between the registers and memory. Other aspects of the invention include a memory and logic structure and an associated method to extract data blocks from memory to accelerate, for example, operations related to image compression and decompression.

Type: Grant

Filed: February 13, 2002

Date of Patent: December 8, 2009

Assignee: Micron Technology, Inc.

Inventor: Eric M. Dowling
Instruction set design, control and communication in programmable microprocessor cores and the like

Publication number: 20090300337

Abstract: Improved instruction set and core design, control and communication for programmable microprocessors is disclosed, involving the strategy for replacing centralized program sequencing in present-day and prior art processors with a novel distributed program sequencing wherein each functional unit has its own instruction fetch and decode block, and each functional unit has its own local memory for program storage; and wherein computational hardware execution units and memory units are flexibly pipelined as programmable embedded processors with reconfigurable pipeline stages of different order in response to varying application instruction sequences that establish different configurations and switching interconnections of the hardware units.

Type: Application

Filed: May 29, 2008

Publication date: December 3, 2009

Inventors: Xiaolin Wang, Qian Wu, Benjamin Marshall, Fugui Wang, Gregory Pitarys, Ke Ning
Method and circuit implementation for multiple-word transfer into/from memory subsystems

Patent number: 7627743

Abstract: A multi-word transfer instruction, a memory transfer method using the multi-word transfer instruction and a circuit implementation for transferring multiple words between a memory subsystem and a processor register file are provided. The multi-word transfer instruction specifies an access type (load or store), a consecutive register group, a selection mask and a base register for the starting address of the corresponding memory locations. Therefore, the total number of words accessed by this instruction is equal to the number of registers specified in the consecutive register group plus the number of registers specified by the selection mask. In addition, further information, such as an address update mode, an order mode and a modification mode, may be specified in the multi-word transfer instruction.
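Illustrative sketch (not part of the patent text): the word count stated in the abstract above is the size of the consecutive register group plus the number of registers picked by the selection mask. The sketch models only the load case. The function signature, the word-addressed memory, and the `mask_regs` list that maps mask bits to register numbers are assumptions, since the abstract does not give an encoding.

```python
def multi_word_load(regs, memory, base_reg, first, last, mask, mask_regs):
    """Load consecutive memory words into the selected registers.

    regs      -- register file, modified in place
    memory    -- word-addressed memory (list of words)
    base_reg  -- register holding the starting address
    first, last -- bounds (inclusive) of the consecutive register group
    mask      -- selection mask; bit i selects mask_regs[i]
    Returns the total number of words transferred.
    """
    targets = list(range(first, last + 1))
    targets += [r for bit, r in enumerate(mask_regs) if (mask >> bit) & 1]
    # Read the base address before any register is overwritten.
    addr = regs[base_reg]
    for offset, r in enumerate(targets):
        regs[r] = memory[addr + offset]
    return len(targets)
```

For a group of registers 1 to 3 and a mask `0b101` over `mask_regs = [12, 13, 14]`, five words are transferred: three for the group and two for the mask.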

Type: Grant

Filed: January 12, 2007

Date of Patent: December 1, 2009

Assignee: Andes Technology Corporation

Inventors: Hong-Men Su, Chuan-Hua Chang, Jen-Chih Tseng
Methods and apparatus for providing data transfer control

Patent number: 7627698

Abstract: A variety of advantageous mechanisms for improved data transfer control within a data processing system are described. A DMA controller is described which is implemented as a multiprocessing transfer engine supporting multiple transfer controllers which may work independently or in cooperation to carry out data transfers, with each transfer controller acting as an autonomous processor, fetching and dispatching DMA instructions to multiple execution units. In particular, mechanisms for initiating and controlling the sequence of data transfers are provided, as are processes for autonomously fetching DMA instructions which are decoded sequentially but executed in parallel.

Type: Grant

Filed: July 30, 2007

Date of Patent: December 1, 2009

Assignee: Altera Corporation

Inventors: Edwin Franklin Barry, Edward A. Wolff
External memory accessing DMA request scheduling in IC of parallel processing engines according to completion notification queue occupancy level

Patent number: 7627744

Abstract: An integrated circuit comprises an external memory, a plurality of parallel connected Vector Processing Engines (VPEs), and an External Memory Unit (EMU) providing a data transfer path between the VPEs and the external memory. Each VPE contains a plurality of data processing units and a message queuing system adapted to transfer messages between the data processing units and other components of the integrated circuit.

Type: Grant

Filed: May 10, 2007

Date of Patent: December 1, 2009

Assignee: NVIDIA Corporation

Inventors: Monier Maher, Jean Pierre Bordes, Christopher Lamb, Sanjay J. Patel
Performing An Allreduce Operation On A Plurality Of Compute Nodes Of A Parallel Computer

Publication number: 20090292905

Abstract: Methods, apparatus, and products are disclosed for performing an allreduce operation on a plurality of compute nodes of a parallel computer, each node including at least two processing cores, that include: performing, for each node, a local reduction operation using allreduce contribution data for the cores of that node, yielding, for each node, a local reduction result for one or more representative cores for that node; establishing one or more logical rings among the nodes, each logical ring including only one of the representative cores from each node; performing, for each logical ring, a global allreduce operation using the local reduction result for the representative cores included in that logical ring, yielding a global allreduce result for each representative core included in that logical ring; and performing, for each node, a local broadcast operation using the global allreduce results for each representative core on that node.
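Illustrative sketch (not part of the patent text): the three phases in the abstract above are a local reduction, a global allreduce over a ring of representative cores, and a local broadcast. The sketch assumes summation as the reduction, one representative core per node, and a single logical ring; the application allows more than one representative and more than one ring.

```python
def hierarchical_allreduce(contributions):
    """contributions[n][c] is the contribution of core c on node n.

    Returns a same-shaped nested list holding the final result per core.
    """
    # Phase 1: local reduction onto one representative core per node.
    local = [sum(cores) for cores in contributions]

    # Phase 2: ring allreduce among the representatives. At each step every
    # representative forwards the value it last received to its right-hand
    # neighbour and accumulates the value arriving from its left.
    n = len(local)
    totals = list(local)
    in_flight = list(local)
    for _ in range(n - 1):
        in_flight = [in_flight[(i - 1) % n] for i in range(n)]
        totals = [t + v for t, v in zip(totals, in_flight)]

    # Phase 3: local broadcast of the representative's result to all cores.
    return [[totals[i]] * len(cores) for i, cores in enumerate(contributions)]
```

After `n - 1` ring steps every representative has seen every other node's local result exactly once, so all representatives hold the same total before the broadcast.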

Type: Application

Filed: May 21, 2008

Publication date: November 26, 2009

Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventor: Ahmad Faraj
Instructions for efficiently accessing unaligned partial vectors

Patent number: 7624251

Abstract: One embodiment of the present invention provides a processor that is configured to execute load-swapped-partial instructions. An instruction fetch unit within the processor is configured to fetch the load-swapped-partial instruction to be executed. Note that the load-swapped-partial instruction specifies a source address in a memory, which is possibly an unaligned address. Furthermore, an execution unit within the processor is configured to execute the load-swapped-partial instruction. This involves loading a partial-vector-sized datum from a naturally-aligned memory region encompassing the source address.

Type: Grant

Filed: January 18, 2007

Date of Patent: November 24, 2009

Assignee: Apple Inc.

Inventors: Jeffry E. Gonion, Keith E. Diefendorff
PROGRAMMABLE SIGNAL PROCESSING CIRCUIT AND METHOD OF DEPUNCTURING

Publication number: 20090287911

Abstract: A programmable signal processing circuit has an instruction processing circuit (23, 24, 26), with an instruction set that comprises a depuncture instruction. The instruction processing circuit (23, 24, 26) forms the depuncture result by copying bit metrics from a bit metrics operand and inserting one or more predetermined bit metric values between the bit metrics from the bit metric operand in the depuncture result. The instruction processing circuit (23, 24, 26) changes the relative locations of the copied bit metrics with respect to each other in the depuncture result as compared to the relative locations of the copied bit metrics with respect to each other in the bit metric operand, to an extent needed for accommodating the inserted predetermined bit metric value or values.
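Illustrative sketch (not part of the patent text): depuncturing, as described in the abstract above, copies the received bit metrics into the result and inserts a predetermined value wherever a bit was punctured, which shifts the later metrics along. The puncture pattern encoding (1 = transmitted, 0 = punctured) and the neutral fill value 0 are assumptions for illustration.

```python
def depuncture(metrics, pattern, fill=0):
    """Expand `metrics` to the length of `pattern`.

    pattern -- sequence of 1/0 flags, one per output position:
               1 copies the next bit metric, 0 inserts `fill`.
    """
    if sum(pattern) != len(metrics):
        raise ValueError("pattern must select exactly len(metrics) positions")
    it = iter(metrics)
    return [next(it) if keep else fill for keep in pattern]
```

For example, `depuncture([5, -3, 7], [1, 1, 0, 1])` yields `[5, -3, 0, 7]`: the third metric moves one position to make room for the inserted value.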

Type: Application

Filed: December 13, 2005

Publication date: November 19, 2009

Applicant: KONINKLIJKE PHILIPS ELECTRONICS, N.V.

Inventors: Paulus W.F. Gruijters, Marcus M.G. Quax
Instructions for efficiently accessing unaligned vectors

Patent number: 7620797

Abstract: One embodiment of the present invention provides a processor which is configured to execute load-swapped instructions, which are possibly directed to an unaligned source address. The processor is configured to execute the load-swapped instruction by loading a vector from a naturally-aligned memory region encompassing the source address, and in doing so rotating the bytes of the vector to cause the byte at the specified source address to reside at the least-significant byte position within the vector for a little-endian memory transaction, or causing said byte to be positioned at the most-significant byte position within the vector for a big-endian memory transaction.
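Illustrative sketch (not part of the patent text): the little-endian case in the abstract above amounts to an aligned load followed by a byte rotation. The 16-byte vector width, the function name, and the convention that index 0 of the returned bytes is the least-significant byte position are assumptions for illustration.

```python
VECTOR_BYTES = 16  # assumed vector width


def load_swapped_le(memory: bytes, addr: int) -> bytes:
    """Model a little-endian load-swapped operation.

    Loads the naturally aligned VECTOR_BYTES-sized region that contains
    `addr`, then rotates it so the byte at `addr` lands at index 0, taken
    here to be the least-significant byte position.
    """
    base = addr - addr % VECTOR_BYTES      # start of the aligned region
    chunk = memory[base:base + VECTOR_BYTES]
    shift = addr - base                    # distance of addr into the region
    return chunk[shift:] + chunk[:shift]
```

When `addr` is already aligned the rotation is zero and the function returns the aligned region unchanged; the memory access itself never crosses an alignment boundary.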

Type: Grant

Filed: November 1, 2006

Date of Patent: November 17, 2009

Assignee: Apple Inc.

Inventors: Jeffry E. Gonion, Keith E. Diefendorff
Context Switching On A Network On Chip

Publication number: 20090282226

Abstract: A network on chip (‘NOC’) that includes IP blocks, routers, memory communications controllers, and network interface controllers, each IP block adapted to the network by an application messaging interconnect including an inbox and an outbox, one or more of the IP blocks including computer processors supporting a plurality of threads, the NOC also including an inbox and outbox controller configured to set pointers to the inbox and outbox, respectively, that identify valid message data for a current thread; and software running in the current thread that, upon a context switch to a new thread, is configured to: save the pointer values for the current thread, and reset the pointer values to identify valid message data for the new thread, where the inbox and outbox controller are further configured to retain the valid message data for the current thread in the boxes until context switches again to the current thread.

Type: Application

Filed: May 9, 2008

Publication date: November 12, 2009

Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: Russell D. Hoover, Eric O. Mejdrich, Paul E. Schardt, Robert A. Shearer
STORE QUEUE

Publication number: 20090282225

Abstract: Embodiments of the present invention provide a system which executes a load instruction or a store instruction. During operation the system receives a load instruction. The system then determines if an unrestricted entry or a restricted entry in a store queue contains data that satisfies the load instruction. If not, the system retrieves data for the load instruction from a cache.

Type: Application

Filed: May 6, 2008

Publication date: November 12, 2009

Applicant: SUN MICROSYSTEMS, INC.

Inventors: Paul Caprioli, Martin Karlsson, Shailender Chaudhry, Gideon N. Levinsky
