Processing Control For Data Transfer Patents (Class 712/225)
  • Publication number: 20100115221
    Abstract: A system and method are described for a memory management processor which, using a table of reference addresses embedded in the object code, can open the appropriate memory pages to expedite the retrieval of information from memory referenced by instructions in the execution pipeline. A suitable compiler parses the source code and collects references to branch addresses, calls to other routines, or data references, and creates reference tables listing the addresses for these references at the beginning of each routine. These tables are received by the memory management processor as the instructions of the routine are beginning to be loaded into the execution pipeline, so that the memory management processor can begin opening memory pages where the referenced information is stored. Opening the memory pages where the referenced information is located before the instructions reach the instruction processor helps lessen memory latency delays which can greatly impede processing performance.
    Type: Application
    Filed: January 11, 2010
    Publication date: May 6, 2010
    Inventor: Dean A. Klein
  • Publication number: 20100106948
    Abstract: A system and method operable to manage a message queue is provided. This management may involve out-of-order asynchronous heterogeneous remote direct memory access (RDMA) to the message queue. This system includes a pair of processing devices, a primary processing device and an additional processing device, a memory in storage location and a data bus coupled to the processing devices. The processing devices cooperate to process queue data within a shared message queue wherein when an individual processing device successfully accesses queue data the queue data is locked for the exclusive use of the processing device. When the processing device acquires the queue data, the queue data is locked and the queue data acquired by the acquiring processing device includes the queue data for both the primary processing device and additional processing device such that the processing device has all queue data necessary to process the data and return processed queue data.
    Type: Application
    Filed: October 24, 2008
    Publication date: April 29, 2010
    Inventors: Gregory Howard Bellows, Jason N. Dale
  • Patent number: 7707151
    Abstract: One aspect is directed to a method for performing data migration from a first volume to a second volume while allowing a write operation to be performed on the first volume during the act of migrating. Another aspect is a method and apparatus that stores, in a persistent manner, state information indicating a portion of the first volume successfully copied to the second volume. Another aspect is a method and apparatus for migrating data from a first volume to a second volume, and resuming, after an interruption of the migration, copying data from the first volume to the second volume without starting from the beginning of the data. Another aspect is a method and apparatus for migrating data from a first volume to a second volume, receiving an access request directed to the first volume from an application that stores data on the first volume, and redirecting the access request to the second volume without having to reconfigure the application that accesses data on the first volume.
    Type: Grant
    Filed: January 29, 2003
    Date of Patent: April 27, 2010
    Assignee: EMC Corporation
    Inventors: Steven M. Blumenau, Stephen J. Todd
  • Patent number: 7707392
    Abstract: An information processing system includes a first processor that accesses a first memory, a second processor that accesses a second memory, and a data transfer unit for executing data transfer between the first memory and the second memory. The first processor executes functions of translating an instruction out of instructions included in the program except a memory access instruction into an instruction for the second processor and translating the memory access instruction into an instruction sequence containing a call instruction of the program to transfer the access data on the first memory to the second memory via a data transfer unit.
    Type: Grant
    Filed: March 13, 2008
    Date of Patent: April 27, 2010
    Assignee: Kabushiki Kaisha Toshiba
    Inventors: Seiji Maeda, Hidenori Matsuzaki, Yusuke Shirota, Kazuya Kitsunai
  • Patent number: 7707388
    Abstract: In one embodiment, a serial processor is configured to execute software instructions in a software program in serial. A serial memory is configured to store data for use by the serial processor in executing the software instructions in serial. A plurality of parallel processors are configured to execute software instructions in the software program in parallel. A plurality of partitioned memory modules are provided and configured to store data for use by the plurality of parallel processors in executing software instructions in parallel. Accordingly, a processor/memory structure is provided that allows serial programs to use quick local serial memories and parallel programs to use partitioned parallel memories. The system may switch between a serial mode and a parallel mode. The system may incorporate pre-fetching commands of several varieties.
    Type: Grant
    Filed: November 29, 2006
    Date of Patent: April 27, 2010
    Assignee: XMTT Inc.
    Inventor: Uzi Vishkin
  • Patent number: 7698538
    Abstract: A method and apparatus are provided for downloading a program by using hand-shaking in a digital signal processor (DSP), in which the program stored at an external memory is downloaded to an internal memory by using the hand-shaking in an asynchronous system having a dual CPU, wherein current operation of the digital signal processor is temporarily held to shorten a downloading time.
    Type: Grant
    Filed: January 9, 2003
    Date of Patent: April 13, 2010
    Assignee: Samsung Electronics Co., Ltd.
    Inventor: Seong-Ho Yoon
  • Patent number: 7694084
    Abstract: A microcomputer architecture comprises a microprocessor unit and a first memory unit, the microprocessor unit comprising a functional unit and at least one data register, the functional unit and the at least one data register being linked to a data bus internal to the microprocessor unit. The data register is a wide register comprising a plurality of second memory units which are capable to each contain one word. The wide register is adapted so that the second memory units are simultaneously accessible by the first memory unit, and so that at least part of the second memory units are separately accessible by the functional unit.
    Type: Grant
    Filed: March 10, 2006
    Date of Patent: April 6, 2010
    Assignee: IMEC
    Inventors: Praveen Raghavan, Francky Catthoor
  • Patent number: 7689779
    Abstract: Access to a memory area by a first processor that executes a first processor program and a second processor that executes a second processor program is granted to one of the first processor and the second processor at a time. Access to the memory area by the first processor and the second processor are cyclically uniquely allocated (e.g., t?[(ad mod m)=o]) between the first and the second processor by the first and second processor programs.
    Type: Grant
    Filed: August 14, 2006
    Date of Patent: March 30, 2010
    Assignee: Micronas GmbH
    Inventors: Matthias Vierthaler, Carsten Noeske
  • Patent number: 7689806
    Abstract: A method and system to indicate which page within a software-managed page table triggers an exception within a microprocessor, such as, for example, a digital signal processor, wherein a software-managed translation lookaside buffer (TLB) module receives a virtual address produced by an instruction within a Very Long Instruction Word (VLIW) packet, such as, for example, a fetch instruction, and further compares the virtual address to each stored TLB entry. If a match exists, then the TLB module outputs a corresponding mapped physical address for the instruction. Otherwise, if the VLIW packet spans two pages, where a first page is present as a TLB entry within the TLB module and the second page is missing from the stored TLB entries, an indication bit within a data field of a control register is set to identify the TLB miss exception to a software management unit.
    Type: Grant
    Filed: July 14, 2006
    Date of Patent: March 30, 2010
    Assignee: Q
    Inventors: Lucian Codrescu, Erich Plondke, Muhammad Ahmed, Vijaya Kumar Janjanam
  • Publication number: 20100070742
    Abstract: An embedded-DRAM processor architecture includes a DRAM array, a set of register files, a set of functional units, and a data assembly unit. The data assembly unit includes a set of row-address registers and is responsive to commands to activate and deactivate DRAM rows and to control the movement of data throughout the system. A pipelined data assembly approach allows the functional units to perform register-to-register operations, and allows the data assembly unit to perform all load/store operations using wide data busses. Data masking and switching hardware allows individual data words or groups of words to be transferred between the registers and memory. Other aspects of the disclosure include a memory and logic structure and an associated method to extract data blocks from memory to accelerate, for example, operations related to image compression and decompression.
    Type: Application
    Filed: November 20, 2009
    Publication date: March 18, 2010
    Applicant: Micron Technology, Inc.
    Inventor: Eric M. Dowling
  • Patent number: 7680962
    Abstract: An array type processor comprises a data path unit to execute processing, and a state management unit to control the state of the data path unit in accordance with a command that specifies processing on the data. An input DMA circuit reads from a memory information and data to be processed including a command corresponding to the data. The input DMA circuit first transfers the command to the state management unit, and then transfers the data to be processed to the data path unit.
    Type: Grant
    Filed: December 21, 2005
    Date of Patent: March 16, 2010
    Assignee: NEC Electronics Corporation
    Inventors: Kenichiro Anjo, Katsumi Togawa, Ryoko Sasaki, Taro Fujii, Masato Motomura
  • Patent number: 7680964
    Abstract: A method for improving timing behavior of a processing unit in a multithreading environment is disclosed, wherein the processing unit generates data frames for an output unit by combining data from a plurality of input units, and the processed data are buffered in an output buffer between the processing unit and the output unit. The method comprises sending from the output unit to the processing unit a value corresponding to the filling of the output buffer, calculating a timer value, setting a timer with the timer value, wherein the timer calls the processing unit thread after the specified time. The timer value depends on the value corresponding to the averaged filling of the output buffer. As a result, the average filling of the output buffer is lower compared to conventional thread management, and thus the system is more flexible and reacts quicker.
    Type: Grant
    Filed: May 26, 2005
    Date of Patent: March 16, 2010
    Assignee: Thomson Licensing
    Inventor: Jürgen Schmidt
  • Patent number: 7681018
    Abstract: A parallel hardware-based multithreaded processor is described. The processor includes a general purpose processor that coordinates system functions and a plurality of microengines that support multiple hardware threads or contexts. The processor also includes a memory control system that has a first memory controller that sorts memory references based on whether the memory references are directed to an even bank or an odd bank of memory and a second memory controller that optimizes memory references based upon whether the memory references are read references or write references. Instructions for switching and branching based on executing contexts are also disclosed.
    Type: Grant
    Filed: January 12, 2001
    Date of Patent: March 16, 2010
    Assignee: Intel Corporation
    Inventors: Gilbert Wolrich, Matthew J. Adiletta, William Wheeler
  • Patent number: 7676661
    Abstract: A fast linked multiprocessor network includes a plurality of processing modules implemented on a field programmable gate array and a plurality of configurable uni-directional links that are coupled among at least two of the plurality of processing modules and provide a streaming communication channel between at least two of the plurality of processing modules. Such configuration provides a function accelerator that can feed at least one processor with data values using one custom instruction to put data values on at least one uni-directional serial link and that can extract data values from at least one processor using one custom instruction to get data values from the at least one uni-directional serial link.
    Type: Grant
    Filed: October 5, 2004
    Date of Patent: March 9, 2010
    Assignee: Xilinx, Inc.
    Inventors: Sundararajarao Mohan, Satish R. Ganesan, Goran Bilski
  • Patent number: 7676646
    Abstract: A Wide Register Set (WRS) is used in a packet processor to increase performance for certain packet processing operations. The registers in the WRS have wider bit lengths than the main registers used for primary packet processing operations. A wide logic unit is configured to conduct logic operations on the wide register set and in one implementation includes hardware primitives specifically configured for packet scheduling operations. A special interlocking mechanism is additionally used to coordinate accesses among multiple processors or threads to the same wide register address locations. The WRS produces a scheduling engine that is much cheaper than previous hardware solutions with higher performance than previous software solutions. The WRS provides a small, compact, flexible, and scalable scheduling sub-system and can tolerate long memory latencies by using cheaper memory while sharing memory with other uses.
    Type: Grant
    Filed: March 2, 2005
    Date of Patent: March 9, 2010
    Assignee: Cisco Technology, Inc.
    Inventor: Earl T. Cohen
  • Patent number: 7673121
    Abstract: A method for the transmission of digital messages by the output terminals of a monitoring circuit which is integrated into a microprocessor, the digital messages being representative of first specific events which are dependent on the execution of a series of instructions by the microprocessor.
    Type: Grant
    Filed: November 14, 2002
    Date of Patent: March 2, 2010
    Assignee: STMicroelectronics S.A.
    Inventors: Catherine Robert, Xavier Robert, Jehan-Philippe Barbiero
  • Patent number: 7672305
    Abstract: Port input sections generate, when the head flits of a packet are stored in the first and second registers, first and second mediation request signals destined for a desired request destination, and further generate a first notification signal used to notify the presence or absence of the first mediation request signal destined for any request destination. Upon reception of a mediation result signal, the port input sections output the flit from the first register and sequentially forward flits to be stored in the first register and the second register, and the port output sections sequentially output the flit outputted from the first register of any one of the port input sections to the node.
    Type: Grant
    Filed: October 2, 2006
    Date of Patent: March 2, 2010
    Assignee: NEC Corporation
    Inventor: Yoshihisa Yamada
  • Publication number: 20100049953
    Abstract: A microprocessor includes an N-way cache and a logic block that selectively enables and disables the N-way cache for at least one clock cycle if a first register load instruction and a second register load instruction, following the first register load instruction, are detected as pointing to the same index line in which the requested data is stored. The logic block further provides a disabling signal to the N-way cache for at least one clock cycle if the first and second instructions are detected as pointing to the same cache way.
    Type: Application
    Filed: August 20, 2008
    Publication date: February 25, 2010
    Applicant: MIPS Technologies, Inc.
    Inventors: Ajit Karthik Mylavarapu, Sanjai Balakrishnan Athi
  • Patent number: 7669041
    Abstract: A processor having a zero-overhead operand copy capability. The processor includes multiple execution units to execute instructions in parallel and multiple register files each associated with one or more of the execution units. The processor further includes circuitry to select either an instruction execution result from a first one of the execution units or content of a register within a first one of the register files associated with the first one of the execution units to be stored within a register within a second one of the register files.
    Type: Grant
    Filed: April 30, 2007
    Date of Patent: February 23, 2010
    Assignee: Stream Processors, Inc.
    Inventors: Brucek Khailany, Ujval J. Kapasi
  • Patent number: 7669040
    Abstract: A system that executes a long transaction in a system with limited transactional hardware resources. During operation, the system executes the long transaction in a non-transactional mode, which does not use transactional hardware resources. The system defers stores generated during the long transaction so that the stores are not committed to the architectural state of a processor until the transaction is successfully completed. If the long transaction successfully completes, the system commits the long transaction, which involves performing multiple hardware transactions to commit the deferred stores to the architectural state of the processor.
    Type: Grant
    Filed: December 15, 2006
    Date of Patent: February 23, 2010
    Assignee: Sun Microsystems, Inc.
    Inventor: David Dice
  • Patent number: 7660970
    Abstract: Disclosed is a data processing system and method. The data processing method determines the number of static registers and the number of rotating registers for assigning a register to a variable contained in a certain program, assigns the register to the variable based on the number of the static registers and the number of the rotating registers, and compiles the program. Further, the method stores in the special register a value corresponding to the number of the rotating registers in the compiling operation, and obtains a physical address from a logical address of the register based on the value. Accordingly, the present invention provides an aspect of efficiently using register files by dynamically controlling the number of rotating registers and the number of static registers for a software pipelined loop, and has an effect capable of reducing the generations of spill/fill codes unnecessary during program execution to a minimum.
    Type: Grant
    Filed: August 21, 2006
    Date of Patent: February 9, 2010
    Assignee: Samsung Electronics Co., Ltd.
    Inventors: Suk-jin Kim, Jeong-wook Kim, Hong-seok Kim, Soo-jung Ryu
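The logical-to-physical register mapping described in the abstract above can be sketched roughly as follows. This is a minimal illustration, not the patented method: the function name `physical_register`, the layout (static registers first, rotating registers after them), and the `base` rotation counter are all assumptions introduced here. The only detail taken from the abstract is that the number of rotating registers is a stored value that the mapping depends on.

```python
def physical_register(logical: int, total: int, rotating: int, base: int) -> int:
    """Map a logical register number to a physical one.

    Assumed layout (hypothetical): registers 0 .. static-1 are static and map
    to themselves; the remaining `rotating` registers form a ring whose start
    is shifted by `base`, for example once per software-pipelined iteration.
    """
    static = total - rotating          # number of static registers
    if logical < static:
        return logical                 # static registers never move
    # Rotating registers wrap around inside their own region.
    return static + (logical - static + base) % rotating


# With 8 registers of which 4 rotate, logical register 7 maps to physical
# register 4 after one rotation step, while static register 2 is unaffected.
mapped_rotating = physical_register(7, total=8, rotating=4, base=1)
mapped_static = physical_register(2, total=8, rotating=4, base=1)
```

Because the split between static and rotating registers is a parameter rather than a constant, a loop that needs few rotating registers leaves more static registers available, which is the efficiency argument the abstract makes.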
  • Patent number: 7660967
    Abstract: A computer processor is responsive to successive processing instructions in an issue order to process regular vectors to generate a result vector without use of a cache. At least two architectural registers having input-vector capability are selectively coupled to memory to receive corresponding vector-elements of two vectors and transfer the vector-elements to a selected functional unit. At least one architectural register having output capability is selectively coupled to an output, which in turn is coupled to transfer result vector-elements to the memory. The functional unit performs a function on the vector-elements to generate a respective result-element. The result-elements are transferred to a selected architectural register for processing as operands in performance of further functions by a functional unit, or are transferred to the output for transfer to memory. In either case, the order of the result vector-elements is restored to the issue order of the successive processing instructions.
    Type: Grant
    Filed: January 30, 2008
    Date of Patent: February 9, 2010
    Assignee: Efficient Memory Technology
    Inventor: Maurice L. Hutson
  • Patent number: 7656409
    Abstract: In a many core system, receiving a call to a graphics driver; translating the call into a command executable on a core of the many core system; and executing the translated call on the core.
    Type: Grant
    Filed: December 23, 2005
    Date of Patent: February 2, 2010
    Assignee: Intel Corporation
    Inventors: Lyle Cool, Yasser Rasheed
  • Publication number: 20100017453
    Abstract: A programmable signal processing circuit has an instruction processing circuit (23, 24, 26), which has an instruction set that comprises a demapping instruction. The instruction processing circuit (23, 24, 26) has an operand input (30a) for receiving a complex number operand of the demapping instruction from a register file (22) and a result output (34) for writing a demapping result of the demapping instruction to the register file (22). The instruction processing circuit (23, 24, 26) determines at least four bit metrics in response to the demapping instruction, each indicating a relative position of the complex number relative to a respective border line in a complex plane. The instruction processing circuit (23, 24, 26) writes a combination of the at least four bit metrics together to the result output (34) in the demapping result.
    Type: Application
    Filed: December 13, 2005
    Publication date: January 21, 2010
    Applicant: KONINKLIJKE PHILIPS ELECTRONICS, N.V.
    Inventors: Ingolf Held, Marcus M.G. Quax, Paulus W.F. Gruijters
  • Patent number: 7650488
    Abstract: In an embodiment, a method is provided that may include providing a first address space exclusively and coherently accessible by a first processor core partition in a platform. A second address space may be provided in this embodiment that is exclusively and coherently accessible by a second processor core partition in the platform. Also in this embodiment, a third address space in the platform may be provided that is accessible, at least in part, by both the first and second processor core partitions and may be to permit communication between the first and second processor core partitions of at least one packet and at least one descriptor associated with the at least one packet. The at least one descriptor may indicate, at least in part, one or more locations in the third address space to store, at least in part, the at least one packet. Of course, many alternatives, modifications, and variations are possible without departing from this embodiment.
    Type: Grant
    Filed: June 18, 2008
    Date of Patent: January 19, 2010
    Assignee: Intel Corporation
    Inventors: Annie Foong, Bryan E. Veal, Arun Raghunath
  • Patent number: 7650605
    Abstract: A multi-streaming processor has a plurality of streams for streaming one or more instruction threads, a set of functional resources for processing instructions from streams, and a lock mechanism for locking selected memory locations shared by streams of the processor, the hardware-lock mechanism operating to set a lock when an atomic memory sequence is started and to clear a lock when an atomic memory sequence is completed. In preferred embodiments the lock mechanism comprises one or more storage locations associated with each stream of the processor, each storage location enabled to store a memory address, a lock bit, and a stall bit. Methods for practicing the invention using the apparatus are also taught.
    Type: Grant
    Filed: February 20, 2007
    Date of Patent: January 19, 2010
    Assignee: MIPS Technologies, Inc.
    Inventors: Stephen Melvin, Mario D. Nemirovsky
  • Publication number: 20100005257
    Abstract: A storage device includes a first storage unit that stores data read from a recording medium based on an instruction received from a processing device, and transmitting the data stored in the first storage unit to the processing device. The storage device also includes a second storage unit that stores the instruction received from the processing device; a counter that counts the number of pieces of data stored in the first storage unit; and a control unit that transmits the data stored in the first storage unit to the processing device based on a count value of the counter and, when the data read upon the instruction is stored in the first storage unit, writes identification information indicating that storing data has been completed in the second storage unit and, based on the identification information, transmits the data stored in the first storage unit to the processing device.
    Type: Application
    Filed: June 8, 2009
    Publication date: January 7, 2010
    Applicant: FUJITSU LIMITED
    Inventors: Masaaki Tamura, Gen Ohshima
  • Publication number: 20100005316
    Abstract: Method, system, and computer program product embodiments for performing a branch trace operation on a computer system of an end user are provided. An encrypted mapping macro is provided to the end user to be made operational on the computer system. A trace program is provided to the end user. The end user executes the trace program on the computer system as a diagnostic tool. The trace program is adapted for decrypting the encrypted mapping macro, determining a storage offset location of a branch instruction, checking the storage offset location for an identifying constant, cross referencing the identifying constant with an entry in the decrypted mapping macro to identify a branch triggering bit and diagnostic information associated with the branch instruction, and returning the branch triggering bit and diagnostic information, the branch triggering bit and diagnostic information provided to a coder.
    Type: Application
    Filed: July 7, 2008
    Publication date: January 7, 2010
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: David Bruce LeGENDRE, David Charles REED, Max Douglas SMITH
  • Publication number: 20100005279
    Abstract: The data processor executes an instruction having a direction for write to a reference register of other instruction flow and an instruction having a direction for reference register invalidation. The data processor is arranged as a data processor having typical functions as an integrated whole of processors (CPU1 and CPU2) which execute simple instruction flows. When executing the instruction having a direction for write to a reference register of other instruction flow, the processor confirms whether a write register is invalid. The processor waits for the register to be made invalid, if the register is not invalid, and performs write if the register is invalid. After having executed the instruction having a direction for reference register invalidation, the processor invalidates the register to which a reference has been made. When the reference register is invalid, execution of the referring instruction is suspended until it is made valid.
    Type: Application
    Filed: September 14, 2009
    Publication date: January 7, 2010
    Inventor: Fumio Arakawa
  • Publication number: 20090327667
    Abstract: Systems and methods to perform fast rotation operations are disclosed. In a particular embodiment, a method includes executing a single instruction. The method includes receiving first data indicating a first coordinate and a second coordinate, receiving a first control value that indicates a first rotation value selected from a set of ninety degree multiples, and writing output data corresponding to the first data rotated by the first rotation value.
    Type: Application
    Filed: June 26, 2008
    Publication date: December 31, 2009
    Applicant: QUALCOMM INCORPORATED
    Inventors: Shankar Krithivasan, Erich James Plondke, Lucian Codrescu, Mao Zeng, Remi Jonathan Gurski
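Rotation by a multiple of ninety degrees, as described in the abstract above, needs no multiplication or trigonometry: each case is a swap and/or sign change of the two coordinates. The sketch below illustrates that idea only. The function name `rotate_by_quadrants`, the counter-clockwise direction, and the encoding of the control value as a quarter-turn count are assumptions made here, since the abstract does not specify them.

```python
def rotate_by_quadrants(x: int, y: int, control: int) -> tuple:
    """Rotate the point (x, y) counter-clockwise by control * 90 degrees.

    `control` is treated as a quarter-turn count (an assumed encoding);
    only its value modulo 4 matters.
    """
    k = control % 4
    if k == 0:
        return (x, y)        # 0 degrees: unchanged
    if k == 1:
        return (-y, x)       # 90 degrees
    if k == 2:
        return (-x, -y)      # 180 degrees
    return (y, -x)           # 270 degrees


# The point (3, 4) rotated by one quarter turn lands on (-4, 3).
quarter_turn = rotate_by_quadrants(3, 4, 1)
```

Treating the pair as the real and imaginary parts of a complex number, the four cases correspond to multiplying by 1, j, -1, and -j.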
  • Publication number: 20090327693
    Abstract: A network task offload apparatus includes an offload circuit and a buffer scheduler. The offload circuit performs corresponding network task processing on a plurality of packets in parallel according to an offload command. The buffer scheduler includes a buffer control unit and a plurality of buffer units. The plurality of buffer units are controlled by the buffer control unit and are scheduled to store the processed packets.
    Type: Application
    Filed: June 24, 2009
    Publication date: December 31, 2009
    Inventors: Li-Han Liang, Tao-Chun Wang, Kuo-Nan Yang, Shieh-Hsing Kuo
  • Publication number: 20090327668
    Abstract: Tools and techniques are described for multi-threaded processing for opening and saving documents. These tools may provide load processes for reading documents from storage devices, and for loading the documents into applications. These tools may spawn a load process thread for executing a given load process on a first processing unit, and an application thread may execute a given application on a second processing unit. A first pipeline may be created for executing the load process thread, with the first pipeline performing tasks associated with loading the document into the application. A second pipeline may be created for executing the application process thread, with the second pipeline performing tasks associated with operating on the documents. The tasks in the first pipeline are configured to pass tokens as input to the tasks in the second pipeline.
    Type: Application
    Filed: June 27, 2008
    Publication date: December 31, 2009
    Applicant: MICROSOFT CORPORATION
    Inventors: Uladzislau Sudzilouski, Igor Zaika
  • Publication number: 20090327666
    Abstract: A method for managing data, including obtaining a first instruction for moving a first data item from a first source to a first destination, determining a data type of the first data item, determining a data type supported by the first destination, comparing the data type of the first data item with the data type supported by the first destination to test a validity of the first instruction, and moving the first data item from the first source to the first destination based on the validity of the first instruction.
    Type: Application
    Filed: June 25, 2008
    Publication date: December 31, 2009
    Applicant: SUN MICROSYSTEMS, INC.
    Inventors: Mario I. Wolczko, Gregory M. Wright, Matthew L. Seidl
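The validity test in the abstract above (compare the data item's type with the type the destination supports, then move only if they agree) can be sketched in software as follows. The class `TypedSlot` and the function `checked_move` are hypothetical names introduced for illustration; the abstract describes the steps but not any particular interface, and the exact-type comparison is an assumption.

```python
class TypedSlot:
    """A destination that accepts values of exactly one type (hypothetical)."""

    def __init__(self, supported_type: type):
        self.supported_type = supported_type
        self.value = None


def checked_move(source: list, index: int, dest: TypedSlot) -> bool:
    """Move source[index] into dest if the types match; report validity."""
    item = source[index]
    # Compare the item's type with the type supported by the destination.
    valid = type(item) is dest.supported_type
    if valid:
        dest.value = item
        source[index] = None   # the item has been moved out of the source
    return valid


registers = [42, "text"]
slot = TypedSlot(int)
rejected = checked_move(registers, 1, slot)   # a str does not fit an int slot
accepted = checked_move(registers, 0, slot)   # an int does
```

An invalid instruction leaves both the source and the destination untouched, so the caller can decide how to report or handle the mismatch.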
  • Patent number: 7640414
    Abstract: Methods, systems, and computer program products for forwarding store data to loads in a pipelined processor are provided. In one implementation, a processor is provided that includes a decoder operable to decode an instruction, and a plurality of execution units operable to respectively execute a decoded instruction from the decoder. The plurality of execution units include a load/store execution unit operable to execute decoded load instructions and decoded store instructions and generate corresponding load memory operations and store memory operations. The store queue is operable to buffer one or more store memory operations prior to the one or more memory operations being completed, and the store queue is operable to forward store data of the one or more store memory operations buffered in the store queue to a load memory operation on a byte-by-byte basis.
    Type: Grant
    Filed: November 16, 2006
    Date of Patent: December 29, 2009
    Assignee: International Business Machines Corporation
    Inventors: Jason Alan Cox, Kevin Chih Kang Lin, Eric Francis Robinson
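Byte-by-byte store-to-load forwarding, as described in the abstract above, can be modeled in a few lines. This is a simplified software sketch under stated assumptions, not the patented hardware: the names `StoreOp` and `StoreQueue`, the oldest-first list, and the `drain` step are all introduced here for illustration. A load starts from the bytes in memory and then takes each byte from the youngest buffered store that covers it.

```python
from dataclasses import dataclass


@dataclass
class StoreOp:
    addr: int     # address of the first byte written
    data: bytes   # bytes buffered but not yet written to memory


class StoreQueue:
    def __init__(self, memory: bytearray):
        self.memory = memory
        self.pending = []   # buffered StoreOp entries, oldest first

    def store(self, addr: int, data: bytes) -> None:
        """Buffer a store without touching memory."""
        self.pending.append(StoreOp(addr, bytes(data)))

    def load(self, addr: int, size: int) -> bytes:
        """Read `size` bytes, forwarding buffered store data per byte."""
        result = bytearray(self.memory[addr:addr + size])
        # Apply stores oldest to youngest so the youngest store wins.
        for op in self.pending:
            for i, byte in enumerate(op.data):
                offset = op.addr + i - addr
                if 0 <= offset < size:
                    result[offset] = byte
        return bytes(result)

    def drain(self) -> None:
        """Complete all buffered stores by writing them to memory."""
        for op in self.pending:
            self.memory[op.addr:op.addr + len(op.data)] = op.data
        self.pending.clear()


memory = bytearray(8)
queue = StoreQueue(memory)
queue.store(2, b"\xaa\xbb")
queue.store(3, b"\xcc")
forwarded = queue.load(0, 4)   # bytes 0-1 from memory, 2-3 from the queue
```

Working per byte lets a load be satisfied partly from one store, partly from another, and partly from memory, which a whole-word comparison could not do.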
  • Publication number: 20090319760
    Abstract: An architecture for implementing an instruction pipeline within a CPU comprises an arithmetic logic unit (ALU), an address arithmetic unit (AAU), a program counter (PC), a read-only memory (ROM) coupled to the program counter, to an instruction register, and to an instruction decoder coupled to the arithmetic logic unit. A random access memory (RAM) is coupled to the instruction decoder, to the arithmetic logic unit, and to a RAM address register.
    Type: Application
    Filed: August 27, 2009
    Publication date: December 24, 2009
    Inventors: Benjamin F. Froemming, Emil Lambrache
  • Publication number: 20090313459
    Abstract: A system, method and article of manufacture are disclosed for processing Low Density Parity Check (LDPC) codes. The system comprises a multitude of processing units for processing the codes; and a processor chip including an on-chip, multi-port data cache for temporarily storing the LDPC codes. This data cache includes a plurality of input ports for receiving the LDPC codes from some of the processing units, and a plurality of output ports for sending the LDPC codes to others of the processing units. An off-chip, external memory stores the LDPC codes and transmits the LDPC codes to and receives the LDPC codes from at least some of the processing units. A sequence processor controls the transmission of the LDPC codes between the processor units and the on-chip data cache so that the LDPC codes are processed by the processing units according to a given sequence.
    Type: Application
    Filed: June 13, 2008
    Publication date: December 17, 2009
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventor: Thomas A. Horvath
  • Patent number: 7634635
    Abstract: Systems and methods for reordering processor instructions. In accordance with a first embodiment of the present invention, a microprocessor comprises circuitry to process an instruction extension, wherein the instruction extension is transparent to the programming model of the microprocessor. The instruction extension may comprise a field for indicating an offset from a memory structure pointer. The microprocessor includes circuitry for adding the offset to the memory structure pointer to indicate a specific element of the memory structure. The specific element of the memory structure comprises address information corresponding to speculative data.
    Type: Grant
    Filed: April 7, 2006
    Date of Patent: December 15, 2009
    Inventors: Brian Holscher, Guillermo Rozas, James Van Zoeren, David Dunn
  • Patent number: 7634639
    Abstract: One embodiment of the present invention provides a system which avoids a live-lock state in a processor that supports speculative-execution. The system starts by issuing instructions for execution in program order during execution of a program in a normal-execution mode. Upon encountering a launch condition during the execution of an instruction (a “launch instruction”) which causes the processor to enter a speculative-execution mode, the system checks status indicators associated with a forward progress buffer. If the status indicators indicate that the forward progress buffer contains data for the launch instruction, the system resumes normal-execution mode. Upon resumption of normal-execution mode, the system retrieves the data from a data field contained in the forward progress buffer and executes the launch instruction using the retrieved data as input data for the launch instruction. The system next deasserts the status indicators.
    Type: Grant
    Filed: August 23, 2005
    Date of Patent: December 15, 2009
    Assignee: Sun Microsystems, Inc.
    Inventors: Shailender Chaudhry, Paul Caprioli, Sherman H. Yip, Guarav Garg, Ketaki Rao
  • Publication number: 20090307467
    Abstract: Methods, apparatus, and products are disclosed for performing an allreduce operation on a plurality of compute nodes of a parallel computer. Each compute node includes at least two processing cores. Each processing core has contribution data for the allreduce operation. Performing an allreduce operation on a plurality of compute nodes of a parallel computer includes: establishing one or more logical rings among the compute nodes, each logical ring including at least one processing core from each compute node; performing, for each logical ring, a global allreduce operation using the contribution data for the processing cores included in that logical ring, yielding a global allreduce result for each processing core included in that logical ring; and performing, for each compute node, a local allreduce operation using the global allreduce results for each processing core on that compute node.
    Type: Application
    Filed: May 21, 2008
    Publication date: December 10, 2009
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventor: Ahmad Faraj
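
    The two-phase scheme in the abstract above (a global allreduce per logical ring, then a local allreduce per node) can be simulated in a few lines. The sketch assumes that ring c links core c of every node and that the reduction is a sum; it models only the data flow, not the ring communication itself, and the function name is hypothetical.

```python
def ring_then_local_allreduce(contrib):
    """contrib[n][c] is the contribution of core c on node n.

    Assumes ring c links core c of every node and that the reduction
    operator is addition.
    """
    num_nodes, num_cores = len(contrib), len(contrib[0])
    # Phase 1: one global allreduce per logical ring.
    ring_result = [sum(contrib[n][c] for n in range(num_nodes))
                   for c in range(num_cores)]
    # Phase 2: one local allreduce per node over its cores' ring results.
    node_total = sum(ring_result)
    return [[node_total] * num_cores for _ in range(num_nodes)]


contrib = [[1, 2], [3, 4], [5, 6]]  # 3 nodes, 2 cores each
final = ring_then_local_allreduce(contrib)
```

    Every core ends up with the sum of all contributions, which is the defining property of an allreduce.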
  • Patent number: 7631170
    Abstract: An efficient embedded-DRAM processor architecture and associated methods. In one exemplary embodiment, the architecture includes a DRAM array, a set of register files, a set of functional units, and a data assembly unit. The data assembly unit includes a set of row-address registers and is responsive to commands to activate and deactivate DRAM rows and to control the movement of data throughout the system. A pipelined data assembly approach allows the functional units to perform register-to-register operations and allows the data assembly unit to perform all load/store operations using wide data busses. Data masking and switching hardware allows individual data words or groups of words to be transferred between the registers and memory. Other aspects of the invention include a memory and logic structure and an associated method to extract data blocks from memory to accelerate, for example, operations related to image compression and decompression.
    Type: Grant
    Filed: February 13, 2002
    Date of Patent: December 8, 2009
    Assignee: Micron Technology, Inc.
    Inventor: Eric M. Dowling
  • Publication number: 20090300337
    Abstract: Improved instruction set and core design, control and communication for programmable microprocessors is disclosed, involving the strategy for replacing centralized program sequencing in present-day and prior art processors with a novel distributed program sequencing wherein each functional unit has its own instruction fetch and decode block, and each functional unit has its own local memory for program storage; and wherein computational hardware execution units and memory units are flexibly pipelined as programmable embedded processors with reconfigurable pipeline stages of different order in response to varying application instruction sequences that establish different configurations and switching interconnections of the hardware units.
    Type: Application
    Filed: May 29, 2008
    Publication date: December 3, 2009
    Inventors: Xiaolin Wang, Qian Wu, Benjamin Marshall, Fugui Wang, Gregory Pitarys, Ke Ning
  • Patent number: 7627743
    Abstract: A multi-word transfer instruction, a memory transfer method using the multi-word transfer instruction and a circuit implementation for transferring multiple words between a memory subsystem and a processor register file are provided. The multi-word transfer instruction specifies an access type (load or store), a consecutive register group, a selection mask and a base register for the starting address of the corresponding memory locations. The total number of words accessed by this instruction is equal to the number of registers specified in the consecutive register group plus the number of registers specified by the selection mask. In addition, further information, such as an address update mode, an order mode and a modification mode, may be specified in the multi-word transfer instruction.
    Type: Grant
    Filed: January 12, 2007
    Date of Patent: December 1, 2009
    Assignee: Andes Technology Corporation
    Inventors: Hong-Men Su, Chuan-Hua Chang, Jen-Chih Tseng
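
    As a rough illustration of the word count described in the abstract above, the following sketch models only the load form of such an instruction. The register numbering, the `mask_regs` list that the selection mask indexes into, the word-addressed memory, and the transfer order (consecutive group first, then masked registers) are all assumptions made for the example; the patent defines its own encoding.

```python
def multi_word_load(regs, memory, base_reg, first, last, mask, mask_regs):
    """Load consecutive memory words starting at the address in regs[base_reg].

    regs: list of register values; memory: dict word address -> value.
    first..last: the consecutive register group (inclusive).
    mask: bit i set selects mask_regs[i] (hypothetical extra registers).
    Returns the number of words transferred.
    """
    targets = list(range(first, last + 1))
    targets += [r for i, r in enumerate(mask_regs) if (mask >> i) & 1]
    addr = regs[base_reg]
    for offset, reg in enumerate(targets):
        regs[reg] = memory[addr + offset]  # word-addressed for simplicity
    return len(targets)


regs = [0] * 16
regs[0] = 100  # base register holds the starting address
memory = {100 + i: 10 * (i + 1) for i in range(8)}
count = multi_word_load(regs, memory, base_reg=0, first=1, last=3,
                        mask=0b101, mask_regs=[12, 13, 14])
```

    Three registers come from the consecutive group and two from the mask, so five words are transferred in total.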
  • Patent number: 7627698
    Abstract: A variety of advantageous mechanisms for improved data transfer control within a data processing system are described. A DMA controller is described which is implemented as a multiprocessing transfer engine supporting multiple transfer controllers which may work independently or in cooperation to carry out data transfers, with each transfer controller acting as an autonomous processor, fetching and dispatching DMA instructions to multiple execution units. In particular, mechanisms for initiating and controlling the sequence of data transfers are provided, as are processes for autonomously fetching DMA instructions which are decoded sequentially but executed in parallel.
    Type: Grant
    Filed: July 30, 2007
    Date of Patent: December 1, 2009
    Assignee: Altera Corporation
    Inventors: Edwin Franklin Barry, Edward A. Wolff
  • Patent number: 7627744
    Abstract: An integrated circuit comprises an external memory, a plurality of parallel connected Vector Processing Engines (VPEs), and an External Memory Unit (EMU) providing a data transfer path between the VPEs and the external memory. Each VPE contains a plurality of data processing units and a message queuing system adapted to transfer messages between the data processing units and other components of the integrated circuit.
    Type: Grant
    Filed: May 10, 2007
    Date of Patent: December 1, 2009
    Assignee: NVIDIA Corporation
    Inventors: Monier Maher, Jean Pierre Bordes, Christopher Lamb, Sanjay J. Patel
  • Publication number: 20090292905
    Abstract: Methods, apparatus, and products are disclosed for performing an allreduce operation on a plurality of compute nodes of a parallel computer, each node including at least two processing cores, that include: performing, for each node, a local reduction operation using allreduce contribution data for the cores of that node, yielding, for each node, a local reduction result for one or more representative cores for that node; establishing one or more logical rings among the nodes, each logical ring including only one of the representative cores from each node; performing, for each logical ring, a global allreduce operation using the local reduction result for the representative cores included in that logical ring, yielding a global allreduce result for each representative core included in that logical ring; and performing, for each node, a local broadcast operation using the global allreduce results for each representative core on that node.
    Type: Application
    Filed: May 21, 2008
    Publication date: November 26, 2009
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventor: Ahmad Faraj
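
    The three steps in the abstract above (local reduction, ring allreduce among representative cores, local broadcast) can be simulated as below. The sketch assumes one representative core per node and a sum as the reduction operator, and it models only the data flow rather than the actual message passing; the function name is hypothetical.

```python
def local_reduce_ring_broadcast(contrib):
    """contrib[n] lists the contributions of the cores on node n."""
    # Step 1: local reduction on each node into one representative core.
    local = [sum(node) for node in contrib]
    # Step 2: global allreduce over the logical ring of representatives.
    global_result = sum(local)
    # Step 3: local broadcast from the representative to every core.
    return [[global_result] * len(node) for node in contrib]


contrib = [[1, 2], [3, 4], [5, 6]]  # 3 nodes, 2 cores each
final = local_reduce_ring_broadcast(contrib)
```

    Compared with running a ring per core, only one value per node travels between nodes in this arrangement.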
  • Patent number: 7624251
    Abstract: One embodiment of the present invention provides a processor that is configured to execute load-swapped-partial instructions. An instruction fetch unit within the processor is configured to fetch the load-swapped-partial instruction to be executed. Note that the load-swapped-partial instruction specifies a source address in a memory, which is possibly an unaligned address. Furthermore, an execution unit within the processor is configured to execute the load-swapped-partial instruction. This involves loading a partial-vector-sized datum from a naturally-aligned memory region encompassing the source address.
    Type: Grant
    Filed: January 18, 2007
    Date of Patent: November 24, 2009
    Assignee: Apple Inc.
    Inventors: Jeffry E. Gonion, Keith E. Diefendorff
  • Publication number: 20090287911
    Abstract: A programmable signal processing circuit has an instruction processing circuit (23, 24, 26), with an instruction set that comprises a depuncture instruction. The instruction processing circuit (23, 24, 26) forms the depuncture result by copying bit metrics from a bit metrics operand and inserting one or more predetermined bit metric values between the bit metrics from the bit metric operand in the depuncture result. The instruction processing circuit (23, 24, 26) changes the relative locations of the copied bit metrics with respect to each other in the depuncture result as compared to the relative locations of the copied bit metrics with respect to each other in the bit metric operand, to an extent needed for accommodating the inserted predetermined bit metric value or values.
    Type: Application
    Filed: December 13, 2005
    Publication date: November 19, 2009
    Applicant: KONINKLIJKE PHILIPS ELECTRONICS, N.V.
    Inventors: Paulus W.F. Gruijters, Marcus M.G. Quax
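
    The effect of a depuncture operation like the one in the abstract above can be sketched in software. The puncture pattern representation (a list of 1/0 flags) and the neutral fill value of 0 are assumptions for illustration; the application describes a hardware instruction, not this function.

```python
def depuncture(metrics, pattern, fill=0):
    """Expand `metrics` according to `pattern`.

    pattern: sequence of 1/0 flags; 1 copies the next received bit metric,
    0 inserts `fill` (a predetermined, neutral metric) for a punctured bit.
    The pattern is expected to contain exactly len(metrics) ones.
    """
    it = iter(metrics)
    return [next(it) if keep else fill for keep in pattern]


result = depuncture([7, -3, 5], pattern=[1, 1, 0, 1])
```

    The third metric moves one position to the right to make room for the inserted value, which is the relocation the abstract refers to.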
  • Patent number: 7620797
    Abstract: One embodiment of the present invention provides a processor which is configured to execute load-swapped instructions, which are possibly directed to an unaligned source address. The processor is configured to execute the load-swapped instruction by loading a vector from a naturally-aligned memory region encompassing the source address, and in doing so rotating the bytes of the vector to cause the byte at the specified source address to reside at the least-significant byte position within the vector for a little-endian memory transaction, or causing said byte to be positioned at the most-significant byte position within the vector for a big-endian memory transaction.
    Type: Grant
    Filed: November 1, 2006
    Date of Patent: November 17, 2009
    Assignee: Apple Inc.
    Inventors: Jeffry E. Gonion, Keith E. Diefendorff
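
    The little-endian case in the abstract above can be modelled as an aligned load followed by a byte rotation. The 16-byte vector width, the function name, and the convention that index 0 of the returned bytes is the least-significant position are assumptions made for this sketch.

```python
VECTOR_BYTES = 16  # assumed vector width


def load_swapped_le(memory: bytes, addr: int) -> bytes:
    """Load the aligned vector containing `addr`, rotated so that the byte
    at `addr` sits at index 0 (treated here as the least-significant byte)."""
    base = addr - addr % VECTOR_BYTES  # start of the naturally aligned region
    vec = memory[base:base + VECTOR_BYTES]
    shift = addr - base
    # Rotate left by `shift` bytes; bytes below `addr` wrap to the end.
    return vec[shift:] + vec[:shift]


memory = bytes(range(32))
vec = load_swapped_le(memory, 19)
```

    The load never crosses the aligned region, so an unaligned address costs a rotation rather than a second memory access.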
  • Publication number: 20090282226
    Abstract: A network on chip (‘NOC’) that includes IP blocks, routers, memory communications controllers, and network interface controllers, each IP block adapted to the network by an application messaging interconnect including an inbox and an outbox, one or more of the IP blocks including computer processors supporting a plurality of threads, the NOC also including an inbox and outbox controller configured to set pointers to the inbox and outbox, respectively, that identify valid message data for a current thread; and software running in the current thread that, upon a context switch to a new thread, is configured to: save the pointer values for the current thread, and reset the pointer values to identify valid message data for the new thread, where the inbox and outbox controller are further configured to retain the valid message data for the current thread in the boxes until context switches again to the current thread.
    Type: Application
    Filed: May 9, 2008
    Publication date: November 12, 2009
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Russell D. Hoover, Eric O. Mejdrich, Paul E. Schardt, Robert A. Shearer
  • Publication number: 20090282225
    Abstract: Embodiments of the present invention provide a system which executes a load instruction or a store instruction. During operation the system receives a load instruction. The system then determines if an unrestricted entry or a restricted entry in a store queue contains data that satisfies the load instruction. If not, the system retrieves data for the load instruction from a cache.
    Type: Application
    Filed: May 6, 2008
    Publication date: November 12, 2009
    Applicant: SUN MICROSYSTEMS, INC.
    Inventors: Paul Caprioli, Martin Karlsson, Shailender Chaudhry, Gideon N. Levinsky