Data Flow Based System Patents (Class 712/201)
-
Patent number: 9043380
Abstract: A data processing system arranged for receiving over a network, according to a data transfer protocol, data directed to any of a plurality of destination identities, the data processing system comprising: data storage for storing data received over the network; and a first processing arrangement for performing processing in accordance with the data transfer protocol on received data in the data storage, for making the received data available to respective destination identities; and a response former arranged for: receiving a message requesting a response indicating the availability of received data to each of a group of destination identities; and forming such a response; wherein the system is arranged to, in dependence on receiving the said message.
Type: Grant
Filed: July 13, 2012
Date of Patent: May 26, 2015
Assignee: Solarflare Communications, Inc.
Inventors: Steven Leslie Pope, Derek Edward Roberts, David James Riddoch, Greg Law, Steve Grantham, Matthew Slattery
-
Patent number: 9026768
Abstract: A computing machine is disclosed having a memory system for storing a collection of execution nodes, a head for reading a sequence of symbols in the execution nodes in the memory system, and writing a sequence of symbols in the memory system. The machine is configured to execute a computation with a collection of pairs of execution nodes. Each pair of execution nodes represents a machine instruction. One execution node in the pair represents input of the machine instruction represented by the execution nodes. Another execution node in the pair represents output of the machine instruction represented by the execution nodes. Each execution node has a state of the machine, a sequence of symbols and a number.
Type: Grant
Filed: September 14, 2009
Date of Patent: May 5, 2015
Assignee: Aemea Inc.
Inventor: Michael Stephen Fiske
-
Patent number: 9015352
Abstract: The present invention includes an adaptable high-performance node (RXN) with several features that enable it to provide high performance along with adaptability. A preferred embodiment of the RXN includes a run-time configurable data path and control path. The RXN supports multi-precision arithmetic including 8, 16, 24, and 32 bit codes. Data flow can be reconfigured to minimize register accesses for different operations. For example, multiply-accumulate operations can be performed with minimal, or no, register stores by reconfiguration of the data path. Predetermined kernels can be configured during a setup phase so that the RXN can efficiently execute, e.g., Discrete Cosine Transform (DCT), Fast-Fourier Transform (FFT) and other operations. Other features are provided.
Type: Grant
Filed: March 31, 2014
Date of Patent: April 21, 2015
Assignee: Altera Corporation
Inventor: Amit Ramchandran
-
Patent number: 9015622
Abstract: Some embodiments of a system and a method to tune a computing system based on a profile have been presented. A profile as used herein broadly refers to a file containing various parameters of a computing system, such as kernel parameters (e.g., buffer size, network setup, etc.), usable to configure the computing system. For instance, a set of profiles are stored in a computer-readable storage device in a computing system, such as a server, a personal computer, a laptop computer, etc. A processing device running on the computing system may receive a user selection of one of the set of profiles. In response to the user selection, the processing device may load the selected profile onto the computing system in order to tune the computing system according to the selected profile.
Type: Grant
Filed: January 20, 2010
Date of Patent: April 21, 2015
Assignee: Red Hat, Inc.
Inventors: Thomas K. Wörner, Christopher Haughey Snook
-
Publication number: 20150074376
Abstract: Embodiments are provided for an asynchronous processor using master and assisted tokens. In an embodiment, an apparatus for an asynchronous processor comprises a memory to cache a plurality of instructions, a feedback engine to decode the instructions from the memory, and a plurality of XUs coupled to the feedback engine and arranged in a token ring architecture. Each one of the XUs is configured to receive an instruction of the instructions from the feedback engine, and receive a master token associated with a resource and further receive an assisted token for the master token. Upon determining that the assisted token and the master token are received in an abnormal order, the XU is configured to detect an operation status for the instruction in association with the assisted token, and upon determining a needed action in accordance with the operation status and the assisted token, perform the needed action.
Type: Application
Filed: September 8, 2014
Publication date: March 12, 2015
Inventors: Yiqun Ge, Wuxian Shi, Qifan Zhang, Tao Huang, Wen Tong
-
Patent number: 8972769
Abstract: A data processing apparatus includes: a plurality of processing units adapted to process data according to input operation clocks; and a control unit adapted to measure response times of the plurality of processing units when the operation clocks of a common frequency are supplied to the plurality of processing units, and to control a frequency of the operation clocks to be supplied to at least one of the plurality of processing units so that a plurality of measured response times become closer to each other.
Type: Grant
Filed: January 21, 2011
Date of Patent: March 3, 2015
Assignee: Canon Kabushiki Kaisha
Inventors: Akio Nakagawa, Hisashi Ishikawa
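The control loop in this abstract can be illustrated with a minimal Python sketch. It is not the patented method: it assumes, purely for illustration, that a unit's response time is inversely proportional to its clock frequency, and it slows faster units so that every response time approaches that of the slowest unit. The names `balance_clocks`, `unit_a` and `unit_b` are hypothetical.

```python
def balance_clocks(response_times, common_freq):
    """Return a per-unit clock frequency that brings response times closer.

    response_times: unit name -> response time measured at common_freq.
    Assumption (not from the abstract): time scales as 1 / frequency,
    so scaling a unit's clock by t / slowest stretches its time to 'slowest'.
    """
    slowest = max(response_times.values())
    return {unit: common_freq * t / slowest
            for unit, t in response_times.items()}


# Two hypothetical units measured at a common 100 MHz clock.
measured = {"unit_a": 2.0, "unit_b": 4.0}
new_freqs = balance_clocks(measured, common_freq=100.0)
```

Here the faster unit is clocked down to half the common frequency while the slowest unit keeps its clock, which under the stated assumption equalizes the two response times.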
-
Patent number: 8959304
Abstract: A data processing apparatus comprises a primary processor, a secondary processor configured to perform secure data processing operations and non-secure data processing operations and a memory configured to store secure data used by the secondary processor when performing the secure data processing operations and configured to store non-secure data used by the secondary processor when performing the non-secure data processing operations, wherein the secure data cannot be accessed by the non-secure data processing operations, wherein the secondary processor comprises a memory management unit configured to administer accesses to the memory from the secondary processor, the memory management unit configured to perform translations between virtual memory addresses used by the secondary processor and physical memory addresses used by the memory, wherein the translations are configured in dependence on a page table base address, the page table base address identifying a storage location in the memory of a set of des
Type: Grant
Filed: February 26, 2013
Date of Patent: February 17, 2015
Assignee: ARM Limited
Inventors: Dominic Hugo Symes, Ola Hugosson, Donald Felton, Sean Tristram Ellis
-
Patent number: 8843928
Abstract: A method and system of efficient use and programming of a multi-processing core device. The system includes a programming construct that is based on stream-domain code. A programmable core based computing device is disclosed. The computing device includes a plurality of processing cores coupled to each other. A memory stores stream-domain code including a stream defining a stream destination module and a stream source module. The stream source module places data values in the stream and the stream conveys data values from the stream source module to the stream destination module. A runtime system detects when the data values are available to the stream destination module and schedules the stream destination module for execution on one of the plurality of processing cores.
Type: Grant
Filed: January 21, 2011
Date of Patent: September 23, 2014
Assignee: QST Holdings, LLC
Inventors: Paul Master, Frederick Furtek
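The source, stream and destination relationship described in this abstract can be sketched in Python as follows. This is a single-threaded illustration under simplifying assumptions: the `Stream` class and `run_ready` function are hypothetical names, and the assignment of work to one of several processing cores is not modeled.

```python
from collections import deque


class Stream:
    """Conveys values from a source module to one destination module."""

    def __init__(self, destination):
        self.values = deque()
        self.destination = destination  # callable taking one data value

    def put(self, value):
        # Called by the stream source module.
        self.values.append(value)


def run_ready(streams):
    """Toy runtime: run each destination once per available data value."""
    ran = 0
    for stream in streams:
        while stream.values:
            stream.destination(stream.values.popleft())
            ran += 1
    return ran


received = []
stream = Stream(destination=received.append)
for v in (1, 2, 3):          # the source module places values in the stream
    stream.put(v * v)
count = run_ready([stream])  # the runtime detects data and schedules the sink
```

The destination only runs when data is present in its stream, which is the data-driven scheduling idea the abstract describes.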
-
Patent number: 8826286
Abstract: The present invention relates to the field of enterprise network computing. In particular, it relates to monitoring workload of a workload scheduler. Information defining a plurality of test jobs of low priority is received. The test jobs have respective launch times, and are launched for execution in a data processing system in accordance with said launch times and said low execution priority. The number of test jobs executed within a pre-defined analysis time range is determined. A performance decrease warning is issued if the number of executed test jobs is lower than a predetermined threshold number. A workload scheduler discards launching of jobs having a low priority when estimating that a volume of jobs submitted with higher priority is sufficient to keep said scheduling system busy.
Type: Grant
Filed: November 27, 2012
Date of Patent: September 2, 2014
Assignee: International Business Machines Corporation
Inventor: Sergej Boris
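The counting and threshold check at the core of this abstract can be sketched in a few lines of Python. The function name, the half-open time window and the use of plain numbers as timestamps are assumptions made for illustration only.

```python
def check_scheduler_health(completion_times, window_start, window_end, threshold):
    """Count test jobs executed inside the analysis window.

    Returns (executed, warning), where warning is True when fewer than
    'threshold' low-priority test jobs ran in [window_start, window_end).
    """
    executed = sum(1 for t in completion_times
                   if window_start <= t < window_end)
    return executed, executed < threshold


# Four hypothetical test jobs; only three finish inside the 0..30 window.
executed, warning = check_scheduler_health([5, 12, 18, 40], 0, 30, threshold=4)
```

Because low-priority test jobs are the first to be skipped when the scheduler is busy, a shortfall in the count serves as an indirect signal of load.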
-
Publication number: 20140181472
Abstract: A method and apparatus for providing a scalable compute fabric are provided herein. The method includes determining a workflow for processing by the scalable compute fabric, wherein the workflow is based on an instruction set. A pipeline is configured dynamically for processing the workflow, and the workflow is executed using the pipeline.
Type: Application
Filed: December 20, 2012
Publication date: June 26, 2014
Inventors: Scott Krig, Teresa Morrison
-
Patent number: 8706916
Abstract: The present invention includes an adaptable high-performance node (RXN) with several features that enable it to provide high performance along with adaptability. A preferred embodiment of the RXN includes a run-time configurable data path and control path. The RXN supports multi-precision arithmetic including 8, 16, 24, and 32 bit codes. Data flow can be reconfigured to minimize register accesses for different operations. For example, multiply-accumulate operations can be performed with minimal, or no, register stores by reconfiguration of the data path. Predetermined kernels can be configured during a setup phase so that the RXN can efficiently execute, e.g., Discrete Cosine Transform (DCT), Fast-Fourier Transform (FFT) and other operations. Other features are provided.
Type: Grant
Filed: February 15, 2013
Date of Patent: April 22, 2014
Assignee: Altera Corporation
Inventor: Amit Ramchandran
-
Patent number: 8638805
Abstract: Described embodiments provide for restructuring a scheduling hierarchy of a network processor having a plurality of processing modules and a shared memory. The scheduling hierarchy schedules packets for transmission. The network processor generates tasks corresponding to each received packet associated with a data flow. A traffic manager receives tasks provided by one of the processing modules and determines a queue of the scheduling hierarchy corresponding to the task. The queue has a parent scheduler at each of one or more next levels of the scheduling hierarchy up to a root scheduler, forming a branch of the hierarchy. The traffic manager determines if the queue and one or more of the parent schedulers of the branch should be restructured. If so, the traffic manager drops subsequently received tasks for the branch, drains all tasks of the branch, and removes the corresponding nodes of the branch from the scheduling hierarchy.
Type: Grant
Filed: September 30, 2011
Date of Patent: January 28, 2014
Assignee: LSI Corporation
Inventors: Balakrishnan Sundararaman, Shashank Nemawarkar, David Sonnier, Shailendra Aulakh, Allen Vestal
-
Publication number: 20140013081
Abstract: An approach for processing data by a pipeline of a single hardware-implemented virtual multiple instance finite state machine (VMI FSM) is presented. Based on a current state and context of an FSM instance, an input token selected from multiple input tokens to enter a pipeline of the VMI FSM, and a status of an environment, a new state of the FSM instance is determined and an output token is determined. The input token includes a reference to the FSM instance. In one embodiment, the reference is an InfiniBand QP number. After a receipt by the pipeline of the first input token and prior to determining the new state of the FSM instance and determining the output token, a logic circuit selects a second input token to enter the pipeline. The second input token includes a reference to a second FSM instance.
Type: Application
Filed: September 9, 2013
Publication date: January 9, 2014
Applicant: International Business Machines Corporation
Inventors: Rolf K. Fritz, Andreas Muller, Thomas Schlipf, Daniel Thiele
-
Publication number: 20140013080
Abstract: A method and system are provided for deriving a resultant software code from an originating ordered list of instructions that does not include overlapping branch logic. The method may include deriving a plurality of unordered software constructs from a sequence of processor instructions; associating software constructs in accordance with an original logic of the sequence of processor instructions; determining and resolving memory precedence conflicts within the associated plurality of software constructs; resolving forward branch logic structures into conditional logic constructs; resolving back branch logic structures into loop logic constructs; and/or applying the plurality of unordered software constructs in a programming operation by a parallel execution logic circuitry. The resultant plurality of unordered software constructs may be converted into programming reconfigurable logic, computers or processors, and also by means of a computer network or an electronics communications network.
Type: Application
Filed: December 20, 2012
Publication date: January 9, 2014
Inventor: Robert Keith Mykland
-
Patent number: 8612955
Abstract: A dataflow instruction set architecture and execution model, referred to as WaveScalar, which is designed for scalable, low-complexity/high-performance processors, while efficiently providing traditional memory semantics through a mechanism called wave-ordered memory. Wave-ordered memory enables “real-world” programs, written in any language, to be run on the WaveScalar architecture, as well as any out-of-order execution unit. Because it is software-controlled, wave-ordered memory can be disabled to obtain greater parallelism. WaveScalar also includes a software-controlled tag management system.
Type: Grant
Filed: January 22, 2008
Date of Patent: December 17, 2013
Assignee: University of Washington
Inventors: Mark H. Oskin, Steven J. Swanson, Susan J. Eggers
-
Publication number: 20130290674
Abstract: Constructs may express SIMD control flow that can be efficiently implemented on a SIMD machine with support for SIMD control flow. The execution semantics of constructs serve as a functional specification for an emulation implementation in the central processing unit (CPU), a non-SIMD machine, using a conventional C++ compiler such as GCC or Microsoft Visual C++ without any modification to the conventional compiler in some embodiments.
Type: Application
Filed: April 30, 2012
Publication date: October 31, 2013
Inventors: Biju George, Guei-Yuan Luch
-
Publication number: 20130246737
Abstract: Mechanisms, in a data processing system comprising a single instruction multiple data (SIMD) processor, for performing a data dependency check operation on vector element values of at least two input vector registers are provided. Two calls to a simd-check instruction are performed, one with input vector registers having a first order and one with the input vector registers having a different order. The simd-check instruction performs comparisons to determine if any data dependencies are present. Results of the two calls to the simd-check instruction are obtained and used to determine if any data dependencies are present in the at least two input vector registers. Based on the results, the SIMD processor may perform various operations.
Type: Application
Filed: March 15, 2012
Publication date: September 19, 2013
Applicant: International Business Machines Corporation
Inventors: Alexandre E. Eichenberger, Bruce M. Fleischer
-
Publication number: 20130080737
Abstract: A vector data access unit includes data access ordering circuitry, for issuing data access requests indicated by the elements to the data store, and configured in response to receipt of at least two decoded vector data access instructions, and one of the instructions being a write instruction. Data accesses are performed in the instructed order to determine an element indicating the next data access for each of said vector data access instructions. One of the next data accesses is selected to be issued to the data store in dependence upon an order in which the at least two vector data instructions were received. The position of the elements indicates the next data accesses relative to each other within their respective plurality of elements. A numerical position of the element indicating the next data access within the plurality of elements of an earlier instruction is less than a predetermined value.
Type: Application
Filed: September 28, 2011
Publication date: March 28, 2013
Applicant: ARM Limited
Inventor: Alastair David Reid
-
Patent number: 8397233
Abstract: A device includes an input processing unit and an output processing unit. The input processing unit dispatches first data to one of a group of processing engines, records an identity of the one processing engine in a location in a first memory, reserves one or more corresponding locations in a second memory, causes the first data to be processed by the one processing engine, and stores the processed first data in one of the locations in the second memory. The output processing unit receives second data, assigns an entry address corresponding to a location in an output memory to the second data, transfers the second data and the entry address to one of a group of second processing engines, causes the second data to be processed by the second processing engine, and stores the processed second data to the location in the output memory.
Type: Grant
Filed: May 23, 2007
Date of Patent: March 12, 2013
Assignee: Juniper Networks, Inc.
Inventors: Raymond Marcelino Manese Lim, Stefan Dyckerhoff, Jeffrey Glenn Libby, Teshager Tesfaye
-
Patent number: 8397186
Abstract: A technique for reliably replaying operations in electronic-design-automation (EDA) software is described. In this technique, the EDA software stores operations performed by a user during a design session, as well as any replay look-ahead instructions, in a log file. When repeating the first operation, the replay look-ahead instruction ensures that the same state is obtained in the EDA environment as was previously obtained. For example, if an interrupt occurred when the first operation was previously performed, the replay look-ahead instruction may specify when the interrupt occurred during the performance of the operation so that the effect of the interrupt may be simulated when replaying the first operation.
Type: Grant
Filed: October 30, 2009
Date of Patent: March 12, 2013
Assignee: Synopsys, Inc.
Inventor: Jeffrey T. Brubaker
-
Patent number: 8380884
Abstract: The present invention includes an adaptable high-performance node (RXN) with several features that enable it to provide high performance along with adaptability. A preferred embodiment of the RXN includes a run-time configurable data path and control path. The RXN supports multi-precision arithmetic including 8, 16, 24, and 32 bit codes. Data flow can be reconfigured to minimize register accesses for different operations. For example, multiply-accumulate operations can be performed with minimal, or no, register stores by reconfiguration of the data path. Predetermined kernels can be configured during a setup phase so that the RXN can efficiently execute, e.g., Discrete Cosine Transform (DCT), Fast-Fourier Transform (FFT) and other operations. Other features are provided.
Type: Grant
Filed: March 7, 2011
Date of Patent: February 19, 2013
Assignee: Altera Corporation
Inventor: Amit Ramchandran
-
Patent number: 8381219
Abstract: The present invention relates to the field of enterprise network computing. In particular, it relates to a method and respective system for monitoring workload of a workload scheduler. Information defining a plurality of test jobs of low priority is received. The test jobs have respective launch times, and the test jobs are launched for execution in a data processing system in accordance with said launch times and said low execution priority. It is evaluated how many of said test jobs are executed within a pre-defined analysis time range. A performance decrease warning is issued, if the number of executed test jobs is lower than a predetermined threshold number. The workload scheduler discards launching of jobs having a low priority when estimating that a volume of jobs submitted with higher priority is sufficient to keep said scheduling system busy.
Type: Grant
Filed: January 17, 2008
Date of Patent: February 19, 2013
Assignee: International Business Machines Corporation
Inventor: Sergej Boris
-
Patent number: 8296764
Abstract: The disclosure describes internal synchronization in adaptive integrated circuitry which utilizes a data flow model for data processing. Task initiation and execution are controlled based upon data consumption measured in data buffer units, with initiation of and transitions between tasks based on a determined boundary condition within the data stream. When a data processing task is selected for synchronization, a boundary condition in a data stream is determined for commencement of the selected data processing task. Then, a timing marker for the commencement of the selected data processing task is determined relative to the data stream. The timing marker is dual-valued, providing a designated buffer unit and a designated byte or bit location within the designated buffer. The timing marker is communicated to the selected data processing task, which then commences data processing at a location in the data stream designated by the timing marker.
Type: Grant
Filed: September 9, 2004
Date of Patent: October 23, 2012
Assignee: NVIDIA Corporation
Inventors: Ghobad Heidari-Bateni, Sharad D. Sambhwani
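The dual-valued timing marker described in this abstract can be illustrated with a short Python sketch. It assumes, for illustration only, that the data stream is divided into fixed-size buffer units and that the boundary condition has already been located as an absolute byte offset; `timing_marker` is a hypothetical name.

```python
def timing_marker(byte_offset, buffer_size):
    """Split an absolute stream offset into (buffer unit, byte within buffer)."""
    if buffer_size <= 0:
        raise ValueError("buffer_size must be positive")
    return divmod(byte_offset, buffer_size)


# A boundary found 1030 bytes into a stream of 256-byte buffer units.
marker = timing_marker(1030, 256)
```

The task receiving the marker would start processing at the given byte of the given buffer unit, rather than at the start of the stream.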
-
Patent number: 8265396
Abstract: The present invention provides for the recovery of characters entered into at least one data entry zone of a data entry window. A method in accordance with an embodiment includes: storing a first image of the data entry window during data entry; subtracting a reference image from the first image to obtain a delta image, wherein the reference image is an image of the data entry window without data entered; identifying at least one non empty zone of the delta image and the location of the at least one data entry zone on the data entry window from the location of the at least one non empty zone on the delta image; extracting at least one character by applying optical character recognition to the at least one non empty zone; and inputting the at least one character into the location of the at least one data entry zone.
Type: Grant
Filed: December 3, 2008
Date of Patent: September 11, 2012
Assignee: International Business Machines Corporation
Inventors: Frederic Bauchot, Jean-Luc Collet, Gerard Marmigere, Joaquin Picon
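The image subtraction and zone location steps in this abstract can be sketched in plain Python. The sketch makes several assumptions: images are small grayscale grids stored as lists of lists, a single non-empty zone is reported as one bounding box, and the optical character recognition step is left out entirely. The function names are hypothetical.

```python
def delta_image(first, reference):
    """Pixel-wise absolute difference between two equally sized images."""
    return [[abs(a - b) for a, b in zip(row_f, row_r)]
            for row_f, row_r in zip(first, reference)]


def non_empty_zone(delta):
    """Bounding box (top, left, bottom, right) of all changed pixels, or None."""
    hits = [(y, x) for y, row in enumerate(delta)
            for x, value in enumerate(row) if value]
    if not hits:
        return None
    ys = [y for y, _ in hits]
    xs = [x for _, x in hits]
    return (min(ys), min(xs), max(ys), max(xs))


# Reference: an empty entry window (border of 9s around a blank field).
reference = [
    [9, 9, 9, 9, 9],
    [9, 0, 0, 0, 9],
    [9, 0, 0, 0, 9],
    [9, 9, 9, 9, 9],
]
first = [row[:] for row in reference]
first[1][2] = 7  # pixels changed by typed characters
first[2][3] = 5
zone = non_empty_zone(delta_image(first, reference))
```

The static window decoration cancels out in the subtraction, so only the region touched by data entry remains for the recognition step.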
-
Publication number: 20120216019
Abstract: There are provided embodiments of methods of generating a hardware design for a pipelined parallel stream processor.
Type: Application
Filed: February 17, 2011
Publication date: August 23, 2012
Applicant: Maxeler Technologies, Ltd.
Inventors: Jacob Alexis Bower, James Huggett, Oliver Pell
-
Patent number: 8244718
Abstract: Embodiments of the present invention provide a database system that is optimized by using hardware acceleration. The system may be implemented in several variations to accommodate a wide range of queries and database sizes. In some embodiments, the system may comprise a host system that is coupled to one or more hardware accelerator components. The host system may execute software or provide an interface for receiving queries. The host system analyzes and parses these queries into tasks. The host system may then select some of the tasks and translate them into machine code instructions, which are executed by one or more hardware accelerator components. The tasks executed by hardware accelerators are generally those tasks that may be repetitive or processing intensive. Such tasks may include, for example, indexing, searching, sorting, table scanning, record filtering, and the like.
Type: Grant
Filed: August 27, 2007
Date of Patent: August 14, 2012
Assignee: Teradata US, Inc.
Inventors: Joseph I. Chamdani, Raj Cherabuddi, Michael Corwin, Jeremy Branscome, Liuxi Yang, Ravi Krishnamurthy
-
Patent number: 8239660
Abstract: A high speed processor. The processor includes terminals that each execute a subset of the instruction set. In at least one of the terminals, the instructions are executed in an order determined by data flow. Instructions are loaded into the terminal in pages. A notation is made when an operand for an instruction is generated by another instruction. When operands for an instruction are available, that instruction is a “ready” instruction. A ready instruction is selected in each cycle and executed. To allow data to be transmitted between terminals, each terminal is provided with a receive station, such that data generated in one terminal may be transmitted to another terminal for use as an operand in that terminal. In one embodiment, one terminal is an arithmetic terminal, executing arithmetic operations such as addition, multiplication and division. The processor has a second terminal, which contains functional logic to execute all other instructions in the instruction set.
Type: Grant
Filed: March 26, 2010
Date of Patent: August 7, 2012
Assignee: STMicroelectronics Inc.
Inventor: Stefano Cervini
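The "ready instruction" rule in this abstract, where an instruction may execute once all of its operands are available, can be illustrated with a small Python interpreter. This is a sketch under stated assumptions, not the patented design: there is a single terminal, operands are either literal numbers or the name of the producing instruction, and the first ready instruction found is the one selected each cycle. `run_dataflow` is a hypothetical name.

```python
import operator


def run_dataflow(program):
    """Execute instructions in data-flow order, one ready instruction per cycle.

    program: name -> (function, operands); an operand is either a number
    or the name (str) of the instruction that produces it.
    """
    results = {}
    order = []
    pending = dict(program)
    while pending:
        ready = [name for name, (_, operands) in pending.items()
                 if all(not isinstance(o, str) or o in results
                        for o in operands)]
        if not ready:
            raise ValueError("no ready instruction: cyclic or missing operand")
        name = ready[0]  # select one ready instruction this cycle
        func, operands = pending.pop(name)
        args = [results[o] if isinstance(o, str) else o for o in operands]
        results[name] = func(*args)
        order.append(name)
    return results, order


# 'c' is listed first but must wait for 'a' and 'b'.
program = {
    "c": (operator.mul, ["a", "b"]),
    "a": (operator.add, [1, 2]),
    "b": (operator.sub, [10, 4]),
}
results, order = run_dataflow(program)
```

Execution order follows operand availability rather than program order, so `c` runs last even though it appears first.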
-
Patent number: 8179896
Abstract: A network processor of an embodiment includes a packet classification engine, a processing pipeline, and a controller. The packet classification engine allows for classifying each of a plurality of packets according to packet type. The processing pipeline has a plurality of stages for processing each of the plurality of packets in a pipelined manner, where each stage includes one or more processors. The controller allows for providing the plurality of packets to the processing pipeline in an order that is based at least partially on: (i) packet types of the plurality of packets as classified by the packet classification engine and (ii) estimates of processing times for processing packets of the packet types at each stage of the plurality of stages of the processing pipeline. A method in a network processor allows for prefetching instructions into a cache for processing a packet based on a packet type of the packet.
Type: Grant
Filed: November 7, 2007
Date of Patent: May 15, 2012
Inventor: Justin Mark Sobaje
-
Patent number: 8161271
Abstract: Embodiments of the invention provide logic within the store data path between a processor and a memory array. The logic may be configured to misalign vector data as it is stored to memory. By misaligning vector data as it is stored to memory, memory bandwidth may be maximized while processing bandwidth required to store vector data misaligned is minimized. Furthermore, embodiments of the invention provide logic within the load data path which allows vector data which is stored misaligned to be aligned as it is loaded into a vector register. By aligning misaligned vector data as it is loaded into a vector register, memory bandwidth may be maximized while processing bandwidth required to align misaligned vector data may be minimized.
Type: Grant
Filed: July 11, 2007
Date of Patent: April 17, 2012
Assignee: International Business Machines Corporation
Inventors: David Arnold Luick, Eric Oliver Mejdrich, Adam James Muff
-
Patent number: 8145881
Abstract: A data processing device comprising a multidimensional array of coarse grained logic elements processing data and operating at a first clock rate and communicating with one another and/or other elements via busses and/or communication lines operated at a second clock rate is disclosed, wherein the first clock rate is higher than the second and wherein the coarse grained logic elements comprise storage means for storing data needed to be processed.
Type: Grant
Filed: October 24, 2008
Date of Patent: March 27, 2012
Inventors: Martin Vorbach, Alexander Thomas
-
Patent number: 8108653
Abstract: A processor includes a compute array comprising a first plurality of compute engines serially connected along a data flow path such that data flows between successive compute engines at successive times. The first plurality of compute engines includes an initial compute engine and a final compute engine. The data flow path includes a recirculation path connecting the final compute engine to the initial compute engine with no compute engine therebetween.
Type: Grant
Filed: February 5, 2010
Date of Patent: January 31, 2012
Assignee: Analog Devices, Inc.
Inventors: Boris Lerner, Douglas Garde
-
Publication number: 20110320771
Abstract: A circuit arrangement and method selectively bypass an instruction buffer for selected instructions so that bypassed instructions can be dispatched without having to first pass through the instruction buffer. Thus, for example, in the case that an instruction buffer is partially or completely flushed as a result of an instruction redirect (e.g., due to a branch mispredict), instructions can be forwarded to subsequent stages in an instruction unit and/or to one or more execution units without the latency associated with passing through the instruction buffer.
Type: Application
Filed: June 28, 2010
Publication date: December 29, 2011
Applicant: International Business Machines Corporation
Inventors: Eric O. Mejdrich, Paul E. Schardt, Robert A. Shearer, Matthew R. Tubbs
-
Patent number: 8078839
Abstract: An electronic processing element is disclosed for use in a system having a plurality of processing elements. The electronic processing element includes an input instruction memory, an operation unit, and an output instruction memory. The input instruction memory is configured to store and retrieve a plurality of operation codes and, for each operation code, an associated output instruction memory address. The operation unit is configured to generate an output datum defined by at least a selected operation code and an associated input datum. The output instruction memory is configured to receive the output instruction memory address and to retrieve an address for an input instruction memory of a second processing element. Upon selection of an input instruction memory address and presentation of an associated input datum, the processing element generates an output datum in association with a corresponding input instruction memory address of the second processing element.
Type: Grant
Filed: December 12, 2008
Date of Patent: December 13, 2011
Assignee: Wave Semiconductor
Inventor: Karl Fant
-
Patent number: 8024489
Abstract: A system for communicating command parameters between a processor and a memory flow controller is provided. The system makes use of a channel interface as the primary mechanism for communicating between the processor and a memory flow controller. The channel interface provides channels for communicating with processor facilities, memory flow control facilities, machine state registers, and external processor interrupt facilities, for example. These channels may be designated as blocking or non-blocking. With blocking channels, when no data is available to be read from the corresponding registers, or there is no space available to write to the corresponding registers, the processor is placed in a low power “stall” state. The processor is automatically awakened, via communication across the blocking channel, when data becomes available or space is freed. Thus, the channels of the present invention permit the processor to stay in a low power state.
Type: Grant
Filed: April 21, 2008
Date of Patent: September 20, 2011
Assignee: International Business Machines Corporation
Inventors: Michael N. Day, Charles R. Johns, Peichun P. Liu, Todd E. Swanson, Thuong Q. Truong
-
Patent number: 7941794Abstract: A data flow graph processing method divides a program describing target operations into two or more subprograms and converts each of the two or more subprograms into a data flow graph (DFG) representing dependency in execution between operations carried out in sequence. Also generated is flow data indicating the order of execution of DFGs corresponding to respective subprograms. DFGs are converted into configuration data and the flow data is converted into control data.Type: GrantFiled: August 29, 2005Date of Patent: May 10, 2011Assignee: Sanyo Electric Co., Ltd.Inventors: Makoto Ozone, Hiroshi Nakajima, Tatsuo Hiramatsu, Katsunori Hirase, Makoto Okada
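The conversion described in this abstract can be illustrated with a minimal sketch, not the patented method. It assumes each subprogram is a list of three-address statements `(dest, op, src1, src2)`; the statement format and the names `build_dfg` and `compile_program` are hypothetical. It derives dependency edges within each subprogram and keeps a separate list giving the execution order of the resulting DFGs.

```python
def build_dfg(statements):
    """Return dependency edges (producer_index, consumer_index) for one subprogram.

    Each statement is a hypothetical tuple (dest, op, src1, src2, ...).
    An edge means the consumer reads a value the producer wrote earlier.
    """
    last_def = {}  # variable name -> index of the statement that last wrote it
    edges = []
    for idx, (dest, _op, *srcs) in enumerate(statements):
        for src in srcs:
            if src in last_def:
                edges.append((last_def[src], idx))
        last_def[dest] = idx
    return edges


def compile_program(subprograms):
    """Build one DFG per subprogram plus flow data (assumed here: source order)."""
    dfgs = [build_dfg(sp) for sp in subprograms]
    flow = list(range(len(dfgs)))  # order in which the DFGs are to be executed
    return dfgs, flow
```

In this sketch a value produced in one subprogram and consumed in another creates no edge; that ordering is carried only by the flow list.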
-
Publication number: 20110087895Abstract: A processor may include a hardware instruction fetch unit configured to issue instructions for execution, and a hardware functional unit configured to receive instructions for execution, where the instructions include cryptographic instruction(s) and non-cryptographic instruction(s). The functional unit may include a cryptographic execution pipeline configured to execute the cryptographic instructions with a corresponding cryptographic execution latency, and a non-cryptographic execution pipeline configured to execute the non-cryptographic instructions with a corresponding non-cryptographic execution latency that is longer than the cryptographic execution latency.Type: ApplicationFiled: October 8, 2009Publication date: April 14, 2011Inventors: Christopher H. Olson, Gregory F. Grohoski, Robert T. Golla
-
Patent number: 7913007Abstract: Systems, methods, and computer program products for preemption in asynchronous systems using anti-tokens are disclosed. According to one aspect, a configurable system for constructing asynchronous application specific integrated data pipeline circuits with preemption includes a plurality of modular circuit stages that are connectable with each other and with other circuit elements to form multi-stage asynchronous application specific integrated data pipeline circuits for asynchronously sending data and tokens in a forward direction through the pipeline and for asynchronously sending anti-tokens in a backward direction through the pipeline. Each stage is configured to perform a handshaking protocol with other pipeline stages, the protocol including receiving either a token from the previous stage or an anti-token from the next stage, and in response, sending both a token forward to the next stage and an anti-token backward to the previous stage.Type: GrantFiled: September 29, 2008Date of Patent: March 22, 2011Assignee: The University of North CarolinaInventors: Montek Singh, Manoj Kumar Ampalam
-
Publication number: 20110060890Abstract: A stream data generating method for a computer system for generating stream data having time information applied thereto in a time series order and processing the generated stream data on the basis of a registered query. The computer system includes a storage for storing therein query information indicative of a plurality of sorts of constituent elements forming stream data corresponding to the query on the basis of the query and a stream definition indicative of the plurality of constituent elements, a data generator for generating and transmitting stream data, and a stream data processor for processing the stream data transmitted from the data generator. The data generator reduces the quantity of stream data to be transmitted to the stream data processor on the basis of the query information.Type: ApplicationFiled: November 25, 2009Publication date: March 10, 2011Inventors: Kazuho TANAKA, Takahiro Yokoyama, Tomohiro Hanai, Satoru Watanabe, Atsuro Handa
-
Patent number: 7904603Abstract: The present invention includes an adaptable high-performance node (RXN) with several features that enable it to provide high performance along with adaptability. A preferred embodiment of the RXN includes a run-time configurable data path and control path. The RXN supports multi-precision arithmetic including 8, 16, 24, and 32 bit codes. Data flow can be reconfigured to minimize register accesses for different operations. For example, multiply-accumulate operations can be performed with minimal, or no, register stores by reconfiguration of the data path. Predetermined kernels can be configured during a setup phase so that the RXN can efficiently execute, e.g., Discrete Cosine Transform (DCT), Fast-Fourier Transform (FFT) and other operations. Other features are provided.Type: GrantFiled: September 10, 2009Date of Patent: March 8, 2011Assignee: QST Holdings, LLCInventor: Amit Ramchandran
-
Patent number: 7895418Abstract: There is disclosed an operand queue for use in a floating point unit. The floating point unit comprises floating point processing units for executing floating point instructions that write operands to an external memory and for executing floating point instructions that read operands from the external memory. The floating point unit also comprises an operand queue for storing a plurality of operands associated with one or more operations being processed in the floating point unit. The operand queue stores a first operand being written to an external memory by a floating point write instruction executed by a first one of the plurality of floating point processing units and supplies the first operand to a floating point read instruction executed by a second one of the plurality of floating point processing units subsequent to the execution of the floating point write instruction.Type: GrantFiled: November 28, 2005Date of Patent: February 22, 2011Assignee: National Semiconductor CorporationInventor: Daniel W. Green
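The forwarding behavior in this abstract, where a later read is served from the queue rather than from external memory, can be sketched roughly as below. This is an illustrative software model under stated assumptions, not the patented hardware: the class name `OperandQueue`, the `capacity` parameter, the oldest-first drain policy, and the use of a dict as external memory are all hypothetical.

```python
from collections import OrderedDict


class OperandQueue:
    """Toy model: operands on their way to external memory, readable in the meantime."""

    def __init__(self, memory, capacity=4):
        self.memory = memory          # dict standing in for external memory (assumption)
        self.capacity = capacity      # hypothetical queue depth
        self.pending = OrderedDict()  # address -> operand, oldest entry first

    def write(self, addr, operand):
        """Queue an operand written by one processing unit."""
        self.pending.pop(addr, None)  # a newer write to the same address supersedes the old one
        self.pending[addr] = operand
        if len(self.pending) > self.capacity:
            self.drain_one()

    def read(self, addr):
        """Serve a read from the queue if the operand is still pending, else from memory."""
        if addr in self.pending:
            return self.pending[addr]
        return self.memory[addr]

    def drain_one(self):
        """Retire the oldest pending operand to external memory."""
        addr, operand = self.pending.popitem(last=False)
        self.memory[addr] = operand
```

A read issued after a write to the same address returns the queued operand even though external memory has not yet been updated.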
-
Patent number: 7895586Abstract: A data flow graph processing method divides at least one DFG generated into a plurality of sub-DFGs, in accordance with the number of logic circuits in a circuit set in a reconfigurable circuit. When the reconfigurable circuit is provided with a structure including multiple-row connections, the number of columns in the sub-DFG is configured to be equal to or fewer than the number of logic circuits per row in the reconfigurable circuit. Subsequently, the sub-DFGs are joined so as to generate a joined DFG. The number of columns in the joined DFG is also configured to be equal to or fewer than the number of columns per row in the reconfigurable circuit. The joined DFG is redivided to sizes with number of rows equal to or fewer than the number of rows in the reconfigurable circuit, so as to generate subjoined DFGs mappable into the reconfigurable circuit.Type: GrantFiled: June 20, 2005Date of Patent: February 22, 2011Assignee: Sanyo Electric Co., Ltd.Inventor: Makoto Ozone
-
Patent number: 7870102Abstract: An apparatus and method to store data are disclosed. The method provides a data storage system comprising a fossilized data management apparatus interconnected with one or more data storage devices. The method provides to the fossilized data management apparatus information and meta data associated with that information, wherein the meta data comprises a format field, a context field, a retention field, a data management field, and a storage management field. The fossilized data management apparatus instructs the one or more data storage devices to write the information to the one or more data storage devices based upon the meta data format field.Type: GrantFiled: July 12, 2006Date of Patent: January 11, 2011Assignee: International Business Machines CorporationInventors: Nils Haustein, Ulf Troppens, Daniel James Winarski
-
Patent number: 7804844Abstract: A processor is specified for implementing actions for manipulating the fields of the packets of a communication protocol. A cluster specification is input specifying clusters of independent actions. A constraint specification is input of dependencies constraining performance of the actions, including a dependency between a first action from a first cluster and a second action from a second cluster. Each cluster is assigned to a stage of a dataflow pipeline of the processor, and the dependencies are satisfied by performing each stage in an order of the dataflow pipeline. The first action is transferred between the stages of the first and second clusters. A timeframe is scheduled for performing each action in each stage of the dataflow pipeline. The timeframe is scheduled for performing of the first and second actions in the stage of the second cluster in accordance with the dependencies. A specification of the dataflow pipeline is output.Type: GrantFiled: August 5, 2008Date of Patent: September 28, 2010Assignee: Xilinx, Inc.Inventors: Michael E. Attig, Gordon J. Brebner
-
Publication number: 20100229162Abstract: A compiling apparatus includes an instruction-sequence-hierarchy-graph generating unit that generates an instruction sequence hierarchy graph by arraying unit graphs, to each of which a data path realized by a plurality of microinstructions included in one instruction sequence is to be allocated and in each of which function units included in a target processor are a node and a data line between the function units is an edge, to correspond to an execution order of a plurality of instruction sequences and by connecting arrayed unit graphs with an edge corresponding to a hardware path capable of establishing a data path across the instruction sequences; a data path allocating unit that allocates a data path to each of the unit graphs constituting the instruction sequence hierarchy graph; and an object program output unit that generates an instruction sequence group based on the data path allocated to the instruction sequence hierarchy graph.Type: ApplicationFiled: September 15, 2009Publication date: September 9, 2010Applicant: KABUSHIKI KAISHA TOSHIBAInventors: Ryuji HADA, Takashi Miyamori, Keiri Nakanishi, Masato Sumiyoshi, Takahisa Wada, Yasuki Tanabe, Katsuyuki Kimura, Shunichi Ishiwata
-
Patent number: 7779165Abstract: Producer and consumer processes may synchronize and transfer data using a shared data structure. After locating a potential transfer location that indicates an EMPTY status, a producer may store data to be transferred in the transfer location. A producer may use a compare-and-swap (CAS) operation to store the transfer data to the transfer location. A consumer may subsequently read the transfer data from the transfer location and store, such as by using a CAS operation, a DONE status indicator in the transfer location. The producer may notice the DONE indication and may then set the status location back to EMPTY to indicate that the location is available for future transfers, by the same or a different producer. The producer may also monitor the transfer location and time out if no consumer has picked up the transfer data.Type: GrantFiled: January 4, 2006Date of Patent: August 17, 2010Assignee: Oracle America, Inc.Inventors: Mark S. Moir, Daniel S. Nussbaum, Ori Shalev, Nir N. Shavit
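The EMPTY / data / DONE handoff in this abstract can be sketched for a single transfer location as follows. This is a minimal illustration under stated assumptions, not the patented implementation: Python has no hardware CAS, so `Slot.compare_and_swap` is a hypothetical helper that simulates one with a lock, and the spin-count timeout in `produce` is likewise an assumption. Items are compared by identity and must not be `None`.

```python
import threading

EMPTY = object()  # status: the location is free for a new transfer
DONE = object()   # status: a consumer has picked up the data


class Slot:
    """One transfer location with a simulated (lock-based) compare-and-swap."""

    def __init__(self):
        self._value = EMPTY
        self._lock = threading.Lock()

    def load(self):
        with self._lock:
            return self._value

    def compare_and_swap(self, expected, new):
        # Store `new` only if the slot still holds `expected` (identity comparison).
        with self._lock:
            if self._value is expected:
                self._value = new
                return True
            return False


def produce(slot, item, spins=1_000_000):
    """Offer `item`; return True if a consumer took it, False if busy or timed out."""
    if not slot.compare_and_swap(EMPTY, item):
        return False  # location in use; a real producer would try another one
    for _ in range(spins):
        if slot.load() is DONE:
            slot.compare_and_swap(DONE, EMPTY)  # release the location for reuse
            return True
    # Timed out: withdraw the item unless a consumer claimed it at the last moment.
    if slot.compare_and_swap(item, EMPTY):
        return False
    slot.compare_and_swap(DONE, EMPTY)
    return True


def consume(slot):
    """Take the pending item, if any, and mark the location DONE."""
    item = slot.load()
    if item is EMPTY or item is DONE:
        return None
    return item if slot.compare_and_swap(item, DONE) else None
```

The withdrawal step also uses CAS so that a consumer arriving exactly at the timeout still completes the transfer instead of losing the item.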
-
Publication number: 20100199070Abstract: A programmable filter processor which is adaptable to different filtering algorithms, a plurality of different software algorithms being executable, the programmable filter processor including a logic unit which includes a plurality of pipeline stages; a first memory in which the software algorithms are stored; a second memory in which raw data and parameters for the different filter algorithms are stored; and an address generating unit which is controllable via a program counter, the address generating unit being developed to generate control commands for the second memory and the logic unit.Type: ApplicationFiled: July 8, 2008Publication date: August 5, 2010Inventors: Stephen Schmitt, Juergen Mallok, Juergen Hanisch
-
Publication number: 20100199071Abstract: A data processing apparatus in which pipeline processing is performed comprises a control unit that controls a data processing sequence, a first processing unit that begins first data processing by inputting data on the basis of a start signal, outputs data subjected to the first data processing, and outputs a completion signal to the control unit after completing the first data processing, and a second processing unit that begins second data processing by inputting the data subjected to the first data processing on the basis of a start signal, outputs data subjected to the second data processing, and outputs a completion signal to the control unit after completing the second data processing. The control unit outputs a following start signal to the first processing unit and the second processing unit upon reception of the completion signal of the first data processing and the second data processing respectively.Type: ApplicationFiled: January 29, 2010Publication date: August 5, 2010Inventors: Keisuke Nakazono, Akira Ueno
-
Patent number: 7752576Abstract: An input unit inputs specification description that includes a plurality of pieces of processing information each indicative of a processing performed by a design object and association information indicative of associations among the processing information. A node generating unit generates a node for each of the processing information. A link generating unit generates, based on the association information, a link that couples nodes generated by the node generating unit. A sub-chart generating unit generates a plurality of sub-charts by dividing a chart indicating a content of the specification description, based on the node and the link. A function-module generating unit generates, for each of the sub-charts, a function module that executes a function based on the processing information corresponding to the node in the sub-chart and the association information corresponding to the link in the sub-chart.Type: GrantFiled: March 31, 2006Date of Patent: July 6, 2010Assignee: Fujitsu LimitedInventors: Qiang Zhu, Tsuneo Nakata
-
Patent number: 7724030Abstract: In one embodiment, an integrated device is disclosed. For example, in one embodiment of the present invention, a device comprises a core module for providing one or more output signals. The device comprises an output logic module for receiving the one or more output signals and an input logic module, wherein the one or more output signals are received by the input logic module via one or more feedback paths, where the one or more output signals are forwarded back to the core module.Type: GrantFiled: December 3, 2007Date of Patent: May 25, 2010Assignee: XILINX, Inc.Inventors: Steven E. McNeil, Andrew W. Lai
-
Patent number: 7716455Abstract: A high speed processor. The processor includes terminals that each execute a subset of the instruction set. In at least one of the terminals, the instructions are executed in an order determined by data flow. Instructions are loaded into the terminal in pages. A notation is made when an operand for an instruction is generated by another instruction. When operands for an instruction are available, that instruction is a “ready” instruction. A ready instruction is selected in each cycle and executed. To allow data to be transmitted between terminals, each terminal is provided with a receive station, such that data generated in one terminal may be transmitted to another terminal for use as an operand in that terminal. In one embodiment, one terminal is an arithmetic terminal, executing arithmetic operations such as addition, multiplication and division. The processor has a second terminal, which contains functional logic to execute all other instructions in the instruction set.Type: GrantFiled: December 3, 2004Date of Patent: May 11, 2010Assignee: STMicroelectronics, Inc.Inventor: Stefano Cervini
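The data-flow ordering in this abstract, where an instruction becomes "ready" once its operands are available and one ready instruction is executed per cycle, can be sketched as below. This is a simplified software model under stated assumptions, not the patented design: the `Instr` record, the `run_dataflow` function, and the "first ready instruction in page order" selection policy are hypothetical.

```python
import operator
from dataclasses import dataclass
from typing import Callable, Tuple


@dataclass
class Instr:
    dest: str               # name of the value this instruction produces
    op: Callable            # function applied to the operand values
    srcs: Tuple[str, ...]   # names of the operands it needs


def run_dataflow(page, initial_values):
    """Execute a page of instructions in data-driven order, one per cycle."""
    values = dict(initial_values)
    pending = list(page)
    order = []  # destinations in the order their instructions actually executed
    while pending:
        # An instruction is ready when every operand has already been produced.
        ready = next((i for i in pending if all(s in values for s in i.srcs)), None)
        if ready is None:
            raise ValueError("no ready instruction: missing operand or cyclic dependency")
        values[ready.dest] = ready.op(*(values[s] for s in ready.srcs))
        pending.remove(ready)
        order.append(ready.dest)
    return values, order


# The multiply is listed first but must wait for the add that produces "c".
page = [Instr("d", operator.mul, ("c", "a")), Instr("c", operator.add, ("a", "b"))]
values, order = run_dataflow(page, {"a": 2, "b": 3})
```

Here `order` ends up as `["c", "d"]`, so execution follows operand availability rather than the order in which the instructions were loaded.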