Data Flow Array Processor Patents (Class 712/18)
-
Patent number: 7574582
Abstract: There is disclosed a processor array, which achieves an approximately constant latency. Communications to and from the farthest array elements are suitably pipelined for the distance, while communications to and from closer array elements are deliberately “over-pipelined” such that the latency to all end-point elements is the same number of clock cycles. The processor array has a plurality of primary buses, each connected to a primary bus driver, and each having a respective plurality of primary bus nodes thereon; respective pluralities of secondary buses, connected to said primary bus nodes; a plurality of processor elements, each connected to one of the secondary buses; and delay elements associated with the primary bus nodes, for delaying communications with processor elements connected to different ones of the secondary buses by different amounts, in order to achieve a degree of synchronization between operation of said processor elements.
Type: Grant
Filed: January 26, 2004
Date of Patent: August 11, 2009
Assignee: Picochip Designs Limited
Inventor: John Matthew Nolan
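The "over-pipelining" idea in this abstract can be illustrated with a small sketch. It is not taken from the patent: the hop counts and the function names `added_delay_stages` and `total_latency` are hypothetical, and the only assumption is that each bus node adds enough delay stages to match the farthest node.

```python
def added_delay_stages(hops_to_node):
    """Extra pipeline stages per node so every node sees the same latency.

    hops_to_node: clock cycles naturally needed to reach each node
    (hypothetical input; the abstract gives no concrete figures).
    """
    worst_case = max(hops_to_node)
    # Closer nodes are deliberately "over-pipelined" by the difference.
    return [worst_case - hops for hops in hops_to_node]


def total_latency(hops_to_node):
    """Natural latency plus the added delay, per node."""
    delays = added_delay_stages(hops_to_node)
    return [hops + delay for hops, delay in zip(hops_to_node, delays)]


hops = [1, 2, 3, 4]
print(added_delay_stages(hops))  # [3, 2, 1, 0]
print(total_latency(hops))       # [4, 4, 4, 4]
```

The nearest node receives the most added delay and the farthest receives none, so all end points respond after the same number of cycles.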
-
Patent number: 7555630
Abstract: A context forwarding bus efficiently communicates control and data between processing elements in a processor unit having a plurality of processing elements. Control and data information is transferred over a first bus from processing element to processing element.
Type: Grant
Filed: December 21, 2004
Date of Patent: June 30, 2009
Assignee: Intel Corporation
Inventors: Sanjeev Jain, Gilbert M. Wolrich, Mark B. Rosenbluth
-
Patent number: 7539845
Abstract: An integrated circuit comprises a plurality of tiles. Each tile comprises a processor, and a switch including switching circuitry to forward data received over data paths from other tiles to the processor and to switches of other tiles, and to forward data received from the processor to switches of other tiles. The integrated circuit further comprises an interface coupled to a plurality of the tiles to transfer data between one or more switches of the tiles and one or more switches of tiles in an externally coupled integrated circuit.
Type: Grant
Filed: April 14, 2006
Date of Patent: May 26, 2009
Assignee: Tilera Corporation
Inventors: David Wentzlaff, Carl G. Ramey, Anant Agarwal
-
Patent number: 7500239
Abstract: According to some embodiments, each of a plurality of threads receives a start signal from a previous thread and a data packet from a buffer. Each thread issues a command to store the data packet in a memory, receives a continue signal from the previous thread, transmits a continue signal to a next thread after the data packet is stored in the memory, disposes of the data packet, receives an indication that the buffer has received a new packet, receives a start signal from the previous thread, and transmits a start signal to a next thread.
Type: Grant
Filed: May 23, 2003
Date of Patent: March 3, 2009
Assignee: Intel Corporation
Inventor: David Qiang Meng
-
Publication number: 20090031104
Abstract: Data processing device comprising a multidimensional array of ALUs, having at least two dimensions where the number of ALUs in the dimension is greater than or equal to 2, adapted to process data without register-caused latency between at least some of the ALUs in the corresponding array.
Type: Application
Filed: February 6, 2006
Publication date: January 29, 2009
Inventors: Martin Vorbach, Frank May
-
Patent number: 7483429
Abstract: A network processor dataflow chip and method for flexible dataflow are provided. The dataflow chip comprises a plurality of on-chip data transmission and scheduling circuit structures. The data transmission and scheduling circuit structures are selected responsive to indicators. Data transmission circuit structures may comprise selectable frame processing and data transmission functions. Selectable frame processing may comprise cut and paste, full dispatch and store and dispatch frame processing. Scheduling functions include full internal scheduling, calendar scheduling in communication with an external scheduler, and external calendar scheduling. In another aspect of the present invention, data transmission functions may comprise low latency and normal latency external processor interfaces for selectively providing privileged access to dataflow chip resources.
Type: Grant
Filed: May 18, 2005
Date of Patent: January 27, 2009
Assignee: International Business Machines Corporation
Inventors: Jean L. Calvignac, Chih-jen Chang, Joseph F. Logan, Fabrice J. Verplanken, Daniel Wind
-
Patent number: 7461236
Abstract: An integrated circuit includes a plurality of tiles. Each tile comprises a processor; and a switch including switching circuitry to forward data over data paths from other tiles to the processor and to switches of other tiles according to a switch instruction indicating an input port to which each of multiple output ports of the switch is to be coupled. The switch is able to operate in a first mode in which successive input data arriving at the switch are forwarded according to a different switch instruction, and a second mode in which successive input data arriving at the switch are forwarded according to the same switch instruction.
Type: Grant
Filed: December 21, 2005
Date of Patent: December 2, 2008
Assignee: Tilera Corporation
Inventor: David Wentzlaff
-
Patent number: 7444495
Abstract: A computing arrangement including a processor and programmable logic. In various embodiments, the arrangement includes an instruction processing circuit coupled to a programmable logic circuit, and a memory arrangement coupled to the instruction processing circuit and to the programmable logic circuit. The instruction processing circuit executes instructions of a native instruction set, and the programmable logic is configured to dynamically translate input instructions to translated instructions of the native instruction set. The translated instructions are stored in a translation cache in the memory arrangement, and the translation cache is managed by the programmable logic. The programmable logic then provides the translated instructions to the instruction processing circuit for execution.
Type: Grant
Filed: August 30, 2002
Date of Patent: October 28, 2008
Assignee: Hewlett-Packard Development Company, L.P.
Inventor: Gregory S. Snider
-
Patent number: 7426628
Abstract: A method for run-time prediction of a next caller of a shared functional unit, wherein the shared functional unit is operable to be called by two or more callers out of a plurality of callers. The shared functional unit and the plurality of callers are operable to execute in parallel on a parallel execution unit. The run-time prediction is used for data flow programs. The run-time prediction detects a calling pattern of the plurality of callers of the shared functional unit and predicts the next caller out of the plurality of callers of the shared functional unit. The run-time prediction then loads state information associated with the next caller out of the plurality of callers.
Type: Grant
Filed: March 15, 2004
Date of Patent: September 16, 2008
Assignee: National Instruments Corporation
Inventor: Newton G. Petersen
-
Patent number: 7404066
Abstract: A command engine for an active memory receives high level tasks from a host and generates corresponding sets of either DCU commands to a DRAM control unit or ACU commands to a processing array control unit. The DCU commands include memory addresses, which are also generated by the command engine, and the ACU commands include instruction memory addresses corresponding to an address in an array control unit where processing array instructions are stored.
Type: Grant
Filed: January 24, 2007
Date of Patent: July 22, 2008
Assignee: Micron Technology, Inc.
Inventor: Graham Kirsch
-
Patent number: 7370123
Abstract: A descriptor queue composed of descriptors containing input address information that represents an address for storing data to be processed and output address information that represents an address for storing processed data is constructed and stored in a memory. A stream processor for performing a plurality of processes parallel to each other on the data to be processed acquires a descriptor from the memory, reads data to be processed from the memory according to the input address information contained in the descriptor, processes the data, and stores the processed data back into the memory according to the output address information contained in the descriptor.
Type: Grant
Filed: October 11, 2005
Date of Patent: May 6, 2008
Assignee: NEC Electronics Corporation
Inventors: Kenichiro Anjo, Katsumi Togawa, Ryoko Sasaki
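The descriptor-driven flow described in this abstract might look roughly like the following sketch. It is an illustration only: the `Descriptor` fields beyond the two addresses (here `length`), the `run_stream_processor` name, and the byte-doubling operation are all assumptions, and the loop runs sequentially rather than performing processes in parallel as the abstract describes.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class Descriptor:
    input_addr: int   # where the data to be processed is stored
    output_addr: int  # where the processed data should be written
    length: int       # hypothetical field: number of bytes to process


def run_stream_processor(memory, queue, process):
    """Drain the descriptor queue, reading and writing the shared memory."""
    while queue:
        d = queue.popleft()                       # acquire a descriptor
        data = bytes(memory[d.input_addr:d.input_addr + d.length])
        result = process(data)                    # process the data
        memory[d.output_addr:d.output_addr + len(result)] = result


memory = bytearray(16)
memory[0:4] = b"\x01\x02\x03\x04"
queue = deque([Descriptor(input_addr=0, output_addr=8, length=4)])
run_stream_processor(memory, queue, lambda b: bytes(x * 2 for x in b))
print(memory[8:12].hex())  # 02040608
```

Keeping both addresses in the descriptor means the processor needs no per-job setup from a host: each queue entry fully describes where to read and where to write.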
-
Patent number: 7305649
Abstract: A streaming processor circuit of a processing system is automatically generated by selecting a set of circuit parameters consistent with a set of circuit constraints and generating a representation of a candidate streaming processor circuit based upon the set of circuit parameters to execute one or more iterations of a computation specified by a streaming data flow graph. The candidate streaming processor circuit is evaluated with respect to one or more quality metrics and the representation of the candidate streaming processor circuit is output if the candidate streaming processor circuit satisfies a set of processing system constraints and is better in at least one of the one or more quality metrics than other candidate streaming processor circuits.
Type: Grant
Filed: April 20, 2005
Date of Patent: December 4, 2007
Assignee: Motorola, Inc.
Inventors: Nikos Bellas, Sek M. Chai, Erica M. Lau, Zhiyuan Li, Daniel A. Linzmeier
-
Patent number: 7290123
Abstract: A loop detector with an array to store a counter of loop iterations, where the number of entries in the array may be, for example, smaller than the number of entries in the loop detector. Entries in the array may, for example, be associated with more than one entry in the loop detector. The array may store, for example, a counter of speculative iterations of a loop or, for example, a number of actual iterations of a loop.
Type: Grant
Filed: May 20, 2004
Date of Patent: October 30, 2007
Assignee: Intel Corporation
Inventor: Tal Gat
-
Patent number: 7287146
Abstract: An array-type computer processor stops, with a plurality of computer programs held, a state control unit and a data-path unit, upon input of event data for task switching. The array-type computer processor obtains the operation state of the state control unit and the processed data of the data-path unit when stopped, and temporarily holds them for each of a plurality of the computer programs. Upon completion of this, the array-type computer processor reads the operation state and processed data of any other computer program and sets them in the state control unit and data-path unit. Upon completion of this, the array-type computer processor outputs to the state control unit the event data for starting the operation. The state control unit then starts to sequentially transfer the operation state, thereby making it possible to perform the process operations according to a plurality of computer programs in a time-sharing manner.
Type: Grant
Filed: February 2, 2005
Date of Patent: October 23, 2007
Assignees: NEC Corporation, NEC Electronics Corporation
Inventors: Takeshi Inuo, Nobuki Kajihara, Takao Toi, Tooru Awashima, Hirokazu Kami, Taro Fujii, Kenichiro Anjo, Kouichiro Furuta, Masato Motomura
-
Patent number: 7260558
Abstract: An apparatus, a carrier medium carrying computer readable code to implement a method, and a method for searching for a plurality of patterns definable by complex expressions, and further, for efficiently generating data for such searching. One method includes accepting or determining a plurality of state machines for searching for a plurality of patterns, merging the state machines to form a merged state machine, and storing a data structure describing the merged state machine, including state data on the states of the merged state machine. The method is such that pattern matching logic reading state data and accepting a sequence of inputs can search the input sequence for the plurality of patterns.
Type: Grant
Filed: October 24, 2005
Date of Patent: August 21, 2007
Assignee: Hi/fn, Inc.
Inventors: Paul C. Cheng, Fangli Chien
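One textbook way to merge several state machines into one is the product construction, sketched below. The abstract does not say which merging technique the patent uses, so this is a stand-in rather than the patented method; the dictionary layout, the function names, and the two example machines (inputs ending in "ab" and inputs ending in "bb") are all hypothetical. Each component machine is assumed to have a complete transition table over the alphabet.

```python
def merge_machines(machines, alphabet):
    """Product construction: a merged state is a tuple of component states."""
    start = tuple(m["start"] for m in machines)
    delta, matches = {}, {}
    pending, seen = [start], {start}
    while pending:
        state = pending.pop()
        # Record which patterns accept in this merged state.
        matches[state] = {i for i, (m, s) in enumerate(zip(machines, state))
                          if s in m["accept"]}
        for sym in alphabet:
            nxt = tuple(m["delta"][(s, sym)] for m, s in zip(machines, state))
            delta[(state, sym)] = nxt
            if nxt not in seen:
                seen.add(nxt)
                pending.append(nxt)
    return {"start": start, "delta": delta, "matches": matches}


def search(merged, text):
    """Single pass over the input; returns (position, pattern index) hits."""
    state, hits = merged["start"], []
    for pos, sym in enumerate(text):
        state = merged["delta"][(state, sym)]
        hits.extend((pos, p) for p in sorted(merged["matches"][state]))
    return hits


# Pattern 0: input so far ends in "ab". Pattern 1: input so far ends in "bb".
ends_ab = {"start": 0, "accept": {2},
           "delta": {(0, "a"): 1, (0, "b"): 0, (1, "a"): 1,
                     (1, "b"): 2, (2, "a"): 1, (2, "b"): 0}}
ends_bb = {"start": 0, "accept": {2},
           "delta": {(0, "a"): 0, (0, "b"): 1, (1, "a"): 0,
                     (1, "b"): 2, (2, "a"): 0, (2, "b"): 2}}

merged = merge_machines([ends_ab, ends_bb], "ab")
print(search(merged, "abb"))  # [(1, 0), (2, 1)]
```

The benefit of merging is that the matcher reads one table entry per input symbol regardless of how many patterns are being searched for.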
-
Patent number: 7130986
Abstract: According to some embodiments, it is determined if a register is ready to exchange data with a processing element.
Type: Grant
Filed: June 30, 2003
Date of Patent: October 31, 2006
Assignee: Intel Corporation
Inventors: Kalpesh D. Mehta, Louis A. Lippincott, Eric F. Vannerson
-
Patent number: 7100020
Abstract: An integrated circuit (203) for use in processing streams of data generally and streams of packets in particular. The integrated circuit (203) includes a number of packet processors (307, 313, 303), a table look up engine (301), a queue management engine (305) and a buffer management engine (315). The packet processors (307, 313, 303) include a receive processor (421), a transmit processor (427) and a RISC core processor (401), all of which are programmable. The receive processor (421) and the core processor (401) cooperate to receive and route packets being received and the core processor (401) and the transmit processor (427) cooperate to transmit packets. Routing is done by using information from the table look up engine (301) to determine a queue (215) in the queue management engine (305) which is to receive a descriptor (217) describing the received packet's payload.
Type: Grant
Filed: May 7, 1999
Date of Patent: August 29, 2006
Assignee: Freescale Semiconductor, Inc.
Inventors: Thomas B. Brightman, Andrew T. Brown, John F. Brown, James A. Farrell, Andrew D. Funk, David J. Husak, Edward J. McLellan, Mark A. Sankey, Paul Schmitt, Donald A. Priore
-
Patent number: 7000090
Abstract: A center focussed SIMD array system including an SIMD array including a plurality of processing elements arranged in a number of columns and rows and having two mutually perpendicular axes of symmetry defining four quadrants; and a sequencer circuit for moving the data in each element to the next adjacent element towards one axis of symmetry until the data is in the elements along the one axis of symmetry and then moving the data in the elements along the one axis of symmetry to the next adjacent element towards the other axis of symmetry until the data is at the four central elements at the origin of the axes of symmetry.
Type: Grant
Filed: May 30, 2002
Date of Patent: February 14, 2006
Assignee: Analog Devices, Inc.
Inventors: Yosef Stein, Joshua A. Kablotsky
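The two-phase movement toward the center can be modelled in a few lines. This is a sketch under stated assumptions, not the patented sequencer: the abstract does not say how arriving data is combined, so summation is assumed here, the array is assumed to have even dimensions, and `center_focus` is a hypothetical name. The loop models the net effect of the shifts rather than every element moving in lockstep.

```python
def center_focus(grid):
    """Accumulate all values into the four central elements.

    Assumes even row and column counts and that data arriving at an
    element is added to what is already there (an assumption).
    """
    rows, cols = len(grid), len(grid[0])
    g = [row[:] for row in grid]
    mid_r, mid_c = rows // 2, cols // 2

    # Phase 1: move data column by column toward the vertical axis.
    for step in range(mid_c - 1):
        left, right = step, cols - 1 - step
        for r in range(rows):
            g[r][left + 1] += g[r][left]
            g[r][left] = 0
            g[r][right - 1] += g[r][right]
            g[r][right] = 0

    # Phase 2: along the two central columns, move toward the horizontal axis.
    for step in range(mid_r - 1):
        top, bottom = step, rows - 1 - step
        for c in (mid_c - 1, mid_c):
            g[top + 1][c] += g[top][c]
            g[top][c] = 0
            g[bottom - 1][c] += g[bottom][c]
            g[bottom][c] = 0
    return g


result = center_focus([[1] * 4 for _ in range(4)])
for row in result:
    print(row)
# [0, 0, 0, 0]
# [0, 4, 4, 0]
# [0, 4, 4, 0]
# [0, 0, 0, 0]
```

Each central element ends up holding the total for its own quadrant, which is why the four quadrants can be reduced at the same time.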
-
Patent number: 7000022
Abstract: Frame-based streaming data flows through a graph of multiple interconnected processing modules. The modules have a set of performance parameters whose values specify the sensitivity of each module to the selection of certain resources of a system. A user specifies overall goals for an actual graph for processing a given type of data for a particular purpose. A flow manager constructs the graph as a sequence of module interconnections required for processing the data, in response to the parameter values of the individual modules in the graph in view of the goals for the overall graph as a whole, and divides it into pipes each having one or more modules and each assigned to a memory manager for handling data frames in the pipe.
Type: Grant
Filed: June 7, 2004
Date of Patent: February 14, 2006
Assignee: Microsoft Corporation
Inventors: Rafael S. Lisitsa, George H. J. Shaw, Dale A. Sather, Bryan A. Woodruff
-
Patent number: 6993764
Abstract: A computer implemented method schedules processor jobs on a network of parallel machine processors or distributed system processors. Control information communications generated by each process performed by each processor during a defined time interval are accumulated in buffers, where adjacent time intervals are separated by strobe intervals for a global exchange of control information. A global exchange of the control information communications at the end of each defined time interval is performed during an intervening strobe interval so that each processor is informed by all of the other processors of the number of incoming jobs to be received by each processor in a subsequent time interval.
Type: Grant
Filed: June 28, 2001
Date of Patent: January 31, 2006
Assignee: The Regents of the University of California
Inventors: Fabrizio Petrini, Wu-chun Feng
-
Patent number: 6993639
Abstract: Embodiments of the invention relate to a processing cell for use in computing systems. Generally, a processing cell generates remote instructions to be received and processed by at least one other processing cell. A processing cell may include a program counter, an instruction memory, and appropriate elements such as a branch lookup, a branch unit, etc. Alternatively, the processing cell may include a state machine that replaces the program counter and the instruction memory. Embodiments of the invention are able to support the VLIW mode, the MIMD mode, a mixture of both modes of execution, etc.
Type: Grant
Filed: April 1, 2003
Date of Patent: January 31, 2006
Assignee: Hewlett-Packard Development Company, L.P.
Inventors: Michael S. Schlansker, Boon Seong Ang
-
Patent number: 6976150
Abstract: A scalable processing system includes a memory device having a plurality of executable program instructions, wherein each of the executable program instructions includes a timetag data field indicative of the nominal sequential order of the associated executable program instructions. The system also includes a plurality of processing elements, which are configured and arranged to receive executable program instructions from the memory device, wherein each of the processing elements executes executable instructions having the highest priority as indicated by the state of the timetag data field.
Type: Grant
Filed: April 6, 2001
Date of Patent: December 13, 2005
Assignee: The Board of Governors for Higher Education, State of Rhode Island and Providence Plantations
Inventors: Augustus K. Uht, David Morano, David Kaeli
-
Patent number: 6967950
Abstract: In a network of digital signal processor nodes connected in a peer-to-peer relationship, a data packet sent to a node causes a return transmission from that node. The requester digital signal processor sends a data packet to a target digital signal processor. Upon arrival at the target digital signal processor, its receiver drives the arriving request packet into an I/O memory and triggers a transmitter interrupt. Next, the pull interrupt causes the transmitter to execute on a next packet boundary the pull request packet. Finally, the execution of the pull request causes the transmitter to pull a portion of the local I/O memory and send it back to the requester digital signal processor. The same physical portion of the I/O memory is overlaid with two logical uses, a receiver channel and a transmitter code block.
Type: Grant
Filed: July 13, 2001
Date of Patent: November 22, 2005
Assignee: Texas Instruments Incorporated
Inventors: Peter Galicki, Cheryl S. Shepherd, Jonathan H. Thorn
-
Patent number: 6954842
Abstract: General purpose flags (ACFs) are defined and encoded utilizing a hierarchical one-, two- or three-bit encoding. Each added bit provides a superset of the previous functionality. With condition combination, a sequential series of conditional branches based on complex conditions may be avoided and complex conditions can then be used for conditional execution. ACF generation and use can be specified by the programmer. By varying the number of flags affected, conditional operation parallelism can be widely varied, for example, from mono-processing to octal-processing in VLIW execution, and across an array of processing elements (PEs). Multiple PEs can generate condition information at the same time with the programmer being able to specify a conditional execution in one processor based upon a condition generated in a different processor using the communications interface between the processing elements to transfer the conditions.
Type: Grant
Filed: August 28, 2003
Date of Patent: October 11, 2005
Assignee: PTS Corporation
Inventors: Thomas L. Drabenstott, Gerald G. Pechanek, Edwin F. Barry, Charles W. Kurak, Jr.
-
Patent number: 6954085
Abstract: A reconfigurable logic array (RLA) system (104) that includes an RLA (108) and a programmer (112) for reprogramming the RLA on a cyclical basis. A function (F) requiring a larger amount of logic than contained in the RLA is partitioned into multiple functional blocks (FB1, FB2, FB3). The programmer contains software (144) that partitions the RLA into a function region FR located between two storage regions SR1, SR2. The programmer then programs the function region sequentially with the functional blocks of the function so that the functional blocks process in alternating directions between the storage regions. While the programmer is re-configuring the function region with the next functional block and re-configuring one of the storage regions for receiving the output of the next functional block, data being passed from the current functional block to the next functional block is held in the other storage region.
Type: Grant
Filed: October 13, 2003
Date of Patent: October 11, 2005
Assignee: International Business Machines Corporation
Inventors: Kenneth J. Goodnow, Clarence R. Ogilvie, Christopher B. Reynolds, Jack R. Smith, Sebastian T. Ventrone
-
Patent number: 6948048
Abstract: A method for exchanging information within a mesh network that has an array of nodes defined by four quadrants. The method includes the initial step of exchanging information from a set of nodes in one quadrant to a set of nodes located in an adjacent quadrant. The exchange of information simultaneously occurs in both a vertical and horizontal direction within the array. Information is then exchanged between nodes within the same quadrant and subquadrants.
Type: Grant
Filed: July 2, 2002
Date of Patent: September 20, 2005
Assignee: Intel Corporation
Inventors: Brent Baxter, Stuart Hawkinson, Satyanarayan Gupta
-
Patent number: 6920562
Abstract: An encryption mechanism tightly-couples hardware data encryption functions with software-based protocol decode processing within a pipelined processor of a programmable processing engine. Tight-coupling is achieved by a micro-architecture of the pipelined processor that allows encryption functions to be accessed as a novel encryption execution unit of the processor. Such coupling substantially reduces the latency associated with conventional hardware/software interfaces.
Type: Grant
Filed: December 18, 1998
Date of Patent: July 19, 2005
Assignee: Cisco Technology, Inc.
Inventors: Darren Kerr, John William Marshall
-
Patent number: 6904512
Abstract: A data flow processor includes a number of hardware units each having more than one mode. A plurality of hardware units may be connected together to implement a flow made up of a series of processes. The flows, initiated by a central processing unit, may proceed independently and substantially at their own pace. Thus, the flows may operate in parallel, independently with respect to one another. Each of the hardware units may be configured differently to operate with each of the different flows.
Type: Grant
Filed: June 13, 2003
Date of Patent: June 7, 2005
Assignee: Intel Corporation
Inventor: Randy R. Dunton
-
Patent number: 6874079
Abstract: Aspects of a method and system for digital signal processing within an adaptive computing engine are described. These aspects include a mini-matrix, the mini-matrix comprising a set of composite blocks, each composite block capable of executing a predetermined set of instructions. A sequencer is included for controlling the set of composite blocks and directing instructions among the set of composite blocks based on a data-flow graph. Further, a data network is included and transmits data to and from the set of composite blocks and to the sequencer, while a status network routes status word data resulting from instruction execution in the set of composite blocks. With the present invention, an effective combination of hardware resources is provided in a manner that provides multi-bit digital signal processing capabilities for an embedded system environment, particularly in an implementation of an adaptive computing engine.
Type: Grant
Filed: July 25, 2001
Date of Patent: March 29, 2005
Assignee: Quicksilver Technology
Inventor: Eugene B. Hogenauer
-
Patent number: 6859869
Abstract: A data processing system, wherein a data flow processor (DFP) integrated circuit chip is provided which comprises a plurality of orthogonally arranged homogeneously structured cells, each cell having a plurality of logically same and structurally identically arranged modules. The cells are combined and facultatively grouped using lines and columns and connected to the input/output ports of the DFP. A compiler programs and configures the cells, each by itself and facultatively-grouped, such that random logic functions and/or linkages among the cells can be realized. The manipulation of the DFP configuration is performed during DFP operation such that modification of function parts (MACROs) of the DFP can take place without requiring other function parts to be deactivated or being impaired.
Type: Grant
Filed: April 12, 1999
Date of Patent: February 22, 2005
Assignee: PACT XPP Technologies AG
Inventor: Martin Vorbach
-
Patent number: 6836839
Abstract: The present invention concerns a new category of integrated circuitry and a new methodology for adaptive or reconfigurable computing. The preferred IC embodiment includes a plurality of heterogeneous computational elements coupled to an interconnection network. The plurality of heterogeneous computational elements include corresponding computational elements having fixed and differing architectures, such as fixed architectures for different functions such as memory, addition, multiplication, complex multiplication, subtraction, configuration, reconfiguration, control, input, output, and field programmability. In response to configuration information, the interconnection network is operative in real-time to configure and reconfigure the plurality of heterogeneous computational elements for a plurality of different functional modes, including linear algorithmic operations, non-linear algorithmic operations, finite state machine operations, memory operations, and bit-level manipulations.
Type: Grant
Filed: March 22, 2001
Date of Patent: December 28, 2004
Assignee: Quicksilver Technology, Inc.
Inventors: Paul L. Master, Eugene Hogenauer, Walter James Scheuermann
-
Patent number: 6801202
Abstract: A method and computer graphics system capable of implementing multiple pipelines for the parallel processing of graphics data. For certain data, a requirement may exist that the data be processed in order. The graphics system may use a set of tokens to reliably switch between ordered and unordered data modes. Furthermore, the graphics system may be capable of super-sampling and performing real-time convolution. In one embodiment, the computer graphics system may comprise a graphics processor, a sample buffer, and a sample-to-pixel calculation unit. The graphics processor may be configured to receive graphics data and to generate a plurality of samples for each of a plurality of frames. The sample buffer, which is coupled to the graphics processor, may be configured to store the samples. The sample-to-pixel calculation unit is programmable to generate a plurality of output pixels by filtering the rendered samples using a filter.
Type: Grant
Filed: June 28, 2001
Date of Patent: October 5, 2004
Assignee: Sun Microsystems, Inc.
Inventors: Scott R. Nelson, Lisa Grenier, Michael F. Deering
-
Patent number: 6791551
Abstract: A system and method for synchronizing image display and buffer swapping in a multiple processor-multiple display environment. In a master-slave dichotomy, one processor or system is deemed the master and the others act as slaves. The master generates signals used to control vertical retrace and buffer swapping for itself and the slaves. In addition, a synchronization signal generator is provided to synchronize a timing signal between the master and slave systems.
Type: Grant
Filed: November 27, 2001
Date of Patent: September 14, 2004
Assignee: Silicon Graphics, Inc.
Inventors: Shrijeet Mukherjee, Kanoj Sarcar, James Tornes
-
Patent number: 6789140
Abstract: The data processor for processing operation data stored in a memory connected to an external bus in the order of operations includes: an interface section for holding a parameter required for transfer of the operation data; an operation section receiving the operation data from the interface section for performing predetermined processing; and an operation memory for storing the operation data transferred. The interface section sequentially transfers the operation data from the memory connected to the external bus to the operation memory using the parameter, and sequentially transfers the operation data from the operation memory to the operation section.
Type: Grant
Filed: August 8, 2002
Date of Patent: September 7, 2004
Assignee: Matsushita Electric Industrial Co., Ltd.
Inventors: Atsushi Kotani, Yoshiteru Mino
-
Patent number: 6754802
Abstract: A single chip active memory includes a plurality of memory stripes, each coupled to a full word interface and one of a plurality of processing element (PE) sub-arrays. The large number of couplings between a PE sub-array and its associated memory stripe are managed by placing the PE sub-arrays so that their data paths run at right angles to the data paths of the plurality of memory stripes. The data lines exiting the memory stripes are run across the PE sub-arrays on one metal layer. At the appropriate locations, the data lines are coupled to another orthogonally oriented metal layer to complete the coupling between the memory stripe and its associated PE sub-array. The plurality of PE sub-arrays are mapped to form a large logical array, in which each PE is coupled to four other PEs. Physically distant PEs are coupled using current mode differential logical couplings and drivers to ensure good signal integrity at high operational speeds. Each PE contains a small DRAM register array.
Type: Grant
Filed: August 25, 2000
Date of Patent: June 22, 2004
Assignee: Micron Technology, Inc.
Inventor: Graham Kirsch
-
Patent number: 6751723
Abstract: A system-on-a-chip integrated circuit has a field programmable gate array core having logic clusters, static random access memory modules, and routing resources, a field programmable gate array virtual component interface translator having inputs and outputs, wherein the inputs are connected to the field programmable gate array core, a microcontroller, a microcontroller virtual component interface translator having inputs and outputs, wherein the inputs are connected to the microcontroller, a system bus connected to the outputs of the field programmable gate array virtual component interface translator and also to the outputs of said microcontroller virtual component interface translator, and direct connections between the microcontroller and the routing resources of the field programmable gate array core.
Type: Grant
Filed: September 2, 2000
Date of Patent: June 15, 2004
Assignee: Actel Corporation
Inventors: Arunangshu Kundu, Arnold Goldfein, William C. Plants, David Hightower
-
Patent number: 6721822
Abstract: A variety of advantageous mechanisms for improved data transfer control within a data processing system are described. A DMA controller is described which is implemented as a multiprocessing transfer engine supporting multiple transfer controllers which may work independently or in cooperation to carry out data transfers, with each transfer controller acting as an autonomous processor, fetching and dispatching DMA instructions to multiple execution units. In particular, mechanisms for initiating and controlling the sequence of data transfers are provided, as are processes for autonomously fetching DMA instructions which are decoded sequentially but executed in parallel.
Type: Grant
Filed: September 24, 2002
Date of Patent: April 13, 2004
Assignee: PTS Corporation
Inventors: Edwin Frank Barry, Edward A. Wolff
-
Patent number: 6711665Abstract: An associative processor includes a plurality of arrays of content addressable memory (CAM) cells and a plurality of tags registers in a tags logic block. Different tags registers are associated with different CAM cell arrays at will, to support parallel execution of the same or different arithmetical operations on two or more CAM cell arrays, and to support pipelined arithmetical operations by having two CAM cell arrays share a tags register to transfer data from one CAM cell array to another using appropriate compare and write operations. All the CAM cell arrays share the same mask and pattern registers. Preferably, at least one tags register is located physically between two of the CAM cell arrays.Type: GrantFiled: May 17, 2000Date of Patent: March 23, 2004Assignee: Neomagic Israel Ltd.Inventors: Avidan Akerib, Josh Meir, Ronen Stilkol, Yaron Serfati
-
Patent number: 6609188Abstract: A data flow processor includes a number of hardware units each having more than one mode. A plurality of hardware units may be connected together to implement a flow made up of a series of processes. The flows, initiated by a central processing unit, may proceed independently and substantially at their own pace. Thus, the flows may operate in parallel, independently with respect to one another. Each of the hardware units may be configured differently to operate with each of the different flows.Type: GrantFiled: March 31, 2000Date of Patent: August 19, 2003Assignee: Intel CorporationInventor: Randy R. Dunton
-
Patent number: 6591357Abstract: A method and an apparatus for configuring arbitrary sized data paths comprising multiple context processing elements (MCPEs) are provided. Multiple MCPEs may be chained to form wider-word data paths of arbitrary widths, wherein a first ALU serves as the most significant byte (MSB) of the data path while a second ALU serves as the least significant byte (LSB) of the data path. The ALUs of the data path are coupled using a left-going, or forward, carry chain for transmitting at least one carry bit from the LSB ALU to the MSB ALU. The MSB ALU comprises configurable logic for generating at least one signal in response to a carry bit received over the left-going carry chain, the at least one signal comprising a saturation signal and a saturation value. The MCPEs of the data path use configurable logic to manipulate a resident bit sequence in response to the saturation signal transmitted thereby reconfiguring, or changing the operation of, the data path in response to the saturation signal.Type: GrantFiled: February 26, 2001Date of Patent: July 8, 2003Assignee: Broadcom CorporationInventor: Ethan A. Mirsky
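The chained data path in the abstract above can be illustrated with a small software model. This is only a sketch under stated assumptions, not the patented circuit: the function name `chained_add`, the 8-bit slice width, and the choice of unsigned saturation (all slices forced to all-ones) are hypothetical choices made for illustration.

```python
def chained_add(a, b, n_slices=2, slice_bits=8, saturate=True):
    """Add two unsigned words using n_slices narrow ALU slices (hypothetical model).

    The carry ripples from the least significant slice toward the most
    significant slice. A carry out of the most significant slice acts as the
    saturation signal; the saturation value assumed here is all-ones.
    """
    mask = (1 << slice_bits) - 1
    carry = 0
    slices = []
    for i in range(n_slices):  # slice 0 is the least significant
        sa = (a >> (i * slice_bits)) & mask
        sb = (b >> (i * slice_bits)) & mask
        total = sa + sb + carry
        slices.append(total & mask)
        carry = total >> slice_bits  # forwarded to the next slice
    saturated = bool(carry)
    if saturate and saturated:
        # Every slice replaces its resident bits with the saturation value.
        slices = [mask] * n_slices
    result = 0
    for i, s in enumerate(slices):
        result |= s << (i * slice_bits)
    return result, saturated


print(chained_add(0x00FF, 0x0001))  # carry crosses the slice boundary
print(chained_add(0xFFFF, 0x0001))  # overflow, result clamps
```

With `saturate=False` the same overflow simply wraps around, which makes the effect of the saturation signal easy to compare.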
-
Patent number: 6581152Abstract: An indirect VLIW (iVLIW) architecture is described which contains a minimum of two instruction memories. The first instruction memory (SIM) contains short-instruction-words (SIWs) of a fixed length. The second instruction memory (VIM) contains very-long-instruction-words (VLIWs) which allow execution of multiple instructions in parallel. Each SIW may be fetched and executed as an independent instruction by one of the available execution units. A special class of SIW is used to reference the VIM indirectly to either execute or load a specified VLIW instruction (called an “XV” instruction for “eXecute VLIW”, or LV for “Load VLIW”). In these cases, the SIW instruction specifies how the location of the VLIW is to be accessed. Other aspects of this approach relate to the application of data memory addressing techniques for execution or loading of VLIWs that parallel the addressing modes used for data memory accesses.Type: GrantFiled: February 11, 2002Date of Patent: June 17, 2003Assignee: BOPS, Inc.Inventors: Edwin F. Barry, Gerald G. Pechanek
-
Patent number: 6526498Abstract: A method and an apparatus for retiming in a network of multiple context processing elements are provided. A programmable delay element is configured to programmably delay signals between a number of multiple context processing elements of an array without requiring a multiple context processing element to implement the delay. The output of a first multiple context processing element is coupled to a first multiplexer and to the input of a number of serially connected delay registers. The output of each of the serially connected delay registers is coupled to the input of a second multiplexer. The output of the second multiplexer is coupled to the input of the first multiplexer, and the output of the first multiplexer is coupled to a second multiple context processing element. The first and second multiplexers are provided with at least one set of data representative of at least one configuration memory context of a multiple context processing element.Type: GrantFiled: February 15, 2000Date of Patent: February 25, 2003Assignee: Broadcom CorporationInventors: Ethan Mirsky, Robert French, Ian Eslick
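The two-multiplexer delay structure described above can be modeled in a few lines. This is a behavioral sketch under assumptions, not the patented hardware: the class name `ProgrammableDelay` and the `bypass` and `tap` parameters are hypothetical stand-ins for the select values that, in the abstract, come from a configuration memory context.

```python
class ProgrammableDelay:
    """Behavioral model of a chain of delay registers with two multiplexers."""

    def __init__(self, n_registers):
        # Serially connected delay registers, all cleared at start.
        self.regs = [0] * n_registers

    def step(self, value, bypass, tap):
        """Advance one clock cycle and return the value seen by the receiver.

        bypass: select input of the first multiplexer (True = undelayed path).
        tap:    select input of the second multiplexer (which register output).
        """
        delayed = self.regs[tap]              # second multiplexer
        out = value if bypass else delayed    # first multiplexer
        # Clock edge: the new value enters the chain, older values shift down.
        self.regs = [value] + self.regs[:-1]
        return out


delay = ProgrammableDelay(n_registers=3)
# Tap 1 delays the stream by two cycles in this model.
print([delay.step(v, bypass=False, tap=1) for v in [1, 2, 3, 4, 5]])
```

In this model, tap `k` yields a delay of `k + 1` cycles, and `bypass=True` passes the sender's output straight through, so neither processing element has to spend its own resources on the delay.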
-
Patent number: 6466828Abstract: Featured is a device controller in a system having a multiplicity of such controllers and a conveying system and method for controlling a multiplicity of devices using such controllers. Each controller includes a plurality of bi-directional communications ports, a processor that processes information and provides outputs, where at least one output controls the device, and an applications program for execution within the processor that includes instructions and criteria for processing the information and providing the processor outputs. Specifically, the applications program includes instructions and criteria for communicating information between and among controllers; instructions and criteria for processing information received by a controller; and instructions and criteria for modifying the operation of a device responsive to the communicated information. For a conveying system having a multiplicity of conveying sections, a controller is provided for each section.Type: GrantFiled: September 2, 1999Date of Patent: October 15, 2002Assignee: Quantum Conveyor Systems, Inc.Inventors: Hans J. Lem, Richard J. Bowman
-
Patent number: 6460127Abstract: An associative signal processing apparatus for processing a plurality of samples of an incoming signal in parallel, the apparatus comprising: (a) an array of processors, each processor including a multiplicity of associative memory cells, the memory cells being operative to perform: (i) compare operations, in parallel, on the plurality of samples of the incoming signal; and (ii) write operations, in parallel, on the plurality of samples of the incoming signal; and (b) an I/O buffer register including a multiplicity of associative memory cells, the register being operative to: (i) input the plurality of samples of the incoming signal to the array of processors in parallel by having the I/O buffer register memory cells perform at least one associative compare operation and the array memory cells perform at least one associative write operation; and (ii) receive, in parallel, a plurality of processed samples from the array of processors by having the array memory cells perform at least one associative compare oType: GrantFiled: October 26, 1998Date of Patent: October 1, 2002Assignee: Neomagic Israel Ltd.Inventor: Avidan Akerib
-
Patent number: 6456620Abstract: Disclosed is a method for all-to-all personalized exchange for a class of multistage interconnecting networks (MINs). The method is based on a Latin square matrix corresponding to a set of admissible permutations of a multistage interconnecting network. Disclosed are first and second methods for constructing a Latin square matrix used in the personalized exchange technique. Also disclosed is a generic method for decomposing all-to-all personalized exchange patterns into admissible permutations to form the Latin square matrix for self-routing networks which are a subclass of the MINs.Type: GrantFiled: February 17, 1999Date of Patent: September 24, 2002Assignees: Verizon Laboratories Inc., The University of VermontInventors: Jianchao Wang, Yuanyuan Yang
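The role of a Latin square in scheduling an all-to-all personalized exchange can be shown with a toy example. This sketch uses a simple cyclic Latin square, which is an assumption made for illustration and is not one of the constructions disclosed in the patent; it also does not check whether each row is an admissible permutation for any particular multistage network. All function names are hypothetical.

```python
def cyclic_latin_square(n):
    """Row r sends node i to node (i + r) mod n; each row is one round."""
    return [[(i + r) % n for i in range(n)] for r in range(n)]


def is_latin_square(square):
    """Check that every row and every column contains each symbol once."""
    n = len(square)
    symbols = set(range(n))
    rows_ok = all(set(row) == symbols for row in square)
    cols_ok = all({square[r][c] for r in range(n)} == symbols for c in range(n))
    return rows_ok and cols_ok


def all_to_all_exchange(messages):
    """Deliver messages[src][dst] to every dst in n conflict-free rounds."""
    n = len(messages)
    received = [[None] * n for _ in range(n)]
    for permutation in cyclic_latin_square(n):
        # Within a round, each node sends once and each node receives once.
        for src, dst in enumerate(permutation):
            received[dst][src] = messages[src][dst]
    return received


n = 4
messages = [[(src, dst) for dst in range(n)] for src in range(n)]
received = all_to_all_exchange(messages)
print(is_latin_square(cyclic_latin_square(n)))
print(received[2])
```

Because each row is a permutation, no destination receives two messages in the same round, and because each column also contains every symbol, every source reaches every destination exactly once over the `n` rounds.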
-
Patent number: 6442627Abstract: An output FIFO data transfer control device can comprise a geometric arithmetic core including one integer processing unit or IPU and a plurality of floating-point processing units or FPUs. Each processing unit includes an intermediate buffer or data output buffer for storing data on an arithmetic result. When an instruction of data transfer from at least one of the plurality of processing units to one output FIFO is issued, a write/read pointer generating unit generates a write pointer identifying a specific location where data on an arithmetic result associated with the instruction is to be stored in the intermediate buffer of at least one of the plurality of processing units. The write/read pointer generating unit also generates a read pointer identifying a specific location where data is to be read out of the intermediate buffer of at least one of the plurality of processing units.Type: GrantFiled: December 3, 1999Date of Patent: August 27, 2002Assignee: Mitsubishi Denki Kabushiki KaishaInventors: Hiroyasu Negishi, Junko Kobara, Yoshitsugu Inoue, Hiroyuki Kawai, Keijiro Yoshimatsu, Nelson Chan, Robert Streitenberger
-
Patent number: 6425068Abstract: An expanded arithmetic and logic unit (EALU) with special extra functions is integrated into a configurable unit for performing data processing operations. The EALU is configured by a function register, which greatly reduces the volume of data required for configuration. The cell can be cascaded freely over a bus system, the EALU being decoupled from the bus system over input and output registers. The output registers are connected to the input of the EALU to permit serial operations. A bus control unit is responsible for the connection to the bus, which it connects according to the bus register. The unit is designed so that distribution of data to multiple receivers (broadcasting) is possible. A synchronization circuit controls the data exchange between multiple cells over the bus system. The EALU, the synchronization circuit, the bus control unit, and registers are designed so that a cell can be reconfigured on site independently of the cells surrounding it.Type: GrantFiled: October 8, 1997Date of Patent: July 23, 2002Assignee: PACT GmbHInventors: Martin Vorbach, Robert Münch
-
Publication number: 20020083297Abstract: A multi-thread packet processor which processes data packets using a multi-threaded pipelined machine, wherein no instruction depends on a preceding instruction because each instruction in the pipeline is executed for a different thread. The multi-thread packet processor transfers a data packet from a flexible data input buffer to a packet task manager, dispatches the data packet from the packet task manager to a multi-threaded pipelined analysis machine, classifies the data packet in the analysis machine, modifies and forwards the data packet in a packet manipulator. The multi-thread packet processor includes an analysis machine having multiple pipelines, wherein one pipeline is dedicated to directly manipulating individual data bits of a bit field, a packet task manager, a packet manipulator, a global access bus including a master request bus and a slave request bus separated from each other and pipelined, an external memory engine, and a hash engine.Type: ApplicationFiled: December 22, 2000Publication date: June 27, 2002Inventors: Richard P. Modelski, Michael J. Craren
-
Patent number: 6356994Abstract: An indirect VLIW (iVLIW) architecture is described which contains a minimum of two instruction memories. The first instruction memory (SIM) contains short-instruction-words (SIWs) of a fixed length. The second instruction memory (VIM) contains very-long-instruction-words (VLIWs) which allow execution of multiple instructions in parallel. Each SIW may be fetched and executed as an independent instruction by one of the available execution units. A special class of SIW is used to reference the VIM indirectly to either execute or load a specified VLIW instruction (called an “XV” instruction for “eXecute VLIW”, or LV for “Load VLIW”). In these cases, the SIW instruction specifies how the location of the VLIW is to be accessed. Other aspects of this approach relate to the application of data memory addressing techniques for execution or loading of VLIWs that parallel the addressing modes used for data memory accesses.Type: GrantFiled: July 9, 1999Date of Patent: March 12, 2002Assignee: BOPS, IncorporatedInventors: Edwin F. Barry, Gerald G. Pechanek
-
Patent number: 6351798Abstract: The present invention provides an address resolution method for use in a multiprocessor system with distributed shared memory. The method allows users to change a memory configuration and a system configuration to increase system operation flexibility and to isolate errors. A cell controller indexes into an address resolution table using the high-order part of a processor-specified address. A write protection flag specifies whether to permit write access from other cells. An attempt to write-access a cell inhibited for write access causes a logical circuit to output an access exception signal.Type: GrantFiled: June 15, 1999Date of Patent: February 26, 2002Assignee: NEC CorporationInventor: Fumio Aono
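The table lookup and write-protection check described above can be sketched as follows. This is an illustrative model only: the names `Entry`, `AccessException`, and `resolve`, the 12-bit low-order offset, and the table layout are all assumptions and are not taken from the patent.

```python
from dataclasses import dataclass


class AccessException(Exception):
    """Raised when a cell write-accesses memory that inhibits remote writes."""


@dataclass
class Entry:
    cell: int              # cell that owns this region of memory
    base: int              # base address of the region inside that cell
    write_protected: bool  # True = writes from other cells are not permitted


def resolve(table, address, requester_cell, is_write, low_bits=12):
    """Map a processor-specified address to (owning cell, local address)."""
    index = address >> low_bits              # high-order part indexes the table
    offset = address & ((1 << low_bits) - 1)
    entry = table[index]
    if is_write and entry.write_protected and requester_cell != entry.cell:
        raise AccessException(f"cell {requester_cell} may not write to cell {entry.cell}")
    return entry.cell, entry.base + offset


table = [
    Entry(cell=0, base=0x0000, write_protected=False),
    Entry(cell=1, base=0x4000, write_protected=True),
]
print(resolve(table, 0x1010, requester_cell=0, is_write=False))
```

Changing a table entry remaps a whole address region to a different cell, which is one way such a table could support reconfiguration; setting the flag keeps a fault in one cell from overwriting another cell's memory.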