Concurrent Patents (Class 712/9)
-
Patent number: 12156101Abstract: Technologies are described for a distributed audio system configured to switch operating modes. In some examples, the distributed audio system may include a device configured to execute first instructions from a first memory corresponding to operation in a first mode, receive a signal to switch modes, and, responsive to receiving the signal, switch a select signal corresponding to the first memory to select a second memory. Responsive to selecting the second memory, the device may be configured to execute second instructions from the second memory corresponding to operation in a second mode. The device may enter a registration mode to register portable parts to the device.Type: GrantFiled: May 5, 2022Date of Patent: November 26, 2024Assignee: Lightspeed Technologies, Inc.Inventors: David M. Jordahl, Robert Paul D'Angelo, Sr., Baiqiang Ren, Michael A. Frost, Jonathan Umfleet, Shaun Fagan
-
Patent number: 11360750Abstract: A method for facilitating a play of a legacy game is described. The method includes receiving a user input during the play of the legacy game, determining whether one or more blocks of code for servicing the user input are cached, and accessing one or more instructions of a legacy game code upon determining that the one or more blocks of code are not cached. The method further includes compiling the one or more blocks of code from the one or more instructions of the legacy game code, caching the one or more blocks of code, and executing the one or more blocks of code to display a virtual environment.Type: GrantFiled: January 28, 2021Date of Patent: June 14, 2022Assignee: Sony Interactive Entertainment LLCInventors: Ernesto Corvi, George Weising, David Thach
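The check-cache, compile-on-miss, then execute flow in this abstract can be illustrated with a minimal sketch. It is not the patented implementation: `BlockCache`, `compile_block`, the address-keyed dictionary and the string "instructions" are all hypothetical stand-ins chosen for illustration.

```python
from typing import Callable, Dict, List

# Hypothetical stand-ins: legacy code maps an address to instruction strings,
# and "compiling" merely wraps those strings in a Python callable.
LegacyCode = Dict[int, List[str]]
Block = Callable[[], str]


def compile_block(instructions: List[str]) -> Block:
    """Pretend compiler: returns a callable that 'executes' the instructions."""
    joined = ";".join(instructions)
    return lambda: f"ran[{joined}]"


class BlockCache:
    """Caches compiled blocks so each one is compiled at most once."""

    def __init__(self, legacy_code: LegacyCode) -> None:
        self.legacy_code = legacy_code
        self.blocks: Dict[int, Block] = {}
        self.compile_count = 0

    def service_input(self, address: int) -> str:
        block = self.blocks.get(address)
        if block is None:
            # Cache miss: read the legacy instructions, compile, then cache.
            block = compile_block(self.legacy_code[address])
            self.blocks[address] = block
            self.compile_count += 1
        # Hit or miss, the (now cached) block is executed.
        return block()


cache = BlockCache({0x100: ["load a", "add b"], 0x200: ["jump start"]})
first = cache.service_input(0x100)   # compiles, caches, executes
second = cache.service_input(0x100)  # served from the cache
```

In this sketch the second call for the same address reuses the cached block, so `compile_count` stays at 1.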
-
Patent number: 10997116Abstract: A computing system is described herein that expedites deep neural network (DNN) operations or other processing operations using a hardware accelerator. The hardware accelerator, in turn, includes a tensor-processing engine that works in conjunction with a scalar-processing unit (SPU). The tensor-processing engine handles various kinds of tensor-based operations required by the DNN, such as multiplying vectors by matrices, combining vectors with other vectors, transforming individual vectors, etc. The SPU performs scalar-based operations, such as forming the reciprocal of a scalar, generating the square root of a scalar, etc. According to one illustrative implementation, the computing system uses the same vector-based programmatic interface to interact with both the tensor-processing engine and the SPU.Type: GrantFiled: August 6, 2019Date of Patent: May 4, 2021Assignee: Microsoft Technology Licensing, LLCInventors: Steven Karl Reinhardt, Joseph Anthony Mayer, II, Dan Zhang
-
Patent number: 9792087Abstract: An embodiment of a system and method for performing a numerical operation on input data in a hybrid floating-point format includes representing input data as a sign bit, exponent bits, and mantissa bits. The exponent bits are represented as an unsigned integer including an exponent bias, and a signed numerical value of zero is represented as a first reserved combination of the mantissa bits and the exponent bits. Each of all other combinations of the mantissa bits and the exponent bits represents a real finite non-zero number. The mantissa bits are operated on with a “one” bit before a radix point for the all other combinations of the mantissa bits and the exponent bits.Type: GrantFiled: April 20, 2012Date of Patent: October 17, 2017Assignee: FUTUREWEI TECHNOLOGIES, INC.Inventors: Yuanbin Guo, Tong Sun, Weizhong Chen
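A decoding sketch may help make the format concrete. The abstract does not give field widths, the bias, or the reserved zero pattern, so the 1/5/10-bit layout, the bias of 15, and the choice of all-zero exponent and mantissa bits as the reserved combination are assumptions made here purely for illustration.

```python
import math

# Assumed layout (not from the abstract): 1 sign bit, 5 exponent bits,
# 10 mantissa bits, exponent bias 15.
EXP_BITS = 5
MAN_BITS = 10
BIAS = 15


def decode_hybrid(word: int) -> float:
    """Decode a 16-bit word in the assumed hybrid format."""
    sign = (word >> (EXP_BITS + MAN_BITS)) & 0x1
    exponent = (word >> MAN_BITS) & ((1 << EXP_BITS) - 1)
    mantissa = word & ((1 << MAN_BITS) - 1)

    # Assumed reserved combination: all exponent and mantissa bits zero
    # encode a signed zero.
    if exponent == 0 and mantissa == 0:
        return -0.0 if sign else 0.0

    # Every other combination is a finite non-zero number with an implicit
    # "one" before the radix point (no subnormals, infinities or NaNs).
    magnitude = (1.0 + mantissa / (1 << MAN_BITS)) * 2.0 ** (exponent - BIAS)
    return -magnitude if sign else magnitude


one = decode_hybrid(0x3C00)       # exponent 15, mantissa 0
largest_exp = decode_hybrid(0x7C00)  # exponent 31, mantissa 0
smallest = decode_hybrid(0x0001)  # exponent 0, mantissa 1
neg_zero = decode_hybrid(0x8000)
```

Under these assumptions, bit patterns that an IEEE 754 half-precision decoder would treat as infinity or as a subnormal decode to ordinary finite values instead.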
-
Patent number: 9720696Abstract: Embodiments of the present invention provide systems and methods for mapping the architected state of one or more threads to a set of distributed physical register files to enable independent execution of one or more threads in a multiple slice processor. In one embodiment, a system is disclosed including a plurality of dispatch queues which receive instructions from one or more threads and an even number of parallel execution slices, each parallel execution slice containing a register file. A routing network directs an output from the dispatch queues to the parallel execution slices and the parallel execution slices independently execute the one or more threads.Type: GrantFiled: September 30, 2014Date of Patent: August 1, 2017Assignee: International Business Machines CorporationInventors: Sam G. Chu, Markus Kaltenbach, Hung Q. Le, Jentje Leenstra, Jose E. Moreira, Dung Q. Nguyen, Brian W. Thompto
-
Patent number: 9715385Abstract: Vector exception handling is facilitated. A vector instruction is executed that operates on one or more elements of a vector register. When an exception is encountered during execution of the instruction, a vector exception code is provided that indicates a position within the vector register that caused the exception. The vector exception code also includes a reason for the exception.Type: GrantFiled: January 23, 2013Date of Patent: July 25, 2017Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATIONInventors: Jonathan D. Bradbury, Michael K. Gschwind, Eric M. Schwarz, Timothy J. Slegel
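As a rough software analogy for the exception code described here, the sketch below stops an element-wise operation at the first faulting element and reports both its position and a reason. The names `VectorExceptionCode` and `vector_divide`, and the use of division as the example operation, are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class VectorExceptionCode:
    """Hypothetical exception code: faulting element position plus a reason."""
    index: int
    reason: str


def vector_divide(
    a: List[float], b: List[float]
) -> Tuple[List[float], Optional[VectorExceptionCode]]:
    """Divide element-wise; stop at the first element that raises an exception."""
    results: List[float] = []
    for i, (x, y) in enumerate(zip(a, b)):
        if y == 0:
            # Report where in the vector the exception occurred and why.
            return results, VectorExceptionCode(index=i, reason="divide-by-zero")
        results.append(x / y)
    return results, None


partial, vxc = vector_divide([8.0, 6.0, 4.0], [2.0, 0.0, 1.0])
```

Here `partial` holds the elements completed before the fault, and `vxc` identifies element 1 as the cause.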
-
Patent number: 8972689Abstract: A storage processor identifies latency of memory drives for different numbers of concurrent storage operations. The identified latency is used to identify debt limits for the number of concurrent storage operations issued to the memory drives. The storage processor may issue additional storage operations to the memory devices when the number of storage operations is within the debt limit. Storage operations may be deferred when the number of storage operations is outside the debt limit.Type: GrantFiled: February 2, 2011Date of Patent: March 3, 2015Assignee: Violin Memory, Inc.Inventor: Erik de la Iglesia
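One way to picture the debt-limit idea is sketched below. The abstract does not say how a limit is derived from the measured latencies, so the rule used here (the deepest queue depth whose latency stays under a threshold) is an assumption, as are the names `pick_debt_limit` and `Drive` and the sample latency figures.

```python
from collections import deque
from typing import Deque, Dict, Optional


def pick_debt_limit(latency_by_depth: Dict[int, float], max_latency: float) -> int:
    """Assumed policy: deepest concurrency whose measured latency is acceptable."""
    acceptable = [d for d, lat in latency_by_depth.items() if lat <= max_latency]
    return max(acceptable, default=0)


class Drive:
    """Issues operations while within the debt limit and defers the rest."""

    def __init__(self, debt_limit: int) -> None:
        self.debt_limit = debt_limit
        self.outstanding = 0
        self.deferred: Deque[str] = deque()

    def submit(self, op: str) -> bool:
        if self.outstanding < self.debt_limit:
            self.outstanding += 1
            return True          # issued immediately
        self.deferred.append(op)
        return False             # deferred until capacity frees up

    def complete(self) -> Optional[str]:
        """Retire one operation and promote a deferred one, if any."""
        self.outstanding -= 1
        if self.deferred:
            self.outstanding += 1
            return self.deferred.popleft()
        return None


# Made-up latency measurements (ms) for 1, 2, 4 and 8 concurrent operations.
limit = pick_debt_limit({1: 0.20, 2: 0.25, 4: 0.40, 8: 1.50}, max_latency=0.5)
drive = Drive(limit)
issued = [drive.submit(f"read{i}") for i in range(6)]
promoted = drive.complete()
```

With these sample figures the limit is 4, so the fifth and sixth reads are deferred until an earlier operation completes.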
-
Publication number: 20140359253Abstract: A processor may include a vector functional unit that supports concurrent operations on multiple data elements of a maximum element size. The functional unit may also support concurrent execution of multiple distinct vector program instructions, where the multiple vector instructions each operate on multiple data elements of less than the maximum element size.Type: ApplicationFiled: May 29, 2013Publication date: December 4, 2014Applicant: Apple Inc.Inventor: Jeffry E. Gonion
-
Patent number: 8868885Abstract: A device system and method for processing program instructions, for example, to execute intra vector operations. A fetch unit may receive a program instruction defining different operations on data elements stored at the same vector memory address. A processor may include different types of execution units each executing a different one of a predetermined plurality of elemental instructions. Each program instruction may be a combination of one or more of the elemental instructions. The processor may receive a vector of data elements stored non-consecutively at the same vector memory address to be processed by a same one of the elemental instructions and a vector of configuration values independently associated with executing the same elemental instruction on the non-consecutive data elements. At least two configuration values may be different to implement different operations by executing the same elemental instruction using the different configuration values on the vector of non-consecutive data elements.Type: GrantFiled: November 18, 2010Date of Patent: October 21, 2014Assignee: Ceva D.S.P. Ltd.Inventors: Yaakov Dekter, Michael Boukaya, Shai Shpigelblat, Moshe Steinberg
-
Publication number: 20140289496Abstract: Systems, apparatuses and methods for utilizing enhanced macroscalar predicate operations which take enhanced predicate operands that designate the element width and which elements are to be processed. The element width and the number of elements per vector are determined at run-time rather than being defined in the architectural definition of the instruction. This enables additional parallelism when processing smaller-sized data. The instruction performs the requested operation on the elements specified by the enhanced control predicate, assuming an element-width also specified by the enhanced control predicate, and returns the result as an enhanced predicate of the same element width.Type: ApplicationFiled: March 18, 2014Publication date: September 25, 2014Applicant: Apple Inc.Inventor: Jeffry E. Gonion
-
Publication number: 20140289498Abstract: Systems, apparatuses and methods for utilizing enhanced Macroscalar vector operations which take an enhanced predicate operand that designates the element width and which elements are to be processed. The element width and the number of elements per vector are determined at run-time rather than being defined in the architectural definition of the instruction. This enables additional parallelism when processing smaller-sized data. The instruction performs the requested operation on the elements specified by the enhanced predicate, assuming an element-width also specified by the enhanced predicate, and returns the result as a vector of elements of the same element width.Type: ApplicationFiled: March 18, 2014Publication date: September 25, 2014Applicant: Apple Inc.Inventor: Jeffry E. Gonion
-
Publication number: 20140289497Abstract: Systems, apparatuses and methods for utilizing enhanced Macroscalar comparison operations which take an enhanced predicate operand that designates the element width and which elements are to be processed. The element width and the number of elements per vector are determined at run-time rather than being defined in the architectural definition of the instruction. This enables additional parallelism when processing smaller-sized data. The instruction performs the requested operation on the elements specified by the enhanced predicate, assuming an element-width also specified by the enhanced predicate, and returns the result as an enhanced predicate corresponding to the result of the comparison.Type: ApplicationFiled: March 18, 2014Publication date: September 25, 2014Applicant: Apple Inc.Inventor: Jeffry E. Gonion
-
Patent number: 8775510Abstract: The invention provides, in one aspect, an improved system for data access comprising a file server that is coupled to a client device or application executing thereon via one or more networks. The server comprises static storage that is organized in one or more directories, each containing, zero, one or more files. The server also comprises a file system operable, in cooperation with a file system on the client device, to provide authorized applications executing on the client device access to those directories and/or files. Fast file server (FFS) software or other functionality executing on or in connection with the server responds to requests received from the client by transferring requested data to the client device over multiple network pathways. That data can comprise, for example, directory trees, files (or portions thereof), and so forth.Type: GrantFiled: January 31, 2013Date of Patent: July 8, 2014Assignee: PME IP Australia Pty LtdInventors: Malte Westerhoff, Detlev Stalling
-
Patent number: 8732359Abstract: When executing a graphical model of a dynamic system that includes two or more concurrently executing sets of operations, a processor is configured to create a first buffer and a second buffer within the executable graphical model. A first set of operations is configured to write data to the first buffer during a first execution instance of the first set of operations. The first set of operations is configured to write data to the second buffer during a second execution instance of the first thread. A second set of operations is configured to read the data from the first buffer during an instance of the second thread that executes contemporaneously with the second execution instance of the first set of operations. Determinations regarding access to the first buffer and second buffer by the first thread and second thread are self-contained within the first thread and second thread, respectively.Type: GrantFiled: December 7, 2012Date of Patent: May 20, 2014Assignee: The MathWorks, Inc.Inventors: James E. Carrick, Biao Yu
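The two-buffer handoff can be approximated in a few lines. This sketch assumes the writer and reader run at the same rate and that each derives its buffer index from its own execution count, which is one possible reading of the "self-contained" determinations; the classes `Writer` and `Reader` are hypothetical.

```python
from typing import List, Optional


class Writer:
    """Alternates between the two buffers based only on its own instance count."""

    def __init__(self, buffers: List[Optional[str]]) -> None:
        self.buffers = buffers
        self.instance = 0

    def step(self, value: str) -> None:
        self.buffers[self.instance % 2] = value
        self.instance += 1


class Reader:
    """Reads the buffer the writer filled one instance earlier."""

    def __init__(self, buffers: List[Optional[str]]) -> None:
        self.buffers = buffers
        self.instance = 0

    def step(self) -> Optional[str]:
        value = self.buffers[self.instance % 2]
        self.instance += 1
        return value


buffers: List[Optional[str]] = [None, None]
writer, reader = Writer(buffers), Reader(buffers)

writer.step("a")        # first instance writes buffer 0
writer.step("b")        # second instance writes buffer 1 ...
read_1 = reader.step()  # ... while the reader takes buffer 0
writer.step("c")        # third instance reuses buffer 0
read_2 = reader.step()  # reader takes buffer 1
```

The steps are run sequentially here for clarity; the point is that the reader never touches the buffer the writer is filling in the same instance, and neither side consults the other to decide which buffer to use.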
-
Patent number: 8707012Abstract: In one embodiment, the present invention includes an apparatus having a register file to store vector data, an address generator coupled to the register file to generate addresses for a vector memory operation, and a controller to generate an output slice from one or more slices each including multiple addresses, where the output slice includes addresses each corresponding to a separately addressable portion of a memory. Other embodiments are described and claimed.Type: GrantFiled: October 12, 2012Date of Patent: April 22, 2014Assignee: Intel CorporationInventors: Roger Espasa, Joel Emer, Geoff Lowney, Roger Gramunt, Santiago Galan, Toni Juan, Jesus Corbal, Federico Ardanaz, Isaac Hernandez
-
Patent number: 8687008Abstract: A latency tolerant system for executing video processing operations. The system includes a host interface for implementing communication between the video processor and a host CPU, a scalar execution unit coupled to the host interface and configured to execute scalar video processing operations, and a vector execution unit coupled to the host interface and configured to execute vector video processing operations. A command FIFO is included for enabling the vector execution unit to operate on a demand driven basis by accessing the memory command FIFO. A memory interface is included for implementing communication between the video processor and a frame buffer memory. A DMA engine is built into the memory interface for implementing DMA transfers between a plurality of different memory locations and for loading the command FIFO with data and instructions for the vector execution unit.Type: GrantFiled: November 4, 2005Date of Patent: April 1, 2014Assignee: NVIDIA CorporationInventors: Ashish Karandikar, Shirish Gadre, Stephen D. Lew
-
Patent number: 8645588Abstract: The present invention provides embodiments of an apparatus used to implement a pipelined serial ring bus. One embodiment of the apparatus includes one or more ring buses configured to communicatively couple registers associated with logical elements in a processor. The ring bus(s) are configured to concurrently convey information associated with a plurality of load or store operations.Type: GrantFiled: November 1, 2010Date of Patent: February 4, 2014Assignee: Advanced Micro Devices, Inc.Inventors: Christopher D. Bryant, David Kaplan
-
Patent number: 8532288Abstract: A cryptographic engine for modulo N multiplication, which is structured as a plurality of almost identical, serially connected Processing Elements, is controlled so as to accept input in blocks that are smaller than the maximum capability of the engine in terms of bits multiplied at one time. The serially connected hardware is thus partitioned on the fly to process a variety of cryptographic key sizes while still maintaining all of the hardware in an active processing state.Type: GrantFiled: December 1, 2006Date of Patent: September 10, 2013Assignee: International Business Machines CorporationInventors: Camil Fayad, John K. Li, Siegfried K. H. Sutter, Phil C. Yeh
-
Method and system for implementing efficient locking to facilitate parallel processing of IC designs
Patent number: 8438512Abstract: Disclosed is an improved method and system for implementing parallelism for execution of electronic design automation (EDA) tools, such as layout processing tools. Examples of EDA layout processing tools are placement and routing tools. Efficient locking mechanisms are described for facilitating parallel processing and minimizing blocking.Type: GrantFiled: August 30, 2011Date of Patent: May 7, 2013Assignee: Cadence Design Systems, Inc.Inventors: David Cross, Eric Nequist
-
Patent number: 8392529Abstract: The invention provides, in one aspect, an improved system for data access comprising a file server that is coupled to a client device or application executing thereon via one or more networks. The server comprises static storage that is organized in one or more directories, each containing, zero, one or more files. The server also comprises a file system operable, in cooperation with a file system on the client device, to provide authorized applications executing on the client device access to those directories and/or files. Fast file server (FFS) software or other functionality executing on or in connection with the server responds to requests received from the client by transferring requested data to the client device over multiple network pathways. That data can comprise, for example, directory trees, files (or portions thereof), and so forth.Type: GrantFiled: August 27, 2007Date of Patent: March 5, 2013Assignee: PME IP Australia Pty LtdInventors: Malte Westerhoff, Detlev Stalling
-
Patent number: 8316216Abstract: In one embodiment, the present invention includes an apparatus having a register file to store vector data, an address generator coupled to the register file to generate addresses for a vector memory operation, and a controller to generate an output slice from one or more slices each including multiple addresses, where the output slice includes addresses each corresponding to a separately addressable portion of a memory. Other embodiments are described and claimed.Type: GrantFiled: October 21, 2009Date of Patent: November 20, 2012Assignee: Intel CorporationInventors: Roger Espasa, Joel Emer, Geoff Lowney, Roger Gramunt, Santiago Galan, Toni Juan, Jesus Corbal, Federico Ardanaz, Isaac Hernandez
-
Patent number: 8205204Abstract: A multi-threading processor is provided. The multi-threading processor includes a first instruction fetch unit to receive a first thread and a second instruction fetch unit to receive a second thread. A multi-thread scheduler is coupled to the instruction fetch units and an execution unit. The multi-thread scheduler determines the width of the execution unit and the execution unit executes the threads accordingly.Type: GrantFiled: January 23, 2009Date of Patent: June 19, 2012Assignee: Intel CorporationInventors: Ken Shoemaker, Sailesh Kottapalli, Kin-Kee Sit
-
Patent number: 8179896Abstract: A network processor of an embodiment includes a packet classification engine, a processing pipeline, and a controller. The packet classification engine allows for classifying each of a plurality of packets according to packet type. The processing pipeline has a plurality of stages for processing each of the plurality of packets in a pipelined manner, where each stage includes one or more processors. The controller allows for providing the plurality of packets to the processing pipeline in an order that is based at least partially on: (i) packet types of the plurality of packets as classified by the packet classification engine and (ii) estimates of processing times for processing packets of the packet types at each stage of the plurality of stages of the processing pipeline. A method in a network processor allows for prefetching instructions into a cache for processing a packet based on a packet type of the packet.Type: GrantFiled: November 7, 2007Date of Patent: May 15, 2012Inventor: Justin Mark Sobaje
-
Patent number: 8169439Abstract: Embodiments of the invention are generally related to image processing, and more specifically to vector units for supporting image processing. A combined vector/scalar unit is provided wherein one or more processing lanes of the vector unit are used for performing scalar operations. An integrated register file is also provided for storing vector and scalar data. Therefore, the transfer of data to memory to exchange data between independent vector and scalar units is obviated and a significant amount of chip area is saved.Type: GrantFiled: October 23, 2007Date of Patent: May 1, 2012Assignee: International Business Machines CorporationInventors: David Arnold Luick, Eric Oliver Mejdrich, Adam James Muff
-
Patent number: 8161266Abstract: An improved superscalar processor. The processor includes multiple lanes, allowing multiple instructions in a bundle to be executed in parallel. In vector mode, the parallel lanes may be used to execute multiple instances of a bundle, representing multiple iterations of the bundle in a vector run. Scheduling logic determines whether, for each bundle, multiple instances can be executed in parallel. If multiple instances can be executed in parallel, coupling circuitry couples an instance of the bundle from one lane into one or more other lanes. In each lane, register addresses are renamed to ensure proper execution of the bundles in the vector run. Additionally, the processor may include a register bank separate from the architectural register file. Renaming logic can generate addresses to this separate register bank that are longer than used to address architectural registers, allowing longer vectors and more efficient processor operation.Type: GrantFiled: December 22, 2008Date of Patent: April 17, 2012Assignee: STMicroelectronics Inc.Inventor: Osvaldo M. Colavin
-
Patent number: 8132031Abstract: A method, apparatus, and program product optimize power consumption in a parallel computing system that includes a plurality of computing nodes by selectively throttling performance of selected nodes to effectively slow down the completion of quicker executing parts of a workload of the computing system when those parts are dependent upon or otherwise associated with the completion of other, slower executing parts of the same workload. Parts of the workload are executed on the computing nodes, including concurrently executing a first part on a first computing node and a second part on a second computing node. The first node is selectively throttled during execution of the first part to decrease power consumption of the first node and conform a completion time for the first node in completing the first part of the workload with a completion time for the second node in completing the second part.Type: GrantFiled: March 17, 2009Date of Patent: March 6, 2012Assignee: International Business Machines CorporationInventors: Eric Lawrence Barsness, David L. Darrington, Amanda Peters, John Matthew Santosuosso
-
Method and system for implementing efficient locking to facilitate parallel processing of IC designs
Patent number: 8010917Abstract: Disclosed is an improved method and system for implementing parallelism for execution of electronic design automation (EDA) tools, such as layout processing tools. Examples of EDA layout processing tools are placement and routing tools. Efficient locking mechanisms are described for facilitating parallel processing and minimizing blocking.Type: GrantFiled: December 26, 2007Date of Patent: August 30, 2011Assignee: Cadence Design Systems, Inc.Inventors: David Cross, Eric Nequist
-
Patent number: 7987344Abstract: A programmable processor and method for improving the performance of processors by incorporating an execution unit configurable to execute a plurality of instruction streams from the plurality of threads, wherein each instruction stream includes a group instruction that operates on a plurality of data elements in partitioned fields of at least one of the registers to produce a catenated result.Type: GrantFiled: January 16, 2004Date of Patent: July 26, 2011Assignee: Microunity Systems Engineering, Inc.Inventors: Craig Hansen, John Moussouris
-
Patent number: 7962906Abstract: A compiler includes a mechanism for employing multiple synergistic processors to execute long vectors. The compiler receives a single source program. The compiler identifies vectorizable loop code in the single source program and extracts the vectorizable loop code from the single source program. The compiler then compiles the extracted vectorizable loop code for a plurality of synergistic processors. The compiler also compiles a remainder of the single source program for a principal processor to form an executable main program such that the executable main program controls operation of the executable vectorizable loop code on the plurality of synergistic processors.Type: GrantFiled: March 15, 2007Date of Patent: June 14, 2011Assignee: International Business Machines CorporationInventors: John Kevin Patrick O'Brien, Kathryn M. O'Brien, Daniel Arthur Prener
-
Publication number: 20100332792Abstract: Systems and methods for improved vector data processing based on separately processing elements of a vector in multiple simultaneously executing vector element processing units are disclosed. One embodiment of the present invention is a vector processing system including a plurality of vector element processing units and a routing infrastructure. The routing infrastructure is configured to route each element of a received vector to a respective one of the vector element processing units. The received vector may be from a memory which is coupled to the vector element processing units by the routing infrastructure. Each vector element processing unit is configured to simultaneously process two or more elements, wherein each of the two or more elements is from a separate vector. Embodiments of the present invention also provide for forwarding of data and results of computation between vector element processing units.Type: ApplicationFiled: June 30, 2009Publication date: December 30, 2010Applicant: Advanced Micro Devices, Inc.Inventor: Daniel B. CLIFTON
-
Patent number: 7788471Abstract: A system and method for performing vector arithmetic is disclosed. The method includes loading two operand vectors, each composed of a number of vector elements, into two storage locations. A selected arithmetic operation is performed on the operand vectors to produce a result vector having the number of vector elements. Each vector element of the result vector is associated with an arithmetic logic cell that has a first input that can receive any vector element from the first vector and a second input that can receive any vector element from the second vector. Accordingly each vector element of the result vector is a function of any two individual vector elements of the operand vectors. By applying the operand vector elements to the appropriate arithmetic logic cells, and by selecting the appropriate arithmetic operation, complex vector operations can be performed efficiently.Type: GrantFiled: September 18, 2006Date of Patent: August 31, 2010Assignee: Freescale Semiconductor, Inc.Inventor: Chengke Sheng
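The idea that each result element can combine any element of the first operand with any element of the second can be sketched as an index-driven operation. The function `permuted_vector_op` and the selector lists are hypothetical names for illustration; the patent describes hardware arithmetic logic cells, not this software routine.

```python
import operator
from typing import Callable, List, Sequence


def permuted_vector_op(
    a: Sequence[float],
    b: Sequence[float],
    select_a: Sequence[int],
    select_b: Sequence[int],
    op: Callable[[float, float], float],
) -> List[float]:
    """Result element k is op(a[select_a[k]], b[select_b[k]]).

    Each result position plays the role of one arithmetic logic cell whose two
    inputs may be wired to any element of the respective operand vector.
    """
    return [op(a[i], b[j]) for i, j in zip(select_a, select_b)]


a = [1.0, 2.0, 3.0, 4.0]
b = [10.0, 20.0, 30.0, 40.0]

# Pair a[0] with b[3], a[0] with b[2], a[3] with b[1], a[2] with b[0].
sums = permuted_vector_op(a, b, [0, 0, 3, 2], [3, 2, 1, 0], operator.add)

# The identity selection reduces to an ordinary element-wise product.
products = permuted_vector_op(a, b, [0, 1, 2, 3], [0, 1, 2, 3], operator.mul)
```

Choosing the selectors and the operation together is what lets a single pass express cross-element patterns such as reversed or broadcast operands.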
-
Patent number: 7779382Abstract: Validity of one or more assertions for any concurrent execution of a plurality of software instructions with at most k-1 context switches can be determined. Validity checking can account for execution of the software instructions in an unbounded stack depth scenario. A finite data domain representation can be used. The software instructions can be represented by a pushdown system. Validity checking can account for thread creation during execution of the plurality of software instructions.Type: GrantFiled: December 10, 2004Date of Patent: August 17, 2010Assignee: Microsoft CorporationInventors: Niels Jakob Rehof, Shaz Qadeer
-
Patent number: 7673076Abstract: An enhanced direct memory access (EDMA) operation issues a read command to the source port to request data. The port returns the data along with response information, which contains the channel and valid byte count. The EDMA stores the read data into a write buffer and acknowledges to the source port that the EDMA can accept more data. The read response and data can come from more than one port and belong to different channels. Removing channel prioritizing according to this invention allows the EDMA to store read data in the write buffer and the EDMA then can acknowledge the port read response concurrently across all channels. This improves the EDMA inbound and outbound data flow dramatically.Type: GrantFiled: May 13, 2005Date of Patent: March 2, 2010Assignee: Texas Instruments IncorporatedInventors: Sanjive Agarwala, Kyle Castille, Quang-Dieu An
-
Publication number: 20100042807Abstract: The described embodiments provide a processor for generating a result vector with incremented or decremented values from an input vector. During operation, the processor receives an input vector and a control vector. The processor then copies a value contained in a selected element of the input vector. The processor next generates the result vector, which involves writing an incremented or decremented value to the result vector, depending on the value of the control vector and the embodiment. In addition, a predicate vector can be used to control the values that are written to the result vector.Type: ApplicationFiled: June 30, 2009Publication date: February 18, 2010Applicant: APPLE INC.Inventors: Jeffry E. Gonion, Keith E. Diefendorff, JR.
-
Patent number: 7660967Abstract: A computer processor is responsive to successive processing instructions in an issue order to process regular vectors to generate a result vector without use of a cache. At least two architectural registers having input-vector capability are selectively coupled to memory to receive corresponding vector-elements of two vectors and transfer the vector-elements to a selected functional unit. At least one architectural register having output capability is selectively coupled to an output, which in turn is coupled to transfer result vector-elements to the memory. The functional unit performs a function on the vector-elements to generate a respective result-element. The result-elements are transferred to a selected architectural register for processing as operands in performance of further functions by a functional unit, or are transferred to the output for transfer to memory. In either case, the order of the result vector-elements is restored to the issue order of the successive processing instructions.Type: GrantFiled: January 30, 2008Date of Patent: February 9, 2010Assignee: Efficient Memory TechnologyInventor: Maurice L. Hutson
-
Patent number: 7620795Abstract: Apparatus and method for a microcontroller are described. The microcontroller includes a microprocessor having storage and bussing for accessing the storage. A portion of the bussing is coupled to hardwired operation codes, and a portion of the storage is for storing code. The hardwired operation codes are in part for placing the microprocessor into an exception handling mode. The exception handling mode includes reactivating the storage for execution of the code without having to reload the code therein.Type: GrantFiled: January 14, 2005Date of Patent: November 17, 2009Assignee: Xilinx, Inc.Inventor: Peter Ryser
-
Patent number: 7603488Abstract: Systems and methods for providing efficient memory allocation, reduced processor intervention and power consumption, and increased memory access bandwidth. One embodiment comprises a system including a plurality of memory units which are accessible in parallel, a dynamic memory unit configured to dynamically allocate and deallocate storage space in the memory units, and a plurality of direct memory access (DMA) engines configured to access the memory units in parallel through the memory management subsystem. The system may be implemented in the MAC engine of a device that communicates with other devices via a wireless communication link. This embodiment may store packets in FIFOs within the memory units as elements of linked list data structures that can be joined together without having to move the previously stored data. DMA engines access a context table to obtain DMA channel information that enables them to move data through appropriate DMA channels.Type: GrantFiled: July 15, 2004Date of Patent: October 13, 2009Assignee: Alereon, Inc.Inventors: Martin Gravenstein, Nirmalendu B. Patra, Andrew Probst, Dave Ohmann, Clair A. Hardesty
-
Patent number: 7603492
Abstract: A streaming data interface device (700) of a streaming processing system (200) is automatically generated by selecting a set of circuit parameters (610) consistent with a set of circuit constraints and generating (612, 614) a representation of a candidate memory interface device based upon a set of stream descriptors. The candidate streaming data interface device is evaluated (616) with respect to one or more quality metrics and the representation of the candidate streaming processor circuit is output (622) if the candidate memory interface device satisfies a set of processing system constraints and is better in at least one of the one or more quality metrics than other candidate memory interface devices.
Type: Grant
Filed: September 20, 2005
Date of Patent: October 13, 2009
Assignee: Motorola, Inc.
Inventors: Sek M. Chai, Nikos Bellas, Malcolm R. Dwyer, Erica M. Lau, Zhiyuan Li, Daniel A. Linzmeier
-
Patent number: 7500240
Abstract: A multi-threading processor is provided. The multi-threading processor includes a first instruction fetch unit to receive a first thread and a second instruction fetch unit to receive a second thread. A multi-thread scheduler is coupled to the instruction fetch units and an execution unit. The multi-thread scheduler determines the width of the execution unit and the execution unit executes the threads accordingly.
Type: Grant
Filed: January 15, 2002
Date of Patent: March 3, 2009
Assignee: Intel Corporation
Inventors: Ken Shoemaker, Sailesh Kottapalli, Kin-Kee Sit
-
Patent number: 7460989
Abstract: A method is provided, wherein a virtual internal master clock is used in connection with a RISC CPU. The RISC CPU comprises a number of concurrently operating function units, wherein each unit runs according to its own clocks, including multiple-stage totally unsynchronized clocks, in order to process a stream of instructions. The method includes the steps of generating a virtual model master clock having a clock cycle, and initializing each of the function units at the beginning of respectively corresponding processing cycles. The method further includes operating each function unit during a respectively corresponding processing cycle to carry out a task with respect to one of the instructions, in order to produce a result. Respective results are all evaluated in synchronization, by means of the master clock. This enables the instruction processing operation to be modeled using a sequential computer language, such as C or C++.
Type: Grant
Filed: October 14, 2004
Date of Patent: December 2, 2008
Assignee: International Business Machines Corporation
Inventor: Oliver Keren Ban
-
Publication number: 20080282059
Abstract: A method and apparatus for maintaining membership in a set of items to be used in a predetermined manner in a computer system. A representation of each member of the set is mapped into a number of components of a primary and secondary vector when a member is added to the set. Periodically, the primary vector is changed to the secondary vector and the secondary vector to the primary vector. When members of the set are deleted, the components of the secondary vector are changed to indicate deletion of these members after the primary vector is changed to the secondary vector. Finally, membership in the set is determined by examining the components in the primary vector, and the members in the set of items are then used in a predetermined manner in the computer system. More specifically, in a sample embodiment of the present invention, membership in the set would determine if data is to be stored or removed from cache memory in a computer system.
Type: Application
Filed: May 9, 2007
Publication date: November 13, 2008
Inventors: Kattamuri Ekanadham, Il Park, Pratap Chandra Pattnaik, Xiaowei Shen
-
Patent number: 7451146
Abstract: A method and computer system for implementing, in a multithreaded environment, an almost non-blocking linked list allow a lock-free access provided that certain conditions are met. The approach involves: associating a pointer and an auxiliary data structure with each linked list, using a compare-and-swap (CAS) operation, and making a slight modification of values associated with nodes under certain conditions. The CAS operation guards against setting the pointers incorrectly during insertion and removal operations. The auxiliary data structure, also referred to as the ‘black list,’ holds a dynamic list of values, typically pointer values, associated with nodes that are in the process of being removed by a thread.
Type: Grant
Filed: June 30, 2004
Date of Patent: November 11, 2008
Assignee: Hewlett-Packard Development Company, L.P.
Inventor: Hans-Juergen K. H. Boehm
-
Patent number: 7444488
Abstract: A method and a programmable unit for bit field shifting in a memory device in a programmable unit as a result of the execution of an instruction, in which a bit segment is shifted within a first memory unit to a second memory unit, are presented. The bit segment is read with a first bit length from a first bit field in the first memory unit starting at a first start point. The bit segment that has been read is stored in the first bit field in the second memory unit starting at a second start point. The first or the second start point is updated by a predetermined value and the updated start point is stored for subsequent method steps.
Type: Grant
Filed: September 30, 2005
Date of Patent: October 28, 2008
Assignee: Infineon Technologies
Inventors: Xiaoning Nie, Thomas Wahl
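As a rough illustration of the bit-segment transfer this abstract describes, the Python sketch below models the two memory units as plain integers. The function name `move_bit_segment`, its parameters, the bit numbering (bit 0 is the least significant bit), and the choice to advance the source start point by the segment length are all assumptions made for illustration; none of them is taken from the patent.

```python
def move_bit_segment(src, dst, src_start, dst_start, length):
    """Copy `length` bits from `src`, starting at bit `src_start`, into `dst`
    starting at bit `dst_start`. Bit 0 is the least significant bit.

    Returns the updated destination word and the advanced source start point.
    """
    mask = (1 << length) - 1
    segment = (src >> src_start) & mask      # read the bit segment
    dst = dst & ~(mask << dst_start)         # clear the target bit field
    dst = dst | (segment << dst_start)       # store the segment
    next_src_start = src_start + length      # assumed update step: the segment length
    return dst, next_src_start
```

Returning the advanced start point lets a caller chain repeated calls to walk through a packed word one field at a time.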
-
Publication number: 20080077768
Abstract: A method and apparatus are provided to perform efficient merging operations of two or more streams of data by using SIMD instructions. Streams of data are merged together in parallel and with mitigated or removed conditional branching. The merge operations of the streams of data include Merge AND and Merge OR operations.
Type: Application
Filed: September 27, 2006
Publication date: March 27, 2008
Inventors: Hiroshi Inoue, Moriyoshi Ohara, Hideaki Komatsu
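The abstract does not define the two merge operations, so the following is only a scalar reference sketch in Python under stated assumptions: both streams are sorted and free of duplicates, Merge AND is read as "values present in both streams", and Merge OR as "values present in either stream". It does not use SIMD instructions; it only hints at the reduced-branching idea by advancing the indices directly from comparison results. The function names `merge_and` and `merge_or` are hypothetical.

```python
def merge_and(a, b):
    """Values present in both sorted, duplicate-free streams (assumed meaning)."""
    out = []
    i = j = 0
    while i < len(a) and j < len(b):
        x, y = a[i], b[j]
        if x == y:
            out.append(x)
        # Advance each index by the boolean result of a comparison
        # instead of choosing a path with nested branches.
        i += x <= y
        j += y <= x
    return out


def merge_or(a, b):
    """Values present in either sorted, duplicate-free stream, each emitted once."""
    out = []
    i = j = 0
    while i < len(a) and j < len(b):
        x, y = a[i], b[j]
        out.append(min(x, y))
        i += x <= y
        j += y <= x
    # One stream is exhausted; the rest of the other is already in order.
    out.extend(a[i:])
    out.extend(b[j:])
    return out
```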
-
Patent number: 7313788
Abstract: A method for determining vectorization configurations in a computer processor architecture, the method including identifying a vectorizable loop in a computer program, identifying a memory access pattern of data required for implementing the loop in the architecture, computing a set of candidate configurations of resources required for vectorizing the data in the architecture, where the computing step includes configuring a vector pointer register of the architecture in support of either of reorder-on-read use and reorder-on-write use of a vector element file of the architecture, selecting one of the candidates in accordance with predefined selection criteria, and implementing the selected vectorization configuration in the architecture.
Type: Grant
Filed: October 29, 2003
Date of Patent: December 25, 2007
Assignee: International Business Machines Corporation
Inventors: Shay Ben-David, Dorit Naishlos, Uzi Shvadron, Ayal Zaks
-
Patent number: 7293258
Abstract: A data processor has a debug circuit arranged to monitor whether operand data used for execution of a program meets a debug exception condition. The debug exception condition tests two or more multi-bit subfields of a vector operand independently. Debug action is taken if one or more of the multi-bit subfields meet the corresponding conditions.
Type: Grant
Filed: May 17, 2000
Date of Patent: November 6, 2007
Assignee: NXP B.V.
Inventors: Hendrikus Petrus Elisabeth Vranken, Kornelis Antonius Vissers, Fransiscus Wilhelmus Sijstermans
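The per-subfield test in this abstract can be sketched in Python as follows. The 8-bit subfield width, the lowest-subfield-first ordering, the use of predicates as "conditions", and the names `split_subfields` and `debug_exception` are all assumptions for illustration; the abstract states only that subfields are tested independently and that a match in one or more of them triggers the debug action.

```python
def split_subfields(operand, width=8, count=4):
    """Split an operand into `count` subfields of `width` bits, lowest first."""
    mask = (1 << width) - 1
    return [(operand >> (k * width)) & mask for k in range(count)]


def debug_exception(operand, conditions, width=8):
    """Return True if at least one subfield satisfies its own condition.

    `conditions` holds one predicate per subfield, or None to leave that
    subfield untested (a hypothetical convention).
    """
    fields = split_subfields(operand, width, len(conditions))
    return any(cond is not None and cond(field)
               for field, cond in zip(fields, conditions))
```

For example, with conditions `[None, lambda v: v == 0xFF, None, lambda v: v > 0x80]`, the operand `0x1200FF00` triggers because its second-lowest byte equals `0xFF`, independently of the other bytes.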
-
Patent number: 7237086
Abstract: A customization program for use in customizing a baseboard management controller used for monitoring operation of various computer system components is disclosed. A user interacts with the customization program to customize the baseboard management controller based on a configuration of components specified for the baseboard of the computer system. The customization program provides a user interface having a repository of icons and a design page. The icons represent various components that may be connected, either directly or indirectly, to the baseboard. The design page is used for constructing a model representing the specified configuration of components. As a user drags icons onto the design page, the model is updated to reflect selection of the components corresponding to these icons. Further, the customization program creates a configuration file that identifies and describes each of the selected components.
Type: Grant
Filed: November 26, 2003
Date of Patent: June 26, 2007
Assignee: American Megatrends, Inc.
Inventors: Govind A. Kothandapani, Bakka Ravinder Reddy
-
Patent number: 7206920
Abstract: A method of locating a target value includes loading the target value into elements of a first register. The first register includes N elements (N>0). The method also includes indicating in elements of a second register, which includes N elements corresponding to the first register, whether a corresponding element from data storage matches a corresponding element of the first register.
Type: Grant
Filed: December 13, 2005
Date of Patent: April 17, 2007
Assignee: Intel Corporation
Inventor: Jean-Francois C. Collard
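A minimal Python sketch of the compare step in this abstract, with registers modeled as lists of N elements. The helper names `broadcast`, `match_mask`, and `locate`, the default N = 4, and the surrounding loop that scans data storage in chunks of N are assumptions added for illustration; the abstract itself describes only loading the target into the first register and recording per-element matches in the second.

```python
def broadcast(target, n):
    """First register: the target value copied into each of n elements."""
    return [target] * n


def match_mask(first_register, data_elements):
    """Second register: 1 where a data element equals the corresponding
    element of the first register, else 0."""
    return [int(r == d) for r, d in zip(first_register, data_elements)]


def locate(target, data, n=4):
    """Scan `data` in chunks of n elements (assumed usage); return the index
    of the first match, or -1 if the target does not occur."""
    first = broadcast(target, n)
    for base in range(0, len(data), n):
        mask = match_mask(first, data[base:base + n])
        if any(mask):
            return base + mask.index(1)
    return -1
```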
-
Patent number: 7146486
Abstract: A scalar processor that includes a plurality of scalar arithmetic logic units and a special function unit. Each scalar unit performs, in a different time interval, the same operation on a different data item, where each different time interval is one of a plurality of successive, adjacent time intervals. Each unit provides an output data item in the time interval in which the unit performs the operation and provides a processed data item in the last of the successive, adjacent time intervals. The special function unit provides a special function computation for the output data item of a selected one of the scalar units, in the time interval in which the selected scalar unit performs the operation, so as to avoid a conflict in use among the scalar units. A vector processing unit includes an input data buffer, the scalar processor, and an output orthogonal converter.
Type: Grant
Filed: January 29, 2003
Date of Patent: December 5, 2006
Assignee: S3 Graphics Co., Ltd.
Inventors: Boris Prokopenko, Timour Paltashev, Derek Gladding
-
Patent number: 7130985
Abstract: Described herein is a data processor that comprises a register memory and a processor unit. The processor unit simultaneously executes a single instruction on a plurality of operands in the register memory. The plurality of operands may be one or more contiguous regions. The contiguous regions may be specified as an address and a format such as a row, a column, or a neighborhood relative to the address.
Type: Grant
Filed: October 31, 2002
Date of Patent: October 31, 2006
Assignee: Broadcom Corporation
Inventors: Stephen Barlow, Neil Bailey, Timothy Ramsdale, David Plowman, Robert Swann