Concurrent Patents (Class 712/9)
  • Patent number: 11360750
    Abstract: A method for facilitating a play of a legacy game is described. The method includes receiving a user input during the play of the legacy game, determining whether one or more blocks of code for servicing the user input are cached, and accessing one or more instructions of a legacy game code upon determining that the one or more blocks of code are not cached. The method further includes compiling the one or more blocks of code from the one or more instructions of the legacy game code, caching the one or more blocks of code, and executing the one or more blocks of code to display a virtual environment.
    Type: Grant
    Filed: January 28, 2021
    Date of Patent: June 14, 2022
    Assignee: Sony Interactive Entertainment LLC
    Inventors: Ernesto Corvi, George Weising, David Thach
  • Patent number: 10997116
    Abstract: A computing system is described herein that expedites deep neural network (DNN) operations or other processing operations using a hardware accelerator. The hardware accelerator, in turn, includes a tensor-processing engine that works in conjunction with a scalar-processing unit (SPU). The tensor-processing engine handles various kinds of tensor-based operations required by the DNN, such as multiplying vectors by matrices, combining vectors with other vectors, transforming individual vectors, etc. The SPU performs scalar-based operations, such as forming the reciprocal of a scalar, generating the square root of a scalar, etc. According to one illustrative implementation, the computing system uses the same vector-based programmatic interface to interact with both the tensor-processing engine and the SPU.
    Type: Grant
    Filed: August 6, 2019
    Date of Patent: May 4, 2021
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Steven Karl Reinhardt, Joseph Anthony Mayer, II, Dan Zhang
  • Patent number: 9792087
    Abstract: An embodiment of a system and method for performing a numerical operation on input data in a hybrid floating-point format includes representing input data as a sign bit, exponent bits, and mantissa bits. The exponent bits are represented as an unsigned integer including an exponent bias, and a signed numerical value of zero is represented as a first reserved combination of the mantissa bits and the exponent bits. Each of all other combinations of the mantissa bits and the exponent bits represents a real finite non-zero number. The mantissa bits are operated on with a “one” bit before a radix point for the all other combinations of the mantissa bits and the exponent bits.
    Type: Grant
    Filed: April 20, 2012
    Date of Patent: October 17, 2017
    Assignee: FUTUREWEI TECHNOLOGIES, INC.
    Inventors: Yuanbin Guo, Tong Sun, Weizhong Chen
  • Patent number: 9720696
    Abstract: Embodiments of the present invention provide systems and methods for mapping the architected state of one or more threads to a set of distributed physical register files to enable independent execution of one or more threads in a multiple slice processor. In one embodiment, a system is disclosed including a plurality of dispatch queues which receive instructions from one or more threads and an even number of parallel execution slices, each parallel execution slice containing a register file. A routing network directs an output from the dispatch queues to the parallel execution slices and the parallel execution slices independently execute the one or more threads.
    Type: Grant
    Filed: September 30, 2014
    Date of Patent: August 1, 2017
    Assignee: International Business Machines Corporation
    Inventors: Sam G. Chu, Markus Kaltenbach, Hung Q. Le, Jentje Leenstra, Jose E. Moreira, Dung Q. Nguyen, Brian W. Thompto
  • Patent number: 9715385
    Abstract: Vector exception handling is facilitated. A vector instruction is executed that operates on one or more elements of a vector register. When an exception is encountered during execution of the instruction, a vector exception code is provided that indicates a position within the vector register that caused the exception. The vector exception code also includes a reason for the exception.
    Type: Grant
    Filed: January 23, 2013
    Date of Patent: July 25, 2017
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Jonathan D. Bradbury, Michael K. Gschwind, Eric M. Schwarz, Timothy J. Slegel
  • Patent number: 8972689
    Abstract: A storage processor identifies latency of memory drives for different numbers of concurrent storage operations. The identified latency is used to identify debt limits for the number of concurrent storage operations issued to the memory drives. The storage processor may issue additional storage operations to the memory devices when the number of storage operations is within the debt limit. Storage operations may be deferred when the number of storage operations is outside the debt limit.
    Type: Grant
    Filed: February 2, 2011
    Date of Patent: March 3, 2015
    Assignee: Violin Memory, Inc.
    Inventor: Erik de la Iglesia
  • Publication number: 20140359253
    Abstract: A processor may include a vector functional unit that supports concurrent operations on multiple data elements of a maximum element size. The functional unit may also support concurrent execution of multiple distinct vector program instructions, where the multiple vector instructions each operate on multiple data elements of less than the maximum element size.
    Type: Application
    Filed: May 29, 2013
    Publication date: December 4, 2014
    Applicant: Apple Inc.
    Inventor: Jeffry E. Gonion
  • Patent number: 8868885
    Abstract: A device system and method for processing program instructions, for example, to execute intra vector operations. A fetch unit may receive a program instruction defining different operations on data elements stored at the same vector memory address. A processor may include different types of execution units each executing a different one of a predetermined plurality of elemental instructions. Each program instruction may be a combination of one or more of the elemental instructions. The processor may receive a vector of data elements stored non-consecutively at the same vector memory address to be processed by a same one of the elemental instructions and a vector of configuration values independently associated with executing the same elemental instruction on the non-consecutive data elements. At least two configuration values may be different to implement different operations by executing the same elemental instruction using the different configuration values on the vector of non-consecutive data elements.
    Type: Grant
    Filed: November 18, 2010
    Date of Patent: October 21, 2014
    Assignee: Ceva D.S.P. Ltd.
    Inventors: Yaakov Dekter, Michael Boukaya, Shai Shpigelblat, Moshe Steinberg
  • Publication number: 20140289496
    Abstract: Systems, apparatuses and methods for utilizing enhanced macroscalar predicate operations which take enhanced predicate operands that designate the element width and which elements are to be processed. The element width and the number of elements per vector are determined at run-time rather than being defined in the architectural definition of the instruction. This enables additional parallelism when processing smaller-sized data. The instruction performs the requested operation on the elements specified by the enhanced control predicate, assuming an element-width also specified by the enhanced control predicate, and returns the result as an enhanced predicate of the same element width.
    Type: Application
    Filed: March 18, 2014
    Publication date: September 25, 2014
    Applicant: Apple Inc.
    Inventor: Jeffry E. Gonion
  • Publication number: 20140289497
    Abstract: Systems, apparatuses and methods for utilizing enhanced Macroscalar comparison operations which take an enhanced predicate operand that designates the element width and which elements are to be processed. The element width and the number of elements per vector are determined at run-time rather than being defined in the architectural definition of the instruction. This enables additional parallelism when processing smaller-sized data. The instruction performs the requested operation on the elements specified by the enhanced predicate, assuming an element-width also specified by the enhanced predicate, and returns the result as an enhanced predicate corresponding to the result of the comparison.
    Type: Application
    Filed: March 18, 2014
    Publication date: September 25, 2014
    Applicant: Apple Inc.
    Inventor: Jeffry E. Gonion
  • Publication number: 20140289498
    Abstract: Systems, apparatuses and methods for utilizing enhanced Macroscalar vector operations which take an enhanced predicate operand that designates the element width and which elements are to be processed. The element width and the number of elements per vector are determined at run-time rather than being defined in the architectural definition of the instruction. This enables additional parallelism when processing smaller-sized data. The instruction performs the requested operation on the elements specified by the enhanced predicate, assuming an element-width also specified by the enhanced predicate, and returns the result as a vector of elements of the same element width.
    Type: Application
    Filed: March 18, 2014
    Publication date: September 25, 2014
    Applicant: Apple Inc.
    Inventor: Jeffry E. Gonion
  • Patent number: 8775510
    Abstract: The invention provides, in one aspect, an improved system for data access comprising a file server that is coupled to a client device or application executing thereon via one or more networks. The server comprises static storage that is organized in one or more directories, each containing, zero, one or more files. The server also comprises a file system operable, in cooperation with a file system on the client device, to provide authorized applications executing on the client device access to those directories and/or files. Fast file server (FFS) software or other functionality executing on or in connection with the server responds to requests received from the client by transferring requested data to the client device over multiple network pathways. That data can comprise, for example, directory trees, files (or portions thereof), and so forth.
    Type: Grant
    Filed: January 31, 2013
    Date of Patent: July 8, 2014
    Assignee: PME IP Australia Pty Ltd
    Inventors: Malte Westerhoff, Detlev Stalling
  • Patent number: 8732359
    Abstract: When executing a graphical model of a dynamic system that includes two or more concurrently executing sets of operations, a processor is configured to create a first buffer and a second buffer within the executable graphical model. A first set of operations is configured to write data to the first buffer during a first execution instance of the first set of operations. The first set of operations is configured to write data to the second buffer during a second execution instance of the first thread. A second set of operations is configured to read the data from the first buffer during an instance of the second thread that executes contemporaneously with the second execution instance of the first set of operations. Determinations regarding access to the first buffer and second buffer by the first thread and second thread are self-contained within the first thread and second thread, respectively.
    Type: Grant
    Filed: December 7, 2012
    Date of Patent: May 20, 2014
    Assignee: The MathWorks, Inc.
    Inventors: James E. Carrick, Biao Yu
  • Patent number: 8707012
    Abstract: In one embodiment, the present invention includes an apparatus having a register file to store vector data, an address generator coupled to the register file to generate addresses for a vector memory operation, and a controller to generate an output slice from one or more slices each including multiple addresses, where the output slice includes addresses each corresponding to a separately addressable portion of a memory. Other embodiments are described and claimed.
    Type: Grant
    Filed: October 12, 2012
    Date of Patent: April 22, 2014
    Assignee: Intel Corporation
    Inventors: Roger Espasa, Joel Emer, Geoff Lowney, Roger Gramunt, Santiago Galan, Toni Juan, Jesus Corbal, Federico Ardanaz, Isaac Hernandez
  • Patent number: 8687008
    Abstract: A latency tolerant system for executing video processing operations. The system includes a host interface for implementing communication between the video processor and a host CPU, a scalar execution unit coupled to the host interface and configured to execute scalar video processing operations, and a vector execution unit coupled to the host interface and configured to execute vector video processing operations. A command FIFO is included for enabling the vector execution unit to operate on a demand driven basis by accessing the memory command FIFO. A memory interface is included for implementing communication between the video processor and a frame buffer memory. A DMA engine is built into the memory interface for implementing DMA transfers between a plurality of different memory locations and for loading the command FIFO with data and instructions for the vector execution unit.
    Type: Grant
    Filed: November 4, 2005
    Date of Patent: April 1, 2014
    Assignee: NVIDIA Corporation
    Inventors: Ashish Karandikar, Shirish Gadre, Stephen D. Lew
  • Patent number: 8645588
    Abstract: The present invention provides embodiments of an apparatus used to implement a pipelined serial ring bus. One embodiment of the apparatus includes one or more ring buses configured to communicatively couple registers associated with logical elements in a processor. The ring bus(s) are configured to concurrently convey information associated with a plurality of load or store operations.
    Type: Grant
    Filed: November 1, 2010
    Date of Patent: February 4, 2014
    Assignee: Advanced Micro Devices, Inc.
    Inventors: Christopher D. Bryant, David Kaplan
  • Patent number: 8532288
    Abstract: A cryptographic engine for modulo N multiplication, which is structured as a plurality of almost identical, serially connected Processing Elements, is controlled so as to accept input in blocks that are smaller than the maximum capability of the engine in terms of bits multiplied at one time. The serially connected hardware is thus partitioned on the fly to process a variety of cryptographic key sizes while still maintaining all of the hardware in an active processing state.
    Type: Grant
    Filed: December 1, 2006
    Date of Patent: September 10, 2013
    Assignee: International Business Machines Corporation
    Inventors: Camil Fayad, John K. Li, Siegfried K. H. Sutter, Phil C. Yeh
  • Patent number: 8438512
    Abstract: Disclosed is an improved method and system for implementing parallelism for execution of electronic design automation (EDA) tools, such as layout processing tools. Examples of EDA layout processing tools are placement and routing tools. Efficient locking mechanism are described for facilitating parallel processing and to minimize blocking.
    Type: Grant
    Filed: August 30, 2011
    Date of Patent: May 7, 2013
    Assignee: Cadence Design Systems, Inc.
    Inventors: David Cross, Eric Nequist
  • Patent number: 8392529
    Abstract: The invention provides, in one aspect, an improved system for data access comprising a file server that is coupled to a client device or application executing thereon via one or more networks. The server comprises static storage that is organized in one or more directories, each containing, zero, one or more files. The server also comprises a file system operable, in cooperation with a file system on the client device, to provide authorized applications executing on the client device access to those directories and/or files. Fast file server (FFS) software or other functionality executing on or in connection with the server responds to requests received from the client by transferring requested data to the client device over multiple network pathways. That data can comprise, for example, directory trees, files (or portions thereof), and so forth.
    Type: Grant
    Filed: August 27, 2007
    Date of Patent: March 5, 2013
    Assignee: PME IP Australia Pty Ltd
    Inventors: Malte Westerhoff, Detlev Stalling
  • Patent number: 8316216
    Abstract: In one embodiment, the present invention includes an apparatus having a register file to store vector data, an address generator coupled to the register file to generate addresses for a vector memory operation, and a controller to generate an output slice from one or more slices each including multiple addresses, where the output slice includes addresses each corresponding to a separately addressable portion of a memory. Other embodiments are described and claimed.
    Type: Grant
    Filed: October 21, 2009
    Date of Patent: November 20, 2012
    Assignee: Intel Corporation
    Inventors: Roger Espasa, Joel Emer, Geoff Lowney, Roger Gramunt, Santiago Galan, Toni Juan, Jesus Corbal, Federico Ardanaz, Isaac Hernandez
  • Patent number: 8205204
    Abstract: An multi-threading processor is provided. The multi-threading processor includes a first instruction fetch unit to receive a first thread and a second instruction fetch unit to receive a second thread. A multi-thread scheduler coupled to the instruction fetch units and a execution unit. The multi-thread scheduler determines the width of the execution unit and the execution unit executes the threads accordingly.
    Type: Grant
    Filed: January 23, 2009
    Date of Patent: June 19, 2012
    Assignee: Intel Corporation
    Inventors: Ken Shoemaker, Sailesh Kottapalli, Kin-Kee Sit
  • Patent number: 8179896
    Abstract: A network processor of an embodiment includes a packet classification engine, a processing pipeline, and a controller. The packet classification engine allows for classifying each of a plurality of packets according to packet type. The processing pipeline has a plurality of stages for processing each of the plurality of packets in a pipelined manner, where each stage includes one or more processors. The controller allows for providing the plurality of packets to the processing pipeline in an order that is based at least partially on: (i) packet types of the plurality of packets as classified by the packet classification engine and (ii) estimates of processing times for processing packets of the packet types at each stage of the plurality of stages of the processing pipeline. A method in a network processor allows for prefetching instructions into a cache for processing a packet based on a packet type of the packet.
    Type: Grant
    Filed: November 7, 2007
    Date of Patent: May 15, 2012
    Inventor: Justin Mark Sobaje
  • Patent number: 8169439
    Abstract: Embodiments of the invention are generally related to image processing, and more specifically to vector units for supporting image processing. A combined vector/scalar unit is provided wherein one or more processing lanes of the vector unit are used for performing scalar operations. An integrated register file is also provided for storing vector and scalar data. Therefore, the transfer of data to memory to exchange data between independent vector and scalar units is obviated and a significant amount of chip area is saved.
    Type: Grant
    Filed: October 23, 2007
    Date of Patent: May 1, 2012
    Assignee: International Business Machines Corporation
    Inventors: David Arnold Luick, Eric Oliver Mejdrich, Adam James Muff
  • Patent number: 8161266
    Abstract: An improved superscalar processor. The processor includes multiple lanes, allowing multiple instructions in a bundle to be executed in parallel. In vector mode, the parallel lanes may be used to execute multiple instances of a bundle, representing multiple iterations of the bundle in a vector run. Scheduling logic determines whether, for each bundle, multiple instances can be executed in parallel. If multiple instances can be executed in parallel, coupling circuitry couples an instance of the bundle from one lane into one or more other lanes. In each lane, register addresses are renamed to ensure proper execution of the bundles in the vector run. Additionally, the processor may include a register bank separate from the architectural register file. Renaming logic can generate addresses to this separate register bank that are longer than used to address architectural registers, allowing longer vectors and more efficient processor operation.
    Type: Grant
    Filed: December 22, 2008
    Date of Patent: April 17, 2012
    Assignee: STMicroelectronics Inc.
    Inventor: Osvaldo M. Colavin
  • Patent number: 8132031
    Abstract: A method, apparatus, and program product optimize power consumption in a parallel computing system that includes a plurality of computing nodes by selectively throttling performance of selected nodes to effectively slow down the completion of quicker executing parts of a workload of the computing system when those parts are dependent upon or otherwise associated with the completion of other, slower executing parts of the same workload. Parts of the workload are executed on the computing nodes, including concurrently executing a first part on a first computing node and a second part on a second computing node. The first node is selectively throttled during execution of the first part to decrease power consumption of the first node and conform a completion time of for the first node in completing the first part of the workload with a completion time for the second node in completing the second part.
    Type: Grant
    Filed: March 17, 2009
    Date of Patent: March 6, 2012
    Assignee: International Business Machines Corporation
    Inventors: Eric Lawrence Barsness, David L. Darrington, Amanda Peters, John Matthew Santosuosso
  • Patent number: 8010917
    Abstract: Disclosed is an improved method and system for implementing parallelism for execution of electronic design automation (EDA) tools, such as layout processing tools. Examples of EDA layout processing tools are placement and routing tools. Efficient locking mechanism are described for facilitating parallel processing and to minimize blocking.
    Type: Grant
    Filed: December 26, 2007
    Date of Patent: August 30, 2011
    Assignee: Cadence Design Systems, Inc.
    Inventors: David Cross, Eric Nequist
  • Patent number: 7987344
    Abstract: A programmable processor and method for improving the performance of processors by incorporating an execution unit configurable to execute a plurality of instruction streams from the plurality of threads, wherein each instruction stream includes a group instruction that operates on a plurality of data elements in partitioned fields of at least one of the registers to produce a catenated result.
    Type: Grant
    Filed: January 16, 2004
    Date of Patent: July 26, 2011
    Assignee: Microunity Systems Engineering, Inc.
    Inventors: Craig Hansen, John Moussouris
  • Patent number: 7962906
    Abstract: A compiler includes a mechanism for employing multiple synergistic processors to execute long vectors. The compiler receives a single source program. The compiler identifies vectorizable loop code in the single source program and extracts the vectorizable loop code from the single source program. The compiler then compiles the extracted vectorizable loop code for a plurality of synergistic processors. The compiler also compiles a remainder of the single source program for a principal processor to form an executable main program such that the executable main program controls operation of the executable vectorizable loop code on the plurality of synergistic processors.
    Type: Grant
    Filed: March 15, 2007
    Date of Patent: June 14, 2011
    Assignee: International Business Machines Corporation
    Inventors: John Kevin Patrick O'Brien, Kathryn M. O'Brien, Daniel Arthur Prener
  • Publication number: 20100332792
    Abstract: Systems and methods for improved vector data processing based on separately processing elements of a vector in multiple simultaneously executing vector element processing units are disclosed. One embodiment of the present invention is a vector processing system including a plurality of vector element processing units and a routing infrastructure. The routing infrastructure is configured to route each element of a received vector to a respective one of the vector element processing units. The received vector may be from a memory which is coupled to the vector element processing units by the routing infrastructure. Each vector element processing unit is configured to simultaneously process two or more elements, wherein each of the two or more elements is from a separate vector. Embodiments of the present invention also provide for forwarding of data and results of computation between vector element processing units.
    Type: Application
    Filed: June 30, 2009
    Publication date: December 30, 2010
    Applicant: Advanced Micro Devices, Inc.
    Inventor: Daniel B. CLIFTON
  • Patent number: 7788471
    Abstract: A system and method for performing vector arithmetic is disclosed. The method includes loading two operand vectors, each composed of a number of vector elements, into two storage locations. A selected arithmetic operation is performed on the operand vectors to produce a result vector having the number of vector elements. Each vector element of the result vector is associated with an arithmetic logic cell that has a first input that can receive any vector element from the first vector and a second input that can receive any vector element from the second vector. Accordingly each vector element of the result vector is a function of any two individual vector elements of the operand vectors. By applying the operand vector elements to the appropriate arithmetic logic cells, and by selecting the appropriate arithmetic operation, complex vector operations can be performed efficiently.
    Type: Grant
    Filed: September 18, 2006
    Date of Patent: August 31, 2010
    Assignee: Freescale Semiconductor, Inc.
    Inventor: Chengke Sheng
  • Patent number: 7779382
    Abstract: Validity of one or more assertions for any concurrent execution of a plurality of software instructions with at most k?1 context switches can be determined. Validity checking can account for execution of the software instructions in an unbounded stack depth scenario. A finite data domain representation can be used. The software instructions can be represented by a pushdown system. Validity checking can account for thread creation during execution of the plurality of software instructions.
    Type: Grant
    Filed: December 10, 2004
    Date of Patent: August 17, 2010
    Assignee: Microsoft Corporation
    Inventors: Niels Jakob Rehof, Shaz Qadeer
  • Patent number: 7673076
    Abstract: An enhanced direct memory access (EDMA) operation issues a read command to the source port to request data. The port returns the data along with response information, which contains the channel and valid byte count. The EDMA stores the read data into a write buffer and acknowledges to the source port that the EDMA can accept more data. The read response and data can come from more than one port and belong to different channels. Removing channel prioritizing according to this invention allows the EDMA to store read data in the write buffer and the EDMA then can acknowledge the port read response concurrently across all channels. This improves the EDMA inbound and outbound data flow dramatically.
    Type: Grant
    Filed: May 13, 2005
    Date of Patent: March 2, 2010
    Assignee: Texas Instruments Incorporated
    Inventors: Sanjive Agarwala, Kyle Castille, Quang-Dieu An
  • Publication number: 20100042807
    Abstract: The described embodiments provide a processor for generating a result vector with incremented or decremented values from an input vector. During operation, the processor receives an input vector and a control vector. The processor then copies a value contained in a selected element of the input vector. The processor next generates the result vector, which involves writing an incremented or decremented value to the result vector, depending on the value of the control vector and the embodiment. In addition, a predicate vector can be used to control the values that are written to the result vector.
    Type: Application
    Filed: June 30, 2009
    Publication date: February 18, 2010
    Applicant: APPLE INC.
    Inventors: Jeffry E. Gonion, Keith E. Diefendorff, JR.
  • Patent number: 7660967
    Abstract: A computer processor is responsive to successive processing instructions in an issue order to process regular vectors to generate a result vector without use of a cache. At least two architectural registers having input-vector capability are selectively coupled to memory to receive corresponding vector-elements of two vectors and transfer the vector-elements to a selected functional unit. At least one architectural register having output capability is selectively coupled to an output, which in turn is coupled to transfer result vector-elements to the memory. The functional unit performs a function on the vector-elements to generate a respective result-element. The result-elements are transferred to a selected architectural register for processing as operands in performance of further functions by a functional unit, or are transferred to the output for transfer to memory. In either case, the order of the result vector-elements is restored to the issue order of the successive processing instructions.
    Type: Grant
    Filed: January 30, 2008
    Date of Patent: February 9, 2010
    Assignee: Efficient Memory Technology
    Inventor: Maurice L. Hutson
  • Patent number: 7620795
    Abstract: Apparatus and method for a microcontroller are described. The microcontroller includes a microprocessor having storage and bussing for accessing the storage. A portion of the bussing is coupled to hardwired operation codes, and a portion of the storage is for storing code. The hardwired operation codes are in part for placing the microprocessor into an exception handling mode. The exception handling mode includes reactivating the storage for execution of the code without having to reload the code therein.
    Type: Grant
    Filed: January 14, 2005
    Date of Patent: November 17, 2009
    Assignee: Xilinx, Inc.
    Inventor: Peter Ryser
  • Patent number: 7603488
    Abstract: Systems and methods for providing efficient memory allocation, reduced processor intervention and power consumption, and increased memory access bandwidth. One embodiment comprises a system including a plurality of memory units which are accessible in parallel, a dynamic memory unit configured to dynamically allocate and deallocate storage space in the memory units, and a plurality of direct memory access (DMA) engines configured to access the memory units in parallel through the memory management subsystem. The system may be implemented in the MAC engine of a device that communicates with other devices via a wireless communication link. This embodiment may store packets in FIFOs within the memory units as elements of linked list data structures that can be joined together without having to move the previously stored data. DMA engines access a context table to obtain DMA channel information that enables them to move data through appropriate DMA channels.
    Type: Grant
    Filed: July 15, 2004
    Date of Patent: October 13, 2009
    Assignee: Alereon, Inc.
    Inventors: Martin Gravenstein, Nirmalendu B. Patra, Andrew Probst, Dave Ohmann, Clair A. Hardesty
  • Patent number: 7603492
    Abstract: A streaming data interface device (700) of a streaming processing system (200) is automatically generated by selecting a set of circuit parameters (610) consistent with a set of circuit constraints and generating (612, 614) a representation of a candidate memory interface device based upon a set of stream descriptors. The candidate streaming data interface device is evaluated (616) with respect to one or more quality metrics and the representation of the candidate streaming processor circuit is output (622) if the candidate memory interface device satisfies a set of processing system constraints and is better in at least one of the one or more quality metrics than other candidate memory interface devices.
    Type: Grant
    Filed: September 20, 2005
    Date of Patent: October 13, 2009
    Assignee: Motorola, Inc.
    Inventors: Sek M. Chai, Nikos Bellas, Malcolm R. Dwyer, Erica M. Lau, Zhiyuan Li, Daniel A. Linzmeier
  • Patent number: 7500240
    Abstract: An multi-threading processor is provided. The multi-threading processor includes a first instruction fetch unit to receive a first thread and a second instruction fetch unit to receive a second thread. A multi-thread scheduler coupled to the instruction fetch units and a execution unit. The multi-thread scheduler determines the width of the execution unit and the execution unit executes the threads accordingly.
    Type: Grant
    Filed: January 15, 2002
    Date of Patent: March 3, 2009
    Assignee: Intel Corporation
    Inventors: Ken Shoemaker, Sailesh Kottapalli, Kin-Kee Sit
  • Patent number: 7460989
    Abstract: A method is provided, wherein a virtual internal master clock is used in connection with a RISC CPU. The RISC CPU comprises a number of concurrently operating function units, wherein each unit runs according to its own clocks, including multiple-stage totally unsynchronized clocks, in order to process a stream of instructions. The method includes the steps of generating a virtual model master clock having a clock cycle, and initializing each of the function units at the beginning of respectively corresponding processing cycles. The method further includes operating each function unit during a respectively corresponding processing cycle to carry out a task with respect to one of the instructions, in order to produce a result. Respective results are all evaluated in synchronization, by means of the master clock. This enables the instruction processing operation to be modeled using a sequential computer language, such as C or C++.
    Type: Grant
    Filed: October 14, 2004
    Date of Patent: December 2, 2008
    Assignee: International Business Machines Corporation
    Inventor: Oliver Keren Ban
  • Publication number: 20080282059
    Abstract: A method and apparatus for maintaining membership in a set of items to be used in a predetermined manner in a computer system. A representation of each member of the set is mapped into a number of components of a primary and secondary vector when a member is added to the set. Periodically, the primary vector is changed to the secondary vector and the secondary vector to the primary vector. When members of the set are deleted, the components of the secondary vector are changed to indicate deletion of these members after the primary vector is changed to the secondary vector. Finally, membership in the set is determined by examining the components in the primary vector, and the members in the set of items are then used in a predetermined manner in the computer system. More specifically, in a sample embodiment of the present invention, membership in the set would determine if data is to be stored or removed from cache memory in a computer system.
    Type: Application
    Filed: May 9, 2007
    Publication date: November 13, 2008
    Inventors: Kattamuri Ekanadham, Il Park, Pratap Chandra Pattnaik, Xiaowei Shen
  • Patent number: 7451146
    Abstract: A method and computer system for implementing, in a multithreaded environment, an almost non-blocking linked list allow a lock-free access provided that certain conditions are met. The approach involves: associating a pointer and an auxiliary data structure with each linked list, using a compare-and-swap (CAS) operation, and making a slight modification of values associated with nodes under certain conditions. The CAS operation guards against setting the pointers incorrectly during insertion and removal operations. The auxiliary data structure, also referred to as the ‘black list,’ holds a dynamic list of values, typically pointer values, associated with nodes that are in the process of being removed by a thread.
    Type: Grant
    Filed: June 30, 2004
    Date of Patent: November 11, 2008
    Assignee: Hewlett-Packard Development Company, L.P.
    Inventor: Hans-Juergen K. H. Boehm
  • Patent number: 7444488
    Abstract: A method and a programmable unit for bit field shifting in a memory device in a programmable unit as a result of the execution of an instruction, in which a bit segment is shifted within a first memory unit to a second memory unit, are presented. The bit segment is read with a first bit length from a first bit field in the first memory unit starting at a first start point. The bit segment that has been read is stored in the first bit field in the second memory unit starting at a second start point. The first or the second start points is updated by a predetermined value and the updated start point is stored for subsequent method steps.
    Type: Grant
    Filed: September 30, 2005
    Date of Patent: October 28, 2008
    Assignee: Infineon Technologies
    Inventors: Xiaoning Nie, Thomas Wahl
  • Publication number: 20080077768
    Abstract: A method and apparatus are provided to perform efficient merging operations of two or more streams of data by using SIMD instruction. Streams of data are merged together in parallel and with mitigated or removed conditional branching. The merge operations of the streams of data include Merge AND and Merge OR operations.
    Type: Application
    Filed: September 27, 2006
    Publication date: March 27, 2008
    Inventors: Hiroshi Inoue, Moriyoshi Ohara, Hideaki Komatsu
  • Patent number: 7313788
    Abstract: A method for determining vectorization configurations in a computer processor architecture, the method including identifying a vectorizable loop in a computer program, identifying a memory access pattern of data required for implementing the loop in the architecture, computing a set of candidate configurations of resources required for vectorizing the data in the architecture, where the computing step includes configuring a vector pointer register of the architecture in support of either of reorder-on-read use and reorder-on-write use of a vector element file of the architecture, selecting one of the candidates in accordance with predefined selection criteria, and implementing the selected vectorization configuration in the architecture.
    Type: Grant
    Filed: October 29, 2003
    Date of Patent: December 25, 2007
    Assignee: International Business Machines Corporation
    Inventors: Shay Ben-David, Dorit Naishlos, Uzi Shvadron, Ayal Zaks
  • Patent number: 7293258
    Abstract: A data processor has a debug circuit arranged to monitor whether operand data used for execution of a program meets a debug exception condition. The debug exception condition tests a two or more of multi-bit subfields of a vector operand independently. Debug action is taken if one or more of the multi-bit subfields meet the corresponding conditions.
    Type: Grant
    Filed: May 17, 2000
    Date of Patent: November 6, 2007
    Assignee: NXP B.V.
    Inventors: Hendrikus Petrus Elisabeth Vranken, Kornelis Antonius Vissers, Fransiscus Wilhelmus Sijstermans
  • Patent number: 7237086
    Abstract: A customization program for use in customizing a baseboard management controller used for monitoring operation of various computer system components is disclosed. A user interacts with the customization program to customize the baseboard management controller based on a configuration of components specified for the baseboard of the computer system. The customization program provides a user interface having a repository of icons and a design page. The icons represent various components that may be connected, either directly or indirectly, to the baseboard. The design page is used for constructing a model representing the specified configuration of components. As a user drags icons onto the design page, the model is updated to reflect selection of the components corresponding to these icons. Further, the customization program creates a configuration file that identifies and describes each of the selected components.
    Type: Grant
    Filed: November 26, 2003
    Date of Patent: June 26, 2007
    Assignee: American Megatrends, Inc.
    Inventors: Govind A. Kothandapani, Bakka Ravinder Reddy
  • Patent number: 7206920
    Abstract: A method of locating a target value includes loading the target value into elements of a first register. The first register includes N elements (N>0). The method also includes indicating in elements of a second register, which includes N elements corresponding to the first register, whether a corresponding element from data storage matches a corresponding element of the first register.
    Type: Grant
    Filed: December 13, 2005
    Date of Patent: April 17, 2007
    Assignee: Intel Corporation
    Inventor: Jean-Francois C. Collard
  • Patent number: 7146486
    Abstract: A scalar processor that includes a plurality of scalar arithmetic logic units and a special function unit. Each scalar unit performs, in a different time interval, the same operation on a different data item, where each different time interval is one of a plurality of successive, adjacent time intervals. Each unit provides an output data item in the time interval in which the unit performs the operation and provides a processed data item in the last of the successive, adjacent time intervals. The special function unit provides a special function computation for the output data item of a selected one of the scalar units, in the time interval in which the selected scalar unit performs the operation, so as to avoid a conflict in use among the scalar units. A vector processing unit includes an input data buffer, the scalar processor, and an output orthogonal converter.
    Type: Grant
    Filed: January 29, 2003
    Date of Patent: December 5, 2006
    Assignee: S3 Graphics Co., Ltd.
    Inventors: Boris Prokopenko, Timour Paltashev, Derek Gladding
  • Patent number: 7130985
    Abstract: Described herein is a data processor that comprises a register memory and a processor unit. The processor unit simultaneously executes a single instruction on a plurality of operands in the register memory. The plurality of operands may be one or more contiguous regions. The contiguous regions may be specified as an address and a format such as a row, a column, or a neighborhood relative to the address.
    Type: Grant
    Filed: October 31, 2002
    Date of Patent: October 31, 2006
    Assignee: Broadcom Corporation
    Inventors: Stephen Barlow, Neil Bailey, Timothy Ramsdale, David Plowman, Robert Swann
  • Patent number: 7062633
    Abstract: It is decided whether a first source data from the memory 101 is a data which is to be subjected to arithmetic or not by a state flag detection means 150, the result of the decision is retained as a state flag, and it is decided by a condition decision means 109 whether or not the state flag satisfies a condition for performing the arithmetic. A control means 110 controls whether an ALU 100 should perform the arithmetic or not on the basis of the condition satisfaction/dissatisfaction information.
    Type: Grant
    Filed: December 15, 1999
    Date of Patent: June 13, 2006
    Assignee: Matsushita Electric Industrial Co., Ltd.
    Inventors: Mana Hamada, Shunichi Kuromaru, Tomonori Yonezawa, Tsuyoshi Nakamura