Vector Processor Patents (Class 712/2)
  • Patent number: 7587711
    Abstract: The present invention discloses a method and system for specifying and executing computing tasks in a preboot execution environment in general, and, in particular, a method and system for generalized imaging utilizing a language agent and an encapsulated, object-oriented, polyphase preboot execution and specification language. Target customization is advantageously accomplished by encapsulating target-dependent parameters in specification files. The target-specific parameters are resolved at the appropriate execution time, when the parameter information becomes available. Such an approach reduces the specification of complex tasks to merely a few lines of code. The approach of the present invention nevertheless affords reliable, robust, and accurate performance, because the pertinent parametric information is resolved only when it can be accurately ascertained. Furthermore, the specification encapsulations are themselves part of the image set, providing self-describing images with self-contained imaging methods.
    Type: Grant
    Filed: February 27, 2004
    Date of Patent: September 8, 2009
    Assignee: WYSE Technology Inc.
    Inventor: Andrew T. Fausak
  • Patent number: 7581084
    Abstract: A method and apparatus for loading and storing vectors from and to memory, including embedding a location identifier in bits comprising a vector load and store instruction, wherein the location identifier indicates a location in the vector where useful data ends. The vector load instruction further includes a value field that indicates a particular constant for use by the load/store unit to set locations in the vector register beyond the useful data with the constant. By embedding the ending location of the useful data in the instruction, bandwidth and memory are saved by only requiring that the useful data in the vector be loaded and stored.
    Type: Grant
    Filed: May 21, 2004
    Date of Patent: August 25, 2009
    Assignee: Nintendo Co., Ltd.
    Inventors: Yu-Chung C. Liao, Peter A. Sandon, Howard Cheng
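    Illustrative sketch (Python): A software model of the load/store behavior described above (the names vec_load and vec_store and the 8-element width are assumptions for illustration, not the patented encoding). Only the useful prefix of the vector touches memory; the remaining register lanes are filled with the constant carried in the instruction's value field.
      # Fixed-width vector register modeled as a Python list.
      VECTOR_LEN = 8

      def vec_load(memory, addr, count, fill=0):
          # Fetch only `count` useful elements; pad the rest of the register with `fill`.
          assert 0 <= count <= VECTOR_LEN
          return memory[addr:addr + count] + [fill] * (VECTOR_LEN - count)

      def vec_store(memory, addr, vector, count):
          # Write back only the useful prefix, saving bandwidth and memory.
          memory[addr:addr + count] = vector[:count]

      mem = list(range(100))
      reg = vec_load(mem, 10, count=5, fill=0)   # [10, 11, 12, 13, 14, 0, 0, 0]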
  • Publication number: 20090172348
    Abstract: A computer processor includes control logic for executing LoadUnpack and PackStore instructions. In one embodiment, the processor includes a vector register and a mask register. In response to a PackStore instruction with an argument specifying a memory location, a circuit in the processor copies unmasked vector elements from the vector register to consecutive memory locations, starting at the specified memory location, without copying masked vector elements. In response to a LoadUnpack instruction, the circuit copies data items from consecutive memory locations, starting at an identified memory location, into unmasked vector elements of the vector register, without copying data to masked vector elements. Other embodiments are described and claimed.
    Type: Application
    Filed: December 26, 2007
    Publication date: July 2, 2009
    Inventor: Robert Cavin
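    Illustrative sketch (Python): The PackStore/LoadUnpack pair described above is essentially a masked compress/expand between a vector register and consecutive memory. The function names and the convention that a mask bit of 1 means "unmasked" are assumptions made for this sketch.
      def pack_store(vreg, mask, memory, addr):
          # Copy only unmasked elements to consecutive memory locations.
          for element, m in zip(vreg, mask):
              if m:
                  memory[addr] = element
                  addr += 1

      def load_unpack(vreg, mask, memory, addr):
          # Fill only unmasked lanes from consecutive memory locations.
          for lane, m in enumerate(mask):
              if m:
                  vreg[lane] = memory[addr]
                  addr += 1

      mem = [0] * 16
      v = [10, 20, 30, 40]
      pack_store(v, [1, 0, 1, 1], mem, 0)   # mem[0:3] becomes [10, 30, 40]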
  • Patent number: 7548248
    Abstract: Methods and apparatuses for blending two images using vector table look up operations. In one aspect of the invention, a method to blend two images includes: loading a vector of keys into a vector register; converting the vector of keys into a first vector of blending factors for the first image and a second vector of blending factors for the second image using a plurality of look up tables; and computing an image attribute for the blended image using the blending factors.
    Type: Grant
    Filed: June 7, 2007
    Date of Patent: June 16, 2009
    Assignee: Apple Inc.
    Inventors: Steven Todd Weybrew, David Ligon, Ronald Gerard Langhi
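    Illustrative sketch (Python): A generic software rendering of the table-driven blend described above; the linear ramp used to fill the two lookup tables is made up for the example, since the abstract does not fix the table contents.
      KEYS = 256
      LUT_A = [k / (KEYS - 1) for k in range(KEYS)]   # blending factors for image A
      LUT_B = [1.0 - f for f in LUT_A]                # blending factors for image B

      def blend(pixels_a, pixels_b, keys):
          out = []
          for a, b, k in zip(pixels_a, pixels_b, keys):
              fa, fb = LUT_A[k], LUT_B[k]             # convert each key to two factors via LUTs
              out.append(fa * a + fb * b)             # blended image attribute
          return out

      print(blend([100, 200], [0, 50], [255, 128]))   # [100.0, ~125.3]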
  • Publication number: 20090150647
    Abstract: A vectorizable execution unit is capable of being operated in a plurality of modes, with the processing lanes in the vectorizable execution unit grouped into different combinations of logical execution units in different modes. By doing so, processing lanes can be selectively grouped together to operate as different types of vector execution units and/or scalar execution units, and if desired, dynamically switched during runtime to process various types of instruction streams in a manner that is best suited for each type of instruction stream. As a consequence, a single vectorizable execution unit may be configurable, e.g., via software control, to operate either as a vector execution unit or as a plurality of scalar execution units.
    Type: Application
    Filed: December 7, 2007
    Publication date: June 11, 2009
    Inventors: Eric Oliver Mejdrich, Adam James Muff, Matthew Ray Tubbs
  • Publication number: 20090144521
    Abstract: Extensible Markup Language (XML) data is represented as a list of structures, with each structure in the list representing an aspect of the XML. A set of frequently used elements is extracted from the list-of-structures representation and stored in packed vectors. The packed vector representation allows Single Instruction Multiple Data (SIMD) instructions to be used directly on the XML data to increase the speed at which the XML data may be searched while minimizing the memory needed to store the XML data.
    Type: Application
    Filed: December 3, 2007
    Publication date: June 4, 2009
    Inventor: Kevin J. Jones
  • Patent number: 7543119
    Abstract: A vector processing system provides high performance vector processing using a System-On-a-Chip (SOC) implementation technique. One or more scalar processors (or cores) operate in conjunction with a vector processor, and the processors collectively share access to a plurality of memory interfaces coupled to Dynamic Random Access read/write Memories (DRAMs). In typical embodiments the vector processor operates as a slave to the scalar processors, executing computationally intensive Single Instruction Multiple Data (SIMD) codes in response to commands received from the scalar processors. The vector processor implements a vector processing Instruction Set Architecture (ISA) including machine state, instruction set, exception model, and memory model.
    Type: Grant
    Filed: February 10, 2006
    Date of Patent: June 2, 2009
    Inventors: Richard Edward Hessel, Nathan Daniel Tuck, Korbin S. Van Dyke, Chetana N. Keltcher
  • Patent number: 7526456
    Abstract: A method of operating a Linear Complementarity Problem (LCP) solver is disclosed, where the LCP solver is characterized by multiple execution units operating in parallel to implement a competent computational method adapted to resolve physics-based LCPs in real-time.
    Type: Grant
    Filed: March 8, 2004
    Date of Patent: April 28, 2009
    Assignee: NVIDIA Corporation
    Inventors: Lihua Zhang, Richard Tonge, Dilip Sequeira, Monier Maher
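    Illustrative sketch (Python): The abstract does not disclose the computational method, so the sketch below uses projected Gauss-Seidel, a common iterative scheme for real-time physics LCPs (find z >= 0 with w = Mz + q >= 0 and z·w = 0); in practice such solvers are parallelized across execution units, for example by processing independent contact groups.
      def lcp_pgs(M, q, iterations=50):
          n = len(q)
          z = [0.0] * n
          for _ in range(iterations):
              for i in range(n):
                  # Residual excluding the diagonal term, then project onto z >= 0.
                  s = q[i] + sum(M[i][j] * z[j] for j in range(n) if j != i)
                  z[i] = max(0.0, -s / M[i][i])
          return z

      # 1-D contact example: M = [[2]], q = [-4]  ->  z = 2 (and w = 0).
      print(lcp_pgs([[2.0]], [-4.0]))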
  • Patent number: 7526629
    Abstract: A vector processing apparatus includes a main memory, an instruction issuing section that issues instructions, an overtaking control circuit, and an instruction executing section. The overtaking control circuit outputs the instructions received from the instruction issuing section to the instruction executing section in an order based on: whether each of a first and a second instruction belongs to a first specific instruction group; whether each of the first and second instructions belongs to a second specific instruction group within the first specific instruction group; whether a fourth instruction belongs to a fourth specific instruction group; whether a third instruction belongs to a third specific instruction group; and whether the main-memory address area relating to the third instruction does not overlap the main-memory address area relating to each of the first and second instructions. The instruction executing section executes the instructions received from the overtaking control circuit.
    Type: Grant
    Filed: February 23, 2005
    Date of Patent: April 28, 2009
    Assignee: NEC Corporation
    Inventor: Yasumasa Saida
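    Illustrative sketch (Python): The final condition in the abstract, allowing a later memory instruction to overtake an earlier one only when their main-memory address areas do not overlap, reduces to an interval-intersection test; the instruction-group checks are omitted here and the tuple layout is an assumption for the sketch.
      def ranges_overlap(start_a, len_a, start_b, len_b):
          return start_a < start_b + len_b and start_b < start_a + len_a

      def may_overtake(younger, older):
          # Each instruction is modeled as (base_address, byte_length) of its access.
          return not ranges_overlap(younger[0], younger[1], older[0], older[1])

      print(may_overtake((0x100, 32), (0x200, 32)))   # True: disjoint areas, reordering is safe
      print(may_overtake((0x100, 32), (0x110, 8)))    # False: areas overlap, keep program order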
  • Publication number: 20090106525
    Abstract: A design structure embodied in a machine-readable storage medium is provided for designing, manufacturing, and/or testing a design for image processing, and more specifically for vector units that support image processing. A combined vector/scalar unit is provided wherein one or more processing lanes of the vector unit are used for performing scalar operations. An integrated register file is also provided for storing vector and scalar data. Therefore, the transfer of data to memory to exchange data between independent vector and scalar units is obviated, and a significant amount of chip area is saved.
    Type: Application
    Filed: March 14, 2008
    Publication date: April 23, 2009
    Inventors: David Arnold Luick, Eric Oliver Mejdrich, Adam James Muff
  • Patent number: 7503049
    Abstract: An information processing apparatus switches between an Operating System 1 and an Operating System 2 during operation and comprises: a storing unit including a first area storing data managed by OS1, a second area storing a reset handler containing instructions for returning to OS2 and for branching to OS2, and a switching unit that switches connection/disconnection of the first area with the outside; a table storing unit storing information showing the reset handler's position; a CPU having a program counter and executing an instruction at a position indicated by positional information in the program counter; and a management unit that, when instructed to switch from OS1 to OS2 while the apparatus is operating with OS1, instructs the switching unit to disconnect the first area and the CPU to reset. When instructed to reset itself, the CPU initializes its state and sets the reset handler positional information into the program counter.
    Type: Grant
    Filed: May 26, 2004
    Date of Patent: March 10, 2009
    Assignee: Panasonic Corporation
    Inventors: Kouichi Kanemura, Teruto Hirota, Takayuki Ito
  • Patent number: 7487302
    Abstract: A memory subsystem includes a memory controller operable to generate first control signals according to a standard interface. A memory interface adapter is coupled to the memory controller and is operable responsive to the first control signals to develop second control signals adapted to be applied to a memory subsystem to access desired storage locations within the memory subsystem.
    Type: Grant
    Filed: October 3, 2005
    Date of Patent: February 3, 2009
    Assignee: Lockheed Martin Corporation
    Inventors: Brent I. Gouldey, Joel J. Fuster, John Rapp, Mark Jones
  • Patent number: 7467286
    Abstract: A method and apparatus are provided for executing packed data instructions. According to one aspect of the invention, a processor includes registers, a register renaming unit coupled to the registers, a decoder coupled to the register renaming unit, and a partial-width execution unit coupled to the decoder. The register renaming unit provides an architectural register file to store packed data operands that include data elements. The decoder is to decode a first and second set of instructions that each specify one or more registers in the architectural register file. Each of the instructions in the first set specify operations to be performed on all of the data elements. In contrast, each of the instructions in the second set specify operations to be performed on only a subset of the data elements. The partial-width execution unit is to execute operations specified by either the first or second set of instructions.
    Type: Grant
    Filed: May 9, 2005
    Date of Patent: December 16, 2008
    Assignee: Intel Corporation
    Inventors: Mohammad Abdallah, James Coke, Vladimir Pentkovski, Patrice Roussel, Shreekant S. Thakkar
  • Patent number: 7466316
    Abstract: An integrated circuit includes at least two different types of processors, such as a graphics processor and a video processor. At least one operation is commonly supported by the two different types of processors. For each commonly supported operation that is scheduled, a decision is made to determine which type of processor will be selected to implement the operation.
    Type: Grant
    Filed: December 14, 2004
    Date of Patent: December 16, 2008
    Assignee: NVIDIA Corporation
    Inventors: Jonah M. Alben, Stephen D. Lew, Paolo E. Sabella
  • Patent number: 7457938
    Abstract: In one embodiment, the present invention includes a method for executing an operation on low order portions of first and second source operands using a first execution stack of a processor and executing the operation on high order portions of the first and second source operands using a second execution stack of the processor, where the operation in the second execution stack is staggered by one or more cycles from the operation in the first execution stack. Other embodiments are described and claimed.
    Type: Grant
    Filed: September 30, 2005
    Date of Patent: November 25, 2008
    Assignee: Intel Corporation
    Inventors: Stephan Jourdan, Avinash Sodani, Michael Fetterman, Per Hammarlund, Ronak Singhal, Glenn Hinton
  • Patent number: 7447873
    Abstract: In a multithreaded processing core, groups of threads are executed using single instruction, multiple data (SIMD) parallelism by a set of parallel processing engines. Input data defining objects to be processed is received as a stream of input data blocks, and the input data blocks are loaded into a local register file in the core such that all of the data for one of the input objects is accessible to one of the processing engines. The input data can be loaded directly into the local register file, or the data can be accumulated in a buffer and loaded after accumulation, for instance during a launch operation for a SIMD group. Shared input data can also be loaded into a shared memory in the processing core.
    Type: Grant
    Filed: November 29, 2005
    Date of Patent: November 4, 2008
    Assignee: NVIDIA Corporation
    Inventor: Bryon S. Nordquist
  • Patent number: 7446773
    Abstract: An integrated circuit includes at least two different types of processors. The integrated circuit includes an integrated host and associated scheduler. At least one operation is supported by two or more different types of processors. The scheduler schedules operations on the different types of processors.
    Type: Grant
    Filed: December 14, 2004
    Date of Patent: November 4, 2008
    Assignee: NVIDIA Corporation
    Inventors: Jonah M. Alben, Stephen D. Lew, Paolo E. Sabella
  • Patent number: 7418574
    Abstract: A peer-vector machine includes a host processor and a hardwired pipeline accelerator. The host processor executes a program, and, in response to the program, generates host data, and the pipeline accelerator generates pipeline data from the host data. Alternatively, the pipeline accelerator generates the pipeline data, and the host processor generates the host data from the pipeline data. Because the peer-vector machine includes both a processor and a pipeline accelerator, it can often process data more efficiently than a machine that includes only processors or only accelerators. For example, one can design the peer-vector machine so that the host processor performs decision-making and non-mathematically intensive operations and the accelerator performs non-decision-making and mathematically intensive operations.
    Type: Grant
    Filed: October 9, 2003
    Date of Patent: August 26, 2008
    Assignee: Lockheed Martin Corporation
    Inventors: Chandan Mathur, Scott Hellenbach, John W. Rapp, Larry Jackson, Mark Jones, Troy Cherasaro
  • Patent number: 7404065
    Abstract: In one embodiment, a method for flow optimization and prediction for vector streaming single instruction, multiple data (SIMD) extension (VSSE) memory operations is disclosed. The method comprises generating an optimized micro-operation (µop) flow for an instruction to operate on a vector if the instruction is predicted to be unmasked and unit-stride, the instruction to access elements in memory, and accessing via the optimized µop flow two or more of the elements at the same time without determining masks of the two or more elements. Other embodiments are also described.
    Type: Grant
    Filed: December 21, 2005
    Date of Patent: July 22, 2008
    Assignee: Intel Corporation
    Inventors: Stephan Jourdan, Per Hammarlund, Michael Fetterman, Michael P. Cornaby, Glenn Hinton, Avinash Sodani
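    Illustrative sketch (Python): The optimization described above amounts to choosing between a wide, mask-free access path and an element-by-element masked path based on a prediction; the function below is a software stand-in (real hardware would also need to recover from a misprediction, which is not modeled here).
      def vector_load(memory, base, length, mask=None, predicted_unmasked_unit_stride=True):
          if predicted_unmasked_unit_stride and (mask is None or all(mask)):
              # Fast path: one contiguous access, no per-element mask checks.
              return memory[base:base + length]
          # Slow path: consult the mask for every element.
          return [memory[base + i] if (mask is None or mask[i]) else 0 for i in range(length)]

      mem = list(range(64))
      print(vector_load(mem, 8, 4))                                # [8, 9, 10, 11]
      print(vector_load(mem, 8, 4, mask=[1, 0, 1, 0],
                        predicted_unmasked_unit_stride=False))     # [8, 0, 10, 0]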
  • Publication number: 20080114964
    Abstract: A single unified level one instruction cache in which some lines may contain traces and other lines in the same congruence class may contain blocks of instructions consistent with conventional cache lines. Control is exercised over which lines are contained within the cache. This invention avoids inefficiencies in the cache by removing trace lines that experience early exits or that are short. It maintains a few bits of information about the accuracy of the control flow in a trace cache line and uses that information, in addition to the LRU (Least Recently Used) bits that maintain the recency information of a cache line, to make a replacement decision.
    Type: Application
    Filed: November 14, 2006
    Publication date: May 15, 2008
    Inventors: Gordon T. Davis, Richard W. Doing, John D. Jabusch, M V V Anil Krishna, Brett Olsson, Eric F. Robinson, Sumedh W. Sathaye, Jeffrey R. Summers
  • Patent number: 7356710
    Abstract: A method, system and computer program product for computing a message authentication code for data in storage of a computing environment. An instruction specifies a unit of storage for which an authentication code is to be computed. A computing operation computes an authentication code for the unit of storage. A register is used to provide a cryptographic key for use in computing the authentication code. Further, the register may be used in a chaining operation.
    Type: Grant
    Filed: May 12, 2003
    Date of Patent: April 8, 2008
    Assignee: International Business Machines Corporation
    Inventors: Shawn D. Lundvall, Ronald M. Smith, Sr., Phil Chi-Chung Yeh
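    Illustrative sketch (Python): As an analogue of the described flow (key supplied from a register, authentication code computed per unit of storage, with optional chaining across units), the sketch below uses the standard library's HMAC; the actual patented instruction and cipher are not reproduced here.
      import hashlib
      import hmac

      def mac_units(key, storage_units):
          digest = b""                      # chaining value carried between units
          for unit in storage_units:
              digest = hmac.new(key, digest + unit, hashlib.sha256).digest()
          return digest

      key = b"register-held-key"
      units = [b"first unit of storage", b"second unit of storage"]
      print(mac_units(key, units).hex())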
  • Publication number: 20080082783
    Abstract: The present invention is generally related to integrated circuit devices, and more particularly, to methods, systems and design structures for the field of image processing, and more specifically to vector units for supporting image processing. A dual vector unit implementation is described wherein two vector units are configured to receive data from a common register file. The vector units may independently and simultaneously process instructions. Furthermore, the vector units may be adapted to perform scalar operations, thereby integrating vector and scalar processing. The vector units may also be configured to share resources to perform an operation, for example, a cross product operation.
    Type: Application
    Filed: October 26, 2007
    Publication date: April 3, 2008
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Eric Oliver Mejdrich, Adam James Muff, Matthew Ray Tubbs
  • Patent number: 7275147
    Abstract: Execution of a single stand-alone instruction manipulates two n bit strings of data to pack data or align the data. Decoding of the single instruction identifies two registers of n bits each and a shift value, preferably as parameters of the instruction. A first and a second subset of data of less than n bits are selected, by logical shifting, from the two registers, respectively, based solely upon the shift value. Then, the subsets are concatenated, preferably by a logical OR, to obtain an output of n bits. The output may be aligned data or packed data, particularly useful for performing a single operation on multiple sets of the data through parallel processing with a SIMD processor.
    Type: Grant
    Filed: March 31, 2003
    Date of Patent: September 25, 2007
    Assignee: Hitachi, Ltd.
    Inventor: Clifford Tavares
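    Illustrative sketch (Python): The align operation described above is a classic funnel shift: the two n-bit register values are shifted so the wanted subsets line up and are then ORed into one n-bit result. A 64-bit width and a byte-granular shift value are assumed for the sketch.
      N = 64
      MASK = (1 << N) - 1

      def align(reg_a, reg_b, shift_bytes):
          s = shift_bytes * 8
          if s == 0:
              return reg_a & MASK
          # High part comes from reg_a, low part from reg_b, selected purely by the shift value.
          return (((reg_a << s) & MASK) | (reg_b >> (N - s))) & MASK

      a, b = 0x0011223344556677, 0x8899AABBCCDDEEFF
      print(hex(align(a, b, 3)))   # 0x33445566778899aa: data straddling the two registers, realigned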
  • Patent number: 7257695
    Abstract: According to some embodiments, a dynamic region in a register file may be described for an operand. The described region may, for example, store multiple data elements, each data element being associated with an execution channel of an execution engine. Information may then be stored into and/or retrieved from the register file in accordance with the described region.
    Type: Grant
    Filed: December 28, 2004
    Date of Patent: August 14, 2007
    Assignee: Intel Corporation
    Inventors: Hong Jiang, Val Cook
  • Patent number: 7230633
    Abstract: Methods and apparatuses for blending two images using vector table look up operations. In one aspect of the invention, a method to blend two images includes: loading a vector of keys into a vector register; converting the vector of keys into a first vector of blending factors for the first image and a second vector of blending factors for the second image using a plurality of look up tables; and computing an image attribute for the blended image using the blending factors.
    Type: Grant
    Filed: January 11, 2006
    Date of Patent: June 12, 2007
    Assignee: Apple Inc.
    Inventors: Steven Todd Weybrew, David Ligon, Ronald Gerard Langhi
  • Patent number: 7206857
    Abstract: A method is described that involves recognizing that an input queue state has reached a buffer's worth of information. The method also involves generating a first request to read a buffer's worth of information from an input RAM that implements the input queue. The method further involves recognizing that an output queue has room to receive information and that an intermediate queue that provides information to the output queue does not have information waiting to be forwarded to the output queue. The method also involves generating a second request to read information from the input RAM so that at least a portion of the room can be filled. The method also involves granting one of the first and second requests.
    Type: Grant
    Filed: May 10, 2002
    Date of Patent: April 17, 2007
    Assignee: Altera Corporation
    Inventors: Neil Mammen, Greg Maturi, Mammen Thomas
  • Patent number: 7197606
    Abstract: A computer 10a stores boot information OS1 and application information AP1, which are held on a local disk 16a, respectively as an OS1 shared file group in a shared LU1 and as an AP1 shared file group in a shared LUn+1. For personal information (including personal information of the boot information or the AP information), computer 10a stores the information as a user personal file group in a personal LU1. Computer 10a transmits image outline information, LU information and file information for the sets of information stored in shared LU1, shared LUn+1 and personal LU1 to a disk-image management server 30, where the information is stored in the storage device 31 of the disk-image management server 30.
    Type: Grant
    Filed: August 13, 2004
    Date of Patent: March 27, 2007
    Assignee: Hitachi, Ltd.
    Inventors: Ikuko Kobayashi, Shinji Kimura, Ayumi Mikuma
  • Patent number: 7197623
    Abstract: A protocol processor intended to be associated with at least one main processor of a system, with a view to executing tasks to which the main processor is not suited. The protocol processor comprises a program part (30) including an incrementation register (31), a program memory (33) connected to the incrementation register (31) in order to receive addresses therefrom, a decoding part (35) intended to receive instructions from the program memory (33) of the program part (30) with a view to executing an instruction in two cycles, and a data part (36) for executing the instruction.
    Type: Grant
    Filed: June 28, 2000
    Date of Patent: March 27, 2007
    Assignee: Texas Instruments Incorporated
    Inventors: Gerard Chauvel, Francis Aussedat, Pierre Calippe
  • Patent number: 7159099
    Abstract: A re-configurable, streaming vector processor (100) is provided which includes a number of function units (102), each having one or more inputs for receiving data values and an output for providing a data value, a re-configurable interconnection switch (104) and a micro-sequencer (118). The re-configurable interconnection switch (104) includes one or more links, each link operable to couple an output of a function unit (102) to an input of a function unit (102) as directed by the micro-sequencer (118). The vector processor may also include one or more input-stream units (122) for retrieving data from memory. Each input-stream unit is directed by a host processor and has a defined interface (116) to the host processor. The vector processor also includes one or more output-stream units (124) for writing data to memory or to the host processor. The defined interface of the input-stream and output-stream units forms a first part of the programming model.
    Type: Grant
    Filed: June 28, 2002
    Date of Patent: January 2, 2007
    Assignee: Motorola, Inc.
    Inventors: Brian Geoffrey Lucas, Philip E. May, Kent Donald Moat, Raymond B. Essick, IV, Silviu Chiricescu, James M. Norris, Michael Allen Schuette, Ali Saidi
  • Patent number: 7149877
    Abstract: A disclosed byte execution unit receives byte instruction information and two operands, and performs an operation specified by the byte instruction information upon one or both of the operands, thereby producing a result. The byte instruction specifies either a count ones in bytes operation, an average bytes operation, an absolute differences of bytes operation, or a sum bytes into halfwords operation. In one embodiment, the byte execution unit includes multiple byte units. Each byte unit includes multiple population counters, two compressor units, adder input multiplexer logic, adder logic, and result multiplexer logic. A data processing system is described including a processor coupled to a memory system. The processor includes the byte execution unit. The memory system includes a byte instruction, wherein the byte instruction specifies either the count ones in bytes operation, the average bytes operation, the absolute differences of bytes operation, or the sum bytes into halfwords operation.
    Type: Grant
    Filed: July 17, 2003
    Date of Patent: December 12, 2006
    Assignee: International Business Machines Corporation
    Inventors: Sang Hoo Dhong, Hwa-Joon Oh, Brad William Michael, Silvia Melitta Mueller, Kevin D. Tran
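    Illustrative sketch (Python): Plain-Python models of the four byte operations named above, with operands represented as lists of byte values; the rounding of the average and the pairing of adjacent bytes into halfwords are assumptions, since the abstract does not spell out those conventions.
      def count_ones_in_bytes(a):
          return [bin(x).count("1") for x in a]

      def average_bytes(a, b):
          return [(x + y + 1) >> 1 for x, y in zip(a, b)]        # rounded average, stays within a byte

      def abs_differences_of_bytes(a, b):
          return [abs(x - y) for x, y in zip(a, b)]

      def sum_bytes_into_halfwords(a):
          return [a[i] + a[i + 1] for i in range(0, len(a), 2)]  # adjacent byte pairs -> 16-bit sums

      x, y = [0xFF, 0x0F, 0x01, 0x00], [0x01, 0x10, 0x02, 0x08]
      print(count_ones_in_bytes(x))           # [8, 4, 1, 0]
      print(average_bytes(x, y))              # [128, 16, 2, 4]
      print(abs_differences_of_bytes(x, y))   # [254, 1, 1, 8]
      print(sum_bytes_into_halfwords(x))      # [270, 1]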
  • Patent number: 7146486
    Abstract: A scalar processor that includes a plurality of scalar arithmetic logic units and a special function unit. Each scalar unit performs, in a different time interval, the same operation on a different data item, where each different time interval is one of a plurality of successive, adjacent time intervals. Each unit provides an output data item in the time interval in which the unit performs the operation and provides a processed data item in the last of the successive, adjacent time intervals. The special function unit provides a special function computation for the output data item of a selected one of the scalar units, in the time interval in which the selected scalar unit performs the operation, so as to avoid a conflict in use among the scalar units. A vector processing unit includes an input data buffer, the scalar processor, and an output orthogonal converter.
    Type: Grant
    Filed: January 29, 2003
    Date of Patent: December 5, 2006
    Assignee: S3 Graphics Co., Ltd.
    Inventors: Boris Prokopenko, Timour Paltashev, Derek Gladding
  • Patent number: 7043627
    Abstract: To alleviate factors that obstruct the effectiveness of SIMD operation, such as in-register data alignment, in a high-speed SIMD processor, a large amount of data can be supplied to a data alignment operation pipe 211 by dividing a register file into four banks and allowing a plurality of registers to be designated by a single operand, so that four registers can be accessed simultaneously and data alignment operations can be carried out at high speed. Further, by defining new data pack, data unpack and data permutation instructions, the data supplied in large quantity can be aligned efficiently. This characteristic also permits the definition of a multiply-accumulate operation instruction that maximizes the parallelism of SIMD.
    Type: Grant
    Filed: September 4, 2001
    Date of Patent: May 9, 2006
    Assignee: Hitachi, Ltd.
    Inventors: Takehiro Shimizu, Fumio Arakawa
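    Illustrative sketch (Python): The data pack, unpack and permutation instructions mentioned above can be modeled as simple element rearrangements over register-sized lists; the even-element packing and low-half interleaving conventions are assumptions for the sketch.
      def pack(a, b):
          return a[0::2] + b[0::2]             # keep the even-indexed elements of each source

      def unpack_low(a, b):
          h = len(a) // 2
          return [v for pair in zip(a[:h], b[:h]) for v in pair]   # interleave the low halves

      def permute(a, pattern):
          return [a[i] for i in pattern]       # arbitrary in-register realignment

      a, b = [0, 1, 2, 3], [4, 5, 6, 7]
      print(pack(a, b))                 # [0, 2, 4, 6]
      print(unpack_low(a, b))           # [0, 4, 1, 5]
      print(permute(a, [3, 2, 1, 0]))   # [3, 2, 1, 0]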
  • Patent number: 7043607
    Abstract: The vector unit 21 outputs a first flash address to the flash address array 24. The vector unit 31 outputs a second flash address to the flash address array 34. In the master unit 2, the flash address array 24 compares an address registered in a cache with the first flash address. In the slave unit 3, the flash address array 34 compares the address registered in the cache with the second flash address. When said first flash address coincides with said address registered in said cache, the flash address array 24 sends a first coincidence address to the address array 25. When said second flash address coincides with said address registered in said cache, the flash address array 34 sends a second coincidence address to the address array 25. A corresponding address of the address array 25 is flashed based on the first coincidence address sent from the flash address array 24 and based on the second coincidence address sent from the flash address array 34.
    Type: Grant
    Filed: June 12, 2003
    Date of Patent: May 9, 2006
    Assignee: NEC Corporation
    Inventor: Kenji Ezoe
  • Patent number: 7034849
    Abstract: Methods and apparatuses for blending two images using vector table look up operations. In one aspect of the invention, a method to blend two images includes: loading a vector of keys into a vector register; converting the vector of keys into a first vector of blending factors for the first image and a second vector of blending factors for the second image using a plurality of look up tables; and computing an image attribute for the blended image using the blending factors.
    Type: Grant
    Filed: December 31, 2001
    Date of Patent: April 25, 2006
    Assignee: Apple Computer, Inc.
    Inventors: Steven Todd Weybrew, David Ligon, Ronald Gerard Langhi
  • Patent number: 6996698
    Abstract: Processing restrictions of a computing environment are filtered and blocked, in certain circumstances, such that processing continues despite the restrictions. One restriction includes an indication that fetching of storage keys is prohibited, in response to a buffer miss. When a processing unit of the computing environment is met with this restriction, it performs a comparison of addresses, which indicates whether the fetching can continue. If fetching can continue, the restriction is ignored.
    Type: Grant
    Filed: May 12, 2003
    Date of Patent: February 7, 2006
    Assignee: International Business Machines Corporation
    Inventors: Timothy J. Slegel, Jane H. Bartik, Lisa C. Heller, Erwin F. Pfeffer, Ute Gaertner
  • Patent number: 6968445
    Abstract: A multithreaded processor includes an instruction decoder for decoding retrieved instructions to determine an instruction type for each of the retrieved instructions, an integer unit coupled to the instruction decoder for processing integer type instructions, and a vector unit coupled to the instruction decoder for processing vector type instructions. A reduction unit is preferably associated with the vector unit and receives parallel data elements processed in the vector unit. The reduction unit generates a serial output from the parallel data elements. The processor may be configured to execute at least control code, digital signal processor (DSP) code, Java code and network processing code, and is therefore well-suited for use in a convergence device. The processor is preferably configured to utilize token triggered threading in conjunction with instruction pipelining.
    Type: Grant
    Filed: October 11, 2002
    Date of Patent: November 22, 2005
    Assignee: Sandbridge Technologies, Inc.
    Inventors: Erdem Hokenek, Mayan Moudgill, C. John Glossner
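    Illustrative sketch (Python): The split between the vector unit and the reduction unit can be pictured as parallel per-lane work followed by a serializing fold; a sum reduction is used below, though the abstract covers reductions generally.
      def vector_multiply(a, b):
          return [x * y for x, y in zip(a, b)]   # parallel data elements produced by the vector unit

      def reduce_serial(lanes):
          total = 0
          for value in lanes:                    # reduction unit: serial output from parallel inputs
              total += value
          return total

      print(reduce_serial(vector_multiply([1, 2, 3, 4], [5, 6, 7, 8])))   # 70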
  • Patent number: 6968402
    Abstract: Techniques to buffer and present chunks are disclosed. In some embodiments, a first interface may receive chunks of a first cache line, and a second interface may receive chunks of a second cache line. A buffer may store chunks of the first cache line in a first chunk order and may store chunks of the second cache line in a second chunk order. A control unit may present a requester via the second interface with one or more chunks of the first cache line from the buffer.
    Type: Grant
    Filed: May 22, 2003
    Date of Patent: November 22, 2005
    Assignee: Intel Corporation
    Inventors: David R. Jackson, Stephen W. Kiss, Miles F. Schwartz
  • Patent number: 6963341
    Abstract: The present invention provides efficient ways to implement scan conversion and matrix transpose operations using vector multiplex operations in a SIMD processor. The present method provides a very fast and flexible way to implement different scan conversions, such as zigzag conversion, and matrix transpose for 2×2, 4×4, 8×8 blocks commonly used by all video compression and decompression algorithms.
    Type: Grant
    Filed: May 20, 2003
    Date of Patent: November 8, 2005
    Inventor: Tibet Mimar
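    Illustrative sketch (Python): Scan conversion and transpose both reduce to a single permutation (gather) over a flattened block, which is exactly what a vector multiplex/permute instruction performs in one step; a 4x4 block is shown, and an 8x8 block works the same way with a 64-entry table.
      ZIGZAG_4x4 = [0, 1, 4, 8, 5, 2, 3, 6, 9, 12, 13, 10, 7, 11, 14, 15]

      def zigzag_scan(block_row_major):
          return [block_row_major[i] for i in ZIGZAG_4x4]

      def transpose_4x4(block_row_major):
          # Transpose is just another permutation pattern over the same flat block.
          return [block_row_major[4 * c + r] for r in range(4) for c in range(4)]

      block = list(range(16))
      print(zigzag_scan(block))
      print(transpose_4x4(block))     # [0, 4, 8, 12, 1, 5, 9, 13, ...]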
  • Patent number: 6950834
    Abstract: A database table reorganization is defined to permit online access of the table during the reorganization. Records are reorganized in the database table by vacating records from a defined number of pages and then filling the pages with records in accordance with a desired ordering for the records. Temporary pointers to the new locations of moved records are used to prevent table scanner access to the database table from missing or duplicating records while scanning the database table during reorganization. Removal of the temporary pointers is synchronized with the completion of scanning of all table scanners that are commenced during a time when records are being moved as part of a vacating or filling step.
    Type: Grant
    Filed: January 29, 2001
    Date of Patent: September 27, 2005
    Assignee: International Business Machines Corporation
    Inventors: Matthew A. Huras, Nelson Hop Hing, Jeffrey J. Goss, Bruce G. Lindsay
  • Patent number: 6924802
    Abstract: A system, method, and computer program product are provided for generating display data. The data processing system loads coefficient values corresponding to a behavior of a selected function in pre-defined ranges of input data. The data processing system then determines, responsive to items of input data, the range of input data in which the selected function is to be estimated. The data processing system then selects, through the use of a vector permute function, the coefficient values, and evaluates an index function at each of the items of input data. It then estimates the value of the selected function through parallel mathematical operations on the items of input data, the selected coefficient values, and the values of the index function, and, responsive to the one or more values of the selected function, generates display data.
    Type: Grant
    Filed: September 12, 2002
    Date of Patent: August 2, 2005
    Assignee: International Business Machines Corporation
    Inventors: Gordon Clyde Fossum, Harm Peter Hofstee, Barry L. Minor, Mark Richard Nutter
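    Illustrative sketch (Python): The flow above (determine the input's range, select that range's coefficients, evaluate) is shown below for a piecewise-linear square-root estimate on [0, 4); the range boundaries and coefficient values are invented for the example, and the coefficient selection that the patent performs with a vector permute is written here as a plain table index.
      RANGES = [0.0, 1.0, 2.0, 3.0]                                     # lower bound of each range
      COEFFS = [(0.0, 1.0), (1.0, 0.41), (1.41, 0.32), (1.73, 0.27)]    # (base, slope) per range

      def estimate(x):
          idx = max(i for i, lo in enumerate(RANGES) if x >= lo)        # index function
          base, slope = COEFFS[idx]                                     # coefficient selection
          return base + slope * (x - RANGES[idx])                       # local evaluation

      print([round(estimate(v), 2) for v in (0.25, 1.5, 3.5)])          # rough estimates of sqrt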
  • Publication number: 20040267957
    Abstract: Windowed multiuser detection techniques are disclosed. A window of data is established, and certain central bits within the window are selected as reliable, while other side bits are ignored. The selected bits are demodulated. The windowed multiuser detector then moves along to the next window in such a manner that the next group of central bit decisions lies contiguous with the previous set, and eventually every bit to be demodulated has at some point been a central bit decision. Almost any type of MUD algorithm (e.g., MMSE MUD or M-algorithm MUD) can be used to compute estimates in the windowed data. Unreliable windowed data are distinguished from reliable data (e.g., by a weighting or other de-emphasis scheme).
    Type: Application
    Filed: February 11, 2004
    Publication date: December 30, 2004
    Inventor: Robert B MacLeod
  • Publication number: 20040210739
    Abstract: A vector signal processor of the present invention consists of a digital signal processor unit, a VSP command bus, a data flow interface, a broadcast interface, a multi-channel buffered serial port (McBSP) network, and a host interface. The vector signal processor of this invention has high processing speed, better communication between modules, and far better coordination, and uses daughter cards to enhance various processing functions.
    Type: Application
    Filed: January 6, 2003
    Publication date: October 21, 2004
    Inventors: Yung-Po Huang, Che-Hui Chang-Chien, Chao-Yuan Huang, Chiung-Hung Chang
  • Patent number: 6807620
    Abstract: The present invention relates to the architecture and use of a computer system optimized for the efficient modeling of graphics. The computer system has a primary processor and a graphics processor. The primary processor has two vector processor units within it, one of which is closely connected to the central processor unit. Simultaneously performing complex modeling calculations on the first vector processor and the CPU, and geometry transformation calculations on the second vector processor, allows for efficient modeling of graphics. Furthermore, the graphics processor is optimized to rapidly switch between data flow from the two vector processors. In addition, the graphics processor is able to render many pixels simultaneously, and has a local memory on the graphics processor chip that acts as a frame buffer, texture buffer, and z buffer. This allows a high fill rate to the frame buffer.
    Type: Grant
    Filed: February 11, 2000
    Date of Patent: October 19, 2004
    Assignee: Sony Computer Entertainment Inc.
    Inventors: Masakazu Suzuoki, Akio Ohba, Masaaki Oka, Toshiyuki Hiroi, Teiji Yutaka, Toyoshi Okada, Masayoshi Tanaka
  • Patent number: 6782470
    Abstract: The register file of a processor includes embedded operand queues. The configuration of the register file into registers and operand queues is defined dynamically by a computer program. The programmer determines the trade-off between the number and size of the operand queue(s) versus the number of registers used for the program. The programmer partitions a portion of the registers into one or more operand queues. A given queue occupies a consecutive set of registers, although multiple queues need not occupy consecutive registers. An additional address bit is included to distinguish operand queue addresses from register addresses. Queue state logic tracks status information for each queue, including a header pointer, tail pointer, start address, end address and number of vacancies value. The program sets the locations and depth of a given operand queue within the register file.
    Type: Grant
    Filed: November 6, 2000
    Date of Patent: August 24, 2004
    Assignee: University of Washington
    Inventors: Stefan G. Berg, Michael S. Grow, Weiyun Sun, Donglok Kim, Yongmin Kim
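    Illustrative sketch (Python): A minimal model of an operand queue carved out of a consecutive block of registers, tracked by head and tail pointers plus a vacancy count as the abstract describes; the class and field names are invented for the sketch.
      class RegisterFileQueue:
          def __init__(self, regfile, start, end):
              self.regs, self.start, self.end = regfile, start, end
              self.head = self.tail = start
              self.vacancies = end - start

          def _advance(self, pointer):
              # Wrap around within the consecutive register block [start, end).
              return self.start + (pointer + 1 - self.start) % (self.end - self.start)

          def enqueue(self, value):
              assert self.vacancies > 0, "queue full"
              self.regs[self.tail] = value
              self.tail = self._advance(self.tail)
              self.vacancies -= 1

          def dequeue(self):
              assert self.vacancies < self.end - self.start, "queue empty"
              value = self.regs[self.head]
              self.head = self._advance(self.head)
              self.vacancies += 1
              return value

      rf = [0] * 32                        # a 32-entry register file
      q = RegisterFileQueue(rf, 16, 24)    # registers 16..23 configured as an 8-deep operand queue
      q.enqueue(7); q.enqueue(9)
      print(q.dequeue(), q.dequeue())      # 7 9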
  • Publication number: 20040117599
    Abstract: A functional-level instruction-set computing (FLIC) architecture executes higher-level functional instructions such as lookups and bit-compares of variable-length operands. Each FLIC processing-engine slice has specialized processing units including a lookup unit that searches for a matching entry in a lookup cache. Variable-length operands are stored in execution buffers. The operand length and location in the execution buffer are stored in fixed-length general-purpose registers (GPRs) that also store fixed-length operands. A copy/move unit moves data between input and output buffers and one or more FLIC processing-engine slices. Multiple contexts can each have a set of GPRs and execution buffers. An expansion buffer in a FLIC slice can be allocated to a context to expand that context's execution buffer for storing longer operands.
    Type: Application
    Filed: December 12, 2002
    Publication date: June 17, 2004
    Applicant: NEXSIL COMMUNICATIONS, INC.
    Inventors: Millind Mittal, Mehul Kharidia, Tarun Kumar Tripathy, J. Sukarno Mertoguno
  • Patent number: 6734874
    Abstract: A method, apparatus and article of manufacture are provided for handling both scalar and vector components during graphics processing. To accomplish this, vertex data is received in the form of vectors after which vector operations are performed on the vector vertex data. Next, scalar operations may be executed on an output of the vector operations, thereby rendering vertex data in the form of scalars. Such scalar vertex data may then be converted to vector vertex data for performing vector operations thereon.
    Type: Grant
    Filed: January 31, 2001
    Date of Patent: May 11, 2004
    Assignee: nVidia Corporation
    Inventors: John Erik Lindholm, Simon Moy, David B. Kirk, Paolo E. Sabella
  • Publication number: 20040059436
    Abstract: A data processing architecture comprises one or more data processing components, each associated with a logical level, such that a data processing component only accepts input from data processing components in a logically higher or lower level or from an external source, and only provides output to data processing components in a logically higher or lower level or to an external recipient system. A data processing component cannot accept input from, or provide output to, a data processing component in the same logical level; and a data processing component will only accept an input that conforms to an ontology related to the logical level with which it is associated.
    Type: Application
    Filed: March 24, 2003
    Publication date: March 25, 2004
    Inventors: Mark Stephen Anderson, Dean Crawford Engelhardt, Damian Andrew Marriott, Suneel Singh Randhawa
  • Patent number: 6697930
    Abstract: A pipeline video decoder and decompression system handles a plurality of separately encoded bit streams arranged as a single serial bit stream of digital bits and having separately encoded pairs of control codes and corresponding data carried in the serial bit stream. The pipeline system employs a plurality of interconnected stages to decode and decompress the single bit stream, including a start code detector. When in a search mode, the start code detector searches for a specific start code corresponding to one of multiple compression standards. The start code detector, responding to the single serial bit stream, generates control tokens and data tokens. A respective one of the tokens includes a plurality of data words. Each data word has an extension bit which indicates the presence of additional words in the token. The data words are thereby unlimited in number.
    Type: Grant
    Filed: February 7, 2001
    Date of Patent: February 24, 2004
    Assignee: Discovision Associates
    Inventors: Adrian P Wise, Martin W Sotheran, William P Robbins, Anthony M Jones, Helen R Finch, Kevin J Boyd, Anthony Peter J Claydon
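    Illustrative sketch (Python): The variable-length token format described above can be modeled with an extension bit per data word that signals whether another word follows; an 8-bit payload per word, with the extension flag in bit 8, is an assumption for the sketch.
      def encode_token(payload_words):
          words = []
          for i, w in enumerate(payload_words):
              ext = 1 if i < len(payload_words) - 1 else 0   # more words follow?
              words.append((ext << 8) | (w & 0xFF))
          return words

      def decode_token(stream, pos):
          payload = []
          while True:
              word = stream[pos]
              pos += 1
              payload.append(word & 0xFF)
              if not (word >> 8) & 1:                        # extension bit clear: token ends
                  return payload, pos

      token = encode_token([0x12, 0x34, 0x56])
      print([hex(w) for w in token])        # ['0x112', '0x134', '0x56']
      print(decode_token(token, 0))         # ([18, 52, 86], 3)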
  • Publication number: 20040015677
    Abstract: A digital signal processor (DSP) includes a SIMD-based organization wherein operations are executed on a plurality of single-instruction multiple data (SIMD) datapaths or stages connected in cascade. The functionality and data values at each stage may be different, including a different width (e.g., a different number of bits per value) in each stage. The operands and destination for data in a computational datapath are selected indirectly through vector pointer registers in a vector pointers datapath. Each vector pointer register contains a plurality of pointers into a register file of a computational datapath.
    Type: Application
    Filed: July 18, 2002
    Publication date: January 22, 2004
    Applicant: International Business Machines Corporation
    Inventors: Jaime H. Moreno, Jeffrey Haskell Derby, Uzi Shvadron, Fredy Daniel Neeser, Victor Zyuban, Ayal Zaks, Shay Ben-David
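    Illustrative sketch (Python): Indirect operand selection through vector pointer registers, as described above, can be modeled by letting each pointer in a named pointer register select an entry of a datapath's register file; the register names and the add operation are invented for the sketch.
      register_file = [10 * i for i in range(16)]                   # computational-datapath registers
      vector_pointers = {"vp0": [3, 5, 7, 9], "vp1": [0, 1, 2, 3]}  # pointer registers

      def gather_operand(pointer_reg):
          return [register_file[p] for p in vector_pointers[pointer_reg]]

      def vadd_indirect(dst_reg, src_a, src_b):
          # Operands and destination are all named indirectly through pointer registers.
          result = [x + y for x, y in zip(gather_operand(src_a), gather_operand(src_b))]
          for p, value in zip(vector_pointers[dst_reg], result):
              register_file[p] = value

      vadd_indirect("vp1", "vp0", "vp1")
      print(register_file[:4])              # [30, 60, 90, 120]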
  • Patent number: 6675379
    Abstract: A method for memory management in execution of a program by a computer having a memory includes identifying in the program an array of array elements. At a given point in the program, a range of the elements is determined within the array such that none of the elements in the array outside the range is alive at the point. Information regarding the determined range is passed to a memory management function, so that memory locations are associated with the array elements, responsive to the determined range.
    Type: Grant
    Filed: June 30, 2000
    Date of Patent: January 6, 2004
    Assignee: International Business Machines Corporation
    Inventors: Elliot Karl Kolodner, Ran Shaham, Mooly Sagiv