Vector Processor Patents (Class 712/2)
-
Patent number: 7587711
Abstract: The present invention discloses a method and system for specifying and executing computing tasks in a preboot execution environment in general, and, in particular, a method and system for generalized imaging utilizing a language agent and an encapsulated object oriented polyphase preboot execution and specification language. The target customization is advantageously accomplished by encapsulating target dependent parameters in specification files. The target specific parameters are resolved at the appropriate execution time, when the parameter information becomes available. Such an approach simplifies specification of complex tasks to merely a few lines of code. The approach of the present invention nevertheless affords reliable, robust, and accurate performance, because the pertinent parametric information is resolved only when it can be accurately ascertained. Furthermore, the specification encapsulations are themselves a part of the image set, providing self-describing images with self-contained imaging methods.
Type: Grant | Filed: February 27, 2004 | Date of Patent: September 8, 2009 | Assignee: WYSE Technology Inc. | Inventor: Andrew T. Fausak
-
Patent number: 7581084
Abstract: A method and apparatus for loading and storing vectors from and to memory, including embedding a location identifier in the bits comprising a vector load or store instruction, wherein the location identifier indicates the location in the vector where useful data ends. The vector load instruction further includes a value field that indicates a particular constant for use by the load/store unit to set locations in the vector register beyond the useful data to the constant. By embedding the ending location of the useful data in the instruction, bandwidth and memory are saved by only requiring that the useful data in the vector be loaded and stored.
Type: Grant | Filed: May 21, 2004 | Date of Patent: August 25, 2009 | Assignee: Nintendo Co., Ltd. | Inventors: Yu-Chung C. Liao, Peter A. Sandon, Howard Cheng
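The semantics described above can be modeled in a few lines of Python. This is a minimal sketch of the idea, not the patented encoding; the function name and argument layout are illustrative assumptions:

```python
def load_vector(memory, addr, length, end, fill):
    """Load `end` useful elements from memory starting at addr;
    positions at or beyond `end` never touch memory and are set
    to the fill constant carried in the instruction's value field."""
    return [memory[addr + i] if i < end else fill
            for i in range(length)]

# Only two memory locations exist; the rest of the vector is filled.
mem = {0: 1, 1: 2}
print(load_vector(mem, 0, 4, 2, 0))  # [1, 2, 0, 0]
```

The bandwidth saving is that only the useful prefix generates memory traffic; the tail comes from the constant.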
-
Publication number: 20090172348
Abstract: A computer processor includes control logic for executing LoadUnpack and PackStore instructions. In one embodiment, the processor includes a vector register and a mask register. In response to a PackStore instruction with an argument specifying a memory location, a circuit in the processor copies unmasked vector elements from the vector register to consecutive memory locations, starting at the specified memory location, without copying masked vector elements. In response to a LoadUnpack instruction, the circuit copies data items from consecutive memory locations, starting at an identified memory location, into unmasked vector elements of the vector register, without copying data to masked vector elements. Other embodiments are described and claimed.
Type: Application | Filed: December 26, 2007 | Publication date: July 2, 2009 | Inventor: Robert Cavin
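The compress/expand behavior of PackStore and LoadUnpack can be sketched in software. The Python below is a behavioral model only, assuming a dict-addressed memory; it is not the hardware encoding:

```python
def pack_store(vector, mask, memory, addr):
    """PackStore model: copy unmasked (mask = 1) elements to
    consecutive memory locations starting at addr; masked
    elements are skipped, so memory stays densely packed."""
    for elem, m in zip(vector, mask):
        if m:
            memory[addr] = elem
            addr += 1
    return addr  # first location not written

def load_unpack(vector, mask, memory, addr):
    """LoadUnpack model: fill unmasked elements from consecutive
    memory locations starting at addr; masked elements keep
    their previous register values."""
    result = list(vector)
    for i, m in enumerate(mask):
        if m:
            result[i] = memory[addr]
            addr += 1
    return result

mem = {}
pack_store([10, 20, 30, 40], [1, 0, 1, 1], mem, 100)
print(mem)  # {100: 10, 101: 30, 102: 40}
```

Note the two instructions are inverses over the same mask: a PackStore followed by a LoadUnpack restores the unmasked lanes.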
-
Patent number: 7548248
Abstract: Methods and apparatuses for blending two images using vector table look up operations. In one aspect of the invention, a method to blend two images includes: loading a vector of keys into a vector register; converting the vector of keys into a first vector of blending factors for the first image and a second vector of blending factors for the second image using a plurality of look up tables; and computing an image attribute for the blended image using the blending factors.
Type: Grant | Filed: June 7, 2007 | Date of Patent: June 16, 2009 | Assignee: Apple Inc. | Inventors: Steven Todd Weybrew, David Ligon, Ronald Gerard Langhi
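A scalar sketch of the key-to-factor lookup step may clarify the flow. The fixed-point factor range (0..256, scaled back with a shift) is an assumption for illustration, not taken from the patent:

```python
def blend(keys, img1, img2, lut1, lut2):
    """Per-pixel blend: each key selects one blending factor per
    image from two lookup tables (modeling the vector table
    look up), then the attribute is a weighted sum of the two
    source pixels, scaled back from 8-bit fixed point."""
    f1 = [lut1[k] for k in keys]  # factors for image 1
    f2 = [lut2[k] for k in keys]  # factors for image 2
    return [(a * x + b * y) >> 8
            for a, x, b, y in zip(img1, f1, img2, f2)]

# Key 0 -> keep image 1; key 1 -> 50/50 mix.
lut1, lut2 = [256, 128], [0, 128]
print(blend([0, 1], [100, 100], [200, 200], lut1, lut2))  # [100, 150]
```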
-
Publication number: 20090150647
Abstract: A vectorizable execution unit is capable of being operated in a plurality of modes, with the processing lanes in the vectorizable execution unit grouped into different combinations of logical execution units in different modes. By doing so, processing lanes can be selectively grouped together to operate as different types of vector execution units and/or scalar execution units, and if desired, dynamically switched during runtime to process various types of instruction streams in a manner that is best suited for each type of instruction stream. As a consequence, a single vectorizable execution unit may be configurable, e.g., via software control, to operate either as a vector execution unit or as a plurality of scalar execution units.
Type: Application | Filed: December 7, 2007 | Publication date: June 11, 2009 | Inventors: Eric Oliver Mejdrich, Adam James Muff, Matthew Ray Tubbs
-
Publication number: 20090144521
Abstract: Extensible Markup Language (XML) data is represented as a list of structures with each structure in the list representing an aspect of the XML. A set of frequently used elements is extracted from the list of structure representation and stored in packed vectors. The packed vector representation allows Single Instruction Multiple Data (SIMD) instructions to be used directly on the XML data to increase the speed at which the XML data may be searched while minimizing the memory needed to store the XML data.
Type: Application | Filed: December 3, 2007 | Publication date: June 4, 2009 | Inventor: Kevin J. Jones
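A software model of the packed-vector search may help. The chunk width of four lanes and the function name are illustrative assumptions; a real SIMD unit would compare all lanes of a chunk in one instruction:

```python
CHUNK = 4  # lanes per simulated SIMD compare

def search_packed(packed, query):
    """Return the indices of entries equal to query, scanning the
    packed vector one chunk at a time the way a SIMD compare
    would: each chunk yields a per-lane match mask at once."""
    hits = []
    for base in range(0, len(packed), CHUNK):
        chunk = packed[base:base + CHUNK]
        mask = [v == query for v in chunk]  # one 'SIMD' compare
        hits.extend(base + lane for lane, m in enumerate(mask) if m)
    return hits

print(search_packed(["a", "b", "a", "c", "a"], "a"))  # [0, 2, 4]
```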
-
Patent number: 7543119
Abstract: A vector processing system provides high performance vector processing using a System-On-a-Chip (SOC) implementation technique. One or more scalar processors (or cores) operate in conjunction with a vector processor, and the processors collectively share access to a plurality of memory interfaces coupled to Dynamic Random Access read/write Memories (DRAMs). In typical embodiments the vector processor operates as a slave to the scalar processors, executing computationally intensive Single Instruction Multiple Data (SIMD) codes in response to commands received from the scalar processors. The vector processor implements a vector processing Instruction Set Architecture (ISA) including machine state, instruction set, exception model, and memory model.
Type: Grant | Filed: February 10, 2006 | Date of Patent: June 2, 2009 | Inventors: Richard Edward Hessel, Nathan Daniel Tuck, Korbin S. Van Dyke, Chetana N. Keltcher
-
Patent number: 7526456
Abstract: A method of operating a Linear Complementarity Problem (LCP) solver is disclosed, where the LCP solver is characterized by multiple execution units operating in parallel to implement a competent computational method adapted to resolve physics-based LCPs in real-time.
Type: Grant | Filed: March 8, 2004 | Date of Patent: April 28, 2009 | Assignee: NVIDIA Corporation | Inventors: Lihua Zhang, Richard Tonge, Dilip Sequeira, Monier Maher
-
Patent number: 7526629
Abstract: A vector processing apparatus includes a main memory, an instruction issuing section which issues instructions, an overtaking control circuit which outputs the instructions received from the instruction issuing section to an instruction executing section in an order based on whether each of a first and second instructions belongs to a first specific instruction group, whether each of the first and second instructions belongs to a second specific instruction group in the first specific instruction group, whether a fourth instruction belongs to a fourth specific instruction group, whether a third instruction belongs to a third specific instruction group, and whether an address area of the main memory relating to the third instruction and an address area of the main memory relating to each of the first and second instructions do not overlap, and the instruction executing section executes the instructions received from the overtaking control circuit.
Type: Grant | Filed: February 23, 2005 | Date of Patent: April 28, 2009 | Assignee: NEC Corporation | Inventor: Yasumasa Saida
-
Publication number: 20090106525
Abstract: A design structure embodied in a machine readable storage medium is provided for designing, manufacturing, and/or testing a design for image processing, and more specifically vector units for supporting image processing. A combined vector/scalar unit is provided wherein one or more processing lanes of the vector unit are used for performing scalar operations. An integrated register file is also provided for storing vector and scalar data. Therefore, the transfer of data to memory to exchange data between independent vector and scalar units is obviated and a significant amount of chip area is saved.
Type: Application | Filed: March 14, 2008 | Publication date: April 23, 2009 | Inventors: David Arnold Luick, Eric Oliver Mejdrich, Adam James Muff
-
Patent number: 7503049
Abstract: An information processing apparatus switches between an Operating System 1 (OS1) and an Operating System 2 (OS2) during operation and comprises: a storing unit including a first area storing data managed by OS1, a second area storing a reset handler containing instructions for returning to OS2 and for branching to OS2, and a switching unit that switches connection/disconnection of the first area with outside; a table storing unit storing information showing the reset handler's position; a CPU having a program counter and executing an instruction at a position indicated by positional information in the program counter; and a management unit that, when instructed to switch from OS1 to OS2 while the apparatus is operating with OS1, instructs the switching unit to disconnect the first area and the CPU to reset. When instructed to reset itself, the CPU initializes its state and sets the reset handler positional information into the program counter.
Type: Grant | Filed: May 26, 2004 | Date of Patent: March 10, 2009 | Assignee: Panasonic Corporation | Inventors: Kouichi Kanemura, Teruto Hirota, Takayuki Ito
-
Patent number: 7487302
Abstract: A memory subsystem includes a memory controller operable to generate first control signals according to a standard interface. A memory interface adapter is coupled to the memory controller and is operable responsive to the first control signals to develop second control signals adapted to be applied to a memory subsystem to access desired storage locations within the memory subsystem.
Type: Grant | Filed: October 3, 2005 | Date of Patent: February 3, 2009 | Assignee: Lockheed Martin Corporation | Inventors: Brent I. Gouldey, Joel J. Fuster, John Rapp, Mark Jones
-
Patent number: 7467286
Abstract: A method and apparatus are provided for executing packed data instructions. According to one aspect of the invention, a processor includes registers, a register renaming unit coupled to the registers, a decoder coupled to the register renaming unit, and a partial-width execution unit coupled to the decoder. The register renaming unit provides an architectural register file to store packed data operands that include data elements. The decoder is to decode a first and second set of instructions that each specify one or more registers in the architectural register file. Each of the instructions in the first set specify operations to be performed on all of the data elements. In contrast, each of the instructions in the second set specify operations to be performed on only a subset of the data elements. The partial-width execution unit is to execute operations specified by either the first or second set of instructions.
Type: Grant | Filed: May 9, 2005 | Date of Patent: December 16, 2008 | Assignee: Intel Corporation | Inventors: Mohammad Abdallah, James Coke, Vladimir Pentkovski, Patrice Roussel, Shreekant S. Thakkar
-
Patent number: 7466316
Abstract: An integrated circuit includes at least two different types of processors, such as a graphics processor and a video processor. At least one operation is commonly supported by the two different types of processors. For each commonly supported operation that is scheduled, a decision is made to determine which type of processor will be selected to implement the operation.
Type: Grant | Filed: December 14, 2004 | Date of Patent: December 16, 2008 | Assignee: NVIDIA Corporation | Inventors: Jonah M. Alben, Stephen D. Lew, Paolo E. Sabella
-
Patent number: 7457938
Abstract: In one embodiment, the present invention includes a method for executing an operation on low order portions of first and second source operands using a first execution stack of a processor and executing the operation on high order portions of the first and second source operands using a second execution stack of the processor, where the operation in the second execution stack is staggered by one or more cycles from the operation in the first execution stack. Other embodiments are described and claimed.
Type: Grant | Filed: September 30, 2005 | Date of Patent: November 25, 2008 | Assignee: Intel Corporation | Inventors: Stephan Jourdan, Avinash Sodani, Michael Fetterman, Per Hammarlund, Ronak Singhal, Glenn Hinton
-
Patent number: 7447873
Abstract: In a multithreaded processing core, groups of threads are executed using single instruction, multiple data (SIMD) parallelism by a set of parallel processing engines. Input data defining objects to be processed is received as a stream of input data blocks, and the input data blocks are loaded into a local register file in the core such that all of the data for one of the input objects is accessible to one of the processing engines. The input data can be loaded directly into the local register file, or the data can be accumulated in a buffer and loaded after accumulation, for instance during a launch operation for a SIMD group. Shared input data can also be loaded into a shared memory in the processing core.
Type: Grant | Filed: November 29, 2005 | Date of Patent: November 4, 2008 | Assignee: NVIDIA Corporation | Inventor: Bryon S. Nordquist
-
Patent number: 7446773
Abstract: An integrated circuit includes at least two different types of processors. The integrated circuit includes an integrated host and associated scheduler. At least one operation is supported by two or more different types of processors. The scheduler schedules operations on the different types of processors.
Type: Grant | Filed: December 14, 2004 | Date of Patent: November 4, 2008 | Assignee: NVIDIA Corporation | Inventors: Jonah M. Alben, Stephen D. Lew, Paolo E. Sabella
-
Patent number: 7418574
Abstract: A peer-vector machine includes a host processor and a hardwired pipeline accelerator. The host processor executes a program, and, in response to the program, generates host data, and the pipeline accelerator generates pipeline data from the host data. Alternatively, the pipeline accelerator generates the pipeline data, and the host processor generates the host data from the pipeline data. Because the peer-vector machine includes both a processor and a pipeline accelerator, it can often process data more efficiently than a machine that includes only processors or only accelerators. For example, one can design the peer-vector machine so that the host processor performs decision-making and non-mathematically intensive operations and the accelerator performs non-decision-making and mathematically intensive operations.
Type: Grant | Filed: October 9, 2003 | Date of Patent: August 26, 2008 | Assignee: Lockheed Martin Corporation | Inventors: Chandan Mathur, Scott Hellenbach, John W. Rapp, Larry Jackson, Mark Jones, Troy Cherasaro
-
Patent number: 7404065
Abstract: In one embodiment, a method for flow optimization and prediction for vector streaming single instruction, multiple data (SIMD) extension (VSSE) memory operations is disclosed. The method comprises generating an optimized micro-operation (μop) flow for an instruction to operate on a vector if the instruction is predicted to be unmasked and unit-stride, the instruction to access elements in memory, and accessing via the optimized μop flow two or more of the elements at the same time without determining masks of the two or more elements. Other embodiments are also described.
Type: Grant | Filed: December 21, 2005 | Date of Patent: July 22, 2008 | Assignee: Intel Corporation | Inventors: Stephan Jourdan, Per Hammarlund, Michael Fetterman, Michael P. Cornaby, Glenn Hinton, Avinash Sodani
-
Publication number: 20080114964
Abstract: A single unified level one instruction cache in which some lines may contain traces and other lines in the same congruence class may contain blocks of instructions consistent with conventional cache lines. Control is exercised over which lines are contained within the cache. This invention avoids inefficiencies in the cache by removing trace lines experiencing early exits from the cache, or trace lines that are short, by maintaining a few bits of information about the accuracy of the control flow in a trace cache line and using that information in addition to the LRU (Least Recently Used) bits that maintain the recency information of a cache line, in order to make a replacement decision.
Type: Application | Filed: November 14, 2006 | Publication date: May 15, 2008 | Inventors: Gordon T. Davis, Richard W. Doing, John D. Jabusch, M V V Anil Krishna, Brett Olsson, Eric F. Robinson, Sumedh W. Sathaye, Jeffrey R. Summers
-
Patent number: 7356710
Abstract: A method, system and computer program product for computing a message authentication code for data in storage of a computing environment. An instruction specifies a unit of storage for which an authentication code is to be computed. A computing operation computes an authentication code for the unit of storage. A register is used for providing a cryptographic key for use in computing the authentication code. Further, the register may be used in a chaining operation.
Type: Grant | Filed: May 12, 2003 | Date of Patent: April 8, 2008 | Assignee: International Business Machines Corporation | Inventors: Shawn D. Lundvall, Ronald M. Smith, Sr., Phil Chi-Chung Yeh
-
Publication number: 20080082783
Abstract: The present invention is generally related to integrated circuit devices, and more particularly, to methods, systems and design structures for the field of image processing, and more specifically to vector units for supporting image processing. A dual vector unit implementation is described wherein two vector units are configured to receive data from a common register file. The vector units may independently and simultaneously process instructions. Furthermore, the vector units may be adapted to perform scalar operations thereby integrating the vector and scalar processing. The vector units may also be configured to share resources to perform an operation, for example, a cross product operation.
Type: Application | Filed: October 26, 2007 | Publication date: April 3, 2008 | Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION | Inventors: Eric Oliver Mejdrich, Adam James Muff, Matthew Ray Tubbs
-
Patent number: 7275147
Abstract: Execution of a single stand-alone instruction manipulates two n bit strings of data to pack data or align the data. Decoding of the single instruction identifies two registers of n bits each and a shift value, preferably as parameters of the instruction. A first and a second subset of data of less than n bits are selected, by logical shifting, from the two registers, respectively, based solely upon the shift value. Then, the subsets are concatenated, preferably by a logical OR, to obtain an output of n bits. The output may be aligned data or packed data, particularly useful for performing a single operation on multiple sets of the data through parallel processing with a SIMD processor.
Type: Grant | Filed: March 31, 2003 | Date of Patent: September 25, 2007 | Assignee: Hitachi, Ltd. | Inventor: Clifford Tavares
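The shift-and-OR concatenation described above can be sketched directly with integer bit operations. The 32-bit width and the choice of which register supplies the high bits are illustrative assumptions:

```python
N = 32               # register width in bits (illustrative)
MASK = (1 << N) - 1  # keep results within N bits

def align(ra, rb, shift):
    """Extract an N-bit window spanning two registers: logical
    shifts select a subset from each register based solely on
    the shift value, and a logical OR concatenates them."""
    hi = (ra << shift) & MASK   # subset from first register
    lo = rb >> (N - shift)      # subset from second register
    return hi | lo

# Window starting 8 bits into the pair (0x12345678, 0x9ABCDEF0).
print(hex(align(0x12345678, 0x9ABCDEF0, 8)))  # 0x3456789a
```

With shift = 0 the output is simply the first register, so the same instruction covers both aligned and misaligned cases.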
-
Patent number: 7257695
Abstract: According to some embodiments, a dynamic region in a register file may be described for an operand. The described region may, for example, store multiple data elements, each data element being associated with an execution channel of an execution engine. Information may then be stored into and/or retrieved from the register file in accordance with the described region.
Type: Grant | Filed: December 28, 2004 | Date of Patent: August 14, 2007 | Assignee: Intel Corporation | Inventors: Hong Jiang, Val Cook
-
Patent number: 7230633
Abstract: Methods and apparatuses for blending two images using vector table look up operations. In one aspect of the invention, a method to blend two images includes: loading a vector of keys into a vector register; converting the vector of keys into a first vector of blending factors for the first image and a second vector of blending factors for the second image using a plurality of look up tables; and computing an image attribute for the blended image using the blending factors.
Type: Grant | Filed: January 11, 2006 | Date of Patent: June 12, 2007 | Assignee: Apple Inc. | Inventors: Steven Todd Weybrew, David Ligon, Ronald Gerard Langhi
-
Patent number: 7206857
Abstract: A method is described that involves recognizing that an input queue state has reached a buffer's worth of information. The method also involves generating a first request to read a buffer's worth of information from an input RAM that implements the input queue. The method further involves recognizing that an output queue has room to receive information and that an intermediate queue that provides information to the output queue does not have information waiting to be forwarded to the output queue. The method also involves generating a second request to read information from the input RAM so that at least a portion of the room can be filled. The method also involves granting one of the first and second requests.
Type: Grant | Filed: May 10, 2002 | Date of Patent: April 17, 2007 | Assignee: Altera Corporation | Inventors: Neil Mammen, Greg Maturi, Mammen Thomas
-
Information storing method for computer system including a plurality of computers and storage system
Patent number: 7197606
Abstract: A computer 10a stores boot information OS1 and application information AP1 stored on a local disk 16a, the information being respectively stored as an OS1 shared file group in a shared LU1 and as an AP1 shared file group in a shared LUn+1. For personal information (including personal information of boot information or AP information), computer 10a stores the information as a user personal file group in a personal LU1. Computer 10a transmits image outline information, LU information and file information for the sets of information stored in shared LU1, shared LUn+1 and personal LU1 to a disk-image management server 30, where the information is stored in the storage device 31 of the disk-image management server 30.
Type: Grant | Filed: August 13, 2004 | Date of Patent: March 27, 2007 | Assignee: Hitachi, Ltd. | Inventors: Ikuko Kobayashi, Shinji Kimura, Ayumi Mikuma
-
Patent number: 7197623
Abstract: A protocol processor intended to be associated with at least one main processor of a system with a view to the execution of tasks to which the main processor is not suited. The protocol processor comprises a program part (30) including an incrementation register (31), a program memory (33) connected to the incrementation register (31) in order to receive addresses therefrom, a decoding part (35) intended to receive instructions from the program memory (33) of the program part (30) with a view to executing an instruction in two cycles, and a data part (36) for executing the instruction.
Type: Grant | Filed: June 28, 2000 | Date of Patent: March 27, 2007 | Assignee: Texas Instruments Incorporated | Inventors: Gerard Chauvel, Francis Aussedat, Pierre Calippe
-
Patent number: 7159099
Abstract: A re-configurable, streaming vector processor (100) is provided which includes a number of function units (102), each having one or more inputs for receiving data values and an output for providing a data value, a re-configurable interconnection switch (104) and a micro-sequencer (118). The re-configurable interconnection switch (104) includes one or more links, each link operable to couple an output of a function unit (102) to an input of a function unit (102) as directed by the micro-sequencer (118). The vector processor may also include one or more input-stream units (122) for retrieving data from memory. Each input-stream unit is directed by a host processor and has a defined interface (116) to the host processor. The vector processor also includes one or more output-stream units (124) for writing data to memory or to the host processor. The defined interface of the input-stream and output-stream units forms a first part of the programming model.
Type: Grant | Filed: June 28, 2002 | Date of Patent: January 2, 2007 | Assignee: Motorola, Inc. | Inventors: Brian Geoffrey Lucas, Philip E. May, Kent Donald Moat, Raymond B. Essick, IV, Silviu Chiricescu, James M. Norris, Michael Allen Schuette, Ali Saidi
-
Patent number: 7149877
Abstract: A disclosed byte execution unit receives byte instruction information and two operands, and performs an operation specified by the byte instruction information upon one or both of the operands, thereby producing a result. The byte instruction specifies either a count ones in bytes operation, an average bytes operation, an absolute differences of bytes operation, or a sum bytes into halfwords operation. In one embodiment, the byte execution unit includes multiple byte units. Each byte unit includes multiple population counters, two compressor units, adder input multiplexer logic, adder logic, and result multiplexer logic. A data processing system is described including a processor coupled to a memory system. The processor includes the byte execution unit. The memory system includes a byte instruction, wherein the byte instruction specifies either the count ones in bytes operation, the average bytes operation, the absolute differences of bytes operation, or the sum bytes into halfwords operation.
Type: Grant | Filed: July 17, 2003 | Date of Patent: December 12, 2006 | Assignee: International Business Machines Corporation | Inventors: Sang Hoo Dhong, Hwa-Joon Oh, Brad William Michael, Silvia Melitta Mueller, Kevin D. Tran
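The four byte operations named above have simple reference semantics, sketched below in Python. Word width (32 bits), rounding direction of the average, and pairing for the halfword sum are assumptions for illustration:

```python
def count_ones_in_bytes(word):
    """Population count of each byte of a 32-bit word, low byte first."""
    return [bin((word >> (8 * i)) & 0xFF).count("1") for i in range(4)]

def average_bytes(a, b):
    """Per-byte average of two byte vectors (truncating division assumed)."""
    return [(x + y) // 2 for x, y in zip(a, b)]

def abs_diff_bytes(a, b):
    """Per-byte absolute difference of two byte vectors."""
    return [abs(x - y) for x, y in zip(a, b)]

def sum_bytes_into_halfwords(bytes_):
    """Sum adjacent byte pairs into wider halfword results."""
    return [bytes_[i] + bytes_[i + 1] for i in range(0, len(bytes_), 2)]

print(count_ones_in_bytes(0x0103070F))  # [4, 3, 2, 1]
```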
-
Patent number: 7146486
Abstract: A scalar processor that includes a plurality of scalar arithmetic logic units and a special function unit. Each scalar unit performs, in a different time interval, the same operation on a different data item, where each different time interval is one of a plurality of successive, adjacent time intervals. Each unit provides an output data item in the time interval in which the unit performs the operation and provides a processed data item in the last of the successive, adjacent time intervals. The special function unit provides a special function computation for the output data item of a selected one of the scalar units, in the time interval in which the selected scalar unit performs the operation, so as to avoid a conflict in use among the scalar units. A vector processing unit includes an input data buffer, the scalar processor, and an output orthogonal converter.
Type: Grant | Filed: January 29, 2003 | Date of Patent: December 5, 2006 | Assignee: S3 Graphics Co., Ltd. | Inventors: Boris Prokopenko, Timour Paltashev, Derek Gladding
-
Patent number: 7043627
Abstract: To alleviate factors that obstruct the effectiveness of SIMD operation, such as in-register data alignment, in a high-speed SIMD processor, a large amount of data can be supplied to a data alignment operation pipe 211 by dividing a register file into four banks and allowing a plurality of registers to be designated by a single operand, so that four registers can be accessed simultaneously and data alignment can be carried out at high speed. Further, by defining new data pack, data unpack, and data permutation instructions, the data supplied in large quantities can be aligned efficiently. This characteristic also permits definition of a multiply accumulate operation instruction that maximizes the parallelism of SIMD.
Type: Grant | Filed: September 4, 2001 | Date of Patent: May 9, 2006 | Assignee: Hitachi, Ltd. | Inventors: Takehiro Shimizu, Fumio Arakawa
-
Patent number: 7043607
Abstract: The vector unit 21 outputs a first flash address to the flash address array 24. The vector unit 31 outputs a second flash address to the flash address array 34. In the master unit 2, the flash address array 24 compares an address registered in a cache with the first flash address. In the slave unit 3, the flash address array 34 compares the address registered in the cache with the second flash address. When said first flash address coincides with said address registered in said cache, the flash address array 24 sends a first coincidence address to the address array 25. When said second flash address coincides with said address registered in said cache, the flash address array 34 sends a second coincidence address to the address array 25. A corresponding address of the address array 25 is flashed based on the first coincidence address sent from the flash address array 24 and the second coincidence address sent from the flash address array 34.
Type: Grant | Filed: June 12, 2003 | Date of Patent: May 9, 2006 | Assignee: NEC Corporation | Inventor: Kenji Ezoe
-
Patent number: 7034849
Abstract: Methods and apparatuses for blending two images using vector table look up operations. In one aspect of the invention, a method to blend two images includes: loading a vector of keys into a vector register; converting the vector of keys into a first vector of blending factors for the first image and a second vector of blending factors for the second image using a plurality of look up tables; and computing an image attribute for the blended image using the blending factors.
Type: Grant | Filed: December 31, 2001 | Date of Patent: April 25, 2006 | Assignee: Apple Computer, Inc. | Inventors: Steven Todd Weybrew, David Ligon, Ronald Gerard Langhi
-
Patent number: 6996698
Abstract: Processing restrictions of a computing environment are filtered and blocked, in certain circumstances, such that processing continues despite the restrictions. One restriction includes an indication that fetching of storage keys is prohibited, in response to a buffer miss. When a processing unit of the computing environment is met with this restriction, it performs a comparison of addresses, which indicates whether the fetching can continue. If fetching can continue, the restriction is ignored.
Type: Grant | Filed: May 12, 2003 | Date of Patent: February 7, 2006 | Assignee: International Business Machines Corporation | Inventors: Timothy J. Slegel, Jane H. Bartik, Lisa C. Heller, Erwin F. Pfeffer, Ute Gaertner
-
Patent number: 6968445
Abstract: A multithreaded processor includes an instruction decoder for decoding retrieved instructions to determine an instruction type for each of the retrieved instructions, an integer unit coupled to the instruction decoder for processing integer type instructions, and a vector unit coupled to the instruction decoder for processing vector type instructions. A reduction unit is preferably associated with the vector unit and receives parallel data elements processed in the vector unit. The reduction unit generates a serial output from the parallel data elements. The processor may be configured to execute at least control code, digital signal processor (DSP) code, Java code and network processing code, and is therefore well-suited for use in a convergence device. The processor is preferably configured to utilize token triggered threading in conjunction with instruction pipelining.
Type: Grant | Filed: October 11, 2002 | Date of Patent: November 22, 2005 | Assignee: Sandbridge Technologies, Inc. | Inventors: Erdem Hokenek, Mayan Moudgill, C. John Glossner
-
Patent number: 6968402
Abstract: Techniques to buffer and present chunks are disclosed. In some embodiments, a first interface may receive chunks of a first cache line, and a second interface may receive chunks of a second cache line. A buffer may store chunks of the first cache line in a first chunk order and may store chunks of the second cache line in a second chunk order. A control unit may present a requester via the second interface with one or more chunks of the first cache line from the buffer.
Type: Grant | Filed: May 22, 2003 | Date of Patent: November 22, 2005 | Assignee: Intel Corporation | Inventors: David R. Jackson, Stephen W. Kiss, Miles F. Schwartz
-
Patent number: 6963341
Abstract: The present invention provides efficient ways to implement scan conversion and matrix transpose operations using vector multiplex operations in a SIMD processor. The present method provides a very fast and flexible way to implement different scan conversions, such as zigzag conversion, and matrix transpose for 2×2, 4×4, 8×8 blocks commonly used by all video compression and decompression algorithms.
Type: Grant | Filed: May 20, 2003 | Date of Patent: November 8, 2005 | Inventor: Tibet Mimar
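A scan conversion of this kind reduces to a single gather through a precomputed index vector, which is what a vector multiplex/permute instruction executes in one step. The sketch below models that in Python (the index generation is the standard codec zigzag, not code from the patent):

```python
def zigzag_indices(n):
    """Zigzag scan order for an n x n block, as used by video
    codecs: walk anti-diagonals (r + c constant), alternating
    direction, and emit row-major indices into the block."""
    order = sorted(((r, c) for r in range(n) for c in range(n)),
                   key=lambda rc: (rc[0] + rc[1],
                                   rc[0] if (rc[0] + rc[1]) % 2 else rc[1]))
    return [r * n + c for r, c in order]

def permute(block, indices):
    """Model of a vector multiplex: gather elements by an index vector."""
    return [block[i] for i in indices]

print(zigzag_indices(4))
# [0, 1, 4, 8, 5, 2, 3, 6, 9, 12, 13, 10, 7, 11, 14, 15]
```

Matrix transpose works identically: replace the index vector with `[c * n + r for r in range(n) for c in range(n)]` and reuse the same permute.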
-
Patent number: 6950834
Abstract: A database table reorganization is defined to permit online access of the table during the reorganization. Records are reorganized in the database table by vacating records from a defined number of pages and then filling the pages with records in accordance with a desired ordering for the records. Temporary pointers to the new locations of moved records are used to prevent table scanner access to the database table from missing or duplicating records while scanning the database table during reorganization. Removal of the temporary pointers is synchronized with the completion of scanning of all table scanners that are commenced during a time when records are being moved as part of a vacating or filling step.
Type: Grant
Filed: January 29, 2001
Date of Patent: September 27, 2005
Assignee: International Business Machines Corporation
Inventors: Matthew A. Huras, Nelson Hop Hing, Jeffrey J. Goss, Bruce G. Lindsay
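The temporary-pointer trick can be shown in miniature (Python; a flat list stands in for pages, a tagged tuple for the pointer, and records are keyed dicts — the real system's latching and pointer-removal synchronization is omitted):

```python
FORWARD = "forward"  # tag for a temporary pointer left at a vacated slot

def move_record(table, src, dst):
    """Vacate slot src into dst, leaving a temporary pointer behind so
    a concurrent scanner neither misses nor duplicates the record."""
    table[dst] = table[src]
    table[src] = (FORWARD, dst)

def scan(table):
    """A table scanner that chases temporary pointers and de-duplicates
    by record key, so a scan running mid-reorganization still sees
    each record exactly once."""
    seen, out = set(), []
    for entry in table:
        if entry is None:
            continue
        if isinstance(entry, tuple) and entry[0] == FORWARD:
            entry = table[entry[1]]          # follow the pointer
        if entry["key"] not in seen:
            seen.add(entry["key"])
            out.append(entry)
    return out
```

A scanner arriving at the vacated slot is forwarded to the record's new home; a scanner that later reaches the new home directly skips the duplicate.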
-
Patent number: 6924802
Abstract: A system, method, and computer program product are provided for generating display data. The data processing system loads coefficient values corresponding to the behavior of a selected function in pre-defined ranges of input data. The data processing system then determines, responsive to items of input data, the range of input data in which the selected function is to be estimated. The data processing system then selects, through the use of a vector permute function, the coefficient values, and evaluates an index function at each of the items of input data. It then estimates the value of the selected function through parallel mathematical operations on the items of input data, the selected coefficient values, and the values of the index function, and, responsive to the one or more values of the selected function, generates display data.
Type: Grant
Filed: September 12, 2002
Date of Patent: August 2, 2005
Assignee: International Business Machines Corporation
Inventors: Gordon Clyde Fossum, Harm Peter Hofstee, Barry L. Minor, Mark Richard Nutter
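The same flow can be sketched scalar-by-scalar in Python; the breakpoints, coefficient tables, and index function below are invented for illustration, and the coefficient gather that the patent performs with a SIMD vector permute appears here as a plain table lookup:

```python
def make_piecewise_eval(breakpoints, coeff_table):
    """Estimate a function by piecewise polynomials: an index function
    maps each input to its range, a permute-style gather fetches that
    range's coefficients, and Horner's rule evaluates the polynomial.
    Every step vectorizes across a batch of inputs."""
    def evaluate(xs):
        out = []
        for x in xs:
            # index function: which pre-defined range does x fall in?
            idx = sum(1 for b in breakpoints if x >= b) - 1
            coeffs = coeff_table[idx]        # gather (vector permute in HW)
            y = 0.0                          # Horner evaluation
            for c in coeffs:
                y = y * x + c
            out.append(y)
        return out
    return evaluate

# Two linear pieces approximating |x|: slope -1 below 0, slope +1 above.
absolute = make_piecewise_eval([-1e9, 0.0], [[-1.0, 0.0], [1.0, 0.0]])
```

On SIMD hardware the per-element loop disappears: the index computation, the coefficient permute, and the multiply-adds each run across all lanes at once.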
-
Publication number: 20040267957
Abstract: Windowed multiuser detection techniques are disclosed. A window of data is established, and certain central bits within the window are selected as reliable, while other side bits are ignored. The selected bits are demodulated. The windowed multiuser detector then moves to the next window in such a manner that the next group of central bit decisions lies contiguous with the previous set, so that eventually every bit to be demodulated has at some point been a central bit decision. Almost any type of MUD algorithm (e.g., MMSE MUD or M-algorithm MUD) can be used to compute estimates in the windowed data. Unreliable windowed data are distinguished from reliable data (e.g., by a weighting or other de-emphasis scheme).
Type: Application
Filed: February 11, 2004
Publication date: December 30, 2004
Inventor: Robert B. MacLeod
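The window-advancing logic can be sketched independently of any particular MUD algorithm (Python; the estimator is a stand-in parameter, and the window and reliable-region sizes are illustrative, not taken from the publication):

```python
def windowed_detect(estimator, received, window, reliable):
    """Slide a window over the received sequence; in each window run
    some MUD estimator on the full window but keep only the central
    'reliable' decisions, then advance so successive reliable regions
    tile the sequence -- every bit is eventually a central decision."""
    side = (window - reliable) // 2   # discarded context on each side
    out, pos, n = [], 0, len(received)
    while len(out) < n:
        lo = max(0, pos - side)
        hi = min(n, pos + reliable + side)
        est = estimator(received[lo:hi])
        # keep only the decisions for positions [pos, pos + reliable)
        out.extend(est[pos - lo : pos - lo + min(reliable, n - pos)])
        pos += reliable
    return out
```

With a perfect estimator the side bits contribute context but never final decisions, which is the point of the de-emphasis: edge estimates inside each window are the least trustworthy.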
-
Publication number: 20040210739
Abstract: A vector signal processor of the present invention consists of a digital signal processor unit, a VSP command bus, a data flow interface, a broadcast interface, a multi-channel buffered serial port (McBSP) network, and a host interface. The vector signal processor of this invention has high processing speed, better communication between modules, and far better coordination, and uses daughter cards to enhance various processing functions.
Type: Application
Filed: January 6, 2003
Publication date: October 21, 2004
Inventors: Yung-Po Huang, Che-Hui Chang-Chien, Chao-Yuan Huang, Chiung-Hung Chang
-
Patent number: 6807620
Abstract: The present invention relates to the architecture and use of a computer system optimized for the efficient modeling of graphics. The computer system has a primary processor and a graphics processor. The primary processor has two vector processor units within it, one of which is closely connected to the central processor unit. Simultaneously performing complex modeling calculations on the first vector processor and the CPU, and geometry transformation calculations on the second vector processor, allows for efficient modeling of graphics. Furthermore, the graphics processor is optimized to rapidly switch between data flows from the two vector processors. In addition, the graphics processor is able to render many pixels simultaneously, and has a local memory on the graphics processor chip that acts as a frame buffer, texture buffer, and z buffer. This allows a high fill rate to the frame buffer.
Type: Grant
Filed: February 11, 2000
Date of Patent: October 19, 2004
Assignee: Sony Computer Entertainment Inc.
Inventors: Masakazu Suzuoki, Akio Ohba, Masaaki Oka, Toshiyuki Hiroi, Teiji Yutaka, Toyoshi Okada, Masayoshi Tanaka
-
Patent number: 6782470
Abstract: The register file of a processor includes embedded operand queues. The configuration of the register file into registers and operand queues is defined dynamically by a computer program. The programmer determines the trade-off between the number and size of the operand queue(s) versus the number of registers used for the program. The programmer partitions a portion of the registers into one or more operand queues. A given queue occupies a consecutive set of registers, although multiple queues need not occupy consecutive registers. An additional address bit is included to distinguish operand queue addresses from register addresses. Queue state logic tracks status information for each queue, including a head pointer, tail pointer, start address, end address and number-of-vacancies value. The program sets the locations and depth of a given operand queue within the register file.
Type: Grant
Filed: November 6, 2000
Date of Patent: August 24, 2004
Assignee: University of Washington
Inventors: Stefan G. Berg, Michael S. Grow, Weiyun Sun, Donglok Kim, Yongmin Kim
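A behavioral sketch of one such queue (Python; the register file is just a list, and the pointer and vacancy bookkeeping mirrors the queue state logic described, though the sizes here are made up):

```python
class OperandQueue:
    """A FIFO operand queue occupying a consecutive span [start, end]
    of a register file, tracked by head/tail pointers and a vacancy
    count as in the patent's queue state logic."""
    def __init__(self, regfile, start, end):
        self.regs = regfile
        self.start, self.end = start, end            # inclusive span
        self.head = self.tail = start
        self.vacancies = end - start + 1

    def enqueue(self, value):
        if self.vacancies == 0:
            raise OverflowError("operand queue full")
        self.regs[self.tail] = value
        self.tail = self.start if self.tail == self.end else self.tail + 1
        self.vacancies -= 1

    def dequeue(self):
        if self.vacancies == self.end - self.start + 1:
            raise IndexError("operand queue empty")
        value = self.regs[self.head]
        self.head = self.start if self.head == self.end else self.head + 1
        self.vacancies += 1
        return value
```

Registers outside the queue's span remain ordinary registers, which is the dynamic partitioning trade-off the abstract describes: a deeper queue costs addressable registers and vice versa.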
-
Publication number: 20040117599
Abstract: A functional-level instruction-set computing (FLIC) architecture executes higher-level functional instructions such as lookups and bit-compares of variable-length operands. Each FLIC processing-engine slice has specialized processing units including a lookup unit that searches for a matching entry in a lookup cache. Variable-length operands are stored in execution buffers. The operand length and location in the execution buffer are stored in fixed-length general-purpose registers (GPRs) that also store fixed-length operands. A copy/move unit moves data between input and output buffers and one or more FLIC processing-engine slices. Multiple contexts can each have a set of GPRs and execution buffers. An expansion buffer in a FLIC slice can be allocated to a context to expand that context's execution buffer for storing longer operands.
Type: Application
Filed: December 12, 2002
Publication date: June 17, 2004
Applicant: NEXSIL COMMUNICATIONS, INC.
Inventors: Millind Mittal, Mehul Kharidia, Tarun Kumar Tripathy, J. Sukarno Mertoguno
-
Patent number: 6734874
Abstract: A method, apparatus and article of manufacture are provided for handling both scalar and vector components during graphics processing. To accomplish this, vertex data is received in the form of vectors, after which vector operations are performed on the vector vertex data. Next, scalar operations may be executed on an output of the vector operations, thereby rendering vertex data in the form of scalars. Such scalar vertex data may then be converted to vector vertex data for performing vector operations thereon.
Type: Grant
Filed: January 31, 2001
Date of Patent: May 11, 2004
Assignee: nVidia Corporation
Inventors: John Erik Lindholm, Simon Moy, David B. Kirk, Paolo E. Sabella
-
Publication number: 20040059436
Abstract: A data processing architecture comprises one or more data processing components, each associated with a logical level, such that a data processing component only accepts input from data processing components in a logically higher or lower level or from an external source, and only provides output to data processing components in a logically higher or lower level or to an external recipient system. A data processing component cannot accept input from, or provide output to, a data processing component in the same logical level, and will only accept an input that conforms to an ontology related to the logical level with which it is associated.
Type: Application
Filed: March 24, 2003
Publication date: March 25, 2004
Inventors: Mark Stephen Anderson, Dean Crawford Engelhardt, Damian Andrew Marriott, Suneel Singh Randhawa
-
Patent number: 6697930
Abstract: A pipeline video decoder and decompression system handles a plurality of separately encoded bit streams arranged as a single serial bit stream of digital bits and having separately encoded pairs of control codes and corresponding data carried in the serial bit stream. The pipeline system employs a plurality of interconnected stages to decode and decompress the single bit stream, including a start code detector. When in a search mode, the start code detector searches for a specific start code corresponding to one of multiple compression standards. The start code detector, responding to the single serial bit stream, generates control tokens and data tokens. A respective one of the tokens includes a plurality of data words. Each data word has an extension bit that indicates the presence of additional words. The data words are thereby unlimited in number.
Type: Grant
Filed: February 7, 2001
Date of Patent: February 24, 2004
Assignee: Discovision Associates
Inventors: Adrian P Wise, Martin W Sotheran, William P Robbins, Anthony M Jones, Helen R Finch, Kevin J Boyd, Anthony Peter J Claydon
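The extension-bit scheme is essentially a variable-length word encoding, in the same family as base-128 varints. A Python sketch with 8 data bits plus one extension bit per word (the word width is an assumption, not taken from the patent):

```python
DATA_BITS = 8
DATA_MASK = (1 << DATA_BITS) - 1

def to_words(value):
    """Split a value into little-endian data words; each word carries
    an extension bit saying whether another word follows, so a token
    can hold an unlimited number of words."""
    words = []
    while True:
        chunk = value & DATA_MASK
        value >>= DATA_BITS
        more = 1 if value else 0
        words.append((more << DATA_BITS) | chunk)   # bit 8 = extension
        if not more:
            return words

def from_words(words):
    """Reassemble the value, stopping at the first word whose
    extension bit is clear."""
    value, shift = 0, 0
    for w in words:
        value |= (w & DATA_MASK) << shift
        shift += DATA_BITS
        if not (w >> DATA_BITS):
            break
    return value
```

Because the "more words follow" signal lives inside each word, downstream pipeline stages never need a separate length field and tokens of any size flow through the same datapath.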
-
Publication number: 20040015677
Abstract: A digital signal processor (DSP) includes a SIMD-based organization wherein operations are executed on a plurality of single-instruction multiple-data (SIMD) datapaths or stages connected in cascade. The functionality and data values at each stage may be different, including a different width (e.g., a different number of bits per value) in each stage. The operands and destination for data in a computational datapath are selected indirectly through vector pointer registers in a vector pointers datapath. Each vector pointer register contains a plurality of pointers into a register file of a computational datapath.
Type: Application
Filed: July 18, 2002
Publication date: January 22, 2004
Applicant: International Business Machines Corporation
Inventors: Jaime H. Moreno, Jeffrey Haskell Derby, Uzi Shvadron, Fredy Daniel Neeser, Victor Zyuban, Ayal Zaks, Shay Ben-David
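The indirect-operand idea can be sketched as follows (Python; the class name, sizes, and the post-increment option are illustrative assumptions based only on the abstract's description of pointers into a computational datapath's register file):

```python
class VectorPointerDatapath:
    """Operands are named indirectly: an instruction carries a pointer
    register number, and that pointer supplies the actual register-file
    index, optionally advancing afterward to stream through the file."""
    def __init__(self, regfile_size=16, num_pointers=4):
        self.regs = [0] * regfile_size   # computational datapath registers
        self.vptr = [0] * num_pointers   # vector pointer registers

    def read(self, p, post_increment=False):
        value = self.regs[self.vptr[p]]
        if post_increment:
            self.vptr[p] = (self.vptr[p] + 1) % len(self.regs)
        return value

    def write(self, p, value, post_increment=False):
        self.regs[self.vptr[p]] = value
        if post_increment:
            self.vptr[p] = (self.vptr[p] + 1) % len(self.regs)
```

The payoff is that the same instruction, re-executed with advancing pointers, walks an entire operand array without any per-element address arithmetic in the instruction stream.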
-
Patent number: 6675379
Abstract: A method for memory management in execution of a program by a computer having a memory includes identifying in the program an array of array elements. At a given point in the program, a range of the elements is determined within the array such that none of the elements in the array outside the range is live at that point. Information regarding the determined range is passed to a memory management function, so that memory locations are associated with the array elements responsive to the determined range.
Type: Grant
Filed: June 30, 2000
Date of Patent: January 6, 2004
Assignee: International Business Machines Corporation
Inventors: Elliot Karl Kolodner, Ran Shaham, Mooly Sagiv
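The effect on memory management can be shown with a toy sketch (Python; in the patent the live range comes from static analysis of the program, whereas here it is simply passed in, and `None`-ing a slot stands in for telling the collector the element is dead):

```python
def release_dead_elements(array, live_lo, live_hi):
    """Given that only elements in [live_lo, live_hi) are live at this
    program point, drop references to the rest so the garbage collector
    can reclaim them even though the array itself stays reachable."""
    for i in range(len(array)):
        if not (live_lo <= i < live_hi):
            array[i] = None
    return array
```

Without the range information, a reachable array pins every element it references; with it, the collector can reclaim the dead prefix and suffix early.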