Scalar/vector Processor Interface Patents (Class 712/3)

VECTOR REGISTER ADDRESSING AND FUNCTIONS BASED ON A SCALAR REGISTER DATA VALUE

Publication number: 20140244967

Abstract: Techniques are provided for executing a vector alignment instruction. A scalar register file in a first processor is configured to share one or more register values with a second processor, the one or more register values accessed from the scalar register file according to an Rt address specified in a vector alignment instruction, wherein a start location is determined from one of the shared register values. An alignment circuit in the second processor is configured to align data identified between the start location within a beginning Vu register of a vector register file (VRF) and an end location of a last Vu register of the VRF according to the vector alignment instruction. A store circuit is configured to select the aligned data from the alignment circuit and store the aligned data in the vector register file according to an alignment store address specified by the vector alignment instruction.
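
As a rough illustration of the alignment described in this abstract, the sketch below models two vector registers as Python lists and uses a scalar value to pick the start location. The function name `valign`, the list model, and the choice to reduce the scalar value modulo the register width are assumptions made for illustration, not details taken from the application.

```python
def valign(vu_first, vu_last, rt_value):
    """Return one register's worth of elements starting at the offset held in Rt.

    vu_first / vu_last are hypothetical stand-ins for the beginning and last
    Vu registers; rt_value stands in for the shared scalar register value.
    """
    width = len(vu_first)
    if len(vu_last) != width:
        raise ValueError("both vector registers must have the same width")
    # Assumption: only the low-order part of the scalar value selects the start.
    start = rt_value % width
    # Concatenate the two registers and take a register-wide window.
    window = list(vu_first) + list(vu_last)
    return window[start:start + width]


aligned = valign([0, 1, 2, 3], [4, 5, 6, 7], 1)
```

Here `aligned` holds the four elements that begin one position into the first register and spill into the second.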

Type: Application

Filed: February 26, 2013

Publication date: August 28, 2014

Applicant: Qualcomm Incorporated

Inventors: Ajay A. Ingle, Marc M. Hoffman, Jose Fridman, Lucian Codrescu

MAPPING VECTOR REPRESENTATIONS ONTO A PREDICATED SCALAR MULTI-THREADED SYSTEM

Publication number: 20140244968

Abstract: A system implementing a method for generating code for execution based on a SIMT model with parallel units of threads is provided. The system identifies a loop within a program that includes vector processing. The system generates instructions for a thread that include an instruction to set a predicate based on whether the thread of a parallel unit corresponds to a vector element. The system also generates instructions to perform the vector processing via scalar operations predicated on the predicate. As a result, the system generates instructions to perform the vector processing but to avoid branch divergence within the parallel unit of threads that would be needed to check whether a thread corresponds to a vector element.
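
The predication idea in this abstract can be sketched as a small sequential simulation. Everything here is an assumption for illustration: the name `predicated_vector_add`, the parallel-unit size of 4, and the use of an elementwise add as the vector operation. Python has no real predication, so the final conditional only models a predicated commit.

```python
WARP_SIZE = 4  # hypothetical size of one parallel unit of threads


def predicated_vector_add(a, b, warp_size=WARP_SIZE):
    """Simulate a vector add carried out by scalar, predicated threads."""
    n = len(a)
    out = [None] * n
    num_warps = (n + warp_size - 1) // warp_size
    for warp in range(num_warps):
        # Lanes of one parallel unit conceptually execute in lockstep.
        for lane in range(warp_size):
            tid = warp * warp_size + lane
            # Set the predicate: does this thread correspond to a vector element?
            pred = tid < n
            # Every lane issues the same scalar add; lanes without an element
            # read a safe dummy index so no lane has to branch around the work.
            idx = tid if pred else 0
            value = a[idx] + b[idx]
            # Models a predicated store: only lanes with pred set commit.
            if pred:
                out[tid] = value
    return out


result = predicated_vector_add([1, 2, 3, 4, 5], [10, 20, 30, 40, 50])
```

With five elements and a unit size of 4, the second unit has three lanes whose predicate is false; they execute the same instructions but commit nothing.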

Type: Application

Filed: February 28, 2013

Publication date: August 28, 2014

Applicant: Cray Inc.

Optimized Matrix and Vector Operations In Instruction Limited Algorithms That Perform EOS Calculations

Publication number: 20140201450

Abstract: There is provided a system and method for optimizing matrix and vector calculations in instruction limited algorithms that perform EOS calculations. The method includes dividing each matrix associated with an EOS stability equation and an EOS phase split equation into a number of tiles, wherein the tile size is heterogeneous or homogenous. Each vector associated with the EOS stability equation and the EOS phase split equation may be divided into a number of strips. The tiles and strips may be stored in main memory, cache, or registers, and the matrix and vector operations associated with successive substitutions and Newton iterations may be performed in parallel using the tiles and strips.
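
A minimal sketch of the tile-and-strip decomposition, applied to a plain matrix-vector product. The function name `tiled_matvec`, the square tile size of 2, and the choice of a matrix-vector product as the operation are illustrative assumptions; the abstract does not specify them, and nothing here is specific to EOS equations.

```python
def tiled_matvec(matrix, vector, tile=2):
    """Multiply matrix by vector, visiting the matrix tile by tile.

    The vector is consumed in strips that match the tile width, so each
    tile/strip pair could in principle be kept in cache or registers.
    """
    n_rows = len(matrix)
    n_cols = len(vector)
    result = [0.0] * n_rows
    for r0 in range(0, n_rows, tile):
        for c0 in range(0, n_cols, tile):
            strip = vector[c0:c0 + tile]  # one strip of the vector
            for r in range(r0, min(r0 + tile, n_rows)):
                row_tile = matrix[r][c0:c0 + tile]  # one row of the tile
                result[r] += sum(m * v for m, v in zip(row_tile, strip))
    return result


product = tiled_matvec([[1, 2, 3], [4, 5, 6], [7, 8, 9]], [1, 0, -1])
```

Edge tiles are simply smaller here, which is one simple way to get the heterogeneous tile sizes the abstract allows.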

Type: Application

Filed: July 23, 2012

Publication date: July 17, 2014

Inventor: Kjetil B. Haugen

COLLAPSING OF MULTIPLE NESTED LOOPS, METHODS AND INSTRUCTIONS

Publication number: 20140189287

Abstract: In an embodiment, the present invention is directed to a processor including a decode logic to receive a multi-dimensional loop counter update instruction and to decode the multi-dimensional loop counter update instruction into at least one decoded instruction, and an execution logic to execute the at least one decoded instruction to update at least one loop counter value of a first operand associated with the multi-dimensional loop counter update instruction by a first amount. Methods to collapse loops using such instructions are also disclosed. Other embodiments are described and claimed.
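
One way to picture a multi-dimensional loop counter update is as a mixed-radix increment with carry. The sketch below is an assumption-laden software model: the names `update_counters` and `collapsed_iteration`, the innermost-last ordering, and the returned wrap flag are all hypothetical and not drawn from the application.

```python
def update_counters(counters, limits, amount=1):
    """Advance a multi-dimensional loop counter (innermost dimension last).

    Returns the new counters and a flag that is True once the outermost
    dimension wraps, which is when the collapsed loop nest is finished.
    """
    counters = list(counters)
    carry = amount
    for dim in reversed(range(len(counters))):
        total = counters[dim] + carry
        counters[dim] = total % limits[dim]
        carry = total // limits[dim]
    return counters, carry > 0


def collapsed_iteration(limits):
    """Visit every index tuple of a loop nest using a single flat loop."""
    counters = [0] * len(limits)
    done = any(limit == 0 for limit in limits)  # an empty dimension means no work
    visited = []
    while not done:
        visited.append(tuple(counters))
        counters, done = update_counters(counters, limits)
    return visited


order = collapsed_iteration([2, 3])
```

The single `while` loop replaces a two-level nest, and `order` lists the index pairs in the same order the nested loops would have produced them.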

Type: Application

Filed: December 27, 2012

Publication date: July 3, 2014

Inventors: Mikhail Plotnikov, Andrey Naraikin, Elmoustapha Ould-Ahmed-Vall

Methods and systems for command acceleration in a video processor via translation of scalar instructions into vector instructions

Patent number: 8738891

Abstract: A method for implementing command acceleration. The method includes receiving a first set of instructions from a first processor, wherein the first set of instructions are formatted in accordance with a microarchitecture of the first processor. The first set of instructions are translated into a second set of instructions, wherein the second set of instructions are formatted in accordance with a microarchitecture of a second processor. The second set of instructions are then transmitted to the second processor for execution by the second processor.

Type: Grant

Filed: November 4, 2005

Date of Patent: May 27, 2014

Assignee: NVIDIA Corporation

Inventors: Ashish Karandikar, Shirish Gadre, Amir H. Salek

Video processor having scalar and vector components

Patent number: 8698817

Abstract: A video processor for executing video processing operations. The video processor includes a host interface for implementing communication between the video processor and a host CPU. A memory interface is included for implementing communication between the video processor and a frame buffer memory. A scalar execution unit is coupled to the host interface and the memory interface and is configured to execute scalar video processing operations. A vector execution unit is coupled to the host interface and the memory interface and is configured to execute vector video processing operations.

Type: Grant

Filed: November 4, 2005

Date of Patent: April 15, 2014

Assignee: Nvidia Corporation

Inventors: Shirish Gadre, Ashish Karandikar, Stephen D. Lew, Christopher T. Cheng

Latency tolerant system for executing video processing operations

Patent number: 8687008

Abstract: A latency tolerant system for executing video processing operations. The system includes a host interface for implementing communication between the video processor and a host CPU, a scalar execution unit coupled to the host interface and configured to execute scalar video processing operations, and a vector execution unit coupled to the host interface and configured to execute vector video processing operations. A command FIFO is included for enabling the vector execution unit to operate on a demand driven basis by accessing the memory command FIFO. A memory interface is included for implementing communication between the video processor and a frame buffer memory. A DMA engine is built into the memory interface for implementing DMA transfers between a plurality of different memory locations and for loading the command FIFO with data and instructions for the vector execution unit.

Type: Grant

Filed: November 4, 2005

Date of Patent: April 1, 2014

Assignee: NVIDIA Corporation

Inventors: Ashish Karandikar, Shirish Gadre, Stephen D. Lew

Sharing a fault-status register when processing vector instructions

Patent number: 8683178

Abstract: The described embodiments provide a processor that executes vector instructions. In the described embodiments, the processor initializes an architectural fault-status register (FSR) and a shadow copy of the architectural FSR by setting each of N bit positions in the architectural FSR and the shadow copy of the architectural FSR to a first predetermined value. The processor then executes a first first-faulting or non-faulting (FF/NF) vector instruction. While executing the first vector instruction, the processor also executes one or more subsequent FF/NF instructions. In these embodiments, when executing the first vector instruction and the subsequent vector instructions, the processor updates one or more bit positions in the shadow copy of the architectural FSR to a second predetermined value upon encountering a fault condition.

Type: Grant

Filed: April 20, 2011

Date of Patent: March 25, 2014

Assignee: Apple Inc.

Inventor: Jeffry E. Gonion

Multi context execution on a video processor

Patent number: 8683184

Abstract: A method for implementing multi context execution on a video processor having a scalar execution unit and a vector execution unit. The method includes allocating a first task to a vector execution unit and allocating a second task to the vector execution unit. The first task is from a first context and the second task is from a second context. The method further includes interleaving a plurality of work packages comprising the first task and the second task to generate a combined work package stream. The combined work package stream is subsequently executed on the vector execution unit.
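
The interleaving step can be sketched as a simple round-robin merge of two work-package lists. The function name `interleave_work_packages`, the context labels `ctx0` and `ctx1`, and the strict alternation policy are assumptions for illustration; the abstract does not say how packages are ordered within the combined stream.

```python
from itertools import zip_longest

_MISSING = object()  # sentinel marking the end of the shorter task


def interleave_work_packages(first_task, second_task):
    """Merge work packages from two contexts into one combined stream."""
    stream = []
    for pkg_a, pkg_b in zip_longest(first_task, second_task, fillvalue=_MISSING):
        if pkg_a is not _MISSING:
            stream.append(("ctx0", pkg_a))  # package from the first context
        if pkg_b is not _MISSING:
            stream.append(("ctx1", pkg_b))  # package from the second context
    return stream


combined = interleave_work_packages(["a0", "a1", "a2"], ["b0"])
```

Tagging each package with its context is what would let a single vector execution unit consume the combined stream while keeping the two contexts apart.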

Type: Grant

Filed: November 4, 2005

Date of Patent: March 25, 2014

Assignee: Nvidia Corporation

Inventors: Stephen D. Lew, Ashish Karandikar, Shirish Gadre, Franciscus W. Sijstermans

System and method for implementing elliptic curve scalar multiplication in cryptography

Patent number: 8649508

Abstract: A system and method for implementing the Elliptic Curve scalar multiplication method in cryptography, where the Double Base Number System is expressed in decreasing order of exponents and is then used to determine Elliptic Curve scalar multiplication over a finite elliptic curve.

Type: Grant

Filed: September 29, 2008

Date of Patent: February 11, 2014

Assignee: Tata Consultancy Services Ltd.

Inventor: Natarajan Vijayarangan

PROGRAMMABLE DEVICE FOR SOFTWARE DEFINED RADIO TERMINAL

Publication number: 20140040594

Abstract: A programmable device suitable for software defined radio terminal is disclosed. In one aspect, the device includes a scalar cluster providing a scalar data path and a scalar register file and arranged for executing scalar instructions. The device may further include at least two interconnected vector clusters connected with the scalar cluster. Each of the at least two vector clusters provides a vector data path and a vector register file and is arranged for executing at least one vector instruction different from vector instructions performed by any other vector cluster of the at least two vector clusters.

Type: Application

Filed: October 2, 2013

Publication date: February 6, 2014

Applicants: Samsung Electronics, IMEC

Inventors: Bruno Bougard, Thomas Schuster

Data parallel function call for determining if called routine is data parallel

Patent number: 8627042

Abstract: Mechanisms for performing data parallel function calls in code during runtime are provided. These mechanisms may operate to execute, in the processor, a portion of code having a data parallel function call to a target portion of code. The mechanisms may further operate to determine, at runtime by the processor, whether the target portion of code is a data parallel portion of code or a scalar portion of code and determine whether the calling code is data parallel code or scalar code. Moreover, the mechanisms may operate to execute the target portion of code based on the determination of whether the target portion of code is a data parallel portion of code or a scalar portion of code, and the determination of whether the calling code is data parallel code or scalar code.

Type: Grant

Filed: December 30, 2009

Date of Patent: January 7, 2014

Assignee: International Business Machines Corporation

Inventors: Alexandre E. Eichenberger, Brian K. Flachs, Charles R. Johns, Mark R. Nutter

Data parallel function call for determining if called routine is data parallel

Patent number: 8627043

Abstract: Mechanisms for performing data parallel function calls in code during runtime are provided. These mechanisms may operate to execute, in the processor, a portion of code having a data parallel function call to a target portion of code. The mechanisms may further operate to determine, at runtime by the processor, whether the target portion of code is a data parallel portion of code or a scalar portion of code and determine whether the calling code is data parallel code or scalar code. Moreover, the mechanisms may operate to execute the target portion of code based on the determination of whether the target portion of code is a data parallel portion of code or a scalar portion of code, and the determination of whether the calling code is data parallel code or scalar code.

Type: Grant

Filed: March 26, 2012

Date of Patent: January 7, 2014

Assignee: International Business Machines Corporation

Inventors: Alexandre E. Eichenberger, Brian K. Flachs, Charles R. Johns, Mark R. Nutter

APPARATUS AND METHOD OF VECTOR UNIT SHARING

Publication number: 20140006748

Abstract: A reconfigurable vector processor is described that allows the size of its vector units to be changed in order to process vectors of different sizes. The reconfigurable vector processor comprises a plurality of processor units. Each of the processor units comprises a control unit for decoding instructions and generating control signals, a scalar unit for processing instructions on scalar data, and a vector unit for processing instructions on vector data under control of control signals. The reconfigurable vector processor architecture also comprises a vector control selector for selectively providing control signals generated by one processor unit of the plurality of processor units to the vector unit of a different processor unit of the plurality of processor units.

Type: Application

Filed: January 25, 2011

Publication date: January 2, 2014

Applicant: COGNIVUE CORPORATION

Inventors: Malcolm Stewart, Ali Osman Ors, Daniel Laroche

Tightly coupled scalar and boolean processor with result vector subunit controlled by instruction flow

Patent number: 8549256

Abstract: Methods and apparatus relating to a tightly coupled scalar and Boolean processor are described. In an embodiment, a Boolean unit may include a result vector subunit. The result vector subunit may be controlled by an instruction flow that is managed by a scalar unit. Other embodiments are also disclosed.

Type: Grant

Filed: January 15, 2007

Date of Patent: October 1, 2013

Assignee: Intel Corporation

Inventor: Charles Narad

Scalar/vector processor that includes a functional unit with a vector section and a scalar section

Patent number: 8510534

Abstract: A scalar/vector processor includes a plurality of functional units (252, 260, 262, 264, 266, 268, 270). At least one of the functional units includes a vector section (210) for operating on at least one vector and a scalar section (220) for operating on at least one scalar. The vector section and scalar section of the functional unit co-operate by the scalar section being arranged to provide and/or consume at least one scalar required by and/or supplied by the vector section of the functional unit.

Type: Grant

Filed: May 22, 2003

Date of Patent: August 13, 2013

Assignee: ST-Ericsson SA

Inventors: Cornelis Hermanus Van Berkel, Patrick Peter Elizabeth Meuwissen, Nur Engin

PROCESSOR WITH TABLE LOOKUP AND HISTOGRAM PROCESSING UNITS

Publication number: 20130185539

Abstract: A processor includes a scalar processor core and a vector coprocessor core coupled to the scalar processor core. The scalar processor core is configured to retrieve an instruction stream from program storage, and pass vector instructions in the instruction stream to the vector coprocessor core. The vector coprocessor core includes a register file, a plurality of execution units, and a table lookup unit. The register file includes a plurality of registers. The execution units are arranged in parallel to process a plurality of data values. The execution units are coupled to the register file. The table lookup unit is coupled to the register file in parallel with the execution units. The table lookup unit is configured to retrieve table values from one or more lookup tables stored in memory by executing table lookup vector instructions in a table lookup loop.

Type: Application

Filed: July 13, 2012

Publication date: July 18, 2013

Applicant: TEXAS INSTRUMENTS INCORPORATED

Inventors: Ching-Yu Hung, Shinri Inamori, Jagadeesh Sankaran, Peter Chang

PROCESSOR WITH INTER-PROCESSING PATH COMMUNICATION

Publication number: 20130185538

Abstract: A processor includes a scalar processor core and a vector coprocessor core coupled to the scalar processor core. The scalar processor core is configured to retrieve an instruction stream from program storage. The instruction stream includes scalar instructions executable by the scalar processor core and vector instructions executable by the vector coprocessor core. The scalar processor core is configured to pass the vector instructions to the vector coprocessor core. The vector coprocessor core is configured to process a plurality of data values in parallel while executing each vector instruction passed by the scalar processor core. The vector coprocessor core includes a plurality of processing paths arranged in parallel to process the data values. Each of the processing paths includes an execution unit. Each of the execution units is configured to communicate a result of processing to each other of the execution units.

Type: Application

Filed: July 13, 2012

Publication date: July 18, 2013

Applicant: TEXAS INSTRUMENTS INCORPORATED

Inventors: Ching-Yu Hung, Shinri Inamori, Jagadeesh Sankaran, Peter Chang

PROGRAMMABLE DEVICE FOR SOFTWARE DEFINED RADIO TERMINAL

Publication number: 20130173884

Abstract: A programmable device suitable for software defined radio terminal is disclosed. In one aspect, the device includes a scalar cluster providing a scalar data path and a scalar register file and arranged for executing scalar instructions. The device may further include at least two interconnected vector clusters connected with the scalar cluster. Each of the at least two vector clusters provides a vector data path and a vector register file and is arranged for executing at least one vector instruction different from vector instructions performed by any other vector cluster of the at least two vector clusters.

Type: Application

Filed: December 7, 2012

Publication date: July 4, 2013

Applicants: Samsung Electronics, Imec

Inventors: Imec, Samsung Electronics

SPECIALIZED VECTOR INSTRUCTION AND DATAPATH FOR MATRIX MULTIPLICATION

Publication number: 20130159665

Abstract: A data processing element includes an input unit configured to provide instructions for scalar, vector and array processing, and a scalar processing unit configured to provide a scalar pipeline datapath for processing a scalar quantity. Additionally, the data processing element includes a vector processing unit coupled to the scalar processing unit and configured to provide a vector pipeline datapath employing a vector register for processing a one-dimensional vector quantity. The data processing element further includes an array processing unit coupled to the vector processing unit and configured to provide an array pipeline datapath employing a parallel processing structure for processing a two-dimensional vector quantity. A method of operating a data processing element and a MIMO receiver employing a data processing element are also provided.

Type: Application

Filed: December 15, 2011

Publication date: June 20, 2013

Applicant: Verisilicon Holdings Co., Ltd.

Inventor: Asheesh Kashyap

Optimizing scalar code executed on a SIMD engine by alignment of SIMD slots

Patent number: 8370817

Abstract: A mechanism is provided for optimizing scalar code executed on a single instruction multiple data (SIMD) engine by aligning the slots of SIMD registers. With the mechanism, a compiler is provided that parses source code and, for each statement in the program, generates an expression tree. The compiler inspects all storage inputs to scalar operations in the expression tree to determine their alignment in the SIMD registers. This alignment is propagated up the expression tree from the leaves. When the alignments of two operands in the expression tree are the same, the resulting alignment is the shared value. When the alignments of two operands in the expression tree are different, one operand is shifted. For shifted operands, a shift operation is inserted in the expression tree. The executable code is then generated for the expression tree and shifts are inserted where indicated.
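
The alignment propagation described in this abstract can be sketched on a toy expression tree. The `Node` class, the function `propagate_alignment`, the integer slot numbers, and the policy of always shifting the right operand to match the left are all assumptions for illustration; the abstract only says that one operand is shifted.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    op: str                         # operator, or a variable name for a leaf
    left: Optional["Node"] = None
    right: Optional["Node"] = None
    slot: Optional[int] = None      # SIMD slot of a leaf's storage input


def propagate_alignment(node):
    """Return (rewritten tree, resulting slot), inserting shifts where needed."""
    if node.left is None and node.right is None:
        return node, node.slot      # leaf: alignment comes from its storage
    left, left_slot = propagate_alignment(node.left)
    right, right_slot = propagate_alignment(node.right)
    if left_slot != right_slot:
        # Alignments differ: wrap one operand (here, the right) in a shift node.
        right = Node(op=f"shift({right_slot}->{left_slot})", left=right)
    # When alignments match, the result simply inherits the shared slot.
    return Node(op=node.op, left=left, right=right), left_slot


tree = Node("+", Node("a", slot=0), Node("b", slot=2))
rewritten, result_slot = propagate_alignment(tree)
```

After the call, `rewritten.right` is a shift node wrapping `b`, which marks where a code generator would emit the shift.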

Type: Grant

Filed: May 27, 2008

Date of Patent: February 5, 2013

Assignee: International Business Machines Corporation

Inventors: Alexandre E. Eichenberger, John Kevin Patrick O'Brien

Scalable Processing Unit

Publication number: 20130024652

Abstract: Various methods and systems are provided for processing units that may be scaled. In one embodiment, a processing unit includes a plurality of scalar processing units and a vector processing unit in communication with each of the plurality of scalar processing units. The vector processing unit is configured to coordinate execution of instructions received from the plurality of scalar processing units. In another embodiment, a scalar instruction packet including a pre-fix instruction and a vector instruction packet including a vector instruction is obtained. Execution of the vector instruction may be modified by the pre-fix instruction in a processing unit including a vector processing unit. In another embodiment, a scalar instruction packet including a plurality of partitions is obtained. The location of the partitions is determined based upon a partition indicator included in the scalar instruction packet and a scalar instruction included in a partition is executed by a processing unit.

Type: Application

Filed: September 20, 2011

Publication date: January 24, 2013

Applicant: BROADCOM CORPORATION

Inventors: Neil Bailey, Eben Upton

Providing nearest neighbor point-to-point communications among compute nodes of an operational group in a global combining network of a parallel computer

Patent number: 8296457

Abstract: Methods, apparatus, and products are disclosed for providing nearest neighbor point-to-point communications among compute nodes of an operational group in a global combining network of a parallel computer, each compute node connected to each adjacent compute node in the global combining network through a link, that include: identifying each link in the global combining network for each compute node of the operational group; designating one of a plurality of point-to-point class routing identifiers for each link such that no compute node in the operational group is connected to two adjacent compute nodes in the operational group with links designated for the same class routing identifiers; and configuring each compute node of the operational group for point-to-point communications with each adjacent compute node in the global combining network through the link between that compute node and that adjacent compute node using that link's designated class routing identifier.
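
The constraint in this abstract, that no compute node has two of its links designated with the same class routing identifier, resembles a proper edge coloring. The sketch below uses a simple greedy assignment; the function name `assign_class_routes`, the integer identifiers, and the greedy strategy are assumptions for illustration and are not claimed to be the patented method.

```python
def assign_class_routes(links):
    """Give each link an identifier unused by any other link at either endpoint.

    links is a list of (node_a, node_b) pairs. Greedy assignment keeps the
    result valid, though it may use more identifiers than strictly necessary.
    """
    used = {}        # node -> set of identifiers already on its links
    assignment = {}  # link -> designated identifier
    for a, b in links:
        taken = used.setdefault(a, set()) | used.setdefault(b, set())
        route_id = 0
        while route_id in taken:
            route_id += 1
        assignment[(a, b)] = route_id
        used[a].add(route_id)
        used[b].add(route_id)
    return assignment


routes = assign_class_routes([(0, 1), (0, 2), (1, 3)])
```

In this small tree, node 0 gets identifiers 0 and 1 on its two links, and the link from node 1 to node 3 can reuse identifier 1 because node 1 only holds identifier 0.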

Type: Grant

Filed: August 2, 2007

Date of Patent: October 23, 2012

Assignee: International Business Machines Corporation

Inventors: Charles J. Archer, Ahmad A. Faraj, Todd A. Inglett, Joseph D. Ratterman

DATA PROCESSING APPARATUS AND METHOD FOR PERFORMING VECTOR OPERATIONS

Publication number: 20120260061

Abstract: A data processing apparatus having processing circuitry, a scalar register bank and a vector register bank, including decoding circuitry arranged to decode a sequence of instructions to generate control signals for the processing circuitry. The decoding circuitry is responsive to a decode modifier instruction within the sequence of instructions to alter decoding of a subsequent scalar instruction in the sequence by mapping at least one scalar operand specified by the subsequent scalar instruction to at least one vector operand in the vector register bank, and, in dependence on the scalar operation specified by the subsequent scalar instruction, determining a vector operation to be performed on at least a subset of the operand elements within the at least the one vector operand. Such an approach enables a wide variety of vector operations to be specified without the need to individually define separate vector instructions for those vector operations.

Type: Application

Filed: April 4, 2012

Publication date: October 11, 2012

Applicant: ARM LIMITED

Inventor: Alastair David Reid

Combining speculative physics modeling with goal-based artificial intelligence

Patent number: 8280826

Abstract: In one embodiment, the present invention includes a method for identifying a deformable object of a scene of a computer game that is visible by an artificial intelligence (AI) character of the game, requesting a speculative physics simulation associated with the deformable object to determine a result of an action to the deformable object by the AI character, and selecting an action to be performed by the AI character, where the selection is based at least in part on the speculative physics simulation. Other embodiments are described and claimed.

Type: Grant

Filed: October 10, 2011

Date of Patent: October 2, 2012

Assignee: Intel Corporation

Inventors: David Putzolu, Aaron Kunze, Teresa Morrison

Graphics processing method and apparatus implementing window system

Patent number: 8203567

Abstract: A graphics processing method and apparatus described herein is capable of converting graphics processing of a window system into a vector-based application program interface (API) format usable in the GPU and performing the converted graphics processing in the GPU. For example, the vector-based API may be based on an OpenVG standard or an EGL standard.

Type: Grant

Filed: July 3, 2009

Date of Patent: June 19, 2012

Assignee: Samsung Electronics Co., Ltd.

Inventors: Dong-kyun Jeong, Soo-chan Lim, Na-min Kim

Method, apparatus, and systems to support execution pipelining in a memory controller

Patent number: 8190830

Abstract: A memory controller may execute instructions instead of sending the instructions to a processor for execution. To maintain synchronization between the memory controller and the processor, the memory controller may queue a null instruction in the memory controller for each non-filler instruction sent to the processor and may send a filler instruction to the processor for each non-null instruction to be executed by the memory controller.
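
The null/filler pairing in this abstract can be sketched as splitting one instruction stream into two queues of equal length. The function name `split_instruction_stream`, the string placeholders `"NULL"` and `"FILLER"`, and the `mc_` prefix used to mark controller instructions are all hypothetical.

```python
def split_instruction_stream(instructions, runs_on_controller):
    """Split a stream into a controller queue and a processor queue.

    Each instruction lands in exactly one queue, and the other queue gets a
    placeholder in the same position, so both queues advance in lockstep.
    """
    controller_queue = []
    processor_queue = []
    for instr in instructions:
        if runs_on_controller(instr):
            controller_queue.append(instr)
            processor_queue.append("FILLER")  # processor idles this step
        else:
            controller_queue.append("NULL")   # controller idles this step
            processor_queue.append(instr)
    return controller_queue, processor_queue


mc_queue, cpu_queue = split_instruction_stream(
    ["load", "mc_copy", "add"],
    lambda instr: instr.startswith("mc_"),
)
```

Because the two queues always have the same length, position `i` in one queue corresponds to position `i` in the other, which is the synchronization property the abstract relies on.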

Type: Grant

Filed: March 10, 2006

Date of Patent: May 29, 2012

Assignee: Intel Corporation

Inventor: Gurumurthy Rajaram

System and method of processing data using scalar/vector instructions

Patent number: 8190854

Abstract: A method of processing data is disclosed that includes performing a fetch of a plurality of instructions from a memory unit. The method also includes grouping the plurality of instructions into packets of instructions of different types for parallel execution by a plurality of instruction execution units. The packets of instructions include a first instruction and a second instruction. The method includes using a combined scalar and vector condition code register to execute the first instruction for a compare operation and the second instruction for a conditional operation using the combined scalar and vector condition code register. The method also includes, when the compare operation is a scalar compare operation, receiving a scalar compare instruction for the scalar compare operation at an instruction executing unit and storing results of the scalar compare operation in the combined scalar and vector condition code register.

Type: Grant

Filed: January 20, 2010

Date of Patent: May 29, 2012

Assignee: QUALCOMM Incorporated

Inventors: Lucian Codrescu, Erich J. Plondke, Taylor Simpson

VECTOR PROCESSING CIRCUIT, COMMAND ISSUANCE CONTROL METHOD, AND PROCESSOR SYSTEM

Publication number: 20120124332

Abstract: A vector processing circuit includes a vector register file including a plurality of array elements, a command issuance control circuit, and a plurality of pipeline arithmetic units. Each pipeline arithmetic unit performs arithmetic processing of data stored in the array elements indicated as a source by one command in parts through a plurality of cycles and stores the result in the array elements indicated as a destination by the one command through a plurality of cycles. When data word length of a preceding command is longer than that of a subsequent command, the command issuance control circuit changes data sizes of the array elements in accordance with data word length of the command and determines whether there is register interference between the array element to be processed at a non-head cycle of the preceding command, and the array element to be processed at a head cycle of the subsequent command.

Type: Application

Filed: October 24, 2011

Publication date: May 17, 2012

Applicant: FUJITSU LIMITED

Inventors: GE Yi, Yoshimasa Takebe, Hiromasa Takahashi

Methods and apparatus for processing scalar and vector instructions

Patent number: 8090928

Abstract: In one embodiment of the present invention, a processor includes a scalar computation unit; a vector co-processor coupled to the scalar computation unit; and one or more function-specific engines coupled to the scalar computation unit, where the engines are adapted to minimize data exchange penalties by processing small in-out bit slices.

Type: Grant

Filed: June 28, 2002

Date of Patent: January 3, 2012

Assignee: Intellectual Ventures I LLC

Inventors: Dominik J. Schmidt, Robert Warren Sherburne, Jr.

Image Processing Address Generator

Publication number: 20110307684

Abstract: An image processing system including a vector processor and a memory adapted for attaching to the vector processor. The memory is adapted to store multiple image frames. The vector processor includes an address generator operatively attached to the memory to access the memory. The address generator is adapted for calculating addresses of the memory over the multiple image frames. The addresses may be calculated over the image frames based upon an image parameter. The image parameter may specify which of the image frames are processed simultaneously. A scalar processor may be attached to the vector processor. The scalar processor provides the image parameter(s) to the address generator for address calculation over the multiple image frames. An input register may be attached to the vector processor. The input register may be adapted to receive a very long instruction word (VLIW) instruction.

Type: Application

Filed: June 10, 2010

Publication date: December 15, 2011

Inventors: Yosef Kreinin, Gil Dogon, Emmanuel Sixsou, Yosi Arbeli, Mois Navon, Roman Sajman

Method and arrangement for cache memory management, related processor architecture

Patent number: 8078804

Abstract: A data cache memory, coupled to a processor that includes processor clusters, is adapted to operate simultaneously on scalar and vectorial data by providing data locations in the data cache memory for storing data for processing. The data locations are accessed either in a scalar mode or in a vectorial mode. This is done by explicitly mapping the data locations that are scalar and the data locations that are vectorial.

Type: Grant

Filed: June 26, 2007

Date of Patent: December 13, 2011

Assignees: STMicroelectronics S.r.l., STMicroelectronics N.V.

Inventors: Francesco Pappalardo, Giuseppe Notarangelo, Elena Salurso, Elio Guidetti

Combining speculative physics modeling with goal-based artificial intelligence

Patent number: 8069124

Abstract: In one embodiment, the present invention includes a method for identifying a deformable object of a scene of a computer game that is visible by an artificial intelligence (AI) character of the game, requesting a speculative physics simulation associated with the deformable object to determine a result of an action to the deformable object by the AI character, and selecting an action to be performed by the AI character, where the selection is based at least in part on the speculative physics simulation. Other embodiments are described and claimed.

Type: Grant

Filed: March 26, 2008

Date of Patent: November 29, 2011

Assignee: Intel Corporation

Inventors: David Putzolu, Aaron Kunze, Teresa Morrison
Data Parallel Function Call for Determining if Called Routine is Data Parallel

Publication number: 20110161623

Abstract: Mechanisms for performing data parallel function calls in code during runtime are provided. These mechanisms may operate to execute, in the processor, a portion of code having a data parallel function call to a target portion of code. The mechanisms may further operate to determine, at runtime by the processor, whether the target portion of code is a data parallel portion of code or a scalar portion of code and determine whether the calling code is data parallel code or scalar code. Moreover, the mechanisms may operate to execute the target portion of code based on the determination of whether the target portion of code is a data parallel portion of code or a scalar portion of code, and the determination of whether the calling code is data parallel code or scalar code.

Type: Application

Filed: December 30, 2009

Publication date: June 30, 2011

Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: Alexandre E. Eichenberger, Brian K. Flachs, Charles R. Johns, Mark R. Nutter
Automatic instruction set architecture generation

Patent number: 7971197

Abstract: A digital computer system automatically creates an Instruction Set Architecture (ISA) that potentially exploits VLIW instructions, vector operations, fused operations, and specialized operations with the goal of increasing the performance of a set of applications while keeping hardware cost below a designer specified limit, or with the goal of minimizing hardware cost given a required level of performance.

Type: Grant

Filed: August 18, 2005

Date of Patent: June 28, 2011

Assignee: Tensilica, Inc.

Inventors: David William Goodwin, Dror Maydan, Ding-Kai Chen, Darin Stamenov Petkov, Steven Weng-Kiang Tjiang, Peng Tu, Christopher Rowen
EXECUTION OF VARIABLE WIDTH VECTOR PROCESSING INSTRUCTIONS

Publication number: 20110145543

Abstract: A processing unit executes a vector width instruction in a program and the processing unit obtains and supplies the width of an appropriate vector register that will be used to process variable vector processing instructions. Then, when the processing unit executes variable vector processing instructions in the program, the processing unit processes the variable vector processing instructions using the appropriate vector register with the instructions having the same width as the appropriate vector register. The width that the processing unit obtains may be less than an actual width of the appropriate vector register and may be set by the processing unit. In this way, many different vector widths can be supported using a single set of instructions for vector processing. New instructions are not required if vector widths are changed and processing units having vector registers of differing widths do not require different code.

Type: Application

Filed: December 15, 2009

Publication date: June 16, 2011

Applicant: Sun Microsystems, Inc.

Inventor: Peter Carl Damron
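
The idea in the abstract above, that one piece of code can run unchanged on hardware with different vector widths, can be sketched in Python as follows. This is an illustrative model only. `get_vector_width` is a hypothetical stand-in for the vector width instruction, and `vector_add` is a hypothetical loop that queries the width once and then processes the data in chunks of that size.

```python
def get_vector_width(hw_width, requested=None):
    """Hypothetical model of a 'vector width' instruction.

    Returns the number of elements processed per vector operation.
    The reported width may be smaller than the hardware width.
    """
    if hw_width < 1:
        raise ValueError("hardware width must be at least 1")
    if requested is None:
        return hw_width
    return max(1, min(hw_width, requested))


def vector_add(a, b, width):
    """Add two equal-length lists in chunks of `width` elements.

    Each chunk models one variable-width vector instruction; the last
    chunk may be shorter than `width`.
    """
    if len(a) != len(b):
        raise ValueError("operands must have the same length")
    out = []
    for start in range(0, len(a), width):
        chunk_a = a[start:start + width]
        chunk_b = b[start:start + width]
        out.extend(x + y for x, y in zip(chunk_a, chunk_b))
    return out
```

The same `vector_add` call produces the same result whether `get_vector_width` reports 4, 8, or any other width, which is the property the abstract describes.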
Method and apparatus for obtaining a scalar value directly from a vector register

Patent number: 7908460

Abstract: A method and apparatus for obtaining a scalar value from a vector register for use in a mixed vector and scalar instruction, including providing a vector in a vector register file, and embedding a location identifier of the scalar value within the vector in the bits defining the mixed vector and scalar instruction. The scalar value can be used directly from the vector register without the need to load the scalar to a scalar register prior to executing the instruction. The scalar location identifier may be embedded in the secondary op code of the instruction, or the instruction may have dedicated bits for providing the location of the scalar within the vector.

Type: Grant

Filed: May 3, 2010

Date of Patent: March 15, 2011

Assignee: Nintendo Co., Ltd.

Inventors: Yu-Chung C. Liao, Peter A. Sandon, Howard Cheng, Timothy J. Van Hook
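
To make the mechanism in the abstract above concrete, here is a minimal Python sketch of an instruction word that carries the scalar's location in dedicated bits. The field layout, the opcode value `OP_VMUL_SCALAR`, the four-lane vector width, and the functions `encode`, `decode`, and `execute` are all assumptions made for illustration; they are not taken from the patent.

```python
OP_VMUL_SCALAR = 0x1  # hypothetical opcode: vector times one lane of another vector


def encode(opcode, vd, va, vb, lane):
    """Pack fields into a word: opcode | vd(5) | va(5) | vb(5) | lane(2)."""
    if not all(0 <= r < 32 for r in (vd, va, vb)):
        raise ValueError("register number out of range")
    if not 0 <= lane < 4:
        raise ValueError("lane out of range")
    return (opcode << 17) | (vd << 12) | (va << 7) | (vb << 2) | lane


def decode(word):
    """Unpack the fields written by encode()."""
    return {
        "opcode": word >> 17,
        "vd": (word >> 12) & 0x1F,
        "va": (word >> 7) & 0x1F,
        "vb": (word >> 2) & 0x1F,
        "lane": word & 0x3,
    }


def execute(vrf, word):
    """Multiply every element of va by the scalar held in vb[lane].

    The scalar is read straight from the vector register file `vrf`
    (a dict of register number -> list of 4 values); no scalar
    register is involved.
    """
    f = decode(word)
    if f["opcode"] != OP_VMUL_SCALAR:
        raise ValueError("unsupported opcode")
    scalar = vrf[f["vb"]][f["lane"]]
    vrf[f["vd"]] = [x * scalar for x in vrf[f["va"]]]
```

For example, with `vrf[1] = [1, 2, 3, 4]` and `vrf[2] = [10, 20, 30, 40]`, executing `encode(OP_VMUL_SCALAR, 0, 1, 2, 2)` selects the scalar 30 from lane 2 of register 2 and writes the scaled vector to register 0.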
METHOD AND STRUCTURE OF USING SIMD VECTOR ARCHITECTURES TO IMPLEMENT MATRIX MULTIPLICATION

Publication number: 20110055517

Abstract: A structure (and method) including a plurality of coprocessing units and a controller that selectively loads data for processing on the plurality of coprocessing units, using a compound loading instruction. The compound loading instruction includes a plurality of low-level software instructions that preliminarily processes input data in a manner predetermined to simulate an effect of a single hardware loading instruction that would provide optimal loading of complex matrix data by loading input data in accordance with the effect of multiplying i·i=−1.

Type: Application

Filed: August 26, 2009

Publication date: March 3, 2011

Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: Alexandre E. Eichenberger, Michael Karl Gschwind, John A. Gunnels, Fred Gehrung Gustavson, Brett Olsson
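
The role of i·i=−1 in the abstract above can be illustrated with a small Python sketch of a complex multiply built from real multiply-add steps. This is one common arrangement, not necessarily the one claimed. The functions `prepare_operand` and `complex_mul` are hypothetical names; `prepare_operand` models the preprocessing step by producing both the operand (a, b) and its product with i, which is (−b, a).

```python
def prepare_operand(a, b):
    """Return [a, b] and the same complex number multiplied by i.

    Since i * (a + b*i) = -b + a*i, the second form is [-b, a].
    Producing it at load time bakes the sign change from i*i = -1
    into the data layout.
    """
    return [a, b], [-b, a]


def complex_mul(x, y):
    """Multiply complex numbers x = (a, b) and y = (c, d).

    Uses only element-wise real multiply-adds on two-element vectors:
    result = c * [a, b] + d * [-b, a] = [a*c - b*d, b*c + a*d].
    """
    a, b = x
    c, d = y
    straight, rotated = prepare_operand(a, b)
    return [c * s + d * r for s, r in zip(straight, rotated)]
```

With the rotated copy available, the multiply needs no separate negate or shuffle step between the two multiply-adds, which is the kind of saving a matrix multiplication kernel repeats for every element pair.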
Physics processing unit

Patent number: 7895411

Abstract: One embodiment of the invention sets forth a hardware-based physics processing unit (PPU) having unique architecture designed to efficiently generate physics data. The PPU includes a PPU control engine (PCE), a data movement engine (DME) and a floating point engine (FPE). The PCE manages the overall operation of the PPU by allocating memory resources and transmitting graphics processing commands to the FPE and data movement commands to the DME. The FPE includes multiple vector processors that operate in parallel and perform floating point operations on data received from a host unit to generate physics simulation data. The DME facilitates the transmission of data between the host unit and the FPE by performing data movement operations between memories internal and external to the PPU.

Type: Grant

Filed: November 19, 2003

Date of Patent: February 22, 2011

Assignee: NVIDIA Corporation

Inventors: Monier Maher, Otto A. Schmid, Curtis Davis, Manju Hegde, Jean Pierre Bordes
Multi-addressable register file

Patent number: 7877582

Abstract: A single register file may be addressed using both scalar and SIMD instructions. That is, subsets of registers within a multi-addressable register file according to the illustrative embodiments, are addressable with different instruction forms, e.g., scalar instructions, SIMD instructions, etc., while the entire set of registers may be addressed with yet another form of instructions, referred to herein as Vector-Scalar Extension (VSX) instructions. The operation set that may be performed on the entire set of registers using the VSX instruction form is substantially similar to that of the operation sets of the subsets of registers. Such an arrangement allows legacy instructions to access subsets of registers within the multi-addressable register file while new instructions, i.e. the VSX instructions, may access the entire range of registers within the multi-addressable register file.

Type: Grant

Filed: January 31, 2008

Date of Patent: January 25, 2011

Assignee: International Business Machines Corporation

Inventors: Michael K. Gschwind, Brett Olsson
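
The register file arrangement in the abstract above can be modeled with a short Python sketch. The sizes and the mapping are assumptions chosen for illustration: 64 entries in total, with scalar instructions addressing entries 0 to 31, SIMD instructions addressing entries 32 to 63, and VSX instructions addressing all 64. The class `MultiAddressableRegisterFile` and its method names are hypothetical.

```python
class MultiAddressableRegisterFile:
    """One physical register file seen through three instruction forms."""

    SUBSET = 32  # assumed size of each legacy subset

    def __init__(self):
        self.regs = [0] * (2 * self.SUBSET)

    def _check(self, n, limit):
        if not 0 <= n < limit:
            raise IndexError("register number out of range")

    # Legacy scalar form: register n maps to entry n.
    def write_scalar(self, n, value):
        self._check(n, self.SUBSET)
        self.regs[n] = value

    def read_scalar(self, n):
        self._check(n, self.SUBSET)
        return self.regs[n]

    # Legacy SIMD form: register n maps to entry SUBSET + n.
    def write_simd(self, n, value):
        self._check(n, self.SUBSET)
        self.regs[self.SUBSET + n] = value

    def read_simd(self, n):
        self._check(n, self.SUBSET)
        return self.regs[self.SUBSET + n]

    # VSX form: register n maps to entry n over the whole file.
    def write_vsx(self, n, value):
        self._check(n, 2 * self.SUBSET)
        self.regs[n] = value

    def read_vsx(self, n):
        self._check(n, 2 * self.SUBSET)
        return self.regs[n]
```

In this model a value written by a legacy scalar or SIMD instruction is visible to VSX instructions without any copy, because all three forms index the same storage.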
MULTIPROCESSOR COMMUNICATION PROTOCOL BRIDGE BETWEEN SCALAR AND VECTOR COMPUTE NODES

Publication number: 20110010522

Abstract: A multiprocessor computer system includes a plurality of processor nodes coupled by a direct processor interconnect network, and a plurality of processor nodes coupled by an indirect processor interconnect network. A bridge directly couples the direct processor interconnect network and the indirect processor interconnect network.

Type: Application

Filed: June 11, 2010

Publication date: January 13, 2011

Applicant: Cray Inc.

Inventors: Dennis C. Abts, Peter M. Klausler, James Nowicki
Data processing apparatus and method for handling vector instructions

Publication number: 20100312988

Abstract: A data processing apparatus and method are provided for handling vector instructions. The data processing apparatus has a register data store with a plurality of registers arranged to store data elements. A vector processing unit is then used to execute a sequence of vector instructions, with the vector processing unit having a plurality of lanes of parallel processing and having access to the register data store in order to read data elements from, and write data elements to, the register data store during the execution of the sequence of vector instructions. A skip indication storage maintains a skip indicator for each of the lanes of parallel processing. The vector processing unit is responsive to a vector skip instruction to perform an update operation to set within the skip indication storage the skip indicator for a determined one or more lanes.

Type: Application

Filed: January 19, 2010

Publication date: December 9, 2010

Applicant: ARM LIMITED

Inventors: Andreas BJÖRKLUND, Erik Persson, Ola Hugosson
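
The per-lane skip indicators in the abstract above can be sketched in Python as follows. The abstract states only that a vector skip instruction sets indicators for selected lanes; the behavior modeled here, where a lane with its indicator set leaves its destination element unchanged on later instructions, is an assumption. The class `VectorUnit` and its methods `vskip`, `vadd`, and `clear_skips` are hypothetical.

```python
class VectorUnit:
    """Toy vector unit with one skip indicator per lane."""

    def __init__(self, lanes):
        self.lanes = lanes
        self.skip = [False] * lanes  # the skip indication storage

    def vskip(self, condition):
        """Vector skip instruction: set the indicator where condition is True."""
        if len(condition) != self.lanes:
            raise ValueError("one condition value per lane is required")
        self.skip = [s or bool(c) for s, c in zip(self.skip, condition)]

    def clear_skips(self):
        """Reset all indicators (assumed to exist; not described in the abstract)."""
        self.skip = [False] * self.lanes

    def vadd(self, dst, a, b):
        """Element-wise add; lanes marked as skipped keep their old dst value."""
        return [
            d if s else x + y
            for d, s, x, y in zip(dst, self.skip, a, b)
        ]
```

For instance, after `vskip([False, True, False, True])` on a four-lane unit, a `vadd` updates only lanes 0 and 2, so lane-dependent control flow is handled without branching.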
Multidimensional processor architecture

Patent number: 7831804

Abstract: A processor architecture includes a number of processing elements for treating input signals. The architecture is organized according to a matrix including rows and columns, the columns of which each include at least one microprocessor block having a computational part and a set of associated processing elements that are able to receive the same input signals. The number of associated processing elements is selectively variable in the direction of the column so as to exploit the parallelism of said signals. Additionally, the processor architecture of the present invention enables dynamic switching between instruction parallelism and data parallel processing typical of vectorial functionality. The architecture can be scaled in various dimensions in an optimal configuration for the algorithm to be executed.

Type: Grant

Filed: May 30, 2008

Date of Patent: November 9, 2010

Assignee: ST Microelectronics S.R.L.

Inventors: Francesco Pappalardo, Giuseppe Notarangelo, Elio Guidetti
PROCESSOR

Publication number: 20100235607

Abstract: A processor includes a setting register in which a mode is set, a general-purpose register including a preferred slot used during scalar computing and a slot not used during the scalar computing, a selector configured to select and output data of a register designated by a mode set in the setting register during the scalar computing, and a computing unit configured to execute the scalar computing using the preferred slot of the general-purpose register and store computing result data of the scalar computing in the preferred slot of the general-purpose register. The data of the register output from the selector is stored in the slot of the general-purpose register.

Type: Application

Filed: March 2, 2010

Publication date: September 16, 2010

Applicant: Kabushiki Kaisha Toshiba

Inventors: Hiroaki Sugita, Seiji Maeda, Tatsuya Mizutani
Method and apparatus for obtaining a scalar value directly from a vector register

Publication number: 20100217954

Abstract: A method and apparatus for obtaining a scalar value from a vector register for use in a mixed vector and scalar instruction, including providing a vector in a vector register file, and embedding a location identifier of the scalar value within the vector in the bits defining the mixed vector and scalar instruction. The scalar value can be used directly from the vector register without the need to load the scalar to a scalar register prior to executing the instruction. The scalar location identifier may be embedded in the secondary op code of the instruction, or the instruction may have dedicated bits for providing the location of the scalar within the vector.

Type: Application

Filed: May 3, 2010

Publication date: August 26, 2010

Applicant: Nintendo Co., Ltd.

Inventors: Yu-Chung C. Liao, Peter A. Sandon, Howard Cheng, Timothy J. Van Hook
PROGRAMMABLE DEVICE FOR SOFTWARE DEFINED RADIO TERMINAL

Publication number: 20100186006

Abstract: A programmable device suitable for software defined radio terminal is disclosed. In one aspect, the device includes a scalar cluster providing a scalar data path and a scalar register file and arranged for executing scalar instructions. The device may further include at least two interconnected vector clusters connected with the scalar cluster. Each of the at least two vector clusters provides a vector data path and a vector register file and is arranged for executing at least one vector instruction different from vector instructions performed by any other vector cluster of the at least two vector clusters.

Type: Application

Filed: December 17, 2009

Publication date: July 22, 2010

Applicants: IMEC, Samsung Electronics

Inventors: Bruno Bougard, Thomas Schuster
Method for providing physics simulation data

Patent number: 7739479

Abstract: A method of providing physics data within a game program or simulation using a hardware-based physics processing unit having unique architecture designed to efficiently calculate physics related data.

Type: Grant

Filed: November 19, 2003

Date of Patent: June 15, 2010

Assignee: NVIDIA Corporation

Inventors: Jean Pierre Bordes, Curtis Davis, Monier Maher, Manju Hegde, Otto A. Schmid
Method and apparatus for obtaining a scalar value directly from a vector register

Patent number: 7739480

Abstract: A method and apparatus for obtaining a scalar value from a vector register for use in a mixed vector and scalar instruction, including providing a vector in a vector register file, and embedding a location identifier of the scalar value within the vector in the bits defining the mixed vector and scalar instruction. The scalar value can be used directly from the vector register without the need to load the scalar to a scalar register prior to executing the instruction. The scalar location identifier may be embedded in the secondary op code of the instruction, or the instruction may have dedicated bits for providing the location of the scalar within the vector.

Type: Grant

Filed: January 11, 2005

Date of Patent: June 15, 2010

Assignee: Nintendo Co., Ltd.

Inventors: Yu-Chung C. Liao, Peter A. Sandon, Howard Cheng, Timothy J. Van Hook
System and Method of Processing Data Using Scalar/Vector Instructions

Publication number: 20100118852

Abstract: A method of processing data is disclosed that includes performing a fetch of a plurality of instructions from a memory unit. The method also includes grouping the plurality of instructions into packets of instructions of different types for parallel execution by a plurality of instruction execution units. The packets of instructions include a first instruction and a second instruction. The method includes using a combined scalar and vector condition code register to execute the first instruction for a compare operation and the second instruction for a conditional operation using the combined scalar and vector condition code register. The method also includes, when the compare operation is a scalar compare operation, receiving a scalar compare instruction for the scalar compare operation at an instruction executing unit and storing results of the scalar compare operation in the combined scalar and vector condition code register.

Type: Application

Filed: January 20, 2010

Publication date: May 13, 2010

Applicant: QUALCOMM INCORPORATED

Inventors: Lucian Codrescu, Erich J. Plondke, Taylor Simpson
Unified address space architecture

Publication number: 20100115228

Abstract: A multiprocessor computer system has a plurality of first processors having a first addressable memory space, and a plurality of second processors having a second addressable memory space. The second addressable memory space is of a different size than the first addressable memory space, and the first addressable memory space and second addressable memory space comprise a part of the same common address space.

Type: Application

Filed: October 31, 2008

Publication date: May 6, 2010

Applicant: CRAY INC.

Inventors: Michael Parker, Timothy J. Johnson, Laurence S. Kaplan, Steven L. Scott, Robert Alverson, Skef Iterum
