Controlling Access To External Vector Data Patents (Class 712/6)

Prefetch kernels on data-parallel processors

Patent number: 11500778

Abstract: Embodiments include methods, systems and non-transitory computer-readable media including instructions for executing a prefetch kernel with reduced intermediate state storage resource requirements. These include executing a prefetch kernel on a graphics processing unit (GPU), such that the prefetch kernel begins executing before a processing kernel. The prefetch kernel performs memory operations that are based upon at least a subset of memory operations in the processing kernel.

Type: Grant

Filed: March 9, 2020

Date of Patent: November 15, 2022

Assignee: Advanced Micro Devices, Inc.

Inventors: Nuwan S. Jayasena, James Michael O'Connor, Michael Mantor
Apparatus and methods for generating dot product

Patent number: 10860316

Abstract: Aspects for generating a dot product for two vectors in a neural network are described herein. The aspects may include a controller unit configured to receive a vector load instruction that includes a first address of a first vector and a length of the first vector. The aspects may further include a direct memory access unit configured to retrieve the first vector from a storage device based on the first address of the first vector. Further still, the aspects may include a caching unit configured to store the first vector.

Type: Grant

Filed: October 26, 2018

Date of Patent: December 8, 2020

Assignee: CAMBRICON TECHNOLOGIES CORPORATION LIMITED

Inventors: Tian Zhi, Qi Guo, Shaoli Liu, Tianshi Chen, Yunji Chen
Systems and methods for providing content-based product recommendations

Patent number: 10614504

Abstract: Systems, apparatuses, and methods are provided herein for content-based product recommendations. A system for content-based product recommendations comprises a content monitoring device configured to monitor video content viewed by a user, a customer vectors database, a product vectors database; and a control circuit being configured to: detect, via the content monitoring device, a video content being viewed by the user, identify an item associated with a current segment of the video content viewed by the user, determine a product category associated with the item, determine alignments between the customer value vectors and the product characteristic vectors for each of the plurality of products in the product category, select a recommended product from the plurality of products based on the alignments between the customer value vectors and the product characteristic vectors for each of the plurality of products, and initiate an offer of the recommended product to the customer.

Type: Grant

Filed: April 14, 2017

Date of Patent: April 7, 2020

Assignee: Walmart Apollo, LLC

Inventors: Bruce W. Wilkinson, Brian G. McHale, Todd D. Mattingly
Dot product based processing elements

Patent number: 10049082

Abstract: Systems and methods for calculating a dot product using digital signal processing units that are organized into a dot product processing unit for dot product processing using multipliers and adders of the digital signal processing units.

Type: Grant

Filed: September 15, 2016

Date of Patent: August 14, 2018

Assignee: ALTERA CORPORATION

Inventors: Andrew Chaang Ling, Davor Capalija, Tomasz Sebastian Czajkowski, Andrei Mihai Hagiescu Miriste
Instruction for implementing iterations having an iteration dependent condition with a vector loop

Patent number: 9921837

Abstract: A processor is described having an instruction execution pipeline. The instruction execution pipeline includes an instruction fetch stage to fetch an instruction. The instruction identifies an input vector operand whose input elements specify one or the other of two states. The instruction execution pipeline also includes an instruction decoder to decode the instruction. The instruction execution pipeline also includes a functional unit to execute the instruction and provide a resultant output vector. The functional unit includes logic circuitry to produce an element in a specific element position of the resultant output vector by performing an operation on a value derived from a base value using a stride in response to one but not the other of the two states being present in a corresponding element position of the input vector operand.

Type: Grant

Filed: July 19, 2016

Date of Patent: March 20, 2018

Assignee: INTEL CORPORATION

Inventor: Mikhail Plotnikov
Method and apparatus for approximating detection of overlaps between memory ranges

Patent number: 9910650

Abstract: A computer-implemented method for managing loop code in a compiler includes using a conflict detection procedure that detects across-iteration dependency for arrays of single memory addresses to determine whether a potential across-iteration dependency exists for arrays of memory addresses for ranges of memory accessed by the loop code.

Type: Grant

Filed: September 25, 2014

Date of Patent: March 6, 2018

Assignee: Intel Corporation

Inventors: Albert Hartono, Nalini Vasudevan, Sara S. Baghsorkhi, Cheng Wang, Youfeng Wu
Apparatus and method for efficient gather and scatter operations

Patent number: 9785436

Abstract: An apparatus and method are described for performing efficient gather operations in a pipelined processor. For example, a processor according to one embodiment of the invention comprises: gather setup logic to execute one or more gather setup operations in anticipation of one or more gather operations, the gather setup operations to determine one or more addresses of vector data elements to be gathered by the gather operations; and gather logic to execute the one or more gather operations to gather the vector data elements using the one or more addresses determined by the gather setup operations.

Type: Grant

Filed: September 28, 2012

Date of Patent: October 10, 2017

Assignee: INTEL CORPORATION

Inventors: Edward T. Grochowski, Dennis R. Bradford, George Z. Chrysos, Andrew T. Forsyth, Michael D. Upton, Lisa K. Wu
Page state directory for managing unified virtual memory

Patent number: 9767036

Abstract: A system for managing virtual memory. The system includes a first processing unit configured to execute a first operation that references a first virtual memory address. The system also includes a first memory management unit (MMU) associated with the first processing unit and configured to generate a first page fault upon determining that a first page table that is stored in a first memory unit associated with the first processing unit does not include a mapping corresponding to the first virtual memory address. The system further includes a first copy engine associated with the first processing unit. The first copy engine is configured to read a first command queue to determine a first mapping that corresponds to the first virtual memory address and is included in a first page state directory. The first copy engine is also configured to update the first page table to include the first mapping.

Type: Grant

Filed: October 16, 2013

Date of Patent: September 19, 2017

Assignee: NVIDIA Corporation

Inventors: Jerome F. Duluk, Jr., Cameron Buschardt, Sherry Cheung, James Leroy Deming, Samuel H. Duncan, Lucien Dunning, Robert George, Arvind Gopalakrishnan, Mark Hairgrove, Chenghuan Jia, John Mashey
Vector processing in an active memory device

Patent number: 9575755

Abstract: Embodiments relate to vector processing in an active memory device. An aspect includes a method for vector processing in an active memory device that includes memory and a processing element. The method includes decoding, in the processing element, an instruction including a plurality of sub-instructions to execute in parallel. An iteration count to repeat execution of the sub-instructions in parallel is determined. Based on the iteration count, execution of the sub-instructions in parallel is repeated for multiple iterations by the processing element. Multiple locations in the memory are accessed in parallel based on the execution of the sub-instructions.

Type: Grant

Filed: August 3, 2012

Date of Patent: February 21, 2017

Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: Bruce M. Fleischer, Thomas W. Fox, Hans M. Jacobson, Ravi Nair, Daniel A. Prener
Vector processing in an active memory device

Patent number: 9535694

Abstract: Embodiments relate to vector processing in an active memory device. An aspect includes a system for vector processing in an active memory device. The system includes memory in the active memory device and a processing element in the active memory device. The processing element is configured to perform a method including decoding an instruction with a plurality of sub-instructions to execute in parallel. An iteration count to repeat execution of the sub-instructions in parallel is determined. Execution of the sub-instructions is repeated in parallel for multiple iterations, by the processing element, based on the iteration count. Multiple locations in the memory are accessed in parallel based on the execution of the sub-instructions.

Type: Grant

Filed: August 8, 2012

Date of Patent: January 3, 2017

Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: Bruce M. Fleischer, Thomas W. Fox, Hans M. Jacobson, Ravi Nair, Daniel A. Prener
Interleaving data accesses issued in response to vector access instructions

Patent number: 9021233

Abstract: A vector data access unit includes data access ordering circuitry for issuing data access requests indicated by elements of an earlier and a later vector instruction, one being a write instruction. An element indicating the next data access for each of the instructions is determined. The next data accesses for the earlier and the later instructions may be reordered. The next data access of the earlier instruction is selected if the position of the earlier instruction's next data element is less than or equal to the position of the later instruction's next data element minus a predetermined value. The next data access of the later instruction may be selected if the position of the earlier instruction's next data element is higher than the position of the later instruction's next data element minus a predetermined value. Thus data accesses from earlier and later instructions are partially interleaved.

Type: Grant

Filed: September 28, 2011

Date of Patent: April 28, 2015

Assignee: ARM Limited

Inventor: Alastair David Reid
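
The selection rule in the abstract of patent 9021233 can be illustrated with a short Python sketch. This is not the patented circuitry, only a software model of the stated comparison. The function name, the `offset` parameter (standing in for the "predetermined value"), and the behaviour once one instruction has issued all its elements are assumptions made for illustration.

```python
def interleave_accesses(n_earlier, n_later, offset):
    """Return the issue order as (instruction, element_position) pairs.

    offset models the "predetermined value" from the abstract; its sign
    and size here are illustrative assumptions.
    """
    order = []
    e = 0  # position of the earlier instruction's next data element
    l = 0  # position of the later instruction's next data element
    while e < n_earlier or l < n_later:
        if l >= n_later:
            # Assumption: once the later instruction is done, drain the earlier one.
            pick_earlier = True
        elif e >= n_earlier:
            # Assumption: once the earlier instruction is done, drain the later one.
            pick_earlier = False
        else:
            # Rule stated in the abstract: earlier wins if e <= l - offset.
            pick_earlier = e <= l - offset
        if pick_earlier:
            order.append(("earlier", e))
            e += 1
        else:
            order.append(("later", l))
            l += 1
    return order
```

With `offset=0` the two instructions alternate, and the earlier instruction's access at each position is issued before the later instruction's access at the same position.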
DATA PROCESSING APPARATUS AND METHOD FOR PERFORMING SPECULATIVE VECTOR ACCESS OPERATIONS

Publication number: 20150100754

Abstract: A data processing apparatus and method for performing speculative vector access operations are provided. The data processing apparatus has a reconfigurable buffer accessible to vector data access circuitry and comprising a storage array for storing up to M vectors of N vector elements. The vector data access circuitry performs speculative data write operations in order to cause vector elements from selected vector operands in a vector register bank to be stored into the reconfigurable buffer. On occurrence of a commit condition, the vector elements currently stored in the reconfigurable buffer are then written to a data store. Speculation control circuitry maintains a speculation width indication indicating the number of vector elements of each selected vector operand stored in the reconfigurable buffer.

Type: Application

Filed: August 18, 2014

Publication date: April 9, 2015

Inventors: Alastair David REID, Daniel KERSHAW
Processor and system using a mask register to track progress of gathering and prefetching elements from memory

Patent number: 8892848

Abstract: A device, system and method for assigning values to elements in a first register, where each data field in a first register corresponds to a data element to be written into a second register, and where for each data field in the first register, a first value may indicate that the corresponding data element has not been written into the second register and a second value indicates that the corresponding data element has been written into the second register, reading the values of each of the data fields in the first register, and for each data field in the first register having the first value, gathering the corresponding data element and writing the corresponding data element into the second register, and changing the value of the data field in the first register from the first value to the second value. Other embodiments are described and claimed.

Type: Grant

Filed: July 5, 2011

Date of Patent: November 18, 2014

Assignee: Intel Corporation

Inventors: Eric Sprangle, Anwar Rohillah, Robert Cavin, Tom Forsyth, Michael Abrash
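
The mask-tracking idea in the abstract of patent 8892848 can be sketched in software as follows. This is a minimal model, not the claimed hardware: the function name, the encoding of the "first value" as 1 and the "second value" as 0, and the `max_loads` parameter (used to simulate an interruption part-way through) are all assumptions.

```python
def gather_with_mask(memory, indices, mask, dest, max_loads=None):
    """Gather memory[indices[i]] into dest[i] for every pending mask field.

    Assumed encoding: mask[i] == 1 means element i has not been written yet
    (the abstract's "first value"); 0 means it has (the "second value").
    Returns True when no field is pending any more.
    """
    loads = 0
    for i, pending in enumerate(mask):
        if not pending:
            continue  # already gathered on an earlier attempt
        if max_loads is not None and loads >= max_loads:
            break  # simulate an interruption, e.g. a fault
        dest[i] = memory[indices[i]]
        mask[i] = 0  # record progress so a restart skips this element
        loads += 1
    return not any(mask)
```

Because progress lives in the mask, calling the function again with the same `mask` and `dest` resumes where the previous call stopped instead of reloading completed elements.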
Loading/discarding acquired data for vector load instruction upon determination of prediction success of multiple preceding branch instructions

Patent number: 8850167

Abstract: Provided is a processor including an instruction issue unit that issues a vector load instruction read from a main memory based on branch target prediction of a branch target in a branch instruction, a data acquisition unit that starts issue of a plurality of acquisition requests for acquiring a plurality of vector data based on the issued vector load instruction from the main memory, a determination unit that determines a success or a failure of the branch target prediction after the branch target is determined, and a vector load management unit that, when the branch target prediction is determined to be a success, acquires all vector data based on the plurality of acquisition requests and then transfers all the vector data to a vector register, and, when the branch target prediction is determined to be a failure, discards the vector data acquired by the issued acquisition requests.

Type: Grant

Filed: September 22, 2011

Date of Patent: September 30, 2014

Assignee: NEC Corporation

Inventor: Masao Fukagawa
Using vector atomic memory operation to handle data of different lengths

Patent number: 8826252

Abstract: A system and method of compiling program code, wherein the program code includes an operation on an array of data elements stored in memory of a computer system. The program code is scanned for an equation which operates on data of lengths other than the limited number of vector supported data lengths. The equation is then replaced with vectorized machine executable code, wherein the machine executable code comprises a nested loop and wherein the nested loop comprises an exterior loop and a virtual interior loop. The exterior loop decomposes the equation into a plurality of loops of length N, wherein N is an integer greater than one. The virtual interior loop executes vector operations corresponding to the N length loop to form a result vector of length N, wherein the virtual interior loop includes one or more vector atomic memory operation (AMO) instructions, used to resolve false conflicts.

Type: Grant

Filed: June 12, 2009

Date of Patent: September 2, 2014

Assignee: Cray Inc.

Inventor: Terry D. Greyzck
System and method for implementing elliptic curve scalar multiplication in cryptography

Patent number: 8649508

Abstract: A system and method for implementing the Elliptic Curve scalar multiplication method in cryptography, where the Double Base Number System is expressed in decreasing order of exponents and further on using it to determine Elliptic curve scalar multiplication over a finite elliptic curve.

Type: Grant

Filed: September 29, 2008

Date of Patent: February 11, 2014

Assignee: Tata Consultancy Services Ltd.

Inventor: Natarajan Vijayarangan
Dynamically updating current communication information

Patent number: 8560808

Abstract: A method, system and computer readable media for dynamically updating current communication information, for enabling access to current communication based upon biometric information and/or for allowing communication information to be associated with biometric information and then allowing this communication information to be provided to desired recipients.

Type: Grant

Filed: January 3, 2012

Date of Patent: October 15, 2013

Assignee: International Business Machines Corporation

Inventors: Sarbajit Kumar Rakshit, Shawn K. Sremaniak, Thomas S. Mazzeo, Barry Allan Kritt
System and method for using a mask register to track progress of gathering elements from memory

Patent number: 7984273

Abstract: A system and method for assigning values to elements in a first register, where each data field in a first register corresponds to a data element to be written into a second register, and where for each data field in the first register, a first value may indicate that the corresponding data element has not been written into the second register and a second value indicates that the corresponding data element has been written into the second register, reading the values of each of the data fields in the first register, and for each data field in the first register having the first value, gathering the corresponding data element and writing the corresponding data element into the second register, and changing the value of the data field in the first register from the first value to the second value. Other embodiments are described and claimed.

Type: Grant

Filed: December 31, 2007

Date of Patent: July 19, 2011

Assignees: Intel Corporation

Inventors: Eric Sprangle, Anwar Rohillah, Robert Cavin, Tom Forsyth, Michael Abrash
Logic controller having hard-coded control logic and programmable override control store entries

Patent number: 7895379

Abstract: Control logic of a node controller receives an input vector and produces an output vector. The control logic includes a plurality of tied control store entries including hard-coded logic to identify unique values of the input vector and to produce the output vector from a hard-coded output vector when the input vector is identified and when the tied control store is enabled. The control logic also includes a plurality of spare control store entries including programmable logic configurable to identify values of the input vector and to produce the output vector from a programmable output vector when the input vector is identified and when the spare control store is enabled. One of the spare control store entries that is configured to identify a value of the input vector that none of the tied control store entries that are enabled by the entry-enables register are configured to identify is enabled.

Type: Grant

Filed: December 23, 2008

Date of Patent: February 22, 2011

Assignee: Unisys Corporation

Inventors: Ross M. Weber, David R. Spatafore
Load misaligned vector with permute and mask insert

Patent number: 7783860

Abstract: Embodiments of the invention provide logic within the store data path between a processor and a memory array. The logic may be configured to misalign vector data as it is stored to memory. By misaligning vector data as it is stored to memory, memory bandwidth may be maximized while processing bandwidth required to store vector data misaligned is minimized. Furthermore, embodiments of the invention provide logic within the load data path which allows vector data which is stored misaligned to be aligned as it is loaded into a vector register. By aligning misaligned vector data as it is loaded into a vector register, memory bandwidth may be maximized while processing bandwidth required to align misaligned vector data may be minimized.

Type: Grant

Filed: July 31, 2007

Date of Patent: August 24, 2010

Assignee: International Business Machines Corporation

Inventors: David Arnold Luick, Eric Oliver Mejdrich, Adam James Muff
Decoupling of write address from its associated write data in a store to a shared memory in a multiprocessor system

Patent number: 7743223

Abstract: In a computer system having a plurality of processors connected to a shared memory, a system and method of decoupling an address from write data in a store to the shared memory. A write request address is generated for a memory write, wherein the write request address points to a memory location in shared memory. A write request is issued to the shared memory, wherein the write request includes the write request address. The write request address is noted in the shared memory and addresses in subsequent load and store requests are compared in shared memory to the write request address. The write data is transferred to the shared memory and matched, within the shared memory, to the write request address. The write data is then stored into the shared memory as a function of the write request address.

Type: Grant

Filed: August 18, 2003

Date of Patent: June 22, 2010

Assignee: Cray Inc.

Inventors: Steven L. Scott, Gregory J. Faanes
Fast sparse list walker

Patent number: 7743231

Abstract: Provided are a method, information processing system, and computer readable medium for identifying active bits in a vector. The method comprises receiving a pointer associated with a vector of bits. The pointer is associated with a current bit within the vector of bits. The vector of bits is grouped into groups of a mathematical power of two, that is, any non-negative integer power of two. One or more current groups are determined which are the groups of the mathematical power of two comprising the current bit. The one or more current groups of the power of two are analyzed. A largest group of the power of two is identified in the one or more current groups comprising all empty bits. The pointer is set to point to a bit following a last bit in the identified largest group of the power of two comprising all empty bits.

Type: Grant

Filed: February 27, 2007

Date of Patent: June 22, 2010

Assignee: International Business Machines Corporation

Inventors: Scot H. Rider, Todd A. Strader
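
The pointer-advance step in the abstract of patent 7743231 can be sketched in Python as below. The sketch assumes the power-of-two groups are aligned to multiples of their size, which the abstract does not state, and the function names are hypothetical. A real implementation would presumably consult precomputed per-group summaries rather than rescanning bits with `any`.

```python
def skip_empty(bits, pos):
    """Return the position just past the largest all-empty group containing pos.

    Groups are assumed to be aligned blocks of size 1, 2, 4, ... bits.
    If bits[pos] is set, pos is returned unchanged.
    """
    n = len(bits)
    best_end = pos
    size = 1
    while size <= n:
        start = (pos // size) * size
        end = start + size
        # Aligned groups nest, so once one is non-empty every larger one is too.
        if end > n or any(bits[start:end]):
            break
        best_end = end
        size *= 2
    return best_end


def active_bits(bits):
    """Walk the vector and collect the positions of all set bits."""
    found = []
    pos = 0
    while pos < len(bits):
        nxt = skip_empty(bits, pos)
        if nxt == pos:  # current bit is active
            found.append(pos)
            pos += 1
        else:
            pos = nxt
    return found
```

For a sparse vector the walker jumps over whole empty groups, so the number of steps depends on the positions of the active bits rather than on the vector length alone.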
External memory accessing DMA request scheduling in IC of parallel processing engines according to completion notification queue occupancy level

Patent number: 7627744

Abstract: An integrated circuit comprises an external memory, a plurality of parallel connected Vector Processing Engines (VPEs), and an External Memory Unit (EMU) providing a data transfer path between the VPEs and the external memory. Each VPE contains a plurality of data processing units and a message queuing system adapted to transfer messages between the data processing units and other components of the integrated circuit.

Type: Grant

Filed: May 10, 2007

Date of Patent: December 1, 2009

Assignee: NVIDIA Corporation

Inventors: Monier Maher, Jean Pierre Bordes, Christopher Lamb, Sanjay J. Patel
Apparatus and method for enforcing homogeneity within partitions of heterogeneous computer systems

Patent number: 7519800

Abstract: A heterogeneous computer system has multiple interconnected cells, each cell has multiple primary processors of the same Instruction Set Architecture (ISA) type, but different cells may have processors of different ISA types. Each cell has a cell type register readable by a processor external to the cell. The cell type register of each cell is used at system startup time to ensure that all processors of a system partition have compatible ISA types.

Type: Grant

Filed: March 27, 2003

Date of Patent: April 14, 2009

Assignee: Hewlett-Packard Development Company, L.P.

Inventor: Scott Lynn Michaelis
Method and system for preventing current-privilege-level-information leaks to non-privileged code

Patent number: 7480797

Abstract: Various embodiments of the present invention introduce privilege-level mapping into a computer architecture not initially designed for supporting virtualization. Privilege-level mapping can, with relatively minor changes to processor logic, fully prevent privileged-level-information leaks by which non-privileged code can determine the current machine-level privilege level at which it is executing. In one embodiment of the present invention, a new privilege-level mapping register is introduced, and privilege-level mapping is enabled for all but code invoked by privileged-level-0-forcing hardware events.

Type: Grant

Filed: July 31, 2004

Date of Patent: January 20, 2009

Assignee: Hewlett-Packard Development Company, L.P.

Inventor: Bret McKee
Flow optimization and prediction for VSSE memory operations

Patent number: 7404065

Abstract: In one embodiment, a method for flow optimization and prediction for vector streaming single instruction, multiple data (SIMD) extension (VSSE) memory operations is disclosed. The method comprises generating an optimized micro-operation (µop) flow for an instruction to operate on a vector if the instruction is predicted to be unmasked and unit-stride, the instruction to access elements in memory, and accessing via the optimized µop flow two or more of the elements at the same time without determining masks of the two or more elements. Other embodiments are also described.

Type: Grant

Filed: December 21, 2005

Date of Patent: July 22, 2008

Assignee: Intel Corporation

Inventors: Stephan Jourdan, Per Hammarlund, Michael Fetterman, Michael P. Cornaby, Glenn Hinton, Avinash Sodani
METHOD AND ARRANGEMENT FOR CACHE MEMORY MANAGEMENT, RELATED PROCESSOR ARCHITECTURE

Publication number: 20080016317

Abstract: A data cache memory coupled to a processor including processor clusters is adapted to operate simultaneously on scalar and vectorial data by providing data locations in the data cache memory for storing data for processing. The data locations are accessed either in a scalar mode or in a vectorial mode. This is done by explicitly mapping the data locations that are scalar and the data locations that are vectorial.

Type: Application

Filed: June 26, 2007

Publication date: January 17, 2008

Applicants: STMicroelectronics S.r.l., STMicroelectronics N.V.

Inventors: Francesco Pappalardo, Giuseppe Notarangelo, Elena Salurso, Elio Guidetti
Load/store operation of memory misaligned vector data using alignment register storing realigned data portion for combining with remaining portion

Patent number: 7219212

Abstract: A processor can achieve high code density while allowing higher performance than existing architectures, particularly for Digital Signal Processing (DSP) applications. In accordance with one aspect, the processor supports three possible instruction sizes while maintaining the simplicity of programming and allowing efficient physical implementation. Most of the application code can be encoded using two sets of narrow size instructions to achieve high code density. Adding a third (and larger, i.e. VLIW) instruction size allows the architecture to encode multiple operations per instruction for the performance critical section of the code. Further, each operation of the VLIW format instruction can optionally be a SIMD operation that operates upon vector data. A scheme for the optimal utilization (highest achievable performance for the given amount of hardware) of multiply-accumulate (MAC) hardware is also provided.

Type: Grant

Filed: February 25, 2005

Date of Patent: May 15, 2007

Assignee: Tensilica, Inc.

Inventors: Himanshu A. Sanghavi, Earl A. Killian, James Robert Kennedy, Darin S. Petkov, Peng Tu, William A. Huffman
Method and apparatus for addressing a vector of elements in a partitioned memory using stride, skip and span values

Patent number: 7100019

Abstract: A system and method for calculating memory addresses in a partitioned memory in a processing system having a processing unit, input and output units, a program sequencer and an external interface. An address calculator includes a set of storage elements, such as registers, and an arithmetic unit for calculating a memory address of a vector element dependent upon values stored in the storage elements and the address of a previous vector element. The storage elements hold STRIDE, SKIP and SPAN values and optionally a TYPE value, relating to the spacing between elements in the same partition, the spacing between elements in the consecutive partitions, the number of elements in a partition and the size of a vector element, respectively.

Type: Grant

Filed: September 8, 2003

Date of Patent: August 29, 2006

Assignee: Motorola, Inc.

Inventors: James M. Norris, Philip E. May, Kent D. Moat, Raymond B. Essick, IV, Brian G. Lucas
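
The address sequence described in the abstract of patent 7100019 can be illustrated with a small Python sketch. It assumes that STRIDE and SKIP are counted in elements and scaled by the element size (TYPE), and that SKIP is the distance from the last element of one partition to the first element of the next. The abstract does not fix these units, and the function and parameter names are hypothetical.

```python
def vector_addresses(base, count, stride, skip, span, elem_size=1):
    """Generate element addresses from the previous address plus a step.

    stride: spacing between elements within a partition (assumed in elements)
    skip:   spacing from the last element of a partition to the first element
            of the next partition (assumed in elements)
    span:   number of elements per partition
    elem_size: size of one element in bytes (the TYPE value)
    """
    addrs = []
    addr = base
    for i in range(count):
        addrs.append(addr)
        # After the last element of a partition step by SKIP, otherwise by STRIDE.
        step = skip if (i + 1) % span == 0 else stride
        addr += step * elem_size
    return addrs
```

For example, `vector_addresses(0, 6, stride=1, skip=5, span=3, elem_size=4)` walks three contiguous 4-byte elements, jumps ahead, and then walks three more.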
Network processor which defines virtual paths without using logical path descriptors

Patent number: 7069557

Abstract: A virtual path feature in which several virtual channels share an assigned amount of bandwidth is implemented in a network processor. The network processor maintains a schedule indicative of respective times at which a plurality of virtual channels are to be serviced. An entry is read from the schedule. The entry corresponds to a current transmit cycle and includes a pointer to a channel descriptor for a virtual channel to be serviced in the current transmit cycle. A data cell for the virtual channel to be serviced in the current cycle is transmitted. An entry is added to the schedule to point to a channel descriptor that is pointed to by the channel descriptor for the virtual channel serviced in the current transmit cycle.

Type: Grant

Filed: May 23, 2002

Date of Patent: June 27, 2006

Assignee: International Business Machines Corporation

Inventor: Merwin Herscher Alferness
Information processing system and cache flash control method used for the same

Patent number: 7043607

Abstract: The vector unit 21 outputs a first flash address to the flash address array 24. The vector unit 31 outputs a second flash address to the flash address array 34. In the master unit 2, the flash address array 24 compares an address registered in a cache with the first flash address. In the slave unit 3, the flash address array 34 compares the address registered in the cache with the second flash address. When said first flash address coincides with said address registered in said cache, the flash address array 24 sends a first coincidence address to the address array 25. When said second flash address coincides with said address registered in said cache, the flash address array 34 sends a second coincidence address to the address array 25. A corresponding address of the address array 25 is flashed based on the first address sent from the flash address array 24 and based on the second address sent from the flash address array 34.

Type: Grant

Filed: June 12, 2003

Date of Patent: May 9, 2006

Assignee: NEC Corporation

Inventor: Kenji Ezoe
Fast and flexible scan conversion and matrix transpose in a SIMD processor

Patent number: 6963341

Abstract: The present invention provides efficient ways to implement scan conversion and matrix transpose operations using vector multiplex operations in a SIMD processor. The present method provides a very fast and flexible way to implement different scan conversions, such as zigzag conversion, and matrix transpose for 2×2, 4×4, 8×8 blocks commonly used by all video compression and decompression algorithms.

Type: Grant

Filed: May 20, 2003

Date of Patent: November 8, 2005

Inventor: Tibet Mimar
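
The abstract of patent 6963341 treats scan conversion and transpose as element-selection operations. One way to picture this in software is a single permute driven by a precomputed index pattern, sketched below in Python. The function names are hypothetical, the blocks are assumed to be stored row-major, and the zigzag order used is the common anti-diagonal scan, not necessarily the exact pattern used in the patent.

```python
def vector_multiplex(src, select):
    """Output element i is taken from src[select[i]] (a software permute)."""
    return [src[s] for s in select]


def transpose_pattern(n):
    """Select indices that transpose an n-by-n block stored row-major."""
    return [(i % n) * n + i // n for i in range(n * n)]


def zigzag_pattern(n):
    """Select indices that read an n-by-n row-major block in zigzag order."""
    cells = [(r, c) for r in range(n) for c in range(n)]
    # Order by anti-diagonal; odd diagonals run downward, even ones upward.
    cells.sort(key=lambda rc: (rc[0] + rc[1],
                               rc[0] if (rc[0] + rc[1]) % 2 else -rc[0]))
    return [r * n + c for r, c in cells]
```

Since the patterns depend only on the block size, they can be computed once and reused for every block, which leaves a single selection step per block.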
Computer system and method of controlling computation

Patent number: 6957324

Abstract: A vector computer system includes a plurality of memory banks 40, a vector processor 11, and a plurality of additional processing units 30 each of which is connected to one of the memory banks 40. Each of the additional processing units 30 reads data from the corresponding memory bank 40 by referring to an address designated by the processor 11, and performs a designated operation about the data. Then the additional processing unit 30 stores the result of the operation into the designated address.

Type: Grant

Filed: September 28, 2001

Date of Patent: October 18, 2005

Assignee: NEC Corporation

Inventor: Takumi Washio
Aggregation of sensory data for distributed decision-making

Patent number: 6865517

Abstract: A method, apparatus and computer product that enables a processor associated with a node in a computer system having various nodes, the nodes having sensors which provide data, and the nodes being connected by a communications facility acquiring local data from the sensor and remote data from other nodes via the data transfer facility. The nodes process data from a local sensor at the node and from remote sensors at other nodes; and analyze the local data, data from other nodes and local decisions made at and received from other nodes to make a local decision for action at the node. A local decision made at a node is in turn communicated to other nodes.

Type: Grant

Filed: December 11, 2002

Date of Patent: March 8, 2005

Assignee: International Business Machines Corporation

Inventors: David F. Bantz, John S. Davis, II, Rafah A. Hosn, Nicholas M. Mitchell, Veronique Perret, Daby M. Sow, Jeremy B. Sussman
Cache consistent control of subsequent overlapping memory access during specified vector scatter instruction execution

Patent number: 6816960

Abstract: A vector architecture processing unit according to the present invention comprises a vector scatter (VSC) address coincidence detection unit 3 that comprises registers in which an area start address and an area end address of an area specified by an area-specified vector scatter instruction are stored; and a circuit that checks if the addresses specified by the area-specified vector scatter instruction overlap with an address to be accessed by a memory access instruction following the area-specified vector scatter instruction, wherein an instruction issue control unit 1 comprises a hold control circuit that holds the following memory access instruction in response to an address conflict signal from the VSC address conflict detector.

Type: Grant

Filed: July 10, 2001

Date of Patent: November 9, 2004

Assignee: NEC Corporation

Inventor: Hisao Koyanagi
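
The overlap check described in the abstract of patent 6816960 can be illustrated in software as a range-intersection test. This is only a sketch under stated assumptions: the area bounds are treated as inclusive byte addresses, and the access_size parameter and function names are hypothetical, since the abstract describes a hardware circuit and gives no such details.

```python
def overlaps_scatter_area(area_start, area_end, access_addr, access_size=1):
    """Return True if a memory access touches the scatter area.

    area_start and area_end are assumed to be inclusive bounds, standing in
    for the two registers described in the abstract.
    """
    access_end = access_addr + access_size - 1
    return access_addr <= area_end and access_end >= area_start


def should_hold(area_start, area_end, access_addr, access_size=1):
    """Model of the hold decision: hold the following access on a conflict."""
    return overlaps_scatter_area(area_start, area_end, access_addr, access_size)
```

An access that lies entirely below or entirely above the area is allowed to issue; any access that shares at least one byte with the area is held.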
Method and apparatus for transferring vector data between memory and a register file

Patent number: 6813701

Abstract: A compiler and vector data transfer instructions for use in a vector transfer unit for handling transfers of vector data between a memory and a data processor in a computer system. The compiler identifies the use of vector data in an application program and implements one or more vector instructions for transferring the vector data between memory and registers used to perform calculations on the vector data. A vector is partitioned by the compiler into variable-sized streams which are transferred into and out of the processor as burst transactions. The compiler schedules transfers of vector streams required in a calculation so that calculations on a portion of the vector data are performed while a subsequent portion of the vector data is transferred. A vector buffer pool is partitioned into one or more vector buffers and each vector buffer is used at a specific time.

Type: Grant

Filed: August 17, 1999

Date of Patent: November 2, 2004

Assignee: NEC Electronics America, Inc.

Inventor: Ahmad R. Ansari
Vector transfer system generating address error exception when vector to be transferred does not start and end on same memory page

Patent number: 6742106

Abstract: A vector transfer unit for handling transfers of vector data between a memory and a data processor in a computer system. Vector data transfer instructions are posted to an instruction queue in the vector transfer unit. Program instructions for performing a burst transfer include determining the starting address of the vector data to be transferred, the ending address of the vector data to be transferred, and whether the ending address of the vector data to be transferred is within the same virtual memory page as the starting address. The ending address of the vector data to be transferred is determined based on the number of data elements to be transferred, the stride of the vector data to be transferred, and the width of the vector data elements to be transferred. When the amount of data to be transferred is divisible by a factor of two, the multiplication of the stride and width of the data elements is carried out by shifting.

Type: Grant

Filed: January 28, 2003

Date of Patent: May 25, 2004

Assignee: NEC Electronics, Inc.

Inventor: Ahmad R. Ansari
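
The same-page check in the abstract of patent 6742106 can be sketched as follows. The abstract gives no exact formula, so this example assumes that the stride is counted in elements, that the ending address is the last byte of the last element, and that pages are 4096 bytes. It also assumes the shift applies when the element width is a power of two. All names are hypothetical.

```python
PAGE_SIZE = 4096  # assumed page size; the abstract does not give one


def last_byte_address(start, count, stride, width):
    """Address of the last byte of a strided vector transfer.

    count: number of elements; stride: distance between elements, counted
    in elements (assumption); width: element size in bytes.
    """
    if count < 1 or width < 1:
        raise ValueError("count and width must be at least 1")
    if width & (width - 1) == 0:
        # Power-of-two width: multiply by shifting instead of multiplying.
        step = stride << (width.bit_length() - 1)
    else:
        step = stride * width
    return start + (count - 1) * step + width - 1


def same_page(start, count, stride, width, page_size=PAGE_SIZE):
    """True if the transfer starts and ends in the same memory page."""
    end = last_byte_address(start, count, stride, width)
    return start // page_size == end // page_size
```

In this model, a transfer for which same_page returns False corresponds to the case in which the patent title says an address error exception is generated.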
Bus protocol for efficiently transferring vector data

Patent number: 6665749

Abstract: The present invention provides a bus architecture for a data processing system that improves transfers of vector data using a vector transfer unit (VTU). An external bus is coupled between the vector transfer unit and the memory. The external bus includes a system command bus that is used to transmit a data transfer command. The command is based on a corresponding vector transfer instruction in the application program, such as load vector data or store vector data. The commands for transferring the data elements include a burst read command and a burst write command. A variable number of data elements may be transferred, according to the user's requirements. The system command bus is also capable of transmitting a packing ratio that indicates the number of data elements that fit in the width of the external bus. This allows the entire bandwidth of the external bus to be used during vector data transfers.

Type: Grant

Filed: August 17, 1999

Date of Patent: December 16, 2003

Assignee: NEC Electronics, Inc.

Inventor: Ahmad R. Ansari
System for posting vector synchronization instructions to vector instruction queue to separate vector instructions from different application programs

Patent number: 6625720

Abstract: A vector transfer unit for handling transfers of vector data between a memory and a data processor in a computer system. Vector instructions are used for transferring the vector data between memory and registers used to perform calculations on the vector data. The transfers of portions of the vector data required in a calculation are scheduled so that calculations on a portion of the vector data are performed while a subsequent portion of the vector data is transferred. A vector buffer pool is partitioned into one or more vector buffers based on configuration information including the number of vector buffers required by an application program and the size required for each vector buffer. The vector buffers are allocated for exclusive use by an application program that is executing in the data processor. Vector data transfer instructions are posted in a vector transfer instruction queue and are executed in the order they are posted to the instruction queue.

Type: Grant

Filed: August 17, 1999

Date of Patent: September 23, 2003

Assignee: NEC Electronics, Inc.

Inventor: Ahmad R. Ansari
Apparatus and method for program optimizing

Patent number: 6571386

Abstract: An optimizer (100) comprises a memory (110) and a processor (130). The memory stores a program (200) to be optimized and optimization software (301). Controlled by the optimization software, the processor (120) (a) determines local vectors (“local”) in instructions of the program (200) which indicate the use of resources by the instructions (use-vectors, exh-vectors); (b) scans the program (200) for Single-Entry-Single-Exit (SESE) structures (U, T, V, S); and (c) determines SESE vectors from the local vectors. The SESE vectors indicate the use of resources by the SESE structures and can be combined by the optimizer to obtain a program vector. When some instructions are modified, then optimizer (100) only re-calculates the SESE vector of the corresponding SESE and re-combines the old SESE vector with the modified SESE vector to determine a new program vector.

Type: Grant

Filed: March 13, 2000

Date of Patent: May 27, 2003

Assignee: Motorola, Inc.

Inventors: Mikhail Figurin, Mikhail Okrugin, Dmitriy Barmenkov
Vector transfer system generating address error exception when vector to be transferred does not start and end on same memory page

Patent number: 6513107

Abstract: A vector transfer unit for handling transfers of vector data between a memory and a data processor in a computer system. Vector data transfer instructions are posted to an instruction queue in the vector transfer unit. Program instructions for performing a burst transfer include determining the starting address of the vector data to be transferred, the ending address of the vector data to be transferred, and whether the ending address of the vector data to be transferred is within the same virtual memory page as the starting address. The ending address of the vector data to be transferred is determined based on the number of data elements to be transferred, the stride of the vector data to be transferred, and the width of the vector data elements to be transferred. When the amount of data to be transferred is divisible by a factor of two, the multiplication of the stride and width of the data elements is carried out by shifting.

Type: Grant

Filed: August 17, 1999

Date of Patent: January 28, 2003

Assignee: NEC Electronics, Inc.

Inventor: Ahmad R. Ansari
Transfer of data between processors in a multi-processor system

Patent number: 6484220

Abstract: A method for transferring data between devices in a computer system. In a preferred embodiment, a requesting device broadcasts a request for data. Each of a plurality of devices within the computer system responds to the request and indicates the location of the device and whether the device contains the requested data. The data is then transferred to the requesting device from one of the devices containing the data within the plurality of devices. The device selected to transfer the data to the requesting device has the closest logical proximity to the requesting device, which results in a quick transfer of data.

Type: Grant

Filed: August 26, 1999

Date of Patent: November 19, 2002

Assignee: International Business Machines Corporation

Inventors: Manuel Joseph Alvarez, II, Sanjay Raghunath Deshpande, Kenneth Douglas Klapproth, David Mui
Vector scatter instruction control circuit and vector architecture information processing equipment

Publication number: 20020007449

Abstract: A vector architecture processing unit according to the present invention comprises a vector scatter (VSC) address coincidence detection unit 3 that comprises registers in which an area start address and an area end address of an area specified by an area-specified vector scatter instruction are stored; and a circuit that checks if the addresses specified by the area-specified vector scatter instruction overlap with an address to be accessed by a memory access instruction following the area-specified vector scatter instruction, wherein an instruction issue control unit 1 comprises a hold control circuit that holds the following memory access instruction in response to an address conflict signal from the VSC address conflict detector.

Type: Application

Filed: July 10, 2001

Publication date: January 17, 2002

Applicant: NEC CORPORATION

Inventor: Hisao Koyanagi
Dynamic scheduling mechanism for an asynchronous/isochronous integrated circuit interconnect bus

Patent number: 6336179

Abstract: A first counter sequentially counts a plurality of numbers from respective sources requesting transfer of data. Each of the numbers represents an amount of isochronous data to transfer over the bus from the respective ones of the sources during a frame on a bus. A count value in a second counter is selectably incremented when the first counter is counting, to provide a remaining count value indicative of a remaining amount of data to transfer during the frame. The remaining count value in the second counter is decremented for each isochronous transfer on the bus after the remaining amount of data to transfer has been determined from all sources requesting transfer of isochronous data during the frame. A third counter tracks the time remaining in the frame and compares the remaining count value to the time remaining in the frame to determine a priority mode on the bus.

Type: Grant

Filed: August 21, 1998

Date of Patent: January 1, 2002

Assignee: Advanced Micro Devices, Inc.

Inventor: Dale E. Gulick
Physical layer interface and method for arbitration over serial bus using digital line state signals

Patent number: 6324611

Abstract: A physical layer interface for a serial bus includes a controller for producing parallel data representing a near-end line state of the serial bus. A line transmitter is connected to the controller for converting the parallel data therefrom into serial data and transmitting the serial data to the serial bus. A line receiver is connected to the serial bus for receiving therefrom serial data and converting the received serial data into parallel data representing a far-end line state of the serial bus. A differential line state of the serial bus is detected from the parallel data of the controller and the parallel data of the line receiver. The detected differential line state is the input to the controller. In a modified embodiment, a far-end line state of the serial bus is detected from the near-end line state of the serial bus and a far-end differential signal received by the line receiver and directly supplied to the controller.

Type: Grant

Filed: September 17, 1998

Date of Patent: November 27, 2001

Assignee: NEC Corporation

Inventor: Takayuki Nyu
Method and apparatus for processing a set of data values with plural processing units mask bits generated by other processing units

Patent number: 6308250

Abstract: A method and system for operating a computing system having multiple processing units. According to a new machine instruction, called the iota instruction, the computing system operates on a vector of mask bits to generate an iota vector having a sequence of values. In one form, each value of the iota vector is a sum of a series of the lower order mask bits up to and including the mask bit corresponding to the entry in the iota vector. In another form, each entry in the iota vector is a sum of a series of lower order mask bits but does not include the mask bit corresponding to the particular entry in the iota vector. In order to calculate the iota vector, the multiple processing units of the present invention communicate the mask bits to the other processing units. Advantages of the present invention include the vectorization of software loops having certain data hazards that prevented conventional compilers from vectorizing the software.

Type: Grant

Filed: June 23, 1998

Date of Patent: October 23, 2001

Assignee: Silicon Graphics, Inc.

Inventor: Peter Michael Klausler
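
The two forms of the iota vector described in the abstract of patent 6308250 correspond to inclusive and exclusive running sums over the mask bits. The sketch below is a sequential Python illustration only. It does not model how the multiple processing units exchange mask bits, and the function names, including the pack helper that shows one typical use, are hypothetical.

```python
from itertools import accumulate


def iota_inclusive(mask):
    """Each entry sums the mask bits up to and including its own position."""
    return list(accumulate(mask))


def iota_exclusive(mask):
    """Each entry sums the lower-order mask bits, excluding its own position."""
    return [total - bit for total, bit in zip(accumulate(mask), mask)]


def pack(values, mask):
    """Compress the selected values using the exclusive iota as indices.

    The sequential form of this loop carries a running output index from one
    iteration to the next; the iota vector supplies every index up front.
    """
    out = [None] * sum(mask)
    for value, bit, index in zip(values, mask, iota_exclusive(mask)):
        if bit:
            out[index] = value
    return out
```

For the mask 1, 0, 1, 1, 0 the inclusive form is 1, 1, 2, 3, 3 and the exclusive form is 0, 1, 1, 2, 3.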
Collation of interrupt control devices

Patent number: 6253304

Abstract: A first and a second local interrupt controller are disposed on a single integrated circuit. The first and second local interrupt controllers are coupled to controllably provide at least one interrupt request signal, respectively, to a first and second processor. An input/output (I/O) interrupt controller is also on the integrated circuit and coupled to receive an interrupt request from at least one input/output device. A communication circuit on the integrated circuit is coupled to the input/output interrupt controller and the first and second local interrupt controllers. The communication circuit provides for transfer of interrupt information between the first local interrupt controller, the second local interrupt controller and the input/output interrupt controller.

Type: Grant

Filed: January 4, 1999

Date of Patent: June 26, 2001

Assignee: Advanced Micro Devices, Inc.

Inventors: Larry Hewitt, David Neal Suggs, Greg Smaus, Derrick R. Meyer
Data processing system for processing vector data and method therefor

Patent number: 6202130

Abstract: A data processing system includes a data processor (10) coupled to a memory system having a first memory, such as an L1 data cache (16), arranged with a second memory (such as an L2 cache) at a lower hierarchical level. The data processor (10) prefetches data elements of a vector into the first memory prior to processing such data elements. If a requested data element is not present in the first memory, a load request is issued to the second memory and to lower levels of the memory hierarchy until the requested data element is finally retrieved and stored in the first memory. The data processor (10) continues to prefetch subsequent data elements of the vector by considering the length of the data element and the stride of the vector. In one embodiment, the data processor (10) prefetches the vector into the first memory in response to a single data stream touch load (DST) instruction (100).

Type: Grant

Filed: April 17, 1998

Date of Patent: March 13, 2001

Assignee: Motorola, Inc.

Inventors: Hunter Ledbetter Scales, III, Keith Everett Diefendorff, Brett Olsson, Pradeep Kumar Dubey, Ronald Ray Hochsprung, Bradford Byron Beavers, Bradley G. Burgess, Michael Dean Snyder, Cathy May, Edward John Silha
Microprocessor modified to perform inverse discrete cosine transform operations on a one-dimensional matrix of numbers within a minimal number of instructions

Patent number: 6141673

Abstract: A multimedia extension unit (MEU) is provided for performing various multimedia-type operations. The MEU can be coupled either through a coprocessor bus or a local central processing unit (CPU) bus to a conventional processor. The MEU employs vector registers, a vector arithmetic logic unit (ALU), and an operand routing unit (ORU) to perform a maximum number of the multimedia operations within as few instruction cycles as possible. Complex algorithms are readily performed by arranging operands upon the vector ALU in accordance with the desired algorithm flowgraph. The ORU aligns the operands within partitioned slots or sub-slots of the vector registers using vector instructions unique to the MEU. At the output of the ORU, operand pairs from vector source or destination registers can be easily routed and combined at the vector ALU.

Type: Grant

Filed: May 25, 1999

Date of Patent: October 31, 2000

Assignees: Advanced Micro Devices, Inc., Compaq Computer Corp.

Inventors: John S. Thayer, John Gregory Favor, Frederick D. Weber
System and method for routing one operand to arithmetic logic units from fixed register slots and another operand from any register slot

Patent number: 6009505

Abstract: A multimedia extension unit (MEU) is provided for performing various multimedia-type operations. The MEU can be coupled either through a coprocessor bus or a local CPU bus to a conventional processor. The MEU employs vector registers, a vector ALU, and an operand routing unit (ORU) to perform a maximum number of the multimedia operations within as few instruction cycles as possible. Complex algorithms are readily performed by arranging operands upon the vector ALU in accordance with the desired algorithm flowgraph. The ORU aligns the operands within partitioned slots or sub-slots of the vector registers using vector instructions unique to the MEU. At the output of the ORU, operand pairs from vector source or destination registers can be easily routed and combined at the vector ALU. The vector instructions employ special load/store instructions in combination with numerous operational instructions to carry out concurrent multimedia operations on the aligned operands.

Type: Grant

Filed: December 2, 1996

Date of Patent: December 28, 1999

Assignees: Compaq Computer Corp., Advanced Micro Devices, Inc.

Inventors: John S. Thayer, Gary W. Thome, Brian E. Longhenry, John G. Favor, Frederick D. Weber
