Vector Processor Patents (Class 712/2)

Data element rearrangement, processors, methods, systems, and instructions

Patent number: 11941394

Abstract: A processor includes a decode unit to decode an instruction indicating a source packed data operand having source data elements and indicating a destination storage location. Each of the source data elements has a source data element value and a source data element position. An execution unit, in response to the instruction, stores a result packed data operand having result data elements each having a result data element value and a result data element position. Each result data element value is one of: (1) equal to a source data element position of a source data element, closest to one end of the source operand, having a source data element value equal to the result data element position of the result data element; and (2) a replacement value, when no source data element has a source data element value equal to the result data element position of the result data element.

Type: Grant

Filed: December 9, 2019

Date of Patent: March 26, 2024

Assignee: Intel Corporation

Inventors: Christopher J. Hughes, Jong Soo Park
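
The rearrangement described in the abstract of patent 11941394 can be sketched in Python as follows. This is an illustration under stated assumptions, not the patented implementation: the function name `rearrange`, the use of Python lists for packed operands, the reading of "closest to one end" as the lowest-numbered position, and the default replacement value of -1 are all hypothetical choices.

```python
def rearrange(source, num_results, replacement=-1):
    """Return a result list where result[i] is the position of the source
    element whose value equals i, or `replacement` if no such element exists.

    Assumption: when several source elements hold the same value, the one
    closest to position 0 wins.
    """
    result = [replacement] * num_results
    # Walk from the high end down so that lower positions overwrite higher ones.
    for position in range(len(source) - 1, -1, -1):
        value = source[position]
        if 0 <= value < num_results:
            result[value] = position
    return result
```

For example, `rearrange([2, 0, 2, 3], 4)` returns `[1, -1, 0, 3]`: value 0 sits at position 1, no element holds value 1, value 2 first appears at position 0, and value 3 sits at position 3.
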
Reusing an operand received from a first-in-first-out (FIFO) buffer according to an operand specifier value specified in a predefined field of an instruction

Patent number: 11620132

Abstract: Various embodiments are provided for reusing an operand in an instruction set architecture (ISA) by one or more processors in a computing system. An instruction may specify that an operand register for a selected operand retain operand data used by a previous instruction. The operand data in the operand register may be reused by the instruction.

Type: Grant

Filed: May 8, 2019

Date of Patent: April 4, 2023

Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: Bruce Fleischer, Sunil Shukla, Vijayalakshmi Srinivasan, Jungwook Choi
Instructions for vector operations with constant values

Patent number: 11579881

Abstract: Disclosed embodiments relate to instructions for vector operations with immediate values. In one example, a system includes a memory and a processor that includes fetch circuitry to fetch an instruction from a code storage, the instruction including an opcode, a destination identifier to specify a destination vector register, a first immediate, and a write mask identifier to specify a write mask register, the write mask register including at least one bit corresponding to each destination vector register element, the at least one bit to specify whether the destination vector register element is masked or unmasked; decode circuitry to decode the fetched instruction; and execution circuitry to execute the decoded instruction to use the write mask register to determine unmasked elements of the destination vector register and, when the opcode specifies a broadcast, to broadcast the first immediate to one or more unmasked vector elements of the destination vector register.

Type: Grant

Filed: June 29, 2017

Date of Patent: February 14, 2023

Assignee: Intel Corporation

Inventors: Gadi Haber, Robert Valentine, Ayal Zaks, Jesus Corbal San Adrian
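
The write-masked broadcast described in the abstract of patent 11579881 can be sketched in Python as follows. This is a sketch under assumptions: the function name `masked_broadcast` is hypothetical, a set mask bit is taken to mean "unmasked", and masked elements are assumed to keep their previous contents (the abstract does not say what happens to them).

```python
def masked_broadcast(dest, immediate, write_mask):
    """Broadcast `immediate` into every unmasked element of `dest`.

    Bit i of `write_mask` corresponds to element i. Assumptions: a 1 bit
    marks the element as unmasked, and masked elements are left unchanged.
    """
    return [
        immediate if (write_mask >> i) & 1 else old
        for i, old in enumerate(dest)
    ]
```

For example, `masked_broadcast([1, 2, 3, 4], 9, 0b0101)` returns `[9, 2, 9, 4]`, because only elements 0 and 2 are unmasked.
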
Systems, methods, and apparatuses for tile load

Patent number: 11567765

Abstract: Embodiments detailed herein relate to matrix operations, in particular the loading of a matrix (tile) from memory. For example, support for a loading instruction is described in the form of decode circuitry to decode an instruction having fields for an opcode, a destination matrix operand identifier, and source memory information, and execution circuitry to execute the decoded instruction to load groups of strided data elements from memory into configured rows of the identified destination matrix operand.

Type: Grant

Filed: July 1, 2017

Date of Patent: January 31, 2023

Assignee: Intel Corporation

Inventors: Robert Valentine, Menachem Adelman, Milind B. Girkar, Zeev Sperber, Mark J. Charney, Bret L. Toll, Rinat Rappoport, Jesus Corbal, Stanislav Shwartsman, Dan Baum, Igor Yanover, Alexander F. Heinecke, Barukh Ziv, Elmoustapha Ould-Ahmed-Vall, Yuri Gebil
True/false vector index registers and methods of populating thereof

Patent number: 11507374

Abstract: Disclosed herein are vector index registers for storing or loading indexes of true and/or false results of comparison operations in vector processors. Each of the vector index registers stores multiple addresses for accessing multiple positions in operand vectors.

Type: Grant

Filed: May 20, 2019

Date of Patent: November 22, 2022

Assignee: Micron Technology, Inc.

Inventor: Steven Jeffrey Wallach
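
The idea in the abstract of patent 11507374, recording the positions of true and false comparison results in separate index registers, can be sketched in Python as follows. The function name `populate_index_registers` and the use of lists as stand-ins for the vector index registers are assumptions made for illustration only.

```python
import operator

def populate_index_registers(vec_a, vec_b, compare):
    """Compare two operand vectors element by element.

    Returns two lists standing in for the vector index registers:
    the positions where the comparison was true, and the positions
    where it was false.
    """
    true_indexes, false_indexes = [], []
    for i, (a, b) in enumerate(zip(vec_a, vec_b)):
        if compare(a, b):
            true_indexes.append(i)
        else:
            false_indexes.append(i)
    return true_indexes, false_indexes

# The stored indexes can later address positions in an operand vector.
a = [1, 5, 3, 7]
t_idx, f_idx = populate_index_registers(a, [4, 4, 4, 4], operator.lt)
selected = [a[i] for i in t_idx]  # elements of `a` that are less than 4
```
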
Systems, methods, and apparatuses for heterogeneous computing

Patent number: 11416281

Abstract: Embodiments of systems, methods, and apparatuses for heterogeneous computing are described. In some embodiments, a hardware heterogeneous scheduler dispatches instructions for execution on one or more of a plurality of heterogeneous processing elements, the instructions corresponding to a code fragment to be processed by the one or more of the plurality of heterogeneous processing elements, wherein the instructions are native instructions to at least one of the one or more of the plurality of heterogeneous processing elements.

Type: Grant

Filed: December 31, 2016

Date of Patent: August 16, 2022

Assignee: Intel Corporation

Inventors: Rajesh M. Sankaran, Gilbert Neiger, Narayan Ranganathan, Stephen R. Van Doren, Joseph Nuzman, Niall D. McDonnell, Michael A. O'Hanlon, Lokpraveen B. Mosur, Tracy Garrett Drysdale, Eriko Nurvitadhi, Asit K. Mishra, Ganesh Venkatesh, Deborah T. Marr, Nicholas P. Carter, Jonathan D. Pearce, Edward T. Grochowski, Richard J. Greco, Robert Valentine, Jesus Corbal, Thomas D. Fletcher, Dennis R. Bradford, Dwight P. Manley, Mark J. Charney, Jeffrey J. Cook, Paul Caprioli, Koichi Yamada, Kent D. Glossop, David B. Sheffield
Cluster of scalar engines to accelerate intersection in leaf node

Patent number: 11263799

Abstract: Cluster of acceleration engines to accelerate intersections.

Type: Grant

Filed: June 24, 2020

Date of Patent: March 1, 2022

Assignee: INTEL CORPORATION

Inventors: Prasoonkumar Surti, Carsten Benthin, Karthik Vaidyanathan, Philip Laws, Scott Janus, Sven Woop
Computer architecture with fixed program dataflow elements and stream processor

Patent number: 11151077

Abstract: A hardware accelerator for computers combines a stand-alone, high-speed, fixed program dataflow functional element with a stream processor, the latter of which may autonomously access memory in predefined access patterns after receiving simple stream instructions and provide them to the dataflow functional element. The result is a compact, high-speed processor that may exploit fixed program dataflow functional elements.

Type: Grant

Filed: June 28, 2017

Date of Patent: October 19, 2021

Assignee: Wisconsin Alumni Research Foundation

Inventors: Karthikeyan Sankaralingam, Anthony Nowatzki, Vinay Gangadhar
Pipelining multi-directional reduction

Patent number: 11093438

Abstract: Embodiments for pipelining multi-directional reduction by one or more processors in a computing system. One or more reduce scatter operations and one or more all-gather operations may be assigned to each of a plurality of independent networks. The one or more reduce scatter operations and the one or more all-gather operations may be sequentially executed in each of the plurality of independent networks according to a serialized execution order and a defined time period.

Type: Grant

Filed: January 7, 2019

Date of Patent: August 17, 2021

Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: Minsik Cho, Ulrich Finkler, David Kung
Static shared memory access with one piece of input data to be reused for successive execution of one instruction in a reconfigurable parallel processor

Patent number: 10956360

Abstract: Processors, systems and methods are provided for thread level parallel processing. A processor may comprise a plurality of processing elements (PEs) that each may comprise an arithmetic logic unit (ALU), a data buffer associated with the ALU, and an indicator associated with the data buffer to indicate whether a piece of data inside the data buffer is to be reused for repeated execution of a same instruction as a pipeline stage.

Type: Grant

Filed: March 13, 2018

Date of Patent: March 23, 2021

Assignee: AZURENGINE TECHNOLOGIES ZHUHAI INC.

Inventors: Yuan Li, Jianbin Zhu
Instruction and logic for processing text strings

Patent number: 10929131

Abstract: Method, apparatus, and program means for performing a string comparison operation. In one embodiment, an apparatus includes execution resources to execute a first instruction. In response to the first instruction, said execution resources store a result of a comparison between each data element of a first and second operand corresponding to a first and second text string, respectively.

Type: Grant

Filed: April 15, 2019

Date of Patent: February 23, 2021

Assignee: Intel Corporation

Inventors: Michael A. Julier, Jeffrey D. Gray, Srinivas Chennupaty, Sean P. Mirkes, Mark P. Seconi
Handling of inter-element address hazards for vector instructions

Patent number: 10922084

Abstract: An apparatus has processing circuitry supporting vector load and store instructions. In response to a transaction start event, the processing circuitry executes one or more subsequent instructions speculatively. In response to a transaction end event, the processing circuitry commits speculative results of those instructions. Hazard detection circuitry detects whether an inter-element address hazard occurs between an address for data element J for an earlier vector load instruction and an address for data element K for a later vector store instruction, where K and J are not equal. In response to detecting the inter-element address hazard, the hazard detection circuitry triggers the processing circuitry to abort further processing of the instructions following the transaction start event and to prevent the speculative results being committed. This approach can provide faster performance for vectorised code.

Type: Grant

Filed: August 14, 2017

Date of Patent: February 16, 2021

Assignee: ARM Limited

Inventors: Matthew James Horsnell, Mbou Eyole
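
The inter-element hazard condition in the abstract of patent 10922084 can be sketched in Python as follows. This models only the detection condition, not the speculative transaction or the abort path; the function name `has_inter_element_hazard` and the representation of addresses as integers in lists are assumptions.

```python
def has_inter_element_hazard(load_addrs, store_addrs):
    """Return True if element J of the earlier vector load and element K of
    the later vector store touch the same address for some J != K.

    Matching addresses at the same element position (J == K) are not
    treated as a hazard, following the condition stated in the abstract.
    """
    for j, load_addr in enumerate(load_addrs):
        for k, store_addr in enumerate(store_addrs):
            if j != k and load_addr == store_addr:
                return True
    return False
```

In the scheme the abstract describes, a `True` result would correspond to aborting the speculatively executed instructions instead of committing their results.
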
Antenna tuning control using general purpose input/output data

Patent number: 10911092

Abstract: A digital-to-analog converter (DAC) and a method for operating the DAC are disclosed. The DAC receives, over a first channel, a control signal that is transmitted in accordance with a binary protocol. The DAC also receives, over a second channel different than the first channel, data that is transmitted in accordance with a multilevel communication protocol that is different than the binary protocol. The DAC determines a plurality of first and second voltages based on the received data and identifies, based on the control signal, a time when data transmission or reception is switched between first and second antennas. In response to identifying, based on the control signal, the time when data transmission or reception is switched, the DAC outputs the determined plurality of first voltages to a first antenna tuning circuit or the determined plurality of second voltages to a second antenna tuning circuit.

Type: Grant

Filed: November 7, 2019

Date of Patent: February 2, 2021

Assignees: STMicroelectronics (Tours) SAS, STMicroelectronics (Shenzhen) R&D Co. Ltd

Inventors: Songfeng Zhao, Jean Pierre Proot
Strideshift instruction for transposing bits inside vector register

Patent number: 10884750

Abstract: A processor includes a decode circuit to decode an instruction into a decoded instruction and an execution circuit to execute the decoded instruction to access a first bit of a first input vector located at a bit position indicated by an element of a second input vector, stride over bits of the first input vector using a stride to access bits of the first input vector that are located at a strided bit position with respect to the first bit of the first input vector, and store the first bit of the first input vector and the bits of the first input vector that are located at a strided bit position with respect to the first bit of the first input vector as consecutive bits in a destination vector.

Type: Grant

Filed: February 28, 2017

Date of Patent: January 5, 2021

Assignee: INTEL CORPORATION

Inventors: Mikhail Plotnikov, Igor Ermolaev
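
The bit gathering described in the abstract of patent 10884750 can be sketched in Python as follows. This is a simplified model under assumptions: the first input vector is represented as a Python integer, a single start position stands in for one element of the second input vector, and the function name `strideshift` and the `count` parameter are hypothetical.

```python
def strideshift(src_bits, start, stride, count):
    """Collect `count` bits from `src_bits`, beginning at bit `start` and
    stepping by `stride`, and pack them as consecutive bits of the result.

    Bit 0 of the result is the bit found at `start`, bit 1 is the bit
    found at `start + stride`, and so on.
    """
    result = 0
    for n in range(count):
        bit = (src_bits >> (start + n * stride)) & 1
        result |= bit << n
    return result
```

For example, `strideshift(0b10010010, 1, 3, 3)` reads bits 1, 4, and 7, all of which are set, and returns `0b111`.
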
Apparatus and methods for vector operations

Patent number: 10831861

Abstract: Aspects for vector operations in neural network are described herein. The aspects may include a vector caching unit configured to store a first vector and a second vector. The first vector may include one or more first elements and the second vector may include one or more second elements. The aspects may further include a computation module configured to calculate a cross product between the first vector and the second vector in response to an instruction.

Type: Grant

Filed: October 26, 2018

Date of Patent: November 10, 2020

Assignee: CAMBRICON TECHNOLOGIES CORPORATION LIMITED

Inventors: Tao Luo, Tian Zhi, Shaoli Liu, Tianshi Chen, Yunji Chen
Multistage collector for outputs in multiprocessor systems

Patent number: 10783605

Abstract: Aspects include a multistage collector to receive outputs from plural processing elements. Processing elements may comprise (each or collectively) a plurality of clusters, with one or more ALUs that may perform SIMD operations on a data vector and produce outputs according to the instruction stream being used to configure the ALU(s). The multistage collector includes substituent components each with at least one input queue, a memory, a packing unit, and an output queue; these components can be sized to process groups of input elements of a given size, and can have multiple input queues and a single output queue. Some components couple to receive outputs from the ALUs and others receive outputs from other components. Ultimately, the multistage collector can output groupings of input elements. Each grouping of elements (e.g., at input queues, or stored in the memories of component) can be formed based on matching of index elements.

Type: Grant

Filed: February 4, 2019

Date of Patent: September 22, 2020

Assignee: Imagination Technologies Limited

Inventors: James Alexander McCombe, Steven John Clohset, Jason Rupert Redgrave, Luke Tilman Peterson
Virtualization and central coordination in wireless networks

Patent number: 10771981

Abstract: A method for cellular network operation includes establishing communications in a cellular network between a given user equipment (UE) and a base station, comprising a virtualization platform and radio transceiver points (R-TP). A central coordinator selects a downlink coordinated transmission mode involving a first and a second R-TP which receive messages, over a control interface between the virtualization platform and each R-TP, including specifications of time-frequency resources to be used for user data and reference signals transmission. The radio processing functions in the communications with the given UE are performed while executing the selected coordinated mode by both selected R-TPs in accordance with the time-frequency resource specifications.

Type: Grant

Filed: September 5, 2016

Date of Patent: September 8, 2020

Inventor: Mariana Goldhamer
System and method for biometric authentication in connection with camera-equipped devices

Patent number: 10728242

Abstract: The present invention relates generally to the use of biometric technology for authentication and identification, and more particularly to non-contact based solutions for authenticating and identifying users, via computers, such as mobile devices, to selectively permit or deny access to various resources. In the present invention authentication and/or identification is performed using an image or a set of images of an individual's palm through a process involving the following key steps: (1) detecting the palm area using local classifiers; (2) extracting features from the region(s) of interest; and (3) computing the matching score against user models stored in a database, which can be augmented dynamically through a learning process.

Type: Grant

Filed: October 5, 2018

Date of Patent: July 28, 2020

Assignee: Element Inc.

Inventors: Yann LeCun, Adam Perold, Yang Wang, Sagar Waghmare
Vector reduction processor

Patent number: 10706007

Abstract: A vector reduction circuit configured to reduce an input vector of elements comprises a plurality of cells, wherein each of the plurality of cells other than a designated first cell that receives a designated first element of the input vector is configured to receive a particular element of the input vector, receive, from another of the one or more cells, a temporary reduction element, perform a reduction operation using the particular element and the temporary reduction element, and provide, as a new temporary reduction element, a result of performing the reduction operation using the particular element and the temporary reduction element. The vector reduction circuit also comprises an output circuit configured to provide, for output as a reduction of the input vector, a new temporary reduction element corresponding to a result of performing the reduction operation using a last element of the input vector.

Type: Grant

Filed: September 12, 2018

Date of Patent: July 7, 2020

Assignee: Google LLC

Inventors: Gregory Michael Thorson, Andrew Everett Phelps, Olivier Temam
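
The chain of cells described in the abstract of patent 10706007 can be sketched in Python as a sequential loop. This is a software analogy, not the circuit: the function name `chain_reduce` is hypothetical, and the designated first cell is assumed to pass its element on unchanged as the initial temporary reduction element.

```python
import operator

def chain_reduce(vector, op):
    """Reduce `vector` by passing a temporary reduction element from cell
    to cell, combining it with each cell's own input element via `op`."""
    elements = iter(vector)
    try:
        # Assumption: the first cell forwards its element unchanged.
        temp = next(elements)
    except StopIteration:
        raise ValueError("cannot reduce an empty vector")
    for element in elements:
        # Each later cell produces a new temporary reduction element.
        temp = op(element, temp)
    # The output corresponds to the result for the last input element.
    return temp
```

For example, `chain_reduce([3, 1, 4, 1, 5], operator.add)` returns `14`, and passing `max` as the operation returns `5`.
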
Instructions and logic to perform floating-point and integer operations for machine learning

Patent number: 10353706

Abstract: One embodiment provides for a graphics processing unit to accelerate machine-learning operations, the graphics processing unit comprising a multiprocessor having a single instruction, multiple thread (SIMT) architecture, the multiprocessor to execute at least one single instruction; and a first compute unit included within the multiprocessor, the at least one single instruction to cause the first compute unit to perform a two-dimensional matrix multiply and accumulate operation, wherein to perform the two-dimensional matrix multiply and accumulate operation includes to compute a 32-bit intermediate product of 16-bit operands and to compute a 32-bit sum based on the 32-bit intermediate product.

Type: Grant

Filed: November 21, 2017

Date of Patent: July 16, 2019

Assignee: Intel Corporation

Inventors: Himanshu Kaul, Mark A. Anders, Sanu K. Mathew, Anbang Yao, Joydeep Ray, Ping T. Tang, Michael S. Strickland, Xiaoming Chen, Tatiana Shpeisman, Abhishek R. Appu, Altug Koker, Kamal Sinha, Balaji Vembu, Nicolas C. Galoppo Von Borries, Eriko Nurvitadhi, Rajkishore Barik, Tsung-Han Lin, Vasanth Ranganathan, Sanjeev Jahagirdar
Centrality measure ranking for a multiplex network

Patent number: 10341190

Abstract: Centrality measure ranking for a multiplex network is provided by a method that includes obtaining a representation of a multiplex network including layers and nodes representing communicating entities. The method determines a node centrality measure for each node of the nodes. This includes determining intra-layer and inter-layer centrality measures. The method determines a respective centrality measure for each communicating entity as a function of node centrality measures for nodes representing the communicating entity across the layers of the multiplex network. The method also ranks the communicating entities by their centrality measures.

Type: Grant

Filed: November 14, 2017

Date of Patent: July 2, 2019

Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: Krishnasuri Narayanam, Ramasuri Narayanam, Mukundan Sundararajan
Concurrent program execution optimization

Patent number: 10318353

Abstract: An architecture for load-balanced groups of multi-stage manycore processors shared dynamically among a set of software applications, with capabilities for destination task defined intra-application prioritization of inter-task communications (ITC), for architecture-based ITC performance isolation between the applications, as well as for prioritizing application task instances for execution on cores of manycore processors based at least in part on which of the task instances have available to them the input data, such as ITC data, that they need for executing.

Type: Grant

Filed: September 16, 2016

Date of Patent: June 11, 2019

Inventor: Mark Henrik Sandstrom
Vector permutation circuit and vector processor

Patent number: 10303473

Abstract: A vector permutation circuit and a vector processor are provided. The vector permutation circuit includes a grouping unit, m selection units connected to the grouping unit, j switching units connected to the m selection units, and a control unit connected to each selection unit and each switching unit, where each switching unit is connected to m/j selection units; the grouping unit divides to-be-permutated vector data into n vector data groups and outputs the n vector data groups to the m selection units; under control of the control unit, each selection unit selects a second vector data group from an input first vector data group, and outputs the second vector data group to a switching unit connected to the selection unit; under control of the control unit, each switching unit switches and outputs elements in the input second vector data group.

Type: Grant

Filed: September 29, 2016

Date of Patent: May 28, 2019

Assignee: HUAWEI TECHNOLOGIES CO., LTD

Inventors: Yunbi Chen, Kai Hu
Data processing apparatus and method for controlling performance of speculative vector operations

Patent number: 10261789

Abstract: A data processing apparatus and a method of controlling performance of speculative vector operations are provided. The apparatus comprises processing circuitry for performing a sequence of speculative vector operations on vector operands, each vector operand comprising a plurality of vector elements, and speculation control circuitry for maintaining a speculation width indication indicating the number of vector elements of each vector operand to be subjected to the speculative vector operations. The speculation width indication is set to an initial value prior to performance of the sequence of speculative vector operations. The processing circuitry generates progress indications during performance of the sequence of speculative vector operations, and the speculation control circuitry detects, with reference to the progress indications and speculation reduction criteria, presence of a speculation reduction condition.

Type: Grant

Filed: August 18, 2014

Date of Patent: April 16, 2019

Assignee: ARM Limited

Inventors: Alastair David Reid, Daniel Kershaw
Processors, methods, and systems to access a set of registers as either a plurality of smaller registers or a combined larger register

Patent number: 10228941

Abstract: A processor of an aspect includes a set of registers capable of storing packed data. An execution unit is coupled with the set of registers. The execution unit is to access the set of registers in at least two different ways in response to instructions. The at least two different ways include a first way in which the set of registers are to represent a plurality of N-bit registers. The at least two different ways also include a second way in which the set of registers are to represent a single register of at least 2N-bits. In one aspect, the at least 2N-bits is to be at least 256-bits.

Type: Grant

Filed: June 28, 2013

Date of Patent: March 12, 2019

Assignee: Intel Corporation

Inventors: Bret L. Toll, Ronak Singhal, Buford M. Guy, Mishali Naik
System and method for biometric authentication in connection with camera equipped devices

Patent number: 10135815

Abstract: The present invention relates generally to the use of biometric technology for authentication and identification, and more particularly to non-contact based solutions for authenticating and identifying users, via computers, such as mobile devices, to selectively permit or deny access to various resources. In the present invention authentication and/or identification is performed using an image or a set of images of an individual's palm through a process involving the following key steps: (1) detecting the palm area using local classifiers; (2) extracting features from the region(s) of interest; and (3) computing the matching score against user models stored in a database, which can be augmented dynamically through a learning process.

Type: Grant

Filed: August 1, 2014

Date of Patent: November 20, 2018

Assignee: ELEMENT, INC.

Inventors: Yann LeCun, Adam Perold, Yang Wang, Sagar Waghmare
Access node architecture for 5G radio and other access networks

Patent number: 10085302

Abstract: An access node for a telecommunications network is partitioned into a front end unit and a back end unit coupled by an internet protocol (IP) packet based communication link to provide for data and control packets to be sent between the back end unit and the front end unit. The front end unit performs physical layer and media access control (MAC) sublayer processing for data for transmission to/from user equipment in the network using baseband processing units that perform highly parallel floating/fixed point operations. The back end unit includes a plurality of general purpose processors to provide data link layer and network layer processing. Back end portions may be pooled to provide greater efficiency.

Type: Grant

Filed: September 15, 2017

Date of Patent: September 25, 2018

Assignees: AT&T Mobility II LLC, AT&T Intellectual Property I, L.P.

Inventors: Dimas R. Noriega, Arthur R. Brisebois, Giuseppe De Rosa, Henry J. Fowler, Jr.
Generalized resettable memory

Patent number: 9958917

Abstract: Disclosed is a resettable memory device including a memory unit, a reset status indicator circuit, a logic sampling circuit, and a multiplexer for performing a reset function. The memory unit includes cells for storing states of signals in a design under test. The reset status indicator stores states of indicators indicating whether corresponding cells should be reset or not. Responsive to the reset status indicator indicating that the value of the cell should not be reset, the multiplexer receives the value stored in the cell and outputs the retrieved value from the cell. Responsive to the reset status indicator indicating that the value of the cell should be reset, the multiplexer outputs a reset value instead of the value stored in the cell. The reset value may be changed by the logic sampling circuit at different time periods or certain logic conditions, and output through the multiplexer.

Type: Grant

Filed: December 2, 2016

Date of Patent: May 1, 2018

Assignee: Synopsys, Inc.

Inventors: Ngai Ngai William Hung, Dhiraj Goswami
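
The read path described in the abstract of patent 9958917, where a multiplexer returns either the stored cell value or a reset value depending on a per-cell indicator, can be sketched in Python as follows. The class name `ResettableMemory`, its method names, and the rule that a write clears a cell's reset indicator are assumptions added for illustration.

```python
class ResettableMemory:
    """Memory whose cells appear reset without being rewritten."""

    def __init__(self, size, reset_value=0):
        self.cells = [0] * size
        self.needs_reset = [False] * size  # reset status indicators
        self.reset_value = reset_value     # may be changed over time

    def write(self, index, value):
        self.cells[index] = value
        # Assumption: writing a cell clears its reset indicator.
        self.needs_reset[index] = False

    def reset_all(self):
        # Mark every cell as reset; stored values are left untouched.
        self.needs_reset = [True] * len(self.cells)

    def read(self, index):
        # Multiplexer: choose the reset value or the stored value.
        if self.needs_reset[index]:
            return self.reset_value
        return self.cells[index]
```

Because reads consult the indicator first, changing `reset_value` after `reset_all()` changes what every still-reset cell returns, which mirrors the abstract's note that the reset value may change over time.
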
Apparatus and method for vector processing

Patent number: 9916130

Abstract: An apparatus comprises processing circuitry for performing, in response to a vector instruction, a plurality of lanes of processing on respective data elements of at least one operand vector to generate corresponding result data elements of a result vector. The processing circuitry may support performing at least two of the lanes of processing with different rounding modes for generating rounded values for the corresponding result data elements of the result vector. This allows two or more calculations with different rounding modes to be executed in response to a single instruction, to improve performance.

Type: Grant

Filed: December 9, 2014

Date of Patent: March 13, 2018

Assignee: ARM Limited

Inventors: David Raymond Lutz, Neil Burgess
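
The per-lane rounding idea in the abstract of patent 9916130 can be sketched in Python with the standard `decimal` module. This is a software analogy only: the function name `round_lanes`, the use of decimal strings as lane inputs, and rounding to whole integers are assumptions, and the sketch says nothing about how the hardware selects a mode for each lane.

```python
from decimal import Decimal, ROUND_FLOOR, ROUND_CEILING, ROUND_HALF_EVEN

def round_lanes(values, modes):
    """Round each lane's value to an integer using that lane's own
    rounding mode, as if one vector instruction handled all lanes."""
    return [
        int(Decimal(value).quantize(Decimal(1), rounding=mode))
        for value, mode in zip(values, modes)
    ]
```

For example, `round_lanes(["2.5", "2.5", "2.5"], [ROUND_FLOOR, ROUND_CEILING, ROUND_HALF_EVEN])` returns `[2, 3, 2]`: the same input rounds differently in each lane.
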
Method and apparatus for performing an efficient scatter

Patent number: 9891914

Abstract: An apparatus and method for performing an efficient scatter operation. For example, one embodiment of a processor comprises: an allocator unit to receive a scatter operation comprising a number of data elements and responsively allocate resources to execute the scatter operation; a memory execution cluster comprising at least a portion of the resources to execute the scatter operation, the resources including one or more store data buffers and one or more store address buffers; and a senior store pipeline to transfer store data elements from the store data buffers to system memory using addresses from the store address buffers prior to retirement of the scatter operation.

Type: Grant

Filed: April 10, 2015

Date of Patent: February 13, 2018

Assignee: Intel Corporation

Inventors: Ramon Matas, Alexey P. Suprun, Roger Gramunt, Chung-Lun Chan, Rammohan Padmanabhan
Remote radio head and associated method

Patent number: 9887714

Abstract: A remote radio head is provided that is configured to provide a radio interface for a network node. The remote radio head comprises an antenna, an analog interface for connecting with the network node, radio frequency (RF) circuitry configured to convert between intermediate frequency signals of the analog interface and RF signals of the antenna, digital circuitry configured to process transmission and/or reception signals, a first analog to digital converter (ADC) connected to the digital circuitry, and a first digital to analog converter (DAC) connected to the digital circuitry. The first ADC, the digital circuitry, and the first DAC are connected between the antenna and the analog interface for receiving or transmitting radio signals. A corresponding method is also presented.

Type: Grant

Filed: July 4, 2014

Date of Patent: February 6, 2018

Assignee: TELEFONAKTIEBOLAGET LM ERICSSON (PUBL)

Inventors: Marko E. Leinonen, Kauko Heinikoski
Methods and systems for fast set-membership tests using one or more processors that support single instruction multiple data instructions

Patent number: 9886459

Abstract: Methods and apparatuses for determining set-membership using Single Instruction Multiple Data (“SIMD”) architecture are presented herein. Specifically, methods and apparatuses are discussed for determining, in parallel, whether multiple values in a first set of values are members of a second set of values. Many of the methods and systems discussed herein are applied to determining whether one or more rows in a dictionary-encoded column of a database table satisfy one or more conditions based on the dictionary-encoded column. However, the methods and systems discussed herein may apply to many applications executed on a SIMD processor using set-membership tests.

Type: Grant

Filed: July 22, 2014

Date of Patent: February 6, 2018

Assignee: Oracle International Corporation

Inventors: Shasank K. Chavan, Phumpong Watanaprakornkul
Bus communications with multi-device messaging

Patent number: 9798684

Abstract: Methods and systems are described for reading from or writing to a plurality of slave devices connected to a communications bus having a common data line. The slave devices are mapped to a virtual device address and the communication is initiated by the master by signaling a start condition and the virtual device address. Each of the slave devices mapped to the virtual device address identifies a register in that slave device associated with the virtual device address and, in sequence, performs a read or write operation on the bus with regard to its identified register in a respective predetermined time slot within the communication or to a corresponding virtual register address assigned to the slave device previously.

Type: Grant

Filed: April 21, 2015

Date of Patent: October 24, 2017

Assignee: BLACKBERRY LIMITED

Inventor: Jens Kristian Poulsen
Vector processing engines (VPEs) employing a tapped-delay line(s) for providing precision filter vector processing operations with reduced sample re-fetching and power consumption, and related vector processor systems and methods

Patent number: 9792118

Abstract: Vector processing engines (VPEs) employing a tapped-delay line(s) for providing precision filter vector processing operations with reduced sample re-fetching and power consumption are disclosed. Related vector processor systems and methods are also disclosed. The VPEs are configured to provide filter vector processing operations. To minimize re-fetching of input vector data samples from memory to reduce power consumption, a tapped-delay line(s) is included in the data flow paths between a vector data file and execution units in the VPE. The tapped-delay line(s) is configured to receive and provide input vector data sample sets to execution units for performing filter vector processing operations. The tapped-delay line(s) is also configured to shift the input vector data sample set for filter delay taps and provide the shifted input vector data sample set to execution units, so the shifted input vector data sample set does not have to be re-fetched during filter vector processing operations.

Type: Grant

Filed: November 15, 2013

Date of Patent: October 17, 2017

Assignee: QUALCOMM Incorporated

Inventors: Raheel Khan, Fahad Ali Mujahid, Afshin Shiravi
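
Illustrative sketch (not the patented design): the abstract for patent 9792118 describes a tapped-delay line that shifts input samples between filter taps so they need not be re-fetched from memory. A software analogue, assuming a simple FIR filter and a hypothetical function name, keeps the most recent samples in a fixed-length delay line and shifts in one new sample per output.

```python
from collections import deque


def fir_with_tapped_delay_line(samples, coefficients):
    """FIR filter where each input sample is fetched exactly once."""
    # Delay line starts zero-filled; newest sample sits at index 0.
    taps = deque([0] * len(coefficients), maxlen=len(coefficients))
    outputs = []
    for sample in samples:
        # Shifting in a new sample drops the oldest one automatically.
        taps.appendleft(sample)
        outputs.append(sum(c * x for c, x in zip(coefficients, taps)))
    return outputs


if __name__ == "__main__":
    print(fir_with_tapped_delay_line([1, 2, 3, 4], [1, 1]))
```

The point of the sketch is only the data movement: older samples are reused from the delay line rather than read again from the input.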
Access node architecture for 5G radio and other access networks

Patent number: 9769873

Abstract: An access node for a telecommunications network is partitioned into a front end unit and a back end unit coupled by an internet protocol (IP) packet based communication link to provide for data and control packets to be sent between the back end unit and the front end unit. The front end unit performs physical layer and media access layer (MAC) sublayer processing for data for transmission to/from user equipment in the network using baseband processing units that perform highly parallel floating/fixed point operations. The back end unit includes a plurality of general purpose processors to provide data link layer and network layer processing. Back end portions may be pooled to provide greater efficiency.

Type: Grant

Filed: December 29, 2015

Date of Patent: September 19, 2017

Assignees: AT&T Intellectual Property I, L.P., AT&T Mobility II LLC

Inventors: Dimas R. Noriega, Arthur R. Brisebois, Giuseppe De Rosa, Henry J. Fowler, Jr.
Multi-processor integrated circuits

Patent number: 9753765

Abstract: An integrated circuit unit and method for synchronizing processing threads running on respective processors are provided. The unit includes an interrupt request controller which is programmable to provide a first desired number of synchronization objects and a second desired number of interrupt request signals for supply to such processors. The controller is operable to direct an interrupt request signal to a chosen processor in dependence upon data received from the processors.

Type: Grant

Filed: March 22, 2004

Date of Patent: September 5, 2017

Assignee: Altera Corporation

Inventor: Robert Jackson
Apparatus and method of vector unit sharing

Patent number: 9727526

Abstract: A reconfigurable vector processor is described that allows the size of its vector units to be changed in order to process vectors of different sizes. The reconfigurable vector processor comprises a plurality of processor units. Each of the processor units comprises a control unit for decoding instructions and generating control signals, a scalar unit for processing instructions on scalar data, and a vector unit for processing instructions on vector data under control of control signals. The reconfigurable vector processor architecture also comprises a vector control selector for selectively providing control signals generated by one processor unit of the plurality of processor units to the vector unit of a different processor unit of the plurality of processor units.

Type: Grant

Filed: January 25, 2011

Date of Patent: August 8, 2017

Assignee: NXP USA, Inc.

Inventors: Malcolm Stewart, Ali Osman Ors, Daniel Laroche
System and method for updating configuration data for sub-systems of an automated banking machine

Patent number: 9665360

Abstract: A computer implemented method for updating configuration data in at least one automated banking machine is configured to execute configuration update steps embodied with a computer readable medium. The method includes identifying one or more sub-systems implemented within the automated banking machine, receiving an update to configuration data for at least one of the identified sub-systems, generating a restore point based on a current implementation of the sub-systems for the automated banking machine, and installing the configuration data in the automated banking machine. The identified sub-systems can include at least two of roll storage modules, a note handling module controller, a note detector module, and an interface controller.

Type: Grant

Filed: July 24, 2012

Date of Patent: May 30, 2017

Assignee: Glory Global Solutions (International) Limited

Inventors: Dominik Cipa, Gunnar Kunz, Ulrich Marti, Olivier Martin
Packed data rearrangement control indexes precursors generation processors, methods, systems, and instructions

Patent number: 9639354

Abstract: A method of an aspect includes receiving an instruction indicating a destination storage location. A result is stored in the destination storage location in response to the instruction. The result includes a sequence of at least four non-negative integers. In an aspect, values of the at least four non-negative integers are not calculated using a result of a preceding instruction. Other methods, apparatus, systems, and instructions are disclosed.

Type: Grant

Filed: December 22, 2011

Date of Patent: May 2, 2017

Assignee: Intel Corporation

Inventors: Seth Abraham, Robert Valentine, Elmoustapha Ould-Ahmed-Vall, Zeev Sperber, Amit Gradstein
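
Illustrative sketch (not the patented instruction): the abstract for patent 9639354 describes an instruction whose result is a sequence of at least four non-negative integers that does not depend on a preceding instruction's result. One plausible use, assumed here, is as control indexes for a later permute. The `start` and `stride` parameters and both function names are hypothetical.

```python
def generate_control_indexes(count, start=0, stride=1):
    """Produce `count` non-negative integers from fixed parameters only."""
    if count < 4:
        raise ValueError("the abstract describes at least four integers")
    if start < 0 or stride < 0:
        raise ValueError("indexes must be non-negative")
    return [start + i * stride for i in range(count)]


def permute(source, indexes):
    """Rearrange `source` by picking the element at each control index."""
    return [source[i] for i in indexes]


if __name__ == "__main__":
    idx = generate_control_indexes(4, start=0, stride=2)
    print(idx, permute([10, 11, 12, 13, 14, 15, 16, 17], idx))
```

Because the indexes come only from the instruction's own parameters, no earlier load or computation has to finish before they are available.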
Vector register addressing and functions based on a scalar register data value

Patent number: 9632781

Abstract: Techniques are provided for executing a vector alignment instruction. A scalar register file in a first processor is configured to share one or more register values with a second processor, the one or more register values accessed from the scalar register file according to an Rt address specified in a vector alignment instruction, wherein a start location is determined from one of the shared register values. An alignment circuit in the second processor is configured to align data identified between the start location within a beginning Vu register of a vector register file (VRF) and an end location of a last Vu register of the VRF according to the vector alignment instruction. A store circuit is configured to select the aligned data from the alignment circuit and store the aligned data in the vector register file according to an alignment store address specified by the vector alignment instruction.

Type: Grant

Filed: February 26, 2013

Date of Patent: April 25, 2017

Assignee: QUALCOMM Incorporated

Inventors: Ajay A. Ingle, Marc M. Hoffman, Jose Fridman, Lucian Codrescu
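
Illustrative sketch (not the patented circuit): the abstract for patent 9632781 describes aligning data that begins at a start location in one vector register and ends in another, with the start location taken from a scalar register value. The version below assumes two equal-length registers, an element-granular offset, and low-index-first ordering. The function name `valign` and these conventions are assumptions.

```python
def valign(vu_first, vu_last, start):
    """Return one register's worth of data beginning at `start`."""
    width = len(vu_first)
    if len(vu_last) != width:
        raise ValueError("registers must have equal length")
    if not 0 <= start <= width:
        raise ValueError("start must lie within the first register")
    # Concatenate the two registers, then take a window of one register width.
    combined = list(vu_first) + list(vu_last)
    return combined[start:start + width]


if __name__ == "__main__":
    rt = 1  # hypothetical scalar register value supplying the start location
    print(valign([0, 1, 2, 3], [4, 5, 6, 7], rt))
```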
Apparatus and method for asymmetric dual path processing

Patent number: 9477475

Abstract: According to embodiments disclosed herein, there is disclosed a computer processor architecture; and in particular a computer processor, a method of operating the same, and a computer program product that makes use of an instruction set for the computer. In one embodiment, the computer processor includes: (1) a decode unit for decoding instruction packets fetched from a memory holding the instruction packets, (2) a control processing channel capable of performing control operations and (3) a data processing channel capable of performing data processing operations, wherein, in use the decode unit causes instructions of instruction packets comprising a plurality of only control instructions to be executed sequentially on the control processing channel, and wherein, in use the decode unit causes instructions of instruction packets comprising a plurality of instructions comprising at least one data processing instruction to be executed simultaneously on the data processing channel.

Type: Grant

Filed: April 30, 2015

Date of Patent: October 25, 2016

Assignee: Nvidia Technology UK Limited

Inventor: Simon Knowles
Concurrent program execution optimization

Patent number: 9448847

Abstract: An architecture for load-balanced groups of multi-stage manycore processors shared dynamically among a set of software applications, with capabilities for destination task defined intra-application prioritization of inter-task communications (ITC), for architecture-based ITC performance isolation between the applications, as well as for prioritizing application task instances for execution on cores of manycore processors based at least in part on which of the task instances have available for them the input data, such as ITC data, that they need for executing.

Type: Grant

Filed: June 27, 2014

Date of Patent: September 20, 2016

Assignee: THROUGHPUTER, INC.

Inventor: Mark Henrik Sandstrom
Apparatus and method for non-blocking execution of static scheduled processor

Patent number: 9405546

Abstract: An apparatus and method for non-blocking execution of a static scheduled processor, the apparatus including a processor to process at least one operation using transferred input data, and an input buffer used to transfer the input data to the processor, and store a result of processing the at least one operation, wherein the processor may include at least one functional unit (FU) to execute the at least one operation, and the at least one FU may process the transferred input data using at least one of a regular latency operation and an irregular latency operation.

Type: Grant

Filed: March 6, 2014

Date of Patent: August 2, 2016

Assignee: Samsung Electronics Co., Ltd.

Inventors: Kwon Taek Kwon, Sang Oak Woo, Shi Hwa Lee, Seok Yoon Jung
Methods and systems for processing network messages in an accelerated processing device

Patent number: 9319254

Abstract: The present method and system enables receiving a radio frequency (RF) signal. The received RF signal is assigned to a single instruction multiple data (SIMD) module in an accelerated processing device (APD) for processing to extract network messages. The extracted network layer messages are further processed by the SIMD module to obtain data transmitted via the RF signal.

Type: Grant

Filed: August 3, 2012

Date of Patent: April 19, 2016

Assignee: ATI Technologies ULC

Inventor: Moiz Haq
Data mover moving data to accelerator for processing and returning result data based on instruction received from a processor utilizing software and hardware interrupts

Patent number: 9038073

Abstract: Efficient data processing apparatus and methods include hardware components which are pre-programmed by software. Each hardware component triggers the other to complete its tasks. After the final pre-programmed hardware task is complete, the hardware component issues a software interrupt.

Type: Grant

Filed: August 13, 2009

Date of Patent: May 19, 2015

Assignee: QUALCOMM Incorporated

Inventors: Mathias Kohlenz, Irfan Anwar Khan, Sathyanarayan Madhusudan, Shailesh Maheshwari, Srividhya Krishnamoorthy, Sandeep Urgaonkar, Thomas Klingenbrunn, Tim Tynghuei Liou, Idreas Mir
Scalar readXF instruction for processing vectors

Patent number: 9009528

Abstract: The described embodiments include a processor that handles faults. The processor first receives an input vector, a control vector, and a predicate vector, each vector comprising a plurality of elements. Then, for a first element of the input vector for which corresponding elements of the control vector and the predicate vector are active, the processor performs a scalar read operation using an address from the element of the input vector. When a fault condition is encountered while performing the read operation, the processor determines if the element is a first element where a corresponding element of the control vector is active. If so (i.e., if the element is a first element where a corresponding element of the control vector is active), the processor processes the fault. Otherwise, the processor masks the fault for the element.

Type: Grant

Filed: September 5, 2012

Date of Patent: April 14, 2015

Assignee: Apple Inc.

Inventor: Jeffry E. Gonion
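
Illustrative sketch (not the patented processor logic): the abstract for patent 9009528 describes a scalar read that uses the first element whose control and predicate elements are both active, and that processes a fault only when that element is also the first element with an active control element. In the sketch below, a dictionary stands in for memory, a missing address stands in for a fault condition, and a masked fault returns `None`. All of these are assumptions, as are the names.

```python
class ReadFault(Exception):
    """Raised when a fault is processed rather than masked."""


def read_xf(memory, input_vector, control, predicate):
    """Scalar read from the first element active in both vectors."""
    first_control = next((i for i, c in enumerate(control) if c), None)
    for i, address in enumerate(input_vector):
        if control[i] and predicate[i]:
            if address in memory:
                return memory[address]
            # Fault condition: process it only for the first control-active element.
            if i == first_control:
                raise ReadFault(address)
            return None  # fault masked for this element
    return None  # no element was active in both vectors


if __name__ == "__main__":
    mem = {100: 7}
    print(read_xf(mem, [100, 999], [1, 1], [1, 1]))
    print(read_xf(mem, [100, 999], [1, 1], [0, 1]))
```

In the second call the read at address 999 faults, but element 1 is not the first control-active element, so the fault is masked.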
VECTOR ARITHMETIC REDUCTION

Publication number: 20150052330

Abstract: In a particular embodiment, a method includes executing a vector instruction at a processor. The vector instruction includes a vector input that includes a plurality of elements. Executing the vector instruction includes providing a first element of the plurality of elements as a first output. Executing the vector instruction further includes performing an arithmetic operation on the first element and a second element of the plurality of elements to provide a second output. Executing the vector instruction further includes storing the first output and the second output in an output vector.

Type: Application

Filed: August 14, 2013

Publication date: February 19, 2015

Applicant: QUALCOMM Incorporated

Inventors: Ajay Anant Ingle, Marc Murray Hoffman, Deepak Mathew, Mao Zeng
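
Illustrative sketch (not the claimed instruction): the abstract for publication 20150052330 describes passing the first element through as the first output and combining the first and second elements arithmetically for the second output. Extending that pattern to every element gives a running (prefix) reduction. The extension beyond two elements, the choice of addition as the default operation, and the function name are assumptions.

```python
from itertools import accumulate
import operator


def vector_prefix_reduce(elements, op=operator.add):
    """Output i combines elements 0..i; output 0 is the first element itself."""
    return list(accumulate(elements, op))


if __name__ == "__main__":
    print(vector_prefix_reduce([1, 2, 3, 4]))
    print(vector_prefix_reduce([1, 2, 3, 4], operator.mul))
```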
Confirm instruction for processing vectors

Patent number: 8938642

Abstract: The described embodiments include a processor with a fault status register (FSR) that executes a Confirm instruction. In these embodiments, when executing the Confirm instruction, the processor receives a predicate vector that includes N elements. For a first set of bit positions in the FSR for which corresponding elements of the predicate vector are active, the processor determines if at least one of the first set of bit positions in the FSR holds a predetermined value. When at least one of the first set of bit positions in the FSR holds the predetermined value, the processor causes a fault in the processor.

Type: Grant

Filed: May 23, 2012

Date of Patent: January 20, 2015

Assignee: Apple Inc.

Inventor: Jeffry E. Gonion
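
Illustrative sketch (not the patented implementation): the abstract for patent 8938642 describes checking the fault status register (FSR) bit positions selected by active predicate elements and causing a fault when at least one holds a predetermined value. The sketch below models the FSR and the predicate as equal-length lists and assumes the predetermined value is 1. The names are hypothetical.

```python
class ConfirmFault(Exception):
    """Raised when a selected FSR bit holds the predetermined value."""


def confirm(fsr_bits, predicate, fault_value=1):
    """Raise if any FSR bit at an active predicate position equals fault_value."""
    for position, (bit, active) in enumerate(zip(fsr_bits, predicate)):
        if active and bit == fault_value:
            raise ConfirmFault(f"FSR bit {position} is set")


if __name__ == "__main__":
    confirm([0, 1, 0, 0], [1, 0, 1, 1])  # set bit is not selected, so no fault
    try:
        confirm([0, 1, 0, 0], [1, 1, 0, 0])
    except ConfirmFault as exc:
        print("fault:", exc)
```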
Audio digital signal processor

Patent number: 8935468

Abstract: A microprocessor includes a memory interface to obtain data envelopes of a first length, and control logic to implement an instruction to load an initial data envelope of a stream of data values into a buffer, each data value having a second length shorter than the first length, the stream of data values being disposed across successive data envelopes at the memory interface. Another instruction merges current contents of the buffer and the memory interface such that each invocation loads one of the data values into a first register, and moves at least a remainder of the current contents of the memory interface into the buffer for use in a successive invocation. Another instruction loads a reversed representation of a set of data values obtained via the memory interface into a second register. Another instruction implements an FIR computation including a SIMD operation involving multiple data values of the stream and the reversed representation.

Type: Grant

Filed: December 31, 2012

Date of Patent: January 13, 2015

Assignee: Cadence Design Systems, Inc.

Inventors: Dror E. Maydan, William A. Huffman, Sachin Ghanekar, Fei Sun
Multithreaded programmable direct memory access engine

Patent number: 8918553

Abstract: A mechanism for programming a direct memory access engine operating as a multithreaded processor is provided. A plurality of programs is received from a host processor in a local memory associated with the direct memory access engine. A request is received in the direct memory access engine from the host processor indicating that the plurality of programs located in the local memory is to be executed. The direct memory access engine executes two or more of the plurality of programs without intervention by a host processor. As each of the two or more of the plurality of programs completes execution, the direct memory access engine sends a completion notification to the host processor that indicates that the program has completed execution.

Type: Grant

Filed: June 5, 2012

Date of Patent: December 23, 2014

Assignee: International Business Machines Corporation

Inventors: Brian K. Flachs, Harm P. Hofstee, Charles R. Johns, Matthew E. King, John S. Liberty, Brad W. Michael
