Vector Processor Patents (Class 712/2)
  • Patent number: 11941394
    Abstract: A processor includes a decode unit to decode an instruction indicating a source packed data operand having source data elements and indicating a destination storage location. Each of the source data elements has a source data element value and a source data element position. An execution unit, in response to the instruction, stores a result packed data operand having result data elements each having a result data element value and a result data element position. Each result data element value is one of: (1) equal to a source data element position of a source data element, closest to one end of the source operand, having a source data element value equal to the result data element position of the result data element; and (2) a replacement value, when no source data element has a source data element value equal to the result data element position of the result data element.
    Type: Grant
    Filed: December 9, 2019
    Date of Patent: March 26, 2024
    Assignee: Intel Corporation
    Inventors: Christopher J. Hughes, Jong Soo Park
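    Illustration (not part of the patent record): the abstract above describes a lookup from element values back to element positions. A minimal Python sketch of that behavior follows, assuming the result has as many elements as the source, that "closest to one end" means the lowest position unless stated otherwise, and that the replacement value is -1. The function name and parameters are hypothetical.

```python
def positions_of_values(src, replacement=-1, from_low_end=True):
    """result[i] is the position of a source element whose value equals i.

    When several source elements hold the value i, the one closest to the
    chosen end wins. When none does, result[i] is the replacement value.
    """
    n = len(src)
    result = [replacement] * n
    filled = [False] * n
    order = range(n) if from_low_end else range(n - 1, -1, -1)
    for pos in order:
        val = src[pos]
        # Values outside 0..n-1 cannot name a result position; skip them.
        if 0 <= val < n and not filled[val]:
            result[val] = pos
            filled[val] = True
    return result


if __name__ == "__main__":
    print(positions_of_values([2, 0, 2, 5]))
    print(positions_of_values([2, 0, 2, 5], from_low_end=False))
```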
  • Patent number: 11620132
    Abstract: Various embodiments are provided for reusing an operand in an instruction set architecture (ISA) by one or more processors in a computing system. An instruction may specify that an operand register for a selected operand retain operand data used by a previous instruction. The operand data in the operand register may be reused by the instruction.
    Type: Grant
    Filed: May 8, 2019
    Date of Patent: April 4, 2023
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Bruce Fleischer, Sunil Shukla, Vijayalakshmi Srinivasan, Jungwook Choi
  • Patent number: 11579881
    Abstract: Disclosed embodiments relate to instructions for vector operations with immediate values. In one example, a system includes a memory and a processor that includes fetch circuitry to fetch the instruction from a code storage, the instruction including an opcode, a destination identifier to specify a destination vector register, a first immediate, and a write mask identifier to specify a write mask register, the write mask register including at least one bit corresponding to each destination vector register element, the at least one bit to specify whether the destination vector register element is masked or unmasked, decode circuitry to decode the fetched instruction, and execution circuitry to execute the decoded instruction to use the write mask register to determine unmasked elements of the destination vector register and, when the opcode specifies to broadcast, to broadcast the first immediate to one or more unmasked vector elements of the destination vector register.
    Type: Grant
    Filed: June 29, 2017
    Date of Patent: February 14, 2023
    Assignee: Intel Corporation
    Inventors: Gadi Haber, Robert Valentine, Ayal Zaks, Jesus Corbal San Adrian
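    Illustration (not part of the patent record): the masked broadcast described above can be sketched in a few lines of Python. The sketch assumes that a set mask bit means "unmasked" and that masked elements keep their previous value; the abstract fixes neither choice. The function name is hypothetical.

```python
def masked_broadcast(dest, imm, write_mask):
    """Return a copy of dest with imm written to every unmasked element.

    Bit i of write_mask controls element i. A set bit marks the element as
    unmasked (assumed polarity); masked elements are left unchanged.
    """
    return [imm if (write_mask >> i) & 1 else old
            for i, old in enumerate(dest)]


if __name__ == "__main__":
    print(masked_broadcast([1, 2, 3, 4], 9, 0b0101))
```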
  • Patent number: 11567765
    Abstract: Embodiments detailed herein relate to matrix operations, in particular the loading of a matrix (tile) from memory. For example, support for a loading instruction is described in the form of decode circuitry to decode an instruction having fields for an opcode, a destination matrix operand identifier, and source memory information, and execution circuitry to execute the decoded instruction to load groups of strided data elements from memory into configured rows of the identified destination matrix operand.
    Type: Grant
    Filed: July 1, 2017
    Date of Patent: January 31, 2023
    Assignee: Intel Corporation
    Inventors: Robert Valentine, Menachem Adelman, Milind B. Girkar, Zeev Sperber, Mark J. Charney, Bret L. Toll, Rinat Rappoport, Jesus Corbal, Stanislav Shwartsman, Dan Baum, Igor Yanover, Alexander F. Heinecke, Barukh Ziv, Elmoustapha Ould-Ahmed-Vall, Yuri Gebil
  • Patent number: 11507374
    Abstract: Disclosed herein are vector index registers for storing or loading indexes of true and/or false results of comparison operations in vector processors. Each of the vector index registers stores multiple addresses for accessing multiple positions in operand vectors.
    Type: Grant
    Filed: May 20, 2019
    Date of Patent: November 22, 2022
    Assignee: Micron Technology, Inc.
    Inventor: Steven Jeffrey Wallach
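    Illustration (not part of the patent record): one way to picture the index registers described above is as two lists of element positions, one for comparisons that came out true and one for those that came out false. The sketch below assumes a less-than comparison and uses Python lists in place of registers; all names are hypothetical.

```python
def compare_to_index_registers(a, b):
    """Compare a[i] < b[i] and record the positions of true and false results."""
    true_idx, false_idx = [], []
    for i, (x, y) in enumerate(zip(a, b)):
        (true_idx if x < y else false_idx).append(i)
    return true_idx, false_idx


def gather(vec, idx):
    """Read the operand vector only at the positions held in an index register."""
    return [vec[i] for i in idx]


if __name__ == "__main__":
    a, b = [1, 5, 3, 7], [4, 2, 6, 0]
    t, f = compare_to_index_registers(a, b)
    print(t, f, gather(a, t))
```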
  • Patent number: 11416281
    Abstract: Embodiments of systems, methods, and apparatuses for heterogeneous computing are described. In some embodiments, a hardware heterogeneous scheduler dispatches instructions for execution on one or more of a plurality of heterogeneous processing elements, the instructions corresponding to a code fragment to be processed by the one or more of the plurality of heterogeneous processing elements, wherein the instructions are native instructions to at least one of the one or more of the plurality of heterogeneous processing elements.
    Type: Grant
    Filed: December 31, 2016
    Date of Patent: August 16, 2022
    Assignee: Intel Corporation
    Inventors: Rajesh M. Sankaran, Gilbert Neiger, Narayan Ranganathan, Stephen R. Van Doren, Joseph Nuzman, Niall D. McDonnell, Michael A. O'Hanlon, Lokpraveen B. Mosur, Tracy Garrett Drysdale, Eriko Nurvitadhi, Asit K. Mishra, Ganesh Venkatesh, Deborah T. Marr, Nicholas P. Carter, Jonathan D. Pearce, Edward T. Grochowski, Richard J. Greco, Robert Valentine, Jesus Corbal, Thomas D. Fletcher, Dennis R. Bradford, Dwight P. Manley, Mark J. Charney, Jeffrey J. Cook, Paul Caprioli, Koichi Yamada, Kent D. Glossop, David B. Sheffield
  • Patent number: 11263799
    Abstract: Cluster of acceleration engines to accelerate intersections.
    Type: Grant
    Filed: June 24, 2020
    Date of Patent: March 1, 2022
    Assignee: INTEL CORPORATION
    Inventors: Prasoonkumar Surti, Carsten Benthin, Karthik Vaidyanathan, Philip Laws, Scott Janus, Sven Woop
  • Patent number: 11151077
    Abstract: A hardware accelerator for computers combines a stand-alone, high-speed, fixed program dataflow functional element with a stream processor, the latter of which may autonomously access memory in predefined access patterns after receiving simple stream instructions and provide them to the dataflow functional element. The result is a compact, high-speed processor that may exploit fixed program dataflow functional elements.
    Type: Grant
    Filed: June 28, 2017
    Date of Patent: October 19, 2021
    Assignee: Wisconsin Alumni Research Foundation
    Inventors: Karthikeyan Sankaralingam, Anthony Nowatzki, Vinay Gangadhar
  • Patent number: 11093438
    Abstract: Embodiments for pipelining multi-directional reduction by one or more processors in a computing system. One or more reduce scatter operations and one or more all-gather operations may be assigned to each of a plurality of independent networks. The one or more reduce scatter operations and the one or more all-gather operations may be sequentially executed in each of the plurality of independent networks according to a serialized execution order and a defined time period.
    Type: Grant
    Filed: January 7, 2019
    Date of Patent: August 17, 2021
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Minsik Cho, Ulrich Finkler, David Kung
  • Patent number: 10956360
    Abstract: Processors, systems and methods are provided for thread level parallel processing. A processor may comprise a plurality of processing elements (PEs) that each may comprise an arithmetic logic unit (ALU), a data buffer associated with the ALU, and an indicator associated with the data buffer to indicate whether a piece of data inside the data buffer is to be reused for repeated execution of a same instruction as a pipeline stage.
    Type: Grant
    Filed: March 13, 2018
    Date of Patent: March 23, 2021
    Assignee: AZURENGINE TECHNOLOGIES ZHUHAI INC.
    Inventors: Yuan Li, Jianbin Zhu
  • Patent number: 10929131
    Abstract: Method, apparatus, and program means for performing a string comparison operation. In one embodiment, an apparatus includes execution resources to execute a first instruction. In response to the first instruction, said execution resources store a result of a comparison between each data element of a first and second operand corresponding to a first and second text string, respectively.
    Type: Grant
    Filed: April 15, 2019
    Date of Patent: February 23, 2021
    Assignee: Intel Corporation
    Inventors: Michael A. Julier, Jeffrey D. Gray, Srinivas Chennupaty, Sean P. Mirkes, Mark P. Seconi
  • Patent number: 10922084
    Abstract: An apparatus has processing circuitry supporting vector load and store instructions. In response to a transaction start event, the processing circuitry executes one or more subsequent instructions speculatively. In response to a transaction end event, the processing circuitry commits speculative results of those instructions. Hazard detection circuitry detects whether an inter-element address hazard occurs between an address for data element J for an earlier vector load instruction and an address for data element K for a later vector store instruction, where K and J are not equal. In response to detecting the inter-element address hazard, the hazard detection circuitry triggers the processing circuitry to abort further processing of the instructions following the transaction start event and to prevent the speculative results being committed. This approach can provide faster performance for vectorised code.
    Type: Grant
    Filed: August 14, 2017
    Date of Patent: February 16, 2021
    Assignee: ARM Limited
    Inventors: Matthew James Horsnell, Mbou Eyole
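    Illustration (not part of the patent record): the inter-element hazard condition above can be stated as a small predicate. The sketch assumes one address per element and exact address equality as the overlap test; a real design would compare byte ranges. The function name is hypothetical.

```python
def has_inter_element_hazard(load_addrs, store_addrs):
    """True if load element J and store element K share an address with J != K.

    A match at the same element position (J == K) is not counted, following
    the abstract's condition that K and J are not equal.
    """
    for j, load_addr in enumerate(load_addrs):
        for k, store_addr in enumerate(store_addrs):
            if j != k and load_addr == store_addr:
                return True
    return False


if __name__ == "__main__":
    print(has_inter_element_hazard([0, 4, 8, 12], [0, 4, 8, 12]))
    print(has_inter_element_hazard([0, 4, 8, 12], [4, 8, 12, 16]))
```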
  • Patent number: 10911092
    Abstract: A digital-to-analog converter (DAC) and a method for operating the DAC are disclosed. The DAC receives, over a first channel, a control signal that is transmitted in accordance with a binary protocol. The DAC also receives, over a second channel different than the first channel, data that is transmitted in accordance with a multilevel communication protocol that is different than the binary protocol. The DAC determines a plurality of first and second voltages based on the received data and identifies, based on the control signal, a time when data transmission or reception is switched between first and second antennas. In response to identifying, based on the control signal, the time when data transmission or reception is switched, the DAC outputs the determined plurality of first voltages to a first antenna tuning circuit or the determined plurality of second voltages to a second antenna tuning circuit.
    Type: Grant
    Filed: November 7, 2019
    Date of Patent: February 2, 2021
    Assignees: STMicroelectronics (Tours) SAS, STMicroelectronics (Shenzhen) R&D Co. Ltd
    Inventors: Songfeng Zhao, Jean Pierre Proot
  • Patent number: 10884750
    Abstract: A processor includes a decode circuit to decode an instruction into a decoded instruction and an execution circuit to execute the decoded instruction to access a first bit of a first input vector located at a bit position indicated by an element of a second input vector, stride over bits of the first input vector using a stride to access bits of the first input vector that are located at a strided bit position with respect to the first bit of the first input vector, and store the first bit of the first input vector and the bits of the first input vector that are located at a strided bit position with respect to the first bit of the first input vector as consecutive bits in a destination vector.
    Type: Grant
    Filed: February 28, 2017
    Date of Patent: January 5, 2021
    Assignee: INTEL CORPORATION
    Inventors: Mikhail Plotnikov, Igor Ermolaev
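    Illustration (not part of the patent record): the strided bit access above can be sketched on a Python integer used as a bit vector. The start position stands in for the element of the second input vector; how the stride and the number of gathered bits are supplied is an assumption, and the function name is hypothetical.

```python
def strided_bit_gather(bits, start, stride, count):
    """Pack bits[start], bits[start + stride], ... into consecutive low bits.

    bits is an integer treated as a bit vector with bit 0 as the least
    significant bit. count bits are gathered in total.
    """
    out = 0
    for i in range(count):
        bit = (bits >> (start + i * stride)) & 1
        out |= bit << i
    return out


if __name__ == "__main__":
    # Bits 2, 4 and 6 of 0b10110100 are 1, 1 and 0.
    print(bin(strided_bit_gather(0b10110100, start=2, stride=2, count=3)))
```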
  • Patent number: 10831861
    Abstract: Aspects for vector operations in a neural network are described herein. The aspects may include a vector caching unit configured to store a first vector and a second vector. The first vector may include one or more first elements and the second vector may include one or more second elements. The aspects may further include a computation module configured to calculate a cross product between the first vector and the second vector in response to an instruction.
    Type: Grant
    Filed: October 26, 2018
    Date of Patent: November 10, 2020
    Assignee: CAMBRICON TECHNOLOGIES CORPORATION LIMITED
    Inventors: Tao Luo, Tian Zhi, Shaoli Liu, Tianshi Chen, Yunji Chen
  • Patent number: 10783605
    Abstract: Aspects include a multistage collector to receive outputs from plural processing elements. Processing elements may comprise (each or collectively) a plurality of clusters, with one or more ALUs that may perform SIMD operations on a data vector and produce outputs according to the instruction stream being used to configure the ALU(s). The multistage collector includes substituent components each with at least one input queue, a memory, a packing unit, and an output queue; these components can be sized to process groups of input elements of a given size, and can have multiple input queues and a single output queue. Some components couple to receive outputs from the ALUs and others receive outputs from other components. Ultimately, the multistage collector can output groupings of input elements. Each grouping of elements (e.g., at input queues, or stored in the memories of component) can be formed based on matching of index elements.
    Type: Grant
    Filed: February 4, 2019
    Date of Patent: September 22, 2020
    Assignee: Imagination Technologies Limited
    Inventors: James Alexander McCombe, Steven John Clohset, Jason Rupert Redgrave, Luke Tilman Peterson
  • Patent number: 10771981
    Abstract: A method for cellular network operation includes establishing communications in a cellular network between a given user equipment (UE) and a base station, comprising a virtualization platform and radio transceiver points (R-TP). A central coordinator selects a downlink coordinated transmission mode involving a first and a second R-TP which receive messages, over a control interface between the virtualization platform and each R-TP, including specifications of time-frequency resources to be used for user data and reference signals transmission. The radio processing functions in the communications with the given UE are performed while executing the selected coordinated mode by both selected R-TPs in accordance with the time-frequency resource specifications.
    Type: Grant
    Filed: September 5, 2016
    Date of Patent: September 8, 2020
    Inventor: Mariana Goldhamer
  • Patent number: 10728242
    Abstract: The present invention relates generally to the use of biometric technology for authentication and identification, and more particularly to non-contact based solutions for authenticating and identifying users, via computers, such as mobile devices, to selectively permit or deny access to various resources. In the present invention authentication and/or identification is performed using an image or a set of images of an individual's palm through a process involving the following key steps: (1) detecting the palm area using local classifiers; (2) extracting features from the region(s) of interest; and (3) computing the matching score against user models stored in a database, which can be augmented dynamically through a learning process.
    Type: Grant
    Filed: October 5, 2018
    Date of Patent: July 28, 2020
    Assignee: Element Inc.
    Inventors: Yann LeCun, Adam Perold, Yang Wang, Sagar Waghmare
  • Patent number: 10706007
    Abstract: A vector reduction circuit configured to reduce an input vector of elements comprises a plurality of cells, wherein each of the plurality of cells other than a designated first cell that receives a designated first element of the input vector is configured to receive a particular element of the input vector, receive, from another of the one or more cells, a temporary reduction element, perform a reduction operation using the particular element and the temporary reduction element, and provide, as a new temporary reduction element, a result of performing the reduction operation using the particular element and the temporary reduction element. The vector reduction circuit also comprises an output circuit configured to provide, for output as a reduction of the input vector, a new temporary reduction element corresponding to a result of performing the reduction operation using a last element of the input vector.
    Type: Grant
    Filed: September 12, 2018
    Date of Patent: July 7, 2020
    Assignee: Google LLC
    Inventors: Gregory Michael Thorson, Andrew Everett Phelps, Olivier Temam
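    Illustration (not part of the patent record): the chain of cells above behaves like a running reduction in which each cell combines its own element with the temporary value handed on by the previous cell. A minimal sketch follows, assuming the first cell simply forwards its element and that the input vector is non-empty; the function name is hypothetical.

```python
from operator import add


def chain_reduce(vec, op=add):
    """Reduce vec by passing a temporary reduction element from cell to cell."""
    temp = vec[0]  # designated first cell: forwards its own element
    for element in vec[1:]:
        # Each later cell combines its element with the incoming temporary
        # value and passes the result on as the new temporary value.
        temp = op(element, temp)
    return temp  # the output circuit reports the last temporary value


if __name__ == "__main__":
    print(chain_reduce([3, 1, 4, 1, 5]))
    print(chain_reduce([3, 1, 4, 1, 5], max))
```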
  • Patent number: 10353706
    Abstract: One embodiment provides for a graphics processing unit to accelerate machine-learning operations, the graphics processing unit comprising a multiprocessor having a single instruction, multiple thread (SIMT) architecture, the multiprocessor to execute at least one single instruction; and a first compute unit included within the multiprocessor, the at least one single instruction to cause the first compute unit to perform a two-dimensional matrix multiply and accumulate operation, wherein to perform the two-dimensional matrix multiply and accumulate operation includes to compute a 32-bit intermediate product of 16-bit operands and to compute a 32-bit sum based on the 32-bit intermediate product.
    Type: Grant
    Filed: November 21, 2017
    Date of Patent: July 16, 2019
    Assignee: Intel Corporation
    Inventors: Himanshu Kaul, Mark A. Anders, Sanu K. Mathew, Anbang Yao, Joydeep Ray, Ping T. Tang, Michael S. Strickland, Xiaoming Chen, Tatiana Shpeisman, Abhishek R. Appu, Altug Koker, Kamal Sinha, Balaji Vembu, Nicolas C. Galoppo Von Borries, Eriko Nurvitadhi, Rajkishore Barik, Tsung-Han Lin, Vasanth Ranganathan, Sanjeev Jahagirdar
  • Patent number: 10341190
    Abstract: Centrality measure ranking for a multiplex network is provided by a method that includes obtaining a representation of a multiplex network including layers and nodes representing communicating entities. The method determines a node centrality measure for each node of the nodes. This includes determining intra-layer and inter-layer centrality measures. The method determines a respective centrality measure for each communicating entity as a function of node centrality measures for nodes representing the communicating entity across the layers of the multiplex network. The method also ranks the communicating entities by their centrality measures.
    Type: Grant
    Filed: November 14, 2017
    Date of Patent: July 2, 2019
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Krishnasuri Narayanam, Ramasuri Narayanam, Mukundan Sundararajan
  • Patent number: 10318353
    Abstract: An architecture for load-balanced groups of multi-stage manycore processors shared dynamically among a set of software applications, with capabilities for destination task defined intra-application prioritization of inter-task communications (ITC), for architecture-based ITC performance isolation between the applications, as well as for prioritizing application task instances for execution on cores of manycore processors based at least in part on which of the task instances have available for them the input data, such as ITC data, that they need for executing.
    Type: Grant
    Filed: September 16, 2016
    Date of Patent: June 11, 2019
    Inventor: Mark Henrik Sandstrom
  • Patent number: 10303473
    Abstract: A vector permutation circuit and a vector processor are provided. The vector permutation circuit includes a grouping unit, m selection units connected to the grouping unit, j switching units connected to the m selection units, and a control unit connected to each selection unit and each switching unit, where each switching unit is connected to m/j selection units; the grouping unit divides to-be-permutated vector data into n vector data groups and outputs the n vector data groups to the m selection units; under control of the control unit, each selection unit selects a second vector data group from an input first vector data group, and outputs the second vector data group to a switching unit connected to the selection unit; under control of the control unit, each switching unit switches and outputs elements in the input second vector data group.
    Type: Grant
    Filed: September 29, 2016
    Date of Patent: May 28, 2019
    Assignee: HUAWEI TECHNOLOGIES CO., LTD
    Inventors: Yunbi Chen, Kai Hu
  • Patent number: 10261789
    Abstract: A data processing apparatus and a method of controlling performance of speculative vector operations are provided. The apparatus comprises processing circuitry for performing a sequence of speculative vector operations on vector operands, each vector operand comprising a plurality of vector elements, and speculation control circuitry for maintaining a speculation width indication indicating the number of vector elements of each vector operand to be subjected to the speculative vector operations. The speculation width indication is set to an initial value prior to performance of the sequence of speculative vector operations. The processing circuitry generates progress indications during performance of the sequence of speculative vector operations, and the speculation control circuitry detects, with reference to the progress indications and speculation reduction criteria, presence of a speculation reduction condition.
    Type: Grant
    Filed: August 18, 2014
    Date of Patent: April 16, 2019
    Assignee: ARM Limited
    Inventors: Alastair David Reid, Daniel Kershaw
  • Patent number: 10228941
    Abstract: A processor of an aspect includes a set of registers capable of storing packed data. An execution unit is coupled with the set of registers. The execution unit is to access the set of registers in at least two different ways in response to instructions. The at least two different ways include a first way in which the set of registers are to represent a plurality of N-bit registers. The at least two different ways also include a second way in which the set of registers are to represent a single register of at least 2N-bits. In one aspect, the at least 2N-bits is to be at least 256-bits.
    Type: Grant
    Filed: June 28, 2013
    Date of Patent: March 12, 2019
    Assignee: Intel Corporation
    Inventors: Bret L. Toll, Ronak Singhal, Buford M. Guy, Mishali Naik
  • Patent number: 10135815
    Abstract: The present invention relates generally to the use of biometric technology for authentication and identification, and more particularly to non-contact based solutions for authenticating and identifying users, via computers, such as mobile devices, to selectively permit or deny access to various resources. In the present invention authentication and/or identification is performed using an image or a set of images of an individual's palm through a process involving the following key steps: (1) detecting the palm area using local classifiers; (2) extracting features from the region(s) of interest; and (3) computing the matching score against user models stored in a database, which can be augmented dynamically through a learning process.
    Type: Grant
    Filed: August 1, 2014
    Date of Patent: November 20, 2018
    Assignee: ELEMENT, INC.
    Inventors: Yann LeCun, Adam Perold, Yang Wang, Sagar Waghmare
  • Patent number: 10085302
    Abstract: An access node for a telecommunications network is partitioned into a front end unit and a back end unit coupled by an internet protocol (IP) packet based communication link to provide for data and control packets to be sent between the back end unit and the front end unit. The front end unit performs physical layer and media access layer (MAC) sublayer processing for data for transmission to/from user equipment in the network using baseband processing units that perform highly parallel floating/fixed point operations. The back end unit includes a plurality of general purpose processors to provide data link layer and network layer processing. Back end portions may be pooled to provide greater efficiency.
    Type: Grant
    Filed: September 15, 2017
    Date of Patent: September 25, 2018
    Assignees: AT&T Mobility II LLC, AT&T Intellectual Property I, L.P.
    Inventors: Dimas R. Noriega, Arthur R. Brisebois, Giuseppe De Rosa, Henry J. Fowler, Jr.
  • Patent number: 9958917
    Abstract: Disclosed is a resettable memory device including a memory unit, a reset status indicator circuit, a logic sampling circuit, and a multiplexer for performing a reset function. The memory unit includes cells for storing states of signals in a design under test. The reset status indicator stores states of indicators indicating whether corresponding cells should be reset or not. Responsive to the reset status indicator indicating that the value of the cell should not be reset, the multiplexer receives the value stored in the cell and outputs the retrieved value from the cell. Responsive to the reset status indicator indicating that the value of the cell should be reset, the multiplexer outputs a reset value instead of the value stored in the cell. The reset value may be changed by the logic sampling circuit at different time periods or certain logic conditions, and output through the multiplexer.
    Type: Grant
    Filed: December 2, 2016
    Date of Patent: May 1, 2018
    Assignee: Synopsys, Inc.
    Inventors: Ngai Ngai William Hung, Dhiraj Goswami
  • Patent number: 9916130
    Abstract: An apparatus comprises processing circuitry for performing, in response to a vector instruction, a plurality of lanes of processing on respective data elements of at least one operand vector to generate corresponding result data elements of a result vector. The processing circuitry may support performing at least two of the lanes of processing with different rounding modes for generating rounded values for the corresponding result data elements of the result vector. This allows two or more calculations with different rounding modes to be executed in response to a single instruction, to improve performance.
    Type: Grant
    Filed: December 9, 2014
    Date of Patent: March 13, 2018
    Assignee: ARM Limited
    Inventors: David Raymond Lutz, Neil Burgess
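    Illustration (not part of the patent record): per-lane rounding modes can be pictured as one operation applied across a vector with a different rounding rule in each lane. The sketch below rounds to integers purely for readability, which is an assumption; the abstract does not say what the values are rounded to. The mode names and the function name are hypothetical.

```python
import math

# Rounding rule per mode name. Python's built-in round() rounds halves to even.
ROUNDERS = {
    "down": math.floor,
    "up": math.ceil,
    "zero": math.trunc,
    "nearest_even": round,
}


def round_lanes(values, modes):
    """Round each lane's value using that lane's own rounding mode."""
    return [ROUNDERS[mode](value) for value, mode in zip(values, modes)]


if __name__ == "__main__":
    print(round_lanes([2.5, 2.5, -2.5, 2.5],
                      ["down", "up", "zero", "nearest_even"]))
```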
  • Patent number: 9891914
    Abstract: An apparatus and method for performing an efficient scatter operation. For example, one embodiment of a processor comprises: an allocator unit to receive a scatter operation comprising a number of data elements and responsively allocate resources to execute the scatter operation; a memory execution cluster comprising at least a portion of the resources to execute the scatter operation, the resources including one or more store data buffers and one or more store address buffers; and a senior store pipeline to transfer store data elements from the store data buffers to system memory using addresses from the store address buffers prior to retirement of the scatter operation.
    Type: Grant
    Filed: April 10, 2015
    Date of Patent: February 13, 2018
    Assignee: Intel Corporation
    Inventors: Ramon Matas, Alexey P. Suprun, Roger Gramunt, Chung-Lun Chan, Rammohan Padmanabhan
  • Patent number: 9886459
    Abstract: Methods and apparatuses for determining set-membership using Single Instruction Multiple Data (“SIMD”) architecture are presented herein. Specifically, methods and apparatuses are discussed for determining, in parallel, whether multiple values in a first set of values are members of a second set of values. Many of the methods and systems discussed herein are applied to determining whether one or more rows in a dictionary-encoded column of a database table satisfy one or more conditions based on the dictionary-encoded column. However, the methods and systems discussed herein may apply to many applications executed on a SIMD processor using set-membership tests.
    Type: Grant
    Filed: July 22, 2014
    Date of Patent: February 6, 2018
    Assignee: Oracle International Corporation
    Inventors: Shasank K. Chavan, Phumpong Watanaprakornkul
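    Illustration (not part of the patent record): a set-membership test over dictionary-encoded column values can be sketched with a bitmap indexed by dictionary code. The bitmap is a swapped-in technique chosen for brevity, not the method claimed in the patent, and the list comprehension stands in for what a SIMD unit would do across lanes in parallel. All names are hypothetical.

```python
def build_membership_bitmap(codes_in_set):
    """Set bit c for every dictionary code c that belongs to the set."""
    bitmap = 0
    for code in codes_in_set:
        bitmap |= 1 << code
    return bitmap


def members(column_codes, bitmap):
    """Test every row's dictionary code against the bitmap."""
    return [bool((bitmap >> code) & 1) for code in column_codes]


if __name__ == "__main__":
    bitmap = build_membership_bitmap({1, 3})
    print(members([0, 1, 3, 2, 1], bitmap))
```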
  • Patent number: 9887714
    Abstract: A remote radio head configured to provide a radio interface for a network node is provided. The remote radio head comprises an antenna, an analog interface for connecting with the network node, radio frequency (RF) circuitry configured to convert between intermediate frequency signals of the analog interface and RF signals of the antenna, digital circuitry configured to process transmission and/or reception signals, a first analog to digital converter (ADC) connected to the digital circuitry, and a first digital to analog converter (DAC) connected to the digital circuitry. The first ADC, the digital circuitry, and the first DAC are connected between the antenna and the analog interface for receiving or transmitting radio signals. A corresponding method is also presented.
    Type: Grant
    Filed: July 4, 2014
    Date of Patent: February 6, 2018
    Assignee: TELEFONAKTIEBOLAGET LM ERICSSON (PUBL)
    Inventors: Marko E. Leinonen, Kauko Heinikoski
  • Patent number: 9798684
    Abstract: Methods and systems are described for reading from or writing to a plurality of slave devices connected to a communications bus having a common data line. The slave devices are mapped to a virtual device address and the communication is initiated by the master by signaling a start condition and the virtual device address. Each of the slave devices mapped to the virtual device address identifies a register in that slave device associated with the virtual device address and, in sequence, performs a read or write operation on the bus with regard to its identified register in a respective predetermined time slot within the communication or to a corresponding virtual register address assigned to the slave device previously.
    Type: Grant
    Filed: April 21, 2015
    Date of Patent: October 24, 2017
    Assignee: BLACKBERRY LIMITED
    Inventor: Jens Kristian Poulsen
  • Patent number: 9792118
    Abstract: Vector processing engines (VPEs) employing a tapped-delay line(s) for providing precision filter vector processing operations with reduced sample re-fetching and power consumption are disclosed. Related vector processor systems and methods are also disclosed. The VPEs are configured to provide filter vector processing operations. To minimize re-fetching of input vector data samples from memory to reduce power consumption, a tapped-delay line(s) is included in the data flow paths between a vector data file and execution units in the VPE. The tapped-delay line(s) is configured to receive and provide input vector data sample sets to execution units for performing filter vector processing operations. The tapped-delay line(s) is also configured to shift the input vector data sample set for filter delay taps and provide the shifted input vector data sample set to execution units, so the shifted input vector data sample set does not have to be re-fetched during filter vector processing operations.
    Type: Grant
    Filed: November 15, 2013
    Date of Patent: October 17, 2017
    Assignee: QUALCOMM Incorporated
    Inventors: Raheel Khan, Fahad Ali Mujahid, Afshin Shiravi
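    Illustration (not part of the patent record): the point of a tapped-delay line is that each new sample shifts the previous ones along, so a filter can reuse them without fetching them again. A minimal FIR filter sketch follows, using a bounded deque as the delay line and assuming the line starts filled with zeros; the function name is hypothetical.

```python
from collections import deque


def fir_filter(samples, coeffs):
    """Apply an FIR filter, keeping earlier samples in a tapped-delay line."""
    # One tap per coefficient, newest sample at the left.
    taps = deque([0] * len(coeffs), maxlen=len(coeffs))
    out = []
    for sample in samples:
        # Shifting in a new sample drops the oldest one from the right end,
        # so earlier samples never need to be re-read from the input.
        taps.appendleft(sample)
        out.append(sum(c * t for c, t in zip(coeffs, taps)))
    return out


if __name__ == "__main__":
    print(fir_filter([1, 2, 3, 4], [1, 1]))
```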
  • Patent number: 9769873
    Abstract: An access node for a telecommunications network is partitioned into a front end unit and a back end unit coupled by an internet protocol (IP) packet based communication link to provide for data and control packets to be sent between the back end unit and the front end unit. The front end unit performs physical layer and media access layer (MAC) sublayer processing for data for transmission to/from user equipment in the network using baseband processing units that perform highly parallel floating/fixed point operations. The back end unit includes a plurality of general purpose processors to provide data link layer and network layer processing. Back end portions may be pooled to provide greater efficiency.
    Type: Grant
    Filed: December 29, 2015
    Date of Patent: September 19, 2017
    Assignees: AT&T Intellectual Property I, L.P., AT&T Mobility II LLC
    Inventors: Dimas R. Noriega, Arthur R. Brisebois, Giuseppe De Rosa, Henry J. Fowler, Jr.
  • Patent number: 9753765
    Abstract: An integrated circuit unit and method for synchronizing processing threads running on respective processors are provided. The unit includes an interrupt request controller which is programmable to provide a first desired number of synchronization objects and a second desired number of interrupt request signals for supply to such processors. The controller is operable to direct interrupt request signals to a chosen processor in dependence upon data received from the processors.
    Type: Grant
    Filed: March 22, 2004
    Date of Patent: September 5, 2017
    Assignee: Altera Corporation
    Inventor: Robert Jackson
  • Patent number: 9727526
    Abstract: A reconfigurable vector processor is described that allows the size of its vector units to be changed in order to process vectors of different sizes. The reconfigurable vector processor comprises a plurality of processor units. Each of the processor units comprises a control unit for decoding instructions and generating control signals, a scalar unit for processing instructions on scalar data, and a vector unit for processing instructions on vector data under control of control signals. The reconfigurable vector processor architecture also comprises a vector control selector for selectively providing control signals generated by one processor unit of the plurality of processor units to the vector unit of a different processor unit of the plurality of processor units.
    Type: Grant
    Filed: January 25, 2011
    Date of Patent: August 8, 2017
    Assignee: NXP USA, Inc.
    Inventors: Malcolm Stewart, Ali Osman Ors, Daniel Laroche
  • Patent number: 9665360
    Abstract: A computer implemented method for updating configuration data in at least one automated banking machine is configured to execute configuration update steps embodied with a computer readable medium. The method includes identifying one or more sub-systems implemented within the automated banking machine, receiving an update to configuration data for at least one of the identified sub-systems, generating a restore point based on a current implementation of the sub-systems for the automated banking machine, and installing the configuration data in the automated banking machine. The identified sub-systems can include at least two of roll storage modules, a note handling module controller, a note detector module, and an interface controller.
    Type: Grant
    Filed: July 24, 2012
    Date of Patent: May 30, 2017
    Assignee: Glory Global Solutions (International) Limited
    Inventors: Dominik Cipa, Gunnar Kunz, Ulrich Marti, Olivier Martin
  • Patent number: 9639354
    Abstract: A method of an aspect includes receiving an instruction indicating a destination storage location. A result is stored in the destination storage location in response to the instruction. The result includes a sequence of at least four non-negative integers. In an aspect, values of the at least four non-negative integers are not calculated using a result of a preceding instruction. Other methods, apparatus, systems, and instructions are disclosed.
    Type: Grant
    Filed: December 22, 2011
    Date of Patent: May 2, 2017
    Assignee: Intel Corporation
    Inventors: Seth Abraham, Robert Valentine, Elmoustapha Ould-Ahmed-Vall, Zeev Sperber, Amit Gradstein
  • Patent number: 9632781
    Abstract: Techniques are provided for executing a vector alignment instruction. A scalar register file in a first processor is configured to share one or more register values with a second processor, the one or more register values accessed from the scalar register file according to an Rt address specified in a vector alignment instruction, wherein a start location is determined from one of the shared register values. An alignment circuit in the second processor is configured to align data identified between the start location within a beginning Vu register of a vector register file (VRF) and an end location of a last Vu register of the VRF according to the vector alignment instruction. A store circuit is configured to select the aligned data from the alignment circuit and store the aligned data in the vector register file according to an alignment store address specified by the vector alignment instruction.
    Type: Grant
    Filed: February 26, 2013
    Date of Patent: April 25, 2017
    Assignee: QUALCOMM Incorporated
    Inventors: Ajay A. Ingle, Marc M. Hoffman, Jose Fridman, Lucian Codrescu
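    Illustration (not from the patent): a rough sketch of the alignment step described in the abstract above, assuming the aligned data spans exactly two adjacent Vu registers. The function `vector_align`, the list-of-lists register file, and the wrap-around of the start location are all assumptions made for this example.

```python
def vector_align(vrf, scalar_regs, rt, vu, vd):
    """Align data starting at scalar_regs[rt] within vrf[vu], spilling into vrf[vu + 1].

    The aligned result is stored in vrf[vd] and also returned.
    """
    width = len(vrf[vu])
    # Assumption: the start location wraps to the register width.
    start = scalar_regs[rt] % width
    # Treat the two source registers as one contiguous window of elements.
    window = vrf[vu] + vrf[vu + 1]
    # Select `width` elements beginning at the start location.
    vrf[vd] = window[start:start + width]
    return vrf[vd]


vrf = [[0, 1, 2, 3], [4, 5, 6, 7], [0, 0, 0, 0]]
scalar_regs = [0, 0, 3]  # the scalar register at index 2 holds the start location
print(vector_align(vrf, scalar_regs, rt=2, vu=0, vd=2))
```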
  • Patent number: 9477475
    Abstract: According to embodiments disclosed herein, there is disclosed a computer processor architecture; and in particular a computer processor, a method of operating the same, and a computer program product that makes use of an instruction set for the computer. In one embodiment, the computer processor includes: (1) a decode unit for decoding instruction packets fetched from a memory holding the instruction packets, (2) a control processing channel capable of performing control operations and (3) a data processing channel capable of performing data processing operations, wherein, in use the decode unit causes instructions of instruction packets comprising a plurality of only control instructions to be executed sequentially on the control processing channel, and wherein, in use the decode unit causes instructions of instruction packets comprising a plurality of instructions comprising at least one data processing instruction to be executed simultaneously on the data processing channel.
    Type: Grant
    Filed: April 30, 2015
    Date of Patent: October 25, 2016
    Assignee: Nvidia Technology UK Limited
    Inventor: Simon Knowles
  • Patent number: 9448847
    Abstract: An architecture for load-balanced groups of multi-stage manycore processors shared dynamically among a set of software applications, with capabilities for destination task defined intra-application prioritization of inter-task communications (ITC), for architecture-based ITC performance isolation between the applications, as well as for prioritizing application task instances for execution on cores of manycore processors based at least in part on which of the task instances have available for them the input data, such as ITC data, that they need for executing.
    Type: Grant
    Filed: June 27, 2014
    Date of Patent: September 20, 2016
    Assignee: THROUGHPUTER, INC.
    Inventor: Mark Henrik Sandstrom
  • Patent number: 9405546
    Abstract: An apparatus and method for non-blocking execution of a static scheduled processor, the apparatus including a processor to process at least one operation using transferred input data, and an input buffer used to transfer the input data to the processor, and store a result of processing the at least one operation, wherein the processor may include at least one functional unit (FU) to execute the at least one operation, and the at least one FU may process the transferred input data using at least one of a regular latency operation and an irregular latency operation.
    Type: Grant
    Filed: March 6, 2014
    Date of Patent: August 2, 2016
    Assignee: Samsung Electronics Co., Ltd.
    Inventors: Kwon Taek Kwon, Sang Oak Woo, Shi Hwa Lee, Seok Yoon Jung
  • Patent number: 9319254
    Abstract: The present method and system enables receiving a radio frequency (RF) signal. The received RF signal is assigned to a single instruction multiple data (SIMD) module in an accelerated processing device (APD) for processing to extract network layer messages. The extracted network layer messages are further processed by the SIMD module to obtain data transmitted via the RF signal.
    Type: Grant
    Filed: August 3, 2012
    Date of Patent: April 19, 2016
    Assignee: ATI Technologies ULC
    Inventor: Moiz Haq
  • Patent number: 9038073
    Abstract: Efficient data processing apparatus and methods include hardware components which are pre-programmed by software. Each hardware component triggers the other to complete its tasks. After the final pre-programmed hardware task is complete, the hardware component issues a software interrupt.
    Type: Grant
    Filed: August 13, 2009
    Date of Patent: May 19, 2015
    Assignee: QUALCOMM Incorporated
    Inventors: Mathias Kohlenz, Irfan Anwar Khan, Sathyanarayan Madhusudan, Shailesh Maheshwari, Srividhya Krishnamoorthy, Sandeep Urgaonkar, Thomas Klingenbrunn, Tim Tynghuei Liou, Idreas Mir
  • Patent number: 9009528
    Abstract: The described embodiments include a processor that handles faults. The processor first receives an input vector, a control vector, and a predicate vector, each vector comprising a plurality of elements. Then, for a first element of the input vector for which corresponding elements of the control vector and the predicate vector are active, the processor performs a scalar read operation using an address from the element of the input vector. When a fault condition is encountered while performing the read operation, the processor determines if the element is a first element where a corresponding element of the control vector is active. If so (i.e., if the element is a first element where a corresponding element of the control vector is active), the processor processes the fault. Otherwise, the processor masks the fault for the element.
    Type: Grant
    Filed: September 5, 2012
    Date of Patent: April 14, 2015
    Assignee: Apple Inc.
    Inventor: Jeffry E. Gonion
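    Illustration (not from the patent): a simplified model of the fault rule in the abstract above. A missing key in the `memory` dictionary stands in for a fault condition; `ReadFault`, `vector_read`, and the choice to keep processing later elements after a masked fault are assumptions of this sketch, not details stated in the abstract.

```python
class ReadFault(Exception):
    """Raised when a fault is processed rather than masked."""


def vector_read(addresses, control, predicate, memory):
    """Read memory[addr] per element; return (results, masked_fault_flags)."""
    results = [None] * len(addresses)
    masked = [False] * len(addresses)
    # Index of the first element whose control-vector element is active.
    first_active = next((i for i, c in enumerate(control) if c), None)
    for i, addr in enumerate(addresses):
        if not (control[i] and predicate[i]):
            continue  # inactive element: no read is attempted
        if addr in memory:
            results[i] = memory[addr]
        elif i == first_active:
            # Fault on the first control-active element: process the fault.
            raise ReadFault(f"fault at element {i}, address {addr:#x}")
        else:
            # Fault on a later element: mask it instead of raising.
            masked[i] = True
    return results, masked


memory = {0x10: 1, 0x20: 2}
print(vector_read([0x10, 0x99, 0x20], [True] * 3, [True] * 3, memory))
```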
  • Publication number: 20150052330
    Abstract: In a particular embodiment, a method includes executing a vector instruction at a processor. The vector instruction includes a vector input that includes a plurality of elements. Executing the vector instruction includes providing a first element of the plurality of elements as a first output. Executing the vector instruction further includes performing an arithmetic operation on the first element and a second element of the plurality of elements to provide a second output. Executing the vector instruction further includes storing the first output and the second output in an output vector.
    Type: Application
    Filed: August 14, 2013
    Publication date: February 19, 2015
    Applicant: QUALCOMM Incorporated
    Inventors: Ajay Anant Ingle, Marc Murray Hoffman, Deepak Mathew, Mao Zeng
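    Illustration (not from the application): a tiny sketch of the data flow in the abstract above, assuming addition as the arithmetic operation. The name `first_and_combined` and the default operator are hypothetical.

```python
import operator


def first_and_combined(vector_input, op=operator.add):
    """Return an output vector [first element, op(first element, second element)]."""
    first_output = vector_input[0]  # passed through unchanged
    second_output = op(vector_input[0], vector_input[1])  # arithmetic on elements 0 and 1
    return [first_output, second_output]


print(first_and_combined([3, 4]))
```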
  • Patent number: 8938642
    Abstract: The described embodiments include a processor with a fault status register (FSR) that executes a Confirm instruction. In these embodiments, when executing the Confirm instruction, the processor receives a predicate vector that includes N elements. For a first set of bit positions in the FSR for which corresponding elements of the predicate vector are active, the processor determines if at least one of the first set of bit positions in the FSR holds a predetermined value. When at least one of the first set of bit positions in the FSR holds the predetermined value, the processor causes a fault in the processor.
    Type: Grant
    Filed: May 23, 2012
    Date of Patent: January 20, 2015
    Assignee: Apple Inc.
    Inventor: Jeffry E. Gonion
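    Illustration (not from the patent): a minimal model of the Confirm check in the abstract above, assuming one FSR bit per predicate element and a predetermined value of 1. `ProcessorFault`, `confirm`, and `fault_value` are names invented for this sketch.

```python
class ProcessorFault(Exception):
    """Stands in for the fault caused by the Confirm instruction."""


def confirm(fsr_bits, predicate, fault_value=1):
    """Raise ProcessorFault if any predicate-active FSR bit holds fault_value."""
    # Only bit positions whose predicate element is active are examined.
    active_positions = [i for i, p in enumerate(predicate) if p]
    if any(fsr_bits[i] == fault_value for i in active_positions):
        raise ProcessorFault("active FSR bit holds the predetermined value")


# Bit 1 is set, but its predicate element is inactive, so no fault is caused.
confirm([0, 1, 0], [True, False, True])
print("no fault")
```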
  • Patent number: 8935468
    Abstract: A microprocessor includes a memory interface to obtain data envelopes of a first length, and control logic to implement an instruction to load an initial data envelope of a stream of data values into a buffer, each data value having a second length shorter than the first length, the stream of data values being disposed across successive data envelopes at the memory interface. Another instruction merges current contents of the buffer and the memory interface such that each invocation loads one of the data values into a first register, and moves at least a remainder of the current contents of the memory interface into the buffer for use in a successive invocation. Another instruction loads a reversed representation of a set of data values obtained via the memory interface into a second register. Another instruction implements an FIR computation including a SIMD operation involving multiple data values of the stream and the reversed representation.
    Type: Grant
    Filed: December 31, 2012
    Date of Patent: January 13, 2015
    Assignee: Cadence Design Systems, Inc.
    Inventors: Dror E. Maydan, William A. Huffman, Sachin Ghanekar, Fei Sun
  • Patent number: 8918553
    Abstract: A mechanism for programming a direct memory access engine operating as a multithreaded processor is provided. A plurality of programs is received from a host processor in a local memory associated with the direct memory access engine. A request is received in the direct memory access engine from the host processor indicating that the plurality of programs located in the local memory is to be executed. The direct memory access engine executes two or more of the plurality of programs without intervention by a host processor. As each of the two or more of the plurality of programs completes execution, the direct memory access engine sends a completion notification to the host processor that indicates that the program has completed execution.
    Type: Grant
    Filed: June 5, 2012
    Date of Patent: December 23, 2014
    Assignee: International Business Machines Corporation
    Inventors: Brian K. Flachs, Harm P. Hofstee, Charles R. Johns, Matthew E. King, John S. Liberty, Brad W. Michael