Patents Examined by Aimee Li
  • Patent number: 10776118
    Abstract: A computing system comprising a central processing unit (CPU), a memory processor and a memory device comprising a data array and an index array. The computing system is configured to store data lines comprising data elements in the data array and to store index lines comprising a plurality of memory indices in the index array. The memory indices indicate memory positions of data elements in the data array with respect to a start address of the data array. There is further provided a related computer implemented method and a related computer program product.
    Type: Grant
    Filed: September 9, 2016
    Date of Patent: September 15, 2020
    Assignee: International Business Machines Corporation
    Inventors: Heiner Giefers, Raphael Polig, Jan Van Lunteren
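
The indexing scheme in the abstract above (memory indices that give positions relative to the start address of the data array) can be illustrated with a small gather routine. This is only a sketch under assumptions: memory is modelled as a flat Python list, and the names `gather`, `memory`, `index_line`, and `start_address` are hypothetical and do not come from the patent.

```python
from typing import List, Sequence


def gather(memory: Sequence[int], index_line: Sequence[int], start_address: int) -> List[int]:
    """Fetch data elements whose positions are given relative to start_address.

    memory        -- flat model of the memory device (assumption)
    index_line    -- one line of memory indices from the index array
    start_address -- where the data array begins inside memory
    """
    return [memory[start_address + index] for index in index_line]


# The data array occupies memory[4:8]; indices are relative to address 4.
memory = [0, 0, 0, 0, 10, 20, 30, 40]
print(gather(memory, [3, 0, 2], start_address=4))
```

Because the indices are relative, the same index line stays valid if the data array is relocated; only `start_address` changes.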
  • Patent number: 10768934
    Abstract: A data processing system supports a predicated-loop instruction that controls vectorised execution of a program loop body in respect of a plurality of vector elements. When the number of elements to be processed is not a whole number multiple of the number of lanes of processing supported for that element size, then the predicated-loop instruction controls suppression of processing in one or more lanes not required.
    Type: Grant
    Filed: March 21, 2017
    Date of Patent: September 8, 2020
    Assignee: ARM Limited
    Inventor: Thomas Christopher Grocutt
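
The lane suppression described in the abstract above can be pictured as a per-iteration mask. The sketch below is a software model only, not the patented instruction; the function names `lane_masks` and `predicated_map` and the list-of-booleans mask representation are assumptions made for illustration.

```python
from typing import Callable, List, Sequence


def lane_masks(element_count: int, lanes: int) -> List[List[bool]]:
    """Return one mask per vector iteration; False marks a suppressed lane."""
    masks = []
    remaining = element_count
    while remaining > 0:
        active = min(lanes, remaining)
        masks.append([lane < active for lane in range(lanes)])
        remaining -= lanes
    return masks


def predicated_map(values: Sequence[int], lanes: int, op: Callable[[int], int]) -> List[int]:
    """Apply op to every element, processing `lanes` elements per iteration."""
    out = []
    for iteration, mask in enumerate(lane_masks(len(values), lanes)):
        base = iteration * lanes
        for lane, enabled in enumerate(mask):
            if enabled:  # suppressed lanes neither read nor write
                out.append(op(values[base + lane]))
    return out


# 10 elements on 4 lanes: the third iteration uses only 2 lanes.
print(lane_masks(10, 4)[-1])
print(predicated_map(list(range(10)), 4, lambda x: x * 2))
```

With 10 elements and 4 lanes, the first two iterations are fully active and only the last one is partially masked, so no scalar clean-up loop is needed.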
  • Patent number: 10747712
    Abstract: An integrated circuit includes a plurality of tiles. Each tile includes a processor, a switch including switching circuitry to forward data over data paths from other tiles to the processor and to switches of other tiles, and a switch memory that stores instruction streams that are able to operate independently for respective output ports of the switch.
    Type: Grant
    Filed: June 2, 2014
    Date of Patent: August 18, 2020
    Assignee: Massachusetts Institute of Technology
    Inventor: Anant Agarwal
  • Patent number: 10747537
    Abstract: A set machine instruction is provided that has associated therewith a result location to be used with a set operation. The set machine instruction is executed, which includes checking contents of a selected field, and determining, based on the checking, whether the contents of the selected field indicate a first condition, a second condition or a third condition represented in one data type. The result location is set to a value based on the determining, wherein the value, based on the setting, is of a data type different from the one data type and represents a result of a previously executed instruction. The result of the previously executed instruction is one of the first condition, the second condition or the third condition.
    Type: Grant
    Filed: September 29, 2015
    Date of Patent: August 18, 2020
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Michael K. Gschwind, Brett Olsson
  • Patent number: 10747535
    Abstract: Systems, apparatuses, and methods for processing load instructions are disclosed. A processor includes at least a data cache and a load queue for storing load instructions. The load queue includes poison indicators for load instructions waiting to reach non-speculative status. When a non-cacheable load instruction is speculatively executed, then the poison bit is automatically set for the load instruction. If a cacheable load instruction is speculatively executed, then the processor waits until detecting a first condition before setting the poison bit for the load instruction. The first condition may be detecting a cache line with data for the load instruction being evicted from the cache. If an ordering event occurs for a load instruction with a set poison bit, then the load instruction may be flushed and replayed. An ordering event may be a data barrier or a hazard on an older load targeting the same address as the load.
    Type: Grant
    Filed: July 11, 2016
    Date of Patent: August 18, 2020
    Assignee: Apple Inc.
    Inventors: Mahesh K. Reddy, Matthew C. Stone
  • Patent number: 10740098
    Abstract: A method, computer program product, and computer system for providing a comparison result vector of a predefined number of elements w resulting from comparison of multiple vectors of compressed data within a processor comprising registers of same size m is provided. Vector elements of the comparison result vector are stored in a register of the registers. Zero bits are padded between vector elements of each of the comparison result vectors. A compare bit result vector indicative of the vector elements is generated for accessing the results of the comparison in the comparison result vector.
    Type: Grant
    Filed: February 6, 2018
    Date of Patent: August 11, 2020
    Assignee: International Business Machines Corporation
    Inventors: Cedric Lichtenau, Silvia M. Mueller, Jens P. Seifert, Jörg-Stephan Vogt, Markus Lachenmayr, L'Emir Salim Chehab, Pavankrishna Ellore Ramesh, Sourabh Chougule
  • Patent number: 10740107
    Abstract: Operation of a multi-slice processor that includes a plurality of execution slices and an instruction sequencing unit. Operation of such a multi-slice processor includes: receiving, at the instruction sequencing unit, a load instruction indicating load address data and a load data length; determining a previous store instruction in an issue queue such that store address data for the previous store instruction corresponds to the load address data, wherein the previous store instruction corresponds to a store data length; and generating, in dependence upon the store data length matching the load data length, an indication in the issue queue that indicates a dependency between the load instruction and the previous store instruction.
    Type: Grant
    Filed: June 1, 2016
    Date of Patent: August 11, 2020
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Salma Ayub, Joshua W. Bowman, Jeffrey C. Brownscheidle, Kurt A. Feiste, Dung Q. Nguyen, Salim A. Shah, Brian W. Thompto
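
A rough software model of the dependency check in the abstract above: find the previous store whose address matches the load, and record a dependency only when the data lengths also match. The `StoreEntry` type, the `tag` field, the youngest-first scan, and the choice to return `None` on a length mismatch are all assumptions; the abstract does not say what happens in the mismatch case.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class StoreEntry:
    tag: int       # hypothetical identifier of the store in the issue queue
    address: int   # store address data
    length: int    # store data length in bytes


def find_dependency(issue_queue: List[StoreEntry], load_address: int, load_length: int) -> Optional[int]:
    """Return the tag of the store the load depends on, or None.

    The queue is assumed to be ordered oldest to youngest, so it is
    scanned in reverse to find the most recent store to the address.
    """
    for store in reversed(issue_queue):
        if store.address == load_address:
            # Dependency is flagged only when the lengths match.
            return store.tag if store.length == load_length else None
    return None


queue = [StoreEntry(1, 0x100, 8), StoreEntry(2, 0x200, 4), StoreEntry(3, 0x100, 4)]
print(find_dependency(queue, 0x100, 4))
```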
  • Patent number: 10740099
    Abstract: A machine instruction is provided that has associated therewith a result location to be used for a set operation, a first source, a second source, and an operation select field configured to specify a plurality of selectable operations. The machine instruction is executed, which includes obtaining the first source, the second source, and a selected operation, and performing the selected operation on the first source and the second source to obtain a result in one data type. That result is quantized to a value in a different data type, and the value is placed in the result location.
    Type: Grant
    Filed: November 14, 2015
    Date of Patent: August 11, 2020
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Michael K. Gschwind, Brett Olsson
  • Patent number: 10733091
    Abstract: Transactional memory accesses are tracked using read and write sets based on actual program flow. A read and write set is associated with a range of instructions of a transaction. When execution follows a predicted branch, loads and stores are marked as being of selected read and write sets. Then, when a misprediction is processed, and execution is rewound, speculatively added read and write set indications are removed from the read and write sets.
    Type: Grant
    Filed: May 3, 2016
    Date of Patent: August 4, 2020
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Michael K. Gschwind, Valentina Salapura, Chung-Lung K. Shum
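
One way to picture the bookkeeping in the abstract above is a pair of sets plus a log of speculative additions that can be undone when a branch misprediction is processed. This is a minimal sketch, not the patented hardware mechanism; the class `TransactionSets` and its methods `record`, `confirm_branch`, and `rewind` are hypothetical names.

```python
class TransactionSets:
    """Track read and write sets, remembering which entries were added speculatively."""

    def __init__(self):
        self.read_set = set()
        self.write_set = set()
        self._speculative = []  # (kind, address) pairs added under a predicted branch

    def _target(self, kind):
        return self.read_set if kind == "load" else self.write_set

    def record(self, kind, address, speculative):
        """Mark a load or store address; log it only if this access newly added it."""
        target = self._target(kind)
        if address not in target:
            target.add(address)
            if speculative:
                self._speculative.append((kind, address))

    def confirm_branch(self):
        """Prediction was correct: speculative entries become permanent."""
        self._speculative.clear()

    def rewind(self):
        """Misprediction: remove the indications that were added speculatively."""
        for kind, address in self._speculative:
            self._target(kind).discard(address)
        self._speculative.clear()


sets = TransactionSets()
sets.record("load", 0x10, speculative=False)
sets.record("store", 0x20, speculative=True)
sets.rewind()
print(sets.read_set, sets.write_set)
```

Logging an address only when the speculative access is the one that first added it keeps a rewind from removing entries that the actual program flow had already established.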
  • Patent number: 10732976
    Abstract: A processor includes an instruction pipeline. The pipeline can be operated alternatively in a multi-thread mode and in a single-thread mode. In the multi-thread mode, the instruction pipeline processes multiple threads in an interleaved or simultaneous manner. In the single-thread mode, the pipeline processes a single thread. The instruction pipeline comprises multiple functional units, each of which is reserved for one thread among the multiple threads when the pipeline is in the multi-thread mode and reserved for one context layer among multiple context layers when the instruction pipeline is in the single-thread mode.
    Type: Grant
    Filed: January 10, 2013
    Date of Patent: August 4, 2020
    Assignee: NXP USA, Inc.
    Inventors: Alistair Robertson, Jeffrey W. Scott
  • Patent number: 10725900
    Abstract: Transactional memory accesses are tracked using read and write sets based on actual program flow. A read and write set is associated with a range of instructions of a transaction. When execution follows a predicted branch, loads and stores are marked as being of selected read and write sets. Then, when a misprediction is processed, and execution is rewound, speculatively added read and write set indications are removed from the read and write sets.
    Type: Grant
    Filed: June 29, 2016
    Date of Patent: July 28, 2020
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Michael K. Gschwind, Valentina Salapura, Chung-Lung K. Shum
  • Patent number: 10719316
    Abstract: An apparatus is described having instruction execution logic circuitry. The instruction execution logic circuitry has input vector element routing circuitry to perform the following for each of three different instructions: for each of a plurality of output vector element locations, route into an output vector element location an input vector element from one of a plurality of input vector element locations that are available to source the output vector element. The output vector element and each of the input vector element locations are one of three available bit widths for the three different instructions. The apparatus further includes masking layer circuitry coupled to the input vector element routing circuitry to mask a data structure created by the input vector routing element circuitry. The masking layer circuitry is designed to mask at three different levels of granularity that correspond to the three available bit widths.
    Type: Grant
    Filed: November 9, 2017
    Date of Patent: July 21, 2020
    Assignee: INTEL CORPORATION
    Inventors: Elmoustapha Ould-Ahmed-Vall, Robert Valentine, Jesus Corbal, Bret L. Toll, Mark J. Charney, Zeev Sperber, Amit Gradstein
  • Patent number: 10719056
    Abstract: Embodiments herein describe a reservation station (RS) in a processor that merges control data from multiple sources into a merged control data value. Before an instruction issues, the RS gathers and saves control data indicating how the instruction is to be executed. This control data may be saved in control registers. An instruction, however, can update many different types of status control bits in these registers. As such, the RS may store different types of control data for an instruction. Instead of the RS containing multiple registers and data paths for every type of control data, the embodiments herein describe merge logic in the RS that permits control data from different sources to be merged into a single control data value. Once the instruction is issued, the RS passes the merged control data value to an execution unit for processing.
    Type: Grant
    Filed: May 2, 2016
    Date of Patent: July 21, 2020
    Assignee: International Business Machines Corporation
    Inventors: Brian D. Barrick, Joshua W. Bowman, Jeffrey C. Brownscheidle, Sundeep Chadha, Michael J. Genden, Dhivya Jeganathan, Dung Q. Nguyen, Salim A. Shah
  • Patent number: 10719327
    Abstract: In some embodiments, a branch prediction unit includes a plurality of branch prediction circuits and selection logic. At least two of the branch prediction circuits are configured, based on an address of a branch instruction and different sets of history information, to provide a corresponding branch prediction for the branch instruction. At least one storage element of the at least two branch prediction circuits is set associative. The selection logic is configured to select a particular branch prediction output by one of the branch prediction circuits as a current branch prediction output of the branch prediction unit. In some instances, the branch prediction unit may be less likely to replace branch prediction information, as compared to a different branch prediction unit that does not include a set associative storage element. In some embodiments, this arrangement may lead to increased performance of the branch prediction unit.
    Type: Grant
    Filed: May 19, 2015
    Date of Patent: July 21, 2020
    Assignee: Apple Inc.
    Inventors: Muawya M. Al-Otoom, Ian D. Kountanis, Conrado Blasco
  • Patent number: 10719322
    Abstract: A technique includes determining whether one or more instructions in an instruction group require cracking. Whether the instructions that require cracking are associated with a decode-time instruction optimization (DTIO) sequence is also determined. In response to a first instruction, included in the one or more instructions, requiring cracking and the first instruction not being part of a DTIO sequence, the first instruction is cracked into internal operations (IOPs). In response to a second instruction, included in the one or more instructions, requiring cracking and the second instruction being part of a DTIO sequence, an IOP sequence (that includes at least one IOP that is associated with at least a cracked version of the second instruction and at least a third instruction that is included in the one or more instructions and at least one other IOP that is associated with the cracked version of the second instruction) is generated.
    Type: Grant
    Filed: June 10, 2015
    Date of Patent: July 21, 2020
    Assignee: International Business Machines Corporation
    Inventors: Michael K. Gschwind, Valentina Salapura
  • Patent number: 10713174
    Abstract: A streaming engine employed in a digital data processor specifies a fixed read only data stream defined by plural nested loops. An address generator produces addresses of data elements. A stream head register stores data elements next to be supplied to functional units for use as operands. The streaming engine stores an early address of next to be fetched data elements and a late address of a data element in the stream head register for each of the nested loops. The streaming engine stores an early loop count of next to be fetched data elements and a late loop count of a data element in the stream head register for each of the nested loops.
    Type: Grant
    Filed: December 20, 2016
    Date of Patent: July 14, 2020
    Assignee: TEXAS INSTRUMENTS INCORPORATED
    Inventors: Joseph Zbiciak, Timothy D. Anderson
  • Patent number: 10705851
    Abstract: A method for scheduling micro-instructions, performed by a first qualifier, is provided. The method includes the following steps: detecting a write-back signal broadcast by a second qualifier; determining, according to content of the write-back signal, whether a value of a first load-detection counting logic is to be synchronized with a value of a second load-detection counting logic carried by the write-back signal; determining whether execution statuses of all load micro-instructions are cache hits when the synchronized value of the first load-detection counting logic reaches a predetermined value; and driving a release circuit to remove a micro-instruction in a reservation station queue when the execution statuses of all the load micro-instructions are cache hits and the micro-instruction has been dispatched to an arithmetic and logic unit for execution.
    Type: Grant
    Filed: October 2, 2018
    Date of Patent: July 7, 2020
    Assignee: SHANGHAI ZHAOXIN SEMICONDUCTOR CO., LTD.
    Inventor: Xiaolong Fei
  • Patent number: 10705839
    Abstract: A processor having a decoder to decode an instruction to generate a decoded instruction; a first source register to store a first plurality of packed signed bytes; a second source register to store a second plurality of packed signed bytes; execution circuitry to execute the decoded instruction, the execution circuitry including: multiplier circuitry to multiply each packed signed byte from the first source register with a corresponding packed signed byte from the second source register to generate temporary products; adder circuitry to add a plurality of sets of the temporary products to generate a plurality of temporary sums; negation and extension circuitry to negate and extend each of the temporary sums to doubleword sums; and accumulation circuitry to add each of the doubleword sums to a doubleword from a third source register to generate final doubleword results; and a packed data destination register to store the final doubleword results.
    Type: Grant
    Filed: December 21, 2017
    Date of Patent: July 7, 2020
    Assignee: Intel Corporation
    Inventors: Venkateswara Madduri, Elmoustapha Ould-Ahmed-Vall, Robert Valentine, Mark Charney, Jesus Corbal
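
The arithmetic in the abstract above (multiply packed signed bytes, sum sets of products, negate, then accumulate into doublewords) can be written out in scalar form. The sketch below rests on assumptions that the abstract does not state: each set is taken to be four adjacent byte products per 32-bit doubleword, and results are assumed to wrap on overflow rather than saturate. The function names are hypothetical.

```python
from typing import List, Sequence


def to_int32(value: int) -> int:
    """Wrap an integer to a signed 32-bit doubleword (assumed wrap-around behaviour)."""
    value &= 0xFFFFFFFF
    return value - 0x100000000 if value & 0x80000000 else value


def negated_dot_accumulate(src1: Sequence[int], src2: Sequence[int],
                           acc: Sequence[int], group: int = 4) -> List[int]:
    """For each doubleword i: acc[i] + (-(sum of `group` signed-byte products))."""
    if not (len(src1) == len(src2) == group * len(acc)):
        raise ValueError("source lengths must equal group * number of doublewords")
    if any(not -128 <= b <= 127 for b in list(src1) + list(src2)):
        raise ValueError("source elements must be signed bytes")
    results = []
    for i, accumulator in enumerate(acc):
        lo = i * group
        products = [a * b for a, b in zip(src1[lo:lo + group], src2[lo:lo + group])]
        temporary_sum = sum(products)           # adder stage
        negated = -temporary_sum                # negation and extension stage
        results.append(to_int32(accumulator + negated))  # accumulation stage
    return results


print(negated_dot_accumulate([1, 2, 3, 4, -1, -2, -3, -4],
                             [1, 1, 1, 1, 2, 2, 2, 2],
                             [100, 0]))
```

In this example the first set sums to 10, giving 100 - 10 = 90, and the second sums to -20, giving 0 + 20 = 20.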
  • Patent number: 10698688
    Abstract: A set machine instruction is provided that has associated therewith a result location to be used with a set operation. The set machine instruction is executed, which includes checking contents of a selected field, and determining, based on the checking, whether the contents of the selected field indicate a first condition, a second condition or a third condition represented in one data type. The result location is set to a value based on the determining, wherein the value, based on the setting, is of a data type different from the one data type and represents a result of a previously executed instruction. The result of the previously executed instruction is one of the first condition, the second condition or the third condition.
    Type: Grant
    Filed: June 24, 2015
    Date of Patent: June 30, 2020
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Michael K. Gschwind, Brett Olsson
  • Patent number: 10684857
    Abstract: A method includes storing a first address of a first instruction executed by a processor core in a first table, where the first instruction writes a value into a register for utilization in addressing memory. The method stores the first address of the first instruction executed by the processor core in a second table with multiple entries, where a register value loaded into the register is utilized as a second address by a second instruction executed by the processor core to access a main memory. The method determines whether an instruction address associated with an instruction executed by the processor core is present in the second table, where the instruction address is the second address. Responsive to determining the instruction address is present in the second table, the method prefetches data from the main memory, where the register value is utilized as the second address in the main memory.
    Type: Grant
    Filed: February 1, 2018
    Date of Patent: June 16, 2020
    Assignee: International Business Machines Corporation
    Inventors: Wolfgang Gellerich, Gerrit Koch, Peter M. Held, Martin Schwidefsky