Patents by Inventor Luca NASSI
Luca NASSI has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Patent number: 11132202
Abstract: An apparatus comprises execution circuitry to perform operations on source data values and to generate result data values; issue circuitry comprising one or more issue queues identifying pending operations awaiting performance by the execution circuitry, and selection circuitry to select pending operations to issue to the execution circuitry; data value cache storage comprising first and second cache regions; and cache control circuitry to control the storing to the first cache region of result data values generated by the execution circuitry and the eviction of stored result data values from the first cache region in response to newly generated result data values being stored in the first cache region; the cache control circuitry being configured to store to the second cache region result data values required as source data values for one or more oldest pending operations identified by the one or more issue queues and to inhibit eviction of a given result data value stored in the second cache region until in
Type: Grant
Filed: September 24, 2019
Date of Patent: September 28, 2021
Assignee: Arm Limited
Inventors: Luca Nassi, Rémi Marius Teyssier, Cédric Denis Robert Airaud, Albin Pierrick Tonnerre, Francois Donati, Christophe Carbonne, Damian Maiorano
-
Patent number: 11036511
Abstract: An apparatus has a processing pipeline, and first and second register files. A temporary-register-using instruction is supported which controls the pipeline to perform an operation using a temporary variable derived from an operand stored in the first register file. In response to the instruction, when a predetermined condition is not satisfied, the pipeline processes at least one register move micro-operation to transfer data from the at least one source register of the first register file to at least one newly allocated temporary register of the second register file. When the condition is satisfied, the operation can be performed using a temporary variable already stored in the temporary register of the second register file used by an earlier temporary-register-using instruction specifying the same source register for determining the temporary variable, in the absence of an intervening instruction for rewriting the source register.
Type: Grant
Filed: July 29, 2019
Date of Patent: June 15, 2021
Assignee: Arm Limited
Inventors: Xiaoyang Shen, Damien Robin Martin, Cédric Denis Robert Airaud, Luca Nassi, François Donati
-
Patent number: 11010159
Abstract: Apparatus comprises counter and bit-shift circuitry to provide a succession of processing stages each comprising a count operation stage and a corresponding bit-shift stage, each processing stage operating with respect to a set of contiguous n-bit groups of bit positions, where n is 1 for a first processing stage and n doubles from one processing stage in the succession of processing stages to a next processing stage in the succession of processing stages; each count operation stage being configured to generate, for a first set of alternate instances of the n-bit groups of bit positions, count values indicating a respective number of bits of a predetermined bit value in a mask data word; and each bit-shift stage being configured to generate a bit-shifted data word by bit-shifting bits of a data word to be processed, for a second set of alternate instances of the n-bit groups of bit positions complementary to the first set, by respective numbers of bit positions dependent upon the count values generated by the
Type: Grant
Filed: August 31, 2018
Date of Patent: May 18, 2021
Assignee: ARM LIMITED
Inventors: Xiaoyang Shen, Cedric Denis Robert Airaud, Luca Nassi, Damien Robin Martin
-
Patent number: 10977044
Abstract: An apparatus comprising processing circuitry is provided, the processing circuitry comprising execution circuitry, commit circuitry, issue circuitry comprising an issue queue and selection circuitry, and a branch predictor. The processing circuitry is configured to identify a speculation barrier instruction in the commit queue. While an entry in the commit queue identifies a speculation barrier instruction, when a branch instruction that follows the speculation barrier instruction in the program order is selected for issue, the processing circuitry performs a first execution of the instruction, inhibiting updating of branch prediction data items associated with the branch instruction and inhibiting the selection circuitry from invalidating the associated issue queue entry.
Type: Grant
Filed: September 5, 2019
Date of Patent: April 13, 2021
Assignee: Arm Limited
Inventors: Remi Marius Teyssier, Luca Nassi, Albin Pierrick Tonnerre, François Donati
-
Patent number: 10915327
Abstract: Aspects of the present disclosure relate to an apparatus comprising a plurality of clusters, each cluster having a plurality of execution units to execute instructions. The apparatus comprises dispatch circuitry to determine, for each instruction to be executed, a chosen cluster from amongst the plurality of clusters to which to dispatch that instruction for execution. This determination is performed by selecting between a default dispatch policy wherein said chosen cluster is a cluster to which an earlier instruction to generate at least one source operand of said instruction was dispatched for execution, and an alternative dispatch policy for selecting said chosen cluster. Said selecting is based on a selection parameter. The dispatch circuitry is further configured to dispatch said instruction to the chosen cluster for execution.
Type: Grant
Filed: December 14, 2018
Date of Patent: February 9, 2021
Assignee: Arm Limited
Inventors: Luca Nassi, Remi Marius Teyssier, François Donati, Damian Maiorano
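Illustrative sketch (not part of the patent record): the dispatch choice described in this abstract can be modelled roughly as below. The abstract does not say what the alternative policy or the selection parameter is, so the "least-loaded cluster" rule, the boolean `use_default` flag, and the names `Cluster`, `choose_cluster`, and `dispatch` are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Cluster:
    name: str
    pending: int = 0  # hypothetical load metric: instructions queued on this cluster


def choose_cluster(clusters, producer_cluster, use_default):
    """Return the index of the cluster chosen for one instruction.

    producer_cluster: index of the cluster that ran the earlier instruction
        producing a source operand, or None if there is no such producer.
    use_default: stands in for the abstract's "selection parameter" (assumed boolean).
    """
    if use_default and producer_cluster is not None:
        # Default policy: follow the producer of a source operand.
        return producer_cluster
    # Assumed alternative policy: pick the least-loaded cluster.
    return min(range(len(clusters)), key=lambda i: clusters[i].pending)


def dispatch(clusters, producer_cluster, use_default):
    """Choose a cluster, record the dispatch on it, and return its index."""
    chosen = choose_cluster(clusters, producer_cluster, use_default)
    clusters[chosen].pending += 1
    return chosen
```

With two clusters holding 3 and 1 pending instructions, `dispatch(clusters, 0, True)` follows the producer to cluster 0, while `dispatch(clusters, 0, False)` falls back to the less loaded cluster 1.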
-
Patent number: 10846098
Abstract: An apparatus and method of data processing are provided. The apparatus comprises at least two execution pipelines, one with a shorter execution latency than the other. The execution pipelines share a write port and issue circuitry of the apparatus issues decoded instructions to a selected execution pipeline. The apparatus further comprises at least one additional pipeline stage and the issue circuitry can detect a write port conflict condition in dependence on a latency indication associated with a decoded instruction which it is to issue. If the issue circuitry intends to issue the decoded instruction to the execution pipeline with the shorter execution latency then when the write port conflict condition is found the issue circuitry will cause use of at least one additional pipeline stage in addition to the target execution pipeline to avoid the write port conflict.
Type: Grant
Filed: May 29, 2018
Date of Patent: November 24, 2020
Assignee: Arm Limited
Inventors: Cédric Denis Robert Airaud, Luca Nassi, Damien Robin Martin, Xiaoyang Shen
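Illustrative sketch (not part of the patent record): one way to picture the write port conflict handling in this abstract is a table of cycles in which the shared write port is already reserved. The latencies, the single extra stage, the stall behaviour for the longer pipeline, and the `Issuer` class are hypothetical choices for this example, not details taken from the patent.

```python
SHORT_LATENCY = 1  # cycles; hypothetical latency of the shorter pipeline
LONG_LATENCY = 3   # cycles; hypothetical latency of the longer pipeline
EXTRA_STAGES = 1   # assumed number of additional pipeline stages available


class Issuer:
    def __init__(self):
        # Cycles in which the shared write port is already reserved.
        self.write_port_busy = set()

    def issue(self, cycle, latency):
        """Try to issue at `cycle`; return the write-back cycle, or None to stall."""
        writeback = cycle + latency
        if writeback in self.write_port_busy:
            if latency != SHORT_LATENCY:
                # This sketch only reroutes short-latency instructions.
                return None
            # Conflict: pass through additional stage(s) to delay the write.
            for extra in range(1, EXTRA_STAGES + 1):
                if writeback + extra not in self.write_port_busy:
                    writeback += extra
                    break
            else:
                return None  # no free slot even with the extra stages
        self.write_port_busy.add(writeback)
        return writeback
```

For example, after a long-latency instruction issued at cycle 0 reserves the port for cycle 3, a short-latency instruction issued at cycle 2 would also target cycle 3; in this model it is delayed through the extra stage and writes back at cycle 4 instead.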
-
Patent number: 10725964
Abstract: Apparatuses and methods of data processing are disclosed. An apparatus comprises two data processing clusters each having multiple data processing lanes to perform single instruction multiple data (SIMD) processing. Decoded instructions are issued to at least one of the two data processing clusters. A decoded SIMD instruction specifying a vector length which is more than the width of the data processing lanes of the first data processing cluster has a first part issued to the first data processing cluster for execution. An issuance target for a second remaining part of the decoded SIMD instruction is selected in dependence on a dynamic performance condition. When the dynamic performance condition has a first state the issuance target is the first data processing cluster and when the dynamic performance condition has a second state the issuance target is the second data processing cluster.
Type: Grant
Filed: June 12, 2018
Date of Patent: July 28, 2020
Assignee: Arm Limited
Inventors: Cedric Denis Robert Airaud, Luca Nassi, Damien Robin Martin, Xiaoyang Shen
-
Publication number: 20200192674
Abstract: Aspects of the present disclosure relate to an apparatus comprising a plurality of clusters, each cluster having a plurality of execution units to execute instructions. The apparatus comprises dispatch circuitry to determine, for each instruction to be executed, a chosen cluster from amongst the plurality of clusters to which to dispatch that instruction for execution. This determination is performed by selecting between a default dispatch policy wherein said chosen cluster is a cluster to which an earlier instruction to generate at least one source operand of said instruction was dispatched for execution, and an alternative dispatch policy for selecting said chosen cluster. Said selecting is based on a selection parameter. The dispatch circuitry is further configured to dispatch said instruction to the chosen cluster for execution.
Type: Application
Filed: December 14, 2018
Publication date: June 18, 2020
Inventors: Luca NASSI, Remi Marius TEYSSIER, François DONATI, Damian MAIORANO
-
Patent number: 10649782
Abstract: An apparatus and method are provided for controlling branch prediction. The apparatus has processing circuitry for executing instructions, and branch prediction circuitry that comprises a plurality of branch prediction mechanisms used to predict target addresses for branch instructions to be executed by the processing circuitry. The branch instructions comprise a plurality of branch types, where one branch type is a return instruction. The branch prediction mechanisms include a return prediction mechanism used by default to predict a target address when a return instruction is detected by the branch prediction circuitry. However, the branch prediction circuitry is responsive to a trigger condition indicative of misprediction of the target address when using the return prediction mechanism to predict the target address for a given return instruction, to switch to using an alternative branch prediction mechanism for predicting the target address for the given return instruction.
Type: Grant
Filed: March 29, 2018
Date of Patent: May 12, 2020
Assignee: Arm Limited
Inventors: Luca Nassi, Houdhaifa Bouzguarrou, Guillaume Bolbenes
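Illustrative sketch (not part of the patent record): the fallback described in this abstract can be pictured as a return address stack used by default, with a per-return switch to another predictor once mispredictions are seen. The abstract does not define the trigger condition or the alternative mechanism; the misprediction-count threshold, the "last resolved target" table, and the `ReturnPredictor` class are assumptions for illustration.

```python
class ReturnPredictor:
    def __init__(self, threshold=1):
        self.return_stack = []      # default mechanism: return address stack
        self.alt_targets = {}       # assumed alternative: last resolved target per return PC
        self.mispredicts = {}       # misprediction count per return PC
        self.threshold = threshold  # hypothetical trigger condition

    def on_call(self, return_address):
        """Record the return address pushed by a call instruction."""
        self.return_stack.append(return_address)

    def predict_return(self, pc):
        """Predict the target of the return instruction at `pc`."""
        # The stack is popped either way so it stays in step with calls.
        stack_guess = self.return_stack.pop() if self.return_stack else None
        if self.mispredicts.get(pc, 0) >= self.threshold and pc in self.alt_targets:
            # Trigger condition met: use the alternative mechanism for this return.
            return self.alt_targets[pc]
        return stack_guess

    def resolve(self, pc, predicted, actual):
        """Update state once the real target of the return at `pc` is known."""
        if predicted != actual:
            self.mispredicts[pc] = self.mispredicts.get(pc, 0) + 1
        self.alt_targets[pc] = actual
```

In this model, a return at a given PC that once jumps somewhere other than the stacked address is predicted from the alternative table on its next occurrence.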
-
Patent number: 10635445
Abstract: An apparatus and method of operating an apparatus are disclosed. The apparatus has a program counter permitted range storage element defining a permitted range of program counter values for the sequence of instructions it executes. Branch prediction circuitry predicts target instruction addresses for branch instructions. In response to a program counter modifying event, a program counter speculative range storage element is updated corresponding to each speculatively executed instruction after a branch instruction. Program counter permitted range verification circuitry is responsive to resolution of a modification of the program counter permitted range indication resulting from the program counter modifying event to determine whether the speculatively executed program counter range satisfies the permitted range of program counter values. A branch mis-prediction mechanism may support the response of the apparatus if the permitted range of program counter values is violated.
Type: Grant
Filed: May 29, 2018
Date of Patent: April 28, 2020
Assignee: Arm Limited
Inventors: Rémi Marius Teyssier, Albin Pierrick Tonnerre, Cédric Denis Robert Airaud, Luca Nassi, Guillaume Bolbenes, Francois Donati, Lee Evan Eisen, Pasquale Ranone
-
Publication number: 20200117464
Abstract: An apparatus comprising processing circuitry is provided, the processing circuitry comprising execution circuitry, commit circuitry, issue circuitry comprising an issue queue and selection circuitry, and a branch predictor. The processing circuitry is configured to identify a speculation barrier instruction in the commit queue. While an entry in the commit queue identifies a speculation barrier instruction, when a branch instruction that follows the speculation barrier instruction in the program order is selected for issue, the processing circuitry performs a first execution of the instruction, inhibiting updating of branch prediction data items associated with the branch instruction and inhibiting the selection circuitry from invalidating the associated issue queue entry.
Type: Application
Filed: September 5, 2019
Publication date: April 16, 2020
Inventors: Remi Marius TEYSSIER, Luca NASSI, Albin Pierrick TONNERRE, François DONATI
-
Publication number: 20200117463
Abstract: An apparatus comprises execution circuitry to perform operations on source data values and to generate result data values; issue circuitry comprising one or more issue queues identifying pending operations awaiting performance by the execution circuitry, and selection circuitry to select pending operations to issue to the execution circuitry; data value cache storage comprising first and second cache regions; and cache control circuitry to control the storing to the first cache region of result data values generated by the execution circuitry and the eviction of stored result data values from the first cache region in response to newly generated result data values being stored in the first cache region; the cache control circuitry being configured to store to the second cache region result data values required as source data values for one or more oldest pending operations identified by the one or more issue queues and to inhibit eviction of a given result data value stored in the second cache region until in
Type: Application
Filed: September 24, 2019
Publication date: April 16, 2020
Inventors: Luca NASSI, Rémi Marius TEYSSIER, Cédric Denis Robert AIRAUD, Albin Pierrick TONNERRE, Francois DONATI, Christophe CARBONNE, Damian MAIORANO
-
Publication number: 20200110613
Abstract: Data processing apparatus comprises a processing element configured to access an architectural register representing a given system register; mapping circuitry to map the architectural register representing the given system register to a physical register selected from a set of physical registers; a register bank having a set of two or more respective banked versions of the given system register, in which a respective one of the banked versions of the system register is associated with each of a plurality of current operating states of the processing element; in which, when the processing element changes operating state from a first operating state associated with a first one of the banked versions of the system register to a second operating state associated with a second, different, one of the banked versions of the system register, the processing element is configured to store the current contents of the architectural register representing the given system register to the first one of the banked versions o
Type: Application
Filed: September 5, 2019
Publication date: April 9, 2020
Inventors: Cedric Denis Robert AIRAUD, Albin Pierrick TONNERRE, Luca NASSI, Remi Marius TEYSSIER
-
Publication number: 20200073660
Abstract: Apparatus comprises counter and bit-shift circuitry to provide a succession of processing stages each comprising a count operation stage and a corresponding bit-shift stage, each processing stage operating with respect to a set of contiguous n-bit groups of bit positions, where n is 1 for a first processing stage and n doubles from one processing stage in the succession of processing stages to a next processing stage in the succession of processing stages; each count operation stage being configured to generate, for a first set of alternate instances of the n-bit groups of bit positions, count values indicating a respective number of bits of a predetermined bit value in a mask data word; and each bit-shift stage being configured to generate a bit-shifted data word by bit-shifting bits of a data word to be processed, for a second set of alternate instances of the n-bit groups of bit positions complementary to the first set, by respective numbers of bit positions dependent upon the count values generated by the
Type: Application
Filed: August 31, 2018
Publication date: March 5, 2020
Inventors: Xiaoyang SHEN, Cedric Denis Robert AIRAUD, Luca NASSI, Damien Robin MARTIN
-
Publication number: 20200065109
Abstract: An apparatus has a processing pipeline, and first and second register files. A temporary-register-using instruction is supported which controls the pipeline to perform an operation using a temporary variable derived from an operand stored in the first register file. In response to the instruction, when a predetermined condition is not satisfied, the pipeline processes at least one register move micro-operation to transfer data from the at least one source register of the first register file to at least one newly allocated temporary register of the second register file. When the condition is satisfied, the operation can be performed using a temporary variable already stored in the temporary register of the second register file used by an earlier temporary-register-using instruction specifying the same source register for determining the temporary variable, in the absence of an intervening instruction for rewriting the source register.
Type: Application
Filed: July 29, 2019
Publication date: February 27, 2020
Inventors: Xiaoyang SHEN, Damien Robin MARTIN, Cédric Denis Robert AIRAUD, Luca NASSI, François DONATI
-
Patent number: 10558462
Abstract: An apparatus and method are provided for storing source operands for operations. The apparatus comprises execution circuitry for performing operations on data values, and a register file comprising a plurality of registers to store the data values operated on by the execution circuitry. Issue circuitry is also provided that has a pending operations storage identifying pending operations awaiting performance by the execution circuitry and selection circuitry to select pending operations from the pending operation storage to issue to the execution circuitry. The pending operations storage comprises an entry for each pending operation, each entry storing attribute information identifying the operation to be performed, where that attribute information includes a source identifier field for each source operand of the pending operation.
Type: Grant
Filed: May 23, 2018
Date of Patent: February 11, 2020
Assignee: Arm Limited
Inventors: Luca Nassi, Cédric Denis Robert Airaud, Rémi Marius Teyssier, Albin Pierrick Tonnerre
-
Publication number: 20190377706
Abstract: Apparatuses and methods of data processing are disclosed. An apparatus comprises two data processing clusters each having multiple data processing lanes to perform single instruction multiple data (SIMD) processing. Decoded instructions are issued to at least one of the two data processing clusters. A decoded SIMD instruction specifying a vector length which is more than the width of the data processing lanes of the first data processing cluster has a first part issued to the first data processing cluster for execution. An issuance target for a second remaining part of the decoded SIMD instruction is selected in dependence on a dynamic performance condition. When the dynamic performance condition has a first state the issuance target is the first data processing cluster and when the dynamic performance condition has a second state the issuance target is the second data processing cluster.
Type: Application
Filed: June 12, 2018
Publication date: December 12, 2019
Inventors: Cedric Denis Robert AIRAUD, Luca NASSI, Damien Robin MARTIN, Xiaoyang SHEN
-
Publication number: 20190370001
Abstract: An apparatus and method of operating an apparatus are disclosed. The apparatus has a program counter permitted range storage element defining a permitted range of program counter values for the sequence of instructions it executes. Branch prediction circuitry predicts target instruction addresses for branch instructions. In response to a program counter modifying event, a program counter speculative range storage element is updated corresponding to each speculatively executed instruction after a branch instruction. Program counter permitted range verification circuitry is responsive to resolution of a modification of the program counter permitted range indication resulting from the program counter modifying event to determine whether the speculatively executed program counter range satisfies the permitted range of program counter values. A branch mis-prediction mechanism may support the response of the apparatus if the permitted range of program counter values is violated.
Type: Application
Filed: May 29, 2018
Publication date: December 5, 2019
Inventors: Rémi Marius TEYSSIER, Albin Pierrick TONNERRE, Cédric Denis Robert AIRAUD, Luca NASSI, Guillaume BOLBENES, Francois DONATI, Lee Evan EISEN, Pasquale RANONE
-
Publication number: 20190370004
Abstract: An apparatus and method of data processing are provided. The apparatus comprises at least two execution pipelines, one with a shorter execution latency than the other. The execution pipelines share a write port and issue circuitry of the apparatus issues decoded instructions to a selected execution pipeline. The apparatus further comprises at least one additional pipeline stage and the issue circuitry can detect a write port conflict condition in dependence on a latency indication associated with a decoded instruction which it is to issue. If the issue circuitry intends to issue the decoded instruction to the execution pipeline with the shorter execution latency then when the write port conflict condition is found the issue circuitry will cause use of at least one additional pipeline stage in addition to the target execution pipeline to avoid the write port conflict.
Type: Application
Filed: May 29, 2018
Publication date: December 5, 2019
Inventors: Cédric Denis Robert AIRAUD, Luca NASSI, Damien Robin MARTIN, Xiaoyang SHEN
-
Publication number: 20190361705
Abstract: An apparatus and method are provided for storing source operands for operations. The apparatus comprises execution circuitry for performing operations on data values, and a register file comprising a plurality of registers to store the data values operated on by the execution circuitry. Issue circuitry is also provided that has a pending operations storage identifying pending operations awaiting performance by the execution circuitry and selection circuitry to select pending operations from the pending operation storage to issue to the execution circuitry. The pending operations storage comprises an entry for each pending operation, each entry storing attribute information identifying the operation to be performed, where that attribute information includes a source identifier field for each source operand of the pending operation.
Type: Application
Filed: May 23, 2018
Publication date: November 28, 2019
Inventors: Luca NASSI, Cédric Denis Robert AIRAUD, Rémi Marius TEYSSIER, Albin Pierrick TONNERRE