Patents by Inventor Chiloda Ashan Senarath PATHIRANE

Chiloda Ashan Senarath PATHIRANE has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 11429393
    Abstract: An apparatus for data processing and a method of data processing are provided. Data processing operations are performed in response to instructions which reference architectural registers using physical registers to store data values when performing the data processing operations. Mappings between the architectural registers and the physical registers are stored, and when a data hazard condition is identified with respect to out-of-order program execution of an instruction, an architectural register specified in the instruction is remapped to an available physical register. A reorder buffer stores an entry for each destination architectural register specified by the instruction, entries being stored in program order, and an entry specifies a destination architectural register and an original physical register to which the destination architectural register was mapped before the architectural register was remapped to an available physical register.
    Type: Grant
    Filed: November 11, 2015
    Date of Patent: August 30, 2022
    Assignee: ARM LIMITED
    Inventors: Vladimir Vasekin, Ian Michael Caulfield, Chiloda Ashan Senarath Pathirane
  • Patent number: 11416252
    Abstract: A data processing system includes an instruction pipeline containing instruction queue circuitry, fusion circuitry and decoder circuitry. The fusion circuitry serves to identify fusible groups of program instructions within a Y-wide window of program instructions and supply a stream of program instructions including such replacement fused program instructions to an X-wide decoder circuitry which decodes X program instructions in parallel using parallel decoders.
    Type: Grant
    Filed: December 27, 2017
    Date of Patent: August 16, 2022
    Assignee: Arm Limited
    Inventors: Vladimir Vasekin, Chiloda Ashan Senarath Pathirane, Jungsoo Kim, Alexei Fedorov
  • Patent number: 11216277
    Abstract: Aspects of the present disclosure relate to an apparatus comprising register circuitry implementing a plurality of registers and processing circuitry to perform data processing operations on data stored in said registers. The apparatus comprises store buffer circuitry to, responsive to a store instruction in respect of given data, temporarily store said given data prior to providing said given data to a memory. Responsive to receiving at the processing circuitry a request to perform a state-saving-triggering operation, the register circuitry is configured to capture in shadow registers of said register circuitry a state of a subset of registers of the plurality of registers, and provide the captured state from the shadow registers to the memory.
    Type: Grant
    Filed: September 26, 2019
    Date of Patent: January 4, 2022
    Assignee: Arm Limited
    Inventor: Chiloda Ashan Senarath Pathirane
  • Patent number: 11068238
    Abstract: A multiplier circuit is described in which sub-products calculated in a first stage of a carry-save adder (CSA) network are output early, processed by applying a processing function, and re-injected into a subsequent stage of the CSA network to add the processed sub-products. This allows a CSA network provided for multiplication operations to be reused for operations which require sub-products to be processed and added, such as floating-point dot product operations performed on floating-point values represented in bfloat16 format.
    Type: Grant
    Filed: May 21, 2019
    Date of Patent: July 20, 2021
    Assignee: Arm Limited
    Inventors: Michael Alexander Kennedy, Neil Burgess, Zichao Xie, Chiloda Ashan Senarath Pathirane
  • Patent number: 11036510
    Abstract: A merging predicated instruction controls a processing pipeline to perform a processing operation to determine a processing result based on at least one source operand, and to perform a merging operation to merge the processing result with a previous value of a destination register under control of a predicate value identifying, for each of a plurality of portions of the destination register, whether that portion is to be set to a corresponding portion of the processing result or a corresponding portion of the previous value. The merging predicated instruction is permitted to be issued to the pipeline with a timing which results in the previous value of the destination register still being unavailable when the merging predicated instruction is at a given pipeline stage at which the processing result is determined. This can help to improve performance of subsequent instructions which are independent of the merging predicated instruction.
    Type: Grant
    Filed: October 11, 2018
    Date of Patent: June 15, 2021
    Assignee: Arm Limited
    Inventors: Karel Hubertus Gerardus Walters, Chiloda Ashan Senarath Pathirane
  • Publication number: 20210096863
    Abstract: Aspects of the present disclosure relate to an apparatus comprising register circuitry implementing a plurality of registers and processing circuitry to perform data processing operations on data stored in said registers. The apparatus comprises store buffer circuitry to, responsive to a store instruction in respect of given data, temporarily store said given data prior to providing said given data to a memory. Responsive to receiving at the processing circuitry a request to perform a state-saving-triggering operation, the register circuitry is configured to capture in shadow registers of said register circuitry a state of a subset of registers of the plurality of registers, and provide the captured state from the shadow registers to the memory.
    Type: Application
    Filed: September 26, 2019
    Publication date: April 1, 2021
    Inventor: Chiloda Ashan Senarath PATHIRANE
  • Patent number: 10963253
    Abstract: An apparatus comprises instruction decoding circuitry to generate micro-operations in response to program instructions; and processing circuitry to perform data processing in response to the micro-operations generated by the instruction decoding circuitry. In response to a predicated vector instruction, the instruction decoding circuitry reads or predicts an estimated value of the predicate value, and depending on the estimated value, varies a composition of at least one micro-operation generated in response to the predicated vector instruction. This can enable more efficient use of hardware resources in the processing circuitry.
    Type: Grant
    Filed: July 10, 2018
    Date of Patent: March 30, 2021
    Assignee: Arm Limited
    Inventors: Karel Hubertus Gerardus Walters, Chiloda Ashan Senarath Pathirane, Michael Alexander Kennedy
  • Publication number: 20200371749
    Abstract: A multiplier circuit is described in which sub-products calculated in a first stage of a carry-save adder (CSA) network are output early, processed by applying a processing function, and re-injected into a subsequent stage of the CSA network to add the processed sub-products. This allows a CSA network provided for multiplication operations to be reused for operations which require sub-products to be processed and added, such as floating-point dot product operations performed on floating-point values represented in bfloat16 format.
    Type: Application
    Filed: May 21, 2019
    Publication date: November 26, 2020
    Inventors: Michael Alexander KENNEDY, Neil BURGESS, Zichao XIE, Chiloda Ashan Senarath PATHIRANE
  • Patent number: 10719329
    Abstract: An apparatus and method are provided for using predicted result values. The apparatus has a processing unit that comprises processing circuitry for executing a sequence of instructions, and value prediction circuitry for identifying a predicted result value for at least one instruction. A result producing structure is provided that is responsive to a request issued from the processing unit when the processing circuitry is executing a first instruction, to produce a result value for the first instruction and return that result value to the processing unit. While waiting for the result value from the result producing structure, the processing circuitry can be arranged to speculatively execute at least one dependent instruction using a predicted result value for the first instruction as obtained from the value prediction circuitry.
    Type: Grant
    Filed: June 28, 2018
    Date of Patent: July 21, 2020
    Assignee: Arm Limited
    Inventors: Vladimir Vasekin, David Michael Bull, Chiloda Ashan Senarath Pathirane, Alexei Fedorov
  • Publication number: 20200117457
    Abstract: A merging predicated instruction controls a processing pipeline to perform a processing operation to determine a processing result based on at least one source operand, and to perform a merging operation to merge the processing result with a previous value of a destination register under control of a predicate value identifying, for each of a plurality of portions of the destination register, whether that portion is to be set to a corresponding portion of the processing result or a corresponding portion of the previous value. The merging predicated instruction is permitted to be issued to the pipeline with a timing which results in the previous value of the destination register still being unavailable when the merging predicated instruction is at a given pipeline stage at which the processing result is determined. This can help to improve performance of subsequent instructions which are independent of the merging predicated instruction.
    Type: Application
    Filed: October 11, 2018
    Publication date: April 16, 2020
    Inventors: Karel Hubertus Gerardus WALTERS, Chiloda Ashan Senarath PATHIRANE
  • Patent number: 10552160
    Abstract: A processing pipeline for processing instructions with instructions from multiple threads in flight concurrently may have control circuitry to detect a stalling event associated with a given thread. In response, at least one instruction of the given thread may be flushed from the pipeline, and the control circuitry may trigger fetch circuitry to reduce a fraction of the fetched instructions which are fetched from the given thread. A mechanism is also described to determine when to trigger a predetermined action when a delay in accessing information becomes greater than a delay threshold, and to update the delay threshold based on a difference between a return delay when the information is returned from the storage circuitry and the delay threshold.
    Type: Grant
    Filed: May 23, 2018
    Date of Patent: February 4, 2020
    Assignee: ARM Limited
    Inventors: Ian Michael Caulfield, Max John Batley, Chiloda Ashan Senarath Pathirane
  • Publication number: 20200019402
    Abstract: An apparatus comprises instruction decoding circuitry to generate micro-operations in response to program instructions; and processing circuitry to perform data processing in response to the micro-operations generated by the instruction decoding circuitry. In response to a predicated vector instruction, the instruction decoding circuitry reads or predicts an estimated value of the predicate value, and depending on the estimated value, varies a composition of at least one micro-operation generated in response to the predicated vector instruction. This can enable more efficient use of hardware resources in the processing circuitry.
    Type: Application
    Filed: July 10, 2018
    Publication date: January 16, 2020
    Inventors: Karel Hubertus Gerardus WALTERS, Chiloda Ashan Senarath PATHIRANE, Michael Alexander KENNEDY
  • Publication number: 20200004547
    Abstract: An apparatus and method are provided for using predicted result values. The apparatus has a processing unit that comprises processing circuitry for executing a sequence of instructions, and value prediction circuitry for identifying a predicted result value for at least one instruction. A result producing structure is provided that is responsive to a request issued from the processing unit when the processing circuitry is executing a first instruction, to produce a result value for the first instruction and return that result value to the processing unit. While waiting for the result value from the result producing structure, the processing circuitry can be arranged to speculatively execute at least one dependent instruction using a predicted result value for the first instruction as obtained from the value prediction circuitry.
    Type: Application
    Filed: June 28, 2018
    Publication date: January 2, 2020
    Inventors: Vladimir VASEKIN, David Michael BULL, Chiloda Ashan Senarath PATHIRANE, Alexei FEDOROV
  • Publication number: 20190196832
    Abstract: A data processing system 2 includes an instruction pipeline 14 containing instruction queue circuitry 28, fusion circuitry 30 and decoder circuitry 32. The fusion circuitry 30 serves to identify fusible groups of program instructions within a Y-wide window of program instructions and supply a stream of program instructions including such replacement fused program instructions to an X-wide decoder circuitry 32 which decodes X program instructions in parallel using parallel decoders 40, 42, 44.
    Type: Application
    Filed: December 27, 2017
    Publication date: June 27, 2019
    Inventors: Vladimir VASEKIN, Chiloda Ashan Senarath PATHIRANE, Jungsoo KIM, Alexei FEDOROV
  • Patent number: 10296349
    Abstract: Data processing circuitry comprises allocation circuitry to allocate one or more source and destination processor registers, of a set of processor registers each defined by a respective register index, to a processor instruction for use in execution of that processor instruction and to associate, with the processor instruction, information to indicate the register index of the allocated source and destination processor registers; the allocation circuitry being selectively operable to allocate, to a processor instruction, a group of destination processor registers having a subset of their register indices in common and to associate, with the processor instruction, information to indicate the register index of one processor register of the group and identifying information to identify one or more bits of the register index which differ between the processor registers in the allocated group of processor registers.
    Type: Grant
    Filed: January 7, 2016
    Date of Patent: May 21, 2019
    Assignee: ARM Limited
    Inventors: Vladimir Vasekin, Antony John Penton, Chiloda Ashan Senarath Pathirane, Andrew James Antony Lees
  • Publication number: 20180267805
    Abstract: A processing pipeline for processing instructions with instructions from multiple threads in flight concurrently may have control circuitry to detect a stalling event associated with a given thread. In response, at least one instruction of the given thread may be flushed from the pipeline, and the control circuitry may trigger fetch circuitry to reduce a fraction of the fetched instructions which are fetched from the given thread. A mechanism is also described to determine when to trigger a predetermined action when a delay in accessing information becomes greater than a delay threshold, and to update the delay threshold based on a difference between a return delay when the information is returned from the storage circuitry and the delay threshold.
    Type: Application
    Filed: May 23, 2018
    Publication date: September 20, 2018
    Inventors: Ian Michael CAULFIELD, Max John BATLEY, Chiloda Ashan Senarath PATHIRANE
  • Publication number: 20170199738
    Abstract: Data processing circuitry comprises allocation circuitry to allocate one or more source and destination processor registers, of a set of processor registers each defined by a respective register index, to a processor instruction for use in execution of that processor instruction and to associate, with the processor instruction, information to indicate the register index of the allocated source and destination processor registers; the allocation circuitry being selectively operable to allocate, to a processor instruction, a group of destination processor registers having a subset of their register indices in common and to associate, with the processor instruction, information to indicate the register index of one processor register of the group and identifying information to identify one or more bits of the register index which differ between the processor registers in the allocated group of processor registers.
    Type: Application
    Filed: January 7, 2016
    Publication date: July 13, 2017
    Inventors: Vladimir VASEKIN, Antony John PENTON, Chiloda Ashan Senarath PATHIRANE, Andrew James Antony LEES
  • Publication number: 20170139716
    Abstract: A processing pipeline for processing instructions with instructions from multiple threads in flight concurrently may have control circuitry to detect a stalling event associated with a given thread. In response, at least one instruction of the given thread may be flushed from the pipeline, and the control circuitry may trigger fetch circuitry to reduce a fraction of the fetched instructions which are fetched from the given thread. A mechanism is also described to determine when to trigger a predetermined action when a delay in accessing information becomes greater than a delay threshold, and to update the delay threshold based on a difference between a return delay when the information is returned from the storage circuitry and the delay threshold.
    Type: Application
    Filed: November 18, 2015
    Publication date: May 18, 2017
    Inventors: Ian Michael CAULFIELD, Max John BATLEY, Chiloda Ashan Senarath PATHIRANE
  • Publication number: 20170132010
    Abstract: An apparatus for data processing and a method of data processing are provided. Data processing operations are performed in response to instructions which reference architectural registers using physical registers to store data values when performing the data processing operations. Mappings between the architectural registers and the physical registers are stored, and when a data hazard condition is identified with respect to out-of-order program execution of an instruction, an architectural register specified in the instruction is remapped to an available physical register. A reorder buffer stores an entry for each destination architectural register specified by the instruction, entries being stored in program order, and an entry specifies a destination architectural register and an original physical register to which the destination architectural register was mapped before the architectural register was remapped to an available physical register.
    Type: Application
    Filed: November 11, 2015
    Publication date: May 11, 2017
    Inventors: Vladimir VASEKIN, Ian Michael CAULFIELD, Chiloda Ashan Senarath PATHIRANE
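
The register-remapping abstract above (publication 20170132010, granted as 11429393) describes a reorder buffer whose entries record a destination architectural register together with the physical register it was mapped to before remapping. The following is a minimal software sketch of that general idea, not the patented circuitry: the names Renamer, RobEntry, rename_dest, commit_oldest and flush are hypothetical, and the free-list policy and the one-to-one initial mapping are assumptions made only for illustration.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class RobEntry:
    arch_reg: int      # destination architectural register
    old_phys_reg: int  # physical register it was mapped to before remapping


class Renamer:
    """Toy model of register renaming with a program-ordered reorder buffer."""

    def __init__(self, num_arch: int, num_phys: int) -> None:
        # Assumption: architectural register i starts mapped to physical i.
        self.mapping = list(range(num_arch))
        self.free = deque(range(num_arch, num_phys))
        self.rob = deque()  # oldest entry on the left, youngest on the right

    def rename_dest(self, arch_reg: int) -> int:
        """Remap a destination register and record its previous mapping."""
        new_phys = self.free.popleft()
        self.rob.append(RobEntry(arch_reg, self.mapping[arch_reg]))
        self.mapping[arch_reg] = new_phys
        return new_phys

    def commit_oldest(self) -> None:
        """Retire the oldest entry; its old physical register becomes free."""
        entry = self.rob.popleft()
        self.free.append(entry.old_phys_reg)

    def flush(self) -> None:
        """Undo all in-flight remappings, youngest first."""
        while self.rob:
            entry = self.rob.pop()
            self.free.appendleft(self.mapping[entry.arch_reg])
            self.mapping[entry.arch_reg] = entry.old_phys_reg
```

Because each entry keeps the original physical register, walking the buffer from youngest to oldest is enough to restore the mapping table in this sketch, while committing from the oldest end releases physical registers that are no longer needed.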