Patents by Inventor Mladen Wilder

Mladen Wilder has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Publication number: 20230050061
    Abstract: Disclosed techniques relate to work distribution in graphics processors. In some embodiments, an apparatus includes circuitry that implements a plurality of logical slots and a set of graphics processor sub-units that each implement multiple distributed hardware slots. The circuitry may determine different distribution rules for first and second sets of graphics work and map logical slots to distributed hardware slots based on the distribution rules. In various embodiments, disclosed techniques may advantageously distribute work efficiently across distributed shader processors for graphics kicks of various sizes.
    Type: Application
    Filed: August 11, 2021
    Publication date: February 16, 2023
    Inventors: Andrew M. Havlir, Steven Fishwick, David A. Gotwalt, Benjamin Bowman, Ralph C. Taylor, Melissa L. Velez, Mladen Wilder, Ali Rabbani Rankouhi, Fergus W. MacGarry
  • Patent number: 11397624
    Abstract: A data processing system including a data processor which is operable to execute programs to perform data processing operations and in which execution threads executing a program to perform data processing operations may be grouped together into thread groups. The data processor comprises a cross-lane permutation circuit which is operable to perform processing for cross-lane instructions which require data to be permuted (copied or moved) between the threads of a thread group. The cross-lane permutation circuit has plural data lanes between which data may be permuted (moved or copied). The number of data lanes is fewer than the number of threads in a thread group.
    Type: Grant
    Filed: January 22, 2019
    Date of Patent: July 26, 2022
    Assignee: Arm Limited
    Inventors: Luka Dejanovic, Mladen Wilder
  • Patent number: 10726606
    Abstract: When a shader program is to be executed by a graphics processor, the graphics processor is caused to execute at least two variants of the shader program and the operation of the graphics processor when executing execution threads for the different variants of the shader program is monitored. A variant of the shader program to be executed by subsequent execution threads that are to execute the shader program is then selected based on the monitoring of the operation of the shading stage when executing the execution threads for the different variants of the shader program.
    Type: Grant
    Filed: February 19, 2019
    Date of Patent: July 28, 2020
    Assignee: Arm Limited
    Inventors: Peter William Harris, Mladen Wilder
  • Publication number: 20200233726
    Abstract: A data processing system including a data processor which is operable to execute programs to perform data processing operations and in which execution threads executing a program to perform data processing operations may be grouped together into thread groups. The data processor comprises a cross-lane permutation circuit which is operable to perform processing for cross-lane instructions which require data to be permuted (copied or moved) between the threads of a thread group. The cross-lane permutation circuit has plural data lanes between which data may be permuted (moved or copied). The number of data lanes is fewer than the number of threads in a thread group.
    Type: Application
    Filed: January 22, 2019
    Publication date: July 23, 2020
    Applicant: Arm Limited
    Inventors: Luka Dejanovic, Mladen Wilder
  • Publication number: 20190259193
    Abstract: When a shader program is to be executed by a graphics processor, the graphics processor is caused to execute at least two variants of the shader program and the operation of the graphics processor when executing execution threads for the different variants of the shader program is monitored. A variant of the shader program to be executed by subsequent execution threads that are to execute the shader program is then selected based on the monitoring of the operation of the shading stage when executing the execution threads for the different variants of the shader program.
    Type: Application
    Filed: February 19, 2019
    Publication date: August 22, 2019
    Applicant: Arm Limited
    Inventors: Peter William Harris, Mladen Wilder
  • Patent number: 9304926
    Abstract: A coherent memory system includes a plurality of level 1 cache memories 6 connected via interconnect circuitry 18 to a level 2 cache memory 8. Coherency control circuitry 10 manages coherency between lines of data. Evict messages from the level 1 cache memories to the coherency control circuitry 10 are sent via the read address channel AR. Read messages are also sent via the read address channel AR. The read address channel AR is configured such that a read message may not be reordered relative to an evict message. The coherency control circuitry 10 is configured such that a read message will not be processed ahead of an evict message. The level 1 cache memories 6 do not track in-flight evict messages. No acknowledgement of an evict message is sent from the coherency control circuitry 10 back to the level 1 cache memory 6.
    Type: Grant
    Filed: July 23, 2013
    Date of Patent: April 5, 2016
    Assignee: ARM Limited
    Inventors: Ian Bratt, Mladen Wilder, Ole Henrik Jahren
  • Publication number: 20150032969
    Abstract: A coherent memory system includes a plurality of level 1 cache memories 6 connected via interconnect circuitry 18 to a level 2 cache memory 8. Coherency control circuitry 10 manages coherency between lines of data. Evict messages from the level 1 cache memories to the coherency control circuitry 10 are sent via the read address channel AR. Read messages are also sent via the read address channel AR. The read address channel AR is configured such that a read message may not be reordered relative to an evict message. The coherency control circuitry 10 is configured such that a read message will not be processed ahead of an evict message. The level 1 cache memories 6 do not track in-flight evict messages. No acknowledgement of an evict message is sent from the coherency control circuitry 10 back to the level 1 cache memory 6.
    Type: Application
    Filed: July 23, 2013
    Publication date: January 29, 2015
    Applicant: ARM LIMITED
    Inventors: Ian BRATT, Mladen WILDER, Ole Henrik JAHREN
  • Patent number: 8595280
    Abstract: A data processing apparatus and method for performing multiply-accumulate operations is provided. The data processing apparatus includes data processing circuitry responsive to control signals to perform data processing operations on at least one input data element.
    Type: Grant
    Filed: October 29, 2010
    Date of Patent: November 26, 2013
    Assignee: ARM Limited
    Inventors: Dominic Hugo Symes, Mladen Wilder, Guy Larri
  • Patent number: 8473819
    Abstract: An electronic device is described which receives data from a transmitting device via a communications channel. The electronic device comprises digital processing circuitry arranged to process the data received via the communications channel to generate output data, error detection circuitry arranged to detect errors in the output data, and monitoring circuitry arranged to monitor the quality of digital processing conducted by the digital processing circuitry and generate digital performance data indicative of the monitored quality of digital processing. The electronic device also comprises control circuitry responsive to error information comprising errors detected by the error detection circuitry and the performance data generated by the monitoring circuitry to modify the operation of one or both of the transmitting device and the electronic device.
    Type: Grant
    Filed: July 15, 2009
    Date of Patent: June 25, 2013
    Assignee: ARM Limited
    Inventors: Daniel Kershaw, David Michael Bull, Mladen Wilder
  • Patent number: 8443170
    Abstract: An apparatus and method for performing SIMD multiply-accumulate operations includes SIMD data processing circuitry responsive to control signals to perform data processing operations in parallel on multiple data elements. Instruction decoder circuitry is coupled to the SIMD data processing circuitry and is responsive to program instructions to generate the required control signals. The instruction decoder circuitry is responsive to a single instruction (referred to herein as a repeating multiply-accumulate instruction) having as input operands a first vector of input data elements, a second vector of coefficient data elements, and a scalar value indicative of a plurality of iterations required, to generate control signals to control the SIMD processing circuitry.
    Type: Grant
    Filed: September 17, 2009
    Date of Patent: May 14, 2013
    Assignee: ARM Limited
    Inventors: Mladen Wilder, Dominic Hugo Symes, Richard Edward Bruce
  • Patent number: 8423752
    Abstract: An apparatus for processing data is provided comprising processing circuitry having permutation circuitry for performing permutation operations, a register bank having a plurality of registers for storing data and control circuitry responsive to program instructions to control the processing circuitry to perform data processing operations. The control circuitry is arranged to be responsive to a control-generating instruction to generate, in dependence upon a bit-mask, control signals to configure the permutation circuitry for performing a permutation operation on an input operand. The bit-mask identifies within the input operand a first group of data elements having a first ordering and a second group of data elements having a second ordering, and the permutation operation is such that it preserves one of the first ordering and the second ordering but changes the other of the first ordering and the second ordering.
    Type: Grant
    Filed: December 16, 2008
    Date of Patent: April 16, 2013
    Assignee: ARM Limited
    Inventors: Dominic Hugo Symes, Mladen Wilder
  • Patent number: 8255446
    Abstract: An apparatus and method are provided for performing rearrangement operations and arithmetic operations on data. The data processing apparatus has processing circuitry for performing Single Instruction Multiple Data (SIMD) processing operations and scalar processing operations, a register bank for storing data and control circuitry responsive to program instructions to control the processing circuitry to perform data processing operations. The control circuitry is arranged to be responsive to a combined rearrangement arithmetic instruction to control the processing circuitry to perform a rearrangement operation and at least one SIMD arithmetic operation on a plurality of data elements stored in the register bank. The rearrangement operation is configurable by a size parameter derived at least in part from the register bank. The size parameter provides an indication of a number of data elements forming a rearrangement element for the purposes of the rearrangement operation.
    Type: Grant
    Filed: November 29, 2007
    Date of Patent: August 28, 2012
    Assignee: ARM Limited
    Inventors: Daniel Kershaw, Mladen Wilder, Dominic Hugo Symes
  • Publication number: 20110185262
    Abstract: An electronic device is described which receives data from a transmitting device via a communications channel. The electronic device comprises digital processing circuitry arranged to process the data received via the communications channel to generate output data, error detection circuitry arranged to detect errors in the output data, and monitoring circuitry arranged to monitor the quality of digital processing conducted by the digital processing circuitry and generate digital performance data indicative of the monitored quality of digital processing. The electronic device also comprises control circuitry responsive to error information comprising errors detected by the error detection circuitry and the performance data generated by the monitoring circuitry to modify the operation of one or both of the transmitting device and the electronic device.
    Type: Application
    Filed: July 15, 2009
    Publication date: July 28, 2011
    Inventors: Daniel Kershaw, David Michael Bull, Mladen Wilder
  • Publication number: 20110106871
    Abstract: A data processing apparatus and method for performing multiply-accumulate operations is provided. The data processing apparatus includes data processing circuitry responsive to control signals to perform data processing operations on at least one input data element.
    Type: Application
    Filed: October 29, 2010
    Publication date: May 5, 2011
    Applicant: ARM LIMITED
    Inventors: Dominic Hugo Symes, Mladen Wilder, Guy Larri
  • Publication number: 20110093863
    Abstract: A data engine that can be interrupted is disclosed, the data engine comprising a plurality of elements for storing, routing and processing the data, the plurality of elements comprising: processing elements for processing the data; registers for storing the data being processed; the data processing engine being configured to receive a clock signal and in response to the clock signal to periodically transmit a plurality of the control signals to a corresponding plurality of the elements in parallel; the data engine further comprising: control circuitry configured in response to receipt of an external interrupt request: to pause transmission of the control signals to the elements and to transmit a copy of the register data stored in the plurality of registers to a store; to transmit in parallel a next plurality of the control signals in the stream of control signals to a corresponding plurality of the elements, and to transmit a copy of output data output by the processing elements in response to the next plurali
    Type: Application
    Filed: October 21, 2009
    Publication date: April 21, 2011
    Inventors: Jef Louis Verdonck, Mladen Wilder, Johan Matterne
  • Patent number: 7895417
    Abstract: A data processing system 2 is provided including an instruction decoder 34 responsive to program instructions within an instruction register 32 to generate control signals for controlling data processing circuitry 36. The instructions supported include an address calculation instruction which splits an input address value at a position dependent upon a size value into a first portion and second portion, adds a non-zero offset value to the first portion, sets the second portion to a value and then concatenates the result of the processing on the first portion and the second portion to form the output address value. Another type of instruction supported is a select-and-insert instruction. This instruction takes a first input value and shifts it by N bit positions to form a shifted value, selects N bits from within a second input value in dependence upon the first input value and then concatenates the shifted value with the N bits to form an output value.
    Type: Grant
    Filed: April 30, 2010
    Date of Patent: February 22, 2011
    Assignee: ARM Limited
    Inventors: Dominic Hugo Symes, Daniel Kershaw, Mladen Wilder
  • Publication number: 20100274990
    Abstract: An apparatus and method for performing SIMD multiply-accumulate operations includes SIMD data processing circuitry responsive to control signals to perform data processing operations in parallel on multiple data elements. Instruction decoder circuitry is coupled to the SIMD data processing circuitry and is responsive to program instructions to generate the required control signals. The instruction decoder circuitry is responsive to a single instruction (referred to herein as a repeating multiply-accumulate instruction) having as input operands a first vector of input data elements, a second vector of coefficient data elements, and a scalar value indicative of a plurality of iterations required, to generate control signals to control the SIMD processing circuitry.
    Type: Application
    Filed: September 17, 2009
    Publication date: October 28, 2010
    Inventors: Mladen Wilder, Dominic Hugo Symes, Richard Edward Bruce
  • Patent number: 7814302
    Abstract: A data processing system 2 is provided including an instruction decoder 34 responsive to program instructions within an instruction register 32 to generate control signals for controlling data processing circuitry 36. The instructions supported include an address calculation instruction which splits an input address value at a position dependent upon a size value into a first portion and second portion, adds a non-zero offset value to the first portion, sets the second portion to a value and then concatenates the result of the processing on the first portion and the second portion to form the output address value. Another type of instruction supported is a select-and-insert instruction. This instruction takes a first input value and shifts it by N bit positions to form a shifted value, selects N bits from within a second input value in dependence upon the first input value and then concatenates the shifted value with the N bits to form an output value.
    Type: Grant
    Filed: February 13, 2008
    Date of Patent: October 12, 2010
    Assignee: ARM Limited
    Inventors: Dominic Hugo Symes, Daniel Kershaw, Mladen Wilder
  • Publication number: 20100217958
    Abstract: A data processing system 2 is provided including an instruction decoder 34 responsive to program instructions within an instruction register 32 to generate control signals for controlling data processing circuitry 36. The instructions supported include an address calculation instruction which splits an input address value at a position dependent upon a size value into a first portion and second portion, adds a non-zero offset value to the first portion, sets the second portion to a value and then concatenates the result of the processing on the first portion and the second portion to form the output address value. Another type of instruction supported is a select-and-insert instruction. This instruction takes a first input value and shifts it by N bit positions to form a shifted value, selects N bits from within a second input value in dependence upon the first input value and then concatenates the shifted value with the N bits to form an output value.
    Type: Application
    Filed: April 30, 2010
    Publication date: August 26, 2010
    Applicant: ARM Limited
    Inventors: Dominic Hugo Symes, Daniel Kershaw, Mladen Wilder
  • Publication number: 20090254736
    Abstract: An apparatus for processing data is provided comprising rearrangement circuitry having a plurality of rearrangement stages for rearranging a plurality N of input data elements, each rearrangement stage comprising at most N multiplexers arranged to select between M data elements where M is an integer less than N. Control circuitry is provided that is responsive to program instructions to control the rearrangement circuitry to perform rearrangement operations. The rearrangement circuitry is configurable by the control circuitry to perform a plurality of different rearrangement operations. The rearrangement circuitry comprises main rearrangement circuitry having a plurality of rearrangement stages in which there is a unique path between any given input element and any given output element and supplementary rearrangement circuitry in which from each input data element there is a path to at most C output data elements where 1<C<N/2.
    Type: Application
    Filed: April 7, 2008
    Publication date: October 8, 2009
    Applicant: ARM Limited
    Inventors: Dominic Hugo Symes, Mladen Wilder
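
Illustrative sketch: the address calculation instruction described in the abstracts for patents 7895417 and 7814302 above can be modelled roughly as follows. This is a reading of the abstract text only, not the claimed implementation. The function name address_calc, the 32-bit width, the default fill value, and the choice to treat the low-order `size` bits as the "second portion" are all assumptions made for illustration; the abstract does not say which end of the address each portion occupies.

```python
def address_calc(addr: int, size: int, offset: int, fill: int = 0, width: int = 32) -> int:
    """Hypothetical model of the address calculation described in the abstract.

    Assumptions (not stated in the source): addresses are `width` bits wide,
    the split point is `size` bits above the least significant bit, the
    "first portion" is the upper part and the "second portion" the lower part.
    """
    if not 0 <= size < width:
        raise ValueError("size must satisfy 0 <= size < width")
    if offset == 0:
        raise ValueError("the abstract specifies a non-zero offset")

    low_mask = (1 << size) - 1              # selects the second (lower) portion
    high_mask = (1 << (width - size)) - 1   # selects the first (upper) portion

    # Split the input address at the size-dependent position.
    first = (addr >> size) & high_mask

    # Add the offset to the first portion, wrapping within its bit width.
    first = (first + offset) & high_mask

    # Set the second portion to the supplied value, ignoring the original bits.
    second = fill & low_mask

    # Concatenate the two portions to form the output address.
    return (first << size) | second


if __name__ == "__main__":
    print(hex(address_calc(0x1234, size=8, offset=1)))
    print(hex(address_calc(0x1234, size=4, offset=2, fill=0xF)))
```

Under these assumptions, a zero fill value with an offset of 1 moves the address to the start of the next 2**size-aligned block, which is one plausible use for such an instruction; the abstract itself does not state the intended application.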