Patents by Inventor Mladen Wilder
Mladen Wilder has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Publication number: 20230050061
Abstract: Disclosed techniques relate to work distribution in graphics processors. In some embodiments, an apparatus includes circuitry that implements a plurality of logical slots and a set of graphics processor sub-units that each implement multiple distributed hardware slots. The circuitry may determine different distribution rules for first and second sets of graphics work and map logical slots to distributed hardware slots based on the distribution rules. In various embodiments, disclosed techniques may advantageously distribute work efficiently across distributed shader processors for graphics kicks of various sizes.
Type: Application
Filed: August 11, 2021
Publication date: February 16, 2023
Inventors: Andrew M. Havlir, Steven Fishwick, David A. Gotwalt, Benjamin Bowman, Ralph C. Taylor, Melissa L. Velez, Mladen Wilder, Ali Rabbani Rankouhi, Fergus W. MacGarry
-
Patent number: 11397624
Abstract: A data processing system including a data processor which is operable to execute programs to perform data processing operations and in which execution threads executing a program to perform data processing operations may be grouped together into thread groups. The data processor comprises a cross-lane permutation circuit which is operable to perform processing for cross-lane instructions which require data to be permuted (copied or moved) between the threads of a thread group. The cross-lane permutation circuit has plural data lanes between which data may be permuted (moved or copied). The number of data lanes is fewer than the number of threads in a thread group.
Type: Grant
Filed: January 22, 2019
Date of Patent: July 26, 2022
Assignee: Arm Limited
Inventors: Luka Dejanovic, Mladen Wilder
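As a rough illustration of the idea in this abstract, the sketch below permutes data between the threads of a thread group using fewer lanes than there are threads. The multi-pass scheme, the function name `permute_thread_group`, and the group and lane sizes are illustrative assumptions, not details taken from the patent.

```python
def permute_thread_group(values, source_of, num_lanes):
    """Return (result, passes) where result[t] == values[source_of[t]].

    Assumption: the circuit can serve at most `num_lanes` destination
    threads per pass, so a larger thread group needs several passes.
    """
    group_size = len(values)
    result = [None] * group_size
    passes = 0
    for start in range(0, group_size, num_lanes):
        # One pass: the lanes serve a contiguous slice of destination threads.
        for dest in range(start, min(start + num_lanes, group_size)):
            result[dest] = values[source_of[dest]]
        passes += 1
    return result, passes


# Hypothetical 8-thread group served by a 4-lane circuit: reverse the group.
values = [10, 11, 12, 13, 14, 15, 16, 17]
reversed_group, passes = permute_thread_group(
    values, [7, 6, 5, 4, 3, 2, 1, 0], num_lanes=4
)
```

With these assumed sizes the permutation completes in two passes, which is the trade-off the abstract points at: fewer lanes than threads, at the cost of handling the group in more than one step.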
-
Patent number: 10726606
Abstract: When a shader program is to be executed by a graphics processor, the graphics processor is caused to execute at least two variants of the shader program and the operation of the graphics processor when executing execution threads for the different variants of the shader program is monitored. A variant of the shader program to be executed by subsequent execution threads that are to execute the shader program is then selected based on the monitoring of the operation of the shading stage when executing the execution threads for the different variants of the shader program.
Type: Grant
Filed: February 19, 2019
Date of Patent: July 28, 2020
Assignee: Arm Limited
Inventors: Peter William Harris, Mladen Wilder
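The selection loop described in this abstract can be sketched as follows. This is a minimal model under stated assumptions: each variant is a Python function that returns its output together with a simulated cycle count, and the monitored quantity is total cycles. The variant names and the cost figures are hypothetical; the abstract does not say which aspect of the graphics processor's operation is monitored.

```python
def select_variant(variants, trial_threads):
    """Run every variant on the trial threads and return the cheapest one.

    variants: dict mapping a name to a function(thread_input) -> (output, cycles).
    """
    total_cycles = {name: 0 for name in variants}
    for name, shader in variants.items():
        for thread_input in trial_threads:
            _, cycles = shader(thread_input)
            total_cycles[name] += cycles  # the "monitoring" step
    return min(total_cycles, key=total_cycles.get)


def variant_branchy(x):
    # Hypothetical variant: cheap for small inputs, expensive otherwise.
    return x * 2, (4 if x < 8 else 20)


def variant_unrolled(x):
    # Hypothetical variant: constant cost regardless of input.
    return x * 2, 10


variants = {"branchy": variant_branchy, "unrolled": variant_unrolled}
chosen = select_variant(variants, trial_threads=[1, 2, 30, 40])

# Subsequent threads execute only the selected variant.
outputs = [variants[chosen](x)[0] for x in [5, 6, 7]]
```

Both variants compute the same result, so the choice affects only cost. Here the trial threads total 48 simulated cycles for `branchy` and 40 for `unrolled`, so later threads run `unrolled`.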
-
Publication number: 20200233726
Abstract: A data processing system including a data processor which is operable to execute programs to perform data processing operations and in which execution threads executing a program to perform data processing operations may be grouped together into thread groups. The data processor comprises a cross-lane permutation circuit which is operable to perform processing for cross-lane instructions which require data to be permuted (copied or moved) between the threads of a thread group. The cross-lane permutation circuit has plural data lanes between which data may be permuted (moved or copied). The number of data lanes is fewer than the number of threads in a thread group.
Type: Application
Filed: January 22, 2019
Publication date: July 23, 2020
Applicant: Arm Limited
Inventors: Luka Dejanovic, Mladen Wilder
-
Publication number: 20190259193
Abstract: When a shader program is to be executed by a graphics processor, the graphics processor is caused to execute at least two variants of the shader program and the operation of the graphics processor when executing execution threads for the different variants of the shader program is monitored. A variant of the shader program to be executed by subsequent execution threads that are to execute the shader program is then selected based on the monitoring of the operation of the shading stage when executing the execution threads for the different variants of the shader program.
Type: Application
Filed: February 19, 2019
Publication date: August 22, 2019
Applicant: Arm Limited
Inventors: Peter William Harris, Mladen Wilder
-
Patent number: 9304926
Abstract: A coherent memory system includes a plurality of level 1 cache memories 6 connected via interconnect circuitry 18 to a level 2 cache memory 8. Coherency control circuitry 10 manages coherency between lines of data. Evict messages from the level 1 cache memories to the coherency control circuitry 10 are sent via the read address channel AR. Read messages are also sent via the read address channel AR. The read address channel AR is configured such that a read message may not be reordered relative to an evict message. The coherency control circuitry 10 is configured such that a read message will not be processed ahead of an evict message. The level 1 cache memories 6 do not track in-flight evict messages. No acknowledgement of an evict message is sent from the coherency control circuitry 10 back to the level 1 cache memory 6.
Type: Grant
Filed: July 23, 2013
Date of Patent: April 5, 2016
Assignee: ARM Limited
Inventors: Ian Bratt, Mladen Wilder, Ole Henrik Jahren
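The ordering guarantee in this abstract can be modelled with a single in-order queue. The sketch below is a simplified illustration, not the patented circuit: the class name `CoherencyController`, the `holders` bookkeeping, and the message tuples are assumptions introduced for the example.

```python
from collections import deque


class CoherencyController:
    """Toy model: evicts and reads share one in-order channel (AR)."""

    def __init__(self):
        self.channel = deque()  # models the in-order read address channel
        self.holders = {}       # line address -> set of L1 cache ids holding it

    def send(self, kind, cache_id, line):
        # The L1 cache sends and forgets: no in-flight tracking, no ack.
        self.channel.append((kind, cache_id, line))

    def process_all(self):
        # Messages are handled strictly in arrival order, so a read can
        # never be processed ahead of an earlier evict.
        while self.channel:
            kind, cache_id, line = self.channel.popleft()
            holders = self.holders.setdefault(line, set())
            if kind == "evict":
                holders.discard(cache_id)
            elif kind == "read":
                holders.add(cache_id)


ctrl = CoherencyController()
ctrl.send("read", 0, 0x40)   # cache 0 fetches the line
ctrl.send("evict", 0, 0x40)  # cache 0 drops it
ctrl.send("read", 0, 0x40)   # cache 0 fetches it again
ctrl.process_all()
```

Because the queue preserves order, the controller ends up recording cache 0 as a holder of line `0x40`. If the second read could overtake the evict, the controller would wrongly conclude that the line had been dropped, which is the hazard that acknowledgements or in-flight tracking would otherwise have to prevent.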
-
Publication number: 20150032969
Abstract: A coherent memory system includes a plurality of level 1 cache memories 6 connected via interconnect circuitry 18 to a level 2 cache memory 8. Coherency control circuitry 10 manages coherency between lines of data. Evict messages from the level 1 cache memories to the coherency control circuitry 10 are sent via the read address channel AR. Read messages are also sent via the read address channel AR. The read address channel AR is configured such that a read message may not be reordered relative to an evict message. The coherency control circuitry 10 is configured such that a read message will not be processed ahead of an evict message. The level 1 cache memories 6 do not track in-flight evict messages. No acknowledgement of an evict message is sent from the coherency control circuitry 10 back to the level 1 cache memory 6.
Type: Application
Filed: July 23, 2013
Publication date: January 29, 2015
Applicant: ARM LIMITED
Inventors: Ian BRATT, Mladen WILDER, Ole Henrik JAHREN
-
Patent number: 8595280
Abstract: A data processing apparatus and method for performing multiply-accumulate operations is provided. The data processing apparatus includes data processing circuitry responsive to control signals to perform data processing operations on at least one input data element.
Type: Grant
Filed: October 29, 2010
Date of Patent: November 26, 2013
Assignee: ARM Limited
Inventors: Dominic Hugo Symes, Mladen Wilder, Guy Larri
-
Patent number: 8473819
Abstract: An electronic device is described which receives data from a transmitting device via a communications channel. The electronic device comprises digital processing circuitry arranged to process the data received via the communications channel to generate output data, error detection circuitry arranged to detect errors in the output data, and monitoring circuitry arranged to monitor the quality of digital processing conducted by the digital processing circuitry and generate digital performance data indicative of the monitored quality of digital processing. The electronic device also comprises control circuitry responsive to error information comprising errors detected by the error detection circuitry and the performance data generated by the monitoring circuitry to modify the operation of one or both of the transmitting device and the electronic device.
Type: Grant
Filed: July 15, 2009
Date of Patent: June 25, 2013
Assignee: ARM Limited
Inventors: Daniel Kershaw, David Michael Bull, Mladen Wilder
-
Patent number: 8443170
Abstract: An apparatus and method for performing SIMD multiply-accumulate operations includes SIMD data processing circuitry responsive to control signals to perform data processing operations in parallel on multiple data elements. Instruction decoder circuitry is coupled to the SIMD data processing circuitry and is responsive to program instructions to generate the required control signals. The instruction decoder circuitry is responsive to a single instruction (referred to herein as a repeating multiply-accumulate instruction) having as input operands a first vector of input data elements, a second vector of coefficient data elements, and a scalar value indicative of a plurality of iterations required, to generate control signals to control the SIMD processing circuitry.
Type: Grant
Filed: September 17, 2009
Date of Patent: May 14, 2013
Assignee: ARM Limited
Inventors: Mladen Wilder, Dominic Hugo Symes, Richard Edward Bruce
-
Patent number: 8423752
Abstract: An apparatus for processing data is provided comprising processing circuitry having permutation circuitry for performing permutation operations, a register bank having a plurality of registers for storing data and control circuitry responsive to program instructions to control the processing circuitry to perform data processing operations. The control circuitry is arranged to be responsive to a control-generating instruction to generate in dependence upon a bit-mask control signals to configure permutation circuitry for performing a permutation operation on an input operand. The bit-mask identifies within the input operand a first group of data elements having a first ordering and a second group of data elements having a second ordering and the permutation operation is such that it preserves one of the first ordering and the second ordering but changes the other of the first ordering and the second ordering.
Type: Grant
Filed: December 16, 2008
Date of Patent: April 16, 2013
Assignee: ARM Limited
Inventors: Dominic Hugo Symes, Mladen Wilder
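One possible reading of the bit-mask permutation in this abstract is sketched below. The abstract says only that one group's ordering is preserved and the other's is changed; gathering the first group at the front and reversing the second group are assumptions made for illustration, as is the function name `permute_by_mask`.

```python
def permute_by_mask(elements, mask_bits):
    """Split elements into two groups by mask bit, then recombine.

    Assumptions: elements with mask bit 1 form the first group and keep
    their relative order; elements with mask bit 0 form the second group
    and are emitted in reverse order.
    """
    first = [e for e, bit in zip(elements, mask_bits) if bit == 1]
    second = [e for e, bit in zip(elements, mask_bits) if bit == 0]
    return first + second[::-1]


result = permute_by_mask(
    ["a", "b", "c", "d", "e", "f"],
    [1, 0, 1, 0, 0, 1],
)
```

With this mask the first group is `a, c, f` (order kept) and the second group is `b, d, e` (order reversed), giving `a, c, f, e, d, b`.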
-
Patent number: 8255446
Abstract: An apparatus and method are provided for performing rearrangement operations and arithmetic operations on data. The data processing apparatus has processing circuitry for performing Single Instruction Multiple Data (SIMD) processing operations and scalar processing operations, a register bank for storing data and control circuitry responsive to program instructions to control the processing circuitry to perform data processing operations. The control circuitry is arranged to be responsive to a combined rearrangement arithmetic instruction to control the processing circuitry to perform a rearrangement operation and at least one SIMD arithmetic operation on a plurality of data elements stored in the register bank. The rearrangement operation is configurable by a size parameter derived at least in part from the register bank. The size parameter provides an indication of a number of data elements forming a rearrangement element for the purposes of the rearrangement operation.
Type: Grant
Filed: November 29, 2007
Date of Patent: August 28, 2012
Assignee: ARM Limited
Inventors: Daniel Kershaw, Mladen Wilder, Dominic Hugo Symes
-
Publication number: 20110185262
Abstract: An electronic device is described which receives data from a transmitting device via a communications channel. The electronic device comprises digital processing circuitry arranged to process the data received via the communications channel to generate output data, error detection circuitry arranged to detect errors in the output data, and monitoring circuitry arranged to monitor the quality of digital processing conducted by the digital processing circuitry and generate digital performance data indicative of the monitored quality of digital processing. The electronic device also comprises control circuitry responsive to error information comprising errors detected by the error detection circuitry and the performance data generated by the monitoring circuitry to modify the operation of one or both of the transmitting device and the electronic device.
Type: Application
Filed: July 15, 2009
Publication date: July 28, 2011
Inventors: Daniel Kershaw, David Michael Bull, Mladen Wilder
-
Publication number: 20110106871
Abstract: A data processing apparatus and method for performing multiply-accumulate operations is provided. The data processing apparatus includes data processing circuitry responsive to control signals to perform data processing operations on at least one input data element.
Type: Application
Filed: October 29, 2010
Publication date: May 5, 2011
Applicant: ARM LIMITED
Inventors: Dominic Hugo Symes, Mladen Wilder, Guy Larri
-
Publication number: 20110093863
Abstract: A data engine that can be interrupted is disclosed, the data engine comprising a plurality of elements for storing, routing and processing the data, the plurality of elements comprising: processing elements for processing the data; registers for storing the data being processed; the data processing engine being configured to receive a clock signal and in response to the clock signal to periodically transmit a plurality of the control signals to a corresponding plurality of the elements in parallel; the data engine further comprising: control circuitry configured in response to receipt of an external interrupt request: to pause transmission of the control signals to the elements and to transmit a copy of the register data stored in the plurality of registers to a store; to transmit in parallel a next plurality of the control signals in the stream of control signals to a corresponding plurality of the elements, and to transmit a copy of output data output by the processing elements in response to the next plurali
Type: Application
Filed: October 21, 2009
Publication date: April 21, 2011
Inventors: Jef Louis Verdonck, Mladen Wilder, Johan Matterne
-
Patent number: 7895417
Abstract: A data processing system 2 is provided including an instruction decoder 34 responsive to program instructions within an instruction register 32 to generate control signals for controlling data processing circuitry 36. The instructions supported include an address calculation instruction which splits an input address value at a position dependent upon a size value into a first portion and second portion, adds a non-zero offset value to the first portion, sets the second portion to a value and then concatenates the result of the processing on the first portion and the second portion to form the output address value. Another type of instruction supported is a select-and-insert instruction. This instruction takes a first input value and shifts it by N bit positions to form a shifted value, selects N bits from within a second input value in dependence upon the first input value and then concatenates the shifted value with the N bits to form an output value.
Type: Grant
Filed: April 30, 2010
Date of Patent: February 22, 2011
Assignee: ARM Limited
Inventors: Dominic Hugo Symes, Daniel Kershaw, Mladen Wilder
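The two instructions described in this abstract can be approximated in a few lines of bit arithmetic. This is a sketch under stated assumptions: the function names, the choice of the upper bits as the "first portion", and the rule that the low two bits of the first input pick the N-bit field are all hypothetical, and Python integers are unbounded, so no register-width wraparound is modelled.

```python
def address_calc(address, size_bits, offset, low_value=0):
    """Split `address` at bit `size_bits`, offset the upper part, set the lower part."""
    if offset == 0:
        raise ValueError("the abstract specifies a non-zero offset")
    low_mask = (1 << size_bits) - 1
    upper = (address >> size_bits) + offset  # first portion plus offset
    lower = low_value & low_mask             # second portion set to a value
    return (upper << size_bits) | lower      # concatenate the two portions


def select_and_insert(first, second, n):
    """Shift `first` left by n bits and append n bits selected from `second`.

    Assumption: the low two bits of `first` choose which n-bit field of
    `second` is inserted; the abstract does not give the selection rule.
    """
    field_index = first & 0x3
    field = (second >> (field_index * n)) & ((1 << n) - 1)
    return (first << n) | field


next_block = address_calc(0x1234, size_bits=8, offset=1)
next_block_tagged = address_calc(0x1234, size_bits=8, offset=1, low_value=0xAB)
inserted = select_and_insert(0b10, 0xDCBA, n=4)
```

Here `next_block` is `0x1300` and `next_block_tagged` is `0x13AB`: the upper portion `0x12` is incremented and the lower eight bits are replaced. For `select_and_insert`, the first input `0b10` selects field 2 of `0xDCBA` (the nibble `0xC`), giving `0x2C`.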
-
Publication number: 20100274990
Abstract: An apparatus and method for performing SIMD multiply-accumulate operations includes SIMD data processing circuitry responsive to control signals to perform data processing operations in parallel on multiple data elements. Instruction decoder circuitry is coupled to the SIMD data processing circuitry and is responsive to program instructions to generate the required control signals. The instruction decoder circuitry is responsive to a single instruction (referred to herein as a repeating multiply-accumulate instruction) having as input operands a first vector of input data elements, a second vector of coefficient data elements, and a scalar value indicative of a plurality of iterations required, to generate control signals to control the SIMD processing circuitry.
Type: Application
Filed: September 17, 2009
Publication date: October 28, 2010
Inventors: Mladen Wilder, Dominic Hugo Symes, Richard Edward Bruce
-
Patent number: 7814302
Abstract: A data processing system 2 is provided including an instruction decoder 34 responsive to program instructions within an instruction register 32 to generate control signals for controlling data processing circuitry 36. The instructions supported include an address calculation instruction which splits an input address value at a position dependent upon a size value into a first portion and second portion, adds a non-zero offset value to the first portion, sets the second portion to a value and then concatenates the result of the processing on the first portion and the second portion to form the output address value. Another type of instruction supported is a select-and-insert instruction. This instruction takes a first input value and shifts it by N bit positions to form a shifted value, selects N bits from within a second input value in dependence upon the first input value and then concatenates the shifted value with the N bits to form an output value.
Type: Grant
Filed: February 13, 2008
Date of Patent: October 12, 2010
Assignee: ARM Limited
Inventors: Dominic Hugo Symes, Daniel Kershaw, Mladen Wilder
-
Publication number: 20100217958
Abstract: A data processing system 2 is provided including an instruction decoder 34 responsive to program instructions within an instruction register 32 to generate control signals for controlling data processing circuitry 36. The instructions supported include an address calculation instruction which splits an input address value at a position dependent upon a size value into a first portion and second portion, adds a non-zero offset value to the first portion, sets the second portion to a value and then concatenates the result of the processing on the first portion and the second portion to form the output address value. Another type of instruction supported is a select-and-insert instruction. This instruction takes a first input value and shifts it by N bit positions to form a shifted value, selects N bits from within a second input value in dependence upon the first input value and then concatenates the shifted value with the N bits to form an output value.
Type: Application
Filed: April 30, 2010
Publication date: August 26, 2010
Applicant: ARM Limited
Inventors: Dominic Hugo Symes, Daniel Kershaw, Mladen Wilder
-
Publication number: 20090254736
Abstract: An apparatus for processing data is provided comprising rearrangement circuitry having a plurality of rearrangement stages for rearranging a plurality N of input data elements, each rearrangement stage comprising at most N multiplexers arranged to select between M data elements where M is an integer less than N. Control circuitry is provided that is responsive to program instructions to control the rearrangement circuitry to perform rearrangement operations. The rearrangement circuitry is configurable by the control circuitry to perform a plurality of different rearrangement operations. The rearrangement circuitry comprises main rearrangement circuitry having a plurality of rearrangement stages in which there is a unique path between any given input element and any given output element and supplementary rearrangement circuitry in which from each input data element there is a path to at most C output data elements where 1<C<N/2.
Type: Application
Filed: April 7, 2008
Publication date: October 8, 2009
Applicant: ARM Limited
Inventors: Dominic Hugo Symes, Mladen Wilder