Patents by Inventor Andreas Due Engh-Halstvedt
Andreas Due Engh-Halstvedt has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Patent number: 9933999
Abstract: An apparatus, method and program are provided for calculating a result value to a required precision of a repeating iterative sum, wherein the repeating iterative sum comprises multiple iterations of an addition using an input value. Addition is performed in a single iteration of addition as a sum operation using overlapping portions of the input value and a shifted version of the input value, wherein the shifted version of the input value has a partial overlap with the input value. At least one result portion is produced by incrementing an input derived from the input value using the output from the sum operation and the result value is constructed using the at least one result portion to give the result value to the required precision. The repeating iterative sum is thereby flattened into a flattened calculation which requires only a single iteration of addition using the input value, thus facilitating the calculation of the result value of the repeating iterative sum.
Type: Grant
Filed: October 8, 2015
Date of Patent: April 3, 2018
Assignee: ARM Limited
Inventors: Andreas Due Engh-Halstvedt, Edvard Fielding
-
Patent number: 9927882
Abstract: A graphics processing unit 2 includes a texture pipeline 6 having a first pipeline portion 18 and a second pipeline portion 20. A subject instruction within the first pipeline portion 18 is recirculated within the first pipeline portion 18 until descriptor data to be loaded from a memory 4 by that subject instruction has been cached within a shared descriptor cache 22. When the descriptor has been stored within the shared descriptor cache 22, then the subject instruction is passed to the second pipeline portion 20 where further processing operations are performed and the subject instruction recirculated until those further processing operations have completed. The descriptor data is locked within the shared descriptor cache 22 until there are no pending subject instructions within the texture pipeline 6 which are required to use that descriptor data.
Type: Grant
Filed: April 30, 2012
Date of Patent: March 27, 2018
Assignee: ARM Limited
Inventors: Andreas Due Engh-Halstvedt, Jorn Nystad
-
Patent number: 9865065
Abstract: A graphics processing pipeline includes processing circuitry. The processing circuitry is configured to determine attribute information for an object to be rendered for a set of sampling points from a compressed representation of attribute information associated with the object, when the set of sampling points is being processed by the graphics processing pipeline to generate a render output. The processing circuitry is also configured to use the determined attribute information to control the processing of the set of sampling points by the graphics processing pipeline when generating the render output.
Type: Grant
Filed: February 22, 2016
Date of Patent: January 9, 2018
Assignee: Arm Limited
Inventors: Peter William Harris, Sandeep Kakarlapudi, Andreas Due Engh-Halstvedt
-
Publication number: 20170330372
Abstract: A graphics processing pipeline comprises vertex shading circuitry that operates to vertex shade position attributes of vertices of a set of vertices to be processed by the graphics processing pipeline, to generate, inter alia, a separate vertex shaded position attribute value for each view of the plural different views. Tiling circuitry then determines for the vertices that have been subjected to the first vertex shading operation, whether the vertices should be processed further. Vertex shading circuitry then performs a second vertex shading operation on the vertices that it has been determined should be processed further, to vertex shade the remaining vertex attributes for each vertex that it has been determined should be processed further, to generate, inter alia, a single vertex shaded attribute value for the set of plural views.
Type: Application
Filed: May 15, 2017
Publication date: November 16, 2017
Applicant: ARM Limited
Inventors: Sandeep Kakarlapudi, Jorn Nystad, Andreas Due-Engh Halstvedt
-
Publication number: 20170315805
Abstract: Processing circuitry performs processing operations specified by program instructions. An instruction decoder decodes an atomic-add-with-carry instruction AAD-DC to control the processing circuitry to perform an atomic operation of an add of an addend operand value and a data value stored in a memory to generate a result value stored in the memory and a carry value indicative of whether or not the add generated a carry out.
Type: Application
Filed: November 3, 2015
Publication date: November 2, 2017
Inventor: Andreas Due ENGH-HALSTVEDT
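The atomic-add-with-carry behaviour described in this abstract can be illustrated with a minimal software sketch. Everything below is an assumption made for illustration: the `Memory` class, the 32-bit word width, and the use of a lock to stand in for hardware atomicity are hypothetical and are not taken from the application.

```python
import threading


class Memory:
    """Toy word-addressed memory (hypothetical); a lock stands in for hardware atomicity."""

    def __init__(self, size, width=32):
        self.width = width                  # assumed word width in bits
        self.mask = (1 << width) - 1
        self.words = [0] * size
        self._lock = threading.Lock()

    def atomic_add_with_carry(self, addr, addend):
        """Add addend to the word at addr as one indivisible step.

        The truncated sum is stored back to memory and the carry out
        (0 or 1) is returned to the caller.
        """
        with self._lock:
            total = self.words[addr] + (addend & self.mask)
            self.words[addr] = total & self.mask
            return total >> self.width


mem = Memory(2)
mem.words[0] = 0xFFFFFFFF                   # low word about to overflow
carry = mem.atomic_add_with_carry(0, 1)     # low word wraps to 0, carry is 1
mem.atomic_add_with_carry(1, carry)         # feed the carry into the high word
```

One possible use, shown in the last three lines, is to feed the returned carry into the next word of a wider value. In this sketch the two adds are each atomic but the pair is not.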
-
Publication number: 20170285955
Abstract: A data array to be stored is first divided into a plurality of blocks. Each block is further sub-divided into a set of sub-blocks. Data representing sub-blocks of the data array is stored, together with a header data block for each block that the data array has been divided into. For each block, it is determined whether all the data positions for the block have the same data value associated with them, and, if so, an indication that all of the data positions within the block have the same data value associated with them, and an indication of the same data value that is associated with each of the data positions in the block, is stored in the header data block for that block of the data array.
Type: Application
Filed: March 29, 2017
Publication date: October 5, 2017
Applicant: ARM Limited
Inventors: Quinn Carter, Lars Oskar Flordal, Jakob Axel Fries, Andreas Due Engh-Halstvedt
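The per-block header check described in this abstract can be sketched roughly as follows. The function names, the dictionary header layout, and the 2x2 block size are hypothetical choices for illustration, and the further division into sub-blocks is omitted for brevity.

```python
def split_blocks(array, size):
    """Divide a 2D list into size x size blocks, in row-major block order."""
    height, width = len(array), len(array[0])
    return [
        [row[x:x + size] for row in array[y:y + size]]
        for y in range(0, height, size)
        for x in range(0, width, size)
    ]


def encode_block(block):
    """Return (header, payload) for one block.

    If every data position holds the same value, the header records that
    fact and the value itself, and no payload is stored. Otherwise the
    header marks the block as non-constant and the payload holds the data.
    """
    values = [v for row in block for v in row]
    if all(v == values[0] for v in values):
        return {"constant": True, "value": values[0]}, None
    return {"constant": False}, values


array = [
    [7, 7, 1, 2],
    [7, 7, 3, 4],
    [0, 0, 5, 5],
    [0, 0, 5, 5],
]
encoded = [encode_block(b) for b in split_blocks(array, 2)]
```

In this example three of the four blocks are constant, so only one block needs a stored payload.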
-
Patent number: 9753735
Abstract: A data processing system includes a processing pipeline for the parallel execution of a plurality of threads. An issue controller issues threads to the processing pipeline. A stall manager controls the stalling and unstalling of threads when a cache miss occurs within a cache memory. The issue controller issues the threads to the processing pipeline in accordance with both a main sequence and a pilot sequence. The pilot sequence is followed such that threads within the pilot sequence are issued at least a given time ahead of their neighbors within a main sequence. The given time corresponds approximately to the latency associated with a cache miss. The threads may be arranged in groups corresponding to blocks of pixels for processing within a graphics processing unit.
Type: Grant
Filed: January 14, 2015
Date of Patent: September 5, 2017
Assignee: ARM Limited
Inventors: Andreas Due Engh-Halstvedt, Ian Victor Devereux, David Bermingham, Jakob Axel Fries, Oskar Lars Flordal
-
Publication number: 20170193691
Abstract: A graphics processing pipeline includes position shading circuitry, a tiler, varying-only vertex shading circuitry and fragment (frontend) shading circuitry. The tiler reads a list of indices defining a set of vertices to be processed by the graphics processing pipeline and determines whether or not vertex shading is required for the positional attributes of the vertices. If vertex shading is required, the tiler sends a position shading request for the vertices to the position shading circuitry. The tiler uses the vertex shaded position data to identify primitives that should be processed further to generate the render output and that accordingly should be subjected to a second, varying shading, vertex shading operation. When the tiler determines that a vertex (or group of vertices) should be subjected to the second, varying shading, vertex shading operation, the tiler sends a varying shading request for the vertex (or vertices) to the varying shading circuitry.
Type: Application
Filed: December 28, 2016
Publication date: July 6, 2017
Applicant: ARM Limited
Inventors: Frank Langtind, Andreas Due Engh-Halstvedt, Sandeep Kakarlapudi
-
Publication number: 20170061678
Abstract: A tile-based graphics processing pipeline includes a back-facing determination and culling unit that is operable to cull back-facing triangles before the tiling stage. The back-facing determination and culling unit includes a triangle size estimator that estimates the size of a triangle being considered. If the size of the triangle is less than a selected size, then the area of the triangle is calculated using fixed point arithmetic and the result of that area calculation is used by a back-face culling unit to determine whether to cull the triangle or not. On the other hand, if the size estimator determines that the primitive is greater than the selected size, then the triangle bypasses the fixed point area calculation and back-face culling unit and is instead passed directly to the tiler.
Type: Application
Filed: August 25, 2016
Publication date: March 2, 2017
Applicant: ARM Limited
Inventors: Andreas Due Engh-Halstvedt, Frank Langtind
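The decision flow in this abstract (estimate the size, then either cull using a fixed-point area or bypass to the tiler) can be sketched as below. The 8 fractional bits, the bounding-box size estimate, the threshold of 64, and the convention that counter-clockwise triangles are front-facing are all assumptions made for illustration; the application does not specify them here.

```python
FRACTION_BITS = 8  # assumed fixed-point format: 8 fractional bits


def to_fixed(x):
    """Convert a coordinate to the assumed fixed-point representation."""
    return int(round(x * (1 << FRACTION_BITS)))


def estimate_size(tri):
    """Cheap size estimate: the larger side of the triangle's bounding box."""
    xs = [p[0] for p in tri]
    ys = [p[1] for p in tri]
    return max(max(xs) - min(xs), max(ys) - min(ys))


def signed_area_x2_fixed(tri):
    """Twice the signed area, computed on fixed-point coordinates."""
    (x0, y0), (x1, y1), (x2, y2) = [(to_fixed(x), to_fixed(y)) for x, y in tri]
    return (x1 - x0) * (y2 - y0) - (x2 - x0) * (y1 - y0)


def route_triangle(tri, selected_size=64.0):
    """Return 'bypass', 'culled' or 'tiler' for one triangle."""
    if estimate_size(tri) >= selected_size:
        return "bypass"   # large triangle: skip the area test, go to the tiler
    if signed_area_x2_fixed(tri) < 0:
        return "culled"   # assumed convention: negative area is back-facing
    return "tiler"        # small front-facing triangle proceeds to the tiler


front = route_triangle([(0, 0), (1, 0), (0, 1)])      # counter-clockwise
back = route_triangle([(0, 0), (0, 1), (1, 0)])       # clockwise
large = route_triangle([(0, 0), (0, 100), (100, 0)])  # clockwise but large
```

A plausible reason for the size limit, offered here only as an assumption, is that it bounds the magnitude of the fixed-point products so that a narrow multiplier suffices. Python integers do not overflow, so the sketch shows only the control flow.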
-
Publication number: 20170024847
Abstract: A graphics processing unit 3 includes a rasteriser 25, a thread spawner 40, a programmable execution unit 41, a varying interpolator 42, a texture mapper 43, and a blender 29. The programmable execution unit 41 is able to communicate with the varying interpolator 42, the texture mapper 43 and the blender 29 to request processing operations by those graphics-specific accelerators. In addition to this, these graphics-specific accelerators are also able to communicate directly with each other and with the thread spawner 40, independently of the programmable execution unit 41. This allows for certain graphics processing operations to be performed using direct communication between the graphics-specific accelerators of the graphics processing unit, instead of executing instructions in the programmable execution unit to trigger the performance of those operations by the graphics-specific accelerators.
Type: Application
Filed: July 12, 2016
Publication date: January 26, 2017
Applicant: ARM Limited
Inventors: Andreas Due Engh-Halstvedt, David James Bermingham, Amir Kleen, Jørn Nystad, Kenneth Edvard Østby
-
Patent number: 9530241
Abstract: Techniques for performing clipping of graphics primitives 60 with respect to a clipping boundary 65 are described. The clipping step 10 may be performed separately for each tile of a graphics frame to be rendered, after a primitive list for the tile has been read from a primitive memory 38. Clipping may be performed only for larger primitives whose size exceeds a given threshold. Clipping of a primitive 60 to the clipping boundary 65 may be performed inexactly so that only a single clipped primitive is generated which may extend beyond the clipping boundary. A clipped primitive generated by clipping may be used for a depth function calculation of a primitive setup operation and not for an edge determination.
Type: Grant
Filed: November 7, 2014
Date of Patent: December 27, 2016
Assignee: ARM Limited
Inventors: Andreas Due Engh-Halstvedt, Frode Heggelund, Jørn Nystad
-
Patent number: 9430381
Abstract: A graphics processing unit 2 includes a texture pipeline 6 which performs filter operations upon texture values. If the texture values are integer texture values, then they may be processed by the texture pipeline in a variable order corresponding to the order in which they are retrieved from a memory 4. If the texture values are floating point texture values, then they are processed in a fixed order in order to ensure result invariance, as the filter operation is non-associative for floating point values. The filter operation is not commenced until all of the floating point texture values have been retrieved from the memory 4 and are available for processing.
Type: Grant
Filed: April 21, 2014
Date of Patent: August 30, 2016
Assignee: ARM Limited
Inventors: Andreas Due Engh-Halstvedt, Jorn Nystad
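The non-associativity that motivates the fixed processing order in this abstract is easy to demonstrate. The sketch below uses a plain running sum as a stand-in for a filter operation; the function name and the sample values are hypothetical and do not come from the patent.

```python
def accumulate(values):
    """Sum values strictly in the order given, as a stand-in for a filter."""
    total = values[0]
    for v in values[1:]:
        total = total + v
    return total


# Floating point texture values: the result depends on the processing order.
float_fixed_order = accumulate([0.1, 0.2, 0.3])
float_arrival_order = accumulate([0.3, 0.2, 0.1])
floats_differ = float_fixed_order != float_arrival_order

# Integer texture values: any order gives the same result.
int_fixed_order = accumulate([1, 2, 3])
int_arrival_order = accumulate([3, 2, 1])
ints_match = int_fixed_order == int_arrival_order
```

With IEEE 754 double precision, the two floating point sums differ in their last bit, while the integer sums are identical. This is why integer values can be consumed in whatever order memory returns them, whereas floating point values need a fixed order to give repeatable results.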
-
Publication number: 20160247249
Abstract: A graphics processing pipeline includes processing circuitry. The processing circuitry is configured to determine attribute information for an object to be rendered for a set of sampling points from a compressed representation of attribute information associated with the object, when the set of sampling points is being processed by the graphics processing pipeline to generate a render output. The processing circuitry is also configured to use the determined attribute information to control the processing of the set of sampling points by the graphics processing pipeline when generating the render output.
Type: Application
Filed: February 22, 2016
Publication date: August 25, 2016
Applicant: ARM Limited
Inventors: Peter William Harris, Sandeep Kakarlapudi, Andreas Due Engh-Halstvedt
-
Publication number: 20160239939
Abstract: In a graphics processing system, a driver for the graphics processing pipeline can include conditional graphics processing tasks in the graphics processing tasks that are to be executed by the graphics processing pipeline to generate a render output required by an application. Each such conditional task has associated with it a condition to be used by the graphics processing pipeline to determine whether to execute processing for the task or not and a region of the render output over which the processing for the task will be executed when the condition for the task is met. The graphics processing pipeline determines whether the condition associated with the task has been met, and only executes the processing for the task if the condition associated with the task has been met.
Type: Application
Filed: February 12, 2016
Publication date: August 18, 2016
Applicant: ARM Limited
Inventors: Sandeep Kakarlapudi, Andreas Due Engh-Halstvedt, Lars Oskar Flordal, Arne Bergene Fossaa
-
Patent number: 9411662
Abstract: A data processing apparatus comprises processing circuitry arranged to process processing threads using resources accessible to the processing circuitry. A pipeline is provided for handling at least two pending threads awaiting processing by the processing circuitry. The pipeline includes at least one resource-requesting pipeline stage for requesting access to resources for the pending threads. A priority controller controls priority levels of the pending threads. The priority levels define a priority with which pending threads are granted access to resources. When a pending thread reaches a final pipeline stage, if the requested resources are not yet available then the priority level of that thread is raised selectively and the thread is returned to a first pipeline stage of the pipeline. If the requested resources are available then the thread is forwarded from the pipeline.
Type: Grant
Filed: July 16, 2013
Date of Patent: August 9, 2016
Assignee: ARM Limited
Inventors: Nebojsa Makljenovic, Edvard Fielding, Andreas Due Engh-Halstvedt
-
Publication number: 20160179676
Abstract: A data processing system incorporates a write-back cache and supports load-and-clean program instructions. The action of a load-and-clean program instruction is to load a data value and to mark as clean at least a target portion within a cache line of the write-back cache which is storing the data value loaded. The data values to be subject to such load-and-clean instructions may be identified by the programmer as the last use of those data values, or may be identified by a compiler as the last use of those data values. The data values may be from a stack memory region in which their pattern of access is predictable and it is known when they are no longer required. Another example of regular memory accesses where the last access can be identified is when processing streaming media data.
Type: Application
Filed: December 2, 2015
Publication date: June 23, 2016
Inventors: Andreas Due ENGH-HALSTVEDT, Jørn NYSTAD
-
Publication number: 20160124708
Abstract: An apparatus, method and program are provided for calculating a result value to a required precision of a repeating iterative sum, wherein the repeating iterative sum comprises multiple iterations of an addition using an input value. Addition is performed in a single iteration of addition as a sum operation using overlapping portions of the input value and a shifted version of the input value, wherein the shifted version of the input value has a partial overlap with the input value. At least one result portion is produced by incrementing an input derived from the input value using the output from the sum operation and the result value is constructed using the at least one result portion to give the result value to the required precision. The repeating iterative sum is thereby flattened into a flattened calculation which requires only a single iteration of addition using the input value, thus facilitating the calculation of the result value of the repeating iterative sum.
Type: Application
Filed: October 8, 2015
Publication date: May 5, 2016
Inventors: Andreas Due ENGH-HALSTVEDT, Edvard FIELDING
-
Publication number: 20160110837
Abstract: A graphics processing apparatus and method of performing graphics processing are provided. The graphics processing apparatus comprises a sequence of processing stages capable of performing graphics processing to generate a frame of display data. The graphics processing is performed on a tile-by-tile basis. The graphics processing apparatus is capable of determining if a current tile subject to the graphics processing is empty. At least one processing stage of the sequence of processing stages is omitted for graphics processing of the current tile in dependence on whether the current tile is empty.
Type: Application
Filed: October 5, 2015
Publication date: April 21, 2016
Inventors: Isidoros SIDERIS, Michel Patrick Gabriel Emil IWANIEC, Andrew BURDASS, Nebojsa MAKLJENOVIC, Andreas Due ENGH-HALSTVEDT
-
Publication number: 20150227376
Abstract: A data processing system includes a processing pipeline for the parallel execution of a plurality of threads. An issue controller issues threads to the processing pipeline. A stall manager controls the stalling and unstalling of threads when a cache miss occurs within a cache memory. The issue controller issues the threads to the processing pipeline in accordance with both a main sequence and a pilot sequence. The pilot sequence is followed such that threads within the pilot sequence are issued at least a given time ahead of their neighbours within a main sequence. The given time corresponds approximately to the latency associated with a cache miss. The threads may be arranged in groups corresponding to blocks of pixels for processing within a graphics processing unit.
Type: Application
Filed: January 14, 2015
Publication date: August 13, 2015
Inventors: Andreas Due ENGH-HALSTVEDT, Ian Victor DEVEREUX, David BERMINGHAM, Jakob Alex FRIES, Oskar Lars FLORDAL
-
Publication number: 20150161814
Abstract: Techniques for performing clipping of graphics primitives 60 with respect to a clipping boundary 65 are described. The clipping step 10 may be performed separately for each tile of a graphics frame to be rendered, after a primitive list for the tile has been read from a primitive memory 38. Clipping may be performed only for larger primitives whose size exceeds a given threshold. Clipping of a primitive 60 to the clipping boundary 65 may be performed inexactly so that only a single clipped primitive is generated which may extend beyond the clipping boundary. A clipped primitive generated by clipping may be used for a depth function calculation of a primitive setup operation and not for an edge determination.
Type: Application
Filed: November 7, 2014
Publication date: June 11, 2015
Inventors: Andreas Due ENGH-HALSTVEDT, Frode Heggelund, Jørn Nystad