Patents by Inventor Jorn Nystad

Jorn Nystad has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 10331449
    Abstract: Various encoding schemes are discussed for more efficiently encoding instructions which identify first and second architectural register numbers. In the first example, by constraining the first architectural register number to be greater than the second architectural register number, this frees up encodings for use in encoding other operations. In a second example, the first and second architectural register numbers may take any value but one of a first type of processing operation and a second type of processing operation is selected depending on a comparison of the first and second architectural register numbers.
    Type: Grant
    Filed: January 22, 2016
    Date of Patent: June 25, 2019
    Assignee: ARM Limited
    Inventors: Simon Hosie, Jørn Nystad
  • Patent number: 10331574
    Abstract: A slave device communicates with a host system via a host communications bus. The host system includes one processing unit that can act as bus master and send access requests for slave resources on the slave device via the communications bus. The slave device platform includes a memory management unit, a programmable central processing unit and one slave resource. The memory management unit acts as an address translating device, and accepts requests with virtual addresses from a master device on the host system, translates the virtual addresses used in the access request to the “internal” physical address of the slave's resources and forwards the access to the appropriate physical resource. When an address miss occurs in the memory management unit, it passes the handling of the access request over to the controlling CPU which executes software to then resolve the address miss and handle the access request.
    Type: Grant
    Filed: October 19, 2012
    Date of Patent: June 25, 2019
    Assignee: ARM Norway AS
    Inventors: Jorn Nystad, Edvard Sorgard, Borgar Ljosland, Mario Blazevic
  • Patent number: 10331404
    Abstract: Apparatus for processing data includes processing circuitry 16, 18, 20, 22, 24, 26 and decoder circuitry 14 for decoding program instructions. The program instructions decoded include a floating point pre-conversion instruction which performs round-to-nearest ties to even rounding upon the mantissa field of an input floating point number to generate an output floating point number with the same mantissa length but with the mantissa rounded to a position corresponding to a shorter mantissa field. The output mantissa field includes a suffix of zero values concatenated with the rounded value. The decoder circuitry 14 is also responsive to an integer pre-conversion instruction to quantise an input integer value using round-to-nearest ties to even rounding to form an output integer operand with a number of significant bits matched to the mantissa size of a floating point number to which the integer is later to be converted using an integer-to-floating point conversion instruction.
    Type: Grant
    Filed: December 29, 2014
    Date of Patent: June 25, 2019
    Assignee: ARM Limited
    Inventors: Jorn Nystad, Andreas Due Engh-Halstvedt, Simon Alex Charles
  • Patent number: 10269168
    Abstract: When sampling a cube map when rendering in a graphics processing system, the vector representation of the desired cube map sample provided by the application is converted into a 2D position on one of the faces of the cube map for use by the texturing unit of the graphics processing pipeline. The determined 2D texture coordinates (S, T) are represented using standard 32-bit IEEE 754 floating point numbers, and the 3-bit face index for the cube map is included in one of the numbers representing the texture coordinates by packing it into the sign bit and the top two bits of the exponent to provide a modified texture coordinate value. The modified 32-bit texture coordinate representation is then provided together with the 32-bit floating point number corresponding to the other texture coordinate as the cube map descriptor to the texturing unit of the graphics processing pipeline.
    Type: Grant
    Filed: March 4, 2016
    Date of Patent: April 23, 2019
    Assignee: Arm Limited
    Inventor: Jorn Nystad
  • Publication number: 20190096025
    Abstract: A texture mapping apparatus, e.g. of a graphics processing unit, comprises texture fetching circuitry operable to receive a set of weight values for a convolution operation and fetch from memory a set of input data values on which the convolution operation is to be performed. The texture mapping apparatus further comprises texture filtering circuitry operable to perform a convolution operation using the set of received weight values and the set of fetched input data values. The texture mapping apparatus can allow a graphics processing unit to perform a variety of convolution operations in an efficient manner.
    Type: Application
    Filed: September 24, 2018
    Publication date: March 28, 2019
    Applicant: Arm Limited
    Inventors: Jorn Nystad, Carmelo Giliberto, Edvard Fielding
  • Publication number: 20190087155
    Abstract: There is provided an apparatus and method for comparing wide data types. The apparatus comprises processing circuitry to perform a plurality of comparison operations in order to compare a first value and a second value, each of the first value and the second value having a length greater than N bits, and each comparison operation operating on a corresponding N bits of the first and second values. The plurality of comparison operations are chained to form a sequence such that each comparison operation is arranged to output an accumulated comparison result incorporating the comparison results of any previous comparison operations in the sequence, and such that for each comparison operation other than a final comparison operation in the sequence the accumulated comparison result is provided for use as an input by a next comparison operation in the sequence.
    Type: Application
    Filed: May 25, 2016
    Publication date: March 21, 2019
    Inventor: Jørn NYSTAD
  • Publication number: 20190088009
    Abstract: A graphics processing apparatus comprises fragment generating circuitry to generate graphics fragments corresponding to graphics primitives, thread processing circuitry to perform threads of processing corresponding to the fragments, and forward kill circuitry to trigger a forward kill operation to prevent further processing of a target thread of processing corresponding to an earlier graphics fragment when the forward kill operation is enabled for the target thread and the earlier graphics fragment is determined to be obscured by one or more later graphics fragments. The thread processing circuitry supports enabling of the forward kill operation for a thread including at least one forward kill blocking instruction having a property indicative that the forward kill operation should be disabled for the given thread, when the thread processing circuitry has not yet reached a portion of the thread including the at least one forward kill blocking instruction.
    Type: Application
    Filed: September 12, 2018
    Publication date: March 21, 2019
    Inventors: Stephane FOREY, Jørn NYSTAD, Reimar Gisbert DÖFFINGER, Kenneth Edvard ØSTBY, Toni Viki BRKIC
  • Patent number: 10230376
    Abstract: An apparatus and method are provided, the apparatus comprising: storage circuitry to store an input data value; divider circuitry to split the input data value into at least one sub-value in dependence on a number of lanes for a current iteration, each sub-value occupying a lane, and to operate on each sub-value to generate a quotient corresponding to the division of that sub-value by a divisor, wherein the divisor is an odd integer; remainder circuitry to operate on each sub-value to generate a remainder corresponding to the remainder of dividing that sub-value by the divisor; concatenation circuitry to concatenate each quotient to produce a concatenated division value, and to concatenate each remainder to produce a concatenated remainder value, in each subsequent iteration, the input data value being formed from the concatenated remainder value of a preceding iteration; and output circuitry to output, after a plurality of iterations, a result of adding the concatenated division values produced by said plura
    Type: Grant
    Filed: May 31, 2016
    Date of Patent: March 12, 2019
    Assignee: ARM Limited
    Inventor: Jørn Nystad
  • Patent number: 10223288
    Abstract: A slave device communicates with a host system via a host communications bus. The host system includes one processor that can act as bus master and send access requests for slave resources on the slave device via the communications bus. The slave device platform includes a memory management unit, a programmable central processor and one slave resource. The memory management unit acts as an address translating device, and accepts requests with virtual addresses from a master device on the host system, translates the virtual addresses used in the access request to the “internal” physical addresses of the slave's resources and forwards the accesses to the appropriate physical resource. When an address miss occurs in the memory management unit, it passes the handling of the access request over to the controlling CPU which executes software to then resolve the address miss and handle the access request.
    Type: Grant
    Filed: July 15, 2015
    Date of Patent: March 5, 2019
    Assignee: ARM NORWAY AS
    Inventors: Jorn Nystad, Edvard Sorgard, Borgar Ljosland, Mario Blazevic
  • Patent number: 10204391
    Abstract: A tile-based graphics processing pipeline that uses primitive lists that can encompass plural rendering tiles includes a primitive list reading unit that reads primitive lists for a tile being rendered to determine primitives to be processed for the tile and a rasterizer that rasterizes input primitives to generate graphics fragments to be processed. The pipeline further comprises a comparison unit between the primitive list reading unit and the rasterizer that for primitives that have been read from primitive lists that include plural rendering tiles, compares the location of the primitive in the render target to the location of the tile being rendered, and then either sends the primitive onwards to the rasterizer if the comparison determines that the primitive could lie at least partially within the tile, or does not send the primitive to the rasterizer if the comparison determines that the primitive definitely does not lie within the tile.
    Type: Grant
    Filed: June 4, 2013
    Date of Patent: February 12, 2019
    Assignee: Arm Limited
    Inventors: Frode Heggelund, Jorn Nystad
  • Patent number: 10204440
    Abstract: A graphics processing system generates interpolated vertex shaded attribute data for plural sampling points of plural fragments of a quad fragment that is being used to sample a primitive. The interpolated vertex shaded attribute data for the plural sampling points is generated using a reference position for the quad fragment that is defined with respect to a first coordinate system, together with rotated sampling point delta values for the primitive that are defined with respect to a second coordinate system. The rotated sampling point delta values allow the interpolated vertex shaded attribute data to be generated more efficiently for the plural sampling points.
    Type: Grant
    Filed: July 23, 2016
    Date of Patent: February 12, 2019
    Assignee: Arm Limited
    Inventors: Frode Heggelund, Jorn Nystad
  • Publication number: 20190019323
    Abstract: In a graphics processing system, when using a graphics texture that is stored in memory as YUV texture data, the YUV texture data is stored in the texture cache from which it is to be read when generating a render output such that the data values for a chrominance data element and its associated set of one or more luminance data elements of the texture are stored together as a group in the cache. The group of data in the cache is tagged with an identifier for the data values of the chrominance data element and its associated set of one or more luminance data elements that is useable to identify the chrominance data element and its associated set of one or more luminance data elements in the cache, and that is indicative of a position in the YUV graphics texture.
    Type: Application
    Filed: July 8, 2018
    Publication date: January 17, 2019
    Applicant: Arm Limited
    Inventors: Edvard Fielding, Jorn Nystad, Andreas Due Engh-Halstvedt
  • Patent number: 10176546
    Abstract: A data processing system determines for a stream of instructions to be executed, whether there are any instructions that can be re-ordered in the instruction stream (41) and assigns each such instruction to an instruction completion tracker and includes in the encoding for the instruction an indication of the instruction completion tracker it has been assigned to (42). For each instruction in the instruction stream, an indication of which instruction completion trackers, if any, the instruction depends on is also provided (43, 44). Then, when an instruction that is indicated as being dependent on an instruction completion tracker is to be executed, the status of the relevant instruction completion tracker is checked before executing the instruction.
    Type: Grant
    Filed: July 2, 2013
    Date of Patent: January 8, 2019
    Assignee: Arm Limited
    Inventor: Jorn Nystad
  • Patent number: 10157132
    Abstract: A method of operating a data processing system comprises maintaining a record of a set of processing passes to be performed by processing pass circuitry of the data processing system. The method comprises performing cycles of operation in which it is considered whether or not the data required for a subset of processing passes is stored in a local cache. The subset of processing passes that is considered in a subsequent scan of the record comprises at least one processing pass that was not considered in the previous scan of the record, regardless of whether or not the data considered in the previous scan is determined as being stored in the cache. The method provides an efficient way to identify processing passes that are ready to be performed.
    Type: Grant
    Filed: July 27, 2017
    Date of Patent: December 18, 2018
    Assignee: Arm Limited
    Inventors: Edvard Fielding, Andreas Due Engh-Halstvedt, Jorn Nystad, Antonio Garcia Guirado, William Robert Stoye, Ian Rudolf Bratt
  • Patent number: 10152763
    Abstract: The present disclosure relates to graphics processors and graphics processing systems. In the graphics processor, the rasterizer may operate to identify pairs of fragments for a primitive being rendered for which not all the sampling positions in the fragments are covered by the primitive. When the fragments reach the fragment shader, corresponding execution threads may be spawned for execution by the fragment shader to process the fragments. A first part of the fragment shader program that uses the helper threads of the thread groups may then be executed. There may then be a merge instruction in the fragment shader program which operates to cause the active threads of the thread groups to be merged into a single, combined thread group. Following this thread group merger, the remaining program steps of the fragment shader program may be executed for the merged thread group.
    Type: Grant
    Filed: July 23, 2016
    Date of Patent: December 11, 2018
    Assignee: Arm Limited
    Inventor: Jorn Nystad
  • Patent number: 10147202
    Abstract: To encode a texture to be used in a graphics processing system, the texture is first downscaled to generate a lower resolution representation of the texture (41). An upscaled version (42) of the lower resolution version of the texture is then compared to the original texture to determine a set of difference values indicating for each texel the difference between the value of the texel in the upscaled version of the texture and in the original texture (43). An encoded texture data block is then generated for each 8×8 block of texels in the original texture (44). Each encoded texture data block contains a base color value taken from the lower resolution representation of the texture and a set of index values indicating the difference data from the determined set of difference data to be used when decoding the block of texture data to generate the data values to be used for the texture data elements that the block of texture data represents.
    Type: Grant
    Filed: March 15, 2013
    Date of Patent: December 4, 2018
    Assignee: Arm Limited
    Inventors: Jorn Nystad, Anders Lassen
  • Publication number: 20180293765
    Abstract: A scene to be rendered is divided into plural individual sub-regions or tiles. The individual sub-regions (51) are also grouped into differing groups of sets of plural sub-regions. There is a top level layer comprising a set of 8×8 sub-regions which encompasses the entire scene area. There is then a group of four 4×4 sets of sub-regions, then a group of sixteen 2×2 sets of sub-regions, and finally a layer comprising the 64 single sub-regions. A primitive list building processor takes each primitive in turn, determines a location for that primitive, compares the primitive's location with the locations of the sub-regions and the locations of the sets of sub-regions, and allocates the primitive to respective primitive lists for the sub-regions and sets of sub-regions accordingly.
    Type: Application
    Filed: June 13, 2018
    Publication date: October 11, 2018
    Inventors: Edvard Sorgard, Borgar LJOSLAND, Jorn NYSTAD, Mario BLAZEVIC, Frank LANGTIND
  • Patent number: 10089783
    Abstract: A graphics processing pipeline comprises a tessellation stage that is configured to tessellate a patch into tessellation primitives. When tessellating the patch, the tessellation stage generates tessellation vertex coordinate pairs that define within a parameter space the locations of vertices of the tessellation primitives for the patch. The tessellation vertex coordinate pairs are initially represented using a first binary representation and are then encoded into a more convenient second binary representation, but without any loss of resolution in the data. The step of encoding comprises mapping at least one of the tessellation vertex coordinate pairs to a mapped coordinate pair that can be represented using the second binary representation, wherein the mapped coordinate pair defines a location within an area of the parameter space that would otherwise be unused, invalid and/or unreachable for the vertices of the tessellation primitives for the patch.
    Type: Grant
    Filed: July 25, 2016
    Date of Patent: October 2, 2018
    Assignee: Arm Limited
    Inventor: Jorn Nystad
  • Patent number: 10089709
    Abstract: A graphics processing unit 3 includes a rasterizer 25, a thread spawner 40, a programmable execution unit 41, a varying interpolator 42, a texture mapper 43, and a blender 29. The programmable execution unit 41 is able to communicate with the varying interpolator 42, the texture mapper 43 and the blender 29 to request processing operations by those graphics-specific accelerators. In addition to this, these graphics-specific accelerators are also able to communicate directly with each other and with the thread spawner 40, independently of the programmable execution unit 41. This allows for certain graphics processing operations to be performed using direct communication between the graphics-specific accelerators of the graphics processing unit, instead of executing instructions in the programmable execution unit to trigger the performance of those operations by the graphics-specific accelerators.
    Type: Grant
    Filed: July 12, 2016
    Date of Patent: October 2, 2018
    Assignee: Arm Limited
    Inventors: Andreas Due Engh-Halstvedt, David James Bermingham, Amir Kleen, Jørn Nystad, Kenneth Edvard Østby
  • Publication number: 20180211436
    Abstract: A programmable execution unit (42) of a graphics processor includes a functional unit (50) that is operable to execute instructions (51). The output of the functional unit (50) can both be written to a register file (46) and fed back directly as an input to the functional unit by means of a feedback circuit (52). Correspondingly, an instruction that is to be executed by the functional unit (50) can select as its inputs either the fed-back output (52) from the execution of the previous instruction, or inputs from the registers (46). A register access descriptor (54) between each instruction in a group of instructions (53) specifies the registers whose values will be available on the register ports that the functional unit will read when executing the instruction, and the register address where the result of the execution of the instruction will be written to. The programmable execution unit (42) executes groups of instructions (53) that are to be executed atomically.
    Type: Application
    Filed: July 22, 2016
    Publication date: July 26, 2018
    Applicant: Arm Limited
    Inventor: Jorn Nystad
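
The chained comparison described in the abstract of publication 20190087155 above can be illustrated with a short software sketch. This is only an informal model, not the claimed circuitry: the function names compare_chunk and compare_wide, the three-way result encoding (-1, 0, 1), the unsigned interpretation of the values, and the least-significant-first ordering are all assumptions made for the example.

```python
def compare_chunk(a_chunk: int, b_chunk: int, accumulated: int) -> int:
    """One N-bit comparison step (hypothetical model).

    Returns -1, 0 or 1. If the two chunks differ, this step decides the
    result; otherwise the accumulated result of earlier steps passes through.
    """
    if a_chunk < b_chunk:
        return -1
    if a_chunk > b_chunk:
        return 1
    return accumulated


def compare_wide(a: int, b: int, total_bits: int, n: int) -> int:
    """Compare two unsigned total_bits-wide values using chained n-bit steps.

    Chunks are visited from least to most significant, so a difference in a
    more significant chunk overrides anything decided by earlier chunks.
    """
    mask = (1 << n) - 1
    accumulated = 0  # "equal so far"
    for shift in range(0, total_bits, n):
        accumulated = compare_chunk((a >> shift) & mask,
                                    (b >> shift) & mask,
                                    accumulated)
    return accumulated
```

For example, compare_wide(x, y, 128, 32) compares two 128-bit values with four chained 32-bit steps, each step taking the previous step's accumulated result as an input.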