Patents by Inventor Yoong-Chert Foo

Yoong-Chert Foo has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 10725821
    Abstract: A method of scheduling instructions within a parallel processing unit is described. The method includes checking if an ALU targeted by a decoded instruction is full by checking a value of an ALU work fullness counter stored in the instruction controller and associated with the targeted ALU. If the targeted ALU is not full, the decoded instruction is sent to the targeted ALU for execution and the ALU work fullness counter associated with the targeted ALU is updated. If, however, the targeted ALU is full, a scheduler is triggered to de-activate the scheduled task by changing the scheduled task from the active state to a non-active state. When an ALU changes from being full to not being full, the scheduler is triggered to re-activate an oldest scheduled task waiting for the ALU by removing the oldest scheduled task from the non-active state.
    Type: Grant
    Filed: June 18, 2018
    Date of Patent: July 28, 2020
    Assignee: Imagination Technologies Limited
    Inventors: Simon Nield, Yoong-Chert Foo, Adam de Grasse, Luca Iuliano
  • Publication number: 20200226828
    Abstract: A method and system for generating two or three dimensional computer graphics images using multisample antialiasing (MSAA) is provided, which enables memory bandwidth to be conserved. For each of one or more pixels it is determined whether all of a plurality of sample areas of that pixel are located within a particular primitive. For those pixels where it is determined that all the sample areas of that pixel are located within that primitive, a value is stored in a multisample memory for a smaller number of the sample areas of that pixel than the total number of the sample areas of that pixel and data is stored indicating that all the sample areas of that pixel are located within that primitive.
    Type: Application
    Filed: March 28, 2020
    Publication date: July 16, 2020
    Inventors: Yoong Chert Foo, Salil Sahasrabudhe, Andrew Davy
  • Patent number: 10698690
    Abstract: Method and apparatus are provided for synchronising execution of a plurality of threads on a multi-threaded processor. A program executed by a thread can have a number of synchronisation points corresponding to points where execution is to be synchronised with another thread. Execution of a thread is paused when it reaches a synchronisation point until at least one other thread with which it is intended to be synchronised reaches a corresponding synchronisation point. Execution is subsequently resumed. A control core maintains status data for threads and can cause a thread that is ready to run to use execution resources that were occupied by a thread that is waiting for a synchronisation event.
    Type: Grant
    Filed: January 18, 2019
    Date of Patent: June 30, 2020
    Assignee: Imagination Technologies Limited
    Inventor: Yoong Chert Foo
  • Publication number: 20200202607
    Abstract: A decoder unit is configured to decode a plurality of texels in accordance with a texel request, the plurality of texels being encoded across one or more blocks of encoded texture data each encoding a block of texels, and includes a first set of one or more decoders, each of the first set of decoders being configured to decode n texels from a single received block of encoded texture data; a second set of one or more decoders, each of the second set of decoders being configured to decode p texels from a single received block of encoded texture data, where p<n; and control logic configured to allocate blocks of encoded texture data to the decoders in accordance with the texel request.
    Type: Application
    Filed: March 2, 2020
    Publication date: June 25, 2020
    Inventors: Yoong Chert Foo, Kenneth Rovers
  • Publication number: 20200201678
    Abstract: A SIMD microprocessor is configured to execute programs divided into discrete phases. A scheduler is provided for scheduling instructions. A plurality of resources are for executing instructions issued by the scheduler, wherein the scheduler is configured to schedule each phase of the program only after receiving an indication that execution of the preceding phase of the program has been completed. By splitting programs into multiple phases and providing a scheduler that is able to determine whether execution of a phase has been completed, each phase can be separately scheduled and the results of preceding phases can be used to inform the scheduling of subsequent phases. In one example, different numbers of threads and/or different numbers of data instances per thread may be processed for different phases of the same program.
    Type: Application
    Filed: February 29, 2020
    Publication date: June 25, 2020
    Inventor: Yoong Chert Foo
  • Patent number: 10679319
    Abstract: A SIMD processing unit processes a plurality of tasks which each include up to a predetermined maximum number of work items. The work items of a task are arranged for executing a common sequence of instructions on respective data items. The data items are arranged into blocks, with some of the blocks including at least one invalid data item. Work items which relate to invalid data items are invalid work items. The SIMD processing unit comprises a group of processing lanes configured to execute instructions of work items of a particular task over a plurality of processing cycles. A control module assembles work items into the tasks based on the validity of the work items, so that invalid work items of the particular task are temporally aligned across the processing lanes. In this way the number of wasted processing slots due to invalid work items may be reduced.
    Type: Grant
    Filed: April 17, 2019
    Date of Patent: June 9, 2020
    Assignee: Imagination Technologies Limited
    Inventors: John Howson, Jonathan Redshaw, Yoong Chert Foo
  • Patent number: 10636195
    Abstract: A decoder unit is configured to decode a plurality of texels in accordance with a texel request, the plurality of texels being encoded across one or more blocks of encoded texture data each encoding a block of texels, and includes a first set of one or more decoders, each of the first set of decoders being configured to decode n texels from a single received block of encoded texture data; a second set of one or more decoders, each of the second set of decoders being configured to decode p texels from a single received block of encoded texture data, where p<n; and control logic configured to allocate blocks of encoded texture data to the decoders in accordance with the texel request.
    Type: Grant
    Filed: April 28, 2018
    Date of Patent: April 28, 2020
    Assignee: Imagination Technologies Limited
    Inventors: Yoong Chert Foo, Kenneth Rovers
  • Patent number: 10614622
    Abstract: A method and system for generating two or three dimensional computer graphics images using multisample antialiasing (MSAA) is provided, which enables memory bandwidth to be conserved. For each of one or more pixels it is determined whether all of a plurality of sample areas of that pixel are located within a particular primitive. For those pixels where it is determined that all the sample areas of that pixel are located within that primitive, a value is stored in a multisample memory for a smaller number of the sample areas of that pixel than the total number of the sample areas of that pixel and data is stored indicating that all the sample areas of that pixel are located within that primitive.
    Type: Grant
    Filed: February 18, 2016
    Date of Patent: April 7, 2020
    Assignee: Imagination Technologies Limited
    Inventors: Yoong Chert Foo, Salil Sahasrabudhe, Andrew Davy
  • Patent number: 10585700
    Abstract: A microprocessor is configured to execute programs divided into discrete phases. A scheduler is provided for scheduling instructions. A plurality of resources are for executing instructions issued by the scheduler, wherein the scheduler is configured to schedule each phase of the program only after receiving an indication that execution of the preceding phase of the program has been completed. By splitting programs into multiple phases and providing a scheduler that is able to determine whether execution of a phase has been completed, each phase can be separately scheduled and the results of preceding phases can be used to inform the scheduling of subsequent phases. In one example, different numbers of threads and/or different numbers of data instances per thread may be processed for different phases of the same program.
    Type: Grant
    Filed: February 29, 2016
    Date of Patent: March 10, 2020
    Assignee: Imagination Technologies Limited
    Inventor: Yoong Chert Foo
  • Publication number: 20200073713
    Abstract: A method of scheduling tasks within a GPU or other highly parallel processing unit is described which is both age-aware and wakeup event driven. Tasks which are received are added to an age-based task queue. Wakeup event bits for task types, or combinations of task types and data groups, are set in response to completion of a task dependency and these wakeup event bits are used to select an oldest task from the queue that satisfies predefined criteria.
    Type: Application
    Filed: November 6, 2019
    Publication date: March 5, 2020
    Inventors: Simon Nield, Adam de Grasse, Luca Iuliano, Ollie Mower, Yoong-Chert Foo
  • Patent number: 10503547
    Abstract: A method of scheduling tasks within a GPU or other highly parallel processing unit is described which is both age-aware and wakeup event driven. Tasks which are received are added to an age-based task queue. Wakeup event bits for task types, or combinations of task types and data groups, are set in response to completion of a task dependency and these wakeup event bits are used to select an oldest task from the queue that satisfies predefined criteria.
    Type: Grant
    Filed: May 14, 2019
    Date of Patent: December 10, 2019
    Assignee: Imagination Technologies Limited
    Inventors: Simon Nield, Adam de Grasse, Luca Iuliano, Ollie Mower, Yoong-Chert Foo
  • Patent number: 10481911
    Abstract: Method and apparatus are provided for synchronizing execution of a plurality of threads on a multi-threaded processor. A program executed by a thread can have a number of synchronization points corresponding to points where execution is to be synchronized with another thread. Execution of a thread is paused when it reaches a synchronization point until at least one other thread with which it is intended to be synchronized reaches a corresponding synchronization point. Execution is subsequently resumed. A control core maintains status data for threads and can cause a thread that is ready to run to use execution resources that were occupied by a thread that is waiting for a synchronization event.
    Type: Grant
    Filed: February 11, 2014
    Date of Patent: November 19, 2019
    Assignee: Imagination Technologies Limited
    Inventor: Yoong Chert Foo
  • Patent number: 10475228
    Abstract: A graphics processing system processes primitive fragments using a rendering space which is sub-divided into tiles. The graphics processing system comprises processing engines configured to apply texturing and/or shading to primitive fragments. The graphics processing system also comprises a cache system for storing graphics data for primitive fragments, the cache system including multiple cache subsystems. Each of the cache subsystems is coupled to a respective set of one or more processing engines. The graphics processing system also comprises a tile allocation unit which operates in one or more allocation modes to allocate tiles to processing engines. The allocation mode(s) include a spatial allocation mode in which groups of spatially adjacent tiles are allocated to the processing engines according to a spatial allocation scheme, which ensures that each of the groups of spatially adjacent tiles is allocated to a set of processing engines which are coupled to the same cache subsystem.
    Type: Grant
    Filed: January 4, 2019
    Date of Patent: November 12, 2019
    Assignee: Imagination Technologies Limited
    Inventors: Jonathan Redshaw, Yoong Chert Foo
  • Patent number: 10402935
    Abstract: A method of profiling the performance of a graphics unit when rendering a scene according to a graphics pipeline includes executing stages of the graphics pipeline using one or more units of rendering circuitry to perform at least one rendering task that defines a portion of the work required to render the scene, the at least one rendering task associated with a set flag; propagating an indication of the flag through stages of the graphics pipeline as the scene is rendered so that work done as part of the at least one rendering task is associated with the set flag; changing the value of a counter associated with a unit of rendering circuitry in response to an occurrence of an event while that unit performs an item of work associated with the set flag; and reading the value of the counter to thereby measure the occurrences of the event caused by completing the at least one rendering task.
    Type: Grant
    Filed: October 31, 2017
    Date of Patent: September 3, 2019
    Assignee: Imagination Technologies Limited
    Inventor: Yoong-Chert Foo
  • Publication number: 20190266018
    Abstract: A method of scheduling tasks within a GPU or other highly parallel processing unit is described which is both age-aware and wakeup event driven. Tasks which are received are added to an age-based task queue. Wakeup event bits for task types, or combinations of task types and data groups, are set in response to completion of a task dependency and these wakeup event bits are used to select an oldest task from the queue that satisfies predefined criteria.
    Type: Application
    Filed: May 14, 2019
    Publication date: August 29, 2019
    Inventors: Simon Nield, Adam de Grasse, Luca Iuliano, Ollie Mower, Yoong-Chert Foo
  • Publication number: 20190244325
    Abstract: A SIMD processing unit processes a plurality of tasks which each include up to a predetermined maximum number of work items. The work items of a task are arranged for executing a common sequence of instructions on respective data items. The data items are arranged into blocks, with some of the blocks including at least one invalid data item. Work items which relate to invalid data items are invalid work items. The SIMD processing unit comprises a group of processing lanes configured to execute instructions of work items of a particular task over a plurality of processing cycles. A control module assembles work items into the tasks based on the validity of the work items, so that invalid work items of the particular task are temporally aligned across the processing lanes. In this way the number of wasted processing slots due to invalid work items may be reduced.
    Type: Application
    Filed: April 17, 2019
    Publication date: August 8, 2019
    Inventors: John Howson, Jonathan Redshaw, Yoong Chert Foo
  • Patent number: 10318348
    Abstract: A method of scheduling tasks within a GPU or other highly parallel processing unit is described which is both age-aware and wakeup event driven. Tasks which are received are added to an age-based task queue. Wakeup event bits for task types, or combinations of task types and data groups, are set in response to completion of a task dependency and these wakeup event bits are used to select an oldest task from the queue that satisfies predefined criteria.
    Type: Grant
    Filed: September 25, 2017
    Date of Patent: June 11, 2019
    Assignee: Imagination Technologies Limited
    Inventors: Simon Nield, Adam de Grasse, Luca Iuliano, Ollie Mower, Yoong-Chert Foo
  • Patent number: 10311539
    Abstract: A SIMD processing unit processes a plurality of tasks which each include up to a predetermined maximum number of work items. The work items of a task are arranged for executing a common sequence of instructions on respective data items. The data items are arranged into blocks, with some of the blocks including at least one invalid data item. Work items which relate to invalid data items are invalid work items. The SIMD processing unit comprises a group of processing lanes configured to execute instructions of work items of a particular task over a plurality of processing cycles. A control module assembles work items into the tasks based on the validity of the work items, so that invalid work items of the particular task are temporally aligned across the processing lanes. In this way the number of wasted processing slots due to invalid work items may be reduced.
    Type: Grant
    Filed: November 2, 2016
    Date of Patent: June 4, 2019
    Assignee: Imagination Technologies Limited
    Inventors: John Howson, Jonathan Redshaw, Yoong Chert Foo
  • Publication number: 20190155607
    Abstract: Method and apparatus are provided for synchronising execution of a plurality of threads on a multi-threaded processor. A program executed by a thread can have a number of synchronisation points corresponding to points where execution is to be synchronised with another thread. Execution of a thread is paused when it reaches a synchronisation point until at least one other thread with which it is intended to be synchronised reaches a corresponding synchronisation point. Execution is subsequently resumed. A control core maintains status data for threads and can cause a thread that is ready to run to use execution resources that were occupied by a thread that is waiting for a synchronisation event.
    Type: Application
    Filed: January 18, 2019
    Publication date: May 23, 2019
    Inventor: Yoong Chert Foo
  • Publication number: 20190139294
    Abstract: A graphics processing system processes primitive fragments using a rendering space which is sub-divided into tiles. The graphics processing system comprises processing engines configured to apply texturing and/or shading to primitive fragments. The graphics processing system also comprises a cache system for storing graphics data for primitive fragments, the cache system including multiple cache subsystems. Each of the cache subsystems is coupled to a respective set of one or more processing engines. The graphics processing system also comprises a tile allocation unit which operates in one or more allocation modes to allocate tiles to processing engines. The allocation mode(s) include a spatial allocation mode in which groups of spatially adjacent tiles are allocated to the processing engines according to a spatial allocation scheme, which ensures that each of the groups of spatially adjacent tiles is allocated to a set of processing engines which are coupled to the same cache subsystem.
    Type: Application
    Filed: January 4, 2019
    Publication date: May 9, 2019
    Inventors: Jonathan Redshaw, Yoong Chert Foo
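
The spatial allocation mode described in the tile allocation abstracts above (patent 10475228 and publication 20190139294) can be illustrated with a small sketch. This is not the patented implementation. The 2x2 group size, the round-robin assignment of groups to cache subsystems, the engine topology, and the names `CACHE_TO_ENGINES`, `allocate_tiles` and `cache_of` are all assumptions made for illustration. The only property taken from the abstract is that every tile in a group of spatially adjacent tiles goes to processing engines coupled to the same cache subsystem.

```python
from typing import Dict, List, Tuple

# Hypothetical topology: cache subsystem id -> processing engines coupled to it.
CACHE_TO_ENGINES: Dict[int, List[int]] = {0: [0, 1], 1: [2, 3]}

# Assumed size (in tiles) of one group of spatially adjacent tiles.
GROUP_W, GROUP_H = 2, 2


def allocate_tiles(tiles_x: int, tiles_y: int) -> Dict[Tuple[int, int], int]:
    """Map each tile (x, y) to a processing engine id.

    Tiles in the same GROUP_W x GROUP_H group are only given to engines
    that share one cache subsystem.
    """
    groups_per_row = (tiles_x + GROUP_W - 1) // GROUP_W
    cache_ids = sorted(CACHE_TO_ENGINES)
    allocation: Dict[Tuple[int, int], int] = {}
    for y in range(tiles_y):
        for x in range(tiles_x):
            # Identify the group this tile belongs to.
            group_index = (y // GROUP_H) * groups_per_row + (x // GROUP_W)
            # Assumed policy: groups are spread round-robin over cache subsystems.
            engines = CACHE_TO_ENGINES[cache_ids[group_index % len(cache_ids)]]
            # Spread the tiles of one group over that subsystem's engines.
            index_in_group = (y % GROUP_H) * GROUP_W + (x % GROUP_W)
            allocation[(x, y)] = engines[index_in_group % len(engines)]
    return allocation


def cache_of(engine: int) -> int:
    """Return the cache subsystem an engine is coupled to."""
    for cache_id, engines in CACHE_TO_ENGINES.items():
        if engine in engines:
            return cache_id
    raise ValueError(f"unknown engine {engine}")
```

Calling `allocate_tiles(4, 4)` yields a mapping in which each 2x2 group of tiles is served by engines behind a single cache subsystem, so graphics data fetched for one tile is likely to be reusable by its neighbours.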