Patents by Inventor Sooraj Puthoor

Sooraj Puthoor has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

Managing cache coherence using information in a page table

Patent number: 10019377

Abstract: The described embodiments include a computing device with two or more types of processors and a memory that is shared between the two or more types of processors. The computing device performs operations for handling cache coherency between the two or more types of processors. During operation, the computing device sets a cache coherency indicator in metadata in a page table entry in a page table, the page table entry including information about a page of data that is stored in the memory. The computing device then uses the cache coherency indicator to determine operations to be performed when accessing data in the page of data in the memory. For example, the computing device can use the coherency indicator to determine whether a coherency operation is to be performed when a processor of a given type accesses data in the page of data in the memory.
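The check described in this abstract can be illustrated with a minimal sketch. It is not taken from the patent: the indicator values (`SHARED`, `GPU_PRIVATE`), the `PageTableEntry` layout, and the processor-type strings are all assumptions made for illustration.

```python
from dataclasses import dataclass
from enum import Enum


class Coherency(Enum):
    # Hypothetical indicator values; the abstract does not define an encoding.
    SHARED = 0       # page may be cached by more than one processor type
    GPU_PRIVATE = 1  # page is assumed to be touched only by the GPU


@dataclass
class PageTableEntry:
    physical_page: int
    coherency: Coherency  # the coherency indicator kept in the entry's metadata


def needs_coherency_operation(pte: PageTableEntry, processor_type: str) -> bool:
    """Decide from the indicator whether this access needs a coherency operation."""
    if pte.coherency is Coherency.SHARED:
        return True
    # Assumed policy: GPU accesses to a GPU-private page skip the operation,
    # while any other processor type still performs it.
    return processor_type != "gpu"


# Toy page table keyed by virtual page number.
page_table = {
    0x10: PageTableEntry(physical_page=0x8000, coherency=Coherency.SHARED),
    0x11: PageTableEntry(physical_page=0x8001, coherency=Coherency.GPU_PRIVATE),
}
```

Under these assumptions, a GPU access to page `0x11` skips the coherency operation, while any access to page `0x10` performs it.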

Type: Grant

Filed: May 23, 2016

Date of Patent: July 10, 2018

Assignee: Advanced Micro Devices, Inc.

Inventors: Arkaprava Basu, Bradford M. Beckmann, Shuai Che, Sooraj Puthoor

Predicting a context portion to move between a context buffer and registers based on context portions previously used by at least one other thread

Patent number: 10019283

Abstract: A processing device includes a first memory that includes a context buffer. The processing device also includes a processor core to execute threads based on context information stored in registers of the processor core and a memory controller to selectively move a subset of the context information between the context buffer and the registers based on one or more latencies of the threads.

Type: Grant

Filed: June 22, 2015

Date of Patent: July 10, 2018

Assignee: Advanced Micro Devices, Inc.

Inventors: Dmitri Yudanov, Sergey Blagodurov, Arkaprava Basu, Sooraj Puthoor, Joseph L. Greathouse

SCOPED PERSISTENCE BARRIERS FOR NON-VOLATILE MEMORIES

Publication number: 20180088858

Abstract: A processing apparatus is provided that includes NVRAM and one or more processors configured to process a first set and a second set of instructions according to a hierarchical processing scope and process a scoped persistence barrier residing in the program after the first instruction set and before the second instruction set. The barrier includes an instruction to cause first data to persist in the NVRAM before second data persists in the NVRAM. The first data results from execution of each of the first set of instructions processed according to the one hierarchical processing scope. The second data results from execution of each of the second set of instructions processed according to the one hierarchical processing scope. The processing apparatus also includes a controller configured to cause the first data to persist in the NVRAM before the second data persists in the NVRAM based on the scoped persistence barrier.
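The ordering guarantee in this abstract can be pictured with a small software model. This is only a sketch under assumptions: the `PersistenceController` class, its method names, and the string scope identifiers are hypothetical, and a real controller would operate on hardware write queues rather than Python lists.

```python
class PersistenceController:
    """Toy model: buffered writes become durable when a barrier for their scope runs."""

    def __init__(self):
        self.pending = []  # (scope_id, key, value) writes not yet persisted
        self.nvram = []    # writes in the order they became durable

    def write(self, scope_id, key, value):
        # Writes are buffered first; nothing is durable yet.
        self.pending.append((scope_id, key, value))

    def barrier(self, scope_id):
        # Persist every pending write issued within the given scope, in issue
        # order. Writes from other scopes are left pending (assumed behavior).
        still_pending = []
        for entry in self.pending:
            if entry[0] == scope_id:
                self.nvram.append(entry)
            else:
                still_pending.append(entry)
        self.pending = still_pending


ctrl = PersistenceController()
ctrl.write("workgroup0", "a", 1)  # first instruction set, scope workgroup0
ctrl.write("workgroup1", "b", 2)  # unrelated scope
ctrl.barrier("workgroup0")        # scoped persistence barrier
ctrl.write("workgroup0", "c", 3)  # second instruction set, scope workgroup0
```

In this model the write of `a` is durable before the write of `c` can be, and the barrier does not force the unrelated `workgroup1` write to persist.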

Type: Application

Filed: September 23, 2016

Publication date: March 29, 2018

Applicant: Advanced Micro Devices, Inc.

Inventors: Arkaprava Basu, Mitesh R. Meswani, Dibakar Gope, Sooraj Puthoor

Dynamic wavefront creation for processing units using a hybrid compactor

Patent number: 9898287

Abstract: A method, a non-transitory computer readable medium, and a processor for repacking dynamic wavefronts during program code execution on a processing unit, each dynamic wavefront including multiple threads, are presented. If a branch instruction is detected, a determination is made whether all wavefronts following a same control path in the program code have reached a compaction point, which is the branch instruction. If no branch instruction is detected in executing the program code, a determination is made whether all wavefronts following the same control path have reached a reconvergence point, which is a beginning of a program code segment to be executed by both a taken branch and a not taken branch from a previous branch instruction. The dynamic wavefronts are repacked with all threads that follow the same control path, if all wavefronts following the same control path have reached the branch instruction or the reconvergence point.
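The repacking step in this abstract can be sketched as follows. The data layout is an assumption: each thread is a `(thread_id, path)` pair, where `path` is a hypothetical label for the control path the thread follows, and the wavefront size of 4 is chosen only to keep the example small.

```python
from collections import defaultdict

WAVEFRONT_SIZE = 4  # hypothetical; chosen small for readability


def all_reached(reached_ids, wavefront_ids_on_path):
    """True once every wavefront on the control path is at the compaction or reconvergence point."""
    return set(wavefront_ids_on_path) <= set(reached_ids)


def repack(wavefronts, wavefront_size=WAVEFRONT_SIZE):
    """Regroup threads so that each new wavefront holds threads of a single control path."""
    by_path = defaultdict(list)
    for wavefront in wavefronts:
        for thread_id, path in wavefront:
            by_path[path].append(thread_id)

    repacked = []
    for path, thread_ids in by_path.items():
        # Split each path's threads into wavefront-sized chunks.
        for start in range(0, len(thread_ids), wavefront_size):
            repacked.append((path, thread_ids[start:start + wavefront_size]))
    return repacked


# Two divergent wavefronts, each half "taken" and half "not_taken".
divergent = [
    [(0, "taken"), (1, "not_taken"), (2, "taken"), (3, "not_taken")],
    [(4, "taken"), (5, "taken"), (6, "not_taken"), (7, "not_taken")],
]
compacted = repack(divergent)
```

Here two half-utilized wavefronts per path become one fully populated wavefront per path. A caller would invoke `repack` only after `all_reached` reports that the relevant wavefronts have arrived.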

Type: Grant

Filed: April 9, 2015

Date of Patent: February 20, 2018

Assignee: Advanced Micro Devices, Inc.

Inventors: Sooraj Puthoor, Bradford M. Beckmann, Dmitri Yudanov

Managing Cache Coherence Using Information in a Page Table

Publication number: 20170337136

Abstract: The described embodiments include a computing device with two or more types of processors and a memory that is shared between the two or more types of processors. The computing device performs operations for handling cache coherency between the two or more types of processors. During operation, the computing device sets a cache coherency indicator in metadata in a page table entry in a page table, the page table entry including information about a page of data that is stored in the memory. The computing device then uses the cache coherency indicator to determine operations to be performed when accessing data in the page of data in the memory. For example, the computing device can use the coherency indicator to determine whether a coherency operation is to be performed when a processor of a given type accesses data in the page of data in the memory.

Type: Application

Filed: May 23, 2016

Publication date: November 23, 2017

Inventors: Arkaprava Basu, Bradford M. Beckmann, Shuai Che, Sooraj Puthoor

METHOD AND APPARATUS FOR INTER-LANE THREAD MIGRATION

Publication number: 20170220346

Abstract: Briefly, methods and apparatus to migrate a software thread from one wavefront executing on one execution unit to another wavefront executing on another execution unit whereby both execution units are associated with a compute unit of a processing device such as, for example, a GPU. The methods and apparatus may execute compiled dynamic thread migration swizzle buffer instructions that when executed allow access to a dynamic thread migration swizzle buffer that allows for the migration of register context information when migrating software threads. The register context information may be located in one or more locations of a register file prior to storing the register context information into the dynamic thread migration swizzle buffer. The method and apparatus may also return the register context information from the dynamic thread migration swizzle buffer to one or more different register file locations of the register file.

Type: Application

Filed: January 29, 2016

Publication date: August 3, 2017

Applicant: Advanced Micro Devices, Inc.

Inventors: Bradford Beckmann, Sooraj Puthoor

SYSTEM PERFORMANCE MANAGEMENT USING PRIORITIZED COMPUTE UNITS

Publication number: 20170004080

Abstract: Methods, devices, and systems for managing performance of a processor having multiple compute units. An effective number of the multiple compute units may be determined to designate as having priority. On a condition that the effective number is nonzero, the effective number of the multiple compute units may each be designated as a priority compute unit. Priority compute units may have access to a shared cache whereas non-priority compute units may not. Workgroups may be preferentially dispatched to priority compute units. Memory access requests from priority compute units may be served ahead of requests from non-priority compute units.
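One part of this abstract, serving memory requests from priority compute units ahead of the rest, can be sketched with a priority queue. The request format `(cu_id, address)` and the function name `serve_order` are assumptions for illustration; the abstract does not specify how requests are queued.

```python
import heapq
from itertools import count


def serve_order(requests, priority_cus):
    """Return requests in service order: priority compute units first, arrival order within each class."""
    ticket = count()  # arrival counter, keeps ordering stable within a class
    heap = []
    for cu_id, address in requests:
        rank = 0 if cu_id in priority_cus else 1
        heapq.heappush(heap, (rank, next(ticket), cu_id, address))

    served = []
    while heap:
        _, _, cu_id, address = heapq.heappop(heap)
        served.append((cu_id, address))
    return served


# Compute units 0 and 1 are designated as priority compute units.
arrivals = [(2, 0x100), (0, 0x200), (3, 0x300), (1, 0x400)]
order = serve_order(arrivals, priority_cus={0, 1})
```

With an empty `priority_cus` set (an effective number of zero), the function falls back to plain arrival order.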

Type: Application

Filed: June 30, 2015

Publication date: January 5, 2017

Applicant: Advanced Micro Devices, Inc.

Inventors: Zhe Wang, Sooraj Puthoor, Bradford M. Beckmann

INSTRUCTION CONTEXT SWITCHING

Publication number: 20160371082

Abstract: A processing device includes a first memory that includes a context buffer. The processing device also includes a processor core to execute threads based on context information stored in registers of the processor core and a memory controller to selectively move a subset of the context information between the context buffer and the registers based on one or more latencies of the threads.

Type: Application

Filed: June 22, 2015

Publication date: December 22, 2016

Inventors: Dmitri Yudanov, Sergey Blagodurov, Arkaprava Basu, Sooraj Puthoor, Joseph L. Greathouse

DYNAMIC WAVEFRONT CREATION FOR PROCESSING UNITS USING A HYBRID COMPACTOR

Publication number: 20160239302

Abstract: A method, a non-transitory computer readable medium, and a processor for repacking dynamic wavefronts during program code execution on a processing unit, each dynamic wavefront including multiple threads, are presented. If a branch instruction is detected, a determination is made whether all wavefronts following a same control path in the program code have reached a compaction point, which is the branch instruction. If no branch instruction is detected in executing the program code, a determination is made whether all wavefronts following the same control path have reached a reconvergence point, which is a beginning of a program code segment to be executed by both a taken branch and a not taken branch from a previous branch instruction. The dynamic wavefronts are repacked with all threads that follow the same control path, if all wavefronts following the same control path have reached the branch instruction or the reconvergence point.

Type: Application

Filed: April 9, 2015

Publication date: August 18, 2016

Applicant: Advanced Micro Devices, Inc.

Inventors: Sooraj Puthoor, Bradford M. Beckmann, Dmitri Yudanov

HETEROGENEOUS FUNCTION UNIT DISPATCH IN A GRAPHICS PROCESSING UNIT

Publication number: 20160085551

Abstract: A compute unit configured to execute multiple threads in parallel is presented. The compute unit includes one or more single instruction multiple data (SIMD) units and a fetch and decode logic. The SIMD units have differing numbers of arithmetic logic units (ALUs), such that each SIMD unit can execute a different number of threads. The fetch and decode logic is in communication with each of the SIMD units, and is configured to assign the threads to the SIMD units for execution based on such differing numbers of ALUs.
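A possible assignment policy for such a compute unit is sketched below. The abstract states only that threads are assigned based on the differing ALU counts; the specific rule used here (pick the narrowest SIMD unit that still fits all active threads, otherwise the widest) is an assumption, as are the function name and the example widths.

```python
def pick_simd(active_threads, simd_alu_counts):
    """Choose a SIMD unit width for a group of active threads.

    Assumed policy: use the narrowest unit whose ALU count covers all active
    threads; if none is wide enough, fall back to the widest unit.
    """
    fitting = [alus for alus in simd_alu_counts if alus >= active_threads]
    return min(fitting) if fitting else max(simd_alu_counts)


# Hypothetical compute unit with three SIMD units of 2, 8, and 16 ALUs.
simd_units = [2, 8, 16]
choice_small = pick_simd(2, simd_units)
choice_medium = pick_simd(3, simd_units)
choice_large = pick_simd(20, simd_units)
```

The intent of this policy is to avoid leaving ALUs idle when only a few threads are active, which is one plausible reason for mixing SIMD widths.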

Type: Application

Filed: September 18, 2014

Publication date: March 24, 2016

Applicant: Advanced Micro Devices, Inc.

Inventors: Joseph L. Greathouse, Mitesh R. Meswani, Sooraj Puthoor, Dmitri Yudanov, James M. O'Connor
