Patents by Inventor Aaftab A. Munshi

Aaftab A. Munshi has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

Parallel runtime execution on multiple processors

Patent number: 9471401

Abstract: A method and an apparatus that schedule a plurality of executables in a schedule queue for execution in one or more physical compute devices such as CPUs or GPUs concurrently are described. One or more executables are compiled online from a source having an existing executable for a type of physical compute devices different from the one or more physical compute devices. Dependency relations among elements corresponding to scheduled executables are determined to select an executable to be executed by a plurality of threads concurrently in more than one of the physical compute devices. A thread initialized for executing an executable in a GPU of the physical compute devices are initialized for execution in another CPU of the physical compute devices if the GPU is busy with graphics processing threads.

Type: Grant

Filed: May 15, 2015

Date of Patent: October 18, 2016

Assignee: Apple Inc.

Inventors: Aaftab Munshi, Jeremy Sandmel

Data parallel computing on multiple processors

Patent number: 9442757

Abstract: A method and an apparatus that allocate one or more physical compute devices such as central processing units or graphical processing units attached to a host processing unit running an application for executing one or more threads of the application are described. The allocation may be based on data representing a processing capability requirement from the application for executing an executable in the one or more threads. A compute device identifier may be associated with the allocated physical compute devices to schedule and execute the executable in the one or more threads concurrently in one or more of the allocated physical compute devices concurrently.

Type: Grant

Filed: January 24, 2014

Date of Patent: September 13, 2016

Assignee: Apple Inc.

Inventors: Aaftab Munshi, Jeremy Sandmel

Parallel runtime execution on multiple processors

Patent number: 9436526

Abstract: A method and an apparatus that schedule a plurality of executables in a schedule queue for execution in one or more physical compute devices such as CPUs or GPUs concurrently are described. One or more executables are compiled online from a source having an existing executable for a type of physical compute devices different from the one or more physical compute devices. Dependency relations among elements corresponding to scheduled executables are determined to select an executable to be executed by a plurality of threads concurrently in more than one of the physical compute devices. A thread initialized for executing an executable in a GPU of the physical compute devices are initialized for execution in another CPU of the physical compute devices if the GPU is busy with graphics processing threads.

Type: Grant

Filed: January 24, 2014

Date of Patent: September 6, 2016

Assignee: Apple Inc.

Inventors: Aaftab Munshi, Jeremy Sandmel

APPLICATION INTERFACE ON MULTIPLE PROCESSORS

Publication number: 20160217011

Abstract: A method and an apparatus that execute a parallel computing program in a programming language for a parallel computing architecture are described. The parallel computing program is stored in memory in a system with parallel processors. The system includes a host processor, a graphics processing unit (GPU) coupled to the host processor and a memory coupled to at least one of the host processor and the GPU. The parallel computing program is stored in the memory to allocate threads between the host processor and the GPU. The programming language includes an API to allow an application to make calls using the API to allocate execution of the threads between the host processor and the GPU. The programming language includes host function data tokens for host functions performed in the host processor and kernel function data tokens for compute kernel functions performed in one or more compute processors, e.g. GPUs or CPUs, separate from the host processor.

Type: Application

Filed: January 27, 2016

Publication date: July 28, 2016

Inventors: Aaftab Munshi, Jeremy Sandmel

APPLICATION PROGRAMMING INTERFACES FOR DATA PARALLEL COMPUTING ON MULTIPLE PROCESSORS

Publication number: 20160188371

Abstract: A method and an apparatus for a parallel computing program calling APIs (application programming interfaces) in a host processor to perform a data processing task in parallel among compute units are described. The compute units are coupled to the host processor including central processing units (CPUs) and graphic processing units (GPUs). A program object corresponding to a source code for the data processing task is generated in a memory coupled to the host processor according to the API calls. Executable codes for the compute units are generated from the program object according to the API calls to be loaded for concurrent execution among the compute units to perform the data processing task.

Type: Application

Filed: December 21, 2015

Publication date: June 30, 2016

Inventors: Aaftab Munshi, Nathaniel Begeman

Parallel runtime execution on multiple processors

Patent number: 9304834

Abstract: A method and an apparatus that schedule a plurality of executables in a schedule queue for execution in one or more physical compute devices such as CPUs or GPUs concurrently are described. One or more executables are compiled online from a source having an existing executable for a type of physical compute devices different from the one or more physical compute devices. Dependency relations among elements corresponding to scheduled executables are determined to select an executable to be executed by a plurality of threads concurrently in more than one of the physical compute devices. A thread initialized for executing an executable in a GPU of the physical compute devices are initialized for execution in another CPU of the physical compute devices if the GPU is busy with graphics processing threads.

Type: Grant

Filed: September 13, 2012

Date of Patent: April 5, 2016

Assignee: Apple Inc.

Inventors: Aaftab Munshi, Jeremy Sandmel

Application programming interfaces for data parallel computing on multiple processors

Patent number: 9250697

Abstract: A method and an apparatus for a parallel computing program calling APIs (application programming interfaces) in a host processor to perform a data processing task in parallel among compute units are described. The compute units are coupled to the host processor including central processing units (CPUs) and graphic processing units (GPUs). A program object corresponding to a source code for the data processing task is generated in a memory coupled to the host processor according to the API calls. Executable codes for the compute units are generated from the program object according to the API calls to be loaded for concurrent execution among the compute units to perform the data processing task.

Type: Grant

Filed: January 24, 2014

Date of Patent: February 2, 2016

Assignee: Apple Inc.

Inventors: Aaftab Munshi, Nathaniel Begeman

Data parallel computing on multiple processors

Patent number: 9207971

Abstract: A method and an apparatus that allocate one or more physical compute devices such as CPUs or GPUs attached to a host processing unit running an application for executing one or more threads of the application are described. The allocation may be based on data representing a processing capability requirement from the application for executing an executable in the one or more threads. A compute device identifier may be associated with the allocated physical compute devices to schedule and execute the executable in the one or more threads concurrently in one or more of the allocated physical compute devices concurrently.

Type: Grant

Filed: September 13, 2012

Date of Patent: December 8, 2015

Assignee: Apple Inc.

Inventors: Aaftab Munshi, Jeremy Sandmel

Unified Intermediate Representation

Publication number: 20150347107

Abstract: A system decouples the source code language from the eventual execution environment by compiling the source code language into a unified intermediate representation that conforms to a language model allowing both parallel graphical operations and parallel general-purpose computational operations. The intermediate representation may then be distributed to end-user computers, where an embedded compiler can compile the intermediate representation into an executable binary targeted for the CPUs and GPUs available in that end-user device. The intermediate representation is sufficient to define both graphics and non-graphics compute kernels and shaders. At install-time or later, the intermediate representation file may be compiled for the specific target hardware of the given end-user computing system.

Type: Application

Filed: September 30, 2014

Publication date: December 3, 2015

Inventors: Aaftab Munshi, Rahul U. Joshi, Mon P. Wang, Kelvin C. Chiu

Language, Function Library, And Compiler For Graphical And Non-Graphical Computation On A Graphical Processor Unit

Publication number: 20150347108

Abstract: A compiler and library provide the ability to compile a programming language according to a defined language model into a programming language independent, machine independent intermediate representation, for conversion into an executable on a target programmable device. The language model allows writing programs that perform data-parallel graphics and non-graphics tasks.

Type: Application

Filed: February 20, 2015

Publication date: December 3, 2015

Inventors: Aaftab A. Munshi, Kenneth C. Dyke, Rahul U. Joshi, Richard W. Schreyer

PARALLEL RUNTIME EXECUTION ON MULTIPLE PROCESSORS

Publication number: 20150317192

Abstract: A method and an apparatus that schedule a plurality of executables in a schedule queue for execution in one or more physical compute devices such as CPUs or GPUs concurrently are described. One or more executables are compiled online from a source having an existing executable for a type of physical compute devices different from the one or more physical compute devices. Dependency relations among elements corresponding to scheduled executables are determined to select an executable to be executed by a plurality of threads concurrently in more than one of the physical compute devices. A thread initialized for executing an executable in a GPU of the physical compute devices are initialized for execution in another CPU of the physical compute devices if the GPU is busy with graphics processing threads.

Type: Application

Filed: May 15, 2015

Publication date: November 5, 2015

Inventors: Aaftab Munshi, Jeremy Sandmel

SUBBUFFER OBJECTS

Publication number: 20150187322

Abstract: A method and an apparatus for a parallel computing program using subbuffers to perform a data processing task in parallel among heterogeneous compute units are described. The compute units can include a heterogeneous mix of central processing units (CPUs) and graphic processing units (GPUs). A system creates a subbuffer from a parent buffer for each of a plurality of heterogeneous compute units. If a subbuffer is not associated with the same compute unit as the parent buffer, the system copies data from the subbuffer to memory of that compute unit. The system further tracks updates to the data and transfers those updates back to the subbuffer.

Type: Application

Filed: December 18, 2014

Publication date: July 2, 2015

Inventors: Aaftab A. Munshi, Ian R. Ollmann

Parallel runtime execution on multiple processors

Patent number: 9052948

Abstract: A method and an apparatus that schedule a plurality of executables in a schedule queue for execution in one or more physical compute devices such as CPUs or GPUs concurrently are described. One or more executables are compiled online from a source having an existing executable for a type of physical compute devices different from the one or more physical compute devices. Dependency relations among elements corresponding to scheduled executables are determined to select an executable to be executed by a plurality of threads concurrently in more than one of the physical compute devices. A thread initialized for executing an executable in a GPU of the physical compute devices are initialized for execution in another CPU of the physical compute devices if the GPU is busy with graphics processing threads.

Type: Grant

Filed: August 28, 2012

Date of Patent: June 9, 2015

Assignee: Apple Inc.

Inventors: Aaftab Munshi, Jeremy Sandmel

Subbuffer objects

Patent number: 8957906

Abstract: A method and an apparatus for a parallel computing program using subbuffers to perform a data processing task in parallel among heterogeneous compute units are described. The compute units can include a heterogeneous mix of central processing units (CPUs) and graphic processing units (GPUs). A system creates a subbuffer from a parent buffer for each of a plurality of heterogeneous compute units. If a subbuffer is not associated with the same compute unit as the parent buffer, the system copies data from the subbuffer to memory of that compute unit. The system further tracks updates to the data and transfers those updates back to the subbuffer.

Type: Grant

Filed: April 16, 2014

Date of Patent: February 17, 2015

Assignee: Apple Inc.

Inventors: Aaftab A. Munshi, Ian R. Ollmann

Enqueuing Kernels from Kernels on GPU/CPU

Publication number: 20150022538

Abstract: Graphics processing units (GPUs) and other compute units are allowed to enqueue tasks for themselves by themselves, without needing a host processor to queue the work for the GPU. Built-in functions enable kernels to enqueue kernels for execution on a device. In some embodiments, ndrange kernels execute over an N-dimensional range to provide data-parallel operations. Task kernels provide task-parallel operations. In some embodiments, kernels may be defined using clang block syntax. The order of execution of commands on a compute unit may be constrained or allow execution of commands out-of-order. Compute units may control when kernels enqueued by the compute unit begins execution.

Type: Application

Filed: December 23, 2013

Publication date: January 22, 2015

Inventor: Aaftab A. Munshi

SUBBUFFER OBJECTS

Publication number: 20140313214

Abstract: A method and an apparatus for a parallel computing program using subbuffers to perform a data processing task in parallel among heterogeneous compute units are described. The compute units can include a heterogeneous mix of central processing units (CPUs) and graphic processing units (GPUs). A system creates a subbuffer from a parent buffer for each of a plurality of heterogeneous compute units. If a subbuffer is not associated with the same compute unit as the parent buffer, the system copies data from the subbuffer to memory of that compute unit. The system further tracks updates to the data and transfers those updates back to the subbuffer.

Type: Application

Filed: April 16, 2014

Publication date: October 23, 2014

Applicant: Apple Inc.

Inventors: Aaftab A. Munshi, Ian R. Ollmann

APPLICATION PROGRAMMING INTERFACES FOR DATA PARALLEL COMPUTING ON MULTIPLE PROCESSORS

Publication number: 20140237457

Abstract: A method and an apparatus for a parallel computing program calling APIs (application programming interfaces) in a host processor to perform a data processing task in parallel among compute units are described. The compute units are coupled to the host processor including central processing units (CPUs) and graphic processing units (GPUs). A program object corresponding to a source code for the data processing task is generated in a memory coupled to the host processor according to the API calls. Executable codes for the compute units are generated from the program object according to the API calls to be loaded for concurrent execution among the compute units to perform the data processing task.

Type: Application

Filed: January 24, 2014

Publication date: August 21, 2014

Applicant: Apple Inc.

Inventors: Aaftab Munshi, Nathaniel Begeman

Application programming interfaces for data parallel computing on multiple processors

Patent number: 8806513

Abstract: A method and an apparatus for a parallel computing program calling APIs (application programming interfaces) in a host processor to perform a data processing task in parallel among compute units are described. The compute units are coupled to the host processor including central processing units (CPUs) and graphic processing units (GPUs). A program object corresponding to a source code for the data processing task is generated in a memory coupled to the host processor according to the API calls. Executable codes for the compute units are generated from the program object according to the API calls to be loaded for concurrent execution among the compute units to perform the data processing task.

Type: Grant

Filed: October 5, 2012

Date of Patent: August 12, 2014

Assignee: Apple Inc.

Inventors: Aaftab A. Munshi, Nathaniel Begeman

PARALLEL RUNTIME EXECUTION ON MULTIPLE PROCESSORS

Publication number: 20140201746

Abstract: A method and an apparatus that schedule a plurality of executables in a schedule queue for execution in one or more physical compute devices such as CPUs or GPUs concurrently are described. One or more executables are compiled online from a source having an existing executable for a type of physical compute devices different from the one or more physical compute devices. Dependency relations among elements corresponding to scheduled executables are determined to select an executable to be executed by a plurality of threads concurrently in more than one of the physical compute devices. A thread initialized for executing an executable in a GPU of the physical compute devices are initialized for execution in another CPU of the physical compute devices if the GPU is busy with graphics processing threads.

Type: Application

Filed: January 24, 2014

Publication date: July 17, 2014

Applicant: Apple Inc.

Inventors: Aaftab Munshi, Jeremy Sandmel

DATA PARALLEL COMPUTING ON MULTIPLE PROCESSORS

Publication number: 20140201755

Abstract: A method and an apparatus that allocate one or more physical compute devices such as CPUs or GPUs attached to a host processing unit running an application for executing one or more threads of the application are described. The allocation may be based on data representing a processing capability requirement from the application for executing an executable in the one or more threads. A compute device identifier may be associated with the allocated physical compute devices to schedule and execute the executable in the one or more threads concurrently in one or more of the allocated physical compute devices concurrently.

Type: Application

Filed: January 24, 2014

Publication date: July 17, 2014

Applicant: Apple Inc.

Inventors: Aaftab Munshi, Jeremy Sandmel