Patents by Inventor Arash Ashari

Arash Ashari has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

Pipelined approach to fused kernels for optimization of machine learning workloads on graphical processing units

Patent number: 10223762

Abstract: A method for optimization of machine learning (ML) workloads on a graphics processor unit (GPU). The method includes identifying a computation having a generic pattern commonly observed in ML processes. Hierarchical aggregation spanning a memory hierarchy of the GPU for processing is performed for the identified computation including maintaining partial output vector results in shared memory of the GPU. Hierarchical aggregation for vectors is performed including performing intra-block aggregation for multiple thread blocks of a partial output vector results on GPU global memory.

Type: Grant

Filed: March 16, 2018

Date of Patent: March 5, 2019

Assignee: International Business Machines Corporation

Inventors: Arash Ashari, Matthias Boehm, Keith W. Campbell, Alexandre Evfimievski, John D. Keenleyside, Berthold Reinwald, Shirish Tatikonda

SYSTEMS AND METHODS FOR AUTOMATED AUDIENCE IDENTIFICATION

Publication number: 20190034973

Abstract: Systems, methods, and non-transitory computer-readable media can identify a target page and an advertising campaign comprising one or more advertisements associated with the target page. One or more users are identified for inclusion in a base audience based on page information associated with the target page. One or more users are identified for inclusion in an expanded audience based on expanded audience criteria. The advertising campaign is presented to a smart audience comprising the base audience and the expanded audience.

Type: Application

Filed: July 26, 2017

Publication date: January 31, 2019

Inventors: Jinyi Yao, Martin Schatz, Arash Ashari, Vijay Rangarajan, Liushan Yang, Iris Yui Chang

PIPELINED APPROACH TO FUSED KERNELS FOR OPTIMIZATION OF MACHINE LEARNING WORKLOADS ON GRAPHICAL PROCESSING UNITS

Publication number: 20180211357

Abstract: A method for optimization of machine learning (ML) workloads on a graphics processor unit (GPU). The method includes identifying a computation having a generic pattern commonly observed in ML processes. Hierarchical aggregation spanning a memory hierarchy of the GPU for processing is performed for the identified computation including maintaining partial output vector results in shared memory of the GPU. Hierarchical aggregation for vectors is performed including performing intra-block aggregation for multiple thread blocks of a partial output vector results on GPU global memory.

Type: Application

Filed: March 16, 2018

Publication date: July 26, 2018

Inventors: Arash Ashari, Matthias Boehm, Keith W. Campbell, Alexandre Evfimievski, John D. Keenleyside, Berthold Reinwald, Shirish Tatikonda

Pipelined approach to fused kernels for optimization of machine learning workloads on graphical processing units

Patent number: 9972063

Abstract: A method for optimization of machine learning (ML) workloads on a graphics processor unit (GPU). The method includes identifying a computation having a generic pattern commonly observed in ML processes. An optimized fused GPU kernel is employed to exploit temporal locality for inherent data-flow dependencies in the identified computation. Hierarchical aggregation spanning a memory hierarchy of the GPU for processing for the identified computation is performed. GPU kernel launch parameters are estimated following an analytical model that maximizes thread occupancy and minimizes atomic writes to GPU global memory.

Type: Grant

Filed: July 30, 2015

Date of Patent: May 15, 2018

Assignee: International Business Machines Corporation

Inventors: Arash Ashari, Matthias Boehm, Keith W. Campbell, Alexandre Evfimievski, John D. Keenleyside, Berthold Reinwald, Shirish Tatikonda

PIPELINED APPROACH TO FUSED KERNELS FOR OPTIMIZATION OF MACHINE LEARNING WORKLOADS ON GRAPHICAL PROCESSING UNITS

Publication number: 20170032487

Abstract: A method for optimization of machine learning (ML) workloads on a graphics processor unit (GPU). The method includes identifying a computation having a generic pattern commonly observed in ML processes. An optimized fused GPU kernel is employed to exploit temporal locality for inherent data-flow dependencies in the identified computation. Hierarchical aggregation spanning a memory hierarchy of the GPU for processing for the identified computation is performed. GPU kernel launch parameters are estimated following an analytical model that maximizes thread occupancy and minimizes atomic writes to GPU global memory.

Type: Application

Filed: July 30, 2015

Publication date: February 2, 2017

Inventors: Arash Ashari, Matthias Boehm, Keith W. Campbell, Alexandre Evfimievski, John D. Keenleyside, Berthold Reinwald, Shirish Tatikonda
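
The abstracts above describe hierarchical aggregation in which partial output vectors are kept per thread block and then combined in GPU global memory. The following is only a CPU-side sketch of that general two-level idea, not the patented method: the function names, the weighted column-sum computation, and the `block_size` parameter are all assumptions chosen for illustration, and plain Python lists stand in for shared and global memory.

```python
from typing import List


def block_partial(rows: List[List[float]], weights: List[float], n_cols: int) -> List[float]:
    """Accumulate one block's partial output vector: sum_i weights[i] * rows[i].

    The local list stands in for per-block shared memory (an assumption of this sketch).
    """
    partial = [0.0] * n_cols
    for row, w in zip(rows, weights):
        for j, x in enumerate(row):
            partial[j] += w * x
    return partial


def hierarchical_aggregate(matrix: List[List[float]], weights: List[float], block_size: int) -> List[float]:
    """Two-level aggregation: per-block partial vectors, then one combine step per block.

    `out` stands in for GPU global memory. Each block updates it once per column,
    rather than once per input row, which is the point of aggregating hierarchically.
    """
    if not matrix:
        return []
    n_cols = len(matrix[0])
    out = [0.0] * n_cols
    for start in range(0, len(matrix), block_size):
        end = start + block_size
        partial = block_partial(matrix[start:end], weights[start:end], n_cols)
        for j in range(n_cols):
            out[j] += partial[j]
    return out


if __name__ == "__main__":
    data = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
    print(hierarchical_aggregate(data, [1.0, 0.0, 2.0], block_size=2))
```

In this sketch the number of writes to the shared result scales with the number of blocks rather than the number of rows. On a real GPU those per-block writes would typically be atomic updates, which the 2015 abstracts say the launch parameters are chosen to minimize.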
