Patents by Inventor Arash Ashari

Arash Ashari has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 10223762
    Abstract: A method for optimization of machine learning (ML) workloads on a graphics processor unit (GPU). The method includes identifying a computation having a generic pattern commonly observed in ML processes. Hierarchical aggregation spanning a memory hierarchy of the GPU for processing is performed for the identified computation including maintaining partial output vector results in shared memory of the GPU. Hierarchical aggregation for vectors is performed including performing intra-block aggregation for multiple thread blocks of a partial output vector results on GPU global memory.
    Type: Grant
    Filed: March 16, 2018
    Date of Patent: March 5, 2019
    Assignee: International Business Machines Corporation
    Inventors: Arash Ashari, Matthias Boehm, Keith W. Campbell, Alexandre Evfimievski, John D. Keenleyside, Berthold Reinwald, Shirish Tatikonda
  • Publication number: 20190034973
    Abstract: Systems, methods, and non-transitory computer-readable media can identify a target page and an advertising campaign comprising one or more advertisements associated with the target page. One or more users are identified for inclusion in a base audience based on page information associated with the target page. One or more users are identified for inclusion in an expanded audience based on expanded audience criteria. The advertising campaign is presented to a smart audience comprising the base audience and the expanded audience.
    Type: Application
    Filed: July 26, 2017
    Publication date: January 31, 2019
    Inventors: Jinyi Yao, Martin Schatz, Arash Ashari, Vijay Rangarajan, Liushan Yang, Iris Yui Chang
  • Publication number: 20180211357
    Abstract: A method for optimization of machine learning (ML) workloads on a graphics processor unit (GPU). The method includes identifying a computation having a generic pattern commonly observed in ML processes. Hierarchical aggregation spanning a memory hierarchy of the GPU for processing is performed for the identified computation including maintaining partial output vector results in shared memory of the GPU. Hierarchical aggregation for vectors is performed including performing intra-block aggregation for multiple thread blocks of a partial output vector results on GPU global memory.
    Type: Application
    Filed: March 16, 2018
    Publication date: July 26, 2018
    Inventors: Arash Ashari, Matthias Boehm, Keith W. Campbell, Alexandre Evfimievski, John D. Keenleyside, Berthold Reinwald, Shirish Tatikonda
  • Patent number: 9972063
    Abstract: A method for optimization of machine learning (ML) workloads on a graphics processor unit (GPU). The method includes identifying a computation having a generic pattern commonly observed in ML processes. An optimized fused GPU kernel is employed to exploit temporal locality for inherent data-flow dependencies in the identified computation. Hierarchical aggregation spanning a memory hierarchy of the GPU for processing for the identified computation is performed. GPU kernel launch parameters are estimated following an analytical model that maximizes thread occupancy and minimizes atomic writes to GPU global memory.
    Type: Grant
    Filed: July 30, 2015
    Date of Patent: May 15, 2018
    Assignee: International Business Machines Corporation
    Inventors: Arash Ashari, Matthias Boehm, Keith W. Campbell, Alexandre Evfimievski, John D. Keenleyside, Berthold Reinwald, Shirish Tatikonda
  • Publication number: 20170032487
    Abstract: A method for optimization of machine learning (ML) workloads on a graphics processor unit (GPU). The method includes identifying a computation having a generic pattern commonly observed in ML processes. An optimized fused GPU kernel is employed to exploit temporal locality for inherent data-flow dependencies in the identified computation. Hierarchical aggregation spanning a memory hierarchy of the GPU for processing for the identified computation is performed. GPU kernel launch parameters are estimated following an analytical model that maximizes thread occupancy and minimizes atomic writes to GPU global memory.
    Type: Application
    Filed: July 30, 2015
    Publication date: February 2, 2017
    Inventors: Arash Ashari, Matthias Boehm, Keith W. Campbell, Alexandre Evfimievski, John D. Keenleyside, Berthold Reinwald, Shirish Tatikonda