Patents by Inventor Heiner Giefers

Heiner Giefers has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 11783200
    Abstract: Field-programmable gate array and method to implement an artificial neural network. A trained model of the neural network is processed, in which weights are defined in a floating-point format, to quantize each set of weights to a respective reduced-precision format in dependence on effect of quantization on accuracy of the model. For each set of weights, a partitioning scheme is defined for a set of block memories of the apparatus such that a plurality k of those weights can be stored in each addressable location of the set of memories, wherein k differs for different sets of weights. The apparatus can be programmed to implement the neural network such that weights in each set are persistently stored in a set of block memories partitioned according to the partitioning scheme for that set of weights.
    Type: Grant
    Filed: February 8, 2019
    Date of Patent: October 10, 2023
    Assignee: International Business Machines Corporation
    Inventors: Dionysios Diamantopoulos, Heiner Giefers, Christoph Hagleitner
  • Patent number: 11275713
    Abstract: The invention is notably directed to a computing system configured to perform linear algebraic operations. The computing system comprises a co-processing module comprising a co-processing unit. The co-processing unit comprises a parallel array of bit-serial processing units. The bit-serial processing units are adapted to perform the linear algebraic operations with variable precision. The invention further concerns a related computer implemented method and a related computer program product.
    Type: Grant
    Filed: June 9, 2018
    Date of Patent: March 15, 2022
    Assignee: International Business Machines Corporation
    Inventors: Heiner Giefers, Raphael Polig, Jan Van Lunteren
  • Patent number: 10776118
    Abstract: A computing system comprising a central processing unit (CPU), a memory processor and a memory device comprising a data array and an index array. The computing system is configured to store data lines comprising data elements in the data array and to store index lines comprising a plurality of memory indices in the index array. The memory indices indicate memory positions of data elements in the data array with respect to a start address of the data array. There is further provided a related computer implemented method and a related computer program product.
    Type: Grant
    Filed: September 9, 2016
    Date of Patent: September 15, 2020
    Assignee: International Business Machines Corporation
    Inventors: Heiner Giefers, Raphael Polig, Jan Van Lunteren
  • Publication number: 20200257986
    Abstract: Field-programmable gate array and method to implement an artificial neural network. A trained model of the neural network is processed, in which weights are defined in a floating-point format, to quantize each set of weights to a respective reduced-precision format in dependence on effect of quantization on accuracy of the model. For each set of weights, a partitioning scheme is defined for a set of block memories of the apparatus such that a plurality k of those weights can be stored in each addressable location of the set of memories, wherein k differs for different sets of weights. The apparatus can be programmed to implement the neural network such that weights in each set are persistently stored in a set of block memories partitioned according to the partitioning scheme for that set of weights.
    Type: Application
    Filed: February 8, 2019
    Publication date: August 13, 2020
    Inventors: Dionysios Diamantopoulos, Heiner Giefers, Christoph Hagleitner
  • Patent number: 10685082
    Abstract: According to some embodiments, a computer-implemented method for performing sparse matrix dense matrix (SpMM) multiplication on a single field programmable gate array (FPGA) module comprising a k-stage pipeline is described. The method may include interleaving k-stage threads on the k-stage pipeline comprising a plurality of threads t_0 to t_{k-1}, wherein a first result of thread t_0 is ready one cycle after the first input of thread t_{k-1} is fed into the pipeline, and outputting a result matrix Y.
    Type: Grant
    Filed: October 31, 2016
    Date of Patent: June 16, 2020
    Assignee: International Business Machines Corporation
    Inventors: Costas Bekas, Alessandro Curioni, Heiner Giefers, Christoph Hagleitner, Raphael C. Polig, Peter W. J. Staar
  • Publication number: 20190377707
    Abstract: The invention is notably directed to a computing system configured to perform linear algebraic operations. The computing system comprises a co-processing module comprising a co-processing unit. The co-processing unit comprises a parallel array of bit-serial processing units. The bit-serial processing units are adapted to perform the linear algebraic operations with variable precision. The invention further concerns a related computer implemented method and a related computer program product.
    Type: Application
    Filed: June 9, 2018
    Publication date: December 12, 2019
    Inventors: Heiner Giefers, Raphael Polig, Jan Van Lunteren
  • Patent number: 10430325
    Abstract: Differential data access. A method for storing and reading data elements to and from a memory is provided. The method includes storing a data element as a base word in a first precision, storing at least one delta word including additional information related to a second precision version of the stored data element, and reading the base word and the at least one delta word of the stored data element to access the data element in the second precision.
    Type: Grant
    Filed: December 14, 2015
    Date of Patent: October 1, 2019
    Assignee: International Business Machines Corporation
    Inventors: Christoph M. Angerer, Heiner Giefers, Raphael Polig
  • Patent number: 10430326
    Abstract: Differential data access. A method for storing and reading data elements to and from a memory is provided. The method includes storing a data element as a base word in a first precision, storing at least one delta word including additional information related to a second precision version of the stored data element, and reading the base word and the at least one delta word of the stored data element to access the data element in the second precision.
    Type: Grant
    Filed: December 21, 2016
    Date of Patent: October 1, 2019
    Assignee: International Business Machines Corporation
    Inventors: Christoph M. Angerer, Heiner Giefers, Raphael Polig
  • Patent number: 10025754
    Abstract: Embodiments of the present invention provide methods, computer program products, and systems for solving a linear equation system using a hardware-implemented extended solver, wherein a calculation precision is adapted in each iteration step of a solving process. Embodiments of the present invention can be used to perform on-the-fly interpolations using the data associated with the highest resolution of a three-dimensional finite element voxel model to a lower resolution than the highest resolution, as well as to perform solving computations of the solving process in the lower resolution.
    Type: Grant
    Filed: July 22, 2015
    Date of Patent: July 17, 2018
    Assignee: International Business Machines Corporation
    Inventors: Christoph M. Angerer, Konstantinos Bekas, Alessandro Curioni, Heiner Giefers, Christoph Hagleitner, Yves G. Ineichen, Raphael Polig
  • Patent number: 9959202
    Abstract: A computing memory includes an execution unit and an access processor coupled with a memory system, where the execution unit and the access processor are logically separated units. The execution unit is for processing operand data. The access processor is for providing operand data and configuration data to the execution unit. The access processor reads operand data from the memory system and sends the operand data to the execution unit. The execution unit executes the operand data according to the provided configuration data. The access processor includes information about execution times of operations of the execution unit for the provided configuration. The access processor reserves time-slots for writing execution unit results provided by the execution unit into selected locations in the memory system based on the information about the execution times, upon sending at least one of the operand data and the configuration data to the execution unit.
    Type: Grant
    Filed: September 16, 2015
    Date of Patent: May 1, 2018
    Assignee: International Business Machines Corporation
    Inventors: Jan Van Lunteren, Heiner Giefers
  • Publication number: 20180074962
    Abstract: A computing system comprising a central processing unit (CPU), a memory processor and a memory device comprising a data array and an index array. The computing system is configured to store data lines comprising data elements in the data array and to store index lines comprising a plurality of memory indices in the index array. The memory indices indicate memory positions of data elements in the data array with respect to a start address of the data array. There is further provided a related computer implemented method and a related computer program product.
    Type: Application
    Filed: September 9, 2016
    Publication date: March 15, 2018
    Inventors: Heiner Giefers, Raphael Polig, Jan Van Lunteren
  • Patent number: 9870315
    Abstract: A computing memory includes an execution unit and an access processor coupled with a memory system, where the execution unit and the access processor are logically separated units. The execution unit is for processing operand data. The access processor is for providing operand data and configuration data to the execution unit. The access processor reads operand data from the memory system and sends the operand data to the execution unit. The execution unit executes the operand data according to the provided configuration data. The access processor includes information about execution times of operations of the execution unit for the provided configuration. The access processor reserves time-slots for writing execution unit results provided by the execution unit into selected locations in the memory system based on the information about the execution times, upon sending at least one of the operand data and the configuration data to the execution unit.
    Type: Grant
    Filed: January 27, 2017
    Date of Patent: January 16, 2018
    Assignee: International Business Machines Corporation
    Inventors: Jan Van Lunteren, Heiner Giefers
  • Patent number: 9779061
    Abstract: An iterative refinement apparatus configured to generate data defining a solution vector x for a linear system represented by Ax=b, where A is a predetermined matrix and b is a predetermined vector. An outer solver processes input data, defining the matrix A and vector b, in accordance with an outer loop of an iterative refinement method to generate said data defining the solution vector x. An inner solver processes data items in accordance with an inner loop of the iterative refinement method. The inner solver is configured to process said data items having variable bit-width and data format. A precision controller determines the bit-widths and data formats of the data items adaptively in dependence on the results of the processing steps of the iterative refinement method; the precision controller is configured to control operation of the inner solver for processing said data items with the bit-widths and data formats.
    Type: Grant
    Filed: February 16, 2015
    Date of Patent: October 3, 2017
    Assignee: International Business Machines Corporation
    Inventors: Christoph M. Angerer, Konstantinos Bekas, Alessandro Curioni, Silvio Dragone, Heiner Giefers, Christoph Hagleitner, Raphael C. Polig
  • Patent number: 9703573
    Abstract: Embodiments are directed to a heterogeneous system for dynamically mapping library calls to one of a plurality of processing platforms. The plurality of processing platforms include a central processing unit (CPU) and one or more acceleration units as co-processing units. The system includes an interposer configured to intercept the library calls from an application programming interface (API) and to map the library calls to one of the plurality of processing platforms according to a classification scheme based on an affinity table. The affinity table includes call signatures representing input parameters of sample library calls. Furthermore, the affinity table includes one or more performance parameters of the sample library calls for each of the processing platforms. The performance parameters indicate the performance of the sample library calls on the respective processing platform. Also included are a related method and a related computer program product.
    Type: Grant
    Filed: April 26, 2016
    Date of Patent: July 11, 2017
    Assignee: International Business Machines Corporation
    Inventors: Heiner Giefers, Raphael Polig
  • Publication number: 20170147531
    Abstract: According to some embodiments, a computer-implemented method for performing sparse matrix dense matrix (SpMM) multiplication on a single field programmable gate array (FPGA) module comprising a k-stage pipeline is described. The method may include interleaving k-stage threads on the k-stage pipeline comprising a plurality of threads t_0 to t_{k-1}, wherein a first result of thread t_0 is ready one cycle after the first input of thread t_{k-1} is fed into the pipeline, and outputting a result matrix Y.
    Type: Application
    Filed: October 31, 2016
    Publication date: May 25, 2017
    Inventors: Costas Bekas, Alessandro Curioni, Heiner Giefers, Christoph Hagleitner, Raphael C. Polig, Peter W. J. Staar
  • Publication number: 20170139625
    Abstract: A computing memory includes an execution unit and an access processor coupled with a memory system, where the execution unit and the access processor are logically separated units. The execution unit is for processing operand data. The access processor is for providing operand data and configuration data to the execution unit. The access processor reads operand data from the memory system and sends the operand data to the execution unit. The execution unit executes the operand data according to the provided configuration data. The access processor includes information about execution times of operations of the execution unit for the provided configuration. The access processor reserves time-slots for writing execution unit results provided by the execution unit into selected locations in the memory system based on the information about the execution times, upon sending at least one of the operand data and the configuration data to the execution unit.
    Type: Application
    Filed: January 27, 2017
    Publication date: May 18, 2017
    Inventors: Jan Van Lunteren, Heiner Giefers
  • Publication number: 20170097883
    Abstract: Differential data access. A method for storing and reading data elements to and from a memory is provided. The method includes storing a data element as a base word in a first precision, storing at least one delta word including additional information related to a second precision version of the stored data element, and reading the base word and the at least one delta word of the stored data element to access the data element in the second precision.
    Type: Application
    Filed: December 21, 2016
    Publication date: April 6, 2017
    Inventors: Christoph M. Angerer, Heiner Giefers, Raphael Polig
  • Patent number: 9558156
    Abstract: According to some embodiments, a computer-implemented method for performing sparse matrix dense matrix (SpMM) multiplication on a single field programmable gate array (FPGA) module comprising a k-stage pipeline is described. The method may include interleaving k-stage threads on the k-stage pipeline comprising a plurality of threads t_0 to t_{k-1}, wherein a first result of thread t_0 is ready one cycle after the first input of thread t_{k-1} is fed into the pipeline, and outputting a result matrix Y.
    Type: Grant
    Filed: November 24, 2015
    Date of Patent: January 31, 2017
    Assignee: International Business Machines Corporation
    Inventors: Costas Bekas, Alessandro Curioni, Heiner Giefers, Christoph Hagleitner, Raphael C. Polig, Peter W. J. Staar
  • Publication number: 20170024356
    Abstract: Embodiments of the present invention provide methods, computer program products, and systems for solving a linear equation system using a hardware-implemented extended solver, wherein a calculation precision is adapted in each iteration step of a solving process. Embodiments of the present invention can be used to perform on-the-fly interpolations using the data associated with the highest resolution of a three-dimensional finite element voxel model to a lower resolution than the highest resolution, as well as to perform solving computations of the solving process in the lower resolution.
    Type: Application
    Filed: July 22, 2015
    Publication date: January 26, 2017
    Inventors: Christoph M. Angerer, Konstantinos Bekas, Alessandro Curioni, Heiner Giefers, Christoph Hagleitner, Yves G. Ineichen, Raphael Polig
  • Publication number: 20160170652
    Abstract: Differential data access. A method for storing and reading data elements to and from a memory is provided. The method includes storing a data element as a base word in a first precision, storing at least one delta word including additional information related to a second precision version of the stored data element, and reading the base word and the at least one delta word of the stored data element to access the data element in the second precision.
    Type: Application
    Filed: December 14, 2015
    Publication date: June 16, 2016
    Inventors: Christoph M. Angerer, Heiner Giefers, Raphael Polig
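
Illustrative note on the "Differential data access" abstracts listed above (patents 10430325 and 10430326 and their applications): the abstracts describe storing a data element as a base word in a first precision plus at least one delta word that carries the extra information for a second precision. The sketch below is one possible reading of that idea and is not taken from the patents. It assumes the base word is the upper 32 bits of an IEEE-754 double and the delta word is the lower 32 bits; the function names `split_double`, `read_base`, and `read_full` are hypothetical.

```python
import struct


def split_double(value: float) -> tuple[int, int]:
    """Split a 64-bit double into a 32-bit base word and a 32-bit delta word.

    Assumption for illustration: base = upper 32 bits (sign, exponent,
    top 20 mantissa bits), delta = lower 32 mantissa bits.
    """
    bits = struct.unpack(">Q", struct.pack(">d", value))[0]
    base = bits >> 32
    delta = bits & 0xFFFFFFFF
    return base, delta


def read_base(base: int) -> float:
    """Low-precision read: use only the base word, with the delta bits zeroed."""
    return struct.unpack(">d", struct.pack(">Q", base << 32))[0]


def read_full(base: int, delta: int) -> float:
    """Full-precision read: recombine the base and delta words."""
    return struct.unpack(">d", struct.pack(">Q", (base << 32) | delta))[0]


if __name__ == "__main__":
    value = 3.141592653589793
    base, delta = split_double(value)
    print("base-only read:", read_base(base))
    print("base + delta  :", read_full(base, delta))
```

Under this assumed layout, a reader that only needs the coarse value touches one word per element, while a reader that needs the exact value fetches both words and recovers the original double bit for bit.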