Patents by Inventor Heiner Giefers
Heiner Giefers has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Patent number: 11783200
Abstract: Field-programmable gate array and method to implement an artificial neural network. A trained model of the neural network is processed, in which weights are defined in a floating-point format, to quantize each set of weights to a respective reduced-precision format in dependence on effect of quantization on accuracy of the model. For each set of weights, a partitioning scheme is defined for a set of block memories of the apparatus such that a plurality k of those weights can be stored in each addressable location of the set of memories, wherein k differs for different sets of weights. The apparatus can be programmed to implement the neural network such that weights in each set are persistently stored in a set of block memories partitioned according to the partitioning scheme for that set of weights.
Type: Grant
Filed: February 8, 2019
Date of Patent: October 10, 2023
Assignee: International Business Machines Corporation
Inventors: Dionysios Diamantopoulos, Heiner Giefers, Christoph Hagleitner
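As an illustration of the partitioning idea in the abstract above, the following Python sketch packs k reduced-precision weight codes into each memory word, where k depends on the bit-width chosen for a given set of weights. The 36-bit word width, the use of unsigned integer codes, and all function names are assumptions made for this example; they are not taken from the patent.

```python
def weights_per_word(word_bits: int, weight_bits: int) -> int:
    """Number k of weight codes that fit into one addressable memory word."""
    return word_bits // weight_bits


def pack_weights(codes, weight_bits, word_bits=36):
    """Pack unsigned weight codes into memory words, k codes per word."""
    k = weights_per_word(word_bits, weight_bits)
    mask = (1 << weight_bits) - 1
    words = []
    for start in range(0, len(codes), k):
        word = 0
        for slot, code in enumerate(codes[start:start + k]):
            if not 0 <= code <= mask:
                raise ValueError(f"code {code} does not fit in {weight_bits} bits")
            # Each code occupies its own bit field inside the word.
            word |= code << (slot * weight_bits)
        words.append(word)
    return words


def unpack_weights(words, weight_bits, count, word_bits=36):
    """Recover the first `count` weight codes from packed memory words."""
    k = weights_per_word(word_bits, weight_bits)
    mask = (1 << weight_bits) - 1
    codes = []
    for word in words:
        for slot in range(k):
            codes.append((word >> (slot * weight_bits)) & mask)
    return codes[:count]
```

Under these assumptions, a layer quantized to 4 bits stores 9 weights per word, while a layer quantized to 9 bits stores 4, which is one way k can differ between sets of weights.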
-
Patent number: 11275713
Abstract: The invention is notably directed to a computing system configured to perform linear algebraic operations. The computing system comprises a co-processing module comprising a co-processing unit. The co-processing unit comprises a parallel array of bit-serial processing units. The bit-serial processing units are adapted to perform the linear algebraic operations with variable precision. The invention further concerns a related computer implemented method and a related computer program product.
Type: Grant
Filed: June 9, 2018
Date of Patent: March 15, 2022
Assignee: International Business Machines Corporation
Inventors: Heiner Giefers, Raphael Polig, Jan Van Lunteren
-
Patent number: 10776118
Abstract: A computing system comprising a central processing unit (CPU), a memory processor and a memory device comprising a data array and an index array. The computing system is configured to store data lines comprising data elements in the data array and to store index lines comprising a plurality of memory indices in the index array. The memory indices indicate memory positions of data elements in the data array with respect to a start address of the data array. There is further provided a related computer implemented method and a related computer program product.
Type: Grant
Filed: September 9, 2016
Date of Patent: September 15, 2020
Assignee: International Business Machines Corporation
Inventors: Heiner Giefers, Raphael Polig, Jan Van Lunteren
-
Publication number: 20200257986
Abstract: Field-programmable gate array and method to implement an artificial neural network. A trained model of the neural network is processed, in which weights are defined in a floating-point format, to quantize each set of weights to a respective reduced-precision format in dependence on effect of quantization on accuracy of the model. For each set of weights, a partitioning scheme is defined for a set of block memories of the apparatus such that a plurality k of those weights can be stored in each addressable location of the set of memories, wherein k differs for different sets of weights. The apparatus can be programmed to implement the neural network such that weights in each set are persistently stored in a set of block memories partitioned according to the partitioning scheme for that set of weights.
Type: Application
Filed: February 8, 2019
Publication date: August 13, 2020
Inventors: Dionysios Diamantopoulos, Heiner Giefers, Christoph Hagleitner
-
Patent number: 10685082
Abstract: According to some embodiments, a computer-implemented method for performing sparse matrix dense matrix (SpMM) multiplication on a single field programmable gate array (FPGA) module comprising a k-stage pipeline is described. The method may include interleaving k-stage threads on the k-stage pipeline comprising a plurality of threads t_0 to t_{k-1}, wherein a first result of thread t_0 is ready one cycle after the first input of thread t_{k-1} is fed into the pipeline, and outputting a result matrix Y.
Type: Grant
Filed: October 31, 2016
Date of Patent: June 16, 2020
Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
Inventors: Costas Bekas, Alessandro Curioni, Heiner Giefers, Christoph Hagleitner, Raphael C. Polig, Peter W. J. Staar
-
Publication number: 20190377707
Abstract: The invention is notably directed to a computing system configured to perform linear algebraic operations. The computing system comprises a co-processing module comprising a co-processing unit. The co-processing unit comprises a parallel array of bit-serial processing units. The bit-serial processing units are adapted to perform the linear algebraic operations with variable precision. The invention further concerns a related computer implemented method and a related computer program product.
Type: Application
Filed: June 9, 2018
Publication date: December 12, 2019
Inventors: Heiner Giefers, Raphael Polig, Jan Van Lunteren
-
Patent number: 10430325
Abstract: Differential data access. A method for storing and reading data elements to and from a memory is provided. The method includes storing a data element as a base word in a first precision, storing at least one delta word including additional information related to a second precision version of the stored data element, and reading the base word and the at least one delta word of the stored data element to access the data element in the second precision.
Type: Grant
Filed: December 14, 2015
Date of Patent: October 1, 2019
Assignee: International Business Machines Corporation
Inventors: Christoph M. Angerer, Heiner Giefers, Raphael Polig
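The base-word and delta-word scheme described in the abstract above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the patented method: it assumes the data element is a 32-bit IEEE 754 float, the base word is its upper 16 bits, and the single delta word is its lower 16 bits. The function names are hypothetical.

```python
import struct


def split_float32(value: float) -> tuple[int, int]:
    """Split a float32 into a 16-bit base word and a 16-bit delta word."""
    bits = struct.unpack("<I", struct.pack("<f", value))[0]
    return bits >> 16, bits & 0xFFFF


def read_low_precision(base: int) -> float:
    """Read only the base word: sign, exponent, and the top 7 fraction bits."""
    return struct.unpack("<f", struct.pack("<I", base << 16))[0]


def read_full_precision(base: int, delta: int) -> float:
    """Read base and delta words together to recover the full float32 value."""
    return struct.unpack("<f", struct.pack("<I", (base << 16) | delta))[0]
```

A reader that only needs a coarse value fetches the base word alone and moves half the data; a reader that needs the second, higher precision fetches both words.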
-
Patent number: 10430326
Abstract: Differential data access. A method for storing and reading data elements to and from a memory is provided. The method includes storing a data element as a base word in a first precision, storing at least one delta word including additional information related to a second precision version of the stored data element, and reading the base word and the at least one delta word of the stored data element to access the data element in the second precision.
Type: Grant
Filed: December 21, 2016
Date of Patent: October 1, 2019
Assignee: International Business Machines Corporation
Inventors: Christoph M. Angerer, Heiner Giefers, Raphael Polig
-
Patent number: 10025754
Abstract: Embodiments of the present invention provide methods, computer program products, and systems for solving a linear equation system using a hardware-implemented extended solver, wherein a calculation precision is adapted in each iteration step of a solving process is provided. Embodiments of the present invention can be used to perform on-the-fly interpolations using the data associated with the highest resolution of the three-dimensional finite element voxel model to a lower resolution than the highest resolution as well as to perform solving computations of the solving process in the lower resolution.
Type: Grant
Filed: July 22, 2015
Date of Patent: July 17, 2018
Assignee: International Business Machines Corporation
Inventors: Christoph M. Angerer, Konstantinos Bekas, Alessandro Curioni, Heiner Giefers, Christoph Hagleitner, Yves G. Ineichen, Raphael Polig
-
Patent number: 9959202
Abstract: A computing memory includes an execution unit and an access processor coupled with a memory system, where the execution unit and the access processor are logically separated units. The execution unit is for processing operand data. The access processor is for providing operand data and configuration data to the execution unit. The access processor reads operand data from the memory system and sends the operand data to the execution unit. The execution unit executes the operand data according to the provided configuration data. The access processor includes information about execution times of operations of the execution unit for the provided configuration. The access processor reserves time-slots for writing execution unit results provided by the execution unit into selected locations in the memory system based on the information about the execution times, upon sending at least one of the operand data and the configuration data to the execution unit.
Type: Grant
Filed: September 16, 2015
Date of Patent: May 1, 2018
Assignee: International Business Machines Corporation
Inventors: Jan Van Lunteren, Heiner Giefers
-
Publication number: 20180074962
Abstract: A computing system comprising a central processing unit (CPU), a memory processor and a memory device comprising a data array and an index array. The computing system is configured to store data lines comprising data elements in the data array and to store index lines comprising a plurality of memory indices in the index array. The memory indices indicate memory positions of data elements in the data array with respect to a start address of the data array. There is further provided a related computer implemented method and a related computer program product.
Type: Application
Filed: September 9, 2016
Publication date: March 15, 2018
Inventors: Heiner Giefers, Raphael Polig, Jan Van Lunteren
-
Patent number: 9870315
Abstract: A computing memory includes an execution unit and an access processor coupled with a memory system, where the execution unit and the access processor are logically separated units. The execution unit is for processing operand data. The access processor is for providing operand data and configuration data to the execution unit. The access processor reads operand data from the memory system and sends the operand data to the execution unit. The execution unit executes the operand data according to the provided configuration data. The access processor includes information about execution times of operations of the execution unit for the provided configuration. The access processor reserves time-slots for writing execution unit results provided by the execution unit into selected locations in the memory system based on the information about the execution times, upon sending at least one of the operand data and the configuration data to the execution unit.
Type: Grant
Filed: January 27, 2017
Date of Patent: January 16, 2018
Assignee: International Business Machines Corporation
Inventors: Jan Van Lunteren, Heiner Giefers
-
Patent number: 9779061
Abstract: An iterative refinement apparatus configured to generate data defining a solution vector x for a linear system represented by Ax=b, where A is a predetermined matrix and b is a predetermined vector. An outer solver processes input data, defining the matrix A and vector b, in accordance with an outer loop of an iterative refinement method to generate said data defining the solution vector x. An inner solver processes data items in accordance with an inner loop of the iterative refinement method. The inner solver is configured to process said data items having variable bit-width and data format. A precision controller determines the bit-widths and data formats of the data items adaptively in dependence on the results of the processing steps of the iterative refinement method; the precision controller configured to control operation of the inner solver for processing said data items with the bit-widths and data formats.
Type: Grant
Filed: February 16, 2015
Date of Patent: October 3, 2017
Assignee: International Business Machines Corporation
Inventors: Christoph M. Angerer, Konstantinos Bekas, Alessandro Curioni, Silvio Dragone, Heiner Giefers, Christoph Hagleitner, Raphael C. Polig
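The outer-loop and inner-loop structure in the abstract above follows the general pattern of mixed-precision iterative refinement, which the following Python sketch illustrates in software. Several parts are assumptions made for the example and do not come from the patent: reduced precision is simulated by rounding to a chosen number of significand bits, the inner solver is Gaussian elimination without pivoting (so A is assumed to be diagonally dominant), and the precision controller is a hypothetical rule that widens the format by 4 bits whenever the residual shrinks by less than a factor of 10.

```python
import math


def round_to_bits(value: float, mantissa_bits: int) -> float:
    """Simulate reduced precision by rounding to `mantissa_bits` significand bits."""
    if value == 0.0:
        return 0.0
    mantissa, exponent = math.frexp(value)  # value == mantissa * 2**exponent
    scale = 1 << mantissa_bits
    return math.ldexp(round(mantissa * scale) / scale, exponent)


def solve_low_precision(A, b, bits):
    """Inner solver: Gaussian elimination with every result rounded to `bits`."""
    n = len(b)
    q = lambda v: round_to_bits(v, bits)
    M = [[q(A[i][j]) for j in range(n)] + [q(b[i])] for i in range(n)]
    for col in range(n):
        for row in range(col + 1, n):
            factor = q(M[row][col] / M[col][col])
            for j in range(col, n + 1):
                M[row][j] = q(M[row][j] - q(factor * M[col][j]))
    x = [0.0] * n
    for i in reversed(range(n)):  # back substitution
        s = M[i][n]
        for j in range(i + 1, n):
            s = q(s - q(M[i][j] * x[j]))
        x[i] = q(s / M[i][i])
    return x


def iterative_refinement(A, b, bits=8, tol=1e-12, max_iter=50):
    """Outer solver: residuals in double precision, corrections in low precision."""
    n = len(b)
    x = [0.0] * n
    prev_norm = float("inf")
    for _ in range(max_iter):
        r = [b[i] - sum(A[i][j] * x[j] for j in range(n)) for i in range(n)]
        norm = max(abs(v) for v in r)
        if norm <= tol:
            break
        # Hypothetical precision controller: widen the inner format when
        # the residual shrank by less than 10x in the previous step.
        if norm > 0.1 * prev_norm:
            bits = min(bits + 4, 52)
        prev_norm = norm
        d = solve_low_precision(A, r, bits)
        x = [xi + di for xi, di in zip(x, d)]
    return x, bits
```

For example, `iterative_refinement([[4.0, 1.0], [1.0, 3.0]], [1.0, 2.0])` returns a solution close to x = (1/11, 7/11) together with the final bit-width, even though each correction is computed with far fewer significand bits than a double.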
-
Patent number: 9703573
Abstract: Embodiments are directed to a heterogeneous system for dynamically mapping library calls to one of a plurality of processing platforms. The plurality of processing platforms include a central processing unit (CPU) and one or more acceleration units as co-processing units. The system includes an interposer configured to intercept the library calls from an application programming interface (API) and to map the library calls to one of the plurality of processing platforms according to a classification scheme based on an affinity table. The affinity table includes call signatures representing input parameters of sample library calls. Furthermore, the affinity table includes one or more performance parameters of the sample library calls for each of the processing platforms. The performance parameters indicate the performance of the sample library calls on the respective processing platform. Also included are a related method and a related computer program product.
Type: Grant
Filed: April 26, 2016
Date of Patent: July 11, 2017
Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
Inventors: Heiner Giefers, Raphael Polig
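One possible reading of the affinity-table mapping described above is sketched below in Python. The abstract does not specify the classification scheme, so this example assumes a nearest-neighbour lookup over call signatures and uses measured runtime as the only performance parameter. The class name, field names, platform labels, and sample figures are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class AffinityEntry:
    """One sample library call: its input signature and per-platform runtime."""
    signature: tuple[int, ...]       # e.g. matrix dimensions of a sample call
    runtime_ms: dict[str, float]     # performance parameter per platform


def choose_platform(table: list[AffinityEntry], signature: tuple[int, ...]) -> str:
    """Map an intercepted call to the platform that was fastest for the
    most similar sample call (assumed nearest-neighbour classification)."""
    nearest = min(
        table,
        key=lambda entry: sum((a - b) ** 2 for a, b in zip(entry.signature, signature)),
    )
    return min(nearest.runtime_ms, key=nearest.runtime_ms.get)


# Illustrative table with made-up sample measurements.
SAMPLE_TABLE = [
    AffinityEntry((64, 64, 64), {"cpu": 0.1, "fpga": 0.5}),
    AffinityEntry((4096, 4096, 4096), {"cpu": 900.0, "fpga": 200.0}),
]
```

With this table, a small call such as `choose_platform(SAMPLE_TABLE, (128, 128, 128))` stays on the CPU, while a large call such as `(3000, 3000, 3000)` is routed to the accelerator.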
-
Publication number: 20170147531
Abstract: According to some embodiments, a computer-implemented method for performing sparse matrix dense matrix (SpMM) multiplication on a single field programmable gate array (FPGA) module comprising a k-stage pipeline is described. The method may include interleaving k-stage threads on the k-stage pipeline comprising a plurality of threads t_0 to t_{k-1}, wherein a first result of thread t_0 is ready one cycle after the first input of thread t_{k-1} is fed into the pipeline, and outputting a result matrix Y.
Type: Application
Filed: October 31, 2016
Publication date: May 25, 2017
Inventors: Costas Bekas, Alessandro Curioni, Heiner Giefers, Christoph Hagleitner, Raphael C. Polig, Peter W. J. Staar
-
Publication number: 20170139625
Abstract: A computing memory includes an execution unit and an access processor coupled with a memory system, where the execution unit and the access processor are logically separated units. The execution unit is for processing operand data. The access processor is for providing operand data and configuration data to the execution unit. The access processor reads operand data from the memory system and sends the operand data to the execution unit. The execution unit executes the operand data according to the provided configuration data. The access processor includes information about execution times of operations of the execution unit for the provided configuration. The access processor reserves time-slots for writing execution unit results provided by the execution unit into selected locations in the memory system based on the information about the execution times, upon sending at least one of the operand data and the configuration data to the execution unit.
Type: Application
Filed: January 27, 2017
Publication date: May 18, 2017
Inventors: Jan Van Lunteren, Heiner Giefers
-
Publication number: 20170097883
Abstract: Differential data access. A method for storing and reading data elements to and from a memory is provided. The method includes storing a data element as a base word in a first precision, storing at least one delta word including additional information related to a second precision version of the stored data element, and reading the base word and the at least one delta word of the stored data element to access the data element in the second precision.
Type: Application
Filed: December 21, 2016
Publication date: April 6, 2017
Inventors: Christoph M. Angerer, Heiner Giefers, Raphael Polig
-
Patent number: 9558156
Abstract: According to some embodiments, a computer-implemented method for performing sparse matrix dense matrix (SpMM) multiplication on a single field programmable gate array (FPGA) module comprising a k-stage pipeline is described. The method may include interleaving k-stage threads on the k-stage pipeline comprising a plurality of threads t_0 to t_{k-1}, wherein a first result of thread t_0 is ready one cycle after the first input of thread t_{k-1} is fed into the pipeline, and outputting a result matrix Y.
Type: Grant
Filed: November 24, 2015
Date of Patent: January 31, 2017
Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
Inventors: Costas Bekas, Alessandro Curioni, Heiner Giefers, Christoph Hagleitner, Raphael C. Polig, Peter W. J. Staar
-
Publication number: 20170024356
Abstract: Embodiments of the present invention provide methods, computer program products, and systems for solving a linear equation system using a hardware-implemented extended solver, wherein a calculation precision is adapted in each iteration step of a solving process is provided. Embodiments of the present invention can be used to perform on-the-fly interpolations using the data associated with the highest resolution of the three-dimensional finite element voxel model to a lower resolution than the highest resolution as well as to perform solving computations of the solving process in the lower resolution.
Type: Application
Filed: July 22, 2015
Publication date: January 26, 2017
Inventors: Christoph M. Angerer, Konstantinos Bekas, Alessandro Curioni, Heiner Giefers, Christoph Hagleitner, Yves G. Ineichen, Raphael Polig
-
Publication number: 20160170652
Abstract: Differential data access. A method for storing and reading data elements to and from a memory is provided. The method includes storing a data element as a base word in a first precision, storing at least one delta word including additional information related to a second precision version of the stored data element, and reading the base word and the at least one delta word of the stored data element to access the data element in the second precision.
Type: Application
Filed: December 14, 2015
Publication date: June 16, 2016
Inventors: Christoph M. Angerer, Heiner Giefers, Raphael Polig