Patents by Inventor Tomasz Czajkowski

Tomasz Czajkowski has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Publication number: 20230359695
    Abstract: Matrix multiplication systolic array feed methods and related processing element (PE) microarchitectures for efficiently implementing a systolic array generic matrix multiplier (SGEMM) in integrated circuits are provided. A systolic array architecture may include a processing element array, a column feeder array, and a row feeder array. A bandwidth of external memory may be reduced by a factor of reduction based on interleaving of the matrix data via a feeding pattern of the column feeder array and the row feeder array.
    Type: Application
    Filed: July 17, 2023
    Publication date: November 9, 2023
    Inventors: Jack Z. Yinger, Andrew Ling, Tomasz Czajkowski, Davor Capalija, Eriko Nurvitadhi, Deborah Marr
  • Publication number: 20230064381
    Abstract: Matrix multiplication systolic array feed methods and related processing element (PE) microarchitectures for efficiently implementing a systolic array generic matrix multiplier (SGEMM) in integrated circuits are provided. A systolic array architecture may include a processing element array, a column feeder array, and a row feeder array. A bandwidth of external memory may be reduced by a factor of reduction based on interleaving of the matrix data via a feeding pattern of the column feeder array and the row feeder array.
    Type: Application
    Filed: May 9, 2022
    Publication date: March 2, 2023
    Inventors: Jack Z. Yinger, Andrew Ling, Tomasz Czajkowski, Davor Capalija, Eriko Nurvitadhi, Deborah Marr
  • Patent number: 11328037
    Abstract: Matrix multiplication systolic array feed methods and related processing element (PE) microarchitectures for efficiently implementing a systolic array generic matrix multiplier (SGEMM) in integrated circuits are provided. A systolic array architecture may include a processing element array, a column feeder array, and a row feeder array. A bandwidth of external memory may be reduced by a factor of reduction based on interleaving of the matrix data via a feeding pattern of the column feeder array and the row feeder array.
    Type: Grant
    Filed: July 7, 2017
    Date of Patent: May 10, 2022
    Assignee: Intel Corporation
    Inventors: Jack Z. Yinger, Andrew Ling, Tomasz Czajkowski, Davor Capalija, Eriko Nurvitadhi, Deborah Marr
  • Patent number: 11256836
    Abstract: Power dissipation in integrated circuits may be reduced by efficient implementation of high level programming on the integrated circuits. As the high level programming logic is implemented on the integrated circuits, data inputs are disabled based upon branches and/or data that is not used by the high level programming.
    Type: Grant
    Filed: December 14, 2017
    Date of Patent: February 22, 2022
    Assignee: Intel Corporation
    Inventor: Tomasz Czajkowski
  • Patent number: 10599404
    Abstract: A method of compiling program code includes determining if the program code controls a programmable logic device to execute other program code. The program code is a parallel program having a barrier function call for a group of threads. If it is determined that the program code is to control the programmable logic device, then the program code is transformed by replacing the barrier function call with control logic inserted into the program code such that the transformed program code remains a parallel program and maintains synchronization among the group of threads. A compiler system that compiles program code with a barrier function call for a group of threads is also described.
    Type: Grant
    Filed: June 1, 2012
    Date of Patent: March 24, 2020
    Assignee: Altera Corporation
    Inventors: David Neto, Deshanand Singh, Tomasz Czajkowski, John Stuart Freeman, Tian Yi David Han
  • Publication number: 20190042673
    Abstract: Power dissipation in integrated circuits may be reduced by efficient implementation of high level programming on the integrated circuits. As the high level programming logic is implemented on the integrated circuits, data inputs are disabled based upon branches and/or data that is not used by the high level programming.
    Type: Application
    Filed: December 14, 2017
    Publication date: February 7, 2019
    Inventor: Tomasz Czajkowski
  • Publication number: 20190012295
    Abstract: Matrix multiplication systolic array feed methods and related processing element (PE) microarchitectures for efficiently implementing a systolic array generic matrix multiplier (SGEMM) in integrated circuits are provided. A systolic array architecture may include a processing element array, a column feeder array, and a row feeder array. A bandwidth of external memory may be reduced by a factor of reduction based on interleaving of the matrix data via a feeding pattern of the column feeder array and the row feeder array.
    Type: Application
    Filed: July 7, 2017
    Publication date: January 10, 2019
    Inventors: Jack Z. Yinger, Andrew Ling, Tomasz Czajkowski, Davor Capalija, Eriko Nurvitadhi, Deborah Marr
  • Patent number: 9904514
    Abstract: An integrated circuit may be provided with a specialized processing block that performs floating-point addition and subtraction operations. For this purpose, the specialized processing block includes a fused adder and subtractor stage with an adder circuit and a subtractor circuit. The adder and subtractor circuits share an alignment stage for aligning the mantissas of incoming floating-point numbers and provide a simplified normalization stage with one right shifter and one left shifter. The specialized processing blocks may be arranged in rows or columns such that an input of a first specialized processing block is directly coupled to an output of a second specialized processing block and an input of the second specialized processing block is directly coupled to an output of the first specialized processing block.
    Type: Grant
    Filed: October 6, 2015
    Date of Patent: February 27, 2018
    Assignee: Altera Corporation
    Inventor: Tomasz Czajkowski
  • Patent number: 9639326
    Abstract: An integrated circuit is provided that performs floating-point addition or subtraction operations involving at least three floating-point numbers. The floating-point numbers are pre-processed by dynamically extending the number of mantissa bits, determining the floating-point number with the biggest exponent, and shifting the mantissa of the other floating-point numbers to the right. Each extended mantissa has at least twice the number of bits of the mantissa entering the floating-point operation. The exact bit extension is dependent on the number of floating-point numbers to be added. The mantissas of all floating-point numbers with an exponent smaller than the biggest exponent are shifted to the right. The number of right shift bits is dependent on the difference between the biggest exponent and the respective floating-point exponent.
    Type: Grant
    Filed: June 14, 2016
    Date of Patent: May 2, 2017
    Assignee: Altera Corporation
    Inventor: Tomasz Czajkowski
  • Patent number: 9626218
    Abstract: Circuitry for dynamically ordering the execution of multiple threads in parallel is presented. The circuitry may include a control circuit that controls the execution of multiple subsets of threads using multiple processing units in parallel. Each of the plurality of processing units may be associated with an adjustable order thread issuer that may receive a subset of threads and an order in which to execute the subset of threads from the control circuit. The adjustable order thread issuer may manage the processing unit by providing each thread from the subset of threads for execution to the processing unit in the specified order. The adjustable order thread issuer may adjust the order in which threads are issued in an effort to optimize shared resource usage and thus improve the performance of a multithreaded application.
    Type: Grant
    Filed: March 10, 2014
    Date of Patent: April 18, 2017
    Assignee: Altera Corporation
    Inventors: Dmitry Denisenko, Tomasz Czajkowski
  • Publication number: 20160291934
    Abstract: An integrated circuit is provided that performs floating-point addition or subtraction operations involving at least three floating-point numbers. The floating-point numbers are pre-processed by dynamically extending the number of mantissa bits, determining the floating-point number with the biggest exponent, and shifting the mantissa of the other floating-point numbers to the right. Each extended mantissa has at least twice the number of bits of the mantissa entering the floating-point operation. The exact bit extension is dependent on the number of floating-point numbers to be added. The mantissas of all floating-point numbers with an exponent smaller than the biggest exponent are shifted to the right. The number of right shift bits is dependent on the difference between the biggest exponent and the respective floating-point exponent.
    Type: Application
    Filed: June 14, 2016
    Publication date: October 6, 2016
    Inventor: Tomasz Czajkowski
  • Patent number: 9430425
    Abstract: Systems and methods for resource sharing of pipelined circuitry of an integrated circuit (IC) are provided. For example, in one embodiment, a method for sharing a functional unit of an integrated circuit (IC) includes receiving two or more threads configured to access the functional unit through two or more data entry points associated with corresponding data exit points configured to receive processed thread data. The method further includes arbitrating the processing of the two or more threads by the functional unit to obtain the processed thread data. To arbitrate, the exit points that cannot receive additional data are determined. Threads are only received from data entry points with corresponding data exit points that can receive additional data. The processed output data is provided to a corresponding exit point.
    Type: Grant
    Filed: December 27, 2012
    Date of Patent: August 30, 2016
    Assignee: Altera Corporation
    Inventor: Tomasz Czajkowski
  • Patent number: 9405728
    Abstract: An integrated circuit is provided that performs floating-point addition or subtraction operations involving at least three floating-point numbers. The floating-point numbers are pre-processed by dynamically extending the number of mantissa bits, determining the floating-point number with the biggest exponent, and shifting the mantissa of the other floating-point numbers to the right. Each extended mantissa has at least twice the number of bits of the mantissa entering the floating-point operation. The exact bit extension is dependent on the number of floating-point numbers to be added. The mantissas of all floating-point numbers with an exponent smaller than the biggest exponent are shifted to the right. The number of right shift bits is dependent on the difference between the biggest exponent and the respective floating-point exponent.
    Type: Grant
    Filed: September 5, 2013
    Date of Patent: August 2, 2016
    Assignee: Altera Corporation
    Inventor: Tomasz Czajkowski
  • Patent number: 9135087
    Abstract: Systems and methods for limiting resource usage of a kernel of an integrated circuit are provided. For example, in one embodiment a method for limiting a number of workgroups that may simultaneously access a kernel of an integrated circuit (IC) includes determining a threshold number of workgroups that may access the kernel simultaneously. A thread of execution is received. The thread of execution is allowed to access the kernel when the threshold number of workgroups would not be exceeded by the thread of execution accessing the kernel.
    Type: Grant
    Filed: December 27, 2012
    Date of Patent: September 15, 2015
    Assignee: Altera Corporation
    Inventors: Tomasz Czajkowski, John Freeman, Peter Yiannacouras
  • Publication number: 20150067010
    Abstract: An integrated circuit is provided that performs floating-point addition or subtraction operations involving at least three floating-point numbers. The floating-point numbers are pre-processed by dynamically extending the number of mantissa bits, determining the floating-point number with the biggest exponent, and shifting the mantissa of the other floating-point numbers to the right. Each extended mantissa has at least twice the number of bits of the mantissa entering the floating-point operation. The exact bit extension is dependent on the number of floating-point numbers to be added. The mantissas of all floating-point numbers with an exponent smaller than the biggest exponent are shifted to the right. The number of right shift bits is dependent on the difference between the biggest exponent and the respective floating-point exponent.
    Type: Application
    Filed: September 5, 2013
    Publication date: March 5, 2015
    Applicant: Altera Corporation
    Inventor: Tomasz Czajkowski
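
Illustrative sketch (not taken from any of the patents listed above): the multi-operand floating-point addition abstracts (patent numbers 9405728 and 9639326) describe extending each mantissa, finding the biggest exponent, and right-shifting the other mantissas by the exponent difference. The Python below models that idea in software only. The function name add_many, the 24-bit input mantissa width, and the choice to extend each mantissa to exactly twice its width are assumptions made for illustration; the patents state that the exact extension depends on the number of operands.

```python
import math

# Assumed input mantissa width (single precision, leading 1 included).
MANT_BITS = 24


def add_many(values):
    """Add three or more floats by aligning extended integer mantissas.

    Hypothetical software model: Python integers are unbounded, so the
    extra carry bits a hardware adder would need are implicit here.
    """
    operands = []
    for x in values:
        if x == 0.0:
            continue  # zeros contribute nothing and carry no useful exponent
        frac, exp = math.frexp(x)  # x == frac * 2**exp, 0.5 <= |frac| < 1
        sign = -1 if frac < 0 else 1
        mant = int(abs(frac) * (1 << MANT_BITS))  # truncate to MANT_BITS bits
        operands.append((sign, exp, mant))
    if not operands:
        return 0.0

    # Step 1: find the biggest exponent among the operands.
    max_exp = max(exp for _, exp, _ in operands)

    total = 0
    for sign, exp, mant in operands:
        # Step 2: extend the mantissa to twice its original width.
        extended = mant << MANT_BITS
        # Step 3: shift right by the distance to the biggest exponent.
        aligned = extended >> (max_exp - exp)
        total += sign * aligned

    # Step 4: scale the integer sum back into a float.
    return math.ldexp(total, max_exp - 2 * MANT_BITS)
```

In this model the extra low-order bits let a small operand survive alignment against a much larger one, so add_many([1.0, 2.0**-30, -1.0]) keeps the 2**-30 term even though it would fall below the last bit of a 24-bit mantissa aligned to 1.0.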