Patents by Inventor Daniel Thuerck

Daniel Thuerck has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Publication number: 20230315479
    Abstract: A method for supporting throughput-oriented computing includes a single instruction multiple threads (SIMT) program configured to launch a plurality of warps, where each respective warp of the plurality of warps comprises threads to be executed in lockstep within that warp. Individual warp sizes of the plurality of warps are used as a runtime parameter for the SIMT program, such that a parameterized SIMT program is provided, which is parameterizable via the individual warp sizes, and the parameterized SIMT program is executed on a single instruction multiple data (SIMD) vector architecture.
    Type: Application
    Filed: February 25, 2021
    Publication date: October 5, 2023
    Inventor: Daniel Thuerck
  • Publication number: 20230120516
    Abstract: A method for optimizing a neural network includes identifying parameters of a computation graph of the neural network that depend on input data as a computation part, and parameters of the computation graph that are independent of the input data as a pre-evaluation part. The method splits the computation graph into the pre-evaluation part and the computation part, and generates and applies a wrapper that performs a transparent mapping of data layouts of the pre-evaluation part.
    Type: Application
    Filed: January 11, 2022
    Publication date: April 20, 2023
    Inventors: Nicolas Weber, Daniel Thuerck
  • Publication number: 20230024035
    Abstract: A system, method, and computer-readable medium for synthesizing zero-copy sparse matrix factorization operations in heterogeneous compute systems are provided. The system includes a host and an accelerator device. The host device is configured to divide an input matrix into a plurality of blocks which are transferred to a memory of the accelerator device. The host device is also configured to generate at least one index buffer that includes pointers to the blocks in the accelerator's memory, where each index buffer represents a frontal matrix associated with a matrix decomposition algorithm. The host processor is configured to receive one or more kernels configured to process the index buffer(s) on an accelerator device. The index buffers are processed by the accelerator device and the modified block data is written back to a memory of the host device to generate a factorized output matrix.
    Type: Application
    Filed: November 5, 2021
    Publication date: January 26, 2023
    Inventors: Daniel Thuerck, Nicolas Weber
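
The index-buffer idea described in the abstract of publication 20230024035 can be illustrated with a small Python sketch. This is not the patented method: the function names (split_into_blocks, upload, build_index_buffer, scale_kernel), the flat list standing in for accelerator memory, and the toy scaling kernel are all assumptions made for illustration. The sketch only shows how a host might tile a matrix, place the tiles in a separate memory, and let a kernel reach selected tiles through a list of positions instead of copying them.

```python
# Hypothetical illustration only: names, data layout, and the toy kernel
# are assumptions and are not taken from the patent application.


def split_into_blocks(matrix, block_size):
    """Divide a matrix (list of lists) into block_size x block_size tiles.

    Returns a dict mapping (block_row, block_col) -> tile (list of lists).
    """
    n_rows = len(matrix)
    n_cols = len(matrix[0]) if matrix else 0
    blocks = {}
    for bi in range(0, n_rows, block_size):
        for bj in range(0, n_cols, block_size):
            tile = [row[bj:bj + block_size] for row in matrix[bi:bi + block_size]]
            blocks[(bi // block_size, bj // block_size)] = tile
    return blocks


def upload(blocks):
    """Stand-in for a host-to-accelerator transfer.

    Stores all tiles in one flat list ("device memory") and returns it
    together with a map from block coordinates to positions in that list.
    """
    device_memory = []
    address_of = {}
    for coord, tile in blocks.items():
        address_of[coord] = len(device_memory)
        device_memory.append(tile)
    return device_memory, address_of


def build_index_buffer(address_of, coords):
    """Collect the positions ("pointers") of the blocks that form one front."""
    return [address_of[c] for c in coords]


def scale_kernel(device_memory, index_buffer, factor):
    """Toy kernel: scale every referenced block in place, without copying it."""
    for addr in index_buffer:
        for row in device_memory[addr]:
            for j in range(len(row)):
                row[j] *= factor


matrix = [
    [1, 2, 3, 4],
    [5, 6, 7, 8],
    [9, 10, 11, 12],
    [13, 14, 15, 16],
]
blocks = split_into_blocks(matrix, 2)
device_memory, address_of = upload(blocks)

# One index buffer that references the two diagonal blocks.
front = build_index_buffer(address_of, [(0, 0), (1, 1)])
scale_kernel(device_memory, front, 10)
```

In this sketch only the blocks named in the index buffer are touched, and the kernel never assembles them into a contiguous matrix. A real factorization kernel would replace the scaling step, and the write-back to host memory is omitted.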