Patents by Inventor Derek Gladding

Derek Gladding has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 11748251
    Abstract: Embodiments of the present disclosure include systems and methods for storing tensors in memory based on depth. In some embodiments, for each of a plurality of sets of elements in a three-dimensional (3D) matrix, a position is determined along a height axis and width axis of the 3D matrix. At the determined position, a set of elements are identified along a depth axis of the 3D matrix. The set of elements are stored in a contiguous block of memory.
    Type: Grant
    Filed: January 8, 2021
    Date of Patent: September 5, 2023
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Nitin Garegrat, Shankar Narayan, Derek Gladding
  • Publication number: 20220222174
    Abstract: Embodiments of the present disclosure include systems and methods for storing tensors in memory based on depth. In some embodiments, for each of a plurality of sets of elements in a three-dimensional (3D) matrix, a position is determined along a height axis and width axis of the 3D matrix. At the determined position, a set of elements are identified along a depth axis of the 3D matrix. The set of elements are stored in a contiguous block of memory.
    Type: Application
    Filed: January 8, 2021
    Publication date: July 14, 2022
    Inventors: Nitin Garegrat, Shankar Narayan, Derek Gladding
  • Publication number: 20220222318
    Abstract: Embodiments of the present disclosure include systems and methods for performing tensor operations using a programmable control engine. A command queue is configured to receive a command from a software application. A configuration storage is configured to store a plurality of configurations. A matrix multiplication unit is configured to perform matrix multiplication operations. Memory is configured to store matrices. A control engine is configured to retrieve the command from the command queue; retrieve a configuration from the configuration storage based on the command; generate, based on the command and the configuration, instructions for the matrix multiplication unit to perform a set of matrix multiplication operations on first and second matrices stored in the memory; send the instructions to the matrix multiplication unit to configure the matrix multiplication unit to output results of the set of matrix multiplication operations; and store the results in a third matrix in the memory.
    Type: Application
    Filed: January 8, 2021
    Publication date: July 14, 2022
    Inventors: Nitin Garegrat, Derek Gladding, Shankar Narayan, Sujatha Santhanaraman, Jayadev Velagandula
  • Patent number: 8024394
    Abstract: Included are embodiments of a Multiply-Accumulate Unit to process multiple format floating point operands. For short format operands, embodiments of the Multiply-Accumulate Unit are configured to process data with twice the throughput of long and mixed format data. At least one embodiment can include a short exponent calculation component configured to receive short format data, a long exponent calculation component configured to receive long format data, and a mixed exponent calculation component configured to receive short exponent data, the mixed exponent calculation component further configured to receive long format data. Embodiments also include a mantissa datapath configured for implementation to accommodate processing of long, mixed, and short floating point operands.
    Type: Grant
    Filed: February 6, 2007
    Date of Patent: September 20, 2011
    Assignee: Via Technologies, Inc.
    Inventors: Boris Prokopenko, Timour Paltashev, Derek Gladding
  • Publication number: 20110208946
    Abstract: Disclosed are various embodiments of a stream processing unit for single instruction multiple data (SIMD) processing, wherein the stream processing unit executes a stage of a Multiply-Accumulate calculation. In one embodiment, the stream processing unit comprises a plurality of scalar arithmetic logic units (ALUs) configured to receive data having a plurality of data types. The number and type of scalar ALUs corresponds to an SIMD factor. In one embodiment, the scalar ALUs are executed sequentially with a delay being introduced in between execution of each of the scalar ALUs, wherein the delay corresponds to the SIMD factor.
    Type: Application
    Filed: May 4, 2011
    Publication date: August 25, 2011
    Applicant: VIA TECHNOLOGIES, INC.
    Inventors: Boris Prokopenko, Timour Paltashev, Derek Gladding
  • Patent number: 7675521
    Abstract: Systems for performing rasterization are described. At least one embodiment includes a span generator for performing rasterization. In accordance with such embodiments, the span generator comprises functionals representing a scissoring box, loaders configured to convert the functionals from a general form to a special case form, edge generators configured to read the special case form of the scissoring box, whereby the special case form simplifies calculations by the edge generators. The span generator further comprises sorters configured to compute the intersection of half-planes, wherein edges of the intersection are generated by the edge generators and a span buffer configured to temporarily store spans before tiling.
    Type: Grant
    Filed: March 11, 2008
    Date of Patent: March 9, 2010
    Assignee: VIA Technologies, Inc.
    Inventors: Konstantine Iourcha, Boris Prokopenko, Timour Paltashev, Derek Gladding
  • Patent number: 7551174
    Abstract: A low-cost high-speed programmable rasterizer accepting an input set of functionals representing a triangle, clipping planes and a scissoring box, and producing multiple spans per clock cycle as output. A Loader converts the input set from a general form to a special case form accepted by a set of Edge Generators, the restricted input format accepted by the Edge Generators contributing to their efficient hardware implementation.
    Type: Grant
    Filed: December 23, 2003
    Date of Patent: June 23, 2009
    Assignee: Via Technologies, Inc.
    Inventors: Konstantine Iourcha, Boris Prokopenko, Timour Paltashev, Derek Gladding
  • Publication number: 20080158252
    Abstract: Systems for performing rasterization are described. At least one embodiment includes a span generator for performing rasterization. In accordance with such embodiments, the span generator comprises functionals representing a scissoring box, loaders configured to convert the functionals from a general form to a special case form, edge generators configured to read the special case form of the scissoring box, whereby the special case form simplifies calculations by the edge generators. The span generator further comprises sorters configured to compute the intersection of half-planes, wherein edges of the intersection are generated by the edge generators and a span buffer configured to temporarily store spans before tiling.
    Type: Application
    Filed: March 11, 2008
    Publication date: July 3, 2008
    Applicant: VIA TECHNOLOGIES, INC.
    Inventors: Konstantine Iourcha, Boris Prokopenko, Timour Paltashev, Derek Gladding
  • Publication number: 20070186082
    Abstract: Included are embodiments of a stream processor configured to process data in any of a plurality of different formats. At least one embodiment of the stream processor includes a first scalar arithmetic logic unit (ALU), configured to process a plurality of sets of short data in response to a received short format control signal from an instruction set and process a set of long data in response to a received long format control signal from the instruction set. Embodiments of the processor also include a second arithmetic logic unit (ALU), configured to receive the processed data from the first arithmetic logic unit (ALU) and process the input data and the processed data according to a control signal from the instruction set. Still other embodiments include a special function unit (SFU) configured to provide additional computational functionality to the first ALU and the second ALU.
    Type: Application
    Filed: February 6, 2007
    Publication date: August 9, 2007
    Inventors: Boris Prokopenko, Timour Paltashev, Derek Gladding
  • Publication number: 20070185953
    Abstract: Included are embodiments of a Multiply-Accumulate Unit to process multiple format floating point operands. For short format operands, embodiments of the Multiply-Accumulate Unit are configured to process data with twice the throughput of long and mixed format data. At least one embodiment can include a short exponent calculation component configured to receive short format data, a long exponent calculation component configured to receive long format data, and a mixed exponent calculation component configured to receive short exponent data, the mixed exponent calculation component further configured to receive long format data. Embodiments also include a mantissa datapath configured for implementation to accommodate processing of long, mixed, and short floating point operands.
    Type: Application
    Filed: February 6, 2007
    Publication date: August 9, 2007
    Inventors: Boris Prokopenko, Timour Paltashev, Derek Gladding
  • Publication number: 20070030277
    Abstract: A method for processing graphics data packets comprises allocating an entity for the graphics data packet of vertices, triangles, and/or pixels in one or more execution blocks that receives an assignment from a global spreader to process the graphics data packets. A pointer, which points to the allocated entity, communicates a pointer to a data mover, and the data mover loads some graphics data packets into a memory. A number of processing stages may follow such that one or more floating point or integer instructions is executed on the graphics data packets, as controlled by a thread controller. Upon completion of calculations on the graphics data packets, the allocated entity may be deleted and the graphics data packets may be communicated to another execution block or as directed by the global spreader.
    Type: Application
    Filed: August 8, 2005
    Publication date: February 8, 2007
    Inventors: Boris Prokopenko, Timour Paltashev, Derek Gladding, Jeremiah Childs
  • Publication number: 20070030280
    Abstract: A parallel graphics processor having a spreader coupled to a plurality of execution components is disclosed. The spreader maintains status information for each of the plurality of execution components and establishes a priority for each of the plurality of execution blocks to receive a graphics entity to be processed. The priorities are arranged in accordance with the maintained status information and a type of graphics entity to be processed. The spreader communicates a request to a selected execution component to allocate the graphics entity to be processed in its entity descriptor table and copies graphics entity data to the selected execution component. The spreader indexes assignment of the graphics entity in its logical table and subsequently receives indication from the selected instruction execution component that the graphics entity has been processed. Thereafter, graphics images may be presented on a display.
    Type: Application
    Filed: August 8, 2005
    Publication date: February 8, 2007
    Inventors: Timour Paltashev, Boris Prokopenko, Derek Gladding
  • Patent number: 7159003
    Abstract: A system and method for converting two binary digits into redundant sign-digit format. The system comprises a first adder for adding the binary digits together to generate a first result. A second adder adds an input carry from a previous digit to the first result and subtracts a value equal to the radix of the binary digits from the first result if the first result is greater than an initial threshold in order to generate an intermediate result. The system further includes a third adder for adding a second input carry from the previous digit to the intermediate result and subtracting the value of the radix from the intermediate result if the intermediate result is greater than a prescribed value such that the addition of the two binary digits is in redundant sign-digit format.
    Type: Grant
    Filed: February 21, 2003
    Date of Patent: January 2, 2007
    Assignee: S3 Graphics Co., Ltd.
    Inventors: Boris Prokopenko, Timour Paltashev, Derek Gladding
  • Patent number: 7146486
    Abstract: A scalar processor that includes a plurality of scalar arithmetic logic units and a special function unit. Each scalar unit performs, in a different time interval, the same operation on a different data item, where each different time interval is one of a plurality of successive, adjacent time intervals. Each unit provides an output data item in the time interval in which the unit performs the operation and provides a processed data item in the last of the successive, adjacent time intervals. The special function unit provides a special function computation for the output data item of a selected one of the scalar units, in the time interval in which the selected scalar unit performs the operation, so as to avoid a conflict in use among the scalar units. A vector processing unit includes an input data buffer, the scalar processor, and an output orthogonal converter.
    Type: Grant
    Filed: January 29, 2003
    Date of Patent: December 5, 2006
    Assignee: S3 Graphics Co., Ltd.
    Inventors: Boris Prokopenko, Timour Paltashev, Derek Gladding
  • Patent number: 7098924
    Abstract: A method and apparatus for obtaining an attribute in homogenous space. After obtaining the vertices of a triangle, the world space coordinates and the attribute of each vertex are transformed to homogeneous coordinates and an attribute in viewer space. Then a set of homogenous coefficients of the triangle is computed based on the viewer space vertex homogeneous coordinates, and the viewer space coordinates of each vertex are projected to coordinates in screen space. Pixels in the screen space that are affected by the projected triangle are determined. For each pixel affected by the triangle, a set of barycentric coefficients in viewer space is computed, based on the homogenous triangle coefficients, and a linear interpolation is performed based on the set of viewer space barycentric coefficients and the viewer space attributes of the triangle vertices to obtain the attribute of the pixel affected by the triangle.
    Type: Grant
    Filed: September 24, 2003
    Date of Patent: August 29, 2006
    Assignee: VIA Technologies, Inc.
    Inventors: Boris Prokopenko, Timour Paltashev, Derek Gladding
  • Publication number: 20050134603
    Abstract: A low-cost high-speed programmable rasterizer accepting an input set of functionals representing a triangle, clipping planes and a scissoring box, and producing multiple spans per clock cycle as output. A Loader converts the input set from a general form to a special case form accepted by a set of Edge Generators, the restricted input format accepted by the Edge Generators contributing to their efficient hardware implementation.
    Type: Application
    Filed: December 23, 2003
    Publication date: June 23, 2005
    Applicant: Via Technologies, Inc.
    Inventors: Konstantine Iourcha, Boris Prokopenko, Timour Paltashev, Derek Gladding
  • Publication number: 20040145589
    Abstract: A method and apparatus for obtaining an attribute of a pixel in homogenous space. After obtaining the vertices of a triangle, the world space coordinates and the attribute of each vertex are transformed to coordinates and an attribute in viewer space. Then a set of homogenous coefficients of each vertex is computed based on the viewer space coordinates, and the viewer space coordinates of each vertex are projected to coordinates in screen space. Pixels in the screen space that are affected by the triangle are determined based on the screen space coordinates. For each pixel affected by the triangle, a set of barycentric coefficients in homogenous space is computed, based on the homogenous coefficients, and a linear interpolation is performed based on the set of homogenous barycentric coefficients and the attributes of the vertices in the viewer space to obtain the attribute in the homogenous space of that pixel.
    Type: Application
    Filed: September 24, 2003
    Publication date: July 29, 2004
    Inventors: Boris Prokopenko, Timour Paltashev, Derek Gladding
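
The depth-based tensor storage summarized in the abstract for patent 11748251 (and application 20220222174) above can be illustrated with a minimal sketch. It is not the patented implementation. It only shows the general idea of visiting each (height, width) position and writing the elements along the depth axis into one contiguous run of memory. The function names `store_depth_first` and `flat_offset`, and the nested-list representation `tensor[h][w][d]`, are assumptions made for illustration.

```python
def store_depth_first(tensor):
    """Flatten a 3D matrix indexed as tensor[h][w][d] (hypothetical layout).

    For each position along the height and width axes, the elements along
    the depth axis are appended together, so they end up contiguous.
    """
    flat = []
    for row in tensor:            # walk the height axis
        for cell in row:          # walk the width axis
            flat.extend(cell)     # depth elements land in one contiguous block
    return flat


def flat_offset(h, w, d, width, depth):
    """Index of element (h, w, d) in the flattened, depth-contiguous buffer."""
    return (h * width + w) * depth + d


# Small 2 x 2 x 3 example whose values encode their own coordinates.
height, width, depth = 2, 2, 3
tensor = [[[100 * h + 10 * w + d for d in range(depth)]
           for w in range(width)]
          for h in range(height)]

flat = store_depth_first(tensor)

# The three depth elements at position (h=1, w=0) occupy adjacent slots.
start = flat_offset(1, 0, 0, width, depth)
depth_block = flat[start:start + depth]
```

With this layout, reading all depth values at one spatial position touches a single contiguous slice of the buffer, which is the property the abstract describes.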