Patents by Inventor Derek Gladding

Derek Gladding has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

Performing tensor operations using a programmable control engine

Patent number: 12079301

Abstract: A command queue is configured to receive a command from a software application. A configuration storage is configured to store a plurality of configurations. A matrix multiplication unit is configured to perform matrix multiplication operations. Memory is configured to store matrices. A control engine is configured to retrieve the command from the command queue; retrieve a configuration from the configuration storage based on the command; generate, based on the command and the configuration, instructions for the matrix multiplication unit to perform a set of matrix multiplication operations on first and second matrices stored in the memory; send the instructions to the matrix multiplication unit to configure the matrix multiplication unit to output results of the set of matrix multiplication operations; and store the results in a third matrix in the memory.

Type: Grant

Filed: January 8, 2021

Date of Patent: September 3, 2024

Assignee: Microsoft Technology Licensing, LLC

Inventors: Nitin Garegrat, Derek Gladding, Shankar Narayan, Sujatha Santhanaraman, Jayadev Velagandula
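
The flow in the abstract above (command queue, configuration lookup, instruction generation, matrix multiplication, write-back) can be sketched in software. This is only an illustrative model, not the patented hardware: the `Command` fields, the `rows`/`cols` configuration keys, and the per-element instruction tuples are all assumptions made for the example.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class Command:
    # All field names are hypothetical; the abstract does not define a command format.
    config_id: str  # selects a stored configuration
    a: str          # name of the first matrix in memory
    b: str          # name of the second matrix in memory
    out: str        # name of the third (result) matrix


def matmul_unit(instructions, memory):
    """Stand-in for the matrix multiplication unit: one dot product per instruction."""
    results = {}
    for i, j, a_name, b_name in instructions:
        a, b = memory[a_name], memory[b_name]
        results[(i, j)] = sum(a[i][k] * b[k][j] for k in range(len(b)))
    return results


def control_engine(queue, configs, memory):
    """Drain the command queue, turning each command into matmul instructions."""
    while queue:
        cmd = queue.popleft()                     # retrieve the command
        cfg = configs[cmd.config_id]              # retrieve a configuration based on it
        rows, cols = cfg["rows"], cfg["cols"]     # assumed configuration contents
        instructions = [(i, j, cmd.a, cmd.b)      # generate instructions
                        for i in range(rows) for j in range(cols)]
        results = matmul_unit(instructions, memory)   # send them to the unit
        memory[cmd.out] = [[results[(i, j)] for j in range(cols)]
                           for i in range(rows)]      # store results in a third matrix


memory = {"A": [[1, 2], [3, 4]], "B": [[5, 6], [7, 8]]}
configs = {"2x2": {"rows": 2, "cols": 2}}
queue = deque([Command("2x2", "A", "B", "C")])
control_engine(queue, configs, memory)
```

After the call, `memory["C"]` holds the product of `A` and `B`. Keeping the configuration lookup separate from the command is what lets the same command format drive differently shaped operations in this sketch.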

Storing tensors in memory based on depth

Patent number: 11748251

Abstract: Embodiments of the present disclosure include systems and methods for storing tensors in memory based on depth. In some embodiments, for each of a plurality of sets of elements in a three-dimensional (3D) matrix, a position is determined along a height axis and width axis of the 3D matrix. At the determined position, a set of elements are identified along a depth axis of the 3D matrix. The set of elements are stored in a contiguous block of memory.

Type: Grant

Filed: January 8, 2021

Date of Patent: September 5, 2023

Assignee: Microsoft Technology Licensing, LLC

Inventors: Nitin Garegrat, Shankar Narayan, Derek Gladding
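
The depth-based layout described in the abstract above can be illustrated with a small sketch. It assumes a tensor indexed as `tensor[h][w][d]` and a row-by-row visiting order for the (height, width) positions; the abstract only states that the elements along the depth axis at each position are stored contiguously, so the visiting order and the function names here are assumptions.

```python
def store_depth_first(tensor):
    """Flatten tensor[h][w][d] so that each (h, w) position's depth run is contiguous."""
    flat = []
    for row in tensor:            # position along the height axis
        for depth_run in row:     # position along the width axis
            flat.extend(depth_run)  # all elements along the depth axis, back to back
    return flat


def depth_offset(h, w, d, width, depth):
    """Index of element (h, w, d) in the flattened layout produced above."""
    return (h * width + w) * depth + d


# A 2 x 2 x 3 tensor whose values encode their own coordinates as h*100 + w*10 + d.
tensor = [[[h * 100 + w * 10 + d for d in range(3)] for w in range(2)] for h in range(2)]
flat = store_depth_first(tensor)
```

With this layout, reading every depth element at one (h, w) position touches a single contiguous slice, `flat[depth_offset(h, w, 0, 2, 3) : depth_offset(h, w, 0, 2, 3) + 3]`.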

Performing tensor operations using a programmable control engine

Publication number: 20220222318

Abstract: Embodiments of the present disclosure include systems and methods for performing tensor operations using a programmable control engine. A command queue is configured to receive a command from a software application. A configuration storage is configured to store a plurality of configurations. A matrix multiplication unit is configured to perform matrix multiplication operations. Memory is configured to store matrices. A control engine is configured to retrieve the command from the command queue; retrieve a configuration from the configuration storage based on the command; generate, based on the command and the configuration, instructions for the matrix multiplication unit to perform a set of matrix multiplication operations on first and second matrices stored in the memory; send the instructions to the matrix multiplication unit to configure the matrix multiplication unit to output results of the set of matrix multiplication operations; and store the results in a third matrix in the memory.

Type: Application

Filed: January 8, 2021

Publication date: July 14, 2022

Inventors: Nitin Garegrat, Derek Gladding, Shankar Narayan, Sujatha Santhanaraman, Jayadev Velagandula

Storing tensors in memory based on depth

Publication number: 20220222174

Abstract: Embodiments of the present disclosure include systems and methods for storing tensors in memory based on depth. In some embodiments, for each of a plurality of sets of elements in a three-dimensional (3D) matrix, a position is determined along a height axis and width axis of the 3D matrix. At the determined position, a set of elements are identified along a depth axis of the 3D matrix. The set of elements are stored in a contiguous block of memory.

Type: Application

Filed: January 8, 2021

Publication date: July 14, 2022

Inventors: Nitin Garegrat, Shankar Narayan, Derek Gladding

Dual mode floating point multiply accumulate unit

Patent number: 8024394

Abstract: Included are embodiments of a Multiply-Accumulate Unit to process multiple format floating point operands. For short format operands, embodiments of the Multiply-Accumulate Unit are configured to process data with twice the throughput of long and mixed format data. At least one embodiment can include a short exponent calculation component configured to receive short format data, a long exponent calculation component configured to receive long format data, and a mixed exponent calculation component configured to receive short exponent data, the mixed exponent calculation component further configured to receive long format data. Embodiments also include a mantissa datapath configured for implementation to accommodate processing of long, mixed, and short floating point operands.

Type: Grant

Filed: February 6, 2007

Date of Patent: September 20, 2011

Assignee: Via Technologies, Inc.

Inventors: Boris Prokopenko, Timour Paltashev, Derek Gladding

Dual Mode Floating Point Multiply Accumulate Unit

Publication number: 20110208946

Abstract: Disclosed are various embodiments of a stream processing unit for single instruction multiple data (SIMD) processing, wherein the stream processing unit executes a stage of a Multiply-Accumulate calculation. In one embodiment, the stream processing unit comprises a plurality of scalar arithmetic logic units (ALUs) configured to receive data having a plurality of data types. The number and type of scalar ALUs corresponds to an SIMD factor. In one embodiment, the scalar ALUs are executed sequentially with a delay being introduced in between execution of each of the scalar ALUs, wherein the delay corresponds to the SIMD factor.

Type: Application

Filed: May 4, 2011

Publication date: August 25, 2011

Applicant: VIA TECHNOLOGIES, INC.

Inventors: Boris Prokopenko, Timour Paltashev, Derek Gladding

Method and apparatus for triangle rasterization with clipping and wire-frame mode support

Patent number: 7675521

Abstract: Systems for performing rasterization are described. At least one embodiment includes a span generator for performing rasterization. In accordance with such embodiments, the span generator comprises functionals representing a scissoring box, loaders configured to convert the functionals from a general form to a special case form, edge generators configured to read the special case form of the scissoring box, whereby the special case form simplifies calculations by the edge generators. The span generator further comprises sorters configured to compute the intersection of half-planes, wherein edges of the intersection are generated by the edge generators and a span buffer configured to temporarily store spans before tiling.

Type: Grant

Filed: March 11, 2008

Date of Patent: March 9, 2010

Assignee: VIA Technologies, Inc.

Inventors: Konstantine Iourcha, Boris Prokopenko, Timour Paltashev, Derek Gladding

Method and apparatus for triangle rasterization with clipping and wire-frame mode support

Patent number: 7551174

Abstract: A low-cost high-speed programmable rasterizer accepting an input set of functionals representing a triangle, clipping planes and a scissoring box, and producing multiple spans per clock cycle as output. A Loader converts the input set from a general form to a special case form accepted by a set of Edge Generators, the restricted input format accepted by the Edge Generators contributing to their efficient hardware implementation.

Type: Grant

Filed: December 23, 2003

Date of Patent: June 23, 2009

Assignee: Via Technologies, Inc.

Inventors: Konstantine Iourcha, Boris Prokopenko, Timour Paltashev, Derek Gladding
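
The abstract above describes a triangle given as a set of functionals (half-plane equations) plus a scissoring box, with spans as output. A brute-force software sketch of that idea follows. It is not the patented Loader and Edge Generator design: it assumes a counter-clockwise triangle, samples at pixel centers, treats the scissor box as an inclusive integer rectangle, and ignores additional clipping planes. The function names are hypothetical.

```python
def edge(p, q):
    """Return (a, b, c) such that a*x + b*y + c >= 0 to the left of the directed edge p -> q."""
    (x0, y0), (x1, y1) = p, q
    return (y0 - y1, x1 - x0, x0 * y1 - x1 * y0)


def spans(v0, v1, v2, scissor):
    """List (y, x_start, x_end) spans of pixels whose centers lie inside the triangle.

    The triangle is assumed counter-clockwise; scissor is (xmin, ymin, xmax, ymax),
    inclusive, in integer pixel coordinates.
    """
    functionals = [edge(v0, v1), edge(v1, v2), edge(v2, v0)]
    xmin, ymin, xmax, ymax = scissor
    result = []
    for y in range(ymin, ymax + 1):
        inside = [x for x in range(xmin, xmax + 1)
                  if all(a * (x + 0.5) + b * (y + 0.5) + c >= 0
                         for a, b, c in functionals)]
        if inside:
            # A triangle is convex, so the covered pixels on a scanline are contiguous.
            result.append((y, inside[0], inside[-1]))
    return result


full = spans((0, 0), (4, 0), (0, 4), (0, 0, 3, 3))
clipped = spans((0, 0), (4, 0), (0, 4), (1, 0, 3, 3))
```

Narrowing the scissor box from `x >= 0` to `x >= 1` trims the left end of each span and drops the top scanline entirely, since its only covered pixel falls outside the box.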

Method and Apparatus for Triangle Rasterization with Clipping and Wire-Frame Mode Support

Publication number: 20080158252

Abstract: Systems for performing rasterization are described. At least one embodiment includes a span generator for performing rasterization. In accordance with such embodiments, the span generator comprises functionals representing a scissoring box, loaders configured to convert the functionals from a general form to a special case form, edge generators configured to read the special case form of the scissoring box, whereby the special case form simplifies calculations by the edge generators. The span generator further comprises sorters configured to compute the intersection of half-planes, wherein edges of the intersection are generated by the edge generators and a span buffer configured to temporarily store spans before tiling.

Type: Application

Filed: March 11, 2008

Publication date: July 3, 2008

Applicant: VIA TECHNOLOGIES, INC.

Inventors: Konstantine Iourcha, Boris Prokopenko, Timour Paltashev, Derek Gladding

Stream Processor with Variable Single Instruction Multiple Data (SIMD) Factor and Common Special Function

Publication number: 20070186082

Abstract: Included are embodiments of a stream processor configured to process data in any of a plurality of different formats. At least one embodiment of the stream processor includes a first scalar arithmetic logic unit (ALU), configured to process a plurality of sets of short data in response to a received short format control signal from an instruction set and process a set of long data in response to a received long format control signal from the instruction set. Embodiments of the processor also include a second arithmetic logic unit (ALU), configured to receive the processed data from the first arithmetic logic unit (ALU) and process the input data and the processed data according to a control signal from the instruction set. Still other embodiments include a special function unit (SFU) configured to provide additional computational functionality to the first ALU and the second ALU.

Type: Application

Filed: February 6, 2007

Publication date: August 9, 2007

Inventors: Boris Prokopenko, Timour Paltashev, Derek Gladding

Dual Mode Floating Point Multiply Accumulate Unit

Publication number: 20070185953

Abstract: Included are embodiments of a Multiply-Accumulate Unit to process multiple format floating point operands. For short format operands, embodiments of the Multiply-Accumulate Unit are configured to process data with twice the throughput of long and mixed format data. At least one embodiment can include a short exponent calculation component configured to receive short format data, a long exponent calculation component configured to receive long format data, and a mixed exponent calculation component configured to receive short exponent data, the mixed exponent calculation component further configured to receive long format data. Embodiments also include a mantissa datapath configured for implementation to accommodate processing of long, mixed, and short floating point operands.

Type: Application

Filed: February 6, 2007

Publication date: August 9, 2007

Inventors: Boris Prokopenko, Timour Paltashev, Derek Gladding

Global spreader and method for a parallel graphics processor

Publication number: 20070030280

Abstract: A parallel graphics processor having a spreader coupled to a plurality of execution components is disclosed. The spreader maintains status information for each of the plurality of execution components and establishes a priority for each of the plurality of execution blocks to receive a graphics entity to be processed. The priorities are arranged in accordance with the maintained status information and a type of graphics entity to be processed. The spreader communicates a request to a selected execution component to allocate the graphics entity to be processed in its entity descriptor table and copies graphics entity data to the selected execution component. The spreader indexes assignment of the graphics entity in its logical table and subsequently receives indication from the selected instruction execution component that the graphics entity has been processed. Thereafter, graphics images may be presented on a display.

Type: Application

Filed: August 8, 2005

Publication date: February 8, 2007

Inventors: Timour Paltashev, Boris Prokopenko, Derek Gladding

Method for processing vertex, triangle, and pixel graphics data packets

Publication number: 20070030277

Abstract: A method for processing graphics data packets comprises allocating an entity for the graphics data packet of vertices, triangles, and/or pixels in one or more execution blocks that receives an assignment from a global spreader to process the graphics data packets. A pointer, which points to the allocated entity, communicates a pointer to a data mover, and the data mover loads some graphics data packets into a memory. A number of processing stages may follow such that one or more floating point or integer instructions is executed on the graphics data packets, as controlled by a thread controller. Upon completion of calculations on the graphics data packets, the allocated entity may be deleted and the graphics data packets may be communicated to another execution block or as directed by the global spreader.

Type: Application

Filed: August 8, 2005

Publication date: February 8, 2007

Inventors: Boris Prokopenko, Timour Paltashev, Derek Gladding, Jeremiah Childs

Method and apparatus for generating sign-digit format of sum of two numbers

Patent number: 7159003

Abstract: A system and method for converting two binary digits into redundant sign-digit format. The system comprises a first adder for adding the binary digits together to generate a first result. A second adder adds an input carry from a previous digit to the first result and subtracts a value equal to the radix of the binary digits from the first result if the first result is greater than an initial threshold in order to generate an intermediate result. The system further includes a third adder for adding a second input carry from the previous digit to the intermediate result and subtracting the value of the radix from the intermediate result if the intermediate result is greater than a prescribed value such that the addition of the two binary digits is in redundant sign-digit format.

Type: Grant

Filed: February 21, 2003

Date of Patent: January 2, 2007

Assignee: S3 Graphics Co., Ltd.

Inventors: Boris Prokopenko, Timour Paltashev, Derek Gladding

SIMD processor with scalar arithmetic logic units

Patent number: 7146486

Abstract: A scalar processor that includes a plurality of scalar arithmetic logic units and a special function unit. Each scalar unit performs, in a different time interval, the same operation on a different data item, where each different time interval is one of a plurality of successive, adjacent time intervals. Each unit provides an output data item in the time interval in which the unit performs the operation and provides a processed data item in the last of the successive, adjacent time intervals. The special function unit provides a special function computation for the output data item of a selected one of the scalar units, in the time interval in which the selected scalar unit performs the operation, so as to avoid a conflict in use among the scalar units. A vector processing unit includes an input data buffer, the scalar processor, and an output orthogonal converter.

Type: Grant

Filed: January 29, 2003

Date of Patent: December 5, 2006

Assignee: S3 Graphics Co., Ltd.

Inventors: Boris Prokopenko, Timour Paltashev, Derek Gladding

Method and programmable device for triangle interpolation in homogeneous space

Patent number: 7098924

Abstract: A method and apparatus for obtaining an attribute in homogenous space. After obtaining the vertices of a triangle, the world space coordinates and the attribute of each vertex are transformed to homogeneous coordinates and an attribute in viewer space. Then a set of homogenous coefficients of the triangle is computed based on the viewer space vertex homogeneous coordinates, and the viewer space coordinates of each vertex are projected to coordinates in screen space. Pixels in the screen space that are affected by the projected triangle are determined. For each pixel affected by the triangle, a set of barycentric coefficients in viewer space is computed, based on the homogenous triangle coefficients, and a linear interpolation is performed based on the set of viewer space barycentric coefficients and the viewer space attributes of the triangle vertices to obtain the attribute of the pixel affected by the triangle.

Type: Grant

Filed: September 24, 2003

Date of Patent: August 29, 2006

Assignee: VIA Technologies, Inc.

Inventors: Boris Prokopenko, Timour Paltashev, Derek Gladding
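
The abstract above ends with a linear interpolation driven by viewer-space barycentric coefficients. As a rough illustration, the sketch below uses the widely known perspective-correct interpolation formula (screen-space barycentric weights divided by each vertex's homogeneous w, then renormalized). This is a substituted textbook technique, not the patent's computation of homogeneous triangle coefficients, and all function and parameter names are assumptions.

```python
def screen_barycentric(p, s0, s1, s2):
    """Barycentric weights of point p with respect to screen-space triangle s0, s1, s2."""
    def signed_area(a, b, c):
        return (b[0] - a[0]) * (c[1] - a[1]) - (b[1] - a[1]) * (c[0] - a[0])

    total = signed_area(s0, s1, s2)
    return (signed_area(p, s1, s2) / total,
            signed_area(s0, p, s2) / total,
            signed_area(s0, s1, p) / total)


def interpolate(p, screen, w, attrs):
    """Perspective-correct attribute at pixel p.

    screen: three projected vertex positions; w: homogeneous w of each vertex;
    attrs: the per-vertex attribute values before projection.
    """
    weights = screen_barycentric(p, *screen)
    scaled = [li / wi for li, wi in zip(weights, w)]   # undo the perspective divide
    norm = sum(scaled)
    corrected = [s / norm for s in scaled]             # barycentrics before projection
    return sum(b * a for b, a in zip(corrected, attrs))


screen = ((0.0, 0.0), (4.0, 0.0), (0.0, 4.0))
affine = interpolate((1.0, 1.0), screen, (1.0, 1.0, 1.0), (0.0, 4.0, 8.0))
perspective = interpolate((2.0, 0.0), screen, (1.0, 3.0, 1.0), (0.0, 4.0, 8.0))
```

When all w values are equal, the result matches plain screen-space interpolation. With unequal w, the midpoint of an edge on screen no longer maps to the midpoint of the attribute range, which is the effect the correction exists to capture.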

Method and apparatus for triangle rasterization with clipping and wire-frame mode support

Publication number: 20050134603

Abstract: A low-cost high-speed programmable rasterizer accepting an input set of functionals representing a triangle, clipping planes and a scissoring box, and producing multiple spans per clock cycle as output. A Loader converts the input set from a general form to a special case form accepted by a set of Edge Generators, the restricted input format accepted by the Edge Generators contributing to their efficient hardware implementation.

Type: Application

Filed: December 23, 2003

Publication date: June 23, 2005

Applicant: Via Technologies, Inc.

Inventors: Konstantine Iourcha, Boris Prokopenko, Timour Paltashev, Derek Gladding

Method and programmable device for triangle interpolation in homogeneous space

Publication number: 20040145589

Abstract: A method and apparatus for obtaining an attribute of a pixel in homogenous space. After obtaining the vertices of a triangle, the world space coordinates and the attribute of each vertex are transformed to coordinates and an attribute in viewer space. Then a set of homogenous coefficients of each vertex is computed based on the viewer space coordinates, and the viewer space coordinates of each vertex are projected to coordinates in screen space. Pixels in the screen space that are affected by the triangle are determined based on the screen space coordinates. For each pixel affected by the triangle, a set of barycentric coefficients in homogenous space is computed, based on the homogenous coefficients, and a linear interpolation is performed based on the set of homogenous barycentric coefficients and the attributes of the vertices in the viewer space to obtain the attribute in the homogenous space of that pixel.

Type: Application

Filed: September 24, 2003

Publication date: July 29, 2004

Inventors: Boris Prokopenko, Timour Paltashev, Derek Gladding