Patents by Inventor Xiaodan Tan

Xiaodan Tan has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Publication number: 20240111528
    Abstract: A technique to execute transpose and compute operations may include retrieving a set of machine instructions from an instruction buffer of a data processor. The instruction buffer has multiple entries, and each entry stores one machine instruction. A machine instruction from the set is executed to transpose a submatrix of an input tensor and perform computations on the column elements of the submatrix, combining the transpose operation and the computational operations into a single machine instruction. (A software sketch of this fused operation appears after the listing.)
    Type: Application
    Filed: September 21, 2022
    Publication date: April 4, 2024
    Inventors: Xiaodan Tan, Paul Gilbert Meyer, Sheng Xu, Ron Diamant
  • Publication number: 20240103813
    Abstract: An integrated circuit that combines transpose and compute operations may include a transpose circuit coupled to a set of compute channels. Each compute channel may include multiple arithmetic logic unit (ALU) circuits coupled in series. The transpose circuit is operable to receive an input tensor, transpose it, and output the transposed tensor to the set of compute channels. The compute channels are operable to generate outputs in parallel, each output being generated from a corresponding vector of the transposed tensor. (A software sketch of this dataflow appears after the listing.)
    Type: Application
    Filed: September 21, 2022
    Publication date: March 28, 2024
    Inventors: Xiaodan Tan, Paul Gilbert Meyer, Sheng Xu, Ron Diamant
  • Patent number: 11941397
    Abstract: Techniques to take advantage of the single-instruction-multiple-data (SIMD) capabilities of a processor to process data blocks can include implementing an instruction to fuse the data blocks together. The fuse input instruction can have a first input vector, a second input vector, a select input, a first output vector, and a second output vector. The instruction selects a portion of each input vector based on the select input, sign-extends the selected portions, and shuffles the data elements of the sign-extended portion of the first input vector with those of the second to generate the first and second output vectors. (A software sketch of this instruction appears after the listing.)
    Type: Grant
    Filed: May 31, 2022
    Date of Patent: March 26, 2024
    Assignee: Amazon Technologies, Inc.
    Inventors: Xiaodan Tan, Paul Gilbert Meyer
  • Publication number: 20230100930
    Abstract: Techniques for compressing a neural network model by mixing compression ratios (sparsity patterns) are described. The weight tensor of a neural network model is divided into weight groups. For each weight group, the pruning cost of compressing the weight values according to a compression ratio is determined, and a pruning cost distribution for that compression ratio is generated from the per-group costs. A cost threshold can then be selected from the pruning cost distribution, and weight groups whose pruning cost falls below the threshold are compressed according to the compression ratio. The remaining weight groups can be compressed using one or more less aggressive compression ratios. The cost threshold can be adjusted to tune the overall sparsity and accuracy of the compressed neural network. (A software sketch of this scheme appears after the listing.)
    Type: Application
    Filed: September 30, 2021
    Publication date: March 30, 2023
    Inventors: Xiaodan Tan, Paul Gilbert Meyer, Gennady Pekhimenko, Randy Renfu Huang
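
The sketch below is a minimal Python/NumPy model of the fused transpose-and-compute instruction described in publication 20240111528. It is an illustration only: the function name `fused_transpose_compute`, the submatrix slices, and the choice of a column-wise sum are assumptions, not the patented instruction set.

```python
import numpy as np

def fused_transpose_compute(tensor, row_slice, col_slice, op=np.sum):
    """Hypothetical software model of the fused instruction: transpose a
    submatrix of the input tensor and reduce over the columns of the
    transposed submatrix in one step."""
    submatrix = tensor[row_slice, col_slice]   # select the submatrix
    transposed = submatrix.T                   # transpose step
    return op(transposed, axis=0)              # column-wise compute step

# Example: fuse the transpose and the column reduction of a 2x3 submatrix.
x = np.arange(24, dtype=np.float32).reshape(4, 6)
print(fused_transpose_compute(x, slice(0, 2), slice(0, 3)))
```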
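
Publication 20240103813 describes a transpose circuit feeding a set of compute channels, each built from ALU circuits coupled in series. The Python sketch below models that dataflow in software; the two ALU stage operations and the per-vector channel mapping are illustrative assumptions, not the circuit itself.

```python
import numpy as np

def alu_stage_scale(vec):
    # First ALU stage in each channel (assumed operation: scale by 2).
    return vec * 2.0

def alu_stage_accumulate(vec):
    # Second ALU stage in each channel (assumed operation: running sum).
    return np.cumsum(vec)

ALU_PIPELINE = [alu_stage_scale, alu_stage_accumulate]  # ALUs coupled in series

def transpose_then_compute(input_tensor):
    """Software model: a transpose stage feeds a set of compute channels.
    Each channel pushes one vector of the transposed tensor through its
    chain of ALU stages; the channels are independent (parallel in hardware)."""
    transposed = input_tensor.T
    outputs = []
    for channel_vector in transposed:      # one compute channel per vector
        value = channel_vector
        for alu in ALU_PIPELINE:           # apply the serial ALU stages
            value = alu(value)
        outputs.append(value)
    return np.stack(outputs)

x = np.arange(12, dtype=np.float32).reshape(3, 4)
print(transpose_then_compute(x))
```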
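
Patent 11941397 describes a fuse-input instruction that selects portions of two input vectors, sign-extends them, and shuffles the results into two output vectors. The sketch below models one plausible reading of that behavior with 8-bit lanes widened to 16 bits; the lane widths, the low/high split, and the interleaving order are assumptions.

```python
import numpy as np

def fuse_input(vec_a, vec_b, select_high):
    """Software model of a fuse-input instruction: select the low or high
    half of each input vector, sign-extend the selected halves, and
    interleave their elements into two output vectors."""
    half = len(vec_a) // 2
    sel = slice(half, None) if select_high else slice(0, half)
    a_ext = vec_a[sel].astype(np.int16)    # sign-extend selected portion of A
    b_ext = vec_b[sel].astype(np.int16)    # sign-extend selected portion of B
    interleaved = np.empty(a_ext.size + b_ext.size, dtype=np.int16)
    interleaved[0::2] = a_ext              # shuffle: alternate A and B elements
    interleaved[1::2] = b_ext
    return interleaved[:half], interleaved[half:]   # two output vectors

a = np.array([-1, 2, -3, 4], dtype=np.int8)
b = np.array([5, -6, 7, -8], dtype=np.int8)
out0, out1 = fuse_input(a, b, select_high=False)
print(out0, out1)   # [-1  5] [ 2 -6]
```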
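
Publication 20230100930 describes compressing weight groups with mixed compression ratios based on a per-group pruning cost and a cost threshold chosen from the cost distribution. The Python sketch below illustrates one such scheme; magnitude pruning as the sparsity pattern, the L2 norm of removed weights as the cost metric, and the specific ratios and percentile are all assumptions.

```python
import numpy as np

def prune_group(group, keep_ratio):
    """Zero out the smallest-magnitude weights so that roughly `keep_ratio`
    of the group survives (magnitude pruning, used here as a stand-in for
    whatever structured sparsity pattern the hardware supports)."""
    k = max(1, int(len(group) * keep_ratio))
    cutoff = np.sort(np.abs(group))[-k]
    return np.where(np.abs(group) >= cutoff, group, 0.0)

def pruning_cost(group, keep_ratio):
    # Assumed cost metric: L2 norm of the weights removed by pruning.
    return float(np.linalg.norm(group - prune_group(group, keep_ratio)))

def mixed_ratio_compress(weights, group_size=8, aggressive_keep=0.25,
                         mild_keep=0.5, cost_percentile=60):
    groups = weights.reshape(-1, group_size)
    costs = np.array([pruning_cost(g, aggressive_keep) for g in groups])
    cost_threshold = np.percentile(costs, cost_percentile)  # picked from the cost distribution
    compressed = [
        prune_group(g, aggressive_keep) if c <= cost_threshold  # cheap-to-prune groups
        else prune_group(g, mild_keep)                          # remaining groups, less aggressive
        for g, c in zip(groups, costs)
    ]
    return np.stack(compressed).reshape(weights.shape)

w = np.random.randn(4, 16).astype(np.float32)
print(mixed_ratio_compress(w))
```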