Patents by Inventor Vijay Anand R. Korthikanti

Vijay Anand R. Korthikanti has filed for patents to protect the following inventions. This listing includes published patent applications that are pending as well as patents that have been granted by the United States Patent and Trademark Office (USPTO). Illustrative code sketches of the main techniques described in these filings appear after the listing.

  • Publication number: 20240112006
    Abstract: A network of matrix processing units (MPUs) is provided on a device, where each MPU is connected to at least one other MPU in the network, and each MPU is to perform matrix multiplication operations. Computer memory stores tensor data and a master control central processing unit (MCC) is provided on the device to receive an instruction from a host device, where the instruction includes one or more tensor operands based on the tensor data. The MCC invokes a set of operations on one or more of the MPUs based on the instruction, where the set of operations includes operations on the tensor operands. A result is generated from the set of operations, the result embodied as a tensor value.
    Type: Application
    Filed: December 8, 2023
    Publication date: April 4, 2024
    Inventors: Horace H. Lau, Prashant Arora, Olivia K. Wu, Tony L. Werner, Carey K. Kloss, Amir Khosrowshahi, Andrew Yang, Aravind Kalaiah, Vijay Anand R. Korthikanti
  • Patent number: 11748625
    Abstract: In one embodiment, a matrix operation may be performed using a plurality of input matrices, wherein the matrix operation is associated with one or more convolution operations. The plurality of input matrices may be partitioned into a plurality of input partitions, wherein the plurality of input matrices is partitioned based on a number of available processing elements. The plurality of input partitions may be distributed among a plurality of processing elements, wherein each input partition is distributed to a particular processing element of the plurality of processing elements. A plurality of partial matrix operations may be performed using the plurality of processing elements, and partial matrix data may be transmitted between the plurality of processing elements while performing the plurality of partial matrix operations. A result of the matrix operation may be determined based on the plurality of partial matrix operations.
    Type: Grant
    Filed: December 30, 2016
    Date of Patent: September 5, 2023
    Assignee: Intel Corporation
    Inventors: Vijay Anand R. Korthikanti, Aravind Kalaiah, Tony L. Werner, Carey K. Kloss, Amir Khosrowshahi
  • Publication number: 20230222331
    Abstract: A network of matrix processing units (MPUs) is provided on a device, where each MPU is connected to at least one other MPU in the network, and each MPU is to perform matrix multiplication operations. Computer memory stores tensor data and a master control central processing unit (MCC) is provided on the device to receive an instruction from a host device, where the instruction includes one or more tensor operands based on the tensor data. The MCC invokes a set of operations on one or more of the MPUs based on the instruction, where the set of operations includes operations on the tensor operands. A result is generated from the set of operations, the result embodied as a tensor value.
    Type: Application
    Filed: March 15, 2023
    Publication date: July 13, 2023
    Inventors: Horace H. Lau, Prashant Arora, Olivia K. Wu, Tony L. Werner, Carey K. Kloss, Amir Khosrowshahi, Andrew Yang, Aravind Kalaiah, Vijay Anand R. Korthikanti
  • Publication number: 20220245438
    Abstract: A network of matrix processing units (MPUs) is provided on a device, where each MPU is connected to at least one other MPU in the network, and each MPU is to perform matrix multiplication operations. Computer memory stores tensor data and a master control central processing unit (MCC) is provided on the device to receive an instruction from a host device, where the instruction includes one or more tensor operands based on the tensor data. The MCC invokes a set of operations on one or more of the MPUs based on the instruction, where the set of operations includes operations on the tensor operands. A result is generated from the set of operations, the result embodied as a tensor value.
    Type: Application
    Filed: April 25, 2022
    Publication date: August 4, 2022
    Inventors: Horace H. Lau, Prashant Arora, Olivia K. Wu, Tony L. Werner, Carey K. Kloss, Amir Khosrowshahi, Andrew Yang, Aravind Kalaiah, Vijay Anand R. Korthikanti
  • Publication number: 20220121954
    Abstract: In one embodiment, a matrix operation may be performed using a plurality of input matrices, wherein the matrix operation is associated with one or more convolution operations. The plurality of input matrices may be partitioned into a plurality of input partitions, wherein the plurality of input matrices is partitioned based on a number of available processing elements. The plurality of input partitions may be distributed among a plurality of processing elements, wherein each input partition is distributed to a particular processing element of the plurality of processing elements. A plurality of partial matrix operations may be performed using the plurality of processing elements, and partial matrix data may be transmitted between the plurality of processing elements while performing the plurality of partial matrix operations. A result of the matrix operation may be determined based on the plurality of partial matrix operations.
    Type: Application
    Filed: December 28, 2021
    Publication date: April 21, 2022
    Inventors: Vijay Anand R. Korthikanti, Aravind Kalaiah, Tony L. Werner, Carey K. Kloss, Amir Khosrowshahi
  • Patent number: 10949496
    Abstract: In one embodiment, a matrix operation may be performed to reorder a plurality of dimensions of an input matrix stored in two-dimensional memory. Data associated with the input matrix may be accessed using one or more strided memory operations, wherein the one or more strided memory operations are configured to access the two-dimensional memory at a plurality of locations that are separated by a particular interval. The data accessed using the one or more strided memory operations may be stored in a result matrix, wherein the data accessed using each strided memory operation is stored in the result matrix in non-transpose form or transpose form.
    Type: Grant
    Filed: December 30, 2016
    Date of Patent: March 16, 2021
    Assignee: Intel Corporation
    Inventors: Vijay Anand R. Korthikanti, Aravind Kalaiah, Tony L. Werner, Amir Khosrowshahi
  • Patent number: 10922380
    Abstract: In one embodiment, a matrix operation associated with a plurality of input matrices may be performed. The plurality of input matrices may be partitioned into a plurality of input partitions, wherein the plurality of input matrices is partitioned based on a number of available processing elements. The plurality of input partitions may be distributed among a plurality of processing elements, wherein each input partition is distributed to a particular processing element of the plurality of processing elements. A plurality of partial matrix operations may be performed using the plurality of processing elements, and partial matrix data may be transmitted between the plurality of processing elements while performing the plurality of partial matrix operations. A result of the matrix operation may be determined based on the plurality of partial matrix operations.
    Type: Grant
    Filed: December 31, 2018
    Date of Patent: February 16, 2021
    Assignee: Intel Corporation
    Inventors: Vijay Anand R. Korthikanti, Carey K. Kloss, Aravind Kalaiah, Amir Khosrowshahi
  • Publication number: 20190392297
    Abstract: A network of matrix processing units (MPUs) is provided on a device, where each MPU is connected to at least one other MPU in the network, and each MPU is to perform matrix multiplication operations. Computer memory stores tensor data and a master control central processing unit (MCC) is provided on the device to receive an instruction from a host device, where the instruction includes one or more tensor operands based on the tensor data. The MCC invokes a set of operations on one or more of the MPUs based on the instruction, where the set of operations includes operations on the tensor operands. A result is generated from the set of operations, the result embodied as a tensor value.
    Type: Application
    Filed: December 28, 2017
    Publication date: December 26, 2019
    Applicant: Intel Corporation
    Inventors: Horace H. Lau, Prashant Arora, Olivia K. Wu, Tony Werner, Carey K. Kloss, Amir Khosrowshahi, Andrew Yang, Aravind Kalaiah, Vijay Anand R. Korthikanti
  • Publication number: 20190138569
    Abstract: In one embodiment, a matrix operation associated with a plurality of input matrices may be performed. The plurality of input matrices may be partitioned into a plurality of input partitions, wherein the plurality of input matrices is partitioned based on a number of available processing elements. The plurality of input partitions may be distributed among a plurality of processing elements, wherein each input partition is distributed to a particular processing element of the plurality of processing elements. A plurality of partial matrix operations may be performed using the plurality of processing elements, and partial matrix data may be transmitted between the plurality of processing elements while performing the plurality of partial matrix operations. A result of the matrix operation may be determined based on the plurality of partial matrix operations.
    Type: Application
    Filed: December 31, 2018
    Publication date: May 9, 2019
    Applicant: Intel Corporation
    Inventors: Vijay Anand R. Korthikanti, Carey K. Kloss, Aravind Kalaiah, Amir Khosrowshahi
  • Patent number: 10169296
    Abstract: In one embodiment, a matrix operation associated with a plurality of input matrices may be performed. The plurality of input matrices may be partitioned into a plurality of input partitions, wherein the plurality of input matrices is partitioned based on a number of available processing elements. The plurality of input partitions may be distributed among a plurality of processing elements, wherein each input partition is distributed to a particular processing element of the plurality of processing elements. A plurality of partial matrix operations may be performed using the plurality of processing elements, and partial matrix data may be transmitted between the plurality of processing elements while performing the plurality of partial matrix operations. A result of the matrix operation may be determined based on the plurality of partial matrix operations.
    Type: Grant
    Filed: December 30, 2016
    Date of Patent: January 1, 2019
    Assignee: Intel Corporation
    Inventors: Vijay Anand R. Korthikanti, Carey K. Kloss, Aravind Kalaiah, Amir Khosrowshahi
  • Publication number: 20180189652
    Abstract: In one embodiment, a matrix operation may be performed using a plurality of input matrices, wherein the matrix operation is associated with one or more convolution operations. The plurality of input matrices may be partitioned into a plurality of input partitions, wherein the plurality of input matrices is partitioned based on a number of available processing elements. The plurality of input partitions may be distributed among a plurality of processing elements, wherein each input partition is distributed to a particular processing element of the plurality of processing elements. A plurality of partial matrix operations may be performed using the plurality of processing elements, and partial matrix data may be transmitted between the plurality of processing elements while performing the plurality of partial matrix operations. A result of the matrix operation may be determined based on the plurality of partial matrix operations.
    Type: Application
    Filed: December 30, 2016
    Publication date: July 5, 2018
    Applicant: Intel Corporation
    Inventors: Vijay Anand R. Korthikanti, Aravind Kalaiah, Tony L. Werner, Carey K. Kloss, Amir Khosrowshahi
  • Publication number: 20180189227
    Abstract: In one embodiment, a matrix operation may be performed to reorder a plurality of dimensions of an input matrix stored in two-dimensional memory. Data associated with the input matrix may be accessed using one or more strided memory operations, wherein the one or more strided memory operations are configured to access the two-dimensional memory at a plurality of locations that are separated by a particular interval. The data accessed using the one or more strided memory operations may be stored in a result matrix, wherein the data accessed using each strided memory operation is stored in the result matrix in non-transpose form or transpose form.
    Type: Application
    Filed: December 30, 2016
    Publication date: July 5, 2018
    Applicant: Intel Corporation
    Inventors: Vijay Anand R. Korthikanti, Aravind Kalaiah, Tony L. Werner, Amir Khosrowshahi
  • Publication number: 20180189236
    Abstract: In one embodiment, a matrix operation associated with a plurality of input matrices may be performed. The plurality of input matrices may be partitioned into a plurality of input partitions, wherein the plurality of input matrices is partitioned based on a number of available processing elements. The plurality of input partitions may be distributed among a plurality of processing elements, wherein each input partition is distributed to a particular processing element of the plurality of processing elements. A plurality of partial matrix operations may be performed using the plurality of processing elements, and partial matrix data may be transmitted between the plurality of processing elements while performing the plurality of partial matrix operations. A result of the matrix operation may be determined based on the plurality of partial matrix operations.
    Type: Application
    Filed: December 30, 2016
    Publication date: July 5, 2018
    Applicant: Intel Corporation
    Inventors: Vijay Anand R. Korthikanti, Carey K. Kloss, Aravind Kalaiah, Amir Khosrowshahi
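
The sketch below illustrates the control flow described in publication 20240112006 and its related filings (20230222331, 20220245438, 20190392297): a master control central processing unit (MCC) receives an instruction carrying tensor operands and invokes matrix-multiplication operations on a network of matrix processing units (MPUs), producing a tensor result. It is a minimal software model, not the hardware described in the filings; the class names, the NumPy-backed MPUs, and the row-wise split are assumptions made for illustration.

    # Minimal software model of an MCC dispatching tensor operands to MPUs.
    # All names here are hypothetical; real MPUs are hardware blocks.
    import numpy as np


    class MPU:
        """Stand-in for one matrix processing unit; performs matrix multiplication."""

        def matmul(self, a: np.ndarray, b: np.ndarray) -> np.ndarray:
            return a @ b


    class MasterControlCPU:
        """Stand-in for the MCC: splits a tensor instruction across the MPU network."""

        def __init__(self, mpus: list):
            self.mpus = mpus

        def execute(self, lhs: np.ndarray, rhs: np.ndarray) -> np.ndarray:
            # Split the left operand row-wise, one slice per MPU, run the partial
            # multiplications, then stitch the partial results into one tensor value.
            slices = np.array_split(lhs, len(self.mpus), axis=0)
            partials = [mpu.matmul(s, rhs) for mpu, s in zip(self.mpus, slices)]
            return np.concatenate(partials, axis=0)


    if __name__ == "__main__":
        mcc = MasterControlCPU([MPU() for _ in range(4)])
        x, w = np.random.rand(8, 16), np.random.rand(16, 32)
        assert np.allclose(mcc.execute(x, w), x @ w)

Splitting the left operand row-wise keeps each partial result independent of the others, so the MCC can assemble the final tensor value by simple concatenation of the MPU outputs.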
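
The next sketch follows the partitioned matrix operation described in patent 11748625 and the related filings 10922380, 10169296, 20220121954, 20190138569, 20180189652, and 20180189236: the input matrices are partitioned based on the number of available processing elements, each element computes a partial matrix operation, partial matrix data is passed between elements, and the partial results are combined into the final result. The processing elements are simulated in plain Python with NumPy; the function name and the column/row partitioning scheme are assumptions made for illustration.

    # Sketch of a partitioned matrix operation across simulated processing elements.
    import numpy as np


    def partitioned_matmul(a: np.ndarray, b: np.ndarray, num_pes: int) -> np.ndarray:
        # Partition the operands based on the number of available processing
        # elements: A by columns, B by rows, one partition per element.
        a_parts = np.array_split(a, num_pes, axis=1)
        b_parts = np.array_split(b, num_pes, axis=0)

        # Each processing element computes a partial matrix product from its
        # partitions. Every partial result has the full output shape.
        partials = [a_k @ b_k for a_k, b_k in zip(a_parts, b_parts)]

        # Simulate transmitting partial matrix data between neighboring elements:
        # the running partial result is accumulated as it moves along the chain.
        result = partials[0]
        for partial in partials[1:]:
            result = result + partial
        return result


    if __name__ == "__main__":
        a, b = np.random.rand(6, 12), np.random.rand(12, 5)
        assert np.allclose(partitioned_matmul(a, b, num_pes=4), a @ b)

Partitioning A by columns and B by rows makes every partial product the full output size, so combining the partials amounts to summing them as they are exchanged between elements.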
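
The final sketch mirrors the dimension-reordering technique of patent 10949496 (published as 20180189227): data for a matrix held in two-dimensional memory is gathered with strided memory operations, i.e. reads at locations separated by a fixed interval, and stored into a result matrix in either transpose or non-transpose form. Modeling the memory as a flat row-major buffer, and the function name, are assumptions made for illustration.

    # Sketch of strided reads used to reorder (optionally transpose) a matrix.
    import numpy as np


    def strided_reorder(memory: np.ndarray, rows: int, cols: int,
                        transpose: bool = True) -> np.ndarray:
        """Read a rows x cols matrix out of flat `memory` using strided accesses."""
        result = np.empty((cols, rows) if transpose else (rows, cols),
                          dtype=memory.dtype)
        for c in range(cols):
            # One strided memory operation: start at column offset `c` and step
            # through memory in intervals of `cols`, gathering one column.
            column = memory[c: rows * cols: cols]
            if transpose:
                result[c, :] = column      # store the gathered data as a row
            else:
                result[:, c] = column      # store it back in non-transpose form
        return result


    if __name__ == "__main__":
        m = np.arange(12)                  # flat two-dimensional memory, row-major
        assert np.array_equal(strided_reorder(m, 3, 4, transpose=True),
                              m.reshape(3, 4).T)
        assert np.array_equal(strided_reorder(m, 3, 4, transpose=False),
                              m.reshape(3, 4))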