Patents by Inventor Yuchen Hao

Yuchen Hao has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Publication number: 20230306291
    Abstract: A system for generating simulated data is disclosed. The system may determine items of content utilized by a network. The system may also retrieve one or more data patterns associated with one or more features associated with the content. The system may also determine a plurality of indices associated with the data patterns. The system may also generate, based on the data patterns and the plurality of indices, simulated data associated with the content.
    Type: Application
    Filed: March 22, 2022
    Publication date: September 28, 2023
    Inventors: Martin SCHATZ, Dmitry Vengertsev, Yifan Liu, Ilana Marisa Arbisser, Yuchen Hao, Muhammet Mustafa Ozdal
  • Patent number: 11699081
    Abstract: The disclosed computer-implemented method may include (1) receiving, at a hardware accelerator that supports an ANN, an activation data set that is to undergo a convolution operation via a filter kernel of the ANN, (2) receiving, at the hardware accelerator, an argument indicating that the filter kernel exceeds at least one boundary of the activation data set when slid across a certain position during the convolution operation, (3) determining, based at least in part on the argument, that the hardware accelerator is to generate padding data at the boundary of the activation data set in connection with the certain position of the filter kernel, and then (4) performing, at the hardware accelerator, the convolution operation by processing a portion of the activation data set and the padding data when the filter kernel slides across the certain position. Various other systems and methods are also disclosed.
    Type: Grant
    Filed: December 20, 2019
    Date of Patent: July 11, 2023
    Assignee: Meta Platforms, Inc.
    Inventors: Ehsan Khish Ardestani Zadeh, Martin Schatz, Krishnakumar Narayanan Nair, Yuchen Hao, Abdulkadir Utku Diril, Rakesh Komuravelli
  • Patent number: 11599181
    Abstract: A computer-implemented method may include (1) maintaining (a) a filter matrix in a filter cache included in a local memory device (LMD) included in a hardware accelerator, and (b) a plurality of activation matrices corresponding to different rows of an activation volume in an activation cache included in the LMD, (2) for each activation matrix, directing a matrix multiplication unit (MMU) included in the hardware accelerator to execute a matrix multiplication operation (MMU) using the filter matrix and the activation matrix, (3) loading an additional filter matrix into the filter cache, and (4) directing the MMU to execute a plurality of additional MMOs, each additional MMO using one filter matrix included in the filter cache and one activation matrix included in the activation cache, such that the MMU reuses the filter matrix for at least one additional MMO and uses the additional filter matrix for a different additional MMO.
    Type: Grant
    Filed: December 23, 2019
    Date of Patent: March 7, 2023
    Assignee: Meta Platforms, Inc.
    Inventors: Krishnakumar Narayanan Nair, Abdulkadir Utku Diril, Yuchen Hao, Thomas Mark Ulrich, Rakesh Komuravelli, Ehsan Khish Ardestani Zadeh, Martin Schatz
  • Patent number: 11580192
    Abstract: A processor system comprises a plurality of processing elements. Each processing element includes a corresponding convolution processor unit configured to perform a portion of a groupwise convolution. The corresponding convolution processor unit determines multiplication results by multiplying each data element of a portion of data elements in a convolution data matrix with a corresponding data element in a corresponding groupwise convolution weight matrix. The portion of data elements in the convolution data matrix that are multiplied belong to different channels and different groups. For each specific channel of the different channels, the corresponding convolution processor unit sums together at least some of the multiplication results belonging to the same specific channel to determine a corresponding channel convolution result data element.
    Type: Grant
    Filed: April 8, 2020
    Date of Patent: February 14, 2023
    Assignee: Meta Platforms, Inc.
    Inventors: Rakesh Komuravelli, Krishnakumar Narayanan Nair, Abdulkadir Utku Diril, Ehsan Khish Ardestani Zadeh, Yuchen Hao, Martin Schatz, Thomas Mark Ulrich, Olivia Wu, Anup Ramesh Kadkol, Amin Firoozshahian
  • Publication number: 20230004624
    Abstract: A system comprises a data input vector unit, a weight input vector unit, and a plurality of calculation units. The data input vector unit is configured to concurrently receive elements of different rows of a first and second data matrix. The weight input vector unit is configured to receive a combined weight vector and at least in part concurrently provide obtained weight elements of a first and second weight matrix to a corresponding first and second group of calculation units. At least one calculation unit of each group of the first and second group of calculation units is configured to multiply elements from the data input vector unit with corresponding elements of the corresponding weight matrix from the weight input vector unit and sum together multiplication results of the corresponding calculation unit to at least in part determine a corresponding element in a first or second convolution result matrix.
    Type: Application
    Filed: June 30, 2022
    Publication date: January 5, 2023
    Inventors: Krishnakumar Narayanan Nair, Olivia Wu, Ehsan Khish Ardestani Zadeh, Abdulkadir Utku Diril, Thomas Mark Ulrich, Yuchen Hao, Rakesh Komuravelli, Aravind Kalaiah
  • Patent number: 11537865
    Abstract: A processor system comprises a first and second group of registers and a hardware channel convolution processor unit. The first group of registers is configured to store data elements of channels of a portion of a convolution data matrix. Each register stores at least one data element from each channel. The second group of registers is configured to store data elements of convolution weight matrices including a separate convolution weight matrix for each channel. Each register stores at least one data element from each convolution weight matrix. The hardware channel convolution processor unit is configured to multiply each data element in the first group of registers with a corresponding data element in the second group of registers and sum together the multiplication results for each specific channel to determine corresponding channel convolution result data elements in a corresponding channel convolution result matrix.
    Type: Grant
    Filed: February 18, 2020
    Date of Patent: December 27, 2022
    Assignee: Meta Platforms, Inc.
    Inventors: Krishnakumar Narayanan Nair, Rakesh Komuravelli, Abdulkadir Utku Diril, Ehsan Khish Ardestani Zadeh, Yuchen Hao, Martin Schatz, Thomas Mark Ulrich, Olivia Wu, Anup Ramesh Kadkol, Amin Firoozshahian
  • Patent number: 11520854
    Abstract: A first group of elements is element-wise multiplied with a second group of elements using a plurality of multipliers belonging to a matrix multiplication hardware unit. Results of the plurality of multipliers are added together using a hierarchical tree of adders belonging to the matrix multiplication hardware unit and a final result of the hierarchical tree of adders or any of a plurality of intermediate results of the hierarchical tree of adders is selectively provided for use in determining an output result matrix.
    Type: Grant
    Filed: October 29, 2019
    Date of Patent: December 6, 2022
    Assignee: Meta Platforms, Inc.
    Inventors: Yuchen Hao, Krishnakumar Narayanan Nair, Ehsan Khish Ardestani Zadeh, Rakesh Komuravelli, Abdulkadir Utku Diril, Thomas Mark Ulrich
  • Patent number: 11520853
    Abstract: A processor system comprises two groups of registers and a hardware channel convolution processor unit. The first group of registers is configured to store data elements of channels of a portion of a convolution data matrix. Each register stores at least one data element from each channel. The second group of registers is configured to store data elements of convolution weight matrices including a separate matrix for each channel. Each register stores at least one data element from each matrix. The hardware channel convolution processor unit is configured to multiply each data element in a first and second portion of the first group of registers with a corresponding data element in the second group of registers to determine corresponding multiplication results and sum together the multiplication results for each specific channel to determine two corresponding channel convolution result data elements in a corresponding channel convolution result matrix.
    Type: Grant
    Filed: February 28, 2020
    Date of Patent: December 6, 2022
    Assignee: Meta Platforms, Inc.
    Inventors: Krishnakumar Narayanan Nair, Rakesh Komuravelli, Abdulkadir Utku Diril, Ehsan Khish Ardestani Zadeh, Yuchen Hao, Martin Schatz, Thomas Mark Ulrich, Olivia Wu, Anup Ramesh Kadkol, Amin Firoozshahian
  • Publication number: 20220365784
    Abstract: A processor system comprises a shared memory and a processing element. The processing element includes a matrix processor unit and is in communication with the shared memory. The processing element is configured to receive a processor instruction specifying a data matrix and a matrix manipulation operation. A manipulation matrix based on the processor instruction is identified. The data matrix and the manipulation matrix are used to perform a matrix operation to determine a result matrix.
    Type: Application
    Filed: May 25, 2022
    Publication date: November 17, 2022
    Inventors: Thomas Mark Ulrich, Krishnakumar Narayanan Nair, Yuchen Hao
  • Patent number: 11501147
    Abstract: A disclosed computer-implemented method may include maintaining, within a local memory device (LMD) included in a hardware accelerator (1) a filter matrix corresponding to a filter location included in each of a set of filters of a convolutional layer of an artificial neural network (ANN), and (2) a set of activation vectors corresponding to an active region of an activation volume input into the convolutional layer. The method may also include determining that the active region of the activation volume is contiguous with a padding region associated with at least a portion of the activation volume. The method may further include directing a matrix multiplication unit (MMU) included in the hardware accelerator to execute a matrix multiplication operation (MMO) using the filter matrix and an activation matrix that may include (1) the set of activation vectors, and (2) at least one padding vector corresponding to the padding region.
    Type: Grant
    Filed: January 30, 2020
    Date of Patent: November 15, 2022
    Assignee: Meta Platforms, Inc.
    Inventors: Krishnakumar Narayanan Nair, Ehsan Khish Ardestani, Martin Schatz, Yuchen Hao, Abdulkadir Utku Diril, Rakesh Komuravelli
  • Patent number: 11481471
    Abstract: A system comprises a matrix processor unit that includes a first type of register, a group of a second type of registers, and a plurality of calculation units. The first type of register is configured to concurrently store values from different rows of a first matrix. At least a portion of the first type of register is logically divided into groups of elements, and each of the groups corresponds to a different row of the first matrix. Each of the second type of registers is configured to concurrently store values from a plurality of different rows of a second matrix. Each of the calculation units corresponds to one of the second type of registers and is configured to at least in part determine a corresponding element in a result matrix of convoluting the second matrix with the first matrix.
    Type: Grant
    Filed: August 16, 2019
    Date of Patent: October 25, 2022
    Assignee: Meta Platforms, Inc.
    Inventors: Krishnakumar Nair, Abdulkadir Utku Diril, Dheevatsa Mudigere, Olivia Wu, Ehsan Khish Ardestani Zadeh, Yuchen Hao
  • Patent number: 11443013
    Abstract: A processor system comprises a hardware channel convolution processor unit and dot product processor unit. The channel convolution processor unit is configured to perform depthwise convolution, including by multiplying each data element of a first group of data elements of a convolution data matrix with a corresponding data element of a second group of data elements of a plurality of depthwise convolution weight matrices and summing together, for each specific channel, multiplication results corresponding to the specific channel to determine one corresponding result data element in a corresponding channel convolution result matrix to calculate a portion of depthwise convolution results.
    Type: Grant
    Filed: March 23, 2020
    Date of Patent: September 13, 2022
    Assignee: Meta Platforms, Inc.
    Inventors: Rakesh Komuravelli, Krishnakumar Narayanan Nair, Abdulkadir Utku Diril, Ehsan Khish Ardestani Zadeh, Yuchen Hao, Martin Schatz, Thomas Mark Ulrich, Olivia Wu, Anup Ramesh Kadkol, Amin Firoozshahian
  • Patent number: 11409838
    Abstract: A system comprises a data input vector unit, a weight input vector unit, and a plurality of calculation units of a matrix processor unit. The data input vector unit is configured to concurrently receive elements of different rows of a first and second data matrix. The weight input vector unit is configured to receive a combined weight vector and at least in part concurrently provide obtained weight elements of a first and second weight matrix to a corresponding first and second group of calculation units. Each calculation unit of the first and second group of calculation units is configured to multiply elements from the data input vector unit with elements of the corresponding weight matrix from the weight input vector unit and sum together multiplication results of the corresponding calculation unit to at least in part determine a corresponding element in a first or second convolution result matrix.
    Type: Grant
    Filed: October 29, 2019
    Date of Patent: August 9, 2022
    Assignee: Meta Platforms, Inc.
    Inventors: Krishnakumar Narayanan Nair, Olivia Wu, Ehsan Khish Ardestani Zadeh, Abdulkadir Utku Diril, Thomas Mark Ulrich, Yuchen Hao, Rakesh Komuravelli, Aravind Kalaiah
  • Patent number: 11372644
    Abstract: A processor system comprises a shared memory and a processing element. The processing element includes a matrix processor unit and is in communication with the shared memory. The processing element is configured to receive a processor instruction specifying a data matrix and a matrix manipulation operation. A manipulation matrix based on the processor instruction is identified. The data matrix and the manipulation matrix are loaded into the matrix processor unit and a matrix operation is performed to determine a result matrix. The result matrix is outputted to a destination location.
    Type: Grant
    Filed: December 9, 2019
    Date of Patent: June 28, 2022
    Assignee: Meta Platforms, Inc.
    Inventors: Thomas Mark Ulrich, Krishnakumar Narayanan Nair, Yuchen Hao
  • Publication number: 20220107782
    Abstract: A processor system comprises one or more logic units configured to receive a processor instruction identifying a first floating point number to be multiplied with a second floating point number. The floating point numbers are each decomposed into a group of a plurality of component numbers, wherein a number of bits used to represent each floating point number is greater than a number of bits used to represent any component number in each group of the plurality of component numbers. The component numbers of the first group are multiplied with the component numbers of the second group to determine intermediate multiplication results that are summed together to determine an effective result that represents a result of multiplying the first floating point number with the second floating point number.
    Type: Application
    Filed: October 20, 2021
    Publication date: April 7, 2022
    Inventors: Krishnakumar Narayanan Nair, Anup Ramesh Kadkol, Ehsan Khish Ardestani Zadeh, Olivia Wu, Yuchen Hao, Thomas Mark Ulrich, Rakesh Komuravelli
  • Patent number: 11188303
    Abstract: A processor system comprises one or more logic units configured to receive a processor instruction identifying a first floating point number to be multiplied with a second floating point number. The floating point numbers are each decomposed into a group of a plurality of component numbers, wherein a number of bits used to represent each floating point number is greater than a number of bits used to represent any component number in each group of the plurality of component numbers. The component numbers of the first group are multiplied with the component numbers of the second group to determine intermediate multiplication results that are summed together to determine an effective result that represents a result of multiplying the first floating point number with the second floating point number.
    Type: Grant
    Filed: October 2, 2019
    Date of Patent: November 30, 2021
    Assignee: Facebook, Inc.
    Inventors: Krishnakumar Narayanan Nair, Anup Ramesh Kadkol, Ehsan Khish Ardestani Zadeh, Olivia Wu, Yuchen Hao, Thomas Mark Ulrich, Rakesh Komuravelli
  • Publication number: 20210334072
    Abstract: A processor system comprises a plurality of dot product processor units and element-wise multiplication units. The dot product processor units perform a depthwise convolution of a data matrix with a separate depthwise convolution weight matrix for each data matrix channel. Each dot product processor unit performs at least a portion of the depthwise convolution for one or more data matrix channels. The element-wise multiplication units perform multiplication operations of a pointwise convolution. Each element-wise multiplication unit applies to each depthwise convolution partial result element received from one or more of the dot product processor units a corresponding data element from each of a plurality of pointwise convolution weight filters to determine element-wise multiplication unit results. The processor system sums together different groups of data elements from the element-wise multiplication unit results to at least in part calculate different data elements of a result of the pointwise convolution.
    Type: Application
    Filed: April 22, 2020
    Publication date: October 28, 2021
    Inventors: Rakesh Komuravelli, Krishnakumar Narayanan Nair, Abdulkadir Utku Diril, Ehsan Khish Ardestani Zadeh, Yuchen Hao, Martin Schatz, Thomas Mark Ulrich, Olivia Wu, Anup Ramesh Kadkol, Amin Firoozshahian
  • Publication number: 20210319076
    Abstract: A processor system comprises a plurality of processing elements. Each processing element includes a corresponding convolution processor unit configured to perform a portion of a groupwise convolution. The corresponding convolution processor unit determines multiplication results by multiplying each data element of a portion of data elements in a convolution data matrix with a corresponding data element in a corresponding groupwise convolution weight matrix. The portion of data elements in the convolution data matrix that are multiplied belong to different channels and different groups. For each specific channel of the different channels, the corresponding convolution processor unit sums together at least some of the multiplication results belonging to the same specific channel to determine a corresponding channel convolution result data element.
    Type: Application
    Filed: April 8, 2020
    Publication date: October 14, 2021
    Inventors: Rakesh Komuravelli, Krishnakumar Narayanan Nair, Abdulkadir Utku Diril, Ehsan Khish Ardestani Zadeh, Yuchen Hao, Martin Schatz, Thomas Mark Ulrich, Olivia Wu, Anup Ramesh Kadkol, Amin Firoozshahian
  • Patent number: 11138292
    Abstract: An electronic circuit performs depthwise convolution of an input matrix with a kernel matrix to generate an output matrix. In each of a plurality of rounds of operations, a row of kernel matrix elements is selected for the round of operations, and applied to the input matrix to obtain an intermediate data array corresponding to the selected row of kernel elements. The electronic circuit includes a plurality of subcircuits operable in parallel to generate, in each operation, a set of intermediate data elements in the intermediate data array. Each subcircuit generates a respective intermediate data element that is the sum of a respective row of the input matrix elements weighted by a set of weight elements including the selected row of kernel elements and at least one zero element. The selected row of kernel elements is successively shifted among the set of weight elements in the round of operations.
    Type: Grant
    Filed: May 16, 2019
    Date of Patent: October 5, 2021
    Assignee: FACEBOOK, INC.
    Inventors: Krishnakumar Nair, Abdulkadir Utku Diril, Dheevatsa Mudigere, Ehsan Khish Ardestani Zadeh, Olivia Wu, Yuchen Hao
  • Publication number: 20210294875
    Abstract: A processor system comprises a hardware channel convolution processor unit and dot product processor unit. The channel convolution processor unit is configured to perform depthwise convolution, including by multiplying each data element of a first group of data elements of a convolution data matrix with a corresponding data element of a second group of data elements of a plurality of depthwise convolution weight matrices and summing together, for each specific channel, multiplication results corresponding to the specific channel to determine one corresponding result data element in a corresponding channel convolution result matrix to calculate a portion of depthwise convolution results.
    Type: Application
    Filed: March 23, 2020
    Publication date: September 23, 2021
    Inventors: Rakesh Komuravelli, Krishnakumar Narayanan Nair, Abdulkadir Utku Diril, Ehsan Khish Ardestani Zadeh, Yuchen Hao, Martin Schatz, Thomas Mark Ulrich, Olivia Wu, Anup Ramesh Kadkol, Amin Firoozshahian