Patents by Inventor Vinayak Anand Gokhale
Vinayak Anand Gokhale has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Patent number: 12164917Abstract: A system including one or more processors configured to receive a transpose instruction indicating to transpose a source matrix to a result matrix, provide data elements of the source matrix to input switching circuits, reorder the data elements using the input switching circuits, provide the data elements from the input switching circuits to one or more lanes of a datapath, provide the data elements from the datapath to output switching circuits, undo the reordering of the data elements using the output switching circuits, and provide the data elements from the output switching circuits to a result matrix. Each respective lane of the datapath receiving data elements receives multiple data elements directed to different respective non-overlapping portions of the lane.Type: GrantFiled: May 17, 2023Date of Patent: December 10, 2024Assignee: Google LLCInventors: Vinayak Anand Gokhale, Matthew Leever Hedlund, Matthew William Ashcraft, Indranil Chakraborty
-
Publication number: 20240385837Abstract: A system including one or more processors configured to receive a transpose instruction indicating to transpose a source matrix to a result matrix, provide data elements of the source matrix to input switching circuits, reorder the data elements using the input switching circuits, provide the data elements from the input switching circuits to one or more lanes of a datapath, provide the data elements from the datapath to output switching circuits, undo the reordering of the data elements using the output switching circuits, and provide the data elements from the output switching circuits to a result matrix. Each respective lane of the datapath receiving data elements receives multiple data elements directed to different respective non-overlapping portions of the lane.Type: ApplicationFiled: May 17, 2023Publication date: November 21, 2024Inventors: Vinayak Anand Gokhale, Matthew Leever Hedlund, Matthew William Ashcraft, Indranil Chakraborty
-
Patent number: 12073216Abstract: In a system including vector registers storing right-hand side data and left-hand side data, first and second matrix staging registers, and a systolic array of processing cells for conducting matrix multiplication operations using the right-hand side data and left-hand side data, one or more processors load the right-hand side data from the vector registers to the first matrix staging register based on an instruction indicating whether to transpose the right-hand side data, load the left-hand side data from the vector registers into the second matrix staging register based on another instruction indicating whether to transpose the left-hand side data, load the right-hand side data from the first matrix staging register into the systolic array, and, in a cycle of the matrix multiplication operation, pass one or more columns of the left-hand side data from the second matrix staging register to a column of the systolic array.Type: GrantFiled: February 14, 2023Date of Patent: August 27, 2024Assignee: Google LLCInventors: Matthew Leever Hedlund, Christopher Aaron Clark, Andrew Everett Phelps, Thomas James Norrie, Sushma Honnavara-Prasad, Vinayak Anand Gokhale, Pareesa Ameneh Golnari
-
Publication number: 20240272904Abstract: In a system including vector registers storing right-hand side data and left-hand side data, first and second matrix staging registers, and a systolic array of processing cells for conducting matrix multiplication operations using the right-hand side data and left-hand side data, one or more processors load the right-hand side data from the vector registers to the first matrix staging register based on an instruction indicating whether to transpose the right-hand side data, load the left-hand side data from the vector registers into the second matrix staging register based on another instruction indicating whether to transpose the left-hand side data, load the right-hand side data from the first matrix staging register into the systolic array, and, in a cycle of the matrix multiplication operation, pass one or more columns of the left-hand side data from the second matrix staging register to a column of the systolic array.Type: ApplicationFiled: February 14, 2023Publication date: August 15, 2024Inventors: Matthew Leever Hedlund, Christopher Aaron Clark, Andrew Everett Phelps, Thomas James Norrie, Sushma Honnavara-Prasad, Vinayak Anand Gokhale, Pareesa Ameneh Golnari
-
Publication number: 20240220202Abstract: A system and method for matrix multiplication using a systolic array configurable between multiple modes of operation. A systolic processor may receive a data type indicator for the matrix multiplication. For a first data type, the systolic processor may load the right-hand side data from the right-hand matrix register into the data processing cells of the systolic array between row 0 and row M?1, and pass the respective row of the left-hand side data through a corresponding row of the systolic array between rows 0 and M?1. For a second data type, the systolic processor may split each element of the left-hand side data and the right-hand side data into respective first and second element halves, and move each element half through a corresponding row of the systolic array between rows 0 and 2M?1.Type: ApplicationFiled: February 14, 2023Publication date: July 4, 2024Inventors: Matthew Leever Hedlund, Christopher Aaron Clark, Andrew Everett Phelps, Thomas James Norrie, Norman Paul Jouppi, Sushma Honnavara-Prasad, Vinayak Anand Gokhale, Pareesa Ameneh Golnari
-
Patent number: 10157156Abstract: A coprocessor (PL) is disclosed. The PL includes a memory router, at least one collection block that is configured to transfer data to/from the memory router, each collection block includes a collection router that is configured to i) transfer data to/from the memory router, ii) transfer data to/from at least one collection router of a neighboring collection block, and iii) transfer data to/from blocks within the collection block, and at least one programmable operator that is configured to i) transfer data to/from the collection router, and ii) perform a programmable operation on data received from the collection router.Type: GrantFiled: December 31, 2017Date of Patent: December 18, 2018Assignee: PURDUE RESEARCH FOUNDATIONInventors: Eugenio Culurciello, Berin Eduard Martini, Vinayak Anand Gokhale, Jonghoon Jin, Aysegul Dundar
-
Publication number: 20180107620Abstract: A coprocessor (PL) is disclosed. The PL includes a memory router, at least one collection block that is configured to transfer data to/from the memory router, each collection block includes a collection router that is configured to i) transfer data to/from the memory router, ii) transfer data to/from at least one collection router of a neighboring collection block, and iii) transfer data to/from blocks within the collection block, and at least one programmable operator that is configured to i) transfer data to/from the collection router, and ii) perform a programmable operation on data received from the collection router.Type: ApplicationFiled: December 31, 2017Publication date: April 19, 2018Applicant: Purdue Research FoundationInventors: Eugenio Culurciello, Berin Eduard Martini, Vinayak Anand Gokhale, Jonghoon Jin, Aysegul Dundar
-
Patent number: 9858220Abstract: A coprocessor (PL) is disclosed. The PL includes a memory router, at least one collection block that is configured to transfer data to/from the memory router, each collection block includes a collection router that is configured to i) transfer data to/from the memory router, ii) transfer data to/from at least one collection router of a neighboring collection block, and iii) transfer data to/from blocks within the collection block, and at least one programmable operator that is configured to i) transfer data to/from the collection router, and ii) perform a programmable operation on data received from the collection router.Type: GrantFiled: March 17, 2015Date of Patent: January 2, 2018Assignee: Purdue Research FoundationInventors: Eugenio Culurciello, Berin Eduard Martini, Vinayak Anand Gokhale, Jonghoon Jin, Aysegul Dundar
-
Publication number: 20150261702Abstract: A coprocessor (PL) is disclosed. The PL includes a memory router, at least one collection block that is configured to transfer data to/from the memory router, each collection block includes a collection router that is configured to i) transfer data to/from the memory router, ii) transfer data to/from at least one collection router of a neighboring collection block, and iii) transfer data to/from blocks within the collection block, and at least one programmable operator that is configured to i) transfer data to/from the collection router, and ii) perform a programmable operation on data received from the collection router.Type: ApplicationFiled: March 17, 2015Publication date: September 17, 2015Applicant: PURDUE RESEARCH FOUNDATIONInventors: Eugenio Culurciello, Berin Eduard Martini, Vinayak Anand Gokhale, Jonghoon Jin, Aysegul Dundar