Patents by Inventor Ganesh Venkatesh

Ganesh Venkatesh has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

SYSTEMS AND METHODS FOR SPEECH OR TEXT PROCESSING USING MATRIX OPERATIONS

Publication number: 20240152575

Abstract: Disclosed herein includes a system, a method, and a device for processing and converting data using matrix operations. Circuitry can partition an input of a first data format across a plurality of lookup tables each residing in a respective memory. The circuitry can access weight information from a load store memory, and the partitioned input on a per column basis from the plurality of lookup tables. The circuitry can perform a number of multiply-accumulate (MAC) operations per cycle between the weight information from the load store memory and the partitioned input read on a per column basis from the plurality of lookup tables. The number of MAC operations performed per cycle can correspond to a total number of columns of the plurality of lookup tables. The circuitry can generate, responsive to the MAC operations on the partitioned input, a plurality of outputs in a second data format.

Type: Application

Filed: January 17, 2024

Publication date: May 9, 2024

Applicant: Meta Platforms Technologies, LLC

Inventors: Alagappan Valliappan, Pierce I-Jen Chuang, Ganesh Venkatesh

Dynamic content rendering based on context for AR and assistant systems

Patent number: 11966701

Abstract: In one embodiment, a method includes rendering a first output image comprising one or more augmented-reality (AR) objects for displays of an AR rendering device of an AR system associated with a first user. The method further includes accessing sensor signals associated with the first user. The one or more sensor signals may be captured by sensors of the AR system. The method further includes detecting a change in a context of the first user with respect to a real-world environment based on the sensor signals. The method further includes rendering a second output image comprising the AR objects for the displays of the AR rendering device. One or more of the AR objects may be adapted based on the detected change in the context of the first user.

Type: Grant

Filed: August 2, 2021

Date of Patent: April 23, 2024

Assignee: Meta Platforms, Inc.

Inventors: Yiming Pu, Christopher E. Balmes, Gabrielle Catherine Moskey, John Jacob Blakeley, Amy Lawson Bearman, Alireza Dirafzoon, Matthew Dan Feiszli, Ganesh Venkatesh, Babak Damavandi, Jiwen Ren, Chengyuan Yan, Guangqiang Dong

Systems and methods for reading and writing sparse data in a neural network accelerator

Patent number: 11954025

Abstract: Disclosed herein includes a system, a method, and a device for reading and writing sparse data in a neural network accelerator. A mask identifying byte positions within a data word having non-zero values in memory can be accessed. Each bit of the mask can have a first value or a second value, the first value indicating that a byte of the data word corresponds to a non-zero byte value, the second value indicating that the byte of the data word corresponds to a zero byte value. The data word can be modified to have non-zero byte values stored at an end of a first side of the data word in the memory, and any zero byte values stored in a remainder of the data word. The modified data word can be written to the memory via at least a first slice of a plurality of slices that is configured to access the first side of the data word in the memory.

Type: Grant

Filed: March 24, 2023

Date of Patent: April 9, 2024

Assignee: Meta Platforms Technologies, LLC

Inventors: Ganesh Venkatesh, Liangzhen Lai, Pierce I-Jen Chuang, Meng Li
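
The compaction step described in the abstract for patent 11954025 can be illustrated with a small software sketch. It is only an illustration, not the patented circuit: the function names `pack_word`, `unpack_word`, and `slices_needed`, the mapping of mask bit i to byte i, and the choice of the low-index end of the word as the "first side" are all assumptions made for this example.

```python
from typing import Tuple


def pack_word(word: bytes) -> Tuple[int, bytes]:
    """Return (mask, packed) for one data word.

    Assumption: bit i of the mask is 1 when byte i is non-zero and 0 when
    it is zero. Non-zero bytes are moved to the low-index end of the word
    and the remainder is filled with zero bytes.
    """
    mask = 0
    nonzero = bytearray()
    for i, b in enumerate(word):
        if b != 0:
            mask |= 1 << i
            nonzero.append(b)
    packed = bytes(nonzero) + bytes(len(word) - len(nonzero))
    return mask, packed


def unpack_word(mask: int, packed: bytes) -> bytes:
    """Rebuild the original word from the mask and the packed word."""
    out = bytearray(len(packed))
    src = 0
    for i in range(len(packed)):
        if mask & (1 << i):
            out[i] = packed[src]
            src += 1
    return bytes(out)


def slices_needed(mask: int, slice_bytes: int) -> int:
    """Number of equal-width slices that must be written (hypothetical model).

    At least the first slice is always counted, mirroring the abstract's
    "via at least a first slice".
    """
    nonzero = bin(mask).count("1")
    return max(1, -(-nonzero // slice_bytes))  # ceiling division


word = bytes([0, 5, 0, 0, 7, 0, 0, 9])
mask, packed = pack_word(word)
print(bin(mask), list(packed), slices_needed(mask, 4))
```

Under these assumptions, a word with three non-zero bytes and a 4-byte slice width needs only the first slice, which is the kind of saving the abstract appears to target.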

Spatial tiling of compute arrays with shared control

Patent number: 11954580

Abstract: In one embodiment, a method for machine learning acceleration includes receiving, by a shared controller of a tensor processor cluster that includes multiple tensor processors, a multi-cycle instruction, determining, based on the instruction, a sequence of vector operations to be executed by the tensor processors and address information usable to determine a respective spatial partition of an input tensor on which each tensor processor is to operate when performing each vector operation. The method also includes, for each vector operation in the sequence, generating, based on the address information, a common address offset, relative to a respective base address associated with each tensor processor, at which each tensor processor is to retrieve the respective spatial partition on which the tensor processor is to operate, multicasting the common address offset to the tensor processors, and controlling the tensor processors to execute the vector operation in parallel and in lock step.

Type: Grant

Filed: September 16, 2020

Date of Patent: April 9, 2024

Assignee: Meta Platforms, Inc.

Inventors: Harshit Khaitan, Ganesh Venkatesh, Vikas Chandra

Tensor controller architecture

Patent number: 11922306

Abstract: A machine-learning accelerator system, comprising: a plurality of controllers each configured to traverse a feature map with n-dimensions according to instructions that specify, for each of the n-dimensions, a respective traversal size, wherein each controller comprises: a counter stack comprising counters each associated with a respective dimension of the n-dimensions of the feature map, wherein each counter is configured to increment a respective count from a respective initial value to the respective traversal size associated with the respective dimension associated with that counter; a plurality of address generators each configured to use the respective counts of the counters to generate at least one memory address at which a portion of the feature map is stored; and a dependency controller computing module configured to (1) track conditional statuses for incrementing the counters and (2) allow or disallow each of the counters to increment based on the conditional statuses.

Type: Grant

Filed: December 28, 2020

Date of Patent: March 5, 2024

Assignee: Meta Platforms, Inc.

Inventors: Harshit Khaitan, Ganesh Venkatesh, Simon James Hollis
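
The counter stack and address generators in the abstract for patent 11922306 resemble a nested-loop "odometer" that turns per-dimension counts into memory addresses. The sketch below is a software analogy under stated assumptions: the name `traverse`, the use of per-dimension strides, counting from 0 up to size minus 1, and a single address generator are all choices made for illustration. The dependency controller that allows or disallows increments is omitted.

```python
from typing import Iterator, Sequence


def traverse(sizes: Sequence[int], strides: Sequence[int], base: int = 0) -> Iterator[int]:
    """Yield one address per position of an n-dimensional traversal.

    sizes[d]   : traversal size of dimension d (assumed to start at 0)
    strides[d] : address step for dimension d (an assumed layout model)
    """
    if any(s <= 0 for s in sizes):
        return
    counters = [0] * len(sizes)  # one counter per dimension
    while True:
        # Address generator: combine the current counts into an address.
        yield base + sum(c * s for c, s in zip(counters, strides))
        # Increment the innermost counter and carry outward on wrap-around.
        dim = len(counters) - 1
        while dim >= 0:
            counters[dim] += 1
            if counters[dim] < sizes[dim]:
                break
            counters[dim] = 0
            dim -= 1
        if dim < 0:  # every counter wrapped: traversal complete
            return


print(list(traverse([2, 3], [3, 1])))            # [0, 1, 2, 3, 4, 5]
print(list(traverse([2, 2], [8, 2], base=100)))  # [100, 102, 108, 110]
```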

Systems and methods for speech or text processing using matrix operations

Patent number: 11899745

Abstract: Disclosed herein includes a system, a method, and a device for processing and converting data using matrix operations. Circuitry can partition an input of a first data format across a plurality of lookup tables each residing in a respective memory. The circuitry can access weight information from a load store memory, and the partitioned input on a per column basis from the plurality of lookup tables. The circuitry can perform a number of multiply-accumulate (MAC) operations per cycle between the weight information from the load store memory and the partitioned input read on a per column basis from the plurality of lookup tables. The number of MAC operations performed per cycle can correspond to a total number of columns of the plurality of lookup tables. The circuitry can generate, responsive to the MAC operations on the partitioned input, a plurality of outputs in a second data format.

Type: Grant

Filed: August 19, 2020

Date of Patent: February 13, 2024

Assignee: Meta Platforms Technologies, LLC

Inventors: Alagappan Valliappan, Ganesh Venkatesh, Pierce I-Jen Chuang

SYSTEMS, METHODS, AND APPARATUSES FOR HETEROGENEOUS COMPUTING

Publication number: 20230418655

Abstract: Embodiments of systems, methods, and apparatuses for heterogeneous computing are described. In some embodiments, a hardware heterogeneous scheduler dispatches instructions for execution on one or more of a plurality of heterogeneous processing elements, the instructions corresponding to a code fragment to be processed by the one or more of the plurality of heterogeneous processing elements, wherein the instructions are native instructions to at least one of the one or more of the plurality of heterogeneous processing elements.

Type: Application

Filed: June 9, 2023

Publication date: December 28, 2023

Inventors: Rajesh M. SANKARAN, Gilbert NEIGER, Narayan RANGANATHAN, Stephen R. VAN DOREN, Joseph NUZMAN, Niall D. MCDONNELL, Michael A. O'HANLON, Lokpraveen B. MOSUR, Tracy Garrett DRYSDALE, Eriko NURVITADHI, Asit K. MISHRA, Ganesh VENKATESH, Deborah T. MARR, Nicholas P. CARTER, Jonathan D. PEARCE, Edward T. GROCHOWSKI, Richard J. GRECO, Robert VALENTINE, Jesus CORBAL, Thomas D. FLETCHER, Dennis R. BRADFORD, Dwight P. MANLEY, Mark J. CHARNEY, Jeffrey J. COOK, Paul CAPRIOLI, Koichi YAMADA, Kent D. GLOSSOP, David B. SHEFFIELD

SYSTEMS AND METHODS FOR READING AND WRITING SPARSE DATA IN A NEURAL NETWORK ACCELERATOR

Publication number: 20230229591

Abstract: Disclosed herein includes a system, a method, and a device for reading and writing sparse data in a neural network accelerator. A mask identifying byte positions within a data word having non-zero values in memory can be accessed. Each bit of the mask can have a first value or a second value, the first value indicating that a byte of the data word corresponds to a non-zero byte value, the second value indicating that the byte of the data word corresponds to a zero byte value. The data word can be modified to have non-zero byte values stored at an end of a first side of the data word in the memory, and any zero byte values stored in a remainder of the data word. The modified data word can be written to the memory via at least a first slice of a plurality of slices that is configured to access the first side of the data word in the memory.

Type: Application

Filed: March 24, 2023

Publication date: July 20, 2023

Inventors: Ganesh Venkatesh, Liangzhen Lai, Pierce I-Jen Chuang, Meng Li

Systems and methods for distributing a neural network across multiple computing devices

Patent number: 11698529

Abstract: Disclosed herein is a method for using a neural network across multiple devices. The method can include receiving, by a first device configured with a first one or more layers of a neural network, input data for processing via the neural network implemented across the first device and a second device. The method can include outputting, by the first one or more layers of the neural network implemented on the first device, a data set that is reduced in size relative to the input data while identifying one or more features of the input data for processing by a second one or more layers of the neural network. The method can include communicating, by the first device, the data set to the second device for processing via the second one or more layers of the neural network implemented on the second device.

Type: Grant

Filed: July 9, 2019

Date of Patent: July 11, 2023

Assignee: Meta Platforms Technologies, LLC

Inventors: Liangzhen Lai, Pierce I-Jen Chuang, Vikas Chandra, Ganesh Venkatesh

Systems, methods, and apparatuses for heterogeneous computing

Patent number: 11693691

Abstract: Embodiments of systems, methods, and apparatuses for heterogeneous computing are described. In some embodiments, a hardware heterogeneous scheduler dispatches instructions for execution on one or more of a plurality of heterogeneous processing elements, the instructions corresponding to a code fragment to be processed by the one or more of the plurality of heterogeneous processing elements, wherein the instructions are native instructions to at least one of the one or more of the plurality of heterogeneous processing elements.

Type: Grant

Filed: July 21, 2021

Date of Patent: July 4, 2023

Assignee: Intel Corporation

Inventors: Rajesh M. Sankaran, Gilbert Neiger, Narayan Ranganathan, Stephen R. Van Doren, Joseph Nuzman, Niall D. McDonnell, Michael A. O'Hanlon, Lokpraveen B. Mosur, Tracy Garrett Drysdale, Eriko Nurvitadhi, Asit K. Mishra, Ganesh Venkatesh, Deborah T. Marr, Nicholas P. Carter, Jonathan D. Pearce, Edward T. Grochowski, Richard J. Greco, Robert Valentine, Jesus Corbal, Thomas D. Fletcher, Dennis R. Bradford, Dwight P. Manley, Mark J. Charney, Jeffrey J. Cook, Paul Caprioli, Koichi Yamada, Kent D. Glossop, David B. Sheffield

Optimization for deconvolution

Patent number: 11681777

Abstract: Disclosed herein includes a system, a method, and a device for improving computational efficiency of deconvolution by reducing a number of dot products. In one aspect, an input image having a set of pixels is received. A first dot product may be performed on a subset of the set of pixels of the input image and a portion of a kernel, to generate a first pixel of an output image. A number of multiplications performed for the first dot product may be less than a number of elements of the kernel. A second dot product on a remaining portion of the kernel to generate the first pixel of the output image may be bypassed.

Type: Grant

Filed: January 10, 2022

Date of Patent: June 20, 2023

Assignee: Meta Platforms Technologies, LLC

Inventor: Ganesh Venkatesh
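
The idea in the abstract for patent 11681777, that each output pixel needs fewer multiplications than there are kernel elements, can be seen in a 1D strided deconvolution (transposed convolution). The sketch below is a generic illustration, not the patented method, which operates on images: the function names, the 1D setting, and the multiplication counters are assumptions. The naive version inserts zeros between input samples and multiplies by every kernel tap. The second version visits only the taps that line up with real input samples.

```python
from typing import List, Sequence, Tuple


def deconv1d_naive(x: Sequence[int], k: Sequence[int], stride: int) -> Tuple[List[int], int]:
    """Zero-insert the input, then run a full convolution over it."""
    z: List[int] = []
    for i, v in enumerate(x):
        z.append(v)
        if i < len(x) - 1:
            z.extend([0] * (stride - 1))  # inserted zeros
    y = [0] * (len(z) + len(k) - 1)
    mults = 0
    for o in range(len(y)):
        for j in range(len(k)):
            p = o - j
            if 0 <= p < len(z):
                y[o] += z[p] * k[j]  # often a multiplication by zero
                mults += 1
    return y, mults


def deconv1d_skip(x: Sequence[int], k: Sequence[int], stride: int) -> Tuple[List[int], int]:
    """Same result, touching only kernel taps aligned with real samples."""
    y = [0] * ((len(x) - 1) * stride + len(k))
    mults = 0
    for o in range(len(y)):
        # Only taps with j congruent to o modulo stride meet a real sample.
        for j in range(o % stride, len(k), stride):
            i = (o - j) // stride  # exact division by construction
            if 0 <= i < len(x):
                y[o] += x[i] * k[j]
                mults += 1
    return y, mults


y_naive, m_naive = deconv1d_naive([1, 2, 3], [1, 10, 100], stride=2)
y_skip, m_skip = deconv1d_skip([1, 2, 3], [1, 10, 100], stride=2)
print(y_skip)             # [1, 10, 102, 20, 203, 30, 300]
print(m_skip < m_naive)   # True
```

With stride 2 and a 3-tap kernel, each output sample uses at most two taps, so the remaining products are never computed.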

System and method for performing small channel count convolutions in energy-efficient input operand stationary accelerator

Patent number: 11675998

Abstract: Disclosed herein includes a system, a method, and a device for receiving input data to generate a plurality of outputs for a layer of a neural network. The plurality of outputs are arranged in a first array. Dimensions of the first array may be compared with dimensions of a processing unit (PE) array including a plurality of PEs. According to a result of the comparing, the first array is partitioned into subarrays by the processor. Each of the subarrays has dimensions less than or equal to the dimensions of the PE array. A first group of PEs in the PE array is assigned to a first one of the subarrays. A corresponding output of the plurality of outputs is generated using a portion of the input data by each PE of the first group of PEs assigned to the first one of the subarrays.

Type: Grant

Filed: July 15, 2019

Date of Patent: June 13, 2023

Assignee: Meta Platforms Technologies, LLC

Inventors: Ganesh Venkatesh, Liangzhen Lai, Pierce I-Jen Chuang, Meng Li
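
The partitioning step in the abstract for patent 11675998 (splitting an output array into subarrays no larger than the PE array) can be sketched as simple tiling. This is a minimal illustration under assumptions: the name `partition`, the row-major tile order, and the `(row, col, height, width)` tuple layout are hypothetical, and the abstract does not state how PEs are assigned within a tile.

```python
from typing import List, Tuple

Tile = Tuple[int, int, int, int]  # (row offset, col offset, height, width)


def partition(out_rows: int, out_cols: int, pe_rows: int, pe_cols: int) -> List[Tile]:
    """Split an out_rows x out_cols output array into tiles that each fit
    within a pe_rows x pe_cols PE array."""
    tiles: List[Tile] = []
    for r in range(0, out_rows, pe_rows):
        for c in range(0, out_cols, pe_cols):
            height = min(pe_rows, out_rows - r)  # edge tiles may be smaller
            width = min(pe_cols, out_cols - c)
            tiles.append((r, c, height, width))
    return tiles


print(partition(5, 3, 4, 4))  # [(0, 0, 4, 3), (4, 0, 1, 3)]
```

Each tile could then be assigned to a group of PEs, with one PE producing one output element of the tile.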

Systems and methods for reading and writing sparse data in a neural network accelerator

Patent number: 11630770

Abstract: Disclosed herein includes a system, a method, and a device for reading and writing sparse data in a neural network accelerator. A plurality of slices can be established to access a memory having an access size of a data word. A first slice can be configured to access a first side of the data word in memory. Circuitry can access a mask identifying byte positions within the data word having non-zero values. The circuitry can modify the data word to have non-zero byte values stored starting at an end of the first side, and any zero byte values stored in a remainder of the data word. A determination can be made whether a number of non-zero byte values is less than or equal to a first access size of the first slice. The circuitry can write the modified data word to the memory via at least the first slice.

Type: Grant

Filed: July 11, 2019

Date of Patent: April 18, 2023

Assignee: Meta Platforms Technologies, LLC

Inventors: Ganesh Venkatesh, Liangzhen Lai, Pierce I-Jen Chuang, Meng Li

System and method for shift-based information mixing across channels for shufflenet-like neural networks

Patent number: 11615319

Abstract: Disclosed herein includes a system, a method, and a device for performing a convolution on data of a current layer of a neural network, including a plurality of channels arranged in a first order and partitioned into a plurality of first partitions according to the first order. Each first partition includes a result of a convolution on a corresponding partition of channels in data of a previous layer of the neural network. The device shifts the plurality of channels arranged in the first order to a second order and partitions the shifted plurality of channels into a plurality of second partitions according to the second order. For each of the plurality of second partitions, the device performs a convolution on channels of the shifted plurality of channels that are in the corresponding second partition.

Type: Grant

Filed: July 15, 2019

Date of Patent: March 28, 2023

Assignee: Meta Platforms Technologies, LLC

Inventor: Ganesh Venkatesh
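
The shift-then-repartition step in the abstract for patent 11615319 can be pictured with channel indices alone. The sketch below assumes the shift is a circular rotation and that partitions are equal-sized contiguous groups. The abstract does not specify either detail, and the names `group` and `shift_channels` are hypothetical.

```python
from typing import List


def group(channels: List[int], group_size: int) -> List[List[int]]:
    """Partition channels into contiguous groups, in their current order."""
    return [channels[i:i + group_size] for i in range(0, len(channels), group_size)]


def shift_channels(channels: List[int], shift: int) -> List[int]:
    """Rotate the channel order by `shift` positions (assumed form of the shift)."""
    if not channels:
        return []
    shift %= len(channels)
    return channels[shift:] + channels[:shift]


channels = list(range(8))
first = group(channels, 4)                      # [[0, 1, 2, 3], [4, 5, 6, 7]]
second = group(shift_channels(channels, 2), 4)  # [[2, 3, 4, 5], [6, 7, 0, 1]]
print(first, second)
```

Each second partition now holds channels from two different first partitions, so a grouped convolution over the second partitions mixes information across the earlier groups without a full channel shuffle.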

Holographic Calling for Artificial Reality

Publication number: 20220413434

Abstract: A holographic calling system can capture and encode holographic data at a sender-side of a holographic calling pipeline and decode and present the holographic data as a 3D representation of a sender at a receiver-side of the holographic calling pipeline. The holographic calling pipeline can include stages to capture audio, color images, and depth images; densify the depth images to have a depth value for each pixel while generating parts masks and a body model; use the masks to segment the images into parts needed for hologram generation; convert depth images into a 3D mesh; paint the 3D mesh with color data; perform torso disocclusion; perform face reconstruction; and perform audio synchronization. In various implementations, different ones of these stages can be performed sender-side or receiver-side. The holographic calling pipeline also includes sender-side compression, transmission over a communication channel, and receiver-side decompression and hologram output.

Type: Application

Filed: June 28, 2021

Publication date: December 29, 2022

Inventors: Albert PARRA POZO, Joseph VIRSKUS, Ganesh VENKATESH, Kai LI, Shen-Chi CHEN, Amit KUMAR, Rakesh RANJAN, Brian Keith CABRAL, Samuel Alan JOHNSON, Wei YE, Michael Alexander SNOWER, Yash PATEL

Holographic Calling for Artificial Reality

Publication number: 20220413433

Abstract: A holographic calling system can capture and encode holographic data at a sender-side of a holographic calling pipeline and decode and present the holographic data as a 3D representation of a sender at a receiver-side of the holographic calling pipeline. The holographic calling pipeline can include stages to capture audio, color images, and depth images; densify the depth images to have a depth value for each pixel while generating parts masks and a body model; use the masks to segment the images into parts needed for hologram generation; convert depth images into a 3D mesh; paint the 3D mesh with color data; perform torso disocclusion; perform face reconstruction; and perform audio synchronization. In various implementations, different ones of these stages can be performed sender-side or receiver-side. The holographic calling pipeline also includes sender-side compression, transmission over a communication channel, and receiver-side decompression and hologram output.

Type: Application

Filed: June 28, 2021

Publication date: December 29, 2022

Inventors: Albert PARRA POZO, Joseph VIRSKUS, Ganesh VENKATESH, Kai LI, Shen-Chi CHEN, Amit KUMAR, Rakesh RANJAN, Brian Keith CABRAL, Samuel Alan JOHNSON, Wei YE, Michael Alexander SNOWER, Yash PATEL

Dynamic Content Rendering Based on Context for AR and Assistant Systems

Publication number: 20220374130

Abstract: In one embodiment, a method includes rendering a first output image comprising one or more augmented-reality (AR) objects for displays of an AR rendering device of an AR system associated with a first user. The method further includes accessing sensor signals associated with the first user. The one or more sensor signals may be captured by sensors of the AR system. The method further includes detecting a change in a context of the first user with respect to a real-world environment based on the sensor signals. The method further includes rendering a second output image comprising the AR objects for the displays of the AR rendering device. One or more of the AR objects may be adapted based on the detected change in the context of the first user.

Type: Application

Filed: August 2, 2021

Publication date: November 24, 2022

Inventors: Yiming Pu, Christopher E. Balmes, Gabrielle Catherine Moskey, John Jacob Blakeley, Amy Lawson Bearman, Alireza Dirafzoon, Matthew Dan Feiszli, Ganesh Venkatesh, Babak Damavandi, Jiwen Ren, Chengyuan Yan, Guangqiang Dong

Holographic calling for artificial reality

Patent number: 11461962

Abstract: A holographic calling system can capture and encode holographic data at a sender-side of a holographic calling pipeline and decode and present the holographic data as a 3D representation of a sender at a receiver-side of the holographic calling pipeline. The holographic calling pipeline can include stages to capture audio, color images, and depth images; densify the depth images to have a depth value for each pixel while generating parts masks and a body model; use the masks to segment the images into parts needed for hologram generation; convert depth images into a 3D mesh; paint the 3D mesh with color data; perform torso disocclusion; perform face reconstruction; and perform audio synchronization. In various implementations, different ones of these stages can be performed sender-side or receiver-side. The holographic calling pipeline also includes sender-side compression, transmission over a communication channel, and receiver-side decompression and hologram output.

Type: Grant

Filed: June 28, 2021

Date of Patent: October 4, 2022

Assignee: Meta Platforms Technologies, LLC

Inventors: Albert Parra Pozo, Joseph Virskus, Ganesh Venkatesh, Kai Li, Shen-Chi Chen, Amit Kumar, Rakesh Ranjan, Brian Keith Cabral, Samuel Alan Johnson, Wei Ye, Michael Alexander Snower, Yash Patel

Efficient multiply-accumulation based on sparse matrix

Patent number: 11429394

Abstract: Disclosed herein includes improving computational efficiency of multiply-accumulate (MAC) operations. In one aspect, a computing device identifies a first vector including non-zero elements of a base matrix, and a second vector indicating a location of each of the non-zero elements of the base matrix. In one aspect, the device determines a first element and a second element of the first vector. In one aspect, the device determines a third element and a fourth element of the second vector. In one aspect, the device determines i) a fifth element of an input vector according to the third element of the second vector, and ii) a sixth element of the input vector according to the fourth element of the second vector. In one aspect, the device causes MAC circuitry to perform a dot product according to the first element, the second element, the fifth element, and the sixth element.

Type: Grant

Filed: August 19, 2020

Date of Patent: August 30, 2022

Assignee: Meta Platforms Technologies, LLC

Inventors: Alagappan Valliappan, Ganesh Venkatesh, Pierce I-Jen Chuang
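
The two-vector scheme in the abstract for patent 11429394 (one vector of non-zero values, one vector of their locations) can be sketched for a single matrix row. This is a hedged illustration rather than the patented circuit: the names `compress_row` and `sparse_dot`, the single-row simplification, and the `lanes` parameter (two elements per step, following the abstract's pairs of elements) are assumptions.

```python
from typing import List, Sequence, Tuple


def compress_row(row: Sequence[int]) -> Tuple[List[int], List[int]]:
    """Return (values, locations): the non-zero entries and their indices."""
    values = [v for v in row if v != 0]
    locations = [i for i, v in enumerate(row) if v != 0]
    return values, locations


def sparse_dot(values: Sequence[int], locations: Sequence[int],
               x: Sequence[int], lanes: int = 2) -> int:
    """Dot product that multiplies only the non-zero weights.

    Each step takes `lanes` weights from `values`, uses the matching
    entries of `locations` to gather input elements, and accumulates
    the resulting small dot product.
    """
    acc = 0
    for start in range(0, len(values), lanes):
        weights = values[start:start + lanes]
        gathered = [x[loc] for loc in locations[start:start + lanes]]
        acc += sum(w * g for w, g in zip(weights, gathered))
    return acc


row = [0, 3, 0, 0, 2, 0, 5, 0]
x = [1, 2, 3, 4, 5, 6, 7, 8]
values, locations = compress_row(row)
print(values, locations)                 # [3, 2, 5] [1, 4, 6]
print(sparse_dot(values, locations, x))  # 51
```

The result matches the dense dot product of `row` and `x`, but only three multiplications are performed instead of eight.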

Systems, methods, and apparatuses for heterogeneous computing

Patent number: 11416281

Abstract: Embodiments of systems, methods, and apparatuses for heterogeneous computing are described. In some embodiments, a hardware heterogeneous scheduler dispatches instructions for execution on one or more of a plurality of heterogeneous processing elements, the instructions corresponding to a code fragment to be processed by the one or more of the plurality of heterogeneous processing elements, wherein the instructions are native instructions to at least one of the one or more of the plurality of heterogeneous processing elements.

Type: Grant

Filed: December 31, 2016

Date of Patent: August 16, 2022

Assignee: Intel Corporation

Inventors: Rajesh M. Sankaran, Gilbert Neiger, Narayan Ranganathan, Stephen R. Van Doren, Joseph Nuzman, Niall D. McDonnell, Michael A. O'Hanlon, Lokpraveen B. Mosur, Tracy Garrett Drysdale, Eriko Nurvitadhi, Asit K. Mishra, Ganesh Venkatesh, Deborah T. Marr, Nicholas P. Carter, Jonathan D. Pearce, Edward T. Grochowski, Richard J. Greco, Robert Valentine, Jesus Corbal, Thomas D. Fletcher, Dennis R. Bradford, Dwight P. Manley, Mark J. Charney, Jeffrey J. Cook, Paul Caprioli, Koichi Yamada, Kent D. Glossop, David B. Sheffield
