Patents by Inventor Nishit Shah
Nishit Shah has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Patent number: 11641408
Abstract: A system for configuring a new device may include a new device that is not yet configured with one or more settings. The new device includes a short range communication transmitter and programming instructions configured to cause the new device to operate in a discoverable mode. The system also includes an existing device that is configured with the settings and that includes a short range communication receiver and programming instructions. The programming instructions are configured to cause the existing device to: receive instructions to set up the new device; in response to receiving the instructions, detect, by the short range communication receiver, a presence of the new device by detecting a broadcast signal within a communication range of the short range communication receiver; and, in response to detecting the presence of the new device, transmit at least a portion of the one or more settings directly to the new device.
Type: Grant
Filed: October 29, 2021
Date of Patent: May 2, 2023
Assignee: Google LLC
Inventors: Ushasree Kode, Nishit Shah, Ibrahim Damlaj, Michal Levin, Thomas Weedon Hume
-
Patent number: 11631001
Abstract: A system-on-chip (SoC) integrated circuit product includes a machine learning accelerator (MLA), other processor cores such as general purpose processors and application-specific processors, and a network-on-chip for communication between the different modules. The SoC implements a heterogeneous compute environment because the processor cores are customized for different purposes and typically use different instruction sets. Applications may use some or all of the functionalities offered by the processor cores, and the processor cores may be programmed into different pipelines to perform different tasks.
Type: Grant
Filed: April 10, 2020
Date of Patent: April 18, 2023
Assignee: SiMa Technologies, Inc.
Inventors: Srivathsa Dhruvanarayan, Nishit Shah, Bradley Taylor, Moenes Zaher Iskarous
-
Patent number: 11586894
Abstract: A compiler efficiently manages memory usage in a machine learning accelerator by intelligently ordering the computations of a machine learning network (MLN). The compiler identifies a set of partial networks of the MLN, each representing the portion of the MLN, across multiple layers, on which an output or set of outputs depends. Because any given output may depend on only a limited subset of intermediate outputs from the prior layers, each partial network may include only a small fraction of the intermediate outputs from each layer. Instead of implementing the MLN by computing one layer at a time, the compiler schedules instructions to sequentially implement the partial networks. As each layer of a partial network is completed, its intermediate outputs can be released from memory. The described technique enables intermediate outputs to be streamed directly between processing elements of the machine learning accelerator without requiring large transfers to and from external memory.
Type: Grant
Filed: May 4, 2020
Date of Patent: February 21, 2023
Assignee: SiMa Technologies, Inc.
Inventors: Reed Kotler, Nishit Shah
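The memory effect claimed in this abstract can be demonstrated with a toy scheduler. The sketch below is an illustration of the idea only, not SiMa's compiler: for a small 3-layer, 3-tap 1D network in which node (l, i) depends on nodes (l-1, i-1..i+1), executing one output's dependency cone (partial network) at a time keeps far fewer intermediate values live than finishing each layer before starting the next. All names and sizes here are assumptions for the demonstration.

```python
def peak_live(order, deps):
    """Execute nodes in `order`; a value is freed once all its consumers
    have run (final outputs, which have no consumers, stream out at once).
    Returns the peak number of simultaneously live values."""
    pending = {n: 0 for n in deps}            # consumers not yet executed
    for ds in deps.values():
        for d in ds:
            pending[d] += 1
    live, peak = set(), 0
    for n in order:
        live.add(n)
        peak = max(peak, len(live))
        for d in deps[n]:
            pending[d] -= 1
            if pending[d] == 0:
                live.discard(d)               # last consumer ran: release
        if pending[n] == 0:                   # no consumers: streamed out
            live.discard(n)
    return peak

def build(width, layers=3):
    """3-tap 1D 'network': node (l, i) depends on (l-1, i-1..i+1)."""
    return {(l, i): [] if l == 0 else
            [(l - 1, j) for j in range(max(0, i - 1), min(width, i + 2))]
            for l in range(layers) for i in range(width)}

def cone(n, deps):
    """All nodes the output `n` transitively depends on (its partial network)."""
    todo, seen = [n], set()
    while todo:
        m = todo.pop()
        if m not in seen:
            seen.add(m)
            todo.extend(deps[m])
    return seen

W = 16
deps = build(W)
layer_order = sorted(deps)                    # finish each layer before the next
done, pn_order = set(), []
for i in range(W):                            # one partial network per output
    new = cone((2, i), deps) - done
    pn_order += sorted(new)                   # (layer, index) sort is topological
    done |= new
```

Running `peak_live` on both orders shows the partial-network schedule holding only a narrow frontier of intermediates live, while the layer-at-a-time schedule keeps whole layers resident.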
-
Publication number: 20230023303
Abstract: A compiler receives a description of a machine learning network and generates a computer program that implements the machine learning network. The computer program includes statically scheduled instructions that are executed by a mesh of processing elements (Tiles). The instructions executed by the Tiles are statically scheduled because the compiler can determine which instructions are executed by which Tiles at what times. For example, for the statically scheduled instructions, there are no conditions, branching, or data dependencies that can be resolved only at run time and that would affect the timing and order of instruction execution.
Type: Application
Filed: October 3, 2022
Publication date: January 26, 2023
Inventors: Nishit Shah, Reed Kotler, Srivathsa Dhruvanarayan, Moenes Zaher Iskarous, Kavitha Prasad, Yogesh Laxmikant Chobe, Sedny S.J Attia, Spenser Don Gilliland, Bradley Taylor
-
Publication number: 20220405571
Abstract: Embodiments of the present disclosure include systems and methods for sparsifying narrow data formats for neural networks. A plurality of activation values in a neural network are provided to a muxing unit. A set of sparsification operations is performed on a plurality of weight values to generate a subset of the plurality of weight values and mask values associated with the plurality of weight values. The subset of the weight values is provided to a matrix multiplication unit. The muxing unit generates a subset of the plurality of activation values based on the mask values and provides it to the matrix multiplication unit. The matrix multiplication unit performs a set of matrix multiplication operations on the subset of the weight values and the subset of the activation values to generate a set of outputs.
Type: Application
Filed: June 16, 2021
Publication date: December 22, 2022
Inventors: Bita Darvish Rouhani, Venmugil Elango, Eric S. Chung, Douglas C. Burger, Mattheus C. Heddes, Nishit Shah, Rasoul Shafipour, Ankit More
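The dataflow described in this abstract can be sketched in NumPy. This is an illustrative reconstruction, not the disclosed hardware: weights are sparsified block-wise into a packed subset plus a boolean mask, and a muxing step uses the mask to select the matching activations before the multiply. The block size and keep count are assumed parameters.

```python
import numpy as np

def sparsify_weights(w, block=4, keep=2):
    """Keep the `keep` largest-magnitude weights in each block of `block`
    values; return the packed survivors and a per-weight boolean mask."""
    w = w.reshape(-1, block)
    top = np.argsort(-np.abs(w), axis=1)[:, :keep]
    mask = np.zeros_like(w, dtype=bool)
    np.put_along_axis(mask, top, True, axis=1)
    return w[mask].reshape(-1, keep), mask

def mux_activations(a, mask):
    """Muxing unit: pass through only the activations whose positions
    correspond to surviving weights."""
    return a.reshape(mask.shape)[mask].reshape(mask.shape[0], -1)

def sparse_dot(w, a, block=4, keep=2):
    """Dot product computed from only the surviving weight/activation pairs,
    standing in for the matrix multiplication unit."""
    packed, mask = sparsify_weights(w, block, keep)
    return float((packed * mux_activations(a, mask)).sum())
```

Because `w[mask]` and `a[mask]` enumerate survivors in the same positional order, the packed multiply reproduces the dot product of the masked weights with the full activations while moving only the surviving values.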
-
Patent number: 11488066
Abstract: Convolutions of an input sample with multiple kernels are decomposed into matrix multiplications of a V×C matrix of input values times a C×K matrix of kernel values, producing a V×K product. For the second matrix, C is the channel dimension (i.e., each row of the second matrix is a different channel of the input sample and kernel) and K is the kernel dimension (i.e., each column of the second matrix is a different kernel), but all the values correspond to the same pixel position in the kernel. In the matrix product, V is the output dimension and K is the kernel dimension. Thus, each value in the output matrix is a partial product for a certain output pixel and kernel, and the matrix multiplication parallelizes the convolutions by calculating partial products for multiple output pixels and multiple kernels.
Type: Grant
Filed: April 21, 2020
Date of Patent: November 1, 2022
Assignee: SiMa Technologies, Inc.
Inventors: Nishit Shah, Srivathsa Dhruvanarayan
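The decomposition this abstract describes can be sketched in a few lines of NumPy. This is an illustrative reconstruction from the abstract, not the patented implementation; shapes, stride, and padding choices are assumptions. For each kernel pixel position (r, s), a V×C slice of the input is multiplied by the C×K matrix of kernel values at that position, and the V×K partial products are accumulated.

```python
import numpy as np

def conv2d_as_matmuls(x, w):
    """Direct 2D convolution (stride 1, no padding) computed as a sum of
    (V x C) @ (C x K) matrix multiplications, one per kernel pixel position.
    x: input sample, shape (C, H, W); w: kernels, shape (K, C, R, S)."""
    C, H, W = x.shape
    K, _, R, S = w.shape
    Ho, Wo = H - R + 1, W - S + 1
    V = Ho * Wo                      # number of output pixels
    out = np.zeros((V, K))
    for r in range(R):
        for s in range(S):
            # V x C matrix: each row is one output pixel, each column one
            # input channel, all taken at kernel pixel position (r, s)
            a = x[:, r:r + Ho, s:s + Wo].reshape(C, V).T
            # C x K matrix: kernel values at pixel (r, s), one column per kernel
            b = w[:, :, r, s].T
            out += a @ b             # V x K partial products, accumulated
    return out.T.reshape(K, Ho, Wo)
```

Each `a @ b` computes partial products for all V output pixels and all K kernels in parallel, which is the parallelization the abstract claims.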
-
Patent number: 11462999
Abstract: A boost converter includes a clock generator, a switching converter, a hysteretic controller, and a power tracking module. The clock generator is configured to output a clock signal. The switching converter is configured to operate at a frequency based on the clock signal. The hysteretic controller is configured to regulate an intermediate output from the switching converter. The power tracking module is configured to change a frequency control signal that is sent to the clock generator; the change in frequency is based on a current flowing into an output capacitor such that the charge time of the capacitor is minimized when the current is maximized.
Type: Grant
Filed: August 6, 2020
Date of Patent: October 4, 2022
Inventors: Nishit Shah, Pedram Lajevardi, Kenneth Wojciechowski, Christoph Lang
-
Publication number: 20220245601
Abstract: A system and associated method for authorizing a transaction between a teller device and a user device are disclosed. The system can include a teller device configured to generate a remote signing notification. The system can include a server device configured to receive the remote signing notification from the teller device and generate a transaction document based on the remote signing notification. The system can include a user device configured to display a webpage based on a URI, capture a signature in a signature section of the webpage, and transmit the signature from the user device to the server device to authorize the transaction.
Type: Application
Filed: January 29, 2021
Publication date: August 4, 2022
Applicant: Integrated Media Management, LLC
Inventors: Nishit Shah, David Aranovsky, John A. Levy
-
Patent number: 11403519
Abstract: A compiler receives a description of a machine learning network and generates a computer program that implements the machine learning network. The computer program includes statically scheduled instructions that are executed by a mesh of processing elements (Tiles). The instructions executed by the Tiles are statically scheduled because the compiler can determine which instructions are executed by which Tiles at what times. For example, for the statically scheduled instructions, there are no conditions, branching, or data dependencies that can be resolved only at run time and that would affect the timing and order of instruction execution.
Type: Grant
Filed: April 6, 2020
Date of Patent: August 2, 2022
Assignee: SiMa Technologies, Inc.
Inventors: Nishit Shah, Reed Kotler, Srivathsa Dhruvanarayan, Moenes Zaher Iskarous, Kavitha Prasad, Yogesh Laxmikant Chobe, Sedny S. J Attia, Spenser Don Gilliland
-
Patent number: 11354570
Abstract: A compiler receives a description of a machine learning network and generates a computer program that implements the machine learning network. The computer program includes statically scheduled instructions that are executed by a mesh of processing elements (Tiles). The instructions executed by the Tiles are statically scheduled because the compiler can determine which instructions are executed by which Tiles at what times. For example, for the statically scheduled instructions, there are no conditions, branching, or data dependencies that can be resolved only at run time and that would affect the timing and order of instruction execution.
Type: Grant
Filed: April 6, 2020
Date of Patent: June 7, 2022
Assignee: SiMa Technologies, Inc.
Inventors: Nishit Shah, Reed Kotler, Srivathsa Dhruvanarayan, Moenes Zaher Iskarous, Kavitha Prasad, Yogesh Laxmikant Chobe, Sedny S. J Attia, Spenser Don Gilliland
-
Patent number: 11321607
Abstract: A compiler receives a description of a machine learning network and generates a computer program that implements the machine learning network. The computer program includes statically scheduled instructions that are executed by a mesh of processing elements (Tiles). The instructions executed by the Tiles are statically scheduled because the compiler can determine which instructions are executed by which Tiles at what times. For example, for the statically scheduled instructions, there are no conditions, branching, or data dependencies that can be resolved only at run time and that would affect the timing and order of instruction execution.
Type: Grant
Filed: April 3, 2020
Date of Patent: May 3, 2022
Assignee: SiMa Technologies, Inc.
Inventors: Nishit Shah, Reed Kotler, Srivathsa Dhruvanarayan, Moenes Zaher Iskarous, Kavitha Prasad, Yogesh Laxmikant Chobe, Sedny S. J Attia, Spenser Don Gilliland
-
Patent number: 11303808
Abstract: A sensor system includes a pixel array, a DC/DC converter, and a photodiode stack. The pixel array is configured to operate in an image capturing mode or an energy harvesting mode. The DC/DC converter is configured to convert energy captured by the pixel array while in energy harvesting mode. The photodiode stack is located adjacent to the pixel array and configured to provide power to the DC/DC converter.
Type: Grant
Filed: August 6, 2020
Date of Patent: April 12, 2022
Inventors: Nishit Shah, Pedram Lajevardi, Kenneth Wojciechowski, Christoph Lang
-
Publication number: 20220053060
Abstract: A system for configuring a new device may include a new device that is not yet configured with one or more settings. The new device includes a short range communication transmitter and programming instructions configured to cause the new device to operate in a discoverable mode. The system also includes an existing device that is configured with the settings and that includes a short range communication receiver and programming instructions. The programming instructions are configured to cause the existing device to: receive instructions to set up the new device; in response to receiving the instructions, detect, by the short range communication receiver, a presence of the new device by detecting a broadcast signal within a communication range of the short range communication receiver; and, in response to detecting the presence of the new device, transmit at least a portion of the one or more settings directly to the new device.
Type: Application
Filed: October 29, 2021
Publication date: February 17, 2022
Applicant: Google LLC
Inventors: Ushasree Kode, Nishit Shah, Ibrahim Damlaj, Michal Levin, Thomas Weedon Hume
-
Patent number: 11190601
Abstract: A system for configuring a new device may include a new device that is not yet configured with one or more settings. The new device includes a short range communication transmitter and programming instructions configured to cause the new device to operate in a discoverable mode. The system also includes an existing device that is configured with the settings and that includes a short range communication receiver and programming instructions. The programming instructions are configured to cause the existing device to: receive instructions to set up the new device; in response to receiving the instructions, detect, by the short range communication receiver, a presence of the new device by detecting a broadcast signal within a communication range of the short range communication receiver; and, in response to detecting the presence of the new device, transmit at least a portion of the one or more settings directly to the new device.
Type: Grant
Filed: September 11, 2020
Date of Patent: November 30, 2021
Assignee: Google LLC
Inventors: Ushasree Kode, Nishit Shah, Ibrahim Damlaj, Michal Levin, Thomas Weedon Hume
-
Publication number: 20210342733
Abstract: A compiler receives a description of a machine learning network and generates a computer program that implements the machine learning network. The compiler allocates instructions of the computer program to different groups of processing elements (Tiles) for execution such that different groups of Tiles implement different layers of the machine learning network. The compiler may determine the size of the different groups based on a partial computation metric associated with the computations performed to implement the corresponding layer. Furthermore, the compiler may assign specific Tiles to each group based on a set of predefined layout constraints. The compiler may statically schedule at least a portion of the instructions into one or more deterministic phases for execution by the groups of Tiles.
Type: Application
Filed: April 29, 2020
Publication date: November 4, 2021
Inventors: Reed Kotler, Nishit Shah
-
Publication number: 20210342675
Abstract: A compiler efficiently manages memory usage in a machine learning accelerator by intelligently ordering the computations of a machine learning network (MLN). The compiler identifies a set of partial networks of the MLN, each representing the portion of the MLN, across multiple layers, on which an output or set of outputs depends. Because any given output may depend on only a limited subset of intermediate outputs from the prior layers, each partial network may include only a small fraction of the intermediate outputs from each layer. Instead of implementing the MLN by computing one layer at a time, the compiler schedules instructions to sequentially implement the partial networks. As each layer of a partial network is completed, its intermediate outputs can be released from memory. The described technique enables intermediate outputs to be streamed directly between processing elements of the machine learning accelerator without requiring large transfers to and from external memory.
Type: Application
Filed: May 4, 2020
Publication date: November 4, 2021
Inventors: Reed Kotler, Nishit Shah
-
Publication number: 20210342673
Abstract: A compiler generates a computer program implementing a machine learning network on a machine learning accelerator (MLA) including interconnected processing elements. The computer program includes data transfer instructions for non-colliding data transfers between the processing elements. To generate the data transfer instructions, the compiler determines non-conflicting data transfer paths for data transfers based on a topology of the interconnections between processing elements, on dependencies of the instructions, and on a duration for execution of the instructions. Each data transfer path specifies a routing and a time slot for the data transfer. The compiler generates data transfer instructions that specify routing of the data transfers and generates a static schedule that schedules execution of the data transfer instructions during the time slots for the data transfers.
Type: Application
Filed: May 1, 2020
Publication date: November 4, 2021
Inventors: Nishit Shah, Srivathsa Dhruvanarayan, Reed Kotler
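A greatly simplified sketch of the scheduling idea in this abstract (an assumed model for illustration, not the actual compiler): transfers on a 2D mesh are routed X-then-Y, advance one hop per time slot, and each transfer is delayed until every (link, slot) pair on its path is unreserved, so no two transfers ever share a link in the same slot.

```python
def xy_route(src, dst):
    """Hop-by-hop X-then-Y path between mesh coordinates, as a list of
    directed links ((x1, y1), (x2, y2))."""
    (x, y), path = src, []
    while x != dst[0]:
        nxt = (x + (1 if dst[0] > x else -1), y)
        path.append(((x, y), nxt))
        x, y = nxt
    while y != dst[1]:
        nxt = (x, y + (1 if dst[1] > y else -1))
        path.append(((x, y), nxt))
        x, y = nxt
    return path

def schedule(transfers):
    """Assign each (src, dst, ready) transfer a start slot such that no two
    transfers occupy the same link during the same time slot."""
    reserved = set()                          # (link, slot) pairs in use
    plan = []
    for src, dst, ready in transfers:
        path = xy_route(src, dst)
        t = ready
        # delay the whole transfer until its entire path is conflict-free
        while any((link, t + k) in reserved for k, link in enumerate(path)):
            t += 1
        reserved.update((link, t + k) for k, link in enumerate(path))
        plan.append((src, dst, t))
    return plan
```

The resulting plan is fully static: routing and time slots are fixed at compile time, which is what lets the data transfer instructions be scheduled deterministically.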
-
Publication number: 20210326681
Abstract: A compiler receives a description of a machine learning network (MLN) and generates a computer program that implements the MLN on a machine learning accelerator (MLA). To implement the MLN, the compiler generates compute instructions that implement computations of the MLN on different processing units (Tiles), and data transfer instructions that transfer data used in the computations. The compiler may statically schedule at least a portion of the instructions for execution by the Tiles according to fixed timing. The compiler may initially implement data transfers between non-adjacent Tiles (or external memories) by implementing a sequence of transfers through one or more intermediate Tiles (or external memories) in accordance with a set of default routing rules that dictates the data path. The computer program may then be simulated to identify routing conflicts. When routing conflicts are detected, the compiler updates the computer program in a manner that avoids the conflicts.
Type: Application
Filed: April 21, 2020
Publication date: October 21, 2021
Inventors: Reed Kotler, Nishit Shah
-
Publication number: 20210326189
Abstract: A method, system, and apparatus are disclosed herein for bridging a deterministic phase of instructions with a non-deterministic phase of instructions when those instructions are executed by a machine learning accelerator while executing a machine learning network. In the non-deterministic phase, data and instructions are transferred from off-chip memory to on-chip memory. When the transfer is complete, processing elements are synchronized and, upon synchronization, a deterministic phase of instructions is executed by the processing elements.
Type: Application
Filed: April 17, 2020
Publication date: October 21, 2021
Inventors: Nishit Shah, Srivathsa Dhruvanarayan, Reed Kotler
-
Publication number: 20210326750
Abstract: Convolutions of an input sample with multiple kernels are decomposed into matrix multiplications of a V×C matrix of input values times a C×K matrix of kernel values, producing a V×K product. For the second matrix, C is the channel dimension (i.e., each row of the second matrix is a different channel of the input sample and kernel) and K is the kernel dimension (i.e., each column of the second matrix is a different kernel), but all the values correspond to the same pixel position in the kernel. In the matrix product, V is the output dimension and K is the kernel dimension. Thus, each value in the output matrix is a partial product for a certain output pixel and kernel, and the matrix multiplication parallelizes the convolutions by calculating partial products for multiple output pixels and multiple kernels.
Type: Application
Filed: April 21, 2020
Publication date: October 21, 2021
Inventors: Nishit Shah, Srivathsa Dhruvanarayan