Patents by Inventor Mikhail Smelyanskiy

Mikhail Smelyanskiy has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

Static memory allocation in neural networks

Patent number: 11514306

Abstract: The disclosed computer-implemented method may include compiling a neural network, and the compiling may include organizing an interconnected set of nodes in a series of layers, and for each node in each layer, assigning an associated activation of a plurality of activations. Each activation may output a respective tensor of a plurality of tensors. The compiling may also include allocating memory for the activations by determining a respective memory size for each activation, and based on the respective memory size for each activation, assigning a memory block in the neural network to the activation. The method may also include, after the allocating the memory for the activations, accessing the memory blocks to perform the plurality of activations and thereby execute the neural network. Various other methods, systems, and computer-readable media are also disclosed.
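The allocation step in this abstract can be illustrated with a minimal sketch, assuming each activation's output tensor size is known at compile time from its shape. The function names, the 4-byte element width, the 64-byte alignment, and the one-block-per-activation layout are illustrative assumptions, not details taken from the patent.

```python
from math import prod

def tensor_bytes(shape, elem_bytes=4):
    """Memory size of one activation's output tensor (assumed 4-byte elements)."""
    return prod(shape) * elem_bytes

def allocate_static(activations, align=64):
    """Assign every activation a fixed (offset, size) block before execution.

    `activations` maps a hypothetical activation name to its output shape.
    Returns the plan and the total arena size in bytes.
    """
    plan, offset = {}, 0
    for name, shape in activations.items():
        size = tensor_bytes(shape)
        plan[name] = (offset, size)
        # Round the next offset up to the assumed alignment boundary.
        offset = (offset + size + align - 1) // align * align
    return plan, offset

acts = {"conv1": (1, 8, 8, 4), "relu1": (1, 8, 8, 4), "fc": (1, 10)}
plan, total = allocate_static(acts)
```

Because every offset is fixed before the network runs, execution only needs to index into one preallocated arena of `total` bytes.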

Type: Grant

Filed: March 14, 2018

Date of Patent: November 29, 2022

Assignee: Meta Platforms, Inc.

Inventors: Nadav Rotem, Abdulkadir Utku Diril, Mikhail Smelyanskiy, Jong Soo Park, Saleem Abdulrasool

Systems and methods for protecting neural network weights

Patent number: 11487888

Abstract: The disclosed computer-implemented method may include (i) identifying a neural network that comprises an interconnected set of nodes organized in a set of layers represented by a plurality of matrices that each comprise a plurality of weights, where each weight represents a connection between a node in the interconnected set of nodes that resides in one layer in the set of layers and an additional node in the set of interconnected nodes that resides in a different layer in the set of layers, (ii) encrypting, using an encryption cipher, the plurality of weights, (iii) detecting that execution of the neural network has been initiated, and (iv) decrypting, using the encryption cipher, the plurality of weights in response to detecting that the execution of the neural network has been initiated. Various other methods, systems, and computer-readable media are also disclosed.

Type: Grant

Filed: June 1, 2020

Date of Patent: November 1, 2022

Assignee: Meta Platforms, Inc.

Inventors: Nadav Rotem, Abdulkadir Utku Diril, Mikhail Smelyanskiy, Jong Soo Park, Roman Levenstein

OPTIMIZED COMPUTE HARDWARE FOR MACHINE LEARNING OPERATIONS

Publication number: 20220343174

Abstract: Described herein is a graphics processor including a processing resource including a multiplier configured to multiply input associated with the instruction at one of a first plurality of bit widths, an adder configured to add a product output from the multiplier with an accumulator value at one of a second plurality of bit widths, and circuitry to select a first bit width of the first plurality of bit widths for the multiplier and a second bit width of the second plurality of bit widths for the adder.

Type: Application

Filed: May 12, 2022

Publication date: October 27, 2022

Applicant: Intel Corporation

Inventors: Dipankar Das, Roger Gramunt, Mikhail Smelyanskiy, Jesus Corbal, Dheevatsa Mudigere, Naveen K. Mellempudi, Alexander F. Heinecke

Optimized compute hardware for machine learning operations

Patent number: 11334796

Abstract: A processing cluster of a processing cluster array comprises a plurality of registers to store input values of vector input operands, the input values of at least some of the vector input operands having different bit lengths than those of other input values of other vector input operands, and a compute unit to execute a dot-product instruction with the vector input operands to perform a number of parallel multiply operations and an accumulate operation per 32-bit lane based on a bit length of the smallest-sized input value of a first vector input operand relative to the 32-bit lane.

Type: Grant

Filed: August 3, 2020

Date of Patent: May 17, 2022

Assignee: Intel Corporation

Inventors: Dipankar Das, Roger Gramunt, Mikhail Smelyanskiy, Jesus Corbal, Dheevatsa Mudigere, Naveen K. Mellempudi, Alexander F. Heinecke

Systems and methods for employing predication in computational models

Patent number: 11264011

Abstract: The disclosed method may include (1) determining whether a next operation of a plurality of operations of an artificial neural network (ANN) is dependent upon a Boolean predication value based on a representative value for a weight or an input of a node of the ANN, (2) based on the next operation not being dependent on the Boolean predication value, allowing the next operation to update a state of the ANN, and (3) based on the next operation being dependent on the Boolean predication value, performing at least one of (a) allowing, based on the Boolean predication value being a first value, the next operation to update the state of the ANN, and (b) preventing, based on the Boolean predication value being a second value different from the first value, the next operation from updating the state of the ANN. Various other methods and systems are also disclosed.
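A minimal sketch of the predication flow in this abstract, assuming the Boolean predication value is derived by comparing a weight's magnitude to a cutoff. The threshold, the helper names, and the dictionary used as the ANN state are hypothetical choices for illustration only.

```python
THRESHOLD = 0.01  # assumed cutoff below which a weight is treated as negligible

def predicate_for(weight):
    """Boolean predication value derived from a representative weight value."""
    return abs(weight) >= THRESHOLD

def mac(weight):
    """Build an operation that accumulates weight * x into the state."""
    def update(state):
        state["acc"] += weight * state["x"]
    return update

def add_bias(state):
    state["acc"] += 1.0

def run_operations(state, operations):
    """Run each operation; a predicated one may update state only if its predicate is True."""
    for update, predicate in operations:
        if predicate is None or predicate:
            update(state)
    return state

state = {"x": 2.0, "acc": 0.0}
operations = [
    (mac(0.5), predicate_for(0.5)),      # predicate True: update allowed
    (mac(0.001), predicate_for(0.001)),  # predicate False: update prevented
    (add_bias, None),                    # not predicated: always runs
]
run_operations(state, operations)
```

In this sketch the second multiply-accumulate never touches the state, which is the "preventing" branch of the abstract.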

Type: Grant

Filed: January 22, 2020

Date of Patent: March 1, 2022

Assignee: Facebook, Inc.

Inventors: Nadav Rotem, Abdulkadir Utku Diril, Mikhail Smelyanskiy, Jong Soo Park, James Kenneth Reed

Lowering hardware for neural networks

Patent number: 11256977

Abstract: A disclosed computing system may include a special-purpose hardware device having an input subsystem, a linearization subsystem, and a matrix multiplication unit. The input subsystem may facilitate on-the-fly convolution lowering within a neural network convolution layer by directing input volume patches to logical unit(s) of the device. The linearization subsystem may be configured to receive a patch from the input subsystem and to linearize the patch by arranging elements of the patch as a portion of a data matrix row. The matrix multiplication unit of device may be configured to receive the data matrix from the linearization subsystem and to apply a filter matrix to the data matrix via a matrix multiplication operation. Various other methods, systems, and computer-readable media are also disclosed.
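Convolution lowering as described here (linearizing each input patch into a data-matrix row, then applying a filter matrix by matrix multiplication) can be sketched in software. This is only an illustration of the data flow, assuming a single-channel 2-D input, stride 1, and no padding; the function names are hypothetical and the patent itself concerns a hardware device.

```python
def lower_patches(image, k):
    """Linearize every k x k patch of a 2-D image into one data-matrix row."""
    h, w = len(image), len(image[0])
    rows = []
    for r in range(h - k + 1):
        for c in range(w - k + 1):
            rows.append([image[r + i][c + j] for i in range(k) for j in range(k)])
    return rows

def apply_filters(data_matrix, filter_matrix):
    """Multiply the (patches x k*k) data matrix by a (k*k x filters) filter matrix."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*filter_matrix)]
            for row in data_matrix]

image = [[1, 2, 3],
         [4, 5, 6],
         [7, 8, 9]]
# One 2x2 filter [[1, 0], [0, -1]], flattened row-major into a single column.
filter_matrix = [[1], [0], [0], [-1]]
data = lower_patches(image, 2)
out = apply_filters(data, filter_matrix)
```

Each row of `out` holds the filter responses for one patch, so the whole convolution reduces to a single matrix multiplication.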

Type: Grant

Filed: December 29, 2017

Date of Patent: February 22, 2022

Assignee: Facebook, Inc.

Inventors: Mikhail Smelyanskiy, Abdulkadir Utku Diril, Jong Soo Park, Nadav Rotem

Systems and methods for efficient scaling of quantized integers

Patent number: 11023240

Abstract: The disclosed computer-implemented method may include receiving an input value and a floating-point scaling factor and determining (1) an integer scaling factor based on the floating-point scaling factor, (2) a pre-scaling adjustment value representative of a number of places by which to shift a binary representation of the input value prior to a scaling operation, and (3) a post-scaling adjustment value representative of a number of places by which to shift the binary representation of the input value following the scaling operation. The method may further include calculating a scaled result value by (1) shifting rightwards the binary representation of the input value by the pre-scaling adjustment value, (2) scaling the shifted binary representation of the input value by the integer scaling factor, and (3) shifting rightwards the shifted and scaled binary value by the post-scaling adjustment value. Various other methods, systems, and computer-readable media are also disclosed.
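The shift, scale, shift sequence in this abstract can be sketched as follows. How the integer factor and the two shift amounts are chosen is an assumption here: the sketch lets the caller pick the pre-scaling shift and grows the post-scaling shift for as long as the integer factor still fits in an assumed 8 bits. The function names are hypothetical.

```python
def derive_scaling(scale, pre_shift=0, factor_bits=8):
    """Approximate a positive float scale as int_scale * 2**pre_shift / 2**(pre_shift + post_shift)."""
    if scale <= 0:
        raise ValueError("scale must be positive")
    post_shift = 0
    # Grow the post-shift while the integer factor still fits in factor_bits.
    while (post_shift < 62
           and round(scale * 2 ** (pre_shift + post_shift + 1)) < 2 ** factor_bits):
        post_shift += 1
    int_scale = round(scale * 2 ** (pre_shift + post_shift))
    return int_scale, pre_shift, post_shift

def scale_value(x, int_scale, pre_shift, post_shift):
    """Shift right, multiply by the integer factor, then shift right again."""
    return ((x >> pre_shift) * int_scale) >> post_shift

params = derive_scaling(0.3)
result = scale_value(1000, *params)
params_pre = derive_scaling(0.3, pre_shift=2)
result_pre = scale_value(1000, *params_pre)
```

With a scale of 0.3 and an input of 1000, both parameter sets yield 300 using only integer shifts and one integer multiply. A larger pre-scaling shift trades low-order input bits for a narrower intermediate product.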

Type: Grant

Filed: November 22, 2019

Date of Patent: June 1, 2021

Assignee: Facebook, Inc.

Inventors: Nadav Rotem, Jong Soo Park, Zhaoxia Deng, Abdulkadir Utku Diril, Mikhail Smelyanskiy, Roman Dzhabarov, James Hegeman

Systems and methods for optimizing power usage for systems within quality-of-service constraints

Patent number: 10948966

Abstract: The disclosed computer-implemented method may include (i) identifying an artificial neural network that processes each input to the artificial neural network in a fixed number of operations, (ii) performing an analysis on the artificial neural network to determine an execution metric that represents the fixed number of operations performed by the artificial neural network to process each input, (iii) determining a quality-of-service metric for an executing system that executes the artificial neural network, and (iv) optimizing power consumption of the executing system by configuring, based on the execution metric and the quality-of-service metric, a processing throughput of at least one physical processor of the executing system, thereby causing the executing system to execute the artificial neural network at a rate that satisfies the quality-of-service metric while limiting the power consumption of the executing system. Various other methods, systems, and computer-readable media are also disclosed.
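Because the network performs a fixed number of operations per input, the throughput needed to meet a quality-of-service target follows from simple arithmetic. The sketch below assumes the processor completes a constant number of operations per clock cycle and that a lower clock frequency draws less power; the function name, parameters, and frequency list are hypothetical.

```python
def pick_frequency(ops_per_input, inputs_per_second, ops_per_cycle, available_hz):
    """Lowest available clock frequency that still meets the quality-of-service rate."""
    required_hz = ops_per_input * inputs_per_second / ops_per_cycle
    for hz in sorted(available_hz):
        if hz >= required_hz:
            return hz
    raise ValueError("no available frequency satisfies the QoS target")

# 2 billion operations per input, 30 inputs per second, 64 operations per cycle
# gives a required clock of 937.5 MHz.
chosen = pick_frequency(
    ops_per_input=2_000_000_000,
    inputs_per_second=30,
    ops_per_cycle=64,
    available_hz=[600_000_000, 800_000_000, 1_000_000_000, 1_200_000_000],
)
```

Here the 1 GHz setting is selected: it is the slowest option that still sustains 30 inputs per second.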

Type: Grant

Filed: March 7, 2018

Date of Patent: March 16, 2021

Assignee: Facebook, Inc.

Inventors: Nadav Rotem, Abdulkadir Utku Diril, Mikhail Smelyanskiy, Jong Soo Park

OPTIMIZED COMPUTE HARDWARE FOR MACHINE LEARNING OPERATIONS

Publication number: 20210019631

Abstract: A processing cluster of a processing cluster array comprises a plurality of registers to store input values of vector input operands, the input values of at least some of the vector input operands having different bit lengths than those of other input values of other vector input operands, and a compute unit to execute a dot-product instruction with the vector input operands to perform a number of parallel multiply operations and an accumulate operation per 32-bit lane based on a bit length of the smallest-sized input value of a first vector input operand relative to the 32-bit lane.

Type: Application

Filed: August 3, 2020

Publication date: January 21, 2021

Applicant: Intel Corporation

Inventors: Dipankar Das, Roger Gramunt, Mikhail Smelyanskiy, Jesus Corbal, Dheevatsa Mudigere, Naveen K. Mellempudi, Alexander F. Heinecke

Systems and methods for efficiently updating neural networks

Patent number: 10817783

Abstract: The disclosed computer-implemented method for efficiently updating neural networks may include (i) identifying a neural network that comprises sets of interconnected nodes represented at least in part by a plurality of matrices and that is trained on a training computing device and executes on at least one endpoint device, (ii) constraining a training session for the neural network to reduce the size in memory of the difference between the previous values of the matrices prior to the training session and the new values of the matrices after the training session, (iii) creating a delta update for the neural network that describes the difference between the previous values and the new values, and (iv) updating the neural network on the endpoint device to the new state by sending the delta update from the training computing device to the endpoint computing device. Various other methods, systems, and computer-readable media are also disclosed.
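Steps (iii) and (iv) of this abstract can be sketched as a sparse delta between two versions of a weight matrix. The constraint on the training session in step (ii) is not modeled; the example simply assumes training changed only a few entries. The dictionary-based delta format and the function names are illustrative assumptions.

```python
def make_delta(old, new):
    """Sparse delta: {(row, col): new_value} for every entry that changed."""
    return {(r, c): new[r][c]
            for r in range(len(old)) for c in range(len(old[r]))
            if new[r][c] != old[r][c]}

def apply_delta(weights, delta):
    """Bring the endpoint's copy of the matrix to the new state, in place."""
    for (r, c), value in delta.items():
        weights[r][c] = value
    return weights

old = [[0.5, -1.0], [2.0, 0.25]]   # values before the training session
new = [[0.5, -1.0], [2.0, 0.75]]   # values after the training session
delta = make_delta(old, new)       # built on the training device
endpoint = [row[:] for row in old] # the endpoint still holds the old values
apply_delta(endpoint, delta)
```

Only the single changed entry travels to the endpoint, so the fewer weights a training session is allowed to change, the smaller the update.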

Type: Grant

Filed: May 7, 2020

Date of Patent: October 27, 2020

Assignee: Facebook, Inc.

Inventors: Nadav Rotem, Abdulkadir Utku Diril, Mikhail Smelyanskiy, Jong Soo Park, Christopher Dewan

In-memory processing based on combining output currents

Patent number: 10777251

Abstract: A first value is stored in a first memory cell. A first component output current, from a first electronic component, is provided based on the stored first value, wherein the first component output current is proportional to a place value represented by the first value. A second value is stored in a second memory cell. A second component output current, from a second electronic component, is provided based on the stored second value, wherein the second component output current is proportional to a place value represented by the second value. A combined current of at least the first component output current and the second component output current is detected, wherein the combined current corresponds to a sum of at least the first value and the second value.
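The arithmetic behind combining output currents can be sketched numerically. The model below assumes each memory cell stores one bit of a binary number and that its component contributes a current proportional to that bit's place value (1, 2, 4, ...); the unit current and the function names are hypothetical, and real analog behavior is not modeled.

```python
UNIT_CURRENT_UA = 1  # assumed current for the least significant place, in microamps

def component_current(bit, place):
    """Current contributed by one cell: proportional to the bit's place value."""
    return bit * UNIT_CURRENT_UA * 2 ** place

def combined_current(cells):
    """Total current on the shared line for a list of (stored_bit, place) cells."""
    return sum(component_current(bit, place) for bit, place in cells)

# Two 2-bit operands, 3 (binary 11) and 2 (binary 10), stored bit by bit.
cells = [(1, 0), (1, 1),   # operand 3: places 0 and 1
         (0, 0), (1, 1)]   # operand 2: places 0 and 1
total = combined_current(cells)
```

The combined current of 5 units corresponds to 3 + 2, so the addition happens in the summed current rather than in a separate adder.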

Type: Grant

Filed: May 9, 2019

Date of Patent: September 15, 2020

Assignee: Facebook, Inc.

Inventors: Ahmad Byagowi, Aravind Kalaiah, Mikhail Smelyanskiy

Optimized compute hardware for machine learning operations

Patent number: 10776699

Abstract: One embodiment provides for a compute apparatus to perform machine learning operations, the compute apparatus comprising a fetch unit to fetch a single instruction having multiple input operands, wherein the multiple input operands have an unequal bit-length, a first input operand having a first bit-length and a second input operand having a second bit-length; a decode unit to decode the single instruction into a decoded instruction; an operand length unit to determine a smaller bit-length of the first bit-length and the second bit-length; and a compute unit to perform a matrix operation on the multiple input operands to generate an output value having a bit length of the smaller bit length.

Type: Grant

Filed: January 12, 2018

Date of Patent: September 15, 2020

Assignee: Intel Corporation

Inventors: Dipankar Das, Roger Gramunt, Mikhail Smelyanskiy, Jesus Corbal, Dheevatsa Mudigere, Naveen K. Mellempudi, Alexander F. Heinecke

Virtual vector processing

Patent number: 10768989

Abstract: Methods and apparatus to provide virtualized vector processing are described. In one embodiment, one or more operations corresponding to a virtual vector request are distributed to one or more processor cores for execution.

Type: Grant

Filed: January 16, 2018

Date of Patent: September 8, 2020

Assignee: Intel Corporation

Inventors: Anthony Nguyen, Engin Ipek, Victor Lee, Daehyun Kim, Mikhail Smelyanskiy

Systems and methods for protecting neural network weights

Patent number: 10719613

Abstract: The disclosed computer-implemented method may include (i) identifying a neural network that comprises an interconnected set of nodes organized in a set of layers represented by a plurality of matrices that each comprise a plurality of weights, where each weight represents a connection between a node in the interconnected set of nodes that resides in one layer in the set of layers and an additional node in the set of interconnected nodes that resides in a different layer in the set of layers, (ii) encrypting, using an encryption cipher, the plurality of weights, (iii) detecting that execution of the neural network has been initiated, and (iv) decrypting, using the encryption cipher, the plurality of weights in response to detecting that the execution of the neural network has been initiated. Various other methods, systems, and computer-readable media are also disclosed.

Type: Grant

Filed: February 23, 2018

Date of Patent: July 21, 2020

Assignee: Facebook, Inc.

Inventors: Nadav Rotem, Abdulkadir Utku Diril, Mikhail Smelyanskiy, Jong Soo Park, Roman Levenstein

Systems and methods for efficiently updating neural networks

Patent number: 10699190

Abstract: The disclosed computer-implemented method for efficiently updating neural networks may include (i) identifying a neural network that comprises sets of interconnected nodes represented at least in part by a plurality of matrices and that is trained on a training computing device and executes on at least one endpoint device, (ii) constraining a training session for the neural network to reduce the size in memory of the difference between the previous values of the matrices prior to the training session and the new values of the matrices after the training session, (iii) creating a delta update for the neural network that describes the difference between the previous values and the new values, and (iv) updating the neural network on the endpoint device to the new state by sending the delta update from the training computing device to the endpoint computing device. Various other methods, systems, and computer-readable media are also disclosed.

Type: Grant

Filed: March 4, 2018

Date of Patent: June 30, 2020

Assignee: Facebook, Inc.

Inventors: Nadav Rotem, Abdulkadir Utku Diril, Mikhail Smelyanskiy, Jong Soo Park, Christopher Dewan

Dynamic power management for artificial intelligence hardware accelerators

Patent number: 10671147

Abstract: A computer-implemented method for dynamically managing the power usage and/or performance of an artificial intelligence (AI) hardware accelerator may include (1) receiving an instruction stream that includes one or more instructions for performing at least one AI-specific computing task, (2) identifying a plurality of special-purpose, hardware-based functional units configured to perform AI-specific computing tasks, (3) predicting, based on an analysis of at least a portion of the instruction stream, a power-usage requirement for at least one of the functional units when executing the instruction stream, and then (4) modifying, based on the power-usage requirement, the power supplied to at least one of the functional units. Various other methods and systems are also disclosed.

Type: Grant

Filed: December 18, 2017

Date of Patent: June 2, 2020

Assignee: Facebook, Inc.

Inventors: Nadav Rotem, Jong Soo Park, Mikhail Smelyanskiy, Abdulkadir Utku Diril

SYSTEMS AND METHODS FOR EMPLOYING PREDICATION IN COMPUTATIONAL MODELS

Publication number: 20200160848

Abstract: The disclosed method may include (1) determining whether a next operation of a plurality of operations of an artificial neural network (ANN) is dependent upon a Boolean predication value based on a representative value for a weight or an input of a node of the ANN, (2) based on the next operation not being dependent on the Boolean predication value, allowing the next operation to update a state of the ANN, and (3) based on the next operation being dependent on the Boolean predication value, performing at least one of (a) allowing, based on the Boolean predication value being a first value, the next operation to update the state of the ANN, and (b) preventing, based on the Boolean predication value being a second value different from the first value, the next operation from updating the state of the ANN. Various other methods and systems are also disclosed.

Type: Application

Filed: January 22, 2020

Publication date: May 21, 2020

Inventors: Nadav Rotem, Abdulkadir Utku Diril, Mikhail Smelyanskiy, Jong Soo Park, James Kenneth Reed

Hardware apparatus and methods to prefetch a multidimensional block of elements from a multidimensional array

Patent number: 10656944

Abstract: Methods and apparatuses relating to a prefetch instruction to prefetch a multidimensional block of elements from a multidimensional array into a cache. In one embodiment, a hardware processor includes a decoder to decode a prefetch instruction to prefetch a multidimensional block of elements from a multidimensional array into a cache, wherein at least one operand of the prefetch instruction is to indicate a system memory address of an element of the multidimensional block of elements, a stride of the multidimensional block of elements, and boundaries of the multidimensional block of elements, and an execution unit to execute the prefetch instruction to generate system memory addresses of the other elements of the multidimensional block of elements, and load the multidimensional block of elements into the cache from the system memory addresses.
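The address generation this abstract describes (starting from one element's address, a stride, and the block boundaries) can be sketched in software for the two-dimensional case. The parameter names, the 64-byte cache line, and the restriction to two dimensions are assumptions for illustration; the patent describes a hardware instruction.

```python
def block_addresses(base, row_stride, rows, cols, elem_bytes):
    """Addresses of a rows x cols block whose first element is at `base`."""
    return [base + r * row_stride + c * elem_bytes
            for r in range(rows) for c in range(cols)]

def cache_lines(addresses, line_bytes=64):
    """Distinct cache-line base addresses covering the block, in order of first use."""
    return list(dict.fromkeys(a // line_bytes * line_bytes for a in addresses))

# A 2 x 4 block of 4-byte elements inside an array whose rows are 256 bytes apart.
addrs = block_addresses(base=0x1000, row_stride=256, rows=2, cols=4, elem_bytes=4)
lines = cache_lines(addrs)
```

A single request therefore expands into one cache-line fetch per block row in this example, instead of one prefetch instruction per row.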

Type: Grant

Filed: June 8, 2018

Date of Patent: May 19, 2020

Assignee: Intel Corporation

Inventors: Victor Lee, Mikhail Smelyanskiy, Alexander Heinecke

Systems and methods for efficient scaling of quantized integers

Patent number: 10579383

Abstract: The disclosed computer-implemented method may include receiving an input value and a floating-point scaling factor and determining (1) an integer scaling factor based on the floating-point scaling factor, (2) a pre-scaling adjustment value representative of a number of places by which to shift a binary representation of the input value prior to a scaling operation, and (3) a post-scaling adjustment value representative of a number of places by which to shift the binary representation of the input value following the scaling operation. The method may further include calculating a scaled result value by (1) shifting rightwards the binary representation of the input value by the pre-scaling adjustment value, (2) scaling the shifted binary representation of the input value by the integer scaling factor, and (3) shifting rightwards the shifted and scaled binary value by the post-scaling adjustment value. Various other methods, systems, and computer-readable media are also disclosed.

Type: Grant

Filed: May 30, 2018

Date of Patent: March 3, 2020

Assignee: Facebook, Inc.

Inventors: Nadav Rotem, Jong Soo Park, Zhaoxia Deng, Abdulkadir Utku Diril, Mikhail Smelyanskiy, Roman Dzhabarov, James Wesley Hegeman

Systems and methods for employing predication in computational models

Patent number: 10553207

Abstract: The disclosed method may include (1) determining whether a next operation of a plurality of operations of a computational model is dependent upon a Boolean predication value, (2) based on the next operation not being dependent on the Boolean predication value, performing the next operation, where a state of the computational model is updated as a result of performing the next operation, and (3) based on the next operation being dependent on the Boolean predication value, performing at least one of (a) allowing, based on the Boolean predication value being a first value, the next operation to update the state of the computational model, and (b) preventing, based on the Boolean predication value being a second value different from the first value, the next operation from updating the state of the computational model. Various other methods and systems are also disclosed.

Type: Grant

Filed: December 29, 2017

Date of Patent: February 4, 2020

Assignee: Facebook, Inc.

Inventors: Nadav Rotem, Abdulkadir Utku Diril, Mikhail Smelyanskiy, Jong Soo Park, James Kenneth Reed
