Patents by Inventor Abdulkadir Utku Diril

Abdulkadir Utku Diril has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 10372787
    Abstract: A special-purpose hardware accelerator may include a cache configured to store an input matrix related to performing a convolution operation and a matrix-multiplication subsystem pre-configured with matrix-transform coefficients for performing matrix-transform operations. The matrix-multiplication subsystem may perform the convolution operation by (1) reading the input matrix from the cache, (2) transforming the input matrix via matrix multiplication, (3) transforming, via matrix multiplication, a parameter matrix that includes convolution parameters for performing the convolution operation, (4) applying the transformed parameter matrix to the transformed input matrix via an element-wise multiplication operation, and then (5) performing an inverse-transformation operation on the results of the element-wise multiplication operation to create an output matrix for the convolution operation. Various other systems and methods are also disclosed.
    Type: Grant
    Filed: December 12, 2017
    Date of Patent: August 6, 2019
    Assignee: Facebook, Inc.
    Inventors: Jong Soo Park, Nadav Rotem, Mikhail Smelyanskiy, Abdulkadir Utku Diril
  • Publication number: 20190205358
    Abstract: A special-purpose, hardware-based accelerator may include an input subsystem configured to receive first and second vectors as operands of a full dot-product operation. The accelerator may also include a sparsity-aware dot-product engine communicatively coupled to the input subsystem and configured to perform adaptive dot-product processing by: (1) identifying, within the first and second vectors, at least one zero-value element and (2) executing, in response to identifying the zero-value element, a reduced dot-product operation that excludes, relative to the full dot-product operation, at least one mathematical operation in which the zero-value element is an operand. The accelerator may also include an output subsystem that is communicatively coupled to the sparsity-aware dot-product engine and configured to send a result of the reduced dot-product operation to a storage subsystem. Various other accelerators, computing systems, and methods are also disclosed.
    Type: Application
    Filed: December 29, 2017
    Publication date: July 4, 2019
    Inventors: Abdulkadir Utku Diril, Jong Soo Park, Nadav Rotem, Mikhail Smelyanskiy
  • Publication number: 20190205735
    Abstract: A disclosed computing system may include a special-purpose hardware device having an input subsystem, a linearization subsystem, and a matrix multiplication unit. The input subsystem may facilitate on-the-fly convolution lowering within a neural network convolution layer by directing input volume patches to logical unit(s) of the device. The linearization subsystem may be configured to receive a patch from the input subsystem and to linearize the patch by arranging elements of the patch as a portion of a data matrix row. The matrix multiplication unit of the device may be configured to receive the data matrix from the linearization subsystem and to apply a filter matrix to the data matrix via a matrix multiplication operation. Various other methods, systems, and computer-readable media are also disclosed.
    Type: Application
    Filed: December 29, 2017
    Publication date: July 4, 2019
    Inventors: Mikhail Smelyanskiy, Abdulkadir Utku Diril, Jong Soo Park, Nadav Rotem
  • Publication number: 20190205094
    Abstract: The disclosed method may include (1) receiving a precision level of each weight associated with each input of a node of a computational model, (2) identifying, for each weight, one of a plurality of multiplier groups, where each multiplier group may include a plurality of hardware multipliers of a corresponding bit width, and where the corresponding bit width of the plurality of hardware multipliers of the one of the plurality of multiplier groups may be sufficient to multiply the weight by the associated input, and (3) multiplying each weight by its associated input using an available hardware multiplier of the one of the plurality of multiplier groups identified for the weight. Various other processing elements, methods, and systems are also disclosed.
    Type: Application
    Filed: December 29, 2017
    Publication date: July 4, 2019
    Inventors: Abdulkadir Utku Diril, Mikhail Smelyanskiy, Nadav Rotem, Jong Soo Park
  • Publication number: 20190206390
    Abstract: The disclosed method may include (1) determining whether a next operation of a plurality of operations of a computational model is dependent upon a Boolean predication value, (2) based on the next operation not being dependent on the Boolean predication value, performing the next operation, where a state of the computational model is updated as a result of performing the next operation, and (3) based on the next operation being dependent on the Boolean predication value, performing at least one of (a) allowing, based on the Boolean predication value being a first value, the next operation to update the state of the computational model, and (b) preventing, based on the Boolean predication value being a second value different from the first value, the next operation from updating the state of the computational model. Various other methods and systems are also disclosed.
    Type: Application
    Filed: December 29, 2017
    Publication date: July 4, 2019
    Inventors: Nadav Rotem, Abdulkadir Utku Diril, Mikhail Smelyanskiy, Jong Soo Park, James Kenneth Reed
  • Publication number: 20190187775
    Abstract: A computer-implemented method for dynamically managing the power usage and/or performance of an artificial intelligence (AI) hardware accelerator may include (1) receiving an instruction stream that includes one or more instructions for performing at least one AI-specific computing task, (2) identifying a plurality of special-purpose, hardware-based functional units configured to perform AI-specific computing tasks, (3) predicting, based on an analysis of at least a portion of the instruction stream, a power-usage requirement for at least one of the functional units when executing the instruction stream, and then (4) modifying, based on the power-usage requirement, the power supplied to at least one of the functional units. Various other methods and systems are also disclosed.
    Type: Application
    Filed: December 18, 2017
    Publication date: June 20, 2019
    Inventors: Nadav Rotem, Jong Soo Park, Mikhail Smelyanskiy, Abdulkadir Utku Diril
  • Publication number: 20190190538
    Abstract: A system may include a memory device that stores parameters of a layer of a neural network that have been compressed. The system may also include a special-purpose hardware processing unit programmed to, for the layer of the neural network: (1) receive the compressed parameters from the memory device, (2) decompress the compressed parameters, and (3) apply the decompressed parameters in an arithmetic operation of the layer of the neural network. Various other methods, systems, and accelerators are also disclosed.
    Type: Application
    Filed: December 18, 2017
    Publication date: June 20, 2019
    Inventors: Jong Soo Park, Nadav Rotem, Mikhail Smelyanskiy, Abdulkadir Utku Diril
  • Publication number: 20190179869
    Abstract: A special-purpose hardware accelerator may include a cache configured to store an input matrix related to performing a convolution operation and a matrix-multiplication subsystem pre-configured with matrix-transform coefficients for performing matrix-transform operations. The matrix-multiplication subsystem may perform the convolution operation by (1) reading the input matrix from the cache, (2) transforming the input matrix via matrix multiplication, (3) transforming, via matrix multiplication, a parameter matrix that includes convolution parameters for performing the convolution operation, (4) applying the transformed parameter matrix to the transformed input matrix via an element-wise multiplication operation, and then (5) performing an inverse-transformation operation on the results of the element-wise multiplication operation to create an output matrix for the convolution operation. Various other systems and methods are also disclosed.
    Type: Application
    Filed: December 12, 2017
    Publication date: June 13, 2019
    Inventors: Jong Soo Park, Nadav Rotem, Mikhail Smelyanskiy, Abdulkadir Utku Diril
  • Publication number: 20190171927
    Abstract: A method for performing layer-level quantization may include (1) performing an inference of an activation layer of a neural network, (2) storing a first limit value of the activation layer in a data storage system, (3) storing a second limit value of the activation layer in the data storage system, (4) determining a scaling factor based on the first and second limit values, and then (5) applying the scaling factor on a subsequent inference. Various other methods, systems, and devices are also disclosed.
    Type: Application
    Filed: December 6, 2017
    Publication date: June 6, 2019
    Inventors: Abdulkadir Utku Diril, Jong Soo Park, Nadav Rotem, Mikhail Smelyanskiy
  • Patent number: 8553046
    Abstract: An apparatus and method for detecting and handling thin lines in a raster image includes reading depth values for each pixel of an n×m block of pixels surrounding a substantially central pixel. Differences are then calculated for selected depth values of the n×m block of pixels to yield multiple difference values. These difference values may then be compared with multiple pre-computed difference values associated with thin lines pre-determined to pass through the n×m block of pixels. If the difference values of the pixel block substantially match the difference values of one of the pre-determined thin lines, the pixel block may be deemed to describe a thin line. The apparatus and method may preclude application of an anti-aliasing filter to the substantially central pixel of the pixel block in the event it describes a thin line.
    Type: Grant
    Filed: November 9, 2007
    Date of Patent: October 8, 2013
    Assignee: Vivante Corporation
    Inventors: Lefan Zhong, Abdulkadir Utku Diril
  • Patent number: 8416241
    Abstract: An apparatus and method for rasterizing a primitive in a graphics system is disclosed in one example of the invention as including scanning a first row of tiles, one tile at a time, starting from a first point and scanning in a first direction. Immediately after scanning the first row of tiles, the method includes moving from the first point to a second point in an orthogonal direction relative to the first row. Immediately after moving from the first point to the second point, the method includes scanning a second row of tiles, one tile at a time, starting from the second point and scanning in the first direction. By scanning rows in the same direction immediately prior to and after moving from one row to another, cache utilization is improved.
    Type: Grant
    Filed: July 21, 2011
    Date of Patent: April 9, 2013
    Assignee: Vivante Corporation
    Inventors: Abdulkadir Utku Diril, Frido Garritsen
  • Publication number: 20120044245
    Abstract: An apparatus and method for rasterizing a primitive in a graphics system is disclosed in one example of the invention as including scanning a first row of tiles, one tile at a time, starting from a first point and scanning in a first direction. Immediately after scanning the first row of tiles, the method includes moving from the first point to a second point in an orthogonal direction relative to the first row. Immediately after moving from the first point to the second point, the method includes scanning a second row of tiles, one tile at a time, starting from the second point and scanning in the first direction. By scanning rows in the same direction immediately prior to and after moving from one row to another, cache utilization is improved.
    Type: Application
    Filed: July 21, 2011
    Publication date: February 23, 2012
    Applicant: Vivante Corporation
    Inventors: Abdulkadir Utku Diril, Frido Garritsen
  • Patent number: 8060765
    Abstract: A power monitor for electronic devices, such as computer chips, is used to estimate the power consumption and to compare the estimated power consumption against the power budget. The estimated power consumption is based on activity signals from various functional blocks of the computer chip. The activity signals that are monitored correlate accurately to the total number of flip-flops that are active at a given time. If the estimated power consumption exceeds the power budget, the speed of the clock signals supplied to the computer chip is reduced.
    Type: Grant
    Filed: November 2, 2006
    Date of Patent: November 15, 2011
    Assignee: NVIDIA Corporation
    Inventors: Hungse Cha, Robert J. Hasslen, III, John A. Robinson, Sean J. Treichler, Abdulkadir Utku Diril
  • Patent number: 8009169
    Abstract: An apparatus and method for rasterizing a primitive in a graphics system is disclosed in one example of the invention as including scanning a first row of tiles, one tile at a time, starting from a first point and scanning in a first direction. Immediately after scanning the first row of tiles, the method includes moving from the first point to a second point in an orthogonal direction relative to the first row. Immediately after moving from the first point to the second point, the method includes scanning a second row of tiles, one tile at a time, starting from the second point and scanning in the first direction. By scanning rows in the same direction immediately prior to and after moving from one row to another, cache utilization is improved.
    Type: Grant
    Filed: November 9, 2007
    Date of Patent: August 30, 2011
    Assignee: Vivante Corporation
    Inventors: Abdulkadir Utku Diril, Frido Garritsen
  • Publication number: 20090122076
    Abstract: An apparatus and method for detecting and handling thin lines in a raster image includes reading depth values for each pixel of an n×m block of pixels surrounding a substantially central pixel. Differences are then calculated for selected depth values of the n×m block of pixels to yield multiple difference values. These difference values may then be compared with multiple pre-computed difference values associated with thin lines pre-determined to pass through the n×m block of pixels. If the difference values of the pixel block substantially match the difference values of one of the pre-determined thin lines, the pixel block may be deemed to describe a thin line. The apparatus and method may preclude application of an anti-aliasing filter to the substantially central pixel of the pixel block in the event it describes a thin line.
    Type: Application
    Filed: November 9, 2007
    Publication date: May 14, 2009
    Applicant: Vivante Corporation
    Inventors: Lefan Zhong, Abdulkadir Utku Diril
  • Publication number: 20090122064
    Abstract: An apparatus and method for rasterizing a primitive in a graphics system is disclosed in one example of the invention as including scanning a first row of tiles, one tile at a time, starting from a first point and scanning in a first direction. Immediately after scanning the first row of tiles, the method includes moving from the first point to a second point in an orthogonal direction relative to the first row. Immediately after moving from the first point to the second point, the method includes scanning a second row of tiles, one tile at a time, starting from the second point and scanning in the first direction. By scanning rows in the same direction immediately prior to and after moving from one row to another, cache utilization is improved.
    Type: Application
    Filed: November 9, 2007
    Publication date: May 14, 2009
    Applicant: Vivante Corporation
    Inventors: Abdulkadir Utku Diril, Frido Garritsen
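
The tile scan order described in the rasterization abstracts above (every row scanned in the same direction, with an orthogonal move from one row's start point to the next) can be sketched roughly as follows. This is only an illustration of the traversal order, not the patented implementation: the function name `scan_tiles`, the `(col, row)` coordinate convention, and the rectangular grid of tiles are all assumptions made for the example. A real rasterizer would also test each tile against the primitive, which is omitted here.

```python
from typing import Iterator, Tuple


def scan_tiles(cols: int, rows: int) -> Iterator[Tuple[int, int]]:
    """Yield hypothetical (col, row) tile coordinates in scan order.

    Every row is scanned in the same direction (increasing col),
    so each row starts at the same column as the row before it.
    """
    for row in range(rows):
        # Moving to the next row is the "orthogonal" step from the
        # previous row's start point to the new row's start point.
        for col in range(cols):
            # Scan the row one tile at a time in the first direction.
            yield (col, row)


order = list(scan_tiles(cols=3, rows=2))
print(order)  # [(0, 0), (1, 0), (2, 0), (0, 1), (1, 1), (2, 1)]
```

Compared with a serpentine scan, which reverses direction on alternate rows, this order visits tiles in the same column sequence on each row, which is the property the abstracts credit with improved cache utilization.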