Patents by Inventor Aliasger Tayeb Zaidy

Aliasger Tayeb Zaidy has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 11829627
    Abstract: Various embodiments provide one or more processor instructions and memory instructions that enable a memory sub-system to predict a schedule for migrating data between the memory devices that make up the sub-system.
    Type: Grant
    Filed: August 16, 2021
    Date of Patent: November 28, 2023
    Assignee: Micron Technology, Inc.
    Inventors: David Andrew Roberts, Aliasger Tayeb Zaidy
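    Example: As a rough Python sketch of the idea (not the patented method), a sub-system might rank pages by recent access counts to predict which data belongs in a fast tier; the class name and heuristic below are hypothetical.
      # Hypothetical sketch: predict a migration schedule for a two-tier
      # memory sub-system from recent access counts.
      from collections import Counter

      class MigrationPredictor:
          def __init__(self, fast_capacity_pages):
              self.fast_capacity = fast_capacity_pages
              self.access_counts = Counter()  # page -> recent access count

          def record_access(self, page):
              self.access_counts[page] += 1

          def predict_schedule(self):
              # Pages predicted hot enough to migrate into the fast tier.
              hottest = self.access_counts.most_common(self.fast_capacity)
              return [page for page, _count in hottest]

      predictor = MigrationPredictor(fast_capacity_pages=2)
      for page in [3, 3, 7, 3, 9, 7]:
          predictor.record_access(page)
      print(predictor.predict_schedule())  # [3, 7]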
  • Publication number: 20230206045
    Abstract: A device for deep learning acceleration with mixed precision may include a precision mode port configured to receive an indication of an output precision mode, a data input port configured to receive an input value, and a truncation component configured to truncate the input value into a keep segment value and a truncate segment value. The device may be configured to add the keep segment value and a carry bit to generate a rounded keep segment value, and may include a rounded output generation component configured to generate a rounded output based on the rounded keep segment value and the output precision mode. The rounded output generation component may generate the rounded output to include a sign bit of the keep segment value and either a first or a second quantity of lower bits of the keep segment value, depending on whether the output precision mode is a first or a second value.
    Type: Application
    Filed: June 16, 2022
    Publication date: June 29, 2023
    Inventors: Sen Ma, Aliasger Tayeb Zaidy, Dustin Werran
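    Example: The truncate-and-carry scheme above amounts to round-to-nearest on fixed-point values. A minimal sketch, with assumed bit widths (the publication does not fix the first/second quantities to these values, and sign handling is omitted):
      def round_mixed_precision(value, truncate_bits, output_bits):
          keep = value >> truncate_bits                   # keep segment
          truncated = value & ((1 << truncate_bits) - 1)  # truncate segment
          carry = (truncated >> (truncate_bits - 1)) & 1  # MSB of truncate segment
          rounded_keep = keep + carry                     # round to nearest
          # Output precision mode selects how many lower bits are kept.
          return rounded_keep & ((1 << output_bits) - 1)

      print(round_mixed_precision(0b101101, truncate_bits=2, output_bits=4))  # 11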
  • Publication number: 20230206046
    Abstract: A device for deep learning acceleration with mixed precision may include a token generator configured to generate a token value and may include multiple multiplexers. Each multiplexer may include a load port configured to receive map data, a max pool port configured to receive max pool data, and matrix-matrix (MM) data input ports each configured to receive MM data based on MM output generated by an MM component. Each multiplexer may include an output port configured to provide output data to a single MM component. Each multiplexer may provide corresponding output data to a different MM component. Each multiplexer may be configured to select, based on the token value, an input from one of the load port, the max pool port, or a single MM data input port, of the MM data input ports, as the output data to be provided to the output port.
    Type: Application
    Filed: June 16, 2022
    Publication date: June 29, 2023
    Inventors: Sen Ma, Aliasger Tayeb Zaidy, Dustin Werran
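    Example: A hypothetical software model of the token-driven selection described above; string tokens stand in for whatever encoding the hardware uses.
      def mux_select(token, load, max_pool, mm_inputs):
          # One source per token: load port, max pool port, or a single
          # MM data input port.
          if token == "LOAD":
              return load
          if token == "MAXPOOL":
              return max_pool
          if token.startswith("MM"):
              return mm_inputs[int(token[2:])]  # e.g. "MM1" -> mm_inputs[1]
          raise ValueError(f"unknown token {token!r}")

      print(mux_select("MM1", load=0, max_pool=0, mm_inputs=[10, 20, 30]))  # 20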
  • Publication number: 20230206061
    Abstract: A device for deep learning acceleration with mixed precision may include a first precision mode port to receive an indication of an input precision mode and a second precision mode port to receive an indication of an output precision mode. The device may include a first data port to receive map data and a second data port to receive kernel data. The device may include multiply-accumulate (MAC) components that are each configured to generate a MAC output based on the input precision mode, the map data, and the kernel data. The device may include an adder component to generate an adder component output based on the input precision mode and one or more MAC outputs. The device may include a rounding component to round the adder component output, based on the output precision mode, to generate a rounded output, and an output port to output the rounded output.
    Type: Application
    Filed: June 16, 2022
    Publication date: June 29, 2023
    Inventors: Sen Ma, Aliasger Tayeb Zaidy, Dustin Werran
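    Example: An illustrative software model of the MAC-adder-rounding pipeline; precision modes are modeled as word lengths, and the final truncation is a stand-in for the rounding component. All widths are assumptions.
      def mac(map_row, kernel_row, input_bits):
          mask = (1 << input_bits) - 1  # input precision mode as a word length
          return sum((m & mask) * (k & mask) for m, k in zip(map_row, kernel_row))

      def mac_adder_round(map_rows, kernel_rows, input_bits=8, output_bits=16):
          adder_out = sum(mac(m, k, input_bits) for m, k in zip(map_rows, kernel_rows))
          return adder_out & ((1 << output_bits) - 1)  # crude rounding stand-in

      print(mac_adder_round([[1, 2]], [[3, 4]]))  # 1*3 + 2*4 = 11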
  • Publication number: 20230206044
    Abstract: A device for deep learning acceleration with mixed precision may include a first data port configured to receive a map data segment and a second data port configured to receive a kernel data segment. The device may include a precision mode port configured to receive an indication of an input precision mode that indicates a word length for the map data segment and for the kernel data segment. The device may include a multiplier component configured to generate a multiplier component output based on the input precision mode and based on multiplying the map data segment and the kernel data segment. The device may include an adder component configured to generate an adder component output based on the input precision mode and based on the multiplier component output. The device may include an output port configured to output the adder component output.
    Type: Application
    Filed: June 16, 2022
    Publication date: June 29, 2023
    Inventors: Sen Ma, Aliasger Tayeb Zaidy, Dustin Werran
  • Publication number: 20230206041
    Abstract: A device for deep learning acceleration with mixed precision may include multiple matrix-matrix (MM) components that each include multiple map memory components configured to store map data, multiple kernel memory components configured to store kernel data, and multiple matrix-vector (MV) components. The MV components may each include multiple vector-vector (VV) components that are each configured to generate a VV output based on an input precision mode, an output precision mode, and an accumulation of products that is based on the map data and the kernel data. Each VV component included in a particular MV component may be coupled with each map memory component and may be coupled with a single kernel memory component. The device may include a data distribution component coupled with the multiple MM components and configured to load the map data into the multiple map memory components.
    Type: Application
    Filed: June 16, 2022
    Publication date: June 29, 2023
    Inventors: Sen Ma, Aliasger Tayeb Zaidy, Dustin Werran
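    Example: The coupling pattern (each VV component sees every map memory but exactly one kernel memory) is the architectural point; a hypothetical sketch of that wiring, under which one set of map data is reused against many kernels:
      class VV:
          def __init__(self, map_memories, kernel_memory):
              self.map_memories = map_memories    # shared across all VVs
              self.kernel_memory = kernel_memory  # private to this VV

      class MV:
          def __init__(self, map_memories, kernel_memories):
              self.vvs = [VV(map_memories, k) for k in kernel_memories]

      maps = [[1, 2], [3, 4]]     # map memory contents (assumed)
      kernels = [[5, 6], [7, 8]]  # one kernel memory per VV (assumed)
      mv = MV(maps, kernels)
      print(len(mv.vvs), mv.vvs[0].kernel_memory)  # 2 [5, 6]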
  • Publication number: 20230206043
    Abstract: A device for deep learning acceleration with mixed precision may include matrix-vector (MV) components that each include vector-vector (VV) components that are each configured to generate a respective VV output based on an input precision mode, an output precision mode, and an accumulation of products. The accumulation of products may be calculated by adding products based on the input precision mode. Each product may be calculated by multiplying, based on the input precision mode, a map data segment and a kernel data segment. Each MV component may include one or more components configured to concatenate VV outputs to generate a concatenated VV output. The device may include activation function components that are each configured to receive a corresponding concatenated VV output, generate an activation function output based on the corresponding concatenated VV output and the output precision mode, and output the activation function output.
    Type: Application
    Filed: June 16, 2022
    Publication date: June 29, 2023
    Inventors: Sen Ma, Aliasger Tayeb Zaidy, Dustin Werran
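    Example: A sketch of concatenating VV outputs and applying an activation whose range follows the output precision mode; the clamped ReLU is an assumption, since the publication does not name a specific activation function here.
      def concat_and_activate(vv_outputs, output_bits=8):
          concatenated = [x for vv in vv_outputs for x in vv]
          limit = (1 << output_bits) - 1
          return [min(max(x, 0), limit) for x in concatenated]  # clamped ReLU

      print(concat_and_activate([[5, -3], [300]]))  # [5, 0, 255]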
  • Publication number: 20230206042
    Abstract: A device for deep learning acceleration with mixed precision may include vector-vector (VV) components that are each configured to generate a VV output based on an input precision mode, an output precision mode, and at least one accumulation of products. Each accumulation of products may be calculated by adding products based on the input precision mode. Each product may be calculated by multiplying a map word and a kernel word based on the input precision mode. The input precision mode may indicate an input word length for the map word and for the kernel word, and the output precision mode may indicate an output word length for the VV output. The device may include one or more components configured to concatenate VV outputs, corresponding to the VV components, to generate a concatenated VV output. The device may include an output port configured to output the concatenated VV output.
    Type: Application
    Filed: June 16, 2022
    Publication date: June 29, 2023
    Inventors: Sen Ma, Aliasger Tayeb Zaidy, Dustin Werran
  • Publication number: 20230100328
    Abstract: Disclosed in some examples are improved address prediction and memory preloading techniques that leverage next-delta prediction and/or far-delta prediction for scheduling using a DNN. Previous memory access sequence data that identifies one or more memory addresses previously accessed by one or more processors of a system may be processed and then converted into a sequence of delta values. The sequence of delta values is then mapped to one or more classes that are then input to a DNN. The DNN then outputs a predicted future class identifier sequence representing the addresses that the DNN predicts will be accessed by the processor in the future. The predicted future class identifier sequence is then converted back to a predicted delta value sequence and then back into a set of one or more predicted addresses.
    Type: Application
    Filed: July 18, 2022
    Publication date: March 30, 2023
    Inventors: Aliasger Tayeb Zaidy, David Andrew Roberts, Patrick Michael Sheridan, Lukasz Burzawa
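    Example: The end-to-end shape of the pipeline, with the DNN replaced by a stub so the sketch runs; the class table here is a toy mapping, not the patented scheme.
      def to_deltas(addresses):
          return [b - a for a, b in zip(addresses, addresses[1:])]

      def deltas_to_classes(deltas, table):
          return [table.setdefault(d, len(table)) for d in deltas]

      def classes_to_deltas(class_ids, table):
          inverse = {v: k for k, v in table.items()}
          return [inverse[c] for c in class_ids]

      history = [0x1000, 0x1040, 0x1080, 0x10C0]
      table = {}
      classes = deltas_to_classes(to_deltas(history), table)
      predicted_classes = [classes[-1]]  # stub "DNN": repeat the last class
      predicted_deltas = classes_to_deltas(predicted_classes, table)
      print(hex(history[-1] + predicted_deltas[0]))  # 0x1100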
  • Publication number: 20230051103
    Abstract: Various embodiments provide one or more processor instructions and memory instructions that enable a memory sub-system to predict a schedule for migrating data between the memory devices that make up the sub-system.
    Type: Application
    Filed: August 16, 2021
    Publication date: February 16, 2023
    Inventors: David Andrew Roberts, Aliasger Tayeb Zaidy
  • Publication number: 20220223201
    Abstract: Systems, devices, and methods related to a Deep Learning Accelerator and memory are described. For example, the accelerator can have processing units to perform at least matrix computations of an artificial neural network via execution of instructions. The processing units have a local memory to store operands of the instructions. The accelerator can access a random access memory via a system buffer, or without going through the system buffer. A fetch instruction can request an item, available at a memory address in the random access memory, to be loaded into the local memory at a local address. The fetch instruction can include a hint for the caching of the item in the system buffer. During execution of the fetch instruction, the hint can be used to determine whether to load the item through the system buffer or to bypass the system buffer in loading the item.
    Type: Application
    Filed: January 11, 2021
    Publication date: July 14, 2022
    Inventors: Aliasger Tayeb Zaidy, Patrick Alan Estep, David Andrew Roberts
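    Example: A toy model of the fetch-with-hint behavior: the hint decides whether a loaded item also populates the shared system buffer or bypasses it. Names and the caching policy are assumptions.
      def fetch(ram, system_buffer, local_memory, addr, local_addr, cache_hint):
          item = system_buffer.get(addr)
          if item is None:
              item = ram[addr]
              if cache_hint:                  # hint: item worth caching for reuse
                  system_buffer[addr] = item  # load through the system buffer
          local_memory[local_addr] = item     # buffer bypassed when hint is False

      ram, buf, local = {0x10: "weights"}, {}, {}
      fetch(ram, buf, local, addr=0x10, local_addr=0, cache_hint=False)
      print(local[0], buf)  # weights {}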
  • Publication number: 20220147812
    Abstract: Systems, devices, and methods related to a Deep Learning Accelerator and memory are described. For example, an integrated circuit device may be configured to execute instructions with matrix operands and configured with random access memory (RAM). A compiler has its own artificial neural network configured to identify an optimized compilation option for an artificial neural network to be compiled by the compiler and/or for a hardware platform of Deep Learning Accelerators. The compiler's artificial neural network can be trained via machine learning to identify the optimized compilation option based on the features of the artificial neural network to be compiled and/or features of the hardware platform on which the compiler output will be executed.
    Type: Application
    Filed: November 6, 2020
    Publication date: May 12, 2022
    Inventors: Andre Xian Ming Chang, Aliasger Tayeb Zaidy, Marko Vitez, Michael Cody Glapa, Abhishek Chaurasia, Eugenio Culurciello
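    Example: A sketch of a compiler consulting a trained model to pick a compilation option from features of the network and platform; the feature names and the nearest-neighbour stand-in below are assumptions, not the trained network the abstract describes.
      TRAINING = [
          # (layer_count, avg_matrix_dim) -> best option observed in training
          ((10, 64), "unroll_loops"),
          ((200, 1024), "tile_matrices"),
      ]

      def pick_option(features):
          def dist(a, b):
              return sum((x - y) ** 2 for x, y in zip(a, b))
          return min(TRAINING, key=lambda t: dist(t[0], features))[1]

      print(pick_option((150, 896)))  # tile_matrices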
  • Publication number: 20220147811
    Abstract: Systems, devices, and methods related to a Deep Learning Accelerator and memory are described. For example, an integrated circuit device may be configured to execute instructions with matrix operands and configured with random access memory (RAM). A compiler can identify a plurality of portions of an artificial neural network for implementation on a plurality of such integrated circuit devices respectively. The compiler converts a description of the artificial neural network into a plurality of compiler outputs executable on the plurality of devices to generate an output of the artificial neural network responsive to an input to the artificial neural network. Intermediate results are communicated among the devices in generating the output of the artificial neural network.
    Type: Application
    Filed: November 6, 2020
    Publication date: May 12, 2022
    Inventors: Jaime Cummins, Marko Vitez, Eugenio Culurciello, Andre Xian Ming Chang, Aliasger Tayeb Zaidy
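    Example: A sketch of splitting a layer sequence across devices and threading each intermediate result to the next device; the even split and the toy layers are assumptions.
      def partition(layers, num_devices):
          size = -(-len(layers) // num_devices)  # ceiling division
          return [layers[i:i + size] for i in range(0, len(layers), size)]

      def run_pipeline(partitions, x):
          for device_layers in partitions:       # one chunk per device
              for layer in device_layers:
                  x = layer(x)                   # intermediate result passed on
          return x

      layers = [lambda v: v + 1, lambda v: v * 2, lambda v: v - 3, lambda v: v * v]
      print(run_pipeline(partition(layers, 2), 3))  # ((3+1)*2 - 3)**2 = 25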
  • Publication number: 20220147808
    Abstract: Systems, devices, and methods related to a Deep Learning Accelerator and memory are described. For example, an integrated circuit device may be configured to execute instructions with matrix operands and configured with random access memory (RAM). A compiler can convert a description of an artificial neural network into a generic result of compilation according to a specification of a generic Deep Learning Accelerator and then map that generic result into a platform-specific result according to a specification of a specific hardware platform of Deep Learning Accelerators. The platform-specific result can be stored into the RAM of the integrated circuit device to enable the integrated circuit device to autonomously perform the computation of the artificial neural network in generating an output in response to an input to the artificial neural network.
    Type: Application
    Filed: November 6, 2020
    Publication date: May 12, 2022
    Inventors: Andre Xian Ming Chang, Aliasger Tayeb Zaidy, Eugenio Culurciello, Jaime Cummins, Marko Vitez
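    Example: The two-stage shape described above, compiling to a generic instruction set and then mapping each generic instruction to a platform-specific one; both instruction vocabularies are invented for illustration.
      GENERIC_TO_PLATFORM = {
          "matmul": "dla_mm_16x16",  # hypothetical platform opcode
          "relu": "dla_act_relu",
      }

      def compile_generic(description):
          return list(description)   # trivial stand-in for real compilation

      def map_to_platform(generic_result, mapping=GENERIC_TO_PLATFORM):
          return [mapping[op] for op in generic_result]

      print(map_to_platform(compile_generic(["matmul", "relu"])))
      # ['dla_mm_16x16', 'dla_act_relu']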
  • Publication number: 20220147810
    Abstract: Systems, devices, and methods related to a Deep Learning Accelerator and memory are described. For example, an integrated circuit device may be configured to execute instructions with matrix operands and configured with random access memory. A computing device running a compiler can interact with and/or probe an integrated circuit device to identify hardware characteristics of the integrated circuit device in performing matrix computations. The compiler can generate and optimize a result of compilation from a description of an artificial neural network based at least in part on the hardware characteristics of the integrated circuit device. The result of compilation can include first data representative of parameters of the artificial neural network and second data representative of instructions executable by the integrated circuit device to generate an output of the artificial neural network based on the first data and an input to the artificial neural network.
    Type: Application
    Filed: November 6, 2020
    Publication date: May 12, 2022
    Inventors: Aliasger Tayeb Zaidy, Marko Vitez, Eugenio Culurciello, Jaime Cummins, Andre Xian Ming Chang
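    Example: A probe-then-compile sketch: the compiler queries the device for its matrix-computation characteristics and folds them into the compiled output; the probe interface and fields are assumptions.
      def probe_device():
          return {"mac_units": 256, "buffer_kib": 512}  # stand-in probe reply

      def compile_for(description, characteristics):
          tile = 32 if characteristics["mac_units"] >= 256 else 16
          instructions = [f"{op}_tile{tile}" for op in description]
          return {"first_data": "network parameters", "second_data": instructions}

      print(compile_for(["matmul"], probe_device()))
      # {'first_data': 'network parameters', 'second_data': ['matmul_tile32']}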
  • Publication number: 20220147809
    Abstract: Systems, devices, and methods related to a Deep Learning Accelerator and memory are described. For example, an integrated circuit device may be configured to execute instructions with matrix operands and configured with random access memory. A compiler can convert a description of an artificial neural network into a compiler output through optimization and/or selection of hardware options of the integrated circuit device. The compiler output can include parameters of the artificial neural network, instructions executable by processing units of the Deep Learning Accelerator to generate an output of the artificial neural network responsive to an input to the artificial neural network, and hardware options to be stored in registers connected to control hardware configurations of the processing units.
    Type: Application
    Filed: November 6, 2020
    Publication date: May 12, 2022
    Inventors: Aliasger Tayeb Zaidy, Marko Vitez, Eugenio Culurciello, Jaime Cummins, Andre Xian Ming Chang
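    Example: A sketch of register-backed hardware options: the compiler output bundles parameters, instructions, and option values to be written into control registers before execution. Register names are hypothetical.
      compiler_output = {
          "parameters": [0.5, -1.25],
          "instructions": ["dla_mm_16x16", "dla_act_relu"],
          "hardware_options": {"REG_PRECISION": 0b01, "REG_UNROLL": 4},
      }

      def apply_hardware_options(output, regs):
          for reg, value in output["hardware_options"].items():
              regs[reg] = value  # control register write

      registers = {}
      apply_hardware_options(compiler_output, registers)
      print(registers)  # {'REG_PRECISION': 1, 'REG_UNROLL': 4}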
  • Publication number: 20220147813
    Abstract: Systems, devices, and methods related to a Deep Learning Accelerator and memory are described. For example, an integrated circuit device may be configured to execute instructions with matrix operands and configured with random access memory (RAM). A compiler is configured to generate instructions executable by the Deep Learning Accelerator from a description of a target artificial neural network. The instructions may call routines in a runtime library that has an embedded artificial neural network configured to predict optimized execution options available to implement the routines. The prediction is based at least in part on a pattern of data being processed in the target artificial neural network and/or a pattern of usages of the routines by the instructions.
    Type: Application
    Filed: November 6, 2020
    Publication date: May 12, 2022
    Inventors: Andre Xian Ming Chang, Aliasger Tayeb Zaidy, Marko Vitez, Eugenio Culurciello
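    Example: A sketch of a runtime-library routine choosing among execution options from the pattern of the data it is handed; the sparsity threshold stands in for the embedded predictor the abstract describes.
      def conv_routine(tensor):
          zeros = sum(1 for x in tensor if x == 0)
          if zeros / len(tensor) > 0.5:  # pattern of the data being processed
              return "sparse_kernel"     # execution option predicted to be better
          return "dense_kernel"

      print(conv_routine([0, 0, 0, 5]))  # sparse_kernel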