Patents by Inventor Barath Lakshmanan

Barath Lakshmanan has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Publication number: 20240152756
    Abstract: In one embodiment, a method of training an autoencoder neural network includes determining autoencoder design parameters for the autoencoder neural network, including an input image size for an input image, a compression ratio for compression of the input image into a latent vector, and a latent vector size for the latent vector. The input image size is determined based on a resolution of training images and a size of target features to be detected. The compression ratio is determined based on entropy of the training images. The latent vector size is determined based on the compression ratio. The method further includes training the autoencoder neural network based on the autoencoder design parameters and the training dataset, and then saving the trained autoencoder neural network on a storage device.
    Type: Application
    Filed: March 25, 2022
    Publication date: May 9, 2024
    Applicant: Intel Corporation
    Inventors: Barath Lakshmanan, Ashish B. Datta, Craig D. Sperry, David J. Austin, Caleb Mark McMillan, Neha Purushothaman, Rita H. Wouhaybi
  • Patent number: 11948224
    Abstract: One embodiment provides an apparatus comprising a memory stack including multiple memory dies and a parallel processor including a plurality of multiprocessors. Each multiprocessor has a single instruction, multiple thread (SIMT) architecture, the parallel processor coupled to the memory stack via one or more memory interfaces. At least one multiprocessor comprises a multiply-accumulate circuit to perform multiply-accumulate operations on matrix data in a stage of a neural network implementation to produce a result matrix comprising a plurality of matrix data elements at a first precision, precision tracking logic to evaluate metrics associated with the matrix data elements and indicate if an optimization is to be performed for representing data at a second stage of the neural network implementation, and a numerical transform unit to dynamically perform a numerical transform operation on the matrix data elements based on the indication to produce transformed matrix data elements at a second precision.
    Type: Grant
    Filed: November 1, 2022
    Date of Patent: April 2, 2024
    Assignee: Intel Corporation
    Inventors: Elmoustapha Ould-Ahmed-Vall, Sara S. Baghsorkhi, Anbang Yao, Kevin Nealis, Xiaoming Chen, Altug Koker, Abhishek R. Appu, John C. Weast, Mike B. Macpherson, Dukhwan Kim, Linda L. Hurd, Ben J. Ashbaugh, Barath Lakshmanan, Liwei Ma, Joydeep Ray, Ping T. Tang, Michael S. Strickland
  • Patent number: 11934934
    Abstract: An apparatus to facilitate optimization of a convolutional neural network (CNN) is disclosed. The apparatus includes optimization logic to receive a CNN model having a list of instructions and including pruning logic to optimize the list of instructions by eliminating branches in the list of instructions that comprise a weight value of 0.
    Type: Grant
    Filed: April 17, 2017
    Date of Patent: March 19, 2024
    Assignee: Intel Corporation
    Inventors: Liwei Ma, Elmoustapha Ould- Ahmed-Vall, Barath Lakshmanan, Ben J. Ashbaugh, Jingyi Jin, Jeremy Bottleson, Mike B. Macpherson, Kevin Nealis, Dhawal Srivastava, Joydeep Ray, Ping T. Tang, Michael S. Strickland, Xiaoming Chen, Anbang Yao, Tatiana Shpeisman, Altug Koker, Abhishek R. Appu
  • Publication number: 20240086683
    Abstract: An apparatus to facilitate workload scheduling is disclosed. The apparatus includes one or more clients, one or more processing units to processes workloads received from the one or more clients, including hardware resources and scheduling logic to schedule direct access of the hardware resources to the one or more clients to process the workloads.
    Type: Application
    Filed: September 21, 2023
    Publication date: March 14, 2024
    Applicant: Intel Corporation
    Inventors: Liwei Ma, Nadathur Rajagopalan Satish, Jeremy Bottleson, Farshad Akhbari, Eriko Nurvitadhi, Chandrasekaran Sakthivel, Barath Lakshmanan, Jingyi Jin, Justin E. Gottschlich, Michael Strickland
  • Publication number: 20230401668
    Abstract: One embodiment provides a general-purpose graphics processing unit comprising a dynamic precision floating-point unit including a control unit having precision tracking hardware logic to track an available number of bits of precision for computed data relative to a target precision, wherein the dynamic precision floating-point unit includes computational logic to output data at multiple precisions.
    Type: Application
    Filed: August 25, 2023
    Publication date: December 14, 2023
    Applicant: Intel Corporation
    Inventors: Elmoustapha Ould-Ahmed-Vall, Sara S. Baghsorkhi, Anbang Yao, Kevin Nealis, Xiaoming Chen, Altug Koker, Abhishek R. Appu, John C. Weast, Mike B. Macpherson, Dukhwan Kim, Linda L. Hurd, Ben J. Ashbaugh, Barath Lakshmanan, Liwei Ma, Joydeep Ray, Ping T. Tang, Michael S. Strickland
  • Patent number: 11809978
    Abstract: An apparatus to facilitate workload scheduling is disclosed. The apparatus includes one or more clients, one or more processing units to processes workloads received from the one or more clients, including hardware resources and scheduling logic to schedule direct access of the hardware resources to the one or more clients to process the workloads.
    Type: Grant
    Filed: April 18, 2022
    Date of Patent: November 7, 2023
    Assignee: Intel Corporation
    Inventors: Liwei Ma, Nadathur Rajagopalan Satish, Jeremy Bottleson, Farshad Akhbari, Eriko Nurvitadhi, Chandrasekaran Sakthivel, Barath Lakshmanan, Jingyi Jin, Justin E. Gottschlich, Michael Strickland
  • Publication number: 20230315481
    Abstract: Described herein is a general-purpose graphics processing unit including a multiprocessor having a single instruction, multiple thread, SIMT, architecture. The multiprocessor comprises multiple sets of compute units each having a first logic unit configured to perform floating-point operations and a second logic unit configured to perform integer operations, with a thread of the floating-point instruction being executed in parallel with a thread of the integer instruction.
    Type: Application
    Filed: May 4, 2023
    Publication date: October 5, 2023
    Applicant: Intel Corporation
    Inventors: ELMOUSTAPHA OULD-AHMED-VALL, BARATH LAKSHMANAN, TATIANA SHPEISMAN, Joydeep Ray, Ping T. Tang, Michael Strickland, Xiaoming Chen, Anbang Yao, Ben J. Ashbaugh, Linda L. Hurd, Liwei Ma
  • Patent number: 11727246
    Abstract: Embodiments provide systems and methods which facilitate optimization of a convolutional neural network (CNN). One embodiment provides for a non-transitory machine-readable medium storing instructions that cause one or more processors to perform operations comprising processing a trained convolutional neural network (CNN) to generate a processed CNN, the trained CNN having weights in a floating-point format. Processing the trained CNN includes quantizing the weights in the floating-point format to generate weights in an integer format. Quantizing the weights includes generating a quantization table to enable non-uniform quantization of the weights and quantizing the weights from the floating-point format to the integer format using the quantization table. The operations additionally comprise performing an inference operation utilizing the processed CNN with the integer format weights.
    Type: Grant
    Filed: February 22, 2019
    Date of Patent: August 15, 2023
    Assignee: Intel Corporation
    Inventors: Liwei Ma, Elmoustapha Ould-Ahmed-Vall, Barath Lakshmanan, Ben J. Ashbaugh, Jingyi Jin, Jeremy Bottleson, Mike B. Macpherson, Kevin Nealis, Dhawal Srivastava, Joydeep Ray, Ping T. Tang, Michael S. Strickland, Xiaoming Chen, Anbang Yao, Tatiana Shpeisman, Altug Koker, Abhishek R. Appu
  • Publication number: 20230061331
    Abstract: One embodiment provides a multi-chip module accelerator usable to execute tensor data processing operations a multi-chip module. The multi-chip module may include a memory stack including multiple memory dies and parallel processor circuitry communicatively coupled to the memory stack. The parallel processor circuitry may include multiprocessor cores to execute matrix multiplication and accumulate operations. The matrix multiplication and accumulate operations may include floating-point operations that are configurable to include two-dimensional matrix multiply and accumulate operations involving inputs that have differing floating-point precisions. The floating-point operations may include a first operation at a first precision and a second operation at a second precision. The first operation may include a multiply having at least one 16-bit floating-point input and the second operation may include an accumulate having a 32-bit floating-point input.
    Type: Application
    Filed: October 5, 2022
    Publication date: March 2, 2023
    Applicant: Intel Corporation
    Inventors: Elmoustapha Ould-Ahmed-Vall, Sara S. Baghsorkhi, Anbang Yao, Kevin Nealis, Xiaoming Chen, Altug Koker, Abhishek R. Appu, John C. Weast, Mike B. Macpherson, Dukhwan Kim, Linda L. Hurd, Ben J. Ashbaugh, Barath Lakshmanan, Liwei Ma, Joydeep Ray, Ping T. Tang, Michael S. Strickland
  • Publication number: 20230061670
    Abstract: One embodiment provides an apparatus comprising a memory stack including multiple memory dies and a parallel processor including a plurality of multiprocessors. Each multiprocessor has a single instruction, multiple thread (SIMT) architecture, the parallel processor coupled to the memory stack via one or more memory interfaces. At least one multiprocessor comprises a multiply-accumulate circuit to perform multiply-accumulate operations on matrix data in a stage of a neural network implementation to produce a result matrix comprising a plurality of matrix data elements at a first precision, precision tracking logic to evaluate metrics associated with the matrix data elements and indicate if an optimization is to be performed for representing data at a second stage of the neural network implementation, and a numerical transform unit to dynamically perform a numerical transform operation on the matrix data elements based on the indication to produce transformed matrix data elements at a second precision.
    Type: Application
    Filed: November 1, 2022
    Publication date: March 2, 2023
    Applicant: Intel Corporation
    Inventors: Elmoustapha Ould-Ahmed-Vall, Sara S. Baghsorkhi, Anbang Yao, Kevin Nealis, Xiaoming Chen, Altug Koker, Abhishek R. Appu, John C. Weast, Mike B. Macpherson, Dukhwan Kim, Linda L. Hurd, Ben J. Ashbaugh, Barath Lakshmanan, Liwei Ma, Joydeep Ray, Ping T. Tang, Michael S. Strickland
  • Publication number: 20230039729
    Abstract: Methods and apparatus relating to autonomous vehicle neural network optimization techniques are described. In an embodiment, the difference between a first training dataset to be used for a neural network and a second training dataset to be used for the neural network is detected. The second training dataset is authenticated in response to the detection of the difference. The neural network is used to assist in an autonomous vehicle/driving. Other embodiments are also disclosed and claimed.
    Type: Application
    Filed: October 11, 2022
    Publication date: February 9, 2023
    Applicant: Intel Corporation
    Inventors: Abhishek R. Appu, Altug Koker, Linda L. Hurd, Dukhwan Kim, Mike B. MacPherson, John C. Weast, Justin E. Gottschlich, Jingyi Jin, Barath Lakshmanan, Chandrasekaran Sakthivel, Michael S. Strickland, Joydeep Ray, Kamal Sinha, Prasoonkumar Surti, Balaji Vembu, Ping T. Tang, Anbang Yao, Tatiana Shpeisman, Xiaoming Chen
  • Publication number: 20220382555
    Abstract: One embodiment provides for a graphics processing unit (GPU) to accelerate machine learning operations, the GPU comprising an instruction cache to store a first instruction and a second instruction, the first instruction to cause the GPU to perform a floating-point operation, including a multi-dimensional floating-point operation, and the second instruction to cause the GPU to perform an integer operation; and a general-purpose graphics compute unit having a single instruction, multiple thread architecture, the general-purpose graphics compute unit to concurrently execute the first instruction and the second instruction.
    Type: Application
    Filed: June 14, 2022
    Publication date: December 1, 2022
    Applicant: Intel Corporation
    Inventors: ELMOUSTAPHA OULD-AHMED-VALL, BARATH LAKSHMANAN, TATIANA SHPEISMAN, Joydeep Ray, Ping T. Tang, Michael Strickland, Xiaoming Chen, Anbang Yao, Ben J. Ashbaugh, Linda L. Hurd, Liwei Ma
  • Publication number: 20220327357
    Abstract: An apparatus to facilitate workload scheduling is disclosed. The apparatus includes one or more clients, one or more processing units to processes workloads received from the one or more clients, including hardware resources and scheduling logic to schedule direct access of the hardware resources to the one or more clients to process the workloads.
    Type: Application
    Filed: April 18, 2022
    Publication date: October 13, 2022
    Applicant: Intel Corporation
    Inventors: Liwei Ma, Nadathur Rajagopalan Satish, Jeremy Bottleson, Farshad Akhbari, Eriko Nurvitadhi, Chandrasekaran Sakthivel, Barath Lakshmanan, Jingyi Jin, Justin E. Gottschlich, Michael Strickland
  • Patent number: 11468541
    Abstract: Embodiments described herein provide a graphics processor that can perform a variety of mixed and multiple precision instructions and operations. One embodiment provides a streaming multiprocessor that can concurrently execute multiple thread groups, wherein the streaming multiprocessor includes a single instruction, multiple thread (SIMT) architecture and the streaming multiprocessor is to execute multiple threads for each of multiple instructions. The streaming multiprocessor can perform concurrent integer and floating-point operations and includes a mixed precision core to perform operations at multiple or mixed precisions and dynamic ranges.
    Type: Grant
    Filed: April 14, 2022
    Date of Patent: October 11, 2022
    Assignee: Intel Corporation
    Inventors: Elmoustapha Ould-Ahmed-Vall, Sara S. Baghsorkhi, Anhang Yao, Kevin Nealis, Xiaoming Chen, Altug Koker, Abhishek R. Appu, John C. Weast, Mike B. Macpherson, Dukhwan Kim, Linda L. Hurd, Ben J. Ashbaugh, Barath Lakshmanan, Liwei Ma, Joydeep Ray, Ping T. Tang, Michael S. Strickland
  • Patent number: 11461107
    Abstract: One embodiment provides for a general-purpose graphics processing unit comprising a streaming multiprocessor having a single instruction, multiple thread (SIMT) architecture including hardware multithreading. The streaming multiprocessor comprises multiple processing blocks including multiple processing cores. The processing cores include independent integer and floating-point data paths that are configurable to concurrently execute multiple independent instructions. A memory is coupled with the multiple processing blocks.
    Type: Grant
    Filed: December 20, 2018
    Date of Patent: October 4, 2022
    Assignee: Intel Corporation
    Inventors: Elmoustapha Ould-Ahmed-Vall, Barath Lakshmanan, Tatiana Shpeisman, Joydeep Ray, Ping T. Tang, Michael Strickland, Xiaoming Chen, Anbang Yao, Ben J. Ashbaugh, Linda L. Hurd, Liwei Ma
  • Patent number: 11409537
    Abstract: One embodiment provides for a graphics processing unit (GPU) to accelerate machine learning operations, the GPU comprising an instruction cache to store a first instruction and a second instruction, the first instruction to cause the GPU to perform a floating-point operation, including a multi-dimensional floating-point operation, and the second instruction to cause the GPU to perform an integer operation; and a general-purpose graphics compute unit having a single instruction, multiple thread (SIMT) architecture, the general-purpose graphics compute unit to simultaneously execute the first instruction and the second instruction, wherein the integer operation corresponds to a memory address calculation.
    Type: Grant
    Filed: November 21, 2017
    Date of Patent: August 9, 2022
    Assignee: Intel Corporation
    Inventors: Elmoustapha Ould-Ahmed-Vall, Barath Lakshmanan, Tatiana Shpeisman, Joydeep Ray, Ping T. Tang, Michael Strickland, Xiaoming Chen, Anbang Yao, Ben J. Ashbaugh, Linda L. Hurd, Liwei Ma
  • Publication number: 20220245753
    Abstract: Embodiments described herein provide a graphics processor that can perform a variety of mixed and multiple precision instructions and operations. One embodiment provides a streaming multiprocessor that can concurrently execute multiple thread groups, wherein the streaming multiprocessor includes a single instruction, multiple thread (SIMT) architecture and the streaming multiprocessor is to execute multiple threads for each of multiple instructions. The streaming multiprocessor can perform concurrent integer and floating-point operations and includes a mixed precision core to perform operations at multiple or mixed precisions and dynamic ranges.
    Type: Application
    Filed: April 14, 2022
    Publication date: August 4, 2022
    Applicant: Intel Corporation
    Inventors: Elmoustapha Ould-Ahmed-Vall, Sara S. Baghsorkhi, Anbang Yao, Kevin Nealis, Xiaoming Chen, Altug Koker, Abhishek R. Appu, John C. Weast, Mike B. Macpherson, Dukhwan Kim, Linda L. Hurd, Ben J. Ashbaugh, Barath Lakshmanan, Liwei Ma, Joydeep Ray, Ping T. Tang, Michael S. Strickland
  • Patent number: 11315007
    Abstract: An apparatus to facilitate workload scheduling is disclosed. The apparatus includes one or more clients, one or more processing units to processes workloads received from the one or more clients, including hardware resources and scheduling logic to schedule direct access of the hardware resources to the one or more clients to process the workloads.
    Type: Grant
    Filed: July 1, 2020
    Date of Patent: April 26, 2022
    Assignee: Intel Corporation
    Inventors: Liwei Ma, Nadathur Rajagopalan Satish, Jeremy Bottleson, Farshad Akhbari, Eriko Nurvitadhi, Chandrasekaran Sakthivel, Barath Lakshmanan, Jingyi Jin, Justin E. Gottschlich, Michael Strikland
  • Patent number: 11308574
    Abstract: Embodiments described herein provide a graphics processor that can perform a variety of mixed and multiple precision instructions and operations. One embodiment provides a streaming multiprocessor that can concurrently execute multiple thread groups, wherein the streaming multiprocessor includes a single instruction, multiple thread (SIMT) architecture and the streaming multiprocessor is to execute multiple threads for each of multiple instructions. The streaming multiprocessor can perform concurrent integer and floating-point operations and includes a mixed precision core to perform operations at multiple precisions.
    Type: Grant
    Filed: August 3, 2020
    Date of Patent: April 19, 2022
    Assignee: Intel Corporation
    Inventors: Elmoustapha Ould-Ahmed-Vall, Sara S. Baghsorkhi, Anbang Yao, Kevin Nealis, Xiaoming Chen, Altug Koker, Abhishek R. Appu, John C. Weast, Mike B. Macpherson, Dukhwan Kim, Linda L. Hurd, Ben J. Ashbaugh, Barath Lakshmanan, Liwei Ma, Joydeep Ray, Ping T. Tang, Michael S. Strickland
  • Publication number: 20220004935
    Abstract: An apparatus to facilitate ensemble learning for deep feature defect detection is disclosed. The apparatus includes one or more processors to receive a deep feature vector from a feature extractor of an ensemble learning system, the deep feature vector extracted from input data; cluster the deep feature vector into a plurality of clusters based on a distance into the plurality of clusters; execute a probabilistic machine learning model corresponding to a cluster of the plurality of clusters to which the deep feature vector is clustered; and detect whether the deep feature vector comprises a defect based on an output of execution of the probabilistic machine learning model.
    Type: Application
    Filed: September 22, 2021
    Publication date: January 6, 2022
    Applicant: Intel Corporation
    Inventors: Barath Lakshmanan, Craig Sperry, David Austin, Nilesh Ahuja