Patents by Inventor Amin Firoozshahian

Amin Firoozshahian has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

Function as a service (FaaS) system enhancements

Patent number: 11922220

Abstract: Embodiments of systems, apparatuses and methods provide enhanced function as a service (FaaS) to users, e.g., computer developers and cloud service providers (CSPs). A computing system configured to provide such enhanced FaaS service include one or more controls architectural subsystems, software and orchestration subsystems, network and storage subsystems, and security subsystems. The computing system executes functions in response to events triggered by the users in an execution environment provided by the architectural subsystems, which represent an abstraction of execution management and shield the users from the burden of managing the execution. The software and orchestration subsystems allocate computing resources for the function execution by intelligently spinning up and down containers for function code with decreased instantiation latency and increased execution scalability while maintaining secured execution.

Type: Grant

Filed: April 16, 2019

Date of Patent: March 5, 2024

Assignee: Intel Corporation

Inventors: Mohammad R. Haghighat, Kshitij Doshi, Andrew J. Herdrich, Anup Mohan, Ravishankar R. Iyer, Mingqiu Sun, Krishna Bhuyan, Teck Joo Goh, Mohan J. Kumar, Michael Prinke, Michael Lemay, Leeor Peled, Jr-Shian Tsai, David M. Durham, Jeffrey D. Chamberlain, Vadim A. Sukhomlinov, Eric J. Dahlen, Sara Baghsorkhi, Harshad Sane, Areg Melik-Adamyan, Ravi Sahita, Dmitry Yurievich Babokin, Ian M. Steiner, Alexander Bachmutsky, Anil Rao, Mingwei Zhang, Nilesh K. Jain, Amin Firoozshahian, Baiju V. Patel, Wenyong Huang, Yeluri Raghuram
Reservation architecture for overcommitted memory

Patent number: 11681611

Abstract: Various systems and methods for computer memory overcommitment management are described herein. A system for computer memory management includes a memory device to store data and a mapping table; and a memory overcommitment circuitry to: receive a signal to move data in a first block from a memory reduction area in the memory device to a non-memory reduction area in the memory device, the memory reduction area to store data using a memory reduction technique, and the non-memory reduction area to store data without any memory reduction techniques; allocate a second block in the non-memory reduction area; copy the data in the first block to the second block; and update the mapping table to revise a pointer to point to the second block, the mapping table used to store pointers to memory device in the memory reduction area and the non-memory reduction area.

Type: Grant

Filed: December 11, 2020

Date of Patent: June 20, 2023

Assignee: Intel Corporation

Inventors: Omid Azizi, Amin Firoozshahian, Andreas Kleen, Mahesh Madhav, Mahesh Maddury, Chandan Egbert, Eric Gouldey
Grouped convolution using point-to-point connected channel convolution engines

Patent number: 11580192

Abstract: A processor system comprises a plurality of processing elements. Each processing element includes a corresponding convolution processor unit configured to perform a portion of a groupwise convolution. The corresponding convolution processor unit determines multiplication results by multiplying each data element of a portion of data elements in a convolution data matrix with a corresponding data element in a corresponding groupwise convolution weight matrix. The portion of data elements in the convolution data matrix that are multiplied belong to different channels and different groups. For each specific channel of the different channels, the corresponding convolution processor unit sums together at least some of the multiplication results belonging to the same specific channel to determine a corresponding channel convolution result data element.

Type: Grant

Filed: April 8, 2020

Date of Patent: February 14, 2023

Assignee: Meta Platforms, Inc.

Inventors: Rakesh Komuravelli, Krishnakumar Narayanan Nair, Abdulkadir Utku Diril, Ehsan Khish Ardestani Zadeh, Yuchen Hao, Martin Schatz, Thomas Mark Ulrich, Olivia Wu, Anup Ramesh Kadkol, Amin Firoozshahian
Mapping convolution to a channel convolution engine

Patent number: 11537865

Abstract: A processor system comprises a first and second group of registers and a hardware channel convolution processor unit. The first group of registers is configured to store data elements of channels of a portion of a convolution data matrix. Each register stores at least one data element from each channel. The second group of registers is configured to store data elements of convolution weight matrices including a separate convolution weight matrix for each channel. Each register stores at least one data element from each convolution weight matrix. The hardware channel convolution processor unit is configured to multiply each data element in the first group of registers with a corresponding data element in the second group of registers and sum together the multiplication results for each specific channel to determine corresponding channel convolution result data elements in a corresponding channel convolution result matrix.

Type: Grant

Filed: February 18, 2020

Date of Patent: December 27, 2022

Assignee: Meta Platforms, Inc.

Inventors: Krishnakumar Narayanan Nair, Rakesh Komuravelli, Abdulkadir Utku Diril, Ehsan Khish Ardestani Zadeh, Yuchen Hao, Martin Schatz, Thomas Mark Ulrich, Olivia Wu, Anup Ramesh Kadkol, Amin Firoozshahian
Mapping convolution to a partition channel convolution engine

Patent number: 11520853

Abstract: A processor system comprises two groups of registers and a hardware channel convolution processor unit. The first group of registers is configured to store data elements of channels of a portion of a convolution data matrix. Each register stores at least one data element from each channel. The second group of registers is configured to store data elements of convolution weight matrices including a separate matrix for each channel. Each register stores at least one data element from each matrix. The hardware channel convolution processor unit is configured to multiply each data element in a first and second portion of the first group of registers with a corresponding data element in the second group of registers to determine corresponding multiplication results and sum together the multiplication results for each specific channel to determine two corresponding channel convolution result data elements in a corresponding channel convolution result matrix.

Type: Grant

Filed: February 28, 2020

Date of Patent: December 6, 2022

Assignee: Meta Platforms, Inc.

Inventors: Krishnakumar Narayanan Nair, Rakesh Komuravelli, Abdulkadir Utku Diril, Ehsan Khish Ardestani Zadeh, Yuchen Hao, Martin Schatz, Thomas Mark Ulrich, Olivia Wu, Anup Ramesh Kadkol, Amin Firoozshahian
Pipelined pointwise convolution using per-channel convolution operations

Patent number: 11443013

Abstract: A processor system comprises a hardware channel convolution processor unit and dot product processor unit. The channel convolution processor unit is configured to perform depthwise convolution, including by multiplying each data element of a first group of data elements of a convolution data matrix with a corresponding data element of a second group of data elements of a plurality of depthwise convolution weight matrices and summing together, for each specific channel, multiplication results corresponding to the specific channel to determine one corresponding result data element in a corresponding channel convolution result matrix to calculate a portion of depthwise convolution results.

Type: Grant

Filed: March 23, 2020

Date of Patent: September 13, 2022

Assignee: Meta Platforms, Inc.

Inventors: Rakesh Komuravelli, Krishnakumar Narayanan Nair, Abdulkadir Utku Diril, Ehsan Khish Ardestani Zadeh, Yuchen Hao, Martin Schatz, Thomas Mark Ulrich, Olivia Wu, Anup Ramesh Kadkol, Amin Firoozshahian
Hardware-assisted paging mechanisms

Patent number: 11392491

Abstract: Processing circuitry for computer memory management includes memory reduction circuitry to implement a memory reduction technique; and reference count information collection circuitry to: access a memory region, the memory region subject to the memory reduction technique; obtain an indication of memory reduction of the memory region; calculate metrics based on the indication of memory reduction of cache lines associated with the memory region; and provide the metrics to a system software component for use in memory management mechanisms.

Type: Grant

Filed: June 27, 2018

Date of Patent: July 19, 2022

Assignee: Intel Corporation

Inventors: Amin Firoozshahian, Omid Azizi, Chandan Egbert, David Hansen, Andreas Kleen, Mahesh Maddury, Mahesh Madhav, Alexandre Solomatnikov, John Peter Stevenson
Pause communication from I/O devices supporting page faults

Patent number: 11169929

Abstract: A processing device includes a core to execute instructions, and memory management circuitry coupled to, memory, the core and an I/O device that supports page faults. The memory management circuitry includes an express invalidations circuitry, and a page translation permission circuitry. The memory management circuitry is to, while the core is executing the instructions, receive a command to pause communication between the I/O device and the memory. In response to receiving the command to pause the communication, modify permissions of page translations by the page translation permission circuitry and transmit an invalidation request, by the express invalidations circuitry to the I/O device, to cause cached page translations in the I/O device to be invalidated.

Type: Grant

Filed: April 20, 2018

Date of Patent: November 9, 2021

Assignee: INTEL CORPORATION

Inventors: Rupin Vakharwala, Amin Firoozshahian, Stephen Van Doren, Rajesh Sankaran, Mahesh Madhav, Omid Azizi, Andreas Kleen, Mahesh Maddury, Ashok Raj
MAPPING CONVOLUTION TO CONNECTED PROCESSING ELEMENTS USING DISTRIBUTED PIPELINED SEPARABLE CONVOLUTION OPERATIONS

Publication number: 20210334072

Abstract: A processor system comprises a plurality of dot product processor units and element-wise multiplication units. The dot product processor units perform a depthwise convolution of a data matrix with a separate depthwise convolution weight matrix for each data matrix channel. Each dot product processor unit performs at least a portion of the depthwise convolution for one or more data matrix channels. The element-wise multiplication units perform multiplication operations of a pointwise convolution. Each element-wise multiplication unit applies to each depthwise convolution partial result element received from one or more of the dot product processor units a corresponding data element from each of a plurality of pointwise convolution weight filters to determine element-wise multiplication unit results. The processor system sums together different groups of data elements from the element-wise multiplication unit results to at least in part calculate different data elements of a result of the pointwise convolution.

Type: Application

Filed: April 22, 2020

Publication date: October 28, 2021

Inventors: Rakesh Komuravelli, Krishnakumar Narayanan Nair, Abdulkadir Utku Diril, Ehsan Khish Ardestani Zadeh, Yuchen Hao, Martin Schatz, Thomas Mark Ulrich, Olivia Wu, Anup Ramesh Kadkol, Amin Firoozshahian
GROUPED CONVOLUTION USING POINT-TO-POINT CONNECTED CHANNEL CONVOLUTION ENGINES

Publication number: 20210319076

Abstract: A processor system comprises a plurality of processing elements. Each processing element includes a corresponding convolution processor unit configured to perform a portion of a groupwise convolution. The corresponding convolution processor unit determines multiplication results by multiplying each data element of a portion of data elements in a convolution data matrix with a corresponding data element in a corresponding groupwise convolution weight matrix. The portion of data elements in the convolution data matrix that are multiplied belong to different channels and different groups. For each specific channel of the different channels, the corresponding convolution processor unit sums together at least some of the multiplication results belonging to the same specific channel to determine a corresponding channel convolution result data element.

Type: Application

Filed: April 8, 2020

Publication date: October 14, 2021

Inventors: Rakesh Komuravelli, Krishnakumar Narayanan Nair, Abdulkadir Utku Diril, Ehsan Khish Ardestani Zadeh, Yuchen Hao, Martin Schatz, Thomas Mark Ulrich, Olivia Wu, Anup Ramesh Kadkol, Amin Firoozshahian
PIPELINED POINTWISE CONVOLUTION USING PER-CHANNEL CONVOLUTION OPERATIONS

Publication number: 20210294875

Abstract: A processor system comprises a hardware channel convolution processor unit and dot product processor unit. The channel convolution processor unit is configured to perform depthwise convolution, including by multiplying each data element of a first group of data elements of a convolution data matrix with a corresponding data element of a second group of data elements of a plurality of depthwise convolution weight matrices and summing together, for each specific channel, multiplication results corresponding to the specific channel to determine one corresponding result data element in a corresponding channel convolution result matrix to calculate a portion of depthwise convolution results.

Type: Application

Filed: March 23, 2020

Publication date: September 23, 2021

Inventors: Rakesh Komuravelli, Krishnakumar Narayanan Nair, Abdulkadir Utku Diril, Ehsan Khish Ardestani Zadeh, Yuchen Hao, Martin Schatz, Thomas Mark Ulrich, Olivia Wu, Anup Ramesh Kadkol, Amin Firoozshahian
MAPPING CONVOLUTION TO A PARTITION CHANNEL CONVOLUTION ENGINE

Publication number: 20210271451

Abstract: A processor system comprises two groups of registers and a hardware channel convolution processor unit. The first group of registers is configured to store data elements of channels of a portion of a convolution data matrix. Each register stores at least one data element from each channel. The second group of registers is configured to store data elements of convolution weight matrices including a separate matrix for each channel. Each register stores at least one data element from each matrix. The hardware channel convolution processor unit is configured to multiply each data element in a first and second portion of the first group of registers with a corresponding data element in the second group of registers to determine corresponding multiplication results and sum together the multiplication results for each specific channel to determine two corresponding channel convolution result data elements in a corresponding channel convolution result matrix.

Type: Application

Filed: February 28, 2020

Publication date: September 2, 2021

Inventors: Krishnakumar Narayanan Nair, Rakesh Komuravelli, Abdulkadir Utku Diril, Ehsan Khish Ardestani Zadeh, Yuchen Hao, Martin Schatz, Thomas Mark Ulrich, Olivia Wu, Anup Ramesh Kadkol, Amin Firoozshahian
FUNCTION AS A SERVICE (FAAS) SYSTEM ENHANCEMENTS

Publication number: 20210263779

Abstract: Embodiments of systems, apparatuses and methods provide enhanced function as a service (FaaS) to users, e.g., computer developers and cloud service providers (CSPs). A computing system configured to provide such enhanced FaaS service include one or more controls architectural subsystems, software and orchestration subsystems, network and storage subsystems, and security subsystems. The computing system executes functions in response to events triggered by the users in an execution environment provided by the architectural subsystems, which represent an abstraction of execution management and shield the users from the burden of managing the execution. The software and orchestration subsystems allocate computing resources for the function execution by intelligently spinning up and down containers for function code with decreased instantiation latency and increased execution scalability while maintaining secured execution.

Type: Application

Filed: April 16, 2019

Publication date: August 26, 2021

Applicant: Intel Corporation

Inventors: Mohammad R. Haghighat, Kshitij Doshi, Andrew J. Herdrich, Anup Mohan, Ravishankar R. Iyer, Mingqiu Sun, Krishna Bhuyan, Teck Joo Goh, Mohan J. Kumar, Michael Prinke, Michael Lemay, Leeor Peled, Jr-Shian Tsai, David M. Durham, Jeffrey D. Chamberlain, Vadim A. Sukhomlinov, Eric J. Dahlen, Sara Baghsorkhi, Harshad Sane, Areg Melik-Adamyan, Ravi Sahita, Dmitry Yurievich Babokin, Ian M. Steiner, Alexander Bachmutsky, Anil Rao, Mingwei Zhang, Nilesh K. Jain, Amin Firoozshahian, Baiju V. Patel, Wenyong Huang, Yeluri Raghuram
MAPPING CONVOLUTION TO A CHANNEL CONVOLUTION ENGINE

Publication number: 20210256363

Abstract: A processor system comprises a first and second group of registers and a hardware channel convolution processor unit. The first group of registers is configured to store data elements of channels of a portion of a convolution data matrix. Each register stores at least one data element from each channel. The second group of registers is configured to store data elements of convolution weight matrices including a separate convolution weight matrix for each channel. Each register stores at least one data element from each convolution weight matrix. The hardware channel convolution processor unit is configured to multiply each data element in the first group of registers with a corresponding data element in the second group of registers and sum together the multiplication results for each specific channel to determine corresponding channel convolution result data elements in a corresponding channel convolution result matrix.

Type: Application

Filed: February 18, 2020

Publication date: August 19, 2021

Inventors: Krishnakumar Narayanan Nair, Rakesh Komuravelli, Abdulkadir Utku Diril, Ehsan Khish Ardestani Zadeh, Yuchen Hao, Martin Schatz, Thomas Mark Ulrich, Olivia Wu, Anup Ramesh Kadkol, Amin Firoozshahian
RESERVATION ARCHITECTURE FOR OVERCOMMITTED MEMORY

Publication number: 20210240609

Abstract: Various systems and methods for computer memory overcommitment management are described herein. A system for computer memory management includes a memory device to store data and a mapping table; and a memory overcommitment circuitry to: receive a signal to move data in a first block from a memory reduction area in the memory device to a non-memory reduction area in the memory device, the memory reduction area to store data using a memory reduction technique, and the non-memory reduction area to store data without any memory reduction techniques; allocate a second block in the non-memory reduction area; copy the data in the first block to the second block; and update the mapping table to revise a pointer to point to the second block, the mapping table used to store pointers to memory device in the memory reduction area and the non-memory reduction area.

Type: Application

Filed: December 11, 2020

Publication date: August 5, 2021

Inventors: Omid Azizi, Amin Firoozshahian, Andreas Kleen, Mahesh Madhav, Mahesh Maddury, Chandan Egbert, Eric Gouldey
Concept for approximate deduplication in storage and memory

Patent number: 11079955

Abstract: Examples relate to an approximative memory deduplication method, a controller apparatus or controller device for a memory or storage controller, a memory or storage controller, a computer system and to a computer program. The approximative memory deduplication method comprises determining a hash value of a data block. The hash value is based on a user-defined approximative hashing function. The approximative memory deduplication method comprises storing a quantized version of the data block based on the hash value using a memory or storage device of the computer system.

Type: Grant

Filed: December 17, 2018

Date of Patent: August 3, 2021

Assignee: Intel Corporation

Inventors: Francesc Guim Bernat, Karthik Kumar, Mustafa Hajeer, Thomas Willhalm, Amin Firoozshahian, Chandan Egbert
Distributed physical processing of matrix sum operation

Patent number: 11010202

Abstract: A specification of an operation to perform one or more element-wise sums of specified portions of a matrix is received. The specification of the operation is analyzed to select a type of processing load partitioning to be applied. Based on the selected type of processing load partitioning to be applied, processing required to perform the operation is partitioned across a plurality of physical processing elements in parallel. The partitioned processing is distributed to the physical hardware processing elements to perform in parallel the element-wise sums of the specified portions of the matrix.

Type: Grant

Filed: August 6, 2019

Date of Patent: May 18, 2021

Assignee: Facebook, Inc.

Inventors: Martin Schatz, Amin Firoozshahian
DISTRIBUTED PHYSICAL PROCESSING OF MATRIX SUM OPERATION

Publication number: 20210042116

Abstract: A specification of an operation to perform one or more element-wise sums of specified portions of a matrix is received. The specification of the operation is analyzed to select a type of processing load partitioning to be applied. Based on the selected type of processing load partitioning to be applied, processing required to perform the operation is partitioned across a plurality of physical processing elements in parallel. The partitioned processing is distributed to the physical hardware processing elements to perform in parallel the element-wise sums of the specified portions of the matrix.

Type: Application

Filed: August 6, 2019

Publication date: February 11, 2021

Inventors: Martin Schatz, Amin Firoozshahian
Reservation architecture for overcommitted memory

Patent number: 10866888

Abstract: Various systems and methods for computer memory overcommitment management are described herein. A system for computer memory management includes a memory device to store data and a mapping table; and a memory overcommitment circuitry to: receive a signal to move data in a first block from a memory reduction area in the memory device to a non-memory reduction area in the memory device, the memory reduction area to store data using a memory reduction technique, and the non-memory reduction area to store data without any memory reduction techniques; allocate a second block in the non-memory reduction area; copy the data in the first block to the second block; and update the mapping table to revise a pointer to point to the second block, the mapping table used to store pointers to memory device in the memory reduction area and the non-memory reduction area.

Type: Grant

Filed: January 11, 2018

Date of Patent: December 15, 2020

Assignee: Intel Corporation

Inventors: Omid Azizi, Amin Firoozshahian, Andreas Kleen, Mahesh Madhav, Mahesh Maddury, Chandan Egbert, Eric Gouldey
Lazy memory deduplication

Patent number: 10732880

Abstract: Various systems and methods for computer memory management are described herein. A system for computer memory management includes a first memory device including a mapping table; a second memory device including a staging area; a third memory device including a dedup data region; and a controller operable to: receive a memory access request, the memory access request including an address and data; write the data to the staging area; and update the mapping table with the address.

Type: Grant

Filed: January 11, 2018

Date of Patent: August 4, 2020

Assignee: Intel Corporation

Inventors: Omid Azizi, Amin Firoozshahian, John Stevenson, Mahesh Maddury, Chandan Egbert, Henk Neefs

1 2 next