Patents by Inventor Asaf Hargil

Asaf Hargil has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 11341210
    Abstract: This application relates to a multi-layer convolution operation. The multi-layer convolution operation is optimized for a vector processing unit having a number of data paths configured to operate on vector operands containing a number of elements processed in parallel by the data paths. The convolution operation specifies a convolution kernel utilized to filter a multi-channel input and generate a multi-channel output of the convolution operation. A number of threads are generated to process blocks of the multi-channel output, each block comprising a set of windows of a number of channels of the multi-channel output. Each window is a portion of the array of elements in a single layer of the multi-channel output. Each thread processes a block in accordance with an arbitrary width of the block, processing a set of instructions for each sub-block of the block having a well-defined width, the instructions optimized for the vector processing unit.
    Type: Grant
    Filed: May 29, 2019
    Date of Patent: May 24, 2022
    Inventors: Asaf Hargil, Ali Sazegari
  • Publication number: 20200356837
    Abstract: This application relates to performing fully-connected inferences using a convolutional neural network. A method includes receiving a two-dimensional input matrix that includes a plurality of elements. The method further includes identifying a two-dimensional weight matrix corresponding to the two-dimensional input matrix, where the two-dimensional weight matrix includes a plurality of weight values. The method further includes transposing a first column of the two-dimensional weight matrix and storing the transposed first column of the two-dimensional weight matrix in a first register having a first length corresponding to the transposed first column. The method further includes generating a first output element by performing a first dot product operation using a first row of the two-dimensional input matrix and the transposed first column. Finally, the method includes storing the first output element in a first row of a two-dimensional output matrix.
    Type: Application
    Filed: September 11, 2019
    Publication date: November 12, 2020
    Inventors: Asaf HARGIL, Ali SAZEGARI
  • Publication number: 20200356836
    Abstract: This application relates to classifying information using a fully-connected layer of a convolutional neural network. A method for classifying information using a fully-connected layer of a convolutional neural network includes calculating a first partial output for a first block of elements by performing a dot product operation using a first row of elements of the first block of elements and a first weight block, where the first row of elements of the first block of elements corresponds to a first batch of elements. The method further includes generating a first output element using the first partial output for the first block of elements and at least one other partial output corresponding to the first batch of elements.
    Type: Application
    Filed: September 11, 2019
    Publication date: November 12, 2020
    Inventors: Asaf HARGIL, Ali SAZEGARI
  • Publication number: 20200265106
    Abstract: This application relates to a multi-layer convolution operation. The multi-layer convolution operation is optimized for a vector processing unit having a number of data paths configured to operate on vector operands containing a number of elements processed in parallel by the data paths. The convolution operation specifies a convolution kernel utilized to filter a multi-channel input and generate a multi-channel output of the convolution operation. A number of threads are generated to process blocks of the multi-channel output, each block comprising a set of windows of a number of channels of the multi-channel output. Each window is a portion of the array of elements in a single layer of the multi-channel output. Each thread processes a block in accordance with an arbitrary width of the block, processing a set of instructions for each sub-block of the block having a well-defined width, the instructions optimized for the vector processing unit.
    Type: Application
    Filed: May 29, 2019
    Publication date: August 20, 2020
    Inventors: Asaf HARGIL, Ali SAZEGARI
  • Patent number: 9086872
    Abstract: Receiving an instruction indicating first and second operands. Each of the operands having packed data elements that correspond in respective positions. A first subset of the data elements of the first operand and a first subset of the data elements of the second operand each corresponding to a first lane. A second subset of the data elements of the first operand and a second subset of the data elements of the second operand each corresponding to a second lane. Storing result, in response to instruction, including: (1) in first lane, only lowest order data elements from first subset of first operand interleaved with corresponding lowest order data elements from first subset of second operand; and (2) in second lane, only highest order data elements from second subset of first operand interleaved with corresponding highest order data elements from second subset of second operand.
    Type: Grant
    Filed: June 30, 2009
    Date of Patent: July 21, 2015
    Assignee: Intel Corporation
    Inventors: Asaf Hargil, Doron Orenstein
  • Patent number: 9081562
    Abstract: Receiving an instruction indicating first and second operands. Each of the operands having packed data elements that correspond in respective positions. A first subset of the data elements of the first operand and a first subset of the data elements of the second operand each corresponding to a first lane. A second subset of the data elements of the first operand and a second subset of the data elements of the second operand each corresponding to a second lane. Storing result, in response to instruction, including: (1) in first lane, only lowest order data elements from first subset of first operand interleaved with corresponding lowest order data elements from first subset of second operand; and (2) in second lane, only highest order data elements from second subset of first operand interleaved with corresponding highest order data elements from second subset of second operand.
    Type: Grant
    Filed: March 15, 2013
    Date of Patent: July 14, 2015
    Assignee: Intel Corporation
    Inventors: Asaf Hargil, Doron Orenstein
  • Publication number: 20140189296
    Abstract: A loop remainder mask instruction indicates a current iteration count of a loop as a first operand, an iteration limit of a loop as a second operand, and a destination. The loop contains iterations and each iteration includes a data element of the array. A processor receives the loop remainder mask instruction, decodes the instruction for execution, and stores a result of the execution in the destination. The result indicates a number of data elements of the array past an end of a preceding portion of the array that are to be handled separately from the preceding portion, the end of the preceding portion being where the current iteration count is recorded.
    Type: Application
    Filed: December 14, 2011
    Publication date: July 3, 2014
    Inventors: Elmoustapha Ould-Ahmed-Vall, Robert Valentine, Jesus Corbal, Andrey Naraikin, Suleyman Sair, Asaf Hargil, Miland B. Girkar, Bret T. Toll, Mark J. Charney
  • Publication number: 20130232321
    Abstract: Receiving an instruction indicating first and second operands. Each of the operands having packed data elements that correspond in respective positions. A first subset of the data elements of the first operand and a first subset of the data elements of the second operand each corresponding to a first lane. A second subset of the data elements of the first operand and a second subset of the data elements of the second operand each corresponding to a second lane. Storing result, in response to instruction, including: (1) in first lane, only lowest order data elements from first subset of first operand interleaved with corresponding lowest order data elements from first subset of second operand; and (2) in second lane, only highest order data elements from second subset of first operand interleaved with corresponding highest order data elements from second subset of second operand.
    Type: Application
    Filed: March 15, 2013
    Publication date: September 5, 2013
    Inventors: Asaf Hargil, Doron Orenstein
  • Patent number: 8386547
    Abstract: A technique to accelerate range detection in a spline calcuation. In one embodiment, an instruction and corresponding logic are provided to perform range detection within a computer or processor.
    Type: Grant
    Filed: October 31, 2008
    Date of Patent: February 26, 2013
    Assignee: Intel Corporation
    Inventors: Asaf Hargil, Evgeny Fiksman, Artiom Myaskouvskey, Doron Orenstien
  • Publication number: 20100332794
    Abstract: Receiving an instruction indicating first and second operands. Each of the operands having packed data elements that correspond in respective positions. A first subset of the data elements of the first operand and a first subset of the data elements of the second operand each corresponding to a first lane. A second subset of the data elements of the first operand and a second subset of the data elements of the second operand each corresponding to a second lane. Storing result, in response to instruction, including: (1) in first lane, only lowest order data elements from first subset of first operand interleaved with corresponding lowest order data elements from first subset of second operand; and (2) in second lane, only highest order data elements from second subset of first operand interleaved with corresponding highest order data elements from second subset of second operand.
    Type: Application
    Filed: June 30, 2009
    Publication date: December 30, 2010
    Inventors: Asaf Hargil, Doron Orenstein
  • Publication number: 20100135417
    Abstract: A video processing device may comprise a video processing logic to control the enhancement operations performed on the video processing device. The video processing logic may determine a short term frame rate average value in response to receiving a plurality of video frames. Further, the video processing logic may generate a derivative of the short term frame rate using the short term frame rate value. The video processing logic may then activate monitoring of a processor usage if the derivative of the short term frame rate is below a first threshold value. The video processing logic may then reduce the performance of rendering of the plurality of video frames if a processor usage average value is above a second threshold. While restoring the performance, the video processing logic may restore the enhancement operations in steps after determining that processor resources are available.
    Type: Application
    Filed: December 2, 2008
    Publication date: June 3, 2010
    Inventor: Asaf Hargil
  • Publication number: 20100115014
    Abstract: A technique to accelerate range detection in a spline calcuation. In one embodiment, an instruction and corresponding logic are provided to perform range detection within a computer or processor.
    Type: Application
    Filed: October 31, 2008
    Publication date: May 6, 2010
    Inventors: Asaf Hargil, Evgeny Fiksman, Artiom Myaskouvskey, Doron Orenstien