Patents by Inventor Michael Behar

Michael Behar has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 10210137
    Abstract: A processor, including: decode circuitry to decode instructions; a data cache unit including circuitry to cache data for the processor; and an approximate matrix multiplication (AMM) circuit including: a data receptor circuit to receive a weight vector w and an input vector x, both of size N, and a compression regulating parameter n; a factorizer circuit to factorize w into w?B·s, by computing a binary factorized matrix B of size N×n, and a dictionary vector s of size n; and a binary multiplier circuit to compute w^T x?(B·s)^T x=s^T(B^T x), the binary multiplier circuit comprising a hardware accelerator circuit to compute an array product B^T x).
    Type: Grant
    Filed: June 28, 2017
    Date of Patent: February 19, 2019
    Assignee: Intel Corporation
    Inventors: Ehud Cohen, Daniel David Ben-Dayan Rubin, Michael Behar, Dmitri Vainbrand
  • Publication number: 20190004997
    Abstract: A processor, including: decode circuitry to decode instructions; a data cache unit including circuitry to cache data for the processor; and an approximate matrix multiplication (AMM) circuit including: a data receptor circuit to receive a weight vector w and an input vector x, both of size N, and a compression regulating parameter n; a factorizer circuit to factorize w into w?B·s, by computing a binary factorized matrix B of size N×n, and a dictionary vector s of size n; and a binary multiplier circuit to compute w?T x?(B·s)?T x=s?T (B)?T x), the binary multiplier circuit comprising a hardware accelerator circuit to compute an array product B?T x).
    Type: Application
    Filed: June 28, 2017
    Publication date: January 3, 2019
    Applicant: Intel Corporation
    Inventors: Ehud Cohen, Daniel David Ben-Dayan Rubin, Michael Behar, Dmitri Vainbrand
  • Publication number: 20180314492
    Abstract: In an example, an apparatus comprises a plurality of execution units and logic, at least partially including hardware logic, to gate at least one of a multiply unit or an accumulate unit in response to an input of value zero. Other embodiments are also disclosed and claimed.
    Type: Application
    Filed: April 28, 2017
    Publication date: November 1, 2018
    Applicant: Intel Corporation
    Inventors: Yaniv Fais, Tomer Bar-On, Jacob Subag, Jeremie Dreyfuss, Lev Faivishevsky, Michael Behar, Amit Bleiweiss, Guy Jacob, Gal Leibovich, Itamar Ben-Ari, Galina Ryvchin, Eyal Yaacoby
  • Publication number: 20180307987
    Abstract: In an example, an apparatus comprises at least one execution platform; and logic, at least partially including hardware logic, to receive a trained neural network model in a model optimizer and convert the trained neural network model to an optimized model comprising parameters that are fit to the at least one execution platform. Other embodiments are also disclosed and claimed.
    Type: Application
    Filed: April 24, 2017
    Publication date: October 25, 2018
    Applicant: Intel Corporation
    Inventors: Amit Bleiweiss, Itamar Ben-Ari, Michael Behar, Guy Jacob, Gal Leibovich, Jacob Subag, Lev Faivishevsky, Yaniv Fais, Tomer Schwartz
  • Publication number: 20180293777
    Abstract: In an example, an apparatus comprises a plurality of execution units; and logic, at least partially including hardware logic, to determine a sub-graph of a network that can be executed in a frequency domain and apply computations in the sub-graph in the frequency domain. Other embodiments are also disclosed and claimed.
    Type: Application
    Filed: April 8, 2017
    Publication date: October 11, 2018
    Applicant: Intel Corporation
    Inventors: Uzi Sarel, Ehud Cohen, Tomer Schwartz, Amitai Armon, Yahav Shadmiy, Itamar Ben-Ari, Amit Bleiweiss, Lev Faivishevsky, Tomer Bar-On, Yaniv Fais, Jacob Subag, Michael Behar, Guy Jacob, Gal Leibovich, Jeremie Dreyfuss
  • Publication number: 20180189981
    Abstract: Embodiments described herein provide a processing apparatus comprising compute logic to generate output feature map data for a convolutional neural network (CNN) and write the feature map data to a memory buffer; a direct memory access (DMA) controller including a feature map encoder, the DMA controller to read the feature map data from the memory buffer, encode the feature map data using one of multiple encode algorithms, and write encoded feature map data to memory coupled with the processing apparatus; and wherein the compute logic is to read the encoded feature map data from the memory in an encoded format and decode the encoded feature map data while reading the encoded feature map data.
    Type: Application
    Filed: December 30, 2016
    Publication date: July 5, 2018
    Inventors: AJIT SINGH, BHARAT DAGA, OREN AGAM, MICHAEL BEHAR, DMITRI VAINBRAND
  • Patent number: 9875213
    Abstract: Instructions and logic provide SIMD vector packed histogram functionality. Some processor embodiments include first and second registers storing, in each of a plurality of data fields of a register lane portion, corresponding elements of a first and of a second data type, respectively. A decode stage decodes an instruction for SIMD vector packed histograms. One or more execution units, compare each element of the first data type, in the first register lane portion, with a range specified by the instruction. For any elements of the first register portion in said range, corresponding elements of the second data type, from the second register portion, are added into one of a plurality data fields of a destination register lane portion, selected according to the value of its corresponding element of the first data type, to generate packed weighted histograms for each destination register lane portion.
    Type: Grant
    Filed: June 26, 2015
    Date of Patent: January 23, 2018
    Assignee: Intel Corporation
    Inventors: Edward T. Grochowski, Galina Ryvchin, Michael Behar
  • Publication number: 20160378716
    Abstract: Instructions and logic provide SIMD vector packed histogram functionality. Some processor embodiments include first and second registers storing, in each of a plurality of data fields of a register lane portion, corresponding elements of a first and of a second data type, respectively. A decode stage decodes an instruction for SIMD vector packed histograms. One or more execution units, compare each element of the first data type, in the first register lane portion, with a range specified by the instruction. For any elements of the first register portion in said range, corresponding elements of the second data type, from the second register portion, are added into one of a plurality data fields of a destination register lane portion, selected according to the value of its corresponding element of the first data type, to generate packed weighted histograms for each destination register lane portion.
    Type: Application
    Filed: June 26, 2015
    Publication date: December 29, 2016
    Inventors: Edward T. Grochowski, Galina Ryvchin, Michael Behar
  • Patent number: 9189398
    Abstract: A processor is described comprising: an architectural register file implemented as a combination of a register file cache and an architectural register region within a level 1 (L1) data cache, and a data location table (DLT) to store data indicating a location of each architectural register within the register file cache and/or the architectural register region within the L1 data cache.
    Type: Grant
    Filed: December 28, 2012
    Date of Patent: November 17, 2015
    Assignee: INTEL CORPORATION
    Inventors: Ilan Pardo, Michael Behar, Oren Ben-Kiki, Dror Markovich
  • Publication number: 20140189191
    Abstract: A processor is described comprising: an architectural register file implemented as a combination of a register file cache and an architectural register region within a level 1 (L1) data cache, and a data location table (DLT) to store data indicating a location of each architectural register within the register file cache and/or the architectural register region within the L1 data cache.
    Type: Application
    Filed: December 28, 2012
    Publication date: July 3, 2014
    Inventors: Ilan Pardo, Michael Behar, Oren Ben-Kiki, Dror Markovich