Patents by Inventor Marc Lupon

Marc Lupon has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Publication number: 20170277658
    Abstract: An apparatus and method are described for distributed and cooperative computation in artificial neural networks. For example, one embodiment of an apparatus comprises: an input/output (I/O) interface; a plurality of processing units communicatively coupled to the I/O interface to receive data for input neurons and synaptic weights associated with each of the input neurons, each of the plurality of processing units to process at least a portion of the data for the input neurons and synaptic weights to generate partial results; and an interconnect communicatively coupling the plurality of processing units, each of the processing units to share the partial results with one or more other processing units over the interconnect, the other processing units using the partial results to generate additional partial results or final results. The processing units may share data including input neurons and weights over the shared input bus.
    Type: Application
    Filed: November 19, 2015
    Publication date: September 28, 2017
    Inventors: Frederico C. Pratas, Ayose J. Falcon, Marc Lupon, Fernando Latorre, Pedro Lopez, Enric Herrero Abellanas, Georgios Tournavitis
  • Publication number: 20170220524
    Abstract: Systems and methods for performing convolution operations. An example processing system comprises: a processing core; and a convolver unit to apply a convolution filter to a plurality of input data elements represented by a two-dimensional array, the convolver unit comprising a plurality of multipliers coupled to two or more sets of latches, wherein each set of latches is to store a plurality of data elements of a respective one-dimensional section of the two-dimensional array.
    Type: Application
    Filed: March 9, 2017
    Publication date: August 3, 2017
    Inventors: Enric Herrero Abellanas, Marc Lupon, Ayose J. Falcon, Frederico C. Pratas, Fernando Latorre, Pedro Lopez
  • Patent number: 9613001
    Abstract: Systems and methods for performing convolution operations. An example processing system comprises: a processing core; and a convolver unit to apply a convolution filter to a plurality of input data elements represented by a two-dimensional array, the convolver unit comprising a plurality of multipliers coupled to two or more sets of latches, wherein each set of latches is to store a plurality of data elements of a respective one-dimensional section of the two-dimensional array.
    Type: Grant
    Filed: December 20, 2013
    Date of Patent: April 4, 2017
    Assignee: Intel Corporation
    Inventors: Enric Herrero Abellanas, Marc Lupon, Ayose J. Falcon, Frederico C. Pratas, Fernando Latorre, Pedro Lopez
  • Publication number: 20170039033
    Abstract: Methods, apparatus, instructions and logic are disclosed providing double rounded combined floating-point multiply and add functionality as scalar or vector SIMD instructions or as fused micro-operations. Embodiments include detecting floating-point (FP) multiplication operations and subsequent FP operations specifying as source operands results of the FP multiplications. The FP multiplications and the subsequent FP operations are encoded as combined FP operations including rounding of the results of FP multiplication followed by the subsequent FP operations. The encoding of said combined FP operations may be stored and executed as part of an executable thread portion using fused-multiply-add hardware that includes overflow detection for the product of FP multipliers, first and second FP adders to add third operand addend mantissas and the products of the FP multipliers with different rounding inputs based on overflow, or no overflow, in the products of the FP multiplier.
    Type: Application
    Filed: October 24, 2016
    Publication date: February 9, 2017
    Inventors: Sridhar Samudrala, Grigorios Magklis, Marc Lupon, David R. Ditzel
  • Patent number: 9477441
    Abstract: Methods, apparatus, instructions and logic are disclosed providing double rounded combined floating-point multiply and add functionality as scalar or vector SIMD instructions or as fused micro-operations. Embodiments include detecting floating-point (FP) multiplication operations and subsequent FP operations specifying as source operands results of the FP multiplications. The FP multiplications and the subsequent FP operations are encoded as combined FP operations including rounding of the results of FP multiplication followed by the subsequent FP operations. The encoding of said combined FP operations may be stored and executed as part of an executable thread portion using fused-multiply-add hardware that includes overflow detection for the product of FP multipliers, first and second FP adders to add third operand addend mantissas and the products of the FP multipliers with different rounding inputs based on overflow, or no overflow, in the products of the FP multiplier.
    Type: Grant
    Filed: November 23, 2015
    Date of Patent: October 25, 2016
    Assignee: Intel Corporation
    Inventors: Sridhar Samudrala, Grigorios Magklis, Marc Lupon, David R. Ditzel
  • Patent number: 9389871
    Abstract: An error handling method includes identifying a code region eligible for cumulative multiply add (CMA) optimization and translating code region instructions into interpreter code instructions, which may include translating sequences of multiply add instructions in the code region instructions into fusion code including CMA instructions. Floating point (FP) exceptions generated by the fusion code may be monitored and at least a portion of the code region instructions may be re-translated to eliminate some or all fusion code if CMA intermediate rounding exceptions exceed a threshold.
    Type: Grant
    Filed: March 15, 2013
    Date of Patent: July 12, 2016
    Assignee: Intel Corporation
    Inventors: Marc Lupon, Grigorios Magklis, Sridhar Samudrala, Raul Martinez, Kyriakos A. Stavrou, Enric Gibert Codina
  • Publication number: 20160179434
    Abstract: A storage device and method are described for performing convolution operations. For example, one embodiment of an apparatus to perform convolution operations comprises: a plurality of processing units to execute convolution operations on input data and partial results; a unified scratchpad memory comprising a plurality of memory banks communicatively coupled to the plurality of processing units through a plurality of read/write ports, each of the plurality of memory banks partitioned to store both the input data and partial results; and a control unit to allocate the input data and partial results to the memory banks to ensure a minimum quality of service in accordance with the specified number of read/write ports and the specified convolution operation to be performed.
    Type: Application
    Filed: September 22, 2015
    Publication date: June 23, 2016
    Inventors: Enric Herrero Abellanas, Georgios Tournavitis, Frederico C. Pratas, Marc Lupon, Fernando Latorre, Pedro Lopez, Ayose J. Falcon
  • Patent number: 9329848
    Abstract: A mechanism is described for facilitating dynamic and efficient fusion of computing instructions according to one embodiment. A method of embodiments, as described herein, includes monitoring a software program for a program region having fusion candidate instructions for a fusion operation at a computing system; evaluating whether the macro operation of the candidate instructions is valuable to the software program; and performing the fusion operation if it is evaluated to be valuable.
    Type: Grant
    Filed: March 27, 2013
    Date of Patent: May 3, 2016
    Assignee: Intel Corporation
    Inventors: Marc Lupon, Raul Martinez, Enric Gibert Codina, Kyriakos A. Stavrou, Grigorios Magklis, Sridhar Samudrala
  • Publication number: 20160092222
    Abstract: A processor includes a front end, a decoder, an allocator, and a retirement unit. The decoder includes logic to identify an end-of-live-range (EOLR) indicator. The EOLR indicator specifies an architectural register and a location in code for which the architectural register is unused. The allocator includes logic to scan for a mapping of the architectural register to a physical register, based upon the EOLR indicator. The allocator also includes logic to generate a request to disassociate the architectural register from the physical register. The retirement unit includes logic to disassociate the architectural register from the physical register.
    Type: Application
    Filed: September 25, 2014
    Publication date: March 31, 2016
    Inventors: David Pardo Keppel, Denis M. Khartikov, Fernando Latorre, Marc Lupon, Grigorios Magklis, Naveen Neelakantam, Georgios Tournavitis, Polychronis Xekalakis
  • Publication number: 20160077802
    Abstract: Methods, apparatus, instructions and logic are disclosed providing double rounded combined floating-point multiply and add functionality as scalar or vector SIMD instructions or as fused micro-operations. Embodiments include detecting floating-point (FP) multiplication operations and subsequent FP operations specifying as source operands results of the FP multiplications. The FP multiplications and the subsequent FP operations are encoded as combined FP operations including rounding of the results of FP multiplication followed by the subsequent FP operations. The encoding of said combined FP operations may be stored and executed as part of an executable thread portion using fused-multiply-add hardware that includes overflow detection for the product of FP multipliers, first and second FP adders to add third operand addend mantissas and the products of the FP multipliers with different rounding inputs based on overflow, or no overflow, in the products of the FP multiplier.
    Type: Application
    Filed: November 23, 2015
    Publication date: March 17, 2016
    Inventors: Sridhar Samudrala, Grigorios Magklis, Marc Lupon, David R. Ditzel
  • Publication number: 20160026912
    Abstract: A processor includes a processor core and a calculation circuit. The processor core includes logic to determine a set of weights for use in a convolutional neural network (CNN) calculation and scale up the weights using a scale value. The calculation circuit includes logic to receive the scale value, the set of weights, and a set of input values, wherein each input value and associated weight are of a same fixed size. The calculation circuit also includes logic to determine results from convolutional neural network (CNN) calculations based upon the set of weights applied to the set of input values, scale down the results using the scale value, truncate the scaled down results to the fixed size, and communicatively couple the truncated results to an output for a layer of the CNN.
    Type: Application
    Filed: July 22, 2014
    Publication date: January 28, 2016
    Inventors: Ayose J. Falcon, Marc Lupon, Enric Herrero Abellanas, Pedro Lopez, Fernando Latorre, Frederico C. Pratas, Georgios Tournavitis
  • Patent number: 9213523
    Abstract: Methods, apparatus, instructions and logic are disclosed providing double rounded combined floating-point multiply and add functionality as scalar or vector SIMD instructions or as fused micro-operations. Embodiments include detecting floating-point (FP) multiplication operations and subsequent FP operations specifying as source operands results of the FP multiplications. The FP multiplications and the subsequent FP operations are encoded as combined FP operations including rounding of the results of FP multiplication followed by the subsequent FP operations. The encoding of said combined FP operations may be stored and executed as part of an executable thread portion using fused-multiply-add hardware that includes overflow detection for the product of FP multipliers, first and second FP adders to add third operand addend mantissas and the products of the FP multipliers with different rounding inputs based on overflow, or no overflow, in the products of the FP multiplier.
    Type: Grant
    Filed: June 29, 2012
    Date of Patent: December 15, 2015
    Assignee: Intel Corporation
    Inventors: Sridhar Samudrala, Grigorios Magklis, Marc Lupon, David R. Ditzel
  • Patent number: 9116719
    Abstract: Described herein are technologies for optimizing computer code. A code generator can optimize a portion of original code to create optimized code. The code generator can create a partial commit point to indicate that execution of the optimized code produces an invalid architectural state. The code generator can create recovery information to recover a valid architectural state at a recovery point. The code generator can associate the partial commit point and recovery information with the optimized code.
    Type: Grant
    Filed: June 27, 2013
    Date of Patent: August 25, 2015
    Assignee: Intel Corporation
    Inventors: Raul Martinez, Enric Gibert Codina, Marc Lupon, Kyriakos A. Stavrou
  • Publication number: 20150178246
    Abstract: Systems and methods for performing convolution operations. An example processing system comprises: a processing core; and a convolver unit to apply a convolution filter to a plurality of input data elements represented by a two-dimensional array, the convolver unit comprising a plurality of multipliers coupled to two or more sets of latches, wherein each set of latches is to store a plurality of data elements of a respective one-dimensional section of the two-dimensional array.
    Type: Application
    Filed: December 20, 2013
    Publication date: June 25, 2015
    Inventors: Enric Herrero Abellanas, Marc Lupon, Ayose J. Falcon, Frederico C. Pratas, Fernando Latorre, Pedro Lopez
  • Publication number: 20150170021
    Abstract: A processing device includes a processor core and a number of calculation modules, each of which is configurable to perform any one of the operations for a convolutional neural network system. A first set of the calculation modules are configured to perform convolution operations, a second set of the calculation modules are reconfigured to perform averaging operations, and a third set of the calculation modules are reconfigured to perform dot product operations.
    Type: Application
    Filed: December 18, 2013
    Publication date: June 18, 2015
    Inventors: Marc Lupon, Enric Herrero Abellanas, Ayose Falcon, Fernando Latorre, Pedro Lopez, Frederico Pratas
  • Publication number: 20150026671
    Abstract: A mechanism is described for facilitating dynamic and efficient fusion of computing instructions according to one embodiment. A method of embodiments, as described herein, includes monitoring a software program for a program region having fusion candidate instructions for a fusion operation at a computing system; evaluating whether the macro operation of the candidate instructions is valuable to the software program; and performing the fusion operation if it is evaluated to be valuable.
    Type: Application
    Filed: March 27, 2013
    Publication date: January 22, 2015
    Inventors: Marc Lupon, Raul Martinez, Enric Gibert Codina, Kyriakos A. Stavrou, Grigorios Magklis, Sridhar Samudrala
  • Publication number: 20150007153
    Abstract: Described herein are technologies for optimizing computer code. A code generator can optimize a portion of original code to create optimized code. The code generator can create a partial commit point to indicate that execution of the optimized code produces an invalid architectural state. The code generator can create recovery information to recover a valid architectural state at a recovery point. The code generator can associate the partial commit point and recovery information with the optimized code.
    Type: Application
    Filed: June 27, 2013
    Publication date: January 1, 2015
    Inventors: Raul Martinez, Enric Gibert Codina, Marc Lupon, Kyriakos A. Stavrou
  • Publication number: 20140281419
    Abstract: An error handling method includes identifying a code region eligible for cumulative multiply add (CMA) optimization and translating code region instructions into interpreter code instructions, which may include translating sequences of multiply add instructions in the code region instructions into fusion code including CMA instructions. Floating point (FP) exceptions generated by the fusion code may be monitored and at least a portion of the code region instructions may be re-translated to eliminate some or all fusion code if CMA intermediate rounding exceptions exceed a threshold.
    Type: Application
    Filed: March 15, 2013
    Publication date: September 18, 2014
    Applicant: Intel Corporation
    Inventors: Marc Lupon, Grigorios Magklis, Sridhar Samudrala, Raul Martinez, Kyriakos A. Stavrou, Enric Gibert Codina
  • Publication number: 20140156976
    Abstract: Techniques and mechanisms for a processor to determine whether a commit action is to be performed. In an embodiment, a processor performs operations to determine whether a commit instruction is for contingent performance of a commit action. In another embodiment, one or more conditions of processor state are evaluated in response to determining that the commit instruction is for contingent performance of the commit action, where the evaluation is performed to determine whether the commit action indicated by the commit instruction is to be performed.
    Type: Application
    Filed: December 22, 2011
    Publication date: June 5, 2014
    Inventors: Enric Gibert Codina, Josep M. Codina, Fernando Latorre, Pedro Marcuello, Pedro Lopez, Crispin Gomez Requena, Antonio Gonzalez, Mirem Hyuseinova, Christos E. Kotselidis, Marc Lupon, Carlos Madriles Gimeno, Grigorios Magklis, Alejandro Martinez Vicente, Raul Martinez, Daniel Ortega, Demos Pavlou, Kyriakos A. Stavrou, Georgios Tournavitis, Polychronis Xekalakis
  • Publication number: 20140095849
    Abstract: A computer-readable storage medium, method and system for optimization-level aware branch prediction are described. A gear level is assigned to a set of application instructions that have been optimized. The gear level is also stored in a register of a branch prediction unit of a processor. Branch prediction is then performed by the processor based upon the gear level.
    Type: Application
    Filed: September 28, 2012
    Publication date: April 3, 2014
    Inventors: Polychronis Xekalakis, Pedro Marcuello, Alejandro Vicente Martinez, Christos E. Kotselidis, Grigorios Magklis, Fernando Latorre, Raul Martinez, Josep M. Codina, Enric Gibert Codina, Crispin Gomez Requena, Antonio Gonzalez, Mirem Hyuseinova, Pedro Lopez, Marc Lupon, Carlos Madriles, Daniel Ortega, Demos Pavlou, Kyriakos A. Stavrou, Georgios Tournavitis
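
Several abstracts in this listing (for example patent 9213523) refer to "double rounded" floating-point multiply and add, where the product is rounded before the addition, as opposed to a fused operation that rounds only once. The Python sketch below illustrates only that numeric difference. It is not taken from the patents and does not model the claimed hardware or instruction encoding; the function names and the input values are assumptions chosen for illustration, and the single-rounded result is emulated with exact rational arithmetic.

```python
from fractions import Fraction


def double_rounded_multiply_add(a: float, b: float, c: float) -> float:
    """Round the product to a double, then round the sum (two roundings)."""
    product = a * b          # first rounding
    return product + c       # second rounding


def single_rounded_multiply_add(a: float, b: float, c: float) -> float:
    """Compute a*b + c exactly with rationals, then round once to a double."""
    exact = Fraction(a) * Fraction(b) + Fraction(c)
    return float(exact)      # the only rounding


# Hypothetical inputs: the exact product is 1 - 2**-54, which lies halfway
# between two doubles and rounds to 1.0 under round-to-nearest-even.
a = 1.0 + 2.0 ** -27
b = 1.0 - 2.0 ** -27
c = -1.0

print(double_rounded_multiply_add(a, b, c))                    # 0.0
print(single_rounded_multiply_add(a, b, c) == -(2.0 ** -54))   # True
```

With these inputs the intermediate rounding discards the low-order part of the product, so the two approaches return different results. This is the kind of discrepancy that matters when code written for separate multiply and add instructions is run on fused hardware.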