Patents by Inventor Marc Lupon

Marc Lupon has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

METHOD AND APPARATUS FOR DISTRIBUTED AND COOPERATIVE COMPUTATION IN ARTIFICIAL NEURAL NETWORKS

Publication number: 20170277658

Abstract: An apparatus and method are described for distributed and cooperative computation in artificial neural networks. For example, one embodiment of an apparatus comprises: an input/output (I/O) interface; a plurality of processing units communicatively coupled to the I/O interface to receive data for input neurons and synaptic weights associated with each of the input neurons, each of the plurality of processing units to process at least a portion of the data for the input neurons and synaptic weights to generate partial results; and an interconnect communicatively coupling the plurality of processing units, each of the processing units to share the partial results with one or more other processing units over the interconnect, the other processing units using the partial results to generate additional partial results or final results. The processing units may share data including input neurons and weights over the shared input bus.

Type: Application

Filed: November 19, 2015

Publication date: September 28, 2017

Inventors: Frederico C. Pratas, Ayose J. Falcon, Marc Lupon, Fernando Latorre, Pedro Lopez, Enric Herrero Abellanas, Georgios Tournavitis
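As a rough illustration of the cooperative scheme the abstract above describes, the following sketch splits one neuron's inputs and synaptic weights across several units, has each unit produce a partial result, and then combines the partial results. The function names, the even chunking policy, and the final summation step are assumptions made for illustration; they are not taken from the application.

```python
def partial_result(inputs, weights):
    """One processing unit: multiply-accumulate over its share of the data."""
    return sum(x * w for x, w in zip(inputs, weights))


def neuron_output(inputs, weights, num_units):
    """Split the work across num_units units, then combine their partial results.

    The chunking policy is a hypothetical choice; the combining step stands in
    for sharing partial results over the interconnect.
    """
    if len(inputs) != len(weights):
        raise ValueError("inputs and weights must have the same length")
    # Ceiling division, with a floor of 1 so an empty input list is handled.
    chunk = max(1, -(-len(inputs) // num_units))
    partials = [
        partial_result(inputs[i:i + chunk], weights[i:i + chunk])
        for i in range(0, len(inputs), chunk)
    ]
    return sum(partials)


# Two units: the first handles inputs [1, 2], the second handles [3, 4].
total = neuron_output([1, 2, 3, 4], [1, 1, 2, 2], num_units=2)
```

With these values the two partial results are 3 and 14, so `total` is 17, the same as a single-unit dot product.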
PROCESSING DEVICE FOR PERFORMING CONVOLUTION OPERATIONS

Publication number: 20170220524

Abstract: Systems and methods for performing convolution operations. An example processing system comprises: a processing core; and a convolver unit to apply a convolution filter to a plurality of input data elements represented by a two-dimensional array, the convolver unit comprising a plurality of multipliers coupled to two or more sets of latches, wherein each set of latches is to store a plurality of data elements of a respective one-dimensional section of the two-dimensional array.

Type: Application

Filed: March 9, 2017

Publication date: August 3, 2017

Inventors: Enric Herrero Abellanas, Marc Lupon, Ayose J. Falcon, Frederico C. Pratas, Fernando Latorre, Pedro Lopez
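To make the arithmetic in the abstract above concrete, here is a minimal software sketch of applying a filter to a two-dimensional array, in which each row buffer stands in for one set of latches holding a one-dimensional section of the input. The function name and the buffering scheme are assumptions for illustration only, and the sketch computes the unflipped (cross-correlation) form commonly used in neural networks; it does not model the hardware described in the application.

```python
def convolve2d_rows(image, kernel):
    """Apply kernel to image ("valid" region only, no kernel flip)."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    output = []
    for r in range(out_h):
        # One buffer per kernel row; each holds one row of the input array.
        row_buffers = [image[r + i] for i in range(kh)]
        out_row = []
        for c in range(out_w):
            acc = 0
            for i in range(kh):
                for j in range(kw):
                    acc += row_buffers[i][c + j] * kernel[i][j]
            out_row.append(acc)
        output.append(out_row)
    return output


image = [[1, 2, 3],
         [4, 5, 6],
         [7, 8, 9]]
kernel = [[1, 0],
          [0, 1]]
result = convolve2d_rows(image, kernel)  # [[6, 8], [12, 14]]
```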
Processing device for performing convolution operations

Patent number: 9613001

Abstract: Systems and methods for performing convolution operations. An example processing system comprises: a processing core; and a convolver unit to apply a convolution filter to a plurality of input data elements represented by a two-dimensional array, the convolver unit comprising a plurality of multipliers coupled to two or more sets of latches, wherein each set of latches is to store a plurality of data elements of a respective one-dimensional section of the two-dimensional array.

Type: Grant

Filed: December 20, 2013

Date of Patent: April 4, 2017

Assignee: Intel Corporation

Inventors: Enric Herrero Abellanas, Marc Lupon, Ayose J. Falcon, Frederico C. Pratas, Fernando Latorre, Pedro Lopez
DOUBLE ROUNDED COMBINED FLOATING-POINT MULTIPLY AND ADD

Publication number: 20170039033

Abstract: Methods, apparatus, instructions and logic are disclosed providing double rounded combined floating-point multiply and add functionality as scalar or vector SIMD instructions or as fused micro-operations. Embodiments include detecting floating-point (FP) multiplication operations and subsequent FP operations specifying as source operands results of the FP multiplications. The FP multiplications and the subsequent FP operations are encoded as combined FP operations including rounding of the results of FP multiplication followed by the subsequent FP operations. The encoding of said combined FP operations may be stored and executed as part of an executable thread portion using fused-multiply-add hardware that includes overflow detection for the product of FP multipliers, first and second FP adders to add third operand addend mantissas and the products of the FP multipliers with different rounding inputs based on overflow, or no overflow, in the products of the FP multiplier.

Type: Application

Filed: October 24, 2016

Publication date: February 9, 2017

Inventors: Sridhar Samudrala, Grigorios Magklis, Marc Lupon, David R. Ditzel
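The distinction behind "double rounded" in the abstract above is that a separate multiply followed by an add rounds twice, whereas a conventional fused multiply-add rounds once. The sketch below shows the numeric difference in software, using exact rational arithmetic to emulate the single-rounding case. The function names and the chosen operands are illustrative assumptions, and nothing here models the hardware or instruction encoding described in the application.

```python
from fractions import Fraction


def double_rounded_mul_add(a, b, c):
    """Round the product to a double, then round the sum (two roundings)."""
    product = a * b
    return product + c


def single_rounded_mul_add(a, b, c):
    """Compute a*b + c exactly, then round once (fused behaviour, emulated)."""
    exact = Fraction(a) * Fraction(b) + Fraction(c)
    return float(exact)


a = 1.0 + 2.0 ** -27
b = 1.0 - 2.0 ** -27
c = -1.0

twice = double_rounded_mul_add(a, b, c)   # 0.0
once = single_rounded_mul_add(a, b, c)    # -(2.0 ** -54)
```

Here the exact product is 1 - 2^-54, which lies halfway between two adjacent doubles and rounds to 1.0, so the double-rounded result is 0.0 while the single-rounded result keeps the -2^-54 term.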
Double rounded combined floating-point multiply and add

Patent number: 9477441

Abstract: Methods, apparatus, instructions and logic are disclosed providing double rounded combined floating-point multiply and add functionality as scalar or vector SIMD instructions or as fused micro-operations. Embodiments include detecting floating-point (FP) multiplication operations and subsequent FP operations specifying as source operands results of the FP multiplications. The FP multiplications and the subsequent FP operations are encoded as combined FP operations including rounding of the results of FP multiplication followed by the subsequent FP operations. The encoding of said combined FP operations may be stored and executed as part of an executable thread portion using fused-multiply-add hardware that includes overflow detection for the product of FP multipliers, first and second FP adders to add third operand addend mantissas and the products of the FP multipliers with different rounding inputs based on overflow, or no overflow, in the products of the FP multiplier.

Type: Grant

Filed: November 23, 2015

Date of Patent: October 25, 2016

Assignee: Intel Corporation

Inventors: Sridhar Samudrala, Grigorios Magklis, Marc Lupon, David R. Ditzel
Combined floating point multiplier adder with intermediate rounding logic

Patent number: 9389871

Abstract: An error handling method includes identifying a code region eligible for cumulative multiply add (CMA) optimization and translating code region instructions into interpreter code instructions, which may include translating sequences of multiply add instructions in the code region instructions into fusion code including CMA instructions. Floating point (FP) exceptions generated by the fusion code may be monitored and at least a portion of the code region instructions may be re-translated to eliminate some or all fusion code if CMA intermediate rounding exceptions exceed a threshold.

Type: Grant

Filed: March 15, 2013

Date of Patent: July 12, 2016

Assignee: Intel Corporation

Inventors: Marc Lupon, Grigorios Magklis, Sridhar Samudrala, Raul Martinez, Kyriakos A. Stavrou, Enric Gibert Codina
STORAGE DEVICE AND METHOD FOR PERFORMING CONVOLUTION OPERATIONS

Publication number: 20160179434

Abstract: A storage device and method are described for performing convolution operations. For example, one embodiment of an apparatus to perform convolution operations comprises: a plurality of processing units to execute convolution operations on input data and partial results; a unified scratchpad memory comprising a plurality of memory banks communicatively coupled to the plurality of processing units through a plurality of read/write ports, each of the plurality of memory banks partitioned to store both the input data and partial results; and a control unit to allocate the input data and partial results to the memory banks to ensure a minimum quality of service in accordance with the specified number of read/write ports and the specified convolution operation to be performed.

Type: Application

Filed: September 22, 2015

Publication date: June 23, 2016

Inventors: Enric Herrero Abellanas, Georgios Tournavitis, Frederico C. Pratas, Marc Lupon, Fernando Latorre, Pedro Lopez, Ayose J. Falcon
Mechanism for facilitating dynamic and efficient fusion of computing instructions in software programs

Patent number: 9329848

Abstract: A mechanism is described for facilitating dynamic and efficient fusion of computing instructions according to one embodiment. A method of embodiments, as described herein, includes monitoring a software program for a program region having fusion candidate instructions for a fusion operation at a computing system; evaluating whether the macro operation of the candidate instructions is valuable to the software program; and performing the fusion operation if it is evaluated to be valuable.

Type: Grant

Filed: March 27, 2013

Date of Patent: May 3, 2016

Assignee: Intel Corporation

Inventors: Marc Lupon, Raul Martinez, Enric Gibert Codina, Kyriakos A. Stavrou, Grigorios Magklis, Sridhar Samudrala
INSTRUCTION AND LOGIC FOR BULK REGISTER RECLAMATION

Publication number: 20160092222

Abstract: A processor includes a front end, a decoder, an allocator, and a retirement unit. The decoder includes logic to identify an end-of-live-range (EOLR) indicator. The EOLR indicator specifies an architectural register and a location in code for which the architectural register is unused. The allocator includes logic to scan for a mapping of the architectural register to a physical register, based upon the EOLR indicator. The allocator also includes logic to generate a request to disassociate the architectural register from the physical register. The retirement unit includes logic to disassociate the architectural register from the physical register.

Type: Application

Filed: September 25, 2014

Publication date: March 31, 2016

Inventors: David Pardo Keppel, Denis M. Khartikov, Fernando LaTorre, Marc Lupon, Grigorios Magklis, Naveen Neelakantam, Georgios Tournavitis, Polychronis Xekalakis
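The register-reclamation idea in the abstract above can be pictured with a toy rename table: when an end-of-live-range indicator names architectural registers that are no longer used, their physical registers are disassociated and returned to the free pool. The class, its method names, and the free-list policy are hypothetical simplifications for illustration; they are not the structures described in the application.

```python
class RenameTable:
    """Toy architectural-to-physical register map (hypothetical model)."""

    def __init__(self, num_physical):
        self.free_list = list(range(num_physical))
        self.mapping = {}

    def allocate(self, arch_reg):
        """Map arch_reg to a free physical register and return its index."""
        if not self.free_list:
            raise RuntimeError("no free physical registers")
        old = self.mapping.get(arch_reg)
        phys = self.free_list.pop(0)
        self.mapping[arch_reg] = phys
        if old is not None:
            # Simplification: release the previous mapping immediately.
            self.free_list.append(old)
        return phys

    def reclaim(self, arch_regs):
        """Disassociate each listed architectural register; return freed indices."""
        freed = []
        for arch_reg in arch_regs:
            phys = self.mapping.pop(arch_reg, None)
            if phys is not None:
                self.free_list.append(phys)
                freed.append(phys)
        return freed


table = RenameTable(num_physical=2)
table.allocate("r1")
table.allocate("r2")
freed = table.reclaim(["r1"])   # r1 is past its live range
reused = table.allocate("r3")   # succeeds only because r1 was reclaimed
```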
DOUBLE ROUNDED COMBINED FLOATING-POINT MULTIPLY AND ADD

Publication number: 20160077802

Abstract: Methods, apparatus, instructions and logic are disclosed providing double rounded combined floating-point multiply and add functionality as scalar or vector SIMD instructions or as fused micro-operations. Embodiments include detecting floating-point (FP) multiplication operations and subsequent FP operations specifying as source operands results of the FP multiplications. The FP multiplications and the subsequent FP operations are encoded as combined FP operations including rounding of the results of FP multiplication followed by the subsequent FP operations. The encoding of said combined FP operations may be stored and executed as part of an executable thread portion using fused-multiply-add hardware that includes overflow detection for the product of FP multipliers, first and second FP adders to add third operand addend mantissas and the products of the FP multipliers with different rounding inputs based on overflow, or no overflow, in the products of the FP multiplier.

Type: Application

Filed: November 23, 2015

Publication date: March 17, 2016

Inventors: Sridhar Samudrala, Grigorios Magklis, Marc Lupon, David R. Ditzel
WEIGHT-SHIFTING MECHANISM FOR CONVOLUTIONAL NEURAL NETWORKS

Publication number: 20160026912

Abstract: A processor includes a processor core and a calculation circuit. The processor core includes logic to determine a set of weights for use in a convolutional neural network (CNN) calculation and scale up the weights using a scale value. The calculation circuit includes logic to receive the scale value, the set of weights, and a set of input values, wherein each input value and associated weight are of a same fixed size. The calculation circuit also includes logic to determine results from CNN calculations based upon the set of weights applied to the set of input values, scale down the results using the scale value, truncate the scaled down results to the fixed size, and communicatively couple the truncated results to an output for a layer of the CNN.

Type: Application

Filed: July 22, 2014

Publication date: January 28, 2016

Inventors: Ayose J. Falcon, Marc Lupon, Enric Herrero Abellanas, Pedro Lopez, Fernando Latorre, Frederico C. Pratas, Georgios Tournavitis
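The scale-up, compute, scale-down, truncate sequence in the abstract above resembles ordinary fixed-point arithmetic, which the sketch below illustrates. The power-of-two scale expressed as a shift, the 8-bit signed result size, and the use of saturation to fit the result into that size are all assumptions made for illustration; the application may define these steps differently.

```python
def scale_weights(weights, shift):
    """Scale fractional weights up to integers by a factor of 2**shift."""
    return [int(round(w * (1 << shift))) for w in weights]


def fixed_point_dot(inputs, scaled_weights, shift, bits=8):
    """Integer multiply-accumulate, scale down, then fit into a signed field."""
    acc = sum(x * w for x, w in zip(inputs, scaled_weights))
    scaled_down = acc >> shift  # arithmetic shift undoes the scale-up
    lo = -(1 << (bits - 1))
    hi = (1 << (bits - 1)) - 1
    # Assumption: saturate rather than wrap when the value does not fit.
    return max(lo, min(hi, scaled_down))


weights = [0.5, -0.25, 0.125]
scaled = scale_weights(weights, shift=4)              # [8, -4, 2]
result = fixed_point_dot([10, 20, 40], scaled, shift=4)  # 5
```

The result matches the floating-point computation 10(0.5) + 20(-0.25) + 40(0.125) = 5, while every intermediate value stays an integer.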
Double rounded combined floating-point multiply and add

Patent number: 9213523

Abstract: Methods, apparatus, instructions and logic are disclosed providing double rounded combined floating-point multiply and add functionality as scalar or vector SIMD instructions or as fused micro-operations. Embodiments include detecting floating-point (FP) multiplication operations and subsequent FP operations specifying as source operands results of the FP multiplications. The FP multiplications and the subsequent FP operations are encoded as combined FP operations including rounding of the results of FP multiplication followed by the subsequent FP operations. The encoding of said combined FP operations may be stored and executed as part of an executable thread portion using fused-multiply-add hardware that includes overflow detection for the product of FP multipliers, first and second FP adders to add third operand addend mantissas and the products of the FP multipliers with different rounding inputs based on overflow, or no overflow, in the products of the FP multiplier.

Type: Grant

Filed: June 29, 2012

Date of Patent: December 15, 2015

Assignee: Intel Corporation

Inventors: Sridhar Samudrala, Grigorios Magklis, Marc Lupon, David R. Ditzel
Partial commits in dynamic binary translation based systems

Patent number: 9116719

Abstract: Described herein are technologies for optimizing computer code. A code generator can optimize a portion of original code to create optimized code. The code generator can create a partial commit point to indicate that execution of the optimized code produces an invalid architectural state. The code generator can create recovery information to recover a valid architectural state at a recovery point. The code generator can associate the partial commit point and recovery information with the optimized code.

Type: Grant

Filed: June 27, 2013

Date of Patent: August 25, 2015

Assignee: Intel Corporation

Inventors: Raul Martinez, Enric Gibert Codina, Marc Lupon, Kyriakos A. Stavrou
PROCESSING DEVICE FOR PERFORMING CONVOLUTION OPERATIONS

Publication number: 20150178246

Abstract: Systems and methods for performing convolution operations. An example processing system comprises: a processing core; and a convolver unit to apply a convolution filter to a plurality of input data elements represented by a two-dimensional array, the convolver unit comprising a plurality of multipliers coupled to two or more sets of latches, wherein each set of latches is to store a plurality of data elements of a respective one-dimensional section of the two-dimensional array.

Type: Application

Filed: December 20, 2013

Publication date: June 25, 2015

Inventors: Enric Herrero Abellanas, Marc Lupon, Ayose J. Falcon, Frederico C. Pratas, Fernando Latorre, Pedro Lopez
RECONFIGURABLE PROCESSING UNIT

Publication number: 20150170021

Abstract: A processing device includes a processor core and a number of calculation modules, each of which is configurable to perform any one of the operations for a convolutional neural network system. A first set of the calculation modules are configured to perform convolution operations, a second set of the calculation modules are reconfigured to perform averaging operations, and a third set of the calculation modules are reconfigured to perform dot product operations.

Type: Application

Filed: December 18, 2013

Publication date: June 18, 2015

Inventors: Marc Lupon, Enric Herrero Abellanas, Ayose Falcon, Fernando Latorre, Pedro Lopez, Frederico Pratas
MECHANISM FOR FACILITATING DYNAMIC AND EFFICIENT FUSION OF COMPUTING INSTRUCTIONS IN SOFTWARE PROGRAMS

Publication number: 20150026671

Abstract: A mechanism is described for facilitating dynamic and efficient fusion of computing instructions according to one embodiment. A method of embodiments, as described herein, includes monitoring a software program for a program region having fusion candidate instructions for a fusion operation at a computing system; evaluating whether the macro operation of the candidate instructions is valuable to the software program; and performing the fusion operation if it is evaluated to be valuable.

Type: Application

Filed: March 27, 2013

Publication date: January 22, 2015

Inventors: Marc Lupon, Raul Martinez, Enric Gibert Codina, Kyriakos A. Stavrou, Grigorios Magklis, Sridhar Samudrala
PARTIAL COMMITS IN DYNAMIC BINARY TRANSLATION BASED SYSTEMS

Publication number: 20150007153

Abstract: Described herein are technologies for optimizing computer code. A code generator can optimize a portion of original code to create optimized code. The code generator can create a partial commit point to indicate that execution of the optimized code produces an invalid architectural state. The code generator can create recovery information to recover a valid architectural state at a recovery point. The code generator can associate the partial commit point and recovery information with the optimized code.

Type: Application

Filed: June 27, 2013

Publication date: January 1, 2015

Inventors: Raul Martinez, Enric Gibert Codina, Marc Lupon, Kyriakos A. Stavrou
COMBINED FLOATING POINT MULTIPLIER ADDER WITH INTERMEDIATE ROUNDING LOGIC

Publication number: 20140281419

Abstract: An error handling method includes identifying a code region eligible for cumulative multiply add (CMA) optimization and translating code region instructions into interpreter code instructions, which may include translating sequences of multiply add instructions in the code region instructions into fusion code including CMA instructions. Floating point (FP) exceptions generated by the fusion code may be monitored and at least a portion of the code region instructions may be re-translated to eliminate some or all fusion code if CMA intermediate rounding exceptions exceed a threshold.

Type: Application

Filed: March 15, 2013

Publication date: September 18, 2014

Applicant: Intel Corporation

Inventors: Marc Lupon, Grigorios Magklis, Sridhar Samudrala, Raul Martinez, Kyriakos A. Stavrou, Enric Gibert Codina
METHOD, APPARATUS AND SYSTEM FOR SELECTIVE EXECUTION OF A COMMIT INSTRUCTION

Publication number: 20140156976

Abstract: Techniques and mechanisms for a processor to determine whether a commit action is to be performed. In an embodiment, a processor performs operations to determine whether a commit instruction is for contingent performance of a commit action. In another embodiment, one or more conditions of processor state are evaluated in response to determining that the commit instruction is for contingent performance of the commit action, where the evaluation is performed to determine whether the commit action indicated by the commit instruction is to be performed.

Type: Application

Filed: December 22, 2011

Publication date: June 5, 2014

Inventors: Enric Gibert Codina, Josep M. Codina, Fernando Latorre, Pedro Marcuello, Pedro Lopez, Crispin Gomez Requena, Antonio Gonzalez, Mirem Hyuseinova, Christos E. Kotselidis, Marc Lupon, Carlos Madriles Gimeno, Grigorios Magklis, Alejandro Martinez Vicente, Raul Martinez, Daniel Ortega, Demos Pavlou, Kyriakos A. Stavrou, Georgios Tournavitis, Polychronis Xekalakis
INSTRUCTION AND LOGIC FOR OPTIMIZATION LEVEL AWARE BRANCH PREDICTION

Publication number: 20140095849

Abstract: A computer-readable storage medium, method and system for optimization-level aware branch prediction are described. A gear level is assigned to a set of application instructions that have been optimized. The gear level is also stored in a register of a branch prediction unit of a processor. Branch prediction is then performed by the processor based upon the gear level.

Type: Application

Filed: September 28, 2012

Publication date: April 3, 2014

Inventors: Polychronis Xekalakis, Pedro Marcuello, Alejandro Vicente Martinez, Christos E. Kotselidis, Grigorios Magklis, Fernando Latorre, Raul Martinez, Josep M. Codina, Enric Gibert Codina, Crispin Gomez Requena, Antonio Gonzalez, Mirem Hyuseinova, Pedro Lopez, Marc Lupon, Carlos Madriles, Daniel Ortega, Demos Pavlou, Kyriakos A. Stavrou, Georgios Tournavitis
