Patents by Inventor Ramon Matas
Ramon Matas has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Publication number: 20240046065
Abstract: Example methods, apparatuses, and/or articles of manufacture are disclosed that may be implemented, in whole or in part, using one or more computing devices to determine options for decisions in connection with design features of a computing device. In a particular implementation, design options for two or more design decisions of a neural network processing device may be identified based, at least in part, on a combination of a definition of available computing resources and one or more predefined performance constraints.
Type: Application
Filed: August 3, 2022
Publication date: February 8, 2024
Inventors: Hokchhay Tann, Ramon Matas Navarro, Igor Fedorov, Chuteng Zhou, Paul Nicholas Whatmough, Matthew Mattina
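The abstract describes identifying design options from a combination of available computing resources and predefined performance constraints. A minimal sketch of that kind of constraint-based pruning is shown below; the option fields, resource numbers, and constraint values are hypothetical illustrations, not taken from the application.

```python
# Hypothetical sketch of constraint-based design-option identification.
# Field names, resource limits, and candidate options are illustrative only.
from dataclasses import dataclass

@dataclass
class DesignOption:
    name: str
    sram_kib: int         # on-chip buffer the option would require
    macs_per_cycle: int   # peak multiply-accumulate throughput
    est_latency_ms: float

AVAILABLE_SRAM_KIB = 512   # definition of available computing resources
LATENCY_BUDGET_MS = 5.0    # predefined performance constraint

candidates = [
    DesignOption("small-array", 128, 64, 9.0),
    DesignOption("medium-array", 256, 256, 4.2),
    DesignOption("large-array", 1024, 1024, 1.1),
]

# Keep only options that fit the available resources and meet the constraint.
feasible = [c for c in candidates
            if c.sram_kib <= AVAILABLE_SRAM_KIB
            and c.est_latency_ms <= LATENCY_BUDGET_MS]

for option in feasible:
    print(f"feasible design option: {option.name}")
```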
-
Publication number: 20230042271
Abstract: Example methods, apparatuses, and/or articles of manufacture are disclosed that may be implemented, in whole or in part, using one or more computing devices to select options for decisions in connection with design features of a computing device. In a particular implementation, design options for two or more design decisions of a neural network processing device may be selected based, at least in part, on a combination of function values that are computed based, at least in part, on a tensor expressing sample neural network weights.
Type: Application
Filed: August 4, 2021
Publication date: February 9, 2023
Inventors: Igor Fedorov, Ramon Matas Navarro, Chuteng Zhou, Hokchhay Tann, Paul Nicholas Whatmough, Matthew Mattina
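Here the selection is driven by function values computed over a tensor of sample network weights. In the sketch below, the "function values" are simple statistics such as sparsity and dynamic range; that choice, and the option names, are assumptions for illustration, not a claim about the actual functions used.

```python
import numpy as np

# Hypothetical sketch: rank design options by statistics of a sample weight tensor.
# The scoring functions and option names are illustrative assumptions.
rng = np.random.default_rng(0)
weights = rng.normal(scale=0.1, size=(64, 64))   # tensor of sample network weights

sparsity = float(np.mean(np.abs(weights) < 1e-2))
dyn_range = float(np.max(np.abs(weights)) / (np.min(np.abs(weights)) + 1e-12))

# Each design option receives a score derived from the computed function values.
options = {
    "sparse-datapath": sparsity,                          # favours zero-skipping hardware
    "low-precision-datapath": 1.0 / np.log10(dyn_range + 10.0),
}
best = max(options, key=options.get)
print(f"selected design option: {best}")
```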
-
Patent number: 11405106
Abstract: The disclosure relates to a setup for receiving an optical data signal, having input optics for receiving the signal. An optical receiving fiber with an end facet is provided, into which the signal can be injected by an optical collimation system. A detector for detecting the optical data content is connected to the optical receiving fiber. A receive calibration source is provided, which is connected to the optical receiving fiber by a circulator. For adjusting the setup, an insertable retroreflector is provided in the light path so that light from the receive calibration source is reflected and focused by the optical collimation system onto the end facet of the receiving fiber. The distance in the z-direction between the optical collimation system and the end facet of the receiving fiber is adjusted based on the detected power of the light from the receive calibration source.
Type: Grant
Filed: January 10, 2020
Date of Patent: August 2, 2022
Assignee: Deutsches Zentrum für Luft- und Raumfahrt e.V.
Inventors: Fabian Rein, Juraj Poliak, Ramon Mata Calvo
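The final step (adjusting the z-distance using the detected calibration power) is essentially a one-dimensional optimisation. The toy hill-climbing sketch below assumes a Gaussian coupling-efficiency model; the model, focal distance, and step sizes are invented for illustration and are not from the patent.

```python
# Toy sketch: adjust the z-distance by maximising the detected calibration power.
# The coupling model, focal distance, and step size are illustrative assumptions.
import math

def detected_power(z_mm: float, focus_mm: float = 12.0) -> float:
    """Hypothetical coupling efficiency that peaks at the focal distance."""
    return math.exp(-((z_mm - focus_mm) ** 2) / 0.5)

z = 10.0      # initial collimator-to-facet distance (mm)
step = 0.2
best = detected_power(z)
for _ in range(100):
    for candidate in (z + step, z - step):
        p = detected_power(candidate)
        if p > best:
            best, z = p, candidate
            break
    else:
        step /= 2   # no improvement in either direction: refine the search
print(f"z = {z:.3f} mm, detected power = {best:.3f}")
```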
-
Patent number: 11307903
Abstract: Embodiments of the present invention set forth techniques for allocating execution resources to groups of threads within a graphics processing unit. A compute work distributor included in the graphics processing unit receives an indication from a process that a first group of threads is to be launched. The compute work distributor determines that a first subcontext associated with the process has at least one processor credit. In some embodiments, CTAs may be launched even when there are no processor credits, if one of the TPCs that was already acquired has sufficient space. The compute work distributor identifies a first processor included in a plurality of processors that has a processing load that is less than or equal to the processor loads associated with all other processors included in the plurality of processors. The compute work distributor launches the first group of threads to execute on the first processor.
Type: Grant
Filed: January 31, 2018
Date of Patent: April 19, 2022
Assignee: NVIDIA Corporation
Inventors: Jerome F. Duluk, Jr., Luke Durant, Ramon Matas Navarro, Alan Menezes, Jeffrey Tuckey, Gentaro Hirota, Brian Pharris
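As a rough illustration of the scheduling policy described above (check a subcontext's processor credits, then launch the thread group on the least-loaded processor), the sketch below uses invented TPC names, loads, and credit counts; it is not the hardware's actual implementation.

```python
# Illustrative sketch of the described policy: check a subcontext's processor
# credits, then launch the thread group on the least-loaded processor.
# TPC names, loads, and credit counts are hypothetical.
from typing import Optional

processor_load = {"tpc0": 7, "tpc1": 3, "tpc2": 5}
subcontext_credits = {"ctx_a": 1, "ctx_b": 0}

def launch_thread_group(subcontext: str) -> Optional[str]:
    """Launch a thread group for `subcontext` on the least-loaded processor."""
    if subcontext_credits.get(subcontext, 0) < 1:
        # No credit; the real design may still launch onto an already-acquired
        # TPC with free space, which this toy model does not cover.
        return None
    target = min(processor_load, key=processor_load.get)   # least-loaded processor
    subcontext_credits[subcontext] -= 1
    processor_load[target] += 1
    return target

print(launch_thread_group("ctx_a"))   # -> "tpc1"
print(launch_thread_group("ctx_b"))   # -> None (no processor credits)
```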
-
Publication number: 20220092404
Abstract: A computer-implemented method of identifying a neural network for processing data includes: clustering a training dataset into a plurality of data clusters based on similarities in activation patterns generated in neurons of a teacher neural network in response to inputting the training dataset into the teacher neural network, training a student neural network for processing each of the plurality of data clusters, and providing a data classifier neural network for identifying one or more of the trained student neural networks to process data based on a data cluster of the data.
Type: Application
Filed: September 18, 2020
Publication date: March 24, 2022
Inventors: Mark John O'Connor, Ramon Matas Navarro
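A compact sketch of the pipeline this abstract describes (cluster inputs by teacher activation patterns, then route each input to a per-cluster student) follows. k-means and the trivial "teacher" and "students" are stand-ins chosen to keep the example self-contained; the application does not prescribe them.

```python
import numpy as np

# Illustrative sketch: cluster inputs by teacher activation patterns, then
# route each input to a per-cluster "student". k-means and the toy models
# are stand-ins, not the method prescribed by the application.
rng = np.random.default_rng(1)
X = rng.normal(size=(200, 8))          # training dataset
W = rng.normal(size=(8, 4))            # fixed toy "teacher" weights

def teacher_activations(x):
    """Toy teacher hidden layer whose activation pattern drives clustering."""
    return np.maximum(x @ W, 0.0)

A = teacher_activations(X)

def kmeans(data, k=2, iters=20):
    centers = data[rng.choice(len(data), k, replace=False)]
    for _ in range(iters):
        labels = np.argmin(((data[:, None] - centers) ** 2).sum(-1), axis=1)
        centers = np.stack([data[labels == j].mean(0) if np.any(labels == j)
                            else centers[j] for j in range(k)])
    return centers, labels

centers, cluster_of = kmeans(A)

# "Train" one student per cluster; here a student just remembers its cluster's
# mean input, standing in for a real per-cluster network.
students = {j: X[cluster_of == j].mean(0) for j in range(len(centers))}

def route(x):
    """Data-classifier step: pick the student whose cluster the input falls in."""
    a = teacher_activations(x[None])[0]
    j = int(np.argmin(((centers - a) ** 2).sum(-1)))
    return j, students[j]

print("routed new input to student", route(rng.normal(size=8))[0])
```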
-
Publication number: 20220085885
Abstract: The disclosure relates to a setup for receiving an optical data signal, having input optics for receiving the signal. An optical receiving fiber with an end facet is provided, into which the signal can be injected by an optical collimation system. A detector for detecting the optical data content is connected to the optical receiving fiber. A receive calibration source is provided, which is connected to the optical receiving fiber by a circulator. For adjusting the setup, an insertable retroreflector is provided in the light path so that light from the receive calibration source is reflected and focused by the optical collimation system onto the end facet of the receiving fiber. The distance in the z-direction between the optical collimation system and the end facet of the receiving fiber is adjusted based on the detected power of the light from the receive calibration source.
Type: Application
Filed: January 10, 2020
Publication date: March 17, 2022
Inventors: Fabian Rein, Juraj Poliak, Ramon Mata Calvo
-
Patent number: 10817338
Abstract: Embodiments of the present invention set forth techniques for allocating execution resources to groups of threads within a graphics processing unit. A compute work distributor included in the graphics processing unit receives an indication from a process that a first group of threads is to be launched. The compute work distributor determines that a first subcontext associated with the process has at least one processor credit. In some embodiments, CTAs may be launched even when there are no processor credits, if one of the TPCs that was already acquired has sufficient space. The compute work distributor identifies a first processor included in a plurality of processors that has a processing load that is less than or equal to the processor loads associated with all other processors included in the plurality of processors. The compute work distributor launches the first group of threads to execute on the first processor.
Type: Grant
Filed: January 31, 2018
Date of Patent: October 27, 2020
Assignee: NVIDIA Corporation
Inventors: Jerome F. Duluk, Jr., Luke Durant, Ramon Matas Navarro, Alan Menezes, Jeffrey Tuckey, Gentaro Hirota, Brian Pharris
-
Patent number: 10637576
Abstract: A transmitter for an optical free-beam communication system includes two light transmitters for the optical transmission of a data signal using single-sideband modulation, wherein each light transmitter emits one side of the sideband modulation so that the light signal arriving at a receiver corresponds to a double-sideband modulation.
Type: Grant
Filed: October 28, 2016
Date of Patent: April 28, 2020
Assignee: Deutsches Zentrum für Luft- und Raumfahrt e.V.
Inventors: Ramon Mata Calvo, Dirk Giggenbach, Christian Fuchs, Ahmad Mustafa
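The core idea (two transmitters each sending one sideband of the same message so that their sum looks like a double-sideband signal at the receiver) can be checked numerically for a single-tone message, as in the sketch below. The carrier and tone frequencies are arbitrary illustrative choices.

```python
import numpy as np

# Toy check of the two-transmitter idea: an upper-sideband and a lower-sideband
# signal of the same single-tone message sum to a double-sideband signal.
# Frequencies and amplitudes are arbitrary illustrative choices.
fs, fc, fm = 10_000.0, 1_000.0, 100.0            # sample, carrier, message freq (Hz)
t = np.arange(0, 0.1, 1 / fs)

upper_ssb = np.cos(2 * np.pi * (fc + fm) * t)    # transmitter 1: upper sideband
lower_ssb = np.cos(2 * np.pi * (fc - fm) * t)    # transmitter 2: lower sideband
received = upper_ssb + lower_ssb                 # superposition at the receiver

dsb = 2 * np.cos(2 * np.pi * fm * t) * np.cos(2 * np.pi * fc * t)
print("matches DSB:", np.allclose(received, dsb))   # True, by the product-to-sum identity
```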
-
Publication number: 20190235924
Abstract: Embodiments of the present invention set forth techniques for allocating execution resources to groups of threads within a graphics processing unit. A compute work distributor included in the graphics processing unit receives an indication from a process that a first group of threads is to be launched. The compute work distributor determines that a first subcontext associated with the process has at least one processor credit. In some embodiments, CTAs may be launched even when there are no processor credits, if one of the TPCs that was already acquired has sufficient space. The compute work distributor identifies a first processor included in a plurality of processors that has a processing load that is less than or equal to the processor loads associated with all other processors included in the plurality of processors. The compute work distributor launches the first group of threads to execute on the first processor.
Type: Application
Filed: January 31, 2018
Publication date: August 1, 2019
Inventors: Jerome F. Duluk, Jr., Luke Durant, Ramon Matas Navarro, Alan Menezes, Jeffrey Tuckey, Gentaro Hirota, Brian Pharris
-
Publication number: 20190235928
Abstract: Embodiments of the present invention set forth techniques for allocating execution resources to groups of threads within a graphics processing unit. A compute work distributor included in the graphics processing unit receives an indication from a process that a first group of threads is to be launched. The compute work distributor determines that a first subcontext associated with the process has at least one processor credit. In some embodiments, CTAs may be launched even when there are no processor credits, if one of the TPCs that was already acquired has sufficient space. The compute work distributor identifies a first processor included in a plurality of processors that has a processing load that is less than or equal to the processor loads associated with all other processors included in the plurality of processors. The compute work distributor launches the first group of threads to execute on the first processor.
Type: Application
Filed: January 31, 2018
Publication date: August 1, 2019
Inventors: Jerome F. Duluk, Jr., Luke Durant, Ramon Matas Navarro, Alan Menezes, Jeffrey Tuckey, Gentaro Hirota, Brian Pharris
-
Patent number: 10318427
Abstract: An instruction in a first cache line may be identified and an address associated with the instruction may be determined. The address may be determined to cross a cache line boundary associated with the first cache line and a second cache line. In response to determining that the address crosses the cache line boundary, the instruction may be adjusted based on a portion of the address included in the first cache line and a second instruction may be created based on a portion of the address included in the second cache line. The second instruction may be injected into an instruction pipeline after the adjusted instruction.
Type: Grant
Filed: December 18, 2014
Date of Patent: June 11, 2019
Assignee: Intel Corporation
Inventors: Ramon Matas, Chung-Lun Chan, Alexey P. Suprun, Aditya Kesiraju
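A rough model of the splitting step described here appears below, assuming a 64-byte cache line; the line size and the (address, length) representation are assumptions for the example, not details from the patent.

```python
# Illustrative sketch: split an access whose address range crosses a cache-line
# boundary into an adjusted first part and an injected second part.
# The 64-byte line size is an assumption for the example.
LINE_SIZE = 64

def split_crossing_access(addr: int, length: int):
    end = addr + length
    boundary = (addr // LINE_SIZE + 1) * LINE_SIZE
    if end <= boundary:
        return [(addr, length)]                 # does not cross: leave as-is
    first_len = boundary - addr                 # portion inside the first line
    return [(addr, first_len),                  # adjusted instruction
            (boundary, length - first_len)]     # injected second instruction

print(split_crossing_access(0x3C, 8))   # crosses: [(60, 4), (64, 4)]
print(split_crossing_access(0x10, 8))   # stays within one line: [(16, 8)]
```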
-
Patent number: 10175986
Abstract: A processor includes logic for stateless capture of data linear addresses (DLA) during precise event based sampling (PEBS) for an out-of-order execution engine. The engine may include a PEBS unit with logic to increment a counter each time an instance of a designated micro-op is retired from a reorder buffer, capture the output DLA referenced by an instance of the micro-op that executes after the counter overflows, set a captured bit associated with a reorder buffer identifier for the instance of the micro-op, and store a PEBS record in a debug storage when the instance of the micro-op is retired from the reorder buffer. The designated micro-op references a DLA of a memory accessible to the processor.
Type: Grant
Filed: May 8, 2017
Date of Patent: January 8, 2019
Assignee: Intel Corporation
Inventors: Roger Gramunt, Ramon Matas, Benjamin C. Chaffin, Neal S. Moyer, Rammohan Padmanabhan, Alexey P. Suprun, Matthew G. Smith
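A behavioural sketch of the counting-and-capture flow in this abstract follows: the counter increments on retirement of the designated micro-op, an overflow arms capture, the next executing instance's data linear address is captured, and a record is stored when that instance retires. All structures and the record format are simplified stand-ins, not the hardware's actual design.

```python
class PebsUnit:
    """Simplified stand-in for the counting-and-capture flow in the abstract."""
    def __init__(self, overflow_after: int):
        self.reload = overflow_after
        self.remaining = overflow_after
        self.armed = False
        self.pending_dla = None
        self.debug_store = []            # stand-in for the debug storage area

    def on_execute(self, uop: str, dla: int):
        # After the counter has overflowed, capture the DLA referenced by the
        # next executing instance of the designated micro-op.
        if uop == "designated" and self.armed and self.pending_dla is None:
            self.pending_dla = dla

    def on_retire(self, uop: str, rob_id: int):
        if uop != "designated":
            return
        if self.pending_dla is not None:
            # Store a record when the captured instance retires from the ROB.
            self.debug_store.append({"rob_id": rob_id, "dla": self.pending_dla})
            self.pending_dla, self.armed = None, False
            self.remaining = self.reload
            return
        self.remaining -= 1              # count retirements of the designated micro-op
        if self.remaining <= 0:
            self.armed = True            # counter overflow arms the capture

pebs = PebsUnit(overflow_after=2)
for i, dla in enumerate([0x1000, 0x2000, 0x3000]):
    pebs.on_execute("designated", dla)
    pebs.on_retire("designated", rob_id=i)
print(pebs.debug_store)                  # one record, for the third instance
```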
-
Publication number: 20180337729
Abstract: A transmitter for an optical free-beam communication system includes two light transmitters for the optical transmission of a data signal using single-sideband modulation, wherein each light transmitter emits one side of the sideband modulation so that the light signal arriving at a receiver corresponds to a double-sideband modulation.
Type: Application
Filed: October 28, 2016
Publication date: November 22, 2018
Inventors: Ramon Mata Calvo, Dirk Giggenbach, Christian Fuchs, Mustafa Ahmad
-
Patent number: 10108554
Abstract: Methods, systems, and apparatuses relating to sharing translation lookaside buffer entries are described. In one embodiment, a processor includes one or more cores to execute a plurality of threads, a translation lookaside buffer comprising a plurality of entries, each entry comprising a virtual address to physical address translation and a plurality of bit positions, and each set bit of the plurality of bit positions in each entry indicating that the virtual address to physical address translation is valid for a respective thread of the plurality of threads, and a memory management circuit to clear all set bits for a thread by asserting a reset command to a respective reset port of the translation lookaside buffer for the thread, wherein the translation lookaside buffer comprises a separate reset port for each of the plurality of threads.
Type: Grant
Filed: December 5, 2016
Date of Patent: October 23, 2018
Assignee: Intel Corporation
Inventors: Chung-Lun Chan, Ramon Matas
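A small software model of the sharing scheme described here: each TLB entry carries one valid bit per thread, and asserting a thread's reset port clears that thread's bit in every entry at once. The entry layout and field names below are illustrative, not the circuit itself.

```python
class SharedTlb:
    """Toy model: entries hold a translation plus one valid bit per thread."""
    def __init__(self, num_threads: int):
        self.num_threads = num_threads
        self.entries = {}                        # vpn -> (ppn, per-thread valid bitmask)

    def fill(self, vpn: int, ppn: int, thread: int):
        _, bits = self.entries.get(vpn, (ppn, 0))
        self.entries[vpn] = (ppn, bits | (1 << thread))    # set this thread's valid bit

    def lookup(self, vpn: int, thread: int):
        ppn, bits = self.entries.get(vpn, (None, 0))
        return ppn if bits & (1 << thread) else None       # valid only for this thread

    def reset_thread(self, thread: int):
        # Models asserting the per-thread reset port: clear all set bits for
        # that thread across every entry in one step.
        mask = ~(1 << thread)
        for vpn, (ppn, bits) in self.entries.items():
            self.entries[vpn] = (ppn, bits & mask)

tlb = SharedTlb(num_threads=2)
tlb.fill(0x10, 0x80, thread=0)
tlb.fill(0x10, 0x80, thread=1)      # the same translation is shared by both threads
tlb.reset_thread(0)
print(tlb.lookup(0x10, 0), tlb.lookup(0x10, 1))   # None 128
```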
-
Publication number: 20180157598
Abstract: Methods, systems, and apparatuses relating to sharing translation lookaside buffer entries are described. In one embodiment, a processor includes one or more cores to execute a plurality of threads, a translation lookaside buffer comprising a plurality of entries, each entry comprising a virtual address to physical address translation and a plurality of bit positions, and each set bit of the plurality of bit positions in each entry indicating that the virtual address to physical address translation is valid for a respective thread of the plurality of threads, and a memory management circuit to clear all set bits for a thread by asserting a reset command to a respective reset port of the translation lookaside buffer for the thread, wherein the translation lookaside buffer comprises a separate reset port for each of the plurality of threads.
Type: Application
Filed: December 5, 2016
Publication date: June 7, 2018
Inventors: Chung-Lun Chan, Ramon Matas
-
Patent number: 9891914
Abstract: An apparatus and method for performing an efficient scatter operation. For example, one embodiment of a processor comprises: an allocator unit to receive a scatter operation comprising a number of data elements and responsively allocate resources to execute the scatter operation; a memory execution cluster comprising at least a portion of the resources to execute the scatter operation, the resources including one or more store data buffers and one or more store address buffers; and a senior store pipeline to transfer store data elements from the store data buffers to system memory using addresses from the store address buffers prior to retirement of the scatter operation.
Type: Grant
Filed: April 10, 2015
Date of Patent: February 13, 2018
Assignee: Intel Corporation
Inventors: Ramon Matas, Alexey P. Suprun, Roger Gramunt, Chung-Lun Chan, Rammohan Padmanabhan
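A simplified model of the flow the abstract describes: scatter elements are placed in store-data and store-address buffers, and a senior-store stage drains them to memory before the scatter retires. The data structures and ordering below are assumptions made to keep the sketch small.

```python
# Simplified model of the described scatter flow: elements go into store-data
# and store-address buffers, and a senior-store stage drains them to memory
# before the scatter operation retires. Structures are illustrative.
memory = {}
store_data_buffer = []
store_address_buffer = []

def allocate_scatter(addresses, values):
    """Allocator step: place each element's address and data into the buffers."""
    store_address_buffer.extend(addresses)
    store_data_buffer.extend(values)

def senior_store_drain():
    """Senior-store step: write buffered elements to memory in order."""
    while store_address_buffer:
        addr = store_address_buffer.pop(0)
        val = store_data_buffer.pop(0)
        memory[addr] = val               # transferred to memory prior to retirement

allocate_scatter([0x100, 0x140, 0x180], [1, 2, 3])
senior_store_drain()                     # drain completes first...
scatter_retired = True                   # ...then the scatter operation can retire
print(memory)
```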
-
Patent number: 9886396
Abstract: In one embodiment, a processor includes a frontend unit having an instruction decoder to receive and to decode instructions of a plurality of threads, an execution unit coupled to the instruction decoder to receive and execute the decoded instructions, and an instruction retirement unit having a retirement logic to receive the instructions from the execution unit and to retire the instructions associated with one or more of the threads that have an instruction or an event pending to be retired. The instruction retirement unit includes a thread arbitration logic to select one of the threads at a time and to dispatch the selected thread to the retirement logic for retirement processing.
Type: Grant
Filed: December 23, 2014
Date of Patent: February 6, 2018
Assignee: Intel Corporation
Inventors: Roger Gramunt, Rammohan Padmanabhan, Ramon Matas, Neal S. Moyer, Benjamin C. Chaffin, Avinash Sodani, Alexey P. Suprun, Vikram S. Sundaram, Chung-Lun Chan, Gerardo A. Fernandez, Julio Gago, Michael S. Yang, Aditya Kesiraju
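A toy version of the thread-arbitration step described here is sketched below. The patent does not say the arbitration is round-robin; that policy, and the pending-instruction data, are assumptions for the example.

```python
# Toy arbitration sketch: pick one thread at a time among those with an
# instruction or event pending retirement. Round-robin order is an assumption;
# the patent does not specify the arbitration policy.
from collections import deque

pending = {0: deque(["add", "load"]), 1: deque(), 2: deque(["store"])}
rr_order = deque(sorted(pending))

def retire_next():
    """Select one thread with pending work and dispatch its oldest instruction."""
    for _ in range(len(rr_order)):
        thread = rr_order[0]
        rr_order.rotate(-1)                     # advance the round-robin pointer
        if pending[thread]:
            return thread, pending[thread].popleft()
    return None                                 # nothing pending in any thread

print(retire_next())   # (0, 'add')
print(retire_next())   # (2, 'store')  -- thread 1 skipped, nothing pending
print(retire_next())   # (0, 'load')
```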
-
Publication number: 20170242698
Abstract: A processor includes logic for stateless capture of data linear addresses (DLA) during precise event based sampling (PEBS) for an out-of-order execution engine. The engine may include a PEBS unit with logic to increment a counter each time an instance of a designated micro-op is retired from a reorder buffer, capture the output DLA referenced by an instance of the micro-op that executes after the counter overflows, set a captured bit associated with a reorder buffer identifier for the instance of the micro-op, and store a PEBS record in a debug storage when the instance of the micro-op is retired from the reorder buffer. The designated micro-op references a DLA of a memory accessible to the processor.
Type: Application
Filed: May 8, 2017
Publication date: August 24, 2017
Inventors: Roger Gramunt, Ramon Matas, Benjamin C. Chaffin, Neal S. Moyer, Rammohan Padmanabhan, Alexey P. Suprun, Matthew G. Smith
-
Patent number: 9715432
Abstract: Exemplary aspects are directed toward resolving fault suppression in hardware without incurring a performance hit. For example, when multiple instructions are executing simultaneously, a mask can specify which elements need not be executed; elements whose mask bit is disabled do not need to be executed. A determination is then made as to whether a fault happens in one of the elements that have been disabled. If there is a fault in one of the elements that has been disabled, a state machine re-fetches the instructions in a special mode. More specifically, the state machine determines whether the fault is on a disabled element, and if so, it specifies that the fault should be ignored. If there was no mask during the first execution and an error is present during execution, then the element is re-run with the mask to see whether the error is a "real" fault.
Type: Grant
Filed: December 23, 2014
Date of Patent: July 25, 2017
Assignee: Intel Corporation
Inventors: Ramon Matas, Roger Gramunt, Chung-Lun Chan, Benjamin C. Chaffin, Aditya Kesiraju, Jonathan C. Hall, Jesus Corbal
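A behavioural sketch of the suppression check described here: on the re-run, a faulting element is treated as a real fault only if its mask bit is enabled; faults on disabled elements are ignored. The vector data and fault model are invented for illustration.

```python
# Behavioural sketch of the described fault-suppression check: re-run a masked
# vector operation element by element and ignore faults on elements whose mask
# bit is disabled. Data and fault model are illustrative.
def element_op(x):
    if x == 0:
        raise ZeroDivisionError("faulting element")
    return 100 // x

def masked_rerun(values, mask):
    results = []
    for value, enabled in zip(values, mask):
        if not enabled:
            results.append(None)           # disabled element: any fault is ignored
            continue
        results.append(element_op(value))  # enabled element: a fault here is "real"
    return results

print(masked_rerun([4, 0, 5], mask=[1, 0, 1]))   # fault on the disabled lane is suppressed
```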
-
Patent number: 9658861
Abstract: Following a restart or a reboot of a system that includes a multi-core processor, the multi-core processor may assign one of the cores as a boot strap processor (BSP). Initialization logic may detect a state of each of the plurality of processing cores as active or inactive. The initialization logic may detect an attribute of each of the plurality of processing cores as eligible to be assigned as the BSP or as ineligible to be assigned as the BSP. The initialization logic may detect the last processing core of the plurality of processing cores in the interconnect that is an active processing core based at least in part on the state and is eligible to be assigned as the BSP based at least in part on the attribute. In various embodiments, the initialization logic may assign the last processing core as the BSP.
Type: Grant
Filed: December 29, 2011
Date of Patent: May 23, 2017
Assignee: Intel Corporation
Inventors: Steven S. Chang, Anshuman Thakur, Ramacharan Sundararaman, Ramon Matas, Jay S. Lawlor, Robert F. Netting
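A minimal sketch of the selection rule the abstract describes: scan the cores in interconnect order and pick the last one that is both active and BSP-eligible. The core list and its fields are invented for the example.

```python
# Minimal sketch of the described boot-strap-processor (BSP) selection rule:
# pick the last core on the interconnect that is active and BSP-eligible.
# The core list and its fields are invented for the example.
cores = [
    {"id": 0, "active": True,  "bsp_eligible": True},
    {"id": 1, "active": True,  "bsp_eligible": False},
    {"id": 2, "active": False, "bsp_eligible": True},
    {"id": 3, "active": True,  "bsp_eligible": True},
]

bsp = None
for core in cores:                        # traverse in interconnect order
    if core["active"] and core["bsp_eligible"]:
        bsp = core["id"]                  # remember the last qualifying core

print(f"assigned core {bsp} as BSP")      # -> core 3
```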