Patents by Inventor Guillermo J. Rozas

Guillermo J. Rozas has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 9081563
    Abstract: Apparatus and a method for causing scheduler software to produce code which executes more rapidly by ignoring some of the normal constraints placed on its scheduling operations and simply scheduling certain instructions to run as fast as possible, raising an exception if the scheduling violates a scheduling constraint, and determining steps to be taken for correctly executing each set of instructions about which an exception is raised.
    Type: Grant
    Filed: June 4, 2012
    Date of Patent: July 14, 2015
    Inventors: Guillermo J. Rozas, Godfrey P. D'Souza, Charles R. Price, Paul S. Serris
  • Patent number: 8924648
    Abstract: A method for caching attribute data for matching attributes with physical addresses. The method includes storing a plurality of attribute entries in a memory, wherein the memory is configured to provide at least one attribute entry when accessed with a physical address, and wherein the attribute entry provided describes characteristics of the physical address.
    Type: Grant
    Filed: September 20, 2013
    Date of Patent: December 30, 2014
    Inventors: H. Peter Anvin, Guillermo J. Rozas, Alexander Klaiber, John P. Banning
  • Patent number: 8898395
    Abstract: Methods and systems for maintaining cache consistency are described. A group of instructions is executed. The group of instructions can include multiple memory operations, and also includes an instruction that when executed causes a cache line to be accessed. In response to execution of that instruction, an indicator associated with the group of instructions is updated to indicate that the cache line has been accessed. The cache line is indicated as having been accessed until execution of the group of instructions is ended.
    Type: Grant
    Filed: May 15, 2008
    Date of Patent: November 25, 2014
    Inventor: Guillermo J. Rozas
  • Publication number: 20140337659
    Abstract: A dual-domain dynamic multiplexer and a method of transitioning between asynchronous voltage and frequency domains. One embodiment of the dual-domain dynamic multiplexer includes: (1) a first domain having a first voltage and a first clock, and a second domain having a second voltage and a second clock, (2) a plurality of data and data select input pairs wherein a data input of an input pair is in the first domain and a data select input of an input pair is in the second domain, and (3) a pre-charge stage in the second domain that is energized upon an edge of the second clock, whereby one data and data input pair is enabled and data latched in the second domain upon another edge of the second clock.
    Type: Application
    Filed: May 13, 2013
    Publication date: November 13, 2014
    Applicant: Nvidia Corporation
    Inventors: Guillermo J. Rozas, Jason Golbus, Chi Keung Lee
  • Publication number: 20140289471
    Abstract: A coherence decoupling buffer. In accordance with a first embodiment, a coherence decoupling buffer is for storing tag information of cache lines evicted from a plurality of cache memories. A coherence decoupling buffer may be free of value information of the plurality of cache memories. A coherence decoupling buffer may also be combined with a coherence memory.
    Type: Application
    Filed: June 3, 2014
    Publication date: September 25, 2014
    Applicant: Intellectual Venture Funding LLC
    Inventor: Guillermo J. Rozas
  • Publication number: 20140281392
    Abstract: The disclosure provides a micro-processing system operable in a hardware decoder mode and in a translation mode. In the hardware decoder mode, the hardware decoder receives and decodes non-native ISA instructions into native instructions for execution in a processing pipeline. In the translation mode, native translations of non-native ISA instructions are executed in the processing pipeline without using the hardware decoder. The system includes a code portion profile stored in hardware that changes dynamically in response to use of the hardware decoder to execute portions of non-native ISA code. The code portion profile is then used to dynamically form new native translations executable in the translation mode.
    Type: Application
    Filed: March 14, 2013
    Publication date: September 18, 2014
    Applicant: NVIDIA CORPORATION
    Inventors: Nathan Tuck, Alexander Klaiber, Ross Segelken, David Dunn, Ben Hertzberg, Rupert Brauch, Thomas Kistler, Guillermo J. Rozas, Madhu Swarna
  • Publication number: 20140189313
    Abstract: Various embodiments of microprocessors and methods of operating a microprocessor during runahead operation are disclosed herein. One example method of operating a microprocessor includes identifying a runahead-triggering event associated with a runahead-triggering instruction and, responsive to identification of the runahead-triggering event, entering runahead operation and inserting the runahead-triggering instruction along with one or more additional instructions in a queue. The example method also includes resuming non-runahead operation of the microprocessor in response to resolution of the runahead-triggering event and re-dispatching the runahead-triggering instruction along with the one or more additional instructions from the queue to the execution logic.
    Type: Application
    Filed: December 28, 2012
    Publication date: July 3, 2014
    Applicant: NVIDIA CORPORATION
    Inventors: Guillermo J. Rozas, Alexander Klaiber, James van Zoeren, Paul Serris, Brad Hoyt, Sridharan Ramakrishnan, Hens Vanderschoot, Ross Segelken, Darrell D. Boggs, Magnus Ekman, Aravindh Baktha, David Dunn
  • Publication number: 20140189316
    Abstract: In one embodiment, in an execution pipeline having a plurality of execution subunits, a method of using a bypass network to directly forward data from a producing execution subunit to a consuming execution subunit is provided. The method includes producing output data with the producing execution subunit, consuming input data with the consuming execution subunit, for one or more intervening operations whose input is the output data from the producing execution subunit and whose output is the input data to the consuming execution subunit, evaluating those one or more intervening operations to determine whether their execution would compose an identify function, and if the one or more intervening operations would compose such an identity function, controlling the bypass network to forward the producing execution subunit's output data directly to the consuming execution subunit.
    Type: Application
    Filed: December 27, 2012
    Publication date: July 3, 2014
    Applicant: NVIDIA CORPORATION
    Inventors: Gokul Govindu, Parag Gupta, Scott Pitkethly, Guillermo J. Rozas
  • Publication number: 20140181462
    Abstract: A method for detecting an instruction ordering violation in a CPU. The method includes receiving a reordered stream of instructions and detecting whether an ordering violation has occurred by using virtual addresses. The method further includes transferring results of the reordered stream of instructions from a load store buffer into a cache and detecting whether an ordering violation has occurred by using physical addresses. Subsequently, a recovery is initiated upon detection of an ordering violation.
    Type: Application
    Filed: December 26, 2012
    Publication date: June 26, 2014
    Applicant: NVIDIA CORPORATION
    Inventors: Guillermo J. Rozas, Bharath Krishnan, James Van Zoeren
  • Publication number: 20140164736
    Abstract: Embodiments related to managing lazy runahead operations at a microprocessor are disclosed. For example, an embodiment of a method for operating a microprocessor described herein includes identifying a primary condition that triggers an unresolved state of the microprocessor. The example method also includes identifying a forcing condition that compels resolution of the unresolved state. The example method also includes, in response to identification of the forcing condition, causing the microprocessor to enter a runahead mode.
    Type: Application
    Filed: December 7, 2012
    Publication date: June 12, 2014
    Applicant: NVIDIA CORPORATION
    Inventors: Guillermo J. Rozas, Alexander Klaiber, James van Zoeren, Paul Serris, Brad Hoyt, Sridharan Ramakrishnan, Hens Vanderschoot, Ross Segelken, Darrell D. Boggs, Magnus Ekman
  • Publication number: 20140164738
    Abstract: Embodiments related to methods and devices operative, in the event that execution of an instruction produces a runahead-triggering event, to cause a microprocessor to enter into and operate in a runahead without reissuing the instruction are provided. In one example, a microprocessor is provided. The example microprocessor includes fetch logic for retrieving an instruction, scheduling logic for issuing the instruction retrieved by the fetch logic for execution, and runahead control logic. The example runahead control logic is operative, in the event that execution of the instruction as scheduled by the scheduling logic produces a runahead-triggering event, to cause the microprocessor to enter into and operate in a runahead mode without reissuing the instruction, and carry out runahead policies while the microprocessor is in the runahead mode that governs operation of the microprocessor and cause the microprocessor to operate differently than when not in the runahead mode.
    Type: Application
    Filed: December 7, 2012
    Publication date: June 12, 2014
    Applicant: NVIDIA Corporation
    Inventors: Magnus Ekman, Guillermo J. Rozas, Alexander Klaiber, James van Zoeren, Paul Serris, Brad Hoyt, Sridharan Ramakrishnan, Hens Vanderschoot, Ross Segelken, Darrell D. Boggs
  • Patent number: 8751753
    Abstract: A coherence decoupling buffer. In accordance with a first embodiment of the present invention, a coherence decoupling buffer is for storing tag information of cache lines evicted from a plurality of cache memories. A coherence decoupling buffer may be free of value information of the plurality of cache memories. A coherence decoupling buffer may also be combined with a coherence memory.
    Type: Grant
    Filed: April 7, 2005
    Date of Patent: June 10, 2014
    Inventor: Guillermo J. Rozas
  • Publication number: 20140136891
    Abstract: Embodiments related to managing potentially invalid results generated/obtained by a microprocessor during runahead are provided. In one example, a method for operating a microprocessor includes causing the microprocessor to enter runahead upon detection of a runahead event. The example method also includes, during runahead, determining that an operation associated with an instruction referencing a storage location would produce a potentially invalid result based on a value of an architectural poison bit associated with the storage location and performing a different operation in response.
    Type: Application
    Filed: November 14, 2012
    Publication date: May 15, 2014
    Applicant: NVIDIA CORPORATION
    Inventors: Bruce Holmer, Guillermo J. Rozas, Alexander Klaiber, James van Zoeren, Paul Serris, Brad Hoyt, Sridharan Ramakrishnan, Hens Vanderschoot, Ross Segelken, Darrell D. Boggs, Magnus Ekman
  • Publication number: 20140082291
    Abstract: In a processor, a method for speculative permission acquisition for access to a shared memory. The method includes receiving a store from a processor core to modify a shared cache line, and in response to receiving the store, marking the cache line as speculative. The cache line is then modified in accordance with the store. Upon receiving a modification permission, the modified cache line is subsequently committed.
    Type: Application
    Filed: September 14, 2012
    Publication date: March 20, 2014
    Applicant: NVIDIA CORPORATION
    Inventors: James Van Zoeren, Alexander Klaiber, Guillermo J. Rozas, Paul Serris
  • Publication number: 20130326153
    Abstract: The disclosure provides systems and methods for maintaining cache coherency in a multi-threaded processing environment. For each location in a data cache, a global state is maintained specifying the coherency of the cache location relative to other data caches and/or to a shared memory resource backing the data cache. For each cache location, thread state information associated with a plurality of threads is maintained. The thread state information is specified separately and in addition to the global state, and is used to individually control read and write permissions for each thread for the cache location. The thread state information is also used, for example by a cache controller, to control whether uncommitted transactions of threads relating to the cache location are to be rolled back.
    Type: Application
    Filed: May 31, 2012
    Publication date: December 5, 2013
    Applicant: NVIDIA CORPORATION
    Inventor: Guillermo J. Rozas
  • Publication number: 20130297911
    Abstract: Embodiments related to re-dispatching an instruction selected for re-execution from a buffer upon a microprocessor re-entering a particular execution location after runahead are provided. In one example, a microprocessor is provided. The example microprocessor includes fetch logic, one or more execution mechanisms for executing a retrieved instruction provided by the fetch logic, and scheduler logic for scheduling the retrieved instruction for execution. The example scheduler logic includes a buffer for storing the retrieved instruction and one or more additional instructions, the scheduler logic being configured, upon the microprocessor re-entering at a particular execution location after runahead, to re-dispatch, from the buffer, an instruction that has been previously dispatched to one of the execution mechanisms.
    Type: Application
    Filed: May 3, 2012
    Publication date: November 7, 2013
    Applicant: NVIDIA CORPORATION
    Inventors: Guillermo J. Rozas, Paul Serris, Brad Hoyt, Sridharan Ramakrishnan, Hens Vanderschoot, Ross Segelken, Darrell Boggs, Magnus Ekman
  • Patent number: 8566564
    Abstract: A method for caching attribute data for matching attributes with physical addresses. The method includes storing a plurality of attribute entries in a memory, wherein the memory is configured to provide at least one attribute entry when accessed with a physical address, and wherein the attribute entry provided describes characteristics of the physical address.
    Type: Grant
    Filed: December 13, 2012
    Date of Patent: October 22, 2013
    Inventors: H. Peter Anvin, Guillermo J. Rozas, Alexander Klaiber, John P. Banning
  • Patent number: 8508262
    Abstract: Systems and methods for design and operation of signal generator circuitry with output frequencies greater than the oscillator frequency. Accordingly, in a first method embodiment, a method of producing an output periodic electronic signal comprises accessing four signals having a quadrature phase relationship. First and second pairs of these signals having a one half cycle phase relationship are averaged to produce two signals having an improved duty cycle and a one-quarter cycle phase relationship. The first and second averaged periodic electronic signals are combined in an exclusive OR circuit to produce the output periodic electronic signal at twice the oscillator frequency. Advantageously, the periodic signal may comprise a desirable duty cycle of 50 percent.
    Type: Grant
    Filed: September 7, 2011
    Date of Patent: August 13, 2013
    Inventors: William N. Schnaitter, Guillermo J. Rozas
  • Patent number: 8464033
    Abstract: Methods and systems thereof for exception handling are described. An event to be handled is identified during execution of a code sequence. A bit is set to indicate that handling of the event is to be deferred. An exception corresponding to the event is generated if the bit is set.
    Type: Grant
    Filed: August 30, 2011
    Date of Patent: June 11, 2013
    Inventors: Guillermo J. Rozas, Alexander Klaiber
  • Patent number: 8438548
    Abstract: In one embodiment, after translating a plurality of target instructions from a target memory location into a plurality of host instructions, a write operation to a target memory portion which includes said target memory location is detected. In response to the detecting, a copy of the target instructions is stored in a host memory. In response to an attempt to execute the host instructions, the copy is compared with a plurality of current target instructions presently stored in the target memory location. Further, in response to a mismatch based on the comparison, the host instructions are disabled.
    Type: Grant
    Filed: February 4, 2011
    Date of Patent: May 7, 2013
    Inventors: John Banning, H. Peter Anvin, Robert Bedicheck, Guillermo J. Rozas, Andrew Shaw, Linus Torvalds, Jason Wilson