Patents by Inventor Mitchell Alsup

Mitchell Alsup has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 10061592
    Abstract: A method is provided for improving power, performance, and area (PPA) for mixed precision computations in a processing environment. The method includes determining a braiding factor as a number of units of work encoded into a physical thread. A value of the braiding factor is determined based on a mix of precision requirements presented for individual units of work. Units of work are classified as instructions for applied code transformation based on associated precision requirements for the processing environment. Instruction inputs from specified registers are packed together into a destination register according to the determined value of the braiding factor. The packed instructions presented in vector form are executed with an instruction set architecture configured for executing packed instructions of different precisions.
    Type: Grant
    Filed: March 30, 2015
    Date of Patent: August 28, 2018
    Assignee: Samsung Electronics Co., Ltd.
    Inventors: Maxim Lukyanov, Alexander Grosul, Mitchell Alsup, Boris Beylin
  • Patent number: 9727341
    Abstract: A method for computing in a thread-based environment includes manipulating an execution mask to enable and disable threads when executing multiple conditional function clauses for process instructions. Execution lanes are controlled based on execution participation for the process instructions for reducing resource consumption. Execution of one or more particular schedulable structures that include multiple process instructions is skipped based on the execution mask and activating instructions.
    Type: Grant
    Filed: August 12, 2014
    Date of Patent: August 8, 2017
    Assignee: Samsung Electronics Co., Ltd.
    Inventors: Mitchell Alsup, Yang Jiao, Boris Beylin, Maxim Lukyanov, Alexander Grosul
  • Patent number: 9483264
    Abstract: A method for executing instructions in a thread processing environment includes determining multiple requirements that must be satisfied and resources that must be available for executing multiple instructions. The multiple instructions are encapsulated into a schedulable structure. A header is configured for the schedulable structure with information including the determined multiple requirements and resources. The schedulable structure is scheduled for executing each of the multiple instructions using the information.
    Type: Grant
    Filed: August 12, 2014
    Date of Patent: November 1, 2016
    Assignee: Samsung Electronics Co., Ltd.
    Inventors: Mitchell Alsup, Boris Beylin, Michael Shebanow, SungSoo Park
  • Patent number: 9471305
    Abstract: A method for graphics processing includes generating one or more transcendental instructions in a graphics processing unit (GPU). Micro-code is formed for processing the one or more transcendental instructions in the GPU. The micro-code is processed using an iterative process including cubic interpolation and an evaluation of a cubic polynomial.
    Type: Grant
    Filed: August 12, 2014
    Date of Patent: October 18, 2016
    Assignee: Samsung Electronics Co., Ltd.
    Inventor: Mitchell Alsup
  • Publication number: 20150378741
    Abstract: A method is provided for improving power, performance, and area (PPA) for mixed precision computations in a processing environment. The method includes determining a braiding factor as a number of units of work encoded into a physical thread. A value of the braiding factor is determined based on a mix of precision requirements presented for individual units of work. Units of work are classified as instructions for applied code transformation based on associated precision requirements for the processing environment. Instruction inputs from specified registers are packed together into a destination register according to the determined value of the braiding factor. The packed instructions presented in vector form are executed with an instruction set architecture configured for executing packed instructions of different precisions.
    Type: Application
    Filed: March 30, 2015
    Publication date: December 31, 2015
    Inventors: Maxim Lukyanov, Alexander Grosul, Mitchell Alsup, Boris Beylin
  • Publication number: 20150324228
    Abstract: A method for executing instructions in a thread processing environment includes determining multiple requirements that must be satisfied and resources that must be available for executing multiple instructions. The multiple instructions are encapsulated into a schedulable structure. A header is configured for the schedulable structure with information including the determined multiple requirements and resources. The schedulable structure is scheduled for executing each of the multiple instructions using the information.
    Type: Application
    Filed: August 12, 2014
    Publication date: November 12, 2015
    Inventors: Mitchell Alsup, Boris Beylin, Michael Shebanow, SungSoo Park
  • Publication number: 20150324198
    Abstract: A method for computing in a thread-based environment includes manipulating an execution mask to enable and disable threads when executing multiple conditional function clauses for process instructions. Execution lanes are controlled based on execution participation for the process instructions for reducing resource consumption. Execution of one or more particular schedulable structures that include multiple process instructions is skipped based on the execution mask and activating instructions.
    Type: Application
    Filed: August 12, 2014
    Publication date: November 12, 2015
    Inventors: Mitchell Alsup, Yang Jiao, Boris Beylin, Maxim Lukyanov, Alexander Grosul
  • Publication number: 20150325032
    Abstract: A method for processing pixel information includes pushing pixel varying attributes to a register file of a shader processing element. At least a portion of the pixel varying attributes are pulled based on a control flow in the shader processing element. At least a portion of the pixel varying attributes are interpolated.
    Type: Application
    Filed: September 30, 2014
    Publication date: November 12, 2015
    Inventors: Yang Jiao, Mitchell Alsup
  • Publication number: 20150324949
    Abstract: A method for graphics processing includes generating one or more transcendental instructions in a graphics processing unit (GPU). Micro-code is formed for processing the one or more transcendental instructions in the GPU. The micro-code is processed using an iterative process including cubic interpolation and an evaluation of a cubic polynomial.
    Type: Application
    Filed: August 12, 2014
    Publication date: November 12, 2015
    Inventor: Mitchell Alsup
  • Patent number: 8069336
    Abstract: Various embodiments of methods and systems for implementing a microprocessor that includes a trace cache and attempts to transition fetching from instruction cache to trace cache only on label boundaries are disclosed. In one embodiment, a microprocessor may include an instruction cache, a branch prediction unit, and a trace cache. The prefetch unit may fetch instructions from the instruction cache until the branch prediction unit outputs a predicted target address for a branch instruction. When the branch prediction unit outputs a predicted target address, the prefetch unit may check for an entry matching the predicted target address in the trace cache. If a match is found, the prefetch unit may fetch one or more traces from the trace cache in lieu of fetching instructions from the instruction cache.
    Type: Grant
    Filed: December 3, 2003
    Date of Patent: November 29, 2011
    Assignee: GLOBALFOUNDRIES Inc.
    Inventors: Mitchell Alsup, Gregory William Smaus
  • Patent number: 7882330
    Abstract: In one embodiment, a system comprises one or more input/output (I/O) devices; an I/O memory management unit (IOMMU) coupled to receive memory requests sourced by the I/O devices and configured to provide address translation for the memory requests; and a virtual machine monitor (VMM) configured to manage one or more virtual machines on the system, wherein the VMM is configured to virtualize the IOMMU, providing one or more virtual IOMMUs for use by one or more virtual machines.
    Type: Grant
    Filed: September 18, 2009
    Date of Patent: February 1, 2011
    Assignee: GLOBALFOUNDRIES Inc.
    Inventors: Michael Haertel, Mark D. Hummel, Andrew W. Lueck, Geoffrey S. Strongin, Mitchell Alsup
  • Patent number: 7809923
    Abstract: In an embodiment, an input/output (I/O) memory management unit (IOMMU) comprises at least one memory configured to store translation data; and control logic coupled to the memory and configured to translate an I/O device-generated memory request using the translation data. The translation data corresponds to one or more device table entries in a device table stored in a memory system of a computer system that includes the IOMMU, wherein the device table entry for a given request is selected by an identifier corresponding to the I/O device that generates the request. The translation data further corresponds to one or more I/O page tables, wherein the selected device table entry for the given request includes a pointer to a set of I/O page tables to be used to translate the given request.
    Type: Grant
    Filed: December 10, 2009
    Date of Patent: October 5, 2010
    Assignee: GLOBALFOUNDRIES Inc.
    Inventors: Mark D. Hummel, Geoffrey S. Strongin, Mitchell Alsup, Michael Haertel, Andrew W. Lueck
  • Publication number: 20100095085
    Abstract: In an embodiment, an input/output (I/O) memory management unit (IOMMU) comprises at least one memory configured to store translation data; and control logic coupled to the memory and configured to translate an I/O device-generated memory request using the translation data. The translation data corresponds to one or more device table entries in a device table stored in a memory system of a computer system that includes the IOMMU, wherein the device table entry for a given request is selected by an identifier corresponding to the I/O device that generates the request. The translation data further corresponds to one or more I/O page tables, wherein the selected device table entry for the given request includes a pointer to a set of I/O page tables to be used to translate the given request.
    Type: Application
    Filed: December 10, 2009
    Publication date: April 15, 2010
    Inventors: Mark D. Hummel, Geoffrey S. Strongin, Mitchell Alsup, Michael Haertel, Andrew W. Lueck
  • Patent number: 7694110
    Abstract: Various embodiments of methods and systems for implementing a set of microcode operations corresponding to a microcoded instruction as a microcode subroutine are disclosed. In one embodiment, a microprocessor includes a dispatch unit configured to dispatch operations and a scheduler coupled to the dispatch unit and configured to schedule dispatched operations for execution. In response to receiving a microcoded instruction, the dispatch unit is configured to dispatch a microcode subroutine call operation that specifies a tag identifying a microcode subroutine.
    Type: Grant
    Filed: July 8, 2003
    Date of Patent: April 6, 2010
    Assignee: GLOBALFOUNDRIES Inc.
    Inventors: Mitchell Alsup, Gregory W. Smaus
  • Patent number: 7653803
    Abstract: In an embodiment, an input/output (I/O) memory management unit (IOMMU) comprises at least one memory configured to store translation data; and control logic coupled to the memory and configured to translate an I/O device-generated memory request using the translation data. The translation data corresponds to one or more device table entries in a device table stored in a memory system of a computer system that includes the IOMMU, wherein the device table entry for a given request is selected by an identifier corresponding to the I/O device that generates the request. The translation data further corresponds to one or more I/O page tables, wherein the selected device table entry for the given request includes a pointer to a set of I/O page tables to be used to translate the given request.
    Type: Grant
    Filed: January 16, 2007
    Date of Patent: January 26, 2010
    Assignee: GLOBALFOUNDRIES Inc.
    Inventors: Mark D. Hummel, Geoffrey S. Strongin, Mitchell Alsup, Michael Haertel, Andrew W. Lueck
  • Publication number: 20100011147
    Abstract: In one embodiment, a system comprises one or more input/output (I/O) devices; an I/O memory management unit (IOMMU) coupled to receive memory requests sourced by the I/O devices and configured to provide address translation for the memory requests; and a virtual machine monitor (VMM) configured to manage one or more virtual machines on the system, wherein the VMM is configured to virtualize the IOMMU, providing one or more virtual IOMMUs for use by one or more virtual machines.
    Type: Application
    Filed: September 18, 2009
    Publication date: January 14, 2010
    Inventors: Mark D. Hummel, Andrew W. Lueck, Geoffrey S. Strongin, Mitchell Alsup, Michael Haertel
  • Patent number: 7636819
    Abstract: A method for providing proactive synchronization in a computer system includes a processor requesting exclusive access to a given memory resource. The request may include one or more addresses associated with the given memory resource. The method also includes comparing each of the addresses in the request to each address in a plurality of sets of addresses. Each address in the sets of addresses may correspond to a respective memory resource to which a requestor has exclusive access. In addition, in response to any address of the one or more addresses matching any address in the plurality of sets of addresses, the method includes returning a count value associated with the set including the matching address. The count value may be indicative of the number of requestors contending for the matching address. Software may utilize this count value to proactively choose an item with lower contention probabilities in subsequent attempts.
    Type: Grant
    Filed: August 23, 2006
    Date of Patent: December 22, 2009
    Assignee: Advanced Micro Devices, Inc.
    Inventor: Mitchell Alsup
  • Patent number: 7627722
    Abstract: A method for denying probes during proactive synchronization includes a first processor operating in an advanced synchronization mode, which includes the first processor specifying and acquiring exclusive access to a given memory resource. During operation in the advanced synchronization mode, specifying comprises executing a code sequence including one or more locked memory reference instructions having a LOCK prefix and one or more addresses associated with the given memory resource. Specifying also includes executing an ACQUIRE instruction that is subsequent to the one or more locked memory reference instructions. The method further includes a second processor requesting access to the given memory resource and issuing a probe message. In response to receiving the probe message, the first processor responds to the probe message with a failure message, thereby denying the second processor access to the given memory resource.
    Type: Grant
    Filed: August 23, 2006
    Date of Patent: December 1, 2009
    Assignee: Advanced Micro Devices, Inc.
    Inventor: Mitchell Alsup
  • Patent number: 7613898
    Abstract: In one embodiment, a system comprises one or more input/output (I/O) devices; an I/O memory management unit (IOMMU) coupled to receive memory requests sourced by the I/O devices and configured to provide address translation for the memory requests; and a virtual machine monitor (VMM) configured to manage one or more virtual machines on the system, wherein the VMM is configured to virtualize the IOMMU, providing one or more virtual IOMMUs for use by one or more virtual machines.
    Type: Grant
    Filed: January 16, 2007
    Date of Patent: November 3, 2009
    Assignee: GLOBALFOUNDRIES Inc.
    Inventors: Michael Haertel, Mark D. Hummel, Geoffrey S. Strongin, Andrew W. Lueck, Mitchell Alsup
  • Patent number: 7606985
    Abstract: Providing proactive synchronization in a computer system may include providing an augmented instruction set with additional synchronizing instructions. Therefore, a method includes a processor executing a set of instructions to request exclusive access to a plurality of memory resources. The set of instructions includes an ACQUIRE instruction. In addition, the method may include storing addresses referenced by the set of instructions within a buffer. Further, the method may include sending the addresses referenced by the set of instructions, as a set, to be compared to other addresses to which exclusive access to memory addresses has been granted in response to execution of the ACQUIRE instruction.
    Type: Grant
    Filed: August 23, 2006
    Date of Patent: October 20, 2009
    Assignee: Advanced Micro Devices, Inc.
    Inventor: Mitchell Alsup
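
Illustrative sketch (not taken from the patents): the proactive synchronization abstracts above (7636819 and 7606985) describe sending a set of addresses to be compared against sets of addresses already held exclusively, and returning a count value when any address matches. The Python below is a rough software model of that comparison only. The names `SyncArbiter`, `HeldSet`, `acquire`, and `release` are hypothetical, and the rule that the count grows by one per conflicting request is an assumption; the abstracts say only that the count may indicate the number of contending requestors.

```python
from dataclasses import dataclass, field


@dataclass
class HeldSet:
    """Addresses one requestor holds exclusively, plus a contention count."""
    owner: int
    addresses: frozenset
    contenders: int = 0  # assumption: incremented once per conflicting request


@dataclass
class SyncArbiter:
    """Hypothetical arbiter; the name and interface are illustrative only."""
    held: list = field(default_factory=list)

    def acquire(self, requestor: int, addresses: set) -> tuple:
        """Return (True, 0) on success or (False, count) on a conflict."""
        wanted = frozenset(addresses)
        for entry in self.held:
            # A non-empty intersection means some requested address matches
            # an address that another requestor already holds exclusively.
            if wanted & entry.addresses:
                entry.contenders += 1
                return False, entry.contenders
        # No match in any held set: grant exclusive access to the whole set.
        self.held.append(HeldSet(requestor, wanted))
        return True, 0

    def release(self, requestor: int) -> None:
        """Drop every set owned by the given requestor."""
        self.held = [e for e in self.held if e.owner != requestor]


arbiter = SyncArbiter()
first = arbiter.acquire(requestor=0, addresses={0x1000, 0x1040})
second = arbiter.acquire(requestor=1, addresses={0x1040, 0x2000})
third = arbiter.acquire(requestor=2, addresses={0x3000})
```

In this model, software that receives a failed result could use the returned count to pick a less contended item on its next attempt, which is the use the 7636819 abstract mentions.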