Patents by Inventor Ashraf Ahmed

Ashraf Ahmed has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Publication number: 20180129474
    Abstract: According to one general aspect, an apparatus may include a floating-point multiply-accumulate unit configured to generate a floating point result by either adding or subtracting three floating point operands: an addend, a product carry, and a product sum. The floating-point multiply-accumulate unit may include a close path adder. The close path adder may include an unincremented mantissa addition circuit configured to compute an unincremented mantissa result based upon the three floating point operands. The close path adder may also include an incremented mantissa addition circuit configured to, at least partially in parallel with the mantissa addition circuit, produce an incremented mantissa result. The close path adder may further include a selection circuit configured to produce a close path result by selecting between the unincremented mantissa result and the incremented mantissa result.
    Type: Application
    Filed: February 10, 2017
    Publication date: May 10, 2018
    Inventor: Ashraf AHMED
  • Patent number: 9940101
    Abstract: Embodiments of the present disclosure include a tininess prediction and handler engine for handling numeric underflow while streamlining the data path for handling normal range cases, thereby avoiding flushes, and reducing the complexity of a scheduler with respect to how dependent operations are handled. A preemptive tiny detection logic section can detect a potential tiny result for the function or operation that is being performed, and can produce a pessimistic tiny indicator. The tininess prediction and handler engine can further include a subnormal post-processing pipe, which can denormalize and round one or more subnormal operations while in a post-processing mode. A schedule modification logic section can reschedule in-flight operations. The schedule modification logic section can issue dependent operations optimistically assuming that a producing operation will not produce a tiny result, and so will not incur extra latency associated with fixing the tiny result in the post-processing pipe.
    Type: Grant
    Filed: March 10, 2016
    Date of Patent: April 10, 2018
    Assignee: SAMSUNG ELECTRONICS CO., LTD.
    Inventors: Ashraf Ahmed, Nicholas Todd Humphries, Marc Augustin
  • Patent number: 9588770
    Abstract: Reconfiguring a register file using a rename table having a plurality of fields that indicate fracture information about a source register of an instruction for instructions which have narrow to wide dependencies.
    Type: Grant
    Filed: March 15, 2013
    Date of Patent: March 7, 2017
    Assignee: Samsung Electronics Co., LTD.
    Inventors: Bradley Gene Burgess, Ashraf Ahmed, Ravi Iyengar
  • Publication number: 20170060532
    Abstract: Embodiments of the inventive concept include a fast close path solution and circuit of a three path fused multiply-adder circuit. The fast close path circuit can include one or more compressors that can receive multiple operands and produce a result sum and a result carry. The close path circuit can include one or more leading zero anticipators (LZAs). The one or more LZAs can receive and process the result sum and the result carry. The close path circuit can include one or more adders. The one or more adders can receive and add the result sum and the result carry in parallel with the one or more LZAs processing the result sum and the result carry. Since the close path is the critical timing path, by performing the addition operations in parallel with the LZA and/or priority encode (PENC) operations, the logic depth and latency of the close path are reduced.
    Type: Application
    Filed: February 1, 2016
    Publication date: March 2, 2017
    Inventor: Ashraf AHMED
  • Publication number: 20170060533
    Abstract: Embodiments of the present disclosure include a tininess prediction and handler engine for handling numeric underflow while streamlining the data path for handling normal range cases, thereby avoiding flushes, and reducing the complexity of a scheduler with respect to how dependent operations are handled. A preemptive tiny detection logic section can detect a potential tiny result for the function or operation that is being performed, and can produce a pessimistic tiny indicator. The tininess prediction and handler engine can further include a subnormal post-processing pipe, which can denormalize and round one or more subnormal operations while in a post-processing mode. A schedule modification logic section can reschedule in-flight operations. The schedule modification logic section can issue dependent operations optimistically assuming that a producing operation will not produce a tiny result, and so will not incur extra latency associated with fixing the tiny result in the post-processing pipe.
    Type: Application
    Filed: March 10, 2016
    Publication date: March 2, 2017
    Inventors: Ashraf AHMED, Nicholas Todd HUMPHRIES, Marc AUGUSTIN
  • Patent number: 9513908
    Abstract: According to one general aspect, an apparatus may include a load/store unit, an execution unit, and a first and a second data path. The load/store unit may be configured to load/store data from/to a memory and transmit the data to/from an execution unit, wherein the data includes a plurality of elements. The execution unit may be configured to perform an operation upon the data. The load/store unit may be configured to transmit the data to/from the execution unit via either a first data path configured to communicate, without transposition, the data between the load/store unit and the execution unit, or a second data path configured to communicate, with transposition, the data between the load/store unit and the execution unit, wherein transposition includes dynamically distributing portions of the data amongst a plurality of elements according to an instruction.
    Type: Grant
    Filed: September 3, 2013
    Date of Patent: December 6, 2016
    Assignee: SAMSUNG ELECTRONICS CO., LTD.
    Inventors: Ashraf Ahmed, Nicholas Todd Humphries, Marc Michael Augustin
  • Publication number: 20150039874
    Abstract: Translation of boot code read request commands from an on-board processor of a system on a chip (SoC) from a bus protocol (e.g., advanced high-performance bus (AHB) protocol) into a sequence of commands understandable by a serial interface of the SoC to read boot code from an off-board (e.g., flash or other non-volatile) memory device. The serial interface of the memory device may include a relatively low pin count (e.g., 5 pins) and boot code of the memory device may be modified after tape-out of the SoC free of necessitating a subsequent tape-out of the SoC.
    Type: Application
    Filed: July 31, 2013
    Publication date: February 5, 2015
    Applicant: Oracle International Corporation
    Inventors: Erik Schlanger, Eric Devolder, Ashraf Ahmed
  • Publication number: 20140331032
    Abstract: According to one general aspect, an apparatus may include a load/store unit, an execution unit, and a first and a second data path. The load/store unit may be configured to load/store data from/to a memory and transmit the data to/from an execution unit, wherein the data includes a plurality of elements. The execution unit may be configured to perform an operation upon the data. The load/store unit may be configured to transmit the data to/from the execution unit via either a first data path configured to communicate, without transposition, the data between the load/store unit and the execution unit, or a second data path configured to communicate, with transposition, the data between the load/store unit and the execution unit, wherein transposition includes dynamically distributing portions of the data amongst a plurality of elements according to an instruction.
    Type: Application
    Filed: September 3, 2013
    Publication date: November 6, 2014
    Inventors: Ashraf Ahmed, Nicholas Todd Humphries, Marc Michael Augustin
  • Publication number: 20140281415
    Abstract: Reconfiguring a register file using a rename table having a plurality of fields that indicate fracture information about a source register of an instruction for instructions which have narrow to wide dependencies.
    Type: Application
    Filed: March 15, 2013
    Publication date: September 18, 2014
    Inventors: Bradley Gene BURGESS, Ashraf AHMED, Ravi IYENGAR
  • Patent number: 7565513
    Abstract: A technique of operating a processor includes determining whether a floating point unit (FPU) of the processor is to operate in a full-bit mode or a reduced-bit mode. An instruction is fetched and the instruction is decoded into a single operation, when the full-bit mode is indicated, or multiple operations, when the reduced-bit mode is indicated.
    Type: Grant
    Filed: February 28, 2007
    Date of Patent: July 21, 2009
    Assignee: Advanced Micro Devices, Inc.
    Inventors: Ashraf Ahmed, Kelvin Domnic Goveas, Michael Clark, Jelena Ilic
  • Publication number: 20080209185
    Abstract: A technique of operating a processor includes determining whether a floating point unit (FPU) of the processor is to operate in a full-bit mode or a reduced-bit mode. An instruction is fetched and the instruction is decoded into one or more full-bit operations, when the full-bit mode is indicated, or one or more reduced-bit operations, when the reduced-bit mode is indicated.
    Type: Application
    Filed: May 31, 2007
    Publication date: August 28, 2008
    Applicant: ADVANCED MICRO DEVICES, INC.
    Inventors: Ashraf Ahmed, Kelvin Domnic Goveas, Michael Clark, Jelena Ilic
  • Publication number: 20080209184
    Abstract: A technique of operating a processor includes determining whether a floating point unit (FPU) of the processor is to operate in a full-bit mode or a reduced-bit mode. An instruction is fetched and the instruction is decoded into a single operation, when the full-bit mode is indicated, or multiple operations, when the reduced-bit mode is indicated.
    Type: Application
    Filed: February 28, 2007
    Publication date: August 28, 2008
    Applicant: ADVANCED MICRO DEVICES, INC.
    Inventors: Ashraf Ahmed, Kelvin Domnic Goveas, Michael Clark, Jelena Ilic
  • Patent number: 6944744
    Abstract: A functional unit of a processor may be configured to operate on instructions as either a single, wide functional unit or as multiple, independent narrower units. For example, an execution unit may be scheduled to execute an instruction as a single double-wide execution unit or as two independently schedulable single-wide execution units. Functional unit portions may be independently schedulable for execution of instructions operating on a first data type (e.g. SISD instructions). For single-wide instructions, functional unit portions may be scheduled independently. An issue lock mechanism may lock functional unit portions together so that they form a single multi-wide functional unit. For certain multi-wide instructions (e.g. certain SIMD instructions), an instruction operating on a multi-wide or vector data type may be scheduled so that the full multi-wide operation is performed concurrently by functional unit portions locked together as a one wide functional unit.
    Type: Grant
    Filed: August 27, 2002
    Date of Patent: September 13, 2005
    Assignee: Advanced Micro Devices, Inc.
    Inventors: Ashraf Ahmed, Michael A. Filippo, James K. Pickett
  • Publication number: 20040181652
    Abstract: A functional unit of a processor may be configured to operate on instructions as either a single, wide functional unit or as multiple, independent narrower units. For example, an execution unit may be scheduled to execute an instruction as a single double-wide execution unit or as two independently schedulable single-wide execution units. Functional unit portions may be independently schedulable for execution of instructions operating on a first data type (e.g. SISD instructions). For single-wide instructions, functional unit portions may be scheduled independently. An issue lock mechanism may lock functional unit portions together so that they form a single multi-wide functional unit. For certain multi-wide instructions (e.g. certain SIMD instructions), an instruction operating on a multi-wide or vector data type may be scheduled so that the full multi-wide operation is performed concurrently by functional unit portions locked together as a one wide functional unit.
    Type: Application
    Filed: August 27, 2002
    Publication date: September 16, 2004
    Inventors: Ashraf Ahmed, Michael A. Filippo, James K. Pickett
  • Patent number: 6009511
    Abstract: A superscalar microprocessor appends a tag value to each floating point number. The tag value indicates whether the corresponding floating point number is a normal floating point number or a special floating point number. Additionally, the tag value indicates the type of special floating point number represented by the corresponding floating point number. The tag value is stored with the floating point number in a register file of the floating point unit. Tag values are also generated for floating point numbers read from memory. When a floating point core of a floating point unit receives operands from either the register file or memory, the floating point core examines the tag values to determine whether each operand is a normal floating point number or a special floating point number. If either operand is a special floating point number, the floating point core determines the type of special floating point number and applies any applicable special rules.
    Type: Grant
    Filed: June 11, 1997
    Date of Patent: December 28, 1999
    Assignee: Advanced Micro Devices, Inc.
    Inventors: Thomas W. Lynch, Ashraf Ahmed
  • Patent number: 5978901
    Abstract: A superscalar microprocessor includes a combination floating point and multimedia unit. The floating point and multimedia unit includes one set of registers. The multimedia core and floating point core share the one set of registers. Each register as a type field associated with the register. The type field identifies whether the associated register contains valid data and whether the data is of multimedia type or floating point type. If the register stores floating point type data, the type field further indicates which of a plurality of floating point types the register stores such as: zero, infinity and normal. The floating point core relies on the type field to identify special floating point numbers such as zero and infinity. To ensure predictable results when a floating point instruction is executed subsequent to a multimedia instruction, a retyping algorithm retypes registers typed as multimedia type when the first floating point instruction subsequent to a multimedia instruction is executed.
    Type: Grant
    Filed: August 21, 1997
    Date of Patent: November 2, 1999
    Assignee: Advanced Micro Devices, Inc.
    Inventors: Mark R. Luedtke, Paul K. Miller, Chris N. Hinds, Ashraf Ahmed