Patents by Inventor Teik-Chung Tan

Teik-Chung Tan has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 11861781
    Abstract: The graphics processing unit (GPU) of a processing system transitions to a low-power state between frame rendering operations according to an inter-frame power off process, where GPU state information is stored on retention hardware. The retention hardware can include retention random access memory (RAM) or retention flip-flops. The retention hardware is operable in an active mode and a retention mode, where read/write operations are enabled at the retention hardware in the active mode and disabled in the retention mode, but data stored on the retention hardware is still retained in the retention mode. The retention hardware is placed in the retention state between frame rendering operations. The GPU transitions from its low-power state to its active state upon receiving an indication that a new frame is ready to be rendered and is restored using the GPU state information stored at the retention hardware.
    Type: Grant
    Filed: December 28, 2020
    Date of Patent: January 2, 2024
    Assignees: SAMSUNG ELECTRONICS CO., LTD., Advanced Micro Devices, Inc., ATI TECHNOLOGIES ULC
    Inventors: Sreekanth Godey, Ashkan Hosseinzadeh Namin, Seunghun Jin, Teik-Chung Tan
  • Publication number: 20220207813
    Abstract: The graphics processing unit (GPU) of a processing system transitions to a low-power state between frame rendering operations according to an inter-frame power off process, where GPU state information is stored on retention hardware. The retention hardware can include retention random access memory (RAM) or retention flip-flops. The retention hardware is operable in an active mode and a retention mode, where read/write operations are enabled at the retention hardware in the active mode and disabled in the retention mode, but data stored on the retention hardware is still retained in the retention mode. The retention hardware is placed in the retention state between frame rendering operations. The GPU transitions from its low-power state to its active state upon receiving an indication that a new frame is ready to be rendered and is restored using the GPU state information stored at the retention hardware.
    Type: Application
    Filed: December 28, 2020
    Publication date: June 30, 2022
    Inventors: Sreekanth GODEY, Ashkan HOSSEINZADEH NAMIN, Seunghun JIN, Teik-Chung TAN
  • Patent number: 9395988
    Abstract: A method and apparatus for register packing prior to register renaming in a microprocessor are provided. The method includes: receiving a plurality of micro operations (micro-ops) decoded from one or more instructions; packing a plurality of registers which are included in the micro-ops into a packed register structure including a plurality of packed registers based on a preset number of rename ports of a renamer through which the packed registers are read or written for register renaming; and sending the packed registers for register renaming.
    Type: Grant
    Filed: March 8, 2013
    Date of Patent: July 19, 2016
    Assignee: SAMSUNG ELECTRONICS CO., LTD.
    Inventors: Teik-Chung Tan, Bradley Gene Burgess, Ravi Iyengar
  • Patent number: 9256544
    Abstract: For a memory access at a processor, only a subset (less than all) of the ways of a cache associated with a memory address is prepared for access. The subset of ways is selected based on stored information indicating, for each memory access, which corresponding way of the cache was accessed. The subset of ways is selected and preparation of the subset of ways is initiated prior to the final determination as to which individual cache way in the subset is to be accessed.
    Type: Grant
    Filed: December 26, 2012
    Date of Patent: February 9, 2016
    Assignee: Advanced Micro Devices, Inc.
    Inventors: Matthew M. Crum, Teik-Chung Tan
  • Publication number: 20140258687
    Abstract: A method and apparatus for register packing prior to register renaming in a microprocessor are provided. The method includes: receiving a plurality of micro operations (micro-ops) decoded from one or more instructions; packing a plurality of registers which are included in the micro-ops into a packed register structure including a plurality of packed registers based on a preset number of rename ports of a renamer through which the packed registers are read or written for register renaming; and sending the packed registers for register renaming.
    Type: Application
    Filed: March 8, 2013
    Publication date: September 11, 2014
    Inventors: Teik-Chung TAN, Bradley Gene BURGESS, Ravi IYENGAR
  • Publication number: 20140181407
    Abstract: For a memory access at a processor, only a subset (less than all) of the ways of a cache associated with a memory address is prepared for access. The subset of ways is selected based on stored information indicating, for each memory access, which corresponding way of the cache was accessed. The subset of ways is selected and preparation of the subset of ways is initiated prior to the final determination as to which individual cache way in the subset is to be accessed.
    Type: Application
    Filed: December 26, 2012
    Publication date: June 26, 2014
    Applicant: Advanced Micro Devices, Inc.
    Inventors: Matthew M. Crum, Teik-Chung Tan
  • Patent number: 7783692
    Abstract: A method and circuit for fast flag generation. The circuit is coupled to receive data to be shifted, the data including a first plurality of bits. A shift count value (including a second plurality of bits) is also received by the circuit, as well as an indication of a direction the data is to be shifted. Based on the shift count value and the indication of direction, the position of a bit within the data is determined. The bit is then output as a flag bit.
    Type: Grant
    Filed: July 5, 2005
    Date of Patent: August 24, 2010
    Assignee: Advanced Micro Devices, Inc.
    Inventors: Wing-Shek Wong, Michael E. Tuuk, Teik-Chung Tan
  • Patent number: 7610476
    Abstract: Various embodiments of methods and systems for storing multiple groups of microcode operations and corresponding control sequences per row of microcode ROM are disclosed. In one embodiment, an integrated circuit may include a microcode ROM coupled to a control sequence logic unit. The microcode ROM may store multiple groups of microcode operations per row. For each group of microcode operations stored in a row, a corresponding control sequence may also be stored in the row. Each group of microcode operations may be included in a microcode routine. The groups of microcode operations stored in a row may be included in the same microcode routine, or some of the groups may be included in different microcode routines.
    Type: Grant
    Filed: December 5, 2003
    Date of Patent: October 27, 2009
    Assignee: Advanced Micro Devices, Inc.
    Inventors: Teik-Chung Tan, Gregory William Smaus
  • Patent number: 7584237
    Abstract: A method and mechanism for performing division. A processor includes a divider configured to perform arithmetic division operations. Prior to dividing a dividend by a divisor, the divider manipulates the dividend and divisor to reduce the number of bits considered and the computations required to perform the division. The divisor is normalized by eliminating sign bits. The dividend is prescaled to eliminate one or more sign bits. Prescaling of the dividend may not be precise as sign bits of the dividend may be shifted out as groups of bits, rather than individual bits. Prescaling of the dividend may be adjusted to account for the fact that the divider considers multiple bits of the dividend at a time. Subsequent to prescaling and adjustment, the dividend may be adjusted in dependence upon the normalization of the divisor. Further adjustment may be utilized to maintain a significance relationship between the divisor and dividend. Subsequent to further adjustment, the division operation may be completed.
    Type: Grant
    Filed: October 11, 2005
    Date of Patent: September 1, 2009
    Assignee: Advanced Micro Devices, Inc.
    Inventors: Teik-Chung Tan, Michael Tuuk, Wing-Shek Wong
  • Patent number: 7464255
    Abstract: A method and mechanism for performing shift operations using a shuffle unit. A processor includes a shuffle unit configured to perform shuffle operations responsive to shuffle instructions. The shuffle unit is adapted to support shift operations as well. In response to determining a shuffle instruction is received, selected bits of an immediate value of the shuffle instruction are used to generate byte selects for relocating bytes of a source operand. In response to determining the instruction is a shift instruction, the shuffle unit performs an arithmetic operation on a first and second value, where the first value corresponds to a particular destination byte position, and the second value corresponds to the immediate value. The result of the arithmetic operation comprises a byte select which selects one of the bytes of a source operand for conveyance to the particular destination byte position.
    Type: Grant
    Filed: July 28, 2005
    Date of Patent: December 9, 2008
    Assignee: Advanced Micro Devices, Inc.
    Inventors: Teik-Chung Tan, Kelvin Domnic Goveas
  • Patent number: 7380070
    Abstract: A cache system is constructed in accordance with an architecture that comprises a tag array into which tags are stored that are used to determine whether a hit or a miss into the cache system has occurred. Further, the cache system comprises a data array into which cache lines of data are stored, each cache line comprising a plurality of sub-lines, and each sub-line is adapted to be written back to a system memory separate from the other sub-lines. The cache system also comprises a controller coupled to the tag and data arrays. The tag array includes a cache-line dirty bit associated with each cache line and the data array includes a plurality of dirty bits for each cache line. The plurality of dirty bits comprises one sub-line dirty bit for each sub-line.
    Type: Grant
    Filed: February 17, 2005
    Date of Patent: May 27, 2008
    Assignee: Texas Instruments Incorporated
    Inventor: Teik-Chung Tan
  • Patent number: 7124236
    Abstract: A microprocessor including a level two cache memory including asynchronously accessible cache blocks. The microprocessor includes an execution unit coupled to a cache memory subsystem which includes a plurality of storage blocks, each configured to store a plurality of data units. Each of the plurality of storage blocks may be accessed asynchronously. In addition, the cache subsystem includes a plurality of tag units which are coupled to the plurality of storage blocks. Each of the tag units may be configured to store a plurality of tags each including an address tag value which corresponds to a given unit of data stored within the plurality of storage blocks. Each of the plurality of tag units may be accessed synchronously.
    Type: Grant
    Filed: November 26, 2002
    Date of Patent: October 17, 2006
    Assignee: Advanced Micro Devices, Inc.
    Inventors: Teik-Chung Tan, Mitchell Alsup, Jerry D. Moench
  • Publication number: 20060195677
    Abstract: A cache system comprises a plurality of cache banks, a translation look-aside buffer (TLB), and a scheduler. The TLB is used to translate a virtual address (VA) to a physical address (PA). The scheduler, before the VA has been completely translated to the PA, uses a subset of the VA's bits to schedule access to the plurality of cache banks.
    Type: Application
    Filed: February 28, 2005
    Publication date: August 31, 2006
    Applicant: Texas Instruments Incorporated
    Inventor: Teik-Chung Tan
  • Publication number: 20060184745
    Abstract: A cache system is constructed in accordance with an architecture that comprises a tag array into which tags are stored that are used to determine whether a hit or a miss into the cache system has occurred. Further, the cache system comprises a data array into which cache lines of data are stored, each cache line comprising a plurality of sub-lines, and each sub-line is adapted to be written back to a system memory separate from the other sub-lines. The cache system also comprises a controller coupled to the tag and data arrays. The tag array includes a cache-line dirty bit associated with each cache line and the data array includes a plurality of dirty bits for each cache line. The plurality of dirty bits comprises one sub-line dirty bit for each sub-line.
    Type: Application
    Filed: February 17, 2005
    Publication date: August 17, 2006
    Applicant: Texas Instruments Incorporated
    Inventor: Teik-Chung Tan
  • Patent number: 7028068
    Abstract: A multiplier includes a plurality of subunits. Each of the plurality of subunits is configured to perform a portion of a multiplication operation, and the plurality of subunits are coupled together to perform the multiplication operation. At least a first subunit of the plurality of subunits and a second subunit of the plurality of subunits are configured to perform a same portion of the multiplication operation. The first subunit and the second subunit are clocked at a first clock frequency, during use, that is less than a second clock frequency at which a remainder of the plurality of subunits are clocked during use. The first subunit and the second subunit each have inputs coupled to a third subunit of the plurality of subunits to receive multiplication operations to be operated upon by the respective first subunit and second subunit.
    Type: Grant
    Filed: February 4, 2003
    Date of Patent: April 11, 2006
    Assignee: Advanced Micro Devices, Inc.
    Inventors: Kelvin D. Goveas, Teik-Chung Tan
  • Patent number: 6823427
    Abstract: Various methods and systems for implementing a sectored least recently used (LRU) cache replacement algorithm are disclosed. Each set in an N-way set-associative cache is partitioned into several sectors that each include two or more of the N ways. Usage status indicators such as pointers show the relative usage status of the sectors in an associated set. For example, an LRU pointer may point to the LRU sector, an MRU pointer may point to the MRU sector, and so on. When a replacement is performed, a way within the LRU sector identified by the LRU pointer is filled.
    Type: Grant
    Filed: May 16, 2001
    Date of Patent: November 23, 2004
    Assignee: Advanced Micro Devices, Inc.
    Inventors: Benjamin T. Sander, Teik-Chung Tan, Adam Duley
  • Patent number: 6760392
    Abstract: A system and method for transferring data using an early response signal to indicate subsequent transmission of data after a fixed latency, wherein the signal and data are transferred from a first clock domain to a second clock domain using a clock skipping technique. In one embodiment, an early response signal is transmitted by a first device k clock pulses prior to transmission of the data. The receiving device, which is operating at a higher clock rate, receives the early response signal and delays the signal by the number of skipped pulses which will occur in the second clock domain before the occurrence of the kth valid pulse. The second device employs a skip pattern generator to generate a signal indicative of this number of skipped pulses and provides the number to a delay circuit which delays the early response signal for this number of clock pulses. The delayed early response signal is then output to the appropriate logic to indicate the latency of the subsequent data transfer.
    Type: Grant
    Filed: November 12, 1999
    Date of Patent: July 6, 2004
    Assignee: Advanced Micro Devices, Inc.
    Inventors: Teik-Chung Tan, Brian D. McMinn
  • Patent number: 6725337
    Abstract: A cache controller configured to speculatively invalidate a cache line may respond to an invalidating request or instruction immediately instead of waiting for error checking to complete. In case the error checking determines that the invalidation is erroneous and thus should not be performed, the cache controller protects the speculatively invalidated cache line from modification until error checking is complete. This way, if the invalidation is later found to be erroneous, the speculative invalidation can be reversed. If error checking completes without detecting any errors, the speculative invalidation becomes non-speculative.
    Type: Grant
    Filed: May 16, 2001
    Date of Patent: April 20, 2004
    Assignee: Advanced Micro Devices, Inc.
    Inventors: Teik-Chung Tan, Benjamin T. Sander
  • Patent number: 6571318
    Abstract: A processor is described which includes a stride detect table. The stride detect table includes one or more entries, each entry used to track a potential stride pattern. Additionally, each entry includes a confidence counter. The confidence counter may be incremented each time another address in the pattern is detected, and thus may be indicative of the strength of the pattern (e.g., the likelihood of the pattern repeating). At a first threshold of the confidence counter, prefetching of the next address in the pattern (the most recent address plus the stride) may be initiated. At a second, greater threshold, a more aggressive prefetching may be initiated (e.g., the most recent address plus twice the stride). In some implementations, the prefetch mechanism including the stride detect table may replace a prefetch buffer and prefetch logic in the memory controller.
    Type: Grant
    Filed: March 2, 2001
    Date of Patent: May 27, 2003
    Assignee: Advanced Micro Devices, Inc.
    Inventors: Benjamin T. Sander, William A. Hughes, Sridhar P. Subramanian, Teik-Chung Tan
  • Patent number: 6424688
    Abstract: A system and method for transferring data from a first clock domain to a second clock domain wherein a clock skipping technique is employed to maintain the same level of data throughput in the transmitting and receiving domains. In one embodiment, a plurality of serial data values are received from a device in the first clock domain and are stored in a plurality of flip-flops. The data values are clocked into the flip-flops, one value per flip-flop, at a first clock rate corresponding to the first clock domain. After a value is stored in the last flip-flop, the cycle is repeated and the previously stored values are overwritten. The data values are retrieved from the flip-flops after the values have had time to stabilize, but before they are overwritten. The values are retrieved at a second clock rate corresponding to a second clock domain and are transferred to a device in the second clock domain.
    Type: Grant
    Filed: October 27, 1999
    Date of Patent: July 23, 2002
    Assignee: Advanced Micro Devices, Inc.
    Inventors: Teik-Chung Tan, Derrick R. Meyer, Brian D. McMinn
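
As an illustration of one mechanism from the listing above, the stride detect table in the abstract of patent 6571318 can be modeled in a few lines of software. This is only a minimal sketch and not the patented hardware: the class names, the single-entry table, the threshold values (2 and 4), and the choice to reset the confidence counter when the stride changes are all assumptions made for the example. The abstract specifies only that a confidence counter exists, that it may be incremented when another address in the pattern is seen, and that two thresholds select normal or more aggressive prefetching.

```python
from dataclasses import dataclass
from typing import List, Optional

# Hypothetical threshold values; the abstract only says the second is greater.
FIRST_THRESHOLD = 2
SECOND_THRESHOLD = 4


@dataclass
class StrideEntry:
    """One table entry tracking a potential stride pattern."""
    last_addr: int
    stride: int = 0
    confidence: int = 0


class StrideDetectTable:
    """Single-entry model (an assumption; the abstract allows one or more entries)."""

    def __init__(self) -> None:
        self.entry: Optional[StrideEntry] = None

    def access(self, addr: int) -> List[int]:
        """Record a memory access and return the addresses to prefetch, if any."""
        if self.entry is None:
            # First access: nothing to compare against yet.
            self.entry = StrideEntry(last_addr=addr)
            return []

        e = self.entry
        delta = addr - e.last_addr
        if delta != 0 and delta == e.stride:
            # Another address in the pattern was detected.
            e.confidence += 1
        else:
            # Assumed policy: start tracking the new candidate stride from zero.
            e.stride = delta
            e.confidence = 0
        e.last_addr = addr

        if e.confidence >= SECOND_THRESHOLD:
            # More aggressive prefetch: most recent address plus twice the stride.
            return [addr + 2 * e.stride]
        if e.confidence >= FIRST_THRESHOLD:
            # Normal prefetch: most recent address plus the stride.
            return [addr + e.stride]
        return []


if __name__ == "__main__":
    table = StrideDetectTable()
    for address in (100, 164, 228, 292, 356, 420, 484):
        print(address, table.access(address))
```

With these assumed thresholds, a stream of accesses spaced 64 bytes apart begins prefetching one stride ahead once the stride has been confirmed twice, and switches to two strides ahead after four confirmations. An access that breaks the pattern returns the entry to the non-prefetching state.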