Patents by Inventor Stefan G. Berg

Stefan G. Berg has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 7409530
    Abstract: A VLIW instruction format is introduced having a set of control bits which identify subinstruction sharing conditions. At compilation the VLIW instruction is analyzed to identify subinstruction sharing opportunities. Such opportunities are encoded in the control bits of the instruction. Before the instruction is moved into the instruction cache, the instruction is compressed into the new format to delete select redundant occurrences of a subinstruction. Specifically, where a subinstruction is to be shared by corresponding functional processing units of respective clusters, the subinstruction need only appear in the instruction once. The redundant appearance is deleted. The control bits are decoded at instruction parsing time to route a shared subinstruction to the associated functional processing units.
    Type: Grant
    Filed: December 17, 2004
    Date of Patent: August 5, 2008
    Assignee: University of Washington
    Inventors: Donglok Kim, Stefan G. Berg, Weiyun Sun, Yongmin Kim
  • Patent number: 7234040
    Abstract: Data are prefetched into a cache from a prefetch region of memory, based on a program instruction reference and on compile-time information that indicates the bounds of the prefetch region, a size of a prefetch block, and a location of the prefetch block. If the program reference address lies with the prefetch region, an offset distance is used to determine the address of the prefetch block. Prefetching is performed either from a continuous one-dimensional prefetch region, or an embedded multi-dimensional prefetch region. The prefetch block address is respectively determined in one dimension or multiple dimensions. Program-directed prefetching is implemented by a media processor or by a separate processing component in communication with the media processor. The primary components include a program-directed prefetch controller, a cache, a function unit, and a memory. Preferably, region registers store the compile-time information, and the prefetched data are stored in a cache prefetch buffer.
    Type: Grant
    Filed: July 20, 2004
    Date of Patent: June 19, 2007
    Assignee: University of Washington
    Inventors: Stefan G. Berg, Donglok Kim, Yongmin Kim
  • Patent number: 6859870
    Abstract: A VLIW instruction format is introduced having a set of control bits which identify subinstruction sharing conditions. At compilation the VLIW instruction is analyzed to identify subinstruction sharing opportunities. Such opportunities are encoded in the control bits of the instruction. Before the instruction is moved into the instruction cache, the instruction is compressed into the new format to delete select redundant occurrences of a subinstruction. Specifically, where a subinstruction is to be shared by corresponding functional processing units of respective clusters, the subinstruction need only appear in the instruction once. The redundant appearance is deleted. The control bits are decoded at instruction parsing time to route a shared subinstruction to the associated functional processing units.
    Type: Grant
    Filed: March 7, 2000
    Date of Patent: February 22, 2005
    Assignee: University of Washington
    Inventors: Donglok Kim, Stefan G. Berg, Weiyun Sun, Yongmin Kim
  • Publication number: 20040268051
    Abstract: Data are prefetched into a cache from a prefetch region of memory, based on a program instruction reference and on compile-time information that indicates the bounds of the prefetch region, a size of a prefetch block, and a location of the prefetch block. If the program reference address lies with the prefetch region, an offset distance is used to determine the address of the prefetch block. Prefetching is performed either from a continuous one-dimensional prefetch region, or an embedded multi-dimensional prefetch region. The prefetch block address is respectively determined in one dimension or multiple dimensions. Program-directed prefetching is implemented by a media processor or by a separate processing component in communication with the media processor. The primary components include a program-directed prefetch controller, a cache, a function unit, and a memory. Preferably, region registers store the compile-time information, and the prefetched data are stored in a cache prefetch buffer.
    Type: Application
    Filed: July 20, 2004
    Publication date: December 30, 2004
    Applicant: University of Washington
    Inventors: Stefan G. Berg, Donglok Kim, Yongmin Kim
  • Patent number: 6804771
    Abstract: A processor including a transposable register file. The register file allows normal row-wise access to data and also allows a transposed column-wise access to data stored in a column among registers of the register file. In transposed access mode, a data operand is accessed in a given partition of each of n registers. One register stores a first partition. An adjacent register stores the second partition, and so forth for each of n partitions of the operand. A queue-based transposable register file also is implemented. The queue-based transposable register file includes a head pointer and a tail pointer and has a virtual register. Data written into the virtual register is written into one of the registers as selected by the head pointer. Data read from the virtual register is read from one of the registers as selected by the tail pointer.
    Type: Grant
    Filed: July 25, 2000
    Date of Patent: October 12, 2004
    Assignee: University of Washington
    Inventors: Yoochang Jung, Stefan G. Berg, Donglok Kim, Yongmin Kim
  • Patent number: 6782470
    Abstract: The register file of a processor includes embedded operand queues. The configuration of the register file into registers and operand queues is defined dynamically by a computer program. The programmer determines the trade-off between the number and size of the operand queue(s) versus the number of registers used for the program. The programmer partitions a portion of the registers into one or more operand queues. A given queue occupies a consecutive set of registers, although multiple queues need not occupy consecutive registers. An additional address bit is included to distinguish operand queue addresses from register addresses. Queue state logic tracks status information for each queue, including a header pointer, tail pointer, start address, end address and number of vacancies value. The program sets the locations and depth of a given operand queue within the register file.
    Type: Grant
    Filed: November 6, 2000
    Date of Patent: August 24, 2004
    Assignee: University of Washington
    Inventors: Stefan G. Berg, Michael S. Grow, Weiyun Sun, Donglok Kim, Yongmin Kim
  • Patent number: 6779101
    Abstract: An area of on-chip memory is allocated to store one or more tables of commonly-used opcodes. The normal opcode in the instruction is replaced with a shorter code identifying an index into the table. As a result, the instruction is compressed. For a VLIW architecture, in which an instruction includes multiple subinstructions (multiple opcodes), the instruction loading bandwidth is substantially reduced. Preferably, an opcode table is dynamically loaded. Different tasks are programmed with a respective table of opcodes to be stored in the opcode table. The respective table is loaded when task switching.
    Type: Grant
    Filed: March 7, 2000
    Date of Patent: August 17, 2004
    Assignee: University of Washington
    Inventors: Stefan G. Berg, Donglok Kim, Yongmin Kim
  • Patent number: 6732247
    Abstract: Multi-ported pipelined memory is located on a processor die serving as an addressable on-chip memory for efficiently processing streaming data. The memory sustains multiple wide memory accesses per cycle, clocks synchronously with the rest of the processor, and stores a significant portion of an image. Such memory bypasses the register file directly providing data to the processor's functional units. The memory includes multiple memory banks which permit multiple memory accesses per cycle. The memory banks are connected in pipelined fashion to pipeline registers placed at regular intervals on a global bus. The memory sustains multiple transactions per cycle, at a larger memory density than that of a multi-ported static memory, such as a register file.
    Type: Grant
    Filed: January 17, 2001
    Date of Patent: May 4, 2004
    Assignee: University of Washington
    Inventors: Stefan G. Berg, Donglok Kim, Yongmin Kim
  • Patent number: 6675286
    Abstract: Partitioned sigma instructions are provided in which processor capacity is effectively distributed among multiple sigma operations which are executed concurrently. Special registers are included for aligning data on memory word boundaries to reduce packing overhead in providing long data words for multimedia instructions which implement shifting data sequences over multiple iterations. Extended partitioned arithmetic instructions are provided to improve precision and avoid accumulated carry over errors. Partitioned formatting instructions, including partitioned interleave, partitioned compress, and partitioned interleave and compress pack subwords in an effective order for other partitioned operations.
    Type: Grant
    Filed: April 27, 2000
    Date of Patent: January 6, 2004
    Assignee: University of Washington
    Inventors: Weiyun Sun, Stefan G. Berg, Donglok Kim, Yongmin Kim
  • Publication number: 20030154349
    Abstract: Data are prefetched into a cache from a prefetch region of memory, based on a program instruction reference and on compile-time information that indicates the bounds of the prefetch region, a size of a prefetch block, and a location of the prefetch block. If the program reference address lies with the prefetch region, an offset distance is used to determine the address of the prefetch block. Prefetching is performed either from a continuous one-dimensional prefetch region, or an embedded multi-dimensional prefetch region. The prefetch block address is respectively determined in one dimension or multiple dimensions. Program-directed prefetching is implemented by a media processor or by a separate processing component in communication with the media processor. The primary components include a program-directed prefetch controller, a cache, a function unit, and a memory. Preferably, region registers store the compile-time information, and the prefetched data are stored in a cache prefetch buffer.
    Type: Application
    Filed: January 24, 2002
    Publication date: August 14, 2003
    Inventors: Stefan G. Berg, Donglok Kim, Yongmin Kim
  • Publication number: 20020095555
    Abstract: Multi-ported pipelined memory is located on a processor die serving as an addressable on-chip memory for efficiently processing streaming data. The memory sustains multiple wide memory accesses per cycle, clocks synchronously with the rest of the processor, and stores a significant portion of an image. Such memory bypasses the register file directly providing data to the processor's functional units. The memory includes multiple memory banks which permit multiple memory accesses per cycle. The memory banks are connected in pipelined fashion to pipeline registers placed at regular intervals on a global bus. The memory sustains multiple transactions per cycle, at a larger memory density than that of a multi-ported static memory, such as a register file.
    Type: Application
    Filed: January 17, 2001
    Publication date: July 18, 2002
    Applicant: UNIVERSITY OF WASHINGTON
    Inventors: Stefan G. Berg, Donglok Kim, Yongmin Kim