Patents by Inventor Keith E. Diefendorff

Keith E. Diefendorff has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Publication number: 20100077182
    Abstract: Embodiments of a method for performing parallel operations in a computer system when one or more conditional dependencies may be present, where a given conditional dependency includes a dependency associated with at least two data elements based on a pair of conditions. During operation, a processor receives instructions for generating one or more stop indicators based on actual dependencies, where a given stop indicator indicates the position of a given actual dependency that can lead to different results when the data elements are processed in parallel than when the data elements are processed sequentially, and where the given actual dependency occurs when the pair of conditions matches one or more criteria. Then, the processor executes the instructions for generating the one or more stop indicators.
    Type: Application
    Filed: September 24, 2008
    Publication date: March 25, 2010
    Applicant: APPLE INC.
    Inventors: Jeffry E. Gonion, Keith E. Diefendorff
  • Publication number: 20100077183
    Abstract: Embodiments of a method for performing parallel operations in a computer system when one or more conditional dependencies may be present, where a given conditional dependency includes a dependency associated with at least two data elements based on a pair of conditions. During operation, a processor receives instructions for generating a vector of tracked positions of actual dependencies, where a given tracked position indicates the position of a given actual dependency, and where the given actual dependency occurs when the pair of condition matches one or more criteria. Then, the processor executes the instructions for generating the vector of tracked positions.
    Type: Application
    Filed: September 24, 2008
    Publication date: March 25, 2010
    Applicant: APPLE INC.
    Inventors: Jeffry E. Gonion, Keith E. Diefendorff
  • Publication number: 20100058037
    Abstract: The described embodiments provide a processor for generating a result vector with shifted values. During operation, the processor receives a first input vector, a second input vector, and a control vector. When generating the result vector, the processor first captures a base value from a key element position in the second input vector. The processor then determines a number of bit positions to shift the base value using selected relevant elements in the first input vector. The processor then shifts the copy of the base value by the number of bit positions and writes the value into a corresponding element in the result vector. In addition, a predicate vector can be used to control the values that are written to the result vector.
    Type: Application
    Filed: August 14, 2009
    Publication date: March 4, 2010
    Applicant: APPLE INC.
    Inventors: Jeffry E. Gonion, Keith E. Diefendorff
  • Publication number: 20100049950
    Abstract: The described embodiments provide a processor for generating a result vector with summed values from a first input vector. During operation, the processor receives the first input vector, a second input vector, and a control vector. When generating the result vector, the processor first captures a base value from a key element in the second input vector. The processor then writes the sum of the base value and values from relevant elements in the first input vector into selected elements in the result vector. In addition, a predicate vector can be used to control the values that are written to the result vector.
    Type: Application
    Filed: August 14, 2009
    Publication date: February 25, 2010
    Applicant: APPLE INC.
    Inventors: Jeffry E. Gonion, Keith E. Diefendorff
  • Publication number: 20100049951
    Abstract: The described embodiments provide a processor for generating a result vector with shifted values. During operation, the processor receives a first input vector, a second input vector, and a control vector. When generating the result vector, the processor first captures a base value from a key element position in the second input vector. The processor then writes the product of the base value and values from relevant elements in the first input vector into selected elements in the result vector. In addition, a predicate vector can be used to control the values that are written to the result vector.
    Type: Application
    Filed: August 14, 2009
    Publication date: February 25, 2010
    Applicant: APPLE INC.
    Inventors: Jeffry E. Gonion, Keith E. Diefendorff
  • Publication number: 20100042815
    Abstract: The described embodiments provide a system that executes program code. While executing program code, the processor encounters at least one vector instruction and at least one vector-control instruction. The vector instruction includes a set of elements, wherein each element is used to perform an operation for a corresponding iteration of a loop in the program code. The vector-control instruction identifies elements in the vector instruction that may be operated on in parallel without causing an error due to a runtime data dependency between the iterations of the loop. The processor then executes the loop by repeatedly executing the vector-control instruction to identify a next group of elements that can be operated on in the vector instruction and selectively executing the vector instruction to perform the operation for the next group of elements in the vector instruction, until the operation has been performed for all elements of the vector instruction.
    Type: Application
    Filed: April 7, 2009
    Publication date: February 18, 2010
    Applicant: APPLE INC.
    Inventors: Jeffry E. Gonion, Keith E. Diefendorff
  • Publication number: 20100042816
    Abstract: The described embodiments provide a system that sets elements in a result vector based on an input vector. During operation, the system determines a location of a key element within the input vector. Next, the system generates a result vector. When generating the result vector, the system sets one or more elements of the result vector based on the location of the key element in the input vector.
    Type: Application
    Filed: April 7, 2009
    Publication date: February 18, 2010
    Applicant: APPLE INC.
    Inventors: Jeffry E. Gonion, Keith E. Diefendorff
  • Publication number: 20100042817
    Abstract: The described embodiments provide a processor for generating a result vector with shifted values from an input vector. During operation, the processor receives an input vector and a control vector. Using these vectors, the processor generates the result vector, which can contain shifted values or propagated values from the input vector, depending on the value of the control vector. In addition, a predicate vector can be used to control the values that are written to the result vector.
    Type: Application
    Filed: June 30, 2009
    Publication date: February 18, 2010
    Applicant: APPLE INC.
    Inventors: Jeffry E. Gonion, Keith E. Diefendorff
  • Publication number: 20100042818
    Abstract: The described embodiments provide a processor for generating a result vector with copied or propagated values from an input vector. During operation, the processor receives at least one input vector and a control vector. Using these vectors, the processor generates the result vector, which can contain copied propagated values from the input vector(s), depending on the value of the control vector. In addition, a predicate vector can be used to control the values that are written to the result vector.
    Type: Application
    Filed: June 30, 2009
    Publication date: February 18, 2010
    Applicant: APPLE INC.
    Inventors: Jeffry E. Gonion, Keith E. Diefendorff
  • Publication number: 20100042807
    Abstract: The described embodiments provide a processor for generating a result vector with incremented or decremented values from an input vector. During operation, the processor receives an input vector and a control vector. The processor then copies a value contained in a selected element of the input vector. The processor next generates the result vector, which involves writing an incremented or decremented value to the result vector, depending on the value of the control vector and the embodiment. In addition, a predicate vector can be used to control the values that are written to the result vector.
    Type: Application
    Filed: June 30, 2009
    Publication date: February 18, 2010
    Applicant: APPLE INC.
    Inventors: Jeffry E. Gonion, Keith E. Diefendorff, JR.
  • Publication number: 20100042789
    Abstract: The described embodiments provide a system that determines data dependencies between two vector memory operations or two memory operations that use vectors of memory addresses. During operation, the system receives a first input vector and a second input vector. The first input vector includes a number of elements containing memory addresses for a first memory operation, while the second input vector includes a number of elements containing memory addresses for a second memory operation, wherein the first memory operation occurs before the second memory operation in program order. The system then determines elements in the first and second input vectors where the memory addresses indicate that a dependency exists between the memory operations. The system next generates a result vector, wherein the result vector indicates the elements where dependencies exist between the memory operations.
    Type: Application
    Filed: April 7, 2009
    Publication date: February 18, 2010
    Applicant: APPLE INC.
    Inventors: Jeffry E. Gonion, Keith E. Diefendorff
  • Patent number: 7664920
    Abstract: A microprocessor includes a hierarchical memory subsystem, an instruction decoder, and a stream prefetch unit. The decoder decodes an instruction that specifies a locality characteristic parameter. In one embodiment, the parameter specifies a relative urgency with which a data stream specified by the instruction is needed rather than specifying exactly which of the cache memories in the hierarchy to prefetch the data stream into. The prefetch unit selects one of the cache memory levels in the hierarchy for prefetching the data stream into based on the memory subsystem configuration and on the relative urgency. In another embodiment, the prefetch unit instructs the memory subsystem to mark the prefetched cache line for early, late, or normal eviction according to its cache line replacement policy based on the parameter value.
    Type: Grant
    Filed: August 11, 2006
    Date of Patent: February 16, 2010
    Assignee: MIPS Technologies, Inc.
    Inventor: Keith E. Diefendorff
  • Patent number: 7624251
    Abstract: One embodiment of the present invention provides a processor that is configured to execute load-swapped-partial instructions. An instruction fetch unit within the processor is configured to fetch the load-swapped-partial instruction to be executed. Note that the load-swapped-partial instruction specifies a source address in a memory, which is possibly an unaligned address. Furthermore, an execution unit within the processor is configured to execute the load-swapped-partial instruction. This involves loading a partial-vector-sized datum from a naturally-aligned memory region encompassing the source address.
    Type: Grant
    Filed: January 18, 2007
    Date of Patent: November 24, 2009
    Assignee: Apple Inc.
    Inventors: Jeffry E. Gonion, Keith E. Diefendorff
  • Patent number: 7620797
    Abstract: One embodiment of the present invention provides a processor which is configured to execute load-swapped instructions, which are possibly directed to unaligned source address. The processor is configured to execute the load-swapped instruction by loading a vector from a naturally-aligned memory region encompassing the source address, and in doing so rotating the bytes of the vector to cause the byte at the specified source address to reside at the least-significant byte position within the vector for a little-endian memory transaction, or causing said byte to be positioned at the most-significant byte position within the vector for a big-endian memory transaction.
    Type: Grant
    Filed: November 1, 2006
    Date of Patent: November 17, 2009
    Assignee: Apple Inc.
    Inventors: Jeffry E. Gonion, Keith E. Diefendorff
  • Patent number: 7533220
    Abstract: A microprocessor coupled to a system memory has a memory subsystem with a translation look-aside buffer (TLB) for storing TLB information. The microprocessor also includes an instruction decode unit that decodes an instruction that specifies a data stream in the system memory and an abnormal TLB access policy. The microprocessor also includes a stream prefetch unit that generates a prefetch request to the memory subsystem to prefetch a cache line of the data stream from the system memory into the memory subsystem. If a virtual page address of the prefetch request causes an abnormal TLB access, the memory subsystem selectively aborts the prefetch request based on the abnormal TLB access policy specified in the instruction.
    Type: Grant
    Filed: August 11, 2006
    Date of Patent: May 12, 2009
    Assignee: MIPS Technologies, Inc.
    Inventor: Keith E. Diefendorff
  • Patent number: 7512740
    Abstract: A microprocessor coupled to a system memory by a bus includes an instruction decode unit that decodes an instruction that specifies a data stream in the system memory and a stream prefetch priority. The microprocessor also includes a load/store unit that generates load/store requests to transfer data between the system memory and the microprocessor. The microprocessor also includes a stream prefetch unit that generates a plurality of prefetch requests to prefetch the data stream from the system memory into the microprocessor. The prefetch requests specify the stream prefetch priority. The microprocessor also includes a bus interface unit (BIU) that generates transaction requests on the bus to transfer data between the system memory and the microprocessor in response to the load/store requests and the prefetch requests. The BIU prioritizes the bus transaction requests for the prefetch requests relative to the bus transaction requests for the load/store requests based on the stream prefetch priority.
    Type: Grant
    Filed: August 11, 2006
    Date of Patent: March 31, 2009
    Assignee: MIPS Technologies, Inc.
    Inventor: Keith E. Diefendorff
  • Patent number: 7509459
    Abstract: A microprocessor has a plurality of stream prefetch engines for prefetching a respective data stream from the system memory into the microprocessor cache memory and an instruction decoder that decodes instructions of the microprocessor instruction set. The instruction set includes a stream prefetch instruction that returns an identifier uniquely associating a data stream specified by the instruction with one of the engines. The instruction set also includes an explicit prefetch-triggering load instruction that specifies a stream identifier returned by a previously executed stream prefetch instruction. When the decoder decodes a conventional load instruction it does not prefetch; however, when it decodes an explicit prefetch-triggering load instruction it commences prefetching the specified data stream. In one embodiment, an indicator of the load instruction may explicitly specify non-prefetch-triggering.
    Type: Grant
    Filed: October 13, 2006
    Date of Patent: March 24, 2009
    Assignee: MIPS Technologies, Inc.
    Inventor: Keith E. Diefendorff
  • Publication number: 20090077321
    Abstract: A microprocessor coupled to a system memory by a bus includes an instruction decode unit that decodes an instruction that specifies a data stream in the system memory and a stream prefetch priority. The microprocessor also includes a load/store unit that generates load/store requests to transfer data between the system memory and the microprocessor. The microprocessor also includes a stream prefetch unit that generates a plurality of prefetch requests to prefetch the data stream from the system memory into the microprocessor. The prefetch requests specify the stream prefetch priority. The microprocessor also includes a bus interface unit (BIU) that generates transaction requests on the bus to transfer data between the system memory and the microprocessor in response to the load/store requests and the prefetch requests. The BIU prioritizes the bus transaction requests for the prefetch requests relative to the bus transaction requests for the load/store requests based on the stream prefetch priority.
    Type: Application
    Filed: August 4, 2008
    Publication date: March 19, 2009
    Applicant: MIPS Technologies, Inc.
    Inventor: Keith E. Diefendorff
  • Patent number: 7506106
    Abstract: A microprocessor has a data stream prefetch unit for processing a data stream prefetch instruction. The instruction specifies a data stream and a speculative stream hit policy indicator. If a load instruction hits in the data stream, then if the load is non-speculative the stream prefetch unit prefetches a portion of the data stream from system memory into cache memory; however, if the load is speculative the stream prefetch unit selectively prefetches a portion of the data stream from the system memory into the cache memory based on the value of the policy indicator. The load instruction is speculative if it is not guaranteed to complete execution, such as if it follows a predicted branch instruction whose outcome has not yet been finally determined to be correct. In one embodiment, the stream prefetch unit performs a similar function for store instructions that hit in the data stream.
    Type: Grant
    Filed: October 13, 2006
    Date of Patent: March 17, 2009
    Assignee: MIPS Technologies, Inc.
    Inventor: Keith E. Diefendorff
  • Patent number: 7480769
    Abstract: A microprocessor coupled to a system memory includes a load request signal that requests data be loaded from the system memory into the microprocessor in response to a load instruction. The load request signal includes a load virtual page address. The microprocessor also includes a prefetch request signal that requests a cache line be prefetched from the system memory into the microprocessor in response to a prefetch instruction. The prefetch request signal includes a prefetch virtual page address.
    Type: Grant
    Filed: August 11, 2006
    Date of Patent: January 20, 2009
    Assignee: MIPS Technologies, Inc.
    Inventors: Keith E. Diefendorff, Thomas A. Petersen