Patents by Inventor Jeffry E. Gonion

Jeffry E. Gonion has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Publication number: 20110093681
    Abstract: The described embodiments include a processor that executes a vector instruction. The processor starts by receiving an input vector and optionally receiving a predicate vector as inputs. The processor then executes the vector instruction, which causes the processor to determine a key element position in the input vector and generate a result vector. When generating the result vector, if the predicate vector is received, for each element in the result vector for which a corresponding element of the predicate vector is active, otherwise, for each element of the result vector, the processor sets each element of the result vector to the right of the key element to a first predetermined value and sets each element of the result vector at or to the left of the key element to a second predetermined value. The processor then sets one or more processor status flags based on the values in the result vector.
    Type: Application
    Filed: December 23, 2010
    Publication date: April 21, 2011
    Applicant: APPLE INC.
    Inventors: Jeffry E. Gonion, Keith E. Diefendorff
  • Publication number: 20110035567
    Abstract: The described embodiments include a processor that executes a vector instruction. The processor starts by receiving a vector instruction that optionally receives a predicate vector (which has N elements) as an input. The processor then executes the vector instruction. In the described embodiments, executing the vector instruction causes the processor to generate a result vector. When generating the result vector, if the predicate vector is received, for each element in the result vector for which a corresponding element of the predicate vector is active, otherwise, for each element of the result vector, the processor determines element positions for which a fault was masked during a prior operation. The processor then updates elements in the result vector to identify a leftmost element for which a fault was masked.
    Type: Application
    Filed: October 19, 2010
    Publication date: February 10, 2011
    Applicant: APPLE INC.
    Inventors: Jeffry E. Gonion, Keith E. Diefendorff
  • Publication number: 20110035568
    Abstract: The described embodiments include a processor that executes a vector instruction. The processor starts by receiving a vector instruction that uses a first input vector, a second input vector, and a control vector, and optionally a predicate vector as inputs, wherein each of the vectors includes N elements. The processor then executes the vector instruction. In the described embodiments, when executing the vector instruction, the processor determines a key element position. If the predicate vector is received, the key element position is a predetermined active element position in the predicate vector, otherwise, the key element position is in a predetermined element position. The processor then uses the key element position to copy a result value into a result variable.
    Type: Application
    Filed: October 19, 2010
    Publication date: February 10, 2011
    Applicant: APPLE INC.
    Inventors: Jeffry E. Gonion, Keith E. Diefendorff
  • Publication number: 20100325399
    Abstract: The described embodiments provide a processor that executes a vector instruction. The processor starts by receiving a vector instruction that uses at least one vector of values that includes N elements as an input. In addition, the processor optionally receives a predicate vector that includes N elements. The processor then executes the vector instruction. In the described embodiments, when executing the vector instruction, if the predicate vector is received, for one or more selected elements in the vector of values for which a corresponding element in the predicate vector is active, otherwise, for one or more selected elements in the vector of values, the processor checks the one or more selected elements to determine if the selected elements contain a predetermined value. When the selected elements contain the predetermined value, the processor sets a corresponding status flag.
    Type: Application
    Filed: August 31, 2010
    Publication date: December 23, 2010
    Applicant: APPLE INC.
    Inventors: Jeffry E. Gonion, Keith E. Diefendorff
  • Publication number: 20100325483
    Abstract: The described embodiments include a processor that handles faults during execution of a vector instruction. The processor starts by receiving a vector instruction that uses at least one vector of values that includes N elements as an input. In addition, the processor optionally receives a predicate vector that includes N elements. The processor then executes the vector instruction. In the described embodiments, when executing the vector instruction, if the predicate vector is received, for each element in the vector of values for which a corresponding element in the predicate vector is active, otherwise, for each element in the vector of values, the processor performs an operation for the vector instruction for the element in the vector of values. While performing the operation, the processor conditionally masks faults encountered (i.e., faults caused by an illegal operation).
    Type: Application
    Filed: August 31, 2010
    Publication date: December 23, 2010
    Applicant: APPLE INC.
    Inventors: Jeffry E. Gonion, Keith E. Diefendorff
  • Publication number: 20100325398
    Abstract: The described embodiments provide a processor for generating a result vector that contains results from a comparison operation. During operation, the processor receives a first input vector, a second input vector, and a control vector. When subsequently generating a result vector, the processor first captures a base value from a key element position in the first input vector. For selected elements in the result vector, the processor compares the base value and values from relevant elements to the left of a corresponding element in the second input vector, and writes the result into the element in the result vector. In the described embodiments, the key element position and the relevant elements can be defined by the control vector and an optional predicate vector.
    Type: Application
    Filed: August 31, 2010
    Publication date: December 23, 2010
    Applicant: APPLE INC.
    Inventors: Jeffry E. Gonion, Keith E. Diefendorff
  • Patent number: 7800519
    Abstract: One embodiment of the present invention provides an apparatus for compressing data, comprising a compression mechanism which includes N channels. During operation, the compression mechanism receives a set of data words from an input bit-stream, compresses the data words into a set of variable-length words, and stores an I-th variable-length word in the set of variable-length words into a fixed-length packet for an I-th channel. Then, the compression mechanism assembles each fixed-length packet into an output stream when the packet becomes full. Some other embodiments of the present invention provide an apparatus for data decompression, comprising a parallel-processing mechanism which includes N decompression mechanisms.
    Type: Grant
    Filed: September 30, 2008
    Date of Patent: September 21, 2010
    Assignee: Apple Inc.
    Inventor: Jeffry E. Gonion
  • Publication number: 20100235612
    Abstract: A macroscalar processor architecture is described herein. In one embodiment, an exemplary processor includes one or more execution units to execute instructions and one or more iteration units coupled to the execution units. The one or more iteration units receive one or more primary instructions of a program loop that comprise a machine executable program. For each of the primary instructions received, at least one of the iteration units generates multiple secondary instructions that correspond to multiple loop iterations of the task of the respective primary instruction when executed by the one or more execution units. Other methods and apparatuses are also described.
    Type: Application
    Filed: May 26, 2010
    Publication date: September 16, 2010
    Inventor: Jeffry E. Gonion
  • Patent number: 7739442
    Abstract: A macroscalar processor architecture is described herein. In one embodiment, an exemplary processor includes one or more execution units to execute instructions and one or more iteration units coupled to the execution units. The one or more iteration units receive one or more primary instructions of a program loop that comprise a machine executable program. For each of the primary instructions received, at least one of the iteration units generates multiple secondary instructions that correspond to multiple loop iterations of the task of the respective primary instruction when executed by the one or more execution units. Other methods and apparatuses are also described.
    Type: Grant
    Filed: May 23, 2008
    Date of Patent: June 15, 2010
    Assignee: Apple Inc.
    Inventor: Jeffry E. Gonion
  • Patent number: 7728742
    Abstract: The described embodiments include a system for performing data compression. The system includes a compression mechanism with N channels, and an internal decompression mechanism in the compression mechanism that accepts N channels of fixed-length packets. The compression mechanism is configured to receive an input bit stream that includes a set of data words. In response to receiving a request from the internal decompression mechanism identifying at least one of the channels for which a fixed-length packet is to be appended to the output stream, the system fills a fixed-length packet for the identified channel with compressed data words; appends the fixed-length packet to the output stream; and forwards a copy of the fixed-length packet to the internal decompression mechanism. The internal decompression mechanism decompresses fixed-length packets for each of the channels to determine a next fixed-length packet to be appended to the output stream.
    Type: Grant
    Filed: June 30, 2009
    Date of Patent: June 1, 2010
    Assignee: Apple Inc.
    Inventor: Jeffry E. Gonion
  • Publication number: 20100122069
    Abstract: A macroscalar processor architecture is described herein. In one embodiment, a processor receives instructions of a program loop having a vector block and a sequence block intended to be executed after the vector block, where the processor includes multiple slices and each of the slices is capable of executing an instruction of an iteration of the program loop substantially in parallel. For each iteration of the program loop, the processor executes an instruction of the sequence block using one of the slices while executing instructions of the vector block using a remainder of the slices substantially in parallel. Other methods and apparatuses are also described.
    Type: Application
    Filed: November 6, 2009
    Publication date: May 13, 2010
    Inventor: Jeffry E. Gonion
  • Publication number: 20100079313
    Abstract: The described embodiments include a system for performing data compression. The system includes a compression mechanism with N channels, and an internal decompression mechanism in the compression mechanism that accepts N channels of fixed-length packets. The compression mechanism is configured to receive an input bit stream that includes a set of data words. In response to receiving a request from the internal decompression mechanism identifying at least one of the channels for which a fixed-length packet is to be appended to the output stream, the system fills a fixed-length packet for the identified channel with compressed data words; appends the fixed-length packet to the output stream; and forwards a copy of the fixed-length packet to the internal decompression mechanism. The internal decompression mechanism decompresses fixed-length packets for each of the channels to determine a next fixed-length packet to be appended to the output stream.
    Type: Application
    Filed: June 30, 2009
    Publication date: April 1, 2010
    Applicant: APPLE INC.
    Inventor: Jeffry E. Gonion
  • Publication number: 20100079314
    Abstract: One embodiment of the present invention provides an apparatus for compressing data, comprising a compression mechanism which includes N channels. During operation, the compression mechanism receives a set of data words from an input bit-stream, compresses the data words into a set of variable-length words, and stores an I-th variable-length word in the set of variable-length words into a fixed-length packet for an I-th channel. Then, the compression mechanism assembles each fixed-length packet into an output stream when the packet becomes full. Some other embodiments of the present invention provide an apparatus for data decompression, comprising a parallel-processing mechanism which includes N decompression mechanisms.
    Type: Application
    Filed: September 30, 2008
    Publication date: April 1, 2010
    Applicant: APPLE INC.
    Inventor: Jeffry E. Gonion
  • Publication number: 20100077180
    Abstract: Embodiments of a method are described for performing parallel operations in a computer system when one or more conditional dependencies may be present, where a given conditional dependency includes a dependency associated with at least two data elements based on a pair of conditions. During operation, a processor receives instructions for generating one or more predicate values based on actual dependencies, where a given predicate value indicates data elements that may be safely evaluated in parallel, and where a given actual dependency occurs when the pair of conditions matches one or more criteria. Then, the processor executes the instructions for generating the one or more predicate values.
    Type: Application
    Filed: September 24, 2008
    Publication date: March 25, 2010
    Applicant: APPLE INC.
    Inventors: Jeffry E. Gonion, Keith E. Diefendorff
  • Publication number: 20100077182
    Abstract: Embodiments of a method are described for performing parallel operations in a computer system when one or more conditional dependencies may be present, where a given conditional dependency includes a dependency associated with at least two data elements based on a pair of conditions. During operation, a processor receives instructions for generating one or more stop indicators based on actual dependencies, where a given stop indicator indicates the position of a given actual dependency that can lead to different results when the data elements are processed in parallel than when the data elements are processed sequentially, and where the given actual dependency occurs when the pair of conditions matches one or more criteria. Then, the processor executes the instructions for generating the one or more stop indicators.
    Type: Application
    Filed: September 24, 2008
    Publication date: March 25, 2010
    Applicant: APPLE INC.
    Inventors: Jeffry E. Gonion, Keith E. Diefendorff
  • Publication number: 20100077183
    Abstract: Embodiments of a method are described for performing parallel operations in a computer system when one or more conditional dependencies may be present, where a given conditional dependency includes a dependency associated with at least two data elements based on a pair of conditions. During operation, a processor receives instructions for generating a vector of tracked positions of actual dependencies, where a given tracked position indicates the position of a given actual dependency, and where the given actual dependency occurs when the pair of conditions matches one or more criteria. Then, the processor executes the instructions for generating the vector of tracked positions.
    Type: Application
    Filed: September 24, 2008
    Publication date: March 25, 2010
    Applicant: APPLE INC.
    Inventors: Jeffry E. Gonion, Keith E. Diefendorff
  • Publication number: 20100058037
    Abstract: The described embodiments provide a processor for generating a result vector with shifted values. During operation, the processor receives a first input vector, a second input vector, and a control vector. When generating the result vector, the processor first captures a base value from a key element position in the second input vector. The processor then determines a number of bit positions to shift the base value using selected relevant elements in the first input vector. The processor then shifts the copy of the base value by the number of bit positions and writes the value into a corresponding element in the result vector. In addition, a predicate vector can be used to control the values that are written to the result vector.
    Type: Application
    Filed: August 14, 2009
    Publication date: March 4, 2010
    Applicant: APPLE INC.
    Inventors: Jeffry E. Gonion, Keith E. Diefendorff
  • Publication number: 20100049950
    Abstract: The described embodiments provide a processor for generating a result vector with summed values from a first input vector. During operation, the processor receives the first input vector, a second input vector, and a control vector. When generating the result vector, the processor first captures a base value from a key element in the second input vector. The processor then writes the sum of the base value and values from relevant elements in the first input vector into selected elements in the result vector. In addition, a predicate vector can be used to control the values that are written to the result vector.
    Type: Application
    Filed: August 14, 2009
    Publication date: February 25, 2010
    Applicant: APPLE INC.
    Inventors: Jeffry E. Gonion, Keith E. Diefendorff
  • Publication number: 20100049951
    Abstract: The described embodiments provide a processor for generating a result vector with multiplied values. During operation, the processor receives a first input vector, a second input vector, and a control vector. When generating the result vector, the processor first captures a base value from a key element position in the second input vector. The processor then writes the product of the base value and values from relevant elements in the first input vector into selected elements in the result vector. In addition, a predicate vector can be used to control the values that are written to the result vector.
    Type: Application
    Filed: August 14, 2009
    Publication date: February 25, 2010
    Applicant: APPLE INC.
    Inventors: Jeffry E. Gonion, Keith E. Diefendorff
  • Publication number: 20100042815
    Abstract: The described embodiments provide a system that executes program code. While executing program code, the processor encounters at least one vector instruction and at least one vector-control instruction. The vector instruction includes a set of elements, wherein each element is used to perform an operation for a corresponding iteration of a loop in the program code. The vector-control instruction identifies elements in the vector instruction that may be operated on in parallel without causing an error due to a runtime data dependency between the iterations of the loop. The processor then executes the loop by repeatedly executing the vector-control instruction to identify a next group of elements that can be operated on in the vector instruction and selectively executing the vector instruction to perform the operation for the next group of elements in the vector instruction, until the operation has been performed for all elements of the vector instruction.
    Type: Application
    Filed: April 7, 2009
    Publication date: February 18, 2010
    Applicant: APPLE INC.
    Inventors: Jeffry E. Gonion, Keith E. Diefendorff
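
The loop-partitioning idea in the abstract of publication 20100042815 can be illustrated in software. The sketch below is only an illustration under stated assumptions, not the patented hardware or its instruction set. It assumes a hypothetical loop body `a[w[i]] = a[r[i]] + 1`, and it assumes that the only hazard is an iteration reading a location written by an earlier iteration of the same group. The names `depends_on_earlier`, `next_group_end`, `run_partitioned`, and `run_sequential` are invented for this example. `next_group_end` stands in for the vector-control step that identifies the next group of elements that can be operated on together, and the gather-then-scatter body stands in for the selectively executed vector instruction.

```python
from typing import List, Tuple


def depends_on_earlier(i: int, group_start: int, r: List[int], w: List[int]) -> bool:
    """True if iteration i reads a location written by an earlier iteration
    of the current group (a read-after-write hazard)."""
    return any(r[i] == w[j] for j in range(group_start, i))


def next_group_end(start: int, r: List[int], w: List[int]) -> int:
    """Hypothetical stand-in for the vector-control step: return the exclusive
    end of the longest run of iterations beginning at `start` that is free of
    read-after-write hazards within the run."""
    end = start + 1
    while end < len(r) and not depends_on_earlier(end, start, r, w):
        end += 1
    return end


def run_partitioned(a: List[int], r: List[int], w: List[int]) -> Tuple[List[int], List[List[int]]]:
    """Execute a[w[i]] = a[r[i]] + 1 for all i, one safe group at a time.
    Returns the updated copy of `a` and the groups that were formed."""
    a = list(a)
    groups: List[List[int]] = []
    start = 0
    while start < len(r):
        end = next_group_end(start, r, w)
        group = list(range(start, end))
        # "Vector" step: perform every read of the group first ...
        gathered = [a[r[i]] + 1 for i in group]
        # ... then every write, in iteration order.
        for i, value in zip(group, gathered):
            a[w[i]] = value
        groups.append(group)
        start = end
    return a, groups


def run_sequential(a: List[int], r: List[int], w: List[int]) -> List[int]:
    """Reference: the same loop executed one iteration at a time."""
    a = list(a)
    for i in range(len(r)):
        a[w[i]] = a[r[i]] + 1
    return a
```

For instance, with `r = [0, 1, 2, 0, 3, 5]` and `w = [4, 5, 6, 7, 0, 1]`, iteration 5 reads index 5, which iteration 1 writes, so the sketch forms one group of five iterations followed by a group containing only iteration 5, and the result matches the sequential loop.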