Patents by Inventor Jeffry E. Gonion

Jeffry E. Gonion has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

Computation engine with extract instructions to minimize memory access

Patent number: 10831488

Abstract: In an embodiment, a computation engine may offload work from a processor (e.g. a CPU) and efficiently perform computations such as those used in LSTM and other workloads at high performance. In an embodiment, the computation engine may perform computations on input vectors from input memories in the computation engine, and may accumulate results in an output memory within the computation engine. The input memories may be loaded with initial vector data from memory, incurring the memory latency that may be associated with reading the operands. Compute instructions may be performed on the operands, generating results in an output memory. One or more extract instructions may be supported to move data from the output memory to the input memory, permitting additional computation on the data in the output memory without moving the results to main memory.

Type: Grant

Filed: August 20, 2018

Date of Patent: November 10, 2020

Assignee: Apple Inc.

Inventors: Eric Bainville, Jeffry E. Gonion, Ali Sazegari, Gerard R. Williams, III, Andrew J. Beaumont-Smith
Computation Engine that Operates in Matrix and Vector Modes

Publication number: 20200348934

Abstract: In an embodiment, a computation engine is configured to perform vector multiplications, producing either vector results or outer product (matrix) results. The instructions provided to the computation engine specify a matrix mode or a vector mode for the instructions. The computation engine performs the specified operation. The computation engine may perform numerous computations in parallel, in an embodiment. In an embodiment, the instructions may also specify an offset with the input memories, providing additional flexibility in the location of operands. More particularly, the computation engine may be configured to perform numerous multiplication operations in parallel and to accumulate results in a result memory, performing multiply-accumulate operations for each matrix/vector element in the targeted locations of the output memory.

Type: Application

Filed: July 14, 2020

Publication date: November 5, 2020

Inventors: Eric Bainville, Jeffry E. Gonion, Ali Sazegari, PhD, Gerard R. Williams, III
Systems and methods for performing memory compression

Patent number: 10769065

Abstract: Systems, apparatuses, and methods for efficiently moving data for storage and processing a compression unit within a processor includes multiple hardware lanes, selects two or more input words to compress, and for assigns them to two or more of the multiple hardware lanes. As each assigned input word is processed, each word is compared to an entry of a plurality of entries of a table. If it is determined that each of the assigned input words indexes the same entry of the table, the hardware lane with the oldest input word generates a single read request for the table entry and the hardware lane with the youngest input word generates a single write request for updating the table entry upon completing compression. Each hardware lane generates a compressed packet based on its assigned input word.

Type: Grant

Filed: June 10, 2019

Date of Patent: September 8, 2020

Assignee: Apple Inc.

Inventors: Ali Sazegari, Charles E. Tucker, Jeffry E. Gonion, Gerard R. Williams, III, Chris Cheng-Chieh Lee
Matrix Computation Engine

Publication number: 20200272464

Abstract: In an embodiment, a matrix computation engine is configured to perform matrix computations (e.g. matrix multiplications). The matrix computation engine may perform numerous matrix computations in parallel, in an embodiment. More particularly, the matrix computation engine may be configured to perform numerous multiplication operations in parallel on input matrix elements, generating resulting matrix elements. In an embodiment, the matrix computation engine may be configured to accumulate results in a result memory, performing multiply-accumulate operations for each matrix element of each matrix.

Type: Application

Filed: March 13, 2020

Publication date: August 27, 2020

Inventors: Eric Bainville, Tal Uliel, Erik Norden, Jeffry E. Gonion, Ali Sazegari
Computation engine that operates in matrix and vector modes

Patent number: 10754649

Abstract: In an embodiment, a computation engine is configured to perform vector multiplications, producing either vector results or outer product (matrix) results. The instructions provided to the computation engine specify a matrix mode or a vector mode for the instructions. The computation engine performs the specified operation. The computation engine may perform numerous computations in parallel, in an embodiment. In an embodiment, the instructions may also specify an offset with the input memories, providing additional flexibility in the location of operands. More particularly, the computation engine may be configured to perform numerous multiplication operations in parallel and to accumulate results in a result memory, performing multiply-accumulate operations for each matrix/vector element in the targeted locations of the output memory.

Type: Grant

Filed: July 24, 2018

Date of Patent: August 25, 2020

Assignee: Apple Inc.

Inventors: Eric Bainville, Jeffry E. Gonion, Ali Sazegari, Gerard R. Williams, III
Range Mapping of Input Operands for Transcendental Functions

Publication number: 20200241876

Abstract: In an embodiment, a processor (e.g. a CPU) may offload transcendental computation to a computation engine that may efficiently perform transcendental functions. The computation engine may implement a range instruction that may be included in a program being executed by the CPU. The CPU may dispatch the range instruction to the computation engine. The range instruction may take an input operand (that is to be evaluated in a transcendental function, for example) and may reference a range table that defines a set of ranges for the transcendental function. The range instruction may identify one of the set of ranges that includes the input operand. For example, the range instruction may output an interval number identifying which interval of an overall set of valid input values contains the input operand. In an embodiment, the range instruction may take an input vector operand and output a vector of interval identifiers.

Type: Application

Filed: April 13, 2020

Publication date: July 30, 2020

Inventors: O-Cheng Chang, Tal Uliel, Eric Bainville, Jeffry E. Gonion, Ali Sazegari
Computation Engine with Strided Dot Product

Publication number: 20200225958

Abstract: In an embodiment, a computation engine may perform dot product computations on input vectors. The dot product operation may have a first operand and a second operand, and the dot product may be performed on a subset of the vector elements in the first operand and each of the vector elements in the second operand. The subset of vector elements may be separated in the first operand by a stride that skips one or more elements between each element to which the dot product operation is applied. More particularly, in an embodiment, the input operands of the dot product operation may be a first vector having second vectors as elements, and the stride may select a specified element of each second vector.

Type: Application

Filed: April 1, 2020

Publication date: July 16, 2020

Inventors: Tal Uliel, Eric Bainville, Jeffry E. Gonion, Ali Sazegari, PhD
INDIRECT BRANCH PREDICTOR SECURITY PROTECTION

Publication number: 20200192672

Abstract: A system and method for efficiently protecting branch prediction information. In various embodiments, a computing system includes at least one processor with a branch predictor storing branch target addresses and security tags in a table. The security tag includes one or more components of machine context. When the branch predictor receives a portion of a first program counter of a first branch instruction, and hits on a first table entry during an access, the branch predictor reads out a first security tag. The branch predictor compares one or more components of machine context of the first security tag to one or more components of machine context of the first branch instruction. When there is at least one mismatch, the branch prediction information of the first table entry is not used. Additionally, there is no updating of any branch prediction training information of the first table entry.

Type: Application

Filed: December 14, 2018

Publication date: June 18, 2020

Inventors: Jeffry E. Gonion, Ian D. Kountanis, Conrado Blasco, Steven Andrew Myers, Yannick L. Sierra
INDIRECT BRANCH PREDICTOR SECURITY PROTECTION

Publication number: 20200192673

Abstract: Techniques are disclosed relating to protecting branch prediction information. In various embodiments, an integrated circuit includes branch prediction logic having a table that maintains a plurality of entries storing encrypted target address information for branch instructions. The branch prediction logic is configured to receive machine context information for a branch instruction having a target address being predicted by the branch prediction logic, the machine context information including a program counter associated with the branch instruction. The branch prediction logic is configured to use the machine context information to decrypt encrypted target address information stored in one of the plurality of entries identified based on the program counter.

Type: Application

Filed: October 25, 2019

Publication date: June 18, 2020

Inventors: Steven A. Myers, Jeffry E. Gonion, Yannick L. Sierra, Thomas Icart
Computation engine with strided dot product

Patent number: 10642620

Abstract: In an embodiment, a computation engine may perform dot product computations on input vectors. The dot product operation may have a first operand and a second operand, and the dot product may be performed on a subset of the vector elements in the first operand and each of the vector elements in the second operand. The subset of vector elements may be separated in the first operand by a stride that skips one or more elements between each element to which the dot product operation is applied. More particularly, in an embodiment, the input operands of the dot product operation may be a first vector having second vectors as elements, and the stride may select a specified element of each second vector.

Type: Grant

Filed: April 5, 2018

Date of Patent: May 5, 2020

Assignee: Apple Inc.

Inventors: Tal Uliel, Eric Bainville, Jeffry E. Gonion, Ali Sazegari
Matrix computation engine

Patent number: 10592239

Abstract: In an embodiment, a matrix computation engine is configured to perform matrix computations (e.g. matrix multiplications). The matrix computation engine may perform numerous matrix computations in parallel, in an embodiment. More particularly, the matrix computation engine may be configured to perform numerous multiplication operations in parallel on input matrix elements, generating resulting matrix elements. In an embodiment, the matrix computation engine may be configured to accumulate results in a result memory, performing multiply-accumulate operations for each matrix element of each matrix.

Type: Grant

Filed: May 28, 2019

Date of Patent: March 17, 2020

Assignee: Apple Inc.

Inventors: Eric Bainville, Tal Uliel, Erik Norden, Jeffry E. Gonion, Ali Sazegari
Computation Engine that Operates in Matrix and Vector Modes

Publication number: 20200034145

Abstract: In an embodiment, a computation engine is configured to perform vector multiplications, producing either vector results or outer product (matrix) results. The instructions provided to the computation engine specify a matrix mode or a vector mode for the instructions. The computation engine performs the specified operation. The computation engine may perform numerous computations in parallel, in an embodiment. In an embodiment, the instructions may also specify an offset with the input memories, providing additional flexibility in the location of operands. More particularly, the computation engine may be configured to perform numerous multiplication operations in parallel and to accumulate results in a result memory, performing multiply-accumulate operations for each matrix/vector element in the targeted locations of the output memory.

Type: Application

Filed: July 24, 2018

Publication date: January 30, 2020

Inventors: Eric Bainville, Jeffry E. Gonion, Ali Sazegari, Gerard R. Williams, III
Computation Engine with Upsize/Interleave and Downsize/Deinterleave Options

Publication number: 20190310854

Abstract: In an embodiment, a computation engine may perform computations on input vectors having vector elements of a first precision and data type. The computation engine may convert the vector elements from the first precision to a second precision and may also interleave the vector elements as specified by an instruction issued by the processor to the computation engine. The interleave may be based on a ratio of a result precision and the second precision. An extract instruction may be supported to extract results from the computations and convert and deinterleave the vector elements to to provide a compact result in a desired order.

Type: Application

Filed: April 5, 2018

Publication date: October 10, 2019

Inventors: Eric Bainville, Tal Uliel, Jeffry E. Gonion, Ali Sazegari, Erik K. Norden
Computation Engine with Strided Dot Product

Publication number: 20190310855

Abstract: In an embodiment, a computation engine may perform dot product computations on input vectors. The dot product operation may have a first operand and a second operand, and the dot product may be performed on a subset of the vector elements in the first operand and each of the vector elements in the second operand. The subset of vector elements may be separated in the first operand by a stride that skips one or more elements between each element to which the dot product operation is applied. More particularly, in an embodiment, the input operands of the dot product operation may be a first vector having second vectors as elements, and the stride may select a specified element of each second vector.

Type: Application

Filed: April 5, 2018

Publication date: October 10, 2019

Inventors: Tal Uliel, Eric Bainville, Jeffry E. Gonion, Ali Sazegari
SYSTEMS AND METHODS FOR PERFORMING MEMORY COMPRESSION

Publication number: 20190294541

Abstract: Systems, apparatuses, and methods for efficiently moving data for storage and processing are described. In various embodiments, a compression unit within a processor includes multiple hardware lanes, selects two or more input words to compress, and for assigns them to two or more of the multiple hardware lanes. As each assigned input word is processed, each word is compared to an entry of a plurality of entries of a table. If it is determined that each of the assigned input words indexes the same entry of the table, the hardware lane with the oldest input word generates a single read request for the table entry and the hardware lane with the youngest input word generates a single write request for updating the table entry upon completing compression. Each hardware lane generates a compressed packet based on its assigned input word.

Type: Application

Filed: June 10, 2019

Publication date: September 26, 2019

Inventors: Ali Sazegari, Charles E. Tucker, Jeffry E. Gonion, Gerard R. Williams, III, Chris Cheng-Chieh Lee
Matrix Computation Engine

Publication number: 20190294441

Abstract: In an embodiment, a matrix computation engine is configured to perform matrix computations (e.g. matrix multiplications). The matrix computation engine may perform numerous matrix computations in parallel, in an embodiment. More particularly, the matrix computation engine may be configured to perform numerous multiplication operations in parallel on input matrix elements, generating resulting matrix elements. In an embodiment, the matrix computation engine may be configured to accumulate results in a result memory, performing multiply-accumulate operations for each matrix element of each matrix.

Type: Application

Filed: May 28, 2019

Publication date: September 26, 2019

Inventors: Eric Bainville, Tal Uliel, Erik Norden, Jeffry E. Gonion, Ali Sazegari
Return-oriented programming (ROP)/jump oriented programming (JOP) attack protection

Patent number: 10409600

Abstract: In an embodiment, a processor includes hardware circuitry and/or supports instructions which may be used to detect that a return address or jump address has been modified since it was written to memory. In response to detecting the modification, the processor may be configured to signal an exception or otherwise initiate error handling to prevent execution at the modified address. In an embodiment, the processor may perform a cryptographic sign operation on the return address/jump address before writing the signed return address/jump address to memory and the signature may be verified before the to address is used as a return target or jump target. Security of the system may be improved by foiling ROP/JOP attacks.

Type: Grant

Filed: July 5, 2016

Date of Patent: September 10, 2019

Assignee: Apple Inc.

Inventors: Yannick L. Sierra, Jeffry E. Gonion, Thomas Roche, Jerrold V. Hauck
Range Mapping of Input Operands for Transcendental Functions

Publication number: 20190250917

Abstract: In an embodiment, a computation engine may offload a processor (e.g. a CPU) and efficiently perform transcendental functions. The computation engine may implement a range instruction that may be included in a program being executed by the CPU. The CPU may dispatch the range instruction to the computation engine. The range instruction may take an input operand (that is to be evaluated in a transcendental function, for example) and may reference a range table that defines a set of ranges for the transcendental function. The range instruction may identify one of the set of ranges that includes the input operand. For example, the range instruction may output an interval number identifying which interval of an overall set of valid input values contains the input operand. In an embodiment, the range instruction may take an input vector operand and output a vector of interval identifiers.

Type: Application

Filed: February 14, 2018

Publication date: August 15, 2019

Inventors: O-Cheng Chang, Tal Uliel, Eric Bainville, Jeffry E. Gonion, Ali Sazegari
Matrix computation engine

Patent number: 10346163

Abstract: In an embodiment, a matrix computation engine is configured to perform matrix computations (e.g. matrix multiplications). The matrix computation engine may perform numerous matrix computations in parallel, in an embodiment. More particularly, the matrix computation engine may be configured to perform numerous multiplication operations in parallel on input matrix elements, generating resulting matrix elements. In an embodiment, the matrix computation engine may be configured to accumulate results in a result memory, performing multiply-accumulate operations for each matrix element of each matrix.

Type: Grant

Filed: November 1, 2017

Date of Patent: July 9, 2019

Assignee: Apple Inc.

Inventors: Eric Bainville, Tal Uliel, Erik Norden, Jeffry E. Gonion, Ali Sazegari
Systems and methods for performing memory compression

Patent number: 10331558

Abstract: Systems, apparatuses, and methods for efficiently moving data for storage and processing. A compression unit within a processor includes multiple hardware lanes, selects two or more input words to compress, and for assigns them to two or more of the multiple hardware lanes. As each assigned input word is processed, each word is compared to an entry of a plurality of entries of a table. If it is determined that each of the assigned input words indexes the same entry of the table, the hardware lane with the oldest input word generates a single read request for the table entry and the hardware lane with the youngest input word generates a single write request for updating the table entry upon completing compression. Each hardware lane generates a compressed packet based on its assigned input word.

Type: Grant

Filed: July 28, 2017

Date of Patent: June 25, 2019

Assignee: Apple Inc.

Inventors: Ali Sazegari, Charles E. Tucker, Jeffry E. Gonion, Gerard R. Williams, III, Chris Cheng-Chieh Lee

prev 1 2 3 4 5 6 … next