Patents by Inventor Mark J. Charney

Mark J. Charney has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

Systems, Apparatuses, and Methods for Getting Even and Odd Data Elements

Publication number: 20170192780

Abstract: Embodiments of systems, apparatuses, and method for getting even or odd data elements are described. For example, in some embodiments, an apparatus includes a decoder to decode an instruction, wherein the instruction to include fields for a first source operand, a second source operand, and a destination operand; and execution circuitry to execute the decoded instruction to extract data elements from even data element positions of the first and second source operands and store the extracted data elements into the destination operand.

Type: Application

Filed: December 30, 2015

Publication date: July 6, 2017

Inventors: Robert Valentine, Elmoustapha Ould-Ahmed-Vall, Jason W. Brandt, Mark J. Charney, Ashish Jha, Milind B. Girkar, Bret L. Toll, Evgeny V. Stupachenko, Sergey Y. Ostanevich
Systems, Apparatuses, and Methods for Aggregate Gather and Stride

Publication number: 20170192782

Abstract: Embodiments of systems, apparatuses, and methods for aggregate gather and scatter are disclosed. In some embodiments, a decoder to decode an instruction, wherein the instruction to include fields for an index of memory address locations, an immediate, and a starting destination register operand and identifier of additional destination registers; and execution circuitry to execute the decoded instruction to gather, from memory at locations indicated by the index of memory locations, data elements and stores them in multiple destination registers in sizes dictated by the immediate are described.

Type: Application

Filed: December 30, 2015

Publication date: July 6, 2017

Inventors: Robert Valentine, Mark J. Charney, Elmoustapha Ould-Ahmed-Vall, Ashish Jha
Fused Multiply-Add (FMA) low functional unit

Publication number: 20170185379

Abstract: An example processor includes a register and a fused multiply-add (FMA) low functional unit. The register stores first, second, and third floating point (FP) values. The FMA low functional unit receives a request to perform an FMA low operation: multiplies the first FP value with the second FP value to obtain a first product value; adds the first product with the third FP value to generate a first result value; rounds the first result to generate a first FMA value; multiplies the first FP value with the second FP value to obtain a second product value; adds the second product value with the third FP value to generate a second result value; and subtracts the FMA value from the second result value to obtain a third result value, which can then be normalized and rounded (FMA low result) and sent the FMA low result to an application.

Type: Application

Filed: December 23, 2015

Publication date: June 29, 2017

Inventors: Cristina S. Anderson, Marius A. Cornea-Hasegan, Elmoustapha Ould-Ahmed-Vall, Robert Valentine, Jesus Corbal, Nikita Astafev, Mark J. Charney, Milind B. Girkar, Amit Gradstein, Simon Rubanovich, Zeev Sperber
FLOATING POINT (FP) ADD LOW INSTRUCTIONS FUNCTIONAL UNIT

Publication number: 20170185377

Abstract: An example processor includes a register and an ADD low functional unit. The register stores first, second, and third floating point (FP) values. The ADD low functional unit receives a request to perform an ADD low operation and, responsive to the request: adds the first FP value with the second FP value to obtain a first sum value; rounds the first sum value to generate an ADD value; adds the first FP value with the second FP value to obtain a second sum value; subtracts the ADD value from the second sum value to generate a difference value; normalizes the difference value to obtain a normalized difference value; rounds the normalized difference value to generate an ADD low value; and sends the ADD low value to an application.

Type: Application

Filed: December 23, 2015

Publication date: June 29, 2017

Inventors: Cristina S. Anderson, Marius A. Cornea-Hasegan, Elmoustapha Ould-Ahmed-Vall, Robert Valentine, Jesus Corbal, Nikita Astafev, Mark J. Charney, Milind B. Girkar, Amit Gradstein, Simon Rubanovich, Zeev Sperber
AGGREGATE SCATTER INSTRUCTIONS

Publication number: 20170177543

Abstract: An Aggregate Scatter instruction is described. A processor may include a memory interface and a register to store data elements of a data structure. The data elements may be contiguously stored in a first location in a memory accessible via the memory interface. The processor may further include a decoder to decode an aggregate scatter instruction specifying a store operation for the data structure and an execution unit to contiguously store the data elements to a second storage location in the memory in response to the decoded aggregate scatter instruction. The second storage location may be identified by a starting memory address of the second storage location.

Type: Application

Filed: December 22, 2015

Publication date: June 22, 2017

Inventors: Ashish Jha, Elmoustapha Ould-Ahmed-Vall, Robert Valentine, Mark J. Charney, Milind B. Girkar
Systems, apparatuses, and methods for performing conflict detection and broadcasting contents of a register to data element positions of another register

Patent number: 9665368

Abstract: Systems, apparatuses, and methods of performing in a computer processor broadcasting data in response to a single vector packed broadcasting instruction that includes a source writemask register operand, a destination vector register operand, and an opcode. In some embodiments, the data of the source writemask register is zero extended prior to broadcasting.

Type: Grant

Filed: September 28, 2012

Date of Patent: May 30, 2017

Assignee: Intel Corporation

Inventors: Christopher J. Hughes, Mark J. Charney, Jesus Corbal, Milind B. Girkar, Elmoustapha Ould-Ahmed_Vall, Bret L. Toll, Robert Valentine
Apparatus and method of improved permute instructions

Patent number: 9658850

Abstract: An apparatus is described having instruction execution logic circuitry. The instruction execution logic circuitry has input vector element routing circuitry to perform the following for each of three different instructions: for each of a plurality of output vector element locations, route into an output vector element location an input vector element from one of a plurality of input vector element locations that are available to source the output vector element. The output vector element and each of the input vector element locations are one of three available bit widths for the three different instructions. The apparatus further includes masking layer circuitry coupled to the input vector element routing circuitry to mask a data structure created by the input vector routing element circuitry. The masking layer circuitry is designed to mask at three different levels of granularity that correspond to the three available bit widths.

Type: Grant

Filed: December 23, 2011

Date of Patent: May 23, 2017

Assignee: Intel Corporation

Inventors: Elmoustapha Ould-Ahmed-Vall, Robert Valentine, Jesus Corbal, Bret L. Toll, Mark J. Charney, Zeev Sperber, Amit Gradstein
Apparatus and method of improved insert instructions

Patent number: 9619236

Abstract: An apparatus is described having instruction execution logic circuitry to execute first, second, third and fourth instruction. Both the first instruction and the second instruction insert a first group of input vector elements to one of multiple first non overlapping sections of respective first and second resultant vectors. The first group has a first bit width. Each of the multiple first non overlapping sections have a same bit width as the first group. Both the third instruction and the fourth instruction insert a second group of input vector elements to one of multiple second non overlapping sections of respective third and fourth resultant vectors. The second group has a second bit width that is larger than said first bit width. Each of the multiple second non overlapping sections have a same bit width as the second group.

Type: Grant

Filed: December 23, 2011

Date of Patent: April 11, 2017

Assignee: Intel Corporation

Inventors: Elmoustapha Ould-Ahmed-Vall, Robert Valentine, Jesus Corbal, Bret L. Toll, Mark J. Charney, Zeev Sperber, Amit Gradstein
DATA ELEMENT COMPARISON PROCESSORS, METHODS, SYSTEMS, AND INSTRUCTIONS

Publication number: 20170090924

Abstract: A processor includes a decode unit to decode an instruction that is to indicate a first source packed data operand that is to include at least four data elements, to indicate a second source packed data operand that is to include at least four data elements, and to indicate one or more destination storage locations. The execution unit, in response to the instruction, is to store at least one result mask operand in the destination storage location(s). The at least one result mask operand is to include a different mask element for each corresponding data element in one of the first and second source packed data operands in a same relative position. Each mask element is to indicate whether the corresponding data element in said one of the source packed data operands equals any of the data elements in the other of the source packed data operands.

Type: Application

Filed: September 26, 2015

Publication date: March 30, 2017

Applicant: Intel Corporation

Inventors: Asit K. Mishra, Edward T. Grochowski, Jonathan D. Pearce, Deborah T. Marr, Ehud Cohen, Elmoustapha OuId-Ahmed-Vall, Jesus Corbal San Adrian, Robert Valentine, Mark J. Charney, Christopher J. Hughes, Milind B. Girkar
Apparatus and method of improved extract instructions

Patent number: 9588764

Abstract: An apparatus is described that includes instruction execution circuitry to execute first, second, third, and fourth instructions, the first and second instructions select a first group of input vector elements from one of multiple first non-overlapping sections of respective first and second input vectors. Each of the multiple first non-overlapping sections have a same bit width as the first group. Both the third and fourth instructions select a second group of input vector elements from one of multiple second non overlapping sections of respective third and fourth input vectors. The second group has a second bit width that is larger than the first bit width. Each of multiple second non overlapping sections have a same bit width as the second group. The apparatus includes masking layer circuitry to mask the first and second groups at a first granularity and second granularity.

Type: Grant

Filed: December 23, 2011

Date of Patent: March 7, 2017

Assignee: Intel Corporation

Inventors: Elmoustapha Ould-Ahmed-Vall, Robert Valentine, Jesus Corbal, Bret L. Toll, Mark J. Charney, Zeev Sperber, Amit Gradstein
Systems, Apparatuses, and Methods for Performing Mask Bit Compression

Publication number: 20170024206

Abstract: Embodiments of systems, apparatuses, and methods for performing in a computer processor mask bit compression in response to a single mask bit compression instruction that includes a source writemask register operand, a destination writemask register operand, and an opcode are described.

Type: Application

Filed: May 31, 2016

Publication date: January 26, 2017

Inventors: Bret L. Toll, Robert Valentine, Jesus Corbal, Elmoustapha Ould-Ahmed-Vall, Mark J. Charney
Apparatus and method for performing permute operations

Patent number: 9513918

Abstract: An apparatus and method are described for permuting data elements with masking. For example, a method according to one embodiment includes the following operations: reading values from a mask data structure to determine whether masking is implemented for each data element of a destination operand; if masking not implemented for a particular data element, then selecting data elements from a first source operand and a second source operand based on index values stored in destination operand to be copied to data element positions within the destination operand, wherein any one of the data elements from either the first source operand and the second source operand may be copied to any one of the data element positions within the destination operand; and if masking is implemented for a particular data element of the destination operand, then performing a designated masking operation with respect to that particular data element.

Type: Grant

Filed: December 22, 2011

Date of Patent: December 6, 2016

Assignee: INTEL CORPORATION

Inventors: Elmoustapha Ould-Ahmed-Vall, Robert Valentine, Mostafa Hagog, Jesus Corbal, Bret L Toll, Mark J Charney, Tal Uliel, Zeev Sperber, Amit Gradstein
Fusible instructions and logic to provide OR-test and AND-test functionality using multiple test sources

Patent number: 9483266

Abstract: Fusible instructions and logic provide OR-test and AND-test functionality on multiple test sources. Some embodiments include a processor decode stage to decode a test instruction for execution, the instruction specifying first, second and third source data operands, and an operation type. Execution units, responsive to the decoded test instruction, perform one logical operation, according to the specified operation type, between data from the first and second source data operands, and perform a second logical operation between the data from the third source data operand and the result of the first logical operation to set a condition flag. Some embodiments generate the test instruction dynamically by fusing one logical instruction with a prior-art test instruction. Other embodiments generate the test instruction through a just-in-time compiler. Some embodiments also fuse the test instruction with a subsequent conditional branch instruction, and perform a branch according to how the condition flag is set.

Type: Grant

Filed: March 15, 2013

Date of Patent: November 1, 2016

Assignee: Intel Corporation

Inventors: Maxim Loktyukhin, Robert Valentine, Julian C. Horn, Mark J. Charney
Packed data operation mask comparison processors, methods, systems, and instructions

Patent number: 9442733

Abstract: Receive packed data operation mask comparison instruction indicating first packed data operation mask having first packed data operation mask bits and second packed data operation mask having second packed data operation mask bits. Each packed data operation mask bit of first mask corresponds to a packed data operation mask bit of second mask in corresponding position. Modify first flag to first value if bitwise AND of each packed data operation mask bit of first mask with each corresponding packed data operation mask bit of second mask is zero. Otherwise modify first flag to second value. Modify second flag to third value if bitwise AND of each packed data operation mask bit of first mask with bitwise NOT of each corresponding packed data operation mask bit of second mask is zero. Otherwise modify second flag to fourth value.

Type: Grant

Filed: December 11, 2015

Date of Patent: September 13, 2016

Assignee: Intel Corporation

Inventors: Bret L. Toll, Robert Valentine, Jesus Corbal, Elmoustapha Ould-Ahmed-Vall, Mark J. Charney
Packed two source inter-element shift merge processors, methods, systems, and instructions

Patent number: 9442731

Abstract: A processor includes a decoder to receive an instruction that indicates first and second source packed data operands and at least one shift count. An execution unit is operable, in response to the instruction, to store a result packed data operand. Each result data element includes a first least significant bit (LSB) portion of a first data element of a corresponding pair of data elements in a most significant bit (MSB) portion, and a second MSB portion of a second data element of the corresponding pair in a LSB portion. One of the first LSB portion of the first data element and the second MSB portion of the second data element has a corresponding shift count number of bits. The other has a number of bits equal to a size of a data element of the first source packed data minus the corresponding shift count.

Type: Grant

Filed: March 13, 2014

Date of Patent: September 13, 2016

Assignee: Intel Corporation

Inventors: Tal Uliel, Elmoustapha Ould-Ahmed-Vall, Robert Valentine, Mark J. Charney, Thomas Willhalm
Instruction execution that broadcasts and masks data values at different levels of granularity

Patent number: 9424327

Abstract: An apparatus is described that includes an execution unit to execute a first instruction and a second instruction. The execution unit includes input register space to store a first data structure to be replicated when executing the first instruction and to store a second data structure to be replicated when executing the second instruction. The first and second data structures are both packed data structures. Data values of the first packed data structure are twice as large as data values of the second packed data structure. The execution unit also includes replication logic circuitry to replicate the first data structure when executing the first instruction to create a first replication data structure, and, to replicate the second data structure when executing the second data instruction to create a second replication data structure.

Type: Grant

Filed: December 23, 2011

Date of Patent: August 23, 2016

Assignee: Intel Corporation

Inventors: Elmoustapha Ould-Ahmed-Vall, Robert Valentine, Jesus Corbal, Bret L. Toll, Mark J. Charney
Methods, apparatus, instructions, and logic to provide vector address conflict detection functionality

Patent number: 9411584

Abstract: Instructions and logic provide SIMD address conflict detection functionality. Some embodiments include processors with a register with a variable plurality of data fields, each of the data fields to store an offset for a data element in a memory. A destination register has corresponding data fields, each of these data fields to store a variable second plurality of bits to store a conflict mask having a mask bit for each offset. Responsive to decoding a vector conflict instruction, execution units compare the offset in each data field with every less significant data field to determine if they hold a matching offset, and in corresponding conflict masks in the destination register, set any mask bits corresponding to a less significant data field with a matching offset. Vector address conflict detection can be used with variable sized elements and to generate conflict masks to resolve dependencies in gather-modify-scatter SIMD operations.

Type: Grant

Filed: December 29, 2012

Date of Patent: August 9, 2016

Assignee: Intel Corporation

Inventors: Christopher J. Hughes, Elmoustapha Ould-Ahmed-Vall, Robert Valentine, Jesus Corbal, Brett L. Toll, Mark J. Charney, Milind B. Girkar
Vector address conflict resolution with vector population count functionality

Patent number: 9411592

Abstract: Instructions and logic provide SIMD address conflict resolution with vector population count functionality. Some embodiments include processors with a register with a variable plurality of data fields, each of the data fields to store a variable second plurality of bits. A destination register has corresponding data fields, each of these data fields to store a count of the number of bits set to one for corresponding data fields. Responsive to decoding a vector population count instruction, execution units count the number of bits set to one for each of data fields in the register, and store the counts in corresponding data fields of the first destination register. Vector population count instructions can be used with variable sized elements and conflict masks to generate iteration counts and completion masks to be used each iteration to resolve dependencies in gather-modify-scatter SIMD operations.

Type: Grant

Filed: December 29, 2012

Date of Patent: August 9, 2016

Assignee: Intel Corporation

Inventors: Robert Valentine, Mark J. Charney, Jesus Corbal, Milind B. Girkar, Christopher J. Hughes, Elmoustapha Ould-Ahmed-Vall, Brett L. Toll
METHODS, APPARATUS, INSTRUCTIONS AND LOGIC TO PROVIDE VECTOR PACKED TUPLE CROSS-COMPARISON FUNCTIONALITY

Publication number: 20160188336

Abstract: Instructions and logic provide SIMD vector packed tuple cross-comparison functionality. Some processor embodiments include first and second registers with a variable plurality of data fields, each of the data fields to store an element of a first data type. The processor executes a SIMD instruction for vector packed tuple cross-comparison in some embodiments, which for each data field of a portion of data fields in a tuple of the first register, compares its corresponding element with every element of a corresponding portion of data fields in a tuple of the second register and sets a mask bit corresponding to each element of the second register portion, in a bit-mask corresponding to each unmasked element of the corresponding first register portion, according to the corresponding comparison. In some embodiments bit-masks are shifted by corresponding elements in data fields of a third register. The comparison type is indicated by an immediate operand.

Type: Application

Filed: December 31, 2014

Publication date: June 30, 2016

Inventors: Robert Valentine, Christopher J. Hughes, Mark J. Charney, Zeev Sperber, Amit Gradstein, Simon Rubanovich, Elmoustapha Ould-Ahmed-Vall, Yuri Gebil
INSTRUCTION AND LOGIC TO PERFORM AN INVERSE CENTRIFUGE OPERATION

Publication number: 20160179548

Abstract: In one embodiment a processing device implements a set of instructions to perform an inverse centrifuge operation using vector or general purpose registers. The inverse centrifuge operation interleaves bits from opposite regions of a source and writes the interleaved bits to a destination. The instructions use a control mask where each bit with a mask value of one is obtained from one side of the source register or vector elements with a mask of zero are obtained from the opposing side.

Type: Application

Filed: December 22, 2014

Publication date: June 23, 2016

Inventors: ELMOUSTAPHA OULD-AHMED-VALL, Robert Valentine, Jesus Corbal, Mark J. Charney

prev … 6 7 8 9 10 11 12 next