Masking To Control An Access To Data In Vector Register Patents (Class 712/5)

Adaptive prefetcher for shared system cache

Patent number: 11994993

Abstract: An adaptive prefetcher for a shared system cache of a processing system including multiple requestors having a cache miss monitor and a prefetch controller. The cache miss monitor monitors requests for information from memory and identifies one of the requestors for which an identified cache line is requested. The prefetch controller submits an adaptive request for a subsequent cache line. The subsequent cache line may be determined based on a latency comparison between a loop latency (LL) of the prefetch controller and a stream latency (SL) of the identified requestor. A latency memory may be included that stores stream latencies for the requestors. The latency comparison may be used to determine how many cache lines to skip relative to the identified cache line, such as according to SL*SK < LL ≤ SL*(SK+1), in which SK is the number of cache lines to skip.

Type: Grant

Filed: March 15, 2022

Date of Patent: May 28, 2024

Assignee: NXP B.V.

Inventors: Xiao Sun, Xiaotao Chen, Rohit Kumar Kaul
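
The skip-count relation in the abstract above can be illustrated with a short sketch. It assumes the relation is read as SL*SK < LL ≤ SL*(SK+1) and that both latencies are positive integers in the same unit (for example, clock cycles). The function name `lines_to_skip` is hypothetical and is not taken from the patent.

```python
def lines_to_skip(stream_latency: int, loop_latency: int) -> int:
    """Return the SK that satisfies SL*SK < LL <= SL*(SK+1).

    Assumption: both latencies are positive integers in the same unit.
    """
    if stream_latency <= 0 or loop_latency <= 0:
        raise ValueError("latencies must be positive integers")
    # Integer form of ceil(LL / SL) - 1, which avoids floating-point rounding.
    return (loop_latency - 1) // stream_latency
```

For instance, with SL = 10 and LL = 25 the sketch yields SK = 2, because 20 < 25 ≤ 30.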

Single instruction multiple data (SIMD) instruction generation and processing method and related device

Patent number: 11934837

Abstract: An SIMD instruction generation and processing method and a related device are provided. The method may include: obtaining a length of each loop dimension of a first tensor formula; selecting, from a plurality of groups of information about a first SIMD instruction model based on the length of each loop dimension of a first tensor formula, information about a second SIMD instruction model matching the first tensor formula; generating, based on a length of at least one loop dimension of the first tensor formula and the second SIMD instruction model, a first SIMD instruction obtained after the first tensor formula is converted. The information about a second SIMD instruction model is selected from the plurality of groups of information about a first SIMD instruction model based on the length of each loop dimension of the tensor formula.

Type: Grant

Filed: September 12, 2022

Date of Patent: March 19, 2024

Assignee: HUAWEI TECHNOLOGIES CO., LTD.

Inventors: Chen Wu, Yifan Lin, Xiaoqiang Dan

Mask patterns generated in memory from seed vectors

Patent number: 11775296

Abstract: The present disclosure includes apparatuses and methods related to mask patterns generated in memory from seed vectors. An example method includes performing operations on a plurality of data units of a seed vector and generating, by performance of the operations, a vector element in a mask pattern.

Type: Grant

Filed: April 12, 2021

Date of Patent: October 3, 2023

Assignee: Micron Technology, Inc.

Inventor: Jeremiah J. Willcock

Instruction handling for accumulation of register results in a microprocessor

Patent number: 11132198

Abstract: A computer system, processor, and method for processing information is disclosed that includes at least one computer processor; a main register file associated with the at least one processor, the main register file having a plurality of entries for storing data, one or more write ports to write data to the main register file entries, and one or more read ports to read data from the main register file entries; one or more execution units including a dense math execution unit; and at least one accumulator register file having a plurality of entries for storing data. The results of the dense math execution unit in an aspect are written to the accumulator register file, preferably to the same accumulator register file entry multiple times, and the data from the accumulator register file is written to the main register file.

Type: Grant

Filed: August 29, 2019

Date of Patent: September 28, 2021

Assignee: International Business Machines Corporation

Inventors: Brian W. Thompto, Maarten J. Boersma, Andreas Wagner, Jose E. Moreira, Hung Q. Le, Silvia Melitta Mueller, Dung Q. Nguyen

Mask patterns generated in memory from seed vectors

Patent number: 10977033

Abstract: The present disclosure includes apparatuses and methods related to mask patterns generated in memory from seed vectors. An example method includes performing operations on a plurality of data units of a seed vector and generating, by performance of the operations, a vector element in a mask pattern.

Type: Grant

Filed: March 25, 2016

Date of Patent: April 13, 2021

Assignee: Micron Technology, Inc.

Inventor: Jeremiah J. Willcock

Processors, methods, systems, and instructions to consolidate data elements and generate index updates

Patent number: 10891131

Abstract: A processor includes a decode unit to decode an instruction that indicates a source packed data that includes data elements, and indicates a source mask that includes mask elements. Each of the mask elements corresponds to a different one of the data elements. Each of the mask elements is one of a masked mask element and an unmasked mask element. The processor also includes an execution unit coupled with the decode unit. The execution unit, in response to the instruction, is to store a result packed data. When the source packed data includes one or more masked data elements disposed within unmasked data elements, the result packed data includes the unmasked data elements consolidated together without the one or more masked data elements disposed within them. The execution unit is also to store a result in a second destination storage location that reflects a number of the unmasked data elements consolidated together.

Type: Grant

Filed: September 22, 2016

Date of Patent: January 12, 2021

Assignee: Intel Corporation

Inventors: Mohammad Ashraf Bhuiyan, Brian R. Nickerson
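
The consolidation described in the abstract above can be modeled in a few lines. This is only a sketch: it assumes a truthy mask element means "unmasked" and that unused trailing positions of the result are filled with zeros, neither of which the abstract specifies. The name `consolidate` is hypothetical.

```python
def consolidate(data, mask):
    """Pack unmasked elements together and report how many were kept.

    Assumptions: a truthy mask element marks an unmasked data element,
    and unused trailing result positions are zero-filled.
    """
    if len(data) != len(mask):
        raise ValueError("data and mask must have the same length")
    kept = [d for d, m in zip(data, mask) if m]
    result = kept + [0] * (len(data) - len(kept))
    # The count models the value written to the second destination.
    return result, len(kept)
```

The returned count could serve as an index update for the next write position in an output buffer.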

Processors, methods, systems, and instructions to generate sequences of integers in numerical order that differ by a constant stride

Patent number: 10866807

Abstract: A method of an aspect includes receiving an instruction indicating a destination storage location. A result is stored in the destination storage location in response to the instruction. The result includes a sequence of at least four non-negative integers in numerical order with all integers in consecutive positions differing by a constant stride of at least two. In an aspect, storing the result including the sequence of the at least four integers is performed without calculating the at least four integers using a result of a preceding instruction. Other methods, apparatus, systems, and instructions are disclosed.

Type: Grant

Filed: December 22, 2011

Date of Patent: December 15, 2020

Assignee: Intel Corporation

Inventors: Elmoustapha Ould-Ahmed-Vall, Seth Abraham, Robert Valentine, Zeev Sperber, Amit Gradstein
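
As a rough illustration of the result described in the abstract above, the following sketch builds such a sequence in software. The function name and the `start` parameter are hypothetical; the abstract only requires at least four non-negative integers with a constant stride of at least two.

```python
def stride_sequence(count: int, stride: int, start: int = 0) -> list:
    """Return `count` non-negative integers that differ by a constant stride.

    The bounds checked here mirror the abstract: at least four integers
    and a stride of at least two. `start` is an illustrative assumption.
    """
    if count < 4 or stride < 2 or start < 0:
        raise ValueError("need count >= 4, stride >= 2, start >= 0")
    return [start + i * stride for i in range(count)]
```

A sequence like this is typically useful as a vector of indices for strided memory access.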

Processor to perform a bit range isolation instruction

Patent number: 10656947

Abstract: Receiving an instruction indicating a source operand and a destination operand. Storing a result in the destination operand in response to the instruction. The result operand may have: (1) a first range of bits having a first end explicitly specified by the instruction in which each bit is identical in value to a bit of the source operand in a corresponding position; and (2) a second range of bits that all have a same value regardless of values of bits of the source operand in corresponding positions. Execution of the instruction may complete without moving the first range of the result relative to the bits of identical value in the corresponding positions of the source operand, regardless of the location of the first range of bits in the result. Execution units to execute such instructions, computer systems having processors to execute such instructions, and a machine-readable medium storing such an instruction are also disclosed.

Type: Grant

Filed: March 14, 2018

Date of Patent: May 19, 2020

Assignee: Intel Corporation

Inventors: Maxim Loktyukhin, Eric W. Mahurin, Bret L. Toll, Martin G. Dixon, Sean P. Mirkes, David L. Kreitzer, Elmoustapha Ould-Ahmed-Vall, Vinodh Gopal
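
One simple way to picture the behavior in the abstract above is a mask-and-keep operation with no shifting. The sketch assumes the preserved range starts at bit 0 and ends just below an explicitly given bit index, and that the remaining bits are all cleared to zero. The abstract allows other arrangements, and the name `isolate_low_bits` is hypothetical.

```python
def isolate_low_bits(source: int, end: int, width: int = 64) -> int:
    """Keep bits [0, end) of `source` in place and clear bits [end, width).

    Assumptions: the preserved range starts at bit 0, and the second
    range is filled with zeros. No bits are shifted.
    """
    if not 0 <= end <= width:
        raise ValueError("end must be between 0 and width")
    keep = (1 << end) - 1
    return source & keep & ((1 << width) - 1)
```

Because the kept bits stay in their original positions, the result can be compared directly against the source without realignment.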

Permuting in a matrix-vector processor

Patent number: 10614151

Abstract: A circuit comprises an input register configured to receive an input vector of elements, a control register configured to receive a control vector of elements, wherein each element of the control vector corresponds to a respective element of the input vector, and wherein each element specifies a permutation of a corresponding element of the input vector, and a permute execution circuit configured to generate an output vector of elements corresponding to a permutation of the input vector. Generating each element of the output vector comprises accessing, at the input register, a particular element of the input vector, accessing, at the control register, a particular element of the control vector corresponding to the particular element of the input vector, and outputting the particular element of the input vector as an element at a particular position of the output vector that is selected based on the particular element of the control vector.

Type: Grant

Filed: August 1, 2019

Date of Patent: April 7, 2020

Assignee: Google LLC

Inventors: Dong Hyuk Woo, Gregory Michael Thorson, Andrew Everett Phelps, Olivier Temam, Jonathan Ross, Christopher Aaron Clark
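
The permutation in the abstract above can be sketched as a scatter: each control element names the output position for its corresponding input element. The sketch assumes the control vector holds each output position exactly once; the function name `permute` is hypothetical.

```python
def permute(input_vec, control_vec):
    """Place input_vec[i] at output position control_vec[i].

    Assumption: control_vec is a valid permutation of range(len(input_vec)).
    """
    n = len(input_vec)
    if len(control_vec) != n or sorted(control_vec) != list(range(n)):
        raise ValueError("control vector must be a permutation of positions")
    output = [None] * n
    for value, position in zip(input_vec, control_vec):
        output[position] = value
    return output
```

For example, the control vector [2, 0, 3, 1] sends the first input element to position 2 and the second to position 0.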

Non-orthogonal superposition transmissions for multimedia broadcast multicast service (MBMS)

Patent number: 10574387

Abstract: Technology for an eNodeB operable to perform multiuser non-orthogonal superposition transmissions for multimedia broadcast multicast service (MBMS) is disclosed. The eNodeB can modulate a first physical multicast channel (PMCH) signal for MBMS with a first modulation and coding scheme (MCS). The eNodeB can modulate a second PMCH signal for MBMS with a second MCS. The eNodeB can multiplex the first PMCH signal and the second PMCH signal to form an aggregate PMCH signal. The eNodeB can transmit the aggregate PMCH signal to a plurality of UEs using multiuser non-orthogonal superposition for MBMS, wherein the first PMCH signal in the aggregate PMCH signal is transmitted using physical resource blocks (PRBs) that are partially or fully overlapped in time and frequency with PRBs of the second PMCH signal in the aggregate PMCH signal.

Type: Grant

Filed: October 30, 2015

Date of Patent: February 25, 2020

Assignee: INTEL CORPORATION

Inventors: Alexei Davydov, Vadim Sergeyev, Alexander Maltsev

Generating a mask vector for determining a processor instruction address using an instruction tag in a multi-slice processor

Patent number: 10528353

Abstract: Methods and apparatus for generating a mask vector for determining a processor instruction address using an instruction tag (ITAG) in a multi-slice processor including receiving a first ITAG value and an interrupt ITAG value; generating the mask vector divided into mask sections comprising a plurality of elements with unset flags; for each mask section: if the mask section comprises the first ITAG value, setting a flag of an element in the mask section corresponding to the first ITAG value; if the mask section comprises the interrupt ITAG value, setting a flag of an element in the mask section corresponding to the interrupt ITAG value; setting each flag of each element in the mask vector between the element in the mask vector corresponding to the first ITAG value and the element in the mask vector corresponding to the interrupt ITAG value; and providing the mask vector to an instruction fetch unit.

Type: Grant

Filed: May 24, 2016

Date of Patent: January 7, 2020

Assignee: International Business Machines Corporation

Inventors: David S. Levitan, Mehul Patel
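
A simplified software model of the mask vector in the abstract above is shown below. It ignores the division into mask sections and assumes ITAG values map directly to element positions without wrap-around, with the first ITAG not greater than the interrupt ITAG. The name `itag_mask_vector` is hypothetical.

```python
def itag_mask_vector(first_itag: int, interrupt_itag: int, size: int) -> list:
    """Set the flags from first_itag through interrupt_itag, inclusive.

    Assumptions: ITAGs index elements directly, there is no wrap-around,
    and first_itag <= interrupt_itag. Mask sections are not modeled.
    """
    if not 0 <= first_itag <= interrupt_itag < size:
        raise ValueError("expected 0 <= first_itag <= interrupt_itag < size")
    flags = [False] * size  # every element starts with an unset flag
    for position in range(first_itag, interrupt_itag + 1):
        flags[position] = True
    return flags
```

Counting the set flags gives the number of instructions spanned by the two ITAG values under these assumptions.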

Processor and method for tracking progress of gathering/scattering data element pairs in different cache memory banks

Patent number: 10387151

Abstract: Methods and apparatus are disclosed for accessing multiple data cache lines for scatter/gather operations. Embodiments of the apparatus may comprise address generation logic to generate an address from an index of a set of indices for each of a set of corresponding mask elements having a first value. Line or bank match ordering logic matches addresses in the same cache line or different banks, and orders an access sequence to permit a group of addresses in multiple cache lines and different banks. Address selection logic directs the group of addresses to corresponding different banks in a cache to access data elements in multiple cache lines corresponding to the group of addresses in a single access cycle. A disassembly/reassembly buffer orders the data elements according to their respective bank/register positions, and a gather/scatter finite state machine changes the values of corresponding mask elements from the first value to a second value.

Type: Grant

Filed: September 30, 2011

Date of Patent: August 20, 2019

Assignee: Intel Corporation

Inventors: Jonathan C. Hall, Sailesh Kottapalli, Andrew T. Forsyth

Packed data operation mask concatenation processors, methods, systems, and instructions

Patent number: 10372449

Abstract: A method of an aspect includes receiving a packed data operation mask concatenation instruction. The packed data operation mask concatenation instruction indicates a first source having a first packed data operation mask, indicates a second source having a second packed data operation mask, and indicates a destination. A result is stored in the destination in response to the packed data operation mask concatenation instruction. The result includes the first packed data operation mask concatenated with the second packed data operation mask. Other methods, apparatus, systems, and instructions are disclosed.

Type: Grant

Filed: February 27, 2017

Date of Patent: August 6, 2019

Assignee: Intel Corporation

Inventors: Bret L. Toll, Robert Valentine, Jesus Corbal San Andrian, Elmoustapha Ould-Ahmed-Vall, Mark J. Charney
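
Mask concatenation as described in the abstract above can be modeled with a shift and an OR. The sketch assumes both masks have the same bit width and that the first mask occupies the low-order bits of the result; the abstract does not fix that ordering, and `concat_masks` is a hypothetical name.

```python
def concat_masks(first: int, second: int, bits: int) -> int:
    """Concatenate two `bits`-wide masks into one 2*bits-wide mask.

    Assumption: `first` lands in the low-order bits and `second` in the
    high-order bits of the result.
    """
    limit = 1 << bits
    if not (0 <= first < limit and 0 <= second < limit):
        raise ValueError("each mask must fit in the given bit width")
    return (second << bits) | first
```

Two 8-bit masks combined this way could then predicate a single operation on sixteen elements.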

Systems, apparatuses, and methods for setting an output mask in a destination writemask register from a source write mask register using an input writemask and immediate

Patent number: 10372450

Abstract: Embodiments of systems, apparatuses, and methods for performing in a computer processor generation of a predicate mask based on vector comparison in response to a single instruction are described.

Type: Grant

Filed: July 11, 2017

Date of Patent: August 6, 2019

Assignee: Intel Corporation

Inventors: Victor W. Lee, Daehyun Kim, Tin-Fook Ngai, Jayashankar Bharadwaj, Albert Hartono, Sara Baghsorkhi, Nalini Vasudevan

Vector frequency expand instruction

Patent number: 10241792

Abstract: A processor core that includes a hardware decode unit and an execution engine unit. The hardware decode unit to decode a vector frequency expand instruction, wherein the vector frequency expand instruction includes a source operand and a destination operand, wherein the source operand specifies a source vector register that includes one or more pairs of a value and run length that are to be expanded into a run of that value based on the run length. The execution engine unit to execute the decoded vector frequency expand instruction, which causes a set of one or more source data elements in the source vector register to be expanded into a set of destination data elements comprising more elements than the set of source data elements and including at least one run of identical values which were run length encoded in the source vector register.

Type: Grant

Filed: December 30, 2011

Date of Patent: March 26, 2019

Assignee: Intel Corporation

Inventors: Elmoustapha Ould-Ahmed-Vall, Suleyman Sair, Kshitij A. Doshi, Charles Yount, Bret L. Toll
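
The expansion in the abstract above is essentially run-length decoding. A minimal sketch, assuming the source is given as (value, run length) pairs and ignoring any limit on destination register size; the name `frequency_expand` is hypothetical.

```python
def frequency_expand(pairs):
    """Expand (value, run_length) pairs into runs of repeated values.

    Assumption: the destination is large enough to hold every run.
    """
    expanded = []
    for value, run_length in pairs:
        if run_length < 0:
            raise ValueError("run length must be non-negative")
        expanded.extend([value] * run_length)
    return expanded
```

For instance, the pairs (7, 3) and (2, 1) expand to the four elements 7, 7, 7, 2.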

Apparatus and method of improved permute instructions with multiple granularities

Patent number: 9946540

Abstract: An apparatus is described having instruction execution logic circuitry. The instruction execution logic circuitry has input vector element routing circuitry to perform the following for each of three different instructions: for each of a plurality of output vector element locations, route into an output vector element location an input vector element from one of a plurality of input vector element locations that are available to source the output vector element. The output vector element and each of the input vector element locations are one of three available bit widths for the three different instructions. The apparatus further includes masking layer circuitry coupled to the input vector element routing circuitry to mask a data structure created by the input vector routing element circuitry. The masking layer circuitry is designed to mask at three different levels of granularity that correspond to the three available bit widths.

Type: Grant

Filed: May 22, 2017

Date of Patent: April 17, 2018

Assignee: INTEL CORPORATION

Inventors: Elmoustapha Ould-Ahmed-Vall, Robert Valentine, Jesus Corbal, Bret L. Toll, Mark J. Charney, Zeev Sperber, Amit Gradstein

Detecting markers in an encoded video signal

Patent number: 9854261

Abstract: A video decoding method is implemented by a computer having multiple parallel processing units. A stream of data elements is received, some of which encode video content. The stream comprises marker sequences, each marker sequence comprising a marker which does not encode video content. A known pattern of data elements occurs in each marker sequence. A respective part of the stream is supplied to each parallel processing unit. Each parallel processing unit processes the respective part of the stream, whereby multiple parts of the stream are processed in parallel, to detect whether any of the multiple parts matches the known pattern of data elements, thereby identifying the markers. The encoded video content is separated from the identified markers. The separated video content is decoded, and the decoded video content outputted on a display.

Type: Grant

Filed: January 6, 2015

Date of Patent: December 26, 2017

Assignee: MICROSOFT TECHNOLOGY LICENSING, LLC.

Inventors: Yongjun Wu, Chih-Lung Lin

Instruction and logic to provide stride-based vector load-op functionality with mask duplication

Patent number: 9804844

Abstract: Instructions and logic provide vector load-op and/or store-op with stride functionality. Some embodiments, responsive to an instruction specifying: a set of loads, a second operation, destination register, operand register, memory address, and stride length; execution units read values in a mask register, wherein fields in the mask register correspond to stride-length multiples from the memory address to data elements in memory. A first mask value indicates the element has not been loaded from memory and a second value indicates that the element does not need to be, or has already been loaded. For each having the first value, the data element is loaded from memory into the corresponding destination register location, and the corresponding value in the mask register is changed to the second value. Then the second operation is performed using corresponding data in the destination and operand registers to generate results. The instruction may be restarted after faults.

Type: Grant

Filed: September 26, 2011

Date of Patent: October 31, 2017

Assignee: Intel Corporation

Inventors: Elmoustapha Ould-Ahmed-Vall, Kshitij A. Doshi, Suleyman Sair, Charles R. Yount

Gather-op instruction to duplicate a mask and perform an operation on vector elements gathered via tracked offset-based gathering

Patent number: 9747101

Abstract: Instructions and logic provide vector scatter-op and/or gather-op functionality. In some embodiments, responsive to an instruction specifying: a gather and a second operation, a destination register, an operand register, and a memory address; execution units read values in a mask register, wherein fields in the mask register correspond to offset indices in the indices register for data elements in memory. A first mask value indicates the element has not been gathered from memory and a second value indicates that the element does not need to be, or has already been gathered. For each having the first value, the data element is gathered from memory into the corresponding destination register location, and the corresponding value in the mask register is changed to the second value. When all mask register fields have the second value, the second operation is performed using corresponding data in the destination and operand registers to generate results.

Type: Grant

Filed: September 26, 2011

Date of Patent: August 29, 2017

Assignee: Intel Corporation

Inventors: Elmoustapha Ould-Ahmed-Vall, Kshitij A. Doshi, Charles R. Yount, Suleyman Sair
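
The two phases in the abstract above (gather under a mask, then apply a second operation) can be sketched as follows. The sketch models memory as a Python list, treats `True` in the mask as the "not yet gathered" value, and takes the second operation as a callable; all names are hypothetical, and mask duplication is not modeled.

```python
import operator

def gather_op(memory, base, indices, mask, dest, operand, op=operator.add):
    """Gather pending elements, then combine dest with operand element-wise.

    Assumptions: mask[i] is True while element i still needs gathering,
    and `op` stands in for the instruction's second operation.
    """
    for i, pending in enumerate(mask):
        if pending:
            dest[i] = memory[base + indices[i]]
            mask[i] = False  # record progress so a restart skips this element
    # Every mask field now holds the second value, so the operation can run.
    return [op(d, o) for d, o in zip(dest, operand)]
```

Elements whose mask field already held the second value keep whatever the destination contained before the call.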

Systems, apparatuses, and methods for setting an output mask in a destination writemask register from a source write mask register using an input writemask and immediate

Patent number: 9703558

Abstract: Embodiments of systems, apparatuses, and methods for performing in a computer processor generation of a predicate mask based on vector comparison in response to a single instruction are described.

Type: Grant

Filed: December 23, 2011

Date of Patent: July 11, 2017

Assignee: Intel Corporation

Inventors: Victor W. Lee, Daehyun Kim, Tin-Fook Ngai, Jayashankar Bharadwaj, Albert Hartono, Sara Baghsorkhi, Nalini Vasudevan

Instruction and logic to provide vector loads with strides and masking functionality

Patent number: 9672036

Abstract: Instructions and logic provide vector loads and/or stores with stride and mask functionality. Some embodiments, responsive to an instruction specifying: a set of loads, destination register, mask register, memory address, and stride length; execution units read values in the mask register, wherein fields in the mask register correspond to stride-length multiples from the memory address to data elements in memory. A first mask value indicates the element has not been loaded from memory and a second value indicates that the element does not need to be, or has already been loaded. For each having the first value, the corresponding multiple of said stride length is generated according to the data field's position in the mask register to load the data element from memory into the corresponding destination register location, and the corresponding value in the mask register is changed to the second value. These instructions can restart after faults.

Type: Grant

Filed: September 26, 2011

Date of Patent: June 6, 2017

Assignee: Intel Corporation

Inventors: Elmoustapha Ould-Ahmed-Vall, Kshitij A. Doshi, Suleyman Sair, Charles R. Yount
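
The mask-tracked strided load in the abstract above can be modeled in software as shown below. Memory is a plain list, `True` in the mask stands for the "not yet loaded" value, and the address of element i is taken to be base + i * stride. These are illustrative assumptions, and `strided_masked_load` is a hypothetical name.

```python
def strided_masked_load(memory, base, stride, mask, dest):
    """Load pending elements at stride-length multiples from `base`.

    Assumption: mask[i] is True while element i still needs loading.
    The mask is updated after each load, so the progress made so far
    survives if the loop is interrupted and later restarted.
    """
    for i, pending in enumerate(mask):
        if pending:
            dest[i] = memory[base + i * stride]
            mask[i] = False
    return dest
```

Calling the function again with the partially cleared mask repeats none of the completed loads, which is the property that makes a restart after a fault safe in this model.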

Apparatus and method of improved permute instructions

Patent number: 9658850

Abstract: An apparatus is described having instruction execution logic circuitry. The instruction execution logic circuitry has input vector element routing circuitry to perform the following for each of three different instructions: for each of a plurality of output vector element locations, route into an output vector element location an input vector element from one of a plurality of input vector element locations that are available to source the output vector element. The output vector element and each of the input vector element locations are one of three available bit widths for the three different instructions. The apparatus further includes masking layer circuitry coupled to the input vector element routing circuitry to mask a data structure created by the input vector routing element circuitry. The masking layer circuitry is designed to mask at three different levels of granularity that correspond to the three available bit widths.

Type: Grant

Filed: December 23, 2011

Date of Patent: May 23, 2017

Assignee: Intel Corporation

Inventors: Elmoustapha Ould-Ahmed-Vall, Robert Valentine, Jesus Corbal, Bret L. Toll, Mark J. Charney, Zeev Sperber, Amit Gradstein

Apparatus and method including an instruction for performing a logical operation on a repeating data value generated based on data size and control parameter portions specified by the instruction

Patent number: 9619225

Abstract: A data processing apparatus comprises a processing circuit and instruction decoder. A bitfield manipulation instruction controls the processing apparatus to generate at least one result data element from corresponding first and second source data elements. Each result data element includes a portion corresponding to a bitfield of the corresponding first source data element. Bits of the result data element that are more significant than the inserted bitfield have a prefix value that is selected, based on a control value specified by the instruction, as one of a first prefix value having a zero value, a second prefix value having the value of a portion of the corresponding second source data element, and a third prefix value corresponding to a sign extension of the bitfield of the first source data element. Bitwise logical instructions are also described.

Type: Grant

Filed: October 8, 2015

Date of Patent: April 11, 2017

Assignee: ARM Limited

Inventors: David James Seal, Richard Roy Grisenthwaite, Nigel John Stephens

Apparatus and method of improved insert instructions

Patent number: 9619236

Abstract: An apparatus is described having instruction execution logic circuitry to execute first, second, third and fourth instructions. Both the first instruction and the second instruction insert a first group of input vector elements to one of multiple first non-overlapping sections of respective first and second resultant vectors. The first group has a first bit width. Each of the multiple first non-overlapping sections has a same bit width as the first group. Both the third instruction and the fourth instruction insert a second group of input vector elements to one of multiple second non-overlapping sections of respective third and fourth resultant vectors. The second group has a second bit width that is larger than said first bit width. Each of the multiple second non-overlapping sections has a same bit width as the second group.

Type: Grant

Filed: December 23, 2011

Date of Patent: April 11, 2017

Assignee: Intel Corporation

Inventors: Elmoustapha Ould-Ahmed-Vall, Robert Valentine, Jesus Corbal, Bret L. Toll, Mark J. Charney, Zeev Sperber, Amit Gradstein

Instruction and logic to provide vector compress and rotate functionality

Patent number: 9606961

Abstract: Instructions and logic provide vector compress and rotate functionality. Some embodiments, responsive to an instruction specifying: a vector source, a mask, a vector destination and destination offset, read the mask, and copy corresponding unmasked vector elements from the vector source to adjacent sequential locations in the vector destination, starting at the vector destination offset location. In some embodiments, the unmasked vector elements from the vector source are copied to adjacent sequential element locations modulo the total number of element locations in the vector destination. In some alternative embodiments, copying stops whenever the vector destination is full, and upon copying an unmasked vector element from the vector source to an adjacent sequential element location in the vector destination, the value of a corresponding field in the mask is changed to a masked value. Alternative embodiments zero elements of the vector destination, in which no element from the vector source is copied.

Type: Grant

Filed: October 30, 2012

Date of Patent: March 28, 2017

Assignee: Intel Corporation

Inventors: Tal Uliel, Elmoustapha Ould-Ahmed-Vall, Robert Valentine
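
The modulo variant described in the abstract above can be sketched as follows. The sketch assumes a truthy mask field selects an element for copying and that destination positions not written keep their previous contents; the name `compress_rotate` is hypothetical.

```python
def compress_rotate(source, mask, dest, offset):
    """Copy selected source elements to consecutive destination slots.

    Writing starts at `offset` and wraps modulo the destination length.
    Assumptions: a truthy mask field selects the element, and slots that
    are not written keep their old values.
    """
    n = len(dest)
    result = list(dest)
    position = offset
    for value, selected in zip(source, mask):
        if selected:
            result[position % n] = value
            position += 1
    return result
```

With a four-element destination and an offset of 2, the third selected element wraps around to position 0.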

Vector mask memory access instructions to perform individual and sequential memory access operations if an exception occurs during a full width memory access operation

Patent number: 9529592

Abstract: In one embodiment, logic is provided to receive and execute a mask move instruction to transfer a vector data element including a plurality of packed data elements from a source location to a destination location, subject to mask information for the instruction, such that only portions of the plurality of packed data elements are transferred to the destination location. Other embodiments are described and claimed.

Type: Grant

Filed: December 27, 2007

Date of Patent: December 27, 2016

Assignee: Intel Corporation

Inventors: Doron Orenstien, Zeev Sperber, Bob Valentine, Benny Eitan

Floating mask generation for network packet flow

Patent number: 9513926

Abstract: A tag mask generation method comprises receiving a section_selector flag indicating whether a tag mask for a section of a network packet is to be generated; receiving from a parser a parse information for the network packet, wherein the parse information includes a section_pointer that indicates a location of the section in the network packet; generating a pointer based on the section_pointer when the section_selector indicates that the tag mask for the section is to be generated; receiving a base mask for the section; and generating the tag mask via a shifter by shifting the base mask by the amount indicated by the pointer. The parse information may further include a section_pointer_valid flag indicating whether the section is included in the network packet, and the method may further comprise including the tag mask in a combined tag mask when the section_pointer_valid flag indicates that the section is included in the network packet.

Type: Grant

Filed: January 8, 2014

Date of Patent: December 6, 2016

Assignee: Cavium, Inc.

Inventors: Wilson Parkhurst Snyder, II, Nicholas New Jamba
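
The shifter step in the abstract above can be illustrated with a small sketch. It assumes the base mask is shifted toward higher bit positions by the pointer value and truncated to a fixed width; the shift direction, the default width, and the name `tag_mask` are illustrative assumptions.

```python
def tag_mask(base_mask: int, section_pointer: int,
             section_selector: bool, width: int = 64) -> int:
    """Shift the base mask by the section pointer when the section is selected.

    Assumptions: a left shift models the shifter, the result is truncated
    to `width` bits, and an unselected section contributes an empty mask.
    """
    if section_pointer < 0:
        raise ValueError("section_pointer must be non-negative")
    if not section_selector:
        return 0
    return (base_mask << section_pointer) & ((1 << width) - 1)
```

A combined tag mask could then be formed by OR-ing the per-section results for the sections that are actually present in the packet.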

Multi-element instruction with different read and write masks

Patent number: 9489196

Abstract: A method is described that includes reading a first read mask from a first register. The method also includes reading a first vector operand from a second register or memory location. The method also includes applying the read mask against the first vector operand to produce a set of elements for operation. The method also includes performing an operation on the set of elements. The method also includes creating an output vector by producing multiple instances of the operation's result. The method also includes reading a first write mask from a third register, the first write mask being different than the first read mask. The method also includes applying the write mask against the output vector to create a resultant vector. The method also includes writing the resultant vector to a destination register.

Type: Grant

Filed: December 23, 2011

Date of Patent: November 8, 2016

Assignee: Intel Corporation

Inventors: Mikhail Plotnikov, Andrey Naraikan, Elmoustapha Ould-Ahmed-Vall, Robert Valentine, Bret L. Toll, Jesus Corbal
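
The steps in the abstract above can be sketched end to end. The sketch assumes the operation is a reduction (a sum by default) and that destination elements not selected by the write mask keep their previous values; the abstract fixes neither choice, and the function name is hypothetical.

```python
def masked_reduce_broadcast(operand, read_mask, write_mask, dest, op=sum):
    """Reduce read-masked elements, replicate the result, apply a write mask.

    Assumptions: `op` is a reduction over the selected elements, and
    positions excluded by the write mask keep the old destination value.
    """
    selected = [x for x, r in zip(operand, read_mask) if r]
    result = op(selected)
    output = [result] * len(dest)  # multiple instances of the result
    return [o if w else d for o, w, d in zip(output, write_mask, dest)]
```

Keeping the read and write masks separate lets the set of input elements differ from the set of destination positions that receive the result.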

Instruction execution that broadcasts and masks data values at different levels of granularity

Patent number: 9424327

Abstract: An apparatus is described that includes an execution unit to execute a first instruction and a second instruction. The execution unit includes input register space to store a first data structure to be replicated when executing the first instruction and to store a second data structure to be replicated when executing the second instruction. The first and second data structures are both packed data structures. Data values of the first packed data structure are twice as large as data values of the second packed data structure. The execution unit also includes replication logic circuitry to replicate the first data structure when executing the first instruction to create a first replication data structure, and, to replicate the second data structure when executing the second data instruction to create a second replication data structure.

Type: Grant

Filed: December 23, 2011

Date of Patent: August 23, 2016

Assignee: Intel Corporation

Inventors: Elmoustapha Ould-Ahmed-Vall, Robert Valentine, Jesus Corbal, Bret L. Toll, Mark J. Charney

Vector move instruction controlled by read and write masks

Patent number: 9378182

Abstract: A processor executes a vector move instruction to move data elements from a second vector register to a first vector register under the control of a first mask register and a second mask register. A register file within the processor includes the first vector register, the second vector register, the first mask register and the second mask register. In response to the vector move instruction, execution circuitry in the processor is to replace a given number of target data elements in the first vector register with the given number of source data elements in the second vector register. Each source data element corresponds to a mask bit in the second mask register having a second bit value, and wherein each target data element corresponds to a mask bit in the first mask register having a first bit value.

Type: Grant

Filed: September 28, 2012

Date of Patent: June 28, 2016

Assignee: Intel Corporation

Inventors: Mikhail Plotnikov, Andrey Naraikin, Christopher Hughes
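
The two-mask move in patent 9378182 can be sketched as follows. This is a minimal model, not the claimed hardware: it assumes that a mask bit of 1 selects an element in both masks, that elements are paired in ascending index order, and that the move stops when either side runs out. The name `masked_vector_move` is hypothetical.

```python
def masked_vector_move(dst, src, write_mask, read_mask):
    """Move selected src elements into selected dst positions.

    Source elements are those whose read_mask bit is 1; target positions
    are those whose write_mask bit is 1. Elements are paired in ascending
    index order (the bit values and pairing order are assumptions).
    """
    sources = [v for v, m in zip(src, read_mask) if m]
    targets = [i for i, m in enumerate(write_mask) if m]
    result = list(dst)
    # zip stops at the shorter list, so only the common count is moved.
    for pos, value in zip(targets, sources):
        result[pos] = value
    return result
```
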
Method and apparatus for broadcast channel decoding

Patent number: 9369230

Abstract: A method and apparatus for decoding the LTE physical broadcast channel (PBCH). The transmissions are made by an evolved NodeB (eNB). At least one template symbol sequence representative of a potential transmission by the eNB over the PBCH is provided. A signal or signals transmitted over the PBCH by the eNB is received, the signal or signals indicative of a received symbol sequence. Correlation operations are performed for correlating the at least one template symbol sequence against the received symbol sequence. A representative symbol sequence, timing parameter, or both, is selected, based on the correlation operations. The representative symbol sequence is indicative of information transmitted by the eNB over the LTE PBCH. The timing parameter is indicative of timing of said information transmitted by the eNB.

Type: Grant

Filed: April 2, 2014

Date of Patent: June 14, 2016

Assignee: Sierra Wireless, Inc.

Inventors: Gustav Gerald Vos, Steven John Bennett, Lutz Hans-Joachim Lampe, Ghasem Naddafzadeh Shirazi
Application of sequence hopping and orthogonal covering codes to uplink reference signals

Patent number: 9281985

Abstract: Methods and apparatuses are provided for transmitting and receiving reference signals. A method includes transmitting a first parameter and a second parameter, which are cell specific parameters; transmitting a third parameter, which is a UE specific parameter; transmitting a CSI; and receiving at least one of a first reference signal, which is generated based on the CSI, for a PUSCH and a second reference signal for a PUCCH. The first reference signal is generated by not applying sequence hopping and group sequence hopping, regardless of values of the first parameter and the second parameter, if the third parameter indicates that the sequence hopping and the group sequence hopping are disabled. The second reference signal is generated by applying the group sequence hopping, if the first parameter indicates that the group sequence hopping is enabled and the third parameter indicates that the sequence hopping and the group sequence hopping are disabled.

Type: Grant

Filed: February 13, 2015

Date of Patent: March 8, 2016

Assignee: Samsung Electronics Co., Ltd

Inventors: Aris Papasakellariou, Joon Young Cho
Application of sequence hopping and orthogonal covering codes to uplink reference signals

Patent number: 9281863

Abstract: Methods and apparatuses are provided for transmitting and receiving reference signals. A method includes receiving first and second cell specific parameters; receiving a third UE specific parameter; acquiring a first reference signal for a PUSCH, based on the third parameter; acquiring a second reference signal for a PUCCH, based on the first parameter; and transmitting at least one of the first reference signal and the second reference signal. Sequence hopping and group sequence hopping are disabled for the first reference signal, regardless of values of the first parameter and the second parameter, if the third parameter indicates that the sequence hopping and the group sequence hopping are disabled. The group sequence hopping is applied to acquire the second reference signal, if the first parameter indicates that the group sequence hopping is enabled and the third parameter indicates that the sequence hopping and the group sequence hopping are disabled.

Type: Grant

Filed: February 13, 2015

Date of Patent: March 8, 2016

Assignee: Samsung Electronics Co., Ltd

Inventors: Aris Papasakellariou, Joon Young Cho
Application of sequence hopping and orthogonal covering codes to uplink reference signals

Patent number: 9281862

Abstract: Methods and apparatuses are provided for transmitting and receiving reference signals. A method includes receiving a first parameter, which is a cell specific parameter; receiving a second parameter, which is a UE specific parameter; acquiring a first reference signal for a PUSCH, based on the second parameter; acquiring a second reference signal for a PUCCH, based on the first parameter; and transmitting at least one of the first reference signal and the second reference signal. Group sequence hopping is not applied to acquire the first reference signal, if the first parameter indicates that the group sequence hopping is enabled and the second parameter indicates that the group sequence hopping is disabled. The group sequence hopping is applied to acquire the second reference signal, if the first parameter indicates that the group sequence hopping is enabled and the second parameter indicates that the group sequence hopping is disabled.

Type: Grant

Filed: February 13, 2015

Date of Patent: March 8, 2016

Assignee: Samsung Electronics Co., Ltd

Inventors: Aris Papasakellariou, Joon Young Cho
Application of sequence hopping and orthogonal covering codes to uplink reference signals

Patent number: 9281984

Abstract: Methods and apparatuses are provided for transmitting and receiving reference signals. A method includes receiving a first cell specific parameter; receiving a second UE specific parameter; receiving a CSI; acquiring a first reference signal for a PUSCH, based on the second parameter and the CSI; acquiring a second reference signal for a PUCCH, based on the first parameter; and transmitting at least one of the first reference signal and the second reference signal. Group sequence hopping is not applied to acquire the first reference signal, if the first parameter indicates that the group sequence hopping is enabled and the second parameter indicates that the group sequence hopping is disabled. The group sequence hopping is applied to acquire the second reference signal, if the first parameter indicates that the group sequence hopping is enabled and the second parameter indicates that the group sequence hopping is disabled.

Type: Grant

Filed: February 13, 2015

Date of Patent: March 8, 2016

Assignee: Samsung Electronics Co., Ltd

Inventors: Aris Papasakellariou, Joon Young Cho
Speculative non-faulting loads and gathers

Patent number: 9189236

Abstract: According to one embodiment, a processor includes an instruction decoder to decode an instruction to read a plurality of data elements from memory, the instruction having a first operand specifying a storage location, a second operand specifying a bitmask having one or more bits, each bit corresponding to one of the data elements, and a third operand specifying a memory address storing a plurality of data elements. The processor further includes an execution unit coupled to the instruction decoder, in response to the instruction, to read one or more data elements speculatively, based on the bitmask specified by the second operand, from a memory location based on the memory address indicated by the third operand, and to store the one or more data elements in the storage location indicated by the first operand.

Type: Grant

Filed: December 21, 2012

Date of Patent: November 17, 2015

Assignee: Intel Corporation

Inventors: Jayashankar Bharadwaj, Nalini Vasudevan, Victor W. Lee, Sara S. Baghsorkhi, Albert Hartono, Daehyun Kim
VECTOR GENERATE MASK INSTRUCTION

Publication number: 20150143075

Abstract: A Vector Generate Mask instruction. For each element in the first operand, a bit mask is generated. The mask includes bits set to a selected value starting at a position specified by a first field of the instruction and ending at a position specified by a second field of the instruction.

Type: Application

Filed: December 5, 2014

Publication date: May 21, 2015

Inventors: Jonathan D. Bradbury, Robert F. Enenkel, Eric M. Schwarz, Timothy J. Slegel
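
The mask generation in publication 20150143075 can be illustrated with a short sketch. It assumes that the selected value is 1, that bit 0 is the most significant bit of each element, that the start and end positions are inclusive, and that start does not exceed end. None of these details are stated in the abstract, and `vector_generate_mask` is a hypothetical name.

```python
def vector_generate_mask(num_elements, start, end, width=32):
    """Return a list with one identical bit mask per element.

    Bits start..end (inclusive) are set to 1 and all others to 0. Bit 0 is
    taken to be the most significant bit, and start <= end is required;
    both are assumptions of this sketch.
    """
    if not 0 <= start <= end < width:
        raise ValueError("require 0 <= start <= end < width")
    run = (1 << (end - start + 1)) - 1   # contiguous run of ones
    mask = run << (width - 1 - end)      # shift so the run ends at bit `end`
    return [mask] * num_elements
```

With 8-bit elements, `vector_generate_mask(2, 2, 5, width=8)` returns `[0x3C, 0x3C]`, that is, `0b00111100` in each element.
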
Scalar readXF instruction for processing vectors

Patent number: 9009528

Abstract: The described embodiments include a processor that handles faults. The processor first receives an input vector, a control vector, and a predicate vector, each vector comprising a plurality of elements. Then, for a first element of the input vector for which corresponding elements of the control vector and the predicate vector are active, the processor performs a scalar read operation using an address from the element of the input vector. When a fault condition is encountered while performing the read operation, the processor determines if the element is a first element where a corresponding element of the control vector is active. If so (i.e., if the element is a first element where a corresponding element of the control vector is active), the processor processes the fault. Otherwise, the processor masks the fault for the element.

Type: Grant

Filed: September 5, 2012

Date of Patent: April 14, 2015

Assignee: Apple Inc.

Inventor: Jeffry E. Gonion
Confirm instruction for processing vectors

Patent number: 8938642

Abstract: The described embodiments include a processor with a fault status register (FSR) that executes a Confirm instruction. In these embodiments, when executing the Confirm instruction, the processor receives a predicate vector that includes N elements. For a first set of bit positions in the FSR for which corresponding elements of the predicate vector are active, the processor determines if at least one of the first set of bit positions in the FSR holds a predetermined value. When at least one of the first set of bit positions in the FSR holds the predetermined value, the processor causes a fault in the processor.

Type: Grant

Filed: May 23, 2012

Date of Patent: January 20, 2015

Assignee: Apple Inc.

Inventor: Jeffry E. Gonion
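
The check performed by the Confirm instruction in patent 8938642 can be modelled in a few lines. In this sketch the fault status register is a list of bits, the predetermined value defaults to 1, and a Python exception stands in for the processor fault; all three choices, and the names `confirm` and `VectorFault`, are assumptions.

```python
class VectorFault(Exception):
    """Stand-in for a processor fault (hypothetical name)."""


def confirm(fsr, predicate, fault_value=1):
    """Raise VectorFault if any predicated FSR position holds fault_value.

    fsr:       list of bits modelling the fault status register
    predicate: list of 0/1 flags; only positions with 1 are examined
    """
    for bit, active in zip(fsr, predicate):
        if active and bit == fault_value:
            raise VectorFault("fault recorded in an active FSR position")
    return None
```
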
APPARATUS AND METHOD TO REVERSE AND PERMUTE BITS IN A MASK REGISTER

Publication number: 20150006847

Abstract: An apparatus and method are described for performing a bit reversal and permutation on mask values. For example, a processor is described to execute an instruction to perform the operations of: reading a plurality of mask bits stored in a source mask register, the mask bits associated with vector data elements of a vector register; and performing a bit reversal operation to copy each mask bit from a source mask register to a destination mask register, wherein the bit reversal operation causes bits from the source mask register to be reversed within the destination mask register resulting in a symmetric, mirror image of the original bit arrangement.

Type: Application

Filed: June 27, 2013

Publication date: January 1, 2015

Inventors: Elmoustapha OULD-AHMED-VALL, Robert VALENTINE
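
The bit reversal described in publication 20150006847 amounts to mirroring the mask bits. A minimal sketch, assuming the mask is held in a Python integer and that `width` gives the number of mask bits (the name `reverse_mask` and the default width of 16 are illustrative only):

```python
def reverse_mask(mask, width=16):
    """Return mask with its low `width` bits mirrored (bit i -> bit width-1-i)."""
    result = 0
    for i in range(width):
        if (mask >> i) & 1:
            result |= 1 << (width - 1 - i)
    return result
```

For a 4-bit mask, `reverse_mask(0b1011, 4)` returns `0b1101`, and applying the function twice restores the original value.
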
INSTRUCTION AND LOGIC TO PROVIDE VECTOR BLEND AND PERMUTE FUNCTIONALITY

Publication number: 20140372727

Abstract: Vector blend and permute functionality are provided, responsive to instructions specifying: a destination vector register comprising fields to store vector elements, a first vector register, a vector element size, a second vector register, and a third operand. Indices are read from fields in the second register. Each index has a first selector portion and a second selector portion. Corresponding unmasked vector elements are stored to fields of the destination register, wherein each vector element, responsive to the respective first selector portion having a first value, is copied to an intermediate vector from a corresponding data field of the first register, and responsive to the respective first selector portion having a second value, is copied to the intermediate vector from a corresponding data field of the third operand. Then unmasked data fields of the destination are replaced by data fields in the intermediate vector indexed by the corresponding second selector portions.

Type: Application

Filed: December 23, 2011

Publication date: December 18, 2014

Applicant: INTEL CORPORATION

Inventors: Robert Valentine, Bret L. Toll, Jesus Corbal, Jeff G. Wiedemeier, Sridhar Samudrala
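
The two-step blend and permute in publication 20140372727 can be sketched as below. The abstract does not give the index layout, so this model assumes the first selector portion is a single bit at position `select_bit`, that a selector of 0 picks the first register, that the second selector portion is the bits below it, and that a mask flag of 1 means the destination field is unmasked (written). `blend_permute` is a hypothetical name.

```python
def blend_permute(dst, first, third, indices, mask, select_bit):
    """Blend `first` and `third` element-wise, then permute the blend into dst.

    Each index is split into a first selector portion (the bit at position
    select_bit) and a second selector portion (the bits below it). The bit
    layout, and the choice that selector 0 picks `first`, are assumptions.
    """
    low_bits = (1 << select_bit) - 1
    # Step 1: build the intermediate vector by blending per position.
    intermediate = [
        third[i] if (idx >> select_bit) & 1 else first[i]
        for i, idx in enumerate(indices)
    ]
    # Step 2: unmasked destination fields take the intermediate element
    # chosen by the second selector portion; masked fields are unchanged.
    return [
        intermediate[idx & low_bits] if m else old
        for old, idx, m in zip(dst, indices, mask)
    ]
```
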
Read XF instruction for processing vectors

Patent number: 8862932

Abstract: The described embodiments include a processor that handles faults. The processor first receives a first input vector, a control vector, and a predicate vector, each vector comprising a plurality of elements. For each element in the first input vector for which a corresponding element in the control vector and the predicate vector are active, the processor then performs a read operation using an address from the element of the first input vector. When a fault condition is encountered while performing the read operation, the processor determines if the element is a first element where a corresponding element of the control vector is active. If so, the processor handles/processes the fault. Otherwise, the processor masks the fault for the element.

Type: Grant

Filed: July 18, 2012

Date of Patent: October 14, 2014

Assignee: Apple Inc.

Inventor: Jeffry E. Gonion
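
The fault handling rule in patent 8862932 (process the fault only for the first element whose control element is active, mask it otherwise) can be modelled as follows. This is a sketch under stated assumptions: memory is a dictionary, a missing address stands in for a fault condition, reading continues after a masked fault, and unread elements are left as `None`. The names `read_xf` and `ReadFault` are hypothetical.

```python
class ReadFault(Exception):
    """Stand-in for a memory fault that is actually processed (hypothetical)."""


def read_xf(addresses, control, predicate, memory):
    """Model fault handling for a first-faulting vector read.

    memory is a dict from address to value; a missing address stands in for
    a fault condition. Returns (values, fault_mask), where fault_mask[i] is
    1 for elements whose fault was masked.
    """
    first_active = next((i for i, c in enumerate(control) if c), None)
    values = [None] * len(addresses)
    fault_mask = [0] * len(addresses)
    for i, addr in enumerate(addresses):
        if not (control[i] and predicate[i]):
            continue                      # element is inactive, no read
        if addr in memory:
            values[i] = memory[addr]
        elif i == first_active:
            raise ReadFault(f"fault at element {i}, address {addr}")
        else:
            fault_mask[i] = 1             # fault is masked rather than processed
    return values, fault_mask
```
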
INSTRUCTION AND LOGIC TO PROVIDE VECTOR HORIZONTAL MAJORITY VOTING FUNCTIONALITY

Publication number: 20140289494

Abstract: Instructions and logic provide vector horizontal majority voting functionality. Some embodiments, responsive to an instruction specifying: a destination operand, a size of the vector elements, a source operand, and a mask corresponding to a portion of the vector element data fields in the source operand; read a number of values from data fields of the specified size in the source operand, corresponding to the mask specified by the instruction and store a result value to that number of corresponding data fields in the destination operand, the result value computed from the majority of values read from the number of data fields of the source operand.

Type: Application

Filed: November 30, 2011

Publication date: September 25, 2014

Applicant: INTEL CORPORATION

Inventors: Elmoustapha Ould-Ahmed-Vall, Kshitij A. Doshi, Suleyman Sair, Charles R. Yount
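
A small model of the horizontal majority vote in publication 20140289494 is given below. It treats "majority" as the most frequent whole value among the masked source fields and breaks ties by first occurrence; both are assumptions, as is the name `horizontal_majority`.

```python
from collections import Counter


def horizontal_majority(dst, src, mask):
    """Write the most common masked source value to the masked dst fields.

    Fields whose mask flag is 0 are left unchanged. Treating "majority" as
    the most frequent whole value, and breaking ties by first occurrence,
    are assumptions.
    """
    selected = [v for v, m in zip(src, mask) if m]
    if not selected:
        return list(dst)
    winner = Counter(selected).most_common(1)[0][0]
    return [winner if m else old for old, m in zip(dst, mask)]
```
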
SYSTEMS, APPARATUSES, AND METHODS FOR PERFORMING VECTOR PACKED UNARY ENCODING USING MASKS

Publication number: 20140223140

Abstract: Embodiments of systems, apparatuses, and methods for performing in a computer processor vector packed unary encoding using masks in response to a single vector packed unary encoding using masks instruction that includes a source vector register operand, a destination writemask register operand, and an opcode are described.

Type: Application

Filed: December 23, 2011

Publication date: August 7, 2014

Inventors: Elmoustapha Ould-Ahmed-Vall, Thomas Willhalm
SYSTEMS, APPARATUSES, AND METHODS FOR SETTING AN OUTPUT MASK IN A DESTINATION WRITEMASK REGISTER FROM A SOURCE WRITE MASK REGISTER USING AN INPUT WRITEMASK AND IMMEDIATE

Publication number: 20140223139

Abstract: Embodiments of systems, apparatuses, and methods for performing in a computer processor generation of a predicate mask based on vector comparison in response to a single instruction are described.

Type: Application

Filed: December 23, 2011

Publication date: August 7, 2014

Inventors: Victor W. Lee, Daehyun Kim, Tin-Fook Ngai, Jayashankar Bharadwaj, Albert Hartono, Sara Baghsorkhi, Nalini Vasudevan
Vector index instruction for generating a result vector with incremental values based on a start value and an increment value

Patent number: 8793472

Abstract: The described embodiments include a processor that executes a vector instruction. The processor starts by receiving a start value and an increment value, and optionally receiving a predicate vector with N elements as inputs. The processor then executes the vector instruction. Executing the vector instruction causes the processor to generate a result vector. When generating the result vector, the processor sets each element in the result vector (or, if the predicate vector is received, each element for which a corresponding element of the predicate vector is active) equal to the start value plus a product of the increment value multiplied by a specified number of elements to the left of the element in the result vector.

Type: Grant

Filed: November 8, 2011

Date of Patent: July 29, 2014

Assignee: Apple Inc.

Inventors: Jeffry E. Gonion, Keith E. Diefendorff
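
The result vector in patent 8793472 can be sketched in a few lines. The sketch assumes the "number of elements to the left" of element i is simply i, and that elements with an inactive predicate keep a prior value (zero by default); the treatment of inactive elements is not stated in the abstract, and `vector_index` is a hypothetical name.

```python
def vector_index(start, increment, n, predicate=None, prior=None):
    """Return n elements where element i is start + increment * i.

    i is taken as the count of elements to the left of the element. With a
    predicate, inactive elements keep the value from `prior` (zeros by
    default); that treatment of inactive elements is an assumption.
    """
    if prior is None:
        prior = [0] * n
    if predicate is None:
        predicate = [1] * n
    return [
        start + increment * i if p else old
        for i, (p, old) in enumerate(zip(predicate, prior))
    ]
```

For example, `vector_index(10, 3, 4)` returns `[10, 13, 16, 19]`.
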
APPARATUS AND METHOD FOR MASK REGISTER EXPAND OPERATION

Publication number: 20140208065

Abstract: An apparatus and method are described for expanding bits from a mask register in a processor and computing system with vector registers and vector data elements. For example, a method according to one embodiment includes the following operations: reading each mask register bit stored in a mask register, the mask register containing mask values used for performing operations on vector values stored in a set of vector registers; and replicating each mask register bit N times into a destination register, where N is the number of vector elements stored in each vector register.

Type: Application

Filed: December 22, 2011

Publication date: July 24, 2014

Inventor: Elmoustapha Ould-Ahmed-Vall
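
The expand operation in publication 20140208065 replicates each mask bit N times. A minimal sketch, assuming the mask is represented as a list of bits and the destination as a flat list (`expand_mask` is a hypothetical name):

```python
def expand_mask(mask_bits, n):
    """Replicate each mask bit n times, preserving bit order."""
    return [bit for bit in mask_bits for _ in range(n)]
```

For example, `expand_mask([1, 0, 1], 2)` returns `[1, 1, 0, 0, 1, 1]`.
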
INSTRUCTION AND LOGIC TO PROVIDE VECTOR LOADS AND STORES WITH STRIDES AND MASKING FUNCTIONALITY

Publication number: 20140195775

Abstract: Instructions and logic provide vector loads and/or stores with stride and mask functionality. Some embodiments, responsive to an instruction specifying: a set of loads, destination register, mask register, memory address, and stride length; execution units read values in the mask register, wherein fields in the mask register correspond to stride-length multiples from the memory address to data elements in memory. A first mask value indicates the element has not been loaded from memory and a second value indicates that the element does not need to be, or has already been loaded. For each data field having the first value, the corresponding multiple of said stride length is generated according to the data field's position in the mask register to load the data element from memory into the corresponding destination register location, and the corresponding value in the mask register is changed to the second value. These instructions can restart after faults.

Type: Application

Filed: September 26, 2011

Publication date: July 10, 2014

Applicant: Intel Corporation

Inventors: Elmoustapha Ould-Ahmed-Vall, Kshitij A. Doshi, Suleyman Sair, Charles R. Yount
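
The restartable strided load in publication 20140195775 can be modelled as below. The sketch assumes a mask flag of 1 means "not yet loaded" and 0 means "done or not needed", models memory as a Python list, and lets an `IndexError` stand in for a fault; `strided_masked_load` is a hypothetical name.

```python
def strided_masked_load(dst, mask, memory, base, stride):
    """Load elements whose mask flag is 1 from base + i * stride.

    dst and mask are updated in place. After element i is loaded its mask
    flag is cleared to 0, so if a read raises part-way through, calling the
    function again resumes with the elements still pending.
    """
    for i, pending in enumerate(mask):
        if pending:
            dst[i] = memory[base + i * stride]   # may raise IndexError (a "fault")
            mask[i] = 0                          # mark element as loaded
    return dst, mask
```

Because the mask records progress, a caller that handles the fault can simply reissue the same call with the same `dst` and `mask`.
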
INSTRUCTION TO REDUCE ELEMENTS IN A VECTOR REGISTER WITH STRIDED ACCESS PATTERN

Publication number: 20140189288

Abstract: A vector reduction instruction with a non-unit strided access pattern is received and executed by the execution circuitry of a processor. In response to the instruction, the execution circuitry performs an associative reduction operation on data elements of a first vector register. Based on values of the mask register and a current element position being processed, the execution circuitry sequentially sets one or more data elements of the first vector register to a result, which is generated by the associative reduction operation applied to both a previous data element of the first vector register and a data element of a third vector register. The previous data element is located more than one element position away from the current element position.

Type: Application

Filed: December 28, 2012

Publication date: July 3, 2014

Inventors: Albert Hartono, Jayashankar Bharadwaj, Nalini Vasudevan, Sara S. Baghsorkhi, Victor W. Lee, Daehyun Kim
Performing function calls using single instruction multiple data (SIMD) registers

Patent number: 8725989

Abstract: In one embodiment, a processor can perform a function call from a main program to a function that is to operate on at least one vector-type operand, in which only scalar values are passed to the function, and input values to the function including the at least one vector-type operand are to be renamed from virtual registers identified in the function to physical registers of a vector register file, and output values from the function including the at least one vector-type operand are to be renamed from virtual registers identified in the function to physical registers of the vector register file. Other embodiments are described and claimed.

Type: Grant

Filed: December 9, 2010

Date of Patent: May 13, 2014

Assignee: Intel Corporation

Inventor: Tomasz Madajczak
