Masking To Control An Access To Data In Vector Register Patents (Class 712/5)
-
Patent number: 11994993 Abstract: An adaptive prefetcher for a shared system cache of a processing system including multiple requestors has a cache miss monitor and a prefetch controller. The cache miss monitor monitors requests for information from memory and identifies one of the requestors for which an identified cache line is requested. The prefetch controller submits an adaptive request for a subsequent cache line. The subsequent cache line may be determined based on a latency comparison between a loop latency (LL) of the prefetch controller and a stream latency (SL) of the identified requestor. A latency memory may be included that stores stream latencies for the requestors. The latency comparison may be used to determine how many cache lines to skip relative to the identified cache line, such as according to SL*SK < LL ≤ SL*(SK+1), in which SK is the number of cache lines to skip. Type: Grant Filed: March 15, 2022 Date of Patent: May 28, 2024 Assignee: NXP B.V. Inventors: Xiao Sun, Xiaotao Chen, Rohit Kumar Kaul
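The skip-count relation in this abstract can be illustrated with a short sketch. It assumes the comparison reads SL*SK < LL <= SL*(SK+1) and that both latencies are positive integers in the same unit; the function name `lines_to_skip` is hypothetical and not taken from the patent.

```python
def lines_to_skip(stream_latency: int, loop_latency: int) -> int:
    """Return SK such that SL*SK < LL <= SL*(SK+1).

    Assumption: both latencies are positive integers in the same unit.
    """
    if stream_latency <= 0 or loop_latency <= 0:
        raise ValueError("latencies must be positive")
    # Largest SK with SL*SK strictly below LL.
    return (loop_latency - 1) // stream_latency


if __name__ == "__main__":
    # With SL = 30 and LL = 100: 30*3 = 90 < 100 <= 120 = 30*4.
    print(lines_to_skip(30, 100))
```

Under this reading, a requestor whose stream latency already meets or exceeds the loop latency skips no lines.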
-
Patent number: 11934837Abstract: An SIMD instruction generation and processing method and a related device are provided. The method may include: obtaining a length of each loop dimension of a first tensor formula; selecting, from a plurality of groups of information about a first SIMD instruction model based on the length of each loop dimension of a first tensor formula, information about a second SIMD instruction model matching the first tensor formula; generating, based on a length of at least one loop dimension of the first tensor formula and the second SIMD instruction model, a first SIMD instruction obtained after the first tensor formula is converted. The information about a second SIMD instruction model is selected from the plurality of groups of information about a first SIMD instruction model based on the length of each loop dimension of the tensor formula.Type: GrantFiled: September 12, 2022Date of Patent: March 19, 2024Assignee: HUAWEI TECHNOLOGIES CO., LTD.Inventors: Chen Wu, Yifan Lin, Xiaoqiang Dan
-
Patent number: 11775296Abstract: The present disclosure includes apparatuses and methods related to mask patterns generated in memory from seed vectors. An example method includes performing operations on a plurality of data units of a seed vector and generating, by performance of the operations, a vector element in a mask pattern.Type: GrantFiled: April 12, 2021Date of Patent: October 3, 2023Assignee: Micron Technology, Inc.Inventor: Jeremiah J. Willcock
-
Patent number: 11132198Abstract: A computer system, processor, and method for processing information is disclosed that includes at least one computer processor; a main register file associated with the at least one processor, the main register file having a plurality of entries for storing data, one or more write ports to write data to the main register file entries, and one or more read ports to read data from the main register file entries; one or more execution units including a dense math execution unit; and at least one accumulator register file having a plurality of entries for storing data. The results of the dense math execution unit in an aspect are written to the accumulator register file, preferably to the same accumulator register file entry multiple times, and the data from the accumulator register file is written to the main register file.Type: GrantFiled: August 29, 2019Date of Patent: September 28, 2021Assignee: International Business Machines CorporationInventors: Brian W. Thompto, Maarten J. Boersma, Andreas Wagner, Jose E. Moreira, Hung Q. Le, Silvia Melitta Mueller, Dung Q. Nguyen
-
Patent number: 10977033Abstract: The present disclosure includes apparatuses and methods related to mask patterns generated in memory from seed vectors. An example method includes performing operations on a plurality of data units of a seed vector and generating, by performance of the operations, a vector element in a mask pattern.Type: GrantFiled: March 25, 2016Date of Patent: April 13, 2021Assignee: Micron Technology, Inc.Inventor: Jeremiah J. Willcock
-
Patent number: 10891131 Abstract: A processor includes a decode unit to decode an instruction that indicates a source packed data that includes data elements, and indicates a source mask that includes mask elements. Each of the mask elements corresponds to a different one of the data elements. Each of the mask elements is one of a masked mask element and an unmasked mask element. The processor also includes an execution unit coupled with the decode unit. The execution unit, in response to the instruction, is to store a result packed data. When the source packed data includes one or more masked data elements disposed within unmasked data elements, the result packed data includes the unmasked data elements consolidated together without the one or more masked data elements disposed within them. The execution unit is to store a result in a second destination storage location that reflects a number of the unmasked data elements consolidated together. Type: Grant Filed: September 22, 2016 Date of Patent: January 12, 2021 Assignee: Intel Corporation Inventors: Mohammad Ashraf Bhuiyan, Brian R. Nickerson
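A minimal sketch of the consolidation described in this abstract follows. It models vectors as Python lists and assumes a mask value of 1 marks an unmasked (kept) element and that unused result positions are zero-filled; both are illustrative assumptions, and `consolidate_unmasked` is a hypothetical name.

```python
def consolidate_unmasked(data, mask):
    """Pack unmasked elements together and report how many were kept.

    Assumptions: mask value 1 means "unmasked"; trailing positions are zeroed.
    """
    kept = [value for value, keep in zip(data, mask) if keep]
    result = kept + [0] * (len(data) - len(kept))
    # The count plays the role of the second destination in the abstract.
    return result, len(kept)


if __name__ == "__main__":
    print(consolidate_unmasked([10, 20, 30, 40], [1, 0, 0, 1]))
```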
-
Patent number: 10866807Abstract: A method of an aspect includes receiving an instruction indicating a destination storage location. A result is stored in the destination storage location in response to the instruction. The result includes a sequence of at least four non-negative integers in numerical order with all integers in consecutive positions differing by a constant stride of at least two. In an aspect, storing the result including the sequence of the at least four integers is performed without calculating the at least four integers using a result of a preceding instruction. Other methods, apparatus, systems, and instructions are disclosed.Type: GrantFiled: December 22, 2011Date of Patent: December 15, 2020Assignee: Intel CorporationInventors: Elmoustapha Ould-Ahmed-Vall, Seth Abraham, Robert Valentine, Zeev Sperber, Amit Gradstein
-
Patent number: 10656947 Abstract: Receiving an instruction indicating a source operand and a destination operand. Storing a result in the destination operand in response to the instruction. The result operand may have: (1) a first range of bits having a first end explicitly specified by the instruction in which each bit is identical in value to a bit of the source operand in a corresponding position; and (2) a second range of bits that all have a same value regardless of values of bits of the source operand in corresponding positions. Execution of the instruction may complete without moving the first range of the result relative to the bits of identical value in the corresponding positions of the source operand, regardless of the location of the first range of bits in the result. Execution units to execute such instructions, computer systems having processors to execute such instructions, and machine-readable media storing such an instruction are also disclosed. Type: Grant Filed: March 14, 2018 Date of Patent: May 19, 2020 Assignee: Intel Corporation Inventors: Maxim Loktyukhin, Eric W. Mahurin, Bret L. Toll, Martin G. Dixon, Sean P. Mirkes, David L. Kreitzer, Elmoustapha Ould-Ahmed-Vall, Vinodh Gopal
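One way to picture the two bit ranges in this abstract is the sketch below. It assumes the first range is the low-order bits below an explicitly given end index and that the second range is filled with zeros; the abstract allows other choices, and `keep_low_bits` is a hypothetical name.

```python
def keep_low_bits(source: int, end: int, width: int = 64) -> int:
    """Copy bits [0, end) of source in place and zero bits [end, width).

    Assumptions: the copied range starts at bit 0 and the fill value is 0.
    """
    if not 0 <= end <= width:
        raise ValueError("end must lie within the operand width")
    source &= (1 << width) - 1          # model a fixed-width register
    return source & ((1 << end) - 1)    # no shift: copied bits stay in position


if __name__ == "__main__":
    print(bin(keep_low_bits(0b11011011, 4)))
```

The absence of any shift mirrors the abstract's point that the first range is not moved relative to the source bits.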
-
Patent number: 10614151Abstract: A circuit comprises an input register configured to receive an input vector of elements, a control register configured to receive a control vector of elements, wherein each element of the control vector corresponds to a respective element of the input vector, and wherein each element specifies a permutation of a corresponding element of the input vector, and a permute execution circuit configured to generate an output vector of elements corresponding to a permutation of the input vector. Generating each element of the output vector comprises accessing, at the input register, a particular element of the input vector, accessing, at the control register, a particular element of the control vector corresponding to the particular element of the input vector, and outputting the particular element of the input vector as an element at a particular position of the output vector that is selected based on the particular element of the control vector.Type: GrantFiled: August 1, 2019Date of Patent: April 7, 2020Assignee: Google LLCInventors: Dong Hyuk Woo, Gregory Michael Thorson, Andrew Everett Phelps, Olivier Temam, Jonathan Ross, Christopher Aaron Clark
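The permutation in this abstract can be sketched in software as follows. The sketch assumes each control element is the output position for its corresponding input element and that the control vector is a valid permutation; `permute` is a hypothetical name, and the patent describes a circuit, not this loop.

```python
def permute(input_vector, control_vector):
    """Place input_vector[i] at output position control_vector[i].

    Assumption: control_vector holds each index 0..n-1 exactly once.
    """
    if len(input_vector) != len(control_vector):
        raise ValueError("vectors must have the same length")
    output = [None] * len(input_vector)
    for element, position in zip(input_vector, control_vector):
        output[position] = element
    return output


if __name__ == "__main__":
    print(permute(["a", "b", "c", "d"], [2, 0, 3, 1]))
```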
-
Patent number: 10574387Abstract: Technology for an eNodeB operable to perform multiuser non-orthogonal superposition transmissions for multimedia broadcast multicast service (MBMS) is disclosed. The eNodeB can modulate a first physical multicast channel (PMCH) signal for MBMS with a first modulation and coding scheme (MCS). The eNodeB can modulate a second PMCH signal for MBMS with a second MCS. The eNodeB can multiplex the first PMCH signal and the second PMCH signal to form an aggregate PMCH signal. The eNodeB can transmit the aggregate PMCH signal to a plurality of UEs using multiuser non-orthogonal superposition for MBMS, wherein the first PMCH signal in the aggregate PMCH signal is transmitted using physical resource blocks (PRBs) that are partially or fully overlapped in time and frequency with PRBs of the second PMCH signal in the aggregate PMCH signal.Type: GrantFiled: October 30, 2015Date of Patent: February 25, 2020Assignee: INTEL CORPORATIONInventors: Alexei Davydov, Vadim Sergeyev, Alexander Maltsev
-
Patent number: 10528353Abstract: Methods and apparatus for generating a mask vector for determining a processor instruction address using an instruction tag (ITAG) in a multi-slice processor including receiving a first ITAG value and an interrupt ITAG value; generating the mask vector divided into mask sections comprising a plurality of elements with unset flags; for each mask section: if the mask section comprises the first ITAG value, setting a flag of an element in the mask section corresponding to the first ITAG value; if the mask section comprises the interrupt ITAG value, setting a flag of an element in the mask section corresponding to the interrupt ITAG value; setting each flag of each element in the mask vector between the element in the mask vector corresponding to the first ITAG value and the element in the mask vector corresponding to the interrupt ITAG value; and providing the mask vector to an instruction fetch unit.Type: GrantFiled: May 24, 2016Date of Patent: January 7, 2020Assignee: International Business Machines CorporationInventors: David S. Levitan, Mehul Patel
-
Patent number: 10387151 Abstract: Methods and apparatus are disclosed for accessing multiple data cache lines for scatter/gather operations. Embodiments of the apparatus may comprise address generation logic to generate an address from an index of a set of indices for each of a set of corresponding mask elements having a first value. Line or bank match ordering logic matches addresses in the same cache line or different banks, and orders an access sequence to permit a group of addresses in multiple cache lines and different banks. Address selection logic directs the group of addresses to corresponding different banks in a cache to access data elements in multiple cache lines corresponding to the group of addresses in a single access cycle. A disassembly/reassembly buffer orders the data elements according to their respective bank/register positions, and a gather/scatter finite state machine changes the values of corresponding mask elements from the first value to a second value. Type: Grant Filed: September 30, 2011 Date of Patent: August 20, 2019 Assignee: Intel Corporation Inventors: Jonathan C. Hall, Sailesh Kottapalli, Andrew T. Forsyth
-
Patent number: 10372449Abstract: A method of an aspect includes receiving a packed data operation mask concatenation instruction. The packed data operation mask concatenation instruction indicates a first source having a first packed data operation mask, indicates a second source having a second packed data operation mask, and indicates a destination. A result is stored in the destination in response to the packed data operation mask concatenation instruction. The result includes the first packed data operation mask concatenated with the second packed data operation mask. Other methods, apparatus, systems, and instructions are disclosed.Type: GrantFiled: February 27, 2017Date of Patent: August 6, 2019Assignee: Intel CorporationInventors: Bret L. Toll, Robert Valentine, Jesus Corbal San Andrian, Elmoustapha Ould-Ahmed-Vall, Mark J. Charney
-
Patent number: 10372450Abstract: Embodiments of systems, apparatuses, and methods for performing in a computer processor generation of a predicate mask based on vector comparison in response to a single instruction are described.Type: GrantFiled: July 11, 2017Date of Patent: August 6, 2019Assignee: Intel CorporationInventors: Victor W. Lee, Daehyun Kim, Tin-Fook Ngai, Jayashankar Bharadwaj, Albert Hartono, Sara Baghsorkhi, Nalini Vasudevan
-
Patent number: 10241792 Abstract: A processor core that includes a hardware decode unit and an execution engine unit. The hardware decode unit is to decode a vector frequency expand instruction, wherein the vector frequency expand instruction includes a source operand and a destination operand, wherein the source operand specifies a source vector register that includes one or more pairs of a value and run length that are to be expanded into a run of that value based on the run length. The execution engine unit is to execute the decoded vector frequency expand instruction, which causes a set of one or more source data elements in the source vector register to be expanded into a set of destination data elements comprising more elements than the set of source data elements and including at least one run of identical values which were run length encoded in the source vector register. Type: Grant Filed: December 30, 2011 Date of Patent: March 26, 2019 Assignee: Intel Corporation Inventors: Elmoustapha Ould-Ahmed-Vall, Suleyman Sair, Kshitij A. Doshi, Charles Yount, Bret L. Toll
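The expansion of value and run-length pairs can be sketched as below. The sketch models the source register as a list of (value, run_length) tuples, which is an assumption about layout made only for illustration; `expand_runs` is a hypothetical name.

```python
def expand_runs(pairs):
    """Expand (value, run_length) pairs into runs of repeated values.

    Assumption: the source is modeled as a list of tuples, not a packed register.
    """
    expanded = []
    for value, run_length in pairs:
        expanded.extend([value] * run_length)
    return expanded


if __name__ == "__main__":
    print(expand_runs([(7, 3), (2, 1), (9, 2)]))
```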
-
Patent number: 9946540Abstract: An apparatus is described having instruction execution logic circuitry. The instruction execution logic circuitry has input vector element routing circuitry to perform the following for each of three different instructions: for each of a plurality of output vector element locations, route into an output vector element location an input vector element from one of a plurality of input vector element locations that are available to source the output vector element. The output vector element and each of the input vector element locations are one of three available bit widths for the three different instructions. The apparatus further includes masking layer circuitry coupled to the input vector element routing circuitry to mask a data structure created by the input vector routing element circuitry. The masking layer circuitry is designed to mask at three different levels of granularity that correspond to the three available bit widths.Type: GrantFiled: May 22, 2017Date of Patent: April 17, 2018Assignee: INTEL CORPORATIONInventors: Elmoustapha Ould-Ahmed-Vall, Robert Valentine, Jesus Corbal, Bret L. Toll, Mark J. Charney, Zeev Sperber, Amit Gradstein
-
Patent number: 9854261Abstract: A video decoding method is implemented by a computer having multiple parallel processing units. A stream of data elements is received, some of which encode video content. The stream comprises marker sequences, each marker sequence comprising a marker which does not encode video content. A known pattern of data elements occurs in each marker sequence. A respective part of the stream is supplied to each parallel processing unit. Each parallel processing unit processes the respective part of the stream, whereby multiple parts of the stream are processed in parallel, to detect whether any of the multiple parts matches the known pattern of data elements, thereby identifying the markers. The encoded video content is separated from the identified markers. The separated video content is decoded, and the decoded video content outputted on a display.Type: GrantFiled: January 6, 2015Date of Patent: December 26, 2017Assignee: MICROSOFT TECHNOLOGY LICENSING, LLC.Inventors: Yongjun Wu, Chih-Lung Lin
-
Patent number: 9804844 Abstract: Instructions and logic provide vector load-op and/or store-op with stride functionality. In some embodiments, responsive to an instruction specifying a set of loads, a second operation, a destination register, an operand register, a memory address, and a stride length, execution units read values in a mask register, wherein fields in the mask register correspond to stride-length multiples from the memory address to data elements in memory. A first mask value indicates the element has not been loaded from memory and a second value indicates that the element does not need to be, or has already been, loaded. For each field having the first value, the data element is loaded from memory into the corresponding destination register location, and the corresponding value in the mask register is changed to the second value. Then the second operation is performed using corresponding data in the destination and operand registers to generate results. The instruction may be restarted after faults. Type: Grant Filed: September 26, 2011 Date of Patent: October 31, 2017 Assignee: Intel Corporation Inventors: Elmoustapha Ould-Ahmed-Vall, Kshitij A. Doshi, Suleyman Sair, Charles R. Yount
-
Patent number: 9747101 Abstract: Instructions and logic provide vector scatter-op and/or gather-op functionality. In some embodiments, responsive to an instruction specifying a gather and a second operation, a destination register, an operand register, and a memory address, execution units read values in a mask register, wherein fields in the mask register correspond to offset indices in an indices register for data elements in memory. A first mask value indicates the element has not been gathered from memory and a second value indicates that the element does not need to be, or has already been, gathered. For each field having the first value, the data element is gathered from memory into the corresponding destination register location, and the corresponding value in the mask register is changed to the second value. When all mask register fields have the second value, the second operation is performed using corresponding data in the destination and operand registers to generate results. Type: Grant Filed: September 26, 2011 Date of Patent: August 29, 2017 Assignee: Intel Corporation Inventors: Elmoustapha Ould-Ahmed-Vall, Kshitij A. Doshi, Charles R. Yount, Suleyman Sair
-
Patent number: 9703558Abstract: Embodiments of systems, apparatuses, and methods for performing in a computer processor generation of a predicate mask based on vector comparison in response to a single instruction are described.Type: GrantFiled: December 23, 2011Date of Patent: July 11, 2017Assignee: Intel CorporationInventors: Victor W. Lee, Daehyun Kim, Tin-Fook Ngai, Jayashankar Bharadwaj, Albert Hartono, Sara Baghsorkhi, Nalini Vasudevan
-
Patent number: 9672036 Abstract: Instructions and logic provide vector loads and/or stores with stride and mask functionality. In some embodiments, responsive to an instruction specifying a set of loads, a destination register, a mask register, a memory address, and a stride length, execution units read values in the mask register, wherein fields in the mask register correspond to stride-length multiples from the memory address to data elements in memory. A first mask value indicates the element has not been loaded from memory and a second value indicates that the element does not need to be, or has already been, loaded. For each field having the first value, the corresponding multiple of said stride length is generated according to the data field's position in the mask register to load the data element from memory into the corresponding destination register location, and the corresponding value in the mask register is changed to the second value. These instructions can restart after faults. Type: Grant Filed: September 26, 2011 Date of Patent: June 6, 2017 Assignee: Intel Corporation Inventors: Elmoustapha Ould-Ahmed-Vall, Kshitij A. Doshi, Suleyman Sair, Charles R. Yount
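The mask-tracked strided load in this abstract can be sketched as follows. The sketch assumes the first mask value is 1 (still to be loaded) and the second is 0 (done or not needed), and it models memory as a Python list indexed by element; `masked_strided_load` is a hypothetical name.

```python
def masked_strided_load(memory, base, stride, mask, destination):
    """Load memory[base + i*stride] into lane i wherever mask[i] is 1.

    Assumptions: 1 means "pending", 0 means "done or not needed".
    Each completed lane clears its mask field, so a call that is interrupted
    and repeated only redoes the lanes that are still pending.
    """
    for lane, pending in enumerate(mask):
        if pending:
            destination[lane] = memory[base + lane * stride]
            mask[lane] = 0
    return destination, mask


if __name__ == "__main__":
    memory = list(range(100, 120))
    print(masked_strided_load(memory, 2, 3, [1, 0, 1, 1], [0, 0, 0, 0]))
```

Updating the mask lane by lane is what makes the restart-after-fault behavior in the abstract possible in this model.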
-
Patent number: 9658850Abstract: An apparatus is described having instruction execution logic circuitry. The instruction execution logic circuitry has input vector element routing circuitry to perform the following for each of three different instructions: for each of a plurality of output vector element locations, route into an output vector element location an input vector element from one of a plurality of input vector element locations that are available to source the output vector element. The output vector element and each of the input vector element locations are one of three available bit widths for the three different instructions. The apparatus further includes masking layer circuitry coupled to the input vector element routing circuitry to mask a data structure created by the input vector routing element circuitry. The masking layer circuitry is designed to mask at three different levels of granularity that correspond to the three available bit widths.Type: GrantFiled: December 23, 2011Date of Patent: May 23, 2017Assignee: Intel CorporationInventors: Elmoustapha Ould-Ahmed-Vall, Robert Valentine, Jesus Corbal, Bret L. Toll, Mark J. Charney, Zeev Sperber, Amit Gradstein
-
Patent number: 9619225Abstract: A data processing apparatus comprises a processing circuit and instruction decoder. A bitfield manipulation instruction controls the processing apparatus to generate at least one result data element from corresponding first and second source data elements. Each result data element includes a portion corresponding to a bitfield of the corresponding first source data element. Bits of the result data element that are more significant than the inserted bitfield have a prefix value that is selected, based on a control value specified by the instruction, as one of a first prefix value having a zero value, a second prefix value having the value of a portion of the corresponding second source data element, and a third prefix value corresponding to a sign extension of the bitfield of the first source data element. Bitwise logical instructions are also described.Type: GrantFiled: October 8, 2015Date of Patent: April 11, 2017Assignee: ARM LimitedInventors: David James Seal, Richard Roy Grisenthwaite, Nigel John Stephens
-
Patent number: 9619236 Abstract: An apparatus is described having instruction execution logic circuitry to execute first, second, third and fourth instructions. Both the first instruction and the second instruction insert a first group of input vector elements to one of multiple first non-overlapping sections of respective first and second resultant vectors. The first group has a first bit width. Each of the multiple first non-overlapping sections has a same bit width as the first group. Both the third instruction and the fourth instruction insert a second group of input vector elements to one of multiple second non-overlapping sections of respective third and fourth resultant vectors. The second group has a second bit width that is larger than said first bit width. Each of the multiple second non-overlapping sections has a same bit width as the second group. Type: Grant Filed: December 23, 2011 Date of Patent: April 11, 2017 Assignee: Intel Corporation Inventors: Elmoustapha Ould-Ahmed-Vall, Robert Valentine, Jesus Corbal, Bret L. Toll, Mark J. Charney, Zeev Sperber, Amit Gradstein
-
Patent number: 9606961 Abstract: Instructions and logic provide vector compress and rotate functionality. Some embodiments, responsive to an instruction specifying a vector source, a mask, a vector destination and a destination offset, read the mask and copy corresponding unmasked vector elements from the vector source to adjacent sequential locations in the vector destination, starting at the vector destination offset location. In some embodiments, the unmasked vector elements from the vector source are copied to adjacent sequential element locations modulo the total number of element locations in the vector destination. In some alternative embodiments, copying stops whenever the vector destination is full, and upon copying an unmasked vector element from the vector source to an adjacent sequential element location in the vector destination, the value of a corresponding field in the mask is changed to a masked value. Alternative embodiments zero elements of the vector destination into which no element from the vector source is copied. Type: Grant Filed: October 30, 2012 Date of Patent: March 28, 2017 Assignee: Intel Corporation Inventors: Tal Uliel, Elmoustapha Ould-Ahmed-Vall, Robert Valentine
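The modulo variant of compress and rotate can be sketched as below. The sketch assumes a mask value of 1 marks an unmasked element and that destination positions not written keep their previous contents; the stop-when-full and zeroing variants mentioned in the abstract are not modeled. `compress_rotate` is a hypothetical name.

```python
def compress_rotate(source, mask, destination, offset):
    """Copy unmasked source elements to consecutive destination slots,
    starting at offset and wrapping modulo the destination length.

    Assumptions: mask value 1 means "unmasked"; untouched slots are preserved.
    """
    result = list(destination)
    position = offset
    for value, unmasked in zip(source, mask):
        if unmasked:
            result[position % len(result)] = value
            position += 1
    return result


if __name__ == "__main__":
    print(compress_rotate([10, 20, 30, 40], [1, 0, 1, 1], [0, 0, 0, 0], 3))
```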
-
Patent number: 9529592Abstract: In one embodiment, logic is provided to receive and execute a mask move instruction to transfer a vector data element including a plurality of packed data elements from a source location to a destination location, subject to mask information for the instruction, such that only portions of the plurality of packed data elements are transferred to the destination location. Other embodiments are described and claimed.Type: GrantFiled: December 27, 2007Date of Patent: December 27, 2016Assignee: Intel CorporationInventors: Doron Orenstien, Zeev Sperber, Bob Valentine, Benny Eitan
-
Patent number: 9513926Abstract: A tag mask generation method comprises receiving a section_selector flag indicating whether a tag mask for a section of a network packet is to be generated; receiving from a parser a parse information for the network packet, wherein the parse information includes a section_pointer that indicates a location of the section in the network packet; generating a pointer based on the section_pointer when the section_selector indicates that the tag mask for the section is to be generated; receiving a base mask for the section; and generating the tag mask via a shifter by shifting the base mask by the amount indicated by the pointer. The parse information may further include a section_pointer_valid flag indicating whether the section is included in the network packet, and the method may further comprise including the tag mask in a combined tag mask when the section_pointer_valid flag indicates that the section is included in the network packet.Type: GrantFiled: January 8, 2014Date of Patent: December 6, 2016Assignee: Cavium, Inc.Inventors: Wilson Parkhurst Snyder, II, Nicholas New Jamba
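The shift-based tag mask generation can be sketched as follows. The sketch treats masks as Python integers and takes the pointer directly as the shift amount; the tuple layout and the names `tag_mask` and `combined_tag_mask` are assumptions made for illustration, not the patent's interfaces.

```python
def tag_mask(base_mask: int, section_pointer: int, section_selector: bool) -> int:
    """Shift the base mask to the section's location when the section is selected."""
    if not section_selector:
        return 0
    return base_mask << section_pointer


def combined_tag_mask(sections) -> int:
    """OR together the tag masks of sections that are present in the packet.

    Assumption: each entry is (base_mask, section_pointer,
    section_selector, section_pointer_valid).
    """
    combined = 0
    for base_mask, pointer, selector, valid in sections:
        if valid:
            combined |= tag_mask(base_mask, pointer, selector)
    return combined


if __name__ == "__main__":
    sections = [(0b11, 0, True, True), (0b1, 8, True, False), (0b101, 4, True, True)]
    print(bin(combined_tag_mask(sections)))
```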
-
Patent number: 9489196 Abstract: A method is described that includes reading a first read mask from a first register. The method also includes reading a first vector operand from a second register or memory location. The method also includes applying the read mask against the first vector operand to produce a set of elements for operation. The method also includes performing an operation on the set of elements. The method also includes creating an output vector by producing multiple instances of the operation's result. The method also includes reading a first write mask from a third register, the first write mask being different than the first read mask. The method also includes applying the write mask against the output vector to create a resultant vector. The method also includes writing the resultant vector to a destination register. Type: Grant Filed: December 23, 2011 Date of Patent: November 8, 2016 Assignee: Intel Corporation Inventors: Mikhail Plotnikov, Andrey Naraikan, Elmoustapha Ould-Ahmed-Vall, Robert Valentine, Bret L. Toll, Jesus Corbal
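The read-mask and write-mask flow in this abstract can be sketched as below. The sketch assumes the operation is a sum and that lanes disabled by the write mask keep their previous destination values; the abstract fixes neither choice, and `masked_reduce_broadcast` is a hypothetical name.

```python
def masked_reduce_broadcast(operand, read_mask, write_mask, destination):
    """Select lanes by read mask, reduce them, broadcast the result,
    then commit only the lanes enabled by the write mask.

    Assumptions: the operation is a sum; write-masked-off lanes keep
    their old destination contents.
    """
    selected = [value for value, enabled in zip(operand, read_mask) if enabled]
    total = sum(selected)
    output = [total] * len(operand)  # multiple instances of the result
    return [
        new if enabled else old
        for new, enabled, old in zip(output, write_mask, destination)
    ]


if __name__ == "__main__":
    print(masked_reduce_broadcast([1, 2, 3, 4], [1, 0, 1, 0], [0, 1, 1, 0], [9, 9, 9, 9]))
```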
-
Patent number: 9424327Abstract: An apparatus is described that includes an execution unit to execute a first instruction and a second instruction. The execution unit includes input register space to store a first data structure to be replicated when executing the first instruction and to store a second data structure to be replicated when executing the second instruction. The first and second data structures are both packed data structures. Data values of the first packed data structure are twice as large as data values of the second packed data structure. The execution unit also includes replication logic circuitry to replicate the first data structure when executing the first instruction to create a first replication data structure, and, to replicate the second data structure when executing the second data instruction to create a second replication data structure.Type: GrantFiled: December 23, 2011Date of Patent: August 23, 2016Assignee: Intel CorporationInventors: Elmoustapha Ould-Ahmed-Vall, Robert Valentine, Jesus Corbal, Bret L. Toll, Mark J. Charney
-
Patent number: 9378182Abstract: A processor executes a vector move instruction to move data elements from a second vector register to a first vector register under the control of a first mask register and a second mask register. A register file within the processor includes the first vector register, the second vector register, the first mask register and the second mask register. In response to the vector move instruction, execution circuitry in the processor is to replace a given number of target data elements in the first vector register with the given number of source data elements in the second vector register. Each source data element corresponds to a mask bit in the second mask register having a second bit value, and wherein each target data element corresponds to a mask bit in the first mask register having a first bit value.Type: GrantFiled: September 28, 2012Date of Patent: June 28, 2016Assignee: Intel CorporationInventors: Mikhail Plotnikov, Andrey Naraikin, Christopher Hughes
-
Patent number: 9369230Abstract: A method and apparatus for decoding the LTE physical broadcast channel (PBCH). The transmissions are made by an evolved NodeB (eNB). At least one template symbol sequence representative of a potential transmission by the eNB over the PBCH is provided. A signal or signals transmitted over the PBCH by the eNB is received, the signal or signals indicative of a received symbol sequence. Correlation operations are performed for correlating the at least one template symbol sequence against the received symbol sequence. A representative symbol sequence, timing parameter, or both, is selected, based on the correlation operations. The representative symbol sequence is indicative of information transmitted by the eNB over the LTE PBCH. The timing parameter is indicative of timing of said information transmitted by the eNB.Type: GrantFiled: April 2, 2014Date of Patent: June 14, 2016Assignee: Sierra Wireless, Inc.Inventors: Gustav Gerald Vos, Steven John Bennett, Lutz Hans-Joachim Lampe, Ghasem Naddafzadeh Shirazi
-
Patent number: 9281985 Abstract: Methods and apparatuses are provided for transmitting and receiving reference signals. A method includes transmitting a first parameter and a second parameter, which are cell specific parameters; transmitting a third parameter, which is a UE specific parameter; transmitting a CSI; and receiving at least one of a first reference signal, which is generated based on the CSI, for a PUSCH and a second reference signal for a PUCCH. The first reference signal is generated by not applying sequence hopping and group sequence hopping, regardless of values of the first parameter and the second parameter, if the third parameter indicates that the sequence hopping and the group sequence hopping are disabled. The second reference signal is generated by applying the group sequence hopping, if the first parameter indicates that the group sequence hopping is enabled and the third parameter indicates that the sequence hopping and the group sequence hopping are disabled. Type: Grant Filed: February 13, 2015 Date of Patent: March 8, 2016 Assignee: Samsung Electronics Co., Ltd Inventors: Aris Papasakellariou, Joon Young Cho
-
Patent number: 9281863Abstract: Methods and apparatuses are provided for transmitting and receiving reference signals. A method includes receiving first and second cell specific parameters; receiving a third UE specific parameter; acquiring a first reference signal for a PUSCH, based on the third parameter; acquiring a second reference signal for a PUCCH, based on the first parameter; and transmitting at least one of the first reference signal and the second reference signal. Sequence hopping and group sequence hopping are disabled for the first reference signal, regardless of values of the first parameter and the second parameter, if the third parameter indicates that the sequence hopping and the group sequence hopping are disabled. The group sequence hopping is applied to acquire the second reference signal, if the first parameter indicates that the group sequence hopping is enabled and the third parameter indicates that the sequence hopping and the group sequence hopping are disabled.Type: GrantFiled: February 13, 2015Date of Patent: March 8, 2016Assignee: Samsung Electronics Co., LtdInventors: Aris Papasakellariou, Joon Young Cho
-
Patent number: 9281862Abstract: Methods and apparatuses are provided for transmitting and receiving reference signals. A method includes receiving a first parameter, which is a cell specific parameter; receiving a second parameter, which is a UE specific parameter; acquiring a first reference signal for a PUSCH, based on the second parameter; acquiring a second reference signal for a PUCCH, based on the first parameter; and transmitting at least one of the first reference signal and the second reference signal. Group sequence hopping is not applied to acquire the first reference signal, if the first parameter indicates that the group sequence hopping is enabled and the second parameter indicates that the group sequence hopping is disabled. The group sequence hopping is applied to acquire the second reference signal, if the first parameter indicates that the group sequence hopping is enabled and the second parameter indicates that the group sequence hopping is disabled.Type: GrantFiled: February 13, 2015Date of Patent: March 8, 2016Assignee: Samsung Electronics Co., LtdInventors: Aris Papasakellariou, Joon Young Cho
-
Patent number: 9281984Abstract: Methods and apparatuses are provided for transmitting and receiving reference signals. A method includes receiving a first cell specific parameter; receiving a second UE specific parameter; receiving a CSI; acquiring a first reference signal for a PUSCH, based on the second parameter and the CSI; acquiring a second reference signal for a PUCCH, based on the first parameter; and transmitting at least one of the first reference signal and the second reference signal. Group sequence hopping is not applied to acquire the first reference signal, if the first parameter indicates that the group sequence hopping is enabled and the second parameter indicates that the group sequence hopping is disabled. The group sequence hopping is applied to acquire the second reference signal, if the first parameter indicates that the group sequence hopping is enabled and the second parameter indicates that the group sequence hopping is disabled.Type: GrantFiled: February 13, 2015Date of Patent: March 8, 2016Assignee: Samsung Electronics Co., LtdInventors: Aris Papasakellariou, Joon Young Cho
-
Patent number: 9189236Abstract: According to one embodiment, a processor includes an instruction decoder to decode an instruction to read a plurality of data elements from memory, the instruction having a first operand specifying a storage location, a second operand specifying a bitmask having one or more bits, each bit corresponding to one of the data elements, and a third operand specifying a memory address storing a plurality of data elements. The processor further includes an execution unit coupled to the instruction decoder, in response to the instruction, to read one or more data elements speculatively, based on the bitmask specified by the second operand, from a memory location based on the memory address indicated by the third operand, and to store the one or more data elements in the storage location indicated by the first operand.Type: GrantFiled: December 21, 2012Date of Patent: November 17, 2015Assignee: Intel CorporationInventors: Jayashankar Bharadwaj, Nalini Vasudevan, Victor W. Lee, Sara S. Baghsorkhi, Albert Hartono, Daehyun Kim
-
Publication number: 20150143075Abstract: A Vector Generate Mask instruction. For each element in the first operand, a bit mask is generated. The mask includes bits set to a selected value starting at a position specified by a first field of the instruction and ending at a position specified by a second field of the instruction.Type: ApplicationFiled: December 5, 2014Publication date: May 21, 2015Inventors: Jonathan D. Bradbury, Robert F. Enenkel, Eric M. Schwarz, Timothy J. Slegel
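The mask generation described in this abstract can be illustrated with a minimal Python sketch. It is not the patented instruction: the function names, the 8-bit default width, the choice of 1 as the "selected value", and the convention that bit position 0 is the most significant bit are all assumptions made for illustration. Wraparound ranges (start greater than end) are rejected here rather than modeled.

```python
def generate_mask(start: int, end: int, width: int = 8, msb_first: bool = True) -> int:
    """Return a width-bit mask with positions start..end (inclusive) set to 1.

    Hypothetical helper: bit numbering (msb_first) is an assumption, not
    something the abstract specifies.
    """
    if not (0 <= start <= end < width):
        raise ValueError("require 0 <= start <= end < width")
    mask = 0
    for pos in range(start, end + 1):
        # Translate the logical bit position into a shift amount.
        shift = (width - 1 - pos) if msb_first else pos
        mask |= 1 << shift
    return mask


def vector_generate_mask(num_elements: int, start: int, end: int, width: int = 8) -> list[int]:
    """Produce the same mask for every element of the (hypothetical) vector."""
    return [generate_mask(start, end, width) for _ in range(num_elements)]
```

For example, with MSB-first numbering, positions 0 through 3 of an 8-bit element give the mask `0xF0`, and every element of the result vector receives that same mask.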
-
Patent number: 9009528Abstract: The described embodiments include a processor that handles faults. The processor first receives an input vector, a control vector, and a predicate vector, each vector comprising a plurality of elements. Then, for a first element of the input vector for which corresponding elements of the control vector and the predicate vector are active, the processor performs a scalar read operation using an address from the element of the input vector. When a fault condition is encountered while performing the read operation, the processor determines if the element is a first element where a corresponding element of the control vector is active. If so (i.e., if the element is a first element where a corresponding element of the control vector is active), the processor processes the fault. Otherwise, the processor masks the fault for the element.Type: GrantFiled: September 5, 2012Date of Patent: April 14, 2015Assignee: Apple Inc.Inventor: Jeffry E. Gonion
-
Patent number: 8938642Abstract: The described embodiments include a processor with a fault status register (FSR) that executes a Confirm instruction. In these embodiments, when executing the Confirm instruction, the processor receives a predicate vector that includes N elements. For a first set of bit positions in the FSR for which corresponding elements of the predicate vector are active, the processor determines if at least one of the first set of bit positions in the FSR holds a predetermined value. When at least one of the first set of bit positions in the FSR holds the predetermined value, the processor causes a fault in the processor.Type: GrantFiled: May 23, 2012Date of Patent: January 20, 2015Assignee: Apple Inc.Inventor: Jeffry E. Gonion
-
Publication number: 20150006847Abstract: An apparatus and method are described for performing a bit reversal and permutation on mask values. For example, a processor is described to execute an instruction to perform the operations of: reading a plurality of mask bits stored in a source mask register, the mask bits associated with vector data elements of a vector register; and performing a bit reversal operation to copy each mask bit from a source mask register to a destination mask register, wherein the bit reversal operation causes bits from the source mask register to be reversed within the destination mask register resulting in a symmetric, mirror image of the original bit arrangement.Type: ApplicationFiled: June 27, 2013Publication date: January 1, 2015Inventors: Elmoustapha OULD-AHMED-VALL, Robert VALENTINE
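The bit reversal this abstract describes can be sketched in a few lines of Python. The function name and the 16-bit default mask width are assumptions for illustration; the sketch models only the mirror-image copy, not the permutation variants or any real instruction encoding.

```python
def reverse_mask_bits(src: int, width: int = 16) -> int:
    """Copy each bit i of src to bit (width - 1 - i) of the result."""
    dst = 0
    for i in range(width):
        bit = (src >> i) & 1
        dst |= bit << (width - 1 - i)
    return dst
```

Applying the function twice with the same width returns the original mask, which is a quick sanity check that the result is a true mirror image.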
-
Publication number: 20140372727Abstract: Vector blend and permute functionality are provided, responsive to instructions specifying: a destination vector register comprising fields to store vector elements, a first vector register, a vector element size, a second vector register, and a third operand. Indices are read from fields in the second register. Each index has a first selector portion and a second selector portion. Corresponding unmasked vector elements are stored to fields of the destination register, wherein each vector element, responsive to the respective first selector portion having a first value, is copied to an intermediate vector from a corresponding data field of the first register, and responsive to the respective first selector portion having a second value, is copied to the intermediate vector from a corresponding data field of the third operand. Then unmasked data fields of the destination are replaced by data fields in the intermediate vector indexed by the corresponding second selector portions.Type: ApplicationFiled: December 23, 2011Publication date: December 18, 2014Applicant: INTEL CORPORATIONInventors: Robert Valentine, Bret L. Toll, Jesus Corbal, Jeff G. Wiedemeier, Sridhar Samudrala
-
Patent number: 8862932Abstract: The described embodiments include a processor that handles faults. The processor first receives a first input vector, a control vector, and a predicate vector, each vector comprising a plurality of elements. For each element in the first input vector for which a corresponding element in the control vector and the predicate vector are active, the processor then performs a read operation using an address from the element of the first input vector. When a fault condition is encountered while performing the read operation, the processor determines if the element is a first element where a corresponding element of the control vector is active. If so, the processor processes the fault. Otherwise, the processor masks the fault for the element.Type: GrantFiled: July 18, 2012Date of Patent: October 14, 2014Assignee: Apple Inc.Inventor: Jeffry E. Gonion
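The fault-handling rule in this abstract can be modeled with a small Python simulation. Everything here is a hypothetical stand-in: memory is a dictionary, a missing address plays the role of a fault condition, and `MemoryFault` and `first_faulting_read` are names invented for the sketch. The abstract does not say whether elements after a masked fault are still read; this sketch assumes they are.

```python
class MemoryFault(Exception):
    """Stands in for a fault that the processor actually processes."""


def first_faulting_read(addresses, control, predicate, memory):
    """Read memory[addr] for each element active in both control and predicate.

    Returns (result, masked), where masked[i] is True if a fault on element i
    was suppressed. Raises MemoryFault only when the faulting element is the
    first element whose control entry is active.
    """
    n = len(addresses)
    result = [None] * n
    masked = [False] * n
    # Index of the first element with an active control entry (or None).
    first_active = next((i for i, c in enumerate(control) if c), None)
    for i, addr in enumerate(addresses):
        if not (control[i] and predicate[i]):
            continue  # inactive element: no read is attempted
        if addr in memory:
            result[i] = memory[addr]
        elif i == first_active:
            raise MemoryFault(f"fault at element {i}, address {addr:#x}")
        else:
            masked[i] = True  # later element: fault is masked, not raised
    return result, masked
```

The point of the design is that only the first active element is guaranteed to be needed, so a fault there is real, while faults on later elements may belong to reads that software never required.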
-
Publication number: 20140289494Abstract: Instructions and logic provide vector horizontal majority voting functionality. Some embodiments, responsive to an instruction specifying: a destination operand, a size of the vector elements, a source operand, and a mask corresponding to a portion of the vector element data fields in the source operand; read a number of values from data fields of the specified size in the source operand, corresponding to the mask specified by the instruction and store a result value to that number of corresponding data fields in the destination operand, the result value computed from the majority of values read from the number of data fields of the source operand.Type: ApplicationFiled: November 30, 2011Publication date: September 25, 2014Applicant: INTEL CORPORATIONInventors: Elmoustapha Ould-Ahmed-Vall, Kshitij A. Doshi, Suleyman Sair, Charles R. Yount
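A rough Python sketch of the masked horizontal majority vote follows. It assumes the vote is taken over whole element values (the abstract does not say whether the majority is per element or per bit), that ties go to the value seen first, and that unmasked destination fields are left unchanged. The function name is hypothetical.

```python
from collections import Counter


def masked_majority(src, mask, dst):
    """Write the most common masked source value into the masked destination fields."""
    selected = [value for value, m in zip(src, mask) if m]
    if not selected:
        return list(dst)  # nothing selected: destination is unchanged
    # most_common(1) returns the highest count; ties resolve to the value
    # encountered first (an assumption of this sketch, not of the patent).
    winner, _ = Counter(selected).most_common(1)[0]
    return [winner if m else d for d, m in zip(dst, mask)]
```

For instance, voting over the first four fields of `[7, 3, 7, 9, 7, 3]` selects 7 and writes it to those four destination fields only.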
-
Publication number: 20140223140Abstract: Embodiments of systems, apparatuses, and methods for performing in a computer processor vector packed unary encoding using masks in response to a single vector packed unary encoding using masks instruction that includes a source vector register operand, a destination writemask register operand, and an opcode are described.Type: ApplicationFiled: December 23, 2011Publication date: August 7, 2014Inventors: Elmoustapha Ould-Ahmed-Vall, Thomas Willhalm
-
Publication number: 20140223139Abstract: Embodiments of systems, apparatuses, and methods for performing in a computer processor generation of a predicate mask based on vector comparison in response to a single instruction are described.Type: ApplicationFiled: December 23, 2011Publication date: August 7, 2014Inventors: Victor W. Lee, Daehyun Kim, Tin-Fook Ngai, Jayashankar Bharadwaj, Albert Hartono, Sara Baghsorkhi, Nalini Vasudevan
-
Patent number: 8793472Abstract: The described embodiments include a processor that executes a vector instruction. The processor starts by receiving a start value and an increment value, and optionally receiving a predicate vector with N elements as inputs. The processor then executes the vector instruction. Executing the vector instruction causes the processor to generate a result vector. When generating the result vector, the processor sets each element of the result vector (or, if the predicate vector is received, each element for which the corresponding element of the predicate vector is active) equal to the start value plus the product of the increment value and a specified number of elements to the left of the element in the result vector.Type: GrantFiled: November 8, 2011Date of Patent: July 29, 2014Assignee: Apple Inc.Inventors: Jeffry E. Gonion, Keith E. Diefendorff
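The index-generating behavior in this abstract reduces to a short formula: each active element i receives start + increment × i, where i is the number of elements to its left. The Python sketch below illustrates that reading. The function name, the `prior` argument, and the choice to leave inactive elements at their prior value are assumptions, since the abstract does not state what inactive elements hold.

```python
def vector_index(start, increment, n, predicate=None, prior=None):
    """Return a result vector where active element i equals start + increment * i.

    predicate: optional list of n flags; when omitted, every element is active.
    prior: values kept for inactive elements (defaults to zeros, an assumption).
    """
    out = [0] * n if prior is None else list(prior)
    for i in range(n):
        if predicate is None or predicate[i]:
            out[i] = start + increment * i  # i elements lie to the left
    return out
```

Without a predicate, a start value of 10 and an increment of 3 over four elements yields `[10, 13, 16, 19]`.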
-
Publication number: 20140208065Abstract: An apparatus and method are described for expanding bits from a mask register in a processor and computing system with vector registers and vector data elements. For example, a method according to one embodiment includes the following operations: reading each mask register bit stored in a mask register, the mask register containing mask values used for performing operations on vector values stored in a set of vector registers; and replicating each mask register bit N times into a destination register, where N is the number of vector elements stored in each vector register.Type: ApplicationFiled: December 22, 2011Publication date: July 24, 2014Inventor: Elmoustapha Ould-Ahmed-Vall
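The replication step described here can be sketched in Python as follows. The mask is modeled as a list of 0/1 values rather than a hardware register, and the function name is hypothetical; the only behavior taken from the abstract is that each mask bit is repeated N times in order.

```python
def expand_mask(mask_bits, n):
    """Replicate each mask bit n times, preserving bit order."""
    return [bit for bit in mask_bits for _ in range(n)]
```

With this sketch, `expand_mask([1, 0, 1], 2)` gives `[1, 1, 0, 0, 1, 1]`.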
-
Publication number: 20140195775Abstract: Instructions and logic provide vector loads and/or stores with stride and mask functionality. Some embodiments, responsive to an instruction specifying: a set of loads, destination register, mask register, memory address, and stride length; execution units read values in the mask register, wherein fields in the mask register correspond to stride-length multiples from the memory address to data elements in memory. A first mask value indicates the element has not been loaded from memory and a second value indicates that the element does not need to be, or has already been loaded. For each field having the first value, the corresponding multiple of said stride length is generated according to the data field's position in the mask register to load the data element from memory into the corresponding destination register location, and the corresponding value in the mask register is changed to the second value. These instructions can restart after faults.Type: ApplicationFiled: September 26, 2011Publication date: July 10, 2014Applicant: Intel CorporationInventors: Elmoustapha Ould-Ahmed-Vall, Kshitij A. Doshi, Suleyman Sair, Charles R. Yount
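The restartable behavior in this abstract comes from updating the mask as each element completes. The Python sketch below models that idea under stated assumptions: memory is a dictionary, a missing address stands in for a fault, mask value 1 means "still to be loaded" and 0 means "done or not needed", and the function name is invented for illustration.

```python
def strided_masked_load(memory, base, stride, mask, dest):
    """Load dest[i] from memory[base + i * stride] for every mask[i] == 1.

    mask and dest are updated in place, so progress survives a fault: calling
    the function again after the fault is resolved resumes where it stopped.
    """
    for i, pending in enumerate(mask):
        if not pending:
            continue  # already loaded, or never requested
        addr = base + i * stride
        if addr not in memory:
            # Simulated fault: elements loaded so far keep their cleared mask bits.
            raise LookupError(f"fault at address {addr}")
        dest[i] = memory[addr]
        mask[i] = 0  # mark this element as completed
```

Because completed elements have their mask fields cleared before a fault is raised, a second call with the same mask and destination repeats no loads.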
-
Publication number: 20140189288Abstract: A vector reduction instruction with non-unit strided access pattern is received and executed by the execution circuitry of a processor. In response to the instruction, the execution circuitry performs an associative reduction operation on data elements of a first vector register. Based on values of the mask register and a current element position being processed, the execution circuitry sequentially sets one or more data elements of the first vector register to a result, which is generated by the associative reduction operation applied to both a previous data element of the first vector register and a data element of a third vector register. The previous data element is located more than one element position away from the current element position.Type: ApplicationFiled: December 28, 2012Publication date: July 3, 2014Inventors: Albert Hartono, Jayashankar Bharadwaj, Nalini Vasudevan, Sara S. Baghsorkhi, Victor W. Lee, Daehyun Kim
-
Patent number: 8725989Abstract: In one embodiment, a processor can perform a function call from a main program to a function that is to operate on at least one vector-type operand, in which only scalar values are passed to the function, and input values to the function including the at least one vector-type operand are to be renamed from virtual registers identified in the function to physical registers of a vector register file, and output values from the function including the at least one vector-type operand are to be renamed from virtual registers identified in the function to physical registers of the vector register file. Other embodiments are described and claimed.Type: GrantFiled: December 9, 2010Date of Patent: May 13, 2014Assignee: Intel CorporationInventor: Tomasz Madajczak