Masking Patents (Class 712/224)
-
Patent number: 11354124
Abstract: An apparatus is described having instruction execution logic circuitry to execute first, second, third and fourth instructions. Both the first instruction and the second instruction insert a first group of input vector elements to one of multiple first non-overlapping sections of respective first and second resultant vectors. The first group has a first bit width. Each of the multiple first non-overlapping sections has a same bit width as the first group. Both the third instruction and the fourth instruction insert a second group of input vector elements to one of multiple second non-overlapping sections of respective third and fourth resultant vectors. The second group has a second bit width that is larger than said first bit width. Each of the multiple second non-overlapping sections has a same bit width as the second group.
Type: Grant
Filed: August 3, 2017
Date of Patent: June 7, 2022
Assignee: Intel Corporation
Inventors: Elmoustapha Ould-Ahmed-Vall, Robert Valentine, Jesus Corbal, Bret L. Toll, Mark J. Charney, Zeev Sperber, Amit Gradstein
-
Patent number: 11347502
Abstract: An apparatus is described having instruction execution logic circuitry to execute first, second, third and fourth instructions. Both the first instruction and the second instruction insert a first group of input vector elements to one of multiple first non-overlapping sections of respective first and second resultant vectors. The first group has a first bit width. Each of the multiple first non-overlapping sections has a same bit width as the first group. Both the third instruction and the fourth instruction insert a second group of input vector elements to one of multiple second non-overlapping sections of respective third and fourth resultant vectors. The second group has a second bit width that is larger than said first bit width. Each of the multiple second non-overlapping sections has a same bit width as the second group.
Type: Grant
Filed: March 31, 2017
Date of Patent: May 31, 2022
Assignee: Intel Corporation
Inventors: Elmoustapha Ould-Ahmed-Vall, Robert Valentine, Jesus Corbal, Bret L. Toll, Mark J. Charney, Zeev Sperber, Amit Gradstein
-
Patent number: 11275583
Abstract: An apparatus is described having instruction execution logic circuitry to execute first, second, third and fourth instructions. Both the first instruction and the second instruction insert a first group of input vector elements to one of multiple first non-overlapping sections of respective first and second resultant vectors. The first group has a first bit width. Each of the multiple first non-overlapping sections has a same bit width as the first group. Both the third instruction and the fourth instruction insert a second group of input vector elements to one of multiple second non-overlapping sections of respective third and fourth resultant vectors. The second group has a second bit width that is larger than said first bit width. Each of the multiple second non-overlapping sections has a same bit width as the second group.
Type: Grant
Filed: August 3, 2017
Date of Patent: March 15, 2022
Assignee: Intel Corporation
Inventors: Elmoustapha Ould-Ahmed-Vall, Robert Valentine, Jesus Corbal, Bret L. Toll, Mark J. Charney, Zeev Sperber, Amit Gradstein
-
Patent number: 11048481
Abstract: A method of assembling software includes: enabling a user to code a Computation Function (CF) (S101); determining whether the user is done coding the CFs (S102); enabling a user to code a Part Function (PF) using one or more of the available CFs (S103); determining when the user is done creating all the PFs (S104); and enabling a user to create software using one or more of the available PFs.
Type: Grant
Filed: March 19, 2020
Date of Patent: June 29, 2021
Assignee: ELEMENT SOFTWARE, INC.
Inventor: Yi Young
-
Patent number: 10943006
Abstract: A computer-implemented method, non-transitory, computer-readable medium, and computer-implemented system are provided for data transmission in a trusted execution environment (TEE) system. The method is executed by a first thread in multiple threads on a TEE side. The method includes obtaining first data; obtaining a TEE side thread lock; calling a predetermined function by using the first data as an input parameter to switch to a non-TEE side; obtaining a write offset address and a read offset address respectively by reading a first address and a second address; determining whether a quantity of bytes of the first data is less than or equal to a quantity of writable bytes; if so, writing the first data into third addresses starting from the write offset address; updating the write offset address in the first address; returning to the TEE side; and releasing the TEE side thread lock.
Type: Grant
Filed: February 7, 2020
Date of Patent: March 9, 2021
Assignee: Advanced New Technologies Co., Ltd.
Inventors: Qi Liu, Boran Zhao, Ying Yan, Changzheng Wei
-
Patent number: 10699015
Abstract: A computer-implemented method, non-transitory, computer-readable medium, and computer-implemented system are provided for data transmission in a trusted execution environment (TEE) system. The method can be executed by a thread on a TEE side of the TEE system. The method includes obtaining first data; calling a predetermined function using the first data as an input parameter to switch to a non-TEE side; obtaining a write offset address by reading a first address; obtaining a read offset address by reading a second address; determining whether a quantity of bytes of the first data is less than or equal to a quantity of writable bytes; if so, writing the first data into third addresses starting from the write offset address; updating the write offset address in the first address; and returning to the TEE side.
Type: Grant
Filed: February 7, 2020
Date of Patent: June 30, 2020
Assignee: Alibaba Group Holding Limited
Inventors: Qi Liu, Boran Zhao, Ying Yan, Changzheng Wei
-
Patent number: 10691454
Abstract: Single Instruction, Multiple Data (SIMD) technologies are described. A processor can store a first bitmap and generate a second bitmap with each cell identifying a mask bit. The mask bit is set when 1) a corresponding cell in a first bitmap is not in conflict with other elements in the first bitmap or 2) a corresponding cell is in conflict with one or more other cells in the first bitmap and is a last cell in a sequential order of the first bitmap that conflicts with the one or more other cells, wherein a position of each cell in the second bitmap maps to a same position of the corresponding cell in the first bitmap. The processor can store the second bitmap as a mask for a scatter operation to avoid lane conflicts.
Type: Grant
Filed: January 16, 2019
Date of Patent: June 23, 2020
Assignee: Intel Corporation
Inventors: Jun Jin, Elmoustapha Ould-Ahmed-Vall
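The mask rule in this abstract can be illustrated with a minimal Python sketch. It assumes each cell of the first bitmap holds the memory index a SIMD lane wants to write, and that two cells "conflict" when they hold the same index. The function names `last_writer_mask` and `masked_scatter` are hypothetical and do not come from the patent.

```python
def last_writer_mask(indices):
    """Return one mask bit per lane.

    A bit is 1 when the lane's index is unique, or when the lane is the
    last one (in sequential order) among the lanes sharing that index.
    Assumption: "conflict" means two lanes target the same index.
    """
    last_seen = {}
    for lane, idx in enumerate(indices):
        last_seen[idx] = lane  # later lanes replace earlier ones
    return [1 if last_seen[idx] == lane else 0
            for lane, idx in enumerate(indices)]


def masked_scatter(memory, indices, values, mask):
    """Write values[lane] to memory[indices[lane]] only where mask is 1."""
    for idx, val, bit in zip(indices, values, mask):
        if bit:
            memory[idx] = val
    return memory


indices = [3, 1, 3, 0, 1, 2]
mask = last_writer_mask(indices)
memory = masked_scatter([0, 0, 0, 0], indices, [10, 20, 30, 40, 50, 60], mask)
```

With this mask, every memory location is written at most once, so the result does not depend on the order in which the hardware retires lanes.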
-
Patent number: 10678540
Abstract: An apparatus and method are provided for efficiently performing arithmetic operations that include at least a multiplication operation. The apparatus comprises processing circuitry to perform data processing operations, and instruction decode circuitry responsive to program instructions to generate control signals to control the processing circuitry to perform the data processing operations. In response to an arithmetic operation with shift instruction specifying performance of an arithmetic operation comprising at least a multiplication operation, and having a field which provides a programmable shift indication, the instruction decode circuitry is configured to control the processing circuitry to perform the arithmetic operation during which an intermediate value is produced, and to select a target portion of the intermediate value based on an output window determined from the programmable shift indication.
Type: Grant
Filed: May 8, 2018
Date of Patent: June 9, 2020
Assignee: Arm Limited
Inventors: Jacob Eapen, Mbou Eyole, Neil Burgess
-
Patent number: 10599401
Abstract: A method of assembling software includes: enabling a user to code a Computation Function (CF) (S101); determining whether the user is done coding the CFs (S102); enabling a user to code a Part Function (PF) using one or more of the available CFs (S103); determining when the user is done creating all the PFs (S104); and enabling a user to create software using one or more of the available PFs.
Type: Grant
Filed: January 15, 2018
Date of Patent: March 24, 2020
Inventor: Yi Young
-
Patent number: 10521271
Abstract: In an example, an apparatus comprises a plurality of execution units comprising at least a first type of execution unit and a second type of execution unit and logic, at least partially including hardware logic, to analyze a workload and assign the workload to one of the first type of execution unit or the second type of execution unit. Other embodiments are also disclosed and claimed.
Type: Grant
Filed: April 1, 2017
Date of Patent: December 31, 2019
Assignee: INTEL CORPORATION
Inventors: Abhishek R Appu, Altug Koker, Balaji Vembu, Joydeep Ray, Kamal Sinha, Prasoonkumar Surti, Kiran C. Veernapu, Subramaniam Maiyuran, Sanjeev S. Jahagirdar, Eric J. Asperheim, Guei-Yuan Lueh, David Puffer, Wenyin Fu, Nikos Kaburlasos, Bhushan M. Borole, Josh B. Mastronarde, Linda L. Hurd, Travis T. Schluessler, Tomasz Janczak, Abhishek Venkatesh, Kai Xiao, Slawomir Grajewski
-
Patent number: 10372455
Abstract: Receiving an instruction indicating a source operand and a destination operand. Storing a result in the destination operand in response to the instruction. The result operand may have: (1) first range of bits having a first end explicitly specified by the instruction in which each bit is identical in value to a bit of the source operand in a corresponding position; and (2) second range of bits that all have a same value regardless of values of bits of the source operand in corresponding positions. Execution of instruction may complete without moving the first range of the result relative to the bits of identical value in the corresponding positions of the source operand, regardless of the location of the first range of bits in the result. Execution units to execute such instructions, computer systems having processors to execute such instructions, and machine-readable medium storing such an instruction are also disclosed.
Type: Grant
Filed: December 12, 2014
Date of Patent: August 6, 2019
Assignee: Intel Corporation
Inventors: Maxim Loktyukhin, Eric W Mahurin, Bret L Toll, Martin G Dixon, Sean P Mirkes, David L Kreitzer, Elmoustapha Ould-Ahmed-Vall, Vinodh Gopal
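One way to picture the result described here is the following Python sketch. It assumes the first range starts at bit 0 and ends just below an explicitly given bit position, and that the second range is forced to zero; the abstract fixes neither choice. The function name `keep_low_bits` and the `width` parameter are hypothetical.

```python
def keep_low_bits(source, end, width=64):
    """Copy bits [0, end) of source in place; force bits [end, width) to 0.

    The kept bits are never shifted, so each one stays in the same
    position it had in the source operand.
    """
    if not 0 <= end <= width:
        raise ValueError("end must lie between 0 and width")
    keep_mask = (1 << end) - 1          # ones in the first range only
    return source & keep_mask & ((1 << width) - 1)


result = keep_low_bits(0b10110110, 4)   # keeps the low four bits: 0b0110
```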
-
Patent number: 10360039
Abstract: A mechanism for predicated execution of instructions within a parallel processor executing multiple threads or data lanes is disclosed. Each thread or data lane executing within the parallel processor is associated with a predicate register that stores a set of 1-bit predicates. Each of these predicates can be set using different types of predicate-setting instructions, where each predicate setting instruction specifies one or more source operands, at least one operation to be performed on the source operands, and one or more destination predicates for storing the result of the operation. An instruction can be guarded by a predicate that may influence whether the instruction is executed for a particular thread or data lane or how the instruction is executed for a particular thread or data lane.
Type: Grant
Filed: September 27, 2010
Date of Patent: July 23, 2019
Assignee: NVIDIA CORPORATION
Inventors: Richard Craig Johnson, John R. Nickolls, Robert Steven Glanville
-
Patent number: 10223119
Abstract: A processor of an aspect includes a decode unit to decode an instruction that indicates a first source packed data operand including a first plurality of data elements, a source mask including a plurality of mask elements, and a destination storage location. An execution unit, in response to the instruction, stores a result packed data operand. The result packed data operand has at least two unmasked result data elements corresponding to unmasked mask elements of the source mask. Each of the unmasked result data elements has a value of a corresponding data element of the first source packed data operand in a same relative position. All masked result data elements, between each nearest pair of unmasked result data elements, have a same value as an unmasked result data element of the pair closest to a first end of the result packed data operand.
Type: Grant
Filed: March 28, 2014
Date of Patent: March 5, 2019
Assignee: Intel Corporation
Inventor: Mikhail Plotnikov
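The propagation rule in this abstract can be sketched in Python as follows. The sketch assumes a mask value of 1 means "unmasked" and that the "first end" is element 0. The abstract does not say what masked elements outside any pair receive, so this sketch leaves them at a caller-chosen `fill` value. The name `propagate_unmasked` is hypothetical.

```python
def propagate_unmasked(src, mask, fill=0):
    """Copy unmasked elements; fill masked gaps from the pair's left member.

    Masked elements lying between two neighbouring unmasked elements take
    the value of the unmasked element nearer to index 0 (assumed "first end").
    """
    result = [fill] * len(src)
    unmasked = [i for i, bit in enumerate(mask) if bit]
    for i in unmasked:
        result[i] = src[i]                    # same relative position
    for left, right in zip(unmasked, unmasked[1:]):
        for i in range(left + 1, right):
            result[i] = src[left]             # propagate across the gap
    return result


out = propagate_unmasked([10, 20, 30, 40, 50, 60], [0, 1, 0, 0, 1, 0])
```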
-
Patent number: 10185562
Abstract: Single Instruction, Multiple Data (SIMD) technologies are described. A processing device can include a processor core and a memory. The processor core can generate a first bitmap comprising a plurality of bits, where the plurality of bits includes a first bit that represents a first memory location. The processor core can determine that the value of the first bit is equal to the value of a second bit in the first bitmap. The processor core can determine the location of the second bit in relation to the first bit in the first bitmap. The processor core can generate a second bitmap including a third bit indicating that the first bit is the last bit in the first bitmap with the same value as the second bit.
Type: Grant
Filed: December 24, 2015
Date of Patent: January 22, 2019
Assignee: Intel Corporation
Inventors: Jun Jin, Elmoustapha Ould-Ahmed-Vall
-
Patent number: 10042694
Abstract: A method for validating control blocks in memory includes monitoring for operations configured to obtain storage space in memory. The method examines the storage space that has been obtained to identify control blocks stored in the storage space. These control blocks are then analyzed to determine whether the control blocks are valid. In certain embodiments, this may be accomplished by comparing the content of the control blocks to information in a validation table that indicates possible values and ranges of values for fields in the control blocks. If a control block is valid, the method records a date and time when the control block was validated. If a control block is not valid, the method generates a message indicating that the control block is not valid. A corresponding system and computer program product are also disclosed.
Type: Grant
Filed: September 6, 2016
Date of Patent: August 7, 2018
Assignee: International Business Machines Corporation
Inventors: Philip R. Chauvet, Franklin E. McCune, David C. Reed, Esteban Rios
-
Patent number: 9990202
Abstract: A processor includes a first mode where the processor is not to use packed data operation masking, and a second mode where the processor is to use packed data operation masking. A decode unit to decode an unmasked packed data instruction for a given packed data operation in the first mode, and to decode a masked packed data instruction for a masked version of the given packed data operation in the second mode. The instructions have a same instruction length. The masked instruction has bit(s) to specify a mask. Execution unit(s) are coupled with the decode unit. The execution unit(s), in response to the decode unit decoding the unmasked instruction in the first mode, to perform the given packed data operation. The execution unit(s), in response to the decode unit decoding the masked instruction in the second mode, to perform the masked version of the given packed data operation.
Type: Grant
Filed: June 28, 2013
Date of Patent: June 5, 2018
Assignee: Intel Corporation
Inventors: Bret L. Toll, Ronak Singhal, Buford M. Guy, Mishali Naik
-
Patent number: 9940131
Abstract: A method of one aspect may include receiving a rotate instruction. The rotate instruction may indicate a source operand and a rotate amount. A result may be stored in a destination operand indicated by the rotate instruction. The result may have the source operand rotated by the rotate amount. Execution of the rotate instruction may complete without reading a carry flag.
Type: Grant
Filed: December 5, 2014
Date of Patent: April 10, 2018
Assignee: Intel Corporation
Inventors: Vinodh Gopal, James D Guilford, Gilbert M Wolrich, Wajdi K Feghali, Erdinc Ozturk, Martin G Dixon, Sean Mirkes, Bret L Toll, Maxim Loktyukhin, Mark C Davis, Alexandre J Farcy
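A rotate that neither reads nor writes a carry flag is a pure function of the source operand and the rotate amount, as this small Python sketch shows. The rotate direction (right) and the 32-bit default width are assumptions, since the abstract states neither. The name `rotate_right` is hypothetical.

```python
def rotate_right(value, amount, width=32):
    """Rotate `value` right by `amount` bits within a `width`-bit word.

    No flag state is consulted or produced: the result depends only on
    the two inputs.
    """
    word_mask = (1 << width) - 1
    value &= word_mask
    amount %= width                      # rotating by `width` is a no-op
    return ((value >> amount) | (value << (width - amount))) & word_mask


rotated = rotate_right(0x12345678, 8)    # low byte wraps to the top
```

Because no flag is an input, such an instruction has no dependency on earlier flag-writing instructions, which is presumably what lets it be scheduled more freely.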
-
Patent number: 9940130
Abstract: A method of one aspect may include receiving a rotate instruction. The rotate instruction may indicate a source operand and a rotate amount. A result may be stored in a destination operand indicated by the rotate instruction. The result may have the source operand rotated by the rotate amount. Execution of the rotate instruction may complete without reading a carry flag.
Type: Grant
Filed: December 5, 2014
Date of Patent: April 10, 2018
Assignee: Intel Corporation
Inventors: Vinodh Gopal, James D Guilford, Gilbert M Wolrich, Wajdi K Feghali, Erdinc Ozturk, Martin G Dixon, Sean Mirkes, Bret L Toll, Maxim Loktyukhin, Mark C Davis, Alexandre J Farcy
-
Patent number: 9935727
Abstract: One embodiment provides an apparatus for coupling between a trunk passive optical network (PON) and a leaf PON. The apparatus includes a trunk-side optical transceiver coupled to the trunk PON, a leaf-side optical transceiver coupled to the leaf PON, and an integrated circuit chip that includes an optical network unit (ONU) media access control (MAC) module, an optical line terminal (OLT) MAC module, and an on-chip memory.
Type: Grant
Filed: January 11, 2017
Date of Patent: April 3, 2018
Assignee: TIBIT COMMUNICATIONS, INC.
Inventor: Edward W. Boyd
-
Patent number: 9916160
Abstract: A method of one aspect may include receiving a rotate instruction. The rotate instruction may indicate a source operand and a rotate amount. A result may be stored in a destination operand indicated by the rotate instruction. The result may have the source operand rotated by the rotate amount. Execution of the rotate instruction may complete without reading a carry flag.
Type: Grant
Filed: December 5, 2014
Date of Patent: March 13, 2018
Assignee: Intel Corporation
Inventors: Vinodh Gopal, James D. Guilford, Gilbert M. Wolrich, Wajdi K Feghali, Erdinc Ozturk, Martin G Dixon, Sean Mirkes, Bret L Toll, Maxim Loktyukhin, Mark C Davis, Alexandre J Farcy
-
Patent number: 9904545
Abstract: According to one general aspect, an apparatus may include a monolithic shifter configured to receive a plurality of bytes of data, and, for each byte of data, a number of bits to shift the respective byte of data, wherein the number of bits for each byte of data need not be the same as for any other byte of data. The monolithic shifter may be configured to shift each byte of data by the respective number of bits. The apparatus may include a mask generator configured to compute a mask for each byte of data, wherein each mask indicates which bits, if any, are to be prevented from being polluted by a neighboring shifted byte of data. The apparatus may include a masking circuit configured to combine the shifted byte of data with a respective mask to create an unpolluted shifted byte of data.
Type: Grant
Filed: September 16, 2015
Date of Patent: February 27, 2018
Assignee: SAMSUNG ELECTRONICS CO., LTD.
Inventor: Eric C. Quinnell
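The interplay of the wide shifter, the mask generator, and the masking circuit can be modelled loosely in Python. This sketch assumes left shifts of 0 to 7 bits and little-endian byte lanes, and it models the "monolithic" shifter as a shift of the whole packed word, so a neighbour's bits really do spill into each lane before masking. The name `shift_bytes` is hypothetical, and the patent's actual circuit may differ.

```python
def shift_bytes(data, shifts):
    """Left-shift each byte of `data` by its own count from `shifts`."""
    if len(data) != len(shifts):
        raise ValueError("one shift count is needed per byte")
    word = int.from_bytes(data, "little")
    out = bytearray(len(data))
    for lane, count in enumerate(shifts):
        if not 0 <= count <= 7:
            raise ValueError("shift counts must be in 0..7")
        shifted = word << count                   # wide shift of every byte
        lane_bits = (shifted >> (8 * lane)) & 0xFF
        # The low `count` bits of the lane came from the neighbouring
        # byte, so the per-byte mask clears exactly those bits.
        mask = (0xFF << count) & 0xFF
        out[lane] = lane_bits & mask
    return bytes(out)


shifted = shift_bytes(bytes([0x81, 0xFF]), [1, 4])
```

In the second lane the wide shift produces 0xF8; the mask 0xF0 removes the stray bit that leaked in from the first byte.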
-
Patent number: 9564904
Abstract: A method of dividing a clock signal by an input signal of N bits with M most significant bits is described herein. The method includes dividing the clock signal by the most significant bits of the input signal $2^{N-M}-1$ times out of $2^{N-M}$ divisions of the clock signal, using a divider. The clock signal is divided by a sum of the most significant bits and the least significant bits one time out of $2^{N-M}$ divisions of the clock signal, using the divider. The clock signal is also divided by $2^{N-M}$, $2^{N-M}$ times, using the divider.
Type: Grant
Filed: April 21, 2015
Date of Patent: February 7, 2017
Assignee: STMICROELECTRONICS INTERNATIONAL N.V.
Inventors: Jeet Narayan Tiwari, Nitin Gupta
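The first two division steps can be checked with a short Python sketch. It assumes the counts in the abstract read as 2^(N-M) - 1 and 2^(N-M), that the N-bit input splits into M most significant bits and N-M least significant bits, and that the single longer division comes last in each group (the abstract gives no order). The name `divider_schedule` is hypothetical.

```python
def divider_schedule(value, n_bits, m_bits):
    """Return the divide ratios used over one group of 2**(N-M) divisions."""
    if not 0 <= value < (1 << n_bits):
        raise ValueError("value must fit in n_bits")
    low_bits = n_bits - m_bits
    msb = value >> low_bits                    # top M bits as an integer
    lsb = value & ((1 << low_bits) - 1)        # bottom N-M bits
    group = 1 << low_bits                      # 2**(N-M) divisions
    # Divide by msb for all but one division, then by msb + lsb once.
    return [msb] * (group - 1) + [msb + lsb]


schedule = divider_schedule(0b101101, n_bits=6, m_bits=4)
total_input_cycles = sum(schedule)
```

Under these assumptions one group consumes exactly `value` input clock cycles, because (2^(N-M) - 1) * msb + (msb + lsb) equals msb * 2^(N-M) + lsb.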
-
Patent number: 9396056
Abstract: In some disclosed embodiments instruction execution logic provides conditional memory fault assist suppression. Some embodiments of processors comprise a decode stage to decode one or more instruction specifying: a set of memory operations, one or more register, and one or more memory address. One or more execution units, responsive to the one or more decoded instruction, generate said one or more memory address for the set of memory operations. Instruction execution logic records one or more fault suppress bits to indicate whether one or more portion of the set of memory operations are masked. Fault generation logic is suppressed from considering a memory fault corresponding to a faulting one of the set of memory operations when said faulting one of the set of memory operations corresponds to a portion of the set of memory operations that is indicated as masked by said one or more fault suppress bits.
Type: Grant
Filed: March 15, 2014
Date of Patent: July 19, 2016
Assignee: Intel Corporation
Inventors: Zeev Sperber, Robert Valentine, Offer Levy, Michael Mishaeli, Gal Ofir
-
Patent number: 9154801
Abstract: A method and apparatus for encoding video data. At least a portion of a two dimensional array of transform coefficients representing a portion of video data is re-ordered to a one dimensional array of data by diagonally scanning the portion in at least two scan lines. Each scan line directed in a single common diagonal direction. Syntax elements representing at least a portion of the one dimensional array of data are then coded and transmitted.
Type: Grant
Filed: September 30, 2011
Date of Patent: October 6, 2015
Assignee: TEXAS INSTRUMENTS INCORPORATED
Inventors: Vivienne Sze, Madhukar Budagavi
-
Patent number: 9135004
Abstract: A rotate then operate instruction having a T bit is fetched and executed wherein a first operand in a first register is rotated by an amount and a Boolean operation is performed on a selected portion of the rotated first operand and a second operand of a second register. If the T bit is ‘0’ the selected portion of the result of the Boolean operation is inserted into corresponding bits of a second operand of a second register. If the T bit is ‘1’, in addition to the inserted bits, the bits other than the selected portion of the rotated first operand are saved in the second register.
Type: Grant
Filed: September 12, 2014
Date of Patent: September 15, 2015
Assignee: International Business Machines Corporation
Inventors: Dan F. Greiner, Timothy J. Slegel, Joachim von Buttlar
-
Patent number: 9124644
Abstract: An egress packet modifier includes a script parser and a pipeline of processing stages. Rather than performing egress modifications using a processor that fetches and decodes and executes instructions in a classic processor fashion, and rather than storing a packet in memory and reading it out and modifying it and writing it back, the packet modifier pipeline processes the packet by passing parts of the packet through the pipeline. A processor identifies particular egress modifications to be performed by placing a script code at the beginning of the packet. The script parser then uses the code to identify a specific script of opcodes, where each opcode defines a modification. As a part passes through a stage, the stage can carry out the modification of such an opcode. As realized using current semiconductor fabrication process, the packet modifier can modify 200M packets/second at a sustained rate of up to 100 gigabits/second.
Type: Grant
Filed: July 14, 2013
Date of Patent: September 1, 2015
Assignee: NETRONOME SYSTEMS, INC.
Inventors: Chirag P. Patel, Gavin J. Stark
-
Patent number: 9002122
Abstract: A codec includes an encoder having a quantization level generator that defines a quantization level specific to a block of values (e.g., transform coefficients), a quantizer that quantizes the block of transform coefficients according to the block-specific quantization level, a run-length encoder, and an entropy encoder. The quantization level is defined to result in at least a predetermined number (k) of quantized coefficients having a predetermined value. The amount of data compression by the encoder is proportional to (k). The codec also includes a decoder having entropy and run-length decoding sections whose throughputs are proportional to (k). The decoder takes advantage of this increased throughput by further decoding coefficients in parallel using a plurality of decoding channels. Methods for encoding and decoding data are also disclosed. The invention is well-suited to quantization, entropy, and/or run-length-based codecs, such as JPEG.
Type: Grant
Filed: July 19, 2012
Date of Patent: April 7, 2015
Assignee: OmniVision Technologies, Inc.
Inventor: Xuanming Du
-
Patent number: 8990816
Abstract: Techniques for simulating exclusive use of a processor core amongst multiple logical partitions (LPARs) include providing hardware thread-dependent status information in response to access requests by the LPARs that is reflective of exclusive use of the processor by the LPAR accessing the hardware thread-dependent information. The information returned in response to the access requests is transformed if the requestor is a program executing at a privilege level lower than the hypervisor privilege level, so that each logical partition views the processor as though it has exclusive use of the processor. The techniques may be implemented by a logical circuit block within the processor core that transforms the hardware thread-specific information to a logical representation of the hardware thread-specific information or the transformation may be performed by program instructions of an interrupt handler that traps access to the physical register containing the information.
Type: Grant
Filed: April 20, 2012
Date of Patent: March 24, 2015
Assignee: International Business Machines Corporation
Inventors: Giles R. Frazier, Bruce Mealy, Naresh Nayar
-
Patent number: 8959316
Abstract: The described embodiments include a processor that executes a vector instruction. The processor starts by receiving a vector instruction that optionally receives a predicate vector (which has N elements) as an input. The processor then executes the vector instruction. In the described embodiments, executing the vector instruction causes the processor to generate a result vector. When generating the result vector, if the predicate vector is received, for each element in the result vector for which a corresponding element of the predicate vector is active, otherwise, for each element of the result vector, the processor determines element positions for which a fault was masked during a prior operation. The processor then updates elements in the result vector to identify a leftmost element for which a fault was masked.
Type: Grant
Filed: October 19, 2010
Date of Patent: February 17, 2015
Assignee: Apple Inc.
Inventors: Jeffry E. Gonion, Keith E. Diefendorff
-
Patent number: 8954941
Abstract: Method of generating respective instruction compaction schemes for subsets of instructions to be processed by a programmable processor, comprising the steps of a) receiving at least one input code sample representative for software to be executed on the programmable processor, the input code comprising a plurality of instructions defining a first set of instructions (S1), b) initializing a set of removed instructions as empty (S3), c) determining the most compact representation of the first set of instructions (S4) d) comparing the size of said most compact representation with a threshold value (S5), e) carrying out steps e1 to e3 if the size is larger than said threshold value, e1) determining which instruction of the first set of instructions has a highest coding cost (S6), e2) removing said instruction having the highest coding cost from the first set of instructions and (S7), e3) adding said instruction to the set of removed instructions (S8), f) repeating steps b-f, wherein the first set of instructions
Type: Grant
Filed: September 3, 2010
Date of Patent: February 10, 2015
Assignee: Intel Corporation
Inventors: Hendrik Tjeerd Joannes Zwartenkot, Alexander Augusteijn, Yuanging Guo, Jürgen Von Oerthel, Jeroen Anton Johan Leijten, Erwan Yann Maurice Le Thenaff
-
Patent number: 8924694
Abstract: A programmable processor configured to perform one or more packet modifications through execution of one or more commands. A pipelined processor core comprises a first stage configured to selectively shift and mask data in each of a plurality of categories in response to one or more decoded commands, and combine the selectively shifted and masked data in each of the categories. The pipelined processor core further comprises a second stage configured to selectively perform one or more operations on the combined data from the first stage and other data responsive to the one or more decoded commands. In one implementation, the processor is implemented as an application specific integrated circuit (ASIC).
Type: Grant
Filed: April 11, 2012
Date of Patent: December 30, 2014
Assignee: Extreme Networks, Inc.
Inventors: David K. Parker, Erik R. Swenson, Christopher J. Young
-
Publication number: 20140365751
Abstract: A data processing apparatus has at least one processing pipeline having first, second and third pipeline stages. The first pipeline stage detects whether a stream of instructions to be processed includes a predetermined instruction sequence comprising first and second instructions for performing first and second operand generation operations, where the second operand generation operation is dependent on an outcome of the first. In response to detecting this instruction sequence, the first pipeline stage generates a modified stream of instructions in which at least the second instruction is replaced with a third instruction for performing a combined operand generation operation having the same effect as the first and second operand generation operations. As the third instruction can be scheduled independently of the first instruction, processing performance of the pipeline can be improved.
Type: Application
Filed: May 9, 2014
Publication date: December 11, 2014
Applicant: ARM LIMITED
Inventors: Ian Michael CAULFIELD, Max BATLEY, Peter Richard GREENHALGH
-
Patent number: 8909905
Abstract: A method and a device having a plurality of bit operations capability, the device includes: a first and a second registers and an instruction fetch circuit, and an arithmetic logic unit adapted to: calculate, during a first clock cycle, a position value representative of a position, within a first information vector, of a first bit of information that has a first value; and to multiply the position value by a multiplication factor to provide a first result and to alter the value of the first bit to a second value to provide an updated information vector, during the first clock cycle.
Type: Grant
Filed: August 18, 2006
Date of Patent: December 9, 2014
Assignee: Freescale Semiconductor, Inc.
Inventors: Eran Glickman, Evgeni Ginzburg, Noam Sheffer
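The combined find, scale, and clear step can be sketched in Python as below. The sketch assumes the "first bit" is the least significant set bit, that the first value is 1 and the second value is 0, and that an all-zero vector yields no result; none of these choices is fixed by the abstract. The name `find_scale_clear` is hypothetical.

```python
def find_scale_clear(vector, factor):
    """Locate the lowest set bit, scale its position, and clear the bit.

    Returns (position * factor, updated_vector), or None when no bit is set.
    """
    if vector <= 0:
        return None
    lowest = vector & -vector               # isolates the lowest set bit
    position = lowest.bit_length() - 1      # index of that bit
    updated = vector & (vector - 1)         # same vector with the bit cleared
    return position * factor, updated


scaled, updated = find_scale_clear(0b101000, factor=4)
```

Calling the function repeatedly on the updated vector walks through every set bit, yielding for example a byte offset for each pending entry when `factor` is an entry size.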
-
Publication number: 20140281396
Abstract: An instruction processing apparatus of an aspect includes a plurality of operation mask registers. The apparatus also includes a decode unit to receive an operation mask consolidation instruction. The operation mask consolidation instruction is to indicate a source operation mask register, of the plurality of operation mask registers, and a destination storage location. The source operation mask register is to include a source operation mask that is to include a plurality of masked elements that are to be disposed within a plurality of unmasked elements. An execution unit is coupled with the decode unit. The execution unit, in response to the operation mask consolidation instruction, is to store a consolidated operation mask in the destination storage location. The consolidated operation mask is to include the unmasked elements from the source operation mask consolidated together. Other apparatus, methods, systems, and instructions are also disclosed.
Type: Application
Filed: March 15, 2013
Publication date: September 18, 2014
Inventor: Ashish Jha
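A consolidated mask can be computed from a population count, as in this minimal Python sketch. It assumes unmasked elements are 1 bits and that they are gathered toward the least significant end; the abstract leaves both the encoding and the direction open. The name `consolidate_mask` and the 16-bit default width are hypothetical.

```python
def consolidate_mask(mask, width=16):
    """Pack all unmasked (1) bits of `mask` together at the low end."""
    mask &= (1 << width) - 1
    unmasked_count = bin(mask).count("1")   # number of unmasked elements
    return (1 << unmasked_count) - 1        # that many contiguous ones


packed = consolidate_mask(0b10100110)       # four unmasked elements
```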
-
Patent number: 8838943Abstract: A rotate then operate instruction having a T bit is fetched and executed wherein a first operand in a first register is rotated by an amount and a Boolean operation is performed on a selected portion of the rotated first operand and a second operand of a second register. If the T bit is ‘0’, the selected portion of the result of the Boolean operation is inserted into corresponding bits of a second operand of a second register. If the T bit is ‘1’, in addition to the inserted bits, the bits other than the selected portion of the rotated first operand are saved in the second register.Type: GrantFiled: July 21, 2010Date of Patent: September 16, 2014Assignee: International Business Machines CorporationInventors: Dan F. Greiner, Timothy J. Slegel, Joachim von Buttlar
-
Patent number: 8819399Abstract: Some embodiments provide a system that executes a native code module. During operation, the system obtains the native code module. Next, the system loads the native code module into a secure runtime environment. Finally, the system safely executes the native code module in the secure runtime environment by using a set of software fault isolation (SFI) mechanisms that use predicated store instructions and predicated control flow instructions, wherein each predicated instruction from the predicated store instructions and the predicated control flow instructions is executed if a mask condition associated with the predicated instruction is met.Type: GrantFiled: November 20, 2009Date of Patent: August 26, 2014Assignee: Google Inc.Inventors: Robert Muth, Karl Schmipf, David C. Sehr, Clifford L. Biffle
-
Patent number: 8782380Abstract: A processor and a method for privilege escalation in a processor are provided. The method may comprise fetching an instruction from a fetch address, where the instruction requires the processor to be in supervisor mode for execution, and determining whether the fetch address is within a predetermined address range. The instruction is filtered through an instruction mask and then it is determined whether the instruction, after being filtered through the mask, equals the value in an instruction value compare register. The processor privilege is raised to supervisor mode for execution of the instruction in response to the fetch address being within the predetermined address range and the filtered instruction equaling the value in the instruction value compare register, wherein the processor privilege is raised to supervisor mode without use of an interrupt. The processor privilege returns to its previous level after execution of the instruction.Type: GrantFiled: December 14, 2010Date of Patent: July 15, 2014Assignee: International Business Machines CorporationInventors: Anthony J. Bybell, Anup Wadia
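The two-part check in this abstract (address range plus masked instruction compare) can be sketched as follows. All names, the inclusive range bounds, and the example encodings are assumptions made for illustration; the patent does not specify them here.

```python
def should_raise_privilege(fetch_addr: int, instruction: int,
                           range_lo: int, range_hi: int,
                           instr_mask: int, instr_compare: int) -> bool:
    """Return True when both conditions from the abstract hold.

    Assumption: the predetermined address range is inclusive on both ends.
    """
    in_range = range_lo <= fetch_addr <= range_hi
    # Filter the instruction through the mask, then compare with the
    # value held in the (hypothetical) compare register.
    matches = (instruction & instr_mask) == instr_compare
    return in_range and matches


# Hypothetical values: match on the top 8 bits of a 32-bit instruction word.
print(should_raise_privilege(0x1004, 0x7C0012A6, 0x1000, 0x1FFF,
                             0xFF000000, 0x7C000000))  # True
```

Masking before the compare lets one compare value cover a family of encodings that differ only in the ignored bits.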
-
Publication number: 20140189323Abstract: An apparatus and method for propagating conditionally evaluated values. For example, a method according to one embodiment comprises: reading each value contained in an input mask register, each value being a true value or a false value and having a bit position associated therewith; for each true value read from the input mask register, generating a first result containing the bit position of the true value; for each false value read from the input mask register following the first true value, adding the vector length of the input mask register to a bit position of the last true value read from the input mask register to generate a second result; and storing each of the first results and second results in bit positions of an output register corresponding to the bit positions read from the input mask register.Type: ApplicationFiled: December 23, 2011Publication date: July 3, 2014Inventors: Jayashankar Bharadwaj, Nalini Vasudevan, Victor W. Lee, Daehyun Kim, Albert Hartono, Sara S. Baghsorkhi
-
Patent number: 8762690Abstract: The described embodiments provide a processor for generating a result vector with incremented or decremented values from an input vector. During operation, the processor receives an input vector and a control vector. The processor then copies a value contained in a selected element of the input vector. The processor next generates the result vector, which involves writing an incremented or decremented value to the result vector, depending on the value of the control vector and the embodiment. In addition, a predicate vector can be used to control the values that are written to the result vector.Type: GrantFiled: June 30, 2009Date of Patent: June 24, 2014Assignee: Apple Inc.Inventors: Jeffry E. Gonion, Keith E. Diefendorff, Jr.
-
Patent number: 8732440Abstract: A method for generating a digital signal pattern at M outputs involves retrieving an instruction from memory comprising a first set of bits identifying a first group of N outputs that includes fewer than all of the M outputs, and a second set of N bits each corresponding to a respective output included in the identified first group of N outputs. For each of the M outputs that is included in the identified first group of N outputs, the signal at the output is toggled if the one of the N bits corresponding to that output is in a first state and is kept in the same state if the one of the N bits corresponding to that output is in a second state. For each of the M outputs that is not included in the identified first group of N outputs, the signal at that output is kept in the same state.Type: GrantFiled: December 3, 2007Date of Patent: May 20, 2014Assignee: Analog Devices, Inc.Inventors: Christopher Jacobs, Andreas D. Olofsson, Paul Kettle
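A minimal sketch of the toggling rule in this abstract, assuming the M outputs are packed into one integer, that groups are contiguous blocks of N outputs selected by a group index, and that the "first state" of an instruction bit is 1. The name `apply_toggle` and the default sizes (M = 16, N = 4) are hypothetical.

```python
def apply_toggle(outputs: int, group: int, toggle_bits: int,
                 n: int = 4, m: int = 16) -> int:
    """Toggle outputs in the selected group where the matching bit is 1.

    Assumption: group g covers output bits g*n .. g*n + n - 1.
    Outputs outside the group, and group outputs whose bit is 0, keep state.
    """
    if not 0 <= group < m // n:
        raise ValueError("group index out of range")
    toggle_bits &= (1 << n) - 1                # only N bits are meaningful
    flipped = outputs ^ (toggle_bits << (group * n))
    return flipped & ((1 << m) - 1)


print(hex(apply_toggle(0x0000, group=1, toggle_bits=0b1010)))  # 0xa0
```

XOR with a shifted bit field expresses "toggle if 1, hold if 0" for the group and leaves every other output untouched.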
-
Patent number: 8725989Abstract: In one embodiment, a processor can perform a function call from a main program to a function that is to operate on at least one vector-type operand, in which only scalar values are passed to the function, and input values to the function including the at least one vector-type operand are to be renamed from virtual registers identified in the function to physical registers of a vector register file, and output values from the function including the at least one vector-type operand are to be renamed from virtual registers identified in the function to physical registers of the vector register file. Other embodiments are described and claimed.Type: GrantFiled: December 9, 2010Date of Patent: May 13, 2014Assignee: Intel CorporationInventor: Tomasz Madajczak
-
Patent number: 8719827Abstract: A processor for sequentially executing a plurality of programs using a plurality of register value groups stored in a memory that correspond one-to-one with the programs.Type: GrantFiled: July 11, 2011Date of Patent: May 6, 2014Assignee: Panasonic CorporationInventors: Kazushi Kurata, Tetsuya Tanaka, Nobuo Higaki, Kunihiko Hayashi, Hiroshi Kadota, Tokuzo Kiyohara, Kozo Kimura, Hideshi Nishida, Kazuya Furukawa, Shigeki Fujii, Toshio Sugimura
-
Publication number: 20130290687Abstract: An apparatus is described having instruction execution logic circuitry. The instruction execution logic circuitry has input vector element routing circuitry to perform the following for each of three different instructions: for each of a plurality of output vector element locations, route into an output vector element location an input vector element from one of a plurality of input vector element locations that are available to source the output vector element. The output vector element and each of the input vector element locations are one of three available bit widths for the three different instructions. The apparatus further includes masking layer circuitry coupled to the input vector element routing circuitry to mask a data structure created by the input vector routing element circuitry. The masking layer circuitry is designed to mask at three different levels of granularity that correspond to the three available bit widths.Type: ApplicationFiled: December 23, 2011Publication date: October 31, 2013Inventors: Elmoustapha Ould-Ahmed-Vall, Robert Valentine, Jesus Corbal, Bret L. Toll, Mark J. Charney, Zeev Sperber, Amit Gradstein
-
Patent number: 8549251Abstract: In some embodiments, an apparatus includes a register having a first portion and a second portion. The first portion of the register has multiple bits and the second portion of the register has multiple bits. Each bit from the multiple bits of the first portion of the register is associated with a bit from the multiple bits of the second portion of the register such that a bit from the multiple bits of the first portion of the register is set for its associated bit from the multiple bits of the second portion of the register to be written.Type: GrantFiled: December 15, 2010Date of Patent: October 1, 2013Assignee: Juniper Networks, Inc.Inventors: Murali Vemula, Sathish Shenoy
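The per-bit write enable in this abstract can be sketched as below. The layout is an assumption: the upper half of the written word is treated as the first portion (enable bits) and the lower half as the second portion (data bits), with enable bit i paired with data bit i. The name `masked_write` is hypothetical.

```python
def masked_write(current: int, write_value: int, half: int = 16) -> int:
    """Update only those data bits whose associated enable bit is set.

    Assumption: write_value = (enable << half) | data.
    `current` is the stored value of the data portion.
    """
    low = (1 << half) - 1
    enable = (write_value >> half) & low
    data = write_value & low
    kept = current & ~enable & low      # bits not enabled keep their value
    written = data & enable             # enabled bits take the new data
    return kept | written


print(hex(masked_write(0x00FF, 0x0F0FAAAA)))  # 0xafa
```

A scheme like this allows individual bits to be updated with a single write, without a separate read-modify-write sequence.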
-
Publication number: 20130246760Abstract: A data processor comprising a plurality of registers, and instruction execution circuitry having an associated instruction set, wherein the instruction set includes an instruction specifying at least a mask operand, a register operand and an immediate value operand, and the instruction execution circuitry, in response to an instance of the instruction, determines a Boolean value based on the mask operand and sets a respective one of a plurality of registers specified by the register operand of the instance to a value of the immediate value operand if the Boolean value is true. The instruction execution circuitry, in response to the instance of the instruction, may set the respective one of the plurality of registers specified by the register operand of the instance to zero if the Boolean value is false.Type: ApplicationFiled: March 11, 2013Publication date: September 19, 2013Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATIONInventor: INTERNATIONAL BUSINESS MACHINES CORPORATION
-
Patent number: 8392666Abstract: An apparatus detects a load-store collision within a microprocessor between a load operation and an older store operation each of which accesses data in the same cache line. Load and store byte masks specify which bytes contain the data specified by the load and store operations within a word of the cache line in which the load and store data begins, respectively. Load and store word masks specify which words contain the data specified by the load and store operations within the cache line, respectively. Combinatorial logic uses the load and store byte masks to detect the load-store collision if the data specified by the load and store operations begin in the same cache line word, and uses the load and store word masks to detect the load-store collision if the data specified by the load and store operations do not begin in the same cache line word.Type: GrantFiled: October 20, 2009Date of Patent: March 5, 2013Assignee: VIA Technologies, Inc.Inventors: Rodney E. Hooker, Colin Eddy
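The two-level mask comparison in this abstract can be sketched in software. The sizes (64-byte cache line, 8-byte words), the `(offset, size)` representation of an access, and the function names are assumptions for illustration; the patent describes combinatorial logic, not this code.

```python
WORD_BYTES = 8    # assumed word size
LINE_BYTES = 64   # assumed cache line size


def make_masks(offset: int, size: int):
    """Return (start_word, byte_mask, word_mask) for one access.

    byte_mask: bytes touched inside the word where the data begins.
    word_mask: words of the cache line touched by the access.
    """
    if size < 1 or offset < 0 or offset + size > LINE_BYTES:
        raise ValueError("access must lie within one cache line")
    start_word = offset // WORD_BYTES
    end = offset + size                      # exclusive end offset
    byte_mask = 0
    for b in range(WORD_BYTES):
        if offset <= start_word * WORD_BYTES + b < end:
            byte_mask |= 1 << b
    word_mask = 0
    for w in range(start_word, (end - 1) // WORD_BYTES + 1):
        word_mask |= 1 << w
    return start_word, byte_mask, word_mask


def collides(load, store) -> bool:
    """Byte masks if both begin in the same word, else word masks."""
    l_word, l_bytes, l_words = make_masks(*load)
    s_word, s_bytes, s_words = make_masks(*store)
    if l_word == s_word:
        return (l_bytes & s_bytes) != 0
    return (l_words & s_words) != 0


print(collides((0, 2), (4, 2)))   # False: same word, disjoint bytes
print(collides((6, 4), (8, 4)))   # True: different start words, shared word
```

In this sketch the word-mask path is conservative: accesses that begin in different words but share a word are reported as colliding even when their bytes do not overlap.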
-
Patent number: 8370608Abstract: The described embodiments provide a processor for generating a result vector with copied or propagated values from an input vector. During operation, the processor receives at least one input vector and a control vector. Using these vectors, the processor generates the result vector, which can contain copied or propagated values from the input vector(s), depending on the value of the control vector. In addition, a predicate vector can be used to control the values that are written to the result vector.Type: GrantFiled: June 30, 2009Date of Patent: February 5, 2013Assignee: Apple Inc.Inventors: Jeffry E. Gonion, Keith E. Diefendorff
-
Patent number: 8364938Abstract: In the described embodiments, a processor captures a value from an element at a key element position in a second input vector into a base value. The processor then generates a result vector by, if the predicate vector is received, for each element in the result vector to the right of the key element position for which a corresponding element in the predicate vector is active, otherwise, for each element in the result vector to the right of the key element position, setting the element in the result vector equal to a result from an associative Boolean operation or a multiplication operation for which the inputs are the base value and a value in each relevant element of a first input vector from an element at the key element position to and including a predetermined element in the first input vector.Type: GrantFiled: August 14, 2009Date of Patent: January 29, 2013Assignee: Apple Inc.Inventors: Jeffry E. Gonion, Keith E. Diefendorff, Jr.
-
Patent number: 8356164Abstract: The described embodiments provide a processor for generating a result vector with shifted values from an input vector. During operation, the processor receives an input vector and a control vector. Using these vectors, the processor generates the result vector, which can contain shifted values or propagated values from the input vector, depending on the value of the control vector. In addition, a predicate vector can be used to control the values that are written to the result vector.Type: GrantFiled: June 30, 2009Date of Patent: January 15, 2013Assignee: Apple Inc.Inventors: Jeffry E. Gonion, Keith E. Diefendorff
-
Patent number: 8356162Abstract: An execution unit supports data dependent conditional write instructions that write data to a target only when a particular condition is met. In one implementation, a data dependent conditional write instruction identifies a condition as well as data to be tested against that condition. The data is tested against that condition, and the result of the test is used to selectively enable or disable a write to a target associated with the data dependent conditional write instruction. Then, a write is attempted while the write to the target is enabled or disabled such that the write will update the contents of the target only when the write is selectively enabled as a result of the test. By doing so, dependencies are typically avoided, as is use of an architected condition register that might otherwise introduce branch prediction mispredict penalties, enabling improved performance with z-buffer test and similar types of algorithms.Type: GrantFiled: March 18, 2008Date of Patent: January 15, 2013Assignee: International Business Machines CorporationInventors: Adam James Muff, Matthew Ray Tubbs