Patents by Inventor Larry Mennemeier
Larry Mennemeier has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Patent number: 7373490Abstract: A method in a computer system, one embodiment includes accessing a packed data instruction and generating a corresponding set of control bits to cause a processor to alter a top of stack to zero of a programmer visible register file, accessing a floating point instruction and generating a corresponding set of control bits that cause the processor to operate on the programmer visible register file as a stack, but accessing a transition instruction between the packed data instruction and the scalar floating point instruction and generating a corresponding set of control bits to cause the processor to alter tag data to indicate that programmer visible register file is empty. The method advantageously provides a means for clearing the packed data state at the end of blocks of packed data instructions to leave the floating point state in a clear condition for subsequent operations (e.g. floating point calculations).Type: GrantFiled: March 19, 2004Date of Patent: May 13, 2008Assignee: Intel CorporationInventors: David Bistry, Larry Mennemeier, Alexander D. Peleg, Carole Dulong, Eiichi Kowashi, Millind Mittal, Benny Eitan
-
Publication number: 20070239810Abstract: A method and apparatus for providing, in a processor, a shift operation on a packed data element having multiple values. One embodiment of a central processing unit (CPU) includes instruction fetch logic to fetch a single-instruction-multiple-data (SIMD) shift instruction. A register stores a multiple data elements to be operated upon by the SIMD shift instruction. A barrel shifter concurrently shifts the data elements in a bit-wise manner by a variable number of bit positions in response to the SIMD shift instruction.Type: ApplicationFiled: June 6, 2007Publication date: October 11, 2007Inventors: DERRICK LIN, Punit Minocha, Alexander Peleg, Yaakov Yaari, Millind Mittal, Larry Mennemeier, Benny Eitan
-
Patent number: 7133040Abstract: An apparatus and method for performing an insert-extract operation on packed data using computer-implemented steps is described. In one embodiment, a first data operand having a data element is accessed. A second packed data operand having at least two data elements is then accessed. The data element in the first data operand is inserted into any destination field of a destination register, or alternatively, a data element is extracted from any field of the source register.Type: GrantFiled: March 31, 1998Date of Patent: November 7, 2006Assignee: Intel CorporationInventors: Mohammad Abdallah, Srinivas Chennupaty, Robert S. Dreyer, Michael A. Julier, Katherine Kong, Larry Mennemeier, Ticky S. Thakkar
-
Publication number: 20060235914Abstract: A method and apparatus for providing, in a processor, a shift operation on a packed data element having multiple values. The apparatus having multiple muxes, each of the multiple muxes having a first input, a second input, a select input and an output. Each of the multiple bits that represent a shifted packed intermediate result on a first bus is coupled to the corresponding first input. Each of the multiple bits representing a replacement bit for one of the multiple values is coupled to a corresponding second input. Each of the multiple bits driven by a correction circuit is coupled to a corresponding select input. Each output corresponds to a bit of a shifted packed result.Type: ApplicationFiled: June 15, 2006Publication date: October 19, 2006Inventors: Derrick Lin, Punit Minocha, Alexander Peleg, Yaakov Yaari, Millind Mittal, Larry Mennemeier, Benny Eitan
-
Publication number: 20060236076Abstract: An apparatus includes an instruction decoder, first and second source registers and a circuit coupled to the decoder to receive packed data from the source registers and to pack the packed data responsive to a pack instruction received by the decoder. A first packed data element and a second packed data element are received from the first source register. A third packed data element and a fourth packed data element are received from the second source register. The circuit packs packing a portion of each of the packed data elements into a destination register resulting with the portion from second packed data element adjacent to the portion from the first packed data element, and the portion from the fourth packed data element adjacent to the portion from the third packed data element.Type: ApplicationFiled: June 12, 2006Publication date: October 19, 2006Inventors: Alexander Peleg, Yaakov Yaari, Millind Mittal, Larry Mennemeier, Benny Eitan
-
Publication number: 20050219897Abstract: A method and apparatus for providing, in a processor, a shift operation on a packed data element having multiple values. The apparatus having multiple muxes, each of the multiple muxes having a first input, a second input, a select input and an output. Each of the multiple bits that represent a shifted packed intermediate result on a first bus is coupled to the corresponding first input. Each of the multiple bits representing a replacement bit for one of the multiple values is coupled to a corresponding second input. Each of the multiple bits driven by a correction circuit is coupled to a corresponding select input. Each output corresponds to a bit of a shifted packed result.Type: ApplicationFiled: May 27, 2005Publication date: October 6, 2005Inventors: Derrick Lin, Punit Minocha, Alexander Peleg, Yaakov Yaari, Millind Mittal, Larry Mennemeier, Benny Eitan
-
Publication number: 20050038977Abstract: A processor with instructions to operate on different data types stored in a single logical register file. According to one embodiment of the invention, a processor includes a number of physical registers, a memory unit, and a decode/execution unit. The memory unit is to make the number of physical registers appear to software as a single software-visible register file. The decode/execution unit is to execute on the contents of the single software-visible register file instructions of a first instruction type and of a second instruction type, wherein the single software-visible register file is to be operated as a flat register file during execution of instructions of the second instruction type and as a stack referenced register file during execution of instructions of the first instruction type.Type: ApplicationFiled: September 13, 2004Publication date: February 17, 2005Inventors: Andrew Glew, Larry Mennemeier, Alexander Peleg, David Bistry, Millind Mittal, Carole Dulong, Eiichi Kowashi, Benny Eitan, Derrick Lin, Ramamohan Vakkalagadda
-
Publication number: 20040181649Abstract: A method in a computer system, one embodiment includes accessing a packed data instruction and generating a corresponding set of control bits to cause a processor to alter a top of stack to zero of a programmer visible register file, accessing a floating point instruction and generating a corresponding set of control bits that cause the processor to operate on the programmer visible register file as a stack, but accessing a transition instruction between the packed data instruction and the scalar floating point instruction and generating a corresponding set of control bits to cause the processor to alter tag data to indicate that programmer visible register file is empty. The method advantageously provides a means for clearing the packed data state at the end of blocks of packed data instructions to leave the floating point state in a clear condition for subsequent operations (e.g. floating point calculations).Type: ApplicationFiled: March 19, 2004Publication date: September 16, 2004Inventors: David Bistry, Larry Mennemeier, Alexander D. Peleg, Carole Dulong, Elichi Kowashi, Millind Mittal, Benny Eitan
-
Patent number: 6751725Abstract: Methods and apparatuses to clear state for operation of a stack. According to one embodiment of the invention, a processor comprises a set of one or more storage areas and a decode unit. The set of one or more storage areas are to store a plurality of tags and a top of stack indication, where each of the plurality of tags is to indicate if a register is in an empty or non-empty state. The decode unit is to decode scalar floating point instructions and packed data instructions, where at least certain of said scalar floating point instructions specify registers in a stack referenced manner and at least certain of said packed data instructions specify registers in a non-stack referenced manner. In addition, the packed data instructions include an instruction to mark the end of blocks of the packed data instructions in programs. The processor also comprises circuitry to cause the plurality of tags to indicate the empty state responsive to execution of the instruction.Type: GrantFiled: February 16, 2001Date of Patent: June 15, 2004Assignee: Intel CorporationInventors: David Bistry, Larry Mennemeier, Alexander D. Peleg, Carole Dulong, Eiichi Kowashi, Millind Mittal, Benny Eitan
-
Publication number: 20010047469Abstract: A method in a computer system which includes receiving a first instruction which indicates termination of execution of instructions which operate upon packed data stored in a first storage area. The first storage area is used for modifying data responsive to execution of floating point instructions. A plurality of tags is associated with the first storage area indicating that locations in the first storage area are either empty or non-empty responsive to the execution of the floating point instructions which modify data contained in the first storage area. Responsive to the receiving of the first instruction which indicates termination of execution of instructions which operate upon the packed data stored in the first storage area, the method sets only the plurality of tags to an empty state. In different embodiments, setting of the plurality of tags to a non-empty state occurs responsive to receiving a second instruction.Type: ApplicationFiled: February 16, 2001Publication date: November 29, 2001Applicant: Intel CorporationInventors: David Bistry, Larry Mennemeier, Alexander D. Peleo, Carole Dulong, Eiichi Kowashi, Millind Mittal, Benny Eitan
-
Patent number: 6266686Abstract: A method in a computer system which includes receiving a first instruction which indicates indicates termination of execution of instructions which operate upon packed data stored in a first storage area. The first storage area is used for modifying data responsive to execution of floating point instructions. A plurality of tags is associated with the first storage area indicating that locations in the first storage area are either empty or non-empty responsive to the execution of the floating point instructions which modify data contained in the first storage area. Responsive to the receiving of the first instruction which indicates termination of execution of instructions which operate upon the packed data stored in the first storage area, the method sets only the plurality of tags to an empty state. In different embodiments, setting of the plurality of tags to a non-empty state occurs responsive to receiving a second instruction.Type: GrantFiled: March 4, 1999Date of Patent: July 24, 2001Assignee: Intel CorporationInventors: David Bistry, Larry Mennemeier, Alexander D. Peleg, Carole Dulong, Eiichi Kowashi, Millind Mittal, Benny Eitan
-
Patent number: 6128614Abstract: A technique for sorting packed numbers of two operands into minima or maxima operand with their indices to identify the origin of those selected values. After packing two source operands with a plurality of data elements containing numerical values, greater-than comparison operation is performed on the two operands to generate a mask. The mask is used to identify those corresponding pair of data elements of the first and second operands which need to be passed through the subsequent stages in order to generate a sorted minima or maxima. The operands are AND'ed with the mask or the complement of the mask to generate the required minima/maxima result. The same AND'ing technique is used with two other operands containing indices of the values in the first two operands. The indices identify the originating location of the sorted maxima/minima.Type: GrantFiled: February 8, 1999Date of Patent: October 3, 2000Assignee: Intel CorporationInventors: Larry Mennemeier, Alexander Peleg, Carole Dulong, Millind Mittal, Benny Eitan, Eiichi Kowashi
-
Patent number: 6070237Abstract: A novel processor for manipulating packed data. The packed data includes a first data element D1 and a second data element D2. Each of said data elements has a predetermined number of bits. The processor comprises a decoder, a register, and a circuit. The decoder is for decoding a control signal responsive to receiving the control signal. The register is coupled to the decoder. The register is for storing the packed data. The circuit is coupled to the decoder. The circuit is for generating a first result data element R1 and a second data element R2. The circuit is further for generating R1 to represent a total number bits set in D1, and the circuit is further for generating R2 to represent a total number bits set in D2.Type: GrantFiled: March 4, 1996Date of Patent: May 30, 2000Assignee: Intel CorporationInventors: Alexander Peleg, Yaakov Yaari, Millind Mittal, Larry Mennemeier, Benny Eitan
-
Patent number: 6036350Abstract: A technique for sorting packed signed numbers of two operands into maxima and minima operands and solving absolute differences for each pair of corresponding values of maxima and minima. After packing two source operands with a plurality of data elements containing signed values, a greater-than comparison operation is performed on each pair of corresponding numbers in the two operands to determine which is greater. An exclusive-OR mask is generated for use in swapping those values which need to be rearranged so that all maxima are in one operand and all minima are in another operand. Once the sorting of maxima and minima is complete, a packed subtraction operation is then performed by subtracting the minima from corresponding maxima to obtain absolute differences.Type: GrantFiled: May 20, 1997Date of Patent: March 14, 2000Assignee: Intel CorporationInventors: Larry Mennemeier, Alexander Peleg, Carole Dulong, Millind Mittal, Benny Eitan, Eiichi Kowashi
-
Patent number: 5940859Abstract: A method in a computer system which includes receiving a first instruction which indicates indicates termination of execution of instructions which operate upon packed data stored in a first storage area. The first storage area is used for modifying data responsive to execution of floating point instructions. A plurality of tags is associated with the first storage area indicating that locations in the first storage area are either empty or non-empty responsive to the execution of the floating point instructions which modify data contained in the first storage area. Responsive to the receiving of the first instruction which indicates termination of execution of instructions which operate upon the packed data stored in the first storage area, the method sets only the plurality of tags to an empty state. In different embodiments, setting of the plurality of tags to a non-empty state occurs responsive to receiving a second instruction.Type: GrantFiled: December 19, 1995Date of Patent: August 17, 1999Assignee: Intel CorporationInventors: David Bistry, Larry Mennemeier, Alexander D. Peleg, Carole Dulong, Eiichi Kowashi, Millind Mittal, Benny Eitan
-
Patent number: 5907842Abstract: A technique for sorting packed numbers of two operands into minima or maxima operand with their indices to identify the origin of those selected values. After packing two source operands with a plurality of data elements containing numerical values, greater-than comparison operation is performed on the two operands to generate a mask. The mask is used to identify those corresponding pair of data elements of the first and second operands which need to be passed through the subsequent stages in order to generate a sorted minima or maxima. The operands are AND'ed with the mask or the complement of the mask to generate the required minima/maxima result. The same AND'ing technique is used with two other operands containing indices of the values in the first two operands. The indices identify the originating location of the sorted maxima/minima.Type: GrantFiled: December 20, 1995Date of Patent: May 25, 1999Assignee: Intel CorporationInventors: Larry Mennemeier, Alexander Peleg, Carole Dulong, Millind Mittal, Benny Eitan, Eiichi Kowashi
-
Patent number: 5857096Abstract: An apparatus (e.g. a microarchitecture of a microprocessor) comprising a plurality of tags associated with a first storage area indicating that locations in the first storage area are either empty or non-empty responsive to execution of floating point instructions which modify data contained in the first storage area. A first circuit is coupled to the plurality of tags which sets only the plurality of tags to an empty state responsive to receipt of a first instruction. The first instruction indicates termination of execution of instructions which operate upon the packed data stored in the first storage area. The apparatus further comprises a second circuit coupled to the plurality of tags for setting the plurality of tags to a non-empty state responsive to receipt of a second instruction (or instructions). The second instruction specifies an operation upon packed data stored in the first storage area.Type: GrantFiled: December 19, 1995Date of Patent: January 5, 1999Assignee: Intel CorporationInventors: David Bistry, Larry Mennemeier, Alexander D. Peleg, Carole Dulong, Eiichi Kowashi, Millind Mittal, Benny Eitan
-
Patent number: 5793661Abstract: A method of multiplying and accumulating two sets of values in a computer system. A packed multiply add is performed on a first portion of a first set of values packed into a first source and a first portion of a second set of values packed into a second source to generate a first result. The first result is unpacked into a plurality of values (e.g. two). The plurality of values is then added together to form a resulting accumulation value.Type: GrantFiled: December 26, 1995Date of Patent: August 11, 1998Assignee: Intel CorporationInventors: Carole Dulong, Larry Mennemeier, Tuan H. Bui, Eiichi Kowashi, Alexander D. Peleg, Benny Eitan, Stephen A. Fischer, Benny Maytal, Millind Mittal