Patents by Inventor Larry Mennemeier

Larry Mennemeier has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 7373490
    Abstract: A method in a computer system, one embodiment includes accessing a packed data instruction and generating a corresponding set of control bits to cause a processor to alter a top of stack to zero of a programmer visible register file, accessing a floating point instruction and generating a corresponding set of control bits that cause the processor to operate on the programmer visible register file as a stack, but accessing a transition instruction between the packed data instruction and the scalar floating point instruction and generating a corresponding set of control bits to cause the processor to alter tag data to indicate that programmer visible register file is empty. The method advantageously provides a means for clearing the packed data state at the end of blocks of packed data instructions to leave the floating point state in a clear condition for subsequent operations (e.g. floating point calculations).
    Type: Grant
    Filed: March 19, 2004
    Date of Patent: May 13, 2008
    Assignee: Intel Corporation
    Inventors: David Bistry, Larry Mennemeier, Alexander D. Peleg, Carole Dulong, Eiichi Kowashi, Millind Mittal, Benny Eitan
  • Publication number: 20070239810
    Abstract: A method and apparatus for providing, in a processor, a shift operation on a packed data element having multiple values. One embodiment of a central processing unit (CPU) includes instruction fetch logic to fetch a single-instruction-multiple-data (SIMD) shift instruction. A register stores a multiple data elements to be operated upon by the SIMD shift instruction. A barrel shifter concurrently shifts the data elements in a bit-wise manner by a variable number of bit positions in response to the SIMD shift instruction.
    Type: Application
    Filed: June 6, 2007
    Publication date: October 11, 2007
    Inventors: DERRICK LIN, Punit Minocha, Alexander Peleg, Yaakov Yaari, Millind Mittal, Larry Mennemeier, Benny Eitan
  • Patent number: 7133040
    Abstract: An apparatus and method for performing an insert-extract operation on packed data using computer-implemented steps is described. In one embodiment, a first data operand having a data element is accessed. A second packed data operand having at least two data elements is then accessed. The data element in the first data operand is inserted into any destination field of a destination register, or alternatively, a data element is extracted from any field of the source register.
    Type: Grant
    Filed: March 31, 1998
    Date of Patent: November 7, 2006
    Assignee: Intel Corporation
    Inventors: Mohammad Abdallah, Srinivas Chennupaty, Robert S. Dreyer, Michael A. Julier, Katherine Kong, Larry Mennemeier, Ticky S. Thakkar
  • Publication number: 20060235914
    Abstract: A method and apparatus for providing, in a processor, a shift operation on a packed data element having multiple values. The apparatus having multiple muxes, each of the multiple muxes having a first input, a second input, a select input and an output. Each of the multiple bits that represent a shifted packed intermediate result on a first bus is coupled to the corresponding first input. Each of the multiple bits representing a replacement bit for one of the multiple values is coupled to a corresponding second input. Each of the multiple bits driven by a correction circuit is coupled to a corresponding select input. Each output corresponds to a bit of a shifted packed result.
    Type: Application
    Filed: June 15, 2006
    Publication date: October 19, 2006
    Inventors: Derrick Lin, Punit Minocha, Alexander Peleg, Yaakov Yaari, Millind Mittal, Larry Mennemeier, Benny Eitan
  • Publication number: 20060236076
    Abstract: An apparatus includes an instruction decoder, first and second source registers and a circuit coupled to the decoder to receive packed data from the source registers and to pack the packed data responsive to a pack instruction received by the decoder. A first packed data element and a second packed data element are received from the first source register. A third packed data element and a fourth packed data element are received from the second source register. The circuit packs packing a portion of each of the packed data elements into a destination register resulting with the portion from second packed data element adjacent to the portion from the first packed data element, and the portion from the fourth packed data element adjacent to the portion from the third packed data element.
    Type: Application
    Filed: June 12, 2006
    Publication date: October 19, 2006
    Inventors: Alexander Peleg, Yaakov Yaari, Millind Mittal, Larry Mennemeier, Benny Eitan
  • Publication number: 20050219897
    Abstract: A method and apparatus for providing, in a processor, a shift operation on a packed data element having multiple values. The apparatus having multiple muxes, each of the multiple muxes having a first input, a second input, a select input and an output. Each of the multiple bits that represent a shifted packed intermediate result on a first bus is coupled to the corresponding first input. Each of the multiple bits representing a replacement bit for one of the multiple values is coupled to a corresponding second input. Each of the multiple bits driven by a correction circuit is coupled to a corresponding select input. Each output corresponds to a bit of a shifted packed result.
    Type: Application
    Filed: May 27, 2005
    Publication date: October 6, 2005
    Inventors: Derrick Lin, Punit Minocha, Alexander Peleg, Yaakov Yaari, Millind Mittal, Larry Mennemeier, Benny Eitan
  • Publication number: 20050038977
    Abstract: A processor with instructions to operate on different data types stored in a single logical register file. According to one embodiment of the invention, a processor includes a number of physical registers, a memory unit, and a decode/execution unit. The memory unit is to make the number of physical registers appear to software as a single software-visible register file. The decode/execution unit is to execute on the contents of the single software-visible register file instructions of a first instruction type and of a second instruction type, wherein the single software-visible register file is to be operated as a flat register file during execution of instructions of the second instruction type and as a stack referenced register file during execution of instructions of the first instruction type.
    Type: Application
    Filed: September 13, 2004
    Publication date: February 17, 2005
    Inventors: Andrew Glew, Larry Mennemeier, Alexander Peleg, David Bistry, Millind Mittal, Carole Dulong, Eiichi Kowashi, Benny Eitan, Derrick Lin, Ramamohan Vakkalagadda
  • Publication number: 20040181649
    Abstract: A method in a computer system, one embodiment includes accessing a packed data instruction and generating a corresponding set of control bits to cause a processor to alter a top of stack to zero of a programmer visible register file, accessing a floating point instruction and generating a corresponding set of control bits that cause the processor to operate on the programmer visible register file as a stack, but accessing a transition instruction between the packed data instruction and the scalar floating point instruction and generating a corresponding set of control bits to cause the processor to alter tag data to indicate that programmer visible register file is empty. The method advantageously provides a means for clearing the packed data state at the end of blocks of packed data instructions to leave the floating point state in a clear condition for subsequent operations (e.g. floating point calculations).
    Type: Application
    Filed: March 19, 2004
    Publication date: September 16, 2004
    Inventors: David Bistry, Larry Mennemeier, Alexander D. Peleg, Carole Dulong, Elichi Kowashi, Millind Mittal, Benny Eitan
  • Patent number: 6751725
    Abstract: Methods and apparatuses to clear state for operation of a stack. According to one embodiment of the invention, a processor comprises a set of one or more storage areas and a decode unit. The set of one or more storage areas are to store a plurality of tags and a top of stack indication, where each of the plurality of tags is to indicate if a register is in an empty or non-empty state. The decode unit is to decode scalar floating point instructions and packed data instructions, where at least certain of said scalar floating point instructions specify registers in a stack referenced manner and at least certain of said packed data instructions specify registers in a non-stack referenced manner. In addition, the packed data instructions include an instruction to mark the end of blocks of the packed data instructions in programs. The processor also comprises circuitry to cause the plurality of tags to indicate the empty state responsive to execution of the instruction.
    Type: Grant
    Filed: February 16, 2001
    Date of Patent: June 15, 2004
    Assignee: Intel Corporation
    Inventors: David Bistry, Larry Mennemeier, Alexander D. Peleg, Carole Dulong, Eiichi Kowashi, Millind Mittal, Benny Eitan
  • Publication number: 20010047469
    Abstract: A method in a computer system which includes receiving a first instruction which indicates termination of execution of instructions which operate upon packed data stored in a first storage area. The first storage area is used for modifying data responsive to execution of floating point instructions. A plurality of tags is associated with the first storage area indicating that locations in the first storage area are either empty or non-empty responsive to the execution of the floating point instructions which modify data contained in the first storage area. Responsive to the receiving of the first instruction which indicates termination of execution of instructions which operate upon the packed data stored in the first storage area, the method sets only the plurality of tags to an empty state. In different embodiments, setting of the plurality of tags to a non-empty state occurs responsive to receiving a second instruction.
    Type: Application
    Filed: February 16, 2001
    Publication date: November 29, 2001
    Applicant: Intel Corporation
    Inventors: David Bistry, Larry Mennemeier, Alexander D. Peleo, Carole Dulong, Eiichi Kowashi, Millind Mittal, Benny Eitan
  • Patent number: 6266686
    Abstract: A method in a computer system which includes receiving a first instruction which indicates indicates termination of execution of instructions which operate upon packed data stored in a first storage area. The first storage area is used for modifying data responsive to execution of floating point instructions. A plurality of tags is associated with the first storage area indicating that locations in the first storage area are either empty or non-empty responsive to the execution of the floating point instructions which modify data contained in the first storage area. Responsive to the receiving of the first instruction which indicates termination of execution of instructions which operate upon the packed data stored in the first storage area, the method sets only the plurality of tags to an empty state. In different embodiments, setting of the plurality of tags to a non-empty state occurs responsive to receiving a second instruction.
    Type: Grant
    Filed: March 4, 1999
    Date of Patent: July 24, 2001
    Assignee: Intel Corporation
    Inventors: David Bistry, Larry Mennemeier, Alexander D. Peleg, Carole Dulong, Eiichi Kowashi, Millind Mittal, Benny Eitan
  • Patent number: 6128614
    Abstract: A technique for sorting packed numbers of two operands into minima or maxima operand with their indices to identify the origin of those selected values. After packing two source operands with a plurality of data elements containing numerical values, greater-than comparison operation is performed on the two operands to generate a mask. The mask is used to identify those corresponding pair of data elements of the first and second operands which need to be passed through the subsequent stages in order to generate a sorted minima or maxima. The operands are AND'ed with the mask or the complement of the mask to generate the required minima/maxima result. The same AND'ing technique is used with two other operands containing indices of the values in the first two operands. The indices identify the originating location of the sorted maxima/minima.
    Type: Grant
    Filed: February 8, 1999
    Date of Patent: October 3, 2000
    Assignee: Intel Corporation
    Inventors: Larry Mennemeier, Alexander Peleg, Carole Dulong, Millind Mittal, Benny Eitan, Eiichi Kowashi
  • Patent number: 6070237
    Abstract: A novel processor for manipulating packed data. The packed data includes a first data element D1 and a second data element D2. Each of said data elements has a predetermined number of bits. The processor comprises a decoder, a register, and a circuit. The decoder is for decoding a control signal responsive to receiving the control signal. The register is coupled to the decoder. The register is for storing the packed data. The circuit is coupled to the decoder. The circuit is for generating a first result data element R1 and a second data element R2. The circuit is further for generating R1 to represent a total number bits set in D1, and the circuit is further for generating R2 to represent a total number bits set in D2.
    Type: Grant
    Filed: March 4, 1996
    Date of Patent: May 30, 2000
    Assignee: Intel Corporation
    Inventors: Alexander Peleg, Yaakov Yaari, Millind Mittal, Larry Mennemeier, Benny Eitan
  • Patent number: 6036350
    Abstract: A technique for sorting packed signed numbers of two operands into maxima and minima operands and solving absolute differences for each pair of corresponding values of maxima and minima. After packing two source operands with a plurality of data elements containing signed values, a greater-than comparison operation is performed on each pair of corresponding numbers in the two operands to determine which is greater. An exclusive-OR mask is generated for use in swapping those values which need to be rearranged so that all maxima are in one operand and all minima are in another operand. Once the sorting of maxima and minima is complete, a packed subtraction operation is then performed by subtracting the minima from corresponding maxima to obtain absolute differences.
    Type: Grant
    Filed: May 20, 1997
    Date of Patent: March 14, 2000
    Assignee: Intel Corporation
    Inventors: Larry Mennemeier, Alexander Peleg, Carole Dulong, Millind Mittal, Benny Eitan, Eiichi Kowashi
  • Patent number: 5940859
    Abstract: A method in a computer system which includes receiving a first instruction which indicates indicates termination of execution of instructions which operate upon packed data stored in a first storage area. The first storage area is used for modifying data responsive to execution of floating point instructions. A plurality of tags is associated with the first storage area indicating that locations in the first storage area are either empty or non-empty responsive to the execution of the floating point instructions which modify data contained in the first storage area. Responsive to the receiving of the first instruction which indicates termination of execution of instructions which operate upon the packed data stored in the first storage area, the method sets only the plurality of tags to an empty state. In different embodiments, setting of the plurality of tags to a non-empty state occurs responsive to receiving a second instruction.
    Type: Grant
    Filed: December 19, 1995
    Date of Patent: August 17, 1999
    Assignee: Intel Corporation
    Inventors: David Bistry, Larry Mennemeier, Alexander D. Peleg, Carole Dulong, Eiichi Kowashi, Millind Mittal, Benny Eitan
  • Patent number: 5907842
    Abstract: A technique for sorting packed numbers of two operands into minima or maxima operand with their indices to identify the origin of those selected values. After packing two source operands with a plurality of data elements containing numerical values, greater-than comparison operation is performed on the two operands to generate a mask. The mask is used to identify those corresponding pair of data elements of the first and second operands which need to be passed through the subsequent stages in order to generate a sorted minima or maxima. The operands are AND'ed with the mask or the complement of the mask to generate the required minima/maxima result. The same AND'ing technique is used with two other operands containing indices of the values in the first two operands. The indices identify the originating location of the sorted maxima/minima.
    Type: Grant
    Filed: December 20, 1995
    Date of Patent: May 25, 1999
    Assignee: Intel Corporation
    Inventors: Larry Mennemeier, Alexander Peleg, Carole Dulong, Millind Mittal, Benny Eitan, Eiichi Kowashi
  • Patent number: 5857096
    Abstract: An apparatus (e.g. a microarchitecture of a microprocessor) comprising a plurality of tags associated with a first storage area indicating that locations in the first storage area are either empty or non-empty responsive to execution of floating point instructions which modify data contained in the first storage area. A first circuit is coupled to the plurality of tags which sets only the plurality of tags to an empty state responsive to receipt of a first instruction. The first instruction indicates termination of execution of instructions which operate upon the packed data stored in the first storage area. The apparatus further comprises a second circuit coupled to the plurality of tags for setting the plurality of tags to a non-empty state responsive to receipt of a second instruction (or instructions). The second instruction specifies an operation upon packed data stored in the first storage area.
    Type: Grant
    Filed: December 19, 1995
    Date of Patent: January 5, 1999
    Assignee: Intel Corporation
    Inventors: David Bistry, Larry Mennemeier, Alexander D. Peleg, Carole Dulong, Eiichi Kowashi, Millind Mittal, Benny Eitan
  • Patent number: 5793661
    Abstract: A method of multiplying and accumulating two sets of values in a computer system. A packed multiply add is performed on a first portion of a first set of values packed into a first source and a first portion of a second set of values packed into a second source to generate a first result. The first result is unpacked into a plurality of values (e.g. two). The plurality of values is then added together to form a resulting accumulation value.
    Type: Grant
    Filed: December 26, 1995
    Date of Patent: August 11, 1998
    Assignee: Intel Corporation
    Inventors: Carole Dulong, Larry Mennemeier, Tuan H. Bui, Eiichi Kowashi, Alexander D. Peleg, Benny Eitan, Stephen A. Fischer, Benny Maytal, Millind Mittal