Patents by Inventor Ashish Jha

Ashish Jha has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Publication number: 20180074823
    Abstract: A processor fetches a multi-register gather instruction that includes a destination operand that specifies a destination vector register, and a source operand that identifies content that indicates multiple vector registers, a first set of indexes of each of the vector registers that each identifies a source data element, and a second set of indexes of the destination vector register for each identified source element. The instruction is decoded and executed, causing, for each of the first set of indexes of each of the vector registers, the source data element that corresponds to that index of that vector register to be stored in a set of destination data elements that correspond to the second set of identified indexes of the destination vector register for that source data element.
    Type: Application
    Filed: September 19, 2017
    Publication date: March 15, 2018
    Applicant: Intel Corporation
    Inventor: Ashish Jha
  • Publication number: 20180004523
    Abstract: A processor of an aspect includes a decode unit to decode an instruction. The instruction is to explicitly specify a first architectural register and is to implicitly indicate at least a second architectural register. The second architectural register is implicitly to be at a higher register number than the first architectural register. The processor also includes an architectural register replacement unit coupled with the decode unit. The architectural register replacement unit is to replace the first architectural register with a third architectural register, and is to replace the second architectural register with a fourth architectural register. The third architectural register is to be at a lower register number than the first architectural register. The fourth architectural register is to be at a lower register number than the second architectural register. Other processors are also disclosed, as are methods and systems.
    Type: Application
    Filed: July 1, 2016
    Publication date: January 4, 2018
    Applicant: Intel Corporation
    Inventors: Mark J. Charney, Robert Valentine, Milind B. Girkar, Ashish Jha, Bret L. Toll, Elmoustapha Ould-Ahmed-Vall, Jesus Corbal San Adrian, Jason W. Brandt
  • Patent number: 9830151
    Abstract: An apparatus and method for performing vector index loads and stores. For example, one embodiment of a processor comprises: a vector index register to store a plurality of index values; a mask register to store a plurality of mask bits; a vector register to store a plurality of vector data elements loaded from memory; and vector index load logic to identify an index stored in the vector index register to be used for a load operation using an immediate value and to responsively combine the index with a base memory address to determine a memory address for the load operation, the vector index load logic to load vector data elements from the memory address to the vector register in accordance with the plurality of mask bits.
    Type: Grant
    Filed: December 23, 2014
    Date of Patent: November 28, 2017
    Assignee: Intel Corporation
    Inventors: Ashish Jha, Robert Valentine, Elmoustapha Ould-Ahmed-Vall
  • Publication number: 20170286109
    Abstract: A processor includes a decode unit to decode an instruction that is to indicate a source packed data that is to include a plurality of adjoining data elements, a number of data elements, and a destination. The processor also includes an execution unit coupled with the decode unit. The execution unit, in response to the instruction, is to store a result packed data in the destination. The result packed data is to have a plurality of lanes that are each to store a different non-overlapping set of the indicated number of adjoining data elements aligned with a least significant end of the respective lane. The different non-overlapping sets of the indicated number of the adjoining data elements in adjoining lanes of the result packed data are to be separated from one another by at least one most significant data element position of the less significant lane.
    Type: Application
    Filed: March 31, 2016
    Publication date: October 5, 2017
    Applicant: Intel Corporation
    Inventor: Ashish Jha
  • Patent number: 9766887
    Abstract: A processor fetches a multi-register gather instruction that includes a destination operand that specifies a destination vector register, and a source operand that identifies content that indicates multiple vector registers, a first set of indexes of each of the vector registers that each identifies a source data element, and a second set of indexes of the destination vector register for each identified source element. The instruction is decoded and executed, causing, for each of the first set of indexes of each of the vector registers, the source data element that corresponds to that index of that vector register to be stored in a set of destination data elements that correspond to the second set of identified indexes of the destination vector register for that source data element.
    Type: Grant
    Filed: December 23, 2011
    Date of Patent: September 19, 2017
    Assignee: Intel Corporation
    Inventor: Ashish Jha
  • Publication number: 20170192780
    Abstract: Embodiments of systems, apparatuses, and methods for getting even or odd data elements are described. For example, in some embodiments, an apparatus includes a decoder to decode an instruction, wherein the instruction is to include fields for a first source operand, a second source operand, and a destination operand; and execution circuitry to execute the decoded instruction to extract data elements from even data element positions of the first and second source operands and store the extracted data elements into the destination operand.
    Type: Application
    Filed: December 30, 2015
    Publication date: July 6, 2017
    Inventors: Robert Valentine, Elmoustapha Ould-Ahmed-Vall, Jason W. Brandt, Mark J. Charney, Ashish Jha, Milind B. Girkar, Bret L. Toll, Evgeny V. Stupachenko, Sergey Y. Ostanevich
  • Publication number: 20170192781
    Abstract: Detailed herein are systems, apparatuses, and methods for strided loads. In an embodiment, an apparatus includes a decoder to decode an instruction, wherein the instruction is to include fields for a starting source memory address operand and a starting destination register operand; and execution circuitry to execute the decoded instruction to extract data elements of a defined number of types from contiguous memory beginning at the starting source memory address and, for each type, store the extracted data elements in a packed data register dedicated to that type, beginning with the starting destination register operand.
    Type: Application
    Filed: December 30, 2015
    Publication date: July 6, 2017
    Inventors: Robert Valentine, Elmoustapha Ould-Ahmed-Vall, Jason W. Brandt, Mark J. Charney, Ashish Jha, Milind B. Girkar, Bret L. Toll, Evgeny V. Stupachenko, Sergey Y. Ostanevich
  • Publication number: 20170192782
    Abstract: Embodiments of systems, apparatuses, and methods for aggregate gather and scatter are disclosed. In some embodiments, a decoder to decode an instruction, wherein the instruction is to include fields for an index of memory address locations, an immediate, a starting destination register operand, and an identifier of additional destination registers; and execution circuitry to execute the decoded instruction to gather, from memory at locations indicated by the index of memory locations, data elements and store them in multiple destination registers in sizes dictated by the immediate, are described.
    Type: Application
    Filed: December 30, 2015
    Publication date: July 6, 2017
    Inventors: Robert Valentine, Mark J. Charney, Elmoustapha Ould-Ahmed-Vall, Ashish Jha
  • Publication number: 20170185292
    Abstract: Various systems and methods for memory management of high-performance memory are described herein. A system for managing high-performance memory comprises: a random access memory; a high-performance memory, the high-performance memory of higher performance than the random access memory; and a memory management unit to: obtain execution metrics for a plurality of blocks resident in the random access memory; select a block from the plurality of blocks based on activity of the block; move the block to the high-performance memory; and update a virtual memory mapping for the block from the random access memory to the high-performance memory.
    Type: Application
    Filed: December 23, 2015
    Publication date: June 29, 2017
    Inventors: Ashish Jha, Tulika Jha, Mingqiu Sun
  • Publication number: 20170177543
    Abstract: An Aggregate Scatter instruction is described. A processor may include a memory interface and a register to store data elements of a data structure. The data elements may be contiguously stored in a first location in a memory accessible via the memory interface. The processor may further include a decoder to decode an aggregate scatter instruction specifying a store operation for the data structure and an execution unit to contiguously store the data elements to a second storage location in the memory in response to the decoded aggregate scatter instruction. The second storage location may be identified by a starting memory address of the second storage location.
    Type: Application
    Filed: December 22, 2015
    Publication date: June 22, 2017
    Inventors: Ashish Jha, Elmoustapha Ould-Ahmed-Vall, Robert Valentine, Mark J. Charney, Milind B. Girkar
  • Publication number: 20170177362
    Abstract: A processor includes a decode unit to decode an adjoining data element pairwise swap instruction. The instruction is to indicate a source packed data that is to include pairs of adjoining data elements, and is to indicate a destination storage location. An execution unit is coupled with the packed data registers and the decode unit. The execution unit, in response to the instruction, is to store a result packed data in the destination storage location, the result packed data to include pairs of adjoining data elements. Each pair of adjoining data elements of the result packed data is to correspond to a different pair of adjoining data elements of the source packed data. The adjoining data elements in each pair of the result packed data are to have been swapped in position relative to the adjoining data elements in each corresponding pair of the source packed data.
    Type: Application
    Filed: December 22, 2015
    Publication date: June 22, 2017
    Applicant: Intel Corporation
    Inventor: Ashish Jha
  • Publication number: 20170177340
    Abstract: A processor comprises a plurality of vector registers, and an execution unit, operatively coupled to the plurality of vector registers, the execution unit comprising a logic circuit implementing a load instruction for loading, into two or more vector registers, two or more data items associated with a data structure stored in a memory, wherein each one of the two or more vector registers is to store a data item associated with a certain position number within the data structure.
    Type: Application
    Filed: December 22, 2015
    Publication date: June 22, 2017
    Inventors: Ashish Jha, Elmoustapha Ould-Ahmed-Vall, Robert Valentine, Mark J. Charney, Milind B. Girkar
  • Patent number: 9411593
    Abstract: An instruction processing apparatus of an aspect includes a plurality of operation mask registers. The apparatus also includes a decode unit to receive an operation mask consolidation instruction. The operation mask consolidation instruction is to indicate a source operation mask register, of the plurality of operation mask registers, and a destination storage location. The source operation mask register is to include a source operation mask that is to include a plurality of masked elements that are to be disposed within a plurality of unmasked elements. An execution unit is coupled with the decode unit. The execution unit, in response to the operation mask consolidation instruction, is to store a consolidated operation mask in the destination storage location. The consolidated operation mask is to include the unmasked elements from the source operation mask consolidated together. Other apparatus, methods, systems, and instructions are also disclosed.
    Type: Grant
    Filed: March 15, 2013
    Date of Patent: August 9, 2016
    Assignee: Intel Corporation
    Inventor: Ashish Jha
  • Publication number: 20160179526
    Abstract: An apparatus and method for performing vector index loads and stores. For example, one embodiment of a processor comprises: a vector index register to store a plurality of index values; a mask register to store a plurality of mask bits; a vector register to store a plurality of vector data elements loaded from memory; and vector index load logic to identify an index stored in the vector index register to be used for a load operation using an immediate value and to responsively combine the index with a base memory address to determine a memory address for the load operation, the vector index load logic to load vector data elements from the memory address to the vector register in accordance with the plurality of mask bits.
    Type: Application
    Filed: December 23, 2014
    Publication date: June 23, 2016
    Inventors: Ashish Jha, Robert Valentine, Elmoustapha Ould-Ahmed-Vall
  • Publication number: 20160179520
    Abstract: An apparatus and method for performing a variable mask-vector expand. For example, one embodiment of a processor comprises: a source mask register to store a plurality of mask bit values; an index register to store a plurality of index values each associated with a vector data element in a destination vector register and identifying a bit within the source mask register; and variable mask-vector expand logic to expand each of the mask bit values from the source mask register into the associated vector data elements using the index values from the index register, wherein all bits of a vector data element are to be set equal to the mask bit value identified by the index value associated with that vector data element.
    Type: Application
    Filed: December 23, 2014
    Publication date: June 23, 2016
    Inventors: Ashish Jha, Robert Valentine, Elmoustapha Ould-Ahmed-Vall
  • Publication number: 20160179521
    Abstract: An apparatus and method for performing a mask expand. For example, one embodiment of a processor comprises: a source mask register to store a plurality of mask values; mask expand logic to identify a first mask bit in the source mask register to be expanded using an index value and to determine a number of bit positions within a destination mask register into which the first mask bit is to be expanded using a second value, the mask expand logic to responsively copy the first mask bit to each of the determined bit positions within the destination mask register.
    Type: Application
    Filed: December 23, 2014
    Publication date: June 23, 2016
    Inventors: Ashish Jha, Elmoustapha Ould-Ahmed-Vall, Robert Valentine
  • Patent number: 9348592
    Abstract: An apparatus and method are described for fetching and storing a plurality of portions of a data stream into a plurality of registers. For example, a method according to one embodiment includes the following operations: determining a set of N vector registers into which to read N designated portions of a data stream stored in system memory; determining the system memory addresses for each of the N designated portions of the data stream; fetching the N designated portions of the data stream from the system memory at the system memory addresses; and storing the N designated portions of the data stream into the N vector registers.
    Type: Grant
    Filed: December 22, 2011
    Date of Patent: May 24, 2016
    Assignee: Intel Corporation
    Inventor: Ashish Jha
  • Publication number: 20150348213
    Abstract: A computer based system for spend analysis solution through strategies for mining spend information, the system comprises a processor unit; and a computer readable medium storing instructions executable by the processor unit comprising classification system adapted to classify spend data in accordance with pre-determined parameters of classification; categorization system adapted to categorize classified spend data based on pre-defined parameters; input system adapted to input pre-defined fields in relation to category of spend data, supplier information, payment terms, contracts or contract terms, and other pre-defined dimensions; saving strategy analysis engine adapted to analyze classified and categorized spend data based on pre-determined strategies, the saving strategy analysis engine further comprising category based saving strategy analysis engine adapted to output saving strategy per identified category inputs; and supplier based saving strategy analysis engine adapted to output saving strategy per id
    Type: Application
    Filed: July 2, 2015
    Publication date: December 3, 2015
    Inventors: Bikash Mohanty, Ashish Jha
  • Publication number: 20150332405
    Abstract: A system and method for providing an unbiased expert panel to review a requested idea is disclosed. This invention relates to a computerized, online system that facilitates the review of an idea for investment chosen by an investor. The expert panel is either selected by the investor, automatically suggested, or recommended through the administrator system based on internal data corresponding to the experts. Experts are selected based on their ability to provide the most relevant scoring and fitting analysis of the chosen idea. The investor is able to assign a percentage weight not only to the experts but also to the criteria factors of analysis, to eliminate any potential bias of one particular expert. The experts analyze the innovation or idea based on pertinent criteria in this field and subsequently recommend a monetary value or equity exchange suitable for investing in the particular idea.
    Type: Application
    Filed: May 15, 2015
    Publication date: November 19, 2015
    Inventors: Stan W. Kachnowski, Cole R. Manship, Ashish Jha, R. Keerthi Prasad
  • Publication number: 20150178752
    Abstract: Systems and methods for a spend analysis are described, and include identifying a spend category and associating a cost component model to the spend category. The cost component model indicates one or more cost contributors to the spend category. The systems and methods also include receiving market information associated with at least a subset of the one or more cost contributors in the cost component model and outputting an analysis of spend associated with the spend category in relation to the cost component model.
    Type: Application
    Filed: March 9, 2015
    Publication date: June 25, 2015
    Inventors: Sanjay Kadkol, Bikash Mohanty, Ashish Jha, Aloysius Sebastian
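
Several of the abstracts above describe data-rearrangement operations that are easier to follow with a small software model. The following is a minimal Python sketch of three of them: the adjoining data element pairwise swap (20170177362), the operation mask consolidation (9411593), and the variable mask-vector expand (20160179520). It is an illustration only, not the patented hardware or any real instruction encoding. The function names, the use of Python lists with index 0 as the least significant position, the convention that a mask bit of 1 means "unmasked", the choice to pack consolidated bits toward the least significant end, and the default element width of 32 bits are all assumptions made for this sketch.

```python
from typing import List, Sequence


def pairwise_swap(src: Sequence[int]) -> List[int]:
    """Swap the two elements of every adjoining pair (cf. 20170177362)."""
    if len(src) % 2 != 0:
        raise ValueError("source must hold an even number of elements")
    out = list(src)
    for i in range(0, len(out), 2):
        out[i], out[i + 1] = out[i + 1], out[i]
    return out


def consolidate_mask(mask: Sequence[int]) -> List[int]:
    """Gather the unmasked bits of a mask together (cf. 9411593).

    Assumptions: 1 means unmasked, index 0 is the least significant
    position, and the unmasked bits are packed toward index 0.
    """
    unmasked = sum(1 for bit in mask if bit)
    return [1] * unmasked + [0] * (len(mask) - unmasked)


def mask_vector_expand(mask: Sequence[int],
                       indexes: Sequence[int],
                       element_bits: int = 32) -> List[int]:
    """Expand selected mask bits into whole vector elements (cf. 20160179520).

    indexes[i] names the mask bit that fills destination element i;
    every bit of that element is set equal to the selected mask bit.
    The element width is an assumption of this sketch.
    """
    all_ones = (1 << element_bits) - 1
    return [all_ones if mask[i] else 0 for i in indexes]


if __name__ == "__main__":
    print(pairwise_swap([10, 11, 12, 13]))            # [11, 10, 13, 12]
    print(consolidate_mask([1, 0, 1, 0, 0, 1, 0, 0])) # [1, 1, 1, 0, 0, 0, 0, 0]
    print(mask_vector_expand([1, 0, 1, 0], [2, 1, 0, 3], element_bits=8))
    # [255, 0, 255, 0]
```

Plain lists keep the element positions explicit, which mirrors how the abstracts describe results in terms of positions within a register rather than in terms of any particular encoding.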