Patents by Inventor Victor W. Lee

Victor W. Lee has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 9076254
    Abstract: A texture unit may be used to perform general purpose mathematical computations such as dot products. This enables some general purpose computations and operations to be offloaded from a central processing unit to the texture unit. The texture unit may use linear interpolators in order to perform the dot product calculations.
    Type: Grant
    Filed: October 16, 2013
    Date of Patent: July 7, 2015
    Assignee: Intel Corporation
    Inventors: Victor W. Lee, Mikhail Smelyanskiy, Ganesh S. Dasika, Jose Gonzalez, Jatin Chhugani, Yen-Kuang Chen, Changkyu Kim, Julio Gago, Santiago Galan, Victor Moya Del Barrio
  • Publication number: 20150186077
    Abstract: A processor of an aspect includes an on-die programmable architecturally-visible storage. The processor also includes a decode unit to receive a data access instruction of an instruction set of the processor. The data access instruction to indicate a data address that is to be associated with data to be stored in the on-die programmable architecturally-visible storage, to indicate a data size associated with the data to be stored in the on-die programmable architecturally-visible storage, and to indicate a destination storage location of the processor. An execution unit is coupled with the decode unit and the on-die programmable architecturally-visible storage. The execution unit is on-die with the on-die programmable storage. The execution unit is operable, in response to the data access instruction, to store the data, which is associated with the data address and the data size, in the destination storage location that is to be indicated by the instruction.
    Type: Application
    Filed: December 27, 2013
    Publication date: July 2, 2015
    Applicant: Intel Corporation
    Inventor: Victor W. Lee
  • Patent number: 9069671
    Abstract: Methods and apparatus relating to gather or scatter operations in a multi-level cache are described. In some embodiments, a logic may determine whether to perform gather or scatter operations at a first memory or a second memory, based in part on a relative performance of performing the gather or scatter operations at the first memory and the second memory. Other embodiments are also described and claimed.
    Type: Grant
    Filed: July 21, 2014
    Date of Patent: June 30, 2015
    Assignee: Intel Corporation
    Inventors: Christopher J. Hughes, Yen-Kuang Chen, Changkyu Kim, Daehyun Kim, Victor W. Lee, Anthony-Trung D. Nguyen, Nadathur Rajagopalan Satish
  • Publication number: 20140337580
    Abstract: Methods and apparatus relating to gather or scatter operations in a multi-level cache are described. In some embodiments, a logic may determine whether to perform gather or scatter operations at a first memory or a second memory, based in part on a relative performance of performing the gather or scatter operations at the first memory and the second memory. Other embodiments are also described and claimed.
    Type: Application
    Filed: July 21, 2014
    Publication date: November 13, 2014
    Inventors: Christopher J. Hughes, Yen-Kuang Chen, Changkyu Kim, Daehyun Kim, Victor W. Lee, Anthony-Trung D. Nguyen, Nadathur Rajagopalan Satish
  • Publication number: 20140237303
    Abstract: An apparatus and method are described for detecting and responding to fault conditions in a processor. For example, one embodiment of a method comprises: reading each active element in succession from a first vector register, each active element specifying an address for a gather or load operation; detecting one or more fault conditions associated with one or more of the active elements; for each active element read in succession prior to a detected fault condition on an element other than the first active element, storing the data loaded from an address associated with the active element in a first output vector register; and for each active element associated with the detected fault condition and following the detected fault condition, setting a bit in an output mask register to indicate the detected fault condition.
    Type: Application
    Filed: December 23, 2011
    Publication date: August 21, 2014
    Inventors: Jayashankar Bharadwaj, Victor W. Lee, Daehyun Kim, Nalini Vasudevan, Tin-Fook Ngai, Albert Hartono, Sara S. Baghsorkhi
  • Publication number: 20140223139
    Abstract: Embodiments of systems, apparatuses, and methods for performing in a computer processor generation of a predicate mask based on vector comparison in response to a single instruction are described.
    Type: Application
    Filed: December 23, 2011
    Publication date: August 7, 2014
    Inventors: Victor W. Lee, Daehyun Kim, Tin-Fook Ngai, Jayashankar Bharadwaj, Albert Hartono, Sara Baghsorkhi, Nalini Vasudevan
  • Patent number: 8799577
    Abstract: Methods and apparatus relating to gather or scatter operations in a multi-level cache are described. In some embodiments, a logic may determine whether to perform gather or scatter operations at a first memory or a second memory, based in part on a relative performance of performing the gather or scatter operations at the first memory and the second memory. Other embodiments are also described and claimed.
    Type: Grant
    Filed: July 2, 2013
    Date of Patent: August 5, 2014
    Assignee: Intel Corporation
    Inventors: Christopher J. Hughes, Yen-Kuang Chen, Changkyu Kim, Daehyun Kim, Victor W. Lee, Anthony-Trung D. Nguyen, Nadathur Rajagopalan Satish
  • Publication number: 20140189323
    Abstract: An apparatus and method for propagating conditionally evaluated values. For example, a method according to one embodiment comprises: reading each value contained in an input mask register, each value being a true value or a false value and having a bit position associated therewith; for each true value read from the input mask register, generating a first result containing the bit position of the true value; for each false value read from the input mask register following the first true value, adding the vector length of the input mask register to a bit position of the last true value read from the input mask register to generate a second result; and storing each of the first results and second results in bit positions of an output register corresponding to the bit positions read from the input mask register.
    Type: Application
    Filed: December 23, 2011
    Publication date: July 3, 2014
    Inventors: Jayashankar Bharadwaj, Nalini Vasudevan, Victor W. Lee, Daehyun Kim, Albert Hartono, Sara S. Baghsorkhi
  • Publication number: 20140189288
    Abstract: A vector reduction instruction with non-unit strided access pattern is received and executed by the execution circuitry of a processor. In response to the instruction, the execution circuitry performs an associative reduction operation on data elements of a first vector register. Based on values of the mask register and a current element position being processed, the execution circuitry sequentially sets one or more data elements of the first vector register to a result, which is generated by the associative reduction operation applied to both a previous data element of the first vector register and a data element of a third vector register. The previous data element is located more than one element position away from the current element position.
    Type: Application
    Filed: December 28, 2012
    Publication date: July 3, 2014
    Inventors: Albert Hartono, Jayashankar Bharadwaj, Nalini Vasudevan, Sara S. Baghsorkhi, Victor W. Lee, Daehyun Kim
  • Publication number: 20140176590
    Abstract: A texture unit may be used to perform general purpose mathematical computations such as dot products. This enables some general purpose computations and operations to be offloaded from a central processing unit to the texture unit. The texture unit may use linear interpolators in order to perform the dot product calculations.
    Type: Application
    Filed: October 16, 2013
    Publication date: June 26, 2014
    Inventors: Victor W. Lee, Mikhail Smelyanskiy, Ganesh S. Dasika, Jose Gonzalez, Jatin Chhugani, Yen-Kuang Chen, Changkyu Kim, Julio Gago, Santiago Galan, Victor Moya Del Barrio
  • Publication number: 20140181580
    Abstract: According to one embodiment, a processor includes an instruction decoder to decode an instruction to read a plurality of data elements from memory, the instruction having a first operand specifying a storage location, a second operand specifying a bitmask having one or more bits, each bit corresponding to one of the data elements, and a third operand specifying a memory address storing a plurality of data elements. The processor further includes an execution unit coupled to the instruction decoder, in response to the instruction, to read one or more data elements speculatively, based on the bitmask specified by the second operand, from a memory location based on the memory address indicated by the third operand, and to store the one or more data elements in the storage location indicated by the first operand.
    Type: Application
    Filed: December 21, 2012
    Publication date: June 26, 2014
    Inventors: Jayashankar Bharadwaj, Nalini Vasudevan, Victor W. Lee, Sara S. Baghsorkhi, Albert Hartono, Daehyun Kim
  • Publication number: 20140149718
    Abstract: Instructions and logic provide pushing buffer copy and store functionality. Some embodiments include a first hardware thread or processing core, and a second hardware thread or processing core, a cache to store cache coherent data in a cache line for a shared memory address accessible by the second hardware thread or processing core. Responsive to decoding an instruction specifying a source data operand, said shared memory address as a destination operand, and one or more owners of said shared memory address, one or more execution units copy data from the source data operand to the cache coherent data in the cache line for said shared memory address accessible by said second hardware thread or processing core in the cache when said one or more owners include said second hardware thread or processing core.
    Type: Application
    Filed: November 28, 2012
    Publication date: May 29, 2014
    Inventors: Christopher J. Hughes, Changkyu Kim, Daehyun Kim, Victor W. Lee, Jong Soo Park
  • Publication number: 20140096119
    Abstract: Loop vectorization methods and apparatus are disclosed. An example method includes setting a dynamic adjustment value of a vectorization loop; executing the vectorization loop to vectorize a loop by grouping iterations of the loop into one or more vectors; identifying a dependency between iterations of the loop; and setting the dynamic adjustment value based on the identified dependency.
    Type: Application
    Filed: September 28, 2012
    Publication date: April 3, 2014
    Inventors: Nalini Vasudevan, Jayashankar Bharadwaj, Christopher J. Hughes, Milind B. Girkar, Mark J. Charney, Robert Valentine, Victor W. Lee, Daehyun Kim, Albert Hartono, Sara S. Baghsorkhi
  • Patent number: 8688957
    Abstract: A system and method are configured to detect conflicts when converting scalar processes to parallel processes (“SIMDifying”). Conflicts may be detected for an unordered single index, an ordered single index and/or ordered pairs of indices. Conflicts may be further detected for read-after-write dependencies. Conflict detection is configured to identify operations (i.e., iterations) in a sequence of iterations that may not be done in parallel.
    Type: Grant
    Filed: December 21, 2010
    Date of Patent: April 1, 2014
    Assignee: Intel Corporation
    Inventors: Mikhail Smelyanskiy, Yen-Kuang Chen, Daehyun Kim, Christopher J. Hughes, Victor W. Lee
  • Publication number: 20140089634
    Abstract: An apparatus, system and method are described for identifying identical elements in a vector register.
    Type: Application
    Filed: December 23, 2011
    Publication date: March 27, 2014
    Inventors: Victor W. Lee, Daehyun Kim, Tin-Fook Ngai, Jayashankar Bharadwaj, Albert Hartono, Sara S. Baghsorkhi, Nalini Vasudevan
  • Publication number: 20130332701
    Abstract: An apparatus and method are described for selecting elements to be used in a vector computation. For example, a method according to one embodiment includes the following operations: specifying whether to identify the first, last or next after last active element of an input mask register using an immediate value; identifying the first, last or next after last active element in the input mask register according to the immediate value; reading a value from an input vector register corresponding to the identified first, last or next after last active element in the input mask register; and writing the value to an output vector register.
    Type: Application
    Filed: December 23, 2011
    Publication date: December 12, 2013
    Inventors: Jayashankar Bharadwaj, Nalini Vasudevan, Victor W. Lee, Daehyun Kim, Albert Hartono, Sara S. Baghsorkhi
  • Publication number: 20130311530
    Abstract: An apparatus and method are described for performing a vector reduction. For example, an apparatus according to one embodiment comprises: a reduction logic tree comprised of a set of N-1 reduction logic blocks used to perform reduction in a single operation cycle for N vector elements; a first input vector register storing a first input vector communicatively coupled to the set of reduction logic blocks; a second input vector register storing a second input vector communicatively coupled to the set of reduction logic blocks; a mask register storing a mask value controlling a set of one or more multiplexers, each of the set of multiplexers selecting a value directly from the first input vector register or an output containing a processed value from one of the reduction logic blocks; and an output vector register coupled to outputs of the one or more multiplexers to receive values passed through by each of the multiplexers responsive to the control signals.
    Type: Application
    Filed: March 30, 2012
    Publication date: November 21, 2013
    Inventors: Victor W. Lee, Jayashankar Bharadwaj, Daehyun Kim, Nalini Vasudevan, Tin-Fook Ngai, Albert Hartono, Sara Baghsorkhi
  • Publication number: 20130297878
    Abstract: Methods and apparatus relating to gather or scatter operations in a multi-level cache are described. In some embodiments, a logic may determine whether to perform gather or scatter operations at a first memory or a second memory, based in part on a relative performance of performing the gather or scatter operations at the first memory and the second memory. Other embodiments are also described and claimed.
    Type: Application
    Filed: July 2, 2013
    Publication date: November 7, 2013
    Inventors: Christopher J. Hughes, Yen-Kuang Chen, Changkyu Kim, Daehyun Kim, Victor W. Lee, Anthony-Trung D. Nguyen, Nadathur Rajagopalan Satish
  • Patent number: 8570336
    Abstract: A texture unit may be used to perform general purpose mathematical computations such as dot products. This enables some general purpose computations and operations to be offloaded from a central processing unit to the texture unit. The texture unit may use linear interpolators in order to perform the dot product calculations.
    Type: Grant
    Filed: December 8, 2009
    Date of Patent: October 29, 2013
    Assignee: Intel Corporation
    Inventors: Victor W. Lee, Mikhail Smelyanskiy, Ganesh S. Dasika, Jose Gonzalez, Jatin Chhugani, Yen-Kuang Chen, Changkyu Kim, Julio Gago, Santiago Galan, Victor Moya Del Barrio
  • Patent number: 8495464
    Abstract: Methods and apparatuses for error correction. An N-bit block of data to be stored in a memory device is received. The memory device does not perform any error correction code (ECC) algorithm nor provide designated error correction code storage for the N-bit block of data. Data compression is applied to the N-bit block of data to generate an M-bit compressed block of data. A K-bit ECC is computed for the M-bit compressed data, wherein M+K is less than or equal to N. The M-bit compressed data and the K-bit ECC are stored together in the memory device.
    Type: Grant
    Filed: June 28, 2010
    Date of Patent: July 23, 2013
    Assignee: Intel Corporation
    Inventors: Henry Stracovsky, Michael Espig, Victor W. Lee, Daehyun Kim
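
The size condition in the abstract for patent 8495464 (M + K must not exceed N) can be illustrated with a minimal Python sketch. This is not the patented method. The block size, the use of zlib for compression, and the names pack_block and unpack_block are all assumptions made for illustration, and CRC-32 stands in for a real ECC: it only detects errors and cannot correct them.

```python
import zlib

BLOCK_BYTES = 64   # N = 512 bits; an assumed block size, not from the patent
CHECK_BYTES = 4    # K = 32 bits; CRC-32 is a stand-in for a real ECC


def pack_block(block: bytes):
    """Compress a block and append check bits if both fit in the original size.

    Returns the M compressed bytes followed by the K check bytes,
    or None when M + K would exceed N.
    """
    if len(block) != BLOCK_BYTES:
        raise ValueError("block must be exactly BLOCK_BYTES long")
    compressed = zlib.compress(block, 9)
    if len(compressed) + CHECK_BYTES > BLOCK_BYTES:
        return None  # not compressible enough to make room for check bits
    check = zlib.crc32(compressed).to_bytes(CHECK_BYTES, "big")
    return compressed + check


def unpack_block(stored: bytes) -> bytes:
    """Verify the check bits, then decompress back to the original block."""
    compressed, check = stored[:-CHECK_BYTES], stored[-CHECK_BYTES:]
    if zlib.crc32(compressed).to_bytes(CHECK_BYTES, "big") != check:
        raise ValueError("check bits do not match stored data")
    return zlib.decompress(compressed)
```

A highly compressible block (for example, 64 zero bytes) leaves room for the check bits and round-trips through unpack_block. A block with little redundancy makes pack_block return None, which is the case where this scheme cannot add protection.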