Patents by Inventor Doron Orenstein

Doron Orenstein has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Publication number: 20150058603
    Abstract: In one embodiment, the present invention includes logic to receive a permute instruction, first and second source operands, and control values, and to perform a permute operation based on an operation between at least two of the control values so that selected portions of the first and second source operands or a predetermined value can be stored into elements of a destination. Multiple permute instructions may be combined to perform efficient table lookups. Other embodiments are described and claimed.
    Type: Application
    Filed: November 5, 2014
    Publication date: February 26, 2015
    Inventors: Cristina Anderson, Mark Buxton, Doron Orenstein, Robert Valentine
  • Patent number: 8914613
    Abstract: In-lane vector shuffle operations are described. In one embodiment a shuffle instruction specifies a field of per-lane control bits, a source operand and a destination operand, these operands having corresponding lanes, each lane divided into corresponding portions of multiple data elements. Sets of data elements are selected from corresponding portions of every lane of the source operand according to per-lane control bits. Elements of these sets are copied to specified fields in corresponding portions of every lane of the destination operand. Another embodiment of the shuffle instruction also specifies a second source operand, all operands having corresponding lanes divided into multiple data elements. A set selected according to per-lane control bits contains data elements from every lane portion of a first source operand and data elements from every corresponding lane portion of the second source operand. Set elements are copied to specified fields in every lane of the destination operand.
    Type: Grant
    Filed: August 26, 2011
    Date of Patent: December 16, 2014
    Assignee: Intel Corporation
    Inventors: Zeev Sperber, Robert Valentine, Benny Eitan, Doron Orenstein
  • Publication number: 20140310505
    Abstract: A technique for decoding an instruction in a variable-length instruction set. In one embodiment, an instruction encoding is described, in which legacy, present, and future instruction set extensions are supported, and increased functionality is provided, without expanding the code size and, in some cases, reducing the code size.
    Type: Application
    Filed: June 17, 2014
    Publication date: October 16, 2014
    Inventors: Robert Valentine, Doron Orenstein, Brett L. Toll
  • Patent number: 8756403
    Abstract: A technique for decoding an instruction in a variable-length instruction set. In one embodiment, an instruction encoding is described, in which legacy, present, and future instruction set extensions are supported, and increased functionality is provided, without expanding the code size and, in some cases, reducing the code size.
    Type: Grant
    Filed: March 15, 2013
    Date of Patent: June 17, 2014
    Assignee: Intel Corporation
    Inventors: Robert Valentine, Doron Orenstein, Brett L. Toll
  • Publication number: 20140095847
    Abstract: A processor uses multiple banks of an extended register set to store the contexts of multiple user-level threads. A current bank register provides a pointer to the bank that is currently active. A first thread saves its context (first context) in a first bank of the extended register set and a second thread saves its context (second context) in a second bank of the extended register set. When the processor receives an instruction for exchanging contexts between the first thread and the second thread, the processor changes the pointer from the first bank to the second bank, and executes the second thread using the second context stored in the second bank.
    Type: Application
    Filed: September 28, 2012
    Publication date: April 3, 2014
    Inventor: Doron Orenstein
  • Publication number: 20130290682
    Abstract: A technique for decoding an instruction in a variable-length instruction set. In one embodiment, an instruction encoding is described, in which legacy, present, and future instruction set extensions are supported, and increased functionality is provided, without expanding the code size and, in some cases, reducing the code size.
    Type: Application
    Filed: March 15, 2013
    Publication date: October 31, 2013
    Inventors: Robert Valentine, Doron Orenstein, Brett L. Toll
  • Publication number: 20130232321
    Abstract: Receiving an instruction indicating first and second operands. Each of the operands having packed data elements that correspond in respective positions. A first subset of the data elements of the first operand and a first subset of the data elements of the second operand each corresponding to a first lane. A second subset of the data elements of the first operand and a second subset of the data elements of the second operand each corresponding to a second lane. Storing result, in response to instruction, including: (1) in first lane, only lowest order data elements from first subset of first operand interleaved with corresponding lowest order data elements from first subset of second operand; and (2) in second lane, only highest order data elements from second subset of first operand interleaved with corresponding highest order data elements from second subset of second operand.
    Type: Application
    Filed: March 15, 2013
    Publication date: September 5, 2013
    Inventors: Asaf Hargil, Doron Orenstein
  • Publication number: 20130212360
    Abstract: In-lane vector shuffle operations are described. In one embodiment a shuffle instruction specifies a field of per-lane control bits, a source operand and a destination operand, these operands having corresponding lanes, each lane divided into corresponding portions of multiple data elements. Sets of data elements are selected from corresponding portions of every lane of the source operand according to per-lane control bits. Elements of these sets are copied to specified fields in corresponding portions of every lane of the destination operand. Another embodiment of the shuffle instruction also specifies a second source operand, all operands having corresponding lanes divided into multiple data elements. A set selected according to per-lane control bits contains data elements from every lane portion of a first source operand and data elements from every corresponding lane portion of the second source operand. Set elements are copied to specified fields in every lane of the destination operand.
    Type: Application
    Filed: March 15, 2013
    Publication date: August 15, 2013
    Inventors: Zeev Sperber, Robert Valentine, Benny Eitan, Doron Orenstein
  • Patent number: 8281109
    Abstract: A technique for decoding an instruction in a variable-length instruction set. In one embodiment, an instruction encoding is described, in which legacy, present, and future instruction set extensions are supported, and increased functionality is provided, without expanding the code size and, in some cases, reducing the code size.
    Type: Grant
    Filed: December 27, 2007
    Date of Patent: October 2, 2012
    Assignee: Intel Corporation
    Inventors: Robert Valentine, Doron Orenstein, Bret Toll
  • Publication number: 20110307687
    Abstract: In-lane vector shuffle operations are described. In one embodiment a shuffle instruction specifies a field of per-lane control bits, a source operand and a destination operand, these operands having corresponding lanes, each lane divided into corresponding portions of multiple data elements. Sets of data elements are selected from corresponding portions of every lane of the source operand according to per-lane control bits. Elements of these sets are copied to specified fields in corresponding portions of every lane of the destination operand. Another embodiment of the shuffle instruction also specifies a second source operand, all operands having corresponding lanes divided into multiple data elements. A set selected according to per-lane control bits contains data elements from every lane portion of a first source operand and data elements from every corresponding lane portion of the second source operand. Set elements are copied to specified fields in every lane of the destination operand.
    Type: Application
    Filed: August 26, 2011
    Publication date: December 15, 2011
    Inventors: ZEEV SPERBER, Robert Valentine, Benny Eitan, Doron Orenstein
  • Patent number: 8078831
    Abstract: Apparatus, system and methods are provided for performing speculative data prefetching in a chip multiprocessor (CMP). Data is prefetched by a helper thread that runs on one core of the CMP while a main program runs concurrently on another core of the CMP. Data prefetched by the helper thread is provided to the helper core. For one embodiment, the data prefetched by the helper thread is pushed to the main core. It may or may not be provided to the helper core as well. A push of prefetched data to the main core may occur during a broadcast of the data to all cores of an affinity group. For at least one other embodiment, the data prefetched by a helper thread is provided, upon request from the main core, to the main core from the helper core's local cache.
    Type: Grant
    Filed: October 21, 2010
    Date of Patent: December 13, 2011
    Assignee: Intel Corporation
    Inventors: Hong Wang, Perry H. Wang, Jeffery A. Brown, Per Hammarlund, George Z. Chrysos, Doron Orenstein, Steve Shih-wei Liao, John P. Shen
  • Patent number: 8078836
    Abstract: In-lane vector shuffle operations are described. In one embodiment a shuffle instruction specifies a field of per-lane control bits, a source operand and a destination operand, these operands having corresponding lanes, each lane divided into corresponding portions of multiple data elements. Sets of data elements are selected from corresponding portions of every lane of the source operand according to per-lane control bits. Elements of these sets are copied to specified fields in corresponding portions of every lane of the destination operand. Another embodiment of the shuffle instruction also specifies a second source operand, all operands having corresponding lanes divided into multiple data elements. A set selected according to per-lane control bits contains data elements from every lane portion of a first source operand and data elements from every corresponding lane portion of the second source operand. Set elements are copied to specified fields in every lane of the destination operand.
    Type: Grant
    Filed: December 30, 2007
    Date of Patent: December 13, 2011
    Assignee: Intel Corporation
    Inventors: Zeev Sperber, Robert Valentine, Benny Eitan, Doron Orenstein
  • Publication number: 20110035555
    Abstract: Apparatus, system and methods are provided for performing speculative data prefetching in a chip multiprocessor (CMP). Data is prefetched by a helper thread that runs on one core of the CMP while a main program runs concurrently on another core of the CMP. Data prefetched by the helper thread is provided to the helper core. For one embodiment, the data prefetched by the helper thread is pushed to the main core. It may or may not be provided to the helper core as well. A push of prefetched data to the main core may occur during a broadcast of the data to all cores of an affinity group. For at least one other embodiment, the data prefetched by a helper thread is provided, upon request from the main core, to the main core from the helper core's local cache.
    Type: Application
    Filed: October 21, 2010
    Publication date: February 10, 2011
    Inventors: Hong Wang, Perry H. Wang, Jeffery A. Brown, Per Hammarlund, George Z. Chrysos, Doron Orenstein, Steve Shih-wei Liao, John P. Shen
  • Publication number: 20100332794
    Abstract: Receiving an instruction indicating first and second operands. Each of the operands having packed data elements that correspond in respective positions. A first subset of the data elements of the first operand and a first subset of the data elements of the second operand each corresponding to a first lane. A second subset of the data elements of the first operand and a second subset of the data elements of the second operand each corresponding to a second lane. Storing result, in response to instruction, including: (1) in first lane, only lowest order data elements from first subset of first operand interleaved with corresponding lowest order data elements from first subset of second operand; and (2) in second lane, only highest order data elements from second subset of first operand interleaved with corresponding highest order data elements from second subset of second operand.
    Type: Application
    Filed: June 30, 2009
    Publication date: December 30, 2010
    Inventors: Asaf Hargil, Doron Orenstein
  • Patent number: 7849465
    Abstract: Method, apparatus, and system for a programmable event driven yield mechanism that may activate other threads. The yield mechanism may allow triggering of a service thread that may execute currently with a main thread upon occurrence of an architecturally-defined condition. The service thread may be activated, in response to the condition, with limited intervention of an operating system. In one embodiment, an apparatus includes execution resources to execute a plurality of instructions and a monitor to detect an architecturally-defined condition. The apparatus may include an event handler to handle a yield event generated when the architecturally-defined condition has been detected. An architectural mechanism, including processor instructions and channel registers, may be utilized to allow user-level code to enable the yield event mechanism. Other embodiments are also described and claimed.
    Type: Grant
    Filed: May 19, 2005
    Date of Patent: December 7, 2010
    Assignee: Intel Corporation
    Inventors: Xiang Zou, Hong Wang, Scott Dion Rodgers, Darrell D. Boggs, Bryant Bigbee, Shivanandan Kaushik, Anil Aggarwal, Ittai Anati, Doron Orenstein, Per Hammarlund, John Shen, Larry O. Smith, James B. Crossland, Chris J. Newburn
  • Patent number: 7844801
    Abstract: Apparatus, system and methods are provided for performing speculative data prefetching in a chip multiprocessor (CMP). Data is prefetched by a helper thread that runs on one core of the CMP while a main program runs concurrently on another core of the CMP. Data prefetched by the helper thread is provided to the helper core. For one embodiment, the data prefetched by the helper thread is pushed to the main core. It may or may not be provided to the helper core as well. A push of prefetched data to the main core may occur during a broadcast of the data to all cores of an affinity group. For at least one other embodiment, the data prefetched by a helper thread is provided, upon request from the main core, to the main core from the helper core's local cache.
    Type: Grant
    Filed: July 31, 2003
    Date of Patent: November 30, 2010
    Assignee: Intel Corporation
    Inventors: Hong Wang, Perry H. Wang, Jeffery A. Brown, Per Hammarlund, George Z. Chrysos, Doron Orenstein, Steve Shih-wei Liao, John P. Shen
  • Patent number: 7721129
    Abstract: A clock frequency control unit for an integrated circuit (IC) includes a clock generator, a finite state machine (FSM), and a gating circuit (GC). The FSM has at least first and second states corresponding to non-low workload low workload states, respectively. In the first state, the GC provides a clock signal to functional units of the IC with the same frequency as the clock generator output. In the second state, the GC reduces the frequency of the clock signal. In one embodiment, the GC masks out selected cycles of the clock generator output to reduce the clock signal frequency. The FSM monitors the operation of the IC to transition from the first state to the second state when selected “low workload” conditions are detected (e.g., long latency cache miss). Similarly, the FSM transitions from the second state to the first state when selected “non-low workload” conditions are detected.
    Type: Grant
    Filed: January 12, 2006
    Date of Patent: May 18, 2010
    Assignee: Intel Corporation
    Inventors: Itamar S. Kazachinsky, Doron Orenstein
  • Publication number: 20090172358
    Abstract: In-lane vector shuffle operations are described. In one embodiment a shuffle instruction specifies a field of per-lane control bits, a source operand and a destination operand, these operands having corresponding lanes, each lane divided into corresponding portions of multiple data elements. Sets of data elements are selected from corresponding portions of every lane of the source operand according to per-lane control bits. Elements of these sets are copied to specified fields in corresponding portions of every lane of the destination operand. Another embodiment of the shuffle instruction also specifies a second source operand, all operands having corresponding lanes divided into multiple data elements. A set selected according to per-lane control bits contains data elements from every lane portion of a first source operand and data elements from every corresponding lane portion of the second source operand. Set elements are copied to specified fields in every lane of the destination operand.
    Type: Application
    Filed: December 30, 2007
    Publication date: July 2, 2009
    Inventors: ZEEV SPERBER, Robert Valentine, Benny Eitan, Doron Orenstein
  • Patent number: 7437581
    Abstract: A method and apparatus for changing the configuration of a multi-core processor is disclosed. In one embodiment, a throttle module (or throttle logic) may determine the amount of parallelism present in the currently-executing program, and change the execution of the threads of that program on the various cores. If the amount of parallelism is high, then the processor may be configured to run a larger amount of threads on cores configured to consume less power. If the amount of parallelism is low, then the processor may be configured to run a smaller amount of threads on cores configured for greater scalar performance.
    Type: Grant
    Filed: September 28, 2004
    Date of Patent: October 14, 2008
    Assignee: Intel Corporation
    Inventors: Edward Grochowski, John Shen, Hong Wang, Doron Orenstein, Gad S Sheaffer, Ronny Ronen, Murali M. Annavaram
  • Publication number: 20060294347
    Abstract: Method, apparatus, and system for a programmable event driven yield mechanism that may activate other threads. The yield mechanism may allow triggering of a service thread that may execute currently with a main thread upon occurrence of an architecturally-defined condition. The service thread may be activated, in response to the condition, with limited intervention of an operating system. In one embodiment, an apparatus includes execution resources to execute a plurality of instructions and a monitor to detect an architecturally-defined condition. The apparatus may include an event handler to handle a yield event generated when the architecturally-defined condition has been detected. An architectural mechanism, including processor instructions and channel registers, may be utilized to allow user-level code to enable the yield event mechanism. Other embodiments are also described and claimed.
    Type: Application
    Filed: May 19, 2005
    Publication date: December 28, 2006
    Inventors: Xiang Zou, Hong Wang, Scott Rodgers, Darrell Boggs, Bryant Bigbee, Shivanandan Kaushik, Anil Aggarwal, Ittai Anati, Doron Orenstein, Per Hammarlund, John Shen, Larry Smith, James Crossland, Chris Newburn