Patents by Inventor Yuval Yosef
Yuval Yosef has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Patent number: 10664284Abstract: An apparatus and method are described for executing both latency-optimized execution logic and throughput-optimized execution logic on a processing device. For example, a processor according to one embodiment comprises: latency-optimized execution logic to execute a first type of program code; throughput-optimized execution logic to execute a second type of program code, wherein the first type of program code and the second type of program code are designed for the same instruction set architecture; logic to identify the first type of program code and the second type of program code within a process and to distribute the first type of program code for execution on the latency-optimized execution logic and the second type of program code for execution on the throughput-optimized execution logic.Type: GrantFiled: February 28, 2019Date of Patent: May 26, 2020Assignee: Intel CorporationInventors: Oren Ben-Kiki, Yuval Yosef, Ilan Pardo, Dror Markovich
-
Patent number: 10346195Abstract: A processor is described having logic circuitry of a general purpose CPU core to save multiple copies of context of a thread of the general purpose CPU core to prepare multiple micro-threads of a multi-threaded accelerator for execution to accelerate operations for the thread through parallel execution of the micro-threads.Type: GrantFiled: December 29, 2012Date of Patent: July 9, 2019Assignee: Intel CorporationInventors: Oren Ben-Kiki, Ilan Pardo, Eliezer Weissmann, Robert Valentine, Yuval Yosef
-
Publication number: 20190196838Abstract: An apparatus and method are described for executing both latency-optimized execution logic and throughput-optimized execution logic on a processing device. For example, a processor according to one embodiment comprises: latency-optimized execution logic to execute a first type of program code; throughput-optimized execution logic to execute a second type of program code, wherein the first type of program code and the second type of program code are designed for the same instruction set architecture; logic to identify the first type of program code and the second type of program code within a process and to distribute the first type of program code for execution on the latency-optimized execution logic and the second type of program code for execution on the throughput-optimized execution logic.Type: ApplicationFiled: February 28, 2019Publication date: June 27, 2019Inventors: Oren Ben-Kiki, Yuval Yosef, Ilan Pardo, Dror Markovich
-
Publication number: 20190171462Abstract: A processor having one or more processing cores is described. Each of the one or more processing cores has front end logic circuitry and a plurality of processing units. The front end logic circuitry is to fetch respective instructions of threads and decode the instructions into respective micro-code and input operand and resultant addresses of the instructions. Each of the plurality of processing units is to be assigned at least one of the threads, is coupled to said front end unit, and has a respective buffer to receive and store microcode of its assigned at least one of the threads.Type: ApplicationFiled: November 26, 2018Publication date: June 6, 2019Inventors: ILAN PARDO, DROR MARKOVICH, OREN BEN-KIKI, YUVAL YOSEF
-
Patent number: 10255077Abstract: An apparatus and method are described for executing both latency-optimized execution logic and throughput-optimized execution logic on a processing device. For example, a processor according to one embodiment comprises: latency-optimized execution logic to execute a first type of program code; throughput-optimized execution logic to execute a second type of program code, wherein the first type of program code and the second type of program code are designed for the same instruction set architecture; logic to identify the first type of program code and the second type of program code within a process and to distribute the first type of program code for execution on the latency-optimized execution logic and the second type of program code for execution on the throughput-optimized execution logic.Type: GrantFiled: August 2, 2016Date of Patent: April 9, 2019Assignee: Intel CorporationInventors: Oren Ben-Kiki, Yuval Yosef, Ilan Pardo, Dror Markovich
-
Patent number: 10185566Abstract: In one embodiment, the present invention includes a multicore processor having first and second cores to independently execute instructions, the first core visible to an operating system (OS) and the second core transparent to the OS and heterogeneous from the first core. A task controller, which may be included in or coupled to the multicore processor, can cause dynamic migration of a first process scheduled by the OS to the first core to the second core transparently to the OS. Other embodiments are described and claimed.Type: GrantFiled: April 27, 2012Date of Patent: January 22, 2019Assignee: Intel CorporationInventors: Alon Naveh, Yuval Yosef, Eliezer Weissmann, Anil Aggarwal, Efraim Rotem, Avi Mendelson, Ronny Ronen, Boris Ginzburg, Michael Mishaeli, Scott D. Hahn, David A. Koufaty, Ganapati Srinivasa, Guy Therien
-
Patent number: 10140129Abstract: A processor having one or more processing cores is described. Each of the one or more processing cores has front end logic circuitry and a plurality of processing units. The front end logic circuitry is to fetch respective instructions of threads and decode the instructions into respective micro-code and input operand and resultant addresses of the instructions. Each of the plurality of processing units is to be assigned at least one of the threads, is coupled to said front end unit, and has a respective buffer to receive and store microcode of its assigned at least one of the threads.Type: GrantFiled: December 28, 2012Date of Patent: November 27, 2018Assignee: Intel CorporationInventors: Ilan Pardo, Dror Markovich, Oren Ben-Kiki, Yuval Yosef
-
Patent number: 10101999Abstract: A semiconductor chip is described having a load collision detection circuit comprising a first bloom filter circuit. The semiconductor chip has a store collision detection circuit comprising a second bloom filter circuit. The semiconductor chip has one or more processing units capable of executing ordered parallel threads coupled to the load collision detection circuit and the store collision detection circuit. The load collision detection circuit and the store collision detection circuit is to detect younger stores for load operations of said threads and younger loads for store operations of said threads.Type: GrantFiled: January 10, 2017Date of Patent: October 16, 2018Assignee: intel corporationInventors: Enrique De Lucas, Pedro Marcuello, Oren Ben-Kiki, Ilan Pardo, Yuval Yosef
-
Patent number: 10095521Abstract: An apparatus and method are described for providing low-latency invocation of accelerators. For example, a processor according to one embodiment comprises execution logic to execute a plurality of instructions including an accelerator invocation instruction to invoke one or more accelerator commands. The accelerator invocation instruction stores command data specifying the command within a command register. One or more accelerators read the command data from the command register and responsively attempt to execute the command identified by the command data. Upon a switch from a first context to a second context, an accelerator context save/restore pointer identifies a region within system memory where the accelerator is to save its state and later the accelerator context save/restore pointer aids in restoring its state upon returning to the first context.Type: GrantFiled: May 3, 2016Date of Patent: October 9, 2018Assignee: Intel CorporationInventors: Oren Ben-Kiki, Ilan Pardo, Robert Valentine, Eliezer Weissmann, Dror Markovich, Yuval Yosef
-
Patent number: 10089113Abstract: An apparatus and method are described for providing low-latency invocation of accelerators. For example, a system according to one embodiment comprises: a processor includes a plurality of simultaneous multithreading (SMT) cores, at least one shared cache circuit to be shared among two or more of the SMT cores; and at least one of the SMT cores including at least one level 2 (L2) cache circuit to store both instructions and data and communicatively coupled to the instruction cache circuit and the data cache circuit, a communication interconnect circuit including a peripheral component interconnect express (PCIe) circuit to communicatively couple one or more of the SMT cores to an accelerator device and a memory access circuit to identify an accelerator context save/restore region in a memory responsive to a context save/restore value, the accelerator context save/restore region to share an accelerator context state.Type: GrantFiled: September 30, 2016Date of Patent: October 2, 2018Assignee: INTEL CORPORATIONInventors: Oren Ben-Kiki, Ilan Pardo, Robert Valentine, Eliezer Weissmann, Dror Markovich, Yuval Yosef
-
Patent number: 10083037Abstract: An apparatus and method are described for providing low-latency invocation of accelerators. For example, a processor according to one embodiment comprises: a plurality of simultaneous multithreading (SMT) cores, at least one shared cache circuit to be shared among the SMT cores, and at least one L2 cache circuit to store both instructions and data. The processor further comprises a communication interconnect circuit including a PCIe circuit to communicatively couple one or more of the SMT cores to an accelerator device, the PCIe circuit to provide the accelerator device access to resources of the processor including the at least one shared cache circuit. The processor further comprises a memory access circuit to identify an accelerator context save/restore region in a memory determined by an accelerator context save/restore value, the accelerator context save/restore region to store an accelerator context state.Type: GrantFiled: September 30, 2016Date of Patent: September 25, 2018Assignee: INTEL CORPORATIONInventors: Oren Ben-Kiki, Ilan Pardo, Robert Valentine, Eliezer Weissmann, Dror Markovich, Yuval Yosef
-
Patent number: 9720730Abstract: In one embodiment, the present invention includes a multicore processor with first and second groups of cores. The second group can be of a different instruction set architecture (ISA) than the first group or of the same ISA set but having different power and performance support level, and is transparent to an operating system (OS). The processor further includes a migration unit that handles migration requests for a number of different scenarios and causes a context switch to dynamically migrate a process from the second core to a first core of the first group. This dynamic hardware-based context switch can be transparent to the OS. Other embodiments are described and claimed.Type: GrantFiled: December 30, 2011Date of Patent: August 1, 2017Assignee: Intel CorporationInventors: Boris Ginzburg, Ilya Osadchiy, Ronny Ronen, Eliezer Weissmann, Michael Mishaeli, Alon Naveh, David A. Koufaty, Scott D. Hahn, Tong Li, Avi Mendleson, Eugene Gorbatov, Hisham Abu-Salah, Dheeraj R. Subbareddy, Paolo Narvaez, Aamer Jaleel, Efraim Rotem, Yuval Yosef, Anil Aggarwal, Kenzo Van Craeynest
-
Publication number: 20170147344Abstract: A semiconductor chip is described having a load collision detection circuit comprising a first bloom filter circuit. The semiconductor chip has a store collision detection circuit comprising a second bloom filter circuit. The semiconductor chip has one or more processing units capable of executing ordered parallel threads coupled to the load collision detection circuit and the store collision detection circuit. The load collision detection circuit and the store collision detection circuit is to detect younger stores for load operations of said threads and younger loads for store operations of said threads.Type: ApplicationFiled: January 10, 2017Publication date: May 25, 2017Inventors: ENRIQUE DE LUCAS, PEDRO MARCUELLO, OREN BEN-KIKI, ILAN PARDO, YUVAL YOSEF
-
Patent number: 9588771Abstract: In one embodiment, the present invention includes a method for directly communicating between an accelerator and an instruction sequencer coupled thereto, where the accelerator is a heterogeneous resource with respect to the instruction sequencer. An interface may be used to provide the communication between these resources. Via such a communication mechanism a user-level application may directly communicate with the accelerator without operating system support. Further, the instruction sequencer and the accelerator may perform operations in parallel. Other embodiments are described and claimed.Type: GrantFiled: March 8, 2013Date of Patent: March 7, 2017Assignee: Intel CorporationInventors: Hong Wang, John Shen, Hong Jiang, Richard Hankins, Per Hammarlund, Dion Rodgers, Gautham Chinya, Baiju Patel, Shiv Kaushik, Bryant Bigbee, Gad Sheaffer, Yoav Talgam, Yuval Yosef, James P. Held
-
Publication number: 20170017492Abstract: An apparatus and method are described for providing low-latency invocation of accelerators.Type: ApplicationFiled: September 30, 2016Publication date: January 19, 2017Inventors: Oren Ben-Kiki, ILAN PARDO, Robert Valentine, Eliezer Weissmann, Dror Markovich, Yuval Yosef
-
Publication number: 20170017491Abstract: An apparatus and method are described for providing low-latency invocation of accelerators.Type: ApplicationFiled: September 30, 2016Publication date: January 19, 2017Inventors: Oren Ben-Kiki, ILAN PARDO, Robert Valentine, Eliezer Weissmann, Dror Markovich, Yuval Yosef
-
Patent number: 9542193Abstract: A semiconductor chip is described having a load collision detection circuit comprising a first bloom filter circuit. The semiconductor chip has a store collision detection circuit comprising a second bloom filter circuit. The semiconductor chip has one or more processing units capable of executing ordered parallel threads coupled to the load collision detection circuit and the store collision detection circuit. The load collision detection circuit and the store collision detection circuit is to detect younger stores for load operations of said threads and younger loads for store operations of said threads.Type: GrantFiled: December 28, 2012Date of Patent: January 10, 2017Assignee: Intel CorporationInventors: Enrique De Lucas, Pedro Marcuello, Oren Ben-Kiki, Ilan Pardo, Yuval Yosef
-
Publication number: 20160342419Abstract: An apparatus and method are described for executing both latency-optimized execution logic and throughput-optimized execution logic on a processing device. For example, a processor according to one embodiment comprises: latency-optimized execution logic to execute a first type of program code; throughput-optimized execution logic to execute a second type of program code, wherein the first type of program code and the second type of program code are designed for the same instruction set architecture; logic to identify the first type of program code and the second type of program code within a process and to distribute the first type of program code for execution on the latency-optimized execution logic and the second type of program code for execution on the throughput-optimized execution logic.Type: ApplicationFiled: August 2, 2016Publication date: November 24, 2016Inventors: Ben Oren-Kiki, Yuval Yosef, Ilan Pardo, Dror Markovich
-
Patent number: 9459874Abstract: In one embodiment, the present invention includes a method for directly communicating between an accelerator and an instruction sequencer coupled thereto, where the accelerator is a heterogeneous resource with respect to the instruction sequencer. An interface may be used to provide the communication between these resources. Via such a communication mechanism a user-level application may directly communicate with the accelerator without operating system support. Further, the instruction sequencer and the accelerator may perform operations in parallel. Other embodiments are described and claimed.Type: GrantFiled: November 14, 2014Date of Patent: October 4, 2016Assignee: Intel CorporationInventors: Hong Wang, John Shen, Hong Jiang, Richard Hankins, Per Hammarlund, Dion Rodgers, Gautham Chinya, Baiju Patel, Shiv Kaushik, Bryant Bigbee, Gad Sheaffer, Yoav Talgam, Yuval Yosef, James P. Held
-
Publication number: 20160246597Abstract: An apparatus and method are described for providing low-latency invocation of accelerators.Type: ApplicationFiled: May 3, 2016Publication date: August 25, 2016Inventors: Oren Ben-Kiki, llan Pardo, Robert Valentine, Eliezer Weissmann, Dror Markovich, Yuval Yosef