Patents by Inventor Changwon RHEE

Changwon RHEE has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

BROADCAST ASYNCHRONOUS LOADS TO SHARED LOCAL MEMORY

Publication number: 20240134797

Abstract: Embodiments described herein provide a technique to facilitate the broadcast or multicast of asynchronous loads to shared local memory of a plurality of graphics cores within a graphics core cluster. One embodiment provides a graphics processor including a cache memory a graphics core cluster coupled with the cache memory. The graphics core cluster includes a plurality of graphics cores. The plurality of graphics cores includes a graphics core configured to receive a designation as a producer graphics core for a multicast load, read data from the cache memory; and transmit the data read from the cache memory to a consumer graphics core of the plurality of graphics cores.

Type: Application

Filed: October 24, 2022

Publication date: April 25, 2024

Applicant: Intel Corporation

Inventors: John A. Wiegert, Joydeep Ray, Vasanth Ranganathan, Biju George, Fangwen Fu, Abhishek R. Appu, Chunhui Mei, Changwon Rhee
HARDWARE ENHANCEMENTS FOR DOUBLE PRECISION SYSTOLIC SUPPORT

Publication number: 20240111826

Abstract: An apparatus to facilitate hardware enhancements for double precision systolic support is disclosed. The apparatus includes matrix acceleration hardware having double-precision (DP) matrix multiplication circuitry including a multiplier circuits to multiply pairs of input source operands in a DP floating-point format; adders to receive multiplier outputs from the multiplier circuits and accumulate the multiplier outputs in a high precision intermediate format; an accumulator circuit to accumulate adder outputs from the adders with at least one of a third global source operand on a first pass of the DP matrix multiplication circuitry or an intermediate result from the first pass on a second pass of the DP matrix multiplication circuitry, wherein the accumulator circuit to generate an accumulator output in the high precision intermediate format; and a down conversion and rounding circuit to down convert and round an output of the second pass as final result in the DP floating-point format.

Type: Application

Filed: September 30, 2022

Publication date: April 4, 2024

Applicant: Intel Corporation

Inventors: Jiasheng Chen, Kevin Hurd, Changwon Rhee, Jorge Parra, Fangwen Fu, Theo Drane, William Zorn, Peter Caday, Gregory Henry, Guei-Yuan Lueh, Farzad Chehrazi, Amit Karande, Turbo Majumder, Xinmin Tian, Milind Girkar, Hong Jiang
SINGLE PRECISION SUPPORT FOR SYSTOLIC PIPELINE IN A GRAPHICS ENVIRONMENT

Publication number: 20240111825

Abstract: An apparatus to facilitate single precision support for systolic pipeline in a graphics environment is disclosed. The apparatus includes a processor comprising systolic array hardware including a plurality of data processing units, wherein the systolic array hardware is to: receive data for performance of a matrix multiplication operation in a first precision format; convert an original value of the data into two split values with a second precision format having a lower precision than the first precision format; perform the matrix multiplication operation using the two split values in the second precision format, the matrix multiplication operation comprising a split-term operation that utilizes two passes through the systolic array hardware with feedback wiring and local reduction; and generate an emulated result for the matrix multiplication operation in the first precision format.

Type: Application

Filed: September 30, 2022

Publication date: April 4, 2024

Applicant: Intel Corporation

Inventors: Jiasheng Chen, Changwon Rhee, Kevin Hurd, Gregory Henry, Peter Caday, Kristopher Wong
SUPPORTING VECTOR MULTIPLY ADD WITH DOUBLE ACCUMULATOR ACCESS IN A GRAPHICS ENVIRONMENT

Publication number: 20240103810

Abstract: An apparatus to facilitate supporting vector multiply add with double accumulator access in a graphics environment is disclosed. The apparatus includes a processor comprising processing resources, the processing resources comprising multiplier circuitry to: receive operands for a matrix multiplication operation, wherein the operands comprising two source matrices to be multiplied as part of the matrix multiplication operation; and issue a multiply and add vector (MADV) instruction for the multiplication operation utilizing a double accumulator access output, wherein the MADV instruction to multiply two vectors of the two source matrices in a single floating point (FP) pipeline of the processor.

Type: Application

Filed: September 27, 2022

Publication date: March 28, 2024

Applicant: Intel Corporation

Inventors: Jiasheng Chen, Supratim Pal, Changwon Rhee, Hong Jiang, Kevin Hurd, Shuai Mu
EMULATION OF FLOATING POINT CALCULATION

Publication number: 20230086275

Abstract: Emulating floating point calculation using lower precision format calculations is described. An example of a processor includes a floating point unit (FPU) to provide a native floating point operation in a first precision format; and systolic array hardware including multiple data processing units, wherein the processor is to receive data for performance of a matrix multiplication operation in the first precision format; enable an emulated floating point multiplication operation using one or more values with a second precision format, the second precision format having a lower precision than the first precision format, the emulated floating point multiplication including operation of the systolic array hardware; and generate an emulated result for the matrix multiplication operation.

Type: Application

Filed: September 22, 2021

Publication date: March 23, 2023

Applicant: Intel Corporation

Inventors: Jiasheng Chen, Changwon Rhee, Sabareesh Ganapathy, Gregory Henry, Fangwen Fu
FUSED INSTRUCTION TO ACCELERATE PERFORMANCE OF SECURE HASH ALGORITHM 2 (SHA-2) WORKLOADS IN A GRAPHICS ENVIRONMENT

Publication number: 20220416999

Abstract: An apparatus to facilitate a fused instruction to accelerate performance of secure hash algorithm 2 (SHA-2) in a graphics environment is disclosed. The apparatus includes a processor comprising processing resources, the processing resources comprising execution circuitry to receive a fused SHA instruction identifying a length corresponding to a data size of the fused SHA instruction and a functional control identifying an operation type of the fused SHA instruction; based on decoding the fused SHA instruction, cause a sub-function identified by the length and the function control to be scheduled to an integer pipeline of the execution resource; and execute the sub-function of the fused SHA instruction in an integer pipeline of the execution circuitry, the sub-function to perform merged operations on a source operand of the fused SHA instruction, the merged operations comprising a rotate operation, a shift operation, and an xor operation.

Type: Application

Filed: June 25, 2021

Publication date: December 29, 2022

Applicant: Intel Corporation

Inventors: Supratim Pal, Wajdi Feghali, Changwon Rhee, Wei-Yu Chen, Timothy R. Bauer, Alexander Lyashevsky
LARGE INTEGER MULTIPLICATION ENHANCEMENTS FOR GRAPHICS ENVIRONMENT

Publication number: 20220413848

Abstract: An apparatus to facilitate large integer multiplication enhancements in a graphics environment is disclosed. The apparatus includes a processor comprising processing resources, the processing resources comprising multiplier circuitry to: receive operands for a multiplication operation, wherein the multiplication operation is part of a chain of multiplication operations for a large integer multiplication; and issue a multiply and add (MAD) instruction for the multiplication operation utilizing at least one of a double precision multiplier or a 48 bit output, wherein the MAD instruction to generate an output in a single clock cycle of the processor.

Type: Application

Filed: June 25, 2021

Publication date: December 29, 2022

Applicant: Intel Corporation

Inventors: Supratim Pal, Li-An Tang, Changwon Rhee, Timothy R. Bauer, Alexander Lyashevsky, Jiasheng Chen
64-BIT TWO-DIMENSIONAL BLOCK LOAD WITH TRANSPOSE

Publication number: 20220413854

Abstract: An apparatus to facilitate 64-bit two-dimensional (2D) block load with transpose is disclosed. The apparatus includes a processor comprising processing resources; and load store pipeline hardware circuitry coupled to the processing resources, the load store pipeline hardware circuitry to receive a 64-bit two-dimensional (2D) block load message with transpose from the processing resources. The load store pipeline hardware circuitry comprising a load store pipeline sequencer to map rows of a block of memory corresponding to the 64-bit 2D block load message with transpose to 64-bit standard load messages; and load store pipeline return circuitry to: sequentially number general register files (GRFs) used for returning elements of the block of memory accessed by the 64-bit standard load messages to the processing resources; and return, to the processing resources, the sequentially numbered GRFs in response to the 64-bit 2D block load message with transpose.

Type: Application

Filed: June 25, 2021

Publication date: December 29, 2022

Applicant: Intel Corporation

Inventors: Joydeep Ray, Supratim Pal, Prathamesh Raghunath Shinde, Ben J. Ashbaugh, Changwon Rhee, Hong Jiang, FangWen Fu
METHODS AND APPARATUSES FOR DYNAMICALLY CHANGING DATA PRIORITY IN A CACHE

Publication number: 20220414010

Abstract: Embodiments are generally directed to methods and apparatuses for dynamically changing data priority in a cache. An embodiment of an apparatus comprising: a priority controller to: receive a memory access request to request data; and set a priority flag for the memory access request based on an accumulated access amount of data stored in a memory block to be accessed by the memory access request to dynamically change a priority level of the requested data.

Type: Application

Filed: March 25, 2022

Publication date: December 29, 2022

Applicant: Intel Corporation

Inventors: Xiaodong Qiu, Yong Jiang, Changwon Rhee, Cui Tang, Shuangpeng Zhou, Lei Chen, Danyu Bi, Peiqing Jiang, Chengxi Wu
Low power video composition using a stream out buffer

Patent number: 10484640

Abstract: Techniques related to compositing video content are discussed. Such techniques may include generating transparency data for a surface of first video content and storing it in a stream out buffer, accessing the transparency data via the stream out buffer when there is no change to the surface of the first video content, and compositing the first video content with second video content based on the accessed transparency data.

Type: Grant

Filed: June 3, 2015

Date of Patent: November 19, 2019

Assignee: Intel Corporation

Inventors: Yunbiao Lin, Changliang Wang, Tiehan Lu, Qingyuan Zhang, Li Xu, Bo Zhao, Changwon Rhee
LOW POWER VIDEO COMPOSITION USING A STREAM OUT BUFFER

Publication number: 20180288353

Abstract: Techniques related to compositing video content are discussed. Such techniques may include generating transparency data for a surface of first video content and storing it in a stream out buffer, accessing the transparency data via the stream out buffer when there is no change to the surface of the first video content, and compositing the first video content with second video content based on the accessed transparency data.

Type: Application

Filed: June 3, 2015

Publication date: October 4, 2018

Inventors: Yunbiao LIN, Changliang WANG, Tiehan LU, Qingyuan ZHANG, Li XU, Bo ZHAO, Changwon RHEE