Patents by Inventor Biju George

Biju George has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

BROADCAST ASYNCHRONOUS LOADS TO SHARED LOCAL MEMORY

Publication number: 20240134797

Abstract: Embodiments described herein provide a technique to facilitate the broadcast or multicast of asynchronous loads to shared local memory of a plurality of graphics cores within a graphics core cluster. One embodiment provides a graphics processor including a cache memory and a graphics core cluster coupled with the cache memory. The graphics core cluster includes a plurality of graphics cores. The plurality of graphics cores includes a graphics core configured to receive a designation as a producer graphics core for a multicast load, read data from the cache memory, and transmit the data read from the cache memory to a consumer graphics core of the plurality of graphics cores.

Type: Application

Filed: October 24, 2022

Publication date: April 25, 2024

Applicant: Intel Corporation

Inventors: John A. Wiegert, Joydeep Ray, Vasanth Ranganathan, Biju George, Fangwen Fu, Abhishek R. Appu, Chunhui Mei, Changwon Rhee
DETERMINISTIC BROADCASTING FROM SHARED MEMORY

Publication number: 20240111534

Abstract: Embodiments described herein provide a technique to enable a broadcast load from an L1 cache or shared local memory to register files associated with hardware threads of a graphics core. One embodiment provides a graphics processor comprising a cache memory and a graphics core coupled with the cache memory. The graphics core includes a plurality of hardware threads and memory access circuitry to facilitate access to memory by the plurality of hardware threads. The graphics core is configurable to process a plurality of load requests from the plurality of hardware threads, detect duplicate load requests within the plurality of load requests, perform a single read from the cache memory in response to the duplicate load requests, and transmit data associated with the duplicate load requests to requesting hardware threads.

Type: Application

Filed: September 30, 2022

Publication date: April 4, 2024

Applicant: Intel Corporation

Inventors: Fangwen Fu, Chunhui Mei, Maxim Kazakov, Biju George, Jorge Parra, Supratim Pal
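
The duplicate-load handling described in the abstract above can be illustrated with a minimal software sketch. It is not the patented hardware design: the function name `service_loads`, the dictionary standing in for the cache, and the request format are all assumptions made for illustration. The sketch groups requests by address, reads each distinct address once, and hands the same data to every requesting thread.

```python
from collections import defaultdict


def service_loads(requests, cache):
    """Service load requests with one cache read per distinct address.

    requests: list of (thread_id, address) pairs (hypothetical format).
    cache:    dict mapping address -> data, standing in for the cache memory.
    Returns (results per thread, number of cache reads performed).
    """
    waiting = defaultdict(list)  # address -> threads that asked for it
    for thread_id, address in requests:
        waiting[address].append(thread_id)

    results, reads = {}, 0
    for address, thread_ids in waiting.items():
        data = cache[address]  # single read, however many threads asked
        reads += 1
        for thread_id in thread_ids:  # broadcast to every requester
            results[thread_id] = data
    return results, reads


cache = {0x100: "A", 0x140: "B"}
requests = [(0, 0x100), (1, 0x100), (2, 0x140), (3, 0x100)]
results, reads = service_loads(requests, cache)
```

Here three of the four threads request the same address, so four requests are served with two cache reads.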
ORDERED THREAD DISPATCH FOR THREAD TEAMS

Publication number: 20240111590

Abstract: An apparatus to facilitate ordered thread dispatch for thread teams is disclosed. The apparatus includes one or more processors including a graphics processor, the graphics processor including a plurality of processing resources, and wherein the graphics processor is to: allocate a thread team local identifier (ID) for respective threads of a thread team comprising a plurality of hardware threads that are to be executed solely by a processing resource of the plurality of processing resources; and dispatch the respective threads together into the processing resource, the respective threads having the thread team local ID allocated.

Type: Application

Filed: September 30, 2022

Publication date: April 4, 2024

Applicant: Intel Corporation

Inventors: Biju George, Vasanth Ranganathan, Fangwen Fu, Ben Ashbaugh, Roland Schulz
SYNCHRONIZATION UTILIZING LOCAL TEAM BARRIERS FOR THREAD TEAM PROCESSING

Publication number: 20240111609

Abstract: Low-latency synchronization utilizing local team barriers for thread team processing is described. An example of an apparatus includes one or more processors including a graphics processor, the graphics processor including a plurality of processing resources; and memory for storage of data including data for graphics processing, wherein the graphics processor is to receive a request for establishment of a local team barrier for a thread team, the thread team being allocated to a first processing resource, the thread team including multiple threads; determine requirements and designated threads for the local team barrier; and establish the local team barrier in a local register of the first processing resource based at least in part on the requirements and designated threads for the local barrier.

Type: Application

Filed: September 30, 2022

Publication date: April 4, 2024

Applicant: Intel Corporation

Inventors: Biju George, Supratim Pal, James Valerio, Vasanth Ranganathan, Fangwen Fu, Chunhui Mei
SHARED LOCAL REGISTERS FOR THREAD TEAM PROCESSING

Publication number: 20240112295

Abstract: Shared local registers for thread team processing is described. An example of an apparatus includes one or more processors including a graphics processor having multiple processing resources; and memory for storage of data, the graphics processor to allocate a first thread team to a first processing resource, the first thread team including hardware threads to be executed solely by the first processing resource; allocate a shared local register (SLR) space that may be directly referenced in the ISA instructions to the first processing resource, the SLR space being accessible to the threads of the thread team and being inaccessible to threads outside of the thread team; and allocate individual register spaces to the thread team, each of the individual register spaces being accessible to a respective thread of the thread team.

Type: Application

Filed: September 30, 2022

Publication date: April 4, 2024

Applicant: Intel Corporation

Inventors: Biju George, Fangwen Fu, Supratim Pal, Jorge Parra, Chunhui Mei, Maxim Kazakov, Joydeep Ray
PREFETCH AWARE LRU CACHE REPLACEMENT POLICY

Publication number: 20240104025

Abstract: Prefetch aware LRU cache replacement policy is described. An example of an apparatus includes one or more processors including a graphics processor, the graphics processor including a load store cache having multiple cache lines (CLs), each including bits for a cache line level (CL level) and one or more sectors for data storage; wherein the graphics processor is to receive one or more data elements for storage in the cache; set a CL level to track each CL receiving data, including setting CL level 1 for a CL receiving data in response to a miss in the cache and setting a CL level 2 for a CL receiving prefetched data in response to a prefetch request, and, upon determining that space is required in the cache to store data, apply a cache replacement policy, the policy being based at least in part on set CL levels for the CLs.

Type: Application

Filed: September 23, 2022

Publication date: March 28, 2024

Applicant: Intel Corporation

Inventors: Biju George, Zamshed I. Chowdhury, Prathamesh Raghunath Shinde, Chunhui Mei, Fangwen Fu
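
As a rough illustration of the CL-level idea in the abstract above, the sketch below tags each line with level 1 (filled on a demand miss) or level 2 (filled by a prefetch) and consults those levels when choosing a victim. The abstract does not say how the levels are ranked, so the ordering used here is purely an assumption: lines with the higher level are evicted first, and least-recently-filled order breaks ties. The class name `LeveledCache` and its methods are hypothetical.

```python
from collections import OrderedDict


class LeveledCache:
    """Toy cache whose lines carry a CL level (1 = demand miss, 2 = prefetch)."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.lines = OrderedDict()  # tag -> CL level, oldest entry first

    def _evict(self):
        # ASSUMPTION: evict the highest CL level first. max() returns the
        # first maximal item in iteration order, i.e. the oldest such line.
        victim = max(self.lines, key=lambda tag: self.lines[tag])
        del self.lines[victim]
        return victim

    def fill(self, tag, prefetch=False):
        """Store a line, evicting one first if the cache is full."""
        if tag not in self.lines and len(self.lines) >= self.capacity:
            self._evict()
        self.lines[tag] = 2 if prefetch else 1
        self.lines.move_to_end(tag)  # mark as most recently filled


cache = LeveledCache(capacity=2)
cache.fill("a")                 # demand miss  -> level 1
cache.fill("b", prefetch=True)  # prefetch     -> level 2
cache.fill("c")                 # cache is full, so a victim is chosen
```

Under the assumed ordering, the prefetched line "b" is evicted even though "a" is older, which is the behavior a plain LRU policy would not produce.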
HARDWARE ENHANCEMENTS FOR MATRIX LOAD/STORE INSTRUCTIONS

Publication number: 20240069914

Abstract: Embodiments described herein provide a system to enable access to an n-dimensional tensor in memory of a graphics processor via a batch of two-dimensional block access messages. One embodiment provides a graphics processor comprising general-purpose graphics execution resources coupled with the system interface, the general-purpose graphics execution resources including a matrix accelerator. The matrix accelerator is configured to perform a matrix operation on a plurality of tensors stored in a memory. Circuitry is included to facilitate access to the memory by the general-purpose graphics execution resources. The circuitry is configured to receive a request to access a tensor of the plurality of tensors and generate a batch of two-dimensional block access messages along a dimension of n>2 of the tensor. The batch of two-dimensional block access messages enables access to the tensor by the matrix accelerator.

Type: Application

Filed: August 23, 2022

Publication date: February 29, 2024

Applicant: Intel Corporation

Inventors: Biju George, Fangwen Fu, Joydeep Ray
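
The batching described in the abstract above can be sketched for the simplest case, a dense row-major 3-D tensor, where one 2-D block access message is generated per index along the outermost dimension. The message fields (`address`, `height`, `width`, `pitch`) and the function name `block_messages` are hypothetical and chosen for illustration; the actual message format is not given in the abstract.

```python
def block_messages(base, shape, elem_size):
    """Generate one 2-D block message per plane of a row-major 3-D tensor.

    base:      byte address of the first element (hypothetical).
    shape:     (depth, height, width) in elements.
    elem_size: size of one element in bytes.
    """
    depth, height, width = shape
    plane_bytes = height * width * elem_size  # bytes between adjacent planes
    return [
        {
            "address": base + d * plane_bytes,  # start of plane d
            "height": height,
            "width": width,
            "pitch": width * elem_size,         # bytes between rows
        }
        for d in range(depth)
    ]


messages = block_messages(base=0x1000, shape=(3, 4, 8), elem_size=2)
```

A single request for the 3 x 4 x 8 tensor expands into three 4 x 8 block messages whose addresses are 64 bytes apart.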
MOUNT APPARATUS FOR A SUBMERSIBLE ANALYZER AND METHOD FOR ANALYZING FLUID

Publication number: 20170219551

Abstract: A mount or submersible or semisubmersible housing for supporting a submersible analyzer or device and method for analyzing fluid. The mount includes an elongated submersible housing that supports the analyzer or device. The housing is ruggedized and has a geometric body with an internal cavity and with upper and lower ends. The upper end is configured to mount to a fixed structure. In some embodiments a slot extends between the upper and lower ends of the housing along a longitudinal axis thereof. In some such embodiments the slot is sized to receive a portion of a sliding extension that supports a sensor of the analyzer or device, thereby facilitating the installation and removal of the analyzer or device with respect to the elongated submersible housing. An attachment may be used to fixedly mount the elongated submersible housing to the fixed structure.

Type: Application

Filed: February 1, 2017

Publication date: August 3, 2017

Inventors: Salil Kharkar, Nicholas Passarelli, Chris Reilly, Michael Nye, Michael Chen, James L. Clarke, Biju George, Sudhir N. Murthy
Method and system for implementing a cloud-based social media marketing method and system

Patent number: 9633399

Abstract: Disclosed is an approach for implementing a system, method, and computer program product for performing social marketing using a cloud-based system. The approach is capable of accessing data across multiple types of internet-based sources of social data and commentary and performing analysis upon that data. A social marketing campaign can then be generated and implemented in an integrated manner using the system. This permits real-time reaction to trends, with rapid ability to react to opportunities in the marketplace.

Type: Grant

Filed: September 27, 2013

Date of Patent: April 25, 2017

Assignee: Oracle International Corporation

Inventors: Biju George, Mehrshad Setayesh, Timothy P. McCandless, Patricia Pichardo, Kimberly Ann Wolfe, Reza Parang, Maria Fernanda Diaz-Arscott, Jeff Condit, Brian Culler, Noah Horton, Michael James Strutton
Register liveness analysis for SIMD architectures

Patent number: 9372677

Abstract: Systems and methods of allocating physical registers to variables may involve identifying a partial definition of a variable in an inter-procedural control flow graph. A determination can be made as to whether to terminate a live range of the variable based at least in part on the partial definition. Additionally, a physical register may be allocated to the variable based at least in part on the live range.

Type: Grant

Filed: April 10, 2015

Date of Patent: June 21, 2016

Assignee: Intel Corporation

Inventors: Biju George, Guei-Yuan Lueh
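
To make the role of partial definitions in the abstract above concrete, here is a minimal backward liveness pass. It is a simplification under stated assumptions, not the patented analysis: it handles only straight-line code rather than an inter-procedural control flow graph, and it assumes a partial definition (for example a masked SIMD write that updates only some lanes) never terminates a live range, while a full definition always does. The tuple format and the name `live_before` are hypothetical.

```python
def live_before(instructions):
    """Compute the variables live immediately before each instruction.

    instructions: list of (dest, sources, is_partial) in program order.
    """
    live = set()
    result = [None] * len(instructions)
    for i in range(len(instructions) - 1, -1, -1):  # walk backward
        dest, sources, is_partial = instructions[i]
        if dest is not None and not is_partial:
            live.discard(dest)  # a full definition ends the live range
        # ASSUMPTION: a partial definition keeps dest live, because the
        # lanes it does not write still hold an earlier value.
        live.update(sources)
        result[i] = set(live)
    return result


program = [
    ("v", [], False),      # v = 0        full write to all lanes
    ("v", ["m"], True),    # (m) v = 1    masked write to some lanes
    (None, ["v"], False),  # use v
]
liveness = live_before(program)
```

Because the masked write is treated as partial, `v` stays live across it and its live range reaches back to the full write at the top.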
Efficient implementation of RSA using GPU/CPU architecture

Patent number: 9262166

Abstract: Various embodiments are directed to a heterogeneous processor architecture comprised of a CPU and a GPU on the same processor die. The heterogeneous processor architecture may optimize source code in a GPU compiler using vector strip mining to reduce instructions of arbitrary vector lengths into GPU supported vector lengths and loop peeling. It may be first determined that the source code is eligible for optimization if more than one machine code instruction of compiled source code under-utilizes GPU instruction bandwidth limitations. The initial vector strip mining results may be discarded and the first iteration of the inner loop body may be peeled out of the loop. The type of operands in the source code may be lowered and the peeled out inner loop body of source code may be vector strip mined again to obtain optimized source code.

Type: Grant

Filed: November 30, 2011

Date of Patent: February 16, 2016

Assignee: Intel Corporation

Inventors: Xiaozhu Kang, Biju George, Ken Lueh
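
The vector strip mining step mentioned in the abstract above splits an operation of arbitrary vector length into pieces whose lengths the hardware supports. The sketch below shows only that splitting step, not loop peeling or operand type lowering. The set of supported widths is an assumption for illustration, as is the function name `strip_mine`.

```python
def strip_mine(length, widths=(16, 8, 4, 2, 1)):
    """Split `length` lanes into (offset, width) chunks of supported widths.

    widths: supported vector widths, widest first (hypothetical values).
    """
    chunks, offset = [], 0
    for width in widths:
        while length - offset >= width:  # take as many wide chunks as fit
            chunks.append((offset, width))
            offset += width
    return chunks


chunks = strip_mine(21)
```

A 21-lane operation becomes one 16-wide, one 4-wide, and one 1-wide instruction under the assumed widths.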
REGISTER LIVENESS ANALYSIS FOR SIMD ARCHITECTURES

Publication number: 20150220313

Abstract: Systems and methods of allocating physical registers to variables may involve identifying a partial definition of a variable in an inter-procedural control flow graph. A determination can be made as to whether to terminate a live range of the variable based at least in part on the partial definition. Additionally, a physical register may be allocated to the variable based at least in part on the live range.

Type: Application

Filed: April 10, 2015

Publication date: August 6, 2015

Applicant: Intel Corporation

Inventors: Biju George, Guei-Yuan Lueh
Register liveness analysis for SIMD architectures

Patent number: 9015687

Abstract: Systems and methods of allocating physical registers to variables may involve identifying a partial definition of a variable in an inter-procedural control flow graph. A determination can be made as to whether to terminate a live range of the variable based at least in part on the partial definition. Additionally, a physical register may be allocated to the variable based at least in part on the live range.

Type: Grant

Filed: March 30, 2011

Date of Patent: April 21, 2015

Assignee: Intel Corporation

Inventors: Biju George, Guei-Yuan Lueh
METHOD AND SYSTEM FOR IMPLEMENTING A CLOUD-BASED SOCIAL MEDIA MARKETING METHOD AND SYSTEM

Publication number: 20140180788

Abstract: Disclosed is an approach for implementing a system, method, and computer program product for performing social marketing using a cloud-based system. The approach is capable of accessing data across multiple types of internet-based sources of social data and commentary and performing analysis upon that data. A social marketing campaign can then be generated and implemented in an integrated manner using the system. This permits real-time reaction to trends, with rapid ability to react to opportunities in the marketplace.

Type: Application

Filed: September 27, 2013

Publication date: June 26, 2014

Applicant: Oracle International Corporation

Inventors: Biju GEORGE, Mehrshad SETAYESH, Timothy P. MCCANDLESS, Patricia PICHARDO, Kimberly Ann WOLFE, Reza PARANG, Nanda ARSCOTT, Jeff CONDIT, Brian CULLER, Noah HORTON, Michael James STRUTTON
EFFICIENT IMPLEMENTATION OF RSA USING GPU/CPU ARCHITECTURE

Publication number: 20130297919

Abstract: Various embodiments are directed to a heterogeneous processor architecture comprised of a CPU and a GPU on the same processor die. The heterogeneous processor architecture may optimize source code in a GPU compiler using vector strip mining to reduce instructions of arbitrary vector lengths into GPU supported vector lengths and loop peeling. It may be first determined that the source code is eligible for optimization if more than one machine code instruction of compiled source code under-utilizes GPU instruction bandwidth limitations. The initial vector strip mining results may be discarded and the first iteration of the inner loop body may be peeled out of the loop. The type of operands in the source code may be lowered and the peeled out inner loop body of source code may be vector strip mined again to obtain optimized source code.

Type: Application

Filed: November 30, 2011

Publication date: November 7, 2013

Inventors: Xiaozhu Kang, Biju George, Ken Lueh
Modeling Structured SIMD Control Flow Constructs in an Explicit SIMD Language

Publication number: 20130290674

Abstract: Constructs may express SIMD control flow that can be efficiently implemented on a SIMD machine with support for SIMD control flow. The execution semantics of constructs serve as a functional specification for an emulation implementation in the central processing unit (CPU), a non-SIMD machine, using a conventional C++ compiler such as GCC or Microsoft Visual C++ without any modification to the conventional compiler in some embodiments.

Type: Application

Filed: April 30, 2012

Publication date: October 31, 2013

Inventors: Biju George, Guei-Yuan Lueh
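
The idea of emulating SIMD control flow on a non-SIMD machine, as described in the abstract above, can be sketched with per-lane masks. The abstract refers to a C++ emulation; this sketch uses Python only for brevity, and the helper names `masked_apply` and `simd_if_else` are hypothetical rather than constructs from the application. Both branches run over the whole vector, and the mask decides which lanes each branch may update.

```python
def masked_apply(fn, values, mask):
    """Apply fn only to lanes whose mask bit is set; others pass through."""
    return [fn(v) if active else v for v, active in zip(values, mask)]


def simd_if_else(cond, then_fn, else_fn, values):
    """Emulate a SIMD if/else: then-branch under mask, else under its inverse."""
    mask = [bool(c) for c in cond]
    values = masked_apply(then_fn, values, mask)
    return masked_apply(else_fn, values, [not m for m in mask])


x = [1, -2, 3, -4]
result = simd_if_else(
    [v < 0 for v in x],   # per-lane condition
    lambda v: -v,         # then: negate negative lanes
    lambda v: v * 10,     # else: scale the remaining lanes
    x,
)
```

Lanes 1 and 3 take the then-branch and lanes 0 and 2 take the else-branch, all within one pass over each branch body.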
REGISTER LIVENESS ANALYSIS FOR SIMD ARCHITECTURES

Publication number: 20120254847

Abstract: Systems and methods of allocating physical registers to variables may involve identifying a partial definition of a variable in an inter-procedural control flow graph. A determination can be made as to whether to terminate a live range of the variable based at least in part on the partial definition. Additionally, a physical register may be allocated to the variable based at least in part on the live range.

Type: Application

Filed: March 30, 2011

Publication date: October 4, 2012

Inventors: Biju George, Guei-Yuan Lueh