Patents by Inventor Dibyendu Das

Dibyendu Das has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

Sealing System for Protection of an Electrical Machine Having Rotating Components

Publication number: 20240213845

Abstract: The present invention relates to a sealing system for protection of an electrical machine. The sealing system comprises a seal housing (116, 306) mounted on a cannulated shaft (108) of a rotor of the electrical machine. The seal housing (116, 306) holds one or more seal members (114, 302). The one or more seal members (114, 302) include a lip for creating a sealing joint such that a proximal end of the lip of the one or more seal members (114) may be in contact with a stator of the electrical machine. A distal end of the lip of each of the one or more seal members (114, 302) may be enclosed in the seal housing (116, 306). The seal housing (116, 306) may rotate with rotation of the rotor, thereby preventing entry of any fluid into the electrical machine during operation of the electrical machine.

Type: Application

Filed: December 22, 2023

Publication date: June 27, 2024

Inventors: Raghavendra Rao, Dibyendu Das, Jigar Fulia, Amol Ridhore, Samsul Ekram

Transforming loops in program code based on a capacity of a cache

Patent number: 10866790

Abstract: An electronic device acquires, from program code, two or more program code loops having specified data dependencies. The electronic device places each of the program code loops into a corresponding blocking loop, each blocking loop including at least one blocking loop induction variable that is incremented by a corresponding block size and used to specify a number of iterations for at least one internal loop induction variable of the respective program code loop. The electronic device fuses the blocking loops into a fused loop by placing all of the blocking loops in the fused loop and replacing the blocking loop induction variables of the blocking loops with a fused loop induction variable that is incremented by the corresponding block size and used to specify the number of iterations for respective internal loop induction variables in the blocking loops.

Type: Grant

Filed: November 30, 2018

Date of Patent: December 15, 2020

Assignee: ADVANCED MICRO DEVICES, INC.

Inventors: Dibyendu Das, Pradeep H. Rao
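
The blocking and fusion steps described in the abstract above can be illustrated with a minimal sketch. It assumes two simple dependent loops over Python lists; the function names and the `block_size` parameter are hypothetical, and choosing the block size from the cache capacity is left out.

```python
def unfused(a, n):
    """Two separate loops; the second depends on the result of the first."""
    b = [0] * n
    c = [0] * n
    for i in range(n):
        b[i] = a[i] * 2
    for i in range(n):
        c[i] = b[i] + 1
    return c


def blocked_and_fused(a, n, block_size):
    """Both loops placed inside one fused blocking loop.

    The fused induction variable `ii` advances by `block_size` and bounds
    the iterations of each internal loop, so every block of `b` is consumed
    shortly after it is produced.
    """
    b = [0] * n
    c = [0] * n
    for ii in range(0, n, block_size):
        upper = min(ii + block_size, n)  # last block may be partial
        for i in range(ii, upper):
            b[i] = a[i] * 2
        for i in range(ii, upper):
            c[i] = b[i] + 1
    return c
```

In this sketch the fused version computes the same result as the unfused one; the intended benefit is that each block of `b` can still be resident in cache when the second internal loop reads it.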

Transforming Loops in Program Code Based on a Capacity of a Cache

Publication number: 20200174764

Abstract: An electronic device acquires, from program code, two or more program code loops having specified data dependencies. The electronic device places each of the program code loops into a corresponding blocking loop, each blocking loop including at least one blocking loop induction variable that is incremented by a corresponding block size and used to specify a number of iterations for at least one internal loop induction variable of the respective program code loop. The electronic device fuses the blocking loops into a fused loop by placing all of the blocking loops in the fused loop and replacing the blocking loop induction variables of the blocking loops with a fused loop induction variable that is incremented by the corresponding block size and used to specify the number of iterations for respective internal loop induction variables in the blocking loops.

Type: Application

Filed: November 30, 2018

Publication date: June 4, 2020

Inventors: Dibyendu Das, Pradeep H. Rao

Method and apparatus for compiling computer-readable computer programs using extended isomorphism

Patent number: 10387128

Abstract: A method and apparatus provides for compiling a computer-readable computer program having a plurality of computer-readable statements into a plurality of computer-executable instructions. In one example, the method and apparatus determines when at least one pair of the computer-readable statements is partial-isomorphic having an equivalent operation and same order of operation but causing non-consecutive memory accesses, and generates the computer-executable instructions causing the at least one pair of the partial-isomorphic statements to perform sequential physical memory accesses.

Type: Grant

Filed: April 27, 2017

Date of Patent: August 20, 2019

Assignee: Advanced Micro Devices, Inc.

Inventors: Md Asghar Ahmad Shahid, Shivarama Kokrady Rao, Dibyendu Das

Strided loading of non-sequential memory locations by skipping memory locations between consecutive loads

Patent number: 10353708

Abstract: Systems, apparatuses, and methods for utilizing efficient vectorization techniques for operands in non-sequential memory locations are disclosed. A system includes a vector processing unit (VPU) and one or more memory devices. In response to determining that a plurality of vector operands are stored in non-sequential memory locations, the VPU performs a plurality of vector load operations to load the plurality of vector operands into a plurality of vector registers. Next, the VPU performs a shuffle operation to consolidate the plurality of vector operands from the plurality of vector registers into a single vector register. Then, the VPU performs a vector operation on the vector operands stored in the single vector register. The VPU can also perform a vector store operation by permuting and storing a plurality of vector operands in appropriate locations within multiple vector registers and then storing the vector registers to locations in memory using a mask.

Type: Grant

Filed: September 23, 2016

Date of Patent: July 16, 2019

Assignee: Advanced Micro Devices, Inc.

Inventors: Anupama Rajesh Rasale, Dibyendu Das, Ashutosh Nema, Md Asghar Ahmad Shahid, Prathiba Kumar
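
The load-then-shuffle sequence in the abstract above can be modeled in a few lines. This is only a software sketch, assuming a vector width of four lanes and operands stored at every second memory location; `VECTOR_WIDTH`, `vector_load`, `shuffle`, and `load_stride_two` are hypothetical names, and Python lists stand in for vector registers.

```python
VECTOR_WIDTH = 4  # assumed number of lanes per vector register


def vector_load(memory, base):
    """Model a contiguous vector load of VECTOR_WIDTH elements."""
    return memory[base:base + VECTOR_WIDTH]


def shuffle(reg_a, reg_b, lane_indices):
    """Model a two-source shuffle: pick lanes from the concatenated registers."""
    combined = reg_a + reg_b
    return [combined[i] for i in lane_indices]


def load_stride_two(memory, base):
    """Gather every second element using two loads and one shuffle."""
    low = vector_load(memory, base)
    high = vector_load(memory, base + VECTOR_WIDTH)
    # Lanes 0, 2, 4, 6 hold the wanted operands; the odd lanes are skipped.
    return shuffle(low, high, [0, 2, 4, 6])
```

Once the operands sit in a single register, an ordinary vector operation (here, an element-wise operation over the returned list) can be applied to them.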

Cache aware self-referential structure peeling

Patent number: 10275230

Abstract: Methods of compiling source code are provided. A method includes identifying a first array of structures (AOS), having a plurality of array elements, each array element being a structure with a plurality of fields, and performing structure peeling on the first AOS to convert a data layout of the first AOS to an array of structure of arrays (AOSOA) including a plurality of memory blocks of uniform block size. At least one of the plurality of memory blocks is allocated for each field of the plurality of fields. The method further includes allocating a number of complete memory blocks to accommodate all of the plurality of array elements of the AOS.

Type: Grant

Filed: July 14, 2017

Date of Patent: April 30, 2019

Assignee: Advanced Micro Devices, Inc.

Inventors: Suresh Mani, Dibyendu Das, Shivarama Rao, Ashutosh Nema
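
The layout conversion described in the abstract above can be sketched as follows. The sketch assumes records are plain dictionaries and pads the last block with `None`; `peel_to_aosoa` and `read_field` are hypothetical helpers, and the handling of self-referential pointers is not modeled.

```python
import math


def peel_to_aosoa(records, fields, block_size):
    """Convert an array of structures into blocks of per-field arrays.

    Each block holds one fixed-length array per field, and only complete
    blocks are allocated, so the final block is padded when needed.
    """
    num_blocks = math.ceil(len(records) / block_size)
    blocks = []
    for b in range(num_blocks):
        chunk = records[b * block_size:(b + 1) * block_size]
        block = {}
        for field in fields:
            values = [record[field] for record in chunk]
            values += [None] * (block_size - len(values))  # pad to full size
            block[field] = values
        blocks.append(block)
    return blocks


def read_field(blocks, index, field, block_size):
    """Read one field of the element at `index` in the original array."""
    return blocks[index // block_size][field][index % block_size]
```

For five records and a block size of four, this allocates two complete blocks, with three padded slots in the second one.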

Cache Aware Self-Referential Structure Peeling

Publication number: 20190018664

Abstract: Methods of compiling source code are provided. A method includes identifying a first array of structures (AOS), having a plurality of array elements, each array element being a structure with a plurality of fields, and performing structure peeling on the first AOS to convert a data layout of the first AOS to an array of structure of arrays (AOSOA) including a plurality of memory blocks of uniform block size. At least one of the plurality of memory blocks is allocated for each field of the plurality of fields. The method further includes allocating a number of complete memory blocks to accommodate all of the plurality of array elements of the AOS.

Type: Application

Filed: July 14, 2017

Publication date: January 17, 2019

Applicant: Advanced Micro Devices, Inc.

Inventors: Suresh Mani, Dibyendu Das, Shivarama Rao, Ashutosh Nema

Estimation of bit widths of variables based on liveness

Patent number: 10180826

Abstract: A compiler generates transfer functions for blocks of a program during compilation of the program. The transfer functions estimate bit widths of variables in the blocks based on numbers of bits needed to carry out at least one instruction in the blocks and whether the variables are live in the blocks. For example, a transfer function may return a number indicating how many bits of a variable are needed to execute a current instruction as a function of the number of bits of the variable used by the program in subsequent instructions. Numbers of bits to represent the variables in the compiled program are determined based on the transfer functions.

Type: Grant

Filed: October 22, 2015

Date of Patent: January 15, 2019

Assignee: Advanced Micro Devices, Inc.

Inventors: Prakash Sathyanath Raghavendra, Dibyendu Das, Arun Rangasamy
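
The idea of a transfer function in the abstract above can be illustrated with a small backward pass over straight-line code. This is a sketch under stated assumptions, not the patented method: instructions are hypothetical `(dest, op, src, const)` tuples, each variable is assigned once, only `and` and `shr` are supported, and a width means "the low N bits".

```python
WORD_BITS = 32  # assumed machine word size


def transfer_and(bits_needed_of_result, mask):
    """Bits of the source needed by `dest = src & mask`."""
    return min(bits_needed_of_result, mask.bit_length())


def transfer_shift_right(bits_needed_of_result, shift):
    """Bits of the source needed by `dest = src >> shift`."""
    if bits_needed_of_result == 0:
        return 0  # result is dead, so the source is not needed here
    return min(WORD_BITS, bits_needed_of_result + shift)


def estimate_widths(instructions, live_out):
    """Walk instructions backward, propagating how many bits are needed."""
    needed = dict(live_out)
    for dest, op, src, const in reversed(instructions):
        demand = needed.get(dest, 0)  # 0 means dest is not live afterwards
        if op == "and":
            src_bits = transfer_and(demand, const)
        elif op == "shr":
            src_bits = transfer_shift_right(demand, const)
        else:
            raise ValueError(f"unsupported op: {op}")
        needed[src] = max(needed.get(src, 0), src_bits)
    return needed
```

For `t = x >> 4` followed by `y = t & 0xFF` with `y` live at 32 bits, the pass concludes that only 8 bits of `t` and 12 bits of `x` are needed.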

Method and Apparatus for Compiling Computer-Readable Computer Programs Using Extended Isomorphism

Publication number: 20180314506

Abstract: A method and apparatus provides for compiling a computer-readable computer program having a plurality of computer-readable statements into a plurality of computer-executable instructions. In one example, the method and apparatus determines when at least one pair of the computer-readable statements is partial-isomorphic having an equivalent operation and same order of operation but causing non-consecutive memory accesses, and generates the computer-executable instructions causing the at least one pair of the partial-isomorphic statements to perform sequential physical memory accesses.

Type: Application

Filed: April 27, 2017

Publication date: November 1, 2018

Inventors: Md Asghar Ahmad Shahid, Shivarama Kokrady Rao, Dibyendu Das

Efficient Vectorization Techniques for Operands in Non-Sequential Memory Locations

Publication number: 20180088948

Abstract: Systems, apparatuses, and methods for utilizing efficient vectorization techniques for operands in non-sequential memory locations are disclosed. A system includes a vector processing unit (VPU) and one or more memory devices. In response to determining that a plurality of vector operands are stored in non-sequential memory locations, the VPU performs a plurality of vector load operations to load the plurality of vector operands into a plurality of vector registers. Next, the VPU performs a shuffle operation to consolidate the plurality of vector operands from the plurality of vector registers into a single vector register. Then, the VPU performs a vector operation on the vector operands stored in the single vector register. The VPU can also perform a vector store operation by permuting and storing a plurality of vector operands in appropriate locations within multiple vector registers and then storing the vector registers to locations in memory using a mask.

Type: Application

Filed: September 23, 2016

Publication date: March 29, 2018

Inventors: Anupama Rajesh Rasale, Dibyendu Das, Ashutosh Nema, Md Asghar Ahmad Shahid, Prathiba Kumar

Estimation of Bit Widths of Variables Based on Liveness

Publication number: 20170115970

Abstract: A compiler generates transfer functions for blocks of a program during compilation of the program. The transfer functions estimate bit widths of variables in the blocks based on numbers of bits needed to carry out at least one instruction in the blocks and whether the variables are live in the blocks. For example, a transfer function may return a number indicating how many bits of a variable are needed to execute a current instruction as a function of the number of bits of the variable used by the program in subsequent instructions. Numbers of bits to represent the variables in the compiled program are determined based on the transfer functions.

Type: Application

Filed: October 22, 2015

Publication date: April 27, 2017

Inventors: Prakash Sathyanath Raghavendra, Dibyendu Das, Arun Rangasamy

Process mapping parallel computing

Patent number: 9235445

Abstract: A method of mapping processes to processors in a parallel computing environment where a parallel application is to be run on a cluster of nodes wherein at least one of the nodes has multiple processors sharing a common memory, the method comprising using compiler based communication analysis to map Message Passing Interface processes to processors on the nodes, whereby at least some more heavily communicating processes are mapped to processors within nodes. Other methods, apparatus, and computer readable media are also provided.

Type: Grant

Filed: November 28, 2011

Date of Patent: January 12, 2016

Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: Dibyendu Das, Nagarajan Kathiresan, Rajan Ravindran, Bhaskaran Venkatsubramaniam
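
The mapping goal stated in the abstract above (placing heavily communicating processes within the same node) can be sketched with a simple greedy heuristic. The heuristic itself is an assumption chosen for illustration, not the patented analysis; `map_processes` and its parameters are hypothetical, and `comm_volume` stands in for whatever the compiler's communication analysis would produce.

```python
def map_processes(ranks, comm_volume, num_nodes, procs_per_node):
    """Greedily co-locate the most heavily communicating rank pairs.

    comm_volume maps a pair of ranks (a, b) to an estimated message volume.
    Returns a dict from rank to node index.
    """
    placement = {}
    load = [0] * num_nodes

    def place(rank, node):
        placement[rank] = node
        load[node] += 1

    def node_with_room(slots):
        for node in range(num_nodes):
            if procs_per_node - load[node] >= slots:
                return node
        return None

    heaviest_first = sorted(
        comm_volume.items(), key=lambda item: item[1], reverse=True
    )
    for (a, b), _volume in heaviest_first:
        if a in placement and b in placement:
            continue
        if a not in placement and b not in placement:
            node = node_with_room(2)
            if node is not None:
                place(a, node)
                place(b, node)
            continue
        # Exactly one rank is placed: try to put its partner on the same node.
        placed, other = (a, b) if a in placement else (b, a)
        node = placement[placed]
        if load[node] < procs_per_node:
            place(other, node)

    # Any rank still unplaced goes to the first node with a free processor.
    for rank in ranks:
        if rank not in placement:
            node = node_with_room(1)
            if node is None:
                raise ValueError("not enough processor slots for all ranks")
            place(rank, node)
    return placement
```

With four ranks on two dual-processor nodes, the pair with the largest volume is placed first and shares a node, so its traffic can stay in shared memory.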

Scalable partial vectorization

Patent number: 9158511

Abstract: A system, method and computer program product to compute latencies of a plurality of expression trees in a basic block and to select a first and a second expression tree from the plurality of expression trees based on the computed latencies. The first expression tree is isomorphic to the second expression tree and the first and second expression trees are selected in order of largest to smallest latency. This selection ensures that the largest isomorphic expression trees are vectorized first. By vectorizing the largest isomorphic expression trees first, a basic block containing hundreds of statements can be vectorized without significant compile time. Moreover, vectorization of the largest isomorphic expression trees results in a significant improvement in system performance on SIMD processors.

Type: Grant

Filed: May 20, 2013

Date of Patent: October 13, 2015

Assignee: Advanced Micro Devices, Inc.

Inventors: Ramshankar Ramanarayanan, Meghana Gupta, Soham S. Chakraborty, Dibyendu Das
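
The selection order described in the abstract above (largest latency first, pairing isomorphic trees) can be sketched as below. The sketch assumes expression trees are nested `(op, left, right)` tuples with string leaves; the `OP_LATENCY` values and all function names are hypothetical, and latency is taken as the critical path through the tree.

```python
OP_LATENCY = {"add": 1, "mul": 3}  # hypothetical per-operation latencies


def latency(tree):
    """Critical-path latency of an expression tree; leaves cost nothing."""
    if isinstance(tree, str):
        return 0
    op, left, right = tree
    return OP_LATENCY[op] + max(latency(left), latency(right))


def shape(tree):
    """Structure of a tree with leaf names erased, used to test isomorphism."""
    if isinstance(tree, str):
        return "leaf"
    op, left, right = tree
    return (op, shape(left), shape(right))


def pair_isomorphic(trees):
    """Pair isomorphic trees, visiting candidates from largest latency down."""
    ordered = sorted(trees, key=latency, reverse=True)
    used = [False] * len(ordered)
    pairs = []
    for i, first in enumerate(ordered):
        if used[i]:
            continue
        for j in range(i + 1, len(ordered)):
            if not used[j] and shape(ordered[j]) == shape(first):
                pairs.append((first, ordered[j]))
                used[i] = used[j] = True
                break
    return pairs
```

Each returned pair is a candidate for being merged into one vector expression; trees without an isomorphic partner stay scalar.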

Process mapping in parallel computing

Patent number: 9063826

Abstract: A method of mapping processes to processors in a parallel computing environment where a parallel application is to be run on a cluster of nodes wherein at least one of the nodes has multiple processors sharing a common memory, the method comprising using compiler based communication analysis to map Message Passing Interface processes to processors on the nodes, whereby at least some more heavily communicating processes are mapped to processors within nodes. Other methods, apparatus, and computer readable media are also provided.

Type: Grant

Filed: November 28, 2011

Date of Patent: June 23, 2015

Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: Dibyendu Das, Nagarajan Kathiresan, Rajan Ravindran, Bhaskaran Venkatsubramaniam

Scalable Partial Vectorization

Publication number: 20140344555

Abstract: A system, method and computer program product to compute latencies of a plurality of expression trees in a basic block and to select a first and a second expression tree from the plurality of expression trees based on the computed latencies. The first expression tree is isomorphic to the second expression tree and the first and second expression trees are selected in order of largest to smallest latency. This selection ensures that the largest isomorphic expression trees are vectorized first. By vectorizing the largest isomorphic expression trees first, a basic block containing hundreds of statements can be vectorized without significant compile time. Moreover, vectorization of the largest isomorphic expression trees results in a significant improvement in system performance on SIMD processors.

Type: Application

Filed: May 20, 2013

Publication date: November 20, 2014

Applicant: Advanced Micro Devices, Inc.

Inventors: Ramshankar Ramanarayanan, Meghana Gupta, Soham S. Chakraborty, Dibyendu Das

Accelerating generic loop iterators using speculative execution

Patent number: 8701099

Abstract: A method, a system and a computer program product for effectively accelerating loop iterators using speculative execution of iterators. An Efficient Loop Iterator (ELI) utility detects initiation of a target program and initiates/spawns a speculative iterator thread at the start of the basic code block ahead of the code block that initiates a nested loop. The ELI utility assigns the iterator thread to a dedicated processor in a multi-processor system. The speculative thread runs/executes ahead of the execution of the nested loop and calculates indices in a corresponding multidimensional array. The iterator thread adds all the precomputed indices to a single queue. As a result, the ELI utility effectively enables a multidimensional loop to be replaced by a single dimensional loop. At the beginning of (or during) each iteration of the iterator, the ELI utility “dequeues” an entry from the queue to use the entry to access the array upon which the ELI utility iterates.

Type: Grant

Filed: November 2, 2010

Date of Patent: April 15, 2014

Assignee: International Business Machines Corporation

Inventors: Ganesh Bikshandi, Dibyendu Das, Smruti Ranjan Sarangi
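
The queue-based scheme in the abstract above can be sketched with a helper thread. This is only an illustration: a Python thread does not run on a dedicated processor, and `iterator_thread`, `sum_with_speculative_iterator`, and the `_DONE` sentinel are hypothetical names.

```python
import queue
import threading

_DONE = object()  # sentinel marking the end of the precomputed indices


def iterator_thread(shape, index_queue):
    """Run ahead of the consumer and enqueue every (row, column) index."""
    rows, cols = shape
    for i in range(rows):
        for j in range(cols):
            index_queue.put((i, j))
    index_queue.put(_DONE)


def sum_with_speculative_iterator(matrix):
    """Sum a 2-D array using one flat loop fed by precomputed indices."""
    rows = len(matrix)
    cols = len(matrix[0]) if rows else 0
    index_queue = queue.Queue()
    worker = threading.Thread(
        target=iterator_thread, args=((rows, cols), index_queue)
    )
    worker.start()

    total = 0
    while True:  # single-dimensional loop replacing the nested loop
        entry = index_queue.get()  # dequeue the next precomputed index
        if entry is _DONE:
            break
        i, j = entry
        total += matrix[i][j]
    worker.join()
    return total
```

The consumer never computes loop indices itself; it only dequeues them, which is the part of the scheme the sketch is meant to show.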

Compiler driven mechanism for registration and deregistration of memory pages

Patent number: 8612953

Abstract: A method, system and article of manufacture are disclosed for registering and deregistering memory pages in a computer system. The method comprises the steps of hoisting register and deregister calls in a given routine where temporal locality is present to overlap computation and communication; using software pipelined registration and deregistration where spatial locality is observed; and using intra-procedural and inter-procedural analysis by a compiler of the computer system to deregister dynamically allocated buffers. The preferred embodiment of the invention is based on an optimizing compiler. The compiler is used to extract information such as addresses of buffers which are being reused repeatedly (temporal locality), preferably in a loop. The compiler may also find information about spatial locality, such as arrays whose indexes are used in a well-defined manner in a series of messages, for example, array pages being accessed in a pre-defined pattern in a loop.

Type: Grant

Filed: September 13, 2012

Date of Patent: December 17, 2013

Assignee: International Business Machines Corporation

Inventors: Dibyendu Das, Manish Gupta
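
The hoisting step named in the abstract above can be illustrated by counting registration calls before and after the transformation. The sketch assumes a buffer that is reused on every loop iteration (temporal locality); `RegistrationLog`, `send_naive`, and `send_hoisted` are hypothetical, and no real registration API is called.

```python
class RegistrationLog:
    """Stand-in for a memory registration interface that only counts calls."""

    def __init__(self):
        self.register_calls = 0
        self.deregister_calls = 0
        self.sends = 0

    def register(self, buffer):
        self.register_calls += 1

    def deregister(self, buffer):
        self.deregister_calls += 1

    def send(self, buffer):
        self.sends += 1


def send_naive(log, buffer, iterations):
    """Register and deregister the same buffer around every send."""
    for _ in range(iterations):
        log.register(buffer)
        log.send(buffer)
        log.deregister(buffer)


def send_hoisted(log, buffer, iterations):
    """Same sends, with the register/deregister pair hoisted out of the loop."""
    log.register(buffer)
    for _ in range(iterations):
        log.send(buffer)
    log.deregister(buffer)
```

Both versions perform the same number of sends; the hoisted version pays the registration cost once instead of once per iteration.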

Compiler driven mechanism for registration and deregistration of memory pages

Patent number: 8381204

Abstract: A method, system and article of manufacture are disclosed for registering and deregistering memory pages in a computer system. The method comprises the steps of hoisting register and deregister calls in a given routine where temporal locality is present to overlap computation and communication; using software pipelined registration and deregistration where spatial locality is observed; and using intra-procedural and inter-procedural analysis by a compiler of the computer system to deregister dynamically allocated buffers. The preferred embodiment of the invention is based on an optimizing compiler. The compiler is used to extract information such as addresses of buffers which are being reused repeatedly (temporal locality), preferably in a loop. The compiler may also find information about spatial locality, such as arrays whose indexes are used in a well-defined manner in a series of messages, for example, array pages being accessed in a pre-defined pattern in a loop.

Type: Grant

Filed: April 30, 2008

Date of Patent: February 19, 2013

Assignee: International Business Machines Corporation

Inventors: Dibyendu Das, Manish Gupta

High efficiency compilation framework for streamlining the execution of compiled code

Patent number: 8250552

Abstract: A method and system for reducing processing overhead during execution of a code block in a high efficiency compilation framework. The method identifies second code blocks within the code block and separates them out from the first code block during compilation. Further, during compilation, the system converts the second code blocks to kernel program modules, in a form recognizable by the system kernel. The compilation is followed by execution of the first code block, with the compiled object code of the first code block being executed in user mode and the kernel program modules being executed in kernel mode.

Type: Grant

Filed: November 17, 2008

Date of Patent: August 21, 2012

Assignee: International Business Machines Corporation

Inventors: Madhusudanan Kandasamy, Manish Gupta, Vidya Ranganathan, Dibyendu Das

Accelerating Generic Loop Iterators Using Speculative Execution

Publication number: 20120110302

Abstract: A method, a system and a computer program product for effectively accelerating loop iterators using speculative execution of iterators. An Efficient Loop Iterator (ELI) utility detects initiation of a target program and initiates/spawns a speculative iterator thread at the start of the basic code block ahead of the code block that initiates a nested loop. The ELI utility assigns the iterator thread to a dedicated processor in a multi-processor system. The speculative thread runs/executes ahead of the execution of the nested loop and calculates indices in a corresponding multidimensional array. The iterator thread adds all the precomputed indices to a single queue. As a result, the ELI utility effectively enables a multidimensional loop to be replaced by a single dimensional loop. At the beginning of (or during) each iteration of the iterator, the ELI utility “dequeues” an entry from the queue to use the entry to access the array upon which the ELI utility iterates.

Type: Application

Filed: November 2, 2010

Publication date: May 3, 2012

Applicant: IBM Corporation

Inventors: Ganesh Bikshandi, Dibyendu Das, Smruti Ranjan Sarangi
