Patents by Inventor Dibyendu Das
Dibyendu Das has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Publication number: 20240213845
Abstract: The present invention relates to a sealing system for protection of an electrical machine. The sealing system comprises a seal housing (116, 306) mounted on a cannulated shaft (108) of a rotor of the electrical machine. The seal housing (116, 306) holds one or more seal members (114, 302). The one or more seal members (114, 302) includes a lip for creating a sealing joint such that a proximal end of the lip of the one or more seal members (114) may be in contact with a stator of the electrical machine. A distal end of the lip of each of the one or more seal members (114, 302) may be enclosed in the seal housing (116, 306). The seal housing (116, 306) may rotate with rotation of the rotor, thereby preventing entry of any fluid into the electrical machine during operation of the electrical machine.
Type: Application
Filed: December 22, 2023
Publication date: June 27, 2024
Inventors: Raghavendra Rao, Dibyendu Das, Jigar Fulia, Amol Ridhore, Samsul Ekram
-
Patent number: 10866790
Abstract: An electronic device acquires, from program code, two or more program code loops having specified data dependencies. The electronic device places each of the program code loops into a corresponding blocking loop, each blocking loop including at least one blocking loop induction variable that is incremented by a corresponding block size and used to specify a number of iterations for at least one internal loop induction variable of the respective program code loop. The electronic device fuses the blocking loops into a fused loop by placing all of the blocking loops in the fused loop and replacing the blocking loop induction variables of the blocking loops with a fused loop induction variable that is incremented by the corresponding block size and used to specify the number of iterations for respective internal loop induction variables in the blocking loops.
Type: Grant
Filed: November 30, 2018
Date of Patent: December 15, 2020
Assignee: ADVANCED MICRO DEVICES, INC.
Inventors: Dibyendu Das, Pradeep H. Rao
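The blocking-then-fusing idea in this abstract can be illustrated with a small sketch. It is not taken from the patent: the function names, the arrays, and the block size of 4 are assumptions chosen for illustration, and the data dependency is simply that the second loop reads what the first loop wrote at the same index.

```python
def unfused(a):
    """Two separate loops; the second consumes what the first produced."""
    n = len(a)
    b = [0] * n
    c = [0] * n
    for i in range(n):
        b[i] = a[i] * 2
    for i in range(n):
        c[i] = b[i] + 1
    return c


def blocked_and_fused(a, block=4):
    """Same computation with both loops blocked and placed in one fused loop."""
    n = len(a)
    b = [0] * n
    c = [0] * n
    # Fused loop: a single induction variable `ii` advances by the block size.
    for ii in range(0, n, block):
        hi = min(ii + block, n)
        # Body of the former first blocking loop; `ii` bounds the inner loop.
        for i in range(ii, hi):
            b[i] = a[i] * 2
        # Body of the former second blocking loop, working on the same block.
        for i in range(ii, hi):
            c[i] = b[i] + 1
    return c
```

In this sketch each block of `b` is consumed right after it is produced, which is the kind of locality benefit such a transformation is usually after.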
-
Publication number: 20200174764
Abstract: An electronic device acquires, from program code, two or more program code loops having specified data dependencies. The electronic device places each of the program code loops into a corresponding blocking loop, each blocking loop including at least one blocking loop induction variable that is incremented by a corresponding block size and used to specify a number of iterations for at least one internal loop induction variable of the respective program code loop. The electronic device fuses the blocking loops into a fused loop by placing all of the blocking loops in the fused loop and replacing the blocking loop induction variables of the blocking loops with a fused loop induction variable that is incremented by the corresponding block size and used to specify the number of iterations for respective internal loop induction variables in the blocking loops.
Type: Application
Filed: November 30, 2018
Publication date: June 4, 2020
Inventors: Dibyendu Das, Pradeep H. Rao
-
Patent number: 10387128
Abstract: A method and apparatus provides for compiling a computer-readable computer program having a plurality of computer-readable statements into a plurality of computer-executable instructions. In one example, the method and apparatus determines when at least one pair of the computer-readable statements is partial-isomorphic having an equivalent operation and same order of operation but causing non-consecutive memory accesses, and generates the computer-executable instructions causing the at least one pair of the partial-isomorphic statements to perform sequential physical memory accesses.
Type: Grant
Filed: April 27, 2017
Date of Patent: August 20, 2019
Assignee: Advanced Micro Devices, Inc.
Inventors: Md Asghar Ahmad Shahid, Shivarama Kokrady Rao, Dibyendu Das
-
Patent number: 10353708
Abstract: Systems, apparatuses, and methods for utilizing efficient vectorization techniques for operands in non-sequential memory locations are disclosed. A system includes a vector processing unit (VPU) and one or more memory devices. In response to determining that a plurality of vector operands are stored in non-sequential memory locations, the VPU performs a plurality of vector load operations to load the plurality of vector operands into a plurality of vector registers. Next, the VPU performs a shuffle operation to consolidate the plurality of vector operands from the plurality of vector registers into a single vector register. Then, the VPU performs a vector operation on the vector operands stored in the single vector register. The VPU can also perform a vector store operation by permuting and storing a plurality of vector operands in appropriate locations within multiple vector registers and then storing the vector registers to locations in memory using a mask.
Type: Grant
Filed: September 23, 2016
Date of Patent: July 16, 2019
Assignee: Advanced Micro Devices, Inc.
Inventors: Anupama Rajesh Rasale, Dibyendu Das, Ashutosh Nema, Md Asghar Ahmad Shahid, Prathiba Kumar
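The load, shuffle, operate sequence described in this abstract can be modelled in plain Python. This is only a software simulation under stated assumptions: the 4-lane register width, the helper names `vector_load`, `shuffle` and `gather_strided`, and the strided access pattern are all hypothetical and do not come from the patent or from any real instruction set.

```python
LANES = 4  # assumed register width, in elements


def vector_load(memory, base):
    """Model one contiguous vector load of LANES elements starting at `base`."""
    return memory[base:base + LANES]


def shuffle(registers, picks):
    """Build one register by picking (register index, lane index) per output lane."""
    return [registers[r][lane] for r, lane in picks]


def gather_strided(memory, start, stride):
    """Collect LANES operands spaced `stride` apart into a single register."""
    registers, picks = [], []
    for k in range(LANES):
        addr = start + k * stride
        base = (addr // LANES) * LANES      # aligned chunk that holds the operand
        registers.append(vector_load(memory, base))
        picks.append((k, addr - base))      # lane of the operand in that chunk
    return shuffle(registers, picks)


def add_scalar(register, value):
    """A stand-in vector operation applied to the consolidated register."""
    return [x + value for x in register]
```

For example, with `memory = list(range(32))`, `gather_strided(memory, 1, 8)` consolidates the elements at addresses 1, 9, 17 and 25 before the vector operation runs on them.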
-
Patent number: 10275230
Abstract: Methods of compiling source code are provided. A method includes identifying a first array of structures (AOS), having a plurality of array elements, each array element being a structure with a plurality of fields, and performing structure peeling on the first AOS to convert a data layout of the first AOS to an array of structure of arrays (AOSOA) including a plurality of memory blocks of uniform block size. At least one of the plurality of memory blocks is allocated for each field of the plurality of fields. The method further includes allocating a number of complete memory blocks to accommodate all of the plurality of array elements of the AOS.
Type: Grant
Filed: July 14, 2017
Date of Patent: April 30, 2019
Assignee: Advanced Micro Devices, Inc.
Inventors: Suresh Mani, Dibyendu Das, Shivarama Rao, Ashutosh Nema
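A rough sketch of the AOS to AOSOA layout change follows. It assumes, purely for illustration, that structures are Python dicts, that each block holds one fixed-size list per field, and that the last block is padded with `None` so only complete blocks are allocated; the names `peel_to_aosoa` and `read_field` are hypothetical.

```python
def peel_to_aosoa(aos, fields, block_size):
    """Convert a list of dicts (AOS) into a list of per-field blocks (AOSOA)."""
    n_blocks = -(-len(aos) // block_size)  # ceiling division: whole blocks only
    blocks = []
    for b in range(n_blocks):
        chunk = aos[b * block_size:(b + 1) * block_size]
        block = {}
        for field in fields:
            column = [element[field] for element in chunk]
            # Pad the final block so every block has the uniform size.
            column += [None] * (block_size - len(column))
            block[field] = column
        blocks.append(block)
    return blocks


def read_field(blocks, index, field, block_size):
    """Read field `field` of original element `index` from the AOSOA layout."""
    return blocks[index // block_size][field][index % block_size]
```

With five elements and a block size of 2, this sketch allocates three complete blocks, and element 3 lives in block 1 at offset 1.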
-
Publication number: 20190018664
Abstract: Methods of compiling source code are provided. A method includes identifying a first array of structures (AOS), having a plurality of array elements, each array element being a structure with a plurality of fields, and performing structure peeling on the first AOS to convert a data layout of the first AOS to an array of structure of arrays (AOSOA) including a plurality of memory blocks of uniform block size. At least one of the plurality of memory blocks is allocated for each field of the plurality of fields. The method further includes allocating a number of complete memory blocks to accommodate all of the plurality of array elements of the AOS.
Type: Application
Filed: July 14, 2017
Publication date: January 17, 2019
Applicant: Advanced Micro Devices, Inc.
Inventors: Suresh Mani, Dibyendu Das, Shivarama Rao, Ashutosh Nema
-
Patent number: 10180826
Abstract: A compiler generates transfer functions for blocks of a program during compilation of the program. The transfer functions estimate bit widths of variables in the blocks based on numbers of bits needed to carry out at least one instruction in the blocks and whether the variables are live in the blocks. For example, a transfer function may return a number indicating how many bits of a variable are needed to execute a current instruction as a function of the number of bits of the variable used by the program in subsequent instructions. Numbers of bits to represent the variables in the compiled program based on the transfer functions.
Type: Grant
Filed: October 22, 2015
Date of Patent: January 15, 2019
Assignee: Advanced Micro Devices, Inc.
Inventors: Prakash Sathyanath Raghavendra, Dibyendu Das, Arun Rangasamy
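To make the notion of a bit-width transfer function concrete, here is a minimal backward-analysis sketch. It is an illustration only, not the patented method: the three toy operations (`and_mask`, `add_const`, `shr`), the 32-bit word size, the tuple encoding of instructions, and the assumption that every variable is assigned exactly once are all choices made for this example.

```python
WORD = 32  # assumed machine word size in bits


def transfer(op, needed_out, const):
    """Bits of the source needed, given how many bits of the result are used later."""
    if op == "and_mask":
        # Bits above the mask's highest set bit can never reach the result.
        return min(needed_out, const.bit_length())
    if op == "add_const":
        # The low k bits of a sum depend only on the low k bits of the inputs.
        return needed_out
    if op == "shr":
        # A right shift by `const` moves higher source bits into the used range.
        return min(WORD, needed_out + const)
    raise ValueError(f"unknown op: {op}")


def estimate_widths(instructions, live_out):
    """Walk the block backwards; `instructions` are (dest, op, src, const) tuples."""
    needed = dict(live_out)
    for dest, op, src, const in reversed(instructions):
        out_bits = needed.get(dest, 0)
        if out_bits == 0:
            continue  # dest is not live after this point, so it demands nothing
        needed[src] = max(needed.get(src, 0), transfer(op, out_bits, const))
    return needed
```

For the two-instruction block `t = x >> 4; y = t & 0xFF` with all 32 bits of `y` live at exit, the sketch reports that 8 bits of `t` and 12 bits of `x` are sufficient.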
-
Publication number: 20180314506
Abstract: A method and apparatus provides for compiling a computer-readable computer program having a plurality of computer-readable statements into a plurality of computer-executable instructions. In one example, the method and apparatus determines when at least one pair of the computer-readable statements is partial-isomorphic having an equivalent operation and same order of operation but causing non-consecutive memory accesses, and generates the computer-executable instructions causing the at least one pair of the partial-isomorphic statements to perform sequential physical memory accesses.
Type: Application
Filed: April 27, 2017
Publication date: November 1, 2018
Inventors: Md Asghar Ahmad Shahid, Shivarama Kokrady Rao, Dibyendu Das
-
Publication number: 20180088948
Abstract: Systems, apparatuses, and methods for utilizing efficient vectorization techniques for operands in non-sequential memory locations are disclosed. A system includes a vector processing unit (VPU) and one or more memory devices. In response to determining that a plurality of vector operands are stored in non-sequential memory locations, the VPU performs a plurality of vector load operations to load the plurality of vector operands into a plurality of vector registers. Next, the VPU performs a shuffle operation to consolidate the plurality of vector operands from the plurality of vector registers into a single vector register. Then, the VPU performs a vector operation on the vector operands stored in the single vector register. The VPU can also perform a vector store operation by permuting and storing a plurality of vector operands in appropriate locations within multiple vector registers and then storing the vector registers to locations in memory using a mask.
Type: Application
Filed: September 23, 2016
Publication date: March 29, 2018
Inventors: Anupama Rajesh Rasale, Dibyendu Das, Ashutosh Nema, Md Asghar Ahmad Shahid, Prathiba Kumar
-
Publication number: 20170115970
Abstract: A compiler generates transfer functions for blocks of a program during compilation of the program. The transfer functions estimate bit widths of variables in the blocks based on numbers of bits needed to carry out at least one instruction in the blocks and whether the variables are live in the blocks. For example, a transfer function may return a number indicating how many bits of a variable are needed to execute a current instruction as a function of the number of bits of the variable used by the program in subsequent instructions. Numbers of bits to represent the variables in the compiled program based on the transfer functions.
Type: Application
Filed: October 22, 2015
Publication date: April 27, 2017
Inventors: Prakash Sathyanath Raghavendra, Dibyendu Das, Arun Rangasamy
-
Patent number: 9235445
Abstract: A method of mapping processes to processors in a parallel computing environment where a parallel application is to be run on a cluster of nodes wherein at least one of the nodes has multiple processors sharing a common memory, the method comprising using compiler based communication analysis to map Message Passing Interface processes to processors on the nodes, whereby at least some more heavily communicating processes are mapped to processors within nodes. Other methods, apparatus, and computer readable media are also provided.
Type: Grant
Filed: November 28, 2011
Date of Patent: January 12, 2016
Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
Inventors: Dibyendu Das, Nagarajan Kathiresan, Rajan Ravindran, Bhaskaran Venkatsubramaniam
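One simple way to co-locate heavily communicating processes is a greedy pass over pairwise traffic, sketched below. The greedy strategy is a substitute chosen for illustration and is not claimed to be the patented method. The `traffic` dictionary stands in for whatever the compiler's communication analysis would produce, and the function name and parameters are hypothetical.

```python
def map_ranks_to_nodes(traffic, n_ranks, cores_per_node):
    """Greedily place the heaviest-communicating rank pairs on the same node.

    `traffic` maps (rank_a, rank_b) to an estimated communication volume.
    Returns a dict from rank to node index.
    """
    n_nodes = -(-n_ranks // cores_per_node)  # ceiling division
    free = [cores_per_node] * n_nodes
    node_of = {}

    def place(rank, node):
        node_of[rank] = node
        free[node] -= 1

    heaviest_first = sorted(traffic.items(), key=lambda kv: kv[1], reverse=True)
    for (a, b), _volume in heaviest_first:
        if a in node_of and b in node_of:
            continue
        if a not in node_of and b not in node_of:
            node = next((n for n in range(n_nodes) if free[n] >= 2), None)
            if node is not None:
                place(a, node)
                place(b, node)
        else:
            placed, other = (a, b) if a in node_of else (b, a)
            if free[node_of[placed]] >= 1:
                place(other, node_of[placed])

    # Ranks that were never paired go to any node with a free core.
    for rank in range(n_ranks):
        if rank not in node_of:
            place(rank, next(n for n in range(n_nodes) if free[n] >= 1))
    return node_of
```

With four ranks, two cores per node, and heavy traffic on the pairs (0, 3) and (1, 2), the sketch puts each pair on its own node so their messages stay within shared memory.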
-
Patent number: 9158511
Abstract: A system, method and computer program product to compute latencies of a plurality of expression trees in a basic block and to select a first and a second expression tree from the plurality of expression trees based on the computed latencies. The first expression tree is isomorphic to the second expression tree and the first and second expression trees are selected in order of largest to smallest latency. This selection ensures that the largest isomorphic expression trees are vectorized first. By vectorizing the largest isomorphic expression trees first, a basic block containing hundreds of statements can be vectorized without significant compile time. Moreover, vectorization of the largest isomorphic expression trees results in a significant improvement in system performance on SIMD processors.
Type: Grant
Filed: May 20, 2013
Date of Patent: October 13, 2015
Assignee: Advanced Micro Devices, Inc.
Inventors: Ramshankar Ramanarayanan, Meghana Gupta, Soham S. Chakraborty, Dibyendu Das
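The selection order described here, largest-latency isomorphic trees first, can be sketched as follows. Several details are assumptions made for the example: trees are nested tuples `(op, left, right)` with string leaves, the per-operation cycle counts in `OP_LATENCY` are invented, latency is taken to be the critical-path sum, and two trees count as isomorphic when they have the same operators in the same positions.

```python
OP_LATENCY = {"+": 1, "*": 3}  # hypothetical cycle counts per operation


def latency(tree):
    """Critical-path latency of an expression tree; leaves cost nothing."""
    if isinstance(tree, str):
        return 0
    op, left, right = tree
    return OP_LATENCY[op] + max(latency(left), latency(right))


def shape(tree):
    """Structure of a tree with operand names erased, used to test isomorphism."""
    if isinstance(tree, str):
        return "leaf"
    op, left, right = tree
    return (op, shape(left), shape(right))


def pick_pairs(trees):
    """Pair up isomorphic trees, visiting candidates from largest latency down."""
    order = sorted(trees, key=latency, reverse=True)
    pairs, used = [], set()
    for i, first in enumerate(order):
        if i in used:
            continue
        for j in range(i + 1, len(order)):
            if j not in used and shape(order[j]) == shape(first):
                pairs.append((first, order[j]))
                used.update((i, j))
                break
    return pairs
```

Each returned pair is a candidate for packing into one SIMD operation; in this sketch the deeper multiply-add trees are paired before the plain additions.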
-
Patent number: 9063826
Abstract: A method of mapping processes to processors in a parallel computing environment where a parallel application is to be run on a cluster of nodes wherein at least one of the nodes has multiple processors sharing a common memory, the method comprising using compiler based communication analysis to map Message Passing Interface processes to processors on the nodes, whereby at least some more heavily communicating processes are mapped to processors within nodes. Other methods, apparatus, and computer readable media are also provided.
Type: Grant
Filed: November 28, 2011
Date of Patent: June 23, 2015
Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
Inventors: Dibyendu Das, Nagarajan Kathiresan, Rajan Ravindran, Bhaskaran Venkatsubramaniam
-
Publication number: 20140344555
Abstract: A system, method and computer program product to compute latencies of a plurality of expression trees in a basic block and to select a first and a second expression tree from the plurality of expression trees based on the computed latencies. The first expression tree is isomorphic to the second expression tree and the first and second expression trees are selected in order of largest to smallest latency. This selection ensures that the largest isomorphic expression trees are vectorized first. By vectorizing the largest isomorphic expression trees first, a basic block containing hundreds of statements can be vectorized without significant compile time. Moreover, vectorization of the largest isomorphic expression trees results in a significant improvement in system performance on SIMD processors.
Type: Application
Filed: May 20, 2013
Publication date: November 20, 2014
Applicant: Advanced Micro Devices, Inc.
Inventors: Ramshankar Ramanarayanan, Meghana Gupta, Soham S. Chakraborty, Dibyendu Das
-
Patent number: 8701099
Abstract: A method, a system and a computer program product for effectively accelerating loop iterators using speculative execution of iterators. An Efficient Loop Iterator (ELI) utility detects initiation of a target program and initiates/spawns a speculative iterator thread at the start of the basic code block ahead of the code block that initiates a nested loop. The ELI utility assigns the iterator thread to a dedicated processor in a multi-processor system. The speculative thread runs/executes ahead of the execution of the nested loop and calculates indices in a corresponding multidimensional array. The iterator thread adds all the precomputed indices to a single queue. As a result, the ELI utility effectively enables a multidimensional loop to be replaced by a single dimensional loop. At the beginning of (or during) each iteration of the iterator, the ELI utility “dequeues” an entry from the queue to use the entry to access the array upon which the ELI utility iterates.
Type: Grant
Filed: November 2, 2010
Date of Patent: April 15, 2014
Assignee: International Business Machines Corporation
Inventors: Ganesh Bikshandi, Dibyendu Das, Smruti Ranjan Sarangi
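The queue-based hand-off between an iterator thread and the main loop can be sketched with Python's standard `threading` and `queue` modules. This is a loose illustration only: Python threads are not pinned to a dedicated processor, the `None` end-of-stream marker is an assumption of this sketch, and the function names are hypothetical rather than part of the ELI utility.

```python
import queue
import threading


def iterator_thread(rows, cols, out):
    """Run ahead of the consumer and enqueue every (i, j) index of a 2-D array."""
    for i in range(rows):
        for j in range(cols):
            out.put((i, j))
    out.put(None)  # assumed end-of-stream marker


def sum_with_precomputed_indices(grid):
    """Consume indices from the queue in a single flat loop instead of a nested one."""
    indices = queue.Queue()
    worker = threading.Thread(
        target=iterator_thread, args=(len(grid), len(grid[0]), indices)
    )
    worker.start()
    total = 0
    while True:
        entry = indices.get()  # dequeue one precomputed index per iteration
        if entry is None:
            break
        i, j = entry
        total += grid[i][j]
    worker.join()
    return total
```

The consumer never computes a loop index itself; it only dequeues, which mirrors how the abstract describes a multidimensional loop being replaced by a single-dimensional one.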
-
Patent number: 8612953
Abstract: A method, system and article of manufacture are disclosed for registering and deregistering memory pages in a computer system. The method comprises the steps of hoisting register and deregister calls in a given routine where temporal locality is present to overlap computation and communication; using software pipelined registration and deregistration where spatial locality is observed; and using intra-procedural and inter-procedural analysis by a compiler of the computer system to deregister dynamically allocated buffers. The preferred embodiment of the invention is based on an optimizing compiler. The compiler is used to extract information such as addresses of buffers which are being reused repeatedly (temporal locality), preferably in a loop. The compiler may also find information about spatial locality, such as arrays whose indexes are used in a well-defined manner in a series of messages, for example, array pages being accessed in a pre-defined pattern in a loop.
Type: Grant
Filed: September 13, 2012
Date of Patent: December 17, 2013
Assignee: International Business Machines Corporation
Inventors: Dibyendu Das, Manish Gupta
-
Patent number: 8381204
Abstract: A method, system and article of manufacture are disclosed for registering and deregistering memory pages in a computer system. The method comprises the steps of hoisting register and deregister calls in a given routine where temporal locality is present to overlap computation and communication; using software pipelined registration and deregistration where spatial locality is observed; and using intra-procedural and inter-procedural analysis by a compiler of the computer system to deregister dynamically allocated buffers. The preferred embodiment of the invention is based on an optimizing compiler. The compiler is used to extract information such as addresses of buffers which are being reused repeatedly (temporal locality), preferably in a loop. The compiler may also find information about spatial locality, such as arrays whose indexes are used in a well-defined manner in a series of messages, for example, array pages being accessed in a pre-defined pattern in a loop.
Type: Grant
Filed: April 30, 2008
Date of Patent: February 19, 2013
Assignee: International Business Machines Corporation
Inventors: Dibyendu Das, Manish Gupta
-
Patent number: 8250552
Abstract: A method and system for reducing processing overhead during execution of a code block in a high efficiency compilation framework. The method identifies second code blocks within the code block and separates them out from the first code block during compilation. Further, during compilation, the system converts the second code blocks to kernel program modules, in a form recognizable by the system kernel. The compilation is followed by execution of the first code block, with the compiled object code of the first code block being executed in user mode and the kernel program modules being executed in kernel mode.
Type: Grant
Filed: November 17, 2008
Date of Patent: August 21, 2012
Assignee: International Business Machines Corporation
Inventors: Madhusudanan Kandasamy, Manish Gupta, Vidya Ranganathan, Dibyendu Das
-
Publication number: 20120110302
Abstract: A method, a system and a computer program product for effectively accelerating loop iterators using speculative execution of iterators. An Efficient Loop Iterator (ELI) utility detects initiation of a target program and initiates/spawns a speculative iterator thread at the start of the basic code block ahead of the code block that initiates a nested loop. The ELI utility assigns the iterator thread to a dedicated processor in a multi-processor system. The speculative thread runs/executes ahead of the execution of the nested loop and calculates indices in a corresponding multidimensional array. The iterator thread adds all the precomputed indices to a single queue. As a result, the ELI utility effectively enables a multidimensional loop to be replaced by a single dimensional loop. At the beginning of (or during) each iteration of the iterator, the ELI utility “dequeues” an entry from the queue to use the entry to access the array upon which the ELI utility iterates.
Type: Application
Filed: November 2, 2010
Publication date: May 3, 2012
Applicant: IBM Corporation
Inventors: Ganesh Bikshandi, Dibyendu Das, Smruti Ranjan Sarangi