Patents by Inventor Mazhar Memon

Mazhar Memon has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 11907589
    Abstract: At least one application of a client executes via system software on a hardware computing system that includes at least one CPU and at least one coprocessor. A virtualization layer establishes unified memory address space between the client and the hardware computing system, which also includes memory associated with the at least one coprocessor. The virtualization layer then synchronizes memory associated with the client and memory associated with the at least one coprocessor. The virtualization layer may be installed and run in a non-privileged, user space, without modification of the application or of the system software running on the hardware computing system.
    Type: Grant
    Filed: July 8, 2019
    Date of Patent: February 20, 2024
    Assignee: VMware, Inc.
    Inventors: Aidan Cully, Mazhar Memon
  • Patent number: 11860737
    Abstract: An interface software layer is interposed between at least one application and a plurality of coprocessors. A data and command stream issued by the application(s) to an API of an intended one of the coprocessors is intercepted by the layer, which also acquires and stores the execution state information for the intended coprocessor at a coprocessor synchronization boundary. At least a portion of the intercepted data and command stream data is stored in a replay log associated with the intended coprocessor. The replay log associated with the intended coprocessor is then read out, along with the stored execution state information, and is submitted to and serviced by at least one different one of the coprocessors other than the intended coprocessor.
    Type: Grant
    Filed: March 16, 2019
    Date of Patent: January 2, 2024
    Assignee: VMware, Inc.
    Inventors: Mazhar Memon, Subramanian Rama, Maciej Bajkowski
  • Patent number: 11842223
    Abstract: Disclosed herein is the integration into edge nodes of a telecommunications network system of client computer system and server computer system where the server computer system includes a pool of shareable accelerators and the client computer runs an application program that is assisted by the pool of accelerators. The edge nodes connect to user equipment, and some of the user equipment can themselves act as one of the client computer systems. In some embodiments, the accelerators are GPUs, and in other embodiments, the accelerators are artificial intelligence accelerators.
    Type: Grant
    Filed: February 17, 2022
    Date of Patent: December 12, 2023
    Assignee: VMware, Inc.
    Inventors: Tiejun Chen, Chris Wolf, Mazhar Memon, Peter Buckingham, Shreekanta Das
  • Patent number: 11822925
    Abstract: Execution of multiple execution streams is scheduled on at least one coprocessor. A software layer located logically between applications and the at least one coprocessor intercepts a first API call from an application and determines that a first execution stream is to be executed. Before scheduling the first execution stream, the software layer transmits a response to the application indicating that the at least one coprocessor is ready to execute another execution stream. The software layer intercepts a second API call from the application and determines that a second execution stream including one or more kernels is to be executed. The software layer determines that the one or more kernels does not have a dependency on the first execution stream. The software layer schedules the one or more kernels for execution prior to when the at least one coprocessor has completed execution of the first execution stream.
    Type: Grant
    Filed: March 15, 2021
    Date of Patent: November 21, 2023
    Assignee: VMware, Inc.
    Inventors: Mazhar Memon, Aidan Cully
  • Patent number: 11748152
    Abstract: In a data processing system running at least one application on a hardware platform that includes at least one processor and a plurality of coprocessors, at least one kernel dispatched by an application is intercepted by an intermediate software layer running logically between the application and the system software. Compute functions are determined within kernel(s), and data dependencies are determined among the compute functions. The compute functions are dispatched to selected ones of the coprocessors based at least in part on the determined data dependencies and kernel results are returned to the application that dispatched the respective kernel.
    Type: Grant
    Filed: October 18, 2021
    Date of Patent: September 5, 2023
    Inventors: Mazhar Memon, Subramanian Rama, Maciej Bajkowski
  • Publication number: 20230229521
    Abstract: Disclosed herein is the integration into edge nodes of a telecommunications network system of client computer system and server computer system where the server computer system includes a pool of shareable accelerators and the client computer runs an application program that is assisted by the pool of accelerators. The edge nodes connect to user equipment, and some of the user equipment can themselves act as one of the client computer systems. In some embodiments, the accelerators are GPUs, and in other embodiments, the accelerators are artificial intelligence accelerators.
    Type: Application
    Filed: February 17, 2022
    Publication date: July 20, 2023
    Inventors: Tiejun CHEN, Chris WOLF, Mazhar MEMON, Peter BUCKINGHAM, Shreekanta DAS
  • Publication number: 20220405104
    Abstract: Disclosed are various examples of providing cross platform accelerator remoting between complex instruction set computer (CISC) components and reduced instruction set computer (RISC) components of a computing environment. An accelerator remoting server receives accelerator instructions executable by a locally installed accelerator device and provides the accelerator instructions to the accelerator device. The accelerator remoting server transmits accelerator results to an accelerator remoting client to complete the cross platform or platform agnostic accelerator remoting.
    Type: Application
    Filed: July 16, 2021
    Publication date: December 22, 2022
    Inventors: Tiejun Chen, Olivier Alain Cremel, Kit Colbert, Chris Wolf, Mazhar Memon, Renu Raman, Peter Buckingham, Shreekanta Das
  • Publication number: 20220308950
    Abstract: A method for handling system calls during execution of an application over a plurality of nodes including a first node and a second node, includes receiving a system call from a thread running on the first node, determining that executing the system call involves resources present on the second node, sending the system call and arguments of the system call to the second node for the second node to execute the system call, receiving the results of the system call from the second node, and returning the results of the system call to the thread.
    Type: Application
    Filed: October 4, 2021
    Publication date: September 29, 2022
    Inventors: Aidan CULLY, Mazhar MEMON
  • Publication number: 20220308936
    Abstract: A method for executing an application over a plurality of nodes in each of which an application monitor and a runtime are executing includes executing a first portion of the application by first threads of the runtime of the first node and a second portion of the application by second threads of the runtime of the second node, and under control of the application monitors of the first and second nodes and while executing the first portions and second portions of the application, migrating workloads of one or more of the first threads from the first node to the second node for execution by the second threads.
    Type: Application
    Filed: October 4, 2021
    Publication date: September 29, 2022
    Inventors: Aidan CULLY, Vance MILLER, Dusan VELJKO, Mazhar MEMON
  • Patent number: 11347543
    Abstract: Instructions of at least one application are executed via system software, on a hardware computing system that includes at least one processor and a plurality of coprocessors. At least one application program interface (API) is associated with each coprocessor. A state virtualization layer is installed logically between the application and the system software. The state virtualization layer examines an execution stream directed by the at least one application to a first one of the plurality of coprocessors; extracts the state of the first coprocessor; pauses execution of the first coprocessor; and at runtime, dynamically resumes execution of the execution stream, with the extracted state of the first coprocessor, on a second one of the plurality of coprocessors.
    Type: Grant
    Filed: October 5, 2020
    Date of Patent: May 31, 2022
    Assignee: VMware, Inc.
    Inventors: Mazhar Memon, Miha Čančula
  • Patent number: 11334477
    Abstract: At least one application runs on a hardware platform that includes a plurality of coprocessors, each of which has a respective internal memory space. An intermediate software layer (MVL) is transparent to the application and intercepts calls for coprocessor use. If the data corresponding to an application's call, or separate calls from different entities (including different applications) to the same coprocessor, to the API of a target coprocessor, cannot be stored within the available internal memory space of the target coprocessor, but comprises data subsets that individually can, the MVL intercepts the call response to the application/entities and indicates that the target coprocessor can handle the request. The MVL then transfers the data subsets to the target coprocessor as needed by the corresponding kernel(s) and swaps out each data subset to the internal memory of another coprocessor to make room for subsequently needed data subsets.
    Type: Grant
    Filed: October 14, 2020
    Date of Patent: May 17, 2022
    Assignee: VMware, Inc.
    Inventors: Mazhar Memon, Zheng Li
  • Publication number: 20220035654
    Abstract: In a data processing system running at least one application on a hardware platform that includes at least one processor and a plurality of coprocessors, at least one kernel dispatched by an application is intercepted by an intermediate software layer running logically between the application and the system software. Compute functions are determined within kernel(s), and data dependencies are determined among the compute functions. The compute functions are dispatched to selected ones of the coprocessors based at least in part on the determined data dependencies and kernel results are returned to the application that dispatched the respective kernel.
    Type: Application
    Filed: October 18, 2021
    Publication date: February 3, 2022
    Inventors: Mazhar MEMON, Subramanian RAMA, Maciej BAJKOWSKI
  • Patent number: 11169843
    Abstract: In a data processing system running at least one application on a hardware platform that includes at least one processor and a plurality of coprocessors, at least one kernel dispatched by an application is intercepted by an intermediate software layer running logically between the application and the system software. Compute functions are determined within kernel(s), and data dependencies are determined among the compute functions. The compute functions are dispatched to selected ones of the coprocessors based at least in part on the determined data dependencies and kernel results are returned to the application that dispatched the respective kernel.
    Type: Grant
    Filed: January 8, 2020
    Date of Patent: November 9, 2021
    Assignee: VMware, Inc.
    Inventors: Mazhar Memon, Subramanian Rama, Maciej Bajkowski
  • Publication number: 20210200553
    Abstract: Execution of multiple execution streams is scheduled on at least one coprocessor. A software layer located logically between applications and the at least one coprocessor intercepts a first API call from an application and determines that a first execution stream is to be executed. Before scheduling the first execution stream, the software layer transmits a response to the application indicating that the at least one coprocessor is ready to execute another execution stream. The software layer intercepts a second API call from the application and determines that a second execution stream including one or more kernels is to be executed. The software layer determines that the one or more kernels does not have a dependency on the first execution stream. The software layer schedules the one or more kernels for execution prior to when the at least one coprocessor has completed execution of the first execution stream.
    Type: Application
    Filed: March 15, 2021
    Publication date: July 1, 2021
    Inventors: Mazhar MEMON, Aidan CULLY
  • Patent number: 10949211
    Abstract: Execution of multiple execution streams is scheduled on a plurality of coprocessors. A software layer located logically between applications and the coprocessors determines dependencies within the execution streams, each said dependency being a condition in one of the execution streams that must be satisfied in order for execution of at least one other of the execution streams to proceed on corresponding ones of the coprocessors. The dependencies are then represented in a data structure and an optimized execution schedule is determined for the execution streams according to the dependencies. Simultaneous execution of a plurality of the execution streams is then dynamically reordered according to the optimized execution schedule.
    Type: Grant
    Filed: December 20, 2018
    Date of Patent: March 16, 2021
    Assignee: VMware, Inc.
    Inventors: Mazhar Memon, Aidan Cully
  • Publication number: 20210026762
    Abstract: At least one application runs on a hardware platform that includes a plurality of coprocessors, each of which has a respective internal memory space. An intermediate software layer (MVL) is transparent to the application and intercepts calls for coprocessor use. If the data corresponding to an application's call, or separate calls from different entities (including different applications) to the same coprocessor, to the API of a target coprocessor, cannot be stored within the available internal memory space of the target coprocessor, but comprises data subsets that individually can, the MVL intercepts the call response to the application/entities and indicates that the target coprocessor can handle the request. The MVL then transfers the data subsets to the target coprocessor as needed by the corresponding kernel(s) and swaps out each data subset to the internal memory of another coprocessor to make room for subsequently needed data subsets.
    Type: Application
    Filed: October 14, 2020
    Publication date: January 28, 2021
    Inventors: Mazhar MEMON, Zheng LI
  • Publication number: 20210019177
    Abstract: Instructions of at least one application are executed via system software, on a hardware computing system that includes at least one processor and a plurality of coprocessors. At least one application program interface (API) is associated with each coprocessor. A state virtualization layer is installed logically between the application and the system software. The state virtualization layer examines an execution stream directed by the at least one application to a first one of the plurality of coprocessors; extracts the state of the first coprocessor; pauses execution of the first coprocessor; and at runtime, dynamically resumes execution of the execution stream, with the extracted state of the first coprocessor, on a second one of the plurality of coprocessors.
    Type: Application
    Filed: October 5, 2020
    Publication date: January 21, 2021
    Inventors: Mazhar MEMON, Miha ČANČULA
  • Publication number: 20210011666
    Abstract: At least one application of a client executes via system software on a hardware computing system that includes at least one CPU and at least one coprocessor. A virtualization layer establishes unified memory address space between the client and the hardware computing system, which also includes memory associated with the at least one coprocessor. The virtualization layer then synchronizes memory associated with the client and memory associated with the at least one coprocessor. The virtualization layer may be installed and run in a non-privileged, user space, without modification of the application or of the system software running on the hardware computing system.
    Type: Application
    Filed: July 8, 2019
    Publication date: January 14, 2021
    Applicant: Bitfusion.io, Inc.
    Inventors: Aidan CULLY, Mazhar MEMON
  • Patent number: 10810117
    Abstract: At least one application runs on a hardware platform that includes a plurality of coprocessors, each of which has a respective internal memory space. An intermediate software layer (MVL) is transparent to the application and intercepts calls for coprocessor use. If the data corresponding to an application's call, or separate calls from different entities (including different applications) to the same coprocessor, to the API of a target coprocessor, cannot be stored within the available internal memory space of the target coprocessor, but comprises data subsets that individually can, the MVL intercepts the call response to the application/entities and indicates that the target coprocessor can handle the request. The MVL then transfers the data subsets to the target coprocessor as needed by the corresponding kernel(s) and swaps out each data subset to the internal memory of another coprocessor to make room for subsequently needed data subsets.
    Type: Grant
    Filed: October 16, 2017
    Date of Patent: October 20, 2020
    Assignee: VMware, Inc.
    Inventors: Mazhar Memon, Zheng Li
  • Patent number: 10802871
    Abstract: Instructions of at least one application are executed via system software, on a hardware computing system that includes at least one processor and a plurality of coprocessors. At least one application program interface (API) is associated with each coprocessor. A state virtualization layer is installed logically between the application and the system software. The state virtualization layer examines an execution stream directed by the at least one application to a first one of the plurality of coprocessors; extracts the state of the first coprocessor; pauses execution of the first coprocessor; and at runtime, dynamically resumes execution of the execution stream, with the extracted state of the first coprocessor, on a second one of the plurality of coprocessors.
    Type: Grant
    Filed: May 7, 2019
    Date of Patent: October 13, 2020
    Assignee: VMware, Inc.
    Inventors: Mazhar Memon, Miha Čančula