Patents by Inventor Ho-Yuen Chau

Ho-Yuen Chau has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Publication number: 20240169463
    Abstract: A computing system including a plurality of processing devices configured to execute a Mixture-of-Experts (MoE) layer included in an MoE model. In each of a plurality of iterations, each of the processing devices is configured to receive a respective plurality of input tokens. Executing the MoE layer further includes, at each of the processing devices, selecting one or more destination expert sub-models associated with the input tokens. Respective numbers k of expert sub-models selected differ across the iterations. At each of the processing devices, executing the MoE layer further includes conveying the input tokens to the one or more destination expert sub-models. Executing the MoE layer further includes generating one or more respective expert sub-model outputs at the one or more destination expert sub-models. Executing the MoE layer further includes generating and outputting an MoE layer output based on the one or more expert sub-model outputs.
    Type: Application
    Filed: November 10, 2022
    Publication date: May 23, 2024
    Applicant: Microsoft Technology Licensing, LLC
    Inventors: Yifan XIONG, Changho HWANG, Wei CUI, Ziyue YANG, Ze LIU, Han HU, Zilong WANG, Rafael Omar SALAS, Jithin JOSE, Prabhat RAM, Ho-Yuen CHAU, Peng CHENG, Fan YANG, Mao YANG, Yongqiang XIONG
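The variable top-k routing this abstract describes can be illustrated with a minimal Python sketch (all names here are invented for illustration, not taken from the patent): each call routes every token to its k highest-scoring experts, and k may differ from one iteration to the next.

```python
def moe_layer(tokens, experts, gate_scores, k):
    """Route each token to its top-k experts and average their outputs.

    `k` is passed per call, so it can vary across iterations as in the
    abstract. `gate_scores[i][e]` is the gate's score for token i, expert e.
    """
    outputs = []
    for tok, scores in zip(tokens, gate_scores):
        # Select the k highest-scoring experts as destinations for this token.
        top = sorted(range(len(experts)), key=lambda e: scores[e], reverse=True)[:k]
        # Each destination expert processes the token; combine the results.
        expert_outs = [experts[e](tok) for e in top]
        outputs.append(sum(expert_outs) / len(expert_outs))
    return outputs
```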
  • Publication number: 20240160894
    Abstract: A computing system is provided, including a plurality of processing devices configured to execute a Mixture-of-Experts (MoE) layer included in an MoE model. The MoE layer includes a plurality of expert sub-models that each have a respective plurality of parameter values. The MoE layer is configured to be switchable between a data parallel mode and an expert-data-model parallel mode without conveying the respective parameter values of the expert sub-models among the plurality of processing devices.
    Type: Application
    Filed: November 10, 2022
    Publication date: May 16, 2024
    Applicant: Microsoft Technology Licensing, LLC
    Inventors: Yifan XIONG, Changho HWANG, Wei CUI, Ziyue YANG, Ze LIU, Han HU, Zilong WANG, Rafael Omar SALAS, Jithin JOSE, Prabhat RAM, Ho-Yuen CHAU, Peng CHENG, Fan YANG, Mao YANG, Yongqiang XIONG
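A toy model of the mode switch described above, assuming a hypothetical `SwitchableMoE` class: switching between the two parallelism modes changes only the routing policy, while the expert parameters stay pinned to the devices that already hold them.

```python
class SwitchableMoE:
    """Experts stay pinned to their devices; switching parallelism modes
    only changes how tokens are routed, never where parameters live."""

    def __init__(self, experts_per_device):
        # device id -> list of expert sub-models (parameters never move)
        self.placement = experts_per_device
        self.mode = "data_parallel"

    def set_mode(self, mode):
        # Only the routing policy changes; self.placement is untouched.
        assert mode in ("data_parallel", "expert_data_model_parallel")
        self.mode = mode

    def route(self, token_id, num_devices):
        if self.mode == "data_parallel":
            # Each device processes its own shard of tokens with local experts.
            return token_id % num_devices
        # Expert-data-model parallel: tokens travel to the device that
        # holds their destination expert.
        return hash(token_id) % num_devices
```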
  • Publication number: 20240160906
    Abstract: A computing system including a plurality of processing devices configured to execute a Mixture-of-Experts (MoE) layer included in an MoE model. The processing devices are configured to execute the MoE layer at least in part by, during a first collective communication phase between the processing devices, splitting each of a plurality of first input tensors along a first dimension to obtain first output tensors. Executing the MoE layer further includes processing the first output tensors at a respective plurality of expert sub-models to obtain a plurality of second input tensors. Executing the MoE layer further includes, during a second collective communication phase between the processing devices, receiving the second input tensors from the expert sub-models and concatenating the second input tensors along the first dimension to obtain second output tensors. Executing the MoE layer further includes outputting the second output tensors as output of the MoE layer.
    Type: Application
    Filed: November 10, 2022
    Publication date: May 16, 2024
    Applicant: Microsoft Technology Licensing, LLC
    Inventors: Yifan XIONG, Changho HWANG, Wei CUI, Ziyue YANG, Ze LIU, Han HU, Zilong WANG, Rafael Omar SALAS, Jithin JOSE, Prabhat RAM, Ho-Yuen CHAU, Peng CHENG, Fan YANG, Mao YANG, Yongqiang XIONG
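The split/concatenate steps framing the two collective communication phases can be sketched with plain Python lists standing in for tensors (illustrative only):

```python
def split_dim0(tensor, parts):
    """Split a (list-based) tensor along its first dimension into equal chunks,
    as done before the first collective communication phase."""
    n = len(tensor) // parts
    return [tensor[i * n:(i + 1) * n] for i in range(parts)]

def concat_dim0(chunks):
    """Concatenate chunks back along the first dimension, as done after the
    second collective communication phase."""
    return [row for chunk in chunks for row in chunk]
```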
  • Publication number: 20240086719
    Abstract: A computing system including a plurality of processing devices configured to execute a Mixture-of-Experts (MoE) layer. The processing devices are configured to execute the MoE layer at least in part by receiving an input tensor including input tokens. Executing the MoE layer further includes computing a gating function output vector based on the input tensor and computing a sparse encoding of the input tensor and the gating function output vector. The sparse encoding indicates one or more destination expert sub-models. Executing the MoE layer further includes dispatching the input tensor for processing at the one or more destination expert sub-models, and further includes computing an expert output tensor. Executing the MoE layer further includes computing an MoE layer output at least in part by computing a sparse decoding of the expert output tensor. Executing the MoE layer further includes conveying the MoE layer output to an additional computing process.
    Type: Application
    Filed: May 16, 2023
    Publication date: March 14, 2024
    Applicant: Microsoft Technology Licensing, LLC
    Inventors: Yifan XIONG, Changho HWANG, Wei CUI, Ziyue YANG, Ze LIU, Han HU, Zilong WANG, Rafael Omar SALAS, Jithin JOSE, Prabhat RAM, Ho-Yuen CHAU, Peng CHENG, Fan YANG, Mao YANG, Yongqiang XIONG
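The sparse encode/dispatch/decode flow can be illustrated as follows: routing is stored as (token, expert, weight) triples rather than a dense dispatch tensor, and decoding is a weighted combine back into token order. Function names are hypothetical.

```python
def sparse_dispatch(tokens, gate_vectors, top_k=1):
    """Encode routing as (token_index, expert_index, gate_weight) triples
    instead of materializing a dense dispatch tensor."""
    encoding = []
    for i, gates in enumerate(gate_vectors):
        top = sorted(range(len(gates)), key=lambda e: gates[e], reverse=True)[:top_k]
        for e in top:
            encoding.append((i, e, gates[e]))
    return encoding

def sparse_combine(encoding, expert_outputs, num_tokens):
    """Decode: gate-weighted sum of expert outputs back into token order."""
    out = [0.0] * num_tokens
    for i, e, w in encoding:
        out[i] += w * expert_outputs[(i, e)]
    return out
```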
  • Patent number: 11237761
    Abstract: The disclosed technologies include functionality for managing Multiple Physical Function NVMe Devices (“MFNDs”) and the physical functions (“PFs”) provided by MFNDs. For example, host devices can discover MFNDs, query the capabilities of MFNDs, and change the operating mode of an MFND between a user mode and a super administrator mode. Hosts can also utilize the disclosed technologies to create and delete individual child PFs on MFNDs. The disclosed technologies also include functionality for managing the settings associated with individual PFs of MFNDs. For example, hosts can query and modify the settings associated with individual child PFs of an MFND. The disclosed technologies also include functionality for managing the QoS provided by individual PFs of an MFND. For example, hosts can also query and modify the QoS provided by individual child PFs of an MFND.
    Type: Grant
    Filed: February 21, 2020
    Date of Patent: February 1, 2022
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Lei Kou, Scott Chao-Chueh Lee, Ho-Yuen Chau, Liang Yang, Chin Hwan Park, Yimin Deng
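A toy Python model of the management operations in this abstract (mode switching, child-PF creation and deletion, QoS updates); the class and field names are invented for illustration and do not reflect the actual NVMe command set.

```python
class MFND:
    """Toy model of a multi-physical-function NVMe device."""

    def __init__(self, max_child_pfs):
        self.mode = "user"
        self.max_child_pfs = max_child_pfs
        self.child_pfs = {}  # pf_id -> settings dict (including QoS)

    def set_mode(self, mode):
        # Hosts can switch the device between the two operating modes.
        assert mode in ("user", "super_admin")
        self.mode = mode

    def create_pf(self, pf_id, qos_iops):
        # Creating child PFs is a privileged, capacity-limited operation
        # in this sketch.
        assert self.mode == "super_admin", "must be in super admin mode"
        assert len(self.child_pfs) < self.max_child_pfs
        self.child_pfs[pf_id] = {"qos_iops": qos_iops}

    def delete_pf(self, pf_id):
        assert self.mode == "super_admin", "must be in super admin mode"
        del self.child_pfs[pf_id]

    def set_qos(self, pf_id, qos_iops):
        # Hosts can query and modify the QoS of an individual child PF.
        self.child_pfs[pf_id]["qos_iops"] = qos_iops
```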
  • Patent number: 11163887
    Abstract: A bare metal resource includes a trusted portion and an untrusted portion. The trusted portion includes trusted hardware, an image repository, and a clearance manager. The clearance manager is executable during bootup of the bare metal resource to perform a clearance process on the untrusted portion, including deleting the BIOS in the untrusted portion and loading a trusted BIOS from the image repository on the untrusted hardware, to place the untrusted portion in a trusted state. The bare metal resource may be provisioned to a tenant of a cloud provider after being placed in the trusted state.
    Type: Grant
    Filed: December 28, 2018
    Date of Patent: November 2, 2021
    Assignee: MICROSOFT TECHNOLOGY LICENSING, LLC
    Inventors: Bryan W. Tuttle, Carlos Jose Cela, Ho-Yuen Chau, Melur K. Raghuraman, Saurabh M. Kulkarni, Yimin Deng
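The clearance process reads as a short sequence, sketched below with dictionaries standing in for the untrusted portion and the image repository (illustrative only):

```python
def clearance_process(untrusted, image_repo):
    """Run at bootup: wipe the possibly tampered BIOS from the untrusted
    portion, flash a trusted image from the repository, and mark the
    resource trusted so it can be provisioned to a tenant."""
    untrusted["bios"] = None                        # delete the untrusted BIOS
    untrusted["bios"] = image_repo["trusted_bios"]  # load the known-good image
    untrusted["state"] = "trusted"
    return untrusted
```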
  • Publication number: 20210132860
    Abstract: The disclosed technologies include functionality for managing Multiple Physical Function NVMe Devices (“MFNDs”) and the physical functions (“PFs”) provided by MFNDs. For example, host devices can discover MFNDs, query the capabilities of MFNDs, and change the operating mode of an MFND between a user mode and a super administrator mode. Hosts can also utilize the disclosed technologies to create and delete individual child PFs on MFNDs. The disclosed technologies also include functionality for managing the settings associated with individual PFs of MFNDs. For example, hosts can query and modify the settings associated with individual child PFs of an MFND. The disclosed technologies also include functionality for managing the QoS provided by individual PFs of an MFND. For example, hosts can also query and modify the QoS provided by individual child PFs of an MFND.
    Type: Application
    Filed: February 21, 2020
    Publication date: May 6, 2021
    Inventors: Lei KOU, Scott Chao-Chueh LEE, Ho-Yuen CHAU, Liang YANG, Chin Hwan PARK, Yimin DENG
  • Patent number: 10630654
    Abstract: Computing systems, devices, and associated methods of managing secure communication using hardware accelerators are disclosed herein. In one embodiment, a method includes receiving messages from a peer computing device via a computer network at an FPGA of a hardware accelerator and examining each of the received messages to determine whether the received messages contain application data. The method can then include forwarding a first subset of the received messages that do not contain application data to the processor for further processing and processing a second subset of the messages containing application data according to a security protocol without forwarding the second subset to the processor to reduce a consumption of bandwidth across the communications bridge.
    Type: Grant
    Filed: June 22, 2017
    Date of Patent: April 21, 2020
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Carlos Jose Cela, Ho Yuen Chau, Bryan William Tuttle
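The partitioning step can be sketched in Python (names are illustrative): application-data messages stay on the accelerator for in-line processing, while everything else crosses the bridge to the CPU.

```python
def partition_messages(messages, is_app_data):
    """Split received messages: application-data records are handled on the
    accelerator; everything else (e.g. handshake traffic) goes to the CPU."""
    to_cpu, on_fpga = [], []
    for msg in messages:
        (on_fpga if is_app_data(msg) else to_cpu).append(msg)
    return to_cpu, on_fpga
```

In TLS, for instance, records with content type 23 carry application data, so `is_app_data` could simply test the record type.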
  • Publication number: 20190251266
    Abstract: A bare metal resource includes a trusted portion and an untrusted portion. The trusted portion includes trusted hardware, an image repository, and a clearance manager. The clearance manager is executable during bootup of the bare metal resource to perform a clearance process on the untrusted portion, including deleting the BIOS in the untrusted portion and loading a trusted BIOS from the image repository on the untrusted hardware, to place the untrusted portion in a trusted state. The bare metal resource may be provisioned to a tenant of a cloud provider after being placed in the trusted state.
    Type: Application
    Filed: December 28, 2018
    Publication date: August 15, 2019
    Applicant: Microsoft Technology Licensing, LLC
    Inventors: Bryan W. TUTTLE, Carlos Jose CELA, Ho-Yuen CHAU, Melur K. RAGHURAMAN, Saurabh M. KULKARNI, Yimin DENG
  • Publication number: 20180278588
    Abstract: Computing systems, devices, and associated methods of managing secure communication using hardware accelerators are disclosed herein. In one embodiment, a method includes receiving messages from a peer computing device via a computer network at an FPGA of a hardware accelerator and examining each of the received messages to determine whether the received messages contain application data. The method can then include forwarding a first subset of the received messages that do not contain application data to the processor for further processing and processing a second subset of the messages containing application data according to a security protocol without forwarding the second subset to the processor to reduce a consumption of bandwidth across the communications bridge.
    Type: Application
    Filed: June 22, 2017
    Publication date: September 27, 2018
    Inventors: Carlos Jose Cela, Ho Yuen Chau, Bryan William Tuttle
  • Patent number: 9395920
    Abstract: Computerized methods, systems, and computer-storage media for throttling requests from virtual machines (VMs) to a hard-disk drive (HDD) are provided. When a request for disk I/O is received from a VM, a disk-drive model that simulates performance characteristics of the HDD is accessed. During access, the disk-drive model's estimation of HDD parameters and the disk-drive model's estimation of a current state of a disk head of the HDD are gathered. A projected execution time to carry out the request is computed as a function of the estimated HDD parameters and the estimated current state of the disk head. Also, an actual execution time to carry out the request is measured upon allowing the request to pass to the HDD. Using a comparison of the projected execution time and the actual execution time, the traffic of the requests from the VMs is throttled.
    Type: Grant
    Filed: November 17, 2011
    Date of Patent: July 19, 2016
    Assignee: MICROSOFT TECHNOLOGY LICENSING, LLC
    Inventors: Yimin Deng, Ho Yuen Chau, Yue Zuo, Forrest Curtis Foltz
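A rough sketch of the projected-versus-actual comparison; the cost model and throttling policy below are simplifications assumed for illustration, not the patented disk-drive model. The idea: if a VM's requests complete far faster than the modeled drive could serve them, the VM is drawing more than its provisioned disk share and its traffic is throttled.

```python
def projected_time(seek_ms, rotational_ms, transfer_ms_per_kb, size_kb):
    """Toy disk-drive model: estimate service time from the modeled head
    state (seek + rotational latency) plus transfer cost for the request."""
    return seek_ms + rotational_ms + transfer_ms_per_kb * size_kb

def should_throttle(projected_ms, actual_ms, slack=1.5):
    # One plausible policy: actual completion well under the modeled time
    # means the VM exceeds its provisioned disk performance.
    return actual_ms < projected_ms / slack
```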
  • Publication number: 20130254766
    Abstract: The present invention extends to methods, systems, and computer program products for offloading packet processing for networking device virtualization. A host maintains rule set(s) for a virtual machine, and a physical network interface card (NIC) maintains flow table(s) for the virtual machine. The physical NIC receives and processes a network packet associated with the virtual machine. Processing the network packet includes the physical NIC comparing the network packet with the flow table(s) at the physical NIC. When the network packet matches with a flow in the flow table(s) at the physical NIC, the physical NIC performs an action on the network packet based on the matching flow. Alternatively, when the network packet does not match with a flow in the flow table(s) at the physical NIC, the physical NIC passes the network packet to the host partition for processing against the rule set(s).
    Type: Application
    Filed: July 17, 2012
    Publication date: September 26, 2013
    Applicant: Microsoft Corporation
    Inventors: Yue Zuo, Daniel M. Firestone, Albert Gordon Greenberg, Ho Yuen Chau, Yimin Deng, Bryan William Tuttle, Pankaj Garg
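The fast-path/slow-path split described above can be sketched as follows (hypothetical names; a real flow key would also include ports and protocol):

```python
def process_packet(packet, flow_table, host_rules):
    """NIC fast path: act on a flow-table hit without host involvement;
    on a miss, punt to the host partition's rule set(s)."""
    key = (packet["src"], packet["dst"])
    if key in flow_table:
        return flow_table[key]      # offloaded action from the matching flow
    # Slow path: the host evaluates its rule set(s) against the packet.
    for match, action in host_rules:
        if match(packet):
            return action
    return "drop"
```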
  • Publication number: 20130132057
    Abstract: Computerized methods, systems, and computer-storage media for throttling requests from virtual machines (VMs) to a hard-disk drive (HDD) are provided. When a request for disk I/O is received from a VM, a disk-drive model that simulates performance characteristics of the HDD is accessed. During access, the disk-drive model's estimation of HDD parameters and the disk-drive model's estimation of a current state of a disk head of the HDD are gathered. A projected execution time to carry out the request is computed as a function of the estimated HDD parameters and the estimated current state of the disk head. Also, an actual execution time to carry out the request is measured upon allowing the request to pass to the HDD. Using a comparison of the projected execution time and the actual execution time, the traffic of the requests from the VMs is throttled.
    Type: Application
    Filed: November 17, 2011
    Publication date: May 23, 2013
    Applicant: MICROSOFT CORPORATION
    Inventors: YIMIN DENG, HO YUEN CHAU, YUE ZUO, FORREST CURTIS FOLTZ
  • Publication number: 20110225458
    Abstract: Cloud computing platforms having computer-readable media that perform methods to generate debuggable dump files are provided. The cloud computing platform includes server devices running operating system kernels. Optionally, the server may include a hypervisor. The operating system kernel receives a command to generate a debuggable dump file. In response, the operating system estimates the memory required to store the requested memory pages, allocates an appropriately sized buffer, and freezes computation. If a hypervisor is present and its memory pages are requested, the hypervisor freezes its computation. The hypervisor stores its memory pages in the buffer and resumes computation. The operating system kernel stores its pages to the buffer in priority order and resumes its computation. The contents of the buffer are written out as a debuggable dump file.
    Type: Application
    Filed: March 9, 2010
    Publication date: September 15, 2011
    Applicant: MICROSOFT CORPORATION
    Inventors: YUE ZUO, FRANCIS MANOJ DAVID, YIMIN DENG, HO-YUEN CHAU, FORREST CURTIS FOLTZ
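A condensed sketch of the buffering order described above (hypervisor pages first, then kernel pages in priority order); buffer sizing and the freeze/resume steps are elided, and lower priority numbers mean higher priority in this toy version.

```python
def generate_dump(kernel_pages, hypervisor_pages=None):
    """Fill the dump buffer: the hypervisor (if present) stores its pages
    and resumes first, then the kernel stores its pages highest-priority
    first. The returned buffer is what gets written out as the dump file."""
    buffer = []
    if hypervisor_pages is not None:
        buffer.extend(hypervisor_pages)  # hypervisor stores, then resumes
    # Kernel pages are written in priority order (lowest number first here).
    buffer.extend(sorted(kernel_pages, key=lambda p: p["priority"]))
    return buffer
```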
  • Publication number: 20110225459
    Abstract: Cloud computing platforms having computer-readable media that perform methods to generate debuggable dump files are provided. The cloud computing platform includes at least one server having a host virtual machine, guest virtual machine, and hypervisor. The host virtual machine receives a command to generate the debuggable dump file. In response, it suspends all virtual processors executing on the guest virtual machine. The memory pages of the suspended virtual machine are written into a debuggable dump file, and the suspended processors are resumed at an appropriate time.
    Type: Application
    Filed: March 9, 2010
    Publication date: September 15, 2011
    Applicant: MICROSOFT CORPORATION
    Inventors: THOMAS FAHRIG, YUE ZUO, FRANCIS MANOJ DAVID, YIMIN DENG, HO-YUEN CHAU, FORREST CURTIS FOLTZ
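The host-side sequence this abstract describes (suspend all virtual processors, snapshot guest memory into the dump file, resume) in sketch form; the guest layout is invented for illustration.

```python
def dump_guest(guest):
    """Host-side dump: suspend every virtual processor, snapshot the guest's
    memory pages into the dump file, then resume the processors."""
    for vcpu in guest["vcpus"]:
        vcpu["state"] = "suspended"
    dump_file = list(guest["memory"])  # consistent snapshot while suspended
    for vcpu in guest["vcpus"]:
        vcpu["state"] = "running"
    return dump_file
```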
  • Patent number: 7620938
    Abstract: Program execution can be monitored and recorded for later playback. Certain state changes that can be predicted via a virtual processor during playback need not be recorded, so a compressed recording can be stored. To facilitate random access with respect to time during playback, key frames can be stored within the compressed recording. An index mechanism can associate key frames with particular memory addresses. Additionally, a snapshot of values for memory addresses can be used to further facilitate determining the value of a memory address without having to simulate execution. Multiprocessor executions can be supported, and playback can be done on a machine type different from that on which recording took place.
    Type: Grant
    Filed: October 31, 2005
    Date of Patent: November 17, 2009
    Assignee: Microsoft Corporation
    Inventors: Andrew James Edwards, Darek Mihocka, Ho-Yuen Chau, Ronald C. Murray, Sanjay Bhansali, Stuart D. de Jong, Wen-Ke Chen, Kenneth Bryant Pierce
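The compression idea — record only state changes the virtual processor cannot predict, plus periodic key frames for random access in time — can be sketched as follows (the predictor and trace format are invented for illustration):

```python
def record(events, predict, keyframe_every=3):
    """Build a compressed trace: periodic key frames snapshot the state for
    random access; between them, only events the virtual processor's
    `predict` function fails to reproduce are recorded."""
    trace, state = [], 0
    for i, ev in enumerate(events):
        if i % keyframe_every == 0:
            trace.append(("key", i, state))   # full snapshot for seeking
        if ev != predict(state):
            trace.append(("event", i, ev))    # unpredictable: must record
        state = ev
    return trace
```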
  • Publication number: 20070168989
    Abstract: Program execution can be monitored and recorded for later playback. Certain state changes that can be predicted via a virtual processor during playback need not be recorded, so a compressed recording can be stored. To facilitate random access with respect to time during playback, key frames can be stored within the compressed recording. An index mechanism can associate key frames with particular memory addresses. Additionally, a snapshot of values for memory addresses can be used to further facilitate determining the value of a memory address without having to simulate execution. Multiprocessor executions can be supported, and playback can be done on a machine type different from that on which recording took place.
    Type: Application
    Filed: October 31, 2005
    Publication date: July 19, 2007
    Applicant: Microsoft Corporation
    Inventors: Andrew Edwards, Darek Mihocka, Ho-Yuen Chau, Ronald Murray, Sanjay Bhansali, Stuart de Jong, Wen-Ke Chen, Kenneth Pierce
  • Patent number: 7174554
    Abstract: Tools and methods are described herein for discovering race condition errors in a software program. The errors are discovered by deliberately causing a processor executing the test program to switch threads at intervals other than those normally scheduled by an operating system. The thread switching is caused upon occurrence of selected events. The intervals may be selected automatically or with user input. Furthermore, thread switching may be caused during conditions more likely to cause race condition errors. For example, thread switches may be caused between threads that share control of a memory device or while the processor is executing instructions related to synchronization tools (e.g. locks, mutex, etc.).
    Type: Grant
    Filed: December 20, 2002
    Date of Patent: February 6, 2007
    Assignee: Microsoft Corporation
    Inventors: Kenneth Bryant Pierce, Ho-Yuen Chau
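The kind of bug this technique surfaces can be demonstrated with Python generators standing in for threads, where `yield` marks a forced switch point inside a non-atomic increment: switching both threads between their read and write loses an update.

```python
def unsafe_increment(shared):
    """A non-atomic read-modify-write, split at the racy point."""
    tmp = shared["count"]      # read
    yield                      # forced context switch before the write
    shared["count"] = tmp + 1  # write (may be stale)

def run_with_forced_switch(shared):
    """Drive two 'threads' and force a switch at the racy point: both read
    the same value, so one increment is lost."""
    t1, t2 = unsafe_increment(shared), unsafe_increment(shared)
    next(t1)
    next(t2)                   # both threads have now read the same value
    for t in (t1, t2):
        try:
            next(t)            # each performs its (stale) write
        except StopIteration:
            pass
    return shared["count"]
```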
  • Publication number: 20040123185
    Abstract: Tools and methods are described herein for discovering race condition errors in a software program. The errors are discovered by deliberately causing a processor executing the test program to switch threads at intervals other than those normally scheduled by an operating system. The thread switching is caused upon occurrence of selected events. The intervals may be selected automatically or with user input. Furthermore, thread switching may be caused during conditions more likely to cause race condition errors. For example, thread switches may be caused between threads that share control of a memory device or while the processor is executing instructions related to synchronization tools (e.g. locks, mutex, etc.).
    Type: Application
    Filed: December 20, 2002
    Publication date: June 24, 2004
    Applicant: Microsoft Corporation
    Inventors: Kenneth Bryant Pierce, Ho-Yuen Chau