Patents by Inventor Ho-Yuen Chau
Ho-Yuen Chau has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Publication number: 20240169463
Abstract: A computing system including a plurality of processing devices configured to execute a Mixture-of-Experts (MoE) layer included in an MoE model. The processing devices are configured to, in each of a plurality of iterations, receive a respective plurality of input tokens at each of the processing devices. Executing the MoE layer further includes, at each of the processing devices, selecting one or more destination expert sub-models associated with the input tokens. Respective numbers k of expert sub-models selected differ across the iterations. At each of the processing devices, executing the MoE layer further includes conveying the input tokens to the one or more destination expert sub-models. Executing the MoE layer further includes generating one or more respective expert sub-model outputs at the one or more destination expert sub-models. Executing the MoE layer further includes generating and outputting an MoE layer output based on the one or more expert sub-model outputs.
Type: Application
Filed: November 10, 2022
Publication date: May 23, 2024
Applicant: Microsoft Technology Licensing, LLC
Inventors: Yifan Xiong, Changho Hwang, Wei Cui, Ziyue Yang, Ze Liu, Han Hu, Zilong Wang, Rafael Omar Salas, Jithin Jose, Prabhat Ram, Ho-Yuen Chau, Peng Cheng, Fan Yang, Mao Yang, Yongqiang Xiong
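The adaptive top-k routing this abstract describes can be illustrated with a small sketch. The gating function and the names `gate_scores` and `select_experts` are illustrative assumptions for this listing, not details taken from the patent:

```python
import math

def gate_scores(token, num_experts):
    # Toy gating function: deterministic pseudo-logits per expert,
    # softmax-normalized. A real MoE layer uses a learned projection.
    logits = [math.sin(token * (e + 1)) for e in range(num_experts)]
    peak = max(logits)
    exps = [math.exp(l - peak) for l in logits]
    total = sum(exps)
    return [x / total for x in exps]

def select_experts(token, num_experts, k):
    # Choose the k highest-scoring destination expert sub-models.
    scores = gate_scores(token, num_experts)
    ranked = sorted(range(num_experts), key=lambda e: scores[e], reverse=True)
    return ranked[:k]

# The number of selected experts k can differ across iterations,
# as the abstract describes.
for k in (1, 2, 4):
    routes = {tok: select_experts(tok, num_experts=8, k=k) for tok in range(4)}
    assert all(len(dests) == k for dests in routes.values())
```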
-
Publication number: 20240160894
Abstract: A computing system is provided, including a plurality of processing devices configured to execute a Mixture-of-Experts (MoE) layer included in an MoE model. The MoE layer includes a plurality of expert sub-models that each have a respective plurality of parameter values. The MoE layer is configured to be switchable between a data parallel mode and an expert-data-model parallel mode without conveying the respective parameter values of the expert sub-models among the plurality of processing devices.
Type: Application
Filed: November 10, 2022
Publication date: May 16, 2024
Applicant: Microsoft Technology Licensing, LLC
Inventors: Yifan Xiong, Changho Hwang, Wei Cui, Ziyue Yang, Ze Liu, Han Hu, Zilong Wang, Rafael Omar Salas, Jithin Jose, Prabhat Ram, Ho-Yuen Chau, Peng Cheng, Fan Yang, Mao Yang, Yongqiang Xiong
-
Publication number: 20240160906
Abstract: A computing system including a plurality of processing devices configured to execute a Mixture-of-Experts (MoE) layer included in an MoE model. The processing devices are configured to execute the MoE layer at least in part by, during a first collective communication phase between the processing devices, splitting each of a plurality of first input tensors along a first dimension to obtain first output tensors. Executing the MoE layer further includes processing the first output tensors at a respective plurality of expert sub-models to obtain a plurality of second input tensors. Executing the MoE layer further includes, during a second collective communication phase between the processing devices, receiving the second input tensors from the expert sub-models and concatenating the second input tensors along the first dimension to obtain second output tensors. Executing the MoE layer further includes outputting the second output tensors as output of the MoE layer.
Type: Application
Filed: November 10, 2022
Publication date: May 16, 2024
Applicant: Microsoft Technology Licensing, LLC
Inventors: Yifan Xiong, Changho Hwang, Wei Cui, Ziyue Yang, Ze Liu, Han Hu, Zilong Wang, Rafael Omar Salas, Jithin Jose, Prabhat Ram, Ho-Yuen Chau, Peng Cheng, Fan Yang, Mao Yang, Yongqiang Xiong
-
Publication number: 20240086719
Abstract: A computing system including a plurality of processing devices configured to execute a Mixture-of-Experts (MoE) layer. The processing devices are configured to execute the MoE layer at least in part by receiving an input tensor including input tokens. Executing the MoE layer further includes computing a gating function output vector based on the input tensor and computing a sparse encoding of the input tensor and the gating function output vector. The sparse encoding indicates one or more destination expert sub-models. Executing the MoE layer further includes dispatching the input tensor for processing at the one or more destination expert sub-models, and further includes computing an expert output tensor. Executing the MoE layer further includes computing an MoE layer output at least in part by computing a sparse decoding of the expert output tensor. Executing the MoE layer further includes conveying the MoE layer output to an additional computing process.
Type: Application
Filed: May 16, 2023
Publication date: March 14, 2024
Applicant: Microsoft Technology Licensing, LLC
Inventors: Yifan Xiong, Changho Hwang, Wei Cui, Ziyue Yang, Ze Liu, Han Hu, Zilong Wang, Rafael Omar Salas, Jithin Jose, Prabhat Ram, Ho-Yuen Chau, Peng Cheng, Fan Yang, Mao Yang, Yongqiang Xiong
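As a rough illustration of the gate-encode-dispatch-decode flow this abstract outlines, here is a top-1 sketch. The helper name `moe_layer`, the toy experts, and the toy gate are all hypothetical:

```python
def moe_layer(tokens, experts, top1_gate):
    # Sparse "encoding": for each token, record only its destination
    # expert instead of a dense token-by-expert matrix.
    encoding = [(i, top1_gate(tok), tok) for i, tok in enumerate(tokens)]

    # Dispatch: group tokens by destination expert sub-model.
    buckets = {}
    for i, expert_id, tok in encoding:
        buckets.setdefault(expert_id, []).append((i, tok))

    # Each destination expert processes its bucket; the sparse
    # "decoding" scatters expert outputs back to token positions.
    outputs = [None] * len(tokens)
    for expert_id, items in buckets.items():
        for i, tok in items:
            outputs[i] = experts[expert_id](tok)
    return outputs

# Two toy experts and a toy gate (even tokens -> expert 0, odd -> expert 1).
experts = [lambda x: x * 10, lambda x: x * 100]
gate = lambda x: x % 2
print(moe_layer([1, 2, 3, 4], experts, gate))  # [100, 20, 300, 40]
```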
-
Patent number: 11237761
Abstract: The disclosed technologies include functionality for managing Multiple Physical Function NVMe Devices (“MFNDs”) and the physical functions (“PFs”) provided by MFNDs. For example, host devices can discover MFNDs, query the capabilities of MFNDs, and change the operating mode of an MFND between a user mode and a super administrator mode. Hosts can also utilize the disclosed technologies to create and delete individual child PFs on MFNDs. The disclosed technologies also include functionality for managing the settings associated with individual PFs of MFNDs. For example, hosts can query and modify the settings associated with individual child PFs of an MFND. The disclosed technologies also include functionality for managing the QoS provided by individual PFs of an MFND. For example, hosts can also query and modify the QoS provided by individual child PFs of an MFND.
Type: Grant
Filed: February 21, 2020
Date of Patent: February 1, 2022
Assignee: Microsoft Technology Licensing, LLC
Inventors: Lei Kou, Scott Chao-Chueh Lee, Ho-Yuen Chau, Liang Yang, Chin Hwan Park, Yimin Deng
-
Patent number: 11163887
Abstract: A bare metal resource includes a trusted portion and an untrusted portion. The trusted portion includes trusted hardware, an image repository, and a clearance manager. The clearance manager is executable during bootup of the bare metal resource to perform a clearance process on the untrusted portion, including deleting the BIOS in the untrusted portion and loading a trusted BIOS from the image repository on the untrusted hardware, to place the untrusted portion in a trusted state. The bare metal resource may be provisioned to a tenant of a cloud provider after being placed in the trusted state.
Type: Grant
Filed: December 28, 2018
Date of Patent: November 2, 2021
Assignee: Microsoft Technology Licensing, LLC
Inventors: Bryan W. Tuttle, Carlos Jose Cela, Ho-Yuen Chau, Melur K. Raghuraman, Saurabh M. Kulkarni, Yimin Deng
-
Publication number: 20210132860
Abstract: The disclosed technologies include functionality for managing Multiple Physical Function NVMe Devices (“MFNDs”) and the physical functions (“PFs”) provided by MFNDs. For example, host devices can discover MFNDs, query the capabilities of MFNDs, and change the operating mode of an MFND between a user mode and a super administrator mode. Hosts can also utilize the disclosed technologies to create and delete individual child PFs on MFNDs. The disclosed technologies also include functionality for managing the settings associated with individual PFs of MFNDs. For example, hosts can query and modify the settings associated with individual child PFs of an MFND. The disclosed technologies also include functionality for managing the QoS provided by individual PFs of an MFND. For example, hosts can also query and modify the QoS provided by individual child PFs of an MFND.
Type: Application
Filed: February 21, 2020
Publication date: May 6, 2021
Inventors: Lei Kou, Scott Chao-Chueh Lee, Ho-Yuen Chau, Liang Yang, Chin Hwan Park, Yimin Deng
-
Patent number: 10630654
Abstract: Computing systems, devices, and associated methods of managing secure communication using hardware accelerators are disclosed herein. In one embodiment, a method includes receiving messages from a peer computing device via a computer network at an FPGA of a hardware accelerator and examining each of the received messages to determine whether the received messages contain application data. The method can then include forwarding a first subset of the received messages that do not contain application data to the processor for further processing, and processing a second subset of the messages containing application data according to a security protocol without forwarding the second subset to the processor, to reduce a consumption of bandwidth across the communications bridge.
Type: Grant
Filed: June 22, 2017
Date of Patent: April 21, 2020
Assignee: Microsoft Technology Licensing, LLC
Inventors: Carlos Jose Cela, Ho Yuen Chau, Bryan William Tuttle
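The split this abstract describes, forwarding only non-application-data messages to the processor, might look like the outline below. The record-type constants follow TLS conventions, and the function name is illustrative, not from the patent:

```python
HANDSHAKE = 22          # TLS record content types, used here for illustration
APPLICATION_DATA = 23

def split_messages(messages):
    # First subset: no application data -> forwarded to the processor.
    # Second subset: application data -> processed on the accelerator,
    # reducing bandwidth consumed across the communications bridge.
    to_processor, on_accelerator = [], []
    for record_type, payload in messages:
        if record_type == APPLICATION_DATA:
            on_accelerator.append(payload)
        else:
            to_processor.append(payload)
    return to_processor, on_accelerator

records = [(HANDSHAKE, b"client hello"),
           (APPLICATION_DATA, b"payload 1"),
           (APPLICATION_DATA, b"payload 2")]
cpu_bound, fpga_bound = split_messages(records)
assert cpu_bound == [b"client hello"]
assert fpga_bound == [b"payload 1", b"payload 2"]
```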
-
Publication number: 20190251266
Abstract: A bare metal resource includes a trusted portion and an untrusted portion. The trusted portion includes trusted hardware, an image repository, and a clearance manager. The clearance manager is executable during bootup of the bare metal resource to perform a clearance process on the untrusted portion, including deleting the BIOS in the untrusted portion and loading a trusted BIOS from the image repository on the untrusted hardware, to place the untrusted portion in a trusted state. The bare metal resource may be provisioned to a tenant of a cloud provider after being placed in the trusted state.
Type: Application
Filed: December 28, 2018
Publication date: August 15, 2019
Applicant: Microsoft Technology Licensing, LLC
Inventors: Bryan W. Tuttle, Carlos Jose Cela, Ho-Yuen Chau, Melur K. Raghuraman, Saurabh M. Kulkarni, Yimin Deng
-
Publication number: 20180278588
Abstract: Computing systems, devices, and associated methods of managing secure communication using hardware accelerators are disclosed herein. In one embodiment, a method includes receiving messages from a peer computing device via a computer network at an FPGA of a hardware accelerator and examining each of the received messages to determine whether the received messages contain application data. The method can then include forwarding a first subset of the received messages that do not contain application data to the processor for further processing, and processing a second subset of the messages containing application data according to a security protocol without forwarding the second subset to the processor, to reduce a consumption of bandwidth across the communications bridge.
Type: Application
Filed: June 22, 2017
Publication date: September 27, 2018
Inventors: Carlos Jose Cela, Ho Yuen Chau, Bryan William Tuttle
-
Patent number: 9395920
Abstract: Computerized methods, systems, and computer-storage media for throttling requests from virtual machines (VMs) to a hard-disk drive (HDD) are provided. When a request for disk I/O is received from a VM, a disk-drive model that simulates performance characteristics of the HDD is accessed. During access, the disk-drive model's estimation of HDD parameters and the disk-drive model's estimation of a current state of a disk head of the HDD are gathered. A projected execution time to carry out the request is computed as a function of the estimated HDD parameters and the estimated current state of the disk head. Also, an actual execution time to carry out the request is measured upon allowing the request to pass to the HDD. Using a comparison of the projected execution time and the actual execution time, the traffic of the requests from the VMs is throttled.
Type: Grant
Filed: November 17, 2011
Date of Patent: July 19, 2016
Assignee: Microsoft Technology Licensing, LLC
Inventors: Yimin Deng, Ho Yuen Chau, Yue Zuo, Forrest Curtis Foltz
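In outline, the throttling decision compares a model-projected service time with the measured one. The drive-model constants and the threshold policy below are invented for illustration; the patent's actual model and policy may differ:

```python
def projected_time_ms(seek_tracks, bytes_requested,
                      seek_ms_per_track=0.01, transfer_ms_per_kb=0.008):
    # Toy disk-drive model: seek cost from the estimated head position
    # plus transfer cost. A real model would also account for rotational
    # latency, on-disk caching, and zone-dependent transfer rates.
    return seek_tracks * seek_ms_per_track + (bytes_requested / 1024) * transfer_ms_per_kb

def should_throttle(projected_ms, actual_ms, slack=1.25):
    # One plausible policy: if requests complete much faster than the
    # model projects, the VM is drawing more than its modeled share of
    # the drive, so its request traffic is throttled.
    return actual_ms < projected_ms / slack

estimate = projected_time_ms(seek_tracks=100, bytes_requested=4096)
assert should_throttle(estimate, actual_ms=0.2)       # far ahead of the model
assert not should_throttle(estimate, actual_ms=1.0)   # roughly on pace
```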
-
Publication number: 20130254766
Abstract: The present invention extends to methods, systems, and computer program products for offloading packet processing for networking device virtualization. A host maintains rule set(s) for a virtual machine, and a physical network interface card (NIC) maintains flow table(s) for the virtual machine. The physical NIC receives and processes a network packet associated with the virtual machine. Processing the network packet includes the physical NIC comparing the network packet with the flow table(s) at the physical NIC. When the network packet matches with a flow in the flow table(s) at the physical NIC, the physical NIC performs an action on the network packet based on the matching flow. Alternatively, when the network packet does not match with a flow in the flow table(s) at the physical NIC, the physical NIC passes the network packet to the host partition for processing against the rule set(s).
Type: Application
Filed: July 17, 2012
Publication date: September 26, 2013
Applicant: Microsoft Corporation
Inventors: Yue Zuo, Daniel M. Firestone, Albert Gordon Greenberg, Ho Yuen Chau, Yimin Deng, Bryan William Tuttle, Pankaj Garg
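A schematic of this fast-path/slow-path split follows. The key fields, actions, and flow-installation step are hypothetical simplifications; a real NIC matches on full packet headers:

```python
def process_packet(packet, flow_table, host_rules):
    # Fast path: the physical NIC matches the packet against its flow table.
    key = (packet["src"], packet["dst"], packet["port"])
    action = flow_table.get(key)
    if action is not None:
        return action                      # handled entirely on the NIC
    # Slow path: no flow matched, so the host evaluates its rule set(s).
    for matches, rule_action in host_rules:
        if matches(packet):
            flow_table[key] = rule_action  # install flow for later packets
            return rule_action
    return "drop"

flow_table = {("10.0.0.1", "10.0.0.2", 80): "forward"}
host_rules = [(lambda p: p["port"] == 443, "forward-encrypted")]

assert process_packet({"src": "10.0.0.1", "dst": "10.0.0.2", "port": 80},
                      flow_table, host_rules) == "forward"
assert process_packet({"src": "10.0.0.1", "dst": "10.0.0.3", "port": 443},
                      flow_table, host_rules) == "forward-encrypted"
assert ("10.0.0.1", "10.0.0.3", 443) in flow_table  # now on the fast path
```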
-
Publication number: 20130132057
Abstract: Computerized methods, systems, and computer-storage media for throttling requests from virtual machines (VMs) to a hard-disk drive (HDD) are provided. When a request for disk I/O is received from a VM, a disk-drive model that simulates performance characteristics of the HDD is accessed. During access, the disk-drive model's estimation of HDD parameters and the disk-drive model's estimation of a current state of a disk head of the HDD are gathered. A projected execution time to carry out the request is computed as a function of the estimated HDD parameters and the estimated current state of the disk head. Also, an actual execution time to carry out the request is measured upon allowing the request to pass to the HDD. Using a comparison of the projected execution time and the actual execution time, the traffic of the requests from the VMs is throttled.
Type: Application
Filed: November 17, 2011
Publication date: May 23, 2013
Applicant: Microsoft Corporation
Inventors: Yimin Deng, Ho Yuen Chau, Yue Zuo, Forrest Curtis Foltz
-
Publication number: 20110225458
Abstract: Cloud computing platforms having computer-readable media that perform methods to generate debuggable dump files are provided. The cloud computing platform includes server devices running operating system kernels. Optionally, a server may include a hypervisor. The operating system kernel receives a command to generate a debuggable dump file. In response, the operating system kernel estimates the memory required to store the requested memory pages, allocates an appropriately sized buffer, and freezes computation. If a hypervisor is present and its memory pages are requested, the hypervisor freezes its computation, stores its memory pages in the buffer, and resumes computation. The operating system kernel stores its pages to the buffer in priority order and resumes its computation. The contents of the buffer are written out as a debuggable dump file.
Type: Application
Filed: March 9, 2010
Publication date: September 15, 2011
Applicant: Microsoft Corporation
Inventors: Yue Zuo, Francis Manoj David, Yimin Deng, Ho-Yuen Chau, Forrest Curtis Foltz
-
Publication number: 20110225459
Abstract: Cloud computing platforms having computer-readable media that perform methods to generate debuggable dump files are provided. The cloud computing platform includes at least one server having a host virtual machine, guest virtual machine, and hypervisor. The host virtual machine receives a command to generate the debuggable dump file. In response, it suspends all virtual processors executing on the guest virtual machine. The memory pages of the suspended virtual machine are written into a debuggable dump file, and the suspended processors are resumed at an appropriate time.
Type: Application
Filed: March 9, 2010
Publication date: September 15, 2011
Applicant: Microsoft Corporation
Inventors: Thomas Fahrig, Yue Zuo, Francis Manoj David, Yimin Deng, Ho-Yuen Chau, Forrest Curtis Foltz
-
Patent number: 7620938
Abstract: Program execution can be monitored and recorded for later playback. Certain state changes that can be predicted via a virtual processor during playback need not be recorded, so a compressed recording can be stored. To facilitate random access with respect to time during playback, key frames can be stored within the compressed recording. An index mechanism can associate key frames with particular memory addresses. Additionally, a snapshot of values for memory addresses can be used to further facilitate determining the value of a memory address without having to simulate execution. Multiprocessor executions can be supported, and playback can be done on a machine type different from that on which recording took place.
Type: Grant
Filed: October 31, 2005
Date of Patent: November 17, 2009
Assignee: Microsoft Corporation
Inventors: Andrew James Edwards, Darek Mihocka, Ho-Yuen Chau, Ronald C. Murray, Sanjay Bhansali, Stuart D. de Jong, Wen-Ke Chen, Kenneth Bryant Pierce
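Key frames support seeking during playback roughly as sketched below: restore the latest snapshot at or before the requested time, then deterministically re-execute forward. The snapshot layout and helper names are illustrative, not from the patent:

```python
import bisect

def nearest_key_frame(key_frame_times, target_time):
    # Latest key frame at or before the requested time; playback then
    # simulates forward from that snapshot instead of from time zero.
    i = bisect.bisect_right(key_frame_times, target_time) - 1
    return key_frame_times[i] if i >= 0 else None

def replay_to(key_frames, target_time, step):
    # key_frames maps a time to a state snapshot; step() deterministically
    # advances the simulated (virtual-processor) state by one time unit.
    start = nearest_key_frame(sorted(key_frames), target_time)
    state = dict(key_frames[start])    # restore the snapshot
    for _ in range(start, target_time):
        state = step(state)            # deterministic re-execution
    return state

key_frames = {0: {"pc": 0}, 10: {"pc": 10}}
advance = lambda s: {"pc": s["pc"] + 1}
assert replay_to(key_frames, 13, advance) == {"pc": 13}  # seeks from frame 10
```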
-
Publication number: 20070168989
Abstract: Program execution can be monitored and recorded for later playback. Certain state changes that can be predicted via a virtual processor during playback need not be recorded, so a compressed recording can be stored. To facilitate random access with respect to time during playback, key frames can be stored within the compressed recording. An index mechanism can associate key frames with particular memory addresses. Additionally, a snapshot of values for memory addresses can be used to further facilitate determining the value of a memory address without having to simulate execution. Multiprocessor executions can be supported, and playback can be done on a machine type different from that on which recording took place.
Type: Application
Filed: October 31, 2005
Publication date: July 19, 2007
Applicant: Microsoft Corporation
Inventors: Andrew Edwards, Darek Mihocka, Ho-Yuen Chau, Ronald Murray, Sanjay Bhansali, Stuart de Jong, Wen-Ke Chen, Kenneth Pierce
-
Patent number: 7174554
Abstract: Tools and methods are described herein for discovering race condition errors in a software program. The errors are discovered by deliberately causing a processor executing the test program to switch threads at intervals other than those normally scheduled by an operating system. The thread switching is caused upon occurrence of selected events. The intervals may be selected automatically or with user input. Furthermore, thread switching may be caused during conditions more likely to cause race condition errors. For example, thread switches may be caused between threads that share control of a memory device or while the processor is executing instructions related to synchronization tools (e.g., locks and mutexes).
Type: Grant
Filed: December 20, 2002
Date of Patent: February 6, 2007
Assignee: Microsoft Corporation
Inventors: Kenneth Bryant Pierce, Ho-Yuen Chau
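A deterministic-interleaving sketch of the idea, using generators as cooperative threads with explicit switch points. The scheduler and the lost-update example are illustrative, not the patented tooling:

```python
def run_interleaved(workers, switch_after):
    # Drive cooperative "threads" (generators) and force a context
    # switch after every `switch_after` steps instead of letting the
    # OS scheduler decide. Varying `switch_after` (or switching at
    # selected events) explores interleavings that expose races.
    current, steps = 0, 0
    while workers:
        try:
            next(workers[current])
            steps += 1
            if steps >= switch_after:
                steps = 0
                current = (current + 1) % len(workers)
        except StopIteration:
            workers.pop(current)
            if workers:
                current %= len(workers)

shared = {"n": 0}

def worker():
    # Classic lost-update race: the read and the write are separated
    # by a possible switch point.
    for _ in range(3):
        tmp = shared["n"]
        yield
        shared["n"] = tmp + 1

run_interleaved([worker(), worker()], switch_after=1)
assert shared["n"] < 6  # some updates were lost: the race was exposed
```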
-
Publication number: 20040123185
Abstract: Tools and methods are described herein for discovering race condition errors in a software program. The errors are discovered by deliberately causing a processor executing the test program to switch threads at intervals other than those normally scheduled by an operating system. The thread switching is caused upon occurrence of selected events. The intervals may be selected automatically or with user input. Furthermore, thread switching may be caused during conditions more likely to cause race condition errors. For example, thread switches may be caused between threads that share control of a memory device or while the processor is executing instructions related to synchronization tools (e.g., locks and mutexes).
Type: Application
Filed: December 20, 2002
Publication date: June 24, 2004
Applicant: Microsoft Corporation
Inventors: Kenneth Bryant Pierce, Ho-Yuen Chau