Patents by Inventor Ho-Yuen Chau
Ho-Yuen Chau has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Publication number: 20240169463
Abstract: A computing system including a plurality of processing devices configured to execute a Mixture-of-Experts (MoE) layer included in an MoE model. The processing devices are configured to, in each of a plurality of iterations, at each of the processing devices, receive a respective plurality of input tokens. Executing the MoE layer further includes, at each of the processing devices, selecting one or more destination expert sub-models associated with the input tokens. Respective numbers k of expert sub-models selected differ across the iterations. At each of the processing devices, executing the MoE layer further includes conveying the input tokens to the one or more destination expert sub-models. Executing the MoE layer further includes generating one or more respective expert sub-model outputs at the one or more destination expert sub-models. Executing the MoE layer further includes generating and outputting an MoE layer output based on the one or more expert sub-model outputs.
Type: Application
Filed: November 10, 2022
Publication date: May 23, 2024
Applicant: Microsoft Technology Licensing, LLC
Inventors: Yifan XIONG, Changho HWANG, Wei CUI, Ziyue YANG, Ze LIU, Han HU, Zilong WANG, Rafael Omar SALAS, Jithin JOSE, Prabhat RAM, Ho-Yuen CHAU, Peng CHENG, Fan YANG, Mao YANG, Yongqiang XIONG
-
Publication number: 20240160894
Abstract: A computing system is provided, including a plurality of processing devices configured to execute a Mixture-of-Experts (MoE) layer included in an MoE model. The MoE layer includes a plurality of expert sub-models that each have a respective plurality of parameter values. The MoE layer is configured to be switchable between a data parallel mode and an expert-data-model parallel mode without conveying the respective parameter values of the expert sub-models among the plurality of processing devices.
Type: Application
Filed: November 10, 2022
Publication date: May 16, 2024
Applicant: Microsoft Technology Licensing, LLC
Inventors: Yifan XIONG, Changho HWANG, Wei CUI, Ziyue YANG, Ze LIU, Han HU, Zilong WANG, Rafael Omar SALAS, Jithin JOSE, Prabhat RAM, Ho-Yuen CHAU, Peng CHENG, Fan YANG, Mao YANG, Yongqiang XIONG
-
Publication number: 20240160906
Abstract: A computing system including a plurality of processing devices configured to execute a Mixture-of-Experts (MoE) layer included in an MoE model. The processing devices are configured to execute the MoE layer at least in part by, during a first collective communication phase between the processing devices, splitting each of a plurality of first input tensors along a first dimension to obtain first output tensors. Executing the MoE layer further includes processing the first output tensors at a respective plurality of expert sub-models to obtain a plurality of second input tensors. Executing the MoE layer further includes, during a second collective communication phase between the processing devices, receiving the second input tensors from the expert sub-models and concatenating the second input tensors along the first dimension to obtain second output tensors. Executing the MoE layer further includes outputting the second output tensors as output of the MoE layer.
Type: Application
Filed: November 10, 2022
Publication date: May 16, 2024
Applicant: Microsoft Technology Licensing, LLC
Inventors: Yifan XIONG, Changho HWANG, Wei CUI, Ziyue YANG, Ze LIU, Han HU, Zilong WANG, Rafael Omar SALAS, Jithin JOSE, Prabhat RAM, Ho-Yuen CHAU, Peng CHENG, Fan YANG, Mao YANG, Yongqiang XIONG
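As an illustration only, the split, process, and concatenate sequence described in the abstract above can be simulated in a single process. This is a minimal sketch, not the claimed implementation: the names `split_rows` and `moe_layer` are hypothetical, tensors are modeled as plain lists of rows, the "first dimension" is assumed to be the row dimension, and the two collective communication phases are replaced by ordinary list operations.

```python
def split_rows(tensor, parts):
    """Split a list of rows into `parts` equal chunks along dimension 0."""
    if len(tensor) % parts:
        raise ValueError("row count must be divisible by the number of experts")
    size = len(tensor) // parts
    return [tensor[i * size:(i + 1) * size] for i in range(parts)]


def moe_layer(device_inputs, experts):
    """Simulate both communication phases for every device in one process."""
    n = len(experts)
    outputs = []
    for tensor in device_inputs:
        # Phase 1 (assumed): split along dimension 0; chunk e goes to expert e.
        chunks = split_rows(tensor, n)
        # Each expert transforms the rows it received.
        processed = [[experts[e](row) for row in chunks[e]] for e in range(n)]
        # Phase 2 (assumed): collect expert results, concatenate along dimension 0.
        outputs.append([row for chunk in processed for row in chunk])
    return outputs


# Two toy "experts" and two toy "devices", each holding a 2x2 input tensor.
double = lambda row: [2 * v for v in row]
negate = lambda row: [-v for v in row]
result = moe_layer([[[1, 2], [3, 4]], [[5, 6], [7, 8]]], [double, negate])
```

Because each tensor is split and re-joined along the same dimension, every output has the same shape as its input, which is the property the abstract's two phases rely on.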
-
Publication number: 20240086719
Abstract: A computing system including a plurality of processing devices configured to execute a Mixture-of-Experts (MoE) layer. The processing devices are configured to execute the MoE layer at least in part by receiving an input tensor including input tokens. Executing the MoE layer further includes computing a gating function output vector based on the input tensor and computing a sparse encoding of the input tensor and the gating function output vector. The sparse encoding indicates one or more destination expert sub-models. Executing the MoE layer further includes dispatching the input tensor for processing at the one or more destination expert sub-models, and further includes computing an expert output tensor. Executing the MoE layer further includes computing an MoE layer output at least in part by computing a sparse decoding of the expert output tensor. Executing the MoE layer further includes conveying the MoE layer output to an additional computing process.
Type: Application
Filed: May 16, 2023
Publication date: March 14, 2024
Applicant: Microsoft Technology Licensing, LLC
Inventors: Yifan XIONG, Changho HWANG, Wei CUI, Ziyue YANG, Ze LIU, Han HU, Zilong WANG, Rafael Omar SALAS, Jithin JOSE, Prabhat RAM, Ho-Yuen CHAU, Peng CHENG, Fan YANG, Mao YANG, Yongqiang XIONG
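The gate, encode, dispatch, and decode steps named in the abstract above might look roughly like the following. This is a hedged sketch under several assumptions that the abstract does not state: the gating function is a softmax over dot products, routing is top-1, the sparse encoding is a per-expert list of `(token_index, gate_probability)` pairs, and decoding scatters each expert output back to its token position scaled by the gate probability. All function names are hypothetical.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def gate(token, gate_weights):
    """Gating output vector: one probability per expert (assumed softmax gate)."""
    logits = [sum(t * w for t, w in zip(token, col)) for col in gate_weights]
    return softmax(logits)


def sparse_encode(tokens, gate_weights):
    """For each expert, list the (token_index, gate_probability) pairs routed to it."""
    routes = [[] for _ in gate_weights]
    for i, tok in enumerate(tokens):
        probs = gate(tok, gate_weights)
        best = max(range(len(probs)), key=probs.__getitem__)  # top-1 routing
        routes[best].append((i, probs[best]))
    return routes


def moe_forward(tokens, gate_weights, experts):
    """Dispatch tokens by the sparse encoding, then decode back to token order."""
    routes = sparse_encode(tokens, gate_weights)
    output = [None] * len(tokens)
    for e, entries in enumerate(routes):
        for i, p in entries:
            y = experts[e](tokens[i])       # expert output for this token
            output[i] = [p * v for v in y]  # sparse decode: scatter and scale
    return output


# Toy setup: two tokens, two experts, one gate weight vector per expert.
tokens = [[1.0, 0.0], [0.0, 1.0]]
gate_weights = [[5.0, 0.0], [0.0, 5.0]]
experts = [lambda t: [2 * v for v in t], lambda t: [-v for v in t]]
routes = sparse_encode(tokens, gate_weights)
out = moe_forward(tokens, gate_weights, experts)
```

The encoding stores only the selected expert per token rather than a dense token-by-expert matrix, which is the sense in which it is "sparse" in this sketch.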
-
Patent number: 11237761
Abstract: The disclosed technologies include functionality for managing Multiple Physical Function NVMe Devices (“MFNDs”) and the physical functions (“PFs”) provided by MFNDs. For example, host devices can discover MFNDs, query the capabilities of MFNDs, and change the operating mode of an MFND between a user mode and a super administrator mode. Hosts can also utilize the disclosed technologies to create and delete individual child PFs on MFNDs. The disclosed technologies also include functionality for managing the settings associated with individual PFs of MFNDs. For example, hosts can query and modify the settings associated with individual child PFs of an MFND. The disclosed technologies also include functionality for managing the QoS provided by individual PFs of an MFND. For example, hosts can also query and modify the QoS provided by individual child PFs of an MFND.
Type: Grant
Filed: February 21, 2020
Date of Patent: February 1, 2022
Assignee: Microsoft Technology Licensing, LLC
Inventors: Lei Kou, Scott Chao-Chueh Lee, Ho-Yuen Chau, Liang Yang, Chin Hwan Park, Yimin Deng
-
Patent number: 11163887
Abstract: A bare metal resource includes a trusted portion and an untrusted portion. The trusted portion includes trusted hardware, an image repository, and a clearance manager. The clearance manager is executable during bootup of the bare metal resource to perform a clearance process on the untrusted portion, including deleting the BIOS in the untrusted portion and loading a trusted BIOS from the image repository on the untrusted hardware, to place the untrusted portion in a trusted state. The bare metal resource may be provisioned to a tenant of a cloud provider after being placed in the trusted state.
Type: Grant
Filed: December 28, 2018
Date of Patent: November 2, 2021
Assignee: MICROSOFT TECHNOLOGY LICENSING, LLC
Inventors: Bryan W. Tuttle, Carlos Jose Cela, Ho-Yuen Chau, Melur K. Raghuraman, Saurabh M. Kulkarni, Yimin Deng
-
Publication number: 20210132860
Abstract: The disclosed technologies include functionality for managing Multiple Physical Function NVMe Devices (“MFNDs”) and the physical functions (“PFs”) provided by MFNDs. For example, host devices can discover MFNDs, query the capabilities of MFNDs, and change the operating mode of an MFND between a user mode and a super administrator mode. Hosts can also utilize the disclosed technologies to create and delete individual child PFs on MFNDs. The disclosed technologies also include functionality for managing the settings associated with individual PFs of MFNDs. For example, hosts can query and modify the settings associated with individual child PFs of an MFND. The disclosed technologies also include functionality for managing the QoS provided by individual PFs of an MFND. For example, hosts can also query and modify the QoS provided by individual child PFs of an MFND.
Type: Application
Filed: February 21, 2020
Publication date: May 6, 2021
Inventors: Lei KOU, Scott Chao-Chueh LEE, Ho-Yuen CHAU, Liang YANG, Chin Hwan PARK, Yimin DENG
-
Patent number: 10630654
Abstract: Computing systems, devices, and associated methods of managing secure communication using hardware accelerators are disclosed herein. In one embodiment, a method includes receiving messages from a peer computing device via a computer network at an FPGA of a hardware accelerator and examining each of the received messages to determine whether the received messages contain application data. The method can then include forwarding a first subset of the received messages that do not contain application data to the processor for further processing and processing a second subset of the messages containing application data according to a security protocol without forwarding the second subset to the processor to reduce a consumption of bandwidth across the communications bridge.
Type: Grant
Filed: June 22, 2017
Date of Patent: April 21, 2020
Assignee: Microsoft Technology Licensing, LLC
Inventors: Carlos Jose Cela, Ho Yuen Chau, Bryan William Tuttle
-
Publication number: 20190251266
Abstract: A bare metal resource includes a trusted portion and an untrusted portion. The trusted portion includes trusted hardware, an image repository, and a clearance manager. The clearance manager is executable during bootup of the bare metal resource to perform a clearance process on the untrusted portion, including deleting the BIOS in the untrusted portion and loading a trusted BIOS from the image repository on the untrusted hardware, to place the untrusted portion in a trusted state. The bare metal resource may be provisioned to a tenant of a cloud provider after being placed in the trusted state.
Type: Application
Filed: December 28, 2018
Publication date: August 15, 2019
Applicant: Microsoft Technology Licensing, LLC
Inventors: Bryan W. TUTTLE, Carlos Jose CELA, Ho-Yuen CHAU, Melur K. RAGHURAMAN, Saurabh M. KULKARNI, Yimin DENG
-
Publication number: 20180278588
Abstract: Computing systems, devices, and associated methods of managing secure communication using hardware accelerators are disclosed herein. In one embodiment, a method includes receiving messages from a peer computing device via a computer network at an FPGA of a hardware accelerator and examining each of the received messages to determine whether the received messages contain application data. The method can then include forwarding a first subset of the received messages that do not contain application data to the processor for further processing and processing a second subset of the messages containing application data according to a security protocol without forwarding the second subset to the processor to reduce a consumption of bandwidth across the communications bridge.
Type: Application
Filed: June 22, 2017
Publication date: September 27, 2018
Inventors: Carlos Jose Cela, Ho Yuen Chau, Bryan William Tuttle
-
Patent number: 9395920
Abstract: Computerized methods, systems, and computer-storage media for throttling requests from virtual machines (VMs) to a hard-disk drive (HDD) are provided. When a request for disk I/O is received from a VM, a disk-drive model that simulates performance characteristics of the HDD is accessed. During access, the disk-drive model's estimation of HDD parameters and the disk-drive model's estimation of a current state of a disk head of the HDD are gathered. A projected execution time to carry out the request is computed as a function of the estimated HDD parameters and the estimated current state of the disk head. Also, an actual execution time to carry out the request is measured upon allowing the request to pass to the HDD. Using a comparison of the projected execution time and the actual execution time, the traffic of the requests from the VMs is throttled.
Type: Grant
Filed: November 17, 2011
Date of Patent: July 19, 2016
Assignee: MICROSOFT TECHNOLOGY LICENSING, LLC
Inventors: Yimin Deng, Ho Yuen Chau, Yue Zuo, Forrest Curtis Foltz
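To make the projected-versus-actual comparison in the abstract above concrete, here is a small sketch. Everything specific in it is an assumption for illustration: the `DiskModel` class, its linear seek-plus-transfer cost formula, the `slack` factor, and the policy of delaying the next request by the measured excess are all hypothetical, since the abstract does not say how the model is built or how the comparison is turned into throttling.

```python
from dataclasses import dataclass


@dataclass
class DiskModel:
    """Toy disk-drive model (assumed linear costs, not from the patent)."""
    seek_ms_per_track: float   # estimated HDD parameter: seek cost per track
    transfer_ms_per_kb: float  # estimated HDD parameter: transfer cost per KB
    head_track: int = 0        # estimated current position of the disk head

    def projected_ms(self, track, size_kb):
        """Projected execution time from the parameters and head state."""
        seek = abs(track - self.head_track) * self.seek_ms_per_track
        return seek + size_kb * self.transfer_ms_per_kb

    def complete(self, track):
        """Update the estimated head position after a request finishes."""
        self.head_track = track


def throttle_delay_ms(projected_ms, actual_ms, slack=1.5):
    """Hypothetical policy: delay the next request by the time the actual
    execution exceeded slack * projected; no delay if the disk kept up."""
    return max(0.0, actual_ms - slack * projected_ms)


model = DiskModel(seek_ms_per_track=0.01, transfer_ms_per_kb=0.02)
projected = model.projected_ms(track=1000, size_kb=64)
delay = throttle_delay_ms(projected, actual_ms=30.0)
model.complete(1000)
```

In this sketch a request that finishes much later than the model predicted is treated as a sign of contention, so subsequent requests from the VMs are held back for a while.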
-
Publication number: 20130254766
Abstract: The present invention extends to methods, systems, and computer program products for offloading packet processing for networking device virtualization. A host maintains rule set(s) for a virtual machine, and a physical network interface card (NIC) maintains flow table(s) for the virtual machine. The physical NIC receives and processes a network packet associated with the virtual machine. Processing the network packet includes the physical NIC comparing the network packet with the flow table(s) at the physical NIC. When the network packet matches with a flow in the flow table(s) at the physical NIC, the physical NIC performs an action on the network packet based on the matching flow. Alternatively, when the network packet does not match with a flow in the flow table(s) at the physical NIC, the physical NIC passes the network packet to the host partition for processing against the rule set(s).
Type: Application
Filed: July 17, 2012
Publication date: September 26, 2013
Applicant: Microsoft Corporation
Inventors: Yue Zuo, Daniel M. Firestone, Albert Gordon Greenberg, Ho Yuen Chau, Yimin Deng, Bryan William Tuttle, Pankaj Garg
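The fast-path and slow-path split described in the abstract above can be sketched as follows. The classes `Packet`, `Host`, and `Nic`, the three-field flow key, and the string actions are hypothetical stand-ins chosen for illustration; the abstract does not specify how flows are keyed or what actions exist.

```python
from typing import NamedTuple


class Packet(NamedTuple):
    src: str
    dst: str
    dst_port: int


def flow_key(pkt):
    """Assumed flow identity: source, destination, and destination port."""
    return (pkt.src, pkt.dst, pkt.dst_port)


class Host:
    """Host partition holding the full rule set for a virtual machine."""

    def __init__(self, rules):
        self.rules = rules  # ordered list of (predicate, action) pairs

    def process(self, pkt):
        for predicate, action in self.rules:
            if predicate(pkt):
                return action
        return "drop"  # assumed default when no rule matches


class Nic:
    """Physical NIC holding a flow table for the same virtual machine."""

    def __init__(self, host):
        self.host = host
        self.flow_table = {}  # flow key -> action

    def receive(self, pkt):
        action = self.flow_table.get(flow_key(pkt))
        if action is not None:
            return ("nic", action)                # match: NIC acts directly
        return ("host", self.host.process(pkt))  # miss: pass to the host


host = Host(rules=[(lambda p: p.dst_port == 443, "allow")])
nic = Nic(host)
nic.flow_table[("10.0.0.1", "10.0.0.2", 80)] = "allow"
hit = nic.receive(Packet("10.0.0.1", "10.0.0.2", 80))
miss = nic.receive(Packet("10.0.0.1", "10.0.0.2", 443))
```

Packets that match a flow never reach the host, which is where the offload benefit comes from; only misses pay the cost of full rule evaluation.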
-
Publication number: 20130132057
Abstract: Computerized methods, systems, and computer-storage media for throttling requests from virtual machines (VMs) to a hard-disk drive (HDD) are provided. When a request for disk I/O is received from a VM, a disk-drive model that simulates performance characteristics of the HDD is accessed. During access, the disk-drive model's estimation of HDD parameters and the disk-drive model's estimation of a current state of a disk head of the HDD are gathered. A projected execution time to carry out the request is computed as a function of the estimated HDD parameters and the estimated current state of the disk head. Also, an actual execution time to carry out the request is measured upon allowing the request to pass to the HDD. Using a comparison of the projected execution time and the actual execution time, the traffic of the requests from the VMs is throttled.
Type: Application
Filed: November 17, 2011
Publication date: May 23, 2013
Applicant: MICROSOFT CORPORATION
Inventors: YIMIN DENG, HO YUEN CHAU, YUE ZUO, FORREST CURTIS FOLTZ
-
Publication number: 20110225459
Abstract: Cloud computing platforms having computer-readable media that perform methods to generate debuggable dump files are provided. The cloud computing platform includes at least one server having a host virtual machine, guest virtual machine, and hypervisor. The host virtual machine receives a command to generate the debuggable dump file. In response, it suspends all virtual processors executing on the guest virtual machine. The memory pages of the suspended virtual machine are written into a debuggable dump file, and the suspended processors are resumed at an appropriate time.
Type: Application
Filed: March 9, 2010
Publication date: September 15, 2011
Applicant: MICROSOFT CORPORATION
Inventors: THOMAS FAHRIG, YUE ZUO, FRANCIS MANOJ DAVID, YIMIN DENG, HO-YUEN CHAU, FORREST CURTIS FOLTZ
-
Publication number: 20110225458
Abstract: Cloud computing platforms having computer-readable media that perform methods to generate debuggable dump files are provided. The cloud computing platform includes server devices running operating system kernels. Optionally, the server may include a hypervisor. The operating system kernel receives a command to generate a debuggable dump file. In response, the operating system estimates the memory required to store the requested memory pages, allocates an appropriately sized buffer, and freezes computation. If a hypervisor is present and its memory pages are requested, the hypervisor freezes its computation. The hypervisor stores its memory pages in the buffer and resumes computation. The operating system kernel stores its pages to the buffer in priority order and resumes its computation. The contents of the buffer are written out as a debuggable dump file.
Type: Application
Filed: March 9, 2010
Publication date: September 15, 2011
Applicant: MICROSOFT CORPORATION
Inventors: YUE ZUO, FRANCIS MANOJ DAVID, YIMIN DENG, HO-YUEN CHAU, FORREST CURTIS FOLTZ
-
Patent number: 7620938
Abstract: Program execution can be monitored and recorded for later playback. Certain state changes that can be predicted via a virtual processor during playback need not be recorded, so a compressed recording can be stored. To facilitate random access with respect to time during playback, key frames can be stored within the compressed recording. An index mechanism can associate key frames with particular memory addresses. Additionally, a snapshot of values for memory addresses can be used to further facilitate determining the value of a memory address without having to simulate execution. Multiprocessor executions can be supported, and playback can be done on a machine type different from that on which recording took place.
Type: Grant
Filed: October 31, 2005
Date of Patent: November 17, 2009
Assignee: Microsoft Corporation
Inventors: Andrew James Edwards, Darek Mihocka, Ho-Yuen Chau, Ronald C. Murray, Sanjay Bhansali, Stuart D. de Jong, Wen-Ke Chen, Kenneth Bryant Pierce
-
Publication number: 20070168989
Abstract: Program execution can be monitored and recorded for later playback. Certain state changes that can be predicted via a virtual processor during playback need not be recorded, so a compressed recording can be stored. To facilitate random access with respect to time during playback, key frames can be stored within the compressed recording. An index mechanism can associate key frames with particular memory addresses. Additionally, a snapshot of values for memory addresses can be used to further facilitate determining the value of a memory address without having to simulate execution. Multiprocessor executions can be supported, and playback can be done on a machine type different from that on which recording took place.
Type: Application
Filed: October 31, 2005
Publication date: July 19, 2007
Applicant: Microsoft Corporation
Inventors: Andrew Edwards, Darek Mihocka, Ho-Yuen Chau, Ronald Murray, Sanjay Bhansali, Stuart de Jong, Wen-Ke Chen, Kenneth Pierce
-
Patent number: 7174554
Abstract: Tools and methods are described herein for discovering race condition errors in a software program. The errors are discovered by deliberately causing a processor executing the test program to switch threads at intervals other than normally scheduled by an operating system. The thread switching is caused upon occurrence of selected events. The intervals may be selected automatically or with user input. Furthermore, thread switching may be caused during conditions more likely to cause race condition errors. For example, thread switches may be caused between threads that share control of a memory device or while the processor is executing instructions related to synchronization tools (e.g. locks, mutex, etc.).
Type: Grant
Filed: December 20, 2002
Date of Patent: February 6, 2007
Assignee: Microsoft Corporation
Inventors: Kenneth Bryant Pierce, Ho-Yuen Chau
-
Publication number: 20040123185
Abstract: Tools and methods are described herein for discovering race condition errors in a software program. The errors are discovered by deliberately causing a processor executing the test program to switch threads at intervals other than normally scheduled by an operating system. The thread switching is caused upon occurrence of selected events. The intervals may be selected automatically or with user input. Furthermore, thread switching may be caused during conditions more likely to cause race condition errors. For example, thread switches may be caused between threads that share control of a memory device or while the processor is executing instructions related to synchronization tools (e.g. locks, mutex, etc.).
Type: Application
Filed: December 20, 2002
Publication date: June 24, 2004
Applicant: Microsoft Corporation
Inventors: Kenneth Bryant Pierce, Ho-Yuen Chau