Patents by Inventor Po-An TSAI
Po-An TSAI has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Publication number: 20250094864
Abstract: Machine learning is a process that learns a model from a given dataset, where the model can then be used to make a prediction about new data. In order to reduce the size, computation, and latency of a machine learning model, a compression technique can be employed which includes model sparsification and quantization. To limit the extent to which the quality of the model is impacted when uniformly applying sparsification and quantization to all values of the model, the present disclosure provides for a hybrid sparsification and quantization of the model.
Type: Application
Filed: March 12, 2024
Publication date: March 20, 2025
Inventors: Po-An Tsai, Geonhwa Jeong, Jeffrey Michael Pool
-
Publication number: 20250060938
Abstract: Systems and methods for efficient convolution based on matrix multiply and add (MMA) are described. An example processor having a plurality of processing lanes is configured to perform convolution of a matrix of activation elements and a filter matrix in accordance with a configurable series of instructions including a plurality of MMA instructions and shift instructions while reusing activation elements already loaded to the datapath or associated memory over a plurality of MMA operations. Associated methods are also described.
Type: Application
Filed: August 14, 2023
Publication date: February 20, 2025
Inventors: Jack CHOQUETTE, Po-An TSAI, Alexander L. MINKIN, Manan PATEL, Neal Clayton CRAGO, Daniel STIFFLER, Kefeng DUAN, Yu-Jung CHEN, Jing LI, Qian WANG, Ronny KRASHINSKY, Jun YANG, Feng XIE
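As a rough illustration of the idea in the abstract above, a convolution can be written as a series of multiply-accumulate steps over shifted views of the same activation matrix. The sketch below is a plain-Python model of that arithmetic only, not the patented instruction sequence; the function names `matmul_add` and `conv1d_shift_mma`, the 1D layout, and the list-of-lists matrices are all assumptions made for illustration.

```python
def matmul_add(acc, a, b):
    """Accumulate a @ b into acc (list-of-lists matrices).

    Stands in for a single MMA step; hypothetical helper, not a real ISA call.
    """
    for i, row in enumerate(a):
        for j in range(len(b[0])):
            acc[i][j] += sum(row[k] * b[k][j] for k in range(len(b)))
    return acc


def conv1d_shift_mma(acts, taps):
    """1D convolution as one MMA per filter tap over a shifted activation view.

    acts: positions x in_channels matrix of activations.
    taps: list of in_channels x out_channels matrices, one per filter tap.
    """
    n_out = len(acts) - len(taps) + 1
    out_channels = len(taps[0][0])
    acc = [[0.0] * out_channels for _ in range(n_out)]
    for shift, weights in enumerate(taps):
        # The activations are loaded once; each tap only reads a shifted window.
        window = acts[shift:shift + n_out]
        matmul_add(acc, window, weights)
    return acc


acts = [[1.0], [2.0], [3.0], [4.0]]   # 4 positions, 1 input channel
taps = [[[1.0]], [[10.0]]]            # 2 taps, 1x1 weight matrix each
result = conv1d_shift_mma(acts, taps)
```

Each output row here is `acts[i] * 1 + acts[i + 1] * 10`, so the two taps reuse the same loaded activations at offsets 0 and 1.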
-
Patent number: 12196406
Abstract: A base is configured for a bracket. The base includes a hollow body, a plurality of supporting branches, and an illuminating module. The hollow body is connected to the bracket and has a bottom part and a first sidewall. The bottom part has an open hole. The first sidewall has a transparent structure. The plurality of supporting branches is disposed around the hollow body to lift the hollow body. The illuminating module is disposed in the hollow body and includes a sleeve and a base plate. The sleeve has a second sidewall, a first end, and a second end opposite to the first end. The second sidewall has an opening. The position of the opening is corresponding to the transparent structure. The base plate is disposed on the first end. The base plate is provided with a light source. The light source projects light beams toward the second end.
Type: Grant
Filed: December 23, 2022
Date of Patent: January 14, 2025
Assignee: ASUSTEK COMPUTER INC.
Inventors: Kai Chieh Hsu, Chih-Wei Chuang, Yaw-Huei Chiou, Peng Chao Wang, Po-An Tsai, Hao-Chun Lai
-
Publication number: 20240152407
Abstract: Apparatuses, systems, and techniques to determine a configuration based at least in part on data stored by at least one data structure of a workload at runtime, and transform the workload into a sparse workload based at least in part on the configuration. In at least one embodiment, one or more sparse workloads (e.g., one or more sparse neural networks) are generated based at least in part on, for example, one or more workloads (e.g., one or more neural networks).
Type: Application
Filed: July 17, 2023
Publication date: May 9, 2024
Inventors: Geonhwa Jeong, Po-An Tsai, Jeffrey Michael Pool
-
Publication number: 20240027061
Abstract: A base is configured for a bracket. The base includes a hollow body, a plurality of supporting branches, and an illuminating module. The hollow body is connected to the bracket and has a bottom part and a first sidewall. The bottom part has an open hole. The first sidewall has a transparent structure. The plurality of supporting branches is disposed around the hollow body to lift the hollow body. The illuminating module is disposed in the hollow body and includes a sleeve and a base plate. The sleeve has a second sidewall, a first end, and a second end opposite to the first end. The second sidewall has an opening. The position of the opening is corresponding to the transparent structure. The base plate is disposed on the first end. The base plate is provided with a light source. The light source projects light beams toward the second end.
Type: Application
Filed: December 23, 2022
Publication date: January 25, 2024
Inventors: Kai Chieh HSU, Chih-Wei CHUANG, Yaw-Huei CHIOU, Peng Chao WANG, Po-An TSAI, Hao-Chun LAI
-
Publication number: 20230411470
Abstract: A trench-gate field effect transistor includes a plurality of trenches, a plurality of gate electrode units, and a plurality of source electrode units. Each of the trenches has a first trench region, a second trench region having a width less than that of the first trench region, and a neck trench region extending between the first trench region and the second trench region. Each of the gate electrode units includes a pair of first gate electrode portions disposed in the first trench region, a pair of second gate electrode portions disposed in the neck trench region, and a third gate electrode portion disposed in the second trench region. Each of the source electrode units includes a first source electrode portion disposed between a pair of the first gate electrode portions, and a second source electrode portion connected to the first source electrode portion.
Type: Application
Filed: May 19, 2023
Publication date: December 21, 2023
Applicant: FORCE MOS TECHNOLOGY CO., LTD.
Inventors: Kao-Way TU, Yuan-Shun CHANG, Po-An TSAI, Huan-Chung WENG
-
Publication number: 20230062503
Abstract: Hierarchical structured sparse parameter pruning and processing improves runtime performance and energy efficiency of neural networks. In contrast with conventional (non-structured) pruning which allows for any distribution of the non-zero values within a matrix that achieves the desired sparsity degree (e.g., 50%) and is consequently difficult to accelerate, structured hierarchical sparsity requires each multi-element unit at the coarsest granularity of the hierarchy to be pruned to the desired sparsity degree. The global desired sparsity degree is a function of the per-level sparsity degrees. Distribution of non-zero values within each multi-element unit is constrained according to the per-level sparsity degree at the particular level of the hierarchy. Each level of the hierarchy may be associated with a hardware (e.g., logic or circuit) structure that can be enabled or disabled according to the per-level sparsity.
Type: Application
Filed: February 28, 2022
Publication date: March 2, 2023
Inventors: Yannan Wu, Po-An Tsai, Saurav Muralidharan, Joel Springer Emer
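To make the relationship between per-level and global sparsity in the abstract above concrete, here is a minimal two-level sketch. It assumes that each level keeps `keep` out of every `group` entries and that the per-level densities multiply; the abstract only says the global degree is "a function of" the per-level degrees, so the product rule, the magnitude-based selection, and the names `global_density` and `prune_two_level` are illustrative assumptions rather than the claimed method.

```python
from fractions import Fraction


def global_density(levels):
    """Fraction of values kept overall, assuming per-level densities multiply.

    levels: list of (keep, group) pairs, e.g. (2, 4) means keep 2 of every 4.
    """
    density = Fraction(1)
    for keep, group in levels:
        density *= Fraction(keep, group)
    return density


def prune_two_level(values, fine=(2, 4), coarse=(3, 4)):
    """Prune each coarse unit of fine_group * coarse_group values in two steps.

    Fine level: keep the largest-magnitude `fine_keep` values in each block.
    Coarse level: keep the `coarse_keep` blocks with the largest remaining
    magnitude and zero the other blocks entirely.
    """
    fine_keep, fine_group = fine
    coarse_keep, coarse_group = coarse
    unit = fine_group * coarse_group
    if len(values) % unit != 0:
        raise ValueError("length must be a multiple of the coarse unit size")
    out = list(values)
    for start in range(0, len(out), unit):
        blocks = []
        for b in range(coarse_group):
            lo = start + b * fine_group
            order = sorted(range(lo, lo + fine_group),
                           key=lambda i: abs(out[i]), reverse=True)
            for i in order[fine_keep:]:
                out[i] = 0.0
            blocks.append((sum(abs(out[i]) for i in range(lo, lo + fine_group)), lo))
        blocks.sort(reverse=True)
        for _, lo in blocks[coarse_keep:]:
            for i in range(lo, lo + fine_group):
                out[i] = 0.0
    return out


levels = [(2, 4), (3, 4)]
pruned = prune_two_level([float(v) for v in range(1, 17)])
kept = sum(1 for v in pruned if v != 0.0)
```

With 2:4 at the fine level and 3:4 at the coarse level, every 16-element unit keeps 6 values, which matches a global density of 3/8 (a sparsity degree of 62.5%) under the multiplication assumption.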
-
Publication number: 20220083314
Abstract: Accelerators are generally utilized to provide high performance and energy efficiency for tensor algorithms. Currently, an accelerator will be specifically designed around the fundamental properties of the tensor algorithm and shape it supports, and thus will exhibit sub-optimal performance when used for other tensor algorithms and shapes. The present disclosure provides a flexible accelerator for tensor workloads. The flexible accelerator can be a flexible tensor accelerator or a FPGA having a dynamically configurable inter-PE network supporting different tensor shapes and different tensor algorithms including at least a GEMM algorithm, a 2D CNN algorithm, and a 3D CNN algorithm, and/or having a flexible DPU in which a dot product length of its dot product sub-units is configurable based on a target compute throughput that is less than or equal to a maximum throughput of the flexible DPU.
Type: Application
Filed: June 9, 2021
Publication date: March 17, 2022
Inventors: Po An Tsai, Neal Crago, Angshuman Parashar, Joel Springer Emer, Stephen William Keckler
-
Publication number: 20220083500
Abstract: Accelerators are generally utilized to provide high performance and energy efficiency for tensor algorithms. Currently, an accelerator will be specifically designed around the fundamental properties of the tensor algorithm and shape it supports, and thus will exhibit sub-optimal performance when used for other tensor algorithms and shapes. The present disclosure provides a flexible accelerator for tensor workloads. The flexible accelerator can be a flexible tensor accelerator or a FPGA having a dynamically configurable inter-PE network supporting different tensor shapes and different tensor algorithms including at least a GEMM algorithm, a 2D CNN algorithm, and a 3D CNN algorithm, and/or having a flexible DPU in which a dot product length of its dot product sub-units is configurable based on a target compute throughput.
Type: Application
Filed: June 9, 2021
Publication date: March 17, 2022
Inventors: Po An Tsai, Neal Crago, Angshuman Parashar, Joel Springer Emer, Stephen William Keckler
-
Patent number: 10956227
Abstract: Examples provide two-tiered scheduling within a cluster. A coarse-grained analysis is performed on a candidate set of hosts to select a host for a virtual computing instance based on optimization of at least one resource. A host is selected based on the analysis results. The identified virtual computing instance is placed on the selected host. A fine-grained analysis is performed on a set of communication graphs for a plurality of virtual computing instances to generate a set of penalty scores. A set of communicating virtual computing instances are selected based on the set of penalty scores. A first virtual computing instance from a first host is relocated to a second host to minimize a distance between the first virtual computing instance and a second virtual computing instance. Relocating the first virtual computing instance reduces at least one penalty score for the set of communicating virtual computing instances.
Type: Grant
Filed: February 11, 2019
Date of Patent: March 23, 2021
Assignee: VMware, Inc.
Inventors: Po-An Tsai, Sahan Gamage, Rean Griffith
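The two tiers described in the abstract above can be sketched in a few lines. This is a toy model under stated assumptions: the coarse tier picks the candidate host with the most free CPU, the penalty score is traffic rate times a rack-based distance, and the fine tier tries single-VM moves without checking host capacity. The names `pick_host`, `penalty`, `best_relocation`, `RACK`, and the distance values are all hypothetical and do not come from the patent.

```python
RACK = {"h1": "r1", "h2": "r1", "h3": "r2"}  # hypothetical host-to-rack map


def distance(host_a, host_b):
    """Assumed distance: 0 on the same host, 1 in the same rack, 2 otherwise."""
    if host_a == host_b:
        return 0
    return 1 if RACK[host_a] == RACK[host_b] else 2


def pick_host(free_cpu, demand):
    """Coarse tier: choose the candidate host with the most free CPU."""
    candidates = {h: c for h, c in free_cpu.items() if c >= demand}
    if not candidates:
        raise ValueError("no host can fit the requested demand")
    return max(candidates, key=candidates.get)


def penalty(placement, traffic):
    """Sum of traffic rate * distance over all communicating VM pairs."""
    return sum(rate * distance(placement[a], placement[b])
               for (a, b), rate in traffic.items())


def best_relocation(placement, traffic, hosts):
    """Fine tier: find the single VM move that lowers the penalty the most.

    Returns (penalty, vm, host); vm and host are None if no move helps.
    Capacity is deliberately ignored in this sketch.
    """
    best = (penalty(placement, traffic), None, None)
    for vm in placement:
        for host in hosts:
            if host == placement[vm]:
                continue
            trial = dict(placement)
            trial[vm] = host
            score = penalty(trial, traffic)
            if score < best[0]:
                best = (score, vm, host)
    return best


chosen = pick_host({"h1": 2, "h2": 8, "h3": 4}, demand=3)
placement = {"a": "h1", "b": "h3"}
traffic = {("a", "b"): 10}
before = penalty(placement, traffic)
move = best_relocation(placement, traffic, ["h1", "h2", "h3"])
```

In this example the two VMs start in different racks, and relocating one of them onto the other's host brings the penalty for the pair down to zero.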
-
Patent number: 10700175
Abstract: A fabricating method of a shielded gate MOSFET is provided, includes the steps of forming a semiconductor substrate having a trench, forming a sacrifice oxide layer in the trench, the sacrifice oxide layer covering a side wall of the trench, forming a source polycrystalline silicon region in the trench, forming an insulation oxide layer above the source polycrystalline silicon region to have the source polycrystalline silicon region fully enclosed by the sacrifice oxide layer and the insulation oxide layer, depositing polycrystalline silicon into the trench and carrying out a back etching to control a thickness of the insulation oxide layer above the source polycrystalline silicon region, forming a gate oxide layer in the trench, the gate oxide layer covering the side wall of the trench, forming a gate polycrystalline silicon region in the trench, and forming a body layer and a heavily doped region around the trench in an ion implantation manner.
Type: Grant
Filed: January 10, 2019
Date of Patent: June 30, 2020
Assignee: Force MOS Technology Co., Ltd.
Inventors: Kao-Way Tu, Po-An Tsai, Huan-Chung Weng
-
Publication number: 20200105890
Abstract: A fabricating method of a shielded gate MOSFET is provided, including steps of: forming a semiconductor substrate having a trench; forming a sacrifice oxide layer in the trench, the sacrifice oxide layer covering a side wall of the trench; forming a source polycrystalline silicon region in the trench; forming an insulation oxide layer above the source polycrystalline silicon region to have the source polycrystalline silicon region fully enclosed by the sacrifice oxide layer and the insulation oxide layer; depositing polycrystalline silicon into the trench and carrying out a back etching to control a thickness of the insulation oxide layer above the source polycrystalline silicon region; forming a gate oxide layer in the trench, the gate oxide layer covering the side wall of the trench; forming a gate polycrystalline silicon region in the trench; and forming a body layer and a heavily doped region around the trench in an ion implantation manner.
Type: Application
Filed: January 10, 2019
Publication date: April 2, 2020
Inventors: Kao-Way Tu, Po-An Tsai, Huan-Chung Weng
-
Patent number: 10401717
Abstract: A projecting device is provided. The projecting device is adapted to assembling with an electronic device. The projecting device comprises a main body, a light emitting portion, a rotating portion, and an adjusting portion. The main body includes a first opening and a second opening. The light emitting portion is disposed in the main body, and transmits a light through the first opening. The rotating portion is disposed in the main body and connected with the light emitting portion. The adjusting portion is disposed in the second opening and connected with the rotating portion. The light emitting portion drives the rotating portion to rotate through the adjusting portion to adjust the angle of the light.
Type: Grant
Filed: August 1, 2018
Date of Patent: September 3, 2019
Assignee: ASUSTEK COMPUTER INC.
Inventor: Po-An Tsai
-
Publication number: 20190188050
Abstract: Examples provide two-tiered scheduling within a cluster. A coarse-grained analysis is performed on a candidate set of hosts to select a host for a virtual computing instance based on optimization of at least one resource. A host is selected based on the analysis results. The identified virtual computing instance is placed on the selected host. A fine-grained analysis is performed on a set of communication graphs for a plurality of virtual computing instances to generate a set of penalty scores. A set of communicating virtual computing instances are selected based on the set of penalty scores. A first virtual computing instance from a first host is relocated to a second host to minimize a distance between the first virtual computing instance and a second virtual computing instance. Relocating the first virtual computing instance reduces at least one penalty score for the set of communicating virtual computing instances.
Type: Application
Filed: February 11, 2019
Publication date: June 20, 2019
Inventors: Po-An Tsai, Sahan Gamage, Rean Griffith
-
Patent number: 10241840
Abstract: Examples provide two-tiered scheduling within a cluster. A coarse-grained analysis is performed on a candidate set of hosts to select a host for a virtual computing instance based on optimization of at least one resource. A host is selected based on the analysis results. The identified virtual computing instance is placed on the selected host. A fine-grained analysis is performed on a set of communication graphs for a plurality of virtual computing instances to generate a set of penalty scores. A set of communicating virtual computing instances are selected based on the set of penalty scores. A first virtual computing instance from a first host is relocated to a second host to minimize a distance between the first virtual computing instance and a second virtual computing instance. Relocating the first virtual computing instance reduces at least one penalty score for the set of communicating virtual computing instances.
Type: Grant
Filed: September 30, 2016
Date of Patent: March 26, 2019
Assignee: VMware, Inc.
Inventors: Po-An Tsai, Sahan Gamage, Rean Griffith
-
Publication number: 20190041730
Abstract: A projecting device is provided. The projecting device is adapted to assembling with an electronic device. The projecting device comprises a main body, a light emitting portion, a rotating portion, and an adjusting portion. The main body includes a first opening and a second opening. The light emitting portion is disposed in the main body, and transmits a light through the first opening. The rotating portion is disposed in the main body and connected with the light emitting portion. The adjusting portion is disposed in the second opening and connected with the rotating portion. The light emitting portion drives the rotating portion to rotate through the adjusting portion to adjust the angle of the light.
Type: Application
Filed: August 1, 2018
Publication date: February 7, 2019
Inventor: Po-An TSAI
-
Publication number: 20180095776
Abstract: Examples provide two-tiered scheduling within a cluster. A coarse-grained analysis is performed on a candidate set of hosts to select a host for a virtual computing instance based on optimization of at least one resource. A host is selected based on the analysis results. The identified virtual computing instance is placed on the selected host. A fine-grained analysis is performed on a set of communication graphs for a plurality of virtual computing instances to generate a set of penalty scores. A set of communicating virtual computing instances are selected based on the set of penalty scores. A first virtual computing instance from a first host is relocated to a second host to minimize a distance between the first virtual computing instance and a second virtual computing instance. Relocating the first virtual computing instance reduces at least one penalty score for the set of communicating virtual computing instances.
Type: Application
Filed: September 30, 2016
Publication date: April 5, 2018
Inventors: Po-An Tsai, Sahan Gamage, Rean Griffith
-
Publication number: 20130046268
Abstract: A low-cost waist adhering material for paper diaper for mass production and easy application comprises a replacement material. The replacement material is composed of a non-textile fabric layer disposed at a top end for connecting with hooks of a paper diaper, and a molded positioning layer disposed underneath the non-textile fabric layer and on a surface of the paper diaper for positioning the non-textile fabric layer. The non-textile fabric layer has at least one through hole area. The through hole area is composed of a plurality of through holes spaced at intervals penetrating through the non-textile fabric layer.
Type: Application
Filed: July 29, 2012
Publication date: February 21, 2013
Inventors: Po-An Tsai, Charng-Ching Ou, Hsu-Feng Shih
-
Publication number: 20120323199
Abstract: A leakage proof base material for paper diaper is provided which is more convenient to use, suitable to be mass produced, more skin-friendly and with a lower cost. The leakage proof base material for paper diaper comprises a leakage proof base material which is composed of a leakage proof membrane as a top layer and a non-textile fabric layer as a bottom layer, and surfaces of the leakage proof membrane and the non-textile fabric layer are combined together partially or entirely. At least one through hole area is disposed on the non-textile fabric layer, the through hole area is composed of a plurality of through holes which are penetrated through the non-textile fabric layer and are spaced at intervals.
Type: Application
Filed: February 8, 2012
Publication date: December 20, 2012
Inventors: Po-An Tsai, Charng-Ching Ou, Hsu-Feng Shih
-
Publication number: 20120095431
Abstract: A diaper having an improved conjugated structure that is convenient, low-cost and easy for mass production is disclosed. The diaper includes a side wing having a magic hook located thereon and a rear thin sheet having an anti-leaking layer and a non-textile fabrics layer conjugated entirely or partially. The conjugation strength between the magic hook and the rear thin sheet is about 100 to 700 g/inch at 180 degrees, and the shear stress at 180 degrees is over 1000 g/inch. With the conjugation between the magic hook and the rear thin sheet, the user can randomly attach the magic hook to the surface of the non-textile fabrics layer to adjust the tightness of the diaper.
Type: Application
Filed: June 20, 2011
Publication date: April 19, 2012
Inventors: Po-An TSAI, Charng-Ching OU, Hsu-Feng SHIH