Patents by Inventor Yun-Chian Cheng
Yun-Chian Cheng has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Patent number: 10325676
Abstract: Methods and systems for high-throughput sequencing data analysis are provided. In an embodiment, the method includes the following steps. An input DNA/RNA/Protein sequence is received by a master computing unit. The input DNA/RNA/Protein sequence is partitioned into overlapping segments with a sliding window less than a segment length of the overlapping segments to allow overlapping of any successive two thereof by the master computing unit. The overlapping segments are distributed by the master computing unit to a plurality of slave computing units in a cloud computing environment. Suffix-expansion-sorting processing is performed on the overlapping segments by the slave computing units to produce sorted expansion segments. Distributed database tables are generated based on the sorted expansion segments by at least a portion of the slave computing units.
Type: Grant
Filed: December 31, 2015
Date of Patent: June 18, 2019
Assignee: ATGENOMIX INC.
Inventors: Ming-Tai Chang, Chung-Tsai Su, Yun-Chian Cheng
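The partitioning step in this abstract can be illustrated with a minimal sketch. It assumes the "sliding window" is the step between segment starts, so any step smaller than the segment length makes successive segments overlap. The function name `partition_overlapping` and its parameters are hypothetical; the distribution to worker nodes and the suffix-expansion-sorting step are not modeled.

```python
def partition_overlapping(sequence, segment_length, step):
    """Split a sequence into (start, segment) pairs that overlap.

    Assumption: 'step' plays the role of the sliding window, and it must be
    smaller than 'segment_length' so successive segments share characters.
    """
    if not 0 < step < segment_length:
        raise ValueError("step must be positive and smaller than segment_length")
    if not sequence:
        return []
    segments = []
    start = 0
    while True:
        segments.append((start, sequence[start:start + segment_length]))
        # Stop once the current segment reaches the end of the input.
        if start + segment_length >= len(sequence):
            break
        start += step
    return segments


if __name__ == "__main__":
    for start, segment in partition_overlapping("ACGTACGTAC", 4, 2):
        print(start, segment)
```

With a segment length of 4 and a step of 2, each segment shares its last two characters with the start of the next one, so a match that straddles a segment boundary still falls entirely inside some segment.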
-
Publication number: 20160188797
Abstract: Methods and systems for high-throughput sequencing data analysis are provided. In an embodiment, the method includes the following steps. An input DNA/RNA/Protein sequence is received by a master computing unit. The input DNA/RNA/Protein sequence is partitioned into overlapping segments with a sliding window less than a segment length of the overlapping segments to allow overlapping of any successive two thereof by the master computing unit. The overlapping segments are distributed by the master computing unit to a plurality of slave computing units in a cloud computing environment. Suffix-expansion-sorting processing is performed on the overlapping segments by the slave computing units to produce sorted expansion segments. Distributed database tables are generated based on the sorted expansion segments by at least a portion of the slave computing units.
Type: Application
Filed: December 31, 2015
Publication date: June 30, 2016
Inventors: Ming-Tai Chang, Chung-Tsai Su, Yun-Chian Cheng
-
Patent number: 8612523
Abstract: Botnet attacks may be detected by collecting samples of spam messages, forming clusters of related spam messages, and identifying the source or sources of the related spam messages. The related spam messages may be identified as those generated using the same template. For example, spam messages generated using the same image template, text template, or both may be deemed as related. To find related spam messages, images of spam messages may be extracted and compressed using a lossy compression algorithm. The compressed images may then be compared to one another to identify those generated using the same image template. The lossy compression algorithm may involve dividing an image into several blocks and then computing a value for each block for comparison.
Type: Grant
Filed: May 22, 2007
Date of Patent: December 17, 2013
Assignee: Trend Micro Incorporated
Inventors: Jonathan James Oliver, Yun-Chian Cheng
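The block-based comparison mentioned at the end of this abstract can be sketched as follows. This is only an illustration under stated assumptions: the image is a grayscale grid given as a list of rows, the per-block value is the integer mean brightness, and two images count as related when every block value differs by at most a tolerance. The names `block_signature` and `same_template` are hypothetical and are not taken from the patent.

```python
def block_signature(pixels, blocks_per_side):
    """Reduce a grayscale image (list of rows) to one mean value per block."""
    height, width = len(pixels), len(pixels[0])
    if not 0 < blocks_per_side <= min(height, width):
        raise ValueError("blocks_per_side must fit inside the image")
    signature = []
    for block_row in range(blocks_per_side):
        y0 = block_row * height // blocks_per_side
        y1 = (block_row + 1) * height // blocks_per_side
        for block_col in range(blocks_per_side):
            x0 = block_col * width // blocks_per_side
            x1 = (block_col + 1) * width // blocks_per_side
            values = [pixels[y][x] for y in range(y0, y1) for x in range(x0, x1)]
            # Lossy step: the whole block collapses to a single integer mean.
            signature.append(sum(values) // len(values))
    return tuple(signature)


def same_template(signature_a, signature_b, tolerance=5):
    """Treat two images as related if all block values are within tolerance."""
    return len(signature_a) == len(signature_b) and all(
        abs(a - b) <= tolerance for a, b in zip(signature_a, signature_b)
    )


if __name__ == "__main__":
    image = [[10, 10, 200, 200], [10, 10, 200, 200],
             [50, 50, 90, 90], [50, 50, 90, 90]]
    noisy = [row[:] for row in image]
    noisy[0][0] = 14  # small per-pixel change, as a spammer might add
    print(same_template(block_signature(image, 2), block_signature(noisy, 2)))
```

Because each block is averaged, small pixel-level noise changes the signature only slightly, which is what lets near-identical images be grouped together.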
-
Patent number: 8560466
Abstract: The invention relates, in an embodiment, to a method for handling a received document. The method includes receiving a plurality of text document samples. The method includes training, using a plurality of text document samples, to obtain a set of machine learning models. Training includes generating fundamental units from the plurality of text document samples for charsets of the plurality of text document samples. Training includes extracting a subset of said fundamental units as feature lists and converting the feature lists into a set of feature vectors. Training further includes generating the set of machine learning models from the set of feature vectors. The method includes applying the set of machine learning models against a set of target document feature vectors converted from the received document. The method includes decoding the received document to obtain decoded content of the received document based on at least the first encoding scheme.
Type: Grant
Filed: February 26, 2010
Date of Patent: October 15, 2013
Assignee: Trend Micro Incorporated
Inventors: Lili Diao, Yun-chian Cheng
-
Patent number: 8495733
Abstract: One embodiment relates to a computer-implemented process of content fingerprinting. A context and a content for fingerprinting are received. The context comprises a set of context components for use in generation of content fingerprints. The content includes instances of at least some of the context components. The content is processed to generate context offset sequences, and a fingerprint for the content is formed from at least a portion of the context offset sequences. Another embodiment relates to a computer-implemented process for comparing a target content against a pool of contents. The process includes constructing an automata data structure based on the fingerprints in the pool. Context offset sequences of a target fingerprint are scanned against the automata data structure to determine matched offset subsequences. Other embodiments, aspects and features are also disclosed.
Type: Grant
Filed: March 25, 2009
Date of Patent: July 23, 2013
Assignee: Trend Micro Incorporated
Inventors: Yun-Chian Cheng, Ming-Tai Chang, Chung-Chih Wu
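The abstract does not define "context offset sequences" precisely. One plausible reading, used here purely as an assumption, is that for each context component the fingerprint records the distances between its successive occurrences in the content. The sketch below follows that reading; `context_offset_sequences` is a hypothetical name, and the automata-based matching against a pool of fingerprints is not modeled.

```python
def context_offset_sequences(content, context):
    """Map each context component to the gaps between its occurrences.

    Assumption: an 'offset sequence' is the list of distances between
    consecutive start positions of one component in the content.
    """
    sequences = {}
    for component in context:
        if not component:
            continue  # an empty component would match everywhere
        positions = []
        start = content.find(component)
        while start != -1:
            positions.append(start)
            start = content.find(component, start + 1)
        sequences[component] = [
            later - earlier for earlier, later in zip(positions, positions[1:])
        ]
    return sequences


if __name__ == "__main__":
    text = "the cat and the dog and the bird"
    print(context_offset_sequences(text, ["the", "and"]))
```

Under this reading the fingerprint depends on where the context components sit relative to one another rather than on the words between them, so two documents with the same layout of components would produce the same sequences.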
-
Patent number: 8495144
Abstract: In one embodiment, a support vector machine is employed to compute a spam threshold and weights of tokens and heuristic rules. An incoming e-mail is parsed to determine if it contains one or more of the tokens. Tokens identified to be in the e-mail are then used to determine if the e-mail satisfies one or more heuristic rules. The weights of tokens found in the e-mail and the weights of the heuristic rules satisfied by the e-mail may be employed in the computation of a spam score. The spam score may be compared to the spam threshold to determine if the e-mail is spam or legitimate.
Type: Grant
Filed: October 6, 2004
Date of Patent: July 23, 2013
Assignee: Trend Micro Incorporated
Inventors: Yun-Chian Cheng, Pei-Hsun Yu
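The scoring step described in this abstract can be sketched as a weighted sum compared against a threshold. The token weights, the single heuristic rule, and the threshold below are made-up illustrative values; in the abstract they are computed by a support vector machine, which this sketch does not train. All names (`TOKEN_WEIGHTS`, `RULES`, `SPAM_THRESHOLD`, `spam_score`, `is_spam`) are hypothetical.

```python
import re

# Hypothetical values standing in for SVM-derived weights and threshold.
TOKEN_WEIGHTS = {"free": 1.25, "winner": 1.5, "meeting": -1.0}
# Each rule: (tokens that must all be present, weight added when satisfied).
RULES = {"free_and_winner": (("free", "winner"), 2.0)}
SPAM_THRESHOLD = 2.5


def spam_score(text):
    """Sum the weights of known tokens found and of the rules they satisfy."""
    found = {tok for tok in re.findall(r"[a-z']+", text.lower())
             if tok in TOKEN_WEIGHTS}
    score = sum(TOKEN_WEIGHTS[tok] for tok in found)
    for required_tokens, weight in RULES.values():
        if all(tok in found for tok in required_tokens):
            score += weight
    return score


def is_spam(text):
    """Classify as spam when the score exceeds the threshold."""
    return spam_score(text) > SPAM_THRESHOLD


if __name__ == "__main__":
    print(is_spam("You are a WINNER, claim your free prize"))
    print(is_spam("Team meeting at noon"))
```

Rules are evaluated on the set of tokens already identified in the message, mirroring the abstract's order of operations: tokens first, then the heuristic rules that depend on them.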
-
Patent number: 8424091
Abstract: A system for locally detecting computer security threats in a computer network includes a processing engine, a fingerprint engine, and a detection engine. Data samples are received in the computer network and grouped by the processing engine into clusters. Clusters that do not have high false alarm rates are passed to the fingerprint engine, which generates fingerprints for the clusters. The detection engine scans incoming data for computer security threats using the fingerprints.
Type: Grant
Filed: January 12, 2010
Date of Patent: April 16, 2013
Assignee: Trend Micro Incorporated
Inventors: Chung-Tsai Su, Wen-Kwang Tsao, Chung-Chi Wu, Ming-Tai Chang, Yun-Chian Cheng
-
Patent number: 8291024
Abstract: A clustering technique is utilized to group similar e-mail messages into clusters. Statistical spamming behavior analysis is then applied to each cluster, focusing on finding e-mail messages within each cluster that differ from other e-mail messages in the cluster. The degree of variance and the type of variance can provide important clues as to whether the email is spam or not. Appropriate measures are then taken to block, filter, or otherwise handle the suspected spam e-mail messages.
Type: Grant
Filed: July 31, 2008
Date of Patent: October 16, 2012
Assignee: Trend Micro Incorporated
Inventors: Yun-Chian Cheng, Allen Ming-Tai Chang
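A toy version of the cluster-then-analyze idea might look like the sketch below. The clustering technique and the variance statistics are not specified in the abstract, so both are assumptions here: messages are clustered greedily by Jaccard similarity of their body tokens, and the only variance measured is the share of distinct senders in a cluster. The names `cluster_messages` and `sender_variance` are hypothetical.

```python
def jaccard(tokens_a, tokens_b):
    """Jaccard similarity of two token sets (1.0 when both are empty)."""
    union = tokens_a | tokens_b
    return len(tokens_a & tokens_b) / len(union) if union else 1.0


def cluster_messages(messages, threshold=0.6):
    """Greedy clustering: join the first cluster whose seed is similar enough."""
    clusters = []
    for message in messages:
        tokens = set(message["body"].lower().split())
        for cluster in clusters:
            if jaccard(tokens, cluster["tokens"]) >= threshold:
                cluster["members"].append(message)
                break
        else:
            # No similar cluster found, so this message seeds a new one.
            clusters.append({"tokens": tokens, "members": [message]})
    return [cluster["members"] for cluster in clusters]


def sender_variance(cluster):
    """Fraction of distinct senders among the messages of one cluster."""
    return len({message["sender"] for message in cluster}) / len(cluster)


if __name__ == "__main__":
    inbox = [
        {"sender": "a@x.example", "body": "buy cheap pills now today"},
        {"sender": "b@y.example", "body": "buy cheap pills now friend"},
        {"sender": "c@z.example", "body": "buy cheap pills now please"},
        {"sender": "d@w.example", "body": "lunch at noon tomorrow ok"},
    ]
    for cluster in cluster_messages(inbox):
        print(len(cluster), sender_variance(cluster))
```

In this sketch, a large cluster of near-identical bodies arriving from many different senders is the kind of variance pattern that would be passed on for blocking or filtering; a single-message cluster carries no such signal.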
-
Publication number: 20110213736
Abstract: The invention relates, in an embodiment, to a method for handling a received document. The method includes receiving a plurality of text document samples. The method includes training, using a plurality of text document samples, to obtain a set of machine learning models. Training includes generating fundamental units from the plurality of text document samples for charsets of the plurality of text document samples. Training includes extracting a subset of said fundamental units as feature lists and converting the feature lists into a set of feature vectors. Training further includes generating the set of machine learning models from the set of feature vectors. The method includes applying the set of machine learning models against a set of target document feature vectors converted from the received document. The method includes decoding the received document to obtain decoded content of the received document based on at least the first encoding scheme.
Type: Application
Filed: February 26, 2010
Publication date: September 1, 2011
Inventors: Lili Diao, Yun-chian Cheng
-
Patent number: 7689531
Abstract: The invention relates, in an embodiment, to a computer-implemented method for automatic charset detection, which includes detecting an encoding scheme of a target document. The method includes training, using a plurality of text document samples, to obtain a set of machine learning models. Training includes using a SVM (Support Vector Machine) technique to generate the set of machine learning models from feature vectors obtained from the plurality of text document samples. The method also includes applying the set of machine learning models against a set of target document feature vectors converted from the target document to detect the encoding scheme.
Type: Grant
Filed: September 28, 2005
Date of Patent: March 30, 2010
Assignee: Trend Micro Incorporated
Inventors: Lili Diao, Yun-chian Cheng
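The general shape of feature-vector-based charset detection can be sketched without a machine learning library. Note the substitution: instead of the SVM named in the abstract, this sketch picks the charset whose training profile has the highest cosine similarity to the target, which is a much simpler stand-in. The choice of byte bigrams as features, and the names `bigram_vector`, `train_profiles`, and `detect_charset`, are assumptions for illustration.

```python
import math
from collections import Counter


def bigram_vector(data):
    """Feature vector: counts of adjacent byte pairs in the raw bytes."""
    return Counter(zip(data, data[1:]))


def cosine(vector_a, vector_b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(count * vector_b[key] for key, count in vector_a.items())
    norm = (math.sqrt(sum(v * v for v in vector_a.values()))
            * math.sqrt(sum(v * v for v in vector_b.values())))
    return dot / norm if norm else 0.0


def train_profiles(sample_texts, charsets):
    """Build one bigram profile per charset from the same sample texts."""
    profiles = {}
    for charset in charsets:
        profile = Counter()
        for text in sample_texts:
            profile.update(bigram_vector(text.encode(charset)))
        profiles[charset] = profile
    return profiles


def detect_charset(data, profiles):
    """Nearest-profile stand-in for the SVM: pick the most similar charset."""
    target = bigram_vector(data)
    return max(profiles, key=lambda charset: cosine(target, profiles[charset]))


if __name__ == "__main__":
    profiles = train_profiles(["こんにちは", "ありがとう"], ["utf-8", "shift_jis"])
    unknown = "こんばんは".encode("shift_jis")
    charset = detect_charset(unknown, profiles)
    print(charset, unknown.decode(charset))
```

Byte-level features work here because the same text produces very different byte patterns under different encodings, so the target's bigrams overlap with the profile of its true charset and barely, if at all, with the others.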
-
Patent number: 7636716
Abstract: In a method for blocking email spams, the header fields and the message body of a received email first are identified. Predefined patterns are identified by matching in the header fields and message body, wherein a data structure of characteristic information is created for each recognized pattern. The characteristic information then are analyzed by rule inference to determine whether the received email is a spam.
Type: Grant
Filed: November 5, 2004
Date of Patent: December 22, 2009
Assignee: Trend Micro Incorporated
Inventor: Yun-Chian Cheng
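The three stages in this abstract (split headers from body, match predefined patterns and record characteristic information, then infer a verdict from rules) can be sketched with Python's standard `email` module. The two patterns, the `Characteristic` record, and the single inference rule are hypothetical examples chosen for illustration; they are not taken from the patent.

```python
import re
from dataclasses import dataclass
from email import message_from_string


@dataclass
class Characteristic:
    """Characteristic information recorded for one recognized pattern."""
    pattern_name: str
    location: str
    matched_text: str


# Hypothetical patterns: (where to look, regular expression).
PATTERNS = {
    "shouting_subject": ("Subject", re.compile(r"\b[A-Z]{4,}\b")),
    "url_in_body": ("body", re.compile(r"https?://\S+")),
}


def extract_characteristics(raw_email):
    """Identify header fields and body, then match the predefined patterns."""
    message = message_from_string(raw_email)
    body = message.get_payload()  # a string for simple, non-multipart mail
    found = []
    for name, (location, regex) in PATTERNS.items():
        text = body if location == "body" else message.get(location, "")
        match = regex.search(text)
        if match:
            found.append(Characteristic(name, location, match.group(0)))
    return found


def is_spam(raw_email):
    """Hypothetical rule: spam if both patterns were recognized."""
    names = {item.pattern_name for item in extract_characteristics(raw_email)}
    return {"shouting_subject", "url_in_body"} <= names


if __name__ == "__main__":
    raw = ("From: a@example.com\nSubject: FREE MONEY\n\n"
           "Visit http://example.com/win now\n")
    print(extract_characteristics(raw))
    print(is_spam(raw))
```

Keeping pattern matching separate from the final rule means new rules can combine the recorded characteristics without touching the matching code, which is one reasonable way to read the abstract's split between recognition and inference.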