Patents by Inventor Yun-Chian Cheng

Yun-Chian Cheng has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

Method and system for high-throughput sequencing data analysis

Patent number: 10325676

Abstract: Methods and systems for high-throughput sequencing data analysis are provided. In an embodiment, the method includes the following steps. An input DNA/RNA/Protein sequence is received by a master computing unit. The input DNA/RNA/Protein sequence is partitioned into overlapping segments with a sliding window less than a segment length of the overlapping segments to allow overlapping of any successive two thereof by the master computing unit. The overlapping segments are distributed by the master computing unit to a plurality of slave computing units in a cloud computing environment. Suffix-expansion-sorting processing is performed on the overlapping segments by the slave computing units to produce sorted expansion segments. Distributed database tables are generated based on the sorted expansion segments by at least a portion of the slave computing units.
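
The partitioning step in this abstract can be illustrated with a minimal sketch. It assumes the "sliding window" is the step between successive segment start positions, so a step smaller than the segment length forces neighbouring segments to overlap. The function name `partition_overlapping` and its parameters are hypothetical and do not come from the patent; the distribution and suffix-expansion-sorting steps are not shown.

```python
def partition_overlapping(sequence: str, segment_length: int, step: int) -> list[str]:
    """Split `sequence` into segments of `segment_length`, advancing by `step`.

    Assumption: `step` plays the role of the abstract's sliding window.
    Because step < segment_length, each segment shares its last
    (segment_length - step) characters with the start of the next one.
    """
    if not 0 < step < segment_length:
        raise ValueError("step must be positive and smaller than segment_length")
    if not sequence:
        return []
    segments = []
    start = 0
    while True:
        segments.append(sequence[start:start + segment_length])
        # Stop once the current segment reaches the end of the input.
        if start + segment_length >= len(sequence):
            break
        start += step
    return segments


segments = partition_overlapping("ACGTACGTAC", segment_length=4, step=2)
print(segments)  # ['ACGT', 'GTAC', 'ACGT', 'GTAC']
```

In a real deployment each segment would then be handed to a worker node; here the list is simply returned.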

Type: Grant

Filed: December 31, 2015

Date of Patent: June 18, 2019

Assignee: ATGENOMIX INC.

Inventors: Ming-Tai Chang, Chung-Tsai Su, Yun-Chian Cheng

Method and system for high-throughput sequencing data analysis

Publication number: 20160188797

Abstract: Methods and systems for high-throughput sequencing data analysis are provided. In an embodiment, the method includes the following steps. An input DNA/RNA/Protein sequence is received by a master computing unit. The input DNA/RNA/Protein sequence is partitioned into overlapping segments with a sliding window less than a segment length of the overlapping segments to allow overlapping of any successive two thereof by the master computing unit. The overlapping segments are distributed by the master computing unit to a plurality of slave computing units in a cloud computing environment. Suffix-expansion-sorting processing is performed on the overlapping segments by the slave computing units to produce sorted expansion segments. Distributed database tables are generated based on the sorted expansion segments by at least a portion of the slave computing units.

Type: Application

Filed: December 31, 2015

Publication date: June 30, 2016

Inventors: Ming-Tai Chang, Chung-Tsai Su, Yun-Chian Cheng

Methods and apparatus for detecting botnet attacks

Patent number: 8612523

Abstract: Botnet attacks may be detected by collecting samples of spam messages, forming clusters of related spam messages, and identifying the source or sources of the related spam messages. The related spam messages may be identified as those generated using the same template. For example, spam messages generated using the same image template, text template, or both may be deemed as related. To find related spam messages, images of spam messages may be extracted and compressed using a lossy compression algorithm. The compressed images may then be compared to one another to identify those generated using the same image template. The lossy compression algorithm may involve dividing an image into several blocks and then computing a value for each block for comparison.
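
The block-based comparison described at the end of this abstract might look roughly like the sketch below. It assumes a grayscale image supplied as a list of rows of 0-255 values, and it uses a coarsely quantized block mean as the per-block value. The grid size, the number of quantization levels, and the name `block_signature` are illustrative assumptions, not details taken from the patent.

```python
def block_signature(pixels, blocks_per_side=2, levels=4):
    """Reduce a grayscale image to one quantized mean value per block.

    The reduction is lossy on purpose: small pixel-level differences
    between two images rendered from the same template tend to vanish.
    """
    height, width = len(pixels), len(pixels[0])
    signature = []
    for by in range(blocks_per_side):
        for bx in range(blocks_per_side):
            rows = range(by * height // blocks_per_side,
                         (by + 1) * height // blocks_per_side)
            cols = range(bx * width // blocks_per_side,
                         (bx + 1) * width // blocks_per_side)
            values = [pixels[y][x] for y in rows for x in cols]
            mean = sum(values) / len(values)
            # Quantize the block mean into `levels` coarse buckets.
            signature.append(int(mean * levels / 256))
    return tuple(signature)


template = [[10, 10, 200, 200],
            [10, 10, 200, 200],
            [90, 90, 250, 250],
            [90, 90, 250, 250]]
noisy_copy = [[12, 9, 198, 203],
              [11, 10, 201, 199],
              [88, 92, 249, 251],
              [91, 90, 250, 248]]
unrelated = [[128] * 4 for _ in range(4)]

print(block_signature(template) == block_signature(noisy_copy))  # True
print(block_signature(template) == block_signature(unrelated))   # False
```

Images whose signatures match would be treated as candidates for the same cluster; a production system would likely tolerate near matches rather than require exact equality.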

Type: Grant

Filed: May 22, 2007

Date of Patent: December 17, 2013

Assignee: Trend Micro Incorporated

Inventors: Jonathan James Oliver, Yun-Chian Cheng

Method and arrangement for automatic charset detection

Patent number: 8560466

Abstract: The invention relates, in an embodiment, to a method for handling a received document. The method includes receiving a plurality of text document samples. The method includes training, using a plurality of text document samples, to obtain a set of machine learning models. Training includes generating fundamental units from the plurality of text document samples for charsets of the plurality of text document samples. Training includes extracting a subset of said fundamental units as feature lists and converting the feature lists into a set of feature vectors. Training further includes generating the set of machine learning models from the set of feature vectors. The method includes applying the set of machine learning models against a set of target document feature vectors converted from the received document. The method includes decoding the received document to obtain decoded content of the received document based on at least the first encoding scheme.

Type: Grant

Filed: February 26, 2010

Date of Patent: October 15, 2013

Assignee: Trend Micro Incorporated

Inventors: Lili Diao, Yun-Chian Cheng

Techniques for identifying spam e-mail

Patent number: 8495144

Abstract: In one embodiment, a support vector machine is employed to compute a spam threshold and weights of tokens and heuristic rules. An incoming e-mail is parsed to determine if it contains one or more of the tokens. Tokens identified to be in the e-mail are then used to determine if the e-mail satisfies one or more heuristic rules. The weights of tokens found in the e-mail and the weights of the heuristic rules satisfied by the e-mail may be employed in the computation of a spam score. The spam score may be compared to the spam threshold to determine if the e-mail is spam or legitimate.
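
The scoring step described here can be sketched as follows. The abstract says the weights and threshold come from a support vector machine; this sketch skips training and hard-codes them, so every token, rule, weight, and the threshold below is a made-up placeholder. The names `spam_score` and `is_spam` are likewise hypothetical.

```python
import re

# Placeholder values; in the abstract these are computed by an SVM.
TOKEN_WEIGHTS = {"free": 1.0, "offer": 0.75, "click": 0.5, "prize": 1.25}
HEURISTIC_RULES = {
    # rule name: (weight, predicate over the set of tokens found)
    "offer_and_free": (1.5, lambda found: {"offer", "free"} <= found),
    "many_tokens": (1.0, lambda found: len(found) >= 3),
}
SPAM_THRESHOLD = 3.0


def spam_score(text: str) -> float:
    """Sum the weights of tokens found and of heuristic rules satisfied."""
    words = set(re.findall(r"[a-z']+", text.lower()))
    found = {word for word in words if word in TOKEN_WEIGHTS}
    score = sum(TOKEN_WEIGHTS[token] for token in found)
    for weight, rule in HEURISTIC_RULES.values():
        if rule(found):
            score += weight
    return score


def is_spam(text: str) -> bool:
    """Compare the score against the (placeholder) spam threshold."""
    return spam_score(text) > SPAM_THRESHOLD


print(spam_score("Free offer! Click now to claim your free prize"))  # 6.0
print(is_spam("Meeting notes attached, please review before Friday"))  # False
```

Note that the rules operate on the set of tokens already found in the message, mirroring the abstract's two-stage flow of token detection followed by rule evaluation.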

Type: Grant

Filed: October 6, 2004

Date of Patent: July 23, 2013

Assignee: Trend Micro Incorporated

Inventors: Yun-Chian Cheng, Pei-Hsun Yu

Content fingerprinting using context offset sequences

Patent number: 8495733

Abstract: One embodiment relates to a computer-implemented process of content fingerprinting. A context and a content for fingerprinting are received. The context comprises a set of context components for use in generation of content fingerprints. The content includes instances of at least some of the context components. The content is processed to generate context offset sequences, and a fingerprint for the content is formed from at least a portion of the context offset sequences. Another embodiment relates to a computer-implemented process for comparing a target content against a pool of contents. The process includes constructing an automata data structure based on the fingerprints in the pool. Context offset sequences of a target fingerprint are scanned against the automata data structure to determine matched offset subsequences. Other embodiments, aspects and features are also disclosed.
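
The abstract does not define how a context offset sequence is computed, so the following is only one possible reading, offered as an illustration. It assumes each context component is a literal string and that the offset sequence for a component is the list of distances between its successive occurrences in the content. The function name `context_offset_sequences` is hypothetical, and the automata-based matching of the second embodiment is not shown.

```python
def context_offset_sequences(content: str, context: list[str]) -> dict[str, tuple[int, ...]]:
    """Map each context component to the gaps between its occurrences.

    Assumption: an "offset sequence" is the sequence of distances between
    consecutive start positions of the same component. Components that
    occur fewer than twice yield no sequence and are omitted.
    """
    sequences = {}
    for component in context:
        positions = []
        start = content.find(component)
        while start != -1:
            positions.append(start)
            start = content.find(component, start + 1)
        if len(positions) > 1:
            sequences[component] = tuple(
                later - earlier for earlier, later in zip(positions, positions[1:])
            )
    return sequences


fingerprint = context_offset_sequences(
    "the cat and the dog and the bird", ["the", "and", "fish"]
)
print(fingerprint)  # {'the': (12, 12), 'and': (12,)}
```

Under this reading, two documents that share structure around common words would produce similar gap sequences even if the words between them differ.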

Type: Grant

Filed: March 25, 2009

Date of Patent: July 23, 2013

Assignee: Trend Micro Incorporated

Inventors: Yun-Chian Cheng, Ming-Tai Chang, Chung-Chih Wu

Automatic local detection of computer security threats

Patent number: 8424091

Abstract: A system for locally detecting computer security threats in a computer network includes a processing engine, a fingerprint engine, and a detection engine. Data samples are received in the computer network and grouped by the processing engine into clusters. Clusters that do not have high false alarm rates are passed to the fingerprint engine, which generates fingerprints for the clusters. The detection engine scans incoming data for computer security threats using the fingerprints.

Type: Grant

Filed: January 12, 2010

Date of Patent: April 16, 2013

Assignee: Trend Micro Incorporated

Inventors: Chung-Tsai Su, Wen-Kwang Tsao, Chung-Chi Wu, Ming-Tai Chang, Yun-Chian Cheng

Statistical spamming behavior analysis on mail clusters

Patent number: 8291024

Abstract: A clustering technique is utilized to group similar e-mail messages into clusters. Statistical spamming behavior analysis is then applied to each cluster, focusing on finding e-mail messages within each cluster that differ from other e-mail messages in the cluster. The degree of variance and the type of variance can provide important clues as to whether the email is spam or not. Appropriate measures are then taken to block, filter, or otherwise handle the suspected spam e-mail messages.

Type: Grant

Filed: July 31, 2008

Date of Patent: October 16, 2012

Assignee: Trend Micro Incorporated

Inventors: Yun-Chian Cheng, Allen Ming-Tai Chang

Method and arrangement for automatic charset detection

Publication number: 20110213736

Abstract: The invention relates, in an embodiment, to a method for handling a received document. The method includes receiving a plurality of text document samples. The method includes training, using a plurality of text document samples, to obtain a set of machine learning models. Training includes generating fundamental units from the plurality of text document samples for charsets of the plurality of text document samples. Training includes extracting a subset of said fundamental units as feature lists and converting the feature lists into a set of feature vectors. Training further includes generating the set of machine learning models from the set of feature vectors. The method includes applying the set of machine learning models against a set of target document feature vectors converted from the received document. The method includes decoding the received document to obtain decoded content of the received document based on at least the first encoding scheme.

Type: Application

Filed: February 26, 2010

Publication date: September 1, 2011

Inventors: Lili Diao, Yun-Chian Cheng

Automatic charset detection using support vector machines with charset grouping

Patent number: 7689531

Abstract: The invention relates, in an embodiment, to a computer-implemented method for automatic charset detection, which includes detecting an encoding scheme of a target document. The method includes training, using a plurality of text document samples, to obtain a set of machine learning models. Training includes using a SVM (Support Vector Machine) technique to generate the set of machine learning models from feature vectors obtained from the plurality of text document samples. The method also includes applying the set of machine learning models against a set of target document feature vectors converted from the target document to detect the encoding scheme.

Type: Grant

Filed: September 28, 2005

Date of Patent: March 30, 2010

Assignee: Trend Micro Incorporated

Inventors: Lili Diao, Yun-Chian Cheng

Method and architecture for blocking email spams

Patent number: 7636716

Abstract: In a method for blocking email spams, the header fields and the message body of a received email first are identified. Predefined patterns are identified by matching in the header fields and message body, wherein a data structure of characteristic information is created for each recognized pattern. The characteristic information then are analyzed by rule inference to determine whether the received email is a spam.
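
A minimal sketch of this three-stage flow (split header and body, match predefined patterns into records, apply rule inference) is given below. It uses Python's standard `email` and `re` modules; the patterns, the `PatternHit` record, and the single inference rule are invented for illustration and are not taken from the patent.

```python
import re
from dataclasses import dataclass
from email import message_from_string


@dataclass
class PatternHit:
    """Characteristic information recorded for one recognized pattern."""
    name: str
    location: str      # a header field name, or "body"
    matched_text: str


# Hypothetical predefined patterns: name -> (where to look, regex).
PATTERNS = {
    "all_caps_subject": ("Subject", re.compile(r"^[^a-z]*[A-Z][^a-z]*$")),
    "money_claim": ("body", re.compile(r"\$\d[\d,]*")),
    "urgent_call": ("body", re.compile(r"act now", re.IGNORECASE)),
}


def collect_hits(raw_email: str) -> list[PatternHit]:
    """Identify header fields and body, then match each pattern."""
    message = message_from_string(raw_email)
    hits = []
    for name, (where, regex) in PATTERNS.items():
        # Assumes a simple, non-multipart message with a text body.
        text = message.get_payload() if where == "body" else message.get(where, "")
        match = regex.search(text)
        if match:
            hits.append(PatternHit(name, where, match.group(0)))
    return hits


def infer_spam(hits: list[PatternHit]) -> bool:
    """Hypothetical rule: a money claim plus any pressure signal is spam."""
    names = {hit.name for hit in hits}
    return "money_claim" in names and bool(names & {"urgent_call", "all_caps_subject"})


spam = "From: a@example.com\nSubject: WIN BIG TODAY\n\nAct now to claim $1,000,000 in prizes.\n"
ham = "From: b@example.com\nSubject: Lunch on Friday\n\nSee you at noon.\n"
print(infer_spam(collect_hits(spam)))  # True
print(infer_spam(collect_hits(ham)))   # False
```

Keeping the pattern matches as structured records, rather than a bare score, lets the inference stage combine evidence from different parts of the message.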

Type: Grant

Filed: November 5, 2004

Date of Patent: December 22, 2009

Assignee: Trend Micro Incorporated

Inventor: Yun-Chian Cheng