Patents by Inventor Lili Diao

Lili Diao has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 11973791
    Abstract: A risk knowledge graph is created from information on risk events involving network entities of a private computer network. Each of the network entities is represented as a node in the risk knowledge graph. The nodes are connected by edges that represent the risk events. The nodes are grouped into communities of related nodes. A response action is performed against a community to mitigate a cybersecurity risk posed by the community.
    Type: Grant
    Filed: October 4, 2021
    Date of Patent: April 30, 2024
    Assignee: Trend Micro Incorporated
    Inventors: Zhijie Li, ZhengBao Zhang, Lili Diao
  • Patent number: 11449794
    Abstract: A language-based machine learning approach for automatically detecting the universal charset and the language of a received document is disclosed. The language-based machine learning approach employs a plurality of text document samples in different languages, after converting them to a selected Unicode style (if their original encoding schemes are not the selected Unicode), to generate a plurality of language-based machine learning models during the training stage. During the application stage, vector representations of the received document for different combinations of charsets and their respective applicable languages are tested against the plurality of machine learning models to ascertain the charset and language combination that is most similar to its associated machine learning model, thereby identifying the charset and language of the received document.
    Type: Grant
    Filed: August 21, 2019
    Date of Patent: September 20, 2022
    Assignee: Trend Micro Incorporated
    Inventor: Lili Diao
  • Patent number: 8935788
    Abstract: A two stage virus detection system detects viruses in target files. In the first stage, a training application receives a master virus pattern file recording all known virus patterns and generates a features list containing fundamental virus signatures from the virus patterns, a novelty detection model, a classification model, and a set of segmented virus pattern files. In the second stage, a detection application scans a target file for viruses using the generated outputs from the first stage rather than using the master virus pattern file directly to do traditional pattern matching. The results of the scan can vary in detail depending on a fuzzy scan level. For fuzzy scan level “1,” the existence of a virus is returned. For fuzzy scan level “2,” the grand virus type found is returned. For fuzzy scan level “3,” the exact virus name is returned.
    Type: Grant
    Filed: October 15, 2008
    Date of Patent: January 13, 2015
    Assignee: Trend Micro Inc.
    Inventors: Lili Diao, Vincent Chan, Patrick MG Lu
  • Patent number: 8838992
    Abstract: A machine learning model is used to identify normal scripts in a client computer. The machine learning model may be built by training using samples of known normal scripts and samples of known potentially malicious scripts and may take into account lexical and semantic characteristics of the sample scripts. The machine learning model and a feature set may be provided to the client computer by a server computer. In the client computer, the machine learning model may be used to classify a target script. The target script does not have to be evaluated for malicious content when classified as a normal script. Otherwise, when the target script is classified as a potentially malicious script, the target script may have to be further evaluated by an anti-malware or sent to a back-end system.
    Type: Grant
    Filed: April 28, 2011
    Date of Patent: September 16, 2014
    Assignee: Trend Micro Incorporated
    Inventors: Xuewen Zhu, Lili Diao, Da Li, Dibin Tang
  • Patent number: 8699796
    Abstract: One embodiment relates to a method of identifying sensitive expressions in images for a language with a large alphabet. The method is performed using a computer and includes (i) extracting an image from a message, (ii) extracting image character-blocks (i.e. normalized pixel graphs) from the image, and (iii) predicting characters to which the character-blocks correspond using a multi-class learning model, wherein the multi-class learning model is trained using a derived list of sensitive characters which is a subset of the large alphabet. In addition, (iv) the characters may be combined into string text, and (v) the string text may be searched for matches with a predefined list of sensitive expressions. Another embodiment relates to a method of training a multi-class learning model so that the model predicts characters to which image character-blocks correspond. Other embodiments, aspects and features are also disclosed herein.
    Type: Grant
    Filed: November 11, 2008
    Date of Patent: April 15, 2014
    Assignee: Trend Micro Incorporated
    Inventors: Lili Diao, Jonathan J. Oliver
  • Patent number: 8560466
    Abstract: The invention relates, in an embodiment, to a method for handling a received document. The method includes receiving a plurality of text document samples. The method includes training, using a plurality of text document samples, to obtain a set of machine learning models. Training includes generating fundamental units from the plurality of text document samples for charsets of the plurality of text document samples. Training includes extracting a subset of said fundamental units as feature lists and converting the feature lists into a set of feature vectors. Training further includes generating the set of machine learning models from the set of feature vectors. The method includes applying the set of machine learning models against a set of target document feature vectors converted from the received document. The method includes decoding the received document to obtain decoded content of the received document based on at least the first encoding scheme.
    Type: Grant
    Filed: February 26, 2010
    Date of Patent: October 15, 2013
    Assignee: Trend Micro Incorporated
    Inventors: Lili Diao, Yun-chian Cheng
  • Patent number: 8375450
    Abstract: A training model for malware detection is developed using common substrings extracted from known malware samples. The probability of each substring occurring within a malware family is determined and a decision tree is constructed using the substrings. An enterprise server receives indications from client machines that a particular file is suspected of being malware. The suspect file is retrieved and the decision tree is walked using the suspect file. A leaf node is reached that identifies a particular common substring, a byte offset within the suspect file at which it is likely that the common substring begins, and a probability distribution that the common substring appears in a number of malware families. A hash value of the common substring is compared (exact or approximate) against the corresponding substring in the suspect file. If positive, a result is returned to the enterprise server indicating the probability that the suspect file is a member of a particular malware family.
    Type: Grant
    Filed: October 5, 2009
    Date of Patent: February 12, 2013
    Assignee: Trend Micro, Inc.
    Inventors: Jonathan James Oliver, Cheng-Lin Hou, Lili Diao, YiFun Liang, Jennifer Rihn
  • Patent number: 8260054
    Abstract: A method for matching an image-form textual string in an image to a regular expression is disclosed. The method includes constructing a representation of the regular expression and generating a candidate string of characters from the image-form textual string. The method further includes ascertaining whether there exists a match between the image-form textual string and the regular expression; the match is deemed achieved if a probability value associated with the match is above a predetermined matching threshold.
    Type: Grant
    Filed: September 22, 2008
    Date of Patent: September 4, 2012
    Assignee: Trend Micro Incorporated
    Inventors: Jonathan James Oliver, Lili Diao
  • Patent number: 8023974
    Abstract: In one embodiment, a content filtering system generates a support vector machine (SVM) learning model in a server computer and provides the SVM learning model to a mobile phone for use in classifying text messages. The SVM learning model may be generated in the server computer by training a support vector machine with sample text messages that include spam and legitimate text messages. A resulting intermediate SVM learning model from the support vector machine may include a threshold value, support vectors and alpha values. The SVM learning model in the mobile phone may include the threshold value, the features, and the weights of the features. An incoming text message may be parsed for the features. The weights of features found in the incoming text message may be added and compared to the threshold value to determine whether or not the incoming text message is spam.
    Type: Grant
    Filed: February 15, 2007
    Date of Patent: September 20, 2011
    Assignee: Trend Micro Incorporated
    Inventors: Lili Diao, Vincent Chan, Patrick MG Lu
  • Publication number: 20110213736
    Abstract: The invention relates, in an embodiment, to a method for handling a received document. The method includes receiving a plurality of text document samples. The method includes training, using a plurality of text document samples, to obtain a set of machine learning models. Training includes generating fundamental units from the plurality of text document samples for charsets of the plurality of text document samples. Training includes extracting a subset of said fundamental units as feature lists and converting the feature lists into a set of feature vectors. Training further includes generating the set of machine learning models from the set of feature vectors. The method includes applying the set of machine learning models against a set of target document feature vectors converted from the received document. The method includes decoding the received document to obtain decoded content of the received document based on at least the first encoding scheme.
    Type: Application
    Filed: February 26, 2010
    Publication date: September 1, 2011
    Inventors: Lili Diao, Yun-chian Cheng
  • Patent number: 7827133
    Abstract: The invention relates, in an embodiment, to a computer-implemented method for handling a target document, the target document having been transmitted electronically and involving an encoding scheme. The method includes training, using a plurality of text document samples, to obtain a set of machine learning models. Training includes using SIM (Similarity Algorithm) to generate the set of machine learning models from feature vectors obtained from the plurality of text document samples. The method also includes applying the set of machine learning models against a set of target document feature vectors converted from the target document to detect the encoding scheme. The method also includes decoding the target document to obtain decoded content of the document based on at least the first encoding scheme.
    Type: Grant
    Filed: February 26, 2010
    Date of Patent: November 2, 2010
    Assignee: Trend Micro Inc.
    Inventor: Lili Diao
  • Patent number: 7756535
    Abstract: In one embodiment, a content filtering system includes a feature list and a learning model. The feature list may be a subset of a dictionary that was used to train the content filtering system to identify classification (e.g., spam, phishing, porn, legitimate text messages, etc.) of text messages during a training stage. The learning model may include representative vectors, each of which represents a particular class of text messages. The learning model and the feature list may be generated in a server computer during the training stage and then subsequently provided to the mobile phone. An incoming text message in the mobile phone may be parsed for occurrences of feature words included in the feature list and then converted to an input vector. The input vector may be compared to the learning model to determine the classification of the incoming text message.
    Type: Grant
    Filed: July 7, 2006
    Date of Patent: July 13, 2010
    Assignee: Trend Micro Incorporated
    Inventors: Lili Diao, Jackie Cao, Vincent Chan
  • Publication number: 20100153320
    Abstract: The invention relates, in an embodiment, to a computer-implemented method for handling a target document, the target document having been transmitted electronically and involving an encoding scheme. The method includes training, using a plurality of text document samples, to obtain a set of machine learning models. Training includes using SIM (Similarity Algorithm) to generate the set of machine learning models from feature vectors obtained from the plurality of text document samples. The method also includes applying the set of machine learning models against a set of target document feature vectors converted from the target document to detect the encoding scheme. The method also includes decoding the target document to obtain decoded content of the document based on at least the first encoding scheme.
    Type: Application
    Filed: February 26, 2010
    Publication date: June 17, 2010
    Inventor: Lili Diao
  • Patent number: 7711673
    Abstract: The invention relates, in an embodiment, to a computer-implemented method for automatic charset detection, which includes detecting an encoding scheme of a target document. The method includes training, using a plurality of text document samples, to obtain a set of machine learning models. Training includes using SIM (Similarity Algorithm) to generate the set of machine learning models from feature vectors obtained from the plurality of text document samples. The method also includes applying the set of machine learning models against a set of target document feature vectors converted from the target document to detect the encoding scheme.
    Type: Grant
    Filed: September 28, 2005
    Date of Patent: May 4, 2010
    Assignee: Trend Micro Incorporated
    Inventor: Lili Diao
  • Patent number: 7689531
    Abstract: The invention relates, in an embodiment, to a computer-implemented method for automatic charset detection, which includes detecting an encoding scheme of a target document. The method includes training, using a plurality of text document samples, to obtain a set of machine learning models. Training includes using a SVM (Support Vector Machine) technique to generate the set of machine learning models from feature vectors obtained from the plurality of text document samples. The method also includes applying the set of machine learning models against a set of target document feature vectors converted from the target document to detect the encoding scheme.
    Type: Grant
    Filed: September 28, 2005
    Date of Patent: March 30, 2010
    Assignee: Trend Micro Incorporated
    Inventors: Lili Diao, Yun-chian Cheng
  • Publication number: 20100074534
    Abstract: A method for matching an image-form textual string in an image to a regular expression is disclosed. The method includes constructing a representation of the regular expression and generating a candidate string of characters from the image-form textual string. The method further includes ascertaining whether there exists a match between the image-form textual string and the regular expression; the match is deemed achieved if a probability value associated with the match is above a predetermined matching threshold.
    Type: Application
    Filed: September 22, 2008
    Publication date: March 25, 2010
    Inventors: Jonathan James Oliver, Lili Diao
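
Illustrative sketch (not part of any patent record): the abstract of patent 8023974 above describes a phone-side classifier that parses an incoming text message for features, adds the weights of the features found, and compares the sum to a threshold value. The short Python sketch below shows one way that final step could look. Everything in it is an assumption made for illustration: the whitespace tokenizer, the names spam_score and is_spam, the example weights, counting each feature once per message, and treating a sum above the threshold as spam. The abstract specifies none of these details.

```python
from typing import Dict

# Hypothetical per-feature weights, as if derived on the server from a
# trained SVM; the values here are made up for the example.
EXAMPLE_WEIGHTS: Dict[str, float] = {
    "free": 1.5,
    "winner": 2.0,
    "meeting": -1.0,
}


def spam_score(message: str, weights: Dict[str, float]) -> float:
    """Sum the weights of the known features that occur in the message.

    Assumption: features are lowercase whitespace-separated tokens and
    each feature counts once, however many times it appears.
    """
    tokens = set(message.lower().split())
    return sum(weights[token] for token in tokens if token in weights)


def is_spam(message: str, weights: Dict[str, float], threshold: float) -> bool:
    """Flag the message as spam when its score exceeds the threshold.

    Assumption: "above the threshold means spam"; the abstract only says
    the sum is compared to the threshold value.
    """
    return spam_score(message, weights) > threshold


if __name__ == "__main__":
    for text in ("Free prize for the WINNER", "meeting at noon, free lunch"):
        print(text, "->", is_spam(text, EXAMPLE_WEIGHTS, threshold=2.0))
```

Only a weight lookup and a sum are needed at classification time, which is consistent with the abstract's split between training on a server computer and classifying on a mobile phone.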