Patents by Inventor Lili Diao
Lili Diao has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Patent number: 11973791
Abstract: A risk knowledge graph is created from information on risk events involving network entities of a private computer network. Each of the network entities is represented as a node in the risk knowledge graph. The nodes are connected by edges that represent the risk events. The nodes are grouped into communities of related nodes. A response action is performed against a community to mitigate a cybersecurity risk posed by the community.
Type: Grant
Filed: October 4, 2021
Date of Patent: April 30, 2024
Assignee: Trend Micro Incorporated
Inventors: Zhijie Li, ZhengBao Zhang, Lili Diao
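The graph construction described in this abstract can be illustrated with a minimal sketch. It assumes that network entities are the nodes and that each risk event links two entities; the event tuples, entity names, and function names are hypothetical, and plain connected components stand in for whatever community-grouping method the patent actually claims.

```python
from collections import defaultdict


def build_graph(risk_events):
    """Build an undirected adjacency map: entities are nodes, each event is an edge."""
    graph = defaultdict(set)
    for source, target, _event_type in risk_events:
        graph[source].add(target)
        graph[target].add(source)
    return graph


def communities(graph):
    """Group nodes into connected components (a simple stand-in for community detection)."""
    seen = set()
    groups = []
    for start in sorted(graph):
        if start in seen:
            continue
        stack, group = [start], set()
        while stack:
            node = stack.pop()
            if node in seen:
                continue
            seen.add(node)
            group.add(node)
            stack.extend(graph[node] - seen)
        groups.append(group)
    return groups


# Hypothetical risk events: (entity, entity, event type).
events = [
    ("user:alice", "host:laptop-7", "suspicious_login"),
    ("host:laptop-7", "ip:203.0.113.5", "c2_connection"),
    ("user:bob", "host:server-2", "malware_detected"),
]
groups = communities(build_graph(events))
```

A response action (for example, isolating every host in a group) could then be applied to one whole group at a time rather than to individual entities.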
-
Patent number: 11449794
Abstract: Language-based machine learning approach for automatically detecting universal charset and the language of a received document is disclosed. The language-based machine learning approach employs a plurality of text document samples in different languages, after converting them to a selected Unicode style (if their original encoding schemes are not the selected Unicode), to generate a plurality of language-based machine learning models during the training stage. During the application stage, vector representations of the received document for different combinations of charsets and their respective applicable languages are tested against the plurality of machine learning models to ascertain the charset and language combination that is most similar to its associated machine learning model, thereby identifying the charset and language of the received document.
Type: Grant
Filed: August 21, 2019
Date of Patent: September 20, 2022
Assignee: Trend Micro Incorporated
Inventor: Lili Diao
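The train-then-test flow in this abstract can be sketched roughly as follows. This is only an illustration under stated assumptions: character-bigram counts serve as the vector representation and cosine similarity as the comparison, neither of which is taken from the patent; the sample texts, the candidate list, and all function names are hypothetical.

```python
import math
from collections import Counter


def bigram_profile(text):
    """Count character bigrams; this stands in for a learned language model."""
    return Counter(text[i:i + 2] for i in range(len(text) - 1))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[k] * b[k] for k in a if k in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def detect(raw, language_models, candidates):
    """Try each (charset, language) pair and keep the most similar one."""
    best_pair, best_score = None, -1.0
    for charset, language in candidates:
        try:
            text = raw.decode(charset)
        except UnicodeDecodeError:
            continue  # the bytes are not valid in this charset
        score = cosine(bigram_profile(text), language_models[language])
        if score > best_score:
            best_pair, best_score = (charset, language), score
    return best_pair


# Training stage: samples are already Unicode strings, one model per language.
language_models = {
    "english": bigram_profile("the quick brown fox jumps over the lazy dog and the cat"),
    "german": bigram_profile("der schnelle braune fuchs springt über den faulen hund und die katze"),
}
candidates = [
    ("utf-8", "english"), ("utf-8", "german"),
    ("latin-1", "english"), ("latin-1", "german"),
]

# Application stage: a document whose charset and language are unknown.
raw = "über den hund springt die katze".encode("latin-1")
result = detect(raw, language_models, candidates)
```

Here the UTF-8 candidates are rejected because the bytes do not decode, and the German model scores higher than the English one for the Latin-1 decoding.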
-
Patent number: 8935788
Abstract: A two stage virus detection system detects viruses in target files. In the first stage, a training application receives a master virus pattern file recording all known virus patterns and generates a features list containing fundamental virus signatures from the virus patterns, a novelty detection model, a classification model, and a set of segmented virus pattern files. In the second stage, a detection application scans a target file for viruses using the generated outputs from the first stage rather than using the master virus pattern file directly to do traditional pattern matching. The results of the scan can vary in detail depending on a fuzzy scan level. For fuzzy scan level “1,” the existence of a virus is returned. For fuzzy scan level “2,” the grand virus type found is returned. For fuzzy scan level “3,” the exact virus name is returned.
Type: Grant
Filed: October 15, 2008
Date of Patent: January 13, 2015
Assignee: Trend Micro Inc.
Inventors: Lili Diao, Vincent Chan, Patrick Mg Lu
-
Patent number: 8838992
Abstract: A machine learning model is used to identify normal scripts in a client computer. The machine learning model may be built by training using samples of known normal scripts and samples of known potentially malicious scripts and may take into account lexical and semantic characteristics of the sample scripts. The machine learning model and a feature set may be provided to the client computer by a server computer. In the client computer, the machine learning model may be used to classify a target script. The target script does not have to be evaluated for malicious content when classified as a normal script. Otherwise, when the target script is classified as a potentially malicious script, the target script may have to be further evaluated by an anti-malware or sent to a back-end system.
Type: Grant
Filed: April 28, 2011
Date of Patent: September 16, 2014
Assignee: Trend Micro Incorporated
Inventors: Xuewen Zhu, Lili Diao, Da Li, Dibin Tang
-
Patent number: 8699796
Abstract: One embodiment relates to a method of identifying sensitive expressions in images for a language with a large alphabet. The method is performed using a computer and includes (i) extracting an image from a message, (ii) extracting image character-blocks (i.e. normalized pixel graphs) from the image, and (iii) predicting characters to which the character-blocks correspond using a multi-class learning model, wherein the multi-class learning model is trained using a derived list of sensitive characters which is a subset of the large alphabet. In addition, (iv) the characters may be combined into string text, and (v) the string text may be searched for matches with a predefined list of sensitive expressions. Another embodiment relates to a method of training a multi-class learning model so that the model predicts characters to which image character-blocks correspond. Other embodiments, aspects and features are also disclosed herein.
Type: Grant
Filed: November 11, 2008
Date of Patent: April 15, 2014
Assignee: Trend Micro Incorporated
Inventors: Lili Diao, Jonathan J. Oliver
-
Patent number: 8560466
Abstract: The invention relates, in an embodiment, to a method for handling a received document. The method includes receiving a plurality of text document samples. The method includes training, using a plurality of text document samples, to obtain a set of machine learning models. Training includes generating fundamental units from the plurality of text document samples for charsets of the plurality of text document samples. Training includes extracting a subset of said fundamental units as feature lists and converting the feature lists into a set of feature vectors. Training further includes generating the set of machine learning models from the set of feature vectors. The method includes applying the set of machine learning models against a set of target document feature vectors converted from the received document. The method includes decoding the received document to obtain decoded content of the received document based on at least the first encoding scheme.
Type: Grant
Filed: February 26, 2010
Date of Patent: October 15, 2013
Assignee: Trend Micro Incorporated
Inventors: Lili Diao, Yun-chian Cheng
-
Patent number: 8375450
Abstract: A training model for malware detection is developed using common substrings extracted from known malware samples. The probability of each substring occurring within a malware family is determined and a decision tree is constructed using the substrings. An enterprise server receives indications from client machines that a particular file is suspected of being malware. The suspect file is retrieved and the decision tree is walked using the suspect file. A leaf node is reached that identifies a particular common substring, a byte offset within the suspect file at which it is likely that the common substring begins, and a probability distribution that the common substring appears in a number of malware families. A hash value of the common substring is compared (exact or approximate) against the corresponding substring in the suspect file. If positive, a result is returned to the enterprise server indicating the probability that the suspect file is a member of a particular malware family.
Type: Grant
Filed: October 5, 2009
Date of Patent: February 12, 2013
Assignee: Trend Micro, Inc.
Inventors: Jonathan James Oliver, Cheng-Lin Hou, Lili Diao, YiFun Liang, Jennifer Rihn
-
Patent number: 8260054
Abstract: A method for matching an image-form textual string in an image to a regular expression is disclosed. The method includes constructing a representation of the regular expression and generating a candidate string of characters from the image-form textual string. The method further includes ascertaining whether there exists a match between the image-form textual string and the regular expression; the match is deemed achieved if a probability value associated with the match is above a predetermined matching threshold.
Type: Grant
Filed: September 22, 2008
Date of Patent: September 4, 2012
Assignee: Trend Micro Incorporated
Inventors: Jonathan James Oliver, Lili Diao
-
Patent number: 8023974
Abstract: In one embodiment, a content filtering system generates a support vector machine (SVM) learning model in a server computer and provides the SVM learning model to a mobile phone for use in classifying text messages. The SVM learning model may be generated in the server computer by training a support vector machine with sample text messages that include spam and legitimate text messages. A resulting intermediate SVM learning model from the support vector machine may include a threshold value, support vectors and alpha values. The SVM learning model in the mobile phone may include the threshold value, the features, and the weights of the features. An incoming text message may be parsed for the features. The weights of features found in the incoming text message may be added and compared to the threshold value to determine whether or not the incoming text message is spam.
Type: Grant
Filed: February 15, 2007
Date of Patent: September 20, 2011
Assignee: Trend Micro Incorporated
Inventors: Lili Diao, Vincent Chan, Patrick MG Lu
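The on-phone classification step described in this abstract (add the weights of the features found, then compare with the threshold) can be sketched as below. The feature words, weights, and threshold are made-up values, whitespace tokenization is a simplification, and treating a score above the threshold as spam is an assumption about the direction of the comparison.

```python
def is_spam(message, weights, threshold):
    """Sum the weights of the features present in the message and compare to the threshold."""
    tokens = set(message.lower().split())
    score = sum(weight for feature, weight in weights.items() if feature in tokens)
    # Assumption: a score above the threshold means spam.
    return score > threshold


# Hypothetical compact model as it might be shipped to the phone.
weights = {"free": 1.2, "winner": 1.5, "claim": 0.8, "meeting": -1.0}
threshold = 2.0

spam_result = is_spam("You are a WINNER claim your FREE prize", weights, threshold)
ham_result = is_spam("Meeting moved to 3pm", weights, threshold)
```

Reducing the model to per-feature weights keeps the phone-side work to a lookup and a sum, which matches the abstract's split between server-side training and lightweight client-side classification.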
-
Publication number: 20110213736
Abstract: The invention relates, in an embodiment, to a method for handling a received document. The method includes receiving a plurality of text document samples. The method includes training, using a plurality of text document samples, to obtain a set of machine learning models. Training includes generating fundamental units from the plurality of text document samples for charsets of the plurality of text document samples. Training includes extracting a subset of said fundamental units as feature lists and converting the feature lists into a set of feature vectors. Training further includes generating the set of machine learning models from the set of feature vectors. The method includes applying the set of machine learning models against a set of target document feature vectors converted from the received document. The method includes decoding the received document to obtain decoded content of the received document based on at least the first encoding scheme.
Type: Application
Filed: February 26, 2010
Publication date: September 1, 2011
Inventors: Lili Diao, Yun-chian Cheng
-
Patent number: 7827133
Abstract: The invention relates, in an embodiment, to a computer-implemented method for handling a target document, the target document having been transmitted electronically and involving an encoding scheme. The method includes training, using a plurality of text document samples, to obtain a set of machine learning models. Training includes using SIM (Similarity Algorithm) to generate the set of machine learning models from feature vectors obtained from the plurality of text document samples. The method also includes applying the set of machine learning models against a set of target document feature vectors converted from the target document to detect the encoding scheme. The method includes decoding the target document to obtain decoded content of the document based on at least the first encoding scheme.
Type: Grant
Filed: February 26, 2010
Date of Patent: November 2, 2010
Assignee: Trend Micro Inc.
Inventor: Lili Diao
-
Patent number: 7756535
Abstract: In one embodiment, a content filtering system includes a feature list and a learning model. The feature list may be a subset of a dictionary that was used to train the content filtering system to identify classification (e.g., spam, phishing, porn, legitimate text messages, etc.) of text messages during a training stage. The learning model may include representative vectors, each of which represents a particular class of text messages. The learning model and the feature list may be generated in a server computer during the training stage and then subsequently provided to the mobile phone. An incoming text message in the mobile phone may be parsed for occurrences of feature words included in the feature list and then converted to an input vector. The input vector may be compared to the learning model to determine the classification of the incoming text message.
Type: Grant
Filed: July 7, 2006
Date of Patent: July 13, 2010
Assignee: Trend Micro Incorporated
Inventors: Lili Diao, Jackie Cao, Vincent Chan
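The parse, vectorize, and compare steps in this abstract can be sketched as follows. The abstract does not say how the input vector is compared to the representative vectors, so cosine similarity is used here purely as an assumption; the feature list, the representative vectors, and the function names are likewise hypothetical.

```python
import math


def to_vector(message, feature_list):
    """Count occurrences of each feature word to form the input vector."""
    tokens = message.lower().split()
    return [tokens.count(word) for word in feature_list]


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def classify(message, feature_list, model):
    """Return the class whose representative vector is closest to the input vector."""
    vector = to_vector(message, feature_list)
    return max(model, key=lambda label: cosine(vector, model[label]))


# Hypothetical feature list and learning model as generated on the server.
feature_list = ["free", "prize", "password", "verify", "lunch"]
model = {
    "spam": [1.0, 1.0, 0.0, 0.0, 0.0],
    "phishing": [0.0, 0.0, 1.0, 1.0, 0.0],
    "legitimate": [0.0, 0.0, 0.0, 0.0, 1.0],
}

label = classify("verify your password now", feature_list, model)
```

Because each class is summarized by a single vector, the phone only needs one similarity computation per class for each incoming message.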
-
Publication number: 20100153320
Abstract: The invention relates, in an embodiment, to a computer-implemented method for handling a target document, the target document having been transmitted electronically and involving an encoding scheme. The method includes training, using a plurality of text document samples, to obtain a set of machine learning models. Training includes using SIM (Similarity Algorithm) to generate the set of machine learning models from feature vectors obtained from the plurality of text document samples. The method also includes applying the set of machine learning models against a set of target document feature vectors converted from the target document to detect the encoding scheme. The method includes decoding the target document to obtain decoded content of the document based on at least the first encoding scheme.
Type: Application
Filed: February 26, 2010
Publication date: June 17, 2010
Inventor: Lili Diao
-
Patent number: 7711673
Abstract: The invention relates, in an embodiment, to a computer-implemented method for automatic charset detection, which includes detecting an encoding scheme of a target document. The method includes training, using a plurality of text document samples, to obtain a set of machine learning models. Training includes using SIM (Similarity Algorithm) to generate the set of machine learning models from feature vectors obtained from the plurality of text document samples. The method also includes applying the set of machine learning models against a set of target document feature vectors converted from the target document to detect the encoding scheme.
Type: Grant
Filed: September 28, 2005
Date of Patent: May 4, 2010
Assignee: Trend Micro Incorporated
Inventor: Lili Diao
-
Patent number: 7689531
Abstract: The invention relates, in an embodiment, to a computer-implemented method for automatic charset detection, which includes detecting an encoding scheme of a target document. The method includes training, using a plurality of text document samples, to obtain a set of machine learning models. Training includes using a SVM (Support Vector Machine) technique to generate the set of machine learning models from feature vectors obtained from the plurality of text document samples. The method also includes applying the set of machine learning models against a set of target document feature vectors converted from the target document to detect the encoding scheme.
Type: Grant
Filed: September 28, 2005
Date of Patent: March 30, 2010
Assignee: Trend Micro Incorporated
Inventors: Lili Diao, Yun-chian Cheng
-
Publication number: 20100074534
Abstract: A method for matching an image-form textual string in an image to a regular expression is disclosed. The method includes constructing a representation of the regular expression and generating a candidate string of characters from the image-form textual string. The method further includes ascertaining whether there exists a match between the image-form textual string and the regular expression; the match is deemed achieved if a probability value associated with the match is above a predetermined matching threshold.
Type: Application
Filed: September 22, 2008
Publication date: March 25, 2010
Inventors: Jonathan James Oliver, Lili Diao