Patents by Inventor Pidong Wang

Pidong Wang has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Publication number: 20230025739
    Abstract: Aspects of the technology employ a machine translation quality prediction (MTQP) model to refine datasets that are used in training machine translation systems. This includes receiving, by a machine translation quality prediction model, a sentence pair of a source sentence and a translated output (802), and then performing feature extraction on the sentence pair using a set of two or more feature extractors, where each feature extractor generates a corresponding feature vector (804). The corresponding feature vectors from the set of feature extractors are concatenated together (806), and the concatenated feature vectors are applied to a feedforward neural network, which generates a machine translation quality prediction score for the translated output (808).
    Type: Application
    Filed: June 29, 2022
    Publication date: January 26, 2023
    Inventors: Junpei Zhou, Yuezhang Li, Ciprian Chelba, Fangxiaoyu Feng, Bowen Liang, Pidong Wang
  • Publication number: 20200380413
    Abstract: Characteristics of a plurality of users of a client application are received. A recommendation model and a scaling model are generated based on the characteristics of the plurality of users. Recommendation scores are determined for the plurality of users using the recommendation model and scaling scores are determined for the plurality of users using the scaling model. One or more items of content are presented to one or more users of the plurality of users based on corresponding recommendation scores and scaling scores of the one or more users.
    Type: Application
    Filed: April 10, 2020
    Publication date: December 3, 2020
    Inventors: Wah Loon Keng, Pidong Wang, Karthik Jayasurya, Brendan Burke, Christopher Lam
  • Patent number: 10769387
    Abstract: Implementations of the present disclosure are directed to a method, a system, and an article for translating chat messages. An example method can include: receiving an electronic text message from a client device of a user; normalizing the electronic text message to generate a normalized text message; tagging at least one phrase in the normalized text message with a marker to generate a tagged text message, the marker indicating that the at least one phrase will be translated using a rule-based system; translating the tagged text message using the rule-based system and a machine translation system to generate an initial translation; and post-processing the initial translation to generate a final translation.
    Type: Grant
    Filed: September 19, 2018
    Date of Patent: September 8, 2020
    Assignee: MZ IP Holdings, LLC
    Inventors: Pidong Wang, Nikhil Bojja, Shiman Guo
  • Patent number: 10765956
    Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for receiving a plurality of word strings in a first language, each received word string comprising a plurality of words, identifying one or more named entities in each received word string using a statistical classifier that was trained using training data comprising a plurality of features, wherein one of the features is a word shape feature that comprises a respective token for each letter of a respective word wherein each token signifies a case of the letter or whether the letter is a digit, and translating the received word strings from the first language to a second language including preserving the respective identified named entities in each received word string during translation.
    Type: Grant
    Filed: January 7, 2016
    Date of Patent: September 8, 2020
    Assignee: Machine Zone Inc.
    Inventors: Nikhil Bojja, Shivasankari Kannan, Pidong Wang
  • Patent number: 10699073
    Abstract: Implementations of the present disclosure are directed to a method, a system, and a computer program storage device for identifying a language in a message. Non-language characters are removed from a text message to generate a sanitized text message. An alphabet and/or a script are detected in the sanitized text message by performing at least one of (i) an alphabet-based language detection test to determine a first set of scores and (ii) a script-based language detection test to determine a second set of scores. Each score in the first set of scores represents a likelihood that the sanitized text message includes the alphabet for one of a plurality of different languages. Each score in the second set of scores represents a likelihood that the sanitized text message includes the script for one of the plurality of different languages.
    Type: Grant
    Filed: December 5, 2018
    Date of Patent: June 30, 2020
    Assignee: MZ IP Holdings, LLC
    Inventors: Nikhil Bojja, Pidong Wang, Shiman Guo
  • Publication number: 20190108214
    Abstract: Implementations of the present disclosure are directed to a method, a system, and a computer program storage device for identifying a language in a message. Non-language characters are removed from a text message to generate a sanitized text message. An alphabet and/or a script are detected in the sanitized text message by performing at least one of (i) an alphabet-based language detection test to determine a first set of scores and (ii) a script-based language detection test to determine a second set of scores. Each score in the first set of scores represents a likelihood that the sanitized text message includes the alphabet for one of a plurality of different languages. Each score in the second set of scores represents a likelihood that the sanitized text message includes the script for one of the plurality of different languages.
    Type: Application
    Filed: December 5, 2018
    Publication date: April 11, 2019
    Inventors: Nikhil Bojja, Pidong Wang, Shiman Guo
  • Publication number: 20190087466
    Abstract: Implementations of the present disclosure are directed to a method, a system, and an article for suggesting emojis in electronic communication. An example method can include: providing a trie data structure on a client device, the trie data structure storing a dictionary and including a plurality of nodes, wherein at least one node in the trie data structure includes a children array including at least one of (i) an integer index for identifying a child node and (ii) an array size corresponding to a number of child nodes for the at least one node; detecting, by the client device, at least one character entered by a user in a user interface of the client device; identifying, using the trie data structure, at least one emoji corresponding to the at least one character; and presenting the at least one emoji in the user interface for user selection.
    Type: Application
    Filed: September 19, 2018
    Publication date: March 21, 2019
    Inventors: Pidong Wang, Shivasankari Kannan, Nikhil Bojja
  • Publication number: 20190087417
    Abstract: Implementations of the present disclosure are directed to a method, a system, and an article for translating chat messages. An example method can include: receiving an electronic text message from a client device of a user; normalizing the electronic text message to generate a normalized text message; tagging at least one phrase in the normalized text message with a marker to generate a tagged text message, the marker indicating that the at least one phrase will be translated using a rule-based system; translating the tagged text message using the rule-based system and a machine translation system to generate an initial translation; and post-processing the initial translation to generate a final translation.
    Type: Application
    Filed: September 19, 2018
    Publication date: March 21, 2019
    Inventors: Pidong Wang, Nikhil Bojja, Shiman Guo
  • Patent number: 10162811
    Abstract: Implementations of the present disclosure are directed to a method, a system, and a computer program storage device for identifying a language in a message. Non-language characters are removed from a text message to generate a sanitized text message. An alphabet and/or a script are detected in the sanitized text message by performing at least one of (i) an alphabet-based language detection test to determine a first set of scores and (ii) a script-based language detection test to determine a second set of scores. Each score in the first set of scores represents a likelihood that the sanitized text message includes the alphabet for one of a plurality of different languages. Each score in the second set of scores represents a likelihood that the sanitized text message includes the script for one of the plurality of different languages.
    Type: Grant
    Filed: October 3, 2016
    Date of Patent: December 25, 2018
    Assignee: MZ IP Holdings, LLC
    Inventors: Nikhil Bojja, Pidong Wang, Shiman Guo
  • Publication number: 20180270605
    Abstract: Systems and methods herein include receiving location-based data associated with each of a plurality of users. The location-based data can be published to a first channel of a plurality of channels and, based on a location associated with a first user of the plurality of users, a computer processing device determines an analysis result associated with the first user. The analysis result can be published to a second channel, and an output to the first user device can be generated in response to the analysis result.
    Type: Application
    Filed: March 20, 2018
    Publication date: September 20, 2018
    Inventors: Pidong Wang, Satheeshkumar Karuppusamy, Urvashi Desai, Shiman Guo
  • Publication number: 20170197152
    Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for receiving a plurality of word strings in a first language, each received word string comprising a plurality of words, identifying one or more named entities in each received word string using a statistical classifier that was trained using training data comprising a plurality of features, wherein one of the features is a word shape feature that comprises a respective token for each letter of a respective word wherein each token signifies a case of the letter or whether the letter is a digit, and translating the received word strings from the first language to a second language including preserving the respective identified named entities in each received word string during translation.
    Type: Application
    Filed: January 7, 2016
    Publication date: July 13, 2017
    Inventors: Nikhil Bojja, Shivasankari Kannan, Pidong Wang
  • Publication number: 20170185581
    Abstract: Implementations of the present disclosure are directed to a method, a system, and an article for suggesting emoji for insertion into a communication having text or other content. A plurality of features corresponding to the communication are obtained and provided to a plurality of emoji detection modules. A set of emoji and first confidence scores are received from each emoji detection module and provided to at least one classifier. A proposed set of candidate emoji and second confidence scores are received from the at least one classifier. A candidate emoji is inserted into the communication.
    Type: Application
    Filed: December 20, 2016
    Publication date: June 29, 2017
    Inventors: Nikhil Bojja, Satheeshkumar Karuppusamy, Pidong Wang, Shivasankari Kannan, Arun Nedunchezhian
  • Publication number: 20170024372
    Abstract: Implementations of the present disclosure are directed to a method, a system, and a computer program storage device for identifying a language in a message. Non-language characters are removed from a text message to generate a sanitized text message. An alphabet and/or a script are detected in the sanitized text message by performing at least one of (i) an alphabet-based language detection test to determine a first set of scores and (ii) a script-based language detection test to determine a second set of scores. Each score in the first set of scores represents a likelihood that the sanitized text message includes the alphabet for one of a plurality of different languages. Each score in the second set of scores represents a likelihood that the sanitized text message includes the script for one of the plurality of different languages.
    Type: Application
    Filed: October 3, 2016
    Publication date: January 26, 2017
    Inventors: Nikhil Bojja, Pidong Wang, Shiman Guo
  • Patent number: 9535896
    Abstract: Implementations of the present disclosure are directed to a method, a system, and a computer program storage device for detecting a language in a text message. A plurality of different language detection tests are performed on a message associated with a user. Each language detection test determines a set of scores representing a likelihood that the message is in one of a plurality of different languages. One or more combinations of the score sets are provided as input to one or more distinct classifiers. Output from each of the classifiers includes a respective indication that the message is in one of the different languages. The language in the message may be identified as being the indicated language from one of the classifiers, based on a confidence score and/or an identified linguistic domain.
    Type: Grant
    Filed: May 23, 2016
    Date of Patent: January 3, 2017
    Assignee: Machine Zone, Inc.
    Inventors: Nikhil Bojja, Pidong Wang, Fredrik Linder, Bartlomiej Puzon
  • Publication number: 20160267070
    Abstract: Implementations of the present disclosure are directed to a method, a system, and a computer program storage device for detecting a language in a text message. A plurality of different language detection tests are performed on a message associated with a user. Each language detection test determines a set of scores representing a likelihood that the message is in one of a plurality of different languages. One or more combinations of the score sets are provided as input to one or more distinct classifiers. Output from each of the classifiers includes a respective indication that the message is in one of the different languages. The language in the message may be identified as being the indicated language from one of the classifiers, based on a confidence score and/or an identified linguistic domain.
    Type: Application
    Filed: May 23, 2016
    Publication date: September 15, 2016
    Inventors: Nikhil Bojja, Pidong Wang, Fredrik Linder, Bartlomiej Puzon
  • Patent number: 9372848
    Abstract: Implementations of the present disclosure are directed to a method, a system, and a computer program storage device for detecting a language in a text message. A plurality of different language detection tests are performed on a message associated with a user. Each language detection test determines a set of scores representing a likelihood that the message is in one of a plurality of different languages. One or more combinations of the score sets are provided as input to one or more distinct classifiers. Output from each of the classifiers includes a respective indication that the message is in one of the different languages. The language in the message may be identified as being the indicated language from one of the classifiers, based on a confidence score and/or an identified linguistic domain.
    Type: Grant
    Filed: October 17, 2014
    Date of Patent: June 21, 2016
    Assignee: Machine Zone, Inc.
    Inventors: Nikhil Bojja, Pidong Wang, Fredrik Linder, Bartlomiej Puzon
  • Publication number: 20160110340
    Abstract: Implementations of the present disclosure are directed to a method, a system, and a computer program storage device for detecting a language in a text message. A plurality of different language detection tests are performed on a message associated with a user. Each language detection test determines a set of scores representing a likelihood that the message is in one of a plurality of different languages. One or more combinations of the score sets are provided as input to one or more distinct classifiers. Output from each of the classifiers includes a respective indication that the message is in one of the different languages. The language in the message may be identified as being the indicated language from one of the classifiers, based on a confidence score and/or an identified linguistic domain.
    Type: Application
    Filed: October 17, 2014
    Publication date: April 21, 2016
    Inventors: Nikhil Bojja, Pidong Wang, Fredrik Linder, Bartlomiej Puzon
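
The word shape feature described in the abstracts of patent 10765956 and publication 20170197152 (one token per character, signifying the letter's case or whether it is a digit) can be illustrated with a minimal Python sketch. The function name `word_shape`, the token symbols `X`, `x`, and `d`, and the pass-through handling of other characters are assumptions made for this illustration; the abstracts do not specify them, and this is not the patented implementation.

```python
def word_shape(word: str) -> str:
    """Return a shape string with one token per character of `word`.

    Hypothetical token scheme (not taken from the patent text):
      'X' = uppercase letter, 'x' = lowercase letter, 'd' = digit.
    Any other character (punctuation, caseless scripts) is kept as-is,
    which is an assumption of this sketch.
    """
    tokens = []
    for ch in word:
        if ch.isdigit():
            tokens.append("d")
        elif ch.isupper():
            tokens.append("X")
        elif ch.islower():
            tokens.append("x")
        else:
            tokens.append(ch)
    return "".join(tokens)


if __name__ == "__main__":
    for example in ["USPTO", "iPhone6", "hello"]:
        print(example, "->", word_shape(example))
```

A shape string of this kind could serve as one feature among several for a statistical named-entity classifier, since capitalization and digit patterns often distinguish names from ordinary words.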