Linguistics Patents (Class 704/1)

Systems, methods, and computer-readable media for searching tabular data

Patent number: 10061757

Abstract: Systems, methods, and computer-readable media are provided for searching a tabular database. According to certain embodiments, search parameters for searching a tabular database are received from a user device and a row of a tabular database that corresponds to the search parameters is determined. In certain embodiments, the row may be determined by comparing the search parameters with a plurality of stored exemplar search queries, each of the plurality of stored exemplar search queries comprising a search query associated with a row and a column of the tabular database. A column of the tabular database that corresponds to the search parameters is determined by comparing the search parameters with the plurality of stored exemplar search queries. In certain embodiments, at least one cell of the tabular database is determined. The determined cell may be located at the intersection of the determined row and the determined column.

Type: Grant

Filed: June 17, 2015

Date of Patent: August 28, 2018

Assignee: Google LLC

Inventors: Sreeram Viswanath Balakrishnan, Alon Yitzchak Halevy
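
The exemplar-query lookup described in the abstract of patent 10061757 can be illustrated with a small sketch. Everything here is an assumption made for illustration: the toy table, the exemplar queries, the helper names, and the use of Jaccard token overlap as the comparison. The sketch also simplifies by taking the row and the column from the single best-matching exemplar; the abstract describes the two determinations separately.

```python
# Hypothetical illustration; not the patented implementation.

def tokenize(text):
    """Lowercase a query and split it into a set of word tokens."""
    return set(text.lower().split())

def jaccard(a, b):
    """Token-overlap similarity between two sets (assumed comparison measure)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

# Toy tabular database: row label -> {column label -> cell value}.
TABLE = {
    "France": {"capital": "Paris", "population": "67 million"},
    "Japan": {"capital": "Tokyo", "population": "125 million"},
}

# Stored exemplar search queries, each associated with a row and a column.
EXEMPLARS = [
    ("what is the capital of france", "France", "capital"),
    ("how many people live in france", "France", "population"),
    ("what is the capital of japan", "Japan", "capital"),
    ("how many people live in japan", "Japan", "population"),
]

def find_cell(search_parameters):
    """Return (row, column, cell) for the exemplar closest to the query."""
    query = tokenize(search_parameters)
    _, row, column = max(EXEMPLARS, key=lambda e: jaccard(query, tokenize(e[0])))
    return row, column, TABLE[row][column]
```

Calling `find_cell("capital of japan")` selects the third exemplar and returns the cell at the intersection of the `Japan` row and the `capital` column.
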
Mechanism for synchronising devices, system and method

Patent number: 10055397

Abstract: There is provided a mechanism for synchronizing a plurality of dynamic language models residing in a plurality of devices associated with a single user, each device comprising a dynamic language model. The mechanism is configured to: receive text data representing text that has been input by a user into one or more of the plurality of devices; train at least one language model on the text data; and provide the at least one language model for synchronizing the devices. There is also provided a system comprising the mechanism and a plurality of devices, and a method for synchronizing a plurality of dynamic language models residing in a plurality of devices associated with a single user.

Type: Grant

Filed: May 14, 2013

Date of Patent: August 21, 2018

Assignee: TOUCHTYPE LIMITED

Inventors: Michael Bell, Joe Freeman, Emanuel George Hategan, Benjamin Medlock
Systems and methods for a symbol-adaptable keyboard

Patent number: 10057402

Abstract: In one embodiment, a method includes detecting a communication session between a first user and one or more second users. The method also includes determining a social context of the communication session, and determining based at least in part on the social context a set of symbols for communication by the first user in the communication session with the second users. The method further includes providing for display to the first user a set of keys corresponding to the set of symbols. The keys indicate symbols for input by the first user in the communication session.

Type: Grant

Filed: October 3, 2017

Date of Patent: August 21, 2018

Assignee: Facebook, Inc.

Inventors: Jenny Yuen, Luke St. Clair
Text alterations based on body part(s) used to provide input

Patent number: 10049092

Abstract: In one aspect, a device includes a processor, a touch-enabled display accessible to the processor, and storage accessible to the processor. The storage bears instructions executable by the processor to determine a number of body parts with which a user provides input to the device and to perform a text alteration based at least in part on the determination of the number of body parts.

Type: Grant

Filed: January 29, 2016

Date of Patent: August 14, 2018

Assignee: Lenovo (Singapore) Pte. Ltd.

Inventors: Grigori Zaitsev, Russell Speight VanBlon, Jianbang Zhang
System for organizing and fast searching of massive amounts of data

Patent number: 10044575

Abstract: A system to collect and store massive amounts of data in a special data structure arranged for rapid searching. Performance metric data is one example. The performance metric data is recorded in time-series measurements, converted into Unicode, and arranged into a special data structure having one directory for every day which stores all the metric data collected that day. The performance metric data is collected by one or more probes running on machines about which data is being collected. The performance metric data is compressed prior to transmission to a server over any data path. The data structure at the server where analysis is done has a subdirectory for every resource type. Each subdirectory contains text files of performance metric data values measured for attributes in a group of attributes to which said text file is dedicated. Each attribute has its own section and the performance metric data values are recorded in time series as Unicode hex numbers in a comma-delimited list.

Type: Grant

Filed: December 12, 2016

Date of Patent: August 7, 2018

Assignee: CUMULUS SYSTEMS INC.

Inventors: Ajit Bhave, Arun Ramachandran, Sai Krishnam Raju Nadimpalli, Sandeep Bele
Systems and methods of detecting, measuring, and extracting signatures of signals embedded in social media data streams

Patent number: 10031909

Abstract: A system for scoring micro-blogging messages is provided, including an extractor, an evaluator, a calculator, and a publisher. The extractor may be configured to receive micro-blogging messages, to detect messages containing terms of interest, to extract raw data, and to store the data in a database. The evaluator may be configured to access and parse the stored data into tokenized data, and to store the tokenized data in a database. The evaluator may also be configured to identify relevant micro-blogging messages; to tag messages as indicative; and to filter out messages from low-volume or malicious sources before they are tagged as indicative. The calculator may be configured to access a sentiment dictionary, to calculate a sentiment score of the tokenized data, and to calculate a sentiment signature for a term of interest. The publisher may be configured to provide access to clients of the system.

Type: Grant

Filed: May 19, 2015

Date of Patent: July 24, 2018

Assignee: Social Market Analytics, Inc.

Inventors: Jeffrey G. Blaschak, Aleksey Blinov, Joseph A. Gits, Fady Harfoush, Kurt Myers
Distributed storage processing statement interception and modification

Patent number: 10033765

Abstract: A non-transitory computer readable storage medium has instructions executed by a processor to intercept a query statement at a master machine. The query statement is an instruction from a client machine that specifies how data managed by a distributed storage system should be processed and provided back to the client. In the communication between the client and the master machine, tokens associated with the statement are evaluated to selectively identify a pattern match of one of connection pattern tokens, login pattern tokens or query pattern tokens. For the query pattern tokens, altered tokens for the query statement are formed in response to the pattern match to establish a revised statement. The revised statement is produced in response to application of a policy rule. The revised statement maintains computation, logic and procedure of the statement, but alters parameters of the statement as specified by the policy rule.

Type: Grant

Filed: December 11, 2015

Date of Patent: July 24, 2018

Assignee: BlueTalon, Inc.

Inventors: Pratik Verma, Rakesh Khanduja
System and method for predicting an optimal machine translation system for a user based on an updated user profile

Patent number: 10025779

Abstract: A system and method predict an optimal machine translation system for a first of a set of users. The method includes, for each of the users, providing a respective user profile which includes rankings for at least some machine translation systems from a set of machine translation systems. The user profile of the first user is updated, based on the user profiles of at least a subset of the other users. The updating includes generating at least one missing ranking. An optimal translation system for the first user from the set of machine translation systems is predicted, based on the updated user profile computed for the first user.

Type: Grant

Filed: August 13, 2015

Date of Patent: July 17, 2018

Assignee: XEROX Corporation

Inventors: Shachar Mirkin, Jean-Luc Meunier
Method and system of automating data capture from electronic correspondence

Patent number: 10027613

Abstract: In some embodiments, electronic data may be automatically captured to provide a user with a universal Internet identity and e-mail address, comprehensive e-mail filtering and forwarding services, and e-receipt identification and data extraction. Detailed user e-mail preferences data stored at a central server may be selectively altered such that incoming correspondence is redirected in accordance with the user's preferences. Computer program code at the central server may parse incoming e-mail header information and data content, selectively extract data from identified types of correspondence, and forward the extracted data in accordance with the user's preferences. Additional computer program code may manipulate the extracted data in accordance with format requirements and display the manipulated data to a user in a desired format.

Type: Grant

Filed: January 28, 2016

Date of Patent: July 17, 2018

Assignee: Mercury Kingdom Assets Limited

Inventors: Jai Rawat, Julian Gordon, Santhosh Raman, Renuka Kulkami, Rajiv Anand, Silvia Doundakova, Vijayasankar Dhanapal, Oswald D'Sa, Srinivas Gubbala
Hardware data compressor that maintains sorted symbol list concurrently with input block scanning

Patent number: 10027346

Abstract: A hardware data compressor includes a first hardware engine that scans an input block of characters to produce a stream of tokens, the stream of tokens comprising replacement back pointers to matched strings of characters of the input block and non-replaced characters of the input block. The hardware data compressor also includes a second hardware engine that receives the stream of tokens and maintains a sorted list of symbols associated with the tokens. The second hardware engine concurrently maintains the sorted list of symbols by frequency of occurrence as the first hardware engine produces the tokens of the stream.

Type: Grant

Filed: October 14, 2015

Date of Patent: July 17, 2018

Assignee: VIA ALLIANCE SEMICONDUCTOR CO., LTD.

Inventor: G. Glenn Henry
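
As a software analogy for the second engine in patent 10027346, the sketch below keeps a symbol list ordered by frequency while symbols arrive one at a time, instead of sorting once after the whole block has been scanned. The class and attribute names are hypothetical, and nothing here models the actual hardware; it only illustrates the incremental-sort idea.

```python
# Hypothetical software analogy; the patent describes hardware engines.

class SortedSymbolList:
    """Keeps symbols ordered by descending count as each symbol is added."""

    def __init__(self):
        self.symbols = []    # symbols, most frequent first
        self.counts = {}     # symbol -> occurrences seen so far
        self.position = {}   # symbol -> index in self.symbols

    def add(self, symbol):
        if symbol not in self.counts:
            self.counts[symbol] = 0
            self.position[symbol] = len(self.symbols)
            self.symbols.append(symbol)
        self.counts[symbol] += 1
        # Move the symbol toward the front while it outranks its predecessor.
        i = self.position[symbol]
        while i > 0 and self.counts[self.symbols[i - 1]] < self.counts[symbol]:
            prev = self.symbols[i - 1]
            self.symbols[i - 1], self.symbols[i] = symbol, prev
            self.position[prev] = i
            i -= 1
        self.position[symbol] = i


ranked = SortedSymbolList()
for token_symbol in "abcbcc":   # stand-in for symbols emitted by the scanner
    ranked.add(token_symbol)
```

Because the list is already ordered when scanning ends, a later stage that needs frequency-ranked symbols (for example, building a Huffman code) would not have to wait for a separate sort pass.
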
Systems and methods for managing a master patient index including duplicate record detection

Patent number: 10025904

Abstract: A system for managing a master patient index is described. The master patient index database is constructed using inverted indices. The inverted index formulation enables faster, more complete and more flexible duplicate detection as compared to traditional master patient database management techniques. A master patient index management system including a remote user interface configured to leverage the inverted index formulation is described. The user interface includes features for managing records in an MPI database including identifying, efficiently comparing, updating and merging duplicate records across a heterogeneous healthcare organization.

Type: Grant

Filed: January 5, 2016

Date of Patent: July 17, 2018

Assignee: 4medica, Inc.

Inventors: Oleg Bess, Vannamuthu Kuttalingam
System and method for phonetic search over speech recordings

Patent number: 10019514

Abstract: A system and method for searching for an element in speech related documents may include transcribing a set of speech recordings to a set of phoneme strings and including the phoneme strings in a set of phonetic transcriptions. A system and method may reverse-index the phonetic transcriptions according to one or more phonemes such that the one or more phonemes can be used as a search key for searching the phoneme in the phonetic transcriptions. A system and method may transcribe a textual search term into a set of search phoneme strings and use the set of search phoneme strings to search for an element in the set of phonetic transcriptions.

Type: Grant

Filed: November 6, 2015

Date of Patent: July 10, 2018

Assignee: NICE LTD.

Inventors: Oren Elisha, Merav Ben-Asher
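
The reverse-indexing step described in the abstract of patent 10019514 can be sketched as follows. The lexicon, phoneme symbols, recording ids, and function names are all assumptions made for illustration; a real system would derive phoneme strings from audio with a speech recognizer instead of a word lookup table.

```python
from collections import defaultdict

# Hypothetical toy lexicon standing in for real phonetic transcription.
LEXICON = {
    "refund": ["R", "IY", "F", "AH", "N", "D"],
    "please": ["P", "L", "IY", "Z"],
    "cancel": ["K", "AE", "N", "S", "AH", "L"],
}

def to_phonemes(text):
    """Transcribe text into a flat phoneme list using the toy lexicon."""
    phonemes = []
    for word in text.lower().split():
        phonemes.extend(LEXICON[word])
    return phonemes

def build_index(transcriptions):
    """Map each phoneme to the (recording id, position) pairs where it occurs."""
    index = defaultdict(list)
    for rec_id, phonemes in transcriptions.items():
        for pos, phoneme in enumerate(phonemes):
            index[phoneme].append((rec_id, pos))
    return index

def search(term, transcriptions, index):
    """Return ids of recordings containing the term's phoneme string."""
    target = to_phonemes(term)
    if not target:
        return []
    hits = set()
    # Use the first phoneme as the search key, then verify the full string.
    for rec_id, pos in index.get(target[0], []):
        if transcriptions[rec_id][pos:pos + len(target)] == target:
            hits.add(rec_id)
    return sorted(hits)

transcriptions = {
    "call-1": to_phonemes("refund please"),
    "call-2": to_phonemes("cancel please"),
}
index = build_index(transcriptions)
```

Here `search("refund", transcriptions, index)` only inspects positions where the phoneme `R` occurs, which is the point of indexing by phoneme: the search never scans recordings that cannot match.
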
Targeted story summarization using natural language processing

Patent number: 10013404

Abstract: A computer system may receive a textual work. The computer system may generate a knowledge graph based on the textual work. The knowledge graph may include nodes representing concepts and edges between the nodes that represent links between the concepts. The computer system may then generate a concept path for a target concept. The computer system may then identify a related background narrative block that contains a related non-target concept. The background narrative block may be a narrative block that is not in the concept path for the target concept. The computer system may then summarize the related background narrative block and output the summary to an output device coupled with the computer system.

Type: Grant

Filed: December 3, 2015

Date of Patent: July 3, 2018

Assignee: International Business Machines Corporation

Inventors: Adam T. Clark, Jeffrey K. Huebert, Aspen L. Payton, John E. Petri
Smart home control method based on emotion recognition and the system thereof

Patent number: 10013977

Abstract: The present invention discloses a smart home control method based on emotion recognition and the system thereof, wherein, the method comprises: acquiring a user's voice information before performing an emotion recognition for a speech tone on the voice information and generating a first emotion recognition result; after converting the voice information into a text information, performing an emotion recognition for a semantics of the text information before generating a second emotion recognition result; based on the first emotion recognition result and the second emotion recognition result, a user's emotion recognition result is generated according to a preset determination method for emotion recognition result; also, based on the user's emotion recognition result, each smart home device is controlled to perform a corresponding operation.

Type: Grant

Filed: January 6, 2016

Date of Patent: July 3, 2018

Assignee: SHENZHEN SKYWORTH-RGB ELECTRONIC CO., LTD.

Inventor: Chunyuan Fu
Enforcing user-specified rules

Patent number: 10007712

Abstract: Described herein are techniques for employing a phrase as a unique identifier of a user and a corresponding user account. For instance, a transaction processing service may maintain multiple user accounts each associated with respective users. In addition, the transaction processing service may associate one or more unique phrases with each of these respective users and user accounts. Users may then configure rules associated with their respective user accounts to enable use of associated phrases as identifiers for storing a variety of different content in association with the phrases. Users may also configure their accounts with communication rules that instruct the transaction processing service to send pieces of content that are received with the phrase to different specified destinations. Users may also configure their accounts with preferences used by vendors to complete transactions initiated with use of a phrase.

Type: Grant

Filed: August 20, 2009

Date of Patent: June 26, 2018

Assignee: Amazon Technologies, Inc.

Inventors: Matthew T. Williams, Howard B. Gefen, Vinay P. Vaidya
Crowdsourced analysis of decontextualized data

Patent number: 10002177

Abstract: Techniques are described for employing a crowdsourcing framework to analyze data related to the performance or operations of computing systems, or to analyze other types of data. A question is analyzed to determine data that is relevant to the question. The relevant data may be decontextualized to remove or alter contextual information included in the data, such as sensitive, personal, or business-related data. The question and the decontextualized data may then be presented to workers in a crowdsourcing framework, and the workers may determine an answer to the question based on an analysis or an examination of the decontextualized data. The answers may be combined, correlated, or otherwise processed to determine a processed answer to the question.

Type: Grant

Filed: September 16, 2013

Date of Patent: June 19, 2018

Assignee: AMAZON TECHNOLOGIES, INC.

Inventors: Jon Arron McClintock, George Nikolaos Stathakopoulos, Dominique Imjya Brezinski
System, apparatus, and method for Arabic handwriting recognition

Patent number: 10002301

Abstract: A system, a non-transitory computer readable medium, and a method for Arabic handwriting recognition are provided. The method includes acquiring an input image representative of a handwritten Arabic text from a user, partitioning the input image into a plurality of regions, determining a bag of features representation for each region of the plurality of regions, modeling each region independently by multi stream discrete Hidden Markov Model (HMM), and identifying a text based on the HMM models.

Type: Grant

Filed: September 19, 2017

Date of Patent: June 19, 2018

Assignee: King Fahd University of Petroleum and Minerals

Inventors: Sabri A. Mahmoud, Mohammed O. Assayony
System and method for automatic, unsupervised paraphrase generation using a novel framework that learns syntactic construct while retaining semantic meaning

Patent number: 9984063

Abstract: A system includes a question answering system executed by a computer, a processor, and a memory coupled to the processor. The memory is encoded with instructions that when executed cause the processor to provide training for training the question answering system. The training system is configured to receive a first phrase and a second phrase, the first and second phrases being paraphrases of each other, convert the first phrase into a first logical form and the second phrase into a second logical form, generate a phrasal edit that includes a difference between the first logical form and the second logical form, convert the phrasal edit into a disjunctive logical form in two directions, and generate a first plurality of paraphrases of the first and second phrases based on the disjunctive logical form.

Type: Grant

Filed: September 15, 2016

Date of Patent: May 29, 2018

Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: Laura J. Bennett, Lakshminarayanan Krishnamurthy, Niyati Parameswaran, Sridhar Sudarsan
Generating author vectors

Patent number: 9984062

Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for generating author vectors. One of the methods includes obtaining a set of sequences of words, the set of sequences of words comprising a plurality of first sequences of words and, for each first sequence of words, a respective second sequence of words that follows the first sequence of words, wherein each first sequence of words and each second sequence of words has been classified as being authored by a first author; and training a neural network system on the first sequences and the second sequences to determine an author vector for the first author, wherein the author vector characterizes the first author.

Type: Grant

Filed: July 11, 2016

Date of Patent: May 29, 2018

Assignee: Google LLC

Inventors: Brian Patrick Strope, Quoc V. Le
Combinatorial summarizer

Patent number: 9977829

Abstract: A combinatorial summarizer includes a plurality of summarization engines, a processor in selective communication with each summarization engine, and computer readable instructions executable by the processor and embodied on a tangible, non-transitory computer readable medium. Each summarization engine is to select a respective plurality of sentences, and generate a relative rank and an associated weight for each sentence of the respective plurality of sentences. The computer readable instructions include instructions to determine a combined weight for each sentence of each respective plurality of sentences. The combined weight is based upon the respective associated weight and a respective relative human rank for each sentence in a set of sentences, including all sentences of each respective plurality of sentences.

Type: Grant

Filed: October 12, 2012

Date of Patent: May 22, 2018

Assignee: Hewlett-Packard Development Company, L.P.

Inventors: Steven J Simske, Malgorzata M Sturgill
Computerized method of generating and analytically evaluating multiple instances of natural language-generated text

Patent number: 9977826

Abstract: A computerized method for generating and evaluating natural language-generated text involves receiving, in a computer, data input by a user, generating, using a natural language generation technique, multiple instances of text stories based upon both contents of a corpus and the received data; analyzing the multiple instances of text stories as a weighted combination of computed geographic scores, distance scores, information content scores, replacement scores and extra aspect scores, providing a ranked set of the generated text stories to a user, receiving a selection of one of the text stories in the ranked set, and storing the selected story.

Type: Grant

Filed: October 21, 2015

Date of Patent: May 22, 2018

Assignee: Cloudera, Inc.

Inventors: Micha Gorelick, Hilary Mason, Grant Custer
Deceptive indicia profile generation from communications interactions

Patent number: 9965598

Abstract: Systems, methods, computer-readable storage mediums including computer-readable instructions and/or circuitry for generating deceptive indicia profiles may implement operations including, but not limited to: detecting one or more indicia of deception associated with one or more signals associated with communication content provided by a participant in a first communications interaction; detecting one or more indicia of deception associated with one or more signals associated with communications content provided by the participant in a second communications interaction; generating a deceptive indicia profile for the participant according to indicia of deception detected in the communications content provided by the participant in the first communications interaction and indicia of deception detected in the communications content provided by the participant in the second communications interaction; and providing a notification associated with the deceptive indicia profile for the participant to a second partici

Type: Grant

Filed: July 31, 2012

Date of Patent: May 8, 2018

Assignee: Elwha LLC

Inventors: Clarence T. Tegreene, Royce A. Levien, Richard T. Lord, Robert W. Lord, Mark A. Malamud
Intelligent system that dynamically improves its knowledge and code-base for natural language understanding

Patent number: 9965458

Abstract: Systems, methods, and apparatuses are presented for a novel natural language tokenizer and tagger. In some embodiments, a method for tokenizing text for natural language processing comprises: generating from a pool of documents, a set of statistical models comprising one or more entries each indicating a likelihood of appearance of a character/letter sequence in the pool of documents; receiving a set of rules comprising rules that identify character/letter sequences as valid tokens; transforming one or more entries in the statistical models into new rules that are added to the set of rules when the entries indicate a high likelihood; receiving a document to be processed; dividing the document to be processed into tokens based on the set of statistical models and the set of rules, wherein the statistical models are applied where the rules fail to unambiguously tokenize the document; and outputting the divided tokens for natural language processing.

Type: Grant

Filed: December 9, 2015

Date of Patent: May 8, 2018

Assignee: Sansa AI Inc.

Inventors: Robert J. Munro, Rob Voigt, Schuyler D. Erle, Brendan D. Callahan, Gary C. King, Jessica D. Long, Jason Brenier, Tripti Saxena, Stefan Krawczyk
Automatic classification and translation of written segments

Patent number: 9959272

Abstract: A translation server computer and related methods are described. The translation server computer is programmed or configured to create computer-implemented techniques for classifying segments in a source language as non-translatable into a target language, nearly-translatable into the target language, or otherwise, and for generating translations in the target language for the segments classified as nearly-translatable. The translation server computer is further programmed or configured to apply the computer-implemented techniques on an input document to generate a classification and a translation when appropriate for each segment in the document, and cause a user computer to display the translations and classifications.

Type: Grant

Filed: July 21, 2017

Date of Patent: May 1, 2018

Assignee: Memsource a.s.

Inventors: David Čaněk, Dalibor Frívaldský, Aleš Tamchyna
System and method for automatic, unsupervised paraphrase generation using a novel framework that learns syntactic construct while retaining semantic meaning

Patent number: 9953027

Abstract: A system includes a question answering system executed by a computer, a processor, and a memory coupled to the processor. The memory is encoded with instructions that when executed cause the processor to provide training for training the question answering system. The training system is configured to receive a plurality of bidirectional disjunctive logical forms which include two directional disjunctions of differences between a first logical form of a first sentence and second logical form of a second sentence, realize the plurality of bidirectional disjunctive logical forms to generate a first plurality of paraphrases of the first and second sentence, score each of the first plurality of paraphrases based on textual similarity between the first plurality of paraphrases and the first and second sentences, and prune the first plurality of paraphrases to generate a second plurality of paraphrases based on the scores of each of the first plurality of paraphrases.

Type: Grant

Filed: September 15, 2016

Date of Patent: April 24, 2018

Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: Laura J. Bennett, Lakshminarayanan Krishnamurthy, Niyati Parameswaran, Sridhar Sudarsan
Cognitive contextualization of emergency management system communications

Patent number: 9953028

Abstract: Software that contextualizes communications during an event by performing the following steps: (i) receiving an input communication from a first user, where the input communication includes input information relating to the event; (ii) receiving first user contextual information, where the first user contextual information pertains to an emotional state of the first user at the time the input communication was received; (iii) determining an output communication based, at least in part, on the received first user contextual information, where the output communication includes output information relating to the event; and (iv) sending the output communication to a first recipient.

Type: Grant

Filed: January 9, 2015

Date of Patent: April 24, 2018

Assignee: International Business Machines Corporation

Inventors: James R. Kozloski, Clifford A. Pickover, Melanie E. Roberts, Maja Vukovic
Processing electronic mail replies

Patent number: 9942176

Abstract: Disclosed are various embodiments for processing electronic messages and/or reply electronic messages. A contact entry associated with a user issue is created. A unique token associated with the contact entry is generated. Electronic messages sent to the user are generated with a reply-to address that incorporates the token.

Type: Grant

Filed: October 30, 2015

Date of Patent: April 10, 2018

Assignee: Amazon Technologies, Inc.

Inventors: Zachary Crisman, Siddharth Vivek Joshi, Jamie J. Sheehan, Charles E. Dannaker
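
The token-bearing reply-to address described in the abstract of patent 9942176 might look like the sketch below. The `reply+<token>@domain` address layout, the in-memory store, the domain, and all function names are assumptions for illustration. The lookup step on an incoming reply is also an assumption; the abstract itself only covers creating the entry, generating the token, and building the address.

```python
import secrets

# Hypothetical in-memory store: token -> contact entry.
CONTACTS = {}

def create_contact_entry(user_email, issue):
    """Create a contact entry for a user issue and return its unique token."""
    token = secrets.token_hex(8)  # 16 hex characters, hard to guess
    CONTACTS[token] = {"user": user_email, "issue": issue}
    return token

def reply_to_address(token, domain="support.example.com"):
    """Build a reply-to address that incorporates the token (assumed layout)."""
    return f"reply+{token}@{domain}"

def lookup_from_reply(address):
    """Recover the contact entry from the address a reply was sent to."""
    local_part = address.split("@", 1)[0]
    _, _, token = local_part.partition("+")
    return CONTACTS.get(token)

token = create_contact_entry("customer@example.com", "late delivery")
address = reply_to_address(token)
```

Embedding the token in the address lets a reply be matched to its contact entry without parsing the subject line or message body.
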
Automatically converting an electronic publication into an online course

Patent number: 9940310

Abstract: A publisher can extend existing electronic publications (e.g., formatted in EPUB format) by adding additional data such as interactive content, supplementary learning resources, etc. The extended electronic publication can then be automatically imported to create an online course that corresponds to the electronic publication.

Type: Grant

Filed: February 27, 2015

Date of Patent: April 10, 2018

Assignee: Snapwiz Inc.

Inventors: Sriram Cherukuri, Satish Kumar, Ranjan Parthasarathy, Madhusudana Narasa, Aditya S. Agarkar
Method and apparatus for concept-based classification of natural language discourse

Patent number: 9934285

Abstract: Pinnacle concepts are not amenable to detection by the use of keywords. A unit of natural language discourse (UNLD) “refers” to a pinnacle concept “C” when that UNLD uses linguistic expressions in such a way that “C” is regarded as expressed, used or invoked by an ordinary reader of “L.” A reference can have a “reference level” value that is proportional to: the “strength” with which the pinnacle concept is referenced, the probability that a pinnacle concept is referenced or both strength and probability. Pinnacle concepts can be divided into Quantifiers and non-Quantifiers. A Quantifier can modify the reference level assigned to a non-Quantifier. A concept “C,” that is determined to be referenced by a UNLD “x,” after application of its Quantifiers, is said to be asserted by “x.” Concept-based classification is the identification of whether a pinnacle concept “C” is asserted by a UNLD. Concept-based classification can be used for concept-based search.

Type: Grant

Filed: June 23, 2015

Date of Patent: April 3, 2018

Assignee: NetBase Solutions, Inc.

Inventors: John Andrew Rehling, Michael Jacob Osofsky
Methods and systems for identifying errors in a speech recognition system

Patent number: 9928829

Abstract: Methods are disclosed for identifying possible errors made by a speech recognition system without using a transcript of words input to the system. A method for model adaptation for a speech recognition system includes determining an error rate, corresponding to either recognition of instances of a word or recognition of instances of various words, without using a transcript of words input to the system. The method may further include adjusting an adaptation, of the model for the word or various models for the various words, based on the error rate. Apparatus are disclosed for identifying possible errors made by a speech recognition system without using a transcript of words input to the system. An apparatus for model adaptation for a speech recognition system includes a processor adapted to estimate an error rate, corresponding to either recognition of instances of a word or recognition of instances of various words, without using a transcript of words input to the system.

Type: Grant

Filed: October 17, 2014

Date of Patent: March 27, 2018

Assignee: Vocollect, Inc.

Inventors: Keith P. Braho, Jeffrey P. Pike, Lori A. Pike
Transliteration work support device, transliteration work support method, and computer program product

Patent number: 9928828

Abstract: According to an embodiment, a transliteration work support device includes an analysis unit, a storage unit, an estimation unit, a construction unit, a correction unit, and an update unit. The analysis unit performs language analysis on document data and creates transliteration auxiliary information representing a way of transliteration of a word or a phrase in the document data. The storage unit stores a correction history representing a way of transliteration corrected in the past of the word or the phrase. The estimation unit estimates a correction place and a correction candidate of the document data or the transliteration auxiliary information from the history. The construction unit constructs work list information including work items corresponding to types of corrections according to the correction candidate and progress information. The correction unit corrects the document data or the transliteration auxiliary information.

Type: Grant

Filed: April 5, 2016

Date of Patent: March 27, 2018

Assignee: KABUSHIKI KAISHA TOSHIBA

Inventors: Kosei Fume, Yuka Kuroda, Yoshiaki Mizuoka, Masahiro Morita
Multidimensional synopsis generation

Patent number: 9922352

Abstract: A multidimensional synopsis of a stream of textual data pertaining to a particular subject can be generated. To produce the multidimensional synopsis, multiple dimensions that each includes concepts can be identified. The stream of textual data can then be analyzed to identify the occurrence of the concepts within elements of the stream. The multidimensional synopsis can then be produced by generating a score for each intersecting set of concepts from the multiple dimensions. Therefore, each score can generally represent a prevalence of the corresponding intersecting set of concepts within the stream of textual data.

Type: Grant

Filed: January 25, 2016

Date of Patent: March 20, 2018

Assignee: Quest Software Inc.

Inventors: Abel Tegegne, Vineetha Abraham, Mitch Brisebois
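
The scoring step described in the abstract of patent 9922352 can be illustrated with a minimal sketch. It assumes that a score is simply the number of stream elements in which every concept of an intersecting set occurs, and that a concept "occurs" when it appears as a substring; the patent does not specify either choice. The function name `synopsis` and the sample data are hypothetical.

```python
from itertools import product


def synopsis(elements, dimensions):
    """Score each combination of one concept per dimension.

    Assumption: the score is the count of elements that mention every
    concept in the combination (plain substring match, case-insensitive).
    """
    names = list(dimensions)
    scores = {}
    for combo in product(*(dimensions[name] for name in names)):
        scores[combo] = sum(
            all(concept in element.lower() for concept in combo)
            for element in elements
        )
    return scores


# Hypothetical stream of textual data and two dimensions of concepts.
elements = [
    "Battery drains fast on the new phone",
    "Love the phone screen",
    "Tablet battery is great",
]
dimensions = {"product": ["phone", "tablet"], "feature": ["battery", "screen"]}
scores = synopsis(elements, dimensions)
```

Each key of `scores` is one intersecting set of concepts, so the result can be read as a small table with one axis per dimension.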
Formatting content for a reduced-size user interface

Patent number: 9916075

Abstract: The present disclosure generally relates to displaying content on a reduced-size user interface. An electronic device with one or more processors, memory, and a display, receives content associated with a designated area of the display, where the content is associated with a plurality of available display formats stored in the memory. The device determines a size of the designated area and determines a first display format for the content from the plurality of available display formats based on at least the content and the size of the designated area. The device displays a representation of the content according to the first display format.

Type: Grant

Filed: August 28, 2015

Date of Patent: March 13, 2018

Assignee: Apple Inc.

Inventors: Kevin Will Chen, Eliza Block, Lawrence Y. Yang, Christopher Wilson, Eric Lance Wilson, Paul W. Salzman, David Schimon
Method and apparatus for expressing time in an output text

Patent number: 9904676

Abstract: Methods, apparatuses, and computer program products are described herein that are configured to express a time in an output text. In some example embodiments, a method is provided that comprises identifying a time period to be described linguistically in an output text. The method of this embodiment may also include identifying a communicative context for the output text. The method of this embodiment may also include determining one or more temporal reference frames that are applicable to the time period and a domain defined by the communicative context. The method of this embodiment may also include generating a phrase specification that linguistically describes the time period based on the descriptor that is defined by a temporal reference frame of the one or more temporal reference frames. In some examples, the descriptor specifies a time window that is inclusive of at least a portion of the time period to be described linguistically.

Type: Grant

Filed: May 1, 2015

Date of Patent: February 27, 2018

Assignee: ARRIA DATA2TEXT LIMITED

Inventors: Gowri Somayajulu Sripada, Neil Burnett
Systems and methods for adding users to a networked computer system

Patent number: 9900223

Abstract: Systems and methods are provided for adding new nodes to a computer networked system. The systems and methods may identify a first set of nodes in a networked computer system. The first set of nodes may be included in a first hash computation that clusters the first set of nodes into communities. An application shard space including a first space and a second space may be generated. The first set of nodes may be mapped to application shards in the first space based on the first hash computation. The application shards in the first space may be assigned to a first set of machines of the networked computer system. The second space may be maintained for mappings of nodes not included in the first hash computation to application shards in the second space.

Type: Grant

Filed: February 8, 2017

Date of Patent: February 20, 2018

Assignee: Facebook, Inc.

Inventors: Alon Michael Shalita, Arun Dattaram Sharma
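
The split shard space in the abstract of patent 9900223 can be sketched roughly as follows. The community assignment is passed in as a plain dictionary standing in for the first hash computation, and nodes outside it are placed in the second space with a CRC32 hash; both choices, along with the name `shard_for` and the sizes used, are assumptions made for illustration only.

```python
import zlib


def shard_for(node, community_of, first_size, second_size):
    """Return an application shard index for a node.

    Shards 0..first_size-1 form the first space (nodes covered by the
    community clustering); shards first_size..first_size+second_size-1
    form the second space (nodes added later).
    """
    if node in community_of:
        # Nodes in the same community land on the same shard.
        return community_of[node] % first_size
    # Assumption: any stable hash will do for nodes not yet clustered.
    return first_size + zlib.crc32(node.encode("utf-8")) % second_size


# Hypothetical community assignment produced by the first hash computation.
community_of = {"alice": 0, "bob": 0, "carol": 1}
existing = shard_for("alice", community_of, first_size=4, second_size=2)
newcomer = shard_for("dave", community_of, first_size=4, second_size=2)
```

Keeping the second space separate means a new node can be given a shard without recomputing the clustering of the existing nodes.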
System and method for website categorization

Patent number: 9892189

Abstract: Systems and methods for the categorization of websites are presented. A website is categorized using one or a combination of its domain name and its web page content. The domain name is tokenized, and the tokens compared to categories in a category structure to determine probabilities that the token belongs to each category. Combinations of tokens are similarly compared to the categories. A category may be determined with reference to a vector space in which a training set of websites having known categories is converted according to a methodology into reference vectors containing keyword frequencies. A target website is converted to a target vector using the same methodology, and a distance score of the target vector to each reference vector is calculated. The website represented by the target vector is assigned the category of the reference vector having the lowest distance score.

Type: Grant

Filed: March 10, 2016

Date of Patent: February 13, 2018

Assignee: Go Daddy Operating Company, LLC

Inventors: Robert Brown, Tapan Kamdar, Ryan Kirkish, Wei-Cheng Lai, Jeff McLellan
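
The vector-space step in the abstract of patent 9892189 amounts to a nearest-reference lookup. The sketch below assumes whitespace tokenization, a fixed keyword vocabulary, and Euclidean distance; the abstract names none of these, and the training texts are invented.

```python
import math
from collections import Counter


def to_vector(text, vocabulary):
    """Convert text to keyword frequencies over a fixed vocabulary."""
    counts = Counter(text.lower().split())
    return [counts[word] for word in vocabulary]


def categorize(target_text, training, vocabulary):
    """Assign the category of the reference vector with the lowest distance."""
    target = to_vector(target_text, vocabulary)
    _, category = min(
        training,
        key=lambda item: math.dist(target, to_vector(item[0], vocabulary)),
    )
    return category


# Hypothetical training set of (page text, known category) pairs.
vocabulary = ["pizza", "pasta", "menu", "loan", "bank", "credit"]
training = [
    ("pizza pasta menu pizza", "restaurant"),
    ("loan bank credit loan", "finance"),
]
category = categorize("best pizza menu in town", training, vocabulary)
```

The same `to_vector` function is applied to the training sites and the target site, which mirrors the abstract's requirement that both use the same methodology.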
Shard aware near real time indexing

Patent number: 9886441

Abstract: In an example embodiment, data to be indexed in a distributed file system is received via a near real time publish application program interface (API). A shard responsible for the data to be indexed is determined. Then a message is generated in a shard queue corresponding to the shard responsible for the data to be indexed, the message indicating that data needs to be urgently indexed, the detection of the message in the shard queue by a near real time manager corresponding to the shard responsible for the data to be indexed causing the near real time manager to cause the data to be indexed.

Type: Grant

Filed: June 11, 2015

Date of Patent: February 6, 2018

Assignee: SAP SE

Inventors: Prashant Bhagat, Ridwan Tan, Robert Wells, Dinesh Shahane, Sushant Prasad, Kiran Gangadharappa
Variables and method for authorship attribution

Patent number: 9880995

Abstract: A method uses linguistic units of analysis to identify the authorship of a document. The method is useful to determine authorship of brief documents, and in situations where there are less than ten documents per known author, i.e. when there is scarcity of text. The method analyzes parameters such as the syntax, punctuation, and, optionally the average word and paragraph length, and when the parameters are analyzed using statistical methods, obtains a high degree of reliability (>90% accuracy). The method can be applicable to numerous languages other than English because the variables selected are characteristic of most languages. The reliability of the method is verified when subjected to a cross-validation statistical analysis.

Type: Grant

Filed: April 6, 2006

Date of Patent: January 30, 2018

Inventor: Carole E. Chaski
Gender and name translation from a first to a second language

Patent number: 9881004

Abstract: A system and method are provided for inferring person gender and translating people's names into a second language. The present invention translates an individual's name into a required second language to reduce waiting time in registration areas. It also is able to infer gender once the registration clerk enters the individual's first name in a native language. It also prevents duplication of a person's record generated because of the confusion that happens around how a native name is translated into a second language by standardizing such translation. The embodiments of the present invention utilize machine learning and statistical approaches to infer gender and translate an individual's name into a second language.

Type: Grant

Filed: October 6, 2015

Date of Patent: January 30, 2018

Assignee: CERNER INNOVATION, INC.

Inventor: Ahmed Sayed Attia Moussa
Using voice information to influence importance of search result categories

Patent number: 9875740

Abstract: Approaches provide for using voice information to influence the importance of search result categories for a search query. For example, various embodiments may provide search results for a search query based on a most relevant search result category to the search query. Voice information associated with a subsequent user interaction may be analyzed to identify whether the search result category is correct or if search results from a different category should be provided. Additionally, the voice information may be used to update the relevance score of the search result category to the search query to improve the category matching of future queries.

Type: Grant

Filed: June 20, 2016

Date of Patent: January 23, 2018

Assignee: A9.com, Inc.

Inventors: Mukul Raj Kumar, Balpreet Singh Pankaj
Using human perception in building language understanding models

Patent number: 9875237

Abstract: An understanding model is trained to account for human perception of the perceived relative importance of different tagged items (e.g. slot/intent/domain). Instead of treating each tagged item as equally important, human perception is used to adjust the training of the understanding model by associating a perceived weight with each of the different predicted items. The relative perceptual importance of the different items may be modeled using different methods (e.g. as a simple weight vector, a model trained using features (lexical, knowledge, slot type, . . . ), and the like). The perceptual weight vector and/or model are incorporated into the understanding model training process where items that are perceptually more important are weighted more heavily as compared to the items that are determined by human perception as less important.

Type: Grant

Filed: March 14, 2013

Date of Patent: January 23, 2018

Assignee: MICROSOFT TECHNOLOGY LICENSING, LLC

Inventors: Ruhi Sarikaya, Anoop Deoras, Fethiye Asli Celikyilmaz, Zhaleh Feizollahi
Generation and use of trained file classifiers for malware detection

Patent number: 9864956

Abstract: A method includes training a file classifier from one or more n-gram feature vectors received from a plurality of binary files as input, where the one or more n-gram vectors represent the occurrences of character pairs in printable characters within the file or characters representing the informational entropy sequence of the file. Another method also includes generating, by the file classifier, output including classification data associated with the file based on the one or more n-gram vectors, where the classification data indicates whether the file includes malware.

Type: Grant

Filed: May 1, 2017

Date of Patent: January 9, 2018

Assignee: SPARKCOGNITION, INC.

Inventor: Na Sai
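
The feature described in the abstract of patent 9864956, occurrences of character pairs within printable characters of a binary file, can be sketched as below. Only the feature extraction is shown, not the classifier. Treating printable ASCII (excluding control whitespace) as the printable set and resetting at every non-printable byte are assumptions of this sketch.

```python
import string
from collections import Counter

# Assumption: "printable" means ASCII letters, digits, punctuation and space.
PRINTABLE = set(string.printable.encode("ascii")) - set(b"\t\n\r\x0b\x0c")


def printable_bigrams(data: bytes) -> Counter:
    """Count adjacent pairs of printable characters in raw file bytes."""
    counts = Counter()
    previous = None
    for byte in data:
        if byte in PRINTABLE:
            if previous is not None:
                counts[chr(previous) + chr(byte)] += 1
            previous = byte
        else:
            # A non-printable byte ends the current printable run.
            previous = None
    return counts


counts = printable_bigrams(b"ab\x00ab\x01c")
```

A fixed-length feature vector could then be built by reading these counts in a fixed order of character pairs.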
Clinically intelligent parsing

Patent number: 9864838

Abstract: A method, a device and a system for correlating medical information of a first format to medical information of a second format are provided. The method includes parsing an input sequence representing textual information into plural terms; searching a medical database to associate each term with a medical diagnosis; and translating each term into a coded phrase previously associated with the medical diagnosis in the medical database.

Type: Grant

Filed: February 20, 2008

Date of Patent: January 9, 2018

Assignee: MEDICOMP SYSTEMS, INC.

Inventor: Peter S. Goltra
Mining multi-lingual data

Patent number: 9864744

Abstract: Technology is disclosed for mining training data to create machine translation engines. Training data can be mined as translation pairs from single content items that contain multiple languages; multiple content items in different languages that are related to the same or similar target; or multiple content items that are generated by the same author in different languages. Locating content items can include identifying potential sources of translation pairs that fall into these categories and applying filtering techniques to quickly gather those that are good candidates for being actual translation pairs. When actual translation pairs are located, they can be used to retrain a machine translation engine as in-domain for social media content items.

Type: Grant

Filed: December 3, 2014

Date of Patent: January 9, 2018

Assignee: Facebook, Inc.

Inventors: Matthias Gerhard Eck, Ying Zhang, Yury Andreyevich Zemlyanskiy, Alexander Waibel
Method of automated analysis of text documents

Patent number: 9852122

Abstract: Automated analysis of text documents is used to scan text documents in order to find phrases or text fragments from other documents, or modifying the existing ones. A comparatively fast and universally applicable method finds phrases, sentences or even text fragments from other documents. The method includes: all electronic files containing model documents are converted to a given format; meaningful fragments, called “clauses”, are extracted from them; the converted files containing model documents are stored in the database; each electronic file containing a document to be analyzed is converted to the given format; clauses extracted from analyzed documents are compared with clauses extracted from model documents; fractions of clauses from an analyzed document matching clauses from each model document are calculated; fractions found are then compared with a pre-set threshold value in order to find out whether there are text fragments from a model document in the analyzed one.

Type: Grant

Filed: November 16, 2012

Date of Patent: December 26, 2017

Assignee: OBSHCHESTVO S OGRANICHENNOY OTVETSTVENNOST'YU “TSENTR INNOVATSIY NATAL'I KASPERSKOY”

Inventors: Vladimir Anatol'yevich Lapshin, Dmitriy Vsevolodovich Perov, Yekaterina Aleksandrovna Pshekhotskaya, Sergey S. Ryabov
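
The comparison step in the abstract of patent 9852122 can be approximated in a few lines. This sketch assumes that a "clause" is a run of text between sentence punctuation marks and that two clauses match only when they are identical after case and whitespace normalization; the patent's own extraction and matching rules are not given in the abstract. The default threshold of 0.5 is arbitrary.

```python
import re


def clauses(text):
    """Split text into normalized clauses (assumed: split on . ; ! ?)."""
    parts = re.split(r"[.;!?]", text.lower())
    return {" ".join(part.split()) for part in parts if part.strip()}


def matching_fraction(analyzed, model):
    """Fraction of the analyzed document's clauses found in the model document."""
    analyzed_clauses = clauses(analyzed)
    if not analyzed_clauses:
        return 0.0
    return len(analyzed_clauses & clauses(model)) / len(analyzed_clauses)


def contains_fragments(analyzed, model, threshold=0.5):
    """Compare the fraction with a pre-set threshold value."""
    return matching_fraction(analyzed, model) >= threshold


# Hypothetical model document and document to be analyzed.
model = "The parties agree to keep terms secret. Payment is due in 30 days."
analyzed = "Hello team. Payment is due in 30 days."
fraction = matching_fraction(analyzed, model)
```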
Semiotic class normalization

Patent number: 9852123

Abstract: A language processing system for text normalization of an input string of a semiotic class. In an aspect, a method includes receiving an input string; accessing, for a semiotic class of non-standard words, a language universal covering grammar for a plurality of languages that generates, for each language of the plurality of languages, one or more sequences of word-level components for each instance of the semiotic class in the language; for each of the plurality of languages, accessing a lexical map specific to the language and that maps each sequence of word-level components for each instance of the semiotic class in the language to verbalizations in the language; generating, from the language universal grammar and the lexical maps, a lattice of possible verbalizations of the input string; and selecting one of the possible verbalizations as a selected verbalization for the input string.

Type: Grant

Filed: May 26, 2016

Date of Patent: December 26, 2017

Assignee: Google Inc.

Inventors: Richard Sproat, Ke Wu, Kyle Gorman
Discrepancy curator for documents in a corpus of a cognitive computing system

Patent number: 9842161

Abstract: Curation of a corpus of a cognitive computing system is performed interactively by reporting on a user interface device to a user a parse tree illustration of discrepancies and corresponding assigned confidence factors detected between at least a portion of a first document and a second or more documents in the corpus. Responsive to a user selection of an illustrated discrepancy in the parse tree, a drill-down dialog is prepared and displayed which shows at least a text string for the portion of the first document and at least one conflicting text string from the second or more documents, and which provides at least one user-selectable administrative action option for handling the detected discrepancy. Responsive to receipt of user selection of an administrative action option, the computing system performs the action to handle the detected discrepancy.

Type: Grant

Filed: January 12, 2016

Date of Patent: December 12, 2017

Assignee: International Business Machines Corporation

Inventors: Donna K. Byron, Elie Feirouz, Ashok Kumar, William G. O'Keeffe
Machine translation method for performing translation between languages

Patent number: 9836457

Abstract: Different forward-translated sentences are generated by translating a received translation-source sentence in a first language into a second language. Backward-translated sentences are generated by backward-translating the different forward-translated sentences into the first language. When an operation for selecting one of the backward-translated sentences is received during output of the backward-translated sentences on an information output device, the forward-translated sentence corresponding to the selected backward-translated sentence is output onto the information output device.

Type: Grant

Filed: May 18, 2016

Date of Patent: December 5, 2017

Assignee: PANASONIC INTELLECTUAL PROPERTY CORPORATION OF AMERICA

Inventors: Nanami Fujiwara, Masaki Yamauchi
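
The selection flow in the abstract of patent 9836457 can be sketched with placeholder translators. The function `choose_translation`, the dictionary-based toy translators, and the sample sentences are all hypothetical; a real system would call actual translation engines and present the backward translations to a user.

```python
def choose_translation(source, forward_translators, back_translate, select):
    """Return the forward translation whose backward translation was selected."""
    # Generate different forward-translated sentences.
    forwards = [translate(source) for translate in forward_translators]
    # Translate each one back into the first language.
    backs = [back_translate(sentence) for sentence in forwards]
    # `select` stands in for the user picking one backward translation.
    chosen = select(backs)
    return forwards[chosen]


# Toy stand-ins for translation engines (invented data).
FORWARD_A = {"I am fine": "Estoy bien"}
FORWARD_B = {"I am fine": "Soy fino"}
BACKWARD = {"Estoy bien": "I am well", "Soy fino": "I am thin"}

result = choose_translation(
    "I am fine",
    [FORWARD_A.get, FORWARD_B.get],
    BACKWARD.get,
    select=lambda backs: backs.index("I am well"),
)
```

The user never needs to read the second language: the choice is made among sentences in the first language, and the matching forward translation is returned.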
Computer implemented method and device for accessing a data set

Patent number: 9824160

Abstract: A computer implemented method of accessing a data set comprising a plurality of records, wherein each record is associated with one or more items of data. The method comprises using the computer to receive a data query on the data set. Each record is assigned to an in-group or to an out-group with respect to the query. Words appearing in records of the in-group are determined and a user interface representative of said words is generated. Words appearing in records of the out-group are determined and a user interface representative of said words is generated.

Type: Grant

Filed: June 2, 2014

Date of Patent: November 21, 2017

Assignee: SYNERSCOPE B.V.

Inventors: Jorik Blaas, Willem Robert Van Hage, Danny Hubertus Rosalia Holten
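
The grouping described in the abstract of patent 9824160 can be sketched as a single pass over the records. The sketch assumes that a query is a predicate on a record and that words are whitespace-separated tokens; it stops at the word counts and does not attempt the user interface. The name `split_and_count` and the sample records are hypothetical.

```python
from collections import Counter


def split_and_count(records, query):
    """Count words separately for records inside and outside the query result."""
    in_words, out_words = Counter(), Counter()
    for record in records:
        # Assign the record to the in-group or the out-group.
        target = in_words if query(record) else out_words
        target.update(record.lower().split())
    return in_words, out_words


# Hypothetical data set and query.
records = ["red apple pie", "green apple tart", "blue cheese"]
in_words, out_words = split_and_count(records, lambda record: "apple" in record)
```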
Translation device that determines whether two consecutive lines in an image should be translated together or separately

Patent number: 9824086

Abstract: A condition determining section (24) determines whether or not two consecutive lines in an image meet a joining condition that is based on a characteristic of a language of a character string, the two consecutive lines being extracted from the character string composed of a plurality of lines. In a case where the joining condition is met, an extracted line joining section (25) and a translation section (26) join and then translate the two consecutive lines.

Type: Grant

Filed: August 20, 2014

Date of Patent: November 21, 2017

Assignee: SHARP KABUSHIKI KAISHA

Inventors: Shinya Satoh, Tatsuo Kishimoto, Tadao Nagasawa
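
The joining condition in the abstract of patent 9824086 depends on a characteristic of the language, which the abstract does not spell out. The sketch below assumes one plausible condition for English: two consecutive lines are joined when the first does not end in terminal punctuation and the second starts with a lowercase letter. The names `should_join` and `group_lines` are hypothetical, and the translation step itself is omitted.

```python
TERMINATORS = (".", "!", "?", ":")


def should_join(first, second):
    """Assumed joining condition for English text lines."""
    first, second = first.rstrip(), second.lstrip()
    if not first or not second:
        return False
    return not first.endswith(TERMINATORS) and second[0].islower()


def group_lines(lines):
    """Merge consecutive lines that should be translated together."""
    units = []
    for line in lines:
        if units and should_join(units[-1], line):
            units[-1] = units[-1].rstrip() + " " + line.strip()
        else:
            units.append(line.strip())
    return units


units = group_lines(["Please keep off", "the grass.", "Thank you"])
```

Each item of `units` would then be handed to the translator as one string, so a sentence broken across two lines is translated as a whole.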
