Trigrams Or Digrams Patents (Class 382/230)
  • Patent number: 11868609
    Abstract: In accordance with one or more aspects of a dynamic soft keyboard, a user input is received via a soft keyboard having multiple keys. Information describing a current input environment for the soft keyboard is obtained, and a determination is made as to which one or more keys of the multiple keys was intended to be selected by the user input. This determination is made based at least in part on the current input environment.
    Type: Grant
    Filed: July 6, 2022
    Date of Patent: January 9, 2024
    Assignee: Microsoft Technology Licensing, LLC.
    Inventors: Erik M. Geidl, Shawn R. LeProwse, Ian C. LeGrow, Reed L. Townsend
  • Patent number: 11853684
    Abstract: A computing system accesses an image-based document and a text document having text extracted from the image-based document and provides a user interface displaying at least a portion of the image-based document. In response to selection of a text portion of the image-based document, the system determines an occurrence of the text portion within at least a portion of the image-based document and then applies a search model on the text document to identify the same occurrence of the text portion. Once matched, alignment data indicating a relationship between a selected tag and both the text portion of the image-based document and the text portion of the text document is stored.
    Type: Grant
    Filed: December 20, 2022
    Date of Patent: December 26, 2023
    Assignee: Palantir Technologies Inc.
    Inventors: Suchan Lee, Jon Paek
  • Patent number: 11775749
    Abstract: The embodiments present a new class of content masking defenses against the Portable Document Format (PDF) standard. The defenses can identify attacks that cause documents to appear different than the underlying content extracted from the documents. A content masking defense method can include identifying a content masking attack by scanning a document file to extract a character code of a character appearing in the file. Next, the character is rendered based on a font that is embedded in the document file. Optical character recognition can be performed on the rendering, and a content masking attack can be identified based on a comparison of a result of the optical character recognition against the character code of the character.
    Type: Grant
    Filed: December 29, 2020
    Date of Patent: October 3, 2023
    Assignee: UNIVERSITY OF SOUTH FLORIDA
    Inventors: Yao Liu, Zhuo Lu, Ian Davidson Markwood, Dakun Shen
  • Patent number: 11694029
    Abstract: Techniques are provided for identifying attributes associated with a neologism or an unknown word or name. Real world characteristics can be predicted for the neologism. Trigrams are identified for an input word and word embedding model vector values are calculated for the identified trigrams and entered into a matrix. Trigrams are identified for nearest names. Classification values are calculated based on the trigrams for the input word and the trigrams from the nearest names and the classification values are entered into the matrix. A convolutional neural network can process the matrix to identify one or more characteristics associated with the neologism.
    Type: Grant
    Filed: August 4, 2020
    Date of Patent: July 4, 2023
    Assignee: Oracle International Corporation
    Inventors: Michael Malak, Luis E. Rivas, Mark Lee Kreider
  • Patent number: 11562120
    Abstract: A computing system accesses an image-based document and a text document having text extracted from the image-based document and provides a user interface displaying at least a portion of the image-based document. In response to selection of a text portion of the image-based document, the system determines an occurrence of the text portion within at least a portion of the image-based document and then applies a search model on the text document to identify the same occurrence of the text portion. Once matched, alignment data indicating a relationship between a selected tag and both the text portion of the image-based document and the text portion of the text document is stored.
    Type: Grant
    Filed: July 16, 2021
    Date of Patent: January 24, 2023
    Assignee: Palantir Technologies Inc.
    Inventors: Suchan Lee, Jon Paek
  • Patent number: 11334603
    Abstract: A method, system and computer program product for finding groups of potential duplicates in attribute values. Each attribute value of the attribute values is converted to a respective set of bigrams. All bigrams present in the attribute values may be determined. Bigrams present in the attribute values may be represented as bits. This may result in a bitmap representing the presence of the bigrams in the attribute values. The attribute values may be grouped using bitwise operations on the bitmap, where each group includes attribute values that are determined based on pairwise bigram-based similarity scores. The pairwise bigram-based similarity score reflects the number of common bigrams between two attribute values.
    Type: Grant
    Filed: February 14, 2020
    Date of Patent: May 17, 2022
    Assignee: International Business Machines Corporation
    Inventors: Namit Kabra, Yannick Saillet
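    Illustrative sketch (not taken from the patent text): the abstract above describes representing bigrams as bits and comparing attribute values with bitwise operations. The following minimal Python example shows one way that idea could look. The function names, the lowercase normalization, and the `min_common` threshold are assumptions made for illustration only.

```python
from itertools import combinations


def bigrams(value):
    """Return the set of character bigrams in a value (lowercased; an assumption)."""
    s = value.lower()
    return {s[i:i + 2] for i in range(len(s) - 1)}


def build_bitmaps(values):
    """Assign one bit per distinct bigram and encode each value as an integer bitmap."""
    vocab = sorted(set().union(*(bigrams(v) for v in values)))
    bit_of = {bg: i for i, bg in enumerate(vocab)}
    return [sum(1 << bit_of[bg] for bg in bigrams(v)) for v in values]


def common_bigrams(a, b):
    """Count shared bigrams with a bitwise AND followed by a population count."""
    return bin(a & b).count("1")


def candidate_pairs(values, min_common=3):
    """Return pairs of values sharing at least `min_common` bigrams (hypothetical threshold)."""
    maps = build_bitmaps(values)
    return [
        (values[i], values[j])
        for i, j in combinations(range(len(values)), 2)
        if common_bigrams(maps[i], maps[j]) >= min_common
    ]
```

    For example, `candidate_pairs(["Jonathan", "Jonathon", "Maria"])` pairs the two spellings of the first name and leaves "Maria" ungrouped.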
  • Patent number: 11256756
    Abstract: Techniques for determining character string differences between a target character string and one or more candidate character strings are provided. In some implementations, a target bitmap is produced for the target character string and a target bitmap weight is calculated. A candidate bitmap and a candidate bitmap weight associated with a candidate character string are obtained. In response to determining that the candidate bitmap weight differs from the target bitmap weight by less than a first threshold value, an exclusive OR operation is performed between the target bitmap and the candidate bitmap. In response to determining that the number of ones in the result of the exclusive OR is less than a second threshold value, the candidate character string is included in a character set that includes one or more character strings that are close to the target character string.
    Type: Grant
    Filed: August 21, 2018
    Date of Patent: February 22, 2022
    Assignee: Advanced New Technologies Co., Ltd.
    Inventor: Xiaofeng Fan
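    Illustrative sketch (not taken from the patent text): the two-stage filter in the abstract above, a cheap weight comparison followed by an exclusive OR, can be sketched as below. The abstract does not say how a bitmap is built from a string, so the character-to-bit mapping here (`ord(ch) % width`), the reading of "weight" as the number of set bits, and both threshold defaults are assumptions.

```python
def char_bitmap(text, width=64):
    """Set one bit per character; the mapping ord(ch) % width is an assumption."""
    bits = 0
    for ch in text:
        bits |= 1 << (ord(ch) % width)
    return bits


def weight(bitmap):
    """Bitmap weight, assumed here to mean the number of set bits."""
    return bin(bitmap).count("1")


def close_strings(target, candidates, weight_threshold=2, xor_threshold=3):
    """Keep candidates whose weight is near the target's and whose XOR has few ones."""
    t_map = char_bitmap(target)
    t_weight = weight(t_map)
    close = []
    for cand in candidates:
        c_map = char_bitmap(cand)
        # Stage 1: skip the XOR entirely when the weights differ too much.
        if abs(weight(c_map) - t_weight) >= weight_threshold:
            continue
        # Stage 2: count the differing bits.
        if weight(t_map ^ c_map) < xor_threshold:
            close.append(cand)
    return close
```

    With these hypothetical defaults, `close_strings("hello", ["hallo", "help", "world"])` keeps "hallo" and "help" and rejects "world".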
  • Patent number: 11138391
    Abstract: In an embodiment of a messaging system, a method for presenting a commercial message to a user is provided. A target language in which the user is comfortable communicating may be determined based on at least one communication received by the user or at least one communication provided by the user. The commercial message may be presented to the user in the target language.
    Type: Grant
    Filed: June 4, 2019
    Date of Patent: October 5, 2021
    Assignee: AT&T INTELLECTUAL PROPERTY II, L.P.
    Inventor: Srinivas Bangalore
  • Patent number: 10387729
    Abstract: Techniques for tagging virtualized content are disclosed. In some embodiments, a modeled three-dimensional scene of objects representing abstracted source content is generated and analyzed to determine a contextual characteristic of the scene that is based on a plurality of objects comprising the scene. The modeled scene is tagged with a tag specifying the determined contextual characteristic.
    Type: Grant
    Filed: July 9, 2014
    Date of Patent: August 20, 2019
    Assignee: Outward, Inc.
    Inventor: Clarence Chui
  • Patent number: 10373610
    Abstract: Described herein are systems and methods for automatic unit selection and target decomposition for sequence labelling. Embodiments include a new loss function called Gram-Connectionist Temporal Classification (CTC) loss that extends the popular CTC loss function criterion to alleviate prior limitations. While preserving the advantages of CTC, Gram-CTC automatically learns the best set of basic units (grams), as well as the most suitable decomposition of target sequences. Unlike CTC, embodiments of Gram-CTC allow a model to output a variable number of characters at each time step, which enables the model to capture longer term dependency and improves the computational efficiency. It is also demonstrated that embodiments of Gram-CTC improve CTC in terms of both performance and efficiency on the large vocabulary speech recognition task at multiple scales of data, and that systems that employ an embodiment of Gram-CTC can outperform the state-of-the-art on a standard speech benchmark.
    Type: Grant
    Filed: September 7, 2017
    Date of Patent: August 6, 2019
    Assignee: Baidu USA LLC
    Inventors: Hairong Liu, Zhenyao Zhu, Sanjeev Satheesh
  • Patent number: 10332509
    Abstract: Embodiments of end-to-end deep learning systems and methods are disclosed to recognize speech of vastly different languages, such as English or Mandarin Chinese. In embodiments, the entire pipelines of hand-engineered components are replaced with neural networks, and the end-to-end learning allows handling a diverse variety of speech including noisy environments, accents, and different languages. Using a trained embodiment and an embodiment of a batch dispatch technique with GPUs in a data center, an end-to-end deep learning system can be inexpensively deployed in an online setting, delivering low latency when serving users at scale.
    Type: Grant
    Filed: November 21, 2016
    Date of Patent: June 25, 2019
    Assignee: Baidu USA, LLC
    Inventors: Bryan Catanzaro, Jingdong Chen, Mike Chrzanowski, Erich Elsen, Jesse Engel, Christopher Fougner, Xu Han, Awni Hannun, Ryan Prenger, Sanjeev Satheesh, Shubhabrata Sengupta, Dani Yogatama, Chong Wang, Jun Zhan, Zhenyao Zhu, Dario Amodei
  • Patent number: 10319374
    Abstract: Embodiments of end-to-end deep learning systems and methods are disclosed to recognize speech of vastly different languages, such as English or Mandarin Chinese. In embodiments, the entire pipelines of hand-engineered components are replaced with neural networks, and the end-to-end learning allows handling a diverse variety of speech including noisy environments, accents, and different languages. Using a trained embodiment and an embodiment of a batch dispatch technique with GPUs in a data center, an end-to-end deep learning system can be inexpensively deployed in an online setting, delivering low latency when serving users at scale.
    Type: Grant
    Filed: November 21, 2016
    Date of Patent: June 11, 2019
    Assignee: Baidu USA, LLC
    Inventors: Bryan Catanzaro, Jingdong Chen, Mike Chrzanowski, Erich Elsen, Jesse Engel, Christopher Fougner, Xu Han, Awni Hannun, Ryan Prenger, Sanjeev Satheesh, Shubhabrata Sengupta, Dani Yogatama, Chong Wang, Jun Zhan, Zhenyao Zhu, Dario Amodei
  • Patent number: 10140371
    Abstract: Approaches for translating a transliterated search query are provided. An approach includes receiving a search query containing a transliterated word. The approach also includes determining a source language corresponding to the transliterated word. The approach further includes converting the transliterated word to a word in the source language. The approach additionally includes translating the word in the source language to a word in a target language. The approach also includes performing a search using the word in the target language.
    Type: Grant
    Filed: July 18, 2017
    Date of Patent: November 27, 2018
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Sasha P. Caskey, Rick A. Hamilton, II, Dimitri Kanevsky, Tara N. Sainath
  • Patent number: 9881005
    Abstract: Log files include log file content, some of which (especially a non-runtime portion) is in human-readable language. Translation of log file content is accomplished by: (i) generating first log content in a first human-readable language using a first resource bundle related to data translation; and (ii) translating the first log content to second log content, which corresponds to the first log content but is in a second human-readable language, using the first resource bundle. The translated log content may have annotations and/or processing rules applied to it. The translation of the present invention can help to keep the translation accurate and uniform so that the translated log content may be more effectively used in various ways.
    Type: Grant
    Filed: September 4, 2014
    Date of Patent: January 30, 2018
    Assignee: International Business Machines Corporation
    Inventors: Arun Ramakrishnan, Rohit Shetty
  • Patent number: 9852129
    Abstract: Log files include log file content, some of which (especially a non-runtime portion) is in human-readable language. Translation of log file content is accomplished by: (i) generating first log content in a first human-readable language using a first resource bundle related to data translation; and (ii) translating the first log content to second log content, which corresponds to the first log content but is in a second human-readable language, using the first resource bundle. The translated log content may have annotations and/or processing rules applied to it. The translation of the present invention can help to keep the translation accurate and uniform so that the translated log content may be more effectively used in various ways.
    Type: Grant
    Filed: November 26, 2013
    Date of Patent: December 26, 2017
    Assignee: International Business Machines Corporation
    Inventors: Arun Ramakrishnan, Rohit Shetty
  • Patent number: 9298276
    Abstract: In one example, a computing device includes at least one processor configured to output, for display, a graphical keyboard. The at least one processor may be configured to receive an input indication. The at least one processor may be configured to select, based in part on the input indication, keys of the graphical keyboard. The at least one processor may be configured to classify a group of characters associated with the keys into a category, wherein at least one of the group of characters comprises a number or a non-alphabetic symbol. The at least one processor may be configured to determine, based in part on a language model, at least one candidate character string that corresponds to the category. The at least one processor may be configured to output the at least one candidate character string.
    Type: Grant
    Filed: December 31, 2013
    Date of Patent: March 29, 2016
    Assignee: Google Inc.
    Inventor: Xiaojun Bi
  • Patent number: 9223869
    Abstract: A web browser agent or plug-in installed into a web browser of a client device provides translation services along with a search engine server. The system accesses a web page in one (local) language and then translates to another (foreign) language and displays the translated content in a web page for the user's viewing. The web browser agent is an add-on software tool or plug-in, provided by the search engine server and installed into the web browser. As a result of installation, a toolbar appears on the top of the web browser's page. This toolbar provides the interface to enable local translation of web pages from a local/web language to a target/foreign language useful to the user. Centralized (cloud computing) translation services by servers of a third party may also be employed. Web pages in any number of languages may be accessed using this operations/structure.
    Type: Grant
    Filed: October 31, 2012
    Date of Patent: December 29, 2015
    Assignee: RPX CORPORATION
    Inventor: James D. Bennett
  • Patent number: 9223758
    Abstract: Systems, methods and computer storage mediums automatically apply a language encoding data setting to a web page. Embodiments of the present disclosure relate to equipping a web browser with the ability to automatically open web pages with an appropriate language encoding data setting applied to the web page so that the web page is displayed without garbled characters. The web browser is able to determine the appropriate language encoding setting by requesting the appropriate language encoding setting for the web page that is stored in a language encoding database that is updated each time the web page is successfully opened without displaying garbled characters.
    Type: Grant
    Filed: June 15, 2012
    Date of Patent: December 29, 2015
    Assignee: Google Inc.
    Inventor: Takuya Oikawa
  • Patent number: 9189472
    Abstract: An embodiment is directed to an interface for a small screen device, such as a watch, that enables a user to enter text on the small screen device by touching in the vicinity of characters, rather than aiming for a particular button or the exact location of a character. Embodiments further enable the design of interfaces without the use of buttons for controlling the entry of text on the small screen device.
    Type: Grant
    Filed: March 8, 2012
    Date of Patent: November 17, 2015
    Assignee: Touchtype Limited
    Inventors: Benjamin William Medlock, Jonathan Paul Reynolds
  • Patent number: 9135517
    Abstract: A method and apparatus for identifying a document in a set of stored documents based on a pattern of characteristics in the document is presented. A digital image including at least a portion of a document is acquired. A pattern of characteristics is then identified in the digital image. The pattern is matched to the set of stored documents to identify the document in the digital image from the set of stored documents.
    Type: Grant
    Filed: November 29, 2012
    Date of Patent: September 15, 2015
    Assignee: Amazon Technologies, Inc.
    Inventor: Jeffrey Penrod Adams
  • Patent number: 9036923
    Abstract: Provided are an age estimation apparatus, an age estimation method, and an age estimation program capable of obtaining a recognition result closely matching the result perceived by a human. An age estimation apparatus 10 for estimating an age of a person on image data includes a dimension compressor 11 for applying dimension compression to the image data to output low dimensional data; and an identification device 12 for estimating an age of a person on the basis of a learning result using a feature amount contained in the low dimensional data, wherein a parameter used for the dimension compression by the dimension compressor 11 and the feature amount used for age estimation by the identification device 12 are set on the basis of a result of an evaluation of a generalization capability using a weighting function that shows a degree of seriousness of an age estimation error for every age, and learning of the identification device 12 is performed on the basis of the weighting function.
    Type: Grant
    Filed: April 14, 2010
    Date of Patent: May 19, 2015
    Assignees: NEC Solution Innovators, Ltd., TOKYO INSTITUTE OF TECHNOLOGY
    Inventors: Kazuya Ueki, Masashi Sugiyama, Yasuyuki Ihara
  • Patent number: 8914278
    Abstract: A computer-assisted language correction system including spelling correction functionality, misused word correction functionality, grammar correction functionality and vocabulary enhancement functionality utilizing contextual feature-sequence functionality employing an internet corpus.
    Type: Grant
    Filed: July 31, 2008
    Date of Patent: December 16, 2014
    Assignee: Ginger Software, Inc.
    Inventors: Yael Karov Zangvil, Avner Zangvil
  • Patent number: 8838453
    Abstract: A user input is received by a computing device. An interactive input module determines whether the user input is a first character of a script for a supported language. If the user input is a first character, the first character is stored in an input buffer. A plurality of words in the supported language that match the contents of the input buffer are identified, and a subset of the plurality of words is displayed to the user based on a frequency value associated with each of the plurality of words.
    Type: Grant
    Filed: August 31, 2010
    Date of Patent: September 16, 2014
    Assignee: Red Hat, Inc.
    Inventor: Pravin Satpute
  • Patent number: 8818111
    Abstract: Provided are an age estimation apparatus, an age estimation method, and an age estimation program capable of reducing the labor of labeling the image data used for age estimation. An age estimation apparatus for estimating an age of a person on image data includes a dimension compression unit for applying dimension compression to the image data to output low dimensional data; a clustering unit for performing clustering of the low dimensional data outputted; a labeling unit for labeling representative data of each cluster among the low dimensional data clustered; and an identification unit for estimating an age of a person on the basis of a learning result using a feature amount contained in labeled low dimensional data and unlabeled low dimensional data.
    Type: Grant
    Filed: April 14, 2010
    Date of Patent: August 26, 2014
    Assignees: NEC Soft, Ltd., Tokyo Institute of Technology
    Inventors: Kazuya Ueki, Masashi Sugiyama, Yasuyuki Ihara
  • Patent number: 8744198
    Abstract: A computer-implemented method includes dividing an image into one or more image channels for image compression. The method also includes dividing one or more of the image channels into one or more blocks. At least one of the blocks includes floating point representations of pixel values included in the block. The method also includes converting the floating point representations of pixel values into integer representations such that the sign of each floating point representation is preserved. The method also includes storing the difference of adjacent integer representations as a compressed version of the image.
    Type: Grant
    Filed: November 20, 2007
    Date of Patent: June 3, 2014
    Assignee: Lucasfilm Entertainment Company Ltd.
    Inventor: Florian Kainz
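    Illustrative sketch (not taken from the patent text): the abstract above describes converting floating point pixel values to sign-preserving integers and storing differences of adjacent integers. The minimal Python example below shows one plausible reading for 32-bit floats. The specific mapping (magnitude bits negated when the sign bit is set) is an assumption, and it does not distinguish -0.0 from +0.0.

```python
import struct


def float_to_int(x):
    """Map a float32 to a signed integer that keeps the float's sign (assumed mapping)."""
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    magnitude = bits & 0x7FFFFFFF
    return -magnitude if bits >> 31 else magnitude


def int_to_float(n):
    """Invert float_to_int by restoring the sign bit for negative integers."""
    bits = (0x80000000 | -n) if n < 0 else n
    return struct.unpack("<f", struct.pack("<I", bits))[0]


def compress(pixels):
    """Store the first integer as-is, then the difference of each adjacent pair."""
    ints = [float_to_int(p) for p in pixels]
    return ints[:1] + [b - a for a, b in zip(ints, ints[1:])]


def decompress(deltas):
    """Rebuild the pixel values with a running sum over the stored differences."""
    out, total = [], 0
    for d in deltas:
        total += d
        out.append(int_to_float(total))
    return out
```

    Neighboring pixels with similar values yield small differences, which a later entropy-coding stage (not shown, and not described in the abstract) could store compactly.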
  • Patent number: 8233726
    Abstract: Disclosed herein is a method, computer system and computer program product for identifying a writing system associated with a document image containing one or more words written in the writing system. Initially, a document image fragment is identified based on the document image, wherein the document image fragment contains one or more pixels from one or more of the words in the document image. A set of sequential features associated with the document image fragment is generated, wherein each sequential feature describes one dimensional graphic information derived from the one or more pixels in the document image fragment. A classification score for the document image fragment is generated responsive at least in part to the set of sequential features, the classification score indicating a likelihood that the document image fragment is written in the writing system.
    Type: Grant
    Filed: November 27, 2007
    Date of Patent: July 31, 2012
    Assignee: Google Inc.
    Inventors: Ashok Popat, Eugene Brevdo
  • Patent number: 8208765
    Abstract: An image of a character string composed of M pieces of characters is clipped from a document image, and the image is divided into separate characters. Image features of each character image are extracted. Based on the image features, N (N>1, integer) pieces of character images in descending order of degree of similarity are selected as candidate characters, from a character image feature dictionary which stores the image features of character image in units of character, and a first index matrix of M×N cells is prepared. A candidate character string composed of a plurality of candidate characters constituting a first column of the first index matrix, is subjected to a lexical analysis according to a language model, and whereby a second index matrix having a character string which makes sense is prepared. In the language model, statistics are taken and then, the lexical analysis is performed.
    Type: Grant
    Filed: January 10, 2008
    Date of Patent: June 26, 2012
    Assignee: Sharp Kabushiki Kaisha
    Inventors: Bo Wu, Jianjun Dou, Ning Le, Yadong Wu, Jing Jia
  • Patent number: 8150170
    Abstract: Statistical approaches to large-scale image annotation are described. Generally, the annotation technique includes compiling visual features and textual information from a number of images, hashing the images' visual features, and clustering the images based on their hash values. An example system builds statistical language models from the clustered images and annotates the image by applying one of the statistical language models.
    Type: Grant
    Filed: May 30, 2008
    Date of Patent: April 3, 2012
    Assignee: Microsoft Corporation
    Inventors: Mingjing Li, Xiaoguang Rui
  • Patent number: 8102284
    Abstract: A handheld electronic device includes a reduced QWERTY keyboard and is enabled with disambiguation software that is operable to disambiguate compound word text input. The device provides output in the form of a default output and a number of variants. The output is based largely upon the frequency, i.e., the likelihood that a user intended a particular output, but various features of the device provide additional variants that are not based solely on frequency and rather are provided by various logic structures resident on the device.
    Type: Grant
    Filed: July 22, 2009
    Date of Patent: January 24, 2012
    Assignee: Research In Motion Limited
    Inventors: Vadim Fux, Michael Elizarov
  • Patent number: 7697720
    Abstract: One embodiment of a method of tracking a plurality of targets can be broadly summarized by the following steps: capturing a plurality of images of a plurality of targets with a plurality of image capture devices; generating a target observation for each target, said target observation including at least a visual signature of the target and a time value; partitioning target observations according to similarities in their visual signatures; and producing primary tracks from the partitioned target observations, wherein each primary track includes ordered sequences of observation events having similarities in their visual signatures. Other methods and systems are also provided.
    Type: Grant
    Filed: September 15, 2005
    Date of Patent: April 13, 2010
    Assignee: Hewlett-Packard Development Company, L.P.
    Inventor: Colin Andrew Low
  • Patent number: 7689409
    Abstract: After prestoring first character strings that occur frequently in words of languages and second character strings that are atypical therein, a device for automatically identifying the language of a text from a plurality of languages extracts words from the text and constructs all of the character strings contained in each extracted word. Each string in an extracted word is compared to the first and second strings of a particular language. If the word contains a first string, a score of the language is increased by a coefficient depending in particular on the position of the first string in the word. If the word contains a second string, the score is decreased by a coefficient associated with the second string. The highest of the scores corresponding to the predetermined languages identifies the language of the text.
    Type: Grant
    Filed: December 11, 2003
    Date of Patent: March 30, 2010
    Assignee: France Telecom
    Inventor: Johannes Heinecke
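    Illustrative sketch (not taken from the patent text): the scoring scheme in the abstract above (reward frequent strings with a position-dependent coefficient, penalize atypical strings) can be sketched as follows. The tiny `FREQUENT` and `ATYPICAL` tables, their weights, and the rule that word-initial or word-final strings count double are all hypothetical values chosen for illustration.

```python
# Hypothetical tables: strings frequent in each language, and strings atypical for it.
FREQUENT = {"en": {"th": 1.0, "ing": 1.0}, "fr": {"eau": 1.0, "qu": 1.0}}
ATYPICAL = {"en": {"eau": 0.5}, "fr": {"th": 0.5, "w": 0.5}}


def position_coefficient(word, start, length):
    """Assumed rule: a string at the start or end of a word counts double."""
    at_edge = start == 0 or start + length == len(word)
    return 2.0 if at_edge else 1.0


def identify_language(text):
    """Score every substring of every word and return the best language and all scores."""
    scores = {lang: 0.0 for lang in FREQUENT}
    for word in text.lower().split():
        for start in range(len(word)):
            for end in range(start + 1, len(word) + 1):
                piece = word[start:end]
                for lang in scores:
                    if piece in FREQUENT[lang]:
                        coeff = position_coefficient(word, start, end - start)
                        scores[lang] += FREQUENT[lang][piece] * coeff
                    if piece in ATYPICAL[lang]:
                        scores[lang] -= ATYPICAL[lang][piece]
    return max(scores, key=scores.get), scores
```

    With these toy tables, `identify_language("the thing")` selects "en" and `identify_language("beau quai")` selects "fr".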
  • Patent number: 7583205
    Abstract: A handheld electronic device includes a reduced QWERTY keyboard and is enabled with disambiguation software that is operable to disambiguate compound word text input. The device provides output in the form of a default output and a number of variants. The output is based largely upon the frequency, i.e., the likelihood that a user intended a particular output, but various features of the device provide additional variants that are not based solely on frequency and rather are provided by various logic structures resident on the device.
    Type: Grant
    Filed: July 28, 2005
    Date of Patent: September 1, 2009
    Assignee: Research In Motion Limited
    Inventors: Vadim Fux, Michael Elizarov
  • Patent number: 7573404
    Abstract: A handheld electronic device includes a reduced QWERTY keyboard and is enabled with disambiguation software that is operable to disambiguate compound word text input. The device provides output in the form of a default output and a number of variants. The output is based largely upon the frequency, i.e., the likelihood that a user intended a particular output, but various features of the device provide additional variants that are not based solely on frequency and rather are provided by various logic structures resident on the device.
    Type: Grant
    Filed: July 28, 2005
    Date of Patent: August 11, 2009
    Assignee: Research In Motion Limited
    Inventors: Vadim Fux, Michael Elizarov
  • Patent number: 7539326
    Abstract: An OCR percentage matching algorithm achieves a significant reduction in false mismatches by accounting for combinations of unprocessed spaces, missing characters, extra characters and character substitution errors during the OCR scanning processing, and allows for a specified percentage of the OCR character scan, rather than the entire OCR character scan, to be the same as the expected character string to declare a match.
    Type: Grant
    Filed: December 23, 2005
    Date of Patent: May 26, 2009
    Assignee: Pitney Bowes Inc.
    Inventors: Joseph Eremita, Adrian Ruck
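    Illustrative sketch (not taken from the patent text): the abstract above does not disclose the matching algorithm itself, so the example below substitutes Python's standard `difflib.SequenceMatcher` as a stand-in to show what a percentage-based match decision could look like. Stripping spaces before comparison, dividing by the longer string's length, and the 80% default are assumptions.

```python
from difflib import SequenceMatcher


def match_percentage(expected, scanned):
    """Percentage of characters aligned between the expected and scanned strings."""
    # Ignore spaces so unprocessed or spurious spaces do not count as errors.
    expected = expected.replace(" ", "")
    scanned = scanned.replace(" ", "")
    if not expected:
        return 100.0 if not scanned else 0.0
    matcher = SequenceMatcher(None, expected, scanned, autojunk=False)
    matched = sum(block.size for block in matcher.get_matching_blocks())
    # Dividing by the longer length penalizes both missing and extra characters.
    return 100.0 * matched / max(len(expected), len(scanned))


def is_match(expected, scanned, required_percent=80.0):
    """Declare a match when enough of the scan agrees with the expected string."""
    return match_percentage(expected, scanned) >= required_percent
```

    Under these assumptions a scan with one substituted character out of eight, such as "ABC1Z345" against "ABC12345", scores 87.5% and is accepted at the 80% setting.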
  • Publication number: 20090034851
    Abstract: Systems and methods for classifying content as adult content and, if desired, blocking content so classified from presentation to a user are provided. Received content is analyzed using a sequential series of classification techniques, each successive technique being implemented only if the previous technique did not result in classification of the content as adult content. In this way, adult content may be identified across a variety of different media types (e.g., text, images, video, etc.) and yet processing power may be reserved if one or more techniques requiring less power is sufficient to determine that the received content is, in fact, adult content. Content classification may be performed in-band (that is, in substantially real-time such that content may be identified and/or blocked at the time results of a user query are returned) or out-of-band (that is, prospectively as new content is received but not in association with a user query).
    Type: Application
    Filed: August 3, 2007
    Publication date: February 5, 2009
    Applicant: MICROSOFT CORPORATION
    Inventors: Xiadong Fan, Richard Qian
  • Patent number: 7451317
    Abstract: An apparatus for and method of embedding a watermark into original information, transmitting the watermarked information, and reconstructing the watermark from the transmitted watermarked information include embedding a portion of a plurality of components constituting the watermark into the original information, and using the remaining portion of the components as keys for reconstructing the watermark. According to the method, since a size of data to be embedded is greatly reduced, degradation of the watermarked information is prevented, and the information becomes more robust against hacking attacks or errors occurring in a variety of ways when transmitting the information.
    Type: Grant
    Filed: December 2, 2002
    Date of Patent: November 11, 2008
    Assignee: Samsung Electronics Co., Ltd.
    Inventors: Sang-heun Oh, Byung-jun Kim, Sung-wook Park
  • Patent number: 7443316
    Abstract: A method (300) for entering a character into an electronic device (100) is provided. The method (300) includes displaying (301) input character keys (204) on a touch sensitive region (202) of a display screen (105) of the device (100), the keys identifying an associated character. Next, a display step (309) shows at least one entered character in a display region (201) of the screen, the entered character having been selected by actuation of one of the character keys (204). Next, a group of potential subsequent characters that follow the entered character is predicted (311, 317). A second set of input character keys (205) identifying the potential subsequent characters is displayed (327). The second set of keys (205) are grouped together (323) such that their relative screen locations with respect to each other are different to that of corresponding keys in the first set of keys (204).
    Type: Grant
    Filed: September 1, 2005
    Date of Patent: October 28, 2008
    Assignee: Motorola, Inc.
    Inventor: Swee Ho Lim
  • Patent number: 7218249
    Abstract: A hand-held communication device provides navigation key-based predictive text entry. The hand-held communication device includes a housing generally sized to be held in a human hand having a display disposed for displaying characters selectable for entry in a character position of a text string being entered and a navigation key assembly for scrolling through and selecting from the characters displayed by the display. The characters displayed by the display during text entry are arranged according to the probability of selection of each character for entry in the character position so that the character with the highest probability of selection is selected with a single input from the navigation key assembly.
    Type: Grant
    Filed: June 8, 2004
    Date of Patent: May 15, 2007
    Assignee: Siemens Communications, Inc.
    Inventor: Lovleen Chadha
  • Patent number: 7046847
    Abstract: A technique for extracting a meaningful text block from a document where a table, an itemized list, a multiple column, etc., are arbitrarily laid out. A document is input which is laid out using blanks or the like, then a symbol is acquired which is associated with a spatial coordinate of the document. Consecutive characters of the same type are extracted from the symbol to generate a token and a space. A stream is generated from consecutive spaces in the column direction, while a text block is generated from streams and tokens. A link is generated between the text blocks to form a document graph. Validity of a connection (link) between the text blocks in the document graph is evaluated using a language model, then the text blocks are merged if the connection is valid.
    Type: Grant
    Filed: June 25, 2001
    Date of Patent: May 16, 2006
    Assignee: International Business Machines Corporation
    Inventors: Matthew F. Hurst, Tetsuya Nasukawa
  • Patent number: 7020338
    Abstract: A method of identifying the script of a line of text by first assigning a weight to each n-gram in a group of documents of known scripts, where each n-gram is a sequence of numbers representing k-means cluster centroids of a known script to which character segments in the documents of known scripts most closely match. A line of text is identified, where the line of text is made up of pixels. The identified line of text is cropped so that only a percentage of the pixels remain. The cropped line is vertically and horizontally rescaled into gray-scale pixels. The vertical gray-scale pixels are replaced with the sequence number of a k-means cluster centroid of a known script to which it most closely matches. The n-grams of the number sequence that represents the line of text are scored against the n-gram weights of the documents of known text. The highest score of the line of text is identified and compared to the scores of the documents of known scripts.
    Type: Grant
    Filed: April 8, 2002
    Date of Patent: March 28, 2006
    Assignee: The United States of America as represented by the National Security Agency
    Inventor: Carson S. Cumbee
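    Illustration: the scoring step in the abstract above can be sketched roughly as follows. This is a minimal sketch, not the patented method: it assumes the line of text has already been reduced to a list of centroid indices, and that a line's score for a script is simply the sum of the weights of its n-grams. The names `ngrams`, `identify_script`, and `weights_by_script` are hypothetical.

```python
def ngrams(seq, n):
    """Return all contiguous n-grams of a sequence as tuples."""
    return [tuple(seq[i:i + n]) for i in range(len(seq) - n + 1)]


def identify_script(centroid_ids, weights_by_script, n=2):
    """Score a line (a list of centroid indices) against each script.

    weights_by_script maps a script name to a dict of n-gram -> weight.
    Summing the weights is an assumption made for this sketch.
    """
    scores = {
        script: sum(weights.get(gram, 0.0) for gram in ngrams(centroid_ids, n))
        for script, weights in weights_by_script.items()
    }
    best = max(scores, key=scores.get)
    return best, scores


# Toy weights; real weights would be learned from documents of known scripts.
weights_by_script = {
    "latin": {(1, 2): 1.0, (2, 3): 0.5},
    "cyrillic": {(3, 1): 0.8},
}
best_script, all_scores = identify_script([1, 2, 3, 1], weights_by_script)
```

    Here the line contains the bigrams (1, 2), (2, 3), and (3, 1), so the "latin" weights sum higher than the "cyrillic" ones.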
  • Patent number: 6934405
    Abstract: An address reading method with processing steps controlled by parameters, in which free parameters which cannot be adapted by learning samples are to be automatically optimized. These parameters are therefore assigned costs. The values of free parameters which are expensive and lie above selectable cost thresholds are maintained and the remaining free parameters are improved by repeatedly modifying their values on the basis of strategies known per se, taking already evaluated parameter settings into account, and training and evaluating the reading method only with these modified values.
    Type: Grant
    Filed: April 4, 2000
    Date of Patent: August 23, 2005
    Assignee: Siemens Aktiengesellschaft
    Inventor: Michael Schuessler
  • Publication number: 20040042667
    Abstract: A method and apparatus for extracting information from symbolically compressed document images. A deciphering module generates first and second text strings by deciphering respective sequences of template identifiers in first and second symbolically compressed document images. A conditional n-gram module receives the first and second text strings from the deciphering module and extracts n-gram terms therefrom based on a predicate condition. A comparison module generates a measure of similarity between the first and second symbolically compressed document images based on the n-gram terms extracted by the conditional n-gram module.
    Type: Application
    Filed: September 30, 2003
    Publication date: March 4, 2004
    Inventors: Dar-Shyang Lee, Jonathan J. Hull
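    Illustration: one way to picture conditional n-gram extraction and the similarity measure is sketched below. It is an assumption-laden sketch, not the claimed apparatus: the predicate (keep only characters that directly follow a space) and the Jaccard overlap used as the similarity measure are both choices made here for illustration, and the function names are hypothetical. The two input strings stand in for the deciphered text strings.

```python
def conditional_ngrams(text, n, predicate):
    """Keep only characters that satisfy the predicate, then form n-grams."""
    kept = [ch for i, ch in enumerate(text) if predicate(text, i)]
    return {"".join(kept[i:i + n]) for i in range(len(kept) - n + 1)}


def follows_space(text, i):
    """Example predicate: the character is non-blank and directly follows a space."""
    return i > 0 and text[i - 1] == " " and text[i] != " "


def jaccard(a, b):
    """Set overlap in [0, 1]; two empty sets are treated as identical."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


first = conditional_ngrams("the quick brown fox jumps", 2, follows_space)
second = conditional_ngrams("a quick brown fox leaps", 2, follows_space)
similarity = jaccard(first, second)
```

    With these inputs the kept characters are "qbfj" and "qbfl", so two of the four distinct bigrams are shared.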
  • Patent number: 6646572
    Abstract: Keys are arranged on a keyboard as learned during a training stage. During training, a training corpus of input symbol sequences is provided. Each unique symbol in the corpus has an associated key on the keyboard. A cost function that measures a cost of inputting the symbols of the training corpus is globally minimized. Then, the keys are arranged on the keyboard according to the globally minimized cost function. To reduce the distance a pointer must move, the keys can also be arranged in a hexagonal pattern.
    Type: Grant
    Filed: February 18, 2000
    Date of Patent: November 11, 2003
    Assignee: Mitsubishi Electric Research Laboratories, Inc.
    Inventor: Matthew Brand
  • Patent number: 6636636
    Abstract: It is an object of the invention to improve output precision of a final recognition result by further obtaining and applying a forward-chain probability in addition to a backward-chain probability in a Bi-gram statistic process, as a post-processing in the case where a plurality of candidate characters are outputted to one input pattern as a result of character recognition. An apparatus according to the invention has a backward-chain dictionary and a forward-chain dictionary of characters, obtains a chain probability from the i-th character to the (i+1)th character by using the backward-chain dictionary, further obtains a chain probability from the (i+1)th character to the i-th character by using the forward-chain dictionary, and selects the character of the final output result from a plurality of candidate characters on the basis of a value obtained by unifying those chain probabilities.
    Type: Grant
    Filed: January 9, 1998
    Date of Patent: October 21, 2003
    Assignee: Canon Kabushiki Kaisha
    Inventor: Eiji Takasu
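    Illustration: the post-processing idea in the abstract above, combining chain probabilities in both directions to choose among candidate characters, might look like the sketch below. The two lookup tables are hypothetical stand-ins for the two chain dictionaries; multiplying the two probabilities to "unify" them, the floor value for unseen pairs, and the exhaustive search over candidate sequences are all assumptions made for this sketch.

```python
from itertools import product


def pick_characters(candidates, next_given_prev, prev_given_next, floor=1e-6):
    """Choose one character per position from lists of OCR candidates.

    next_given_prev[(a, b)]: probability that b follows a.
    prev_given_next[(a, b)]: probability that a precedes b.
    Pairs missing from a table fall back to a small floor probability.
    """
    best_seq, best_score = None, -1.0
    # Exhaustive search is fine for a short example; it grows exponentially.
    for seq in product(*candidates):
        score = 1.0
        for a, b in zip(seq, seq[1:]):
            # Unify the two directions by multiplying them (an assumption).
            score *= next_given_prev.get((a, b), floor)
            score *= prev_given_next.get((a, b), floor)
        if score > best_score:
            best_seq, best_score = seq, score
    return "".join(best_seq)


# Toy data: two candidate characters for each of three input patterns.
candidates = [["c", "e"], ["a", "o"], ["t", "l"]]
next_given_prev = {("c", "a"): 0.4, ("a", "t"): 0.3, ("e", "a"): 0.2, ("o", "l"): 0.1}
prev_given_next = {("c", "a"): 0.3, ("a", "t"): 0.4, ("e", "a"): 0.1, ("o", "l"): 0.1}
result = pick_characters(candidates, next_given_prev, prev_given_next)
```

    For longer inputs, a dynamic-programming search over adjacent pairs would avoid enumerating every candidate sequence.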
  • Patent number: 6560360
    Abstract: A recognition system is disclosed, including a representation of an object in terms of its constituent parts that is translationally invariant, and which provides scale invariant recognition. The system further provides effective recognition of patterns that are partially present in the input signal, or that are partially occluded, and also provides an effective representation for sequences within the input signal. The system utilizes dynamically determined, context based expectations, for identifying individual features/parts of an object to be recognized. The system is computationally efficient, and capable of highly parallel implementation, and further includes a mechanism for improving the preprocessing of individual sections of an input pattern, either by applying one or more preprocessors selected from a set of several preprocessors, or by changing the parameters within a single preprocessor.
    Type: Grant
    Filed: January 27, 2000
    Date of Patent: May 6, 2003
    Assignees: Nestor, Inc., Brown University Research Foundation
    Inventors: Predrag Neskovic, Douglas L. Reilly, Leon N Cooper
  • Patent number: 6292772
    Abstract: The method of recognizing the language of a single word as to spelling and grammar correction (e.g., identifying the appropriate language resources on a document, paragraph, sentence or even individual word basis), the automatic invocation of transliteration software based on the language of the words (e.g., automatic ASCII to Kanji substitution without requiring the user to explicitly switch into a Kanji mode), the automatic invocation of appropriate machine translation tools when the document's language is different from the user's native tongue(s), the use of document language identification to eliminate from database or web search results any documents which are not written in the user's native language and the automatic identification of user-appropriate languages for the user interface.
    Type: Grant
    Filed: December 1, 1998
    Date of Patent: September 18, 2001
    Assignee: JustSystem Corporation
    Inventor: Mark Kantrowitz
  • Patent number: 6175829
    Abstract: A method and apparatus for verifying a query to provide feedback to users for query reformulation. By utilizing selectivity statistics for semantic and visual characteristics of image objects, query verification “examines” user queries and allows users to reformulate queries through system feedback. Feedback information provided to the user includes (1) the maximum and minimum number of matches for the query; (2) alternatives for both semantic and visual-based query elements; and (3) estimated numbers of matching images. Additional types of feedback information may also be provided. With this feedback, the users know if the query criteria is too tight (i.e. too few matches will be retrieved) or too loose (i.e. too many matches will be retrieved) so that they can relax, refine, or reformulate queries or leave queries unchanged accordingly. Only after queries are verified to have a high possibility of meaningful results, are the queries processed.
    Type: Grant
    Filed: April 22, 1998
    Date of Patent: January 16, 2001
    Assignee: NEC USA, Inc.
    Inventors: Wen-Syan Li, K. Selcuk Candan
  • Patent number: 6137911
    Abstract: Documents are classified into one or more clusters corresponding to predefined classification categories by building a knowledge base comprising matrices of vectors which indicate the significance of terms within a corpus of text formed by the documents and classified in the knowledge base to each cluster. The significance of terms is determined assuming a standard normal probability distribution, and terms are determined to be significant to a cluster if their probability of occurrence being due to chance is low. For each cluster, statistical signatures comprising sums of weighted products and intersections of cluster terms to corpus terms are generated and used as discriminators for classifying documents. The knowledge base is built using prefix and suffix lexical rules which are context-sensitive and applied selectively to improve the accuracy and precision of classification.
    Type: Grant
    Filed: June 16, 1997
    Date of Patent: October 24, 2000
    Assignee: The Dialog Corporation PLC
    Inventor: Maxim Zhilyaev
  • Patent number: 5963671
    Abstract: The most likely to be used characters and controls of a soft keyboard are determined from consulting trigram tables, and enhanced and/or positioned to attract the user and to facilitate quick recognition and selection. The letters and other characters of the soft keyboard display can be arranged in a standard keyboard format, some variation of that format such as a Dvorak layout or an entirely different arrangement such as strings of letters and numbers in alphabetical and numerical order. However, regardless of the layout, an attractant, such as color intensity, or size, is used for emphasis to make a soft keyboard user cognizant of the location of the subset of characters that the user is most likely to select, so that they stand out from the other keys of the keyboard. In addition to enhancing all characters of the subset, particular emphasis can be placed on the most likely character in the subset to be selected.
    Type: Grant
    Filed: August 15, 1997
    Date of Patent: October 5, 1999
    Assignee: International Business Machines Corporation
    Inventors: Liam David Comerford, Thomas Allan Corbi, John Peter Karidis, William Dennis Strohm
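    Illustration: consulting a trigram table to decide which keys to emphasize can be sketched as below. This is a minimal sketch under stated assumptions: the table is built from a toy corpus by counting which character follows each two-character context, and the names `build_trigram_table` and `keys_to_emphasize` are hypothetical. How the chosen keys are then highlighted (color intensity, size) is left out.

```python
from collections import Counter, defaultdict


def build_trigram_table(corpus):
    """Map each two-character context to counts of the character that follows."""
    table = defaultdict(Counter)
    for i in range(len(corpus) - 2):
        table[corpus[i:i + 2]][corpus[i + 2]] += 1
    return table


def keys_to_emphasize(table, typed, k=3):
    """Return up to k likely next characters, most likely first.

    The first element is the candidate for the strongest emphasis.
    """
    context = typed[-2:]
    counts = table.get(context, Counter())
    return [ch for ch, _ in counts.most_common(k)]


table = build_trigram_table("the then them there the")
likely = keys_to_emphasize(table, "th")
unseen = keys_to_emphasize(table, "zz")
```

    In the toy corpus every "th" is followed by "e", so that single key would be emphasized; an unseen context yields no emphasized keys.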
  • Patent number: 5930783
    Abstract: A computer implemented method for searching and retrieving images contained within a database of images in which both semantic and cognitive methodologies are utilized. The method accepts a semantic and cognitive description of an image to be searched from a user, and successively refines the search utilizing semantic and cognitive methodologies and then ranking the results for presentation to the user.
    Type: Grant
    Filed: August 29, 1997
    Date of Patent: July 27, 1999
    Assignee: NEC USA, Inc.
    Inventors: Wen-Syan Li, Kasim S. Candan