Trigrams Or Digrams Patents (Class 382/230)

Dynamic soft keyboard

Patent number: 11868609

Abstract: In accordance with one or more aspects of a dynamic soft keyboard, a user input is received via a soft keyboard having multiple keys. Information describing a current input environment for the soft keyboard is obtained, and a determination is made as to which one or more keys of the multiple keys was intended to be selected by the user input. This determination is made based at least in part on the current input environment.

Type: Grant

Filed: July 6, 2022

Date of Patent: January 9, 2024

Assignee: Microsoft Technology Licensing, LLC.

Inventors: Erik M. Geidl, Shawn R. LeProwse, Ian C. LeGrow, Reed L. Townsend
Synchronization and tagging of image and text data

Patent number: 11853684

Abstract: A computing system accesses an image-based document and a text document having text extracted from the image-based document and provides a user interface displaying at least a portion of the image-based document. In response to selection of a text portion of the image-based document, the system determines an occurrence of the text portion within at least a portion of the image-based document and then applies a search model on the text document to identify the same occurrence of the text portion. Once matched, alignment data indicating a relationship between a selected tag and both the text portion of the image-based document and the text portion of the text document is stored.

Type: Grant

Filed: December 20, 2022

Date of Patent: December 26, 2023

Assignee: Palantir Technologies Inc.

Inventors: Suchan Lee, Jon Paek
Content masking attacks against information-based services and defenses thereto

Patent number: 11775749

Abstract: The embodiments present a new class of content masking defenses against the Portable Document Format (PDF) standard. The defenses can identify attacks that cause documents to appear different than the underlying content extracted from the documents. A content masking defense method can include identifying a content masking attack by scanning a document file to extract a character code of a character appearing in the file. Next, the character is rendered based on a font that is embedded in the document file. Optical character recognition can be performed on the rendering, and a content masking attack can be identified based on a comparison of a result of the optical character recognition against the character code of the character.

Type: Grant

Filed: December 29, 2020

Date of Patent: October 3, 2023

Assignee: UNIVERSITY OF SOUTH FLORIDA

Inventors: Yao Liu, Zhuo Lu, Ian Davidson Markwood, Dakun Shen
Neologism classification techniques with trigrams and longest common subsequences

Patent number: 11694029

Abstract: Techniques are provided for identifying attributes associated with a neologism or an unknown word or name. Real world characteristics can be predicted for the neologism. Trigrams are identified for an input word and word embedding model vector values are calculated for the identified trigrams and entered into a matrix. Trigrams are identified for nearest names. Classification values are calculated based on the trigrams for the input word and the trigrams from the nearest names and the classification values are entered into the matrix. A convolutional neural network can process the matrix to identify one or more characteristics associated with the neologism.

Type: Grant

Filed: August 4, 2020

Date of Patent: July 4, 2023

Assignee: Oracle International Corporation

Inventors: Michael Malak, Luis E. Rivas, Mark Lee Kreider
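
The trigram step in the abstract above (patent 11694029) can be illustrated with a minimal sketch. It only covers splitting words into character trigrams and comparing an input word against a known name with a longest common subsequence, as the title suggests. The function names and the `shared_trigram_ratio` score are hypothetical; the abstract does not say how the classification values are actually computed, and the word embedding and convolutional network stages are omitted.

```python
def trigrams(word: str) -> list[str]:
    """Return the overlapping three-character substrings of a lowercased word."""
    text = word.lower()
    return [text[i:i + 3] for i in range(len(text) - 2)]


def lcs_length(a, b) -> int:
    """Length of the longest common subsequence of two sequences."""
    previous = [0] * (len(b) + 1)
    for item_a in a:
        current = [0]
        for j, item_b in enumerate(b, 1):
            if item_a == item_b:
                current.append(previous[j - 1] + 1)
            else:
                current.append(max(previous[j], current[j - 1]))
        previous = current
    return previous[-1]


def shared_trigram_ratio(word: str, name: str) -> float:
    """Hypothetical score: share of the word's trigrams that also appear,
    in the same order, in the known name."""
    word_grams, name_grams = trigrams(word), trigrams(name)
    if not word_grams:
        return 0.0
    return lcs_length(word_grams, name_grams) / len(word_grams)


print(trigrams("Zyrtec"))
print(shared_trigram_ratio("zyrtec", "zyrtex"))
```
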
Synchronization and tagging of image and text data

Patent number: 11562120

Abstract: A computing system accesses an image-based document and a text document having text extracted from the image-based document and provides a user interface displaying at least a portion of the image-based document. In response to selection of a text portion of the image-based document, the system determines an occurrence of the text portion within at least a portion of the image-based document and then applies a search model on the text document to identify the same occurrence of the text portion. Once matched, alignment data indicating a relationship between a selected tag and both the text portion of the image-based document and the text portion of the text document is stored.

Type: Grant

Filed: July 16, 2021

Date of Patent: January 24, 2023

Assignee: Palantir Technologies Inc.

Inventors: Suchan Lee, Jon Paek
Efficiently finding potential duplicate values in data

Patent number: 11334603

Abstract: A method, system and computer program product for finding groups of potential duplicates in attribute values. Each attribute value of the attribute values is converted to a respective set of bigrams. All bigrams present in the attribute values may be determined. Bigrams present in the attribute values may be represented as bits. This may result in a bitmap representing the presence of the bigrams in the attribute values. The attribute values may be grouped using bitwise operations on the bitmap, where each group includes attribute values that are determined based on pairwise bigram-based similarity scores. The pairwise bigram-based similarity score reflects the number of common bigrams between two attribute values.

Type: Grant

Filed: February 14, 2020

Date of Patent: May 17, 2022

Assignee: International Business Machines Corporation

Inventors: Namit Kabra, Yannick Saillet
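
As a rough illustration of the bigram bitmap idea in the abstract above (patent 11334603), the sketch below encodes each attribute value as an integer whose set bits mark the bigrams it contains, then counts common bigrams with a bitwise AND. The function names, the lowercasing step, and the `min_common` threshold are assumptions made for this example; the abstract does not specify how the groups are formed from the pairwise scores.

```python
from itertools import combinations


def bigrams(value: str) -> set[str]:
    """Return the set of adjacent character pairs in a lowercased value."""
    text = value.lower()
    return {text[i:i + 2] for i in range(len(text) - 1)}


def build_bitmaps(values: list[str]) -> list[int]:
    """Encode each value as an int whose set bits mark the bigrams it contains."""
    sets = [bigrams(v) for v in values]
    # Assign one bit position to every distinct bigram seen in the data.
    position = {bg: i for i, bg in enumerate(sorted(set().union(*sets)))}
    return [sum(1 << position[bg] for bg in s) for s in sets]


def similar_pairs(values: list[str], min_common: int = 3):
    """Return (value, value, common bigram count) for pairs at or above the threshold."""
    bitmaps = build_bitmaps(values)
    pairs = []
    for i, j in combinations(range(len(values)), 2):
        # AND keeps only the bigrams present in both values.
        common = bin(bitmaps[i] & bitmaps[j]).count("1")
        if common >= min_common:
            pairs.append((values[i], values[j], common))
    return pairs


names = ["Jonathan Smith", "Jonathon Smith", "Maria Lopez"]
print(similar_pairs(names))
```

The pairwise loop is quadratic in the number of values; it is kept here only to show the bitwise comparison, not as a claim about how the patented method scales.
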
Character string distance calculation method and device

Patent number: 11256756

Abstract: Techniques for determining character string differences between a target character string and one or more candidate character strings are provided. In some implementations, a target bitmap is produced for the target character string and a target bitmap weight is calculated. A candidate bitmap and a candidate bitmap weight associated with a candidate character string are obtained. In response to determining that the candidate bitmap weight differs from the target bitmap weight by less than a first threshold value, an exclusive OR operation is performed between the target bitmap and the candidate bitmap. In response to determining that the number of ones in the result of the exclusive OR is less than a second threshold value, the candidate character string is included in a character set that includes one or more character strings that are close to the target character string.

Type: Grant

Filed: August 21, 2018

Date of Patent: February 22, 2022

Assignee: Advanced New Technologies Co., Ltd.

Inventor: Xiaofeng Fan
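
The two-threshold filter in the abstract above (patent 11256756) can be sketched as follows. The bitmap layout is an assumption: one bit per letter a to z that occurs in the string. The abstract does not say how the bitmap is built, and the function names and default thresholds are hypothetical.

```python
def char_bitmap(text: str) -> int:
    """Assumed encoding: bit k is set when the k-th letter of a-z occurs in text."""
    bitmap = 0
    for ch in text.lower():
        if "a" <= ch <= "z":
            bitmap |= 1 << (ord(ch) - ord("a"))
    return bitmap


def weight(bitmap: int) -> int:
    """Bitmap weight, taken here to be the number of set bits."""
    return bin(bitmap).count("1")


def close_strings(target, candidates, weight_threshold=2, xor_threshold=3):
    """Keep candidates that pass the weight check and then the XOR check."""
    target_map = char_bitmap(target)
    target_weight = weight(target_map)
    close = []
    for candidate in candidates:
        candidate_map = char_bitmap(candidate)
        # First threshold: skip the XOR when the weights differ too much.
        if abs(weight(candidate_map) - target_weight) >= weight_threshold:
            continue
        # Second threshold: count the bits that differ between the bitmaps.
        if weight(target_map ^ candidate_map) < xor_threshold:
            close.append(candidate)
    return close


print(close_strings("alibaba", ["alibabo", "amazon", "ab"]))
```

In practice the candidate bitmaps and weights would be computed once and stored, so that only the cheap integer comparisons run per query.
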
Automatic translation of advertisements

Patent number: 11138391

Abstract: In an embodiment of a messaging system, a method for presenting a commercial message to a user is provided. A target language in which the user is comfortable communicating may be determined based on at least one communication received by the user or at least one communication provided by the user. The commercial message may be presented to the user in the target language.

Type: Grant

Filed: June 4, 2019

Date of Patent: October 5, 2021

Assignee: AT&T INTELLECTUAL PROPERTY II, L.P.

Inventor: Srinivas Bangalore
Tagging virtualized content

Patent number: 10387729

Abstract: Techniques for tagging virtualized content are disclosed. In some embodiments, a modeled three-dimensional scene of objects representing abstracted source content is generated and analyzed to determine a contextual characteristic of the scene that is based on a plurality of objects comprising the scene. The modeled scene is tagged with a tag specifying the determined contextual characteristic.

Type: Grant

Filed: July 9, 2014

Date of Patent: August 20, 2019

Assignee: Outward, Inc.

Inventor: Clarence Chui
Systems and methods for automatic unit selection and target decomposition for sequence labelling

Patent number: 10373610

Abstract: Described herein are systems and methods for automatic unit selection and target decomposition for sequence labelling. Embodiments include a new loss function called Gram-Connectionist Temporal Classification (CTC) loss that extends the popular CTC loss function criterion to alleviate prior limitations. While preserving the advantages of CTC, Gram-CTC automatically learns the best set of basic units (grams), as well as the most suitable decomposition of target sequences. Unlike CTC, embodiments of Gram-CTC allow a model to output a variable number of characters at each time step, which enables the model to capture longer term dependency and improves the computational efficiency. It is also demonstrated that embodiments of Gram-CTC improve CTC in terms of both performance and efficiency on the large vocabulary speech recognition task at multiple scales of data, and that systems that employ an embodiment of Gram-CTC can outperform the state-of-the-art on a standard speech benchmark.

Type: Grant

Filed: September 7, 2017

Date of Patent: August 6, 2019

Assignee: Baidu USA LLC

Inventors: Hairong Liu, Zhenyao Zhu, Sanjeev Satheesh
End-to-end speech recognition

Patent number: 10332509

Abstract: Embodiments of end-to-end deep learning systems and methods are disclosed to recognize speech of vastly different languages, such as English or Mandarin Chinese. In embodiments, the entire pipelines of hand-engineered components are replaced with neural networks, and the end-to-end learning allows handling a diverse variety of speech including noisy environments, accents, and different languages. Using a trained embodiment and an embodiment of a batch dispatch technique with GPUs in a data center, an end-to-end deep learning system can be inexpensively deployed in an online setting, delivering low latency when serving users at scale.

Type: Grant

Filed: November 21, 2016

Date of Patent: June 25, 2019

Assignee: Baidu USA, LLC

Inventors: Bryan Catanzaro, Jingdong Chen, Mike Chrzanowski, Erich Elsen, Jesse Engel, Christopher Fougner, Xu Han, Awni Hannun, Ryan Prenger, Sanjeev Satheesh, Shubhabrata Sengupta, Dani Yogatama, Chong Wang, Jun Zhan, Zhenyao Zhu, Dario Amodei
Deployed end-to-end speech recognition

Patent number: 10319374

Abstract: Embodiments of end-to-end deep learning systems and methods are disclosed to recognize speech of vastly different languages, such as English or Mandarin Chinese. In embodiments, the entire pipelines of hand-engineered components are replaced with neural networks, and the end-to-end learning allows handling a diverse variety of speech including noisy environments, accents, and different languages. Using a trained embodiment and an embodiment of a batch dispatch technique with GPUs in a data center, an end-to-end deep learning system can be inexpensively deployed in an online setting, delivering low latency when serving users at scale.

Type: Grant

Filed: November 21, 2016

Date of Patent: June 11, 2019

Assignee: Baidu USA, LLC

Inventors: Bryan Catanzaro, Jingdong Chen, Mike Chrzanowski, Erich Elsen, Jesse Engel, Christopher Fougner, Xu Han, Awni Hannun, Ryan Prenger, Sanjeev Satheesh, Shubhabrata Sengupta, Dani Yogatama, Chong Wang, Jun Zhan, Zhenyao Zhu, Dario Amodei
Providing multi-lingual searching of mono-lingual content

Patent number: 10140371

Abstract: Approaches for translating a transliterated search query are provided. An approach includes receiving a search query containing a transliterated word. The approach also includes determining a source language corresponding to the transliterated word. The approach further includes converting the transliterated word to a word in the source language. The approach additionally includes translating the word in the source language to a word in a target language. The approach also includes performing a search using the word in the target language.

Type: Grant

Filed: July 18, 2017

Date of Patent: November 27, 2018

Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: Sasha P. Caskey, Rick A. Hamilton, II, Dimitri Kanevsky, Tara N. Sainath
Language independent processing of logs in a log analytics system

Patent number: 9881005

Abstract: Log files include log file content, some of which (especially a non-runtime portion) is in human-readable language. Translation of log file content is accomplished by: (i) generating first log content in a first human-readable language using a first resource bundle related to data translation; and (ii) translating the first log content to second log content, which corresponds to the first log content but is in a second human-readable language, using the first resource bundle. The translated log content may have annotations and/or processing rules applied to it. The translation of the present invention can help to keep the translation accurate and uniform so that the translated log content may be more effectively used in various ways.

Type: Grant

Filed: September 4, 2014

Date of Patent: January 30, 2018

Assignee: International Business Machines Corporation

Inventors: Arun Ramakrishnan, Rohit Shetty
Language independent processing of logs in a log analytics system

Patent number: 9852129

Abstract: Log files include log file content, some of which (especially a non-runtime portion) is in human-readable language. Translation of log file content is accomplished by: (i) generating first log content in a first human-readable language using a first resource bundle related to data translation; and (ii) translating the first log content to second log content, which corresponds to the first log content but is in a second human-readable language, using the first resource bundle. The translated log content may have annotations and/or processing rules applied to it. The translation of the present invention can help to keep the translation accurate and uniform so that the translated log content may be more effectively used in various ways.

Type: Grant

Filed: November 26, 2013

Date of Patent: December 26, 2017

Assignee: International Business Machines Corporation

Inventors: Arun Ramakrishnan, Rohit Shetty
Word prediction for numbers and symbols

Patent number: 9298276

Abstract: In one example, a computing device includes at least one processor configured to output, for display, a graphical keyboard. The at least one processor may be configured to receive an input indication. The at least one processor may be configured to select, based in part on the input indication, keys of the graphical keyboard. The at least one processor may be configured to classify a group of characters associated with the keys into a category, wherein at least one of the group of characters comprises a number or a non-alphabetic symbol. The at least one processor may be configured to determine, based in part on a language model, at least one candidate character string that corresponds to the category. The at least one processor may be configured to output the at least one candidate character string.

Type: Grant

Filed: December 31, 2013

Date of Patent: March 29, 2016

Assignee: Google Inc.

Inventor: Xiaojun Bi
Browser based language recognition supporting central web search translation

Patent number: 9223869

Abstract: A web browser agent or plug-in installed into a web browser of a client device provides translation services along with a search engine server. The system accesses a web page in one (local) language and then translates to another (foreign) language and displays the translated content in a web page for the user's viewing. The web browser agent is an add-on software tool or plug-in, provided by the search engine server and installed into the web browser. As a result of installation, a toolbar appears on the top of the web browser's page. This toolbar provides the interface to enable local translation of web pages from a local/web language to a target/foreign language useful to the user. Centralized (cloud computing) translation services by servers of a third party may also be employed. Web pages in any number of languages may be accessed using these operations and this structure.

Type: Grant

Filed: October 31, 2012

Date of Patent: December 29, 2015

Assignee: RPX CORPORATION

Inventor: James D. Bennett
Determining a language encoding data setting for a web page, and applications thereof

Patent number: 9223758

Abstract: Systems, methods and computer storage mediums automatically apply a language encoding data setting to a web page. Embodiments of the present disclosure relate to equipping a web browser with the ability to automatically open web pages with an appropriate language encoding data setting applied to the web page so that the web page is displayed without garbled characters. The web browser is able to determine the appropriate language encoding setting by requesting the appropriate language encoding setting for the web page that is stored in a language encoding database that is updated each time the web page is successfully opened without displaying garbled characters.

Type: Grant

Filed: June 15, 2012

Date of Patent: December 29, 2015

Assignee: Google Inc.

Inventor: Takuya Oikawa
System and method for inputting text into small screen devices

Patent number: 9189472

Abstract: An embodiment is directed to an interface for a small screen device, such as a watch, that enables a user to enter text on the small screen device by touching in the vicinity of characters, rather than aiming for a particular button or the exact location of a character. Embodiments further enable the design of interfaces without the use of buttons for controlling the entry of text on the small screen device.

Type: Grant

Filed: March 8, 2012

Date of Patent: November 17, 2015

Assignee: Touchtype Limited

Inventors: Benjamin William Medlock, Jonathan Paul Reynolds
Image based document identification based on obtained and stored document characteristics

Patent number: 9135517

Abstract: A method and apparatus for identifying a document in a set of stored documents based on a pattern of characteristics in the document is presented. A digital image including at least a portion of a document is acquired. A pattern of characteristics is then identified in the digital image. The pattern is matched to the set of stored documents to identify the document in the digital image from the set of stored documents.

Type: Grant

Filed: November 29, 2012

Date of Patent: September 15, 2015

Assignee: Amazon Technologies, Inc.

Inventor: Jeffrey Penrod Adams
Age estimation apparatus, age estimation method, and age estimation program

Patent number: 9036923

Abstract: Provided are an age estimation apparatus, an age estimation method, and an age estimation program capable of obtaining a recognition result closely matching the result perceived by humans. An age estimation apparatus 10 for estimating an age of a person on image data includes a dimension compressor 11 for applying dimension compression to the image data to output low dimensional data; and an identification device 12 for estimating an age of a person on the basis of a learning result using a feature amount contained in the low dimensional data, wherein a parameter used for the dimension compression by the dimension compressor 11 and the feature amount used for age estimation by the identification device 12 are set on the basis of a result of an evaluation of a generalization capability using a weighting function that shows a degree of seriousness of an age estimation error for every age, and learning of the identification device 12 is performed on the basis of the weighting function.

Type: Grant

Filed: April 14, 2010

Date of Patent: May 19, 2015

Assignees: NEC Solution Innovators, Ltd., TOKYO INSTITUTE OF TECHNOLOGY

Inventors: Kazuya Ueki, Masashi Sugiyama, Yasuyuki Ihara
Automatic context sensitive language correction and enhancement using an internet corpus

Patent number: 8914278

Abstract: A computer-assisted language correction system including spelling correction functionality, misused word correction functionality, grammar correction functionality and vocabulary enhancement functionality utilizing contextual feature-sequence functionality employing an internet corpus.

Type: Grant

Filed: July 31, 2008

Date of Patent: December 16, 2014

Assignee: Ginger Software, Inc.

Inventors: Yael Karov Zangvil, Avner Zangvil
Interactive input method

Patent number: 8838453

Abstract: A user input is received by a computing device. An interactive input module determines whether the user input is a first character of a script for a supported language. If the user input is a first character, the first character is stored in an input buffer. A plurality of words in the supported language that match the contents of the input buffer are identified, and a subset of the plurality of words is displayed to the user based on a frequency value associated with each of the plurality of words.

Type: Grant

Filed: August 31, 2010

Date of Patent: September 16, 2014

Assignee: Red Hat, Inc.

Inventor: Pravin Satpute
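
A minimal sketch of the lookup described in the abstract above (patent 8838453): words that match the input buffer are collected and the most frequent ones are shown. The `suggest` function, the dictionary layout of the lexicon, the frequency values, and the prefix-match rule are all assumptions for illustration; the abstract only says the words "match" the buffer contents.

```python
def suggest(buffer: str, lexicon: dict[str, int], limit: int = 3) -> list[str]:
    """Return up to `limit` words starting with the buffer, most frequent first."""
    matches = [word for word in lexicon if word.startswith(buffer)]
    # Sort by descending frequency, breaking ties alphabetically.
    matches.sort(key=lambda word: (-lexicon[word], word))
    return matches[:limit]


# Hypothetical word frequencies for a supported language.
lexicon = {"the": 500, "then": 120, "there": 200, "this": 300, "cat": 50}

print(suggest("th", lexicon))
print(suggest("the", lexicon, limit=2))
```
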
Age estimation apparatus, age estimation method, and age estimation program

Patent number: 8818111

Abstract: Provided are an age estimation apparatus, an age estimation method, and an age estimation program capable of reducing the labor of labeling the image data used for age estimation. An age estimation apparatus for estimating an age of a person on image data includes a dimension compression unit for applying dimension compression to the image data to output low dimensional data; a clustering unit for performing clustering of the low dimensional data outputted; a labeling unit for labeling representative data of each cluster among the low dimensional data clustered; and an identification unit for estimating an age of a person on the basis of a learning result using a feature amount contained in labeled low dimensional data and unlabeled low dimensional data.

Type: Grant

Filed: April 14, 2010

Date of Patent: August 26, 2014

Assignees: NEC Soft, Ltd., Tokyo Institute of Technology

Inventors: Kazuya Ueki, Masashi Sugiyama, Yasuyuki Ihara
Image compression and decompression

Patent number: 8744198

Abstract: A computer-implemented method includes dividing an image into one or more image channels for image compression. The method also includes dividing one or more of the image channels into one or more blocks. At least one of the blocks includes floating point representations of pixel values included in the block. The method also includes converting the floating point representations of pixel values into integer representations such that the sign of each floating point representation is preserved. The method also includes storing the difference of adjacent integer representations as a compressed version of the image.

Type: Grant

Filed: November 20, 2007

Date of Patent: June 3, 2014

Assignee: Lucasfilm Entertainment Company Ltd.

Inventor: Florian Kainz
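
The conversion and differencing steps in the abstract above (patent 8744198) might look like the sketch below. The specific mapping is an assumption: each 32-bit float's bit pattern is read as a magnitude and given the float's sign, which keeps the sign but is not taken from the patent text. All function names are hypothetical. Note that under this mapping negative zero decodes as positive zero.

```python
import struct


def float_to_signed_int(x: float) -> int:
    """Map a float32 to an integer with the same sign (assumed mapping)."""
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    magnitude = bits & 0x7FFFFFFF
    return -magnitude if bits >> 31 else magnitude


def signed_int_to_float(n: int) -> float:
    """Invert float_to_signed_int."""
    bits = (0x80000000 | -n) if n < 0 else n
    return struct.unpack("<f", struct.pack("<I", bits))[0]


def delta_encode(pixels: list[float]) -> list[int]:
    """Store each integer as its difference from the previous one (first from 0)."""
    deltas, previous = [], 0
    for value in (float_to_signed_int(p) for p in pixels):
        deltas.append(value - previous)
        previous = value
    return deltas


def delta_decode(deltas: list[int]) -> list[float]:
    """Rebuild the pixel values by summing the stored differences."""
    pixels, total = [], 0
    for delta in deltas:
        total += delta
        pixels.append(signed_int_to_float(total))
    return pixels


block = [0.5, 0.75, -0.25, 1.0]
encoded = delta_encode(block)
print(encoded)
print(delta_decode(encoded))
```

Neighbouring pixel values that are close tend to produce small differences, which a later entropy coder can store compactly; that coder is outside this sketch.
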
Image-domain script and language identification

Patent number: 8233726

Abstract: Disclosed herein is a method, computer system and computer program product for identifying a writing system associated with a document image containing one or more words written in the writing system. Initially, a document image fragment is identified based on the document image, wherein the document image fragment contains one or more pixels from one or more of the words in the document image. A set of sequential features associated with the document image fragment is generated, wherein each sequential feature describes one dimensional graphic information derived from the one or more pixels in the document image fragment. A classification score for the document image fragment is generated responsive at least in part to the set of sequential features, the classification score indicating a likelihood that the document image fragment is written in the writing system.

Type: Grant

Filed: November 27, 2007

Date of Patent: July 31, 2012

Assignee: Google Inc.

Inventors: Ashok Popat, Eugene Brevdo
Search and retrieval of documents indexed by optical character recognition

Patent number: 8208765

Abstract: An image of a character string composed of M characters is clipped from a document image, and the image is divided into separate character images. Image features of each character image are extracted. Based on the image features, N (N>1, integer) character images are selected as candidate characters, in descending order of degree of similarity, from a character image feature dictionary which stores the image features of character images on a per-character basis, and a first index matrix of M×N cells is prepared. A candidate character string composed of the plurality of candidate characters constituting a first column of the first index matrix is subjected to a lexical analysis according to a language model, whereby a second index matrix having a character string which makes sense is prepared. In the language model, statistics are taken and then the lexical analysis is performed.

Type: Grant

Filed: January 10, 2008

Date of Patent: June 26, 2012

Assignee: Sharp Kabushiki Kaisha

Inventors: Bo Wu, Jianjun Dou, Ning Le, Yadong Wu, Jing Jia
Statistical approach to large-scale image annotation

Patent number: 8150170

Abstract: Statistical approaches to large-scale image annotation are described. Generally, the annotation technique includes compiling visual features and textual information from a number of images, hashing the images' visual features, and clustering the images based on their hash values. An example system builds statistical language models from the clustered images and annotates the image by applying one of the statistical language models.

Type: Grant

Filed: May 30, 2008

Date of Patent: April 3, 2012

Assignee: Microsoft Corporation

Inventors: Mingjing Li, Xiaoguang Rui
Handheld electronic device with disambiguation of compound word text input

Patent number: 8102284

Abstract: A handheld electronic device includes a reduced QWERTY keyboard and is enabled with disambiguation software that is operable to disambiguate compound word text input. The device provides output in the form of a default output and a number of variants. The output is based largely upon the frequency, i.e., the likelihood that a user intended a particular output, but various features of the device provide additional variants that are not based solely on frequency and rather are provided by various logic structures resident on the device.

Type: Grant

Filed: July 22, 2009

Date of Patent: January 24, 2012

Assignee: Research In Motion Limited

Inventors: Vadim Fux, Michael Elizarov
Visual sensing for large-scale tracking

Patent number: 7697720

Abstract: One embodiment of a method of tracking a plurality of targets can be broadly summarized by the following steps: capturing a plurality of images of a plurality of targets with a plurality of image capture devices; generating a target observation for each target, said target observation including at least a visual signature of the target and a time value; partitioning target observations according to similarities in their visual signatures; and producing primary tracks from the partitioned target observations, wherein each primary track includes ordered sequences of observation events having similarities in their visual signatures. Other methods and systems are also provided.

Type: Grant

Filed: September 15, 2005

Date of Patent: April 13, 2010

Assignee: Hewlett-Packard Development Company, L.P.

Inventor: Colin Andrew Low
Text language identification

Patent number: 7689409

Abstract: After prestoring first character strings that occur frequently in words of languages and second character strings that are atypical therein, a device for automatically identifying the language of a text from a plurality of languages extracts words from the text and constructs all of the character strings contained in each extracted word. Each string in an extracted word is compared to the first and second strings of a particular language. If the word contains a first string, a score of the language is increased by a coefficient depending in particular on the position of the first string in the word. If the word contains a second string, the score is decreased by a coefficient associated with the second string. The highest of the scores corresponding to the predetermined languages identifies the language of the text.

Type: Grant

Filed: December 11, 2003

Date of Patent: March 30, 2010

Assignee: France Telecom

Inventor: Johannes Heinecke
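
The scoring scheme in the abstract above (patent 7689409) can be sketched as follows. The language profiles, coefficients, and the rule that strings at the start or end of a word count double are all invented for illustration; the abstract only says the coefficient depends "in particular" on position.

```python
import re

# Hypothetical profiles: frequent strings raise a score, atypical ones lower it.
PROFILES = {
    "english": {
        "frequent": {"th": 1.0, "ing": 1.5},
        "atypical": {"sch": 1.0, "eux": 2.0},
    },
    "french": {
        "frequent": {"eau": 1.5, "eux": 1.5, "qu": 1.0},
        "atypical": {"th": 1.0, "ing": 1.0},
    },
}


def position_weight(word: str, start: int, length: int) -> float:
    """Assumed rule: strings at the start or end of a word count double."""
    at_edge = start == 0 or start + length == len(word)
    return 2.0 if at_edge else 1.0


def identify_language(text: str, profiles=PROFILES):
    """Return the best-scoring language and the score of every language."""
    scores = {language: 0.0 for language in profiles}
    for word in re.findall(r"[^\W\d_]+", text.lower()):
        # Construct every character string contained in the word.
        for start in range(len(word)):
            for end in range(start + 1, len(word) + 1):
                piece = word[start:end]
                for language, profile in profiles.items():
                    if piece in profile["frequent"]:
                        bonus = profile["frequent"][piece]
                        scores[language] += bonus * position_weight(
                            word, start, end - start
                        )
                    if piece in profile["atypical"]:
                        scores[language] -= profile["atypical"][piece]
    return max(scores, key=scores.get), scores


print(identify_language("the thing is something"))
print(identify_language("les beaux bateaux que"))
```
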
Handheld electronic device with disambiguation of compound word text input

Patent number: 7583205

Abstract: A handheld electronic device includes a reduced QWERTY keyboard and is enabled with disambiguation software that is operable to disambiguate compound word text input. The device provides output in the form of a default output and a number of variants. The output is based largely upon the frequency, i.e., the likelihood that a user intended a particular output, but various features of the device provide additional variants that are not based solely on frequency and rather are provided by various logic structures resident on the device.

Type: Grant

Filed: July 28, 2005

Date of Patent: September 1, 2009

Assignee: Research In Motion Limited

Inventors: Vadim Fux, Michael Elizarov
Handheld electronic device with disambiguation of compound word text input employing separating input

Patent number: 7573404

Abstract: A handheld electronic device includes a reduced QWERTY keyboard and is enabled with disambiguation software that is operable to disambiguate compound word text input. The device provides output in the form of a default output and a number of variants. The output is based largely upon the frequency, i.e., the likelihood that a user intended a particular output, but various features of the device provide additional variants that are not based solely on frequency and rather are provided by various logic structures resident on the device.

Type: Grant

Filed: July 28, 2005

Date of Patent: August 11, 2009

Assignee: Research In Motion Limited

Inventors: Vadim Fux, Michael Elizarov
Method for verifying an intended address by OCR percentage address matching

Patent number: 7539326

Abstract: An OCR percentage matching algorithm achieves a significant reduction in false mismatches by accounting for combinations of unprocessed spaces, missing characters, extra characters and character substitution errors during OCR scanning, and allows a match to be declared when a specified percentage of the OCR character scan, rather than the entire OCR character scan, is the same as the expected character string.

Type: Grant

Filed: December 23, 2005

Date of Patent: May 26, 2009

Assignee: Pitney Bowes Inc.

Inventors: Joseph Eremita, Adrian Ruck
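
One way to picture the percentage matching in the abstract above (patent 7539326) is the sketch below. It swaps in a standard Levenshtein edit distance to count missing, extra, and substituted characters after removing spaces; this is an assumption for illustration, not the algorithm claimed in the patent. The function names and the 80 percent default are hypothetical.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: insertions, deletions, and substitutions."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, 1):
        current = [i]
        for j, char_b in enumerate(b, 1):
            current.append(min(
                previous[j] + 1,                       # character missing from b
                current[j - 1] + 1,                    # extra character in b
                previous[j - 1] + (char_a != char_b),  # substitution or match
            ))
        previous = current
    return previous[-1]


def match_percentage(expected: str, scanned: str) -> float:
    """Percentage of the expected string recovered by the scan, ignoring spaces."""
    want = "".join(expected.upper().split())
    got = "".join(scanned.upper().split())
    if not want:
        return 100.0 if not got else 0.0
    errors = edit_distance(want, got)
    return max(0.0, 100.0 * (1 - errors / len(want)))


def is_match(expected: str, scanned: str, required_percent: float = 80.0) -> bool:
    """Declare a match when the percentage reaches the required level."""
    return match_percentage(expected, scanned) >= required_percent


print(is_match("123 MAIN ST", "123MA1N ST"))
print(is_match("123 MAIN ST", "456 OAK AVE"))
```
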
MULTIMODAL CLASSIFICATION OF ADULT CONTENT

Publication number: 20090034851

Abstract: Systems and methods for classifying content as adult content and, if desired, blocking content so classified from presentation to a user are provided. Received content is analyzed using a sequential series of classification techniques, each successive technique being implemented only if the previous technique did not result in classification of the content as adult content. In this way, adult content may be identified across a variety of different media types (e.g., text, images, video, etc.) and yet processing power may be reserved if one or more techniques requiring less power is sufficient to determine that the received content is, in fact, adult content. Content classification may be performed in-band (that is, in substantially real-time such that content may be identified and/or blocked at the time results of a user query are returned) or out-of-band (that is, prospectively as new content is received but not in association with a user query).

Type: Application

Filed: August 3, 2007

Publication date: February 5, 2009

Applicant: MICROSOFT CORPORATION

Inventors: Xiadong Fan, Richard Qian
Apparatus for and method of embedding watermark into original information, transmitting watermarked information, and reconstructing the watermark

Patent number: 7451317

Abstract: An apparatus for and method of embedding a watermark into original information, transmitting the watermarked information, and reconstructing the watermark from the transmitted watermarked information include embedding a portion of a plurality of components constituting the watermark into the original information, and using the remaining portion of the components as keys for reconstructing the watermark. According to the method, since a size of data to be embedded is greatly reduced, degradation of the watermarked information is prevented, and the information becomes more robust against hacking attacks or errors occurring in a variety of ways when transmitting the information.

Type: Grant

Filed: December 2, 2002

Date of Patent: November 11, 2008

Assignee: Samsung Electronics Co., Ltd.

Inventors: Sang-heun Oh, Byung-jun Kim, Sung-wook Park
Entering a character into an electronic device

Patent number: 7443316

Abstract: A method (300) for entering a character into an electronic device (100) is provided. The method (300) includes displaying (301) input character keys (204) on a touch sensitive region (202) of a display screen (105) of the device (100), the keys identifying an associated character. Next, a display step (309) shows at least one entered character in a display region (201) of the screen, the entered character having been selected by actuation of one of the character keys (204). Next, a group of potential subsequent characters that follow the entered character is predicted (311, 317). A second set of input character keys (205) identifying the potential subsequent characters is displayed (327). The second set of keys (205) are grouped together (323) such that their relative screen locations with respect to each other are different to that of corresponding keys in the first set of keys (204).

Type: Grant

Filed: September 1, 2005

Date of Patent: October 28, 2008

Assignee: Motorola, Inc.

Inventor: Swee Ho Lim
Hand-held communication device having navigation key-based predictive text entry

Patent number: 7218249

Abstract: A hand-held communication device provides navigation key-based predictive text entry. The hand-held communication device includes a housing generally sized to be held in a human hand having a display disposed for displaying characters selectable for entry in a character position of a text string being entered and a navigation key assembly for scrolling through and selecting from the characters displayed by the display. The characters displayed by the display during text entry are arranged according to the probability of selection of each character for entry in the character position so that the character with the highest probability of selection is selected with a single input from the navigation key assembly.

Type: Grant

Filed: June 8, 2004

Date of Patent: May 15, 2007

Assignee: Siemens Communications, Inc.

Inventor: Lovleen Chadha
Document processing method, system and medium

Patent number: 7046847

Abstract: A technique for extracting a meaningful text block from a document where a table, an itemized list, a multiple column, etc., are arbitrarily laid out. A document is input which is laid out using blanks or the like, then a symbol is acquired which is associated with a spatial coordinate of the document. Consecutive characters of the same type are extracted from the symbol to generate a token and a space. A stream is generated from consecutive spaces in the column direction, while a text block is generated from streams and tokens. A link is generated between the text blocks to form a document graph. Validity of a connection (link) between the text blocks in the document graph is evaluated using a language model, then the text blocks are merged if the connection is valid.

Type: Grant

Filed: June 25, 2001

Date of Patent: May 16, 2006

Assignee: International Business Machines Corporation

Inventors: Matthew F. Hurst, Tetsuya Nasukawa
Method of identifying script of line of text

Patent number: 7020338

Abstract: A method of identifying the script of a line of text by first assigning a weight to each n-gram in a group of documents of known scripts, where each n-gram is a sequence of numbers representing k-means cluster centroids of a known script to which character segments in the documents of known scripts most closely match. A line of text is identified, where the line of text is made up of pixels. The identified line of text is cropped so that only a percentage of the pixels remain. The cropped line is vertically and horizontally rescaled into gray-scale pixels. The vertical gray-scale pixels are replaced with the sequence number of a k-means cluster centroid of a known script to which they most closely match. The n-grams of the number sequence that represents the line of text are scored against the n-gram weights of the documents of known text. The highest score of the line of text is identified and compared to the scores of the documents of known scripts.

Type: Grant

Filed: April 8, 2002

Date of Patent: March 28, 2006

Assignee: The United States of America as represented by the National Security Agency

Inventor: Carson S. Cumbee
Address reading method

Patent number: 6934405

Abstract: An address reading method with processing steps controlled by parameters, in which free parameters which cannot be adapted by learning samples are to be automatically optimized. These parameters are therefore assigned costs. The value of free parameters which are expensive and lie above selectable cost thresholds are maintained and the remaining free parameters are improved by repeatedly modifying their values on the basis of strategies known per se, taking already evaluated parameter settings into account, and training and evaluating the reading method only with these modified values.

Type: Grant

Filed: April 4, 2000

Date of Patent: August 23, 2005

Assignee: Siemens Aktiengesellschaft

Inventor: Michael Schuessler
Extracting information from symbolically compressed document images

Publication number: 20040042667

Abstract: A method and apparatus for extracting information from symbolically compressed document images. A deciphering module generates first and second text strings by deciphering respective sequences of template identifiers in first and second symbolically compressed document images. A conditional n-gram module receives the first and second text strings from the deciphering module and extracts n-gram terms therefrom based on a predicate condition. A comparison module generates a measure of similarity between the first and second symbolically compressed document images based on the n-gram terms extracted by the conditional n-gram module.
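
A minimal sketch of the conditional n-gram comparison described in this abstract, working on already deciphered text strings. The predicate (keep only n-grams that start a word) and the use of cosine similarity are assumptions chosen for illustration; the abstract specifies neither.

```python
from collections import Counter
from math import sqrt


def conditional_ngrams(text, n, predicate):
    """Count the n-grams that start at positions where predicate(text, i) holds."""
    grams = Counter()
    for i in range(len(text) - n + 1):
        if predicate(text, i):
            grams[text[i:i + n]] += 1
    return grams


def follows_space(text, i):
    # Hypothetical predicate: keep n-grams that begin a word.
    return i == 0 or text[i - 1] == " "


def cosine_similarity(a, b):
    """Cosine of the angle between two n-gram count vectors (0.0 if either is empty)."""
    dot = sum(a[g] * b[g] for g in a.keys() & b.keys())
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0
```

Two documents are then compared by extracting their n-gram counts with the same predicate and passing both counters to `cosine_similarity`.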

Type: Application

Filed: September 30, 2003

Publication date: March 4, 2004

Inventors: Dar-Shyang Lee, Jonathan J. Hull
Method for designing optimal single pointer predictive keyboards and apparatus therefore

Patent number: 6646572

Abstract: Keys are arranged on a keyboard as learned during a training stage. During training, a training corpus of input symbol sequences is provided. Each unique symbol in the corpus has an associated key on the keyboard. A cost function that measures a cost of inputting the symbols of the training corpus is globally minimized. Then, the keys are arranged on the keyboard according to the globally minimized cost function. To reduce the distance a pointer must move, the keys can also be arranged in a hexagonal pattern.
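
The idea of arranging keys by minimizing an input cost over a training corpus can be sketched on a toy scale. The sketch assumes a single row of keys, a cost equal to pointer travel between consecutive symbols, and exhaustive search over layouts; the abstract does not state the cost function or the optimization method, and exhaustive search is only practical for a handful of symbols.

```python
from collections import Counter
from itertools import permutations


def travel_cost(layout, bigrams):
    """Total pointer travel: bigram count times distance between the two keys."""
    pos = {sym: i for i, sym in enumerate(layout)}
    return sum(n * abs(pos[a] - pos[b]) for (a, b), n in bigrams.items())


def best_layout(corpus):
    """Try every ordering of the corpus symbols and keep the cheapest one."""
    bigrams = Counter(zip(corpus, corpus[1:]))  # counts of consecutive symbol pairs
    symbols = sorted(set(corpus))
    return min(permutations(symbols), key=lambda layout: travel_cost(layout, bigrams))
```

For a corpus in which "a" and "b" alternate often, the cheapest layout places those two keys next to each other.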

Type: Grant

Filed: February 18, 2000

Date of Patent: November 11, 2003

Assignee: Mitsubishi Electric Research Laboratories, Inc.

Inventor: Matthew Brand
Character recognizing apparatus, method, and storage medium

Patent number: 6636636

Abstract: It is an object of the invention to improve output precision of a final recognition result by further obtaining and applying a forward-chain probability in addition to a backward-chain probability in a Bi-gram statistic process, as a post-processing in the case where a plurality of candidate characters are outputted to one input pattern as a result of character recognition. An apparatus according to the invention has a backward-chain dictionary and a forward-chain dictionary of characters, obtains a chain probability from the i-th character to the (i+1)th character by using the backward-chain dictionary, further obtains a chain probability from the (i+1)th character to the i-th character by using the forward-chain dictionary, and selects the character of the final output result from a plurality of candidate characters on the basis of a value obtained by unifying those chain probabilities.
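
A minimal sketch of choosing among candidate characters with chain probabilities in both directions. It assumes the two probabilities are unified by multiplication and that unseen pairs receive a small floor value; the table names, the floor and the product rule are illustrative assumptions, not the method claimed in the patent.

```python
def pick_candidate(candidates, left, right,
                   p_next_given_prev, p_prev_given_next, floor=1e-6):
    """Return the candidate that best fits both neighbouring characters.

    p_next_given_prev[(left, c)]  : probability that c follows `left`
    p_prev_given_next[(right, c)] : probability that c precedes `right`
    """
    def score(c):
        from_left = p_next_given_prev.get((left, c), floor)
        from_right = p_prev_given_next.get((right, c), floor)
        return from_left * from_right  # assumed way of unifying the two values

    return max(candidates, key=score)
```

With toy tables in which the left neighbour slightly favours one candidate and the right neighbour strongly favours another, the combined score follows the stronger evidence, which a single-direction bigram check would miss.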

Type: Grant

Filed: January 9, 1998

Date of Patent: October 21, 2003

Assignee: Canon Kabushiki Kaisha

Inventor: Eiji Takasu
Feed forward feed back multiple neural network with context driven recognition

Patent number: 6560360

Abstract: A recognition system is disclosed, including a representation of an object in terms of its constituent parts that is translationally invariant, and which provides scale invariant recognition. The system further provides effective recognition of patterns that are partially present in the input signal, or that are partially occluded, and also provides an effective representation for sequences within the input signal. The system utilizes dynamically determined, context based expectations, for identifying individual features/parts of an object to be recognized. The system is computationally efficient, and capable of highly parallel implementation, and further includes a mechanism for improving the preprocessing of individual sections of an input pattern, either by applying one or more preprocessors selected from a set of several preprocessors, or by changing the parameters within a single preprocessor.

Type: Grant

Filed: January 27, 2000

Date of Patent: May 6, 2003

Assignees: Nestor, Inc., Brown University Research Foundation

Inventors: Predrag Neskovic, Douglas L. Reilly, Leon N Cooper
Method for identifying the language of individual words

Patent number: 6292772

Abstract: The method of recognizing the language of a single word as to spelling and grammar correction (e.g., identifying the appropriate language resources on a document, paragraph, sentence or even individual word basis), the automatic invocation of transliteration software based on the language of the words (e.g., automatic ASCII to Kanji substitution without requiring the user to explicitly switch into a Kanji mode), the automatic invocation of appropriate machine translation tools when the document's language is different from the user's native tongue(s), the use of document language identification to eliminate from database or web search results any documents which are not written in the user's native language and the automatic identification of user-appropriate languages for the user interface.

Type: Grant

Filed: December 1, 1998

Date of Patent: September 18, 2001

Assignee: JustSystem Corporation

Inventor: Mark Kantrowitz
Method and apparatus for facilitating query reformulation

Patent number: 6175829

Abstract: A method and apparatus for verifying a query to provide feedback to users for query reformulation. By utilizing selectivity statistics for semantic and visual characteristics of image objects, query verification “examines” user queries and allows users to reformulate queries through system feedback. Feedback information provided to the user includes (1) the maximum and minimum number of matches for the query; (2) alternatives for both semantic and visual-based query elements; and (3) estimated numbers of matching images. Additional types of feedback information may also be provided. With this feedback, the users know if the query criteria are too tight (i.e. too few matches will be retrieved) or too loose (i.e. too many matches will be retrieved) so that they can relax, refine, or reformulate queries or leave queries unchanged accordingly. Only after queries are verified to have a high possibility of meaningful results are the queries processed.

Type: Grant

Filed: April 22, 1998

Date of Patent: January 16, 2001

Assignee: NEC USA, Inc.

Inventors: Wen-Syan Li, K. Selcuk Candan
Test classification system and method

Patent number: 6137911

Abstract: Documents are classified into one or more clusters corresponding to predefined classification categories by building a knowledge base comprising matrices of vectors which indicate the significance of terms within a corpus of text formed by the documents and classified in the knowledge base to each cluster. The significance of terms is determined assuming a standard normal probability distribution, and terms are determined to be significant to a cluster if their probability of occurrence being due to chance is low. For each cluster, statistical signatures comprising sums of weighted products and intersections of cluster terms to corpus terms are generated and used as discriminators for classifying documents. The knowledge base is built using prefix and suffix lexical rules which are context-sensitive and applied selectively to improve the accuracy and precision of classification.

Type: Grant

Filed: June 16, 1997

Date of Patent: October 24, 2000

Assignee: The Dialog Corporation PLC

Inventor: Maxim Zhilyaev
Enhancement of soft keyboard operations using trigram prediction

Patent number: 5963671

Abstract: The characters and controls of a soft keyboard that are most likely to be used are determined by consulting trigram tables, and are enhanced and/or positioned to attract the user and to facilitate quick recognition and selection. The letters and other characters of the soft keyboard display can be arranged in a standard keyboard format, some variation of that format such as a Dvorak layout, or an entirely different arrangement such as strings of letters and numbers in alphabetical and numerical order. However, regardless of the layout, an attractant, such as color, intensity, or size, is used for emphasis so that the subset of characters that the user is most likely to select stands out from the other keys of the keyboard. In addition to enhancing all characters of the subset, particular emphasis can be placed on the most likely character in the subset to be selected.
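
A minimal sketch of how a trigram table can yield the subset of keys to emphasize. It assumes the table is built from raw character counts over a training text and that the last two typed characters form the lookup context; the function names and the default subset size are hypothetical.

```python
from collections import Counter, defaultdict


def build_trigram_table(text):
    """Map each two-character context to counts of the character that follows it."""
    table = defaultdict(Counter)
    for i in range(len(text) - 2):
        table[text[i:i + 2]][text[i + 2]] += 1
    return table


def keys_to_emphasize(table, typed, k=3):
    """Return up to k likely next characters, most likely first."""
    context = typed[-2:]
    # An unseen context yields an empty list, so no key is emphasized.
    return [ch for ch, _ in table.get(context, Counter()).most_common(k)]
```

The first element of the returned list, when present, is the candidate for the strongest emphasis, and the remaining elements form the rest of the highlighted subset.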

Type: Grant

Filed: August 15, 1997

Date of Patent: October 5, 1999

Assignee: International Business Machines Corporation

Inventors: Liam David Comerford, Thomas Allan Corbi, John Peter Karidis, William Dennis Strohm
Semantic and cognition based image retrieval

Patent number: 5930783

Abstract: A computer implemented method for searching and retrieving images contained within a database of images in which both semantic and cognitive methodologies are utilized. The method accepts a semantic and cognitive description of an image to be searched from a user, and successively refines the search utilizing semantic and cognitive methodologies and then ranking the results for presentation to the user.

Type: Grant

Filed: August 29, 1997

Date of Patent: July 27, 1999

Assignee: NEC USA, Inc.

Inventors: Wen-Syan Li, Kasim S. Candan
