Using Extracted Text (EPO) Patents (Class 707/E17.022)
-
Patent number: 12072837
Abstract: An integrated digital-analog archiving system can automatically initiate a migration process to move electronic documents to a media library. For each electronic document, the system may retrieve the electronic document from a digital data storage medium, extract metadata from the electronic document, determine size, orientation, and format of the electronic document, generate indicators for indicating the start and end of the electronic document to be stored on an analog data storage medium, generate an analog document identifier for identifying the electronic document on the analog data storage medium, generate a scaled image of the electronic document based on the size, orientation, and format of the electronic document, generate a text string based at least in part on the extracted metadata, and render the indicators, the analog document identifier, the scaled image of the electronic document, and the text string on the analog data storage medium.
Type: Grant
Filed: February 7, 2023
Date of Patent: August 27, 2024
Assignee: OPEN TEXT SA ULC
Inventor: Matthias Specht
-
Patent number: 11989922
Abstract: A system includes a computing platform having processing hardware, and a memory storing software code. The processing hardware is configured to execute the software code to receive an image having a plurality of image regions, determine a boundary of each of the image regions to identify a plurality of bounded image regions, and identify, within each of the bounded image regions, one or more image sub-regions to identify a plurality of image sub-regions. The processing hardware is further configured to execute the software code to identify, within each of the bounded image regions, one or more first features, respectively, identify, within each of the image sub-regions, one or more second features, respectively, and provide an annotated image by annotating each of the bounded image regions using the respective first features and annotating each of the image sub-regions using the respective second features.
Type: Grant
Filed: February 18, 2022
Date of Patent: May 21, 2024
Assignee: Disney Enterprises, Inc.
Inventors: Miquel Angel Farre Guiu, Monica Alfaro Vendrell, Pablo Pernias, Francesc Josep Guitart Bravo, Marc Junyent Martin, Albert Aparicio Isarn, Anthony M. Accardo, Steven S. Shapiro
-
Patent number: 11910060
Abstract: This relates to using a computer simulation to test another computer program in real time or simulated real time that is sped up. The disclosed method and system synchronize information input into the simulation so that the program under test operates in an independent way. The method and system operate a protocol to connect one running computer process, a trading computer program, with another running process, a computer program that executes a market simulation in order to optimize the quality and speed of the simulation and testing of the external computer program.
Type: Grant
Filed: May 14, 2021
Date of Patent: February 20, 2024
Assignee: Caspian Hill Group, LLC
Inventors: Amy Bolivar, Steven Lubin, Audrey Faust
-
Patent number: 11847142
Abstract: There is provided a system configured to appropriately determine a topic count in accordance with LDA to estimate latent meanings of a document. For a plurality of documents d, a perplexity PPL of each document d is evaluated in accordance with a document generation probability in which the document d is generated when topic counts N for defining a topic model based on the LDA as a document generation model are hypothetically specified as different values and word groups are specified by different random numbers. The topic model is defined by a reference topic count No determined by combining a first topic count N1 (the number of topics indicating a highest cumulative frequency at which the perplexity PPL first indicates a minimum value) and a second topic count N2 (the number of topics indicating a highest cumulative frequency at which the perplexity PPL indicates a smallest value).
Type: Grant
Filed: February 22, 2021
Date of Patent: December 19, 2023
Assignee: HONDA MOTOR CO., LTD.
Inventor: Takamasa Suzuki
-
Patent number: 11842035
Abstract: In example embodiments, techniques are provided for efficiently labeling, reviewing and correcting predictions for P&IDs in image-only formats. To label text boxes in the P&ID, the labeling application executes an OCR algorithm to predict a bounding box around, and machine-readable text within, each text box, and displays these predictions in its user interface. The labeling application provides functionality to receive a user confirmation or correction for each predicted bounding box and predicted machine-readable text. To label symbols in the P&ID, the labeling application receives user input to draw bounding boxes around symbols and assign symbols to classes of equipment. Where there are multiple occurrences of specific symbols, the labeling application provides functionality to duplicate and automatically detect and assign bounding boxes and classes.
Type: Grant
Filed: December 21, 2020
Date of Patent: December 12, 2023
Assignee: Bentley Systems, Incorporated
Inventors: Karl-Alexandre Jahjah, Marc-André Gardner
-
Patent number: 11816983
Abstract: The present invention is directed to a helmet wearing determination system including an imaging means that is installed in a predetermined position and images a two-wheel vehicle that travels on a road; and a helmet wearing determination means that processes an image imaged by the imaging means, estimates a rider head region corresponding to a head of a person who rides on the two-wheel vehicle that travels on the road, compares image characteristics of the rider head region with image characteristics according to the head at a time when a helmet is worn and/or at a time when a helmet is not worn, and determines whether or not the rider wears the helmet.
Type: Grant
Filed: April 9, 2021
Date of Patent: November 14, 2023
Assignee: NEC CORPORATION
Inventor: Katsuhiko Takahashi
-
Patent number: 11804053
Abstract: An image recognition method and a terminal, where the method includes obtaining, by the terminal, an image file comprising a target object, recognizing, by the terminal, the target object based on an image recognition model in the terminal to obtain object category information of the target object, and storing, by the terminal, the object category information as first label information of the target object. Hence, image recognition efficiency of the terminal can be improved, and privacy of a terminal user can be effectively protected.
Type: Grant
Filed: April 26, 2021
Date of Patent: October 31, 2023
Assignee: HUAWEI TECHNOLOGIES CO., LTD.
Inventors: Changzhu Li, Xiyong Wang
-
Patent number: 11800036
Abstract: Examples disclosed herein relate to identifying a plurality of content areas of a document to be scanned, classifying each of the plurality of content areas into a content type, determining a minimum scanning resolution to maintain readability for each of the plurality of content areas according to the classified content type, and performing a scan of the document to a digital file, wherein each of the plurality of content areas is scanned at least at the determined minimum scanning resolution to maintain readability of the respective content area.
Type: Grant
Filed: January 23, 2020
Date of Patent: October 24, 2023
Assignee: Hewlett-Packard Development Company, L.P.
Inventors: Todd J Harris, Peter Bauer, Litao Hu, Jan Allebach, Zhenhua Hu
-
Patent number: 11783605
Abstract: Certain aspects of the present disclosure provide techniques for training and using machine learning models to extract key-value sets from a document. An example method generally includes identifying regions of a document including key-value sets corresponding to inputs to a data processing application based on a first machine learning model and an electronic version of the document. One or more keys and one or more values are identified in the document based on a second machine learning model. One or more key-value sets are generated based on matching keys of the one or more keys and values of the one or more values in the region of the document. The one or more key-value sets are provided to a data processing application for processing.
Type: Grant
Filed: June 30, 2022
Date of Patent: October 10, 2023
Assignee: INTUIT, INC.
Inventors: Amogha Sekhar, Eric Vanoeveren, Deepankar Mohapatra, Tharathorn Rimchala, Priyadarshini Rajendran
-
Patent number: 11775759
Abstract: Techniques are described herein for training and evaluating machine learning (ML) models for document processing computing applications using generalized vocabulary tokens. In some embodiments, an ML system determines a set of tokens for non-textual content in a plurality of documents. The ML system generates a fixed-length vocabulary that includes the set of tokens for the non-textual content. The ML system further generates, for each respective document in a training dataset of documents, a respective feature vector based at least in part on which tokens in the fixed-length vocabulary occur in the respective document. The ML system trains an ML model based at least in part on the respective feature vector for each respective document in the training dataset.
Type: Grant
Filed: August 15, 2022
Date of Patent: October 3, 2023
Assignee: Oracle International Corporation
Inventor: Sudhakar Kalluri
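The feature-vector step in this abstract can be illustrated with a minimal sketch. It is not the patented method: the placeholder tokens in `NON_TEXT_TOKENS`, the function names, and the choice to fill the rest of the vocabulary with the most widely occurring words are all assumptions made for illustration.

```python
from collections import Counter

# Hypothetical placeholder tokens standing in for non-textual content.
NON_TEXT_TOKENS = ["<IMAGE>", "<TABLE>", "<SIGNATURE>"]


def build_vocabulary(token_lists, size):
    """Build a fixed-length vocabulary: non-textual tokens first, then common words."""
    doc_freq = Counter()
    for tokens in token_lists:
        # Count each word once per document, skipping the non-textual tokens.
        doc_freq.update(set(tokens) - set(NON_TEXT_TOKENS))
    # Sort by descending document frequency, then alphabetically for determinism.
    ranked = sorted(doc_freq, key=lambda t: (-doc_freq[t], t))
    return (NON_TEXT_TOKENS + ranked)[:size]


def feature_vector(tokens, vocabulary):
    """Return a 0/1 vector marking which vocabulary tokens occur in the document."""
    present = set(tokens)
    return [1 if term in present else 0 for term in vocabulary]


docs = [["invoice", "<TABLE>", "total"], ["invoice", "<IMAGE>"]]
vocab = build_vocabulary(docs, 5)
vectors = [feature_vector(d, vocab) for d in docs]
```

Every document maps to a vector of the same length regardless of its size, which is what lets a single model be trained on the whole dataset.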
-
Patent number: 11768888
Abstract: Disclosed are systems and methods for autonomously extracting attributes from domains of a vertical. The disclosed implementations train a deep neural network (“DNN”) based on one or more domains of a vertical using labeled embedding vectors generated for nodes of those one or more domains. The trained DNN may then be used to autonomously label nodes of other domains within the same vertical such that attributes corresponding to those labels can be extracted.
Type: Grant
Filed: August 11, 2021
Date of Patent: September 26, 2023
Assignee: Pinterest, Inc.
Inventors: Jinfeng Zhuang, Zhengda Zhao, Vijai Mohan
-
Patent number: 11741380
Abstract: Embodiments generate machine learning predictions for database migrations. For example, a trained machine learning model that has been trained using training data can be stored, where the training data includes migration information for database migrations and migration methods for the database migrations, and the training data migration information includes a source database type and a target database infrastructure. Migration information can be received for a candidate database migration that includes a source database type and a target database infrastructure. Using the trained machine learning model, migration methods based on the migration information for the candidate database migration can be predicted.
Type: Grant
Filed: January 31, 2020
Date of Patent: August 29, 2023
Assignee: Oracle International Corporation
Inventors: Malay K. Khawas, Saumika Sarangi, Sudipto Basu, Ranajoy Bose, Padma Priya Rajan Natarajan, Bogapurapu L. K. Rao, Parul Yamini
-
Patent number: 11664012
Abstract: In one embodiment, an electronic device includes an input device configured to provide an input stream, a first processing device, and a second processing device. The first processing device is configured to use a keyword-detection model to determine if the input stream comprises a keyword, wake up the second processing device in response to determining that a segment of the input stream comprises the keyword, and modify the keyword-detection model in response to a training input received from the second processing device. The second processing device is configured to use a first neural network to determine whether the segment of the input stream comprises the keyword and provide the training input to the first processing device in response to determining that the segment of the input stream does not comprise the keyword.
Type: Grant
Filed: March 25, 2020
Date of Patent: May 30, 2023
Assignee: Qualcomm Incorporated
Inventors: Young Mo Kang, Sungrack Yun, Kyu Woong Hwang, Hye Jin Jang, Byeonggeun Kim
-
Patent number: 11651448
Abstract: A disclosed computer-implemented method may include receiving a request to generate a dating profile for a user of a community-based dating service of a social networking system based on information associated with the user and maintained by the social networking system. The method may also include accessing information associated with the user and maintained by the social networking system. The method may additionally include selecting, from the information associated with the user and maintained by the social networking system (1) a set of contextual information associated with the user, and (2) a set of media items associated with the user. The method may further include generating the dating profile for the user by arranging the set of contextual information and the set of media items within a dating interface of the social networking system. Various other methods, systems, and computer-readable media are also disclosed.
Type: Grant
Filed: November 21, 2019
Date of Patent: May 16, 2023
Assignee: Meta Platforms, Inc.
Inventor: Jordan Springstroh
-
Patent number: 11645826
Abstract: The present disclosure relates to generating computer searchable text from digital images that depict documents utilizing an orientation neural network and/or text prediction neural network. For example, one or more embodiments detect digital images that depict documents, identify the orientation of the depicted documents, and generate computer searchable text from the depicted documents in the detected digital images. In particular, one or more embodiments train an orientation neural network to identify the orientation of a depicted document in a digital image. Additionally, one or more embodiments train a text prediction neural network to analyze a depicted document in a digital image to generate computer searchable text from the depicted document.
Type: Grant
Filed: September 14, 2020
Date of Patent: May 9, 2023
Assignee: Dropbox, Inc.
Inventors: David J. Kriegman, Peter N. Belhumeur, Bradley Neuberg, Leonard Fink
-
Patent number: 11645600
Abstract: Embodiments relate to a system, program product, and method for managing apparel to facilitate compliance through a cognitive system, i.e., using an artificial intelligence (AI) platform to dynamically analyze the apparel donned by individuals to determine compliance with established apparel compliance practices and provide suggestions for overcoming non-compliance. The determinations of non-compliance are accompanied with respective risk factors. The system, program product, and method disclosed herein facilitate leveraging written requirements processed by natural language processing (NLP) for the donning of apparel that includes proper clothing articles and accessories, as well as associated requirements of clothing articles and accessories that are not appropriate for the respective conditions.
Type: Grant
Filed: April 20, 2020
Date of Patent: May 9, 2023
Assignee: International Business Machines Corporation
Inventors: Stan Kevin Daley, Michael Bender
-
Patent number: 11647261
Abstract: A metadata server that includes circuitry is provided. The circuitry receives a first segment from a plurality of segments of first media content and determines context information associated with the first segment based on a characteristic of at least one frame of a plurality of frames included in the first segment. The circuitry generates first metadata associated with the first segment based on the context information. The first metadata includes timing information corresponding to the determined context information to control a first set of electrical devices. The circuitry further transmits the received first segment and the generated first metadata to a media device associated with the first set of electrical devices.
Type: Grant
Filed: November 22, 2019
Date of Patent: May 9, 2023
Assignee: SONY CORPORATION
Inventors: Jaison Joseph, Anil Sasidharan
-
Patent number: 11507770
Abstract: Described is a system and method that provides a data protection risk assessment for the overall functioning of a backup and recovery system. Accordingly, the system may provide a single overall risk assessment score that provides an operator with an “at-a-glance” overview of the entire system. Moreover, the system may account for changes that occur over time based on leveraging statistical methods to automatically generate assessment scores for various components (e.g. application, server, network, load, etc.). In order to determine a risk assessment score, the system may utilize a predictive model based on historical data. Accordingly, residual values for newly observed data may be determined using the predictive model and the system may identify potentially anomalous or high risk indicators.
Type: Grant
Filed: May 1, 2020
Date of Patent: November 22, 2022
Assignee: EMC IP HOLDING COMPANY LLC
Inventors: Qiang Chen, Jing Yu, Pengfei Wu, Naveen Rastogi
-
Patent number: 11481554
Abstract: Techniques are described herein for training and evaluating machine learning (ML) models for document processing computing applications using generalized vocabulary tokens. In some embodiments, an ML system determines a set of tokens for non-textual content in a plurality of documents. The ML system generates a fixed-length vocabulary that includes the set of tokens for the non-textual content. The ML system further generates, for each respective document in a training dataset of documents, a respective feature vector based at least in part on which tokens in the fixed-length vocabulary occur in the respective document. The ML system trains an ML model based at least in part on the respective feature vector for each respective document in the training dataset.
Type: Grant
Filed: November 8, 2019
Date of Patent: October 25, 2022
Assignee: Oracle International Corporation
Inventor: Sudhakar Kalluri
-
Patent number: 11423052
Abstract: User information categorization using consent-based class rules is described. Consent from a user is received regarding at least one functional area where user information is shareable. Based on the consent, at least one data class that is permitted to be shared is determined. A user information designation is associated with the at least one data class and class rules are applied to user information associated with the user information designation based on the association between the user information designation and the at least one data class.
Type: Grant
Filed: December 14, 2017
Date of Patent: August 23, 2022
Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
Inventors: Sushain Pandit, Martin Oberhofer, Steven Lockwood
-
Patent number: 8639714
Abstract: A variety of computer-based services that permit users to edit, compose, upload, or otherwise generate content also provide for the integration of sponsored media into presentations along with user-generated content. An exemplary service generates text based on user input, provides tags based on the text to a sponsored media repository, receives a sponsored media data structure in return, and formats sponsored media from the data structure for display to the user.
Type: Grant
Filed: August 29, 2007
Date of Patent: January 28, 2014
Assignee: Yahoo! Inc.
Inventor: Roelof van Zwol
-
Patent number: 8639707
Abstract: Retrieval is completed in a short time for presenting a retrieval result of a document file, which satisfies a retrieval condition, to a user having the authority to perform predetermined processing.
Type: Grant
Filed: December 16, 2010
Date of Patent: January 28, 2014
Assignee: International Business Machines Corporation
Inventors: Masaki Komedani, Hirofumi Nishikawa, Fumihiko Terui
-
Patent number: 8626704
Abstract: A map update data supply device and method includes an update map database of per section versions of an update data file, and a request update data extraction unit for extracting a request update section and an update data file. A safeguard update data extraction unit extracts a safeguard update section to safeguard a road network connection between adjacent sections. An integrated data generation unit integrates all versions of the update data file for each extracted request update section and generates a request update integrated data file. The integrated data generation unit integrates, per safeguard update section, versions of the update data file up to the update safeguard version for each extracted safeguard update section, and generates a safeguard update integrated data file. An integrated data supply unit supplies the generated request update integrated data file and the safeguard update integrated data file to a navigation device.
Type: Grant
Filed: January 13, 2011
Date of Patent: January 7, 2014
Assignee: Aisin Aw Co., Ltd.
Inventor: Kimiyoshi Sawai
-
Publication number: 20130311489
Abstract: A method for automatically extracting names that is implemented by a computer having a computer memory includes the steps of storing a list of first names in the computer memory; receiving a document in the computer memory, where at least some of the characters of the document are represented in a machine readable format; identifying a grouping of words in the document as a name candidate based on capitalization of a leading character of at least two of the words; selecting a subject word of the name candidate; comparing the subject word to the list of first names; and determining that the name candidate includes a personal name if the subject word is present in the list of first names, using the computer.
Type: Application
Filed: September 30, 2011
Publication date: November 21, 2013
Applicant: GOOGLE INC.
Inventor: Alex Kerschhofer
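The steps in this abstract (capitalization-based candidates, then a first-name lookup) can be sketched roughly as follows. This is an illustrative reading only: the contents of `FIRST_NAMES`, the regular expression, and the choice of the first word as the "subject word" are assumptions, not details taken from the application.

```python
import re

# Hypothetical first-name list; the abstract only says such a list is stored in memory.
FIRST_NAMES = {"alex", "maria", "john"}

# A name candidate: two or more consecutive words that each start with a capital letter.
CANDIDATE_PATTERN = re.compile(r"\b[A-Z][a-z]+(?:\s+[A-Z][a-z]+)+\b")


def find_personal_names(text, first_names=FIRST_NAMES):
    """Return capitalized word groups whose subject word is a known first name."""
    names = []
    for match in CANDIDATE_PATTERN.finditer(text):
        candidate = match.group(0)
        # Assumption: the subject word is the first word of the candidate.
        subject_word = candidate.split()[0]
        if subject_word.lower() in first_names:
            names.append(candidate)
    return names


sample = "The report cites Alex Kerschhofer, who moved to New York with Maria Lopez."
found = find_personal_names(sample)
```

Here "New York" is a capitalized candidate but is rejected because "New" is not in the first-name list. A sentence-initial capitalized word directly before a name would be absorbed into the candidate and cause a miss, so a fuller version would need to handle that case.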
-
Publication number: 20130144907
Abstract: The present discussion relates to patient image data workflows. One example can temporarily serially arrange a set of semantic labeling modules in a patient image data workflow pipeline responsive to receiving an event trigger. The example can also remove the set of modules from the patient image data workflow pipeline responsive to receiving an event completion trigger.
Type: Application
Filed: December 6, 2011
Publication date: June 6, 2013
Applicant: MICROSOFT CORPORATION
Inventors: Steven J. White, Sayan D. Pathak, Bryan Dove, Duncan P. Robertson, Khan M. Siddiqui, Prabhu KrishnaMoorthy
-
Publication number: 20130080475
Abstract: A system for generating statistics relating to recorded employee behavior, the system including: a first database of tasks performed by employees, the first database being stored on a computer-readable storage medium; a second database of actions taken by the employees while performing the tasks, the second database being stored on a computer-readable storage medium; and a software program, stored on a computer-readable storage medium, configured to extract information from the databases regarding the tasks performed by the employees as well as the actions performed by the employees while carrying out the tasks. The software program then calculates performance statistics relating to success or failure regarding a particular task. The software program furthermore sorts the employees into subgroups based on their status in the company and then calculates performance statistics for the subgroup to compare against individual performance within the subgroup.
Type: Application
Filed: September 25, 2011
Publication date: March 28, 2013
Inventor: Jonathon Gillen
-
Publication number: 20130073514
Abstract: This document describes techniques that label text nodes of a seed site for each of a plurality of verticals. Once a seed site is labeled for a given vertical, the techniques extract features from the labeled text nodes of the seed site. The techniques learn vertical knowledge for the seed site based on the human labels and the extracted features, and adapt the learned vertical knowledge to a new web site to automatically and accurately identify attributes and extract attribute values targeted within a given vertical for structured web data extraction.
Type: Application
Filed: September 20, 2011
Publication date: March 21, 2013
Applicant: MICROSOFT CORPORATION
Inventors: Rui Cai, Lei Zhang, Qiang Hao
-
Publication number: 20130024476
Abstract: A computer implemented method and system provide for automatic selection and extraction of metadata and media content from projects in a craft tool. Automated identification, classification and management of such metadata and content is provided using techniques such as pattern recognition for audio and visual content. The automatic tracking and centralised storage of metadata and content for compliance purposes can be facilitated, and can enable querying of organised metadata stored in a central database. In an example, metadata and media content are extracted automatically from a project in a craft tool at a client system and are forwarded to a host system for the creation of a cue sheet including timings for media files from timing metadata in a project file to create the timings on the cue sheet.
Type: Application
Filed: October 7, 2010
Publication date: January 24, 2013
Inventors: Charles Hodgkinson, Kirk Zavieh
-
Publication number: 20130013553
Abstract: Some embodiments provide a verification system for automated verification of entities. The verification system automatedly verifies entities using a two part verification campaign. One part verifies that the entity is the true owner of the entity account to be verified. This verification step involves (1) the entity receiving a verification code at the entity account and returning the verification code to the verification system, (2) the entity associating an account that it has registered at a service provider to an account that the verification system has registered at the service provider, or (3) both. Another part verifies the entity can respond to communications that are sent to methods of contact that have been previously verified as belonging to the entity. The verification system submits a first communication with a code using a verified method of contact. The verification system then monitors for a second communication to be returned with the code.
Type: Application
Filed: November 7, 2011
Publication date: January 10, 2013
Inventors: Aaron B. Stibel, Peter Delgrosso, Jeffrey M. Stibel, Shailen Misltry, Bryan Mierke, Paul Servino, Charles Chi Thoi Le, David Lo, David Allen Lyon
-
Patent number: 8346620
Abstract: A system for interactive paper is described. Data fragments are captured at locations in a rendered document. A digital version of the document is optionally located. Markup data applied to the capture creates a rich set of interactions for the user. New models for publishing documents and new document-related services are described.
Type: Grant
Filed: September 28, 2010
Date of Patent: January 1, 2013
Assignee: Google Inc.
Inventors: Martin T. King, Dale L. Grover, Clifford A. Kushler, James Q. Stafford-Fraser
-
Publication number: 20120303661
Abstract: Described herein are methods, systems, apparatuses and products for automatically discovering patterns in a text corpus. An aspect provides extracting at least one context string related to at least one annotator from the at least one text corpus; analyzing the at least one context string for at least one sequence, the at least one sequence comprised of at least one subsequence; determining at least one sequence signature for each at least one sequence by applying applicable rules to the at least one sequence; and grouping the at least one sequence signature into at least one group.
Type: Application
Filed: May 27, 2011
Publication date: November 29, 2012
Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
Inventors: Sebastian Johannes Blohm, Vivian Yaw-Wen Chu, Ching-Tien Ho, Yunyao Li, Huaiyu Zhu
-
Publication number: 20120264480
Abstract: Generally described, the present disclosure relates to an electronic device having limited memory. More specifically, the disclosure relates to intelligent data sharing for advanced features on mobile platforms. In one illustrative embodiment, a mobile device provides a platform having native services that use shared data. The data can be received from a central server. In turn, the data can be separated on the mobile device into categories. For a number of contacts, these categories can include, but are not limited to, usage, total count, grouping, location and organization. After the data is placed within the categories, the data can be shared between the services for applications. These applications can include, but are not limited to, voice dialing, Bluetooth™ dialing, searching and dialing. The data can be prioritized depending on the categories. Through prioritization, data can be removed when memory is low and new data is received.
Type: Application
Filed: April 18, 2011
Publication date: October 18, 2012
Inventors: Suriyaprakash Soundrapandian, James Dean Midtun
-
Publication number: 20120239668
Abstract: Various embodiments of systems and methods for extraction and grouping of feature words are described herein. Feature words are obtained from a first corpus of text bodies comprising a plurality of reviews. A second corpus is created using a combination of the obtained feature words, verbs and adjectives from the first corpus. The second corpus comprises filtered reviews and each of the filtered reviews pertains to a review. Topics are preliminarily assigned for words in the filtered reviews of the second corpus. For each of the feature words in the second corpus, a topic count is determined for every preliminarily assigned topic. After determining the topic count, one or more of the topics are finally assigned to the feature words based on a topic count value. At least one topic is presented as a group of the feature words for which the at least one topic is assigned based on the topic count value.
Type: Application
Filed: March 17, 2011
Publication date: September 20, 2012
Inventors: Chiranjib Bhattacharyya, Himabindu Lakkaraju, Kaushik Nath, Sunil Arvindam
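The topic-count step described here can be sketched as below. The abstract does not say how the preliminary topics are produced or how a "topic count value" selects the final topic, so this sketch assumes the preliminary assignments arrive as (word, topic id) pairs and that each feature word is finally assigned its most frequent preliminary topic; the function name and the tie-breaking rule are likewise hypothetical.

```python
from collections import Counter, defaultdict


def group_feature_words(assignments, feature_words):
    """Group feature words under the topic they were most often assigned to.

    assignments: iterable of (word, topic_id) preliminary assignments.
    feature_words: set of words to be grouped; other words are ignored.
    """
    topic_counts = defaultdict(Counter)
    for word, topic in assignments:
        if word in feature_words:
            topic_counts[word][topic] += 1

    groups = defaultdict(list)
    for word in sorted(topic_counts):
        counts = topic_counts[word]
        # Assumption: pick the highest count; ties go to the lowest topic id.
        best_topic = min(counts, key=lambda t: (-counts[t], t))
        groups[best_topic].append(word)
    return dict(groups)


preliminary = [
    ("battery", 0), ("battery", 0), ("battery", 1),
    ("screen", 1), ("charger", 0), ("great", 1),
]
grouped = group_feature_words(preliminary, {"battery", "screen", "charger"})
```

In this toy input, "battery" was assigned topic 0 twice and topic 1 once, so it lands in the topic 0 group alongside "charger".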
-
Patent number: 8261200
Abstract: An interactive system provides for increasing retrieval performance of images depicting text by allowing users to provide relevance feedback on words contained in the images. The system includes a user interface through which the user queries the system with query terms for images contained in the system. Word image suggestions are displayed to the user through the user interface, where each word image suggestion contains the same or slightly variant text as recognized from the word image by the system than the particular query terms. Word image suggestions can be included in the system by the user to increase system recall of images for the one or more query terms and can be excluded from the system by the user to increase precision of image retrieval results for particular query terms.
Type: Grant
Filed: April 26, 2007
Date of Patent: September 4, 2012
Assignee: Fuji Xerox Co., Ltd.
Inventors: Laurent Denoue, John E. Adcock, David M. Hilbert, Daniel Billsus
-
Publication number: 20120203764
Abstract: A method of identifying one or more particular images from an image collection, includes indexing the image collection to provide image descriptors for each image in the image collection such that each image is described by one or more of the image descriptors; receiving a query from a user specifying at least one keyword for an image search; and using the keyword(s) to search a second collection of tagged images to identify co-occurrence keywords. The method further includes using the identified co-occurrence keywords to provide an expanded list of keywords; using the expanded list of keywords to search the image descriptors to identify a set of candidate images satisfying the keywords; grouping the set of candidate images according to at least one of the image descriptors, and selecting one or more representative images from each grouping; and displaying the representative images to the user.
Type: Application
Filed: February 4, 2011
Publication date: August 9, 2012
Inventors: Mark D. Wood, Alexander C. Loui
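The keyword-expansion step in this abstract can be pictured with a short sketch. It covers only the co-occurrence part, not the indexing, grouping, or display steps, and it assumes the second collection is available as lists of tags per image; the function name, the `top_n` cutoff, and the ranking by raw co-occurrence count are illustrative assumptions.

```python
from collections import Counter


def expand_keywords(query_keywords, tagged_images, top_n=3):
    """Append the tags that most often co-occur with the query keywords.

    tagged_images: list of tag lists, one per image in the tagged collection.
    """
    query = set(query_keywords)
    co_counts = Counter()
    for tags in tagged_images:
        tag_set = set(tags)
        # Only images that share at least one tag with the query contribute.
        if tag_set & query:
            co_counts.update(tag_set - query)
    # Rank by descending co-occurrence count, then alphabetically for determinism.
    ranked = sorted(co_counts, key=lambda t: (-co_counts[t], t))
    return list(query_keywords) + ranked[:top_n]


tagged = [
    ["beach", "ocean", "sand"],
    ["beach", "ocean"],
    ["city", "night"],
    ["beach", "sunset"],
]
expanded = expand_keywords(["beach"], tagged, top_n=2)
```

The expanded list would then be matched against the image descriptors of the user's own collection, which is the part of the method this sketch leaves out.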
-
Publication number: 20120150792
Abstract: The present disclosure involves systems, software, and computer implemented methods for providing a data extraction framework for extracting data and metadata from an application to provide additional functionality for the extracted data and metadata. One process includes operations for identifying a first application for data extraction and determining a set of data suitable for extraction from the first application using a software development kit associated with the first application. The set of data is stored in a repository without storing visualization components of the first application in the repository. The set of data is sent to a second application for further processing of the set of data. The second application is configured to bind different visualization components to the set of data for display of data elements in the set of data to a user.
Type: Application
Filed: December 9, 2010
Publication date: June 14, 2012
Applicant: SAP PORTALS ISRAEL LTD.
Inventors: Ohad Yassin, Pavel Kravets, Nisim Hafzadi, Ram Alon
-
Publication number: 20120136812
Abstract: One embodiment of the present invention provides a system for optimizing and customizing document-similarity calculation. During operation, the system presents a collection of similar documents to a user, collects feedback on the similarity of the documents from the user, generates generic rules for calculating document similarity, and filters documents with customized similarity calculation based on the feedback provided by the user.
Type: Application
Filed: November 29, 2010
Publication date: May 31, 2012
Applicant: PALO ALTO RESEARCH CENTER INCORPORATED
Inventor: Oliver Brdiczka
-
Publication number: 20120089643
Abstract: A computer implemented method and system provide for automatic selection and extraction of metadata and media content from projects in a craft tool. Automated identification, classification and management of such metadata and content is provided using techniques such as pattern recognition for audio and visual content. The automatic tracking and centralised storage of metadata and content for compliance purposes can be facilitated, and can enable querying of organised metadata stored in a central database. In an example, metadata and media content are extracted automatically from a project in a craft tool at a client system and are forwarded to a host system for the creation of a cue sheet including timings for media files from timing metadata in a project file to create the timings on the cue sheet.
Type: Application
Filed: October 7, 2010
Publication date: April 12, 2012
Inventors: Charles Hodgkinson, Kirk Zavieh
-
Publication number: 20120089642
Abstract: The system and methods described herein provide results previewing for an interactive text mining system in order to feedback partial query results to users before all results that are responsive to a query have been found. These partial results allow the user to see the progress of their text mining query much sooner.
Type: Application
Filed: October 6, 2010
Publication date: April 12, 2012
Inventors: David R. Milward, Roger W. Hale, Malcolm R. Parsons, Sylvia F. Knight, Christopher I. Sullivan, Jason Trenouth, James R. Thomas
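One common way to surface partial results before a query finishes is to stream them in small batches. The sketch below shows that idea with a Python generator; the function name, the substring match, and the batch size are illustrative assumptions and are not taken from the publication.

```python
def preview_results(documents, term, batch_size=2):
    """Yield matching document ids in batches while the scan is still running.

    documents is an iterable of (doc_id, text) pairs; matching is a plain
    case-insensitive substring test (an assumption for this sketch).
    """
    term = term.lower()
    batch = []
    for doc_id, text in documents:
        if term in text.lower():
            batch.append(doc_id)
            if len(batch) == batch_size:
                yield batch  # partial results are available to the caller now
                batch = []
    if batch:
        yield batch  # final, possibly smaller, batch
```

A caller can display each batch as it arrives, for example `for partial in preview_results(docs, "gene"): show(partial)`, where `show` is a hypothetical display function.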
-
Publication number: 20120047167
Abstract: A portable terminal includes a word extracting unit that extracts a word contained in data of a Web page being viewed; a Web search request unit that transmits a search request to a search site with the word extracted by the word extracting unit as a search word and that receives a list of Web pages that contain the search word from the search site as a search result; and a display unit that displays the search result received by the Web search request unit.
Type: Application
Filed: November 2, 2011
Publication date: February 23, 2012
Applicant: FUJITSU TOSHIBA MOBILE COMMUNICATIONS LIMITED
Inventors: Masaki SAKAI, Natsuko OUCHI
-
Publication number: 20120047172
Abstract: A technique includes providing a collection of documents in multiple languages, identifying, from the collection of documents, a group of candidate documents, where each candidate document in the group shares multiple corresponding rare features, evaluating pairs of candidate documents in the group using multiple common features present in the collection of documents, and determining, based on evaluating the pairs of candidate documents, whether each pair of candidate documents corresponds to a translated pair of documents.
Type: Application
Filed: August 22, 2011
Publication date: February 23, 2012
Applicant: Google Inc.
Inventors: Jay M. Ponte, Jakob Uszkoreit, Ashok C. Popat, Moshe Dubiner
-
Publication number: 20120047176
Abstract: A system and methodology for real-time content aggregation and syndication is described. In one embodiment, for example, a method is described for assisting a user with extracting items relevant to search queries from documents including items of various types, the method comprises steps of: receiving a search query specifying a search phrase and a particular item type; identifying documents matching the search phrase; for each matching document, determining whether the document includes an item having the particular item type; and extracting items having the particular item type from the matching documents for display to the user. The solution enables a user to aggregate and syndicate content without a professional content manager or complicated content management software tools.
Type: Application
Filed: November 2, 2011
Publication date: February 23, 2012
Applicant: SYBASE, INC.
Inventor: Michael Timmons
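The steps listed in this abstract (match the phrase, check for the item type, extract the items) can be sketched in a few lines. The document layout, with a "text" field and a list of typed "items", is a hypothetical structure chosen for illustration, as is the function name.

```python
def extract_items(documents, phrase, item_type):
    """Return the values of all items of item_type found in matching documents.

    Each document is a dict with a "text" string and an "items" list of
    {"type": ..., "value": ...} dicts (an assumed layout, not from the source).
    """
    phrase = phrase.lower()
    extracted = []
    for doc in documents:
        if phrase not in doc["text"].lower():
            continue  # the document does not match the search phrase
        # Keep only items of the requested type from this matching document.
        extracted.extend(
            item["value"] for item in doc["items"] if item["type"] == item_type
        )
    return extracted
```

The returned list is what would be handed to the display step; a matching document with no item of the requested type simply contributes nothing.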
-
Publication number: 20120036144
Abstract: According to one embodiment, an information recommendation device includes following units. The input unit is configured to input a first document and a second document which has been browsed before the first document. The subject-keyword extraction unit is configured to extract first and second subject keywords from the first and second documents, respectively. The interest-keyword extraction unit is configured to extract first interest keywords from the first and second subject keywords, and to extract second interest keywords based on information specifying the first and second documents, the first interest keywords, and the first and second subject keywords. The second interest keywords are estimated to be keywords in which the user is next interested. The acquiring unit is configured to acquire, based on the second interest keywords, recommendation information on third documents which are candidates to be browsed after the first document. The presentation unit presents the recommendation information.
Type: Application
Filed: August 25, 2011
Publication date: February 9, 2012
Applicant: KABUSHIKI KAISHA TOSHIBA
Inventors: Masayuki Okamoto, Nayuko Watanabe, Masaaki Kikuchi, Takayuki Iida, Mika Fukui
-
Publication number: 20110307497
Abstract: “Synthewiser”™ is a search method and system that synthesizes a single non-template, text-based document that is organized by topic and integrates and consolidates information from multiple sources. This is accomplished by: having a user provide a search phrase; creating seed phrases; identifying seed locations in multiple sources; creating expanded text segments; grouping expanded text segments; consolidating content; and synthesizing a single document. Synthewiser has advantages over today's dominant search engine. Its results are organized by topic and are integrated across multiple sources.
Type: Application
Filed: June 14, 2010
Publication date: December 15, 2011
Inventor: Robert A. Connor
-
Publication number: 20110302179
Abstract: Described is using context information obtained from entity mentions in likely relevant documents to extract entity mentions from documents that are ambiguous with respect to their relevance to a domain. A list of entities is input into an entity extraction mechanism, which processes a large collection of documents to determine data (counts) corresponding to frequency of entity mentions. Infrequently mentioned entities are specific entities, while frequently mentioned entities are non-specific (generic or ambiguous) entities. The context surrounding mentions of the specific entities is processed to obtain interesting context terms (words, phrases or both) for the domain. The interesting context terms are then compared against the contexts of non-specific entity mentions to determine whether each non-specific entity mention is relevant to the domain. A result set containing only relevant documents or relevant mentions collection is output.
Type: Application
Filed: June 7, 2010
Publication date: December 8, 2011
Applicant: Microsoft Corporation
Inventor: Sanjay Agrawal
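The pipeline in this abstract (count mentions, treat rare entities as specific, harvest their context terms, then test ambiguous mentions against those terms) can be sketched as follows. This is a simplified illustration under several assumptions: entities are single lowercase tokens, context is a fixed token window, the frequency cut-off and stopword list are arbitrary, and all function and parameter names are hypothetical.

```python
from collections import Counter


def find_mentions(docs, entities, window=2):
    """Return (entity, doc_index, context_token_set) for every entity mention."""
    mentions = []
    for doc_index, text in enumerate(docs):
        tokens = text.lower().split()
        for i, token in enumerate(tokens):
            if token in entities:
                context = set(tokens[max(0, i - window):i] + tokens[i + 1:i + 1 + window])
                mentions.append((token, doc_index, context))
    return mentions


def relevant_mentions(docs, entities, max_specific_count=1, window=2,
                      stopwords=frozenset({"the", "a", "an", "of"})):
    """Keep specific mentions, plus non-specific ones sharing a context term."""
    mentions = find_mentions(docs, entities, window)
    counts = Counter(entity for entity, _, _ in mentions)
    # Infrequently mentioned entities are treated as specific to the domain.
    specific = {e for e, c in counts.items() if c <= max_specific_count}
    # Context terms seen around specific entities characterize the domain.
    context_terms = set()
    for entity, _, context in mentions:
        if entity in specific:
            context_terms |= context
    context_terms -= set(stopwords) | set(entities)
    # A non-specific mention is kept only if its context overlaps those terms.
    return [(entity, doc_index) for entity, doc_index, context in mentions
            if entity in specific or context & context_terms]
```

With a product-domain entity list, an ambiguous name is kept when it appears next to terms that also surround unambiguous product names, and dropped otherwise. A production system would likely weight context terms statistically rather than use a plain set overlap.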
-
Publication number: 20110295893
Abstract: A method of searching an expected image in an electronic apparatus comprises the steps of inputting a hand drawing of the expected image into the electronic apparatus; determining whether or not a text description for partially characterizing the expected image is inputted; identifying and searching the expected image in the electronic apparatus according to the hand drawing if the text description is not inputted, or selecting a text label from the text description and interpreting the selected text label by the electronic apparatus if the text description is inputted; and searching a database in the electronic apparatus according to the text label, and fetching the expected image from the database if the value of the image item matches the text label. The hand drawing and/or text label inputted from a mobile phone screen are provided for arranging and searching pictures or images in the database efficiently.
Type: Application
Filed: April 21, 2011
Publication date: December 1, 2011
Applicants: INVENTEC APPLIANCES (SHANGHAI) CO. LTD., INVENTEC APPLIANCES (NANCHANG) CO. LTD., INVENTEC APPLIANCES CORP.
Inventor: PENG-FEI WU
-
Publication number: 20110295775
Abstract: Techniques for identifying near-duplicates of a media object and associating metadata of the near-duplicates with the media object are described herein. One or more devices implementing the techniques are configured to identify the near duplicates based at least on similarity attributes included in the media object. Metadata is then extracted from the near-duplicates and is associated with the media object as descriptors of the media object to enable discovery of the media object based on the descriptors.
Type: Application
Filed: May 28, 2010
Publication date: December 1, 2011
Applicant: MICROSOFT CORPORATION
Inventors: Xin-Jing Wang, Lei Zhang, Ming Liu, Yi Li, Wei-Ying Ma
-
Publication number: 20110270819
Abstract: Query classification techniques attempt to classify user search queries in order to better understand user search intent. Understanding a user's search intent allows search engines to provide relevant content tailored to the user's interest. Unfortunately, current classification techniques do not take into account contextual information. Accordingly, as provided herein, a target query may be classified based upon contextual information. In particular, features may be extracted from contextual information and/or other sources. For example, features may be extracted from the target query, related queries, and/or invoked search results of the related queries. In this way, the target query may be classified based upon other queries performed by the user and/or search results of the queries the user found interesting. In addition, a CRF model may be utilized in classifying the target query by providing generalized parameters learned from labeled query sessions.
Type: Application
Filed: April 30, 2010
Publication date: November 3, 2011
Applicant: Microsoft Corporation
Inventors: Dou Shen, Daxin Jiang, Jian-Tao Sun
-
Publication number: 20110264675
Abstract: A searching apparatus includes a memory unit which stores transposed indexes representing appearing positions of all n-grams in plural pieces of document data subjected to searching and appearing frequencies, an n-gram extracting unit that extracts all n-grams extractable from a searching character string, a smallest-frequency deriving unit which refers to the appearing frequency of the n-gram represented by the transposed index, and derives an n-gram with the smallest appearing frequency among all of the extracted n-grams, a searching n-gram selecting unit that selects, from all extracted n-grams, a plurality of searching n-grams which form the searching character string and include the n-gram with the smallest appearing frequency, and a document specifying unit that specifies, based on the plurality of selected searching n-grams and the appearing position of the searching n-gram represented by the transposed index, document data including the searching character string among the plural pieces of document data.
Type: Application
Filed: April 26, 2011
Publication date: October 27, 2011
Applicant: CASIO COMPUTER CO., LTD.
Inventor: Katsuhiko SATOH
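The search strategy in this abstract can be illustrated with a small positional n-gram index (the "transposed index" is what is more commonly called an inverted index). The sketch below is an assumption-laden illustration rather than the claimed apparatus: it uses character bigrams by default, picks covering n-grams at a fixed stride plus the rarest one, and all names are hypothetical.

```python
from collections import defaultdict


def build_index(docs, n=2):
    """Map each n-gram to {doc_id: [positions]} over a dict of doc_id -> text."""
    index = defaultdict(lambda: defaultdict(list))
    for doc_id, text in docs.items():
        for i in range(len(text) - n + 1):
            index[text[i:i + n]][doc_id].append(i)
    return index


def frequency(index, gram):
    """Total number of occurrences of gram across all documents."""
    return sum(len(positions) for positions in index.get(gram, {}).values())


def search(index, query, n=2):
    """Return the ids of documents that contain query as a substring."""
    if len(query) < n:
        raise ValueError("query must be at least n characters long")
    last = len(query) - n
    grams = [(offset, query[offset:offset + n]) for offset in range(last + 1)]
    # The n-gram with the smallest frequency anchors the search.
    rare_offset, rare_gram = min(grams, key=lambda g: frequency(index, g[1]))
    # Select n-grams that together cover the whole query and include the rarest.
    offsets = sorted(set(range(0, last + 1, n)) | {last, rare_offset})
    selected = [(offset, query[offset:offset + n]) for offset in offsets]
    hits = set()
    for doc_id, positions in index.get(rare_gram, {}).items():
        for position in positions:
            start = position - rare_offset  # where the query would begin
            if all(start + offset in index.get(gram, {}).get(doc_id, ())
                   for offset, gram in selected):
                hits.add(doc_id)
                break
    return hits
```

Anchoring on the rarest n-gram keeps the number of candidate positions small, and because the selected n-grams cover every character of the query, a position match on all of them implies the full string occurs in the document.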
-
Publication number: 20110246027
Abstract: An image processing system inputs a captured image of a scene viewed from a vehicle in a predetermined road section and an image-capturing position at which the image is captured. The system uses a given position in the predetermined road section as a specific position, and sets a target vehicle movement amount at the specific position, for passing through the predetermined road section. The system generates reference image data from the captured image obtained at the specific position. The system generates reference data that is used when scenic image recognition is performed, by associating the reference image data with the specific position and the target vehicle movement amount at the specific position, and generates a reference data database that is a database of the reference data.
Type: Application
Filed: January 25, 2011
Publication date: October 6, 2011
Applicant: AISIN AW CO., LTD.
Inventor: Takayuki MIYAJIMA