Into Predefined Classes (epo) Patents (Class 707/E17.09)
  • Patent number: 11907278
    Abstract: A method includes searching a technical document including first, second, and third data fields based on search terms and search year ranges related to a technical field, generating a keyword set using the first, second, and third data fields of the searched technical document, scoring a plurality of keywords included in the keyword set, and selecting some of the plurality of keywords, re-searching the technical document related to the technical field, using the selected keywords, scoring the re-searched technical document to derive a representative document representing the technical field, and deriving a representative keyword representing the technical field, using the second data field included in the representative document, wherein the first data field includes a title of the technical document, the second data field includes a summary of the technical document, and the third data field includes keywords of the technical document.
    Type: Grant
    Filed: April 13, 2022
    Date of Patent: February 20, 2024
    Assignee: Samsung Electronics Co., Ltd.
    Inventors: Hyun Pil Kim, Tae Sung Kim, Chang-Ju Lee
  • Patent number: 11880398
    Abstract: A server can receive a seed keyword to generate additional keywords relevant to the seed keyword. The server can identify, using a semantic relationship graph, keyword categories. Each keyword can have a semantic distance from the seed keyword less than a threshold. The server can generate, for each keyword of the keyword categories, a keyword-seed affinity score based on a frequency of the keyword occurring with the seed keyword on an information resource. The server can determine, for each keyword category, a category-seed affinity score based on the keyword-seed affinity scores for each keyword in the keyword category. The server can compare each category-seed affinity score to a threshold. The server can transmit, for display, the keywords. One keyword category can be indicated as selected and another keyword category can be indicated as unselected based on the comparison.
    Type: Grant
    Filed: July 21, 2021
    Date of Patent: January 23, 2024
    Assignee: GOOGLE LLC
    Inventors: Justin Lewis, Gavin James
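
The scoring steps in the abstract above can be illustrated with a minimal sketch. It is not the patented method: it assumes each information resource is a set of terms, that the keyword-seed affinity is the fraction of seed-containing resources that also contain the keyword, and that the category score is the mean of its keyword scores. The function names and the sample data are hypothetical.

```python
def keyword_seed_affinity(keyword, seed, pages):
    """Fraction of pages containing the seed that also contain the keyword.

    Assumption: co-occurrence frequency is measured per page (set of terms).
    """
    with_seed = [page for page in pages if seed in page]
    if not with_seed:
        return 0.0
    return sum(1 for page in with_seed if keyword in page) / len(with_seed)


def select_categories(categories, seed, pages, threshold):
    """Return {category: (score, selected)}.

    Assumption: the category-seed score is the mean of its keyword scores.
    """
    result = {}
    for name, keywords in categories.items():
        scores = [keyword_seed_affinity(k, seed, pages) for k in keywords]
        category_score = sum(scores) / len(scores) if scores else 0.0
        result[name] = (category_score, category_score >= threshold)
    return result


# Hypothetical sample data: four pages, two keyword categories.
pages = [
    {"running", "shoes", "sneakers"},
    {"running", "shoes", "marathon"},
    {"running", "recipes"},
    {"shoes", "boots"},
]
categories = {"footwear": ["shoes", "sneakers"], "food": ["recipes"]}
selection = select_categories(categories, "running", pages, threshold=0.4)
```

With this data, the "footwear" category clears the threshold and is marked selected, while "food" is marked unselected.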
  • Patent number: 11853905
    Abstract: Systems and methods to identify document transitions between adjacent documents within document bundles are disclosed. Exemplary implementations may train a model: obtain training information including a first training bundle and corresponding document separation markers; determine page-specific feature information pertaining to individual pages of the first training bundle; determine, based on the obtained page-specific feature information, page-specific feature values for individual features of the individual pages of the first training bundle; generate, for the individual pages of the first training bundle, a page-specific feature vector; train the model, using the training document bundles, to determine whether the first page and the second page are part of different documents. Systems and methods may utilize the trained model to identify document transitions between adjacent documents within document bundles.
    Type: Grant
    Filed: June 24, 2022
    Date of Patent: December 26, 2023
    Assignee: Instabase, Inc.
    Inventor: Daniel Benjamin Cahn
  • Patent number: 11836189
    Abstract: An approach is provided in which the approach calculates at least one weighting factor based on a word frequency analysis of an unlabeled document against a set of word frequencies corresponding to a set of labeled documents. The approach computes an a posteriori classification probability of the unlabeled document based on the at least one weighting factor, and creates an inferred classifier based on the a posteriori classification probability. The approach classifies the unlabeled document using the inferred classifier.
    Type: Grant
    Filed: March 25, 2020
    Date of Patent: December 5, 2023
    Assignee: International Business Machines Corporation
    Inventors: Thiago Bianchi, John Donald Vasquez, John Maxwell Cohn
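
The abstract above does not say how the a posteriori probability is computed. As a rough illustration only, the sketch below swaps in a standard multinomial naive Bayes estimate with add-one smoothing, computed from per-class word counts of labeled documents. The function name, the priors, and the sample counts are assumptions, not details from the patent.

```python
import math


def posterior(doc_words, class_word_counts, class_priors):
    """Naive Bayes posterior over classes for a bag of words.

    Assumption: add-one (Laplace) smoothing over the joint vocabulary.
    """
    vocab = set(doc_words)
    for counts in class_word_counts.values():
        vocab.update(counts)

    log_scores = {}
    for label, counts in class_word_counts.items():
        total = sum(counts.values())
        log_p = math.log(class_priors[label])
        for word in doc_words:
            log_p += math.log((counts.get(word, 0) + 1) / (total + len(vocab)))
        log_scores[label] = log_p

    # Normalize in log space to avoid underflow on long documents.
    peak = max(log_scores.values())
    norm = sum(math.exp(s - peak) for s in log_scores.values())
    return {label: math.exp(s - peak) / norm for label, s in log_scores.items()}


# Hypothetical word counts taken from two sets of labeled documents.
labeled = {
    "sports": {"goal": 3, "match": 2},
    "finance": {"stock": 3, "market": 2},
}
probs = posterior(["goal", "match", "goal"], labeled, {"sports": 0.5, "finance": 0.5})
```

The class with the highest posterior would then serve as the inferred label for the unlabeled document.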
  • Patent number: 11775886
    Abstract: A data set comprising records of state change events of items of an item collection, as well as records of asynchronous operations associated with the items, is obtained. The numbers of records in the data set may differ from one item to another. Using the data set, a Bayesian forecasting model employing a deconvolution algorithm is trained. The model generates estimates of metrics of a type of asynchronous operation using a combination of a category-level distribution of the asynchronous operation, an item-level distribution, and a category-versus-item adjustment. A trained version of the model is stored.
    Type: Grant
    Filed: December 12, 2018
    Date of Patent: October 3, 2023
    Assignee: Amazon Technologies, Inc.
    Inventors: Ramesh Natarajan, Jonathan Hosking
  • Patent number: 11741168
    Abstract: Techniques for multi-label document classification are described. Clustering is used to cluster labels in a set. A machine learning model including a multi-label classifier for each cluster is created, the multi-label classifier for a given cluster to classify a document with one or more of the labels in the cluster.
    Type: Grant
    Filed: September 30, 2019
    Date of Patent: August 29, 2023
    Assignee: Amazon Technologies, Inc.
    Inventors: Sravan Babu Bodapati, Rishita Rajal Anubhai, Yahor Pushkin
  • Patent number: 11694463
    Abstract: Described embodiments relate to a method comprising: determining a candidate document comprising image data and character data and extracting the image data and the character data from the candidate document. The method comprises providing, to an image-based numerical representation generation model, the image data, and generating, by the image-based numerical representation generation model, an image-based numerical representation of the image data. The method comprises providing, to a character-based numerical representation generation model, the character data; and generating, by the character-based numerical representation generation model, a character-based numerical representation of the character data.
    Type: Grant
    Filed: July 20, 2022
    Date of Patent: July 4, 2023
    Inventors: Jerome Gleyzes, Mohamed Khodeir, Salim Fakhouri, Yu Wu, Soon-Ee Cheah
  • Patent number: 11631267
    Abstract: A tiered processing scheme for processing image data is provided. A method can include obtaining image data indicative of one or more court judgment(s) with a number of features. The method can include obtaining judgment information from the image data by applying a number of image processing techniques in accordance with a processing hierarchy tailored to the image data. The image data can be classified by a machine-learning model and the processing hierarchy can be determined based on the classification. The processing hierarchy balances the computing resources used by a respective technique with the accuracy afforded by the technique when applied to image data with a respective classification. A computing system can utilize the processing hierarchy to leverage different image processing techniques in a tiered processing scheme tailored to image data.
    Type: Grant
    Filed: February 26, 2021
    Date of Patent: April 18, 2023
    Assignee: InvestiNet, LLC
    Inventors: Thomas Marcial Kersting, Aaron Michael Brooks, Caleb Michael Rogers
  • Patent number: 11595484
    Abstract: A remote network management platform is provided that includes an end-user computational instance dedicated to a managed network, a training computational instance, and a prediction computational instance. The training instance is configured to receive a corpus of textual records from the end-user instance and to determine therefrom a machine learning (ML) model to determine the numerical similarity between input textual records and textual records in the corpus of textual records. The prediction instance is configured to receive the ML model and an additional textual record from the end-user instance, to use the ML model to determine respective numerical similarities between the additional textual record and the textual records in the corpus of textual records, and to transmit, based on the respective numerical similarities, representations of one or more of the textual records in the corpus of textual records to the end-user computational instance.
    Type: Grant
    Filed: May 3, 2019
    Date of Patent: February 28, 2023
    Assignee: ServiceNow, Inc.
    Inventors: Baskar Jayaraman, Aniruddha Madhusudhan Thakur, Kannan Govindarajan, Andrew Kai Chiu Wong, Sriram Palapudi
  • Patent number: 11586945
    Abstract: Methods and systems are provided for modifying an application provided by a cloud-based computing system. The application is used by end users of an organization that is part of the cloud-based computing system. A clickstream monitoring module monitors a clickstream generated by each end user as that end user interacts with the application to generate a set of clickstream data for that particular end user. Each set of clickstream data indicates a path of interaction with features of the application by a particular end user. The sets of clickstream data can then be processed at an analytics engine to extract usage patterns that indicate how end users interact with different features of the application during usage of the application. The extracted usage patterns indicate which features the end users interact with and in what order.
    Type: Grant
    Filed: August 28, 2019
    Date of Patent: February 21, 2023
    Assignee: salesforce.com, inc.
    Inventor: Axella Novotny
  • Patent number: 11575641
    Abstract: Provided is an estimating device, an estimating method, and an estimating program that each make it possible to estimate the area of activity of a target user with a smaller amount of information. An estimating device includes a first position distribution generating unit configured to generate a first position distribution of a target user on social media based on account information of the target user, a second position distribution generating unit configured to generate a second position distribution of a friend who is friends with the target user on the social media based on account information of the friend, and an estimating unit configured to estimate an area of activity of the target user based on the generated first position distribution and the generated second position distribution.
    Type: Grant
    Filed: January 19, 2022
    Date of Patent: February 7, 2023
    Assignee: NEC CORPORATION
    Inventors: Keisuke Ikeda, Kazufumi Kojima, Masahiro Tani
  • Patent number: 10180952
    Abstract: A search engine to index web content with user content. A server computer receives, from a first client computer operated by a first user, an identification of first web content displayed by a web browser of the first client computer in a main browser window. The identification of the first web content is transmitted by the first user to the server computer via a user interface separate from the main browser window. The server computer then indexes the first web content. In response to receiving a search query from a web browser of a second client computer operated by a second user, the server computer transmits search results to the web browser of the second client computer. The search results include the first web content identified by the first user in a position relative to identifications of other web content received from other users.
    Type: Grant
    Filed: April 29, 2016
    Date of Patent: January 15, 2019
    Assignee: NEWSPLUG, INC.
    Inventors: John S. Shriber, Roman Zaks
  • Patent number: 10089332
    Abstract: A method of classifying contents comprising configuring one or more categories in a hierarchical structure, mapping one or more contents and the one or more categories based on at least one piece of information on the one or more contents and information on the one or more categories, and updating the hierarchical structure of the categories based on a preset condition when content-related information of each category determined according to the mapping meets the preset condition.
    Type: Grant
    Filed: June 12, 2015
    Date of Patent: October 2, 2018
    Assignee: Samsung Electronics Co., Ltd.
    Inventors: Hee-Kuk Lee, Dae-Kyu Shin, Seong-Ho Cho, Ik-Hwan Cho
  • Patent number: 9959306
    Abstract: A method for processing a dataset in a partitioned distributed storage system having data stored in a base table and an index stored in an index table, may include receiving base and index table metadata from the partitioned distributed storage system, where the base and index table metadata includes respective table partition information. The method may further include partitioning the dataset into a set of base-delta files according to the base table metadata, and generating a set of index-delta files corresponding with the base-delta files according to the index table metadata. The method may additionally include updating the partitioned distributed storage system with the set of base-delta and the set of index-delta files, where a first update of the base table is synchronous with a second update of the index table.
    Type: Grant
    Filed: June 12, 2015
    Date of Patent: May 1, 2018
    Assignee: International Business Machines Corporation
    Inventors: Yuan-Chi Chang, Liana L. Fong, Wei Tan
  • Patent number: 9767118
    Abstract: An information handling system includes a processor and a memory including code to implement a Unified Extensible Firmware Interface (UEFI). The UEFI includes a UEFI network file system module that provides a first compound command to get directory information for a first directory on a network storage device, provides a second compound command to get file information for the first directory, and provides a third compound command to open a file stored on the first directory. The UEFI also includes a UEFI network protocol module that sends the first compound command, the second compound command, and the third compound command to the network storage device, wherein the first compound command, the second compound command, and the third compound command are sent to the network storage device via a first network transaction.
    Type: Grant
    Filed: December 1, 2014
    Date of Patent: September 19, 2017
    Assignee: Dell Products, LP
    Inventors: Ankit Singh, S. Shekar Babu, Sumanth Vidyadhara
  • Patent number: 8892561
    Abstract: Techniques are described for refining the manual classification of assets classified or categorized using the terms of a business glossary. A semantic refinement mechanism is used to refine the manual classification of such assets, as well as subsequently evaluate the refined asset classifications. Further, the refined asset classifications may be used as a training set for a machine learning classifier. That is, should the classification of an asset contributing to a refinement change, the refinement based on that classification may be undone, at least in some cases.
    Type: Grant
    Filed: May 3, 2013
    Date of Patent: November 18, 2014
    Assignee: International Business Machines Corporation
    Inventors: Sushain Pandit, Charles K. Shank, Charles D. Wolfson
  • Patent number: 8855379
    Abstract: The saving device for image sharing includes an image acquiring unit configured to acquire the images offered by a sharer of the images, a sharee information storing unit configured to store sharee information with respect to at least one sharee, a subject assessing unit configured to assess whether or not a person subject is included in the acquired images, an image associating unit configured to associate the images assessed as not including a person subject with the images assessed as including a person subject, based on the sharee information, and a shared image determining unit configured to determine the images to be shared with the sharee or sharees from among the associated images and the images assessed as including a person subject, based on the sharee information. The image sharing system and an image sharing method use such a device.
    Type: Grant
    Filed: March 29, 2012
    Date of Patent: October 7, 2014
    Assignee: Facebook, Inc.
    Inventors: Kazuma Tsukagoshi, Yukinori Yokoyama, Karin Kon, Yuto Furukawa
  • Patent number: 8849828
    Abstract: Techniques are described for refining the manual classification of assets classified or categorized using the terms of a business glossary. A semantic refinement mechanism is used to refine the manual classification of such assets, as well as subsequently evaluate the refined asset classifications. Further, the refined asset classifications may be used as a training set for a machine learning classifier. That is, should the classification of an asset contributing to a refinement change, the refinement based on that classification may be undone, at least in some cases.
    Type: Grant
    Filed: September 30, 2011
    Date of Patent: September 30, 2014
    Assignee: International Business Machines Corporation
    Inventors: Sushain Pandit, Charles K. Shank, Charles D. Wolfson
  • Publication number: 20140032558
    Abstract: A computer implemented system and method are provided for refining category scores for pages of a sequence of document pages that potentially includes document boundaries. The method uses initial category scores provided by a categorizer that considers one page at a time or concatenated pairs of pages (called bipages). The category scores represent the probability that a page belongs to a particular category. The method uses anisotropic diffusion to refine the initial page category scores using the scores of neighboring pages as a function of the probability that there is a boundary between the pages. The method may be performed iteratively.
    Type: Application
    Filed: July 26, 2012
    Publication date: January 30, 2014
    Applicant: Xerox Corporation
    Inventors: Jean-Michel Renders, François Ragnet, Damien Cramet
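
The refinement idea in the abstract above can be sketched as a simple discrete diffusion along the page sequence, where the exchange between neighboring pages is scaled by one minus the boundary probability. This is an illustrative assumption about the update rule, not the formulation claimed in the application. The `rate` and `iterations` parameters and the sample scores are hypothetical.

```python
def refine_scores(scores, boundary_prob, rate=0.25, iterations=10):
    """Smooth per-page category scores along the page sequence.

    scores[i][c]     : initial score of category c for page i
    boundary_prob[i] : probability of a document boundary between page i and i+1
    Assumption: conductance between neighbors is (1 - boundary probability).
    """
    n = len(scores)
    current = [list(row) for row in scores]
    for _ in range(iterations):
        updated = [list(row) for row in current]
        for i in range(n):
            for c in range(len(current[i])):
                flow = 0.0
                if i > 0:
                    flow += (1.0 - boundary_prob[i - 1]) * (current[i - 1][c] - current[i][c])
                if i < n - 1:
                    flow += (1.0 - boundary_prob[i]) * (current[i + 1][c] - current[i][c])
                updated[i][c] = current[i][c] + rate * flow
        current = updated
    return current


# Hypothetical input: pages 0-2 form one document, page 3 starts a new one.
initial = [[0.9, 0.1], [0.4, 0.6], [0.9, 0.1], [0.1, 0.9]]
refined = refine_scores(initial, boundary_prob=[0.0, 0.0, 1.0])
```

In this example the ambiguous second page is pulled toward the category of its neighbors, while the last page, separated by a certain boundary, keeps its original scores.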
  • Publication number: 20130339361
    Abstract: The present invention is directed to a computer implemented system for organizing electronic content data chronologically. In operation, application software, that is preferably hosted on a remote server, organizes the electronic communications data. The electronic communications data is transmitted from multiple sources or users associated with relationship identities or identifiers. The application software organizes the electronic communications data chronologically onto interactive and displayable time-lines. The application software also preferably organizes the electronic communications data into user accessible sub-files along the time-lines based on dates of transmission. The application software runs electronic communications applications, such as e-mail or social network applications, directly or alternatively interfaces with external electronic communications applications to generate the time-lines.
    Type: Application
    Filed: June 14, 2012
    Publication date: December 19, 2013
    Inventors: Chris Trahan, Ryan Trahan, David Olszewski
  • Patent number: 8515957
    Abstract: A system and method for providing reference documents as a suggestion for classifying uncoded documents is provided. A reference set of electronically stored information items, each associated with a classification code, is designated. Clusters of uncoded electronically stored information items are designated. One or more of the uncoded electronically stored information items from at least one cluster is compared to the reference set. At least one of the electronically stored information items in the reference set is identified as similar to the one or more uncoded electronically stored information items. The similar electronically stored information items are injected into the at least one cluster. Relationships are visually depicted between the uncoded electronically stored information items and the similar electronically stored information items in the at least one cluster as suggestions for classifying the uncoded electronically stored information items.
    Type: Grant
    Filed: July 9, 2010
    Date of Patent: August 20, 2013
    Assignee: FTI Consulting, Inc.
    Inventors: William C. Knight, Nicholas I. Nussbaum
  • Publication number: 20130066875
    Abstract: Domains supported by websites accessible to mobile network users over the Internet are classified into pre-defined categories based on domain content. A network intelligence solution (NIS) taps a stream of IP (Internet Protocol) packets traversing a node in the network between mobile equipment employed by network users and remote web servers. The NIS performs deep packet inspection to aggregate Internet usage so that a distribution of frequency of access by the network users to each of the classified domains may be calculated. Clusters encompassing one or more of the categories are specified based, at least in part, on the access frequency distribution. Each network user is assigned to one or more clusters based at least on observations of the user's frequency of access to the classified domains. Clusters are specified to meet a target homogeneity of access frequency for each encompassed category and further to meet a target heterogeneity across clusters.
    Type: Application
    Filed: September 12, 2011
    Publication date: March 14, 2013
    Inventors: Jacques Combet, Gerard Hermet
  • Publication number: 20130066874
    Abstract: A system for performing data classification operations. In one embodiment, the system comprises a file system configured to store a plurality of computer files and a scanning agent configured to traverse the file system and compile data regarding the attributes and content of the plurality of computer files. The system also comprises an index configured to store the data regarding attributes and content of the plurality of computer files and a file classifier configured to analyze the data regarding the attributes and content of the plurality of computer files and to classify the plurality of computer files into one or more categories based on the data regarding the attributes and content of the plurality of computer files.
    Type: Application
    Filed: September 13, 2012
    Publication date: March 14, 2013
    Applicant: COMMVAULT SYSTEMS, INC.
    Inventor: Norman R. Lunde
  • Publication number: 20130046764
    Abstract: Mechanisms for correlating reported problem data from a plurality of sources of information are provided. A report of a problem in a computer system is received to thereby generate a reported problem in a problem management system. Data is collected from a plurality of sources of information in accordance with data collection rules. Content classification is performed on the collected data to classify the collected data into pre-determined classes of collected data in accordance with classification rules. Correlation of the classified data into sets of correlated data in accordance with correlation rules is performed. Each set of correlated data corresponds to a different reported problem in the problem management system. A representation of the reported problem in the problem management system is updated based on a set of correlated data corresponding to the reported problem and classifications of data within the set of correlated data.
    Type: Application
    Filed: August 17, 2011
    Publication date: February 21, 2013
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Christopher Y. Choi, Neil I. Readshaw
  • Publication number: 20130031099
    Abstract: Embodiments of the invention may provide an approach for managing electronic records in a content management system. The content management system may use a container structure to create, edit, and manage electronic records in a coordinated way. The container structure may include a master container and a plurality of sub-containers. An associated method generally may include receiving a request comprising a record and one or more properties associated with the record; determining, from the one or more properties associated with the record, a date; based on the date of the record, associating the record with a sub-container of the container structure; and managing disposition of the sub-container based on an associated policy.
    Type: Application
    Filed: September 14, 2012
    Publication date: January 31, 2013
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventor: Jean-Marc Costecalde
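
As a rough illustration of the container structure described above, the sketch below files each record into a sub-container keyed by the year and month of a date property, then disposes of whole sub-containers older than a retention period. The monthly granularity, the `created` property name, and the retention rule are assumptions made for the example.

```python
from datetime import date


def sub_container_key(properties, date_property="created"):
    """Derive a sub-container key (YYYY-MM) from a record's date property."""
    d = properties[date_property]
    return f"{d.year:04d}-{d.month:02d}"


def file_record(master, record, properties):
    """Place the record in the matching sub-container of the master container."""
    key = sub_container_key(properties)
    master.setdefault(key, []).append(record)
    return key


def dispose_expired(master, today, retention_years):
    """Remove every sub-container older than the retention cutoff."""
    cutoff = (today.year - retention_years, today.month)
    expired = [k for k in master if tuple(map(int, k.split("-"))) < cutoff]
    for key in expired:
        del master[key]
    return sorted(expired)


# Hypothetical usage: two records, two-year retention policy.
master = {}
file_record(master, "r1", {"created": date(2010, 3, 5)})
file_record(master, "r2", {"created": date(2012, 9, 1)})
removed = dispose_expired(master, today=date(2013, 1, 31), retention_years=2)
```

Disposing of a sub-container as a unit means the policy is applied once per date bucket rather than once per record.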
  • Publication number: 20130018888
    Abstract: Various embodiments of systems and methods for analyzing software-usage information are described herein. Traffic numbers are obtained from data stored in a database using measurement objects that are associated with one or more keys. The measurement objects output the traffic numbers and the one or more keys are related to elements of the data. Identifiers and categories are assigned to the measurement objects. The categories represent attributes of a software product. A data structure comprising the identifiers, the traffic numbers, and the categories is generated and stored. The stored data structure and a header comprising one or more fields are used to generate a report.
    Type: Application
    Filed: July 12, 2011
    Publication date: January 17, 2013
    Inventor: PETER JOHN
  • Publication number: 20120323922
    Abstract: An approach is presented for specifying categories of data elements during a service specification phase of a service-oriented architecture (SOA) life cycle defined in a service modeling methodology like Service-Oriented Modeling and Architecture (SOMA). A Unified Modeling Language based SOA modeling tool for the service modeling methodology includes a middleware based integration plug-in that categorizes service-specific data elements as transaction elements, glue elements, core Common Information Model (CIM) elements, and elements extending the CIM elements, and associates the categorized data elements with corresponding operations of the service being modeled.
    Type: Application
    Filed: August 24, 2012
    Publication date: December 20, 2012
    Applicant: International Business Machines Corporation
    Inventors: Faried Abrahams, Ali P. Arsanjani, Kerard R. Hogg, Ahamed Jalaldeen, Siddharth Purohit, Gandhi Sivakuma
  • Publication number: 20120323920
    Abstract: A method for creating a semantically aggregated index in an indexer-agnostic index building system includes: extracting documents from a data source, each document including a data object; distributing the documents to a plurality of processing nodes within the system; for each node: indexing the data objects for each document into fields using semantic rules; and grouping indexed data objects for related fields by: classifying the documents into logical groups based on the semantic rules; and creating a searchable index shard for related logical groups.
    Type: Application
    Filed: August 24, 2012
    Publication date: December 20, 2012
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Alfredo Alba, Chad E. DeLuca, Vuk Ercegovac, Thomas D. Griffin, Jun Rao, Asim V. Singh, Kevin B. Wang
  • Publication number: 20120323921
    Abstract: A plurality of items included in a catalog may be obtained, each item associated with an item category. Brand indicators may be obtained, each brand indicator associated with the item category. Brand indicators associated with each of the items may be determined, and the each item may be assigned to a partition group associated with the brand indicator that is associated with the each item. Correlated string tokens that are correlated, greater than a predetermined correlation threshold value, with the brand indicator associated with the partition group that is associated with the each one of the items, the correlated string tokens associated with the each one of the plurality of items, may be determined. A dictionary hierarchy may be generated based on the one or more correlated string tokens.
    Type: Application
    Filed: June 15, 2011
    Publication date: December 20, 2012
    Applicant: MICROSOFT CORPORATION
    Inventors: Zhimin Chen, Eduardo Laureano, Renfei Luo, Tsheko Mutungu, Vivek Narasayya, David Talby
  • Publication number: 20120310941
    Abstract: A web mapping system and method are described. The web map system receives a content pointer and provides a category result associated with the content pointer. The category result is determined by successively selecting and applying one of a plurality of categorization algorithms that each attempt to provide a category result for a URL based on a plurality of rules. If no category result is determined, the content pointer may be passed to a categorization manager to generate a rule for the content pointer so that subsequent categorization requests for the content pointer will result in a category result.
    Type: Application
    Filed: June 2, 2011
    Publication date: December 6, 2012
    Applicant: KINDSIGHT, INC.
    Inventors: Roderick William MacDonald, Hao Tang, Haijun Cao, Kumaran Sangareddi
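
The successive application of categorization algorithms described above can be sketched as a fallback chain. In this minimal example, each algorithm is a function that returns a category or `None`, and uncategorized URLs are queued for a later rule to be written. The two algorithms, the rule table, and the `pending` queue are hypothetical stand-ins for the components named in the abstract.

```python
from urllib.parse import urlparse


def by_exact_rule(url, rules):
    """Look up the full URL in the rule table."""
    return rules.get(url)


def by_host_rule(url, rules):
    """Fall back to a rule keyed by the URL's host name."""
    return rules.get(urlparse(url).netloc)


def categorize(url, rules, algorithms, pending):
    """Try each algorithm in order; queue the URL if none yields a category."""
    for algorithm in algorithms:
        category = algorithm(url, rules)
        if category is not None:
            return category
    pending.append(url)  # assumption: a manager later creates a rule for it
    return None


# Hypothetical rule table and requests.
rules = {"news.example.com": "news"}
pending = []
chain = [by_exact_rule, by_host_rule]
first = categorize("http://news.example.com/a", rules, chain, pending)
second = categorize("http://unknown.example.org/", rules, chain, pending)
```

Here the first request is resolved by the host rule, and the second is left uncategorized and queued.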
  • Publication number: 20120278331
    Abstract: User selections entered in the media application or any user input device behavior with user devices may be recorded as clickstream data. The clickstream data may be used to deduce information about the user or a media item being consumed. Users may be grouped based on their respective clickstream activity during the consumption of one or more media items. Based on the user grouping, information may be derived about a media item being consumed by a user of the group.
    Type: Application
    Filed: April 28, 2011
    Publication date: November 1, 2012
    Inventors: Ray Campbell, Walter R. Klappert, Paul George Milazzo
  • Publication number: 20120278330
    Abstract: User selections entered in the media application or any user input device behavior with user devices may be recorded as clickstream data. The clickstream data may be used to deduce information about the user or a media item being consumed. The clickstream data may be used to define a plurality of behavior patterns. Portions of the media item may be identified based on the user's interaction and/or behavior with the user devices.
    Type: Application
    Filed: April 28, 2011
    Publication date: November 1, 2012
    Inventors: Ray Campbell, Walter R. Klappert, Paul George Milazzo
  • Publication number: 20120259856
    Abstract: A Website may be automatically categorized by (a) accepting Website information, (b) determining a set of scored clusters (e.g., semantic, term co-occurrence, etc.) for the Website using the Website information, and (c) determining at least one category (e.g., a vertical category) of a predefined taxonomy using at least some of the set of clusters.
    Type: Application
    Filed: June 20, 2012
    Publication date: October 11, 2012
    Inventors: David GEHRKING, Ching LAW, Andrew MAXWELL
  • Publication number: 20120254186
    Abstract: An approach is provided for rendering categorized location-based results. The approach involves determining a distribution of one or more entities over a geographical area. The approach further involves receiving an input for specifying one or more categories of the one or more entities. The approach also involves processing and/or facilitating a processing of the distribution to generate one or more clusters of the one or more entities with respect to the one or more categories. The approach further involves determining one or more geographical locations associated with the one or more clusters, the one or more entities, or a combination thereof. The approach also involves causing, at least in part, rendering of one or more graphical presentations of the one or more geographical locations, the one or more entities, or a combination thereof based, at least in part, on the one or more clusters.
    Type: Application
    Filed: May 2, 2011
    Publication date: October 4, 2012
    Applicant: Nokia Corporation
    Inventors: Caitlin Winner, Matthew Simon Biddulph, Felix Petersen
  • Publication number: 20120233169
    Abstract: An alert notification distribution tool is disclosed. In particular embodiments, a method includes receiving raw data from a first data source in a first format and converting the raw data to conditioned data. The method also includes selecting, based on user input, a first category of a plurality of categories included in the conditioned data. The method also includes selecting, based on user input, one or more values from a plurality of values associated with the selected first category and generating a distribution group based on the selected first category and the selected one or more values associated with the selected first category, the distribution group including one or more contact addresses.
    Type: Application
    Filed: March 8, 2011
    Publication date: September 13, 2012
    Applicant: Bank of America Corporation
    Inventors: Daniel S. Small, Thomas M. Keifer
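    The grouping step in the abstract above (select a category, select values for it, collect contact addresses) might be sketched as follows. This is only an illustration: the record layout, the `email` field, and the function name `build_distribution_group` are hypothetical and do not come from the publication.

```python
def build_distribution_group(records, category, values):
    """Collect contact addresses of records whose `category` field matches one of `values`.

    `records` is a hypothetical list of dicts standing in for the conditioned data.
    """
    wanted = set(values)
    group = []
    for record in records:
        # Keep each address once, in first-seen order.
        if record.get(category) in wanted and record["email"] not in group:
            group.append(record["email"])
    return group


# Hypothetical conditioned data with two selectable categories: "region" and "role".
records = [
    {"region": "east", "role": "manager", "email": "a@example.com"},
    {"region": "west", "role": "analyst", "email": "b@example.com"},
    {"region": "east", "role": "analyst", "email": "c@example.com"},
]

group = build_distribution_group(records, "region", ["east"])
print(group)
```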
  • Publication number: 20120209839
    Abstract: The technology provides content about a user to a currently executing instance of an application which uses the provided content to personalize and make its processing contextually relevant for a user. When the application instance is launched, a message requesting data related to categories for a user is sent to a context relevant, content aggregation and distribution system. The service executes within a cloud computing system, and provides the application instance with content derived from sources like other applications and data stored on devices with which the application instance does not or cannot communicate. The service gathers content from many different types of online resources such as e-mail, social networking sites, websites, and other data accessible over communication networks with different communication protocols.
    Type: Application
    Filed: February 15, 2011
    Publication date: August 16, 2012
    Applicant: MICROSOFT CORPORATION
    Inventors: Anton O. A. Andrews, Ryan Powell, Andre O. M. Mohr, Jae P. Park
  • Publication number: 20120203752
    Abstract: A classification method includes constructing queries from category descriptors representing categories of a taxonomy of hierarchically organized categories. The query constructed for a category c includes a query component based on descriptors of the category c and at least one query component based on descriptors of an ancestor or descendant category of the category c. A documents database is queried using the constructed queries to retrieve pseudo-relevant documents. Language models for the categories of the taxonomy are extracted from the pseudo-relevant documents by inferring a hierarchical topic model representing the taxonomy. An input document is classified by optimizing mixture weights of a weighted combination of categories of the hierarchical topic model respective to the input document.
    Type: Application
    Filed: February 8, 2011
    Publication date: August 9, 2012
    Applicant: XEROX CORPORATION
    Inventors: Viet Ha-Thuc, Jean-Michel Renders
  • Publication number: 20120179682
    Abstract: Conventionally, it has been impossible to appropriately acquire word pairs having a prescribed relationship. Such word pairs can be appropriately acquired with a word pair acquisition apparatus including: a word class information storage unit in which word class information can be stored; a class pair favorableness degree storage unit in which a class pair favorableness can be stored; a seed pattern storage unit in which can be stored one or more seed patterns; a word pair acquisition unit that acquires one or more word pairs co-occurring with the seed pattern from sentence groups; a class pair favorableness degree acquisition unit that acquires a class pair favorableness degree; a score determination unit that uses the class pair favorableness degree to determine a score of each of the word pairs; and a word pair selection unit that acquires one or more word pairs having a high score.
    Type: Application
    Filed: September 7, 2010
    Publication date: July 12, 2012
    Inventors: Stijn De Saeger, Kentaro Torisawa, Junichi Kazama, Kow Kuroda, Masaki Murata
  • Publication number: 20120166441
    Abstract: Techniques for determining a set of keywords associated with a document are provided. A document is received that may be classified into a taxonomy that includes a plurality of categories. A categorization ranking is determined for each category for the received document. A set of categories of the taxonomy having highest categorization rankings is determined for the received document. Documents representing the set of categories having highest categorization rankings are combined together into a cumulative representative text that includes a plurality of terms. A cumulative term corpus importance score is determined for each term in the cumulative representative text. The cumulative term corpus importance score for a particular term indicates an importance of the particular term in a context of the cumulative representative text. A set of terms of the cumulative representative text having highest cumulative term corpus importance scores is selected to be keywords for the received document.
    Type: Application
    Filed: December 23, 2010
    Publication date: June 28, 2012
    Applicant: MICROSOFT CORPORATION
    Inventors: Ron Karidi, Liat Segal, Oded Elyada, Rotem Bennett
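    The pipeline in the abstract above (rank categories, merge the representative text of the top categories, score terms, keep the best) might look roughly like the sketch below. The abstract does not say how the rankings or importance scores are computed, so the Jaccard overlap used for ranking and the frequency-times-log-inverse-frequency score are assumptions, as are all function names and the sample data.

```python
import math
from collections import Counter


def tokens(text):
    return text.lower().split()


def rank_categories(doc, category_docs):
    """Rank categories by Jaccard overlap with the document (an assumed ranking)."""
    doc_terms = set(tokens(doc))
    scores = {}
    for name, text in category_docs.items():
        cat_terms = set(tokens(text))
        scores[name] = len(doc_terms & cat_terms) / len(doc_terms | cat_terms)
    return sorted(scores, key=scores.get, reverse=True)


def extract_keywords(doc, category_docs, top_categories=2, top_terms=3):
    # Merge the representative text of the best-ranked categories.
    best = rank_categories(doc, category_docs)[:top_categories]
    cumulative = Counter()
    for name in best:
        cumulative.update(tokens(category_docs[name]))
    # Count how many categories mention each term.
    n = len(category_docs)
    df = Counter()
    for text in category_docs.values():
        df.update(set(tokens(text)))
    # Assumed importance score: frequency in the merged text, damped for common terms.
    scores = {t: c * math.log((1 + n) / df[t]) for t, c in cumulative.items()}
    ranked = sorted(scores, key=lambda t: (-scores[t], t))
    return ranked[:top_terms]


# Hypothetical taxonomy: one representative text per category.
category_docs = {
    "sports": "football match goal team football league",
    "finance": "bank loan interest bank market",
    "cooking": "recipe oven flour recipe bake",
}
doc = "the team scored a goal in the football match"
print(extract_keywords(doc, category_docs, top_categories=1, top_terms=3))
```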
  • Publication number: 20120150860
    Abstract: Data-mining software initializes a loop by (a) assigning the messages in an online discussion on a website to a single logical cluster that is one of a group of logical clusters and (b) determining a measure of similarity-adjusted entropy for the group of logical clusters. Then the data-mining software randomly reassigns one of the messages to another logical cluster that is randomly selected from the group of logical clusters and again determines a measure of the similarity-adjusted entropy for the group of logical clusters. If the subsequent measure of similarity-adjusted entropy is greater than or equal to the earlier measure of similarity-adjusted entropy, the data-mining software undoes the reassignment. If the subsequent measure of similarity-adjusted entropy is less than the earlier measure of similarity-adjusted entropy, the data-mining software replaces the earlier measure with the subsequent measure. The data-mining software repeats the operations of the loop until convergence is reached.
    Type: Application
    Filed: December 10, 2010
    Publication date: June 14, 2012
    Applicant: Yahoo!, Inc.
    Inventors: Narayan L. Bhamidipati, Nagaraj Kota
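    The accept-if-lower reassignment loop in the abstract above might be sketched as follows. The abstract does not define similarity-adjusted entropy, so `adjusted_entropy` below is a hypothetical stand-in (cluster-size entropy plus average within-cluster dissimilarity). The one-dimensional "messages", the similarity function, and the fixed step count used in place of a convergence test are also assumptions.

```python
import math
import random


def similarity(a, b):
    # Assumed similarity between two messages, here plain numbers.
    return math.exp(-abs(a - b))


def adjusted_entropy(points, labels, k, weight=0.5):
    """Hypothetical stand-in for the publication's similarity-adjusted entropy."""
    n = len(points)
    sizes = [labels.count(c) for c in range(k)]
    entropy = -sum((s / n) * math.log(s / n) for s in sizes if s)
    spread = 0.0
    for i, p in enumerate(points):
        members = [q for q, lab in zip(points, labels) if lab == labels[i]]
        spread += 1.0 - sum(similarity(p, q) for q in members) / len(members)
    return weight * entropy + spread / n


def cluster(points, k, steps=500, seed=0):
    rng = random.Random(seed)
    labels = [0] * len(points)          # (a) start with every message in one cluster
    best = adjusted_entropy(points, labels, k)   # (b) initial measure
    history = [best]
    for _ in range(steps):              # fixed step count stands in for "until convergence"
        i = rng.randrange(len(points))
        old = labels[i]
        labels[i] = rng.randrange(k)    # random reassignment
        score = adjusted_entropy(points, labels, k)
        if score >= best:
            labels[i] = old             # not an improvement: undo
        else:
            best = score                # improvement: keep it and update the measure
        history.append(best)
    return labels, history


points = [0.0, 0.1, 0.2, 5.0, 5.1, 5.2]
labels, history = cluster(points, k=2)
print(labels, round(history[0], 3), round(history[-1], 3))
```

    Because a reassignment is kept only when the measure strictly decreases, the recorded measure never increases from one iteration to the next.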
  • Publication number: 20120130771
    Abstract: Chat categorization uses semi-supervised clustering to provide Voice of the Customer (VOC) analytics over unstructured data via an historical understanding of topic categories discussed to derive an automated methodology of topic categorization for new data; application of semi-supervised clustering (SSC) for VOC analytics; generation of seed data for SSC; and a voting algorithm for use in the absence of domain knowledge/manual tagged data. Customer service interactions are mined and quality of these interactions is measured by “Customer's Vote” which, in turn, is determined by the customer's experience during the interaction and the quality of customer issue resolution. Key features of the interaction that drive a positive experience and resolution are automatically learned via machine learning driven algorithms based on historical data. This, in turn, is used to coach/teach the system/service representative on future interactions.
    Type: Application
    Filed: June 15, 2011
    Publication date: May 24, 2012
    Inventors: Pallipuram V. Kannan, Ravi Vijayaraghavan, Rajkumar Dan, Harsh Singhal, Manish Gupta
  • Publication number: 20120101870
    Abstract: A method for evaluating data includes defining a list of data categories, determining a relative sensitivity to each data category, determining one or more classifiers for each of the data categories, receiving a plurality of data items to be valued, determining one of the data categories for each of said plurality of data items according to the one or more classifiers, and determining a respective sensitivity for each of said plurality of data items.
    Type: Application
    Filed: October 22, 2010
    Publication date: April 26, 2012
    Applicant: International Business Machines Corporation
    Inventors: Stephen Carl Gates, Youngja Park, Josyula R. Rao, Wilfried Teiken
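    The steps in the abstract above (categories, a relative sensitivity per category, classifiers per category, then a sensitivity per data item) might be sketched as below. The category names, the sensitivity values, and the regular-expression classifiers are all hypothetical; the publication does not specify them.

```python
import re

# Hypothetical categories with assumed relative sensitivities.
SENSITIVITY = {"credentials": 1.0, "financial": 0.7, "general": 0.1}

# Hypothetical classifiers: one or more patterns per category.
CLASSIFIERS = {
    "credentials": [re.compile(r"password|api key", re.IGNORECASE)],
    "financial": [re.compile(r"invoice|account number", re.IGNORECASE)],
}


def categorize(item):
    """Return the first category whose classifier matches, else a default category."""
    for category, patterns in CLASSIFIERS.items():
        if any(p.search(item) for p in patterns):
            return category
    return "general"


def value_items(items):
    """Pair each data item with its category and that category's sensitivity."""
    return [(item, categorize(item), SENSITIVITY[categorize(item)]) for item in items]


results = value_items(["Reset password for admin", "Invoice 2011-04", "Lunch menu"])
for item, category, score in results:
    print(item, category, score)
```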
  • Publication number: 20120084289
    Abstract: Methods and systems are described for a geographically defined platform. In one embodiment, a block is divided into one or more partitioned blocks comprising geographically proximate street addresses. Residents whose street addresses are located within the same partitioned block may contribute and view resident-generated content through a spatial platform. Further, contiguous blocks may elect to combine with each other and a partitioned block may elect to separate from the larger block that comprises it.
    Type: Application
    Filed: September 29, 2011
    Publication date: April 5, 2012
    Applicant: rBlock, Inc.
    Inventor: Vivek A. Hutheesing
  • Publication number: 20120078913
    Abstract: A system and method for matching one or more source schemas with one or more target schemas is provided. The matching between source and target schemas is performed by gathering inputs pertaining to the source and target schemas, wherein the inputs comprise a set of details in a predefined format. Thereafter, the gathered inputs are processed by comparing the source schemas with the target schemas. The processing is performed to identify a set of matches between the source and target schemas based on the linguistic similarity, structural similarity and functional similarity and relationship between the source and target schemas. Subsequently, the identified matches are stored.
    Type: Application
    Filed: November 22, 2010
    Publication date: March 29, 2012
    Applicant: INFOSYS TECHNOLOGIES LIMITED
    Inventors: Durga Prasad Muni, Krupa Benhur Gadde, Srikumar Krishnamoorthy
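    A simplified version of the matching described in the abstract above might combine a name-based (linguistic) score with a type-based (structural) score, as in the sketch below. It assumes each schema is a flat mapping from field name to type, uses `difflib.SequenceMatcher` as a stand-in for linguistic similarity, and leaves out functional similarity entirely. The weights, the threshold, and the function names are hypothetical.

```python
from difflib import SequenceMatcher


def linguistic_similarity(a, b):
    # Character-level similarity of the two field names, between 0 and 1.
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def structural_similarity(src_type, tgt_type):
    # Crude structural check: identical declared types score 1, anything else 0.
    return 1.0 if src_type == tgt_type else 0.0


def match_schemas(source, target, threshold=0.6, name_weight=0.7):
    """Return (source field, best target field, score) for pairs above the threshold."""
    matches = []
    for s_name, s_type in source.items():
        best_name, best_score = None, 0.0
        for t_name, t_type in target.items():
            score = (name_weight * linguistic_similarity(s_name, t_name)
                     + (1 - name_weight) * structural_similarity(s_type, t_type))
            if score > best_score:
                best_name, best_score = t_name, score
        if best_score >= threshold:
            matches.append((s_name, best_name, round(best_score, 2)))
    return matches


# Hypothetical source and target schemas.
source = {"cust_name": "string", "cust_id": "int"}
target = {"customer_name": "string", "customer_id": "int", "created": "date"}
matches = match_schemas(source, target)
print(matches)
```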
  • Publication number: 20120054189
    Abstract: Systems, methods, and computer program products are provided for presenting content. An example computer implemented method includes identifying, by a data exchange engine executing on one or more processors, one or more user lists based on owned or permissioned data, each user list including a unique identifier; associating metadata with each user list including data describing a category for the user list, population data describing statistical or inferred data concerning a list or members in a given user list and subscription data including data concerning use of a given user list; storing in a searchable database a user list identifier and the associated metadata; and publishing for potential subscribers a list of the user lists including providing an interface that includes for each user list the unique identifier and the associated metadata.
    Type: Application
    Filed: August 30, 2011
    Publication date: March 1, 2012
    Applicant: GOOGLE INC.
    Inventors: Rajas Moonka, Anurag Agarwal, Oren E. Zamir
  • Publication number: 20120054658
    Abstract: A system and method for digital object categorization or retrieval are provided. The method includes providing for a selector to be graphically presented to a user. The selector is variably adjustable within a range, by the user, to adjust a level of noise in at least one of digital object categorization and digital object retrieval. The range of the selector is normalized over a set of digital object categories based on scores output by a trained categorizer for each of a set of labeled test objects for each of the set of digital object categories and a label for each of the test objects. Selector position information is received by the system and at least one of digital object categorization and digital object retrieval is performed, based on the selector position information. One or more of the method steps may be performed with a computer processor.
    Type: Application
    Filed: August 30, 2010
    Publication date: March 1, 2012
    Applicant: Xerox Corporation
    Inventors: Mathieu Chuat, Vincent Devin
  • Publication number: 20120054186
    Abstract: Methods and arrangements for employing descriptors for agent-customer interactions are disclosed. Filtering the pooled records based on one or more predetermined criteria is done such that analyzing the filtered records and comparing one interaction between an agent and a customer with another interaction between an agent and a customer may occur.
    Type: Application
    Filed: August 25, 2010
    Publication date: March 1, 2012
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Manish Anand Bhide, Om Dadaji Deshmukh, Ashish Verma
  • Patent number: 8122005
    Abstract: A training set generator may be configured to input a taxonomy including a hierarchy of categories and a plurality of top-level sites, and to output a training set of categorized data. The training set generator may include a crawler configured to crawl each of the top-level sites to determine at least one lower-level site associated therewith and to store the top-level sites and associated lower-level sites as crawl data. The training set generator also may include an extractor configured to determine, for each of the top-level sites, a corresponding site-specific extraction template associating at least one portion of the corresponding top-level site with at least one category of the hierarchy of categories, and further configured to apply each site-specific extraction template to corresponding crawl data to thereby associate the crawl data with the categories of the hierarchical categories and obtain categorized data of the training set.
    Type: Grant
    Filed: October 22, 2009
    Date of Patent: February 21, 2012
    Assignee: Google Inc.
    Inventors: Philo Juang, Christopher Testa, Nicolaus Mote
  • Publication number: 20120030207
    Abstract: A mobile communication terminal and a method of processing content thereof are disclosed. The terminal and method analyze inputted commands to generate condition data used to automatically execute the inputted commands when the condition for the command occurs. The mobile communication terminal includes a pre-work setting unit to set a combination condition which includes a search condition, a predetermined-operation identification information, and source data; a condition variation monitoring unit to monitor whether an actual condition satisfies the search condition, and a post-processing management unit to process the source data, if the actual condition satisfies the search condition, and to execute a predetermined operation according to the predetermined-operation identification information, if the actual condition satisfies the search condition.
    Type: Application
    Filed: June 30, 2011
    Publication date: February 2, 2012
    Applicant: PANTECH CO., LTD.
    Inventor: Sung Jin KIM
  • Publication number: 20120016880
    Abstract: A method and system for specifying categories of data elements during a service specification phase of a service-oriented architecture (SOA) life cycle defined in a service modeling methodology like Service-Oriented Modeling and Architecture (SOMA). A Unified Modeling Language based SOA modeling tool for SOMA methodology includes a middleware based integration plug-in that categorizes retrieved service-specific data elements as transaction elements, optional controller elements, glue elements, optional extension patterns, extension elements and core Common Information Model entities, and associates the categorized data elements with corresponding operations of the service being modeled. A user interface provided by the plug-in enables input of the data elements into the categories and input of the associations between the categorized data elements and corresponding operations of the service being modeled.
    Type: Application
    Filed: July 13, 2010
    Publication date: January 19, 2012
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Faried Abrahams, Ali P. Arsanjani, Kerard R. Hogg, Ahamed Jalaldeen, Siddharth Purohit, Gandhi Sivakumar