Clustering Or Classification (EPO) Patents (Class 707/E17.089)
-
Publication number: 20120109974
Abstract: Disclosed are a system and a computer-implemented method for extracting an acronym and one or more corresponding expansions of the acronym from a document represented in a markup language. The computer-implemented method comprises: identifying at least one acronym contained in the document; determining one or more expansions of the at least one identified acronym based on a portion of the document located proximate the identified acronym; determining a ranking for each determined expansion based on attributes of the document; and selecting one or more expansions for an identified acronym using the determined rankings.
Type: Application
Filed: July 16, 2009
Publication date: May 3, 2012
Inventors: Shi-Cong Feng, Yuhong Xiong, Wei Liu
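The steps named in this abstract (identify acronyms, look for expansions near each one, rank them, select) can be illustrated with a minimal Python sketch. It is not the patented method: the function names, the initial-letter matching rule, and the use of word distance as the only ranking signal are all assumptions made for illustration, and markup handling is left out.

```python
import re

def find_acronyms(text):
    """Return distinct all-caps tokens (2 to 6 letters) in order of first appearance."""
    seen = []
    for match in re.finditer(r"\b[A-Z]{2,6}\b", text):
        if match.group() not in seen:
            seen.append(match.group())
    return seen

def candidate_expansions(text, acronym, window=8):
    """Collect word runs near each occurrence of the acronym whose initials spell it.

    Assumption: an expansion has exactly one word per acronym letter.
    Returns (expansion, distance_in_words) pairs.
    """
    words = re.findall(r"[A-Za-z]+", text)
    n = len(acronym)
    candidates = []
    for i, word in enumerate(words):
        if word != acronym:
            continue
        lo = max(0, i - window)
        hi = min(len(words), i + window + 1)
        for start in range(lo, hi - n + 1):
            run = words[start:start + n]
            if acronym in run:
                continue  # skip runs that contain the acronym itself
            if "".join(w[0].upper() for w in run) == acronym:
                distance = min(abs(start - i), abs(start + n - 1 - i))
                candidates.append((" ".join(run), distance))
    return candidates

def rank_expansions(candidates):
    """Rank expansions by closest distance to the acronym (assumed ranking signal)."""
    best = {}
    for expansion, distance in candidates:
        best[expansion] = min(distance, best.get(expansion, distance))
    return sorted(best, key=lambda e: (best[e], e))

text = ("An application programming interface (API) exposes functions. "
        "The API is stable.")
acronyms = find_acronyms(text)
ranked = rank_expansions(candidate_expansions(text, "API"))
```

A fuller version would presumably weigh document attributes such as markup tags or parentheses when ranking; the sketch uses proximity only because it is the simplest stand-in.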
-
Publication number: 20120109964Abstract: A method of classifying a set of semantic concepts on a second multimedia collection based upon adapting a set of semantic concept classifiers and updating concept affinity relations that were developed to classify the set of semantic concepts for a first multimedia collection. The method comprises providing the second multimedia collection from a different domain and a processor automatically classifying the semantic concepts from the second multimedia collection by adapting the semantic concept classifiers and updating the concept affinity relations to the second multimedia collection based upon the local smoothness over the concept affinity relations and the local smoothness over data affinity relations.Type: ApplicationFiled: October 27, 2010Publication date: May 3, 2012Inventors: Wei Jiang, Alexander C. Loui
-
Publication number: 20120109965
Abstract: The present invention relates generally to a system for automatic semantic-based mining that enables web mining for populating semantic artifact data to be carried out with minimal user interaction.
Type: Application
Filed: March 23, 2010
Publication date: May 3, 2012
Applicant: Mimos Derhad
Inventors: A/L Perumal Nagendran, Yuan Kai Chow, Yusrin Amruddin Amru
-
Publication number: 20120102104
Abstract: A method, apparatus, and computer-readable medium are provided for matching items of user-generated content to entities. Items of user-generated content, such as status updates, are gathered. For each of the items, a machine determines a degree to which the item is associated with an entity. In one aspect, items are matched to an entity by matching the content of the items to attributes of the entity. In another aspect, items are matched to an entity by predicting attributes of an author of the items and determining a distance between the predicted attributes of the author and the attributes of the entity. The distance may be a physical distance between locations of the entity and user or a contextual distance between categories for the entity and posts by the author. Items matched to the entity may be displayed on an interface concurrently with information about the entity.
Type: Application
Filed: October 21, 2010
Publication date: April 26, 2012
Inventors: Vinay Kakade, Bo Pang, Nilesh Dalvi, Shanmugasundaram Ravikumar
-
Publication number: 20120102032Abstract: Computer-implemented methods for mapping an element of a source information model to an element of a target information model, forming a cluster of elements for mapping across information models, and evaluating a mapping of elements across information models, and a system and computer program product thereof. The method of mapping an element of a source information model to an element of a target information model includes: receiving information for mapping a first element in a source cluster to an element in the target information model; mapping the first element to the target element using the received information for mapping the first element to the target element; and mapping all other elements in the source cluster to the target element.Type: ApplicationFiled: October 21, 2010Publication date: April 26, 2012Applicant: International Business Machines CorporationInventors: Brian Byrne, Songyun Duan, Achille Fokoue-Nkoutche, Brendan O'Sullivan, Kavitha Srinivas
-
Publication number: 20120102035Abstract: In an embodiment, a data embedding method may be provided. The data embedding method may include inputting data to be encoded and data to be embedded; grouping the data to be encoded into a first set and a second set, based on an entropy of the data to be encoded; and embedding the data to be embedded into the data to be encoded by replacing a pre-determined part of the second set with the data to be encoded so that the first set remains free of data to be embedded.Type: ApplicationFiled: March 25, 2010Publication date: April 26, 2012Inventors: Te Li, Susanto Rahardja, Haiyan Shu, Ti Eu Chan, Haibin Huang
-
Publication number: 20120102037Abstract: In one general aspect, a set of representations of message thread contents is decomposed into clusters of representations of message thread contents determined to be similar. Similarly, a set of representations of message thread titles is decomposed into clusters of representations of message thread titles determined to be similar, where the act of decomposing the set of representations of message thread titles is influenced by the act of decomposing the set of representations of message thread contents. In another general aspect, a search query is received and compared to representations of clusters of message threads (e.g., a cluster of representations of message thread titles). Based on this comparison, a particular cluster of message threads then is identified as matching the search query.Type: ApplicationFiled: October 26, 2010Publication date: April 26, 2012Inventor: Mehmet Kivanc Ozonat
-
Publication number: 20120102031Abstract: A computer readable storage medium includes executable instructions to convert an entity to a standard form including normalized attributes, a tag reference and a feature. The entity is expanded with corresponding variants. The standard form and corresponding variants are combined to form an annotated entity in a first processing step. The entity is assigned to a group in a second processing step that accesses the annotated entity. The entity is processed in a single pass comprising the first processing step and the second processing step.Type: ApplicationFiled: October 20, 2010Publication date: April 26, 2012Applicant: SAP AGInventors: MOHAMMAD SHAMI, Tri Do, Kevin Wright, Hemant Puranik, George Chitouras
-
Publication number: 20120102034Abstract: According to exemplary embodiments of the invention, a location-based keyword recommending system and method are provided. The location-based keyword recommending system may include a keyword collecting unit to store location information regarding a location where a keyword is input, a region setting unit to set a virtual region by performing clustering of the location information with reference to the keyword, a region combining unit to combine virtual regions overlapping each other into one virtual region, and a keyword recommending unit to provide a location-based keyword based on the keyword related to the location information of the virtual region.Type: ApplicationFiled: September 23, 2011Publication date: April 26, 2012Applicant: NHN CORPORATIONInventors: Byoung Hak KIM, Chae Hyun LEE
-
Publication number: 20120096003Abstract: It is an object of the present invention to provide an information classification device capable of classifying retrieved pieces of information into appropriate groups even if these pieces of information are the same kind of information. The information classification device according to the present invention includes spatial arrangement means and classification means. The spatial arrangement means performs processing for spatially arranging an information group of a first information type and an information group of a second information type based on relation between the information group of the first information type and the information group of the second information type. The classification means classifies the information group of the first information type based on the processing results of the spatial arrangement means.Type: ApplicationFiled: May 12, 2010Publication date: April 19, 2012Inventors: Yousuke Motohashi, Hidekazu Sakagami, Tomohiro Isshiki
-
Publication number: 20120096004Abstract: A method, executed by a processor, for generating a transaction classification rule that can be applied to unclassified transactions. The method includes receiving an identification of an existing unclassified transaction upon which the classification rule will be based; generating identification rules to identify subsequent unclassified transactions as similar to the existing unclassified transaction; generating the classification rule using the identified transaction; and storing the classification rule for application to the subsequent unclassified transactions. Application of the generated classification rule to the subsequent unclassified transactions produces transactions classified according to the classification rule.Type: ApplicationFiled: October 18, 2010Publication date: April 19, 2012Inventor: Christopher Byrd
-
Publication number: 20120096001
Abstract: Embodiments of the present invention relate to systems, methods, and computer-storage media for affinitizing datasets based on efficient query processing. In one embodiment, a plurality of datasets within a data stream is received. The data stream is partitioned based on efficient query processing. Once the data stream is partitioned, an affinity identifier is assigned to datasets based on the partitioning of the dataset. Further, when datasets are broken into extents, the affinity identifier of the parent dataset is retained in the resulting extent. The affinity identifier of each extent is then referenced to preferentially store extents having common affinity identifiers within close proximity of one another across a data center.
Type: Application
Filed: October 15, 2010
Publication date: April 19, 2012
Applicant: MICROSOFT CORPORATION
Inventors: JINGREN ZHOU, PATRICK JAMES HELLAND, JONATHAN FORBES, YARON BURD
-
Publication number: 20120089641
Abstract: A free-form user-generated search query is used to retrieve responsive travel record information from categorized travel records. Searching the categorized travel records includes parsing the search query to identify search terms, determining a category with which each search term is associated, and searching the categorized records to identify travel records that include responsive information. Systems and graphical user interfaces for searching travel records are also disclosed.
Type: Application
Filed: October 8, 2010
Publication date: April 12, 2012
Inventors: Justin Steven Wilde, Jeffrey R. Wilde, James Ted Geyerman
-
Publication number: 20120089608Abstract: Methods and systems for organizing, representing and processing polymeric sequence information, including biopolymeric sequence information such as DNA sequence information and related information are disclosed herein. Polymeric sequence and associated information may be represented using a plurality of data units, each of which includes one or more headers and a payload containing a representation of a segment of the polymeric sequence. Each header may include or be linked to a portion of the associated information.Type: ApplicationFiled: August 31, 2011Publication date: April 12, 2012Applicant: ANNAI SYSTEMS, INC.Inventors: Lawrence Ganeshalingam, Patrick Nikita Allen
-
Publication number: 20120089605
Abstract: Delivering targeted content includes collecting, via at least one tangible processor, user activity data for users during a specified time period. Questions asked by the users during the specified time period are extracted from the user activity data, via the at least one tangible processor, and stored in user profiles for the users. The user profiles are clustered, via the at least one tangible processor, based on the questions asked. Targeted content is delivered, via the at least one tangible processor, to a subset of the users based on the clustering.
Type: Application
Filed: October 8, 2010
Publication date: April 12, 2012
Applicant: AT&T INTELLECTUAL PROPERTY I, L.P.
Inventors: Srinivas BANGALORE, Junlan FENG, Michael James Robert JOHNSTON, Taniya MISHRA
-
Publication number: 20120089604Abstract: Systems and methods are provided for assigning a record to one or more record clusters. A record including a plurality of fields is received. A field in the record is identified to have a likelihood of including an input error. One or more alternative fields are generated with alternative inputs. The identified field and the one or more alternative fields are compared with a plurality of record clusters to identify a cluster with a matching field. The record is assigned to the identified cluster based at least in part on the matching field.Type: ApplicationFiled: October 8, 2010Publication date: April 12, 2012Inventor: Jocelyn Siu Luan Hamilton
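As a rough illustration of the flow in this abstract, the Python sketch below generates alternative inputs for a suspect field and compares the field and its alternatives against existing clusters. The abstract does not say how alternatives are generated or how the suspect field is chosen; adjacent-character transposition, the caller-supplied `suspect_field`, and the dictionary layout of `clusters` are assumptions made here.

```python
def transposition_variants(value):
    """Generate alternative inputs by swapping each pair of adjacent characters."""
    variants = []
    for i in range(len(value) - 1):
        chars = list(value)
        chars[i], chars[i + 1] = chars[i + 1], chars[i]
        variants.append("".join(chars))
    return variants

def assign_to_cluster(record, suspect_field, clusters):
    """Return the id of the first cluster whose value for suspect_field matches
    the recorded value or one of its generated alternatives, else None.

    clusters maps a cluster id to a dict of representative field values
    (a hypothetical layout chosen for this sketch).
    """
    original = record[suspect_field]
    candidates = [original] + transposition_variants(original)
    for cluster_id, fields in clusters.items():
        if fields.get(suspect_field) in candidates:
            return cluster_id
    return None

clusters = {
    "c1": {"surname": "Johnson"},
    "c2": {"surname": "Smith"},
}
# "Jonhson" has two adjacent letters swapped relative to "Johnson".
matched = assign_to_cluster({"surname": "Jonhson", "city": "Leeds"}, "surname", clusters)
unmatched = assign_to_cluster({"surname": "Okafor", "city": "Leeds"}, "surname", clusters)
```

Other error models (dropped characters, keyboard-adjacent substitutions) could be added to the variant generator without changing the matching step.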
-
Publication number: 20120089607Abstract: Methods and systems for organizing, representing and processing polymeric sequence information, including biopolymeric sequence information such as DNA sequence information and related information are disclosed herein. Polymeric sequence and associated information may be represented using a plurality of data units, each of which includes one or more headers and a payload containing a representation of a segment of the polymeric sequence. Each header may include or be linked to a portion of the associated information.Type: ApplicationFiled: August 31, 2011Publication date: April 12, 2012Applicant: ANNAI SYSTEMS, INC.Inventors: Lawrence Ganeshalingam, Patrick Nikita Allen
-
Publication number: 20120089606Abstract: Provided are a method, system, and computer program product for grouping identity records to generate candidate lists to use in an entity and relationship resolution process. A plurality of identity records are received, wherein the identity records provide attributes of entities, wherein the identity records may provide different or same values for the attributes. The received identity records are grouped into a group of identity records. A composite query on values for selected attributes of the identity records in the group is generated and applied to an entity database to obtain composite results of entity records in the entity database matching the attribute values of the composite query. For the identity records in the group, an individual query on attributes of one of the identity records is performed against the composite results of the entity records to determine a candidate list of entity records from the entity database for the identity record.Type: ApplicationFiled: October 11, 2010Publication date: April 12, 2012Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATIONInventors: Bhavani K. ESHWAR, Rajeshwar KALAKUNTLA, Vaishnavi NORI, Nithinkrishna P. SHENOY
-
Publication number: 20120084287Abstract: Estimation of unique values in a database can be performed where a data field having multiple information values is provided in the database. The data field can be partitioned into multiple intervals such that each interval includes a range of information values. An interval specific Bloom filter can be calculated for each of the multiple intervals. A binary Bloom filter value can be calculated for an information value within an interval specific Bloom filter. The binary Bloom filter value can represent whether the information value is unique. A number of unique values in the database can be determined based on calculated binary Bloom filter values.Type: ApplicationFiled: September 30, 2010Publication date: April 5, 2012Inventors: Choudur Lakshminarayan, Ramakumar Kosuru
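One possible reading of this abstract is sketched below in Python: the value range is split into equal-width intervals, each interval gets its own Bloom filter, and a value is counted as unique the first time its interval's filter does not already contain it. The filter size, the SHA-256-based hashing, the equal-width partitioning, and all names are assumptions for illustration only. Because Bloom filters can report false positives, the count is an estimate that can undercount.

```python
import hashlib

class BloomFilter:
    """A small Bloom filter over a fixed-size bit list."""

    def __init__(self, size=1024, num_hashes=3):
        self.size = size
        self.num_hashes = num_hashes
        self.bits = [False] * size

    def _positions(self, value):
        # Derive num_hashes bit positions by salting the value with a seed.
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(f"{seed}:{value}".encode()).hexdigest()
            yield int(digest, 16) % self.size

    def add_if_new(self, value):
        """Return True if the value was probably not seen before, then record it."""
        positions = list(self._positions(value))
        is_new = not all(self.bits[p] for p in positions)
        for p in positions:
            self.bits[p] = True
        return is_new

def estimate_unique(values, low, high, num_intervals=4):
    """Estimate the number of distinct values using one Bloom filter per interval."""
    width = (high - low) / num_intervals
    filters = [BloomFilter() for _ in range(num_intervals)]
    unique = 0
    for v in values:
        # Clamp so that v == high falls into the last interval.
        idx = min(int((v - low) / width), num_intervals - 1)
        if filters[idx].add_if_new(v):
            unique += 1
    return unique

estimate = estimate_unique([1, 5, 5, 12, 12, 12, 19, 1], low=0, high=20)
```

Splitting by interval keeps each filter lightly loaded, which lowers the false-positive rate compared with a single filter of the same size.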
-
Publication number: 20120084286Abstract: An approach for managing calendar information received from a plurality of data sources is described. Calendar information associated respectively with a plurality of data sources is retrieved by a calendar management platform. For each of the data sources, metadata specifying a contributor of the corresponding calendar information and for relating distribution of the calendar information is determined. Based on the first and second metadata, a data view for the calendar information is generated.Type: ApplicationFiled: September 30, 2010Publication date: April 5, 2012Applicant: Verizon Patent and Licensing Inc.Inventors: Paul Hubner, Kristopher Pate, Steven T. Archer, Robert A. Clavenna
-
Publication number: 20120078843Abstract: Provided are techniques for selecting a first group of indexes to form a current generation of indexes, selecting indexes from the first group biased to indexes with higher fitness values from the current generation of indexes, forming sub-groups of indexes using the selected indexes, determining fitness values of each of the sub-groups based on the fitness value of each of the indexes, selecting a subset of the sub-groups; and placing the indexes in the selected sub-groups into a new generation of indexes.Type: ApplicationFiled: September 29, 2010Publication date: March 29, 2012Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATIONInventors: Gaurav Mehrotra, Abhinay R. Nagpal, Sandeep R. Patil, Rulesh F. Rebello
-
Publication number: 20120078904Abstract: A database table is provided. The database table includes several column tuples. A column is selected in the database table. The column tuples of the selected column are partitioned into several bins. Each bin includes a range of tuples and associated metadata. The associated metadata includes at least one of: a minimum tuple value for the tuples in the bin, a maximum tuple value for the tuples in the bin, a minimum tuple identifier for the bin and a maximum tuple identifier for the bin. The bins are sorted based on the tuple values to provide an approximate index for the database.Type: ApplicationFiled: September 28, 2010Publication date: March 29, 2012Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATIONInventors: Vatsalya Agrawal, Vivek Bhaskar, Ahmed Shareef
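The binning described in this abstract can be sketched in a few lines of Python. The sketch assumes fixed-size bins over consecutive tuple positions and uses list positions as tuple identifiers; the function names and dictionary keys are hypothetical, not taken from the application.

```python
def build_approximate_index(column, bin_size):
    """Partition a column into bins of consecutive tuples and record, for each
    bin, the min/max tuple value and the min/max tuple identifier.
    The bins are then sorted by value range to form the approximate index.
    """
    bins = []
    for start in range(0, len(column), bin_size):
        chunk = column[start:start + bin_size]
        bins.append({
            "min_value": min(chunk),
            "max_value": max(chunk),
            "min_tuple_id": start,
            "max_tuple_id": start + len(chunk) - 1,
        })
    bins.sort(key=lambda b: (b["min_value"], b["max_value"]))
    return bins

def candidate_bins(index, value):
    """Return the bins whose value range could contain the value."""
    return [b for b in index if b["min_value"] <= value <= b["max_value"]]

column = [7, 3, 9, 1, 2, 4, 20, 15]
index = build_approximate_index(column, bin_size=3)
hits = candidate_bins(index, 4)
```

The index is approximate in the sense that a lookup returns bins that may contain the value; the tuples in each returned identifier range still have to be scanned.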
-
Publication number: 20120078903Abstract: A technique includes receiving data indicative of operation management events, where each event occurs at an associated time. The technique includes processing the data to selectively group the events in episodes based on the associated times and identifying which events are correlated based at least in part on the episodes.Type: ApplicationFiled: September 23, 2010Publication date: March 29, 2012Inventors: Stefan Bergstein, Chetan Kumar Gupta, Abhay Mehta, Song Wang
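A minimal Python sketch of the two steps in this abstract follows. The abstract does not define an episode or a correlation test, so the sketch assumes that an episode ends when the gap between consecutive events exceeds `max_gap`, and that two event types are correlated when they share at least `min_support` episodes. Both parameters and all names are hypothetical.

```python
from collections import Counter
from itertools import combinations

def group_into_episodes(events, max_gap):
    """Group (timestamp, event_type) pairs into episodes.

    A new episode starts whenever the gap to the previous event exceeds max_gap.
    Returns a list of sets of event types, one set per episode.
    """
    episodes = []
    for timestamp, event_type in sorted(events):
        if episodes and timestamp - episodes[-1]["last"] <= max_gap:
            episodes[-1]["types"].add(event_type)
            episodes[-1]["last"] = timestamp
        else:
            episodes.append({"types": {event_type}, "last": timestamp})
    return [e["types"] for e in episodes]

def correlated_pairs(episodes, min_support):
    """Return pairs of event types that co-occur in at least min_support episodes."""
    counts = Counter()
    for types in episodes:
        for pair in combinations(sorted(types), 2):
            counts[pair] += 1
    return {pair for pair, n in counts.items() if n >= min_support}

events = [
    (0, "disk_full"), (2, "db_slow"),
    (50, "disk_full"), (51, "db_slow"), (53, "login_fail"),
    (200, "login_fail"),
]
episodes = group_into_episodes(events, max_gap=5)
pairs = correlated_pairs(episodes, min_support=2)
```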
-
Publication number: 20120078906Abstract: A robust knowledge-based management and sharing system organized by context for expertise-based or context-based searching and retrieval of relevant information is disclosed. The various embodiments and techniques described herein are used to organize a user's data and communications around the user's expertise or one or more contexts the user is associated with such as the user's projects, products, and customers. The organization of user data is derived from the user's competencies and interactions with others and is used to build and index user profiles in a manner that facilitates retrieval in search results for relevant search criteria. A linguistic processing pipeline is used to parse and index the user's data to generate the complete and partial profiles organized by context. Complete and partial profiles are generated, indexed, ranked, and stored by the system.Type: ApplicationFiled: August 3, 2011Publication date: March 29, 2012Inventors: Pankaj Anand, Maxim Lukichev, Puneet Trehan, Sumit Vij, Nitin Arora
-
Publication number: 20120078910Abstract: Methods which use an ID domain to improve searching are described. An embodiment describes an index phase in which an image of a document is converted into the ID domain. This is achieved by dividing the text in the image into elements and mapping each element to an identifier. Similar elements are mapped to the same identifier. Each element in the text is then replaced by the appropriate identifier to create a version of the document in the ID domain. This version may be indexed and searched. Another embodiment describes a query phase in which a query is converted into the ID domain and then used to search an index of identifiers which has been created from collections of documents which have been converted into the ID domain. The conversion of the query may use mappings which were created during the index phase or alternatively may use pre-existing mappings.Type: ApplicationFiled: December 8, 2011Publication date: March 29, 2012Applicant: Microsoft CorporationInventors: Walid Magdy, Motaz El-Saban
-
Publication number: 20120078907Abstract: According to one embodiment, a keyword presentation apparatus includes an extraction unit, a selection unit and a clustering unit. The extraction unit is configured to extract, as technical terms, morpheme strings, which are not defined in a general concept dictionary, from a document set. The selection unit is configured to evaluate relevancies between each of basic term candidates and the technical terms, and to preferentially select basic term candidates having high relevancies as basic terms. The clustering unit is configured to calculate weighted sums of statistical degrees of correlation between the basic terms based on the document set, to calculate conceptual degrees of correlation between the basic terms based on the general concept dictionary, and to cluster the basic terms based on the weighted sums.Type: ApplicationFiled: August 24, 2011Publication date: March 29, 2012Inventors: Tomoharu Kokubu, Toshihiko Manabe, Kosei Fume, Wataru Nakano, Hiromi Wakaki
-
Publication number: 20120078912Abstract: A method for event correlation includes receiving events from a network of systems and classifying the events into itemsets, where each itemset includes a set of frequently correlated events. The method also includes calculating a confidence value for each of the itemsets, identifying itemsets whose confidence values conform to a confidence criterion, and varying the confidence criterion to reduce the number of the identified itemsets. A computer program product and data processing system are also disclosed.Type: ApplicationFiled: September 23, 2010Publication date: March 29, 2012Inventors: Chetan Kumar GUPTA, Song WANG, Abhay MEHTA, Stefan BERGSTEIN
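The last step of this abstract, varying the confidence criterion to reduce the number of identified itemsets, might look like the Python sketch below. It assumes confidence values have already been computed per itemset and that the criterion is a simple lower threshold raised in fixed steps; the function name, parameters, and stopping rule are illustrative assumptions.

```python
def tighten_confidence(itemset_confidence, start, step, max_itemsets):
    """Raise the confidence threshold until at most max_itemsets itemsets remain.

    itemset_confidence maps a frozenset of event names to a confidence in [0, 1].
    Returns the final threshold and the itemsets that satisfy it.
    """
    threshold = start
    selected = {s: c for s, c in itemset_confidence.items() if c >= threshold}
    while len(selected) > max_itemsets and threshold < 1.0:
        threshold = min(1.0, threshold + step)
        selected = {s: c for s, c in itemset_confidence.items() if c >= threshold}
    return threshold, selected

confidences = {
    frozenset({"link_down", "route_flap"}): 0.95,
    frozenset({"route_flap", "cpu_high"}): 0.72,
    frozenset({"link_down", "cpu_high"}): 0.55,
    frozenset({"cpu_high", "fan_fail"}): 0.51,
}
threshold, selected = tighten_confidence(confidences, start=0.5, step=0.1, max_itemsets=2)
```

Here the threshold rises once, from 0.5 to 0.6, which leaves the two highest-confidence itemsets.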
-
Publication number: 20120072423
Abstract: Particular portions of program execution data are specified and organized in semantic groups. A grouping expression written in a transformation syntax language specifies a pattern and a replacement, for grouping performance data samples. An exception to the pattern can also be specified. In response to the grouping expression, a cost accounting shows groups and their costs. The grouping expression may operate on names and/or name-associated characteristics such as private/public status, author, directory, and the like. Samples may represent nodes in a directed acyclic graph memorializing call stacks or memory allocation. Grouping expressions are used to group nodes and consolidate costs by various procedures when making modified sample stacks: clustering-by-name, entry-group-clustering, folding-by-name, and folding-by-cost. An entry group clustering shows at least one entry point name while avoiding unwanted detail.
Type: Application
Filed: September 20, 2010
Publication date: March 22, 2012
Applicant: MICROSOFT CORPORATION
Inventors: Vance Morrison, Joshua Ryan Williams
-
Publication number: 20120072422Abstract: The present invention comprises a system and method for automatically processing one or more citations contained within a document while the document is presented by a document rendering application. The method of the present invention comprises scanning the document to identify an unformatted citation and parsing the unformatted citation to determine one or more citation terms. One or more citation libraries are queried to find citations comprising the one or more citation terms. A citation falling within the scope of the query is selected and inserted into the document. The present invention may further provide enhanced workflow solutions for authors and publishers in preparing documents in structured format for facilitating efficient and accurate validation of references cited or included in papers and other submissions for publication or for review. An author prepares a document containing a set of cited references using a formatting structure.Type: ApplicationFiled: June 15, 2011Publication date: March 22, 2012Inventors: Jason Rollins, Noah Merritt, Paul Patanella, Eftim L. Pop-Lazarov, Stephen J. Rieger, David M. Pedrick, Sandro Cifelli
-
Publication number: 20120072424Abstract: Developing a knowledgebase associated with a user interface is disclosed. Development of the knowledgebase includes cataloging local data associated with a user, collecting remote data associated with the user, recording information associated with verbal input received from the user, tracking acts performed by the user to determine user idiosyncrasies, and updating the knowledgebase with the cataloged local data, the collected remote data, the recorded information, and the user idiosyncrasies. The updated knowledgebase is then provided to a component of a user interface.Type: ApplicationFiled: September 22, 2010Publication date: March 22, 2012Inventor: George Weising
-
Publication number: 20120066223Abstract: A method and computing device for creating distinct user spaces are described. Concerning the method, in a platform originally designed as a single user platform, user data associated with a plurality of users can be stored and segmented. In addition, links to point to user data that is associated with a current user can be generated in which the link creation can exploit a predefined path associated with storing data in the single user platform. The method can also include the step of preventing the current user from accessing user data associated with non-active users.Type: ApplicationFiled: September 13, 2010Publication date: March 15, 2012Applicant: OPENPEAK INC.Inventors: Philip Schentrup, Michael Kelly
-
Publication number: 20120066225Abstract: The invention relates to a method and system for profiling recipients into recipient categories on the basis of responses to content items provided to users. The profiling is based on rankings that are assigned to the content items, recipient categories, links between the content items and links between the content items and recipient categories. In one embodiment the ranking of a given content item is calculated on the basis of rankings of other content items having a link to the given content item, together with the ranking of the link between the content items, while the ranking of a given respondent in respect of a given recipient category is calculated on the basis of rankings of content items and/or categories that have a link to that recipient category. The links between content items and to the recipient categories indicate a particular response, by the respondent, in respect of content items.Type: ApplicationFiled: June 29, 2009Publication date: March 15, 2012Applicant: CVON INNOVATIONS LTDInventors: Sami Saru, Janne Aaltonen, Timo Ahopelto, Pekka Ala-Pietila
-
Publication number: 20120066222
Abstract: A method and computer program 10 for web directory and search engine processing of a plurality of computation jobs in a grid computing system, with a hash function 12 used to speed up table lookup or data comparison tasks, such as finding items in a database and detecting duplicated or similar records in a large file. The partitions 16, 18, 20, 22 decompose very large data in a particular segment into smaller and more manageable pieces 24, 26, 28, 30. The system then retrieves specific data, produces information search results, and stores the information in a web directory or search database 32. Furthermore, the method uses grid computing technologies and other computer programs for sharing computational operations among organizations, sharing and managing data, and easily accessing the database.
Type: Application
Filed: September 14, 2010
Publication date: March 15, 2012
Inventor: Tam T. Nguyen
-
Publication number: 20120059707Abstract: Among other disclosed subject matter, a computer-implemented method includes receiving a first data set associated with a first data provider. The first data set includes a first set of data attributes associated with a first set of users. The method includes receiving a second data set associated with a second different data provider. The second data set includes a second set of data attributes associated with a second set of users. The method includes generating user cluster information based at least in part on at least one common data attribute associated with the first set of users and the second set of users. The method includes providing the user cluster information to a data purchaser.Type: ApplicationFiled: August 31, 2011Publication date: March 8, 2012Applicant: GOOGLE INC.Inventors: Vishal Goenka, Anurag Agarwal, Arun Dev Qamra, Vassilis Papavassiliou, Daishi Harada, Rajas Moonka, David Monsees
-
Publication number: 20120059708Abstract: In one embodiment, a method includes constructing an intent map for a plurality of products, the intent map comprising intent topics and each intent topic comprising intents, and then deriving a plurality of keywords from the intent map based on keyword templates.Type: ApplicationFiled: August 26, 2011Publication date: March 8, 2012Applicant: ADCHEMY, INC.Inventors: Daniel Galas, Veeravich Thi Thumasathit, Murthy V. Nukala, Richard Edward Chatwin, Alessandro Magnani, Benjamin David Foster, Alan Coleman, Manish Khettry, Siva Chandrasekar, Nitin Gupta, Srinidhi Ramesh Kondaji
-
Publication number: 20120059823Abstract: Provided are techniques for partitioning a physical index into one or more physical partitions; assigning each of the one or more physical partitions to a node in a cluster of nodes; for each received document, assigning an assigned-doc-ID comprising an integer document identifier; and, in response to assigning the assigned-doc-ID to a document, determining a cut-off of assignment of new documents to a current virtual-index-epoch comprising a first set of physical partitions and placing the new documents into a new virtual-index-epoch comprising a second set of physical partitions by inserting each new document to a specific one of the physical partitions in the second set using one or more functions that direct the placement based on one of the assigned-doc-id, a field value derived from a set of fields obtained from the document, and a combination of the assigned-doc-id and the field value.Type: ApplicationFiled: September 3, 2010Publication date: March 8, 2012Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATIONInventors: Ronald J. Barber, Harish Deshmukh, Ning Li, Bruce G. Lindsay, Sridhar Rajagopalan, Roger C. Raphael, Eugene J. Shekita, Paul S. Taylor
-
Publication number: 20120059826Abstract: An approach is provided for generating a compilation of media items. A plurality of media items is received. Respective context vectors for the media items are determined. The context vectors include, at least in part, orientation information, tilt information, altitude information, geo-location information, timing information, or a combination thereof associated with the creation of the respective media items. A compilation of at least a portion of the media items is generated based, at least in part, on the context vectors.Type: ApplicationFiled: January 24, 2011Publication date: March 8, 2012Applicant: Nokia CorporationInventors: Sujeet Shyamsundar Mate, Igor Danilo Diego Curcio, Francesco Cricri, Kostadin Nikolaev Dabov
-
Publication number: 20120059824Abstract: Provided are techniques for selecting row identifiers from an initial index structure storing rows of randomized indexes. The row identifiers are randomized. Groups are formed with the randomized row identifiers so that each group has a predetermined number of row identifiers. At least one group is selected from the groups. Indexes are retrieved from the initial index structure that correspond to the row identifiers in the selected at least one group. The retrieved indexes are encoded by adding product information to form new identifiers.Type: ApplicationFiled: September 3, 2010Publication date: March 8, 2012Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATIONInventor: Nisanth M. Simon
-
Patent number: 8131722Abstract: In one example embodiment, a method is illustrated as including retrieving item data from a plurality of listings, the item data filtered from noise data, constructing at least one base cluster having at least one document with common item data stored in a suffix ordering, compacting the at least one base cluster to create a compacted cluster representation having a reduced duplicate suffix ordering amongst the clusters, and merging the compact cluster representation to generate a merged cluster, the merging based upon a first overlap value applied to the at least one document with common item data.Type: GrantFiled: June 29, 2007Date of Patent: March 6, 2012Assignee: eBay Inc.Inventors: Neelakantan Sundaresan, Kavita Ganesan, Roopnath Grandhi
-
Publication number: 20120054183Abstract: In a method for identifying and classifying an object, an object is detected by at least one physical detector tuned for it, the object is evaluated from the output signal of the detector and by an evaluation unit, and the object is identified and/or classified on the basis of predefinable properties from the output signal. A number of different physical features of the object are derived from the output signal, and the object is assigned to one of N predetermined basic classes on the basis of the derived physical features. The N basic classes are arranged in a predetermined order to form an N-dimensional vector V, which is assigned to the object, such that the elements v1, . . . , vN of the vector V indicate that the object belongs to the respective basic class. The object is then assigned to a derived class, which is taken from a reference data base, as a function of the vector V.Type: ApplicationFiled: February 9, 2010Publication date: March 1, 2012Applicant: EADS DEUTSCHLAND GmbHInventor: Manfred Hiebl
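The vector-based classification step in the abstract above can be illustrated with a minimal sketch. The basic class names, the binary encoding of the vector elements, and the contents of the reference database are all hypothetical and chosen only for illustration; the abstract does not specify them.

```python
# Hypothetical basic classes (N = 3), arranged in a fixed order.
BASIC_CLASSES = ("metallic", "moving", "warm")

# Hypothetical reference database mapping a membership vector V
# to a derived class.
REFERENCE_DB = {
    (1, 1, 1): "vehicle",
    (0, 1, 1): "animal",
    (1, 0, 0): "debris",
}


def membership_vector(features):
    """Build V = (v1, ..., vN); vi is 1 if the object belongs to basic class i.

    `features` maps a basic class name to True/False (assumed encoding).
    """
    return tuple(1 if features.get(name, False) else 0 for name in BASIC_CLASSES)


def derived_class(features, default="unknown"):
    """Look up the derived class for an object as a function of its vector V."""
    return REFERENCE_DB.get(membership_vector(features), default)


if __name__ == "__main__":
    detected = {"moving": True, "warm": True}
    print(membership_vector(detected))
    print(derived_class(detected))
```

A dictionary keyed by the whole vector keeps the lookup a single step; a real reference database could equally use graded membership values, which this sketch does not attempt.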
-
Publication number: 20120054182Abstract: Methods and arrangements for accommodating a query, directing the query to datasets, creating partitions and partitioning the datasets, and returning a response to the query, the response being structured in accordance with the created partitions.Type: ApplicationFiled: August 24, 2010Publication date: March 1, 2012Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATIONInventors: Himanshu Gupta, Rajeev Gupta, Mukesh Kumar Mohania, Ullas Balan Nambiar
-
Publication number: 20120054188Abstract: An apparatus and method for processing content. In the method for processing content, a query for retrieving content to be stored is generated by combining a main category, a user's keyword, and a sub-category of the main category. The content is retrieved using the generated query. The content is classified and stored in a scrap book of the sub-category.Type: ApplicationFiled: March 1, 2011Publication date: March 1, 2012Applicant: SAMSUNG ELECTRONICS CO., LTD.Inventors: Bo-ra LEE, Ji-hye CHUNG, Hye-jeong LEE
-
Publication number: 20120047148Abstract: The present disclosure discloses a method for generating a search result and an information search system. The method for generating a search result includes: receiving, by an information search system, a search request; obtaining, by searching, a plurality of pieces of matching information that match the search request; obtaining a respective amount of user response associated with each of the plurality of pieces of matching information and further obtaining a total amount of user response associated with a respective category to which each of the plurality of pieces of matching information belongs; and ranking the plurality of pieces of information to generate a search result based on the total amount of user response associated with the respective category to which each of the plurality of pieces of matching information belongs.Type: ApplicationFiled: April 29, 2010Publication date: February 23, 2012Applicant: ALIBABA GROUP HOLDING LIMITEDInventors: Ning Guo, Yuheng Xie, Fei Xing, Lei Hou, Qin Zhang
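The ranking idea in the abstract above can be sketched roughly as follows. The tuple layout, the function name, and the tie-breaking rule (an item's own response count) are assumptions made for illustration; the abstract only states that ranking is based on the total user response of each item's category.

```python
from collections import defaultdict


def rank_by_category_response(matches):
    """Rank matches by the total user response of their category.

    `matches` is a list of (item_id, category, response_count) tuples
    (a hypothetical layout).
    """
    # Sum the user response over all matching items in each category.
    totals = defaultdict(int)
    for _, category, response in matches:
        totals[category] += response

    # Order by category total, highest first. Ties within a category are
    # broken by the item's own response count (an assumption).
    return sorted(matches, key=lambda m: (totals[m[1]], m[2]), reverse=True)


if __name__ == "__main__":
    matches = [
        ("a", "phones", 5),
        ("b", "cases", 40),
        ("c", "phones", 50),
        ("d", "cables", 10),
    ]
    ranked = rank_by_category_response(matches)
    print([item_id for item_id, _, _ in ranked])  # ['c', 'a', 'b', 'd']
```

Item "a" has little response of its own but outranks "b" because its category ("phones", total 55) draws more response overall than "cases" (total 40).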
-
Publication number: 20120047113Abstract: One embodiment of the present invention is directed to a method for compressing data generated by multiple data sources. The method includes steps of partitioning data generated by the multiple data sources into data partitions, the data included in each data partition containing inter-data-source redundancies and, for each data partition, compressing the data in the data partition to remove the inter-data-source redundancies.Type: ApplicationFiled: August 18, 2010Publication date: February 23, 2012Inventors: Marcelo Weinberger, Raul Herman Etkin, Erik Ordenllich, Gadiel Seroussi
-
Publication number: 20120047144Abstract: A hierarchy of nodes is created, each node being one of associated with an item retrieved according to a condition and associated with a category of information including the item. It is determined whether at least one of the nodes is redundant in the hierarchy. The at least one of the nodes is pruned from the hierarchy if the at least one of the nodes is redundant.Type: ApplicationFiled: October 27, 2011Publication date: February 23, 2012Inventor: John S. Huitema
-
Publication number: 20120047131Abstract: An information retrieval system and computer-based method provide constructing a title for a search result summary of a document through title synthesis, wherein the title is suitable for use in assessing the relevance of the summarized document to a query. In one embodiment, the system obtains meaningful keywords or key phrases (title components) about the document; and classifies each title component into one or more of a plurality of pre-established title component classes. The title components may be automatically obtained for the document from available sources either before or at the time the document is made available for indexing by the system. When a query is input to the system to which the document is relevant, the system constructs a title for the document by arranging title components selected from title component classes, to maximize a title utility function. The title utility function may be a query-dependent grade.Type: ApplicationFiled: August 23, 2010Publication date: February 23, 2012Inventors: Youssef Billawala, Sudarshan Lamkhede
-
Publication number: 20120047014Abstract: Techniques for performing user classification based on email are provided. Emails stored in an email store may be analyzed to classify users. Information included in the stored emails may be extracted, and users may be classified into categories according to the extracted information. The extracted information may be analyzed in a manner so as to protect the personal information of the users according to any applicable privacy standards. Any number of types of emails may be analyzed to classify users in any number of ways. For instance, a plurality of commercial emails stored in the email store may be determined. The commercial emails may be counted as conversions for an advertising campaign. The commercial emails may be parsed to extract commercial information. The commercial information may be parsed to generate user classification data. The user classification data may be used in various ways, including for targeting users with advertisements.Type: ApplicationFiled: August 23, 2010Publication date: February 23, 2012Applicant: Yahoo! Inc.Inventors: Yoelle Maarek Smadja, Andrei Broder, Vanja Josifovski, Melissa B. Stein
-
Publication number: 20120047141Abstract: A generic and expandable document aspect system and method for searching, browsing, presenting, and interacting with data assembled from document contents and related external data is provided. New varieties of document aspects are added to existing installations and can be accessed by users without requiring upgrades to server or clients, for example by using plug-in technology.Type: ApplicationFiled: October 31, 2011Publication date: February 23, 2012Inventors: Richard HOLZGRAFE, Tom Santos, Christopher Warnock
-
Publication number: 20120047140Abstract: A system, method and computer program product for synchronizing updates to shared mutable data in a clustered data processing system. A data element update operation is performed at each node of the cluster while preserving a pre-update view of the shared mutable data, or an associated operational mode, on behalf of readers that may be utilizing the pre-update view. A request is made for detection of a grace period, and grace period detection processing is performed for detecting when the cluster-wide grace period has occurred. When it does, a deferred action associated with the update operation is taken, such as removal of a pre-update view of the data element or termination of an associated mode of operation.Type: ApplicationFiled: October 31, 2011Publication date: February 23, 2012Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATIONInventors: Paul E. McKenney, Julian Satran
-
Publication number: 20120047143Abstract: Systems and methods are provided for augmenting a user profile of a subject user. In general, the user profile of the subject user is augmented based on aggregate profile data for a group of users relevant to a current location of the subject user. In one embodiment, the group of users is a crowd of users currently located at a location that is relevant to the current location of the subject user. In another embodiment, the group of users is a number of users historically, or previously, located at locations relevant to the current location of the subject user.Type: ApplicationFiled: March 12, 2010Publication date: February 23, 2012Applicant: Waldeck Technology LLCInventors: Steven L. Petersen, Ravi Reddy Katpelly