Clustering Or Classification (EPO) Patents (Class 707/E17.089)
  • Publication number: 20140143254
    Abstract: Systems and methods can determine categories for product searches. One or more computing devices can receive a product query of search terms. The product query can be classified to identify a product category. The search terms may be verified against an ambiguous term list for the product category. The search terms may also be verified against an attribute list for the product category. The product query may be classified as fully understood in response to all of the search terms matching either the ambiguous term list or the attribute list for the product category. A product search may be performed on the product query. The product search may be informed by the product category when the product query has been classified as fully understood. Search results may be generated and returned according to the product search.
    Type: Application
    Filed: November 16, 2012
    Publication date: May 22, 2014
    Inventors: Ritendra Datta, Joshua Yelon, Thomas Walter Murphy
  • Publication number: 20140136537
    Abstract: A computing system determines incremental values associated with a plurality of clustering solutions. Each of the clustering solutions groups stores of a retailer into clusters in a different way. For each clustering solution in the plurality of clustering solutions, the incremental value associated with the clustering solution indicates a difference between an estimated revenue associated with the clustering solution and revenue associated with a baseline clustering solution. The computing system then determines, based on the incremental values associated with the plurality of clustering solutions, the appropriate number of clusters. The clustering solutions that group the stores into more or fewer clusters than the appropriate number of clusters tend to be associated with incremental values that are the same or lower than the clustering solutions that group the stores into the appropriate number of clusters.
    Type: Application
    Filed: November 15, 2012
    Publication date: May 15, 2014
    Applicant: Target Brands, Inc.
    Inventors: James Carl Nelson, Raja Ranganathan, Abhijit Sharma, Zachary George Sands
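The selection step in the abstract above can be illustrated with a minimal Python sketch. It assumes each clustering solution is summarized by its cluster count and a single estimated revenue figure. The function names, the revenue numbers, and the tie-breaking rule (prefer fewer clusters) are assumptions made for illustration and are not taken from the application.

```python
def incremental_values(estimated_revenue, baseline_revenue):
    """Map each candidate cluster count to estimated revenue minus baseline revenue."""
    return {k: rev - baseline_revenue for k, rev in estimated_revenue.items()}


def appropriate_cluster_count(increments):
    """Pick the cluster count with the highest incremental value.

    Ties go to the smaller cluster count (an assumption of this sketch).
    """
    best = max(increments.values())
    return min(k for k, v in increments.items() if v == best)


# Hypothetical revenue estimates keyed by number of clusters (illustrative numbers only).
estimated = {2: 1040.0, 3: 1090.0, 4: 1120.0, 5: 1120.0, 6: 1105.0}
increments = incremental_values(estimated, baseline_revenue=1000.0)
chosen_k = appropriate_cluster_count(increments)
print(chosen_k, increments[chosen_k])
```

In this toy data, solutions with more or fewer clusters than the chosen count have incremental values that are the same or lower, which matches the behavior the abstract describes.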
  • Publication number: 20140136540
    Abstract: A system and method of determining the level of diversity for a search query are described. Distances between leaf categories in a hierarchical category tree are determined using co-click counts between the leaf categories for a query. Coordinate representations of the leaf categories are determined using the distances between the leaf categories. A diversity score for the query is determined using the coordinate representations. The diversity score represents a degree of variability in what different users find relevant to the query. In some embodiments, determining distances between leaf categories comprises determining the distances using a normalization of the co-click counts that uses co-impression counts between the leaf categories for the query. In some embodiments, a manifold learning algorithm is used to determine the coordinate representations. In some embodiments, multi-dimensional scaling is used to determine the coordinate representations.
    Type: Application
    Filed: November 9, 2012
    Publication date: May 15, 2014
    Applicant: eBay Inc.
    Inventors: Duangmanee Putthividhya, Zhaohui Chen
  • Publication number: 20140122483
    Abstract: An activity-modeling system computes an amount of time that a user is expected to spend when performing activities of a certain type. During operation, the system can obtain a plurality of location events associated with the user, such that a respective location event indicates a time at which a user logged his location while performing an activity related to the activity type. The system selects, from the plurality of location events, a set of location events associated with the activity type. The system determines an activity start-time and an activity end-time for the activity type from the set of location events, and computes an activity-duration time for the activity type based on the determined activity start-time and the activity end-time.
    Type: Application
    Filed: October 26, 2012
    Publication date: May 1, 2014
    Applicant: PALO ALTO RESEARCH CENTER INCORPORATED
    Inventors: Rui Zhang, Robert R. Price, Oliver Brdiczka
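The duration computation in the abstract above can be sketched in a few lines of Python. The abstract does not say how the start and end times are derived, so this sketch assumes the earliest and latest matching location events mark them. The record shape (timestamp, activity type) and the function name are also assumptions.

```python
from datetime import datetime, timedelta


def activity_duration(events, activity_type):
    """Return (start, end, duration) for one activity type, or None if no events match.

    `events` is an iterable of (timestamp, activity_type) pairs; this record
    shape is an assumption made for the example.
    """
    times = sorted(ts for ts, kind in events if kind == activity_type)
    if not times:
        return None
    # Assumption: earliest and latest matching events bound the activity.
    start, end = times[0], times[-1]
    return start, end, end - start


# Hypothetical location events for one user.
events = [
    (datetime(2012, 10, 26, 12, 5), "lunch"),
    (datetime(2012, 10, 26, 9, 0), "commute"),
    (datetime(2012, 10, 26, 12, 50), "lunch"),
    (datetime(2012, 10, 26, 12, 30), "lunch"),
]
start, end, duration = activity_duration(events, "lunch")
print(start, end, duration)
```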
  • Publication number: 20140114972
    Abstract: Systems and methods for sharing information between distributed computer systems connected to one or more data networks. In particular, a replication system implementing methodologies for sharing database information between computer systems where the databases use different classification schemes for information access control is disclosed.
    Type: Application
    Filed: October 22, 2012
    Publication date: April 24, 2014
    Applicant: PALANTIR TECHNOLOGIES, INC.
    Inventors: Richard Allen Ducott, III, John Kenneth Garrod, Khan Tasinga
  • Publication number: 20140108460
    Abstract: Data stores that store content units and annotations regarding the content units derived through a semantic interpretation of the content units. When annotations are stored in a database, different parts of an annotation may be stored in different tables of the database. For example, one or more tables of the database may store all semantic classifications for the annotations, while one or more other tables may store content of all of the annotations. A user may be permitted to provide natural language queries for searching the database. A natural language query may be semantically interpreted to determine one or more annotations from the query. The semantic interpretation of the query may be performed using the same annotation model used to determine annotations stored in the database. Semantic classifications and format of the annotations for a query may be the same as one or more annotations stored in the database.
    Type: Application
    Filed: October 11, 2012
    Publication date: April 17, 2014
    Applicant: Nuance Communications, Inc.
    Inventors: Mariana Casella dos Santos, Frank Montyne
  • Publication number: 20140108410
    Abstract: A test case generation system includes a processor, a process residing on the processor and configured to extract descriptions from document artifacts, extract a first set of keywords from the descriptions, categorize the descriptions to a first set and a second set, extract a second set of keywords that occur in the second set and generate a test case from the second set of keywords.
    Type: Application
    Filed: October 17, 2012
    Publication date: April 17, 2014
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Futoshi Iwama, Ken Mizuno, Taiga Nakamura, Hironori Takeuchi
  • Publication number: 20140101162
    Abstract: A method for recommending semantic annotations on a main document and sub documents is provided. The method includes: extracting a keyword of the main document; extracting a keyword or a set of keywords of each sub document; and generating a keyword similarity or a set of keyword similarities for each of the sub documents based on a degree of similarity between the keyword of the main document and the keyword of each of the sub documents. The method also includes: obtaining a plurality of words appearing in each of the sub documents and calculating a frequency of each of the words; generating a semantic capacity of each of the sub documents according to the frequencies; grouping the main document and at least one of the sub documents into a semantic document set based on the semantic capacities and the keyword similarities; and annotating the main document according to the semantic document set.
    Type: Application
    Filed: October 9, 2012
    Publication date: April 10, 2014
    Applicant: INDUSTRIAL TECHNOLOGY RESEARCH INSTITUTE
    Inventors: Hsiang-Yuan Hsueh, Ko-Li Kan, Chi-Chou Chiang
  • Publication number: 20140095503
    Abstract: A system and a method for initializing a streaming application are disclosed. The method may include initializing a streaming application for execution on one or more compute nodes which are adapted to execute one or more stream operators. The method may, during a compiling of code, identify whether a processing condition exists at a first stream operator of a plurality of stream operators. The method may add a grouping condition to a second stream operator of the plurality of stream operators if the processing condition exists. The method may provide for the second stream operator to group tuples for sending to the first stream operator.
    Type: Application
    Filed: September 28, 2012
    Publication date: April 3, 2014
    Applicant: International Business Machines Corporation
    Inventors: Michael J. Branson, Bradford L. Cobb, John M. Santosuosso
  • Publication number: 20140095505
    Abstract: Systems and methods that allow for an intelligence platform for distributed processing of big data sets including both structured and unstructured data types across two or more intelligent data operation engine servers. The intelligent data operation engine servers can form a conceptual understanding of content in each electronic file and then cooperate with a distributed index handler to index the conceptual understanding of the electronic file. A query pipeline and the distributed index handler in the intelligence platform cooperate with the two or more intelligent data operation engine servers to improve scalability and performance on the big data sets containing both structured and unstructured electronic files represented in the common index.
    Type: Application
    Filed: October 1, 2012
    Publication date: April 3, 2014
    Applicant: LONGSAND LIMITED
    Inventors: Sean Mark Blanchflower, Darren John Gallagher
  • Publication number: 20140081973
    Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for classifying a spike in a rate of occurrence of events. One of the methods includes receiving data identifying a spike at a particular time in a rate of occurrence of events relating to a particular search query, where an event relating to the particular search query is a receipt event of the particular search query or an indexing event of a resource that satisfies the particular search query, fitting the occurrences of the events in a time window to a reference distribution of occurrences of events to determine a goodness of fit value, wherein the reference distribution models a random occurrence of events relating to search queries, comparing the goodness of fit value to a predetermined threshold, and classifying the spike as a spurious spike if the goodness of fit value satisfies the predetermined threshold.
    Type: Application
    Filed: September 14, 2012
    Publication date: March 20, 2014
    Applicant: Google Inc.
    Inventors: Mukund Jha, Kumar Mayur Thakur
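As a rough illustration of the goodness-of-fit test in the abstract above, the Python sketch below compares per-interval event counts against a uniform reference distribution using Pearson's chi-square statistic. The choice of statistic, the uniform reference, the threshold value, and the rule that a low statistic "satisfies" the threshold are all assumptions. The application does not specify any of them here.

```python
def chi_square_statistic(observed, expected):
    """Pearson chi-square statistic between observed and expected counts."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


def is_spurious_spike(counts, threshold):
    """Return True when the counts in the window look like random occurrences.

    The reference distribution is uniform over the window (an assumption),
    and a statistic at or below `threshold` is treated as a good fit.
    """
    total = sum(counts)
    expected = [total / len(counts)] * len(counts)
    return chi_square_statistic(counts, expected) <= threshold


# Hypothetical event counts per interval in a time window around a reported spike.
flat_window = [10, 12, 9, 11, 10, 8]
peaked_window = [5, 5, 5, 5, 5, 35]
THRESHOLD = 11.07  # illustrative cutoff, not a value from the source

print(is_spurious_spike(flat_window, THRESHOLD))
print(is_spurious_spike(peaked_window, THRESHOLD))
```

A window whose counts fit the random reference well is classified as spurious, while a window dominated by one interval is not.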
  • Publication number: 20140081974
    Abstract: Systems and methods are provided for aggregating electronic content items that are relevant to one another. In one embodiment, a content management application determines that a first electronic content item and a second electronic content item are relevant to one another. The first electronic content item is provided by a first client account and the second electronic content item is provided by a second client account. The content management application also aggregates the first and second electronic content items to form at least part of a collection of electronic content. The first and second electronic content items are aggregated based on determining that the first and second electronic content items are relevant to one another. The content management application also provides access to the collection of electronic content.
    Type: Application
    Filed: September 18, 2012
    Publication date: March 20, 2014
    Applicant: Adobe Systems Incorporated
    Inventors: Jon Lorenz, Justin Velo
  • Publication number: 20140067816
    Abstract: In an effort to enhance computer user engagement with a search results page, systems and methods are presented which are configured to identify an entity as being the subject matter of a user's search query. If the entity is a known entity, i.e., entity information is stored in an entity store for the identified entity, a subset of entity attributes are identified and a representative entity attribute question is obtained for each of the attributes in the subset of entity attributes. The representative entity attribute questions are identified according to the probability that they are formed linguistically correct. The representative entity attribute questions are included in a search results page that is generated in response to the user's search query.
    Type: Application
    Filed: August 29, 2012
    Publication date: March 6, 2014
    Applicant: MICROSOFT CORPORATION
    Inventors: Tapas Kanungo, Ashok Ponnuswami
  • Publication number: 20140067807
    Abstract: A method performed on an electronic device for migrating tags across entities. The migration of the tags is performed following an analysis of one or more personal electronically encoded items associated with a previously created perspective or album associated with the previously created perspective, responsive to a user decision the creation of a new perspective, a new album associated with one of the previously created perspectives, or a new perspective and a new album associated with the new perspective, responsive to a user decision to treat the previously created perspective or album as an individual entity, and association of the previously created perspective or album with the new perspective or new album. The tags are respectively migrated from the new perspective or the new album to the associated previously created perspective or the previously created album and to associated ones of the one or more personal electronically encoded items.
    Type: Application
    Filed: August 31, 2012
    Publication date: March 6, 2014
    Applicant: RESEARCH IN MOTION LIMITED
    Inventors: Anand Ravindra OKA, Sean Bartholomew SIMMONS, Christopher Harris SNOW, Steven Michael HANOV, Ghasem NADDAFZADEH SHIRAZI
  • Publication number: 20140058992
    Abstract: Techniques are described to characterize motion patterns of a group of agents engaging in an activity. An analysis system receives input data associated with spatial and temporal information of at least one element of interest associated with the activity, where the object of interest may be a ball, person, animal or any other object in motion. The analysis system partitions the input data into a plurality of spatiotemporal segments and generates one or more representations of one or more sets of segments of the plurality of spatiotemporal segments based on one or more criteria. The analysis system computes a metric, such as an entropy value, for each of the one or more representations. Partial tracing data, such as ball movements in a sporting event, may be created using an inexpensive input device, such as a tablet computer, making the disclosed techniques available for a wide range of events and activities.
    Type: Application
    Filed: August 21, 2012
    Publication date: February 27, 2014
    Inventors: Patrick Lucey, Alina Bialkowski, Iain Matthews, G. Peter Carr, Eric Foote
  • Publication number: 20140052730
    Abstract: Embodiments of the present invention provide a system, method, and program product for managing data sets. According to one aspect of the present invention, a data group of one or more related data sets is reorganized. Utilizing one or more specified criteria, data sets that should be cataloged in the data group are identified and cataloged in the data group such that they are arranged in a chronological order and are named with appropriate generation numbers.
    Type: Application
    Filed: August 14, 2012
    Publication date: February 20, 2014
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Eric J. Harris, Franklin E. McCune, Miguel A. Perez, Ryan J. Wisniewski
  • Publication number: 20140047384
    Abstract: Systems, methods, computer-readable media, and graphical user interfaces for facilitating integrated data capture with an item group key are provided. Integrated data capture workflows are initiated from within an electronic medical record (EMR). Selections of groups of items from the EMR are received. Item group keys are assigned to at least one item for the groups of items. Available data associated with the item group keys is gathered from the EMR. Selections of available data to include in case report forms are received. The case report forms are populated with the selections of available data and the item group keys.
    Type: Application
    Filed: August 8, 2012
    Publication date: February 13, 2014
    Applicant: CERNER INNOVATION, INC.
    Inventors: JON FEWINS, RYAN MOOG, MARSHA LAIRD-MADDOX, TODD JEFFREY REYNOLDS, BRADY TIMMERBERG, NITISH AMRAJI
  • Publication number: 20140046895
    Abstract: Data for a plurality of entities that can be offered a plurality of products can be obtained. The data can include categorical data and numeric data. Based on business constraints, some or all of the data can be selected. The selected data can be converted to another set of numeric data, wherein the categorical values are converted to numeric values. Dimensions of the converted data can be reduced to generate another set of data. Based on this other set of data, clusters of entities can be formed. The products can be grouped by assigning a unique product identifier of each product to a corresponding cluster. This grouping of products can be used by a predictive model to predict a likelihood of an entity to purchase a particular product in a future time period. Related methods, apparatus, systems, techniques and articles are also described.
    Type: Application
    Filed: August 10, 2012
    Publication date: February 13, 2014
    Inventors: Amit Sowani, Eeshan Malhotra, Shafi Ur Rahman
  • Publication number: 20140046947
    Abstract: A method for question/answer creation for a document is described. The method includes importing a document having a set of questions based on content in the document. The method also includes automatically creating a candidate question from the content in the document. The method also includes automatically generating answers for the set of questions and the candidate question using the content in the document. The method also includes presenting the set of questions, the candidate question, and the answers to a content creator for user verification of accuracy. The method also includes storing a verified set of questions in the document. The verified set of questions includes the candidate question.
    Type: Application
    Filed: August 9, 2012
    Publication date: February 13, 2014
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Jana H. Jenkins, David C. Steinmetz, Wlodek W. Zadrozny
  • Publication number: 20140040270
    Abstract: Method, apparatus, and computer-readable medium are provided for analyzing a document including text. In one example, a method for identifying patterns in a document is described. The method includes identifying a plurality of candidate phrases in the document based on candidate identification criteria, grouping the candidate phrases of the plurality of candidate phrases with a phrase family based on family criteria and comparison between candidate phrases of the plurality of candidate phrases to obtain consistent phrases, and, for remaining phrases not meeting all of the candidate identification criteria, associating at least one of the remaining phrases with a phrase family based on inconsistent phrase criteria to obtain inconsistent phrases. Identified in this manner, the inconsistent phrase may be displayed via a user interface to permit a user the opportunity to determine whether an inconsistent phrase requires modification.
    Type: Application
    Filed: July 31, 2012
    Publication date: February 6, 2014
    Applicant: Freedom Solutions Group, LLC, d/b/a Microsystems
    Inventors: Thomas O'Sullivan, Andrzej Jachowicz
  • Publication number: 20140040263
    Abstract: The disclosure generally describes computer-implemented methods, software, and systems for search-, context-, and rule-based creation and runtime adaptation in dynamic workspaces. One computer-implemented method includes identifying a data artifact associated with each search result of at least one received search result, associating each identified data artifact with a module category of a plurality of module categories, injecting the identified artifacts into a content gallery, categorizing, by operation of at least one computer, the injected identified artifacts within the content gallery, presenting at least a subset of the injected identified artifacts on an enterprise workspace page associated with an enterprise workspace, and constructing a context associated with at least one of the enterprise workspace or the enterprise workspace page.
    Type: Application
    Filed: August 6, 2012
    Publication date: February 6, 2014
    Applicant: SAP Portals Israel Ltd.
    Inventors: Yahali Sherman, Vitaly Vainer
  • Publication number: 20140040233
    Abstract: Methods, systems, and computer-readable and executable instructions are provided for organizing content. A method for organizing content can include building a customized content corpus for a user, building a concept graph customized for the user's context based on the customized corpus, and organizing, utilizing multi-view clustering, the content within the corpus based on the concept graph.
    Type: Application
    Filed: July 31, 2012
    Publication date: February 6, 2014
    Inventors: Mehmet Kivanc Ozonat, Claudio Bartolini
  • Patent number: 8635223
    Abstract: A system and method for providing a classification suggestion for electronically stored information is provided. A corpus of electronically stored information including reference electronically stored information items each associated with a classification and uncoded electronically stored information items are maintained. A cluster of uncoded electronically stored information items and reference electronically stored information items is provided. A neighborhood of reference electronically stored information items in the cluster is determined for at least one of the uncoded electronically stored information items. A classification of the neighborhood is determined using a classifier. The classification of the neighborhood is suggested as a classification for the at least one uncoded electronically stored information item.
    Type: Grant
    Filed: July 9, 2010
    Date of Patent: January 21, 2014
    Assignee: FTI Consulting, Inc.
    Inventor: William C. Knight
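The neighborhood-based suggestion in the abstract above can be sketched as a nearest-neighbor vote. The patent only says the neighborhood is classified "using a classifier"; the majority vote, the Euclidean distance, the vector representation of items, and the example labels below are all assumptions chosen for illustration.

```python
import math
from collections import Counter


def suggest_classification(uncoded_vector, references, k=3):
    """Suggest a classification for an uncoded item from its k nearest reference items.

    `references` is a list of (vector, classification) pairs from the same cluster.
    """
    # The neighborhood: the k reference items closest to the uncoded item.
    neighborhood = sorted(references, key=lambda ref: math.dist(uncoded_vector, ref[0]))[:k]
    # Stand-in classifier (an assumption): majority vote over the neighborhood.
    votes = Counter(label for _, label in neighborhood)
    return votes.most_common(1)[0][0]


# Hypothetical reference items: (feature vector, classification).
references = [
    ((0.0, 0.1), "privileged"),
    ((0.2, 0.0), "privileged"),
    ((0.1, 0.3), "privileged"),
    ((0.9, 1.0), "responsive"),
    ((1.0, 0.8), "responsive"),
]
print(suggest_classification((0.1, 0.1), references))
print(suggest_classification((0.95, 0.9), references))
```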
  • Publication number: 20140019451
    Abstract: A technique can include identifying a collection of documents to be clustered. The collection of documents can include foreign language documents and base language documents. The foreign language documents can be translated into the base language at a base language translation module. Keywords in the base language documents and keywords in the translated foreign language documents can be determined at a document indexing module. The base language documents can be clustered with the foreign language documents in a common set of document clusters based on the determined keywords in the base language documents and the determined keywords in the translated foreign language documents. In response to a search query in a first language, a listing of search results can be provided that includes documents in the first language and another language from a common document cluster.
    Type: Application
    Filed: July 16, 2012
    Publication date: January 16, 2014
    Applicant: GOOGLE INC.
    Inventor: Kirill Buryak
  • Publication number: 20140012848
    Abstract: Systems and methods for measuring similarity between a set of clusters and a set of object labels, wherein at least two of the object labels are related, receive a first set of clusters, wherein the first set of clusters was formed by clustering objects in a set of objects into clusters of the first set of clusters according to a clustering procedure; and calculate a similarity index between the first set of clusters and a set of object labels based at least in part on a relationship between two or more object labels in the set of object labels.
    Type: Application
    Filed: July 5, 2012
    Publication date: January 9, 2014
    Applicant: CANON KABUSHIKI KAISHA
    Inventors: Bradley Scott Denney, Dariusz T. Dusberger
  • Publication number: 20140006401
    Abstract: Various technologies described herein pertain to classifying data in a main memory database system. A record access log can include a sequence of record access observations logged over a time period from a beginning time to an end time. Each of the record access observations can include a respective record ID and read timestamp. The record access log can be scanned in reverse from the end time towards the beginning time. Further, access frequency estimate data for records corresponding to record IDs read from the record access log can be calculated. The access frequency estimate data can include respective upper bounds and respective lower bounds of access frequency estimates for each of the records. Moreover, the records can be classified based on the respective upper bounds and the respective lower bounds of the access frequency estimates, such that K records can be classified as being frequently accessed records.
    Type: Application
    Filed: June 30, 2012
    Publication date: January 2, 2014
    Applicant: MICROSOFT CORPORATION
    Inventors: Justin Jon Levandoski, Per-Ake Larson
  • Publication number: 20140006408
    Abstract: Example methods, apparatuses, or articles of manufacture are disclosed that may be implemented, in whole or in part, using one or more computing devices to facilitate or otherwise support one or more processes or operations for identifying points of interest in a text, such as in an unstructured text, for example, in connection with bootstrapping points of interest via social media.
    Type: Application
    Filed: June 29, 2012
    Publication date: January 2, 2014
    Applicant: Yahoo! Inc.
    Inventors: Adam Rae, Vanessa Murdock, Hugues Bouchard, Adrian Popescu
  • Publication number: 20130339354
    Abstract: A method and system for mining trends around trending terms. The method includes determining a plurality of articles, from one or more websites, in relation to a first entity for a time period. The first entity is a trending term. The method also includes generating comment clusters for the plurality of articles. Each comment cluster is generated for an associated article and includes a plurality of user comments. The method further includes extracting one or more entities from the plurality of user comments for each of the comment clusters, the one or more entities related to the first entity. Further, the method includes enabling selection of a second entity, from the one or more entities, by the user. Moreover, the method includes rendering one or more user comments corresponding to the first entity and the second entity for the time period. The system includes an electronic device, communication interface, memory, and processor.
    Type: Application
    Filed: June 14, 2012
    Publication date: December 19, 2013
    Applicant: YAHOO! INC.
    Inventors: Vidit JAIN, Nikhil RASIWASIA
  • Publication number: 20130325862
    Abstract: Systems and methods are provided for large-scale, incrementing clustering. A plurality of processing nodes each include a processor and a non-transitory computer readable medium. The non-transitory computer readable medium stores a plurality of clusters of feature vectors and machine executable instructions for determining a plurality of values for a distance metric relating each of the plurality of clusters to an input feature vector and selecting a cluster having a best value for the distance metric. An arbitrator is configured to receive the selected cluster and best value for the distance metric from each of the plurality of processing nodes and determine a winning cluster as one of the selected clusters and a new cluster. A multiplexer is configured to receive the winning cluster and provide the winning cluster and a new input feature vector to each of the plurality of processing nodes.
    Type: Application
    Filed: June 4, 2012
    Publication date: December 5, 2013
    Inventor: MICHAEL D. BLACK
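The node-and-arbitrator flow in the abstract above can be sketched in Python as follows. Each processing node is modeled as a dictionary of cluster centroids, the distance metric is Euclidean, and the arbitrator starts a new cluster when the best distance exceeds a threshold. All three choices, along with the function names, are assumptions of this sketch rather than details from the application.

```python
import math


def best_local_cluster(clusters, vector):
    """On one node, return (cluster_id, distance) for the cluster closest to the input vector."""
    return min(
        ((cid, math.dist(centroid, vector)) for cid, centroid in clusters.items()),
        key=lambda pair: pair[1],
    )


def arbitrate(candidates, new_cluster_threshold):
    """Pick the winning cluster across nodes, or None to signal that a new cluster is needed."""
    cluster_id, distance = min(candidates, key=lambda pair: pair[1])
    return cluster_id if distance <= new_cluster_threshold else None


# Hypothetical cluster centroids held by two processing nodes.
node_a = {"a1": (0.0, 0.0), "a2": (5.0, 5.0)}
node_b = {"b1": (10.0, 0.0)}


def winning_cluster(vector, threshold=1.0):
    """Run one input feature vector through both nodes and the arbitrator."""
    candidates = [best_local_cluster(node, vector) for node in (node_a, node_b)]
    return arbitrate(candidates, threshold)


print(winning_cluster((4.5, 5.0)))
print(winning_cluster((20.0, 20.0)))
```

Each node only reports its single best candidate, so the arbitrator compares one value per node regardless of how many clusters each node stores.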
  • Publication number: 20130325861
    Abstract: Embodiments of the invention relate to modeling an activity area associated with groups of data items. Tools are provided to profile activity area involvement, both from the data item and from associated participants. The data items are placed into clusters and one or more activity areas are derived from the formed clusters. Each activity area is defined from the perspective of a single user. Participants in an activity area are connected to a user, but not necessarily to each other. The combination of formations of clusters and activity areas provides a multi-faceted organization of connections between data items and associated participants.
    Type: Application
    Filed: May 31, 2012
    Publication date: December 5, 2013
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventor: Hongxia Jin
  • Publication number: 20130325849
    Abstract: Techniques for annotating an entity in a document corpus using cross-document signals. A method includes determining which documents in a document corpus mention an entity of interest, clustering the documents that mention an entity of interest according to a temporal signal, a structural signal and/or a content signal, thereby forming at least one cluster of documents, and annotating at least one document in the at least one cluster of documents by marking each occurrence of the entity in the at least one document.
    Type: Application
    Filed: August 16, 2012
    Publication date: December 5, 2013
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Sushovan De, Amit K. Singh, Karthik Visweswariah
  • Publication number: 20130326346
    Abstract: The embodiments provide a cloud brainstorming service implemented on at least one cloud server. The brainstorming service includes a message service component configured to receive a plurality of ideas, over a network, from one or more users of devices. The users represent members of a brainstorming session. The brainstorming service also includes a brainstorming logic component configured to process the plurality of ideas and store the plurality of processed ideas in an in-memory database system, and a clustering component configured to retrieve the plurality of processed ideas from the in-memory database system and arrange the plurality of processed ideas into one or more clusters, where each cluster is a group of similar ideas. The message service component is configured to provide the plurality of processed ideas that are arranged into the one or more clusters, over the network, to the one or more users for display.
    Type: Application
    Filed: August 17, 2012
    Publication date: December 5, 2013
    Applicant: SAP AG
    Inventors: Zheren Zhu, Yongyuan Shen, Fu Zhao, Yingyu Chen, Bin Dong, Zheng Long Wei, Hui Wang
  • Publication number: 20130318088
    Abstract: According to one embodiment of the present invention, classification of objects in a directory service may be managed. An object is identified in a directory service. Classification information associated with the object is received from a reference database. Using a processor, a rule that specifies a value that corresponds to the classification information is accessed. The accessed value is based on a power of two classification model. Using the processor, the class of service attribute is created using the value. The class of service attribute is associated with the object listed in the directory service using the processor.
    Type: Application
    Filed: May 22, 2012
    Publication date: November 28, 2013
    Applicant: Bank of America Corporation
    Inventor: Michael Edward Futty
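The abstract above does not spell out what a "power of two classification model" looks like. One plausible reading, sketched here as an assumption only, is that each classification maps to a distinct power of two, so several classifications can be combined into a single class of service value. The rule table, its category names, and both function names are hypothetical and are not taken from the application.

```python
# Hypothetical rule table: each classification gets a distinct power of two.
RULES = {"public": 1, "internal": 2, "confidential": 4, "restricted": 8}

def class_of_service(labels):
    """Combine classification labels into one integer attribute value."""
    value = 0
    for label in labels:
        value |= RULES[label]  # set the bit assigned to this classification
    return value

def decode(value):
    """Recover the classification labels encoded in an attribute value."""
    return [name for name, bit in RULES.items() if value & bit]

attribute = class_of_service(["internal", "restricted"])
labels = decode(attribute)
```

Because every rule value occupies its own bit, the combined value can be decoded back into its classifications without ambiguity.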
  • Publication number: 20130311473
    Abstract: A method for dynamically clustering data items, the method comprising: receiving a plurality of data items originating from at least two sources, a plurality of distinct metadata details, and data indicative of associations between the data items and the metadata details, wherein each data item is associated with at least one metadata detail indicative of its owner, and wherein at least a first data item originating from a first source and a second data item originating from a second source are related data items associated with at least one shared metadata detail; grading probabilities of relationships between at least one of the data items and at least one of the metadata details; clustering the data items into one or more clusters, based on the calculated probabilities; and, optionally, sharing clusters and meta-clusters between users.
    Type: Application
    Filed: May 21, 2012
    Publication date: November 21, 2013
    Applicant: SPHEREUP LTD.
    Inventors: Yevgeny Safovich, Ronen Abramov, Natan Chosnek
  • Publication number: 20130311474
    Abstract: A method, a system and a computer program product create mappings between taxonomies in which documents are classified from a category of a taxonomy to one or more categories within a master taxonomy based on a statistical model and classification score values. The document classifications are analyzed to determine a mapping between the taxonomy category and a corresponding category of the master taxonomy, where the category is mapped to the corresponding category in the master taxonomy in response to sufficient classification score values for the documents.
    Type: Application
    Filed: May 18, 2012
    Publication date: November 21, 2013
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventor: Barton W. Emmanuel
  • Publication number: 20130297604
    Abstract: A method, system and electronic device are provided for classification of data objects such as messages. A number of rule engines, each of which may be associated with a different application or module, are provided on the electronic device. For each data object obtained by the electronic device, matching rule engines are identified, and the data object is processed by the matching rule engines to determine one or more classification values for the data object. The determined classification is stored in association with a data object identifier. Data objects can be subsequently collated according to their classification, or aggregations of data object listings can be collected and displayed in a plurality of views corresponding to the various classifications.
    Type: Application
    Filed: May 1, 2012
    Publication date: November 7, 2013
    Applicant: RESEARCH IN MOTION LIMITED
    Inventors: Darsono SUTEDJA, Umesh MIGLANI, Prakash DAMODARAN, Imtiaz NADAF, Francis CASTAGNOZZI
  • Publication number: 20130297606
    Abstract: A system and method for obtaining node information from a variety of potential sources and storing the information in a logical repository, and a system and method for identifying and categorizing Intermediate Nodes using a combination of requesting and responding node information.
    Type: Application
    Filed: May 7, 2012
    Publication date: November 7, 2013
    Inventors: Ken C. Tola, Patrick Kerry Bunday
  • Publication number: 20130290339
    Abstract: Users receive content recommendations from a personalized, generalized recommendation service that aggregates and selects content of high personal relevance to each individual user from a large pool of both personal and public content. The received content is filtered and the content determined to be relevant is cached. When a user request for content is received, the cached content is rescored and the content determined to be most relevant based on satisfaction of a relevance threshold is selected and forwarded to the user. Feedback methodologies are also implemented so that a user's actions are taken into consideration in real time and can affect subsequent recommendations to the user.
    Type: Application
    Filed: April 27, 2012
    Publication date: October 31, 2013
    Applicant: YAHOO! INC.
    Inventors: Chris LuVogt, Bruce Robbins, Vu B. Nguyen, Deepa Mahalingam
  • Publication number: 20130290334
    Abstract: In a method for managing storage of data across a plurality of disparate repositories, a partitioning strategy for storing the data into a plurality of partitions in at least one of a plurality of disparate repositories is acquired based upon a characteristic of the data. In addition, global metadata that describes the partitioning strategy is acquired, and the global metadata is implemented in a plurality of disparate repositories to enable performance of the partitioning strategy in storing the data in the plurality of partitions across the plurality of disparate repositories in a location agnostic manner.
    Type: Application
    Filed: April 30, 2012
    Publication date: October 31, 2013
    Inventor: Rahul KAPOOR
  • Publication number: 20130268531
    Abstract: In one embodiment, datasets are stored in a catalog. The datasets are enriched by establishing relationships among the domains in different datasets. A user searches for relevant datasets by providing examples of the domains of interest. The system identifies datasets corresponding to the user-provided examples. The system then identifies connected subsets of the datasets that are directly linked or indirectly linked through other domains. The user provides known relationship examples to filter the connected subsets and to identify the connected subsets that are most relevant to the user's query. The selected connected subsets may be further analyzed by business intelligence/analytics to create pivot tables or to process the data.
    Type: Application
    Filed: April 10, 2012
    Publication date: October 10, 2013
    Applicant: Microsoft Corporation
    Inventors: John C. Platt, Surajit Chaudhuri, Lev Novik, Henricus Johannes Maria Meijer, Efim Hudis, Kunal Mukerjee, Christopher Alan Hays
  • Publication number: 20130262465
    Abstract: A method for clustering documents is provided. Each document is represented by a multidimensional data point. The data points are initially assigned to a respective cluster and serve as their initial representative points. Thereafter, in an iterative process, the data points are clustered among the clusters, by assigning the data points to the clusters based on a comparison measure of each data point with the cluster or its representative point, and a threshold of the comparison measure. Based on this clustering, a new representative point for each of the clusters can be computed. Optionally, overlapping clusters are merged. For the next iteration, the new representative points are used as the representative points. An assignment of the documents to the clusters is output, based on a clustering of the data points in the latest iteration. Multiple batches may be processed, retaining the initial clusters to which the original batch was assigned.
    Type: Application
    Filed: April 2, 2012
    Publication date: October 3, 2013
    Applicant: Xerox Corporation
    Inventors: Matthias Galle, Jean-Michel Renders
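The iterative procedure in the abstract above can be illustrated with a small sketch. It assumes cosine similarity as the comparison measure, the arithmetic mean as the new representative point, and merging of clusters with identical membership as a simple stand-in for the optional merging of overlapping clusters. None of these choices is confirmed by the abstract, and all function names are hypothetical.

```python
import math

def cosine(a, b):
    """Cosine similarity between two vectors (assumed comparison measure)."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

def mean(points):
    """Component-wise mean, used here as the new representative point."""
    n = len(points)
    return tuple(sum(p[i] for p in points) / n for i in range(len(points[0])))

def threshold_cluster(points, threshold, iterations=5):
    """Return clusters as lists of point indices; a point may join several."""
    reps = list(points)  # each data point starts as its own representative
    members = []
    for _ in range(iterations):
        # Assign every point to each cluster whose representative is close enough.
        assigned = [
            [i for i, p in enumerate(points) if cosine(p, r) >= threshold]
            for r in reps
        ]
        # Merge clusters that ended up with identical membership.
        members = []
        for m in assigned:
            if m and m not in members:
                members.append(m)
        # Compute the representative points used in the next iteration.
        reps = [mean([points[i] for i in m]) for m in members]
    return members

docs = [(1.0, 0.0), (0.9, 0.1), (0.0, 1.0), (0.1, 0.9)]
clusters = threshold_cluster(docs, threshold=0.9)
```

With these four sample vectors the two near-horizontal points and the two near-vertical points end up in separate clusters.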
  • Publication number: 20130254202
    Abstract: A method, system, and computer program product for parallelization of updating synthetic events with genetic surprisal data comprising dividing the synthetic event into cohort parts and assigning the cohort parts to one of a plurality of computer processing elements. Within each processing element: searching data records of patients for genetic surprisal data; generating a cluster comprising a centroid by populating the cluster based on all of the matches of the data records; calculating a new centroid for each cluster; calculating a Euclidean distance in multiple dimensions for each match of data records to the new centroid for each cluster; reassigning each match of data to the new centroid of each cluster based on the shortest calculated Euclidean distance to the new centroid for each cluster; and determining at least one cohort part from the clusters and recombining the cohort parts into updated synthetic events based on the metadata.
    Type: Application
    Filed: July 31, 2012
    Publication date: September 26, 2013
    Applicant: International Business Machines Corporation
    Inventors: Robert R. Friedlander, James R. Kraemer
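The centroid update and reassignment steps in the abstract above resemble one iteration of k-means. The following is a minimal sketch of that single iteration, assuming plain numeric tuples stand in for the matched data records. The function names are hypothetical, and the parallel division into cohort parts is not modeled.

```python
import math

def euclidean(a, b):
    """Euclidean distance between two points of equal dimension."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def centroid(points):
    """Component-wise mean of a non-empty list of points."""
    n = len(points)
    return tuple(sum(p[i] for p in points) / n for i in range(len(points[0])))

def reassign(clusters):
    """Recompute each centroid, then move every point to its nearest one."""
    centroids = [centroid(c) for c in clusters if c]
    new_clusters = [[] for _ in centroids]
    for cluster in clusters:
        for p in cluster:
            nearest = min(
                range(len(centroids)),
                key=lambda i: euclidean(p, centroids[i]),
            )
            new_clusters[nearest].append(p)
    return centroids, new_clusters

start = [[(0, 0), (1, 0), (10, 0)], [(9, 0), (11, 0)]]
centroids, updated = reassign(start)
```

In this sample the point (10, 0) starts in the first cluster and moves to the second, whose recomputed centroid is closer.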
  • Publication number: 20130254162
    Abstract: A method and system for governing information is provided. The method includes receiving, by a processor, data defining a scope and context of an information governance project and information requirements data associated with the data. The processor classifies the information requirements data into concepts in accordance with a meta-model profile. The processor generates conceptual models and realization models in accordance with the meta-model profile. Governance roles are defined and assigned to informational assets within the conceptual models. The processor selects a final architecture option and generates policy models in accordance with the governance roles, the informational assets, the meta-model profile and user input. A final architecture option is deployed and monitored, and governance events triggered and reports generated in response to changes in this deployed architecture option.
    Type: Application
    Filed: March 23, 2012
    Publication date: September 26, 2013
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventor: Dougal A. Watt
  • Publication number: 20130254206
    Abstract: The subject disclosure is directed towards a technology by which content items such as microblog postings may be returned to a requestor based upon a desired level of diversity based upon information entropy. Each content item is associated with a set of dimensions, which may have a learned relative importance, and the content items may be pruned into a pruned subset via a transform. A result set is constructed by finding a cluster of items having a level of entropy that is closest to a desired level. In one aspect, the result set may be ordered based upon evaluating distortion of each item in the result set.
    Type: Application
    Filed: March 20, 2012
    Publication date: September 26, 2013
    Applicant: MICROSOFT CORPORATION
    Inventors: Scott J. Counts, Munmun De Choudhury, Mary P. Czerwinski
  • Publication number: 20130254201
    Abstract: A method for generating a tree-type data structure composed of a plurality of data strings includes the steps of: summing, with respect to a plurality of data strings classified in a parent node, the numbers of data types of data, respectively, at at least one given string position in each of the plurality of data strings; and classifying, based on the numbers of the data types respectively summed at the at least one given string position in the summing step, the plurality of data strings into a plurality of child nodes, for the respective data types at a given string position.
    Type: Application
    Filed: May 22, 2012
    Publication date: September 26, 2013
    Applicant: NINTENDO CO., LTD.
    Inventor: Minoru HATAMOTO
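The summing and classifying steps in the abstract above can be read as counting how many distinct data types (here, characters) occur at each candidate string position and then splitting the parent node by the value found at a chosen position. The sketch below assumes the position with the most distinct values is chosen. The abstract does not state the selection rule, so that choice and all names are hypothetical.

```python
from collections import defaultdict

def count_types(strings, pos):
    """Number of distinct characters found at position pos."""
    return len({s[pos] for s in strings if pos < len(s)})

def split_node(strings, positions):
    """Split a parent node's strings into child nodes keyed by one position."""
    # Assumed rule: pick the position with the most distinct characters.
    best = max(positions, key=lambda p: count_types(strings, p))
    children = defaultdict(list)
    for s in strings:
        key = s[best] if best < len(s) else ""  # strings too short share a node
        children[key].append(s)
    return best, dict(children)

position, children = split_node(["cat", "car", "dog", "dot"], range(3))
```

Applying `split_node` recursively to each child node would build out the rest of the tree.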
  • Publication number: 20130246431
    Abstract: A method is provided in one example and includes receiving sets of metadata elements and corresponding category information representing objects of a data storage location that are classified based on a category. The method further includes generating a summary of a subset of the classified objects and initiating a protection task for objects of the subset. In more specific embodiments, the protection task includes applying a remediation policy to the objects of the subset. Another protection task includes registering the objects of the subset. In other specific embodiments, the summary includes at least one of a total count and a total size of the objects in the subset. In yet other more specific embodiments, the method includes creating an Online Analytical Processing (OLAP) data structure to represent the sets of metadata elements and the corresponding category information with the summary of the subset being generated from the OLAP data.
    Type: Application
    Filed: December 27, 2011
    Publication date: September 19, 2013
    Inventors: Ratinder Paul Singh Ahuja, Bimalesh Jha, Nitin Maini, Sujata Patel, Ankit R. Jain, Damodar K. Hegde, Rajaram V. Nanganure, Avinash Vishnu Pawar
  • Publication number: 20130246422
    Abstract: A method in one example implementation includes obtaining a plurality of host file inventories corresponding respectively to a plurality of hosts, calculating input data using the plurality of host file inventories, and then providing the input data to a clustering procedure to group the plurality of hosts into one or more clusters of hosts. The method further includes each cluster of hosts being grouped using predetermined similarity criteria. In more specific embodiments, each of the host file inventories includes a set of one or more file identifiers with each file identifier representing a different executable software file on a corresponding one of the plurality of hosts. In other more specific embodiments, calculating the input data includes transforming the host file inventories into a matrix of keyword vectors in Euclidean space. In further embodiments, calculating the input data includes transforming the host file inventories into a similarity matrix.
    Type: Application
    Filed: September 12, 2010
    Publication date: September 19, 2013
    Inventors: Rishi Bhargava, David P. Reese, JR.
  • Publication number: 20130246423
    Abstract: A method in one embodiment includes determining a frequency range corresponding to a subset of a plurality of program files on a plurality of hosts in a network environment. The method also includes generating a first set of counts including a first count that represents an aggregate amount of program files in a first grouping of one or more program files of the subset, where each of the one or more program files of the first grouping includes a first value of a primary attribute. In specific embodiments, each program file is unknown. In further embodiments, the primary attribute is one of a plurality of file attributes provided in file metadata. Other specific embodiments include either blocking or allowing execution of each of the program files of the first grouping. More specific embodiments include determining a unique identifier corresponding to at least one program file of the first grouping.
    Type: Application
    Filed: January 24, 2011
    Publication date: September 19, 2013
    Inventors: Rishi Bhargava, David P. Reese, JR.
  • Publication number: 20130226925
    Abstract: A method for improving the usability of product feedback data can begin with the receipt of product feedback search parameters by an intelligent product feedback analytics tool. The product feedback search parameters can represent a product or a group of products. Product feedback search results having a rating value and/or textual feedback content can be obtained for the product feedback search parameters. For each product in the search results, a composite rating value can be synthesized from the rating values contained in the search results. For each product in the search results, the product feedback search results can be analyzed for analytic parameters using natural language processing techniques. An analytic parameter can represent a commonality within a subset of the search results. The product feedback search results, composite rating values, and analytic parameters can be presented within a user interface, providing context for the composite rating value.
    Type: Application
    Filed: February 29, 2012
    Publication date: August 29, 2013
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: LEE A. CARBONELL, TSZ SIMON CHENG, JEFFREY L. EDGINGTON, PANDIAN MARIADOSS
  • Publication number: 20130218907
    Abstract: Embodiments of the invention provide methods and apparatus for recommending items from a catalog of items to users in a population of users by generating trait vectors that represent items in the catalog responsive to explicit and/or implicit preference data for a group of less than all the users and using the trait vectors to recommend items to users in the population that are not in the group.
    Type: Application
    Filed: February 21, 2012
    Publication date: August 22, 2013
    Applicant: MICROSOFT CORPORATION
    Inventors: Nir Nice, Shahar Keren, Ori Folger, Ulrich Paquet, Shimon Shlevich, Noam Koenigstein, Eylon Yogev