Clustering Or Classification (epo) Patents (Class 707/E17.089)
  • Patent number: 11361028
    Abstract: A technique produces a graph data structure based on at least partially unstructured information dispersed over web documents. The technique involves applying a machine-trained model to a set of documents (or, more generally, “document units”) to identify topics in the documents. The technique then generates count information by counting the occurrences of the single topics and co-occurrences of pairings of topics in the documents. The technique generates conditional probability information based on the count information. An instance of conditional probability information describes a probability that a first topic will occur, given an appearance of a second topic, and a probability that the second topic will occur, given an appearance of the first topic. The technique then formulates the conditional probability information in a graph data structure. The technique also provides an application system that utilizes the graph data structure to provide any kind of computer-implemented service to a user.
    Type: Grant
    Filed: June 9, 2020
    Date of Patent: June 14, 2022
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Ziliu Li, Junaid Ahmed, Arnold Overwijk, Li Xiong, Xiao Liu
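The counting and conditional-probability steps in this abstract can be illustrated with a minimal sketch. It assumes the topics per document are already available (the machine-trained model is not reproduced), and the function name `build_topic_graph` and the sample data are hypothetical, not taken from the patent.

```python
from collections import Counter
from itertools import combinations

def build_topic_graph(doc_topics):
    """Return directed edges {(a, b): P(b | a)} from per-document topic sets.

    doc_topics is assumed to be the output of some topic tagger; the tagger
    itself is outside the scope of this sketch.
    """
    single = Counter()  # documents in which each topic occurs
    pair = Counter()    # documents in which each unordered pairing co-occurs
    for topics in doc_topics:
        topics = set(topics)
        single.update(topics)
        pair.update(frozenset(p) for p in combinations(sorted(topics), 2))

    graph = {}
    for key, n_ab in pair.items():
        a, b = sorted(key)
        graph[(a, b)] = n_ab / single[a]  # P(b | a)
        graph[(b, a)] = n_ab / single[b]  # P(a | b)
    return graph

docs = [
    {"python", "pandas"},
    {"python", "numpy"},
    {"python", "pandas", "numpy"},
    {"pandas"},
]
graph = build_topic_graph(docs)
```

Each edge weight is a conditional probability estimated from document counts, so the two directions between the same pair of topics generally differ.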
  • Patent number: 10585965
    Abstract: A determination device includes an image obtaining unit for obtaining an image in a linked area associated with a URL, a linked-to page obtaining unit for obtaining, from storing means for storing content, a linked-to page specified by the URL associated with the linked area, and a character determination unit for determining correctness of association between the linked area and the URL based on the image obtained by the image obtaining unit and the linked-to page obtained by the linked-to page obtaining unit.
    Type: Grant
    Filed: June 28, 2013
    Date of Patent: March 10, 2020
    Assignee: RAKUTEN, INC.
    Inventor: Yukiko Ochiai
  • Patent number: 10380226
    Abstract: Described herein are techniques for identifying and displaying key excerpts of a digital work and related key excerpts of other digital works. Key excerpts are identified by evaluating (a) the number of interactions by human readers within each of the key excerpts and (b) the number of reviews that reference each of the key excerpts. Related excerpts from other books can be identified by comparing the key excerpts of the other books. Excerpts can be displayed by subject, and links are provided to move from one subject to another.
    Type: Grant
    Filed: September 16, 2014
    Date of Patent: August 13, 2019
    Assignee: Amazon Technologies, Inc.
    Inventors: Walter Manching Tseng, Abhishek Patnia, Adam Joseph Iser, Christopher Michael Ellis, Alice Chu
  • Patent number: 9842175
    Abstract: The present invention provides a method and system for automatically identifying and selecting preferred classification and regression trees. The invention is used to identify a specific decision tree or group of trees that are consistent across train and test samples in node-specific details that are often important to decision makers. Specifically, for a tree to be identified as preferred by this system, the train and test samples must both agree on key measures for every terminal node of the tree. In addition to this node-by-node criterion, an additional tree selection method may be imposed. Accordingly, the train and test samples rank order the nodes on a relevant measure in the same way. Both consistency criteria may be applied in a fuzzy manner in which agreement must be close but need not be exact.
    Type: Grant
    Filed: January 4, 2008
    Date of Patent: December 12, 2017
    Assignee: Minitab, Inc.
    Inventors: Dan Steinberg, Nicholas Scott Cardell
  • Patent number: 8983961
    Abstract: A high availability system in a cloud computing environment includes a snapshot manager disposed in a mirror environment having at least one computer server and a plurality of virtual machines disposed in a production environment. Each of the plurality of virtual machines includes a snapshot agent configured to perform a method. The method includes periodically taking snapshots of the virtual machine associated with the snapshot agent, determining a delta image based on a change between a current snapshot and a previous snapshot, removing previous snapshots in the virtual machine and transmitting the delta image to the snapshot manager. The snapshot manager is configured to store a recovery image for each of the plurality of virtual machines and to merge the received delta image with the recovery image to update the recovery image.
    Type: Grant
    Filed: November 29, 2012
    Date of Patent: March 17, 2015
    Assignee: International Business Machines Corporation
    Inventors: Hoi Y. Chan, Trieu C. Chieu
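The delta-image idea in this abstract can be sketched as follows, under the simplifying assumption that a snapshot is a mapping from block identifiers to block contents. The function names `compute_delta` and `merge_delta` and the delta layout are hypothetical illustrations, not the patented format.

```python
def compute_delta(previous, current):
    """Agent side: describe how `current` differs from `previous`."""
    changed = {
        block: data
        for block, data in current.items()
        if block not in previous or previous[block] != data
    }
    removed = [block for block in previous if block not in current]
    return {"changed": changed, "removed": removed}

def merge_delta(recovery_image, delta):
    """Manager side: apply a delta to the stored recovery image."""
    merged = dict(recovery_image)
    merged.update(delta["changed"])
    for block in delta["removed"]:
        merged.pop(block, None)
    return merged

prev = {"b0": "aaa", "b1": "bbb", "b2": "ccc"}
cur = {"b0": "aaa", "b1": "BBB", "b3": "ddd"}
delta = compute_delta(prev, cur)
recovered = merge_delta(prev, delta)
```

Only the changed and removed blocks travel to the snapshot manager, which is what keeps the transfer small compared with sending full snapshots.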
  • Patent number: 8930366
    Abstract: A method and system for automatically ranking product reviews according to review helpfulness. Given a collection of reviews, the method employs an algorithm that identifies dominant terms and uses them to define a feature vector representation. Reviews are then converted to this representation and ranked according to their distance from a ‘locally optimal’ review vector. The algorithm is fully unsupervised and thus avoids costly and error-prone manual training annotations. In one embodiment a Multi Layer Lexical Model (MLLM) approach partitions the dominant lexical terms in a review into layers, creates a compact unified layers lexicon, and ranks the reviews according to their weight with respect to unified lexicon, all in a fully unsupervised manner. When used to rank book reviews, it was found that the invention significantly outperforms the user votes-based ranking employed by Amazon.
    Type: Grant
    Filed: January 11, 2009
    Date of Patent: January 6, 2015
    Assignee: Yissum Research Development Company of the Hebrew University of Jerusalem Limited
    Inventors: Ari Rappoport, Oren Tsur
  • Patent number: 8918275
    Abstract: With use of GPS, an action-history recording apparatus obtains latitudes and longitudes representing places of user's action where a user is acting, and stores action-history data containing place names indicating the places of user's action at a predetermined processing timing. In the case where the place of user's action is a specific place unique to the user, where the user visits customarily or frequently, the user is allowed to enter an arbitrary name independent of the latitude and longitude. The name entered by the user is used as a place name to be contained in action-history data. In this way, the apparatus obtains a place name appropriate for the user and the user can use the name conveniently as the place name of the user's action.
    Type: Grant
    Filed: February 6, 2012
    Date of Patent: December 23, 2014
    Assignee: Casio Computer Co., Ltd.
    Inventor: Naoyuki Sakazaki
  • Patent number: 8918397
    Abstract: A computer implemented method for clustering customers includes receiving a source set of customer records, wherein each customer record represents one customer, and each customer record includes at least one data attribute, and each data attribute has an attribute value; pre-processing the source set of customer records to generate a pre-processed set of customer records; executing a clustering algorithm on the pre-processed set of customer records to group the pre-processed set of customer records into clusters of a pre-defined number. The pre-processing comprises: determining the type of a customer in the source set of customer records; using a type attribute value to indicate the type of the customer in its customer record; normalizing data attribute values and type attribute values; weighting the data attribute values and the type attribute values respectively to obtain weighted attribute values of the data attribute and weighted attribute values of the type attribute.
    Type: Grant
    Filed: July 30, 2012
    Date of Patent: December 23, 2014
    Assignee: International Business Machines Corporation
    Inventors: Heng Cao, Jin Dong, Jacqueline Giang Huong Morris, Ming Xie, Wen Jun Yin, Bin Zhang
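The pre-processing and clustering pipeline in this abstract can be sketched roughly as below. The abstract does not name a normalization scheme or a clustering algorithm, so min-max scaling and a plain k-means with naive initialization are assumptions, as are the function names, weights, and sample records.

```python
def min_max(column):
    """Scale one attribute column to [0, 1] (assumed normalization)."""
    lo, hi = min(column), max(column)
    return [0.0 if hi == lo else (x - lo) / (hi - lo) for x in column]

def preprocess(records, weights):
    """Normalize every attribute, then apply a per-attribute weight."""
    columns = [min_max(col) for col in zip(*records)]
    return [[w * x for w, x in zip(weights, row)] for row in zip(*columns)]

def kmeans(points, k, iterations=20):
    """Plain k-means; the first k points seed the centroids (naive choice)."""
    centroids = [list(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iterations):
        labels = [
            min(range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centroids[c])))
            for p in points
        ]
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels

# Hypothetical records: (annual spend, visits, customer type code).
records = [(100, 2, 0), (120, 3, 0), (900, 20, 1), (950, 22, 1)]
weights = [1.0, 1.0, 2.0]  # the type attribute is weighted more heavily
labels = kmeans(preprocess(records, weights), k=2)
```

Weighting the type attribute more heavily pulls customers of the same type together even when their other attributes differ.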
  • Patent number: 8914372
    Abstract: A computer implemented method for clustering customers includes receiving a source set of customer records, wherein each customer record represents one customer, and each customer record includes at least one data attribute, and each data attribute has an attribute value; pre-processing the source set of customer records to generate a pre-processed set of customer records; executing a clustering algorithm on the pre-processed set of customer records to group the pre-processed set of customer records into clusters of a pre-defined number. The pre-processing comprises: determining the type of a customer in the source set of customer records; using a type attribute value to indicate the type of the customer in its customer record; normalizing data attribute values and type attribute values; weighting the data attribute values and the type attribute values respectively to obtain weighted attribute values of the data attribute and weighted attribute values of the type attribute.
    Type: Grant
    Filed: March 28, 2012
    Date of Patent: December 16, 2014
    Assignee: International Business Machines Corporation
    Inventors: Heng Cao, Jin Dong, Jacqueline Giang Huong Morris, Ming Xie, Wen Jun Yin, Bin Zhang
  • Patent number: 8903825
    Abstract: A method of classifying a plurality of documents. The method includes steps of providing a first set of classification terms and a second set of classification terms, the second set of classification terms being different from the first set of classification terms; generating a first frequency array of a number of occurrences of each term from the first set of classification terms in each document; generating a second frequency array of a number of occurrences of each term from the second set of classification terms in each document; generating a first similarity matrix from the first frequency array; generating a second similarity matrix from the second frequency array; determining an entrywise combination of the first similarity matrix and the second similarity matrix; and clustering the plurality of documents based on the result of the entrywise combination.
    Type: Grant
    Filed: May 23, 2012
    Date of Patent: December 2, 2014
    Assignee: NamesforLife LLC
    Inventors: Charles T. Parker, George M. Garrity
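The steps listed in this abstract (two frequency arrays, two similarity matrices, an entrywise combination, then clustering) can be sketched as follows. Cosine similarity, an entrywise product, and threshold-based connected components are assumptions chosen for illustration; the abstract does not fix any of them, and the documents and term sets are invented.

```python
import math

def frequency_array(docs, terms):
    """Count occurrences of each classification term in each document."""
    return [[doc.lower().split().count(t) for t in terms] for doc in docs]

def cosine_matrix(freq):
    """Pairwise cosine similarity between the rows of a frequency array."""
    def cos(u, v):
        dot = sum(a * b for a, b in zip(u, v))
        norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
        return dot / norm if norm else 0.0
    return [[cos(u, v) for v in freq] for u in freq]

def entrywise_product(a, b):
    """Combine two similarity matrices entry by entry."""
    return [[x * y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def cluster(sim, threshold=0.5):
    """Group documents linked by combined similarity >= threshold."""
    labels = [-1] * len(sim)
    next_label = 0
    for start in range(len(sim)):
        if labels[start] != -1:
            continue
        labels[start] = next_label
        stack = [start]
        while stack:
            i = stack.pop()
            for j in range(len(sim)):
                if labels[j] == -1 and sim[i][j] >= threshold:
                    labels[j] = next_label
                    stack.append(j)
        next_label += 1
    return labels

docs = [
    "bacillus subtilis soil spore",
    "bacillus cereus soil spore",
    "escherichia coli gut motile",
    "escherichia coli gut",
]
first_terms = ["bacillus", "escherichia"]
second_terms = ["soil", "gut", "spore", "motile"]
combined = entrywise_product(
    cosine_matrix(frequency_array(docs, first_terms)),
    cosine_matrix(frequency_array(docs, second_terms)),
)
labels = cluster(combined)
```

With a product as the combination, two documents stay similar only if both term sets agree, which is one plausible reason to combine the matrices entrywise.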
  • Patent number: 8843494
    Abstract: Using keywords to merge document clusters is described. Documents are distributed into document clusters that include a first document cluster of first documents and a second document cluster of second documents. A template associated with the first document cluster is created. The template includes keywords associated with most of the first documents. A distance is calculated between keyword location information associated with the template and word location information associated with a document in the second document cluster. The keyword location information includes information indicating a location of a keyword in the template relative to other keywords in the template. The word location information includes information indicating a location of a word in the document relative to other words in the document. A determination is made whether the distance is less than a threshold value.
    Type: Grant
    Filed: April 23, 2013
    Date of Patent: September 23, 2014
    Assignee: EMC Corporation
    Inventor: Steven Sampson
  • Patent number: 8788497
    Abstract: Interrelated items in a complex item set (such as a set of components in a complex software architecture) may be difficult to present in a manner that facilitates an understanding and evaluation of the item set, due to the amount of information and the difficulty in automatically discerning the organization of the item set. A set of criteria may be utilized to form criterion groups to which items matching respective criteria may be automatically assigned. Further grouping assignments may be achieved by identifying an ungrouped item that is associated with a grouped item. Such techniques may be applied in many variations to yield a representation of the item set, and a presentation of the item set to a user, that aggregates similar items and interrelationships, thereby promoting an understanding and analysis of the structure and organization of the item set while reducing the user involvement in the generation of same.
    Type: Grant
    Filed: September 15, 2008
    Date of Patent: July 22, 2014
    Assignee: Microsoft Corporation
    Inventors: Jean-Pierre Duplessis, Chris Lovett, Craig Symonds, Jacob Meyer, Scott Marison, Allen Denver, Tracey Trewin
  • Patent number: 8788498
    Abstract: Described is a technology for obtaining labeled sample data. Labeling guidelines are converted into binary yes/no questions regarding data samples. The questions and data samples are provided to judges who then answer the questions for each sample. The answers are input to a label assignment algorithm that associates a label with each sample based upon the answers. If the guidelines are modified and previous answers to the binary questions are maintained, at least some of the previous answers may be used in re-labeling the samples in view of the modification.
    Type: Grant
    Filed: June 15, 2009
    Date of Patent: July 22, 2014
    Assignee: Microsoft Corporation
    Inventors: Anitha Kannan, Krishnaram Kenthapadi, John C. Shafer, Ariel Fuxman
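A minimal sketch of the label-assignment idea in this abstract: judges' yes/no answers are stored per sample, and a rule list derived from the guidelines maps answers to labels. The rule format, question names, and labels here are hypothetical. Because the answers are kept, revised guidelines can re-label the samples without asking the judges again.

```python
def assign_label(answers, rules, default="unlabeled"):
    """Return the first label whose required answers all match."""
    for label, required in rules:
        if all(answers.get(q) == expected for q, expected in required.items()):
            return label
    return default

# Stored yes/no answers from judges (hypothetical questions).
answers = {
    "doc1": {"is_commercial": True, "mentions_price": True},
    "doc2": {"is_commercial": True, "mentions_price": False},
    "doc3": {"is_commercial": False, "mentions_price": False},
}

rules_v1 = [("shopping", {"is_commercial": True}), ("other", {})]
# Modified guidelines reuse the same stored answers.
rules_v2 = [
    ("shopping", {"is_commercial": True, "mentions_price": True}),
    ("brand", {"is_commercial": True}),
    ("other", {}),
]

labels_v1 = {doc: assign_label(a, rules_v1) for doc, a in answers.items()}
labels_v2 = {doc: assign_label(a, rules_v2) for doc, a in answers.items()}
```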
  • Patent number: 8776228
    Abstract: Systems and methods are provided for intrusion detection. The systems and methods may include receiving transaction information related to one or more current transactions between a client entity and a resource server, accessing a database storing a plurality of transaction groups, analyzing the received transaction information with respect to information related to at least one of the plurality of transaction groups, and based on said analyzing, determining a possibility of an occurrence of an intrusion act at the resource server. The transaction groups may be formed based on a plurality of past transactions between a plurality of client entities and the resource server. Identity information of a user associated with the one or more current transactions may also be received along with the transaction information. The user may be associated with at least one of the plurality of transaction groups.
    Type: Grant
    Filed: November 22, 2011
    Date of Patent: July 8, 2014
    Assignee: CA, Inc.
    Inventors: Ramesh Natarajan, Timothy Gordon Brown, Carrie Elaine Gates
  • Patent number: 8775401
    Abstract: The present application relates to a method for implementing picture search and a website server thereof.
    Type: Grant
    Filed: February 1, 2013
    Date of Patent: July 8, 2014
    Assignee: Alibaba Group Holding Limited
    Inventors: Chunyi Zhou, Weiwei Wang, Xinfeng Zhou, Yu Dong, Xiaoying Weng, Jialong Huang
  • Patent number: 8745728
    Abstract: Methods, apparatus, systems and computer program products are described and claimed that provide for automatically and positively determining that an associate accessing a business domain/application using an application-specific associate identifier is the same associate that is accessing another business domain/application using another application-specific associate identifier. Once the positive determination of same associate is made, a federated identifier key is generated and applied to all of the platforms in which the associate can be positively identified, so as to globally identify the associates across multiple enterprise-wide domains/applications. As such, the present invention eliminates the need to manually analyze associate data to determine if an associate interfacing with one domain/application is the same associate interfacing with another domain/application.
    Type: Grant
    Filed: May 10, 2012
    Date of Patent: June 3, 2014
    Assignee: Bank of America Corporation
    Inventors: Rangarajan Umamaheswaran, Bruce Wyatt Englar, Brett A. Nielson, Miroslav Halas
  • Publication number: 20140143254
    Abstract: Systems and methods can determine categories for product searches. One or more computing devices can receive a product query of search terms. The product query can be classified to identify a product category. The search terms may be verified against an ambiguous term list for the product category. The search terms may also be verified against an attribute list for the product category. The product query may be classified as fully understood in response to all of the search terms matching either the ambiguous term list or the attribute list for the product category. A product search may be performed on the product query. The product search may be informed by the product category when the product query has been classified as fully understood. Search results may be generated and returned according to the product search.
    Type: Application
    Filed: November 16, 2012
    Publication date: May 22, 2014
    Inventors: Ritendra Datta, Joshua Yelon, Thomas Walter Murphy
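The "fully understood" check in this abstract can be sketched as below, assuming the product category has already been identified by some classifier. The term lists and the function name `is_fully_understood` are hypothetical.

```python
def is_fully_understood(terms, category, ambiguous_terms, attribute_terms):
    """True when every search term is on one of the category's two lists."""
    known = ambiguous_terms[category] | attribute_terms[category]
    return all(term in known for term in terms)

# Hypothetical per-category lists.
ambiguous_terms = {"laptops": {"apple", "air"}}
attribute_terms = {"laptops": {"laptop", "15-inch", "ssd", "red"}}

understood = is_fully_understood(
    "apple laptop ssd".split(), "laptops", ambiguous_terms, attribute_terms
)
not_understood = is_fully_understood(
    "apple laptop cheap".split(), "laptops", ambiguous_terms, attribute_terms
)
```

Only a query classified as fully understood would then let the category inform the product search.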
  • Publication number: 20140136540
    Abstract: A system and method of determining the level of diversity for a search query are described. Distances between leaf categories in a hierarchical category tree are determined using co-click counts between the leaf categories for a query. Coordinate representations of the leaf categories are determined using the distances between the leaf categories. A diversity score for the query is determined using the coordinate representations. The diversity score represents a degree of variability in what different users find relevant to the query. In some embodiments, determining distances between leaf categories comprises determining the distances using a normalization of the co-click counts that uses co-impression counts between the leaf categories for the query. In some embodiments, a manifold learning algorithm is used to determine the coordinate representations. In some embodiments, multi-dimensional scaling is used to determine the coordinate representations.
    Type: Application
    Filed: November 9, 2012
    Publication date: May 15, 2014
    Applicant: eBay Inc.
    Inventors: Duangmanee Putthividhya, Zhaohui Chen
  • Publication number: 20140136537
    Abstract: A computing system determines incremental values associated with a plurality of clustering solutions. Each of the clustering solutions groups stores of a retailer into clusters in a different way. For each clustering solution in the plurality of clustering solutions, the incremental value associated with the clustering solution indicates a difference between an estimated revenue associated with the clustering solution and revenue associated with a baseline clustering solution. The computing system then determines, based on the incremental values associated with the plurality of clustering solutions, the appropriate number of clusters. The clustering solutions that group the stores into more or fewer clusters than the appropriate number of clusters tend to be associated with incremental values that are the same or lower than the clustering solutions that group the stores into the appropriate number of clusters.
    Type: Application
    Filed: November 15, 2012
    Publication date: May 15, 2014
    Applicant: Target Brands, Inc.
    Inventors: James Carl Nelson, Raja Ranganathan, Abhijit Sharma, Zachary George Sands
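The selection rule in this abstract can be sketched as follows. The revenue figures are invented, and breaking ties in favor of the smallest cluster count is an assumption; the abstract only says that other counts tend to have the same or lower incremental value.

```python
def appropriate_cluster_count(estimated_revenue, baseline_revenue):
    """Pick the cluster count with the highest incremental value.

    estimated_revenue maps a number of clusters to the estimated revenue of
    that clustering solution. Ties go to the smallest count (assumption).
    """
    incremental = {
        k: revenue - baseline_revenue for k, revenue in estimated_revenue.items()
    }
    best = max(incremental.values())
    chosen = min(k for k, value in incremental.items() if value == best)
    return chosen, incremental

estimated = {2: 1_050_000, 3: 1_120_000, 4: 1_180_000, 5: 1_180_000, 6: 1_150_000}
chosen, incremental = appropriate_cluster_count(estimated, baseline_revenue=1_000_000)
```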
  • Publication number: 20140122483
    Abstract: An activity-modeling system computes an amount of time that a user is expected to spend when performing activities of a certain type. During operation, the system can obtain a plurality of location events associated with the user, such that a respective location event indicates a time at which a user logged his location while performing an activity related to the activity type. The system selects, from the plurality of location events, a set of location events associated with the activity type. The system determines an activity start-time and an activity end-time for the activity type from the set of location events, and computes an activity-duration time for the activity type based on the determined activity start-time and the activity end-time.
    Type: Application
    Filed: October 26, 2012
    Publication date: May 1, 2014
    Applicant: PALO ALTO RESEARCH CENTER INCORPORATED
    Inventors: Rui Zhang, Robert R. Price, Oliver Brdiczka
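A rough sketch of the duration computation in this abstract. It assumes each location event is a (timestamp, activity type) pair, takes the earliest and latest event per day as the start and end times, and averages the daily durations. Those choices and the sample events are illustrative assumptions, not the patented method.

```python
from collections import defaultdict
from datetime import datetime

def expected_duration_minutes(events, activity_type):
    """Average daily span, in minutes, between first and last matching event."""
    by_day = defaultdict(list)
    for timestamp, kind in events:
        if kind == activity_type:
            by_day[timestamp.date()].append(timestamp)
    if not by_day:
        return None  # no events of this activity type
    durations = [
        (max(stamps) - min(stamps)).total_seconds() / 60
        for stamps in by_day.values()
    ]
    return sum(durations) / len(durations)

events = [
    (datetime(2012, 10, 1, 12, 0), "dining"),
    (datetime(2012, 10, 1, 13, 0), "dining"),
    (datetime(2012, 10, 1, 18, 0), "gym"),
    (datetime(2012, 10, 2, 12, 30), "dining"),
    (datetime(2012, 10, 2, 13, 0), "dining"),
]
dining_minutes = expected_duration_minutes(events, "dining")
```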
  • Publication number: 20140114972
    Abstract: Systems and methods for sharing information between distributed computer systems connected to one or more data networks. In particular, a replication system implementing methodologies for sharing database information between computer systems where the databases use different classification schemes for information access control is disclosed.
    Type: Application
    Filed: October 22, 2012
    Publication date: April 24, 2014
    Applicant: PALANTIR TECHNOLOGIES, INC.
    Inventors: Richard Allen Ducott, III, John Kenneth Garrod, Khan Tasinga
  • Publication number: 20140108460
    Abstract: Data stores that store content units and annotations regarding the content units derived through a semantic interpretation of the content units. When annotations are stored in a database, different parts of an annotation may be stored in different tables of the database. For example, one or more tables of the database may store all semantic classifications for the annotations, while one or more other tables may store content of all of the annotations. A user may be permitted to provide natural language queries for searching the database. A natural language query may be semantically interpreted to determine one or more annotations from the query. The semantic interpretation of the query may be performed using the same annotation model used to determine annotations stored in the database. Semantic classifications and format of the annotations for a query may be the same as one or more annotations stored in the database.
    Type: Application
    Filed: October 11, 2012
    Publication date: April 17, 2014
    Applicant: Nuance Communications, Inc.
    Inventors: Mariana Casella dos Santos, Frank Montyne
  • Publication number: 20140108410
    Abstract: A test case generation system includes a processor, a process residing on the processor and configured to extract descriptions from document artifacts, extract a first set of keywords from the descriptions, categorize the descriptions to a first set and a second set, extract a second set of keywords that occur in the second set and generate a test case from the second set of keywords.
    Type: Application
    Filed: October 17, 2012
    Publication date: April 17, 2014
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Futoshi Iwama, Ken Mizuno, Taiga Nakamura, Hironori Takeuchi
  • Publication number: 20140101162
    Abstract: A method for recommending semantic annotations on a main document and sub documents is provided. The method includes: extracting a keyword of the main document; extracting a keyword or a set of keywords of each sub document; and generating a keyword similarity or a set of keyword similarities for each of the sub documents based on a degree of similarity between the keyword of the main document and the keyword of each of the sub documents. The method also includes: obtaining a plurality of words appearing in each of the sub documents and calculating a frequency of each of the words; generating a semantic capacity of each of the sub documents according to the frequencies; grouping the main document and at least one of the sub documents into a semantic document set based on the semantic capacities and the keyword similarities; and annotating the main document according to the semantic document set.
    Type: Application
    Filed: October 9, 2012
    Publication date: April 10, 2014
    Applicant: INDUSTRIAL TECHNOLOGY RESEARCH INSTITUTE
    Inventors: Hsiang-Yuan Hsueh, Ko-Li Kan, Chi-Chou Chiang
  • Publication number: 20140095505
    Abstract: Systems and methods provide an intelligence platform for distributed processing of big data sets, including both structured and unstructured data types, across two or more intelligent data operation engine servers. The intelligent data operation engine servers can form a conceptual understanding of content in each electronic file and then cooperate with a distributed index handler to index the conceptual understanding of the electronic file. A query pipeline and the distributed index handler in the intelligence platform cooperate with the two or more intelligent data operation engine servers to improve scalability and performance on the big data sets containing both structured and unstructured electronic files represented in the common index.
    Type: Application
    Filed: October 1, 2012
    Publication date: April 3, 2014
    Applicant: LONGSAND LIMITED
    Inventors: Sean Mark Blanchflower, Darren John Gallagher
  • Publication number: 20140095503
    Abstract: A system and a method for initializing a streaming application are disclosed. The method may include initializing a streaming application for execution on one or more compute nodes which are adapted to execute one or more stream operators. The method may, during a compiling of code, identify whether a processing condition exists at a first stream operator of a plurality of stream operators. The method may add a grouping condition to a second stream operator of the plurality of stream operators if the processing condition exists. The method may provide for the second stream operator to group tuples for sending to the first stream operator.
    Type: Application
    Filed: September 28, 2012
    Publication date: April 3, 2014
    Applicant: International Business Machines Corporation
    Inventors: Michael J. Branson, Bradford L. Cobb, John M. Santosuosso
  • Publication number: 20140081973
    Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for classifying a spike in a rate of occurrence of events. One of the methods includes receiving data identifying a spike at a particular time in a rate of occurrence of events relating to a particular search query, where an event relating to the particular search query is a receipt event of the particular search query or an indexing event of a resource that satisfies the particular search query, fitting the occurrences of the events in a time window to a reference distribution of occurrences of events to determine a goodness of fit value, wherein the reference distribution models a random occurrence of events relating to search queries, comparing the goodness of fit value to a predetermined threshold, and classifying the spike as a spurious spike if the goodness of fit value satisfies the predetermined threshold.
    Type: Application
    Filed: September 14, 2012
    Publication date: March 20, 2014
    Applicant: Google Inc.
    Inventors: Mukund Jha, Kumar Mayur Thakur
  • Publication number: 20140081974
    Abstract: Systems and methods are provided for aggregating electronic content items that are relevant to one another. In one embodiment, a content management application determines that a first electronic content item and a second electronic content item are relevant to one another. The first electronic content item is provided by a first client account and the second electronic content item is provided by a second client account. The content management application also aggregates the first and second electronic content items to form at least part of a collection of electronic content. The first and second electronic content items are aggregated based on determining that the first and second electronic content items are relevant to one another. The content management application also provides access to the collection of electronic content.
    Type: Application
    Filed: September 18, 2012
    Publication date: March 20, 2014
    Applicant: Adobe Systems Incorporated
    Inventors: Jon Lorenz, Justin Velo
  • Publication number: 20140067807
    Abstract: A method performed on an electronic device for migrating tags across entities. The migration of the tags is performed following an analysis of one or more personal electronically encoded items associated with a previously created perspective or album associated with the previously created perspective, responsive to a user decision the creation of a new perspective, a new album associated with one of the previously created perspectives, or a new perspective and a new album associated with the new perspective, responsive to a user decision to treat the previously created perspective or album as an individual entity, and association of the previously created perspective or album with the new perspective or new album. The tags are respectively migrated from the new perspective or the new album to the associated previously created perspective or the previously created album and to associated ones of the one or more personal electronically encoded items.
    Type: Application
    Filed: August 31, 2012
    Publication date: March 6, 2014
    Applicant: RESEARCH IN MOTION LIMITED
    Inventors: Anand Ravindra OKA, Sean Bartholomew SIMMONS, Christopher Harris SNOW, Steven Michael HANOV, Ghasem NADDAFZADEH SHIRAZI
  • Publication number: 20140067816
    Abstract: In an effort to enhance computer user engagement with a search results page, systems and methods are presented which are configured to identify an entity as being the subject matter of a user's search query. If the entity is a known entity, i.e., entity information is stored in an entity store for the identified entity, a subset of entity attributes is identified and a representative entity attribute question is obtained for each of the attributes in the subset of entity attributes. The representative entity attribute questions are identified according to the probability that they are linguistically well formed. The representative entity attribute questions are included in a search results page that is generated in response to the user's search query.
    Type: Application
    Filed: August 29, 2012
    Publication date: March 6, 2014
    Applicant: MICROSOFT CORPORATION
    Inventors: Tapas Kanungo, Ashok Ponnuswami
  • Publication number: 20140058992
    Abstract: Techniques are described to characterize motion patterns of a group of agents engaging in an activity. An analysis system receives input data associated with spatial and temporal information of at least one element of interest associated with the activity, where the object of interest may be a ball, person, animal or any other object in motion. The analysis system partitions the input data into a plurality of spatiotemporal segments and generates one or more representations of one or more sets of segments of the plurality of spatiotemporal segments based on one or more criteria. The analysis system computes a metric, such as an entropy value, for each of the one or more representations. Partial tracing data, such as ball movements in a sporting event, may be created using an inexpensive input device, such as a tablet computer, making the disclosed techniques available for a wide range of events and activities.
    Type: Application
    Filed: August 21, 2012
    Publication date: February 27, 2014
    Inventors: Patrick Lucey, Alina Bialkowski, Iain Matthews, G. Peter Carr, Eric Foote
  • Publication number: 20140052730
    Abstract: Embodiments of the present invention provide a system, method, and program product for managing data sets. According to one aspect of the present invention, a data group of one or more related data sets is reorganized. Utilizing one or more specified criteria, data sets that should be cataloged in the data group are identified and cataloged in the data group such that they are arranged in a chronological order and are named with appropriate generation numbers.
    Type: Application
    Filed: August 14, 2012
    Publication date: February 20, 2014
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Eric J. Harris, Franklin E. McCune, Miguel A. Perez, Ryan J. Wisniewski
  • Publication number: 20140047384
    Abstract: Systems, methods, computer-readable media, and graphical user interfaces for facilitating integrated data capture with an item group key are provided. Integrated data capture workflows are initiated from within an electronic medical record (EMR). Selections of groups of items from the EMR are received. Item group keys are assigned to at least one item for the groups of items. Available data associated with the item group keys is gathered from the EMR. Selections of available data to include in case report forms are received. The case report forms are populated with the selections of available data and the item group keys.
    Type: Application
    Filed: August 8, 2012
    Publication date: February 13, 2014
    Applicant: CERNER INNOVATION, INC.
    Inventors: Jon Fewins, Ryan Moog, Marsha Laird-Maddox, Todd Jeffrey Reynolds, Brady Timmerberg, Nitish Amraji
  • Publication number: 20140046895
    Abstract: Data for a plurality of entities that can be offered a plurality of products can be obtained. The data can include categorical data and numeric data. Based on business constraints, some or all of the data can be selected. The selected data can be converted to another set of numeric data, wherein the categorical values are converted to numeric values. Dimensions of the converted data can be reduced to generate another set of data. Based on this reduced set of data, clusters of entities can be formed. The products can be grouped by assigning a unique product identifier of each product to a corresponding cluster. This grouping of products can be used by a predictive model to predict a likelihood of an entity to purchase a particular product in a future time period. Related methods, apparatus, systems, techniques and articles are also described.
    Type: Application
    Filed: August 10, 2012
    Publication date: February 13, 2014
    Inventors: Amit Sowani, Eeshan Malhotra, Shafi Ur Rahman
  • Publication number: 20140046947
    Abstract: A method for question/answer creation for a document is described. The method includes importing a document having a set of questions based on content in the document. The method also includes automatically creating a candidate question from the content in the document. The method also includes automatically generating answers for the set of questions and the candidate question using the content in the document. The method also includes presenting the set of questions, the candidate question, and the answers to a content creator for user verification of accuracy. The method also includes storing a verified set of questions in the document. The verified set of questions includes the candidate question.
    Type: Application
    Filed: August 9, 2012
    Publication date: February 13, 2014
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Jana H. Jenkins, David C. Steinmetz, Wlodek W. Zadrozny
  • Publication number: 20140040270
    Abstract: Method, apparatus, and computer-readable medium are provided for analyzing a document including text. In one example, a method for identifying patterns in a document is described. The method includes identifying a plurality of candidate phrases in the document based on candidate identification criteria, grouping the candidate phrases of the plurality of candidate phrases with a phrase family based on family criteria and comparison between candidate phrases of the plurality of candidate phrases to obtain consistent phrases, and, for remaining phrases not meeting all of the candidate identification criteria, associating at least one of the remaining phrases with a phrase family based on inconsistent phrase criteria to obtain inconsistent phrases. Identified in this manner, the inconsistent phrase may be displayed via a user interface to permit a user the opportunity to determine whether an inconsistent phrase requires modification.
    Type: Application
    Filed: July 31, 2012
    Publication date: February 6, 2014
    Applicant: Freedom Solutions Group, LLC, d/b/a Microsystems
    Inventors: Thomas O'Sullivan, Andrzej Jachowicz
  • Publication number: 20140040233
    Abstract: Methods, systems, and computer-readable and executable instructions are provided for organizing content. A method for organizing content can include building a customized content corpus for a user, building a concept graph customized for the user's context based on the customized corpus, and organizing, utilizing multi-view clustering, the content within the corpus based on the concept graph.
    Type: Application
    Filed: July 31, 2012
    Publication date: February 6, 2014
    Inventors: Mehmet Kivanc Ozonat, Claudio Bartolini
  • Publication number: 20140040263
    Abstract: The disclosure generally describes computer-implemented methods, software, and systems for search-, context-, and rule-based creation and runtime adaptation in dynamic workspaces. One computer-implemented method includes identifying a data artifact associated with each search result of at least one received search result, associating each identified data artifact with a module category of a plurality of module categories, injecting the identified artifacts into a content gallery, categorizing, by operation of at least one computer, the injected identified artifacts within the content gallery, presenting at least a subset of the injected identified artifacts on an enterprise workspace page associated with an enterprise workspace, and constructing a context associated with at least one of the enterprise workspace or the enterprise workspace page.
    Type: Application
    Filed: August 6, 2012
    Publication date: February 6, 2014
    Applicant: SAP Portals Israel Ltd.
    Inventors: Yahali Sherman, Vitaly Vainer
  • Patent number: 8635223
    Abstract: A system and method for providing a classification suggestion for electronically stored information is provided. A corpus of electronically stored information including reference electronically stored information items each associated with a classification and uncoded electronically stored information items are maintained. A cluster of uncoded electronically stored information items and reference electronically stored information items is provided. A neighborhood of reference electronically stored information items in the cluster is determined for at least one of the uncoded electronically stored information items. A classification of the neighborhood is determined using a classifier. The classification of the neighborhood is suggested as a classification for the at least one uncoded electronically stored information item.
    Type: Grant
    Filed: July 9, 2010
    Date of Patent: January 21, 2014
    Assignee: FTI Consulting, Inc.
    Inventor: William C. Knight
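    A minimal sketch of the neighborhood-based suggestion described in the abstract above. It assumes that items in a cluster are represented as numeric vectors, that the neighborhood is the k nearest reference items by Euclidean distance, and that the classifier is a simple majority vote; the patent may use a different neighborhood definition and classifier. All names and data are hypothetical.

```python
import math
from collections import Counter

def suggest_classification(uncoded_vec, references, k=3):
    """Suggest a label from the k nearest reference items in the same cluster.

    references: list of (vector, classification) pairs.
    """
    # Neighborhood: the k reference items closest to the uncoded item.
    nearest = sorted(references, key=lambda ref: math.dist(uncoded_vec, ref[0]))[:k]
    # Assumed classifier: majority vote over the neighborhood's labels.
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

# Hypothetical reference items that share a cluster with the uncoded item.
cluster_refs = [((0.9, 0.1), "privileged"), ((0.8, 0.2), "privileged"),
                ((0.1, 0.9), "non-responsive"), ((0.2, 0.7), "responsive")]
suggestion = suggest_classification((0.85, 0.15), cluster_refs)
```

    Here two of the three nearest reference items are labeled "privileged", so that label is suggested for the uncoded item.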
  • Publication number: 20140019451
    Abstract: A technique can include identifying a collection of documents to be clustered. The collection of documents can include foreign language documents and base language documents. The foreign language documents can be translated into the base language at a base language translation module. Keywords in the base language documents and keywords in the translated foreign language documents can be determined at a document indexing module. The base language documents can be clustered with the foreign language documents in a common set of document clusters based on the determined keywords in the base language documents and the determined keywords in the translated foreign language documents. In response to a search query in a first language, a listing of search results can be provided that includes documents in the first language and another language from a common document cluster.
    Type: Application
    Filed: July 16, 2012
    Publication date: January 16, 2014
    Applicant: GOOGLE INC.
    Inventor: Kirill Buryak
  • Publication number: 20140012848
    Abstract: Systems and methods for measuring similarity between a set of clusters and a set of object labels, wherein at least two of the object labels are related, receive a first set of clusters, wherein the first set of clusters was formed by clustering objects in a set of objects into clusters of the first set of clusters according to a clustering procedure; and calculate a similarity index between the first set of clusters and a set of object labels based at least in part on a relationship between two or more object labels in the set of object labels.
    Type: Application
    Filed: July 5, 2012
    Publication date: January 9, 2014
    Applicant: CANON KABUSHIKI KAISHA
    Inventors: Bradley Scott Denney, Dariusz T. Dusberger
  • Publication number: 20140006408
    Abstract: Example methods, apparatuses, or articles of manufacture are disclosed that may be implemented, in whole or in part, using one or more computing devices to facilitate or otherwise support one or more processes or operations for identifying points of interest in a text, such as in an unstructured text, for example, in connection with bootstrapping points of interest via social media.
    Type: Application
    Filed: June 29, 2012
    Publication date: January 2, 2014
    Applicant: Yahoo! Inc.
    Inventors: Adam Rae, Vanessa Murdock, Hugues Bouchard, Adrian Popescu
  • Publication number: 20140006401
    Abstract: Various technologies described herein pertain to classifying data in a main memory database system. A record access log can include a sequence of record access observations logged over a time period from a beginning time to an end time. Each of the record access observations can include a respective record ID and read timestamp. The record access log can be scanned in reverse from the end time towards the beginning time. Further, access frequency estimate data for records corresponding to record IDs read from the record access log can be calculated. The access frequency estimate data can include respective upper bounds and respective lower bounds of access frequency estimates for each of the records. Moreover, the records can be classified based on the respective upper bounds and the respective lower bounds of the access frequency estimates, such that K records can be classified as being frequently accessed records.
    Type: Application
    Filed: June 30, 2012
    Publication date: January 2, 2014
    Applicant: MICROSOFT CORPORATION
    Inventors: Justin Jon Levandoski, Per-Ake Larson
  • Publication number: 20130339354
    Abstract: A method and system for mining trends around trending terms. The method includes determining a plurality of articles, from one or more websites, in relation to a first entity for a time period. The first entity is a trending term. The method also includes generating comment clusters for the plurality of articles. Each comment cluster is generated for an associated article and includes a plurality of user comments. The method further includes extracting one or more entities from the plurality of user comments for each of the comment clusters, the one or more entities related to the first entity. Further, the method includes enabling selection of a second entity, from the one or more entities, by the user. Moreover, the method includes rendering one or more user comments corresponding to the first entity and the second entity for the time period. The system includes an electronic device, communication interface, memory, and processor.
    Type: Application
    Filed: June 14, 2012
    Publication date: December 19, 2013
    Applicant: YAHOO! INC.
    Inventors: Vidit Jain, Nikhil Rasiwasia
  • Publication number: 20130326346
    Abstract: The embodiments provide a cloud brainstorming service implemented on at least one cloud server. The brainstorming service includes a message service component configured to receive a plurality of ideas, over a network, from one or more users of devices. The users represent members of a brainstorming session. The brainstorming service also includes a brainstorming logic component configured to process the plurality of ideas and store the plurality of processed ideas in an in-memory database system, and a clustering component configured to retrieve the plurality of processed ideas from the in-memory database system and arrange the plurality of processed ideas into one or more clusters, where each cluster is a group of similar ideas. The message service component is configured to provide the plurality of processed ideas that are arranged into the one or more clusters, over the network, to the one or more users for display.
    Type: Application
    Filed: August 17, 2012
    Publication date: December 5, 2013
    Applicant: SAP AG
    Inventors: Zheren Zhu, Yongyuan Shen, Fu Zhao, Yingyu Chen, Bin Dong, Zheng Long Wei, Hui Wang
  • Publication number: 20130325862
    Abstract: Systems and methods are provided for large-scale, incrementing clustering. A plurality of processing nodes each include a processor and a non-transitory computer readable medium. The non-transitory computer readable medium stores a plurality of clusters of feature vectors and machine executable instructions for determining a plurality of values for a distance metric relating each of the plurality of clusters to an input feature vector and selecting a cluster having a best value for the distance metric. An arbitrator is configured to receive the selected cluster and best value for the distance metric from each of the plurality of processing nodes and determine a winning cluster as one of the selected clusters and a new cluster. A multiplexer is configured to receive the winning cluster and provide the winning cluster and a new input feature vector to each of the plurality of processing nodes.
    Type: Application
    Filed: June 4, 2012
    Publication date: December 5, 2013
    Inventor: Michael D. Black
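    The node-and-arbitrator arrangement in the abstract above can be sketched in a single process. This is only an illustration under stated assumptions: clusters are represented by centroids, the distance metric is Euclidean, a hypothetical distance threshold decides when a new cluster wins, and new clusters are placed on the least-loaded node. The multiplexer that feeds the next input vector to every node is not modeled.

```python
import math

class ProcessingNode:
    """Holds a share of the clusters (as centroids) and scores inputs locally."""

    def __init__(self, centroids):
        self.centroids = list(centroids)

    def best_match(self, vector):
        """Return (distance, index) of the closest local cluster, or None."""
        if not self.centroids:
            return None
        return min((math.dist(vector, c), i) for i, c in enumerate(self.centroids))

def arbitrate(nodes, vector, threshold):
    """Pick the winning cluster across nodes, or start a new cluster."""
    candidates = []
    for node_id, node in enumerate(nodes):
        match = node.best_match(vector)
        if match is not None:
            candidates.append((match[0], node_id, match[1]))
    if candidates:
        distance, node_id, cluster_id = min(candidates)
        if distance <= threshold:  # assumed rule for "close enough"
            return ("existing", node_id, cluster_id)
    # No cluster is close enough: place a new cluster on the least-loaded node.
    node_id = min(range(len(nodes)), key=lambda i: len(nodes[i].centroids))
    nodes[node_id].centroids.append(tuple(vector))
    return ("new", node_id, len(nodes[node_id].centroids) - 1)

nodes = [ProcessingNode([(0.0, 0.0), (5.0, 5.0)]), ProcessingNode([(10.0, 0.0)])]
first = arbitrate(nodes, (9.5, 0.5), threshold=1.0)
second = arbitrate(nodes, (20.0, 20.0), threshold=1.0)
```

    The first input falls within the threshold of the cluster held by the second node, so that cluster wins; the second input is far from everything, so a new cluster is created.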
  • Publication number: 20130325861
    Abstract: Embodiments of the invention relate to modeling an activity area associated with groups of data items. Tools are provided to profile activity area involvement, both from the data item and from associated participants. The data items are placed into clusters and one or more activity areas are derived from the formed clusters. Each activity area is defined from the perspective of a single user. Participants in an activity area are connected to a user, but not necessarily to each other. The combination of formations of clusters and activity areas provides a multi-faceted organization of connections between data items and associated participants.
    Type: Application
    Filed: May 31, 2012
    Publication date: December 5, 2013
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventor: Hongxia Jin
  • Publication number: 20130325849
    Abstract: Techniques for annotating an entity in a document corpus using cross-document signals. A method includes determining which documents in a document corpus mention an entity of interest, clustering the documents that mention an entity of interest according to a temporal signal, a structural signal and/or a content signal, thereby forming at least one cluster of documents, and annotating at least one document in the at least one cluster of documents by marking each occurrence of the entity in the at least one document.
    Type: Application
    Filed: August 16, 2012
    Publication date: December 5, 2013
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Sushovan De, Amit K. Singh, Karthik Visweswariah
  • Publication number: 20130318088
    Abstract: According to one embodiment of the present invention, classification of objects in a directory service may be managed. An object is identified in a directory service. Classification information associated with the object is received from a reference database. Using a processor, a rule that specifies a value that corresponds to the classification information is accessed. The accessed value is based on a power of two classification model. Using the processor, the class of service attribute is created using the value. The class of service attribute is associated with the object listed in the directory service using the processor.
    Type: Application
    Filed: May 22, 2012
    Publication date: November 28, 2013
    Applicant: Bank of America Corporation
    Inventor: Michael Edward Futty
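    One common reading of a "power of two" classification model is a bit-flag encoding, in which each category maps to a distinct power of two so that several categories can be packed into one integer attribute. The sketch below illustrates that reading only; the category names, and the choice to combine values with a bitwise OR, are assumptions and are not stated in the abstract above.

```python
from enum import IntFlag

class Classification(IntFlag):
    # Hypothetical categories; each value is a distinct power of two.
    EMPLOYEE = 1
    CONTRACTOR = 2
    PRIVILEGED = 4
    EXTERNAL = 8

def class_of_service(categories):
    """Combine category values into a single integer attribute."""
    value = Classification(0)
    for category in categories:
        value |= category
    return int(value)

cos = class_of_service([Classification.EMPLOYEE, Classification.PRIVILEGED])
has_privileged = bool(cos & Classification.PRIVILEGED)
has_contractor = bool(cos & Classification.CONTRACTOR)
```

    Because every category occupies its own bit, the combined value can be tested for any single category with a bitwise AND.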
  • Publication number: 20130311473
    Abstract: A method for dynamically clustering data items, the method comprising: receiving a plurality of data items originating from at least two sources, a plurality of distinct metadata details, and data indicative of associations between the data items and the metadata details, wherein each data item is associated with at least one metadata detail indicative of its owner, and wherein at least a first data item originating from a first source and a second data item originating from a second source are related data items associated with at least one shared metadata detail; grading probabilities of relationships between at least one of the data items and at least one of the metadata details; clustering the data items into one or more clusters, based on the calculated probabilities; and, optionally, sharing clusters and meta-clusters between users.
    Type: Application
    Filed: May 21, 2012
    Publication date: November 21, 2013
    Applicant: SPHEREUP LTD.
    Inventors: Yevgeny Safovich, Ronen Abramov, Natan Chosnek