Patents Assigned to PureDiscovery Corporation
-
Patent number: 8788516
Abstract: A method includes determining a plurality of social interactions associated with a plurality of people, generating a social object matrix using the determined social interactions, and generating a social brain by performing Singular Value Decomposition (SVD) on the social object matrix. The method further includes determining text from the social objects of the determined social interactions, generating a term-document matrix (TDM) using the determined text, generating a semantic brain by performing SVD on the TDM, generating an index using the determined text, and performing a query using the social brain, the semantic brain, and the index. The social brain is a singular value representation of the social object matrix and the semantic brain is a singular value representation of the TDM. Each social interaction is a particular person interacting with a particular social object.
Type: Grant
Filed: March 15, 2013
Date of Patent: July 22, 2014
Assignee: PureDiscovery Corporation
Inventor: Paul A. Jakubik
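As a rough illustration of the matrix step in this abstract, the sketch below builds a small person-by-object matrix from interaction pairs and takes a truncated SVD of it. The sample data, the choice of rows for people and columns for social objects, the use of interaction counts as cell values, and the names `build_matrix` and `truncated_svd` are all assumptions made for illustration; the abstract does not specify them.

```python
import numpy as np

# Hypothetical interactions: each pair is (person, social object).
interactions = [
    ("alice", "doc1"), ("alice", "doc2"),
    ("bob", "doc1"), ("bob", "doc3"),
    ("carol", "doc2"), ("carol", "doc3"), ("carol", "doc3"),
]


def build_matrix(pairs):
    """Count interactions into a people x objects matrix (layout is an assumption)."""
    people = sorted({p for p, _ in pairs})
    objects = sorted({o for _, o in pairs})
    matrix = np.zeros((len(people), len(objects)))
    for person, obj in pairs:
        matrix[people.index(person), objects.index(obj)] += 1.0
    return people, objects, matrix


def truncated_svd(matrix, k):
    """Keep the k largest singular values and their singular vectors."""
    u, s, vt = np.linalg.svd(matrix, full_matrices=False)
    return u[:, :k], s[:k], vt[:k, :]


people, objects, social_matrix = build_matrix(interactions)
u, s, vt = truncated_svd(social_matrix, k=2)
```

The same two functions could be applied to a term-document matrix to obtain the second decomposition the abstract mentions; how the two results and the index are combined at query time is not described in the abstract and is not sketched here.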
-
Patent number: 8639496
Abstract: A method includes accessing text that includes a plurality of words, tagging each of the plurality of words with one of a plurality of parts of speech (POS) tags, and creating a plurality of tokens, each token comprising one of the plurality of words and its associated POS tag. The method further includes clustering one or more of the created tokens into a chunk of tokens, the one or more tokens clustered into the chunk of tokens based on the POS tags of the one or more tokens, and forming a phrase based on the chunk of tokens, the phrase comprising the words of the one or more tokens clustered into the chunk of tokens.
Type: Grant
Filed: January 2, 2013
Date of Patent: January 28, 2014
Assignee: PureDiscovery Corporation
Inventor: Paul A. Jakubik
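The token, chunk, and phrase steps described above can be sketched in a few lines. This is a minimal illustration, not the patented method: the input is assumed to be already tagged (a real system would run a POS tagger first), and the rule that consecutive adjective and noun tags form a chunk is an assumption, as are the names `CHUNK_TAGS` and `form_phrases`.

```python
# Hypothetical pre-tagged tokens: (word, POS tag) pairs using Penn Treebank-style tags.
tagged = [
    ("the", "DT"), ("semantic", "JJ"), ("brain", "NN"), ("indexes", "VBZ"),
    ("large", "JJ"), ("document", "NN"), ("sets", "NNS"),
]

# Assumption: runs of adjectives and nouns are clustered into one chunk.
CHUNK_TAGS = {"JJ", "NN", "NNS"}


def form_phrases(tokens, chunk_tags=CHUNK_TAGS):
    """Group consecutive tokens whose tag is in chunk_tags and join their words."""
    phrases, chunk = [], []
    for word, tag in tokens:
        if tag in chunk_tags:
            chunk.append(word)
        elif chunk:
            # A non-matching tag closes the current chunk.
            phrases.append(" ".join(chunk))
            chunk = []
    if chunk:
        phrases.append(" ".join(chunk))
    return phrases


phrases = form_phrases(tagged)
print(phrases)  # ['semantic brain', 'large document sets']
```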
-
Patent number: 8635225
Abstract: A method includes accessing a set of documents and a set of representative documents, determining distances from each document to a nearest representative document, and selecting a subset of documents using an algorithm for choosing initial seed values and the determined distances to the nearest representative document. The method further includes repeating the following for each particular document of the subset of documents: adding the particular document to the set of representative documents to create a new set of representative documents, removing the particular document of documents from the set of documents to create a new set of documents, and calculating a sum of distances from each document of the new set of documents to a nearest document in the new set of representative documents. The particular document of the subset that resulted in the lowest sum of distances is selected as a new representative document.
Type: Grant
Filed: March 14, 2013
Date of Patent: January 21, 2014
Assignee: PureDiscovery Corporation
Inventor: Paul A. Jakubik
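The selection loop in this abstract can be sketched as follows. Several details are assumptions: documents are represented as 2-D points, distance is Euclidean, and the "algorithm for choosing initial seed values" is taken to be squared-distance weighted sampling in the style of k-means++ seeding, which the abstract does not name. The function names are hypothetical.

```python
import math
import random


def nearest_distance(doc, reps):
    """Distance from one document to its nearest representative."""
    return min(math.dist(doc, r) for r in reps)


def sample_candidates(docs, reps, n, rng):
    """Draw up to n candidates, weighted by squared distance to the nearest
    representative (a k-means++-style seeding rule, used here as an assumption)."""
    weights = [nearest_distance(d, reps) ** 2 for d in docs]
    picked = rng.choices(docs, weights=weights, k=n)
    return list(dict.fromkeys(picked))  # drop repeats, keep order


def pick_representative(docs, reps, candidates):
    """Try each candidate as a new representative; keep the one with the lowest
    sum of distances from the remaining documents to their nearest representative."""
    best, best_cost = None, math.inf
    for cand in candidates:
        new_reps = reps + [cand]
        new_docs = [d for d in docs if d != cand]
        cost = sum(nearest_distance(d, new_reps) for d in new_docs)
        if cost < best_cost:
            best, best_cost = cand, cost
    return best, best_cost


# Hypothetical data: one existing representative and four other documents.
reps = [(0.0, 0.0)]
docs = [(0.0, 1.0), (10.0, 10.0), (10.0, 11.0), (11.0, 10.0)]

rng = random.Random(0)
candidates = sample_candidates(docs, reps, 3, rng)
best, cost = pick_representative(docs, reps, candidates)
```

Sampling only a few candidates keeps the number of full distance sums small, which appears to be the point of combining a seeding rule with an exhaustive comparison over the subset.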
-
Publication number: 20130158979
Abstract: A method includes accessing text that includes a plurality of words, tagging each of the plurality of words with one of a plurality of parts of speech (POS) tags, and creating a plurality of tokens, each token comprising one of the plurality of words and its associated POS tag. The method further includes clustering one or more of the created tokens into a chunk of tokens, the one or more tokens clustered into the chunk of tokens based on the POS tags of the one or more tokens, and forming a phrase based on the chunk of tokens, the phrase comprising the words of the one or more tokens clustered into the chunk of tokens.
Type: Application
Filed: December 14, 2011
Publication date: June 20, 2013
Applicant: PUREDISCOVERY CORPORATION
Inventor: Paul A. Jakubik
-
Publication number: 20130159313
Abstract: A method includes accessing text, identifying a plurality of terms from the text, determining a plurality of term vectors associated with the identified plurality of terms, and clustering the determined plurality of term vectors into a plurality of clusters, the plurality of clusters comprising a first and a second cluster, the first and second clusters each comprising two or more of the determined term vectors. The method further includes creating a first pseudo-document according to the first cluster, creating a second pseudo-document according to the second cluster, identifying a first set of terms associated with the first cluster using latent semantic analysis (LSA) of the first pseudo-document, identifying a second set of terms associated with the second cluster using LSA of the second pseudo-document, and combining the first and second sets of terms into a list of output terms.
Type: Application
Filed: December 14, 2011
Publication date: June 20, 2013
Applicant: PUREDISCOVERY CORPORATION
Inventor: Paul A. Jakubik
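A loose sketch of the cluster, pseudo-document, and term-lookup flow is given below. It makes strong simplifying assumptions: the term vectors are hand-written 2-D values standing in for vectors from an existing LSA space, clustering is a greedy cosine-threshold rule, a pseudo-document is the sum of its cluster's term vectors, and the LSA step is reduced to a nearest-term lookup by cosine similarity. None of these choices, nor the names `VOCAB`, `cluster_terms`, and `related_terms`, come from the abstract.

```python
import numpy as np

# Hypothetical term vectors, assumed to come from a previously built LSA space.
VOCAB = {
    "engine": np.array([0.9, 0.1]),
    "motor": np.array([0.8, 0.2]),
    "piston": np.array([0.95, 0.05]),
    "court": np.array([0.1, 0.9]),
    "judge": np.array([0.2, 0.8]),
    "verdict": np.array([0.05, 0.95]),
}


def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))


def cluster_terms(terms, threshold=0.8):
    """Greedy clustering: join the first cluster whose centroid is similar enough,
    otherwise start a new cluster (an assumed stand-in for the clustering step)."""
    clusters = []
    for term in terms:
        for cluster in clusters:
            centroid = np.mean([VOCAB[t] for t in cluster], axis=0)
            if cosine(VOCAB[term], centroid) >= threshold:
                cluster.append(term)
                break
        else:
            clusters.append([term])
    return clusters


def related_terms(cluster, n=1):
    """Build a pseudo-document vector for the cluster and return the n closest
    vocabulary terms that are not already in the cluster."""
    pseudo_doc = np.sum([VOCAB[t] for t in cluster], axis=0)
    candidates = [t for t in VOCAB if t not in cluster]
    candidates.sort(key=lambda t: cosine(VOCAB[t], pseudo_doc), reverse=True)
    return candidates[:n]


input_terms = ["engine", "motor", "court", "judge"]
clusters = cluster_terms(input_terms)
output_terms = [t for c in clusters for t in related_terms(c)]
print(output_terms)  # ['piston', 'verdict']
```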
-
Patent number: 8312034
Abstract: A concept bridge employable with a search engine, method of operating the same and computer information system employing the concept bridge and method. In one embodiment, the concept bridge includes an extractor configured to derive concept terms by extracting significant terms from search text and inferring relevant terms therefrom. The concept bridge also includes a query generator configured to generate a query consistent with an index of a search engine as a function of the concept terms.
Type: Grant
Filed: June 21, 2006
Date of Patent: November 13, 2012
Assignee: PureDiscovery Corporation
Inventors: David Adam Hagar, Stephen Scott Jernigan, David Seigert Copps
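The extractor and query generator can be pictured with a toy example. Everything concrete here is an assumption: significance is approximated by stopword removal, relevant terms are inferred from a hand-written lookup table, and the target search engine is assumed to accept an `OR`-joined keyword query. The abstract does not describe how any of these steps are actually performed.

```python
# Assumption: words in this set carry no significance for search.
STOPWORDS = {"the", "a", "of", "for", "and", "in", "to"}

# Hypothetical relatedness table standing in for the inference step.
RELATED = {
    "car": ["automobile", "vehicle"],
    "repair": ["maintenance"],
}


def extract_concept_terms(text):
    """Keep non-stopword terms, then append terms inferred from them."""
    significant = [w for w in text.lower().split() if w not in STOPWORDS]
    inferred = [r for w in significant for r in RELATED.get(w, [])]
    return list(dict.fromkeys(significant + inferred))  # dedupe, keep order


def generate_query(terms):
    """Render concept terms in an assumed OR-based keyword query syntax."""
    return " OR ".join(terms)


query = generate_query(extract_concept_terms("repair of the car"))
print(query)  # repair OR car OR maintenance OR automobile OR vehicle
```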
-
Publication number: 20100114890
Abstract: A computerized method of querying an array of vectors includes receiving a first matrix, partitioning the first matrix into a plurality of subset matrices, and processing each subset matrix with a natural language analysis process to create a plurality of processed subset matrices. The first matrix includes a first plurality of terms and represents one or more data objects to be queried, each subset matrix includes similar vectors from the first matrix, and each processed subset matrix relates terms in each subset matrix to each other.
Type: Application
Filed: October 31, 2008
Publication date: May 6, 2010
Applicant: PureDiscovery Corporation
Inventors: David A. Hagar, Paul A. Jakubik, Stephen S. Jernigan
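One way to picture the partition-then-process idea is sketched below. The assumptions are substantial: rows of the matrix are treated as non-zero document vectors over terms, similarity is cosine similarity to a set of seed rows, and the "natural language analysis process" is taken to be a truncated SVD of each subset. The abstract names none of these, and `partition_rows` and `process_subset` are hypothetical helpers.

```python
import numpy as np


def partition_rows(matrix, seeds):
    """Assign each row to the seed vector it is most similar to (cosine).
    Assumes no row is all zeros."""
    unit = matrix / np.linalg.norm(matrix, axis=1, keepdims=True)
    seed_unit = seeds / np.linalg.norm(seeds, axis=1, keepdims=True)
    labels = np.argmax(unit @ seed_unit.T, axis=1)
    return [matrix[labels == i] for i in range(len(seeds))]


def process_subset(subset, k):
    """Truncated SVD of one subset matrix (an assumed analysis step)."""
    u, s, vt = np.linalg.svd(subset, full_matrices=False)
    k = min(k, len(s))
    return u[:, :k], s[:k], vt[:k, :]


# Hypothetical matrix: 4 document vectors over 4 terms.
first_matrix = np.array([
    [2.0, 1.0, 0.0, 0.0],
    [1.0, 2.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 3.0],
    [0.0, 0.0, 2.0, 1.0],
])
seeds = first_matrix[[0, 2]]  # assumption: rows 0 and 2 act as partition seeds

subsets = partition_rows(first_matrix, seeds)
processed = [process_subset(sub, k=1) for sub in subsets]
```

Decomposing several small matrices is typically cheaper than decomposing one large matrix, which is a plausible motivation for partitioning before analysis, though the abstract does not state one.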