Abstract: According to various embodiments, a method for topic modeling using unstructured manufacturing data is provided. The method comprises receiving an unstructured data set of operator-generated data. The unstructured data set includes data items from a first source and a second source. Next, a plurality of keywords and key phrases corresponding to key topics from a plurality of operators is extracted from the data items. Next, operators in the plurality of operators are labeled with the key topics corresponding to the keywords and key phrases. Then, a graph connecting the plurality of operators is generated. Then, a need by a first operator in the plurality of operators is identified with regard to a specific key topic. Next, a second operator labeled with the specific key topic is discovered using the graph. Finally, the first operator is automatically connected to the second operator.
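The labeling and graph-lookup steps of this abstract can be illustrated with a minimal Python sketch. It is not the claimed implementation: the keyword lexicon `TOPIC_KEYWORDS`, the operator names, the way edges between operators are supplied, and the choice of breadth-first search to "discover" the second operator are all assumptions made for illustration, since the abstract does not specify them.

```python
from collections import deque

# Hypothetical keyword -> topic lexicon; the abstract does not say how
# key topics are derived from keywords and key phrases.
TOPIC_KEYWORDS = {
    "spindle": "cnc_maintenance",
    "coolant": "cnc_maintenance",
    "weld": "welding",
    "porosity": "welding",
}


def label_operators(notes):
    """Label each operator with the topics whose keywords appear in their notes."""
    labels = {}
    for operator, texts in notes.items():
        topics = set()
        for text in texts:
            for word in text.lower().split():
                topic = TOPIC_KEYWORDS.get(word.strip(".,;:!?"))
                if topic:
                    topics.add(topic)
        labels[operator] = topics
    return labels


def build_graph(edges):
    """Build an undirected adjacency map from (operator, operator) pairs."""
    graph = {}
    for a, b in edges:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set()).add(a)
    return graph


def find_expert(graph, labels, start, topic):
    """Breadth-first search for the nearest other operator labeled with `topic`."""
    seen = {start}
    queue = deque([start])
    while queue:
        current = queue.popleft()
        if current != start and topic in labels.get(current, set()):
            return current
        for neighbor in sorted(graph.get(current, ())):
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append(neighbor)
    return None


# Illustrative data only: free-text notes from two hypothetical sources.
notes = {
    "alice": ["Replaced spindle bearing.", "Coolant level low"],
    "bob": ["Weld porosity on line 3"],
    "carol": ["Packed pallets"],
}
labels = label_operators(notes)
graph = build_graph([("carol", "bob"), ("bob", "alice")])

# carol needs help with CNC maintenance; the search walks the graph to alice.
match = find_expert(graph, labels, "carol", "cnc_maintenance")
print(match)
```

Breadth-first search is used here so that the closest labeled operator in the graph is preferred; any other graph query that returns an operator labeled with the requested topic would fit the abstract equally well.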
Abstract: According to various embodiments, a method for automatic unstructured data analysis of medical data is provided. The method comprises receiving an unstructured data set corresponding to medical data. The unstructured data set includes data items from a first source and a second source. The method includes extracting, from the unstructured data set, a plurality of keywords and key phrases corresponding to a clinical profile. Next, a vector is generated from the first source and the second source. The vector includes vector elements and corresponds to the clinical profile. Next, the vector elements are normalized for comparison with predetermined clinical trial criteria. Finally, vectors that meet the predetermined clinical trial criteria are automatically identified.
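The extract, vectorize, normalize, and match sequence described in this abstract might look roughly like the following Python sketch. Everything specific in it is an assumption: the feature names in `FEATURES`, the scaling ranges in `RANGES`, the naive regular-expression extraction, min-max scaling as the normalization, and the example criteria are hypothetical and are not taken from the source.

```python
import re

# Hypothetical clinical-profile features and scaling ranges (assumptions).
FEATURES = ("age", "hba1c", "bmi")
RANGES = {"age": (0, 100), "hba1c": (4, 14), "bmi": (10, 50)}


def extract(text):
    """Naively pull the first number that follows each feature keyword."""
    found = {}
    for name in FEATURES:
        m = re.search(rf"\b{name}\b\D*?(\d+(?:\.\d+)?)", text, re.IGNORECASE)
        if m:
            found[name] = float(m.group(1))
    return found


def build_vector(source_a, source_b):
    """Combine both sources into one vector; the second source wins on conflicts."""
    merged = {**extract(source_a), **extract(source_b)}
    return [merged.get(name) for name in FEATURES]


def scale(name, value):
    """Min-max scale a raw value using the assumed range for that feature."""
    lo, hi = RANGES[name]
    return (value - lo) / (hi - lo)


def normalize(vector):
    """Scale each present element; missing elements stay None."""
    return [None if v is None else scale(n, v) for n, v in zip(FEATURES, vector)]


def meets(normalized, criteria):
    """True if every constrained element lies inside its normalized bounds."""
    for name, value in zip(FEATURES, normalized):
        if name in criteria:
            lo, hi = criteria[name]
            if value is None or not (lo <= value <= hi):
                return False
    return True


# Hypothetical trial criteria in raw units, normalized the same way as the vectors.
raw_criteria = {"age": (18, 65), "hba1c": (7.0, 10.0)}
criteria = {n: (scale(n, lo), scale(n, hi)) for n, (lo, hi) in raw_criteria.items()}

# Illustrative records: (first source text, second source text) per patient.
patients = {
    "p1": ("Age 54, BMI of 31.5", "Lab: HbA1c: 8.1"),
    "p2": ("Age 72", "HbA1c 6.2"),
}
eligible = [
    pid for pid, (a, b) in patients.items()
    if meets(normalize(build_vector(a, b)), criteria)
]
print(eligible)
```

Normalizing both the profile vectors and the criteria with the same function keeps the comparison consistent; a real system would need far more robust extraction than this keyword-plus-number pattern.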
Abstract: According to various embodiments, a method for extracting targeted data using unstructured data is provided. The method comprises: receiving an unstructured data set, the unstructured data set including data items from a first source and a second source; generating a first vector from the first source and a second vector from the second source, each vector including data items in the unstructured data set; merging the first and second vectors to form a merged vector; performing clustering, using a clustering algorithm, on the merged vector in order to produce a deepness measure and a degree measure for each data item in the merged vector; generating a score for each data item in the merged vector using the deepness measure and the degree measure of each data item; and ranking each data item based on its generated score.
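The merge, cluster, score, and rank pipeline in this abstract can be sketched in Python as follows. The abstract does not define the clustering algorithm, the deepness measure, or the degree measure, so the sketch substitutes stand-ins that are purely assumptions: items are linked when their token-set Jaccard similarity reaches a threshold, clusters are the connected components of those links, "degree" is the number of direct links an item has, "deepness" is the size of the cluster containing the item, and the score is a weighted sum of the two.

```python
from itertools import combinations


def jaccard(a, b):
    """Token-set Jaccard similarity between two text items."""
    a, b = set(a.lower().split()), set(b.lower().split())
    return len(a & b) / len(a | b) if a | b else 0.0


def cluster_and_rank(first_vector, second_vector, threshold=0.5,
                     w_deep=1.0, w_degree=1.0):
    # Merge the two vectors, dropping exact duplicates but keeping order.
    merged = list(dict.fromkeys(first_vector + second_vector))
    parent = list(range(len(merged)))

    def find(i):
        # Union-find root lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # Assumed degree measure: number of items similar enough to this one.
    degree = [0] * len(merged)
    for i, j in combinations(range(len(merged)), 2):
        if jaccard(merged[i], merged[j]) >= threshold:
            degree[i] += 1
            degree[j] += 1
            parent[find(i)] = find(j)  # put both items in the same cluster

    # Assumed deepness measure: size of the cluster the item belongs to.
    sizes = {}
    for i in range(len(merged)):
        root = find(i)
        sizes[root] = sizes.get(root, 0) + 1

    scored = []
    for i, item in enumerate(merged):
        deepness = sizes[find(i)]
        scored.append((w_deep * deepness + w_degree * degree[i], item))
    # Highest score first; ties broken alphabetically for determinism.
    return sorted(scored, key=lambda pair: (-pair[0], pair[1]))


# Illustrative data items from two hypothetical sources.
first = ["pump seal leak", "pump seal worn", "invoice overdue"]
second = ["seal leak on pump", "invoice overdue"]
ranking = cluster_and_rank(first, second)
for score, item in ranking:
    print(score, item)
```

With these stand-in definitions, items that sit in large clusters and have many direct links rise to the top, while isolated items sink; different definitions of deepness and degree would change the ranking.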