Patents by Inventor Kyuseok Shim
Kyuseok Shim has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Patent number: 11016995
Abstract: A method of performing K-means clustering by a data mining system is provided. The method includes generating a plurality of initial buckets by dividing data including a plurality of points each being expressed in coordinate information, reflecting a count noise in a number of points included in each of the initial buckets and then generating a plurality of new buckets by dividing at least one initial bucket among the initial buckets based on a first threshold and a second threshold, generating a plurality of final buckets from the plurality of initial buckets or the plurality of new buckets, generating a histogram including section information for each of the final buckets and a number of points included in each of the final buckets in which the count noise is reflected, and performing K-means clustering on the histogram based on a number of clusters.
Type: Grant
Filed: May 8, 2019
Date of Patent: May 25, 2021
Assignee: SEOUL NATIONAL UNIVERSITY R&B FOUNDATION
Inventors: Seog Park, Kyuseok Shim, Hanjun Goo, Woohwan Jung, Seongwoong Oh, Suyong Kwon
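The final step of the abstract above, clustering a histogram rather than raw points, can be illustrated with a weighted Lloyd iteration. This is only a minimal sketch under stated assumptions, not the patented method: the data is one-dimensional, each bucket is represented by its center, the counts are assumed to already carry the count noise, and the bucket-splitting step with its two thresholds is not modeled. The function name `weighted_kmeans_1d` and its initialization scheme are hypothetical.

```python
from typing import List


def weighted_kmeans_1d(centers: List[float], counts: List[float],
                       k: int, iters: int = 20) -> List[float]:
    """Cluster histogram buckets, weighting each bucket center by its count.

    Assumption: `counts` are already noisy; negative values are clamped to 0.
    """
    # Hypothetical initialization: pick k bucket centers spread over the range.
    step = max(1, len(centers) // k)
    means = [centers[min(i * step, len(centers) - 1)] for i in range(k)]
    for _ in range(iters):
        sums = [0.0] * k
        weights = [0.0] * k
        for c, w in zip(centers, counts):
            w = max(w, 0.0)  # a noisy count may fall below zero
            nearest = min(range(k), key=lambda j: abs(c - means[j]))
            sums[nearest] += w * c
            weights[nearest] += w
        # Keep the old mean when a cluster received no weight.
        means = [sums[j] / weights[j] if weights[j] > 0 else means[j]
                 for j in range(k)]
    return sorted(means)


bucket_centers = [0.5, 1.5, 2.5, 8.5, 9.5]
noisy_counts = [10.0, 20.0, 10.0, 5.0, 5.0]
print(weighted_kmeans_1d(bucket_centers, noisy_counts, k=2))
```

Because only bucket centers and counts are touched, the cost of each iteration depends on the number of buckets rather than the number of original points.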
-
Publication number: 20190347278
Abstract: A method of performing K-means clustering by a data mining system is provided. The method includes generating a plurality of initial buckets by dividing data including a plurality of points each being expressed in coordinate information, reflecting a count noise in a number of points included in each of the initial buckets and then generating a plurality of new buckets by dividing at least one initial bucket among the initial buckets based on a first threshold and a second threshold, generating a plurality of final buckets from the plurality of initial buckets or the plurality of new buckets, generating a histogram including section information for each of the final buckets and a number of points included in each of the final buckets in which the count noise is reflected, and performing K-means clustering on the histogram based on a number of clusters.
Type: Application
Filed: May 8, 2019
Publication date: November 14, 2019
Inventors: Seog Park, Kyuseok Shim, Hanjun Goo, Woohwan Jung, Seongwoong Oh, Suyong Kwon
-
Patent number: 9977806
Abstract: A skyline query is a query on a set of tuples which are not dominated by other tuples. The skyline query system includes a sky quad tree generator that generates a quad tree from data, and marks a leaf node, which cannot include a local skyline, as being dominated; a local skyline calculator that computes a local skyline of each leaf node, which is not marked as being dominated, in the sky quad tree; and a global skyline calculator that computes a global skyline by using the local skyline. A quad tree is a tree where each internal node has exactly four children. The skyline query is a query for calculating a skyline, a dynamic skyline, or a reverse skyline from the data, where a dynamic skyline includes dynamic attributes and a reverse skyline identifies queries corresponding to certain skyline results.
Type: Grant
Filed: October 31, 2014
Date of Patent: May 22, 2018
Assignees: SNU R&DB Foundation, Korea University of Technology and Education Industry-University Cooperation Foundation
Inventors: Kyuseok Shim, Yoonjae Park, Jun-Ki Min
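The local-then-global structure described in this abstract can be sketched with a plain dominance test. The code below is an illustration under assumptions, not the patented system: smaller values are taken to be better in every dimension, the data is pre-split into arbitrary partitions that stand in for quad tree leaf nodes, and the step that marks whole leaves as dominated is omitted. The names `dominates`, `skyline`, and `global_skyline` are hypothetical.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, ...]


def dominates(a: Point, b: Point) -> bool:
    """a dominates b: no worse in every dimension, strictly better in one."""
    return (all(x <= y for x, y in zip(a, b))
            and any(x < y for x, y in zip(a, b)))


def skyline(points: Sequence[Point]) -> List[Point]:
    """Return the points that no other point dominates (quadratic scan)."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


def global_skyline(partitions: Sequence[Sequence[Point]]) -> List[Point]:
    """Compute a local skyline per partition, then merge the survivors."""
    local = [p for part in partitions for p in skyline(part)]
    return skyline(local)


parts = [[(1, 5), (3, 3)], [(2, 2), (5, 1), (4, 4)]]
print(global_skyline(parts))
```

A point dropped from a local skyline is dominated by some point in its own partition, so it cannot belong to the global skyline; this is why merging only the local results is safe.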
-
Publication number: 20180018971
Abstract: A word embedding method and word embedding apparatus are provided. The word embedding method includes receiving an input sentence, detecting an unlabeled word in the input sentence, embedding the unlabeled word based on labeled words included in the input sentence, and outputting a feature vector based on the embedding.
Type: Application
Filed: July 6, 2017
Publication date: January 18, 2018
Applicants: Samsung Electronics Co., Ltd., SNU R&DB FOUNDATION
Inventors: Hyoungmin PARK, Kyuseok SHIM, Woo In LEE, Kyoung Gu WOO, Wonkwang SHIN
-
Patent number: 9110973
Abstract: Provided are a method and apparatus for processing a query. The method includes generating string sets comprising a plurality of partial strings from a query string, determining a subset of the string sets as a candidate set, and searching for a document comprising the query string from the candidate set.
Type: Grant
Filed: October 14, 2011
Date of Patent: August 18, 2015
Assignees: Samsung Electronics Co., Ltd., SNU R&DB Foundation
Inventors: Younghoon Kim, Hyoungmin Park, Kyuseok Shim, Kyoung-gu Woo
-
Publication number: 20150213125
Abstract: There is provided a skyline query system for implementing a skyline query. The skyline query system includes a sky quad tree generator that generates a quad tree from data, and marks a leaf node, which cannot include a local skyline, as being dominated; a local skyline calculator that computes a local skyline of each leaf node, which is not marked as being dominated, in the sky quad tree; and a global skyline calculator that computes a global skyline by using the local skyline. The skyline query is a query for calculating a skyline, a dynamic skyline, or a reverse skyline from the data, and each of the local skyline and the global skyline is a skyline, a dynamic skyline or a reverse skyline.
Type: Application
Filed: October 31, 2014
Publication date: July 30, 2015
Applicants: SNU R&DB FOUNDATION, KOREA UNIVERSITY OF TECHNOLOGY AND EDUCATION
Inventors: Kyuseok SHIM, Yoonjae PARK, Jun-Ki MIN
-
Publication number: 20120259862
Abstract: Provided are a method and apparatus for processing a query. The method includes generating string sets comprising a plurality of partial strings from a query string, determining a subset of the string sets as a candidate set, and searching for a document comprising the query string from the candidate set.
Type: Application
Filed: October 14, 2011
Publication date: October 11, 2012
Inventors: Younghoon Kim, Hyoung Park, Kyuseok Shim, Kyoung-gu Woo
-
Publication number: 20100241622
Abstract: An n-gram based query processing apparatus and method are provided. A query processing is performed using only a portion of n-grams out of all n-grams with respect to the search key. A candidate set of documents having a possibility of including the search key is extracted using a posting list with respect to the portion of n-grams.
Type: Application
Filed: February 3, 2010
Publication date: September 23, 2010
Inventors: Hee Gyu JIN, Kyoung Gu Woo, Kyuseok Shim, Hyoungmin Park, Younghoon Kim
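The idea of filtering with only a portion of the search key's n-grams can be sketched with an inverted index of posting lists. This is a minimal illustration under assumptions, not the method claimed in the application: the portion is chosen as the `budget` n-grams with the shortest posting lists, which is a heuristic introduced here, and every candidate is verified with a substring check. The names `ngrams`, `build_index`, and `search` are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List, Set


def ngrams(text: str, n: int = 2) -> Set[str]:
    """All distinct substrings of length n."""
    return {text[i:i + n] for i in range(len(text) - n + 1)}


def build_index(docs: List[str], n: int = 2) -> Dict[str, Set[int]]:
    """Map each n-gram to the posting list (set of ids) of documents containing it."""
    index: Dict[str, Set[int]] = defaultdict(set)
    for doc_id, text in enumerate(docs):
        for gram in ngrams(text, n):
            index[gram].add(doc_id)
    return index


def search(docs: List[str], index: Dict[str, Set[int]], key: str,
           n: int = 2, budget: int = 2) -> List[int]:
    """Filter with only `budget` n-grams of the key, then verify candidates."""
    # Assumed heuristic: the rarest n-grams prune the most documents.
    chosen = sorted(ngrams(key, n), key=lambda g: len(index.get(g, ())))[:budget]
    if chosen:
        candidates = set.intersection(*(set(index.get(g, ())) for g in chosen))
    else:
        candidates = set(range(len(docs)))  # key shorter than n: no filtering
    return sorted(d for d in candidates if key in docs[d])


documents = ["data mining", "query mining", "database query"]
idx = build_index(documents)
print(search(documents, idx, "mining"))
```

Using fewer n-grams means fewer posting lists to read and intersect, at the cost of a larger candidate set that the verification step must check.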
-
Patent number: 7260572
Abstract: A method of processing a query about Extensible Markup Language (XML) data. The XML query processing method includes the first step of representing the XML data in the form of an XML graph, the second step of creating and updating an Adaptive Path indEX (APEX) based on frequently used paths extracted from previously processed XML queries and the XML graph, and the third step of processing an XML query using the APEX. The XML query processing method is capable of improving the performance of processing the query by extracting frequently used paths from path expressions having been used as queries for XML data and updating the APEX through the use of the frequently used paths.
Type: Grant
Filed: June 2, 2003
Date of Patent: August 21, 2007
Assignee: Korea Advanced Institute of Science and Technology
Inventors: Jun-Ki Min, Kyuseok Shim, Chin-Wan Chung
-
Patent number: 7228312
Abstract: An XML transformation tool that constructs a relational database with associated physical structures that can be populated with shredded XML data. A mapping transformation enumerator examines queries in the workload and enumerates mapping transformations that use XSD specific constraints and statistics on XML data and can be used to generate mappings from XSD to relational database schema that may lead to better performance in presence of physical design. A design tuner that searches mappings generated from a default mapping using enumerated transformations together with physical design structures associated with those mappings and selects a preferred mapping and the physical design structures. Cost estimates for performing queries in the workload are determined for the relational database implementing the mapping and associated physical design structures.
Type: Grant
Filed: March 9, 2004
Date of Patent: June 5, 2007
Assignee: Microsoft Corporation
Inventors: Surajit Chaudhuri, Zhiyuan Chen, Kyuseok Shim, Yuqing Yu
-
Patent number: 7080314
Abstract: The present invention discloses a document descriptor extraction method and system. The document descriptor extraction method and system creates a document descriptor by generalizing input sequences within a document; factoring the input sequences and generalized input sequences; and selecting a document descriptor from the input sequences, generalized sequences, and factored sequences, preferably using minimum description length (MDL) principles. Novel algorithms are employed to perform the generalizing, factoring, and selecting.
Type: Grant
Filed: June 16, 2000
Date of Patent: July 18, 2006
Assignee: Lucent Technologies Inc.
Inventors: Minos N. Garofalakis, Aristides Gionis, Rajeev Rastogi, Srinivasan Seshadri, Kyuseok Shim
-
Publication number: 20050203933
Abstract: An XML transformation tool that constructs a relational database with associated physical structures that can be populated with shredded XML data. A mapping transformation enumerator examines queries in the workload and enumerates mapping transformations that use XSD specific constraints and statistics on XML data and can be used to generate mappings from XSD to relational database schema that may lead to better performance in presence of physical design. A design tuner that searches mappings generated from a default mapping using enumerated transformations together with physical design structures associated with those mappings and selects a preferred mapping and the physical design structures. Cost estimates for performing queries in the workload are determined for the relational database implementing the mapping and associated physical design structures.
Type: Application
Filed: March 9, 2004
Publication date: September 15, 2005
Inventors: Surajit Chaudhuri, Zhiyuan Chen, Kyuseok Shim, Yuqing Yu
-
Patent number: 6760724
Abstract: A method for querying electronic data. The query method comprises creating wavelet-coefficient synopses of the electronic data and then querying the synopses in the wavelet-coefficient domain to obtain a wavelet-coefficient query result. The wavelet-coefficient query result is then rendered to provide an approximate result.
Type: Grant
Filed: July 24, 2000
Date of Patent: July 6, 2004
Assignee: Lucent Technologies Inc.
Inventors: Kaushik Chakrabarti, Minos N. Garofalakis, Rajeev Rastogi, Kyuseok Shim
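A wavelet-coefficient synopsis can be illustrated with the one-dimensional Haar transform. The sketch below rests on several assumptions and is not the patented method: the input length is a power of two, coefficients are unnormalized averages and differences, the synopsis simply keeps the largest-magnitude coefficients, and the approximate answer is obtained by reconstructing values rather than by operating on coefficients directly as the abstract describes. All function names are hypothetical.

```python
from typing import List


def haar_decompose(data: List[float]) -> List[float]:
    """Unnormalized Haar transform: overall average, then detail coefficients."""
    coeffs: List[float] = []
    current = list(data)
    while len(current) > 1:
        pairs = range(0, len(current), 2)
        averages = [(current[i] + current[i + 1]) / 2 for i in pairs]
        details = [(current[i] - current[i + 1]) / 2 for i in pairs]
        coeffs = details + coeffs  # coarser levels end up in front
        current = averages
    return current + coeffs


def synopsis(coeffs: List[float], b: int) -> List[float]:
    """Keep the b largest-magnitude coefficients and zero out the rest."""
    order = sorted(range(len(coeffs)), key=lambda i: abs(coeffs[i]), reverse=True)
    kept = set(order[:b])
    return [c if i in kept else 0.0 for i, c in enumerate(coeffs)]


def haar_reconstruct(coeffs: List[float]) -> List[float]:
    """Invert haar_decompose level by level."""
    current = coeffs[:1]
    pos = 1
    while pos < len(coeffs):
        details = coeffs[pos:pos + len(current)]
        pos += len(current)
        current = [v for a, d in zip(current, details) for v in (a + d, a - d)]
    return current


values = [2, 2, 0, 2, 3, 5, 4, 4]
coefficients = haar_decompose(values)
print(coefficients)
print(haar_reconstruct(synopsis(coefficients, 3)))
```

Dropping small coefficients shrinks the synopsis while keeping the coarse shape of the data, which is what makes approximate answers cheap to compute.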
-
Patent number: 6751363
Abstract: Methods of imaging objects based on wavelet retrieval of scenes utilize wavelet transformation of plural defined regions of a query image. By increasing the granularity of the query image to greater than one region, accurate feature vectors are obtained that allow for robust extraction of corresponding regions from a database of target images. The methods further include the use of sliding windows to decompose the query and target images into regions, and the clustering of the regions utilizing a novel similarity metric that ensures robust image matching in low response times.
Type: Grant
Filed: August 10, 1999
Date of Patent: June 15, 2004
Assignee: Lucent Technologies Inc.
Inventors: Apostol Ivanov Natsev, Rajeev Rastogi, Kyuseok Shim
-
Publication number: 20040098384
Abstract: Disclosed herein is a method of processing a query about Extensible Markup Language (XML) data. The XML query processing method of the present invention includes the first step of representing the XML data in the form of an XML graph, the second step of creating and updating an Adaptive Path indEX (APEX) based on frequently used paths extracted from previously processed XML queries and the XML graph, and the third step of processing an XML query using the APEX. The XML query processing method of the present invention is capable of improving the performance of processing the query by extracting frequently used paths from path expressions having been used as queries for XML data and updating the APEX through the use of the frequently used paths.
Type: Application
Filed: June 2, 2003
Publication date: May 20, 2004
Inventors: Jun-Ki Min, Kyuseok Shim, Chin-Wan Chung
-
Patent number: 6643629
Abstract: A new method for identifying a predetermined number of data points of interest in a large data set. The data points of interest are ranked in relation to the distance to their neighboring points. The method employs partition-based detection algorithms to partition the data points and then compute upper and lower bounds for each partition. These bounds are then used to eliminate those partitions that do not contain the predetermined number of data points of interest. The data points of interest are then computed from the remaining partitions that were not eliminated. The present method eliminates a significant number of data points from consideration as the points of interest, thereby resulting in substantial savings in computational expense compared to conventional methods employed to identify such points.
Type: Grant
Filed: November 18, 1999
Date of Patent: November 4, 2003
Assignee: Lucent Technologies Inc.
Inventors: Sridhar Ramaswamy, Rajeev Rastogi, Kyuseok Shim
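The ranking that this abstract refers to can be illustrated with a brute-force baseline. The sketch assumes that a point's score is the Euclidean distance to its k-th nearest neighbor, which is an assumption made here rather than something the abstract states, and it performs no partitioning or bound-based pruning; it corresponds to the kind of conventional approach the abstract compares against. The names `kth_nn_distance` and `top_n_outliers` are hypothetical.

```python
import math
from typing import List, Sequence, Tuple

Point = Tuple[float, ...]


def kth_nn_distance(i: int, points: Sequence[Point], k: int) -> float:
    """Distance from points[i] to its k-th nearest other point."""
    dists = sorted(math.dist(points[i], q)
                   for j, q in enumerate(points) if j != i)
    return dists[k - 1]


def top_n_outliers(points: Sequence[Point], k: int, n: int) -> List[Point]:
    """Return the n points whose k-th nearest neighbor is farthest away."""
    order = sorted(range(len(points)),
                   key=lambda i: kth_nn_distance(i, points, k),
                   reverse=True)
    return [points[i] for i in order[:n]]


data = [(0, 0), (0, 1), (1, 0), (1, 1), (10, 10)]
print(top_n_outliers(data, k=2, n=1))
```

This baseline computes a distance for every pair of points; a partition-based method avoids most of that work by discarding whole partitions whose bounds show they cannot hold a top-ranked point.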
-
Publication number: 20030061249
Abstract: A new method for identifying a predetermined number of data points of interest in a large data set. The data points of interest are ranked in relation to the distance to their neighboring points. The method employs partition-based detection algorithms to partition the data points and then compute upper and lower bounds for each partition. These bounds are then used to eliminate those partitions that do not contain the predetermined number of data points of interest. The data points of interest are then computed from the remaining partitions that were not eliminated. The present method eliminates a significant number of data points from consideration as the points of interest, thereby resulting in substantial savings in computational expense compared to conventional methods employed to identify such points.
Type: Application
Filed: November 18, 1999
Publication date: March 27, 2003
Inventors: SRIDHAR RAMASWAMY, RAJEEV RASTOGI, KYUSEOK SHIM
-
Patent number: 6473757
Abstract: The present invention provides a method and system for sequential pattern mining with a given constraint. A Regular Expression (RE) is used for identifying the family of interesting frequent patterns. A family of methods that enforce the RE constraint to different degrees within the generating and pruning of candidate patterns during the mining process is utilized. This is accomplished by employing different relaxations of the RE constraint in the mining loop. Those sequences which satisfy the given constraint are thus identified most expeditiously.
Type: Grant
Filed: March 28, 2000
Date of Patent: October 29, 2002
Assignee: Lucent Technologies Inc.
Inventors: Minos N. Garofalakis, Rajeev Rastogi, Kyuseok Shim
-
Patent number: 6247016
Abstract: A method of data classification using a decision tree having nodes is disclosed, along with an apparatus for performing the method. Periodically or after a certain number of nodes of the tree are split, the partially built tree is pruned. During the building phase the minimum cost of subtrees rooted at leaf nodes that can still be expanded (“yet to be expanded nodes”) is computed. With the computation of the minimum subtree cost at nodes, the nodes pruned are a subset of those that would have been pruned anyway during the pruning phase, and they are pruned while the tree is still being built.
Type: Grant
Filed: November 10, 1998
Date of Patent: June 12, 2001
Assignee: Lucent Technologies, Inc.
Inventors: Rajeev Rastogi, Kyuseok Shim
-
Patent number: 6185549
Abstract: An electronic data mining process for mining from an electronic data base using an electronic digital computer a listing of commercially useful information of the type known in the art as an association rule containing at least one uninstantiated condition. For example, the commercially useful information may be information useful for sales promotion, such as promotion of telephone usage. The computer retrieves from the database a plurality of stored parameters from which measures of the uninstantiated condition can be determined. The computer uses a dynamic programming algorithm and iterates over intervals or sub-ranges of the parameters to obtain what is called an at least partially optimized association rule, as optimized intervals or sub-ranges of at least some of the retrieved parameters, for example, time intervals of high usage of certain types of telephone connections. These optimized intervals are provided as the listed commercially useful information.
Type: Grant
Filed: April 29, 1998
Date of Patent: February 6, 2001
Assignee: Lucent Technologies Inc.
Inventors: Rajeev Rastogi, Kyuseok Shim