Patents by Inventor Minsik Cho

Minsik Cho has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 10545739
    Abstract: A low level virtual machine (LLVM)-based system C compiler for architecture synthesis is provided. In one aspect, a method for translating a system C model to hardware description language (HDL) is provided. The method includes the steps of: generating a hardware connection model (HCM) from the system C model, wherein the HCM defines modules and interconnects in a hardware system; parsing the system C model into a LLVM intermediate representation (IR); converting the LLVM IR to a system LLVM IR which records correspondence information between the LLVM IR and the HCM; and generating the HDL based on direct mapping of processes from the system LLVM IR and the HCM.
    Type: Grant
    Filed: April 5, 2016
    Date of Patent: January 28, 2020
    Assignee: International Business Machines Corporation
    Inventors: Minsik Cho, Brian R. Konigsburg, Indira Nair, Haoxing Ren, Jeonghee Shin
  • Publication number: 20190294651
    Abstract: A batched Cholesky decomposition method, system, and non-transitory computer readable medium for a Graphics Processing Unit (GPU), include mirroring a second problem matrix of a second problem to a first problem matrix of a first problem as paired matrices and shifting the second problem matrix by N+1 and combining the first problem matrix and the mirrored second problem matrix into one matrix of (N+1)×N by merging the first problem matrix and the mirrored second problem matrix. The first problem matrix and the second problem matrix are symmetric and positive definite matrices.
    Type: Application
    Filed: June 13, 2019
    Publication date: September 26, 2019
    Inventors: Minsik Cho, David Shing-ki Kung, Ruchir Puri
  • Patent number: 10423695
    Abstract: A batched Cholesky decomposition method, system, and non-transitory computer readable medium for a Graphics Processing Unit (GPU), include mirroring a second problem matrix of a second problem to a first problem matrix of a first problem as paired matrices and shifting the second problem by N+1, combining the first problem matrix and the mirrored second problem matrix into one matrix of (N+1)×N, and reading the fixed size data length of the one square matrix with a fixed data interval for both the first problem and the second problem.
    Type: Grant
    Filed: March 8, 2018
    Date of Patent: September 24, 2019
    Assignee: International Business Machines Corporation
    Inventors: Minsik Cho, David Shing-ki Kung, Ruchir Puri
  • Patent number: 10268798
    Abstract: A method for condition analysis comprises receiving an algorithmic description of a hardware design, wherein the algorithmic description is specified using a programming language, generating an intermediate representation based on the algorithmic description, wherein the intermediate representation includes a plurality of nodes and a plurality of paths, wherein each path connects at least one node to at least one other node, computing a plurality of relationships between the plurality of nodes, wherein the plurality of relationships are based on the plurality of paths connecting the plurality of nodes and each relationship includes at least one of a dominance relationship and a post-dominance relationship between two or more nodes, partitioning the intermediate representation based on the computed relationships, performing an optimization using the partitioned intermediate representation, and converting results of the optimization to the hardware design.
    Type: Grant
    Filed: September 22, 2015
    Date of Patent: April 23, 2019
    Assignee: International Business Machines Corporation
    Inventors: Minsik Cho, Indira Nair
  • Publication number: 20190114260
    Abstract: An iterative graph algorithm accelerating method, system, and computer program product, include recording an order of access nodes in a memory layout, reordering the access nodes in the memory layout in accordance with the recorded order, and updating edge information of the reordered access nodes.
    Type: Application
    Filed: December 12, 2018
    Publication date: April 18, 2019
    Inventors: Minsik Cho, Daniel Brand, Ulrich Alfons Finkler, David Shing-ki Kung, Ruchir Puri
  • Publication number: 20190087722
    Abstract: An overall gradient vector is computed at a server from a set of ISA vectors corresponding to a set of worker machines. An ISA vector of a worker machine includes ISA instructions corresponding to a set of gradients, each gradient corresponding to a weight of a node of a neural network being distributedly trained in the worker machine. A set of register values is optimized for use in an approximation computation with an opcode to produce an x-th approximate gradient of an x-th gradient. A server ISA vector is constructed in which a server ISA instruction in an x-th position corresponds to the x-th gradient in the overall gradient vector. A processor at the worker machine is caused to update a set of weights of the neural network, using the set of optimized register values and the server ISA vector, thereby completing one iteration of training.
    Type: Application
    Filed: September 20, 2017
    Publication date: March 21, 2019
    Applicant: International Business Machines Corporation
    Inventors: Minsik Cho, Ulrich A. Finkler
  • Publication number: 20190087723
    Abstract: Using a processor and a memory at a worker machine, a gradient vector is computed corresponding to a set of weights associated with a set of nodes of a neural network instance being trained in the worker machine. In an ISA vector corresponding to the gradient vector, an ISA instruction is constructed corresponding to a gradient in a set of gradients in the gradient vector, wherein a data transmission of the ISA instruction is smaller as compared to a data transmission of the gradient. The ISA vector is transmitted from the worker machine to a parameter server, the ISA vector being responsive to one iteration of a training of the neural network instance, the ISA vector being transmitted instead of the gradient vector to reduce an amount of data transmitted from the worker machine to the parameter server for the one iteration of the training.
    Type: Application
    Filed: September 20, 2017
    Publication date: March 21, 2019
    Applicant: International Business Machines Corporation
    Inventors: Minsik Cho, Ulrich A. Finkler
  • Patent number: 10209913
    Abstract: An iterative graph algorithm accelerating method, system, and computer program product, include recording an order of access nodes in a memory layout, reordering the access nodes in the memory layout in accordance with the recorded order, and updating edge information of the reordered access nodes.
    Type: Grant
    Filed: January 31, 2017
    Date of Patent: February 19, 2019
    Assignee: International Business Machines Corporation
    Inventors: Minsik Cho, Daniel Brand, Ulrich Alfons Finkler, David Shing-ki Kung, Ruchir Puri
  • Publication number: 20180357283
    Abstract: A first quicksort is performed in parallel across pairs of partitions of a dataset assigned to respective ones of available processors, including swapping elements of a first partition of a given one of the pairs that are larger than a pivot with elements of a second partition of the given pair that are smaller than the pivot. A second quicksort is performed in parallel across those partitions having elements left unsorted by the first quicksort, and first misplaced elements from a first side of the dataset corresponding to the first partition are swapped with second misplaced elements from a second side of the dataset corresponding to the second partition to produce a first dataset having elements equal to or lower than the pivot and a second dataset having elements equal to or higher than the pivot.
    Type: Application
    Filed: August 21, 2018
    Publication date: December 13, 2018
    Inventors: Daniel Brand, Minsik Cho, Ruchir Puri
  • Publication number: 20180357534
    Abstract: A method for executing multi-directional reduction algorithms includes identifying a set of nodes, wherein a node includes at least one data element, creating a set of partitions including one or more data elements from at least two nodes, wherein the at least two nodes are arranged in a single direction with respect to the positioning of the set of nodes, executing a reduction algorithm on the data elements within the created set of partitions, creating an additional set of partitions including one or more data elements from at least two nodes, wherein the at least two nodes are arranged in a different direction with respect to the positioning of the set of nodes, executing a reduction algorithm on the data elements within the created additional set of partitions, and providing a set of reduced results corresponding to the at least one data element.
    Type: Application
    Filed: June 13, 2017
    Publication date: December 13, 2018
    Inventors: Minsik Cho, Ulrich A. Finkler, David S. Kung, Li Zhang
  • Patent number: 10108670
    Abstract: Methods and systems for sorting a dataset include partitioning the dataset into 2n partitions, where n is a number of available processors. A first quicksort is performed in parallel across pairs of partitions based on a pivot using a plurality of processors. A second quicksort is performed in parallel on unsorted elements within each partition based on the pivot, where the unsorted elements were left unsorted by the first quicksort. Misplaced elements from a left side of the dataset are swapped with misplaced elements from a right side of the dataset to produce a left dataset that has elements equal to or lower than the pivot and a right dataset that has elements equal to or higher than the pivot.
    Type: Grant
    Filed: August 19, 2015
    Date of Patent: October 23, 2018
    Assignee: International Business Machines Corporation
    Inventors: Daniel Brand, Minsik Cho, Ruchir Puri
  • Publication number: 20180218260
    Abstract: Input image data having a plurality of pixel values represented in a two-dimensional matrix form of columns and rows is received. The input image data is transformed into a plurality of input rows. The pixel values in each input row correspond to the pixel values in a predetermined subset of the columns of the input image data and all of the rows of each column of the subset of columns. A plurality of subsets of pixel values in the plurality of input rows is determined. The number of pixel values in each row of a subset of pixel values is equal to the number of filter values in a filter. Each input row of each subset of pixel values is convolved with the filter values of the filter to determine a corresponding output value, which is stored in a memory.
    Type: Application
    Filed: January 31, 2017
    Publication date: August 2, 2018
    Inventors: Daniel Brand, Minsik Cho
  • Publication number: 20180217775
    Abstract: An iterative graph algorithm accelerating method, system, and computer program product, include recording an order of access nodes in a memory layout, reordering the access nodes in the memory layout in accordance with the recorded order, and updating edge information of the reordered access nodes.
    Type: Application
    Filed: January 31, 2017
    Publication date: August 2, 2018
    Inventors: Minsik Cho, Daniel Brand, Ulrich Alfons Finkler, David Shing-ki Kung, Ruchir Puri
  • Publication number: 20180196779
    Abstract: A batched Cholesky decomposition method, system, and non-transitory computer readable medium for a Graphics Processing Unit (GPU), include mirroring a second problem matrix of a second problem to a first problem matrix of a first problem as paired matrices and shifting the second problem by N+1, combining the first problem matrix and the mirrored second problem matrix into one matrix of (N+1)×N, and reading the fixed size data length of the one square matrix with a fixed data interval for both the first problem and the second problem.
    Type: Application
    Filed: March 8, 2018
    Publication date: July 12, 2018
    Inventors: Minsik Cho, David Shing-ki Kung, Ruchir Puri
  • Patent number: 9984041
    Abstract: A batched Cholesky decomposition method, system, and non-transitory computer readable medium for a Graphics Processing Unit (GPU) including at least a first problem and a second problem, include mirroring a second problem matrix of the second problem to a first problem matrix of the first problem, combining the first problem matrix and the mirrored second problem matrix into a single problem matrix, and allocating data read to a thread and to the first problem and the second problem, respectively.
    Type: Grant
    Filed: June 30, 2016
    Date of Patent: May 29, 2018
    Assignee: International Business Machines Corporation
    Inventors: Minsik Cho, David Shing-ki Kung, Ruchir Puri
  • Publication number: 20180144010
    Abstract: An information processing system, computer readable storage medium, and method for accelerated radix sort processing of data elements in an array in memory. The information processing system stores an array of data elements in a buffer memory in an application specific integrated circuit radix sort accelerator. The array has a head end and a tail end. The system radix sort processing, with a head processor, data elements starting at the head end of the array and progressively advancing radix sort processing data elements toward the tail end of the array. The system radix sort processing, with a tail processor, data elements starting at the tail end of the array and progressively advancing radix sort processing data elements toward the head end of the array, the tail processor radix sort processing data elements in the array contemporaneously with the head processor radix sort processing data elements in the array.
    Type: Application
    Filed: December 29, 2017
    Publication date: May 24, 2018
    Inventors: Rajesh Bordawekar, Daniel Brand, Minsik Cho, Brian R. Konigsburg, Ruchir Puri
  • Publication number: 20180121481
    Abstract: Apparatuses and methods for sorting a data set. A data storage is divided into a plurality of buckets that is each associated with a respective key value. A plurality of stripes is identified in each bucket. A plurality of data stripe sets is defined that has one stripe within each respective bucket. A first and a second in-place partial bucket radix sort are performed on data items contained within the first and second data stripe sets, respectively, using an initial radix. Incorrectly sorted data items in the first bucket are grouped by a first processor and incorrectly sorted data items in the second bucket are grouped by a second processor into a respective incorrect data item group within each bucket. A radix sort is then performed using the initial radix on the items within the respective incorrect data item group. A first level sorted output is produced.
    Type: Application
    Filed: December 22, 2017
    Publication date: May 3, 2018
    Applicant: International Business Machines Corporation
    Inventors: Rajesh Bordawekar, Daniel Brand, Minsik Cho, Ulrich Finkler, Ruchir Puri
  • Patent number: 9953044
    Abstract: An information processing system, computer readable storage medium, and method for accelerated radix sort processing of data elements in an array in memory. The information processing system stores an array of data elements in a buffer memory in an application specific integrated circuit radix sort accelerator. The array has a head end and a tail end. The system radix sort processing, with a head processor, data elements starting at the head end of the array and progressively advancing radix sort processing data elements toward the tail end of the array. The system radix sort processing, with a tail processor, data elements starting at the tail end of the array and progressively advancing radix sort processing data elements toward the head end of the array, the tail processor radix sort processing data elements in the array contemporaneously with the head processor radix sort processing data elements in the array.
    Type: Grant
    Filed: June 25, 2015
    Date of Patent: April 24, 2018
    Assignee: International Business Machines Corporation
    Inventors: Rajesh Bordawekar, Daniel Brand, Minsik Cho, Brian R. Konigsburg, Ruchir Puri
  • Patent number: 9946512
    Abstract: Systems and methods for sorting a data set stored on an external device. A plurality of smaller radix sizes are determined, based on a first radix size and performance characteristics of an external data storage device, whose sizes add up to a first radix size for an in-place radix sort. The smaller radix sizes reduce a total time to perform the in-place radix sort. Each level of a multiple level in-place radix sort is performed with the smaller radix sizes. Each level of the sort includes dividing the data set into N buckets; dividing the buffer into N buckets; and iteratively loading a respective segment in each bucket of the data set into a respective bucket of the buffer, performing an in-place radix sort on the data in the buffer, and returning sorted buffer data to the data set on the external storage device.
    Type: Grant
    Filed: September 25, 2015
    Date of Patent: April 17, 2018
    Assignee: International Business Machines Corporation
    Inventors: Minsik Cho, Brian R. Konigsburg, Vincent Kulandaisamy, Ruchir Puri
  • Patent number: 9928261
    Abstract: An information processing system, computer readable storage medium, and method for accelerated radix sort processing of data elements in an array in memory. The information processing system stores an array of data elements in a buffer memory in an application specific integrated circuit radix sort accelerator. The array has a head end and a tail end. The system radix sort processing, with a head processor, data elements starting at the head end of the array and progressively advancing radix sort processing data elements toward the tail end of the array. The system radix sort processing, with a tail processor, data elements starting at the tail end of the array and progressively advancing radix sort processing data elements toward the head end of the array, the tail processor radix sort processing data elements in the array contemporaneously with the head processor radix sort processing data elements in the array.
    Type: Grant
    Filed: December 24, 2014
    Date of Patent: March 27, 2018
    Assignee: International Business Machines Corporation
    Inventors: Rajesh Bordawekar, Daniel Brand, Minsik Cho, Brian R. Konigsburg, Ruchir Puri
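
The radix sort abstracts above describe a head processor that advances from the head of an array toward the tail while a tail processor advances from the tail toward the head. As a rough illustration of that two-ended traversal only, the following is a minimal single-threaded sketch of a classic binary radix exchange sort. It is not the patented accelerator design: it uses two indices rather than two concurrent processors, the function name `radix_exchange_sort` and the `bits` parameter are hypothetical, and it assumes non-negative integers smaller than 2**bits.

```python
def radix_exchange_sort(values, bits=32):
    """Sort non-negative integers in place, one bit at a time from the top bit.

    Illustrative sketch only: a sequential two-index pass, not a model of
    two hardware processors working contemporaneously.
    """

    def sort_range(lo, hi, bit):
        # Nothing to do for ranges of at most one element or when bits run out.
        if hi - lo < 1 or bit < 0:
            return
        mask = 1 << bit
        head, tail = lo, hi
        while head <= tail:
            if (values[head] & mask) == 0:
                head += 1            # already in the "0" bucket at the head end
            elif (values[tail] & mask) != 0:
                tail -= 1            # already in the "1" bucket at the tail end
            else:
                # head holds a 1-bit item, tail holds a 0-bit item: exchange them
                values[head], values[tail] = values[tail], values[head]
                head += 1
                tail -= 1
        # Items with the bit clear now occupy lo..tail, the rest head..hi.
        sort_range(lo, tail, bit - 1)
        sort_range(head, hi, bit - 1)

    sort_range(0, len(values) - 1, bits - 1)
    return values


if __name__ == "__main__":
    data = [170, 45, 75, 90, 2, 802, 24, 66]
    print(radix_exchange_sort(data, bits=10))
```

Each pass partitions a range on one bit, so the recursion depth is bounded by `bits` rather than by the input size.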