Patents by Inventor Rajesh Bordawekar

Rajesh Bordawekar has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 11244224
    Abstract: A first observation window in a first time series is identified. The first observation window is preceded by a first portion of the first time series. A neural network is trained using the first portion of the first time series and the first observation window, and weights are extracted from the middle layers of the neural network. A first feature vector is generated based on the weights. A second observation window in a second time series is identified, where the second observation window is preceded by a first portion of the second time series. A second feature vector associated with the second observation window is determined. The second feature vector is based at least in part on the first set of weights. A similarity between the first and second observation windows is determined based on comparing the first feature vector and the second feature vector.
    Type: Grant
    Filed: March 20, 2018
    Date of Patent: February 8, 2022
    Assignee: International Business Machines Corporation
    Inventors: Rajesh Bordawekar, Tin Kam Ho
  • Patent number: 11182414
    Abstract: A computer-implemented method, cognitive intelligence system and computer program product adapt a relational database containing multiple data types. Non-text tokens in the relational database are converted to a textual form. Text is produced based on relations of tokens in the relational database. A set of word vectors is produced for the tokens based on the text. A cognitive intelligence query expressed as a structured query language (SQL) query may be applied to the relational database using the set of word vectors. The form of non-text tokens is one of a numeric value, an SQL type, an image, a video, a time series, latitude and longitude, or chemical structures. A single word embedding model may be applied over one or more tokens in the text. A plurality of sets of preliminary word vectors are computed by applying more than one embedding model over all tokens in the text. The preliminary word vector sets are merged to form the set of word vectors.
    Type: Grant
    Filed: March 20, 2017
    Date of Patent: November 23, 2021
    Assignee: International Business Machines Corporation
    Inventors: Bortik Bandyopadhyay, Rajesh Bordawekar, Tin Kam Ho
  • Patent number: 11176176
    Abstract: From a first attribute-value pair in a record, new data comprising a first token is created. From each token using a processor and a memory, new data including a corresponding vector is computed. From the record, a target row is selected, wherein a target attribute-value pair in the target row includes a value requiring correction. Using a similarity measure, a set of most similar rows to the target row is determined, wherein each row in the set of most similar rows to the target row has a corresponding similarity measure above a threshold similarity measure and wherein each row in the set of most similar rows includes the target attribute. From values corresponding to the target attribute in the set of most similar rows, a replacement value is determined. The value requiring correction in the target row is replaced with the replacement value.
    Type: Grant
    Filed: November 20, 2018
    Date of Patent: November 16, 2021
    Assignee: International Business Machines Corporation
    Inventors: Rajesh Bordawekar, Tin Kam Ho
  • Patent number: 11163761
    Abstract: Structured and semi-structured databases and files are processed using natural language processing techniques to impute data for null value tokens in database records from other records that have non-null values for the same attributes. Vector embedding techniques are used, including, in some cases, appropriately tagging null value tokens to reduce or eliminate their undue impact on semantic vectors generated using a neural network.
    Type: Grant
    Filed: March 20, 2020
    Date of Patent: November 2, 2021
    Assignee: International Business Machines Corporation
    Inventors: Rajesh Bordawekar, Tin Kam Ho
  • Publication number: 20210311937
    Abstract: A system, apparatus, and a method for training with multi-modal data in a relational database, including generating a first database including a multi-view of the multi-modal data, retrieving a second set of data from an external source via a network, and training a first model according to the first database and the second set of data. The first model outputs relationships of the first database with the multi-view and the second set of data.
    Type: Application
    Filed: June 21, 2021
    Publication date: October 7, 2021
    Inventors: Rajesh Bordawekar, Bortik Bandyopadhyay
  • Publication number: 20210294794
    Abstract: Structured and semi-structured databases and files are processed using natural language processing techniques to impute data for null value tokens in database records from other records that have non-null values for the same attributes. Vector embedding techniques are used, including, in some cases, appropriately tagging null value tokens to reduce or eliminate their undue impact on semantic vectors generated using a neural network.
    Type: Application
    Filed: March 20, 2020
    Publication date: September 23, 2021
    Inventors: Rajesh Bordawekar, Tin Kam Ho
  • Patent number: 11100100
    Abstract: A computer-implemented method, cognitive intelligence server and computer program product adapt a relational database containing numeric data types. At least one numeric token in the relational database is converted to a textual form. Text is produced based on relations of tokens in the relational database. A set of word vectors is produced based on the text. A cognitive intelligence query, expressed as a structured query language (SQL) query, may be applied to the relational database using the set of word vectors. At least one numeric token in the relational database may be converted to a typed string comprising a heading for a column in the relational database for which the token appears and the numeric value. Converting at least one numeric token in the relational database may comprise clustering tokens in a column of the relational database using a clustering algorithm and replacing each token in the column by a cluster identifier.
    Type: Grant
    Filed: March 20, 2017
    Date of Patent: August 24, 2021
    Assignee: International Business Machines Corporation
    Inventors: Bortik Bandyopadhyay, Rajesh Bordawekar, Tin Kam Ho
  • Patent number: 11080273
    Abstract: A computer-implemented method, a cognitive intelligence system and computer program product adapt a relational database containing image data types. At least one image token in the relational database is converted to a textual form. Text is produced based on relations of tokens in the relational database. A set of word vectors is produced based on the text. A cognitive intelligence query expressed as a structured query language (SQL) query may be applied to the relational database using the set of word vectors. An image token may be converted to textual form by converting the image to a tag, by using a neural network classification model and replacing the image token with a corresponding cluster identifier, by binary comparison or by a user-specified similarity function. An image token may be converted to a plurality of textual forms using more than one conversion method.
    Type: Grant
    Filed: March 20, 2017
    Date of Patent: August 3, 2021
    Assignee: International Business Machines Corporation
    Inventors: Bortik Bandyopadhyay, Rajesh Bordawekar, Tin Kam Ho
  • Patent number: 11074253
    Abstract: A system and a method for performing queries, including generating text representations of features of various types of data, building a multi-modal word embedding model to capture relationships between the various types of data, and based on the multi-modal word embedding model, performing an inductive reasoning query.
    Type: Grant
    Filed: November 2, 2018
    Date of Patent: July 27, 2021
    Assignee: International Business Machines Corporation
    Inventors: Rajesh Bordawekar, Bortik Bandyopadhyay
  • Publication number: 20210124724
    Abstract: A computer-implemented method according to one embodiment includes identifying a relational database; determining columns of interest within the relational database; creating an unordered group of string tokens for each row of the relational database, utilizing the determined columns of interest; assigning weights for one or more columns within the relational database to one or more string tokens within each unordered group of string tokens to create a plurality of weighted unordered groups of string tokens; and determining a meaning vector for an identifier of each row of the relational database, utilizing the plurality of weighted unordered groups of string tokens.
    Type: Application
    Filed: October 28, 2019
    Publication date: April 29, 2021
    Inventor: Rajesh Bordawekar
  • Patent number: 10984030
    Abstract: A computer-implemented method, a cognitive intelligence system and computer program product adapt a relational database containing multiple data types. Non-text tokens in the relational database are converted to a textual form. Text is produced based on relations of tokens in the relational database. A set of pre-trained word vectors for the text is retrieved from an external database. The set of pre-trained word vectors is initialized for tokens common to both the relational database and an external database. The set of pre-trained vectors is used to create a cognitive intelligence query expressed as a structured query language (SQL) query. Content of the relational database is used for training while initializing the set of pre-trained word vectors for tokens common to both the relational database and the external database. The first set of word vectors may be immutable or mutable with updates controlled via parameters.
    Type: Grant
    Filed: March 20, 2017
    Date of Patent: April 20, 2021
    Assignee: International Business Machines Corporation
    Inventors: Rajesh Bordawekar, Oded Shmueli
  • Patent number: 10831738
    Abstract: Apparatuses and methods for sorting a data set. A data storage is divided into a plurality of buckets that are each associated with a respective key value. A plurality of stripes is identified in each bucket. A plurality of data stripe sets is defined that has one stripe within each respective bucket. A first and a second in-place partial bucket radix sort are performed on data items contained within the first and second data stripe sets, respectively, using an initial radix. Incorrectly sorted data items in the first bucket are grouped by a first processor and incorrectly sorted data items in the second bucket are grouped by a second processor into a respective incorrect data item group within each bucket. A radix sort is then performed using the initial radix on the items within the respective incorrect data item group. A first level sorted output is produced.
    Type: Grant
    Filed: December 22, 2017
    Date of Patent: November 10, 2020
    Assignee: International Business Machines Corporation
    Inventors: Rajesh Bordawekar, Daniel Brand, Minsik Cho, Ulrich Finkler, Ruchir Puri
  • Patent number: 10831752
    Abstract: A method, computer program product and/or system is disclosed. According to an aspect of this invention, one or more processors receive a query of a first database, where the query includes: (i) an operand, and (ii) an operator indicating a distance-based similarity measure. One or more processors further determine a result set based on the query, wherein the result set includes a plurality of records, and wherein a record is included in the result set based on a vector nearest-neighbor computation between: (i) a first vector corresponding to the operand, and (ii) a second vector corresponding to the record, wherein the second vector is included in a vector space model that is based on a textual representation of the first database.
    Type: Grant
    Filed: April 25, 2018
    Date of Patent: November 10, 2020
    Assignee: International Business Machines Corporation
    Inventors: Rajesh Bordawekar, Oded Shmueli
  • Publication number: 20200210431
    Abstract: From a first attribute-value pair in a record, new data is created including a first token. Using a first model and using a processor and a memory, each token is vectorized into new data including a corresponding vector. From the record, a target row is selected, wherein a target attribute-value pair in the target row includes a value for which a semantic similarity computation is to be performed. Using a similarity measure, a set of most similar rows to the target row is determined, wherein each row in the set of most similar rows to the target row has a corresponding similarity measure above a threshold similarity measure and wherein each row in the set of most similar rows includes the target attribute. The set of most similar rows is used to compute a response to a database query.
    Type: Application
    Filed: January 2, 2019
    Publication date: July 2, 2020
    Applicant: International Business Machines Corporation
    Inventors: Rajesh Bordawekar, Jose Neves
  • Patent number: 10685002
    Abstract: An information processing system, computer readable storage medium, and method for accelerated radix sort processing of data elements in an array in memory. The information processing system stores an array of data elements in a buffer memory in an application specific integrated circuit radix sort accelerator. The array has a head end and a tail end. The system performs radix sort processing, with a head processor, of data elements starting at the head end of the array and progressively advancing toward the tail end of the array. The system also performs radix sort processing, with a tail processor, of data elements starting at the tail end of the array and progressively advancing toward the head end of the array; the tail processor processes data elements in the array contemporaneously with the head processor.
    Type: Grant
    Filed: December 29, 2017
    Date of Patent: June 16, 2020
    Assignee: International Business Machines Corporation
    Inventors: Rajesh Bordawekar, Daniel Brand, Minsik Cho, Brian R. Konigsburg, Ruchir Puri
  • Publication number: 20200175360
    Abstract: Methods, systems and computer program products for updating a word embedding model are provided. Aspects include receiving a first data set comprising a relational database having a plurality of words. Aspects also include generating a word embedding model comprising a plurality of word vectors by training a neural network using unsupervised machine learning based on the first data set. Each word vector of the plurality of word vectors corresponds to a unique word of the plurality of words. Aspects also include storing the plurality of word vectors and a representation of a hidden layer of the neural network. Aspects also include receiving a second data set comprising data that has been added to the relational database. Aspects also include updating the word embedding model based on the second data set and the stored representation of the hidden layer of the neural network.
    Type: Application
    Filed: November 29, 2018
    Publication date: June 4, 2020
    Inventors: Thomas Conti, Stephen Warren, Rajesh Bordawekar, Jose Neves, Christopher Harding
  • Publication number: 20200175390
    Abstract: Methods, systems and computer program products for determining recommended parameters for use in generating a word embedding model are provided. Aspects include storing a plurality of meaningful test cases. Each meaningful test case includes a test data profile and one or more test model parameters used to create a word embedding model that has been classified as yielding meaningful results. Aspects include receiving a production data set to be used in generating a new word embedding model. The production data set includes data stored in a relational database having a plurality of columns and a plurality of rows. Aspects include generating a data profile associated with the production data set. Aspects include generating a recommendation for one or more production model parameters for use in building a word embedding model based on the data profile associated with the production data set and the plurality of meaningful test cases.
    Type: Application
    Filed: November 29, 2018
    Publication date: June 4, 2020
    Inventors: Thomas Conti, Rajesh Bordawekar, Stephen Warren, Christopher Harding, Jose Neves
  • Publication number: 20200159853
    Abstract: From a first attribute-value pair in a record, new data comprising a first token is created. From each token using a processor and a memory, new data including a corresponding vector is computed. From the record, a target row is selected, wherein a target attribute-value pair in the target row includes a value requiring correction. Using a similarity measure, a set of most similar rows to the target row is determined, wherein each row in the set of most similar rows to the target row has a corresponding similarity measure above a threshold similarity measure and wherein each row in the set of most similar rows includes the target attribute. From values corresponding to the target attribute in the set of most similar rows, a replacement value is determined. The value requiring correction in the target row is replaced with the replacement value.
    Type: Application
    Filed: November 20, 2018
    Publication date: May 21, 2020
    Applicant: International Business Machines Corporation
    Inventors: Rajesh Bordawekar, Tin Kam Ho
  • Publication number: 20200142989
    Abstract: A system and a method for performing queries, including generating text representations of features of various types of data, building a multi-modal word embedding model to capture relationships between the various types of data, and based on the multi-modal word embedding model, performing an inductive reasoning query.
    Type: Application
    Filed: November 2, 2018
    Publication date: May 7, 2020
    Inventors: Rajesh Bordawekar, Bortik Bandyopadhyay
  • Publication number: 20190332705
    Abstract: A method, computer program product and/or system is disclosed. According to an aspect of this invention, one or more processors receive a query of a first database, where the query includes: (i) an operand, and (ii) an operator indicating a distance-based similarity measure. One or more processors further determine a result set based on the query, wherein the result set includes a plurality of records, and wherein a record is included in the result set based on a vector nearest-neighbor computation between: (i) a first vector corresponding to the operand, and (ii) a second vector corresponding to the record, wherein the second vector is included in a vector space model that is based on a textual representation of the first database.
    Type: Application
    Filed: April 25, 2018
    Publication date: October 31, 2019
    Inventors: Rajesh Bordawekar, Oded Shmueli
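
Illustrative note: the distance-based similarity query described in the last abstract above (publication 20190332705, granted as patent 10831752) can be pictured with a small sketch. This is not the patented implementation. It assumes cosine similarity as the distance-based measure, uses hand-written toy vectors in place of a trained vector space model, and the names `cosine_similarity`, `nearest_records`, and `record_vectors` are hypothetical.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def nearest_records(operand_vector, record_vectors, k=2):
    """Return the ids of the k records whose vectors are most similar to the operand."""
    scored = [
        (cosine_similarity(operand_vector, vec), rec_id)
        for rec_id, vec in record_vectors.items()
    ]
    # Highest similarity first; the top k form the result set.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [rec_id for _, rec_id in scored[:k]]


# Assumption: toy vectors standing in for a vector space model that would
# normally be built from a textual representation of the database.
record_vectors = {
    "row_1": [0.9, 0.1, 0.0],
    "row_2": [0.8, 0.2, 0.1],
    "row_3": [0.0, 0.1, 0.9],
}

operand_vector = [1.0, 0.0, 0.0]  # vector corresponding to the query operand
result = nearest_records(operand_vector, record_vectors, k=2)
print(result)
```

In this sketch a record joins the result set purely on the basis of its vector's similarity to the operand's vector, which is the role the abstract gives to the nearest-neighbor computation. A real system would likely use an index rather than a full scan.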