Patents by Inventor Rajesh Bordawekar

Rajesh Bordawekar has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

Comparing time series data using context-based similarity

Patent number: 11244224

Abstract: A first observation window in a first time series is identified. The first observation window is preceded by a first portion of the first time series. A neural network is trained using the first portion of the first time series and the first observation window, and weights are extracted from the middle layers of the neural network. A first feature vector is generated based on the weights. A second observation window in a second time series is identified, where the second observation window is preceded by a first portion of the second time series. A second feature vector associated with the second observation window is determined. The second feature vector is based at least in part on the first set of weights. A similarity between the first and second observation windows is determined based on comparing the first feature vector and the second feature vector.

Type: Grant

Filed: March 20, 2018

Date of Patent: February 8, 2022

Assignee: International Business Machines Corporation

Inventors: Rajesh Bordawekar, Tin Kam Ho
Search queries of multi-datatype databases

Patent number: 11182414

Abstract: A computer-implemented method, cognitive intelligence system and computer program product adapt a relational database containing multiple data types. Non-text tokens in the relational database are converted to a textual form. Text is produced based on relations of tokens in the relational database. A set of word vectors is produced for the tokens based on the text. A cognitive intelligence query expressed as a structured query language (SQL) query may be applied to the relational database using the set of word vectors. The form of non-text tokens is one of a numeric value, an SQL type, an image, a video, a time series, latitude and longitude, or chemical structures. A single word embedding model may be applied over one or more tokens in the text. A plurality of sets of preliminary word vectors are computed by applying more than one embedding model over all tokens in the text. The preliminary word vector sets are merged to form the set of word vectors.

Type: Grant

Filed: March 20, 2017

Date of Patent: November 23, 2021

Assignee: International Business Machines Corporation

Inventors: Bortik Bandyopadhyay, Rajesh Bordawekar, Tin Kam Ho
Record correction and completion using data sourced from contextually similar records

Patent number: 11176176

Abstract: From a first attribute-value pair in a record, new data comprising a first token is created. From each token using a processor and a memory, new data including a corresponding vector is computed. From the record, a target row is selected, wherein a target attribute-value pair in the target row includes a value requiring correction. Using a similarity measure, a set of most similar rows to the target row is determined, wherein each row in the set of most similar rows to the target row has a corresponding similarity measure above a threshold similarity measure and wherein each row in the set of most similar rows includes the target attribute. From values corresponding to the target attribute in the set of most similar rows, a replacement value is determined. The value requiring correction in the target row is replaced with the replacement value.

Type: Grant

Filed: November 20, 2018

Date of Patent: November 16, 2021

Assignee: International Business Machines Corporation

Inventors: Rajesh Bordawekar, Tin Kam Ho
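
The correction step in the abstract of patent 11176176 can be illustrated with a minimal sketch. It assumes each row already has a vector (the abstract computes vectors per token and does not say how they are combined), uses cosine similarity as the similarity measure, and picks the replacement by majority vote. All three choices, along with the names `cosine` and `suggest_replacement`, the sample rows, and the hand-made vectors, are assumptions for illustration and are not taken from the patent.

```python
import math
from collections import Counter


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (assumed measure)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def suggest_replacement(rows, vectors, target_idx, target_attr, threshold=0.9):
    """Return the most common target_attr value among rows whose similarity
    to the target row is above the threshold, or None if no row qualifies."""
    target_vec = vectors[target_idx]
    candidates = [
        row[target_attr]
        for i, row in enumerate(rows)
        if i != target_idx
        and row.get(target_attr) is not None
        and cosine(vectors[i], target_vec) > threshold
    ]
    if not candidates:
        return None
    # Majority vote is an assumption; the abstract only says a replacement
    # value is determined from the similar rows' values.
    return Counter(candidates).most_common(1)[0][0]


# Hypothetical data: the last row holds a value requiring correction.
rows = [
    {"city": "Paris", "country": "France"},
    {"city": "Lyon", "country": "France"},
    {"city": "Berlin", "country": "Germany"},
    {"city": "Nice", "country": "Frnace"},
]
vectors = [[1.0, 0.1], [0.9, 0.2], [0.1, 1.0], [0.95, 0.15]]

rows[3]["country"] = suggest_replacement(rows, vectors, 3, "country")
```

In this toy data the two French rows are the only ones above the threshold, so the misspelled value is replaced with the value they share.
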
Vector embedding models for relational tables with null or equivalent values

Patent number: 11163761

Abstract: Structured and semi-structured databases and files are processed using natural language processing techniques to impute data for null value tokens in database records from other records that have non-null values for the same attributes. Vector embedding techniques are used, including, in some cases, appropriately tagging null value tokens to reduce or eliminate their undue impact on semantic vectors generated using a neural network.

Type: Grant

Filed: March 20, 2020

Date of Patent: November 2, 2021

Assignee: International Business Machines Corporation

Inventors: Rajesh Bordawekar, Tin Kam Ho
METHOD AND SYSTEM FOR SUPPORTING INDUCTIVE REASONING QUERIES OVER MULTI-MODAL DATA FROM RELATIONAL DATABASES

Publication number: 20210311937

Abstract: A system, apparatus, and a method for training with multi-modal data in a relational database, including generating a first database including a multi-view of the multi-modal data, retrieving a second set of data from an external source via a network, and training a first model according to the first database and the second set of data. The first model outputs relationships of the first database with the multi-view and the second set of data.

Type: Application

Filed: June 21, 2021

Publication date: October 7, 2021

Inventors: Rajesh Bordawekar, Bortik Bandyopadhyay
VECTOR EMBEDDING MODELS FOR RELATIONAL TABLES WITH NULL OR EQUIVALENT VALUES

Publication number: 20210294794

Abstract: Structured and semi-structured databases and files are processed using natural language processing techniques to impute data for null value tokens in database records from other records that have non-null values for the same attributes. Vector embedding techniques are used, including, in some cases, appropriately tagging null value tokens to reduce or eliminate their undue impact on semantic vectors generated using a neural network.

Type: Application

Filed: March 20, 2020

Publication date: September 23, 2021

Inventors: Rajesh Bordawekar, Tin Kam Ho
Numeric data type support for cognitive intelligence queries

Patent number: 11100100

Abstract: A computer-implemented method, cognitive intelligence server and computer program product adapt a relational database containing numeric data types. At least one numeric token in the relational database is converted to a textual form. Text is produced based on relations of tokens in the relational database. A set of word vectors is produced based on the text. A cognitive intelligence query, expressed as a structured query language (SQL) query, may be applied to the relational database using the set of word vectors. At least one numeric token in the relational database may be converted to a typed string comprising a heading for a column in the relational database for which the token appears and the numeric value. Converting at least one numeric token in the relational database may comprise clustering tokens in a column of the relational database using a clustering algorithm and replacing each token in the column by a cluster identifier.

Type: Grant

Filed: March 20, 2017

Date of Patent: August 24, 2021

Assignee: International Business Machines Corporation

Inventors: Bortik Bandyopadhyay, Rajesh Bordawekar, Tin Kam Ho
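
The two numeric-token conversions named in the abstract of patent 11100100 (a typed string built from the column heading and the value, and replacement by a cluster identifier) can be sketched as follows. The `heading_value` token format and the function names are hypothetical, and equal-width binning stands in for the unspecified clustering algorithm; it is a swapped-in simplification, not the patent's method.

```python
def typed_string(column, value):
    """Turn a numeric value into a text token prefixed by its column heading.
    The 'heading_value' format is an assumption for illustration."""
    return f"{column}_{value}"


def cluster_ids(column, values, n_bins=3):
    """Replace each numeric value in a column with a cluster identifier.
    Equal-width binning is used here as a simple stand-in for clustering."""
    lo, hi = min(values), max(values)
    # Fall back to a width of 1.0 when every value in the column is identical.
    width = (hi - lo) / n_bins or 1.0
    return [
        f"{column}_c{min(int((v - lo) / width), n_bins - 1)}"
        for v in values
    ]


tokens = [typed_string("price", v) for v in (42, 17)]
bins = cluster_ids("age", [10, 20, 30, 40], n_bins=3)
```

Either form yields string tokens that can then be fed to a word embedding model alongside the text tokens of the same row.
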
Image support for cognitive intelligence queries

Patent number: 11080273

Abstract: A computer-implemented method, a cognitive intelligence system and computer program product adapt a relational database containing image data types. At least one image token in the relational database is converted to a textual form. Text is produced based on relations of tokens in the relational database. A set of word vectors is produced based on the text. A cognitive intelligence query expressed as a structured query language (SQL) query may be applied to the relational database using the set of word vectors. An image token may be converted to textual form by converting the image to a tag, by using a neural network classification model and replacing the image token with a corresponding cluster identifier, by binary comparison or by a user-specified similarity function. An image token may be converted to a plurality of textual forms using more than one conversion method.

Type: Grant

Filed: March 20, 2017

Date of Patent: August 3, 2021

Assignee: International Business Machines Corporation

Inventors: Bortik Bandyopadhyay, Rajesh Bordawekar, Tin Kam Ho
Method and system for supporting inductive reasoning queries over multi-modal data from relational databases

Patent number: 11074253

Abstract: A system and a method for performing queries, including generating text representations of features of various types of data, building a multi-modal word embedding model to capture relationships between the various types of data, and based on the multi-modal word embedding model, performing an inductive reasoning query.

Type: Grant

Filed: November 2, 2018

Date of Patent: July 27, 2021

Assignee: International Business Machines Corporation

Inventors: Rajesh Bordawekar, Bortik Bandyopadhyay
BUILDING A WORD EMBEDDING MODEL TO CAPTURE RELATIONAL DATA SEMANTICS

Publication number: 20210124724

Abstract: A computer-implemented method according to one embodiment includes identifying a relational database; determining columns of interest within the relational database; creating an unordered group of string tokens for each row of the relational database, utilizing the determined columns of interest; assigning weights for one or more columns within the relational database to one or more string tokens within each unordered group of string tokens to create a plurality of weighted unordered groups of string tokens; and determining a meaning vector for an identifier of each row of the relational database, utilizing the plurality of weighted unordered groups of string tokens.

Type: Application

Filed: October 28, 2019

Publication date: April 29, 2021

Inventor: Rajesh Bordawekar
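
The first steps in the abstract of publication 20210124724 (building an unordered group of string tokens per row from the columns of interest and attaching column weights to those tokens) can be sketched as below. The function name, the `column_value` token format, the default weight of 1.0, and the sample data are all assumptions; the final step of deriving a meaning vector for each row identifier requires training an embedding model and is not shown.

```python
def weighted_token_groups(rows, id_column, columns, weights):
    """Map each row identifier to an unordered group of weighted string tokens.

    A dict of token -> weight represents the unordered group; columns with
    no entry in `weights` get an assumed default weight of 1.0.
    """
    groups = {}
    for row in rows:
        groups[row[id_column]] = {
            f"{col}_{row[col]}": weights.get(col, 1.0)
            for col in columns
            if row.get(col) is not None
        }
    return groups


# Hypothetical table with two columns of interest.
table = [
    {"id": "c1", "city": "Paris", "tier": "gold"},
    {"id": "c2", "city": "Lyon", "tier": None},
]
groups = weighted_token_groups(table, "id", ["city", "tier"], {"tier": 2.0})
```
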
Creating cognitive intelligence queries from multiple data corpuses

Patent number: 10984030

Abstract: A computer-implemented method, a cognitive intelligence system and computer program product adapt a relational database containing multiple data types. Non-text tokens in the relational database are converted to a textual form. Text is produced based on relations of tokens in the relational database. A set of pre-trained word vectors for the text is retrieved from an external database. The set of pre-trained word vectors is initialized for tokens common to both the relational database and an external database. The set of pre-trained vectors is used to create a cognitive intelligence query expressed as a structured query language (SQL) query. Content of the relational database is used for training while initializing the set of pre-trained word vectors for tokens common to both the relational database and the external database. The first set of word vectors may be immutable or mutable with updates controlled via parameters.

Type: Grant

Filed: March 20, 2017

Date of Patent: April 20, 2021

Assignee: International Business Machines Corporation

Inventors: Rajesh Bordawekar, Oded Shmueli
Parallelized in-place radix sorting

Patent number: 10831738

Abstract: Apparatuses and methods for sorting a data set. A data storage is divided into a plurality of buckets that are each associated with a respective key value. A plurality of stripes is identified in each bucket. A plurality of data stripe sets is defined that has one stripe within each respective bucket. A first and a second in-place partial bucket radix sort are performed on data items contained within the first and second data stripe sets, respectively, using an initial radix. Incorrectly sorted data items in the first bucket are grouped by a first processor and incorrectly sorted data items in the second bucket are grouped by a second processor into a respective incorrect data item group within each bucket. A radix sort is then performed using the initial radix on the items within the respective incorrect data item group. A first level sorted output is produced.

Type: Grant

Filed: December 22, 2017

Date of Patent: November 10, 2020

Assignee: International Business Machines Corporation

Inventors: Rajesh Bordawekar, Daniel Brand, Minsik Cho, Ulrich Finkler, Ruchir Puri
Semantic relational database operations

Patent number: 10831752

Abstract: A method, computer program product and/or system is disclosed. According to an aspect of this invention, one or more processors receive a query of a first database, where the query includes: (i) an operand, and (ii) an operator indicating a distance-based similarity measure. One or more processors further determine a result set based on the query, wherein the result set includes a plurality of records, and wherein a record is included in the result set based on a vector nearest-neighbor computation between: (i) a first vector corresponding to the operand, and (ii) a second vector corresponding to the record, wherein the second vector is included in a vector space model that is based on a textual representation of the first database.

Type: Grant

Filed: April 25, 2018

Date of Patent: November 10, 2020

Assignee: International Business Machines Corporation

Inventors: Rajesh Bordawekar, Oded Shmueli
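
The vector nearest-neighbor computation described in the abstract of patent 10831752 can be illustrated with a small sketch. It assumes record vectors are already available from some vector space model and uses cosine similarity as the distance-based measure; the function names, the record identifiers, and the vectors are hypothetical and are not taken from the patent.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (assumed measure)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def nearest_records(operand_vec, record_vecs, k=2):
    """Return the identifiers of the k records whose vectors are most
    similar to the operand's vector, most similar first."""
    scored = [
        (cosine(operand_vec, vec), rec_id)
        for rec_id, vec in record_vecs.items()
    ]
    scored.sort(reverse=True)
    return [rec_id for _, rec_id in scored[:k]]


# Hypothetical vectors standing in for a trained vector space model.
record_vecs = {"r1": [1.0, 0.0], "r2": [0.8, 0.6], "r3": [0.0, 1.0]}
result_set = nearest_records([1.0, 0.1], record_vecs, k=2)
```

A brute-force scan like this is only practical for small tables; a real system would likely need an index or approximate search, which the abstract does not describe.
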
QUERY RESPONSE USING SEMANTICALLY SIMILAR DATABASE RECORDS

Publication number: 20200210431

Abstract: From a first attribute-value pair in a record, new data is created including a first token. Using a first model and using a processor and a memory, each token is vectorized into new data including a corresponding vector. From the record, a target row is selected, wherein a target attribute-value pair in the target row includes a value for which a semantic similarity computation is to be performed. Using a similarity measure, a set of most similar rows to the target row is determined, wherein each row in the set of most similar rows to the target row has a corresponding similarity measure above a threshold similarity measure and wherein each row in the set of most similar rows includes the target attribute. The set of most similar rows is used to compute a response to a database query.

Type: Application

Filed: January 2, 2019

Publication date: July 2, 2020

Applicant: International Business Machines Corporation

Inventors: Rajesh Bordawekar, Jose Neves
Radix sort acceleration using custom ASIC

Patent number: 10685002

Abstract: An information processing system, computer readable storage medium, and method for accelerated radix sort processing of data elements in an array in memory. The information processing system stores an array of data elements in a buffer memory in an application specific integrated circuit radix sort accelerator. The array has a head end and a tail end. With a head processor, the system radix sort processes data elements starting at the head end of the array and progressively advances toward the tail end of the array. With a tail processor, the system radix sort processes data elements starting at the tail end of the array and progressively advances toward the head end of the array. The tail processor radix sort processes data elements in the array contemporaneously with the head processor.

Type: Grant

Filed: December 29, 2017

Date of Patent: June 16, 2020

Assignee: International Business Machines Corporation

Inventors: Rajesh Bordawekar, Daniel Brand, Minsik Cho, Brian R. Konigsburg, Ruchir Puri
DYNAMIC UPDATING OF A WORD EMBEDDING MODEL

Publication number: 20200175360

Abstract: Methods, systems and computer program products for updating a word embedding model are provided. Aspects include receiving a first data set comprising a relational database having a plurality of words. Aspects also include generating a word embedding model comprising a plurality of word vectors by training a neural network using unsupervised machine learning based on the first data set. Each word vector of the plurality of word vectors corresponds to a unique word of the plurality of words. Aspects also include storing the plurality of word vectors and a representation of a hidden layer of the neural network. Aspects also include receiving a second data set comprising data that has been added to the relational database. Aspects also include updating the word embedding model based on the second data set and the stored representation of the hidden layer of the neural network.

Type: Application

Filed: November 29, 2018

Publication date: June 4, 2020

Inventors: Thomas Conti, Stephen Warren, Rajesh Bordawekar, Jose Neves, Christopher Harding
WORD EMBEDDING MODEL PARAMETER ADVISOR

Publication number: 20200175390

Abstract: Methods, systems and computer program products for determining recommended parameters for use in generating a word embedding model are provided. Aspects include storing a plurality of meaningful test cases. Each meaningful test case includes a test data profile and one or more test model parameters used to create a word embedding model that has been classified as yielding meaningful results. Aspects include receiving a production data set to be used in generating a new word embedding model. The production data set includes data stored in a relational database having a plurality of columns and a plurality of rows. Aspects include generating a data profile associated with the production data set. Aspects include generating a recommendation for one or more production model parameters for use in building a word embedding model based on the data profile associated with the production data set and the plurality of meaningful test cases.

Type: Application

Filed: November 29, 2018

Publication date: June 4, 2020

Inventors: Thomas Conti, Rajesh Bordawekar, Stephen Warren, Christopher Harding, Jose Neves
RECORD CORRECTION AND COMPLETION USING DATA SOURCED FROM CONTEXTUALLY SIMILAR RECORDS

Publication number: 20200159853

Abstract: From a first attribute-value pair in a record, new data comprising a first token is created. From each token using a processor and a memory, new data including a corresponding vector is computed. From the record, a target row is selected, wherein a target attribute-value pair in the target row includes a value requiring correction. Using a similarity measure, a set of most similar rows to the target row is determined, wherein each row in the set of most similar rows to the target row has a corresponding similarity measure above a threshold similarity measure and wherein each row in the set of most similar rows includes the target attribute. From values corresponding to the target attribute in the set of most similar rows, a replacement value is determined. The value requiring correction in the target row is replaced with the replacement value.

Type: Application

Filed: November 20, 2018

Publication date: May 21, 2020

Applicant: International Business Machines Corporation

Inventors: Rajesh Bordawekar, Tin Kam Ho
METHOD AND SYSTEM FOR SUPPORTING INDUCTIVE REASONING QUERIES OVER MULTI-MODAL DATA FROM RELATIONAL DATABASES

Publication number: 20200142989

Abstract: A system and a method for performing queries, including generating text representations of features of various types of data, building a multi-modal word embedding model to capture relationships between the various types of data, and based on the multi-modal word embedding model, performing an inductive reasoning query.

Type: Application

Filed: November 2, 2018

Publication date: May 7, 2020

Inventors: Rajesh Bordawekar, Bortik Bandyopadhyay
SEMANTIC RELATIONAL DATABASE OPERATIONS

Publication number: 20190332705

Abstract: A method, computer program product and/or system is disclosed. According to an aspect of this invention, one or more processors receive a query of a first database, where the query includes: (i) an operand, and (ii) an operator indicating a distance-based similarity measure. One or more processors further determine a result set based on the query, wherein the result set includes a plurality of records, and wherein a record is included in the result set based on a vector nearest-neighbor computation between: (i) a first vector corresponding to the operand, and (ii) a second vector corresponding to the record, wherein the second vector is included in a vector space model that is based on a textual representation of the first database.

Type: Application

Filed: April 25, 2018

Publication date: October 31, 2019

Inventors: Rajesh Bordawekar, Oded Shmueli
