Patents by Inventor Niall F. McCarroll

Niall F. McCarroll has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

Outputting map-reduce jobs to an archive file

Patent number: 11556496

Abstract: Method and system are provided for writing output from map-reduce jobs to an archive file. The method may include providing an archive manager and exposing an interface to be called from map-reduce jobs to output to an archive file in a map-reduce distributed file system. The method may also include using a buffering database as a temporary cache to buffer updates to the archive file. Handling by the archive manager calls from map-reduce jobs may allow: reading directly from an archive file or from a job index at the buffering database; writing to a job index at the buffering database used as a temporary cache to buffer updates; and serializing updates from the buffering database to the archive file.

Type: Grant

Filed: November 2, 2018

Date of Patent: January 17, 2023

Assignee: International Business Machines Corporation

Inventors: Curtis N. Browning, Niall F. McCarroll
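
The flow in the abstract for patent 11556496 (buffer writes in a job index, read from the job index or the archive, then serialize updates to the archive) can be illustrated with a small sketch. This is not the patented implementation: the class and method names are hypothetical, an in-memory dictionary stands in for the buffering database, and a local zip file stands in for an archive on a distributed file system.

```python
import os
import tempfile
import zipfile


class ArchiveManager:
    """Toy archive manager: buffers writes per job, then serializes them."""

    def __init__(self, archive_path):
        self.archive_path = archive_path
        # Stand-in for the buffering database: job id -> {entry name: bytes}.
        self.job_indexes = {}

    def write(self, job_id, name, data):
        # Writes only touch the job index, never the archive itself.
        self.job_indexes.setdefault(job_id, {})[name] = data

    def read(self, job_id, name):
        # Prefer a buffered update from this job; otherwise read the archive.
        buffered = self.job_indexes.get(job_id, {})
        if name in buffered:
            return buffered[name]
        with zipfile.ZipFile(self.archive_path) as archive:
            return archive.read(name)

    def serialize(self, job_id):
        # Merge the job's buffered updates with the existing archive entries
        # and rewrite the archive, then drop the job index.
        updates = self.job_indexes.pop(job_id, {})
        entries = {}
        if os.path.exists(self.archive_path):
            with zipfile.ZipFile(self.archive_path) as archive:
                for entry in archive.namelist():
                    entries[entry] = archive.read(entry)
        entries.update(updates)
        with zipfile.ZipFile(self.archive_path, "w") as archive:
            for entry, data in entries.items():
                archive.writestr(entry, data)


# Example: a job buffers one output, reads it back, then serializes it.
demo_path = os.path.join(tempfile.mkdtemp(), "output.zip")
manager = ArchiveManager(demo_path)
manager.write("job-1", "part-00000", b"key\tvalue\n")
buffered_copy = manager.read("job-1", "part-00000")
manager.serialize("job-1")
```

Rewriting the whole archive on each `serialize` call keeps the sketch short; a real system would presumably append or update entries in place.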

Parallel scoring of an ensemble model

Patent number: 10902005

Abstract: Method and systems for parallel scoring an ensemble model are provided. Aspects include loading data into a first distributed data structure having a plurality of partitions, each partition having loaded data in the form of a set of pairs of data formed of a record to be scored and a partial score for that record. A component model in the ensemble model is selected and processing of the records carried out in parallel across the partitions including updating the partial score for each record. In response to a partial score for a record not meeting an accuracy threshold, the method retains the record in the first distributed data structure to be scored by a subsequent component model. In response to the partial score for a record meeting the accuracy threshold, the method moves the record and updated partial score to an output result data structure to provide a final score.

Type: Grant

Filed: October 26, 2017

Date of Patent: January 26, 2021

Assignee: International Business Machines Corporation

Inventors: Julian J. Clinton, Niall F. McCarroll, Lei Tian
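
The scoring loop in the abstract for patent 10902005 can be sketched roughly as follows. This is an illustration under stated assumptions, not the patented method: Python lists stand in for the partitions of a distributed data structure, a thread pool stands in for parallel workers, component models are plain functions whose outputs are added to the partial score, and a score is taken to "meet the accuracy threshold" when its magnitude reaches `threshold`. All function names are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def score_partition(partition, model, threshold):
    """Apply one component model to every (record, partial_score) pair."""
    retained, finished = [], []
    for record, partial in partition:
        updated = partial + model(record)
        # Assumption: the threshold test is on the magnitude of the score.
        if abs(updated) >= threshold:
            finished.append((record, updated))
        else:
            retained.append((record, updated))
    return retained, finished


def score_ensemble(partitions, models, threshold):
    """Run component models in turn, scoring all partitions in parallel."""
    results = []
    with ThreadPoolExecutor() as pool:
        for model in models:
            outcomes = list(pool.map(
                lambda part: score_partition(part, model, threshold),
                partitions))
            # Records below the threshold stay put for the next model.
            partitions = [retained for retained, _ in outcomes]
            # Records at or above the threshold move to the output.
            for _, finished in outcomes:
                results.extend(finished)
    # Assumption: records that never meet the threshold are emitted
    # with their last partial score once the models run out.
    for partition in partitions:
        results.extend(partition)
    return results


# Example: two partitions, two toy component models.
demo_partitions = [[(0.2, 0.0), (0.9, 0.0)], [(-0.6, 0.0), (0.1, 0.0)]]
demo_models = [lambda x: x, lambda x: 2 * x]
demo_results = score_ensemble(demo_partitions, demo_models, threshold=0.5)
```

In this example the records 0.9 and -0.6 leave after the first model, 0.2 leaves after the second, and 0.1 is emitted at the end without reaching the threshold.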

Parallel scoring of an ensemble model

Patent number: 10650008

Abstract: Method and systems for parallel scoring an ensemble model are provided. Aspects include loading data into a first distributed data structure having a plurality of partitions, each partition having loaded data in the form of a set of pairs of data formed of a record to be scored and a partial score for that record. A component model in the ensemble model is selected and processing of the records carried out in parallel across the partitions including updating the partial score for each record. In response to a partial score for a record not meeting an accuracy threshold, the method retains the record in the first distributed data structure to be scored by a subsequent component model. In response to the partial score for a record meeting the accuracy threshold, the method moves the record and updated partial score to an output result data structure to provide a final score.

Type: Grant

Filed: August 26, 2016

Date of Patent: May 12, 2020

Assignee: International Business Machines Corporation

Inventors: Julian J. Clinton, Niall F. McCarroll, Lei Tian

Outputting map-reduce jobs to an archive file

Publication number: 20190079939

Abstract: Method and system are provided for writing output from map-reduce jobs to an archive file. The method may include providing an archive manager and exposing an interface to be called from map-reduce jobs to output to an archive file in a map-reduce distributed file system. The method may also include using a buffering database as a temporary cache to buffer updates to the archive file. Handling by the archive manager calls from map-reduce jobs may allow: reading directly from an archive file or from a job index at the buffering database; writing to a job index at the buffering database used as a temporary cache to buffer updates; and serializing updates from the buffering database to the archive file.

Type: Application

Filed: November 2, 2018

Publication date: March 14, 2019

Inventors: Curtis N. Browning, Niall F. McCarroll

Generating synthetic data

Patent number: 10171311

Abstract: A method of generating synthetic data from a model of a dataset comprises the steps of receiving a model of a dataset, extracting information from the received model, constructing a database view from the extracted information, receiving a query to the constructed database view, and generating synthetic data from the constructed database view according to the received query.

Type: Grant

Filed: October 17, 2013

Date of Patent: January 1, 2019

Assignee: International Business Machines Corporation

Inventors: Matthew C. Harvey, Niall F. McCarroll, Yefim Shuf
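
The steps in the abstract for patent 10171311 (receive a model, extract information, construct a view, receive a query, generate data) might look like the sketch below. It is only a toy illustration: the model format (a dictionary of per-field statistics), the `SyntheticView` class, and the query shape (a column list plus a row limit) are all assumptions, and a plain Python object stands in for a database view.

```python
import random


def extract_columns(model):
    """Turn a (hypothetical) model dict into per-column sampling functions."""
    columns = {}
    for name, spec in model["fields"].items():
        if spec["type"] == "categorical":
            values = list(spec["frequencies"])
            weights = list(spec["frequencies"].values())
            # Sample a category in proportion to its recorded frequency.
            columns[name] = (
                lambda rng, v=values, w=weights: rng.choices(v, weights=w)[0])
        else:
            # Assumption: every other field is numeric with mean and stddev.
            columns[name] = (
                lambda rng, m=spec["mean"], s=spec["stddev"]: rng.gauss(m, s))
    return columns


class SyntheticView:
    """Stand-in for a database view that generates rows on demand."""

    def __init__(self, columns):
        self.columns = columns

    def query(self, select, limit, seed=None):
        # Generate `limit` rows containing only the selected columns.
        rng = random.Random(seed)
        return [{name: self.columns[name](rng) for name in select}
                for _ in range(limit)]


# Example: a two-field model queried for five synthetic rows.
demo_model = {"fields": {
    "region": {"type": "categorical",
               "frequencies": {"north": 0.7, "south": 0.3}},
    "age": {"type": "numeric", "mean": 40.0, "stddev": 12.0},
}}
demo_view = SyntheticView(extract_columns(demo_model))
demo_rows = demo_view.query(["region", "age"], limit=5, seed=1)
```

Each column is sampled independently here, so correlations between fields in the original dataset are not reproduced.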

Outputting map-reduce jobs to an archive file

Patent number: 10146779

Abstract: Method and system are provided for writing output from map-reduce jobs to an archive file. The method may include providing an archive manager and exposing an interface to be called from map-reduce jobs to output to an archive file in a map-reduce distributed file system. The method may also include using a buffering database as a temporary cache to buffer updates to the archive file. Handling by the archive manager calls from map-reduce jobs may allow: reading directly from an archive file or from a job index at the buffering database; writing to a job index at the buffering database used as a temporary cache to buffer updates; and serializing updates from the buffering database to the archive file.

Type: Grant

Filed: June 26, 2015

Date of Patent: December 4, 2018

Assignee: International Business Machines Corporation

Inventors: Curtis N. Browning, Niall F. McCarroll

Parallel scoring of an ensemble model

Publication number: 20180060330

Abstract: Method and systems for parallel scoring an ensemble model are provided. Aspects include loading data into a first distributed data structure having a plurality of partitions, each partition having loaded data in the form of a set of pairs of data formed of a record to be scored and a partial score for that record. A component model in the ensemble model is selected and processing of the records carried out in parallel across the partitions including updating the partial score for each record. In response to a partial score for a record not meeting an accuracy threshold, the method retains the record in the first distributed data structure to be scored by a subsequent component model. In response to the partial score for a record meeting the accuracy threshold, the method moves the record and updated partial score to an output result data structure to provide a final score.

Type: Application

Filed: October 26, 2017

Publication date: March 1, 2018

Inventors: Julian J. Clinton, Niall F. McCarroll, Lei Tian

Parallel scoring of an ensemble model

Publication number: 20180060324

Abstract: Method and systems for parallel scoring an ensemble model are provided. Aspects include loading data into a first distributed data structure having a plurality of partitions, each partition having loaded data in the form of a set of pairs of data formed of a record to be scored and a partial score for that record. A component model in the ensemble model is selected and processing of the records carried out in parallel across the partitions including updating the partial score for each record. In response to a partial score for a record not meeting an accuracy threshold, the method retains the record in the first distributed data structure to be scored by a subsequent component model. In response to the partial score for a record meeting the accuracy threshold, the method moves the record and updated partial score to an output result data structure to provide a final score.

Type: Application

Filed: August 26, 2016

Publication date: March 1, 2018

Inventors: Julian J. Clinton, Niall F. McCarroll, Lei Tian

Outputting map-reduce jobs to an archive file

Publication number: 20160070711

Abstract: Method and system are provided for writing output from map-reduce jobs to an archive file. The method may include providing an archive manager and exposing an interface to be called from map-reduce jobs to output to an archive file in a map-reduce distributed file system. The method may also include using a buffering database as a temporary cache to buffer updates to the archive file. Handling by the archive manager calls from map-reduce jobs may allow: reading directly from an archive file or from a job index at the buffering database; writing to a job index at the buffering database used as a temporary cache to buffer updates; and serializing updates from the buffering database to the archive file.

Type: Application

Filed: June 26, 2015

Publication date: March 10, 2016

Inventors: Curtis N. Browning, Niall F. McCarroll

Generating synthetic data

Publication number: 20140115007

Abstract: A method of generating synthetic data from a model of a dataset comprises the steps of receiving a model of a dataset, extracting information from the received model, constructing a database view from the extracted information, receiving a query to the constructed database view, and generating synthetic data from the constructed database view according to the received query.

Type: Application

Filed: October 17, 2013

Publication date: April 24, 2014

Applicant: International Business Machines Corporation

Inventors: Matthew C. Harvey, Niall F. McCarroll, Yefim Shuf