Patents by Inventor Henrique Andrade

Henrique Andrade has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

Annotation of a Machine Learning Pipeline with Operational Semantics to Support Distributed Lineage Tracking

Publication number: 20230169354

Abstract: A system, computer program product, and method are provided for distributed data workflow semantics. A pipeline, such as a machine learning (ML) pipeline, is represented in a data flow graph (DFG). The represented pipeline is subject to annotations, with the annotations including pipeline nodes and object references. The pre-processed pipeline is subject to execution or processing with the annotated object references capturing object lineage. Output from the executed pipeline is constructed and a corresponding control signal is formatted to dynamically and selectively control an operatively coupled physical hardware device or software.

Type: Application

Filed: November 30, 2021

Publication date: June 1, 2023

Applicant: International Business Machines Corporation

Inventors: Mudhakar Srivatsa, Raghu Kiran Ganti, Carlos Henrique Andrade Costa, Linsong Chu, Joshua M. Rosenkranz
Annotation of a Machine Learning Pipeline with Operational Semantics

Publication number: 20230168923

Abstract: A system, computer program product, and method are provided for distributed data workflow semantics. A pipeline, such as a machine learning pipeline, is represented in a data flow graph (DFG) with nodes and edges. The represented nodes are configured to be annotated with an operational semantic. An order of execution of the pipeline is discovered through the node annotation(s) represented in the annotated DFG, and execution of the pipeline is based on the discovered order. A control signal formatted based on the executed pipeline is configured to dynamically and selectively control an operatively coupled physical hardware device.

Type: Application

Filed: November 30, 2021

Publication date: June 1, 2023

Applicant: International Business Machines Corporation

Inventors: Raghu Kiran Ganti, Mudhakar Srivatsa, Carlos Henrique Andrade Costa
Annotation of a Machine Learning Pipeline with Operational Semantics

Publication number: 20230169408

Abstract: A system, computer program product, and method are provided for distributed data workflow semantics. A pipeline, such as a machine learning (ML) pipeline, is implemented over a data flow graph (DFG) with nodes configured to support rich semantics. The rich semantics include two or more operational semantics, and at least one lineage semantic to selectively combine features that trace lineage to a common input object. The lineage semantic is leveraged to associate training and testing data set pairs in cross validation of the trained ML models produced from parallelizing the selection of ML pipelines.

Type: Application

Filed: November 30, 2021

Publication date: June 1, 2023

Applicant: International Business Machines Corporation

Inventors: Carlos Henrique Andrade Costa, Raghu Kiran Ganti, Mudhakar Srivatsa, Linsong Chu, Joshua M. Rosenkranz, Tuan Minh Hoang Trong
Data shuffling with hierarchical tuple spaces

Patent number: 10956125

Abstract: Methods and systems for shuffling data are described. A processor may generate pair data from source data. The processor may insert the pair data into local tuple spaces. In response to a request for a particular key, the processor may determine a presence of the requested key in a global tuple space. The processor may, in response to a presence of the requested key in the global tuple space, update the global tuple space. The update may be based on the pair data among the local tuple spaces including the existing key. The processor may, in response to an absence of the requested key in the global tuple space, insert pair data including the missing key from the local tuple spaces into the global tuple space. The processor may fetch the requested pair data, and may shuffle the fetched data to generate a dataset.

Type: Grant

Filed: December 21, 2017

Date of Patent: March 23, 2021

Assignee: International Business Machines Corporation

Inventors: Carlos Henrique Andrade Costa, Abdullah Kayi, Yoonho Park, Charles Johns
Data shuffling with hierarchical tuple spaces

Patent number: 10891274

Abstract: Methods and systems for shuffling data to generate a dataset are described. A first map module may generate first pair data, and a second map module may generate second pair data, from source data. The first map module may insert the first pair data into a first local tuple space accessible to the first map module. The second map module may insert the second pair data into a second local tuple space accessible to the second map module. A shuffle module may request pair data that includes a particular key. The first and second pair data may be inserted into a global tuple space accessible by the first and second map modules. The shuffle module may identify the requested pair data in the global tuple space, and may fetch the identified pair data from a memory. The shuffle module may shuffle the fetched pair data to generate the dataset.

Type: Grant

Filed: December 21, 2017

Date of Patent: January 12, 2021

Assignee: International Business Machines Corporation

Inventors: Abdullah Kayi, Carlos Henrique Andrade Costa, Yoonho Park, Charles Johns
Checkpointing using compute node health information

Patent number: 10545839

Abstract: A method is disclosed, as well as an associated apparatus and computer program product, for checkpointing using a plurality of communicatively coupled compute nodes. The method comprises acquiring health information for a first node of the plurality of compute nodes, and determining a first failure probability for the first node using the health information. The first failure probability corresponds to a predetermined time interval. The method further comprises selecting a second node of the plurality of compute nodes as a partner node for the first node. The second node has a second failure probability for the time interval. A composite failure probability of the first node and the second node is less than the first failure probability. The method further comprises copying checkpoint information from the first node to the partner node.

Type: Grant

Filed: December 22, 2017

Date of Patent: January 28, 2020

Assignee: International Business Machines Corporation

Inventors: Carlos Henrique Andrade Costa, Yoonho Park, Chen-Yong Cher, Bryan Rosenburg, Kyung Ryu
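
The partner selection described in the abstract of patent 10545839 can be illustrated with a minimal sketch. The abstract only requires that the composite failure probability be lower than the first node's own; how it is computed is not stated. The sketch below assumes that node failures are independent, so the composite is the product of the two probabilities, and it picks the candidate with the lowest failure probability. The function name `pick_partner` and the node names are hypothetical.

```python
from typing import Dict


def pick_partner(node: str, failure_prob: Dict[str, float]) -> str:
    """Pick a checkpoint partner for `node` (illustrative sketch only).

    failure_prob maps each node name to its estimated probability of
    failing within the checkpoint interval, derived from health data.
    """
    p_first = failure_prob[node]
    # Every other node is a candidate partner.
    candidates = {n: p for n, p in failure_prob.items() if n != node}
    # Choose the candidate least likely to fail in the same interval.
    partner = min(candidates, key=candidates.get)
    # Assumption: failures are independent, so both nodes fail together
    # with probability p_first * p_partner.
    composite = p_first * candidates[partner]
    if not composite < p_first:
        raise ValueError("no partner lowers the failure probability")
    return partner


probs = {"n0": 0.2, "n1": 0.05, "n2": 0.5}
partner = pick_partner("n0", probs)
```

With these example values the checkpoint of `n0` would be copied to `n1`, since losing both copies requires both nodes to fail in the same interval.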
Data shuffling with hierarchical tuple spaces

Publication number: 20190196783

Abstract: Methods and systems for shuffling data are described. A processor may generate pair data from source data. The processor may insert the pair data into local tuple spaces. In response to a request for a particular key, the processor may determine a presence of the requested key in a global tuple space. The processor may, in response to a presence of the requested key in the global tuple space, update the global tuple space. The update may be based on the pair data among the local tuple spaces including the existing key. The processor may, in response to an absence of the requested key in the global tuple space, insert pair data including the missing key from the local tuple spaces into the global tuple space. The processor may fetch the requested pair data, and may shuffle the fetched data to generate a dataset.

Type: Application

Filed: December 21, 2017

Publication date: June 27, 2019

Inventors: Carlos Henrique Andrade Costa, Abdullah Kayi, Yoonho Park, Charles Johns
Data shuffling with hierarchical tuple spaces

Publication number: 20190197138

Abstract: Methods and systems for shuffling data to generate a dataset are described. A first map module may generate first pair data, and a second map module may generate second pair data, from source data. The first map module may insert the first pair data into a first local tuple space accessible to the first map module. The second map module may insert the second pair data into a second local tuple space accessible to the second map module. A shuffle module may request pair data that includes a particular key. The first and second pair data may be inserted into a global tuple space accessible by the first and second map modules. The shuffle module may identify the requested pair data in the global tuple space, and may fetch the identified pair data from a memory. The shuffle module may shuffle the fetched pair data to generate the dataset.

Type: Application

Filed: December 21, 2017

Publication date: June 27, 2019

Inventors: Abdullah Kayi, Carlos Henrique Andrade Costa, Yoonho Park, Charles Johns
Checkpointing using compute node health information

Publication number: 20190196920

Abstract: A method is disclosed, as well as an associated apparatus and computer program product, for checkpointing using a plurality of communicatively coupled compute nodes. The method comprises acquiring health information for a first node of the plurality of compute nodes, and determining a first failure probability for the first node using the health information. The first failure probability corresponds to a predetermined time interval. The method further comprises selecting a second node of the plurality of compute nodes as a partner node for the first node. The second node has a second failure probability for the time interval. A composite failure probability of the first node and the second node is less than the first failure probability. The method further comprises copying checkpoint information from the first node to the partner node.

Type: Application

Filed: December 22, 2017

Publication date: June 27, 2019

Inventors: Carlos Henrique Andrade Costa, Yoonho Park, Chen-Yong Cher, Bryan Rosenburg, Kyung Ryu
Processing of streaming data with a keyed join

Patent number: 10244017

Abstract: A keyed join is used in the processing of streaming data to streamline processing to provide higher throughput and decreased use of resources. The most recent event for each unique replacement key value(s) is maintained, replacing older events with the same key. An incoming event is joined with the data received from one or more other data sources, and the correlations are output.

Type: Grant

Filed: August 14, 2009

Date of Patent: March 26, 2019

Assignee: International Business Machines Corporation

Inventors: Henrique Andrade, Mitchell A. Cohen, Bugra Gedik
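
As a rough illustration of the keyed join abstract above (patent 10244017), the sketch below keeps only the most recent event per key from one stream and joins each incoming event from a second stream against it. The class name `KeyedJoin`, its method names, and the example payloads are hypothetical, and the single-table design is an assumption, not the patented implementation.

```python
from typing import Any, Dict, Hashable, Optional, Tuple


class KeyedJoin:
    """Join incoming events against the latest stored event per key."""

    def __init__(self) -> None:
        # One slot per key: a newer event replaces the older one, so the
        # stored state is bounded by the number of distinct keys.
        self.latest: Dict[Hashable, Any] = {}

    def on_reference(self, key: Hashable, payload: Any) -> None:
        # Substitute any older event that carries the same key.
        self.latest[key] = payload

    def on_event(self, key: Hashable, payload: Any) -> Optional[Tuple[Any, Any]]:
        # Correlate the incoming event with the retained event, if any.
        if key not in self.latest:
            return None
        return (payload, self.latest[key])


join = KeyedJoin()
join.on_reference("ibm", 100)
join.on_reference("ibm", 101)          # replaces the older "ibm" event
match = join.on_event("ibm", "trade")  # joined with the latest value
miss = join.on_event("xyz", "trade")   # no retained event for this key
```

Keeping one slot per key is what lets the join avoid scanning a window of older events.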
Managing resource allocation or configuration parameters of a model building component to build analytic models to increase the utility of data analysis applications

Patent number: 9244735

Abstract: Data analysis applications include model building components and stream processing components. To increase utility of the data analysis application, in one embodiment, the model building component of the data analysis application is managed. Management includes resource allocation and/or configuration adaptation of the model building component, as examples.

Type: Grant

Filed: January 7, 2014

Date of Patent: January 26, 2016

Assignee: International Business Machines Corporation

Inventors: Henrique Andrade, Bugra Gedik, Vibhore Kumar, Kun-Lung Wu
Application resource model composition from constituent components

Patent number: 9135069

Abstract: Techniques for composing an application resource model are disclosed.

Type: Grant

Filed: May 4, 2012

Date of Patent: September 15, 2015

Assignee: International Business Machines Corporation

Inventors: Henrique Andrade, Bugra Gedik, Sujay Sunil Parekh, Kun-Lung Wu, Xiaolan Zhang
Injecting a fault into a stream operator in a data stream processing application

Patent number: 8997039

Abstract: In one embodiment, the invention comprises partial fault tolerant stream processing applications. One embodiment of a method for implementing partial fault tolerance in a stream processing application comprising a plurality of stream operators includes: defining a quality score function that expresses how well the application is performing quantitatively, injecting a fault into at least one of the plurality of operators, assessing an impact of the fault on the quality score function, and selecting at least one partial fault-tolerant technique for implementation in the application based on the quantitative metric-driven assessment.

Type: Grant

Filed: April 22, 2013

Date of Patent: March 31, 2015

Assignee: International Business Machines Corporation

Inventors: Henrique Andrade, Bugra Gedik, Gabriela Jacques da Silva, Kun-Lung Wu
Failure recovery for stream processing applications

Patent number: 8949801

Abstract: In one embodiment, the invention is a method and apparatus for failure recovery for stream processing applications. One embodiment of a method for providing a failure recovery mechanism for a stream processing application includes receiving source code for the stream processing application, wherein the source code defines a fault tolerance policy for each of the components of the stream processing application, and wherein respective fault tolerance policies defined for at least two of the plurality of components are different, generating a sequence of instructions for converting the state(s) of the component(s) into a checkpoint file comprising a sequence of storable bits on a periodic basis, according to a frequency defined in the fault tolerance policy, initiating execution of the stream processing application, and storing the checkpoint file, during execution of the stream processing application, at a location that is accessible after failure recovery.

Type: Grant

Filed: May 13, 2009

Date of Patent: February 3, 2015

Assignee: International Business Machines Corporation

Inventors: Henrique Andrade, Bugra Gedik, Gabriela Jacques da Silva, Kun-Lung Wu
Method for high-performance data stream processing

Patent number: 8949810

Abstract: Techniques for optimizing data stream processing are provided. The techniques include employing a pattern, wherein the pattern facilitates splitting of one or more incoming streams and distributing processing across one or more operators, obtaining one or more operators, wherein the one or more operators support at least one group-independent aggregation and join operation on one or more streams, generating code, wherein the code facilitates mapping of the application onto a computational infrastructure to enable workload partitioning, using the one or more operators to decompose each of the application into one or more granular components, and using the code to reassemble the one or more granular components into one or more deployable blocks to map the application to a computational infrastructure, wherein reassembling the one or more granular components to map the application to the computational infrastructure optimizes data stream processing of the application.

Type: Grant

Filed: June 16, 2008

Date of Patent: February 3, 2015

Assignee: International Business Machines Corporation

Inventors: Henrique Andrade, Bugra Gedik, Kun-Lung Wu
Incrementally constructing executable code for component-based applications

Patent number: 8943482

Abstract: One embodiment of a method for constructing executable code for a component-based application includes receiving a request to compile source code for the component-based application, wherein the request identifies the source code, and wherein the source code comprises a plurality of source code components, each of the source code components implementing a different component of the application, and performing a series of steps for each source code component where the series of steps includes: deriving a signature for the source code component, retrieving a stored signature corresponding to a currently available instance of executable code for the source code component, comparing the derived signature with the stored signature, compiling the source code component into the executable code when the derived signature does not match the stored signature, and obtaining the executable code for the source code component from a repository when the derived signature matches the stored signature.

Type: Grant

Filed: May 15, 2009

Date of Patent: January 27, 2015

Assignee: International Business Machines Corporation

Inventors: Henrique Andrade, Bugra Gedik, Rui Hou, Hua Yong Wang, Kun-Lung Wu
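
The per-component steps listed in the abstract of patent 8943482 (derive a signature, compare it with the stored one, compile on mismatch, reuse on match) can be sketched as follows. This is a minimal illustration under stated assumptions: the signature is taken to be a SHA-256 hash of the source text, the repository is a plain dictionary, and the repository is updated after each compilation. The names `build`, `repository`, and `compile_fn` are hypothetical.

```python
import hashlib
from typing import Callable, Dict, Tuple


def build(
    components: Dict[str, str],
    repository: Dict[str, Tuple[str, str]],
    compile_fn: Callable[[str], str],
) -> Dict[str, str]:
    """Return executable code per component, recompiling only on change.

    components maps a component name to its source code; repository maps
    a component name to (stored signature, stored executable).
    """
    executables: Dict[str, str] = {}
    for name, source in components.items():
        # Derive a signature for the source code component (assumed: SHA-256).
        signature = hashlib.sha256(source.encode("utf-8")).hexdigest()
        stored = repository.get(name)
        if stored is not None and stored[0] == signature:
            # Signatures match: obtain the executable from the repository.
            executables[name] = stored[1]
        else:
            # Signatures differ or nothing is stored: compile, then record
            # the result (recording is an assumption of this sketch).
            binary = compile_fn(source)
            repository[name] = (signature, binary)
            executables[name] = binary
    return executables


compiled = []


def fake_compile(source: str) -> str:
    compiled.append(source)  # track which sources were actually compiled
    return "bin:" + source


repo: Dict[str, Tuple[str, str]] = {}
build({"a": "x", "b": "y"}, repo, fake_compile)
second = build({"a": "x", "b": "z"}, repo, fake_compile)
```

In the second call only component `b` is recompiled, because the source of `a` still hashes to its stored signature.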
Use of vectorization instruction sets

Patent number: 8904366

Abstract: In one embodiment, the invention is a method and apparatus for use of vectorization instruction sets. One embodiment of a method for generating vector instructions includes receiving source code written in a high-level programming language, wherein the source code includes at least one high-level instruction that performs multiple operations on a plurality of vector operands, and compiling the high-level instruction(s) into one or more low-level instructions, wherein the low-level instructions are in an instruction set of a specific computer architecture.

Type: Grant

Filed: May 15, 2009

Date of Patent: December 2, 2014

Assignee: International Business Machines Corporation

Inventors: Henrique Andrade, Bugra Gedik, Hua Yong Wang, Kun-Lung Wu
Processing of streaming data with keyed aggregation

Patent number: 8868518

Abstract: Keyed aggregation is used in the processing of streaming data to streamline processing to provide higher throughput and decreased use of resources. The most recent event for each unique replacement key value(s) is maintained. In response to an incoming event having a same key as a previous event, the effect on an aggregation of the previous event is removed. The aggregation is then updated with one or more values from the arriving event and the updated aggregation is output.

Type: Grant

Filed: August 14, 2009

Date of Patent: October 21, 2014

Assignee: International Business Machines Corporation

Inventors: Henrique Andrade, Mitchell A. Cohen, Bugra Gedik
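
The update rule in the keyed aggregation abstract above (patent 8868518) can be shown with a small sketch: when an event arrives with a key that was seen before, the previous event's contribution is removed before the new value is applied. The aggregation here is assumed to be a running sum, and the class name `KeyedSum` is hypothetical.

```python
from typing import Dict, Hashable


class KeyedSum:
    """Running sum in which each key contributes only its latest value."""

    def __init__(self) -> None:
        self.latest: Dict[Hashable, float] = {}  # most recent value per key
        self.total: float = 0.0

    def on_event(self, key: Hashable, value: float) -> float:
        # Remove the effect of the previous event with the same key, if any.
        if key in self.latest:
            self.total -= self.latest[key]
        # Apply the arriving event and keep it as the latest for this key.
        self.latest[key] = value
        self.total += value
        # The updated aggregation is output after every event.
        return self.total


agg = KeyedSum()
outputs = [agg.on_event(k, v) for k, v in [("a", 10), ("b", 5), ("a", 3)]]
```

The third event replaces the earlier value for key `a`, so the sum moves from 15 to 8 without recomputing over all stored events.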
Generating layouts for graphs of data flow applications

Patent number: 8856766

Abstract: An embodiment of the invention provides a method of displaying a data flow, wherein a description of a data flow application to be displayed is received. The data flow application includes nodes and edges connecting the nodes, wherein the nodes represent operators and the edges represent data connections for data flowing between the operations. A reason that a user is to view the data flow and/or a user constraint on a complexity of the data flow application to be displayed is determined with a processor; and, the time required to render a display of the data flow application is estimated. A transformed representation of the data flow application is created with the processor. The transformed representation is created based upon the user reason, the user constraint, the estimated time of rendering, and/or a layout strategy. The transformed representation is displayed on a graphical user interface.

Type: Grant

Filed: May 11, 2012

Date of Patent: October 7, 2014

Assignee: International Business Machines Corporation

Inventors: Andrew Lawrence Frenkiel, Henrique Andrade, Bugra Gedik, Michael Donald Pfeifer, Wim De Pauw
Automated building and retargeting of architecture-dependent assets

Patent number: 8843904

Abstract: Architecture-dependent assets are automatically built and retargeted. An asset originally built for one architecture is downloaded and automatically retargeted on another architecture. This automatic retargeting may be performed on demand, at runtime.

Type: Grant

Filed: January 26, 2010

Date of Patent: September 23, 2014

Assignee: International Business Machines Corporation

Inventors: Henrique Andrade, Judah M. Diament, Bugra Gedik, Anton V. Riabov
