Data Extraction, Transformation, and Loading (ETL) Patents (Class 707/602)
  • Patent number: 10896194
    Abstract: A non-transitory computer readable medium storing instructions that, when executed by an electronic processor, perform a set of functions. The set of functions includes extracting a report including a markup language document from a system. The set of functions also includes, for each of a plurality of processing tasks, determining whether the markup language document includes a path contained in a virtual table assigned to the processing task. The set of functions also includes, in response to the markup language document including the path contained in the virtual table, extracting data from the markup language document and executing the processing task to manipulate and queue the data for insertion into the combined database. The set of functions further includes, in response to each of the plurality of processing tasks completing without failure, inserting the data queued into one or more tables included in a database.
    Type: Grant
    Filed: December 21, 2017
    Date of Patent: January 19, 2021
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventor: Bartosz Brewinski
  • Patent number: 10891271
    Abstract: According to embodiments, a multi-node database management system allows consumer processes (“consumers”) implementing a portion of a distributed data-combination operation to independently send a STOP notification to corresponding producer processes (“producers”). Upon a given consumer determining that the consumer requires no further information from corresponding producers, the consumer sends a STOP notification to the producers. When a given consumer sends out a STOP notification, the producers drop any data destined for the given consumer and also stop preparing data for and sending rows to the given consumer. Furthermore, once the producers receive STOP notifications from all of the consumers corresponding to the producers, the producers stop the current sub plan execution immediately without requiring completion of the sub plan.
    Type: Grant
    Filed: May 25, 2018
    Date of Patent: January 12, 2021
    Assignee: Oracle International Corporation
    Inventors: Yi Pan, Srikanth Bellamkonda, Madhuri Kandepi
  • Patent number: 10891258
    Abstract: De-normalized data structure files generation systems and methods are provided. The system obtains files from sources wherein each file includes records, parses files to validate records and attributes in the records, identifies a set of similar files from the validated files, and appends two or more files from the set of similar files to obtain one or more consolidated files. Each of the one or more consolidated files corresponds to a specific category. The system further performs a predefined logic validation on each of the one or more consolidated files to obtain a logic validated file for each of the one or more consolidated files. Each logic validated file obtained for the one or more consolidated files includes validated records. The system further generates a de-normalized data structure file including de-normalized records by merging each of the logic validated files to be used for generating intelligence reports.
    Type: Grant
    Filed: January 30, 2017
    Date of Patent: January 12, 2021
    Assignee: Tata Consultancy Services Limited
    Inventors: Ranjan Kumar Sarangi, Sridhar Palla, Manish Kumar, Susant Kumar Bhuyan, Debiprasad Swain, Soumyadeep Ghosh, Padmashwini R
  • Patent number: 10885087
    Abstract: A requirements-traceability system extracts and classifies project requirements stored in a set of source documents. If a source document is unstructured, such as a natural-language word-processing file, the system uses a self-learning or cognitive natural-language tool to infer requirements in that document. Each requirement may be composed of more detailed sub-requirements in parent-child relationships. Requirements are reclassified into a standardized classification scheme and stored in a standardized hierarchical data structure in which each level corresponds to a requirement's relative degree of granularity. The tree is updated whenever requirements are revised, allowing users and downstream applications to bidirectionally trace each requirement's ancestors and descendants and to review and audit revision histories of the project's entire requirements hierarchy.
    Type: Grant
    Filed: November 13, 2018
    Date of Patent: January 5, 2021
    Assignee: International Business Machines Corporation
    Inventors: Luan Rodrigues De Oliveira, Gerhardt J. Scriven, Fabiana Carvalho Landgraf
  • Patent number: 10877805
    Abstract: Systems, methods and computer program products are provided. Metadata associated with an integration flow comprising a sequence of nodes is received. The metadata identifies data in one or more data objects used by the nodes of the integration flow. In response to initiation of the integration flow, an input data object is received. Initial context data for the integration flow is extracted, from the input data object, based on the metadata. The context data is processed at each of the nodes of the integration flow, wherein one or more of the nodes adds data from its output data object to the context data based on the metadata. Remaining data from the output data objects of one or more of the nodes which was not added to the context data based on the metadata is discarded.
    Type: Grant
    Filed: September 14, 2018
    Date of Patent: December 29, 2020
    Assignee: International Business Machines Corporation
    Inventors: Doina L. Klinger, Anthony D. Curcio
  • Patent number: 10866973
    Abstract: As disclosed herein, a method includes receiving a plurality of datasets from a database, wherein each dataset comprises one or more data fields represented in a single data format, and wherein the data fields from at least two of the datasets are represented in different data formats, combining the plurality of datasets to provide a created data column corresponding to all of the data fields from the plurality of datasets, organizing the data column into data clusters, wherein each data cluster includes data fields represented in a single data format, and wherein each data field belongs to a data cluster, providing a key-value map referencing data fields with respect to their corresponding data formats, and verifying the database with respect to the created column. A corresponding computer program product and computer system are also disclosed.
    Type: Grant
    Filed: December 7, 2017
    Date of Patent: December 15, 2020
    Assignee: International Business Machines Corporation
    Inventors: Pedro M. Barbas, Konrad Emanowicz, Enda McCallig, Aslam F. Nomani, Lei Pan
  • Patent number: 10860652
    Abstract: A method, apparatus, and computer-readable medium for generating categorical and criterion-based search results from a search query including receiving the search query, generating one or more query fragments, determining a category corresponding to the search query, determining one or more filters applicable to the search query and one or more core search terms applicable to the search query based at least in part on the determined category and the one or more query fragments, generating at least one custom query for at least one target database in the one or more target databases based at least in part on the one or more filters, the one or more core search terms, the determined category, and one or more attributes of the at least one target database, and executing the at least one custom query on the at least one target database to generate a set of search results.
    Type: Grant
    Filed: May 10, 2018
    Date of Patent: December 8, 2020
    Assignee: Agora Intelligence, Inc.
    Inventors: Kevin Hopkins, Jarom Smith
  • Patent number: 10860548
    Abstract: A system and method of use resolves the frustration of repeated manual work during schema mapping. The system utilizes a transformation graph—a collection of nodes (unified attributes) and edges (transformations) in which source attributes are mapped and transformed. The system further leverages existing mappings and transformations for the purpose of suggesting to a user the optimal paths (i.e., the lowest cost paths) for mapping new sources, which is particularly useful when new sources share similarity with previously mapped sources and require the same transformations. As such, the system also promotes an evolving schema by allowing users to select which unified attributes they want to include in a target schema at any time. The system addresses the technical challenge of finding optimal transformation paths and how to present these to the user for evaluation.
    Type: Grant
    Filed: December 5, 2019
    Date of Patent: December 8, 2020
    Assignee: TAMR, INC.
    Inventors: Sharon Roth, Ihab F. Ilyas, Daniel Meir Bruckner, Gideon Goldin
  • Patent number: 10860569
    Abstract: A method for processing events comprising time series data may include inferring different schemas associated with the events. The method may also include storing property definitions corresponding to the events. Each property definition may include a name and a data type. The method may also include storing schema definitions corresponding to the different schemas that are inferred. Each schema definition may include a set of one or more properties. The method may also include updating at least one data structure for storing information about the events based on the different schemas that are inferred.
    Type: Grant
    Filed: October 17, 2017
    Date of Patent: December 8, 2020
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Alexandre Igorevich Mineev, Venkatasubramanian Jayaraman, Dmitry Denisov, Matthew Robert Darsney, Om Prakash Ravi
  • Patent number: 10853198
    Abstract: Systems, computer program products, and methods are described herein for restoring a transformation state using blockchain technology. The present invention is configured to electronically receive a data transformation request to implement one or more changes to one or more target systems; electronically extract data from one or more source systems based on at least receiving the one or more data transformation protocols; determine the one or more target systems associated with the data transformation request; generate an image of the first state of the one or more target systems; generate a cryptodigit associated with the first state of the one or more target systems; store the generated cryptodigit and the image of the first state of the one or more target systems as a first node in a blockchain distributed ledger; and implement the one or more changes to the one or more target systems.
    Type: Grant
    Filed: January 30, 2019
    Date of Patent: December 1, 2020
    Assignee: BANK OF AMERICA CORPORATION
    Inventors: Haribabu Reddy Marthala, Bhagat Kumar Allugubelly
  • Patent number: 10846078
    Abstract: A method may include detecting, at a development system hosting a first software application, a change to a first database table storing a master data associated with the first software application. The change may correspond to a customization applied to the first software application. The master data may include data objects that the first software application requires for performing a function of the first software application. In response to detecting the change to the first database table, the change may be applied to a second database table storing a replica of the master data. A transport request may be generated to include the customization and at least a portion of the second database table including the change. The transport request may be sent to a production system hosting a second software application to deploy the customization at the production system. Related systems and articles of manufacture are also provided.
    Type: Grant
    Filed: November 6, 2018
    Date of Patent: November 24, 2020
    Assignee: SAP SE
    Inventors: Wulf Kruempelmann, Barbara Freund
  • Patent number: 10846293
    Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for providing a statement that references a particular attribute of a particular topic, in response to providing the statement, obtaining one or more query patterns that each include one or more query terms that are used in queries submitted to a search system in obtaining a value for the particular attribute of the particular topic, generalizing one or more of the query patterns, and associating the one or more generalized query patterns with one or more other topics that include the particular attribute.
    Type: Grant
    Filed: January 16, 2018
    Date of Patent: November 24, 2020
    Assignee: Google LLC
    Inventors: Junli Xian, Engin Cinar Sahin, John Blitzer, Emma S. Persky
  • Patent number: 10846459
    Abstract: A unified messaging platform is described which provides a comprehensive environment for collaboration, file sharing, and project management. In one aspect, a system includes hardware processing circuitry configured to receive a message, the message identifying a user via a user callout, identify a device associated with the user, identify a device type of the identified device and one or more applications on the identified device, generate, based on the device type and the one or more applications, a notification including machine-executable instructions that, when accessed and executed by the device, cause the one or more applications to display a notice about the user callout, and send the notification to the device.
    Type: Grant
    Filed: September 7, 2018
    Date of Patent: November 24, 2020
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Mira Lane, Michael Brasket, Larry Waldman, Chad Voss, Swati Jhawar
  • Patent number: 10838960
    Abstract: Performing data analytics processing in the context of a large scale distributed system that includes a massively parallel processing (MPP) database and a distributed storage layer is disclosed. In various embodiments, a data analytics request is received. A plan is created to generate a response to the request. A corresponding portion of the plan is assigned to each of a plurality of distributed processing segments, including by invoking as indicated in the assignment one or more data analytical functions embedded in the processing segment.
    Type: Grant
    Filed: November 22, 2017
    Date of Patent: November 17, 2020
    Assignee: EMC IP Holding Company LLC
    Inventors: Caleb E. Welton, Shengwen Yang
  • Patent number: 10824614
    Abstract: A method stores records for a set of entities that are generated using an input parameter that is not based on a date. A query is received that includes one or more date parameters, the query for aggregating a value. Upon receiving the query, the method performs: selecting a set of records from the stored records that are valid based on comparing first date information determined from the one or more date parameters and second date information from the records; performing an aggregation calculation of the value for the set of records to generate a query result; and returning the query result in response to the query.
    Type: Grant
    Filed: December 19, 2016
    Date of Patent: November 3, 2020
    Assignee: SAP SE
    Inventor: Ashley Farr
  • Patent number: 10810199
    Abstract: A query optimizer improves the efficiency of a computer database system utilizing an input-output correlator used with a create function that indicates a correlation between an input to the function and an output of the function. In an example, the input-output correlator is an OR OUTPUT parameter of a create table function. The query optimizer determines whether it can pass a value of a query to the input of the function in response to the input-output correlator. Under appropriate conditions, the query optimizer passes the query value to the input of the function to significantly reduce the amount of data returned by the function thereby reducing the load on database resources.
    Type: Grant
    Filed: August 30, 2018
    Date of Patent: October 20, 2020
    Assignee: International Business Machines Corporation
    Inventors: Craig S. Aldrich, Mark J. Anderson
  • Patent number: 10812357
    Abstract: A system for performing a timeliness control is disclosed. The system identifies a dataflow path for performing timeliness control and identifies a first network node and a second network node of the dataflow path for determining a latency between the first and the second network node. The system determines an output lineage corresponding to the dataflow path and identifies, from the output lineage, a first control value associated with the first network node and a second control value associated with the second network node. Then, the system extracts a first timestamp from the first control value and a second timestamp from the second control value and determines the latency based on the first timestamp and the second timestamp. Although the intra-node latency is described herein with respect to a first and a second node, the intra-node latency can be determined for up to n nodes using the techniques described herein.
    Type: Grant
    Filed: October 10, 2018
    Date of Patent: October 20, 2020
    Assignee: Bank of America Corporation
    Inventors: Amitava Deb, Sandip Gopal Bhatwadekar, Chih-Chin Yang, Jovan Cenev
  • Patent number: 10810101
    Abstract: A system for and a method of testing the performance of a database management system. The system and method utilize a data table generator, a query generator, and a query driver system that are configured to generate test data, generate a series of test queries, and execute the queries against the data in a controlled and measurable manner such that the performance of the database management system can be tested in a configurable, repeatable, and consistent manner to measure the impact of system software and configuration changes.
    Type: Grant
    Filed: November 30, 2017
    Date of Patent: October 20, 2020
    Assignee: JPMORGAN CHASE BANK, N.A.
    Inventors: Eric C. Beck, Adam David Wade
  • Patent number: 10811965
    Abstract: Systems and methods are provided for regulating a power converter. An example system controller includes: a driver configured to output a drive signal to a switch to affect a current flowing through an inductive winding of a power converter, the drive signal being associated with a switching period including an on-time period and an off-time period. The switch is closed in response to the drive signal during the on-time period. The switch is opened in response to the drive signal during the off-time period. A duty cycle is equal to a duration of the on-time period divided by a duration of the switching period. One minus the duty cycle is equal to a parameter. The system controller is configured to keep a multiplication product of the duty cycle, the parameter and the duration of the on-time period approximately constant.
    Type: Grant
    Filed: March 6, 2019
    Date of Patent: October 20, 2020
    Assignee: On-Bright Electronics (Shanghai) Co., Ltd.
    Inventors: Qian Fang, Cong Lan, Lieyi Fang
  • Patent number: 10810224
    Abstract: A computerized method for ingesting data from a relational database into a data lake is provided, wherein a user-defined function (UDF) is associated with a standard operation of extract, transform, load, or ETL, of an ETL pipeline. This UDF is triggered upon performing the standard operation and thereby allows code associated with the UDF to be executed. Upon migrating data from one or more data sources into the relational database, the standard operation is executed, which triggers the UDF and, in turn, an execution of the code. As per the execution of this code, an entity running on the data lake is notified that a set of data migrated to the relational database is to be ingested according to given ingestion modalities specified by the code. Finally, the set of data can be ingested into the data lake according to the modalities. Related computer program products are also provided.
    Type: Grant
    Filed: June 27, 2018
    Date of Patent: October 20, 2020
    Assignee: International Business Machines Corporation
    Inventors: Daniel N. Bauer, Luis Garcés Erice, John G. Rooney, Peter Urbanetz
  • Patent number: 10812611
    Abstract: Provided are computer-implemented methods and systems for publishing an application to a web container. An example method for publishing an application to a web container may include establishing a channel of communication with a user device associated with an end user. The method may further include embedding a web container into a web portal associated with a plurality of applications. The method may include executing an application in a user session associated with the end user. The method may further include capturing images of a virtual screen associated with the application executed on the application server. After the capture, the method may continue with sending the images to the web container of the web portal running in a web browser of the user device. The web container may publish the images to the web browser to display the application as part of the web portal in the web browser.
    Type: Grant
    Filed: December 29, 2017
    Date of Patent: October 20, 2020
    Assignee: ASG Technologies Group, Inc.
    Inventors: Gabriel Bennet, Braulio Megías
  • Patent number: 10795774
    Abstract: Methods and systems for efficiently downloading archived snapshot data from the cloud or from an archival data store are described. In a disaster recovery scenario in which an entire storage appliance for backing up different point in time versions of a virtual machine has failed (e.g., due to a fire), archived snapshot data for the different point in time versions may be acquired by a second storage appliance from an archival data store (e.g., cloud-based data storage) using one or more snapshot mapping files. A snapshot mapping file may include pointers to a plurality of data blocks within the archival data store for generating a full image snapshot associated with a particular point in time version of the virtual machine. The plurality of data blocks may comprise the minimum number of data blocks necessary to construct the particular point in time version of the virtual machine.
    Type: Grant
    Filed: January 19, 2018
    Date of Patent: October 6, 2020
    Assignee: Rubrik, Inc.
    Inventors: Prateek Pandey, Arpit Agarwal
  • Patent number: 10783161
    Abstract: A method includes determining, by a controller, a portion of data that is selected by a user. The portion of data includes source data that is to be transformed by at least one shaping function. The method also includes generating, by the controller, a first output recommendation data that communicates at least one recommended shaping function to apply to the portion of data. The first output recommendation data is generated based on patterns of shaping functions that have been previously chosen. The patterns of shaping functions that have been previously chosen can be chosen by a plurality of system users. The method also includes determining whether to apply the at least one recommended shaping function to the portion of data. The method also includes applying the at least one recommended shaping function based on the determining.
    Type: Grant
    Filed: December 15, 2017
    Date of Patent: September 22, 2020
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Manish Bhide, Shabharesh Gudla, Sameep Mehta, Prishni Rateria, Samiulla Shaikh, Neelesh K. Shukla, Paul S. Taylor
  • Patent number: 10776439
    Abstract: The current document is directed to systems, and methods incorporated within the systems, that execute queries against log-file entries. A monitoring subsystem within a distributed computer system uses query results during analysis of log-file entries in order to detect changes in the state of the distributed computer system, identify problems or potential problems, and predict and forecast system characteristics. Because of the large numbers of log-file-entry containers that may need to be opened and processed in order to execute a single query, and because opening and reading through the entries in a log-file-entry container is a computationally expensive and time-consuming operation, the currently disclosed systems employ event-type metadata associated with log-file-entry containers to avoid opening and reading through the log-file entries of log-file-entry containers that do not contain log-file entries with event types relevant to the query.
    Type: Grant
    Filed: November 17, 2017
    Date of Patent: September 15, 2020
    Assignee: VMware, Inc.
    Inventors: Darren Brown, Nicholas Kushmerick, Mayank Agarwal, Junyuan Lin
  • Patent number: 10768907
    Abstract: Systems, computer program products, and methods are described herein for data transformation prediction and code change analysis. The present invention is configured to electronically receive one or more data transformation protocols; electronically extract data from a first source system based on at least receiving the one or more data transformation protocols; initiate an impact analysis associated with transforming the data extracted from the first source system using the one or more data transformation protocols, wherein initiating further comprises determining one or more impacts of the data transformation on one or more other source systems; and initiate a presentation of a user interface for display on the user device, wherein the user interface comprises a graphical representation of the one or more impacts of the data transformation of the data extracted from the first source system on the one or more other source systems.
    Type: Grant
    Filed: January 30, 2019
    Date of Patent: September 8, 2020
    Assignee: Bank of America Corporation
    Inventors: Haribabu Reddy Marthala, Bhagat Kumar Allugubelly
  • Patent number: 10762518
    Abstract: A computer-implemented method for responding to user behaviors includes storing category specifications for a plurality of categories configured to characterize users, storing categories for users in a computer network system, detecting behaviors of a user in real time, and determining in real time if the behaviors of the user are within a first category specification associated with a first category that the user is tagged with. If the behaviors of the user exceed the first category specification, the method assigns a second category to the user in real time in response to the detected user behaviors.
    Type: Grant
    Filed: April 23, 2019
    Date of Patent: September 1, 2020
    Assignee: Shutterfly, LLC
    Inventor: Ray Shan
  • Patent number: 10740224
    Abstract: In response to receiving a test suite specification, a processor of a testing platform determines a schedule of execution of a test suite to test a system under test (SUT). The SUT has a hardware resource set including at least one of a set including a processor system and a data storage system, and the test suite includes a plurality of tests, each including a respective set of one or more testcases. The processor initiates execution of the test suite on the SUT in accordance with the schedule. In response to failure of a hardware resource during execution of the test suite, the processor automatically and dynamically reallocates a test in the test suite to at least one different hardware resource in the hardware resource set.
    Type: Grant
    Filed: August 14, 2018
    Date of Patent: August 11, 2020
    Assignee: International Business Machines Corporation
    Inventors: Richard Mawson, Philip Kelleher, Robert Guy Keevil, Timothy Biesecker, Rotimi Ojo
  • Patent number: 10733175
    Abstract: This invention relates to a system, method and computer program product for a data warehouse model validation system, said data warehouse model validation system having an ETL model and a corresponding data warehouse model, said data ETL system comprising: an element group locator for locating an element group across the ETL model and the data warehouse model, whereby the element group comprises ETL elements and related data warehouse elements; an inconsistency determiner for determining inconsistencies between the ETL elements and data warehouse elements, whereby one or more elements are missing from the data warehouse model or one or more elements in the data warehouse model do not correspond to expected elements or features of elements; and an inconsistency recorder for recording any located missing elements or unexpected elements from the located element group.
    Type: Grant
    Filed: April 27, 2017
    Date of Patent: August 4, 2020
    Assignee: International Business Machines Corporation
    Inventors: Gary Denner, Paul Kilroy, Michael J. Loughran
  • Patent number: 10713587
    Abstract: This disclosure provides a method and system to perform data integrity checks in a data warehouse (DWH) feed using machine-learning (ML) processes. According to an exemplary method, a ML integrity check is performed on received data which has been extracted from a plurality of source files, and after ML processes validate the extracted data, the validated data is transformed and loaded to a DWH.
    Type: Grant
    Filed: November 9, 2015
    Date of Patent: July 14, 2020
    Assignee: Xerox Corporation
    Inventor: David Rozier
  • Patent number: 10698873
    Abstract: Performance data generated according to a first schema is read. From the first schema, object descriptors having common primitive types are identified. A second schema is then created. The second schema defines a plurality of rows and at least one column. The rows include a record corresponding to an identified object descriptor. The at least one column corresponds to a primitive type in common with the identified object descriptors.
    Type: Grant
    Filed: March 30, 2017
    Date of Patent: June 30, 2020
    Assignee: MICROSOFT TECHNOLOGY LICENSING, LLC
    Inventors: Gueorgui B. Chkodrov, Jose Wilson Morris, Kevin M. Grady, Jonathan P. Morris, Yuesu Liu, Douglas M. Setser, David St. Pierre
  • Patent number: 10691714
    Abstract: A computer-executed method includes storing in a data store data attributes, data objects, and a data analysis tool (DAT). Each data object has an attribute set and an identifier set. The method includes identifying each data object that has an attribute set comprising a data attribute matching each reference data attribute associated with the DAT, and selecting an identified data object as an input data object for the DAT. The DAT generates a new data object as a function of the input data object, which includes analyzing the input data object with reference to an auxiliary data object and creating an identifier set for the new data object that includes an identifier of the new data object for distinguishing the new data object from each other data object in the data store, and the identifier of the auxiliary data object. The new data object is stored in the data store.
    Type: Grant
    Filed: June 20, 2016
    Date of Patent: June 23, 2020
    Assignee: Monsanto Technology LLC
    Inventors: Ryan Jerry Richt, Christopher Allen Taylor
  • Patent number: 10691654
    Abstract: A method of migrating data from one or more source databases to one or more target databases may include generating a pre-migration analysis for a plurality of objects stored in the one or more source databases, and generating a plurality of migration scripts that transfer the plurality of objects from the one or more source databases to the one or more target databases. The method may also include generating a migration plan that defines an execution order for the plurality of migration scripts, and migrating the plurality of objects from the one or more source databases to the one or more target databases according to the migration plan. The method may further include validating the plurality of objects on the one or more target databases.
    Type: Grant
    Filed: June 8, 2018
    Date of Patent: June 23, 2020
    Assignee: ORACLE INTERNATIONAL CORPORATION
    Inventors: Antony Higginson, John Masterson, Sean Fitzpatrick, Peter Robertshaw, Elmar Spiegelberg, Stephan Buhne, Michael Weick, Nick Balch, Florin Popescu
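    Illustrative note: the abstract above mentions a migration plan that defines an execution order for the migration scripts, but does not say how that order is derived. One plausible reading is a dependency-ordered schedule. The sketch below assumes each script declares the scripts it depends on; the script names and the `plan_migration` helper are hypothetical and are not taken from the patent.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def plan_migration(dependencies):
    """Return one valid execution order for migration scripts.

    `dependencies` maps each script name to the set of scripts that must
    run before it (an assumed input format, not from the patent).
    """
    return list(TopologicalSorter(dependencies).static_order())


# Hypothetical scripts: tables must exist before rows are loaded,
# and rows must be loaded before indexes are built.
deps = {
    "create_tables": set(),
    "load_orders": {"create_tables"},
    "create_indexes": {"load_orders"},
}
plan = plan_migration(deps)
```

    A cyclic dependency would raise `graphlib.CycleError`, which is one simple way a pre-migration check could flag an unorderable set of scripts.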
  • Patent number: 10685042
    Abstract: A corpus of information describing queries used to access a transactional data store may be used to identify analytical relationships that are not explicitly defined in a schema or supplied by a user. Join relationships may be identified based on field coincidence in elements of queries in the corpus. Join relationships may be indicative of dimensions and attributes of a dimension. Hierarchy levels for a dimension may be identified based on factors including data type, reference in an aggregating clause, and reference in a grouping clause.
    Type: Grant
    Filed: December 22, 2014
    Date of Patent: June 16, 2020
    Assignee: Amazon Technologies, Inc.
    Inventors: Anurag Windlass Gupta, Timothy Andrew Rath, Srinivasan Sundar Raghavan, Santosh Kalki
  • Patent number: 10678632
    Abstract: A cloud-based ETL system provides error detection, error correction and reporting of data integration flows hosted by cloud services. Categories of errors are identified using one or more checks at different points of a data integration flow and one or more actions selected based at least in part on the error category. A determination can be made whether the error category is fault tolerant and one or more actions can be selected based at least in part on the error fault tolerance to correct the error, restart a flow, or generate a notification assisting a user to correct the error.
    Type: Grant
    Filed: December 28, 2018
    Date of Patent: June 9, 2020
    Assignee: Oracle International Corporation
    Inventors: Ispati Nautiyal, Rajesh Balu
  • Patent number: 10671641
    Abstract: An automated method and computer program product are provided for synchronizing a column-oriented target database with a row-oriented source database. Change data are replicated from a change log of the row-oriented source database via a staging database to the column-oriented target database. The change data include inserts and deletes. Change data of the change log are read into the staging database and are consolidated and grouped into a consolidated grouping of inserts and a consolidated grouping of deletes. The consolidated grouping of inserts from the staging database is applied to the target database in a batched manner, and the consolidated grouping of deletes from the staging database is applied to the target database in a batched manner.
    Type: Grant
    Filed: April 25, 2016
    Date of Patent: June 2, 2020
    Assignee: Gravic, Inc.
    Inventors: Paul J. Holenstein, John R. Hoffmann, Bruce D. Holenstein, Wilbur H. Highleyman
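    Illustrative note: the consolidation step described in the abstract above can be sketched as follows. This is a minimal sketch under stated assumptions, not the patented method: change-log entries are assumed to be `(op, key, row)` tuples, the target is modeled as a plain dict, and the rule used here (delete a key if its first logged operation was a delete, insert it if its last logged operation was an insert) is one possible consolidation policy. The function names are hypothetical.

```python
def consolidate(change_log):
    """Collapse a change log into one batch of deletes and one batch of inserts.

    Assumed entry format: (op, key, row) with op in {"insert", "delete"}.
    """
    first_op = {}   # first operation seen per key
    last_row = {}   # row to insert if the key's last operation is an insert
    for op, key, row in change_log:
        first_op.setdefault(key, op)
        if op == "insert":
            last_row[key] = row
        elif op == "delete":
            last_row.pop(key, None)
        else:
            raise ValueError(f"unknown operation: {op!r}")
    # A key whose first operation is a delete existed in the target beforehand.
    deletes = [key for key, op in first_op.items() if op == "delete"]
    inserts = list(last_row.items())
    return deletes, inserts


def apply_batches(target, deletes, inserts):
    """Apply the delete batch first, then the insert batch."""
    for key in deletes:
        target.pop(key, None)
    target.update(inserts)


log = [
    ("insert", 1, "a"),
    ("delete", 2, None), ("insert", 2, "b2"),   # an update to an existing row
    ("insert", 3, "c"), ("delete", 3, None),    # cancels out within the batch
]
deletes, inserts = consolidate(log)
target = {2: "b", 4: "d"}
apply_batches(target, deletes, inserts)
```

    Applying deletes before inserts lets an update be expressed as a delete plus an insert of the same key, which suits targets where bulk loads are cheaper than row-at-a-time changes.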
  • Patent number: 10664525
    Abstract: End user data partitioning can include receiving a number of data queries for a data source from a user, developing a dimension relation graph based on attributes of the number of data queries, and partitioning the data source based on the dimension relation graph.
    Type: Grant
    Filed: May 1, 2017
    Date of Patent: May 26, 2020
    Assignee: MICRO FOCUS LLC
    Inventors: Inbar Yogev, Ira Cohen, Olga Kogan-Katz, Lior Ben Ze'ev
  • Patent number: 10664455
    Abstract: A system derives a first schema that is specific to a first log entry type associated with a log code, a second schema that is specific to a second log entry type associated with the log code, and a common schema for the first log entry type and the second log entry type. The system stores the first schema and the common schema in a container for the first log entry type, and the second schema and the common schema in a container for the second log entry type. The system identifies a schema identifier in a log entry corresponding to a system user event. The schema identifier corresponds to a schema in the container for the first log entry type or the container for the second log entry type. The system identifies log data by applying the corresponding schema to the log entry, and outputs the log data.
    Type: Grant
    Filed: April 7, 2017
    Date of Patent: May 26, 2020
    Assignee: salesforce.com, inc.
    Inventors: Choapet Oravivattanakul, Alex Warshavsky, Samarpan Jain
  • Patent number: 10650057
    Abstract: According to certain aspects, a method can include creating a backup copy of data associated with a virtual machine (VM) on one or more secondary storage devices, wherein the backup copy includes corresponding secondary copies of a plurality of files associated with the VM; analyzing metadata associated with the secondary copies to determine which of the plurality of files are eligible to be removed from the primary storage device; in response to determining that one or more files are eligible to be removed from the primary storage device, for each respective file of the one or more files: determining whether the respective file has been changed since a first time at which the backup copy of the data associated with the VM was created; in response to determining that the respective file has not changed since the first time, removing the respective file; and adding a file placeholder for the removed file.
    Type: Grant
    Filed: August 23, 2017
    Date of Patent: May 12, 2020
    Assignee: COMMVAULT SYSTEMS, INC.
    Inventors: Rahul S. Pawar, Henry Wallace Dornemann, Rajiv Kottomtharayil, Chitra Ramaswamy, Ashwin Gautamchand Sancheti
  • Patent number: 10642937
    Abstract: One or more techniques and/or systems are provided for interactively associating a semantic concept with a unique term that is input by a user. As the user is creating a document and/or once the user has completed a draft of the document, the document is parsed to identify unique terms (e.g., persons, places, things, services, etc.) in the document. When a unique term is identified, a query is generated to locate one or more semantic concepts (e.g., URLs, URNs, or other identifiers, for example) that are associated with the identified unique term and a notification indicative of the results is generated. From this notification, the user can select whether to associate the unique term with any and/or all of the located semantic concepts. In this way, supplemental content may be added to a document that the user is creating, for example.
    Type: Grant
    Filed: February 13, 2017
    Date of Patent: May 5, 2020
    Assignee: Microsoft Technology Licensing, LLC
    Inventor: Matthew Francis Hurst
  • Patent number: 10642863
    Abstract: Disclosed is a network of systems that includes plural disparate storage systems that store user data, the disparate storage systems including NoSQL server databases that provide storage and retrieval of data modeled in forms besides tabular relations used in relational databases, an index storage system, a relational graph storage system and one or more data storage query platforms in communication with the plural disparate storage systems that have queries produced in a modeling language that abstracts application programmer functionality from network functionality.
    Type: Grant
    Filed: May 27, 2015
    Date of Patent: May 5, 2020
    Assignee: Kaseya International Limited
    Inventors: Mark Fischer, Prakash Khot, Daniel Philip Arcari
  • Patent number: 10635656
    Abstract: Extract, transform, and load application (ETL) complexity management framework systems and methods are described herein. The present disclosure describes systems and methods that reduce the complexity in managing ETL flow and correcting errant data that is subsequently identified. One or more methods include defining an ETL job definition, defining a data asset definition, defining a data asset dependency definition, receiving an ETL flow to provide execution of one or more ETL flow steps, providing retrieval of data from a source data asset, applying a data control to the source asset data, and producing an ETL job registration, a data asset status, a latest asset available date, a data asset consumer identifier, and a target data asset based on at least one of the ETL job definition, the data asset definition, the data dependency definition, and the source asset data.
    Type: Grant
    Filed: March 19, 2018
    Date of Patent: April 28, 2020
    Assignee: United Services Automobile Association (USAA)
    Inventors: Larry W. Clark, Jason Paul Hendry, Mark Steen
  • Patent number: 10635689
    Abstract: Example implementations are directed to a system and method to reduce the deployment cost of a data analytics application by designing both an application deployment plan and a data integration plan, implementing the plans into an application template automatically and deploying application components and data in accordance with the desired implementation. Through example implementations, the need for separate terminals for a data engineer and an application engineer can be eliminated.
    Type: Grant
    Filed: November 1, 2017
    Date of Patent: April 28, 2020
    Assignee: HITACHI, LTD.
    Inventor: Hiroshi Nakagoe
  • Patent number: 10628217
    Abstract: Methods, systems, and computer-readable media for a transformation specification format for multiple execution engines are disclosed. A transformation specification is expressed according to a transformation specification format. The transformation specification represents a polytree or graph linking one or more data producer nodes, one or more data transformation nodes, and one or more data consumer nodes. An execution engine is selected from among a plurality of available execution engines for execution of the transformation specification. The execution engine is used to acquire data from one or more data producers corresponding to the one or more data producer nodes, perform one or more transformations of the data corresponding to the one or more data transformation nodes, and output one or more results of the one or more transformations to one or more data consumers corresponding to the one or more data consumer nodes.
    Type: Grant
    Filed: September 27, 2017
    Date of Patent: April 21, 2020
    Assignee: Amazon Technologies, Inc.
    Inventors: Fletcher Liverance, Chance Ackley, Dominic Corona
  • Patent number: 10628833
    Abstract: A computer system architecture and method for providing compliance with data regulations, by: (a) collecting a data input stream with a data collection terminal; (b) using a compliance device driver resident in the data collection terminal to: (1) select data corresponding to pre-identified data compliance fields, and (2) apply a compliance markup language parser to generate pseudonymized data; (c) using an automated compliance network appliance and an automated compliance server to: (1) transmit the pseudonymized data into an immutable audit ledger, wherein the immutable audit ledger is assembled and verified by blockchain, and (2) transmit the data input stream into a data lake; and (d) hosting access portals for accessing data: (1) stored in the data lake, and (2) stored in the immutable audit ledger.
    Type: Grant
    Filed: April 2, 2019
    Date of Patent: April 21, 2020
    Assignee: TD PROFESSIONAL SERVICES, LLC
    Inventor: Scott Hines
  • Patent number: 10614093
    Abstract: A system and method for creating an instance model is provided. The system provides an information extraction and modeling framework for a wide spectrum of document types such as PDF, Text, HTML, LOG, CSV, images, audio/video files and DOCX. In this framework, information is extracted and mapped onto a domain conceptual model such as an ER model, and the instance model is created. Initially, a template model is created using the existing ER model and the plurality of data sources. The template model, the existing ER model and the information extracted from the plurality of data sources are then provided as input to generate the instance model. The system or method is not limited to extracting information from log files. It can be useful for different file types where the structures and formats of the data differ. The system can also be used with unstructured data sources.
    Type: Grant
    Filed: December 6, 2017
    Date of Patent: April 7, 2020
    Assignee: Tata Consultancy Services Limited
    Inventors: Sandeep Chougule, Amrish Shashikant Pathak, Sharmishtha Prakash Kulkarni, Nikita Aggarwal, Manish Kailash Khandelwal, Rahul Ramesh Kelkar, Harrick Mayank Vin
  • Patent number: 10606821
    Abstract: Systems and methods for applicant tracking system (ATS) integration with a deduplicator are disclosed. A recruiting company computer system accesses a first entity record external to an ATS. The recruiting company computer system determines that the first entity record corresponds to a second entity record within the ATS based on at least first information of the first entity record and second information of the second entity record. The first information is different from the second information. The recruiting company computer system imports, into the second entity record within the ATS, information from the first entity record external to the ATS in response to the first entity record corresponding to the second entity record. The recruiting company computer system provides, in response to a request to access information about an entity associated with the second entity record, the information from the first entity record.
    Type: Grant
    Filed: September 23, 2016
    Date of Patent: March 31, 2020
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: John Robert Jersin, Benjamin John McCann, Erik Eugene Buchanan
  • Patent number: 10599695
    Abstract: A system and method for forming a search query. Key-word search terms that include a homonym are received. One icon is selected to represent an intended meaning of the homonym. A first row of unique icons pertaining to an entity associated with a search query is displayed. Notification is received that a single unique object represented by a single icon of the unique icons in the first row is modified by a specific attribute and in response, a second row of the single icon modified by the specific attribute is displayed. Acceptance of the displayed single icon modified by the specific attribute is received for inclusion in the search query. The one icon and the single icon are displayed. In response to a user indicating that the displayed icons correctly represent a key-word search as intended by the user, the search based on meanings of the displayed icons is initiated.
    Type: Grant
    Filed: February 8, 2017
    Date of Patent: March 24, 2020
    Assignee: International Business Machines Corporation
    Inventor: Mickey Iqbal
  • Patent number: 10599635
    Abstract: Aspects described herein generally improve the quality, efficiency, and speed of data processing systems by generating staging data independently from the execution of control scripts which process the staging data. The staging data can be independently loaded, validated, and utilized across multiple control scripts, reducing redundancy in the loading of data and the overhead of executing separate data processing for each control script. The control scripts can be automatically validated, such as by verifying expected output data ranges. Additionally, the complexity of the control scripts can be reduced as the loading of data is not performed by the control scripts. The control scripts can generate a variety of output data, such as an indication of impacted accounts, and provide notifications based on the output data. A variety of machine learning classifiers can be used to automatically generate the staging data and validate the staging data and/or output data.
    Type: Grant
    Filed: July 26, 2019
    Date of Patent: March 24, 2020
    Assignee: Capital One Services, LLC
    Inventors: Daniel Gunn, Zhihao Gau, Pulkit Gulati, William Cartar
  • Patent number: 10599696
    Abstract: A method and system for forming a search query. Key-word search terms that include a homonym are received. One icon is selected to represent an intended meaning of the homonym. A first row of unique icons pertaining to an entity associated with a search query is displayed. Notification is received that a single unique object represented by a single icon of the unique icons in the first row is modified by a specific attribute and in response, a second row of the single icon modified by the specific attribute is displayed. Acceptance of the displayed single icon modified by the specific attribute is received for inclusion in the search query. The one icon and the single icon are displayed. In response to a user indicating that the displayed icons correctly represent a key-word search as intended by the user, the search based on meanings of the displayed icons is initiated.
    Type: Grant
    Filed: February 8, 2017
    Date of Patent: March 24, 2020
    Assignee: International Business Machines Corporation
    Inventor: Mickey Iqbal
  • Patent number: 10585875
    Abstract: This invention relates to a system, method and computer program product for a data warehouse model validation system, said data warehouse model validation system having an ETL model and a corresponding data warehouse model, said data ETL system comprising: an element group locator for locating an element group across the ETL model and the data warehouse model, whereby the element group comprises ETL elements and related data warehouse elements; an inconsistency determiner for determining inconsistencies between the ETL elements and data warehouse elements, whereby one or more elements are missing from the data warehouse model or one or more elements in the data warehouse model do not correspond to expected elements or features of elements; and an inconsistency recorder for recording any located missing elements or unexpected elements from the located element group.
    Type: Grant
    Filed: April 6, 2016
    Date of Patent: March 10, 2020
    Assignee: International Business Machines Corporation
    Inventors: Gary Denner, Paul Kilroy, Michael J. Loughran