Data Extraction, Transformation, and Loading (ETL) Patents (Class 707/602)
  • Patent number: 8732117
    Abstract: In an information processing apparatus, a first deletion unit deletes, from sets stored in a storage unit, sets having less than a threshold number of included elements and elements existing in only sets whose number is less than a threshold number of occurrences. A grouping unit generates a group of sets associated with each other in terms of commonality of elements out of the sets remaining in the storage unit. A second deletion unit deletes, with respect to each generated group, sets having less than the threshold number of included elements and elements existing in only sets whose number is less than the threshold number of occurrences, from the sets belonging to the group. An output unit outputs a list of elements included in the sets remaining in each group where there are no sets or elements that need to be deleted.
    Type: Grant
    Filed: August 17, 2012
    Date of Patent: May 20, 2014
    Assignee: Fujitsu Limited
    Inventors: Koji Maruhashi, Nobuhiro Yugami
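The deletion step in the abstract of patent 8732117 can be read as a prune-to-fixed-point loop: removing a rare element may shrink a set below the size threshold, and removing a set may make another element rare. The sketch below is one illustrative reading of that step only, not the patented implementation; the function name `prune` and the parameters `min_size` and `min_occurrences` are hypothetical, and the grouping and per-group second deletion are not shown.

```python
from collections import Counter


def prune(sets, min_size, min_occurrences):
    """Drop rare elements and undersized sets until nothing changes.

    `sets` maps a set name to an iterable of elements. All names here
    are hypothetical; the abstract does not specify an interface.
    """
    current = {name: set(elems) for name, elems in sets.items()}
    while True:
        # Count in how many remaining sets each element occurs.
        counts = Counter(e for elems in current.values() for e in elems)
        rare = {e for e, c in counts.items() if c < min_occurrences}
        pruned = {}
        for name, elems in current.items():
            kept = elems - rare
            # Keep only sets that still have enough elements.
            if len(kept) >= min_size:
                pruned[name] = kept
        if pruned == current:
            # Fixed point: no set or element needs to be deleted.
            return pruned
        current = pruned


if __name__ == "__main__":
    data = {"a": {1, 2, 3}, "b": {1, 2, 4}, "c": {1, 2}, "d": {5}}
    print(prune(data, min_size=2, min_occurrences=2))
```

In this example, elements 3, 4, and 5 each occur in only one set and are removed, which empties set "d" and causes it to be dropped as well.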
  • Patent number: 8732118
    Abstract: Techniques are described for managing aggregation of data in a distributed manner, such as for a particular client based on specified configuration information. The described techniques may include receiving information about multi-stage data manipulation operations that are to be performed as part of the data aggregation, with each stage able to be performed in a distributed manner using multiple computing nodes—for example, a map-reduce architecture may be used, with a first stage involving the use of one or more specified map functions to be performed, and with at least a second stage involving the use of one or more specified reduce functions to be performed. In some situations, a particular set of input data may be used to generate the data for a multi-dimensional OLAP (“online analytical processing”) cube, such as for input data corresponding to a large quantity of transactions of one or more types.
    Type: Grant
    Filed: January 13, 2012
    Date of Patent: May 20, 2014
    Assignee: Amazon Technologies, Inc.
    Inventors: Richard J. Cole, Alan D. Mock
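The two-stage map-reduce aggregation mentioned in the abstract of patent 8732118 can be illustrated with a single-process simulation. This is a minimal sketch under stated assumptions: the record fields (`region`, `product`, `amount`) and all function names are hypothetical, and a real deployment would distribute both stages across multiple computing nodes rather than run them in one loop.

```python
from collections import defaultdict


def map_transaction(tx):
    """Map stage: emit one (cube cell, measure) pair per transaction."""
    yield (tx["region"], tx["product"]), tx["amount"]


def reduce_cell(key, values):
    """Reduce stage: sum all measures that landed in the same cell."""
    return key, sum(values)


def run_map_reduce(records, map_fn, reduce_fn):
    """Group mapped pairs by key, then reduce each group (in-process)."""
    groups = defaultdict(list)
    for record in records:
        for key, value in map_fn(record):
            groups[key].append(value)
    return dict(reduce_fn(key, values) for key, values in groups.items())


if __name__ == "__main__":
    transactions = [
        {"region": "EU", "product": "A", "amount": 10},
        {"region": "EU", "product": "A", "amount": 5},
        {"region": "US", "product": "B", "amount": 7},
    ]
    cube = run_map_reduce(transactions, map_transaction, reduce_cell)
    print(cube)
```

Each key of the result is one cell of a two-dimensional cube; adding a dimension only requires widening the tuple emitted by the map function.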
  • Patent number: 8727780
    Abstract: An extensive computer based online math research system (the “Research System”) having as its foundation an Ontology of mathematics, and utilizing unique and intensive computer support, coordination, data structuring, data storage, computer processing, retrieval capabilities, and data-mining capabilities, and an Ontology editing system that runs on computer software with computer processors and data storage capabilities (the “Ontology Editor System”). The Research System also includes a methodology to enable online reference and data manipulation of the Ontology, and an Internet based search of the concepts of mathematics and applications of mathematics to the sciences on the basis of the Ontology.
    Type: Grant
    Filed: September 21, 2012
    Date of Patent: May 20, 2014
    Assignee: Valuecorp Pacific, Inc.
    Inventors: Mark S. Crouse, Caroline McHolme Beam
  • Publication number: 20140136471
    Abstract: An approach is provided in which a system creates schema terms based upon matching input data query requirements to industry terms. In turn, the system generates a query and an associative map, which includes data organized according to the schema terms. The system executes the query, which retrieves the data from the associative map and loads the data into one or more storage areas.
    Type: Application
    Filed: November 13, 2012
    Publication date: May 15, 2014
    Applicant: International Business Machines Corporation
    Inventor: Manoj Kumar
  • Publication number: 20140136472
    Abstract: The invention provides idealized and reusable data source interfaces. The process of idealizing includes reengineering of the original data model using a surrogate key based model. The technique emphasizes readability and performance of the resulting operational data store. The invention provides a unique method for handling changes that allows all types of changes to be automatically implemented in the operational data store by table conversion. Further the invention provides Inline materialization that supports a continuous data flow dependency chain. The continuous dependency chain is used to provide automated documentation as well as dynamic paralleled transformation process. Finally master data integration is provided as a benefit of architecture and the inbuilt surrogate key based data model. The feature implements integrations by specification rather than by programming.
    Type: Application
    Filed: May 23, 2013
    Publication date: May 15, 2014
    Applicant: Bi-Builders AS
    Inventor: Erik Frafjord
  • Patent number: 8725675
    Abstract: In a file server for suppressing power consumption of a storage apparatus, when a file sharing program receives a file access from a client, the program references a mapping table. The program addresses the access to the target file in the volume of a RAID group where the target file is stored. A coupling-request reception program memorizes a coupling time for each user into a coupling history table. A grouping program applies a grouping to users whose coupling time-zones are similar. A data transfer program transfers, into the same RAID group, data of the files associated with the grouped users, thereby collecting the data into the same RAID group. Thus, the time-zone when no access is made to the RAID group (i.e., non-coupling time-zone) can be made longer. Accordingly, a spin-up/down request program makes a spin-down request to the RAID group in the non-coupling time-zone.
    Type: Grant
    Filed: October 17, 2011
    Date of Patent: May 13, 2014
    Assignee: Hitachi, Ltd.
    Inventors: Shinichi Moriwake, Nobuyuki Saika, Hitoshi Kamei, Takahiro Nakano
  • Patent number: 8719769
    Abstract: A method for quality objective-based ETL pipeline optimization is provided. An improvement objective is obtained from user input into a computing system. The improvement objective represents a priority optimization desired by a user for improved ETL flows for an application designed to run in memory of the computing system. An ETL flow is created in the memory of the computing system. The ETL flow is restructured for flow optimization with a processor of the computing system. The flow restructuring is based on the improvement objective. Flow restructuring can include application of flow rewriting optimization or application of an algebraic rewriting optimization. The optimized ETL flow is stored as executable code on a computer readable storage medium.
    Type: Grant
    Filed: August 18, 2009
    Date of Patent: May 6, 2014
    Assignee: Hewlett-Packard Development Company, L.P.
    Inventors: Maria G. Castellanos, Umeshwar Dayal, Alkiviadis Simitsis, William K. Wilkinson
  • Patent number: 8715086
    Abstract: The present invention relates generally to a system and method for reviewing and evaluating performance. In particular, the present invention relates to a system and method for reviewing and evaluating performances of an official or group of officials at an event or events. Even more specifically, according to embodiments of the present invention, the system and method can involve reviewing and evaluating a referee's performance during a football game or games.
    Type: Grant
    Filed: December 16, 2011
    Date of Patent: May 6, 2014
    Assignee: Rusty Acree LLC
    Inventor: Russell Acree
  • Publication number: 20140122412
    Abstract: The present disclosure in general relates to technologies for processing data in a distributed data storage system, and more particularly, to a method, a system, and a computer program product for analytical processing of data by using the processing power of the distributed data storage system. In one embodiment, a system for analytical processing of data in a distributed data storage system is disclosed. The system comprises: a data extraction module configured to perform analytical operations to extract data from source databases in one or more data formats; and a processing module configured to perform data refinement operations to categorize the data while the data is being extracted. The processing module comprises: a mapping module configured to perform mapping operations of the categorized data; and a transformation module configured to perform an analytical transforming operation of the mapped categorized data to obtain a transformed categorized data.
    Type: Application
    Filed: October 28, 2013
    Publication date: May 1, 2014
    Applicant: Tata Consultancy Services Limited
    Inventors: Bhushan Vidyadhar Bandekar, Sandeep Singh
  • Publication number: 20140122413
    Abstract: A system and method for reading and writing of data values between multidimensional structures that reside in Online Analytical Processing (OLAP) databases are disclosed. Data queries may be performed and updates executed between multidimensional data structures, whether existing on the same server or separate servers. Bulk (being two or more intersections) transfers are allowed between multidimensional structures (or cubes), providing a performance gain that cannot be matched using a standard point-by-point implementation. An intersection only contains a numerical or data value if there is a value for each dimension at that intersection within the database. Multidimensional data structures naturally generate sparse intersections where no data values are found, which can greatly impact performance. Within each cube, only a small intersection of members actually contains values. The system may very quickly calculate reports which include any intersection in any very large cube.
    Type: Application
    Filed: October 29, 2013
    Publication date: May 1, 2014
    Applicant: Paris Technologies, Inc.
    Inventors: Duane Edward Presti, Roger Allen Willson
  • Patent number: 8712955
    Abstract: A method for creating a data warehousing scheme having optimally selected components. A mathematical model of a goal for the data warehousing scheme is input into an optimization engine. At least one constraint on the data warehousing scheme is input into the optimization engine. A mathematical optimization algorithm is performed using the optimization engine, wherein an output of the optimization engine is an optimized data warehousing scheme having optimally selected components. The optimized data warehousing scheme can be stored.
    Type: Grant
    Filed: July 2, 2010
    Date of Patent: April 29, 2014
    Assignee: International Business Machines Corporation
    Inventors: Robert R. Friedlander, James R. Kraemer
  • Patent number: 8713039
    Abstract: A high level programming language provides a co-map communication operator that maps an input indexable type to an output indexable type according to a function. The function maps an index space corresponding to the output indexable type to an index space corresponding to the input indexable type. By doing so, the co-map communication operator lifts a function on an index space to a function on an indexable type to allow composability with other communication operators.
    Type: Grant
    Filed: December 23, 2010
    Date of Patent: April 29, 2014
    Assignee: Microsoft Corporation
    Inventors: Paul F. Ringseth, Yosseff Levanoni, Lingli Zhang, Weirong Zhu, Donald J. McCrady
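The co-map idea in the abstract of patent 8713039 amounts to: element i of the output is element f(i) of the input, where f maps the output index space to the input index space. The sketch below expresses that in Python as a lazy one-dimensional view; the class name `CoMap` and its constructor arguments are hypothetical and are not taken from the patent, which describes an operator in a high level programming language.

```python
class CoMap:
    """Lazy indexable view: view[i] == source[index_fn(i)].

    `index_fn` maps an output index to an input index. The name and
    the one-dimensional index space are assumptions for illustration.
    """

    def __init__(self, source, index_fn, length):
        self.source = source
        self.index_fn = index_fn
        self.length = length

    def __len__(self):
        return self.length

    def __getitem__(self, i):
        if not 0 <= i < self.length:
            raise IndexError(i)
        return self.source[self.index_fn(i)]


if __name__ == "__main__":
    data = [10, 20, 30, 40]
    reversed_view = CoMap(data, lambda i: 3 - i, 4)
    # A CoMap is itself indexable, so views compose.
    every_other = CoMap(reversed_view, lambda i: 2 * i, 2)
    print(list(every_other))  # prints [40, 20]
```

Because the view is itself indexable, it can serve as the input to another view, which is the composability property the abstract highlights.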
  • Publication number: 20140114908
    Abstract: Various methods and apparatuses are described for performing high speed format translations of incoming data, where the incoming data is arranged in a delimited data format. As an example, the data in the delimited data format can be translated to a fixed field format using pipelined operations. A reconfigurable logic device can be used in exemplary embodiments as a platform for the format translation.
    Type: Application
    Filed: October 22, 2013
    Publication date: April 24, 2014
    Inventors: Michael John Henrichs, Joseph M. Lancaster, Roger Dean Chamberlain, Jason R. White, Kevin Brian Sprague, Terry Tidwell
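The delimited-to-fixed-field translation named in the abstract of publication 20140114908 can be shown in software, although the publication itself targets pipelined operations on a reconfigurable logic device. The following is only a functional sketch of the format change; the function name, the `widths` parameter, and the choice to reject oversized values are assumptions.

```python
import csv
import io


def delimited_to_fixed(text, widths, delimiter=","):
    """Convert delimited records to fixed-width lines.

    `widths` gives the column width of each field (hypothetical
    parameter). Values are left-justified and padded with spaces.
    """
    reader = csv.reader(io.StringIO(text), delimiter=delimiter)
    lines = []
    for row in reader:
        if not row:
            continue  # skip blank lines
        if len(row) != len(widths):
            raise ValueError(f"expected {len(widths)} fields, got {len(row)}")
        fields = []
        for value, width in zip(row, widths):
            if len(value) > width:
                raise ValueError(f"value {value!r} exceeds width {width}")
            fields.append(value.ljust(width))
        lines.append("".join(fields))
    return lines


if __name__ == "__main__":
    source = 'ab,1\n"c,d",22\n'
    for line in delimited_to_fixed(source, widths=[4, 3]):
        print(repr(line))
```

Once records are in fixed-field form, every field sits at a known offset, so downstream stages no longer need to scan for delimiters or handle quoting.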
  • Publication number: 20140114907
    Abstract: A data lineage system is provided that traces a data lineage of a data warehouse. The data lineage system maps a target data element to one or more source data elements. The data lineage system further stores one or more source surrogate keys within one or more auxiliary columns of a target data record. The data lineage system further stores, for each source data element, a data lineage mapping system record within a data lineage mapping system table that represents the mapping of the target data element and the corresponding source data element. The data lineage system further maps a source data element to one or more target data elements. The system further stores, for each target data element, a shadow system record within a shadow system table that represents the mapping of the source data element and the corresponding target data element.
    Type: Application
    Filed: March 14, 2013
    Publication date: April 24, 2014
    Inventors: Ludmila Kozina, John K. Rees, Abhishek Narayan
  • Publication number: 20140114906
    Abstract: The disclosure generally describes computer-implemented methods, software, and systems for providing a generic semantic layer for in-memory database reporting. One computer-implemented method for combining online transactional processing and online analytical processing in an in-memory database, comprises: retrieving two or more tables from an online transaction processing system; identifying related tables among the two or more tables; determining relationships between the related tables; determining a measure based on the relationships; and outputting the measure.
    Type: Application
    Filed: October 23, 2012
    Publication date: April 24, 2014
    Inventors: Sumanth Hegde, Santosh V
  • Publication number: 20140114909
    Abstract: Interest-driven business intelligence server systems that provide performance metadata are described. In the disclosed embodiments, an interest-driven business intelligence server system receives a report specification. The report specification includes at least one reporting data requirement. The interest-driven business intelligence server determines performance metadata information for an interest-driven data pipeline that is utilized to generate reporting data based on the report specification. The performance metadata information for the interest-driven data pipeline is transmitted to an interest-driven user visualization system by the interest-driven business intelligence server system.
    Type: Application
    Filed: October 22, 2013
    Publication date: April 24, 2014
    Applicant: Platfora, Inc.
    Inventors: John Schuster, Usman Ghani, Brian Babcock, Jenn Rhim, John Eshelman, Peter Schlampp
  • Publication number: 20140108331
    Abstract: In one embodiment, the present invention includes an OLAP execution model using relational operations. In one embodiment, the present invention includes a method comprising receiving a first query in an online analytic processor (OLAP) executing on one or more computers, the OLAP generating and comprising a model specifying a graph defining a plurality of nodes and a plurality of tiers, each node corresponding to a different operation on data. A second query is generated by the OLAP. The second query includes a plurality of layered subqueries each corresponding to one of the nodes in the graph for specifying the different operations on data. The second query is received in a relational engine coupled to the datastore. The relational engine executes the second query, and in accordance therewith, retrieves data.
    Type: Application
    Filed: December 17, 2013
    Publication date: April 17, 2014
    Applicant: SAP AG
    Inventors: Stefan Dipper, Erich Marschall, Tobias Mindnich, Daniel Baeumges, Christoph Weyerhaeuser
  • Patent number: 8700559
    Abstract: Some embodiments include utilization of a first interface to query a data source using a first configuration and to return a first data set including a plurality of records, each of the plurality of records comprising data of a plurality of fields of the first data set, and reception of the first data set from the data source. Also included may be utilization of a second interface to query the data source using a second configuration and to return a second data set for each of the plurality of records, at least one field of the second configuration being associated with one of the plurality of fields of the first data set, each utilization using the received data of the one of the plurality of fields of a respective record that is associated with the at least one field of the second configuration, and reception of a respective second data set from the data source for each utilization of the second interface.
    Type: Grant
    Filed: September 6, 2005
    Date of Patent: April 15, 2014
    Assignee: Siemens Aktiengesellschaft
    Inventors: Mario Brenes, James Eric O'Hearn
  • Patent number: 8700678
    Abstract: Techniques are disclosed for generating data provenance associated with a computing system. For example, a method comprises the following steps. Information associated with the execution of a given process in a given computing environment in accordance with a given process data set is captured. A provenance data set is generated based on the captured information. The generated provenance data set comprises one or more states associated with one or more execution components of the given computing environment that existed during the execution of the given process. At least a portion of the generated provenance data set may be utilized to revert the computing environment back to the one or more states associated with the one or more execution components of the given computing environment that existed during the execution of the given process.
    Type: Grant
    Filed: December 21, 2011
    Date of Patent: April 15, 2014
    Assignee: EMC Corporation
    Inventors: Chenhui Fan, Lun Zhou, Stephen Todd, Qiyan Chen, Tianqing Wang
  • Patent number: 8700560
    Abstract: Some aspects include association of fields of a data source with one or more entity identities, one or more relation identities, and one or more attributes corresponding, respectively, to entity identities, relation identities and facet attributes defined in metadata of an enterprise social network, and reception of data from the data source. Also included is a determination, based on the data and the associated fields of the data source, of one or more source entities, one or more source entity identities associated with each of the one or more source entities, one or more source relations, one or more source relation identities associated with each of the one or more source relations, and one or more source facets associated with one or more source entities or source relations. For each determined source entity, it is determined if any of the one or more associated source entity identities is identical to an entity identity of the enterprise social network.
    Type: Grant
    Filed: October 17, 2008
    Date of Patent: April 15, 2014
    Assignee: Business Objects S.A.
    Inventors: Ricardo Polo-Malouvier, Bruno Dumant
  • Publication number: 20140101093
    Abstract: Source data of an event stream is parsed and supplemented with additional data from reference data sources, producing an enriched event stream from the parsed event stream data. The data records of the enriched event stream are partitioned into data fields designated as a dimension partition and a metric partition, which are partitioned into sub-dimension projections mapped to a plurality of storage keys, such that each of the storage keys includes one or more placeholder wildcard values and each of the storage keys is stored into a database of the computer system by the computer processor.
    Type: Application
    Filed: June 26, 2013
    Publication date: April 10, 2014
    Inventors: Damon Lanphear, Prabuddha Biswas
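One way to picture the storage keys with placeholder wildcard values from the abstract of publication 20140101093: for each enriched record, every projection of its dimension fields onto a proper subset yields a key in which the omitted dimensions are replaced by a wildcard. The sketch below is an assumption-laden illustration; the wildcard symbol `*`, the function names, and the decision to sum metrics under each key are not stated in the abstract.

```python
from itertools import combinations

WILDCARD = "*"  # assumed placeholder symbol


def projection_keys(dimensions):
    """Return one key per proper subset of the dimension fields.

    Every key therefore contains at least one wildcard value.
    """
    names = sorted(dimensions)
    keys = []
    for size in range(len(names)):  # sizes 0 .. n-1: proper subsets only
        for kept in combinations(names, size):
            keys.append(tuple(
                dimensions[name] if name in kept else WILDCARD
                for name in names
            ))
    return keys


def store(db, dimensions, metrics):
    """Accumulate the metric partition under every projection key."""
    for key in projection_keys(dimensions):
        totals = db.setdefault(key, {})
        for name, value in metrics.items():
            totals[name] = totals.get(name, 0) + value


if __name__ == "__main__":
    db = {}
    store(db, {"country": "US", "device": "mobile"}, {"clicks": 2})
    store(db, {"country": "US", "device": "desktop"}, {"clicks": 3})
    print(db[("US", WILDCARD)])  # prints {'clicks': 5}
```

With keys laid out this way, a query such as "all devices in the US" becomes a single key lookup rather than a scan over individual events.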
  • Publication number: 20140101091
    Abstract: Methods and apparatus are presented for extracting, transforming, and loading data from one database to another database. For example, an extraction, transformation, and loading (ETL) component may access an operational log of a given database in order to detect an update to the database. Upon detecting the update, the ETL component may extract a subset of data from the operational log, where the extraction of the subset of data is based on one or more rules. Once the subset of data has been extracted, the ETL component may transform the extracted subset of data from the operational log into a format for another, target database, where the data format for the other, target database is different from a data format for the given, source database. The ETL component may then load the subset of data transformed into the data format for the other, target database into the target database.
    Type: Application
    Filed: October 4, 2012
    Publication date: April 10, 2014
    Applicant: Adobe Systems Incorporated
    Inventors: Nicholas J. Brown, David L. Cardon, Jason A. Carter
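The log-driven flow in the abstract of publication 20140101091 (detect an update in the operational log, extract a rule-selected subset, transform it to the target format, load it) can be sketched as a single loop. The log entry layout (`seq`, `collection`, `doc`), the function name `run_etl`, and the use of a plain list as the target database are all hypothetical choices made for illustration.

```python
def run_etl(oplog, rules, transform, target, last_seen=0):
    """Process log entries newer than `last_seen`; return the new position.

    `rules` is a list of predicates selecting which entries to extract,
    and `transform` converts an entry into the target format. All of
    these names and shapes are assumptions.
    """
    for entry in oplog:
        if entry["seq"] <= last_seen:
            continue  # already processed: not a new update
        last_seen = entry["seq"]
        if not all(rule(entry) for rule in rules):
            continue  # extraction rules exclude this entry
        target.append(transform(entry))  # transform, then load
    return last_seen


if __name__ == "__main__":
    oplog = [
        {"seq": 1, "collection": "users", "doc": {"_id": 7, "name": "Ada"}},
        {"seq": 2, "collection": "logs", "doc": {"_id": 1, "msg": "ping"}},
        {"seq": 3, "collection": "users", "doc": {"_id": 7, "name": "Ada L."}},
    ]
    rules = [lambda entry: entry["collection"] == "users"]
    to_row = lambda entry: (entry["doc"]["_id"], entry["doc"]["name"])
    table = []
    position = run_etl(oplog, rules, to_row, table)
    print(position, table)
```

Returning the last processed sequence number lets a later run resume from that point, so the same log can be polled repeatedly without loading duplicates.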
  • Publication number: 20140101092
    Abstract: Disclosed herein are techniques for adjusting a map reduce execution environment. It is determined whether some operations in a sequence of operations should be implemented in a map reduce execution environment. If it is determined that some operations in a sequence of operations should be implemented in a map reduce execution environment, the map reduce execution environment is adjusted to achieve a predefined performance objective.
    Type: Application
    Filed: October 8, 2012
    Publication date: April 10, 2014
    Applicant: Hewlett-Packard Development Company, L.P.
    Inventors: Alkiviadis Simitsis, Kevin K. Wilkinson
  • Patent number: 8694461
    Abstract: The present disclosure generally relates to accessing data, and more particularly, to systems and methods for improving the efficiency and quality of real-time extracting, transforming, and/or loading data using customer information control system (CICS) interval control element (ICE) chain processing.
    Type: Grant
    Filed: February 21, 2012
    Date of Patent: April 8, 2014
    Assignee: American Express Travel Related Services Company, Inc.
    Inventor: Krishna K. Lingamneni
  • Patent number: 8694972
    Abstract: A mechanism for providing automatic interoperation between native objects created in a single language computing environment and objects created in external virtual machines and foreign class systems is discussed. Embodiments of the present invention provide a class definition syntax for objects created in the single language computing environment that provides the ability to directly subclass external classes and implement external interfaces. One embodiment of the present invention also permits a foreign object system to instantiate native objects and to create foreign subclasses of native classes. More specifically, one embodiment of the present invention provides bidirectional mapping between metadata associated with objects created with each of a plurality of different types of foreign object systems and metadata created in a form supported by the single language computing environment.
    Type: Grant
    Filed: November 10, 2006
    Date of Patent: April 8, 2014
    Assignee: The MathWorks, Inc.
    Inventor: David A. Foti
  • Patent number: 8688625
    Abstract: Extract, transform, and load application (ETL) complexity management framework systems and methods are described herein. The present disclosure describes systems and methods that reduce the complexity in managing ETL flow and correcting errant data that is subsequently identified. One or more methods include defining an ETL job definition, defining a data asset definition, defining a data asset dependency definition, receiving an ETL flow to provide execution of one or more ETL flow steps, providing retrieval of data from a source data asset, applying a data control to the source asset data, and producing an ETL job registration, a data asset status, a latest asset available date, a data asset consumer identifier, and a target data asset based on at least one of the ETL job definition, the data asset definition, the data dependency definition, and the source asset data.
    Type: Grant
    Filed: December 30, 2011
    Date of Patent: April 1, 2014
    Assignee: United Services Automobile Association (USAA)
    Inventors: Larry W. Clark, Jason P. Hendry, Mark Steen
  • Patent number: 8688624
    Abstract: According to one embodiment of the present invention, the automated loading of seed data for testing in a product integration environment includes receiving input data associated with a test session. The input data may be received in a first format that includes at least one object. A processor may be used to automatically convert the input data into a second format that includes a metadata data string. The metadata data string may then be loaded into a database.
    Type: Grant
    Filed: August 30, 2011
    Date of Patent: April 1, 2014
    Assignee: Bank of America Corporation
    Inventors: William A. Kuehler, Valerian Fuchs, Ramya Raj, Apurva M. Patel
  • Publication number: 20140089251
    Abstract: A computer receives one or more files having configuration information that includes data that defines a plurality of stages of an extract, transform, and load (ETL) job, wherein the plurality of stages comprise a read stage that is preceded by a write stage, and wherein the read stage reads data from a source location, and wherein the data that is read or a modified version of the data that is read is being written by the write stage that writes data to the source location. The computer replaces the read stage with a decompressor stage. The computer replaces the write stage with a compressor stage. The computer executes the decompressor stage and compressor stage on a field-programmable gate array that is programmatically customized with data compression and data decompression functionality to enhance the performance of the ETL job.
    Type: Application
    Filed: September 21, 2012
    Publication date: March 27, 2014
    Applicant: International Business Machines Corporation
    Inventors: Manish A. Bhide, Krishna K. Bonagiri, Srinivas K. Mittapalli, Sumit Negi
  • Publication number: 20140089252
    Abstract: A computer receives one or more files having configuration information that includes data that defines a plurality of stages of an extract, transform, and load (ETL) job, wherein the plurality of stages comprise a read stage that is preceded by a write stage, and wherein the read stage reads data from a source location, and wherein the data that is read or a modified version of the data that is read is being written by the write stage that writes data to the source location. The computer replaces the read stage with a decompressor stage. The computer replaces the write stage with a compressor stage. The computer executes the decompressor stage and compressor stage on a field-programmable gate array that is programmatically customized with data compression and data decompression functionality to enhance the performance of the ETL job.
    Type: Application
    Filed: May 16, 2013
    Publication date: March 27, 2014
    Applicant: International Business Machines Corporation
    Inventors: Manish A. Bhide, Krishna K. Bonagiri, Srinivas K. Mittapalli, Sumit Negi
  • Patent number: 8682936
    Abstract: Techniques for an inherited entity storage model are described that can be employed to implement inherited entity management for a CRM system. In at least some embodiments, input can be obtained to create a custom entity that is based at least in part upon a parent entity. The custom entity is created to inherit the parent entity according to an inheritance relationship established between the entities. To do so, the custom entity is created from the parent entity in a common table with the parent entity using some common fields and defining custom fields as appropriate. Data for the entities is then stored via the common table. This approach can reduce storage requirements, enable unified searching, and speed up data operations. Further, back-end business logic associated with parent entity in the CRM system can be automatically applied to the custom entity based upon the inheritance relationship established between the entities.
    Type: Grant
    Filed: December 15, 2010
    Date of Patent: March 25, 2014
    Assignee: Microsoft Corporation
    Inventors: Koushik Bhattacharjee, Prabhat Kumar Pandey, David R. Shutt, Elliot S. Lewis
  • Patent number: 8683322
    Abstract: In some embodiments, communications in a private network are programmatically inspected to identify traffic associated with uncontrolled Web applications originating from outside of the private network. Unstructured data, including messages and application content, originating from such uncontrolled Web Applications may be disassembled, analyzed, and categorized into application element types. In some embodiments, these application element types may be source specific. An example of a source would be a social networking site operating on a public network such as the Internet. The application element types thus generated can then be utilized in a variety of ways to facilitate the entity operating the private network to, for instance, control, monitor, archive, categorize, and moderate communications between its users and social networking sites operating outside the entity's private network.
    Type: Grant
    Filed: May 21, 2010
    Date of Patent: March 25, 2014
    Assignee: Socialware, Inc.
    Inventor: Cameron Blair Cooper
  • Patent number: 8682841
    Abstract: A system and method for collecting and processing data over a communications network. A data mining marshaller module associates each plugin to a particular data source and manages the plugin to periodically retrieve unstructured data from the data source based on a plurality of data items to be monitored on behalf of a plurality of users. The plugins convert unstructured data received from the data sources to structured data and the data marshaller module stores the structured data in a database. This enables the system and method to aggregate and display the structured data in multiple graphical representations according to the user's preference.
    Type: Grant
    Filed: September 5, 2012
    Date of Patent: March 25, 2014
    Assignee: Willow Acquisition Corporation
    Inventors: Mark D. Ghuneim, Matthew R. Dennebaum, Dustin J. Norlander
  • Publication number: 20140081903
    Abstract: In accordance with disclosed embodiments, there are provided methods, systems, and apparatuses for displaying and filtering business analytics data stored in the cloud, including, for example, means for displaying a graphical interface at a client device; communicating a business analytics query from the client device to a remote host organization via a public Internet; receiving a business analytics dataset in a complete and unfiltered form from the host organization responsive to the business analytics query; caching the business analytics dataset in its complete and unfiltered form to the memory of the client device; displaying a business analytics report at the graphical interface of the client device, the business analytics report representative of the business analytics dataset in its complete and unfiltered form; receiving filter input at the client device; applying the filter input to the business analytics dataset to yield a filtered sub-set; and updating the business analytics report displayed at th
    Type: Application
    Filed: September 17, 2013
    Publication date: March 20, 2014
    Applicant: SALESFORCE.COM, INC.
    Inventors: Marko Koosel, Suyog Anil Deshpande
  • Publication number: 20140081902
    Abstract: Embodiments relate to integrating data transform test with a data transform tool. A method and system are described for creating a data transform test for a data transform job having a data transform script, the method includes determining all data transform units available in the data transform job, determining a subset of the available data transform units for a new test, and generating a subset test execution script for the subset of data transform units from the data transform script. The method further includes determining boundary test data at each boundary of the subset of data transform units, defining a data transform test, and saving the data transform test for later testing. The data transform test includes the subset of data transform units with subset test execution script and with boundary test data.
    Type: Application
    Filed: August 29, 2013
    Publication date: March 20, 2014
    Applicant: International Business Machines Corporation
    Inventors: Leonard D. Greenwood, Arron J. Harden, Martin J. Sanders, Julian J. Vizor
  • Publication number: 20140074771
    Abstract: Provided are techniques for generating a relational query. Information is collected from a query specification and a model for an On-Line Analytical Processing (OLAP) query having at least a first expression and a second expression. The collected information is used to generate a relational query to retrieve report data to be used to satisfy the first expression and the second expression.
    Type: Application
    Filed: September 12, 2012
    Publication date: March 13, 2014
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Xiaowen He, Lin Luo, Martin Petitclerc
  • Patent number: 8671071
    Abstract: Configurations and applications for data signatures are disclosed. Such a data signature may be specific to a particular data element in a data set, and may define this particular data element in relation to one or more other data elements. These data signatures may be used for any appropriate purpose. For instance, data signatures of this type may be generated from a given data set and may be used to analyze this data set in at least some respect, including to identify one or more features in the data set (e.g., for feature extraction purposes). Data signatures of this type may also be used in at least some fashion to generate a presentation or output that relates to the associated data set (including digitally on an appropriate display, as well as in “hard copy” form).
    Type: Grant
    Filed: July 21, 2011
    Date of Patent: March 11, 2014
    Assignee: Apokalyyis, Inc.
    Inventors: Robert Maddox Brinson, Jr., Nicholas Levi Middleton, Robert Wayne White
  • Publication number: 20140067751
    Abstract: A cardinality of an incoming data stream is maintained in real time; the cardinality is maintained in a data structure that is represented by an unsorted list at low cardinalities, a linear counter at medium cardinalities, and a PCSA at high cardinalities. The conversion to the linear counter makes use of the data in the unsorted list, after which that data is discarded. The conversion to the PCSA uses only the data in the linear counter.
    Type: Application
    Filed: August 9, 2013
    Publication date: March 6, 2014
    Inventors: Nikhil Shirish Ketkar, Gaurav Mishra, Jaskaran Singh Bawa, Mark Crovella
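The abstract above describes a structure that changes representation as the distinct count grows. The following is a minimal sketch of the first two stages only (unsorted list, then linear counter), not the patented implementation. The class name `HybridCounter`, the thresholds `list_limit` and `num_bits`, and the use of truncated SHA-256 as the hash are all assumptions made for illustration; the PCSA stage is omitted.

```python
import hashlib
import math


def _hash(item: str) -> int:
    """Stable 64-bit hash of an item (truncated SHA-256; an illustrative choice)."""
    return int.from_bytes(hashlib.sha256(item.encode("utf-8")).digest()[:8], "big")


class HybridCounter:
    """Hypothetical distinct-count tracker: exact list first, linear counter later."""

    def __init__(self, list_limit: int = 64, num_bits: int = 4096):
        self.list_limit = list_limit  # assumed switch-over threshold
        self.num_bits = num_bits      # assumed bitmap size for the linear counter
        self.items = []               # unsorted list of distinct items
        self.bitmap = None            # allocated when the list is converted

    def add(self, item: str) -> None:
        if self.bitmap is None:
            if item not in self.items:
                self.items.append(item)
            if len(self.items) > self.list_limit:
                self._convert_to_linear_counter()
        else:
            self.bitmap[_hash(item) % self.num_bits] = 1

    def _convert_to_linear_counter(self) -> None:
        # Replay the list into the bitmap, then discard the list.
        self.bitmap = bytearray(self.num_bits)
        for item in self.items:
            self.bitmap[_hash(item) % self.num_bits] = 1
        self.items = []

    def estimate(self) -> float:
        if self.bitmap is None:
            return float(len(self.items))  # exact while the list is in use
        zeros = self.bitmap.count(0)
        if zeros == 0:
            return float("inf")  # bitmap saturated; a larger sketch is needed
        # Linear counting estimate: n ~ -m * ln(fraction of zero bits)
        return -self.num_bits * math.log(zeros / self.num_bits)
```

While the list is in use the count is exact; after conversion the estimate is approximate and degrades as the bitmap fills, which is the point at which a structure such as PCSA would take over.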
  • Publication number: 20140067750
    Abstract: Techniques for automatically partitioning a multi-platform data transform flow graph to one or more target output platforms are provided. The techniques include performing type inference on a transform graph, wherein the transform graph comprises one or more data transforms, automatically partitioning the transform graph to one or more target output platforms based on one or more policies, performing an optimization of the partitioned transform graph, and generating code, from the partitioned transform graph, for each set of the one or more data transforms based on the one or more target output platforms.
    Type: Application
    Filed: March 9, 2011
    Publication date: March 6, 2014
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Anand Ranganathan, Anton V. Riabov, Octavian Udrea
  • Patent number: 8666932
    Abstract: The method according to one embodiment of the present invention comprises retrieving one or more terms or phrases comprising an instant messaging conversation in which one or more users are participating. One or more term vectors comprising one or more vector terms associated with the one or more retrieved terms or phrases comprising the instant messaging conversation are generated and one or more vector terms are selected from said term vectors. The one or more selected vector terms are displayed to the one or more users participating in the instant messaging conversation. An indication of a user selection of a given displayed vector term is received and one or more content items responsive to the selected vector term are identified.
    Type: Grant
    Filed: June 27, 2012
    Date of Patent: March 4, 2014
    Assignee: Yahoo! Inc.
    Inventor: Shiv Ramamurthi
  • Patent number: 8666933
    Abstract: Provided are methods and systems for distributing an asset to a multi-tiered network node. A pending notice is received from a distribution server. If the notice indicates that at least one asset is pending, i.e., awaiting deployment, an asset descriptor manifest is received from the distribution server. The asset descriptor manifest, which is stored in a memory on a node, identifies at least one asset to be deployed to the node and includes an offset associated with the asset identifier. A fragment, associated with the asset, is received and stored in the memory. The offset associated with the asset is marked with the end of the fragment and a second fragment, beginning at the offset, is received. Additional fragments are received, and the offset updated, until the entire asset is deployed to the node. Alternately, the entire asset or multiple assets are received in the first fragment.
    Type: Grant
    Filed: November 29, 2012
    Date of Patent: March 4, 2014
    Assignee: OP40 Holdings, Inc.
    Inventors: Paolo R. Pizzorni, Charles P. Pace, Darin S. DeForest, Shuang Chen
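The offset-based fragment transfer described in the abstract above can be illustrated with a short sketch. This is an assumption-laden illustration rather than the patented protocol: `ManifestEntry`, `deploy_asset`, and the `fetch_fragment` callback are hypothetical names, and the `total_size` field is an added assumption so that the loop knows when the asset is complete.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class ManifestEntry:
    """Hypothetical manifest record: asset identifier plus a resume offset."""
    asset_id: str
    total_size: int   # assumed field; not mentioned in the abstract
    offset: int = 0   # position where the next fragment begins


def deploy_asset(entry: ManifestEntry,
                 fetch_fragment: Callable[[str, int], bytes],
                 storage: bytearray) -> bytes:
    """Request fragments starting at entry.offset until the asset is complete."""
    while entry.offset < entry.total_size:
        fragment = fetch_fragment(entry.asset_id, entry.offset)
        if not fragment:
            raise IOError("server returned an empty fragment")
        # Store the fragment, then mark the offset with the end of the fragment.
        storage[entry.offset:entry.offset + len(fragment)] = fragment
        entry.offset += len(fragment)
    return bytes(storage)
```

Because the offset is persisted with the manifest entry, an interrupted transfer can resume from the last stored fragment instead of restarting. A server that returns the whole asset in the first fragment completes the loop in one iteration.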
  • Patent number: 8660983
    Abstract: A method and system for using a data warehouse to improve results of enterprise level processes are provided. The data warehouse typically includes industry-wide empirical data relating to corresponding operational practices, metrics, and outcomes. The method focuses on actual process results by taking a holistic, end-to-end view of the process in conjunction with using the data in the data warehouse to enable effective process improvements.
    Type: Grant
    Filed: October 29, 2010
    Date of Patent: February 25, 2014
    Assignee: Genpact
    Inventors: Amit Aggarwal, Ruchin Chandra, Apoorva Aggarwal, Parul Ranvir Singh, Guni Brar
  • Patent number: 8660985
    Abstract: A multi-dimensional OLAP query processing method oriented to a column store data warehouse is described. With this method, an OLAP query is divided into a bitmap filtering operation, a group-by operation and an aggregate operation. In the bitmap filtering operation, a predicate is first executed on a dimension table to generate a predicate vector bitmap, and a join operation is converted, through address mapping of a surrogate key, into a direct dimension table tuple access operation; in the group-by operation, a fact table tuple satisfying a filtering condition is pre-generated into a group-by unit according to a group-by attribute in an SQL command and is allocated with an increasing ID; and in the aggregate operation, group-by aggregate calculation is performed according to a group item of a fact table filtering group-by vector through one-pass column scan on a fact table measure attribute.
    Type: Grant
    Filed: May 16, 2012
    Date of Patent: February 25, 2014
    Assignee: Renmin University of China
    Inventors: Shan Wang, Yan-Song Zhang
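The three phases named in the abstract above (bitmap filtering, group-by, aggregate) can be sketched on toy in-memory columns. This is a simplified illustration, not the patented method: the function `olap_sum`, the dictionary-based dimension rows, and the use of the surrogate key as a plain list index are assumptions, and only a SUM aggregate is shown.

```python
def olap_sum(dim_rows, predicate, group_attr, fact_fk, fact_measure):
    """Toy column-store query: filter by dimension predicate, group, then sum.

    dim_rows     -- dimension table; the list index serves as the surrogate key
    predicate    -- function applied to each dimension row
    group_attr   -- dimension attribute to group by
    fact_fk      -- fact-table foreign-key column (surrogate keys)
    fact_measure -- fact-table measure column, aligned with fact_fk
    """
    # 1. Bitmap filtering: evaluate the predicate once per dimension row.
    bitmap = [predicate(row) for row in dim_rows]

    # 2. Group-by: the join becomes a direct index lookup; each new group
    #    receives the next increasing ID. -1 marks fact rows that are filtered out.
    group_ids = {}
    group_vector = []
    for fk in fact_fk:
        if bitmap[fk]:
            key = dim_rows[fk][group_attr]
            group_vector.append(group_ids.setdefault(key, len(group_ids)))
        else:
            group_vector.append(-1)

    # 3. Aggregate: a single pass over the measure column, guided by the vector.
    sums = [0] * len(group_ids)
    for gid, value in zip(group_vector, fact_measure):
        if gid >= 0:
            sums[gid] += value
    return {key: sums[gid] for key, gid in group_ids.items()}


dims = [
    {"region": "EU", "year": 2012},
    {"region": "US", "year": 2012},
    {"region": "EU", "year": 2011},
]
totals = olap_sum(dims, lambda r: r["year"] == 2012, "region",
                  fact_fk=[0, 1, 2, 0, 1], fact_measure=[10, 20, 30, 40, 50])
print(totals)  # {'EU': 50, 'US': 70}
```

The fact row that references the 2011 dimension row is dropped by the bitmap, so it never reaches the aggregation pass.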
  • Patent number: 8659777
    Abstract: In print systems based on data reception from a server, it takes much time to receive data of a large size. Therefore systems have been developed in which data is divided before being received. However, if divided data items are received by a print device while temperature adjustment or calibration is performed for an engine of the device, the divided data items are not immediately printed, resulting in a long print time. A print device sequentially receives divided data items if the print device is in a printable status. Otherwise, the print device simultaneously receives the divided data items in a plurality of sessions.
    Type: Grant
    Filed: July 29, 2011
    Date of Patent: February 25, 2014
    Assignee: Canon Kabushiki Kaisha
    Inventor: Hidenori Yokokura
  • Patent number: 8660984
    Abstract: An optical image of a check is obtained at the approximate time of a check-based financial transaction and the approximate time of the check-based financial transaction is recorded. Geographical position/location data and/or voice memo data is then obtained at, or about, the time the optical image of the check is obtained. Optical Character Recognition (OCR) technology is then used to extract image-based financial transaction data from the optical image of the check and the geographical position/location data, and/or voice memo data, is also transformed into financial transaction data associated with the check and the check-based financial transaction. The extracted and/or transformed financial transaction data is then used, at least in part, to automatically assign a financial category to the check-based financial transaction and/or transform the category status of the check-based financial transaction.
    Type: Grant
    Filed: January 13, 2012
    Date of Patent: February 25, 2014
    Assignee: Intuit Inc.
    Inventors: Indraneel Bhattacharyya, David Lish, Christopher H. J. Whittam, Ryan Pfeffer
  • Patent number: 8655939
    Abstract: A method and system processes data in a distributed computing system to survive an electromagnetic pulse (EMP) attack. The computing system has proximal select content (SC) data stores and geographically distributed distal data stores, all with respective access controls. The data input or put through the computing system is processed to obtain the SC and other associated content. The process then extracts and stores such content in the proximal SC data stores and geographically distributed distal SC data stores. The system further processes data to geographically distribute the data with data processes including: copy, extract, archive, distribute, and a copy-extract-archive and distribute process with a sequential and supplemental data destruction process. In this manner, the data input is distributed or spread out over the geographically distributed distal SC data stores. The system and method permits reconstruction of the processed data only in the presence of a respective access control.
    Type: Grant
    Filed: September 8, 2009
    Date of Patent: February 18, 2014
    Assignee: Digital Doors, Inc.
    Inventors: Ron M. Redlich, Martin A. Nemzow
  • Patent number: 8655831
    Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for automatically parsing data from disparate data sources. In some implementations, actions include receiving first data from a first data source, identifying a first regular expression that corresponds to a data format of the first data, selecting a first set of parsing rules from a plurality of parsing rules based on the first regular expression, parsing the first data based on the first set of parsing rules to provide a first set of sub-data, populating data fields of a first data object with respective sub-data from the first set of sub-data, and transmitting the first data object to a computing device.
    Type: Grant
    Filed: July 31, 2012
    Date of Patent: February 18, 2014
    Assignee: Accenture Global Services Limited
    Inventor: Eric Allan Frome
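As a rough illustration of the idea in the abstract above, the sketch below matches incoming data against regular expressions, selects the parsing rules registered for the first matching format, and populates the fields of a data object. The `Contact` class, the two sample formats, and the `FORMATS` registry are all hypothetical; the abstract does not specify them.

```python
import re
from dataclasses import dataclass


@dataclass
class Contact:
    """Hypothetical target data object."""
    name: str = ""
    email: str = ""


def parse_csv_style(raw: str) -> dict:
    # Rule set for "Name,email@host" records.
    name, email = raw.split(",")
    return {"name": name.strip(), "email": email.strip()}


def parse_kv_style(raw: str) -> dict:
    # Rule set for "name=...;email=..." records.
    pairs = (part.split("=", 1) for part in raw.split(";"))
    return {key.strip(): value.strip() for key, value in pairs}


# Each entry pairs a format-detecting regular expression with its parsing rules.
FORMATS = [
    (re.compile(r"^[^,;=]+,[^,;=]+@[^,;=]+$"), parse_csv_style),
    (re.compile(r"^name=[^;]*;email=[^;]*$"), parse_kv_style),
]


def parse(raw: str) -> Contact:
    """Identify the data format by regex, then apply the matching rule set."""
    for pattern, rules in FORMATS:
        if pattern.match(raw):
            return Contact(**rules(raw))
    raise ValueError(f"no known format matches: {raw!r}")
```

Supporting a new source format in this sketch means appending one more pattern and rule function to `FORMATS`, without touching `parse`.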
  • Patent number: 8656374
    Abstract: A computer readable medium is configured to receive a description of a COBOL copybook that can be represented in one of a plurality of disparate formats, to parse the COBOL copybook based on the description of the COBOL copybook, and to create a standardized data record schema based on the COBOL copybook. The description of the COBOL copybook includes information about the format of the COBOL copybook.
    Type: Grant
    Filed: June 16, 2006
    Date of Patent: February 18, 2014
    Assignee: Business Objects Software Ltd.
    Inventors: Andrey Belyy, Alexander Ocher
  • Patent number: 8656253
    Abstract: A method begins by a dispersed storage (DS) processing module generating preliminary dispersed storage network (DSN) storage information for data to be stored in a DSN. The method continues with the DS processing module accessing DSN storage information regarding other data stored in the DSN and comparing the preliminary DSN storage information for the data with the DSN storage information regarding the other data. When at least a portion of the data has compatible preliminary DSN storage information with DSN storage information of at least a portion of the other data, the method continues with the DS processing module generating DSN storage information for remaining portions of the data to produce remaining portions DSN storage information and generating DSN storage information for the data based on the DSN storage information of the at least the portion of the other data and the remaining portions DSN storage information.
    Type: Grant
    Filed: May 4, 2012
    Date of Patent: February 18, 2014
    Assignee: Cleversafe, Inc.
    Inventors: Wesley Leggette, Jason K. Resch
  • Patent number: 8650152
    Abstract: Methods, articles of manufacture and systems for managing execution of workflows. One embodiment provides a computer-implemented method for managing execution of a data driven multi-step workflow. The method includes receiving input data for a step of the workflow and performing the step of the workflow on the input data to obtain a result set. Then, at least one rule is applied to the result set for determining whether one or more associated conditions are satisfied. The at least one rule defines the one or more associated conditions and an associated process. If the one or more associated conditions are satisfied, the associated process is performed on the result set.
    Type: Grant
    Filed: May 28, 2004
    Date of Patent: February 11, 2014
    Assignee: International Business Machines Corporation
    Inventors: Richard D. Dettinger, Daniel P. Kolz, Richard J. Stevens, Shannon E. Wenzel
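The rule-driven workflow step in the abstract above can be sketched in a few lines. This is an illustrative reading, not the claimed method: the `Rule` class, the `run_step` function, and the sample threshold rule are assumptions introduced for the example.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Rule:
    """Hypothetical rule: a condition on the result set plus an associated process."""
    condition: Callable[[list], bool]
    process: Callable[[list], None]


def run_step(step: Callable[[list], list], input_data: list,
             rules: List[Rule]) -> list:
    """Perform one workflow step, then apply each rule to its result set."""
    result_set = step(input_data)
    for rule in rules:
        if rule.condition(result_set):
            rule.process(result_set)
    return result_set


# Example: flag the step when more than two readings exceed 100.
alerts = []
high_readings = lambda data: [x for x in data if x > 100]
too_many = Rule(condition=lambda rs: len(rs) > 2,
                process=lambda rs: alerts.append(len(rs)))

result = run_step(high_readings, [50, 150, 200, 300], [too_many])
print(result, alerts)  # [150, 200, 300] [3]
```

Keeping conditions and processes as data lets the same step run under different rule sets without changing the step itself.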
  • Publication number: 20140040182
    Abstract: Remote data collection systems and methods retrieve data including financial, sales, marketing, operational and the like data from a plurality of databases and database types remotely over a network in an automated, platform-agnostic manner. An Extract Transform and Load (ETL) data replication method for Chart of Account (COA) standardization includes receiving a request for remote data collection to extract data from a data source; extracting data in a non-intrusive manner from the data source, wherein the data comprises non-standard COA data; and transforming one of an entire set or a subset of the extracted data based on the request based on a template or a standardized form desired for comparisons.
    Type: Application
    Filed: October 14, 2013
    Publication date: February 6, 2014
    Applicant: ZEEWISE, INC.
    Inventors: Clark S. Gilder, Josh Hix, Bartosz J. Zalewski