From Unstructured Or Semi-structured Data To Structured Data Patents (Class 707/811)
  • Publication number: 20120254262
    Abstract: There is disclosed a method, apparatus and computer program for parsing a message using a message model. A message is received comprising one or more message fields. This message is stored as a reference bitstream. The message model is used to compare a message field in one or more subsequently received messages with the equivalent field in the reference bitstream. Finally, responsive to determining that a message field in said one or more subsequently received messages matches a field in the reference bitstream a predetermined number of times, storing parser outputs for the matching field for future reuse.
    Type: Application
    Filed: June 18, 2012
    Publication date: October 4, 2012
    Applicant: International Business Machines Corporation
    Inventor: Timothy Kimber
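
The caching idea in the abstract above can be sketched as follows. This is a minimal illustration, not the patented implementation: the class name `CachingParser`, the dict-of-fields message representation (standing in for the reference bitstream), and the `threshold` parameter are all assumptions made for the example.

```python
class CachingParser:
    """Reuse parser output for fields that keep matching a reference message."""

    def __init__(self, parse_field, threshold=3):
        self.parse_field = parse_field  # hypothetical per-field parser
        self.threshold = threshold      # matches required before caching
        self.reference = None           # first message, kept as the reference
        self.match_counts = {}          # field name -> number of matches seen
        self.cache = {}                 # field name -> stored parser output

    def parse(self, message):
        """Parse a message given as a dict of field name -> raw bytes."""
        if self.reference is None:
            # The first message received becomes the reference.
            self.reference = dict(message)
            return {name: self.parse_field(raw) for name, raw in message.items()}
        out = {}
        for name, raw in message.items():
            if self.reference.get(name) == raw:
                if name in self.cache:
                    # Field matched often enough earlier: reuse stored output.
                    out[name] = self.cache[name]
                    continue
                value = self.parse_field(raw)
                self.match_counts[name] = self.match_counts.get(name, 0) + 1
                if self.match_counts[name] >= self.threshold:
                    self.cache[name] = value
            else:
                value = self.parse_field(raw)
            out[name] = value
        return out
```

Once a field has matched the reference `threshold` times, later messages carrying the same bytes for that field skip the parser entirely.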
  • Patent number: 8266124
    Abstract: The method and system of the present invention provides an improved technique for integrated asset management. Information is aggregated from a variety of sources into a centralized computerized database. Thereafter, asset transition events are scheduled. Information from the centralized computerized database is used in the performance of the asset transition events and information relating to the asset transition events is added to the centralized computerized database. Subsequent changes to the asset are also recorded into the centralized computerized database. As a result, a plethora of information is available within said database for the purpose of managing future asset transition events.
    Type: Grant
    Filed: December 17, 2002
    Date of Patent: September 11, 2012
    Assignee: Caldvor Acquisitions Ltd., LLC
    Inventors: Shawn Thomas, Gregory Gray, Michael Woodfin, Warner Mizell, Brian Thomas
  • Patent number: 8250105
    Abstract: Methods and apparatus, including computer program products, implementing and using techniques for compressing data included in several transactions. Each transaction has at least one item. A unique identifier is assigned to each different item and, if taxonomy is defined, to each different taxonomy parent. Sets of transactions are formed from the several transactions. The sets of transactions are stored using a computer data structure including: a list of identifiers of different items in the set of transactions, information indicating number of identifiers in the list, and bit field information indicating presence of the different items in the set of transactions, said bit field information being organized in accordance with the list for facilitating evaluation of patterns with respect to the set of transactions. A data structure for compressing data included in a set of transactions is also provided.
    Type: Grant
    Filed: February 6, 2007
    Date of Patent: August 21, 2012
    Assignee: International Business Machines Corporation
    Inventors: Toni Bollinger, Ansgar Dorneich, Christoph Lingenfelder
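
A rough sketch of the data structure described in the abstract above: a list of distinct item identifiers, its length, and bit fields organized in the order of that list. The function names and the choice of one integer bit field per transaction are assumptions for illustration; the patent may lay out the bit fields differently.

```python
def compress_transactions(transactions):
    """Encode a set of transactions as (item list, item count, bit fields)."""
    items = []   # distinct item identifiers, in first-seen order
    index = {}   # item identifier -> position in `items`
    for tx in transactions:
        for item in tx:
            if item not in index:
                index[item] = len(items)
                items.append(item)
    bitfields = []
    for tx in transactions:
        bits = 0
        for item in tx:
            bits |= 1 << index[item]  # bit i set means items[i] is present
        bitfields.append(bits)
    return items, len(items), bitfields


def support(items, bitfields, pattern):
    """Count the transactions that contain every item in `pattern`."""
    mask = 0
    for item in pattern:
        if item not in items:
            return 0
        mask |= 1 << items.index(item)
    return sum(1 for bits in bitfields if (bits & mask) == mask)
```

Because the bit positions follow the item list, evaluating a pattern against the set reduces to one mask comparison per transaction.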
  • Patent number: 8239562
    Abstract: A system for aggregating context information for messages includes a context container that associates names with context values and metadata for context entries. The system further includes a network protocol component configured to read messages from a network transport and an encoder component configured to translate messages from a raw format into a canonical message format. The canonical message format is an enveloped message containing an application payload and message metadata. The context container is associated with a message in the canonical format. The system further includes an extraction component configured to retrieve context from a native network transport protocol and insert the context values and metadata into the context container, a plurality of additional protocol components configured to add, remove, or modify entries in the context container, and one or more higher level application components configured to operate on the canonical message using the context entries.
    Type: Grant
    Filed: January 10, 2012
    Date of Patent: August 7, 2012
    Assignee: Microsoft Corporation
    Inventors: Nicholas A. Allen, Justin David Brown, Stephen Jared Maine, Stephen J. Millet, Edmund Samuel Victor Pinto, Tirunelveli R. Vishwanath
  • Patent number: 8229976
    Abstract: A user interface may be generated from an XML schema. For a data object definition in an XML schema, a user interface object may be defined, and a memory store for the data object may be created and bound to the user interface object. The user interface component may be defined in the XML schema, as a separate file, or within an XML document. A user interface object may be selected based on the data type, and various limits and display mechanisms and input devices may be configured based on the schema. When bound, the data stored in the memory store may be reflected in the user interface component, and changes to the user interface component may be reflected in the memory store.
    Type: Grant
    Filed: March 27, 2008
    Date of Patent: July 24, 2012
    Assignee: Microsoft Corporation
    Inventor: Steven P Burns
  • Publication number: 20120173592
    Abstract: Reshaping of streams is provided to facilitate utilizing the streams without rapidly increasing memory requirements as the size of the stream increases. The streams can be pushed to alternative storage upon being reshaped, for example, such as to a persistent storage. If the streams lose structure, for example if a hierarchical stream is reshaped into a flat structure for storage in a database, structural information can be stored along with the streams and utilized to shape the stream to its original structure upon request for data, for example. Streams can be pulled from an exposing device or application, and portions of the stream can be transformed and stored according to a set of stop elements; the stop elements can be associated with functions that take action on the stream upon reaching a stop element, such as transforming and storing a portion thereof.
    Type: Application
    Filed: March 2, 2012
    Publication date: July 5, 2012
    Applicant: MICROSOFT CORPORATION
    Inventors: Avner Y. Aharoni, Henricus Johannes Maria Meijer
  • Patent number: 8204849
    Abstract: A method for interfacing products with a web browser comprises retrieving a product profile, the product profile associated with one product. An agent type is determined based, at least in part, on the product profile. Data is retrieved from the product based on the agent type and processed such that the data is accessible by the web browser.
    Type: Grant
    Filed: July 9, 2004
    Date of Patent: June 19, 2012
    Assignee: CA, Inc.
    Inventors: Patrick R Lee, Shyhshiun Chen
  • Patent number: 8200716
    Abstract: A method and system for automatically defining and provisioning organizational data in a unified messaging (UM) platform are disclosed. An adapter in a unified messaging platform connects to at least one client human resources database. Human resources information that is organized in an organizational hierarchy is retrieved from the human resources database, and hierarchical organizational data is automatically generated in the UM platform based on the organizational hierarchy of the human resources information retrieved from the human resources database. UM mailboxes are provisioned to messaging centers in the UM platform based on the hierarchical organizational data.
    Type: Grant
    Filed: December 15, 2008
    Date of Patent: June 12, 2012
    Assignee: AT&T Intellectual Property I, L.P.
    Inventors: Mehrad Yasrebi, James Jackson, Timothy Schroeder
  • Patent number: 8190991
    Abstract: The automatic generation of schemas for XML documents is provided. In an illustrative implementation, a computer readable medium having computer readable instructions to instruct a computing environment to execute one or more inference algorithms is provided. In operation, an XML document is processed according to the computer readable instructions such that the content and tags of the XML document are identified. The XML document is processed according to an inference algorithm, which executes one or more processing rules, and uses the XML document information in conjunction with the rules and operations of the XML schema definition language, to automatically produce a schema for the XML document.
    Type: Grant
    Filed: September 26, 2008
    Date of Patent: May 29, 2012
    Assignee: Microsoft Corporation
    Inventors: Nithyalakshmi Sampathkumar, Daniel Mikusik, Nanshan Zeng
  • Patent number: 8156149
    Abstract: Reshaping of streams is provided to facilitate utilizing the streams without rapidly increasing memory requirements as the size of the stream increases. The streams can be pushed to alternative storage upon being reshaped, for example, such as to a persistent storage. If the streams lose structure, for example if a hierarchical stream is reshaped into a flat structure for storage in a database, structural information can be stored along with the streams and utilized to shape the stream to its original structure upon request for data, for example. Streams can be pulled from an exposing device or application, and portions of the stream can be transformed and stored according to a set of stop elements; the stop elements can be associated with functions that take action on the stream upon reaching a stop element, such as transforming and storing a portion thereof.
    Type: Grant
    Filed: July 24, 2007
    Date of Patent: April 10, 2012
    Assignee: Microsoft Corporation
    Inventors: Avner Y. Aharoni, Henricus Johannes Maria Meijer
  • Patent number: 8135704
    Abstract: A method and system of data acquisition by a listing service provider is disclosed. A network address is received from a client computer that is operated by a lister. The network address can be indicative of a location of listing data on a computer network. The listing data comprises at least one information item provided by the lister. The network address received from the lister is accessed by opening a computer network connection to retrieve the listing data. The lister makes available the listing data for retrieval so that the listing data can be posted in a search bank hosted by the listing service provider. The listing data is retrieved from the network address using the computer network connection by copying the listing data onto a listing data database.
    Type: Grant
    Filed: March 11, 2006
    Date of Patent: March 13, 2012
    Assignee: Yahoo! Inc.
    Inventors: Adam Hyder, Joseph Ting
  • Patent number: 8131779
    Abstract: A system and method of information retrieval and triage for information analysis provides for an interactive multi-dimensional and linked visual representation of information content and properties. A query interface plans and obtains result sets. A dimension interface specifies dimensions with which to categorize the result sets. Links among results of a result set or results of different sets are automatically generated for linked selection viewing. Entities may be extracted and viewed and entity relations determined to establish further links and dimensions. Properties encoded in representations of the results in the multi-dimensional views maximize display density. Multiple queries may be performed and compared. An integrated browser component responsive to the links is provided for viewing documents. Documents and other information from the result set may be used in an analysis component providing a space for visual thinking, to arrange the information in the space while maintaining links automatically.
    Type: Grant
    Filed: November 30, 2005
    Date of Patent: March 6, 2012
    Assignee: Oculus Info Inc.
    Inventors: David Jonker, William Wright, David Schroh, Pascale Proulx, Brian Cort, Alex Skaburskis
  • Patent number: 8126917
    Abstract: A method and a device transport a postal object to an incompletely specified destination address. The object has at least one information item relating to a destination address to which the object is to be transported. A set of address components is predetermined. An address database is used which in each case contains a computer-available record per destination address for a set of possible destination addresses. Each record contains in each case one entry for each predetermined address component. At least one destination address information item on the object is detected. A screen form is used which contains in each case one input field for each address component. The detected destination address information is compared with the records of the address database. When exactly one record is consistent with all detected destination address information items, a transportation of the object to the destination address of the record is triggered.
    Type: Grant
    Filed: November 24, 2009
    Date of Patent: February 28, 2012
    Assignee: Siemens Aktiengesellschaft
    Inventor: Gerhard Funcke
  • Patent number: 8108540
    Abstract: A system for aggregating context information for messages. The system includes a context container. The context container associates names with context values and metadata for context entries. The system further includes a network protocol component configured to read messages from a network transport and an encoder component configured to translate messages from a raw format into a canonical message format. The canonical message format is an enveloped message containing an application payload and message metadata. The context container is associated with a message in the canonical format.
    Type: Grant
    Filed: December 12, 2008
    Date of Patent: January 31, 2012
    Assignee: Microsoft Corporation
    Inventors: Nicholas A. Allen, Justin David Brown, Stephen Jared Maine, Stephen J. Millet, Edmund Samuel Victor Pinto, Tirunelveli R. Vishwanath
  • Patent number: 8103704
    Abstract: Methods for consolidating databases while maintaining data integrity are disclosed. A source database and target database are compared, and consolidated, and the consolidated databases are used. In other examples, a database is split to support divested entities.
    Type: Grant
    Filed: July 31, 2007
    Date of Patent: January 24, 2012
    Assignee: ePrentise, LLC
    Inventor: Helene Abrams
  • Patent number: 8095575
    Abstract: A computer-implemented word processing presentation method is disclosed. The method includes obtaining an unformatted data structure containing a series of characters representing content for a word processing document, accessing a series of first records in a file associated with the unformatted data structure, wherein each first record contains data correlating a location of one or more characters in the unformatted data structure to a location for the one or more characters in the word processing document, and generating a display of the word processing document by applying the correlating data from the series of records to the series of characters in the unformatted data structure.
    Type: Grant
    Filed: January 31, 2008
    Date of Patent: January 10, 2012
    Assignee: Google Inc.
    Inventors: Ramna Sharma, Nandan Nidhi, Suvrat Sharma, Ganesh Gupta
  • Patent number: 8087030
    Abstract: Processing a received message includes receiving a message that includes a plurality of values associated with respective data elements that assign an information category to each of the values. The message further includes a plurality of context values belonging to respective context categories. The method includes identifying, in a relevance record and for a first one of the context categories in the message, at least one of the data elements that is relevant for the context value of the first context category. A rule associated with the context value of the first context category is applied to the value of the identified at least one data element. A system includes a message receiving module, a context value module and a processing module.
    Type: Grant
    Filed: December 29, 2006
    Date of Patent: December 27, 2011
    Assignee: SAP AG
    Inventors: Gunther Stuhec, Volker Wiechers, Karsten K. Bohlmann
  • Patent number: 8086592
    Abstract: A computer readable storage medium includes executable instructions to receive a semantic abstraction describing at least one underlying data source. The semantic abstraction includes at least one dimension with at least one dimension value. Unstructured text is parsed into parsed text units. A dimension value is matched to a parsed text unit to form matched content. An indication of the matched content is stored.
    Type: Grant
    Filed: November 30, 2007
    Date of Patent: December 27, 2011
    Assignee: SAP France S.A.
    Inventors: Gilles Vergnory Mion, Jean-Yves Cras
  • Patent number: 8078638
    Abstract: Multiple sets of data are obtained from different sources. Each data set is represented using a different format having a different syntax and organized in a multi-level nested data structure. Each data set is reformatted into a standardized table format using a depth-first recursive algorithm without relying on the syntax schema of the original format of the data set. Various operations are performed on the tables corresponding to the data sets, including but not limited to joining multiple tables, grouping selected rows of a table, ranking rows of a table, adding or deleting fields from selected rows of a table, etc. Optionally, inferred namespace and text normalization are utilized for selected table operations. One or more templates are provided for converting the data set of a table to a format that may be presented to a user.
    Type: Grant
    Filed: July 9, 2008
    Date of Patent: December 13, 2011
    Assignee: Yahoo! Inc.
    Inventor: Vikash Singh
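
The depth-first reformatting step described in the abstract above can be illustrated with a short sketch. It assumes the nested data has already been loaded into Python dicts and lists, and the dotted `path` column is a naming choice made for this example, not something the abstract specifies.

```python
def flatten(node, path=()):
    """Depth-first walk that yields (path, value) for every leaf value."""
    if isinstance(node, dict):
        for key, child in node.items():
            yield from flatten(child, path + (str(key),))
    elif isinstance(node, list):
        for position, child in enumerate(node):
            yield from flatten(child, path + (str(position),))
    else:
        # Leaf: report where it was found, without consulting any schema.
        yield ".".join(path), node


def to_table(record):
    """Turn one nested record into rows of a uniform two-column table."""
    return [{"path": p, "value": v} for p, v in flatten(record)]
```

For instance, `to_table({"a": {"b": 1}, "c": [2, 3]})` produces three rows with paths `a.b`, `c.0` and `c.1`. Rows from differently formatted sources then share one shape, so joins, grouping and ranking can operate on them uniformly.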
  • Patent number: 8078650
    Abstract: Systems and methods of processing an unstructured resource which contains one or more data portions are described. The method comprises reading the unstructured resource into memory and accessing a data structure associated with the unstructured resource. This data structure contains a number of elements, each element including position information for a data portion in the unstructured resource. Using this position information, data portions are located from the unstructured resource and processed and the locating and processing steps are repeated for each element in the data structure.
    Type: Grant
    Filed: April 5, 2007
    Date of Patent: December 13, 2011
    Assignee: Microsoft Corporation
    Inventors: Barry McHugh, Terry Farrell
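
A minimal sketch of the locate-and-process loop in the abstract above, assuming the unstructured resource is a string already read into memory and that each element of the associated data structure stores an offset and a length. The names `Element` and `process_resource` are hypothetical.

```python
from typing import Callable, List, NamedTuple


class Element(NamedTuple):
    """Position information for one data portion of the resource."""
    offset: int
    length: int


def process_resource(resource: str, elements: List[Element],
                     handler: Callable[[str], str]) -> List[str]:
    """Locate each data portion by its position and pass it to `handler`."""
    results = []
    for element in elements:
        portion = resource[element.offset:element.offset + element.length]
        results.append(handler(portion))
    return results
```

The resource itself is never parsed; all structure lives in the side table of positions.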
  • Patent number: 8065334
    Abstract: A system and method for a warranty insight solution are disclosed. In one embodiment, a method includes populating a data mart with data from a number of sources, text analyzing and mining the unstructured data of the data mart according to a uniform structure, performing root cause analysis assistance on staged data mart data, generating root cause analysis output from the root cause analysis, merging the root cause analysis output with the data of the data mart, and generating final output based on a portion of the merged data of the data mart. The data may include data selected from a group including warranty claim data, traceability data, supplier data, manufacturer data, retailer data, customer data, component data, service data, failure data, field data, vehicle failure fault codes through telematics, and collection center data.
    Type: Grant
    Filed: October 30, 2007
    Date of Patent: November 22, 2011
    Assignee: Wipro Limited
    Inventors: Partha Mukherjee, Anand Vasant Batagurki, Sanjeev K. Itagi
  • Patent number: 8060540
    Abstract: Data having express or implied relationships may be displayed by selecting a starting entity in a data structure, building a relationship tree, and building and optimizing a relationship matrix based on the relationship tree. The optimized relationship matrix may be used to layout and render a graphical image that positions various elements with respect to the starting entity based on the relationships. The distance matrix may be optimized by creating a first distance matrix based on the relationship tree, developing a dissimilarity matrix based on expressed or implied relationships, and multiplying the dissimilarity matrix by a weighting factor to determine a distance matrix that may be optimized by multi-dimensional scaling. An optimized weighting factor may be determined and used to select an optimized distance matrix.
    Type: Grant
    Filed: June 18, 2007
    Date of Patent: November 15, 2011
    Assignee: Microsoft Corporation
    Inventors: Yingnong Dang, Xu Yang, Dongmei Zhang, Min Wang, Jian Wang
  • Patent number: 8046370
    Abstract: This disclosure relates to performing a query for a search term of a database containing a plurality of structured documents. Those structured documents that do not include the search term are ferreted or filtered out during an initial search. Matched structured documents which are those structured documents that do contain the search term are evaluated by ranking the individual elements based on how well each individual element matches the search term, and indicating to the user the ranking of the individual elements wherein the individual elements can be accessed by the user.
    Type: Grant
    Filed: September 16, 2008
    Date of Patent: October 25, 2011
    Assignee: Microsoft Corporation
    Inventors: Ji-Rong Wen, Hang Cui
  • Patent number: 8046383
    Abstract: Methods and apparatus, including computer program products, for mapping deep structured data structures. Statements defining a mapping of source elements formatted in accordance with a first hierarchical structure to a target formatted in accordance with a second hierarchical structure are received. The first and second hierarchical structures may be different. A mapping of the source elements to the target in accordance with the statements may be performed, where the statement may be defined in accordance with a mapping language. The mapping language may define that a single statement may represent an iterative approach to mapping elements from the source to the target. The mapping language may support selection of source elements using a format that allows for navigation through a hierarchy of the source. The mapping language may also support nested statements which may allow for nested iterations in which to perform mappings.
    Type: Grant
    Filed: October 24, 2007
    Date of Patent: October 25, 2011
    Assignee: SAP AG
    Inventors: Franz Weber, Soeren Balko, Matthias Miltz
  • Patent number: 8032567
    Abstract: According to some embodiments, demonstration data is received via a front-end application associated with a business information enterprise system. The demonstration data may then be interpreted in accordance with at least one rule to generate business data. A query may be received at a back-end application associated with the business information enterprise system. At least a portion of the business data may then be presented in accordance with the received query.
    Type: Grant
    Filed: October 28, 2010
    Date of Patent: October 4, 2011
    Assignee: SAP AG
    Inventors: Eric Schemer, Tanja B. Wingerter, Markus Ulke
  • Publication number: 20110238712
    Abstract: A method and apparatus are described in which a server providing group communication maintains a document of active group sessions. The group communication server may provide a search capability to identify active group sessions. Alternatively, the group communication server uses a caching entity to provide the search capability to identify active group sessions corresponding to a search request from a user.
    Type: Application
    Filed: September 30, 2009
    Publication date: September 29, 2011
    Applicant: NOKIA SIEMENS NETWORKS OY
    Inventors: Pavel Dostal, Hans Rohnert, Ivo Sedlacek
  • Patent number: 8028004
    Abstract: A data recording system 1 includes: a file generating part 3 for dividing digital data so as to generate a plurality of data files and recording the plurality of the data files sequentially into a recording medium 5; a management file judging part 6 for determining one management file for recording management information of the plurality of the data files; and a management information generating part 4 for recording the management information of the plurality of the data files into the determined management file. The management file judging part 6 determines the management file such that the number of the data files managed by the management file does not exceed a maximum data file number L (L is a natural number) that can be managed by the management file. Thereby, it is possible to record the plurality of the data files that are divided from the digital stream data and are recorded, such that the management of the plurality of the data files is easy.
    Type: Grant
    Filed: July 14, 2006
    Date of Patent: September 27, 2011
    Assignee: Panasonic Corporation
    Inventors: Masafumi Sato, Kazuya Fujimura, Hiroyuki Kamezawa, Tomoo Nakagawa
  • Patent number: 8028007
    Abstract: Large messages in the form of hierarchically structured documents are processed in a streaming fashion using the ultimate consumer read requests as the driving force for the processing. The messages are partitioned into fixed length segments. The segments are processed in pipeline fashion. This processing chain includes simulating random access of hierarchical documents using stream transformations, mapping streams to a transport's native capabilities, composing streams into chains and using pipeline processing on the chains, staging fragments into a database and routing messages when complete messages have been formed, and providing tools to allow the end user to inspect partial messages.
    Type: Grant
    Filed: February 6, 2006
    Date of Patent: September 27, 2011
    Assignee: Microsoft Corporation
    Inventors: Yossi Levanoni, Wei-Lun Lo, Sanjib Saha, Paul Maybee, Bimal Mehta, Lee Graber, Anandhi Somasekaran, Akash Sagar, Balinder Malhi, Allen Zhang, Siunie Sutjahjo
  • Patent number: 8019794
    Abstract: A firmware repository includes an Extensible Markup Language (XML) description file. A system and method for managing the repository is described.
    Type: Grant
    Filed: April 2, 2007
    Date of Patent: September 13, 2011
    Assignee: Sharp Laboratories of America, Inc.
    Inventors: Rabindra Pathak, Eric Thomas Olbricht, Gregory Eugene Borchers
  • Patent number: 8015218
    Abstract: The invention concerns a method for compressing and decompressing a structured document, associated with at least a tree diagram structure defining a document structure and comprising nested structure elements, associated with a type of information, and representing sets of data, the method comprising steps which consist in: performing a syntactic analysis of the structure diagram and standardizing it so as to obtain a single predefined sequence of the elements of the diagram; compiling the standardized diagram to obtain finite automata, each automaton comprising states interconnected by transitions respectively representing the elements of the structure; and compressing the document, and executing at least a compression algorithm associated with a type of information, when a set of data having the type of information is encountered in the document.
    Type: Grant
    Filed: August 31, 2001
    Date of Patent: September 6, 2011
    Assignee: Expway
    Inventors: Cedric Thienot, Claude Seyrat
  • Patent number: 8015178
    Abstract: A system, method, and computer program for storing a plurality of usage conditions to a data set for retrieval by a single query statement, comprising the steps of converting a usage condition into a first normal form representation, minimizing said first normal form representation, transforming said minimized first normal form representation into a second normal form representation, and storing said second normal form representation in said data set. The method wherein the steps comprising said storing step are repeated until each said usage condition is stored into said data set and appropriate means and computer-readable instructions.
    Type: Grant
    Filed: July 7, 2006
    Date of Patent: September 6, 2011
    Assignee: Siemens Product Lifecycle Management Software Inc.
    Inventors: Thomas F. Moeller, Nigel Booth, Gregory Leland Coleman
  • Patent number: 7996444
    Abstract: A system and method for XML query processing is provided that includes an execution compiler for transforming an XML query into an executable XML query plan. A query rewrite processor performs query transformations on the XML query, the query transformations including transforming an XPath within said XML query into a pre-filter. The XML query is then transformed into a transformed XML query which includes the pre-filter.
    Type: Grant
    Filed: February 18, 2008
    Date of Patent: August 9, 2011
    Assignee: International Business Machines Corporation
    Inventor: Normen Seemann
  • Patent number: 7984070
    Abstract: Embodiments described herein provide numerous applications and implementations of a social network to facilitate individuals to resolve various life issues. These issues may include issues that arise when individuals or families relocate, including logistic problems, assimilation of family members in a community, and roommate pairings. As will be described, embodiments described herein greatly facilitate corporations in relocating their employees logistically, and also assist employees and their families with life issues that may determine whether the employees' relocation will be a success.
    Type: Grant
    Filed: June 22, 2009
    Date of Patent: July 19, 2011
    Inventor: Emily J. White
  • Publication number: 20110145200
    Abstract: Techniques for precedence based storage are presented. Storage for a database is organized into storage pools; collections of pools form storage classes. The storage pools within a particular class are organized in a precedence-based order so that when storage for the database is needed, the storage pools are used in the defined order of precedence. Additionally, each storage pool or storage class can be circumscribed by security limitations, quality of service limitations, and/or backup procedures.
    Type: Application
    Filed: December 16, 2009
    Publication date: June 16, 2011
    Applicant: Teradata US, Inc.
    Inventor: Gregory Howard Milby
  • Patent number: 7954053
    Abstract: An extraction-rule generation and training system uses information obtained from multiple markup language documents (e.g. web pages) of similar structure to generate an extraction rule for extracting datapoints from markup language documents. Where the structures of two or more documents are not sufficiently similar, the system maintains separate extraction rules for the same datapoint, and applies these separate extraction rules in combination to particular markup language documents to extract the datapoint.
    Type: Grant
    Filed: January 6, 2010
    Date of Patent: May 31, 2011
    Assignee: Alexa Internet
    Inventors: Greger J. Orelind, August A. Jaenicke
  • Patent number: 7945554
    Abstract: Methods and systems of providing a job search to a jobseeker are disclosed. Based on previously stored user preferences, job listings can be presented to users. User preferences can be gathered through previous search requests, resume keywords, jobseeker applications to job listings, jobseeker views of job listings, etc. The search request can include search criteria. As such, preference data related to the jobseeker is identified based on jobseeker online behavior. In one embodiment, a set of job listings having associated metadata that match the search criteria is identified. A subset of job listings that match the preference data is identified. The subset of job listings is a subset of the set of job listings. At least the subset of job listings can be provided to the jobseeker. In another embodiment, a set of job listings having associated metadata that match the search criteria and the jobseeker preferences is identified and provided to the jobseeker.
    Type: Grant
    Filed: December 11, 2006
    Date of Patent: May 17, 2011
    Assignee: Yahoo! Inc.
    Inventors: Harshal D. Dedhia, Adam Hyder, Geoffrey Vincent Perez
  • Patent number: 7933935
    Abstract: A method is provided to efficiently evaluate an expression to determine the partition key for an XML document stored in a database without the entire XML document first being stored in temporary memory storage. The partition key is determined using streaming evaluation or incrementally using a DOM node tree as a portion of the document is read and stored in the buffer. The XML document is stored in the partition using the read portion of the document stored in the buffer and the remaining portion from the original source.
    Type: Grant
    Filed: March 8, 2007
    Date of Patent: April 26, 2011
    Assignee: Oracle International Corporation
    Inventors: Sam Idicula, Sivasankaran Chandrasekar, Nipun Agarwal
  • Patent number: 7933939
    Abstract: A method and apparatus for increasing the speed at which a block of data can be partitioned into variable-length subblocks is provided. The method combines a relatively high-speed partitioning algorithm (that can only partition a block into relatively small mean-length subblocks) with a relatively low-speed algorithm (that can partition a block into subblocks of any mean length) to yield a relatively high-speed partitioning algorithm that can partition blocks into subblocks of any mean length.
    Type: Grant
    Filed: April 16, 2008
    Date of Patent: April 26, 2011
    Assignee: Quantum Corporation
    Inventor: Ross N. Williams
  • Patent number: 7930318
    Abstract: The present invention discloses methods and systems for hosting tenants in a computer-based environment in which a provider stores a shared data structure. Each of the tenants may store shared-metadata referencing the shared data structure, while a first tenant may store a tenant-specific data structure specific to the first tenant for access by the first tenant. Based on the shared-metadata and in response to a data request from the first tenant, the system may query the provider or the first tenant for requested data and provide the requested data based on the querying.
    Type: Grant
    Filed: September 2, 2009
    Date of Patent: April 19, 2011
    Assignee: SAP AG
    Inventor: Wolfgang A. Becker
  • Patent number: 7930322
    Abstract: Various technologies and techniques are disclosed for text based schema discovery and information extraction. Documents are analyzed to identify sections of the documents and a relationship between the sections. Statistics are stored regarding occurrences of items in the documents. A probabilistic model is generated based on the stored statistics. A database schema is generated with a plurality of tables based upon the probabilistic model. The documents are analyzed against the probabilistic model to determine how the documents map to the tables generated from the database schema. The tables are populated from the documents based on a result of the analysis against the probabilistic model.
    Type: Grant
    Filed: May 27, 2008
    Date of Patent: April 19, 2011
    Assignee: Microsoft Corporation
    Inventor: C. James MacLennan
  • Patent number: 7917547
    Abstract: The present invention extends to methods, systems, and computer program products for virtualizing objects within queries. Embodiments of the invention virtualize data access for use with queries. Virtualization can be implemented within any portion of a syntax tree. For example, data can be virtualized for a property of an object that is itself another object. Data virtualization facilitates lazy evaluation of query expressions. That is, actual property values for properties within a data construction statement are virtualized until a query specifically requests the actual property values. Further, data virtualization also conserves resources and results in more efficient query evaluations.
    Type: Grant
    Filed: June 10, 2008
    Date of Patent: March 29, 2011
    Assignee: Microsoft Corporation
    Inventors: Gregory L. Hughes, Clemens Kerer, Brad M. Olenick
  • Patent number: 7912867
    Abstract: The present invention is generally directed to systems and methods for gathering information about nonnative data, comparing nonnative data elements to information defining nonnative data, comparing native data elements to information defining native data, establishing transformation rules, and integrating the nonnative and native data.
    Type: Grant
    Filed: February 25, 2008
    Date of Patent: March 22, 2011
    Inventors: Russell Suereth, William Ennis, Gerard Clavens
  • Publication number: 20110066662
    Abstract: A system and method for extracting content from unstructured sources is disclosed. The method includes analyzing web pages of a website, storing text and image data for each web page of the website, extracting a plurality of entities from the web page data, scoring each entity of the plurality of entities to provide an overall score for each entity, and defining a product based on the plurality of entities and the overall score for each entity.
    Type: Application
    Filed: September 14, 2009
    Publication date: March 17, 2011
    Applicant: Adtuitive, Inc.
    Inventor: Jason Davis
  • Patent number: 7895246
    Abstract: An electronic collection bin is provided for assisting users in managing their personal information. The electronic collection bin provides a common location for collecting and organizing a user's information. The electronic collection bin may receive information items of varying data types and from disparate sources. After receiving an information item, the electronic collection bin analyzes the item to determine a suggested treatment, which may include conversion of the item to a new data type located at another location. A user may access the electronic collection bin, sort through the information items, and select placement of the information items. The user may view the suggested treatments of information items in the electronic collection bin and choose whether to accept the suggested treatments.
    Type: Grant
    Filed: May 31, 2007
    Date of Patent: February 22, 2011
    Assignee: Microsoft Corporation
    Inventors: Thomas Robert Bauman, Doreen Nelson Grieb, Robert Warren Piper
  • Patent number: 7890521
    Abstract: One embodiment of the present invention provides a system that automatically generates synonyms for words from documents. During operation, this system determines co-occurrence frequencies for pairs of words in the documents. The system also determines closeness scores for pairs of words in the documents, wherein a closeness score indicates whether a pair of words are located so close to each other that the words are likely to occur in the same sentence or phrase. Finally, the system determines whether pairs of words are synonyms based on the determined co-occurrence frequencies and the determined closeness scores. While making this determination, the system can additionally consider correlations between words in a title or an anchor of a document and words in the document as well as word-form scores for pairs of words in the documents.
    Type: Grant
    Filed: February 7, 2008
    Date of Patent: February 15, 2011
    Assignee: Google Inc.
    Inventors: Oleksandr Grushetskyy, Steven D. Baker
  • Patent number: 7882153
    Abstract: A method for electronic messaging of trade data involves receiving trade data having an unknown format in an electronic message, where trade data includes individual trade data elements, processing the electronic message to resolve the unknown format into a known format, parsing trade data using the known format to obtain individual trade data elements, and storing individual trade data elements in a trade data repository.
    Type: Grant
    Filed: February 28, 2007
    Date of Patent: February 1, 2011
    Assignee: Intuit Inc.
    Inventors: Scott D. Cook, Daniel Wernikoff
  • Patent number: 7882154
    Abstract: A computer program product comprising computer readable program configured to implement a method for providing processed data definition documents (DDDs) or processed document object models (DOMs) for object oriented programming. The use of these processed data definitions simplifies the data structures and streamlines programming to access the data. A standard DDD/DOM has a hierarchical branched structure having a number of levels each with elements/nodes and attributes. The DDD is written in a platform independent markup language. An element/node is selected and its attributes are identified. All ‘children’ of the selected element/node are identified. The attributes of the selected element/node (parent) are then copied to each child for all children in the DDD/DOM. This is repeated for all elements/nodes in the DDD/DOM to result in a processed DDD/processed DOM which is now structured to allow program access to data in a more direct manner.
    Type: Grant
    Filed: June 13, 2008
    Date of Patent: February 1, 2011
    Assignee: International Business Machines Corporation
    Inventor: Chad L. Meadows
  • Patent number: 7882128
    Abstract: Methods and apparatus, including computer program products, implementing and using techniques for pattern detection in input data containing several transactions, each transaction having at least one item. Filter conditions for interesting patterns are received, and a first set of filter conditions applicable in connection with generation of candidate patterns is determined. An evaluated candidate pattern is selected as a parent candidate pattern, and evaluation information about the parent candidate pattern is maintained. Child candidate patterns are generated by extending the parent candidate pattern and taking into account the first set of filter conditions. The child candidate patterns are evaluated with respect to the input data together in sets of similar candidate patterns and based on the evaluation information about the parent candidate pattern. At least one child candidate pattern successfully passing the evaluation step is recursively used as a parent candidate pattern.
    Type: Grant
    Filed: February 6, 2007
    Date of Patent: February 1, 2011
    Assignee: International Business Machines Corporation
    Inventors: Toni Bollinger, Ansgar Dorneich, Christoph Lingenfelder
  • Patent number: 7882155
    Abstract: A computer system including a processor and a memory unit containing instructions that when executed by the processor implement a method for providing processed data definition documents (DDDs) or processed document object models (DOMs) for object oriented programming. The use of these processed data definitions simplifies the data structures and streamlines programming to access the data. A standard DDD/DOM has a hierarchical branched structure having a number of levels each with elements/nodes and attributes. The DDD is written in a platform independent markup language. An element/node is selected and its attributes are identified. All ‘children’ of the selected element/node are identified. The attributes of the selected element/node (parent) are then copied to each child for all children in the DDD/DOM. This is repeated for all elements/nodes in the DDD/DOM, resulting in a processed DDD/processed DOM which is structured to allow program access to data in a more direct manner.
    Type: Grant
    Filed: June 13, 2008
    Date of Patent: February 1, 2011
    Assignee: International Business Machines Corporation
    Inventor: Chad L. Meadows
  • Patent number: 7877399
    Abstract: A method and system for comparing two documents, such as XML (Extensible Markup Language) files, where each document is capable of being parsed into a DOM (Document Object Model) tree. Each tree structure is converted into an array of leaf paths containing nodes. These arrays are then compared to identify corresponding matched nodes, either exactly matched nodes or schema matched nodes. In reporting the results of the comparison, unmatched nodes of the source document are reported as “deleted nodes”, that is, existing in the source but not in the target. Similarly, all unmatched nodes of the target document are reported as “added nodes”, that is, existing in the target but not in the source. In addition, schema matched nodes are reported as “modified nodes” between source and target documents.
    Type: Grant
    Filed: August 15, 2003
    Date of Patent: January 25, 2011
    Assignee: International Business Machines Corporation
    Inventor: Fuhwei Lwo
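    Illustration (not taken from the patent): the leaf-path comparison described in the abstract of 7877399 can be sketched roughly as follows. This is a minimal sketch under stated assumptions, not the patented method. The helper names `leaf_paths` and `diff_documents` are hypothetical, a leaf path is assumed to be the chain of tag names from the root to a childless element, an "exact match" is assumed to mean same path and same text, and a "schema match" is assumed to mean same path with different text.

```python
import xml.etree.ElementTree as ET
from collections import Counter


def leaf_paths(xml_text):
    """Return a list of (path, text) pairs, one per childless element."""
    root = ET.fromstring(xml_text)
    leaves = []

    def walk(node, prefix):
        path = prefix + "/" + node.tag
        children = list(node)
        if not children:
            leaves.append((path, (node.text or "").strip()))
        for child in children:
            walk(child, path)

    walk(root, "")
    return leaves


def diff_documents(source_xml, target_xml):
    """Classify leaves as deleted, added, or modified (hypothetical helper)."""
    source_leaves = leaf_paths(source_xml)
    remaining_target = Counter(leaf_paths(target_xml))

    # Pass 1: pair off exact matches (same path and same text).
    unmatched_source = []
    for leaf in source_leaves:
        if remaining_target[leaf] > 0:
            remaining_target[leaf] -= 1
        else:
            unmatched_source.append(leaf)
    unmatched_target = list(remaining_target.elements())

    # Pass 2: pair off assumed "schema matches" (same path, different text).
    modified, deleted = [], []
    for path, text in unmatched_source:
        match = next((t for t in unmatched_target if t[0] == path), None)
        if match is not None:
            unmatched_target.remove(match)
            modified.append((path, text, match[1]))
        else:
            deleted.append((path, text))

    # Whatever is left in the target has no counterpart in the source.
    return {"deleted": deleted, "added": unmatched_target, "modified": modified}


source = "<order><id>1</id><qty>2</qty><note>rush</note></order>"
target = "<order><id>1</id><qty>3</qty><gift>yes</gift></order>"
result = diff_documents(source, target)
```

    In this example the `qty` leaf would be reported as modified, `note` as deleted, and `gift` as added. Flattening each tree to leaf paths keeps the comparison a simple list operation, at the cost of ignoring attributes and sibling order, which a fuller implementation would need to handle.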