Data Indexing; Abstracting; Data Reduction (epo) Patents (Class 707/E17.002)
  • Publication number: 20120203787
    Abstract: An information management apparatus includes: a data receiving section, a collected data storage section, an aggregating section, a feature extracting section, a determining section, and an evaluation data storage section. The data receiving section periodically receives action data showing an action of a user. The collected data storage section stores the action data received by the data receiving section for each user. The aggregating section generates a data set for each user by aggregating, from the action data stored in the collected data storage section, action data with approximately the same content. The feature extracting section extracts an index and a reference showing privacy confidentiality of the data set as a feature to incorporate in the data set. The determining section determines whether or not the privacy confidentiality of the feature of the data set is equal to or higher than a predetermined level. The evaluation data storage section stores the data set which passed the determining section.
    Type: Application
    Filed: October 7, 2010
    Publication date: August 9, 2012
    Applicant: NEC CORPORATION
    Inventor: Shinya Miyakawa
  • Publication number: 20120203804
    Abstract: A method for incrementally unloading classes using a region-based garbage collector is described. In one embodiment, such a method includes maintaining a remembered set for a class set. The remembered set indicates whether instances of the class set are contained in one or more regions in memory, and in which regions the instances are contained. Upon performing an incremental garbage collection process for a subset of the regions in memory, the method examines the remembered set to determine whether the class set includes instances in regions outside of the subset. If the remembered set indicates that the class set includes instances outside of the subset of regions, the method identifies the class set as “live.” This will preclude unloading the class set from the subset of regions. A corresponding computer program product and apparatus are also disclosed herein.
    Type: Application
    Filed: March 28, 2012
    Publication date: August 9, 2012
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Peter W. Burka, Jeffrey M. Disher, Daryl J. Maier, Aleksandar Micic, Ryan A. Sciampacone
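
The remembered-set check described in this entry can be sketched as follows. This is a minimal illustration under stated assumptions, not the applicant's implementation: the names `RememberedSet`, `record_instance`, `is_live_outside`, and `candidates_for_unloading`, and the use of integer region IDs, are hypothetical.

```python
from collections import defaultdict


class RememberedSet:
    """Tracks which memory regions hold instances of each class set (hypothetical sketch)."""

    def __init__(self):
        # class-set name -> set of region IDs that contain instances
        self._regions = defaultdict(set)

    def record_instance(self, class_set, region):
        self._regions[class_set].add(region)

    def is_live_outside(self, class_set, collected_regions):
        """True if the class set has instances in regions outside the collected subset."""
        return bool(self._regions.get(class_set, set()) - set(collected_regions))


def candidates_for_unloading(remembered, class_sets, collected_regions):
    """Class sets that are not kept alive by instances outside the collected regions."""
    return [c for c in class_sets if not remembered.is_live_outside(c, collected_regions)]
```

A real collector would still have to confirm that no instances survive inside the collected regions; the sketch covers only the remembered-set test that marks a class set as live.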
  • Publication number: 20120203741
    Abstract: Provided are techniques for selecting a first group of indexes to form a current generation of indexes, selecting indexes from the first group biased to indexes with higher fitness values from the current generation of indexes, forming sub-groups of indexes using the selected indexes, determining fitness values of each of the sub-groups based on the fitness value of each of the indexes, selecting a subset of the sub-groups, and placing the indexes in the selected sub-groups into a new generation of indexes.
    Type: Application
    Filed: April 17, 2012
    Publication date: August 9, 2012
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Gaurav Mehrotra, Abhinay R. Nagpal, Sandeep R. Patil, Rulesh F. Rebello
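
One generation of the selection scheme in this entry might look like the sketch below. It rests on several assumptions that the abstract does not state: roulette-wheel sampling for the fitness bias, fixed-size sub-groups, and a sub-group fitness equal to the sum of its distinct members' fitness values. The function name `next_generation` and its parameters are hypothetical.

```python
import random


def next_generation(fitness, group_size, keep, rng):
    """One generation step over candidate indexes.

    fitness: dict mapping index name -> positive fitness value.
    group_size: number of indexes per sub-group.
    keep: number of sub-groups carried into the new generation.
    rng: a random.Random instance, passed in so runs are reproducible.
    """
    names = list(fitness)
    weights = [fitness[n] for n in names]
    # Selection biased toward indexes with higher fitness (roulette-wheel sampling).
    selected = rng.choices(names, weights=weights, k=len(names))
    # Form sub-groups from the selected indexes.
    groups = [selected[i:i + group_size] for i in range(0, len(selected), group_size)]

    def group_fitness(group):
        # Assumption: a sub-group's fitness is the sum over its distinct members.
        return sum(fitness[n] for n in set(group))

    best = sorted(groups, key=group_fitness, reverse=True)[:keep]
    # Place the indexes of the chosen sub-groups into the new generation.
    return sorted({n for group in best for n in group})
```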
  • Publication number: 20120203803
    Abstract: A method for incrementally unloading classes using a region-based garbage collector is described. In one embodiment, such a method includes maintaining a remembered set for a class set. The remembered set indicates whether instances of the class set are contained in one or more regions in memory, and in which regions the instances are contained. Upon performing an incremental garbage collection process for a subset of the regions in memory, the method examines the remembered set to determine whether the class set includes instances in regions outside of the subset. If the remembered set indicates that the class set includes instances outside of the subset of regions, the method identifies the class set as “live.” This will preclude unloading the class set from the subset of regions. A corresponding computer program product and apparatus are also disclosed herein.
    Type: Application
    Filed: February 8, 2011
    Publication date: August 9, 2012
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Peter W. Burka, Jeffrey M. Disher, Daryl J. Maier, Aleksandar Micic, Ryan A. Sciampacone
  • Publication number: 20120197869
    Abstract: There is provided a computer-implemented method of executing a query plan against a database. An exemplary method comprises accessing a first subset of rows of a database table using a direct access method for an index. The query plan may comprise the direct access method. The exemplary method also comprises determining a processing cost of accessing the first subset of rows. The exemplary method further comprises modifying the direct access method for the index in response to determining that the processing cost exceeds a specified threshold. Additionally, the exemplary method comprises accessing a second subset of rows of the database table using the modified direct access method.
    Type: Application
    Filed: April 16, 2012
    Publication date: August 2, 2012
    Inventors: David W. Birdsall, Yung-Li L. Jow, Goetz Graefe
  • Publication number: 20120197852
    Abstract: In particular embodiments, a method includes accessing sensor data from sensor nodes in a sensor network and aggregating the sensor data for communication to an indexer in the sensor network. The aggregation of the sensor data includes deduplicating the sensor data; validating the sensor data; formatting the sensor data; generating metadata for the sensor data; and time-stamping the sensor data. The metadata identifies one or more pre-determined attributes of the sensor data. The method also includes communicating the aggregated sensor data to the indexer in the sensor network. The indexer is configured to index the aggregated sensor data according to a multi-dimensional array for querying of the aggregated sensor data along with other aggregated sensor data. One or more first ones of the dimensions of the multi-dimensional array include time and one or more second ones of the dimensions of the multi-dimensional array include one or more of the pre-determined sensor-data attributes.
    Type: Application
    Filed: January 28, 2011
    Publication date: August 2, 2012
    Applicant: CISCO TECHNOLOGY, INC.
    Inventors: Debojyoti Dutta, Mainak Sen, Manoj Kumar Pandey, Tarun Banka, Raja Suresh Krishna Balakrishnan
  • Publication number: 20120197853
    Abstract: A technique for eliminating duplicate data is provided. Upon receipt of a new data set, one or more anchor points are identified within the data set. A bit-by-bit data comparison is then performed of the region surrounding the anchor point in the received data set with the region surrounding an anchor point stored within a pattern database to identify forward/backward delta values. The duplicate data identified by the anchor point, forward and backward delta values is then replaced in the received data set with a storage indicator.
    Type: Application
    Filed: April 10, 2012
    Publication date: August 2, 2012
    Inventors: Ling Zheng, Roger Stager, Craig Johnston, Don Trimmer, Yuval Frandzel
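
The comparison around an anchor point can be illustrated with the sketch below. It compares bytes rather than bits to keep the example short, and the function names, the tuple used as a storage indicator, and the `pattern_id` parameter are assumptions made for illustration.

```python
def match_around_anchor(data, data_anchor, pattern, pattern_anchor):
    """Return (backward, forward) counts of matching bytes around the two anchors."""
    # Extend backward from the anchors while the bytes agree.
    backward = 0
    while (data_anchor - backward - 1 >= 0
           and pattern_anchor - backward - 1 >= 0
           and data[data_anchor - backward - 1] == pattern[pattern_anchor - backward - 1]):
        backward += 1
    # Extend forward, starting with the anchor byte itself.
    forward = 0
    while (data_anchor + forward < len(data)
           and pattern_anchor + forward < len(pattern)
           and data[data_anchor + forward] == pattern[pattern_anchor + forward]):
        forward += 1
    return backward, forward


def replace_duplicate(data, data_anchor, pattern, pattern_anchor, pattern_id):
    """Split data into (prefix, storage indicator, suffix) for the matched region."""
    backward, forward = match_around_anchor(data, data_anchor, pattern, pattern_anchor)
    start, end = data_anchor - backward, data_anchor + forward
    # Hypothetical indicator: which pattern, where the match starts in it, and its length.
    indicator = (pattern_id, pattern_anchor - backward, backward + forward)
    return data[:start], indicator, data[end:]
```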
  • Publication number: 20120197898
    Abstract: In particular embodiments, a method includes, from an indexer in a sensor network, accessing a set of sensor data that includes sensor data aggregated together from sensors in the sensor network, one or more time stamps for the sensor data, and metadata for the sensor data identifying one or more pre-determined attributes of the sensor data. The method includes, at the indexer, generating an index of the set of sensor data according to a multi-dimensional array configured for querying of the set of sensor data along with a plurality of other sets of sensor data. One or more first ones of the dimensions of the multi-dimensional array include time, and one or more second ones of the dimensions of the multi-dimensional array include one or more of the pre-determined sensor-data attributes. The method includes, from the indexer, communicating the index of the set of sensor data for use in responding to one or more queries of the set of sensor data along with a plurality of other sets of sensor data.
    Type: Application
    Filed: January 28, 2011
    Publication date: August 2, 2012
    Applicant: CISCO TECHNOLOGY, INC.
    Inventors: Manoj Kumar Pandey, Tarun Banka, Debojyoti Dutta, Mainak Sen, Raja Suresh Krishna Balakrishnan
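
A toy version of an index with one time dimension and several attribute dimensions is sketched below. The class name `SensorIndex`, the fixed-width time buckets, and the dictionary of cells are assumptions for illustration; the abstract does not say how the multi-dimensional array is laid out.

```python
from collections import defaultdict


class SensorIndex:
    """Toy multi-dimensional index: a time dimension plus named attribute dimensions."""

    def __init__(self, attributes, bucket_seconds=60):
        self.attributes = tuple(attributes)
        self.bucket_seconds = bucket_seconds
        # (time bucket, attribute values...) -> list of (timestamp, value)
        self._cells = defaultdict(list)

    def add(self, timestamp, metadata, value):
        bucket = int(timestamp // self.bucket_seconds)
        key = (bucket,) + tuple(metadata[a] for a in self.attributes)
        self._cells[key].append((timestamp, value))

    def query(self, start, end, **wanted):
        """Readings with start <= timestamp < end whose attributes equal the given values."""
        lo, hi = int(start // self.bucket_seconds), int(end // self.bucket_seconds)
        hits = []
        for key, readings in self._cells.items():
            attrs = dict(zip(self.attributes, key[1:]))
            if lo <= key[0] <= hi and all(attrs[k] == v for k, v in wanted.items()):
                hits.extend(r for r in readings if start <= r[0] < end)
        return sorted(hits)
```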
  • Publication number: 20120197899
    Abstract: A method and apparatus for recommending a short message recipient. The method includes parsing history short messages of a user to generate data associated with contacts, constructing a semantic association database by using the data, identifying a critical object in a new short message text of the user, analyzing an association between the critical object and the contacts by using the semantic association database, and recommending a short message recipient to the user according to a strength of association.
    Type: Application
    Filed: January 31, 2012
    Publication date: August 2, 2012
    Applicant: International Business Machines Corporation
    Inventors: Ying Li, Jing Luo, Zhong Su, Xiao Xun Zhang
  • Publication number: 20120191702
    Abstract: Adaptive index density in a database management system is provided, which includes receiving a number of partitions for an index for a database table, the index subject to creation. The adaptive index density also includes selecting a column from the database table, the column selected based upon an estimated frequency of execution of database queries for the column. The adaptive index density further includes calculating an estimated cost of executing each of the database queries for the column, and determining data to reside in each of the partitions of the index responsive to the estimated cost.
    Type: Application
    Filed: January 26, 2011
    Publication date: July 26, 2012
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: John G. Musial, Abhinay R. Nagpal, Sandeep R. Patil, Yan W. Stein
  • Publication number: 20120189204
    Abstract: Digital media content from files, streaming data, broadcast data, optical disks, or other storage devices can be linked to Internet information. Identifiers extracted from the media content can be used to direct Internet searches for more information related to the media content.
    Type: Application
    Filed: September 29, 2009
    Publication date: July 26, 2012
    Inventors: Brian D. Johnson, Michael J. Payne, David B. Andersen, Suri B. Medapati, Michael J. Espig, Cory J. Booth, Kevin J. Murphy, Sharad K. Garg, Barry O'Mahony
  • Publication number: 20120191675
    Abstract: The present invention relates to an apparatus and method for eliminating duplication of a file in a distributed storage system. The apparatus and method for eliminating duplication of a file in a distributed storage system according to the present invention calculates a hash value of each chunk for an active file; calculates a secondary hash value by adding the hash values calculated for respective chunks; examines duplication of the file using the hash value of each chunk and the secondary hash value; and eliminates a duplicated file depending on a result of the examination.
    Type: Application
    Filed: November 4, 2010
    Publication date: July 26, 2012
    Applicant: PSPACE INC.
    Inventors: Kyung-Soo Kim, Jae-Beom Cheon, Joo-Hyun Kim, Bong-sik Sihn, Bong-Joo Jin, Hyoung-Choul Kim, Young-Gyu Kim, Sun Choi, Gu-Yong Lee
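
The two-level hashing in this entry can be sketched as follows. SHA-256, the 4-byte chunk size, and the 256-bit wrap-around of the sum are illustrative choices, and the function names are hypothetical; the abstract only says that the secondary hash is obtained by adding the per-chunk hash values.

```python
import hashlib


def chunk_hashes(data, chunk_size=4):
    """Hash each fixed-size chunk of the file content."""
    return [int.from_bytes(hashlib.sha256(data[i:i + chunk_size]).digest(), "big")
            for i in range(0, len(data), chunk_size)]


def file_signature(data, chunk_size=4):
    hashes = chunk_hashes(data, chunk_size)
    # Secondary hash: the chunk hashes added together (kept to 256 bits here).
    secondary = sum(hashes) % (1 << 256)
    return secondary, tuple(hashes)


def find_duplicates(files):
    """Map each duplicate file name to the first file seen with the same hashes."""
    seen, duplicates = {}, {}
    for name, data in files.items():
        secondary, hashes = file_signature(data)
        # The secondary hash is a cheap first check; the chunk hashes confirm the match.
        match = seen.get(secondary)
        if match and match[1] == hashes:
            duplicates[name] = match[0]
        else:
            seen.setdefault(secondary, (name, hashes))
    return duplicates
```

Because addition ignores order, two files with the same chunks in a different order share a secondary hash, which is why the sketch also compares the per-chunk hashes before declaring a duplicate.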
  • Publication number: 20120191695
    Abstract: A local search engine geographically indexes information for searching by identifying a geocoded web page of a web site and identifying at least one geocodable web page of the web site. The system identifies a geocode contained within content of the geocoded web page of the web site. The geocode indicates a physical location of an entity associated with the web site. The system indexes content of the geocoded web page and content of the geocodable web page. The indexing includes associating the geocode contained within content of the geocoded web page to the indexed content of the geocoded web page and the geocodable web page to allow geographical searching of the content of the web pages.
    Type: Application
    Filed: April 3, 2012
    Publication date: July 26, 2012
    Applicant: Local.com Corporation
    Inventor: Xiongwu Xia
  • Publication number: 20120191721
    Abstract: Method and system for processing a request associated with a user from a requesting node to an answering node in a telecommunications network. A repository is associated with the answering node, the repository including a data structure including a plurality of user profiles associated with a plurality of users. In the answering node a user profile of the plurality of user profiles is associated with the user. The method comprises the steps of assigning a unique user index to each user profile in the data structure, wherein the user index is representative of the location of the user profile within the data structure, communicating at least one user index to the requesting node, incorporating the user index in the request by the requesting node, transmitting the request from the requesting node to the answering node, and retrieving the user profile associated with the user associated with the request by the answering node on the basis of the user index.
    Type: Application
    Filed: June 12, 2009
    Publication date: July 26, 2012
    Applicant: TELEFONAKTIEBOLAGET L M ERICSSON (PUBL)
    Inventors: Rogier August Caspar Joseph Noldus, Jos Den Hartog
  • Publication number: 20120191673
    Abstract: A method of coupling a user file name to a physical data file stored within a storage delivery network, includes: assigning a logical file identification value (LFID) to a data file stored in one or more storage nodes and storing the LFID in a computer readable memory; storing in the computer readable memory a node identification value (Node ID) indicative of where the data file is stored among a plurality of geographically distributed storage nodes and associating the Node ID with the LFID; and storing in the computer readable memory a file name for the data file created by a user and associating the file name with the LFID, wherein the LFID correlates the file name with the Node ID transparently to the user and allows the user to access the data file using just the file name.
    Type: Application
    Filed: March 30, 2012
    Publication date: July 26, 2012
    Applicant: Nirvanix, Inc.
    Inventors: Scott P. Chatley, Thanh T. Phan, Robert S. Palumbo, Troy C. Gatchell, J. Gabriel Gallagher
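
The indirection through an LFID can be sketched with two lookup tables. The class name `FileCatalog`, its methods, and the sequential integer LFIDs are assumptions for illustration only.

```python
import itertools


class FileCatalog:
    """Couples user file names to storage nodes through a logical file ID (LFID)."""

    def __init__(self):
        self._next_lfid = itertools.count(1)
        self._node_by_lfid = {}  # LFID -> Node ID
        self._lfid_by_name = {}  # user file name -> LFID

    def register(self, file_name, node_id):
        lfid = next(self._next_lfid)
        self._node_by_lfid[lfid] = node_id
        self._lfid_by_name[file_name] = lfid
        return lfid

    def relocate(self, file_name, new_node_id):
        # Moving the data changes only the LFID -> Node ID link; the name is untouched.
        self._node_by_lfid[self._lfid_by_name[file_name]] = new_node_id

    def locate(self, file_name):
        """Resolve a file name to its storage node using just the name."""
        return self._node_by_lfid[self._lfid_by_name[file_name]]
```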
  • Publication number: 20120191701
    Abstract: Database tables can have different types of database indices defined for the database tables and different numbers of database indices. The efficiency of reading the indexes can vary with the different profiles of the indexes, which impacts the costs of access plans that use the indexes. Weights can be predefined to reflect the relative efficiencies of the different characteristics. Costs can be computed in accordance with a variety of techniques (e.g., based on edge traversals). The weights can be predefined to reduce costs, increase costs, or a combination thereof. A database management application or associated application or program can also refine or revise these weights based on statistical data gathered about the operation of the database and/or heuristics that are developed based on observations/research. The corresponding weights can be adjusted accordingly.
    Type: Application
    Filed: January 26, 2011
    Publication date: July 26, 2012
    Applicant: International Business Machines Corporation
    Inventors: Abhinay R. Nagpal, Sandeep R. Patil, Gopikrishnan Varadarajulu
  • Publication number: 20120191667
    Abstract: A method and system are disclosed for storage optimization. Data parts and metadata within a source data unit are identified and the data parts are compared with data which is already stored in the physical storage space. In case identical data parts are found within the physical storage, the data parts from the source data unit are linked to the identified data, while the data parts can be discarded, thereby reducing the required storage capacity. The metadata parts can be separately stored in a designated storage area.
    Type: Application
    Filed: January 20, 2011
    Publication date: July 26, 2012
    Applicant: INFINIDAT LTD.
    Inventors: Haim Kopylovitz, Julian Satran, Yechiel Yochai
  • Publication number: 20120191672
    Abstract: Mechanisms are provided for efficiently improving a dictionary used for data deduplication. Dictionaries are used to hold hash key and location pairs for deduplicated data. Strong hash keys prevent collisions but weak hash keys are more computation and storage efficient. Mechanisms are provided to use both a weak hash key and a strong hash key. Weak hash keys and corresponding location pairs are stored in an improved dictionary while strong hash keys are maintained with the deduplicated data itself. The need for having uniqueness from a strong hash function is balanced with the deduplication dictionary space savings from a weak hash function.
    Type: Application
    Filed: March 30, 2012
    Publication date: July 26, 2012
    Applicant: DELL Products L.P.
    Inventor: Vinod Jayaraman
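
The split between a compact weak-hash dictionary and strong hashes kept with the data might be sketched as below. CRC-32 as the weak hash, SHA-256 as the strong hash, and the per-key list of candidate locations are assumptions for illustration; `DedupStore` is a hypothetical name.

```python
import hashlib
import zlib


class DedupStore:
    def __init__(self):
        self.blocks = []      # location -> (strong hash, data); strong hash lives with the data
        self.dictionary = {}  # weak hash -> candidate locations (small keys keep this compact)

    def put(self, data):
        """Store data once and return its location."""
        weak = zlib.crc32(data)
        strong = hashlib.sha256(data).digest()
        for location in self.dictionary.get(weak, []):
            # A weak-hash hit is only a candidate; the strong hash stored with the block decides.
            if self.blocks[location][0] == strong:
                return location
        self.blocks.append((strong, data))
        location = len(self.blocks) - 1
        self.dictionary.setdefault(weak, []).append(location)
        return location
```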
  • Publication number: 20120185515
    Abstract: For integrating diverse databases, a server and universal index are provided to support a lexicon of variable definitions and formatting information. Subscribing databases establish equivalences between local variables and variables in the universal index, either directly or with translation such as a format conversion. For managing qualifying, preliminary processes can analyze database schema and stored variable values to assess likely matches between variables and universal definitions in the lexicon, presented tentatively to the local operator for approval or rejection. Matches can become approved for use in interaction with other subscribing databases. Processes enable the universal lexicon to be revised, e.g., expanded when a variable does not appear to match an existing definition. The universal index server can function as a data intermediary, or as a source of index definitions. Databases can indicate their compliance with the index during transmission of variable data referenced to index definitions.
    Type: Application
    Filed: March 28, 2012
    Publication date: July 19, 2012
    Applicant: Database Logic Inc.
    Inventors: Mark Warne Ferrel, Eric Kenneth Barnum
  • Publication number: 20120185446
    Abstract: In one example embodiment, a method is illustrated as including retrieving item data from a plurality of listings, the item data filtered from noise data, constructing at least one base cluster having at least one document with common item data stored in a suffix ordering, compacting the at least one base cluster to create a compacted cluster representation having a reduced duplicate suffix ordering amongst the clusters, and merging the compacted cluster representation to generate a merged cluster, the merging based upon a first overlap value applied to the at least one document with common item data.
    Type: Application
    Filed: February 3, 2012
    Publication date: July 19, 2012
    Inventors: Neelakantan Sundaresan, Kavita Ganesan, Roopnath Grandhi
  • Publication number: 20120185487
    Abstract: A method for establishing content indexes includes: determining the size of a content space; determining a content address space according to the size of the content space; establishing a mapping relationship from the content space to the content address space and obtaining a content address; and monitoring, by a content indexing node, the corresponding content address and accepting content publication or content acquisition requests for the content mapping space.
    Type: Application
    Filed: March 28, 2012
    Publication date: July 19, 2012
    Applicant: Huawei Technologies Co., Ltd.
    Inventor: Liang Liang
  • Publication number: 20120179684
    Abstract: A computer program product for an indexer-agnostic index building system includes a computer readable storage medium to store a computer readable program, wherein the computer readable program, when executed on a computer, causes the computer to perform operations for creating a semantically aggregated index. The operations include: extracting documents from a data source, wherein each document includes a data object; distributing the documents to a plurality of processing nodes within the system; for each node: indexing the data objects for each document into fields using semantic rules; and grouping indexed data objects for related fields by: classifying the documents into logical groups based on the semantic rules; and creating a searchable index shard for related logical groups.
    Type: Application
    Filed: January 12, 2011
    Publication date: July 12, 2012
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Alfredo Alba, Chad E. DeLuca, Vuk Ercegovac, Thomas D. Griffin, Jun Rao, Asim V. Singh, Kevin B. Wang
  • Publication number: 20120179690
    Abstract: A high-density, distance-measuring laser system and an associated computer that processes the data collected by the laser system. The computer determines a data partition structure and stores that structure as a header file for the scan before data is collected. As the scan progresses, the computer collects data points until a predetermined threshold is met, at which point a block of data consisting of the data points up to the threshold is written to disk. The computer indexes each data block using all three coordinates of its constituent data points using, preferably, a flexible index, such as an R-tree. When a data block is completely filled, it is written to disk preferably with its index and, as a result, each data block is ready for access and manipulation virtually immediately after having been collected. Also, each data block can be independently manipulated and read from disk.
    Type: Application
    Filed: March 20, 2012
    Publication date: July 12, 2012
    Applicant: LEICA GEOSYSTEMS AG
    Inventors: Mark Damon Wheeler, Barry Joel Schwarz, Richard William Bukowski, Minghua Wu
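
The threshold-driven block writing in this entry can be sketched as below. The sketch swaps the R-tree for a single bounding box per block and keeps blocks in a list instead of writing them to disk, so it shows only the flow; `BlockWriter` and its methods are hypothetical names.

```python
class BlockWriter:
    def __init__(self, threshold):
        self.threshold = threshold
        self._pending = []
        self.blocks = []  # each entry: (bounding box, points), standing in for a block on disk

    def add_point(self, point):
        self._pending.append(point)
        # Once the threshold is reached, the block is indexed and written out.
        if len(self._pending) >= self.threshold:
            self.flush()

    def flush(self):
        if not self._pending:
            return
        xs, ys, zs = zip(*self._pending)
        bbox = ((min(xs), min(ys), min(zs)), (max(xs), max(ys), max(zs)))
        self.blocks.append((bbox, self._pending))
        self._pending = []

    def query(self, lo, hi):
        """Points inside the axis-aligned box [lo, hi], skipping blocks that cannot overlap."""
        hits = []
        for (bmin, bmax), points in self.blocks:
            if all(bmin[i] <= hi[i] and bmax[i] >= lo[i] for i in range(3)):
                hits.extend(p for p in points if all(lo[i] <= p[i] <= hi[i] for i in range(3)))
        return hits
```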
  • Publication number: 20120179687
    Abstract: A system and method to generate and maintain a controlled growth DAG are described. The controlled growth DAG conveys information about objects captured by a capture system.
    Type: Application
    Filed: March 19, 2012
    Publication date: July 12, 2012
    Inventor: Weimin Liu
  • Publication number: 20120179688
    Abstract: A method, apparatus, article of manufacture, and a memory structure for brokering information between a plurality of clients using identifiers defining a plurality of data constructs is disclosed. An exemplary method comprises accepting a new data construct from an authoring entity, assigning a globally unique identifier to the new data construct, storing the new data construct and the assigned globally unique identifier in a database, and brokering between the authoring entity and a second entity commercially distinct from the authoring entity to provide the second entity access to the new data construct by reference to the assigned globally unique identifier of the new data construct or to provide the authoring entity access to an at least one of a plurality of pre-existing data constructs for use with the new data construct by reference to a globally unique identifier of the existing data construct.
    Type: Application
    Filed: March 22, 2012
    Publication date: July 12, 2012
    Applicant: Herbert Stettin as Chapter II Trustee for Rothstein Rosenfeldt Adler, P.A.
    Inventor: Baron R.K. Von Wolfsheild
  • Publication number: 20120179668
    Abstract: A search index structure which extends a typical composite index by incorporating an index which is optimized for fast retrieval from storage and which eliminates data which is specific to phrase searching. Other data is represented in a manner which allows it to be calculated rather than stored. Associating variable length entries with logical categories allows their length to be inferred from the category rather than stored. Using delta values between document IDs rather than the ID itself generates a compact, dense symbol set which is efficiently compressed by Huffman encoding or a similar compression method. Using an upper threshold to remove large, and thus rare, delta values from the symbol set prior to encoding further improves the encoding performance.
    Type: Application
    Filed: March 19, 2012
    Publication date: July 12, 2012
    Applicant: Microsoft Corporation
    Inventors: Chadd Creighton Merrigan, Mihai Petriuc, Raif Khassanov, Artsiom Ivanovic Kokhan
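
The delta and threshold steps in this entry can be sketched as follows. The sketch assumes positive, strictly increasing document IDs, uses `0` as a hypothetical escape symbol for deltas above the threshold, and leaves out the Huffman stage that would then compress the symbol list.

```python
def encode_doc_ids(doc_ids, threshold):
    """Delta-encode sorted document IDs; deltas above the threshold go to a side list."""
    symbols, exceptions = [], []
    previous = 0
    for doc_id in doc_ids:
        delta = doc_id - previous
        if delta > threshold:
            symbols.append(0)  # escape symbol; real deltas are >= 1 under the stated assumption
            exceptions.append(delta)
        else:
            symbols.append(delta)
        previous = doc_id
    return symbols, exceptions


def decode_doc_ids(symbols, exceptions):
    """Rebuild the document IDs from the symbols and the side list of large deltas."""
    doc_ids, previous = [], 0
    rare = iter(exceptions)
    for symbol in symbols:
        previous += next(rare) if symbol == 0 else symbol
        doc_ids.append(previous)
    return doc_ids
```

Keeping the rare large deltas out of the symbol list leaves a small, dense alphabet, which is the property the abstract credits for better entropy coding.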
  • Publication number: 20120173508
    Abstract: One deficiency of existing search engines is that they do not evaluate the trustworthiness of comments before the searched comments are returned to end users. In addition, existing search engines overlook the analysis and aggregation of comments whose subjects are semantically, hierarchically related. Furthermore, as the use of non-textual comments has become popular, it is highly desirable that search engines that find and provide comments be able to analyze, evaluate and aggregate both textual and non-textual comments, in other words heterogeneous comments. The purpose of the invention is to overcome these deficiencies of the existing search engines that find and provide comments.
    Type: Application
    Filed: October 11, 2011
    Publication date: July 5, 2012
    Inventor: Cheng Zhou
  • Publication number: 20120173535
    Abstract: Techniques are provided for allowing external access by other users to private information that is maintained on local storage of a computer and owned by an information owner. The private information is uploaded from the local storage to an externally accessible information source that is accessible by the other users. A request from a user to access the private information is received by the owner, who determines whether to allow access to the private information. If so, the owner sends a private information sharing authorization to a collaboration orchestrator, which retrieves the private information from the external source and provides the private information to the user. The owner optionally requests to collaborate with the user before deciding whether to allow access to the private information. One or both of the identities of the owner and user can remain anonymous until agreeing on revealing identities. A system and program product are also provided.
    Type: Application
    Filed: January 5, 2011
    Publication date: July 5, 2012
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Arun Ramakrishnan, Rohit Shetty
  • Publication number: 20120173504
    Abstract: The present invention relies on the two-dimensional information in documents and encodes two-dimensional structures into a one-dimensional synthetic language such that two-dimensional documents can be searched at text search speed. The system comprises: an indexing module, a retrieval module, an encoder, a quantization module, a retrieval engine and a control module coupled by a bus. Electronic documents are first indexed by the indexing module and stored as a synthetic text library. The retrieval module then converts an input image to synthetic text and searches for matches to the synthetic text in the synthetic text library. The matches can be in turn used to retrieve the corresponding electronic documents. In one or more embodiments, the present invention includes a method for comparing the synthetic text to documents that have been converted to synthetic text for a match.
    Type: Application
    Filed: March 8, 2012
    Publication date: July 5, 2012
    Inventor: Jorge Moraleda
  • Publication number: 20120173537
    Abstract: Systems and methods for retrieving household data based on an origination identifier. In an embodiment, an origination identifier of a communication is captured. The origination identifier is indexed into a master table comprising a plurality of records. Each of the records comprises an association between an origination identifier and a universal database linkage key, and each universal database linkage key comprises an index into one or more databases. A universal database linkage key associated with the captured origination identifier is retrieved and indexed into one or more databases. Household data associated with the captured origination identifier is retrieved from the one or more databases and communicated to at least one recipient.
    Type: Application
    Filed: December 23, 2011
    Publication date: July 5, 2012
    Applicant: TARGUS INFORMATION CORPORATION
    Inventors: James D. Shaffer, George G. Moore
  • Publication number: 20120173539
    Abstract: Methods and systems for managing an index database. In one exemplary method, an index database is stored on a machine readable volume with an operating system and the files which have been indexed, and then the volume is, after the storing, made available for distribution to licensees or customers. In this manner, the volume will include a previously created index database, allowing a user to begin use of the index database without having to perform an indexing operation.
    Type: Application
    Filed: March 13, 2012
    Publication date: July 5, 2012
    Inventors: Andrew Carol, Yan Arrouye, Dominic Giampaolo
  • Publication number: 20120173536
    Abstract: A method to index recorded content at a media device includes extracting, at a remote service provider, event index data from an event being recorded at a media device and associating the event index data with locator code data of the event. The method further includes storing, at the remote service provider, the extracted event index data and the associated locator code data; searching the extracted event index data for a plurality of segments associated with the event, the search being associated with a search request; determining index display data for a presentation of the plurality of segments based on the search request; and transmitting, to the media device, the locator code data associated with the plurality of segments, and the index display data.
    Type: Application
    Filed: December 1, 2011
    Publication date: July 5, 2012
    Applicant: AT&T Intellectual Property I, LP
    Inventors: Behzad Shahraray, David Gibbon, Lee Begeja, Zhu Liu, Richard V. Cox, Bernard S. Renger
  • Publication number: 20120166384
    Abstract: According to some embodiments, a system, method, means, and/or computer program code are provided to facilitate a display of information on a client device. For example, a server may retrieve first enterprise data from an enterprise database and store the first enterprise data into a first client based cache at the server, the first client based cache being associated with a first user. Similarly, the server may retrieve second enterprise data from the enterprise database and store the second enterprise data into a second client based cache at the server, the second client based cache being associated with a second user. Subsequent to the storing of the first enterprise data, the server may receive a display request from a first client device associated with the first user and transmit the first enterprise data to the first client device.
    Type: Application
    Filed: December 22, 2010
    Publication date: June 28, 2012
    Inventors: Karl-Peter Nos, Andreas Riehl, Belenki Michael
  • Publication number: 20120166419
    Abstract: A system, a program product and an associated method are provided for data processing management in a computing environment having at least a processor. The method comprises creating in the memory an invalidation index having a plurality of rows, each row further comprising a search key field, an ID list field for IDs of records associated with the database, and a count value field. Every time a new reference query is received, the processor searches for a row in said invalidation index with an already created search key; when a match is found, it decreases the count value of a counter, and when a match is not found, it creates a new search key and a new row in an associated invalidation index for said new key.
    Type: Application
    Filed: September 30, 2011
    Publication date: June 28, 2012
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Miki Enoki, Yohsuke Ozawa, Hiroshi Horii
  • Publication number: 20120166444
    Abstract: A high level programming language provides a co-map communication operator that maps an input indexable type to an output indexable type according to a function. The function maps an index space corresponding to the output indexable type to an index space corresponding to the input indexable type. By doing so, the co-map communication operator lifts a function on an index space to a function on an indexable type to allow composability with other communication operators.
    Type: Application
    Filed: December 23, 2010
    Publication date: June 28, 2012
    Applicant: MICROSOFT CORPORATION
    Inventors: Paul F. Ringseth, Yosseff Levanoni, Lingli Zhang, Weirong Zhu, Donald J. McCrady
  • Publication number: 20120166404
    Abstract: Systems, methods, and other embodiments associated with real-time text indexing are described. One example method includes receiving a document for indexing in a search system that includes a mature index and indexing the received document in a staging index. The staging index may be stored in direct access memory associated with query processing that does not degrade query performance even when postings become fragmented. The staging index and the mature text index are accessed to process queries on the search system. The example method may also include periodically merging the staging index into the mature index based on query feedback.
    Type: Application
    Filed: December 28, 2010
    Publication date: June 28, 2012
    Applicant: ORACLE INTERNATIONAL CORPORATION
    Inventors: Ravi PALAKODETY, Wesley LIN, Mohammad FAISAL, Garret F. SWART
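The staging/mature arrangement described in this abstract can be illustrated with a minimal Python sketch. The class and method names (`TwoTierIndex`, `add_document`, `query`, `merge`) are hypothetical and not taken from the application, and the merge here is triggered manually rather than by query feedback.

```python
from collections import defaultdict


class TwoTierIndex:
    """Toy inverted index with an in-memory staging tier and a mature tier."""

    def __init__(self):
        self.mature = defaultdict(set)   # term -> doc ids; stands in for the mature index
        self.staging = defaultdict(set)  # term -> doc ids for recently received documents

    def add_document(self, doc_id, text):
        # New documents go to the staging tier only, so they are searchable at once.
        for term in text.lower().split():
            self.staging[term].add(doc_id)

    def query(self, term):
        # Queries consult both tiers and combine the postings.
        term = term.lower()
        return self.mature.get(term, set()) | self.staging.get(term, set())

    def merge(self):
        # Fold the staged postings into the mature tier, then empty the staging tier.
        for term, doc_ids in self.staging.items():
            self.mature[term] |= doc_ids
        self.staging.clear()
```

A caller would invoke `merge()` periodically; the abstract ties that decision to query feedback, which this sketch does not model.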
  • Publication number: 20120166448
    Abstract: The subject disclosure is directed towards a data deduplication technology in which a hash index service's index and/or indexing operations are adaptable to balance deduplication performance savings, throughput and resource consumption. The indexing service may employ hierarchical chunking using different levels of granularity corresponding to chunk size, a sampled compact index table that contains compact signatures for less than all of the hash index's (or subspace's) hash values, and/or selective subspace indexing based on similarity of a subspace's data to another subspace's data and/or to incoming data chunks.
    Type: Application
    Filed: December 28, 2010
    Publication date: June 28, 2012
    Applicant: MICROSOFT CORPORATION
    Inventors: Jin Li, Sudipta Sengupta
  • Publication number: 20120166447
    Abstract: A data set may be distributed over many data stores, and a query may be distributively evaluated by several data stores with the results combined to form a query result (e.g., utilizing a MapReduce framework). However, such architectures may violate security principles by performing sophisticated processing, including the execution of arbitrary code, on the same machines that store the data. Instead of processing queries, a data store may be configured only to receive requests specifying one or more filtering criteria, and to provide the data items satisfying the filtering criteria. A compute node may apply a query by generating a request including one or more filter criteria, providing the request to a data node, and applying the remainder of the query (including sophisticated processing, and potentially the execution of arbitrary code) to the data items provided by the data node, thereby improving the security and efficiency of query processing.
    Type: Application
    Filed: December 28, 2010
    Publication date: June 28, 2012
    Applicant: Microsoft Corporation
    Inventors: Nir Nice, Daniel Sitton, Dror Kremer, Michael Feldman
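The split between a filter-only data node and a compute node might look roughly like the following Python sketch. The names `DataNode`, `fetch`, `run_query`, and the `ALLOWED_OPS` whitelist are assumptions made for illustration; the application does not specify a request format.

```python
import operator

# Hypothetical whitelist: the data node accepts only these comparison operators.
ALLOWED_OPS = {"eq": operator.eq, "lt": operator.lt, "gt": operator.gt}


class DataNode:
    """Stores items and answers filter requests; it never runs caller-supplied code."""

    def __init__(self, items):
        self._items = list(items)

    def fetch(self, criteria):
        # criteria: list of (field, op_name, value); an unknown op_name raises KeyError.
        checks = [(field, ALLOWED_OPS[op], value) for field, op, value in criteria]
        return [
            item for item in self._items
            if all(field in item and test(item[field], value)
                   for field, test, value in checks)
        ]


def run_query(node, criteria, process):
    """Compute-node side: request filtered items, then apply the rest of the query locally."""
    return process(node.fetch(criteria))
```

In this arrangement the arbitrary `process` function executes only on the compute side, while the data node evaluates nothing beyond the whitelisted comparisons.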
  • Publication number: 20120166445
    Abstract: A method and apparatus are provided for better web ad matching by combining relevance with consumer click feedback. In one example, the method includes receiving a query page, extracting features from the query page, re-weighting the query page, evaluating the query page in light of each ad in order to score each ad and pick substantially best ad matches of the indexed ads, and returning the substantially best ad matches to the consumer computer.
    Type: Application
    Filed: March 7, 2012
    Publication date: June 28, 2012
    Inventors: Deepayan Chakrabarti, Deepak K. Agrawal, Vanja Josifovski
  • Publication number: 20120158801
    Abstract: A Java object is scan-missed during the mark phase of a garbage collection cycle. A list of any unscanned objects, comprising all objects of a particular object type, is created during a sweep phase of the garbage collection cycle. After the garbage collection cycle is completed, and the application resumes, for every PUTFIELD/GETFIELD operation on the object type that is part of a specific parent object, a comparison is made with the relevant information in the unscanned objects list. A scan-miss is identified by determining whether the current object being referenced by the application is a part of the unscanned object list that has been created during the sweep phase of the garbage collection cycle.
    Type: Application
    Filed: December 15, 2010
    Publication date: June 21, 2012
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: AMAR DEVEGOWDA, CHARLES R. GRACIE, VENKATARAGHAVAN LAKSHMINARAYANACHAR
  • Publication number: 20120158676
    Abstract: Objects stored in a zip archive may be extracted in random-access fashion (without involving other objects stored in the zip archive) using the addresses of the objects stored in the central directory of the zip archive. However, zip archives often provide insufficient information to enable random access to the data within an object. This capability may be provided by segmenting the object into sections of a section size, and including in the zip archive a block table specifying, for respective sections, the block size of the corresponding block. A zip archive extractor may achieve random access to the object by using the block table to compute the addresses of blocks comprising the selected portion and extracting only those blocks. Backwards compatibility of the zip archive with other zip archive extractors may be preserved by including the block table within a zip extension of the central directory of the zip archive.
    Type: Application
    Filed: December 17, 2010
    Publication date: June 21, 2012
    Applicant: Microsoft Corporation
    Inventor: Thomas Alan Bouldin
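The block-table idea can be sketched in Python with `zlib`. This is not the zip file format and not the applicant's implementation: it assumes each fixed-size section is compressed independently, and the names `pack`, `read_range`, and `SECTION_SIZE` are hypothetical.

```python
import zlib
from itertools import accumulate

SECTION_SIZE = 16  # hypothetical section size in bytes, kept tiny for illustration


def pack(data, section_size=SECTION_SIZE):
    """Compress each section independently; return (blob, block_table)."""
    blocks = [zlib.compress(data[i:i + section_size])
              for i in range(0, len(data), section_size)]
    block_table = [len(b) for b in blocks]  # compressed size of each block
    return b"".join(blocks), block_table


def read_range(blob, block_table, start, length, section_size=SECTION_SIZE):
    """Return data[start:start + length], decompressing only the overlapping blocks."""
    if length <= 0:
        return b""
    # Prefix sums of the block sizes give each block's address inside the blob.
    offsets = [0] + list(accumulate(block_table))
    first = start // section_size
    last = min((start + length - 1) // section_size, len(block_table) - 1)
    out = b"".join(zlib.decompress(blob[offsets[i]:offsets[i + 1]])
                   for i in range(first, last + 1))
    skip = start - first * section_size
    return out[skip:skip + length]
```

Because the table stores sizes rather than absolute addresses, a reader computes addresses by summation, which matches the abstract's description of computing block addresses from the block table.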
  • Publication number: 20120158714
    Abstract: A system may include determination of a plurality of data structures associated with an entity, each of the plurality of data structures associated with a respective validity period, determination of a plurality of non-overlapping time periods based on the validity periods, the plurality of non-overlapping time periods collectively spanning all of the validity periods, determination, for each of the plurality of non-overlapping time periods, of a composite data structure based on each of the data structures associated with a validity period including the non-overlapping time period, assignment of a respective document identifier to each composite data structure, each document identifier indicating the entity, and indexing of the composite data structures within an index.
    Type: Application
    Filed: December 16, 2010
    Publication date: June 21, 2012
    Inventor: Bruno Dumant
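One way to picture the time-slicing step is the Python sketch below. It assumes half-open validity periods `[start, end)`, dictionaries as the data structures, and a made-up document identifier scheme (`entity:start-end`); none of these details come from the application.

```python
def composite_periods(records):
    """records: list of (start, end, fields) with half-open validity [start, end).

    Returns (start, end, merged_fields) tuples for non-overlapping periods that
    together span all of the validity periods.
    """
    # Every validity boundary is a potential cut point.
    cuts = sorted({t for start, end, _ in records for t in (start, end)})
    result = []
    for lo, hi in zip(cuts, cuts[1:]):
        # Keep the records whose validity period includes the whole slice [lo, hi).
        covering = [fields for start, end, fields in records
                    if start <= lo and hi <= end]
        if covering:  # skip gaps in which no record is valid
            merged = {}
            for fields in covering:
                merged.update(fields)
            result.append((lo, hi, merged))
    return result


def index_entity(entity, records):
    """Assign each composite a document identifier that indicates the entity."""
    return {f"{entity}:{lo}-{hi}": merged
            for lo, hi, merged in composite_periods(records)}
```

For two records valid over `[1, 5)` and `[3, 8)`, the sketch yields three slices: `[1, 3)`, `[3, 5)` (combining both records), and `[5, 8)`.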
  • Publication number: 20120158677
    Abstract: This can relate to streaming compressed files via a non-volatile memory (“NVM”) of a media player. In particular, the NVM can stream compressed media files. The NVM can include an NVM controller and an NVM die storing the compressed media file. The NVM controller can read the compressed media file from the NVM die, decompress the media file, and send the decompressed media file to a digital-to-analog converter (“DAC”) for conversion to analog format. Since the decompression can be performed by the NVM itself, an application processor may be significantly removed from the media playback process. In some embodiments, it may only be necessary for the application processor to issue an initial read request and/or receive a completion confirmation from the NVM. This can result in significant power savings for the media player and can free the application processor for performing other functions of the media player.
    Type: Application
    Filed: December 20, 2010
    Publication date: June 21, 2012
    Applicant: Apple Inc.
    Inventor: Shachar Ron
  • Publication number: 20120158732
    Abstract: A data marketplace infrastructure provides a crowd sourcing solution to development, discovery and publication of decision applications. Applications can be submitted from a user to a data warehouse in association with a data feed. One or more discovery properties are determined with regard to each application. The applications are made available to other client systems in association with the data feed. A relevant data feed and a relevant application can be identified based on satisfaction of a discovery request by the one or more determined discovery properties of the application. The application can be selected and downloaded to the user for evaluation and customization. The customized application can then be submitted to the data warehouse for publication with the other applications associated with the data feed.
    Type: Application
    Filed: December 17, 2010
    Publication date: June 21, 2012
    Applicant: MICROSOFT CORPORATION
    Inventors: Vijay Mital, Max Uritsky, Suraj Poozhiyil, Moe Khosravy, Robert Fries
  • Publication number: 20120158731
    Abstract: The present invention extends to methods, systems, and computer program products for deriving document similarity indices. Embodiments of the invention include scalable and efficient mechanisms for deriving and updating a document similarity index for a plurality of documents. The number of maintained similarities can be controlled to conserve CPU and storage resources.
    Type: Application
    Filed: December 16, 2010
    Publication date: June 21, 2012
    Applicant: Microsoft Corporation
    Inventors: Sorin Gherman, Kunal Mukerjee, Adam Prout
  • Publication number: 20120158696
    Abstract: The claimed subject matter provides a method and a system for the efficient indexing of error tolerant set containment. An exemplary method comprises obtaining a frequency threshold and a query set. All tokens or token sets within the query set are determined, and then all minimal infrequent tokens or all minimal infrequent token sets of data records are found and used to build an index. The minimal infrequent tokens or minimal infrequent token sets are processed in a fixed order, and then a collection of signatures for each minimal infrequent token or token set is determined.
    Type: Application
    Filed: December 21, 2010
    Publication date: June 21, 2012
    Applicant: Microsoft Corporation
    Inventors: Arvind Arasu, Parag Agrawal, Kaushik Shriraghav
  • Publication number: 20120158733
    Abstract: A system for storing, managing, and accessing information on a network by providing an interface between a social network and a content network includes an applications platform. The system provides messaging and social networking facility incorporating enhanced instant messaging, file synchronization, network presence, interactive chat capabilities, text messaging, voice and video messaging, blogging, and email. The system includes a viewer, an indexing facility, and a storage facility. The viewer enables users to traverse content and provides services based upon context of time, place, structure, node, and observed user behavior. The viewer provides a means for users to interact with information on the network and services to manipulate information and transact activities. The indexing facility manages the structure of the network and tracks attributes and controlled vocabularies. The indexing facility supports navigation across the structure and resolves the logical index to a physical storage location.
    Type: Application
    Filed: June 15, 2011
    Publication date: June 21, 2012
    Applicant: PEER FUSION LLC
    Inventors: Robert E. McGILL, Clifford F. BOYLE, Jamie MAZUR, Jason MAZUR, Alex GERUS, Eugene BERKOV, Kunal BHOMICK
  • Publication number: 20120158800
    Abstract: A method of organizing data in a database system using a swarm database system that has one or more nodes comprising one or more processors and memory, the memory of the one or more nodes storing one or more programs to be executed by the one or more processors. The method includes identifying data to store in one or more tables on a bucket, wherein the bucket is an allocation of a partitioned storage in a node of the one or more nodes, and assigning to each of the identified data an identifier and a data storage hierarchical level of a plurality of hierarchical levels.
    Type: Application
    Filed: December 16, 2011
    Publication date: June 21, 2012
    Inventors: Keith PETERS, Bryn Robert Dole, Michael Markson, Robert Michael Saliba, Rich Skrenta, Robert N. Truel, Gregory B. Lindahl
  • Publication number: 20120150862
    Abstract: A method for enhancing a search of a set of documents is described. The method allows a user to present a word of interest. The word is then matched to related words in a larger corpus of words and the related words are matched against an index of the document to identify words that appear in both the matched words and the document index. The word selected by the user may be taken from a previously generated index of the document or the word may be presented by the user based on a topic of interest.
    Type: Application
    Filed: December 13, 2010
    Publication date: June 14, 2012
    Applicant: Xerox Corporation
    Inventor: Steven J. Harrington
  • Publication number: 20120150812
    Abstract: Content license storage is provided by holding, in a temporary license store on the content consumption device, a plurality of content licenses for a plurality of content streams, wherein each content license of the plurality of content licenses includes a removal date. The method further includes for each content license of the plurality of content licenses corresponding to a content stream of the plurality of content streams which is designated for archived playback, copying the content license into an embedded license store within the content stream to form an archived content stream. The method further includes removing one or more of the plurality of content licenses held at the temporary license store if the removal date included in the content license has been reached, while leaving each content license stored within an archived content stream even if the removal date has been reached.
    Type: Application
    Filed: December 13, 2010
    Publication date: June 14, 2012
    Applicant: MICROSOFT CORPORATION
    Inventor: Quintin S. Burns