Data Indexing; Abstracting; Data Reduction (epo) Patents (Class 707/E17.002)
  • Publication number: 20130198148
    Abstract: Embodiments of the present invention provide a system, method and computer program products for estimating data reduction in a file system. A method includes selecting a sample of all data from data files in the file system, wherein said sample represents a subset of all the data in the file system. The method further includes estimating a data reduction ratio by data deduplication for the file system based on said sample. The method further includes estimating a data reduction ratio by data compression for the file system based on said sample. The method further includes generating a combined data reduction estimate for the file system based on said data compression estimate and said data deduplication estimate.
    Type: Application
    Filed: January 27, 2012
    Publication date: August 1, 2013
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: David D. Chambliss, Mihail C. Constantinescu, Joseph S. Glider, Maohua Lu
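The abstract above does not say how the estimates are computed. The following is a minimal Python sketch of the general idea, assuming fixed chunks held in memory, SHA-256 fingerprints for duplicate detection, and zlib as the compressor. The function name `estimate_reduction` and its parameters are hypothetical. Note that a naive sample like this tends to underestimate deduplication, because two copies of a chunk must both land in the sample to be counted.

```python
import hashlib
import random
import zlib


def estimate_reduction(chunks, sample_fraction=0.1, seed=0):
    """Estimate dedup, compression, and combined reduction ratios from a sample.

    Each ratio is original size divided by reduced size, so higher is better.
    Assumes `chunks` is a non-empty list of non-empty bytes objects.
    """
    rng = random.Random(seed)
    k = max(1, int(len(chunks) * sample_fraction))
    sample = rng.sample(chunks, k)

    # Deduplication estimate: keep one copy of each distinct chunk in the sample.
    unique = {}
    for chunk in sample:
        unique.setdefault(hashlib.sha256(chunk).digest(), chunk)
    sampled_bytes = sum(len(c) for c in sample)
    unique_bytes = sum(len(c) for c in unique.values())
    dedup_ratio = sampled_bytes / unique_bytes

    # Compression estimate: compress each sampled chunk independently.
    compressed_bytes = sum(len(zlib.compress(c)) for c in sample)
    compression_ratio = sampled_bytes / compressed_bytes

    # Combined estimate: compress only the chunks that survive deduplication.
    combined_bytes = sum(len(zlib.compress(c)) for c in unique.values())
    combined_ratio = sampled_bytes / combined_bytes
    return dedup_ratio, compression_ratio, combined_ratio
```

The combined figure is computed by compressing the deduplicated sample rather than by multiplying the two ratios, since duplicate chunks and compressible chunks may overlap.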
  • Publication number: 20130198155
    Abstract: As changes are made to a document, each change may be assigned an extended identifier comprising a globally unique identifier (GUID) component and an integer component. Upon determining that the same GUID component is used in identifiers for multiple changes, the GUID component may be mapped to a range of indices. Each index of the range of indices may then be used to represent the same GUID component in each extended identifier.
    Type: Application
    Filed: July 24, 2012
    Publication date: August 1, 2013
    Applicant: MICROSOFT CORPORATION
    Inventors: Simon Peter Clarke, David Oliver, Brent James Van Minnen, Miko Arnab S. Bose
  • Publication number: 20130197789
    Abstract: A travel management system may include a client module to generate a request to update and/or search for data related to a trip. A database module may receive the request and communicate with a database. The database may include data organized in a trip data store table including unique keys respectively identifying trips. The database may further include index tables related to attributes of the trips and identified by the unique keys. The database module may obtain data related to the request from an index table corresponding to a unique key and forward a response to the client module.
    Type: Application
    Filed: May 7, 2012
    Publication date: August 1, 2013
    Applicant: Accenture Global Services Limited
    Inventors: Saurabh BHADKARIA, Gurdeep Singh VIRDI, Sanjoy PAUL
  • Patent number: 8489132
    Abstract: Disclosed are a system, method, and article of manufacture for context-enriched microblog posting. In one aspect, a message component is provided. A context data related to a context of a computing device used to generate the message component is provided. The message component and the context data are associated. The context data may be communicated to a web browser. The message component may be communicated to the web browser. The message component may be rendered in a format for communication as a short message service (SMS) message that includes a reference to the context data. The message component and the context data may be rendered in a format for communication as a multimedia messaging service (MMS) message.
    Type: Grant
    Filed: April 29, 2010
    Date of Patent: July 16, 2013
    Assignee: Buckyball Mobile Inc.
    Inventors: Amit Karmarkar, Richard Peters
  • Publication number: 20130179411
    Abstract: For on-line separation of data chunks for compression, unrelated data chunks are classified based on various attributes. The classified data chunks are sent to at least one available compression context. The classified data chunks are related. The classified data chunks are encoded by at least one of the compression operations. A compression ratio is achieved and included as feedback.
    Type: Application
    Filed: June 27, 2012
    Publication date: July 11, 2013
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Jonathan AMIT, Ori SHALEV
  • Publication number: 20130179409
    Abstract: For on-line separation of data chunks for compression, unrelated data chunks are classified based on various attributes. The classified data chunks are sent to at least one available compression context. The classified data chunks are related. The classified data chunks are encoded by at least one of the compression operations. A compression ratio is achieved and included as feedback.
    Type: Application
    Filed: January 6, 2012
    Publication date: July 11, 2013
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Jonathan AMIT, Ori SHALEV
  • Publication number: 20130179407
    Abstract: Apparatus, methods, and other embodiments associated with de-duplication seeding are described. One example method includes re-configuring a data de-duplication repository with a blocklet from a data de-duplication seed corpus. Reconfiguring the repository may include adding a blocklet from the seed corpus to the repository, activating a blocklet identified with the seed corpus in the repository, removing a blocklet from the repository, and de-activating a blocklet in the repository. The example method may also include re-configuring a data de-duplication index associated with the data de-duplication repository with information about the blocklet. Reconfiguring the repository and the index increases the likelihood that a blocklet ingested by a data de-duplication apparatus that relies on the repository and the index will be treated as a duplicate blocklet by the data de-duplication apparatus.
    Type: Application
    Filed: January 11, 2012
    Publication date: July 11, 2013
    Applicant: Quantum Corporation
    Inventor: Timothy STOAKES
  • Publication number: 20130179729
    Abstract: An approach to providing failure protection in a loosely-coupled cluster environment. A node in the cluster generates checkpoints of application data in a consistent state for an application that is running on a first node in the cluster. The node sends the checkpoint to one or more of the other nodes in the cluster. The node may also generate log entries of changes in the application data that occur between checkpoints of the application data. The node may send the log entries to other nodes in the cluster. The node may similarly receive external checkpoints and external log entries from other nodes in the cluster. In response to a node failure, the node may start an application on the failed node and recover the application using the external checkpoints and external log entries for the application.
    Type: Application
    Filed: January 5, 2012
    Publication date: July 11, 2013
    Applicant: International Business Machines Corporation
    Inventors: Lawrence Y. Chiu, Shan Fan, Yang Liu, Mei Mei, Paul H. Muench
  • Publication number: 20130173564
    Abstract: A system and method for compressing and decompressing multiple types of character data. The system and method employ multiple encoding tables, each designed for encoding a subset of character data, such as numeric data, uppercase letters, lowercase letters, Latin, or UNICODE data, to perform compressions and decompression of character data. The character encoding tables are smaller than the size of the alphabet of the uncompressed strings.
    Type: Application
    Filed: March 8, 2012
    Publication date: July 4, 2013
    Inventors: Gary Roberts, Guilian Wang, Frederick Kaufmann
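As a rough illustration of the multiple-table idea in the abstract above, the Python sketch below picks the first table whose alphabet covers the whole string and packs each character into the few bits that table needs. The table set, the names `TABLES`, `encode`, and `decode`, and the packing layout are all assumptions made for illustration. The abstract does not describe the actual encoding.

```python
import string

# Hypothetical per-subset encoding tables; each alphabet is far smaller than
# the full character set, so each symbol needs fewer bits.
TABLES = {
    "digits": string.digits,           # 10 symbols, 4 bits each
    "upper": string.ascii_uppercase,   # 26 symbols, 5 bits each
    "lower": string.ascii_lowercase,   # 26 symbols, 5 bits each
}


def encode(text):
    """Return (table name, length, packed integer) for text covered by one table."""
    for name, alphabet in TABLES.items():
        if all(ch in alphabet for ch in text):
            bits = (len(alphabet) - 1).bit_length()
            packed = 0
            for ch in text:
                packed = (packed << bits) | alphabet.index(ch)
            return name, len(text), packed
    raise ValueError("no table covers this text")


def decode(name, length, packed):
    """Invert encode() using the same table."""
    alphabet = TABLES[name]
    bits = (len(alphabet) - 1).bit_length()
    mask = (1 << bits) - 1
    chars = []
    for _ in range(length):
        chars.append(alphabet[packed & mask])
        packed >>= bits
    return "".join(reversed(chars))
```

For example, `encode("HELLO")` selects the uppercase table and needs at most 25 bits, compared with 40 bits for five 8-bit characters.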
  • Publication number: 20130173588
    Abstract: Techniques for updating join indexes are provided. A determination is made to update date criteria in a join index query statement. The join index is parsed for current date and current time criteria. The join index is revised based on the location of the current date and current time criteria as they appear in the original join index. The revisions include new criteria that minimize the effort in maintaining and using the join index.
    Type: Application
    Filed: December 28, 2011
    Publication date: July 4, 2013
    Applicant: Teradata US, Inc.
    Inventors: Xiaobin Ma, Grace Kwan-On Au, Lu Ma
  • Publication number: 20130173553
    Abstract: A distributed, cloud-based storage system provides a reliable, deduplicated, scalable and high performance backup service to heterogeneous clients that connect to it via a communications network. The distributed cloud-based storage system guarantees consistent and reliable data storage while using structured storage that lacks ACID compliance. Consistency and reliability are guaranteed using a system that includes: 1) back references from shared objects to referring objects, 2) safe orders of operation for object deletion and creation, and 3) simultaneous access to shared resources through sub-resources.
    Type: Application
    Filed: December 29, 2011
    Publication date: July 4, 2013
    Inventors: Anand Apte, Faisal Puthuparackat, Jaspreet Singh, Milind Borate, Shekhar S. Deshkar
  • Publication number: 20130173627
    Abstract: A deduplicated data storage system provides high performance storage to heterogeneous clients that connect to it via a communications network. The deduplicated data storage system provides fast access to deduplication data by caching the most frequently accessed deduplication data in a hyperindex. Updates to the non-cached deduplication data are serialized by use of a store queue and hold queue.
    Type: Application
    Filed: December 29, 2011
    Publication date: July 4, 2013
    Inventors: Anand Apte, Jaspreet Singh, Milind Borate, Shekhar S. Deshkar
  • Publication number: 20130166522
    Abstract: Calculation of aggregated values in a history database table can be optimized using an approach in which an ordered history table is accessed. The ordered history table can include a sequential listing of commit identifiers associated with updates, insertions, and/or deletions to values in the database table. The ordered history table can be traversed in a single pass to calculate an aggregation function using an optimized algorithm. The optimized algorithm can enable calculation of an aggregated metric of the values based on a selected method for tracking invalidated values to their corresponding commit identifiers. The calculated metric is generated for a current version of the database table and promoted.
    Type: Application
    Filed: December 23, 2011
    Publication date: June 27, 2013
    Inventors: Martin Kaufmann, Norman May, Andreas Tonder, Donald Kossmann
  • Publication number: 20130166518
    Abstract: Systems and methods for compression of a genomic data file are described herein. In one embodiment, genomic sequences, sequence headers, and quality sequences associated with a plurality of data streams provided in a genomic data file are identified. Each of the genomic sequences includes at least one of primary characters and secondary characters. Further, the secondary characters from each of the genomic sequences may be removed to obtain an intermediate genomic sequence file, and a quality score corresponding to the secondary character may be modified in quality sequences to obtain an intermediate quality sequence file. Based on the intermediate genomic sequence file and the intermediate quality sequence file, a modified genomic sequence file and a modified quality sequence file, respectively, are generated. A compressed genomic data file is obtained using at least the modified genomic sequence and the modified quality sequence.
    Type: Application
    Filed: March 23, 2012
    Publication date: June 27, 2013
    Applicant: TATA CONSULTANCY SERVICES LIMITED
    Inventors: Sharmila Shekhar MANDE, Monzoorul Haque MOHAMMED, Anirban DUTTA, Tungadri BOSE
  • Publication number: 20130166471
    Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, can aggregate content from one or more electronic publication documents. In one aspect, a method includes obtaining information including an index to content items of an electronic publication including a discrete package of received data that is stored locally, the content items being less than all content in the electronic publication received; retrieving, by a computer based on one or more criteria, two or more of the content items provided in disparate portions of the electronic publication, including the discrete package of received data, using the index; and presenting the two or more of the content items together on an output device in a user interface format different from that of the electronic publication.
    Type: Application
    Filed: September 27, 2010
    Publication date: June 27, 2013
    Applicant: ADOBE SYSTEMS INCORPORATED
    Inventors: Yohko Aurora Fukuda Kelley, Bruce Chester Bell
  • Publication number: 20130166502
    Abstract: This document describes, in various implementations, segmenting data of a database cluster into a plurality of segments, the data including a plurality of tuples, each segment including at least one of the tuples, and distributing the plurality of segments among nodes of the database cluster. Rebalancing of the data of the database cluster may be achieved by copying at least one of the plurality of segments from a source node of the database cluster to a destination node of the database cluster.
    Type: Application
    Filed: December 23, 2011
    Publication date: June 27, 2013
    Inventor: Stephen Gregory WALKAUSKAS
  • Publication number: 20130159263
    Abstract: A method for compressing a cloud of points with imposed error constraints at each point is disclosed. Surfaces are constructed that approach each point to within the constraint specified at that point, and from the plurality of surfaces that satisfy the constraints at all points, a surface is chosen which minimizes the amount of memory required to store the surface on a digital computer.
    Type: Application
    Filed: December 18, 2011
    Publication date: June 20, 2013
    Applicant: Numerica Corporation
    Inventors: Randy C. Paffenroth, Ryan Nong, Woody D. Leed, Scott M. Lundberg
  • Publication number: 20130159255
    Abstract: Provided is a storage system providing a data storage area to an external apparatus. The storage system includes at least a first information processing apparatus including a first logical storage area forming the data storage area and a first data processing part performing processing of reducing the storage capacity used by a backup target data in the first logical storage area, and a second information processing apparatus communicatively coupled to the first information processing apparatus and including a second logical storage area forming the data storage area, and a second data processing part performing processing of reducing the storage capacity used by the backup target data in the second logical storage area.
    Type: Application
    Filed: December 20, 2011
    Publication date: June 20, 2013
    Inventors: Shigeru Kaga, Mikito Ogata, Mamoru Sato
  • Publication number: 20130159281
    Abstract: Embodiments are directed to replicating database tables for efficient data querying and to using a background task to update a database index table on a periodic basis. In one scenario, a computer system accesses an existing, original time-based database table that includes various entities and properties for each entity. Each entity also includes a time stamp value. The computer system receives an indication that the new index table is to be indexed according to a user-specified property and sorts the new index table based on both the value of the user-specified property and the time stamp value of the entity to which the user-specified property belongs. The computer system then periodically copies the entities and associated properties of the original time-based database table into a new database index table.
    Type: Application
    Filed: December 15, 2011
    Publication date: June 20, 2013
    Applicant: MICROSOFT CORPORATION
    Inventors: Jinlin Yang, Michael Y. Levin
  • Publication number: 20130159629
    Abstract: A system and method are disclosed for storing data in a hash table. The method includes receiving data, determining a location identifier for the data wherein the location identifier identifies a location in the hash table for storing the data and the location identifier is derived from the data, compressing the data by extracting the location identifier; and storing the compressed data in the identified location of the hash table.
    Type: Application
    Filed: November 15, 2012
    Publication date: June 20, 2013
    Applicant: STEC, INC.
    Inventor: STEC, Inc.
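One way to read the abstract above is as a quotienting scheme: the bits of the data that select the slot are not stored, because the slot position already implies them. The Python sketch below assumes non-negative integer keys and uses the low bits as the location identifier. The class name `QuotientedTable` and this bit split are illustrative assumptions, not details taken from the application.

```python
class QuotientedTable:
    """Fixed-size table storing only the key bits not implied by the slot index."""

    def __init__(self, index_bits=8):
        self.index_bits = index_bits
        self.mask = (1 << index_bits) - 1
        # One bucket (list of remainders) per slot.
        self.slots = [[] for _ in range(1 << index_bits)]

    def _split(self, key):
        # Location identifier: low bits. Compressed data: remaining high bits.
        return key & self.mask, key >> self.index_bits

    def add(self, key):
        index, remainder = self._split(key)
        if remainder not in self.slots[index]:
            self.slots[index].append(remainder)

    def __contains__(self, key):
        index, remainder = self._split(key)
        return remainder in self.slots[index]

    def keys(self):
        # Rebuild each full key from its slot index and stored remainder.
        for index, bucket in enumerate(self.slots):
            for remainder in bucket:
                yield (remainder << self.index_bits) | index
```

With `index_bits=4`, the key `0x1234` is stored in slot 4 as the remainder `0x123`, and `keys()` reassembles the original value.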
  • Publication number: 20130151483
    Abstract: Example apparatus and methods associated with adaptive experience based de-duplication are provided. One example data de-duplication apparatus includes a de-duplication logic, an experience logic, and a reconfiguration logic. The de-duplication logic may be configured to perform data de-duplication according to a configurable approach that is a function of a pre-defined constraint. The experience logic may be configured to acquire de-duplication performance experience data. The reconfiguration logic may be configured to selectively reconfigure the configurable approach on the apparatus as a function of the de-duplication performance experience data. In different examples, dynamic reconfiguration may be performed locally and/or in a distributed manner based on local and/or distributed data that is acquired on a per actor (e.g., user, application) basis and/or on a per entity (e.g., computer, data stream) basis.
    Type: Application
    Filed: December 7, 2011
    Publication date: June 13, 2013
    Applicant: Quantum Corporation
    Inventor: Jeffrey Tofano
  • Publication number: 20130151924
    Abstract: A data handling system includes a compressive sensing unit that receives a source data file. A sparseness module within the compressive sensing unit generates a sparse source data file by inducing sparseness into the source data file. A measurement module within the compressive sensing unit generates a compressed sensed source data file from the sparse source data file and based on a sensing matrix. The compressed sensed source data file is to be transmitted to a remote data storage facility for storage. A recovery unit generates the source data file from the compressed sensed source data file retrieved from the remote data storage facility and based upon the sensing matrix.
    Type: Application
    Filed: December 8, 2011
    Publication date: June 13, 2013
    Applicant: Harris Corporation, Corporation of the State of Delaware
    Inventors: Edward R. Beadle, Charles Zahm
  • Publication number: 20130144846
    Abstract: A method includes receiving a request to save a first file as immutable. The method also includes searching for a second file that is saved and is redundant to the first file. The method further includes determining the second file is one of mutable and immutable. When the second file is mutable, the method includes saving the first file as a master copy, and replacing the second file with a soft link pointing to the master copy. When the second file is immutable, the method includes determining which of the first and second files has a later expiration date and an earlier expiration date, saving the one of the first and second files with the later expiration date as a master copy, and replacing the one of the first and second files with the earlier expiration date with a soft link pointing to the master copy.
    Type: Application
    Filed: December 2, 2011
    Publication date: June 6, 2013
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Gaurav CHHAUNKER, Bhushan P. JAIN, Sandeep R. PATIL, Sri RAMANATHAN, Matthew B. TREVATHAN
  • Publication number: 20130144845
    Abstract: A method implemented in a computer infrastructure including a combination of hardware and software includes receiving from a local computing device a request to securely delete a file. The method also includes determining the file is deduplicated. The method further includes determining one of: the file is referred to by at least one other file, and the file is not referred to by another file. The method additionally includes securely deleting links associating the file with the local computing device without deleting the file when the file is referred to by at least one other file. The method also includes securely deleting the file when the file is not referred to by another file.
    Type: Application
    Filed: December 2, 2011
    Publication date: June 6, 2013
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Deepak R. GHUGE, Bhushan P. JAIN, Sandeep R. PATIL, Sri RAMANATHAN, Matthew B. TREVATHAN
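The decision in the abstract above (remove only the link while the content is still shared, erase the content once nothing refers to it) can be sketched in Python with an in-memory store. The class `DedupStore`, its method names, and the use of zero-overwriting as a stand-in for secure erasure are assumptions for illustration only.

```python
import hashlib


class DedupStore:
    """Toy deduplicated store: file names link to shared content blocks."""

    def __init__(self):
        self.blocks = {}  # content id -> bytearray holding the single stored copy
        self.links = {}   # file name -> content id

    def put(self, name, data):
        content_id = hashlib.sha256(data).hexdigest()
        # Identical content is stored once and shared by every linking file.
        self.blocks.setdefault(content_id, bytearray(data))
        self.links[name] = content_id

    def secure_delete(self, name):
        content_id = self.links.pop(name)
        if content_id in self.links.values():
            # Another file still refers to the content: drop only this link.
            return "link removed"
        # No remaining references: overwrite the bytes, then discard the block.
        block = self.blocks.pop(content_id)
        block[:] = bytes(len(block))
        return "content erased"
```

A real system would also have to erase the link metadata itself and any on-disk copies, which this sketch does not attempt.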
  • Publication number: 20130138658
    Abstract: Indexes for predefined search orders of items in a database are generated and stored. When a client issues a database query a responsive pre-generated index list is retrieved and provided to the client for use in, e.g., populating a U/I view for a user. Only those items that a client needs, e.g., for populating a current U/I view, are retrieved from the database and output to the client. When a change is rendered to the database, e.g., an item is added or deleted or an existing item is altered, only the change is output to the client, rather than the entire modified index or altered item. In this manner clients can more quickly and efficiently respond to user data query requests by performing some processing upfront and by limiting communications traffic to communications relevant to the client's current processing.
    Type: Application
    Filed: November 29, 2011
    Publication date: May 30, 2013
    Applicant: Microsoft Corporation
    Inventors: Mark S. Flick, Ying Ding
  • Publication number: 20130132400
    Abstract: Provided are techniques for incrementally integrating and persisting context over an available observational space. At least one feature associated with a new observation is used to create at least one index key. The at least one index key is used to query one or more reverse lookup tables to locate at least one previously persisted candidate observation. The new observation is evaluated against the at least one previously persisted candidate observation to determine at least one relationship. In response to determining the at least one relationship, a threshold is used to make a new assertion about the at least one relationship. The new observation is used to review previous assertions to determine whether a previous assertion is to be reversed. In response to reversing the previous assertion, the new observation, the new assertion, and the reversed assertion are incrementally integrated into persistent context.
    Type: Application
    Filed: September 13, 2012
    Publication date: May 23, 2013
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Gregery G. ADAIR, Robert J. BUTCHER, Jeffrey J. JONAS
  • Publication number: 20130132353
    Abstract: The present subject matter discloses a system and a method for compression of genomic data. In one embodiment, the method for compression of genomic data includes obtaining modified genomic data from genomic data based at least in part on intermediary data identified from the genomic data. In one implementation, the modified genomic data includes a plurality of primary characters. The genomic data may then be modified to generate one or more most-frequent character files based at least on a most-frequent character and a second most-frequent character from among the plurality of primary characters. Further, based at least on the one or more most-frequent character files and the modified genomic data, a least-frequent characters file may be created from the modified genomic data.
    Type: Application
    Filed: March 23, 2012
    Publication date: May 23, 2013
    Applicant: TATA CONSULTANCY SERVICES LIMITED
    Inventors: Sharmila Shekhar MANDE, Monzoorul Haque MOHAMMED, Anirban DUTTA, Tungadri BOSE, Sudha CHADARAM
  • Publication number: 20130132397
    Abstract: An apparatus for generating indexes of data may include a processor and memory storing executable computer code causing the apparatus to at least perform operations including obtaining an order number responsive to receipt of a request from a device to index an item(s) of data. The computer program code may further cause the apparatus to map the order number to a key value and link the key value to the data and provide one or more index entries to a memory device to enable storage of the index entries. The index entries may include information corresponding to the key value and the data. The computer program code may further cause the apparatus to assign a new index row(s) including the data for inclusion in a set of index rows of a designated partition(s) to obtain a built index(es) of the data. Corresponding methods and computer program products are also provided.
    Type: Application
    Filed: November 18, 2011
    Publication date: May 23, 2013
    Applicant: NOKIA CORPORATION
    Inventors: David Gordon MacMillan, Matti Juhani Oikarinen
  • Publication number: 20130132399
    Abstract: Provided are techniques for incrementally integrating and persisting context over an available observational space. At least one feature associated with a new observation is used to create at least one index key. The at least one index key is used to query one or more reverse lookup tables to locate at least one previously persisted candidate observation. The new observation is evaluated against the at least one previously persisted candidate observation to determine at least one relationship. In response to determining the at least one relationship, a threshold is used to make a new assertion about the at least one relationship. The new observation is used to review previous assertions to determine whether a previous assertion is to be reversed. In response to reversing the previous assertion, the new observation, the new assertion, and the reversed assertion are incrementally integrated into persistent context.
    Type: Application
    Filed: November 22, 2011
    Publication date: May 23, 2013
    Applicant: International Business Machines Corporation
    Inventors: Gregery G. Adair, Robert J. Butcher, Jeffrey J. Jonas
  • Publication number: 20130124489
    Abstract: A method, computer program product and system for compressing a multivariate dataset. A dataset is selected that includes a plurality of variates. A first compression method is applied to the values of a first variate of the dataset. A second compression method is applied to the values of a second variate of the dataset, where the second compression method is arranged to compress the second variate values relative to the variation of the corresponding first variate values.
    Type: Application
    Filed: October 16, 2012
    Publication date: May 16, 2013
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventor: International Business Machines Corporation
  • Publication number: 20130124503
    Abstract: A device relating to a delta indexing method for a hierarchy file storage including a front-end side file server and a back-end side file server is provided. The front-end side file server creates a file update list for accumulating a file update history in a file system therein, a search server requests the file update list from the front-end side file server, and the front-end side file server supplies path name information of a targeted file in the back-end side file server in addition to the file update list, so that the search server can access the back-end side file server to acquire the information necessary for a search index update.
    Type: Application
    Filed: July 12, 2012
    Publication date: May 16, 2013
    Inventor: YOHSUKE ISHII
  • Publication number: 20130117302
    Abstract: An apparatus and method for searching for index-structured data including a memory-based summary vector are disclosed. The apparatus for searching for index-structured data including a memory-based summary vector includes a storage unit configured to store a full index and data related to a key; and a key lookup engine configured to include not only a summary vector but also an index storing information related to the full index, search for data stored in the storage unit through the index, and return the searched result.
    Type: Application
    Filed: November 2, 2012
    Publication date: May 9, 2013
    Applicant: Electronics and Telecommunications Research Institute
    Inventor: Electronics and Telecommunications Research In
  • Publication number: 20130117272
    Abstract: Data management techniques are provided for handling of big data. A data management process can account for attributes of data by analyzing or interpreting the data, assigning intervals to the attributes based on the data, and effectuating policies, based on the attributes and intervals, that facilitate data management. In addition, the data management process can determine relations among data in a data collection and generate and store approximate results concerning the data based on the attributes, intervals, and the policies.
    Type: Application
    Filed: November 3, 2011
    Publication date: May 9, 2013
    Applicant: MICROSOFT CORPORATION
    Inventors: Roger Barga, Alexander Sasha Stojanovic, Henricus Johannes Maria Meijer, Carl Carter-Schwendler, Michael Isard
  • Publication number: 20130117242
    Abstract: Systems, methods, and devices for adaptively suppressing data items based on one or more dynamic characteristics of the data items are disclosed. Adaptive data suppression of an operational data item may be accomplished by monitoring the operational data item for one or more dynamic characteristics required by a data aging rule associated with the operational data item, wherein at least one of the database and operational data item are stored in memory, detecting the one or more dynamic characteristics required by the data aging rule, recording the one or more detected dynamic characteristics and assessing whether the one or more detected dynamic characteristics satisfy the data aging rule. If a data aging rule is satisfied, the operational data item may be suppressed to persistent data storage. Related systems, methods, and articles of manufacture are also described.
    Type: Application
    Filed: November 9, 2011
    Publication date: May 9, 2013
    Applicant: SAP, AG
    Inventors: Marcel Kassner, Ole Krueger
  • Publication number: 20130117241
    Abstract: Replay of data transactions is initiated in a data storage application. Pages of a log segment directory characterizing metadata for a plurality of log segments are loaded into memory. Thereafter, redundant pages within the log segment directory are removed. It is then determined, based on the log segment directory, which log segments need to be accessed. These log segments are accessed to execute the log replay. Related apparatus, systems, techniques and articles are also described.
    Type: Application
    Filed: November 7, 2011
    Publication date: May 9, 2013
    Inventor: Ivan Schreter
  • Publication number: 20130117274
    Abstract: In an address book management method of an electronic device, a directory of members of an address book is created. A communication bulk and a communication count of each of the members listed in the address book are obtained. The communication bulk for each member is a total quantity of electronic communication in a predetermined time period between a predetermined user of the electronic device and the member, the total quantity measured according to a predetermined criterion, and the communication count for each member is a total number of occasions of electronic communication between the user and the member in the predetermined time period. An accumulative contact quantity index of each member is calculated according to the calculated communication bulk and communication count of the member. Thus, the members in the directory of the address book are ordered according to the accumulative contact quantity indexes.
    Type: Application
    Filed: September 26, 2012
    Publication date: May 9, 2013
    Applicants: FIH (HONG KONG) LIMITED, SHENZHEN FUTAIHONG PRECISION INDUSTRY CO., LTD.
    Inventor: JIAN-HUI LI
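The ordering step described in this abstract can be sketched as follows. This is only an illustration: the abstract does not state how the accumulative contact quantity index is computed, so the weighted sum, the weights, and the names `Member`, `contact_index`, and `order_directory` are all assumptions.

```python
from dataclasses import dataclass


@dataclass
class Member:
    name: str
    bulk: float  # total quantity of communication in the period (e.g. call minutes)
    count: int   # number of communication occasions in the same period


def contact_index(member, bulk_weight=0.5, count_weight=0.5):
    # Hypothetical formula: a weighted sum of bulk and count.
    # The abstract does not specify the actual calculation.
    return bulk_weight * member.bulk + count_weight * member.count


def order_directory(members):
    # Members with the highest index come first in the directory.
    return sorted(members, key=contact_index, reverse=True)


members = [Member("Ann", 10.0, 2), Member("Bo", 4.0, 12), Member("Cy", 1.0, 1)]
ordered_names = [m.name for m in order_directory(members)]
```

With these sample values, Bo's many short contacts outweigh Ann's few long ones, so Bo is listed first.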
  • Publication number: 20130110793
    Abstract: Embodiments of the present invention provide an approach that utilizes discrete event simulation to quantitatively analyze the reliability of a modeled de-duplication system in a computer storage environment. In addition, the approach described herein can perform such an analysis on systems having heterogeneous data stored on heterogeneous storage systems in the presence of primary faults and their secondary effects due to de-duplication. In a typical embodiment, data de-duplication parameters and a hardware configuration are received in a computer storage medium. A data de-duplication model is then applied to a set of data and to the data de-duplication parameters, and a hardware reliability model is applied to the hardware configuration. Then a set (at least one) of discrete events is simulated based on the data de-duplication model as applied to the set of data and the data de-duplication parameters, and the hardware reliability model as applied to the hardware configuration.
    Type: Application
    Filed: November 1, 2011
    Publication date: May 2, 2013
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Kavita Chavda, Eric W. Davis Rozier, Nagapramod S. Mandagere, Sandeep M. Uttamchandani, Pin Zhou
  • Publication number: 20130110842
    Abstract: A system may include multiple personal data sources and a machine-implemented data extractor and correlator configured to retrieve personal data from at least one of the personal data sources. The data extractor and correlator may extract information from unstructured data within the retrieved personal data and correlate the extracted information with previously stored structured data to generate additional structured data. The system may also include a storage device configured to store the previously stored structured data and the additional structured data. A natural language query module may be configured to receive a natural language query from a user and provide a response to the natural language query based at least in part on one or both of the previously stored structured data and the additional structured data.
    Type: Application
    Filed: November 2, 2011
    Publication date: May 2, 2013
    Applicant: SRI INTERNATIONAL
    Inventors: Thierry Donneau-Golencer, Rajan Singh, Madhu Yarlagadda, Corey Hulen, Kenneth C. Nitz, William Scott Mark
  • Publication number: 20130110841
    Abstract: An approach is provided for querying media based on media characteristics. A media platform processes and/or facilitates a processing of one or more images, one or more videos, or a combination thereof to determine one or more latent vectors associated with the one or more images, the one or more videos, or the combination thereof. The media platform further causes, at least in part, a comparison of the one or more latent vectors to one or more models. The media platform also causes, at least in part, an indexing of the one or more images, the one or more videos, or the combination thereof based, at least in part, on the one or more latent vectors, the one or more models, or a combination thereof.
    Type: Application
    Filed: October 31, 2011
    Publication date: May 2, 2013
    Applicant: Nokia Corporation
    Inventors: Sailesh Kumar SATHISH, Igor Danilo Diego CURCIO
  • Publication number: 20130110827
    Abstract: Systems, computer-readable media, and methods for utilizing information pertaining to one or more individuals or entities with which a user has at least one social networking relationship are provided. A search engine is configured to receive a query, to identify matching electronic documents, to rank the electronic documents, and to transmit the matching electronic documents and/or advertisements to the user in response to receiving a query. Upon receiving the query from a user, the search engine obtains a social network identifier of the user and utilizes information about the user's social networking relationships to augment the query with nonretrieval modifiers. The search engine processes the nonretrieval modifiers matching the electronic documents included in search results and ranks the results but does not use the nonretrieval modifiers to identify or retrieve results matching the query. The ranked electronic documents are included in the results and displayed in rank order to the user.
    Type: Application
    Filed: October 26, 2011
    Publication date: May 2, 2013
    Applicant: MICROSOFT CORPORATION
    Inventors: SHUBHA NABAR, RAJESH KRISHNA SHENOY
  • Publication number: 20130110794
    Abstract: An apparatus and method for stably filtering duplicate data in various resource-restricted environments such as a mobile device and medical equipment are provided. The apparatus includes a cell array unit configured to comprise one or more cells; a duplication check unit configured to check whether input data is duplicate and set a value of a cell that matches the input data; and a duplication probability calculation unit configured to, in response to the input data being determined as duplicate data by the duplication check unit, calculate a probability of duplication of the input data using the set value of the cell. Data which may be duplicate data among a large amount of input data is not arbitrarily deleted, but is provided to an application along with a probability of duplication of the data. Accordingly, a false positive error that occurs in a Bloom filter is prevented, and thereby system stability can be improved.
    Type: Application
    Filed: April 30, 2012
    Publication date: May 2, 2013
    Applicant: Samsung Electronics Co., Ltd
    Inventor: Chun-Hee Lee
  • Publication number: 20130103694
    Abstract: In one embodiment, a method comprises identifying prefix groups for searchable character symbols, each prefix group having a corresponding searchable character symbol comprising at least one searchable character; assigning at least one prefix group to each of a plurality of distributed hash table nodes in a network, each distributed hash table node containing at least one of the prefix groups, each distributed hash table node assigned a corresponding prescribed keyspace range of a prescribed keyspace, each distributed hash table node configured for storing data records having respective primary data record keys within the corresponding prescribed keyspace range; and assigning secondary indexes that start with one of the searchable character symbols to the corresponding prefix group in the associated distributed hash table node, enabling any prefix search starting with the one searchable character symbol to be directed to the corresponding prefix group in the associated distributed hash table node.
    Type: Application
    Filed: October 25, 2011
    Publication date: April 25, 2013
    Applicant: CISCO TECHNOLOGY, INC.
    Inventors: Steven Vincent LUONG, Manish BHARDWAJ, Jiang ZHU, Huida DAI
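A minimal single-process sketch of the prefix-group routing idea follows. It assumes that each prefix group is keyed by one leading character, that the group is placed on a node by hashing that character, and that secondary keys are non-empty; none of these details come from the abstract, and `PrefixRoutedIndex` is a hypothetical name. The nodes are simulated as in-memory dictionaries rather than real distributed hash table nodes.

```python
import bisect
import hashlib


class PrefixRoutedIndex:
    def __init__(self, node_count):
        # One dict per simulated node:
        # prefix symbol -> sorted list of (secondary_key, primary_key) pairs.
        self.nodes = [dict() for _ in range(node_count)]

    def node_for(self, symbol):
        # Assumed placement rule: hash the symbol and take it modulo the node count.
        digest = hashlib.sha256(symbol.encode("utf-8")).digest()
        return int.from_bytes(digest[:4], "big") % len(self.nodes)

    def add(self, secondary_key, primary_key):
        # The secondary index entry goes to the prefix group of its first character.
        symbol = secondary_key[0]
        group = self.nodes[self.node_for(symbol)].setdefault(symbol, [])
        bisect.insort(group, (secondary_key, primary_key))

    def prefix_search(self, prefix):
        # Any search starting with a given symbol is directed to a single node.
        symbol = prefix[0]
        group = self.nodes[self.node_for(symbol)].get(symbol, [])
        start = bisect.bisect_left(group, (prefix,))
        results = []
        for key, primary in group[start:]:
            if not key.startswith(prefix):
                break
            results.append(primary)
        return results


index = PrefixRoutedIndex(node_count=4)
index.add("alice", "k1")
index.add("alan", "k2")
index.add("bob", "k3")
matches = index.prefix_search("al")
```

Because every key starting with the same symbol lives in one group, a prefix search touches one node instead of being broadcast to all of them.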
  • Publication number: 20130100296
    Abstract: A method of distributing media content includes capturing an image of a static media content, detecting at least one feature in the image, seeking a correlation of the image to a reference image using the at least one feature, and identifying at least one region of dynamic media content of the reference image in the image of the static media content.
    Type: Application
    Filed: November 18, 2011
    Publication date: April 25, 2013
    Inventors: Feng Tang, Daniel R. Tretter, Qian Lin
  • Publication number: 20130103691
    Abstract: A technique includes, in response to an access to a database involving access to a table and specifying a natural key, using the database to translate the natural key to a surrogate key based at least in part on a mapping
    Type: Application
    Filed: October 19, 2011
    Publication date: April 25, 2013
    Inventor: Rohit N. Jain
  • Publication number: 20130103651
    Abstract: In one embodiment, a server may identify an executable file using a hash identifier. The server may compute a hash identifier based on a file metadata set associated with an executable file. The server may identify the executable file using the hash identifier.
    Type: Application
    Filed: October 23, 2011
    Publication date: April 25, 2013
    Applicant: Microsoft Corporation
    Inventors: Pradeep Jha, Michal Strehovsky, Bruce Chhay, Josh Carroll
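As a rough illustration of computing a hash identifier from a file metadata set, the sketch below hashes a canonical serialization of a metadata dictionary with SHA-256. The choice of SHA-256, the metadata field names, and the function name `metadata_hash_id` are assumptions; the abstract does not say which fields or which hash function are used.

```python
import hashlib


def metadata_hash_id(metadata):
    # Serialize the fields in sorted key order so that the same metadata set
    # always produces the same identifier, whatever order it was supplied in.
    canonical = "\n".join(f"{key}={metadata[key]}" for key in sorted(metadata))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


# Hypothetical metadata fields for two builds of an executable.
build_a = {"name": "app.exe", "version": "1.0.0", "size": 204800}
build_b = {"name": "app.exe", "version": "1.0.1", "size": 204800}

id_a = metadata_hash_id(build_a)
id_b = metadata_hash_id(build_b)
```

A server could then use such an identifier as a lookup key for the executable, which is the role the abstract gives it.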
  • Publication number: 20130097129
    Abstract: A method of dynamically performing data transformations on information that is transmitted between a user device and a web service may include receiving interface code from the web service, receiving an input from the user device that identifies a data type, and a data transformation to be applied to data instances matching the data type. The method may also include causing a definition file to be stored with the data type, the data transformation, and a resource locator. The method may additionally include, in a second communication session, intercepting a transmission, accessing the definition file using the resource locator, determining whether the data instance matches the data type, causing the data transformation to be performed on the data instance to generate transformed data, and inserting the transformed data into the transmission if the data instance matches the data type.
    Type: Application
    Filed: October 17, 2012
    Publication date: April 18, 2013
    Applicant: CIPHERPOINT SOFTWARE, INC.
    Inventor: CIPHERPOINT SOFTWARE, INC.
  • Publication number: 20130097125
    Abstract: The current application is directed to automated methods and systems for processing and analyzing unstructured data. The methods and systems of the current application identify patterns and determine characteristics of, and interrelationships between, events parsed from the unstructured data without necessarily using user-provided or expert-provided contextual knowledge. In one implementation, the unstructured data is parsed into attribute-associated events, reduced by eliminating attributes of low-information content, and coalesced into nodes that are incorporated into one or more graphs, within which patterns are identified and characteristics and interrelationships determined.
    Type: Application
    Filed: March 12, 2012
    Publication date: April 18, 2013
    Applicant: VMWARE, INC.
    Inventors: Mazda A. MARVASTI, Arnak V. POGHOSYAN, Ashot N. HARUTYUNYAN, Naira M. GRIGORYAN
  • Publication number: 20130097163
    Abstract: Methods and apparatuses are provided for facilitating interaction with a geohash-indexed data set. A method may include providing a geohash-indexed data set. The method may further include determining a density map indicating a density of indexed data items of the data set for each of a plurality of geohashes. Corresponding apparatuses are also provided.
    Type: Application
    Filed: October 18, 2011
    Publication date: April 18, 2013
    Applicant: NOKIA CORPORATION
    Inventors: Matti Juhani Oikarinen, David Gordon MacMillan
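One simple way to picture a density map over a geohash-indexed data set is to count items per geohash cell at a chosen precision. The sketch below relies on the fact that truncating a geohash string yields the enclosing, coarser cell. The function name `density_map` and the sample geohashes are illustrative, and the abstract does not state that the density map is built this way.

```python
from collections import Counter


def density_map(item_geohashes, precision):
    # Truncating a geohash to `precision` characters gives the enclosing cell.
    # Items indexed more coarsely than the requested precision are skipped.
    return Counter(g[:precision] for g in item_geohashes if len(g) >= precision)


# Sample geohashes of indexed items (illustrative values).
items = ["u4pruyd", "u4pruyf", "u4prv00", "ezs42"]
cells = density_map(items, precision=5)
```

Here `cells["u4pru"]` is 2, because two of the sample items fall inside that cell.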
  • Publication number: 20130097124
    Abstract: A communication application automatically aggregates contact information. The communication application classifies contact information retrieved from data sources as either duplicate or complementary contact information to a contact. The communication application aggregates the contact information and the contact into a unified contact object by eliminating the duplicate contact information and adding the complementary contact information. The application presents the unified contact object through a user interface.
    Type: Application
    Filed: October 12, 2011
    Publication date: April 18, 2013
    Applicant: MICROSOFT CORPORATION
    Inventors: Jeremy de Souza, Mayerber Carvalho Neto, Komal Kashiramka, Ladislau Conceicao, Gustavo Andrade, Kumarswamy Valegerepura, Brendan Fields, Maithili Dandige, Song Yue Yu, Narendranath Thadkal, Govind Varshney, Chris Gallagher
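The duplicate-versus-complementary classification can be sketched as below. The sketch assumes a contact is a dictionary mapping field names to lists of values and that two values are duplicates when they match after trimming and lowercasing. That matching rule and the name `aggregate_contact` are assumptions, not details from the abstract.

```python
def aggregate_contact(contact, retrieved):
    # Copy the existing contact so the input is left unmodified.
    unified = {field: list(values) for field, values in contact.items()}
    for field, value in retrieved:
        normalized = value.strip().lower()
        existing = {v.strip().lower() for v in unified.get(field, [])}
        if normalized in existing:
            continue  # duplicate information: eliminate it
        # complementary information: add it to the unified contact
        unified.setdefault(field, []).append(value.strip())
    return unified


contact = {"email": ["ann@example.com"]}
retrieved = [
    ("email", "Ann@Example.com "),  # duplicate of the existing address
    ("phone", "555-0100"),          # complementary: a new field
    ("email", "ann@work.example"),  # complementary: a new address
]
unified = aggregate_contact(contact, retrieved)
```

A real implementation would likely need field-specific normalization, for example for phone number formats.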
  • Publication number: 20130091105
    Abstract: A system to collect and analyze performance metric data recorded in time-series measurements, converted into Unicode, and arranged into a special data structure. The performance metric data is collected by one or more probes running on machines about which data is being collected. The performance metric data is also organized into a special data structure. The data structure at the server where analysis is done has a directory for every day of performance metric data collected with a subdirectory for every resource type. Each subdirectory contains text files of performance metric data values measured for attributes in a group of attributes to which said text file is dedicated. Each attribute has its own section and the performance metric data values are recorded in time series as Unicode hex numbers as a comma delimited list. Analysis of the performance metric data is done using regular expressions.
    Type: Application
    Filed: October 5, 2011
    Publication date: April 11, 2013
    Inventors: Ajit Bhave, Arun Ramachandran, Sai Krishnam Raju Nadimpalli, Sandeep Bele