Data Indexing; Abstracting; Data Reduction (epo) Patents (Class 707/E17.002)
-
Publication number: 20130198148
Abstract: Embodiments of the present invention provide a system, method and computer program product for estimating data reduction in a file system. A method includes selecting a sample of all data from data files in the file system, wherein said sample represents a subset of all the data in the file system. The method further includes estimating a data reduction ratio by data deduplication for the file system based on said sample. The method further includes estimating a data reduction ratio by data compression for the file system based on said sample. The method further includes generating a combined data reduction estimate for the file system based on said data compression estimate and said data deduplication estimate.
Type: Application
Filed: January 27, 2012
Publication date: August 1, 2013
Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
Inventors: David D. Chambliss, Mihail C. Constantinescu, Joseph S. Glider, Maohua Lu
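The sample-based estimate described in this abstract can be illustrated with a minimal sketch. Everything here is an assumption for illustration and not the claimed method: chunks are fixed byte strings, duplicates are detected by SHA-256 digest, compression is measured with zlib, and the function name `estimate_reduction` is hypothetical. A plain random sample tends to undercount duplicates, so a real estimator would need a correction that this sketch omits.

```python
import hashlib
import random
import zlib


def estimate_reduction(chunks, sample_fraction=0.1, seed=0):
    """Estimate dedup, compression and combined ratios from a sample of chunks.

    `chunks` is a non-empty list of bytes objects (an assumption of this sketch).
    """
    rng = random.Random(seed)
    sample_size = max(1, int(len(chunks) * sample_fraction))
    sample = rng.sample(chunks, sample_size)

    # Keep one copy of each distinct chunk, keyed by its content digest.
    unique = {}
    for chunk in sample:
        unique.setdefault(hashlib.sha256(chunk).digest(), chunk)

    raw_bytes = sum(len(c) for c in sample)
    deduped_bytes = sum(len(c) for c in unique.values())
    compressed_bytes = sum(len(zlib.compress(c)) for c in unique.values())

    return {
        "dedup_ratio": raw_bytes / deduped_bytes,
        "compression_ratio": deduped_bytes / compressed_bytes,
        # Combined estimate: dedup first, then compress what remains.
        "combined_ratio": raw_bytes / compressed_bytes,
    }
```

In this sketch the combined ratio equals the product of the two individual ratios, because compression is measured only on the chunks that survive deduplication.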
-
Publication number: 20130198155
Abstract: As changes are made to a document, each change may be assigned an extended identifier comprising a globally unique identifier (GUID) component and an integer component. Upon determining that the same GUID component is used in identifiers for multiple changes, the GUID component may be mapped to a range of indices. Each index of the range of indices may then be used to represent the same GUID component in each extended identifier.
Type: Application
Filed: July 24, 2012
Publication date: August 1, 2013
Applicant: MICROSOFT CORPORATION
Inventors: Simon Peter Clarke, David Oliver, Brent James Van Minnen, Miko Arnab S. Bose
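A simplified sketch of the idea follows. It assumes one small index per GUID rather than the range of indices the abstract mentions, and the class name `GuidTable` and its methods are hypothetical; the point is only that a repeated 128-bit GUID can be replaced by a short index while the integer component is kept.

```python
import uuid


class GuidTable:
    """Maps each distinct GUID to a small integer index (simplified sketch)."""

    def __init__(self):
        self._index_of = {}  # GUID -> index
        self._guids = []     # index -> GUID

    def compact(self, change_id):
        """Replace the GUID in (guid, counter) with its table index."""
        guid, counter = change_id
        index = self._index_of.get(guid)
        if index is None:
            index = len(self._guids)
            self._index_of[guid] = index
            self._guids.append(guid)
        return (index, counter)

    def expand(self, compact_id):
        """Recover the original (guid, counter) pair."""
        index, counter = compact_id
        return (self._guids[index], counter)


table = GuidTable()
author = uuid.UUID(int=1)
first = table.compact((author, 1))
second = table.compact((author, 2))
```

Both changes share the same index, so the GUID itself is stored once in the table instead of once per change.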
-
Publication number: 20130197789
Abstract: A travel management system may include a client module to generate a request to update and/or search for data related to a trip. A database module may receive the request and communicate with a database. The database may include data organized in a trip data store table including unique keys respectively identifying trips. The database may further include index tables related to attributes of the trips and identified by the unique keys. The database module may obtain data related to the request from an index table corresponding to a unique key and forward a response to the client module.
Type: Application
Filed: May 7, 2012
Publication date: August 1, 2013
Applicant: Accenture Global Services Limited
Inventors: Saurabh BHADKARIA, Gurdeep Singh VIRDI, Sanjoy PAUL
-
Patent number: 8489132
Abstract: Disclosed are a system, method, and article of manufacture for context-enriched microblog posting. In one aspect, a message component is provided. A context data related to a context of a computing device used to generate the message component is provided. The message component and the context data are associated. The context data may be communicated to a web browser. The message component may be communicated to the web browser. The message component may be rendered in a format for communication as a short message service (SMS) message that includes a reference to the context data. The message component and the context data may be rendered in a format for communication as a multimedia messaging service (MMS) message.
Type: Grant
Filed: April 29, 2010
Date of Patent: July 16, 2013
Assignee: Buckyball Mobile Inc.
Inventors: Amit Karmarkar, Richard Peters
-
Publication number: 20130179411
Abstract: For on-line separation of data chunks for compression, unrelated data chunks are classified based on various attributes. The classified data chunks are sent to at least one available compression context. The classified data chunks are related. The classified data chunks are encoded by at least one of the compression operations. A compression ratio is achieved and included as feedback.
Type: Application
Filed: June 27, 2012
Publication date: July 11, 2013
Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
Inventors: Jonathan AMIT, Ori SHALEV
-
Publication number: 20130179409
Abstract: For on-line separation of data chunks for compression, unrelated data chunks are classified based on various attributes. The classified data chunks are sent to at least one available compression context. The classified data chunks are related. The classified data chunks are encoded by at least one of the compression operations. A compression ratio is achieved and included as feedback.
Type: Application
Filed: January 6, 2012
Publication date: July 11, 2013
Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
Inventors: Jonathan AMIT, Ori SHALEV
-
Publication number: 20130179407
Abstract: Apparatus, methods, and other embodiments associated with de-duplication seeding are described. One example method includes re-configuring a data de-duplication repository with a blocklet from a data de-duplication seed corpus. Reconfiguring the repository may include adding a blocklet from the seed corpus to the repository, activating a blocklet identified with the seed corpus in the repository, removing a blocklet from the repository, and de-activating a blocklet in the repository. The example method may also include re-configuring a data de-duplication index associated with the data de-duplication repository with information about the blocklet. Reconfiguring the repository and the index increases the likelihood that a blocklet ingested by a data de-duplication apparatus that relies on the repository and the index will be treated as a duplicate blocklet by the data de-duplication apparatus.
Type: Application
Filed: January 11, 2012
Publication date: July 11, 2013
Applicant: Quantum Corporation
Inventor: Timothy STOAKES
-
Publication number: 20130179729
Abstract: An approach to providing failure protection in a loosely-coupled cluster environment. A node in the cluster generates checkpoints of application data in a consistent state for an application that is running on a first node in the cluster. The node sends the checkpoint to one or more of the other nodes in the cluster. The node may also generate log entries of changes in the application data that occur between checkpoints of the application data. The node may send the log entries to other nodes in the cluster. The node may similarly receive external checkpoints and external log entries from other nodes in the cluster. In response to a node failure, the node may start an application on the failed node and recover the application using the external checkpoints and external log entries for the application.
Type: Application
Filed: January 5, 2012
Publication date: July 11, 2013
Applicant: International Business Machines Corporation
Inventors: Lawrence Y. Chiu, Shan Fan, Yang Liu, Mei Mei, Paul H. Muench
-
Publication number: 20130173564
Abstract: A system and method for compressing and decompressing multiple types of character data. The system and method employ multiple encoding tables, each designed for encoding a subset of character data, such as numeric data, uppercase letters, lowercase letters, Latin, or UNICODE data, to perform compression and decompression of character data. The character encoding tables are smaller than the size of the alphabet of the uncompressed strings.
Type: Application
Filed: March 8, 2012
Publication date: July 4, 2013
Inventors: Gary Roberts, Guilian Wang, Frederick Kaufmann
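A minimal sketch of table-based character compression follows. The three tables, the two-byte header (table id and character count), and the UTF-8 fallback marker 255 are all assumptions made for illustration, not details from the application. Each table is smaller than the full alphabet, so a character needs only 4 or 5 bits instead of 8 or more.

```python
import string

# Hypothetical encoding tables, each covering one subset of characters.
TABLES = [string.digits, string.ascii_uppercase, string.ascii_lowercase]
RAW = 255  # marker for strings that fit no table


def _bits(table):
    """Bits needed to address every entry of the table."""
    return max(1, (len(table) - 1).bit_length())


def encode(text):
    for table_id, table in enumerate(TABLES):
        if len(text) <= 255 and all(ch in table for ch in text):
            width = _bits(table)
            packed = 0
            for ch in text:
                packed = (packed << width) | table.index(ch)
            nbytes = (len(text) * width + 7) // 8
            return bytes([table_id, len(text)]) + packed.to_bytes(nbytes, "big")
    return bytes([RAW]) + text.encode("utf-8")


def decode(blob):
    if blob[0] == RAW:
        return blob[1:].decode("utf-8")
    table = TABLES[blob[0]]
    width, count = _bits(table), blob[1]
    packed = int.from_bytes(blob[2:], "big")
    mask = (1 << width) - 1
    # Characters were packed most-significant first.
    return "".join(
        table[(packed >> ((count - 1 - i) * width)) & mask] for i in range(count)
    )
```

With these assumptions an 11-digit string packs into 6 payload bytes plus the 2-byte header, while mixed-content strings fall back to plain UTF-8.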
-
Publication number: 20130173588
Abstract: Techniques for updating join indexes are provided. A determination is made to update date criteria in a join index query statement. The join index is parsed for current date and current time criteria. The join index is revised based on the location of the current date and current time criteria as they appear in the original join index. The revisions include new criteria that minimize the effort in maintaining and using the join index.
Type: Application
Filed: December 28, 2011
Publication date: July 4, 2013
Applicant: Teradata US, Inc.
Inventors: Xiaobin Ma, Grace Kwan-On Au, Lu Ma
-
Publication number: 20130173553
Abstract: A distributed, cloud-based storage system provides a reliable, deduplicated, scalable and high performance backup service to heterogeneous clients that connect to it via a communications network. The distributed cloud-based storage system guarantees consistent and reliable data storage while using structured storage that lacks ACID compliance. Consistency and reliability are guaranteed using a system that includes: 1) back references from shared objects to referring objects, 2) safe orders of operation for object deletion and creation, and 3) simultaneous access to shared resources through sub-resources.
Type: Application
Filed: December 29, 2011
Publication date: July 4, 2013
Inventors: Anand Apte, Faisal Puthuparackat, Jaspreet Singh, Milind Borate, Shekhar S. Deshkar
-
Publication number: 20130173627
Abstract: A deduplicated data storage system provides high performance storage to heterogeneous clients that connect to it via a communications network. The deduplicated data storage system provides fast access to deduplication data by caching the most frequently accessed deduplication data in a hyperindex. Updates to the non-cached deduplication data are serialized by use of a store queue and hold queue.
Type: Application
Filed: December 29, 2011
Publication date: July 4, 2013
Inventors: Anand Apte, Jaspreet Singh, Milind Borate, Shekhar S. Deshkar
-
Publication number: 20130166522
Abstract: Calculation of aggregated values in a history database table can be optimized using an approach in which an ordered history table is accessed. The ordered history table can include a sequential listing of commit identifiers associated with updates, insertions, and/or deletions to values in the database table. The ordered history table can be traversed in a single pass to calculate an aggregation function using an optimized algorithm. The optimized algorithm can enable calculation of an aggregated metric of the values based on a selected method for tracking invalidated values to their corresponding commit identifiers. The calculated metric is generated for a current version of the database table; and promoted.
Type: Application
Filed: December 23, 2011
Publication date: June 27, 2013
Inventors: Martin Kaufmann, Norman May, Andreas Tonder, Donald Kossmann
-
Publication number: 20130166518
Abstract: Systems and methods for compression of a genomic data file are described herein. In one embodiment, genomic sequences, sequence headers, and quality sequences associated with a plurality of data streams provided in a genomic data file are identified. Each of the genomic sequences includes at least one of primary characters and secondary characters. Further, the secondary characters from each of the genomic sequences may be removed to obtain an intermediate genomic sequence file and a quality score corresponding to the secondary character may be modified in quality sequences to obtain an intermediate quality sequence file. Based on the intermediate genomic sequence file and the intermediate quality sequence file, a modified genomic sequence file and a modified quality sequence file, respectively, are generated. A compressed genomic data file is obtained using at least the modified genomic sequence and the modified quality sequence.
Type: Application
Filed: March 23, 2012
Publication date: June 27, 2013
Applicant: TATA CONSULTANCY SERVICES LIMITED
Inventors: Sharmila Shekhar MANDE, Monzoorul Haque MOHAMMED, Anirban DUTTA, Tungadri BOSE
-
Publication number: 20130166471
Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, can aggregate content from one or more electronic publication documents. In one aspect, a method includes obtaining information including an index to content items of an electronic publication including a discrete package of received data that is stored locally, the content items being less than all content in the electronic publication received; retrieving, by a computer based on one or more criteria, two or more of the content items provided in disparate portions of the electronic publication, including the discrete package of received data, using the index; and presenting the two or more of the content items together on an output device in a user interface format different from that of the electronic publication.
Type: Application
Filed: September 27, 2010
Publication date: June 27, 2013
Applicant: ADOBE SYSTEMS INCORPORATED
Inventors: Yohko Aurora Fukuda Kelley, Bruce Chester Bell
-
Publication number: 20130166502
Abstract: This document describes, in various implementations, segmenting data of a database cluster into a plurality of segments, the data including a plurality of tuples, each segment including at least one of the tuples, and distributing the plurality of segments among nodes of the database cluster. Rebalancing of the data of the database cluster may be achieved by copying at least one of the plurality of segments from a source node of the database cluster to a destination node of the database cluster.
Type: Application
Filed: December 23, 2011
Publication date: June 27, 2013
Inventor: Stephen Gregory WALKAUSKAS
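Segment-level rebalancing can be sketched as follows. This is an illustrative model under stated assumptions: tuples are assigned to a fixed number of segments by a CRC32 hash of their key, the initial placement is round-robin, and "copying a segment" is represented by changing its entry in an ownership map. The `Cluster` class and its method names are hypothetical.

```python
import zlib
from collections import Counter


class Cluster:
    """Toy model: tuples live in segments, and segments are owned by nodes."""

    def __init__(self, nodes, num_segments=12):
        self.num_segments = num_segments
        # Round-robin initial placement of segments on nodes.
        self.owner = {s: nodes[s % len(nodes)] for s in range(num_segments)}
        self.rows = {s: [] for s in range(num_segments)}

    def insert(self, key, row):
        """Place a tuple in the segment chosen by a hash of its key."""
        seg = zlib.crc32(key.encode("utf-8")) % self.num_segments
        self.rows[seg].append((key, row))
        return seg

    def add_node(self, new_node):
        """Rebalance by moving whole segments from overloaded nodes."""
        counts = Counter(self.owner.values())
        target = self.num_segments // (len(counts) + 1)
        moved = []
        for seg in sorted(self.owner):
            if len(moved) >= target:
                break
            source = self.owner[seg]
            if counts[source] > target:
                self.owner[seg] = new_node  # stands in for the segment copy
                counts[source] -= 1
                moved.append(seg)
        return moved
```

Because a tuple's segment never changes, adding a node moves only whole segments and no tuple has to be rehashed.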
-
Publication number: 20130159263
Abstract: A method for compressing a cloud of points with imposed error constraints at each point is disclosed. Surfaces are constructed that approach each point to within the constraint specified at that point, and from the plurality of surfaces that satisfy the constraints at all points, a surface is chosen which minimizes the amount of memory required to store the surface on a digital computer.
Type: Application
Filed: December 18, 2011
Publication date: June 20, 2013
Applicant: Numerica Corporation
Inventors: Randy C. Paffenroth, Ryan Nong, Woody D. Leed, Scott M. Lundberg
-
Publication number: 20130159255
Abstract: Provided is a storage system providing a data storage area to an external apparatus. The storage system includes at least a first information processing apparatus including a first logical storage area forming the data storage area and a first data processing part performing processing of reducing the storage capacity used by backup target data in the first logical storage area, and a second information processing apparatus communicatively coupled to the first information processing apparatus and including a second logical storage area forming the data storage area, and a second data processing part performing processing of reducing the storage capacity used by the backup target data in the second logical storage area.
Type: Application
Filed: December 20, 2011
Publication date: June 20, 2013
Inventors: Shigeru Kaga, Mikito Ogata, Mamoru Sato
-
Publication number: 20130159281
Abstract: Embodiments are directed to replicating database tables for efficient data querying and to using a background task to update a database index table on a periodic basis. In one scenario, a computer system accesses an existing, original time-based database table that includes various entities and properties for each entity. Each entity also includes a time stamp value. The computer system receives an indication that the new index table is to be indexed according to a user-specified property and sorts the new index table based on both the value of the user-specified property and the time stamp value of the entity to which the user-specified property belongs. The computer system then periodically copies the entities and associated properties of the original time-based database table into a new database index table.
Type: Application
Filed: December 15, 2011
Publication date: June 20, 2013
Applicant: MICROSOFT CORPORATION
Inventors: Jinlin Yang, Michael Y. Levin
-
Publication number: 20130159629
Abstract: A system and method are disclosed for storing data in a hash table. The method includes receiving data, determining a location identifier for the data, wherein the location identifier identifies a location in the hash table for storing the data and the location identifier is derived from the data, compressing the data by extracting the location identifier, and storing the compressed data in the identified location of the hash table.
Type: Application
Filed: November 15, 2012
Publication date: June 20, 2013
Applicant: STEC, INC.
Inventor: STEC, Inc.
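One way to read this abstract is as a form of quotienting: the bits that select the slot are implied by the slot's position, so they need not be stored. The sketch below assumes non-negative integer values and takes the low `index_bits` bits as the location identifier; the class name `QuotientedTable` and that choice of bits are assumptions for illustration only.

```python
class QuotientedTable:
    """Hash table that stores only the bits not implied by the slot index."""

    def __init__(self, index_bits=8):
        self.index_bits = index_bits
        self.mask = (1 << index_bits) - 1
        self.buckets = [[] for _ in range(1 << index_bits)]

    def insert(self, value):
        location = value & self.mask           # location identifier, derived from the data
        remainder = value >> self.index_bits   # data with the identifier extracted
        if remainder not in self.buckets[location]:
            self.buckets[location].append(remainder)
        return location

    def contains(self, value):
        return (value >> self.index_bits) in self.buckets[value & self.mask]

    def items(self):
        """Rebuild every stored value from its slot index and remainder."""
        for location, bucket in enumerate(self.buckets):
            for remainder in bucket:
                yield (remainder << self.index_bits) | location
```

With 8 index bits, each stored entry is 8 bits shorter than the original value, and the value is still fully recoverable from its position in the table.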
-
Publication number: 20130151483
Abstract: Example apparatus and methods associated with adaptive experience based de-duplication are provided. One example data de-duplication apparatus includes a de-duplication logic, an experience logic, and a reconfiguration logic. The de-duplication logic may be configured to perform data de-duplication according to a configurable approach that is a function of a pre-defined constraint. The experience logic may be configured to acquire de-duplication performance experience data. The reconfiguration logic may be configured to selectively reconfigure the configurable approach on the apparatus as a function of the de-duplication performance experience data. In different examples, dynamic reconfiguration may be performed locally and/or in a distributed manner based on local and/or distributed data that is acquired on a per actor (e.g., user, application) basis and/or on a per entity (e.g., computer, data stream) basis.
Type: Application
Filed: December 7, 2011
Publication date: June 13, 2013
Applicant: Quantum Corporation
Inventor: Jeffrey Tofano
-
Publication number: 20130151924
Abstract: A data handling system includes a compressive sensing unit that receives a source data file. A sparseness module within the compressive sensing unit generates a sparse source data file by inducing sparseness into the source data file. A measurement module within the compressive sensing unit generates a compressed sensed source data file from the sparse source data file and based on a sensing matrix. The compressed sensed source data file is to be transmitted to a remote data storage facility for storage. A recovery unit generates the source data file from the compressed sensed source data file retrieved from the remote data storage facility and based upon the sensing matrix.
Type: Application
Filed: December 8, 2011
Publication date: June 13, 2013
Applicant: Harris Corporation, Corporation of the State of Delaware
Inventors: Edward R. Beadle, Charles Zahm
-
Publication number: 20130144846
Abstract: A method includes receiving a request to save a first file as immutable. The method also includes searching for a second file that is saved and is redundant to the first file. The method further includes determining the second file is one of mutable and immutable. When the second file is mutable, the method includes saving the first file as a master copy, and replacing the second file with a soft link pointing to the master copy. When the second file is immutable, the method includes determining which of the first and second files has a later expiration date and an earlier expiration date, saving the one of the first and second files with the later expiration date as a master copy, and replacing the one of the first and second files with the earlier expiration date with a soft link pointing to the master copy.
Type: Application
Filed: December 2, 2011
Publication date: June 6, 2013
Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
Inventors: Gaurav CHHAUNKER, Bhushan P. JAIN, Sandeep R. PATIL, Sri RAMANATHAN, Matthew B. TREVATHAN
-
Publication number: 20130144845
Abstract: A method implemented in a computer infrastructure including a combination of hardware and software includes receiving from a local computing device a request to securely delete a file. The method also includes determining the file is deduplicated. The method further includes determining one of: the file is referred to by at least one other file, and the file is not referred to by another file. The method additionally includes securely deleting links associating the file with the local computing device without deleting the file when the file is referred to by at least one other file. The method also includes securely deleting the file when the file is not referred to by another file.
Type: Application
Filed: December 2, 2011
Publication date: June 6, 2013
Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
Inventors: Deepak R. GHUGE, Bhushan P. JAIN, Sandeep R. PATIL, Sri RAMANATHAN, Matthew B. TREVATHAN
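The decision logic in this abstract can be sketched with an in-memory store. The `DedupStore` class, its link table keyed by (owner, name), and the zero-overwrite step are assumptions made for illustration; overwriting a Python `bytearray` only stands in for secure erasure and does not by itself guarantee that data is unrecoverable on real storage media.

```python
import hashlib


class DedupStore:
    """Toy deduplicated store: one blob per distinct content, many links."""

    def __init__(self):
        self.blobs = {}  # content digest -> bytearray holding the content
        self.links = {}  # (owner, name) -> content digest

    def put(self, owner, name, data):
        digest = hashlib.sha256(data).hexdigest()
        self.blobs.setdefault(digest, bytearray(data))
        self.links[(owner, name)] = digest
        return digest

    def secure_delete(self, owner, name):
        digest = self.links.pop((owner, name))
        if digest in self.links.values():
            # Content is still referred to elsewhere: remove only this link.
            return "link removed"
        blob = self.blobs.pop(digest)
        for i in range(len(blob)):
            blob[i] = 0  # placeholder for a real secure-erase step
        return "content erased"
```

The shared content is erased only when the last link to it is deleted, which matches the two branches described in the abstract.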
-
Publication number: 20130138658
Abstract: Indexes for predefined search orders of items in a database are generated and stored. When a client issues a database query, a responsive pre-generated index list is retrieved and provided to the client for use in, e.g., populating a U/I view for a user. Only those items that a client needs, e.g., for populating a current U/I view, are retrieved from the database and output to the client. When a change is rendered to the database, e.g., an item is added or deleted or an existing item is altered, only the change is output to the client, rather than the entire modified index or altered item. In this manner clients can more quickly and efficiently respond to user data query requests by performing some processing upfront and by limiting communications traffic to communications relevant to the client's current processing.
Type: Application
Filed: November 29, 2011
Publication date: May 30, 2013
Applicant: Microsoft Corporation
Inventors: Mark S. Flick, Ying Ding
-
Publication number: 20130132400
Abstract: Provided are techniques for incrementally integrating and persisting context over an available observational space. At least one feature associated with a new observation is used to create at least one index key. The at least one index key is used to query one or more reverse lookup tables to locate at least one previously persisted candidate observation. The new observation is evaluated against the at least one previously persisted candidate observation to determine at least one relationship. In response to determining the at least one relationship, a threshold is used to make a new assertion about the at least one relationship. The new observation is used to review previous assertions to determine whether a previous assertion is to be reversed. In response to reversing the previous assertion, the new observation, the new assertion, and the reversed assertion are incrementally integrated into persistent context.
Type: Application
Filed: September 13, 2012
Publication date: May 23, 2013
Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
Inventors: Gregery G. ADAIR, Robert J. BUTCHER, Jeffrey J. JONAS
-
Publication number: 20130132353
Abstract: The present subject matter discloses a system and a method for compression of genomic data. In one embodiment, the method for compression of genomic data includes obtaining modified genomic data from genomic data based at least in part on intermediary data identified from the genomic data. In one implementation, the modified genomic data includes a plurality of primary characters. The genomic data may then be modified to generate one or more most-frequent character files based at least on a most-frequent character and a second most-frequent character from among the plurality of primary characters. Further, based at least on the one or more most-frequent character files and the modified genomic data, a least-frequent characters file may be created from the modified genomic data.
Type: Application
Filed: March 23, 2012
Publication date: May 23, 2013
Applicant: TATA CONSULTANCY SERVICES LIMITED
Inventors: Sharmila Shekhar MANDE, Monzoorul Haque MOHAMMED, Anirban DUTTA, Tungadri BOSE, Sudha CHADARAM
-
Publication number: 20130132397
Abstract: An apparatus for generating indexes of data may include a processor and memory storing executable computer code causing the apparatus to at least perform operations including obtaining an order number responsive to receipt of a request from a device to index an item(s) of data. The computer program code may further cause the apparatus to map the order number to a key value and link the key value to the data and provide one or more index entries to a memory device to enable storage of the index entries. The index entries may include information corresponding to the key value and the data. The computer program code may further cause the apparatus to assign a new index row(s) including the data for inclusion in a set of index rows of a designated partition(s) to obtain a built index(es) of the data. Corresponding methods and computer program products are also provided.
Type: Application
Filed: November 18, 2011
Publication date: May 23, 2013
Applicant: NOKIA CORPORATION
Inventors: David Gordon MacMillan, Matti Juhani Oikarinen
-
Publication number: 20130132399
Abstract: Provided are techniques for incrementally integrating and persisting context over an available observational space. At least one feature associated with a new observation is used to create at least one index key. The at least one index key is used to query one or more reverse lookup tables to locate at least one previously persisted candidate observation. The new observation is evaluated against the at least one previously persisted candidate observation to determine at least one relationship. In response to determining the at least one relationship, a threshold is used to make a new assertion about the at least one relationship. The new observation is used to review previous assertions to determine whether a previous assertion is to be reversed. In response to reversing the previous assertion, the new observation, the new assertion, and the reversed assertion are incrementally integrated into persistent context.
Type: Application
Filed: November 22, 2011
Publication date: May 23, 2013
Applicant: International Business Machines Corporation
Inventors: Gregery G. Adair, Robert J. Butcher, Jeffrey J. Jonas
-
Publication number: 20130124489
Abstract: A method, computer program product and system for compressing a multivariate dataset. A dataset is selected that includes a plurality of variates. A first compression method is applied to the values of a first variate of the dataset. A second compression method is applied to the values of a second variate of the dataset, where the second compression method is arranged to compress the second variate values relative to the variation of the corresponding first variate values.
Type: Application
Filed: October 16, 2012
Publication date: May 16, 2013
Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
Inventor: International Business Machines Corporation
-
Publication number: 20130124503
Abstract: A device relating to a delta indexing method for a hierarchy file storage including a front-end side file server and a back-end side file server is provided. The front-end side file server creates a file update list for accumulating a file update history in a file system therein, a search server requests the file update list from the front-end side file server, and the front-end side file server supplies path name information of a targeted file in the back-end side file server in addition to the file update list, whereby the search server accesses the back-end side file server and is able to acquire the information necessary for a search index update.
Type: Application
Filed: July 12, 2012
Publication date: May 16, 2013
Inventor: YOHSUKE ISHII
-
Publication number: 20130117302
Abstract: An apparatus and method for searching for index-structured data including a memory-based summary vector are disclosed. The apparatus for searching for index-structured data including a memory-based summary vector includes a storage unit configured to store a full index and data related to a key; and a key lookup engine configured to include not only a summary vector but also an index storing information related to the full index, search for data stored in the storage unit through the index, and return the searched result.
Type: Application
Filed: November 2, 2012
Publication date: May 9, 2013
Applicant: Electronics and Telecommunications Research Institute
Inventor: Electronics and Telecommunications Research In
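The abstract does not say how the summary vector is built. The sketch below assumes it behaves like a Bloom filter held in memory, which can rule out absent keys before the slower full index is consulted. The class names, the bit-vector size, and the use of a Python dict as a stand-in for the storage-resident full index are all hypothetical.

```python
import hashlib


class SummaryVector:
    """Bloom-filter-style bit vector (an assumed realisation)."""

    def __init__(self, num_bits=1024, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = 0

    def _positions(self, key):
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{key}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, key):
        for pos in self._positions(key):
            self.bits |= 1 << pos

    def might_contain(self, key):
        return all((self.bits >> pos) & 1 for pos in self._positions(key))


class KeyLookupEngine:
    def __init__(self):
        self.summary = SummaryVector()
        self.full_index = {}    # stand-in for the full index on storage
        self.storage_reads = 0  # counts lookups that reach the full index

    def put(self, key, value):
        self.summary.add(key)
        self.full_index[key] = value

    def get(self, key):
        if not self.summary.might_contain(key):
            return None         # definitely absent, no storage access needed
        self.storage_reads += 1
        return self.full_index.get(key)
```

A filter of this kind can report false positives but never false negatives, so a lookup that passes the filter still has to be confirmed against the full index.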
-
Publication number: 20130117272
Abstract: Data management techniques are provided for handling of big data. A data management process can account for attributes of data by analyzing or interpreting the data, assigning intervals to the attributes based on the data, and effectuating policies, based on the attributes and intervals, that facilitate data management. In addition, the data management process can determine relations among data in a data collection and generate and store approximate results concerning the data based on the attributes, intervals, and the policies.
Type: Application
Filed: November 3, 2011
Publication date: May 9, 2013
Applicant: MICROSOFT CORPORATION
Inventors: Roger Barga, Alexander Sasha Stojanovic, Henricus Johannes Maria Meijer, Carl Carter-Schwendler, Michael Isard
-
Publication number: 20130117242
Abstract: Systems, methods and devices for adaptively suppressing data items based on one or more dynamic characteristics of the data items are disclosed. Adaptive data suppression of an operational data item may be accomplished by monitoring the operational data item for one or more dynamic characteristics required by a data aging rule associated with the operational data item, wherein at least one of the database and operational data item are stored in memory, detecting the one or more dynamic characteristics required by the data aging rule, recording the one or more detected dynamic characteristics and assessing whether the one or more detected dynamic characteristics satisfy the data aging rule. If a data aging rule is satisfied, the operational data item may be suppressed to persistent data storage. Related systems, methods, and articles of manufacture are also described.
Type: Application
Filed: November 9, 2011
Publication date: May 9, 2013
Applicant: SAP, AG
Inventors: Marcel Kassner, Ole Krueger
-
Publication number: 20130117241
Abstract: Replay of data transactions is initiated in a data storage application. Pages of a log segment directory characterizing metadata for a plurality of log segments are loaded into memory. Thereafter, redundant pages within the log segment directory are removed. It is then determined, based on the log segment directory, which log segments need to be accessed. These log segments are accessed to execute the log replay. Related apparatus, systems, techniques and articles are also described.
Type: Application
Filed: November 7, 2011
Publication date: May 9, 2013
Inventor: Ivan Schreter
-
Publication number: 20130117274
Abstract: In an address book management method of an electronic device, a directory of members of an address book is created. A communication bulk and a communication count of each of the members listed in the address book are obtained. The communication bulk for each member is a total quantity of electronic communication in a predetermined time period between a predetermined user of the electronic device and the member, the total quantity measured according to a predetermined criterion, and the communication count for each member is a total number of occasions of electronic communication between the user and the member in the predetermined time period. An accumulative contact quantity index of each member is calculated according to the calculated communication bulk and communication count of the member. Thus, the members in the directory of the address book are ordered according to the accumulative contact quantity indexes.
Type: Application
Filed: September 26, 2012
Publication date: May 9, 2013
Applicants: FIH (HONG KONG) LIMITED, SHENZHEN FUTAIHONG PRECISION INDUSTRY CO., LTD.
Inventor: JIAN-HUI LI
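The ordering step can be sketched as below. The abstract does not give the formula for the accumulative contact quantity index, so the weighted sum used here, the weights, and the `Member` fields are assumptions made only to show the sort.

```python
from dataclasses import dataclass


@dataclass
class Member:
    name: str
    bulk: float  # total quantity of communication in the period (e.g. minutes)
    count: int   # number of communication occasions in the period


def contact_index(member, bulk_weight=1.0, count_weight=1.0):
    """Assumed index: a weighted sum of communication bulk and count."""
    return bulk_weight * member.bulk + count_weight * member.count


def order_directory(members, **weights):
    """Return members sorted from most to least contacted."""
    return sorted(members, key=lambda m: contact_index(m, **weights), reverse=True)


members = [Member("Ann", 10.0, 2), Member("Bo", 3.0, 20), Member("Cy", 1.0, 1)]
ordered_names = [m.name for m in order_directory(members)]
```

With equal weights, Bo (index 23) is listed before Ann (index 12) and Cy (index 2); different weights would change how bulk and count trade off.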
-
Publication number: 20130110793
Abstract: Embodiments of the present invention provide an approach that utilizes discrete event simulation to quantitatively analyze the reliability of a modeled de-duplication system in a computer storage environment. In addition, the approach described herein can perform such an analysis on systems having heterogeneous data stored on heterogeneous storage systems in the presence of primary faults and their secondary effects due to de-duplication. In a typical embodiment, data de-duplication parameters and a hardware configuration are received in a computer storage medium. A data de-duplication model is then applied to a set of data and to the data de-duplication parameters, and a hardware reliability model is applied to the hardware configuration. Then a set (at least one) of discrete events is simulated based on the data de-duplication model as applied to the set of data and the data de-duplication parameters, and the hardware reliability model as applied to the hardware configuration.
Type: Application
Filed: November 1, 2011
Publication date: May 2, 2013
Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
Inventors: Kavita Chavda, Eric W. Davis Rozier, Nagapramod S. Mandagere, Sandeep M. Uttamchandani, Pin Zhou
-
Publication number: 20130110842Abstract: A system may include multiple personal data sources and a machine-implemented data extractor and correlator configured to retrieve personal data from at least one of the personal data sources. The data extractor and correlator may extract information from unstructured data within the retrieved personal data and correlate the extracted information with previously stored structured data to generate additional structured data. The system may also include a storage device configured to store the previously stored structured data and the additional structured data. A natural language query module may be configured to receive a natural language query from a user and provide a response to the natural language query based at least in part on one or both of the previously stored structured data and the additional structured data.Type: ApplicationFiled: November 2, 2011Publication date: May 2, 2013Applicant: SRI INTERNATIONALInventors: Thierry Donneau-Golencer, Rajan Singh, Madhu Yarlagadda, Corey Hulen, Kenneth C. Nitz, William Scott Mark
-
Publication number: 20130110841Abstract: An approach is provided for querying media based on media characteristics. A media platform processes and/or facilitates a processing of one or more images, one or more videos, or a combination thereof to determine one or more latent vectors associated with the one or more images, the one or more videos, or the combination thereof. The media platform further causes, at least in part, a comparison of the one or more latent vectors to one or more models. The media platform also causes, at least in part, an indexing of the one or more images, the one or more videos, or the combination thereof based, at least in part, on the one or more latent vectors, the one or more models, or a combination thereof.Type: ApplicationFiled: October 31, 2011Publication date: May 2, 2013Applicant: Nokia CorporationInventors: Sailesh Kumar SATHISH, Igor Danilo Diego CURCIO
-
Publication number: 20130110827Abstract: Systems, computer-readable media, and methods for utilizing information pertaining to one or more individuals or entities with which a user has at least one social networking relationship are provided. A search engine is configured to receive a query, to identify matching electronic documents, to rank the electronic documents, and to transmit the matching electronic documents and/or advertisements to the user in response to receiving a query. Upon receiving the query from a user, the search engine obtains a social network identifier of the user and utilizes information about the user's social networking relationships to augment the query with nonretrieval modifiers. The search engine processes the nonretrieval modifiers matching the electronic documents included in search results and ranks the results but does not use the nonretrieval modifiers to identify or retrieve results matching the query. The ranked electronic documents are included in the results and displayed in rank order to the user.Type: ApplicationFiled: October 26, 2011Publication date: May 2, 2013Applicant: MICROSOFT CORPORATIONInventors: SHUBHA NABAR, RAJESH KRISHNA SHENOY
-
Publication number: 20130110794Abstract: An apparatus and method for stably filtering duplicate data in various resource-restricted environments such as a mobile device and medical equipment are provided. The apparatus includes a cell array unit configured to comprise one or more cells; a duplication check unit configured to check whether input data is duplicate and set a value of a cell that matches the input data; and a duplication probability calculation unit configured to, in response to the input data being determined as duplicate data by the duplication check unit, calculate a probability of duplication of the input data using the set value of the cell. Data which may be duplicate data among a large amount of input data is not arbitrarily deleted, but is provided to an application along with a probability of duplication of the data. Accordingly, a false positive error that occurs in a Bloom filter is prevented, and thereby system stability can be improved.Type: ApplicationFiled: April 30, 2012Publication date: May 2, 2013Applicant: Samsung Electronics Co., LtdInventor: Chun-Hee Lee
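The idea of returning a probability with each suspected duplicate, rather than silently dropping it, can be sketched as follows. This is not the claimed apparatus: the cell layout, the hash scheme and the confidence heuristic (the complement of the usual Bloom-filter false-positive estimate, fill ratio raised to the number of hashes) are all assumptions made for illustration.

```python
import hashlib


class ProbabilisticDuplicateFilter:
    def __init__(self, num_cells=1024, num_hashes=3):
        self.cells = [0] * num_cells
        self.num_hashes = num_hashes

    def _positions(self, item):
        # Derive num_hashes cell positions from salted SHA-256 digests.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{item}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % len(self.cells)

    def check(self, item):
        """Return (possibly_duplicate, rough_confidence) and record the item."""
        positions = list(self._positions(item))
        seen = all(self.cells[p] for p in positions)
        probability = 0.0
        if seen:
            # Heuristic: 1 minus the estimated false-positive rate of the filter.
            fill = sum(1 for c in self.cells if c) / len(self.cells)
            probability = 1.0 - fill ** self.num_hashes
        for p in positions:
            self.cells[p] = 1
        return seen, probability
```

The caller receives the item together with the confidence value and decides what to do, so a false positive does not cause data loss.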
-
Publication number: 20130103694Abstract: In one embodiment, a method comprises identifying prefix groups for searchable character symbols, each prefix group having a corresponding searchable character symbol comprising at least one searchable character; assigning at least one prefix group to each of a plurality of distributed hash table nodes in a network, each distributed hash table node containing at least one of the prefix groups, each distributed hash table node assigned a corresponding prescribed keyspace range of a prescribed keyspace, each distributed hash table node configured for storing data records having respective primary data record keys within the corresponding prescribed keyspace range; and assigning secondary indexes that start with one of the searchable character symbols to the corresponding prefix group in the associated distributed hash table node, enabling any prefix search starting with the one searchable character symbol to be directed to the corresponding prefix group in the associated distributed hash table node.Type: ApplicationFiled: October 25, 2011Publication date: April 25, 2013Applicant: CISCO TECHNOLOGY, INC.Inventors: Steven Vincent LUONG, Manish BHARDWAJ, Jiang ZHU, Huida DAI
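A toy, in-memory sketch of the routing idea: each searchable symbol gets a prefix group, each group lives on one node, and a prefix search is sent only to the node holding the group for the first symbol. The round-robin assignment, the single-character symbols and the class and node names are assumptions; the keyspace ranges for primary keys are omitted.

```python
import string


def assign_prefix_groups(symbols, nodes):
    # Assumed policy: round-robin assignment of one prefix group per symbol.
    return {sym: nodes[i % len(nodes)] for i, sym in enumerate(symbols)}


class PrefixIndex:
    def __init__(self, nodes, symbols=string.ascii_lowercase):
        self.group_to_node = assign_prefix_groups(symbols, nodes)
        # node -> {secondary index value: primary record key}
        self.store = {node: {} for node in nodes}

    def add(self, secondary_index, primary_key):
        node = self.group_to_node[secondary_index[0]]
        self.store[node][secondary_index] = primary_key

    def prefix_search(self, prefix):
        # Only the node that owns the first symbol's prefix group is consulted.
        node = self.group_to_node[prefix[0]]
        hits = sorted(k for k in self.store[node] if k.startswith(prefix))
        return node, hits


index = PrefixIndex(["n0", "n1", "n2"])
index.add("alice", 1)
index.add("alan", 2)
index.add("bob", 3)
```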
-
Publication number: 20130100296Abstract: A method of distributing media content includes capturing an image of a static media content, detecting at least one feature in the image, seeking a correlation of the image to a reference image using the at least one feature, and identifying at least one region of dynamic media content of the reference image in the image of the static media content.Type: ApplicationFiled: November 18, 2011Publication date: April 25, 2013Inventors: Feng Tang, Daniel R. Tretter, Qian Lin
-
Publication number: 20130103691Abstract: A technique includes, in response to an access to a database involving access to a table and specifying a natural key, using the database to translate the natural key to a surrogate key based at least in part on a mappingType: ApplicationFiled: October 19, 2011Publication date: April 25, 2013Inventor: Rohit N. Jain
-
Publication number: 20130103651Abstract: In one embodiment, a server may identify an executable file using a hash identifier. The server may compute a hash identifier based on a file metadata set associated with an executable file. The server may identify the executable file using the hash identifier.Type: ApplicationFiled: October 23, 2011Publication date: April 25, 2013Applicant: Microsoft CorporationInventors: Pradeep Jha, Michal Strehovsky, Bruce Chhay, Josh Carroll
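One way to derive a stable identifier from a metadata set, sketched under assumptions: the metadata fields, the JSON canonicalization and the choice of SHA-256 are illustrative and are not stated in the abstract.

```python
import hashlib
import json


def hash_identifier(metadata):
    # Canonicalize so that key order does not change the identifier.
    canonical = json.dumps(metadata, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


# Hypothetical metadata sets for two builds of the same executable.
build_a = {"name": "tool.exe", "version": "1.0.0", "size": 20480}
build_a_reordered = {"size": 20480, "version": "1.0.0", "name": "tool.exe"}
build_b = {"name": "tool.exe", "version": "1.0.1", "size": 20992}
```

Because the identifier depends only on metadata, it can be computed without reading the full file contents.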
-
Publication number: 20130097129Abstract: A method of dynamically performing data transformations on information that is transmitted between a user device and a web service may include receiving interface code from the web service, receiving an input from the user device that identifies a data type, and a data transformation to be applied to data instances matching the data type. The method may also include causing a definition file to be stored with the data type, the data transformation, and a resource locator. The method may additionally include, in a second communication session, intercepting a transmission, accessing the definition file using the resource locator, determining whether the data instance matches the data type, causing the data transformation to be performed on the data instance to generate transformed data, and inserting the transformed data into the transmission if the data instance matches the data type.Type: ApplicationFiled: October 17, 2012Publication date: April 18, 2013Applicant: CIPHERPOINT SOFTWARE, INC.Inventor: CIPHERPOINT SOFTWARE, INC.
-
Publication number: 20130097125Abstract: The current application is directed to automated methods and systems for processing and analyzing unstructured data. The methods and systems of the current application identify patterns and determine characteristics of, and interrelationships between, events parsed from the unstructured data without necessarily using user-provided or expert-provided contextual knowledge. In one implementation, the unstructured data is parsed into attribute-associated events, reduced by eliminating attributes of low-information content, and coalesced into nodes that are incorporated into one or more graphs, within which patterns are identified and characteristics and interrelationships determined.Type: ApplicationFiled: March 12, 2012Publication date: April 18, 2013Applicant: VMWARE, INC.Inventors: Mazda A. MARVASTI, Arnak V. POGHOSYAN, Ashot N. HARUTYUNYAN, Naira M. GRIGORYAN
-
Publication number: 20130097163Abstract: Methods and apparatuses are provided for facilitating interaction with a geohash-indexed data set. A method may include providing a geohash-indexed data set. The method may further include determining a density map indicating a density of indexed data items of the data set for each of a plurality of geohashes. Corresponding apparatuses are also provided.Type: ApplicationFiled: October 18, 2011Publication date: April 18, 2013Applicant: NOKIA CORPORATIONInventors: Matti Juhani Oikarinen, David Gordon MacMillan
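Because a geohash prefix names the cell that contains every longer geohash sharing it, a density map can be sketched by counting items per prefix. The function name and the sample geohashes below are illustrative assumptions.

```python
from collections import Counter


def density_map(geohashes, precision):
    # Count indexed items per geohash cell at the given prefix length.
    # Items indexed more coarsely than the requested precision are skipped.
    return Counter(g[:precision] for g in geohashes if len(g) >= precision)


items = ["u4pruyd", "u4pruyf", "u4prvxx", "ezs42"]
cells = density_map(items, precision=5)
```

Lowering `precision` merges neighbouring cells into their common parent, which gives a coarser map from the same index.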
-
Publication number: 20130097124Abstract: A communication application automatically aggregates contact information. The communication application classifies contact information retrieved from data sources as either duplicate or complementary contact information to a contact. The communication application aggregates the contact information and the contact into a unified contact object by eliminating the duplicate contact information and adding the complementary contact information. The application presents the unified contact object through a user interface.Type: ApplicationFiled: October 12, 2011Publication date: April 18, 2013Applicant: MICROSOFT CORPORATIONInventors: Jeremy de Souza, Mayerber Carvalho Neto, Komal Kashiramka, Ladislau Conceicao, Gustavo Andrade, Kumarswamy Valegerepura, Brendan Fields, Maithili Dandige, Song Yue Yu, Narendranath Thadkal, Govind Varshney, Chris Gallagher
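The classify-then-merge step can be sketched as below. The field names, the case-insensitive comparison used to detect duplicates and the `unify` function are assumptions for illustration only.

```python
def unify(contact, retrieved):
    """Merge retrieved (field, value) pairs into a copy of the contact."""
    unified = {field: list(values) for field, values in contact.items()}
    for field, value in retrieved:
        values = unified.setdefault(field, [])
        known = {v.strip().lower() for v in values}
        if value.strip().lower() in known:
            continue  # duplicate information: eliminated
        values.append(value)  # new information: added
    return unified


contact = {"email": ["ann@example.com"]}
retrieved = [("email", "Ann@Example.com"), ("phone", "555-0100")]
merged = unify(contact, retrieved)
```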
-
Publication number: 20130091105Abstract: A system to collect and analyze performance metric data recorded in time-series measurements, converted into Unicode, and arranged into a special data structure. The performance metric data is collected by one or more probes running on machines about which data is being collected. The performance metric data is also organized into a special data structure. The data structure at the server where analysis is done has a directory for every day of performance metric data collected with a subdirectory for every resource type. Each subdirectory contains text files of performance metric data values measured for attributes in a group of attributes to which said text file is dedicated. Each attribute has its own section and the performance metric data values are recorded in time series as Unicode hex numbers as a comma delimited list. Analysis of the performance metric data is done using regular expressions.Type: ApplicationFiled: October 5, 2011Publication date: April 11, 2013Inventors: Ajit Bhave, Arun Ramachandran, Sai Krishnam Raju Nadimpalli, Sandeep Bele
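How a regular expression can analyze a comma-delimited hex time series, sketched under assumptions: each sample is written as a fixed-width two-digit uppercase hex numeral (values 0 to 255), and the "sustained high" rule of three or more consecutive samples at or above 0x50 is invented for illustration. The actual encoding used in the application may differ.

```python
import re


def encode(values):
    # Assumed encoding: two-digit uppercase hex, comma delimited.
    return ",".join(f"{v:02X}" for v in values)


HIGH = r"[5-9A-F][0-9A-F]"  # any two-digit hex value >= 0x50 (80 decimal)
# Three or more consecutive high samples.
SUSTAINED_HIGH = re.compile(rf"\b{HIGH}(?:,{HIGH}){{2,}}\b")

line = encode([10, 85, 90, 200, 12, 95])
match = SUSTAINED_HIGH.search(line)
```

Fixed-width encoding is what makes the threshold expressible as a character class; variable-width numerals would need a different pattern.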