Data Indexing; Abstracting; Data Reduction (epo) Patents (Class 707/E17.002)

MINIMIZATION OF SURPRISAL DATA THROUGH APPLICATION OF HIERARCHY FILTER PATTERN

Publication number: 20130311435

Abstract: A method, computer product, and computer system of minimizing surprisal data comprising: at a source, reading and identifying characteristics of a genetic sequence of an organism; receiving an input of rank of at least two identified characteristics of the genetic sequence of the organism; generating a hierarchy of ranked, identified characteristics based on the rank of the at least two identified characteristics of the genetic sequence of the organism; comparing the hierarchy of ranked, identified characteristics to a repository of reference genomes; and if at least one reference genome from the repository matches the hierarchy of ranked, identified characteristics, breaking the matched reference genomes into pieces, combining pieces associated with the identified characteristics from at least one matched reference genome to form a filter pattern to be compared to the nucleotides of the genetic sequence of the organism, to obtain differences and create surprisal data.

Type: Application

Filed: June 8, 2012

Publication date: November 21, 2013

Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: Robert R. Friedlander, James R. Kraemer
CLUSTERING FOR HIGH AVAILABILITY AND DISASTER RECOVERY

Publication number: 20130311428

Abstract: Embodiments are directed towards managing data within a cluster environment having a plurality of indexers for data storage using redundancy, the data being managed using a generation identifier, such that a primary indexer is designated for a given generation of data. When a master device for the cluster fails, data may continue to be stored using redundancy, and data searches may still be performed.

Type: Application

Filed: October 26, 2012

Publication date: November 21, 2013

Applicant: SPLUNK INC.

Inventors: Vishal Patel, Mitchell Neuman Blank, JR., Sundar Rengarajan Vasan, Stephen Phillip Sorkin
CLUSTERING FOR HIGH AVAILABILITY AND DISASTER RECOVERY

Publication number: 20130311427

Abstract: Embodiments are directed towards managing data within a cluster environment having a plurality of indexers for data storage using redundancy, the data being managed using a generation identifier, such that a primary indexer is designated for a given generation of data. When a master device for the cluster fails, data may continue to be stored using redundancy, and data searches may still be performed.

Type: Application

Filed: October 9, 2012

Publication date: November 21, 2013

Applicant: SPLUNK INC.

Inventors: Vishal Patel, Mitchell Neuman Blank, JR., Sundar Rengarajan Vasan, Stephen Phillip Sorkin
CONTROLLING ENTERPRISE DATA ON MOBILE DEVICE VIA THE USE OF A TAG INDEX

Publication number: 20130304702

Abstract: A method, system and computer program product for controlling enterprise data on mobile devices. Data on a mobile device is tagged as being associated with either enterprise data or with personal data. Upon identifying the storage location of the tagged data and the identifier of the application that generated the tagged data, the tag, the storage location of the tagged data and the identifier of the application are stored in an index. A mobile agent residing on the mobile device may be directed by a mobile device management server of the enterprise to perform various actions (e.g., deleting, encrypting, backing-up) on the enterprise data using the index. In this manner, the enterprise has the ability to control their applications and data that resides on employees' mobile devices to ensure that such data is not lost or used in a manner that is contrary to the wishes of the employer.

Type: Application

Filed: May 14, 2012

Publication date: November 14, 2013

Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: Shalini Kapoor, Palanivel A. Kodeswaran, Sridhar R. Muppidi, Nataraj Nagaratnam, Vikrant Nandakumar
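
The tag index described in the abstract above can be illustrated with a minimal sketch. The class names, the example paths and application identifiers, and the `apply_action` helper are all hypothetical; the abstract only states that the tag, the storage location, and the application identifier are stored in an index that a mobile agent uses to act on enterprise data.

```python
from dataclasses import dataclass

ENTERPRISE = "enterprise"
PERSONAL = "personal"


@dataclass(frozen=True)
class IndexEntry:
    tag: str       # ENTERPRISE or PERSONAL
    location: str  # storage location of the tagged data
    app_id: str    # identifier of the application that generated the data


class TagIndex:
    """Hypothetical in-memory index of tagged data on one device."""

    def __init__(self):
        self._entries = []

    def add(self, tag, location, app_id):
        self._entries.append(IndexEntry(tag, location, app_id))

    def locations(self, tag):
        # Return storage locations in insertion order for the given tag.
        return [e.location for e in self._entries if e.tag == tag]


def apply_action(index, action, tag=ENTERPRISE):
    """Run action(location) on every location carrying the tag."""
    return [action(loc) for loc in index.locations(tag)]


# Example data is invented for illustration only.
index = TagIndex()
index.add(ENTERPRISE, "/data/mail/inbox.db", "com.example.mail")
index.add(PERSONAL, "/data/photos/001.jpg", "com.example.camera")
index.add(ENTERPRISE, "/data/docs/plan.pdf", "com.example.docs")

# Stand-in for a real delete, encrypt, or back-up operation.
deleted = apply_action(index, lambda loc: f"delete {loc}")
```

Because the action is looked up through the index, personal data is never touched when the agent is told to act on enterprise data.
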
INDEXING BASED ON KEY RANGES

Publication number: 20130297613

Abstract: The present invention is a fast indexing technique that builds an indexing structure based on multi-level key ranges typically for large data storage systems. The invention is explained based on the B+-tree. It is designed to reside in main memory. Point searches and range searches are helped by early termination of searches for non-existent data. Range searches can be processed depth-first or breadth-first. One group of multiple searches can be processed with one pass on the indexing structure to minimize total cost. Implementation options and strategies are explained to show the flexibility of this invention for easy adaptation and high efficiency. Each branch of any level has exact and clear key boundaries, so that it is very easy to build or cache a partial index for various purposes. The inventive indexing structure can be tuned to speed up queries directed at popular ranges of index or index ranges of particular interest to the user.

Type: Application

Filed: May 4, 2012

Publication date: November 7, 2013

Applicant: MONMOUTH UNIVERSITY

Inventor: Cui Yu
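
To make the idea of exact key boundaries and early termination concrete, here is a minimal two-level sketch. It is not the B+-tree-based structure of the application; the `KeyRangeIndex` class, the `leaf_size` parameter, and the `leaf_visits` counter are assumptions introduced only to show how a search for a non-existent key can stop before any leaf is read.

```python
from bisect import bisect_left, bisect_right


class KeyRangeIndex:
    """Hypothetical two-level index: each leaf records its exact (low, high) keys."""

    def __init__(self, keys, leaf_size=4):
        keys = sorted(set(keys))
        self.leaves = [keys[i:i + leaf_size] for i in range(0, len(keys), leaf_size)]
        self.lows = [leaf[0] for leaf in self.leaves]    # exact lower boundary per leaf
        self.highs = [leaf[-1] for leaf in self.leaves]  # exact upper boundary per leaf
        self.leaf_visits = 0                             # counts leaves actually read

    def contains(self, key):
        i = bisect_right(self.lows, key) - 1
        if i < 0 or key > self.highs[i]:
            return False  # early termination: key falls outside every key range
        self.leaf_visits += 1
        leaf = self.leaves[i]
        j = bisect_left(leaf, key)
        return j < len(leaf) and leaf[j] == key

    def range(self, lo, hi):
        out = []
        start = max(bisect_right(self.lows, lo) - 1, 0)
        for i in range(start, len(self.leaves)):
            if self.lows[i] > hi:
                break       # no later leaf can overlap [lo, hi]
            if self.highs[i] < lo:
                continue    # this leaf ends before the range starts
            self.leaf_visits += 1
            leaf = self.leaves[i]
            out.extend(leaf[bisect_left(leaf, lo):bisect_right(leaf, hi)])
        return out


idx = KeyRangeIndex([1, 2, 3, 4, 10, 11, 12, 13, 50, 51])
found = idx.contains(11)    # reads one leaf
missing = idx.contains(7)   # 7 lies in the gap between leaves, no leaf is read
visits_after_points = idx.leaf_visits
span = idx.range(3, 11)
empty = idx.range(5, 9)
```

The key 7 falls between the boundaries 4 and 10, so the point search ends at the top level; the range search for 5 to 9 ends the same way.
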
Character Data Compression for Reducing Storage Requirements in a Database System

Publication number: 20130297573

Abstract: A system, method, and computer program product for character data compression for reducing data storage requirements in a database system are described. Embodiments include identifying data of a particular character type in a full data page, and identifying usage frequency of each character of the particular character type. Each character is encoded based on the identified usage frequency and stored, so that storage requirements for the most frequently used characters are reduced.

Type: Application

Filed: May 7, 2012

Publication date: November 7, 2013

Applicant: Sybase Inc.

Inventors: Xu-dong QIAN, ZhiPing Xiong
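
The abstract above does not name a specific encoding. As one well-known way to give frequently used characters shorter codes, the sketch below uses Huffman coding; this is a swapped-in technique chosen for illustration, and the function names and the sample `page` string are hypothetical.

```python
import heapq
from collections import Counter


def build_codes(text):
    """Return a dict mapping each character to a bit string (Huffman coding)."""
    freq = Counter(text)
    if not freq:
        return {}
    if len(freq) == 1:
        return {next(iter(freq)): "0"}
    # Heap items: (frequency, unique tiebreaker, {char: code-so-far}).
    heap = [(n, i, {ch: ""}) for i, (ch, n) in enumerate(freq.items())]
    heapq.heapify(heap)
    tie = len(heap)
    while len(heap) > 1:
        n1, _, c1 = heapq.heappop(heap)
        n2, _, c2 = heapq.heappop(heap)
        merged = {ch: "0" + code for ch, code in c1.items()}
        merged.update({ch: "1" + code for ch, code in c2.items()})
        heapq.heappush(heap, (n1 + n2, tie, merged))
        tie += 1
    return heap[0][2]


def encode(text, codes):
    return "".join(codes[ch] for ch in text)


def decode(bits, codes):
    reverse = {code: ch for ch, code in codes.items()}
    out, cur = [], ""
    for b in bits:
        cur += b
        if cur in reverse:  # prefix-free codes make the first match unambiguous
            out.append(reverse[cur])
            cur = ""
    return "".join(out)


page = "aaaaaaaabbbc"  # stand-in for the character data on one page
codes = build_codes(page)
bits = encode(page, codes)
```

In this sample the most frequent character receives the shortest code, so the encoded page needs far fewer bits than a fixed 8 bits per character.
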
DATA INDEX USING A LINKED DATA STANDARD

Publication number: 20130290253

Abstract: A data indexing system including a plurality of servers and a tracked resource set client is provided. Each of the servers includes a plurality of resources that are part of a resource set. Each of the servers also includes a tracked resource set corresponding to the resource set. The tracked resource set describes the plurality of resources located in the resource set. The server identifies the plurality of resources using rules of linked data. The tracked resource set client is in communication with the plurality of servers. The tracked resource set client has a data index. The data index is built and kept up to date using the tracked resource set of each of the plurality of servers.

Type: Application

Filed: April 30, 2012

Publication date: October 31, 2013

Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: Frank J. Budinsky, James J. Des Rivieres, Martin P. Nally
STORAGE APPARATUS AND DATA MANAGEMENT METHOD

Publication number: 20130290281

Abstract: The processing load when rewriting portions of compressed data is alleviated. A storage apparatus comprises a storage unit which stores data which is read/written by the host apparatus, a compression/expansion unit which compresses the data using a predetermined algorithm to generate compressed data, and expands the compressed data, and a control unit which controls writing of data to the storage unit, wherein the control unit manages, as compression block units, divided compressed data which is obtained by dividing compressed data compressed by the compression/expansion unit into predetermined units, and padding data.

Type: Application

Filed: April 27, 2012

Publication date: October 31, 2013

Inventors: Nobuhiro Yokoi, Masanori Takada, Nagamasa Mizushima, Hiroshi Hirayama, Akira Yamamoto
CONTENT-BASED NAVIGATION FOR ELECTRONIC DEVICES

Publication number: 20130290299

Abstract: Content-based navigation of an electronic device includes receiving supplemental content to an electronic book. The supplemental content is created separately from the electronic book. The content-based navigation also includes associating an identifier of the electronic book with the supplemental content, storing the supplemental content with the identifier in a storage device, and creating an index to the supplemental content that is searchable by the identifier of the electronic book. The content-based navigation further includes providing end user devices with access to the supplemental content in the storage device via the index.

Type: Application

Filed: April 25, 2012

Publication date: October 31, 2013

Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: Guillaume Hoareau, Althea Hookens, John Musial, Sandeep R. Patil
ENHANCING PERFORMANCE-COST RATIO OF A PRIMARY STORAGE ADAPTIVE DATA REDUCTION SYSTEM

Publication number: 20130290276

Abstract: Data reduction in a storage system comprises determining attributes of data for storage in the storage system and determining expected data reduction effectiveness for the data based on said attributes. Said effectiveness indicates the benefit that data reduction is expected to provide for the data based on said attributes. The data reduction further comprises applying data reduction to the data based on the expected data reduction effectiveness and performance impact, to improve resource usage efficiency.

Type: Application

Filed: April 30, 2012

Publication date: October 31, 2013

Applicant: International Business Machines Corporation

Inventors: David D. Chambliss, Mihail C. Constantinescu, Joseph S. Glider, Maohua Lu
Object Synthesis

Publication number: 20130290275

Abstract: Apparatus, methods, and other embodiments associated with object synthesis are described. One example apparatus includes logic for identifying a block in a data de-duplication repository and for identifying a reference to the block. The apparatus also includes logic for representing a source object using a first named, organized collection of references to blocks in the data de-duplication repository and logic for representing a target object using a second named, organized collection of references. The apparatus is configured to synthesize the target object from the source object. Since synthesis may be complicated by edge cases, the apparatus is configured to account for conditions including a block in the target object needing less than all the data in a source object block, data to be used to synthesize the target object residing in a sparse hole in a data stream, and the target object needing data not present in the source object.

Type: Application

Filed: April 30, 2012

Publication date: October 31, 2013

Applicant: Quantum Corporation

Inventors: Timothy STOAKES, Andrew Leppard
EFFICIENT FILE PATH INDEXING FOR A CONTENT REPOSITORY

Publication number: 20130290301

Abstract: Techniques for indexing file paths of items in a content repository may include querying, by at least one processor, a content repository stored on at least one computer readable storage medium for one or more items that qualify for file path indexes, do not have the file path indexes, and have a parent folder that has a file path index, wherein the querying does not depend on results from previous queries, and wherein the file path index indicates an associated item's location in a folder tree, creating, by the at least one processor, the file path indexes for resulting items from the querying, and, if the querying results in at least one resulting item, repeating the querying of the content repository and the creating of the file path indexes until the querying results in zero resulting items.

Type: Application

Filed: April 30, 2012

Publication date: October 31, 2013

Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventor: David B. Victor
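
The repeated query described in the abstract above can be sketched as a simple loop. The repository is modelled here as a plain dictionary, every item is assumed to qualify for a file path index, and the `index_file_paths` function and the sample folder names are hypothetical.

```python
def index_file_paths(items):
    """Create path indexes level by level; return the number of passes.

    items maps an id to {"name", "parent", "path"}; "path" is None until
    indexed. The root item must already have a path.
    """
    passes = 0
    while True:
        # Query: not yet indexed, but the parent folder already is. Each
        # query looks only at current repository state, not earlier results.
        batch = [
            item
            for item in items.values()
            if item["path"] is None
            and item["parent"] in items
            and items[item["parent"]]["path"] is not None
        ]
        if not batch:
            return passes  # zero resulting items: stop repeating
        for item in batch:
            parent_path = items[item["parent"]]["path"]
            item["path"] = parent_path.rstrip("/") + "/" + item["name"]
        passes += 1


repo = {
    "root": {"name": "", "parent": None, "path": "/"},
    "a": {"name": "docs", "parent": "root", "path": None},
    "b": {"name": "2012", "parent": "a", "path": None},
    "c": {"name": "report.txt", "parent": "b", "path": None},
    "x": {"name": "orphan.txt", "parent": "missing", "path": None},
}
passes = index_file_paths(repo)
```

Each pass indexes one level of the folder tree, so a tree three levels deep below the root takes three passes; the orphaned item never matches the query and stays unindexed.
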
PRESERVING REDUNDANCY IN DATA DEDUPLICATION SYSTEMS BY DESIGNATION OF VIRTUAL DEVICE

Publication number: 20130282671

Abstract: Various embodiments for preserving data redundancy in a data deduplication system in a computing environment are provided. At least one virtual device out of a volume set is designated as not subject to a deduplication operation.

Type: Application

Filed: April 23, 2012

Publication date: October 24, 2013

Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: Rahul M. FISKE, Carl Evan JONES, Subhojit ROY
PRESERVING REDUNDANCY IN DATA DEDUPLICATION SYSTEMS BY DESIGNATION OF VIRTUAL ADDRESS

Publication number: 20130282670

Abstract: Various embodiments for preserving data redundancy of identical data in a data deduplication system in a computing environment are provided. A selected range of virtual addresses of a virtual storage device in the computing environment is designated as not subject to a deduplication operation. Other system and computer program product embodiments are disclosed and provide related advantages.

Type: Application

Filed: April 23, 2012

Publication date: October 24, 2013

Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: Rahul M. FISKE, Carl Evan JONES, Subhojit ROY
PRESERVING REDUNDANCY IN DATA DEDUPLICATION SYSTEMS

Publication number: 20130282669

Abstract: Various embodiments for preserving data redundancy in a data deduplication system in a computing environment are provided. An indicator is configured. The indicator is provided with a selected data segment to be written through the data deduplication system to designate that the selected data segment must not be subject to a deduplication operation, such that repetitive data can be written and stored on physical locations despite being identical.

Type: Application

Filed: April 23, 2012

Publication date: October 24, 2013

Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: Rahul M. FISKE, Carl Evan JONES, Subhojit ROY
STORAGE APPARATUS AND STORAGE CONTROL METHOD

Publication number: 20130282672

Abstract: The present invention not only reduces the load but also enhances the accuracy of de-duplication in a storage apparatus which performs in-line de-duplication processing and post-process de-duplication processing. A storage apparatus comprises a storage device and a controller. The controller receives multiple files, and by performing in-line de-duplication processing under a prescribed condition, detects from among the multiple files a file which is duplicated with a file received in the past, stores in the temporary storage area a file other than the detected file of the multiple files, and partitions the stored file into multiple chunks, and by performing post-process de-duplication processing, detects from among the multiple chunks a chunk which is duplicated with a chunk received in the past, and stores in the transfer-destination storage area a chunk other than the detected chunk of the multiple chunks.

Type: Application

Filed: April 18, 2012

Publication date: October 24, 2013

Applicants: HITACHI COMPUTER PERIPHERALS CO., LTD., HITACHI, LTD.

Inventors: Naomitsu Tashiro, Mikito Ogata
NON-UNIQUE IDENTIFIER FOR A GROUP OF MOBILE USERS

Publication number: 20130282493

Abstract: Embodiments are directed towards collecting, aggregating and indexing unique and non-unique user data from a plurality of users. The result for a query of this indexed aggregation of user data is provided in a plurality of sub-sets of aggregated user data. Each subset of aggregated user data corresponds to a particular portion of the plurality of users. Also, each of these particular portions of the users is set at least large enough to provide general anonymity for the individual users. User data may be collected by one or more user data suppliers and provided to a user data aggregator. In some embodiments, user data may be collected as unique user data, non-unique user data, or any combination thereof. In some embodiments, user data may be aggregated by zip code, expanded zip code, and/or one or more attributes.

Type: Application

Filed: April 24, 2012

Publication date: October 24, 2013

Applicant: BLUE KAI, INC.

Inventors: Lucian Vlad Lita, Omar Tawakol
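
A minimal sketch of aggregating user data by zip code while keeping each reported group large enough for general anonymity follows. The abstract above does not say how undersized groups are handled; withholding them, along with the `min_group_size` parameter, the `interest` attribute, and the sample records, is an assumption made for illustration.

```python
from collections import defaultdict


def aggregate_by_zip(records, min_group_size):
    """Group records by zip code and report only sufficiently large groups."""
    groups = defaultdict(list)
    for rec in records:
        groups[rec["zip"]].append(rec)

    result = {}
    for zip_code, members in groups.items():
        if len(members) < min_group_size:
            continue  # assumption: groups too small for anonymity are withheld
        counts = defaultdict(int)
        for rec in members:
            counts[rec["interest"]] += 1
        result[zip_code] = {"users": len(members), "interests": dict(counts)}
    return result


# Invented sample data.
records = [
    {"zip": "94105", "interest": "travel"},
    {"zip": "94105", "interest": "travel"},
    {"zip": "94105", "interest": "autos"},
    {"zip": "10001", "interest": "travel"},
]
summary = aggregate_by_zip(records, min_group_size=3)
```

With a minimum group size of 3, the single user in zip code 10001 does not appear in the query result at all.
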
TABLE BOUNDARY DETECTION IN DATA BLOCKS FOR COMPRESSION

Publication number: 20130275397

Abstract: Data is converted into a minimized data representation using a suffix tree by sorting data streams according to symbolic representations for building table boundary formation patterns. The converted data is fully reversible for reconstruction while retaining minimal header information.

Type: Application

Filed: April 16, 2012

Publication date: October 17, 2013

Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: Jonathan AMIT, Lilia DEMIDOV, Nir HALOWANI
SOLVING PROBLEMS IN DATA PROCESSING SYSTEMS BASED ON TEXT ANALYSIS OF HISTORICAL DATA

Publication number: 20130275392

Abstract: Computer program products and systems determine solutions to a problem experienced by a data processing system user. A query is received from the user. The query includes a problem description of the problem experienced by the user with respect to the data processing system. One or more keywords are extracted from the received problem description. An index of problems and associated solutions is searched using the one or more extracted keywords. The index of problems and associated solutions is created by analyzing a document collection describing problems and associated solutions with a text analytics application. One or more documents are returned that contain words or phrases that are similar to the keywords used for searching the index of problems and associated solutions. The documents relevant for the problem and associated solutions are presented to the user.

Type: Application

Filed: April 12, 2012

Publication date: October 17, 2013

Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: Dhruv A. Bhatt, Kristin E. McNeil, Nitaben A. Patel
SYSTEM AND METHOD FOR ENABLING CONTEXTUAL RECOMMENDATIONS AND COLLABORATION WITHIN CONTENT

Publication number: 20130275429

Abstract: A system for enabling contextual recommendations and collaboration recommendations, based on a user's current work, comprising a plurality of content collector software applications adapted to interface with a plurality of content management applications, an indexing engine software application, an expanded social network graph database, and a predictive content intelligence software application.

Type: Application

Filed: July 17, 2012

Publication date: October 17, 2013

Inventors: Graham York, Lee Henry Burgess
CLOUD SERVICE ENABLED TO HANDLE A SET OF FILES DEPICTED TO A USER AS A SINGLE FILE IN A NATIVE OPERATING SYSTEM

Publication number: 20130275398

Abstract: Systems and methods enabling file actions to be performed on a folder structure in a cloud-based service are disclosed. In one aspect, embodiments of the present disclosure include a method, which may be implemented on a system, for representing the folder structure in a user interface to the cloud-based service as a file and enabling file actions to be performed on the file representing the folder structure in the user interface to the cloud-based service. In one embodiment, the folder structure and associated content is stored on a server which provides the cloud-based service in a compressed file format which is able to preserve the metadata associated with the folder structure which indicates its representation as the file in the user interface.

Type: Application

Filed: September 14, 2012

Publication date: October 17, 2013

Applicant: Box, Inc.

Inventors: Griffin Dorman, Satish Asok, Matthew Self
Systems and Methods for Selecting Data Compression for Storage Data in a Storage System

Publication number: 20130275396

Abstract: Storage systems and methods to improve space saving from data compression by providing a plurality of compression processes, and optionally, one or more parameters for controlling operation of the compression processes and selecting from the plurality of compression processes and the parameters to satisfy resource limits, such as CPU usage and memory usage. In one embodiment, the methods take into account the content type, such as text file or video file, and select the compression process and parameters that provide the greatest space savings for that content type while also remaining within a defined resource-usage limit.

Type: Application

Filed: April 11, 2012

Publication date: October 17, 2013

Applicant: NetApp, Inc.

Inventors: Michael N. Condict, Fei Xie, Sandip Shete
DEVELOPING IMPLICIT METADATA FOR DATA STORES

Publication number: 20130275434

Abstract: A system enables metadata to be gathered about a data store beginning from the creation and generation of the data store, through subsequent use of the data store. This metadata can include keywords related to the data store and data appearing within the data store. Thus, keywords and other metadata can be generated without owner/creator intervention, with enough semantic meaning to make a discovery process associated with the data store much easier and more efficient. Usage of or communication regarding a data store is monitored and keywords are extracted from the usage or communication. The keywords are then written to or otherwise associated with metadata of the data store. During searching, keywords in the metadata are made available to be used to attempt to match query terms entered by a searcher.

Type: Application

Filed: April 11, 2012

Publication date: October 17, 2013

Applicant: Microsoft Corporation

Inventors: John C. Platt, Surajit Chaudhuri, Lev Novik, Henricus Johannes Maria Meijer
Apparatus and method for generating additional information about moving picture content

Patent number: 8559724

Abstract: An apparatus and method for generating additional information about moving picture content, including: comparing image feature information about each image frame in moving picture content with image feature information about each image frame in web information, searching for an image frame in the moving picture content, the image frame matching the image frame in the web information, determining location information about the found image frame in the moving picture content, and generating additional information by use of the determined location information and the web information.

Type: Grant

Filed: February 24, 2010

Date of Patent: October 15, 2013

Assignee: Samsung Electronics Co., Ltd.

Inventors: Yoon-hee Choi, Il-hwan Choi, Hee-seon Park
SYSTEM AND METHOD FOR MONITORING DISTRIBUTED ASSET DATA

Publication number: 20130268501

Abstract: A computer-based monitoring system and monitoring method implemented in computer software for detecting, estimating, and reporting the condition states, their changes, and anomalies for many assets. The assets are of the same type, are operated over a period of time, and are outfitted with data collection systems. The proposed monitoring method accounts for variability of working conditions for each asset by using a regression model that characterizes asset performance. The assets are of the same type but not identical. The proposed monitoring method accounts for asset-to-asset variability; it also accounts for drifts and trends in the asset condition and data. The proposed monitoring system can perform distributed processing of massive amounts of historical data without discarding any useful information where moving all the asset data into one central computing system might be infeasible. The overall processing includes distributed preprocessing of data records from each asset to produce compressed data.

Type: Application

Filed: April 9, 2012

Publication date: October 10, 2013

Applicant: MITEK ANALYTICS LLC

Inventor: Dimitry Gorinevsky
INCREASED IN-LINE DEDUPLICATION EFFICIENCY

Publication number: 20130268497

Abstract: Exemplary embodiments for increased in-line deduplication efficiency in a computing environment are provided. In one embodiment, by way of example only, hash values are calculated in nth iterations on data samples from fixed size data chunks extracted from an object requested for in-line deduplication. For each of the nth iterations, the calculated hash values for the data samples from the fixed size data chunks are matched in an nth hash index table with a corresponding hash value of existing objects in storage. The nth hash index table is exited upon detecting a mismatch during the matching. The mismatch is determined to be a unique object and is stored. A hash value for the object is calculated. A master hash index table is updated with the calculated hash value for the object and the calculated hash values for the unique object.

Type: Application

Filed: April 5, 2012

Publication date: October 10, 2013

Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: Duane Mark BALDWIN, Nilesh P. BHOSALE, John Thomas OLSON, Sandeep Ramesh PATIL
INCREASED IN-LINE DEDUPLICATION EFFICIENCY

Publication number: 20130268496

Abstract: Exemplary method, system, and computer program product embodiments for increased in-line deduplication efficiency in a computing environment are provided. In one embodiment, by way of example only, hash values are calculated in nth iterations for accumulative data chunks extracted from an object requested for in-line deduplication. For each of the nth iterations, the calculated hash values for the accumulative data chunks are matched in an nth hash index table with a corresponding hash value of existing objects in storage. The nth hash index table is exited upon detecting a mismatch during the matching. The mismatch is determined to be a unique object and is stored. A hash value for the object is calculated. A master hash index table is updated with the calculated hash value for the object and the calculated hash values for the unique object. Additional system and computer program product embodiments are disclosed and provide related advantages.

Type: Application

Filed: April 5, 2012

Publication date: October 10, 2013

Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: Duane Mark BALDWIN, Nilesh P. BHOSALE, John Thomas OLSON, Sandeep Ramesh PATIL
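
The iterative matching with early exit described in the abstract above can be sketched as follows. Reading "accumulative data chunks" as growing prefixes of the object is an assumption, as are the `InlineDedupStore` class, the tiny `CHUNK` size, the use of SHA-256, and the `hashes_computed` counter, which exists only to show how much work the early exit saves.

```python
import hashlib

CHUNK = 4  # bytes per chunk; deliberately tiny so the example is easy to trace


def _h(data):
    return hashlib.sha256(data).hexdigest()


class InlineDedupStore:
    """Hypothetical store with one hash table per iteration plus a master table."""

    def __init__(self):
        self.level_tables = []    # level_tables[n]: hashes of the first n+1 chunks
        self.master = {}          # full-object hash -> stored bytes
        self.hashes_computed = 0  # prefix hashes computed while matching

    def _prefix_hashes(self, data):
        n_chunks = max(1, -(-len(data) // CHUNK))  # ceiling division
        for n in range(n_chunks):
            yield n, _h(data[:(n + 1) * CHUNK])

    def write(self, data):
        """Return True if data was stored as unique, False if deduplicated."""
        unique = False
        for n, digest in self._prefix_hashes(data):
            self.hashes_computed += 1
            if n >= len(self.level_tables) or digest not in self.level_tables[n]:
                unique = True  # mismatch: exit the tables without hashing further
                break
        full = _h(data)
        if not unique and full in self.master:
            return False
        self._store(data, full)
        return True

    def _store(self, data, full):
        self.master[full] = data
        for n, digest in self._prefix_hashes(data):
            while n >= len(self.level_tables):
                self.level_tables.append(set())
            self.level_tables[n].add(digest)


store = InlineDedupStore()
first = store.write(b"aaaabbbbcccc")   # empty tables: mismatch at iteration 0
second = store.write(b"aaaabbbbcccc")  # all three prefixes match: duplicate
third = store.write(b"zzzzbbbbcccc")   # first chunk differs: exit after one hash
```

The third write is recognised as unique after a single prefix hash, because its first chunk already fails to match anything in the first table.
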
PRIORITIZATION MECHANISM FOR DELETION OF CHUNKS OF DEDUPLICATED DATA OBJECTS

Publication number: 20130268498

Abstract: A reference counter corresponding to a base chunk of a plurality of chunks of a deduplicated data object is maintained, where the reference counter is incremented in response to an insertion of any chunk that references the base chunk, and where the reference counter is decremented in response to a deletion of any chunk that references the base chunk. A queue is defined for processing dereferenced chunks of the plurality of chunks. The dereferenced chunks in the queue are processed in a predefined order, to free storage space.

Type: Application

Filed: April 6, 2012

Publication date: October 10, 2013

Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: Michael G. Sisco, Yu Meng Li
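
A minimal sketch of the reference counter and deletion queue follows. The abstract above says only that dereferenced chunks are processed "in a predefined order"; processing the largest chunk first is an assumption chosen for illustration, and the `ChunkStore` class and its method names are hypothetical.

```python
from collections import deque


class ChunkStore:
    """Hypothetical base-chunk store with reference counts and a deletion queue."""

    def __init__(self):
        self.refcount = {}           # base chunk id -> number of referencing chunks
        self.size = {}               # base chunk id -> size in bytes
        self.dereferenced = deque()  # chunks whose count dropped to zero
        self.freed_bytes = 0

    def add_base(self, chunk_id, size):
        self.refcount[chunk_id] = 0
        self.size[chunk_id] = size

    def insert_reference(self, base_id):
        self.refcount[base_id] += 1  # a chunk referencing base_id was inserted

    def delete_reference(self, base_id):
        self.refcount[base_id] -= 1  # a chunk referencing base_id was deleted
        if self.refcount[base_id] == 0:
            self.dereferenced.append(base_id)

    def process_queue(self):
        # Assumed predefined order: largest chunk first, to free space quickly.
        order = sorted(self.dereferenced, key=lambda c: self.size.get(c, 0), reverse=True)
        self.dereferenced.clear()
        freed = []
        for chunk_id in order:
            if self.refcount.get(chunk_id) != 0:
                continue  # referenced again, or already freed as a duplicate entry
            self.freed_bytes += self.size.pop(chunk_id)
            del self.refcount[chunk_id]
            freed.append(chunk_id)
        return freed


chunks = ChunkStore()
chunks.add_base("A", 100)
chunks.add_base("B", 4096)
chunks.add_base("C", 10)
for base in ("A", "A", "B", "C"):
    chunks.insert_reference(base)
chunks.delete_reference("C")  # C is queued first
chunks.delete_reference("B")  # B is queued second
chunks.delete_reference("A")  # A still has one reference and is not queued
freed = chunks.process_queue()
```

Although C entered the queue before B, B is freed first under the assumed largest-first order, and A survives because one chunk still references it.
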
DOMINANT IMAGE DETERMINATION FOR SEARCH RESULTS

Publication number: 20130262430

Abstract: Architecture that computes a dominant image from one or more images on a webpage. A dominant image classifier scans webpages in an offline-created index to identify the prominent images in the webpages. In a more specific implementation the image selected is the image associated with a name query. Face detection technology can be utilized to identify which of the images on a given webpage contain faces. A query classifier identifies queries that contain people names. In the context of search engines and search result pages, the web results for name queries can further include prominent people face images as thumbnail images. Additional facts (structured data) can further be included that together with the results elements of caption title, snippet and attribute (uniform resource locator (URL)) provide an improved summary of the person on the page.

Type: Application

Filed: March 29, 2012

Publication date: October 3, 2013

Applicant: Microsoft Corporation

Inventors: Krishnan Thazhathekalam, David D. Ahn, Andrea Burbank, Taroon Mandhana, David Simpson, Yi-An Lin
MULTIPLEX CLASSIFICATION FOR TABULAR DATA COMPRESSION

Publication number: 20130262407

Abstract: For multiplexer classification for column compression of tabular data, similar type data segments are classified into classes for grouping the data segments into compression streams associated with each one of the classes. The compression streams are encoded based on a class-specific optimized encoding operation. The compression streams are combined into one output buffer, wherein the compression streams are extracted.

Type: Application

Filed: March 27, 2012

Publication date: October 3, 2013

Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: Jonathan AMIT, Lilia DEMIDOV, Nir HALOWANI, Sergey Marenkov
Index Entries Configured to Support Both Conversation and Message Based Searching

Publication number: 20130262438

Abstract: A conversation server system having one or more processors and memory stores a plurality of index components in an index, a respective index entry corresponding to a respective term and having a plurality of index components, a respective index component of the respective index entry identifying a message that is associated with the respective term. The server receives a first message, associates the first message with a conversation having at least one other message and stores, in the index, a plurality of first-message index components that each include an identifier of the first message. The first-message index components include one or more index components indicative of a plurality of message terms in the first message and one or more index components indicative of one or more conversation terms in the conversation, the one or more conversation terms comprising one or more terms that are not in the first message.

Type: Application

Filed: August 29, 2011

Publication date: October 3, 2013

Inventor: Andrew J. Palay
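
The dual-purpose index entries described in the abstract above can be sketched with a small in-memory structure. The `ConversationIndex` class, the whitespace tokenizer, and the "message"/"conversation" component labels are assumptions; as a simplification, earlier messages are not re-indexed when later messages add new terms to the conversation.

```python
from collections import defaultdict


class ConversationIndex:
    """Hypothetical index: term -> set of (message_id, kind) components."""

    def __init__(self):
        self.entries = defaultdict(set)
        self.conversations = defaultdict(list)  # conversation id -> [(message id, terms)]

    def add_message(self, conversation_id, message_id, text):
        terms = set(text.lower().split())
        conversation_terms = set()
        for _, earlier_terms in self.conversations[conversation_id]:
            conversation_terms |= earlier_terms
        self.conversations[conversation_id].append((message_id, terms))

        # Components for terms that appear in the message itself.
        for term in terms:
            self.entries[term].add((message_id, "message"))
        # Components for conversation terms that are not in this message.
        for term in conversation_terms - terms:
            self.entries[term].add((message_id, "conversation"))

    def search_messages(self, term):
        hits = self.entries.get(term.lower(), ())
        return sorted(m for m, kind in hits if kind == "message")

    def search_conversations(self, term):
        hits = self.entries.get(term.lower(), ())
        return sorted({m for m, _ in hits})


conv = ConversationIndex()
conv.add_message("c1", "m1", "Budget draft attached")
conv.add_message("c1", "m2", "Looks good thanks")

message_hits = conv.search_messages("budget")
conversation_hits = conv.search_conversations("budget")
```

A message-based search for "budget" returns only the message that contains the word, while a conversation-based search also returns the reply, because the reply carries a conversation component for that term.
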
MULTIPLEX CLASSIFICATION FOR TABULAR DATA COMPRESSION

Publication number: 20130262409

Abstract: For column compression of tabular data, similar type data segments are classified into classes for grouping the data segments into compression streams associated with each one of the classes. The compression streams are encoded based on a class-specific optimized encoding operation. The compression streams are combined into one output buffer, wherein the compression streams are extracted.

Type: Application

Filed: June 29, 2012

Publication date: October 3, 2013

Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: Jonathan AMIT, Lilia DEMIDOV, Nir HALOWANI, Sergey MARENKOV
Systems, Methods, And Computer Program Products For Scheduling Processing To Achieve Space Savings

Publication number: 20130262404

Abstract: A method performed in a system that has a plurality of volumes stored to storage hardware, the method including generating, for each of the volumes, a respective space saving potential iteratively over time and scheduling space saving operations among the plurality of volumes by analyzing each of the volumes for space saving potential and assigning priority of resources based at least in part on space saving potential.

Type: Application

Filed: March 30, 2012

Publication date: October 3, 2013

Applicant: NETAPP, INC.

Inventors: Vinod Kumar Daga, Craig Anthony Johnston, Ling Zheng
TRANSFORMATION FUNCTIONS FOR COMPRESSION AND DECOMPRESSION OF DATA IN COMPUTING ENVIRONMENTS AND SYSTEMS

Publication number: 20130262408

Abstract: One or more transformation functions can be used in connection or together with one or more compression/decompression techniques. A transformation function can transform data (e.g., a data object) into a form more suitable for compression and/or decompression. As a result, data can be compressed and/or decompressed more effectively. In addition, multiple data objects can be associated with various transformation functions and/or compression/decompression techniques. As a result, different approaches can be taken with respect to compression and decompression of data objects in an effort to find an optimum approach for compression of data objects that may vary significantly from each other and change over time. It will be appreciated that the objects can be associated with transformation functions in a dynamic manner to accommodate changes to data. Also, an extendible and/or extensible system can allow for growth and adaptation of new data in forms not currently present or expected.

Type: Application

Filed: May 23, 2012

Publication date: October 3, 2013

Inventors: David Simmen, Shant Hovsepian, Jeffrey Davis
FRAMEWORK FOR DOCUMENT KNOWLEDGE EXTRACTION

Publication number: 20130246435

Abstract: A knowledge extraction framework may iteratively enrich an ontology that is used to classify structured knowledge obtained from web pages based on structured knowledge previously acquired from other web pages. The framework may enable a user to define the ontology for extracting structured knowledge from a plurality of web pages. The framework applies the ontology using a supervised extraction algorithm to extract seed information from a set of web pages. The framework further applies an unsupervised extraction algorithm to extract the structured knowledge from an additional set of web pages. The framework subsequently maps the structured knowledge to the ontology based on the seed information to enrich the ontology.

Type: Application

Filed: March 14, 2012

Publication date: September 19, 2013

Applicant: MICROSOFT CORPORATION

Inventors: Jun Yan, Lei Ji, Edward W. Wild, Yi Li, Ning Liu, Zheng Chen
PROVIDING ACCESS TO DOCUMENTS IN AN ONLINE DOCUMENT SHARING COMMUNITY

Publication number: 20130246384

Abstract: Provided are a computer program product, system, and method for providing access to documents in an online document sharing community in a network environment including a plurality of participant computers operated by participants in the online document sharing community and a storage system. Document content is processed to add search terms for the document and a document identifier to a search index accessible through a search engine over the network to participants not under an obligation of confidentiality to the owner with respect to the document. Access is provided to the content of the document to the participants in the online document sharing community. A determination is made of a publication time at which the document was included in the search index and made accessible to the participant computers operated by participants not under the obligation of confidentiality to the owner of the document content.

Type: Application

Filed: March 19, 2012

Publication date: September 19, 2013

Inventor: David W. VICTOR
SYSTEM AND METHOD FOR DOCUMENT INDEXING AND DRAWING ANNOTATION

Publication number: 20130246436

Abstract: A system and method for parsing a machine-readable document having associated drawing figures with components labeled by references, to identify the occurrence of the references for generating a dynamic reference index table, and for either automatically annotating the references in the associated drawing figures with descriptive words or phrases cross-referenced to the references within the generated dynamic reference index table, or generating a reference usage report identifying inconsistencies and/or errors within the document associated with the identified reference occurrences.

Type: Application

Filed: March 19, 2012

Publication date: September 19, 2013

Inventor: Russell E. Levine
METHOD AND SYSTEM FOR FACILITATING ACCESS TO RECORDED DATA

Publication number: 20130246375

Abstract: The present invention relates to a method and system for facilitating access to recorded data. The system comprises an interface and a processing device. The interface is arranged to receive data and the processing device is arranged to separate the received data in data subsets, compress each data subset and assign an identifier to each compressed data subset, thereby creating data units each comprising a compressed data subset and an associated identifier, the processing device further being arranged to establish an index on the basis of the assigned identifiers.
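
The separate, compress, identify, and index steps in this abstract can be sketched as follows. This is a minimal illustration, not the claimed implementation: the function names, the fixed subset size, the use of zlib, and the sequence-number identifiers are all assumptions made for the example.

```python
import zlib


def build_units(data: bytes, subset_size: int = 1024):
    """Split data into subsets, compress each one, and index the results.

    Returns (units, index): `units` is a list of (identifier, compressed
    subset) pairs, and `index` maps each identifier to a position in `units`.
    """
    units = []
    index = {}
    for offset in range(0, len(data), subset_size):
        # Hypothetical identifier scheme: the subset's sequence number.
        identifier = offset // subset_size
        compressed = zlib.compress(data[offset:offset + subset_size])
        index[identifier] = len(units)
        units.append((identifier, compressed))
    return units, index


def read_subset(units, index, identifier) -> bytes:
    """Look up one data unit through the index and decompress only that unit."""
    _, compressed = units[index[identifier]]
    return zlib.decompress(compressed)
```

With this layout, a reader that needs one subset decompresses only the matching unit rather than the whole recording.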

Type: Application

Filed: March 14, 2012

Publication date: September 19, 2013

Inventors: Max Roy PRAKOSO, Andi R. Hakim, Robert Lang
Data format for website traffic statistics

Patent number: 8538969

Abstract: A data format is optimized for storing data such as website traffic data. The data format enables easy access to and filtering of data, for example in generating website traffic reports. The data format also provides significant data compression. A method for generating a data file according to the data format employs linear compression and indexing to efficiently store the data. Data stored according to the format can be easily retrieved, particularly when a known value is specified and particular entries matching the known value are sought.

Type: Grant

Filed: November 14, 2005

Date of Patent: September 17, 2013

Assignee: Adobe Systems Incorporated

Inventor: Michael Paul Bailey
INTEGRATING SEARCHES

Publication number: 20130238627

Abstract: Methods, systems, and computer-storage media having computer-usable instructions embodied thereon, for integrating searches are provided. An entity index may be compiled that includes entity files for a plurality of identified entities such that any information known about a single entity is contained in a single entity file and is easily accessible. Web indexes, including web page information, may be referenced in order to associate web pages with entities, or entity files. Once identified as related to an entity, a web page may be associated with an entity identifier that is associated with the related entity such that a search query for the identified entity results in both entity information for the entity and web pages associated with the entity.

Type: Application

Filed: March 6, 2012

Publication date: September 12, 2013

Applicant: MICROSOFT CORPORATION

Inventors: RICHARD QIAN, ANDREW SHUMAN, DERRICK CONNELL, ROBERT FIRBY, STEVEN MACBETH, TAROON MANDHANA
SEARCHING NETWORK CONFIGURATION DATA

Publication number: 20130238629

Abstract: A programmed hardware network configuration file repository indexer is configured with a network-configuration-specific index-operation rule set. In another example, a network-configuration-specific index-operation rule set can be used in generating an index to a network configuration file repository. In the latter example, the index and the index-operation rule set are used in searching the network configuration file repository.

Type: Application

Filed: March 8, 2012

Publication date: September 12, 2013

Inventors: Ram Kumar Kosuri, Swamy Jagannadha Mandavilli, Murali Mohan Dingari
ENHANCING DATA RETRIEVAL PERFORMANCE IN DEDUPLICATION SYSTEMS

Publication number: 20130238568

Abstract: Various embodiments for processing data in a data deduplication system are provided. For data segments previously deduplicated by the data deduplication system, a supplemental hot-read link is established for those of the data segments determined to be read on at least one of a frequent and recently used basis. Other system and computer program product embodiments are disclosed and provide related advantages.

Type: Application

Filed: March 6, 2012

Publication date: September 12, 2013

Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: Allen Keith BATES, Louie Arthur DICKENS, Stephen Leonard SCHWARTZ, Daniel James WINARSKI
DEDUPLICATING A FILE SYSTEM

Publication number: 20130232124

Abstract: A storage node receives a file. The storage node determines whether the file is stored on the storage node by comparing a hash value computed for content of the received file to hash values for content stored on the storage node. The storage node transfers a name and address of the file to a directory node.
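
The hash comparison described in this abstract can be sketched as follows. The class names, the choice of SHA-256, and the use of the content hash as the file's address are assumptions for illustration only; the abstract does not specify them.

```python
import hashlib


class DirectoryNode:
    """Hypothetical directory node that records a name and address per file."""

    def __init__(self):
        self.entries = {}  # file name -> address

    def register(self, name, address):
        self.entries[name] = address


class StorageNode:
    """Hypothetical storage node that stores each distinct content once."""

    def __init__(self, directory):
        self.directory = directory
        self.blobs = {}  # content hash -> content

    def receive(self, name, content: bytes) -> bool:
        """Store the file if its content is new; return True if it was a duplicate."""
        digest = hashlib.sha256(content).hexdigest()
        duplicate = digest in self.blobs
        if not duplicate:
            self.blobs[digest] = content
        # Assumption: the content hash doubles as the file's address.
        self.directory.register(name, digest)
        return duplicate
```

Two files with identical content end up as one stored blob and two directory entries that point at the same address.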

Type: Application

Filed: March 5, 2012

Publication date: September 5, 2013

Inventor: Blaine D. GAITHER
Apparatus and Methods For Indexing Multimedia Content

Publication number: 20130226930

Abstract: A method, medium, and apparatus are disclosed for indexing multimedia content by a computer. The method comprises segmenting the multimedia content into a plurality of segments. For each segment, the method identifies one or more features present in the segment, wherein the features are of respective media types. The method then identifies, for each identified feature in each segment, one or more respective keywords associated with the identified feature. Then, the method determines, for each identified keyword associated with an identified feature in a given segment, a respective relevance of the keyword to the given segment. The respective relevance is dependent on a weight associated with the respective media type of the identified feature.

Type: Application

Filed: February 29, 2012

Publication date: August 29, 2013

Applicant: Telefonaktiebolaget L M Ericsson (publ)

Inventors: Tommy ARNGREN, Joakim Söderberg, Marika Stålnacke
Indexing Quoted Text in Messages in Conversations to Support Advanced Conversation-Based Searching

Publication number: 20130218896

Abstract: A conversation server system having one or more processors and memory stores a plurality of index components in an index. The server receives a first message, associates the first message with a conversation having one or more other messages and identifies quoted text in the message based on text that occurs in one or more of the other messages. The server stores, in the index, a plurality of first-message index components including one or more index components that correspond to terms in original text of the first message and one or more index components that correspond to terms that occur in the quoted text, where the first-message index components for original text of the first message are distinguished from the first-message index components for quoted text of the first message in the index.

Type: Application

Filed: August 29, 2011

Publication date: August 22, 2013

Inventor: Andrew J. Palay
FILE SERVER APPARATUS, INFORMATION SYSTEM, AND METHOD FOR CONTROLLING FILE SERVER APPARATUS

Publication number: 20130218847

Abstract: Provided is a file server apparatus 4 that processes files stored in a plurality in response to an I/O request when entity data of a plurality of files has a common portion, generates a consolidation file that holds common entity data as consolidated data; and manages each of the plurality of files as a de-duplication file that does not hold the consolidated data, and, when there is the I/O request to at least one of the plurality of files, acquires the consolidated data and processes in response to the I/O request to at least one of the plurality of files, and holds difference data generated by performing processing in response to the I/O request.

Type: Application

Filed: February 16, 2012

Publication date: August 22, 2013

Inventor: Nobuyuki Saika
Enabling Search for Conversations with Two Messages Each Having a Query Term

Publication number: 20130218897

Abstract: A conversation server system having one or more processors and memory stores a plurality of index components in an index. The server associates a first message having a first term with a conversation that includes at least a second message. The first term is not included in the second message and the second message includes a second term that is not included in the first message. The server stores, in the index, a plurality of index components for a same referenced object, including an index component indicative of the first term and an index component indicative of the second term. In some embodiments the same referenced object is associated with index components for a first sender of the first message and a second sender of the second message, so that a search for a conversation with messages from the first sender and the second sender retrieves the referenced object.

Type: Application

Filed: August 29, 2011

Publication date: August 22, 2013

Inventor: Andrew J. Palay
MECHANISMS FOR METADATA SEARCH IN ENTERPRISE APPLICATIONS

Publication number: 20130218898

Abstract: Metadata search is enhanced by utilizing relationship data indicating relationships between metadata items. A server generates an index mapping metadata items to terms associated with the metadata items and a graph describing relationships between each of the metadata items. When the server receives a search request, the server locates a candidate set of the metadata items based on the search term(s) and the index. The server performs a link analysis of the graph to determine a relationship score for each metadata item. For each particular metadata item in the candidate set of the metadata items, the server calculates a ranking score based at least on the relationship score for the particular metadata item. The server generates a ranked result set based on comparing the ranking scores for the candidate set of metadata items. The server then provides information indicating the ranked result set in response to the search request.
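
The index lookup, link analysis, and ranking steps in this abstract can be sketched as follows. This is an illustration under stated assumptions: a PageRank-style iteration stands in for the unspecified link analysis, the ranking score is taken to be the relationship score alone, and all function names and the sample data are hypothetical.

```python
def relationship_scores(graph, damping=0.85, iterations=50):
    """PageRank-style scores; `graph` maps each item to the items it links to."""
    nodes = set(graph) | {t for targets in graph.values() for t in targets}
    n = len(nodes)
    scores = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        new = {node: (1.0 - damping) / n for node in nodes}
        for node in nodes:
            targets = graph.get(node, [])
            if targets:
                # Split this item's score evenly among the items it links to.
                share = damping * scores[node] / len(targets)
                for target in targets:
                    new[target] += share
            else:
                # An item with no outgoing links spreads its score over all items.
                share = damping * scores[node] / n
                for other in nodes:
                    new[other] += share
        scores = new
    return scores


def search(term, index, graph):
    """Return candidate items for `term`, highest relationship score first."""
    candidates = index.get(term, set())
    scores = relationship_scores(graph)
    # Assumption: the ranking score is the relationship score; ties sort by name.
    return sorted(candidates, key=lambda item: (-scores.get(item, 0.0), item))
```

For example, with an index `{"customer": {"table_a", "table_b", "view_c"}}` and a graph in which `table_b` and `view_c` both link to `table_a`, a search for `"customer"` lists `table_a` first.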

Type: Application

Filed: February 16, 2012

Publication date: August 22, 2013

Applicant: Oracle International Corporation

Inventors: Nikhil Raghavan, Ravi Murthy, Aman Naimat
Method and system for fast similarity computation in high dimensional space

Patent number: 8515964

Abstract: Method, system, and programs for computing similarity. Input data is first received from one or more data sources and then analyzed to obtain an input feature vector that characterizes the input data. An index is then generated based on the input feature vector and is used to archive the input data, where the value of the index is computed based on an improved Johnson-Lindenstrauss transformation (FJLT) process. With the improved FJLT process, first, the sign of each feature in the input feature vector is randomly flipped to obtain a flipped vector. A Hadamard transformation is then applied to the flipped vector to obtain a transformed vector. An inner product between the transformed vector and a sparse vector is then computed to obtain a base vector, based on which the value of the index is determined.
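
The three steps named in this abstract (random sign flip, Hadamard transformation, inner product with sparse vectors) can be sketched as follows. The function names, the Gaussian sparse weights, the density parameter, and in particular the final step that turns the base vector into sign bits are assumptions for illustration; the abstract does not say how the index value is derived from the base vector.

```python
import random


def hadamard(vec):
    """Unnormalized fast Walsh-Hadamard transform of a power-of-two-length list."""
    out = list(vec)
    if len(out) & (len(out) - 1):
        raise ValueError("length must be a power of two")
    h = 1
    while h < len(out):
        for start in range(0, len(out), 2 * h):
            for i in range(start, start + h):
                a, b = out[i], out[i + h]
                out[i], out[i + h] = a + b, a - b
        h *= 2
    return out


def make_projection(dim, out_dim, density=0.25, seed=0):
    """Draw the random signs and sparse vectors once, so they can be reused."""
    rng = random.Random(seed)
    signs = [rng.choice((-1.0, 1.0)) for _ in range(dim)]
    sparse_rows = [
        {i: rng.gauss(0.0, 1.0) for i in range(dim) if rng.random() < density}
        for _ in range(out_dim)
    ]
    return signs, sparse_rows


def index_value(features, signs, sparse_rows):
    """Flip signs, apply the Hadamard transform, then project onto sparse vectors."""
    flipped = [s * x for s, x in zip(signs, features)]
    transformed = hadamard(flipped)
    base = [sum(w * transformed[i] for i, w in row.items()) for row in sparse_rows]
    # Hypothetical final step: one bit per base-vector component, taken from its sign.
    return tuple(1 if b >= 0 else 0 for b in base)
```

The same `signs` and `sparse_rows` must be used for every input so that the resulting index values are comparable across vectors.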

Type: Grant

Filed: July 25, 2011

Date of Patent: August 20, 2013

Assignee: Yahoo! Inc.

Inventors: Shanmugasundaram Ravikumar, Anirban Dasgupta, Tamas Sarlos
Related Data Dependencies

Publication number: 20130204853

Abstract: A computer-implemented method for use in maintaining currency of a projection index of a plurality of database objects. The computer-implemented method includes creating the projection index representative of a connection between a first database object and at least a second database object, determining an entity dependency between the first database object and at least the second database object, determining a path dependency between the first database object and at least the second database object, and updating the projection index in response to a modification of one or both of the entity dependency and the path dependency.

Type: Application

Filed: February 7, 2012

Publication date: August 8, 2013

Applicant: DASSAULT SYSTEMES ENOVIA CORPORATION

Inventors: David Edward Tewksbary, Clark David Milliken
