Indexing The Archive Patents (Class 707/673)

Providing a partially sorted index

Patent number: 8108355

Abstract: To provide an index for a table in a database system, the index is partially sorted in an initial phase of building the index. Subsequently, in response to accessing portions of the index to process a database query, further sorting of the accessed portions of the index is performed.

Type: Grant

Filed: October 27, 2006

Date of Patent: January 31, 2012

Assignee: Hewlett-Packard Development Company, L.P.

Inventor: Bin Zhang
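Illustration for patent 8108355: the two phases named in the abstract can be pictured with a toy index. This is only a sketch under stated assumptions, not the patented method. The class name PartiallySortedIndex, the range-bucket layout, and the num_buckets parameter are hypothetical. The initial phase only partitions integer keys into coarse ranges; a partition is fully sorted the first time a query touches it.

```python
import bisect


class PartiallySortedIndex:
    """Toy index over a non-empty collection of integer keys (hypothetical design)."""

    def __init__(self, keys, num_buckets=4):
        keys = list(keys)
        lo, hi = min(keys), max(keys)
        # Ceiling division so that every key falls into one of num_buckets ranges.
        self.width = max(1, -(-(hi - lo + 1) // num_buckets))
        self.lo = lo
        self.buckets = [[] for _ in range(num_buckets)]
        self.is_sorted = [False] * num_buckets
        # Initial phase: partial sort only. Keys are ordered across buckets,
        # but left unordered inside each bucket.
        for k in keys:
            self.buckets[(k - lo) // self.width].append(k)

    def _bucket_of(self, key):
        i = (key - self.lo) // self.width
        return min(max(i, 0), len(self.buckets) - 1)

    def range_query(self, low, high):
        """Return all keys in [low, high], sorting only the buckets accessed."""
        result = []
        for i in range(self._bucket_of(low), self._bucket_of(high) + 1):
            if not self.is_sorted[i]:
                self.buckets[i].sort()  # further sorting, triggered by the query
                self.is_sorted[i] = True
            b = self.buckets[i]
            result.extend(b[bisect.bisect_left(b, low):bisect.bisect_right(b, high)])
        return result
```

Buckets that no query ever reaches stay unsorted, so the sorting cost is spread over the queries that actually need it.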
Apparatus, system, and method for improved portable document format (“PDF”) document archiving

Patent number: 8099397

Abstract: An apparatus, system, and method are disclosed for improved Portable Document Format (“PDF”) document archiving. The method includes scanning a source PDF document for a shared resource. The source PDF document includes a plurality of records. The shared resource includes a common resource referenced by way of a resource pointer associated with a record of the source PDF document. The method includes copying the shared resource to a resource group associated with the source PDF document. The method also includes short-circuiting a link between content for the shared resource and the resource pointer in each record that points to the shared resource. The method includes extracting a record from the source PDF document. The extracted record is void of content for the shared resource in response to the short-circuited link. Thus, records may be stored in a standalone format without excessive storage space requirements.

Type: Grant

Filed: August 26, 2009

Date of Patent: January 17, 2012

Assignee: International Business Machines Corporation

Inventors: Gregory S. Felderman, Brian K. Hoyt
Index of locally recorded content

Patent number: 8090694

Abstract: A method to index locally recorded content at a media device includes extracting, at a remote service provider, event index data from an event being locally recorded at a media device and associating the event index data with locator code data of the event. The method further includes storing, at the remote service provider, the extracted event index data and the associated locator code data; searching the extracted event index data for a plurality of segments associated with the event, the search being associated with a search request; determining index display data for a presentation of the plurality of segments based on the search request; and transmitting, to the media device, the locator code data associated with the plurality of segments, and the index display data.

Type: Grant

Filed: November 2, 2006

Date of Patent: January 3, 2012

Assignee: AT&T Intellectual Property I, L.P.

Inventors: Behzad Shahraray, David Gibbon, Lee Begeja, Zhu Liu, Richard V. Cox, Bernard S. Renger
Dynamic restoration of message object search indexes

Patent number: 8090695

Abstract: As described herein, a high-availability server system includes at least a source server system and a target server system that dynamically restore message object search indexes. Both the source server system and the target server system store copies of a mailbox database and a search index for the mailbox database. As changes are requested to the mailbox database, events are added to event lists maintained at the source node and the target node. When the data storage system at the target server system enters an error state, the source server system sends to the target server system a set of data that the target server system can use to generate a copy of search index. The target server system may then resume applying events in the event list to the search index. In this way, it may not be necessary to completely re-index the mailbox database at the target node.

Type: Grant

Filed: December 5, 2008

Date of Patent: January 3, 2012

Assignee: Microsoft Corporation

Inventors: Ashish Consul, Suryanarayana M. Gorti
Table lookup mechanism for address resolution

Patent number: 8086571

Abstract: A table lookup indexing system for the transmission of data packets in a network switch. Data is received in an input port and is divided into two parts, an index portion and a bucket portion. The index portion selects a particular bucket and the combination of the index portion and bucket portion selects a specific entry in the table.

Type: Grant

Filed: August 5, 2009

Date of Patent: December 27, 2011

Assignee: Broadcom Corporation

Inventor: Govind Malalur
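Illustration for patent 8086571: one way to picture the split described in the abstract is as bit fields of a lookup key. The field widths (INDEX_BITS, BUCKET_BITS), the flat table layout, and all function names below are assumptions made for this sketch; the abstract does not specify them.

```python
INDEX_BITS = 4   # assumed width of the index portion
BUCKET_BITS = 2  # assumed width of the bucket portion


def split_key(key):
    """Split a lookup key into (index portion, bucket portion)."""
    bucket_part = key & ((1 << BUCKET_BITS) - 1)
    index_part = (key >> BUCKET_BITS) & ((1 << INDEX_BITS) - 1)
    return index_part, bucket_part


def entry_position(key):
    """The index portion selects a bucket; index plus bucket portion select one entry."""
    index_part, bucket_part = split_key(key)
    return (index_part << BUCKET_BITS) | bucket_part


# Flat table holding every bucket back to back (2**BUCKET_BITS entries per bucket).
table = [None] * (1 << (INDEX_BITS + BUCKET_BITS))


def store(key, port):
    table[entry_position(key)] = port


def lookup(key):
    return table[entry_position(key)]
```

For example, after store(0b101101, "port-3"), a call to lookup(0b101101) reads the second entry of bucket 11.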
Methods for backing up a database

Patent number: 8082229

Abstract: Various embodiments of a method, system and computer program product back up and recover a database. A database is distributed in a plurality of storage devices. A target designation designating a target database is received. One or more storage devices of the plurality of storage devices, storing at least a portion of the target database, are selected. A quiesce point is established by completing an ongoing transaction for the target database and inhibiting a further transaction. In response to establishing the quiesce point, a backup is generated by collectively copying data on each storage device of the one or more selected storage devices. The backup associated with a quiesce point indication indicating backed up data of the said each storage device of the one or more selected storage devices in accordance with the quiesce point, are recorded.

Type: Grant

Filed: February 3, 2006

Date of Patent: December 20, 2011

Assignee: International Business Machines Corporation

Inventors: Soh Kaijima, Takashi Saitoh, Kenji Seta
System and method for distributed objects storage, management, archival, searching, retrieval and mining in private and public clouds and deep invisible webs

Publication number: 20110307451

Abstract: A method and system for efficiently archiving and retrieving objects using a distributed network of devices wherein the users define attributes, distribution lists, subscribers to content and objects. The objects can be archived, searched for, tagged, indexed, attributed, restored and mined. Objects have signatures that indicate where they came from and where they are stored. Attributes include system attributes which may geo-reference objects. Attributes and signatures can be associated with alerts and notifications to subscribers who register interest in receiving alerts about objects, object signatures or attributes.

Type: Application

Filed: June 10, 2010

Publication date: December 15, 2011

Applicant: EnduraData, Inc.

Inventors: Abderrahman Aba El Haddi, Anass Taouil, Jeffrey Brian Marckel, Zakaria Baani
System and method for a data extraction and backup database

Patent number: 8065277

Abstract: Methods and systems for storing information extracted from a file are presented. These methods and systems can be used to store content and metadata extracted from a file, and to associate the content and metadata so a holistic image of the file may be maintained. Additionally, these methods and systems may allow the location of a file to be stored and associated with the content or metadata of the file. Methods, systems and databases of this type may be especially useful in avoiding duplication of data by allowing the content and metadata of files to be compared to previously stored content and metadata.

Type: Grant

Filed: January 16, 2004

Date of Patent: November 22, 2011

Inventors: Daniel John Gardner, Maurilio Torres
Information processing device, file data merging method, file naming method, and file data output method

Patent number: 8065267

Abstract: A step or means for associating a file with a cell in a table format by, for example, pasting an icon representing the file, wherein related data to be simultaneously referenced along with the data in the cell with which the file is associated is read according to a data entry positioning rule of a table format. Further, the step/means indicates merging file common condition data with record data in the file by adding the read related data to its each constituent record as the common condition value of the data file with which the corresponding cell is associated, and includes naming a file by converting a character string representing the read related data into a character string according to a prescribed rule and positioning it in a predetermined position in a template character string.

Type: Grant

Filed: January 12, 2006

Date of Patent: November 22, 2011

Inventor: Masatsugu Noda
Archive indexing engine

Patent number: 8051045

Abstract: Methods and apparatus, including computer program products, for archiving data from a database. One method includes identifying a data record to be archived; determining the contents of an archive record, the archive record having values for a first plurality of attributes in the data record; storing the archive record in a data archive; determining the contents of an index record, the index record comprising values for a second plurality of attributes in the data record; adding the index record to a dictionary-based archive index with a reference to the location of the archive record in the data archive; deleting the data record from the database; accepting a query for a desired archive record; and performing a search of the archive index to find the desired archive record.

Type: Grant

Filed: August 31, 2005

Date of Patent: November 1, 2011

Assignee: SAP AG

Inventor: Hartmut K. Vogler
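Illustration for patent 8051045: the archiving steps listed in the abstract can be traced with in-memory stand-ins. This is a loose sketch, not SAP's implementation. The function names, the list used as the data archive, and the plain Python dict used as the archive index are all assumptions; in particular, the dict is only a stand-in for whatever "dictionary-based" means in the patent.

```python
def archive_record(database, archive, index, record_id, archive_attrs, index_attrs):
    """Move one record from the database into the archive and index it."""
    record = database[record_id]
    # Archive record: values for a first set of attributes.
    archive.append({a: record[a] for a in archive_attrs})
    location = len(archive) - 1
    # Index record: values for a second set of attributes, each stored
    # with a reference to the archive location.
    for attr in index_attrs:
        index.setdefault((attr, record[attr]), []).append(location)
    # The data record is deleted from the database once it is archived.
    del database[record_id]
    return location


def search(archive, index, attr, value):
    """Answer a query from the index alone, then fetch the archive records."""
    return [archive[loc] for loc in index.get((attr, value), [])]
```

Only attributes named in index_attrs are searchable here, which mirrors the abstract's distinction between the two attribute sets.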
Document management apparatus and document management method

Patent number: 8046365

Abstract: An apparatus stores one or more document information of which access right is managed by an access right management apparatus, and generates an index of stored document information. The apparatus receives user identification information, and sends the user identification information, and information for identifying document information of which index has not been generated to the access right management apparatus. The apparatus receives access right information associated with the user from the access right management apparatus, and generates index of the identified document information based on the received access right information.

Type: Grant

Filed: March 12, 2007

Date of Patent: October 25, 2011

Assignee: Canon Kabushiki Kaisha

Inventor: Shigemi Saito
System and method for classifying tags of content using a hyperlinked corpus of classified web pages

Patent number: 8046361

Abstract: An improved system and method for classifying tags of content using a hyperlinked corpus of classified web pages is provided. An anchor text index may be searched to find anchor texts that may match text of the tag, documents referenced by the matching anchor texts may be found, and the documents referenced by the matching anchor texts may be grouped to disambiguate multiple classifications that result from matching the anchor texts with the categories of the reference documents. To resolve ambiguity between multiple classifications, weighted classifications may be used where each document may be assigned a positive weight for a mapping to a category to indicate the confidence of the classification of the document to the category. The classification for the grouping of the documents referenced by the matching anchor texts with greatest frequency may be selected and output as the classification for the tag.

Type: Grant

Filed: April 18, 2008

Date of Patent: October 25, 2011

Assignee: Yahoo! Inc.

Inventors: Börkur Sigurbjörnsson, Roelof van Zwol, Simon E. Overell
Apparatus for searching and managing compressed files

Patent number: 8037035

Abstract: A computer-readable, non-transitory medium stores a program that manages compressed file groups on a plurality of slave servers. The file groups include compressed files that are to be searched and have character strings. Each of the compressed file groups is expanded, using a Huffman tree that was used for compressing the compressed file group. A common compression parameter is generated based on appearance frequency, by summing, for each character, the appearance frequency in each of the compressed file groups. The expanded files are recompressed using the common Huffman tree such that sums of the access frequencies of the compressed files that are origins of the recompressed files are substantially equivalent among various slave servers. New archives including the re-compressed files are transmitted to the respective slave servers.

Type: Grant

Filed: January 28, 2009

Date of Patent: October 11, 2011

Assignee: Fujitsu Limited

Inventors: Masahiro Kataoka, Tatsuhiro Sato, Takashi Tsubokura
Method and system for offline indexing of content and classifying stored data

Patent number: 8037031

Abstract: A method and system for creating an index of content without interfering with the source of the content includes an offline content indexing system that creates an index of content from an offline copy of data. The system may associate additional properties or tags with data that are not part of traditional indexing of content, such as the time the content was last available or user attributes associated with the content. Users can search the created index to locate content that is no longer available or based on the associated attributes.

Type: Grant

Filed: December 20, 2010

Date of Patent: October 11, 2011

Assignee: CommVault Systems, Inc.

Inventors: Parag Gokhale, Rajiv Kottomtharayil, Deepak R. Attarde, Jun H. Ahn
Adaptive Archive Data Management

Publication number: 20110231372

Abstract: In one embodiment, input is received from a user defining a classification and an analytic for the classification. Multiple classifications and analytics may be defined by a user. A definition of relevance parameters is determined that characterize the classification and a set of analytics measures associated with the analytic. The definition may be for the classification. Unstructured data and structured data are analyzed based on the definition of the relevance parameters to determine relevant data in the unstructured data and the structured data. The relevant data being data that is determined to be relevant to the classification defined by the user. An index of the terms from the relevant data is determined. The index is useable by an analytics tool to provide results for queries of the unstructured data and structured data. The query may be used within the classification such that targeted results are provided using the index and the relevant data to the classification.

Type: Application

Filed: March 21, 2011

Publication date: September 22, 2011

Inventors: Joan Wrabetz, Aloke Guha
Method and system for updating an archive of a computer file

Patent number: 8019731

Abstract: A method and system for updating an archive of a computer file to reflect changes made to the file includes selecting one of a plurality of comparison methods as a preferred comparison method. The comparison methods include a first comparison method wherein the file is compared to an archive of the file and a second comparison method wherein a first set of tokens statistically representative of the file is computed and compared to a second set of tokens statistically representative of the archive of the file. The method further includes carrying out the preferred comparison method to generate indicia of differences between the file and the archive of the file for updating the archive of the file.

Type: Grant

Filed: September 22, 2010

Date of Patent: September 13, 2011

Assignee: Computer Associates Think, Inc.

Inventor: Karl D. Forster
Multi-user animation coupled to bulletin board

Patent number: 8018455

Abstract: A multi-user animation process receives input from multiple remote clients to manipulate avatars through a modeled 3-D environment. Each user is represented by an avatar. The 3-D environment and avatar position/location data is provided to client workstations, which display a simulated environment visible to all participants. A text or speech-based bulletin board application is coupled to the animation process. The bulletin board application receives text or speech input from the multiple remote users and publishes the input in a public forum. The bulletin board application maintains multiple forums organized by topic. Access or participation to particular forums is coordinated with the animation process, such that each user may be permitted access to a forum only when the user's avatar is located within a designated room or region of the modeled 3-D environment.

Type: Grant

Filed: October 4, 2010

Date of Patent: September 13, 2011

Inventor: Brian Mark Shuster
Methods and systems for assisting information processing by using storage system

Patent number: 8015146

Abstract: In a networked information system, a portion of the information processing is offloaded from servers to a storage system to reduce network traffic and conserve server resources. The information system includes a storage system storing files or objects and having a function which automatically extracts portions of text from the files and transmits the extracted text to the servers. The text extraction is responsive to file requests from the servers. The extracted text and files are stored on the storage system, decreasing the need to send entire files across the network. Thus, by transmitting smaller extracted text data instead of entire files over the network, network performance can be increased through the reduction of traffic. Additionally, the processing strain on physical resources of the servers can be reduced by extracting the text at the storage system rather than at the servers.

Type: Grant

Filed: June 16, 2008

Date of Patent: September 6, 2011

Assignee: Hitachi, Ltd.

Inventor: Yasuyuki Mimatsu
Computer-implemented method, computer program product and system for creating an index of a subset of data

Patent number: 8010501

Abstract: A computer implemented method for transforming an inverted index of a collection of documents into a smaller inverted index of documents. The smaller index contains links to all and only to those documents appearing in a subset of the original collection of documents. The method avoids reprocessing the subset to create the smaller inverted index by intersecting each inverted list with the list of document references from the desired subset. If this intersection is empty then the list is removed from the new smaller index, otherwise the list containing only the intersected reference list is included in the new inverted index. The method is also extended to deal with creating multiple smaller inverted indexes and with propagating updates changes in the first collection of documents down into the smaller inverted index or indexes.

Type: Grant

Filed: September 4, 2007

Date of Patent: August 30, 2011

Assignee: Exalead

Inventors: François Bourdoncle, Florian Douetteau, Stéphane Donze
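Illustration for patent 8010501: the core step in the abstract, intersecting each inverted list with the document references of the desired subset and dropping lists that become empty, can be sketched in a few lines. The function name restrict_index and the dict-of-lists representation are assumptions made for this example.

```python
def restrict_index(inverted_index, subset_doc_ids):
    """Derive a smaller inverted index without reprocessing any document.

    inverted_index maps a term to its list of document ids (assumed layout).
    """
    subset = set(subset_doc_ids)
    smaller = {}
    for term, postings in inverted_index.items():
        # Intersect this inverted list with the subset's document references.
        kept = [doc_id for doc_id in postings if doc_id in subset]
        if kept:  # an empty intersection removes the list from the new index
            smaller[term] = kept
    return smaller
```

For example, restricting {"index": [1, 2, 3], "archive": [2, 4], "backup": [4, 5]} to documents 1 and 2 keeps "index" and "archive" with shortened lists and drops "backup" entirely.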
Indexing media files in a distributed, multi-user system for managing and editing digital media

Patent number: 8001088

Abstract: A scalable infrastructure indexes and tracks media data and metadata in a distributed, multi-user system. An indexer is associated with particular storage locations, such as a disk, or a directory on a disk, to maintain an index of media files or metadata stored in those storage locations. The indexer monitors activity on any storage location with which it is associated. Any additions, deletions or modifications to files in that storage location cause the indexer to update its index. This index then can be accessed by any of a number of applications in the same manner as conventional indexes. There may be different indexers for different storage locations. Separate indexers may be provided for media files and compositions that use those media files.

Type: Grant

Filed: April 4, 2003

Date of Patent: August 16, 2011

Assignee: Avid Technology, Inc.

Inventor: Roger Tawa, Jr.
Attribute-based indexers for device object lists

Patent number: 7996368

Abstract: A device list is created including one or more device objects, wherein each device object represents a physical device coupled to a computer system, wherein each device object includes one or more device attributes of the physical device. The device list is indexed into using a device attribute.

Type: Grant

Filed: September 6, 2005

Date of Patent: August 9, 2011

Assignee: Cypress Semiconductor Corporation

Inventors: Greg Nalder, Eric Luttmann
Method and apparatus for improving performance of approximate string queries using variable length high-quality grams

Patent number: 7996369

Abstract: A computer process, called VGRAM, improves the performance of these string search algorithms in computers by using a carefully chosen dictionary of variable-length grams based on their frequencies in the string collection. A dynamic programming algorithm for computing a tight lower bound on the number of common grams shared by two similar strings in order to improve query performance is disclosed. A method for automatically computing a dictionary of high-quality grams for a workload of queries. Improvement on query performance is achieved by these techniques by a cost-based quantitative approach to deciding good grams for approximate string queries. An approach for answering approximate queries efficiently based on discarding gram lists, and another is based on combining correlated lists. An indexing structure is reduced to a given amount of space, while retaining efficient query processing by using algorithms in a computer based on discarding gram lists and combining correlated lists.

Type: Grant

Filed: December 14, 2008

Date of Patent: August 9, 2011

Assignee: The Regents of the University of California

Inventors: Chen Li, Bin Wang, Xiaochun Yang, Alexander Behm, Shengyue Ji, Jiaheng Lu
Suggesting long-tail tags

Patent number: 7996418

Abstract: Technologies are described herein for suggesting long-tail tags. A first group of tags and a second group of tags are identified from a plurality of tags. The first group of tags includes frequently-assigned tags having a higher frequency of being assigned to an asset. The second group of tags includes long-tail tags having a lower frequency of being assigned to the asset than the frequently-assigned tags. The frequently-assigned tags and a sample of the long-tail tags are suggested to a user upon receiving a request from the user to tag the asset.

Type: Grant

Filed: April 30, 2008

Date of Patent: August 9, 2011

Assignee: Microsoft Corporation

Inventors: Alex David Weinstein, Dmitry Yevgenyevich Ryabkov
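Illustration for patent 7996418: the split into frequently-assigned and long-tail tags can be sketched with a frequency count. The cutoff head_size, the sample size tail_sample, and the function name suggest_tags are hypothetical; the abstract does not say how the two groups are delimited or how the sample is drawn, so uniform random sampling is an assumption.

```python
import random
from collections import Counter


def suggest_tags(assigned_tags, head_size=3, tail_sample=2, rng=None):
    """Suggest the most frequent tags plus a random sample of long-tail tags.

    assigned_tags is the history of tags already assigned to the asset.
    """
    rng = rng or random.Random()
    counts = Counter(assigned_tags)
    ranked = [tag for tag, _ in counts.most_common()]
    frequent = ranked[:head_size]    # frequently-assigned group
    long_tail = ranked[head_size:]   # lower-frequency group
    sample = rng.sample(long_tail, min(tail_sample, len(long_tail)))
    return frequent + sample
```

Sampling the tail, rather than always showing its top entries, gives rarely used tags a chance to be seen when a user asks to tag the asset.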
Apparatus, system, and method for volume-level restoration of cluster server data

Patent number: 7992036

Abstract: An apparatus, system, and method are disclosed for restoring cluster server data at a volume level. A setup module opens at least one source volume of a cluster server for a volume-level restore, flushes each buffer for the at least one source volume, closes the at least one source volume, disables file system checks for the cluster disks, saves disk signatures of the cluster disks, and disables device-level checks for the cluster disks. A copy module copies data with a volume-level restore from the at least one snapshot volume to the at least one source volume. A reset module rewrites the saved disk signatures to the cluster disks, re-enables the device-level checks for the cluster disks, and resets at least one volume attribute on the at least one source volume.

Type: Grant

Filed: January 22, 2007

Date of Patent: August 2, 2011

Assignee: International Business Machines Corporation

Inventors: Neeta Garimella, Delbert Barron Hoobler, III
Indexing system and method

Patent number: 7987165

Abstract: An indexing system, including a server for providing access to at least one site, a server agent for creating an index file of data relating to the site, and a central index for storing index information from the index file. The server agent initiates communication with the central index to transfer the index file from the server agent to the central index.

Type: Grant

Filed: December 18, 2000

Date of Patent: July 26, 2011

Assignee: Youramigo Limited

Inventors: Robert James Steele, David Martin Powers
Physical to electronic record content management

Patent number: 7979398

Abstract: Techniques provide a file plan including a plurality of containers, wherein each container is capable of providing management information for record information objects assigned to the container, wherein the record information objects represent documents, wherein one of the containers points to a physical record. An electronic record associated with the physical record is stored. The physical record is automatically associated with the electronic record by updating the file plan.

Type: Grant

Filed: December 22, 2006

Date of Patent: July 12, 2011

Assignee: International Business Machines Corporation

Inventor: Tod DeBie
Application object tuning

Patent number: 7974973

Abstract: Apparatus, methods, and computer readable medium for monitoring a database and for determining aggregate I/O wait times (i.e. for a ‘target’ index or table) associated with at least one I/O category selected from a plurality of I/O categories are disclosed herein.

Type: Grant

Filed: August 7, 2008

Date of Patent: July 5, 2011

Assignee: Precise Software Solutions Inc.

Inventors: Ehud Eshet, Rafi Balbirsky, Sigal Gelbart, Ori Rosen, Ilan Shiber
Apparatus, method and computer-code for quantifying index overhead

Patent number: 7974969

Abstract: Apparatus, methods, and computer readable medium for monitoring a database and for determining an estimated index-overhead for a given index is provided. A description of database performance may be presented to a user in accordance with the determined index overhead. Furthermore, in some embodiments, apparatus, methods and computer-code for (i) determining fractional aggregate index-wait time in accordance with database statement execution plans and (ii) presenting a description of database performance in accordance with the fractional aggregated index-wait time are also disclosed.

Type: Grant

Filed: August 7, 2008

Date of Patent: July 5, 2011

Assignee: Precise Software Solutions Inc.

Inventors: Rafi Balbirsky, Ilanit Nulman
Techniques for implementing indexes on columns in database tables whose values specify periods of time

Patent number: 7970742

Abstract: Techniques for history enabling a table in a database system so that past versions of rows of the history-enabled table are available for temporal querying. The table is history enabled by adding a start time column to the table and creating a history table for the history-enabled table. The history table's rows are copies of rows of the history-enabled table that have changed and include start time and end time fields whose values indicate a period in which the history table's row was in the history-enabled table. Temporal queries are performed on a view which is the union of the history-enabled table and the history table. The temporal queries are speeded up by period of time indexes in which the leaves are grouped based on time period size, identifiers are assigned to the groups, and the keys of the index include the group identifiers.

Type: Grant

Filed: December 1, 2005

Date of Patent: June 28, 2011

Assignee: Oracle International Corporation

Inventors: Robert Hanckel, Jayanta Banerjee, Siva Ravada
Optimizing a storage system to support short data lifetimes

Patent number: 7958093

Abstract: A system and method for optimizing a storage system to support short data object lifetimes and highly utilized storage space are provided. With the system and method, data objects are clustered based on when they are anticipated to be deleted. When an application stores data, the application provides an indicator of the expected lifetime of the data, which may be a retention value, a relative priority of the data object, or the like. Data objects having similar expected lifetimes are clustered together in common data structures so that clusters of objects may be deleted efficiently in a single operation. Expected lifetimes may be changed by applications automatically. The system automatically determines how to handle these changes in expected lifetime using one or more of copying the data object, reclassifying the container in which the data object is held, and ignoring the change in expected lifetime for a time to investigate further changes in expected lifetime of other data objects.

Type: Grant

Filed: September 17, 2004

Date of Patent: June 7, 2011

Assignee: International Business Machines Corporation

Inventors: Kay Schwendimann Anderson, Frederick Douglis, Nagui Halim, John Davis Palmer, Elizabeth Suzanne Richards, David Tao, William Harold Tetzlaff, John Michael Tracey, Joel Leonard Wolf
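Illustration for patent 7958093: clustering objects by anticipated deletion time, so that a whole cluster can be dropped in one operation, might look like the sketch below. The class name LifetimeClusteredStore, the fixed-width time buckets, and the use of plain numeric timestamps are assumptions; the handling of changed lifetimes described in the abstract is not modeled.

```python
from collections import defaultdict


class LifetimeClusteredStore:
    """Groups objects into containers keyed by their expected expiry window."""

    def __init__(self, bucket_seconds=3600):
        self.bucket_seconds = bucket_seconds
        self.containers = defaultdict(dict)  # expiry bucket -> {object id: data}

    def put(self, object_id, data, now, expected_lifetime):
        """Store data; the application supplies the expected lifetime in seconds."""
        bucket = (now + expected_lifetime) // self.bucket_seconds
        self.containers[bucket][object_id] = data
        return bucket

    def expire(self, now):
        """Drop every container whose expiry window has fully passed."""
        current = now // self.bucket_seconds
        expired = [b for b in self.containers if b < current]
        removed = 0
        for b in expired:
            # One pop discards the whole cluster of objects at once.
            removed += len(self.containers.pop(b))
        return removed
```

Because objects with similar expected lifetimes share a container, deletion cost depends on the number of expired containers rather than the number of expired objects.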
Fingerprinting based entity extraction

Patent number: 7950062

Abstract: A system (and a method) is disclosed for fingerprinting based entity extraction using a rolling hash technique. The system is configured to receive an input stream of a predetermined length comprising characters, and a hash table having indexed entries. The system isolates, through a defined fixed window length, a set of characters of the input stream. A hash key is generated and used to index into the hash table. The system compares the isolated set of characters of the input stream with the entry corresponding to the index into the hash table to determine whether there is an exact match with the entry. The system slides the fixed window length one character to isolate another set of characters of the input stream in response to no exact match from the comparison. Alternatively, the system stores the input stream in response to an exact match from the comparison.

Type: Grant

Filed: August 3, 2007

Date of Patent: May 24, 2011

Assignee: Trend Micro Incorporated

Inventors: Liwei Ren, Shu Huang
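Illustration for patent 7950062: the sliding-window matching in the abstract resembles a Rabin-Karp style rolling hash, which is the technique used in this sketch. The constants BASE and MOD, the function names, and the requirement that every known entity has exactly the window length are assumptions, and the sketch returns the match position instead of storing the stream.

```python
BASE = 257        # assumed polynomial base
MOD = 1_000_003   # assumed modulus


def window_hash(text):
    h = 0
    for ch in text:
        h = (h * BASE + ord(ch)) % MOD
    return h


def build_table(entities):
    """Hash table keyed by hash value; each entry lists the entities sharing it."""
    table = {}
    for entity in entities:
        table.setdefault(window_hash(entity), []).append(entity)
    return table


def find_first_entity(stream, table, window):
    """Slide a fixed-length window over stream; return (position, entity) or None."""
    if len(stream) < window:
        return None
    high = pow(BASE, window - 1, MOD)  # weight of the character leaving the window
    h = window_hash(stream[:window])
    for start in range(len(stream) - window + 1):
        candidate = stream[start:start + window]
        # The hash key indexes into the table; an exact comparison confirms the match.
        if candidate in table.get(h, ()):
            return start, candidate
        if start + window < len(stream):
            # No exact match: slide one character and update the hash in O(1).
            h = (h - ord(stream[start]) * high) % MOD
            h = (h * BASE + ord(stream[start + window])) % MOD
    return None
```

The exact comparison after the hash lookup guards against collisions, so a matching hash alone never counts as a hit.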
Automatic publishing of digital content

Patent number: 7945535

Abstract: In one embodiment, there is provided a method for a media storage device to manage digital content. The method comprises determining if there is digital content to be categorized into one or more galleries; automatically categorizing said digital content into the one or more galleries; and for digital content categorized into a gallery with an auto-publish flag, sending at least one of said digital content and a derivative form of said digital content to a server.

Type: Grant

Filed: December 13, 2005

Date of Patent: May 17, 2011

Assignee: Microsoft Corporation

Inventors: Michael J Toutonghi, Jaroslav Bengl
Snapshot indexing

Patent number: 7937372

Abstract: Managing backup data comprises mounting a snapshot of a file system. Each of the plurality of snapshots is taken at a particular time and each comprises a replica of the data set at that particular time. The mounted snapshot is accessed. For each of the one or more file system objects included in the accessed snapshot, index data is added which indicates that each of the one or more file system objects is located within the accessed snapshot. This information is added to an index associated with the snapshot so that it is able to be determined, using the index and without having to again mount the accessed snapshot, whether an object of interest is included in the snapshot.

Type: Grant

Filed: March 17, 2010

Date of Patent: May 3, 2011

Assignee: EMC Corporation

Inventor: Nathan Kryger
Index maintenance for operations involving indexed XML data

Patent number: 7921101

Abstract: A method and system are provided for maintaining an XML index in response to piece-wise modifications on indexed XML documents. The database server that manages the XML index determines which nodes are involved in the piece-wise modifications, and updates the XML index based on only those nodes. Index entries for nodes not involved in the piece-wise modifications remain unchanged.

Type: Grant

Filed: July 15, 2008

Date of Patent: April 5, 2011

Assignee: Oracle International Corporation

Inventors: Ravi Murthy, Sivasankaran Chandrasekaran, Ashish Thusoo, Nipun Agarwal, Eric Sedlar
System and method for data management through decomposition and decay

Patent number: 7912817

Abstract: Data is decayed over time by a type of data item by identifying constituent units of each data item; creating a shelf-life criterion for the constituent units by assigning dimensions to each data item and to each constituent unit; for each of the data items of the data item type, establishing relationship factors for each data item to other data items, between constituent units within data items, and between data items; periodically calculating or updating a decomposability index for each constituent unit as a function of the priority dimensions and the data life dimensions by moving the index towards a threshold for constituent units which are reproducible; and subsequently, decaying the data by deleting from storage constituent units which have decomposability indices exceeding a configured threshold, thereby reducing the amount of storage occupied by a remaining plurality of data items.

Type: Grant

Filed: January 14, 2008

Date of Patent: March 22, 2011

Assignee: International Business Machines Corporation

Inventors: Oriana Jeannette Love, Borna Safabakhsh
Polyarchical data indexing and automatically generated hierarchical data indexing paths

Patent number: 7908253

Abstract: Data indexing using polyarchical indexing codes and automatically generated expansion paths. For a piece of data, an indexing code is received relating to a particular categorization or other indexing parameter. Based upon the indexing code, one or more expansion sets of codes are retrieved and applied to the piece of data. The expansion sets of codes may include indexing codes that relate to hierarchical levels of indexing. The expansion sets of codes may also include different expansion paths through the hierarchical levels of indexing. The polyarchical codes may include multiple cross-categorization of the data across the same or different levels of categories. They may also include multiple expansion paths in different directions across hierarchical levels of categories or indexing.

Type: Grant

Filed: August 7, 2008

Date of Patent: March 15, 2011

Assignee: Factiva, Inc.

Inventors: Jonathan Guy Grenside Cooke, Andrew Richard Young
Index structure for supporting structural XML queries

Patent number: 7890471

Abstract: The present invention provides a ViST (or “virtual suffix tree”), which is a novel index structure for searching XML documents. By representing both XML documents and XML queries in structure-encoded sequences, it is shown that querying XML data is equivalent to finding (non-contiguous) subsequence matches. A variety of XML queries, including those with branches, or wild-cards (‘*’ and ‘//’), can be expressed by structure-encoded sequences. Unlike index methods that disassemble a query into multiple sub-queries, and then join the results of these sub-queries to provide the final answers, ViST uses tree structures as the basic unit of query to avoid expensive join operations. Furthermore, ViST provides a unified index on both content and structure of the XML documents, hence it has a performance advantage over methods indexing either just content or structure.

Type: Grant

Filed: July 19, 2007

Date of Patent: February 15, 2011

Assignee: International Business Machines Corporation

Inventors: Wei Fan, Haixun Wang, Philip Shi-Lung Yu
Mapping online contact information into a contacts list

Patent number: 7885937

Abstract: A presence management system may communicate contact information with mapped values. Contact information may be stored in a hierarchical, extensible structure (“hierarchical extensible contact structure”). Devices in a presence management system utilize a mapping scheme to map contact values (e.g., e-mail address, phone number, etc.) to the appropriate field of the hierarchical extensible contact structure. When devices in the presence management system communicate information for thousands of contacts, employing mapped values to navigate the hierarchical extensible contact structure reduces the size of the messages, thus reducing resource consumption (e.g., bandwidth), particularly on the scale of an enterprise.

Type: Grant

Filed: October 2, 2007

Date of Patent: February 8, 2011

Assignee: International Business Machines Corporation

Inventors: Gary M. Beadle, Michael L. Masterson
Primary server architectural networking arrangement and methods therefor

Patent number: 7882076

Abstract: An arrangement for performing at least one of collecting and analyzing data from a tool cluster configured to process a set of substrates is provided. The arrangement includes a plurality of tools from which at least one tool of the plurality of tools has a chamber for processing at least one of the set of substrates. The arrangement also includes a plurality of secondary servers configured to collect sensor data from the plurality of tools. The arrangement further includes a primary server communicably coupled with the plurality of secondary servers and configured to execute a database management system. The sensor data is indexed using a plurality of indexing applications on the plurality of secondary servers prior to being forwarded to the primary server for use by the database management system. Indexing includes associating a sensor data item with an identity of a server where the sensor data item is stored.

Type: Grant

Filed: December 14, 2006

Date of Patent: February 1, 2011

Assignee: Lam Research Corporation

Inventors: Chad R. Weetman, Chung-Ho Huang
Method and system for offline indexing of content and classifying stored data

Patent number: 7882077

Abstract: A method and system for creating an index of content without interfering with the source of the content includes an offline content indexing system that creates an index of content from an offline copy of data. The system may associate additional properties or tags with data that are not part of traditional indexing of content, such as the time the content was last available or user attributes associated with the content. Users can search the created index to locate content that is no longer available or based on the associated attributes.

Type: Grant

Filed: March 30, 2007

Date of Patent: February 1, 2011

Assignee: CommVault Systems, Inc.

Inventors: Parag Gokhale, Rajiv Kottomtharayil, Deepak R. Attarde, Jun H. Ahn
Database segment searching

Publication number: 20100332457

Abstract: A segment encompasses a number of segment records less than the total number of records of a database. The segment records have values for a field of the database. Lowest and highest values of the segment records for the field, and a bitmap for the segment, can be determined and stored. Selected bits of the bitmap each correspond to a value for the field. Each selected bit is set to one where at least one segment record has the value to which the bit corresponds. An index relating to just the segment records can be determined and stored. The lowest and highest values, and the bitmap, are adapted to permit determination of whether the segment has to be loaded into memory to locate records that satisfy a query. The index is adapted to permit searching of the segment records after the segment has been loaded into the memory.

Type: Application

Filed: June 27, 2009

Publication date: December 30, 2010

Inventor: Goetz Graefe
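
Illustrative sketch (not taken from publication 20100332457): the segment summary described above (lowest value, highest value, bitmap, per-segment index) might look roughly like the following. It assumes an integer field named `key`, one bitmap bit per integer in the segment's range, and non-empty segments; the class and function names are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Segment:
    """Hypothetical segment with a min/max summary, a bitmap and a local index."""
    records: list                      # stand-in for records kept on disk
    lowest: int = 0
    highest: int = 0
    bitmap: int = 0                    # bit (v - lowest) is 1 if some record has value v
    index: dict = field(default_factory=dict)

    def __post_init__(self):
        values = [r["key"] for r in self.records]
        self.lowest, self.highest = min(values), max(values)
        for pos, v in enumerate(values):
            self.bitmap |= 1 << (v - self.lowest)
            self.index.setdefault(v, []).append(pos)

    def may_contain(self, value):
        """Decide from the summary alone whether the segment must be loaded."""
        if not (self.lowest <= value <= self.highest):
            return False
        return bool((self.bitmap >> (value - self.lowest)) & 1)

    def lookup(self, value):
        """Search the segment's own index, as if the segment were now in memory."""
        return [self.records[pos] for pos in self.index.get(value, [])]


def search(segments, value):
    """Load (here: look into) only the segments whose summary admits the value."""
    hits = []
    for seg in segments:
        if seg.may_contain(value):
            hits.extend(seg.lookup(value))
    return hits
```

The range check rejects values outside the segment cheaply, and the bitmap rejects values that fall inside the range but are absent, so a segment is only touched when it can actually contribute records.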
SYSTEM AND METHOD FOR ON-DEMAND INDEXING

Publication number: 20100332501

Abstract: A system and method for on-demand indexing in a data management system is described. An index is generated when it is requested, such as when a database operation requires access to the index. If the index is loaded in memory, the index is retrieved from memory. Otherwise, the index is generated on-demand. A priority configuration identifies at least one priority index which is generated and loaded in memory. The priority configuration can identify priority indexes either directly or indirectly, such as by a threshold parameter.

Type: Application

Filed: June 29, 2009

Publication date: December 30, 2010

Inventors: Mark E. Hanson, Richard T. Endo, Simon D. Shipilfoygel, Emil Antonov, Xidong Zheng, Hayim Hendeles, David E. Brookler
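
Illustrative sketch (not taken from publication 20100332501): on-demand indexing with a priority configuration can be approximated by a cache that builds priority indexes up front and every other index on first request. The class name, the `priority_fields` parameter, and the `builds` counter are assumptions for this example only.

```python
class OnDemandIndexes:
    """Hypothetical index manager: priority indexes are built eagerly, the rest lazily."""

    def __init__(self, rows, priority_fields=()):
        self.rows = rows
        self._loaded = {}          # field name -> index already in memory
        self.builds = 0            # counts how many indexes were generated
        for name in priority_fields:
            self._loaded[name] = self._build(name)

    def _build(self, field_name):
        """Generate an index mapping each field value to its row positions."""
        self.builds += 1
        index = {}
        for pos, row in enumerate(self.rows):
            index.setdefault(row[field_name], []).append(pos)
        return index

    def get(self, field_name):
        """Return the in-memory index if present, otherwise generate it now."""
        if field_name not in self._loaded:
            self._loaded[field_name] = self._build(field_name)
        return self._loaded[field_name]
```

Here the priority indexes are named directly; the abstract also mentions indirect selection, for example by a threshold parameter, which this sketch does not attempt.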
System For Generating A Media Playlist

Publication number: 20100332437

Abstract: A system for generating a media playlist comprising a media management module operable to select a first media item from a plurality of media items stored in a media database for playback; and using raw user input data representing a measure of the popularity of the first media item, generate preference data representing a refined user preference for the first media item; wherein the preference data is used to determine a second media item from the plurality of media items for playback.

Type: Application

Filed: June 26, 2009

Publication date: December 30, 2010

Inventor: Ramin Samadani
SYSTEM, METHOD, AND COMPUTER-READABLE MEDIUM FOR DYNAMIC DETECTION AND MANAGEMENT OF DATA SKEW IN PARALLEL JOIN OPERATIONS

Publication number: 20100332458

Abstract: A system, method, and computer-readable medium for dynamic detection and management of data skew in parallel join operations are provided. Rows allocated to processing modules involved in a join operation are redistributed among the processing modules by a hash redistribution of the join attributes. Receipt by a processing module of an excessive number of redistributed rows having a skewed value on the join attribute is detected by a processing module which notifies other processing modules of the skewed value. Processing modules then terminate redistribution of rows having a join attribute value matching the skewed value and either store such rows locally or duplicate the rows. The processing module that has received an excessive number of redistributed rows removes rows having a skewed value of the join attribute from a redistribution spool allocated thereto and duplicates the rows to each of the processing modules.

Type: Application

Filed: June 30, 2009

Publication date: December 30, 2010

Inventors: Yu Xu, Olli Pekka Kostamaa, Xin Zhou
INFORMATION ARCHIVAL AND RETRIEVAL SYSTEM FOR INTERNETWORKED COMPUTERS

Publication number: 20100325092

Abstract: A computing system can archive information from internetworked computers, such as Internet content, for later retrieval. A server system processes content providers, such as DNS registries and web sites, to extract and store content, including text, image, audio, and video content. For web sites, HTML source code is stored along with a browser-rendered display file. The content is perpetually archived to create a historical record of information for each content provider. An interface is used to retrieve the archived content in response to queries.

Type: Application

Filed: August 31, 2010

Publication date: December 23, 2010

Inventor: Rodney D. Johnson
PROMOTIONAL CONTENT PRESENTATION BASED ON SEARCH QUERY

Publication number: 20100324993

Abstract: In a computer-implemented method of providing digital content, a plurality of web pages is identified, where each of the identified web pages has an associated benefit to be accrued as a result of activity by a user on the identified web page. A search query that includes a search term is received, and one or more of the identified web pages is selected based on the benefits to be accrued as the result of the activity on the identified web pages and a relationship between the identified web pages and the search term. Representations of the selected one or more of the identified web pages are displayed on a display device.

Type: Application

Filed: June 19, 2009

Publication date: December 23, 2010

Applicant: Google Inc.

Inventors: Varun Kacholia, Kedar Dhamdhere, Sugato Basu
ACCURACY MEASUREMENT OF DATABASE SEARCH ALGORITHMS

Publication number: 20100325134

Abstract: A system, method and program product for evaluating search algorithms. A method is provided that includes: defining a population of searches and database records from a search history database; applying a sampling method and direct sampling rates to each search/record pair in the population using a computing system, wherein search/record pairs having a higher variability relative to the population are assigned a relatively higher probability; randomly sampling a direct sample of search/record pairs with the computing system using the direct sampling rates to increase a likelihood of obtaining search/record pairs having the higher variability; running a search algorithm and measuring errors for the direct sample and/or for an associated indirect sample; and calculating an estimated error rate for the search algorithm using inverse probability weighting.

Type: Application

Filed: June 23, 2009

Publication date: December 23, 2010

Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventor: Glenn J. Galfond
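
Illustrative sketch (not taken from publication 20100325134): inverse probability weighting, as named in the abstract above, weights each sampled search/record pair by the reciprocal of its sampling rate. The version below assumes independent (Poisson) sampling and a Horvitz-Thompson style estimate; the function names and the `is_error` callback are hypothetical.

```python
import random


def sample_pairs(pairs, rates, rng):
    """Include pair i independently with probability rates[i]; return sampled positions."""
    return [i for i in range(len(pairs)) if rng.random() < rates[i]]


def estimate_error_rate(sampled, rates, is_error, population_size):
    """Estimate the population error rate from a weighted sample.

    Each sampled error counts 1 / rates[i] times, which compensates for
    pairs that were sampled with low probability.
    """
    weighted_errors = sum(int(is_error(i)) / rates[i] for i in sampled)
    return weighted_errors / population_size
```

For instance, a single sampled error drawn at rate 0.5 from a population of 10 pairs stands in for two errors, giving an estimated error rate of 0.2.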
UNIFIED INVERTED INDEX FOR VIDEO PASSAGE RETRIEVAL

Publication number: 20100318532

Abstract: A method for information retrieval includes extracting from a video document visual data items and textual data items that occur in the document at respective occurrence times. Indexing records, which index both the visual and the textual data items by their respective occurrence times, are constructed and stored in a memory.

Type: Application

Filed: June 10, 2009

Publication date: December 16, 2010

Applicant: International Business Machines Corporation

Inventors: Benjamin Sznajder, Jonathan Mamou
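
Illustrative sketch (not taken from publication 20100318532): a single index holding both visual and textual items by occurrence time could be modelled as one dictionary keyed by item kind and token. The tuple layout, the function names, and the time-window query are assumptions added for this example; the abstract itself only describes constructing and storing the indexing records.

```python
from collections import defaultdict


def build_index(items):
    """Build one index over (kind, token, time_seconds) tuples.

    `kind` is 'visual' or 'text'; both kinds share the same structure,
    so they can be queried together.
    """
    index = defaultdict(list)
    for kind, token, t in items:
        index[(kind, token)].append(t)
    for times in index.values():
        times.sort()
    return index


def co_occurrences(index, key_a, key_b, window):
    """Return (time_a, time_b) pairs where the two items occur within `window` seconds."""
    return [
        (ta, tb)
        for ta in index.get(key_a, [])
        for tb in index.get(key_b, [])
        if abs(ta - tb) <= window
    ]
```

Because both kinds of item are indexed by time in the same structure, a passage can be located by asking where a spoken word and a visual concept occur close together.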
DECLARATIVE FRAMEWORK FOR DEDUPLICATION

Publication number: 20100318499

Abstract: A system, framework, and algorithms for data deduplication are described. A declarative language, such as a Datalog-type logic language, is provided. Programs in the language describe data to be deduplicated and soft and hard constraints that must/should be satisfied by data deduplicated according to the program. To execute the programs, algorithms for performing graph clustering are described.

Type: Application

Filed: June 15, 2009

Publication date: December 16, 2010

Applicant: MICROSOFT CORPORATION

Inventors: Arvind Arasu, Christopher Re, Dan Suciu
Source Classification For Performing Deduplication In A Backup Operation

Publication number: 20100312752

Abstract: A system, method, and computer program product for backing up data from a backup source to a central repository using deduplication, where the data comprises source data segments, is disclosed. A fingerprint cache comprising fingerprints of data segments stored in the central repository is received, where the data segments were previously backed up from the backup source. Source data fingerprints comprising fingerprints (e.g., hash values) of the source data segments are generated. The source data fingerprints are compared to the fingerprints in the fingerprint cache. The source data segments corresponding to fingerprints not in the fingerprint cache may not be currently stored in the central repository. After further queries to the central repository, one or more of the source data segments are sent to the central repository for storage responsive to the comparison.

Type: Application

Filed: June 8, 2009

Publication date: December 9, 2010

Applicant: SYMANTEC CORPORATION

Inventors: Mike Zeis, Weibao Wu
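
Illustrative sketch (not taken from publication 20100312752): the fingerprint comparison described above can be pictured as a two-step filter, first against the local fingerprint cache and then against the repository. SHA-256 is an assumed choice of hash, and `plan_backup` and the `repository_has` callback are hypothetical names.

```python
import hashlib


def fingerprint(segment: bytes) -> str:
    """Fingerprint a data segment; SHA-256 is an assumption for this sketch."""
    return hashlib.sha256(segment).hexdigest()


def plan_backup(source_segments, fingerprint_cache, repository_has):
    """Return the source segments that still need to be sent to the repository."""
    to_send = []
    for segment in source_segments:
        fp = fingerprint(segment)
        if fp in fingerprint_cache:
            continue                  # previously backed up from this source
        if repository_has(fp):
            continue                  # stored already, e.g. by another source
        to_send.append(segment)
    return to_send
```

Only cache misses trigger a query to the repository, so a source whose data has mostly been backed up before generates few queries.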
