Data Indexing; Abstracting; Data Reduction (epo) Patents (Class 707/E17.002)
  • Patent number: 10339115
    Abstract: An information processing device receives data that includes a plurality of items and an item value associated therewith, displays a matrix in which the items included in the data being received are arranged in either one of a row direction or a column direction, and one or more candidate items for an association destination with which a part or all of the items are associated, are arranged in the other direction; receives specification of a set of any item in the items and any item in the one or more candidate items for the association destination, by specification of a position on the matrix, using the processor; and stores a value of an item that is associated with the position having been specified among the items, in a storage unit, in association with an item that is associated with the position having been specified among the candidate items for the association destination.
    Type: Grant
    Filed: August 5, 2016
    Date of Patent: July 2, 2019
    Assignee: FUJITSU LIMITED
    Inventors: Ryou Takizawa, Teppei Nitta, Sachio Takeda
  • Patent number: 9514168
    Abstract: In one exemplary embodiment, a method includes allocating an arena block of a shared memory of a database node server. The arena block is divided into one or more slots. The one or more slots include a discrete and constant area of memory within the arena block. Each slot is assigned a constant-memory address relative to an arena-block's shared memory address. The index is implemented as a red-black tree data structure. Each red-black tree node is mapped to a slot. Each red-black-tree node is provided a pointer to one or more neighbor nodes. The index stored in shared memory can be used during a ‘warm’ rebooting process.
    Type: Grant
    Filed: May 20, 2013
    Date of Patent: December 6, 2016
    Inventors: Sunil Sayyaparaju, Andrew Gooding, Venkatachary Srinivasan
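
The slot-addressing idea in the abstract above can be sketched in Python as follows. This is an illustration under stated assumptions, not the patented implementation: the names `Arena`, `NODE`, `SLOT_SIZE`, and `reattach` are hypothetical, a `bytearray` stands in for shared memory, and red-black rebalancing is omitted (a plain binary search tree insert is used) to keep the example short. What it shows is that nodes refer to their neighbors by slot number relative to the arena base, so the links stay valid when the same block is adopted again after a restart.

```python
import struct

# Hypothetical node layout: key, left slot, right slot (signed 64-bit each).
NODE = struct.Struct("<qqq")
SLOT_SIZE = NODE.size          # every slot has the same, constant size
NIL = -1                       # marker for "no neighbor"


class Arena:
    """Fixed-size slots inside one contiguous block (stand-in for shared memory)."""

    def __init__(self, n_slots):
        self.buf = bytearray(n_slots * SLOT_SIZE)
        self.n_slots = n_slots
        self.used = 0
        self.root = NIL

    def _write(self, slot, key, left, right):
        NODE.pack_into(self.buf, slot * SLOT_SIZE, key, left, right)

    def _read(self, slot):
        return NODE.unpack_from(self.buf, slot * SLOT_SIZE)

    def insert(self, key):
        """Plain BST insert; a real index would also apply red-black rebalancing."""
        if self.used == self.n_slots:
            raise MemoryError("arena full")
        new = self.used
        self.used += 1
        self._write(new, key, NIL, NIL)
        if self.root == NIL:
            self.root = new
            return new
        cur = self.root
        while True:
            k, left, right = self._read(cur)
            if key < k:
                if left == NIL:
                    self._write(cur, k, new, right)
                    return new
                cur = left
            else:
                if right == NIL:
                    self._write(cur, k, left, new)
                    return new
                cur = right

    def contains(self, key):
        cur = self.root
        while cur != NIL:
            k, left, right = self._read(cur)
            if key == k:
                return True
            cur = left if key < k else right
        return False


def reattach(raw, used, root):
    """Simulate a warm restart: adopt an existing block without rebuilding the tree."""
    arena = Arena(len(raw) // SLOT_SIZE)
    arena.buf = bytearray(raw)
    arena.used = used
    arena.root = root
    return arena
```

For example, after inserting a few keys into `Arena(8)`, calling `reattach(bytes(a.buf), a.used, a.root)` yields an index that answers `contains` queries without any node being reinserted, because no stored link depends on an absolute address.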
  • Patent number: 9009435
    Abstract: Systems and computer program products are provided for optimizing selection of files for deletion from one or more data storage devices to free up a predetermined amount of space in the one or more data storage devices. A method includes analyzing an effective space occupied by each file of a plurality of files in the one or more data storage devices, identifying, from the plurality of files, one or more data blocks making up a file to free up the predetermined amount of space based on the analysis of the effective space of each file of the plurality of files, selecting one or more of the plurality of files as one or more candidate files for deletion, based on the identified one or more data blocks, and deleting the one or more candidate files for deletion from the one or more data storage devices.
    Type: Grant
    Filed: August 13, 2012
    Date of Patent: April 14, 2015
    Assignee: International Business Machines Corporation
    Inventors: Duane Mark Baldwin, Sandeep Ramesh Patil, Riyazahamad Moulasab Shiraguppi, Prashant Sodhiya
  • Patent number: 9009434
    Abstract: Systems and computer program products are provided for optimizing selection of files for eviction from a first storage pool to free up a predetermined amount of space in the first storage pool. A method includes analyzing an effective space occupied by each file of a plurality of files in the first storage pool, identifying, from the plurality of files, one or more data blocks making up a file to free up the predetermined amount of space based on the analysis of the effective space of each file of the plurality of files, selecting one or more of the plurality of files as one or more candidate files for eviction, based on the identified one or more data blocks, and evicting the one or more candidate files for eviction from the first storage pool to a second storage pool.
    Type: Grant
    Filed: August 13, 2012
    Date of Patent: April 14, 2015
    Assignee: International Business Machines Corporation
    Inventors: Duane Mark Baldwin, Sandeep Ramesh Patil, Riyazahamad Moulasab Shiraguppi, Prashant Sodhiya
  • Patent number: 9009119
    Abstract: A method, computer program product and system for compressing a multivariate dataset. A dataset is selected that includes a plurality of variates. A first compression method is applied to the values of a first variate of the dataset. A second compression method is applied to the values of a second variate of the dataset, where the second compression method is arranged to compress the second variate values relative to the variation of the corresponding first variate values.
    Type: Grant
    Filed: October 16, 2012
    Date of Patent: April 14, 2015
    Assignee: International Business Machines Corporation
    Inventors: Ahmed H. El-Mahdy, Hisham E. El-Shishiny
  • Publication number: 20150100569
    Abstract: A computer device is configured to identify a document; determine that the document includes an annotation, the annotation describing a user interface that is to be visually displayed in connection with information identifying the document when the information identifying the document is included in a search results document, the user interface including a user interface element that, when selected, causes an action to be performed in connection with the document, and the action being performed without obtaining the document after the user interface element is selected; determine information relating to the user interface based on the annotation; and store, in a search index, the information relating to the user interface in association with the information identifying the document.
    Type: Application
    Filed: June 28, 2012
    Publication date: April 9, 2015
    Applicant: GOOGLE INC.
    Inventor: David REIS DE SOUSA
  • Patent number: 9003152
    Abstract: Methods, systems, and computer program products are provided for optimizing selection of files for eviction from a first storage pool to free up a predetermined amount of space in the first storage pool. A method includes analyzing an effective space occupied by each file of a plurality of files in the first storage pool, identifying, from the plurality of files, one or more data blocks making up a file to free up the predetermined amount of space based on the analysis of the effective space of each file of the plurality of files, selecting one or more of the plurality of files as one or more candidate files for eviction, based on the identified one or more data blocks, and evicting the one or more candidate files for eviction from the first storage pool to a second storage pool.
    Type: Grant
    Filed: November 5, 2013
    Date of Patent: April 7, 2015
    Assignee: International Business Machines Corporation
    Inventors: Duane M. Baldwin, Sandeep R. Patil, Riyazahamad M. Shiraguppi, Prashant Sodhiya
  • Patent number: 9003151
    Abstract: Methods, systems, and computer program products are provided for optimizing selection of files for deletion from one or more data storage devices to free up a predetermined amount of space in the one or more data storage devices. A method includes analyzing an effective space occupied by each file of a plurality of files in the one or more data storage devices, identifying, from the plurality of files, one or more data blocks making up a file to free up the predetermined amount of space based on the analysis of the effective space of each file of the plurality of files, selecting one or more of the plurality of files as one or more candidate files for deletion, based on the identified one or more data blocks, and deleting the one or more candidate files for deletion from the one or more data storage devices.
    Type: Grant
    Filed: November 5, 2013
    Date of Patent: April 7, 2015
    Assignee: International Business Machines Corporation
    Inventors: Duane M. Baldwin, Sandeep R. Patil, Riyazahamad M. Shiraguppi, Prashant Sodhiya
  • Patent number: 8990233
    Abstract: Embodiments of the present disclosure disclose a method for implementing a context aware service application and a related apparatus. One method for implementing a context aware service application includes: receiving, by a context aware service platform, a first context request from a context aware client, where the first context request carries description information corresponding to requested context information; and matching released context sources by using the description information, and if matching succeeds, acquiring context information provided by at least one matched context source, and sending the acquired context information to the context aware client. The technical solutions according to the embodiments of the present disclosure help implement the context aware service application in a flexible and standard manner.
    Type: Grant
    Filed: December 18, 2012
    Date of Patent: March 24, 2015
    Assignee: Huawei Technologies Co., Ltd.
    Inventors: Shan Chen, Qifeng Ma
  • Publication number: 20150081717
    Abstract: Systems, methods and computer program products for using searchable aggregate indices associated with non-aggregated value storage. In one method, a search system stores metadata values for each of a plurality of objects in a storage unit. The metadata values are stored in corresponding value storage locations that are associated with identifiable metadata fields. An aggregate index is provided which includes a dictionary of terms that are contained in metadata values associated with a designated set of the metadata fields. The aggregate index is searched for one or more specific search terms, and one or more of the metadata values are retrieved from the value storage locations in response to the search, where the individual metadata fields associated with the retrieved metadata values are identified.
    Type: Application
    Filed: August 22, 2012
    Publication date: March 19, 2015
    Inventor: Patrick Thomas Sidney Pidduck
  • Patent number: 8983951
    Abstract: Systems and computer program products for relating facts stored in healthcare databases are provided. At least two fact tables stored in a healthcare database including data meeting a criteria of interest are located. An identification key is assigned to the at least two fact tables including the located data meeting the criteria of interest. The identification key provides access to a dimension table including a list of subjects associated with the at least two fact tables including the located data meeting the criteria of interest so as to allow future identification of the subjects meeting the criteria of interest.
    Type: Grant
    Filed: September 6, 2007
    Date of Patent: March 17, 2015
    Assignee: International Business Machines Corporation
    Inventors: Robert R. Friedlander, Anwer M. Khan
  • Patent number: 8965921
    Abstract: In one embodiment, a distributed database system supporting flexible configuration of data clusters is disclosed. The system includes a cluster manager, an index, and a dataset distributed over one or more database clusters. Where the nodes of the clusters may report ownership of a particular range, the index contains an alternate range. The cluster manager receives requests to access a range of data within the database and queries the index to determine the appropriate nodes and/or clusters with which to connect. The cluster manager then directs the requestor to connect to the specified nodes and/or clusters.
    Type: Grant
    Filed: June 6, 2012
    Date of Patent: February 24, 2015
    Assignee: Rackspace US, Inc.
    Inventor: Natasha Gajic
  • Patent number: 8959065
    Abstract: A computer-based monitoring system and monitoring method implemented in computer software for detecting, estimating, and reporting the condition states, their changes, and anomalies for many assets. The assets are of the same type, are operated over a period of time, and are outfitted with data collection systems. The proposed monitoring method accounts for variability of working conditions for each asset by using a regression model that characterizes asset performance. The assets are of the same type but not identical. The proposed monitoring method accounts for asset-to-asset variability; it also accounts for drifts and trends in the asset condition and data. The proposed monitoring system can perform distributed processing of massive amounts of historical data without discarding any useful information where moving all the asset data into one central computing system might be infeasible. The overall processing includes distributed preprocessing of data records from each asset to produce compressed data.
    Type: Grant
    Filed: April 9, 2012
    Date of Patent: February 17, 2015
    Assignee: Mitek Analytics, LLC
    Inventor: Dimitry Gorinevsky
  • Patent number: 8930374
    Abstract: An approach is provided to determine one or more dynamic ordered tree structures and transition tree structures (e.g., based on one or more transitions of a device) to facilitate querying and/or accessing data stores. An apparatus and method determines to generate at least one index structure, determines to associate index objects of the generated index structure with one or more data objects of at least one data store, determines to generate at least one transition index structure based on the at least one generated index structure, and determines to associate the transition index structure with index objects corresponding to one or more data objects of at least one data store based on a transition of a device. Also, the method and apparatus determines to generate at least one query, and determines to generate at least one transition index structure where a current index structure to resolve the query is absent.
    Type: Grant
    Filed: June 29, 2012
    Date of Patent: January 6, 2015
    Assignee: Nokia Corporation
    Inventors: Sergey Boldyrev, Pavandeep Kalra
  • Patent number: 8892566
    Abstract: An index is created for a database by selecting at least one column of a database table as a basis to create the index, generating at least one index of a tree structure according to the at least one column, where a pointer stored in a leaf node of the at least one index is null. In an example embodiment, a value to a pointer is stored in a leaf node according to an intermediate result in response to the intermediate result being generated, where the pointer stored in the leaf node points to a data page storing the intermediate result. The created index can be reused and the intermediate result can be effectively used, such that the efficiency of database operation is improved.
    Type: Grant
    Filed: March 28, 2011
    Date of Patent: November 18, 2014
    Assignee: International Business Machines Corporation
    Inventors: Qi Chen, Hai Feng Li, Guang Zhou Zhang
  • Patent number: 8880533
    Abstract: An apparatus, method and article of manufacture of the present invention detects the presence of references to the same concept in separate sections of text, and, with no input required from the reader, presents the reader with information concerning the detected references to the concept. The information provided may comprise information related to the location of the reference to the concept in other sections of text, and the reader also is provided the ability to move from one reference to a concept directly to another reference to the same concept.
    Type: Grant
    Filed: September 24, 2012
    Date of Patent: November 4, 2014
    Inventor: Philip R Krause
  • Patent number: 8862625
    Abstract: Embodiments of the present invention provide hardware-friendly indexing of databases. In particular, forward and reverse indexing are utilized to allow for easy traversal of primary key to foreign key relationships. A novel structure known as a hit list also allows for easy scanning of various indexes in hardware. Group indexing is provided for flexible support of complex group key definition, such as for date range indexing and text indexing. A Replicated Reordered Column (RRC) may also be added to the group index to convert random I/O pattern into sequential I/O of only needed column elements.
    Type: Grant
    Filed: April 7, 2008
    Date of Patent: October 14, 2014
    Assignee: Teradata US, Inc.
    Inventors: Krishnan Meiyyappan, Liuxi Yang, Jeremy Branscome, Michael Corwin, Ravi Krishnamurthy, Kapil Surlaker, James Shau, Joseph I. Chamdani
  • Patent number: 8825719
    Abstract: Concurrent, incremental, and lock-free stack scanning for garbage collectors is disclosed. This method uses a summary table and return barriers to allow high responsiveness. The method also supports programs that employ fine-synchronization to avoid locks, imposes negligible overhead on program execution, can be used with existing concurrent collectors, and supports the special in-stack references existing in languages such as C#.
    Type: Grant
    Filed: October 30, 2008
    Date of Patent: September 2, 2014
    Assignee: Microsoft Corporation
    Inventors: Bjarne Steensgaard, Erez Petrank, Gabriel Kliot
  • Publication number: 20140214840
    Abstract: Methods, systems and apparatus, including computer programs encoded on a computer storage medium, for disambiguating names in a document corpus. In an aspect, a method includes generating context term lists for a person name, each context term list being a list of context terms from a resource for the person name; clustering the context term lists into a plurality of clusters, each of the clusters of context term lists including context term lists that are most similar to the cluster relative to other clusters; for each of the clusters, selecting a representative term for the cluster; receiving the person name as a search query; and generating a plurality of query suggestions from the search query and the representative terms for the clusters, each query suggestion being a combination of the person name and one representative term.
    Type: Application
    Filed: November 29, 2010
    Publication date: July 31, 2014
    Applicant: GOOGLE INC.
    Inventors: Nitin Gupta, Abhinandan S. Das
  • Publication number: 20140201187
    Abstract: Systems, methods, and computer program products for searching objects, metadata associated with the objects, and attributes assigned to or associated with the metadata. Referred to herein as metadata for the metadata, these attributes may be associated with one or more metadata field values of a metadata field name which, in turn, may be associated with an object being or already indexed in a search index of a search system. Each attribute may be optional, dynamically created, indexed, and searchable via the search index. There can be multiple attributes associated with the same metadata field value, each being represented as a key-value pair. This metadata for the metadata approach can be highly efficient. For example, the ability to search multiple attributes associated with the same metadata field can eliminate the potential need to create multiple metadata fields for the same value in different languages, countries, etc.
    Type: Application
    Filed: August 27, 2012
    Publication date: July 17, 2014
    Inventor: Johan G. Larson
  • Publication number: 20140181063
    Abstract: A search engine may maintain a list of derived metadata. When an event occurs that requires updating a search index, the search engine can determine which metadata is derived metadata and take appropriate actions with respect to the derived metadata. For example, if a request is received to update the index for a particular object, the search engine may protect the derived metadata from change while updating the other metadata in the index. As another example, if a request is received to update the text content for the object, the search engine may change the text content and the derived metadata. By identifying derived metadata, the search engine can protect the derived metadata from change when a request is received that otherwise causes metadata to change and can change the derived metadata when a request is received that would otherwise not change the metadata portion of the index.
    Type: Application
    Filed: August 22, 2012
    Publication date: June 26, 2014
    Inventor: Patrick Thomas Sidney Pidduck
  • Patent number: 8756270
    Abstract: A mechanism is provided in a collective acceleration unit for performing a collective operation to distribute or collect data among a plurality of participant nodes. The mechanism receives an input collective packet for a collective operation from a neighbor node within a collective tree. The input collective packet comprises a tree identifier and an input data field and wherein the collective tree comprises a plurality of sub trees. The mechanism maps the tree identifier to an index within the collective acceleration unit. The index identifies a portion of resources within the collective acceleration unit and is associated with a set of neighbor nodes in a given sub tree within the collective tree. For each neighbor node the collective acceleration unit stores destination information. The collective acceleration unit performs an operation on the input data field using the portion of resources to effect the collective operation.
    Type: Grant
    Filed: April 24, 2012
    Date of Patent: June 17, 2014
    Assignee: International Business Machines Corporation
    Inventors: Lakshminarayana B. Arimilli, Bernard C. Drerup, Paul F. Lecocq, Hanhong Xue
  • Patent number: 8751525
    Abstract: Calculation of aggregated values in a history database table can be optimized using an approach in which an ordered history table is accessed. The ordered history table can include a sequential listing of commit identifiers associated with updates, insertions, and/or deletions to values in the database table. The ordered history table can be traversed in a single pass to calculate an aggregation function using an optimized algorithm. The optimized algorithm can enable calculation of an aggregated metric of the values based on a selected method for tracking invalidated values to their corresponding commit identifiers. The calculated metric is generated for a current version of the database table; and promoted.
    Type: Grant
    Filed: December 23, 2011
    Date of Patent: June 10, 2014
    Assignee: SAP AG
    Inventors: Martin Kaufmann, Norman May, Andreas Tonder, Donald Kossmann
  • Publication number: 20140143212
    Abstract: A server device may receive multiple provider identifiers for a media item from multiple client devices. The multiple provider identifiers may each be associated with different media providers and may each be associated with the same media item. The server device may aggregate the multiple provider identifiers into entries in a data store. The server device may also analyze the entries in the data store and may request missing provider identifiers, merge entries that have duplicate information, and may indicate whether a media item is playable.
    Type: Application
    Filed: November 21, 2012
    Publication date: May 22, 2014
    Applicant: ELECTRONIC ARTS INC.
    Inventor: Trenton Todd Shumay
  • Publication number: 20140129543
    Abstract: Embodiments provide indexing and searching features, but are not so limited. In an embodiment, a search service is configured to use one or more separate number index structures as part of providing a rich search service that includes reliable numerical value range searching functionality. A method of an embodiment operates to extract numbers from original strings of electronic documents to provide a list of terms for a main dictionary and a list of numbers for a separate number index structure as part of providing a search service that efficiently indexes text that contains numbers. Other embodiments are included.
    Type: Application
    Filed: November 2, 2012
    Publication date: May 8, 2014
    Applicant: MICROSOFT CORPORATION
    Inventor: Helge Grenager Solheim
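
A minimal Python sketch of the separate number index described in the abstract above. It is an illustration only: the class name `NumberIndex`, the tokenization rule, and the use of integer document ids are assumptions, not details taken from the application. Numbers are pulled out of each token into a sorted list, while the remaining text goes into the main term dictionary, so a numeric range query becomes two binary searches rather than a scan over string terms.

```python
import bisect
import math
import re

NUMBER = re.compile(r"\d+(?:\.\d+)?")


class NumberIndex:
    """Main term dictionary plus a separate, sorted index of numeric values."""

    def __init__(self):
        self.terms = {}      # term -> set of document ids (main dictionary)
        self.numbers = []    # sorted list of (value, document id); ids are ints

    def add(self, doc_id, text):
        for token in text.lower().split():
            # Numbers found in the token go to the number index.
            for match in NUMBER.findall(token):
                bisect.insort(self.numbers, (float(match), doc_id))
            # Whatever text is left goes to the main dictionary.
            word = NUMBER.sub("", token).strip(".,;:()")
            if word:
                self.terms.setdefault(word, set()).add(doc_id)

    def range_search(self, low, high):
        """Return ids of documents containing a number in [low, high]."""
        lo = bisect.bisect_left(self.numbers, (low,))
        hi = bisect.bisect_right(self.numbers, (high, math.inf))
        return {doc for _, doc in self.numbers[lo:hi]}
```

With documents such as "Model A4 costs 250 USD" and "Price: 99.5", `range_search(90, 300)` matches both, since 250 and 99.5 fall inside the range; the "4" extracted from "A4" falls outside it.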
  • Publication number: 20140129564
    Abstract: Various embodiments present file indexes within a file managing and navigation interface. In one embodiment, a set of files is presented within a user interface of a file managing and navigation application. A visual indicator is associated with at least one file in the set of files. The visual indicator indicates to a user that the at least one file is associated with an index. The index includes a set of index components associated with a content set of the at least one file. A request from the user is received to display the index. The index is presented to the user within the user interface based on receiving the request.
    Type: Application
    Filed: November 6, 2012
    Publication date: May 8, 2014
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Barry Alan KRITT, Sarbajit K. RAKSHIT
  • Publication number: 20140123147
    Abstract: A system, method, and computer program product are provided for reconstructing a sampled suffix array. The sampled suffix array is reconstructed by, for each index of a sampled suffix array for a string, calculating a block value corresponding to the index based on an FM-index, and reconstructing the sampled suffix array corresponding to the string based on the block values. Calculating at least two block values for at least two corresponding indices of the sampled suffix array is performed in parallel.
    Type: Application
    Filed: November 1, 2012
    Publication date: May 1, 2014
    Applicant: NVIDIA CORPORATION
    Inventor: Jacopo Pantaleoni
  • Publication number: 20140122498
    Abstract: Disclosed herein are systems, methods, and software for facilitating gallery environments and views. In at least one implementation an initial view is presented of tagged items arranged in tag groups. The tag groups correspond to tags and the tagged items are arranged in the tag groups based on with which of the tags each of the tagged items is associated. One of the groups may be identified for enhanced viewing. Accordingly, tagged items associated with the identified group, by way of their association with a tag corresponding to the group, are identified. In some implementations they may be referred to as enhanced tagged items. An enhanced view may then be presented of at least the enhanced tagged items.
    Type: Application
    Filed: October 26, 2012
    Publication date: May 1, 2014
    Applicant: Microsoft Corporation
    Inventors: Kin Hong Mok, Avijit Sinha
  • Publication number: 20140122499
    Abstract: Techniques for indexing file paths of items in a content repository may include taking turns in querying each different item type or folder type in a round robin schedule to visit select nodes of the folder tree of that type to update and maintain the file path indexes. Item types or folder types may be associated with a count of instances or children of instances that are missing indexes. For each item type or folder type, a query may be performed for instances of the item type or folder type having children that are missing indexes, the instances or children of the instances returned may be associated with file path indexes, and the count of instances or children of instances may be adjusted based on the associating.
    Type: Application
    Filed: November 1, 2012
    Publication date: May 1, 2014
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventor: David B. Victor
  • Patent number: 8712977
    Abstract: A computer-readable recording medium stores therein an information retrieval program that causes a computer to execute a retrieval process in which files to be retrieved are narrowed down by using a bit string for each character in the files to find characters making up a retrieval keyword to retrieve a keyword identical to or related to the retrieval keyword in the files to be retrieved. The bit strings indicate the presence of the characters in the files. The information retrieval program causes the computer to execute extracting, from among the bit strings, a bit string of an arbitrary character; and compressing the extracted bit string, by using a special Huffman tree having leaves of plural types of symbol strings covering patterns represented by a predetermined number of bits and a special symbol string having a number of bits greater than the predetermined number of bits.
    Type: Grant
    Filed: November 20, 2009
    Date of Patent: April 29, 2014
    Assignee: Fujitsu Limited
    Inventors: Masahiro Kataoka, Masahiro Kurishima, Takashi Tsubokura, Ryouta Komatsu
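
The narrowing step in the abstract above can be sketched in Python as follows. This is a simplified illustration, not the patented program: the function names are hypothetical, Python integers stand in for the per-character bit strings, and the special Huffman-tree compression of those bit strings is left out entirely. Each character maps to a bit string in which bit i is set when file i contains that character; ANDing the bit strings of the keyword's characters leaves only the files worth scanning in full.

```python
from collections import defaultdict


def build_char_bitmaps(files):
    """Map each character to an int whose bit i is set if files[i] contains it."""
    bitmaps = defaultdict(int)
    for i, text in enumerate(files):
        for ch in set(text):
            bitmaps[ch] |= 1 << i
    return bitmaps


def candidate_files(bitmaps, n_files, keyword):
    """AND the bit strings of the keyword's characters to narrow down files."""
    mask = (1 << n_files) - 1
    for ch in set(keyword):
        mask &= bitmaps.get(ch, 0)
    return [i for i in range(n_files) if (mask >> i) & 1]


def search(files, bitmaps, keyword):
    """Scan only the candidate files for the actual keyword."""
    return [i for i in candidate_files(bitmaps, len(files), keyword)
            if keyword in files[i]]
```

The filter can produce false positives, because a file may contain every character of the keyword without containing the keyword itself, which is why `search` still checks each candidate. It never produces false negatives.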
  • Publication number: 20140114949
    Abstract: A system includes a memory operable to store an ontology and a search index. The system also includes a data agent operable to generate a knowledge assertion by parsing one or more data elements retrieved from a data source. The system also includes a knowledge management engine comprising a processor. The knowledge management engine is operable to validate the knowledge assertion based on the ontology. The knowledge management engine is further operable to determine whether to update the ontology with the knowledge assertion. The system also includes a search engine operable to generate the search index at least in part by indexing data stored in the ontology.
    Type: Application
    Filed: October 22, 2012
    Publication date: April 24, 2014
    Applicant: Bank of America Corporation
    Inventors: Susan McClung, Michael K. Hofmeister
  • Publication number: 20140114938
    Abstract: A data compression apparatus generates a global symbol table for overlapping data using a part of the entire data to be compressed and a local symbol table that does not overlap with the global symbol table, and compresses data with a block as a unit. The apparatus increases compression efficiency.
    Type: Application
    Filed: November 6, 2012
    Publication date: April 24, 2014
    Applicant: TIBERO CO., LTD.
    Inventors: Jae Seok AN, Sang Young PARK
  • Publication number: 20140114942
    Abstract: A search index for a collection of documents includes a plurality of keywords associated with the documents. Access to individual documents is detected based on searches employing the search index and keywords are recorded that are utilized in the searches and resulted in document access. The search index is modified to maintain the recorded keywords and remove keywords absent from the searches resulting in the document access.
    Type: Application
    Filed: October 23, 2012
    Publication date: April 24, 2014
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Igor L. Belakovskiy, Matthew E. Broomhall, Itzhack Goldberg, Boaz Mizrachi, Neil Sondhi
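
One way to picture the pruning described in the abstract above is the following Python sketch. It is a simplified reading, with hypothetical names (`PrunableIndex`, `record_access`, `prune`), and it assumes keyword usage is tracked globally rather than per document. Keywords whose searches led to a document being opened are recorded, and pruning keeps those keywords while dropping the rest.

```python
class PrunableIndex:
    """Keyword index that keeps only keywords that led to document access."""

    def __init__(self):
        self.postings = {}        # keyword -> set of document ids
        self.productive = set()   # keywords used in searches that led to access

    def add(self, doc_id, keywords):
        for kw in keywords:
            self.postings.setdefault(kw, set()).add(doc_id)

    def search(self, keyword):
        return sorted(self.postings.get(keyword, ()))

    def record_access(self, keyword, doc_id):
        """Call when a user opens doc_id from the results for keyword."""
        if doc_id in self.postings.get(keyword, ()):
            self.productive.add(keyword)

    def prune(self):
        """Keep recorded keywords; remove keywords never tied to an access."""
        self.postings = {kw: docs for kw, docs in self.postings.items()
                         if kw in self.productive}
```

A production system would presumably prune only after a sufficiently long observation window, since a keyword that has not yet been used is not necessarily useless; the abstract does not specify such a policy.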
  • Publication number: 20140114933
    Abstract: Methods and apparatuses for efficiently migrating deduplicated data are provided. In one example, a data management system includes a data storage volume, a memory including machine executable instructions, and a computer processor. The data storage volume includes data objects and free storage space. The computer processor executes the instructions to perform deduplication of the data objects and determine migration efficiency metrics for groups of the data objects. Determining the migration efficiency metrics includes determining, for each group, a relationship between the free storage space that will result if the group is migrated from the volume and the resources required to migrate the group from the volume.
    Type: Application
    Filed: October 18, 2012
    Publication date: April 24, 2014
    Applicant: NETAPP, INC.
    Inventor: NetApp, Inc.
  • Publication number: 20140114937
    Abstract: An apparatus having a circuit is disclosed. The circuit may be configured to (i) generate a sequence of hash values in a table from a stream of data values with repetitive values, (ii) find two consecutive ones of the hash values in the sequence that have a common value and (iii) create a shortened hash chain by generating a pointer in the table at an intermediate location that corresponds to a second of the two consecutive hash values. The pointer generally points forward in the table to an end location that corresponds to a last of the data values in a run of the data values.
    Type: Application
    Filed: October 24, 2012
    Publication date: April 24, 2014
    Applicant: LSI CORPORATION
    Inventor: Ning Chen
  • Publication number: 20140114932
    Abstract: Methods and apparatuses for performing selective deduplication in a storage system are introduced here. Techniques are provided for determining a probability of deduplication for a data object based on a characteristic of the data object and performing a deduplication operation on the data object in the storage system prior to the data object being stored in persistent storage of the storage system if the probability of deduplication for the data object has a specified relationship to a specified threshold.
    Type: Application
    Filed: October 18, 2012
    Publication date: April 24, 2014
    Applicant: NETAPP, INC.
    Inventor: NETAPP, INC.
  • Publication number: 20140114980
    Abstract: Methods and arrangements for creating a searchable developer directory. A developer profile is generated relative to a project, the developer profile including information from change history with respect to the project. Metrics related to developer participation in the project are included in the developer profile, and the developer profile is indexed with at least one other developer profile to provide a search basis for search queries.
    Type: Application
    Filed: October 19, 2012
    Publication date: April 24, 2014
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Senthil Kumar Kumarasamy Mani, Vibha Singhal Sinha
  • Publication number: 20140114955
    Abstract: A search system, separate from a relational database, generates an index of information in the relational database that can be used to look up business records (or entities). A search system, that is also separate from the relational database, receives typing or other character inputs in a search user input mechanism and generates queries against the index based on the typing inputs, or other character inputs, received. The search system returns results and modifies those results as additional typing inputs, or characters, are received.
    Type: Application
    Filed: October 24, 2012
    Publication date: April 24, 2014
    Applicant: MICROSOFT CORPORATION
    Inventors: Amit Raghunath Kulkarni, Brian Russell Glaeske, Vijeta Johri, Amar Nalla, Pramit H. Desai, Tanmoy Dutta
  • Publication number: 20140108413
    Abstract: A system, method, and computer-readable medium are disclosed for automating the management of a device description repository (DDR). A device properties detection script embedded in a web page is executed when the web page is processed by a browser. Upon execution, the embedded script determines various properties associated with the user's device, which are then provided along with the device's user-agent identifier for processing. In turn, the provided user-agent identifier is used to search a predetermined DDR for a matching user-agent identifier. If a matching user-agent identifier is not found, then the provided user-agent identifier and its corresponding device properties are stored in the DDR. The device properties associated with the user-agent identifier are then used to initiate the provision of device-optimized images to the browser.
    Type: Application
    Filed: October 11, 2012
    Publication date: April 17, 2014
    Applicant: DELL PRODUCTS L.P.
    Inventor: Luis J. Botero
  • Publication number: 20140108361
    Abstract: An approach is provided for compressing location trajectories based on map structure. A compression platform causes, at least in part, a mapping of at least one location trajectory to at least one map to determine one or more intersections traveled along the at least one location trajectory. The compression platform further determines at least one compression key based, at least in part, on one or more outgoing roads of the one or more intersections. The compression platform also causes, at least in part, a compression of the at least one location trajectory based, at least in part, on the at least one compression key.
    Type: Application
    Filed: October 16, 2012
    Publication date: April 17, 2014
    Applicant: Nokia Corporation
    Inventors: Debmalya Biswas, Nikolai Nefedov
  • Publication number: 20140108414
    Abstract: In general, techniques are described for an RDF (Resource Description Framework) database system which can scale to huge size for realistic data sets of practical interest. In some examples, a database system includes a Resource Description Framework (RDF) database that stores a plurality of data chunks to one or more storage drives, wherein each of the plurality of data chunks includes a plurality of triples of the RDF database. The database system also includes a working memory, a query interface that receives a query for the RDF database, a SPARQL engine that identifies a subset of the data chunks relevant to the query, and an index interface that includes one or more bulk loaders that load the subset of the data chunks to the working memory. The SPARQL engine executes the query only against triples included within the loaded subset of the data chunks to obtain a query result.
    Type: Application
    Filed: October 12, 2012
    Publication date: April 17, 2014
    Applicant: ARCHITECTURE TECHNOLOGY CORPORATION
    Inventors: Matthew A. Stillerman, Robert A. Joyce
  • Publication number: 20140108360
    Abstract: A method for generating a compressed navigation map database from uncompressed navigation map data, wherein the uncompressed navigation map data contains different building blocks of navigation data, each building block addressing a functional aspect of the navigation data, each block containing strings of data. The method includes determining, for each block of the uncompressed navigation map data, most frequent substrings of the block; storing, for each block, the determined most frequent substrings of the block in a seed block; replacing, for each block, in the strings the determined most frequent substrings stored in the seed block by a reference to the seed block thereby generating a compressed block for each block; and storing, for each block, the compressed block and the seed block in order to generate the compressed navigation map database.
    Type: Application
    Filed: October 15, 2012
    Publication date: April 17, 2014
    Inventors: Peter Kunath, Marcus Heitmann, Stefan Baptist, Carsten-Christian Spindler, Stavros Mitrakis
  • Publication number: 20140101113
    Abstract: The present disclosure provides for implementing a two-level fingerprint caching scheme for a client cache and a server cache. The client cache hit ratio can be improved by pre-populating the client cache with fingerprints that are relevant to the client. Relevant fingerprints include fingerprints used during a recent time period (e.g., fingerprints of segments that are included in the last full backup image and any following incremental backup images created for the client after the last full backup image), and thus are referred to as fingerprints with good temporal locality. Relevant fingerprints also include fingerprints associated with a storage container that has good spatial locality, and thus are referred to as fingerprints with good spatial locality. A pre-set threshold established for the client cache (e.g., threshold Tc) is used to determine whether a storage container (and thus fingerprints associated with the storage container) has good spatial locality.
    Type: Application
    Filed: October 8, 2012
    Publication date: April 10, 2014
    Applicant: SYMANTEC CORPORATION
    Inventors: Xianbo Zhang, Haibin She, Chao Lei, Xiaobing Song, Shuai Cheng
  • Publication number: 20140101114
    Abstract: Methods, computer systems, and computer program products for processing data in a computing environment are provided. The computer environment for data deduplication storage receives a plurality of write operations for deduplication storage of the data. The data is buffered in a plurality of buffers with overflow temporarily stored to a memory hierarchy when the data received for deduplication storage is sequential or non-sequential. The data is accumulated and updated in the plurality of buffers per a data structure, the data structure serving as a fragment map between the plurality of buffers and a plurality of user file locations. The data is restructured in the plurality of buffers to form a complete sequence of a required sequence size. The data is provided as at least one stream to a stream-based deduplication algorithm for processing and storage.
    Type: Application
    Filed: October 9, 2012
    Publication date: April 10, 2014
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Shay H. Akirav, Ron Edelstein, Michael Hirsch, Ariel J. Ish-Shalom, Liran Loya, Itai Tzur
  • Publication number: 20140095512
    Abstract: Aspects of the present invention provide a tool for hash-based indexing. In an embodiment, a ranked dataset having a plurality of data items is obtained. Every data item in the ranked dataset has a ranking with respect to every other data item in the ranked dataset. A ranking triplet matrix is created based on the ranked dataset. The ranking triplet matrix has a set of ranking triplets, each of which indicates the relative ranking for a pair of the data items in the ranked dataset. This ranking triplet can be merged with a hash table obtained using a standard hash function and the data items can be indexed based on the results.
    Type: Application
    Filed: October 4, 2012
    Publication date: April 3, 2014
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Xu Sun, Jun Wang
  • Publication number: 20140095490
    Abstract: Aspects of the present invention provide a tool for hash-based indexing. In an embodiment, a ranked dataset having a plurality of data items is obtained. Every data item in the ranked dataset has a ranking with respect to every other data item in the ranked dataset. A ranking triplet matrix is created based on the ranked dataset. The ranking triplet matrix has a set of ranking triplets, each of which indicates the relative ranking for a pair of the data items in the ranked dataset. This ranking triplet can be merged with a hash table obtained using a standard hash function and the data items can be indexed based on the results.
    Type: Application
    Filed: September 28, 2012
    Publication date: April 3, 2014
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Xu Sun, Jun Wang
  • Publication number: 20140089315
    Abstract: An apparatus, method and article of manufacture of the present invention detects the presence of references to the same concept in separate sections of text, and, with no input required from the reader, presents the reader with information concerning the detected references to the concept. The information provided may comprise information related to the location of the reference to the concept in other sections of text, and the reader also is provided the ability to move from one reference to a concept directly to another reference to the same concept.
    Type: Application
    Filed: September 24, 2012
    Publication date: March 27, 2014
    Inventor: Philip R. Krause
  • Publication number: 20140089273
    Abstract: Storing and retrieving files based on hashes for the files. One method for storing files includes: identifying a file; identifying a hash calculated based on the file; renaming the file based on the hash calculated based on the file; and storing the file in a particular location based on the hash calculated based on the file. Another method for retrieving files includes: identifying a hash for a given file; using the hash, traversing a hierarchical file structure to find a location where the given file should be stored; determining that the file is at the location; and as a result, retrieving the file.
    Type: Application
    Filed: September 27, 2012
    Publication date: March 27, 2014
    Applicant: MICROSOFT CORPORATION
    Inventors: Ronen Borshack, Anil Francis Thomas, Erez Einav, Philip Ernst Taron
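The store-and-retrieve flow in the abstract above resembles content-addressed storage, which can be sketched as follows. This is an illustration under stated assumptions, not the applicants' method: SHA-256 as the hash, a two-level directory layout built from the first four hex digits, and the function names are all hypothetical choices.

```python
import hashlib
import shutil
from pathlib import Path
from typing import Optional


def hashed_path(root: Path, digest: str) -> Path:
    # Assumed hierarchy: root/<first 2 hex digits>/<next 2 hex digits>/<full digest>.
    return root / digest[:2] / digest[2:4] / digest


def store_file(root: Path, source: Path) -> str:
    """Copy `source` into the store under a name and location derived from its hash."""
    digest = hashlib.sha256(source.read_bytes()).hexdigest()
    target = hashed_path(root, digest)
    target.parent.mkdir(parents=True, exist_ok=True)
    if not target.exists():  # identical content is stored only once
        shutil.copyfile(source, target)
    return digest


def retrieve_file(root: Path, digest: str) -> Optional[bytes]:
    """Walk the hierarchy implied by `digest`; return the contents if the file is there."""
    target = hashed_path(root, digest)
    return target.read_bytes() if target.is_file() else None
```

Because the location is computed from the hash alone, retrieval needs no separate lookup table: the hash identifies where the file should be, and a single existence check decides whether it can be returned.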
  • Publication number: 20140089316
    Abstract: An apparatus, method and article of manufacture of the present invention detects the presence of references to the same concept in separate sections of text, and, with no input required from the reader, presents the reader with information concerning the detected references to the concept. The information provided may comprise information related to the location of the reference to the concept in other sections of text, and the reader also is provided the ability to move from one reference to a concept directly to another reference to the same concept.
    Type: Application
    Filed: September 24, 2012
    Publication date: March 27, 2014
    Inventor: Philip R. Krause
  • Publication number: 20140089269
    Abstract: Expired files in the deduplicating virtual media are selectively erased using a backup application for notifying a backup repository of which expired files are no longer required. The space of the expired files is reclaimed for reuse. Virtual space of the expired files is reserved for allowing the backup application to seek past the reclaimed space to subsequent data in the deduplicating virtual media.
    Type: Application
    Filed: September 24, 2012
    Publication date: March 27, 2014
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Shay H. Akirav, Michael Hirsch