Data Indexing; Abstracting; Data Reduction (epo) Patents (Class 707/E17.002)
  • Publication number: 20120254796
    Abstract: A method is disclosed to convert digital data using a memory card operatively engaged with an apparatus such as a digital camera having control buttons but that does not have a keyboard or keypad. The memory card comprises a central processor, a conversion module and a storage module. The method includes placing the apparatus in a predetermined mode; activating the conversion module in the memory card; selecting at least one file stored in the memory card; and converting the selected at least one file.
    Type: Application
    Filed: January 19, 2010
    Publication date: October 4, 2012
    Inventor: Joon Yong, Wayne Tan
  • Publication number: 20120254115
    Abstract: A method and system for creating secondary copies of data whose contents satisfy searches within data stores is described. In some cases, the system searches for data within a data store, identifies a set of data that satisfies the search, copies the identified set of data, and transfers the copy to secondary or other storage. In some cases, the system utilizes search-based secondary copies of data during restoration processes in order to restore data similar to and/or associated with data requested to be restored.
    Type: Application
    Filed: March 31, 2011
    Publication date: October 4, 2012
    Inventor: Prakash Varadharajan
  • Publication number: 20120254266
    Abstract: Methods and systems for garbage collection are described. In some embodiments, garbage collector threads may maximize local accesses and minimize remote accesses by copying Young objects and Old objects differently. When copying a Young object, a garbage collector thread may determine the lgroup of the pool that contains the object and copy the object to a pool of the same lgroup. The garbage collector thread may spread Old objects among lgroups by copying Old objects to pools of the same lgroup as the respective garbage collector thread. Additional methods and systems are disclosed.
    Type: Application
    Filed: March 31, 2011
    Publication date: October 4, 2012
    Applicant: ORACLE INTERNATIONAL CORPORATION
    Inventors: Antonios Printezis, Igor Veresov, Paul Henry Hohensee, John Coomes
  • Publication number: 20120254194
    Abstract: Managing user bookmark information includes receiving a bookmark-related action request and determining a type of action associated with the bookmark-related action request and user information associated with the bookmark-related action request. In the event that the type of action corresponds to an add bookmark action, managing user bookmark information further includes generating a bookmark data record, the bookmark data record comprising the user information and information to be bookmarked; determining, using the user information, bookmark database information associated with a bookmark database to which the bookmark data record is to be stored, the bookmark database being one of a plurality of bookmark databases; generating index information based on the user information and the bookmark database information; storing the index information in an index database that is separate from the plurality of bookmark databases; and storing the bookmark data record in the bookmark database.
    Type: Application
    Filed: March 28, 2012
    Publication date: October 4, 2012
    Applicant: ALIBABA GROUP HOLDING LIMITED
    Inventor: Ce Wu
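    The add-bookmark flow in the abstract above (build a record, pick one of several bookmark databases from the user information, write index information to a separate index database, then store the record) might be sketched as follows. This is a minimal illustration, not the applicant's implementation: the class name `BookmarkStore`, the CRC32-based database selection, and the in-memory lists and dict standing in for real databases are all assumptions.

```python
import zlib


class BookmarkStore:
    """Toy model: several bookmark 'databases' plus one separate index 'database'."""

    def __init__(self, shard_count=4):
        # Stand-ins for the plurality of bookmark databases (hypothetical layout).
        self.shards = [[] for _ in range(shard_count)]
        # Stand-in for the separate index database: user -> bookmark database number.
        self.index = {}

    def shard_for(self, user):
        # Assumed selection rule: a stable hash of the user identifier.
        return zlib.crc32(user.encode("utf-8")) % len(self.shards)

    def add_bookmark(self, user, url):
        # The record carries both the user information and the bookmarked information.
        record = {"user": user, "url": url}
        shard = self.shard_for(user)
        self.index[user] = shard            # index information, stored separately
        self.shards[shard].append(record)   # record stored in the chosen database
        return record

    def bookmarks_for(self, user):
        # Consult the index first, then read only the database it points to.
        shard = self.index.get(user)
        if shard is None:
            return []
        return [r["url"] for r in self.shards[shard] if r["user"] == user]
```

    Keeping the index apart from the bookmark databases means a lookup touches one small structure and then exactly one database, which appears to be the point of the separation the abstract describes.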
  • Publication number: 20120254132
    Abstract: A method and an apparatus for organizing information in an electronic address book. The method comprises collecting contact information for an electronic address book, comparing a name from any field in said contact information to a database comprising name information, identifying a first name or a surname from the contact information and relocating in the contact information the identified first name to a field assigned to first names or the surname to a field assigned to surnames as a response to a name identified in a wrong field.
    Type: Application
    Filed: March 27, 2012
    Publication date: October 4, 2012
    Inventors: Kimmo Kivirauma, Rami Lehtonen
  • Publication number: 20120254139
    Abstract: A method of providing lock-based access to nodes in a concurrent linked list includes providing a plurality of striped lock objects. Each striped lock object is configured to lock at least one of the nodes in the concurrent linked list. An index is computed based on a value stored in a first node to be accessed in the concurrent linked list. A first one of the striped lock objects is identified based on the computed index. The first striped lock object is acquired, thereby locking and providing protected access to the first node.
    Type: Application
    Filed: June 15, 2012
    Publication date: October 4, 2012
    Applicant: MICROSOFT CORPORATION
    Inventors: Chunyan Song, Joshua Phillips, John Duffy, Tim Harris, Stephen H. Toub, Boby George
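    The striped-lock idea in the abstract above can be illustrated with a short sketch: a fixed pool of lock objects, an index computed from the value stored in a node, and acquisition of the lock at that index before the node is touched. The names `StripedLocks`, `Node`, and `update_payload`, the stripe count, and the use of Python's built-in `hash` as the index function are assumptions made for illustration only.

```python
import threading


class StripedLocks:
    """A fixed pool of locks; each lock guards every node whose value maps to it."""

    def __init__(self, stripes=16):
        self._locks = [threading.Lock() for _ in range(stripes)]

    def index_for(self, value):
        # Index computed from the value stored in the node (assumed hash-modulo rule).
        return hash(value) % len(self._locks)

    def lock_for(self, value):
        return self._locks[self.index_for(value)]


class Node:
    """A linked-list node; 'payload' is a hypothetical field to mutate under the lock."""

    def __init__(self, value, next_node=None):
        self.value = value
        self.next = next_node
        self.payload = None


def update_payload(stripes, node, payload):
    # Acquire the stripe identified by the node's value, then access the node.
    with stripes.lock_for(node.value):
        node.payload = payload
```

    Because many nodes share one stripe, memory use stays bounded by the stripe count rather than the list length, at the cost of occasional contention between unrelated nodes that hash to the same stripe.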
  • Publication number: 20120254189
    Abstract: A multilevel indexing system for indexing documents including structure and content information. The system may include a structure index module generating a structure index for the documents based on a document structure. A content index module may generate a content index for the documents based on a document type and document content. A computerized tree generation module may generate a multilevel indexing tree including the structure and content indexes. A search into the structure index may drive a search into the content index.
    Type: Application
    Filed: March 31, 2011
    Publication date: October 4, 2012
    Inventor: Biren Narendra Shah
  • Publication number: 20120254192
    Abstract: A system and method for enhancing bitmap indexing representation of a dataset, which comprises a plurality of cases and features, each case characterized by one or more values of each feature. Currently, the bins vector for each case in the dataset is a binary array, which is a bitmap indexing representation of each respective feature of the case. The system and method enhance the bitmap indexing by padding each bins vector. The padding is carried out by identifying all target bit locations with a ‘1’ value and replacing at least one ‘0’ bit adjacent to a target bit location with a non-zero numerical value, thereby creating a padded bitmap index. The padding factor may be based on any mathematical or statistical factor concerning population or subpopulation relevant to each of the features of the dataset.
    Type: Application
    Filed: March 29, 2011
    Publication date: October 4, 2012
    Inventor: Roy Gelbard
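    The padding step described in the abstract above can be sketched in a few lines: find every position holding a 1, then replace 0 bits adjacent to it with a non-zero value. The function name `pad_bins_vector`, the fixed `pad_value` of 0.5, and the choice to pad both neighbours are assumptions; the abstract leaves the padding factor open to any statistical rule.

```python
def pad_bins_vector(bins, pad_value=0.5):
    """Return a copy of a 0/1 bins vector with zeros next to a 1 replaced by pad_value."""
    padded = list(bins)
    for i, bit in enumerate(bins):
        if bit != 1:
            continue
        # Look at the left and right neighbours of each target bit in the original vector.
        for j in (i - 1, i + 1):
            if 0 <= j < len(bins) and bins[j] == 0:
                padded[j] = pad_value
    return padded


print(pad_bins_vector([0, 0, 1, 0, 0]))  # [0, 0.5, 1, 0.5, 0]
```

    With padding, two cases that fall into neighbouring bins no longer look completely dissimilar to a distance measure, which a strict 0/1 encoding would not capture.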
  • Publication number: 20120254265
    Abstract: Aspects for conservative garbage collecting are disclosed. In one aspect, a heap of objects is generated during an execution of a script, and script objects in an unexecuted portion are traced to corresponding memory locations on the heap. The heap is then marked concurrently with executing the script such that a marked heap includes reachable and unreachable objects. Memory allocated to the unreachable objects is then freed concurrently with executing the script based on the marking. In another aspect, an object graph associated with a call stack is generated and traced such that script objects in an unexecuted portion of the stack are traced to corresponding memory locations on a heap. Heap objects are marked concurrently with executing the stack so that a marked heap includes reachable and unreachable objects. Memory allocated to the unreachable objects is then cleared concurrently with executing the stack based on the marked heap.
    Type: Application
    Filed: March 29, 2011
    Publication date: October 4, 2012
    Applicant: MICROSOFT CORPORATION
    Inventors: Steven Lucco, Curtis Cheng-Cheng Man
  • Publication number: 20120254190
    Abstract: An extracting method includes storing to a storage device: files that include character units; first index information indicating which file includes at least one character unit in a character unit group having a usage frequency less than a predetermined frequency and among character units having common information in a predetermined portion, the usage frequency indicating the extent of files having a given character unit; second index information indicating which file includes a first character unit having a usage frequency at least equal to the predetermined frequency and among the character units having common information in a predetermined portion; and referring to the first and second index information to extract a file having character units in the first and second index information, when a request is received for extraction of a file having the first character unit and a second character unit that is included in the character unit group.
    Type: Application
    Filed: March 19, 2012
    Publication date: October 4, 2012
    Applicant: FUJITSU LIMITED
    Inventors: Masahiro KATAOKA, Takahiro Murata, Takafumi Ohta
  • Publication number: 20120246132
    Abstract: Overflow access records (OARs) are managed in a database system. An OAR is created in response to receiving an update command for a data record and to the updated data record generated by the update command not fitting onto the page in the table where the data record was stored. The OAR that is created includes an index counter that indicates a number of indexes associated with the table. When an OAR is accessed in response to a query command, an identifier of the accessed OAR is replaced in the index by an identifier of a data record pointed to by the OAR, and the index counter in the accessed OAR is changed by a predefined amount. When the index counter reaches a predefined value, the accessed OAR is removed from the table.
    Type: Application
    Filed: March 2, 2012
    Publication date: September 27, 2012
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Nelke Sebastian, Martin Oberhofer, Yannick Saillet, Jens Seifert, Knut Stolze
  • Publication number: 20120246138
    Abstract: A web content search request including a search term is received at a searching/indexing device. A web search is performed based upon the search term. A markup language (ML) document returned via the web search including the search term is parsed. A location of the search term within the ML document is identified. A hypertext link to the identified location of the search term within the ML document is configured.
    Type: Application
    Filed: June 6, 2012
    Publication date: September 27, 2012
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Theodore R. Carraher, Jake Palmer
  • Publication number: 20120244929
    Abstract: Systems and methods for the processing, storing, and displaying of map data are disclosed. Embodiments of the present invention facilitate more efficient storage, processing, communication, and display of maps and map-related data in connection with modern computers and communications systems. While originally developed for a map-based game, the techniques disclosed herein also have practical applications to other technical fields including but not limited to image processing (e.g., partitioning of images much like maps for storage, communication, or display) and heat mapping (e.g., visualization of a geographical distribution of certain attributes).
    Type: Application
    Filed: May 11, 2012
    Publication date: September 27, 2012
    Inventors: James Allan Oakes, Henry Edward Oakes, Matthew Young, Mykolas Juraitis, Alexander Baev, Oleksandr Cherednychenko, Olga Melnichuk
  • Publication number: 20120246128
    Abstract: The database system of the present invention decides a fragment length responding to a unit of a data process of a parallel arithmetic unit, and stores tuple data containing variable-length data into a fragment and metadata of the fragment into a fragment header, respectively, in a column store database. The database system refers to the metadata when executing a process for data stored in the column store database, decides the fragments to be assigned to each thread that is executed by the parallel arithmetic unit, assigns the fragments to each thread based upon the decided content, and causes each thread to execute a parallel arithmetic operation.
    Type: Application
    Filed: March 21, 2012
    Publication date: September 27, 2012
    Applicant: NEC Corporation
    Inventors: Takehiko KASHIWAGI, Junpei Kamimura
  • Publication number: 20120246129
    Abstract: A data object management scheme for storing a large plurality of small data objects (e.g., image files) in a small number of large object stack files for storage in secondary storage (e.g., hard disks). By storing many individual data objects in a single object stack file, the number of files stored in the secondary storage is reduced by several orders of magnitude, from the billions or millions to the hundreds or so. Index data for each object stack file is generated and stored in primary storage to allow efficient and prompt access to the data objects. Requests to store or retrieve the data objects are made using HTTP messages including file identifiers that identify the files storing the data objects and keys identifying the data objects. A file server stores or retrieves the data object from secondary storage of a file server without converting the requests to NFS or POSIX commands.
    Type: Application
    Filed: June 7, 2012
    Publication date: September 27, 2012
    Inventors: Jeffrey Rothschild, Peter Vajgel, Jason S. Sobel, Robert C. Johnson
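    The core of the scheme in the abstract above (many small objects appended to one large file, with a per-file index held in memory) can be sketched as below. The class name `ObjectStack`, the `(offset, length)` index layout, and the use of any seekable binary file object are assumptions for illustration; the HTTP request handling mentioned in the abstract is omitted.

```python
import io


class ObjectStack:
    """Append small objects to one large file; keep (offset, length) per key in memory."""

    def __init__(self, stack_file):
        self._file = stack_file   # any seekable binary file object
        self._index = {}          # key -> (offset, length), held in primary storage

    def put(self, key, data):
        # Always append at the end of the stack file and remember where the object landed.
        self._file.seek(0, io.SEEK_END)
        offset = self._file.tell()
        self._file.write(data)
        self._index[key] = (offset, len(data))

    def get(self, key):
        # One index lookup, one seek, one read: no per-object file open is needed.
        offset, length = self._index[key]
        self._file.seek(offset)
        return self._file.read(length)
```

    A usage example with an in-memory file: `stack = ObjectStack(io.BytesIO())`, then `stack.put("photo-1", b"...")` and `stack.get("photo-1")`.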
  • Publication number: 20120246272
    Abstract: A method of extracting knowledge comprising: initiating a search application; displaying a user search interface; receiving input parameters via the search interface; identifying a query type based on received input parameters; formulating a database query based on the received input parameters; transmitting the database query to a database, the database being in communication with a file archive indexer for indexing a file archive for storing files and data regarding the files; obtaining database query results from the database, the database storing the user activity data and actual content accessed by the user; providing database query results to a result analyzer module; and displaying search result analyzer module results to a user.
    Type: Application
    Filed: June 4, 2012
    Publication date: September 27, 2012
    Inventors: George Eagan, Prabhdeep Singh
  • Publication number: 20120246166
    Abstract: A method, apparatus, system, article of manufacture, and computer readable medium provide the ability to create a point cloud indexed file. A grid (of cells that are divided into subcells) is mapped over points in a point cloud dataset. An occupancy value, that indicates whether a subcell contains a point, is computed for each subcell. A surface area contribution factor is computed for each cell and identifies a count of subcells that are occupied divided by a total number of subcells. The surface area contribution factor for each cell and points for each cell are written to the point cloud indexed file.
    Type: Application
    Filed: March 24, 2011
    Publication date: September 27, 2012
    Applicant: AUTODESK, INC.
    Inventors: Ravinder P. Krishnaswamy, Jeffrey M. Kowalski, Carl Christer Janson
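    The surface area contribution factor in the abstract above is defined as occupied subcells divided by total subcells per cell. A minimal sketch of that calculation follows, assuming a uniform 3D grid; the function name, `cell_size`, and `subdivisions` (subcells per cell edge) are hypothetical parameters, and writing the results to an indexed file is left out.

```python
import math
from collections import defaultdict


def surface_area_contribution(points, cell_size=1.0, subdivisions=2):
    """Map 3D points to grid cells and return {cell: occupied_subcells / total_subcells}."""
    sub_size = cell_size / subdivisions
    occupied = defaultdict(set)
    for x, y, z in points:
        # Subcell coordinates of the point; a subcell is occupied if any point falls in it.
        sub = (
            math.floor(x / sub_size),
            math.floor(y / sub_size),
            math.floor(z / sub_size),
        )
        # The enclosing cell is found by integer division of the subcell coordinates.
        cell = tuple(s // subdivisions for s in sub)
        occupied[cell].add(sub)
    total = subdivisions ** 3
    return {cell: len(subs) / total for cell, subs in occupied.items()}
```

    Storing occupied subcells in a set means several points in the same subcell count once, which matches the abstract's notion of an occupancy value rather than a point count.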
  • Publication number: 20120246169
    Abstract: Technologies pertaining to compressing time-series signals are described herein. Groups of time-series signals are generated based upon similarities between time-series signals. Each group of time-series signals includes a respective base time-series signal. Ratio signals that are representative of time-series signals are computed, wherein the ratio signals are based upon the base time-series signal and other respective time-series signals in a group of time-series signals.
    Type: Application
    Filed: June 8, 2012
    Publication date: September 27, 2012
    Applicant: MICROSOFT CORPORATION
    Inventors: Jie Liu, Suman Kumar Nath, Feng Zhao, Galen Andrew Reeves, Sorabh Kumar Gandhi
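    The ratio-signal idea in the abstract above can be illustrated for a single group: each signal is expressed relative to the group's base signal. The sketch below assumes an element-wise ratio, equal-length signals, and a base signal with no zero samples; how signals are grouped by similarity is not shown, and the function names are hypothetical.

```python
def ratio_signals(base, others):
    """Express each signal in a group as an element-wise ratio to the base signal."""
    return [[value / b for value, b in zip(signal, base)] for signal in others]


def reconstruct(base, ratio):
    """Recover a signal from the base signal and its ratio signal."""
    return [r * b for r, b in zip(ratio, base)]
```

    When a signal closely tracks its base, the ratio series is nearly constant. That is one plausible reason such a representation could compress well, although the abstract does not state the rationale.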
  • Publication number: 20120239710
    Abstract: An illustrative embodiment of a computer-implemented process for single pass marking of finalizable objects marks strong roots, marks finalizable roots and determines whether a strong work stack is empty. Responsive to a determination that the strong work stack is empty, the computer-implemented process determines whether a finalizable work stack is empty. Responsive to a determination that the finalizable work stack is empty, the computer-implemented process synchronizes threads, finalizes finalizable roots and merges mark maps to finish parallel marking.
    Type: Application
    Filed: March 14, 2011
    Publication date: September 20, 2012
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Peter W. Burka, Jeffrey M. Disher, Daryl J. Maier, Aleksandar Micic, Ryan A. Sciampacone
  • Publication number: 20120239633
    Abstract: A dynamic layer above a sequential deduplication file system (denoted as DFS) implements the rewrite functionality. A user file is composed of one or more DFS files. As incoming data is written into a user file, the data is written by the dynamic layer sequentially into DFS files, created one by one. For each user file this dynamic layer creates and maintains a dynamic metadata file, in a regular, non deduplicated file system. This metadata file contains entries pointing to sections of DFS files.
    Type: Application
    Filed: June 4, 2012
    Publication date: September 20, 2012
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Lior ARONOVICH, Samuel KRIKLER, Asaf LEVY, Amit SCHREIBER
  • Publication number: 20120239660
    Abstract: The invention is directed towards enabling data volume and data type based licensing of software in a distributed system of a plurality of remote and/or local nodes. The invention enables measuring and optionally restricting the use of software based on one or more provided licenses that restrict the amount and type of data that may be processed by the software. New and older licenses may be added together for a single, bulk entitlement for a given volume of data processing for one or all types of data. Different users in the same enterprise may combine license entitlements too. Also, a new license can be acquired repeatedly, without requiring the issuance of combined licenses by the issuing authority and/or the revocation of prior licenses.
    Type: Application
    Filed: March 14, 2011
    Publication date: September 20, 2012
    Applicant: Splunk Inc.
    Inventors: Vishal Patel, Jimmy John, Stephen Phillip Sorkin, Johnathon Lee Cervelli, Mitchell Neuman Blank, JR., Robin Kumar Das
  • Publication number: 20120239627
    Abstract: A data storage apparatus of the present invention includes a data collector that collects time-series data and a sampler that calculates, for each piece of the data, a plurality of change indices indicating change in each piece of the data and determines whether or not the piece of data is to be sampled.
    Type: Application
    Filed: March 15, 2012
    Publication date: September 20, 2012
    Applicant: NEC Corporation
    Inventor: Yoshinori NYUUNOYA
  • Publication number: 20120239652
    Abstract: An indexing database utilizes a non-transitory storage medium. A pattern matching processing unit generates preclassification data for the network data packets utilizing pattern matching analysis. At least one processing unit implements a storage process that receives the network data packets, stores the network data packets in at least one of the slots, and transfers the network data packets to a packet capture repository when slots in a shared memory are full. A preclassification process requests from the pattern matching processing unit the preclassification data. An indexing process determines, based upon the preclassification data, whether to invoke or omit additional analysis of the network data packets, and performs at least one of aggregation, classification, or annotation of the network data packets in the shared memory to maintain one or more indices in the indexing database.
    Type: Application
    Filed: March 15, 2012
    Publication date: September 20, 2012
    Applicant: SOLERA NETWORKS, INC.
    Inventors: Matthew S. Wood, Joseph H. Levy, McKay Marston
  • Publication number: 20120239711
    Abstract: A method for performing garbage collection on an object heap is described. In one embodiment, such a method includes performing a copy phase on an object heap by copying live objects from a source space to a destination space. An abort condition is generated when copying an object from the source space to the destination space fails due to insufficient space. In response to the abort condition, tracing work and reference updating associated with the copy phase are terminated. A mark phase is then initiated that marks live objects in the source space. This mark phase resumes tracing work and reference updating terminated by the copy phase in order to avoid or minimize the repetition of work performed by the copy phase. A corresponding computer program product and system are also described.
    Type: Application
    Filed: March 28, 2012
    Publication date: September 20, 2012
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Peter W. Burka, Jeffrey M. Disher, Daryl J. Maier, Aleksandar Micic, Ryan A. Sciampacone
  • Publication number: 20120233117
    Abstract: An improved scalable object storage system includes methods and systems allowing multiple clusters to work together. In one embodiment, there is a multi-cluster synchronization system between two or more clusters. The multi-cluster synchronization system uses variable compression to optimize the transfer of information between the clusters. Compression is used not only to minimize the total number of bytes sent between the two clusters, but to dynamically vary the size of the objects sent across the wire to optimize for higher throughput after considering packet loss, TCP windows, and block sizes. This includes both the packaging of multiple small files together into one larger compressed file, saving on TCP and header overhead, but also the chunking of large files into multiple smaller files that are less likely to have difficulties due to intermittent network congestion or errors.
    Type: Application
    Filed: October 21, 2011
    Publication date: September 13, 2012
    Applicant: Rackspace US, Inc.
    Inventors: Gregory Lee Holt, Clay Gerrard, David Patrick Goetz, Michael Barton
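    The two packaging behaviours named in the abstract above (bundling several small files into one transfer unit, and chunking a large file into several units) can be sketched with a simple greedy planner. This is an illustration under stated assumptions: `plan_transfers` and the fixed `target_size` are hypothetical, whereas the abstract describes varying the size dynamically from packet loss, TCP windows, and block sizes, and compression itself is not modelled.

```python
def plan_transfers(files, target_size):
    """files: list of (name, size). Returns transfer units, each a list of (name, offset, length)."""
    units = []
    batch, batch_size = [], 0
    for name, size in files:
        if size > target_size:
            # Large file: split into chunks of at most target_size, one chunk per unit.
            for offset in range(0, size, target_size):
                units.append([(name, offset, min(target_size, size - offset))])
            continue
        if batch and batch_size + size > target_size:
            # The current bundle of small files is full; emit it and start a new one.
            units.append(batch)
            batch, batch_size = [], 0
        batch.append((name, 0, size))
        batch_size += size
    if batch:
        units.append(batch)
    return units
```

    In a real system `target_size` would presumably be recomputed between transfers from observed network conditions; here it is a constant so the packing logic stays visible.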
  • Publication number: 20120233133
    Abstract: Method, system, and computer program product embodiments for recording data on a contactless integrated circuit (IC) memory associated with a data storage cartridge are provided. In one exemplary embodiment, an index of a plurality of files to be recorded on a storage media of the data storage cartridge is parsed with a table of contents (TOC) profile file to build a table of contents (TOC) specific to an owning application of the plurality of files. The TOC is written to the contactless IC memory.
    Type: Application
    Filed: May 21, 2012
    Publication date: September 13, 2012
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Shinobu FUJIHARA, Diana J. HELLMAN, Glen A. JAQUETTE
  • Publication number: 20120233174
    Abstract: A system and method for using identification codes found on ordinary articles of commerce to access remote computers on a network. In accordance with one embodiment of the invention, a computer is provided having a database that relates Uniform Product Code (“UPC”) numbers to Internet network addresses (or “URLs”). To access an Internet resource relating to a particular product, a user enters the product's UPC symbol manually, by swiping a bar code reader over the UPC symbol, or via other suitable input means. The database retrieves the URL corresponding to the UPC code. This location information is then used to access the desired resource.
    Type: Application
    Filed: December 13, 2011
    Publication date: September 13, 2012
    Applicant: NEOMEDIA TECHNOLOGIES, INC.
    Inventors: Frank C. Hudetz, Peter R. Hudetz
  • Publication number: 20120233188
    Abstract: A method and apparatus for mapping concepts and attributes to distance fields via rvachev-functions. The steps include generating, for a plurality of objects, equations representing boundaries of attributes for each respective object, converting, for a plurality of objects, the equations into greater than or equal to zero type inequalities, generating, for a plurality of objects, a logical expression combining regions of space defined by the inequalities into a semantic entity, and substituting, for a plurality of objects, the logical expression with a corresponding rvachev-function such that the resulting rvachev-function is equal to 0 on a boundary of the semantic entity, greater than 0 inside a region of the semantic entity, and less than 0 outside the region of the semantic entity. Also included is the step of generating a composite rvachev-function representing logical statements corresponding to the plurality of objects using the respective rvachev-functions of the objects.
    Type: Application
    Filed: March 12, 2012
    Publication date: September 13, 2012
    Inventor: Arun MAJUMDAR
  • Publication number: 20120233147
    Abstract: Indexing and searching features are provided including associated system, methods, and other implementations. A computing system of an embodiment is configured to reuse or repurpose physical index fields for different tenants as part of providing efficient and scalable indexing and searching services. A method of one embodiment operates to provide an indexed data structure that includes a number of reusable index fields that are shared and used to index information associated with a plurality of tenants. Other embodiments are included.
    Type: Application
    Filed: March 11, 2011
    Publication date: September 13, 2012
    Applicant: MICROSOFT CORPORATION
    Inventors: Helge Grenager Solheim, Øystein Fledsberg, Evan Matthew Roark, Michael Susaeg
  • Publication number: 20120233175
    Abstract: A database stores an index table in which index data, used for retrieval of slip data that are generated for every business unit in a business process, are registered. The index data contain a plurality of slip processed data respectively corresponding to the slip data. The slip processed data are data in which the content of a specific item, which contains a predetermined item suitable for grasping a business process in each business and a key item defined in advance in each business, among the items respectively set up for the slip data on various kinds of businesses, is associated in units of slip data.
    Type: Application
    Filed: April 20, 2011
    Publication date: September 13, 2012
    Applicant: IPS CO., LTD.
    Inventor: Toshifumi Akita
  • Publication number: 20120233135
    Abstract: Example apparatus, methods, and computers perform sampling based data de-duplication. One example method controls a data de-duplication computer to compute a sampling sequence for a sub-block of data and to use the sampling sequence to locate a stored sub-block known to the data de-duplication computer. Upon finding a stored sub-block to compare to, the method includes controlling the data de-duplication computer to determine a degree of similarity (e.g., duplicate, very similar, somewhat similar, very dissimilar, completely dissimilar, x % similar) between the sub-block and the stored sub-block and to control whether and how the sub-block is stored and/or transmitted based on the degree of similarity. The degree of similarity can also control whether and how the data de-duplication computer updates a dedupe data structure(s) that stores information for finding groups of similarity sampling sequence related sub-blocks.
    Type: Application
    Filed: January 16, 2012
    Publication date: September 13, 2012
    Applicant: Quantum Corporation
    Inventor: Jeffrey Vincent Tofano
  • Publication number: 20120233176
    Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for storing a plurality of documents in computer-readable memory, each document of the plurality of documents having a corresponding access control list (ACL), each ACL defining a plurality of users that are authorized to access a respective document, generating an index based on the plurality of users, the index comprising a plurality of partitions, each partition corresponding to a user of the plurality of users, and, for each document of the plurality of documents: ranking the users of the plurality of users, selecting a user as an indexing user based on the ranking, and storing the document in a partition of the index, the partition corresponding to the indexing user.
    Type: Application
    Filed: March 7, 2012
    Publication date: September 13, 2012
    Applicant: GOOGLE INC.
    Inventors: Jeffrey Korn, Ruoming Pang, David Held, Dhyanesh Harishchandra Damania
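    The partitioning rule in the abstract above (rank the users on a document's ACL, pick one as the indexing user, store the document in that user's partition) might be sketched as follows. The ranking criterion is not given in the abstract; ranking by how many documents a user can access, with ties broken alphabetically, is purely an assumption, as is the function name. Every document is assumed to have at least one authorized user.

```python
from collections import Counter, defaultdict


def build_partitioned_index(acls):
    """acls maps document id -> set of authorized users. Returns {indexing_user: [doc ids]}."""
    # Hypothetical ranking signal: how many documents each user is authorized to access.
    access_counts = Counter(user for users in acls.values() for user in users)
    index = defaultdict(list)
    for doc, users in acls.items():
        # Rank this document's users; the top-ranked one becomes the indexing user.
        ranked = sorted(users, key=lambda u: (-access_counts[u], u))
        index[ranked[0]].append(doc)
    return dict(index)
```

    Each document is stored once, in a single partition, rather than once per authorized user, which appears to be what selecting one indexing user per document achieves.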
  • Publication number: 20120233136
    Abstract: A method for deleting a relation between a source and a target in a multi-target architecture is described. The multi-target architecture includes a source and multiple space-efficient (SE) targets mapped thereto. In one embodiment, such a method includes initially identifying a relation for deletion from the multi-target architecture. A space-efficient (SE) target associated with the relation is then identified. A mapping structure maps data in logical tracks of the SE target to physical tracks of a repository. The method then identifies a sibling SE target that inherits data from the SE target. Once the SE target and the sibling SE target are identified, the method modifies the mapping structure to map the data in the physical tracks of the repository to the logical tracks of the sibling SE target. The relation is then deleted between the source and the SE target.
    Type: Application
    Filed: April 25, 2012
    Publication date: September 13, 2012
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Michael T. Benhase, JR., Theresa M. Brown, Lokesh M. Gupta, Rivka Mayraz Matosevich, Carol S. Mellgren
  • Publication number: 20120233096
    Abstract: Historical usage data related to user queries and training properties for a plurality of web pages is received and utilized to train a mathematical model to predict the likelihood of retrieval of a web page during a web search. Properties are extracted from the plurality of web pages in the index and the mathematical model is applied to the properties for each web page to calculate a sortrank value. The index is reordered based on the sortrank value such that the web pages most likely to be retrieved by a user submitting a search query appear first in the index. After a search query is received from a user the index is traversed in an order determined by the sortrank value. Responsive web pages are presented to the user in an order determined by a search engine ranking algorithm.
    Type: Application
    Filed: March 7, 2011
    Publication date: September 13, 2012
    Applicant: MICROSOFT CORPORATION
    Inventors: ATUL KUMAR GUPTA, ANNA V. TIMASHEVA, YUAN WANG, RAJKIRAN PANUGANTI, GARGI GHOSH, CHAOPING QIN, YASSER GANJISAFFAR, GIRISH KUMAR, HONGYAN ZHOU
  • Publication number: 20120233127
    Abstract: Method, system, and programs for information search and retrieval. A query is received and is processed to generate a feature-based vector that characterizes the query. A unified representation is then created based on the feature-based vector, that integrates semantic and feature based characterizations of the query. Information relevant to the query is then retrieved from an information archive based on the unified representation of the query. A query response is generated based on the retrieved information relevant to the query and is then transmitted to respond to the query.
    Type: Application
    Filed: March 10, 2011
    Publication date: September 13, 2012
    Applicant: TEXTWISE LLC
    Inventors: Robert Solmer, Wen Ruan
  • Publication number: 20120226671
    Abstract: A file, including visual information or auditory information, may be uploaded to a processing device. Respective portions of content of the file may be identified for compressing and saving at respective bit rates. A number of component files may be created, compressed and saved, at the respective bit rates, based on the identified respective portions of content of the file. A network page, including a reference to the uploaded file, may be created. The reference to the uploaded file, in the network page, may be replaced with references to the compressed, saved component files and the network page may be saved. A processing device of a user may request the network page and the compressed, saved component files. A reasonable facsimile of the file may be reproduced based on an aggregate of the compressed, saved component files.
    Type: Application
    Filed: May 16, 2012
    Publication date: September 6, 2012
    Applicant: Microsoft Corporation
    Inventors: Mark Kar Hong Wong, Trevin Chow, Zachary Steven Emmel, Nathan D. Kile, JR., Derek Lynn Jamison, Jennifer N. Maertens, Justin James Watkins
  • Publication number: 20120221576
    Abstract: Embodiments are directed towards employing compressed journaling for event tracking files for metadata recovery and replication. Event data and related metadata are received from one or more client devices. When a feature within the received metadata is detected that is previously unwritten to a journal, then the previously unwritten feature is written to the journal. Further, when any feature is detected for the received event data that is determined to be different from a feature associated with an immediately preceding event data that is written in the journal, then the detected different feature is identified in the journal. In one embodiment, the identification employs writing to the journal an effective feature record that may employ indices identifying the different feature. The received event data is also written to the journal and may further employ string arguments to minimize recording of redundant information into the journal.
    Type: Application
    Filed: February 28, 2011
    Publication date: August 30, 2012
    Applicant: Splunk Inc.
    Inventors: David Ryan MARQUARDT, Mitchell Neuman Blank, JR., Stephen Sorkin
  • Publication number: 20120221528
    Abstract: According to some embodiments, a column-oriented in-memory database structure may be established. The database structure may, for example, include a main store and a dictionary compressed delta store. Moreover, the delta store may comprise a value identifier vector and a delta dictionary associated with a column of the database. A transaction associated with the column may then be received and recorded within the delta store. According to some embodiments, entries associated with the transaction may be added to a value log of the value identifier vector and, independently, to a dictionary log of the delta dictionary.
    Type: Application
    Filed: December 29, 2011
    Publication date: August 30, 2012
    Applicant: SAP AG
    Inventors: Frank Renkes, Joos-Hendrik Böse
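    Illustrative sketch (not taken from the application): the delta store described above can be pictured as a per-column dictionary plus a vector of value identifiers, each with its own log. The class name `DeltaColumn` and all attribute names are hypothetical, and the logs here are plain in-memory lists rather than durable storage.

```python
class DeltaColumn:
    """Toy dictionary-compressed delta store for one column (hypothetical names)."""

    def __init__(self):
        self.dictionary = []      # delta dictionary: value id -> value
        self.value_to_id = {}     # reverse lookup: value -> value id
        self.value_ids = []       # value identifier vector: one id per row
        self.dictionary_log = []  # log of new dictionary entries
        self.value_log = []       # log of appended value ids

    def insert(self, value):
        """Record one inserted value; the two logs are written independently."""
        vid = self.value_to_id.get(value)
        if vid is None:
            # First time this value is seen: extend the dictionary and its log.
            vid = len(self.dictionary)
            self.dictionary.append(value)
            self.value_to_id[value] = vid
            self.dictionary_log.append((vid, value))
        # Every insert appends to the value identifier vector and its log.
        self.value_ids.append(vid)
        self.value_log.append(vid)
        return vid

    def row(self, i):
        """Decode the value stored at row i."""
        return self.dictionary[self.value_ids[i]]
```

    In this sketch a repeated value adds an entry to the value log only, which is why the two logs can grow at different rates.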
  • Publication number: 20120221534
    Abstract: Managing database indexes includes creating a main index and creating at least one service index that is configured for recording a change to a node to be updated in the main index. Managing database indexes also includes detecting whether an operation that involves the main index and is performed on the database appears in the database, and maintaining the main index using at least one service index in response to the operation that involves the main index and is performed on the database, appearing in the database. The maintaining is performed based on changes to a node to be updated in the main index that are recorded in the at least one service node.
    Type: Application
    Filed: February 13, 2012
    Publication date: August 30, 2012
    Applicant: International Business Machines Corporation
    Inventors: Ying Ming Gao, Jia Huo, Kai Zhang, Xian Zou
  • Publication number: 20120221574
    Abstract: A pivot is determined from enrolled data by a pivot determination unit, raw data is acquired, features are extracted from the raw data, a score is calculated as one of a distance and a degree of similarity between the features, an index vector is generated by using the score for the pivot, a Δ score is calculated as one of a distance and a degree of similarity between the index vectors, a parameter of each non-pivot including a regression coefficient is trained by using training data, the order in which to select the non-pivots is determined, by using the Δ score between search data and the non-pivot as well as the regression coefficient, in descending order of posterior probability through logistic regression, and a search result is outputted based on the score between the search data and the enrolled data.
    Type: Application
    Filed: February 9, 2012
    Publication date: August 30, 2012
    Applicant: HITACHI, LTD.
    Inventors: Takao Murakami, Kenta Takahashi
  • Publication number: 20120215760
    Abstract: One example embodiment includes a method for indexing online references of an entity. The method includes identifying one or more channels of the Internet to be searched for references to an entity and identifying one or more signals to be evaluated within each of the one or more channels. The method also includes crawling the Internet for online references to the entity, wherein crawling the Internet comprises searching the one or more channels of the Internet for references to the entity and evaluating the one or more signals. The method further includes constructing a reverse index of the references, wherein the reverse index is based on each channel in which a reference is found and the one or more signals evaluated for the reference.
    Type: Application
    Filed: April 27, 2012
    Publication date: August 23, 2012
    Applicant: BRIGHTEDGE TECHNOLOGIES, INC.
    Inventors: Lemuel S. PARK, Jimmy YU
  • Publication number: 20120215748
    Abstract: A plurality of workers is configured for parallel processing of deduplicated data entities in a plurality of chunks. The deduplicated data processing rate is regulated using a rate control mechanism. The rate control mechanism incorporates a debt/credit algorithm specifying which of the plurality of workers processing the deduplicated data entities must wait for each of a plurality of calculated required sleep times. The rate control mechanism is adapted to limit a data flow rate based on a penalty acquired during a last processing of one of the plurality of chunks in a retroactive manner, and further adapted to operate on at least one vector representation of at least one limit specification to accommodate a variety of available dimensions corresponding to the at least one limit specification.
    Type: Application
    Filed: April 27, 2012
    Publication date: August 23, 2012
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Shay H. AKIRAV, Ron ASHER, Yariv BACHAR, Lior KLIPPER, Oded SONIN
  • Publication number: 20120215788
    Abstract: A method comprising: receiving sample data for a plurality of channels, wherein the sample data comprises a plurality of separate sample values and each sample value may be identified using at least a channel index that differentiates between channels and a sampling index that differentiates between sample values; performing energy compaction with respect to at least one of the channel indexes and the sampling indexes to create compacted sample values where each compacted sample value may be identified using at least a channel index that differentiates between channels and a sampling index that differentiates between sample values; and selecting some but not all of the compacted sample values for further processing.
    Type: Application
    Filed: November 18, 2009
    Publication date: August 23, 2012
    Applicant: NOKIA CORPORATION
    Inventor: Juha Petteri Ojanpera
  • Publication number: 20120215780
    Abstract: A system for identifying data of interest from among a multiplicity of data elements residing on multiple platforms in an enterprise, the system including background data characterization functionality characterizing the data of interest at least by at least one content characteristic thereof and at least one access metric thereof, the at least one access metric being selected from data access permissions and actual data access history and near real time data matching functionality selecting the data of interest by considering only data elements which have the at least one content characteristic thereof and the at least one access metric thereof from among the multiplicity of data elements.
    Type: Application
    Filed: March 7, 2012
    Publication date: August 23, 2012
    Inventors: Yakov FAITELSON, Ohad KORKUS, David BASS, Ophir KRETZER-KATZIR
  • Publication number: 20120216074
    Abstract: A cluster server manages allocation of free blocks to cluster clients performing writes in a clustered file system. The cluster server manages free block allocation with a free block map and an in-flight block map. The free block map is a data structure or hardware structure with data that indicates blocks or extents of the clustered file system that can be allocated to a client for the client to write data. The in-flight block map is a data structure or hardware structure with data that indicates blocks that have been allocated to clients, but remain in-flight. A block remains in-flight until the clustered file system metadata has been updated to reflect a write performed to that block by a client. After a consistency snapshot of the metadata is published to the storage resources, the data at the block will be visible to other nodes of the cluster.
    Type: Application
    Filed: April 27, 2012
    Publication date: August 23, 2012
    Applicant: International Business Machines Corporation
    Inventors: Joon Chang, Ninad S. Palsule, Andrew N. Solomon
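    Illustrative sketch (not taken from the application): the two maps described above can be modeled as a set of free blocks and a dictionary of in-flight blocks. The class `BlockAllocator`, its method names, and the lowest-block-first allocation policy are assumptions made for illustration only.

```python
class BlockAllocator:
    """Toy cluster-server allocator with a free map and an in-flight map."""

    def __init__(self, num_blocks):
        self.free = set(range(num_blocks))  # free block map
        self.in_flight = {}                 # in-flight block map: block -> client

    def allocate(self, client):
        """Hand a free block to a client; it stays in-flight until committed."""
        if not self.free:
            raise RuntimeError("no free blocks")
        block = min(self.free)  # assumed policy: lowest-numbered free block
        self.free.remove(block)
        self.in_flight[block] = client
        return block

    def metadata_updated(self, block):
        """Called once file system metadata reflects the client's write."""
        self.in_flight.pop(block)
```

    A block that leaves the in-flight map in this sketch does not return to the free map, since it now holds written data.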
  • Publication number: 20120215629
    Abstract: A system and method for applying a database to video multimedia is disclosed. Certain embodiments provide media content owners the capability to exploit video processing capabilities using rich, interactive and compelling visual content on a network. Mechanisms of associating video with commerce offerings are provided. Video server and search server technologies are integrated with ad serving personalization agents to make the final presentations of content and advertising. Algorithms utilized by the system use a variety of techniques for making the final presentation decisions of which ads, with which content, are served to which user.
    Type: Application
    Filed: April 27, 2012
    Publication date: August 23, 2012
    Applicant: Virage, Inc.
    Inventors: David Girouard, Bradley Horowitz, Richard Humphrey, Charles Fuller
  • Publication number: 20120209853
    Abstract: A set of trigrams can be generated for each document in a plurality of documents processed by an e-discovery system. Each trigram in the set of trigrams for a given document is a sequence of three terms in the given document. A set of trigrams for each similar document is then determined based on the set of trigrams for the original document. To facilitate identification of the similar documents, a full text index is then generated for the plurality of documents and the set of trigrams for each document is indexed into the full text index, as individual terms. Queries can be generated into the full text index based on trigrams of a document to determine other similar or near-duplicate documents. After a set of potentially similar documents is identified, a separate distance criterion can be applied to evaluate the level of similarity between the two documents in an efficient way.
    Type: Application
    Filed: February 16, 2011
    Publication date: August 16, 2012
    Applicant: Clearwell Systems, Inc.
    Inventors: Malay Desai, Medha Shewale, Venkat Rangan
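    Illustrative sketch (not taken from the application): the trigram approach above can be approximated with an in-memory inverted index. The function names are hypothetical, whitespace tokenization is a simplification, and Jaccard similarity stands in for the unspecified distance criterion.

```python
from collections import defaultdict


def trigrams(text):
    """Return the set of three-term sequences in a document."""
    terms = text.lower().split()
    return {" ".join(terms[i:i + 3]) for i in range(len(terms) - 2)}


def build_index(docs):
    """Index every trigram as an individual term: trigram -> document ids."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for tri in trigrams(text):
            index[tri].add(doc_id)
    return index


def similar(doc_id, docs, index, threshold=0.5):
    """Find candidates sharing a trigram, then filter by Jaccard similarity."""
    source = trigrams(docs[doc_id])
    candidates = set()
    for tri in source:
        candidates |= index.get(tri, set())
    candidates.discard(doc_id)
    result = []
    for other in sorted(candidates):
        target = trigrams(docs[other])
        jaccard = len(source & target) / len(source | target)
        if jaccard >= threshold:
            result.append(other)
    return result
```

    The index lookup narrows the field cheaply, so the pairwise comparison only runs on documents that already share at least one trigram.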
  • Publication number: 20120209854
    Abstract: Provided is a method for quickly obtaining an intensity value at a desired m/z value in a compressed data obtained by run-length encoding of a mass analysis data. An index is created by pairing either the start position of a section where zero-intensity consecutively occurs two or more times in an array of an original spectrum data, or the start position of a sequence of data having significant intensity values in an array of the original spectrum data, with the corresponding position in an array of a compressed data. This index is stored separate from the compressed data. The creation of the index does not affect the array of the compressed data. Therefore, the data can be decompressed even by a data processing system that does not use the index. The index helps to quickly locate a compressed data corresponding to the desired m/z and obtain the necessary intensity value.
    Type: Application
    Filed: February 14, 2012
    Publication date: August 16, 2012
    Applicant: SHIMADZU CORPORATION
    Inventor: Masahiro IKEGAMI
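    Illustrative sketch (not taken from the application): a side index can map positions in the original spectrum to positions in the run-length-encoded array. The function names are hypothetical, and the encoding is simplified: every run of zeros, including a single zero, is stored as the pair `0, run_length`, whereas the abstract refers to runs of two or more. Intensities are assumed to be non-negative integers.

```python
from bisect import bisect_right


def compress(spectrum):
    """Run-length encode zero runs; return (compressed, index).

    index holds (original_start, compressed_position) for each segment.
    """
    compressed, index = [], []
    i = 0
    while i < len(spectrum):
        index.append((i, len(compressed)))
        j = i
        if spectrum[i] == 0:
            while j < len(spectrum) and spectrum[j] == 0:
                j += 1
            compressed += [0, j - i]     # zero marker followed by run length
        else:
            while j < len(spectrum) and spectrum[j] != 0:
                j += 1
            compressed += spectrum[i:j]  # significant values stored as-is
        i = j
    return compressed, index


def decompress(compressed):
    """Decode without the index, which is stored separately and is optional."""
    out, p = [], 0
    while p < len(compressed):
        if compressed[p] == 0:
            out += [0] * compressed[p + 1]
            p += 2
        else:
            out.append(compressed[p])
            p += 1
    return out


def intensity_at(i, compressed, index):
    """Look up original position i (assumed in range) via the index."""
    k = bisect_right(index, (i, float("inf"))) - 1
    start, pos = index[k]
    if compressed[pos] == 0:
        return 0                         # i falls inside a zero run
    return compressed[pos + (i - start)]
```

    Because the index lives outside the compressed array, `decompress` works unchanged for a reader that has never seen the index.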
  • Publication number: 20120209820
    Abstract: A method of identifying nonreferenced memory elements in a storage system is disclosed. A plurality of lists of referenced elements from a plurality of storage subsystems is input. A union of the lists of referenced elements is compiled. The union of the lists of referenced memory elements is compared to a list of previously referenced memory elements to determine previously referenced elements that are no longer referenced. The previously referenced elements that are no longer referenced are output.
    Type: Application
    Filed: February 9, 2012
    Publication date: August 16, 2012
    Applicant: EMC CORPORATION
    Inventor: R. Hugo Patterson
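    Illustrative sketch (not taken from the application): the comparison described above reduces to a set union followed by a set difference. The function name and the use of plain Python sets are assumptions; a real storage system would likely need an out-of-core representation.

```python
def no_longer_referenced(subsystem_lists, previously_referenced):
    """Return elements referenced before but absent from every current list."""
    # Union of the referenced-element lists from all storage subsystems.
    current = set().union(*subsystem_lists)
    # Anything previously referenced and missing from the union is reclaimable.
    return set(previously_referenced) - current
```

    For example, with current lists `[1, 2]` and `[2, 3]` and a previous list `[1, 2, 3, 4, 5]`, the result is `{4, 5}`.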
  • Publication number: 20120209847
    Abstract: In various embodiments, a semantic space associated with a corpus of electronically stored information (ESI) may be created and used for concept searches. Documents (and any other objects in the ESI, in general) may be represented as vectors in the semantic space. Vectors may correspond to identifiers, such as, for example, indexed terms. The semantic space for a corpus of ESI can be used in information filtering, information retrieval, indexing, and relevancy rankings.
    Type: Application
    Filed: February 16, 2011
    Publication date: August 16, 2012
    Applicant: Clearwell Systems, Inc.
    Inventor: Venkat Rangan