Data Indexing; Abstracting; Data Reduction (epo) Patents (Class 707/E17.002)
-
Publication number: 20120254796Abstract: A method is disclosed to convert digital data using a memory card operatively engaged with an apparatus such as a digital camera having control buttons but that does not have a keyboard or keypad. The memory card comprises a central processor, a conversion module and a storage module. The method includes placing the apparatus in a predetermined mode; activating the conversion module in the memory card; selecting at least one file stored in the memory card; and converting the selected at least one file.Type: ApplicationFiled: January 19, 2010Publication date: October 4, 2012Inventor: Joon Yong, Wayne Tan
-
Publication number: 20120254115Abstract: A method and system for creating secondary copies of data whose contents satisfy searches within data stores is described. In some cases, the system searches for data within a data store, identifies a set of data that satisfies the search, copies the identified set of data, and transfers the copy to secondary or other storage. In some cases, the system utilizes search-based secondary copies of data during restoration processes in order to restore data similar to and/or associated with data requested to be restored.Type: ApplicationFiled: March 31, 2011Publication date: October 4, 2012Inventor: Prakash Varadharajan
-
Publication number: 20120254266Abstract: Methods and systems for garbage collection are described. In some embodiments, garbage collector threads may maximize local accesses and minimize remote accesses by copying Young objects and Old objects differently. When copying a Young object, a garbage collector thread may determine the lgroup of the pool that contains the object and copy the object to a pool of the same lgroup. The garbage collector thread may spread Old objects among lgroups by copying Old objects to pools of the same lgroup as the respective garbage collector thread. Additional methods and systems are disclosed.Type: ApplicationFiled: March 31, 2011Publication date: October 4, 2012Applicant: ORACLE INTERNATIONAL CORPORATIONInventors: Antonios Printezis, Igor Veresov, Paul Henry Hohensee, John Coomes
-
Publication number: 20120254194Abstract: Managing user bookmark information includes receiving a bookmark-related action request and determining a type of action associated with the bookmark-related action request and user information associated with the bookmark-related action request. In the event that the type of action corresponds to an add bookmark action, managing user bookmark information further includes generating a bookmark data record, the bookmark data record comprising the user information and information to be bookmarked; determining, using the user information, bookmark database information associated with a bookmark database to which the bookmark data record is to be stored, the bookmark database being one of a plurality of bookmark databases; generating index information based on the user information and the bookmark database information; storing the index information in an index database that is separate from the plurality of bookmark databases; and storing the bookmark data record in the bookmark database.Type: ApplicationFiled: March 28, 2012Publication date: October 4, 2012Applicant: ALIBABA GROUP HOLDING LIMITEDInventor: Ce Wu
-
Publication number: 20120254132Abstract: A method and an apparatus for organizing information in an electronic address book. The method comprises collecting contact information for an electronic address book, comparing a name from any field in said contact information to a database comprising name information, identifying a first name or a surname from the contact information and relocating in the contact information the identified first name to a field assigned to first names or the surname to a field assigned to surnames as a response to a name identified in a wrong field.Type: ApplicationFiled: March 27, 2012Publication date: October 4, 2012Inventors: Kimmo Kivirauma, Rami Lehtonen
-
Publication number: 20120254139Abstract: A method of providing lock-based access to nodes in a concurrent linked list includes providing a plurality of striped lock objects. Each striped lock object is configured to lock at least one of the nodes in the concurrent linked list. An index is computed based on a value stored in a first node to be accessed in the concurrent linked list. A first one of the striped lock objects is identified based on the computed index. The first striped lock object is acquired, thereby locking and providing protected access to the first node.Type: ApplicationFiled: June 15, 2012Publication date: October 4, 2012Applicant: MICROSOFT CORPORATIONInventors: Chunyan Song, Joshua Phillips, John Duffy, Tim Harris, Stephen H. Toub, Boby George
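The lock-striping idea in this abstract can be illustrated with a minimal sketch. It is not the claimed implementation: the `StripedLocks`, `Node`, and `with_node_locked` names, the stripe count, and the use of `hash(value) % stripes` as the index computation are all assumptions made for illustration.

```python
import threading


class StripedLocks:
    """A fixed pool of locks; each lock guards every node that maps to its stripe."""

    def __init__(self, stripes=8):
        self._locks = [threading.Lock() for _ in range(stripes)]

    def index_for(self, value):
        # Assumed index computation: hash of the node's stored value, modulo stripe count.
        return hash(value) % len(self._locks)

    def lock_for(self, value):
        return self._locks[self.index_for(value)]


class Node:
    """Linked-list node; `value` selects the stripe, `payload` is the protected data."""

    def __init__(self, value, payload=None, next_node=None):
        self.value = value
        self.payload = payload
        self.next = next_node


def with_node_locked(locks, node, action):
    """Acquire the striped lock chosen by the node's value, then run the action."""
    with locks.lock_for(node.value):
        return action(node)
```

Because the number of locks is fixed, two nodes whose values map to the same index share a lock; that trades some contention for a memory footprint that does not grow with the list.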
-
Publication number: 20120254189Abstract: A multilevel indexing system for indexing documents including structure and content information. The system may include a structure index module generating a structure index for the documents based on a document structure. A content index module may generate a content index for the documents based on a document type and document content. A computerized tree generation module may generate a multilevel indexing tree including the structure and content indexes. A search into the structure index may drive a search into the content index.Type: ApplicationFiled: March 31, 2011Publication date: October 4, 2012Inventor: Biren Narendra Shah
-
Publication number: 20120254192Abstract: A system and method for enhancing bitmap indexing representation of a dataset, which comprises a plurality of cases and features, each case characterized by one or more values of each feature. Currently, the bins vector for each case in the dataset, is a binary array, which is a bitmap indexing representation of each respective feature of the case. The system and method enhance the bitmap indexing by padding each bins vector. The padding is carried out by identifying all target bit locations with a ‘1’ value and replacing at least one ‘0’ bit adjacent to a target bit location with a non-zero numerical value, thereby creating a padded bitmap index. The padding factor may be based on any mathematical or statistical factor concerning population or subpopulation relevant to each of the features of the dataset.Type: ApplicationFiled: March 29, 2011Publication date: October 4, 2012Inventor: Roy Gelbard
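The padding step can be sketched as follows. This is only an illustration under stated assumptions: the function name `pad_bins_vector`, the choice to pad both immediate neighbors, and the constant padding value of 0.5 are hypothetical, since the abstract only requires that at least one adjacent '0' bit receive a non-zero value.

```python
def pad_bins_vector(bins, pad_value=0.5):
    """Return a padded copy of a binary bins vector.

    Every '0' bit directly next to a '1' bit (a target bit location) is
    replaced by `pad_value`; all other entries keep their original value.
    """
    padded = [float(bit) for bit in bins]
    for i, bit in enumerate(bins):
        if bit != 1:
            continue
        for j in (i - 1, i + 1):
            # Only pad in-range neighbors that were '0' in the original vector.
            if 0 <= j < len(bins) and bins[j] == 0:
                padded[j] = pad_value
    return padded
```

With this padding, two cases whose values fall into neighboring bins no longer look completely dissimilar when their vectors are compared.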
-
Publication number: 20120254265Abstract: Aspects for conservative garbage collecting are disclosed. In one aspect, a heap of objects is generated during an execution of a script, and script objects in an unexecuted portion are traced to corresponding memory locations on the heap. The heap is then marked concurrently with executing the script such that a marked heap includes reachable and unreachable objects. Memory allocated to the unreachable objects is then freed concurrently with executing the script based on the marking. In another aspect, an object graph associated with a call stack is generated and traced such that script objects in an unexecuted portion of the stack are traced to corresponding memory locations on a heap. Heap objects are marked concurrently with executing the stack so that a marked heap includes reachable and unreachable objects. Memory allocated to the unreachable objects is then cleared concurrently with executing the stack based on the marked heap.Type: ApplicationFiled: March 29, 2011Publication date: October 4, 2012Applicant: MICROSOFT CORPORATIONInventors: Steven Lucco, Curtis Cheng-Cheng Man
-
Publication number: 20120254190Abstract: An extracting method includes storing to a storage device: files that include character units; first index information indicating which file includes at least one character unit in a character unit group having a usage frequency less than a predetermined frequency and among character units having common information in a predetermined portion, the usage frequency indicating the extent of files having a given character unit; second index information indicating which file includes a first character unit having a usage frequency at least equal to the predetermined frequency and among the character units having common information in a predetermined portion; and referring to the first and second index information to extract a file having character units in the first and second index information, when a request is received for extraction of a file having the first character unit and a second character unit that is included in the character unit group.Type: ApplicationFiled: March 19, 2012Publication date: October 4, 2012Applicant: FUJITSU LIMITEDInventors: Masahiro KATAOKA, Takahiro Murata, Takafumi Ohta
-
Publication number: 20120246132Abstract: Overflow access records (OARs) are managed in a database system. An OAR is created in response to receiving an update command for a data record and to the updated data record generated by the update command not fitting onto the page in the table where the data record was stored. The OAR that is created includes an index counter that indicates a number of indexes associated with the table. When an OAR is accessed in response to a query command, an identifier of the accessed OAR is replaced in the index by an identifier of a data record pointed to by the OAR, and the index counter in the accessed OAR is changed by a predefined amount. When the index counter reaches a predefined value, the accessed OAR is removed from the table.Type: ApplicationFiled: March 2, 2012Publication date: September 27, 2012Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATIONInventors: Nelke Sebastian, Martin Oberhofer, Yannick Saillet, Jens Seifert, Knut Stolze
-
Publication number: 20120246138Abstract: A web content search request including a search term is received at a searching/indexing device. A web search is performed based upon the search term. A markup language (ML) document returned via the web search including the search term is parsed. A location of the search term within the ML document is identified. A hypertext link to the identified location of the search term within the ML document is configured.Type: ApplicationFiled: June 6, 2012Publication date: September 27, 2012Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATIONInventors: Theodore R. Carraher, Jake Palmer
-
Publication number: 20120244929Abstract: Systems and methods for the processing, storing, and displaying of map data are disclosed. Embodiments of the present invention facilitate more efficient storage, processing, communication, and display of maps and map-related data in connection with modern computers and communications systems. While originally developed for a map-based game, the techniques disclosed herein also have practical applications to other technical fields including but not limited to image processing (e.g., partitioning of images much like maps for storage, communication, or display) and heat mapping (e.g., visualization of a geographical distribution of certain attributes).Type: ApplicationFiled: May 11, 2012Publication date: September 27, 2012Inventors: James Allan Oakes, Henry Edward Oakes, Matthew Young, Mykolas Juraitis, Alexander Baev, Oleksandr Cherednychenko, Olga Melnichuk
-
Publication number: 20120246128Abstract: The database system of the present invention decides a fragment length responding to a unit of a data process of a parallel arithmetic unit, and stores tuple data containing variable-length data into a fragment and metadata of the fragment into a fragment header, respectively, in a column store database. The database system refers to the metadata when executing a process for data stored in the column store database, decides the fragments to be assigned to each thread that is executed by the parallel arithmetic unit, assigns the fragments to each thread based upon the decided content, and causes each thread to execute a parallel arithmetic operation.Type: ApplicationFiled: March 21, 2012Publication date: September 27, 2012Applicant: NEC CorporationInventors: Takehiko KASHIWAGI, Junpei Kamimura
-
Publication number: 20120246129Abstract: A data object management scheme for storing a large plurality of small data objects (e.g., image files) in a small number of large object stack files for storage in secondary storage (e.g., hard disks). By storing many individual data objects in a single object stack file, the number of files stored in the secondary storage is reduced by several orders of magnitude, from the billions or millions to the hundreds or so. Index data for each object stack file is generated and stored in primary storage to allow efficient and prompt access to the data objects. Requests to store or retrieve the data objects are made using HTTP messages including file identifiers that identify the files storing the data objects and keys identifying the data objects. A file server stores or retrieves the data object from secondary storage of a file server without converting the requests to NFS or POSIX commands.Type: ApplicationFiled: June 7, 2012Publication date: September 27, 2012Inventors: Jeffrey Rothschild, Peter Vajgel, Jason S. Sobel, Robert C. Johnson
-
Publication number: 20120246272Abstract: A method of extracting knowledge comprising: initiating a search application; displaying a user search interface; receiving input parameters via the search interface; identifying a query type based on received input parameters; formulating a database query based on the received input parameters; transmitting the database query to a database, the database being in communication with a file archive indexer for indexing a file archive for storing files and data regarding the files; obtaining database query results from the database, the database storing the user activity data and actual content accessed by the user; providing database query results to a result analyzer module; and displaying search result analyzer module results to a user.Type: ApplicationFiled: June 4, 2012Publication date: September 27, 2012Inventors: George Eagan, Prabhdeep Singh
-
Publication number: 20120246166Abstract: A method, apparatus, system, article of manufacture, and computer readable medium provide the ability to create a point cloud indexed file. A grid (of cells that are divided into subcells) is mapped over points in a point cloud dataset. An occupancy value, that indicates whether a subcell contains a point, is computed for each subcell. A surface area contribution factor is computed for each cell and identifies a count of subcells that are occupied divided by a total number of subcells. The surface area contribution factor for each cell and points for each cell are written to the point cloud indexed file.Type: ApplicationFiled: March 24, 2011Publication date: September 27, 2012Applicant: AUTODESK, INC.Inventors: Ravinder P. Krishnaswamy, Jeffrey M. Kowalski, Carl Christer Janson
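The surface area contribution factor described here is a simple ratio, which the sketch below computes for a 3D grid. The function name `contribution_factors`, the cubic cell shape, and the default cell size and subdivision count are assumptions for illustration; the abstract does not specify them.

```python
from collections import defaultdict


def contribution_factors(points, cell_size=1.0, subdivisions=2):
    """Map each occupied cell to (occupied subcells) / (total subcells).

    Each cell is a cube of side `cell_size`, split into `subdivisions`
    subcells along every axis. Cells containing no points are omitted.
    """
    sub_size = cell_size / subdivisions
    occupied = defaultdict(set)
    for point in points:
        cell = tuple(int(c // cell_size) for c in point)
        # Clamp guards against floating-point results that land on the upper edge.
        sub = tuple(
            min(int((c % cell_size) // sub_size), subdivisions - 1) for c in point
        )
        occupied[cell].add(sub)
    total = subdivisions ** 3
    return {cell: len(subs) / total for cell, subs in occupied.items()}
```

A set per cell records occupancy, so several points falling into the same subcell count once, matching the abstract's notion of an occupancy value per subcell.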
-
Publication number: 20120246169Abstract: Technologies pertaining to compressing time-series signals are described herein. Groups of time-series signals are generated based upon similarities between time-series signals. Each group of time-series signals includes a respective base time-series signal. Ratio signals that are representative of time-series signals are computed, wherein the ratio signals are based upon the base time-series signal and other respective time-series signals in a group of time-series signals.Type: ApplicationFiled: June 8, 2012Publication date: September 27, 2012Applicant: MICROSOFT CORPORATIONInventors: Jie Liu, Suman Kumar Nath, Feng Zhao, Galen Andrew Reeves, Sorabh Kumar Gandhi
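A rough sketch of the ratio-signal idea follows. The abstract does not say how groups or base signals are chosen, so this example takes the base signal as given; the function names and the assumption that the base signal contains no zeros are hypothetical.

```python
def ratio_signals(base, others):
    """Express each signal in a group as an element-wise ratio to the base signal.

    Assumes all signals have the same length and the base contains no zeros.
    """
    return [[value / b for value, b in zip(signal, base)] for signal in others]


def reconstruct(base, ratios):
    """Recover the original signals from the base signal and the ratio signals."""
    return [[r * b for r, b in zip(ratio, base)] for ratio in ratios]
```

When the signals in a group are similar in shape, their ratios to the base vary little over time, so the ratio signals are easier to compress than the raw signals.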
-
Publication number: 20120239710Abstract: An illustrative embodiment of a computer-implemented process for single pass marking of finalizable objects marks strong roots, marks finalizable roots and determines whether a strong work stack is empty. Responsive to a determination that the strong work stack is empty, the computer-implemented process determines whether a finalizable work stack is empty. Responsive to a determination that the finalizable work stack is empty, the computer-implemented process synchronizes threads, finalizes finalizable roots and merges mark maps to finish parallel marking.Type: ApplicationFiled: March 14, 2011Publication date: September 20, 2012Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATIONInventors: Peter W. Burka, Jeffrey M. Disher, Daryl J. Maier, Aleksandar Micic, Ryan A. Sciampacone
-
Publication number: 20120239633Abstract: A dynamic layer above a sequential deduplication file system (denoted as DFS) implements the rewrite functionality. A user file is composed of one or more DFS files. As incoming data is written into a user file, the data is written by the dynamic layer sequentially into DFS files, created one by one. For each user file this dynamic layer creates and maintains a dynamic metadata file, in a regular, non deduplicated file system. This metadata file contains entries pointing to sections of DFS files.Type: ApplicationFiled: June 4, 2012Publication date: September 20, 2012Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATIONInventors: Lior ARONOVICH, Samuel KRIKLER, Asaf LEVY, Amit SCHREIBER
-
Publication number: 20120239660Abstract: The invention is directed towards enabling data volume and data type based licensing of software in a distributed system of a plurality of remote and/or local nodes. The invention enables measuring and optionally restricting the use of software based on one or more provided licenses that restrict the amount and type of data that may be processed by the software. New and older licenses may be added together for a single, bulk entitlement for a given volume of data processing for one or all types of data. Different users in the same enterprise may combine license entitlements too. Also, a new license can be acquired repeatedly, without requiring the issuance of combined licenses by the issuing authority and/or the revocation of prior licenses.Type: ApplicationFiled: March 14, 2011Publication date: September 20, 2012Applicant: Splunk Inc.Inventors: Vishal Patel, Jimmy John, Stephen Phillip Sorkin, Johnathon Lee Cervelli, Mitchell Neuman Blank, JR., Robin Kumar Das
-
Publication number: 20120239627Abstract: A data storage apparatus of the present invention includes a data collector that collects time-series data and a sampler that calculates, for each piece of the data, a plurality of change indices indicating change in each piece of the data and determines whether or not the piece of data is to be sampled.Type: ApplicationFiled: March 15, 2012Publication date: September 20, 2012Applicant: NEC CorporationInventor: Yoshinori NYUUNOYA
-
Publication number: 20120239652Abstract: An indexing database utilizes a non-transitory storage medium. A pattern matching processing unit generates preclassification data for the network data packets utilizing pattern matching analysis. At least one processing unit implements a storage process that receives the network data packets, stores the network data packets in at least one of the slots, and transfers the network data packets to a packet capture repository when slots in a shared memory are full. A preclassification process requests from the pattern matching processing unit the preclassification data. An indexing process determines, based upon the preclassification data, whether to invoke or omit additional analysis of the network data packets, and performs at least one of aggregation, classification, or annotation of the network data packets in the shared memory to maintain one or more indices in the indexing database.Type: ApplicationFiled: March 15, 2012Publication date: September 20, 2012Applicant: SOLERA NETWORKS, INC.Inventors: Matthew S. Wood, Joseph H. Levy, McKay Marston
-
Publication number: 20120239711Abstract: A method for performing garbage collection on an object heap is described. In one embodiment, such a method includes performing a copy phase on an object heap by copying live objects from a source space to a destination space. An abort condition is generated when copying an object from the source space to the destination space fails due to insufficient space. In response to the abort condition, tracing work and reference updating associated with the copy phase are terminated. A mark phase is then initiated that marks live objects in the source space. This mark phase resumes tracing work and reference updating terminated by the copy phase in order to avoid or minimize the repetition of work performed by the copy phase. A corresponding computer program product and system are also described.Type: ApplicationFiled: March 28, 2012Publication date: September 20, 2012Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATIONInventors: Peter W. Burka, Jeffrey M. Disher, Daryl J. Maier, Aleksandar Micic, Ryan A. Sciampacone
-
Publication number: 20120233117Abstract: An improved scalable object storage system includes methods and systems allowing multiple clusters to work together. In one embodiment, there is a multi-cluster synchronization system between two or more clusters. The multi-cluster synchronization system uses variable compression to optimize the transfer of information between the clusters. Compression is used not only to minimize the total number of bytes sent between the two clusters, but to dynamically vary the size of the objects sent across the wire to optimize for higher throughput after considering packet loss, TCP windows, and block sizes. This includes both the packaging of multiple small files together into one larger compressed file, saving on TCP and header overhead, and the chunking of large files into multiple smaller files that are less likely to have difficulties due to intermittent network congestion or errors.Type: ApplicationFiled: October 21, 2011Publication date: September 13, 2012Applicant: Rackspace US, Inc.Inventors: Gregory Lee Holt, Clay Gerrard, David Patrick Goetz, Michael Barton
-
Publication number: 20120233133Abstract: Method, system, and computer program product embodiments for recording data on a contactless integrated circuit (IC) memory associated with a data storage cartridge are provided. In one exemplary embodiment, an index of a plurality of files to be recorded on a storage media of the data storage cartridge is parsed with a table of contents (TOC) profile file to build a table of contents (TOC) specific to an owning application of the plurality of files. The TOC is written to the contactless IC memory.Type: ApplicationFiled: May 21, 2012Publication date: September 13, 2012Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATIONInventors: Shinobu FUJIHARA, Diana J. HELLMAN, Glen A. JAQUETTE
-
Publication number: 20120233174Abstract: A system and method for using identification codes found on ordinary articles of commerce to access remote computers on a network. In accordance with one embodiment of the invention, a computer is provided having a database that relates Uniform Product Code (“UPC”) numbers to Internet network addresses (or “URLs”). To access an Internet resource relating to a particular product, a user enters the product's UPC symbol manually, by swiping a bar code reader over the UPC symbol, or via other suitable input means. The database retrieves the URL corresponding to the UPC code. This location information is then used to access the desired resource.Type: ApplicationFiled: December 13, 2011Publication date: September 13, 2012Applicant: NEOMEDIA TECHNOLOGIES, INC.Inventors: Frank C. Hudetz, Peter R. Hudetz
-
Publication number: 20120233188Abstract: A method and apparatus for mapping concepts and attributes to distance fields via Rvachev-functions. The steps include generating, for a plurality of objects, equations representing boundaries of attributes for each respective object, converting, for a plurality of objects, the equations into greater than or equal to zero type inequalities, generating, for a plurality of objects, a logical expression combining regions of space defined by the inequalities into a semantic entity, and substituting, for a plurality of objects, the logical expression with a corresponding Rvachev-function such that the resulting Rvachev-function is equal to 0 on a boundary of the semantic entity, greater than 0 inside a region of the semantic entity, and less than 0 outside the region of the semantic entity. Also included is the step of generating a composite Rvachev-function representing logical statements corresponding to the plurality of objects using the respective Rvachev-functions of the objects.Type: ApplicationFiled: March 12, 2012Publication date: September 13, 2012Inventor: Arun MAJUMDAR
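The sign behavior described in this abstract (zero on the boundary, positive inside, negative outside) can be seen in a small sketch. It uses one common system of R-functions, in which logical AND and OR of inequalities f >= 0 and g >= 0 become f + g - sqrt(f^2 + g^2) and f + g + sqrt(f^2 + g^2); whether the application uses this particular system is an assumption, and the unit-square region is a made-up example.

```python
import math


def r_and(f, g):
    """R-conjunction: positive only where both f > 0 and g > 0."""
    return f + g - math.sqrt(f * f + g * g)


def r_or(f, g):
    """R-disjunction: positive where f > 0 or g > 0."""
    return f + g + math.sqrt(f * f + g * g)


def unit_square(x, y):
    """Implicit function of the region 0 <= x <= 1 and 0 <= y <= 1.

    Built from four 'greater than or equal to zero' inequalities:
    x >= 0, 1 - x >= 0, y >= 0, 1 - y >= 0.
    """
    in_x = r_and(x, 1.0 - x)
    in_y = r_and(y, 1.0 - y)
    return r_and(in_x, in_y)
```

Evaluating `unit_square` at a point gives a single real number whose sign answers the logical question "is the point in the region?", which is what lets logical expressions be replaced by ordinary functions.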
-
Publication number: 20120233147Abstract: Indexing and searching features are provided including associated system, methods, and other implementations. A computing system of an embodiment is configured to reuse or repurpose physical index fields for different tenants as part of providing efficient and scalable indexing and searching services. A method of one embodiment operates to provide an indexed data structure that includes a number of reusable index fields that are shared and used to index information associated with a plurality of tenants. Other embodiments are included.Type: ApplicationFiled: March 11, 2011Publication date: September 13, 2012Applicant: MICROSOFT CORPORATIONInventors: Helge Grenager Solheim, Øystein Fledsberg, Evan Matthew Roark, Michael Susaeg
-
Publication number: 20120233175Abstract: A database stores an index table in which index data, used for retrieval of slip data generated for every business unit in a business process, are registered. The index data contain a plurality of slip processed data respectively corresponding to the slip data. The slip processed data are data in which the contents of specific items, among the items set up in the slip data for various kinds of business, are associated with each other per unit of slip data. The specific items contain a predetermined item suited to grasping a business process in each business and a key item defined in advance for each business.Type: ApplicationFiled: April 20, 2011Publication date: September 13, 2012Applicant: IPS CO., LTD.Inventor: Toshifumi Akita
-
Publication number: 20120233135Abstract: Example apparatus, methods, and computers perform sampling based data de-duplication. One example method controls a data de-duplication computer to compute a sampling sequence for a sub-block of data and to use the sampling sequence to locate a stored sub-block known to the data de-duplication computer. Upon finding a stored sub-block to compare to, the method includes controlling the data de-duplication computer to determine a degree of similarity (e.g., duplicate, very similar, somewhat similar, very dissimilar, completely dissimilar, x % similar) between the sub-block and the stored sub-block and to control whether and how the sub-block is stored and/or transmitted based on the degree of similarity. The degree of similarity can also control whether and how the data de-duplication computer updates a dedupe data structure(s) that stores information for finding groups of similarity sampling sequence related sub-blocks.Type: ApplicationFiled: January 16, 2012Publication date: September 13, 2012Applicant: Quantum CorporationInventor: Jeffrey Vincent Tofano
-
Publication number: 20120233176Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for storing a plurality of documents in computer-readable memory, each document of the plurality of documents having a corresponding access control list (ACL), each ACL defining a plurality of users that are authorized to access a respective document, generating an index based on the plurality of users, the index comprising a plurality of partitions, each partition corresponding to a user of the plurality of users, and, for each document of the plurality of documents: ranking the users of the plurality of users, selecting a user as an indexing user based on the ranking, and storing the document in a partition of the index, the partition corresponding to the indexing user.Type: ApplicationFiled: March 7, 2012Publication date: September 13, 2012Applicant: GOOGLE INC.Inventors: Jeffrey Korn, Ruoming Pang, David Held, Dhyanesh Harishchandra Damania
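The partitioning scheme can be sketched as below. The abstract does not state how users are ranked, so ranking by the number of documents a user can access (most first, ties broken by name) is purely an assumption, as are the function name and the dictionary-based data layout.

```python
from collections import Counter, defaultdict


def build_partitioned_index(acls):
    """Assign each document to the partition of one 'indexing user'.

    `acls` maps a document id to the set of users allowed to access it.
    Hypothetical ranking: the user who can access the most documents wins.
    """
    access_counts = Counter(user for users in acls.values() for user in users)
    index = defaultdict(list)
    for doc, users in acls.items():
        ranked = sorted(users, key=lambda u: (-access_counts[u], u))
        indexing_user = ranked[0]
        index[indexing_user].append(doc)
    return dict(index)
```

Each document is stored once, in a single partition, even when many users appear on its ACL.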
-
Publication number: 20120233136Abstract: A method for deleting a relation between a source and a target in a multi-target architecture is described. The multi-target architecture includes a source and multiple space-efficient (SE) targets mapped thereto. In one embodiment, such a method includes initially identifying a relation for deletion from the multi-target architecture. A space-efficient (SE) target associated with the relation is then identified. A mapping structure maps data in logical tracks of the SE target to physical tracks of a repository. The method then identifies a sibling SE target that inherits data from the SE target. Once the SE target and the sibling SE target are identified, the method modifies the mapping structure to map the data in the physical tracks of the repository to the logical tracks of the sibling SE target. The relation is then deleted between the source and the SE target.Type: ApplicationFiled: April 25, 2012Publication date: September 13, 2012Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATIONInventors: Michael T. Benhase, JR., Theresa M. Brown, Lokesh M. Gupta, Rivka Mayraz Matosevich, Carol S. Mellgren
-
Publication number: 20120233096Abstract: Historical usage data related to user queries and training properties for a plurality of web pages is received and utilized to train a mathematical model to predict the likelihood of retrieval of a web page during a web search. Properties are extracted from the plurality of web pages in the index and the mathematical model is applied to the properties for each web page to calculate a sortrank value. The index is reordered based on the sortrank value such that the web pages most likely to be retrieved by a user submitting a search query appear first in the index. After a search query is received from a user the index is traversed in an order determined by the sortrank value. Responsive web pages are presented to the user in an order determined by a search engine ranking algorithm.Type: ApplicationFiled: March 7, 2011Publication date: September 13, 2012Applicant: MICROSOFT CORPORATIONInventors: ATUL KUMAR GUPTA, ANNA V. TIMASHEVA, YUAN WANG, RAJKIRAN PANUGANTI, GARGI GHOSH, CHAOPING QIN, YASSER GANJISAFFAR, GIRISH KUMAR, HONGYAN ZHOU
-
Publication number: 20120233127Abstract: Method, system, and programs for information search and retrieval. A query is received and is processed to generate a feature-based vector that characterizes the query. A unified representation is then created based on the feature-based vector, that integrates semantic and feature based characterizations of the query. Information relevant to the query is then retrieved from an information archive based on the unified representation of the query. A query response is generated based on the retrieved information relevant to the query and is then transmitted to respond to the query.Type: ApplicationFiled: March 10, 2011Publication date: September 13, 2012Applicant: TEXTWISE LLCInventors: Robert Solmer, Wen Ruan
-
Publication number: 20120226671Abstract: A file, including visual information or auditory information may be uploaded to a processing device. Respective portions of content of the file may be identified for compressing and saving at respective bit rates. A number of component files may be created, compressed and saved, at the respective bit rates, based on the identified respective portions of content of the file. A network page, including a reference to the uploaded file, may be created. The reference to the uploaded file, in the network page, may be replaced with references to the compressed, saved component files and the network page may be saved. A processing device of a user may request the network page and the compressed, saved component files. A reasonable facsimile of the file may be reproduced based on an aggregate of the compressed, saved component files.Type: ApplicationFiled: May 16, 2012Publication date: September 6, 2012Applicant: Microsoft CorporationInventors: Mark Kar Hong Wong, Trevin Chow, Zachary Steven Emmel, Nathan D. Kile, JR., Derek Lynn Jamison, Jennifer N. Maertens, Justin James Watkins
-
Publication number: 20120221576Abstract: Embodiments are directed towards employing compressed journaling for event tracking files for metadata recovery and replication. Event data and related metadata are received from one or more client devices. When a feature within the received metadata is detected that is previously unwritten to a journal, then the previously unwritten feature is written to the journal. Further, when any feature is detected for the received event data that is determined to be different from a feature associated with the immediately preceding event data written in the journal, the detected different feature is identified in the journal. In one embodiment, the identification employs writing to the journal an effective feature record that may employ indices identifying the different feature. The received event data is also written to the journal and may further employ string arguments to minimize recording of redundant information into the journal.Type: ApplicationFiled: February 28, 2011Publication date: August 30, 2012Applicant: Splunk Inc.Inventors: David Ryan MARQUARDT, Mitchell Neuman Blank, JR., Stephen Sorkin
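The journaling idea in this abstract can be illustrated with a minimal sketch. It is not the patented implementation: the class name `Journal`, the record tags (`"feature"`, `"effective"`, `"event"`) and the in-memory list are all assumptions made for illustration. A feature string is written once, later events refer to it by index, and an "effective feature" record is written only when the feature set differs from that of the preceding event.

```python
class Journal:
    """Toy compressed journal (hypothetical layout, for illustration only)."""

    def __init__(self):
        self.records = []        # the journal itself, as a list of tuples
        self._feature_ids = {}   # feature string -> index already journaled
        self._last_ids = None    # feature indices of the preceding event

    def append(self, event, features):
        ids = []
        for feature in features:
            if feature not in self._feature_ids:
                # Previously unwritten feature: write it to the journal once.
                self._feature_ids[feature] = len(self._feature_ids)
                self.records.append(("feature", self._feature_ids[feature], feature))
            ids.append(self._feature_ids[feature])
        ids = tuple(ids)
        if ids != self._last_ids:
            # Features differ from the preceding event: record the change by index.
            self.records.append(("effective", ids))
            self._last_ids = ids
        self.records.append(("event", event))


journal = Journal()
journal.append("login ok", ("host=web1", "source=auth.log"))
journal.append("login failed", ("host=web1", "source=auth.log"))
journal.append("login ok", ("host=web2", "source=auth.log"))
```

The second event adds only one record because its features match the first; the third adds one new feature record and one effective-feature record before the event itself.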
-
Publication number: 20120221528Abstract: According to some embodiments, a column-oriented in-memory database structure may be established. The database structure may, for example, include a main store and a dictionary compressed delta store. Moreover, the delta store may comprise a value identifier vector and a delta dictionary associated with a column of the database. A transaction associated with the column may then be received and recorded within the delta store. According to some embodiments, entries associated with the transaction may be added to a value log of the value identifier vector and, independently, to a dictionary log of the delta dictionary.Type: ApplicationFiled: December 29, 2011Publication date: August 30, 2012Applicant: SAP AGInventors: Frank Renkes, Joos-Hendrik Böse
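As a rough illustration of a dictionary-compressed delta store, the sketch below keeps a delta dictionary, a value identifier vector, and two independent logs for one column. The class `DeltaColumn` and all attribute names are hypothetical; persistence, the main store, and merging are left out.

```python
class DeltaColumn:
    """Toy dictionary-compressed delta store for a single column."""

    def __init__(self):
        self.dictionary = []      # delta dictionary: value id -> value
        self.value_ids = []       # value identifier vector: one id per row
        self._lookup = {}         # value -> value id, for fast encoding
        self.dictionary_log = []  # log of additions to the dictionary
        self.value_log = []       # log of additions to the value id vector

    def insert(self, value):
        vid = self._lookup.get(value)
        if vid is None:
            # First occurrence: extend the dictionary and log it separately.
            vid = len(self.dictionary)
            self.dictionary.append(value)
            self._lookup[value] = vid
            self.dictionary_log.append((vid, value))
        # Every row appends an id to the vector and to the value log.
        self.value_ids.append(vid)
        self.value_log.append(vid)
        return vid

    def rows(self):
        # Decode the column by looking each id up in the dictionary.
        return [self.dictionary[vid] for vid in self.value_ids]


col = DeltaColumn()
for city in ["Berlin", "Paris", "Berlin"]:
    col.insert(city)
```

Repeated values cost one small integer in the vector, and the dictionary log grows only when a new distinct value appears.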
-
Publication number: 20120221534Abstract: Managing database indexes includes creating a main index and creating at least one service index that is configured for recording a change to a node to be updated in the main index. Managing database indexes also includes detecting whether an operation that involves the main index and is performed on the database appears in the database, and maintaining the main index using at least one service index in response to the operation that involves the main index and is performed on the database, appearing in the database. The maintaining is performed based on changes to a node to be updated in the main index that are recorded in the at least one service node.Type: ApplicationFiled: February 13, 2012Publication date: August 30, 2012Applicant: International Business Machines CorporationInventors: Ying Ming Gao, Jia Huo, Kai Zhang, Xian Zou
-
Publication number: 20120221574Abstract: A pivot is determined from enrolled data by a pivot determination unit, raw data is acquired, features are extracted from the raw data, a score is calculated as one of a distance and a degree of similarity between the features, an index vector is generated by using the score for the pivot, a Δ score is calculated as one of a distance and a degree of similarity between the index vectors, a parameter of each non-pivot including a regression coefficient is trained by using training data, the order in which to select the non-pivots is determined, by using the Δ score between search data and the non-pivot as well as the regression coefficient, in descending order of posterior probability through logistic regression, and a search result is outputted based on the score between the search data and the enrolled data.Type: ApplicationFiled: February 9, 2012Publication date: August 30, 2012Applicant: HITACHI, LTD.Inventors: Takao Murakami, Kenta Takahashi
-
Publication number: 20120215760Abstract: One example embodiment includes a method for indexing online references of an entity. The method includes identifying one or more channels of the Internet to be searched for references to an entity and identifying one or more signals to be evaluated within each of the one or more channels. The method also includes crawling the Internet for online references to the entity, wherein crawling the Internet comprises searching the one or more channels of the Internet for references to the entity and evaluating the one or more signals. The method further includes constructing a reverse index of the references, wherein the reverse index is based on each channel in which a reference is found and the one or more signals evaluated for the reference.Type: ApplicationFiled: April 27, 2012Publication date: August 23, 2012Applicant: BRIGHTEDGE TECHNOLOGIES, INC.Inventors: Lemuel S. PARK, Jimmy YU
-
Publication number: 20120215748Abstract: A plurality of workers is configured for parallel processing of deduplicated data entities in a plurality of chunks. The deduplicated data processing rate is regulated using a rate control mechanism. The rate control mechanism incorporates a debt/credit algorithm specifying which of the plurality of workers processing the deduplicated data entities must wait for each of a plurality of calculated required sleep times. The rate control mechanism is adapted to limit a data flow rate based on a penalty acquired during a last processing of one of the plurality of chunks in a retroactive manner, and further adapted to operate on at least one vector representation of at least one limit specification to accommodate a variety of available dimensions corresponding to the at least one limit specification.Type: ApplicationFiled: April 27, 2012Publication date: August 23, 2012Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATIONInventors: Shay H. AKIRAV, Ron ASHER, Yariv BACHAR, Lior KLIPPER, Oded SONIN
-
Publication number: 20120215788Abstract: A method comprising: receiving sample data for a plurality of channels, wherein the sample data comprises a plurality of separate sample values and each sample value may be identified using at least a channel index that differentiates between channels and a sampling index that differentiates between sample values; performing energy compaction with respect to at least one of the channel indexes and the sampling indexes to create compacted sample values where each compacted sample value may be identified using at least a channel index that differentiates between channels and a sampling index that differentiates between sample values; and selecting some but not all of the compacted sample values for further processing.Type: ApplicationFiled: November 18, 2009Publication date: August 23, 2012Applicant: NOKIA CORPORATIONInventor: Juha Petteri Ojanpera
-
Publication number: 20120215780Abstract: A system for identifying data of interest from among a multiplicity of data elements residing on multiple platforms in an enterprise, the system including background data characterization functionality characterizing the data of interest at least by at least one content characteristic thereof and at least one access metric thereof, the at least one access metric being selected from data access permissions and actual data access history and near real time data matching functionality selecting the data of interest by considering only data elements which have the at least one content characteristic thereof and the at least one access metric thereof from among the multiplicity of data elements.Type: ApplicationFiled: March 7, 2012Publication date: August 23, 2012Inventors: Yakov FAITELSON, Ohad KORKUS, David BASS, Ophir KRETZER-KATZIR
-
Publication number: 20120216074Abstract: A cluster server manages allocation of free blocks to cluster clients performing writes in a clustered file system. The cluster server manages free block allocation with a free block map and an in-flight block map. The free block map is a data structure or hardware structure with data that indicates blocks or extents of the clustered file system that can be allocated to a client for the client to write data. The in-flight block map is a data structure or hardware structure with data that indicates blocks that have been allocated to clients, but remain in-flight. A block remains in-flight until the clustered file system metadata has been updated to reflect a write performed to that block by a client. After a consistency snapshot of the metadata is published to the storage resources, the data at the block will be visible to other nodes of the cluster.Type: ApplicationFiled: April 27, 2012Publication date: August 23, 2012Applicant: International Business Machines CorporationInventors: Joon Chang, Ninad S. Palsule, Andrew N. Solomon
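The interplay of the two maps can be sketched as follows. This is a simplified, single-process model under stated assumptions: `BlockAllocator`, `allocate` and `commit_metadata` are hypothetical names, blocks are plain integers, and cluster communication and snapshot publication are not modeled.

```python
class BlockAllocator:
    """Toy model of a free block map plus an in-flight block map."""

    def __init__(self, total_blocks):
        self.free = set(range(total_blocks))  # blocks that may be handed out
        self.in_flight = {}                   # block -> client holding it

    def allocate(self, client):
        if not self.free:
            raise RuntimeError("no free blocks")
        block = min(self.free)           # deterministic choice for the example
        self.free.remove(block)
        self.in_flight[block] = client   # allocated, but metadata not yet updated
        return block

    def commit_metadata(self, block):
        # Metadata now reflects the client's write, so the block
        # is no longer in flight (and it is not free either).
        del self.in_flight[block]


allocator = BlockAllocator(total_blocks=4)
b0 = allocator.allocate("client-a")
b1 = allocator.allocate("client-b")
allocator.commit_metadata(b0)
```

After these calls, block 0 is in use and committed, block 1 is still in flight for `client-b`, and blocks 2 and 3 remain free.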
-
Publication number: 20120215629Abstract: A system and method for applying a database to video multimedia is disclosed. Certain embodiments provide media content owners the capability to exploit video processing capabilities using rich, interactive and compelling visual content on a network. Mechanisms of associating video with commerce offerings are provided. Video server and search server technologies are integrated with ad serving personalization agents to make the final presentations of content and advertising. Algorithms utilized by the system use a variety of techniques for making the final presentation decisions of which ads, with which content, are served to which user.Type: ApplicationFiled: April 27, 2012Publication date: August 23, 2012Applicant: Virage, Inc.Inventors: David Girouard, Bradley Horowitz, Richard Humphrey, Charles Fuller
-
Publication number: 20120209853Abstract: A set of trigrams can be generated for each document in a plurality of documents processed by an e-discovery system. Each trigram in the set of trigrams for a given document is a sequence of three terms in the given document. A set of trigrams for each similar document is then determined based on the set of trigrams for the original document. To facilitate identification of the similar documents, a full text index is then generated for the plurality of documents and the set of trigrams for each document are indexed into the full text index, as individual terms. Queries can be generated into the full text index based on trigrams of a document to determine other similar or near-duplicate documents. After a set of potentially similar documents is identified, a separate distance criterion can be applied to evaluate the level of similarity between the two documents in an efficient way.Type: ApplicationFiled: February 16, 2011Publication date: August 16, 2012Applicant: Clearwell Systems, Inc.Inventors: Malay Desai, Medha Shewale, Venkat Rangan
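A minimal sketch of trigram-based near-duplicate detection follows. Several choices are assumptions not taken from the abstract: terms are lowercased whitespace tokens, the "full text index" is a plain dictionary, and Jaccard distance stands in for the unspecified distance measure. All function names are hypothetical.

```python
def trigrams(text):
    """Return the set of three-term sequences in a document."""
    terms = text.lower().split()
    return {" ".join(terms[i:i + 3]) for i in range(len(terms) - 2)}


def build_index(docs):
    """Index each trigram as an individual term: trigram -> document ids."""
    index = {}
    for doc_id, text in docs.items():
        for gram in trigrams(text):
            index.setdefault(gram, set()).add(doc_id)
    return index


def candidates(index, text, exclude=None):
    """Query the index with a document's trigrams to find possible matches."""
    found = set()
    for gram in trigrams(text):
        found |= index.get(gram, set())
    found.discard(exclude)
    return found


def jaccard_distance(a, b):
    """Second-stage check: 0.0 means identical trigram sets, 1.0 disjoint."""
    ta, tb = trigrams(a), trigrams(b)
    union = ta | tb
    if not union:
        return 0.0
    return 1 - len(ta & tb) / len(union)


docs = {
    "a": "the quick brown fox jumps",
    "b": "the quick brown fox sleeps",
    "c": "an unrelated short memo",
}
index = build_index(docs)
similar_to_a = candidates(index, docs["a"], exclude="a")
distance_ab = jaccard_distance(docs["a"], docs["b"])
```

The index lookup cheaply narrows the field to documents sharing at least one trigram, so the more expensive pairwise distance is computed only for those candidates.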
-
Publication number: 20120209854Abstract: Provided is a method for quickly obtaining an intensity value at a desired m/z value in a compressed data obtained by run-length encoding of a mass analysis data. An index is created by pairing either the start position of a section where zero-intensity consecutively occurs two or more times in an array of an original spectrum data, or the start position of a sequence of data having significant intensity values in an array of the original spectrum data, with the corresponding position in an array of a compressed data. This index is stored separate from the compressed data. The creation of the index does not affect the array of the compressed data. Therefore, the data can be decompressed even by a data processing system that does not use the index. The index helps to quickly locate a compressed data corresponding to the desired m/z and obtain the necessary intensity value.Type: ApplicationFiled: February 14, 2012Publication date: August 16, 2012Applicant: SHIMADZU CORPORATIONInventor: Masahiro IKEGAMI
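The following sketch shows how a side index can give direct access into run-length-encoded data. It simplifies the scheme described above: every zero run (not only runs of two or more) is stored as `0` followed by its length, positions are array offsets rather than m/z values, and the names `compress` and `intensity_at` are hypothetical.

```python
from bisect import bisect_right


def compress(data):
    """Run-length encode zero runs and build a separate position index.

    Returns (compressed, index), where each index entry pairs the start
    position of a segment in the original array with its position in the
    compressed array.
    """
    compressed, index = [], []
    i = 0
    while i < len(data):
        index.append((i, len(compressed)))
        j = i
        if data[i] == 0:
            while j < len(data) and data[j] == 0:
                j += 1
            compressed += [0, j - i]   # marker 0, then the run length
        else:
            while j < len(data) and data[j] != 0:
                j += 1
            compressed += data[i:j]    # significant values stored verbatim
        i = j
    return compressed, index


def intensity_at(compressed, index, pos):
    """Look up the intensity at original position pos (0 <= pos < len(data))."""
    # Find the last segment whose original start is <= pos.
    k = bisect_right(index, (pos, float("inf"))) - 1
    start, cpos = index[k]
    if compressed[cpos] == 0:          # pos falls inside a zero run
        return 0
    return compressed[cpos + (pos - start)]


spectrum = [0, 0, 0, 5, 7, 0, 0, 3]
packed, idx = compress(spectrum)
```

Because the index lives outside `packed`, a decoder that knows nothing about the index can still expand the compressed array on its own.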
-
Publication number: 20120209820Abstract: A method of identifying nonreferenced memory elements in a storage system is disclosed. A plurality of lists of referenced elements from a plurality of storage subsystems is input. A union of the lists of referenced elements is compiled. The union of the lists of referenced memory elements is compared to a list of previously referenced memory elements to determine previously referenced elements that are no longer referenced. The previously referenced elements that are no longer referenced are output.Type: ApplicationFiled: February 9, 2012Publication date: August 16, 2012Applicant: EMC CORPORATIONInventor: R. Hugo Patterson
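The union-and-compare step maps naturally onto set operations. The sketch below assumes element identifiers are hashable values such as strings; the function name `find_unreferenced` and the sample identifiers are hypothetical.

```python
def find_unreferenced(subsystem_lists, previously_referenced):
    """Return previously referenced elements that no subsystem references now."""
    # Union of the per-subsystem lists of currently referenced elements.
    currently_referenced = set().union(*subsystem_lists)
    # Anything referenced before but absent from the union is unreferenced.
    return set(previously_referenced) - currently_referenced


subsystem_a = ["seg1", "seg2"]
subsystem_b = ["seg2", "seg4"]
previous = ["seg1", "seg2", "seg3", "seg4", "seg5"]
unreferenced = find_unreferenced([subsystem_a, subsystem_b], previous)
```

Here `seg3` and `seg5` are reported because neither subsystem still lists them.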
-
Publication number: 20120209847Abstract: In various embodiments, a semantic space associated with a corpus of electronically stored information (ESI) may be created and used for concept searches. Documents (and any other objects in the ESI, in general) may be represented as vectors in the semantic space. Vectors may correspond to identifiers, such as, for example, indexed terms. The semantic space for a corpus of ESI can be used in information filtering, information retrieval, indexing, and relevancy rankings.Type: ApplicationFiled: February 16, 2011Publication date: August 16, 2012Applicant: Clearwell Systems, Inc.Inventor: Venkat Rangan