Data Indexing; Abstracting; Data Reduction (epo) Patents (Class 707/E17.002)
-
Publication number: 20110320418
Abstract: Apparatus, systems, and methods may operate to receive requests to execute a plurality of compression and/or decompression mechanisms on one or more database objects; to execute each of the compression and/or decompression mechanisms, on a sampled basis, on the database objects; to determine comparative performance characteristics associated with each of the compression and/or decompression mechanisms; and to record at least some of the performance characteristics and/or derivative characteristics derived from the performance characteristics in a performance summary table. The table may be published to a storage medium or a display screen. Other apparatus, systems, and methods are disclosed.
Type: Application
Filed: June 29, 2010
Publication date: December 29, 2011
Applicant: Teradata US, Inc.
Inventors: Congnan Luo, Like Gao, Yu Long, Judy Wu, Michael Leon Reed
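The sampled comparison described in this abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the function name `compare_codecs`, the choice of the standard-library codecs `zlib`, `bz2`, and `lzma`, and the summary fields are all assumptions made for illustration.

```python
import bz2
import lzma
import random
import time
import zlib


def compare_codecs(rows, sample_size=100, seed=0):
    """Compress a random sample of rows with each codec and summarize the results.

    Hypothetical helper: the codec list and the summary columns are
    illustrative assumptions, not taken from the patent application.
    """
    rng = random.Random(seed)
    sample = rng.sample(rows, min(sample_size, len(rows)))
    payload = "\n".join(sample).encode("utf-8")
    if not payload:
        return []
    codecs = {"zlib": zlib.compress, "bz2": bz2.compress, "lzma": lzma.compress}
    summary = []
    for name, compress in codecs.items():
        start = time.perf_counter()
        packed = compress(payload)
        elapsed = time.perf_counter() - start
        summary.append({
            "codec": name,
            "raw_bytes": len(payload),
            "packed_bytes": len(packed),
            "ratio": len(packed) / len(payload),  # smaller is better
            "seconds": elapsed,
        })
    # The "performance summary table", ordered by compression ratio.
    return sorted(summary, key=lambda record: record["ratio"])


rows = [f"customer-{i % 7},region-{i % 3},status-active" for i in range(1000)]
for record in compare_codecs(rows):
    print(record)
```

Sampling keeps the cost of the comparison low on large objects; the trade-off is that the measured ratios are only estimates for the full object.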
-
Publication number: 20110320417
Abstract: Apparatus, systems, and methods may operate to receive a set of ordered user-selected compression rules as a compression rule set comprising at least one compression threshold condition, to create or transform a database object with rows to be selectively compressed according to the compression rules in the compression rule set (providing a transformed object), and to publish at least a portion of the transformed object to one of a storage medium or a display screen. Other apparatus, systems, and methods are disclosed.
Type: Application
Filed: June 29, 2010
Publication date: December 29, 2011
Applicant: Teradata US, Inc.
Inventors: Congnan Luo, Like Gao, Yu Long, Judy Wu, Michael Leon Reed
-
Publication number: 20110320458
Abstract: A dynamic portal generation system includes an indexing module that indexes structured and unstructured data in a database. The database includes information residing in associated standalone applications having documents from information sources, and a name-entity repository that includes name entities and their corresponding name-entity types. A search module searches the information residing in the indexed information to obtain a search result. A name-entity extraction module extracts a matching name-entity that corresponds to a name-entity in the name-entity repository. A portal generation module dynamically generates a portal triggered by the search query.
Type: Application
Filed: June 22, 2011
Publication date: December 29, 2011
Inventor: ABINASHA KARANA
-
Publication number: 20110314033
Abstract: A derivative work discovery service facilitates discovery of derivative works by consumers who request the service from a derivative work service server. The derivative work service server uses information about digital sources available on a computer system operated by the consumer to search a derivative works database. A derivative work discovery file is compiled and presented to the consumer in a descending order of a ratio of a number of digital sources possessed by the consumer that are equivalent to digital sources used to create a derivative work to a total number of digital sources used to create that derivative work.
Type: Application
Filed: June 18, 2010
Publication date: December 22, 2011
Applicant: Legitmix, Inc.
Inventor: Omid Allen McDonald
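The ordering rule in this abstract (owned sources divided by total sources, descending) can be sketched in a few lines of Python. The function name `rank_derivative_works` and the sample data are hypothetical and only illustrate the ratio; they do not come from the application.

```python
def rank_derivative_works(owned_sources, works):
    """Order derivative works by the share of their sources the consumer owns.

    `works` maps a (hypothetical) work title to the list of sources used
    to create it. Works with no listed sources are skipped.
    """
    owned = set(owned_sources)
    scored = []
    for title, sources in works.items():
        needed = set(sources)
        if not needed:
            continue
        ratio = len(owned & needed) / len(needed)
        scored.append((title, ratio))
    # Descending order of the owned/total ratio.
    return sorted(scored, key=lambda item: item[1], reverse=True)


works = {
    "mix1": ["a", "b"],
    "mix2": ["a", "x", "y", "z"],
    "mix3": ["b", "c", "q"],
}
for title, ratio in rank_derivative_works({"a", "b", "c"}, works):
    print(f"{title}: {ratio:.2f}")
```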
-
Publication number: 20110314027
Abstract: An index building and querying method, device and system for a distributed columnar database are provided. The index building method for a distributed columnar database includes: obtaining a column field from a distributed columnar database; generating a column index file in which the column field is a key word, the column index file comprising the mapping relationship between the value of the column field in the distributed columnar database and the corresponding Row field value; and storing the column index file to an index catalogue corresponding to the column field in the distributed columnar database.
Type: Application
Filed: November 3, 2009
Publication date: December 22, 2011
Applicant: CHINA MOBILE COMMUNICATIONS CORPORATION
Inventors: Meng Xu, Ling Qian, Zhiguo Luo, Leitao Guo, Peng Zhao
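The mapping at the heart of this abstract, from a column value to the row keys that hold it, can be sketched as follows. This is an in-memory illustration only; the function name `build_column_index` and the table layout are assumptions, and the distributed storage of the index file is not modeled.

```python
from collections import defaultdict


def build_column_index(table, column):
    """Map each value of `column` to the sorted row keys that contain it.

    `table` is a hypothetical in-memory stand-in for a columnar store:
    a dict from row key to a dict of column name -> value.
    """
    index = defaultdict(list)
    for row_key, row in table.items():
        if column in row:
            index[row[column]].append(row_key)
    return {value: sorted(keys) for value, keys in index.items()}


table = {
    "row1": {"city": "Beijing", "plan": "gold"},
    "row2": {"city": "Shanghai", "plan": "gold"},
    "row3": {"city": "Beijing"},
}
print(build_column_index(table, "city"))
print(build_column_index(table, "plan"))
```

A query on the indexed column then becomes a dictionary lookup followed by fetching the listed rows, instead of a scan over every row.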
-
Publication number: 20110313997
Abstract: A total homepage service providing system includes an information provider information administration unit configured to register and administrate information of an information appliance of an information provider and information of the information provider; a homepage generation unit configured to automatically generate a homepage which can be displayed on the information appliance of the information provider and an information appliance of an information user, using metadata received from the information appliance of the information provider; a homepage registration and administration unit configured to store a file of the generated homepage, and register and administrate the homepage; and an index generation and administration unit configured to generate one or more homepage indexes for an information search, using keywords extracted and classified from the generated homepage, and administrate the generated homepage indexes.
Type: Application
Filed: November 3, 2009
Publication date: December 22, 2011
Inventor: Hee Sung Chung
-
Publication number: 20110313852
Abstract: Systems and methods for applications of orthogonal corpus indexing (OCI), such as selecting ad words for purchase and improving visibility of web pages in search engines, are described. In one aspect, the systems and methods described herein employ OCI to recommend to an advertiser ad words for purchase. Advertisers pay search engines for placement of their advertising alongside results in the search results page, when a given word or phrase appears in a user's search query. The described systems and methods enable automated selection of related and discriminating terms, identifying keywords that increase the ratio of ads clicked-through to money spent on keyword buying. In another aspect, the systems and methods described herein employ OCI to generate content for web pages. OCI may be used to determine content that when added to a web page improves the rank of that page in a search engine.
Type: Application
Filed: May 16, 2011
Publication date: December 22, 2011
Applicant: IndraWeb.com, Inc.
Inventors: Henry B. Kon, George W. Burch
-
Publication number: 20110307469
Abstract: A new approach is proposed that contemplates systems and methods to provide query suggestions including real-time suggestion of complete query terms, which can be phrases, to a user by analyzing and indexing the real-time history/stream of content or documents in addition to the stream of queries entered. Since the real-time indexing generates a count of potential results for each term found and/or indexed in the stream, the terms found in that stream can then be used as potential query suggestions, knowing that it will be possible to provide results for those queries.
Type: Application
Filed: June 14, 2011
Publication date: December 15, 2011
Inventors: Rishab Aiyer Ghosh, Lun Ted Cui
-
Publication number: 20110307468
Abstract: A method and system for identifying nodes with similar content. In one aspect, the method comprises determining a structure of a network of nodes, said structure defined by incoming links and outgoing links between nodes within said network, grouping said nodes within said network into a first set of modules, calculating a first modularity value between each of the modules within the first set, said modularity value indicating a degree of similar content within each module, calculating a topical relevance value for each of the modules, selecting those modules whose topical relevance value exceeds a threshold value and calculating an authority score for the selected modules.
Type: Application
Filed: June 11, 2010
Publication date: December 15, 2011
Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
Inventors: Ning Duan, Pei-Yun S. Hsueh, Yan Liu
-
Publication number: 20110307490
Abstract: An invention for dissemination or retrieval of digital resources or online information via context layer or context-level protocols and interfaces is described. According to one embodiment, an interface or protocol that a computer uses to communicate with other computers is associated with a subject matter context. User-level contents or digital resources received across that interface or protocol are then associated with that subject matter context, and the computer may respond accordingly. For instance, a computer may associate a given network port with a subject matter context of shopping, and treat all digital resource requests received on that port as applying to only a shopping subject matter context. A web server may also listen on a network port associated with a subject matter context, thereby contextualizing the overall nature of the website that the web server hosts.
Type: Application
Filed: June 15, 2011
Publication date: December 15, 2011
Applicant: USM CHINA/HONG KONG LIMITED
Inventor: Edmond K. Chow
-
Publication number: 20110307432
Abstract: Improved search result relevance is provided for name segment searches performed by a general web search engine. Entity-related information is mined from web documents and search engine query logs, and metadata is indexed in a search system index. The metadata may include information identifying entity homepages, entity web pages at high quality top sites, other entity-related web pages, entity equivalent data, and/or entity misspellings data. The indexed metadata is employed to provide improved search results relevance for search queries that include an entity's name by improving the ranking of search results corresponding with entity-relevant web pages.
Type: Application
Filed: June 11, 2010
Publication date: December 15, 2011
Applicant: MICROSOFT CORPORATION
Inventors: QI YAO, VINCENT LI, JUNBIAO TANG, RICHARD CHANG
-
Publication number: 20110302183
Abstract: A method for managing an object watchpoint during a garbage collection cycle, including identifying an object having a field, where the object is associated with an original object location, where the field is associated with an original field location, and where the object and the field are located in a memory heap of a virtual machine memory, setting, within a debugger, the object watchpoint on the original field location, where the object watchpoint is a memory trap associated with the object, determining, after a relocation of the object by a garbage collector (GC), a new object location associated with the object, determining a new field location of the field based on the new object location, and setting, within the debugger, the object watchpoint on the new field location.
Type: Application
Filed: June 2, 2010
Publication date: December 8, 2011
Applicant: ORACLE INTERNATIONAL CORPORATION
Inventors: Michael Lee Van De Vanter, Hannes E. Payer, Douglas Norman Simon, Benjamin Lawrence Titzer, Mario I. Wolczko
-
Apparatus and Method for the Automatic Discovery of Control Events from the Publication of Documents
Publication number: 20110302171
Abstract: A method and system for discovering a control event from electronically published documents and received data streams is provided, in which a computer control program identifies electronically published documents and data stored in a plurality of network servers which potentially contain control events relevant to the control of goods and/or services, the control events identified by reference to user interest identifiers. Identified material is analyzed by a classification program to determine whether control events are present. A control event classification is assigned to documents and received data determined to contain at least one discovered control event, the assigned control event classification and information identifying the associated document and data is stored in a classification database, and a report of discovery of documents and data containing control events is provided to a user. The report may include a link to the control event classification and/or its associated document or data.
Type: Application
Filed: June 14, 2011
Publication date: December 8, 2011
Applicant: Decernis, LLC
Inventors: Patrick Blackmon Waldo, Andrew B. Waldo
-
Publication number: 20110295819
Abstract: Various embodiments for transforming a logical data object for storage in a storage device operable with at least one storage protocol are provided. In one such embodiment, the logical data object is divided into one or more segments, with each segment characterized by respective start and end offsets. One or more obtained variable size data chunks corresponding to the logical data object are processed to obtain processed data chunks, wherein at least one of the processed data chunks comprises transformed data resulting from the processing. Each of the variable size data chunks is associated with a respective segment of the logical data object.
Type: Application
Filed: August 5, 2011
Publication date: December 1, 2011
Inventors: Jonathan AMIT, Ori SHALEV
-
Publication number: 20110295818
Abstract: Various embodiments for transforming a logical data object for storage in a storage device operable with at least one storage protocol are provided. In one such embodiment, the logical data object is divided into one or more segments, with each segment characterized by respective start and end offsets. One or more obtained variable size data chunks corresponding to the logical data object are processed to obtain processed data chunks, wherein at least one of the processed data chunks comprises transformed data resulting from the processing. Each of the variable size data chunks is associated with a respective segment of the logical data object.
Type: Application
Filed: August 5, 2011
Publication date: December 1, 2011
Inventors: Jonathan AMIT, Ori SHALEV
-
Publication number: 20110289049
Abstract: Metadata may be stored in, and retrieved from, a scalable, fault-tolerant metadata service. In one example, metadata is divided into partitions, and each partition is served by one or more nodes. For each partition, a first one of the nodes may handle read and write requests, and the other nodes may handle read requests in the event that the first node is down or is experiencing high load. When a request is made with respect to metadata, a metadata server may identify a node, in the partition to which the metadata is assigned, to which the request is to be made. The entity that is making the request then contacts that node, and requests the read or write on the metadata. In a partition, metadata may be replicated between the first node and the other nodes using a log-based replication protocol.
Type: Application
Filed: May 19, 2010
Publication date: November 24, 2011
Applicant: MICROSOFT CORPORATION
Inventors: Nanshan Zeng, Meng Ye, Honghua Feng, Junwei Xu, Yu-chao Cao, Yingjun Yu, Lin Song
-
Publication number: 20110289091
Abstract: In accordance with embodiments, there are provided methods and systems for providing multiple column custom indexes in a multi-tenant database environment. A method embodiment provides defining a multi-tenant data structure having a plurality of data columns data fields and a plurality of rows for each of multiple tenants, each row including a data column for a tenant identifier, defining a first data field for a tenant, the first data field having a first data type, defining a second data field for the tenant, the second field having a second data type, and defining an index table including a tenant identifier for the tenant, a copy of data from the first data field and the second data field, and a key to the corresponding rows of the multi-tenant data structure.
Type: Application
Filed: October 4, 2010
Publication date: November 24, 2011
Applicant: Salesforce.com, inc.
Inventors: Jesse Collins, Simon Y. Wong, Jaikumar Bathija, John F. O'Brien
-
Publication number: 20110289068
Abstract: Personalized navigation for one or more individuals' use of a search engine is provided. Identification of a query submitted to the search engine is performed. If the query is identified to be a personal navigational query, which is a query via which the individuals intend to navigate to a particular site or information object that they have previously viewed, the particular site or information object associated with the query is identified, and results of the search are personalized based on knowledge of the identified site or information object.
Type: Application
Filed: May 21, 2010
Publication date: November 24, 2011
Applicant: Microsoft Corporation
Inventors: Jaime Teevan, Susan T. Dumais, Gayathri Ravichandran Geetha, Sarah K. Tyler
-
Publication number: 20110289089
Abstract: Search engines rely on complex algorithms to search for what is available on the internet. When a query is input, the engine will find matches to the user's search terms and return those matches in ever-expanding circles of relevance. When searches do not return a “true” result that meets the user's needs, the lack of information is ignored. A method for mining unfulfilled searches from search result data and indexing these unfulfilled results in a methodical way, and a system for such, are disclosed herein. Also disclosed is a method for associating and grouping those unfulfilled searches to specific categories. This results in a database of what is being searched for and not being found. This database can be used for a variety of purposes including, but not limited to, identifying news areas, new product initiation and prioritization, enhanced customer support, and lead generation.
Type: Application
Filed: May 18, 2010
Publication date: November 24, 2011
Inventors: Mariana Paul Thomas, Ajit Peter Thomas
-
Publication number: 20110289092
Abstract: In various embodiments, a system and related method for organizing transactional data from a diverse and heterogeneous application environment is disclosed. In an example embodiment, a system includes a file system and one or more daemon indexers in electrical communication with the file system. The file system is arranged as a non-relational and serverless file system to allow for cost-effectiveness with ready scalability. The file system is to receive, in substantially real-time, unsorted transactional data from a publishing module. The one or more daemon indexers are arranged to receive the unsorted transactional data from the file system, organize the unsorted transactional data by operational characteristics, and store the organized transactional data on the file system.
Type: Application
Filed: August 2, 2011
Publication date: November 24, 2011
Applicant: eBay Inc.
Inventors: Abhinav Kumar, Ravinder Purumala, Premendra Singh
-
Publication number: 20110282863
Abstract: This invention discloses how Virtual Database Technology can be used to make disparate data appear to be (or act as) the sort of uniform data one expects to find within a single relational database. In particular, we show how to process queries similar to those one might use in a database, even though the underlying data may be missing some of the capabilities that are required by normal databases. Whereas traditional databases require that all the tuples in a table be stored, our approach allows queries over tables where the tuples are generated as required from the data sources, and may not be stored anywhere. We show how such facilities can be used as a new foundation for Internet search.
Type: Application
Filed: May 11, 2010
Publication date: November 17, 2011
Inventors: Donald Cohen, Krishnamurthy Narayanaswamy
-
Publication number: 20110282920
Abstract: A processing method has been claimed for reducing the average wait time of requests in a queue in a system environment where garbage collection may occur. In the method, a computer system treats as a unit each request in a queue and a completion time of garbage collection that may occur at the time of processing the request, and processes requests preferentially and systematically in ascending order of the processing times of the units including the garbage collection times, thereby reducing the average wait time of the requests. While the computer system managing the queue knows the remaining amount of heap just before processing a certain request, the computer system statistically calculates in advance the amounts of heap to be consumed on a request type basis and holds the values.
Type: Application
Filed: May 10, 2011
Publication date: November 17, 2011
Applicant: International Business Machines Corporation
Inventor: Takeshi Ogasawara
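One way to read this abstract is as shortest-job-first scheduling in which a request's cost includes a garbage collection pause whenever its expected heap use exceeds the heap that is free. The Python sketch below follows that reading under stated assumptions: the names `unit_time` and `schedule`, the fixed GC pause, and the rule that a collection frees the whole heap are all hypothetical simplifications, not details from the application.

```python
def unit_time(request, heap_free, gc_seconds):
    """Processing time of a request plus a GC pause if it would not fit in the free heap."""
    needs_gc = request["heap"] > heap_free
    return request["seconds"] + (gc_seconds if needs_gc else 0.0)


def schedule(requests, heap_free, heap_size, gc_seconds):
    """Greedily pick the pending request with the smallest unit time.

    Assumptions for illustration: every GC costs `gc_seconds` and frees the
    entire heap; `heap` is the per-request-type estimate of heap consumed.
    """
    pending = list(requests)
    order = []
    while pending:
        best = min(pending, key=lambda r: unit_time(r, heap_free, gc_seconds))
        pending.remove(best)
        if best["heap"] > heap_free:
            heap_free = heap_size  # a collection runs before the request
        heap_free -= best["heap"]
        order.append(best["name"])
    return order


requests = [
    {"name": "A", "seconds": 1.0, "heap": 80},
    {"name": "B", "seconds": 2.0, "heap": 10},
    {"name": "C", "seconds": 1.5, "heap": 30},
]
print(schedule(requests, heap_free=50, heap_size=100, gc_seconds=5.0))
```

In this example request A is the shortest on its own, but it would trigger a collection, so the two requests that fit in the free heap run first.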
-
Publication number: 20110282881
Abstract: Methods and systems are described for determining candidates for a custom index in a multi-tenant database environment. In one embodiment, a method includes capturing a query that is directed to a multi-tenant database, determining whether the captured query is a candidate for an additional filter, determining operators used by the captured query if the query is a candidate, determining data types of the database used by the captured query if the query is a candidate, determining whether there is a current filter for the operator and data types used by the captured query if the query is a candidate, selecting the captured query based on the determined operators, data types, and the determined current filters, and generating a custom index for the selected query.
Type: Application
Filed: December 17, 2010
Publication date: November 17, 2011
Applicant: salesforce.com, inc.
Inventors: Jesse Collins, Arup Dutta
-
Publication number: 20110276545
Abstract: Systems and methods for compressing a raw logical data object (201) for storage in a storage device operable with at least one storage protocol, creating, reading, writing, optimization and restoring thereof. Compressing the raw logical data object (201) comprises creating in the storage device a compressed logical data object (203) comprising a header (204) and one or more allocated compressed sections with predefined size (205-1-205-2); compressing one or more sequentially obtained chunks of raw data (202-1-202-6) corresponding to the raw logical data object (201) thus giving rise to the compressed data chunks (207-1-207-6); and sequentially accommodating the processed data chunks into said compressed sections (205-1-205-2) in accordance with the order in which said chunks are received, wherein said compressed sections serve as atomic elements of compression/decompression operations during input/output transactions on the logical data object.
Type: Application
Filed: July 21, 2011
Publication date: November 10, 2011
Inventors: Chaim KOIFMAN, Nadav KEDEM, Avi ZOHAR, Jonathan AMIT
-
Publication number: 20110276575
Abstract: A document accessible over a network can be registered. A registered document, and the content contained therein, is not transmitted undetected over and off of the network. In one embodiment, the invention includes a manager agent to maintain signatures of registered documents and a match agent to detect the unauthorized transmission of the content of registered documents.
Type: Application
Filed: July 20, 2011
Publication date: November 10, 2011
Inventors: Erik de la Iglesia, William Deninger, Ratinder Paul Singh Ahuja
-
Publication number: 20110276543
Abstract: A virtual block device is an interface with applications that appears to the applications as a memory device, such as a standard block device. The virtual block device interacts with additional elements to do data deduplication to files at the block level such that one or more files accessed using the virtual block device have at least one block which is shared by the one or more files.
Type: Application
Filed: November 5, 2010
Publication date: November 10, 2011
Applicant: EXAR CORPORATION
Inventor: John Edward Gerard Matze
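Block-level deduplication of the kind this abstract mentions is commonly illustrated by storing each distinct block once, keyed by a content hash, and keeping a per-file list of block references. The Python sketch below takes that approach; the class name `DedupStore`, the SHA-256 keying, and the fixed block size are assumptions, and the virtual block device interface itself is not modeled.

```python
import hashlib


class DedupStore:
    """Toy content-addressed store in which files share identical blocks.

    Hypothetical illustration: blocks are keyed by their SHA-256 digest and
    each file is recorded as an ordered list of digests.
    """

    def __init__(self, block_size=4096):
        self.block_size = block_size
        self.blocks = {}  # digest -> block bytes, stored once
        self.files = {}   # file name -> ordered list of digests

    def write(self, name, data):
        digests = []
        for offset in range(0, len(data), self.block_size):
            block = data[offset:offset + self.block_size]
            digest = hashlib.sha256(block).hexdigest()
            self.blocks.setdefault(digest, block)  # reuse an existing block
            digests.append(digest)
        self.files[name] = digests

    def read(self, name):
        return b"".join(self.blocks[digest] for digest in self.files[name])


store = DedupStore(block_size=4)
store.write("a.bin", b"AAAABBBBCCCC")
store.write("b.bin", b"AAAABBBBDDDD")
print(len(store.blocks), "unique blocks for 6 logical blocks")
```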
-
Publication number: 20110270843
Abstract: A specialized search engine tool designed for subject matter experts facilitates access to information relevant to their area of expertise available on public domains over the Internet. The specialized search engine represents the collection of thousands of links that are sorted, resorted, categorized and placed into databases that interact with one another. The specialized search engine may permit a user to compare results from multiple databases and automatically submit their search query to many popular searchable databases and web sites from a central web page, without having to individually visit each site. In one example, the specialized search engine is a medical search engine.
Type: Application
Filed: November 3, 2010
Publication date: November 3, 2011
Applicant: Mayo Foundation for Medical Education and Research
Inventor: Scott M. Albin
-
Publication number: 20110270841
Abstract: Systems may use explicit ratings from users to construct user to user correlations. This technique may reduce the user-content correlation to a single dimension, i.e., the content that a plurality of users may rate similarly. Embodiments of the present invention may use DHT as an underlying distributed signaling mechanism, but may also make the rating implicit. Furthermore, embodiments of the present invention may construct the user to content correlation based on multi-dimensional metadata related to the content.
Type: Application
Filed: April 28, 2010
Publication date: November 3, 2011
Applicant: Cisco Technology, Inc.
Inventors: Manish Bhardwaj, Jining Tian, Gursharan Singh
-
Publication number: 20110270830
Abstract: A computer-implemented method for providing multi-core and multi-level topical organization in social indexes is provided. A corpus of articles is accessed. Each article includes online textual materials. A finite state pattern is provided for a topic that filters the articles as candidate articles, which are potentially on-topic. Similarity-based representations for on-topic and off-topic core meanings of the topic are provided. An aggregate score for each of the candidate articles is determined using the similarity-based representations to indicate whether the candidate article is sufficiently on-topic. The candidate articles are presented ordered by their aggregate scores. In a further embodiment, a hierarchy of topics is provided and used to guide the presentation of articles from subtopics, with considerations of fairness of subtopic coverage, elimination of similarity-duplicates in articles, and article freshness.
Type: Application
Filed: April 30, 2010
Publication date: November 3, 2011
Applicant: PALO ALTO RESEARCH CENTER INCORPORATED
Inventors: Mark Jeffrey Stefik, Lance E. Good, Sanjay Mittal
-
Publication number: 20110270808
Abstract: A clustering-based approach to data standardization is provided. Certain embodiments take as input a plurality of addresses, identify one or more features of the addresses, cluster the addresses based on the one or more features, utilize the cluster(s) to provide a data-based context useful in identifying one or more synonyms for elements contained in the address(es), and standardize the address(es) to an acceptable format, with one or more synonyms and/or other elements being added to or taken away from the input address(es) as part of the standardization process.
Type: Application
Filed: April 30, 2010
Publication date: November 3, 2011
Applicant: International Business Machines Corporation
Inventors: Tanveer A. Faruquie, Sachindra Joshi, Hima P. Karanam, Marvin Mendelssohn, Mukesh K. Mohania, Angel Smith, L. V. Subramaniam, Girish Venkatachaliah
-
Publication number: 20110264646
Abstract: A search engine database may have a segmented structure that preserves individual document references and allows updating as well as scalability. A set of segment managers may receive new, updated, or deleted documents and update a set of term matrices from which a published search matrix may be generated. The database may have a very large term dictionary and may use a hash function to create term identifiers without having to look up terms in the dictionary. The database may be maintained by many systems operating in parallel for high scalability.
Type: Application
Filed: April 26, 2010
Publication date: October 27, 2011
Applicant: Microsoft Corporation
Inventors: Patrick Sokolan, Dennis Doherty, Claude Duguay, William Radcliffe, Virgil Bourassa, Tammara King, John Sheppard
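Deriving a term identifier from a hash, rather than from a dictionary lookup, can be sketched as below. The abstract does not name a hash function; the use of SHA-256 truncated to 64 bits, the lowercase normalization, and the function name `term_id` are assumptions made for this illustration.

```python
import hashlib


def term_id(term, bits=64):
    """Derive a fixed-width integer identifier for a term from its hash.

    Hypothetical scheme: any machine can compute the same identifier
    independently, with no shared dictionary lookup. Truncating the digest
    means collisions are possible, though rare at 64 bits.
    """
    digest = hashlib.sha256(term.lower().encode("utf-8")).digest()
    return int.from_bytes(digest[: bits // 8], "big")


for term in ("index", "Index", "segment"):
    print(term, term_id(term))
```

Because the identifier depends only on the term itself, parallel segment managers can agree on identifiers without coordinating through a central dictionary.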
-
Publication number: 20110264632
Abstract: Methods and systems for transforming a logical data object for storage in a storage device configured to operate with at least one storage protocol. One method comprises creating in the storage device a transformed logical data object comprising one or more allocated storage sections with a predefined size and receiving one or more data chunks corresponding to the transformed logical data object. The method further comprises determining if each received data chunk comprises a predefined criterion, transforming each data chunk that comprises the predefined criterion, maintaining in raw form each data chunk that does not comprise the predefined criterion, and sequentially storing each transformed data chunk and data chunk in raw form into said one or more allocated storage sections in accordance with an order said transformed data chunks and data chunks in raw form are received. One system comprises a processor configured to perform the above method.
Type: Application
Filed: July 7, 2011
Publication date: October 27, 2011
Inventors: Chaim KOIFMAN, Nadav KEDEM, Avi ZOHAR
-
Publication number: 20110264633
Abstract: Methods and systems for transforming a logical data object for storage in a storage device configured to operate with at least one storage protocol. One method comprises creating in the storage device a transformed logical data object comprising one or more allocated storage sections with a predefined size and receiving one or more data chunks corresponding to the transformed logical data object. The method further comprises determining if each received data chunk comprises a predefined criterion, transforming each data chunk that comprises the predefined criterion, maintaining in raw form each data chunk that does not comprise the predefined criterion, and sequentially storing each transformed data chunk and data chunk in raw form into said one or more allocated storage sections in accordance with an order said transformed data chunks and data chunks in raw form are received. One system comprises a processor configured to perform the above method.
Type: Application
Filed: July 7, 2011
Publication date: October 27, 2011
Inventors: Chaim KOIFMAN, Nadav KEDEM, Avi ZOHAR
-
Publication number: 20110264668
Abstract: Secondary indexing mechanisms are disclosed. A first index is created in a database environment. The index has a scope defined by a set of files that meet a pre-selected criteria. Second index generation is initiated. The second index has the same scope as the first index. A first time period between initiation of the generation of the second index and completion of the second index is determined. The second index is swapped with the first index in an atomic swap operation. The indices may be generated for a multitenant database environment. Catch up indexing may be performed for the secondary index.
Type: Application
Filed: December 7, 2010
Publication date: October 27, 2011
Applicant: salesforce.com, inc.
Inventors: David Hacker, Jeffrey Bergan, Utsavi Benani, Paul Burstein, Jon Mark Dewey
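The build-then-swap pattern in this abstract can be illustrated with a small in-memory Python sketch: the replacement index is built on the side, changes that arrived during the build are applied as a catch-up step, and the live reference is then replaced under a lock. The class name `SwappableIndex`, the word-level inverted index, and the `late_changes` parameter are hypothetical; the application's actual mechanism is not reproduced here.

```python
import threading


class SwappableIndex:
    """Serve searches from one index while a replacement is built, then swap."""

    def __init__(self, documents):
        self._lock = threading.Lock()
        self._active = self._build(documents)

    @staticmethod
    def _add(index, doc_id, text):
        for word in text.lower().split():
            index.setdefault(word, set()).add(doc_id)

    @classmethod
    def _build(cls, documents):
        index = {}
        for doc_id, text in documents.items():
            cls._add(index, doc_id, text)
        return index

    def search(self, word):
        with self._lock:
            active = self._active  # take a reference to the current index
        return sorted(active.get(word.lower(), set()))

    def rebuild(self, documents, late_changes=None):
        # Build the second index without touching the one being served.
        replacement = self._build(documents)
        # Catch-up step: apply documents that changed during the build.
        for doc_id, text in (late_changes or {}).items():
            self._add(replacement, doc_id, text)
        with self._lock:
            self._active = replacement  # the swap is a single assignment


index = SwappableIndex({1: "alpha beta", 2: "beta gamma"})
print(index.search("beta"))
index.rebuild({1: "alpha beta", 2: "gamma"}, late_changes={3: "beta delta"})
print(index.search("beta"))
```

Readers never see a half-built index: each search uses either the old index or the new one in full.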
-
Publication number: 20110264665
Abstract: A data search and retrieval system that, in response to a search query, dynamically selects and applies a model of information to be returned to a user. The model may be selected based on the search query directly or indirectly based on data returned by a search engine applying the query. For this purpose, the system may include an index of models, similar to a search index. Models may be authored and contributed to the search and retrieval system by third parties, and an association between each such contributed model and characteristics of a search query, such as specific search query terms, may be stored in the index of models. A user of the search and retrieval system may provide feedback on a model that was used to generate information in response to the user's search query, and such feedback may be used to update the index of models.
Type: Application
Filed: April 26, 2010
Publication date: October 27, 2011
Applicant: Microsoft Corporation
Inventors: Vijay Mital, Thomas Frank Bergstraesser, Darryl Ellis Rubin
-
Publication number: 20110264712Abstract: A garbage collector is disclosed that permits extensive separation of mutators and the garbage collector from a synchronization perspective. This relative decoupling of mutator and collector operation allows the garbage collector to perform relatively time-intensive operations during garbage collection without substantially slowing down mutators. The present invention makes use of this flexibility by first conservatively determining which objects in a set of regions of interest are live, then planning where to copy the objects (preferably including clustering), and finally performing the actual copying.Type: ApplicationFiled: April 20, 2011Publication date: October 27, 2011Applicant: TATU YLONEN OY LTDInventor: Tatu J. Ylonen
-
Publication number: 20110264634Abstract: Systems and methods for compressing a raw logical data object (201) for storage in a storage device operable with at least one storage protocol, creating, reading, writing, optimization and restoring thereof. Compressing the raw logical data object (201) comprises creating in the storage device a compressed logical data object (203) comprising a header (204) and one or more allocated compressed sections with predefined size (205-1-205-2); compressing one or more sequentially obtained chunks of raw data (202-1-202-6) corresponding to the raw logical data object (201) thus giving rise to the compressed data chunks (207-1-207-6); and sequentially accommodating the processed data chunks into said compressed sections (205-1-205-2) in accordance with an order in which said chunks are received, wherein said compressed sections serve as atomic elements of compression/decompression operations during input/output transactions on the logical data object.Type: ApplicationFiled: July 7, 2011Publication date: October 27, 2011Inventors: Chaim KOIFMAN, Nadav KEDEM, Avi ZOHAR, Jonathan AMIT
-
Publication number: 20110264626Abstract: Methods for parallel query execution of a database operation on a database utilizing a graphics processing unit (GPU) are presented including: receiving a query by a host, the query including database relations; starting a GPU kernel, where the GPU kernel includes a GPU memory; hash partitioning the database relations by the GPU kernel; loading the partitioned database relations into the GPU memory; loading keyed partitions corresponding to the hash partitioned database relations into the GPU memory; building a hash table for a smaller of the hash partitioned database relations; and executing the query. In some embodiments, methods further include returning a result of the query. In some embodiments, methods further include, when the query is a long query including a number of operators, parsing the long query into a number of sub-queries and, for each of the sub-queries, starting one of the GPU kernels such that the sub-queries are processed in parallel.Type: ApplicationFiled: April 22, 2010Publication date: October 27, 2011Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATIONInventors: Akshay Gautam, Ritesh K. Gupta
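The partition, build, and probe steps named in this abstract can be illustrated with a minimal CPU-only sketch. It does not model GPU kernels or GPU memory, and the function names, the row-as-dictionary layout, and the partition count are assumptions made for illustration rather than details taken from the application.

```python
def hash_partition(relation, key, num_partitions):
    """Split a relation (a list of dict rows) into partitions by hashing the join key."""
    partitions = [[] for _ in range(num_partitions)]
    for row in relation:
        partitions[hash(row[key]) % num_partitions].append(row)
    return partitions


def partitioned_hash_join(left, right, key, num_partitions=4):
    """Join two relations on `key`, one pair of matching partitions at a time."""
    joined = []
    left_parts = hash_partition(left, key, num_partitions)
    right_parts = hash_partition(right, key, num_partitions)
    for left_part, right_part in zip(left_parts, right_parts):
        # Build the hash table on the smaller side, then probe with the larger side.
        if len(left_part) <= len(right_part):
            build, probe = left_part, right_part
        else:
            build, probe = right_part, left_part
        table = {}
        for row in build:
            table.setdefault(row[key], []).append(row)
        for row in probe:
            for match in table.get(row[key], []):
                joined.append({**match, **row})
    return joined
```

Rows with equal keys always land in the same partition index, so each partition pair can be joined independently, which is what makes the per-partition work parallelizable.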
-
Publication number: 20110264525Abstract: The present invention provides methods and systems for use in searching a user's online world, across independently provided applications including Web-based and desktop applications, which can include Web sites. Techniques are provided in which information is collected and indexed relating to activities and communications of a user, and of other users in association with activities and communications of the user, across independent Web-based applications. A graphical user interface is provided to allow user searching in association with the collected and indexed information. Search results are provided using a graphical user interface that can include Web results, personal results, and desktop results.Type: ApplicationFiled: April 26, 2010Publication date: October 27, 2011Applicant: Yahoo! Inc.Inventors: TARUN BHATIA, Eric Theodore Bax
-
Publication number: 20110258241Abstract: Files stored, or to be stored, in a storage device are marked either as non-discardable or as discardable in a file system structure associated with a storage device. Each discardable file has associated with it a discarding priority level. A publisher file is permitted to be stored in the storage device only if storing the publisher file does not narrow a storage usage safety margin that is reserved for user files. User files are allowed to be stored in the storage device even if storing them narrows the storage usage safety margin but, in such cases, the storage usage safety margin is restored by removing one or more discardable files from the storage device. A discardable file is removed from the storage device if its discarding priority level equals or is higher than a predetermined discarding threshold value.Type: ApplicationFiled: June 29, 2011Publication date: October 20, 2011Inventors: Moshe Raines, Ran Carmeli, David Koren, Judah Gamliel Hahn, Donald Ray Bryant-Rich
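The storage policy described in this abstract can be sketched roughly as follows. The class and field names are hypothetical, sizes are plain integers, and two choices are assumptions not stated in the abstract: a user file that exceeds the remaining free space is rejected, and eligible discardable files are removed in order of highest priority first.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class StoredFile:
    name: str
    size: int
    # None marks a non-discardable file; an integer is the discarding priority level.
    discard_priority: Optional[int] = None


class Storage:
    def __init__(self, capacity, safety_margin, discard_threshold):
        self.capacity = capacity
        self.safety_margin = safety_margin
        self.discard_threshold = discard_threshold
        self.files = []

    def free_space(self):
        return self.capacity - sum(f.size for f in self.files)

    def store_publisher_file(self, file):
        """Store a publisher file only if the safety margin stays intact."""
        if self.free_space() - file.size < self.safety_margin:
            return False
        self.files.append(file)
        return True

    def store_user_file(self, file):
        """Store a user file even if it narrows the margin, then restore the margin."""
        if file.size > self.free_space():
            return False  # assumption: no room at all means the write is rejected
        self.files.append(file)
        self._restore_margin()
        return True

    def _restore_margin(self):
        # Only files at or above the discarding threshold are eligible for removal.
        eligible = sorted(
            (f for f in self.files
             if f.discard_priority is not None
             and f.discard_priority >= self.discard_threshold),
            key=lambda f: f.discard_priority,
            reverse=True,
        )
        for f in eligible:
            if self.free_space() >= self.safety_margin:
                break
            self.files.remove(f)
```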
-
Publication number: 20110258034Abstract: Novel and efficient methods are described for indexing advertisements (“ads”) and other resources that are defined and organized in accordance with a hierarchical schema. In accordance with at least one embodiment, an ad corpus is transformed into a collection of hierarchically structured textual documents. An indexing technique that exploits the hierarchical structure is then applied to construct a compact yet effective ad index that can be used for performing advanced match or other ad retrieval functions. Various retrieval methods are also described herein that are capable of exploiting the hierarchical structure of the ad corpus to retrieve more relevant ads than those yielded by conventional methods.Type: ApplicationFiled: April 15, 2010Publication date: October 20, 2011Applicant: YAHOO! INC.Inventors: Donald Metzler, Evgeniy Gabrilovich, Vanja Josifovski, Michael Bendersky
-
Publication number: 20110258205Abstract: The sort processing of keys to be sorted, which keys are expressed as bit strings, involves a classification processing. In the classification processing, a bit string comparison between a reference key and a key which is an object of the classification is performed, and a difference bit position is obtained that is the bit position of the first bit that differs in the bit string comparison. The keys to be sorted are then classified by the difference bit position into key groups with the same difference bit position.Type: ApplicationFiled: June 22, 2011Publication date: October 20, 2011Applicant: S. Grants Co., Ltd.Inventors: Toshio Shinjo, Koutaro Shinjo, Mitsuhiro Kokubun
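A minimal sketch of the classification step may help make the idea concrete. It assumes keys are fixed-width unsigned integers and that bit positions are counted from the most significant bit starting at 0; the function names and these conventions are illustrative assumptions, not taken from the application.

```python
from collections import defaultdict
from typing import Optional


def difference_bit_position(reference: int, key: int, width: int) -> Optional[int]:
    """Return the position of the first differing bit, counted from the MSB.

    Returns None when the two keys are identical.
    """
    diff = reference ^ key
    if diff == 0:
        return None
    # The highest set bit of the XOR is the first bit at which the keys differ.
    return width - diff.bit_length()


def classify_by_difference_bit(reference: int, keys, width: int) -> dict:
    """Group keys by their difference bit position relative to the reference key."""
    groups = defaultdict(list)
    for key in keys:
        groups[difference_bit_position(reference, key, width)].append(key)
    return dict(groups)
```

With the 4-bit reference key `0b1010`, for example, the key `0b0010` differs at position 0 and the key `0b1011` differs at position 3, so the two fall into different key groups.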
-
Publication number: 20110258197Abstract: Content leaving a local network can be captured and indexed so that queries can be performed on the captured data. In one embodiment, the present invention comprises an apparatus that connects to a network. In one embodiment, this apparatus includes a network interface module to connect the apparatus to a network, a packet capture module to intercept packets being transmitted on the network, an object assembly module to reconstruct objects being transmitted on the network from the intercepted packets, an object classification module to determine the content in the reconstructed objects, and an object store module to store the objects. This apparatus can also have a user interface to enable a user to search objects stored in the object store module.Type: ApplicationFiled: June 24, 2011Publication date: October 20, 2011Inventors: Erik de la Iglesia, Rick Lowe, Ratinder Paul Singh Ahuja, William Deninger, Samuel King, Ashish Khasgiwala, Donald J. Massaro
-
Publication number: 20110258196Abstract: A method of content recommendation, includes: generating a first digital mathematical representation of contents to associate the contents with a first plurality of words describing the contents; generating a second digital mathematical representation of text documents different from the contents to associate the documents with a second plurality of words; processing the first and second pluralities of words to determine a common plurality of words; processing the first and second digital mathematical representations to generate a common digital mathematical representation of the contents and the text documents based on the common plurality of words; and providing content recommendation by processing the common digital mathematical representation.Type: ApplicationFiled: December 30, 2008Publication date: October 20, 2011Inventors: Skjalg Lepsoy, Gianluca Francini, Fabrizio Antonelli
-
Publication number: 20110251878Abstract: A system for processing data includes a first data pipeline. The first data pipeline includes a processor to process a first set of data stored in a tangible memory. The system also includes a second data pipeline to process a second set of data. A mapping processor matches the first set of data to the second set of data to produce a third set of data.Type: ApplicationFiled: April 13, 2010Publication date: October 13, 2011Applicant: Yahoo! Inc.Inventors: Senthil Subramanian, Prashant Baronia
-
Publication number: 20110252007Abstract: A method of storing data in a storage media includes compressing raw data based on a physical storage unit of the storage media and storing the compressed data in the storage media. The physical storage unit of the storage media storing the compressed data includes an update region into which update data may be written.Type: ApplicationFiled: April 8, 2011Publication date: October 13, 2011Applicant: Samsung Electronics Co., Ltd.Inventors: Kyoung Lae CHO, Bumseok Yu, Junjin Kong, Hee Chang Cho, Seongsik Hwang
-
Publication number: 20110252008Abstract: Methods and apparatuses for processing data are disclosed, including methods and apparatuses that leverage a reconfigurable logic device to offload decompression and search operations from a processor to thereby enable high speed data searches within data that has been stored in a compressed format.Type: ApplicationFiled: June 21, 2011Publication date: October 13, 2011Inventors: Roger D. Chamberlain, Benjamin M. Brink, Jason R. White, Mark A. Franklin, Ron K. Cytron
-
Publication number: 20110246439Abstract: A query is annotated with a small sketch (e.g. a Bloom filter) that approximates a set of interest that is related to the query. The query and sketch may be forwarded to index servers that each stores a portion of a search engine corpus. Each of the index servers may filter documents using the sketch before returning results for aggregation. The sketch is designed so there may be false positives (results returned by authors not in the set), but no false negatives (all relevant results are returned). The final aggregated results set may be checked against the full set to remove false positives before returning the final results to the user.Type: ApplicationFiled: April 6, 2010Publication date: October 6, 2011Applicant: Microsoft CorporationInventors: Michael A. Isard, Marc A. Najork, Sean A. Suchter, Eric R. Scheel
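The filter-then-verify flow in this abstract can be sketched with a small Bloom filter. The bit-array size, the number of hash functions, the SHA-256-based hashing, and the document layout with an `author` field are all assumptions chosen for illustration; the application does not specify them.

```python
import hashlib


class BloomFilter:
    """A compact set sketch: may report false positives, never false negatives."""

    def __init__(self, num_bits=1024, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = 0  # a Python int used as a bit array

    def _positions(self, item):
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{item}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, item):
        for pos in self._positions(item):
            self.bits |= 1 << pos

    def __contains__(self, item):
        return all((self.bits >> pos) & 1 for pos in self._positions(item))


def filter_on_index_server(documents, sketch):
    """Index-server side: keep documents whose author may be in the set of interest."""
    return [doc for doc in documents if doc["author"] in sketch]


def remove_false_positives(results, full_set):
    """Aggregator side: check surviving results against the exact set."""
    return [doc for doc in results if doc["author"] in full_set]
```

The sketch travels with the query because it is much smaller than the full set; the exact set is consulted only once, on the aggregated results.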
-
Publication number: 20110246472Abstract: Methods and systems for facilitating distribution of application functionality across a multi-tier client-server architecture are provided. According to one embodiment, a method is provided for instantiating a DataMap. A data store interface reads a set of definitions and instructions from a datastore that describe the structure of the DataMap. The data store interface interprets the set of definitions and instructions to instantiate the DataMap. According to another embodiment, a method is provided for indexing into a DataMap. A data store interface receives an expression. The data store interface parses the expression to identify a set of keys suitable for indexing into the DataMap and corresponding DataPoints.Type: ApplicationFiled: July 9, 2010Publication date: October 6, 2011Inventor: David M. Dillon
-
Publication number: 20110246432Abstract: Embodiments of the present invention provide one or more hardware-friendly data structures that enable efficient hardware acceleration of database operations. In particular, the present invention employs a column-store format for the database. In the database, column-groups are stored with implicit row ids (RIDs) and a RID-to-primary key column having both column-store and row-store benefits via column hopping and a heap structure for adding new data. Fixed-width column compression allows for easy hardware database processing directly on the compressed data. A global database virtual address space is utilized that allows for arithmetic derivation of any physical address of the data regardless of its location. A word compression dictionary with token compare and sort index is also provided to allow for efficient hardware-based searching of text. A tuple reconstruction process is provided as well that allows hardware to reconstruct a row by stitching together data from multiple column groups.Type: ApplicationFiled: May 13, 2011Publication date: October 6, 2011Applicant: TERADATA US, INC.Inventors: Liuxi Yang, Kapil Surlaker, Ravi Krishnamurthy, Michael Corwin, Jeremy Branscome, Krishnan Meiyyappan, Joseph I. Chamdani