Data Indexing; Abstracting; Data Reduction (epo) Patents (Class 707/E17.002)
  • Publication number: 20110320418
    Abstract: Apparatus, systems, and methods may operate to receive requests to execute a plurality of compression and/or decompression mechanisms on one or more database objects; to execute each of the compression and/or decompression mechanisms, on a sampled basis, on the database objects; to determine comparative performance characteristics associated with each of the compression and/or decompression mechanisms; and to record at least some of the performance characteristics and/or derivative characteristics derived from the performance characteristics in a performance summary table. The table may be published to a storage medium or a display screen. Other apparatus, systems, and methods are disclosed.
    Type: Application
    Filed: June 29, 2010
    Publication date: December 29, 2011
    Applicant: Teradata US, Inc.
    Inventors: Congnan Luo, Like Gao, Yu Long, Judy Wu, Michael Leon Reed
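The sampled comparison described in the abstract above can be sketched roughly as follows. This is a minimal illustration, not the claimed method: the function name `summarize_compression`, the choice of the `zlib`, `bz2`, and `lzma` codecs, and the row format are all assumptions made for the example.

```python
import bz2
import lzma
import random
import time
import zlib

def summarize_compression(rows, sample_size=100, seed=0):
    """Run several codecs on a sample of rows and tabulate the results."""
    rng = random.Random(seed)
    # Work on a sample rather than the whole object (assumed sampling policy).
    sample = rows if len(rows) <= sample_size else rng.sample(rows, sample_size)
    payload = b"\n".join(sample)
    codecs = {"zlib": zlib.compress, "bz2": bz2.compress, "lzma": lzma.compress}
    summary = []
    for name, compress in codecs.items():
        start = time.perf_counter()
        packed = compress(payload)
        elapsed = time.perf_counter() - start
        summary.append({
            "codec": name,
            "raw_bytes": len(payload),
            "packed_bytes": len(packed),
            "ratio": len(payload) / len(packed),  # derived characteristic
            "seconds": elapsed,
        })
    # Best compression ratio first, as one possible summary-table ordering.
    return sorted(summary, key=lambda r: r["ratio"], reverse=True)

# Hypothetical rows standing in for a database object.
rows = [b"customer-%05d,ACTIVE,2010-06-29" % i for i in range(1000)]
table = summarize_compression(rows)
```

Each entry of `table` plays the role of one line in a performance summary table; a real system would presumably also time decompression.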
  • Publication number: 20110320417
    Abstract: Apparatus, systems, and methods may operate to receive a set of ordered user-selected compression rules as a compression rule set comprising at least one compression threshold condition, to create or transform a database object with rows to be selectively compressed according to the compression rules in the compression rule set (providing a transformed object), and to publish at least a portion of the transformed object to one of a storage medium or a display screen. Other apparatus, systems, and methods are disclosed.
    Type: Application
    Filed: June 29, 2010
    Publication date: December 29, 2011
    Applicant: Teradata US, Inc.
    Inventors: Congnan Luo, Like Gao, Yu Long, Judy Wu, Michael Leon Reed
  • Publication number: 20110320458
    Abstract: A dynamic portal generation system includes an indexing module that indexes structured and unstructured data in a database. The database includes information residing in associated standalone applications having documents from information sources, and a name-entity repository that includes name entities and their corresponding name-entity types. A search module searches the information residing in the indexed information to obtain a search result. A name-entity extraction module extracts a matching name-entity that corresponds to a name-entity in the name-entity repository. A portal generation module dynamically generates a portal triggered by the search query.
    Type: Application
    Filed: June 22, 2011
    Publication date: December 29, 2011
    Inventor: ABINASHA KARANA
  • Publication number: 20110314033
    Abstract: A derivative work discovery service facilitates discovery of derivative works by consumers who request the service from a derivative work service server. The derivative work service server uses information about digital sources available on a computer system operated by the consumer to search a derivative works database. A derivative work discovery file is compiled and presented to the consumer in a descending order of a ratio of a number of digital sources possessed by the consumer that are equivalent to digital sources used to create a derivative work to a total number of digital sources used to create that derivative work.
    Type: Application
    Filed: June 18, 2010
    Publication date: December 22, 2011
    Applicant: Legitmix, Inc.
    Inventor: Omid Allen McDonald
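The ordering described in the abstract above (sources the consumer possesses divided by total sources used, descending) can be illustrated with a short sketch. The function name `rank_derivative_works` and the sample data are hypothetical, and ties are broken alphabetically as an assumption.

```python
def rank_derivative_works(owned_sources, works):
    """Rank works by the share of their sources the consumer already owns.

    works: dict mapping a work title to the set of source ids used to create it.
    """
    scored = []
    for title, sources in works.items():
        if not sources:
            continue  # avoid dividing by zero for works with no listed sources
        ratio = len(sources & owned_sources) / len(sources)
        scored.append((ratio, title))
    # Descending ratio; alphabetical title as an assumed tie-breaker.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [(title, ratio) for ratio, title in scored]

owned = {"track-a", "track-b", "track-c"}
works = {
    "mashup-1": {"track-a", "track-b"},
    "mashup-2": {"track-a", "track-x", "track-y", "track-z"},
    "mashup-3": {"track-b", "track-c", "track-q"},
}
ranking = rank_derivative_works(owned, works)
```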
  • Publication number: 20110314027
    Abstract: An index building, querying method, device and system for distributed columnar database are provided. The index building method for distributed columnar database includes: obtaining a column field from a distributed columnar database, generating a column index file in which the column field is a key word, the column index file comprising the mapping relationship between the value of the column field in the distributed columnar database and the corresponding Row field value; storing the column index file to an index catalogue corresponding to the column field in the distributed columnar database.
    Type: Application
    Filed: November 3, 2009
    Publication date: December 22, 2011
    Applicant: CHINA MOBILE COMMUNICATIONS CORPORATION
    Inventors: Meng Xu, Ling Qian, Zhiguo Luo, Leitao Guo, Peng Zhao
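The mapping from column values to row keys described in the abstract above can be sketched as a small in-memory structure. This is only an illustration: the table layout (a dict of row key to columns) and the name `build_column_index` are assumptions, and a real system would persist the result as an index file in a per-column catalogue.

```python
from collections import defaultdict

def build_column_index(table, column):
    """Map each value of `column` to the row keys that hold it."""
    index = defaultdict(list)
    for row_key, columns in table.items():
        if column in columns:  # columnar stores may have sparse rows
            index[columns[column]].append(row_key)
    return {value: sorted(keys) for value, keys in index.items()}

# Hypothetical sparse table: row key -> {column name: value}.
table = {
    "row1": {"city": "Beijing", "plan": "prepaid"},
    "row2": {"city": "Shanghai"},
    "row3": {"city": "Beijing", "plan": "postpaid"},
}
city_index = build_column_index(table, "city")
```

A query on `city == "Beijing"` then reads `city_index["Beijing"]` instead of scanning every row.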
  • Publication number: 20110313997
    Abstract: A total homepage service providing system includes an information provider information administration unit configured to register and administrate information of an information appliance of an information provider and information of the information provider; a homepage generation unit configured to automatically generate a homepage which can be displayed on the information appliance of the information provider and an information appliance of an information user, using metadata received from the information appliance of the information provider; a homepage registration and administration unit configured to store a file of the generated homepage, and register and administrate the homepage; and an index generation and administration unit configured to generate one or more homepage indexes for an information search, using keywords extracted and classified from the generated homepage, and administrate the generated homepage indexes.
    Type: Application
    Filed: November 3, 2009
    Publication date: December 22, 2011
    Inventor: Hee Sung Chung
  • Publication number: 20110313852
    Abstract: Systems and methods for applications of orthogonal corpus indexing (OCI), such as selecting ad words for purchase and improving visibility of web pages in search engines, are described. In one aspect, the systems and methods described herein employ OCI to recommend to an advertiser ad words for purchase. Advertisers pay search engines for placement of their advertising alongside results in the search results page, when a given word or phrase appears in a user's search query. The described systems and methods enable automated selection of related and discriminating terms, identifying keywords that increase the ratio of ads clicked-through to money spent on keyword buying. In another aspect, the systems and methods described herein employ OCI to generate content for web pages. OCI may be used to determine content that when added to a web page improves the rank of that page in a search engine.
    Type: Application
    Filed: May 16, 2011
    Publication date: December 22, 2011
    Applicant: IndraWeb.com, Inc.
    Inventors: Henry B. Kon, George W. Burch
  • Publication number: 20110307469
    Abstract: A new approach is proposed that contemplates systems and methods to provide query suggestions including real-time suggestion of complete query terms, which can be phrases, to a user by analyzing and indexing the real-time history/stream of content or documents in addition to the stream of queries entered. Since the real-time indexing generates a count of potential results for each term found and/or indexed in the stream, the terms found in that stream can then be used as potential query suggestions, knowing that it will be possible to provide results for those queries.
    Type: Application
    Filed: June 14, 2011
    Publication date: December 15, 2011
    Inventors: Rishab Aiyer Ghosh, Lun Ted Cui
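The idea in the abstract above, that terms counted while indexing a content stream are safe to offer as suggestions because results are known to exist, can be sketched as follows. The class name `SuggestionIndex`, the whitespace tokenizer, and the prefix-matching policy are assumptions for illustration only.

```python
from collections import Counter

class SuggestionIndex:
    """Counts, per term, how many indexed documents contain it."""

    def __init__(self):
        self.counts = Counter()

    def add_document(self, text):
        # Count each term once per document so the count equals the
        # number of potential results for that term.
        self.counts.update(set(text.lower().split()))

    def suggest(self, prefix, limit=5):
        prefix = prefix.lower()
        matches = [(t, n) for t, n in self.counts.items() if t.startswith(prefix)]
        matches.sort(key=lambda pair: (-pair[1], pair[0]))
        return matches[:limit]

idx = SuggestionIndex()
idx.add_document("patent index search")
idx.add_document("Patent pending")
idx.add_document("parser")
suggestions = idx.suggest("pa")
```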
  • Publication number: 20110307468
    Abstract: A method and system for identifying nodes with similar content. In one aspect, the method comprises determining a structure of a network of nodes, said structure defined by incoming links and outgoing links between nodes within said network, grouping said nodes within said network into a first set of modules, calculating a first modularity value between each of the modules within the first set, said modularity value indicating a degree of similar content within each module, calculating a topical relevance value for each of the modules, selecting those modules whose topical relevance value exceeds a threshold value and calculating an authority score for the selected modules.
    Type: Application
    Filed: June 11, 2010
    Publication date: December 15, 2011
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Ning Duan, Pei-Yun S. Hsueh, Yan Liu
  • Publication number: 20110307490
    Abstract: An invention for dissemination or retrieval of digital resources or online information via context layer or context-level protocols and interfaces is described. According to one embodiment, an interface or protocol that a computer uses to communicate with other computers is associated with a subject matter context. User-level contents or digital resources received across that interface or protocol are then associated with that subject matter context, and the computer may respond accordingly. For instance, a computer may associate a given network port with a subject matter context of shopping, and treat all digital resource requests received on that port as applying to only a shopping subject matter context. A web server may also listen on a network port associated with a subject matter context, thereby contextualizing the overall nature of the website that the web server hosts.
    Type: Application
    Filed: June 15, 2011
    Publication date: December 15, 2011
    Applicant: USM CHINA/HONG KONG LIMITED
    Inventor: Edmond K. Chow
  • Publication number: 20110307432
    Abstract: Improved search result relevance is provided for name segment searches performed by a general web search engine. Entity-related information is mined from web documents and search engine query logs, and metadata is indexed in a search system index. The metadata may include information identifying entity homepages, entity web pages at high quality top sites, other entity-related web pages, entity equivalent data, and/or entity misspellings data. The indexed metadata is employed to provide improved search results relevance for search queries that include an entity's name by improving the ranking of search results corresponding with entity-relevant web pages.
    Type: Application
    Filed: June 11, 2010
    Publication date: December 15, 2011
    Applicant: MICROSOFT CORPORATION
    Inventors: QI YAO, VINCENT LI, JUNBIAO TANG, RICHARD CHANG
  • Publication number: 20110302183
    Abstract: A method for managing an object watchpoint during a garbage collection cycle, including identifying an object having a field, where the object is associated with an original object location, where the field is associated with an original field location, and where the object and the field are located in a memory heap of a virtual machine memory, setting, within a debugger, the object watchpoint on the original field location, where the object watchpoint is a memory trap associated with the object, determining, after a relocation of the object by a garbage collector (GC), a new object location associated with the object, determining a new field location of the field based on the new object location, and setting, within the debugger, the object watchpoint on the new field location.
    Type: Application
    Filed: June 2, 2010
    Publication date: December 8, 2011
    Applicant: ORACLE INTERNATIONAL CORPORATION
    Inventors: Michael Lee Van De Vanter, Hannes E. Payer, Douglas Norman Simon, Benjamin Lawrence Titzer, Mario I. Wolczko
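The relocation step in the abstract above can be illustrated with a toy model. The abstract only says the new field location is determined from the new object location; computing it as a fixed offset from the object's base address is an assumption of this sketch, and the `Watchpoint` class is hypothetical.

```python
class Watchpoint:
    """Tracks a watched field across object relocations (toy model)."""

    def __init__(self, object_address, field_address):
        # Assumed: the field sits at a constant offset inside the object.
        self.field_offset = field_address - object_address
        self.field_address = field_address

    def on_object_moved(self, new_object_address):
        # Re-arm the watchpoint at the field's location in the moved object.
        self.field_address = new_object_address + self.field_offset
        return self.field_address

wp = Watchpoint(object_address=0x1000, field_address=0x1010)
new_field = wp.on_object_moved(0x8000)
```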
  • Publication number: 20110302171
    Abstract: A method and system for discovering a control event from electronically published documents and received data streams is provided, in which a computer control program identifies electronically published documents and data stored in a plurality of network servers which potentially contain control events relevant to the control of goods and/or services, the control events identified by reference to user interest identifiers. Identified material is analyzed by a classification program to determine whether control events are present. A control event classification is assigned to documents and received data determined to contain at least one discovered control event, the assigned control event classification and information identifying the associated document and data is stored in a classification database, and a report of discovery of documents and data containing control events is provided to a user. The report may include a link to the control event classification and/or its associated document or data.
    Type: Application
    Filed: June 14, 2011
    Publication date: December 8, 2011
    Applicant: Decernis, LLC
    Inventors: Patrick Blackmon Waldo, Andrew B. Waldo
  • Publication number: 20110295819
    Abstract: Various embodiments for transforming a logical data object for storage in a storage device operable with at least one storage protocol are provided. In one such embodiment, the logical data object is divided into one or more segments, with each segment characterized by respective start and end offsets. One or more obtained variable size data chunks are processed corresponding to the logical data object to obtain processed data chunks, wherein at least one of the processed data chunks comprises transformed data resulting from the processing. Each of the variable size data chunks is associated with a respective segment of the logical data object.
    Type: Application
    Filed: August 5, 2011
    Publication date: December 1, 2011
    Inventors: Jonathan AMIT, Ori SHALEV
  • Publication number: 20110295818
    Abstract: Various embodiments for transforming a logical data object for storage in a storage device operable with at least one storage protocol are provided. In one such embodiment, the logical data object is divided into one or more segments, with each segment characterized by respective start and end offsets. One or more obtained variable size data chunks are processed corresponding to the logical data object to obtain processed data chunks, wherein at least one of the processed data chunks comprises transformed data resulting from the processing. Each of the variable size data chunks is associated with a respective segment of the logical data object.
    Type: Application
    Filed: August 5, 2011
    Publication date: December 1, 2011
    Inventors: Jonathan AMIT, Ori SHALEV
  • Publication number: 20110289049
    Abstract: Metadata may be stored in, and retrieved from, a scalable, fault-tolerant metadata service. In one example, metadata is divided into partitions, and each partition is served by one or more nodes. For each partition, a first one of the nodes may handle read and write requests, and the other nodes may handle read requests in the event that the first node is down or is experiencing high load. When a request is made with respect to metadata, a metadata server may identify a node, in the partition to which the metadata is assigned, to which the request is to be made. The entity that is making the request then contacts that node, and requests the read or write on the metadata. In a partition, metadata may be replicated between the first node and the other nodes using a log-based replication protocol.
    Type: Application
    Filed: May 19, 2010
    Publication date: November 24, 2011
    Applicant: MICROSOFT CORPORATION
    Inventors: Nanshan Zeng, Meng Ye, Honghua Feng, Junwei Xu, Yu-chao Cao, Yingjun Yu, Lin Song
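The routing behaviour in the abstract above (one node per partition takes reads and writes, the others take reads when it is down) can be sketched as below. The hash-based partition assignment, the `Partition` class, and `pick_node` are assumptions for illustration; the abstract does not say how metadata is assigned to partitions, and the high-load fallback and log-based replication are not modelled.

```python
import hashlib

class Partition:
    def __init__(self, primary, secondaries):
        self.primary = primary              # handles reads and writes
        self.secondaries = list(secondaries)  # read-only fallbacks

def partition_for(key, partitions):
    # Assumed placement policy: stable hash of the metadata key.
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return partitions[int.from_bytes(digest[:8], "big") % len(partitions)]

def pick_node(key, op, partitions, unavailable=frozenset()):
    """Return the node a client should contact for `op` on `key`."""
    part = partition_for(key, partitions)
    if part.primary not in unavailable:
        return part.primary
    if op == "read":
        for node in part.secondaries:
            if node not in unavailable:
                return node
    raise RuntimeError("no node can serve this request")

partitions = [Partition("a1", ["a2", "a3"]), Partition("b1", ["b2"])]
node = pick_node("volume/42/owner", "read", partitions)
```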
  • Publication number: 20110289091
    Abstract: In accordance with embodiments, there are provided methods and systems for providing multiple column custom indexes in a multi-tenant database environment. A method embodiment provides defining a multi-tenant data structure having a plurality of data fields and a plurality of rows for each of multiple tenants, each row including a data column for a tenant identifier, defining a first data field for a tenant, the first data field having a first data type, defining a second data field for the tenant, the second field having a second data type, and defining an index table including a tenant identifier for the tenant, a copy of data from the first data field and the second data field, and a key to the corresponding rows of the multi-tenant data structure.
    Type: Application
    Filed: October 4, 2010
    Publication date: November 24, 2011
    Applicant: Salesforce.com, inc.
    Inventors: Jesse Collins, Simon Y. Wong, Jaikumar Bathija, John F. O'Brien
  • Publication number: 20110289068
    Abstract: Personalized navigation for one or more individuals' use of a search engine is provided. Identification of a query submitted to the search engine is performed. If the query is identified to be a personal navigational query, which is a query via which the individuals intend to navigate to a particular site or information object that they have previously viewed, the particular site or information object associated with the query is identified, and results of the search are personalized based on knowledge of the identified site or information object.
    Type: Application
    Filed: May 21, 2010
    Publication date: November 24, 2011
    Applicant: Microsoft Corporation
    Inventors: Jaime Teevan, Susan T. Dumais, Gayathri Ravichandran Geetha, Sarah K. Tyler
  • Publication number: 20110289089
    Abstract: Search engines rely on complex algorithms to search for what is available on the internet. When a query is input, the engine will find matches to the user's search terms and return those matches in ever-expanding circles of relevance. When searches do not return a “true” result that meets the user's needs, the lack of information is ignored. A method for mining unfulfilled searches from search result data and indexing these unfulfilled results in a methodical way, and a system for such, are disclosed herein. Also disclosed is a method for associating and grouping those unfulfilled searches to specific categories. This results in a database of what is being searched for and not being found. This database can be used for a variety of purposes including, but not limited to, identifying news areas, new product initiation and prioritization, enhanced customer support, and lead generation.
    Type: Application
    Filed: May 18, 2010
    Publication date: November 24, 2011
    Inventors: Mariana Paul Thomas, Ajit Peter Thomas
  • Publication number: 20110289092
    Abstract: In various embodiments, a system and related method for organizing transactional data from a diverse and heterogeneous application environment is disclosed. In an example embodiment, a system includes a file system and one or more daemon indexers in electrical communication with the file system. The file system is arranged as a non-relational and serverless file system to allow for cost-effectiveness with ready scalability. The file system is to receive, in substantially real-time, unsorted transactional data from a publishing module. The one or more daemon indexers are arranged to receive the unsorted transactional data from the file system, organize the unsorted transactional data by operational characteristics, and store the organized transactional data on the file system.
    Type: Application
    Filed: August 2, 2011
    Publication date: November 24, 2011
    Applicant: eBay Inc.
    Inventors: Abhinav Kumar, Ravinder Purumala, Premendra Singh
  • Publication number: 20110282863
    Abstract: This invention discloses how Virtual Database Technology can be used to make disparate data appear to be (or act as) the sort of uniform data one expects to find within a single relational database. In particular, we show how to process queries similar to those one might use in a database, even though the underlying data may be missing some of the capabilities that are required by normal databases. Whereas traditional databases require that all the tuples in a table be stored, our approach allows queries over tables where the tuples are generated as required from the data sources, and may not be stored anywhere. We show how such facilities can be used as a new foundation for Internet search.
    Type: Application
    Filed: May 11, 2010
    Publication date: November 17, 2011
    Inventors: Donald Cohen, Krishnamurthy Narayanaswamy
  • Publication number: 20110282920
    Abstract: A processing method has been claimed for reducing the average wait time of requests in a queue in a system environment where garbage collection may occur. In the method, a computer system treats as a unit each request in a queue and a completion time of garbage collection that may occur at the time of processing the request, and processes requests preferentially and systematically in ascending order of the processing times of the units including the garbage collection times, thereby reducing the average wait time of the requests. While the computer system managing the queue knows the remaining amount of heap just before processing a certain request, the computer system statistically calculates in advance the amounts of heap to be consumed on a request type basis and holds the values.
    Type: Application
    Filed: May 10, 2011
    Publication date: November 17, 2011
    Applicant: International Business Machines Corporation
    Inventor: Takeshi Ogasawara
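The scheduling rule in the abstract above, where each request is costed together with any garbage collection it would trigger and the cheapest unit runs first, can be sketched as a greedy loop. The per-type profiles, the fixed `gc_seconds`, and the assumption that a collection reclaims the entire heap are simplifications made for this example, not details from the source.

```python
def unit_seconds(request, profiles, free_heap, gc_seconds):
    """Processing time of a request plus GC time if it would exhaust the heap."""
    seconds, heap_needed = profiles[request[1]]
    return seconds + (gc_seconds if heap_needed > free_heap else 0.0)

def schedule(requests, profiles, heap_capacity, gc_seconds):
    """Return request ids in processing order.

    requests: list of (request_id, request_type)
    profiles: request_type -> (processing_seconds, heap_bytes_consumed),
              i.e. the statistics gathered in advance per request type.
    """
    pending = list(requests)
    free_heap = heap_capacity
    order = []
    while pending:
        nxt = min(pending, key=lambda r: unit_seconds(r, profiles, free_heap, gc_seconds))
        pending.remove(nxt)
        _, heap_needed = profiles[nxt[1]]
        if heap_needed > free_heap:
            free_heap = heap_capacity  # assumed: GC reclaims the whole heap
        free_heap -= heap_needed
        order.append(nxt[0])
    return order

profiles = {"quick_big": (1.0, 80), "slow_tiny": (3.0, 5)}
requests = [("x", "quick_big"), ("y", "quick_big"), ("z", "slow_tiny")]
order = schedule(requests, profiles, heap_capacity=100, gc_seconds=5.0)
```

After `x` runs, only 20 units of heap remain, so `y` would cost 1.0 + 5.0 seconds and the nominally slower `z` (3.0 seconds) is served first.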
  • Publication number: 20110282881
    Abstract: Methods and systems are described for determining candidates for a custom index in a multi-tenant database environment. In one embodiment, a method includes capturing a query that is directed to a multi-tenant database, determining whether the captured query is a candidate for an additional filter, determining operators used by the captured query if the query is a candidate, determining data types of the database used by the captured query if the query is a candidate, determining whether there is a current filter for the operator and data types used by the captured query if the query is a candidate, selecting the captured query based on the determined operators, data types, and the determined current filters, and generating a custom index for the selected query.
    Type: Application
    Filed: December 17, 2010
    Publication date: November 17, 2011
    Applicant: salesforce.com, inc.
    Inventors: Jesse Collins, Arup Dutta
  • Publication number: 20110276545
    Abstract: Systems and methods for compressing a raw logical data object (201) for storage in a storage device operable with at least one storage protocol, creating, reading, writing, optimization and restoring thereof. Compressing the raw logical data object (201) comprises creating in the storage device a compressed logical data object (203) comprising a header (204) and one or more allocated compressed sections with predefined size (205-1-205-2); compressing one or more sequentially obtained chunks of raw data (202-1-202-6) corresponding to the raw logical data object (201) thus giving rise to the compressed data chunks (207-1-207-6); and sequentially accommodating the processed data chunks into said compressed sections (205-1-205-2) in accordance with an order in which said chunks are received, wherein said compressed sections serve as atomic elements of compression/decompression operations during input/output transactions on the logical data object.
    Type: Application
    Filed: July 21, 2011
    Publication date: November 10, 2011
    Inventors: Chaim KOIFMAN, Nadav KEDEM, Avi ZOHAR, Jonathan AMIT
  • Publication number: 20110276575
    Abstract: A document accessible over a network can be registered. A registered document, and the content contained therein, is not transmitted undetected over and off of the network. In one embodiment, the invention includes a manager agent to maintain signatures of registered documents and a match agent to detect the unauthorized transmission of the content of registered documents.
    Type: Application
    Filed: July 20, 2011
    Publication date: November 10, 2011
    Inventors: Erik de la Iglesia, William Deninger, Ratinder Paul Singh Ahuja
  • Publication number: 20110276543
    Abstract: A virtual block device is an interface with applications that appears to the applications as a memory device, such as a standard block device. The virtual block device interacts with additional elements to do data deduplication to files at the block level such that one or more files accessed using the virtual block device have at least one block which is shared by the one or more files.
    Type: Application
    Filed: November 5, 2010
    Publication date: November 10, 2011
    Applicant: EXAR CORPORATION
    Inventor: John Edward Gerard Matze
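Block-level deduplication of the kind described in the abstract above can be illustrated with a content-addressed store. This is a sketch under stated assumptions: the `DedupBlockStore` class, the fixed block size, and the use of SHA-256 digests as block identities are choices made for the example, not details from the source.

```python
import hashlib

class DedupBlockStore:
    """Stores each distinct block once; files are lists of block digests."""

    def __init__(self, block_size=4096):
        self.block_size = block_size
        self.blocks = {}  # digest -> block bytes, stored once
        self.files = {}   # file name -> ordered list of digests

    def write_file(self, name, data):
        digests = []
        for offset in range(0, len(data), self.block_size):
            block = data[offset:offset + self.block_size]
            digest = hashlib.sha256(block).hexdigest()
            self.blocks.setdefault(digest, block)  # reuse an existing block
            digests.append(digest)
        self.files[name] = digests

    def read_file(self, name):
        return b"".join(self.blocks[d] for d in self.files[name])

store = DedupBlockStore(block_size=4)
store.write_file("a.txt", b"AAAABBBBCCCC")
store.write_file("b.txt", b"BBBBDDDD")
```

Here the block `BBBB` is shared by both files, so five logical blocks occupy four stored blocks.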
  • Publication number: 20110270843
    Abstract: A specialized search engine tool designed for subject matter experts facilitates access to information relevant to their area of expertise available on public domains over the Internet. The specialized search engine represents the collection of thousands of links that are sorted, resorted, categorized and placed into databases that interact with one another. The specialized search engine may permit a user to compare results from multiple databases and automatically submit their search query to many popular searchable databases and web sites from a central web page, without having to individually visit each site. In one example, the specialized search engine is a medical search engine.
    Type: Application
    Filed: November 3, 2010
    Publication date: November 3, 2011
    Applicant: Mayo Foundation for Medical Education and Research
    Inventor: Scott M. Albin
  • Publication number: 20110270841
    Abstract: Systems may use explicit ratings from users to construct user to user correlations. This technique may reduce the user-content correlation to a single dimension, i.e., the content that a plurality of users may rate similarly. Embodiments of the present invention may use DHT as an underlying distributed signaling mechanism, but may also make the rating implicit. Furthermore, embodiments of the present invention may construct the user to content correlation based on multi-dimensional metadata related to the content.
    Type: Application
    Filed: April 28, 2010
    Publication date: November 3, 2011
    Applicant: Cisco Technology, Inc.
    Inventors: Manish Bhardwaj, Jining Tian, Gursharan Singh
  • Publication number: 20110270830
    Abstract: A computer-implemented method for providing multi-core and multi-level topical organization in social indexes is provided. A corpus of articles is accessed. Each article includes online textual materials. A finite state pattern is provided for a topic that filters the articles as candidate articles, which are potentially on-topic. Similarity-based representations for on-topic and off-topic core meanings of the topic are provided. An aggregate score for each of the candidate articles is determined using the similarity-based representations to indicate whether the candidate article is sufficiently on-topic. The candidate articles are presented ordered by their aggregate scores. In a further embodiment, a hierarchy of topics is provided and used to guide the presentation of articles from subtopics, with considerations of fairness of subtopic coverage, elimination of similarity-duplicates in articles, and article freshness.
    Type: Application
    Filed: April 30, 2010
    Publication date: November 3, 2011
    Applicant: PALO ALTO RESEARCH CENTER INCORPORATED
    Inventors: Mark Jeffrey Stefik, Lance E. Good, Sanjay Mittal
  • Publication number: 20110270808
    Abstract: A clustering-based approach to data standardization is provided. Certain embodiments take as input a plurality of addresses, identify one or more features of the addresses, cluster the addresses based on the one or more features, utilize the cluster(s) to provide a data-based context useful in identifying one or more synonyms for elements contained in the address(es), and standardize the address(es) to an acceptable format, with one or more synonyms and/or other elements being added to or taken away from the input address(es) as part of the standardization process.
    Type: Application
    Filed: April 30, 2010
    Publication date: November 3, 2011
    Applicant: International Business Machines Corporation
    Inventors: Tanveer A. Faruquie, Sachindra Joshi, Hima P. Karanam, Marvin Mendelssohn, Mukesh K. Mohania, Angel Smith, L. V. Subramaniam, Girish Venkatachaliah
  • Publication number: 20110264646
    Abstract: A search engine database may have a segmented structure that preserves individual document references and allows updating as well as scalability. A set of segment managers may receive new, updated, or deleted documents and update a set of term matrices from which a published search matrix may be generated. The database may have a very large term dictionary and may use a hash function to create term identifiers without having to look up terms in the dictionary. The database may be maintained by many systems operating in parallel for high scalability.
    Type: Application
    Filed: April 26, 2010
    Publication date: October 27, 2011
    Applicant: Microsoft Corporation
    Inventors: Patrick Sokolan, Dennis Doherty, Claude Duguay, William Radcliffe, Virgil Bourassa, Tammara King, John Sheppard
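Deriving term identifiers from a hash, so that no dictionary lookup is needed, can be sketched as follows. The choice of BLAKE2b, the 64-bit identifier width, lowercasing, and the helper names `term_id` and `index_document` are assumptions for illustration; the abstract does not name a hash function.

```python
import hashlib

def term_id(term, bits=64):
    """Compute a term identifier directly from the term text."""
    digest = hashlib.blake2b(term.lower().encode("utf-8"),
                             digest_size=bits // 8).digest()
    return int.from_bytes(digest, "big")

def index_document(postings, doc_id, text):
    """Add a document to a postings map keyed by hashed term id."""
    for word in text.split():
        postings.setdefault(term_id(word), set()).add(doc_id)

postings = {}
index_document(postings, 1, "search engine")
index_document(postings, 2, "Search index")
hits = postings[term_id("search")]
```

Because any worker can compute `term_id` independently, parallel indexers agree on identifiers without coordinating through a shared dictionary; distinct terms could in principle collide, which this sketch does not handle.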
  • Publication number: 20110264632
    Abstract: Methods and systems for transforming a logical data object for storage in a storage device configured to operate with at least one storage protocol. One method comprises creating in the storage device a transformed logical data object comprising one or more allocated storage sections with a predefined size and receiving one or more data chunks corresponding to the transformed logical data object. The method further comprises determining if each received data chunk comprises a predefined criterion, transforming each data chunk that comprises the predefined criterion, maintaining each data chunk in raw form that does not comprise the predefined criterion, and sequentially storing each transformed data chunk and data chunk in raw form into said one or more allocated storage sections in accordance with an order said transformed data chunks and data chunks in raw form are received. One system comprises a processor configured to perform the above method.
    Type: Application
    Filed: July 7, 2011
    Publication date: October 27, 2011
    Inventors: Chaim KOIFMAN, Nadav KEDEM, Avi ZOHAR
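The selective transformation in the abstract above (transform chunks that meet a criterion, keep the rest raw, store everything in arrival order) can be sketched as below. The criterion used here, a minimum size plus an actual size reduction under `zlib`, is an assumption; the source does not state what the predefined criterion is, and fixed-size storage sections are not modelled.

```python
import zlib

def transform_chunks(chunks, min_size=32):
    """Return (kind, payload) records in the order the chunks arrived."""
    stored = []
    for chunk in chunks:
        packed = zlib.compress(chunk)
        # Assumed criterion: big enough to bother, and compression helps.
        if len(chunk) >= min_size and len(packed) < len(chunk):
            stored.append(("zlib", packed))
        else:
            stored.append(("raw", chunk))
    return stored

def restore_chunks(stored):
    return [zlib.decompress(p) if kind == "zlib" else p for kind, p in stored]

chunks = [b"a" * 500, b"xyz", bytes(range(256))]
stored = transform_chunks(chunks)
```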
  • Publication number: 20110264633
    Abstract: Methods and systems for transforming a logical data object for storage in a storage device configured to operate with at least one storage protocol. One method comprises creating in the storage device a transformed logical data object comprising one or more allocated storage sections with a predefined size and receiving one or more data chunks corresponding to the transformed logical data object. The method further comprises determining if each received data chunk comprises a predefined criterion, transforming each data chunk that comprises the predefined criterion, maintaining each data chunk in raw form that does not comprise the predefined criterion, and sequentially storing each transformed data chunk and data chunk in raw form into said one or more allocated storage sections in accordance with an order said transformed data chunks and data chunks in raw form are received. One system comprises a processor configured to perform the above method.
    Type: Application
    Filed: July 7, 2011
    Publication date: October 27, 2011
    Inventors: Chaim KOIFMAN, Nadav KEDEM, Avi ZOHAR
  • Publication number: 20110264668
    Abstract: Secondary indexing mechanisms are disclosed. A first index is created in a database environment. The index has a scope defined by a set of files that meet pre-selected criteria. Second index generation is initiated. The second index has the same scope as the first index. A first time period between initiation of the generation of the second index and completion of the second index is determined. The second index is swapped with the first index in an atomic swap operation. The indices may be generated for a multitenant database environment. Catch up indexing may be performed for the secondary index.
    Type: Application
    Filed: December 7, 2010
    Publication date: October 27, 2011
    Applicant: salesforce.com, inc.
    Inventors: David Hacker, Jeffrey Bergan, Utsavi Benani, Paul Burstein, Jon Mark Dewey
  • Publication number: 20110264665
    Abstract: A data search and retrieval system that, in response to a search query, dynamically selects and applies a model of information to be returned to a user. The model may be selected based on the search query directly or indirectly based on data returned by a search engine applying the query. For this purpose, the system may include an index of models, similar to a search index. Models may be authored and contributed to the search and retrieval system by third parties, and an association between each such contributed model and characteristics of a search query, such as specific search query terms, may be stored in the index of models. A user of the search and retrieval system may provide feedback on a model that was used to generate information in response to the user's search query, and such feedback may be used to update the index of models.
    Type: Application
    Filed: April 26, 2010
    Publication date: October 27, 2011
    Applicant: Microsoft Corporation
    Inventors: Vijay Mital, Thomas Frank Bergstraesser, Darryl Ellis Rubin
  • Publication number: 20110264712
    Abstract: A garbage collector is disclosed that permits extensive separation of mutators and the garbage collector from a synchronization perspective. This relative decoupling of mutator and collector operation allows the garbage collector to perform relatively time-intensive operations during garbage collection without substantially slowing down mutators. The present invention makes use of this flexibility by first conservatively determining which objects in a set of regions of interest are live, then planning where to copy the objects (preferably including clustering), and finally performing the actual copying.
    Type: Application
    Filed: April 20, 2011
    Publication date: October 27, 2011
    Applicant: TATU YLONEN OY LTD
    Inventor: Tatu J. Ylonen
  • Publication number: 20110264634
    Abstract: Systems and methods for compressing a raw logical data object (201) for storage in a storage device operable with at least one storage protocol, creating, reading, writing, optimization and restoring thereof. Compressing the raw logical data object (201) comprises creating in the storage device a compressed logical data object (203) comprising a header (204) and one or more allocated compressed sections with predefined size (205-1-205-2); compressing one or more sequentially obtained chunks of raw data (202-1-202-6) corresponding to the raw logical data object (201) thus giving rise to the compressed data chunks (207-1-207-6); and sequentially accommodating the processed data chunks into said compressed sections (205-1-205-2) in accordance with the order in which said chunks are received, wherein said compressed sections serve as atomic elements of compression/decompression operations during input/output transactions on the logical data object.
    Type: Application
    Filed: July 7, 2011
    Publication date: October 27, 2011
    Inventors: Chaim KOIFMAN, Nadav KEDEM, Avi ZOHAR, Jonathan AMIT
  • Publication number: 20110264626
    Abstract: Methods for parallel query execution of a database operation on a database utilizing a graphics processing unit (GPU) are presented including: receiving a query by a host, the query including database relations; starting a GPU kernel, where the GPU kernels include a GPU memory; hash partitioning the database relations by the GPU kernel; loading the partitioned database relations into the GPU memory; loading keyed partitions corresponding to the hash partitioned database relations into the GPU memory; building a hash table for a smaller of the hash partitioned database relations; and executing the query. In some embodiments, methods further include returning a result of the query. In some embodiments, methods further include, when the query is a long query including a number of operators, parsing the long query into a number of sub-queries; and, for each of the sub-queries, starting one of the GPU kernels such that the sub-queries are processed in parallel.
    Type: Application
    Filed: April 22, 2010
    Publication date: October 27, 2011
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Akshay Gautam, Ritesh K. Gupta
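    The partition-then-build-then-probe sequence in this abstract resembles a partitioned hash join. The sketch below shows that general technique on the CPU in plain Python; it does not run on a GPU and is not the claimed method. The function names, the row-as-dictionary representation, and the choice to pick the smaller side per partition are assumptions made for illustration.

```python
from collections import defaultdict


def hash_partition(relation, key, num_partitions):
    """Split rows into partitions by the hash of their join key."""
    partitions = [[] for _ in range(num_partitions)]
    for row in relation:
        partitions[hash(row[key]) % num_partitions].append(row)
    return partitions


def partitioned_hash_join(left, right, key, num_partitions=4):
    """Equi-join two relations (lists of dicts) on `key`."""
    results = []
    left_parts = hash_partition(left, key, num_partitions)
    right_parts = hash_partition(right, key, num_partitions)
    # Rows with equal keys always land in the same partition index,
    # so each pair of partitions can be joined independently.
    for lp, rp in zip(left_parts, right_parts):
        # Build the hash table on the smaller side, probe with the larger.
        build, probe = (lp, rp) if len(lp) <= len(rp) else (rp, lp)
        table = defaultdict(list)
        for row in build:
            table[row[key]].append(row)
        for row in probe:
            for match in table.get(row[key], []):
                results.append({**match, **row})
    return results
```

    Since each partition pair is independent, the loop body is the unit of work that a parallel implementation could hand to separate workers.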
  • Publication number: 20110264525
    Abstract: The present invention provides methods and systems for use in searching a user's online world, across independently provided applications including Web-based and desktop applications, which can include Web sites. Techniques are provided in which information is collected and indexed relating to activities and communications of a user, and of other users in association with activities and communications of the user, across independent Web-based applications. A graphical user interface is provided to allow user searching in association with the collected and indexed information. Search results are provided using a graphical user interface that can include Web results, personal results, and desktop results.
    Type: Application
    Filed: April 26, 2010
    Publication date: October 27, 2011
    Applicant: Yahoo! Inc.
    Inventors: Tarun Bhatia, Eric Theodore Bax
  • Publication number: 20110258241
    Abstract: Files stored, or to be stored, in a storage device are marked either as non-discardable or as discardable in a file system structure associated with a storage device. Each discardable file has associated with it a discarding priority level. A publisher file is permitted to be stored in the storage device only if storing the publisher file does not narrow a storage usage safety margin that is reserved for user files. User files are allowed to be stored in the storage device even if storing them narrows the storage usage safety margin but, in such cases, the storage usage safety margin is restored by removing one or more discardable files from the storage device. A discardable file is removed from the storage device if its discarding priority level equals or is higher than a predetermined discarding threshold value.
    Type: Application
    Filed: June 29, 2011
    Publication date: October 20, 2011
    Inventors: Moshe Raines, Ran Carmeli, David Koren, Judah Gamliel Hahn, Donald Ray Bryant-Rich
  • Publication number: 20110258034
    Abstract: Novel and efficient methods are described for indexing advertisements (“ads”) and other resources that are defined and organized in accordance with a hierarchical schema. In accordance with at least one embodiment, an ad corpus is transformed into a collection of hierarchically structured textual documents. An indexing technique that exploits the hierarchical structure is then applied to construct a compact yet effective ad index that can be used for performing advanced match or other ad retrieval functions. Various retrieval methods are also described herein that are capable of exploiting the hierarchical structure of the ad corpus to retrieve more relevant ads than those yielded by conventional methods.
    Type: Application
    Filed: April 15, 2010
    Publication date: October 20, 2011
    Applicant: YAHOO! INC.
    Inventors: Donald Metzler, Evgeniy Gabrilovich, Vanja Josifovski, Michael Bendersky
  • Publication number: 20110258205
    Abstract: The sort processing of keys to be sorted, which keys are expressed as bit strings, involves a classification processing. In the classification processing, a bit string comparison between a reference key and a key which is an object of the classification is performed, and a difference bit position is obtained that is the bit position of the first bit that differs in the bit string comparison, and the keys to be sorted are classified by the difference bit position into key groups with the same difference bit position.
    Type: Application
    Filed: June 22, 2011
    Publication date: October 20, 2011
    Applicant: S. Grants Co., Ltd.
    Inventors: Toshio Shinjo, Koutaro Shinjo, Mitsuhiro Kokubun
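    The classification step in this abstract can be sketched in a few lines of Python. This is an illustration only, under assumptions not stated in the abstract: keys are equal-length strings of "0" and "1", a key identical to the reference is given the position -1, and the function names are hypothetical.

```python
from collections import defaultdict


def difference_bit_position(reference: str, key: str) -> int:
    """Return the index of the first bit where key differs from reference.

    Returns -1 when the two keys are identical (an assumed convention).
    """
    for position, (ref_bit, key_bit) in enumerate(zip(reference, key)):
        if ref_bit != key_bit:
            return position
    return -1


def classify_by_difference_bit(reference: str, keys: list[str]) -> dict[int, list[str]]:
    """Group keys by their difference bit position against the reference key."""
    groups = defaultdict(list)
    for key in keys:
        groups[difference_bit_position(reference, key)].append(key)
    return dict(groups)
```

    With reference key "0110", the keys "0100" and "0101" both first differ at position 2 and so fall into the same group, while "1110" differs at position 0 and forms its own group.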
  • Publication number: 20110258197
    Abstract: Content leaving a local network can be captured and indexed so that queries can be performed on the captured data. In one embodiment, the present invention comprises an apparatus that connects to a network. In one embodiment, this apparatus includes a network interface module to connect the apparatus to a network, a packet capture module to intercept packets being transmitted on the network, an object assembly module to reconstruct objects being transmitted on the network from the intercepted packets, an object classification module to determine the content in the reconstructed objects, and an object store module to store the objects. This apparatus can also have a user interface to enable a user to search objects stored in the object store module.
    Type: Application
    Filed: June 24, 2011
    Publication date: October 20, 2011
    Inventors: Erik de la Iglesia, Rick Lowe, Ratinder Paul Singh Ahuja, William Deninger, Samuel King, Ashish Khasgiwala, Donald J. Massaro
  • Publication number: 20110258196
    Abstract: A method of content recommendation includes: generating a first digital mathematical representation of contents to associate the contents with a first plurality of words describing the contents; generating a second digital mathematical representation of text documents different from the contents to associate the documents with a second plurality of words; processing the first and second pluralities of words to determine a common plurality of words; processing the first and second digital mathematical representations to generate a common digital mathematical representation of the contents and the text documents based on the common plurality of words; and providing content recommendation by processing the common digital mathematical representation.
    Type: Application
    Filed: December 30, 2008
    Publication date: October 20, 2011
    Inventors: Skjalg Lepsoy, Gianluca Francini, Fabrizio Antonelli
  • Publication number: 20110251878
    Abstract: A system for processing data includes a first data pipeline. The first data pipeline includes a processor to process a first set of data stored in a tangible memory. The system also includes a second data pipeline to process a second set of data. A mapping processor matches the first set of data to the second set of data to produce a third set of data.
    Type: Application
    Filed: April 13, 2010
    Publication date: October 13, 2011
    Applicant: Yahoo! Inc.
    Inventors: Senthil Subramanian, Prashant Baronia
  • Publication number: 20110252007
    Abstract: A method of storing data in a storage media includes compressing raw data based on a physical storage unit of the storage media and storing the compressed data in the storage media. The physical storage unit of the storage media storing the compressed data includes an update region into which update data may be written.
    Type: Application
    Filed: April 8, 2011
    Publication date: October 13, 2011
    Applicant: Samsung Electronics Co., Ltd.
    Inventors: Kyoung Lae CHO, Bumseok Yu, Junjin Kong, Hee Chang Cho, Seongsik Hwang
  • Publication number: 20110252008
    Abstract: Methods and apparatuses for processing data are disclosed, including methods and apparatuses that leverage a reconfigurable logic device to offload decompression and search operations from a processor to thereby enable high speed data searches within data that has been stored in a compressed format.
    Type: Application
    Filed: June 21, 2011
    Publication date: October 13, 2011
    Inventors: Roger D. Chamberlain, Benjamin M. Brink, Jason R. White, Mark A. Franklin, Ron K. Cytron
  • Publication number: 20110246439
    Abstract: A query is annotated with a small sketch (e.g. a Bloom filter) that approximates a set of interest that is related to the query. The query and sketch may be forwarded to index servers that each stores a portion of a search engine corpus. Each of the index servers may filter documents using the sketch before returning results for aggregation. The sketch is designed so there may be false positives (results returned by authors not in the set), but no false negatives (all relevant results are returned). The final aggregated results set may be checked against the full set to remove false positives before returning the final results to the user.
    Type: Application
    Filed: April 6, 2010
    Publication date: October 6, 2011
    Applicant: Microsoft Corporation
    Inventors: Michael A. Isard, Marc A. Najork, Sean A. Suchter, Eric R. Scheel
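    The flow in this abstract (sketch the set of interest, filter on each index server, then verify against the full set) can be illustrated with a small Bloom filter in Python. This is a sketch under assumptions, not the applicants' system: the filter size, the SHA-256-based hashing, the document layout with an "author" field, and the function names are all hypothetical.

```python
import hashlib


class BloomFilter:
    """Small Bloom filter: may report false positives, never false negatives."""

    def __init__(self, num_bits: int = 256, num_hashes: int = 3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = 0  # bit array stored in a single integer

    def _positions(self, item: str):
        # Derive several bit positions by salting one hash function (assumed scheme).
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(f"{seed}:{item}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, item: str) -> None:
        for pos in self._positions(item):
            self.bits |= 1 << pos

    def __contains__(self, item: str) -> bool:
        return all((self.bits >> pos) & 1 for pos in self._positions(item))


def index_server_filter(docs, sketch):
    """Run on each index server: keep documents whose author passes the sketch."""
    return [doc for doc in docs if doc["author"] in sketch]


def aggregate(partial_results, full_set):
    """Merge per-server results and drop false positives using the full set."""
    return [doc for part in partial_results for doc in part if doc["author"] in full_set]
```

    The sketch travels with the query because it is far smaller than the set it approximates; the exact check against the full set happens only once, on the already reduced aggregate.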
  • Publication number: 20110246472
    Abstract: Methods and systems for facilitating distribution of application functionality across a multi-tier client-server architecture are provided. According to one embodiment, a method is provided for instantiating a DataMap. A data store interface reads a set of definitions and instructions from a datastore that describe the structure of the DataMap. The data store interface interprets the set of definitions and instructions to instantiate the DataMap. According to another embodiment, a method is provided for indexing into a DataMap. A data store interface receives an expression. The data store interface parses the expression to identify a set of keys suitable for indexing into the DataMap and corresponding DataPoints.
    Type: Application
    Filed: July 9, 2010
    Publication date: October 6, 2011
    Inventor: David M. Dillon
  • Publication number: 20110246432
    Abstract: Embodiments of the present invention provide one or more hardware-friendly data structures that enable efficient hardware acceleration of database operations. In particular, the present invention employs a column-store format for the database. In the database, column-groups are stored with implicit row ids (RIDs) and a RID-to-primary key column having both column-store and row-store benefits via column hopping and a heap structure for adding new data. Fixed-width column compression allows for easy hardware database processing directly on the compressed data. A global database virtual address space is utilized that allows for arithmetic derivation of any physical address of the data regardless of its location. A word compression dictionary with token compare and sort index is also provided to allow for efficient hardware-based searching of text. A tuple reconstruction process is provided as well that allows hardware to reconstruct a row by stitching together data from multiple column groups.
    Type: Application
    Filed: May 13, 2011
    Publication date: October 6, 2011
    Applicant: TERADATA US, INC.
    Inventors: Liuxi Yang, Kapil Surlaker, Ravi Krishnamurthy, Michael Corwin, Jeremy Branscome, Krishnan Meiyyappan, Joseph I. Chamdani