Indexing (epo) Patents (Class 707/E17.083)
  • Publication number: 20120296903
    Abstract: Systems and methods for eliminating duplicate events are described. In one embodiment, an event is captured, wherein the event comprises a user interaction with an article on a client device, and it is determined whether the event is a duplicate of a stored event.
    Type: Application
    Filed: March 16, 2012
    Publication date: November 22, 2012
    Applicant: GOOGLE INC.
    Inventors: Omar Habib Khan, Stephen R. Lawrence
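The duplicate check described in this abstract can be illustrated with a minimal sketch. The abstract does not say how duplicates are recognized, so the event fields (`user`, `article`, `action`, `time`), the fingerprinting scheme, and the time window below are all assumptions made for illustration.

```python
import hashlib


def event_fingerprint(event):
    """Hash the fields assumed to identify an event (hypothetical schema)."""
    key = "|".join([event["user"], event["article"], event["action"]])
    return hashlib.sha256(key.encode("utf-8")).hexdigest()


class EventStore:
    """Stores captured events and rejects near-simultaneous repeats."""

    def __init__(self, window_seconds=5.0):
        self.window_seconds = window_seconds  # assumed duplicate window
        self._last_seen = {}  # fingerprint -> time of the last stored event
        self.events = []

    def capture(self, event):
        """Store the event unless it duplicates a stored one.

        Returns True if the event was stored, False if it was dropped.
        """
        fingerprint = event_fingerprint(event)
        last = self._last_seen.get(fingerprint)
        if last is not None and event["time"] - last <= self.window_seconds:
            return False
        self._last_seen[fingerprint] = event["time"]
        self.events.append(event)
        return True
```

A caller would pass each captured interaction to `capture` and index only those for which it returns `True`.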
  • Publication number: 20120278302
    Abstract: The multilingual search for transliterated content technique described herein enables a user to submit a search query in both a native script and its foreign script (e.g., Roman script) transliteration and return relevant results in both the scripts while taking care of the spelling variations in transliterated forms. The technique crawls the World Wide Web for data in both the native script and foreign script transliterated forms of the data. It uses a transliteration engine to generate native script equivalents of the foreign script transliterated data and disambiguates the data in native script (whenever possible). The unique native script word forms are then used to jointly index the data in both the scripts. If the query is in native script, it is directly searched for in the index, otherwise the transliterated query is first converted into native script form(s) and then searched in the indexed database to retrieve and rank results in both the scripts.
    Type: Application
    Filed: April 29, 2011
    Publication date: November 1, 2012
    Applicant: MICROSOFT CORPORATION
    Inventors: Monojit Choudhury, Kalika Bali, Kanika Gupta, Narendranath Datha
  • Publication number: 20120265749
    Abstract: High-precision local search is performed on the Internet. A map image-rendering software provider embeds spatial keys into maps, which are then provided to producers of Internet content such as map providers. For example, a homeowner may post a message on a web bulletin board advertising his house for sale, and including a map showing the location of the house. When a search engine's web crawler encounters a page having a spatial key embedded in an image, the spatial key is indexed with the other content on the page. Because the spatial key identifies a small geographic area, indexing the content with the spatial key allows search queries to be limited by area and still provide useful results. Thus, a user of a search engine searching for “house for sale” in a specific area will be directed to web pages that meet the geographic and content search terms.
    Type: Application
    Filed: June 25, 2012
    Publication date: October 18, 2012
    Applicant: DECARTA INC.
    Inventors: Geoffrey R. Hendrey, Richard F. Poppen
  • Publication number: 20120265763
    Abstract: A computer system configures data elements based on textual sources by identifying subunits of a textual source, indexing the subunits into a sequence comprised of terms, identifying, based on a target, a base subsequence of the sequence, and storing the terms in such a way that they can be expanded or contracted, so that a user can rapidly and efficiently derive relevant information and context even from a vast amount of information, including by navigable display to the user. Other methods and systems of configuring and displaying data elements from textual sources are provided.
    Type: Application
    Filed: June 20, 2012
    Publication date: October 18, 2012
    Inventor: Efrem Meretab
  • Publication number: 20120259863
    Abstract: Data versioning in a non-volatile memory. An object key associated with a data object is created. An index into an object table is generated using the object key. A version number is stored in conjunction with the data object stored in the non-volatile memory. In an object linked-list, the object key and the location information of the data object in the non-volatile memory are stored. A record associated with the data object is created in an object table. The record includes an index, a reference to the object linked-list, and the version number. The index is generated based on the object key.
    Type: Application
    Filed: April 11, 2011
    Publication date: October 11, 2012
    Inventors: James M. Bodwin, Darpan Dinker, Andrew D. Eckhardt, Darryl M. Ouye
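The object table, linked list, and version number described in this abstract might fit together roughly as sketched below. A Python list stands in for the non-volatile memory, a plain list stands in for the object linked-list, and the CRC-based index function and the class name `VersionedStore` are assumptions rather than details from the application.

```python
import zlib


class VersionedStore:
    """Toy object table keyed by an index derived from the object key."""

    def __init__(self, table_size=16):
        self.table_size = table_size
        self.nvm = []    # stand-in for non-volatile memory: (version, data)
        self.table = {}  # index -> record with index, chain, and version

    def _index(self, key):
        # Assumed index function: CRC32 of the key, reduced to the table size.
        return zlib.crc32(key.encode("utf-8")) % self.table_size

    def put(self, key, data):
        """Write a new version of the object and return its version number."""
        idx = self._index(key)
        record = self.table.setdefault(
            idx, {"index": idx, "chain": [], "version": 0}
        )
        record["version"] += 1
        location = len(self.nvm)
        # The version number is stored together with the data object.
        self.nvm.append((record["version"], data))
        # The chain holds (object key, location) pairs for this index.
        for i, (existing_key, _) in enumerate(record["chain"]):
            if existing_key == key:
                record["chain"][i] = (key, location)
                break
        else:
            record["chain"].append((key, location))
        return record["version"]

    def get(self, key):
        """Return (version, data) for the latest write of key, or None."""
        record = self.table.get(self._index(key))
        if record is None:
            return None
        for existing_key, location in record["chain"]:
            if existing_key == key:
                return self.nvm[location]
        return None
```

In this sketch older versions stay in `nvm` and only the chain entry moves, which is one simple way to keep writes append-only.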
  • Publication number: 20120254162
    Abstract: Techniques and tools are described for refining source-code query results. For example, source-code query results for a query can be generated, semantic clusters of the source-code query results can be generated, and based on a selection of a semantic cluster option, refined source-code query results can be sent. Also, for example, source-code query results can be received, selections of facet values associated with groups of the source-code query results can be sent, and based on selected facet values, a subset of the source-code query results can be received.
    Type: Application
    Filed: May 19, 2011
    Publication date: October 4, 2012
    Applicant: Infosys Technologies Ltd.
    Inventors: Allahbaksh Mohammedali Asadullah, Susan George, Basava Raju Muddu
  • Publication number: 20120246168
    Abstract: A computer-based system and method for intelligent resume search on online repositories is disclosed. The parameters in the resumes and the attributes related to the said parameters are identified and extracted by scanning the resumes sequentially and are stored in an index file. Search queries are constructed based on accepted query parts as input. The index file is indexed to locate the parameters relevant to the search queries. An initial score is assigned to the parameters located, which is transformed to a new score based on identifying additional domain intelligence in the derived attributes related to the located parameters. Finally, the resumes relevant to the parameters with the transformed score are retrieved and displayed.
    Type: Application
    Filed: February 10, 2012
    Publication date: September 27, 2012
    Applicant: TATA CONSULTANCY SERVICES LIMITED
    Inventors: Rajiv Radheyshyam Srivastava, Girish Keshav Palshikar
  • Publication number: 20120246143
    Abstract: A method of responding to a request for a web page includes the steps of receiving a request, extracting search query parameters from the request, and redirecting to a mapped web page or alternatively responding with the requested web page. The search query parameters and requested web page are associated with each other and stored for later processing and assignment.
    Type: Application
    Filed: March 26, 2012
    Publication date: September 27, 2012
    Inventor: Lee Roberts
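The request handling in this abstract can be sketched as a lookup from extracted search terms to a mapped page. The `q` parameter name, the `redirect_map` structure, and the `log` list are hypothetical; the abstract does not say where the search parameters appear in the request.

```python
from urllib.parse import parse_qs, urlparse


def respond(url, redirect_map, log):
    """Decide whether to redirect or serve the requested page.

    redirect_map maps a tuple of search terms to a target path (assumed shape).
    log collects (search terms, requested path) pairs for later processing.
    """
    parsed = urlparse(url)
    params = parse_qs(parsed.query)
    terms = tuple(params.get("q", []))  # assumed search parameter name
    log.append((terms, parsed.path))
    target = redirect_map.get(terms)
    if target is not None:
        return ("redirect", target)
    return ("serve", parsed.path)
```

Pairs accumulated in `log` could later be reviewed to assign new mappings, which corresponds to the "later processing and assignment" step.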
  • Publication number: 20120226698
    Abstract: A system and method for providing an information repository that optimizes profiles of sensory characteristics of food or drink products. The system receives user preferences or search criteria of similar sensory characteristics to match against food or drink products in a database with a very high degree of certainty or accuracy. The system and method also provide personalization to users, i.e. personal recommendations based on personal preferences, as well as product matching processes.
    Type: Application
    Filed: March 3, 2011
    Publication date: September 6, 2012
    Inventors: Olivier Silvestre, Pierre Huguet
  • Publication number: 20120221577
    Abstract: Embodiments of the invention relate to organizing data records in a relational database. An aspect of the invention includes creating index items for a plurality of data records. Each index item includes a counter and the creating results in a plurality of counters. The numerical values of counters in corresponding index items are updated for data records in the plurality of data records that are subjected to random access. The plurality of data records are reorganized based upon the numerical values of the plurality of counters.
    Type: Application
    Filed: February 3, 2012
    Publication date: August 30, 2012
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: You-Chin Fuh, Ke Wei Wei, Xin Ying Yang, Jian Wei Zhang, Jing Zhou, Xiang Zhou
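A minimal sketch of counter-driven reorganization follows. It assumes one counter per record and a reorganization policy that simply places the most frequently accessed records first; the abstract does not specify the actual ordering policy.

```python
class CountedTable:
    """Records with one access counter per index item."""

    def __init__(self, records):
        self.records = list(records)
        self.counters = [0] * len(self.records)

    def random_access(self, position):
        """Fetch a record by position and bump its counter."""
        self.counters[position] += 1
        return self.records[position]

    def reorganize(self):
        """Reorder records so the most frequently accessed come first."""
        order = sorted(
            range(len(self.records)), key=lambda i: -self.counters[i]
        )
        self.records = [self.records[i] for i in order]
        self.counters = [self.counters[i] for i in order]
```

Because `sorted` is stable, records with equal counters keep their original relative order.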
  • Publication number: 20120215785
    Abstract: An indexing system for graph data. In particular implementations, the indexing system provides for denormalization and replica index functionality to improve query performance.
    Type: Application
    Filed: September 8, 2011
    Publication date: August 23, 2012
    Inventors: Sanjeev Singh, Bret Steven Taylor, Paul Buchheit, James Norris, Tudor Bosman, Benjamin Darnell
  • Publication number: 20120215787
    Abstract: A method and system for analyzing data records includes allocating groups of records to respective processes of a first plurality of processes executing in parallel. In each respective process of the first plurality of processes, for each record in the group of records allocated to the respective process, a query is applied to the record so as to produce zero or more values. Zero or more emit operators are applied to each of the zero or more produced values so as to add corresponding information to an intermediate data structure. Information from a plurality of the intermediate data structures is aggregated to produce output data.
    Type: Application
    Filed: February 28, 2012
    Publication date: August 23, 2012
    Inventors: Robert C. Pike, Sean Quinlan, Sean M. Dorward, Jeffrey Dean, Sanjay Ghemawat
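The query, emit, and aggregate flow in this abstract can be sketched as below. Threads stand in for the parallel processes, a `Counter` stands in for the intermediate data structure, and counting is the only emit operator shown; all three choices are assumptions made to keep the example small.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def process_group(records, query):
    """Apply the query to each record and emit values into a local table."""
    table = Counter()  # intermediate data structure for this worker
    for record in records:
        for value in query(record):  # zero or more values per record
            table[value] += 1        # "emit" adds information to the table
    return table


def analyze(records, query, workers=2):
    """Allocate records to workers, then aggregate the intermediate tables."""
    groups = [records[i::workers] for i in range(workers)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = list(pool.map(lambda g: process_group(g, query), groups))
    total = Counter()
    for partial in partials:
        total.update(partial)  # Counter.update sums counts
    return total
```

For example, `analyze(lines, str.split)` would produce a word count over `lines`.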
  • Publication number: 20120197900
    Abstract: A system and method for searching a time tree index for a database table, where the index uses time representations. A request for data is received, the request comprising a search value. A search date value is derived. The search date value comprises at least one time unit selected in order from a largest time unit to a smallest time unit from the list: century, year, month, date, hour, minute, second and millisecond. A time tree index is searched for at least one node, such that the index path to the node comprises the search date. At least one data record associated with the node is retrieved.
    Type: Application
    Filed: February 10, 2011
    Publication date: August 2, 2012
    Applicant: Unisys Corporation
    Inventor: Sateesh Mandre
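A time tree of the kind described here can be approximated with nested dictionaries, one level per time unit from century down to millisecond. The node layout and the function names are illustrative assumptions; a search path may stop at any unit and returns every record beneath the matching node.

```python
from datetime import datetime


def time_path(dt):
    """Split a datetime into units ordered from largest to smallest."""
    return [
        dt.year // 100, dt.year % 100, dt.month, dt.day,
        dt.hour, dt.minute, dt.second, dt.microsecond // 1000,
    ]


class TimeTreeIndex:
    def __init__(self):
        self.root = {"children": {}, "records": []}

    def insert(self, dt, record):
        """Walk or create one node per time unit and attach the record."""
        node = self.root
        for part in time_path(dt):
            node = node["children"].setdefault(
                part, {"children": {}, "records": []}
            )
        node["records"].append(record)

    def search(self, path):
        """Return all records whose index path starts with the given path."""
        node = self.root
        for part in path:
            node = node["children"].get(part)
            if node is None:
                return []
        found, stack = [], [node]
        while stack:
            current = stack.pop()
            found.extend(current["records"])
            stack.extend(current["children"].values())
        return found
```

Here `search([20, 11, 2])` would return everything indexed under February 2011, since century 20 and year-of-century 11 together denote 2011.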
  • Publication number: 20120185486
    Abstract: Methods and systems for indexing, storing, recalling and displaying social network user profiles, event calendar postings and user feed postings are described. A single, discrete set of keywords can be utilized and assigned to both user profiles and postings and can operate as a method of indexing. The assignment of these keywords may allow users to control the display of calendar and feed content by matching assigned posting keywords to assigned profile keywords via a matching algorithm. Matched event-related postings may also be automatically displayed in a user's calendar. Searches of profiles and postings may also be performed by constructing queries using the same set of discrete keywords attached to profiles and postings. Users may have the ability to control the display of specific profile information and postings via privacy settings, which utilize unique methods of tracking relationship segmentation and social distance.
    Type: Application
    Filed: January 20, 2012
    Publication date: July 19, 2012
    Inventors: Matthew Voigt, Michael Petanovitch
  • Publication number: 20120179664
    Abstract: Systems and methods for processing media files are described. In one embodiment, one or more events are captured having associated event data and associated with a client device, wherein each event is associated with an article and at least one of the articles is a media file, wherein at least one of the events is captured in real time upon the occurrence of the event, at least some of the event data and articles associated with the events are indexed and stored, a search query is received, and the at least one media file is determined as relevant to the search query.
    Type: Application
    Filed: December 12, 2011
    Publication date: July 12, 2012
    Applicant: GOOGLE INC.
    Inventors: David Benjamin Auerbach, Stephen R. Lawrence, David Marmaros
  • Publication number: 20120173531
    Abstract: Systems and methods for managing electronic data are disclosed. Various data management operations can be performed based on a metabase formed from metadata. Such metadata can be identified from an index of data interactions generated by a journaling module, and obtained from their associated data objects stored in one or more storage devices. In various embodiments, such processing of the index and storing of the metadata can facilitate, for example, enhanced data management operations, enhanced data identification operations, enhanced storage operations, data classification for organizing and storing the metadata, cataloging of metadata for the stored metadata, and/or user interfaces for managing data. In various embodiments, the metabase can be configured in different ways. For example, the metabase can be stored separately from the data objects so as to allow obtaining of information about the data objects without accessing the data objects or a data structure used by a file system.
    Type: Application
    Filed: March 2, 2012
    Publication date: July 5, 2012
    Applicant: COMMVAULT SYSTEMS, INC.
    Inventors: Anand Prahlad, Jeremy Alan Schwartz, David Ngo, Brian Brockway, Marcus S. Muller
  • Publication number: 20120173541
    Abstract: A distributed caching system for storing and serving information modeled as a graph that includes nodes and edges that define associations or relationships between nodes that the edges connect in the graph.
    Type: Application
    Filed: September 7, 2011
    Publication date: July 5, 2012
    Inventor: Venkateshwaran Venkataramani
  • Publication number: 20120166443
    Abstract: Tables are created in such a way that allows rich querying using standard database routines and other tools. Developers and repository users are provided with a set of schema guidelines that describe how the software-related items are to be categorized in the tables and how to use such tables for rich querying. For example, one such guideline provides for coarse-grained versioning of items (e.g., artifacts, metadata, etc.), as opposed to the fine-grained object principle of unit change found in most repository systems such as the entity-property-value scheme. The developers or providers then use these guidelines to optimally categorize, in a natural way, their metadata and other software-related items for storing copies thereof in the repository.
    Type: Application
    Filed: March 7, 2012
    Publication date: June 28, 2012
    Applicant: Microsoft Corporation
    Inventors: Anthony C. Bloesch, Dennis W. Minium, Keith W. Short
  • Publication number: 20120158735
    Abstract: The embodiments disclosed herein include new, more efficient ways to collect product reviews from the Internet, aggregate reviews for the same product, and provide an aggregated review to end users in a searchable format. One aspect of the invention is a graphical user interface on a computer that includes a plurality of portions of reviews for a product and a search input area for entering search terms to search for reviews of the product that contain the search terms.
    Type: Application
    Filed: February 28, 2012
    Publication date: June 21, 2012
    Inventors: Jan Matthias Ruhl, Mayur D. Datar
  • Publication number: 20120158749
    Abstract: A method comprises identifying a first user having stored in a database a set of first bookmarks associated with a topic of interest; determining a level of relatedness of a second user to the first user by comparing a first number of overlapping bookmarks that were stored in the database by the second user and that overlap the set of first bookmarks; determining a level of value of the second user to the first user by comparing a second number of related nonoverlapping bookmarks that were stored in the database by the second user, that relate to the topic of interest, and that do not overlap the set of first bookmarks; and presenting at least a portion of the related nonoverlapping bookmarks to the first user.
    Type: Application
    Filed: December 29, 2011
    Publication date: June 21, 2012
    Inventor: Joshua Schachter
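The two measures in this abstract reduce to set operations, as the sketch below shows. Treating bookmarks as URL strings and representing the topic of interest as a set of URLs (`topic_urls`) are assumptions; the abstract does not say how topical relevance is decided.

```python
def score_user(first_bookmarks, second_bookmarks, topic_urls):
    """Compare a second user's bookmarks against a first user's set.

    Returns (relatedness, value, suggestions):
      relatedness - number of bookmarks both users stored
      value       - number of on-topic bookmarks only the second user stored
      suggestions - those on-topic, nonoverlapping bookmarks, sorted
    """
    first = set(first_bookmarks)
    second = set(second_bookmarks)
    overlapping = first & second
    related_new = (second - first) & set(topic_urls)
    return len(overlapping), len(related_new), sorted(related_new)
```

The `suggestions` list corresponds to the bookmarks that would be presented to the first user.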
  • Publication number: 20120150868
    Abstract: A method, system, and apparatus are directed to providing information over a network. A search query may be received. If the search query includes at least one keyword matching a pattern associated with a specific search engine, a plurality of search results is retrieved from the specific search engine based on the keyword, and the plurality of search results is provided. A plurality of potential search or source engines may be determined based on a current time information. At least one of the potential search or source engines may be a personalized source engine. At least one plurality of results may be retrieved from at least one of the potential search or source engines. The result may be based on the search query. An aggregated result may be determined based on a time rule and/or the current time information. The aggregated result may comprise the plurality of results.
    Type: Application
    Filed: February 21, 2012
    Publication date: June 14, 2012
    Applicant: YAHOO! INC.
    Inventors: Farzin Maghoul, Shiv Ramamurthi
  • Publication number: 20120143876
    Abstract: Consistent with embodiments of the present invention, a method may be provided comprising receiving a search string corresponding to a desired node comprising a target parameter, a policy parameter, and a class parameter. The target parameter may be referenced with a target index table to determine which interfaces to search. The policy parameter may be referenced with a policy index table to determine a node-id of a policy node corresponding to the policy parameter. A level for the desired node may be determined based on the node-id. The class parameter may be referenced with the determined node-id with a class index table to access a bucket location. The desired node may then be searched for with the determined node-id at the determined level.
    Type: Application
    Filed: December 1, 2010
    Publication date: June 7, 2012
    Applicant: Cisco Technology, Inc.
    Inventors: Vijay Srinivasan, Arun Srinivasan, Jay Shah, Aijaz Pathan, Yen Teresa Nguyen
  • Publication number: 20120136846
    Abstract: Example embodiments are directed to methods of hashing for networks and systems thereof. At least one example embodiment provides a method of processing elements in a system. The method includes receiving a first element, generating a first plurality of hash values based on the first element and a first plurality of hash functions, determining a first plurality of buckets in a table based on the first plurality of hash values, each of the first plurality of buckets associated with a different one of the hash values, selecting one of the first plurality of buckets, storing a first associated value in the selected bucket, the first associated value being associated with the first element, and encoding an identifier (ID) of the hash function generating the hash value associated with the selected bucket into a filter based on the hash value.
    Type: Application
    Filed: November 30, 2010
    Publication date: May 31, 2012
    Inventors: Haoyu Song, Murali Kodialam, Fang Hao, T.V. Lakshman
  • Publication number: 20120124052
    Abstract: A phenomenological framework of the human perception of time identifies the Future, Past, and Present perspectives of mind. The framework is visualized as a triangle, with Future, Past, and Present mental constructs serving as an anchor at each of the three corners. The triangular plane between them represents a continuum of relative intensity for each of the constructs. Each blend of intensity in the three constructs itself corresponds to a unique set of the fundamental values and behavioral characteristics that are driven by the mental characteristics. Within this temporal framework, there is a contextual assignment of an external descriptive and informative quality to any person, group, or object—based on relative and/or absolute intensities of each mental construct—that can be used with a high level of confidence in any number of ways to interact better with the object.
    Type: Application
    Filed: May 7, 2011
    Publication date: May 17, 2012
    Applicant: The ClogWorks, Inc.
    Inventors: John Terence Furey, Shawn Francis Phillips, Vincent James Fortunato, III, Irwin Francis Sentilles, IV
  • Publication number: 20120124053
    Abstract: A fact repository contains facts having attributes and values and further having associated annotations, which are used, among other things, to vet facts in the repository and which can be returned in response to a query.
    Type: Application
    Filed: November 8, 2011
    Publication date: May 17, 2012
    Inventors: Tom Ritchford, Jonathan Betz
  • Publication number: 20120124061
    Abstract: An application search system may maintain an index of applications available from multiple different application stores, and includes parameters, such as features and/or content of the applications. When a user submits a query, the system may derive contextual information pertaining to a user device used to submit the query, applications installed on a particular user device and/or usage information for installed applications. The system then may, in one example, determine one or more applications relevant to the search query and, depending on the contextual information derived, may provide an entry point to access a particular application at a task level, may prompt the user to install the application, or may provide a web result related to the particular application.
    Type: Application
    Filed: November 12, 2010
    Publication date: May 17, 2012
    Applicant: Microsoft Corporation
    Inventors: Steven William Macbeth, Steven Charles Tullis, Zhaowei (Charlie) Jiang, Eric P. Gilmore, Paul A. Viola
  • Publication number: 20120117077
    Abstract: A fact repository contains facts having attributes and values and further having associated annotations, which are used, among other things, to vet facts in the repository and which can be returned in response to a query.
    Type: Application
    Filed: November 8, 2011
    Publication date: May 10, 2012
    Inventors: Tom Ritchford, Jonathan Betz
  • Publication number: 20120109967
    Abstract: According to one aspect of the invention, in response to one or more terms to be indexed, each of the terms is indexed in a regular index. In addition, for each of the terms having multiple characters, at least one prefix portion of the term is indexed in a prefix index, where the regular index is used for regular searches and the prefix index is used for prefix searches without having to combine a plurality of postings lists of the regular index at the point in time.
    Type: Application
    Filed: October 27, 2010
    Publication date: May 3, 2012
    Applicant: APPLE INC.
    Inventors: John M. Hörnkvist, Eric R. Koebler
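The split between a regular index and a prefix index can be sketched as follows. The `max_prefix` cap and the fallback scan for longer prefixes are assumptions added to keep the example bounded, and for simplicity the sketch also records single-character terms in the prefix index, which the abstract does not require.

```python
from collections import defaultdict


class TermIndex:
    """Regular index for exact terms plus a separate index of prefixes."""

    def __init__(self, max_prefix=3):
        self.max_prefix = max_prefix       # assumed cap on indexed prefix length
        self.regular = defaultdict(set)    # term -> document ids
        self.prefix = defaultdict(set)     # prefix -> document ids

    def add(self, doc_id, terms):
        for term in terms:
            self.regular[term].add(doc_id)
            for n in range(1, min(len(term), self.max_prefix) + 1):
                self.prefix[term[:n]].add(doc_id)

    def search(self, term):
        """Exact-term lookup in the regular index."""
        return set(self.regular.get(term, set()))

    def prefix_search(self, prefix):
        """Prefix lookup; short prefixes need a single postings list."""
        if len(prefix) <= self.max_prefix:
            return set(self.prefix.get(prefix, set()))
        # Fallback for long prefixes: combine regular postings lists.
        hits = set()
        for term, docs in self.regular.items():
            if term.startswith(prefix):
                hits |= docs
        return hits
```

The fallback branch shows the cost the prefix index avoids: without it, every matching term's postings list has to be merged at query time.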
  • Publication number: 20120095997
    Abstract: Systems, methods, and computer storage media having computer-executable instructions embodied thereon that provide contextual indicators associated with a user session are described. Content items within a document associated with a user session are selected. Upon receiving an indication that the user desires to perform a context-aware search, the document associated with the user session is analyzed for contextual information related to the content items selected by the user. Various “contextual indicators” associated with the user session are derived. The contextual indicators are provided for output in association with the user session. The contextual indicators may be fed to a search engine and used to identify search results that the user has an increased likelihood (relative to the current context surrounding the user) of desiring to access.
    Type: Application
    Filed: October 18, 2010
    Publication date: April 19, 2012
    Applicant: MICROSOFT CORPORATION
    Inventors: Nir Nice, Uri Barash, Sefy Ophir, Eran Shamir, Ron Karidi, Hadar Shemtov, Anna Timasheva
  • Publication number: 20120095982
    Abstract: One preferred embodiment of the present invention includes a method of automatically responding to a search query. The method of the preferred embodiment can include steps performed at or by a database, including electronically receiving a query digital media object from a first computer and electronically generating a query index identification of the query digital media object wherein the query index identification includes a query keyword relating to the query digital media object. The method of the preferred embodiment can also include searching the database for an index identification of a digital media object including a keyword relating to the digital media object; and in response to a predetermined level of similarity between the query keyword and the keyword, electronically returning the digital media object in response to the query.
    Type: Application
    Filed: September 12, 2011
    Publication date: April 19, 2012
    Inventors: John W. Lennington, Thomas Voiles, Stanley Sternberg, William Dargel
  • Publication number: 20120089592
    Abstract: Automatic text skimming using lexical chains may be provided. First, at least one lexical chain may be created from an electronic document. Next, a list of positions within the electronic document may be created. The positions may include where at least one concept represented by one of the at least one lexical chain is mentioned. In addition, a list of the positions where the at least one concept is mentioned may be assembled. A selection of at least one concept may be received from the list.
    Type: Application
    Filed: December 16, 2011
    Publication date: April 12, 2012
    Inventor: William A. Hollingsworth
  • Publication number: 20120084293
    Abstract: A method, system and computer program product for generating answers to questions. In one embodiment, the method comprises receiving an input query, identifying a plurality of candidate answers to the query, and, for at least one of these candidate answers, identifying at least one proof of the answer. This proof includes a series of premises, and a multitude of documents are identified that include references to the premises. A set of these documents is selected that includes references to all of the premises. This set of documents is used to generate one or more scores for the one of the candidate answers. A defined procedure is applied to the candidate answers to determine a ranking for the answers, and this includes using the one or more scores for the at least one of the candidate answers in the defined procedure to determine the ranking for this one candidate answer.
    Type: Application
    Filed: September 24, 2011
    Publication date: April 5, 2012
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Eric W. Brown, Jennifer Chu-Carroll, David A. Ferrucci, James W. Murdock, IV
  • Publication number: 20120078845
    Abstract: System and method for extracting, retrieving and managing data in a computer or network of computers through an enhancement of the power of the directory management system and email management system by enabling users to superimpose a hierarchy of descriptors on top of the system, to share, import and export the hierarchy of descriptors between computers with controlled access for data objects. The method and system are defined particularly for selecting individual references from search engine results and saving them along with descriptors. The method and system automatically generate reports of work done in the computer or network of computers, including creation, modification, copying, moving and deletion of files and folders. The method and system reduce the clutter of information while ensuring that the system is automatically backed up in different modes and with complete flexibility to back up.
    Type: Application
    Filed: June 2, 2010
    Publication date: March 29, 2012
    Inventors: Kiron Kasbekar, Ghulam Mustafa
  • Publication number: 20120078910
    Abstract: Methods which use an ID domain to improve searching are described. An embodiment describes an index phase in which an image of a document is converted into the ID domain. This is achieved by dividing the text in the image into elements and mapping each element to an identifier. Similar elements are mapped to the same identifier. Each element in the text is then replaced by the appropriate identifier to create a version of the document in the ID domain. This version may be indexed and searched. Another embodiment describes a query phase in which a query is converted into the ID domain and then used to search an index of identifiers which has been created from collections of documents which have been converted into the ID domain. The conversion of the query may use mappings which were created during the index phase or alternatively may use pre-existing mappings.
    Type: Application
    Filed: December 8, 2011
    Publication date: March 29, 2012
    Applicant: Microsoft Corporation
    Inventors: Walid Magdy, Motaz El-Saban
  • Publication number: 20120078906
    Abstract: A robust knowledge-based management and sharing system organized by context for expertise-based or context-based searching and retrieval of relevant information is disclosed. The various embodiments and techniques described herein are used to organize a user's data and communications around the user's expertise or one or more contexts the user is associated with such as the user's projects, products, and customers. The organization of user data is derived from the user's competencies and interactions with others and is used to build and index user profiles in a manner that facilitates retrieval in search results for relevant search criteria. A linguistic processing pipeline is used to parse and index the user's data to generate the complete and partial profiles organized by context. Complete and partial profiles are generated, indexed, ranked, and stored by the system.
    Type: Application
    Filed: August 3, 2011
    Publication date: March 29, 2012
    Inventors: Pankaj Anand, Maxim Lukichev, Puneet Trehan, Sumit Vij, Nitin Arora
  • Publication number: 20120072426
    Abstract: A flexible and extensible architecture allows for secure searching across an enterprise. Such an architecture can provide a simple Internet-like search experience to users searching secure content inside (and outside) the enterprise. The architecture allows for the crawling and searching of a variety of sources across an enterprise, regardless of whether any of these sources conform to a conventional user role model. The architecture further allows for security attributes to be submitted at query time, for example, in order to provide real-time secure access to enterprise resources. The user query also can be transformed to provide for dynamic querying that provides for a more current result list than can be obtained for static queries.
    Type: Application
    Filed: August 19, 2011
    Publication date: March 22, 2012
    Applicant: Oracle International Corporation
    Inventors: Mark Ture, Muralidhar Krishnaprasad, Thomas Chang, Steve Chi-Ming Yang, Vishu Krishnamurthy
  • Publication number: 20120066227
    Abstract: A plurality of segments in an e-mail collection is generated by parsing content of the e-mails. A corresponding segment signature is created for each segment, and a signature index is populated using the generated segment signatures. After receiving a query e-mail, a plurality of query segments in the query e-mail is generated using content of the query e-mail, and a corresponding query segment signature is generated for each query segment. A query root segment is identified and a corresponding query root segment signature is generated. A set of root segment signatures of the signature index is identified and the query root segment signature is compared with each root segment signature from the signature index. A subset of the signature index is identified, using a match between the root segment signature and the query root segment signature. An e-mail thread hierarchy is built using the identified subset of the signature index.
    Type: Application
    Filed: September 10, 2010
    Publication date: March 15, 2012
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Danish Contractor, Manjula Golla Hosurmath, Sachindra Joshi, Kenney Ng
  • Publication number: 20120059828
    Abstract: Systems and methods for compressing indices are described. In one aspect, a plurality of items are selected where each item has an entry in an inverted index and each item entry comprises a listing of articles that the item appears in. At least a first item entry and a second item entry are determined for compression and the second item entry is compressed into the first item entry resulting in a compressed first item entry.
    Type: Application
    Filed: November 14, 2011
    Publication date: March 8, 2012
    Applicant: GOOGLE INC.
    Inventor: Adam J. Weissman
  • Publication number: 20120041957
    Abstract: Techniques for efficiently indexing and searching similar data are described herein. According to one embodiment, in response to a query for one or more terms received from a client, a query index is accessed to retrieve a list of one or more super files. Each super file is associated with a group of similar files. Each super file includes terms and/or sequences of terms obtained from the associated group of similar files. Thereafter, the super files representing groups of similar files are presented to the client, where each of the super files includes at least one of the queried terms. Other methods and apparatuses are also described.
    Type: Application
    Filed: October 24, 2011
    Publication date: February 16, 2012
    Inventors: Windsor W. Hsu, R. Hugo Patterson
  • Publication number: 20120030213
    Abstract: Systems and methods for managing data, such as metadata or non-metadata such as content. In one exemplary method, a composite document is received and it is determined whether the composite document contains at least one subdocument and if it does, the method captures metadata and/or content from the subdocument and stores the captured metadata and/or content for use in future searches (or an immediate search). The metadata and/or content from the composite document is typically combined together with information about the hierarchy of the subdocuments in the document. The type of information in metadata for one type of file differs from the type of information in metadata for another type of file. Other methods are described and data processing systems and machine readable media are also described.
    Type: Application
    Filed: October 12, 2011
    Publication date: February 2, 2012
    Inventors: Yan Arrouye, Dominic Giampaolo
  • Publication number: 20120016864
    Abstract: Methods, systems, and media are provided for an optimized search engine index. The optimized index is formed by merging small lower level indexes of fresh documents together into a hierarchical cluster of multiple higher level indexes. The optimized index of fresh documents is formed via a single threaded process, while a fresh index serving platform concurrently serves fresh queries. The hierarchy of higher level indexes is formed by merging lower and/or higher level indexes with similar expiration times together. Therefore, as some indexes expire, the remaining un-expired indexes can be re-used and merged with new incoming indexes. The single threaded process provides fast serving of fresh documents, while also providing time to integrate the fresh indexes into a long term primary search engine index, prior to expiring.
    Type: Application
    Filed: July 13, 2010
    Publication date: January 19, 2012
    Applicant: MICROSOFT CORPORATION
    Inventors: Jay Kumar Goyal, Neil Sharman, Vibhaakar Sharma, Vinay Sudhir Deshpande, Utkarsh Jain, Gaurav Sareen, Yinzhe Yu, Daniel Yuan
  • Publication number: 20120016884
    Abstract: A method and apparatus for detecting pre-selected data stored on a personal computing device is described. In one embodiment, contents of data storage media of a personal computing device are searched for pre-selected sensitive data. In one embodiment, if at least a portion of the pre-selected sensitive data is detected, a notification of the detection of the pre-selected data is sent to a system via a network. In another embodiment, if at least a portion of pre-selected sensitive data is detected, the access to this data is blocked.
    Type: Application
    Filed: September 27, 2011
    Publication date: January 19, 2012
    Inventors: Kevin T. Rowney, Michael R. Wolfe, Mythili Gopalakrishnan, Vitali A. Fridman
  • Publication number: 20120016881
    Abstract: Methods and apparatus, including computer program products, for maintaining a set of indexes in a database management system (DBMS) having at least one table. A current, stale or deferred status is defined for at least a part of the indexes, resulting in at least a part of a set of current, stale, or deferred indexes in the DBMS. Current indexes are maintained by refreshing a current index synchronously with a table change relating to the current index. Stale indexes are maintained by refreshing a stale index continuously and asynchronously to table modifications of tables relating to the stale index based on log information relating to the modifications. Deferred indexes are maintained by building a deferred index in response to a query to a table relating to the deferred index, thereby bringing the deferred index in accordance with the current query time status to the table relating to the deferred index.
    Type: Application
    Filed: April 7, 2011
    Publication date: January 19, 2012
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Namik Hrle, Johannes Schuetzner, James Z. Teng
  • Publication number: 20120011127
    Abstract: A method and apparatus to manage a database in which a data file and an index file are efficiently disposed include generating a table space in such a way that a region of a database in which data in a table form is stored is allocated to a first storage device, and a region of the database in which index information used to search for the data is stored is allocated to a second storage device, storing the data in table form in the first storage device and the index information in the second storage device according to the table space, and storing the data in the database or searching the database according to an input query.
    Type: Application
    Filed: July 6, 2011
    Publication date: January 12, 2012
    Applicant: Samsung Electronics Co., Ltd
    Inventor: Seong-hoon KIM
  • Publication number: 20110314025
    Abstract: A customized, specialty-oriented database and index of a subject matter area and methods for constructing and using such a database are provided. Selection and indexing of articles is done by experts in the topic with which the database is concerned. As a result, articles are indexed in a manner that allows facile, rapid retrieval of highly relevant articles with few or no false positives with much reduced database maintenance cost through frugal limitation of number of documents in the database, number of terms in a Master Index, and number of codes assigned to each document. A thesaurus allows indexing and search in accordance with terminology familiar to different anticipated groups of users (e.g. doctors, patients, nurses, technicians, and the like). Key articles collections and rapid access to documents therein are also provided.
    Type: Application
    Filed: June 23, 2011
    Publication date: December 22, 2011
    Inventor: John M. Nelson
  • Publication number: 20110307489
    Abstract: An approach is provided for enabling dynamic user based search within a distributed information space. A request for conducting a search over one or more information spaces is distributed to one or more autonomous agents. The autonomous agents process the request according to one or more functions specific to the one or more autonomous agents. Results are rendered to an interface of a user device in response to the search request.
    Type: Application
    Filed: June 9, 2010
    Publication date: December 15, 2011
    Applicant: Nokia Corporation
    Inventors: Ian Justin Oliver, Guido Peter Grassel, Mikko Johannes Honkala, Juha-Pekka Luoma
  • Publication number: 20110295857
    Abstract: A system and method for aligning multilingual content and indexing multilingual documents, to a computer readable data storage medium having stored thereon computer code means for indexing multilingual documents, to a system for presenting multilingual content. The method for aligning multilingual content and indexing multilingual documents comprises the steps of generating multiple bilingual terminology databases, wherein each bilingual terminology database associates respective terms in a pivot language with one or more terms in another language; and combining the multiple bilingual terminology databases to form a multilingual terminology database, wherein the multilingual terminology database associates terms in different languages via the pivot language terms.
    Type: Application
    Filed: June 20, 2008
    Publication date: December 1, 2011
    Inventors: Ai Ti Aw, Min Zhang, Lian Hau Lee, Thuy Vu, Fon Lin Lai
  • Publication number: 20110282888
    Abstract: Techniques for content recommendation are described. Some embodiments provide a content recommendation system (“CRS”) configured to recommend content items that are related to a collection of entities. A content item may be considered related to a collection of entities based on various factors, including whether and how often the article references or otherwise covers the entities of the collection, the size of the article, other entities that are covered by the article but that are not in the collection, article recency, or article credibility. Recommending content items may also or instead include determining entities that are related to a collection. An entity can be considered related to a collection based on various factors, such as whether the entity is of the same or similar type to entities of the collection, or whether the entity appears in some article in a relationship with one or more entities of the collection.
    Type: Application
    Filed: March 1, 2011
    Publication date: November 17, 2011
    Applicant: Evri, Inc.
    Inventors: Krzysztof Koperski, Jisheng Liang, Neil Roseman
  • Publication number: 20110276576
    Abstract: A method of compressing short text messages, comprising: generating an index code comprising an association of keywords in the text messages with indices, the index code is logically divided into segments of variable size, each segment comprising at least one bucket, being a constant range of indices; adjusting the index code according to a natural keyword frequency distribution and to statistical analysis of the text messages; associating short indices with frequent keywords in the text messages; converting the text messages into compressed text messages in which at least some of the keywords are replaced by the associated indices; and updating the association between the indices and the keywords, updating the segments, and updating the updating frequency in respect to a usage keyword frequency distribution and temporal changes thereof.
    Type: Application
    Filed: May 5, 2010
    Publication date: November 10, 2011
    Inventor: Mimran David
  • Publication number: 20110258199
    Abstract: Method and systems for performing high volume searches are described. In one example a method includes receiving a query directed to a database, the database including a plurality of items, determining whether the query complies with one of a plurality of search criteria, each of the plurality of search criteria corresponding to a predefined index of the database, selecting a predefined index of the database corresponding to one of the plurality of search criteria if the query complies with said search criterion, the index containing entries that comply with the corresponding search criterion, applying the query to the selected index to find database items referenced in the index, selecting items based on applying the query to the selected index, building a report for the query, the report including only items of the selected index.
    Type: Application
    Filed: April 14, 2011
    Publication date: October 20, 2011
    Applicant: salesforce.com, inc.
    Inventors: Kevin Oliver, Paul Burstein, Jeffrey M. Bergan, William A. Press