Patents by Inventor Marcus Felipe Fontoura

Marcus Felipe Fontoura has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 8131726
    Abstract: A method for indexing a plurality of documents, that includes a plurality of duplicate documents, first identifies one or more duplicate groups of documents from among the plurality of documents. Then, one index of content for the duplicate group is created instead of indexing the content from every document within the duplicate group. However, in contrast to the content index, an index of metadata for each of the documents in the duplicate group is created. Thus the content of each duplicate group is indexed only once, while a search engine using such indexing techniques retains the capability to answer queries as if the duplicated content was indexed for each document of the group.
    Type: Grant
    Filed: January 12, 2005
    Date of Patent: March 6, 2012
    Assignee: International Business Machines Corporation
    Inventors: Andrei Z. Broder, Marcus Felipe Fontoura, Michael Herscovici, Ronny Lempel, John A. McPherson, Jr., Andreas Neumann, Runping Qi, Eugene Jon Shekita
  • Patent number: 8103748
    Abstract: A system and method for managing heterogeneous clusters asynchronously accesses operating information received from the clusters by invoking services on the clusters to send the necessary information, which is then evaluated against rules supplied by the clusters. The services can be dynamically changed or added to support heterogeneous cluster management. When a rule is triggered the appropriate cluster is informed by a management service, so that the cluster can undertake load balancing/storage balancing activities as appropriate among its nodes.
    Type: Grant
    Filed: May 20, 2002
    Date of Patent: January 24, 2012
    Assignee: International Business Machines Corporation
    Inventors: Marcus Felipe Fontoura, Eustus Dwayne Nelson, Thomas Khanh Truong
  • Patent number: 7991769
    Abstract: An improved system and method is provided for searching a collection of objects that may be located in hierarchies of auxiliary information for retrieval of response objects. A framework to perform a generalization search in hierarchies may be used to generalize a search by moving up to a higher level in a hierarchy of taxonomies or to specialize a search by moving down to a lower level in the hierarchy of taxonomies. Once the system may decide to enumerate response objects at a particular level of generalization, a budgeted generalization search may be used for enumerating a set of response objects within a budgeted cost.
    Type: Grant
    Filed: July 7, 2006
    Date of Patent: August 2, 2011
    Assignee: Yahoo! Inc.
    Inventors: Marcus Felipe Fontoura, Vanja Josifovski, Christopher Olston, Shanmugasundaram Ravikumar, Andrew Tomkins
  • Patent number: 7991786
    Abstract: A system and method for parsing documents in query processing comprises producing at least one index of a document written in a mark-up language, corresponding the index to the document, scanning the document, and selectively skipping portions of the document based on instructions from the index. Furthermore, the mark-up language comprises any of HTML and XML; the skipped portions of the document comprise portions irrelevant to the query; the index comprises a plurality of elements representing textual categories of the query; and the instructions match the elements to the query. If the elements do not match the query, then the parser uses the index to skip the portions of the document corresponding to the unmatched elements. Moreover, each of the elements corresponds to a position in the document, wherein the position comprises an end position, which determines where to resume scanning the document upon skipping the portions of the document.
    Type: Grant
    Filed: November 25, 2003
    Date of Patent: August 2, 2011
    Assignee: International Business Machines Corporation
    Inventors: Marcus Felipe Fontoura, Vanja Josifovski, Pratik Mukhopadhyay
  • Patent number: 7991806
    Abstract: A system and method to facilitate importation of data taxonomies within a network are described. Advertiser entities access a data storage module within a network-based entity to retrieve content information from one or more content taxonomies stored within the data storage module. Subsequently, the advertiser entities select advertisements targeted to specific users based on the retrieved content information and further transmit the advertisements to the network-based entity. Furthermore, publisher entities and/or advertiser entities transmit data, such as, for example, associated taxonomy information, to the network-based entity. The entity receives the respective taxonomy information and parses the taxonomy information to extract node information and associated categories related to the received information. Finally, the entity integrates the node information and associated categories into one or more taxonomies stored within the data storage module.
    Type: Grant
    Filed: July 20, 2007
    Date of Patent: August 2, 2011
    Assignee: Yahoo! Inc.
    Inventors: Andrei Zary Broder, Marcus Felipe Fontoura, Vanja Josifovski
  • Patent number: 7921416
    Abstract: The present invention, in an example embodiment, provides a special-purpose formal language and translator for the parallel processing of large databases in a distributed system. The special-purpose language has features of both a declarative programming language and a procedural programming language and supports the co-grouping of tables, each with an arbitrary alignment function, and the specification of procedural operations to be performed on the resulting co-groups. The language's translator translates a program in the language into optimized structured calls to an application programming interface for implementations of functionality related to the parallel processing of tasks over a distributed system. In an example embodiment, the application programming interface includes interfaces for MapReduce functionality, whose implementations are supplemented by the embodiment.
    Type: Grant
    Filed: October 20, 2006
    Date of Patent: April 5, 2011
    Assignee: Yahoo! Inc.
    Inventors: Marcus Felipe Fontoura, Vanja Josifovski, Shanmugasundaram Ravikumar, Christopher Olston, Benjamin Clay Reed, Andrew Tomkins
  • Patent number: 7783626
    Abstract: Provided is a technique for building an index. A new index_{i+1} is built and an anchor text table_{i+1} and a duplicates table_{i+1} are output using a store_i, a delta store, and previously generated global analysis computations_i, wherein the previously generated global analysis computations_i include an anchor text table_i, a rank table_i, and a duplicates table_i. New global analysis computations_{i+1} are generated using the anchor text table_{i+1}, the duplicates table_{i+1}, and the previously generated global analysis computations_i.
    Type: Grant
    Filed: August 17, 2007
    Date of Patent: August 24, 2010
    Assignee: International Business Machines Corporation
    Inventors: Marcus Felipe Fontoura, Reiner Kraft, Tony Kai-Chi Leung, John A. McPherson, Jr., Andreas Neumann, Runping Qi, Sridhar Rajagopalan, Eugene J. Shekita, Jason Yeong Zien
  • Patent number: 7765214
    Abstract: Provided are techniques for computer-based electronic Information Retrieval (IR). An extended inverted index structure is built by generating one or more lexical affinities (LA), wherein each of the one or more lexical affinities comprises two or more search items found in proximity in one or more documents in a pool of documents, and generating a posting list for each of the one or more lexical affinities, wherein each posting list is associated with a specific lexical affinity and contains document identifying information for each of the one or more documents in the pool that contains the specific lexical affinity and a location within the document where the specific lexical affinity occurs.
    Type: Grant
    Filed: January 18, 2006
    Date of Patent: July 27, 2010
    Assignee: International Business Machines Corporation
    Inventors: Peter Altevogt, Marcus Felipe Fontoura, Silvio Wiedrich, Jason Yeong Zien
  • Patent number: 7743060
    Abstract: Disclosed is a technique for indexing data. For each token in a set of documents, a sort key is generated that includes a document identifier that indicates whether a section of a document associated with the sort key is an anchor text section or a context section, wherein the anchor text section and the context text section have a same document identifier; it is determined whether a data field associated with the token is a fixed width; when the data field is a fixed width, the token is designated as one for which fixed width sort is to be performed; and, when the data field is a variable length, the token is designated as one for which a variable width sort is to be performed. The fixed width sort and the variable width sort are performed. For each document, the sort keys are used to bring together the anchor text section and the context section of that document.
    Type: Grant
    Filed: August 6, 2007
    Date of Patent: June 22, 2010
    Assignee: International Business Machines Corporation
    Inventors: Marcus Felipe Fontoura, Andreas Neumann, Sridhar Rajagopalan, Eugene J. Shekita, Jason Yeong Zien
  • Patent number: 7685138
    Abstract: A system, method, and computer program product to improve XML query processing efficiency with virtual cursors. Structural joins are a fundamental operation in XML query processing, and substantial work exists on index-based algorithms for executing them. Two well-known index features—path indices and ancestor information—are combined in a novel way to replace at least some of the physical index cursors in a structural join with virtual cursors. The position of a virtual cursor is derived from the path and ancestor information of a physical cursor. Virtual cursors can be easily incorporated into existing structural join algorithms. By eliminating index I/O and the processing cost of handling physical inverted lists, virtual cursors can improve the performance of holistic path queries by an order of magnitude or more.
    Type: Grant
    Filed: November 8, 2005
    Date of Patent: March 23, 2010
    Assignee: International Business Machines Corporation
    Inventors: Kevin S. Beyer, Marcus Felipe Fontoura, Sridhar Rajagopalan, Eugene J. Shekita, Beverly Yang
  • Patent number: 7577644
    Abstract: In an example embodiment, the present invention provides methods and logic for enhancing augmented search, including contextual search, conducted by a search engine. In some instances, a contextual search might return a set of results that are less relevant than the set of results returned by algorithmic search. This might occur when the quantity of contextual information is very large or when the contextual information includes misspellings. An embodiment of the present invention detects such occurrences and corrects the set of results provided to the user by merging a ranked set of results from the contextual search with a ranked set of results from an algorithmic search. During this merge process, an embodiment of the present invention replaces irrelevant results from the contextual search with results from the algorithmic search if the latter results fall within the context used for the contextual search.
    Type: Grant
    Filed: October 11, 2006
    Date of Patent: August 18, 2009
    Assignee: Yahoo! Inc.
    Inventors: Marcus Felipe Fontoura, Vanja Josifovski, Reiner Kraft
  • Publication number: 20090024649
    Abstract: A system and method to facilitate importation of data taxonomies within a network are described. Advertiser entities access a data storage module within a network-based entity to retrieve content information from one or more content taxonomies stored within the data storage module. Subsequently, the advertiser entities select advertisements targeted to specific users based on the retrieved content information and further transmit the advertisements to the network-based entity. Furthermore, publisher entities and/or advertiser entities transmit data, such as, for example, associated taxonomy information, to the network-based entity. The entity receives the respective taxonomy information and parses the taxonomy information to extract node information and associated categories related to the received information. Finally, the entity integrates the node information and associated categories into one or more taxonomies stored within the data storage module.
    Type: Application
    Filed: July 20, 2007
    Publication date: January 22, 2009
    Inventors: Andrei Zary Broder, Marcus Felipe Fontoura, Vanja Josifovski
  • Publication number: 20090024468
    Abstract: A system and method to facilitate matching of content to advertising information in a network are described. A request for advertising information is received over a network, the advertising information to be displayed for a user entity in association with content information within a web page requested by the user entity. Advertising information related to one or more themes of the content information on the web page is further determined, the themes representing subject matter contextually related to the content information. Advertisements are further selected from the advertising information based on keywords and metadata stored within the web page and based on a set of predetermined parameters stored within the data storage module. The selected advertisements are further ranked to obtain a ranked list of advertisements.
    Type: Application
    Filed: July 20, 2007
    Publication date: January 22, 2009
    Inventors: Andrei Zary Broder, Marcus Felipe Fontoura, Vanja Josifovski, Lance Alan Riedel
  • Publication number: 20090024469
    Abstract: A system and method to facilitate classification and storage of events in a network are described. An event and associated content information are received from an entity over a network. The content information is further analyzed to determine one or more themes representing subject matter related to the content information. The event is further classified according to the themes into one or more corresponding categories. Finally, the event is stored into one or more corresponding databases of a data storage module according to the one or more corresponding categories.
    Type: Application
    Filed: July 20, 2007
    Publication date: January 22, 2009
    Inventors: Andrei Zary Broder, Marcus Felipe Fontoura, Vanja Josifovski, Lance Alan Riedel
  • Publication number: 20090024467
    Abstract: Methods for selecting advertisements to serve to a client requesting a primary webpage are provided. The client displays a referring webpage having a hyperlink to the primary webpage. Upon selection of the hyperlink, the client sends a request to a content server storing the primary webpage, the request including a referrer of the primary webpage comprising a URL address of the referring webpage. The content server sends the primary webpage to the client which includes the referrer and an advertisement request mechanism configured to make an advertisement request to an advertisement server and attach the referrer to the advertisement request. The advertisement server uses the referrer to select one or more advertisements to serve to the client. The referrer may comprise one or more search query terms submitted by the client. The advertisement server may also use the content of the primary webpage to select the one or more advertisements.
    Type: Application
    Filed: July 20, 2007
    Publication date: January 22, 2009
    Inventors: Marcus Felipe Fontoura, Andrei Zary Broder, Vanja Josifovski
  • Publication number: 20090024623
    Abstract: A system and method to facilitate mapping and storage of data within one or more data taxonomies are described. Content information is received over a network. The content information is further analyzed to determine at least one theme representing subject matter related to the content information. Finally, the content information is stored within respective predetermined categories organized within at least one taxonomy, the predetermined categories being associated with the at least one theme.
    Type: Application
    Filed: July 20, 2007
    Publication date: January 22, 2009
    Inventors: Andrei Zary Broder, Vanja Josifovski, Marcus Felipe Fontoura, Lance Alan Riedel
  • Publication number: 20080301130
    Abstract: Provided are a method, system, and article of manufacture for searching documents for ranges of numeric values. Document identifiers for documents are accessed, wherein the documents include at least one value that is a member of a set of values. A number of posting lists are generated. Each posting list is associated with a range of consecutive values within the set of values and includes document identifiers for documents including at least one value within the range of consecutive values associated with the posting list, and wherein each document identifier is associated with one value in the set of values included in the document identified by the document identifier. The generated posting lists are stored, wherein the posting lists are used to process a query on a range of values within the set of values.
    Type: Application
    Filed: August 12, 2008
    Publication date: December 4, 2008
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Marcus Felipe Fontoura, Ronny Lempel, Runping Qi, Jason Yeong Zien
  • Patent number: 7461064
    Abstract: Provided are a method, system, and program for searching documents for ranges of numeric values. Document identifiers for documents are accessed, wherein the documents include at least one value that is a member of a set of values. A number of posting lists are generated. Each posting list is associated with a range of consecutive values within the set of values and includes document identifiers for documents having values within the range of consecutive values associated with the posting list. Each document identifier is associated with one value in the set of values included in the document identified by the document identifier. The generated posting lists are stored.
    Type: Grant
    Filed: September 24, 2004
    Date of Patent: December 2, 2008
    Assignee: International Business Machines Corporation
    Inventors: Marcus Felipe Fontoura, Ronny Lempel, Runping Qi, Jason Yeong Zien
  • Publication number: 20080294634
    Abstract: Provided are a system and article of manufacture for searching documents for ranges of numeric values. Document identifiers for documents are accessed, wherein the documents include at least one value that is a member of a set of values. A number of posting lists is generated, wherein each posting list is associated with a range of consecutive values within the set of values and includes document identifiers for documents including at least one value within the range of consecutive values associated with the posting list, and wherein each document identifier is associated with one value in the set of values included in the document identified by the document identifier. The generated posting lists are stored, wherein the posting lists are used to process a query on a range of values within the set of values. A query on a query range of values within the set of values is received and a determination is made of a minimum number of posting lists associated with consecutive values that together include the query range of values.
    Type: Application
    Filed: August 6, 2008
    Publication date: November 27, 2008
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Marcus Felipe Fontoura, Ronny Lempel, Runping Qi, Jason Yeong Zien
  • Publication number: 20080201219
    Abstract: A system and method to facilitate classification of search queries and selection of associated advertising information over a network are described. A search query received from a user over a network is processed to retrieve a predetermined number of query results. The predetermined number of query results is further classified to select one or more categories associated with the query results. Finally, advertising information is selected based on the one or more selected categories for further display to the user in connection with the query results.
    Type: Application
    Filed: February 20, 2007
    Publication date: August 21, 2008
    Inventors: Andrei Zary Broder, Marcus Felipe Fontoura, Vanja Josifovski
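
Illustrative sketch for patent 8131726 (indexing duplicate groups). The abstract describes indexing content once per duplicate group while indexing metadata for every document. The Python below is only a minimal sketch of that idea, not the patented method: it assumes duplicates are exact copies detected with a SHA-1 fingerprint, and the names build_indexes and search, as well as the "content" and "meta" keys, are hypothetical.

```python
import hashlib
from collections import defaultdict


def build_indexes(docs):
    """docs maps doc_id -> {"content": str, "meta": dict} (assumed layout)."""
    groups = defaultdict(list)          # content fingerprint -> member doc ids
    for doc_id, doc in docs.items():
        digest = hashlib.sha1(doc["content"].encode("utf-8")).hexdigest()
        groups[digest].append(doc_id)

    content_index = defaultdict(set)    # term -> fingerprints of groups
    metadata_index = defaultdict(set)   # (field, value) -> doc ids
    for digest, members in groups.items():
        # Content is tokenized and indexed once per duplicate group.
        representative = docs[members[0]]
        for term in representative["content"].lower().split():
            content_index[term].add(digest)
        # Metadata is indexed separately for every document in the group.
        for doc_id in members:
            for field, value in docs[doc_id]["meta"].items():
                metadata_index[(field, value)].add(doc_id)
    return groups, content_index, metadata_index


def search(term, groups, content_index):
    """Expand matching groups back to every member document."""
    hits = set()
    for digest in content_index.get(term.lower(), ()):
        hits.update(groups[digest])
    return hits
```

Expanding each matching group at query time is what lets the search answer as if every duplicate had been indexed individually, while the content index holds one entry per group.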
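
Illustrative sketch for patent 7765214 (lexical affinities). The abstract describes posting lists keyed by a lexical affinity, each entry holding a document identifier and a location. The Python below is a rough sketch under stated assumptions: an affinity is taken to be an unordered pair of distinct terms within a fixed token window, the location is the position of the first term, and WINDOW and build_la_index are hypothetical names that do not come from the patent.

```python
from collections import defaultdict

WINDOW = 3  # assumed maximum distance, in tokens, between the two terms


def build_la_index(docs, window=WINDOW):
    """docs maps doc_id -> text. Returns {(term_a, term_b): [(doc_id, pos)]}."""
    index = defaultdict(list)
    for doc_id, text in docs.items():
        tokens = text.lower().split()
        for i, first in enumerate(tokens):
            # Pair the token at position i with each later token in the window.
            for j in range(i + 1, min(i + 1 + window, len(tokens))):
                second = tokens[j]
                if first == second:
                    continue
                pair = tuple(sorted((first, second)))  # unordered affinity key
                index[pair].append((doc_id, i))
    return index
```

A query for two nearby terms can then read a single posting list instead of intersecting the lists of the individual terms.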
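
Illustrative sketch for patent 7461064 and publications 20080301130 and 20080294634 (numeric range search). The abstracts describe posting lists that each cover a range of consecutive values and store document identifiers together with the matching value. The Python below is a simplified sketch, assuming integer values and fixed-width, non-overlapping ranges; BUCKET_WIDTH, build_range_postings, and range_query are hypothetical names, and the patents do not specify this particular bucketing.

```python
from collections import defaultdict

BUCKET_WIDTH = 10  # assumed width of each range of consecutive values


def build_range_postings(doc_values, width=BUCKET_WIDTH):
    """doc_values maps doc_id -> iterable of integer values in the document."""
    postings = defaultdict(list)        # bucket number -> [(doc_id, value)]
    for doc_id, values in doc_values.items():
        for value in values:
            postings[value // width].append((doc_id, value))
    for entries in postings.values():
        entries.sort()
    return postings


def range_query(postings, low, high, width=BUCKET_WIDTH):
    """Return ids of documents holding a value in [low, high]."""
    hits = set()
    # Only the buckets that overlap the query range are read.
    for bucket in range(low // width, high // width + 1):
        for doc_id, value in postings.get(bucket, ()):
            # The stored value filters out entries in partially covered buckets.
            if low <= value <= high:
                hits.add(doc_id)
    return hits
```

Storing the value next to each document identifier is what allows partially covered buckets at either end of the query range to be filtered without consulting the documents themselves.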