Abstract: Deduplication of data on disk devices based on a threshold number (THN) of sequential blocks is described herein, the threshold number being two or greater. Deduplication may be performed when a series of THN or more received blocks (THN series) match a sequence of THN or more stored blocks (THN sequence), whereby a sequence comprises blocks stored on the same track of a disk device. Deduplication may be performed using a block-comparison mechanism comprising metadata entries of stored blocks and a mapping mechanism containing mappings of deduplicated blocks to their matching blocks. The mapping mechanism may be used to perform later read requests received for the deduplicated blocks. The deduplication described herein may reduce the read latency as the number of seeks between tracks may be reduced. Also, when a seek to a different track is performed, the seek time cost is spread over THN or more blocks.
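The threshold rule in the abstract above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `find_thn_matches`, the `tracks` dictionary (track id to list of stored block contents), and the direct content comparison are hypothetical stand-ins for the block-comparison mechanism and metadata entries the abstract mentions.

```python
def find_thn_matches(received, tracks, thn=2):
    """Map received-block indexes to (track_id, offset) of matching stored blocks.

    Only runs of `thn` or more consecutive received blocks that match
    consecutive blocks on a single track are deduplicated.
    Hypothetical sketch: blocks are compared by content, not by metadata.
    """
    mapping = {}
    i = 0
    while i < len(received):
        best = None  # (run_length, track_id, offset) of the longest run at i
        for track_id, blocks in tracks.items():
            for off in range(len(blocks)):
                n = 0
                while (i + n < len(received) and off + n < len(blocks)
                       and received[i + n] == blocks[off + n]):
                    n += 1
                if n >= thn and (best is None or n > best[0]):
                    best = (n, track_id, off)
        if best is None:
            i += 1  # no THN sequence starts here; block is stored normally
            continue
        n, track_id, off = best
        for k in range(n):
            mapping[i + k] = (track_id, off + k)
        i += n
    return mapping
```

With `tracks = {0: ["a", "b", "c"], 1: ["x", "y"]}` and `received = ["a", "b", "q", "x", "c"]`, only the first two received blocks are mapped, because the later single-block matches fall below the threshold of two.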
Abstract: A cluster comprises a plurality of nodes that access a shared storage, each node having two or more partner nodes. A primary node may own a plurality of aggregate sub-sets in the shared storage. Upon failure of the primary node, each partner node may take over ownership of an aggregate sub-set according to an aggregate failover data structure (AFDS). The AFDS may specify an ordered data structure of two or more partner nodes to take over each aggregate sub-set, the ordered data structure comprising at least a first-ordered partner node assigned to take over the aggregate sub-set upon failure of the primary node and a second-ordered partner node assigned to take over the aggregate sub-set upon failure of the primary node and the first-ordered partner node. The additional workload of the failed primary node is distributed among two or more partner nodes and protection for multiple node failures is provided.
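One way to picture the AFDS is as a mapping from each aggregate sub-set to an ordered list of partner nodes. The sketch below assumes that representation; the names `afds`, `next_owner`, and the node and aggregate labels are hypothetical and not taken from the patent.

```python
def next_owner(afds, subset, failed_nodes):
    """Return the first partner in the ordered list that has not failed.

    `afds` maps an aggregate sub-set name to its ordered partner list
    (hypothetical layout). Returns None if every listed partner has failed.
    """
    for node in afds[subset]:
        if node not in failed_nodes:
            return node
    return None


# Primary node "A" owns two aggregate sub-sets, each with its own ordering.
afds = {
    "aggr_a": ["B", "C"],
    "aggr_b": ["C", "D"],
}
```

If only `A` fails, `aggr_a` goes to `B` and `aggr_b` goes to `C`, so the workload is split. If `A` and `B` both fail, `next_owner(afds, "aggr_a", {"A", "B"})` returns `"C"`.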
Abstract: A system, a method, and a user interface for providing content by using several modules for displaying the content within a single page. A first module within the page is for presenting content from a first source and a second module is for presenting content from a second source. The first module is stacked on top of the second module within a window for presenting the page. The content presented by the first module is independent of the content presented by the second module such that a user interacts with the content of each module independently and without the need for navigation to a location external to the single page.
August 16, 2007
Date of Patent: October 30, 2012
Stephen Gerald Garcia, Joshua Allen Rehling, Andrew Boath Faris, Anthony Dominic Amidei
Abstract: A system and method for dynamically producing virtual machines (VMs) across a plurality of servers in the virtual server environment is provided. A single VM request queue is produced comprising VM requests for producing the plurality of VMs. A processing thread is produced and assigned for each server and retrieves VM requests from the VM request queue and produces VMs only on the assigned server according to the retrieved VM requests. Each processing thread may be configured for retrieving VM requests and producing VMs without any programmed delays, whereby the rate at which a processing thread produces VMs on its assigned server is a function of the performance capabilities of the assigned server. This dynamic allocation of VMs based on such a “natural selection” technique may provide an appropriately balanced allocation of VMs based on the performance capabilities of each server in the virtual server environment.
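The single-queue, thread-per-server arrangement described above can be sketched with Python's standard `queue` and `threading` modules. The function `produce_vms` and the `create_vm` callback are hypothetical placeholders; a real system would call a hypervisor API at that point.

```python
import queue
import threading


def produce_vms(requests, servers, create_vm):
    """Drain one shared request queue with one worker thread per server.

    Workers loop with no programmed delay, so a server that finishes
    `create_vm` faster naturally pulls more requests from the queue.
    `create_vm(server, request)` is a hypothetical callback.
    """
    pending = queue.Queue()
    for request in requests:
        pending.put(request)

    # Each thread appends only to its own list, so no extra lock is needed.
    placed = {server: [] for server in servers}

    def worker(server):
        while True:
            try:
                request = pending.get_nowait()
            except queue.Empty:
                return  # queue drained; this server's thread is done
            create_vm(server, request)
            placed[server].append(request)

    threads = [threading.Thread(target=worker, args=(s,)) for s in servers]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return placed
```

The split between servers is not fixed in advance. It depends on how quickly each `create_vm` call returns, which is the "natural selection" effect the abstract describes.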
Abstract: A cluster system comprises a plurality of nodes that provide data-access service to a shared storage, each node having at least one failover partner node for taking over services of a node if the node fails. Each node may produce write logs for the shared storage and periodically send write logs at predetermined time intervals to a global device which stores write logs from each node. The global device may detect failure of a node by monitoring time intervals of when write logs are received from each node. Upon detection of a node failure, the global device may provide the write logs of the failed node to one or more partner nodes for performing the write logs on the shared storage. Write logs may be transmitted only between nodes and the global device to reduce data exchanges between nodes and conserve I/O resources of the nodes.
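A minimal sketch of the global device's failure detection follows, assuming timestamps are passed in explicitly. The class name `GlobalDevice`, the `grace` multiplier, and the method names are assumptions for illustration; the abstract only says that failure is detected by monitoring the intervals at which write logs arrive.

```python
class GlobalDevice:
    """Stores write logs per node and flags nodes whose logs stop arriving."""

    def __init__(self, interval, grace=2.0):
        # A node is treated as failed after `interval * grace` seconds of
        # silence. The grace factor is a hypothetical tuning parameter.
        self.timeout = interval * grace
        self.logs = {}
        self.last_seen = {}

    def receive(self, node, write_logs, now):
        """Record a batch of write logs and the time it arrived."""
        self.logs.setdefault(node, []).extend(write_logs)
        self.last_seen[node] = now

    def failed_nodes(self, now):
        """Return nodes that have been silent for longer than the timeout."""
        return sorted(n for n, t in self.last_seen.items()
                      if now - t > self.timeout)

    def logs_for_partner(self, failed_node):
        """Return a copy of the failed node's logs for a partner to perform."""
        return list(self.logs.get(failed_node, []))
```

Because the arrival of write logs doubles as the heartbeat, nodes need no separate health-check traffic between one another.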
Abstract: A method and apparatus for deduplication of files of a storage system is described. During a gathering phase, a file may be simultaneously processed by two or more threads to produce and store content identifiers for data blocks of the file. Each file may be sub-divided into multiple file sub-portions, each file sub-portion comprising a predetermined number of data blocks. A thread may be assigned to each sub-portion of a file for processing the data blocks. The currently assigned sub-portion for each thread may be recorded and used upon a system crash to restart each scanner thread at the currently assigned sub-portion to minimize the data blocks that are re-processed. The size of a file sub-portion may be predetermined based on the organization of inode data structures representing the files (e.g., based on the maximum number of pointers that an indirect block in the inode data structure may contain).
Abstract: A system and method are provided for auto-committing files of a storage system to immutable status based on a change log of file system activity. The system is configured for producing and analyzing the change log. Producing the change log involves generating change log entries associated with changes made to files of the storage system and organizing the change log entries from the oldest to newest entries. Analyzing the change log involves processing the change log beginning with the oldest entry to determine whether any entries have met the auto-commit time period, and if so, to set the files associated with such entries to immutable status. If a change log entry is found not to have met the auto-commit time period, a resting time period is determined based on the oldest change log entry, and processing of the change log proceeds after expiration of the resting time period.
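The oldest-first scan and the resting period can be sketched as below. This assumes the change log is a `deque` of `(timestamp, path)` pairs ordered oldest to newest, and that each file appears at most once in the log; both are simplifying assumptions, and `process_change_log` is a hypothetical name.

```python
from collections import deque


def process_change_log(log, now, auto_commit_period):
    """Commit every entry that has aged past the auto-commit period.

    Returns (committed_paths, resting_period). The resting period is the
    time until the oldest remaining entry becomes eligible, or None when
    the log is empty. Entries are removed from `log` as they are committed.
    """
    committed = []
    while log and now - log[0][0] >= auto_commit_period:
        _, path = log.popleft()
        committed.append(path)  # a real system would set the file immutable
    if log:
        resting_period = log[0][0] + auto_commit_period - now
    else:
        resting_period = None
    return committed, resting_period
```

For a log with entries at times 0, 5, and 12, a period of 10, and a current time of 16, the first two files are committed and the scan rests for 6 time units, because the entry at time 12 becomes eligible at time 22.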
Abstract: Information regarding the structure of information in a content database is maintained in a structure database. The structure database is used to correlate the data structure of a query to the structure of the content database, in order to determine that information in the content database which needs to be provided to a searcher in response to the query. In one embodiment, this search method is used in an online forum, and the forum maintains a reputation score for users with respect to given subject matter. The reputation score is dependent upon the quality of a user's participation in the forum. A user's reputation score depends upon the evaluation by others of information he posts and upon the user evaluating information posted by others.
Abstract: Methods and systems that label a web page by collecting a set of inbound labels for the web page, estimating a language model for the web page, computing the likelihood of generating each inbound label given the language model and assigning a score to each inbound label based on this likelihood, and assigning a label to the web page based on the score assigned to each of the set of inbound labels. Inbound labels are preferably collected from the set of web documents linking to the web page. Labels assigned are useful in providing labeled links to web pages from top hosts in search results pages.
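The scoring step can be illustrated with a very simple language model. The sketch below assumes a unigram model with add-one smoothing and whitespace tokenization, neither of which is specified in the abstract; the function names are hypothetical.

```python
import math
from collections import Counter


def score_labels(page_text, inbound_labels):
    """Score each inbound label by its log-likelihood under a unigram model.

    The model is estimated from the page text with add-one smoothing
    (an assumed choice; the abstract does not name a specific model).
    """
    tokens = page_text.lower().split()
    counts = Counter(tokens)
    vocab = set(tokens)
    for label in inbound_labels:
        vocab.update(label.lower().split())
    total = len(tokens) + len(vocab)  # denominator after add-one smoothing

    def log_likelihood(label):
        return sum(math.log((counts[w] + 1) / total)
                   for w in label.lower().split())

    return {label: log_likelihood(label) for label in inbound_labels}


def assign_label(page_text, inbound_labels):
    """Pick the inbound label with the highest score."""
    scores = score_labels(page_text, inbound_labels)
    return max(scores, key=scores.get)
```

A raw log-likelihood favors shorter labels, so a practical scorer would likely normalize by label length; that refinement is left out here.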
Abstract: A system and method to facilitate targeting of advertisements based on mutual information sharing between devices over a network are described. Users access an entity over a network and further initiate various events, which are subsequently captured by selective processing servers within the network-based entity. Each of such events or actions initiated by the user is associated with one or more categories and is further stored within the entity along the respective categories. Subsequently, if the user connects to one or more media devices, and the media devices are coupled to the entity via the network, then the entity selects advertisements to be displayed within the viewing area of the media device, such that each advertisement is related to the stored events or actions and their respective categories. Alternatively, a user connects to one or more media devices and further performs specific actions.
Abstract: Described are methods and apparatus for reducing latency of read and write requests for a set of storage system sites having a shared data set. An owner site may directly write to the shared data set and contains current data regarding the shared data set. The remote sites may experience substantial latency when accessing the shared data set stored at the owner site. Synchronizing and caching methods may reduce overall read latency experienced at remote sites by periodically transmitting images of the shared data set to the remote sites. Also, a migration method may be used to change ownership of the shared data set from a current owner site (that may be receiving a relatively low number of read/write requests) to a new owner site (that may be receiving a relatively high number of read/write requests) to reduce the overall read and write latency experienced in the sites.
Abstract: A method for calibrating an apparatus for ellipsometric measurements performed on an arbitrarily large or continuously moving sample, using a visible sample reference frame, and one or more laser sources in order to calibrate the ellipsometer for variations in the distance between the ellipsometer apparatus and the sample of interest. Included are techniques for projecting a first laser beam spot from an incident laser source onto a sample, then analyzing the position of the first laser beam spot relative to the center of the sample reference frame using human-aided measurements and confirmations and/or computer vision techniques. Then adjusting pivot points and/or apparatus-to-sample distance to achieve a first beam spot being located about the center of the sample reference frame, and concurrently intersecting the plane of the sample. Other techniques include changing the incidence and reflectance angle using a semi-circular track arc design with a stepping motor activating each goniometer arm.
Abstract: An ad matching system that includes an interactive client permits a triggering Web page author to provide feedback on a candidate advertisement for the page. Author feedback is used to rank ads for display on the triggering page. Preferably author feedback is also incorporated into ad clustering and/or ad ranking formulae within the system. Also, author credibility is judged based on author feedback and on click through rates of placed ads.
Abstract: Described herein are a method and apparatus for using an LLRRM device as a storage device in a storage system. At least three levels of data structures may be used to remap storage system addresses to LLRRM addresses for read requests, whereby a first-level data structure is used to locate a second-level data structure corresponding to the storage system address, which is used to locate a third-level data structure corresponding to the storage system address. An LLRRM address may comprise a segment number determined from the second-level data structure and a page number determined from the third-level data structure. Update logs may be produced and stored for each new remapping caused by a write request. An update log may specify a change to be made to a particular data structure. The stored update logs may be performed on the data structures upon the occurrence of a predetermined event.
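The three-level lookup can be sketched with nested dictionaries. The fan-out constants, the way the storage system address is split into three indexes, and the shape of an update log (an address plus a new page number) are all assumptions made for illustration.

```python
ENTRIES_L3 = 4  # addresses covered by one third-level structure (assumed)
ENTRIES_L2 = 4  # third-level structures per second-level structure (assumed)


def split(addr):
    """Split a storage system address into first/second/third-level indexes."""
    rest, i3 = divmod(addr, ENTRIES_L3)
    i1, i2 = divmod(rest, ENTRIES_L2)
    return i1, i2, i3


def remap(first_level, addr):
    """Resolve a storage system address to an LLRRM (segment, page) pair."""
    i1, i2, i3 = split(addr)
    second_level = first_level[i1]            # first level locates second
    segment, third_level = second_level[i2]   # second level gives the segment
    page = third_level[i3]                    # third level gives the page
    return segment, page


def apply_update_logs(first_level, update_logs):
    """Perform stored update logs, each changing one third-level page entry."""
    for addr, new_page in update_logs:
        i1, i2, i3 = split(addr)
        _segment, third_level = first_level[i1][i2]
        third_level[i3] = new_page
    update_logs.clear()
```

Deferring the changes in `update_logs` until a chosen event keeps the write path from modifying the mapping structures on every request.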
Abstract: A method of providing advertising services selects a finite set of topics, and arranges the selected set of topics into a hierarchical structure. The method classifies impression items into the nodes within the hierarchical structure, and allows bidding against the nodes within the hierarchical structure. Some embodiments allow a bidder to request a refinement of the hierarchical structure. These embodiments receive such a request, and compare the request to a set of criteria. If the request meets the set of criteria, the method divides a first node in the hierarchical structure to at least a second and third node. The method allows bidders to bid on each of the first, second, and third nodes. The method optionally measures a performance for the nodes within the hierarchical structure. Based on the measure of performance for the nodes, the method preferably removes an under-performing node from the hierarchical structure.
Abstract: A data receiver circuit includes a transmission line to generate the appropriate timing for clock and data recovery. The transmission line receives a reference signal, and propagates the reference signal through at least two segments of predetermined lengths. The transmission line is configured with a first tab to extract, from the first predetermined length, a first delayed signal, and a second tab to extract, from the second predetermined length, a second delayed signal. A sampling circuit generates samples, at a first time period, from an input signal and the first delayed signal. The sampling circuit also generates samples, at a second time period, from the input signal and the second delayed signal. A capacitance control device to adjust the capacitance of the transmission line is disclosed.
Abstract: A computer implemented method, computer-readable medium and system for deciding which external corpora, such as verticals, to integrate into primary Internet search engine results in response to a query is disclosed. Offline query-related data and user feedback data is incorporated. A probabilistic estimate is formed of the relevance of the verticals to the query.
Abstract: A system for selecting electronic advertisements from an advertisement pool to match the surrounding content is disclosed. To select advertisements, the system takes an approach to content match that takes advantage of machine translation technologies. The system of the present invention implements this goal by means of simple and efficient machine translation features that are extracted from the surrounding context to match with the pool of potential advertisements. Machine translation features are used as features for training a machine learning model. In one embodiment, a ranking SVM (Support Vector Machine) is trained to identify advertisements relevant to a particular context. The trained machine learning model can then be used to rank advertisements for a particular context by supplying the machine learning model with the machine translation feature measures for the advertisements and the surrounding context.
Abstract: A cluster storage system comprises a plurality of nodes that access a shared storage, each node having two or more failover partner nodes. A primary node produces write logs for received write requests and produces parity data for the write logs (storing the parity data to local non-volatile storage). By storing parity data rather than actual write logs, the non-volatile storage space within the cluster for storing write logs is reduced. Prior to failure of the primary node, the primary node also sub-divides the write logs into two or more sub-sets and distributes the sub-sets to the two or more partner nodes for storage at non-volatile storage devices. Thus, if the primary node fails, its write logs are already distributed among the partner nodes so each partner node may perform the allotted write logs on the storage, thus improving the response time to the primary node failure.
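The two ideas above, parity over write logs and distribution of log sub-sets to partners, can be sketched as follows. The sketch assumes fixed-size write logs, byte-wise XOR parity, and round-robin distribution; the abstract does not specify the parity scheme or the distribution policy, so these are illustrative choices only.

```python
from functools import reduce


def xor_bytes(a, b):
    """Byte-wise XOR of two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


def parity(logs):
    """XOR parity over a non-empty list of equal-length write logs."""
    return reduce(xor_bytes, logs)


def distribute(logs, partners):
    """Sub-divide write logs among partner nodes in round-robin order."""
    subsets = {p: [] for p in partners}
    for i, log in enumerate(logs):
        subsets[partners[i % len(partners)]].append(log)
    return subsets
```

With XOR parity, any single missing log can be rebuilt by XOR-ing the parity with the surviving logs, while the primary stores only one log's worth of parity data locally rather than every log.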
Abstract: Deduplication of data using a low-latency random read memory (LLRRM) is described herein. Upon receiving a block, if a matching block stored on a disk device is found, the received block is deduplicated by producing an index to the address location of the matching block. In some embodiments, a matching block having a predetermined threshold number of associated indexes that reference the matching block is transferred to LLRRM, the threshold number being one or greater. Associated indexes may be modified to reflect the new address location in LLRRM. Deduplication may be performed using a mapping mechanism containing mappings of deduplicated blocks to matching blocks, the mappings being used for performing read requests. Deduplication described herein may reduce read latency as LLRRM has relatively low latency in performing random read requests relative to disk devices.
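The promotion rule above can be sketched as a small in-memory model. The class `DedupStore`, its fields, and the use of block content as the lookup key (standing in for a content fingerprint) are hypothetical; addresses are simple counters per device.

```python
class DedupStore:
    """Toy model: popular matching blocks move from disk to LLRRM."""

    def __init__(self, threshold=1):
        self.threshold = threshold  # indexes needed before promotion
        self.location = {}          # content -> (device, address)
        self.refs = {}              # content -> number of dedup indexes
        self.mapping = {}           # block id -> content key
        self._next = {"disk": 0, "llrrm": 0}

    def _alloc(self, device):
        addr = self._next[device]
        self._next[device] += 1
        return (device, addr)

    def write(self, block_id, content):
        if content in self.location:
            # Matching block found: deduplicate by adding an index to it.
            self.refs[content] += 1
            if (self.refs[content] >= self.threshold
                    and self.location[content][0] == "disk"):
                # Promote the matching block. Every index resolves through
                # `location`, so all of them see the new address at once.
                self.location[content] = self._alloc("llrrm")
        else:
            self.location[content] = self._alloc("disk")
            self.refs[content] = 0
        self.mapping[block_id] = content

    def read(self, block_id):
        """Return the (device, address) that serves this block."""
        return self.location[self.mapping[block_id]]
```

With `threshold=2`, the second write of the same content leaves the block on disk and the third moves it to LLRRM, after which reads of all three block ids resolve to the LLRRM address.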