Abstract: A system provides file-aware, block-level deduplication in an environment where multiple clients are connected to a storage subsystem over a network such as an Internet Protocol (IP) network. The system includes client components and storage subsystem components. Client components include a walker that traverses the namespace looking for files that meet the criteria for optimization, a file system daemon that rehydrates the files, and a filter driver that watches all operations going to the file system. Storage subsystem components include an optimizer resident on the nodes of the storage subsystem. The optimizer can use idle processor cycles to perform optimization. Sub-file compression can be performed at the storage subsystem.
Type:
Application
Filed:
August 17, 2010
Publication date:
March 24, 2011
Applicant:
OCARINA NETWORKS, INC.
Inventors:
Mike Wilson, Parthiban Munusamy, Carter George, Murali Bashyam, Vinod Jayaraman, Goutham Rao
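The walker component named in the abstract above can be illustrated with a minimal sketch. The abstract does not state which criteria qualify a file for optimization, so the size and age thresholds here, along with the function name `find_candidates` and its parameters, are hypothetical choices made only for illustration.

```python
import os
import time


def find_candidates(root, min_size=4096, min_age_seconds=86400.0, now=None):
    """Yield paths under `root` that meet the (assumed) optimization criteria.

    Assumption: a file qualifies when it is at least `min_size` bytes and
    has not been modified for at least `min_age_seconds`.
    """
    now = time.time() if now is None else now
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                st = os.stat(path)
            except OSError:
                # The file vanished or is unreadable; skip it and keep walking.
                continue
            if st.st_size >= min_size and now - st.st_mtime >= min_age_seconds:
                yield path
```

A caller would hand each yielded path to whatever component performs the optimization; that hand-off is outside the scope of this sketch.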
Abstract: Mechanisms are provided for efficiently improving a dictionary used for data deduplication. Dictionaries are used to hold hash key and location pairs for deduplicated data. Strong hash keys prevent collisions, but weak hash keys are more efficient in both computation and storage. Mechanisms are provided to use both a weak hash key and a strong hash key. Weak hash keys and their corresponding locations are stored as pairs in an improved dictionary, while strong hash keys are maintained with the deduplicated data itself. The need for uniqueness from a strong hash function is balanced against the dictionary space savings from a weak hash function.
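The split between a weak key in the dictionary and a strong key stored with the data can be sketched as follows. This is an illustration under stated assumptions, not the patented mechanism: the class name `DedupStore`, the choice of a truncated CRC-32 as the weak hash, SHA-256 as the strong hash, and the handling of weak-key collisions by keeping a list of locations per key are all hypothetical.

```python
import hashlib
import zlib


class DedupStore:
    """Toy chunk store: weak keys in the dictionary, strong keys kept with the data."""

    def __init__(self):
        self.index = {}   # weak key -> list of chunk locations (assumed layout)
        self.chunks = []  # location -> (strong key, chunk bytes)

    @staticmethod
    def weak_key(chunk: bytes) -> int:
        # Assumption: a 16-bit truncated CRC-32 stands in for the weak hash.
        return zlib.crc32(chunk) & 0xFFFF

    @staticmethod
    def strong_key(chunk: bytes) -> bytes:
        # Assumption: SHA-256 stands in for the strong hash.
        return hashlib.sha256(chunk).digest()

    def put(self, chunk: bytes) -> int:
        """Store a chunk if it is new and return its location."""
        weak = self.weak_key(chunk)
        strong = self.strong_key(chunk)
        # The weak key narrows the search; the strong key stored next to
        # the data confirms whether a candidate is a true duplicate.
        for loc in self.index.get(weak, []):
            if self.chunks[loc][0] == strong:
                return loc
        self.chunks.append((strong, chunk))
        loc = len(self.chunks) - 1
        self.index.setdefault(weak, []).append(loc)
        return loc

    def get(self, loc: int) -> bytes:
        return self.chunks[loc][1]
```

The dictionary holds only small integer keys and locations, so its footprint stays low, while a false match on the weak key costs one extra comparison against the strong key stored with the chunk.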
Abstract: Embodiments described herein relate to compression and decompression of data consisting of a one-dimensional time series of floating-point numbers. A compressor may comprise a lossless stage and, in some embodiments, a lossy stage as well. The lossy stage quantizes the data by discarding some of the least significant bits, as specified by the user. The lossless stage uses a context mixing algorithm with two bit-wise predictive models whose predictions are combined and fed to an arithmetic coder. One model is a direct context model using the most significant bits of prior numeric samples as context. The other model is the output of an adaptive filter, in which the approximate predicted numeric value is used as context to model the actual value. A corresponding decompressor uses the same lossless model with the arithmetic coder replaced by an arithmetic decoder.
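The lossy stage described above, which discards a user-specified number of least significant bits, can be sketched in a few lines. The abstract does not specify the number format or which bits are dropped, so this sketch assumes IEEE 754 double-precision samples and zeroes the low bits of the mantissa; the function name `discard_low_bits` is hypothetical.

```python
import struct


def discard_low_bits(value: float, bits: int) -> float:
    """Zero the `bits` least significant mantissa bits of a 64-bit float.

    Assumption: samples are IEEE 754 doubles, whose mantissa occupies the
    low 52 bits of the 64-bit pattern.
    """
    if not 0 <= bits <= 52:
        raise ValueError("bits must be between 0 and 52")
    # Reinterpret the double as an unsigned 64-bit integer.
    (raw,) = struct.unpack("<Q", struct.pack("<d", value))
    # Clear the low `bits` bits; sign and exponent are left untouched.
    mask = ~((1 << bits) - 1) & 0xFFFFFFFFFFFFFFFF
    (out,) = struct.unpack("<d", struct.pack("<Q", raw & mask))
    return out
```

Under these assumptions the result is truncated toward zero, and for normal numbers the relative error stays below 2^(bits - 52). The zeroed low bits are also trivially predictable, which is what gives a subsequent lossless stage more to gain.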