Hash Tables (epo) Patents (Class 707/E17.052)
  • Patent number: 9009149
    Abstract: Determining ranked candidate media in response to media query data corresponding to a query media includes receiving the media query data including feature data of the query media, coordinate data, and boundary data, matching the features with corresponding features of a media database using the feature data to identify features in the media database within a predetermined Hamming distance in a hash table from the corresponding features of the query media to obtain matched features in the media database, determining candidate media whose number of matched features exceeds a matched feature threshold, generating a geometry similarity score between the query media and each candidate media using the feature data and the coordinate data, generating a boundary similarity score between the query media and each candidate media using the boundary data, and ranking the candidate media based on the numbers of matched features, the geometry similarity scores and the boundary similarity scores.
    Type: Grant
    Filed: December 6, 2012
    Date of Patent: April 14, 2015
    Assignee: The Trustees of Columbia University in the City of New York
    Inventors: Junfeng He, Shih-Fu Chang, Tai-Hsu Lin
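    Illustrative sketch (not taken from the patent): the matching step in the abstract above can be pictured as a hash-table lookup over every binary code within a small Hamming radius of each query feature, followed by a per-media count and a threshold. The function names, the integer encoding of feature codes, and the bit-flip enumeration are assumptions made for illustration; the geometry and boundary scoring steps are omitted.

```python
from collections import Counter, defaultdict
from itertools import combinations

def neighbors(code, num_bits, radius):
    """Yield every code reachable from `code` by flipping at most `radius` bits."""
    for r in range(radius + 1):
        for positions in combinations(range(num_bits), r):
            flipped = code
            for p in positions:
                flipped ^= 1 << p
            yield flipped

def build_table(database):
    """Hash table: feature code -> list of media ids that contain it."""
    table = defaultdict(list)
    for media_id, codes in database.items():
        for code in codes:
            table[code].append(media_id)
    return table

def candidates(query_codes, table, num_bits, radius, threshold):
    """Return media whose number of matched query features exceeds `threshold`."""
    matched = Counter()
    for q in query_codes:
        hit = set()  # count each media at most once per query feature
        for n in neighbors(q, num_bits, radius):
            hit.update(table.get(n, ()))
        matched.update(hit)
    return {m: c for m, c in matched.items() if c > threshold}

# Hypothetical 4-bit feature codes for two media items.
table = build_table({"A": [0b0000, 0b1111], "B": [0b1010]})
result = candidates([0b0001, 0b1110], table, num_bits=4, radius=1, threshold=1)
```

    Enumerating neighbors is only practical for small radii, since the number of codes to probe grows combinatorially with the radius.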
  • Patent number: 8972366
    Abstract: Embodiments relate to systems and methods for a cloud-based directory system based on hashed values of parent and child storage locations. Platforms and techniques are provided to store a data object to cloud storage resources in two or more locations recorded in a consistent hash structure. A file management tool can store one copy of the data object to a location corresponding to the hashed value of the file path or name, and a second copy to a location corresponding to the hashed value of the parent directory of the data object. All files sharing a common parent directory or other location therefore have at least one copy stored to the same location, in common with the parent. Directory-wide read, write, and/or search operations can therefore be performed more efficiently, since the constituent files of a directory or other location can be accessed from one location rather than distributed locations.
    Type: Grant
    Filed: September 29, 2010
    Date of Patent: March 3, 2015
    Assignee: Red Hat, Inc.
    Inventor: Jeffrey Darcy
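    Illustrative sketch (not taken from the patent): the placement rule in the abstract above stores one copy of an object at the location derived from its own path and a second copy at the location derived from its parent directory, so all siblings share one location. The sketch below assumes SHA-256 and a plain modulo over a fixed number of locations in place of a real consistent hash ring; `location_for`, `placements`, `put`, and the in-memory `store` are hypothetical names.

```python
import hashlib
import posixpath

def location_for(key, num_locations):
    """Map a string to a location index (simplified: modulo, not a hash ring)."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_locations

def placements(path, num_locations):
    """Return (location for the file's path, location for its parent directory)."""
    parent = posixpath.dirname(path)
    return location_for(path, num_locations), location_for(parent, num_locations)

store = {}  # location index -> {path: data}; stands in for cloud storage

def put(path, data, num_locations=16):
    """Write the object to both of its locations (one write if they coincide)."""
    for loc in set(placements(path, num_locations)):
        store.setdefault(loc, {})[path] = data

put("/docs/a.txt", b"A")
put("/docs/b.txt", b"B")
# Every file under /docs has a copy at the directory's own location.
docs_listing = sorted(store[location_for("/docs", 16)])
```

    A directory-wide read then needs to contact only the location for the parent directory rather than one location per file.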
  • Patent number: 8819291
    Abstract: A set of logical extents, each having compressed logical tracks of data, is mapped to a head physical extent and, if the head physical extent is determined to have been filled, to at least one overflow extent having spatial proximity to the head physical extent. Pursuant to at least one subsequent write operation and destage operation, the at least one subsequent write operation and destage operation determined to be associated with the head physical extent, the write operation is mapped to one of the head physical extent, the at least one overflow extent, and an additional extent having spatial proximity to the at least one overflow extent.
    Type: Grant
    Filed: April 19, 2012
    Date of Patent: August 26, 2014
    Assignee: International Business Machines Corporation
    Inventors: Michael T. Benhase, Binny S. Gill, Lokesh M. Gupta, Matthew J. Kalos, Gail A. Spear
  • Publication number: 20140067824
    Abstract: An approach for conversion between database formats (e.g., from a relational database format to a hash table or a “big table” database format) based on user data access patterns in a networked computing environment is provided. A first set of database tables having a first format is identified based on a set of access patterns stored in a computer storage device. A second set of database tables having a second database format corresponding to the first set of database tables may then be provided (e.g., accessed, augmented, and/or generated). A mapping between the first set of database tables and the second set of database tables may then be created. A column set may then be generated based on at least one condition of the set of queries. The column set may then be used as a key for the second set of database tables.
    Type: Application
    Filed: August 30, 2012
    Publication date: March 6, 2014
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Lisa Seacat DeLuca, Yu Deng, Jenny S. Li, Liangzhao Zeng
  • Patent number: 8667180
    Abstract: For facilitating data compression, a set of logical extents, each having compressed logical tracks of data, is mapped to a head physical extent and, if the head physical extent is determined to have been filled, to at least one overflow extent having spatial proximity to the head physical extent. Pursuant to at least one subsequent write operation and destage operation, the at least one subsequent write operation and destage operation determined to be associated with the head physical extent, the write operation is mapped to one of the head physical extent, the at least one overflow extent, and an additional extent having spatial proximity to the at least one overflow extent.
    Type: Grant
    Filed: June 25, 2012
    Date of Patent: March 4, 2014
    Assignee: International Business Machines Corporation
    Inventors: Michael T. Benhase, Binny S. Gill, Lokesh M. Gupta, Matthew J. Kalos, Gail A. Spear
  • Publication number: 20140052736
    Abstract: Techniques are disclosed for implementing custom object-in-memory formats in a data grid network appliance. The techniques include maintaining a record of format definitions on a client device of the data grid and a corresponding record of format definitions on a server device of the data grid. Each format definition may indicate one or more attributes of an object class and data types and byte ranges of the attributes. The client device may serialize one or more objects for storage in the data grid based on respective format definitions associated with the one or more objects and retrieved from the record of format definitions maintained on the client device. Further, the server device may perform one or more data grid operations using format definitions retrieved from the record of format definitions maintained on the server device.
    Type: Application
    Filed: August 14, 2012
    Publication date: February 20, 2014
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Jared H. ANDERSON, Chris D. JOHNSON, Fred A. KULACK, William T. NEWPORT
  • Publication number: 20130332466
    Abstract: Data elements from data sources and having a data value set are linked by using hash functions to determine a dimensionally reduced instance signature for each data element based on all data values associated with that data element to yield a plurality of dimensionally reduced instance signatures of equivalent fixed size such that similarities among the data values in the data value sets across all data elements are maintained among the plurality of instance signatures. Candidate pairs of data elements to link are identified using the plurality of instance signatures in locality sensitive hash functions, and a similarity index is generated for each candidate pair using a pre-determined measure of similarity. Candidate pairs of data elements having a similarity index above a given threshold are linked.
    Type: Application
    Filed: June 8, 2012
    Publication date: December 12, 2013
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Mihaela Ancuta Bornea, Songyun Duan, Achille Belly Fokoue-Nkoutche, Oktie Hassanzadeh, Anastasios Kementsietsidis, Kavitha Srinivas, Michael J. Ward
  • Patent number: 8593321
    Abstract: A computation apparatus includes a range table creation unit configured to create a range table in which a discrete value of a computation result obtained by applying a nonlinear computation on an input value corresponds to a range of the input value which may take the discrete value, and a search unit configured to search, when the input value is input, in the range table, for the range in which the input value is included and output the discrete value corresponding to the searched range.
    Type: Grant
    Filed: September 21, 2009
    Date of Patent: November 26, 2013
    Assignee: Sony Corporation
    Inventors: Yukihiko Mogi, Masato Kamata
  • Publication number: 20130311436
    Abstract: An improved computer system may include a controller including a computer processor. The system may also include a selector apparatus in communication with the controller to choose a table having a higher collision quality index than other tables under consideration by the selector apparatus. The system may further include an exchanger apparatus to configure a standby table that replaces the table chosen by the selector apparatus. The system may additionally include a switch that changes a hash function based upon the exchanger apparatus' replacement of the chosen table to enable the controller to reduce insertion times and/or collisions when interfacing with new components introduced to the controller.
    Type: Application
    Filed: May 19, 2012
    Publication date: November 21, 2013
    Applicant: International Business Machines Corporation
    Inventors: Jean L. Calvignac, Casimer M. DeCusatis, Fabrice J. Verplanken, Daniel Wind
  • Publication number: 20130297580
    Abstract: A node in a data grid receives a prepare request identifying data to lock for a first transaction. The prepare request indicates a locking order that is different from a locking order indicated by a prior prepare request of a second transaction using the same data. The node identifies keys that correspond to the data. The keys are co-located on the node. The node ranks the keys to define an order for acquiring locks for the data based on key identifiers that correspond to the keys. The defined order matches a locking order used by the second transaction. The node acquires locks for the data using the defined order.
    Type: Application
    Filed: May 3, 2012
    Publication date: November 7, 2013
    Applicant: Red Hat, Inc.
    Inventors: Mircea Markus, Manik Surtani
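    Illustrative sketch (not taken from the patent): the idea in the abstract above is that two transactions asking for the same keys in different orders end up acquiring locks in one shared order, which avoids the classic lock-ordering deadlock. The sketch assumes keys are comparable strings ranked by plain sorting; `locks`, `acquire_in_ranked_order`, and `release` are hypothetical names, and the single-process `threading.Lock` stands in for a data grid's lock manager.

```python
import threading

locks = {}  # key identifier -> threading.Lock, created on demand

def acquire_in_ranked_order(requested_keys):
    """Acquire locks sorted by key identifier, whatever order was requested."""
    ordered = sorted(set(requested_keys))
    for key in ordered:
        locks.setdefault(key, threading.Lock()).acquire()
    return ordered

def release(keys):
    """Release in reverse acquisition order."""
    for key in reversed(keys):
        locks[key].release()

# Two transactions name the same data in opposite orders.
first = acquire_in_ranked_order(["b", "a"])
release(first)
second = acquire_in_ranked_order(["a", "b"])
release(second)
```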
  • Patent number: 8566709
    Abstract: A method and apparatus for representing and controlling documents including rich text for Web based applications and browsers is provided so that editing of rich text can be facilitated within the browsers. The rich text is represented in a memory structure so that various formats may be flexibly maintained. Text, images, tables, links and the like are represented in the memory structure, which may be maintained in databases for eventual editing. A controller class and subsidiary classes represent the rich text and provide methods to convert html to the memory structure and back, representing the rich text in a relational database, retrieving the rich text from a relational database, and presenting the rich text for editing. A spell checking facility for the rich text is included.
    Type: Grant
    Filed: November 5, 2010
    Date of Patent: October 22, 2013
    Assignee: International Business Machines Corporation
    Inventor: James R. Wason
  • Publication number: 20130218901
    Abstract: In one embodiment, the correlation filter can use one of several data structures to track each migration unit and reject successive accesses within a period of time to each migration unit. In one embodiment, the correlation filter uses a space efficient data structure, such as a hash indexed correlation array to store the address of referenced migration units, and to filter accesses to a single migration unit that are correlated accesses resulting from multiple accesses to the same migration unit during a sequential I/O stream. In one embodiment, the correlation array contains a global timeout, which resets each element to a default value, clearing all stored migration unit address values from the correlation array. In one embodiment, each element of the migration array can time-out separately.
    Type: Application
    Filed: October 16, 2012
    Publication date: August 22, 2013
    Applicant: Apple Inc.
    Inventor: Apple Inc.
  • Publication number: 20130151535
    Abstract: Indexing a data set of objects, where the data set is partitioned into plural work units with plural objects and distributed to multiple data processing nodes. Each data processing node maps the plural objects in corresponding work units into respective ones of given sub-indexes. A composite index is constructed for the objects in the data set by reducing the mapped objects, where reducing the mapped objects is distributed among multiple data processing nodes.
    Type: Application
    Filed: December 9, 2011
    Publication date: June 13, 2013
    Applicant: CANON KABUSHIKI KAISHA
    Inventors: Dariusz Dusberger, Bradley Denney
  • Publication number: 20130138620
    Abstract: Described are embodiments of an invention for identifying chunk boundaries for optimization of fingerprint-based deduplication in a computing environment. Storage objects that are backed up in a computing environment are often compound storage objects which include many individual storage objects. The computing device of the computing environment breaks the storage objects into chunks of data by determining a hash value on a range of data. The computing device creates an artificial chunk boundary when the end of data of the storage object is reached. When an artificial chunk boundary is created for the end of data of a storage object, the computing device stores a pseudo fingerprint for the artificial chunk boundary. If a hash value matches a fingerprint or a pseudo fingerprint, then the computing device determines that the range of data corresponds to a chunk and the computing system defines the chunk boundaries.
    Type: Application
    Filed: November 28, 2011
    Publication date: May 30, 2013
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Mark L. Yakushev, Mark A. Smith
  • Publication number: 20130138607
    Abstract: Mechanisms are provided for efficient resynchronization of replicated data. A hash value is generated for a chunk of data replicated from a source node to a target node. The chunk of data may be a file deduplicated and compressed at both a source node and a target node. A current sequence number is determined and a sequence number and hash tuple is maintained for the chunk of data at both the source node and the target node. Sequence numbers are modified whenever the data is modified. Current sequence numbers and sequence number and hash values in the sequence number hash tuples at the source node and the target node may be compared to determine whether data is still synchronized at a later point in time or whether data requires resynchronization.
    Type: Application
    Filed: November 29, 2011
    Publication date: May 30, 2013
    Applicant: DELL Products L.P.
    Inventors: Murali Bashyam, Sreekanth Garigala
  • Publication number: 20130117276
    Abstract: In one general aspect, a computer-readable storage medium can be configured to store instructions that when executed cause a processor to perform a process. The instructions can include instructions to receive, at a first device, a target attribute associated with a first user account and to access a code representing the target attribute and including a plurality of values. The instructions can include instructions to send, to the second device, a portion of the code and an indicator of a relative location within the code of the portion of the code, and to receive an indicator from the second device that the portion of the code is included at the relative location within at least one code from a plurality of codes associated with a plurality of attributes associated with a second user account.
    Type: Application
    Filed: November 8, 2011
    Publication date: May 9, 2013
    Applicant: GOOGLE INC.
    Inventors: John Norman Hedditch, Philip John Hutton
  • Publication number: 20130117494
    Abstract: Methods and systems for the optimization of available computing resources within a virtual environment are disclosed. An exemplary method comprises determining the sizes of the computing resources available to the virtual machine and determining optimal data structures for the virtual machine based on the sizes of the computing resources. The optimal data structures may include an indexing data structure and historic data. The method may further comprise allocating a Random Access Memory (RAM) and disk storage to the optimal data structures and configuring the optimal data structures within the RAM and the disk storage. The optimization of data structures involves balancing requirements of the indexing data structure and the historic data.
    Type: Application
    Filed: November 3, 2011
    Publication date: May 9, 2013
    Inventors: David Anthony Hughes, John Burns
  • Publication number: 20130086004
    Abstract: A representation of a new rule, defined as a set of new transition(s), is inserted into a perfect hash table which includes previously placed transitions to generate an updated perfect hash table. This may be done by, for each new transition: (a) hashing the new transition; and (b) if there is no conflict, inserting the hashed new transition into the table. If, however, the hashed new transition conflicts with any of the previously placed transitions, either (A) any transitions of the state associated with the conflicting transition are removed from the table, the hashed new transition is placed into the table, and the removed transitions are re-placed into the table, or (B) any previously placed transitions of the state associated with the new transition are removed, and the transitions of the state associated with the new transition are re-placed into the table.
    Type: Application
    Filed: March 1, 2012
    Publication date: April 4, 2013
    Inventors: H. Jonathan CHAO, Yang Xu
  • Publication number: 20130046767
    Abstract: An apparatus for managing a bucket range of Locality Sensitive Hash is provided. The apparatus includes a range setting unit configured to set bucket ranges of Locality Sensitive Hash by dividing at least one vector based on distribution of data that are projected to the at least one vector.
    Type: Application
    Filed: December 14, 2011
    Publication date: February 21, 2013
    Inventors: Ki-Yong LEE, Seok-Jin Hong
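    Illustrative sketch (not taken from the patent): one way to set bucket ranges from the distribution of projected data is to cut the projection axis at equal-frequency quantiles, so that dense regions get narrow buckets and sparse regions get wide ones. The equal-frequency rule is an assumption chosen for illustration; the abstract above does not say which division rule the apparatus uses. `project`, `bucket_boundaries`, and `bucket_of` are hypothetical names.

```python
import bisect
from collections import Counter

def project(point, vector):
    """Dot product of a data point with the projection vector."""
    return sum(p * v for p, v in zip(point, vector))

def bucket_boundaries(points, vector, num_buckets):
    """Pick boundaries at equal-frequency quantiles of the projected values."""
    values = sorted(project(p, vector) for p in points)
    return [values[(i * len(values)) // num_buckets] for i in range(1, num_buckets)]

def bucket_of(point, vector, boundaries):
    """Index of the bucket whose range contains the point's projection."""
    return bisect.bisect_right(boundaries, project(point, vector))

# Two tight clusters far apart: fixed-width buckets would leave most buckets empty.
points = [(x, 0.0) for x in (0, 1, 2, 3, 100, 101, 102, 103)]
vector = (1.0, 0.0)
boundaries = bucket_boundaries(points, vector, num_buckets=4)
sizes = Counter(bucket_of(p, vector, boundaries) for p in points)
```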
  • Publication number: 20120303634
    Abstract: Systems and methods of managing an in-memory data grid (IMDG) may involve conducting a data distribution analysis of the IMDG on a periodic basis, and selecting a hash scheme from a plurality of hash schemes based on the data distribution analysis. In one example, the selected hash scheme is used to conduct a repopulation of the IMDG, wherein the repopulation increases the distribution evenness of database records across the IMDG.
    Type: Application
    Filed: May 25, 2011
    Publication date: November 29, 2012
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Snehal S. Antani, Kulvir S. Bhogal, Nitin Gaur, Chris D. Johnson
  • Publication number: 20120265765
    Abstract: An improved self-indexer comprising a find function that caches a last found position and occurrence count of a symbol on each node level of a word-based wavelet tree for a particular symbol lookup and only uses a select function to call on data to the right of the position.
    Type: Application
    Filed: April 14, 2011
    Publication date: October 18, 2012
    Applicant: ATBROX, AS
    Inventor: Amund Tveit
  • Publication number: 20120265766
    Abstract: For facilitating data compression, a set of logical extents, each having compressed logical tracks of data, is mapped to a head physical extent and, if the head physical extent is determined to have been filled, to at least one overflow extent having spatial proximity to the head physical extent. Pursuant to at least one subsequent write operation and destage operation, the at least one subsequent write operation and destage operation determined to be associated with the head physical extent, the write operation is mapped to one of the head physical extent, the at least one overflow extent, and an additional extent having spatial proximity to the at least one overflow extent.
    Type: Application
    Filed: June 25, 2012
    Publication date: October 18, 2012
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Michael T. BENHASE, Binny S. GILL, Lokesh M. GUPTA, Matthew J. KALOS, Gail A. SPEAR
  • Patent number: 8290919
    Abstract: A system and method for distributing and accessing files in a distributed storage system uses an ordered list of the storage nodes in the system to determine the storage node on which a file is stored. The distributed storage system includes a cluster of storage nodes and may also include one or more client nodes that participate in the system as storage resources. Each node (client and storage) stores an ordered list of the storage nodes in the system, allowing any of the nodes to access the file. The list is updated whenever a new storage node is added to the system, an existing storage node is removed from the system, or a new storage node is swapped with an existing storage node. Each one of the nodes may independently compute a new mapping of files to the storage nodes when the ordered list is changed.
    Type: Grant
    Filed: August 27, 2010
    Date of Patent: October 16, 2012
    Assignee: Disney Enterprises, Inc.
    Inventors: Sean A. Kelly, Roger B. Milne
  • Publication number: 20120259825
    Abstract: A data management system respectively computes first hash values while sliding a window a prescribed amount at a time with respect to a prescribed range from a start location of a data block to a prescribed size. The system extracts, from among the first hash values, a first hash value, which is equivalent to a characteristic value, and partitions the data block into a first chunk of data at a location corresponding to this first hash value. The system determines coincidence between a first chunk of data and a stored second chunk of data, and prevents duplicate data from being stored twice.
    Type: Application
    Filed: April 11, 2011
    Publication date: October 11, 2012
    Inventors: Naomitsu Tashiro, Taizo Hori, Motoaki Iwasaki
  • Publication number: 20120254193
    Abstract: A computer-implemented method for processing input data in a mapreduce framework includes: receiving, in the mapreduce framework, a data processing request for input data; initiating, based on the data processing request, a map operation on the input data by multiple mappers in the mapreduce framework, each of the mappers using an aggregator to partially aggregate the input data into one or more intermediate key/value pairs; initiating a reduce operation on the intermediate key/value pairs by multiple reducers in the mapreduce framework, wherein, without sorting the intermediate key/value pairs, those of the intermediate key/value pairs with a common key are handled by a same one of the reducers, each of the reducers using the aggregator to aggregate the intermediate key/value pairs into one or more output values; and providing the output values in response to the data processing request.
    Type: Application
    Filed: April 1, 2011
    Publication date: October 4, 2012
    Applicant: Google Inc.
    Inventors: Biswapesh Chattopadhyay, Liang Lin, Weiran Liu, Marián Dvorský
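    Illustrative sketch (not taken from the patent): the flow in the abstract above can be pictured as mappers that partially aggregate their input, a shuffle that routes each key to one reducer by hashing instead of sorting, and reducers that finish the aggregation with the same aggregator. The single-process functions below, their names, and the use of Python's built-in `hash` for routing are assumptions made for illustration.

```python
from collections import defaultdict
from functools import reduce

def map_with_partial_aggregation(records, aggregate):
    """Mapper: fold values with the same key into one intermediate pair."""
    partial = {}
    for key, value in records:
        partial[key] = aggregate(partial[key], value) if key in partial else value
    return partial

def shuffle_by_hash(partials, num_reducers):
    """Route every key to a single reducer by hash; nothing is sorted."""
    inboxes = [defaultdict(list) for _ in range(num_reducers)]
    for partial in partials:
        for key, value in partial.items():
            inboxes[hash(key) % num_reducers][key].append(value)
    return inboxes

def reduce_all(inboxes, aggregate):
    """Reducers: finish aggregation with the same aggregator the mappers used."""
    output = {}
    for inbox in inboxes:
        for key, values in inbox.items():
            output[key] = reduce(aggregate, values)
    return output

def add(a, b):
    return a + b

# Word-count style input split across two hypothetical mappers.
splits = [[("a", 1), ("b", 1), ("a", 1)], [("a", 1), ("c", 1)]]
partials = [map_with_partial_aggregation(s, add) for s in splits]
totals = reduce_all(shuffle_by_hash(partials, num_reducers=3), add)
```

    The aggregator must be associative and commutative for the partial results to combine correctly in any arrival order.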
  • Patent number: 8266325
    Abstract: A set of logical extents, each having compressed logical tracks of data, is mapped to a head physical extent and, if the head physical extent is determined to have been filled, to at least one overflow extent having spatial proximity to the head physical extent. Pursuant to at least one subsequent write operation and destage operation, the at least one subsequent write operation and destage operation determined to be associated with the head physical extent, the write operation is mapped to one of the head physical extent, the at least one overflow extent, and an additional extent having spatial proximity to the at least one overflow extent.
    Type: Grant
    Filed: February 5, 2010
    Date of Patent: September 11, 2012
    Assignee: International Business Machines Corporation
    Inventors: Michael T. Benhase, Binny S. Gill, Lokesh M. Gupta, Matthew J. Kalos, Gail A. Spear
  • Publication number: 20120226672
    Abstract: In deduplicating data including objects, the system obtains information of the location of the objects and uses the information in calculating the hash value. The hash value calculation program divides data from the boundary location into chunks to match the boundary location of the objects subject to deduplication and the hash value is calculated from each chunk.
    Type: Application
    Filed: March 1, 2011
    Publication date: September 6, 2012
    Applicant: HITACHI, LTD.
    Inventors: Shinichi Hayashi, Tomohiro Kawaguchi
  • Publication number: 20120226699
    Abstract: Systems and methods of deduplicating while loading index entries are disclosed. An example method includes loading a first group of index entries into an index. The example method also includes deduplicating data using the index before loading the first group of index entries is completed.
    Type: Application
    Filed: March 3, 2011
    Publication date: September 6, 2012
    Inventor: Mark David Lillibridge
  • Publication number: 20120221524
    Abstract: In one example, a method may include performance of a hash function on a digital sequence so as to generate a hash value that corresponds to the digital sequence. Next, the digital sequence may be broken into data pieces, and each data piece hashed to produce a corresponding hash value for each data piece. Then, a recipe may be produced that includes instructions which, when executed, may generate the digital sequence from the data pieces referenced by their corresponding hash values included in the recipe. Among other things, the hash values may enable reutilization of redundant data sequences by serving as pointers to the data pieces that the hash values respectively represent.
    Type: Application
    Filed: May 8, 2012
    Publication date: August 30, 2012
    Applicant: EMC CORPORATION
    Inventors: Scott C. Auchmoody, Eric W. Olsen
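    Illustrative sketch (not taken from the patent): the recipe idea in the abstract above amounts to hashing the whole sequence, splitting it into pieces, storing each distinct piece once under its hash, and keeping the ordered list of piece hashes as the recipe for rebuilding. The sketch assumes SHA-256 and fixed-size pieces; the abstract does not specify either, and `store_sequence` and `rebuild` are hypothetical names.

```python
import hashlib

def digest_of(data):
    """Hex SHA-256 digest used as the content address (an assumed choice)."""
    return hashlib.sha256(data).hexdigest()

def store_sequence(data, piece_size, piece_store):
    """Store distinct pieces once; return (hash of whole sequence, recipe)."""
    recipe = []
    for start in range(0, len(data), piece_size):
        piece = data[start:start + piece_size]
        piece_hash = digest_of(piece)
        piece_store.setdefault(piece_hash, piece)  # redundant pieces are reused
        recipe.append(piece_hash)
    return digest_of(data), recipe

def rebuild(recipe, piece_store):
    """Regenerate the sequence by following the hashes in the recipe."""
    return b"".join(piece_store[piece_hash] for piece_hash in recipe)

piece_store = {}
data = b"abcdabcdabcdXY"
whole_hash, recipe = store_sequence(data, piece_size=4, piece_store=piece_store)
restored = rebuild(recipe, piece_store)
```

    Here the three repeated `abcd` pieces occupy a single entry in the store, while the recipe still lists four hashes.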
  • Publication number: 20120215789
    Abstract: The overall architecture and details of a scalable video fingerprinting and identification system that is robust with respect to many classes of video distortions is described. In this system, a fingerprint for a piece of multimedia content is composed of a number of compact signatures, along with traversal hash signatures and associated metadata. Numerical descriptors are generated for features found in a multimedia clip, signatures are generated from these descriptors, and a reference signature database is constructed from these signatures. Query signatures are also generated for a query multimedia clip. These query signatures are searched against the reference database using a fast similarity search procedure, to produce a candidate list of matching signatures. This candidate list is further analyzed to find the most likely reference matches. Signature correlation is performed between the likely reference matches and the query clip to improve detection accuracy.
    Type: Application
    Filed: May 3, 2012
    Publication date: August 23, 2012
    Applicant: ZEITERA, LLC
    Inventors: Prashant Ramanathan, Jose Pio Pereira, Shashank Merchant, Mihailo M. Stojancic
  • Publication number: 20120197901
    Abstract: Systems and methods are disclosed which enable the establishment of file dates and the absence of tampering, even for documents held in secrecy and those stored in uncontrolled environments, but which does not require trusting a timestamping authority or document archival service. A trusted timestamping authority (TTSA) may be used, but even if the TTSA loses credibility or a challenger refuses to acknowledge the validity of a timestamp, a date for an electronic document may still be established. Systems and methods are disclosed which enable detection of file duplication in large collections of documents, which can improve searching for documents within the large collection.
    Type: Application
    Filed: November 27, 2011
    Publication date: August 2, 2012
    Inventor: Kelce S. Wilson
  • Patent number: 8234259
    Abstract: A computerized method of adjudicating text against a policy includes receiving one or more system policies, creating a system datastructure for each received system policy, receiving an input message comprising a text to be adjudicated, selecting a system policy from the one or more received system policies based on the input message, and processing the text to be adjudicated and the system datastructure corresponding to the selected system policy to determine if a prohibited word is present in the text to be adjudicated. The one or more system policies include one or more prohibited words and a first hit value corresponding to each prohibited word. The system datastructure includes a plurality of linked lists corresponding to the letters of the alphabet and a head linked list operable to store one or more found prohibited words.
    Type: Grant
    Filed: May 8, 2009
    Date of Patent: July 31, 2012
    Assignee: Raytheon Company
    Inventors: Randall S. Brooks, Ricardo J. Rodriguez, Sylvia A. Traxler
  • Publication number: 20120191724
    Abstract: Techniques for storage of data objects based on a time of creation are disclosed. A computing device may receive a request to store a data object and, in response, identify a particular storage location that maintains data for the interval of time including a time of creation of the data object.
    Type: Application
    Filed: January 26, 2011
    Publication date: July 26, 2012
    Inventors: Joseph A. Tucek, Eric A. Anderson
  • Publication number: 20120179691
    Abstract: Technologies are generally described for methods, instructions, and client applications for device discovery in a ubiquitous computing environment. In some examples, the methods, instructions, and client applications may facilitate the organization of features of devices in a ubiquitous computing environment into a series of hierarchical hash numbers, the ordering of the hierarchical hash numbers corresponding to the respective devices, and the searching for a particular one of the devices by attempting to match hashed search criteria to the ordered hierarchical hash numbers at one of the devices in the ubiquitous computing environment.
    Type: Application
    Filed: December 17, 2010
    Publication date: July 12, 2012
    Applicant: EMPIRE TECHNOLOGY DEVELOPMENT LLC
    Inventors: Junwei Cao, Zhen Wang
  • Publication number: 20120173497
    Abstract: Defense-in-Depth security defines a set of graduated security tasks, each of which performs a task that must complete before another task can complete. Only when these tasks complete successfully and in the order prescribed by Defense-in-Depth security criteria is a final process allowed to execute. Through such Defense-in-Depth security measures, vulnerable software, such as bytecode, can be verified as unaltered and executed in a secure environment that prohibits unsecured access to the underlying code.
    Type: Application
    Filed: December 22, 2011
    Publication date: July 5, 2012
    Applicant: CISCO TECHNOLOGY, INC.
    Inventors: Sreenivas Devalla, Satyanarayana DV Raju, Sridhararao V. Kothe, Nakka Siva Kishore Kumar
  • Publication number: 20120158736
    Abstract: Techniques for mapping a virtual R-Tree to an extensible-hash based file system for databases are provided. Spatial data is identified within an existing file system, which stores data for a database. Rows of the spatial data are organized into collections; each collection represents a virtual block. The virtual blocks are used to form an R-Tree spatial index that overlays an existing index for the database on the existing file system. Each row within its particular virtual block includes a pointer to its native storage location within the existing file system.
    Type: Application
    Filed: December 20, 2010
    Publication date: June 21, 2012
    Applicant: Teradata US, Inc.
    Inventor: Gregory Howard Milby
  • Publication number: 20120150869
    Abstract: A method for creating an index of data blocks, applicable in a data de-duplication procedure, includes: loading an index file, where the index file includes a plurality of location blocks, each location block includes a plurality of storage fields, and each storage field records a primary Hash value corresponding to a data block; performing a first Hash procedure on the primary Hash value of the data block to calculate a block number; performing a second Hash procedure on the primary Hash value of the same data block to calculate a field number; loading a location conflict list; comparing the calculated field number with the field numbers in the location conflict list to determine whether the same field number is stored there; and writing the primary Hash value into the storage field identified by the block number and the field number if the field number does not exist in the location conflict list.
    Type: Application
    Filed: December 10, 2010
    Publication date: June 14, 2012
    Applicant: INVENTEC CORPORATION
    Inventors: Yun Song Wang, Ming Sheng Zhu, Chih Feng Chen
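The two-stage lookup described in this abstract can be illustrated with a minimal Python sketch. It is not taken from the application: the table sizes, the use of SHA-256 as the primary hash, the byte slices used for the two hash procedures, and the names `new_index` and `add_block` are all assumptions made for illustration.

```python
import hashlib

NUM_BLOCKS = 8          # assumed number of location blocks in the index file
FIELDS_PER_BLOCK = 16   # assumed number of storage fields per location block


def primary_hash(data):
    """Primary hash of a data block (SHA-256 is an assumption)."""
    return hashlib.sha256(data).digest()


def block_number(primary):
    """First hash procedure: derive a block number from the primary hash."""
    return int.from_bytes(primary[:4], "big") % NUM_BLOCKS


def field_number(primary):
    """Second hash procedure: derive a field number from the same primary hash."""
    return int.from_bytes(primary[4:8], "big") % FIELDS_PER_BLOCK


def new_index():
    """An empty index: NUM_BLOCKS location blocks of empty storage fields."""
    return [[None] * FIELDS_PER_BLOCK for _ in range(NUM_BLOCKS)]


def add_block(index, conflict_list, data):
    """Write the primary hash unless its field number is on the conflict list."""
    primary = primary_hash(data)
    block, field = block_number(primary), field_number(primary)
    if field in conflict_list:
        return False  # how conflicts are resolved is outside this sketch
    index[block][field] = primary
    return True
```

Here `conflict_list` is modeled as a plain set of field numbers; what happens to an entry whose field number is on the list is not covered by the abstract and is left out.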
  • Publication number: 20120117080
    Abstract: Embodiments are directed to indexing and querying a sequence of hash values in an indexing matrix. A computer system accesses a document to extract a portion of text from the document. The computer system applies a hashing algorithm to the extracted text. The hash values of the extracted text form a representative sequence of hash values. The computer system inserts each hash value of the sequence of hash values into an indexing matrix, which is configured to store multiple different hash value sequences. The computer system also queries the indexing matrix to determine how similar the plurality of hash value sequences are to the selected hash value sequence based on how many hash values of the selected hash value sequence overlap with the hash values of the plurality of stored hash value sequences.
    Type: Application
    Filed: November 10, 2010
    Publication date: May 10, 2012
    Applicant: MICROSOFT CORPORATION
    Inventors: Charles William Lamanna, Mauktik H. Gandhi, Jason Eric Brewer
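As a rough illustration of overlap-based similarity between hash value sequences, the following Python sketch hashes word shingles and counts shared hash values. The shingle width, the truncated SHA-256 hash, and the dictionary standing in for the "indexing matrix" are assumptions, not details from the application.

```python
import hashlib
from collections import defaultdict


def hash_sequence(text, width=3):
    """Hash each run of `width` consecutive words into a 32-bit value."""
    words = text.lower().split()
    count = max(1, len(words) - width + 1)
    shingles = [" ".join(words[i:i + width]) for i in range(count)]
    return [int.from_bytes(hashlib.sha256(s.encode()).digest()[:4], "big")
            for s in shingles]


class HashIndex:
    """Stand-in for the indexing matrix: hash value -> ids of stored sequences."""

    def __init__(self):
        self.rows = defaultdict(set)

    def insert(self, seq_id, sequence):
        for value in sequence:
            self.rows[value].add(seq_id)

    def overlap(self, sequence):
        """Count, per stored sequence, how many query hash values it shares."""
        counts = defaultdict(int)
        for value in set(sequence):
            for seq_id in self.rows.get(value, ()):
                counts[seq_id] += 1
        return dict(counts)
```

A stored sequence with a higher overlap count is treated as more similar to the query sequence; any normalization of that count is left open here.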
  • Publication number: 20120089612
    Abstract: A system for real-time document indexing is provided that includes a browser that is executing on a client system. The browser includes functionalities allowing it to communicate with a remote computer system. A query interface executes within the framework of the browser. The query interface receives one or more query searches from an end-user and sends the one or more query searches to be processed by the remote computer system. The remote computer system sends to the query interface the results of the one or more query searches via the browser. The query interface assigns the results of the one or more query searches to a folder where the folder includes a unique identifier. The query interface indexes the results of the one or more query searches to the unique identifier of the folder.
    Type: Application
    Filed: September 28, 2011
    Publication date: April 12, 2012
    Applicant: NOLIJ CORPORATION
    Inventors: John J. Collins, Sean J. Langford
  • Publication number: 20120078863
    Abstract: Systems and methods for performing application control constraint enforcement are provided. According to one embodiment, file system or operating system activity of a computer system is intercepted relating to a code module. A cryptographic hash value of the code module is checked against a local whitelist database containing cryptographic hash values of approved code modules, which are known not to contain viruses or malicious code. The local whitelist database also contains execution constraint information. When the cryptographic hash value matches one of the cryptographic hash values of approved code modules, authority of the computer system or an end user of the computer system to execute the code module is further validated if the execution constraint information so indicates by performing a constraint check regarding the code module. If the authority is affirmed by the constraint check, then the code module is allowed to be executed.
    Type: Application
    Filed: December 6, 2011
    Publication date: March 29, 2012
    Applicant: FORTINET, INC.
    Inventors: Andrew F. Fanton, John J. Gandee, William H. Lutton, Edwin L. Harper, Kurt E. Godwin, Anthony A. Rozga
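A minimal Python sketch of a whitelist check followed by a constraint check is shown below. The whitelist layout (digest mapped to an optional set of permitted users), the SHA-256 hash, and the decision to deny modules that are not on the whitelist are assumptions for illustration only.

```python
import hashlib


def may_execute(code, whitelist, user):
    """Return True if the code module is approved and the user passes its constraint.

    `whitelist` maps a hex digest to either None (no execution constraint)
    or a set of user names allowed to run that module (hypothetical layout).
    """
    digest = hashlib.sha256(code).hexdigest()
    if digest not in whitelist:
        return False  # assumed policy: unknown modules are not run
    allowed_users = whitelist[digest]
    if allowed_users is None:
        return True   # approved module with no further constraint
    return user in allowed_users  # constraint check
```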
  • Patent number: 8144994
    Abstract: A plurality of type identifiers are stored, each of which contains one or more image identifiers, each image identifier identifying one of a plurality of reference images, and each of which thereby identifies a type of document. Then, it is determined whether each of a plurality of obtained document images is similar to a reference image. When a document image is determined as being similar to a reference image, an image identifier that identifies the reference image is selected from among a plurality of image identifiers. Then, a type identifier is identified that contains the selected image identifier. Then, document images each similar to a reference image are classified for each identified type identifier.
    Type: Grant
    Filed: November 9, 2007
    Date of Patent: March 27, 2012
    Assignee: Sharp Kabushiki Kaisha
    Inventor: Yohsuke Konishi
  • Publication number: 20120041958
    Abstract: Prefixes are registered on a first list as index elements for respective registration patterns. Each prefix is selected as the longest of different-length prefixes that are extractable from a registration pattern in accordance with an extraction rule. Suffixes, which are the remaining parts of the registration patterns excluding the respective prefixes, are registered on a second list. Using different-length prefixes that are extracted from a retrieval key in accordance with the extraction rule, a prefix retriever searches the first list to retrieve a registration pattern whose prefix matches any of the prefixes of the retrieval key. A suffix checker carries out a check on the suffix of the registration pattern retrieved by the prefix retriever, among the suffixes on the second list, as to whether the suffix of the registration pattern matches the suffix of the retrieval key.
    Type: Application
    Filed: October 19, 2011
    Publication date: February 16, 2012
    Applicant: NEC CORPORATION
    Inventor: Akihiro MOTOKI
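The prefix list and suffix list described here can be sketched in Python as follows. The extraction rule (fixed prefix lengths of 8, 4, and 2 characters), the exact-match suffix comparison, and the class and method names are hypothetical choices, not details from the application.

```python
PREFIX_LENGTHS = (8, 4, 2)  # assumed extraction rule, longest first


def extract_prefixes(text):
    """All prefixes the assumed rule can extract from `text`, longest first."""
    return [text[:n] for n in PREFIX_LENGTHS if len(text) >= n]


class PatternIndex:
    def __init__(self):
        self.first_list = {}   # prefix -> ids of patterns registered under it
        self.second_list = {}  # pattern id -> suffix (pattern minus its prefix)

    def register(self, pattern_id, pattern):
        prefixes = extract_prefixes(pattern)
        if not prefixes:
            raise ValueError("pattern too short for the extraction rule")
        prefix = prefixes[0]  # the longest extractable prefix
        self.first_list.setdefault(prefix, []).append(pattern_id)
        self.second_list[pattern_id] = pattern[len(prefix):]

    def lookup(self, key):
        """Prefix retrieval over all key prefixes, then a suffix check."""
        hits = []
        for prefix in extract_prefixes(key):
            for pattern_id in self.first_list.get(prefix, ()):
                if self.second_list[pattern_id] == key[len(prefix):]:
                    hits.append(pattern_id)
        return hits
```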
  • Publication number: 20120036134
    Abstract: In one embodiment, the present invention includes a method for allocating a second number of buckets for a hash table shared concurrently by a plurality of threads, where the second number of buckets are logically mapped onto a corresponding parent one of the first number of buckets, and publishing an updated capacity of the hash table to complete the allocation, without performing any rehashing, such that the rehashing can later be performed in an on-demand, per bucket basis. Other embodiments are described and claimed.
    Type: Application
    Filed: April 8, 2009
    Publication date: February 9, 2012
    Applicant: INTEL CORPORATION
    Inventor: Anton Malakhov
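The idea of growing a table without immediate rehashing can be sketched in single-threaded Python. Concurrency control, which is central to the application, is deliberately omitted, and the class name, the doubling growth policy, and the power-of-two masking scheme are assumptions.

```python
class LazyRehashTable:
    """Hash table that doubles its capacity and rehashes buckets on demand."""

    def __init__(self, capacity=4):
        assert capacity > 0 and capacity & (capacity - 1) == 0  # power of two
        self.capacity = capacity
        self.buckets = [[] for _ in range(capacity)]
        self.rehashed = [True] * capacity

    def grow(self):
        """Allocate new buckets and publish the new capacity; move nothing yet."""
        old = self.capacity
        self.buckets.extend([] for _ in range(old))
        self.rehashed.extend([False] * old)
        self.capacity = old * 2

    def _ensure(self, i):
        """Rehash bucket i from its parent bucket the first time it is touched."""
        if self.rehashed[i]:
            return
        top = 1 << (i.bit_length() - 1)
        parent = i - top              # bucket i is logically mapped onto this one
        self._ensure(parent)
        mask = (top << 1) - 1
        stay, move = [], []
        for item in self.buckets[parent]:
            (move if (hash(item[0]) & mask) == i else stay).append(item)
        self.buckets[parent] = stay
        self.buckets[i] = move
        self.rehashed[i] = True

    def _bucket(self, key):
        i = hash(key) & (self.capacity - 1)
        self._ensure(i)
        return self.buckets[i]

    def put(self, key, value):
        bucket = self._bucket(key)
        for n, (k, _) in enumerate(bucket):
            if k == key:
                bucket[n] = (key, value)
                return
        bucket.append((key, value))

    def get(self, key, default=None):
        for k, v in self._bucket(key):
            if k == key:
                return v
        return default
```

After `grow()`, entries stay in their parent buckets until a lookup or insert first touches a new bucket, at which point only that bucket (and any unrehashed ancestors) is split.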
  • Publication number: 20120016882
    Abstract: Example apparatus, methods, and computers control processing delta chunks with delta hashes. One example method includes computing a first hash for a chunk for which a duplicate determination is to be made. The first hash is suitable for making the duplicate chunk determination. The method also includes computing a delta hash for the chunk. The delta hash is suitable for making a delta chunk determination. The method controls a de-duplication logic to process the chunk as a duplicate upon determining that the first hash matches a stored first hash. The method controls the de-duplication logic to process the chunk as a delta chunk upon determining that the first hash does not match a stored first hash and that the delta hash matches a stored delta hash. Processing a chunk as a delta chunk may include storing a reference to a stored chunk and storing delta hash information.
    Type: Application
    Filed: July 14, 2011
    Publication date: January 19, 2012
    Applicant: QUANTUM CORPORATION
    Inventor: Jeffrey Vincent TOFANO
  • Publication number: 20120016883
    Abstract: Methods, systems and apparatus, including computer program products, for enhancing query performance through fixed length hashing of multidimensional data. According to one method, a fixed length hash of a multidimensional data record is created where the hash has respective fixed length sections for each data dimension of the record being hashed. The composite fixed length hash is stored with a reference to the original data record to which it corresponds. Query parameters are hashed and compared to a corresponding section of the fixed length hash to determine a set of candidate records.
    Type: Application
    Filed: September 26, 2011
    Publication date: January 19, 2012
    Applicant: GOOGLE INC.
    Inventors: Sagnik Nandy, Jonathon A. Vance, Jan Matthias Ruhl
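A composite hash with one fixed-length section per dimension can be sketched in Python as below. The two-byte section width, the truncated SHA-256 hash, and the list-of-pairs index are assumptions chosen to keep the example short.

```python
import hashlib

SECTION_BYTES = 2  # assumed fixed length of each per-dimension section


def section(value):
    """Fixed-length hash of a single dimension value."""
    return hashlib.sha256(repr(value).encode()).digest()[:SECTION_BYTES]


def composite_hash(record):
    """Concatenate one fixed-length section per dimension of the record."""
    return b"".join(section(value) for value in record)


def candidates(index, dimension, value):
    """References whose section for `dimension` equals the hashed query value.

    `index` is a list of (composite_hash, reference) pairs. Sections are
    short, so matches are only candidates and may include false positives.
    """
    wanted = section(value)
    start = dimension * SECTION_BYTES
    return [ref for h, ref in index if h[start:start + SECTION_BYTES] == wanted]
```

Because each section sits at a known offset, a query on one dimension compares only that slice of the stored hash; candidates would then be verified against the original records.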
  • Publication number: 20120016852
    Abstract: Example apparatus, methods, and computers participate in collaborative, distributed, data de-duplication. One example method includes initializing a layered parser in a first node in a collaborative distributed data de-duplication (CDDD) topology with a first set of de-duplication control parameters. After transmitting some information to another node in the CDDD topology, the method includes selectively reconfiguring the layered parser in response to feedback acquired from the second node in the CDDD topology. The feedback concerns the data provided by the layered parser.
    Type: Application
    Filed: July 14, 2011
    Publication date: January 19, 2012
    Applicant: QUANTUM CORPORATION
    Inventor: Jeffrey Vincent TOFANO
  • Publication number: 20120011128
    Abstract: A value is computed for a feature in an instance of query content and compared to a threshold value. Based on the comparison, first and second bits in a hash value, which is derived from the query content feature, are determined. Conditional probability values are computed for the likelihood that quantized values of the first and the second bits equal corresponding quantized bit values of a target or reference feature value. The conditional probabilities are compared and a relative strength determined for the first and second bits, which directly corresponds to the conditional probability. The bit with the lowest bit strength is selected as the weakbit. The value of the weakbit is toggled to generate a variation of the query hash value. The query may be extended using the query hash value variation.
    Type: Application
    Filed: June 30, 2011
    Publication date: January 12, 2012
    Applicant: DOLBY LABORATORIES LICENSING CORPORATION
    Inventors: Junfeng He, Regunathan Radhakrishnan, Wenyu Jiang
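The weakbit idea can be illustrated with a simplified Python sketch. Note the substitution: the abstract ranks bits by conditional probabilities, whereas this sketch uses the distance of each feature value from its threshold as a stand-in for bit strength. The function names are hypothetical.

```python
def quantize(features, thresholds):
    """One hash bit per feature: 1 if the value exceeds its threshold."""
    return [1 if f > t else 0 for f, t in zip(features, thresholds)]


def weakest_bit(features, thresholds):
    """Index of the bit whose feature value lies closest to its threshold.

    Distance to the threshold is a simplified proxy for bit strength, not
    the conditional-probability measure described in the abstract.
    """
    margins = [abs(f - t) for f, t in zip(features, thresholds)]
    return min(range(len(margins)), key=margins.__getitem__)


def query_variations(features, thresholds):
    """Return the query hash and a variation with the weakest bit toggled."""
    bits = quantize(features, thresholds)
    flipped = list(bits)
    flipped[weakest_bit(features, thresholds)] ^= 1
    return bits, flipped
```

Searching with both returned hashes extends the query to cover the case where the least reliable bit was quantized differently in the reference.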
  • Publication number: 20120005197
    Abstract: Apparatus, systems and methods for a search filtering hash map are disclosed. Terms are designated as filtering terms, wherein at least one of the filtering terms includes only one component term, and at least one of the filtering terms includes a plurality of component terms in ordinal positions. A keyword hash map is generated for keywords, each keyword being one of the component terms and being mapped to one or more corresponding hashes in the hash map, and each corresponding hash having a corresponding level and a corresponding status, wherein each level corresponds to an ordinal position of its corresponding component term in a filtering term, and wherein each status designates its corresponding component term in the hash map as one of a filtering term or unfiltered term. The keyword hash map is stored in a memory storage system accessible by a data processing apparatus.
    Type: Application
    Filed: September 16, 2011
    Publication date: January 5, 2012
    Applicant: GOOGLE INC.
    Inventors: Jungho Ahn, Justin J. Tansuwan, Junyoung Lee
  • Publication number: 20110264669
    Abstract: The invention discloses a method for compressing a .net file, characterized by at least one of the following steps: obtaining and compressing reference type in a .net file; obtaining and compressing definition method in a .net file; obtaining and compressing method body of the definition method in a .net file; obtaining and compressing Namespace in a .net file; obtaining and compressing definition type in a .net file. By compressing the .net file, the invention efficiently reduces the storage space occupied by the .net file, and allows it to be stored in a small-sized medium, such as a smart card.
    Type: Application
    Filed: December 29, 2010
    Publication date: October 27, 2011
    Inventors: Zhou Lu, Huazhang Yu
  • Publication number: 20110246480
    Abstract: A system and method for interacting with a plurality of data sources are provided. A request may be parsed and an identification parameter identifying a data set may be determined. A field included in the request may be designated as a distribution key. At least one data source may be selected based on a value associated with the distribution key. At least a portion of the request may be sent to a selected data source. Other embodiments are described and claimed.
    Type: Application
    Filed: March 31, 2011
    Publication date: October 6, 2011
    Inventors: Doron Levari, Liran Zelkha
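Selecting a data source from a distribution key value can be sketched in a few lines of Python. The CRC-32 modulo scheme, the dictionary-shaped request, and the name `select_source` are assumptions; the abstract does not specify how the value is mapped to a source.

```python
import zlib


def select_source(request, distribution_key, sources):
    """Pick a data source from the value of the request's distribution key.

    Uses CRC-32 modulo the number of sources (an assumed mapping), so the
    same key value is always routed to the same source.
    """
    value = str(request[distribution_key]).encode()
    return sources[zlib.crc32(value) % len(sources)]
```

For example, `select_source({"customer_id": 42}, "customer_id", ["db0", "db1", "db2"])` returns one of the three names, and returns the same one on every call with that key value.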