Data Cleansing, Data Scrubbing, And Deleting Duplicates Patents (Class 707/692)
  • Patent number: 10452318
    Abstract: Systems and methods for recording and playback of multiple data streams. One device includes a storage controller coupled to an electronic storage device, a first data buffer storing data received from a first data stream, a second data buffer storing data received from a second data stream, a fragment buffer storing fragment metadata, a storage buffer including a plurality of data fragments, and an electronic processor. The electronic processor receives information designating a data stream storage area of the electronic storage device. The electronic processor arbitrates between the first and second data buffers to select a data fragment for writing to the storage buffer. The electronic processor writes the data fragment to the storage buffer, and writes fragment metadata defining the data fragment to the fragment buffer. The electronic processor controls the storage controller to sequentially write from the plurality of data fragments to the data stream storage area.
    Type: Grant
    Filed: December 21, 2017
    Date of Patent: October 22, 2019
    Assignee: MOTOROLA SOLUTIONS, INC.
    Inventors: Adrian Guillen, Joel Hegberg, Chet A. Lampert
  • Patent number: 10430383
    Abstract: In one example, a method for processing data includes receiving information that identifies an ad hoc group of size ‘n’ of files F1 . . . Fn, each file F including a respective file sequence S that includes K data segments. Next, each file sequence S is sampled to obtain a sequence SS of data segments from the file sequence S, and a non-random sampling of data segments is sampled from each sequence SS to obtain a set SSU of the sequence SS. The data segments of each set SSU are then sampled to obtain a sample subset SSUS of the set SSU, and a compression ratio is determined for each data segment in each sample subset SSUS. Finally, an average data compression RF1 . . . Fn is estimated and output for the files F in the group of size ‘n’, based on the compression ratios.
    Type: Grant
    Filed: September 30, 2015
    Date of Patent: October 1, 2019
    Assignee: EMC IP HOLDING COMPANY LLC
    Inventors: Guilherme Menezes, Teng Xu, Abdullah Reza
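    The sampling idea in the abstract above can be illustrated with a minimal sketch. It is not the patented method: it collapses the multi-stage sampling (SS, SSU, SSUS) into a single random sample per file, and the function name, segment size, sample fraction, and use of zlib are all assumptions made for illustration.

```python
import random
import zlib


def estimate_compression_ratio(files, segment_size=4096, sample_fraction=0.25, seed=0):
    """Estimate the average compression ratio of a group of files.

    `files` is a list of bytes objects. Only a sample of fixed-size
    segments from each file is compressed (hypothetical parameters).
    """
    rng = random.Random(seed)
    ratios = []
    for data in files:
        # Split the file into fixed-size segments.
        segments = [data[i:i + segment_size] for i in range(0, len(data), segment_size)]
        if not segments:
            continue
        # Sample at least one segment, never more than exist.
        k = min(len(segments), max(1, int(len(segments) * sample_fraction)))
        for seg in rng.sample(segments, k):
            compressed = zlib.compress(seg)
            ratios.append(len(seg) / len(compressed))
    # With no data at all, report a neutral ratio of 1.0.
    return sum(ratios) / len(ratios) if ratios else 1.0
```

    Highly repetitive input yields a ratio well above 1, while already-compressed or random input yields a ratio near or slightly below 1.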
  • Patent number: 10430426
    Abstract: Answer effectiveness evaluations include providing, by a computing device, an answer to a search query received from a user, and in response to receiving a subsequent search query from the user, determining by the computing device a level of effectiveness of the answer to the search query with respect to the user. The determination includes comparing aspects of the search query to aspects of the subsequent search query, calculating, based on the comparing, a relevance score that indicates a measure of similarity between the aspects of the search query and the aspects of the subsequent search query, and determining that the answer effectively answers the search query when the relevance score exceeds a threshold value.
    Type: Grant
    Filed: May 3, 2016
    Date of Patent: October 1, 2019
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Donna K. Byron, Lakshminarayanan Krishnamurthy, Priscilla Santos Moraes, Niyati Parameswaran
  • Patent number: 10417202
    Abstract: An example storage system may include storage media and a storage controller. The storage controller may be to establish virtual volumes, private data stores, and a deduplication data store, each being a virtual storage space of the storage media, wherein each of the private data stores is associated with one of the virtual volumes and the deduplication data store is shared among the virtual volumes. The storage controller may, in response to receiving input data that is to be stored in a given one of the virtual volumes, determine a signature for the input data and select between storing the input data in the private data store associated with the given one of the virtual volumes and storing the input data in the deduplication data store.
    Type: Grant
    Filed: December 21, 2016
    Date of Patent: September 17, 2019
    Assignee: Hewlett Packard Enterprise Development LP
    Inventors: Siamak Nazari, Jin Wang, Srinivasa D. Murthy, Roopesh Kumar Tamma
  • Patent number: 10395145
    Abstract: A computer-implemented method includes receiving a set of representative machine image regions for a computing environment wherein the set of representative machine image regions collectively comprise a set of representative image chunks. The method also includes generating a fingerprint for each representative image chunk within the set of representative image chunks to produce a set of representative fingerprints, generating a fingerprint for selected image chunks within a measured machine image region to produce a set of sampled fingerprints, and determining a deduplication metric for the measured machine image region based on the representative fingerprints and the sampled fingerprints. A corresponding computer program product and computer system are also disclosed herein.
    Type: Grant
    Filed: March 8, 2016
    Date of Patent: August 27, 2019
    Assignee: International Business Machines Corporation
    Inventors: Jonathan Amit, Danny Harnik, Ety Khaitzin, Sergey Marenkov
  • Patent number: 10387044
    Abstract: The presently disclosed subject matter includes various inventive aspects, which are directed for enabling execution of deduplication during data writes in a distributed storage-system.
    Type: Grant
    Filed: April 5, 2017
    Date of Patent: August 20, 2019
    Assignee: Kaminario Technologies Ltd.
    Inventors: Doron Tal, Eyal Gordon
  • Patent number: 10387265
    Abstract: A method, computer program product, computing system, and system for preventive hash loading are described. The method may include receiving an indication at a storage server that a machine will be backed up. The method may further include loading fingerprints of blocks related to a previous backup of the machine to RAM of the storage server. The method may also include searching the storage server for fingerprints in the RAM that match fingerprints of incoming blocks from the machine being backed up. The method may additionally include, in response to determining that the fingerprints of the incoming blocks do not match fingerprints in the RAM, searching for the fingerprints in a database. Moreover, the method may include transferring only blocks from the machine being backed up that are not in the RAM or the database of the storage server to the storage server.
    Type: Grant
    Filed: December 16, 2015
    Date of Patent: August 20, 2019
    Assignee: ACRONIS INTERNATIONAL GMBH
    Inventors: Vitaly Pogosyan, Andrey Panin, Stanislav Protasov, Serguei M. Beloussov
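    The two-level lookup described in the abstract above (RAM first, then database, then transfer only what is missing) might be sketched as follows. The function names, the choice of SHA-256 as the fingerprint, and the use of Python sets to stand in for the RAM cache and the database are assumptions, not details from the patent.

```python
import hashlib


def fingerprint(block: bytes) -> str:
    # SHA-256 is an assumed fingerprint function for this sketch.
    return hashlib.sha256(block).hexdigest()


def blocks_to_transfer(incoming_blocks, ram_cache, database):
    """Return only the blocks the storage server does not already hold.

    `ram_cache` holds fingerprints preloaded from the previous backup;
    `database` holds all other known fingerprints (both are sets here).
    """
    missing = []
    for block in incoming_blocks:
        fp = fingerprint(block)
        if fp in ram_cache:
            continue  # fast path: matched a preloaded fingerprint
        if fp in database:
            continue  # slow path: matched in the database
        missing.append(block)
    return missing
```

    Preloading the previous backup's fingerprints means most lookups for an incremental backup are expected to end on the fast path.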
  • Patent number: 10380074
    Abstract: A computer-implemented method for efficient backup deduplication may include (1) identifying a file to be divided into chunks for deduplication, (2) requesting, from a server, a chunk size to use when dividing the file for deduplication by submitting at least one attribute of the file to the server, the server selecting the chunk size based at least in part on a projected chunk reuse rate when the file is deduplicated according to the chunk size, (3) receiving from the server, in response to requesting the chunk size, the chunk size to use when dividing the file for deduplication, and (4) dividing the file for deduplication into a plurality of chunks according to the chunk size. Various other methods, systems, and computer-readable media are also disclosed.
    Type: Grant
    Filed: January 11, 2016
    Date of Patent: August 13, 2019
    Assignee: Symantec Corporation
    Inventors: Lei Gu, Jason Holler, Nathan Rivers, Elton Inada, Riti Saxena, Kirill Levichev
  • Patent number: 10374807
    Abstract: Storing and retrieving ciphertext in data storage can include determining a first ciphertext value for a first data chunk to be saved to a client-server data storage system using an encrypted chunk hash value associated with the first data chunk as an initial value, and storing the first data chunk on a server in the client-server data storage system in response to determining that the first ciphertext value is a unique ciphertext value. Also, storing and retrieving ciphertext in data storage can include decrypting a ciphertext value for a second data chunk received from a client in the client-server data storage system and based on an encrypted chunk hash value associated with the second data chunk, and sending the second data chunk to the client in response to determining that the decrypted ciphertext value corresponds to an original data chunk saved to the server by the client.
    Type: Grant
    Filed: April 4, 2014
    Date of Patent: August 6, 2019
    Assignee: Hewlett Packard Enterprise Development LP
    Inventors: Liqun Chen, Peter T. Camble, Jonathan P. Buckingham, Simon Pelly, Simon Kai-Ying Shiu, Joseph S. Ficara, Hendrik Radon
  • Patent number: 10366082
    Abstract: Techniques are described for parallel processing of database queries with an inverse distribution function by a database management system (DBMS). To improve the execution time of a query with an inverse distribution function, the data set referenced in the inverse distribution function is range distributed among parallel processes that are spawned and managed by a query execution coordinator process (QC), in an embodiment. The parallel executing processes sort each range of the data set in parallel, while the QC determines the location(s) of inverse distribution function values based on the count of values in each range of the data set. The QC requests the parallel processes to produce to the next stage of parallel processes the values at the location(s) in the sorted ranges. The next stage of parallel processes computes the inverse distribution function based on the produced values.
    Type: Grant
    Filed: December 9, 2016
    Date of Patent: July 30, 2019
    Assignee: Oracle International Corporation
    Inventors: Qingyuan Kong, Huagang Li, Sankar Subramanian
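    As a rough illustration of the abstract above, the sketch below computes a PERCENTILE_DISC-style inverse distribution function over range-distributed data. The partitions are sorted sequentially here, whereas the abstract describes parallel processes; the function name and the `boundaries` parameter are hypothetical.

```python
import math
from bisect import bisect_right


def percentile_disc(values, p, boundaries):
    """Return the value at rank ceil(p * N) using range-partitioned sorting.

    `boundaries` is a sorted list of split points; partition i receives
    values v with boundaries[i-1] <= v < boundaries[i].
    """
    if not values:
        raise ValueError("values must not be empty")
    # Range-distribute the data set across partitions.
    partitions = [[] for _ in range(len(boundaries) + 1)]
    for v in values:
        partitions[bisect_right(boundaries, v)].append(v)
    # Each partition is sorted independently (in parallel in a real DBMS).
    for part in partitions:
        part.sort()
    # Coordinator step: locate the target rank from partition counts alone.
    target = max(1, math.ceil(p * len(values)))  # 1-based rank
    for part in partitions:
        if target <= len(part):
            return part[target - 1]
        target -= len(part)
    return partitions[-1][-1]  # not reached for 0 <= p <= 1
```

    Because the partitions cover disjoint, ordered ranges, only the per-partition counts are needed to find which partition holds the requested rank; no global sort is required.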
  • Patent number: 10359968
    Abstract: Virtual storage domains (VSD) are each associated with unique VSD domain ID associated with a first policy and tagged to a request to a storage system when an entity writes a data set to it. A first hash digest, based on data set content, is calculated and combined with first unique VSD domain ID into a second hash digest associated with data set. When first policy is changed to second policy associated with second VSD, a third hash digest of first data set is calculated, the third hash digest based on content of first data set and on second unique VSD domain ID. If third hash digest does not exist in second VSD, data set is copied to the second VSD; else, reference count of the third hash digest, associated with second VSD domain, is incremented, and reference count of second hash digest, associated with first VSD domain, is decremented.
    Type: Grant
    Filed: January 31, 2018
    Date of Patent: July 23, 2019
    Assignee: EMC IP Holding Company LLC
    Inventors: Xiangping Chen, Anton Kucherov, Junping Zhao
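    The domain-scoped digest and reference counting in the abstract above can be approximated with the sketch below. The class and method names are hypothetical, SHA-256 is an assumed hash, and the sketch releases the old domain's reference after a copy as well as after a reference-count increment, which the abstract does not state.

```python
import hashlib


def domain_digest(content: bytes, domain_id: str) -> str:
    # First digest: content only. Second digest: content digest + domain ID.
    content_digest = hashlib.sha256(content).digest()
    return hashlib.sha256(content_digest + domain_id.encode()).hexdigest()


class DomainStore:
    """Toy store whose deduplication is scoped per virtual storage domain."""

    def __init__(self):
        self.blocks = {}  # domain-scoped digest -> content
        self.refs = {}    # domain-scoped digest -> reference count

    def write(self, content: bytes, domain_id: str) -> str:
        d = domain_digest(content, domain_id)
        if d not in self.blocks:
            self.blocks[d] = content  # first copy in this domain
        self.refs[d] = self.refs.get(d, 0) + 1
        return d

    def change_policy(self, content: bytes, old_domain: str, new_domain: str) -> str:
        old = domain_digest(content, old_domain)
        if self.refs.get(old, 0) == 0:
            raise KeyError("data set not present in the old domain")
        new = self.write(content, new_domain)  # copy, or bump the ref count
        self.refs[old] -= 1                    # assumption: always release
        if self.refs[old] == 0:
            del self.refs[old]
            del self.blocks[old]
        return new
```

    Mixing the domain ID into the digest means identical content in two domains never shares a digest, so each domain deduplicates independently.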
  • Patent number: 10359942
    Abstract: Systems and methods of deduplication aware scalable content placement are described. A method may include receiving data to be stored on one or more nodes of a storage array and calculating a plurality of hashes corresponding to the data. The method further includes determining a first subset of the plurality of hashes, determining a second subset of the plurality of hashes of the first subset, and generating a node candidate placement list. The method may further include sending the first subset to one or more nodes represented on the node candidate placement list and receiving, from the nodes represented on the node candidate placement list, characteristics corresponding to the nodes represented on the candidate placement list. The method may further include identifying one of the one or more nodes represented on the candidate placement list in view of the characteristic and sending the data to the identified node.
    Type: Grant
    Filed: October 31, 2016
    Date of Patent: July 23, 2019
    Assignee: Pure Storage, Inc.
    Inventors: Robert Lee, Christopher Lumb, Ethan L. Miller, Igor Ostrovsky
  • Patent number: 10346363
    Abstract: An apparatus and a method for maintaining a file system are described. A method may include receiving a request for allocating a first block of a file system to a file, the first block comprising a first data from the file. The method also includes computing a first hash value by hashing the first data with a first hashing procedure and computing a second hash value by hashing the first data with a second hashing procedure. The method also includes using the first and the second hash values to determine whether a tree structure among a plurality of tree structures has a matching hash value among a plurality of hash values. Each of the plurality of hash values in the tree structure corresponds to a block among a plurality of blocks stored in the file system. The method further includes, in response to determining that the tree structure has the matching hash value, allocating the corresponding block to the file and updating a reference count of the corresponding block in the tree structure.
    Type: Grant
    Filed: April 27, 2017
    Date of Patent: July 9, 2019
    Assignee: Red Hat, Inc.
    Inventor: James Paul Schneider
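    One possible reading of the abstract above is sketched below: the first hash selects one of several trees and the second hash is the key looked up inside it. That division of roles, the use of CRC-32 and SHA-256, the class name, and the use of dictionaries in place of tree structures are all assumptions.

```python
import hashlib
import zlib


class DedupFileSystem:
    """Toy block allocator that shares identical blocks via reference counts."""

    def __init__(self, num_trees=16):
        # Each dict stands in for one tree: second hash -> [block_id, refcount].
        self.trees = [dict() for _ in range(num_trees)]
        self.blocks = []  # block_id -> data

    def allocate(self, data: bytes) -> int:
        h1 = zlib.crc32(data)                  # first hashing procedure
        h2 = hashlib.sha256(data).hexdigest()  # second hashing procedure
        tree = self.trees[h1 % len(self.trees)]
        entry = tree.get(h2)
        if entry is not None:
            entry[1] += 1  # matching block exists: bump its reference count
            return entry[0]
        block_id = len(self.blocks)
        self.blocks.append(data)  # no match: store a new block
        tree[h2] = [block_id, 1]
        return block_id
```

    Allocating the same data twice returns the same block ID and leaves only one stored copy.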
  • Patent number: 10346075
    Abstract: Regarding a distributed storage system including a plurality of nodes, a first node among the plurality of nodes judges whether the same data as first data, which is written to a first virtual partial area managed by the first node from among a plurality of virtual partial areas, exists in the virtual partial area managed by another node among the plurality of nodes; when the same data as the first data exists in the other node, the first node executes inter-node deduplication for changing allocation of either one of logical partial areas for the first virtual partial area or the virtual partial area of the other node to which the same data is written, to the other logical partial area; and when I/O load on the first node after execution of the inter-node deduplication of the first virtual partial area and the predicted value is less than a first threshold, the first node executes the inter-node deduplication of a second virtual partial area managed by the first node from among the plurality of virtual partia
    Type: Grant
    Filed: March 16, 2015
    Date of Patent: July 9, 2019
    Assignee: Hitachi, Ltd.
    Inventors: Yasuo Watanabe, Hiroaki Akutsu
  • Patent number: 10339011
    Abstract: A method and system for implementing data lossless synthetic full backups. Specifically, the method and system disclosed herein improves upon traditional synthetic full backup operations by considering all user-checkpoint branches, rather than just the active user-checkpoint branch, representing all chains of incremental changes to a virtual disk of a virtual machine. In considering all user-checkpoint branches, no data pertinent to users involved in the development of the non-active (or inactive) user-checkpoint branches is lost.
    Type: Grant
    Filed: October 27, 2017
    Date of Patent: July 2, 2019
    Assignee: EMC IP Holding Company LLC
    Inventors: Aaditya Rakesh Bansal, Sunil Yadav, Suman Chandra Tokuri, Pradeep Anappa, Soumen Acharya, Sudha Vamanraj Hebsur
  • Patent number: 10341467
    Abstract: Methods and systems for data transfer include adding data chunks to a priority queue in an order based on utilization priority. A reducibility score for the data chunks is determined. A data reduction operation is performed on a data chunk having a highest reducibility in the priority queue using a processor if sufficient resources are available. The data chunk having the lowest reducibility score is moved from the priority queue to a transfer queue for transmission if the transfer queue is not full.
    Type: Grant
    Filed: January 13, 2016
    Date of Patent: July 2, 2019
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Danny Harnik, Alexei Karve, Andrzej Kochut, Dmitry Sotnikov
  • Patent number: 10331350
    Abstract: A computer program product, system, and method for visiting each node of a snapshot tree within a content-based storage system having a plurality of volumes and/or snapshots; for each node, scanning an address-to-hash (A2H) table to calculate one or more resource usage metrics, wherein the A2H tables map logical I/O addresses to chunk hashes; and determining, based on the resource usage metrics, an amount of memory and/or disk capacity that would be freed by deleting one or more of the volumes and/or snapshots.
    Type: Grant
    Filed: April 27, 2017
    Date of Patent: June 25, 2019
    Assignee: EMC IP Holding Company LLC
    Inventors: Anton Kucherov, Ophir Buchman, David Meiri
  • Patent number: 10324806
    Abstract: A computer program product, system, and method for calculating a resource usage metric over each node of a snapshot tree within a content-based storage system having a plurality of volumes and/or snapshots and generating a visualization of the snapshot tree using the calculated resource usage metrics.
    Type: Grant
    Filed: April 27, 2017
    Date of Patent: June 18, 2019
    Assignee: EMC IP HOLDING COMPANY LLC
    Inventors: Anton Kucherov, David Meiri
  • Patent number: 10318326
    Abstract: Systems and methods are disclosed for associating one or more storage-based services with a storage unit accessible by a primary “tier 1” storage device. A storage-based service can include deduplication, compression, data conversion, statistical analysis of the data to be stored, or other storage-based service. A storage unit can be a disk, a file, a virtual disk, or a logical unit of storage (LUN). A virtual machine within the primary “tier 1” storage can perform the one or more storage-based services associated with the storage unit.
    Type: Grant
    Filed: December 28, 2015
    Date of Patent: June 11, 2019
    Assignee: EMC IP HOLDING COMPANY LLC
    Inventors: Ian Wigmore, Arieh Don, Stephen Smaldone
  • Patent number: 10318388
    Abstract: A dataset profiling tool configured to identify unique and non-unique column combinations in a dataset which includes a plurality of tuples, the tool including: an inserts handler module configured to: receive one or more new tuples for insertion into the dataset, receive one or more minimal uniques and one or more maximal non-uniques for the dataset, identify and group, for each minimal unique, any tuples of the dataset and any of the one or more new tuples which contain duplicate values in the column combinations of the minimal unique, to form grouped tuples which are grouped according to the minimal unique to which the tuples relate, validate the grouped tuples to identify supersets of the minimal uniques for which duplicate values were identified, to generate a new set of one or more minimal uniques and one or more maximal non-uniques, and output the new set of one or more updated minimal uniques and one or more maximal non-uniques.
    Type: Grant
    Filed: May 20, 2014
    Date of Patent: June 11, 2019
    Assignee: Qatar Foundation
    Inventors: Jorge Arnulfo Quiané Ruiz, Felix Naumann, Ziawasch Abedjan
  • Patent number: 10318203
    Abstract: Disclosed herein are methods, systems, and processes to improve the duplication of data between disparate deduplication systems. Source fingerprints are generated for data blocks using a source fingerprint algorithm at a source deduplication system. The source fingerprints and previously-generated source fingerprints are used to determine whether the data blocks are new or modified. If the data blocks are new or modified, target fingerprints are generated for the data blocks using a target fingerprint algorithm associated with a target deduplication system. The target fingerprints are sent to the target deduplication system.
    Type: Grant
    Filed: April 16, 2018
    Date of Patent: June 11, 2019
    Assignee: Veritas Technologies LLC
    Inventor: Thomas G. Clifford
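    The flow in the abstract above, where a cheap source-side fingerprint decides whether a target-side fingerprint needs to be computed at all, might look like the sketch below. BLAKE2b and SHA-256 are stand-ins for the unspecified source and target fingerprint algorithms, and the function name is hypothetical.

```python
import hashlib


def target_fingerprints_for_changes(blocks, known_source_fps):
    """Return target fingerprints for blocks that are new or modified.

    `known_source_fps` is a set of previously generated source
    fingerprints; it is updated in place as new blocks are seen.
    """
    to_send = []
    for block in blocks:
        # Source system's own fingerprint algorithm (assumed: BLAKE2b).
        src_fp = hashlib.blake2b(block).hexdigest()
        if src_fp in known_source_fps:
            continue  # unchanged block: nothing to send
        known_source_fps.add(src_fp)
        # Target system's fingerprint algorithm (assumed: SHA-256).
        to_send.append(hashlib.sha256(block).hexdigest())
    return to_send
```

    On a second run over mostly unchanged data, only the fingerprints of changed blocks are recomputed in the target's format and sent.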
  • Patent number: 10311033
    Abstract: In a remote index operation, a first system in a datasharing group performs an operation on a data object in a database, determines a second system in the datasharing group has exclusive access to an index page to be updated according to the operation, and sends to the second system a remote request to change the index page according to the operation. In response, the second system changes the index page with an index entry referencing the data object and marks a key in the index entry as a provisional key. When a third system in the datasharing group reads the index entry, it determines that the key in the index entry is the provisional key. In response, the third system determines whether the data object exists in the database and a committed state of the transaction, and accordingly determines a current state of the data object.
    Type: Grant
    Filed: December 5, 2015
    Date of Patent: June 4, 2019
    Assignee: International Business Machines Corporation
    Inventor: Robert W. Lyle
  • Patent number: 10303548
    Abstract: A method begins by a dispersed storage (DS) processing module transmitting a set of write commands for storing a set of encoded data slices in storage units of a dispersed storage network (DSN) and determining whether at least a first threshold number of write responses have been received within a response time period. When the at least the first threshold number of the write responses have been received within the response time period, the method continues with the DS processing module determining whether a total number of responses have been received within another response time period. When the total number of responses have not been received within the other response time period, the method continues with the DS processing module issuing a sub-set of write commit commands corresponding to a response number of encoded data slices for which a response was received.
    Type: Grant
    Filed: March 14, 2018
    Date of Patent: May 28, 2019
    Assignee: International Business Machines Corporation
    Inventors: Ilya Volvovski, Ravi Khadiwala, Greg Dhuse, Jason K. Resch
  • Patent number: 10268869
    Abstract: The method includes: setting up a hierarchy structure, wherein the hierarchy structure includes more than 2 levels, each slice in a lowest level of the levels is a single fingerprint image generated by a fingerprint sensor, a slice in a second level of the levels includes at most M slices in a first level of the levels, the second level is one level higher than the first level, and M is a positive integer greater than 1; obtaining a new fingerprint image, adding the new fingerprint image into the lowest level, and updating the hierarchy structure; and outputting an enroll fingerprint image according to a slice in a highest level.
    Type: Grant
    Filed: March 15, 2017
    Date of Patent: April 23, 2019
    Assignee: HIMAX TECHNOLOGIES LIMITED
    Inventor: Tsung-Yau Huang
  • Patent number: 10261784
    Abstract: Systems and methods of detecting copying of code or portions of code involve disassembling a set of compiled code into an architecture-agnostic intermediate representation. The intermediate representation is used to form a number of cryptographically hashed overlapping shingles. The number of cryptographically hashed overlapping shingles can be searched against a database of cryptographically hashed overlapping shingles to identify copied code.
    Type: Grant
    Filed: June 20, 2018
    Date of Patent: April 16, 2019
    Assignee: TERBIUM LABS, INC.
    Inventors: Daniel J. Rogers, Dionysus Blazakis
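    The hashed overlapping shingles mentioned in the abstract above can be illustrated with a minimal sketch. It assumes the intermediate representation is already available as a list of opcode-like tokens; the token names, the shingle width, and the overlap measure are hypothetical choices, not details from the patent.

```python
import hashlib


def hashed_shingles(tokens, k=4):
    """Hash every run of k consecutive tokens (overlapping shingles)."""
    return {
        hashlib.sha256("\n".join(tokens[i:i + k]).encode()).hexdigest()
        for i in range(len(tokens) - k + 1)
    }


def overlap(candidate, reference, k=4):
    """Fraction of the candidate's hashed shingles also found in the reference."""
    a = hashed_shingles(candidate, k)
    b = hashed_shingles(reference, k)
    if not a or not b:
        return 0.0  # sequences shorter than k produce no shingles
    return len(a & b) / len(a)
```

    Because only hashes are compared, a database of shingles can be searched without storing or exposing the original code.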
  • Patent number: 10237366
    Abstract: Data files are transmitted by receiving requests from destination devices for the files and dividing the files into first and second subsets where the files of the second subset are associated with one file of the first subset. The files of the second subset are compressed using one of the files in the first subset as a reference. The compressed files are divided into packets. A portion of the compressed files of the second subset, and a portion of the files of the first subset is cached. Un-cached portions of respective files from the first subset are transmitted to destination devices that have requested these files, and un-cached portions of one particular file from the second subset, and un-cached portions of files in the first subset that are associated with the one particular file, are transmitted to respective destination devices that have requested the particular file from the second subset.
    Type: Grant
    Filed: September 7, 2016
    Date of Patent: March 19, 2019
    Assignee: NOKIA OF AMERICA CORPORATION
    Inventors: Antonia Maria Tulino, Jaime Llorca
  • Patent number: 10223507
    Abstract: A programmable system with program flow monitoring is provided. A memory is configured to store a set of instructions, where the instructions are configured to be executed in a predefined order. A processor is configured to execute the set of instructions by fetching and executing the instructions in the predefined order. A program flow monitoring (PFM) unit is configured to deterministically generate a fingerprint from accesses to the memory, such as instruction fetches, while executing the set of instructions. A verification unit is configured to compare the generated fingerprint to an expected fingerprint to determine whether the set of instructions executed in the predefined order. A method for program flow monitoring, as well as a safety system within which the programmable system finds application, are also provided.
    Type: Grant
    Filed: October 28, 2016
    Date of Patent: March 5, 2019
    Assignee: Infineon Technologies AG
    Inventor: Klemens Kordik
  • Patent number: 10216788
    Abstract: Displaying contact-related information is disclosed. An association between a contact address not specific to a source of contact-related information and an identity of an entity at the source of contact-related information may be determined. Information representing the association between the contact address and the identity of the entity at the source of contact-related information is stored. The information representing the association is stored at a node associated with a service configured to use the information representing the association to retrieve from the source of contact-related information a response data associated with the entity in response to an expression of interest in a contact with which the contact address is associated.
    Type: Grant
    Filed: March 27, 2018
    Date of Patent: February 26, 2019
    Assignee: SUGARCRM INC.
    Inventors: Somrat Niyogi, Jason McDowall, Pushkar Singh, Andreas Sandberg, Wiebke Poerschke
  • Patent number: 10216759
    Abstract: Techniques are described herein that are capable of heterogeneously optimizing a file. Heterogeneous optimization involves optimizing regions of a file non-uniformly. For example, the regions of the file may be optimized to different extents. In accordance with this example, a different optimization technique may be used to optimize each region or subset of the regions. In one aspect, optimization designations are assigned to respective regions of a file based on access patterns that are associated with the respective regions. The file may be a database file, a virtualized storage file, or other suitable type of file. Each optimization designation indicates an extent to which the respective region is to be optimized. Each region may be optimized to the extent that is indicated by the respective optimization designation that is assigned to that region.
    Type: Grant
    Filed: November 22, 2010
    Date of Patent: February 26, 2019
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Ran Kalach, Mathew James Dickson
  • Patent number: 10216754
    Abstract: Techniques for balancing data compression and read performance of data chunks of a storage system are described herein. According to one embodiment, similar data chunks are identified based on sketches of a plurality of data chunks stored in the storage system. A first portion of the similar data chunks as a first group is associated with a first storage area. The first storage area is associated with one or more data chunks that are dissimilar to the first group but are likely accessed together. The first group of the similar data chunks and its associated dissimilar data chunks are compressed and stored in the first storage area.
    Type: Grant
    Filed: September 26, 2013
    Date of Patent: February 26, 2019
    Assignee: EMC IP Holding Company LLC
    Inventors: Frederick Douglis, Philip Shilane, Grant Wallace
  • Patent number: 10210186
    Abstract: A data processing method and system and a client, where a target storage node is determined in a manner of comparing a second vector of received data and first vectors that are corresponding to all storage nodes and prestored on the client that receives the data, and the target storage node no longer needs to be determined in a manner of extracting some fingerprint values as samples from received data and sending the fingerprint values to all storage nodes in a data processing system for query, and waiting for a feedback from the storage nodes.
    Type: Grant
    Filed: January 29, 2016
    Date of Patent: February 19, 2019
    Assignee: HUAWEI TECHNOLOGIES CO., LTD.
    Inventor: Yan Huang
  • Patent number: 10210243
    Abstract: Method, system, and programs for providing enhanced query term suggestions. Candidate query terms may be obtained based on a prefix of incomplete query terms received. The candidate query terms may be ranked, at least partially based on, their similarities with respect to query terms previously entered in the same search session as the incomplete query term. In some implementations, for determining such similarities, feature vectors and/or signatures may be stored in association with query terms. Similarity between a candidate query term and query terms in the same search session as the incomplete query term may be determined using the feature vectors and/or signatures associated therewith.
    Type: Grant
    Filed: August 23, 2017
    Date of Patent: February 19, 2019
    Assignee: EXCALIBUR IP, LLC
    Inventors: Hang Su, Chi Hoon Lee
  • Patent number: 10191914
    Abstract: Techniques to provide a de-duplicating distributed file system using a cloud-based object store are disclosed. In various embodiments, a request to store a file comprising a plurality of chunks of file data is received. A determination to store at least a subset of the plurality of chunks is made. The request is responded to at least in part by providing an indication to store two or more chunks comprising the at least a subset of the plurality of chunks comprising the file as a single stored object that includes the combined chunk data of said two or more chunks.
    Type: Grant
    Filed: March 31, 2015
    Date of Patent: January 29, 2019
    Assignee: EMC IP Holding Company LLC
    Inventors: Thomas Manville, Julio Lopez, Rajiv Desai, Nathan Rosenblum
  • Patent number: 10187470
    Abstract: The data acquisition unit acquires sensor data according to an acquisition rule stored in the acquisition and processing rule DB. Then the data acquisition unit determines a representative value for each predetermined time interval on acquired data according to the buffer setting information stored in the buffer setting information DB and saves at least a predetermined number of representative values in the buffer provided in the main memory. The processing unit processes the predetermined number of representative values saved in the buffer to determine the processed data according to a processing rule stored in the acquisition and processing rule DB. The data upload unit transmits to the M2M server the processed data determined by the processing unit.
    Type: Grant
    Filed: June 10, 2015
    Date of Patent: January 22, 2019
    Assignee: Hitachi Solutions, Ltd.
    Inventors: Akira Moriguchi, Yuichi Nakamura, Masanori Irie, Atsuhiko Tani
  • Patent number: 10187462
    Abstract: The present disclosure provides methods, a system, and a server for constructing a microblog management circle. One method includes: setting, by a server, a microblog account 1 as a main official account; and receiving, by the server, a message indicating that a microblog account 2 is used as a sub official account subordinate to the main official account, setting, according to the message, the microblog account 2 as the sub official account subordinate to the main official account, and displaying operational data of the sub official account subordinate to the main official account to the main official account. In the present disclosure, an architecture of a hierarchical microblog management circle can be constructed, making it easier for a manager of a microblog account to view operational data of the subordinate microblog account.
    Type: Grant
    Filed: July 19, 2016
    Date of Patent: January 22, 2019
    Assignee: TENCENT TECHNOLOGY (SHENZHEN) COMPANY LIMITED
    Inventors: Xiaokun Yan, Yiqun Xie, Chang Liu, Guangyu Yang
  • Patent number: 10169168
    Abstract: A data stream is stored in storage media. As part of the storage, the data stream is divided into a plurality of chunks. The plurality of chunks include a target chunk that is next to a first chunk in a file within the data stream. A determination is made that the target chunk matches an existing chunk stored in the storage media. In response to the determination, a first pointer to the existing stored chunk is created in file metadata for the file. Also in response to the determination, a second pointer to a first stored chunk that matches the first chunk is created in chunk metadata embedded with the existing stored chunk.
    Type: Grant
    Filed: April 5, 2017
    Date of Patent: January 1, 2019
    Assignee: International Business Machines Corporation
    Inventors: Mathias Defiebre, Heiko Schloesser, Christof Schmitt, Erik Rueger, Frank Krick
  • Patent number: 10169365
    Abstract: Methods, systems, and computer programs are presented for deduplicating data in a storage device. One method includes an operation for identifying multiple deduplication domains for a storage system. A fingerprint index is created for each deduplication domain, where each data block stored in the storage system is associated with one of the plurality of deduplication domains. The method also includes operations for receiving a first data block at the storage system, and for identifying a first deduplication domain from the plurality of deduplication domains corresponding to the first data block. The first data block is deduplicated within the first deduplication domain utilizing a first fingerprint index associated with the first deduplication domain.
    Type: Grant
    Filed: March 2, 2016
    Date of Patent: January 1, 2019
    Assignee: Hewlett Packard Enterprise Development LP
    Inventor: Umesh Maheshwari
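    Illustrative sketch (not part of the patent record): the abstract above describes keeping one fingerprint index per deduplication domain, so a block is only deduplicated against blocks in its own domain. The Python below is a minimal toy model of that idea under stated assumptions: the class name `DomainDedupStore`, the domain labels `vol-A` and `vol-B`, and the use of SHA-256 as the fingerprint are all hypothetical choices made for illustration and are not taken from the patent.

```python
import hashlib
from collections import defaultdict


class DomainDedupStore:
    """Toy block store with one fingerprint index per deduplication domain.

    Hypothetical illustration only; names and the SHA-256 fingerprint
    are assumptions, not details from the patent.
    """

    def __init__(self):
        # domain name -> {fingerprint -> block id}; one index per domain
        self.indexes = defaultdict(dict)
        # block id -> stored bytes
        self.blocks = {}
        self._next_id = 0

    def write(self, domain, data):
        """Store data in the given domain; return (block_id, was_duplicate)."""
        fingerprint = hashlib.sha256(data).hexdigest()
        index = self.indexes[domain]
        if fingerprint in index:
            # Match found in this domain's index: reuse the existing block.
            return index[fingerprint], True
        # No match in this domain: store a new block and index it.
        block_id = self._next_id
        self._next_id += 1
        self.blocks[block_id] = data
        index[fingerprint] = block_id
        return block_id, False


store = DomainDedupStore()
a, dup_a = store.write("vol-A", b"hello")
b, dup_b = store.write("vol-A", b"hello")  # same domain: deduplicated
c, dup_c = store.write("vol-B", b"hello")  # other domain: stored again
```

    In this sketch the second write to `vol-A` reuses the first block, while the identical write to `vol-B` is stored separately because each domain consults only its own index.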
  • Patent number: 10169056
    Abstract: A method and system are provided for identifying installed software components in a container running in a virtual execution environment. The container is created by instantiating image data. The method includes determining a respective identifier for each of individual layers of a layered structure of the image data. The method further includes retrieving from a repository storage arrangement, information identifying at least one of the installed software components in the container, based on the respective identifier for at least one of the individual layers.
    Type: Grant
    Filed: August 31, 2016
    Date of Patent: January 1, 2019
    Assignee: International Business Machines Corporation
    Inventors: Giuseppe Ciano, Luigi Pichetti
  • Patent number: 10169366
    Abstract: An apparatus and a method for maintaining a file system is described. A method may include receiving a request for allocating a first block of a file system to a file, the first block comprising a first data, and computing, by a processing device, a first hash value of the first block. The method also includes comparing, by the processing device, the first hash value with a plurality of hash values in a tree structure, wherein each of the plurality of hash values corresponds to a block among a plurality of blocks stored in the file system. The method further includes, in response to determining that a match exists between the first hash value and at least one of the plurality of hash values in the tree structure, allocating, by the processing device, the corresponding block to the file; and updating, by the processing device, a reference count of the corresponding block in the tree structure.
    Type: Grant
    Filed: December 20, 2016
    Date of Patent: January 1, 2019
    Assignee: Red Hat, Inc.
    Inventor: James Paul Schneider
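    Illustrative sketch (not part of the patent record): the abstract above outlines allocating a block by hashing it, looking the hash up among stored hashes, and, on a match, reusing the existing block and updating its reference count. The Python below is a rough model of that flow under stated assumptions: a plain dictionary stands in for the tree structure named in the abstract, SHA-256 is an assumed hash function, and the class and method names (`RefCountedBlocks`, `allocate`, `refcount`) are hypothetical.

```python
import hashlib


class RefCountedBlocks:
    """Toy block allocator that reuses blocks with identical content.

    Hypothetical illustration only. A dict replaces the tree structure
    described in the abstract; SHA-256 is an assumed hash function.
    """

    def __init__(self):
        # hash digest -> [block_number, reference_count]
        self.by_hash = {}
        # block_number -> stored bytes
        self.data = []

    def allocate(self, payload):
        """Return the block number holding payload, reusing a match if any."""
        digest = hashlib.sha256(payload).digest()
        entry = self.by_hash.get(digest)
        if entry is not None:
            # Match: allocate the existing block and bump its reference count.
            entry[1] += 1
            return entry[0]
        # No match: store a new block with an initial reference count of 1.
        self.data.append(payload)
        block_number = len(self.data) - 1
        self.by_hash[digest] = [block_number, 1]
        return block_number

    def refcount(self, payload):
        """Return the reference count for payload's block, or 0 if absent."""
        entry = self.by_hash.get(hashlib.sha256(payload).digest())
        return entry[1] if entry else 0


fs = RefCountedBlocks()
first = fs.allocate(b"A" * 4096)
second = fs.allocate(b"A" * 4096)  # identical content: same block reused
third = fs.allocate(b"B" * 4096)   # different content: new block
```

    A real file system would also verify block contents on a hash match and decrement counts on release; both are omitted here to keep the sketch short.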
  • Patent number: 10162711
    Abstract: A method for data locality control in a deduplication system is provided. The method includes forming a fingerprint cache from a backup image corresponding to a first backup operation. The method includes removing one or more fingerprints from inclusion in the fingerprint cache, in response to the one or more fingerprints having a data segment locality, in a container, less than a threshold of data segment locality. The container has one or more data segments corresponding to the one or more fingerprints. The method includes applying the fingerprint cache, with the one or more fingerprints removed from inclusion therein, to a second backup operation, wherein at least one method operation is executed through a processor.
    Type: Grant
    Filed: June 10, 2016
    Date of Patent: December 25, 2018
    Assignee: VERITAS TECHNOLOGIES LLC
    Inventors: Xianbo Zhang, Haibin She, Xiaobing Song
  • Patent number: 10157202
    Abstract: According to embodiments of the present invention, methods, systems and computer readable media are presented for processing a database query. The query may specify an arrangement for resulting data. A digest is generated for each of a plurality of database object elements. The plurality of database object elements are grouped or mapped into one or more groups based on the digest to arrange the database object elements in digest order. The database object elements from the one or more groups are extracted and/or processed in order of the digest, in accordance with the specified arrangement.
    Type: Grant
    Filed: April 7, 2014
    Date of Patent: December 18, 2018
    Assignee: International Business Machines Corporation
    Inventor: Garth A. Dickie
  • Patent number: 10146784
    Abstract: Provided are a computer program product, system, and method for defragmenting files having file blocks in multiple point-in-time copies. Multiple point-in-time copies for a file having file blocks are maintained. Each point-in-time copy to the file has at least one different block in the storage for at least one of the file blocks in the file. For each of a plurality of the point-in-time copies for the file, moving the blocks for the file blocks in the point-in-time copy to contiguous locations on the storage.
    Type: Grant
    Filed: January 2, 2014
    Date of Patent: December 4, 2018
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Duane M. Baldwin, John T. Olson, Sandeep R. Patil, Riyazahamad M. Shiraguppi
  • Patent number: 10140334
    Abstract: According to embodiments of the present invention, methods, systems and computer-readable media are presented for processing a database query. The query may specify an arrangement for resulting data. A digest is generated for each of a plurality of database object elements. The plurality of database object elements are grouped or mapped into one or more groups based on the digest to arrange the database object elements in digest order. The database object elements from the one or more groups are extracted and/or processed in order of the digest, in accordance with the specified arrangement.
    Type: Grant
    Filed: March 4, 2015
    Date of Patent: November 27, 2018
    Assignee: International Business Machines Corporation
    Inventor: Garth A. Dickie
  • Patent number: 10140043
    Abstract: Digital data sanitization is disclosed. An indication that a data sanitization process should be performed is received. The data sanitization process is performed. Performing the data sanitization process includes determining an amount of free space on a storage device. Performing the data sanitization process further includes performing a set of one or more write operations, where performing the write operations decreases the amount of free space on the storage of the device.
    Type: Grant
    Filed: October 26, 2017
    Date of Patent: November 27, 2018
    Assignee: Wickr Inc.
    Inventors: Thomas Michael Leavy, Christopher Howell, Robert Statica, Kara Lynn Coppa
  • Patent number: 10133747
    Abstract: Various embodiments for preserving data redundancy in a data deduplication system in a computing environment are provided. At least one virtual device out of a volume set is designated as not subject to a deduplication operation.
    Type: Grant
    Filed: April 23, 2012
    Date of Patent: November 20, 2018
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Rahul M. Fiske, Carl Evan Jones, Subhojit Roy
  • Patent number: 10116533
    Abstract: A method for logging events of computing devices. The method includes receiving, by a management service, a log event message from a computing device. The log event message includes a log event associated fingerprint. The method further includes reconstructing, by the management service, an object corresponding to the log event associated fingerprint and reconstructing, by the management service, at least one parent object of the object corresponding to the log event associated fingerprint. The method also includes gathering, by the management service, configuration information from the object corresponding to the log event associated fingerprint, and from the at least one parent object.
    Type: Grant
    Filed: February 26, 2016
    Date of Patent: October 30, 2018
    Assignee: Skyport Systems, Inc.
    Inventors: Robert Stephen Rodgers, Thomas John Giuli
  • Patent number: 10116569
    Abstract: In one example, a method includes measuring an available bandwidth of a communication path between a client and another entity, and determining a required bandwidth associated with a future transfer of a target dataset between the client and the other entity along the communication path. The required bandwidth is determined based on a size of the target dataset, and a data deduplication rate (DDR) of the client. The available bandwidth is then compared with the required bandwidth of the target dataset.
    Type: Grant
    Filed: August 19, 2016
    Date of Patent: October 30, 2018
    Assignee: EMC IP HOLDING COMPANY LLC
    Inventors: Balaji Panchanathan, Prafful Agarwal, Pravin Ashokkumar
  • Patent number: 10108635
    Abstract: A deduplication method using data association information includes extracting information about a target file and at least one reference file associated with the target file as association information before duplication determination is performed. The at least one reference file is identified by the association information as a comparison target set for comparison when the duplication determination of the target file is performed. The duplication determination is performed with the target file with respect to the at least one reference file in the selected comparison target set.
    Type: Grant
    Filed: December 2, 2014
    Date of Patent: October 23, 2018
    Assignee: SAMSUNG ELECTRONICS CO., LTD.
    Inventors: Hyun-Jung Shin, Ju-Pyung Lee
  • Patent number: 10108356
    Abstract: In one aspect, a method includes generating a protection file system in a deduplication storage array, generating a snapshot of a production volume in the deduplication storage array including hashes of data in the snapshot, generating a first file hierarchy for the hashes of the data in the snapshot in the protection file system and adding a retention indicator to each hash in the first file hierarchy.
    Type: Grant
    Filed: March 25, 2016
    Date of Patent: October 23, 2018
    Assignee: EMC IP Holding Company LLC
    Inventors: Assaf Natanzon, Kirill Shoikhet
  • Patent number: 10108652
    Abstract: A data storage system protects virtual machines using block-level backup operations and restores the data at a file level. The system accesses the virtual machine file information from the file allocation table of the host system underlying the virtualization layer. A file index associates this virtual machine file information with the related protected blocks in a secondary storage device during the block-level backup. Using the file index, the system can identify the specific blocks in the secondary storage device associated with a selected restore file. As a result, file level granularity for restore operations is possible for virtual machine data protected by block-level backup operations without restoring more than the selected file blocks from the block-level backup data.
    Type: Grant
    Filed: June 29, 2016
    Date of Patent: October 23, 2018
    Assignee: Commvault Systems, Inc.
    Inventors: Paramasivam Kumarasamy, Rahul S. Pawar, Amit Mitkar, Satish Chandra Kilaru