Fragmentation, Compaction And Compression Patents (Class 707/693)
  • Patent number: 11188467
    Abstract: A method is described. The method includes receiving a read or write request for a cache line. The method includes directing the request to a set of logical super lines based on the cache line's system memory address. The method includes associating the request with a cache line of the set of logical super lines. The method includes, if the request is a write request: compressing the cache line to form a compressed cache line, breaking the cache line down into smaller data units and storing the smaller data units into a memory side cache. The method includes, if the request is a read request: reading smaller data units of the compressed cache line from the memory side cache and decompressing the cache line.
    Type: Grant
    Filed: September 28, 2017
    Date of Patent: November 30, 2021
    Assignee: Intel Corporation
    Inventors: Israel Diamand, Alaa R. Alameldeen, Sreenivas Subramoney, Supratik Majumder, Srinivas Santosh Kumar Madugula, Jayesh Gaur, Zvika Greenfield, Anant V. Nori
  • Patent number: 11184749
    Abstract: The present invention may provide a method of managing, by a vehicle, a sensor. Herein, a method of managing, by a vehicle, a sensor may include: monitoring state information of a first sensor of a first vehicle; when the first sensor is abnormal, determining whether or not a second sensor performs the function of the first sensor; when the second sensor performs the function of the first sensor, reporting information representing that the second sensor performs the function of the first sensor; and receiving sensing data from the second sensor.
    Type: Grant
    Filed: December 3, 2018
    Date of Patent: November 23, 2021
    Assignees: Hyundai Motor Company, Kia Motors Corporation
    Inventors: Young Jin Na, Joon Young Kim
  • Patent number: 11176103
    Abstract: A method of structuring data in a virtual file system, includes using the file system to apply specific handling of data that represents genomic sequence information or information that is related to genomic sequences. The method also concerns portioning the data into a collection of storage devices that have different cost and performance characteristics, wherein the splitting policy is based on a cost model. The method is executable by employing a computing device functioning under software control.
    Type: Grant
    Filed: October 11, 2017
    Date of Patent: November 16, 2021
    Assignee: PetaGene Ltd
    Inventors: Daniel Leo Greenfield, Alban Rrustemi
  • Patent number: 11178088
    Abstract: Snippets of content associated with a communication platform are described. In an example, based at least in part on a determination, by the communication platform, that a user of the communication platform is permitted to access one or more snippets of content provided by one or more other users of the communication platform, causing one or more user interface elements associated with the one or more snippets of content to be presented via a user interface of a user computing device of the user. The communication platform can receive, from the user computing device, a request to view a snippet of content of the one or more snippets of content and can cause the snippet of content to be presented by the user computing device via the user interface associated with the communication platform.
    Type: Grant
    Filed: October 6, 2020
    Date of Patent: November 16, 2021
    Assignee: Slack Technologies, Inc.
    Inventors: Noah Weiss, John Rodgers, Kevin Marshall, Anna Niess, Michael Hahn, Ibrahim Madha, Pedro Carmo, Michael Montazeri, Ethan Eismann
  • Patent number: 11175993
    Abstract: In one embodiment, a method for managing a data storage system includes: in response to receiving a data object, sorting data records in the data object on the basis of a first query so as to form a first backup; causing the first backup to be stored in the data storage system; and causing to be stored, in an index of the data storage system, the first query and a first address of the first backup in the data storage system.
    Type: Grant
    Filed: June 12, 2015
    Date of Patent: November 16, 2021
    Assignee: International Business Machines Corporation
    Inventors: Liang Liu, Junmei Qu, Wen Jun Yin, Wei Zhuang
  • Patent number: 11176097
    Abstract: Embodiments for, in a shared storage environment, managing data replication between first and second sites of a distributed computing environment by one or more processors. Metadata is pre-seeded from the first to the second site as an assembled metadata map. Data blocks corresponding to the pre-seeded metadata not currently stored at the second site are determined by the second site using the metadata map within a deduplication environment. A transfer request for the data blocks is returned by the second site to the first site.
    Type: Grant
    Filed: August 26, 2016
    Date of Patent: November 16, 2021
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Emmanuel Barajas Gonzalez, Shaun E. Harrington, Harry McGregor, Christopher B. Moore
  • Patent number: 11170163
    Abstract: Systems and methods include identifying a first column heading selection for a first column in a table and identifying a second column heading selection for a second column in a table; defining a column combination based on the identified first column heading selection and identified second column heading selection; analyzing predefined column heading combinations contained in a memory to determine when the defined column combination corresponds to a predefined column heading combination from among the predefined column heading combinations contained in the memory; associating a predefined logical combination rule with the first column and the second column in the table based on a determination that the defined column combination corresponds to the predefined column heading combination; monitoring entries in the first column and the second column for a triggering event when the predefined logical combination rule is triggered; and altering display in the table using the predefined logical combination rule.
    Type: Grant
    Filed: January 7, 2021
    Date of Patent: November 9, 2021
    Assignee: MONDAY.COM
    Inventor: Daniel Lereya
  • Patent number: 11163468
    Abstract: Techniques for processing metadata (MD) may include: determining, in accordance with one or more criteria, a plurality of MD blocks that are similar and expected to have matching corresponding portions of MD in at least some of the plurality of MD blocks; forming a MD superblock including the plurality of MD blocks; filtering the MD superblock and generating a filtered MD superblock, wherein said filtering includes rearranging content of the MD superblock so that a first plurality of MD portions that are similar are grouped together in the filtered MD superblock, wherein at least some of the first plurality of MD portions that are similar are expected to match; and compressing the filtered MD superblock and generating a compressed filtered MD superblock. Filtering may include performing a bitshuffle algorithm that includes performing a bitwise transpose of a matrix of the MD blocks in the MD superblock.
    Type: Grant
    Filed: July 1, 2019
    Date of Patent: November 2, 2021
    Assignee: EMC IP Holding Company LLC
    Inventors: Aidan O Mahony, Jason J. Duquette
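The filter-then-compress idea in the abstract above can be illustrated with a minimal sketch. It assumes the MD blocks are plain byte strings, simplifies the bitwise transpose to bit planes over the concatenated bytes, and uses `zlib` as a stand-in compressor; the function names are hypothetical and not taken from the patent.

```python
import zlib

def bitshuffle(blocks):
    """Bitwise transpose: bit plane k collects bit k of every byte in the superblock."""
    data = b"".join(blocks)          # the "superblock" (assumed layout)
    out = bytearray(len(data))       # n bytes in, n bytes out (8 planes of n bits)
    pos = 0
    for bit in range(8):
        for byte in data:
            if (byte >> bit) & 1:
                out[pos >> 3] |= 1 << (pos & 7)
            pos += 1
    return bytes(out)

def bitunshuffle(shuffled, n):
    """Inverse of bitshuffle for a superblock of n bytes."""
    out = bytearray(n)
    pos = 0
    for bit in range(8):
        for i in range(n):
            if (shuffled[pos >> 3] >> (pos & 7)) & 1:
                out[i] |= 1 << bit
            pos += 1
    return bytes(out)

def compress_superblock(blocks):
    """Filter (bitshuffle) the superblock, then compress the filtered result."""
    return zlib.compress(bitshuffle(blocks))

def decompress_superblock(packed, n):
    return bitunshuffle(zlib.decompress(packed), n)
```

Grouping matching bits together tends to produce long runs that a general-purpose compressor handles well, although the actual gain depends on the data.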
  • Patent number: 11163651
    Abstract: A method, apparatus, system, and computer program product for restoring data. The restoring of the data to a storage system from a storage medium is initiated by a computer system. Changes to an amount of space available in the storage system to restore the data are identified by the computer system while the data is being restored to the storage system. The restoring of the data to the storage system is placed on hold by the computer system when an amount of space needed to complete restoring the data is greater than the amount of space available to restore the data.
    Type: Grant
    Filed: May 4, 2020
    Date of Patent: November 2, 2021
    Assignee: International Business Machines Corporation
    Inventors: Vijay S. Patil, Sahadev Dey, Rajiv Sundaramurthy, Amit Patra
  • Patent number: 11144580
    Abstract: Data storage for unstructured data such as JSON data stored as collections of documents transforms the JSON data into a columnar form of storing unstructured data by grouping similar fields together for facilitating retrieval of the individual fields from a range of documents. Groups of fields are stored in individual files for each field. Compound data such as arrays and subdocuments are also broken down into files for each atomic field. In other words, a compound document structure that defines a hierarchy or “tree” of fields is flattened such that each “leaf” of the tree is stored in a separate file.
    Type: Grant
    Filed: June 13, 2014
    Date of Patent: October 12, 2021
    Assignee: Imperva, Inc.
    Inventors: Ron Ben-Natan, Ury Segal
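The flattening described in the abstract above can be sketched as follows. This is a rough illustration only: in-memory lists stand in for the per-field files, and the path notation (`a.b`, `tags[]`) and function names are assumptions made for the example.

```python
from collections import defaultdict

def flatten(doc, prefix=""):
    """Yield (path, value) for every leaf of a nested JSON-like document."""
    if isinstance(doc, dict):
        for key, value in doc.items():
            yield from flatten(value, f"{prefix}.{key}" if prefix else key)
    elif isinstance(doc, list):
        for item in doc:
            yield from flatten(item, prefix + "[]")
    else:
        yield prefix, doc

def to_columns(docs):
    """Group leaf values by field path; each list plays the role of one field file."""
    columns = defaultdict(list)
    for doc_id, doc in enumerate(docs):
        for path, value in flatten(doc):
            columns[path].append((doc_id, value))
    return dict(columns)
```

Reading one field across a range of documents then touches only that field's column rather than every document.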
  • Patent number: 11126597
    Abstract: A database server may include a master table schema that defines a database table's configuration and an arrangement for corresponding shadow tables. The shadow tables contain data related to contiguous and non-overlapping time periods and writing to the shadow tables occurs in a rotational fashion so that only one active table is written to at any point. The server may upgrade the master table schema. The server then may determine that a rotation event has occurred where a first shadow table is active and a second shadow table is associated with an oldest of the contiguous and non-overlapping time periods. In response, the server may delete data in the second table, determine that the schema has been upgraded since the second table was most recently active, upgrade the second table's schema to match the schema, and set the second table to active enabling writing to the second table.
    Type: Grant
    Filed: January 17, 2019
    Date of Patent: September 21, 2021
    Assignee: ServiceNow, Inc.
    Inventors: Ellen Lorraine Ormerod, Josef Mart
  • Patent number: 11119681
    Abstract: A method for storing data in a storage system includes opportunistically compressing a plurality of objects of a first size during a write operation, storing the compressed objects of the first size if they compress acceptably, and storing other objects of a second size uncompressed. It further includes determining during a read operation whether an object of the second size is stored as a part of a compressed object; if the object of the second size is not stored as a part of a compressed object of the first size, then reading the object of the second size from storage; if the object of the second size is stored as a part of the compressed object of the first size, then: reading the compressed object of the first size from storage; uncompressing the compressed object of the first size; and extracting the object of the second size.
    Type: Grant
    Filed: April 28, 2018
    Date of Patent: September 14, 2021
    Assignee: Hewlett Packard Enterprise Development LP
    Inventor: Glenn Watkins
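The write and read paths in the abstract above might look roughly like this sketch. The sizes, the 25% savings threshold, the dictionary-backed store, and the function names are all assumptions chosen for illustration, and `zlib` stands in for whatever compressor a real system would use.

```python
import zlib

SMALL = 4096           # hypothetical "second size" (read granularity)
LARGE = 8 * SMALL      # hypothetical "first size" (compression granularity)
MIN_SAVING = 0.25      # assumed meaning of "compresses acceptably"

def write_large(store, base, data):
    """Store one LARGE object at a LARGE-aligned address, compressed only if worthwhile."""
    packed = zlib.compress(data)
    if len(packed) <= len(data) * (1 - MIN_SAVING):
        store[("c", base)] = packed                       # one compressed large object
    else:
        for off in range(0, len(data), SMALL):            # uncompressed small objects
            store[("u", base + off)] = data[off:off + SMALL]

def read_small(store, addr):
    """Return the SMALL object at addr, extracting it from a compressed object if needed."""
    if ("u", addr) in store:
        return store[("u", addr)]
    base = addr - addr % LARGE
    data = zlib.decompress(store[("c", base)])
    off = addr - base
    return data[off:off + SMALL]
```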
  • Patent number: 11113237
    Abstract: A method, article of manufacture, and apparatus for creating a fingerprint to container id index is discussed. The index may be stored in-memory, on disk, and on a solid-state device. The index may be used to quickly locate a container identifier given a data segment fingerprint.
    Type: Grant
    Filed: December 30, 2014
    Date of Patent: September 7, 2021
    Assignee: EMC IP HOLDING COMPANY LLC
    Inventors: Satish Visvanathan, Mahesh Kamat, Rahul B Ugale
  • Patent number: 11115706
    Abstract: A method, client, and terminal device for screen recording are provided, and the method includes: enabling a screen recording data synthesizing module (401) when a screen recording instruction is received (101); inputting an encoded audio data of a player into the screen recording data synthesizing module (401) for superposition, to obtain a merged audio data (102); and inputting a video data of the player into the screen recording data synthesizing module (401), and merging the video data with the merged audio data, to obtain a screen recording data (103). In the present disclosure, all the audio and video data in a live broadcasting scenario can be completely recorded to ensure the integrity of the live broadcasting scenario.
    Type: Grant
    Filed: December 17, 2019
    Date of Patent: September 7, 2021
    Assignee: Wangsu Science & Technology Co., Ltd.
    Inventor: Yanpeng Chen
  • Patent number: 11100072
    Abstract: A data amount compressing method for compressing a data amount corresponding to a learned model obtained by letting the learning model learn a predetermined data group, the learning model having a tree structure in which multiple nodes associated with respective hierarchically divided state spaces are hierarchically arranged, wherein each node in the learned model is associated with an error amount that is generated in the process of the learning and corresponds to prediction accuracy, and the data amount compressing method includes: a reading step of reading the error amount associated with each node; and a node deleting step of deleting a part of the nodes of the learned model according to the error amount read in the reading step, thereby compressing the data amount corresponding to the learned model.
    Type: Grant
    Filed: December 26, 2018
    Date of Patent: August 24, 2021
    Assignee: AISing LTD.
    Inventors: Junichi Idesawa, Shimon Sugawara
  • Patent number: 11101819
    Abstract: A method for compressing semi-structured data is discussed. The method includes accessing semi-structured data, the semi-structured data comprising a plurality of elements. The method includes determining a plurality of unique elements of the plurality of elements, each of the plurality of unique elements associated with a respective unique index of a plurality of unique indexes. Each of the unique indexes can indicate a position in one of a plurality of data stores. The method includes generating a sequence of encoded representations corresponding to the plurality of elements, the generating based on the plurality of unique indexes.
    Type: Grant
    Filed: June 29, 2017
    Date of Patent: August 24, 2021
    Assignee: PAYPAL, INC.
    Inventor: Diego Lagunas
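As a simplified illustration of the abstract above, the sketch below replaces each element with the index of its unique value. It assumes a single in-memory list as the data store and hashable elements; the patent describes a plurality of data stores, and the names here are hypothetical.

```python
def encode(elements):
    """Return (store, encoded): unique elements in first-seen order, plus one index per element."""
    store = []       # position in this list is the element's unique index
    index = {}
    encoded = []
    for element in elements:
        if element not in index:
            index[element] = len(store)
            store.append(element)
        encoded.append(index[element])
    return store, encoded

def decode(store, encoded):
    """Rebuild the original element sequence from the store and the index sequence."""
    return [store[i] for i in encoded]
```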
  • Patent number: 11094029
    Abstract: One embodiment provides for a method of transmitting data between multiple compute nodes of a distributed compute system, the method comprising creating a global view of communication operations to be performed between the multiple compute nodes of the distributed compute system, the global view created using information specific to a machine learning model associated with the distributed compute system; using the global view to determine a communication cost of the communication operations; and automatically determining a number of network endpoints for use in transmitting the data between the multiple compute nodes of the distributed compute system.
    Type: Grant
    Filed: April 10, 2017
    Date of Patent: August 17, 2021
    Assignee: INTEL CORPORATION
    Inventors: Dhiraj D. Kalamkar, Karthikeyan Vaidyanathan, Srinivas Sridharan, Dipankar Das
  • Patent number: 11093176
    Abstract: A method, apparatus, and system for compressing a data object for storage at an object store of a cloud computing platform using a global compression scheme is disclosed. The operations comprise: receiving a new data object for storage in an object store on a cloud computing platform; dividing the new data object into a plurality of chunks of a predetermined size; for each chunk of the new data object, determining a respective most similar existing chunk already stored in the object store; compressing the new data object, comprising compressing each chunk of the new data object based on the respective most similar existing chunk as a compression reference; and storing the compressed new data object in the object store.
    Type: Grant
    Filed: April 26, 2019
    Date of Patent: August 17, 2021
    Assignee: EMC IP HOLDING COMPANY LLC
    Inventors: Yossef Saad, Assaf Natanzon
  • Patent number: 11093134
    Abstract: To combine and apply a data volume reduction technique and an automatic tier management function, the invention provides a storage system that includes a processor and a storage medium and manages and stores data in tiers. The storage system includes a first storage tier that includes a storage area for storing data, and a second storage tier that includes a storage area for storing the data which is stored in the storage area of the first storage tier and whose storage area is changed. The processor calculates an I/O volume of the data in the first storage tier, determines the tier where data is stored based on the I/O volume, and physically stores data which is stored in the second storage tier in a storage medium corresponding to the determined tier.
    Type: Grant
    Filed: September 4, 2019
    Date of Patent: August 17, 2021
    Assignee: HITACHI, LTD.
    Inventors: Kazuki Matsugami, Tomohiro Yoshihara, Ryosuke Tatsumi
  • Patent number: 11086864
    Abstract: Methods and systems are disclosed that relate to optimizing search for data. In one aspect, an attribute vector may include unique value identifiers and be associated with a dictionary structure. For the unique value identifiers stored in the attribute vector and associated with the dictionary structure, start and end addresses associated with the unique value identifiers are computed. Based on the computation, a range of positional addresses associated with the unique value identifiers may be generated and stored in a data structure. Upon receiving a request to search for data, the range of positional addresses in which the unique value identifiers may be searched is determined. Based on the determination, a database search engine optimizes the search for data in the attribute vector.
    Type: Grant
    Filed: November 23, 2016
    Date of Patent: August 10, 2021
    Assignee: SAP SE
    Inventors: Yadesh Gupta, Sudhir Verma
  • Patent number: 11089338
    Abstract: Techniques and configurations for compression of image data in a progressive, lossless manner are disclosed. In an example, three-dimensional medical images may be compressed and decompressed with high-speed operations, through a compression technique performed on a cube (chunk) of voxels that includes generating a subsampled or filtered cube of voxels, and generating and optimizing a delta data set between the cube of voxels and the subsampled cube of voxels. This optimized delta data set is operable with a decompression technique to losslessly recreate the cube of voxels. Further, the compression technique may be progressively performed with multiple iterations, to allow multiple lower resolution versions of the images prior to loading or receiving the entire compressed data that is reconstructable in a lossless form. Use of this technique may result in dramatically reduced time to first image when visualizing 3D images and performing image data transfers.
    Type: Grant
    Filed: September 26, 2019
    Date of Patent: August 10, 2021
    Assignee: Vital Images, Inc.
    Inventor: William D. Hachfeld
  • Patent number: 11070231
    Abstract: A method of reducing the storage requirements of blockchain metadata via dictionary-style compression includes receiving a request to add a transaction block to a blockchain. The method further includes determining an identifier (ID) of a dictionary block most recently stored on the blockchain. The method further includes compressing, by a processing device, one or more transactions of the transaction block based on the dictionary block to generate a compressed transaction block. The method further includes adding the ID of the dictionary block to the compressed transaction block. The method further includes providing the compressed transaction block, including the ID of the dictionary block, for storage on the blockchain.
    Type: Grant
    Filed: December 31, 2018
    Date of Patent: July 20, 2021
    Assignee: Palo Alto Research Center Incorporated
    Inventors: Eric Allan Bier, Alejandro Brito, Shantanu Rane
  • Patent number: 11068405
    Abstract: A storage processor in a data storage system includes a compression selection component that selects a data compression component to be used to compress host I/O data that is flushed from a persistent cache of the storage processor based on a current fullness level of the persistent cache. The compression selection component selects compression components implementing compression algorithms having relatively lower compression ratios for relatively higher current fullness levels of the persistent cache, and selects compression components implementing compression algorithms having relatively higher compression ratios for relatively lower current fullness levels of the persistent cache.
    Type: Grant
    Filed: April 19, 2018
    Date of Patent: July 20, 2021
    Assignee: EMC IP Holding Company LLC
    Inventors: Philippe Armangau, Ivan Bassov, Monica Chaudhary, Christopher A. Seibel
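The selection policy in the abstract above can be sketched as below. The abstract speaks of separate compression components; here `zlib` compression levels stand in for them, and the fullness thresholds and function names are assumptions made for the example.

```python
import zlib

def pick_level(fullness):
    """Map cache fullness (0.0 to 1.0) to a zlib level: fuller cache, lighter compression."""
    if fullness >= 0.8:
        return 1     # fastest, lowest ratio: drain the cache quickly
    if fullness >= 0.5:
        return 6     # middle ground
    return 9         # slowest, highest ratio: the cache has room to spare

def flush(data, fullness):
    """Compress flushed data with the level chosen for the current fullness."""
    return zlib.compress(data, pick_level(fullness))
```

The trade-off is throughput against space: when the cache is close to full, finishing the flush sooner matters more than saving extra bytes.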
  • Patent number: 11061957
    Abstract: A system and method for intelligent content searching is disclosed herein. The system saves all searches executed by the user, periodically re-executes one or more of the previously saved searches, and displays the subsequent search results to the user at an appropriate time without any user intervention. In one aspect, the system periodically re-executes one or more of the previously saved searches upon the occurrence of a trigger event, which may be trending events, news events, type of menus and/or screens accessed, new content being added on one or more service providers, a boot event, passage of time since last search, etc. In this way, users do not need to set alerts or follow any search topic.
    Type: Grant
    Filed: December 2, 2014
    Date of Patent: July 13, 2021
    Assignee: ROKU, INC.
    Inventors: Jim Funk, Brandon Noffsinger
  • Patent number: 11036347
    Abstract: A system and method for standardizing user interface elements are presented. A first application is identified having a higher use metric than a second application, the first application including one or more user interface elements that have one or more respective parameters. The second application has one or more user interface elements that are similar to the user interface elements of the first application and has one or more respective parameters that are different than the respective parameters of the user interface elements of the first application. A determination of similarity is made between the user interface elements of the applications based upon at least one predetermined criterion. Based on the determination, one or more parameters of the user interface elements of the second application are modified to match one or more parameters of the first application.
    Type: Grant
    Filed: May 1, 2019
    Date of Patent: June 15, 2021
    Assignee: eBay Inc.
    Inventors: David A. Ramadge, Justin Van Winkle, Corinne Elizabeth Sherman
  • Patent number: 11036685
    Abstract: A method includes comparing a search key that includes bits having respective values and bit positions to a mask to identify masked and unmasked portions of the search key. The mask corresponds to multi-dimensional keys and has first and second values in bit positions corresponding to and not corresponding to, respectively, common bits. Each common bit has a respective same value and occurs in a respective same position in the multi-dimensional keys. The masked and unmasked portions are bits at bit positions corresponding to bit positions of bits of the mask having first and second values, respectively. The method includes determining, based on determining that values in bit positions of the masked portion match values in corresponding bit positions of a pattern, that the unmasked portion matches a compressed key without decompressing the compressed key, and based thereon, identifying a successful match between the search and compressed keys.
    Type: Grant
    Filed: January 2, 2019
    Date of Patent: June 15, 2021
    Assignee: Futurewei Technologies, Inc.
    Inventors: Ramabrahmam Velury, Jihui Tan, Guangcheng Zhou
  • Patent number: 11030191
    Abstract: Systems, methods, and devices for querying over an external table are disclosed. A method includes connecting a database platform to an external table such that the database platform has read access for the external table and does not have write access for the external table. The method includes receiving a query comprising a predicate, the query directed at least to data in the external table. The method includes determining, based on metadata, one or more partitions in the external table comprising data satisfying the predicate. The method includes pruning, based on the metadata, all partitions in the external table that do not comprise any data satisfying the predicate. The method includes generating a query plan comprising a plurality of discrete subtasks. The method includes assigning, based on the metadata, the plurality of discrete subtasks to one or more nodes in an execution platform.
    Type: Grant
    Filed: April 7, 2020
    Date of Patent: June 8, 2021
    Assignee: Snowflake Inc.
    Inventors: Subramanian Muralidhar, Benoit Dageville, Thierry Cruanes, Nileema Shingte, Saurin Shah, Torsten Grabs, Istvan Cseri
  • Patent number: 11016888
    Abstract: A method for compressing data in a local cache of a web server is described. A local cache compression engine accesses values in the local cache and determines a cardinality of the values of the local cache. The local cache compression engine determines a compression rate of a compression algorithm based on the cardinality of the values of the local cache. The compression algorithm is applied to the cache based on the compression rate to generate a compressed local cache.
    Type: Grant
    Filed: December 20, 2018
    Date of Patent: May 25, 2021
    Assignee: eBay Inc.
    Inventor: Amit Desai
  • Patent number: 10990565
    Abstract: A method, computer program product, and computing system for processing a data portion to divide the data portion into a plurality of data chunks; performing an entropy analysis on each of the plurality of data chunks to generate a plurality of data chunk entropies; and determining an average data chunk entropy from the plurality of data chunk entropies.
    Type: Grant
    Filed: May 3, 2019
    Date of Patent: April 27, 2021
    Assignee: EMC IP Holding Company, LLC
    Inventors: Sorin Faibish, Philip Shilane, Ivan Basov, Istvan Gonczi, Philippe Armangau, Vamsi Vankamamidi
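The chunk-and-average procedure in the abstract above can be sketched as follows. The abstract does not name the entropy measure, so byte-level Shannon entropy is assumed here, along with a fixed 4096-byte chunk size; the function names are hypothetical.

```python
import math
from collections import Counter

def shannon_entropy(chunk):
    """Shannon entropy of a byte string in bits per byte (0.0 to 8.0)."""
    if not chunk:
        return 0.0
    n = len(chunk)
    return -sum(c / n * math.log2(c / n) for c in Counter(chunk).values())

def average_chunk_entropy(data, chunk_size=4096):
    """Divide data into fixed-size chunks and average their entropies."""
    chunks = [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)]
    entropies = [shannon_entropy(c) for c in chunks]
    return sum(entropies) / len(entropies) if entropies else 0.0
```

A low average suggests the data portion is likely to compress well, while a value near 8 bits per byte suggests it is already compressed or encrypted.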
  • Patent number: 10983963
    Abstract: Embodiments for locating, identifying and categorizing data-assets through advanced machine learning algorithms implemented by profiler components across Hadoop and Hadoop Compatible File Systems, databases and in-memory objects automatically and periodically to provide a visual representation of the category of data infrastructure distributed across data-centers and multiple clusters, for the purposes of enriching data quality, enabling data discovery and improving outcomes from downstream systems.
    Type: Grant
    Filed: September 24, 2018
    Date of Patent: April 20, 2021
    Assignee: Cloudera, Inc.
    Inventors: Srikanth Venkatasubramanian, Babu Prakash Rao, Hemanth Yamijala, Rohit Choudhary, Raghumitra Kandikonda
  • Patent number: 10977221
    Abstract: Data is organized in a hierarchical data tree having nodes, and is formatted in human-readable data according to a schema. The data is canonically ordered in correspondence with a canonical ordering of a schema dictionary generated from the schema. The canonically ordered data is encoded into binary, including for each node, removing a label of the node, and adding a sequence number of the node corresponding to the canonical ordering, in binary.
    Type: Grant
    Filed: October 20, 2017
    Date of Patent: April 13, 2021
    Assignee: Hewlett Packard Enterprise Development LP
    Inventors: William Scherer, III, Jeffrey R. Hilland, Michael R. Garrett
  • Patent number: 10977251
    Abstract: A data store system may include an array of persistent storage devices configured to store a plurality of data store tables. The data store system may further include a processor in communication with the storage devices. The processor may receive a query containing a non-equality join condition on a first column from a first data store table and a second column on a second data store table. The processor may generate a bitmap based on the join condition. The bitmap may indicate respective matches between the first column and second column in accordance with the non-equality join condition. The bitmap may also be used each time the non-equality join condition is present in another received query. A method and computer-readable medium may also be implemented.
    Type: Grant
    Filed: December 29, 2016
    Date of Patent: April 13, 2021
    Assignee: Teradata US, Inc.
    Inventors: Michael A. Gibas, Grace K. Au
  • Patent number: 10970287
    Abstract: Cross-tabulation operation is performed within a columnar database management system. The columnar database management system receives a request to perform a cross-tabulation operation on a set of database tables. The columnar database management system determines values of cross-tabulation operation for each row of the result. The columnar database management system determines a domain for each value of the row dimension corresponding to a row combination. The columnar database management system determines an intersection set of the domains corresponding to values of the row dimensions for the row combination. The columnar database management system determines a value for the result column for the row combination as an aggregate value based on the records of the intersection set.
    Type: Grant
    Filed: March 29, 2019
    Date of Patent: April 6, 2021
    Assignee: OPEN TEXT HOLDINGS, INC.
    Inventors: Carles Bayés Martin, Jesús Malo Poyatos, Marc Rodriguez Sierra, Alejandro Sualdea Pérez
  • Patent number: 10963432
    Abstract: In one embodiment, a method includes generating a file list for an aggregation of files based on a file pattern descriptor for each file in the aggregation of files or a file name for each file in the aggregation of files. The method also includes opening a session with a storage system manager and writing data from each file in the file list to a storage tier of a storage system. The method further includes writing metadata and storage location information from each file in the file list to an index file, closing the index file, and closing the session with the storage system manager. Other systems, methods, and computer program products are described according to more embodiments.
    Type: Grant
    Filed: August 4, 2015
    Date of Patent: March 30, 2021
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Steven V. Kauffman, Rainer Richter
  • Patent number: 10956370
    Abstract: Techniques for processing a data set may comprise: performing first processing that forms a first compression unit, wherein the first compression unit includes data chunks including a first data chunk having a first entropy value less than an entropy threshold, the first processing including: receiving a second data chunk; determining, in accordance with criteria, whether to add the second data chunk to the first compression unit; and responsive to determining to add the second data chunk to the first compression unit, adding the second data chunk to the first compression unit; and compressing the first compression unit as a single compressible unit. The second chunk may be added if its entropy value is less than the entropy threshold and if entropy values of the first and second chunks are similar. The second chunk may be added if the resulting compression unit provides sufficient storage/compression benefit.
    Type: Grant
    Filed: September 9, 2019
    Date of Patent: March 23, 2021
    Assignee: EMC IP Holding Company LLC
    Inventors: Ivan Bassov, Sorin Faibish, Istvan Gonczi
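    Illustration (not part of the patent record): the entropy test described in the abstract above can be sketched roughly as follows. Chunks join one compression unit when their byte-level Shannon entropy is below a threshold and close to that of the unit's first chunk. The names `shannon_entropy`, `build_unit` and `compress_unit`, the 6.0 bits-per-byte threshold, and the similarity tolerance are all assumptions made for this sketch, and zlib merely stands in for whatever compressor a real system would use.

```python
import math
import zlib
from collections import Counter


def shannon_entropy(chunk: bytes) -> float:
    """Byte-level Shannon entropy in bits per byte (0.0 for empty input)."""
    if not chunk:
        return 0.0
    total = len(chunk)
    return -sum((n / total) * math.log2(n / total) for n in Counter(chunk).values())


def build_unit(chunks, threshold=6.0, tolerance=1.0):
    """Collect chunks into one unit; both parameters are illustrative assumptions."""
    unit = []
    for chunk in chunks:
        h = shannon_entropy(chunk)
        if h >= threshold:
            continue  # high-entropy chunk: unlikely to compress well, leave it out
        if unit and abs(h - shannon_entropy(unit[0])) > tolerance:
            continue  # entropy too different from the unit's first chunk
        unit.append(chunk)
    return unit


def compress_unit(unit) -> bytes:
    """Compress the whole unit as a single compressible unit."""
    return zlib.compress(b"".join(unit))
```

    The sketch leaves out the abstract's second criterion (whether the resulting unit gives sufficient storage benefit), which would need a size comparison before and after adding each chunk.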
  • Patent number: 10929460
    Abstract: A method and an apparatus for storing a resource and an electronic device are provided. The method includes: extracting a resource with a storage size exceeding a preset capacity threshold, and backing up the extracted resource to a cloud server; obtaining link address information of the resource backed up to the cloud server; performing a clip processing on the extracted resource, and encapsulating the link address information corresponding to the extracted resource into the clip-processed resource; and replacing the extracted resource stored in an electronic device with the encapsulated resource.
    Type: Grant
    Filed: August 3, 2018
    Date of Patent: February 23, 2021
    Inventor: Zhenlong Guo
  • Patent number: 10922279
    Abstract: Systems and methods are provided to ingest data objects from a flat file server for use in one or more system operations including providing a renderable data object to a user and updating a data item database. As described, the ingestion system includes an ingestion module, a flat file module, a compliance module, and a deduplication module wherein the modules together ingest a flat file data object, parse and process a renderable data object from the flat file data object, and store the renderable data object in a renderable object database.
    Type: Grant
    Filed: December 28, 2017
    Date of Patent: February 16, 2021
    Assignee: Groupon, Inc.
    Inventors: Ramya Amancharla, Anthony Caliendo, Brian David Fields, James J. Sullivan, Kyle Oppenheim, Rajat Shroff
  • Patent number: 10922702
    Abstract: Exemplary embodiments of the present disclosure provide a method, apparatus, and computer-readable medium for identifying. An exemplary method includes providing a plurality of identifiers from a plurality of data sources, the plurality of identifiers corresponding to a plurality of entities, and creating a plurality of tuples based on the plurality of identifiers, wherein each one of the plurality of tuples corresponds (i) to a particular one of the plurality of data sources and (ii) to at least two identifiers that are linked together. The method further includes receiving an identifier, and determining whether the received identifier matches any of the plurality of tuples.
    Type: Grant
    Filed: April 27, 2017
    Date of Patent: February 16, 2021
    Assignee: DotAlign, Inc.
    Inventor: Vince Scafaria
  • Patent number: 10901942
    Abstract: For offloading data to secondary storage, a criteria module checks a migration criteria of a data segment stored in a first data repository. The data segment may be associated with one or more entities. A threshold module determines whether the migration criteria of the data segment satisfies a migration threshold. A migration module migrates the data segment to a second data repository in response to the migration criteria of the data segment satisfying the migration threshold.
    Type: Grant
    Filed: March 1, 2016
    Date of Patent: January 26, 2021
    Assignee: International Business Machines Corporation
    Inventors: Joseph W. Dain, Gregory T. Kishi
  • Patent number: 10904338
    Abstract: A computer controls deduplication of data. The computer generates a hash of a remote data and a hash of a local data. The computer generates a set of unmatched hash data based on a comparison of the hash of the remote data against the hash of the local data. The computer generates a splitting cost that is associated with splitting the set of unmatched hash data. The computer sends a request to a server based on a comparison of the splitting cost to a threshold. The request dictates sending of the remote data to a storage controller.
    Type: Grant
    Filed: September 18, 2019
    Date of Patent: January 26, 2021
    Assignee: International Business Machines Corporation
    Inventors: Gregory J. Boss, Itzhack Goldberg, Jonathan D. Herd, Neil Sondhi
  • Patent number: 10891261
    Abstract: Embodiments of the present disclosure provide a method and device for deduplication. Specifically, the method may comprise obtaining a property of a file stream, the property of a file stream including a file type or a magic number identifying a format of a protocol or a file. The method further includes in response to receiving an I/O request for a data block of the file stream, assigning a deduplication level to the I/O request based on the property of the file stream. Moreover, the method further includes deduplicating the data block of the file stream based on the deduplication level assigned to the I/O request. In addition, a corresponding device and computer program product are provided.
    Type: Grant
    Filed: May 30, 2018
    Date of Patent: January 12, 2021
    Assignee: EMC IP Holding Company LLC
    Inventors: Leon Zhang, Henry Hao Fang, Chen Gong, Lester Ming Zhang, Yongli Wang, Huan Chen
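    Illustration (not part of the patent record): one way to picture the idea above is a lookup from a stream's leading magic bytes to a deduplication level, which then decides how each block is written. The magic numbers below are the well-known gzip, ZIP, PNG and PDF signatures; the level names, the mapping itself, and the `BlockStore` class are hypothetical choices made for this sketch.

```python
import hashlib

# Hypothetical mapping from leading magic bytes to a deduplication level.
MAGIC_LEVELS = {
    b"\x1f\x8b": "none",       # gzip: already compressed
    b"PK\x03\x04": "none",     # ZIP archive
    b"\x89PNG": "none",        # PNG image
    b"%PDF": "full",           # PDF document
}
DEFAULT_LEVEL = "full"


def dedup_level(header: bytes) -> str:
    """Pick a level from the first bytes of the file stream."""
    for magic, level in MAGIC_LEVELS.items():
        if header.startswith(magic):
            return level
    return DEFAULT_LEVEL


class BlockStore:
    """Toy store: 'full' blocks are fingerprinted, 'none' blocks are written as-is."""

    def __init__(self):
        self.blocks = []  # stored block payloads
        self.index = {}   # SHA-256 digest -> position in self.blocks

    def write(self, block: bytes, level: str) -> int:
        if level == "none":
            self.blocks.append(block)  # skip fingerprinting entirely
            return len(self.blocks) - 1
        digest = hashlib.sha256(block).digest()
        if digest not in self.index:
            self.blocks.append(block)
            self.index[digest] = len(self.blocks) - 1
        return self.index[digest]
```

    A caller would compute `dedup_level` once per file stream and pass the result along with every block write for that stream.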
  • Patent number: 10891390
    Abstract: A method for execution by a computing device to adjust data storage efficiency of data in a storage network begins by obtaining a data segment for storage in memory of the storage network. The method continues by obtaining access level information regarding the data segment, where the access level information includes an estimated retrieval frequency level for the data segment. The method continues by determining a storage approach for the data segment based on the access level information and processing the data segment based on the storage approach to produce a processed data segment. The method continues by dispersed storage error encoding the processed data segment to produce a set of encoded data slices, where a decode threshold number of encoded data slices is needed to recover the processed data segment. The method continues by sending the set of encoded data slices to the memory for storage therein.
    Type: Grant
    Filed: May 26, 2020
    Date of Patent: January 12, 2021
    Assignee: PURE STORAGE, INC.
    Inventors: Ilya Volvovski, Wesley B. Leggette, Michael C. Storm, Jason K. Resch
  • Patent number: 10891759
    Abstract: Disclosed herein is a method for lossless compression and regeneration of digital design data in a manner maintaining the native formats outputted by modeling software used with prime focus on reduction in file size, portability, interchangeability of file storage format and providing database management functions while being implemented as a plug-and-play add-on utility to existing modeling software. Feature-based extraction of design attributes serves as a core of this inventive method and software utility based thereon.
    Type: Grant
    Filed: March 23, 2016
    Date of Patent: January 12, 2021
    Inventor: Amar Phatak
  • Patent number: 10892037
    Abstract: A method for compressing molecular tagged sequence data includes: grouping sequence reads associated with a molecular tag sequence to form a family of sequence reads, corresponding vectors of flow space signal measurements and corresponding sequence alignments, calculating an arithmetic mean of the corresponding vectors of flow space signal measurements to form a vector of consensus flow space signal measurements, calculating a standard deviation of the corresponding vectors of flow space signal measurements to form a vector of standard deviations, determining a consensus base sequence based on the vector of consensus flow space signal measurements, determining a consensus sequence alignment and generating a compressed data structure comprising consensus compressed data, the consensus compressed data including for each family, the consensus base sequence, the consensus sequence alignment, the vector of consensus flow space signal measurements, the vector of standard deviations and the number of members.
    Type: Grant
    Filed: May 15, 2018
    Date of Patent: January 12, 2021
    Assignee: Life Technologies Corporation
    Inventor: Cheng-Zong Bai
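    Illustration (not part of the patent record): the per-family statistics named in the abstract above (arithmetic mean, standard deviation, and member count) can be sketched as below. The function name `consensus_flow_stats` and the dictionary layout are assumptions, the population standard deviation is an arbitrary choice because the abstract does not say which form is used, and consensus base calling and alignment are omitted.

```python
from statistics import mean, pstdev


def consensus_flow_stats(family):
    """family: non-empty list of equal-length flow-signal vectors for one molecular tag."""
    columns = list(zip(*family))  # one tuple of measurements per flow position
    return {
        "consensus": [mean(c) for c in columns],  # element-wise arithmetic mean
        "stdev": [pstdev(c) for c in columns],    # element-wise population std dev
        "members": len(family),                   # number of reads in the family
    }
```

    Storing these three items per family instead of every read's full signal vector is where the size reduction in such a scheme would come from.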
  • Patent number: 10884670
    Abstract: Methods, computer systems, and computer program products for processing data in a computing environment are provided. The computer environment for data deduplication storage receives a plurality of write operations for deduplication storage of the data. The data is buffered in a plurality of buffers with overflow temporarily stored to a memory hierarchy when the data received for deduplication storage is sequential or non-sequential. The data is accumulated and updated in the plurality of buffers per a data structure, the data structure serving as a fragment map between the plurality of buffers and a plurality of user file locations. The data is restructured in the plurality of buffers to form a complete sequence of a required sequence size. The data is provided as at least one stream to a stream-based deduplication algorithm for processing and storage.
    Type: Grant
    Filed: February 14, 2018
    Date of Patent: January 5, 2021
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Shay H. Akirav, Ron Edelstein, Michael Hirsch, Ariel J. Ish-Shalom, Liran Loya, Itai Tzur
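    Illustration (not part of the patent record): the buffering step above can be pictured as a fragment map keyed by file offset that releases data only once a contiguous run of the required size has accumulated. The class `SequenceBuffer` and its fields are hypothetical; the sketch ignores overlapping writes and the overflow to a memory hierarchy mentioned in the abstract.

```python
class SequenceBuffer:
    """Accumulate out-of-order writes and emit complete contiguous sequences."""

    def __init__(self, sequence_size: int):
        self.sequence_size = sequence_size  # minimum bytes per emitted sequence
        self.fragments = {}                 # fragment map: file offset -> bytes
        self.next_offset = 0                # where the next sequence must start

    def write(self, offset: int, data: bytes):
        """Buffer one write; return any sequences that are now complete."""
        if data:  # ignore empty writes so the scan below always advances
            self.fragments[offset] = data
        return self._drain()

    def _drain(self):
        emitted = []
        while True:
            keys, pos = [], self.next_offset
            # Walk contiguous fragments until the required size is reached.
            while pos in self.fragments and pos - self.next_offset < self.sequence_size:
                keys.append(pos)
                pos += len(self.fragments[pos])
            if pos - self.next_offset < self.sequence_size:
                return emitted  # sequence still incomplete; keep buffering
            emitted.append(b"".join(self.fragments.pop(k) for k in keys))
            self.next_offset = pos
```

    Each emitted sequence would then be handed to the stream-based deduplication step as one unit.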
  • Patent number: 10884987
    Abstract: Methods and apparatus, including computer program products, for block compression of tables with repeated values. In general, value identifiers representing a compressed column of data may be sorted to render repeated values contiguous, and block dictionaries may be generated. A block dictionary may be generated for each block of value identifiers. Each block dictionary may include a list of block identifiers, where each block identifier is associated with a value identifier and there is a block identifier for each unique value in a block. Blocks may have standard sizes and block dictionaries may be reused for multiple blocks.
    Type: Grant
    Filed: August 8, 2016
    Date of Patent: January 5, 2021
    Assignee: SAP SE
    Inventors: Franz Faerber, Guenter Radestock, Andrew Ross
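    Illustration (not part of the patent record): a minimal sketch of per-block dictionaries, assuming a column already encoded as integer value identifiers. Sorting makes repeated values contiguous, and each fixed-size block is then encoded against a small dictionary holding only the value identifiers that occur in it. The function names and the block size of 4 are assumptions; a real column store would also have to track the row order that sorting discards.

```python
def block_compress(value_ids, block_size=4):
    """Sort value IDs, split them into fixed-size blocks, and encode each block
    against its own dictionary of the unique value IDs it contains."""
    ordered = sorted(value_ids)
    blocks = []
    for start in range(0, len(ordered), block_size):
        block = ordered[start:start + block_size]
        dictionary = sorted(set(block))  # block ID = position in this list
        lookup = {vid: bid for bid, vid in enumerate(dictionary)}
        blocks.append((dictionary, [lookup[v] for v in block]))
    return blocks


def block_decompress(blocks):
    """Expand block IDs back into the sorted sequence of value IDs."""
    return [dictionary[bid] for dictionary, codes in blocks for bid in codes]
```

    Because a block with few distinct values needs few block identifiers, each code can be stored in fewer bits than the original value identifier.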
  • Patent number: 10877680
    Abstract: Embodiments of the present invention provide a data processing method and apparatus. According to the embodiments of the present invention, when it is found that a data hash value in a currently received data stream exceeds a preset first threshold, a part or all of data in the data stream is not deduplicated, and is directly stored, so as to prevent the data in the data stream from being dispersedly stored into a plurality of storage areas; instead, the part or all of the data is stored into a storage area in a centralized manner, so that a deduplication rate is effectively improved on the whole, particularly in a scenario of large data storage amount.
    Type: Grant
    Filed: May 14, 2014
    Date of Patent: December 29, 2020
    Assignee: Huawei Technologies Co., Ltd.
    Inventors: Yanhui Zhong, Zongquan Zhang
  • Patent number: 10877887
    Abstract: A data storage device may include: a nonvolatile memory device including first and second memory regions configured to be read-interleaved with each other; and a processor configured to select a first read command among read commands received from a host device, select a second read command among the read commands excluding the first read command, and control the nonvolatile memory device to perform map read on the first and second read commands at the same time. The processor selects, as the second read command, at least one read command that is configured to be read-interleaved with the first read command.
    Type: Grant
    Filed: August 23, 2018
    Date of Patent: December 29, 2020
    Assignee: SK hynix Inc.
    Inventor: In Jung
  • Patent number: 10872096
    Abstract: A computer-implemented method for electronic exchange of data is provided. The method includes the following operations performed by at least one computer processor. These operations include creating source data, identifying data structure from the source data, generating a header file based on the data structure, localizing identical data structure, and storing groups of data that have identical structure in a single data tag.
    Type: Grant
    Filed: January 31, 2020
    Date of Patent: December 22, 2020
    Assignees: Charbel Gerges El Gemayel
    Inventors: Charbel Gerges El Gemayel, Adib Khalil Haddad, Edgard Joseph Elian
  • Patent number: 10860539
    Abstract: A de-duplication-based remote replication method and an apparatus are provided in a system including a primary end device and a disaster recovery end device, and both the primary end device and the disaster recovery end device store a first snapshot; the primary end device obtains a second snapshot of the primary end device, and sends the first data block, the fingerprint of the first data block, and metadata of the added data blocks to the disaster recovery end device when a fingerprint of a first data block in the added data blocks is different from the fingerprints of the data blocks in the first snapshot.
    Type: Grant
    Filed: April 13, 2017
    Date of Patent: December 8, 2020
    Assignee: HUAWEI TECHNOLOGIES CO., LTD.
    Inventors: Yanhui Zhong, Chengwei Zhang
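    Illustration (not part of the patent record): the comparison described above can be sketched with snapshots modeled as dictionaries from block address to block bytes. A block added since the first snapshot is sent in full only when its fingerprint matches nothing in the first snapshot; otherwise only its address and fingerprint go over the wire. The function names, the tuple layout, and the use of SHA-256 as the fingerprint are assumptions made for this sketch.

```python
import hashlib


def fingerprint(block: bytes) -> str:
    """Fingerprint of a block (SHA-256 here, chosen only for illustration)."""
    return hashlib.sha256(block).hexdigest()


def blocks_to_send(first_snapshot: dict, second_snapshot: dict):
    """Return (address, fingerprint, payload) tuples for the added blocks.

    payload is None when the disaster recovery end already holds a block
    with the same fingerprint in the first snapshot.
    """
    known = {fingerprint(b) for b in first_snapshot.values()}
    to_send = []
    for addr, block in second_snapshot.items():
        if first_snapshot.get(addr) == block:
            continue  # unchanged since the first snapshot, so not an added block
        fp = fingerprint(block)
        to_send.append((addr, fp, None if fp in known else block))
    return to_send
```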