Fragmentation, Compaction And Compression Patents (Class 707/693)
  • Patent number: 10838931
    Abstract: Systems and methods are disclosed for efficiently indexing stream data to facilitate full-text search of the stream data. A stream comprises a plurality of intervals of log data records. An interval of log data records are indexed. The index and log data records for the interval are written to an indexed stream data file. The index for each interval contains pointers to the terms in the log data records for the interval. After a number of intervals of index and log data records have been written, a merge operation can merge the number of intervals of index into a single merged index. The merged index and intervals of log data records are written to the indexed data stream file. A full-text search index is generated by traversing and merging the interval indexes for the data stream.
    Type: Grant
    Filed: April 28, 2017
    Date of Patent: November 17, 2020
    Assignee: EMC IP HOLDING COMPANY LLC
    Inventors: Aaron W. Spiegel, Stephen G. Graham, Paul R Kingston
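The interval-then-merge indexing described in this abstract can be sketched roughly as follows. This is an illustration only, not the patented implementation: the function names `index_interval` and `merge_indexes`, the whitespace tokenizer, and the `(record number, word number)` pointer format are all assumptions.

```python
from collections import defaultdict


def index_interval(records, base_offset):
    """Index one interval: term -> list of (record_no, word_no) pointers."""
    index = defaultdict(list)
    for rec_no, record in enumerate(records, start=base_offset):
        for word_no, term in enumerate(record.lower().split()):
            index[term].append((rec_no, word_no))
    return dict(index)


def merge_indexes(interval_indexes):
    """Merge several interval indexes into a single index with sorted pointers."""
    merged = defaultdict(list)
    for index in interval_indexes:
        for term, pointers in index.items():
            merged[term].extend(pointers)
    return {term: sorted(ptrs) for term, ptrs in merged.items()}


# Two hypothetical intervals of log records.
intervals = [
    ["disk error on node1", "login ok"],
    ["disk replaced on node1"],
]
indexes = []
offset = 0
for records in intervals:
    indexes.append(index_interval(records, offset))
    offset += len(records)
merged = merge_indexes(indexes)
```

Looking up `merged["disk"]` then yields pointers into both intervals, which is the effect a merged full-text index is meant to provide.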
  • Patent number: 10841405
    Abstract: A system may include a storage device configured to store a plurality of database tables. The system may further include a processor in communication with the storage device. The processor may receive a request to transmit a database table from the plurality of database tables. The database table may have a plurality of rows. The processor may determine if contents of each column row of each row of the database table are eligible to be compressed. For each column row that contains eligible contents, the processor may generate compressed data representative of the contents of a respective column row. The processor may remove the contents of the respective column row from the associated row. The processor may transmit the compressed data and the database table without content of the column rows represented by the compressed data. A method and computer-readable medium may also be implemented.
    Type: Grant
    Filed: March 14, 2014
    Date of Patent: November 17, 2020
    Assignee: Teradata US, Inc.
    Inventors: Victor Lewis, III, Bret M. Gregory
  • Patent number: 10826778
    Abstract: Methods, systems, and computer program products for discovering network connected devices are described. A semantic query for a network connected device is parsed, with the semantic query identifying one or more capabilities of a desired network connected device. A network address of a network connected device satisfying the parsed semantic query is identified and a query response identifying the network address of the network connected device is provided.
    Type: Grant
    Filed: December 6, 2016
    Date of Patent: November 3, 2020
    Assignee: SAP SE
    Inventors: Martin Knechtel, Axel Schroeder
  • Patent number: 10824794
    Abstract: A computer system identifies that a first portion of markup language, extracted from a markup language document of a website, corresponds to a first actionable element, wherein the first portion of markup language is a variable length representation. In response to identifying that the first portion of markup language corresponds to the first actionable element, the computer system utilizes a recurrent neural network (RNN) encoder to create a first code representation that corresponds to the first portion of markup language. The computer system identifies a first additional information that corresponds to one or more pre-defined goals. The computer system creates a final fixed length markup language representation that includes the first code representation and the first additional information. The computer system inputs the final fixed length markup language representation into a model.
    Type: Grant
    Filed: April 8, 2019
    Date of Patent: November 3, 2020
    Assignee: PAYPAL, INC.
    Inventor: Yarden Raiskin
  • Patent number: 10824374
    Abstract: Intelligent compression of data storage volumes in a service provider system. For example, in one embodiment of a computer-implemented method, attachment metrics are compiled for block storage volumes coupled to a storage server. The attachment metrics may include temporal data related to block storage volume detachments and attachments in relation to a plurality of compute instances; and prioritizing compression of the block storage volumes based on the attachment metrics; and compressing the block storage volumes in accordance with the prioritization.
    Type: Grant
    Filed: June 25, 2018
    Date of Patent: November 3, 2020
    Assignee: Amazon Technologies, Inc.
    Inventor: Timothy David Gasser
  • Patent number: 10824596
    Abstract: Innovations for adaptive compression and decompression for dictionaries of a column-store database can reduce the amount of memory used for columns of the database, allowing a system to keep column data in memory for more columns, while delays for access operations remain acceptable. For example, dictionary compression variants use different compression techniques and implementation options. Some dictionary compression variants provide more aggressive compression (reduced memory consumption) but result in slower run-time performance. Other dictionary compression variants provide less aggressive compression (higher memory consumption) but support faster run-time performance. As another example, a compression manager can automatically select a dictionary compression variant for a given column in a column-store database.
    Type: Grant
    Filed: January 23, 2019
    Date of Patent: November 3, 2020
    Assignee: SAP SE
    Inventors: Ingo Mueller, Cornelius Ratsch, Peter Sanders, Franz Faerber
  • Patent number: 10818105
    Abstract: Methods and systems for assessing, detecting, and responding to malfunctions involving components of autonomous vehicles and/or smart homes are described herein. Malfunctions may be detected by receiving sensor data from a plurality of sensors. One of these sensors may be selected for assessment. An electronic device may obtain from the selected sensor a set of signals. When the set of signals includes signals that are outside of a determined range of signals associated with proper functioning for the selected sensor, it may be determined that the selected sensor is malfunctioning. In response, an action may be performed to resolve the malfunction and/or mitigate consequences of the malfunction.
    Type: Grant
    Filed: January 18, 2017
    Date of Patent: October 27, 2020
    Assignee: STATE FARM MUTUAL AUTOMOBILE INSURANCE COMPANY
    Inventors: Blake Konrardy, Scott T. Christensen, Gregory Hayward, Scott Farris
  • Patent number: 10817475
    Abstract: A method, computer program product, and computing system for encoding a candidate data portion to generate an encoded candidate data portion; identifying one or more portion similarities between the encoded candidate data portion and an encoded target data portion to position the one or more portion similarities with respect to the encoded target data portion, thus generating one or more portion similarity measurements; identifying one or more portion differences between the encoded candidate data portion and the encoded target data portion to generate one or more portion difference measurements; and combining the one or more portion similarity measurements and the one or more portion difference measurements to generate a candidate similarity measurement for the candidate data portion.
    Type: Grant
    Filed: May 3, 2019
    Date of Patent: October 27, 2020
    Assignee: EMC IP Holding Company, LLC
    Inventors: Sorin Faibish, Philip Shilane, Ivan Basov, Istvan Gonczi, Philippe Armangau, Vamsi Vankamamidi
  • Patent number: 10820022
    Abstract: A content streaming system and methodology for facilitating the management of content streaming. A video packaging and origination service provides streaming content that is organized according to a set of encoded content chunks. A video playback application processes the set of encoded content chunks to dynamically form a content segment for live video streaming. The video playback application further processes the set of encoded content chunks to apply framerate heuristics associated with encoded content segments.
    Type: Grant
    Filed: May 3, 2018
    Date of Patent: October 27, 2020
    Assignee: Amazon Technologies, Inc.
    Inventors: Nicolas Weil, Lionel Bringuier
  • Patent number: 10812544
    Abstract: Embodiments regard transfer of data streaming services to provide continuous data flow. An embodiment of an apparatus includes a processor to process data for streaming to one or more organizations; and a memory to store data for streaming to the one or more organizations, wherein the apparatus is to provide a centralized work distribution service to track status of each of a plurality of data streams to the one or more organizations, and a plurality of nodes, each node being a virtual machine to stream one or more data streams to the one or more organizations, each node including a first daemon service to monitor connectivity of the node to dependency services for the node and, upon detecting a loss of connection to one or more of the dependency services, the node to discontinue ownership of the one or more data streams of the node and a second daemon service to poll the centralized work distribution service for data streams that are not assigned.
    Type: Grant
    Filed: January 26, 2018
    Date of Patent: October 20, 2020
    Assignee: salesforce.com, inc.
    Inventors: Shreedhar Sundaram, Yogesh Patel, William Victor Gray, Shaahin Mehdinezhad Rushan, Mahalaxmi Sanathkumar, Anjani Gupta, Rajkumar Pellakuru, Bhaves Patel, William Edward Hackett
  • Patent number: 10810162
    Abstract: A perfect hash vector (PHVEC) is created to track segments in a deduplication file system. Files are represented by segment trees having hierarchical segment levels. Containers store the segments and fingerprints of segments. Upper-level segments are traversed to identify a first set of fingerprints of each level. These fingerprints correspond to segments that should be present. The first set of fingerprints are hashed and bits are set in the PHVEC corresponding to positions from the hashing. The containers are read to identify a second set of fingerprints. These fingerprints correspond to segments that are present. The second set of fingerprints are hashed and bits are cleared in the PHVEC corresponding to positions from the hashing. If a bit was set and not cleared, a determination is that there is at least one segment missing. If all bits set were also cleared, a determination is that no segments are missing.
    Type: Grant
    Filed: July 12, 2018
    Date of Patent: October 20, 2020
    Assignee: EMC IP Holding Company LLC
    Inventors: Tony Wong, Abhinav Duggal, Ramprasad Chinthekindi
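The set-then-clear bit logic in this abstract can be sketched as below. A real perfect hash function is replaced here by a plain dictionary that assigns each expected fingerprint a unique bit position; that stand-in, and the names `build_perfect_hash` and `has_missing_segments`, are assumptions made for illustration.

```python
def build_perfect_hash(fingerprints):
    """Stand-in for a perfect hash: each known fingerprint gets a unique position."""
    return {fp: pos for pos, fp in enumerate(sorted(set(fingerprints)))}


def has_missing_segments(expected_fps, present_fps):
    """Return True if at least one expected segment is not present in the containers."""
    position = build_perfect_hash(expected_fps)
    bits = [False] * len(position)
    for fp in expected_fps:        # set bits for segments that should be present
        bits[position[fp]] = True
    for fp in present_fps:         # clear bits for segments that are present
        if fp in position:         # ignore fingerprints outside the expected set
            bits[position[fp]] = False
    return any(bits)               # a bit still set means a segment is missing
```

For example, `has_missing_segments(["a", "b", "c"], ["a", "c"])` reports a missing segment because the bit for `"b"` is set but never cleared.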
  • Patent number: 10810174
    Abstract: A database includes a plurality of data blocks. Each of the plurality of data blocks includes a plurality of data pages in which a plurality of column values recorded in one or more records corresponding to the data block are stored. Each of the plurality of data pages has two or more column values in one column corresponding to the data page stored therein. A database server selects a data block from the plurality of data blocks and specifies a data page to be scanned from the selected data block.
    Type: Grant
    Filed: April 13, 2015
    Date of Patent: October 20, 2020
    Assignee: HITACHI, LTD.
    Inventors: Takayuki Tsuchida, Michiko Tanaka, Akira Shimizu, Shinji Fujiwara, Kazuhiko Mogi
  • Patent number: 10795597
    Abstract: Methods, systems, and apparatuses are described for provisioning storage devices. An example method includes specifying a logical zone granularity for logical space associated with a disk drive. The method further includes provisioning a zone of a physical space of the disk drive based at least in part on the specified logical zone granularity. The method also includes storing compressed data in the zone in accordance with the provisioning.
    Type: Grant
    Filed: September 11, 2018
    Date of Patent: October 6, 2020
    Assignee: SEAGATE TECHNOLOGY LLC
    Inventor: Timothy R. Feldman
  • Patent number: 10795875
    Abstract: A data storing method performed by a data storing apparatus according to an exemplary embodiment of the present invention includes a step of calling insert operation for storing first data in a first page having a space for storing data, a step of checking whether overflow in which there is no space for storing the first data in the first page occurs, a step of checking whether the overflow occurs in the first page and dead data is stored in the first page, and a step of generating an overflow buffer page to store the first data in the overflow buffer page when the overflow occurs in the first page and the dead data is stored in the first page and there is a transaction accessing the dead data.
    Type: Grant
    Filed: August 12, 2015
    Date of Patent: October 6, 2020
    Assignees: Industry-University Cooperation Foundation Hanyang University, UNIST Academy-Industry Research Corporation
    Inventors: You Jip Won, Beom Seok Nam
  • Patent number: 10789237
    Abstract: Techniques are described for providing a storage service that stores information about large numbers of transactions in a persistent manner, such as with a high degree of reliability, availability and scalability based at least in part on use of a distributed computing and storage system. In some situations, the transaction information storage service stores various information about transactions that each include at least one monetary payment (e.g., a micro-payment) between financial accounts of two or more of numerous users having accounts with one or more entities. The transaction information storage service may be provided by or otherwise affiliated with a merchant, and customers of the merchant may purchase usage of the storage service for programs executed by or otherwise affiliated with the customers, with the storage service available to remote executing programs via a defined API of the storage service, such as a Web services-based API.
    Type: Grant
    Filed: August 9, 2017
    Date of Patent: September 29, 2020
    Assignee: Amazon Technologies, Inc.
    Inventors: Vikas Gupta, Allan H. Vermeulen, Rahul Singh, Duane J. Krause, Nipoon Malhotra
  • Patent number: 10776361
    Abstract: Systems, device and techniques are disclosed for a time series database search system. A data object may be received. The data object may include timestamp data indicating a time at which an event occurred, a value indicating a measure of the event, and key-value pairs comprising data associated with the event. A hash ID may be generated by hashing the one or more key-value pairs. The timestamp data, the value, and the hash ID may be stored in a first database as an object in the first database. The key-value pairs and the hash ID may be stored in a second database as an object in the second database.
    Type: Grant
    Filed: April 7, 2017
    Date of Patent: September 15, 2020
    Assignee: salesforce.com, inc.
    Inventor: Brandon Svec
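A minimal sketch of the two-store layout described in this abstract, assuming in-memory dictionaries in place of the two databases and SHA-256 over a canonical JSON encoding as the hash. The names `metrics_db`, `tags_db`, `hash_id`, and `ingest`, and the field names of the data object, are hypothetical.

```python
import hashlib
import json

metrics_db = {}  # first store: (hash_id, timestamp) -> value
tags_db = {}     # second store: hash_id -> key-value pairs


def hash_id(tags):
    """Hash the key-value pairs in a canonical order so equal tag sets collide."""
    canonical = json.dumps(tags, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def ingest(obj):
    """Split one data object across the two stores, linked by the hash ID."""
    hid = hash_id(obj["tags"])
    metrics_db[(hid, obj["timestamp"])] = obj["value"]
    tags_db[hid] = obj["tags"]
    return hid


first = ingest({"timestamp": 100, "value": 0.5, "tags": {"host": "a", "dc": "eu"}})
second = ingest({"timestamp": 160, "value": 0.7, "tags": {"dc": "eu", "host": "a"}})
```

Both objects carry the same tags, so the tag set is stored once while each measurement keeps its own timestamped row.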
  • Patent number: 10768843
    Abstract: Techniques for data processing may include: receiving a candidate block including a plurality of uniformly-sized sub-blocks, wherein a tag is stored at a first location in the candidate block; performing data deduplication processing of the candidate block, wherein the data deduplication processing excludes content stored from a first offset to a second offset corresponding to the first location; determining whether at least one sub-block of the candidate block has been deduplicated by the data deduplication processing; and responsive to determining that at least one sub-block of the candidate block has been deduplicated, storing the candidate block as a deduplicated data block having at least one sub-block matching an existing target sub-block, wherein a tag descriptor describing the tag is stored and associated with the candidate block, such as in block-level metadata of the candidate block. The tag descriptor may include tag content and tag location information.
    Type: Grant
    Filed: February 4, 2019
    Date of Patent: September 8, 2020
    Assignee: EMC IP Holding Company LLC
    Inventors: Sorin Faibish, Philippe Armangau, Istvan Gonczi, Ivan Bassov, Anton Kucherov
  • Patent number: 10762071
    Abstract: Innovations in performing sort operations for dictionary-compressed values of columns in a column-store database using value identifiers (“IDs”) are described. For example, a database system includes a data store and an execution engine. The data store stores values at positions of a column. A dictionary maps distinct values to corresponding value IDs. An inverted index stores, for each of the corresponding value IDs, a list of those of the positions that contain the associated distinct value. The execution engine processes a request to sort values at an input set of the positions and identify an output set of the positions for sorted values. In particular, the execution engine iterates through positions stored in the lists of the inverted index. For a given position, the execution engine checks if the given position is one of the input set and, if so, adds the given position to the output set.
    Type: Grant
    Filed: November 29, 2016
    Date of Patent: September 1, 2020
    Assignee: SAP SE
    Inventors: Robert Schulze, Thomas Peh
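The sort-by-inverted-index idea in this abstract can be sketched as follows, assuming the dictionary is kept in sorted order so that ascending value IDs correspond to ascending values (the abstract does not state this). The function name `sort_positions` and the sample column are hypothetical.

```python
def sort_positions(dictionary, inverted_index, input_positions):
    """Return the input positions ordered by the values stored at them."""
    wanted = set(input_positions)
    output = []
    for value_id in range(len(dictionary)):   # value IDs in sorted value order
        for pos in inverted_index[value_id]:  # positions holding this value
            if pos in wanted:
                output.append(pos)
    return output


column = ["pear", "apple", "fig", "apple", "pear"]
dictionary = sorted(set(column))              # value ID = index in this list
value_id_of = {value: vid for vid, value in enumerate(dictionary)}
inverted_index = [[] for _ in dictionary]
for pos, value in enumerate(column):
    inverted_index[value_id_of[value]].append(pos)

result = sort_positions(dictionary, inverted_index, [0, 2, 3])
```

No values are compared during the sort itself; the order comes entirely from walking the inverted index lists in value ID order.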
  • Patent number: 10761762
    Abstract: A technique for writing data in a data storage system includes aggregating data received in a set of I/O requests into a batch that includes multiple extents of data. After compressing a current extent of the batch and determining that the compressed extent does not fit in a space where a previous version of the extent is stored, the technique performs a batch-relocate operation by gathering a set of mapping metadata for mapping each of the extents in the batch, identifying a set of holes indicated by the set of mapping metadata, and adding the holes to a batch-hole list. The technique then selects a hole, from the batch-hole list, which is big enough to accommodate the compressed extent, and places the compressed extent in the selected hole.
    Type: Grant
    Filed: October 31, 2018
    Date of Patent: September 1, 2020
    Assignee: EMC IP Holding Company LLC
    Inventors: Philippe Armangau, Monica Chaudhary, Ajay Karri, Alexander Daniel
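Selecting a hole from the batch-hole list can be sketched as below. The abstract only requires a hole that is big enough; the best-fit policy, the `(offset, length)` hole representation, and the name `place_extent` are assumptions for illustration.

```python
def place_extent(compressed_size, hole_list):
    """Pick the smallest hole that fits, remove it from the list, and return it.

    hole_list holds (offset, length) tuples; returns None when nothing fits.
    """
    candidates = [hole for hole in hole_list if hole[1] >= compressed_size]
    if not candidates:
        return None
    chosen = min(candidates, key=lambda hole: hole[1])  # best fit (assumed policy)
    hole_list.remove(chosen)
    return chosen


holes = [(0, 4096), (8192, 2048), (16384, 1024)]
placed = place_extent(1500, holes)
```

A first-fit policy would also satisfy the abstract; best fit is used here only because it tends to leave larger holes available for later extents.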
  • Patent number: 10762047
    Abstract: A technique for writing data in a file system includes aggregating data received in a set of I/O requests into a batch that includes multiple extents of data. After compressing a current extent of the batch and determining that the compressed extent does not fit in a space where a previous version of the extent is stored, the technique performs an FS-relocate operation by accessing an FS-hole list provided for the file system and selecting a hole, from the FS-relocate list, which is large enough to accommodate the compressed extent. The technique then places the compressed extent in the selected hole.
    Type: Grant
    Filed: October 31, 2018
    Date of Patent: September 1, 2020
    Assignee: EMC IP Holding Company LLC
    Inventors: Philippe Armangau, Ajay Karri, Monica Chaudhary, Alexander Daniel
  • Patent number: 10762119
    Abstract: A semantic labeling apparatus and method thereof include a place identifier processor configured to, based on location data of a user, generate place attributes of places that indicate information of a user visit for each place, wherein user location remains unchanged within the places for a predetermined period of time. A group identifier processor is configured to cluster the places based on the place attributes, classify the places into groups, acquire a semantic label for each of the groups, and designate the acquired semantic label as the semantic label of each of the groups. A label determiner is configured to determine the semantic label of each of the groups as a semantic label of each member place of each of the groups.
    Type: Grant
    Filed: April 21, 2015
    Date of Patent: September 1, 2020
    Assignee: Samsung Electronics Co., Ltd.
    Inventors: Jae Mo Sung, Min Young Mun
  • Patent number: 10754836
    Abstract: Embodiments of the invention include a system and set of processes for organizing image collections. The system detects individuals in each image uploaded into the system using facial recognition or similar methods. The user and viewers of the images may then view dynamic albums based on the interrelationships of individuals in images. Users and viewers may browse all images with an individual or see albums of images with two selected individuals or similar combinations based on the relationships between users.
    Type: Grant
    Filed: January 14, 2013
    Date of Patent: August 25, 2020
    Inventor: Roland H. Kedikian
  • Patent number: 10757227
    Abstract: A method of data nibble-histogram compression can include determining a first amount of space freed by compressing the input data using a first compression technique, determining a second amount of space freed by compressing the input data using a second, different compression technique, compressing the input data using the compression technique of the first and second compression techniques determined to free up more space to create compressed input data, and inserting into the compressed input data, security data including one of a message authentication control (MAC) and an inventory control tag (ICT).
    Type: Grant
    Filed: November 5, 2019
    Date of Patent: August 25, 2020
    Assignee: Intel Corporation
    Inventors: Michael Kounavis, David M. Durham, Karanvir Grewal, Wenjie Xiong, Sergej Deutsch
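The "try two techniques, keep the one that frees more space, then attach security data" flow can be sketched as follows. This does not implement nibble-histogram compression: `zlib` and `bz2` stand in for the two techniques, an HMAC-SHA256 appended to the output stands in for the inserted security data, and the names `compress_best` and `decompress_verified` are hypothetical.

```python
import bz2
import hashlib
import hmac
import zlib


def compress_best(data, key):
    """Compress with both stand-in techniques, keep the smaller result, append a MAC."""
    candidates = {b"Z": zlib.compress(data), b"B": bz2.compress(data)}
    tag, body = min(candidates.items(), key=lambda item: len(item[1]))
    mac = hmac.new(key, tag + body, hashlib.sha256).digest()  # 32 bytes
    return tag + body + mac


def decompress_verified(blob, key):
    """Check the MAC, then decompress with the technique named by the 1-byte tag."""
    tag, body, mac = blob[:1], blob[1:-32], blob[-32:]
    expected = hmac.new(key, tag + body, hashlib.sha256).digest()
    if not hmac.compare_digest(mac, expected):
        raise ValueError("MAC mismatch")
    return zlib.decompress(body) if tag == b"Z" else bz2.decompress(body)
```

The one-byte tag records which technique won, so the reader does not need to guess how the payload was compressed.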
  • Patent number: 10747737
    Abstract: Disclosed herein are system, method, and computer program product embodiments for altering the data type of a column in a database. An embodiment operates by converting an original dictionary associated with a column into a new dictionary. The new dictionary stores the values of the original dictionary using a different data type. An index vector containing the keys of the original dictionary is then updated to contain the associated keys of the new dictionary. Because the size of the original dictionary is often substantially smaller than the number of rows in the associated column, this dictionary conversion decreases the computation cost to the database system of altering the data type of the column and reduces or even minimizes database downtime for users.
    Type: Grant
    Filed: November 25, 2014
    Date of Patent: August 18, 2020
    Assignee: SAP SE
    Inventors: Colin Florendo, Ivan Schreter, Panfeng Zhou, David Wein, Steffen Geissinger, Michael Muehle
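A rough sketch of converting the dictionary rather than every row, assuming the dictionary is a list whose positions serve as keys and that values which become equal after conversion should share one new key. The name `alter_column_type` and the sample data are hypothetical.

```python
def alter_column_type(dictionary, index_vector, convert):
    """Convert dictionary values to a new type and remap the index vector keys."""
    new_dictionary = []
    new_key_of_value = {}
    key_map = {}                                   # old key -> new key
    for old_key, value in enumerate(dictionary):   # one conversion per distinct value
        new_value = convert(value)
        if new_value not in new_key_of_value:
            new_key_of_value[new_value] = len(new_dictionary)
            new_dictionary.append(new_value)
        key_map[old_key] = new_key_of_value[new_value]
    new_index_vector = [key_map[key] for key in index_vector]
    return new_dictionary, new_index_vector


# String column converted to integers; "7" and "007" collapse to one entry.
new_dict, new_index = alter_column_type(["10", "7", "007"], [0, 1, 2, 1, 0], int)
```

The conversion function runs once per distinct value, which is the source of the cost saving the abstract describes when the dictionary is much smaller than the column.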
  • Patent number: 10748099
    Abstract: Disclosed are systems and methods that enable both the real-time monitoring and historical tracking of events and actions performed by disparate systems and software applications associated with a computing device that also permit insight into the relationships between such events and actions. The inventive systems and methods synchronously and asynchronously capture event data from various event sources associated with a computing device. The event data can be enriched before correlating associated event data into transactions resulting in telemetry information to provide a more accurate and complete picture of what activities the computing device is performing and how the computing device is utilized to accomplish particular tasks. This telemetry information enables the system to perform descriptive and predictive analytical processes as well as artificial intelligence that provide valuable insights to improve end user and system performance.
    Type: Grant
    Filed: October 25, 2018
    Date of Patent: August 18, 2020
    Assignee: Sykes Enterprises, Incorporated
    Inventors: Richard Sadowski, J. Shelton Hook, Jr., David Pearson, Stephen Berdy
  • Patent number: 10749764
    Abstract: A device for generating and searching sensor tag data in real time is provided. The device can include a rollup executor that is configured to generate statistics data per time from raw data; and a rollup memory storing per-second statistics data in units of seconds for new input data and per-minute statistics data in units of minutes for the per-second statistics data, where the statistics data can be automatically calculated and provided by the system by using statistics for time series sensor tag data based on tag names/times.
    Type: Grant
    Filed: August 22, 2018
    Date of Patent: August 18, 2020
    Assignee: Machbase, Inc.
    Inventor: Sung Jin Kim
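The two-level rollup in this abstract can be sketched as follows: per-second statistics are built from raw samples, and per-minute statistics are built from the per-second statistics rather than from the raw data. The statistics chosen (count, sum, min, max) and all names are assumptions.

```python
def merge_stat(stats, key, count, total, lo, hi):
    """Fold one (count, sum, min, max) tuple into the entry stored under key."""
    if key in stats:
        c, t, old_lo, old_hi = stats[key]
        stats[key] = (c + count, t + total, min(old_lo, lo), max(old_hi, hi))
    else:
        stats[key] = (count, total, lo, hi)


def rollup_seconds(raw):
    """raw: iterable of (tag_name, epoch_seconds, value) samples."""
    stats = {}
    for tag, ts, value in raw:
        merge_stat(stats, (tag, int(ts)), 1, value, value, value)
    return stats


def rollup_minutes(per_second):
    """Aggregate per-second statistics into per-minute statistics."""
    stats = {}
    for (tag, sec), (count, total, lo, hi) in per_second.items():
        merge_stat(stats, (tag, sec - sec % 60), count, total, lo, hi)
    return stats


raw = [("temp", 0.2, 20.0), ("temp", 0.7, 22.0), ("temp", 61.0, 30.0), ("temp", 59.9, 18.0)]
per_second = rollup_seconds(raw)
per_minute = rollup_minutes(per_second)
```

Averages can be derived on demand as sum divided by count, so they do not need to be stored at either level.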
  • Patent number: 10725922
    Abstract: Technologies for predictive caching include a computing device to receive sensor data generated by one or more sensors of the computing device and determine a device context of the computing device based on the sensor data. Based on the device context, the computing device determines a file to cache that has similar characteristics to another file recently accessed by a user of the computing device. The computing device includes a file cache with a first partition to store files identified to have similar characteristics to files recently accessed by a user and a second partition to store files identified based on access patterns of the user. The computing device stores the determined file to the first partition.
    Type: Grant
    Filed: June 25, 2015
    Date of Patent: July 28, 2020
    Assignee: Intel Corporation
    Inventors: Hong Li, Sudip S. Chahal, Roy J. Ubry, Julian Braham, Preeta Banerji
  • Patent number: 10719406
    Abstract: One embodiment provides a computer implemented method of data identification within a deduplication storage system, the method comprising processing multiple units of a segment of data within the deduplication storage system using a fingerprint generation algorithm; storing the internal state generated while processing the multiple units of the segment of data; generating a first fingerprint for the segment of data based on the internal state; reloading the internal state after generating the first fingerprint for the segment of data; and generating a second fingerprint for the segment of data based on a transformed unit of the segment of data.
    Type: Grant
    Filed: June 23, 2016
    Date of Patent: July 21, 2020
    Assignee: EMC IP HOLDING COMPANY LLC
    Inventors: Kedar Shrikrishna Patwardhan, Mangesh Sudhir Nijasure, Veeral Shah
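Saving and reloading hash state, as this abstract describes, can be sketched with the `copy()` method of Python's `hashlib` objects. SHA-1 as the fingerprint algorithm, the choice to transform only the last unit, and the name `two_fingerprints` are assumptions.

```python
import hashlib


def two_fingerprints(units, transform):
    """Fingerprint a segment twice while hashing the shared prefix only once."""
    state = hashlib.sha1()
    for unit in units[:-1]:
        state.update(unit)
    saved = state.copy()                  # store the internal state

    state.update(units[-1])
    first = state.hexdigest()             # fingerprint of the original segment

    reloaded = saved.copy()               # reload the stored state
    reloaded.update(transform(units[-1]))
    second = reloaded.hexdigest()         # fingerprint with the transformed unit
    return first, second


fp_original, fp_transformed = two_fingerprints([b"aa", b"bb", b"cc"], bytes.upper)
```

The saving comes from not rehashing the units that are identical in both versions of the segment.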
  • Patent number: 10715177
    Abstract: A method for lossy data compression, the method including receiving raw data at a storage device, receiving a request to compress flag, accessing an onboard data compression algorithm library containing various data compression algorithms respectively corresponding to lossy data compression schemes, selecting one of the data compression algorithms based on a number of parameters, running the selected data compression algorithm either online such that the raw data is compressed by the storage device when it is received, and is then stored on the storage device as compressed data, or offline such that the raw data is stored at the storage device, is later compressed by the storage device according to the selected data compression algorithm, and is resaved at the storage device as compressed data.
    Type: Grant
    Filed: October 9, 2017
    Date of Patent: July 14, 2020
    Assignee: Samsung Electronics Co., Ltd.
    Inventors: Yang Seok Ki, Yongsuk Lee, Jason Martineau
  • Patent number: 10712943
    Abstract: A memory monitoring and selective defragmentation method and system disclosed herein monitor memory usage by and modification of one or more database indexes. The monitoring and selective defragmentation method and system selectively defragment the one or more database indexes based on memory cost savings as opposed to a percentage of fragmentation to improve performance of databases.
    Type: Grant
    Filed: December 5, 2017
    Date of Patent: July 14, 2020
    Assignee: IDERA, INC.
    Inventor: Vicky Harp
  • Patent number: 10678461
    Abstract: A storage system. The storage system includes a plurality of storage nodes (DNodes), wherein the DNodes are configured to store a plurality of elements and a plurality of data blocks, wherein each element is a persistent metadata structure, wherein at least one of the elements store at least one attribute, wherein the at least one attribute includes a plurality of pointers; and a plurality of compute nodes (CNodes), wherein each CNode has access to each of the DNodes, wherein each CNode is configured to receive an access command and to execute the access command based on the elements.
    Type: Grant
    Filed: June 7, 2018
    Date of Patent: June 9, 2020
    Assignee: Vast Data Ltd.
    Inventors: Renen Hallak, Asaf Levy, Avi Goren, Yogev Vaknin, Alex Turin
  • Patent number: 10678435
    Abstract: Techniques for performing data deduplication and compression in data storage systems. Data deduplication is performed in a deduplication domain on a segment-by-segment basis to obtain a plurality of deduplicated data segments. Deduplicated data segments are grouped together to form a plurality of compression groups. Data compression is performed on each compression group, and the compressed group is stored on spinning media. By performing data deduplication on a segment-by-segment basis, the size of each segment can be reduced to increase the effectiveness of data deduplication. By performing data compression on compression groups, the size of each compression domain can be increased to increase the effectiveness of data compression. By storing deduplicated data segments as a compressed group on the spinning media, a sequential nature of the segments can be preserved to reduce a seek time/rotational latency of the spinning media and a number of IOPS handled by the data storage system.
    Type: Grant
    Filed: May 10, 2018
    Date of Patent: June 9, 2020
    Assignee: EMC IP Holding Company LLC
    Inventor: Jeremy Swift
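Deduplicating on small segments and compressing on larger groups can be sketched as below. The tiny segment and group sizes, SHA-256 as the deduplication digest, `zlib` as the compressor, and the names `dedupe_and_compress` and `restore` are assumptions chosen to keep the example short.

```python
import hashlib
import zlib


def dedupe_and_compress(data, segment_size=4, group_size=3):
    """Deduplicate per segment, then compress the unique segments in groups."""
    seen = {}      # digest -> index of the unique segment
    unique = []    # unique segments in first-seen order
    refs = []      # one unique-segment index per input segment
    for start in range(0, len(data), segment_size):
        segment = data[start:start + segment_size]
        digest = hashlib.sha256(segment).digest()
        if digest not in seen:
            seen[digest] = len(unique)
            unique.append(segment)
        refs.append(seen[digest])
    groups = [
        zlib.compress(b"".join(unique[i:i + group_size]))
        for i in range(0, len(unique), group_size)
    ]
    return refs, groups


def restore(refs, groups, segment_size=4):
    """Rebuild the original data from the references and compressed groups."""
    flat = b"".join(zlib.decompress(group) for group in groups)
    segments = [flat[i:i + segment_size] for i in range(0, len(flat), segment_size)]
    return b"".join(segments[ref] for ref in refs)


data = b"AAAABBBBAAAACCCCBBBBDDDDEE"
refs, groups = dedupe_and_compress(data)
```

Small segments raise the chance of finding duplicates, while each compressor call still sees several segments at once, which is the trade-off the abstract points to.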
  • Patent number: 10681189
    Abstract: In one example, the present disclosure describes a device, computer-readable medium, and method for organizing terabit-scale packet volumes into flows for downstream processing stages. For instance, in one example, a method includes extracting a first flow key from a first data packet, inputting the first flow key into a hash function to obtain a first output value, selecting a first partition in a memory to which to store the first data packet, wherein the first partition is selected based on the first output value, and storing the first data packet to the first partition.
    Type: Grant
    Filed: May 18, 2017
    Date of Patent: June 9, 2020
    Assignee: AT&T Intellectual Property I, L.P.
    Inventors: Arthur L. Zaifman, John M. Mocenigo
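Hashing a flow key to choose a memory partition can be sketched as follows. The five-field flow key, CRC32 as the hash function, the partition count, and the packet field names are assumptions; the abstract does not specify any of them.

```python
import zlib

NUM_PARTITIONS = 8
partitions = [[] for _ in range(NUM_PARTITIONS)]


def flow_key(packet):
    """Extract a flow key from a packet (hypothetical five-tuple)."""
    return (packet["src"], packet["dst"], packet["sport"], packet["dport"], packet["proto"])


def store(packet):
    """Hash the flow key, pick a partition from the hash, and store the packet there."""
    key_bytes = repr(flow_key(packet)).encode("utf-8")
    index = zlib.crc32(key_bytes) % NUM_PARTITIONS
    partitions[index].append(packet)
    return index
```

Because the partition depends only on the flow key, every packet of a flow lands in the same partition and can be processed together downstream.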
  • Patent number: 10678647
    Abstract: A method of distributing data in a distributed storage system includes receiving a file, dividing the received file into chunks, and determining a distribution of the chunks among storage devices of the distributed storage system based on a maintenance hierarchy of the distributed storage system. The maintenance hierarchy includes maintenance levels, and each maintenance level includes one or more maintenance units. Each maintenance unit has an active state and an inactive state. Moreover, each storage device is associated with a maintenance unit. The determining of the distribution of the chunks includes identifying a random selection of the storage devices matching a number of chunks of the file and being capable of maintaining accessibility of the file when one or more maintenance units are in an inactive state. The method also includes distributing the chunks to storage devices of the distributed storage system according to the determined distribution.
    Type: Grant
    Filed: April 24, 2019
    Date of Patent: June 9, 2020
    Assignee: Google LLC
    Inventors: Robert Cypher, Sean Quinlan, Steven Robert Schirripa
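One way to picture the random, maintenance-aware selection in the abstract above is the following sketch. It assumes a single maintenance level and a rule that no maintenance unit may hold more than `max_per_unit` chunks of a file. Both the rule and the retry loop are illustrative assumptions, not the patented procedure, and the names are hypothetical.

```python
import random

def distribute_chunks(num_chunks, device_units, max_per_unit=1,
                      rng=random, attempts=1000):
    """Randomly pick one device per chunk so that no maintenance unit
    holds more than max_per_unit chunks (assumed accessibility rule).

    device_units maps a device name to its maintenance unit name.
    """
    devices = list(device_units)
    for _ in range(attempts):
        choice = rng.sample(devices, num_chunks)
        counts = {}
        for device in choice:
            unit = device_units[device]
            counts[unit] = counts.get(unit, 0) + 1
        if max(counts.values()) <= max_per_unit:
            return choice
    raise RuntimeError("no valid distribution found")
```

With `max_per_unit=1`, taking any single maintenance unit offline removes at most one chunk of the file, so a redundancy scheme that tolerates one missing chunk would keep the file accessible.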
  • Patent number: 10678654
    Abstract: Disclosed are methods and systems for performing data backup which implement data binning using log-structured merge (LSM) trees during deduplication. An exemplary method includes: calculating a reduced hash value (RHV) associated with each of a plurality of data blocks; partitioning the plurality of reduced hash values into groups; selecting a representative hash value for each group; determining whether the representative hash value occurs in a first LSM tree, the first LSM tree stored in a volatile memory; and when the representative hash value occurs in the first LSM tree: loading the RHVs in the representative hash value's group into volatile memory; comparing each of the RHVs to one or more hash values in a second LSM tree to identify a matching hash value; and writing a segment identifier (ID) corresponding to the matching hash value in an archive, which references a data block in a segment store.
    Type: Grant
    Filed: October 25, 2017
    Date of Patent: June 9, 2020
    Assignee: Acronis International GmbH
    Inventors: Vitaly Pogosyan, Kirill Korotaev, Mark Shmulevich, Stanislav Protasov, Serguei M. Beloussov
  • Patent number: 10664165
    Abstract: A method is used in managing inline data compression and deduplication in storage systems. A block of data from data stored in a cache of a storage system is identified based on entropy. Entropy of the block of data is compared with a first threshold value. Based on the comparison, the block of data is either deduplicated or compressed without deduplication.
    Type: Grant
    Filed: May 10, 2019
    Date of Patent: May 26, 2020
    Assignee: EMC IP Holding Company LLC
    Inventors: Sorin Faibish, Istvan Gonczi, Philippe Armangau, Vamsi Vankamamidi, Ivan Bassov
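The entropy comparison described in the abstract above can be illustrated with a byte-level Shannon entropy estimate. The abstract does not say which side of the threshold leads to which action, so the policy below (low entropy means compress only, otherwise deduplicate) and the threshold value are assumptions chosen purely for the example.

```python
import math
from collections import Counter

ENTROPY_THRESHOLD = 6.0  # bits per byte; hypothetical value

def shannon_entropy(block):
    """Return the Shannon entropy of a bytes object in bits per byte."""
    if not block:
        return 0.0
    total = len(block)
    counts = Counter(block)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

def choose_reduction(block, threshold=ENTROPY_THRESHOLD):
    """Pick a data-reduction path by comparing entropy with a threshold.

    Assumed policy: blocks below the threshold are compressed without
    deduplication; all other blocks are deduplicated.
    """
    if shannon_entropy(block) < threshold:
        return "compress"
    return "deduplicate"
```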
  • Patent number: 10664887
    Abstract: A search system specifies an image of a product according to a preference of a user conveniently and accurately using a sensibility word and displays the product image so that information of the product image is intuitively understood by the user. In a client terminal, sensibility word data is specified by a user and sent to a server system. In the server system, the sensibility word data is received, a physical amount of a product associated with the sensibility word data is acquired, image data of the product associated with the physical amount of the product is acquired, and display information data indicating a display aspect for the image is generated. The image data of the product and the display information data are transmitted from the server system to the client terminal, and the image data of the product is displayed based on the display information data in the client terminal.
    Type: Grant
    Filed: July 8, 2016
    Date of Patent: May 26, 2020
    Assignee: FUJIFILM Corporation
    Inventor: Makoto Yonaha
  • Patent number: 10657103
    Abstract: Embodiments for combining input data matches in data deduplication of input data by a processor. Matches of input data are calculated using a plurality of independent deduplication processes referencing a plurality of repository data segments for the input data. A combined list of output data matches is calculated by removing those of the input data matches that are fully enclosed within other input data matches; and removing those of the input data matches determined to be smaller than a predetermined threshold for citing. A deduplication operation is performed on the combined list of output data matches. Each pair of the input data matches having an overlap section is processed in an ascending order of a position.
    Type: Grant
    Filed: February 17, 2017
    Date of Patent: May 19, 2020
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventor: Lior Aronovich
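The two filtering rules in the abstract above (drop matches fully enclosed within other matches, drop matches below a size threshold) can be sketched as below. Matches are modeled as hypothetical `(start, end)` offsets into the input. How overlapping pairs are resolved is not described in the abstract, so this sketch only returns the surviving matches in ascending position order.

```python
def combine_matches(matches, min_size):
    """Combine matches from several deduplication processes.

    matches: iterable of (start, end) input offsets, end exclusive.
    min_size: smallest match length worth keeping (assumed parameter).
    """
    unique = sorted(set(matches))  # ascending position, duplicates merged
    kept = []
    for match in unique:
        start, end = match
        enclosed = any(
            other != match and other[0] <= start and end <= other[1]
            for other in unique
        )
        if not enclosed and end - start >= min_size:
            kept.append(match)
    return kept
```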
  • Patent number: 10649655
    Abstract: Systems and methods are disclosed for storing multimedia assets (or other data objects) in a storage array. Portions of the multimedia asset may be stored on different chunks of the storage drives in the storage array based on an access frequency level for a portion, an importance level for the portion, a reliability score for a chunk, and a performance score for the chunk.
    Type: Grant
    Filed: September 30, 2016
    Date of Patent: May 12, 2020
    Assignee: Western Digital Technologies, Inc.
    Inventors: Jun Xu, Shaun Astarabadi, Grant C. Mackey, Junpeng Niu, Robin O'Neill, Jie Yu
  • Patent number: 10650017
    Abstract: Tiered storage may be implemented for processing data. Data processors may maintain some of a data set, including user data and metadata describing the user data, locally. The data set is also maintained at a data store remote to the data processor. When processing requests are received, a determination is made as to whether the local portions of the data set can execute the processing request or one or more additional portions of the data set are needed from the remote data store. If additional portions of the data set are needed, then a request may be sent to the data store for the additional portions. Once received, the data processor may execute the processing request utilizing the additional portions. Portions of the data set maintained locally at the data processor may be selected and flushed from local storage to the remote data store.
    Type: Grant
    Filed: August 29, 2016
    Date of Patent: May 12, 2020
    Assignee: Amazon Technologies, Inc.
    Inventors: Anurag Windlass Gupta, Andrew Edward Caldwell
  • Patent number: 10642793
    Abstract: The present invention provides a method for compressing genome sequence readers using a GPU processing unit. The method comprises the steps of: identifying the position of each given genome reader character string in the sequence of a reference genome, determining the alignment of each reader string within the reference genome, comparing each reader character string to the corresponding reference genome sequence based on the determined alignment, filtering characters in each reader by the GPU processor by eliminating similar characters and extracting only character differences in association with their position in the genome sequence, and recording the filtered data of each reader in association with its alignment in the genome reference in the compressed genome database.
    Type: Grant
    Filed: February 2, 2016
    Date of Patent: May 5, 2020
    Assignee: SQREAM TECHNOLOGIES LTD
    Inventors: Dotan Shdema, Raziel Shoshani, Or Cohen, Ori Netzer
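The core idea in the abstract above, storing only the characters that differ from the reference together with their positions, can be shown with a small CPU-only sketch. It does not use a GPU, it assumes the alignment position is already known, and it handles substitutions only (no insertions or deletions). The record layout and function names are hypothetical.

```python
def compress_read(reference, read, position):
    """Encode a read as (position, length, differences from the reference)."""
    diffs = [
        (offset, char)
        for offset, char in enumerate(read)
        if reference[position + offset] != char
    ]
    return (position, len(read), diffs)

def decompress_read(reference, record):
    """Rebuild the read from the reference and its stored differences."""
    position, length, diffs = record
    chars = list(reference[position:position + length])
    for offset, char in diffs:
        chars[offset] = char
    return "".join(chars)
```

A read that matches the reference exactly compresses to a position, a length, and an empty difference list.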
  • Patent number: 10642837
    Abstract: According to embodiments, a derived cache that is derived from a first instance of particular data is used to speed up queries and other operations over a second instance of the particular data. Traditionally, a DBMS generates and uses derived cache data only for the database data from which the derived data was derived. However, according to embodiments, derived cache data associated with a first instance of database data is relocated to the location of a second, newly created, instance of the database data. Since the derived cache data is derived from an identical copy of the database data, the cache data derived for the first instance can successfully be used to speed up applications running over the second instance of the database data.
    Type: Grant
    Filed: January 19, 2017
    Date of Patent: May 5, 2020
    Assignee: Oracle International Corporation
    Inventors: Kothanda Umamageswaran, Krishnan Meiyyappan, Adrian Tsz Him Ng, Vijay Sridharan, Wei Zhang, Ke Hu, Xin Zeng
  • Patent number: 10628069
    Abstract: A management apparatus, which is configured to manage at least one storage system, includes a processor and a memory. Each of the at least one storage apparatus includes a plurality of volumes, each of which stores at least one OS. The processor is configured to: determine, for each of the plurality of volumes, an OS type and version of a representative OS of the each of the plurality of volumes; select, from among the plurality of volumes, a plurality of volumes having representative OSes that share the same OS type and major version; and include the selected plurality of volumes in one deduplication group made up of volumes among which deduplication is to be executed.
    Type: Grant
    Filed: June 3, 2016
    Date of Patent: April 21, 2020
    Assignee: HITACHI LTD.
    Inventors: Atsushi Tsuda, Masakazu Kobayashi, Yuichiro Nagashima, Tetsuya Uehara, Yohei Tsujimoto
  • Patent number: 10620863
    Abstract: A method is used in managing data reduction in storage systems using machine learning. A value representing a data reduction assessment for a first data block in a storage system is calculated using a hash of the data block. The value is used to train a machine learning system to assess data reduction associated with a second data block in the storage system without performing the data reduction on the second data block, where assessing data reduction associated with the second data block indicates a probability as to whether the second data block can be reduced.
    Type: Grant
    Filed: August 1, 2018
    Date of Patent: April 14, 2020
    Assignee: EMC IP Holding Company LLC
    Inventors: Sorin Faibish, Rustem Rafikov, Ivan Bassov
  • Patent number: 10616001
    Abstract: In a method for egress processing packets in a network device, a first stage engine, implemented in hardware, identifies a particular set of computer-readable instructions for a particular packet. The particular set of computer-readable instructions is identified from among a plurality of sets of computer-readable instructions stored in a memory, respective ones of the plurality of sets of computer-readable instructions being for performing different sets of egress processing operations with respect to different packets. A second stage processor, configured to execute computer-readable instructions stored in the memory, executes the particular set of computer-readable instructions, identified by the first stage engine, to perform the corresponding set of egress processing with respect to the particular packet.
    Type: Grant
    Filed: May 2, 2018
    Date of Patent: April 7, 2020
    Assignee: Marvell Asia Pte, Ltd.
    Inventors: Ilan Mayer-Wolf, Ilan Yerushalmi, David Melman, Tal Mizrahi
  • Patent number: 10565182
    Abstract: A system is provided that includes a first processor and a second processor. The first processor includes first hardware logic circuitry that performs a Lempel-Ziv-Markov chain algorithm (LZMA) forward pass compression process on a portion of source data to provide first output data. The second processor performs an LZMA backward pass compression process on the first output data to provide second output data.
    Type: Grant
    Filed: November 23, 2015
    Date of Patent: February 18, 2020
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Douglas C. Burger, Scott Hauck
  • Patent number: 10565198
    Abstract: The technology described herein provides a bit vector search index for a search system that uses shards. The bit vector search index comprises a data structure for indexing data about terms from a corpus of documents. The data structure includes a number of bit vectors. Each bit vector comprises an array of bits and corresponds to a different set of terms. Bits in the bit vector are used to represent whether at least one document corresponding to the bit includes at least one term from the set of terms corresponding to the bit vector. The search index is provided in a number of shards. Each shard corresponds to a subset of documents having document lengths within a particular range of document lengths.
    Type: Grant
    Filed: June 17, 2016
    Date of Patent: February 18, 2020
    Assignee: MICROSOFT TECHNOLOGY LICENSING, LLC
    Inventors: Michael Joseph Hopcroft, Robert Lovejoy Goodwin
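A toy version of the sharded bit vector index in the abstract above might look like this. It assumes one bit per document, uses Python integers as bit vectors, assigns terms to bit vectors by hashing, and splits shards at arbitrary length bounds. None of these choices come from the abstract, and the names are hypothetical.

```python
import hashlib

NUM_VECTORS = 256          # hypothetical number of bit vectors per shard
SHARD_BOUNDS = [16, 128]   # hypothetical document-length boundaries

def row_for(term):
    """Map a term to the bit vector shared by its set of terms."""
    digest = hashlib.sha256(term.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big") % NUM_VECTORS

def shard_for(doc_length):
    """Pick the shard whose length range contains doc_length."""
    for index, bound in enumerate(SHARD_BOUNDS):
        if doc_length < bound:
            return index
    return len(SHARD_BOUNDS)

class Shard:
    def __init__(self):
        self.rows = [0] * NUM_VECTORS  # each int is one bit vector
        self.doc_ids = []              # bit i corresponds to doc_ids[i]

    def add(self, doc_id, terms):
        bit = 1 << len(self.doc_ids)
        self.doc_ids.append(doc_id)
        for term in set(terms):
            self.rows[row_for(term)] |= bit

    def candidates(self, query_terms):
        """AND the bit vectors of all query terms; may include false positives."""
        mask = (1 << len(self.doc_ids)) - 1
        for term in query_terms:
            mask &= self.rows[row_for(term)]
        return [d for i, d in enumerate(self.doc_ids) if (mask >> i) & 1]
```

Since several terms share one bit vector, a set bit only means the document may contain the term, so candidates would normally be verified against the documents themselves.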
  • Patent number: 10560513
    Abstract: Disclosed herein is a technique for managing storage space in a user device by efficiently downloading files from a cloud-based storage system and evicting files from the user device. According to some embodiments, files are continuously downloaded in a download mode until a particular threshold is satisfied. When the threshold is satisfied, the files can be downloaded in an on-demand mode as needed by the user, where the user device operates in the on-demand mode until a sufficient amount of storage space is freed by evicting files from the user device. Thereafter, the user device can switch back to the download mode.
    Type: Grant
    Filed: December 21, 2018
    Date of Patent: February 11, 2020
    Assignee: Apple Inc.
    Inventors: Michael Pirnack Hess, Jean-Gabriel Morard, Pierre d'Herbemont
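The mode switching in the abstract above can be modeled as a small state machine. The two thresholds (when to leave download mode and when enough space has been freed to return to it) are hypothetical parameters, and the class below is an illustrative model rather than the actual device logic.

```python
class DeviceStorage:
    def __init__(self, full_threshold, resume_threshold):
        self.full_threshold = full_threshold      # used space that ends download mode
        self.resume_threshold = resume_threshold  # used space that restores it
        self.files = {}                           # file name -> size
        self.mode = "download"

    def used(self):
        return sum(self.files.values())

    def _update_mode(self):
        if self.mode == "download" and self.used() >= self.full_threshold:
            self.mode = "on-demand"
        elif self.mode == "on-demand" and self.used() <= self.resume_threshold:
            self.mode = "download"

    def download(self, name, size, requested_by_user=False):
        """Download a file; in on-demand mode only user-requested files are fetched."""
        if self.mode == "on-demand" and not requested_by_user:
            return False
        self.files[name] = size
        self._update_mode()
        return True

    def evict(self, name):
        """Remove a local copy to free space (the cloud copy is assumed to remain)."""
        self.files.pop(name, None)
        self._update_mode()
```

Using two separate thresholds gives the switch some hysteresis, so the device does not flip between modes on every small change in free space.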
  • Patent number: 10552075
    Abstract: Deduplication of virtual-machine disk images and other disk images can involve identifying the first clusters in a file. The clusters are hashed. The first-in-file hashes (generated from first-in-file clusters) are stored in an in-memory index, while the full set of hashes is streamed in order to find matches with the hashes stored in the in-memory index. First-in-file hashes in the stream are compared, while other hashes in the stream are compared only if the immediately preceding hash resulted in a match. Comparing non-first-in-file hashes requires disk accesses, but since such comparisons are conditioned on first-in-file matches, they are relatively likely to result in sequences of matches. The net effect is a relatively fast deduplication with compression approaching that resulting from a full comparison of all hashes.
    Type: Grant
    Filed: January 23, 2018
    Date of Patent: February 4, 2020
    Assignee: VMware, Inc.
    Inventor: Oleg Zaydman
  • Patent number: 10545988
    Abstract: A system and method for data synchronization using revision control includes receiving, by a synchronization module being executed by one or more processors of a server, inbound edits to a shared document from a client, retrieving a first version of the shared document associated with the client from a revision history, updating the first version based on the inbound edits to create a second version, adding the second version to the revision history when the second version is not included among a plurality of stored versions of the shared document in the revision history, and incrementing a reference counter that records a number of clients associated with the second version when the second version is included among the stored versions in the revision history. The revision history provides access to the stored versions of the shared document. The revision history includes version data used to access each stored version and the associated reference counters.
    Type: Grant
    Filed: February 26, 2015
    Date of Patent: January 28, 2020
    Assignee: RED HAT, INC.
    Inventor: Lukas Fryc
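The add-or-increment logic in the abstract above can be sketched with a content-addressed revision history. Identifying versions by a SHA-256 hash of their text, modeling edits as a callable, starting a new version's counter at 1, and decrementing the counter of the version a client leaves are all assumptions made for this example. The class and method names are hypothetical.

```python
import hashlib

class RevisionHistory:
    def __init__(self, initial_text):
        self.versions = {}        # version id -> document text
        self.ref_counts = {}      # version id -> number of associated clients
        self.client_version = {}  # client -> version id it currently holds
        self.initial_id = self._version_id(initial_text)
        self.versions[self.initial_id] = initial_text
        self.ref_counts[self.initial_id] = 0

    @staticmethod
    def _version_id(text):
        return hashlib.sha256(text.encode("utf-8")).hexdigest()

    def attach(self, client):
        """Associate a new client with the initial version."""
        self.client_version[client] = self.initial_id
        self.ref_counts[self.initial_id] += 1

    def apply_edits(self, client, edit):
        """Apply inbound edits (a callable on text) to the client's version."""
        old_id = self.client_version[client]
        new_text = edit(self.versions[old_id])
        new_id = self._version_id(new_text)
        if new_id not in self.versions:
            # Second version is new: add it to the revision history.
            self.versions[new_id] = new_text
            self.ref_counts[new_id] = 1
        else:
            # Second version already stored: count one more client on it.
            self.ref_counts[new_id] += 1
        self.ref_counts[old_id] -= 1  # assumption: client leaves its old version
        self.client_version[client] = new_id
        return new_id
```

When two clients make the same edit, the second one finds the version already stored, so only the counter changes and no duplicate copy is kept.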