Data Cleansing, Data Scrubbing, And Deleting Duplicates Patents (Class 707/692)
  • Patent number: 12387058
    Abstract: A device may generate first scores for sentences of text based on a cumulative frequency of words in each sentence, may generate second scores for the sentences based on a cumulative frequency of domain entities in each sentence, and may generate third scores for the sentences based on a sentiment analysis of each sentence. The device may generate a summary of the text, may filter the sentences to extract a first set of sentences, may filter the sentences to extract a second set of sentences, and may filter the sentences to extract a third set of sentences. The device may identify and assign weights to a first group of sentences, a second group of sentences, and a third group of sentences, may generate a ranked list of sentences based on the weighted first group, second group, and third group, and may perform actions based on the final summary.
    Type: Grant
    Filed: June 28, 2024
    Date of Patent: August 12, 2025
    Assignee: Verizon Patent and Licensing Inc.
    Inventors: Prakash Ranganathan, Miruna Jayakrishnasamy
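    Illustrative sketch (not part of the patent record): the three per-sentence scores described in the abstract can be combined into a ranked list roughly as follows. The tiny sentiment lexicon, the weight values, and the function names are assumptions made for illustration, and the filtering and grouping steps of the abstract are collapsed into a single weighted sum.

```python
import re
from collections import Counter

# Hypothetical toy sentiment lexicon; a real system would use a trained model.
POSITIVE = {"good", "great", "reliable"}
NEGATIVE = {"bad", "slow", "outage"}


def tokenize(text):
    """Lowercase word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def rank_sentences(text, domain_entities, weights=(0.5, 0.3, 0.2)):
    """Rank sentences by a weighted mix of three scores (weights are assumed)."""
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    word_freq = Counter(tokenize(text))
    entities = {e.lower() for e in domain_entities}
    scored = []
    for sentence in sentences:
        tokens = tokenize(sentence)
        # First score: cumulative frequency of the words in the sentence.
        first = sum(word_freq[t] for t in tokens)
        # Second score: cumulative frequency of domain entities in the sentence.
        second = sum(word_freq[t] for t in tokens if t in entities)
        # Third score: sentiment strength from the toy lexicon.
        third = abs(sum((t in POSITIVE) - (t in NEGATIVE) for t in tokens))
        total = weights[0] * first + weights[1] * second + weights[2] * third
        scored.append((total, sentence))
    # Highest combined score first; ties keep document order.
    return [s for _, s in sorted(scored, key=lambda pair: pair[0], reverse=True)]
```

    Calling rank_sentences(text, ["network"]) returns the sentences of text ordered from most to least salient under these assumed weights.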
  • Patent number: 12380094
    Abstract: The present disclosure relates to a transaction processing method and apparatus. The transaction processing method includes: obtaining a first block to be uploaded; when it is determined that the first block meets a first capacity expansion condition, determining, from nodes of the blockchain and based on the first capacity expansion condition, a first node for performing filter expansion; obtaining a first capacity-expanded filter from the first node, a capacity of the first capacity-expanded filter being greater than a capacity of a first in-process filter in the execution node, and the first capacity-expanded filter being generated by the first node based on transactions having been uploaded to the blockchain; and loading a transaction in the first block to the first capacity-expanded filter, and using the first capacity-expanded filter as a second in-process filter, to perform, through the capacity-expanded second in-process filter, deduplication filtering on transactions to be uploaded to the blockchain.
    Type: Grant
    Filed: April 17, 2024
    Date of Patent: August 5, 2025
    Assignee: Tencent Technology (Shenzhen) Company Limited
    Inventors: Zhuguang Shao, Li Li, Jianjun Zhang, Bing Shao, Bengbeng Su
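    Illustrative sketch (not part of the patent record): the abstract does not say which kind of filter is used. Assuming it is a Bloom filter, capacity expansion can be pictured as rebuilding a larger filter from the transactions already uploaded and then loading the new block into it. The class, the hash construction, and the sizes below are assumptions.

```python
import hashlib


class BloomFilter:
    """Minimal Bloom filter over byte strings (an assumed filter type)."""

    def __init__(self, capacity_bits, num_hashes=3):
        self.capacity_bits = capacity_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((capacity_bits + 7) // 8)

    def _positions(self, item):
        # Derive num_hashes bit positions by salting SHA-256 with a seed byte.
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(bytes([seed]) + item).digest()
            yield int.from_bytes(digest[:8], "big") % self.capacity_bits

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, item):
        return all(self.bits[pos // 8] & (1 << (pos % 8))
                   for pos in self._positions(item))


def expand_filter(uploaded_transactions, new_capacity_bits):
    """Build a larger filter from transactions already on the chain."""
    expanded = BloomFilter(new_capacity_bits)
    for tx in uploaded_transactions:
        expanded.add(tx)
    return expanded


def load_block(in_process_filter, block_transactions):
    """Load a block's transactions so later duplicates can be filtered out."""
    for tx in block_transactions:
        in_process_filter.add(tx)
    return in_process_filter
```

    A Bloom filter never reports a loaded transaction as absent, but it can report false positives, so a deployment along these lines would still need a definitive duplicate check behind it.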
  • Patent number: 12373390
    Abstract: A data management system may support techniques for immutable storage of snapshot data entities, which may each include data corresponding to one or more snapshots, in a cloud environment. The data management system may determine respective retention periods and respective immutability periods for the snapshot data entities. The data management system may extend the respective immutability period for a first snapshot data entity based on the respective retention period for the first snapshot data entity being greater than or equal to a threshold duration. Additionally or alternatively, the data management system may maintain (refrain from extending) the respective immutability period for a second snapshot data entity based at least in part on the respective retention period for the second snapshot data entity being less than the threshold duration.
    Type: Grant
    Filed: March 25, 2024
    Date of Patent: July 29, 2025
    Assignee: Rubrik, Inc.
    Inventors: Sai Kiran Katuri, Prateek Pandey, Vikas Jain, Jonathan Carlyle Derryberry, Dharma Teja Bankuru
  • Patent number: 12373407
    Abstract: Various embodiments of the teachings herein include a method for detecting data anomalies. A method may include: receiving test data; and matching the test data with a data rule determined on the basis of historical data having a shared data type with the test data. The data rule contains an antecedent and a consequent predicate set. An intersection of the antecedent predicate set and the consequent predicate set is an empty set. The antecedent predicate set contains at least one antecedent predicate. The consequent predicate set contains at least one consequent predicate. When the data to be tested satisfies all the antecedent predicates in the antecedent predicate set and fails to satisfy at least one consequent predicate in the consequent predicate set, the test data is flagged as anomalous due to failure to satisfy the data rule.
    Type: Grant
    Filed: September 13, 2023
    Date of Patent: July 29, 2025
    Assignee: SIEMENS AKTIENGESELLSCHAFT
    Inventors: Cheng Feng, Ying Qu
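    Illustrative sketch (not part of the patent record): the rule check in the abstract flags a record when every antecedent predicate holds and at least one consequent predicate fails. The class and function names and the record layout below are assumptions.

```python
from dataclasses import dataclass
from typing import Callable, Dict, FrozenSet

Record = Dict[str, float]
Predicate = Callable[[Record], bool]


@dataclass(frozen=True)
class DataRule:
    antecedents: FrozenSet[Predicate]
    consequents: FrozenSet[Predicate]

    def __post_init__(self):
        # Each set needs at least one predicate, and the sets must be disjoint.
        if not self.antecedents or not self.consequents:
            raise ValueError("both predicate sets must be non-empty")
        if self.antecedents & self.consequents:
            raise ValueError("antecedent and consequent sets must not overlap")


def is_anomalous(record, rule):
    """True when all antecedents hold but some consequent does not."""
    if not all(pred(record) for pred in rule.antecedents):
        return False  # the rule does not apply to this record
    return not all(pred(record) for pred in rule.consequents)
```

    For example, a rule with the antecedent "pump is on" and the consequent "flow is positive" flags a record where the pump is on and the flow is zero, and ignores records where the pump is off.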
  • Patent number: 12362930
    Abstract: Disclosed techniques relate to security of backup data. In some embodiments, a method includes receiving, by a data protection service running on a cloud computing system, a first encrypted copy of a backup of a first data store that is associated with a first account of an organization, where the first encrypted copy is encrypted using a first custodian cryptographic key that is shared between the organization and the data protection service and that is different than a first production cryptographic key that is private and used by the organization to encrypt a non-backup version of the first data store. The method may include generating a second encrypted copy of the backup, including by encrypting the backup using a storage cryptographic key. The method may include storing the second encrypted copy of the backup in a second data store that is associated with the data protection service.
    Type: Grant
    Filed: February 2, 2022
    Date of Patent: July 15, 2025
    Assignee: Commvault Systems, Inc.
    Inventors: Lawrence Chang, Xia Hua, Woonho Jung, Rajeev Kumar, Douglas Qian, Abdul Jabbar Abdul Rasheed
  • Patent number: 12332856
    Abstract: This disclosure relates to assessment of data quality for unstructured data. In some aspects, a method includes obtaining, by one or more computing devices, metadata of multiple data files; analyzing a graph database representative of the multiple data files and generated using the metadata, to identify unstructured data included in one or more data files, the graph database representing features of the multiple data files, and relationships among the features of the multiple data files; obtaining a set of customized rules for the unstructured data based on context of the unstructured data; determining that the unstructured data fails to satisfy the set of customized rules; and in response to determining that the unstructured data fails to satisfy the set of customized rules, modifying the unstructured data to satisfy the set of customized rules.
    Type: Grant
    Filed: June 13, 2023
    Date of Patent: June 17, 2025
    Assignee: DISH Wireless L.L.C.
    Inventors: Darshit Gandhi, Sindhu Chowdary Chirumamilla
  • Patent number: 12314236
    Abstract: A labeled data generation service provides an Internet-of-Things (IoT) system with a capability whereby users may configure how the system gathers, processes, and generates labeled data instances by: collecting and processing the data into a format required by supervised learning algorithms; generating expected outputs from data available in the IoT system; supporting the linking of collected inputs with generated expected outputs; forming labeled data instances; cleaning the labeled data set appropriately; sending the labeled data set to target nodes; and/or communicating with target nodes regarding improving the data processing and labeling processes, as required.
    Type: Grant
    Filed: March 30, 2020
    Date of Patent: May 27, 2025
    Assignee: Convida Wireless, LLC
    Inventors: Quang Ly, Lu Liu, Dale N. Seed, Zhuo Chen, William Robert Flynn, IV, Catalina Mihaela Mladin, Jiwan L. Ninglekhu, Hongkun Li, Rocco Di Girolamo
  • Patent number: 12298993
    Abstract: A method may include receiving, at a data lake platform, a packet including a metadata corresponding to a data schema of a source system. A change in the data schema of the source system may be detected based on a first checksum of the metadata and a second checksum of a previous version of the metadata. In response to detecting the change in the data schema of the source system, the metadata may be sent to a target system to enable the target system to perform, based on the data schema of the source system, a task operating on a data from the source system. The task may include reporting, visualization, advanced analytics, and/or machine learning. Related systems and computer program products are also provided.
    Type: Grant
    Filed: June 25, 2021
    Date of Patent: May 13, 2025
    Assignee: SAP SE
    Inventors: Vengateswaran Chandrasekaran, Venkatesh Iyengar, Heshang Majmudar, Sriram Narasimhan
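    Illustrative sketch (not part of the patent record): a schema change can be detected by comparing a checksum of the current metadata with a checksum of the previous version. The use of SHA-256 over canonical JSON and the metadata layout are assumptions; the abstract does not name a checksum algorithm.

```python
import hashlib
import json


def schema_checksum(metadata):
    """Checksum of schema metadata, independent of dictionary key order."""
    canonical = json.dumps(metadata, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def schema_changed(previous_metadata, current_metadata):
    """True when the two metadata versions produce different checksums."""
    return schema_checksum(previous_metadata) != schema_checksum(current_metadata)
```

    Serializing with sorted keys keeps a mere reordering of fields from registering as a schema change, which is a design choice of this sketch rather than something the abstract states.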
  • Patent number: 12299528
    Abstract: Systems and methods are provided for determining whether an image or location contains a barcode. The systems and methods include comparing visual data extracted from an image to target visual data. The systems and methods also include comparing different portions of captured barcode data to corresponding portions of target barcode data. The systems and methods also include utilizing barcode formatting data when comparing or computing a similarity measure. The systems and methods also include causing a second image to be captured when no match is found in a first image. The systems and methods also include identifying a target barcode from a barcode database based on spatial information about where an image was captured. The systems and methods also include validating a barcode based on different regions of the barcode from two different images of the barcode. The systems and methods also include using two barcodes captured in an image.
    Type: Grant
    Filed: September 15, 2023
    Date of Patent: May 13, 2025
    Assignee: Verity AG
    Inventors: Fabio Rossetto, Markus Florian Hehn
  • Patent number: 12288034
    Abstract: Methods, apparatus, and processor-readable storage media for automatically summarizing event-related data using artificial intelligence techniques are provided herein. An example computer-implemented method includes obtaining text-based data and non-text-based data associated with at least one virtual event comprising one or more participants; generating a content-related summarization of one or more of at least a portion of the text-based data and at least a portion of the non-text-based data using at least a first set of one or more artificial intelligence techniques; generating a participant sentiment-related summarization associated with one or more of at least a portion of the text-based data and at least a portion of the non-text-based data using at least a second set of one or more artificial intelligence techniques; and performing one or more automated actions based at least in part on one or more of the content-related summarization and the participant sentiment-related summarization.
    Type: Grant
    Filed: July 27, 2022
    Date of Patent: April 29, 2025
    Assignee: Dell Products L.P.
    Inventors: Bijan Kumar Mohanty, Gregory Michael Ramsey, Hung T. Dinh
  • Patent number: 12277117
    Abstract: The subject technology receives a query, the query including a statement for performing the query. The subject technology performs a lookup operation on a stored plan cache based on the query. The subject technology performs, in response to a cache match of the query to a stored query plan in the stored plan cache based on the lookup operation, a validation process of the stored query plan. The subject technology determines whether the stored query plan is valid based on the validation process. The subject technology performs, in response to determining that the stored query plan is valid, a program building process for the stored query plan to generate a final query plan. The subject technology sends the final query plan to an execution node for execution.
    Type: Grant
    Filed: April 29, 2024
    Date of Patent: April 15, 2025
    Assignee: Snowflake Inc.
    Inventors: Karan Chadha, Prashant Gaharwar, Shrainik Jain, Nicola Dan Onose, Jiaqi Yan
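    Illustrative sketch (not part of the patent record): a plan cache lookup followed by validation of the cached plan. The abstract does not say what the validation process checks; comparing a schema version, as done here, is an assumption, as are all names.

```python
class PlanCache:
    """Caches query plans keyed by statement text."""

    def __init__(self, compile_plan):
        self._compile_plan = compile_plan  # callable: statement -> plan
        self._plans = {}                   # statement -> (plan, schema_version)

    def get_plan(self, statement, schema_version):
        """Return (plan, cache_hit) for the statement."""
        key = statement.strip()
        cached = self._plans.get(key)
        # Validation step (assumed): the plan is valid only if it was
        # built against the current schema version.
        if cached is not None and cached[1] == schema_version:
            return cached[0], True
        plan = self._compile_plan(statement)
        self._plans[key] = (plan, schema_version)
        return plan, False
```

    A stale entry is simply recompiled and overwritten, so the cache never serves a plan that fails the assumed validation check.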
  • Patent number: 12271298
    Abstract: The disclosure herein describes deduplicating data chunks using chunk objects. A batch of data chunks is obtained from an original data object and a hash value is calculated for each data chunk. A first duplicate data chunk is identified using the hash value and a hash map. A chunk logical block address (LBA) of a chunk object is assigned to the duplicate data chunk. Payload data of the duplicate data chunk is migrated from the original data object to the chunk object, and a chunk map is updated to map the chunk LBA to a physical sector address (PSA) of the migrated payload data on the chunk object. A hash entry is updated to map to the chunk object and the chunk LBA. An address map of the original data object is updated to map an LBA of the duplicate data chunk to the chunk object and the chunk LBA.
    Type: Grant
    Filed: June 13, 2023
    Date of Patent: April 8, 2025
    Assignee: VMware LLC
    Inventors: Enning Xiang, Wenguang Wang, Yifan Wang
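    Illustrative sketch (not part of the patent record): hash-based chunk deduplication in its simplest form. Unlike the abstract, which migrates only duplicate chunks into a chunk object and tracks LBA and PSA mappings, this sketch stores every unique chunk once in a list and uses fixed-size chunks; the names and the chunk size are assumptions.

```python
import hashlib


def deduplicate(data, chunk_size=4):
    """Split data into fixed-size chunks and store each unique chunk once."""
    chunk_store = []   # unique chunk payloads (stand-in for the chunk object)
    hash_map = {}      # chunk digest -> index into chunk_store
    address_map = []   # logical chunk position -> index into chunk_store
    for offset in range(0, len(data), chunk_size):
        chunk = data[offset:offset + chunk_size]
        digest = hashlib.sha256(chunk).digest()
        if digest not in hash_map:
            hash_map[digest] = len(chunk_store)
            chunk_store.append(chunk)
        address_map.append(hash_map[digest])
    return chunk_store, address_map


def reconstruct(chunk_store, address_map):
    """Rebuild the original data from the store and the address map."""
    return b"".join(chunk_store[index] for index in address_map)
```

    The address map plays the role of the original object's logical-to-chunk mapping: repeated chunks point at the same stored payload.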
  • Patent number: 12271625
    Abstract: A data storage system can implement a key-value engine configured for tunable read, write, and space amplification. The key-value engine can support multi-versioning, synchronous and asynchronous key updates, and read snapshots. The key-value engine is highly scalable and can support generalized parallel, in-memory computation. Experimental results demonstrate that a key-value engine consistent with disclosed embodiments can outperform a state-of-the-art production LSM-based key-value store in a wide range of metrics.
    Type: Grant
    Filed: March 4, 2024
    Date of Patent: April 8, 2025
    Assignee: The MathWorks, Inc.
    Inventor: Anthony Paul Astolfi
  • Patent number: 12265503
    Abstract: Techniques are described for selectively extending a WORM lock expiration time for a chunkfile. An example method comprises identifying, by a data platform implemented by a computing system, a chunkfile that includes a chunk that matches data for an object of a file system; determining, by the data platform after identifying the chunkfile, whether to deduplicate the data for the object of the file system by adding a reference to the matching chunk, wherein determining whether to deduplicate the data comprises applying a policy to at least one of a property of the chunkfile or properties of one or more of a plurality of chunks included in the chunkfile; and in response to determining to not deduplicate the data for the object of the file system, causing a new chunk for the data for the object of the file system to be stored in a different, second chunkfile.
    Type: Grant
    Filed: March 14, 2023
    Date of Patent: April 1, 2025
    Assignee: Cohesity, Inc.
    Inventors: Aiswarya Bhavani Shankar, Dane Van Dyck, Venkata Ranga Radhanikanth Guturi, Leo Prasath Arulraj
  • Patent number: 12238066
    Abstract: Techniques are disclosed for processing data packets and implementing policies in a software defined network (SDN) of a virtual computing environment. A plurality of computing nodes are communicatively coupled to network devices. The computing nodes are configured to provide at least one cloud edge processing function. The network devices are configured to enable communications between virtual machines within a virtual network of the virtual computing environment in accordance with associated policies. The network devices and the processing function are disaggregated from dependencies on particular computing nodes that are hosting the virtual machines.
    Type: Grant
    Filed: February 18, 2022
    Date of Patent: February 25, 2025
    Assignee: MICROSOFT TECHNOLOGY LICENSING, LLC
    Inventors: Deepak Bansal, Gerald Roy Degrace
  • Patent number: 12197467
    Abstract: Methods for establishing a second database and maintaining synchronization between a first database and the second database in a data management system are described. According to the method, a snapshot of a state of the first database may be acquired and mounted to a second server. The second database may be restored to the second server based on the mount. The second database may replicate the state of the first database. Synchronization may be enabled between the first database and the second database. One or more metrics associated with replication of data between the databases may be identified. A backup process for transaction logs associated with the first database may be initiated and the transaction logs may be mounted to the second server based on the metrics. One or more transactions may be applied to the second database based on the transaction logs mounted to the second server.
    Type: Grant
    Filed: February 4, 2022
    Date of Patent: January 14, 2025
    Assignee: Rubrik, Inc.
    Inventors: Bala Sunil Kandi, Peter John Milanese
  • Patent number: 12189586
    Abstract: Example methods, apparatus, systems and articles of manufacture (e.g., physical storage media) to deduplicate common devices across multiple data sources are disclosed. An example apparatus includes instructions to identify a first device in a first data source and a second device in a second data source as a possible common device, calculate at least one of a station duration metric, a time match metric or a station path metric, the station duration metric, the time match metric based on times of day that the first device tuned to a second set of stations and times of day that the second device tuned to the second set of stations, determine a score based on the at least one of the station duration metric, the time match metric, or the station path metric, and determine when the first device and the second device are a common device based on the score.
    Type: Grant
    Filed: August 29, 2022
    Date of Patent: January 7, 2025
    Assignee: The Nielsen Company (US), LLC
    Inventors: Rachel Worth Olson, Michael Evan Anderson, Rishi Sriram, Margaret M. Orton, Fatemehossadat Miri, Samantha M. Mowrer, David J. Kurzynski, Molly Poppie
  • Patent number: 12190335
    Abstract: Methods, apparatus, systems, and articles of manufacture to generate reference signature assets from meter signatures are disclosed. Example apparatus disclosed herein include a signature comparator to compare meter signature strings with search signature strings to identify a first fragment match result, which is associated with a sequence position within a first media represented by the search signature strings included in the first fragment match result, and which is also associated with a length of the first media. Disclosed example apparatus also include candidate signature asset generation circuitry to generate a candidate signature asset from a meter signature sequence based on the sequence position and the length of the first media, and store the candidate signature asset in a candidate pool associated with the first media.
    Type: Grant
    Filed: October 29, 2021
    Date of Patent: January 7, 2025
    Assignee: The Nielsen Company (US), LLC
    Inventors: Albert T. Borawski, Geetanjali Arya, Satish Kumar Kukunuru
  • Patent number: 12189488
    Abstract: One example method includes receiving from a node, in an HSAN that includes multiple nodes, an ADD_DATA request to add an entry to a distributed ledger of the HSAN, the request comprising a user ID that identifies the node, a hash of a data segment, and a storage location of the data segment at the node, performing a challenge-and-response process with the node to verify that the node has a copy of the data that was the subject of the entry, making a determination that a replication factor X has not been met, and adding the entry to the distributed ledger upon successful conclusion of the challenge-and-response process.
    Type: Grant
    Filed: July 31, 2023
    Date of Patent: January 7, 2025
    Assignee: EMC IP Holding Company LLC
    Inventors: Arun Murti, Joey C. Lei, Adam E. Brenner, Mark D. Malamut
  • Patent number: 12189572
    Abstract: Computing systems, methods, and non-transitory storage media are provided for obtaining images, extracting layers from each of the images, extracting segments from each of the layers, generating a compressed version of the segments by storing a single copy of each segment and metadata to reconstruct the layers from the segments and the images from the layers, and simulating a reconstruction of the image from the compressed version.
    Type: Grant
    Filed: June 13, 2023
    Date of Patent: January 7, 2025
    Assignee: Palantir Technologies Inc.
    Inventors: Ashray Jain, Bradley Moylan, Callum Rogers, Charissa Sonder Plattner
  • Patent number: 12182088
    Abstract: A method includes generating a plurality of pages from a plurality of records received from a plurality of data sources. Deduplication of the plurality of pages is facilitated based on a plurality of page metadata of the plurality of pages. For each given page of the plurality of pages, a filtered set of potentially-intersecting pages is identified as a proper subset of the plurality of pages stored in the page storage system based on first comparison parameters, and an intersecting set of pages that include a row number intersection with the given page is identified as a proper subset of the filtered set of potentially-intersecting pages based on second comparison parameters. Records with row numbers included in row number intersections with other pages in the intersecting set of pages are removed from each page.
    Type: Grant
    Filed: September 15, 2023
    Date of Patent: December 31, 2024
    Assignee: Ocient Holdings LLC
    Inventors: George Kondiles, Ravi V. Khadiwala, Donald Scott Clark, Anna Veselova
  • Patent number: 12174848
    Abstract: A computer-implemented method is provided for an automated extract, transform, and load process for a target database comprising linked data. During the data transformation phase linked data elements are added as data to a data set.
    Type: Grant
    Filed: August 19, 2019
    Date of Patent: December 24, 2024
    Assignee: ONTOFORCE NV
    Inventors: Kenny Knecht, Paul Vauterin, Hans Constandt
  • Patent number: 12164477
    Abstract: A repository of replicated chunk files is analyzed to identify chunk files that meet at least a portion of combination criteria. Selected chunk files are associated together under a data protection grouping container. Erasure coding is applied to the data protection grouping container including by utilizing the selected chunk files as different data stripes of the erasure coding and generating one or more parity stripes based on the different data stripes.
    Type: Grant
    Filed: January 24, 2022
    Date of Patent: December 10, 2024
    Assignee: Cohesity, Inc.
    Inventors: Apurv Gupta, Akshat Agarwal, Manvendra Singh Tomar, Donthula Akshith Reddy, Kushal Singh, Tarun Kumar Yadav, Mandar Suresh Naik
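    Illustrative sketch (not part of the patent record): treating selected chunk files as data stripes and deriving a parity stripe from them. A single XOR parity stripe is the simplest erasure code and tolerates the loss of one stripe; the abstract does not specify the code used, so XOR parity and all names here are assumptions.

```python
def xor_parity(stripes):
    """XOR all stripes together, treating shorter stripes as zero-padded."""
    width = max(len(stripe) for stripe in stripes)
    parity = bytearray(width)
    for stripe in stripes:
        for index, byte in enumerate(stripe):
            parity[index] ^= byte
    return bytes(parity)


def recover_missing(surviving_stripes, parity, missing_length):
    """Rebuild the single missing stripe from the survivors and the parity."""
    rebuilt = xor_parity(list(surviving_stripes) + [parity])
    return rebuilt[:missing_length]
```

    Because XOR is its own inverse, combining the parity with every surviving stripe yields the missing one; its original length must be recorded separately to strip the zero padding.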
  • Patent number: 12164799
    Abstract: Data associated with a source system is ingested. After the data is ingested, a post-processing metadata conversion process is performed including by selecting an entry of a chunk metadata data structure and determining that a data chunk associated with the selected entry is not referenced by at least a threshold number of objects. In response to determining that the data chunk associated with the selected entry is not referenced by at least the threshold number of objects, metadata of a tree data structure node corresponding to a chunk identifier associated with the data chunk is updated to store a reference to a chunk file storing the data chunk and the selected entry is removed from the chunk metadata data structure.
    Type: Grant
    Filed: August 28, 2023
    Date of Patent: December 10, 2024
    Assignee: Cohesity, Inc.
    Inventors: Zhihuan Qiu, Sachin Jain, Anubhav Gupta, Apurv Gupta, Mohit Aron
  • Patent number: 12164493
    Abstract: A method for inserting a KV pair to a separated database, the method may include receiving a request to insert the KV pair to the separated database, wherein the separated database comprises a log structured merge (LSM) tree and KV database that is separated from LSM tree; determining whether the KV pair should be associated with a versioned LSM entry or with a non-versioned LSM entry; and inserting the KV pair and a KV timestamp in the separated database according to the determining; wherein the inserting includes: storing a combination of the value and the KV timestamp in the KV database; defining an access key to the KV database; wherein the access key is based on the combination when determining that the KV pair should be associated with a versioned LSM; and wherein the access key is based on the key and not on the timestamp when determining that the KV pair should be associated with a non-versioned LSM.
    Type: Grant
    Filed: February 14, 2022
    Date of Patent: December 10, 2024
    Assignee: Pliops Ltd.
    Inventors: Guy Guetta, Edward Bortnikov, Michael Pan, Moshe Twitto, Tamar Weiss, Shmuel Dashevsky, Niv Dayan
  • Patent number: 12158869
    Abstract: A method of obtaining and imputing missing data and a measurement system having the same are disclosed.
    Type: Grant
    Filed: January 4, 2023
    Date of Patent: December 3, 2024
    Assignees: SAMSUNG ELECTRONICS CO., LTD., KOREA UNIVERSITY RESEARCH AND BUSINESS FOUNDATION
    Inventors: Seongwook Yoon, Sanghoon Sull, Jaehyun Kim, Heejeong Lim
  • Patent number: 12159110
    Abstract: The present disclosure relates to systems, methods, and computer-readable media for utilizing a concept graphing system to determine and provide relationships between concepts within document collections or corpora. For example, the concept graphing system can generate and utilize machine-learning models, such as a sparse graph recovery machine-learning model, to identify less-obvious correlations between concepts, including positive and negative concept connections, as well as provide these connections within a visual concept graph. Additionally, the concept graphing system can provide a visual concept graph that determines and displays concept correlations based on the input of a single concept, multiple concepts, or no concepts.
    Type: Grant
    Filed: June 6, 2022
    Date of Patent: December 3, 2024
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Harsh Shrivastava, Maurice Diesendruck, Robin Abraham
  • Patent number: 12153555
    Abstract: A system for data space limitation includes an interface and a processor. The interface is configured to receive a query for a structured data set. The processor is configured to determine an ordered list for calculations to respond to the query; perform the calculations according to the ordered list until an allowed time required for interactivity is reached; and in response to the allowed time being reached, provide results of the calculations.
    Type: Grant
    Filed: May 23, 2024
    Date of Patent: November 26, 2024
    Assignee: Workday, Inc.
    Inventors: Viktor Brada, Peter Fedorocko, Filip Dousek, Hynek Walner
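    Illustrative sketch (not part of the patent record): running an ordered list of calculations until a time budget is used up and then returning whatever has been computed. The function name, the budget parameter, and the injectable clock are assumptions.

```python
import time


def run_with_budget(calculations, budget_seconds, clock=time.monotonic):
    """Run calculations in order until the allowed time is reached."""
    deadline = clock() + budget_seconds
    results = []
    for calculation in calculations:
        if clock() >= deadline:
            break  # allowed time reached: stop and return partial results
        results.append(calculation())
    return results
```

    The deadline is checked only between calculations, so a single long-running calculation can still overshoot the budget; ordering the list so that the most valuable calculations come first limits what is lost when the deadline hits.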
  • Patent number: 12147557
    Abstract: Computer systems and associated methods are disclosed to implement the non-interactive join of privacy-preserving dataset sketches. In some embodiments, an entity can publish a one-time sketch of their dataset that would enable another entity to join their data without exposing private information. The sketch can map, using a hash function, the identities associated with a first value of the dataset to a data structure, in some embodiments. A same or different entity can join the first sketch with a privacy-preserving second sketch of a second dataset that includes added noise, and can determine an estimate of a number of identities that correspond with specific values of the first and second datasets from the joined dataset. The sketch can be published just one time, and therefore does not require separate new private computations with privacy budgeting for each additional party when a join is desired, in some embodiments.
    Type: Grant
    Filed: June 30, 2022
    Date of Patent: November 19, 2024
    Assignee: Amazon Technologies, Inc.
    Inventors: James Alexander Cook, Nina Mishra
  • Patent number: 12135691
    Abstract: A method for storing a received data chunk (DC) in a storage system, the method includes (a) obtaining a received fingerprint of the received DC, the received fingerprint may include received fingerprint elements that are indicative of occurrences, within the received DC, of content elements, the received fingerprint elements are ordered according to a given order; (b) searching, within a tree, for a similar stored fingerprint; the tree may include tree nodes that represent multiple stored fingerprints of stored data chunks that are stored in the storage system; different levels of the tree are allocated to different content elements; (c) compressing, when finding the similar stored fingerprint, the received DC based on a similar DC associated with the similar stored fingerprint, and updating storage system metadata to indicate that the received DC is stored in the storage system in a compressed form, and based on the similar stored DC.
    Type: Grant
    Filed: October 26, 2022
    Date of Patent: November 5, 2024
    Assignee: VAST DATA LTD.
    Inventors: Yogev Vaknin, Niko Farhi, Asaf Levy
  • Patent number: 12093187
    Abstract: Logical address space portions and virtual layer blocks (VLBs) can be partitioned into multiple sets. Each of multiple nodes in a system can be assigned exclusive ownership of one of the multiple sets. In at least one embodiment, for a read I/O which is received at a first node and directed to a logical address LA1 that is owned by a second node, the first node can request that the second owning node perform resolution processing for LA1. The second node can return either a VLB address or a PLB address based on whether the second node owns a VLB used in mapping LA1 to a corresponding physical location PA1 which includes content C1 stored at LA1. The second node can set a flag in its response to indicate whether a returned address is a VLB address or a PLB address.
    Type: Grant
    Filed: March 31, 2023
    Date of Patent: September 17, 2024
    Assignee: Dell Products L.P.
    Inventors: Vladimir Shveidel, Uri Shabi, Dror Zalstein
  • Patent number: 12074953
    Abstract: The present disclosure relates to generating, updating, modifying, and otherwise managing configurations for virtual services on a cloud computing system. The present disclosure provides example implementations of a configuration management system and configuration handlers on respective server nodes that receive and process requests for modifying one or more configurations that manage operation of virtual services on the cloud. Systems described herein involve leveraging a hierarchical model of configuration characteristics to facilitate both large and small scale modifications. Moreover, the systems described herein leverage a persistent store on server nodes to identify how to update a current base configuration and sub-version as well as synchronize modifications across a set of server nodes.
    Type: Grant
    Filed: September 23, 2021
    Date of Patent: August 27, 2024
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Sameer Kumar Patro, Aritra Basu, Raghavendra Subhash
  • Patent number: 12061581
    Abstract: Example implementations relate to metadata operations in a storage system. An example includes generating, by a storage controller of a deduplication storage system, a candidate list of container indexes for matching operations of a received data segment, each container index in the candidate list having an associated match cost; identifying, by the storage controller, a journal group associated with a first container index listed in the candidate list; reducing, by the storage controller, a match cost associated with the first container index in response to a determination that the identified journal group is in a modified state; and performing, by the storage controller, the matching operations of the received data segment based at least on the reduced match cost of the first container index.
    Type: Grant
    Filed: July 26, 2022
    Date of Patent: August 13, 2024
    Assignee: Hewlett Packard Enterprise Development LP
    Inventors: Aman Sahil, Richard Phillip Mayo
  • Patent number: 12050790
    Abstract: Aspects of the present disclosure configure a memory sub-system processor to manage memory operations with repeating data patterns. The processor receives a request to write a block of data comprising a plurality of portions to a set of memory components and determines whether a pattern of data repeats across the plurality of portions of the block of data. In response to determining that the pattern of data repeats across the plurality of portions, the processor stores a representation of the pattern of data in a mapping table and discards the block of data to prevent storing the block of data on the set of memory components.
    Type: Grant
    Filed: August 16, 2022
    Date of Patent: July 30, 2024
    Assignee: Micron Technology, Inc.
    Inventor: Anoop Achuthan Rajendrababu
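The write path summarized in this abstract can be pictured with a minimal sketch. It is an illustration only, not the patented implementation: the names `repeating_pattern` and `PatternAwareStore`, the use of logical block addresses as keys, and the fixed portion size are all assumptions made for the example.

```python
from typing import Dict, Optional, Tuple


def repeating_pattern(block: bytes, portion_size: int) -> Optional[bytes]:
    """Return the repeated portion if every portion of the block is identical, else None."""
    if portion_size <= 0 or len(block) == 0 or len(block) % portion_size != 0:
        return None
    first = block[:portion_size]
    for offset in range(portion_size, len(block), portion_size):
        if block[offset:offset + portion_size] != first:
            return None
    return first


class PatternAwareStore:
    """Hypothetical store: repeating blocks live only in a mapping table."""

    def __init__(self, portion_size: int) -> None:
        self.portion_size = portion_size
        # Assumed layout: address -> (pattern, original block length).
        self.mapping_table: Dict[int, Tuple[bytes, int]] = {}
        # Stand-in for the memory components.
        self.media: Dict[int, bytes] = {}

    def write(self, lba: int, block: bytes) -> None:
        pattern = repeating_pattern(block, self.portion_size)
        if pattern is not None:
            # Keep only a representation of the pattern; the block itself is discarded.
            self.mapping_table[lba] = (pattern, len(block))
            self.media.pop(lba, None)
        else:
            self.media[lba] = block
            self.mapping_table.pop(lba, None)

    def read(self, lba: int) -> bytes:
        if lba in self.mapping_table:
            pattern, length = self.mapping_table[lba]
            # Rebuild the block from the stored pattern.
            return pattern * (length // len(pattern))
        return self.media[lba]
```

In this sketch a block made of one 4-byte portion repeated eight times takes up a single mapping-table entry and no space in `media`, while a non-repeating block is stored as usual.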
  • Patent number: 12050879
    Abstract: A device may generate first scores for sentences of text based on a cumulative frequency of words in each sentence, may generate second scores for the sentences based on a cumulative frequency of domain entities in each sentence, and may generate third scores for the sentences based on a sentiment analysis of each sentence. The device may generate a summary of the text, may filter the sentences to extract a first set of sentences, may filter the sentences to extract a second set of sentences, and may filter the sentences to extract a third set of sentences. The device may identify and assign weights to a first group of sentences, a second group of sentences, and a third group of sentences, may generate a ranked list of sentences based on the weighted first group, second group, and third group, and may perform actions based on the final summary.
    Type: Grant
    Filed: May 24, 2022
    Date of Patent: July 30, 2024
    Assignee: Verizon Patent and Licensing Inc.
    Inventors: Prakash Ranganathan, Miruna Jayakrishnasamy
  • Patent number: 12045211
    Abstract: One example method includes collaborative deduplication. A deduplication engine implemented at a cloud level collaborates or coordinates with an extension engine of the deduplication at an edge node. This allows data ingested at a node to be collaboratively deduplicated prior to transfer to the cloud and after transfer to the cloud.
    Type: Grant
    Filed: October 27, 2020
    Date of Patent: July 23, 2024
    Assignee: EMC IP HOLDING COMPANY LLC
    Inventors: Mohamed Sohail, Karim Fathy, Robert A. Lincourt
  • Patent number: 12032534
    Abstract: A method and system is used in managing deduplication of data in storage systems. A first digest for a deduplication candidate is received. At least one stream associated with the deduplication candidate is detected. At least one neighboring digest segment of a first loaded digest segment associated with the at least one stream is loaded. Whether the digest is located in the at least one neighboring digest segment is determined. If the digest is not located in the at least one neighboring digest segment, the digest is processed.
    Type: Grant
    Filed: August 2, 2019
    Date of Patent: July 9, 2024
    Assignee: EMC IP Holding Company LLC
    Inventors: Nickolay Dalmatov, Richard Ruef, Kurt Everson
  • Patent number: 12026386
    Abstract: A method for differential compression includes receiving input data blocks that are selected for compression. For each input data block, the input data block is divided into at least two segments. For each of the at least two segments, a similarity degree between the respective segment and each of the data blocks excluding the respective data block is computed. For each of the at least two segments, the data block which has a biggest similarity degree with the respective segment among the data blocks excluding the respective data block is selected as an optimal reference data block for the respective segment. The differential compression is applied to the input data block and optimal reference blocks in response to determining a differential compression that is to be applied based on the similarity degree between the segments of the input data block and the corresponding optimal reference blocks.
    Type: Grant
    Filed: September 23, 2022
    Date of Patent: July 2, 2024
    Assignee: HUAWEI TECHNOLOGIES CO., LTD.
    Inventor: Assaf Natanzon
  • Patent number: 12001685
    Abstract: A plurality of data stripes and one or more parity stripes are generated using a plurality of data chunks stored in a write-ahead log based on an erasure coding configuration. The plurality of data stripes and the one or more parity stripes are stored on corresponding different storage devices. The plurality of data stripes and the one or more parity stripes are associated together under a data protection grouping container.
    Type: Grant
    Filed: March 31, 2022
    Date of Patent: June 4, 2024
    Assignee: Cohesity, Inc.
    Inventors: Apurv Gupta, Akshat Agarwal
  • Patent number: 11995467
    Abstract: Systems, devices, and methods are provided for validation, deletion, and/or recovery of resources in a service environment. A machine (e.g., server) may receive a request to identify or discover a list of resources that are unused in a service environment. A machine (e.g., server) may receive a request to delete one or more resources in a service environment. In at least one embodiment, deletion of a resource involves a two-stage process where the resource is recoverably deleted in a first stage (e.g., by deactivating or disabling the resource) such that the resource can be recovered prior to a predetermined time period by reactivating or re-enabling the resource and, in a second stage, the resource is unrecoverably deleted.
    Type: Grant
    Filed: July 14, 2021
    Date of Patent: May 28, 2024
    Assignee: Amazon Technologies, Inc.
    Inventors: Suresh Prakash Goacher, Arun Anilkumar, Nishit Nihal Vas
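The two-stage deletion described in this abstract might be sketched as follows. This is a simplified illustration under stated assumptions, not the claimed method: `Resource`, `ResourceRegistry`, the `grace_seconds` parameter, and passing the current time explicitly as `now` are all hypothetical choices made for the example.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Resource:
    name: str
    active: bool = True
    deactivated_at: Optional[float] = None


class ResourceRegistry:
    """Hypothetical registry with recoverable, then unrecoverable, deletion."""

    def __init__(self, grace_seconds: float) -> None:
        self.grace_seconds = grace_seconds
        self.resources: Dict[str, Resource] = {}

    def add(self, name: str) -> None:
        self.resources[name] = Resource(name)

    def soft_delete(self, name: str, now: float) -> None:
        # Stage 1: deactivate the resource but keep it so it can be recovered.
        res = self.resources[name]
        res.active = False
        res.deactivated_at = now

    def recover(self, name: str, now: float) -> bool:
        # Recovery succeeds only before the grace period has elapsed.
        res = self.resources.get(name)
        if res is None or res.active or res.deactivated_at is None:
            return False
        if now - res.deactivated_at >= self.grace_seconds:
            return False
        res.active = True
        res.deactivated_at = None
        return True

    def purge(self, now: float) -> List[str]:
        # Stage 2: unrecoverably remove resources whose grace period has elapsed.
        expired = [
            name
            for name, res in self.resources.items()
            if not res.active
            and res.deactivated_at is not None
            and now - res.deactivated_at >= self.grace_seconds
        ]
        for name in expired:
            del self.resources[name]
        return expired
```

Taking `now` as an argument rather than reading a clock keeps the sketch deterministic; a real service would presumably rely on its own timekeeping and persistence.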
  • Patent number: 11977527
    Abstract: In certain embodiments, machine learning and lineage data may be used to manage data. In some embodiments, a computing system may use lineage data to identify two datasets that may be related. The computing system may determine that a user has access to a derivative dataset but does not have access to an original dataset that was used to create the derivative dataset. In response, the computing system may use a machine learning model to generate a similarity score indicating a level of similarity between the original dataset and the derivative dataset. If the similarity score satisfies a threshold score, the computing system may modify access rights of the user so that the user is unable to access a portion of the data in the derivative dataset.
    Type: Grant
    Filed: January 3, 2022
    Date of Patent: May 7, 2024
    Assignee: Capital One Services, LLC
    Inventors: William Ye, Jon Stofer, Thomas J. O'Connor, Jose Moreno
  • Patent number: 11966630
    Abstract: A data storage device includes a memory device and a controller coupled to the memory device. The controller is configured to segment a key to physical (K2P) table into two or more segments, wherein each segment of the two or more segments corresponds to a caching priority of key value (KV) pair data, organize the K2P table by storing and relocating one or more K2P table entries into a respective segment of the two or more segments, wherein the storing and relocating comprises moving a K2P table entry based on the caching priority of the KV pair data into the respective segment having the caching priority, and utilize the K2P table to manage KV pair data stored in the memory device, wherein utilizing the K2P table comprises applying a same management operation, such as prefetching, to each K2P table entry of a same segment.
    Type: Grant
    Filed: June 27, 2022
    Date of Patent: April 23, 2024
    Assignee: Western Digital Technologies, Inc.
    Inventors: Ran Zamir, Alexander Bazarsky, David Avraham
  • Patent number: 11954331
    Abstract: A computer-implemented method enables workload scheduling in a storage system for optimized deduplication. The method includes determining dynamic correlations of deduplications between workload processes in a prior time window. Workload processes include one or more tasks with defined execution timing parameters. The method further includes determining deduplication ratios based on the correlations of the deduplications between the workload processes. The method further includes scheduling multiple workload processes based on a highest determined deduplication ratio of the determined deduplication ratios.
    Type: Grant
    Filed: October 7, 2021
    Date of Patent: April 9, 2024
    Assignee: International Business Machines Corporation
    Inventors: Miles Mulholland, Anuj Chandra, Kirsty G. Rodwell, Jorden Luke Allcock
  • Patent number: 11949751
    Abstract: The present disclosure relates to restricting electronic activities from being linked with record objects. According to at least one aspect of the disclosure, a method can include accessing, by one or more processors, a plurality of electronic activities, accessing a plurality of record objects of one or more systems of record, identifying an electronic activity of the plurality of electronic activities to match to one or more record objects, determining a data source provider associated with providing access to the electronic activity, and identifying a system of record corresponding to the determined data source provider. The system of record can include a plurality of candidate record objects to which to match the electronic activity. The method can include restricting the electronic activity from being linked with the at least one record object.
    Type: Grant
    Filed: January 23, 2023
    Date of Patent: April 2, 2024
    Inventors: Oleg Rogynskyy, Tetiana Lutsaievska, John Wulf, Sathya Hariesh Prakash
  • Patent number: 11934346
    Abstract: A cloud computing infrastructure hosts a web service with customer accounts. In a customer account, files of the customer account are listed in an index. Files indicated in the index are arranged in groups, with files in each group being scanned using scanning serverless functions in the customer account. The files in the customer account include a compressed tar archive of a software container. Member files of a compressed tar archive in a customer account are randomly-accessed by way of locators that indicate a tar offset, a logical offset, and a decompressor state for a corresponding member file. A member file is accessed by seeking to the tar offset in the compressed tar archive, restoring a decompressor to the decompressor state, decompressing the compressed tar archive using the decompressor, and moving to the logical offset in the decompressed data.
    Type: Grant
    Filed: October 17, 2022
    Date of Patent: March 19, 2024
    Assignee: Trend Micro Incorporated
    Inventor: Brendan M. Johnson
  • Patent number: 11936931
    Abstract: Methods, apparatus, systems and articles of manufacture to perform media device asset qualification are disclosed. An example apparatus includes at least one memory, and at least one processor to execute instructions to at least identify a first set of candidate media device assets for disqualification, the candidate media device assets including A) a signature and B) a media identifier that identifies media, generate a hash table using a second set of the candidate media device assets, determine one or more counts of matches between C) a first signature and a first media identifier of a first candidate media device asset of the second set and D) respective signatures and media identifiers of multiple ones of the second set using the hash table, the multiple ones of the second set not including the first candidate media device asset, and load the first signature into a reference database as a reference signature.
    Type: Grant
    Filed: October 17, 2022
    Date of Patent: March 19, 2024
    Assignee: The Nielsen Company (US), LLC
    Inventors: Daniel Nelson, James Petro, Albert T. Borawski
  • Patent number: 11914554
    Abstract: Methods and systems for improving data back-up, recovery, and search across different cloud-based applications, services, and platforms are described. A data management and storage system may direct compute and storage resources within a customer's cloud-based data storage account to back-up and restore data while the customer retains full control of their data. The data management and storage system may direct the compute and storage resources within the customer's cloud-based data storage account to generate and store secondary layers that are used for generating search indexes, to generate and store shared space layers and user specific layers to facilitate the deduplication of email attachments and text blocks, to perform a controlled restoration of email snapshots such that sensitive information (e.g., restricted keywords) located within stored snapshots remains protected, and to detect and preserve emails that were received or transmitted and then deleted between two consecutive snapshots.
    Type: Grant
    Filed: January 30, 2023
    Date of Patent: February 27, 2024
    Assignee: Rubrik, Inc.
    Inventors: Noel Moldvai, Jihang Lim
  • Patent number: 11907133
    Abstract: Standardized address generation from address substrings includes receiving an address string for a place-of-interest, one-to-many mapping at least one of a plurality of address substrings of the address string to respective address components, concatenating the address substrings using a template that specifies an order of concatenating the address substrings, and making the concatenated address substrings available for further use.
    Type: Grant
    Filed: July 29, 2022
    Date of Patent: February 20, 2024
    Assignee: SafeGraph, Inc.
    Inventor: Vera Sazonova
  • Patent number: 11893373
    Abstract: Techniques are disclosed for deploying functions in a cloud computing environment. Parameters are annotated in a plurality of Helm charts with a predetermined token. Duplicated values in the Helm charts are identified and the predetermined token is reused for the duplicated values. Schema files from the plurality of Helm charts are parsed to extract the predetermined tokens. Input data are received as values for the predetermined tokens. The function is deployed in the cloud computing environment using the values for the predetermined tokens as parameters in the Helm charts.
    Type: Grant
    Filed: January 28, 2022
    Date of Patent: February 6, 2024
    Assignee: MICROSOFT TECHNOLOGY LICENSING, LLC
    Inventors: Frank John D'Innocenzo, Kam Yee Lee
  • Patent number: 11888936
    Abstract: A method for providing a proxy redirect to facilitate a storage and a retrieval of an object is disclosed. The method includes receiving a mapping of a user to a logical container that stores the object and to a storage provider that stores the logical container; receiving a key corresponding to the logical container and associated with the user; storing the mapping and the key in a database; generating, for the user, an application protocol that redirects to a pre-signed web address based on the stored mapping and the stored key; and transmitting, via a communication interface, the application protocol to the one user. The method further includes the user using the application protocol to directly access the storage provider and retrieve the object.
    Type: Grant
    Filed: July 1, 2020
    Date of Patent: January 30, 2024
    Assignee: JPMORGAN CHASE BANK, N.A.
    Inventor: Zachariah Antonas