Patents Examined by Yicun Wu
  • Patent number: 10841405
    Abstract: A system may include a storage device configured to store a plurality of database tables. The system may further include a processor in communication with the storage device. The processor may receive a request to transmit a database table from the plurality of database tables. The database table may have a plurality of rows. The processor may determine if contents of each column row of each row of the database table are eligible to be compressed. For each column row that contains eligible contents, the processor may generate compressed data representative of the contents of a respective column row. The processor may remove the contents of the respective column row from the associated row. The processor may transmit the compressed data and the database table without content of the column rows represented by the compressed data. A method and computer-readable medium may also be implemented.
    Type: Grant
    Filed: March 14, 2014
    Date of Patent: November 17, 2020
    Assignee: Teradata US, Inc.
    Inventors: Victor Lewis, III, Bret M. Gregory
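
The transmit path in the abstract above can be sketched roughly as follows. This is a minimal illustration, not the patented implementation: the eligibility rule (`MIN_LENGTH`), the use of `zlib`, and the function names are all assumptions made for the example.

```python
import zlib

MIN_LENGTH = 32  # hypothetical eligibility threshold, in bytes


def is_eligible(value):
    """Assumed rule: only sufficiently long strings are worth compressing."""
    return isinstance(value, str) and len(value.encode("utf-8")) >= MIN_LENGTH


def split_for_transmission(rows):
    """Return (rows with eligible contents removed, compressed contents)."""
    stripped_rows = []
    compressed = {}
    for row_index, row in enumerate(rows):
        new_row = dict(row)
        for column, value in row.items():
            if is_eligible(value):
                # Keep the compressed form keyed by its position in the table...
                compressed[(row_index, column)] = zlib.compress(value.encode("utf-8"))
                # ...and remove the original contents from the row.
                new_row[column] = None
        stripped_rows.append(new_row)
    return stripped_rows, compressed


def restore(stripped_rows, compressed):
    """Receiver side: put the decompressed contents back into the rows."""
    rows = [dict(row) for row in stripped_rows]
    for (row_index, column), blob in compressed.items():
        rows[row_index][column] = zlib.decompress(blob).decode("utf-8")
    return rows
```

A sender would transmit both return values of `split_for_transmission`, and the receiver would call `restore` on them to rebuild the original table.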
  • Patent number: 10747718
    Abstract: A method for maintaining a mapping structure for maintaining metadata for snapshots in a virtualized storage environment includes taking a snapshot of a virtual disk, generating an entry in a metadata structure for the snapshot, wherein the entry includes metadata for blocks in the snapshot that have been modified since a preceding snapshot, and lazily generating an entry in the mapping structure for the snapshot, wherein the entry includes values for each block in the snapshot, wherein a value for a block indicates a presence of metadata in the metadata structure for the block or an absence of metadata in the metadata structure for the block.
    Type: Grant
    Filed: July 26, 2017
    Date of Patent: August 18, 2020
    Assignee: Nutanix, Inc.
    Inventors: Manosiz Bhattacharyya, Vinayak Hindurao Khot, Tabrez Parvez Memon, Kannan Muthukkaruppan
  • Patent number: 10726007
    Abstract: Constructing a heavy hitter summary for query optimization. The heavy hitter summary is constructed by sampling each of multiple partitions of a dataset using a uniform sampling rate. For each partition, performing a two-stage heavy hitter estimation process to determine whether an estimated frequency of a key of the sampled data units may be included in a partition-level heavy hitter summary. Constructing a partition-level heavy hitter summary for each partition of the dataset based on the keys determined via the two-stage process, and constructing a dataset-level heavy hitter summary based on the partition-level heavy hitter summary. The dataset-level heavy hitter summary may be used to optimize query trees.
    Type: Grant
    Filed: September 26, 2017
    Date of Patent: July 28, 2020
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Wangchao Le, Yongchul Kwon, Marc Todd Friedman
  • Patent number: 10726017
    Abstract: An administrator may wish to limit the number of tuples that may be spawned as a result of a first tuple entering an operator graph. A first stream operator may receive a first tuple in order to perform an operation on the first tuple to generate a second tuple. The first stream operator may determine whether it is permitted to generate the second tuple by comparing the first tuple's tuple spawn counts to a tuple creation policy. If the first stream operator is permitted to generate the second tuple, the first stream operator may perform the operation on the first tuple and generate the second tuple.
    Type: Grant
    Filed: May 1, 2018
    Date of Patent: July 28, 2020
    Assignee: International Business Machines Corporation
    Inventors: Eric L. Barsness, Michael J. Branson, John M. Santosuosso
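
The policy check described in the abstract above might look something like the following sketch. The `StreamTuple` type, the meaning assigned to `spawn_count`, the `MAX_SPAWN_COUNT` limit, and the doubling operator are hypothetical choices made only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StreamTuple:
    payload: int
    # Assumed meaning: number of spawning steps between the original
    # tuple that entered the operator graph and this tuple.
    spawn_count: int = 0


MAX_SPAWN_COUNT = 2  # hypothetical tuple creation policy


def may_spawn(t, max_spawn_count=MAX_SPAWN_COUNT):
    """Compare the tuple's spawn count to the tuple creation policy."""
    return t.spawn_count < max_spawn_count


def double_operator(t):
    """Example stream operator: emits a second tuple only if the policy permits."""
    if not may_spawn(t):
        return None  # policy forbids generating another tuple
    return StreamTuple(payload=t.payload * 2, spawn_count=t.spawn_count + 1)
```

Under this policy a tuple entering the graph can give rise to at most two further generations; a third call to `double_operator` on the chain returns `None`.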
  • Patent number: 10691679
    Abstract: Data, such as product data or airline flight data, is represented using structured data tuples, tables, or as data with related metadata and tags, and stored by a search engine. Partial queries are received by the search engine from a user and are used to generate a dialog between the search engine and the user. The dialog may include suggested query completions for the partial queries that correspond to a schema associated with the data tuples. The suggested query completions may be determined using attribute combinations of attributes and attribute values, or metadata and tags associated with the data tuples, including known synonyms and misspellings. The user may interact with the query completions in the dialog, and the search engine may revise the dialog and the query completions according to the interactions. A user may query data tuples without knowing the schema used by the underlying data structures.
    Type: Grant
    Filed: January 18, 2011
    Date of Patent: June 23, 2020
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Stelios Paparizos, David James Gemmell
  • Patent number: 10684992
    Abstract: Implementations are provided herein for using inode revision numbers associated with a modified LIN and a set of Parent LINs to causally order transactions within a distributed file system. Any time an inode is changed, its inode revision number can be incremented by 1. When events within the file system are processed causing an inode or a set of inodes to be modified, an event transaction log entry can be made. The event transaction log entry can denote a description of the event, a set of modified inode and inode revision number pairs, and a set of parent inode and inode revision number pairs. Entries in the event transaction log can be used to build an inode map for each inode implicated in the event transaction log. The inode map can be used to build a set of direct causal dependencies for each transaction in the event transaction log.
    Type: Grant
    Filed: April 28, 2017
    Date of Patent: June 16, 2020
    Assignee: EMC IP Holding Company LLC
    Inventors: Raeanne Marks, Jonathan M. Walton, Ronald Steinke, Karthik Palaiappan, Tanuj Khurana, Steven Hubbell
  • Patent number: 10678652
    Abstract: Embodiments are directed to a method of identifying changed files in incremental block based backups, by obtaining changed data blocks of a file from a change block tracking (CBT) driver, wherein the file has an associated master file table (MFT) record and a parent MFT record number used in a file system, and constructing a complete file path of the file by traversing from the changed MFT record to the root directory using respective parent MFT record numbers by iteratively parsing each record by extracting the file name and parent MFT record number and appending the file name to a previous MFT record file name.
    Type: Grant
    Filed: April 28, 2017
    Date of Patent: June 9, 2020
    Assignee: EMC IP Holding Company LLC
    Inventors: Ravi Rangapuram, Pavan Kumar Dattatreya Ati, Sridhar Surampudi
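
The path reconstruction described in the abstract above can be illustrated with a short sketch. It assumes the MFT has already been parsed into a hypothetical in-memory table mapping each record number to a `(file name, parent record number)` pair; the sample records and the `build_path` helper are invented for the example, and the root record number is passed in as a parameter.

```python
ROOT_RECORD = 5  # record number used for the root directory in this sample table


def build_path(mft, record_number, root=ROOT_RECORD):
    """Walk parent record numbers from a changed record up to the root."""
    parts = []
    seen = set()
    current = record_number
    while current != root:
        if current in seen:
            raise ValueError("cycle in parent record numbers")
        seen.add(current)
        name, parent = mft[current]  # extract file name and parent record number
        parts.append(name)
        current = parent
    # Names were collected leaf-first, so reverse them to get the full path.
    return "\\" + "\\".join(reversed(parts))


# Hypothetical parsed records: record number -> (file name, parent record number)
sample_mft = {
    5: (".", 5),
    40: ("Users", 5),
    41: ("alice", 40),
    97: ("report.docx", 41),
}
```

With this table, `build_path(sample_mft, 97)` walks records 97, 41, and 40 before reaching the root and yields the path `\Users\alice\report.docx`.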
  • Patent number: 10649983
    Abstract: Systems and methods for integrating data are described. In an example embodiment, a plurality of data attributes of comparison data and the plurality of data attributes of a master record are respectively compared to determine that there is a difference, the comparison data originating from a data source. A relative level of source priority of the data source of the comparison data is determined relative to the data source of a current state version of the master record in accordance with source evaluation criteria. The current state version of the master record is stored in reference data based on a determination that there is a difference and that the source priority of the data source of the comparison data is equal to or greater than the data source of the current state version of the master record.
    Type: Grant
    Filed: August 23, 2017
    Date of Patent: May 12, 2020
    Assignee: Express Scripts Strategic Development, Inc.
    Inventor: Blayne S. Lequeux
  • Patent number: 10650018
    Abstract: A system to reduce the amount of storage and memory used to maintain derived datasets is disclosed. The system operates by using pointers to the underlying data in persistent, byte-addressable storage media. The system additionally reduces the creation time of the views when storage class memory (SCM) is the underlying storage. Furthermore, the invention relates to a form of compression that is tailored to the use cases of big data analytics. The processes of this disclosure use random access to significantly improve performance.
    Type: Grant
    Filed: March 9, 2017
    Date of Patent: May 12, 2020
    Assignee: International Business Machines Corporation
    Inventors: Danny Harnik, Moshik Hershcovitch, Ronen Kat, Yaron Weinsberg
  • Patent number: 10642872
    Abstract: An indexing scheme generates a token index associating token index values with keywords in queries and generates expression trees for the queries that use the token index values to represent the keywords. The indexing scheme generates a document index assigning document index values to uploaded documents. The indexing scheme generates a document-token index that associates the token index values with the document index values for the documents containing the keywords associated with the token index values. The indexing scheme applies the expression trees to the document-token index to quickly identify the documents satisfying the queries. For example, the indexing scheme may generate bit arrays for each of the token index values identifying the documents containing the keywords and apply logical operators from the queries to the bit arrays. The resulting data structure provides a list of documents satisfying the queries.
    Type: Grant
    Filed: October 21, 2016
    Date of Patent: May 5, 2020
    Assignee: SALESFORCE.COM, INC.
    Inventor: Ian Frosst
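
The bit-array example mentioned in the abstract above can be sketched as follows. This is a simplified illustration under stated assumptions: tokens are used directly as dictionary keys instead of separate token index values, Python integers stand in for bit arrays, and the tuple-based expression tree format is invented for the example.

```python
def build_bit_arrays(documents):
    """Map each token to an integer whose bit i is set if document i contains it."""
    bit_arrays = {}
    for doc_index, text in enumerate(documents):
        for token in set(text.lower().split()):
            bit_arrays[token] = bit_arrays.get(token, 0) | (1 << doc_index)
    return bit_arrays


def evaluate(expr, bit_arrays, all_docs_mask):
    """Evaluate an expression tree such as ("AND", "red", ("NOT", "apple"))."""
    if isinstance(expr, str):
        return bit_arrays.get(expr, 0)  # unknown tokens match no document
    op, *args = expr
    values = [evaluate(arg, bit_arrays, all_docs_mask) for arg in args]
    if op == "NOT":
        return all_docs_mask & ~values[0]
    if op not in ("AND", "OR"):
        raise ValueError(f"unknown operator: {op}")
    result = values[0]
    for value in values[1:]:
        result = (result & value) if op == "AND" else (result | value)
    return result


def matching_documents(mask):
    """Turn a result bit array back into a list of document index values."""
    return [i for i in range(mask.bit_length()) if (mask >> i) & 1]


documents = ["red apple pie", "green apple", "red wine"]
bit_arrays = build_bit_arrays(documents)
all_docs = (1 << len(documents)) - 1
query = ("AND", "red", ("NOT", "apple"))
print(matching_documents(evaluate(query, bit_arrays, all_docs)))  # prints [2]
```

Because each logical operator becomes a single bitwise operation over all documents at once, a query never has to rescan the document text.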
  • Patent number: 10614107
    Abstract: An apparatus and techniques for constructing and utilizing a “dynamic dictionary” that is not a compiled dictionary, and therefore does not need to be recompiled in order to be updated. The dynamic dictionary includes respective data structures that represent (i) a management automaton that includes a plurality of management nodes, and (ii) a runtime automaton that is derived from the management automaton and includes a plurality of runtime nodes. The runtime automaton may be used to search input data, such as communication traffic over a network, for keywords of interest, while the management automaton manages the addition of keywords to the dynamic dictionary. Typically, at least two (e.g., exactly two) such dynamic dictionaries are used in combination with a static dictionary.
    Type: Grant
    Filed: October 21, 2016
    Date of Patent: April 7, 2020
    Assignee: VERINT SYSTEMS LTD.
    Inventor: Yitshak Yishay
  • Patent number: 10606807
    Abstract: Systems and methods for deduplicating data are provided. An index used in deduplicating data is distributed to clients. The clients can use the distributed index to provide hints as to whether the data is deduplicated at the server. The server may be associated with a main index used to confirm whether the data is deduplicated based on the hints.
    Type: Grant
    Filed: April 28, 2017
    Date of Patent: March 31, 2020
    Assignee: EMC IP Holding Company LLC
    Inventors: Dilip N. Simha, Thomas Sandholm, Julio Lopez
  • Patent number: 10592684
    Abstract: Systems and methods are provided for automatic operation detection on protected fields. A data model configuration can be used to specify which attributes of a data model used by a cloud-based application are protected by a data security provider monitoring communications between the application and a client device. A determination can be made automatically which operations of the cloud-based application are supported for protected fields. The cloud-based application can be configured to enable/disable certain features, such as validators, auto complete, search operators, etc. according to whether the attributes are protected fields.
    Type: Grant
    Filed: October 21, 2016
    Date of Patent: March 17, 2020
    Assignee: Oracle International Corporation
    Inventors: Jing Wu, Blake Sullivan, Michael William McGrath, Min Lu
  • Patent number: 10585926
    Abstract: Embodiments include methods, systems, and computer program products for managing structuring of large sets of unstructured data. In some embodiments, a search query may be received from a user via a graphical user interface (GUI). The search query may be parsed to identify a data aspect and a first value. An aspect-value pair may be generated using the data aspect and the first value. A data asset may be generated by associating a type structure to the unstructured data comprising a second value, wherein the type structure comprises the data aspect and the second value. A set of search results may be generated using the first value, wherein the set of search results comprises at least one data asset that matches the first value. Presentation of the set of search results may be facilitated, where the set of search results corresponds to the search query and comprises the data aspect.
    Type: Grant
    Filed: June 14, 2016
    Date of Patent: March 10, 2020
    Assignee: International Business Machines Corporation
    Inventors: Paul K. Bullis, Geoffrey M. Hambrick
  • Patent number: 10572484
    Abstract: Attributes and semantics of duplicate insignificance that are inherent or inferred in a database language statement are detected. Also, a join operation that is inherent or inferred in the database language statement is detected and examined for join semantics. The join semantics specifies or refers to a driving table to be subjected to a hash join operation that may populate one or more hash buckets. The optimizer and the execution layers may use cost estimation or heuristics to assign the left and right table roles to the tables involved in the join. The hash join operation removes left table duplicates during population of the hash buckets, resulting in full or partial duplicate elimination that occurs during the hash join operation.
    Type: Grant
    Filed: March 9, 2017
    Date of Patent: February 25, 2020
    Assignee: Oracle International Corporation
    Inventors: Srikanth Bondalapati, Rafi Ahmed, Sankar Subramanian
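
The idea of eliminating left-table duplicates while the hash buckets are being populated can be sketched as below. This is only an illustration of the general technique, assuming rows are plain dictionaries and that duplicates are exact row matches; the function and parameter names are hypothetical, and the cost-based assignment of left and right roles is not modeled.

```python
from collections import defaultdict


def hash_join_dedup_left(left_rows, right_rows, left_key, right_key):
    """Hash join that drops duplicate left rows during the build phase."""
    buckets = defaultdict(list)
    # Build phase: populate hash buckets from the left table,
    # skipping rows that are already present in their bucket.
    for row in left_rows:
        bucket = buckets[row[left_key]]
        if row not in bucket:
            bucket.append(row)
    # Probe phase: look up each right row's key in the buckets.
    joined = []
    for right in right_rows:
        for left in buckets.get(right[right_key], []):
            joined.append({**left, **right})
    return joined
```

Since duplicates never enter the buckets, no separate duplicate-elimination pass over the left table is needed before or after the join in this sketch.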
  • Patent number: 10540366
    Abstract: Aspects of the disclosure relate to transforming data structures and data objects. A computing platform may query a logical view of a data table associated with a first database maintained by a first database server in a first schema and may receive source data associated with the logical view. Subsequently, the computing platform may split the source data into a plurality of data chunks associated with the logical view. Next, the computing platform may move the plurality of data chunks to a plurality of nodes configured to receive and transform the plurality of data chunks from the first schema to a second schema different from the first schema. Then, the computing platform may command the plurality of nodes to transform the plurality of data chunks and may move the transformed data from the plurality of nodes to a second database maintained by a second database server in the second schema.
    Type: Grant
    Filed: March 9, 2017
    Date of Patent: January 21, 2020
    Assignee: Bank of America Corporation
    Inventors: Tao Huang, Sriharsha Jana
  • Patent number: 10534814
    Abstract: In one embodiment, a method includes accessing content objects of an online social network, each content object being associated with an entity of the online social network, where each content object includes content of the content object and is associated with metadata, generating a set of n-grams by extracting one or more n-grams from the content of the content object, calculating, for each extracted n-gram, a quality score for the n-gram based on occurrence counts associated with map tiles of a geographical map, where each occurrence count comprises a count of entities geographically located in a corresponding map tile and associated with the n-gram, generating a snippet-module including one or more of the extracted n-grams from the set of n-grams having quality-scores greater than a threshold quality-score, and sending, to a client system of a user of the online social network, the snippet-module for display to the user.
    Type: Grant
    Filed: November 11, 2015
    Date of Patent: January 14, 2020
    Assignee: Facebook, Inc.
    Inventors: Kanishk Parihar, Anton Bryl
  • Patent number: 10535031
    Abstract: The present disclosure relates to systems and methods for assigning node profiles to record objects. The method includes maintaining a plurality of node profiles. The method includes identifying a record object to which to assign a node profile. The method includes selecting a subset of node profiles that satisfy a node profile selection policy. The method includes generating, for each of the subset of node profiles, a performance profile using electronic activities of the node profiles and one or more object field-value pairs of the record object. The method includes determining, for a first node profile, that a match score between the first node profile and the record object based on the performance profile and one or more object field-value pairs of the record object satisfies a threshold. The method includes storing an association between the first node profile and the record object based on the match score.
    Type: Grant
    Filed: May 23, 2019
    Date of Patent: January 14, 2020
    Assignee: People.ai, Inc.
    Inventors: Oleg Rogynskyy, Yurii Brunets, Maksym Kysylov
  • Patent number: 10528533
    Abstract: Techniques are disclosed for identifying anomalies in small data sets, by identifying anomalies using a Generalized Extreme Student Deviate test (GESD test). In an embodiment, a data set, such as business data or a website metric, is checked for skewness and, if found to be skewed, is transformed to a normal distribution (e.g., by applying a Box-Cox transformation). The data set is checked for presence of trends and, if a trend is found, has the trend removed (e.g., by running a linear regression). In one embodiment, a maximum number of anomalies is estimated for the data set, by applying an adjusted box plot to the data set. The data set and the estimated number of anomalies are run through a GESD test, and the test identifies anomalous data points in the data set, based on the provided estimated number of anomalies. In an embodiment, a confidence interval is generated for the identified anomalies.
    Type: Grant
    Filed: February 9, 2017
    Date of Patent: January 7, 2020
    Assignee: Adobe Inc.
    Inventors: Shiv Kumar Saini, Trevor Paulsen, Moumita Sinha, Gaurush Hiranandani
  • Patent number: 10528540
    Abstract: The present disclosure provides a detailed description of techniques used in systems, methods, and in computer program products for dynamic aggregate generation and updating for high performance querying of large datasets. Certain embodiments are directed to technological solutions for determining at least one aggregate of selected virtual cube attributes (e.g., measures, dimensions, etc.) describing a virtual multidimensional data model of a subject database, and generating an aggregate table and a set of aggregate metadata for the aggregate. In some embodiments, an aggregate database statement configured to operate on the subject database can be issued to generate the aggregate table and/or aggregate metadata. Further, the aggregate can be dynamically determined responsive to receiving a database statement configured to operate on the virtual multidimensional data model representing the subject database.
    Type: Grant
    Filed: November 19, 2015
    Date of Patent: January 7, 2020
    Assignee: AtScale, Inc.
    Inventors: Sarah Gerweck, David Ross