Inverted Index Patents (Class 707/742)

Machine-learning based transcript summarization

Patent number: 12217013

Abstract: There is a need for more effective and efficient predictive natural language summarization. This need is addressed by applying hybrid extractive and abstractive summarization techniques in a unique processing pipeline to generate a cohesive and comprehensive summary of a multi-party interaction.

Type: Grant

Filed: October 3, 2022

Date of Patent: February 4, 2025

Assignee: UnitedHealth Group Incorporated

Inventors: Rajesh Sabapathy, Chirag Mittal, Gourav Awasthi, Aditya Teja Josyula
Scan-based merge for analytical query processing in HTAP systems using delete vectors

Patent number: 12135700

Abstract: The subject technology receives a query, the query including a query range for processing the query and a set of requested columns. The subject technology determines, based on the query range, a set of blob files and a set of delete vectors. For each blob file, the subject technology stores each row, including the set of requested columns, into an array of rowsets. For each rowset, the subject technology generates a delete bitset to at least indicate whether each row has been deleted. For each delta file, the subject technology indicates a previous row of a visible row of the delta file as being deleted based on a delete pointer of the visible row. The subject technology provides a set of rowsets, including a corresponding selection column set, as a result of the query.

Type: Grant

Filed: September 1, 2023

Date of Patent: November 5, 2024

Assignee: Snowflake Inc.

Inventors: Mihir Dharamshi, Cristian Diaconu, Chen Luo, Joshua Slocum
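The delete-bitset step in the abstract above can be illustrated with a minimal sketch. It assumes a rowset is a plain list of rows and a delete vector is a collection of row positions; the function names `build_delete_bitset` and `visible_rows` are hypothetical and are not taken from the patent.

```python
def build_delete_bitset(row_count, deleted_positions):
    """Return one flag per row: True means the row has been deleted."""
    bitset = [False] * row_count
    for pos in deleted_positions:
        # Ignore positions that fall outside this rowset.
        if 0 <= pos < row_count:
            bitset[pos] = True
    return bitset


def visible_rows(rowset, bitset):
    """Keep only the rows whose delete flag is not set."""
    return [row for row, deleted in zip(rowset, bitset) if not deleted]


# Hypothetical rowset holding only the requested columns (id, name).
rowset = [(1, "a"), (2, "b"), (3, "c"), (4, "d")]
delete_vector = [1, 3]  # assumed format: positions of deleted rows

bitset = build_delete_bitset(len(rowset), delete_vector)
result = visible_rows(rowset, bitset)
```

Here the bitset acts as a selection mask over the scanned rows, so deleted rows are filtered at read time rather than physically removed from the underlying files.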
Non-transitory computer readable medium with executable revision history integration program converting name of an editor in a revision history of a document and subsequently deleting addition and deletion histories in the same editor's name resulting from the conversion, and revision history integration system with server that performs the same conversions and deletions

Patent number: 12067352

Abstract: A non-transitory computer readable medium with an executable revision history integration program that causes at least one computer to execute: a procedure for acquiring data on at least one document having at least one editing history including a name of an editor of the document; a procedure for collectively converting at least one name of an editor included in the editing history in the acquired data; and a procedure for deleting addition and deletion histories where a character string in the acquired data has been added and deleted by the same editor's name, as a result of the conversion of the name of the editor.

Type: Grant

Filed: July 19, 2022

Date of Patent: August 20, 2024

Assignee: BOOSTDRAFT, INC.

Inventors: Yohei Fujii, Hiroshi Watanabe
Feedback-based inverted index compression

Patent number: 11947512

Abstract: The disclosed technology is generally directed to the compression of inverted indexes. In one example of the technology, an inverted index that includes a plurality of posting lists and metadata is provided. The inverted index indicates compression settings that are associated with the plurality of posting lists. At periodic scheduled times, a regeneration is performed on the inverted index. The regeneration includes decompressing the inverted index. The decompressing uses the compression settings indicated by the inverted index. The regeneration further includes determining compression settings to use during a next periodic scheduled time of the plurality of periodic scheduled times, such that at least a first posting list of the plurality of posting lists uses a different compression setting than a second posting list of the plurality of posting lists.

Type: Grant

Filed: February 22, 2022

Date of Patent: April 2, 2024

Assignee: Microsoft Technology Licensing, LLC

Inventors: Torsten Amundsen, Pavel Sukhov
Data store indexing engine with automated refresh

Patent number: 11934370

Abstract: Systems and methods are disclosed to implement an indexing engine that maintains an index in an index store for a storage object in a data store. In embodiments, the index store may be implemented using an in-memory storage cluster separate from the data store. The storage object may have multiple indexes, which may have different filtering or sorting criteria for the data. In embodiments, updates to the storage object are received as an update stream by the indexing engine. Based on configurable indexing rules, the indexing engine applies the updates to the appropriate indexes. To service a query to the data store, a query engine first retrieves a set of keys satisfying the query from the index store, and then data corresponding to the keys from the data store or another index. In embodiments, the index may be refreshed via touch updates of selected data in the storage object.

Type: Grant

Filed: December 11, 2017

Date of Patent: March 19, 2024

Assignee: Amazon Technologies, Inc.

Inventors: Long Nguyen, Dominic Corona, Fletcher Liverance
Utilizing regular expression embeddings for named entity recognition systems

Patent number: 11914583

Abstract: Various embodiments are directed to a system that utilizes regular expression (regex) to recognize at least portions of characters, words, text, numbers, etc. in a structured or unstructured dataset, any patterns associated therewith, and/or similarities between the determined patterns. In examples, a regex-based pattern recognition platform may receive a dataset and determine whether at least a first regex pattern and a second regex pattern can be identified. The occurrences of the first and second regex patterns and the frequency of those occurrences may reveal something about the dataset itself or any patterns contained therein.

Type: Grant

Filed: September 16, 2020

Date of Patent: February 27, 2024

Assignee: Capital One Services, LLC

Inventors: Jeremy Edward Goodsitt, Austin Grant Walters, Reza Farivar, Mark Louis Watson, Anh Truong, Galen Rafferty, Vincent Pham
Synchronizing independent media and data streams using media stream synchronization points

Patent number: 11876851

Abstract: A messaging channel is embedded directly into a media stream. Messages delivered via the embedded messaging channel are extracted at a client media player. According to a variant embodiment, and in lieu of embedding all of the message data in the media stream, only a coordination index is injected, and the message data is sent separately and merged into the media stream downstream (at the client media player) based on the coordination index. In one example embodiment, multiple data streams (each potentially with different content intended for a particular “type” or class of user) are transmitted alongside the video stream in which the coordination index (e.g., a sequence number) has been injected into a video frame. Based on a user's service level, a particular one of the multiple data streams is released when the sequence number appears in the video frame, and the data in that stream is associated with the media.

Type: Grant

Filed: March 22, 2022

Date of Patent: January 16, 2024

Assignee: Akamai Technologies, Inc.

Inventors: Mark M. Ingerman, Michael Archer
Content discovery systems and methods

Patent number: 11841879

Abstract: Described herein is a computer implemented method for identifying one or more documents of potential relevance to an input query. The method comprises receiving the input query; processing input text from the query to generate an input query vector; accessing document records from a record database, each document record including a document vector; generating a document similarity score in respect of each accessed document, the document similarity score for a given document record being generated using the document vector for the given document record and the input query vector, the document similarity score for a given document record indicating the similarity of the input text to a document that the given document record is in respect of; and identifying one or more potentially relevant document records based on their document similarity scores.

Type: Grant

Filed: August 20, 2022

Date of Patent: December 12, 2023

Assignees: ATLASSIAN PTY LTD., ATLASSIAN US, INC.

Inventors: Geoff Sims, Michael Fulthorp, Mike Ortman, Jeff Nelson, Matthew Hunter
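The scoring step described in the abstract above can be sketched as follows. The abstract does not name a similarity measure, so cosine similarity is an assumption here, as are the record layout and the function names `cosine_similarity` and `rank_documents`.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # a zero vector is treated as similar to nothing
    return dot / (norm_a * norm_b)


def rank_documents(query_vector, records, top_k=3):
    """Score every document record against the query and return the best ones."""
    scored = [
        (rec["id"], cosine_similarity(query_vector, rec["vector"]))
        for rec in records
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


# Hypothetical document records, each carrying a precomputed document vector.
records = [
    {"id": "doc-a", "vector": [1.0, 0.0, 0.0]},
    {"id": "doc-b", "vector": [0.7, 0.7, 0.0]},
    {"id": "doc-c", "vector": [0.0, 0.0, 1.0]},
]

top = rank_documents([1.0, 0.0, 0.0], records, top_k=2)
```

With this toy data, `top` lists `doc-a` first and `doc-b` second, since their vectors point closest to the query vector.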
Acceleration circuitry for posit operations

Patent number: 11829755

Abstract: Systems, apparatuses, and methods related to acceleration circuitry for posit operations are described. Signaling indicative of performance of an operation to write a first bit string to a first buffer resident on acceleration circuitry and a second bit string resident on the acceleration circuitry can be received at a DMA controller couplable to the acceleration circuitry. The acceleration circuitry can be configured to perform arithmetic operations, logical operations, or both on bit strings formatted in a unum or posit format. Signaling indicative of an arithmetic operation, a logical operation, or both, to be performed using the first and second bit strings can be transmitted to the acceleration circuitry. The arithmetic operation, the logical operation, or both can be performed via the acceleration circuitry and according to the signaling. Signaling indicative of a result of the arithmetic operation, the logical operation, or both can be transmitted to the DMA controller.

Type: Grant

Filed: August 1, 2022

Date of Patent: November 28, 2023

Assignee: Micron Technology, Inc.

Inventors: Vijay S. Ramesh, Phillip G. Hays, Craig M. Cutler, Andrew J. Rees
Dynamic storage and deferred analysis of data stream events

Patent number: 11789950

Abstract: Systems and methods are described for a streaming data processing system that defers processing of some data based on a determined importance of the data. A streaming data processing system can ingest a data stream that contains multiple events, and can extract data field values from individual events and process the data field values to determine event importance. The streaming data processing system can then do further processing and indexing of high importance events, and can generate a storage prefix for each low importance event that determines where to store the low importance event in a data storage system. The streaming data processing system can then process queries by retrieving the indexed high importance events, and can extract the data field values from a high importance event to determine the storage prefix for retrieving corresponding low importance events from the data storage system.

Type: Grant

Filed: October 19, 2020

Date of Patent: October 17, 2023

Assignee: Splunk Inc.

Inventors: Paul Jean André Bernier, Poornima Devaraj, Ivneet Kaur, Zhimin Liang, Min Zhang
Processing and visualization of textual data based on syntactic dependency trees and sentiment scoring

Patent number: 11775755

Abstract: Described herein are improved systems and methods for overcoming technical problems associated with processing and visualization of textual data and natural language processing. In some examples, a method is provided for determining sentiment associated with big data analysis of database information. In some examples, textual news data (e.g., NEWS API, RSS, etc.) is received via a communications network from a plurality of data platforms. The textual news data is parsed, and syntactic dependency trees are generated therefrom. A sentiment score is derived for the parsed textual data corresponding to a word or phrase associated with the textual data, and an image is generated reflecting scored sentiment for the parsed textual data.

Type: Grant

Filed: March 2, 2023

Date of Patent: October 3, 2023

Assignee: TLDR LLC

Inventors: Benjamin Olander, Jedediah Carty, Matthew Van Dusen, Philip Stockton
Method, electronic device, and computer program product for processing data

Patent number: 11698889

Abstract: Embodiments of the present disclosure relate to processing data. An example method includes acquiring data related to a first moment in streaming data of an object to be processed. The method further includes storing the data in a first entry of a data table based on an identification of the object to be processed, wherein the data table further includes a second entry before the first entry, and the second entry stores data related to a second moment before the first moment in the streaming data. The method further includes updating an index related to the object to be processed based on the first entry. Thus, a solution to the problem of performing search in data at different moments is provided, and it is unnecessary for a user to participate in the solution, thus improving the user experience and reducing the use of storage resources.

Type: Grant

Filed: April 30, 2021

Date of Patent: July 11, 2023

Assignee: EMC IP HOLDING COMPANY LLC

Inventors: Pengfei Su, Lu Lei, Julius Jian Zhu
Parser for schema-free data exchange format

Patent number: 11693839

Abstract: A method includes obtaining a query containing at least one field from which data is being queried, obtaining a dataset having a schema-free data exchange format having multiple fields of data at different physical positions in the dataset, and parsing the dataset by obtaining a structural index that maps logical locations of fields to physical locations of the fields of the dataset, accessing the structural index with logical locations of the fields that index to the physical locations, and providing data from the fields based on the physical locations responsive to the query.

Type: Grant

Filed: September 10, 2020

Date of Patent: July 4, 2023

Assignee: Microsoft Technology Licensing, LLC

Inventors: Yinan Li, Nikolaos Romanos Katsipoulakis, Badrish Chandramouli, Jonathan D Goldstein, Donald Kossmann
Multimodal table encoding for information retrieval systems

Patent number: 11687514

Abstract: Multimodal table encoding, including: Receiving an electronic document that contains a table. The table includes multiple rows, multiple columns, and a schema comprising column labels or row labels. The electronic document includes a description of the table which is located externally to the table. Next, operating separate machine learning encoders to separately encode the description, schema, each of the rows, and each of the columns of the table, respectively. The schema, the rows, and the columns are encoded together with end-of-column tokens and end-of-row tokens that mark an end of each column and row, respectively. Then, applying a machine learning gating mechanism to the encoded description, encoded schema, encoded rows, and encoded columns, to produce a fused encoding of the table, wherein the fused encoding is representative of both a structure of the table and a content of the table.

Type: Grant

Filed: July 15, 2020

Date of Patent: June 27, 2023

Assignee: International Business Machines Corporation

Inventors: Roee Shraga, Haggai Roitman, Guy Feigenblat, Mustafa Canim
Caching content securely within an edge environment, with pre-positioning

Patent number: 11671413

Abstract: A technique to cache content securely within edge network environments, even within portions of that network that might be considered less secure than what a customer desires, while still providing the acceleration and off-loading benefits of the edge network. The approach ensures that customer confidential data (whether content, keys, etc.) are not exposed either in transit or at rest. In this approach, only encrypted copies of the customer's content objects are maintained within the portion of the edge network, but without any need to manage the encryption keys. To take full advantage of the secure content caching technique, preferably the encrypted content (or portions thereof) are pre-positioned within the edge network portion to improve performance of secure content delivery from the environment.

Type: Grant

Filed: January 26, 2021

Date of Patent: June 6, 2023

Assignee: Akamai Technologies, Inc.

Inventor: Tong Chen
Caching content securely within an edge environment

Patent number: 11659033

Abstract: A technique to cache content securely within edge network environments, even within portions of that network that might be considered less secure than what a customer desires, while still providing the acceleration and off-loading benefits of the edge network. The approach ensures that customer confidential data (whether content, keys, etc.) are not exposed either in transit or at rest. In this approach, only encrypted copies of the customer's content objects are maintained within the portion of the edge network, but without any need to manage the encryption keys. To take full advantage of the secure content caching technique, preferably the encrypted content (or portions thereof) are pre-positioned within the edge network portion to improve performance of secure content delivery from the environment.

Type: Grant

Filed: January 25, 2021

Date of Patent: May 23, 2023

Assignee: Akamai Technologies, Inc.

Inventor: Tong Chen
Indexing based on feature importance

Patent number: 11593388

Abstract: A method and a computer program product are used for generating an index of a scoring payload dataset. Correlation coefficients for correlations between input data values and output data values of the machine learning model provided by the scoring payload datasets as well as performance data values of the processes provided by process datasets are calculated. Features whose feature values are used as input data values are ranked according to their importance using the correlation coefficients. For the features of a set of highest-ranking features, feature value sets with feature values of the respective features are selected from the scoring payload datasets, and a database index of the selected feature value sets is generated.

Type: Grant

Filed: March 19, 2021

Date of Patent: February 28, 2023

Assignee: International Business Machines Corporation

Inventors: Rafal Bigaj, Lukasz G. Cmielowski, Wojciech Sobala, Maksymilian Erazmus
Feature map sparsification with smoothness regularization

Patent number: 11544569

Abstract: A method includes receiving an image by a deep neural network (DNN) and obtaining a first feature map based on the image while the DNN is in a trained state, wherein the DNN is configured to perform a task based on the image, and is trained with a training image by using a feature sparsification with smoothness regularization process and a back propagation and weight update process that updates the DNN based on an output of the feature sparsification with smoothness regularization process.

Type: Grant

Filed: October 5, 2020

Date of Patent: January 3, 2023

Assignee: TENCENT AMERICA LLC

Inventors: Wei Jiang, Wei Wang, Shan Liu
Defining indexing fields for matching data entities

Patent number: 11397715

Abstract: Indexing and matching records in a data management system by defining entity indexing attributes associated with system records, receiving an incoming data entity, selecting a set of entity candidates according to the entity indexing attributes, matching the incoming entity to an entity candidate, generating an analysis of the entity candidate selection according to entity attribute effectiveness, and revising the entity indexing attributes according to the analysis.

Type: Grant

Filed: July 31, 2019

Date of Patent: July 26, 2022

Assignee: International Business Machines Corporation

Inventors: Shettigar Parkala Srinivas, Soma Shekar Naganna, Neeraj Ramkrishna Singh, Abhishek Seth, Prabhakaran Ramalingam
Method and device for processing untagged data, and storage medium

Patent number: 11334723

Abstract: A method for processing untagged data includes: similarity comparison is performed on a semantic vector of untagged data and a semantic vector of each piece of tagged data to obtain similarities corresponding to respective pieces of tagged data; a preset number of similarities are selected according to a preset selection rule; the untagged data is predicted with a tagging model obtained by training through the tagged data, to obtain a prediction result of the untagged data; and the untagged data is divided into untagged data that can be tagged by a device or untagged data that cannot be tagged by the device according to the preset number of similarities and the prediction result.

Type: Grant

Filed: November 16, 2019

Date of Patent: May 17, 2022

Assignee: Beijing Xiaomi Intelligent Technology Co., Ltd.

Inventors: Xiaotong Pan, Zuopeng Liu
Storing partial tuples from a streaming application in a database system

Patent number: 11204926

Abstract: A tuple manager of a database system processes partial tuples from a streaming application and stores them in a database. The partial tuples may include a large object (LOB) that arrives at the database at a different time than the rest of the corresponding tuple. A tuple manager stores partial tuples and uses a partial tuples index to track the partial tuples and coordinate recombination of corresponding partial tuples. The database allows queries to be run on the partial data before the tuples are reconstructed allowing faster access to potentially important data before the arrival and processing of a partial tuple such as an LOB.

Type: Grant

Filed: October 31, 2018

Date of Patent: December 21, 2021

Assignee: International Business Machines Corporation

Inventors: Rafal P. Konik, Jessica R. Eidem, Jingdong Sun, Roger A. Mittelstadt
Scalable content rendering

Patent number: 11157678

Abstract: The presently disclosed subject matter includes a computer-implemented system and method for receiving content from another computer device and dynamically adapting display of the received content within a container of a formatted document, the container defining a restricted area within the formatted document designated for displaying the content. Sub-elements within at least one content item are identified and tagged; the tagging makes it possible to acquire display parameters of the tagged sub-elements and to calculate from them a required adaptation of the content item so that it can be fitted within the respective container.

Type: Grant

Filed: May 6, 2019

Date of Patent: October 26, 2021

Assignee: TABOOLA.COM LTD

Inventor: Efraim Nadiv
Handling queries in document systems using segment differential based document text-index modelling

Patent number: 11157477

Abstract: A method, computer system, and computer program product for segment differential-based document text-index modeling are provided. The embodiment may include receiving, by a processor, a document with a valid document ID and version ID tuple. The embodiment may also include determining the received document is a new version of a previously stored document and consequently multiplexing versions of the document into a single indexed document. The embodiment may further include segmenting the received document and building a token vector. The embodiment may also include calculating a difference between the received new version of the document and the previously stored document using information obtained from the segmentation. The embodiment may further include in response to the calculated difference being below a pre-configured threshold value, discarding the received new version.

Type: Grant

Filed: November 28, 2018

Date of Patent: October 26, 2021

Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: Roger C. Raphael, Rajesh M. Desai, Fumihiko Terui, Justo L. Perez, Thomas Hampp
Subquery generation based on search configuration data from an external data system

Patent number: 11126632

Abstract: Systems and methods are disclosed for executing a query that includes an indication to process data managed by an external data system. The system identifies the external data system that manages the data to be processed, and obtains search configuration data from the external system. The system uses the search configuration data to generate a subquery for the external data system. The system also generates instructions for one or more worker nodes to receive and process results of the subquery from the external data system.

Type: Grant

Filed: July 31, 2018

Date of Patent: September 21, 2021

Assignee: Splunk Inc.

Inventors: Sourav Pal, Arindam Bhattacharjee
Speeding matching search of hierarchical name structures

Patent number: 11106663

Abstract: A search for a regular expression in a tree hierarchy includes, in part, searching for a match to the regular expression in a first subtree defined by a first node name, recording information about the first subtree if there is no match, determining whether a second subtree defined by a second node name is identical to the first node, and skipping search of the second subtree if the second subtree is determined to be identical and prefix equivalent, with respect to the regular expression, to the first subtree. The second subtree is determined to be prefix equivalent to the first subtree when, for any string s, a first prefix defined by a concatenation of the first node name and the string s results in a match if and only if a second prefix defined by a concatenation of the second node name and the string s results in a match.

Type: Grant

Filed: February 22, 2019

Date of Patent: August 31, 2021

Assignee: Synopsys, Inc.

Inventors: Ilya Kudryavtsev, Daniel Geist, Boris Gommershtadt
Virtual assistant identification of nearby computing devices

Patent number: 11087765

Abstract: In one example, a method includes: receiving audio data generated by a microphone of a current computing device; identifying, based on the audio data, one or more computing devices that each emitted a respective audio signal in response to speech reception being activated at the current computing device; and selecting either the current computing device or a particular computing device from the identified one or more computing devices to satisfy a spoken utterance determined based on the audio data.

Type: Grant

Filed: March 15, 2021

Date of Patent: August 10, 2021

Assignee: GOOGLE LLC

Inventor: Jian Wei Leong
Virtual assistant identification of nearby computing devices

Patent number: 11049505

Abstract: In one example, a method includes: receiving audio data generated by a microphone of a current computing device; identifying, based on the audio data, one or more computing devices that each emitted a respective audio signal in response to speech reception being activated at the current computing device; and selecting either the current computing device or a particular computing device from the identified one or more computing devices to satisfy a spoken utterance determined based on the audio data.

Type: Grant

Filed: March 15, 2021

Date of Patent: June 29, 2021

Assignee: GOOGLE LLC

Inventor: Jian Wei Leong
Indexing and querying semi-structured documents using a key-value store

Patent number: 11030242

Abstract: A search system processes queries for accessing information stored in documents. A document comprises fields. The search system stores a plurality of indexes in a key-value store. Each index comprises key-value pairs. A key of a key-value pair is obtained by combining field data describing a field of a document. The value of each field is stored as an individual key-value in the key-value store. The search system receives a query requesting information stored in documents and specifying search criteria. The search system builds a key-expression based on the search criteria and uses one or more indexes to find key-value pairs matching the key-expression. The search system finds the requested information based on the matching key-value pairs and provides the requested information to the query source.

Type: Grant

Filed: October 15, 2018

Date of Patent: June 8, 2021

Assignee: Rockset, Inc.

Inventors: Dhruba Borthakur, Venkat Venkataramani, Igor Canadi, Tudor Bosman
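One way to picture the key construction and key-expression lookup described in the abstract above is a sorted list of composite keys scanned by prefix. This is only a sketch under assumptions: the key layout (field name, value, document ID joined by a separator), the sorted in-memory list standing in for the key-value store, and the names `index_documents` and `find_doc_ids` are all hypothetical.

```python
import bisect

SEP = "\x00"  # assumed separator that does not occur in field names or values


def index_documents(documents):
    """Build one composite key per (field, value, document) and keep them sorted."""
    keys = []
    for doc_id, doc in documents.items():
        for field, value in doc.items():
            keys.append(SEP.join((field, str(value), doc_id)))
    keys.sort()
    return keys


def find_doc_ids(keys, field, value):
    """Prefix-scan the sorted keys for every document matching field == value."""
    prefix = SEP.join((field, str(value))) + SEP
    start = bisect.bisect_left(keys, prefix)
    matches = []
    for key in keys[start:]:
        if not key.startswith(prefix):
            break  # sorted order: no later key can match the prefix
        matches.append(key[len(prefix):])
    return matches


# Hypothetical semi-structured documents.
documents = {
    "d1": {"city": "Oslo", "lang": "en"},
    "d2": {"city": "Oslo", "lang": "no"},
    "d3": {"city": "Rome", "lang": "en"},
}

keys = index_documents(documents)
oslo_docs = find_doc_ids(keys, "city", "Oslo")
```

Because the keys sort by field and then value, an equality criterion becomes a contiguous range in the key space, which is the property ordered key-value stores make cheap to scan.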
Systems and methods for online clustering of content items

Patent number: 11003692

Abstract: Systems, methods, and non-transitory computer-readable media can obtain a first batch of content items to be clustered. A set of clusters can be generated by clustering respective binary hash codes for each content item in the first batch, wherein content items included in a cluster are visually similar to one another. A next batch of content items to be clustered can be obtained. One or more respective binary hash codes for the content items in the next batch can be assigned to a cluster in the set of clusters.

Type: Grant

Filed: December 28, 2015

Date of Patent: May 11, 2021

Assignee: Facebook, Inc.

Inventors: Yunchao Gong, Marcin Pawlowski, Fei Yang, Lubomir Bourdev, Louis Dominic Brandy, Robert D. Fergus
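The batch-wise assignment described in the abstract above can be sketched with Hamming distance over integer hash codes. The abstract does not specify the distance measure or the clustering rule, so the greedy nearest-cluster rule, the `max_distance` threshold, and the names `hamming` and `assign_batch` are assumptions for illustration.

```python
def hamming(a: int, b: int) -> int:
    """Number of bit positions in which two hash codes differ."""
    return bin(a ^ b).count("1")


def assign_batch(clusters, codes, max_distance=2):
    """Add each code to the nearest cluster, or start a new cluster for it.

    `clusters` maps a representative hash code to the list of member codes.
    """
    for code in codes:
        best = None  # (representative, distance) of the closest cluster so far
        for representative in clusters:
            distance = hamming(representative, code)
            if distance <= max_distance and (best is None or distance < best[1]):
                best = (representative, distance)
        if best is None:
            clusters[code] = [code]  # nothing close enough: open a new cluster
        else:
            clusters[best[0]].append(code)
    return clusters


# First batch creates the initial clusters.
clusters = assign_batch({}, [0b11110000, 0b11110001, 0b00001111])
# The next batch is folded into the existing clusters.
clusters = assign_batch(clusters, [0b00001110, 0b11110011])
```

After both batches the sketch holds two clusters, one around `0b11110000` and one around `0b00001111`, with the later codes attached to whichever representative they differ from by the fewest bits.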
Query handling for field searchable raw machine data using a field searchable datastore and an inverted index

Patent number: 10997138

Abstract: Embodiments are directed towards a method for searching data. The method comprises providing an inverted index that comprises at least one record, wherein the at least one record comprises at least one field name and a corresponding at least one field value. The at least one field name and corresponding value are extracted from time-stamped searchable events that are stored in a field searchable datastore and comprise portions of raw data. The at least one record further comprises a posting value that identifies a location in the field searchable datastore where an event associated with the at least one record is stored. The method further comprises receiving an incoming search query that references a field name and evaluating the incoming search query. Furthermore, responsive to the evaluating, the method comprises determining results for the incoming search query using both of the field searchable datastore and the inverted index.

Type: Grant

Filed: May 28, 2019

Date of Patent: May 4, 2021

Assignee: Splunk, Inc.

Inventors: David Ryan Marquardt, Mitchell Neuman Blank, Jr., Stephen Phillip Sorkin
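The record structure described in the abstract above (field name, field value, and a posting value locating the event) can be sketched as a small in-memory index. The event layout, the use of list positions as posting values, and the names `build_inverted_index` and `search` are assumptions made for this illustration only.

```python
from collections import defaultdict


def build_inverted_index(events):
    """Map each (field name, field value) pair to the positions of matching events."""
    index = defaultdict(list)
    for position, event in enumerate(events):
        for field_name, field_value in event["fields"].items():
            # The posting value here is simply the event's position in the datastore.
            index[(field_name, field_value)].append(position)
    return index


def search(index, events, field_name, field_value):
    """Use the index to locate events, then read their raw data from the datastore."""
    postings = index.get((field_name, field_value), [])
    return [events[position]["raw"] for position in postings]


# Hypothetical time-stamped events with fields already extracted from the raw data.
events = [
    {"timestamp": 1, "raw": "status=200 host=web1", "fields": {"status": "200", "host": "web1"}},
    {"timestamp": 2, "raw": "status=500 host=web2", "fields": {"status": "500", "host": "web2"}},
    {"timestamp": 3, "raw": "status=200 host=web2", "fields": {"status": "200", "host": "web2"}},
]

index = build_inverted_index(events)
ok_events = search(index, events, "status", "200")
```

The query touches both structures: the inverted index narrows the search to candidate positions, and the event list (standing in for the field searchable datastore) supplies the raw data for the result.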
Method and device for outputting result of operations using data sources of different types

Patent number: 10970337

Abstract: A method for outputting a result of one or more operations using data sources of different types is provided. The method includes steps of: (a) when a user query is acquired, a device (i) acquiring data elements respectively from the data sources of different types by referring to the user query, and (ii) performing main joint operations on the data elements, to thereby generate a data set; and (b) the device performing data processing operations and output operations on the data set, to thereby generate an answer for the user query. It has an effect of providing the method for outputting the result of the operations using the data sources of the different types by referring to each of languages corresponding to each of the data sources.

Type: Grant

Filed: September 11, 2020

Date of Patent: April 6, 2021

Assignee: Seculayer Co., LTD.

Inventor: Jin Sang You
Generating a summary based on readability

Patent number: 10922346

Abstract: In some examples, a set of sentences is extracted from a digital document, and each sentence is scored using a respective informativeness measure and readability measure. Sentences in the set of sentences are selected based on the readability measures and informativeness measures. A low readability, high informativeness sentence is identified from the set of sentences. A concatenated sentence is generated by concatenating at least one contextual sentence with the low readability, high informativeness sentence, where the concatenated sentence has a higher readability than the low readability, high informativeness sentence.

Type: Grant

Filed: June 13, 2017

Date of Patent: February 16, 2021

Assignee: Micro Focus LLC

Inventor: Vinay Deolalikar
Document ranking by progressively increasing faceted query

Patent number: 10838994

Abstract: Natural Language Processing (NLP) is performed on a corpus using a processor and a memory to extract a set of facets corresponding to a dimension in a set of dimensions. Using a score threshold, a subset of the set of facets is selected where each facet in the set of facets has a corresponding score relative to the corpus. A subsequent query is formed by increasing a complexity of a previous query using a facet in the subset of facets. The subsequent query is executed on at least a portion of the corpus. The documents in a new result set are ranked, the new result set being in response to executing the subsequent query. An output is produced from the new result set, which includes a ranking of that subset of documents whose ranks have changed by more than a threshold rank distance from the corresponding ranks in the corpus.

Type: Grant

Filed: August 31, 2017

Date of Patent: November 17, 2020

Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: Takashi Fukuda, Hiroaki Kikuchi
Optimizing faceted classification through facet range identification

Patent number: 10733211

Abstract: In an approach to faceted classification, a computer receives a search query. The computer creates a first table of facet value ranges, based on the search query. The computer fetches a first search result corresponding to the search query. The computer retrieves a first facet value associated with the first search result. The computer maps the first facet value to a first facet value range. The computer determines whether the first facet value range is in the first table of facet value ranges. The computer inserts the first facet value range into the first table of facet value ranges. The computer determines whether a number of facet value ranges in the first table of facet value ranges is below a pre-defined threshold. The computer creates a second table of facet value ranges. The computer identifies a second facet value range that includes the first facet value range.
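
The range-table steps in this abstract can be sketched roughly as follows. This is an illustrative Python sketch under stated assumptions, not the claimed method: numeric facet values, fixed-width ranges, and doubling the width to build the second table are all choices made here, and `to_range` and `build_range_table` are hypothetical names.

```python
def to_range(value, width):
    """Map a numeric facet value to the half-open range [low, low + width)."""
    low = (value // width) * width
    return (low, low + width)

def build_range_table(facet_values, width, max_ranges):
    """Collect distinct ranges; coarsen the table when it grows past max_ranges."""
    table = set()
    for value in facet_values:
        table.add(to_range(value, width))  # no-op if the range is already present
        if len(table) > max_ranges:
            # Second table: each wider range includes the ranges it replaces,
            # because every old lower bound is a multiple of the old width.
            width *= 2
            table = {to_range(low, width) for low, _ in table}
    return sorted(table), width

prices = [3, 12, 18, 27, 41, 44, 95]
ranges, final_width = build_range_table(prices, width=10, max_ranges=4)
# ranges == [(0, 20), (20, 40), (40, 60), (80, 100)], final_width == 20
```

The table is coarsened at most once per inserted value, so it may briefly exceed the threshold if a single doubling is not enough.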

Type: Grant

Filed: December 19, 2017

Date of Patent: August 4, 2020

Assignee: International Business Machines Corporation

Inventors: Marta Breno, Roberto Ragusa
Static data caching for queries with a clause that requires multiple iterations to execute

Patent number: 10642831

Abstract: Techniques are described herein to generate and to execute a query execution plan using static data buffering. After receiving a query with a clause that requires multiple iterations to execute, a database management system (DBMS) generates a plurality of plans that vary the order in which the database operations are executed. Within each plan, the DBMS identifies sets of rows within that plan that contain static data during execution of the query. Then, an additional step is added to each plan that includes loading the static set of rows in a database buffer cache. One or more database operations, from an iteration other than the first iteration, may be performed against the cached static set of rows. For each plan generated in this manner, a cost analysis model is applied, and the plan with the lowest estimated computational cost is selected for use as the query execution plan.

Type: Grant

Filed: September 16, 2016

Date of Patent: May 5, 2020

Assignee: ORACLE INTERNATIONAL CORPORATION

Inventors: Mohamed Ziauddin, Yali Zhu
Systems and methods for building an on-device temporal web index for user curated/preferred web content

Patent number: 10621246

Abstract: A method and apparatus of a device that indexes donatable content from a network site is described. In an exemplary embodiment, the device receives a requested document, where the requested document includes a plurality of tags. In addition, the device detects a donatable tag in the plurality of tags that indicates the network site includes donatable content. In response to the detecting, the device sends a request for the donatable content to the network site. Furthermore, the device receives the donatable content from the network site. The device additionally indexes the donatable content into an on-device search index, where at least some of the indexed donatable content is further returned as a search result for an on-device search.

Type: Grant

Filed: September 29, 2017

Date of Patent: April 14, 2020

Assignee: Apple Inc.

Inventors: Anubhav Malhotra, John M. Hörnkvist
Systems and methods for indexing and mapping data sets using feature matrices

Patent number: 10614031

Abstract: The present disclosure relates to systems and methods for indexing and mapping data sets by feature matrices, comprising at least a processor and a non-transitory memory storing instructions that cause the processor to perform operations including receiving data sets of the same type, applying autoencoders to generate feature matrices, and generating a neural network model trained to generate synthetic data corresponding to the type of data files. Further, the processor performs operations to apply more autoencoders to part of the hidden layer of the neural network model to generate more corresponding feature matrices and to index the data sets using the feature matrices such that the data sets are searchable using an index. When a search query is received, a third feature matrix is generated so that a data set can be retrieved and compared to the feature matrices using the index.

Type: Grant

Filed: July 8, 2019

Date of Patent: April 7, 2020

Assignee: Capital One Services, LLC

Inventors: Austin Walters, Jeremy Goodsitt, Vincent Pham, Galen Rafferty, Anh Truong, Reza Farivar
Linear run length encoding: compressing the index vector

Patent number: 10545936

Abstract: Linear run length encoding is described. A system and method include storing a table of time series data in a database of a data platform, the table of time series data representing a set of time series blocks. Each time series block of the set of time series blocks has a time series of equally-incremented time intervals and a run length. Each time interval of the time series is associated with one or more values. The run length has a starting position with at least one starting value and an ending position with at least one ending value. The starting position and the at least one starting value are stored for each time series block in a column store of the database. Then, a compressed index is generated in the column store of the database for each time series block, the compressed index comprising the starting position and the at least one starting value.
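
One way to picture an index that keeps only a starting position and starting value per block is the Python sketch below. It rests on an assumption that goes beyond the abstract: each run is taken to be a stretch of values that changes by a constant step, and that step is stored alongside the start. The names `linear_rle_encode` and `linear_rle_decode` are hypothetical.

```python
def linear_rle_encode(values):
    """Encode values as runs of (start position, start value, constant step)."""
    runs = []
    i, n = 0, len(values)
    while i < n:
        if i + 1 == n:
            runs.append((i, values[i], 0))  # trailing single value
            break
        step = values[i + 1] - values[i]
        j = i + 1
        # Extend the run while the difference between neighbours stays the same.
        while j + 1 < n and values[j + 1] - values[j] == step:
            j += 1
        runs.append((i, values[i], step))
        i = j + 1
    return runs

def linear_rle_decode(runs, total_length):
    """Rebuild the series; a run ends where the next one starts."""
    values = []
    for k, (start, first, step) in enumerate(runs):
        end = runs[k + 1][0] if k + 1 < len(runs) else total_length
        values.extend(first + step * offset for offset in range(end - start))
    return values

series = [10, 12, 14, 16, 5, 5, 5, 7]
runs = linear_rle_encode(series)
# runs == [(0, 10, 2), (4, 5, 0), (7, 7, 0)]
restored = linear_rle_decode(runs, len(series))
```

The run length is not stored explicitly in this sketch; it follows from the distance between consecutive starting positions.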

Type: Grant

Filed: July 8, 2014

Date of Patent: January 28, 2020

Assignee: SAP SE

Inventors: Gordon Gaumnitz, Robert Schulze, Lars Dannecker, Ivan Bowman, Dan Farrar
Efficient denormalization of data instances

Patent number: 10540332

Abstract: Technologies are described herein for denormalizing data instances. Schemas for data instances are embedded with annotations indicating how the denormalization is to be performed. Based on the annotations, one or more sub per object indexes (“sub POIs”) can be generated for each data instance and stored. The sub POIs can include a target sub POI containing data from the data instance, and at least one source sub POI containing data from another data instance, if the data instance depends on the other data instance. Data instance updates can be performed by identifying sub POIs that are related to the updated data instance in storage, and updating the related sub POIs according to the update to the data instance. The sub POIs can be sent to an indexing engine to generate an index for a search engine to facilitate searches on the data instances.

Type: Grant

Filed: August 3, 2016

Date of Patent: January 21, 2020

Assignee: MICROSOFT TECHNOLOGY LICENSING, LLC

Inventors: Christopher Clayton McConnell, Weipeng Liu, Shahin Shayandeh, Robert Lovejoy Goodwin
Method and system for optimization of faceted search

Patent number: 10521408

Abstract: In general, embodiments of the technology relate to a method for servicing requests. The method includes receiving a search request from a client, determining a main path and a conditional subpath associated with the search request, determining a subpath index associated with the main path and the conditional subpath, obtaining, using at least a portion of the search request, a set of subpath index entries from the subpath index, wherein each of the subpath index entries specifies a facet subpath and content associated with the facet subpath, generating a final result using at least a portion of the contents in the set of subpath index entries, and providing the final result to the client.

Type: Grant

Filed: September 30, 2016

Date of Patent: December 31, 2019

Assignee: OPEN TEXT CORPORATION

Inventors: Caroline Spruit, Petr Olegovich Pleshachkov
Training and operating multi-layer computational models

Patent number: 10445650

Abstract: A processing unit can successively operate layers of a multilayer computational graph (MCG) according to a forward computational order to determine a topic value associated with a document based at least in part on content values associated with the document. The processing unit can successively determine, according to a reverse computational order, layer-specific deviation values associated with the layers based at least in part on the topic value, the content values, and a characteristic value associated with the document. The processing unit can determine a model adjustment value based at least in part on the layer-specific deviation values. The processing unit can modify at least one parameter associated with the MCG based at least in part on the model adjustment value. The MCG can be operated to provide a result characteristic value associated with test content values of a test document.

Type: Grant

Filed: November 23, 2015

Date of Patent: October 15, 2019

Assignee: Microsoft Technology Licensing, LLC

Inventors: Jianfeng Gao, Li Deng, Xiaodong He, Lin Xiao, Xinying Song, Yelong Shen, Ji He, Jianshu Chen
Scalable content rendering

Patent number: 10346511

Abstract: The presently disclosed subject matter includes a computer-implemented system and method for receiving content from another computer device and dynamically adapting display of the received content within a container of a formatted document, the container defining a restricted area within the formatted document designated for displaying the content. Sub-elements within at least one content item are identified and tagged. The tagging makes it possible to acquire display parameters of the tagged sub-elements and to calculate from them the adaptation required for the content item to fit within the respective container.

Type: Grant

Filed: June 1, 2017

Date of Patent: July 9, 2019

Assignee: TABOOLA.COM LTD.

Inventor: Efraim Nadiv
Faster access for compressed time series data: the block index

Patent number: 10248681

Abstract: A system and method for faster access to compressed time series data are described. A set of blocks is generated based on a table stored in a database of the data platform. The table stores data associated with multiple sources of data provided as consecutive values, each block containing index vectors having a range of the consecutive values. A block index is generated for each block having a field start vector representing a starting position of the block relative to the range of consecutive values, and a starting value vector representing a value of the block at the starting position. The field start vector of the block index is accessed to obtain the starting position of a field corresponding to a first block and to the range of the consecutive values of the first block. The starting value vector is then determined from the block index to determine an end and a length of the field of the first block.

Type: Grant

Filed: July 8, 2014

Date of Patent: April 2, 2019

Assignee: SAP SE

Inventors: Gordon Gaumnitz, Robert Schulze, Lars Dannecker, Ivan Bowman, Dan Farrar
Adaptive dictionary compression/decompression for column-store databases

Patent number: 10235377

Abstract: Innovations for adaptive compression and decompression for dictionaries of a column-store database can reduce the amount of memory used for columns of the database, allowing a system to keep column data in memory for more columns, while delays for access operations remain acceptable. For example, dictionary compression variants use different compression techniques and implementation options. Some dictionary compression variants provide more aggressive compression (reduced memory consumption) but result in slower run-time performance. Other dictionary compression variants provide less aggressive compression (higher memory consumption) but support faster run-time performance. As another example, a compression manager can automatically select a dictionary compression variant for a given column in a column-store database.
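
The trade-off between smaller and faster dictionary variants can be illustrated with a short Python sketch. Front coding is used here as an example of a more aggressive variant; that choice, the selection policy in `choose_variant`, its `hot_threshold` parameter, and all function names are assumptions for illustration and are not taken from the abstract.

```python
import os  # os.path.commonprefix works on any list of strings

def front_code(sorted_values):
    """Store each entry as (length of prefix shared with previous entry, suffix)."""
    coded, previous = [], ""
    for value in sorted_values:
        shared = len(os.path.commonprefix([previous, value]))
        coded.append((shared, value[shared:]))
        previous = value
    return coded

def front_decode(coded, position):
    """Rebuild one entry by replaying entries from the start (slower lookup)."""
    value = ""
    for shared, suffix in coded[: position + 1]:
        value = value[:shared] + suffix
    return value

def choose_variant(access_count, hot_threshold=1000):
    """Hypothetical policy: hot columns get the fast variant, cold ones the small one."""
    return "plain" if access_count >= hot_threshold else "front_coded"

column = ["berlin", "bern", "berlin", "bergen", "bern"]
dictionary = sorted(set(column))                    # ['bergen', 'berlin', 'bern']
value_ids = [dictionary.index(v) for v in column]   # [1, 2, 1, 0, 2]
coded = front_code(dictionary)                      # [(0, 'bergen'), (3, 'lin'), (3, 'n')]
```

The plain dictionary answers a lookup with one list access, while the front-coded variant stores fewer characters but must replay earlier entries to decode one value.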

Type: Grant

Filed: December 23, 2013

Date of Patent: March 19, 2019

Assignee: SAP SE

Inventors: Ingo Mueller, Cornelius Ratsch, Peter Sanders, Franz Faerber
Efficient index recovery in log-structured object stores

Patent number: 10083089

Abstract: A method to efficiently checkpoint and reconstruct an in-memory index associated with a log-structured object store includes enabling asynchronous write operations to occur to a log-structured object store. The log-structured object store utilizes an in-memory index to access objects therein. The method further enables checkpoint operations to occur to the log-structured object store without pausing the asynchronous write operations. When initiating checkpoint operations, the method establishes a “begin checkpoint” marker on the log-structured object store. This “begin checkpoint” marker is configured to point to an earliest address in the log-structured object store that is uncommitted to the in-memory index. In the event the in-memory index is lost, the method reconstructs the in-memory index by analyzing the log-structured object store starting from the earliest address uncommitted to the in-memory index. A corresponding system and computer program product are also disclosed.
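
The checkpoint marker and index reconstruction described here can be sketched in Python as follows. This is a simplified, single-threaded illustration: asynchronous writes are not modeled, the log is an in-memory list, and the class `LogStore` with its method names is hypothetical.

```python
class LogStore:
    """Toy log-structured store with an in-memory index and checkpoints."""

    def __init__(self):
        self.log = []              # append-only list of (key, value) records
        self.index = {}            # in-memory index: key -> address in the log
        self.committed = 0         # earliest log address not yet in the index
        self.checkpoint = ({}, 0)  # (index snapshot, "begin checkpoint" marker)

    def append(self, key, value):
        """Write a record to the log without touching the index yet."""
        self.log.append((key, value))
        return len(self.log) - 1

    def commit_pending(self):
        """Apply all log records that the index has not seen yet."""
        for address in range(self.committed, len(self.log)):
            key, _ = self.log[address]
            self.index[key] = address
        self.committed = len(self.log)

    def take_checkpoint(self):
        """Save the index together with the earliest uncommitted address."""
        self.checkpoint = (dict(self.index), self.committed)

    def recover(self):
        """Restore the snapshot, then replay the log from the marker onward."""
        snapshot, marker = self.checkpoint
        self.index = dict(snapshot)
        self.committed = marker
        self.commit_pending()

    def get(self, key):
        return self.log[self.index[key]][1]

store = LogStore()
store.append("a", 1)
store.append("b", 2)
store.commit_pending()
store.append("a", 3)       # written to the log, not yet in the index
store.take_checkpoint()    # marker points at address 2
store.append("c", 4)
store.commit_pending()
store.index = {}           # simulate losing the in-memory index
store.recover()
```

Recovery only scans the log from address 2 onward, since everything before the marker is already covered by the saved snapshot.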

Type: Grant

Filed: September 7, 2015

Date of Patent: September 25, 2018

Assignee: International Business Machines Corporation

Inventors: Lawrence Y. Chiu, Paul H. Muench, Sangeetha Seshadri
Efficient index checkpointing in log-structured object stores

Patent number: 10083082

Abstract: A method to efficiently checkpoint and reconstruct an in-memory index associated with a log-structured object store includes enabling asynchronous write operations to occur to a log-structured object store. The log-structured object store utilizes an in-memory index to access objects therein. The method further enables checkpoint operations to occur to the log-structured object store without pausing the asynchronous write operations. When initiating checkpoint operations, the method establishes a “begin checkpoint” marker on the log-structured object store. This “begin checkpoint” marker is configured to point to an oldest known log location recorded in the in-memory index. In the event the in-memory index is lost, the method reconstructs the in-memory index by analyzing the log-structured object store starting from the oldest known log location. A corresponding system and computer program product are also disclosed and claimed herein.

Type: Grant

Filed: September 7, 2015

Date of Patent: September 25, 2018

Assignee: International Business Machines Corporation

Inventors: Lawrence Y. Chiu, Paul H. Muench, Sangeetha Seshadri
Cached views

Patent number: 10061808

Abstract: Embodiments relate to view caching techniques that cache, for a limited time, some of the (intermediate) results of a previous query execution in order to avoid expensive re-computation of query results. Particular embodiments may utilize a cache manager to determine whether information relevant to a subsequent user request can be satisfied by an existing cache instance or view, or whether creation of an additional cache instance is appropriate. At design time, cache defining columns of a view are defined, with user input parameters automatically being cache defining. Cache instances are created for each tuple of literals for the cache defining columns, and for each explicit or implicit group by clause. Certain embodiments may feature enhanced reuse between cache instances, in order to limit memory footprint. Over time, cache instances may be evicted from memory based upon implementation of a policy such as a Least Recently Used (LRU) strategy.
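
A cache keyed by the tuple of literals and the group-by columns, with a time limit and LRU eviction, might look like the Python sketch below. The class `ViewCache`, its parameters, and the injected `clock` callable are assumptions made for illustration; the abstract does not specify an interface.

```python
from collections import OrderedDict

class ViewCache:
    """Toy view cache with a time limit per entry and LRU eviction."""

    def __init__(self, capacity, ttl_seconds, clock):
        self.capacity = capacity
        self.ttl = ttl_seconds
        self.clock = clock            # callable returning the current time in seconds
        self.entries = OrderedDict()  # key -> (created_at, result)

    def get_or_compute(self, view, literals, group_by, compute):
        # One cache instance per view, tuple of literals, and group-by clause.
        key = (view, tuple(literals), tuple(group_by))
        entry = self.entries.get(key)
        if entry is not None and self.clock() - entry[0] < self.ttl:
            self.entries.move_to_end(key)     # mark as most recently used
            return entry[1]
        result = compute()                    # missing or expired: recompute
        self.entries[key] = (self.clock(), result)
        self.entries.move_to_end(key)
        if len(self.entries) > self.capacity:
            self.entries.popitem(last=False)  # evict the least recently used
        return result

now = [0.0]   # fake clock so the example is deterministic
calls = []

def expensive_query():
    calls.append(1)
    return len(calls)

cache = ViewCache(capacity=2, ttl_seconds=60, clock=lambda: now[0])
first = cache.get_or_compute("sales", ["DE"], ["year"], expensive_query)
second = cache.get_or_compute("sales", ["DE"], ["year"], expensive_query)
now[0] = 120.0   # the cached entry is now older than the time limit
third = cache.get_or_compute("sales", ["DE"], ["year"], expensive_query)
```

The second request is served from the cache, while the third recomputes because the entry has outlived its time limit.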

Type: Grant

Filed: June 3, 2014

Date of Patent: August 28, 2018

Assignee: SAP SE

Inventors: Ki Hong Kim, Norman May, Alexander Boehm, Sung Heun Wi, Jeong Ae Han, Sang Il Song, Yongsik Yoon
Methods and systems relating to ranking functions for multiple domains

Patent number: 10019518

Abstract: Methods and systems are disclosed that relate to ranking functions for multiple different domains. By way of example but not limitation, ranking functions for multiple different domains may be trained based on inter-domain loss, and such ranking functions may be used to rank search results from multiple different domains so that they may be blended without normalizing relevancy scores.

Type: Grant

Filed: October 9, 2009

Date of Patent: July 10, 2018

Assignee: Excalibur IP, LLC

Inventors: Jiang Chen, Wei Chu, Zhenzhen Kou, Zhaohui Zheng
Profiling data with location information

Patent number: 9990362

Abstract: Profiling data includes processing an accessed collection of records, including: generating, for a first set of distinct values appearing in a first set of one or more fields, corresponding location information; generating, for the first set of fields, a corresponding list of entries identifying a distinct value from the first set of distinct values and the location information for the distinct value; generating, for a second set of one or more fields, a corresponding list of entries, with each entry identifying a distinct value from a second set of distinct values appearing in the second set of fields; and generating result information, based at least in part on: locating at least one record of the collection using the location information for at least one value appearing in the first set of fields, and determining at least one value appearing in the second set of fields of the located record.

Type: Grant

Filed: September 21, 2015

Date of Patent: June 5, 2018

Assignee: Ab Initio Technology LLC

Inventor: Arlen Anderson
Data processing method and apparatus

Patent number: 9952778

Abstract: A data processing technology is provided, and is applied to a partition management device. The partition management device stores a partition view, the partition view records a correspondence between an ID of a current partition and an address of a storage disk, and a total quantity of current partitions may be less than a total quantity of final partitions. By using the technology, data forwarding may be performed on key-value data by using a current partition, thereby reducing complexity of a partition view.

Type: Grant

Filed: May 4, 2017

Date of Patent: April 24, 2018

Assignee: HUAWEI TECHNOLOGIES CO., LTD.

Inventor: Xiong Luo
