Sorting Indices Patents (Class 707/753)
  • Patent number: 11455391
    Abstract: A computer-implemented system and method for a data leakage and misuse detection system comprises receiving an evaluation dataset A, and building a signature of the evaluation dataset A (sig(A)), where a signature of a dataset is a multi-level evaluation data abstraction representation of the dataset. The method further comprises building a signature for each of existing datasets B (B1, B2, . . . , Bn) (sig(Bx)) that are stored in a memory. The method then compares the sig(A) with each of the sig(Bx)s. A similarity score is derived based on the comparing, and responsive to determining the similarity score exceeds a predefined threshold, the method comprises generating an activity related to the determination.
    Type: Grant
    Filed: October 28, 2020
    Date of Patent: September 27, 2022
    Assignee: International Business Machines Corporation
    Inventors: Aris Gkoulalas-Divanis, Paul R. Bastide, Rohit Ranchal
  • Patent number: 11410167
    Abstract: A system includes a first module that asynchronously communicates with a second module. The first module processes a set of digital transactions and transmits instructions for the second module to process the same set of digital transactions. The first module maintains a first aggregated hash value corresponding to the set of digital transactions that have been processed. The first aggregated hash value is calculated using a commutative and associative hash function. The second module maintains a second aggregated hash value corresponding to a second set of digital transactions processed by the second module. The first and second aggregated hash values are compared to determine the second module processed the same digital transactions as the first module.
    Type: Grant
    Filed: December 30, 2019
    Date of Patent: August 9, 2022
    Assignee: PayPal, Inc.
    Inventor: Niladri Chatterjee
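
    Illustrative sketch (not taken from the patent): one way to obtain a commutative and associative aggregated hash is to add per-transaction digests modulo a fixed power of two, so that two modules processing the same transactions in different orders arrive at the same value. The choice of SHA-256, the modulus, and the names transaction_hash and aggregate are assumptions made for this example only.

```python
import hashlib

# Assumed modulus: digests are added modulo 2**256 so the running value stays bounded.
MODULUS = 2 ** 256


def transaction_hash(transaction_id: str) -> int:
    """Map one transaction identifier to an integer via SHA-256 (an assumed choice)."""
    digest = hashlib.sha256(transaction_id.encode("utf-8")).digest()
    return int.from_bytes(digest, "big")


def aggregate(transaction_ids) -> int:
    """Combine per-transaction hashes with modular addition.

    Modular addition is commutative and associative, so the result does not
    depend on the order in which transactions are processed.
    """
    total = 0
    for transaction_id in transaction_ids:
        total = (total + transaction_hash(transaction_id)) % MODULUS
    return total


first_module = aggregate(["t1", "t2", "t3"])
second_module = aggregate(["t3", "t1", "t2"])  # same transactions, different order
same_transactions = first_module == second_module
```

    Comparing the two aggregated values stands in for comparing the full transaction sets; a module that skipped a transaction would almost certainly produce a different value.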
  • Patent number: 11397733
    Abstract: Some embodiments provide a non-transitory machine-readable medium that stores a program. The program receives a query for data that includes a join operation. The program further generates a plurality of candidate query execution plans based on the query, each candidate query execution plan comprising a set of reduction operations. The program also determines a plurality of execution costs associated with the plurality of sets of reduction operations in the plurality of candidate query execution plans. The program further selects a query execution plan from the plurality of candidate query execution plans based on the plurality of execution costs. The program also executes the query execution plan to generate a set of query results for the query.
    Type: Grant
    Filed: October 24, 2019
    Date of Patent: July 26, 2022
    Assignee: SAP SE
    Inventor: Gerhard Hill
  • Patent number: 11329665
    Abstract: Disclosed approaches for performing a Burrows-Wheeler transform (BWT) of a sequence of data elements, S, include determining sets of less-than values and sets of equal-to values for the data elements. Index values are determined for the data elements based on the sets of less-than values. Each index value indicates a count of data elements of S that a data element is lexicographically greater than. Rank values are determined for the data elements of S based on the sets of less-than values and the sets of equal-to values. Each rank value indicates for the data element an order of the data element in the BWT relative to other ones of the data elements of equal value. Positions in the BWT of S for the data elements are selected based on the index values and rank values, and the data elements are output in the order indicated by the respective positions in the BWT.
    Type: Grant
    Filed: December 11, 2019
    Date of Patent: May 10, 2022
    Assignee: XILINX, INC.
    Inventors: Mohammad Saifee Dohadwala, Raghukul B. Dikshit
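
    Illustrative sketch (not the patented hardware design): the idea of placing each element by a less-than count plus a rank among equals can be shown in software by applying it to whole rotations of the input. The function name bwt_by_counting is hypothetical, and the quadratic number of comparisons is an artifact of this simplified version.

```python
def bwt_by_counting(s: str) -> str:
    """Burrows-Wheeler transform computed by counting instead of sorting.

    For each rotation, the output position is:
      (number of rotations lexicographically smaller than it)
      + (number of equal rotations that appear earlier in the input).
    """
    n = len(s)
    rotations = [s[i:] + s[:i] for i in range(n)]
    out = [""] * n
    for i, rotation in enumerate(rotations):
        # Index value: how many rotations this one is greater than.
        less_than = sum(1 for other in rotations if other < rotation)
        # Rank value: order among rotations of equal value.
        equal_before = sum(1 for j in range(i) if rotations[j] == rotation)
        # The BWT emits the last character of each rotation in sorted order.
        out[less_than + equal_before] = rotation[-1]
    return "".join(out)


transformed = bwt_by_counting("banana$")
```

    Because every position is computed independently from counts, the per-element work has no dependency on the other elements' final positions, which is the property that makes a counting formulation attractive for parallel hardware.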
  • Patent number: 11301210
    Abstract: A technique is described for merging multiple lists of ordinal elements such as keys into a sorted output. In an example embodiment, a merge window is defined, based on the bounds of the multiple lists of ordinal elements, that is representative of a portion of an overall element space associated with the multiple lists. Lists of elements to be sorted can be placed into one of at least two different heaps based on whether they overlap the merge window. For example, lists that overlap the merge window may be placed into an active or “hot” heap, while lists that do not overlap the merge window may be placed into a separate inactive or “cold” heap. A sorted output can then be generated by iteratively processing the active heap. As the processing of the active heap progresses, the merge window advances, and lists may move between the active and inactive heaps.
    Type: Grant
    Filed: January 28, 2020
    Date of Patent: April 12, 2022
    Assignee: Cloudera, Inc.
    Inventors: Adar Lieber-Dembo, Todd Lipcon
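
    Illustrative sketch (a simplification, not the patented technique): the two-heap idea can be approximated by keeping untouched lists in a "cold" heap keyed by their first element and moving a list into the "hot" heap only once the merge has advanced far enough to reach that element. In this version lists only move from cold to hot, whereas the abstract describes movement in both directions. The name merge_sorted_lists is hypothetical.

```python
import heapq


def merge_sorted_lists(lists):
    """Merge already-sorted lists using an active (hot) and inactive (cold) heap."""
    # Cold heap: (lower bound of the list, list index) for lists not yet started.
    cold = [(lst[0], i) for i, lst in enumerate(lists) if lst]
    heapq.heapify(cold)
    # Hot heap: (current element, list index, position in that list).
    hot = []
    out = []
    while hot or cold:
        # Advance the window: activate any cold list whose first element
        # could be the next output (or any list if nothing is active).
        while cold and (not hot or cold[0][0] <= hot[0][0]):
            first, i = heapq.heappop(cold)
            heapq.heappush(hot, (first, i, 0))
        value, i, pos = heapq.heappop(hot)
        out.append(value)
        if pos + 1 < len(lists[i]):
            heapq.heappush(hot, (lists[i][pos + 1], i, pos + 1))
    return out


merged = merge_sorted_lists([[1, 4, 7], [2, 3], [10, 11], [5, 6, 12]])
```

    The benefit in this sketch is that lists whose ranges do not yet overlap the current merge position stay out of the hot heap, so each pop and push works on a smaller heap.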
  • Patent number: 10846309
    Abstract: Provided are a data indexing method, a data querying method and an electronic device. The data indexing method includes: creating a clustered index for a plurality of data records according to values of preset fields within the plurality of data records, wherein the plurality of data records are configured to store data files, and the values of the preset fields are field values of the clustered index; plotting, for each of the data records, a data distribution diagram of offsets versus the data records in the data file; and performing curve-fitting on the data distribution diagram to obtain an index relation containing correspondences between the field values and the offsets, so that the offset is calculated according to the field values of the data record to be queried, and thereby the data record is queried.
    Type: Grant
    Filed: November 28, 2017
    Date of Patent: November 24, 2020
    Assignee: MICROFUN Inc.
    Inventor: Chi Gao
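
    Illustrative sketch (assumptions throughout): the curve-fitting idea can be shown with a straight-line fit from sorted key values to record positions, followed by a short bounded search to correct the prediction error. The linear model, the use of list positions in place of file offsets, and the names fit_line, build_index, and lookup are all choices made for this example; the abstract does not specify them.

```python
import bisect


def fit_line(keys, positions):
    """Ordinary least-squares fit: position is approximated by slope * key + intercept."""
    n = len(keys)
    mean_k = sum(keys) / n
    mean_p = sum(positions) / n
    var = sum((k - mean_k) ** 2 for k in keys)
    cov = sum((k - mean_k) * (p - mean_p) for k, p in zip(keys, positions))
    slope = cov / var if var else 0.0
    return slope, mean_p - slope * mean_k


def build_index(keys):
    """Fit the model over sorted, unique keys and record its worst-case error."""
    positions = list(range(len(keys)))
    slope, intercept = fit_line(keys, positions)
    max_err = max(abs(round(slope * k + intercept) - p)
                  for k, p in zip(keys, positions))
    return slope, intercept, max_err


def lookup(keys, index, key):
    """Predict the position, then binary-search only within the error bound."""
    slope, intercept, max_err = index
    guess = round(slope * key + intercept)
    lo = max(0, guess - max_err)
    hi = max(lo, min(len(keys), guess + max_err + 1))
    i = bisect.bisect_left(keys, key, lo, hi)
    return i if i < len(keys) and keys[i] == key else None


sorted_keys = [3, 8, 15, 16, 23, 42, 57, 91]
index = build_index(sorted_keys)
position_of_23 = lookup(sorted_keys, index, 23)
```

    The index itself is just three numbers, so its size does not grow with the number of records; the cost is that lookup accuracy depends on how well the fitted curve matches the key distribution.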
  • Patent number: 10691412
    Abstract: A computer processor includes a memory unit, a processor cache and a hardware merge sort accelerator. The memory unit stores key values to be sequentially sorted. The processor cache obtains tree data from the memory unit indicating the key values. The hardware merge sort accelerator is configured to generate a master tournament tree based on the key values and perform a tournament sort that determines a first winning key value based on the master tournament tree. The hardware merge sort accelerator further speculates a second winning key value based on the master tournament tree. The speculated second winning key value is a next sequential winning key value of the tournament sort.
    Type: Grant
    Filed: August 31, 2018
    Date of Patent: June 23, 2020
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Christian Jacobi, Aditya Puranik, Martin Recktenwald, Christian Zoellin
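
    Illustrative sketch (software only, without the speculation step): a tournament sort repeatedly reads the winner at the root of a tournament tree, removes it, and replays only the matches along that winner's path. The array layout and the name tournament_sort are assumptions for this example, and it expects finite numeric keys.

```python
def tournament_sort(keys):
    """Sort finite numeric keys in ascending order with a winner tree."""
    n = len(keys)
    if n == 0:
        return []
    size = 1
    while size < n:
        size *= 2
    infinity = float("inf")
    # Nodes hold (key, leaf index); leaves occupy tree[size .. 2*size - 1].
    tree = [(infinity, -1)] * (2 * size)
    for i, key in enumerate(keys):
        tree[size + i] = (key, i)
    # Play every match once to build the initial tree.
    for node in range(size - 1, 0, -1):
        tree[node] = min(tree[2 * node], tree[2 * node + 1])
    out = []
    for _ in range(n):
        key, leaf = tree[1]          # current winner sits at the root
        out.append(key)
        node = size + leaf
        tree[node] = (infinity, -1)  # retire the winning leaf
        node //= 2
        while node >= 1:             # replay matches on the winner's path only
            tree[node] = min(tree[2 * node], tree[2 * node + 1])
            node //= 2
    return out


ordered = tournament_sort([5, 3, 8, 1, 9, 2])
```

    Each extraction touches one root-to-leaf path, so the next winner is always determined by the losers already recorded along that path. That locality is what makes guessing the next winner ahead of time plausible, although this sketch does not attempt it.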
  • Patent number: 10115121
    Abstract: Example systems and methods of classifying web visitor sessions based on clickstreams are presented. In one example, a plurality of web pages of a website is organized into a plurality of web page categories. A clickstream of each visitor to visit the plurality of web page categories of the website is divided into a plurality of visitor sessions. A mathematical distance between each of the plurality of visitor sessions is determined using a visitation metric based on the web page categories. Each of the visitor sessions is classified into a target group or a non-target group based on the mathematical distance between each of the visitor sessions and on an identification of at least one of the visitor sessions with an event corresponding to the target group.
    Type: Grant
    Filed: December 11, 2013
    Date of Patent: October 30, 2018
    Assignee: Adobe Systems Incorporated
    Inventors: Deepak Pai, Abhijit Sharang, Meghanath Macha Yadagiri, Shradha Agrawal
  • Patent number: 9864791
    Abstract: Embodiments are directed to replicating data in distributed storage. A replication message may be retrieved from a message queue associated with a source table. The replication message may include a row identifier. One or more target storages within a same replication group as the source table may be identified. A row from each of the one or more target storages may be obtained corresponding to the row identifier. A winning row may be determined from the obtained rows based on a latest timestamp of the row. A replication operation may be created based on the winning row. The replication operation may be performed on the obtained rows from each of the target storages.
    Type: Grant
    Filed: March 4, 2015
    Date of Patent: January 9, 2018
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Ilya Grebnov, Samuel Banina, Charles Lamanna, Kevin Lam
  • Patent number: 9727308
    Abstract: A method and system for sorting data of an input file containing multiple records associated with multiple tables of a database. The multiple records include key values. The key values are segmented into ranges of key values for each table. Each range of key values for each table is a segment having a segment value. Multiple key values are selected for the multiple records. A block number, which contains a unique permutation of the segment values of the segments, is generated. The segment values denote the ranges of key values encompassing the multiple key values in each record. A sort key value for each record is ascertained, based on the generated block number for each record, and added to each record. The multiple records are sorted according to the sort key values in the multiple records. The sorted multiple records are stored in an output file.
    Type: Grant
    Filed: September 11, 2015
    Date of Patent: August 8, 2017
    Assignee: International Business Machines Corporation
    Inventors: Ritsuko Boh, Noriaki Kohno
  • Patent number: 9658826
    Abstract: A method and system for sorting data of an input file containing multiple records associated with multiple tables of a database. The multiple records include key values. The key values are segmented into ranges of key values for each table. Each range of key values for each table is a segment having a segment value. A block number, which contains a unique permutation of the segment values of the segments, is generated. The segment values denote the ranges of key values encompassing multiple key values in each record. A sort key value for each record is ascertained, based on the generated block number for each record, and added to each record. The multiple records are sorted according to the sort key values in the multiple records. The sorted multiple records are stored in an output file. The selected multiple key values include all key values that satisfy a condition.
    Type: Grant
    Filed: September 11, 2015
    Date of Patent: May 23, 2017
    Assignee: International Business Machines Corporation
    Inventors: Ritsuko Boh, Noriaki Kohno
  • Patent number: 9501578
    Abstract: Embodiments are directed towards dynamic semantic models having multiple indices. Source data may be provided to a network computer from at least one separate data source. A raw data graph may be generated from the source data such that the structure of the raw data graph may be based on the structure of the source data. Elements of the raw data graph may be mapped to a concept graph. Concept instances may be generated based on the concept graph, the raw data graph, and the source data. Model-identifiers (MIDs) that correspond to the concept instances may be generated to include at least a path in the concept graph. The MID values may be indexed into a plurality of indices based on a content-type of the data associated with the MIDs. In response to a query, a result set may be generated that includes result MIDs.
    Type: Grant
    Filed: December 21, 2015
    Date of Patent: November 22, 2016
    Assignee: Maana, Inc.
    Inventors: Ralph Donald Thompson, III, Allen Geoffrey Jones, Robert Povey
  • Patent number: 9037575
    Abstract: A system ranks results. The system may receive a list of links. The system may identify a source with which each of the links is associated and rank the list of links based at least in part on a quality of the identified sources.
    Type: Grant
    Filed: December 24, 2013
    Date of Patent: May 19, 2015
    Assignee: Google Inc.
    Inventors: Michael Curtiss, Krishna A. Bharat, Michael Schmitt
  • Patent number: 8972691
    Abstract: A mechanism is provided for cross-allocated block repair in a mounted file system. A set of cross-allocated blocks are identified from a plurality of blocks within an inode of the mounted file system, based on a corresponding bit associated with each cross-allocated block in a duplicated block information bitmap being in a first identified state. The set of cross-allocated blocks are repaired using a user-defined repair process. Then one or more of the set of cross-allocated blocks are deallocated based on results of the user-defined repair process.
    Type: Grant
    Filed: November 3, 2011
    Date of Patent: March 3, 2015
    Assignee: International Business Machines Corporation
    Inventors: Kalyan C. Gunda, Srikanth Srinivasan
  • Publication number: 20150046478
    Abstract: Embodiments include methods, systems and computer program products for performing a tournament tree sort on a hardware accelerator. The method includes receiving a plurality of key values by the hardware accelerator, storing each of the plurality of keys into a location on a memory of the hardware accelerator, and creating a pointer to each of the locations of the plurality of keys. The method also includes storing the pointer to each of the plurality of keys into a first array stored by the hardware accelerator, sorting the plurality of keys by ordering the pointers in the first array and by using a second array for storing the pointers, wherein the sorting identifies a winning key from the plurality of keys in the memory, and outputting the winning key.
    Type: Application
    Filed: August 7, 2013
    Publication date: February 12, 2015
    Applicant: International Business Machines Corporation
    Inventors: Sameh W. Asaad, Hong Min, Bharat Sukhwani, Mathew S. Thoennes
  • Patent number: 8938458
    Abstract: A customized, topical database and methods for constructing and using such a database are provided. Selection and indexing of articles is done by experts in the topic with which the database is concerned. As a result, articles are indexed in a manner that allows facile, rapid retrieval of highly relevant articles with few or no false positives.
    Type: Grant
    Filed: June 3, 2013
    Date of Patent: January 20, 2015
    Assignee: Nelson Information Systems
    Inventor: John M. Nelson
  • Patent number: 8938444
    Abstract: Techniques for external application-directed data partitioning in data exported from a parallel database management system (DBMS) are provided. An external application sends a query, a total number of requested access module processors (AMPs), and an application-defined data partitioning expression to the DBMS. The DBMS executes the query with the results vertically partitioned on the identified number of AMPs. Individual external mappers access their assigned AMPs asking for specific partitions that they are assigned to process the query results.
    Type: Grant
    Filed: December 29, 2011
    Date of Patent: January 20, 2015
    Assignee: Teradata US, Inc.
    Inventors: Yu Xu, Olli Pekka Kostamaa
  • Patent number: 8918408
    Abstract: A computing device maintains an input history in memory. This input history includes input strings that have been previously entered into the computing device. When the user begins entering characters of an input string, a predictive input engine is activated. The predictive input engine receives the input string and the input history to generate a candidate list of predictive inputs which are presented to the user. The user can select one of the inputs from the list, or otherwise continue entering characters. The computing device generates the candidate list by combining frequency and recency information of the matching strings from the input history. Additionally, the candidate list can be manipulated to present a variety of candidates. By using a combination of frequency, recency and variety, a favorable user experience is provided.
    Type: Grant
    Filed: August 24, 2012
    Date of Patent: December 23, 2014
    Assignee: Microsoft Corporation
    Inventors: Katsutoshi Ohtsuki, Koji Watanabe
  • Patent number: 8909645
    Abstract: Methods and apparatus to classify text communications are disclosed. An example method includes determining a first score indicating a likelihood that a text belongs to a first classification mode by combining a first sentence score and a second sentence score retrieved from an index, the first sentence score indicating a probability that a first sentence in the text belongs to the first classification mode, the second sentence score indicating that a second sentence following the first sentence belongs to the first classification mode, determining a second score indicating a likelihood that the text belongs to a second classification mode, comparing the first score to the second score, classifying the text as the first classification mode when the first score is greater than the second score, and determining a confidence level that the text belongs to the first classification mode by dividing the first score by the second score.
    Type: Grant
    Filed: January 10, 2013
    Date of Patent: December 9, 2014
    Assignee: Buzzmetrics, Ltd.
    Inventors: Tal Eden, Eliyahu Greitzer, Yakir Krichman, Michael Fuks
  • Patent number: 8898177
    Abstract: A plurality of segments in an e-mail collection is generated by parsing content of e-mails. A corresponding segment signature is created for each segment, and a signature index is populated using the generated segment signatures. After receiving a query e-mail, a plurality of query segments in the query e-mail is generated using content of the query e-mail, and a corresponding query segment signature is generated for each query segment. A query root segment is identified and a corresponding query root segment signature is generated. A set of root segment signatures of the signature index is identified and the query root segment signature is compared with each root segment signature from the signature index. A subset of the signature index is identified, using a match between the root segment signature and the query root segment signature. An e-mail thread hierarchy is built using the identified subset of the signature index.
    Type: Grant
    Filed: September 10, 2010
    Date of Patent: November 25, 2014
    Assignee: International Business Machines Corporation
    Inventors: Danish Contractor, Manjula Golla Hosurmath, Sachindra Joshi, Kenney Ng
  • Patent number: 8892599
    Abstract: A method of processing a query in a distributed database implemented across a set of nodes includes receiving a query. The query is divided into split characterization queries. The split characterization queries are distributed to worker nodes. Each worker node stores a partition of the distributed database with encoded textual objects and pre-defined indices characterizing encoded textual object fragments. The split characterization queries are executed at the worker nodes to obtain preliminary information about query results. Executing the split characterization queries includes matching query fragments associated with the split characterization queries with encoded textual object fragments of the pre-defined indices to produce fragment matches representative of the size of the query results. For each split characterization query the preliminary information about query results includes a fragment count, a database partition identification, and a database host name.
    Type: Grant
    Filed: October 24, 2012
    Date of Patent: November 18, 2014
    Assignee: MarkLogic Corporation
    Inventors: Christopher Lindblad, Jane X. Chen
  • Patent number: 8880527
    Abstract: An approach is provided for initiating generation of a media compilation based on one or more sampling criteria. A sampling platform determines at least one subset of one or more media items captured of at least one event. The sampling platform also partitions the at least one subset of the one or more media items into one or more bins and generates at least one compilation of the at least one subset of the one or more items based, at least in part, on whether the one or more media items in the one or more bins at least substantially meet one or more sampling criteria.
    Type: Grant
    Filed: October 31, 2012
    Date of Patent: November 4, 2014
    Assignee: Nokia Corporation
    Inventors: Mate Sujeet Shyamsundar, Curcio Igor Danilo Diego, Vinod Kumar Malamal Vadakital
  • Publication number: 20140222839
    Abstract: A method and system for sorting data of an input file containing multiple records associated with multiple tables of a database. The multiple records include key values. The key values are segmented into ranges of key values for each table. Each range of key values for each table is a segment having a segment value. Multiple key values are selected for the multiple records. A block number, which contains a unique permutation of the segment values of the segments, is generated. The segment values denote the ranges of key values encompassing the multiple key values in each record. A sort key value for each record is ascertained, based on the generated block number for each record, and added to each record. The multiple records are sorted according to the sort key values in the multiple records. The sorted multiple records are stored in an output file.
    Type: Application
    Filed: February 25, 2014
    Publication date: August 7, 2014
    Applicant: International Business Machines Corporation
    Inventors: Ritsuko Boh, Noriaki Kohno
  • Publication number: 20140181126
    Abstract: A method for high-speed scheduling and arbitration of events for computing and networking is disclosed. The method includes the software and hardware implementation of a unique data structure, known as a pile, for scheduling and arbitration of events. According to the method, events are stored in loosely sorted order in piles, with the next event to be processed residing in the root node of the pile. The pipelining of the insertion and removal of events from the piles allows for simultaneous event removal and next event calculation. The method's inherent parallelisms thus allow for the automatic rescheduling of removed events for re-execution at a future time, also known as event swapping. The method executes in O(1) time.
    Type: Application
    Filed: September 12, 2011
    Publication date: June 26, 2014
    Applicant: Altera Corporation
    Inventors: Paul Nadj, David Walter Carr, Edward D. Funnekotter
  • Patent number: 8751513
    Abstract: The present technology is directed to improving the conversion rate of invitational content that is provided to the user of an interactive, content-receiving-and-displaying device. The content of a large number of primary-content sources is analyzed and keyword and/or other context-providing information is extracted from the primary-content sources. The keyword and/or other context-providing information is used to index the primary-content sources into an index according to a hierarchical taxonomy; the hierarchical taxonometric index is used to identify primary-content sources with which a given item of invitational content correlates; and the given item of invitational content is delivered to a user in response to the user accessing a primary-content source with which the given item of invitational content correlates.
    Type: Grant
    Filed: August 31, 2010
    Date of Patent: June 10, 2014
    Assignee: Apple Inc.
    Inventors: Eswar Priyadarshan, Kenley Sun, Dan Marius Grigorovici, Jayasurya Vadrevu
  • Patent number: 8713048
    Abstract: Queries targeting various data sources are processed in a query processing pipeline that parses the query into a set of operations (e.g., an expression tree or a translated SQL query) using a set of query operators, each handling a particular type of operation. The query operators are often designed in an unspecialized manner, such that each query operator handles one query operation in an atomic, generic manner (e.g., sorting generic data items for an ORDER BY clause.) More efficient queries may be devised by including specialized queries that operate in common but special cases, such as a sorting of a particular data type (e.g., a floating-point number sort) or a sequence of two or more operations that are often performed together (e.g., a WHERE test of an attribute followed by a SELECT of the same attribute.) The use of specialized operators may result in the formulation of more efficient queries.
    Type: Grant
    Filed: June 24, 2008
    Date of Patent: April 29, 2014
    Assignee: Microsoft Corporation
    Inventors: Erik Meijer, Mads Torgersen, Anders Hejlsberg, Matthew J. Warren, John W. Dyer
  • Publication number: 20140108434
    Abstract: An entity using a computing device can upload searchable data to a network service to be indexed and stored. The data can include a plurality of data fields, each data field having one or more associated values. The network service can analyze the data fields and their respectively associated values to determine data field types for the data fields and search options to be enabled for the data fields. Based at least in part on the data field types and the search options, the network service can generate a search index configuration/schema. Based at least in part on the generated search index configuration/schema, the network service can generate a search index for the data. In some embodiments, the network service can also convert the data into a format compatible with the search index.
    Type: Application
    Filed: October 12, 2012
    Publication date: April 17, 2014
    Applicant: A9.com, Inc.
    Inventor: A9.com, Inc.
  • Patent number: 8682908
    Abstract: An information processing apparatus is disclosed that includes a processor, a storage device, a display device that displays a list of files accessible by the processor which list is sorted using an item of attribute information of the files as a sort key, a storing unit that stores information pertaining to display positions of the files within the list and information pertaining to the sort key used to sort the list in the storage device, a selecting unit that selects another item of attribute information of the files as a selected sort key, a sorting unit that executes re-sorting operations on the list using the selected sort key and generates a re-sorted list to be displayed by the display device, and a restoring unit that uses the information stored in the storage device to restore the re-sorted list back to the list displayed prior to execution of the re-sorting operations.
    Type: Grant
    Filed: January 16, 2008
    Date of Patent: March 25, 2014
    Assignee: Ricoh Company, Ltd.
    Inventor: Kikyo Cho
  • Patent number: 8645368
    Abstract: A system ranks results. The system may receive a list of links. The system may identify a source with which each of the links is associated and rank the list of links based at least in part on a quality of the identified sources.
    Type: Grant
    Filed: September 14, 2012
    Date of Patent: February 4, 2014
    Assignee: Google Inc.
    Inventors: Michael Curtiss, Krishna Bharat, Michael Schmitt
  • Patent number: 8515976
    Abstract: The sort processing of keys to be sorted, which keys are expressed as bit strings involves a classification processing. In the classification processing, a bit string comparison between a reference key and a key which is an object of the classification is performed, and a difference bit position is obtained that is the bit position of the first bit that differs in the bit string comparison and the keys to be sorted are classified by the difference bit position into key groups with the same difference bit position.
    Type: Grant
    Filed: June 22, 2011
    Date of Patent: August 20, 2013
    Assignee: Kousokuya, Inc.
    Inventors: Toshio Shinjo, Mitsuhiro Kokubun, Koutaro Shinjo
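
    Illustrative sketch (assumptions noted): for fixed-width keys held as integers, the first differing bit between a reference key and another key can be found by XOR-ing the two and locating the highest set bit. Positions here count from 0 at the most significant bit; that convention, the width parameter, and the function names are choices made for this example.

```python
from collections import defaultdict


def difference_bit_position(reference: int, key: int, width: int):
    """Return the position of the first differing bit, or None if the keys are equal.

    Position 0 is the most significant bit of a width-bit key.
    """
    diff = reference ^ key
    if diff == 0:
        return None
    return width - diff.bit_length()


def classify(reference: int, keys, width: int):
    """Group keys by their difference bit position relative to the reference key."""
    groups = defaultdict(list)
    for key in keys:
        groups[difference_bit_position(reference, key, width)].append(key)
    return dict(groups)


groups = classify(0b1010, [0b1011, 0b0010, 0b1110, 0b1111], width=4)
```

    Keys in the same group share a common prefix with the reference key up to the difference bit and all hold the opposite bit value there, so each group falls entirely on one side of the reference key in sorted order.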
  • Patent number: 8495071
    Abstract: A computer-implemented method is provided, according to which, an indication of a plurality of attributes of user interaction with one or more electronic messages in a user account is received, a rank value for each of the one or more electronic messages based on the plurality of attributes of user interaction with the one or more electronic messages is determined, the one or more electronic messages in the user account based on the determined rank value are sorted, and the sorted one or more electronic messages are provided for display in accordance with the determined rank value.
    Type: Grant
    Filed: January 26, 2012
    Date of Patent: July 23, 2013
    Assignee: Google Inc.
    Inventors: Kaisuke Nakajima, Jennifer W. Lin
  • Patent number: 8484221
    Abstract: Documents are assigned to one or more indexes in a document indexing system on the basis of document properties such as total number of tokens in the document, number of numeric tokens in the document, number of alphabetic tokens in the document, size of the document, and metadata associated with the document. Based on statistical distributions of document properties (over a large number of documents), different indexes can be defined, and a document router can direct a particular document to one index or another based on the properties of the particular document. In some implementations, certain document properties may be used to identify a nonrelevant document, or garbage document, so that it is either not indexed or assigned to an index dedicated for such documents.
    Type: Grant
    Filed: May 25, 2010
    Date of Patent: July 9, 2013
    Assignee: Stratify, Inc.
    Inventors: Kumar Maddali, Joy Thomas
  • Patent number: 8484227
    Abstract: A system and method for caching and/or synching shared media items in a media sharing system are provided. In one embodiment, a user device of a user joins a media sharing system including the user device and one or more other user devices connected via a network such that the user of the user device has access to one or more shared media collections hosted by the one or more other user devices. Shared media items from the one or more shared media collections are scored. The user device then obtains shared media items scored above a defined threshold from the one or more other user devices hosting the one or more corresponding shared media collections and stores those shared media items in local storage. In one embodiment, the local storage is temporary storage, such as a cache.
    Type: Grant
    Filed: October 15, 2008
    Date of Patent: July 9, 2013
    Assignee: Eloy Technology, LLC
    Inventor: Hugh Svendsen
  • Patent number: 8468162
    Abstract: String matching is a ubiquitous problem that arises in a wide range of applications in computer science, e.g., packet routing, intrusion detection, web querying, and genome analysis. Due to its importance, dozens of algorithms and several data structures have been developed over the years. A recent breakthrough in this field is the FM-index, a data structure that synergistically combines the Burrows-Wheeler transform and the suffix array. In software, the FM-index allows searching (exact and approximate) in times comparable to the fastest known indices for large texts (suffix trees and suffix arrays), but has the additional advantage to be much more space-efficient than those indices. This disclosure discusses an FPGA-based hardware implementation of the FM-index for exact and approximate pattern matching.
    Type: Grant
    Filed: March 8, 2012
    Date of Patent: June 18, 2013
    Assignee: The Regents of the University of California
    Inventors: Walid A. Najjar, Edward Bryann C. Fernandez, Stefano Lonardi
  • Patent number: 8458154
    Abstract: Methods and apparatus to classify text communications are disclosed. An example method includes determining a first conditional probability of a first feature occurring in a text given that the text belongs to a classification mode, wherein the first feature is included in the text, determining a second conditional probability of a second feature occurring in a text given that the text belongs to the classification mode, wherein the second feature is included in the text, determining a probability of the classification mode occurring, multiplying the first conditional probability, the second conditional probability and the probability of the classification mode to determine a product, and storing the product in a tangible memory as a score that the message belongs to the first classification mode.
    Type: Grant
    Filed: October 9, 2009
    Date of Patent: June 4, 2013
    Assignee: Buzzmetrics, Ltd.
    Inventors: Tal Eden, Eliyahu Greitzer, Yakir Krichman, Michael Fuks
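    Illustration (not part of the patent record): the score described in the abstract is the product of the per-feature conditional probabilities and the probability of the classification mode, as in a naive Bayes classifier. The sketch below assumes the probabilities are already estimated; the function name and the numbers are hypothetical.

```python
from math import prod


def classification_score(feature_probs, mode_prior):
    """Return P(mode) multiplied by each P(feature | mode)."""
    return mode_prior * prod(feature_probs)


# Hypothetical probabilities, chosen only for illustration.
score = classification_score([0.5, 0.25], mode_prior=0.5)
```

    With many features, implementations commonly sum logarithms instead of multiplying, to avoid floating-point underflow.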
  • Patent number: 8442973
    Abstract: A method and apparatus for utilizing user behavior to immediately modify sets of search results so that the most relevant documents are moved to the top. In one embodiment of the invention, behavior data, which can come from virtually any activity, is used to infer the user's intent. The updated inferred implicit user model is then exploited immediately by re-ranking the set of matched documents to best reflect the information need of the user. The system updates the user model and immediately re-ranks documents at every opportunity in order to constantly provide the best results. In another embodiment, the system determines, based on the similarity of result sets, if the current query belongs in the same information session as one or more previous queries. If so, the current query is expanded with additional keywords in order to improve the targeting of the results.
    Type: Grant
    Filed: May 1, 2007
    Date of Patent: May 14, 2013
    Assignees: Surf Canyon, Inc., The Board of Trustees of The University of Illinois
    Inventors: Mark Cramer, ChengXiang Zhai, Xuehua Shen, Bin Tan
  • Patent number: 8438173
    Abstract: Tools and techniques for indexing and querying data stores using concatenated terms are provided. These tools may receive input queries that include at least two query terms. The query terms are correlated respectively with fields contained within records within a data store, with these fields being populated with respective field values. The query terms are arranged according to an indexing priority according to which the fields are ranked within an indexing table, which is associated with the data store. The tools then concatenate the query terms as arranged according to the indexing priority. In turn, the tools search the index table for any entries that are responsive to the concatenated query terms.
    Type: Grant
    Filed: January 9, 2009
    Date of Patent: May 7, 2013
    Assignee: Microsoft Corporation
    Inventors: Willard Bruce Jones, Simon Skaria, Naresh Kannan
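    Illustration (not part of the patent record): a minimal sketch of a concatenated-term lookup, assuming an in-memory dictionary as the index table. The field names, the `INDEX_PRIORITY` ranking, and the separator character are hypothetical.

```python
# Hypothetical field ranking: a lower number means a higher indexing priority.
INDEX_PRIORITY = {"country": 0, "city": 1, "name": 2}
SEPARATOR = "\x1f"  # assumed not to occur inside field values


def concat_key(terms):
    """Order the terms by indexing priority, then concatenate their values."""
    ordered = sorted(terms.items(), key=lambda kv: INDEX_PRIORITY[kv[0]])
    return SEPARATOR.join(value for _, value in ordered)


def build_index(records, fields):
    """Map each concatenated key to the ids of the records that produce it."""
    index = {}
    for rec_id, rec in records.items():
        key = concat_key({f: rec[f] for f in fields})
        index.setdefault(key, []).append(rec_id)
    return index


records = {
    1: {"country": "US", "city": "Seattle", "name": "Ann"},
    2: {"country": "US", "city": "Boston", "name": "Bob"},
    3: {"country": "US", "city": "Seattle", "name": "Cy"},
}
index = build_index(records, ["country", "city"])
# Query terms may arrive in any order; concat_key re-orders them by priority.
hits = index.get(concat_key({"city": "Seattle", "country": "US"}), [])
```

    Because both the index entries and the query pass through `concat_key`, the order in which the caller supplies the terms does not affect the lookup.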
  • Patent number: 8433708
    Abstract: Computer searchable annotated formatted documents are produced by correlating documents stored as photographic or scanned graphic representations of an actual document (evidence, report, court order, etc.) with textual versions of the same documents. A produced document will provide additional details in a computer data structure that supports citation annotation as well as other types of analysis of a document. The computer data structure also supports generation of citation reports and corpus reports. A computer method of creating searchable annotated formatted documents including citation and corpus reports by correlating and correcting text files with photographic or scanned graphics of the original documents. Data structures for correlating and correcting text files with graphic images. Generation of citation reports, concordance reports, and corpus reports. Data structures for citation reports, concordance reports, and corpus reports generation.
    Type: Grant
    Filed: September 16, 2009
    Date of Patent: April 30, 2013
    Inventor: Kendyl A. Román
  • Patent number: 8429154
    Abstract: A search device is provided that extracts and stores field data from document data and displays a search result in a two-dimensional coordinate system whose axes are freely chosen by a user. The search device has a data input unit for inputting input data to the search device, a field extraction unit for extracting field data from the input data, a data storage unit for storing the field data and the input data associated with the field data, a search word input unit for inputting a search word, a search processing unit for searching the field data in the data storage unit based on the inputted search word and for retrieving the input data associated with the found field data, and a display unit for displaying an icon, representing the input data found by the search processing unit, at a position in a coordinate system.
    Type: Grant
    Filed: July 11, 2008
    Date of Patent: April 23, 2013
    Assignee: Oki Data Corporation
    Inventor: Hitoshi Takeya
  • Patent number: 8417687
    Abstract: Update processing and the like of an index file relating to change of a hierarchical structure is performed. The index file is recorded in a recording medium with content files. The index file is generated based on attribute information of content files and folders including the content files. Content files and folders form a hierarchical structure in which the folders are in an upper hierarchy. In the index file, a prescribed number of entries (management information areas) corresponding to content files and folders, respectively, are provided. Second index information indicating second entries corresponding to folders or content files positioned in a lower hierarchy of folders in the upper hierarchy is provided at first entries corresponding to folders in the upper hierarchy in a list format. First index information indicating the first entries is provided at the second entries.
    Type: Grant
    Filed: August 31, 2006
    Date of Patent: April 9, 2013
    Assignee: Sony Corporation
    Inventors: Fumitaka Kawate, Mitsuhiro Hirabayashi, Hiroshi Jinno, Masayoshi Ohno, Hideo Obata, Shigeru Kashiwagi
  • Patent number: 8397028
    Abstract: Systems, methods embodied on computer-readable media, and other embodiments associated with index entry eviction are described. One example method includes selecting an index entry for eviction from a bucket of index entries based on a time value, a utility value, and a precedence value. A precedence value may be a value associated with an index entry that is static over time. Additionally, results of a function that compares two precedence values may be static over time. The example method may also include providing an index entry identifier that identifies the index entry.
    Type: Grant
    Filed: June 15, 2010
    Date of Patent: March 12, 2013
    Inventor: Stephen Spackman
  • Patent number: 8380724
    Abstract: A concurrent grouping operation for execution on a multiple core processor is provided. The grouping operation is provided with a sequence or set of elements. In one phase, each worker receives a partition of a sequence of elements to be grouped. The elements of each partition are arranged into a data structure, which includes one or more keys where each key corresponds to a value list of one or more of the received elements associated with that key. In another phase, the data structures created by each worker are merged so that the keys and corresponding elements for the entire sequence of elements exist in one data structure. Recursive merging can be completed in a constant time, which is not proportional to the length of the sequence.
    Type: Grant
    Filed: November 24, 2009
    Date of Patent: February 19, 2013
    Assignee: Microsoft Corporation
    Inventor: Igor Ostrovsky
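    Illustration (not part of the patent record): a minimal sketch of the two phases in the abstract, assuming Python threads as the workers and plain dictionaries as the per-worker data structure. The merge shown here is a simple linear dictionary merge, not the constant-time recursive merge the abstract claims; all names are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def group_partition(partition, key_fn):
    """Phase 1: one worker groups its partition into key -> list of elements."""
    groups = {}
    for element in partition:
        groups.setdefault(key_fn(element), []).append(element)
    return groups


def merge_groups(left, right):
    """Phase 2: fold one worker's groups into another's (linear, in place)."""
    for key, values in right.items():
        left.setdefault(key, []).extend(values)
    return left


def parallel_group(elements, key_fn, workers=2):
    size = max(1, -(-len(elements) // workers))  # ceiling division
    partitions = [elements[i:i + size] for i in range(0, len(elements), size)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map preserves partition order, so element order is kept per key.
        partials = list(pool.map(lambda p: group_partition(p, key_fn), partitions))
    result = {}
    for partial in partials:
        merge_groups(result, partial)
    return result


grouped = parallel_group([1, 2, 3, 4, 5, 6], key_fn=lambda n: n % 2)
```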
  • Patent number: 8352715
    Abstract: A method for booting up a mobile phone quickly is disclosed. The method includes the steps of: driving hardware devices when the mobile phone is turned on; initializing application software installed in the mobile phone; loading data in the mobile phone from a storage to a memory; creating a table for each kind of data in the memory, and ordering data in each table according to a particular order condition; converting each table into a binary file, and storing the binary file in the storage; loading the binary file of each table directly from the storage to the memory when the mobile phone is turned on. A related quickly booting mobile phone is also disclosed.
    Type: Grant
    Filed: December 29, 2007
    Date of Patent: January 8, 2013
    Assignee: Chi Mei Communication Systems, Inc.
    Inventor: Hua-Jen Mao
  • Patent number: 8352451
    Abstract: Methods and apparatus to classify text communications are disclosed. An example method includes determining a first conditional probability of a first feature occurring in a text given that the text belongs to a classification mode, wherein the first feature is included in the text, determining a second conditional probability of a second feature occurring in a text given that the text belongs to the classification mode, wherein the second feature is included in the text, determining a probability of the classification mode occurring, multiplying the first conditional probability, the second conditional probability and the probability of the classification mode to determine a product, and storing the product in a tangible memory as a score that the message belongs to the first classification mode.
    Type: Grant
    Filed: October 9, 2009
    Date of Patent: January 8, 2013
    Assignee: The Nielsen Company (US), LLC
    Inventors: Tal Eden, Eliyahu Greitzer, Yakir Krichman, Michael Fuks
  • Patent number: 8332379
    Abstract: A method and system for identifying nodes with similar content. In one aspect, the method comprises determining a structure of a network of nodes, said structure defined by incoming links and outgoing links between nodes within said network, grouping said nodes within said network into a first set of modules, calculating a first modularity value between each of the modules within the first set, said modularity value indicating a degree of similar content within each module, calculating a topical relevance value for each of the modules, selecting those modules whose topical relevance value exceeds a threshold value and calculating an authority score for the selected modules.
    Type: Grant
    Filed: June 11, 2010
    Date of Patent: December 11, 2012
    Assignee: International Business Machines Corporation
    Inventors: Ning Duan, Pei-Yun S. Hsueh, Yan Liu
  • Patent number: 8332410
    Abstract: To realize a high speed merge sort method by applying a coupled node tree, which method extracts a smallest or largest key from a plurality of sorted key storage areas in each of which is stored keys including bit strings that are sorted, and generates a coupled node tree for merge while adding a processing source identifier that identifies the sorted storage area wherefrom the key has been extracted, and repeats the actions of writing out into the merged key storage area a key being obtained by a minimum or maximum value search on the coupled node tree and deleting the key, and inserting into the coupled node tree a key by extracting the key from one of the plurality of sorted key storage areas.
    Type: Grant
    Filed: June 3, 2010
    Date of Patent: December 11, 2012
    Assignee: Kousokuya, Inc.
    Inventors: Toshio Shinjo, Mitsuhiro Kokubun
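    Illustration (not part of the patent record): the loop in the abstract (take the smallest key, write it out, delete it, insert the next key from the same source) is a k-way merge. The sketch below substitutes a binary heap from Python's `heapq` module for the patent's coupled node tree, and uses a list index as the processing source identifier; both choices are assumptions made for illustration.

```python
import heapq


def merge_sorted_areas(areas):
    """Merge several ascending key lists into one ascending list."""
    heap = []
    cursors = [0] * len(areas)  # next unread position in each source area
    for source_id, area in enumerate(areas):
        if area:
            # Tag each key with the id of the area it was extracted from.
            heapq.heappush(heap, (area[0], source_id))
            cursors[source_id] = 1
    merged = []
    while heap:
        key, source_id = heapq.heappop(heap)  # minimum search plus delete
        merged.append(key)                    # write out to the merged area
        pos = cursors[source_id]
        if pos < len(areas[source_id]):
            # Refill from the same source the written key came from.
            heapq.heappush(heap, (areas[source_id][pos], source_id))
            cursors[source_id] = pos + 1
    return merged


merged = merge_sorted_areas([[1, 4, 9], [2, 3], [], [5]])
```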
  • Publication number: 20120311704
    Abstract: A flow-based detection system for detecting network attacks on data networks. Flow records are collected in a novel data structure that facilitates efficient sorting. The sorted data structure can be subsequently analyzed in an efficient manner to find out if the network is under attack. An attack is identified if the numbers of unique corresponding addresses or conversations are too large.
    Type: Application
    Filed: August 30, 2011
    Publication date: December 6, 2012
    Applicant: FLUKE CORPORATION
    Inventor: Peter Reilly
  • Patent number: 8326835
    Abstract: A method and system for presenting a dataset with context-sensitive pagination are described. The dataset is sorted and divided into subsets according to a first attribute, and the subsets are presented via a user interface, which contains a navigation control relating to the first attribute. When the dataset is sorted and divided according to a second attribute, the navigation control dynamically updates to relate to the second attribute. This allows a user to navigate pages of data in a manner consistent with chosen sorting criteria.
    Type: Grant
    Filed: December 2, 2008
    Date of Patent: December 4, 2012
    Assignee: Adobe Systems Incorporated
    Inventor: Aaron D. Munter
  • Patent number: 8326820
    Abstract: Described herein is a technology that facilitates efficient large-scale similarity-based retrieval. In several embodiments documents, images, and/or other multimedia files are compactly represented and efficiently indexed to enable robust search using a long-query in a large-scale corpus. As described herein, these techniques include performing decomposition of a file, e.g., a document or document-like representation. The techniques use dimension reduction to obtain three parts, topic-related words (major semantics), document specific words (minor semantics), and background words, representing the major semantics in a feature vector and the minor semantics as keywords. Using the techniques described, file vectors are matched in a topic model and the results ranked based on the keywords.
    Type: Grant
    Filed: September 30, 2009
    Date of Patent: December 4, 2012
    Assignee: Microsoft Corporation
    Inventors: Zhiwei Li, Lei Zhang, Rui Cai, Wei-Ying Ma, Heung-Yeung Shum
  • Patent number: 8315985
    Abstract: A method and apparatus for optimizing a de-duplication rate for backup streams is described. In one embodiment, the method for optimizing data de-duplication using an extent mapping of a backup stream includes processing a backup stream to access an extent mapping associated with a plurality of data files, wherein the plurality of the data files are arranged within the backup stream and examining the extent mapping to identify at least one extent group within the backup stream, wherein the plurality of the data files are de-duplicated using at least one location of the at least one extent group.
    Type: Grant
    Filed: December 18, 2008
    Date of Patent: November 20, 2012
    Assignee: Symantec Corporation
    Inventors: James Ohr, Michael Zeis, Dean Elling, Stephan Kurt Gipp, William DesJardin