Patents by Inventor David Alexander SIM

David Alexander SIM has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 11461406
    Abstract: A system, computer-implemented method, and computer storage medium encoded with a computer program for identifying newly trending topics in a data stream. An example method includes: receiving text documents forming part of a data stream from one or more servers; identifying terms within the received text documents; deriving, from the identified terms, a set of terms identified as abnormal by virtue of having a relatively high frequency of occurrence within the text documents received in a recent period compared with that expected from their historic occurrence; creating a first set of one or more clusters, each cluster including a group of terms from the set of terms identified as abnormal which, through their degree of co-occurrence in the received text documents, are considered to relate to the same topic; and comparing clusters of a further set with the clusters of the first set to determine whether a cluster of the further set pertains to the same topic.
    Type: Grant
    Filed: December 23, 2019
    Date of Patent: October 4, 2022
    Assignee: Verint Systems UK Limited
    Inventors: David Andrew Roberts, David Alexander Sim
  • Patent number: 11170759
    Abstract: A method, system, and computer program product for discriminating boilerplate text in documents, such as web pages. An example method includes: receiving documents structured as labelled text elements; generating a local language model for each labelled text element of the received documents; comparing local language models for different labelled text elements that have the same label; for each comparison of local language models, deriving a similarity indicator, and using the similarity indicators of all the comparisons to derive a similarity score for that label; using the similarity scores to determine labels associated with text elements comprising boilerplate text; providing the textual content of the labelled text elements to a receiving computer system; and identifying the textual content of labelled text elements that include boilerplate text.
    Type: Grant
    Filed: December 23, 2019
    Date of Patent: November 9, 2021
    Assignee: Verint Systems UK Limited
    Inventors: David Alexander Sim, David Paul Austen Ryland
  • Publication number: 20200218761
    Abstract: A system, computer-implemented method, and computer storage medium encoded with a computer program for identifying newly trending topics in a data stream. An example method includes: receiving text documents forming part of a data stream from one or more servers; identifying terms within the received text documents; deriving, from the identified terms, a set of terms identified as abnormal by virtue of having a relatively high frequency of occurrence within the text documents received in a recent period compared with that expected from their historic occurrence; creating a first set of one or more clusters, each cluster including a group of terms from the set of terms identified as abnormal which, through their degree of co-occurrence in the received text documents, are considered to relate to the same topic; and comparing clusters of a further set with the clusters of the first set to determine whether a cluster of the further set pertains to the same topic.
    Type: Application
    Filed: December 23, 2019
    Publication date: July 9, 2020
    Inventors: David Andrew ROBERTS, David Alexander SIM
  • Publication number: 20200219481
    Abstract: A method, system, and computer program product for discriminating boilerplate text in documents, such as web pages. An example method includes: receiving documents structured as labelled text elements; generating a local language model for each labelled text element of the received documents; comparing local language models for different labelled text elements that have the same label; for each comparison of local language models, deriving a similarity indicator, and using the similarity indicators of all the comparisons to derive a similarity score for that label; using the similarity scores to determine labels associated with text elements comprising boilerplate text; providing the textual content of the labelled text elements to a receiving computer system; and identifying the textual content of labelled text elements that include boilerplate text.
    Type: Application
    Filed: December 23, 2019
    Publication date: July 9, 2020
    Inventors: David Alexander SIM, David Paul Austen RYLAND
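
Illustrative sketch (not taken from the patent documents): the boilerplate-discrimination steps summarized in the abstracts above could look roughly like the following Python. The abstracts do not say what kind of language model, similarity indicator, or decision rule is used, so the unigram word-frequency model, cosine similarity, mean aggregation, and the 0.8 threshold are all assumptions chosen for illustration. The function names and the (label, text) document layout are hypothetical as well.

```python
from collections import Counter, defaultdict
from itertools import combinations
from math import sqrt


def unigram_model(text):
    """Assumed 'local language model': relative word frequencies of one element."""
    counts = Counter(text.lower().split())
    total = sum(counts.values())
    return {word: n / total for word, n in counts.items()} if total else {}


def cosine(p, q):
    """Assumed 'similarity indicator': cosine similarity of two unigram models."""
    dot = sum(v * q.get(word, 0.0) for word, v in p.items())
    norm = sqrt(sum(v * v for v in p.values())) * sqrt(sum(v * v for v in q.values()))
    return dot / norm if norm else 0.0


def label_scores(documents):
    """Score each label by the mean pairwise similarity of its elements' models.

    `documents` is a list of documents; each document is a list of
    (label, text) pairs (a hypothetical layout for 'labelled text elements').
    """
    models = defaultdict(list)
    for doc in documents:
        for label, text in doc:
            models[label].append(unigram_model(text))

    scores = {}
    for label, label_models in models.items():
        pairs = list(combinations(label_models, 2))
        if pairs:  # a label seen only once cannot be compared
            scores[label] = sum(cosine(a, b) for a, b in pairs) / len(pairs)
    return scores


def boilerplate_labels(documents, threshold=0.8):
    """Labels whose elements look alike across documents (threshold is assumed)."""
    return {label for label, s in label_scores(documents).items() if s >= threshold}


documents = [
    [("footer", "Copyright Example Ltd all rights reserved"),
     ("body", "Trending topics are detected from term frequency")],
    [("footer", "Copyright Example Ltd all rights reserved"),
     ("body", "Language models compare labelled text elements")],
]
print(boilerplate_labels(documents))
```

In this toy input the two footer elements are identical and the two body elements share no words, so only the footer label passes the threshold. Real pages would need a better tokenizer and a tuned threshold.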