Patents by Inventor Hongtao Dai

Hongtao Dai has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 10360248
    Abstract: In general, in one aspect, the invention relates to a method for servicing requests, the method includes receiving, from a client system, a first request comprising a query, determining a first user associated with the first request, modifying the query to obtain a first modified query, where the first modified query includes a first permission definition token associated with the first user, processing the modified query to obtain a first object from a content repository, and providing the first object to the client system.
    Type: Grant
    Filed: June 30, 2016
    Date of Patent: July 23, 2019
    Assignee: EMC IP Holding Company LLC
    Inventors: Lei Zhang, Chao Chen, Jingjing Liu, Kunwu Huang, Hongtao Dai, Ying Teng
  • Patent number: 10248626
    Abstract: A method for document similarity analysis. The method includes obtaining a document to be archived, and identifying a document category similar to the document to be archived. The similar document category is identified by: identifying a document category that includes indexing terms that are identical to indexing terms in the document to be archived, obtaining term frequency vectors for the identical indexing terms in the document to be archived and in the identified document category, generating normalized term frequency vectors, from the term frequency vectors, calculating a common denominator similarity based on the normalized term frequency vectors and a common denominator, and determining that the document category is similar to the document to be archived based on the common denominator similarity. The method further includes registering the document to be archived in the document category.
    Type: Grant
    Filed: September 29, 2016
    Date of Patent: April 2, 2019
    Assignee: EMC IP Holding Company LLC
    Inventors: Lei Zhang, Chao Chen, Kunwu Huang, Hongtao Dai, Jingjing Liu, Ying Teng
  • Patent number: 10241998
    Abstract: A method for tokenizing documents. The method includes obtaining a document comprising text to be tokenized, isolating a first string of consecutive characters in the document, searching, in a token tree, for an expression that matches the first string, making a determination that a matching expression exists in the token tree and, based on the determination, storing the matching expression as an extracted token.
    Type: Grant
    Filed: June 29, 2016
    Date of Patent: March 26, 2019
    Assignee: EMC IP Holding Company LLC
    Inventors: Lei Zhang, Chao Chen, Jingjing Liu, Kunwu Huang, Hongtao Dai, Ying Teng
  • Publication number: 20180307689
    Abstract: The present disclosure provides a method and apparatus for information processing. The method comprises: in response to a request of a first user for first information, searching a database to obtain second information; determining a first relevance between a second user associated with the second information and the first user; determining a second relevance between the second information and the first information based on the first relevance; and presenting the second information to the first user based at least in part on the second relevance.
    Type: Application
    Filed: April 17, 2018
    Publication date: October 25, 2018
    Inventors: Duke Hongtao Dai, Winston Lei Zhang, Kun Wu (Sheperd) Huang, Charlie Chao Chen, Jingjing Liu
  • Publication number: 20180173791
    Abstract: Embodiments of the present disclosure generally relate to a method and device for creating an index. For example, the embodiments of the present disclosure propose a method for creating an index, comprising: dividing a document into a plurality of regions; determining the number of times that a token appears in the plurality of regions, the token including at least one character in the document; assigning respective weights to the plurality of regions; and creating an inverted document linked list directed to the token based on the number of times that the token appears in the plurality of regions and respective weights of the plurality of regions. In addition, the embodiments of the present disclosure propose a corresponding device and computer program product for creating an index.
    Type: Application
    Filed: December 19, 2017
    Publication date: June 21, 2018
    Inventors: Winston Lei Zhang, Charlie Chen, Kun Wu (Sheperd) Huang, Jingjing Liu, Duke Hongtao Dai
  • Publication number: 20170371978
    Abstract: Embodiments of the present disclosure relate to a method and apparatus for managing a document index. The method comprises determining an independently updatable field in a plurality of documents, the independently updatable field comprising at least one item. The method further comprises creating an index for an item in the independently updatable field, the index containing an identifier of a document comprising the item, the document being included in the plurality of documents. Furthermore, the method further comprises storing the identifier of the document in blocks such that the index is updatable without modifying the identifier of the document.
    Type: Application
    Filed: June 22, 2017
    Publication date: December 28, 2017
    Inventors: Kun Wu Huang, Winston Lei Zhang, Chao Chen, Jingjing Liu, Duke Hongtao Dai
  • Publication number: 20170364510
    Abstract: Embodiments of the present disclosure provide a method and apparatus for processing a multi-language text. According to embodiments of the present disclosure, the multi-language text including contents in a plurality of languages may be encoded with a Unicode. The method further comprises splitting the multi-language text into a plurality of parts based on the Unicode of the multi-language text, contents of the plurality of parts having different languages. In addition, the multi-language text may also be processed based on the plurality of parts.
    Type: Application
    Filed: June 21, 2017
    Publication date: December 21, 2017
    Inventors: Kun Wu Huang, Winston Lei Zhang, Chao Chen, Jingjing Liu, Duke Hongtao Dai
  • Publication number: 20170270114
    Abstract: Embodiments of the present disclosure provide a method and device for searching a character string. In one embodiment, a method of searching a character string is provided. The method comprises: determining a first set of documents including a first token in the character string, and a second set of documents including a second token in the character string; and generating a third set of documents based on the first and second sets of documents, in the third set of documents: i) a document being included in the first and second sets of documents, and ii) a distance between the first and second tokens in the document being equal to a distance between the first and second tokens in the character string. A corresponding device and a computer program product are also disclosed.
    Type: Application
    Filed: March 20, 2017
    Publication date: September 21, 2017
    Inventors: Duke Hongtao Dai, Winston Lei Zhang, Chao Chen, Kun Wu Huang, Jingjing Liu
  • Publication number: 20170270184
    Abstract: Embodiments of the present disclosure provide a method and device for processing objects to be searched. The method comprises: receiving a first input indicating a constraint associated with an object; receiving a second input indicating a category to which the object belongs; and establishing, based on the first input and the second input, a classification condition associating the constraint with the category as a part of a classification policy which is used for classifying the object into a category to create a search index. In addition, embodiments of the present disclosure further disclose a method and device for creating a search index for an object to be searched.
    Type: Application
    Filed: March 17, 2017
    Publication date: September 21, 2017
    Inventors: Kun Wu Huang, Chao Chen, Winston Lei Zhang, Jingjing Liu, Duke Hongtao Dai
  • Publication number: 20170270127
    Abstract: Various embodiments of the present disclosure provide a solution for category-based full-text searching. In some embodiments, there is provided a method of full-text searching. The method includes generating a first full-text index based on an obtained electronic document content. The method also includes categorizing the electronic document to determine a category identifier for the electronic document, and generating a second full-text index based on the category identifier. The method further includes storing the first full-text index and the second full-text index.
    Type: Application
    Filed: March 21, 2017
    Publication date: September 21, 2017
    Inventors: Chao Chen, Jingjing Liu, Winston Lei Zhang, Dingmeng Xue, Zed Minhong Zhou, Duke Hongtao Dai
  • Publication number: 20170262474
    Abstract: Ideogram character analysis includes partitioning an original ideogram character into strokes, and mapping each stroke to a corresponding stroke identifier (id) to create an original stroke id sequence that includes stroke identifiers. A candidate ideogram character that has a candidate stroke id sequence within a threshold distance to the original stroke id sequence is selected. One or more embodiments may create a new phrase by replacing the original ideogram character with the candidate ideogram character in a search phrase. One or more embodiments perform a search using the search phrase and the new phrase to obtain a result, and present the result. One or more embodiments may replace an original ideogram character in a character recognized document with the candidate ideogram character and store the character recognized document.
    Type: Application
    Filed: September 30, 2015
    Publication date: September 14, 2017
    Applicant: EMC Corporation
    Inventors: Chao Chen, Kunwu Huang, Hongtao Dai, Jingjing Liu
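
Illustrative note on publication 20170270114 (searching a character string): the abstract describes finding the documents that contain both of two tokens and keeping only those in which the tokens sit the same distance apart as they do in the searched string. The Python sketch below is one possible reading of that idea, not the method claimed in the application. The whitespace tokenizer, the positional index layout, and the names build_positional_index and search_pair are all assumptions made for illustration.

```python
from collections import defaultdict


def build_positional_index(docs):
    """Map each token to {doc_id: [positions]}.

    Assumption: tokens are lowercase whitespace-separated words; the
    application itself does not specify a tokenizer here.
    """
    index = defaultdict(lambda: defaultdict(list))
    for doc_id, text in docs.items():
        for pos, token in enumerate(text.lower().split()):
            index[token][doc_id].append(pos)
    return index


def search_pair(index, query, first, second):
    """Return ids of documents that contain both tokens at the same
    distance from each other as in the query string."""
    q_tokens = query.lower().split()
    # Distance between the two tokens inside the searched string.
    query_distance = q_tokens.index(second) - q_tokens.index(first)

    first_docs = index.get(first, {})    # "first set of documents"
    second_docs = index.get(second, {})  # "second set of documents"

    result = set()                       # "third set of documents"
    for doc_id in first_docs.keys() & second_docs.keys():
        second_positions = set(second_docs[doc_id])
        # Keep the document if some occurrence of the first token is
        # followed by the second token at exactly the query distance.
        if any(p + query_distance in second_positions
               for p in first_docs[doc_id]):
            result.add(doc_id)
    return result


docs = {
    1: "full text search engine",
    2: "search the full text",
    3: "full of text",
}
index = build_positional_index(docs)
print(search_pair(index, "full text", "full", "text"))
```

In this example documents 1 and 2 match because "full" is immediately followed by "text", while document 3 contains both tokens but at a different distance and is excluded.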