Abstract: A method and apparatus for identifying coherent areas within a Web page. First, a Web page is parsed into an HTML DOM tree and an HTML tag token stream. Next, repeated-patterns are induced from the Web page. After improper repeated-patterns are filtered out and corresponding instances of the remaining repeated-patterns are generated, the repeated-patterns are mapped back to their corresponding regions in the Web page. Based on these mappings, a hierarchical RST tree containing information blocks is generated. Information items within the information blocks are detected and then used to generate a hierarchical structural information block tree. Information blocks from the structural information block tree are then classified into text information blocks and link information blocks. Based on the classification and on block semantic similarity, the blocks are clustered and then grouped into semantic information blocks. The semantic information blocks contain main text information blocks and related link blocks, which can be labeled if necessary.
Type:
Application
Filed:
September 17, 2004
Publication date:
March 24, 2005
Applicants:
Fujitsu Limited, Nanjing University
Inventors:
Jun Wang, Jicheng Wang, Gangshan Wu, Hiroshi Tsuda
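The first two steps of the abstract above (parsing a page into a tag token stream, then inducing repeated patterns) can be illustrated with a minimal sketch. The abstract does not say how the patterns are induced, so plain n-gram counting is used here as a stand-in technique, not as the patented method. The names `TagTokenizer`, `tag_token_stream` and `repeated_patterns`, and the length and count thresholds, are hypothetical and chosen only for illustration.

```python
from collections import Counter
from html.parser import HTMLParser


class TagTokenizer(HTMLParser):
    """Collects start and end tags into a flat token stream (text is ignored)."""

    def __init__(self):
        super().__init__()
        self.tokens = []

    def handle_starttag(self, tag, attrs):
        # Attributes are dropped so that structurally equal tags compare equal.
        self.tokens.append(f"<{tag}>")

    def handle_endtag(self, tag):
        self.tokens.append(f"</{tag}>")


def tag_token_stream(html):
    """Return the list of tag tokens found in an HTML string."""
    parser = TagTokenizer()
    parser.feed(html)
    parser.close()
    return parser.tokens


def repeated_patterns(tokens, min_len=2, max_len=6, min_count=2):
    """Count tag n-grams and keep those seen at least min_count times.

    Assumption: simple n-gram counting stands in for the pattern
    induction step, which the abstract does not specify.
    """
    counts = Counter()
    for n in range(min_len, max_len + 1):
        for i in range(len(tokens) - n + 1):
            counts[tuple(tokens[i:i + n])] += 1
    return {pattern: c for pattern, c in counts.items() if c >= min_count}


if __name__ == "__main__":
    page = "<ul><li><a href='#'>A</a></li><li><a href='#'>B</a></li></ul>"
    for pattern, count in repeated_patterns(tag_token_stream(page)).items():
        print(count, " ".join(pattern))
```

A list of links produces the same `<li> <a> </a> </li>` token run for each item, which is the kind of repetition that the later steps would map back to a region of the page. Filtering, mapping to the DOM tree, and clustering are not covered by this sketch.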
Abstract: The invention relates to compositions and methods for treating or preventing diseases or disorders caused by or associated with certain bacterial infections, especially Escherichia coli (E. coli) or Helicobacter pylori (H. pylori) infection. One exemplary compound of the present invention has the following formula I:
wherein n is 0 or 1, and R is selected from the group consisting of C1-10 alkyl, C6-10 aryl and
and wherein when n is 0, R is not C6-10 aryl.
Type:
Grant
Filed:
October 10, 2001
Date of Patent:
May 11, 2004
Assignees:
Shanghai East Best Biopharmaceutical, Nanjing University
Abstract: This invention concerns the design of a special type of optical superlattice and its application in all-solid-state lasers. Commonly used Nd-ion-doped laser crystals can radiate three relatively intense spectral lines when excited: the first around 900 nm, the second around 1064 nm, and the third around 1300 nm. The exact wavelengths depend on the host crystal (for Nd:YAG, for example, they are 946 nm, 1064 nm and 1319 nm, respectively). In LiNbO3 (LN), LiTaO3 (LT), KTP and other ferroelectric crystals, the positive and negative 180° ferroelectric domains can be arranged in a prescribed sequence by crystal growth, electric-field poling, or other state-of-the-art domain-reversal techniques, forming a superlattice suitable for quasi-phase-matching (QPM) laser frequency conversion.
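As a worked illustration of the quasi-phase-matching condition mentioned above, consider second-harmonic generation of a fundamental wave at vacuum wavelength \lambda_\omega. The abstract does not state which conversion processes the superlattice targets, so second-harmonic generation and first-order QPM are assumed here purely as the simplest case; the symbols n_\omega and n_{2\omega} (refractive indices at the fundamental and second harmonic) and \Lambda (domain period) are introduced for this example only.

```latex
\begin{align}
  \Delta k &= k_{2\omega} - 2k_{\omega}
            = \frac{4\pi}{\lambda_\omega}\,\bigl(n_{2\omega} - n_{\omega}\bigr), \\
  \Lambda  &= \frac{2\pi}{\Delta k}
            = \frac{\lambda_\omega}{2\,\bigl(n_{2\omega} - n_{\omega}\bigr)} .
\end{align}
```

Each positive or negative domain then has thickness \Lambda/2 = \pi/\Delta k, the coherence length of the interaction, so the sign of the nonlinear coefficient flips just as the generated wave would otherwise begin to convert back into the fundamental.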
Abstract: The present invention relates to the field of biotechnology. The invention provides a novel approach that uses the tobacco mosaic virus omega leader sequence to enhance the solubility of recombinant products in E. coli, and methods of use thereof. The omega leader sequence is incorporated into an E. coli expression vector, and the resulting vector can be used in combination with other available means to obtain higher expression or better solubility. The invention can be applied in the biotechnological pharmaceutical industry, genetic engineering, biochemistry, molecular biology and related fields. The invention also provides an expression vector, pTORG, which is a highly efficient GST fusion expression vector and can significantly enhance the yield of biologically active recombinant products.