Abstract: Multiple types of tokens can be generated and utilized in a highly structured document with freeform text. For example, a tokenization system may receive a request for tokenizing a document with a first portion having structured content and a second portion having unstructured or semi-structured content. In response, the tokenization system identifies sensitive information in the first portion of the document, generates format-preserving tokens for the sensitive information in the first portion of the document, identifies sensitive information in the second portion of the document, and generates self-describing tokens for the sensitive information in the second portion of the document. The self-describing tokens reference the sensitive information in the first portion of the document. The tokenization system may then communicate the format-preserving tokens and the self-describing tokens to the first client computing system or to a second client computing system.