Abstract: A method and system for text extraction employ structured annotations that are embedded within a text document and that specify the start and end of a document segment and an associated rhetorical relation. The structured annotations are processed to generate and store variables that represent the document segments and their associated rhetorical relations. A user interacts with a computer to define a query input that specifies at least one rhetorical relation of interest. The query input is processed to query the stored variables, to identify document segments whose associated rhetorical relation matches the rhetorical relation of interest, and to return to the user information pertaining to the matching document segments. The rhetorical relation of interest, as well as the stored variables, can include RST relations, whose meaning is dictated by the nuclearity of the associated text, and Speech Act relations, whose meaning extends beyond the situational semantics of the associated text.
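
The flow described in the abstract (parse the embedded annotations, store each segment with its relation, then query by a relation of interest) can be illustrated with a minimal Python sketch. The abstract does not specify an annotation syntax, so the `<seg rel="...">` markup, the `Segment` record, the function names `extract_segments` and `query_segments`, and the sample text are all hypothetical and chosen only for illustration. The sketch does not handle nested annotations or nuclearity.

```python
import re
from dataclasses import dataclass
from typing import List

# Hypothetical annotation syntax (an assumption, not taken from the abstract):
#   <seg rel="RELATION">segment text</seg>
SEG_PATTERN = re.compile(r'<seg rel="([^"]+)">(.*?)</seg>', re.DOTALL)


@dataclass(frozen=True)
class Segment:
    """Stored variable representing one annotated document segment."""
    relation: str  # rhetorical relation named in the annotation
    text: str      # segment text between the start and end markers
    start: int     # offset of the opening marker in the document
    end: int       # offset just past the closing marker


def extract_segments(document: str) -> List[Segment]:
    """Process the embedded annotations into stored Segment records."""
    return [
        Segment(m.group(1), m.group(2).strip(), m.start(), m.end())
        for m in SEG_PATTERN.finditer(document)
    ]


def query_segments(segments: List[Segment], relation_of_interest: str) -> List[Segment]:
    """Return segments whose relation matches the query (case-insensitive)."""
    wanted = relation_of_interest.casefold()
    return [s for s in segments if s.relation.casefold() == wanted]


# Made-up sample document: one RST-style relation and one Speech Act-style relation.
document = (
    'The drug reduced symptoms. '
    '<seg rel="Evidence">A trial of 200 patients showed a 30% drop.</seg> '
    '<seg rel="Request">Please review the dosage table.</seg>'
)

segments = extract_segments(document)
matches = query_segments(segments, "evidence")
for s in matches:
    print(s.relation, "->", s.text)
```

Storing the character offsets alongside each relation is one possible design choice: it lets a query result point back to the location of the matching segment in the source document rather than returning only the text.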