Systems and methods for modeling in-the-moment lexical experience for tracking oral reading fluency
Systems and methods are provided for modeling lexical experience for tracking of oral reading fluency. In embodiments, a background corpus and a text are received. A reading passage is selected from the text. A surprisal model is generated based on the background corpus and a portion of the text preceding the reading passage. Audio of a user reciting the reading passage is iteratively received, wherein a token of the audio from a plurality of tokens is received at a time. Oral reading fluency is evaluated based on the audio and the surprisal model. A next reading passage is selected. An oral reading fluency report is stored in a computer readable medium.
This application claims the benefit of U.S. Provisional Patent Application No. 63/391,899, filed Jul. 25, 2022, which is incorporated by reference herein in its entirety.
TECHNICAL FIELD
The technology described herein relates to modeling in-the-moment lexical experience for tracking oral reading fluency.
BACKGROUND
Teaching the skill of reading is one of the major tasks of an education system. Oral reading fluency (ORF) tests can serve as indicators of early literacy skills. Modeling a reader's dynamic lexical experience may provide improved ORF reports that take into account a reader's familiarity with words.
SUMMARY
Systems and methods are provided for a computer-implemented method for modeling lexical experience for tracking of oral reading fluency. An example system performs steps, including receiving a background corpus and a text and selecting a reading passage from the text. The example system further generates a surprisal model based on the background corpus and a portion of the text preceding the reading passage and iteratively receives audio of a user reciting the reading passage, wherein a token of audio from a plurality of tokens is received at a time. Then, the example system evaluates oral reading fluency based on the audio. The example system selects a next reading passage and stores an oral reading fluency report in a computer readable medium.
As another example, a method for modeling lexical experience for tracking of oral reading fluency is presented. A background corpus and a text are received. A reading passage from the text is selected. A surprisal model based on the background corpus and a portion of the text preceding the reading passage is generated. Audio of a user reciting the reading passage is iteratively received, wherein a token of audio from a plurality of tokens is received at a time. Oral reading fluency is evaluated based on the audio and the surprisal model. A next reading passage is selected. An oral reading fluency report is stored in a computer readable medium.
As a further example, a computer-readable medium is encoded with instructions for commanding one or more data processors to execute a method for modeling lexical experience for tracking of oral reading fluency. The example method includes receiving a background corpus and a text and selecting a reading passage from the text. The example method further generates a surprisal model based on the background corpus and a portion of the text preceding the reading passage and iteratively receives audio of a user reciting the reading passage, wherein a token of audio from a plurality of tokens is received at a time. Then, the method performs additional steps, including evaluating oral reading fluency based on the audio and the surprisal model, selecting a next reading passage, and storing an oral reading fluency report in a computer readable medium.
The following disclosure provides many different embodiments, or examples, for implementing different features of the provided subject matter. Specific examples of components and arrangements are described below to simplify the present disclosure. These are, of course, merely examples and are not intended to be limiting. For example, the formation of a first feature over or on a second feature in the description that follows may include embodiments in which the first and second features are formed in direct contact, and may also include embodiments in which additional features may be formed between the first and second features, such that the first and second features may not be in direct contact. In addition, the present disclosure may repeat reference numerals and/or letters in some various examples. This repetition is for the purpose of simplicity and clarity and does not in itself dictate a relationship between some various embodiments and/or configurations discussed.
Teaching new language learners, such as young children, the skill of reading is one of the major tasks of an education system. In the U.S., a common solution to monitoring the development of reading skills is the periodic administration of ORF tests, where fluency scores can serve as indicators of early literacy skills. For example, a popular DIBELS (Dynamic Indicators of Basic Early Literacy Skills) test may be administered to students three times a year to measure progress. In a DIBELS test, a specific passage is given in a particular grade at a particular time. ORF is typically measured as words read correctly per minute of oral reading (wcpm), which accounts for both accuracy and speed. Each passage is normalized so that a student's performance could be mapped to a percentile score relative to peers.
One of the weaknesses of this system is the need to administer a specific, pre-set assessment passage. This may take time away from a student reading for learning and pleasure to read for a test. It may also take away agency from teachers and students in choosing what to read, where choice and interest may enhance engagement and performance. A reading application (e.g., a web or mobile application) can, in some instances, address these weaknesses by letting language learners read different books aloud as a learning-and-pleasure activity and measuring ORF in the background. Such a solution may also allow for more continuous monitoring, providing language learners more opportunities to evaluate their ORF skill.
One challenge with reading applications is creating passage-specific norms. Passage-specific norms help identify and account for sources of variance in wcpm measurements other than the reader's fluency. Passage effects may be a source of such variance. Passage effects may include text complexity, genre, local discourse structure, and prosody. Another important source of variance is a language learner's familiarity with a specific word in the passage. For example, a language learner may stumble on a word the first time they encounter it due to the word's unfamiliarity, but the 50th encounter is likely to be less challenging.
Systems and methods for modeling of lexical experience for tracking oral reading fluency are described herein. Systems and methods herein may account for the fact that a user often does not read a book with a blank lexical slate. For example, when a user is reading a book, the within-the-book experience may be a continuation of an ongoing lexical experience that accumulates across prior reading materials and other language experiences. Systems and methods herein may model a user's prior knowledge using a background corpus with a current experience (e.g., the current book being read) viewed as an addition to the background corpus. This addition may be updated dynamically, e.g., one word at a time. For a word in a book, systems and methods herein may use a measure of surprisal at seeing this word at this location in the book, namely, the likely familiarity with the word given the starting background knowledge and the within-book experience up until the current location. Thus, an ORF monitoring system may be forgiving of errors and pauses the first one or more times a word is encountered, but less forgiving after a word has been encountered multiple (e.g., many) times.
The surprisal engine 140 generates a fluency output 150 by creating a model to calculate ORF based on the background corpus 110, the book 120, and the audio 130. The fluency output 150 may be an ORF report. The fluency output 150 may include measurements of wcpm controlling for a surprisal feature generated in surprisal engine 140. The fluency output 150 may further control for any or all of the following: text complexity, prosody, grade level of the user, and the user's progress in the book (e.g., chapter n of m chapters). The fluency output 150 may also compare the wcpm to a word-per-minute measurement of a “reading” generated by a text-to-speech synthesizer.
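As a non-limiting illustration, the wcpm measurement and its comparison to a synthesized reading may be sketched as follows. The alignment method (Python's `difflib.SequenceMatcher`), the function names `words_correct_per_minute` and `relative_rate`, and the sample passage are assumptions introduced for this sketch; the description above does not specify how recited tokens are matched to passage tokens.

```python
from difflib import SequenceMatcher


def words_correct_per_minute(passage_tokens, spoken_tokens, seconds):
    """Return words read correctly per minute (wcpm).

    Assumption: a passage token counts as read correctly when it falls
    inside a block that SequenceMatcher aligns with the transcript.
    """
    if seconds <= 0:
        raise ValueError("seconds must be positive")
    matcher = SequenceMatcher(a=passage_tokens, b=spoken_tokens, autojunk=False)
    correct = sum(block.size for block in matcher.get_matching_blocks())
    return correct * 60.0 / seconds


def relative_rate(wcpm, tts_wpm):
    """Ratio of the reader's wcpm to a synthesized reading's words per minute."""
    return wcpm / tts_wpm


# Hypothetical passage and transcript; the reader skipped the word "dark".
passage = "the owl flew over the dark forest".split()
spoken = "the owl flew over the forest".split()
wcpm = words_correct_per_minute(passage, spoken, seconds=4.0)
ratio = relative_rate(wcpm, tts_wpm=180.0)  # 180 wpm is a made-up TTS rate
```

In this sketch, omitted or misread words lower the count of correct words, while inserted words only cost the reader time.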
Pre-existing unigram counts may be used for different corpora. In some embodiments, raw counts (such as for BNC and SUBT) are used. In other embodiments, a method of deriving the probability estimates from the standard frequency indices (such as for SFI and TASA corpora) using the reversed estimate-to-standard frequency transformation and the published total corpus sizes may be used. The number of tokens and number of types for each corpus are shown in
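As a non-limiting illustration, one way to reverse a standard frequency index into a probability estimate is sketched below. The sketch assumes the commonly used definition SFI = 10 × (log10(U) + 4), where U is the estimated frequency per million tokens; the description above does not spell out the transformation, so this formula, the function names, and the corpus size used in the example are assumptions.

```python
def sfi_to_probability(sfi):
    """Invert SFI = 10 * (log10(U) + 4), with U in occurrences per million."""
    per_million = 10 ** (sfi / 10.0 - 4.0)
    return per_million / 1_000_000


def sfi_to_count(sfi, corpus_size):
    """Approximate token count implied by an SFI value and a corpus size."""
    return sfi_to_probability(sfi) * corpus_size


# Hypothetical corpus of 10 million tokens.
p_common = sfi_to_probability(80.0)   # about one occurrence per 100 tokens
count_rare = sfi_to_count(40.0, 10_000_000)
```

Under this assumed definition, an SFI of 40 corresponds to one occurrence per million tokens, and each additional 10 points multiplies the frequency by ten.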
$$S(W = w) = \log \frac{1}{P(w)} \qquad \text{(Eq. 1)}$$
where S defines the surprisal value of W=w for a random word W and P(w) is the probability of w.
The surprisal model 610 may model the user's prior knowledge using the processed background corpus 310 and the prior consecutive passages 510 of the book 120. The P(w) of Eq. 1 may be updated continuously as a user progresses through the book, token by token. In some embodiments, the surprisal model 610 may be precomputed for every word token in a story based on the text of the story and the background corpus. The surprisal model may take the background corpus, the current token of a passage, and the current passage as inputs to generate an average surprisal score for the passage.
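As a non-limiting illustration, the token-by-token update and the per-passage average may be sketched as follows. The sketch assumes a unigram probability estimate, a base-2 logarithm, and that the current occurrence of a word is counted before its surprisal is computed, so that a first-time word has a finite surprisal. The function names and the toy counts are hypothetical.

```python
import math
from collections import Counter


def precompute_surprisal(background_counts, book_tokens):
    """Return one surprisal value (in bits) per token of the book.

    Lexical experience starts as the background corpus and grows by one
    token at a time as the reader advances through the book.
    """
    counts = Counter(background_counts)
    total = sum(counts.values())
    values = []
    for token in book_tokens:
        counts[token] += 1          # the current occurrence is counted
        total += 1
        values.append(math.log2(total / counts[token]))  # log(1 / P(w))
    return values


def passage_mean(values, start, end):
    """Average surprisal over the tokens book_tokens[start:end]."""
    span = values[start:end]
    return sum(span) / len(span)


# Hypothetical toy background counts and book text.
background = {"the": 6, "owl": 1, "flew": 1}
book = ["the", "owl", "hooted", "the", "owl"]
values = precompute_surprisal(background, book)
first_passage = passage_mean(values, 0, 3)
```

Because the values depend only on the text and the background counts, they can be computed once per book and looked up while audio is received.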
Surprisal may be the highest for a completely new word appearing the latest in a book, that is, the first occurrence in all the lexical experience thus far (the background corpus and book). In contrast, words that are generally more frequent in the background corpus than in the book 120 would become gradually more surprising, but the increase would be small, since a frequent word has accumulated a lot of prior occurrences and the impact of any new words is relatively small. For example, if a book generally has a lower frequency of the word “the” than the background corpus, “the” will become more surprising as one adds the book to their lexical experience, but since even a long single book is orders of magnitude shorter than a large corpus that models the background knowledge, the book will only have a small impact on the surprisal values of generally frequent words. Examples of words at different locations of Harry Potter and the Sorcerer's Stone (HP) and Pinocchio that are the top 3% surprisal for the passage are shown in
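As a non-limiting illustration, the contrast between a frequent word and a new word may be worked through numerically. All corpus sizes and counts below are hypothetical values chosen only for this sketch, and a unigram estimate with a base-2 logarithm is assumed.

```python
import math

# Hypothetical sizes chosen only for illustration.
BACKGROUND_TOKENS = 10_000_000
BACKGROUND_THE = 600_000      # "the" is 6% of the background corpus
BOOK_TOKENS = 100_000
BOOK_THE = 5_000              # "the" is 5% of the book


def surprisal(count, total):
    """Surprisal in bits for a word seen `count` times in `total` tokens."""
    return math.log2(total / count)


# Surprisal of "the" before the book and after the whole book is added.
before = surprisal(BACKGROUND_THE, BACKGROUND_TOKENS)
after = surprisal(BACKGROUND_THE + BOOK_THE, BACKGROUND_TOKENS + BOOK_TOKENS)
frequent_word_shift = after - before

# A word whose first-ever occurrence is the last token of the book.
new_word = surprisal(1, BACKGROUND_TOKENS + BOOK_TOKENS)

print(f"shift for 'the': {frequent_word_shift:.4f} bits")
print(f"new word at end of book: {new_word:.2f} bits")
```

With these made-up numbers, the shift for the frequent word is a small fraction of a bit, while the new word's surprisal exceeds 23 bits.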
A fluency output 150 may be generated by the oral reading fluency module 700. The fluency output 150 may be an ORF report. It may include feedback solely on the reading passage the user recited, or it may include feedback based on the reading passage and previous passages recited by the user. The fluency output 150 may be evaluated based on the surprisal model and a baseline model measuring wcpm. The wcpm may be modeled as a combination of passage, user random effects, and a number of fixed effects. The fixed effects may include the grade level of the user (to capture any systematic differences between grades), a text complexity score (TE), a words-per-minute measure of a “reading” generated using a text-to-speech synthesizer to model variation in duration of different phonemes and reasonable inter- and intra-sentential pausing (TTS), and the number of the chapter the passage is in. Surprisal is another example of a fixed effect. The fluency output may be stored in a computer readable medium. A next reading passage may be selected for the user to recite or the user's spot in the book may be saved.
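As a non-limiting illustration, the combination of random and fixed effects may be written as a linear mixed-effects model. The linear form, the normality of the random effects, and the symbols below are assumptions made for this sketch; the description names the effects but does not give an equation.

```latex
\mathrm{wcpm}_{ij} = \beta_0
  + \beta_1\,\mathrm{Grade}_i
  + \beta_2\,\mathrm{TE}_j
  + \beta_3\,\mathrm{TTS}_j
  + \beta_4\,\mathrm{Chapter}_j
  + \beta_5\,\mathrm{Surprisal}_j
  + u_i + v_j + \varepsilon_{ij},
\qquad u_i \sim \mathcal{N}(0, \sigma_u^2),\;
       v_j \sim \mathcal{N}(0, \sigma_v^2)
```

Here i indexes users and j indexes passages, u_i is the user random effect, v_j is the passage random effect, and ε_ij is the residual. Dropping the β_5 term leaves a baseline model without surprisal.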
To demonstrate the effects of surprisal on ORF reports, two example books were used to conduct studies.
An example study based on some embodiments discussed above was conducted to obtain oral reading data. The oral reading data came from 35 students in grade 4 (12) and grade 5 (23). Students read with an ORF application for up to 19 weeks, approximately three times a week for 29 minutes at a time. All the students read and finished HP. The students used consumer-level in-ear headphones with a built-in microphone. The students took turns reading aloud consecutive passages of the book with a pre-recorded audiobook narrator. When splitting the text of a chapter into reading turns, the algorithm for segmenting data discussed above was utilized with a target of 150 words per student turn and 200 words per narrator turn. A set of 1,529 recordings with as many readers as possible per passage that span the beginning, middle, and end of each of the chapters were selected for analysis—67 passages in total with 100-170 words per passage. Transcribers from a professional agency were used to transcribe the audio recordings.
Based on these recordings, ORF reports were generated. One set of reports includes surprisal, to evaluate whether surprisal explains additional passage-based variance in wcpm above and beyond baseline predictors. A baseline model was also used, which modeled a combination of passage and student random effects and a number of fixed effects. The fixed effects include the grade level of the student, text complexity score, words-per-minute measurement of a “reading” generated by Apple's text-to-speech synthesizer, and the number of the chapter the passage is in. The coefficient of the chapter variable captures the average extent of improvement in oral reading fluency per chapter. Chapter may also be used as a random slope to allow for different growth rates across participants. The surprisal model is identical to the baseline model but has an additional fixed effect: the standard deviation of surprisal values per passage using the TASA3 corpus as background.
The methods and systems described herein may be implemented using any suitable processing system with any suitable combination of hardware, software, and/or firmware, such as described below with reference to the non-limiting examples of
A disk controller 1460 interfaces one or more optional disk drives to the system bus 1452. These disk drives may be external or internal floppy drives such as 1462, external or internal CD-ROM, CD-R, CD-RW or DVD drives such as 1464, or external or internal hard drives 1466. As indicated previously, these various disk drives and disk controllers are optional devices.
Each of the element managers, real time data buffer, conveyors, file input processor, database index shared access memory loader, reference data buffer, and data managers may include a software application stored in one or more of the disk drives connected to the disk controller 1460, the ROM 1456, and/or the RAM 1458. Preferably, the processor 1454 may access each component as required.
A display interface 1468 may permit information from the bus 1452 to be displayed on a display 1470 in audio, graphic, or alphanumeric format. Communication with external devices may optionally occur using various communication ports 1472.
In addition to the standard computer-type components, the hardware may also include data input devices, such as a keyboard 1474 or other input device 1476, such as a microphone, remote control, pointer, mouse, and/or joystick.
This written description uses examples to disclose the invention, including the best mode, and also to enable a person skilled in the art to make and use the invention. The patentable scope of the invention may include other examples. For example, the systems and methods may include data signals conveyed via networks (e.g., local area network, wide area network, internet, combinations thereof, etc.), fiber optic medium, carrier waves, wireless networks, etc. for communication with one or more data processing devices. The data signals can carry any or all of the data disclosed herein that is provided to or from a device.
Additionally, the methods and systems described herein may be implemented on many different types of processing devices by program code comprising program instructions that are executable by the device processing subsystem. The software program instructions may include source code, object code, machine code, or any other stored data that is operable to cause a processing system to perform the methods and operations described herein. Other implementations may also be used, however, such as firmware or even appropriately designed hardware configured to carry out the methods and systems described herein.
The systems' and methods' data (e.g., associations, mappings, data input, data output, intermediate data results, final data results, etc.) may be stored and implemented in one or more different types of computer-implemented data stores, such as different types of storage devices and programming constructs (e.g., RAM, ROM, Flash memory, flat files, databases, programming data structures, programming variables, IF-THEN (or similar type) statement constructs, etc.). It is noted that data structures describe formats for use in organizing and storing data in databases, programs, memory, or other computer-readable media for use by a computer program.
The computer components, software modules, functions, data stores, and data structures described herein may be connected directly or indirectly to each other in order to allow the flow of data needed for their operations. It is also noted that a module or processor includes but is not limited to a unit of code that performs a software operation, and can be implemented for example as a subroutine unit of code, or as a software function unit of code, or as an object (as in an object-oriented paradigm), or as an applet, or in a computer script language, or as another type of computer code. The software components and/or functionality may be located on a single computer or distributed across multiple computers depending upon the situation at hand.
The foregoing outlines features of several embodiments so that those skilled in the art may better understand the aspects of the present disclosure. Those skilled in the art should appreciate that they may readily use the present disclosure as a basis for designing or modifying other processes and structures for carrying out the same purposes and/or achieving the same advantages of the embodiments introduced herein. Those skilled in the art should also realize that such equivalent constructions do not depart from the spirit and scope of the present disclosure, and that they may make various changes, substitutions, and alterations herein without departing from the spirit and scope of the present disclosure.
Claims
1. A computer-implemented method for modeling lexical experience for tracking of oral reading fluency comprising:
- receiving a background corpus and a text, wherein the background corpus comprises general background corpora and personalized corpora from passages that a user has read;
- selecting a reading passage from the text, wherein the reading passage comprises a plurality of reading passage tokens;
- generating a surprisal model based on the background corpus and a portion of the text preceding the reading passage to calculate a probability of a particular next token of the reading passage;
- iteratively receiving audio of the user reciting the reading passage, wherein a token of audio from a plurality of audio tokens is received at a time;
- evaluating oral reading fluency based on the audio, the reading passage, and the surprisal model by comparing the plurality of audio tokens to the plurality of reading passage tokens in view of the calculated probability of each token of the plurality of reading passage tokens;
- selecting a next reading passage; and
- storing an oral reading fluency report in a computer readable medium.
2. The method of claim 1, wherein evaluating oral reading fluency further comprises using a baseline model.
3. The method of claim 2, wherein the baseline model measures words read correctly per minute of oral reading, text complexity, genre, prosody, or local discourse structure.
4. The method of claim 1, further comprising:
- processing the background corpus and the text by tokenizing the background corpus and the text.
5. The method of claim 1, further comprising:
- processing the audio which comprises: transcribing the audio into a transcription using a speech to text transcriber; and tokenizing the transcription.
6. The method of claim 1, wherein the background corpus and the text are pre-processed to normalize spelling and handle contractions and hyphenation.
7. The method of claim 1, wherein the probability of the particular next token is continuously updated, token by token, as the user progresses through the reading passage.
8. The method of claim 1, wherein the surprisal model calculates a surprisal value of the particular next token based on a logarithm of an inverse of the probability of the particular next token.
9. The method of claim 7, wherein the surprisal value is precomputed for every token in the reading passage.
10. The method of claim 1, further comprising:
- providing the selected reading passage and the selected next reading passage to the user via a user interface.
11. A system for modeling lexical experience for tracking of oral reading fluency comprising:
- a processing system comprising one or more data processors; and
- a computer-readable medium encoded with instructions for commanding the processing system to execute steps comprising: receiving a background corpus and a text, wherein the background corpus comprises general background corpora and personalized corpora from passages that a user has read; selecting a reading passage from the text, wherein the reading passage comprises a plurality of reading passage tokens; generating a surprisal model based on the background corpus and a portion of the text preceding the reading passage to calculate a probability of a particular next token of the reading passage; iteratively receiving audio of the user reciting the reading passage, wherein a token of audio from a plurality of audio tokens is received at a time; evaluating oral reading fluency based on the audio, the reading passage, and the surprisal model by comparing the plurality of audio tokens to the plurality of reading passage tokens in view of the calculated probability of each token of the plurality of reading passage tokens; selecting a next reading passage; and storing an oral reading fluency report in a computer readable medium.
12. The system of claim 11, wherein evaluating oral reading fluency further comprises using a baseline model.
13. The system of claim 12, wherein the baseline model measures words read correctly per minute of oral reading, text complexity, genre, prosody, or local discourse structure.
14. The system of claim 11, wherein the steps further comprise:
- processing the background corpus and the text by tokenizing the background corpus and the text.
15. The system of claim 11, wherein the steps further comprise:
- processing the audio which comprises: transcribing the audio into a transcription using a speech to text transcriber; and tokenizing the transcription.
16. The system of claim 11, wherein the background corpus and the text are pre-processed to normalize spelling and handle contractions and hyphenation.
17. A non-transitory computer-readable medium encoded with instructions for commanding one or more data processors to execute steps of a method for modeling lexical experience for tracking of oral reading fluency comprising:
- receiving a background corpus and a text, wherein the background corpus comprises general background corpora and personalized corpora from passages that a user has read;
- selecting a reading passage from the text, wherein the reading passage comprises a plurality of reading passage tokens;
- generating a surprisal model based on the background corpus and a portion of the text preceding the reading passage to calculate a probability of a particular next token of the reading passage;
- iteratively receiving audio of the user reciting the reading passage, wherein a token of audio from a plurality of audio tokens is received at a time;
- evaluating oral reading fluency based on the audio, the reading passage, and the surprisal model by comparing the plurality of audio tokens to the plurality of reading passage tokens in view of the calculated probability of each token of the plurality of reading passage tokens;
- selecting a next reading passage; and
- storing an oral reading fluency report in a computer readable medium.
18. The non-transitory computer-readable medium of claim 17, wherein evaluating oral reading fluency further comprises using a baseline model.
19. The non-transitory computer-readable medium of claim 18, wherein the baseline model measures words read correctly per minute of oral reading, text complexity, genre, prosody, or local discourse structure.
20. The non-transitory computer-readable medium of claim 17, wherein the steps further comprise:
- processing the background corpus and the text by tokenizing the background corpus and the text.
21. The non-transitory computer-readable medium of claim 17, wherein the steps further comprise:
- processing the audio which comprises: transcribing the audio into a transcription using a speech to text transcriber; and tokenizing the transcription.
22. The non-transitory computer-readable medium of claim 17, wherein the background corpus and the text are pre-processed to normalize spelling and handle contractions and hyphenation.
| 11024194 | June 1, 2021 | Beigman Klebanov |
- Monsalve et al.; Lexical Surprisal as a General Predictor of Reading Time; Proceedings of the 13th Conference of the European Chapter of the Association for Computational Linguistics, Avignon, France; pp. 398-408; 2012.
- Breland et al.; The College Board Vocabulary Study; ETS Report No. 94-26; College Entrance Examination Board; pp. 1-51; 1994.
- Landauer, Thomas, Foltz, Peter, Laham, Darrell; Introduction to Latent Semantic Analysis; Discourse Processes, 25; pp. 259-284; 1998.
- Terzopoulos et al.; HelexKids: A word frequency database for Greek and Cypriot primary school children; Behavior Research Methods, 49(1); pp. 83-96; 2017.
- Tribus, Myron; Information Theory as the Basis for Thermostatics and Thermodynamics; Journal of Applied Mechanics, 28(1); pp. 1-8; 1961.
- Zeno, Susan, Ivens, Stephen, Millard, Robert, Duvvuri, Raj; The Educator's Word Frequency Guide; Brewster, NY: Touchstone Applied Science Associates; 1995.
Type: Grant
Filed: Jul 24, 2023
Date of Patent: Feb 10, 2026
Assignee: Educational Testing Service (Princeton, NJ)
Inventors: Beata Beigman Klebanov (Hopewell, NJ), Michael Suhan (Princeton, NJ)
Primary Examiner: Angela A Armstrong
Application Number: 18/357,257