L2 Korean Pronunciation Over Time - A Graphical Look at Error Rates

I am working on a project that investigates the effect of a supplemental pronunciation instructional treatment for learners of Korean as a foreign language. As part of this project, in the spring semester we had 36 learners in their first or second year of Korean study complete speaking tasks 11 weeks apart. Learners were split into a control group and a treatment group, the latter of which received 8 hours of classroom-based supplemental instruction over the course of 9 weeks. One analysis I'm looking at is how their error rates changed over time.

Today, I looked at one task: a paragraph read-aloud. The pre- and post-test recordings were phonemically transcribed by two native speakers of Korean. On the roughly 12% of the data that both coders transcribed, their agreement was 93%; the disagreements were resolved via discussion before each coder began working independently. When the transcripts came in, I tallied the total sentences, words, syllables, and phonemes, and then coded each deviation from the script as either a segment error (e.g., a single segment-for-segment substitution) or a syllable error (e.g., deleting a syllable, or inserting one where there shouldn't be one). I also looked at sentence intonation, which ultimately wasn't very interesting. Like many languages, Korean uses falling intonation for declarative sentences, and most learners had no problem with this (it will probably be more interesting in another task I have to analyze, though). Because the task was timed and not all learners completed the full paragraph, and because I eliminated "rough starts" (conventionally, first utterances with stutters/hesitations are deleted), I normed the error counts to get rates.
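For readers who want to see the norming step spelled out, here is a minimal sketch in Python (the plots themselves were made in R). The tallies are invented for illustration, and the choice of denominators (segment errors per 100 phonemes, syllable errors per 100 syllables) is an assumption; the function name `error_rate` is hypothetical.

```python
def error_rate(error_count, unit_count, per=100):
    """Return the number of errors per `per` units (e.g., per 100 syllables)."""
    if unit_count <= 0:
        raise ValueError("unit_count must be positive")
    return per * error_count / unit_count


# Hypothetical tallies for one learner's recording (not real data).
tally = {
    "phonemes": 400,
    "syllables": 180,
    "segment_errors": 24,
    "syllable_errors": 9,
}

# Assumed denominators: phonemes for segment errors, syllables for syllable errors.
segment_rate = error_rate(tally["segment_errors"], tally["phonemes"])
syllable_rate = error_rate(tally["syllable_errors"], tally["syllables"])

print(f"{segment_rate:.1f} segment errors per 100 phonemes")
print(f"{syllable_rate:.1f} syllable errors per 100 syllables")
```

Norming this way puts a learner who read only half the paragraph on the same scale as one who finished it.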

So without further ado, plots! Each plot compares control (C) and treatment (T) groups. The color of the lines shows L1 (English or Chinese), and the shape of the points shows year in the program (first or second). Each line and set of points represents one student, so these plots highlight individual variation and provide a way to visually evaluate change patterns among the groups.

Fig. 1. Segmental error rates over time.

Fig. 2. Syllable error rates over time.

Fig. 3. Total error rates (based on total syllables) over time, including segmental, syllable, and sentence intonation errors.

These plots could use some tweaks, but overall I'm pretty happy with what I'm able to see. As I've seen in other data from the project, the control group did, overall, improve their pronunciation. I think this is partially explained by the large number of beginners in the group: at such early stages of language learning, a lot of phonological/pronunciation development happens quite rapidly just through standard instruction and input. But looking at the individual data, things look noisier on the left (control) side, with a bit more backsliding (more people whose error rates increased). Second-year students (the triangles) in both groups seem to show lower error rates and less improvement.

Another trend seems to be that learners with higher error rates to start with see larger improvements; this is more apparent in the control group but also visible in the treatment group. It's also interesting to see the differences between L1 Chinese and L1 English learners: in syllable structure errors in particular, Chinese students generally had higher error rates to begin with, but also made some substantial improvements.
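The visual impressions above (overall improvement, backsliding) could be checked with a quick tabulation. The following is only a sketch with made-up records, not the project's data; the record layout and the function name `summarize_change` are hypothetical.

```python
from collections import defaultdict

# Hypothetical records (not real data): (learner_id, group, pre_rate, post_rate)
records = [
    ("s01", "C", 12.0, 9.5),
    ("s02", "C", 6.0, 7.0),
    ("s03", "T", 14.0, 8.0),
    ("s04", "T", 5.0, 4.5),
]


def summarize_change(rows):
    """Per group: mean change (post - pre) and how many learners' rates went up."""
    changes = defaultdict(list)
    for _, group, pre, post in rows:
        changes[group].append(post - pre)
    return {
        group: {
            "mean_change": sum(deltas) / len(deltas),
            # A positive change means the error rate rose, i.e., backsliding.
            "backsliders": sum(1 for d in deltas if d > 0),
        }
        for group, deltas in changes.items()
    }


summary = summarize_change(records)
for group, stats in sorted(summary.items()):
    print(group, stats)
```

A negative mean change means error rates fell on average; the backslider count gives a rough number for the "noisier" look of a group.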

Plots were made in R with ggplot2. Any tips appreciated!

3 comments:

  1. Great job, Dan! In Figures 1 and 3, might the L1 effect be larger in the control group than in the treatment group? I just wondered whether the figures for L1 Chinese and L1 English show similar trends.

    1. That might be the case, but off the top of my head I'm not sure. I should've posted the descriptives too!

  2. I'm glad you posted this - I was a little curious about the result. My observation is that L1 Chinese students generally struggle with syllable-final consonants because of their first language, so maybe once they picked up on that they were able to make more visible improvements.
