Sessions / Machine Learning in CALL

Extensive Reading texts generated by AI: What Learner Behaviour Reveals #4648

Time Not Set

An AI-driven system has generated over 600 stories, adaptively levelled to reader proficiency for extensive reading, initially targeting first-year university students. Linguistic complexity is adjusted at the point of generation rather than selected from a fixed corpus, allowing us to compare predicted difficulty with actual student reading behaviour. The system collects fine-grained, page-level interaction data alongside learner comments and ratings, including time on each page, stop points, and total completion. Data from over 20,000 reading sessions are analysed using behavioural features such as completion rate, speed consistency, and re-reading frequency. Using these indicators, this study examines which linguistic or narrative features of stories sustain reading, as well as specific sections that delay, disrupt, or deter progress. Elevated reading speeds suggest superficial interaction, while reduced reading speeds may indicate increased cognitive load, although reading speed can change for many intrinsic or extrinsic reasons, from fetching a coffee to not actually reading. Completion at a stable pace suggests that a text is both comprehensible and compelling. Sentiment analysis of learner comments identifies patterns associated with successful and problematic texts. These findings are examined against intended text levels, with particular attention to performance at the lower and upper ends of the proficiency range.
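A minimal sketch of how the behavioural features named in this abstract (completion rate, speed consistency, re-reading frequency) could be derived from page-level logs. This is illustrative only, not the authors' system: the event format, function name, and thresholds are all assumptions.

```python
# Illustrative sketch: deriving the three behavioural features above from
# hypothetical page-level session logs (not the study's actual pipeline).
from statistics import mean, stdev

def session_features(events, total_pages):
    """events: list of (page_number, seconds_on_page) in viewing order."""
    pages_seen = {page for page, _ in events}
    completion_rate = len(pages_seen) / total_pages
    # Re-reading: visits to a page already seen earlier in the session.
    revisits = sum(1 for i, (page, _) in enumerate(events)
                   if page in {p for p, _ in events[:i]})
    rereading_freq = revisits / len(events)
    # Speed consistency: coefficient of variation of time per page
    # (lower = steadier pace; high variation may flag skimming or pauses).
    times = [t for _, t in events]
    cv = stdev(times) / mean(times) if len(times) > 1 else 0.0
    return completion_rate, rereading_freq, cv

# A session that revisits page 2 and stops before the final page.
feats = session_features([(1, 40), (2, 45), (3, 42), (2, 15), (4, 44)],
                         total_pages=5)
```

Aggregating such per-session tuples across the 20,000+ sessions would then support the kind of comparison between predicted difficulty and observed behaviour the abstract describes.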

Speaking Without Fear: Action Research on Using AI Speak-Mode to Lower Foreign Language Speaking Anxiety #4661

Time Not Set

Foreign language anxiety (FLA), particularly related to speaking, remains a persistent challenge among Asian learners of foreign languages. With the rapid advancement of generative artificial intelligence (AI), language education is undergoing substantial transformation. Among emerging tools, ChatGPT, and especially its Speak Mode, provides real-time, low-stakes, and judgment-free interaction, offering learners opportunities for spontaneous oral practice that may relieve speaking-related anxiety. This study examines the effects of ChatGPT Speak Mode on FLA among beginner-level Japanese as a Foreign Language (JFL) learners. Two research questions guide the inquiry: (1) Does sustained AI-mediated speaking practice reduce learners’ FLA? and (2) How do learners perceive the affordances and limitations of AI-based speaking practice? Employing a quasi-experimental design, the study involves approximately 55 first-year students enrolled in beginner Japanese conversation courses at a private university in northern Taiwan. Over two semesters (32 weeks), participants engage in regular dialogue practice using ChatGPT Speak Mode. Quantitative data are collected through pre- and post-intervention FLA questionnaires and analyzed using paired-samples t-tests, while qualitative data are examined through thematic analysis. The findings aim to contribute empirical evidence to the underexplored area of long-term AI integration in JFL instruction and offer pedagogical implications for reducing speaking anxiety in foreign language classrooms.

Beyond the Hype: A Mixed-Methods Analysis of Successes and Setbacks in AI-Assisted EFL Instruction in Vietnam #4669

Time Not Set

With AI chatbots moving from novelty to mainstream instructional tool, it is imperative to determine what constitutes a pedagogical "prevail" versus a "fail". This study investigates how Vietnamese EFL instructors use AI tools and the elements that contribute to success versus pedagogic "friction". Framed by the UNESCO AI Competency Framework (2024) and TPACK, this sequential mixed-methods study (N = 40) used an online survey instrument followed by semi-structured interviews with eight of the most frequent users of AI tools. Data were analyzed to identify "critical incidents": the specific moments when AI-generated materials or responses did or did not enhance learning. Preliminary results indicate that "prevails" are typically efficient adaptations of educational materials and diagnostic feedback, whereas "fails" are most commonly associated with cognitive overload and task designs that are overly generic or insufficiently contextualized. Furthermore, the study identifies how teacher identity and institutional constraints affect these outcomes. This research provides a realistic guide to AI adoption in the Global South. It also offers practical design principles for teacher training programs, arguing for a move beyond basic "tool literacy" toward "AI pedagogical resilience", so that educators can adjust when technology does not meet educational objectives.

Investigating the Validity of Accessible Automated Pronunciation Assessment Using Classroom and Corpus Data #4686

Time Not Set

Assessing pronunciation accuracy, fluency, and prosody is challenging due to substantial variability in human perceptions of speech production. Automated pronunciation assessment tools have therefore been proposed as scalable supports for both assessment and speaking development. Among these tools, Azure Pronunciation Assessment provides automated scoring across 33 languages at relatively low cost. This study examines the convergent, predictive, and construct validity of Azure’s measures of pronunciation accuracy, fluency, and prosody, with prosody operationalized according to Azure’s definition of naturalness in speech, including stress, intonation, speaking rate, and rhythm.

Analyses of approximately 3,510 speech samples from the ICNALE dataset show that all three measures are strongly associated with CEFR proficiency levels and rank among the strongest CEFR predictors when compared with established indices of lexical diversity and syntactic complexity. In addition, analyses of classroom speech data from 66 learners in Korea, Japan, and China reveal moderate to strong correlations between human ratings and Azure scores across all three constructs.

These findings suggest that Azure Pronunciation Assessment can provide valid, fine-grained feedback to support pronunciation-focused instruction and learning. However, the analyses rely on a fixed reference transcript (“Please Call Stella”), which may limit generalizability across task types, accents, and speaking contexts.
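The "moderate to strong correlations between human ratings and Azure scores" reported above amount to a convergent-validity check. A minimal sketch of such a check, with a plain Pearson correlation over hypothetical ratings; the data values and variable names are invented for illustration, not taken from the study.

```python
# Minimal convergent-validity sketch: Pearson correlation between human
# ratings and automated scores. All data here are hypothetical.
from math import sqrt

def pearson(xs, ys):
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

human = [3.0, 3.5, 2.0, 4.5, 4.0, 2.5]   # e.g. mean human rating per learner
auto  = [62,  70,  48,  88,  79,  55]    # e.g. automated accuracy score (0-100)
r = pearson(human, auto)
```

In practice one would run this per construct (accuracy, fluency, prosody) and per rater, and use rank-based correlation if the human ratings are ordinal.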

Exploring Prosodic Prominence Control in Synthesized Golden Speaker’s Speech for Pronunciation Training #4496

Time Not Set

Advances in text-to-speech (TTS) technology have created new opportunities for pronunciation training. Zero-shot TTS (ZS-TTS) models, for example, are capable of synthesizing speech in a learner’s own voice while producing more native-like pronunciation, the so-called golden speaker. In addition, some models support instruction-based generation, which allows for expressive modifications such as emphasizing specific words or inserting breath pauses and laughter within an utterance. Prosodic elements such as prominence affect intelligibility for listeners and can modify the meaning of discourse. However, computer-assisted pronunciation training (CAPT) research has largely focused on segmental features. This study investigates whether instruction-based ZS-TTS can generate pedagogically exaggerated prominence patterns using the learner’s voice, with the potential to enhance pronunciation and listening training. Using CosyVoice2, a ZS-TTS model with emphasis instruction control, this study investigates whether marking target words with those instructions results in measurable acoustic changes associated with prominence. Controlled sentence pairs will be synthesized in neutral and emphasis-marked conditions, including multiple-hypothesis cases in which different words within the same sentence are emphasized to alter the discourse nuance. Acoustic analyses will focus on relative pitch variation, word duration, and intensity. This exploratory work aims to examine whether instruction-based prominence control is consistent and pedagogically meaningful.
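The planned acoustic comparison can be sketched as relative change per measure between the neutral and emphasis-marked conditions. The measurement values, the 10% threshold, and the function name below are all hypothetical illustrations, not the study's actual criteria.

```python
# Illustrative sketch of the planned comparison: relative change in pitch,
# duration, and intensity for a target word between neutral and
# emphasis-marked synthesis conditions (hypothetical measurements).
def relative_change(neutral, emphasized):
    """Proportional change for each acoustic measure of the target word."""
    return {k: (emphasized[k] - neutral[k]) / neutral[k] for k in neutral}

neutral    = {"f0_hz": 180.0, "duration_s": 0.30, "intensity_db": 62.0}
emphasized = {"f0_hz": 214.0, "duration_s": 0.39, "intensity_db": 66.5}
deltas = relative_change(neutral, emphasized)

# A word might be flagged as acoustically prominent if, say, both pitch and
# duration rise beyond a chosen threshold (10% here is arbitrary).
prominent = deltas["f0_hz"] > 0.10 and deltas["duration_s"] > 0.10
```

Consistency of the instruction-based control could then be assessed by checking whether the marked word, and only the marked word, crosses such thresholds across many synthesized sentence pairs.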

Thinking Like a Programmer: Leveraging Agentic AI and Computational Thinking for CALL Researchers #4518

Time Not Set

In the evolving landscape of Computer-Assisted Language Learning (CALL), agentic AI tools such as Gemini CLI, Antithesis, and Claude Code are revolutionizing software development. However, their potential for academic research remains largely untapped. This hands-on workshop demonstrates how researchers can adopt a "programmer’s mindset" to enhance workflows, data analysis, and information gathering. Grounded in the framework of computational thinking (Wing, 2006), the session teaches participants to approach prompting methodically—moving beyond simple chat interfaces toward structured, automated processes. The workshop is divided into two practical components. First, attendees will learn how to install and manage agentic AI tools locally, providing greater control over their research environment. Second, participants will be introduced to core computational thinking principles—such as decomposition and algorithmic design—alongside basic programming concepts to automate some tasks and boost productivity. By bridging the gap between software engineering and applied linguistics, this workshop empowers CALL researchers to leverage cutting-edge AI as a sophisticated research assistant. Participants will leave with the foundational skills necessary to transform their approach to data and research design. No prior programming experience is required.
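As a toy illustration of the decomposition principle this workshop teaches, a small research task (keyword frequency across texts) can be broken into separately testable steps rather than handled as one monolithic prompt or script. The task, function names, and sample texts below are invented for illustration.

```python
# Toy example of computational-thinking decomposition: a research task
# split into small, independently testable functions.
import re
from collections import Counter

def tokenize(text):
    """Step 1: normalize and split a text into word tokens."""
    return re.findall(r"[a-z']+", text.lower())

def count_terms(texts, terms):
    """Step 2: count occurrences of selected terms across all texts."""
    counts = Counter(tok for t in texts for tok in tokenize(t))
    return {term: counts[term] for term in terms}

abstracts = ["AI tools support learners.", "Learners use AI daily."]
freqs = count_terms(abstracts, ["ai", "learners"])
```

The same decomposition habit transfers directly to prompting an agentic tool: each small, well-defined step is easier to specify, verify, and automate than the whole task at once.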

AI as a Communicative Resource: Japanese Business Professionals’ Use of AI in English-Mediated Work #4522

Time Not Set

This study reports on a survey-based investigation into how Japanese business professionals use AI-assisted language tools (e.g., ChatGPT and DeepL) in English-mediated workplace communication. While Computer-Assisted Language Learning (CALL) research has traditionally focused on classroom-based language development, less attention has been paid to the role of AI in real-world professional communication contexts.

Drawing on the frameworks of English as a Business Lingua Franca (BELF) and multimodality, this study conceptualizes AI not simply as a learning support tool but as an integral communicative resource embedded in everyday business practices. Data were collected from approximately 100 Japanese professionals across a range of industries through an online survey examining purposes of AI use, perceived benefits, and impacts on English communication.

Preliminary findings indicate that participants use AI tools not only to improve linguistic accuracy but also for idea generation, pragmatic adjustment, and confidence-building. These patterns suggest that AI functions as a mode of communication and a partner in meaning-making rather than merely a correction tool.

The presentation discusses implications for CALL research and business English pedagogy, arguing for an expanded understanding of CALL that incorporates AI-mediated professional communication beyond educational settings, and outlining directions for future research and classroom-informed innovation and practice worldwide.

AI-Generated Elicited Imitation Materials for EFL Speaking: Comparing three LLMs #4533

Time Not Set

Japanese EFL learners struggle with speaking comprehensibility and intelligibility, and while High Variability Phonetic Training (HVPT) and Elicited Imitation (EI) show promise, creating level-appropriate EI materials at scale requires efficient methods. This study investigates whether AI can generate appropriate EI sentences for CEFR A2-B1 learners by comparing three AI models (ChatGPT-5, Claude-Sonnet-4, Gemini-1.5-Flash) generating EI sentences through systematic prompt engineering. Materials (n=150 per model, 50 per CEFR level) were evaluated by three expert raters on grammatical complexity, vocabulary appropriateness, and sentence length suitability. MANOVA revealed significant multivariate effects (Wilks' λ = 0.731, F(6, 888) = 23.18, p < .001, η²p = .145), with Claude significantly outperforming other models on vocabulary appropriateness (F(2, 447) = 42.67, p < .001, η²p = .160) and sentence length suitability (F(2, 447) = 35.24, p < .001, η²p = .136), while showing no differences in grammatical complexity. Results establish Claude as optimal for generating level-appropriate speaking materials, enabling scalable, personalized EFL assessment tools. This advances ML in CALL by demonstrating systematic AI model comparison for pedagogical material generation, supporting subsequent phases investigating HVPT's impact on speaking performance.
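The partial eta-squared values reported in this abstract can be recovered directly from the F statistics and degrees of freedom via the standard identity η²p = F·df1 / (F·df1 + df2), which serves as a quick sanity check on the reported effect sizes.

```python
# Recovering the reported partial eta-squared effect sizes from the
# F statistics and degrees of freedom given in the abstract.
def partial_eta_squared(f, df1, df2):
    """eta_p^2 = SS_effect / (SS_effect + SS_error) = F*df1 / (F*df1 + df2)."""
    return f * df1 / (f * df1 + df2)

vocab  = partial_eta_squared(42.67, 2, 447)   # reported as .160
length = partial_eta_squared(35.24, 2, 447)   # reported as .136
```

Both computed values match the reported effect sizes to three decimals, so the univariate statistics in the abstract are internally consistent.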

From Detection to Instruction: Developing EFL Pragmatic Competence Through Human-AI Comparative Analysis of Multimodal Sarcasm #4551

Time Not Set

This research-based presentation reports on a pedagogical intervention using human-AI performance comparisons to teach sarcasm detection to Japanese EFL learners. Initial findings revealed that native speakers achieved only 60% accuracy while non-native speakers performed at chance levels (51%) on 200 memes, with AI models achieving comparable performance (54-57%). These insights informed a three-phase instructional framework implemented across 7-8 sessions with 34 participants using a pre/post quasi-experimental design. Phase 1 built awareness of semantic-pragmatic incongruity, Phase 2 employed computational pattern analysis to identify linguistic markers, and Phase 3 addressed ambiguous cases where human-model disagreement was highest. Preliminary results suggest that systematic exposure to computationally identified patterns combined with explicit metalinguistic instruction improves pragmatic competence in digital contexts. The presentation demonstrates how NLP model analysis can identify specific error patterns to guide targeted instruction, while highlighting the continued importance of human cultural expertise. Participants will learn practical strategies for integrating computational insights into pragmatic instruction and receive access to our validated multimodal corpus. This work contributes to CALL by bridging computational linguistics with pedagogical practice, offering evidence-based approaches for teaching challenging pragmatic features in technology-mediated environments.
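Selecting the "ambiguous cases where human-model disagreement was highest" is itself a simple ranking operation. A sketch under stated assumptions: the item IDs, rates, and function name are hypothetical, and real analyses would likely work from per-rater and per-model labels rather than pre-aggregated rates.

```python
# Illustrative sketch (hypothetical data): ranking meme items by the gap
# between human and model sarcasm judgments, to pick Phase 3 targets.
def rank_by_disagreement(items):
    """items: {item_id: (human_sarcastic_rate, model_sarcastic_rate)}.
    Returns item ids ordered by |human - model| divergence, largest first."""
    return sorted(items, key=lambda i: abs(items[i][0] - items[i][1]),
                  reverse=True)

memes = {"m1": (0.9, 0.85),   # humans and models largely agree
         "m2": (0.8, 0.2),    # strong disagreement: prime Phase 3 material
         "m3": (0.5, 0.4)}
ranked = rank_by_disagreement(memes)
```

The top-ranked items would then receive the explicit metalinguistic treatment the abstract describes for the most ambiguous cases.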

Multimodal Apprenticeship with GenAI Chatbots: ESL Learners’ Reflections on Vlog-Based English Oral Communication #4553

Time Not Set

Short-form video production and generative artificial intelligence (GenAI) tools are reshaping how learners compose, perform, and refine multimodal texts. While vlogs and GenAI chatbots have individually gained traction in language education, little is known about how learners experience these tools together as part of a multimodal apprenticeship. This study examines the written reflections of Malaysian ESL undergraduates after completing a vlog-based English oral communication project that utilised GenAI chatbots as optional support tools. Thematic analysis of 34 reflections revealed how learners used chatbots to brainstorm ideas, draft scripts, refine storylines, and receive guidance in editing and production. The findings demonstrate how learners engaged with multimodal compositional processes, from planning and scripting to filming and editing, while developing autonomy, confidence, and technical competence. Drawing on self-determination theory (SDT) and the zone of proximal development (ZPD), the analysis shows how GenAI chatbots functioned as cognitive scaffolds and co-creators, supporting the autonomy, competence, and relatedness emphasized in these frameworks, while peer and human examples remained central sources of modelling. Learners valued vlogging as a space for self-expression and identity work, yet challenges were reported in relation to AI accuracy, time, and technical constraints.

AI Exploration Projects for English Language Learners #4575

Time Not Set

This presentation reflects on a project that aimed to help Japanese university students use artificial intelligence (AI) creatively and critically for English learning. The project involved in-class group work and out-of-class self-study. In the initial stage, groups created a three-week plan, selecting AI-based self-study tasks that addressed self-selected learning objectives. Each week they individually put the plan into action and recorded study notes outside of class, then met their group and teacher in class to reflect on their experiences, brainstorm solutions for problems, and modify their plans when necessary. At the project’s conclusion, students submitted a reflective report about their experiences and learning outcomes. Twenty-six students from eleven groups consented to their coursework being treated as data. Groups with different learning goals (conversation skills, writing, reading, vocabulary) will be presented, drawing on examples of learning plans, problems faced, problem-solving strategies, and students’ reflections on using AI as a learning tool. The presentation will also include reflections on project design, implementation, and initiatives taken by the teacher to actively support students, along with changes planned for future projects. This study highlights the importance of teacher guidance for student exploration of AI and could inform practice in other contexts.