International Journal of Language Testing
International Journal of Language Testing, Volume 15, Issue 2, October 2025 (Scientific article, Ministry of Science)
Articles
The present study examined the effect of artificial intelligence (AI)-mediated speaking assessment on the speaking performance and willingness to communicate (WTC) of Iraqi EFL learners. More specifically, it investigated whether AI-mediated speaking assessment enhanced the speaking performance (grammar, vocabulary, pronunciation, intonation, and fluency) of intermediate Iraqi EFL learners and whether it enhanced their WTC in English. To this end, 40 intermediate Iraqi EFL learners were randomly selected and assigned to an experimental group and a control group, each comprising 20 learners. The experimental group received ten 60-minute treatment sessions with ELSA Speech Analyzer, while the control group received no treatment. Both groups took a speaking pre-test before the treatment and a post-test at the end of the study. The Willingness to Communicate in a Foreign-Language Scale was also administered to both groups before and after the treatment. A speaking assessment rubric covering vocabulary, grammar, intonation, pronunciation, and fluency was used to assess the speaking performance of both groups. The findings demonstrated that AI-mediated speaking assessment enhanced the grammar, vocabulary, intonation, and fluency of the experimental group. However, the two groups did not differ in terms of pronunciation. Furthermore, the assessment tool enhanced the experimental group's willingness to communicate with native speakers, with non-native speakers, and in the school context. Overall, the AI-mediated speaking assessment significantly enhanced the learners' speaking performance and WTC. These findings might advance the current scholarly discourse on AI within the domains of language pedagogy and assessment.
An Exploratory Mixed Methods Study of Grammatical Range and Accuracy in IELTS: A True Diagnostic Approach to Cognitive Diagnostic Assessment (Scientific article, Ministry of Science)
In second language (L2) assessment, cognitive diagnostic assessment (CDA) has emerged as a highly practicable methodology that enables a meticulous analysis of linguistic competencies and furnishes granular insights into learners’ proficiencies and deficiencies, thereby charting precise remedial pathways. The primary objective of this research was to develop a cognitive model of the attributes underlying an International English Language Testing System (IELTS) assessment descriptor, namely grammatical range and accuracy. The model was intended to support the development of a CDA-informed diagnostic instrument that would identify candidates’ grammatical strengths and weaknesses in order to improve their performance in the IELTS. Through a multi-stage process involving qualitative data collection, interpretation, and synthesis, a comprehensive scheme emerged that categorized cognitive attributes into two main areas: (a) knowledge of grammatical forms, including pronouns, determiners/quantifiers, adjectives, adverbials, nouns/noun phrases, verbs/tenses, and prepositions, and (b) familiarity with structural nuances, including punctuation and structural sophistication. These nine micro-level attributes comprised several sub-components aligned with three proficiency classes: A1-A2, B1-B2, and C1-C2. The model laid the groundwork for developing a three-booklet multiple-choice test. In addition to pilot testing and item analysis, the study employed a saturated psychometric model to validate the CDA-informed instrument. The results confirmed the instrument’s internal consistency, validity, satisfactory fit, and effectiveness in classifying examinees based on attribute-specific mastery levels. The findings revealed severe weaknesses in punctuation, structural complexity, and verb tense usage, pinpointing crucial areas that demand targeted instructional enhancement. The theoretical implications highlight a refined understanding of grammatical competencies, while pedagogically, the results advocate for targeted teaching strategies.
Examining the Role of Lexical Sophistication, Lexical Diversity, Syntactic Sophistication, Syntactic Complexity, and Cohesion in L2 Speaking Proficiency Assessment (Scientific article, Ministry of Science)
The present study used structural equation modeling (SEM) to develop a model of L2 speaking proficiency and to investigate how lexical sophistication, lexical diversity, syntactic sophistication, syntactic complexity, and cohesion are associated with holistic scores of L2 speaking proficiency. A corpus of 419 monologues delivered by Iranian EFL learners was compiled and rated to develop the model. Based on the overall scores, the corpus was divided into independent (B1 and B2) and proficient (C1 and C2) users. The results of the SEM analysis revealed that the developed L2 speaking proficiency model had an acceptable fit, with partial generalizability across independent and proficient users. Structural regression analysis showed that, in descending order of importance, lexical diversity, lexical sophistication, syntactic sophistication, cohesion, and the indirect effect of syntactic complexity through lexical sophistication together explained 34% of the variance in L2 speaking proficiency. However, their relative importance changed depending on proficiency level. Thus, while lexical, syntactic, and cohesive features are sound predictors of L2 speaking proficiency, they function differently across proficiency groups. These findings offer valuable insights for improving speaking proficiency assessment by showing that the five features do not contribute equally to overall L2 speaking proficiency and that their order of importance varies across proficiency levels. Therefore, prioritizing indicators of L2 speaking proficiency in assessment frameworks according to their importance at each proficiency level can add to the validity and reliability of speaking assessments.
L2 Writing Feedback Literacy and Writing Engagement Across Proficiency Levels: Focus on EFL Learners (Scientific article, Ministry of Science)
Feedback literacy, a form of knowledge vital for developing EFL learners’ academic writing, might be associated with several factors, including learners’ second language (L2) writing engagement. In addition, variables such as learners’ proficiency level can affect the relationship between feedback literacy and writing engagement. This study investigated the relationship between writing feedback literacy and writing engagement among 234 elementary (n = 85), intermediate (n = 78), and advanced (n = 71) Iranian English as a Foreign Language (EFL) learners selected through convenience sampling. After taking a placement test, they sat for a writing task adjusted to their proficiency level. Next, they answered a scenario-based L2 Writing Engagement Measure (WEM) and completed the Writing Feedback Literacy Questionnaire (WFLQ). The Spearman rank-order correlation indicated significant positive relationships between feedback literacy and writing engagement in all three groups. However, these relationships did not differ significantly across the three groups. The study’s findings provide insights for L2 teachers, teacher trainers, and educationalists seeking to enhance students’ writing ability, feedback literacy, and writing engagement. Some suggestions for further research are proposed.
EFL Teachers’ Formative Assessment Practice: Does Teachers’ Level of Agency Matter? (Scientific article, Ministry of Science)
Although there has been ample research on English as a Foreign Language (EFL) teachers’ formative assessment practice (FAP) and on their agency separately, scant attention has been paid to the possible influence of teachers’ level of agency on their FAP. Accordingly, this study investigated EFL teachers’ FAP in the light of teachers’ agency. The initial participants, selected through convenience sampling, comprised 180 male and female Iranian EFL teachers aged 22 to 45, with teaching experience ranging from a few months to 21 years. These 180 teachers completed a Teacher Agency Questionnaire (TAQ) to identify teachers with high and low levels of agency. The 30 teachers who scored the highest and the 30 who scored the lowest on the TAQ were then selected. These 60 teachers were asked to fill out the Teacher Formative Assessment Practice Scale (TFAPS). Moreover, 15 teachers from each group were asked to take part in semi-structured interviews to explore their perceptions regarding their FAP. The results of an independent-samples t-test revealed that teachers in the high-agency group scored significantly higher than their counterparts in the low-agency group on both teacher-directed FAP (p < .001, effect size = 3.32) and student-directed FAP (p < .001, effect size = 3.44). The descriptive and qualitative comparison of the thematic analysis demonstrated marked differences between the high-agency and low-agency groups in the total number of themes, the number of theme mentions, and the content of the themes. Based on the findings, teacher educators are encouraged to enhance EFL teachers’ level of agency to improve their FAP.
Three-Parameter Item Response Theory Analysis of the Multiple-Choice Items in PIRLS 2016 (Scientific article, Ministry of Science)
The Progress in International Reading Literacy Study (PIRLS) is an international assessment that measures the reading literacy of fourth-grade students (aged 9-10 years). PIRLS aims to evaluate and compare the reading abilities of students across different countries. It assesses how well students can understand and interpret written texts, which is fundamental to their overall educational development. In this study, psychometric analyses were run on a portion of the multiple-choice items of PIRLS 2016 taken by fourth graders in the USA. The three-parameter logistic (3PL) item response theory model was used to examine the test. Discrimination, difficulty, and guessing parameters were estimated along with the fit values, reliability, item characteristic curves (ICCs), and item-person map. M2, CFI, TLI, and RMSE statistics showed that the test is reliable and that the model, overall, fits the data. The outfit and infit item-fit statistics showed that most of the items fit the 3PL model. Findings showed that while all the items have acceptable discrimination values, two items have unacceptable guessing parameters. Examination of the ICCs showed that graphical displays are important, in addition to numerical values, for examining item quality. The item-person map showed that the items do not target the whole ability scale.
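For readers less familiar with the 3PL model named in this abstract, the short Python sketch below shows the standard form of its item response function, in which the guessing parameter acts as a lower asymptote. It is only an illustration: the function name `p_correct_3pl` and the item parameter values are hypothetical, none of them come from the study, and the optional scaling constant (often written D = 1.7) is assumed to be omitted.

```python
import math


def p_correct_3pl(theta, a, b, c):
    """Probability of a correct response under the 3PL model.

    theta: examinee ability
    a: item discrimination, b: item difficulty, c: guessing (lower asymptote)
    """
    return c + (1.0 - c) / (1.0 + math.exp(-a * (theta - b)))


# Hypothetical item parameters, chosen only for illustration (not from the study).
item = {"a": 1.2, "b": 0.0, "c": 0.2}

# Evaluating the function over a range of abilities traces the item
# characteristic curve (ICC) for this item.
for theta in (-3.0, -1.0, 0.0, 1.0, 3.0):
    print(f"theta={theta:+.1f}  P(correct)={p_correct_3pl(theta, **item):.3f}")
```

When ability equals the item difficulty, the probability is c + (1 - c) / 2, and for very low ability it approaches c, which is why a large guessing parameter limits how well an item separates low-ability examinees.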
An Investigation of Tertiary Level EFL Teachers' Language Assessment Literacy in Indonesia (Scientific article, Ministry of Science)
This article investigates language assessment literacy (LAL) among tertiary EFL teachers in Indonesia, a crucial area that remains under-explored within the context of EFL education. LAL is conceived as the knowledge, skills, and competencies required for designing, administering, and interpreting language assessments, which contribute to high-quality learning and teaching practices. Despite its importance, little is known about the LAL levels of Indonesian EFL teachers, particularly in higher education. This study addresses the gap by investigating teachers' knowledge of the fundamental assessment principles (validity, reliability, and practicality), their ability to put assessment practices to use in the classroom, the challenges they face in interpreting assessment data to inform instruction, and their experience in assessing language skills and components. A mixed-methods approach was adopted, using surveys and interviews with 297 EFL teachers from 102 higher education institutions in Indonesia. The findings indicate that EFL teachers in Indonesia face common obstacles to LAL training, especially in the practical application of assessment knowledge. This underlines the need for well-rounded professional development programs to improve teachers' LAL, focusing on specific aspects of language assessment practice. The study concludes with recommendations on how teacher education programs and assessment practices can be improved in the EFL context in Indonesia. A key recommendation is that institutions and policymakers should integrate comprehensive LAL programs into teacher education curricula for both pre-service and in-service training.
Portfolio as an Assessment Tool: Impact on Student Participation and Improvement in EFL Learning (Scientific article, Ministry of Science)
The use of portfolios as a tool for assessment in educational settings has attracted significant attention in recent years due to its potential to capture multifaceted aspects of student learning. This study investigates students' perceptions of portfolios as an assessment tool, examining their experiences, attitudes, and beliefs regarding the effectiveness of portfolios in assessing their learning progress and achievements. Additionally, the study explores the influence of factors such as prior exposure to portfolio assessment, instructional support, and personal learning styles on students' perceptions. The research examined students’ responses to portfolio-based assessments over three academic years, from 2020 to 2023. The respondents were third-year elementary education students enrolled in the Faculty of Education at the public University of Gjakova, Kosovo, where the research was conducted. The study also incorporated a focus group of five English language teachers to examine the advantages and disadvantages of portfolio assessment, as well as the challenges associated with evaluating students through this method. A qualitative research approach was employed, using qualitative analysis for the questionnaire responses and a descriptive approach for the data collected from teachers. The analysis of student outcomes yields several insights into portfolio-based assessment. The findings suggest that this assessment method enables a more convenient preparation process, as students have ample time, a supportive environment, such as their homes, and the flexibility to engage in learning according to their own preferences and styles. According to the teachers, the portfolio appears straightforward yet presents significant complexities in implementation and assessment. As practical implications, the study recommends integrating portfolio assessment into curriculum design and providing structured guidelines and support to enhance the effectiveness of portfolio-based assessment in educational settings.
Indonesian English Teachers’ Perceptions of Local English Varieties and their Integration in Language Assessment Practices (Scientific article, Ministry of Science)
This study investigates Indonesian English teachers’ perceptions of local English varieties and their influence on assessment practices, a crucial area given the increasing diversity of English in global and local contexts. While existing research highlights the recognition of World Englishes and their pedagogical implications, little is known about how teachers’ perceptions shape the evaluation of diverse English forms within Indonesia’s assessment system. The study aims to explore perceptions across different teaching levels, experience, and school locations, and to understand how these attitudes impact assessment decisions. Utilizing a quantitative, cross-sectional survey design, the research sampled 42 teachers from various Indonesian provinces through purposive sampling. Data were collected via a structured questionnaire and analyzed using SPSS to identify perception patterns and differences across subgroups. Findings reveal that teachers generally acknowledge the importance of local varieties but tend to prioritize native norms in assessments, with significant variations based on teaching levels, years of experience, and school locations. The study underscores the need for inclusive evaluation strategies that embrace linguistic diversity, informing professional development and policy reforms. Practically, these insights can guide the development of assessment policies that support multilingual identities and promote equitable language teaching, thereby enhancing the authenticity and relevance of English instruction in Indonesia.
Speaking Test Anxiety among Adult Saudi EFL Learners: Causes, Factors, and Suggested Solutions (Scientific article, Ministry of Science)
Speaking test anxiety is a pervasive challenge for adult English as a Foreign Language (EFL) learners, particularly in contexts where English proficiency is tied to academic and professional advancement. This empirical study investigates the causes of, contributing factors to, and potential strategies for coping with speaking test anxiety among adult EFL learners in Saudi Arabia. Using a mixed-methods approach, the research combines quantitative survey data with qualitative interpretations to provide a comprehensive understanding of the phenomenon. Sixty-eight male undergraduate students, selected through purposive sampling, participated in this research. The data were collected via two instruments: a modified FLCAS and a survey prepared by the researcher. The findings reveal that cultural expectations, fear of negative evaluation, infrequent speaking tests, and limited speaking practice contribute significantly to the participants' speaking test anxiety. The study is significant in the Saudi Arabian context because it identifies the causes of English speaking and test anxiety among adult learners and suggests test-taking strategies for coping with test stress and related anxiety. The study concludes with pedagogical recommendations to mitigate anxiety and enhance speaking performance.