International Journal of Language Testing, Volume 12, Issue 2, Summer and Autumn 2022
Articles
The COVID-19 pandemic upended the status quo of education worldwide, forcing a transition from face-to-face instruction to online e-learning. This unexpected transformation required teachers to teach while frequently assessing students' performance through online assessment, which has been shown to expose them to an array of unanticipated challenges. The current study therefore set out to unearth Iranian EFL teachers' perceptions of online assessment and to illustrate the challenges they encountered, adopting a sequential mixed-methods design. In the quantitative phase, one hundred EFL teachers responded to an online questionnaire constructed in Google Forms. For the qualitative phase, twelve teachers were recruited from this pool of one hundred respondents to take part in semi-structured interviews. Descriptive statistics from the questionnaire revealed that Iranian EFL teachers held either a negative or a neutral attitude toward online assessment. The interview results offered further insight into the challenges teachers encounter during online assessment: the three most frequently reported were the high risk of student cheating and plagiarism, internet connectivity problems, and poor technological infrastructure. The findings give voice to a group of Iranian EFL teachers' perceptions of, and challenges with, online assessment during the COVID-19 pandemic. In light of the findings, recommendations and suggestions for further investigation are discussed.
Analytic Assessment of TEFL Undergraduate Students' Writings: Diagnosing Areas of Strength and Weakness
Assessment affects every stage of learners' learning; without it, evaluating the effectiveness of the teaching and learning process is very difficult. Understanding learners' weak points and difficulties in writing helps teachers and administrators focus on these areas. This content analysis study therefore investigated the relative contribution of each writing component to the variation in the overall writing performance of TEFL undergraduate students, in order to determine areas of strength and weakness in their writing. To this end, writing samples from 73 volunteer students at different universities in Tabriz, Iran, were assessed with an analytical scoring scale. The participants were divided into two groups: group 1, first- and second-year students, and group 2, third- and fourth-year students. Stepwise multiple linear regression analysis revealed that the greatest contributions to the variance of the total writing rating came from the grammar and punctuation components in both groups, while the smallest contributions came from content and spelling in group 1 and from vocabulary and spelling in group 2. The results also indicated that participants' weaknesses in both groups were mostly related to grammatical structure and the use of punctuation marks in paragraphs. The positive effects of the analytical scoring scale, as well as the implications of the findings, are also discussed.
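As a concrete illustration of the statistical procedure named above, the following is a minimal sketch of forward stepwise selection over analytic writing components, assuming hypothetical column names and a CSV layout; it is not the study's actual code or rubric.

```python
# Hypothetical sketch: forward stepwise selection of analytic writing
# components with statsmodels. Column and file names are illustrative
# assumptions, not the study's actual data or rubric labels.
import pandas as pd
import statsmodels.api as sm

def forward_stepwise(df, components, target, alpha=0.05):
    """Add the component with the smallest p-value at each step,
    stopping when no remaining predictor is significant."""
    selected = []
    remaining = list(components)
    while remaining:
        pvals = {}
        for c in remaining:
            X = sm.add_constant(df[selected + [c]])
            model = sm.OLS(df[target], X).fit()
            pvals[c] = model.pvalues[c]
        best = min(pvals, key=pvals.get)
        if pvals[best] >= alpha:
            break
        selected.append(best)
        remaining.remove(best)
    return selected

scores = pd.read_csv("writing_scores.csv")  # assumed file layout
components = ["content", "organization", "vocabulary",
              "grammar", "punctuation", "spelling"]
order = forward_stepwise(scores, components, target="total")
print("Entry order (proxy for relative contribution):", order)
```

Components entering earlier explain more unique variance in the total rating, which mirrors how such a study would rank their contributions.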
A Bilingual Version of the Vocabulary Size Test for Speakers of Spanish
The objective of this study was to validate a bilingual Spanish-English version of the Vocabulary Size Test (VST), considering its potential use as a discriminator between learners in terms of language competence. This version was designed based on the two forms available on one of the test creators’ websites, following recommended practices regarding the elimination of cognates and loanwords. A one-way ANOVA was used to confirm the test’s capacity to discriminate among learners of different linguistic competence, and Principal Axis Factoring (PAF) was conducted to verify the existence of a single underlying factor. As a result of this study, a VST version for Spanish speakers consisting of nine vocabulary frequency levels is shared. This version is in line with validation standards put forward in previous research. It is expected that this instrument will help future studies that seek to measure Spanish speakers’ competence in English as a foreign or second language without the interference of intervening factors such as cognates and loanwords.
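For readers who wish to replicate the two analyses mentioned, the sketch below runs a one-way ANOVA over proficiency groups and a single-factor Principal Axis Factoring of the nine frequency-level subscores in Python; the file and column names are assumptions, and the factor_analyzer call reflects that package's documented API as best understood here.

```python
# Hedged sketch: checking group discrimination and unidimensionality.
# 'vst_long.csv' and its column names are assumptions for illustration.
import pandas as pd
from scipy.stats import f_oneway
from factor_analyzer import FactorAnalyzer  # PAF via method='principal' (assumed API)

df = pd.read_csv("vst_long.csv")  # one row per test-taker

# One-way ANOVA: do total VST scores differ across proficiency groups?
groups = [g["total"].values for _, g in df.groupby("proficiency_level")]
F, p = f_oneway(*groups)
print(f"F = {F:.2f}, p = {p:.4f}")

# Principal Axis Factoring on the nine frequency-level subscores:
# a dominant first factor supports a single underlying trait.
levels = df[[f"level_{k}" for k in range(1, 10)]]
fa = FactorAnalyzer(n_factors=1, rotation=None, method="principal")
fa.fit(levels)
print("Loadings on factor 1:\n", fa.loadings_)
```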
The Relationship Between Iranian EFL Teachers’ Conceptions of Assessment and Their Self-efficacy
This study investigated the relationship between Iranian EFL teachers’ conceptions of assessment and their self-efficacy. For this purpose, 154 Iranian EFL teachers were selected through purposeful sampling and completed the 27-item Teachers’ Conceptions of Assessment (TCoA) scale (Brown, 2006) and the 24-item teachers’ self-efficacy scale (Tschannen-Moran & Hoy, 2001). Multiple regression analysis and ANOVA were used to analyze the data. The results showed that teachers viewed assessment as a tool for determining how much their students have learned from instruction, believed that assessment results can be used to modify teaching practices, and acknowledged that assessment processes may be inaccurate. They also regarded assessment as an indicator of school performance while maintaining that assessment results should be treated cautiously. In addition, the results showed that Iranian EFL teachers had a high level of self-efficacy: they reported being good at asking appropriate questions, answering students’ difficult questions, assessing students’ learning, and providing alternative explanations and examples when learners are confused. Multiple regression analysis showed that school accountability and irrelevance predicted student engagement, student accountability predicted classroom management, and improvement predicted instructional strategies. These results may have implications for EFL teachers' professional development.
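A hedged sketch of one of the regressions implied by the abstract follows: predicting the student-engagement efficacy subscale from the four TCoA conception subscales. Column names are assumptions based on the instruments cited, not the authors' data.

```python
# Illustrative sketch (not the authors' code): regressing one
# self-efficacy subscale on the four TCoA conception subscales.
# File and column names are assumptions based on the instruments cited.
import pandas as pd
import statsmodels.formula.api as smf

data = pd.read_csv("teacher_scores.csv")  # assumed file

model = smf.ols(
    "engagement ~ improvement + school_accountability"
    " + student_accountability + irrelevance",
    data=data,
).fit()
print(model.summary())  # coefficients show which conceptions predict efficacy
```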
Iranian EFL Teachers’ Assessment Literacy Knowledge: The Impact of the Educational System on Teachers’ Classroom Assessment Practices
English language teachers’ ability to assess all areas of pupils’ learning is important for understanding how learners advance across the curriculum and for guiding their development (Livingston & Hutchinson, 2016). The educational system of Iran has decided to shift from traditional teaching methodologies toward communicative approaches, a conversion that will not become a reality unless teachers can apply it in practice. Teachers, however, have particular needs, such as expanding context-related knowledge, cooperating with colleagues, and developing their assessment literacy (Coombe, Vafadar, & Mohebbi, 2020). To identify teachers’ understandings of, practices in, and challenges with assessment, 15 English language headteachers (teachers responsible for moderating the activities of a group of teachers) took part in interviews, followed by questionnaires exploring teachers’ needs. The interviews were coded and content-analyzed independently by the researcher and an expert in assessment. The main themes and needs derived from the interview analyses are presented under eight pivotal headings. The questionnaire findings revealed the priorities teachers expressed concerning assessment literacy and classroom-based assessment needs. The paper discusses these findings with respect to supporting teachers’ professional development in assessment literacy, and implications are provided.
Psychometric Evaluation of Cloze Tests with the Rasch Model
Cloze tests are gap-filling tests designed to measure overall language ability and reading comprehension in a second language. Owing to their ease of construction and scoring, they are widely used in second and foreign language testing, and research over the past decades has demonstrated their reliability and validity in different contexts. However, because of the interdependent structure of cloze test items, item response theory models have not been applied to analyze cloze tests. In this research, we apply a method that circumvents this local dependence so that cloze tests can be analyzed with the Rasch model. Using this method, we fitted the Rasch model to a cloze test composed of eight passages, each containing 8-15 gaps. Findings showed that the Rasch model fits the data, and thus it is possible to scale persons and cloze passages on an interval unidimensional scale. The test had high reliability and was well targeted to the examinees. Implications of the study are discussed.
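The super-item strategy the abstract alludes to can be sketched as follows: per-gap dichotomous scores are summed within each passage so that each passage becomes one polytomous item, removing within-passage local dependence from the item-level analysis. The simulated data and array shapes below are purely illustrative.

```python
# Minimal sketch of the super-item strategy: gaps within a passage are
# locally dependent, so per-gap 0/1 scores are summed into one
# polytomous score per passage before Rasch analysis.
import numpy as np

rng = np.random.default_rng(0)
n_persons = 200
gaps_per_passage = [8, 10, 12, 15, 9, 11, 10, 13]  # eight passages

# Simulated 0/1 gap responses, one block per passage (illustrative only).
blocks = [rng.integers(0, 2, size=(n_persons, g)) for g in gaps_per_passage]

# Collapse each passage into a single polytomous super-item:
# the score ranges from 0 to the number of gaps in that passage.
passage_scores = np.column_stack([b.sum(axis=1) for b in blocks])
print(passage_scores.shape)  # (200, 8): persons x passage super-items
# passage_scores can now be fitted with a polytomous Rasch model
# (e.g., the partial credit model) in any IRT package.
```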
A Comparison of Polytomous Rasch Models for the Analysis of C-Tests
A C-Test is a gap-filling test for measuring language competence in a first or second language. C-Tests are usually analyzed with polytomous Rasch models by treating each passage as a super-item or testlet, a strategy that overcomes the local dependence inherent in C-Test gaps. However, there is little research on the best polytomous Rasch model for C-Tests. In this study, the Rating Scale Model (RSM) and the Partial Credit Model (PCM) were compared for analyzing C-Tests. To this end, a C-Test composed of six passages was analyzed with both the RSM and the PCM, and the models were compared in terms of overall fit, individual item fit, dimensionality, test targeting, and reliability. Findings showed that, although the PCM has a better overall fit than the RSM, both models produce similar test statistics. In light of these findings, the choice of the best Rasch model for C-Tests is discussed.
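The structural difference between the two models can be made explicit in a few lines: the PCM estimates a separate set of step parameters for every passage, while the RSM constrains all passages to share one threshold structure shifted by an item location. The sketch below encodes both category-probability functions; the parameter values are invented for illustration.

```python
# Sketch contrasting the two models' category probabilities. In the
# RSM every passage shares one set of thresholds (delta_i + tau_k);
# in the PCM each passage has its own thresholds (delta_ik).
import numpy as np

def pcm_probs(theta, thresholds):
    """PCM category probabilities for one super-item.
    thresholds[k] is the step parameter for moving from k to k+1."""
    cum = np.concatenate([[0.0], np.cumsum(theta - np.asarray(thresholds))])
    expcum = np.exp(cum - cum.max())  # stabilized softmax
    return expcum / expcum.sum()

def rsm_probs(theta, delta_i, taus):
    """RSM as a constrained PCM: item location plus shared taus."""
    return pcm_probs(theta, delta_i + np.asarray(taus))

theta = 0.5
print(pcm_probs(theta, [-1.2, 0.1, 0.9]))       # item-specific steps
print(rsm_probs(theta, 0.3, [-1.0, 0.0, 1.0]))  # shared step structure
```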
Psychometric Evaluation of Dictations with the Rasch Model
Dictation is a traditional technique for both teaching and testing overall language ability and listening comprehension. In a dictation, a passage is read aloud by the teacher and examinees write down what they hear. Because there is no clear boundary between items and every word in the text is potentially an item, the psychometric analysis of dictations with classical and modern test theories is rather difficult. In this study, we suggest a procedure to make dictations analyzable with psychometric models. Our strategy entailed using several independent short passages instead of a single long passage; the number of mistakes in each passage was counted and entered into the analysis, and the Rasch model was then applied to the passage scores. Our findings showed that dictations fit the Rasch model very well and that it is possible to measure examinees’ ability on an interval scale using dictations.
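One way the mistake-counting step might be operationalized is a token-level alignment of each examinee transcript against the original passage; the difflib-based sketch below is an illustrative choice, not necessarily the authors' procedure.

```python
# Hedged sketch of the scoring step: counting a test-taker's mistakes
# in each short passage by aligning their transcript against the
# original text. Token-level alignment via difflib is an assumption.
import difflib

def count_mistakes(original: str, transcript: str) -> int:
    """Number of reference tokens not reproduced correctly."""
    ref, hyp = original.lower().split(), transcript.lower().split()
    matcher = difflib.SequenceMatcher(a=ref, b=hyp)
    matched = sum(block.size for block in matcher.get_matching_blocks())
    return len(ref) - matched  # omissions + substitutions

passages = ["the quick brown fox jumps over the lazy dog"]
answers = ["the quick brown fox jumped over a lazy dog"]
mistakes = [count_mistakes(p, a) for p, a in zip(passages, answers)]
print(mistakes)  # per-passage counts feed the Rasch analysis as scores
```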
Examining the Interrater Reliability Between Self- and Teacher Assessment of Students’ Oral Performances
The increasing popularity of self-assessment (SA) has prompted several scholars to investigate its effectiveness and accuracy relative to teacher assessment. However, most of these studies focused only on the consistency estimate perspective. The current study therefore investigated the interrater reliability between self- and teacher assessment of students’ oral performance in Filipino, using two perspectives (the consistency estimate and the consensus estimate) to obtain a fuller picture of interrater reliability. Fifty college students from various specializations participated; each assessed their own oral performance using an in-class observation self-assessment procedure with self-viewing. Findings reveal a very strong positive relationship between teacher and student SA results and substantial agreement between their ratings. The high positive correlations suggest that both the students and the teacher applied the rating scale consistently, a result attributed to the use of a micro-analytic rating scale, assessment training, and the rating procedure used during SA. Implications for classroom assessment and future studies are discussed.
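The two perspectives can be illustrated on a single pair of rating vectors: a correlation for the consistency estimate and agreement indices (weighted kappa, exact agreement) for the consensus estimate. The ratings below are invented for illustration.

```python
# Sketch of the two interrater perspectives on the same ratings.
# Rating vectors are invented for illustration.
from scipy.stats import pearsonr
from sklearn.metrics import cohen_kappa_score

self_ratings =    [4, 3, 5, 2, 4, 3, 5, 4, 2, 3]
teacher_ratings = [4, 3, 4, 2, 4, 3, 5, 3, 2, 3]

# Consistency estimate: do the two raters rank students similarly?
r, p = pearsonr(self_ratings, teacher_ratings)
print(f"Pearson r = {r:.2f} (p = {p:.3f})")

# Consensus estimate: do they assign the same categories, beyond chance?
kappa = cohen_kappa_score(self_ratings, teacher_ratings, weights="quadratic")
print(f"Weighted kappa = {kappa:.2f}")

# Simple percent agreement, another common consensus index.
agree = sum(a == b for a, b in zip(self_ratings, teacher_ratings))
print(f"Exact agreement: {agree / len(self_ratings):.0%}")
```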
Why IELTS Candidates Score Low in Writing: Investigating the Effects of Test Design and Scoring Criteria on Test-Takers’ Grades in IELTS and World Englishes Essay Writing Tests
This study explored possible reasons why IELTS candidates usually score low in writing by investigating the effects of two different test designs and scoring criteria on Iranian IELTS candidates’ grades on IELTS and World Englishes (WEs) essay writing tests. To this end, a WEs essay writing test was first designed. Then, 17 Iranian IELTS candidates wrote two essays on the same topic, one under the IELTS test condition and one under the WEs test condition. Each of the 34 resulting essays was scored six times: three times against the IELTS scoring criteria and three times against the WEs scoring criteria, each time by a different rater. Repeated-measures ANOVA showed that test design and scoring criteria had significant effects on essay grades. The study concludes that some of the reasons why IELTS candidates usually score low in writing may be rooted in the test design and scoring criteria of the IELTS essay writing test, not necessarily in the candidates’ weaknesses in writing. The implications of the study focus on the relevance of the results to IELTS candidates, international students, and the future of assessing writing in World Englishes contexts.
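A sketch of how the repeated-measures ANOVA might be run in Python with statsmodels' AnovaRM follows, assuming a long-format table with one averaged grade per candidate per design-by-criteria cell; the file and column names are assumptions.

```python
# Hedged sketch of a repeated-measures ANOVA on the essay grades,
# using statsmodels' AnovaRM. The long-format layout and column
# names are assumptions for illustration.
import pandas as pd
from statsmodels.stats.anova import AnovaRM

# One row per candidate x test design x scoring criteria,
# with grades averaged across the three raters per condition.
grades = pd.read_csv("essay_grades_long.csv")
# columns: candidate, design ('IELTS'/'WEs'), criteria ('IELTS'/'WEs'), grade

result = AnovaRM(
    data=grades,
    depvar="grade",
    subject="candidate",
    within=["design", "criteria"],
).fit()
print(result)  # F-tests for design, criteria, and their interaction
```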