Content related to the keyword

item discrimination


1.

Does number of options in multiple choice tests affect item facility and discrimination? An examination of test-taker preferences (Ministry of Science accredited scientific article)

Keywords: multiple-choice item, number of options, test-taker preferences, item facility, item discrimination

Multiple-choice (MC) tests are widely used in educational assessment because of their objectivity, ease of scoring, and reliability. This study compared the item facility (IF) and item discrimination (ID) of MC vocabulary test items and examined whether these indices are affected by the number of options. To this end, four 20-item stem-equivalent vocabulary tests (3-, 4-, 5-, and 6-option MC) were administered to 194 pre-intermediate students (106 male, 88 female). In addition, an attitude questionnaire was used to examine test takers' attitudes towards the MC test format. Results of a one-way ANOVA showed that altering the number of options in MC tests does not affect ID; however, there were significant differences in IF between the 3-, 5-, and 6-option tests and between the 4-, 5-, and 6-option tests, but not between the 3- and 4-option tests, suggesting that the 6-option test is the most difficult. The questionnaire results also revealed a preference among test takers for the 3-option format. The findings demonstrate that increasing the number of options makes a test more difficult and that choosing the right number of options for MC tests remains controversial. Test developers are advised to weigh various factors when choosing the number of options.
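The two indices compared in the abstract above have standard classical-test-theory definitions: item facility is the proportion of correct responses, and item discrimination is often estimated with the upper-lower index using the conventional top/bottom 27% split on total score. A minimal sketch with hypothetical data (not values from the study):

```python
def item_facility(responses):
    """Proportion of test takers answering the item correctly (0..1)."""
    return sum(responses) / len(responses)

def item_discrimination(item_scores, total_scores, fraction=0.27):
    """Upper-lower index: item facility in the top group minus item
    facility in the bottom group, groups formed by total test score."""
    n = max(1, round(len(total_scores) * fraction))
    order = sorted(range(len(total_scores)), key=lambda i: total_scores[i])
    lower = [item_scores[i] for i in order[:n]]
    upper = [item_scores[i] for i in order[-n:]]
    return item_facility(upper) - item_facility(lower)

# Hypothetical data: 10 examinees, 1 = correct on this item, 0 = incorrect.
item = [1, 0, 1, 1, 0, 1, 0, 1, 1, 0]
totals = [18, 5, 15, 17, 6, 14, 7, 19, 16, 4]  # total test scores

print(round(item_facility(item), 2))                # 0.6
print(round(item_discrimination(item, totals), 2))  # 1.0
```

An item with facility near 0.5 and a clearly positive discrimination index is usually considered well-functioning; the study's question is whether these values shift as options are added.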
2.

Psychometric Properties of 3-, 4-, and 5-Option Item Tests: Do Test Takers’ Personality Traits Make a Difference? (Ministry of Science accredited scientific article)

Keywords: Item Difficulty, item discrimination, MC Test, Personality Trait, Reliability

Prior research has yielded mixed results regarding what contributes to psychometrically sound multiple-choice (MC) items. The purpose of the present study was therefore twofold: (a) to compare 3-, 4-, and 5-option MC tests in terms of psychometric characteristics, and (b) to investigate the relationships between the three MC tests and five personality traits. To that end, 150 students answered three stem-equivalent MC tests. The Big Five Inventory was used to identify students' personality traits, and an attitude questionnaire was used to elicit their opinions of the three MC tests. The results of a one-way repeated-measures ANOVA revealed statistically significant differences in item difficulty, while no statistically significant differences were found in item discrimination or reliability across the three tests. Pearson correlations showed no relationship between personality traits and the three versions of the MC tests. The attitude questionnaire indicated mixed views towards MC tests. The findings suggest that test developers consider statistical, affective, and contextual factors when developing different MC test formats.
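The reliability index compared across the three test versions is, for dichotomously scored MC items, typically estimated with KR-20. A small sketch with hypothetical 0/1 response data (not the study's data):

```python
def kr20(matrix):
    """KR-20 reliability; matrix rows = examinees, columns = items (0/1)."""
    n_items = len(matrix[0])
    n_people = len(matrix)
    # Proportion correct per item (p), and p*q summed over items.
    p = [sum(row[j] for row in matrix) / n_people for j in range(n_items)]
    pq = sum(pi * (1 - pi) for pi in p)
    # Variance of total scores (population form).
    totals = [sum(row) for row in matrix]
    mean = sum(totals) / n_people
    var = sum((t - mean) ** 2 for t in totals) / n_people
    return (n_items / (n_items - 1)) * (1 - pq / var)

# Hypothetical 5 examinees x 4 items.
responses = [
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [1, 1, 1, 1],
    [0, 0, 0, 0],
]
print(round(kr20(responses), 2))  # 0.8
```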
3.

Applying a two-parameter item response model to explore the psychometric properties: The case of the Ministry of Science, Research and Technology (MSRT) high-stakes English Language Proficiency test (Ministry of Science accredited scientific article)

Keywords: IRT, MSRT, high-stakes, item analysis, Item Difficulty, item discrimination, Accountability

The degree of test difficulty is perhaps one of the most significant characteristics of a test, yet no empirical research on the difficulty of the MSRT test has been carried out. The current study attempts to fill this gap by applying a two-parameter item response model to investigate the psychometric properties (item difficulty and item discrimination) of the MSRT test. The Test Information Function (TIF) was also computed to estimate at what range of ability the test best distinguishes respondents. To this end, 328 graduate students (39.9% men, 60.1% women) were randomly selected from three universities in Isfahan, and a version of the MSRT English proficiency test was administered to them. The results supported the unidimensionality of the components of the MSRT test. Analysis of the difficulty and discrimination indices of the total test revealed that 14% of the items were easy or very easy, 38% were medium, and 48% were difficult or very difficult. In addition, 14% of the items were classified as nonfunctioning: they discriminated negatively or not at all. A further 7% discriminated poorly, 17% moderately, and 62% highly or perfectly; however, the latter differentiated between high-ability and higher-ability test takers. Thus, only 38% of the items displayed satisfactory difficulty. Too-easy (14%) and too-difficult (48%) items could be one reason why some items have low discriminating power. A further inspection of the items by the MSRT test developers is indispensable.
4.

A Three-Parameter Logistic IRT Calibration of the Items of the TEFL MA Admission Test as a High-Stakes Test in Iran (Ministry of Science accredited scientific article)

Keywords: Guessing, High-Stakes Test, Item Difficulty, item discrimination, Three-parameter IRT model

To explore the characteristics of the items of the Teaching English as a Foreign Language (TEFL) MA Admission Test (henceforth TMAAT), a high-stakes test in Iran, the current research applied a three-parameter logistic Item Response Theory (IRT) calibration to the test items. The three-parameter logistic model is the most comprehensive of the three IRT models, as it simultaneously takes into account the three parameters of item difficulty, item discrimination, and guessing. The data were a random selection of 1000 TMAAT candidates who took the test in 2020, collected from Iran's National Organization of Educational Testing (NOET). The data were analyzed with jMetrik (Version 4.1.1), the latest version available at the time. The results indicated that the TMAAT discriminated well between higher- and lower-ability candidates and prevented candidates from guessing the responses by chance, but it was less satisfactory with respect to item difficulty, as the items were far too difficult for the test takers. The most important beneficiaries of the present investigation are test developers, testing experts, and policy-makers in Iran, since they are responsible for improving the quality of the items in such a high-stakes test.
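The three-parameter logistic (3PL) model named in the abstract extends the 2PL form with a pseudo-guessing lower asymptote c; the parameter values below are hypothetical, not jMetrik estimates from the TMAAT data:

```python
import math

def p_3pl(theta, a, b, c):
    """3PL probability of a correct response: a = discrimination,
    b = difficulty, c = lower asymptote (chance of guessing correctly)."""
    return c + (1.0 - c) / (1.0 + math.exp(-a * (theta - b)))

# With c = 0.25 (chance level on a 4-option MC item), even a very
# low-ability candidate retains roughly a 25% success probability:
print(round(p_3pl(-4.0, 1.3, 0.5, 0.25), 3))  # 0.252

# At theta = b the probability is c + (1 - c) / 2, not 0.5 as in the 2PL:
print(p_3pl(0.5, 1.3, 0.5, 0.25))  # 0.625
```

A small estimated c across items is what a claim like "the test prevented candidates from guessing by chance" corresponds to in this model.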