Asia-Pacific Forum on Science Learning and Teaching, Volume 16, Issue 2, Article 2 (Dec., 2015) |
Three-Tier Tests
In the literature, misconceptions are usually measured with either two-tier or three-tier tests. Two-tier tests for investigating students' conceptual knowledge became quite popular when they first emerged, and researchers long used them to identify students' misconceptions in science (Caleon & Subramaniam, 2010). Nowadays, however, researchers use these tests as a preliminary stage in developing three-tier tests, which can distinguish students' lack of knowledge from their misconceptions. One of the biggest problems in detecting misconceptions is the difficulty of distinguishing misconceptions from errors. According to Eryilmaz and Surmeli (2002), every misconception is an error, but not every error is a misconception. Errors must be distinguished from misconceptions because errors can also arise from a lack of knowledge (Kutluay, 2005). Two-tier tests lack the features needed to make this fine distinction. Adding a third tier to the test can therefore clarify whether a student's error stems from a lack of knowledge or from a misconception (Hasan, Bagayoko, & Kelley, 1999; Pesman & Eryilmaz, 2010).
The first tier of a three-tier test, the content tier, probes the respondents' descriptive knowledge. The second tier, the reason tier, evaluates the students' mental model. Finally, the third tier, the confidence tier, measures the students' confidence in their answers (Caleon & Subramaniam, 2010). In other words, if a student answers the first tier, the second tier, or both incorrectly while feeling confident in the answer, the student is considered to hold a misconception about the topic (Kutluay, 2005). Table 1 shows the possible response patterns on a three-tier test.
Table 1. All possibilities of responses

First tier   Second tier   Third tier   Category
Correct      Correct       Certain      Scientific knowledge
Correct      Incorrect     Certain      Misconception (false positive)
Incorrect    Correct       Certain      Misconception (false negative)
Incorrect    Incorrect     Certain      Misconception
Correct      Correct       Uncertain    Lucky guess, lack of confidence
Correct      Incorrect     Uncertain    Lack of knowledge
Incorrect    Correct       Uncertain    Lack of knowledge
Incorrect    Incorrect     Uncertain    Lack of knowledge

*Arslan, Cigdemoglu, & Moseley (2012)
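The classification rule in Table 1 can be expressed as a small function. The following is a minimal sketch, with names of my own choosing rather than anything from the instrument itself:

```python
def classify(first_correct: bool, second_correct: bool, certain: bool) -> str:
    """Map one three-tier response onto the Table 1 categories."""
    if certain:
        if first_correct and second_correct:
            return "scientific knowledge"
        if first_correct and not second_correct:
            return "misconception (false positive)"
        if not first_correct and second_correct:
            return "misconception (false negative)"
        return "misconception"
    # Uncertain answers indicate lack of knowledge, not misconception.
    if first_correct and second_correct:
        return "lucky guess / lack of confidence"
    return "lack of knowledge"
```

For example, a correct first-tier answer with incorrect reasoning given confidently, `classify(True, False, True)`, falls into the "misconception (false positive)" category.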
Diagnostic Instrument of Sinking and Floating (DISF)
To probe the pre-service teachers' misconceptions about floating and sinking, the test initially consisted of open-ended questions. The questions were written by drawing on the pre-service teachers' documented difficulties with floating and sinking, alternative conceptions and misconceptions reported in the literature, lesson observations, and open-ended physics exam questions from the first year of college. The questions were piloted with 38 other pre-service science teachers who had already studied the floating and sinking topic. To elaborate on their answers, the author interviewed 12 of these students face-to-face. Based on the students' answers, the open-ended questions of the first tier were converted into three-choice questions. For the second tier of the test, a short space was left in which students were asked to explain the reasoning behind their choice. This version of the test was administered to 24 students, who then took part in focus-group interviews; their discussions and reasoning were video-recorded. The data collected in this second phase were converted into single-answer multiple-choice questions. The first and second tiers of the test cover subconcepts related to floating and sinking. Table 2 presents the categorization of the concepts and subconcepts measured by DISF.

Table 2. The concepts and misconceptions related to floating and sinking measured with DISF
[Table 2 lists, for each of the 21 DISF items, the concepts it measures: buoyant force, pressure/pressure force, RDA, density, RFA, amount/level of liquid, mass or weight of solid, mass/weight of liquid, volume of solid, volume of liquid, position in liquid, hard or soft object, and shape of object. ✓ = measured in both tiers; ✓* = measured in the first tier only.]
Before the pilot study, a third tier was added asking the pre-service science teachers whether they were sure of their answers. Four experts evaluated the test, which was then revised based on their feedback and comments. The resulting version was administered to six sophomore, junior, and senior pre-service science teachers, and the researcher made small revisions based on their comments. A language expert checked the language and grammar of the test, and the students checked it for ambiguous language or terms. The final instrument was named the Diagnostic Instrument of Sinking and Floating (DISF). The test measures 74 misconceptions related to floating and sinking.
DISF was used to measure both the students' misconceptions and their levels of scientific knowledge. Therefore, the reliability of the three-tier DISF was calculated in two ways.
Reliability 1: Reliability of Scientific Knowledge Test
The first reliability coefficient of DISF was calculated from the students' scores across all three tiers; the KR-20 coefficient was 0.804. This coefficient applies when the test is used to assess the students' scientific knowledge about floating and sinking.

Reliability 2: Reliability of Misconception Test
The second reliability coefficient of DISF was calculated to capture the students' misconceptions by relating incorrect answers in the first tier, the second tier, or both to the students' confidence in those answers. The misconception KR-20 reliability coefficient of the test was 0.768. This coefficient applies when the test is used to identify the students' misconceptions about floating and sinking.

There are three quantitative methods for assessing the test's validity.
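Both KR-20 coefficients follow the standard Kuder-Richardson formula. The sketch below is a minimal illustration assuming a matrix of dichotomously scored (0/1) item responses, not the study's actual data:

```python
def kr20(scores):
    """KR-20 reliability for dichotomously scored items.

    scores: list of per-student lists of 0/1 item scores.
    """
    n = len(scores)           # number of students
    k = len(scores[0])        # number of items
    # Proportion correct p (and q = 1 - p) for each item.
    p = [sum(s[i] for s in scores) / n for i in range(k)]
    pq = sum(pi * (1 - pi) for pi in p)
    # Variance of the total scores (population variance, a common choice).
    totals = [sum(s) for s in scores]
    mean = sum(totals) / n
    var = sum((t - mean) ** 2 for t in totals) / n
    return (k / (k - 1)) * (1 - pq / var)
```

For instance, `kr20([[1, 1, 1], [1, 1, 0], [0, 0, 0], [1, 0, 0]])` evaluates to 0.75.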
Validity 1: Construct Validity
Construct validity was examined through the confidence tier by computing the correlation between the students' scores on the first two tiers of the test and their scores on the third tier. A student who scores high on the first two tiers should also be confident in his or her answers; in other words, there should be a statistically significant correlation between the first two tiers and the third tier (Arslan et al., 2012; Cataloglu, 2002). The test's construct validity coefficient was r = 0.51 (p < .05). After administering the test, the researcher also conducted face-to-face interviews with 10 students and compared their test answers with their interview responses to support the test's construct validity.

Validity 2: Exploratory Factor Analyses
The test questions incorporate the misconceptions reported in the literature. Table 2 shows the factors measured by each question's first and second tiers. Based on the factor analysis results, the questions were regrouped and renamed according to the dominant feature in each question stem (see Table 3). The KMO value of the test, computed from the total scores, was 0.794, which indicates that the data were appropriate for factor analysis. The factor analysis yielded seven factors, each with an eigenvalue greater than 1. Table 3 presents the factors and loadings.

Table 3. The factor loadings of DISF
Factors: Amount/level of liquid; Shape of objects; RFA; Position in liquid; Hard/soft objects; RDA; Pressure force.

Item loadings: 2 (0.720), 20 (0.620), 1 (0.598), 11 (0.468), 5 (0.739), 6 (0.707), 4 (0.601), 16 (0.711), 21 (0.660), 3 (0.542), 14 (0.476), 7 (0.809), 8 (0.731), 9 (0.711), 10 (0.686), 15 (0.679), 17 (0.560), 12 (0.518), 13 (0.761), 19 (0.467), 18 (0.460).
Validity 3: Content Validity
Content validity was examined through the false negative and false positive probabilities.
False Negative: Hestenes and Halloun (1995) define a false negative as a wrong answer given by a student who provides the right reasoning. The rule of thumb is that false negatives should stay below 10%. In this study, the students' false negative rate was 3.40%.
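Under this definition, a false negative rate can be computed as the share of responses that are wrong on the first tier but backed by correct reasoning. An illustrative sketch follows, with a made-up response format rather than the study's data:

```python
# Each response is a (first_tier_correct, second_tier_correct) pair;
# the pair layout is my own assumption, not the instrument's format.
def false_negative_rate(responses):
    """Percentage of responses with a wrong first-tier answer
    but correct second-tier reasoning."""
    fn = sum(1 for first, second in responses if not first and second)
    return 100.0 * fn / len(responses)
```

For example, one false negative among four responses gives a rate of 25.0%.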
False Positive: Hestenes and Halloun (1995) define a false positive as a correct answer given by a student who does not provide the right reasoning. They state that no fixed minimum score can be set for false positives because they may result from guesswork (Pesman, 2005); consequently, the chance rate depends on the number of options in the first tier. In this study, because the first-tier questions have three choices, the random-guessing probability is 33.3%, and the false positive rate should stay below that. The calculated false positive value for this study is 20.1%.

Participants
The test was administered to 377 senior pre-service science teachers from three different universities in Turkey. The participants’ age range was 21–25; there were 253 female and 124 male participants.
The pre-service science teachers took the test one month before graduation. They had two minutes per question, 42 minutes in total. Frequencies and percentages were calculated from the data. In the first phase of the study, 21 questions were developed and grouped under seven categories to measure the pre-service science teachers' levels of scientific knowledge, lack of knowledge, misconceptions, and lack of confidence according to the criteria in Table 1. In the second phase, the percentages of the 74 misconceptions were obtained by matching the three tiers. If there was more than one match, the average of the percentages was calculated; if there was only one match, that percentage was used (see Table 5).
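The matching-and-averaging rule can be sketched as a single formula; the helper name and signature below are my own, and the single-match case reduces to the same expression:

```python
# Hypothetical helper (not from the study): percentage assigned to a
# misconception from the items that match it across the three tiers.
def misconception_percentage(match_percentages):
    """Average the matching items' percentages; with a single match
    this is simply that item's percentage."""
    return sum(match_percentages) / len(match_percentages)
```

For example, matches of 10% and 30% yield 20%, while a lone 20% match stays 20%.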