Hello readers!
Welcome to my blog. This post is part of a thinking activity called "Testing and Evaluation". Below are some questions connected with this activity.
Question 1: Write on the validity and reliability of a test.
Answer:
Validity refers to how accurately a method measures what it is intended to measure. If research has high validity, that means it produces results that correspond to real properties, characteristics, and variations in the physical or social world.
High reliability is one indicator that a measurement is valid. If a method is not reliable, it probably isn’t valid.
For example, if a thermometer shows different temperatures each time, even though you have carefully controlled conditions so that the sample's temperature stays the same, the thermometer is probably malfunctioning, and therefore its measurements are not valid.
If a symptom questionnaire results in a reliable diagnosis when answered at different times and with different doctors, this indicates that it has high validity as a measurement of the medical condition.
However, reliability on its own is not enough to ensure validity. Even if a test is reliable, it may not accurately reflect the real situation.
For example, a thermometer used to test a sample gives reliable (consistent) results, but it has not been calibrated properly, so each reading is 2 degrees lower than the true value. The measurements are therefore reliable but not valid.
A group of participants take a test designed to measure working memory. The results are reliable, but participants’ scores correlate strongly with their level of reading comprehension. This indicates that the method might have low validity: the test may be measuring participants’ reading comprehension instead of their working memory.
Validity is harder to assess than reliability, but it is even more important. To obtain useful results, the methods you use to collect your data must be valid: the research must be measuring what it claims to measure. This ensures that your discussion of the data and the conclusions you draw are also valid.
☆ Difference between Validity and Reliability:
Reliability and validity are concepts used to evaluate the quality of research. They indicate how well a method, technique or test measures something. Reliability is about the consistency of a measure, and validity is about the accuracy of a measure.
It’s important to consider reliability and validity when you are creating your research design, planning your methods, and writing up your results, especially in quantitative research.
◇ What does it tell you?
Reliability: the extent to which the results can be reproduced when the research is repeated under the same conditions.
Validity: the extent to which the results really measure what they are supposed to measure.
◇ How is it assessed?
Reliability: by checking the consistency of results across time, across different observers, and across parts of the test itself.
Validity: by checking how well the results correspond to established theories and other measures of the same concept.
◇ How do they relate?
A reliable measurement is not always valid: the results might be reproducible, but they're not necessarily correct.
A valid measurement is generally reliable: if a test produces accurate results, they should be reproducible.
☆ How are reliability and validity assessed?
Reliability can be estimated by comparing different versions of the same measurement. Validity is harder to assess, but it can be estimated by comparing the results to other relevant data or theory. Methods of estimating reliability and validity are usually split into different types, and the different types of reliability can be estimated through various statistical methods.
☆ Types of reliability
Test-retest
The consistency of a measure across time: do you get the same results when you repeat the measurement?
A group of participants complete a questionnaire designed to measure personality traits. If they repeat the questionnaire days, weeks or months apart and give the same answers, this indicates high test-retest reliability.
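In statistical terms, test-retest reliability is often estimated as the correlation between the two sets of scores. Here is a minimal sketch in Python, assuming two lists of made-up scores from a hypothetical first and second sitting:

```python
# Test-retest reliability estimated as the Pearson correlation between
# scores from two administrations of the same test (illustrative data).

def pearson(x, y):
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5

time_1 = [42, 55, 38, 61, 47, 50]  # hypothetical scores, first sitting
time_2 = [44, 53, 40, 60, 45, 52]  # same participants, weeks later

print(f"Test-retest reliability: {pearson(time_1, time_2):.2f}")
```

A coefficient close to 1 means participants' scores are stable across time; a much lower value would suggest the instrument is noisy or that the trait itself fluctuates.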
Interrater
The consistency of a measure across raters or observers: do you get the same results when different people conduct the same measurement?
Based on an assessment criteria checklist, five examiners submit substantially different results for the same student project. This indicates that the assessment checklist has low inter-rater reliability (for example, because the criteria are too subjective).
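One common statistic here is Cohen's kappa, which corrects the raw agreement between two raters for the agreement expected by chance. Below is a minimal sketch with invented pass/fail judgements; for more than two raters (such as the five examiners above), extensions like Fleiss' kappa are normally used:

```python
# Inter-rater reliability for two raters via Cohen's kappa
# (illustrative pass/fail judgements).
from collections import Counter

def cohens_kappa(rater_a, rater_b):
    n = len(rater_a)
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    # Chance agreement: probability both raters pick the same category at random.
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / n ** 2
    return (observed - expected) / (1 - expected)

rater_a = ["pass", "fail", "pass", "pass", "fail", "pass"]
rater_b = ["pass", "fail", "pass", "fail", "fail", "pass"]

print(f"Cohen's kappa: {cohens_kappa(rater_a, rater_b):.2f}")
```

Kappa runs from below 0 (worse than chance) to 1 (perfect agreement); a value near 0 on a checklist like the one above would confirm low inter-rater reliability.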
Internal consistency
The consistency of the measurement itself: do you get the same results from different parts of a test that are designed to measure the same thing?
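Internal consistency is commonly estimated with Cronbach's alpha, which compares the variance of individual items to the variance of participants' total scores. A minimal sketch with made-up item scores:

```python
# Internal consistency via Cronbach's alpha: rows are participants,
# columns are test items meant to measure the same construct
# (illustrative data).

def variance(values):
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / (len(values) - 1)

def cronbach_alpha(scores):
    k = len(scores[0])  # number of items
    item_vars = [variance([row[i] for row in scores]) for i in range(k)]
    total_var = variance([sum(row) for row in scores])
    return (k / (k - 1)) * (1 - sum(item_vars) / total_var)

scores = [
    [3, 4, 3, 4],  # one participant's answers to four related items
    [2, 2, 3, 2],
    [4, 5, 4, 4],
    [1, 2, 2, 1],
    [3, 3, 4, 3],
]

print(f"Cronbach's alpha: {cronbach_alpha(scores):.2f}")
```

By convention, alpha values above roughly 0.7 are read as acceptable internal consistency.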
Question 2: Difference between assessment and evaluation.
Answer:
Assessment is feedback from the student to the instructor about the student's learning, while evaluation is feedback from the instructor to the student about that learning. Evaluation uses methods and measures to judge student learning and understanding of the material for purposes of grading and reporting.
Assessment is the systematic process of documenting and using empirical data to measure knowledge, skills, attitudes, and beliefs. Through assessment, teachers try to improve the student's path towards learning.
Evaluation focuses on grades and might reflect classroom components other than course content and mastery level. An evaluation can be used as a final review to gauge the quality of instruction. It’s product-oriented. This means that the main question is: “What’s been learned?” In short, evaluation is judgmental.
Example:
You’re gifted a flower.
Evaluation: “The flower is purple and is too short with not enough leaves.”
Evaluation is judgmental.
Assessment: “I’ll give the flower some water to improve its growth.”
Assessment increases the quality.
Question 3: What do you understand by backwash?
Answer:
Washback, as it is called in applied linguistics (termed 'backwash' in education), is a well-documented phenomenon familiar to every institutional learning process (Philip Shawcross, 'What do we mean by the "washback effect" of testing?', p. 2). The washback effect has tersely been referred to as 'the influence of testing on teaching and learning' (Gates 1995).

One study ventured to examine the washback phenomenon from the teachers' perspective, taking learners as the inducing factors of the washback effect. A qualitative approach was adopted: data were collected through semi-structured interviews with HSC English language teachers at government colleges in Hyderabad, selected through convenience sampling. The ten participants were drawn equally from both genders, and their interview responses were thematically analysed and coded to form a compact, summarised picture of the washback on teachers.

The findings provided striking insights into the flip side of washback from the teachers' vantage point. They established that apathy or lack of interest among students exerts an intense kind of washback on teachers' teaching methods, their content, and their overall morale. Assuming that students are directly receptive to the washback coming from tests, the study sought to find out how they reflect that washback back onto their teachers. The respondents named many constraints that apathetic students place on how and what they teach. The study corroborated that teachers are at least as affected as learners by the washback phenomenon.
Question 4: How do you define a good assessment?
Answer:
'An assessment requiring students to use the same competencies, or combinations of knowledge, skills, and attitudes that they need to apply in the criterion situation in professional life.'
Assessments can range from pop quizzes to final exams to midterm papers and project-based assignments; what unites them all is that they measure students’ learning. There are three key areas on which the quality of an assessment can be measured: reliability, validity, and bias. A good assessment should be reliable, valid, and free of bias.
First, reliability refers to the consistency of students' scores; that is, an assessment is reliable when it produces stable and consistent results. Reliability comes in two major forms: (1) stability and (2) alternate-form reliability. Stability means that tests or assessments produce consistent results at different testing times with the same group of students. If they do not produce similar results, the assessment may not be reliable.
Alternate-form reliability means that multiple versions of the assessment or test produce the same results with the same group of students. The alternate versions must be equivalent so that students cannot automatically score better on one version than the other. This type of reliability is critical for multiple-choice tests and less important for essays or writing-based assessments.
Second, validity refers to whether or not an assessment actually measures the learning objective it purports to evaluate. An assessment is valid when it measures the content that was taught and when it reflects the content and skills you emphasize when teaching the course. Validity is critical because educators make inferences from assessment scores about student achievement or mastery of content. Professors and instructors can trust valid assessments to determine what their students have learned; invalid assessments cannot be trusted.
Third, absence of bias refers to grades flowing from the students' mastery of the learning objective rather than from the question itself. A good example of how bias can affect performance involves ESL learners: in a college environment that is becoming more and more diverse, idioms or slang words might trip up nonnative speakers of English, making the exam biased towards those who speak English as a first language. Another example is lack of objectivity in assessments, either in how questions are worded (taking certain things as fact when students may dispute their truth) or in professors awarding scores capriciously.
Question 5: Write on the practicality of a test.
Answer:
Practicality in assessment means that the test is easy to design, easy to administer, and easy to score. No matter how valid or reliable a test is, it has to be practical to make and to take. This means that:
◇ It is economical to deliver and not excessively expensive.
◇ The layout should be easy to follow and understand.
◇ It stays within appropriate time constraints.
◇ It is relatively easy to administer.
◇ Its scoring and evaluation procedure is specific and time-efficient.
■ Characteristics of impractical tests are:
♤ they are excessively expensive.
♤ they are too long.
♤ they require a handful of examiners to administer and score.
♤ they take several hours to grade.
Practicality refers to the economy of time, effort, and money in testing. In other words, a test should be easy to design, easy to administer, easy to mark, and its results easy to interpret (Bachman and Palmer, 1996). Moreover, according to Brown (2004), a practical test stays within financial limitations and appropriate time constraints, and is easy to administer, score, and interpret.
Thank you...