Setting Rules
No cross talk
Cell phones on silent
Raise your hand if you have a question
Characteristics Of A Good
Measuring Instrument (Test)
Dr. Sindhu Almas
Lecturer, Community Medicine
Department, LUMHS Jamshoro.
Characteristics Of A Good
Measuring Instrument (Test)
 The most essential of these are:
1. Validity
2. Reliability
3. Objectivity
4. Norms
5. Usability.
1. Validity:
 Test experts generally agree that the most
important quality of a test is its validity.
 The word “validity” means “effectiveness”.
 It refers to the accuracy with which a thing is
measured.
 A test is said to be valid if it measures what it
claims to measure.

According to Gronlund
“Validity refers to the appropriateness
of the interpretations made from test
scores and other evaluation results,
with regard to a particular use”.
Nature/Characteristics of
Validity:
When using the term validity in relation to testing
and evaluation, there are a number of cautions to
be borne in mind. These are:
a. Validity refers to the appropriateness of the
interpretation of the results of a test, not to the test
itself.
b. Validity is a matter of degree: Validity
does not exist on an all-or-none basis, so
we should avoid speaking of a test as simply
valid or invalid. Validity is best considered in
terms of categories that specify degree, such as
high validity, moderate validity and low
validity.
c. Validity is always specific to some
particular use or interpretation. No
test is valid for all purposes. When
describing validity, it is necessary to
consider the specific interpretation or
use to be made of the results.
Types/Approaches to Test
Validation:
 According to Gronlund (1990), there are
three basic approaches to the validity of
tests. These are
1. Content validity
2. Criterion-related validity
3. Construct validity
1.Content Validity
According to Gronlund (1990) it refers
to the extent to which the test content
represents a specified universe of
content.
It means that the “test content” (Test
Items) should measure the “course
content” (Curriculum/objectives).
Johnson, B. & Christensen, L. (2008) stated that
when making your decision, try to answer these
three questions:
Do the items appear to represent the thing you
are trying to measure?
Does the set of items fully represent the
important content areas or topics?
Have you included all relevant items?
A table of specification is used to ensure the
content validity of a test.
Table of Specification 1 (rows = content areas; columns = levels of Bloom's taxonomy)

Content    Knowledge   Comprehension   Application   Total
Grammar        10            5              1          16
Reading        10            5              1          16
Writing        12            5              1          18
Total          32           15              3          50
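As an illustrative sketch, a blueprint like the one above can be kept as a small data structure so that its row and column totals can be checked mechanically (all counts mirror the hypothetical table; names are only for illustration):

```python
# Sketch: a table of specification as a nested dict, with a check
# that row and column totals agree with the planned test length.
# All item counts are the hypothetical values from the table above.
blueprint = {
    "Grammar": {"Knowledge": 10, "Comprehension": 5, "Application": 1},
    "Reading": {"Knowledge": 10, "Comprehension": 5, "Application": 1},
    "Writing": {"Knowledge": 12, "Comprehension": 5, "Application": 1},
}

# Row totals: items planned per content area.
row_totals = {topic: sum(cells.values()) for topic, cells in blueprint.items()}
# Column totals: items planned per level of Bloom's taxonomy.
levels = ["Knowledge", "Comprehension", "Application"]
col_totals = {lv: sum(blueprint[t][lv] for t in blueprint) for lv in levels}
grand_total = sum(row_totals.values())

print(row_totals)   # {'Grammar': 16, 'Reading': 16, 'Writing': 18}
print(col_totals)   # {'Knowledge': 32, 'Comprehension': 15, 'Application': 3}
print(grand_total)  # 50
```

Writing the blueprint down this way makes it easy to confirm that the test items drafted actually match the planned coverage of content and cognitive levels.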
2. Criterion-Related Validity:
Criterion-related validity is involved whenever we
need to predict the future performance of
students, or to assess/estimate their
present/current performance on some
criterion (a valued measure other than the test itself).
It refers to the extent to which the test
correlates with some criterion.
Types Of Criterion-related
Evidence (Validity):
According to Rizvi, A. (1973) there are two
types of Criterion-Related Evidence (validity)
according to time factor. These are
i. Concurrent Evidence (Validity)
ii. Predictive Evidence (Validity)
I. Concurrent Evidence
(Validity):
It refers to the extent to which the test
correlates with some criterion obtained at the
same time (i.e. concurrently).
For example, when scores on a mathematics test
developed by a classroom teacher are correlated
with scores on another mathematics test, or with
the teacher's ratings, you have concurrent
evidence (validity).
Concurrent Validity
Concurrent validity is a type of Criterion Validity.
If you create some type of test, you want to
make sure it’s valid: that it measures what it is
supposed to measure. Criterion validity is one
way of doing that. Concurrent validity
measures how well a new test compares to a
well-established test. It can also refer to the
practice of concurrently testing two groups at
the same time, or asking two different groups
of people to take the same test.
Advantages:
It is a fast way to validate your
data.
It is a highly appropriate way to
validate personal attributes (e.g.
depression, IQ, strengths and
weaknesses).
Disadvantages:
It is less effective than predictive validity to
predict future performance or potential, like
job performance or ability to succeed in
college.
If you are testing different groups, like
people who want jobs and people who have
jobs, responses may differ between groups.
For example, people who already have jobs
may be less inclined to put their best foot
forward.
Concurrent Validity Examples
 Example 1: If you create a new test for depression levels,
you can compare its performance to previous depression
tests (like a 42-item depression level survey) that have
high validity. Concurrent means “at the same time”, so
you would administer both tests at about the same time:
you could test depression level on one day with your test,
and on the next day with the established test.
A statistically significant correlation would mean that you
have achieved concurrent validity. If the tests are
administered farther apart (i.e. they aren't given
concurrently), then they would fall into the category of
predictive validity instead of concurrent validity.
Example 2:
 Concurrent validity can also occur between two
different groups. For example, let’s say a group of
nursing students take two final exams to assess their
knowledge. One exam is a practical test and the
second exam is a paper test. If the students who score
well on the practical test also score well on the paper
test, then concurrent validity has occurred. If, on the
other hand, students who score well on the practical
test score poorly on the paper test (and vice versa),
then you have a problem with concurrent validity. In
this particular example, you would question the ability
of either test to assess knowledge.
ii. Predictive Evidence (Validity):
According to Gronlund “it
refers to the extent to which
the test correlates with some
criterion obtained after a
stated interval of time”.
The Predictive Validity Of A
Test Is Determined By
establishing the relationship between
scores on the test and some measure of
success in the situation of interest.
The test used to predict success is referred
to as the “Predictor” and
the behavior predicted is referred to as
the “Criterion”
Predictive Validity
The ability of a measure to predict future
outcomes.
Predictive validity is the ability of a survey
instrument to predict future
occurrences. Correlations are used to generate
predictive validity coefficients with other
measures that assess a validated construct that
will occur in the future. Predictive validity is a
type of criterion-related evidence, or criterion
validity.
Construct Validity:
It refers to the extent to which the test
measures the construct it claims to measure.
A construct is a psychological trait or
quality that we assume to exist in order to
explain some aspect of behavior.
Mathematical reasoning, intelligence,
creativity, sociability, honesty and anxiety
are examples of constructs.
There are three basic steps to establishing construct validity.
1. Identify the construct, e.g. intelligence.
2. Derive predictions from the theory of the construct, e.g. that an
intelligent person will show:
A. General knowledge.
B. Reasoning power.
C. Ability to solve numerical problems.
D. Decision-making power.
3. Then measure these factors through test items.
FACTORS INFLUENCING
VALIDITY.
Three types of factors:
1. Factors in the test itself.
2. Factors in the test administration
and scoring.
3. Personal factors of the student.
2. Reliability
According to Gronlund “Reliability refers to
the consistency of measurement”
According to Ebel “The ability of the test
to measure the same quality when it is
administered to an individual on two
different occasions or by two different
testers or evaluators is known as reliability”.
Nature/Characteristics Of
Reliability:
 The meaning of reliability can be further
clarified by noting the following
general points:
1. Reliability refers to the results obtained
with an evaluation instrument (test)
and not to the instrument (test) itself.
2. Reliability Refers To Some
Particular Type Of Consistency.
 Test scores are not reliable in general. They
are reliable (or generalizable)
 over different periods of time,
 over different samples of questions,
 over different raters, and the like.
3. Reliability Is A Necessary But
Not A Sufficient Condition For
Validity.
 A valid test must also be a reliable test, but
high reliability does not ensure that a
satisfactory degree of validity will be
present.
 In summary, reliability merely provides the
consistency that makes validity possible.
Reliability Is Primarily Statistical
The two widely used methods of expressing
reliability are the “Standard Error of Measurement”
and the Reliability Coefficient.
 The Reliability Coefficient “is a correlation co-
efficient that indicates the degree of
relationship between two sets of measures
obtained from the same instrument or procedure”.
Methods Of Estimating
Reliability
 Lien, A.J. (1976) mentioned three basic
methods of estimating Reliability
1. Test-Re-Test Method
2. Alternative Form / Equivalent forms
Method
3. Split-Halves Method/Internal Consistency
Method
2. The Parallel Forms Method
(Equivalent Forms)
 In this method two
equivalent forms of a test
are given to the same group
of students and the scores
obtained on the two forms
are correlated.
3.The Split-half Method.
In this method the test is so
designed that it can be divided
into two equivalent halves, say
odd-numbered and even-
numbered items.
It is then administered as a whole
only once. The scores of even-
numbered and odd-numbered
items are correlated.
3. Practicality Or Usability.
After constructing the test, we have to administer it, then
score it, and improve the quality of the test.
It contains the following steps.
1. EASE OF ADMINISTRATION:
It consists of two points.
a. Instructions: If the instructions are not clear, both teacher and
students will be in difficulty; for example, how many
questions have to be solved, or how much time is available for
the test?
b. Timing: Fixing the time for the test is also a difficult job. The
season must be kept in mind when preparing the timetable for
the test.
Sufficient Time.
Time should be allotted
according to the
number and nature of
the questions.
Cost Of Testing.
The test should be
inexpensive. It should not
be a burden on students or
the institute, but reliability and
validity must be kept in
mind.
Easy To Score.
The test should be so constructed
that its scoring must be easy.
There should be specific marks for
every part and every question.
Scoring key must be provided to the
evaluator. There should be objectivity
in scoring the test.
Objectivity
Gronlund and Linn (1995)
state: “Objectivity of a test refers to the
degree to which equally competent scorers
obtain the same results.” So a test is
considered objective when it makes for
the elimination of the scorer's personal
opinion and biased judgement.
What Is Objective Type Of
Test?
Objective tests require recognition and
recall of subject matter. The forms vary:
questions of fact, sentence completion,
true-false, analogy, multiple-choice, and
matching. They tend to cover more
material than essay tests. They have one,
and only one, correct answer to each
question.
Norms
Test “norms” — short for normative
scores — are scores from standardized
tests given to representative samples of
students who will later take the same test.
Norms provide a way for teachers to
know what scores are typical (or average)
for students in a given grade.
What Is A Norm-referenced
Test Used For?
Norm-referenced tests report whether test
takers performed better or worse than a
hypothetical average student, which is
determined by comparing scores against
the performance results of a statistically
selected group of test takers, typically of
the same age or grade level, who have
already taken the exam.