This document discusses the concept of validity in psychological testing. It defines validity as the degree to which a test measures what it claims to measure. There are three main types of validity: content validity, which concerns how well a test represents the content area it aims to measure; criterion-related validity, which compares test scores to external criteria; and construct validity, which evaluates how well a test measures hypothetical constructs. Validity is influenced by factors like test length and the range of abilities in the sample population. A test must demonstrate validity to ensure the inferences made from its results are appropriate and meaningful.
Why evaluate tests?
• To make sure that a test measures the skill, trait, or attribute it is supposed to measure
• To yield reasonably consistent results for the same individual
• To measure with a reasonable degree of accuracy
A good test must first of all be valid.
VALIDITY
Validity is an important characteristic of a scientific instrument.
The term validity means truth or fidelity.
Thus, validity refers to the degree to which a test measures what it claims to measure.
Anastasi (1968) said, "the validity of a test concerns what the test measures and how well it does so".
Paraphrasing the definition of validity from the influential Standards for Educational and Psychological Testing (AERA, APA & NCME, 1999), it can be said that "a test is valid to the extent that inferences made from it are appropriate, meaningful, and useful".
In a broad sense, validity is concerned with generalizability: when a test is valid, the conclusions drawn from it can be generalized to the population of interest.
PROPERTIES OF VALIDITY
Validity has three important properties:
1. Validity is a relative term. A test is not valid in general; it is valid only for a particular purpose. For example, a test of statistical ability is valid only for measuring statistical ability, because it is built for that use alone. It would be worthless for measuring attainment in other subjects such as history or geography.
2. Validity is not a fixed property of a test, because validation is not a one-time event; rather, it is an unending process.
3. Validity, like reliability, is a matter of degree and not an all-or-none property. A test meant to measure a particular trait or ability cannot be said to be either perfectly valid or not valid at all.
PURPOSES OF TESTING VALIDITY
There are three main purposes of testing validity:
1. Representation of a certain specified area of content: The tester may wish to determine how an examinee performs at present in a sample of situations (or contents) that the test claims to represent.
2. Establishment of a functional relationship with a variable available at present or in the future: The tester may wish to predict an examinee's future standing on a certain variable, or to determine his or her present standing on a particular variable.
3. Measurement of a hypothetical trait or quality (construct): The tester may wish to determine the extent to which an examinee possesses some trait as measured by the test performance.
TYPES OF VALIDITY
Three types of validity are:
1. Content (or curricular) validity
2. Criterion-related validity
3. Construct validity
1. CONTENT VALIDITY: Related to the subject matter and its sampling.
Content validity refers to the connection between the test items and the subject-related tasks. The test should evaluate only content related to the field of study, in a manner that is sufficiently representative, relevant, and comprehensible.
Psychometricians are of the view that content validity requires both item validity and sampling validity.
Item validity is concerned with whether the test items represent measurement in the intended content area.
Sampling validity (sometimes called logical validity) concerns how well the test covers all of the areas it is intended to cover.
Content validity is examined in two ways:
i) by expert judgement, and
ii) by statistical analysis (to ensure all items measure the same thing).
The statistical check correlates scores on the test with scores on another test covering the same content; a high correlation provides an index of content validity (see the sketch below).
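As an illustration of this statistical check, the short Python sketch below correlates scores on the test under study with scores on an established test covering the same content area. The scores and variable names are hypothetical, used only to show the computation.

import numpy as np

# Hypothetical scores for the same ten examinees on the new test and on an
# established test covering the same content area.
new_test = np.array([12, 15, 9, 20, 17, 11, 14, 18, 10, 16])
established_test = np.array([14, 16, 10, 21, 18, 12, 13, 19, 11, 17])

# A high Pearson correlation between the two score sets is taken as one
# index of content validity for the new test.
r = np.corrcoef(new_test, established_test)[0, 1]
print(f"Content-validity index: r = {r:.2f}")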
The following points help to ensure full content validation of a test:
• The area of content should be specified explicitly so that all major portions are adequately covered by the items in proper proportion, rather than letting item writers include only items that are readily available and easy to write.
• Before item writing starts, the content area should be fully defined in clear words and must include the objectives.
• The relevance of the item content should be established in the light of examinees' responses to that content, not in the light of its apparent relevance.
FACE VALIDITY
Face validity is often confused with content validity, but in the strict sense it is quite different. Face validity refers not to what the test actually measures but to what it appears to measure superficially. In other words, face validity is the mere appearance that the test has validity (Kaplan & Saccuzzo, 2001). Thus face validity should not be taken in the technical sense, nor should it be regarded as a substitute for objectively determined validity.
• When the test items look valid to the group of examinees, the test is said to have face validity.
• Face validity is, in fact, a matter of social acceptability and not a technical form of validity like content validity.
2. CRITERION-RELATED VALIDITY: Criterion-related validity is a very common and popular type of test validity. As its name implies, it is obtained by comparing (or correlating) test scores with scores on a criterion that is available at present or will become available in the future.
Also referred to as instrumental validity, it requires that the criterion be clearly defined by the tester in advance. It demonstrates the accuracy of a measure or procedure by comparing it with another measure or procedure that has already been shown to be valid.
There are two subtypes of criterion-related validity:
i) Predictive validity
ii) Concurrent validity
i) Predictive validity: also called empirical validity or statistical validity. As the name implies, in predictive validity the test is correlated against a criterion that becomes available at some time in the future. In other words, test scores are obtained, a time gap of months or years is allowed to elapse, and the criterion scores are then obtained.
The test scores and the criterion scores are correlated, and the obtained correlation becomes the validity coefficient.
Predictive validity is needed for tests that involve long-range forecasts, such as forecasts of academic achievement, of success, or of reaction to therapy.
ii) Concurrent validity: very similar to predictive validity, except that there is no time gap between obtaining the test scores and the criterion scores. The test is correlated with a criterion that is available at the present time. The stronger the correlation, the greater the concurrent validity of the test. It is most suitable for diagnosing present status rather than predicting future outcomes.
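The Python sketch below shows how a criterion-related validity coefficient is obtained in practice: test scores are correlated with criterion scores (collected after a time gap for predictive validity, at the same time for concurrent validity). The data are hypothetical and serve only to illustrate the computation.

import numpy as np

# Hypothetical entrance-test scores and a criterion (e.g., first-year grade
# average) for the same examinees; for predictive validity the criterion is
# collected months or years later, for concurrent validity at the same time.
test_scores = np.array([55, 62, 48, 70, 66, 59, 45, 73, 52, 68])
criterion_scores = np.array([2.8, 3.1, 2.4, 3.6, 3.3, 2.9, 2.2, 3.7, 2.6, 3.4])

# The Pearson correlation between test and criterion is the validity coefficient.
validity_coefficient = np.corrcoef(test_scores, criterion_scores)[0, 1]
print(f"Criterion-related validity coefficient: r = {validity_coefficient:.2f}")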
3. CONSTRUCT VALIDITY: the third important type of validity. The term "construct validity" was first introduced in 1954 in the Technical Recommendations of the American Psychological Association and has since been used frequently by measurement theorists.
Construct validation is a more complex and difficult process than content and criterion validation. Hence an investigator decides to establish construct validity only when satisfied that no valid and reliable criterion is available and no universe of content is entirely satisfactory and adequate to define the quality measured by the test.
Anastasi (1968) defined construct validity as "the extent to which the test may be said to measure a theoretical construct or trait".
A construct is a non-observable trait, such as intelligence, anxiety, extraversion, or neuroticism.
The process of validation involves the following steps:
I. Specifying the possible different measures of the construct: here the investigator defines the construct in clear words and also states one or more supposed measures of that construct.
For example, suppose one wants to specify the different measures of the construct "intelligence". The investigator would first define the term "intelligence" and, in the light of that definition, specify the different measures. A number of specifications may be made, such as quick decision-making in difficult tasks, ability to learn, goal-directed behaviour, and original and critical thinking.
II. Determining the extent of correlation between all or some of the measures of the construct:
The second step is to determine whether those well-specified measures actually measure the intended construct. This is done by correlating the measures with each other. If the correlations are high, we have good evidence that they are measuring the same thing.
At this stage the investigator is often faced with a difficult judgement before the final decision, because some measures may correlate highly with each other while others show near-zero correlations.
III. Determining whether or not all or some of the measures act as if they were measuring the construct:
The next step is to determine whether such measures behave in an expected manner with reference to other variables of interest. If they behave as expected, they provide evidence for construct validity.
For example, highly intercorrelated measures from among the above supposed referents of intelligence should show moderate correlations with teachers' ratings, class grades, and examination marks.
It is a difficult process; a small sketch of the correlational check is given below.
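To illustrate steps II and III, the Python sketch below computes the intercorrelations among three invented measures of the construct "intelligence" and their correlations with an external variable (teacher ratings). All data are simulated and the measure names are assumptions made purely for illustration.

import numpy as np

rng = np.random.default_rng(0)
n = 50  # hypothetical number of examinees

# Simulated scores on three supposed measures of "intelligence" and on an
# external variable (teacher ratings) for the same examinees.
ability_to_learn = rng.normal(100, 15, n)
critical_thinking = 0.8 * ability_to_learn + rng.normal(0, 8, n)
decision_speed = 0.6 * ability_to_learn + rng.normal(0, 10, n)
teacher_rating = 0.4 * ability_to_learn + rng.normal(0, 12, n)

# Step II: high intercorrelations among the first three measures suggest they
# tap the same construct.  Step III: moderate correlations with teacher
# ratings (last row/column) are what the theory would lead us to expect.
measures = np.vstack([ability_to_learn, critical_thinking, decision_speed, teacher_rating])
print(np.round(np.corrcoef(measures), 2))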
FACTORS INFLUENCING VALIDITY
Validity of a test is influenced by several factors. Some of the important factors are enumerated below.
Length of the Test
Homogeneous lengthening of a test increases not only its reliability but also its validity. Other things being equal, the longer the test, the more reliable and valid it becomes.
Range of Ability
Like reliability, validity is also influenced by the range of ability in the sample used. If the subjects have a very limited range of ability (so that a wide range of scores is not possible), the validity coefficient will be low. On the other hand, if the subjects have a wide range of ability so that a wide range of scores is obtained, the validity coefficient of the test will be enhanced. A correction for this restriction of range is sketched below.
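As a rough numerical illustration of the effect of restricted range, the Python sketch below applies the widely cited correction for direct restriction of range on the predictor (the Pearson/Thorndike formula). The formula and figures are supplied here as an assumption for illustration; they are not part of the original text.

import math

def corrected_validity(r_restricted, sd_unrestricted, sd_restricted):
    # Estimate the validity coefficient in the unrestricted group from the
    # coefficient observed in a range-restricted group.
    k = sd_unrestricted / sd_restricted
    return (r_restricted * k) / math.sqrt(1 - r_restricted**2 + (r_restricted**2) * k**2)

# Hypothetical example: r = .30 observed in a selected group whose test-score
# SD is 5, while the SD in the full applicant pool is 10.
print(f"Estimated unrestricted validity: {corrected_validity(0.30, 10, 5):.2f}")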
Ambiguous Directions
If the directions of the test are ambiguous, they will be interpreted differently by different examinees. Moreover, such items tend to encourage guessing on the part of the examinees. As a consequence, the validity of the test is lowered.
Socio-cultural Differences
Cultural differences among societies are likely to affect the validity of a test. A test developed in one culture may not be valid in another culture because of differences in socio-economic status, sex ratios, social norms, and so on.
Addition of Inappropriate Items
When inappropriate items, particularly vague ones whose difficulty values differ widely from those of the original items, are added to a test, they are likely to lower both its reliability and its validity.