SYED HASSAN SHAH
KARGIL(LADAKH)
DEPT. OF PSYCHOLOGY
K.U.K. HARYANA(INDIA)
VALIDITY
Why evaluate tests?
 To make sure that a test measures the skill, trait, or attribute it is supposed to measure
 To yield reasonably consistent results for the same individual
 To measure with a reasonable degree of accuracy
A good test must first of all be valid.
VALIDITY
 Validity is an important characteristic of a scientific instrument.
 The term validity means truth or fidelity.
Thus, validity refers to the degree to which a test measures what it claims to measure.
Anastasi (1968) said, "the validity of a test concerns what the test measures and how well it does so".
 Paraphrasing the definition of validity from the influential Standards for Educational and Psychological Testing (AERA, APA & NCME, 1999), it can be said, "A test is valid to the extent that inferences made from it are appropriate, meaningful, and useful".
 In a broad sense, validity is concerned with generalizability. When a test is valid, its conclusions can be generalized to the population of interest.
PROPERTIES OF VALIDITY
Validity has three important properties:
1. Validity is a relative term. A test is not valid in general; it is valid only for a particular purpose.
For example, a test of statistical ability will be valid only for measuring statistical ability, because it is designed solely for that use. It would be worthless for measuring other subjects such as history or geography.
2. Validity is not a fixed property of the test, because validation is not a fixed process; rather, it is an unending process.
3. Validity, like reliability, is a matter of degree and not an all-or-none property. A test meant for measuring a particular trait or ability cannot be said to be either perfectly valid or not valid at all.
PURPOSES OF TESTING VALIDITY
There are three main purposes of testing validity:
1. Representation of a certain specified area of content: The tester may wish to determine how an examinee performs at present in a sample of situations (or contents) that the test claims to represent.
2. Establishment of a functional relationship with a variable available at present or in the future: The tester may wish to predict an examinee's future standing on a certain variable, or to determine his present standing on a particular variable.
3. Measurement of a hypothetical trait or quality (or construct): A tester may wish to determine the extent to which an examinee possesses some trait as measured by test performance.
TYPES OF VALIDITY
Three types of Validity are:
1. Content or Curricular Validity
2. Criterion-related Validity
3. Construct Validity
1. CONTENT VALIDITY: Related to subject matter and its sampling.
Content validity refers to the connection between the test items and the subject-related tasks. The test should evaluate only the content related to the field of study, in a manner that is sufficiently representative, relevant, and comprehensible.
Psychometricians are of the view that content validity requires both item validity and sampling validity.
 Item validity is concerned with whether the test items represent measurement in the intended content area.
 Sampling validity (sometimes called logical validity) is how well the test covers all of the areas it is meant to cover.
Content validity is examined in two ways:
i) By experts' judgement, and
ii) By statistical analysis (to ensure that all items measure the same thing).
In the statistical approach, scores on the test are correlated with scores on another test of the same content; a high correlation provides an index of content validity.
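As a minimal illustration of the statistical approach just described, the Python sketch below correlates scores on a new test with scores on an established test of the same content. The data and variable names are hypothetical.

```python
# Minimal sketch (hypothetical data): one statistical check on content
# validity, i.e. correlating scores on the test with scores on another
# test covering the same content.
import numpy as np
from scipy.stats import pearsonr

new_test = np.array([12, 15, 9, 18, 14, 11, 16, 13])        # scores on the new test
reference_test = np.array([14, 16, 10, 19, 15, 12, 17, 13]) # scores on an established test

r, p = pearsonr(new_test, reference_test)
print(f"correlation = {r:.2f}, p = {p:.3f}")
# A high positive correlation is taken as one index of content validity.
```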
The following points should be observed carefully to ensure full content validation of a test:
 The area of content/items should be specified explicitly so that all major portions are adequately covered by the items in proper proportion; otherwise item writers tend to include items that are readily available and easily written.
 Before item writing starts, the content area should be fully defined in clear words and must include the objectives.
 The relevance of item content should be established in the light of examinees' responses to that content, not in the light of the apparent relevance of the content.
FACE VALIDITY
Face validity is often confused with content validity, but in the strict sense it is quite different. Face validity refers not to what the test actually claims to measure but to what it appears to measure superficially. In other words, face validity is the mere appearance that the test has validity (Kaplan & Saccuzzo, 2001). Thus face validity should not be taken in the technical sense, nor should it be regarded as a substitute for objectively determined validity.
• When the test items look valid to the group of examinees, the test is said to have face validity.
• Thus face validity is, in fact, a matter of social acceptability and not a technical form of validity like content validity.
2. CRITERION-RELATED VALIDITY: Criterion-related validity is a very common and popular type of test validity. As its name implies, criterion-related validity is obtained by comparing (or correlating) the test scores with scores obtained on a criterion that is available at present or will become available in the future.
 Also referred to as instrumental validity, it requires that the criterion be clearly defined by the tester in advance. It demonstrates the accuracy of a measure or procedure by comparing it with another measure or procedure that has already been demonstrated to be valid.
There are two subtypes of criterion validity:
i). Predictive validity
ii). Concurrent validity
i). Predictive validity: also called empirical validity or statistical validity. As the name implies, in predictive validity the test is correlated against a criterion that becomes available sometime in the future. In other words, test scores are obtained first, a time gap of months or years is allowed to elapse, and then the criterion scores are obtained.
 The test scores and the criterion scores are correlated, and the obtained correlation becomes the validity coefficient.
 Predictive validity is needed for tests used for long-range forecasts of academic achievement, of success, and of reaction to therapy.
ii). Concurrent validity: is very similar to predictive validity except that there is no time gap between obtaining the test scores and the criterion scores. The test is correlated with a criterion that is available at the present time. The stronger the correlation, the greater the concurrent validity of the test. It is most suitable for diagnosing present status rather than predicting future outcomes. (A minimal computational sketch of a validity coefficient follows below.)
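To make the computation concrete, the sketch below (hypothetical data) shows how a validity coefficient is obtained for either subtype; the only difference is whether the criterion scores are collected after a time gap (predictive) or at the same time (concurrent).

```python
# Minimal sketch (hypothetical data): the validity coefficient is simply
# the correlation between test scores and criterion scores.
import numpy as np
from scipy.stats import pearsonr

test_scores = np.array([55, 62, 47, 70, 66, 58, 73, 49, 61, 68])
criterion_scores = np.array([2.8, 3.1, 2.4, 3.6, 3.3, 2.9, 3.7, 2.5, 3.0, 3.4])  # e.g. later GPA

validity_coefficient, _ = pearsonr(test_scores, criterion_scores)
print(f"validity coefficient = {validity_coefficient:.2f}")
```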
3. CONSTRUCT VALIDITY: is the third important type of validity. The term "construct validity" was first introduced in 1954 in the Technical Recommendations of the American Psychological Association and has since been used frequently by measurement theorists.
Construct validation is a more complex and difficult process than content or criterion validation. Hence an investigator undertakes construct validation only when fully satisfied that no valid and reliable criterion is available and that no universe of content is entirely satisfactory and adequate to define the quality measured by the test.
 Anastasi (1968) defined construct validity as "the extent to which the test may be said to measure a theoretical construct or trait".
A construct is a non-observable trait, such as intelligence, anxiety, extraversion, neuroticism, etc.
The process of validation involves the following steps:
I. Specifying the possible different measures of the construct: here the investigator defines the construct in clear words and also states one or many supposed measures of that construct.
For example, suppose one wants to specify the different measures of the construct "intelligence".
The investigator must first define the term "intelligence", and in the light of that definition he would be expected to specify the different measures.
A number of specifications may be made, for example:
 Quick decisions in difficult tasks, ability to learn, goal-oriented behaviour, original and critical thinking, etc.
II. Determining the extent of correlation between all or some of the measures of the construct:
The second step is to determine whether or not those well-specified measures actually lead to the measurement of the construct concerned. This is done by correlating the measures with one another. If the correlations are high, we have good evidence that they are measuring the same thing (see the sketch after this list).
At this stage the investigator is often faced with a dispute before the final decision, because some measures may correlate highly while others may show near-zero correlations.
III. Determining whether or not all or some measures act as if they were measuring the construct:
The next step is to determine whether or not such measures behave in an expected manner with reference to other variables of interest. If they behave in the expected manner, they provide evidence for the construct validity of the test.
For example, highly correlating measures from among the above supposed referents of intelligence should show moderate correlations with teachers' ratings, class grades, and examination marks.
It is a difficult process.
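The sketch below (simulated, hypothetical data and measure names) illustrates steps II and III: the supposed measures of a construct are first inter-correlated, and each is then checked against an external variable such as teachers' ratings.

```python
# Minimal sketch of steps II and III with simulated, hypothetical data.
import numpy as np

rng = np.random.default_rng(0)
n = 100
ability_to_learn = rng.normal(size=n)
quick_decisions = 0.8 * ability_to_learn + rng.normal(scale=0.6, size=n)
critical_thinking = 0.7 * ability_to_learn + rng.normal(scale=0.7, size=n)
teacher_rating = 0.5 * ability_to_learn + rng.normal(scale=0.9, size=n)

measures = np.vstack([ability_to_learn, quick_decisions, critical_thinking])

# Step II: inter-correlations among the supposed measures of the construct.
print("inter-correlations among measures:")
print(np.corrcoef(measures).round(2))

# Step III: each measure should show a moderate correlation with an external
# variable of interest if it behaves as the construct is expected to behave.
names = ["ability_to_learn", "quick_decisions", "critical_thinking"]
for name, m in zip(names, measures):
    r = np.corrcoef(m, teacher_rating)[0, 1]
    print(f"{name} vs teacher_rating: r = {r:.2f}")
```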
FACTORS INFLUENCING VALIDITY
Validity of a test is influenced by several factors. Some of the important factors are enumerated
below.
Length of the Test
Homogeneous lengthening of the test not only increases the reliability but also the validity of
the test. The longer the test, the more reliable and valid it becomes.
Range of Ability
Like reliability, validity is also influenced by the range of ability in the sample used. If the subjects have a very limited range of ability (so that a wide range of scores is not possible), the validity coefficient will be low. On the other hand, if the subjects have a wider range of ability, so that a wider range of scores is obtained, the validity coefficient of the test will be higher, as the sketch below illustrates.
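A simulated sketch (hypothetical data) of this range-restriction effect: with the same underlying relationship, the validity coefficient drops when only a narrow band of ability is sampled.

```python
# Minimal sketch of range restriction with simulated, hypothetical data.
import numpy as np

rng = np.random.default_rng(1)
ability = rng.normal(size=5000)
test_score = ability + rng.normal(scale=0.5, size=5000)
criterion = ability + rng.normal(scale=0.5, size=5000)

# Validity coefficient over the full range of ability.
r_full = np.corrcoef(test_score, criterion)[0, 1]

# Validity coefficient when only a narrow band (high scorers) is sampled.
narrow = test_score > 1.0
r_restricted = np.corrcoef(test_score[narrow], criterion[narrow])[0, 1]

print(f"full range:       r = {r_full:.2f}")
print(f"restricted range: r = {r_restricted:.2f}")  # noticeably lower
```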
Ambiguous Directions
If the directions of the test are ambiguous, they will be interpreted differently by different examinees. Moreover, ambiguous items tend to encourage guessing on the part of the examinees. As a consequence, the validity of the test will be lowered.
Socio-cultural Differences
Cultural differences among societies are likely to affect the validity of a test. A particular test developed in one culture may not be valid for another culture because of differences in socio-economic status, sex ratios, social norms, etc.
Addition of Inappropriate Items
When inappropriate items, particularly vague ones whose difficulty values differ widely from those of the original items, are added to the test, they are likely to lower both the reliability and the validity of the test.