RELIABILITY AND VALIDITY IN SOCIAL RESEARCH
Boonserm Booncharoenpol
Social researchers are very often asked, “How good is your instrument, i.e. your questionnaire or your interview sheet? Has it been tested for validity and reliability?” Afraid of such questions, social researchers blindly test their questionnaires or interview sheets with the reliability and validity methods of psychometric and educational testing. The formula of the psychologist Lee Cronbach[1] is normally used in such tests. In fact, the Cronbach formula is meant for psychometric tests, e.g. quantitative tests that measure psychological variables such as intelligence, aptitude, and personality traits. It is nonsensical for a social researcher to use the Cronbach formula to test a questionnaire or an interview sheet.
Why? Please read the following paragraph carefully.
Once some of these decisions are made and a measure is developed, which is a careful and tedious process, the relevant questions to raise are “how do we know that we are indeed measuring what we want to measure?” since the construct (concept, model, idea) that we are measuring is abstract, and “can we be sure that if we repeated the measurement we will get the same result?”. The first question is related to validity and second to reliability. Validity and reliability are two important characteristics of behavioral measure and are referred to as psychometric properties.[2]
From the paragraph:
Validity: how do we know that we are indeed measuring what we want to measure?
Reliability: can we be sure that if we repeated the measurement we will get the same result?
For example, suppose we want to test a man's IQ, and we write 20 questions in a test paper for him to solve. We then have to be sure that these questions really test IQ. This is the validity of the test questions.
If the same test paper is given to the same man repeatedly, let us say 50 times, and the scores of the 50 tests are the same, we accept that the test paper is reliable.
Social Science Research
In social science research, validity and reliability are useful characteristics of questionnaires and interview sheets. But the tests for validity and reliability are different from those of psychometric and educational testing.
For validity of the questions, consider the IQ test in psychometrics. Psychometricians have standard questions written by expert psychologists, so they can compare the scores from new questions with the scores from the standard questions. What about questionnaires in social sciences such as economics, tourism, political science, public administration, and business administration? We never have standard questions that are relevant to our work and accepted by experts as the convention of the field. Without such standard questions, we cannot compare the scores from our questionnaires with those from a standard one.
The Professional Testing Organization advises how to test validity as follows. Questionnaire validity is typically estimated by gathering a group of subject matter experts (SMEs) together to review the test items. Specifically, these SMEs are given the list of content areas specified in the test blueprint, along with the test items intended to be based on each content area. The SMEs are then asked to indicate whether or not they agree that each item is appropriately matched to the content area indicated.
Any items that the SMEs identify as being inadequately matched to the test blueprint, or flawed in any other way, are either revised or dropped from the test.[3] This is relevant to the case of social science research.
What to do with validity in social research questions? One thing to do is “Ask the experts before you ask questions to respondents.” Other things to do are: make the questions cover all the information needed, avoid ambiguous questions, avoid leading questions, and ask to the point.
For reliability of the questions, psychometricians can ask their clients the same or equivalent questions many times and observe whether the answers are the same or equivalent. Educationists can likewise test their students many times with the same or equivalent questions. Then they compare the scores and calculate the alpha coefficient of Lee Cronbach’s test.[4]
What about social science research? Can we ask the same respondent the same questions many times? We cannot. Therefore we do not know whether they would give the same or an equivalent answer. The alpha coefficient of Lee Cronbach’s test is then useless for social science research.
What to do with reliability in social research questions? Things to do are:
· Do not ask an ambiguous question, because respondents will answer it in many ways. Each question must correspond to a unique answer.
· The question must be clear and to the point.
· Do not ask leading questions.
Validity cannot be obtained through reliability. Even if we know that our set of questions has good reliability, we cannot conclude that it has good validity. We cannot calculate validity from reliability. There is no relation between them. Why not? Because:
· Validity: how do we know that we are indeed measuring what we want to measure?
· Reliability: can we be sure that if we repeated the measurement we will get the same result?
Nothing correlates the two characteristics.
Cronbach’s Coefficient Cannot Help Social Science Researchers
To estimate Cronbach’s coefficient α, we have to repeat the questions many times. In social research, unless we are doing controlled experimental research, we cannot ask respondents the same questions again and again. No one will cooperate with us to that degree. So Cronbach’s coefficient α can do nothing for social research in economics, business, tourism, political science, public administration, etc.
Some researchers calculate α from the answers of many respondents instead of from questions repeated to each respondent. That is not repeated questioning of each respondent, and therefore the result is not Cronbach’s coefficient.
Conclusion
Paying attention to the validity and reliability of questionnaires or interview sheets is a very good strategy for social research. But examine them qualitatively, even in economics research. Do not try to quantify them or you will be deceived. Mathematics and statistics can do nothing for the validity and reliability of social research. What should we do then?
The strategies. The strategies that bring validity and reliability to your questions at the same time are:
· Bring every detail of the information you need into your questions. (validity)
· If you have no experience with the topic you are researching, consult experienced persons. (validity)
· Do not ask an ambiguous question, because respondents will answer it in many ways. Each question must correspond to a unique answer. (validity and reliability)
· The question must be to the point. (validity)
· Do not ask leading questions. (validity and reliability)
Please do not convert your qualitative questions into quantitative figures. It may look good on paper but it is meaningless.
…………………………
APPENDICES
UNDERSTANDING CRONBACH’S RELIABILITY
Boonserm Booncharoenpol
A better understanding of score reliability can resolve common misconceptions.
Lee Cronbach, a psychometrician, tried to test whether a question or a set of questions that a tester asks repeatedly will make the respondent give the same answer each time. If the answers from the repeated tests are the same, Cronbach’s reliability is perfect (α coefficient = 1). If they are not the same but close together, Cronbach’s reliability is high. If they are very different, Cronbach’s reliability is poor. Some academics set up the scale in the table below.[5] However, there is no convention about this scale. You may or may not agree with it.
| Cronbach's alpha | Internal consistency |
| α ≥ 0.9 | Excellent |
| 0.8 ≤ α < 0.9 | Good |
| 0.7 ≤ α < 0.8 | Acceptable |
| 0.6 ≤ α < 0.7 | Questionable |
| 0.5 ≤ α < 0.6 | Poor |
| α < 0.5 | Unacceptable |
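The scale can be read as a simple threshold lookup. The sketch below is only an illustration of the table; the function name interpret_alpha is an assumption, not something the cited source defines.

```python
def interpret_alpha(alpha: float) -> str:
    """Return the label that the scale table assigns to an alpha coefficient.

    The name of this helper is hypothetical; the thresholds are copied
    from the table in the text.
    """
    if alpha >= 0.9:
        return "Excellent"
    if alpha >= 0.8:
        return "Good"
    if alpha >= 0.7:
        return "Acceptable"
    if alpha >= 0.6:
        return "Questionable"
    if alpha >= 0.5:
        return "Poor"
    return "Unacceptable"


# Print the label for a few sample coefficients.
for value in (0.95, 0.72, 0.41):
    print(value, interpret_alpha(value))
```

Because the scale is not a convention of the field, the thresholds are kept in one place so they can be changed easily.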
Example. A professor organized ‘a Test of English for Politicians’ with 5 language items: grammar, reading, listening, conversation, and writing. The professor asked a politician to take the test once a week for three weeks, which made three tests. The results of the tests are in the table below.
RESULTS OF ENGLISH TESTS FOR MR. A
| Item | 1st Test | 2nd Test | 3rd Test |
| Grammar | 65 | 62 | 68 |
| Reading | 80 | 85 | 78 |
| Listening | 62 | 72 | 71 |
| Conversation | 59 | 65 | 61 |
| Writing | 8 | 12 | 11 |
Solution:
Find the correlation coefficient between the 1st Test and the 2nd Test, the 1st Test and the 3rd Test, and the 2nd Test and the 3rd Test. The results are as follows.
r1.2 = 0.9855   r1.3 = 0.9896   r2.3 = 0.9850
r average = (0.9855 + 0.9896 + 0.9850)/3 = 0.9867
Cronbach’s alpha coefficient = rk / [1 + (k – 1)r]
where
r : r average = 0.9867
k : items = 5
α = 0.9867*5 / [1 + (5 – 1)*0.9867]
  = 4.9335 / [1 + 4*0.9867]
  = 4.9335 / [1 + 3.9468]
  = 4.9335 / 4.9468
  = 0.9973
Therefore the set of English test questions for Mr. A is reliable, with a Cronbach’s alpha coefficient of 0.9973. The interpretation is that the results from repeated testing do not differ much, i.e. the questions in the test have high reliability.
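The arithmetic of this example can be checked with a short Python sketch. It follows the steps of the solution as written, including the choice of k = 5 items. The helper names pearson_r and alpha_from_mean_r are assumptions made for illustration, and the scores are those of Mr. A's three tests, listed in the order of the table rows.

```python
from itertools import combinations
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length score lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def alpha_from_mean_r(mean_r, k):
    """Formula used in the example: alpha = r*k / [1 + (k - 1)*r]."""
    return mean_r * k / (1 + (k - 1) * mean_r)


# Scores of Mr. A on the five items, one list per test (rows of the table in order).
tests = {
    "1st": [65, 80, 62, 59, 8],
    "2nd": [62, 85, 72, 65, 12],
    "3rd": [68, 78, 71, 61, 11],
}

# Correlate every pair of tests: (1st, 2nd), (1st, 3rd), (2nd, 3rd).
pairwise = {
    (a, b): pearson_r(tests[a], tests[b]) for a, b in combinations(tests, 2)
}
mean_r = sum(pairwise.values()) / len(pairwise)

# k = 5 follows the worked example, which counts the five language items.
alpha = alpha_from_mean_r(mean_r, k=5)

for (a, b), r in pairwise.items():
    print(f"r({a}, {b}) = {r:.4f}")
print(f"r average = {mean_r:.4f}")
print(f"alpha = {alpha:.4f}")
```

Passing a different k to alpha_from_mean_r shows how strongly the result depends on that choice.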
Try some other respondents too. Testing repeated answers from only one respondent is not dependable. That man may have something on his mind. We had better do the same thing, repeating the questions at least three times, with other respondents (5 respondents should be enough) and find the alpha coefficient for each respondent. Then we average the alpha coefficients of all respondents to obtain a more dependable Cronbach’s alpha coefficient. If you find that the Cronbach’s alpha coefficients of the respondents are considerably different, your questions are awfully unreliable. Please improve your questions.
Do not make the wrong calculation. Many researchers do not repeat the test on the same respondents. They use answers from many respondents to calculate the Cronbach’s alpha coefficient, thinking that this is a repeated test. No, that is not a case of repeated questions and answers. So even though they can calculate the figures and get a result, it is not the Cronbach’s alpha coefficient. It is a test of how closely all the questions in the questionnaire are correlated. We do not want that result. We want to know how much the answers differ when a respondent repeats his answers.
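For comparison, the cross-respondent figure described above, which statistical packages commonly report under the name Cronbach's alpha, can be sketched as follows. The sketch uses the variance-based formula α = k/(k – 1) × (1 – sum of the question variances / variance of the total scores). The function name cronbach_alpha and the answer matrix are hypothetical, made up only to show the mechanics; they are not data from any study.

```python
from statistics import pvariance


def cronbach_alpha(scores):
    """Variance-based alpha across respondents.

    scores: one row per respondent, one column per question.
    """
    k = len(scores[0])  # number of questions
    item_variances = [pvariance(column) for column in zip(*scores)]
    total_variance = pvariance([sum(row) for row in scores])
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)


# Hypothetical answers of five respondents to three questions, each asked once.
responses = [
    [4, 5, 4],
    [2, 3, 2],
    [5, 5, 4],
    [1, 2, 1],
    [3, 3, 3],
]

alpha = cronbach_alpha(responses)
print(f"alpha across respondents = {alpha:.4f}")
```

No respondent answers a question twice in this calculation. The figure rises when the questions move together across respondents, which is the correlation among questions described above.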
……………………………
REFERENCES
Business Dictionary. Retrieved April 3, 2011 from http://www.businessdictionary.com/definition/test-validity.html#ixzz2LGJmBQ55
Cronbach, L. J. (1951). Coefficient alpha and the internal structure of tests. Psychometrika, 16(3), 297–334.
Knowledge Base. Retrieved June 25, 2010 from http://www.socialresearchmethods.net/kb/relandval.php
McIver, J. P., & Carmines, E. G. (1981). Unidimensional scaling. Thousand Oaks, CA: Sage. p. 15.
Professional Testing Organization. Test Validity. Retrieved November 8, 2012 from http://www.proftesting.com/test_topics/pdfs/test_quality_validity.pdf
Research Methods. Retrieved May 22, 2010 from http://allpsych.com/researchmethods/validityreliability.html
Test validity. From Wikipedia, the free encyclopedia. Retrieved March 3, 2011 from http://en.wikipedia.org/wiki/Test_validity
…………………………
[1] Cronbach, L. J. (1951). Coefficient alpha and the internal structure of tests. Psychometrika, 16(3), 297–334.
[2] Validity and reliability. Retrieved February 16, 2013 from http://www.stat.purdue.edu/~bacraig/SCS/VALIDITY%20AND%20RELIABILITY.doc
[3] Professional Testing Organization. Test Validity. Retrieved April 12, 2011 from http://www.proftesting.com/test_topics/pdfs/test_quality_validity.pdf
[4] Cronbach, L. J. (1951). Coefficient alpha and the internal structure of tests. Psychometrika, 16(3), 297–334.
[5] Wikipedia. Cronbach's alpha. Retrieved August 15, 2010 from http://en.wikipedia.org/wiki/Cronbach's_alpha