Journal of Epidemiology and Community Health 1992; 46: 465-469. First published as 10.1136/jech.46.5.465 on 1 October 1992.
REVIEW ARTICLE

Methods of assessing personality for epidemiological study

J E J Gallacher

MRC Epidemiology Unit, Llandough Hospital, Penarth, S Glamorgan CF6 1XX, United Kingdom
Accepted for publication April 1992

"Personality has different meanings for theologians, philosophers and sociologists, and in psychology it has been used in many ways" wrote G W Allport some 50 years ago.[1] Writing today, he might well have included epidemiologists amongst those challenged to develop their own approach to the problem of personality. Popularly, personality is understood largely in terms of traits or types using categories like Eysenck's "introvert" or "neurotic".[2] In a more technical sense, the term "personality" is a useful rubric for the more complex aspects of psychological function which account for consistent patterns of behaviour characterising individuals. Implied in this definition are concepts like intelligence, emotional disposition, and patterns of social interactions. This distinguishes personality tests such as IQ tests from physiological tests such as galvanic skin response and measures of the social and physical environment such as a life events scale.

Personality assessment began with concern over mental health.[3] The first personality tests were measures of general intellectual performance and were developed to identify children requiring remedial education.[4] Tests of emotional adjustment were then developed for military personnel selection in World War I.[5] These applications illustrate the two main ways personality assessment has been used. In the first, personality is conceived as the dependent variable of a pathological process. In the second, personality is an independent variable predicting behaviour. Early in the history of epidemiology, personality was considered almost exclusively as a dependent variable of a disease process. This confined the types of legitimate hypothesis largely to those involving mental illness[6] or poor intellectual performance.[7] More recently, however, personality is being increasingly considered as an independent variable in a variety of health related processes, including disease processes,[8] therapeutic processes,[9,10] and public health programmes.[11] This simple development of thought has facilitated a much greater variety of hypothesis involving personality. For this reason it is becoming increasingly important for the epidemiologist to be familiar with the opportunities afforded by, and the limitations of, personality assessment.

Principles of personality measurement

RATIONALE
The rationale behind assessing personality from an epidemiological point of view is straightforward. The aim is to predict behaviour which is in some way related to disease. Personality assessment is merely a series of methods designed to sample the thoughts, feelings, and actions of the individual in a way that will allow accurate and efficient prediction. Naturally, constant and direct observation is the most accurate method but increasing efficiency by the sampling of behaviour and its antecedents is generally considered to be well worth some loss in accuracy. The range of personality tests reflects the wide variety of ways in which the factors used to predict behaviour are sampled. For example, behaviour itself may be sampled either directly as in a performance test or indirectly as in a self report test. Antecedents of behaviour, such as emotional disposition, may also be of predictive value, but these can only be assessed indirectly as they cannot be observed.

The principles of measurement as applied to personality are no different from those in any other discipline: the measure should be of a clearly defined construct, of known repeatability, of known validity, and administered according to a standard procedure. Two factors, however, have proven to be critical in the shaping of personality assessment. The first is that personality is itself an abstract concept. Although personality tests are designed to predict behaviour, the concepts used to explain the behaviour are abstract. For example, IQ score is based on performance which is directly observable, but the concept of intelligence used to make sense of the score is not. This problem is not unique to personality assessment. A similar problem is faced by physicians in trying to define health[12] and biologists in trying to define life.[13]

The abstract nature of personality means that ultimately personality tests measure themselves, as there can be no independent verification of their performance. Nevertheless, not all tautologies are sterile and personality testing has been found to be very useful in spite of this inability to establish its objectivity independently.

The second factor that has shaped personality assessment is that subjects are active participants in the measurement process. Personality tests are done by subjects and not to them. As a result the measurement process itself can contribute towards the test score and may bias it with respect to the hypothesis being tested. For example, scores on IQ tests increase with practice[14] but this does not necessarily signify an increase in IQ!

CLEAR DEFINITION
If these two factors are applied to each of the principles of measurement in turn, the constraints
of personality testing become clear. The need for clear definition of constructs is a difficult problem when the construct is wholly abstract, as there is no observable foundation upon which agreement can be based. This problem has been a major impetus towards the great diversity of theories and models of personality which have been developed.

REPEATABILITY
In strict statistical terms the repeatability of personality tests cannot be directly assessed, as truly replicate measurements cannot be made. This is due to the subject participating in and being affected by the measurement process. For example, learning affects the repeatability of performance tests while familiarity with a testing procedure reduces anxiety, affecting the self report of thoughts and feelings. Conveniently, therefore, the consistency of test performance is termed "reliability" and refers to a variety of procedures which attempt to provide an assessment of measurement error. The assumption underlying these procedures is that the precision of an estimate can be gained from a series of repeated measurements.[15,16] This argument is applied crudely with regards to personality assessment as, when comparing differences in measurements, no attempt is made to distinguish between biological and measurement variation or between systematic and random variation. Nevertheless, the resulting procedures provide simple and conservative estimates of a test's precision by comparing equivalent measurements of the same construct to identify the level of agreement.

Test-retest reliability refers to the repeating of the test on a second occasion. The reliability coefficient in this case is simply the correlation between the two test scores. Of particular importance is the interval between testing, as retest correlations will tend to diminish as the interval lengthens. Practice and learning effects apart, the retest correlation tends to show the generalisability of scores between testing occasions.

Alternate form reliability refers to the comparison of scores from two alternative forms of the test on the same or equivalent populations. The reliability coefficient is the correlation between the two scores. Although this type of reliability overcomes the problems of test-retest reliability there are few tests which have alternative forms and so in practice it is rarely used.

Split-half reliability refers to the comparison of scores between comparable halves of a single test. The reliability coefficient is the correlation between halves of the test and is a measure of the internal consistency of the test. The internal consistency of a test may also be assessed by comparing scores between all items. This is called inter-item reliability and may be calculated using a formula derived by Kuder and Richardson.[17] Internal consistency may also be used to assess the efficiency of measurement but this can be a double edged sword: too low a reliability coefficient and more items are needed to improve the consistency of measurement but a very high reliability coefficient suggests some items could be discarded with little loss to accuracy. Therefore reliability coefficients based on the principle of internal consistency are rarely above r=0.8. The effect of adding or discarding items is given by the Spearman-Brown formula[3] and is often quoted in test manuals to show that a reasonable balance has been achieved between consistency of performance and length of test.

VALIDITY
Due to the abstract nature of personality the validity of a test can only be established indirectly. This problem is not confined to personality assessment as it also arises in measures of symptoms such as in chest pain questionnaires assessing coronary heart disease.[18] As with measures of chest pain, the difficulty of establishing the validity of personality tests is considered an acceptable price to pay for the proven worth of a good test. There are three main types of evidence used to assess validity. Content validity refers to the contents of the test items. The question asked is "Do they cover the areas needing to be covered?" Construct validity refers to the relationships between test scores and those assessing other constructs. The question being asked is "Are the relationships those that would be expected?" Criterion-related validity refers to the measure's ability to predict a specific outcome. The question being asked is "Does this test predict, for example, performance or health or job promotion?"

STANDARD PROCEDURE
The importance of standard procedure is emphasised in personality assessment due to the subject's active participation in the measurement process. From the subject's point of view there are two main sets of factors which affect every assessment situation. The first is the test taking habits that the subject brings to the situation.[19] These personal biases are called "response sets" and include acquiescence (a tendency to accept or reject all items as applying to oneself), evasiveness/extremeness (giving indifferent or extreme responses), social desirability (giving responses that are socially acceptable), and cautiousness (omitting difficult items in a performance test).[20] The second set of factors is the demand characteristics of the test. Subjects respond to what they think they are required to do. This frequently includes helping the experimenter to achieve the hoped for results, rather than providing a fair test of the hypothesis. An experimenter's own expectation may affect the data, therefore, by producing behaviour which inadvertently affects the subjects and so achieves a self fulfilling prophecy.[21] Experimenter age, sex, race, and socioeconomic status have all been shown to affect test scores.[3] Environmental factors are also important, eg, lighting and noise level. Although neither response sets nor demand characteristics can be eliminated their effects can be held constant throughout a study by maintaining standard procedures.

Methods of assessment
In spite of the difficulties of personality assessment it has proven capable of predicting behaviour sufficiently accurately to be of scientific use. The wide variety of assessment procedures may be easily understood in terms of four
main types of test comprising performance, self report, structured interview, and projective tests. Each method has its own strengths and weaknesses and applicability to population studies. The main characteristics of the tests cited below are given in the table.

Table: The main characteristics of the instruments cited in the text

| Type of test | Instrument | Reference No | Construct measured (subscales) | Number of items | Duration (minutes) |
|---|---|---|---|---|---|
| Performance (administered individually) | Porteus maze test | 22 | Intelligence; impulsivity; obsessionality | 9 (Vineland revision) | 30 |
| | Wechsler adult intelligence scale | 23 | Intelligence: verbal IQ (6 scales), performance IQ (5 scales) | 108 | 90 |
| Self report (can be administered in groups) | Beck depression scale | 26 | Depression | 21 | 5-10 |
| | Eysenck personality inventory | 29 | Global personality: introversion/extroversion, neuroticism | 57 (Form A) | 10 |
| | General health questionnaire | 24 | Psychiatric caseness | 30 | 5-10 |
| | Jenkins activity survey | 27 | Type A behaviour | 52 | 20 |
| Structured interview (administered individually) | Interview schedule for social interaction | 33 | Social support: availability of attachment, adequacy of attachment, availability of social integration, adequacy of social integration | 52 | 45 |
| | Standardised psychiatric interview | 34 | Psychiatric caseness: somatic symptoms, cognitive abnormality | Variable | 10-60 |
| | Video structured interview | 32 | Type A behaviour: hostility, time urgency | 27 | 20 |
| Projective (administered individually) | Rorschach inkblot test | 35 | Maladaptive personality | 10 | Variable |
| | Rosenzweig picture frustration study | 37 | Aggression: direction of aggression, type of aggression | 24 | Variable |
| | Thematic apperception test | 36 | Structure of personality | 20 | Variable |

PERFORMANCE TESTS
In performance tests, scores are derived from direct observation of the subject performing a task. Performance testing reduces to a minimum issues of observer bias and subject bias as the subject can either perform the task or not. The limitation of performance testing involves generalisability, as few performance tests can be related to more remote aspects of personality. A notable exception to this rule is the Porteus maze test which can be used to assess obsessionality.[22] The lack of generalisability tends to limit performance testing to specific abilities such as numeric and verbal reasoning or dexterity skills. With regard to their use in population studies performance tests tend to be intimidating, as they invariably include items that a subject cannot do, or cannot do well. They can also be long and are usually administered individually. Increasingly performance tests are being computerised, which reduces the need for special training and allows more subjects to be assessed simultaneously. The best known type of performance test is the IQ test, for example the Wechsler adult intelligence scale.[23] The debate over racial differences in IQ scores illustrates the difficulties in extrapolating from performance scores to personality.

SELF REPORT TESTS
Self report questionnaires provide data which are more generalisable than those from performance tests. The questionnaire format facilitates standardisation of question and response but also limits the flexibility of the instrument. Self report questionnaires are best used to enable the subject to report thoughts and feelings as well as behaviour in situations which cannot easily be observed directly. The relative ease of standardisation of procedure, the variety of areas which can be covered, and the possibility of group measurement make self report questionnaires very attractive for use in population studies. Self report tests which have been used in population studies include the general health questionnaire assessing psychiatric caseness,[24] the state-trait anxiety inventory,[25] the Beck depression scale,[26] and the Jenkins activity survey assessing type A behaviour.[27] Although self report questionnaires minimise bias from the observer, opportunities for bias in the subject remain, with problems of the demand characteristics of the measurement situation and the subject's response sets, such as social desirability and acquiescence discussed earlier. In addition to the self report scales described above, which tend to assess specific aspects of personality, the self report format has also been used in developing more comprehensive assessments of personality. These are what are popularly understood as "personality tests" and include the 16PF (measuring 16 personality factors),[27] the Eysenck personality inventory,[29] the Myers-Briggs type indicator,[30] and the Minnesota multiphasic personality inventory[31] among others. Although these tests have been carefully developed they are time consuming and can be intimidating. This limits their applicability to population studies.

STRUCTURED INTERVIEWS
Structured interviews provide a more flexible framework for response than self report tests, as the interviewer can probe the responses to ensure their relevance to the study. In a structured interview questions are asked in order to obtain biographical information and to observe the subject directly. The most widely known structured interview is that used to assess type A behaviour.[32] In this interview the subject is scored by his or her self reports of stress and according to positive gesticulation and voice characteristics. Structured interviews have also been used to assess social networks[33] and psychiatric caseness.[34] From an epidemiological viewpoint the strengths of a structured interview are its flexibility of administration and the fact that it allows direct observations of the subject. The limitations of a structured interview are the requirement for intensive training of the interviewer, the difficulties in maintaining quality control of an essentially subjective scoring procedure, and the need to interview subjects individually. Quality control in terms of consistency between interviewers and consistency over time for the same interviewer can be assessed and standards maintained but it requires careful planning and is a burden on interviewer resources.

PROJECTIVE TESTS
The last category of personality tests are projective tests which require the subject to interpret an ambiguous stimulus, ie, to project his or her personality onto the stimulus. The response format is generally very flexible as the emphasis is on freedom of expression rather than ease of quantification. Examples of projective tests include the Rorschach inkblot test,[35] the thematic apperception test,[36] and the Rosenzweig picture frustration study.[37] Projective tests are generally unsuitable for use in population studies, usually require intensive training to be used competently, have poor validity and reliability (except in some specific clinical applications), can be time consuming, and cannot be administered in groups.

Interpreting personality data

TRUSTWORTHINESS
Interpreting personality data involves the two issues of trustworthiness and interpreting causality. Untrustworthy findings can be due to random or systematic error. Sources of error which are peculiar to personality assessment are the response sets and demand characteristics described earlier. Response sets and demand characteristics can have a random element due either to how people are "feeling" on the day of the test or to poor standardisation of measurement procedures. Random error results in the misclassification of individuals due to an increased standard error of measurement and obscures associations. In clinical practice, the potential for misclassification of individuals is so important that specialised training is necessary for the use of personality tests. In epidemiology, which is concerned about classifying groups, any effect of the standard error of measurement can be reduced by increasing sample size, although this will not affect associations being obscured.

Response sets and demand characteristics tend to be consistently expressed by individuals and by experimenters, and so they may introduce a systematic element to measurement error. Systematic error distorts mean values but if the error is of the same magnitude for all groups, the differences between them are not compromised and neither are any associations. In case-controlled, cross sectional, and cohort studies the assumption of a constant level of systematic measurement error across the sample calls for careful judgement. In intervention studies this assumption may be more confidently made as all spurious differences between groups are taken account of in the randomisation process.

CAUSALITY
Epidemiologists will be more aware than most of the difficulty in inferring causality from complex data.[38] For personality data the critical factor in establishing causality is determining the sequence of events. The propensity for change of personality data means that the sequence of events cannot be assumed when testing a hypothesis. Consequently case-controlled and cross sectional studies are only, generally speaking, of exploratory value. For example, an association of social class and prevalent heart disease may be sufficient to strongly suggest an effect of lifestyle on heart disease. A similar conclusion could not be drawn from an association of stress with prevalent heart disease, since stress score may be affected by health status. The sequence of events is not important when using personality variables as matching variables in the selection of controls or as covariates in statistical analyses. In case-controlled and cross sectional studies, therefore, personality variables are of most value as control variables. For hypothesis testing involving personality, in general the sequence of events should be firmly established in the design of the study. In cohort studies baseline personality measurement may be used to predict incident disease[39] and in intervention studies a psychological intervention may be used to change a personality variable in order to reduce subsequent morbidity or mortality.[10]
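
The reliability coefficients described under REPEATABILITY and the two kinds of error described under TRUSTWORTHINESS can be illustrated with a small simulation. The sketch below is an illustration only and is not taken from the article: the function names, the simulated "trait" and "outcome" variables, the error sizes, the constant shift, and the sample size are all assumptions chosen for the example. It computes a test-retest coefficient as a plain correlation, applies the standard Spearman-Brown formula to a hypothetical coefficient, and shows that random error weakens an observed association while a constant systematic error leaves associations and group differences unchanged.

```python
import random
import statistics


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def spearman_brown(r, k):
    """Predicted reliability when test length is multiplied by k.

    Standard Spearman-Brown formula: k*r / (1 + (k - 1)*r).
    """
    return k * r / (1 + (k - 1) * r)


# Hypothetical simulated data: every value here is an assumption.
rng = random.Random(1)
n = 5000
trait = [rng.gauss(0, 1) for _ in range(n)]      # unobserved "true" score
outcome = [t + rng.gauss(0, 1) for t in trait]   # outcome related to the trait
test_1 = [t + rng.gauss(0, 1) for t in trait]    # first testing, random error
test_2 = [t + rng.gauss(0, 1) for t in trait]    # retest, independent random error
shifted = [t + 2.0 for t in trait]               # constant systematic error

retest_r = pearson_r(test_1, test_2)             # test-retest reliability coefficient
r_error_free = pearson_r(trait, outcome)
r_random_error = pearson_r(test_1, outcome)      # association weakened by random error
r_systematic = pearson_r(shifted, outcome)       # association with a constant shift

# Difference between two arbitrary groups, with and without the constant shift.
half = n // 2
diff_true = statistics.fmean(trait[:half]) - statistics.fmean(trait[half:])
diff_shifted = statistics.fmean(shifted[:half]) - statistics.fmean(shifted[half:])

print(f"test-retest reliability:               {retest_r:.2f}")
print(f"association, error-free score:         {r_error_free:.2f}")
print(f"association, score with random error:  {r_random_error:.2f}")
print(f"association, score with constant bias: {r_systematic:.2f}")
print(f"group difference, true vs shifted:     {diff_true:.3f} vs {diff_shifted:.3f}")
print(f"Spearman-Brown, r=0.6 doubled length:  {spearman_brown(0.6, 2):.2f}")
```

Under these assumptions a larger `n` narrows the sampling variability of each estimate, but the correlation based on the error-laden score stays near its weakened value, which is the point made in the text about sample size not restoring obscured associations.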
Selecting a personality test

[Figure: A model for the selection of personality tests for epidemiological studies.]

Selecting a personality test for use in an epidemiological study is not simply a matter of browsing through test manuals, consulting the local psychology department, or following the literature. Test manuals usually provide inadequate information, as few tests have been developed with an epidemiological application in mind. For example, rarely have tests been developed on representative population samples or with the need to achieve high response rates. Consequently, descriptions of test populations appear to be casual and uninformative. Although most psychologists are eager to be of assistance few are