Statistical Methods in Psychology Journals
Guidelines and Explanations

Leland Wilkinson and the Task Force on Statistical Inference
APA Board of Scientific Affairs

In the light of continuing debate over the applications of significance testing in psychology journals and following the publication of Cohen's (1994) article, the Board of Scientific Affairs (BSA) of the American Psychological Association (APA) convened a committee called the Task Force on Statistical Inference (TFSI) whose charge was "to elucidate some of the controversial issues surrounding applications of statistics including significance testing and its alternatives; alternative underlying models and data transformation; and newer methods made possible by powerful computers" (BSA, personal communication, February 28, 1996). Robert Rosenthal, Robert Abelson, and Jacob Cohen (cochairs) met initially and agreed on the desirability of having several types of specialists on the task force: statisticians, teachers of statistics, journal editors, authors of statistics books, computer experts, and wise elders. Nine individuals were subsequently invited to join and all agreed. These were Leona Aiken, Mark Appelbaum, Gwyneth Boodoo, David A. Kenny, Helena Kraemer, Donald Rubin, Bruce Thompson, Howard Wainer, and Leland Wilkinson. In addition, Lee Cronbach, Paul Meehl, Frederick Mosteller and John Tukey served as Senior Advisors to the Task Force and commented on written materials.

The TFSI met twice in two years and corresponded throughout that period. After the first meeting, the task force circulated a preliminary report indicating its intention to examine issues beyond null hypothesis significance testing. The task force invited comments and used this feedback in the deliberations during its second meeting.

After the second meeting, the task force recommended several possibilities for further action, chief of which would be to revise the statistical sections of the American Psychological Association Publication Manual (APA, 1994). After extensive discussion, the BSA recommended that "before the TFSI undertook a revision of the APA Publication Manual, it might want to consider publishing an article in American Psychologist, as a way to initiate discussion in the field about changes in current practices of data analysis and reporting" (BSA, personal communication, November 17, 1997).

This report follows that request. The sections in italics are proposed guidelines that the TFSI recommends could be used for revising the APA publication manual or for developing other BSA supporting materials. Following each guideline are comments, explanations, or elaborations assembled by Leland Wilkinson for the task force and under its review. This report is concerned with the use of statistical methods only and is not meant as an assessment of research methods in general. Psychology is a broad science. Methods appropriate in one area may be inappropriate in another.

The title and format of this report are adapted from a similar article by Bailar and Mosteller (1988). That article should be consulted, because it overlaps somewhat with this one and discusses some issues relevant to research in psychology. Further detail can also be found in the publications on this topic by several committee members (Abelson, 1995, 1997; Rosenthal, 1994; Thompson, 1996; Wainer, in press; see also articles in Harlow, Mulaik, & Steiger, 1997).

Method

Design

Make clear at the outset what type of study you are doing. Do not cloak a study in one guise to try to give it the assumed reputation of another. For studies that have multiple goals, be sure to define and prioritize those goals.

There are many forms of empirical studies in psychology, including case reports, controlled experiments, quasi-experiments, statistical simulations, surveys, observational studies, and studies of studies (meta-analyses). Some are hypothesis generating: They explore data to form or sharpen hypotheses about a population for assessing future hypotheses. Some are hypothesis testing: They assess specific a priori hypotheses or estimate parameters by random sampling from that population. Some are meta-analytic: They assess specific a priori hypotheses or estimate parameters (or both) by synthesizing the results of available studies.

Some researchers have the impression or have been taught to believe that some of these forms yield information that is more valuable or credible than others (see Cronbach, 1975, for a discussion). Occasionally proponents of some research methods disparage others. In fact, each form of research has its own strengths, weaknesses, and standards of practice.

Jacob Cohen died on January 20, 1998. Without his initiative and gentle persistence, this report most likely would not have appeared. Grant Blank provided Kahn and Udry's (1986) reference. Gerard Dallal and Paul Velleman offered helpful comments.

Correspondence concerning this report should be sent to the Task Force on Statistical Inference, c/o Sangeeta Panicker, APA Science Directorate, 750 First Street, NE, Washington, DC 20002-4242. Electronic mail may be sent to spanicker@apa.org.

August 1999 • American Psychologist
Copyright 1999 by the American Psychological Association, Inc. 0003-066X/99/$2.00
Vol. 54, No. 8, 594-604

Population

The interpretation of the results of any study depends on the characteristics of the population intended for analysis. Define the population (participants, stimuli, or studies) clearly. If control or comparison groups are part of the design, present how they are defined.

Psychology students sometimes think that a statistical population is the human race or, at least, college sophomores. They also have some difficulty distinguishing a class of objects versus a statistical population: that sometimes we make inferences about a population through statistical methods, and other times we make inferences about a class through logical or other nonstatistical methods. Populations may be sets of potential observations on people, adjectives, or even research articles. How a population is defined in an article affects almost every conclusion in that article.

Sample

Describe the sampling procedures and emphasize any inclusion or exclusion criteria. If the sample is stratified (e.g., by site or gender) describe fully the method and rationale. Note the proposed sample size for each subgroup.

Interval estimates for clustered and stratified random samples differ from those for simple random samples. Statistical software is now becoming available for these purposes. If you are using a convenience sample (whose members are not selected at random), be sure to make that procedure clear to your readers. Using a convenience sample does not automatically disqualify a study from publication, but it harms your objectivity to try to conceal this by implying that you used a random sample. Sometimes the case for the representativeness of a convenience sample can be strengthened by explicit comparison of sample characteristics with those of a defined population across a wide range of variables.

Assignment

Random assignment. For research involving causal inferences, the assignment of units to levels of the causal variable is critical. Random assignment (not to be confused with random selection) allows for the strongest possible causal inferences free of extraneous assumptions. If random assignment is planned, provide enough information to show that the process for making the actual assignments is random.

There is a strong research tradition and many exemplars for random assignment in various fields of psychology. Even those who have elucidated quasi-experimental designs in psychological research (e.g., Cook & Campbell, 1979) have repeatedly emphasized the superiority of random assignment as a method for controlling bias and lurking variables. "Random" does not mean "haphazard." Randomization is a fragile condition, easily corrupted deliberately, as we see when a skilled magician flips a fair coin repeatedly to heads, or innocently, as we saw when the drum was not turned sufficiently to randomize the picks in the Vietnam draft lottery. As psychologists, we also know that human participants are incapable of producing a random process (digits, spatial arrangements, etc.) or of recognizing one. It is best not to trust the random behavior of a physical device unless you are an expert in these matters. It is safer to use the pseudorandom sequence from a well-designed computer generator or from published tables of random numbers. The added benefit of such a procedure is that you can supply a random number seed or starting number in a table that other researchers can use to check your methods later.

Nonrandom assignment. For some research questions, random assignment is not feasible. In such cases, we need to minimize effects of variables that affect the observed relationship between a causal variable and an outcome. Such variables are commonly called confounds or covariates. The researcher needs to attempt to determine the relevant covariates, measure them adequately, and adjust for their effects either by design or by analysis. If the effects of covariates are adjusted by analysis, the strong assumptions that are made must be explicitly stated and, to the extent possible, tested and justified. Describe methods used to attenuate sources of bias, including plans for minimizing dropouts, noncompliance, and missing data.

Authors have used the term "control group" to describe, among other things, (a) a comparison group, (b) members of pairs matched or blocked on one or more nuisance variables, (c) a group not receiving a particular treatment, (d) a statistical sample whose values are adjusted post hoc by the use of one or more covariates, or (e) a group for which the experimenter acknowledges bias exists and perhaps hopes that this admission will allow the reader to make appropriate discounts or other mental adjustments. None of these is an instance of a fully adequate control group.

If we can neither implement randomization nor approach total control of variables that modify effects (outcomes), then we should use the term "control group" cautiously. In most of these cases, it would be better to forgo the term and use "contrast group" instead. In any case, we should describe exactly which confounding variables have been explicitly controlled and speculate about which unmeasured ones could lead to incorrect inferences. In the absence of randomization, we should do our best to investigate sensitivity to various untestable assumptions.

Measurement

Variables. Explicitly define the variables in the study, show how they are related to the goals of the study, and explain how they are measured. The units of measurement of all variables, causal and outcome, should fit the language you use in the introduction and discussion sections of your report.

A variable is a method for assigning to a set of observations a value from a set of possible outcomes. For example, a variable called "gender" might assign each of 50 observations to one of the values male or female. When we define a variable, we are declaring what we are prepared to represent as a valid observation and what we must consider as invalid. If we define the range of a particular
variable (the set of possible outcomes) to be from 1 to 7 on a Likert scale, for example, then a value of 9 is not an outlier (an unusually extreme value). It is an illegal value. If we declare the range of a variable to be positive real numbers and the domain to be observations of reaction time (in milliseconds) to an administration of electric shock, then a value of 3,000 is not illegal; it is an outlier.

Naming a variable is almost as important as measuring it. We do well to select a name that reflects how a variable is measured. On this basis, the name "IQ test score" is preferable to "intelligence" and "retrospective self-report of childhood sexual abuse" is preferable to "childhood sexual abuse." Without such precision, ambiguity in defining variables can give a theory an unfortunate resistance to empirical falsification. Being precise does not make us operationalists. It simply means that we try to avoid excessive generalization.

Editors and reviewers should be suspicious when they notice authors changing definitions or names of variables, failing to make clear what would be contrary evidence, or using measures with no history and thus no known properties. Researchers should be suspicious when code books and scoring systems are inscrutable or more voluminous than the research articles on which they are based. Everyone should worry when a system offers to code a specific [...]

[...] area that is based on a previous researcher's well-defined construct implemented with a poorly developed psychometric instrument. Innovators, in the excitement of their discovery, sometimes give insufficient attention to the quality of their instruments. Once a defective measure enters the literature, subsequent researchers are reluctant to change it. In these cases, editors and reviewers should pay special attention to the psychometric properties of the instruments used, and they might want to encourage revisions (even if not by the scale's author) to prevent the accumulation of results based on relatively invalid or unreliable measures.

Procedure. Describe any anticipated sources of attrition due to noncompliance, dropout, death, or other factors. Indicate how such attrition may affect the generalizability of the results. Clearly describe the conditions under which measurements are taken (e.g., format, time, place, personnel who collected data). Describe the specific methods used to deal with experimenter bias, especially if you collected the data yourself.

Despite the long-established findings of the effects of experimenter bias (Rosenthal, 1966), many published studies appear to ignore or discount these problems. For example, some authors or their assistants with knowledge of hypotheses or study goals screen participants (through per-
                  observation in  two or more ways  for the same  variable.                  sonal interviews  or telephone  conversations)  for inclusion
                                                                             to collect      in  their  studies.  Some  authors  administer  questionnaires.
                                                                       used 
                                                                    is 
                                                a questionnaire 
                        Instruments.  If                                                     Some  authors give  instructions  to  participants.  Some  au-
                                                                             its scores
                                                                         of 
                                                            properties 
                         summarize the psychometric 
                  data,                                                                      thors  perform  experimental  manipulations.  Some  tally  or
                                                                          is used in a
                                          to the way the instrument 
                                  regard 
                  with specific                                                       of     code responses.  Some rate videotapes.
                                                               include measures 
                                                  properties 
                                Psychometric 
                  population.                                                                      An  author's  self-awareness,  experience,  or  resolve
                                                                       affecting con-
                                                            qualities 
                                                any other 
                                          and 
                  validity, reliability,                                                     does not eliminate experimenter bias. In short, there are no
                                                                               enough
                                                                     provide 
                                                           is used, 
                             If a physical apparatus 
                  clusions.                                                                  valid  excuses,  financial  or otherwise,  for avoiding  an op-
                                                                              to allow
                                                            specifications) 
                                          model, design 
                                 (brand, 
                  information                                                                portunity  to  double-blind.  Researchers  looking  for  guid-
                  another  experimenter  to  replicate your  measurement                                               should  consult  the  classic  book  of
                  process.                                                                   ance  on  this  matter 
                        There  are many methods for constructing instruments                 Webb,  Campbell,  Schwartz,  and  Sechrest  (1966)  and  an
                  and  psychometrically  validating  scores  from  such  mea-                exemplary dissertation (performed on a modest budget) by
                  sures. Traditional true-score theory  and item-response  test              Baker (1969).
                                                                                                                                 size.  Provide information
                                                                                                                     sample 
                                                                                                              and 
                  theory  provide  appropriate  frameworks  for assessing  reli-                   Power 
                  ability  and internal  validity.  Signal  detection  theory  and           on sample size  and the process that led to  sample size
                  various  coefficients  of association  can  be  used  to  assess           decisions. Document the  effect sizes,  sampling and mea-
                                                                                                                                                              used
                                                                                                                                                procedures 
                                                                                                                                      analytic 
                                                                                                                            well as 
                                                                                                                         as 
                                                                                                         assumptions, 
                  external  validity.  Messick  (1989)  provides  a comprehen-               surement 
                                                                                                                          Because power computations are
                  sive guide to validity.                                                    in power calculations. 
                                                                                                                                                               and
                                                                                                                                                   collected 
                                                                                                                                              are 
                                                                                                                                        data 
                                                                                                                  when done before 
                        It is important to remember that a test is not reliable or           most meaningful 
                                                                                                                                      how effect-size estimates
                                                                                                                            to show 
                                                                                                          it is important 
                  unreliable.  Reliability  is a property  of the  scores  on  a test        examined, 
                                                                                                                                                   and theory in
                                                                                                                    from previous research 
                                                                             Brennan,        have been derived 
                  for a particular population of examinees (Feldt & 
                                                                                                                                                      been taken
                                                                                                                                   they might have 
                                                                                                                             that 
                                                                                                                suspicions 
                                                                                                     to dispel 
                   1989). Thus,  authors should provide reliability coefficients             order 
                                                                                                                                                                 to
                                                                                                                in the study or, even worse, constructed 
                                                                                                          used 
                  of the  scores  for  the  data  being  analyzed  even when  the            from data 
                                                                                                                                                        analyzed,
                                                                                                                                          the study is 
                                                                                                                             size. Once 
                                                                                                                    sample 
                                                                                                     a particular 
                                   research is not psychometric. Interpreting the            justify 
                  focus of their                                                                                                                      in describ-
                                                                                                                                              power 
                                                                                                                                 calculated 
                  size  of  observed  effects  requires  an  assessment  of  the             confidence intervals replace 
                  reliability  of the scores.                                                 ing results.
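As a rough illustration of what documenting a power calculation can involve, the following Python sketch tabulates approximate power over several effect sizes and alpha levels for a fixed sample size. It is not a procedure prescribed in this report. It assumes a two-sided, two-sample comparison of means and uses the normal approximation to the t test. The function names `approx_power` and `power_table`, and the planning values in the usage section, are hypothetical and chosen only for illustration.

```python
from math import sqrt
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal distribution (mean 0, sd 1)


def approx_power(d, n_per_group, alpha=0.05):
    """Approximate power of a two-sided, two-sample test of means.

    Assumption: the normal approximation to the t test, which is slightly
    optimistic for small samples. `d` is a standardized mean difference.
    """
    z_crit = _STD_NORMAL.inv_cdf(1 - alpha / 2)
    shift = d * sqrt(n_per_group / 2)  # expected value of the test statistic
    # Probability of rejecting in either tail when the true effect is d.
    return _STD_NORMAL.cdf(shift - z_crit) + _STD_NORMAL.cdf(-shift - z_crit)


def power_table(effect_sizes, alphas, n_per_group):
    """Return {(d, alpha): power} for every combination supplied."""
    return {
        (d, a): approx_power(d, n_per_group, a)
        for d in effect_sizes
        for a in alphas
    }


if __name__ == "__main__":
    # Hypothetical planning values; real ones should be derived from
    # previous research and theory, as the guideline above requires.
    table = power_table(effect_sizes=[0.2, 0.5, 0.8],
                        alphas=[0.01, 0.05],
                        n_per_group=64)
    for (d, a), p in sorted(table.items()):
        print(f"d={d:.1f}  alpha={a:.2f}  power={p:.2f}")
```

Reporting the whole table, together with the source of each effect-size estimate, makes the assumptions behind a sample size decision visible to readers.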
Besides showing that an instrument is reliable, we need to show that it does not correlate strongly with other key constructs. It is just as important to establish that a measure does not measure what it should not measure as it is to show that it does measure what it should.

Researchers occasionally encounter a measurement problem that has no obvious solution. This happens when they decide to explore a new and rapidly growing research

Largely because of the work of Cohen (1969, 1988), psychologists have become aware of the need to consider power in the design of their studies, before they collect data. The intellectual exercise required to do this stimulates authors to take seriously prior research and theory in their field, and it gives an opportunity, with incumbent risk, for a few to offer the challenge that there is no applicable research behind a given study. If exploration were not disguised in hypothetico-deductive language, then it might have the opportunity to influence subsequent research constructively.

Computer programs that calculate power for various designs and distributions are now available. One can use them to conduct power analyses for a range of reasonable alpha values and effect sizes. Doing so reveals how power changes across this range and overcomes a tendency to regard a single power estimate as being absolutely definitive.

Many of us encounter power issues when applying for grants. Even when not asking for money, think about power. Statistical power does not corrupt.

Figure 1
Scatter-Plot Matrix
[Figure 1 image not reproduced; panel variables: AGE, SEX, TOGETHER.]
Note. M = male; F = female.

Results
Complications

Before presenting results, report complications, protocol violations, and other unanticipated events in data collection. These include missing data, attrition, and nonresponse. Discuss analytic techniques devised to ameliorate these problems. Describe nonrepresentativeness statistically by reporting patterns and distributions of missing data and contaminations. Document how the actual analysis differs from the analysis planned before complications arose. The use of techniques to ensure that the reported results are not produced by anomalies in the data (e.g., outliers, points of high influence, nonrandom missing data, selection bias, attrition problems) should be a standard component of all analyses.

As soon as you have collected your data, before you compute any statistics, look at your data. Data screening is not data snooping. It is not an opportunity to discard data or change values to favor your hypotheses. However, if you assess hypotheses without examining your data, you risk publishing nonsense.

Computer malfunctions tend to be catastrophic: A system crashes; a file fails to import; data are lost. Less well-known are more subtle bugs that can be more catastrophic in the long run. For example, a single value in a file may be corrupted in reading or writing (often in the first or last record). This circumstance usually produces a major value error, the kind of singleton that can make large correlations change sign and small correlations become large.

Graphical inspection of data offers an excellent possibility for detecting serious compromises to data integrity. The reason is simple: Graphics broadcast; statistics narrowcast. Indeed, some international corporations that must defend themselves against rapidly evolving fraudulent schemes use real-time graphic displays as their first line of defense and statistical analyses as a distant second. The following example shows why.

Figure 1 shows a scatter-plot matrix (SPLOM) of three variables from a national survey of approximately 3,000 counseling clients (Chartrand, 1997). This display, consisting of pairwise scatter plots arranged in a matrix, is found in most modern statistical packages. The diagonal cells contain dot plots of each variable (with the dots stacked like a histogram) and scales used for each variable. The three variables shown are questionnaire measures of respondent's age (AGE), gender (SEX), and number of years together in current relationship (TOGETHER). The graphic in Figure 1 is not intended for final presentation of results; we use it instead to locate coding errors and other anomalies before we analyze our data. Figure 1 is a selected portion of a computer screen display that offers tools for zooming in and out, examining points, and linking to information in other graphical displays and data editors. SPLOM displays can be used to recognize unusual patterns in 20 or more variables simultaneously. We focus on these three only.

There are several anomalies in this graphic. The AGE histogram shows a spike at the right end, which corresponds to the value 99 in the data. This coded value most likely signifies a missing value, because it is unlikely that this many people in a sample of 3,000 would have an age of 99 or greater. Using numerical values for missing value codes is a risky practice (Kahn & Udry, 1986).

The histogram for SEX shows an unremarkable division into two values. The histogram for TOGETHER is highly skewed, with a spike at the lower end presumably signifying no relationship. The most remarkable pattern is the triangular joint distribution of TOGETHER and AGE. Triangular joint distributions often (but not necessarily) signal an implication or a relation rather than a linear function with error. In this case, it makes sense that the span of a relationship should not exceed a person's age. Closer examination shows that something is wrong here,
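The two anomalies discussed for Figure 1, a spike at the value 99 in AGE and the expectation that TOGETHER should not exceed AGE, can also be checked numerically as a complement to the graphic. The following Python sketch is only an illustration under stated assumptions: that 99 is the missing-value code, that each record is a dictionary keyed by variable name, and that the same code might appear in TOGETHER. The function names `sentinel_rows` and `impossible_spans` and the sample records are hypothetical and are not the survey data.

```python
MISSING_CODE = 99  # assumed sentinel, suggested by the spike at 99 in AGE


def sentinel_rows(records, field, code=MISSING_CODE):
    """Indices of records whose `field` equals the suspected missing code."""
    return [i for i, r in enumerate(records) if r[field] == code]


def impossible_spans(records, code=MISSING_CODE):
    """Indices where years together exceeds age.

    Records carrying the suspected missing code in either field are
    skipped, because their true values are unknown.
    """
    flagged = []
    for i, r in enumerate(records):
        age, together = r["AGE"], r["TOGETHER"]
        if age == code or together == code:
            continue
        if together > age:
            flagged.append(i)
    return flagged


if __name__ == "__main__":
    # Made-up records for illustration only.
    sample = [
        {"AGE": 34, "SEX": "F", "TOGETHER": 5},
        {"AGE": 99, "SEX": "M", "TOGETHER": 2},
        {"AGE": 27, "SEX": "M", "TOGETHER": 40},
        {"AGE": 51, "SEX": "F", "TOGETHER": 0},
    ]
    print("possible missing codes in AGE:", sentinel_rows(sample, "AGE"))
    print("TOGETHER greater than AGE:", impossible_spans(sample))
```

Both functions only flag records for inspection. They neither delete nor alter values, in keeping with the point that data screening is not an opportunity to discard data or change values.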